Emotion Recognition On Speech Signals Using Machine Learning

Mohan Ghai, Shamit Lal, Shivam Duggal and Shrey Manik
Delhi Technological University
mohanghai13@[Link], shamitlal@[Link], shivamduggal.9507@[Link], [Link]@[Link]

Abstract - With the increase in man-to-machine interaction, speech analysis has become an integral part of reducing the gap between the physical and digital world. An important subfield within this domain is the recognition of emotion in speech signals, which was traditionally studied in linguistics and psychology. Speech emotion recognition is a field having diverse applications. The prime objective of this paper is to recognize emotions in speech and classify them into 7 emotion output classes, namely anger, boredom, disgust, anxiety, happiness, sadness and neutral. The proposed approach is based upon the Mel Frequency Cepstral Coefficients (MFCC) and energy of the speech signals as feature inputs and uses the Berlin database of emotional speech. The features extracted from speech are converted into a feature vector, which in turn is used to train different classification algorithms, namely Support Vector Machine (SVM), Random Decision Forest and Gradient Boosting. Random Forest was found to have the highest accuracy and predicted the correct emotion 81.05% of the time.

Keywords - Berlin database, Emotion recognition, Gradient Boosting, MFCC, SVM, Random Forest.

I. INTRODUCTION

Emotions play a vital role in human communication. In order to extend this role to human-machine interaction, it is desirable for computers to have some built-in ability to recognize the different emotional states of the user [2,5]. With the advent of technology in recent years, more intelligent interaction between humans and machines is desired, yet we are still far from interacting naturally with machines. Research studies have provided evidence that human emotions influence the decision-making process to a certain extent [1-4]. Hence, it is desirable for machines to have the ability to detect emotions in speech signals. This is a challenging task which has been drawing attention recently, and deciding which features to use to accomplish it successfully is still an open question.

There can be many perceived emotions for a single utterance, each representing some part of the utterance. It is difficult to predict a single emotion because the boundary between these parts is hard to determine. Another issue is that the type of emotion generally depends upon the speaker, environment and culture. If a person is in a particular emotional state for a long time, such as depression, all other emotions will be temporary, and the output of the emotion recognition system will be either the long-term emotion or the short-term (temporary) emotion. The presence of noise may also pose challenges for feature extraction.

A vital part of the emotion detection pipeline is the selection of appropriate features to classify speech segments into different emotional classes. For this task, hidden parameters such as signal energy, pitch, timbre and MFCC are preferred over the words and content of the speech itself. As these parameters vary continuously with time, the audio is divided into frames of appropriate size. It is assumed that the above-mentioned parameters do not change significantly over the duration of a frame. Thus the entire audio can be segmented into frames, each represented by a feature vector. These frames can then be used as a training set for classification algorithms.

In the next section, the previous related work on speech emotion recognition systems is reviewed. Section 3 provides an insight into the database that is used for implementing the system. Section 4 presents the framework along with the approach used for feature extraction. In Section 5, the proposed algorithms, Random Decision Forest, SVM and Gradient Boosting, are described in detail. Results obtained are presented in Section 6, and Section 7 concludes the paper.

BERLIN DATABASE OF EMOTIONAL SPEECH

This database is easily and publicly available and is among the most popular emotional databases used for emotion recognition. It comprises 535 emotional utterances recorded from 5 female and 5 male actors (a total of 10 speakers), spoken in German [11]. The Berlin database covers 7 emotions (anger, boredom, disgust, anxiety/fear, happiness, sadness and neutral). Each file is 16-bit PCM, mono channel, sampled at 16 kHz. The total length of the 535 emotional utterances is 1487 seconds and the average utterance length is around 2.77 seconds [9].

RELATED WORK

Recognition of emotions in audio signals has been a field of extensive study in the past. Previous work in this area used various classifiers such as SVM, neural networks and the Bayes classifier. The number of emotions classified varied from study to study, and it plays an important role in evaluating the accuracy of the different classifiers: reducing the number of emotions used for recognition has generally produced more accurate results, as shown below. The following table summarises previous studies on the topic.

TABLE 1. Related work

Study                                               Algorithm Used             # of Emotions                        Accuracy (%)
[Kamran Soltani, Raja Noor Ainon, 2007] [1]         Two-layer Neural Network   6                                    77.1
[Li Wern Chew, Kah Phooi Seng, Li-Minn Ang,         PCA, LDA and RBF           6 (divided into three                81.67
 Vish Ramakonar, 2011] [2]                                                     independent classes)
[Taner Danisman, Adil Alpkocak, 2008] [3]           SVM                        4 / 5                                77.5 / 66.8
[Lugger and Yang, 2007] [4]                         Bayes Classifier           6                                    74.4
[Yixiong Pan, Peipei Shen, Liping Shen, 2012] [5]   SVM                        3                                    95.1

SPEECH EMOTION RECOGNITION FRAMEWORK

The main aim of the system is to recognise the emotional state of a human from his or her voice. The system extracts the best features from the audio signals, such as energy and MFCC features, and summarizes them into a limited number of features over which a supervised learning classifier is used [2,11]. This classifier correlates the features with the emotion attached to the audio. The system consists of five basic blocks: Emotional Speech Database, Extraction of Features, Feature Vector, Classifiers and Emotion Output, as depicted in Figure 1.

Figure 1. Speech Emotion Recognition Framework

A. Feature Extraction

The feature extraction process starts by dividing an audio file into frames [19]. This was done by sampling the audio signal at 16000 Hz and selecting 0.025 sec as the duration of each frame; from these two parameters the frame size was computed. A large part of the audio signal has frequency content only up to 8 kHz, so according to the sampling theorem 16 kHz is chosen as the sampling frequency.

It was assumed that over short time durations the audio signal remains stationary, which is why the signal was framed into 20-40 ms frames. If the chosen frame is much shorter, it does not contain enough samples to obtain an appropriate spectral estimate; if it is much longer, the signal changes markedly within the frame.
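As a concrete illustration of this framing step, the sketch below computes the frame size from the sampling rate and frame duration and slices a waveform into consecutive frames. The function and array names are illustrative, and since the paper does not state a frame hop, non-overlapping frames are assumed here.

    import numpy as np

    SAMPLE_RATE = 16000        # Hz, as used in the paper
    FRAME_DURATION = 0.025     # seconds per frame

    def frame_signal(signal, sample_rate=SAMPLE_RATE, frame_duration=FRAME_DURATION):
        """Split a 1-D waveform into consecutive, non-overlapping frames."""
        frame_size = int(sample_rate * frame_duration)   # 16000 * 0.025 = 400 samples
        n_frames = len(signal) // frame_size             # drop the trailing partial frame
        trimmed = signal[:n_frames * frame_size]
        return trimmed.reshape(n_frames, frame_size)

    # Example with a synthetic 2.77 s utterance (the average length in the Berlin database)
    utterance = np.random.randn(int(2.77 * SAMPLE_RATE))
    frames = frame_signal(utterance)
    print(frames.shape)   # (110, 400): 110 frames of 400 samples each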

For each audio sample an output vector was maintained, which is a 7-element vector. Each emotion was given a label between 0 and 6, and according to the audio's label the corresponding vector index was marked 1; all the other elements were marked 0. The feature matrix was then generated by further grouping frames into segments, with each segment containing 25 frames. The segment hop size was selected as 13, which determines the number of overlapping frames between successive segments. After segmentation, each segment was appended to the feature matrix to prepare the final feature matrix.

Most of the past research in this field focused on using single frames as data points for training the algorithms. Using that approach, some of the frames can be wrongly classified as belonging to a particular class when the entire audio belongs to some other class. This motivated us to combine 25 frames into segments and use these segments as data points. The benefits of this are two-fold. First, it increases the number of features for each data point. Second, it leads to an averaging effect where the classification of an entire segment depends on the features extracted from all its constituent frames, so a single frame cannot dictate the classification.
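The segment construction and one-hot labelling described above can be sketched as follows. The 25-frame segments, hop of 13 and 7-element label vector come from the paper; the function names, the 14 features per frame (13 MFCCs plus energy, described in the next subsections) and the exact overlap interpretation of the hop size are illustrative assumptions.

    import numpy as np

    EMOTIONS = ["anger", "boredom", "disgust", "anxiety", "happiness", "sadness", "neutral"]

    def one_hot_label(emotion):
        """7-element output vector with a 1 at the index of the utterance's emotion."""
        label = np.zeros(len(EMOTIONS))
        label[EMOTIONS.index(emotion)] = 1.0
        return label

    def build_segments(frame_features, frames_per_segment=25, hop=13):
        """Group per-frame feature vectors into overlapping segments.

        frame_features has shape (n_frames, n_features_per_frame). Each segment
        concatenates 25 consecutive frames; successive segments start 13 frames
        apart (one reading of the paper's hop size), so neighbours share frames.
        """
        segments = []
        for start in range(0, len(frame_features) - frames_per_segment + 1, hop):
            segments.append(frame_features[start:start + frames_per_segment].reshape(-1))
        return np.array(segments)

    # Example: 110 frames with 14 features each (13 MFCC + energy)
    features = np.random.randn(110, 14)
    X = build_segments(features)                      # shape (7, 350): each segment is one data point
    y = np.tile(one_hot_label("anger"), (len(X), 1))  # same one-hot label for every segment of the utterance
    print(X.shape, y.shape)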
1) Mel Frequency Cepstral Coefficients (MFCC): MFCC represents parts of human speech production and perception. MFCC captures the logarithmic perception of loudness and pitch of the human auditory system [1,2,5]. It also eliminates speaker-dependent characteristics by removing the fundamental frequency along with its harmonics.

The procedure for evaluating the MFCC features is [1,2]:
1. For each frame, estimate the periodogram of the power spectrum.
2. Apply the Mel filterbank and add up the energy in each filter.
3. Take the logarithm of each filterbank energy.
4. Evaluate the DCT of the log filterbank energies obtained in step 3.
5. Retain DCT coefficients 1 to 13.

These 13 features represent the 13 MFCC. A frame vector was maintained for each frame to store its features; for every audio frame 13 MFCC features were generated.

Figure 2. MFCC Feature Extraction

The Mel scale relates the actual measured frequency to the perceived frequency of a pure tone. Humans can distinguish slight changes in pitch more easily at lower frequencies than at higher frequencies. The equation for converting a frequency f to the Mel scale is:

Mel(f) = 2595 × log10(1 + f/700)    (1)
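A minimal sketch of this MFCC step, assuming the librosa library (the paper does not name an implementation): it checks the Mel conversion of equation (1) and extracts 13 coefficients per 25 ms frame. The file name and the hop length are illustrative assumptions.

    import numpy as np
    import librosa

    SAMPLE_RATE = 16000

    def mel_from_hz(f):
        """Equation (1): map frequency in Hz to the Mel scale."""
        return 2595.0 * np.log10(1.0 + f / 700.0)

    print(mel_from_hz(1000))   # ~1000: 1 kHz sits near 1000 on the Mel scale

    # 13 MFCCs per 25 ms frame ("anger_example.wav" is a hypothetical file name)
    signal, sr = librosa.load("anger_example.wav", sr=SAMPLE_RATE)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13,
                                n_fft=int(0.025 * sr), hop_length=int(0.025 * sr))
    print(mfcc.shape)   # (13, n_frames): one 13-dimensional MFCC vector per frame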
2) Energy: The energy of the speech is extracted from the intensity of the speech. The energy associated with speech varies continuously, therefore the energy associated with a short range of speech is of prime interest [19]. Energy can be used for voiced, unvoiced and silence classification of speech. The energy level of each frame is calculated by squaring the sample amplitudes of that frame and then summing those values. Combining energy and MFCC features gives a clearer insight into the emotion of the speech.
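Per-frame energy as described above (sum of squared amplitudes) reduces to a one-liner; stacking it with the 13 MFCCs gives the 14 features per frame assumed in the earlier segmentation sketch. Names are illustrative.

    import numpy as np

    def frame_energy(frames):
        """Energy of each frame: square the sample amplitudes and sum them."""
        return np.sum(frames ** 2, axis=1)

    # frames: array of shape (n_frames, frame_size) from the framing sketch above
    frames = np.random.randn(110, 400)
    energy = frame_energy(frames)     # shape (110,), one energy value per frame
    # Stack with the per-frame MFCCs (transposed to frames-by-coefficients), e.g.:
    # features = np.column_stack([mfcc.T, energy])   # 14 features per frame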

B. Classification Algorithms

1) SVM: SVM is a supervised learning algorithm used most widely for pattern recognition applications. The algorithm is simple to use and provides good results even when trained on a limited-size training dataset. More formally, it is an algorithm which constructs, in a high-dimensional or infinite-dimensional space, a hyperplane or set of hyperplanes which can be used for regression and classification tasks. It tries to learn the hyperplane that results in the maximum separation between the data points belonging to different classes, leading to a better classifier. The data can be linearly separable or non-linearly separable; non-linearly separable data is classified by mapping its feature space to a high-dimensional space using a kernel function, in which the data points become linearly separable. The optimization problem in SVM reduces to

minimize (1/2)·||w||²  subject to  y_k·(⟨w, x_k⟩ + b) ≥ 1    (2)

where w is the normal vector to the hyperplane and ⟨w, x_k⟩ denotes the inner product of w and x_k for each data point (x_k, y_k) in the training set.

Some widely used kernels are the linear kernel and the radial basis kernel. The linear kernel function is given as:

kernel(x, y) = ⟨x, y⟩    (3)

The radial basis kernel is given as:

kernel(x, y) = exp(−γ·||x − y||²)    (4)

In this paper, SVM is implemented using the radial basis kernel. The implementation is based on libsvm. The fit complexity of the algorithm increases quadratically with the number of samples it is trained on.
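Since the paper specifies an RBF kernel and a libsvm-based implementation, scikit-learn's SVC (which wraps libsvm) is a natural way to reproduce this step. The stand-in data, regularisation parameter C and kernel width gamma below are assumptions, as the paper does not report them.

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split

    # Stand-in data: in the real pipeline X holds the segment feature vectors and
    # y the integer emotion labels 0-6 (argmax of the one-hot output vectors).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 350))
    y = rng.integers(0, 7, size=1000)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    svm = SVC(kernel="rbf", C=1.0, gamma="scale")   # C and gamma are not reported in the paper
    svm.fit(X_train, y_train)                       # libsvm-based; fit time grows roughly quadratically with n_samples
    print("SVM accuracy:", svm.score(X_test, y_test))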
2) Gradient Boosting: Gradient boosting is used to produce a prediction model in the form of an ensemble of weak prediction models. It works in a stage-wise manner and generalises other boosting methods by allowing an arbitrary differentiable loss function to be minimized, adding at each stage a new tree that best reduces (steps down the gradient of) the chosen loss function.

The basic idea of the algorithm is to fit new base-learners that are maximally correlated with the negative gradient of the loss function evaluated over the whole ensemble. The loss function can be chosen arbitrarily; in this paper the deviance loss function is used for the Gradient Boosting classifier. After the initial fit by the basic classifier, additional copies of the classifier are fitted on the same initial dataset, but the weights of the samples that have been incorrectly classified are adjusted so that successive classifiers focus on the more difficult cases.

In this paper, we trained the gradient boosting algorithm using the deviance loss. The learning rate was set to 0.15, the number of boosting stages performed was 120, and the maximum depth of each regression estimator was set to 4. This hyperparameter selection resulted in the highest accuracy on the test set.
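The hyperparameters quoted above map directly onto scikit-learn's GradientBoostingClassifier, shown here as an illustrative sketch; the paper does not say which library was used, and newer scikit-learn versions spell the deviance loss "log_loss".

    from sklearn.ensemble import GradientBoostingClassifier

    gbc = GradientBoostingClassifier(
        loss="deviance",      # deviance (logistic) loss; renamed "log_loss" in scikit-learn >= 1.1
        learning_rate=0.15,   # learning rate from the paper
        n_estimators=120,     # 120 boosting stages
        max_depth=4,          # maximum depth of each regression estimator
    )
    gbc.fit(X_train, y_train)   # X_train, y_train as in the SVM sketch above
    print("Gradient boosting accuracy:", gbc.score(X_test, y_test))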
3) Random Forests: Random Forests is an ensemble learning method used for regression, classification and other tasks. It works by constructing a large number of decision trees at training time and outputs the mean prediction or the mode of the classes of the individual trees. Random decision forests prevent decision trees from overfitting the training data.

The Random Forests algorithm works as follows:
1. Random record selection: each decision tree is trained on approximately 2/3rd of the training data.
2. Random variable selection: only a subset of the predictor variables is considered at each node. This selection is done randomly and the node is split according to the most optimal split among them.
3. Trees grow to their maximum extent without any pruning.

The Random Forests algorithm is based on bagging. In bagging, a random sample is repeatedly drawn with replacement from the training set and trees are then fit to these samples. By averaging the prediction values or by taking the majority vote over all the decision trees, we predict the value of an unseen sample. Gini importance determines whether a parent node needs to be split into children nodes: if splitting the parent into children reduces the Gini impurity, i.e. increases the intensity I, the node is split.

I = G_parent − G_split1 − G_split2    (5)

G = Σ_i p_i·(1 − p_i)    (6)

The summation is run over all the classes and p_i is the probability of class i. G stands for the Gini impurity of a node and I is the intensity (the reduction in impurity achieved by the split).
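A small sketch of equations (5) and (6): impurity per node and the gain from a candidate split. The unweighted difference follows the paper's equation (5), whereas most implementations weight each child by its share of the samples.

    import numpy as np

    def gini_impurity(labels):
        """Equation (6): G = sum over classes of p_i * (1 - p_i)."""
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return float(np.sum(p * (1.0 - p)))

    def split_intensity(parent, left, right):
        """Equation (5): I = G_parent - G_split1 - G_split2 (unweighted, as in the paper)."""
        return gini_impurity(parent) - gini_impurity(left) - gini_impurity(right)

    # Worked example: a node with two classes split into two pure children
    parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])     # G = 0.5
    left, right = parent[:4], parent[4:]            # both children pure, G = 0.0
    print(split_intensity(parent, left, right))     # 0.5 > 0, so the node would be split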
In this paper, the random forest classifier was implemented with 15 trees. Gini impurity was used to measure the quality of a node split. After testing different maximum depths, we decided to split nodes until each leaf contained samples belonging to a single class only. Using these hyperparameters, the highest accuracy of 81.05% was achieved.
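Read as 15 trees with Gini splitting and unlimited depth, the configuration above corresponds to the following scikit-learn sketch; the library choice and the reading of "15 forests" as 15 trees are assumptions, not details from the paper.

    from sklearn.ensemble import RandomForestClassifier

    rf = RandomForestClassifier(
        n_estimators=15,     # interpreted here as 15 trees in the forest
        criterion="gini",    # Gini impurity measures split quality
        max_depth=None,      # grow until every leaf is pure, i.e. no pruning
        random_state=0,
    )
    rf.fit(X_train, y_train)   # X_train, y_train as in the earlier sketches
    print("Random forest accuracy:", rf.score(X_test, y_test))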

RESULTS

Three classification algorithms, namely Random Decision Forest, SVM and Gradient Boosting, classified an audio signal into one of the 7 classes. Of the three, Random Decision Forest achieved the highest accuracy of 81.05%. Considering that many past papers achieved similar or lower accuracy when trained on fewer than 7 emotional output classes, we consider our accuracy a significant improvement. For all three algorithms, the highest classification accuracy was achieved for samples belonging to the anger class, while the lowest accuracy was achieved for those belonging to the happiness class. The results for each algorithm are summarized in the following tables and graphs.
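Tables 2-4 below are per-class confusion matrices expressed in percent. The paper does not show its evaluation code; a hedged sketch of how such a table can be produced from a fitted classifier with scikit-learn:

    import numpy as np
    from sklearn.metrics import confusion_matrix

    def confusion_percent(model, X_test, y_test):
        """Row-normalised confusion matrix: entry (i, j) is the % of class-i samples predicted as class j."""
        cm = confusion_matrix(y_test, model.predict(X_test))
        return 100.0 * cm / cm.sum(axis=1, keepdims=True)

    print(np.round(confusion_percent(rf, X_test, y_test), 2))   # rf, X_test, y_test from the sketches above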
TABLE 2. SVM

Emotion      Recognised (%)
             A        B        D        An       H        S        N
A            86.10    1.85     2.29     3.62     3.97     0.44     1.72
B            4.62     51.04    5.55     4.35     1.20     18.86    14.38
D            16.29    14.15    44.00    7.51     4.00     7.90     6.15
An           30.66    6.13     6.70     37.83    4.34     7.83     6.51
H            57.99    4.33     6.37     8.42     15.89    1.97     5.04
S            1.90     13.80    2.52     2.27     0.06     73.87    5.58
N            9.55     26.63    4.27     6.06     2.25     10.87    40.37

Accuracy: 55.89%

Figure 3. Graph showing the accuracy results for SVM for different emotions

TABLE 3. Gradient Boosting

Emotion      Recognised (%)
             A        B        D        An       H        S        N
A            88.53    1.50     1.32     2.69     3.18     0.88     1.90
B            6.69     66.29    2.41     2.07     0.67     12.17    9.70
D            15.12    8.20     55.80    5.56     2.93     6.24     6.15
An           28.49    6.51     4.81     47.74    2.45     7.17     2.83
H            43.12    6.14     2.44     2.75     40.36    2.05     3.15
S            2.94     6.38     1.60     1.72     0.12     82.76    4.48
N            16.61    13.74    2.56     6.99     3.03     8.15     48.91

Accuracy: 65.23%

Figure 4. Graph showing the accuracy results for Gradient Boosting for different emotions

TABLE 4. Random Forest

Emotion      Recognised (%)
             A        B        D        An       H        S        N
A            93.38    0.54     0.67     1.34     2.91     0.18     0.98
B            4.20     87.65    1.05     1.31     0.99     1.58     3.22
D            7.18     3.74     83.91    1.25     1.15     1.15     1.63
An           19.66    2.98     0.79     70.90    2.18     2.18     1.29
H            27.21    4.22     0.65     3.57     61.66    0.57     2.11
S            1.96     3.80     0.52     0.63     0.40     91.83    0.86
N            11.51    8.25     2.14     2.54     4.13     2.22     69.21

Accuracy: 81.05%

Figure 5. Graph showing the accuracy results for Random Forest for different emotions

Abbreviations used in Tables 2-4: N - Neutral, S - Sadness, H - Happiness, An - Anxiety, D - Disgust, B - Boredom, A - Anger.

CONCLUSION

Emotion recognition in speech signals has become a major topic of research in the field of interaction between humans and computers, owing to its wide range of applications in recent times.

In this paper, an approach to emotion recognition in audio signals based on Random Decision Forest, SVM and Gradient Boosting classifiers was presented. The performance of the classifiers improves significantly if suitable features are properly obtained. Hence, we concluded that the inclusion of energy as a feature along with the other 13 MFCC features led to a better assessment of the emotion attached to the speech. Integrating frames into overlapping segments led to greater continuity between samples and also resulted in each data point having many more features. In addition, treating each segment as an independent data point increased the size of the training set many-fold, leading to an increase in accuracy across the different classification algorithms.

Considering the different classification strategies, the maximum accuracy of 81.05% was obtained on the database by using the Random Decision Forest classifier.

FUTURE WORK

The paper presents the analysis of only seven human emotions using speech signals; it can be expanded to predict more human emotions. More samples can also be collected and better feature engineering applied in the future to improve the results of the prediction algorithms. The classification algorithms wrongly predicted some of the samples belonging to the happiness class as belonging to the anger class; this can be rectified by extracting more features that better distinguish between these two classes.

REFERENCES

[1] K. V. Krishna Kishore, [Link] Satish, "Emotion Recognition in Speech Using MFCC and Wavelet Features", 3rd IEEE International Advance Computing Conference (IACC), 2013.
[2] Yixiong Pan, Peipei Shen and Liping Shen, "Speech Emotion Recognition Using Support Vector Machine", International Journal of Smart Home, 2012.
[3] Ashish B. Ingale and [Link], "Speech Emotion Recognition Using Hidden Markov Model and Support Vector Machine", International Journal of Advanced Engineering Research and Studies, Vol. 1, Issue 3, 2012.
[4] Davood Gharavian, Mansour Sheikhan, Alireza Nazerieh, Sahar Garoucy, "Speech emotion recognition using FCBF feature selection method and GA-optimized fuzzy ARTMAP neural network", Neural Computing and Applications, Volume 21, Issue 8, pp. 2115-2126, 2011.
[5] Li Wern Chew, Kah Phooi Seng, Li-Minn Ang, Vish Ramakonar, Amalan Gnanasegaran, "Audio-Emotion Recognition System using Parallel Classifiers and Audio Feature Analyzer", Third International Conference on Computational Intelligence, Modelling & Simulation, 2011.
[6] Zeng, M. Pantic, G. Roisman, T. Huang, "A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 1, pp. 39-58, Jan. 2009.
[7] J. S. Park, Ji-H. Kim and Yung-H. Oh, "Feature Vector Classification based Speech Emotion Recognition for Service Robots", IEEE Transactions on Consumer Electronics, vol. 55, no. 3, pp. 1590-1596, Aug. 2009.
[8] Xia Mao, Lijiang Chen, Liqin Fu, "Multi-Level Speech Emotion Recognition based on HMM and ANN", World Congress on Computer Science and Information Engineering, 2009.
[9] Taner Danisman, Adil Alpkocak, "Emotion Classification of Audio Signals Using Ensemble of Support Vector Machines", Perception in Multimodal Dialogue Systems, Lecture Notes in Computer Science, Vol. 5078, pp. 205-216, Springer-Verlag Berlin Heidelberg, 2008.
[10] S. Casale, A. Russo, G. Scebba, S. Serrano, "Speech Emotion Classification using Machine Learning Algorithms", IEEE International Conference on Semantic Computing, 2008.
[11] Kamran Soltani, Raja Noor Ainon, "Speech Emotion Detection Based on Neural Networks", IEEE International Symposium on Signal Processing and Its Applications (ISSPA), 2007.
[12] M. Lugger, B. Yang, "An Incremental Analysis of Different Feature Groups in Speaker Independent Emotion Recognition", 16th International Congress of Phonetic Sciences, 2007.
[13] [Link], [Link], [Link], [Link], "Real-time emotion detection system using speech: Multi-modal fusion of different timescale features", Proceedings of the IEEE Multimedia Signal Processing Workshop, Chania, Greece, 2007.
[14] Eun Ho Kim, Kyung Hak Hyun, Soo Hyun Kim and Yoon Keun Kwak, "Speech Emotion Recognition Using Eigen-FFT in Clean and Noisy Environments", 16th IEEE International Conference on Robot & Human Interactive Communication, August 26, 2007, Korea.
[15] Chul Min Lee and Shrikanth S. Narayanan, "Toward detecting emotions in spoken dialogs", IEEE Transactions on Speech and Audio Processing, vol. 13, no. 2, pp. 293-303, Mar. 2005.
[16] T. Vogt and E. Andre, "Comparing feature sets for acted and spontaneous speech in view of automatic emotion recognition", IEEE International Conference on Multimedia and Expo, pp. 474-477, Jul. 2005.
[17] M. Song, J. Bu, C. Chen, and N. Li, "Audio-Visual Based Emotion Recognition: A New Approach", Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 1020-1025, 2004.
[18] [Link] Berlin emotional speech database.
[19] T. L. Nwe, F. S. Wei and L. C. De Silva, "Speech based emotion classification", TENCON 2001: Proceedings of IEEE Region 10 International Conference on Electrical and Electronic Technology, Vol. 1, pp. 297-301, IEEE.
