
Journal of Critical Reviews

ISSN- 2394-5125 Vol 7 , Issue 9, 2020

CLASSIFICATION OF GENDER BY VOICE RECOGNITION USING MACHINE LEARNING ALGORITHMS
R. Shiva Shankar1, J. Raghaveni2, Pravallika Rudraraju3, Y.Vineela Sravya4
1-4 Department of Computer Science and Engineering,
1-2S.R.K.R. Engineering College, Bhimavaram, West Godavari, Andhrapradesh, India
3-4Vignan’s Institute of Engineering for Women, Kalujaggurajapeta, Visakhapatnam, Andhrapradesh, India

shiva.srkr@gmail.com

Received: 25.03.2020 Revised: 26.04.2020 Accepted: 27.05.2020

Abstract:
Nowadays, a person's gender has become an important factor in economic markets, particularly for targeting advertisements. The objective of
this project is to design a system that determines a speaker's gender from the pitch of the speaker's voice. Machine learning makes it
possible to identify gender from properties of a voice dataset such as pitch, median and frequency. In this project, we classify gender as
male or female based on a dataset containing various voice-related attributes such as pitch and frequency. Data pre-processing steps are
performed before the voice data are classified using machine learning algorithms. The proposed system can be used to find the best
algorithm among K-nearest neighbors (KNN), Random Forest, Logistic Regression, Decision Tree, support vector machine and gradient
boosting for detecting the gender of the speaker with the maximum possible efficiency and accuracy.

Keywords: Data pre-processing, gender classification, K-nearest neighbors (KNN), Random Forest, Logistic Regression, Decision Tree,
support vector machine and gradient boosting.

© 2020 by Advance Scientific Research. This is an open-access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
DOI: http://dx.doi.org/10.31838/jcr.07.09.222

INTRODUCTION

Gender classification is an important function that can increase the efficiency of various applications such as marketing, voice identification and online advertisements involving voice. Recent developments in human-computer interaction, speech recognition and biometric security systems have led to a rapid increase in applications of gender recognition systems. People of different genders, each with a different voice, are all around us [1]. Human speech is a good communication tool that carries various distinguishing features such as language, age, emotional state and gender. A human voice consists of sound waves, each carrying a different frequency, that distinguish it from the voices of all other humans. Identifying gender from voice supports various applications such as advertising, marketing strategies and crime-scene investigation, and it enhances human-computer interaction systems, mainly by improving the level of user satisfaction in customer services that depend on the caller's gender [2].

Classification of gender using pitch analysis mainly aims to predict the speaker's gender by analyzing a voice sample containing different parameters. The system mainly works by analyzing the pitch of the speech signals. There is a slight variation between the average pitch value of male voices and the average pitch value of female voices. A comparison of various pitch values of male and female voice samples is presented in the analysis [3]. Frequency is used to differentiate male and female human voices, which lie roughly in the range of 80 Hz to 255 Hz. Males speak at lower frequencies than females. The male voice frequency is in the range of 80 Hz to 180 Hz, and the female voice frequency ranges from 165 Hz to 255 Hz. Therefore, we can train a model to classify gender by calculating the mean frequency of the speech samples. This can be done using machine learning by training a model with different algorithms.

Machine learning is a technology with a wide range of applications in different fields, such as finance, banking and marketing, where values are predicted by using algorithms and data to train the computers. The objective of these experiments is to provide more accurate and efficient algorithms for classification. However, gender recognition by voice is still considered a difficult and challenging task for an accurate prediction model [4]. To identify gender based on voice, a trained model is required. The model is trained using a dataset. Thousands of male and female voices are taken as samples to build the dataset. The voice dataset is transformed into various parameters such as pitch, frequency and mean. Machine learning gives efficient results for various classification problems in many research areas.

Our main aim is to predict gender from a voice dataset using different machine learning algorithms. The main objective is to evaluate the performance of each algorithm by comparing the accuracy of the different algorithms. The algorithm that best fits the voice dataset, and so gives the best accuracy, is used for classification.

LITERATURE SURVEY

Mansour Alsulaiman, Zulfiqar Ali et al. [5] proposed a gender classification methodology that measures the intensity of the speaker's speech signal to make a decision. Voice intensity is calculated using Simpson's rule to measure the area under the normalized curve. An adaptively adjusted threshold is used to decide the gender. Gender is classified based on the area of the utterance: if it is above the threshold the gender is female, otherwise male.

Igor Bisio, Alessandro Delfino, and Andrea Sciarrone et al. [6] proposed a model to predict human emotional state from audio signals, which improves intelligent interaction between humans and computers. It has two subsystems, emotion recognition and gender recognition. It uses various algorithms such as the Maximum Likelihood Bayes classifier, SVM, Artificial Neural Network, Hidden Markov model and k-Nearest Neighbours.

Mucahit Buyukyilmaz, Ali Osman Cibikdiken et al. [7] proposed a deep learning model known as the Multilayer Perceptron (MLP) to
recognize the gender of a voice. The model shows that acoustic properties of voice and speech can be used to detect gender. An MLP deep learning algorithm was used to obtain the classification model from a dataset containing parameters of voice samples, and the proposed model achieves 96.74% accuracy.

Chiu Ying Hay, Ng Hian James et al. [8] proposed a gender classification system that identifies gender by analyzing voice samples with a pitch detection algorithm. Time-domain or frequency-domain approaches are used to process the voice signal. The gender of a voice sample can be determined using a simple weight-scoring algorithm.

Kavitha Yadav, Moreshmukhedkar et al. [9] proposed MFCC-based speaker recognition using MATLAB. The proposed system comprises speaker identification and speaker verification. A Pitch Detection Algorithm (PDA) is a set of steps used to detect the pitch of a speech signal. Feature extraction and feature matching are two important modules in gender identification. MFCC and PLP methods are used for feature extraction, and dynamic time warping is used for feature matching.

Bhagya Laxmi Jena, Beda Prakash and Panigrahi et al. [10] carried out research on gender classification by pitch analysis. It concentrates mainly on pitch analysis of speech signals. The average pitch values of male and female voices are found to differ. A comparison of different pitch values of male and female voice samples is included in the analysis.

T. Jayasankar, K. Vinothkumar, Arputha Vijayaselvi et al. [11] proposed an automatic gender classification system to determine gender from the speech signal. The gender recognition system has two levels, namely the front end and the back end. The front end represents the signal as a set of feature vectors: energy entropy (EE), zero crossing rate (ZCR) and short time energy (STE). The back end is a classifier.

Priyankamakwana et al. [12] observed that a human can easily identify a speaker as male or female by voice, but that this is very difficult for a computer. A computer therefore needs special learning or training, such as suitable inputs and methodology, for that task.

Jerzy Sas, Alexander et al. [13] proposed gender recognition using ASR techniques and neural networks. It presents a technique to recognize gender using MFCC features. The speech signal is divided into 20 ms frames. Mel-Frequency Cepstral Coefficients are extracted for each frame, and the resulting feature vector is fed into a neural network classifier, which classifies each frame as male or female.

Soonil Kwon, Guiyoung Son, and Neungsoo et al. [14] proposed a Recurrent Neural Network technique to classify gender based on the non-lexical cues of emergency calls. Much research has been performed in the last two decades, but there is still room for improvement. Recurrent neural networks and SVM are the two machine learning methods used to classify gender.

Hadi Hard, Limening Chen et al. [15] carried out research on gender identification based on the speech signal. Different parameters of voice samples are analyzed to predict the gender of the speaker. The fusion of features and classifiers performs better than any individual classifier.

SYSTEM ARCHITECTURE
System architecture describes the overall structure of the system and the way in which the structure provides integrity. As shown in figure 1, the voice dataset is first preprocessed using techniques such as normalization. The preprocessed data is then split into training and testing data in the ratio of 80% to 20%. Each model is trained on the training data and then tested using the testing dataset to calculate evaluation metrics such as accuracy, precision, recall, F1 score and Cohen kappa score. The accuracies produced by the different trained models are then compared, and the model with the best accuracy is chosen for classification.

[Figure 1 flow: Voice dataset -> Data pre-processing -> Training dataset / Testing dataset -> Algorithms (SVM, KNN, Decision Tree, Logistic Regression, Random Forest, Gradient Boosting) -> Trained model -> Compare accuracy -> Best accuracy results]
Fig 1: System Architecture
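
The flow in Fig 1 can be sketched end to end as follows. This is only an illustrative sketch: a synthetic dataset from make_classification stands in for the voice dataset (an assumption, so the printed accuracies say nothing about the paper's results), MinMaxScaler stands in for the normalization step, and max_iter=1000 for logistic regression is our own choice to help convergence.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the voice dataset (assumption: 20 numeric features, binary label).
X, y = make_classification(n_samples=600, n_features=20, random_state=0)

# Pre-processing: rescale every feature column to [0, 1] before splitting, as in the text.
X = MinMaxScaler().fit_transform(X)

# 80% of the rows for training, 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=30
)

models = {
    "SVM": SVC(random_state=42),
    "KNN": KNeighborsClassifier(n_neighbors=3),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=40),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=100, random_state=0),
}

# Train each model, then score it on the held-out testing data.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# The model with the highest test accuracy is the one chosen for classification.
best_model = max(accuracies, key=accuracies.get)
print(accuracies)
print("Best model:", best_model)
```

Keeping the models in a dictionary makes the final comparison a single max over the recorded accuracies.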


METHODOLOGIES
Data Set: The voice dataset contains different properties that are used to classify gender as male or female. The training dataset consists of nearly 3000 voice samples with different parameters, collected from kaggle.com. It has 21 attributes. The attributes are sd, meanfreq, median, Q25, Q75, IQR, skew, Kurt, sp.ent, sfm, mode, centroid, meanfun, minfun, maxfun, meandom, maxdom, dfrange, modindx, label. The target attribute is label, which takes the values male and female.

[Figure 2 flow: Voice data set -> Data pre-processing -> Algorithms (SVM, KNN, Decision tree, Random forest, Logistic regression, Gradient Boosting)]

Fig 2: System model of data pre-processing

Data Pre-processing: As shown in Fig 2, the data set is pre-processed using techniques such as data cleaning, data transformation and data reduction. The pre-processed data set is passed to the different machine learning algorithms, each of which outputs a prediction (either male or female). We take the output of the most accurate model as the final result. Data pre-processing is performed before storing the data in the database. It is a process in which missing values are filled in and noisy (irrelevant) data is removed. Scaling is also required on the numeric columns; normalization is used to scale the feature columns.

Normalization: It is a process used to rescale the values in the dataset so that they lie in the range of zero to one.

x = (xdata - ns.min(xdata)) / ns.max(xdata)

where
xdata = feature columns in the dataset
ns = object (alias) for numpy
ns.min(xdata) = minimum value of each feature column in the dataset
ns.max(xdata) = maximum value of each feature column in the dataset

Model Testing: After pre-processing, the dataset is split so that 80% is used for training and 20% for testing. The trained model is then tested against the testing data set. The results from the different trained models are compared and the best model is chosen. The present work intends to find the best solution for predicting gender using machine learning techniques.

Histogram: This project contains 21 attributes such as meanfreq, sd, median, Q25, Q75, IQR, skew, Kurt, sp.ent, sfm, mode, centroid, meanfun, minfun, maxfun, meandom, maxdom, dfrange, modindx. Each attribute has its own histogram. A histogram consists of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval, as shown in figure 3.
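
The normalization step described above can be sketched on a small array. The feature values below are made up for illustration. The sketch computes the formula as written in the text (divide by the column maximum) next to the standard min-max form (divide by the column range), which is included only for comparison; the text does not say that the authors used it.

```python
import numpy as np

# Hypothetical feature matrix: 4 samples (rows) x 3 feature columns.
xdata = np.array([
    [0.0, 10.0, 200.0],
    [0.5, 20.0, 400.0],
    [1.0, 30.0, 600.0],
    [2.0, 40.0, 800.0],
])

col_min = xdata.min(axis=0)  # minimum of each feature column
col_max = xdata.max(axis=0)  # maximum of each feature column

# Formula as written in the text: shift by the column minimum, divide by the column maximum.
x_text = (xdata - col_min) / col_max

# Standard min-max scaling (shown for comparison): divide by the column range,
# so every column spans exactly [0, 1].
x_minmax = (xdata - col_min) / (col_max - col_min)

print(x_text.max(axis=0))
print(x_minmax.max(axis=0))
```

With the text's formula a column reaches 1 only when its minimum is 0; with the range in the denominator every column reaches both 0 and 1.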

Fig 3: Histograms for various attributes

IMPLEMENTATION
Algorithms Used:
SVM:
In SVM the data points are plotted in the x-y plane. The algorithm then finds the best hyperplane, the one with the maximum margin, i.e. the distance to the closest data point in each group should be maximum. As shown in fig 4, the line splits the data into two different groups. The attribute values are plotted in the x-y plane by taking IQR as the x-axis and meanfun as the y-axis, and the trained SVM model finds the best hyperplane. The model is trained on the training dataset, and the trained model is then evaluated against the testing data to calculate the evaluation metrics.


Fig 4: Diagram for SVM
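
The maximum-margin idea can be illustrated by measuring the margin of two candidate separating lines. This is a minimal sketch with made-up 2-D points and hand-picked lines (both are assumptions for illustration); it does not train an SVM, it only shows the quantity that SVM training maximizes.

```python
import numpy as np

# Hypothetical 2-D points; labels are -1 and +1 for the two groups.
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

def margin(w, b, X, y):
    """Smallest signed distance from any point to the line w.x + b = 0.

    The value is positive only if every point lies on its correct side.
    """
    return np.min(y * (X @ w + b)) / np.linalg.norm(w)

# Two candidate separating lines, chosen by hand for illustration.
line_a = (np.array([1.0, 1.0]), -5.5)  # x1 + x2 = 5.5
line_b = (np.array([1.0, 0.0]), -3.0)  # x1 = 3

margin_a = margin(*line_a, X, y)
margin_b = margin(*line_b, X, y)

print("margin of line a:", margin_a)
print("margin of line b:", margin_b)
```

Both lines separate the two groups, but line a keeps a larger distance to its closest points, so it is the better hyperplane in the maximum-margin sense.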

Algorithm for SVM


Input: Voice Dataset in csv format.
Output: Trained Model with evaluation metrics like Accuracy, precision, F1 score, Recall, Kappa score.

Step1: In step1 packages are imported to perform Numerical calculations and visualization of graphs.
import numpy as ns
import pandas as ps
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, cohen_kappa_score

Step 2: In step 2 the pandas alias ps is used to read the dataset into a variable named set. The function read_csv is used to read the dataset.
set = ps.read_csv("voicedataset.csv")

Step 3: In step 3 Categorical values are transformed to Numerical values in the label attribute through male and female which are taken
as ‘0’ and ‘1’ respectively.
set.label = [1 if each == "female" else 0 for each in set.label]

Step 3.1: In step 3.1 divide the data into feature and target columns. y is the target column and xdata contains feature columns.
y =set.label.values
xdata = set.drop(["label"],axis=1)

Step 3.2: In step 3.2 perform Normalization to the Feature columns.


# axis=0 takes the minimum and maximum of each feature column
x = (xdata - ns.min(xdata, axis=0)) / (ns.max(xdata, axis=0)).values

Step 4: In step 4 Split the data into training data and testing data with testing data size is about 20% of the dataset. The function
train_test_split () divides the dataset into training data and testing data.
xtrain, xtest, ytrain, ytest = train_test_split (x,y,test_size=0.2,random_state = 30)

Step 5: Train the model


Step 5.1: Create an object for SVM. An object svm is created for the support vector classifier (SVC).
svm = SVC(random_state=42)

Step 5.2: Train the model with the training data. svm.fit() is used to train the model using the training dataset.
svm.fit(xtrain, ytrain)

Step 6: Evaluate the performance of the Model. Calculate accuracy and other evaluation metrics.


f=svm.predict(xtest)
accscore = accuracy_score(ytest, f)
recscore = recall_score(ytest, f)
f1score = f1_score (ytest, f)
kappascore=cohen_kappa_score(ytest,f)
prescore = precision_score(ytest, f)

In these steps, packages are imported to perform numerical calculations, and pandas is imported under the alias ps to read the dataset. The categorical values male and female are transformed to the numerical values 0 and 1 respectively. The data set is split into training and test datasets, and an SVM object is created to generate a trained model. The predictions of this trained model are compared with the target values of the test data to calculate the various scores.

DECISION TREE:
Decision trees can solve both classification and regression problems. The decision tree uses the ID3 algorithm, which splits the data on an attribute into two groups. The attribute that gives the highest information gain (the largest reduction in entropy), here meanfun, is taken as the root node, as shown in figure 5. The trained model is evaluated against the testing data to calculate the evaluation metrics. Fig 5 shows the decision tree that the trained model generates for the voice data.

Fig 5: Diagram for decision tree
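
How a root attribute is selected can be sketched by computing entropy and information gain for a candidate split. The meanfun values, labels and thresholds below are made up for illustration, and splitting a numeric attribute at a threshold is our assumption, since the text does not say how continuous attributes are divided.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting into values <= threshold and values > threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty separates nothing
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

# Hypothetical meanfun values and labels, made up for illustration.
meanfun = [0.10, 0.11, 0.12, 0.13, 0.17, 0.18, 0.19, 0.20]
label = ["male", "male", "male", "male", "female", "female", "female", "female"]

gain_good = information_gain(meanfun, label, threshold=0.15)  # separates the classes
gain_poor = information_gain(meanfun, label, threshold=0.11)  # leaves a mixed group

print("gain at 0.15:", gain_good)
print("gain at 0.11:", gain_poor)
```

The split with the larger gain leaves purer groups, which is why the attribute and threshold with the highest gain are placed at the root.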

Algorithm for Decision Tree


Input: Voice Dataset in csv Format.
Output: Trained Model with evaluation metrics like Accuracy, precision, F1 score, Recall, Kappa score.

Step1: In step1 packages are imported to perform Numerical calculations and visualization of graphs.
import numpy as ns
import pandas as ps
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score,f1_score, cohen_kappa_score

Step 2: In step 2 the pandas alias ps is used to read the dataset into a variable named set. The function read_csv is used to read the
dataset.
set = ps.read_csv("voicedataset.csv")

Step 3: In step 3 Categorical values are transformed to Numerical values in the label attribute through male and female which are taken
as ‘0’ and ‘1’ respectively.
set.label = [1 if each == "female" else 0 for each in set.label]

Step 3.1: In step 3.1 divide the data into feature and target columns.y is the target column and xdata contains feature columns.
y = set.label.values
xdata = set.drop(["label"],axis=1)

Step 3.2: In step 3.2 perform Normalization to the Feature columns.


# axis=0 takes the minimum and maximum of each feature column
x = (xdata - ns.min(xdata, axis=0)) / (ns.max(xdata, axis=0)).values

Step 4: In step 4 Split the data into training data and testing data with testing data size is about 20% of the dataset. The function
train_test_split () divides the dataset into training data and testing data.
xtrain, xtest, ytrain, ytest = train_test_split (x,y,test_size=0.2,random_state = 30)

Step5: Train the model


Step 5.1: Create an object for Decision Tree classifier.
dectree = DecisionTreeClassifier ()

Step 5.2: Train the model with Train data.dectree.fit () is used to train the model using training dataset.
dectree.fit(xtrain, ytrain)

Step 6: Evaluate the performance of the Model. Calculate accuracy and other evaluation metrics.
f = dectree.predict(xtest)
accscore = accuracy_score(ytest, f)
recscore = recall_score(ytest, f)
f1score = f1_score (ytest, f)
kappascore =cohen_kappa_score(ytest,f)
prescore = precision_score(ytest, f)

In these steps, packages are imported to perform numerical calculations, and pandas is imported under the alias ps to read the dataset. The categorical values male and female are transformed to the numerical values 0 and 1 respectively. The data set is split into training and test datasets, and a decision tree object is created to generate a trained model. The predictions of this trained model are compared with the target values of the test data to calculate the various scores.

LOGISTIC REGRESSION:
It is a classification algorithm used to predict the probability of a categorical dependent variable that takes the value 0 or 1. Here the x-axis represents IQR and the y-axis represents meanfun. First the dataset is loaded into a variable, and then the dataset is split. The trained model is evaluated against the testing data to obtain the metrics.

Algorithm for Logistic Regression

Input: Voice Dataset in csv Format.
Output: Trained Model with evaluation metrics like Accuracy, precision, F1 score, Recall, Kappa score.

Step1: In step1 packages are imported to perform Numerical calculations and visualization of graphs.
import numpy as ns
import pandas as ps
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score,f1_score, cohen_kappa_score

Step 2: In step 2 the pandas alias ps is used to read the dataset into a variable named set. The function read_csv is used to read the
dataset.
set = ps.read_csv("voicedataset.csv")

Step 3: In step 3 Categorical values are transformed to Numerical values in the label attribute through male and female which are taken
as ‘0’ and ‘1’ respectively.
set.label = [1 if each == "female" else 0 for each in set.label]

Step 3.1: In step 3.1 divide the data into feature and target columns, y is the target column and x data contain feature columns.
y = set.label.values
xdata = set.drop(["label"],axis=1)


Step 3.2: In step 3.2 perform Normalization to the Feature columns.


# axis=0 takes the minimum and maximum of each feature column
x = (xdata - ns.min(xdata, axis=0)) / (ns.max(xdata, axis=0)).values

Step 4: In step 4 Split the data into training data and testing data with testing data size is about 20% of the dataset. The function
train_test_split () divides the dataset into training data and testing data.
xtrain, xtest, ytrain, ytest = train_test_split (x,y,test_size=0.2,random_state = 30)

Step5: Train the model


Step 5.1: Create an object for Logistic Regression.
logreg = LogisticRegression()

Step 5.2: Train the model with Train data.logreg.fit () is used to train the model using training dataset.
logreg.fit(xtrain, ytrain)

Step 6: Evaluate the performance of the Model. Calculate accuracy and other evaluation metrics.
f = logreg.predict(xtest)
accscore = accuracy_score(ytest, f)
recscore = recall_score(ytest, f)
f1score = f1_score (ytest, f)
kappascore =cohen_kappa_score(ytest,f)
precision = precision_score(ytest, f)

In these steps, packages are imported to perform numerical calculations, and pandas is imported under the alias ps to read the dataset. The categorical values male and female are transformed to the numerical values 0 and 1 respectively. The data set is split into training and test datasets, and a logistic regression object is created to generate a trained model. The predictions of this trained model are compared with the target values of the test data to calculate the various scores.

RANDOM FOREST:
It is an ensemble learning method used to achieve better predictive performance. Random samples are drawn from the training data, and a classification or regression tree is built on each sample. As shown in figure 6, predictions are made by combining a large number of individual decision trees.

Fig 6: Diagram for random forest

Algorithm for Random Forest


Input: Voice Data set in csv Format.
Output: Trained Model with evaluation metrics like Accuracy, precision, F1 score, Recall, Kappa score.

Step1: In step1 packages are imported to perform Numerical calculations and visualization of graphs.


import numpy as ns
import pandas as ps
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, cohen_kappa_score

Step 2: In step 2 the pandas alias ps is used to read the dataset into a variable named set. The function read_csv is used to read the
dataset.
set = ps.read_csv("voicedataset.csv")

Step 3: In step 3 Categorical values are transformed to Numerical values in the label attribute through male and female which are taken
as ‘0’ and ‘1’ respectively.
set.label = [1 if each == "female" else 0 for each in set.label]
Step 3.1: In step 3.1 divide the data into feature and target columns.y is the target column and xdata contains feature columns.
y = set.label.values
xdata = set.drop(["label"],axis=1)
Step 3.2: In step 3.2 perform Normalization to the Feature columns.
# axis=0 takes the minimum and maximum of each feature column
x = (xdata - ns.min(xdata, axis=0)) / (ns.max(xdata, axis=0)).values

Step 4: In step 4 Split the data into training data and testing data with testing data size is about 20% of the dataset. The function
train_test_split () divides the dataset into training data and testing data.
xtrain, xtest, ytrain, ytest = train_test_split (x,y,test_size=0.2,random_state = 30)

Step 5: Train the model


Step 5.1: Create an object for Random Forest classifier.
randforest = RandomForestClassifier (n_estimators=100, random_state=40)
Step 5.2: Train the model with Train data.randforest.fit () is used to train the model using training dataset.
randforest.fit(xtrain, ytrain)

Step 6: Evaluate the performance of the Model. Calculate accuracy and other evaluation metrics.
f = randforest.predict(xtest)
accscore = accuracy_score(ytest, f)
recscore = recall_score(ytest, f)
f1score = f1_score (ytest, f)
kappascore=cohen_kappa_score(ytest,f)
prescore = precision_score(ytest, f)

In these steps, packages are imported to perform numerical calculations, and pandas is imported under the alias ps to read the dataset. The categorical values male and female are transformed to the numerical values 0 and 1 respectively. A random forest object is created to generate a trained model. The predictions of this trained model are compared with the target values of the test data to calculate the various scores.

K-NEAREST NEIGHBOUR:
KNN is a supervised machine learning algorithm used to solve both classification and regression problems. The algorithm stores the training data and classifies new data based on similarity measures. A new point is classified by a majority vote of its neighbours. The trained model is then evaluated against the testing data to calculate the evaluation metrics. Accuracy might increase as the number of nearest neighbours, the value of k, increases.
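
The majority vote described above can be sketched in a few lines. The training points, labels and query points are made up for illustration, and Euclidean distance is assumed as the similarity measure since the text does not name one.

```python
from collections import Counter

import numpy as np

def knn_predict(x_train, y_train, query, k=3):
    """Classify one query point by majority vote among its k nearest training points."""
    distances = np.linalg.norm(x_train - query, axis=1)  # Euclidean distance to every point
    nearest = np.argsort(distances)[:k]                   # indices of the k closest points
    votes = Counter(int(y_train[i]) for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical normalized features (two columns) with labels 0 = male, 1 = female.
x_train = np.array([[0.10, 0.20], [0.15, 0.25], [0.20, 0.20],
                    [0.80, 0.90], [0.85, 0.80], [0.90, 0.85]])
y_train = np.array([0, 0, 0, 1, 1, 1])

pred_low = knn_predict(x_train, y_train, np.array([0.12, 0.22]))
pred_high = knn_predict(x_train, y_train, np.array([0.88, 0.84]))

print(pred_low, pred_high)
```

Choosing an odd k, such as the n_neighbors=3 used in the paper's code, avoids tied votes when there are two classes.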

Algorithm for KNN


Input: Voice Dataset in csv Format.
Output: Trained Model with evaluation metrics like Accuracy, precision, F1 score, Recall, Kappa score.

Step1: In step1 packages are imported to perform Numerical calculations and visualization of graphs.


import numpy as ns
import pandas as ps
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score,f1_score, cohen_kappa_score

Step 2: In step 2 the pandas alias ps is used to read the dataset into a variable named set. The function read_csv is used to read the
dataset.
set = ps.read_csv("voicedataset.csv")

Step 3: In step 3 Categorical values are transformed to Numerical values in the label attribute through male and female which are taken
as ‘0’ and ‘1’ respectively.
set.label = [1 if each == "female" else 0 for each in set.label]
Step 3.1: In step 3.1 divide the data into feature and target columns, y is the target column and x data contain feature columns.
y = set.label.values
xdata = set.drop(["label"],axis=1)

Step 3.2: In step 3.2 perform Normalization to the Feature columns.


# axis=0 takes the minimum and maximum of each feature column
x = (xdata - ns.min(xdata, axis=0)) / (ns.max(xdata, axis=0)).values

Step 4: The function train_test_split () divides the dataset into training data and testing data.

xtrain, xtest, ytrain, ytest = train_test_split (x,y,test_size=0.2,random_state = 30)

Step5: Train the model


Step 5.1: Create an object for KNN.
knn = KNeighborsClassifier(n_neighbors=3)

Step 5.2: Train the model with Train data.knn.fit () is used to train the model using training dataset.
knn.fit (xtrain, ytrain)

Step 6: Evaluate the performance of the Model. Calculate accuracy and other evaluation metrics.
f=knn.predict(xtest)
accscore = accuracy_score(ytest, f)
recscore = recall_score(ytest, f)
f1score = f1_score (ytest, f)
kappascore=cohen_kappa_score(ytest,f)
prescore = precision_score(ytest, f)

In these steps, packages are imported to perform numerical calculations, and pandas is imported under the alias ps to read the dataset. The categorical values male and female are transformed to the numerical values 0 and 1 respectively. The data set is split into training and test datasets, and a KNN object is created to generate a trained model. The predictions of this trained model are compared with the target values of the test data to calculate the various scores.

GRADIENT BOOSTING:
It is a type of ensemble learning. In boosting, the models are built in series. It takes many weak learners and combines them into a strong learner. Gradient boosting takes a decision tree as the weak learner. It finds the errors at the leaf nodes and constructs a second tree by taking the error values as the new observation values; this process continues for a specified number of times, as shown in figure 7. The trained model is then evaluated against the testing data to calculate the evaluation metrics.


Fig 7: Diagram for gradient boosting
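
The idea of fitting each new tree to the errors of the previous ones can be sketched as below. This is a simplified squared-error variant on made-up data, using depth-1 regression trees as weak learners; the learning rate and number of rounds are arbitrary choices, and it is not the exact procedure inside GradientBoostingClassifier.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical 1-D feature and 0/1 labels, made up for illustration.
X = np.array([[0.10], [0.12], [0.14], [0.16], [0.18], [0.20]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

learning_rate = 0.5
n_rounds = 10

# Start from a constant prediction: the mean of the targets.
prediction = np.full_like(y, y.mean())
initial_error = np.mean((y - prediction) ** 2)

trees = []
for _ in range(n_rounds):
    residual = y - prediction                   # errors of the current ensemble
    stump = DecisionTreeRegressor(max_depth=1)  # weak learner: a one-split tree
    stump.fit(X, residual)                      # the next tree is fitted to the errors
    prediction += learning_rate * stump.predict(X)
    trees.append(stump)

final_error = np.mean((y - prediction) ** 2)
predicted_labels = (prediction >= 0.5).astype(int)

print("error before boosting:", initial_error)
print("error after boosting:", final_error)
print("predicted labels:", predicted_labels)
```

Each round shrinks the remaining error, which is how a series of weak learners adds up to a strong one.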

Algorithm for Boosting:


Input: Voice Dataset in csv Format.
Output: Trained Model with evaluation metrics like Accuracy, precision, F1 score, Recall, Kappa score.

Step1:
In step1 packages are imported to perform Numerical calculations and visualization of graphs.
import numpy as ns
import pandas as ps
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score,f1_score, cohen_kappa_score

Step 2: In step 2 the pandas alias ps is used to read the dataset into a variable named set. The function read_csv is used to read the
dataset.
set = ps.read_csv("voicedataset.csv")

Step 3: In step 3 Categorical values are transformed to Numerical values in the label attribute through male and female which are taken
as ‘0’ and ‘1’ respectively.
set.label = [1 if each == "female" else 0 for each in set.label]
Step 3.1: In step 3.1 divide the data into feature and target columns, y is the target column and x data contain feature columns.
y = set.label.values
xdata = set.drop(["label"],axis=1)

Step 3.2: In step 3.2 perform Normalization to the Feature columns.


# axis=0 takes the minimum and maximum of each feature column
x = (xdata - ns.min(xdata, axis=0)) / (ns.max(xdata, axis=0)).values

Step 4: In step 4 Split the data into training data and testing data with testing data size is about 20% of the dataset. The function
train_test_split () divides the dataset into training data and testing data.
xtrain, xtest, ytrain, ytest = train_test_split (x,y,test_size=0.2,random_state = 30)

Step5: Train the model


Step 5.1: Create an object for Gradient Boost Classifier.
gbc = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0, max_depth=1)
Step 5.2: Train the model with Train data.gbc.fit() is used to train the model using training dataset.
gbc.fit(xtrain, ytrain)

Step 6: Evaluate the performance of the Model. Calculate accuracy and other evaluation metrics.
f=gbc.predict(xtest)
accscore = accuracy_score (ytest, f)
recscore = recall_score (ytest, f)
f1score = f1_score (ytest, f)
kappascore =cohen_kappa_score(ytest,f)
prescore = precision_score(ytest, f)


In the numbered steps above, packages are imported to perform numerical calculations, and pandas is imported as pd to read the dataset. Categorical values are transformed to numerical values: male and female are encoded as 0 and 1 respectively. An object is created for the boosting algorithm to generate a trained model. The predictions of this trained model are compared with the target values of the test data to compute the various scores.

RESULTS & ANALYSIS
The trained models are tested using the testing data and give accuracy, precision, Kappa score, F1 score and recall as output. The best model is chosen from them based on accuracy.

Confusion Matrix: It is a table with n rows and n columns, where n represents the number of target classes. It is used to evaluate the performance of a classification model by using true positive (P), false positive (Q), true negative (R) and false negative (S) values.
Accuracy: It is the ratio of the number of correct predictions to the total number of input samples.
Accuracy = (P + R) / (P + R + S + Q)
Precision: It indicates how relevant the positive detections are.
Precision = P / (P + Q)
Recall: It is the number of correct results divided by the number of results that should have been returned.
Recall = P / (P + S)
F1 score: It is a measure of test accuracy, defined as the harmonic mean of the test's precision and recall.
F1 score = 2P / (2P + Q + S)
Kappa score: It is a statistic used to measure inter-rater reliability.
Kappa score = (Total Accuracy - Random Accuracy) / (1 - Random Accuracy)
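The formulas above can be checked with a short sketch that computes each score directly from the four counts P, Q, R and S. The function name metrics_from_counts is hypothetical, and the expression used for Random Accuracy (the agreement expected by chance given the row and column totals) is the usual expansion for Cohen's kappa, which the text above does not spell out. The example call uses the counts reported for KNN in Fig. 8.

```python
def metrics_from_counts(p, q, r, s):
    """Compute the evaluation scores from confusion-matrix counts.

    p = true positives, q = false positives,
    r = true negatives, s = false negatives.
    """
    total = p + q + r + s
    accuracy = (p + r) / total
    precision = p / (p + q)
    recall = p / (p + s)
    f1 = 2 * p / (2 * p + q + s)
    # Random (chance) accuracy: agreement expected if the predictions were
    # independent of the true labels, given the same row and column totals.
    random_accuracy = ((p + s) * (p + q) + (r + q) * (r + s)) / total ** 2
    kappa = (accuracy - random_accuracy) / (1 - random_accuracy)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Counts reported for KNN in Fig. 8.
knn = metrics_from_counts(p=331, q=37, r=212, s=54)
print(f"accuracy = {knn['accuracy']:.3f}")
```

Because F1 is written here as 2P / (2P + Q + S), it can be computed without first evaluating precision and recall; the two forms are algebraically equivalent.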

Fig. 8. Confusion matrix for KNN
Fig. 9. Confusion matrix for Logistic Regression

In Figure 8, there are 331 true positives, 54 false negatives, 37 false positives and 212 true negatives. In Figure 9, there are 341 true positives, 44 false negatives, 22 false positives and 227 true negatives.

Fig. 10. Confusion matrix for SVM
Fig. 11. Confusion matrix for Decision Tree

In Figure 10, there are 335 true positives, 50 false negatives, 9 false positives and 240 true negatives. In Figure 11, there are 328 true positives, 57 false negatives, 14 false positives and 235 true negatives.


Fig. 12. Confusion matrix for Random Forest
Fig. 13. Confusion matrix for Gradient Boosting

In Figure 12, there are 332 true positives, 53 false negatives, 11 false positives and 238 true negatives. In Figure 13, there are 334 true positives, 51 false negatives, 16 false positives and 233 true negatives.

As shown in Table 1, the gradient boosting algorithm achieves the highest accuracy of all the algorithms compared. This suggests that ensemble algorithms work well for this classification task.

Table 1: Comparison of results of various algorithms

Algorithms            Accuracy   Precision   Recall   F1 score   Kappa score
Random Forest         89%        81%         95%      88%        79%
Logistic Regression   89%        83%         91%      87%        78%
Decision Tree         82%        77%         79%      64%        78%
KNN                   86%        79%         86%      83%        71%
SVM                   88%        82%         87%      75%        85%
Boosting              90%        82%         95%      88%        79%
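Choosing the best model by accuracy, as described at the start of this section, amounts to ranking the accuracy column of Table 1. A minimal sketch, with the values copied from the table and the variable names chosen for illustration:

```python
# Accuracy (%) per algorithm, as listed in Table 1.
accuracy_percent = {
    "Random Forest": 89,
    "Logistic Regression": 89,
    "Decision Tree": 82,
    "KNN": 86,
    "SVM": 88,
    "Boosting": 90,
}

# Rank the algorithms from highest to lowest accuracy.
ranking = sorted(accuracy_percent.items(), key=lambda item: item[1], reverse=True)
best_name, best_accuracy = ranking[0]
print(f"Best by accuracy: {best_name} ({best_accuracy}%)")
```

Ranking on a single metric is simple but ignores the other columns; the same pattern works with any other score as the sort key.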

Fig. 14. Accuracy vs. algorithms


In Figure 14, the accuracies of the various algorithms are compared using a bar graph.

Comparison of different scores:


In Figure 15, the different scores (accuracy, precision, recall, F1 score and Kappa score) are compared using a graph.


Fig. 15. Graph to compare different scores

Scale: on the x-axis, 1 cm = 1 unit; on the y-axis, 1 cm = 0.05 unit.
In Figure 15, accuracy is plotted in red, precision in green, recall in blue, F1 score in yellow and Kappa score in black. The six algorithms (KNN, Random Forest, Gradient Boosting, SVM, Decision Tree and Logistic Regression) were trained and tested on the voice dataset. Gradient Boosting performs better than the other algorithms on the accuracy metric. Hence, ensemble methods appear well suited to this classification problem.

CONCLUSION
Classifying gender from a voice dataset is a prominent use of machine learning algorithms. In this project we proposed a model that classifies gender from a voice dataset accurately and efficiently. We attempted to classify gender using six trained models, among which the gradient boosting model performs better than the others. The proposed model has the best accuracy of the six. Models with good performance will help in developing and using voice-based gender recognition systems more effectively across a wide range of applications.
