Voice Recognization using Machine Learning Approach

Article · November 2021

1 author:

Tayba Asgher
Riphah International University
All content following this page was uploaded by Tayba Asgher on 07 November 2021.


Voice Recognization using Machine Learning Approach
Tayba Asgher
Department of Computer Science
Riphah International University
Lahore, Pakistan
asghertayba@gmail.com

Abstract— Voice is an efficient means of communication among humans and the most natural way to create connections between peers. This paper therefore introduces a technique that predicts from a voice sample whether the speaker is male or female, using a machine learning (ML) approach. Researchers have used many methods to obtain accurate results, such as the Hidden Markov Model, Dynamic Time Warping, Artificial Neural Network (ANN), Support Vector Machine (SVM), and so on. This study uses an ML-based approach with ANN and SVM, in which the Bayesian Regularization algorithm is used for the ANN, and obtains accuracies of 98.78% and 97.6%, respectively. All experiments use the MATLAB Optimize tool.

Keywords—Artificial Neural Network (ANN), Support Vector Machine (SVM), MATLAB, Machine Learning (ML).

I. INTRODUCTION
Voice recognition is a technique in which words and phrases spoken by a human are translated into digital signals, and these signals are then converted into coding patterns to which meaning has been assigned [1]. We focus on the human voice because we most regularly and most naturally use our voice to convey our thoughts to others around us. Although computer programs are usually designed to produce an accurate and precise response upon receiving the appropriate input, spoken words and vocal sounds differ between people, and the same words can carry different meanings when spoken in different accents. Several methods have been applied, with varying degrees of success, to reduce these problems.

For gender classification, a speech and voice recognition system can be used. Just as the human ear perceives the difference in voice, a machine can be taught to do the same by selecting the correct features from speech data and feeding them to an ML algorithm. Gender recognition is a method that automatically determines the gender class of a speaker by processing speech signals. Voice signals taken from recorded speech can be used to obtain acoustic attributes such as duration, intensity, frequency, and filtering [2].

Machine learning (ML) is a branch of artificial intelligence (AI) that comprises methods, or algorithms, for automatically building models from data [3]. Whereas a rule-based system performs a task the same way every time, for better or worse, the performance of an ML system can be improved through training, by exposing the algorithm to more data. The basic idea of Artificial Neural Networks (ANNs) rests on the belief that the working of the human brain can be imitated, by making the right connections, using silicon and wires in place of living neurons and dendrites [4]. Deep learning (DL) is a class of learning methods for ANNs that include a large number of "hidden" layers to recognize patterns. These layers lie between the input and output layers. Each layer is made up of artificial neurons, frequently with sigmoid or Rectified Linear Unit activation functions. Several ML algorithms, such as ANN, SVM, and K-Nearest Neighbor (KNN), are used in Internet of Things (IoT) based systems for prediction and classification [5].

In a feed-forward network with shortcut connections, some links can skip over one or more intermediate layers. In this study, an ML-based approach is used to classify speakers as male or female from their voice: the ANN and SVM algorithms are applied to predict the class label.
II. LITERATURE REVIEW
Many researchers have used these techniques in different ways and with distinct limitations to achieve the best outcomes. The combination of ANN with HMM can be realized with a spatio-temporal ANN used for a short sequence of speech units [6]. In 2011, Harada designed a voice game controller for operating games by voice alone. The results indicate that the silence-signal data were 50% more robust than the speech data [7]. In 2012, Kumar et al. created two mobile games based on speech data that aimed to help children learn to read English [8]. In 2013, Hamalainen et al. designed an educational game based on speech recognition to teach children between 3 and 10 years of age the basic skills of music and arithmetic [9]. A wavelet-based feature vector is matched using DTW; the approach centers on a fault-diagnosis scheme for internal combustion engines [10]. In speech recognition, a computational-intelligence solution takes the role of the classifier. Three commonly applied classifiers are dynamic time warping, the hidden Markov model, and artificial neural networks [11].

A two-step Gaussian Mixture Model (GMM) method was proposed to identify age and gender. The proposed classifier was first verified for the detection of four age groups and for identifying the gender of all but children's voices in the Czech and Slovak languages. The prediction accuracy of this identification was above 90% [12]. Similar work developed a two-level GMM classifier to detect age and gender; the classification accuracy obtained on gender recognition was 97.5% [13]. Moreover, the obtained classification accuracy results were compared with the results obtained by the conventional listening test, which is an evaluation method for the quality of synthetic speech [14].

III. PROPOSED METHODOLOGY
The proposed SVM and ANN model is shown in Figure 1. The key feature of speech perception is the determination of a specific segment of speech. Many procedures can be applied in the ANN and SVM model. This research proposes a new architecture for predicting gender with a feedforward propagation neural network. There are two phases, a training phase and a validation phase, and these two stages communicate with each other over a cloud environment.

Figure 1: Proposed Model
A. Dataset
Two different datasets, Gender Recognition and Voice Recognition, are taken from kaggle.com. The first is used for the ANN and has 3169 samples and 21 features, with a few missing values. Its output variable has two classes that indicate whether the voice is that of a man or a woman. The Voice Recognition dataset, also available online, has 2840 samples with 21 features, and its target variable has two classes, male and female.

B. Pre-processing
First, sample data and group data are formulated to build the architecture of the ANN model. The noise of the signal is measured before and after pre-processing, as in the given figures. For this sample data, built-in noise values are taken from the noissin variable. After data collection, three basic techniques are applied before training the artificial neural network. Missing data is the first problem, and missing entries are replaced by the mean value. Second, the data are normalized, and lastly the data are randomized. The mean method used to fill in the missing values is formulated as:

$$T(c) = \begin{cases} \mathrm{mean}(c), & \text{if } c = \text{null} \\ c, & \text{otherwise} \end{cases}$$

C. Application layer
The application training layer starts after pre-processing. It is split into two sublayers: the prediction layer and the performance evaluation layer. The prediction layer uses the adapted feedforward neural network, which is further divided into three layers: the input layer, the hidden layer, and the output layer. The input layer has 21 neurons, the hidden layer has 30 neurons, and the output layer has a single neuron, so there is only one output, as shown in Fig. 1.

The sigmoid activation function s(x) = sigmoid(x) is used. The input to the hidden layer is written as in Eq. (1), where $x_i$ is the i-th input, $U_{ij}$ is the weight from input i to hidden neuron j, and $b_1$ is the bias.

$$\psi_j = \sum_{i=1}^{m} (U_{ij} \cdot x_i) + b_1 \qquad (1)$$

The output of the hidden layer of the proposed structure with the sigmoid function is presented in Eq. (2).

$$\partial_j = \frac{1}{1 + e^{-\psi_j}}, \quad \text{where } j = 1, 2, 3, \ldots, n \qquad (2)$$

The input to the output layer is shown in Eq. (3), where $\beta_{jk}$ is the weight from hidden neuron j to output neuron k and $b_2$ is the bias.

$$\psi_k = b_2 + \sum_{j=1}^{n} (\beta_{jk} \cdot \partial_j) \qquad (3)$$

The output layer activation function is shown in Eq. (4).

$$\partial_k = \frac{1}{1 + e^{-\psi_k}}, \quad \text{where } k = 1, 2, 3, \ldots, q \qquad (4)$$

The error in feed-forward propagation is shown in Eq. (5), where $\tau_k$ is the target output.

$$E = \sum_{k} (\tau_k - \partial_k)^2 \qquad (5)$$

After that, the performance of the prediction layer is evaluated in terms of mean squared error (MSE), accuracy, and miss rate. If the required learning criteria are not met, the prediction layer is retrained. Once the learning criteria are met, the trained model moves forward to validation.

D. Validation
Once the data are saved on the cloud, the validation step takes place. It can be further divided into two layers, where the input data are the same as before. The data are passed to the prediction layer, which processes them and predicts the gender.

IV. RESULTS AND DISCUSSION
The machine learning procedure has been applied to the datasets, and the MATLAB tool is used for the experiments. The dataset is collected from Kaggle [14]. In the ML approach, there were 2099 instances for training the dataset. 70% of the data (1469) is used for training, while the remaining 30% (629) is used for validation.

Table 1: Performance Calculation
Proposed Model   Accuracy (%)   Miss rate (%)   RMSE
Train            98.78          0.28            1.37 x 10^-3
Validation       90.00          10.31           5.06 x 10^-2

We use the ANN fitting app to simulate our dataset, applying the Bayesian algorithm, and obtained sufficiently good results.

To measure the performance of the proposed model, we used statistical measures: accuracy, miss rate, sensitivity, specificity, and precision, together with
False positive ratio = 1 - specificity
False negative ratio = 1 - sensitivity

In this paper, SVM is also applied to voice-based gender identification, to compare the performance of the algorithms. The proposed architecture works on the mechanisms of parallel computing. The K-fold cross-validation method is used to split the data into different folds for training and testing. Different evaluation metrics are used to assess the performance of the architecture, as shown below.
Table 2: Performance of the proposed Architecture
Proposed Model   Accuracy (%)   Specificity (%)   Precision (%)   Sensitivity (%)   Miss rate (%)   FPR (%)   FNR (%)
SVM              97.6           97                94              96                2.4             0.0202    0.0278
ANN              98.78          97.5              97.3            97.4              1.22            0.012     0.011
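
As an illustration, the measures named in Section IV can be computed from the four counts of a binary confusion matrix. The sketch below assumes the standard confusion-matrix definitions and takes the miss rate as 1 - accuracy, which agrees with Table 2 (100 - 97.6 = 2.4). The `Confusion` class, the `metrics` function, and the counts in the usage lines are hypothetical and are not the paper's data or results.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of a binary confusion matrix (one class is treated as positive)."""
    tp: int  # positives predicted as positive
    tn: int  # negatives predicted as negative
    fp: int  # negatives predicted as positive
    fn: int  # positives predicted as negative


def metrics(c):
    """Return the evaluation measures as fractions in [0, 1]."""
    total = c.tp + c.tn + c.fp + c.fn
    accuracy = (c.tp + c.tn) / total
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    return {
        "accuracy": accuracy,
        "miss_rate": 1 - accuracy,   # assumed: misclassification rate
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "fpr": 1 - specificity,      # false positive ratio
        "fnr": 1 - sensitivity,      # false negative ratio
    }


# Hypothetical counts, for illustration only.
m = metrics(Confusion(tp=45, tn=40, fp=10, fn=5))
for name, value in m.items():
    print(f"{name}: {100 * value:.2f}%")
```

Multiplying each value by 100 gives percentages in the form used by Table 2.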

Figure 2: Comparison of Proposed model with previous Algorithms
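
The pre-processing steps of Section III-B and the forward pass of Eqs. (1) to (5) in Section III-C can be sketched in a few lines. This is a minimal illustration, not the MATLAB implementation used in the experiments. Several points are assumptions: missing values are marked with `None`, normalization is min-max scaling (the paper does not say which scheme was used), each layer has a single bias term as the equations are written, and the random weights are untrained placeholders. All function names are hypothetical.

```python
import math
import random


def impute_mean(column):
    """T(c): replace missing entries (None) with the mean of the observed ones."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def min_max(column):
    """Scale a column to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, U, b1, beta, b2):
    """Forward pass through one hidden layer, following Eqs. (1)-(4)."""
    n_hidden = len(U[0])
    # Eq. (1): net input of hidden neuron j; Eq. (2): its sigmoid output.
    hidden = [sigmoid(sum(U[i][j] * x[i] for i in range(len(x))) + b1)
              for j in range(n_hidden)]
    n_out = len(beta[0])
    # Eq. (3): net input of output neuron k; Eq. (4): its sigmoid output.
    return [sigmoid(b2 + sum(beta[j][k] * hidden[j] for j in range(n_hidden)))
            for k in range(n_out)]


def squared_error(target, output):
    """Eq. (5): sum of squared differences between target and output."""
    return sum((t - o) ** 2 for t, o in zip(target, output))


# Pre-process one hypothetical feature column.
feature = min_max(impute_mean([0.2, None, 0.6, 1.0]))

# 21 inputs, 30 hidden neurons, 1 output, with placeholder weights.
rng = random.Random(0)
n_in, n_hidden, n_out = 21, 30, 1
U = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
beta = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_hidden)]
sample = [rng.random() for _ in range(n_in)]
prediction = forward(sample, U, 0.0, beta, 0.0)
print(prediction, squared_error([1.0], prediction))
```

Because the output neuron uses a sigmoid, the prediction lies between 0 and 1; thresholding it (for example at 0.5, an assumed cut-off) would turn it into a two-class label.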

CONCLUSION
ANN is a basic and widely applicable technique for future computing. This paper shows that it is a very useful method for voice recognition classification. It works more like a human brain than conventional computer logic. Here, the Neural Fitting app and SVM are used for simulation in MATLAB and achieve better results than MLP, although the training model is more difficult, complex, and sensitive, which may cause problems. Researchers have used many methods to obtain accurate results, such as the Hidden Markov Model, Dynamic Time Warping, Artificial Neural Network (ANN), SVM, and so on. This study uses an ML-based approach, ANN with the Bayesian Regularization technique and SVM, and obtains accuracies of 98.78% and 97.6%. All experiments use the MATLAB Optimize tool. We hope this study gives a straightforward understanding of ANN and motivates the research community working on Spontaneous Voice Recognition. In future work, we will apply hybridization empowered by a fuzzy logic designer.

REFERENCES
[1] Adams, Russ (New York, 1990) Sourcebook of Automatic Identification and Data Collection, ADA90: Van Nostrand Reinhold.
[2] Gamit, M.R.; Dhameliya, K.; Bhatt, N.S. Classification Techniques for Speech Recognition: A Review. Int. J. Emerging Technol. Adv. Eng. 2015, 5, 58–63.
[3] Martin Heller (May 15, 2019) https://www.infoworld.com/article/3214424/what-is-machine-learning-intelligence-derived-from-data.html, Contributing Editor, InfoWorld.
[4] tutorialspoint (2015) https://www.tutorialspoint.com/artificial_intelligence/artificial_intelligence_neural_networks.htm
[5] Y. Meidan, M. Bohadana, A. Shabtai, J. D. Guarnizo, M. Ochoa et al., "ProfilIoT: A machine learning approach for IoT device identification based on network traffic analysis," in Proc. of the Symp. on Applied Computing, Marrakech, Morocco, pp. 506–509, 2017.
[7] R. J. Schalkoff, "Pattern Recognition: Statistical, Structural and Neural Approaches," 1992.
[8] Yuan Shao, "Higher Order Spectra Invariants for Shape Pattern Recognition," College of Engineering and Technology, Ohio University, 2000.
[9] A. M. A. Al-Shatnawi, "A Non-Iterative Thinning Method Based On Exploited Vertices of Voronoi Diagrams," PhD, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, 2010.
[10] Maitha H. Al Shamisi, Ali H. Assi and Hassan A. N. Hejase (n.d.) Using MATLAB to Develop Artificial Neural Network Models for Predicting Global Solar Radiation in Al Ain City – UAE, Engineering Education and Research Using MATLAB.
[11] B. Rehmam, Z. Halim, G. Abbas, and T. Muhammad, "Artificial neural network-based speech recognition using dwt analysis applied on isolated words from oriental languages," Malaysian Journal of Computer Science, vol. 28, pp. 242-262, 2015.
[12] L. Boussaid and M. Hassine, "Arabic isolated word recognition system using hybrid feature extraction techniques and neural network," International Journal of Speech Technology, vol. 21, pp. 29-3
[13] Přibil, J.; Přibilová, A.; Matoušek, J. GMM-based speaker gender and age classification after voice conversion. In Proceedings of the 2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE), Aalborg, Denmark, 6–8 July 2016; pp. 1–5.
[14] Aaron Nichie (May 2013) Voice Recognition Using Artificial Neural Networks and Gaussian Mixture Models, International Journal of Engineering Science and Technology (IJEST), Vol. 5 No.05.
[15] Bhushan C. Kamble (2016) Speech Recognition Using Artificial Neural Network – A Review, Int'l Journal of Computing, Communications & Instrumentation Engg. (IJCCIE), Vol. 3, Issue 1, ISSN 2349-1469 EISSN 2349-1477.
