

the Agricultural Ministry of the Bangladesh government. Thereafter, we predict six specific crops, namely Aus, Aman, Boro, Watermelon, Cotton, and Wheat, over various time frames and under various climate conditions. We also carry out our work in a particular location where we obtain the variables in real time using an IoT system. We discover significant relationships between climate data and crop cultivation, and we predict the agricultural trend from the IoT system data with the help of our model based on decision tree regression.
The rest of the paper is organized as follows. Section 15.2 describes related works,
whereas the methodology is depicted in Sect. 15.3. Section 15.4 visualizes experi-
mental results and provides discussion on the results. Finally, Sect. 15.5 concludes
the work.

15.2 Literature Review

For this part, we have selected and studied a large number of related research articles based on IoT and climate-smart agriculture.
The research paper [6] gives an overview of an energy-efficient and secure IoT-based WSN framework, applied as an intelligent agricultural system, that appoints more suitable cluster heads based on multi-criteria decision functions. It mainly adopts a single-hop paradigm for data transmission to decrease the chances of bottlenecks between the agriculture sensors and the base station, and data security is provided for the information transmitted from the farming sensors toward the base stations.
To explore the heterogeneity in the effects of adopting climate-smart agricultural (CSA) practices on welfare indicators, [7] shows that adoption of CSA practices helps improve food security and reduce poverty in South Asia. MTE is used to deal with the heterogeneity of treatment effects, PRTE for climate change, and HDDS and HFIAS for the food and nutrition security and poverty status of households [7]. The study of Md Kamrul et al. [8] examined the effect of CSA adoption on the household food security of coastal farmers in the southern part of Bangladesh. The authors identified 17 CSA practices, with information collected from 118 selected farmers in Patuakhali, Bangladesh. The main strength of the paper is that a connection between CSA adoption and household food security was found through improved food production, improved income, and increased yearly expenditure on food. The main goal of the article [9] is to develop a typology of farm-level CSA practices among rural households and communities in Southern Malawi to facilitate investigations of CSA adoption. The typology was used to create and test hypotheses about CSA adoption with primary household survey data of 808 households from WALA. A recursive bivariate probit regression was also used to assess the impact of program participation on CSA adoption, and positive and statistically significant impacts were found.
The main goal of the article [10] is to improve crop yields, reduce cost, and improve quality to increase agricultural productivity using IoT devices. The study has shown the prediction of yield from the parameters of different sensors, together with an analysis system for providing water in the farm using a web or manual mode. They applied data mining by association rules to extract the best-estimated value, with computerized automatic devices for monitoring the climate.
In smart agriculture Elijah et al. [11] provided a mixed analysis of the benefits and
drawbacks of IoT and Data Analytical (DA) technologies. They discussed several IoT
implementation strategies, including wireless connection, cloud storage, and other
sorts of analytical evaluation. Various projects in the development of IoT work, such
as community farming, safety control, cost reduction, awareness, and security work,
have helped the agriculture sector, according to their survey.
A comprehensive survey paper using data mining techniques on smart agricul-
ture is designed by Issad et al. [12] to show more research and ongoing studies of
contemporary practice in agriculture using data mining with other techniques like
deep learning, image processing, AI, ML, and data analytics and solving several
agricultural problems. They mainly focus on controlling the flow of irrigation, plant
disease monitoring, pest monitoring and manageability of inputs, crop yield predic-
tion, and impacts on productivity based on climate change using the combination of
data mining with other techniques.
The main objective of the paper [13] is to help the agricultural sector find the best methods of advancing crop productivity with the help of Big Data and machine learning. Well-tested and documented sensor calibration and adjustment of sensor data collection are the key to gathering phenotyping campaign data for successful model building using Big Data [13].
Farhat et al. [14] proposed ML techniques with statistical parameters for crop yield prediction. Hyperparameter tuning was applied to four different ML algorithms, namely linear regression (LR), elastic net (EN), k-nearest neighbor (k-NN), and support vector regression (SVR), with data collection steps of site survey, proximal sensing, and sampling of the yield data set. Such a sustainable crop yield prediction model is needed for food security initiatives worldwide.
Several studies have been conducted on the topic of smart agriculture. However, in our study, we work on a unique dataset with different types of crops and predict crop cultivation by considering four parameters with two different techniques, IoT and data mining, which is the main strength of our work.

15.3 Methodology

In this section, we will discuss how we developed an IoT device with a classification
algorithm to predict agricultural trends in Chittagong and assess weather conditions.
Figure 15.1 shows the work flow diagram of our system.
First, we developed IoT devices based on the Arduino platform for the real-time
data system. In Fig. 15.2, the components we previously utilized were the D1 mini
for the development board, the DHT 22 for temperature and humidity, the rain gauge
or rain sensor for precipitation, the Anemometer for wind, and the ads1115 for the
analog sensor.

Fig. 15.1 Workflow diagram

Fig. 15.2 Schematic diagram



Fig. 15.3 ThingSpeak

As an IoT gateway system, we utilize a Wi-Fi protocol system and router that directly sends data to the cloud. ThingSpeak is a cloud-based IoT analytics platform service that aggregates, visualizes, and analyzes live data streams. MathWorks developed this service (Fig. 15.3).
The data was being streamed on the ThingSpeak cloud and was being accessed
using the API key. We can view the most recent data or all previous data.
Furthermore, we may incorporate this information into our dataset in Fig. 15.4a.
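As a rough sketch of this step (not the authors' exact code), the streamed readings can be pulled from ThingSpeak through its public REST API; the channel ID, read API key, and field-to-parameter mapping below are placeholders, since the paper does not publish them.

```python
import requests
import pandas as pd

# Hypothetical channel ID and read API key; the paper does not publish these.
CHANNEL_ID = "000000"
READ_API_KEY = "XXXXXXXXXXXXXXXX"

# ThingSpeak REST endpoint for the latest feed entries of a channel.
url = f"https://api.thingspeak.com/channels/{CHANNEL_ID}/feeds.json"
resp = requests.get(url, params={"api_key": READ_API_KEY, "results": 100})
feeds = resp.json()["feeds"]

# Assumed mapping of the four weather fields onto the model's parameters.
df = pd.DataFrame(feeds).rename(columns={
    "field1": "temperature",
    "field2": "humidity",
    "field3": "wind_speed",
    "field4": "precipitation",
})
print(df.head())
```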
Then we take agricultural farming data from the Chittagong region and weather
parameters for those precise times. We will start with the data preparation phase.
In Fig. 15.4b, we determine the Mean, Standard Deviation, Q1, Q3 of our dataset
where we show all the descriptive values for our independent and dependent variables.
We have also added individual independent variable outliers using a boxplot
(Fig. 15.5).
Our independent data consist of categorical values; as a result, we have to perform feature selection. The chi-square (chi2) test is used for feature selection, and the chi-square formula is provided below:
\[ X_c^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} \tag{15.1} \]

where c is the degree of freedom, O_i are the observed values, and E_i are the expected values. Following that, we build a classification model based on the features we have chosen and look into its prediction capability. Then we connect our real data to this model, and we are able to obtain a prediction for a given period of time based on those data.
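A brief scikit-learn sketch of chi-square feature selection corresponding to Eq. (15.1) is given below; the data here are synthetic stand-ins, since the paper's dataset is not published.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Tiny synthetic stand-in for the prepared dataset: four non-negative
# weather features and one of six crop classes per row.
rng = np.random.default_rng(0)
X = rng.uniform(0, 40, size=(200, 4))   # temperature, humidity, wind, precipitation
y = rng.integers(0, 6, size=200)        # six crop labels

# Chi-square score between each feature and the crop label (Eq. 15.1).
selector = SelectKBest(score_func=chi2, k="all").fit(X, y)
for name, score, p in zip(
        ["temperature", "humidity", "wind_speed", "precipitation"],
        selector.scores_, selector.pvalues_):
    print(f"{name}: chi2={score:.2f}, p={p:.3f}")
```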
Fig. 15.4 Dataset description: (a) input data, (b) mean, STD, Q1, and Q3



Fig. 15.5 Boxplot

In this study, we developed a decision tree regression model to predict crops in the Chittagong region based on various weather conditions, using our newly created dataset in this experiment. After performing the feature selection, we split the dataset into training and testing sets: the training set contains 70% of the real data, and the remainder is the testing set. We assigned the independent variables to X and the dependent variable to y. Then, on the training data, we built the model based on decision tree regression, and based on this model, we predict the crops from the real data.
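A minimal scikit-learn sketch of the 70/30 split and the decision tree regression step described above is shown below. It uses synthetic stand-in data and a label-encoded crop target, which is only one possible reading of the paper's setup, not the authors' exact implementation.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeRegressor

crops = ["Aus", "Aman", "Boro", "Wheat", "Watermelon", "Cotton"]

# Synthetic stand-in data: four weather parameters and a crop label per row.
rng = np.random.default_rng(1)
X = rng.uniform(0, 40, size=(500, 4))
y = rng.choice(crops, size=500)

# Encode the categorical target so a regression tree can be fitted to it.
le = LabelEncoder()
y_enc = le.fit_transform(y)

# 70% training / 30% testing split, as stated in the methodology.
X_train, X_test, y_train, y_test = train_test_split(
    X, y_enc, train_size=0.7, random_state=42)

model = DecisionTreeRegressor(random_state=42).fit(X_train, y_train)

# Round the continuous outputs back to the nearest class code before decoding.
pred_codes = np.clip(np.rint(model.predict(X_test)).astype(int), 0, len(crops) - 1)
print(le.inverse_transform(pred_codes)[:10])
```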
The crops that we have predicted are Aus, Aman, Boro, Wheat, Watermelon,
and Cotton. From the independent variables, we got four parameters: temperature,
humidity, wind speed, and precipitation. After that, we predict the dependent variable,
as mentioned, the crops based on the model by analyzing the actual data from the
IoT device.

15.4 Result and Discussion

Figure 15.6a depicts the plot of the p bar of our chi-square test for the correlation
between weather conditions and cultivating crops.

Fig. 15.6 Model analysis: (a) p-value bars of the chi-square test, (b) prediction values on testing data

Fig. 15.7 Outcome: (a) confusion matrix, (b) predicted result on real data

As mentioned earlier, we predict the category values of six crops. This feature
selection section is required to verify the attributes of both the independent and
dependent variables. Then, as indicated in the methodology section, we use historical
data of independent and dependent variables to build the model using the Decision
regression tree algorithm. Figure 15.6b is the outcome of the prediction section based
on the testing data.
We next compute the confusion matrix using the decision tree model, shown in Fig. 15.7a. A confusion matrix is a diagram that depicts the predicted outcomes of a classification model; the numbers of correct and incorrect predictions are summarized with count values and broken down by each class.
There are 57 data points for the true negative, 29 data points for false negative,
24 data points for false positive, and 439 data points for the true positive in that
confusion matrix. The decision tree regression model’s accuracy is 67–70%. Then,
using ThingSpeak streaming feature, we get real-time data and run it through our
built-in decision tree regression model. Figure 15.7b depicts the output experiment
that was put to the test.
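A small sketch of how such a confusion matrix and accuracy can be computed with scikit-learn follows; the class codes here are illustrative only, not the paper's actual outputs.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# Illustrative true and predicted class codes (not the paper's results).
y_true = np.array([0, 1, 2, 2, 3, 4, 5, 1, 0, 3])
y_pred = np.array([0, 1, 2, 1, 3, 4, 5, 1, 2, 3])

# Counts per (true, predicted) class pair, then the overall accuracy.
print(confusion_matrix(y_true, y_pred))
print(f"accuracy: {accuracy_score(y_true, y_pred):.2%}")
```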

We gathered the data on October 5th, 2021, and performed the analysis based on that real-time data. We discovered that the predicted crops are Aman, Boro, and Wheat. This research is being carried out to determine the influence of weather on agricultural trends. We have a model based on real data from NASA Power View and the Bangladesh Agriculture Book 2021. After that, we preprocessed the data and created a decision tree regression model, and we found that our prediction accuracy was 67–70% using that approach, which is reasonable for freshly collected data for the Chittagong location. Chittagong was chosen as the location since it is one of Bangladesh's coastal areas, and the real data from the IoT devices were streamed live from Chittagong. We built the model using the last five years of data, which is also specific to the Chittagong location.

15.5 Conclusion

In this work, we have simulated an IoT device for real-time weather data in our
proposed methodology, and we built the machine learning model using decision tree
regression to assess the agricultural trend of Chittagong city. We have used data that
was collected from two sources: NASA Power View and the Bangladesh Agriculture
Book. By using the algorithm, we have then constructed a model based on that data.
At initial stage, we had to remove noise from the data collection and normalize it.
The IoT devices collect real-time meteorological data, assess the model based on
the current situation, and predict future crop results. This research can assist us in
predicting the ideal crops for ideal weather circumstances and assist farmers and
others involved in the agricultural industry in gaining information about future crop
predictions. Farmers will get benefited and the new prospects for economic and
sustainable development may emerge if we correctly can anticipate crops using the
proposed system. In future, we plan to integrate the soil analysis component to aid in
the identification of ideal crops as well as the yield rate of those crops for a particular
place and time.

References

1. Sethi, P., Sarangi, S.R.: Internet of things: architectures, protocols, and applications. J. Electr.
Comput. Eng. (2017)
2. Khan, R., Khan, S.U., Zaheer, R., Khan, S.: Future internet: the internet of things architecture,
possible applications and key challenges. In: 2012 10th International Conference on Frontiers
of Information Technology, pp. 257–260. IEEE (2012)
3. Mondal, M.A., Rehena, Z.: IoT based intelligent agriculture field monitoring system. In: 2018
8th International Conference on Cloud Computing, Data Science & Engineering (Confluence),
pp. 625–629. IEEE (2018)
4. Kumari, N., Gosavi, S., Nagre, S.S., et al.: Real-time cloud based weather monitoring system.
In: 2020 2nd International Conference on Innovative Mechanisms for Industry Applications
(ICIMIA), pp. 25–29. IEEE (2020)

5. Saini, M.K., Saini, R.K.: Agriculture monitoring and prediction using internet of things (IoT).
In: 2020 Sixth International Conference on Parallel, Distributed and Grid Computing (PDGC),
pp. 53–56. IEEE (2020)
6. Hasan, M.K., Desiere, S., D’Haese, M., Kumar, L.: Impact of climate-smart agriculture adop-
tion on the food security of coastal farmers in Bangladesh. Food Secur. 10(4), 1073–1088
(2018)
7. Shahzad, M.F., Abdulai, A.: The heterogeneous effects of adoption of climate-smart agriculture
on household welfare in Pakistan. Appl. Econ. 53(9), 1013–1038 (2021)
8. Kumar, V., et al.: Importance of weather prediction for sustainable agriculture in Bihar, India.
Arch. Agric. Environ. Sci. (2017)
9. Amadu, F.O., McNamara, P.E., Miller, D.C.: Understanding the adoption of climate-smart agri-
culture: a farm-level typology with empirical evidence from southern Malawi. World Develop.
126, 104692 (2020)
10. Muangprathub, J., Boonnam, N., Kajornkasirat, S., Lekbangpong, N., Wanichsombat, A., Nil-
laor, P.: IoT and agriculture data analysis for smart farm. Comput. Electron. Agric. 156, 467–474
(2019)
11. Elijah, O., Rahman, T.A., Orikumhi, I., Leow, C.Y., Nour Hindia, M.H.D.: An overview of
internet of things (IoT) and data analytics in agriculture: benefits and challenges. IEEE Internet
Things J. 5(5), 3758–3773 (2018)
12. Issad, H.A., Aoudjit, R., Rodrigues, J.J.P.C.: A comprehensive review of data mining techniques
in smart agriculture. Eng. Agric. Environ. Food 12(4), 511–525 (2019)
13. Shakoor, N., Northrup, D., Murray, S., Mockler, T.C.: Big data driven agriculture: big data
analytics in plant breeding, genomics, and the use of remote sensing technologies to advance
crop productivity. Plant Phenome J. 2(1), 1–8 (2019)
14. Abbas, F., Afzaal, H., Farooque, A.A., Tang, S.: Crop yield prediction through proximal sensing
and machine learning algorithms. Agronomy 10(7), 1046 (2020)
Chapter 16
EEG in Optic Nerves Disorder Based on FSVM Using Kernel Membership Function

M. Jeyavani and M. Karuppasamy

Abstract Giant cell arteritis is an optic nerve disorder. The symptoms of giant cell arteritis are fever, dry cough, headache, jaw pain, and problems with blood circulation in the arms. It mostly affects older people and only rarely affects children. Giant cell arteritis is also called temporal arteritis; the swelling extends along the medium and large arteries from the neck to the head. It may affect people's vision, and disorders in the nerves cannot easily be predicted, so the electroencephalogram (EEG) is used to read human nerve activity and find the disorders in the nerves. The EEG reads the patient's scalp waves continuously. Although many methods have been used to capture EEG, noisy data remain and accuracy is low. Our approach aims to avoid classification problems, remove noisy data, increase accuracy, and reduce time consumption using a Fuzzy Support Vector Machine.

Keywords Electroencephalogram · Fuzzy set · Membership function

16.1 Introduction

Giant cell arteritis is incurable and sometimes requires long-term treatment. There is not only a significant risk of permanent vision loss and of stroke but also a risk of loss of life. Visual function is divided into five types: visual acuity, contrast sensitivity, color, depth, and motion. First, light hits the eye and then reaches the retinal tissue; the retina turns the light into electrical signals that travel through the nerve to the brain, and the brain turns the signals into images that can be identified. Eye checking for optic vision disorders is carried out for conditions such as cataract, cloudy vision, and floaters. Visual disturbances, which occur when an image cannot be seen clearly, are divided into several types of disorders.

M. Jeyavani (B) · M. Karuppasamy


Kalasalingam Academy of Research and Education, Krishnankoil, Srivilliputhur, Tamil Nadu
626128, India
e-mail: jeyavanim@gmail.com
M. Karuppasamy
e-mail: karuppasamy.m1987@gmail.com


They are Anophthalmia, Glaucoma (pressure), Age-related Macular Degeneration (AMD), Optic Neuritis, Giant Cell Arteritis, and Chiasm Disorders, which are among the most common optic neurological disorders. The electroencephalogram (EEG) is used here to diagnose one of these neurological disorders, giant cell arteritis, an optic nerve disorder. EEG gives information about the human body, and it is also used to communicate with optic nerve disorder patients [1].
The EEG module keeps reading the brain waves from the scalp continuously. This scalp reading is used to communicate basic and specific requirements such as food, water, toilet, assistance, sleep, and recreation. The fuzzy membership function was applied for fuzzy-logic pattern recognition, and it is used to diagnose the eye movement of the optic nerves and the color-detection capacity of the eye [2].
In our approach, giant cell arteritis EEG data sets are collected for the fuzzy classification. The fuzzy sets are partitioned with two membership functions, triangular and trapezoidal, which are used on each axis with a set of rules and then applied in the Support Vector Machine (SVM) classification to adjust the fuzzy membership function.

16.2 Related Work

Murata and Ishibuchi [3]: one of the main contributions is that a fuzzy partition was used for evaluation, partitioning each axis by fuzzy logic with triangular and trapezoidal membership functions; sets of rules are used with the triangular and trapezoidal fuzzy sets to access the membership function. A genetic algorithm was also used to adjust the membership function, but time consumption was not considered.
Hsu et al. [4]: an electroencephalography (EEG) signal is classified based on imagined (motor imagery) movements. A kernel-induced membership function is used to adjust for the leg movements in the EEG, and the FSVM together with a neurofeedback tool helps to monitor the function of the human brain, but the data accuracy and time consideration are low.
Xu et al. [5]: the main contribution is a Fuzzy Support Vector Machine with a kernel classifier applied to imagery tasks. Time-frequency wavelet features of the EEG signals are used to drive the pattern analysis. The FSVM performs better than the SVM and is proposed to reduce noisy data, but the accuracy is low. Nagpal and Upadhyay [6]: the major contributions are fuzzy rules and a fuzzy inference algorithm used to identify the patient's sleep and awake stages. Fuzzy models are used to evaluate the EEG signals accurately, and the EEG identifies the sleep and awake stage classes in the signals using fuzzy rules, but the accuracy is low.
As mentioned above, to remove noisy data, improve the accuracy, and reduce the time consumption, fuzzy rule-based classification was used for the fuzzy partition. The fuzzy partition method is a rule-based method, and triangular and trapezoidal membership functions are used to give a graphical picture that makes the patient's health issues easy to understand. After that, the membership function was applied to the transformed EEG signal wave, and the Support Vector Machine was used to adjust the membership function and to reduce the processing time.

16.3 Proposed Work

The EEG data samples are collected from the UCI machine-learning repository. The following steps are used to improve the accuracy, remove noisy data, and reduce the processing time of the data. Phase 1: the fuzzy membership function is used to predict the data, and the prediction is shown graphically. Phase 2: a support vector machine is used to remove noisy data and to reduce time. There is much different information on the human scalp, and in EEG a Brain-Computer Interface is used to select only the specific information. However, some noisy data still occur during the fuzzy classification; here, a fuzzy rule-based system is utilized to remove the noisy data, a fuzzy inference algorithm is used to monitor imagery tasks, and a kernel-based support vector machine is used to classify features of the statistical data.

16.4 Methodology

In our approach, fuzzy and support vector machine methods are used and their performance is compared. The sample data are analyzed and split into training data and validation data, and both are used to report the performance results. In our proposed approach, a machine learning algorithm is utilized for analyzing the large data and reducing the time consumed. Kernel theory and FSVM theory are used to improve the accuracy and to build a prediction model for the giant cell arteritis optic nerve disorder. To reduce the time involved, time-frequency analysis is split into two types, the time domain and the frequency domain, and both are used to simplify the signals. The time domain is used to analyze the geometric properties of the signal waves, such as amplitude, mean, root mean square, slope magnitude, and kurtosis. To implement the frequency-domain analysis, the wavelet and Fourier transforms required for signal processing were used.
The data are applied to two classifiers, fuzzy and support vector machine. Fuzzy methods have proven effective in dealing with uncertainty. In fuzzy set theory, the classical bivalent set is called a crisp set: every statement is either true or false, so an object is either completely within a set or not within it at all, and the basic operations on such sets are union and intersection. The support vector machine handles linear and nonlinear methods for solving the multi-criteria problem and is used to perform a pairwise comparison for the membership function. A crisp set only shows whether an element is a member of the set or not, whereas a fuzzy set allows elements to be partially in the set; thus, a degree of membership is assigned to every element. To go from a fuzzy set to a crisp decision, the fuzzy membership functions follow these steps: (1) fuzzy set, (2) fuzzy inference algorithm, (3) fuzzy classification, and (4) original membership function in feature extraction.

16.4.1 Fuzzy Set

A fuzzy set is a class of objects with a continuum of grades of membership; each object has a membership grade between zero and one. The set is characterized by its membership function, and the notions defined for such sets include inclusion, union, intersection, complement, relation, convexity, etc. In our work, the sample data sets are applied as two CSV files. The decision attributes and condition attributes are then placed in decision tables, and union and intersection are applied to the decision table to derive a crisp set. The universe of discourse is set from 0 to 25 in steps of 0.1. Finally, six fuzzy partition variables and six fuzzy rules were used for the fuzzy membership.
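A small numpy sketch of the standard fuzzy set operations over the 0-25 universe described above is given below; the two membership arrays are placeholders rather than the paper's actual partitions.

```python
import numpy as np

# Universe of discourse from 0 to 25 in steps of 0.1, as described in the text.
universe = np.arange(0, 25.1, 0.1)

# Two illustrative membership arrays over that universe (placeholders).
mu_a = np.clip(1 - np.abs(universe - 8) / 5, 0, 1)
mu_b = np.clip(1 - np.abs(universe - 15) / 5, 0, 1)

# Standard fuzzy set operations: union = max, intersection = min,
# complement = 1 - membership.
union = np.maximum(mu_a, mu_b)
intersection = np.minimum(mu_a, mu_b)
complement_a = 1 - mu_a
print(union.max(), intersection.max())
```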

16.4.2 Fuzzy Inference Algorithm

Fuzzy inference is the use of fuzzy logic to create a mapping from input to output, on which decision-making and action are then based. A training data set is defined as {(x_1, y_1, μ_1), (x_2, y_2, μ_2), …, (x_n, y_n, μ_n)}, where n is the number of training samples, x_i is the input (voltage) vector, and μ_i reflects the outlier degree of the sample; an outlier is given a small μ_i to reduce its impact. When the input vector is multidimensional, the advantages of fuzzy set principles are used to overcome the problems of linear correlation measurement methods [7].

16.4.3 Fuzzy Classification

Fuzzy classification is the process of assembling the components of fuzzy sets. The membership value is defined by the actual value of the fuzzy proposition. f is a classifying function drawn from a set of candidate functions, and this set is chosen according to the expected risk minimization principle. The expected risk is

\[ R(f) = \int \tfrac{1}{2}\,\lvert f(x) - y \rvert \, dP(x, y) \tag{16.1} \]

where R(f) is the risk of the nonlinear classifying function f, y is the class label, and P(x, y) is the underlying joint distribution. In practice, P(x, y) is not available directly and has to be estimated from the training set (x_i, y_i), i = 1, …, n, which leads to an ill-posed problem. This problem is addressed by empirical risk minimization, in which the risk is estimated from the training set.

\[ R_{\mathrm{emp}}(f) = \frac{1}{n} \sum_{i=1}^{n} \tfrac{1}{2}\,\lvert f(x_i) - y_i \rvert \tag{16.2} \]

where R_emp(f) is the empirical risk of the classifying function f, x_i is an input vector, and (1/2)|f(x_i) − y_i| is the loss on the i-th training sample. The empirical risk, the average loss over the training set, approximates the expected value of the loss under the true probability; the loss observations are assumed to come from independent samples drawn from the same distribution. The loss of each sample is taken as either 0 or 1, with 0 for a correct classification and 1 for an incorrect classification. Dealing with the empirical risk directly remains complex: if P(x, y) is not known, the risk cannot be minimized over the true distribution, so P(x, y) has to be estimated [8].

16.4.4 Original Membership Function in Feature Extraction

In the context of fuzzy sets, the membership grade of a waveform expresses the confidence that a signal is present in the noise, and it also makes it possible to measure the confidence and uncertainty when no signal is present; the decision can then be made from the membership of the waveforms. The membership value is computed over a set of waveform parameters, and fuzzy rules and fuzzy logic are used to combine the information from several parameters. The sample features are applied to the fuzzy membership function. A membership function describes a fuzzy set graphically and is the basis for turning a fuzzy set into a crisp classification. Figure 16.1 shows sample data applied to the fuzzy membership, with the universe of discourse running from 0 to 25 in steps of 0.1. Membership functions are divided into six types: triangular, trapezoidal, Gaussian, S-shape, Z-shape, and sigmoid. Our approach uses the triangular and trapezoidal membership functions, which are applied on each axis with a set of rules. The corners of the trapezoid for the specified samples are applied as c(−2, 0, 2, 6), and the corners of the trapezoid for the features are applied as c(5, 8, 10, 12).
Fig. 16.1 Fuzzy partition applied in the trapezoidal membership function
Fig. 16.2 Fuzzy partition applied in the triangular membership function

The fuzzy trapezoidal membership function consists of six fuzzy partition variables: poor, good, excellent, blow, high, and very high. The applied fuzzy partition variable values are blow = 5, high = 12.5, and very high = 20, with a fuzzy cone radius of 5 (Fig. 16.2).
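The triangular and trapezoidal membership functions with the corner values quoted above can be written directly in numpy, as in the illustrative sketch below; this is only an assumption about how such partitions could be coded, since the paper does not publish its implementation.

```python
import numpy as np

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 below a, rises to 1 on [b, c], falls to 0 at d."""
    left = np.clip((x - a) / (b - a), 0, 1)
    right = np.clip((d - x) / (d - c), 0, 1)
    return np.minimum(left, right)

def triangle(x, centre, radius):
    """Triangular ('fuzzy cone') membership centred at `centre` with the given radius."""
    return np.clip(1 - np.abs(x - centre) / radius, 0, 1)

universe = np.arange(0, 25.1, 0.1)

# Trapezoid corners quoted in the text: c(-2, 0, 2, 6) and c(5, 8, 10, 12).
mu_specified = trapezoid(universe, -2, 0, 2, 6)
mu_features = trapezoid(universe, 5, 8, 10, 12)

# Triangular partitions with centres blow = 5, high = 12.5, very high = 20 and radius 5.
mu_blow = triangle(universe, 5, 5)
mu_high = triangle(universe, 12.5, 5)
mu_very_high = triangle(universe, 20, 5)
```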
Six fuzzy rules are applied to the triangular partitions of blow, high, very high, excellent, features, and generous. Each rule is assigned a fuzzy partition according to the fuzzy logic, and the six rules are applied to extract the specified data; the rules are based on the three generated variables. For the training data, the membership values reflect the relative importance of the points within their own classes; taking the positive training data and the validation data as an example, the membership value is calculated using the fuzzy membership function [9].

16.5 Classification and Regression

16.5.1 Support Vector Machine

The Support Vector Machine (SVM) is applied separately to the sample data and to the fuzzy-reduced data. A polynomial kernel was used for classification, and the result was applied in the traditional inference. Finally, the prediction values used to evaluate the SVM model are shown in a plot, a histogram, and a density so that they can be inspected easily through pictures.
Fig. 16.3 Prediction value applied through a density

From Fig. 16.3, the prediction value is plotted as a density, with N = 168 and bandwidth = 0.2949.

Fig. 16.4 Classification of the SVM (Support Vector Machine)

Fig. 16.5 Classification of the Fuzzy Support Vector Machine

The SVM considers the distance between two observations; the probability of these observations varies depending on the measured and unmeasured distances.
The margin is maximized while tolerating some misclassifications. Such a linear SVM classification (Fig. 16.4) is known as a linear SVM. The Support Vector Machine is partitioned into two classifications. The SVM has a flexible structure and also gives better results, so the classification is selected by the parametric methods according to the input.
A nonlinear result can be obtained with a small increase in the complexity of the classification, as shown in Fig. 16.5. To achieve this, a nonlinear decision boundary classification is performed using the "kernel trick". Our approach reviews the basics and formulates the SVM; finally, a kernel-induced membership function is proposed and applied.

16.5.2 Kernel Membership Function

The main objective of the kernel SVM is to map the original input data in R^d into a feature space F in which the hyperplane that optimally separates the data can be found. Following the reproducing kernel Hilbert space theory, the information is expanded and then transformed into F through a kernel; in this write-up, the Gaussian kernel is used. Because the feature space F induced by the Gaussian kernel is infinite-dimensional,
the Gaussian kernel theorem is introduced: all mapped data points lie on the surface of a unit hypersphere centred at the origin of the space F. Fuzzification is normally processed in the input space R^d; however, the relative location of the data in F may not be of comparable importance, so the proposed method fuzzifies the points in F directly. Since the nonlinear mapping ∅ is not known explicitly, the Euclidean distance between any ∅(x_i) and ∅(x) in F is obtained by applying the kernel trick as follows:

\[ \lVert \Phi(x_i) - \Phi(x) \rVert = \sqrt{K(x_i, x_i) - 2K(x_i, x) + K(x, x)} \tag{16.3} \]

The Gaussian kernel takes the same value for all x = y, i.e., K(x_i, x_i) = 1, ∀ x_i, so distances in F can be calculated purely from kernel evaluations; using the kernel-induced fuzzy membership thus enables the membership value to be assigned in the feature space F itself. There are two remarks on FSVM.
(1) If the membership values of all training data points are set to 1, i.e., μ_i+ = μ_i− = 1, ∀ x_i ∈ S_f+ and ∀ x_i ∈ S_f−, FSVM reduces to the standard SVM.
(2) The lower bound ε of the membership values, where 0 ≤ ε ≤ 1, must be set appropriately. A very small ε (near zero) results in worse classification performance, while a large ε (near one) makes the FSVM still sensitive to outliers. Following this suggestion, we set ε = 0.3 [8–10].
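A minimal Python sketch of a kernel-induced fuzzy membership of this kind is given below, assuming a Gaussian kernel, a class-centre distance computed through kernel evaluations as in Eq. (16.3), the membership form 1 − d/r, and the lower bound ε = 0.3; the feature vectors and these modelling choices are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def gaussian_kernel(x, y, gamma=0.5):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def kernel_distance(x, class_points, gamma=0.5):
    """Distance in feature space F between phi(x) and the class centre
    (the mean of the mapped class points), computed only through kernel values."""
    k_xx = gaussian_kernel(x, x, gamma)                 # equals 1 for the Gaussian kernel
    k_xc = np.mean([gaussian_kernel(x, p, gamma) for p in class_points])
    k_cc = np.mean([gaussian_kernel(p, q, gamma)
                    for p in class_points for q in class_points])
    return np.sqrt(max(k_xx - 2 * k_xc + k_cc, 0.0))

def kernel_membership(x, class_points, radius, eps=0.3, gamma=0.5):
    """Kernel-induced fuzzy membership: near the class centre in F it approaches 1,
    and it never drops below the lower bound eps (0.3 as chosen in the text)."""
    d = kernel_distance(x, class_points, gamma)
    return max(1.0 - d / (radius + 1e-9), eps)

# Illustrative (made-up) feature vectors standing in for processed EEG features.
class_points = [np.array([0.20, 0.40]), np.array([0.30, 0.50]), np.array([0.25, 0.35])]
radius = max(kernel_distance(p, class_points) for p in class_points)
print(kernel_membership(np.array([0.27, 0.42]), class_points, radius))
```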

16.6 Result Overview

The fuzzy set alone cannot handle noisy data, so the SVM is integrated with the fuzzy membership function. The sample data were applied to the fuzzy and support vector machine approaches to reduce noise, increase accuracy, and reduce the time consumed; an overview of the results is given below (Table 16.1).
Each data set includes the attributes of samples, features, and the number of classes (Table 16.2).
Finally, the overall performance of Fuzzy, SVM, and FSVM is summarized: the accuracies are 93.65% for Fuzzy, 95.23% for SVM, and 95.76% for FSVM, with FSVM giving the highest value of 95.76%.

Table 16.1 EEG raw data sets used for union and intersection in the fuzzy approach
F3 FC5 T7 P7 A B P8 T8 FC6
4259.49 4120 4341.03 4595.9 4092.82 4612.31 4199.49 4219.49 4198.46
4268.21 4126.15 4344.62 4595.38 4102.05 4622.56 4205.13 4221.54 4205.13
4277.95 4134.36 4346.15 4591.28 4095.9 4620 4208.72 4235.38 4212.31

Table 16.2 Classification accuracy results from the data sets

FUZZY (%)   SVM (%)   FSVM (%)
93.65       95.23     95.76

Fig. 16.6 Classification accuracy results from the data sets (FUZZY, SVM, and FSVM)

Figure 16.6 compares the results of the fuzzy membership function and the Support Vector Machine with those of the combined Fuzzy Support Vector Machine.

16.7 Conclusion

This article discussed soft computing tools that were considered, analyzed, and compared on the training data set and the test data set; comparing the two, the test data were found to have the highest prediction value. The systems were analyzed with FSVM for the problem of how to isolate giant cell arteritis, a specific optic nerve disorder, from EEG sample data. The SVM is capable of extracting the relevant information, and it has been integrated with the fuzzy membership function to develop practical value-added methods for dealing with uncertainty; thus, a worthwhile tool is derived for decision-making classification. In the SVM, linear and nonlinear data are used to extract the waveform output, which shows the prediction value. The fuzzy accuracy is about 93%, the SVM accuracy about 95%, and the FSVM accuracy about 96%, so the FSVM accuracy is better than the other solutions.

References

1. Zehong, C., Chin-Teng, L.: Inherent fuzzy entropy for the improvement of EEG complexity
evaluation. IEEE Trans. Fuzzy Syst. 26(2), 1032–1035 (2017)
2. Wessam, S., Sara, A., Nada, J.: EEG-based communication system for patients with locked-in
syndrome using fuzzy logic. In: Proceedings of 10th Biomedical Engineering International
Conference (BMEiCON, Japan, 2017)
3. Tadahiko, M., Hisao, I.: Adjusting membership functions of fuzzy classification rules by
genetic algorithms. In: Proceedings of 1995 IEEE International Conference on Fuzzy Systems,
Department of Industrial Engineering, University of Osaka Prefecture, Japan (1995)
4. Li, W.-C., Li-Fong, L., Chun-Wei, C., Yu-Tsung, H., Yi-Hung, L.: EEG classification of
imaginary lower limb stepping movements based on fuzzy support vector machine with
kernel-induced membership function. Taiwan Fuzzy Syst. Assoc. 19(2), 566–579
5. Qi, X., Hui, Z., Yongji, W., Jian, H.: Fuzzy support vector machine for classification of EEG
signals using wavelet-based features. Med. Eng. Phys. 31(7), 858–865 (2009)

6. Chetna, N., Prabhat, P.K.: Wavelet-based sleep EEG detection using fuzzy logic. Department
of EEE, Birla Institute of Technology Offshore Campus, Ras Al Khaimah, UAE, pp. 794–805
(2019)
7. Hanmin, S., Jian, X.: Electric vehicle state of charge estimation: Nonlinear correlation and
fuzzy support vector machine. ScienceDirect. 281, 131–137 (2015)
8. Arindam, C., Kajal, D.: Fuzzy support vector machine for bankruptcy prediction applied soft
computing. Elsevier 11(2), 2472–2486 (2011)
9. Tai-Yue, W., Huei-Min, C.: Fuzzy support vector machine for multi-class text categorization.
Inform. Process. Manage. 43(4), 914–929 (2007)
10. Manikandan, T., Bharathi, N.: Lung cancer detection using fuzzy auto-seed cluster mean
morphological segmentation and SVM classifier. J. Med. Syst. 40(7), 181 (2016)
Chapter 17
DDoS Detection in ONOS SDN Controller Using Snort

Mukesh Kumar and Abhinav Bhandari

Abstract SDN has changed the network industry in the last decade because of its
benefits like decoupling of control plane and data plane, programmability, customiza-
tion, etc. Security is one domain where it needs to improve continuously for
better. DDoS can be a big problem related to centralized control plane model as
any successful DDoS attack can create lots of damage to SDN-based network by
disrupting control plane availability. Hping3 is used to simulate TCP-SYN-based
DDoS attack on the controller which is highlighted using Wireshark Packet Analyzer.
Snort is an industry standard intrusion detection system (IDS) that comes in two
variants: (1) base open version and (2) Cisco Snort. We have used Snort base or
open-source IDS to detect DDoS by applying Snort rules to the incoming traffic
toward ONOS SDN controller. Rules created filter the incoming traffic and only
generate alerts for illegitimate or DDoS traffic toward SDN controller. Using Snort,
any incoming traffic toward SDN controller from outside network has to go through
Snort which detects the DDoS traffic and generates alerts against the same.

Keywords SDN · DDoS · ONOS · Controller · Snort · IDS

17.1 Introduction

Software-defined network or SDN [1–3] is one of the most popular state-of-the-art


technologies that has spread over different sectors of networking including enter-
prises, service providers, data centers, and many more. The main intention to create
a new network was to redesign the network according to modern needs which were
not met through conventional ways. Therefore, a new type of network design had
been proposed by disintegrating of control and data planes [4]. The concept of SDN
had been welcomed by all the major network providers, such as Cisco, Huawei,

M. Kumar (B) · A. Bhandari


Punjabi University, Patiala, Punjab 147001, India
e-mail: Mukesh.hcl.noida@gmail.com
A. Bhandari
e-mail: bhandarinitj@gmail.com


Fig. 17.1 Traditional network versus SDN [9]

Juniper, Google, Microsoft. SDN uses centralization to have a single controller, with data plane devices that work on the instructions given by their controller. Because SDN has been advancing rapidly, it has been categorized into several parts: SD-WAN or software-defined WAN [5], software-defined radio (SDR) [6], software-defined access (SDA), and SDS or software-defined security. The OpenFlow [7] protocol has been used to communicate between the controller and the data plane [2, 4]. The controller is the actual working brain of the network; unlike older network technologies, where the control plane is built from routing protocols, the centralized control plane is an eminent part of the SDN framework. Apart from this, controller clustering [8] has been used to protect the network from problems encountered by a controller, where the primary controller takes charge. In data centers, SDN controllers can be installed to run either on a virtual machine or on hardware devices like servers running Linux distributions such as Red Hat, Ubuntu, and Mint. Figure 17.1 shows an architectural comparison of traditional and software-defined network designs.

17.1.1 Open Network Operating System (ONOS)

The ONOS [10] is an open-source SDN controller which is managed by the Linux
foundation. It is widely used in large organizations, such as AT&T, COMCAST,
Dell, Google. ONOS provides both GUI and CLI variants. The network paths are decided using the centralized system, which makes the system more advanced and efficient. ONOS is also quite extensible and flexible, allowing different functions and modules to be created using APIs.
This SDN controller is written on a Java-based platform, and its bundles are present in the Karaf OSGi container that runs on the Java virtual machine (JVM). ONOS offers high availability and high performance along with several features that make SDN more advanced. The convergence time is 50 ms when a problem arises on the primary link or interfaces. The ONOS controller's internal view is shown in Fig. 17.2.

Fig. 17.2 ONOS controller internal view [11]

17.1.2 Distributed Denial of Service (DDoS)

DDoS stands for Distributed Denial of Service. These kinds of attacks are severe
as hackers attempt to halt the services offered by network and application. Mostly,
DDoS attacks are being conducted on websites and network applications. These
kinds of attacks are also meant to avoid other security loopholes in the network. For
initiating the attacks, hackers may use home routers, Android-based devices, IoT
devices. They create bots with malicious contents to infect the network devices. The
controllers of SDN are quite vulnerable to DDoS attacks [12–15].

17.1.3 Snort

Snort [16] is an open-source and free-to-use intrusion detection system/prevention


system (IDS/IPS). Snort uses signature-based database to detect and prevent any
security issues. Snort works in a manner, where it detects any traffic that is coming
over a specific network interface. Tasks performed by Snort include protocol analysis,
content analysis, real-time traffic analysis that is coming in the network or going
outbound from the network. Snort has different features like it can be used for port
scanning, creating rules for traffic filtering, malware detection, and various other
vulnerability issues. Snort can be deployed on Linux or Windows OS.

17.2 SDN Problems

17.2.1 Resolving On-demand Upgradation

There are certain technologies including cloud computing, IoT, machine learning,
etc. which are creating lot of issues in the industry. Resultantly, there is mammoth
data size which needs to be processed without having the delay/latency in the network
[17–22].

17.2.2 Automation of Devices

Since there is automation of network as well as server, the challenges associated with
it are complex and problematic, especially in data centers. There are certain types
of Application Programming Interfaces being used by the application layer which
needs to be handled carefully [17–22].

17.2.3 Security

Security is one of the primary challenges associated with SDN systems. In order to
protect the controller from unauthorized access and other problems, such as DDoS
attacks, security is paramount [23, 24]. Since SDN controllers can have open-source
application, third-party software may hinder the functionality and cause harm to the
system as the application that are being used in controller are not verified and tested
[17–22].

17.3 Results and Discussions

As introduced in Sect. 17.1.3, Snort is an open-source, free-to-use intrusion detection/prevention system (IDS/IPS) that uses a signature-based database and inspects the traffic arriving on a specific network interface; it can be used for port scanning detection, rule-based traffic filtering, malware detection, and various other vulnerability checks.
We have used it on the Ubuntu distribution of Linux. Installation of Snort needs
to have some prerequisites installed on the base OS, which is Ubuntu in our case.
Packages like the OpenSSH server, ethtool, libpcap, Bison, zlib1g, OpenSSL, etc. need to be installed to exclude any dependency issues when installing Snort on Ubuntu. Figure 17.3 shows the topology used, and Fig. 17.4 lists the prerequisite installation on Ubuntu.

Fig. 17.3 Topology used in experimentation

Fig. 17.4 Installing prerequisite for snort

Now, as prerequisites are installed, we now need to install data acquisition


package, i.e., DAQ on Ubuntu. We will download latest DAQ package from its
source using wget utility of Linux and download it in the same directory in which
we issue the command as shown in Fig. 17.5.
After downloading the tar.gz DAQ package, we have to extract it using the tar command with the -xzvf options, as shown in Fig. 17.6.
Now, to install the DAQ 2.0.6 package, we get into the DAQ downloaded directory
and use./configure command to configure and install the DAQ package. After DAQ
is configured, we use make and make install command to install DAQ. When DAQ
make process is completed, we need to then push make install command in the
terminal in order to install DAQ and its associated libraries. As the installation of
DAQ is completed, we can now install Snort package from the source available at
Snort.org website and is installed with almost the similar process as we have used
with the DAQ package installation as shown in Fig. 17.7.
After downloading the Snort file, we need to extract the file using tar command
and then configure it using the./configure command. After this, Snort IDS mode

Fig. 17.5 Getting DAQ tar package using wget utility

Fig. 17.6 Extracting DAQ tar package

Fig. 17.7 Snort tar file downloaded using wget utility

with directory-based structure is configured for Snort. Directory is created using the
following commands (Fig. 17.8).
Next, we provide the needed permissions to the directories (Fig. 17.9).
After the permissions are assigned, we need to copy the Snort configuration files from the source as given below:

cd /root/snort-2.9.13.1/etc
cp -avr *.conf *.map *.dtd *.config /etc/snort/
cp -avr /root/snort-2.9.13.1/src/dynamic-preprocessors/build/usr/local/lib/snort_dynamicpreprocessor/* /usr/local/lib/snort_dynamicpreprocessor/

Fig. 17.8 Creating files and directory for Snort

Fig. 17.9 Assigning permissions to directories

Fig. 17.10 Rules added in Snort

After copying the sources, we have to comment out all the rule set includes using the following command:

sed -i "s/include \$RULE_PATH/#include \$RULE_PATH/" /etc/snort/snort.conf
Now, rules can be added and configured as requirement in IDS, so we have created
some local rules as shown in Fig. 17.10.
The rules added above are in the alert category; they state that the user will receive an alert with the message “Potential DDoS Detected” when any traffic tries to communicate with the SDN controller over the web interface and the source network is not the internal network, which in our case also acts as the management network. In other words, we have created a rule stating that if any GUI request is made toward the SDN controller from a non-internal network, an alert message should be generated in the log.
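The paper does not reproduce the rule text itself, so the following local.rules entries are only one plausible way to express the behaviour described above; the ONOS web-UI port (8181), the HOME_NET definition, and the rate threshold are assumptions, not values quoted by the authors.

```
# Hypothetical /etc/snort/rules/local.rules entries (not the paper's exact rules).
# Assumes HOME_NET is set to the internal/management network in snort.conf and
# that the ONOS GUI listens on TCP 8181.

# Alert on any web-interface SYN toward the controller from a non-internal source.
alert tcp !$HOME_NET any -> $HOME_NET 8181 (msg:"Potential DDoS Detected"; flags:S; sid:1000001; rev:1;)

# Optional rate-based variant: alert only when SYNs to the controller exceed a threshold.
alert tcp !$HOME_NET any -> $HOME_NET 8181 (msg:"Potential DDoS Detected - SYN flood"; flags:S; detection_filter:track by_dst, count 70, seconds 10; sid:1000002; rev:1;)
```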
After adding rules, we have initiated a DDoS using Hping3 tool as shown in
Fig. 17.11.
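A typical Hping3 invocation for such a flood is sketched below; the target address is a placeholder and the exact options used in the experiment are not quoted in the paper.

```bash
# -S sends TCP SYN packets, --flood sends as fast as possible,
# --rand-source spoofs a random source address per packet, -p sets the destination port.
sudo hping3 -S --flood --rand-source -p 8181 192.168.56.101
```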
Alerts generated can be seen in Fig. 17.12, when TCP-SYN DDoS is attempted
from Hping3 [25] with random source addresses toward ONOS SDN controller:
The alert console output generated by Snort shows that a flood of traffic is directed toward the ONOS SDN controller from non-internal/management sources. To see how the TCP-SYN DDoS is creating havoc on the controller network, a Wireshark capture is taken with a TCP-SYN filter enabled, which shows the exponential rise in TCP-SYN traffic while the flooding is enabled (Fig. 17.13).

Fig. 17.11 DDoS initiated using Hping3 toward ONOS

Fig. 17.12 Alerts generated in the alert console



Fig. 17.13 TCP-SYN-based DDoS on ONOS from illegitimate sources

17.4 Conclusion and Future Scope

SDN is rapidly changing the way the world's networks are deployed. Industry is accepting the change with open arms because of the benefits provided by SDN, but there are still security challenges related to SDN, as it is not secure by default and best security practices need to be configured in order to make it secure. DDoS is one of the biggest vulnerabilities that controllers have. Snort is an open-source IDS that can be used to perform intrusion detection on Linux and Windows platforms by configuring and creating rules and policies as per need. The policies are tunable and are used to generate alerts when some traffic hits a rule on a specific interface. We have used the Snort package in integration with DAQ, which alerts on any connection attempt as stated in the Snort rules. Snort has a console that displays the live connection attempts matching the rules, which can indicate any traffic that is illegitimate. Any illegitimate traffic that tries to reach the critical section of the network can be detected, and an alert is generated to secure the controller.

17.4.1 Future Scope

SDN is one of hottest research areas in network industry, and even after around
10 years, researchers are getting newer possibilities with SDN because of the
customization and programmability that it brings with it. Snort in the future can
be integrated with ML algorithms in order to take decisions on traffic pattern types
and filters the illegitimate traffic accordingly.

References

1. Shahzad, N., Mujtaba, G., Elahi, M.: Benefits, security and issues in software defined
networking (SDN). NUST J. Eng. Sci. 8(1), 38–43 (2015)
2. Kreutz, D., Ramos, F., Verissimo, P., Rothenberg, C., Azodolmolky, S., Uhlig, S.: Software-
defined networking: a comprehensive survey. Proc. IEEE 103(1), 14–76 (2015)
3. Feamster, N., Rexford, J., Zegura, E.: The road to SDN: an intellectual history of programmable
networks. Princeton, New York (2015)
4. Garg, G.,Garg, R.: Review on architecture & security issues of SDN. Int. J. Innov. Res. Comput.
Commun. Eng. 2(11) (2007). ISO 3297: 2007
5. SD-WAN: https://resources.epsilontel.com/begin-your-sdn-journey-with-sd-wan/
6. SDR: https://www.sciencedirect.com/topics/engineering/software-defined-radio
7. McKeown, N., Anderson, T., Balakrishnan, H., Parulkar, G.M., Peterson, L.L., Rexford,
J., Shenker, S., Turner, J.S.: OpenFlow: enabling innovation in campus networks. Comput.
Commun. Rev. 38(2), 69–74 (2008)
8. Suciu, G., Vulpe, A., Halunga, S., Fratu, O., Todoran, G., Suciu, V.: Smart cities built on
resilient cloud computing and secure internet of things. In: 2013 19th International Conference
on Control Systems and Computer Science (CSCS), pp. 513–518. IEEE (2013)
9. GitHub of OpenDaylight Integration Project: https://github.com/opendaylight/integration
10. Berde, P., Gerola, M., Hart, J., Higuchi, Y., Kobayashi, M., Koide, T., Lantz, B., Snow, W.,
Parulkar, G., O’Connor, B., Radoslavov, P.: ONOS. In: Proceedings of The Third Workshop
on Hot Topics in Software Defined Networking—HotSDN ’14, pp. 1–6 (2014)
11. ONOS Interval View: https://wiki.onosproject.org/display/ONOS/Basic+ONOS+Tutorial
12. Sangodoyin, A., Sigwele, T., Pillai, P., Hu, Y.F., Awan, I., Disso, J.: DoS attack impact assess-
ment on software defined networks. ICST Institute for Computer Sciences, Social Informatics
and Telecommunications Engineering (2018)
13. Lawal, B.H., Nuray, A.T.: Real-time detection and mitigation of distributed denial of service
(DDoS) attacks in software defined networking (SDN) In 26th Signal Processing and
Communications Applications Conference (SIU), Izmir, pp. 1–4 (2018)
14. Tahir, M., Li, M., Ayoub, N., Shehzaib, U., Wagan, A.: A novel DDoS floods detection and
testing approaches for network traffic based on linux technique. Int. J. Adv. Computer Sci.
Appl. (IJACSA) 9(2), 341–357 (2018)
15. Bawany, N.Z., Shamsi, J.A., Salah, K.: DDoS attack detection and mitigation using SDN:
Methods, practices, and solutions. Arab. J. Sci. Eng. (Springer) 42, 425–441 (2017)
16. Roesch, M.: Snort: lightweight intrusion detection for networks. In: LISA ’99: 13th Systems
Administration Conference, pp. 229–238 (1999)
17. Ombase, P.M., Kulkarni, N.P., Bagade, S.T., Mhaisgawali, A.V.: Survey on DoS attack
challenges in software defined networking. Int. J. Computer Appl. (0975–8887) 173(2) (2017)
18. SDN security challenges in SDN environments. SDXCentral
19. What are SDN controllers (or SDN controllers platforms)? SDXCentral
20. SDN controller comparison part 1: Sdn controller vendors. SDXCentral
21. Understanding the SDN architecture. SDXCentral.
22. Sher, D.: Gartner: Application layer DDos attacks to increase in 2013 (2013)
23. Shu, Z., Wan, J., Li, D., Lin, J., Vasilakos, A.V., Imran, M.: Security in software-defined
networking: threats and countermeasures Mob. Netw. Appl. 21(5), 764–776 (2016)
24. Saleh, M.A., Manaf, A.A.: A novel protective framework for defeating HTTP-based denial of
service and distributed denial of service attacks web links. Sci. World J. 2015, 1-19 (2015)
25. Hping3: https://tools.kali.org/information-gathering/hping3
Chapter 18
Classification of Keratoconus Using Corneal Topography Pattern with Transfer Learning Approach

Savita R. Gandhi , Jigna Satani , and Dax Jain

Abstract Keratoconus, also referred as KCN, is a progressive ocular disease that


causes thinning of the cornea and distorts its curvature. The gradual thinning of the cornea induces a loss of elasticity, resulting in a cone-shaped protrusion. This may irreversibly change the cornea and could cause loss of vision. Although much research has been pursued over a decade, it remains difficult to detect keratoconus accurately in its early stage. Apart from being an important prerequisite for refractive surgery, identification of the corneal steepening shape helps to choose the right treatment and determines the progression of the keratoconus. The different shapes of steepening are extracted herein from the given corneal topographies. In this study, we have applied pretrained deep learning models using a transfer learning approach to classify
the corneal topography patterns from corneal eroded images derived from the corneal
images. The said models are used to classify corneal eroded images into ten labels
as per patterns prevailed in corneal curvature due to the steepening of the surface.
This is a step forward toward predicting the progression of KCN in its early stage
with more accuracy.

Keywords Keratoconus · Corneal topography · ATLAS 9000 · Deep learning ·


Transfer learning · Pretrained ImageNet model · Computer vision · Edges with
mask

18.1 Introduction

Keratoconus (KCN) is an ophthalmic condition wherein cornea bulges out conically


due to progressive thinning of outer layer referred as cornea, leading to vision loss

S. R. Gandhi (B) · J. Satani · D. Jain


Department of Computer Science, Gujarat University, Ahmedabad 380009, India
e-mail: drsavitagandhi@gmail.com
J. Satani
e-mail: jignasatani@gmail.com
D. Jain
e-mail: daxjain789@gmail.com


[1–3]. Various researches suggest that the factors such as sensitive or thin cornea,
extensive eye rubbing, environmental condition, extensive screen time and genetic
factors are some of the root causes of the disease [4]. Latest customized corneal
lenses may halt the progression of the disease, subject to acceptance by the eyes. In severe cases of KCN, treatments like corneal transplantation and epithelium grafting are sometimes endeavored in the absence of any other possible treatment [5–7].
Even with the advancement of technology, early detection of KCN is still the
best remedy. Moreover, refractive surgery is not advised on keratoconic eyes
[8, 9]. The clinical screening of keratoconus has now been replaced by topographical
screening using keratometry devices which produce color topographic and tomo-
graphic maps [9, 10]. These maps render measurements of corneal steepening. The
incidences of corneal elevation and other irregularities are revealed in topographical
maps [11] that elaborate the patterns as per the progression of the keratoconus. Thus,
the topographical display of various shapes plays an important role in determining the
severity of the KCN along with other corneal measures. As of now, it seems
that early identification of (a) shape, (b) steepening patterns and (c) thinning is
the best way out, since in most cases KCN exhibits symptoms of its presence
only in the later stages, by which time irreversible damage might have occurred. Even a
small delay in detecting KCN in its early stages may narrow down the
choice of otherwise available appropriate treatments [12, 13].
In this study, the shape of corneal distortion manifested by the keratoconus has
been detected by extracting a pattern of corneal irregularities from the given corneal
topographical maps. These patterns were further classified into ten significant shapes
associated with the progression of the disease. To determine these patterns, we
primarily used (a) 'transfer learning', an offshoot of deep layered network learning
techniques, (b) a tailored CNN (convolutional neural network) and (c) pretrained
ImageNet models, along with the methodologies of (d) 'Computer Vision'. The
topographic maps used herein were axial curvature elevation maps derived from the
ATLAS 9000 topographer. The CNN, being the best known deep learning neural
network for image recognition, classification and detection of objects from images
by extracting features, has been our obvious choice.

18.2 Related Work

Transfer learning is based upon the conventional wisdom of 'sharing knowledge
with others'. Pretrained deep learning models are first trained rigorously on data that
are similar in nature but collected from different sources, and these knowledge-rich
models are then applied to a relatively smaller dataset or task to speed up its
training [14, 15].
The application of deep neural networks and transfer learning to medical images
has proven its worth and hence is extensively used with MRI, CT scan, X-ray,
microscopy, color fundus images, etc. [16].

Not so long ago, optic diseases were diagnosed primarily through clinical
screening, but the sharp advancement of neural networks has drawn the attention
of researchers toward early and accurate detection of ophthalmic disease [17]. In
past decades, many scientists have contributed significantly to keratoconus detection
by applying NN, ANN, SVM, MLP, RBFNN, decision trees, etc. [10, 18–22].
Lately, the introduction of machine learning techniques has empowered the process of
identification and classification of diseases, including keratoconus [23, 24].
The convolutional neural network has begun to claim its efficiency over peer
techniques by giving higher accuracy in image recognition and classification, and
the same holds for keratoconus detection [25–28]. Lavric et al. [29] applied a customized
CNN architecture named 'KeratoDetect' to detect eyes with keratoconus, achieving
99% accuracy after increasing the number of epochs.
With the recent advancement in deep learning, the conventional thought of 'sharing
gained knowledge' has been realized in transfer learning, which allows a small-
scale task to use the learnings of pretrained large-scale models such as AlexNet,
DenseNet, ResNet, VGG16, etc. Among the family of pretrained models, VGG16 is
simple, lightweight and shows efficient outcomes in image classification. Kim et al.
[16] suggested a transfer learning approach to classify X-ray and normal images by
bridging databases from various sources where images were obtained from the same
imaging modalities. Salih et al. [30] used a CNN based on VGG16 to extract features
from corneal topography, which were supplied to an SVM for classifying features of
elevation and thickness; the classification outcome was further used to predict its match
with the clinical diagnosis of corneal disease.
Deep learning neural networks alongside computer vision techniques offer
tailor-made solutions [31]. The study of the achievements of other researchers
suggests the significant performance of deep neural networks [32–35], motivating
us to apply them with the recent techniques of transfer learning in our research.

18.3 Study Data and Methods

Deep learning has delivered a new approach of transferring former learnings to new
data and tasks, also referred to as inductive transfer. Transfer learning applies
previously gained knowledge to learn new tasks which may be smaller in size or
involve imbalanced data. We customized a convolutional neural network (CNN) to
incorporate various pretrained ImageNet models to deal with the corneal topographies
used.
The bilateral axial elevation maps, clinically tested and approved, were used as the
subject group. These maps were obtained from the Placido disk-based Carl Zeiss
ATLAS 9000 topographer with corneal wavefront analysis. This device is mainly
used to measure corneal shape, curvature and irregularities. Elevation-based
topography has its own advantages over Placido-based devices [36, 37]. The axial
curvature maps display the global curvature of the corneal surface. Corneal topographical
maps have been used to detect keratoconus and identify the shapes of steepening that

occurred due to the progression of the disease. The axial elevation map highlights
the protrusion of corneal curvature.
This study uses axial maps divided into ten groups based on the Amsler–Krumeich
standard classification scheme [38] that relies upon the anterior corneal features for
the identification of keratoconus and its progression shown in Fig. 18.1 according
to the degree of steepening and skewing of the curvature [39]. As per the standard
classification, patterns are identified as (i) Round, (ii) Oval, (iii) Symmetric Bowtie
(SB), (iv) Symmetric Bowtie with Skewed Radial Axes (SB/SRAX), (v) Asym-
metric Bowtie with Superior Steepening (ASB/SS), (vi) Asymmetric Bowtie with
Inferior Steepening (ASB/IS), (vii) Asymmetric Bowtie with Skewed Radial Axes
(AB/SRAX), (viii) Superior Steepening (SS) (ix) Inferior Steepening (IS) and (x)
Irregular for identifying the severity of the keratoconus. Out of these labels, ‘IS’
and ‘AB/SRAX’ show irregularities in corneal curvature, whereas ‘SB/SRAX’ and
‘Oval’ reveal symmetry in them. According to the classification scheme, various

Fig. 18.1 Corneal topography pattern as per classification scheme [40]



patterns found due to the skewness occurring in corneal curvature were associated
with the degree of progression of keratoconus [39]. A total of 372 bilateral axial maps
from 200 patients were used as the subject group in this study.
In our previous research, more than 800 maps of 534 × 534 × 3 pixels were
processed with OpenCV methods to derive, from the elevation maps, images containing
the edges and the area of steepening within the edges, referred to as 'images with edges
and color mask'. There, we attained more than 98% accuracy in detection of
keratoconus from these 'images with edges and color mask', a variant of the color
topographical maps [41]. In continuation of our previous research, we selected 372
keratoconus images covering all ten classes for this study. Maps of forme fruste
keratoconus were excluded. The selected maps were resized to 224 × 224 pixels in order
to be fed to our customized CNN model. Using the transfer learning approach, the
significant features were fed to the pretrained ImageNet models to form a very deep
layered learning architecture. All 372 images were converted into grayscale eroded
images with two objectives: (i) segregate the pattern from the images with respect to the
shape of the steepening and (ii) classify the KCN images into the ten varieties of shapes
specified in the Amsler–Krumeich standard classification by reducing the dimension of
the derived images with edges and mask [40].
The computer vision methods were used to achieve our first objective as follows.
As a first step, the Canny edge detection method was customized to optimize
the conversion of the color 'images with edges and mask' into grayscale eroded
images. These RGB color images were converted into grayscale using OpenCV's
color conversion method, followed by a Gaussian blur to eliminate noise from
the converted grayscale images. To determine any significant edge in an elevated area
of an image, gradients were calculated using OpenCV's Sobel method. Further,
double thresholding was applied to set the minimum and maximum threshold values.
This was done to ultimately read the intensity of every pixel of the grayscale image
and identify the most relevant pixels for drawing an edge, pixel by pixel. Thus, the
initial requirement of deriving various shapes and patterns was successfully achieved
by repeating the aforesaid steps on the data images, as shown in Fig. 18.2.
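As an illustration, the following minimal sketch reproduces this preprocessing chain with OpenCV; the file names and threshold values are illustrative assumptions rather than the exact settings used in this study.

```python
# A minimal sketch of the preprocessing pipeline described above (grayscale
# conversion, Gaussian denoising, gradient computation and double thresholding).
# File names and threshold values are illustrative assumptions.
import cv2

def extract_edges(path, low_thresh=50, high_thresh=150):
    bgr = cv2.imread(path)                         # colour 'edges with mask' map
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)   # colour -> grayscale
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)    # remove noise before gradients
    # cv2.Canny computes Sobel gradients, applies double (min/max) thresholding
    # and keeps only the most relevant pixels, tracing the edge pixel by pixel.
    return cv2.Canny(blurred, low_thresh, high_thresh)

edges = extract_edges("kcn_map_with_edges_and_mask.png")
cv2.imwrite("kcn_eroded_gray.png", edges)
```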
These derived shapes were then used to determine the prevalence and type of corneal
distortion. For this, the grayscale images had to be refined further so that the degree
of progression of keratoconus could be determined by classifying the shapes into the
ten labeled patterns of the Amsler–Krumeich standard classification scheme. The
images were resized to 64 × 64 pixels and convolved with Law's texture kernels in
order to reduce the dimension and refine the edges for clear identification of the shapes.
In this process, the grayscale images were convolved with the five Law's texture
energy kernels. These kernels are one-dimensional; they blur noise, smoothen the
gray-level texture, detect and contrast the edges, and emphasize the ripples and spots
at pixel level. Thus, the steepening patterns present in the maps were distinguished and
used further to check the degree of distortion in the curvature of the cornea. The
grayscale eroded images can then be used for the classification of the patterns using
the deep learning models.
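For reference, a hedged sketch of this texture step is given below; the five one-dimensional Law's kernels are the standard level, edge, spot, wave and ripple vectors, while the way the individual responses are combined is an assumption, since the text does not specify it.

```python
# A hedged sketch of the Law's texture step: resize the grayscale eroded map to
# 64 x 64 and convolve it with the five one-dimensional Law's texture energy
# kernels. How the five responses are combined afterwards is an assumption.
import cv2
import numpy as np

LAWS_KERNELS = {
    "L5_level":  np.array([ 1,  4, 6,  4,  1], dtype=np.float32),  # gray-level averaging / noise blurring
    "E5_edge":   np.array([-1, -2, 0,  2,  1], dtype=np.float32),  # edge detection
    "S5_spot":   np.array([-1,  0, 2,  0, -1], dtype=np.float32),  # spot emphasis
    "W5_wave":   np.array([-1,  2, 0, -2,  1], dtype=np.float32),  # wave pattern
    "R5_ripple": np.array([ 1, -4, 6, -4,  1], dtype=np.float32),  # high-frequency ripple
}

def laws_responses(gray_img):
    small = cv2.resize(gray_img, (64, 64)).astype(np.float32)
    responses = {}
    for name, k in LAWS_KERNELS.items():
        # separable 1-D convolution: horizontal pass with k, then vertical pass with k
        responses[name] = cv2.sepFilter2D(small, cv2.CV_32F, k, k)
    return responses

refined = laws_responses(cv2.imread("kcn_eroded_gray.png", cv2.IMREAD_GRAYSCALE))
```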

Fig. 18.2 Patterns from the variant of the keratoconus maps with edges and color mask

In order to attain the second objective of our study, the grayscale eroded images so
obtained were used as input. We used a range of conventional artificial neural network
models, starting with an ANN, and also tried highly recommended deep neural network
models which are ready to use and pretrained on millions of heterogeneous images.
Before feeding the image dataset to the various transfer learning models, the highly
imbalanced classes had to be augmented so that each of the ten classes contained
sufficient data.
As shown in Fig. 18.3, new images were generated by augmenting with (i) a rotation
of up to 10 degrees, (ii) a width shift of 10% and (iii) horizontal and vertical flips.
These steps yielded 3962 maps, amounting to approximately 350 images for each
class. The models used 3231 images for training and 359 as test data.
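A minimal sketch of these augmentation settings with Keras' ImageDataGenerator is shown below; the directory layout with one sub-folder per pattern label is an assumption.

```python
# A minimal sketch of the augmentation listed above (10-degree rotation,
# 10% width shift, horizontal and vertical flips). The directory layout
# (one sub-folder per pattern label) is an assumption.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=10,        # (i) rotation of up to 10 degrees
    width_shift_range=0.10,   # (ii) width shift of 10%
    horizontal_flip=True,     # (iii) horizontal flip
    vertical_flip=True,       #      and vertical flip
    rescale=1.0 / 255.0,
)

train_flow = augmenter.flow_from_directory(
    "eroded_maps/train",      # ten class sub-folders, e.g. round/, oval/, sb/ ...
    target_size=(224, 224),
    color_mode="grayscale",
    class_mode="categorical",
    batch_size=32,
)
```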
Figure 18.4 explains the workflow for the multiclass classification, wherein the
preprocessed grayscale eroded images of corneal topographies with edges and mask
were resized and used as input for the ImageNet deep learning models. Various
pretrained deep learning models were optimized using the transfer learning approach
to classify the eroded maps into ten different groups. In this multiclass classification,
the features were extracted from the fully connected dense layer of each model and
then used as input to a logistic regression estimator.

Fig. 18.3 Edges derived from eroded images using Law’s texture method

Fig. 18.4 Work flow of multiclass classification of topography patterns using images with edges
and color mask

The pretrained ImageNet models used are listed in Table 18.1 in Sect. 18.4 and were
implemented using Keras modules.
Among the pretrained models used in this study, VGG16, VGG19, MobileNet and
ResNet50 take 224 × 224 input images, whereas InceptionV3, Xception and
InceptionResNetV2 demand 299 × 299 as input size. Some deep architectures apply
larger kernels, such as 11 × 11 and 5 × 5, in the first couple of convolutional layers,
while the subsequent layers use 3 × 3 kernels. Each of the selected pretrained deep
convolutional neural network models used ImageNet weights with MaxPooling,
followed by a single flatten layer and a fully connected block with two dense layers.
The 'ReLU' activation function was used to introduce nonlinearity into the
convolutional layers, and default padding was used to fit the kernels over the images.
Each convolution layer was followed by a MaxPooling layer to reduce the
dimensionality of the image before passing it to the next layer as input. In order to
implement the transfer learning approach, features from the first dense layer of the
fully connected block were fed into a logistic regression model to predict the
probability for each of the ten shapes. The ANN model was designed with two dense
layers of 256 and 128 neurons. After converting the output data into a vector, it was
fed to the fully connected layer, which used the Softmax activation function to convert
the output into probability indices. Batch normalization along with dropout was applied
in the layers after flattening to avoid overfitting.
Although the flattened layer alone could have been used to measure the probability of
significant features, we applied logistic regression on the sliced fully connected layer
to determine the various patterns and yield higher accuracy. We selected the VGG16,
VGG19, InceptionResNetV2 and MobileNet networks to achieve the aforesaid.
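The following hedged sketch illustrates this transfer learning step for VGG16 as one representative model: the 4096-dimensional output of the first fully connected layer ('fc1') of the ImageNet-pretrained network is used as the feature vector for a logistic regression estimator. The data loading, the grayscale-to-three-channel replication and the placeholder arrays are illustrative assumptions.

```python
# A hedged sketch of the transfer learning step: 4096-dimensional features from
# the first fully connected layer ('fc1') of ImageNet-pretrained VGG16 are fed
# to a logistic regression estimator. Placeholder arrays stand in for the real
# eroded maps and their ten pattern labels.
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

base = VGG16(weights="imagenet", include_top=True)            # pretrained on ImageNet
extractor = Model(inputs=base.input,
                  outputs=base.get_layer("fc1").output)        # 4096 features per image

def to_features(gray_batch):
    """gray_batch: (n, 224, 224) grayscale eroded maps in [0, 255]."""
    rgb = np.repeat(gray_batch[..., np.newaxis], 3, axis=-1)   # VGG16 expects 3 channels
    return extractor.predict(preprocess_input(rgb.astype("float32")), verbose=0)

# Placeholders only; replace with the real eroded maps and their labels.
X_train_img = np.random.randint(0, 256, (20, 224, 224))
X_test_img = np.random.randint(0, 256, (10, 224, 224))
y_train = np.random.randint(0, 10, 20)
y_test = np.random.randint(0, 10, 10)

X_train, X_test = to_features(X_train_img), to_features(X_test_img)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```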

18.4 Discussion and Results

We eroded the grayscale images so that they could be classified into various groups
based upon the corneal steepening patterns. Since the ANN alone was not able to
classify the preprocessed images into the various classes and exhibited poor training
and testing accuracy, we further used the pretrained deep learning models and fitted
a logistic regression estimator on their features using the transfer learning approach
to elevate the classification capability of each model. This ultimately helps in
determining the degree of progression of the keratoconus disease. The VGG16 and
VGG19 models used 4096 features, the InceptionV3 and MobileNet models used 1000
features, ResNet50 used 2048 features, while InceptionResNetV2 used only 1536
features. Table 18.1 presents a comparative chart of the performance metrics,
highlighting the quality of classification of corneal steepening patterns achieved
through transfer learning.
Among the pretrained models used here for classification, VGG16 and VGG19
gave 99.41% and 99.62% training accuracy, whereas the testing accuracy obtained
by the same models was 76.04% and 77.43%, respectively. MobileNet and ResNet50
trained well with 91.95% and 92.23% accuracy and classified the test data with
75.48% and 76.60% accuracy. The testing accuracy of Xception and InceptionV3 was
50.13% and 34.26%, respectively.

Table 18.1 Comparison of training accuracy and testing accuracy obtained by deep learning neural
networks using the transfer learning approach on grayscale topographic maps of KCN
Model type Training accuracy Testing accuracy Precision Recall F1-score
ANN 20.73 12.26 – – –
VGG16 99.41 76.04 75.93 76.04 75.79
VGG19 99.62 77.43 77.59 77.43 77.38
Xception 63.75 50.13 51.02 50.13 50.03
InceptionV3 34.21 34.26 38.33 34.26 32.11
InceptionResNetV2 86.22 77.18 76.21 77.16 76.42
MobileNet 91.95 75.48 75.75 75.49 75.39
ResNet50 92.23 76.60 – – –

As can be seen from Table 18.1, the performance of Xception and InceptionV3 was
not up to the mark in spite of using logistic regression for accuracy optimization. A
support vector machine was therefore applied with these two models to improve
accuracy, but it resulted in only 37.88% for Xception and 38.44% for InceptionV3,
which is similar to the performance these two models gained using logistic regression.
InceptionResNetV2 exhibited 86.22% training accuracy and 77.18% testing accuracy,
giving a better training-to-testing ratio and less overfitting. Out of the many transfer
learning models applied, VGG16, VGG19, InceptionResNetV2 and MobileNet showed
an assuring average F1-score of 76.24%. Thus, the results with more than 75% accuracy
in identifying the corneal steepening patterns are further illustrated in Table 18.1.
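For completeness, the precision, recall and F1-score reported in Table 18.1 can be reproduced for any of the models with scikit-learn's classification_report, as in the short sketch below, which continues the variable names of the earlier transfer learning sketch.

```python
# A small sketch of how the Table 18.1 metrics can be computed for a fitted model.
# clf, X_test and y_test continue the earlier VGG16 'fc1' + logistic regression sketch.
from sklearn.metrics import accuracy_score, classification_report

y_pred = clf.predict(X_test)
print("testing accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred, digits=4))   # precision, recall, F1 per pattern
```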

18.5 Analysis of Results

Figure 18.5 gives the confusion matrices attained by the various pretrained models
used here, whereas Fig. 18.6 presents a comparison of the training and testing
accuracies gained by each of the deep convolutional models when applied to the
keratoconus dataset.
Since VGG16 and VGG19 were trained well, their training accuracies were 99.41%
and 99.62%, respectively. However, VGG16's confusion matrix suggests that, in a few
cases, it misclassified the 'Asymmetric Bowtie with Inferior Steepening' image as the
'Asymmetric Bowtie with Skewed Radial Axes' image. Similarly, VGG19, in a few
cases, mistook 'Asymmetric Bowtie with Skewed Radial Axes' maps for 'Superior
Steepening' images. In spite of such high training accuracy, the testing accuracy of
VGG16 and VGG19 came down to 76.04% and 77.43%, respectively.
InceptionResNetV2 led to a well-balanced training and testing accuracy of 86.22%
and 77.18%, respectively, when it comes to identification of the corneal curvature.
MobileNet yielded 91.95% training and 75.48% testing accuracy.

Fig. 18.5 Confusion matrix for each transfer learning model

Fig. 18.6 Comparative analysis of the training and testing accuracies obtained by various deep
learning models, when approached with the transfer learning

ResNet50 obtained training and testing accuracy of 92.23% and 76.6%, respectively.
It shall be noted from Fig. 18.5 that three of the models used here, namely the very
basic ANN, the widely used Xception (an extension of the Inception model) and
InceptionV3, were highly unstable and their performance was below average.

Fig. 18.7 F1-score of pretrained deep neural network models applied to the eroded corneal maps
The deep neural models used here frequently misinterpreted two patterns, namely
(i) the manifestation caused by skewed steepening on the inferior side of the cornea
and (ii) the 'Symmetric Bowtie' shapes, confusing them with the 'Round' and 'Oval'
maps which often prevail in advanced keratoconus.
Figure 18.7 shows the overall performances of all the deep neural models used
herein, as per respective F1-scores as follows: the VGG19 had the highest of 77.38%,
InceptionResNetV2 attained 76.42% and the VGG16’s 75.79% and MobileNet’s
75.39% were almost similar. However, the InceptionV3 and Xception were average
in identifying the corneal steepening pattern accurately.
It can be seen in Fig. 18.8 that Xception and InceptionV3 led to the highest
misclassification, followed by VGG16. While VGG19 and MobileNet showed
similar performance in classifying patterns, it was InceptionResNetV2 that
outperformed them, with a minimal error rate.
Among all the pretrained models applied to the eroded images of distorted corneal
curvatures, the performance of InceptionResNetV2 was the best, with the minimum
misinterpretation rate and well-balanced training and testing accuracy.

Fig. 18.8 MSE for pretrained deep neural network model applied with eroded corneal maps

18.6 Conclusion and Future Work

From our research, it can be concluded that InceptionResNetV2, VGG19, MobileNet
and VGG16 had the best accuracy, precision, recall and F1-score among all the
pretrained models used herein with bilateral corneal axial maps for determining the
keratoconus pattern. Hence, any of these deep convolutional neural networks can be
used to identify the corneal steepening patterns in order to determine keratoconus
progression and its comparative treatments. However, the pattern classification was
made using only anterior-featured corneal maps. Even for refractive surgeries, the
studied CNNs can be used along with the shapes classified using Law's texture. We
hope our gathered knowledge of the patterns and shapes of distortion in corneal
curvature will enable early detection of keratoconus in times to come.

References

1. Romero-Jiménez, M., Santodomingo-Rubido, J., Wolffsohn, J.S.: Keratoconus: a review. Cont.


Lens Anterior Eye 33, 157–166 (2010)
2. Krachmer, J.H.: Keratoconus and related noninflammatory corneal thinning disorders. Surv.
Ophthalmol. 30 (1984)
3. Piñero, D.P., Nieto, J.C., Lopez-Miguel, A.: Characterization of corneal structure in kerato-
conus. J. Cataract Refract. Surg. 38, 2167–2183 (2012)
4. McComish, B.J., et al.: Association of genetic variation with keratoconus. JAMA Ophthalmol.
138, 174 (2020)
5. Pedrotti, E., et al.: New treatments for keratoconus. Int. Ophthalmol. 40, 1619–1623 (2020)

6. Dapena, I., Parker, J.S., Melles, G.R.J.: Potential benefits of modified corneal tissue grafts
for keratoconus: Bowman layer ‘inlay’ and ‘onlay’ transplantation, and allogenic tissue ring
segments. Curr. Opin. Ophthalmol. (Publish Ahead of Print) (2020)
7. Fariselli, C., Vega-Estrada, A., Arnalich-Montiel, F., Alio, J.L.: Artificial neural network to
guide intracorneal ring segments implantation for keratoconus treatment. Eye Vis. 7, 20 (2020)
8. Bejdic, N., Biscevic, A., Pjano, M., Ivezic, B.: Incidence of keratoconus in Refractive surgery
population of Vojvodina—single center study. Mater. Sociomed. 32, 46 (2020)
9. Salomão, M., et al.: Recent developments in keratoconus diagnosis. Expert Rev. Ophthalmol.
13, 329–341 (2018)
10. Arbelaez, M.C., Versaci, F., Vestri, G., Barboni, P., Savini, G.: Use of a support vector machine
for keratoconus and subclinical keratoconus detection by topographic and tomographic data.
Ophthalmology 119, 2231–2238 (2012)
11. Karabatsas, C.H., Cook, S.D., Sparrow, J.M.: Proposed classification for topographic patterns
seen after penetrating keratoplasty. Br. J. Ophthalmol. 83, 403–409 (1999)
12. Accardo, P.A., Pensiero, S.: Neural network-based system for early keratoconus detection from
corneal topography. J. Biomed. Inform. 35, 151–159 (2002)
13. Kreps, E.O., Claerhout, I., Koppen, C.: Diagnostic patterns in keratoconus. Cont. Lens Anterior
Eye (2020). https://doi.org/10.1016/j.clae.2020.05.002
14. Yu, Y., et al.: Deep transfer learning for modality classification of medical images. Information
8, 91 (2017)
15. Shin, H.-C., et al.: Deep convolutional neural networks for computer-aided detection: CNN
architectures, dataset characteristics and transfer learning. IEEE Trans. Med. Imaging 35, 1285–
1298 (2016)
16. Kim, H.G., Choi, Y., Ro, Y.M.: Modality-bridge transfer learning for medical image classi-
fication. In: 2017 10th International Congress on Image and Signal Processing, BioMedical
Engineering and Informatics (CISP-BMEI), pp. 1–5. IEEE (2017). https://doi.org/10.1109/
CISP-BMEI.2017.8302286
17. Lu, W., et al.: Applications of artificial intelligence in ophthalmology: general overview. J.
Ophthalmol. 2018, 1–15 (2018)
18. Smolek, M.K.: Current keratoconus detection methods compared with a neural network
approach. Invest. Ophthalmol. 38, 10 (1997)
19. Kovács, I., et al.: Accuracy of machine learning classifiers using bilateral data from a
Scheimpflug camera for identifying eyes with preclinical signs of keratoconus. J. Cataract
Refract. Surg. 42, 275–283 (2016)
20. Toutounchian, F., Shanbehzadeh, J., Khanlari, M.: Detection of keratoconus and suspect
keratoconus by machine vision. Hong Kong 3 (2012)
21. Valdés-Mas, M.A., et al.: A new approach based on Machine Learning for predicting
corneal curvature (K1) and astigmatism in patients with keratoconus after intracorneal ring
implantation. Comput. Methods Programs Biomed. 116, 39–47 (2014)
22. Smadja, D., et al.: Detection of subclinical keratoconus using an automated decision tree
classification. Am. J. Ophthalmol. 156, 237-246.e1 (2013)
23. Consejo, A., Melcer, T., Rozema, J.J.: Introduction to machine learning for ophthalmologists.
Sem. Ophthalmol. 34, 19–41 (2019)
24. Souza, M.B., Medeiros, F.W., Souza, D.B., Garcia, R., Alves, M.R.: Evaluation of machine
learning classifiers in keratoconus detection from orbscan II examinations. Clinics 65, 1223–
1228 (2010)
25. Badillo, P.D., Zhivolupova, Y.A., Kudlakhmedov, S.Sh.: Convolutional neural networks for
astigmatism detection. In: 2020 IEEE Conference of Russian Young Researchers in Electrical
and Electronic Engineering (EIConRus), pp. 1360–1365. IEEE (2020). https://doi.org/10.1109/
EIConRus49466.2020.9038998
26. Jmour, N., Zayen, S., Abdelkrim, A.: Convolutional neural networks for image classification. In:
2018 International Conference on Advanced Systems and Electric Technologies (IC_ASET),
pp. 397–402. IEEE (2018). https://doi.org/10.1109/ASET.2018.8379889

27. Klyce, S.D.: The future of keratoconus screening with artificial intelligence. Ophthalmology
125, 1872–1873 (2018)
28. Imran, A., et al.: Fundus image-based cataract classification using a hybrid convolutional and
recurrent neural network. Vis. Comput. (2020). https://doi.org/10.1007/s00371-020-01994-3
29. Lavric, A., Valentin, P.: KeratoDetect: Keratoconus detection algorithm using convolutional
neural networks. Comput. Intell. Neurosci. 2019, 1–9 (2019)
30. Salih, N., Hussein, N.: Human Corneal state prediction from topographical maps using a deep
neural network and a support vector machine. 8
31. Kattire, S.S., Shah, A.V.: Boundary detection algorithm implementation for medical images.
Int. J. Eng. Res. 3, 3 (2014)
32. Litjens, G., et al.: A survey on deep learning in medical image analysis. Med. Image Anal. 42,
60–88 (2017)
33. Kuo, B.-I., et al.: Keratoconus screening based on deep learning approach of corneal topography.
Trans. Vis. Sci. Tech. 9, 53 (2020)
34. Sonar, H., Kadam, A., Bhoir, P., Joshi, B.: Detection of keratoconus disease. ITM Web Conf.
32, 03019 (2020)
35. Hesamian, M.H., Jia, W., He, X., Kennedy, P.: Deep learning techniques for medical image
segmentation: achievements and challenges. J Digit Imaging 32, 582–596 (2019)
36. Belin, M.W., Khachikian, S.S.: An introduction to understanding elevation-based topography:
how elevation data are displayed—a review. Clin. Experiment. Ophthalmol. 37, 14–29 (2009)
37. Martínez-Abad, A., Piñero, D.P.: New perspectives on the detection and progression of
keratoconus. J. Cataract Refract. Surg. 43, 1213–1227 (2017)
38. Giannaccare, G., et al.: Comparison of Amsler-Krumeich and Sandali classifications for staging
eyes with keratoconus. Appl. Sci. 11, 4007 (2021)
39. Li, X., Yang, H., Rabinowitz, Y.S.: Keratoconus: classification scheme based on videokeratog-
raphy and clinical signs. J. Cataract Refract. Surg. 35, 1597–1603 (2009)
40. Rasheed, K., Rabinowitz, Y.S., Remba, D., Remba, M.J.: Interobserver and intraobserver reli-
ability of a classification scheme for corneal topographic patterns. Br. J. Ophthalmol. 82,
1401–1406 (1998)
41. Gandhi, S.R., Satani, J., Bhuva, K., Patadiya, P.: Evaluation of deep learning networks for
keratoconus detection using corneal topographic images. In: Singh, S.K., Roy, P., Raman,
B., Nagabhushan, P. (eds.) Computer Vision and Image Processing, pp. 367–380. Springer,
Singapore (2021)
Chapter 19
Intelligent Heuristic Keyword-Based
Search Methodologies Applied
to Cryptographic Cloud Environment

Panchal Mital Nikunj, Dushyantsinh B. Rathod, and Jaykumar Dave

Abstract Cloud computing is now considered one of the most important parts of
information technology. Relying on data transferred to a cloud server has become a
must, as cloud providers offer these services. This has given rise to many problems
concerning data integrity, security, and privacy. As third-party custodians, the cloud
companies take care of the data, but transferring the data from one machine to another
and making it usable by people all over the world is also a challenging task. This
critical disadvantage makes existing procedures unsatisfactory in cloud computing, as
it significantly affects system usability, making the user searching experience very
frustrating and system effectiveness extremely low. In the proposed research, we
formally address the problem of effective fuzzy keyword search over encrypted cloud
data while maintaining keyword privacy. Fuzzy keyword search improves usability:
when the end user's input matches a predefined keyword exactly, or when the closest
semantically dependent match is available, the matching documents are found and
returned. Through secure search techniques, we compare the schemes selected from
the papers and display the comparison for a better understanding of the processes for
researchers.

Keywords Cloud computing · Fuzzy keyword · Searchable encryption

P. M. Nikunj (B)
Faculty of Engineering and Technology, Sankalchand Patel University, Visnagar, Gujarat, India
e-mail: mital@ldce.ac.in
D. B. Rathod · J. Dave
Computer Engineering Department, Sankalchand Patel University, Visnagar, Gujarat, India
e-mail: dbrathod.fet@spu.ac.in
J. Dave
e-mail: jdave.fet@spu.ac.in

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8_19

19.1 Introduction

Schemes of searchable encryption such as single-keyword or multi-keyword schemes
differ in many respects. Traditional keyword search is based on plaintext keyword
search, which is not feasible because privacy can only be protected if the data are
encrypted in the strongest possible way. Fuzzy keyword search techniques,
wildcard-based techniques, gram-based techniques, privacy preservation techniques
and tree traversal search techniques have been proposed by many researchers.
In the most popular searchable encryption schemes for keywords, the trapdoor plays
an important role in the public key cryptography plan, which is the fundamental
security foundation of present-day cryptography. The trapdoor in a public key
encryption system therefore ends up being a sort of restricted resource. We summarize
the technique of adversarial learning models in artificial intelligence and introduce
efficient ways to conveniently obtain computationally hard trapdoors based on an
automated information-theoretic search method. The essential routine is building a
rich architecture to search for a deterministic reversible generator which can
effectively encode and decode variable information or messages.
The architecture incorporates a secret trapdoor generator based on an encoder that
automatically explores variations responsible for finding suitable private trapdoors
satisfying an entropy threshold, a random generator yielding arbitrary noise, and a
classifier taking the outputs of the two generators. The assessment of all the techniques
is compared and displayed from different aspects, with emphasis on producing
efficient encryption schemes in which the security of the documents and keywords
stored on the server is not compromised.

19.2 Literature Survey

A probability trapdoor function has been introduced to enhance the multi-keyword
searchable encryption scheme. Encryption techniques such as single-keyword search
with public key-based searchable encryption, which depends on a bilinear mapping
function, result in inadequate search accuracy and lower efficiency.

19.2.1 Comparison of Searchable Encryption Techniques

Searchable encryption on a single word is not always sufficient, as the result could be
multiple files [1]. Symmetric searchable encryption was introduced for the single-word
searchable encryption process. Multiple keywords can result in a lengthier process
but, at the same time, a more efficient result. There are schemes that introduce
multi-keyword ranking based on coordinate matching, although the accuracy is
inadequate because of insufficient weight differences among the considered keywords.
Fuzzy keyword search gives a more practical solution to the problem of user-supplied
keywords when the user does not have the exact keyword to access the files of his
ownership. Locality sensitive hashing techniques and gram-based fuzzy sets combined
give better efficiency, and their separate use by different researchers is demonstrated
in graphs, although with limits on the types and lengths of words [1]. To improve the
competence of the multi-keyword technique, a tree structure using a vector model with
word frequency and IDF weighting was introduced into the search process [1]. A
generic wildcard-based fuzzy keyword set construction is sketched below.
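The small sketch below shows the generic wildcard-based fuzzy keyword set construction for edit distance 1; it is the textbook construction rather than the exact algorithm of any single surveyed paper.

```python
# A small sketch of the wildcard-based fuzzy keyword set: for edit distance 1,
# every position where a substitution, deletion or insertion could occur is
# replaced by a single '*'. Generic construction, not a specific paper's scheme.
def wildcard_fuzzy_set(keyword: str) -> set:
    variants = {keyword}
    for i in range(len(keyword)):
        variants.add(keyword[:i] + "*" + keyword[i + 1:])   # substitution or deletion at i
    for i in range(len(keyword) + 1):
        variants.add(keyword[:i] + "*" + keyword[i:])        # insertion between positions
    return variants

print(sorted(wildcard_fuzzy_set("cloud")))
# e.g. '*cloud', '*loud', 'c*loud', 'c*oud', ..., 'cloud*'
```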
A key generation phase and a build-index phase generate the hashed index and store it
as separate index files, which are then used by the cloud server to provide files to
authenticated users based on the keywords entered. Secure searchable encryption helps
solve the security problem of sending files for storage on a cloud server. This raises a
need for encryption of user data to provide confidentiality and privacy to the user,
where the process also covers finding the closest keyword if an exact keyword match
is not found. Simultaneous support of a secure searchable encryption scheme and
ranking-based matching of the results was proposed by the authors. Two factors are
used, one for index storage and one for writing queries. A rank is generated for
matching the outcome while making sure data confidentiality and privacy of users are
maintained and the resources of mobile device users are safeguarded [2].

19.2.2 Fuzzy and Aggregation-Based Keyword Search

An aggregate keyword token is generated to preserve the search functionality over
multi-indexed data, combined with a key sharing scheme based on public key
encryption. The problem is to find a pattern or technique to deal with keyword search
when the file set on the cloud is too large. A technique named multi-keyword fuzzy
search with an efficient leakage-resilient framework over encrypted cloud data was
introduced to solve this. While searching, the cloud server can see only the encrypted
file set, the index and the submitted tokens; the search keywords and the underlying
file set remain unknown [3]. Two index files are created to increase time efficiency
regardless of file size. LSH, Bloom filters, and gram counting order are used as the
fuzzy keyword search functions [3].
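The following hedged sketch illustrates the bigram-plus-Bloom-filter idea behind such fuzzy search functions: each keyword is broken into 2-grams that are hashed into a fixed-size bit set, so a slightly misspelled query still shares a substantial fraction of bits with the stored keyword. The hash construction and filter size are illustrative choices only, not those of the surveyed scheme.

```python
# Illustrative bigram + Bloom-style bit vector for fuzzy keyword matching.
# Hash functions, filter size M and count K are arbitrary example choices.
import hashlib

M = 256  # bit-vector length
K = 3    # number of hash functions per gram

def bigrams(word: str) -> set:
    return {word[i:i + 2] for i in range(len(word) - 1)}

def bloom_bits(word: str) -> set:
    bits = set()
    for gram in bigrams(word):
        for seed in range(K):
            digest = hashlib.sha256(f"{seed}:{gram}".encode()).hexdigest()
            bits.add(int(digest, 16) % M)
    return bits

stored = bloom_bits("network")
query = bloom_bits("netwerk")                    # misspelled query keyword
overlap = len(stored & query) / len(stored | query)
print(f"bit-vector similarity: {overlap:.2f}")   # substantial overlap despite the typo
```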
Liu et al. [4] propose a prime inner product encoding (PIPE) scheme in which the
indecomposable property of distinct prime numbers is used to obtain a more flexible
multi-keyword fuzzy search with better accuracy. The encoding is designed so that
the inner product yields an integer value only when the keywords are alike, which
eases finding a fuzzy keyword match in a large dataset. To keep AND and OR
operators in a query from losing their effectiveness, random noise is added to the
query vector, which helps in generating a proper search without losing the meaning
of the query.

19.2.3 Ranked and Verifiable Keyword Search

Traditionally, the data are encrypted by the data owner before being uploaded to the
cloud. For this purpose, a scheme was proposed that increases the effectiveness and
efficiency of data upload and also gives the data owner assurance that the upload to
the cloud server is secure. A linked list is employed for fuzzy keyword set generation,
along with an index vector for each fuzzy keyword set, and an authentication label is
generated that solves the problem of authentication [5]. Key generation, trapdoor
computation and search phases are implemented to test the results against the security
analysis of different schemes. A ciphertext-policy attribute-based encryption (CP-ABE)
scheme contains five stages: Setup, Encrypt, KeyGen, Delegate, and Decrypt. This
bilinear process generates the ciphertext and encrypts it with the encryption algorithm,
and the private key is generated. In the delegate stage, a secret key is generated based
on the key provided by the user, and in the last stage, the decryption is processed [6].
The two main properties of the CP-ABE algorithm are bilinearity and non-degeneracy.
To improve search efficiency, a word pattern function is calculated with a hash
function. The paper is validated with multiple theorems testing data of all types, from
which the corresponding relationship is tested by analyzing the index value with the
trapdoor as the secret key component [6]. Experimental results are produced with
per-document index generation. The fuzzy keyword search (FKS) construction is based
on AES during index generation; since the computational overhead of the symmetric
AES encryption is very low, FKS is very efficient in this context [6].
The other experimental result is based on the time of trapdoor generation. Document
index versus trapdoor generation time is exhibited in a graph that follows a curvilinear
order, and the effectiveness is shown by its consistently improving results. To resist
sparse non-negative matrix factorization-based attacks, locality sensitive hashing and
Bloom filter techniques are avoided; instead, a random redundancy method is
introduced along with a tree-based indexing technique. Thus, queries combining
"AND", "OR" and "NOT" with other keywords are processed and proven
deterministic [7].
The index is further divided into multiple sub-indexes to enhance multi-dimensional
queries and process logic queries with better efficacy. With the computation of the
Bloom filter and LSH for the multi-dimensional query, the index is stored in a separate
table. The secure kNN algorithm is used for the indexes and keys, transmitted via a
secure channel, while a symmetric encryption algorithm is used for the documents.
Based on the trapdoor, the query data are generated, encrypted and sent to the server.
The cloud then calculates the query score based on the trapdoor and returns the
conclusion to the data user. Performance is analyzed with precision, recall, and
accuracy parameters on a real-world dataset. The complexity of trapdoor generation
increases with the size of the index, and the complexity of index generation largely
depends upon the size of the documents, but it has been largely reduced because of the
tree construction scheme.

19.2.4 Algorithms Description

The schemes introduced as a novel approach in Liu et al. [8] are found promising, as a
new tree is constructed with Bloom filter techniques. Symmetric searchable encryption
has the limitation of searching over encrypted data, which shows how such techniques
benefit over cryptography-enabled algorithms for regular data and values. In
multi-keyword ranked search, the stages are a query phase, in which users generate the
trapdoor corresponding to the multiple keywords, improving the search accuracy, and
a search phase, which processes the search results and their ranking so that the server
returns the top-k documents most relevant to the query keywords. This lets clients see
the most significant documents without decrypting each of them and reduces redundant
communication traffic [8]. Machine learning with PKE generates encapsulated keys as
an automatic approach along with establishing session keys. Trapdoors for public key
encryption can be used with the twist of adding a statistically approximating map for
a more probabilistic approach toward higher security standards. Zhu and Han [9]
presented the definition of an AI-based trapdoor design and, more explicitly, a
generative learning model for securing a desirable trapdoor with the greatest entropy.
In their model, a VAE-like joint structure was developed based on a convolutional
network and two linear coding networks with an extra resampling layer to disguise
each output. Applying the generative trapdoor then showed better substitutability and
adaptability in practical customization, particularly in networks comprising AI-based
processes [9].
A data owner transfers its information to the server and wishes to share (portions of)
it with numerous clients, who are then authorized to issue search queries and retrieve
the corresponding documents. For example, one could consider numerous clinics
sharing information through a cloud service to selectively share patients' data while
ensuring security and privacy. The test approach uses real and ideal games for the
MUSSE scheme with various user sets and multiple corrupted user sets. The results
are promising in comparison with different databases in terms of computation time
and communication size [10].

19.2.5 Comparison of Techniques

See Table 19.1.


Table 19.1 Comparison of literature

Paper id: 1
Approach: Multi-keyword encryption with five polynomial-time schemes of single, spatial, and multi-keyword search, with key generation and index generation steps
Advantages: Complexity is managed efficiently by managing keyword vectors when constructing trapdoors
Disadvantages: Storage overhead of the hash function, which is not invertible
Remarks: Better encryption is achieved by evolving the vector of keywords with the document identifier

Paper id: 2
Approach: Bloom filter and locality-based hashing; alignable locality-sensitive hashing is processed to perform the search operation
Advantages: Choosing proper k and m for choosing documents and the index based on the selected document
Disadvantages: The Jaccard distance between the correct keyword and a misspelled keyword can be fuzzy
Remarks: Experimental data are limited, and so the result can vary at times

Paper id: 3
Approach: Efficient leakage-resilient multi-keyword fuzzy search (EliMFS) framework over encrypted cloud data
Advantages: This technique searches ciphertext on encrypted cloud data. A number of symmetric searchable encryption schemes have been proposed by the authors, and an enormous amount of data is outsourced to the cloud
Disadvantages: No noticeable disadvantage is found
Remarks: A novel two-stage indexing technique with LSH and Bloom filter reduces search time. Different threat models are used in experiments to test the proposed model

Paper id: 4
Approach: Prime inner product encoding for effective wildcard-based multi-keyword fuzzy search
Advantages: To exploit the non-composable property of prime numbers, a new prime inner product encoding (PIPE) scheme is created
Disadvantages: The flexibility of writing queries with logical operators can generate multiple possible results
Remarks: Encoding of a query keyword or an index keyword into a vector filled with prime numbers

Paper id: 5
Approach: Enabling efficient verifiable fuzzy keyword search for encrypted data
Advantages: The paper states an exact keyword search scheme which is verifiable and can then be extended to fuzzy keywords. The approach of using a linked list to achieve storage efficiency is proven beneficial
Disadvantages: –
Remarks: The approach proceeds toward operations for generating an authentication label with the fuzzy keyword for verification

Paper id: 6
Approach: Fuzzy keyword search method supporting access control (FKS-AC) over ciphertexts
Advantages: Access control along with fuzzy keyword search are both achieved with a novel approach
Disadvantages: Index generation and the tree structure generation process are specific to the type of data
Remarks: The results are promising when compared to wildcard character, LSH and Bloom filter techniques

Paper id: 7
Approach: Fuzzy keyword search algorithm with the addition of logical operators in the query
Advantages: It results in better keyword generation even over encrypted data because of the logical operators and tree-based index generation to improve search efficiency
Disadvantages: –
Remarks: Theoretical analysis is promising in terms of word- and document-based results but needs to be tested over different threat models

Paper id: 8
Approach: Multi-keyword ranked searchable encryption with the wildcard keyword
Advantages: The query keyword set comprises wildcard characters. After the trapdoor, the query is generated with nested hierarchical clustering
Disadvantages: –
Remarks: Multiple techniques are used to generate secure keywords, and keyword decryption is performed with top-k ranking of documents

Paper id: 9
Approach: Generative trapdoors for public key cryptography based on automatic entropy optimization
Advantages: Generating a sub-optimal trapdoor with an automated, artificial intelligence-based process makes trapdoors easy to implement while bypassing the computational overheads
Disadvantages: –
Remarks: The comparison of AI network models with independent settings produces better results while reducing complexity

Paper id: 10
Approach: Multi-user collusion-resistant searchable encryption with optimal search time
Advantages: Corrupted data can be limited, and thus sharing of sensitive database information limits cross-over leakage; this paper introduces a novel technique for single-server SSE schemes
Disadvantages: –
Remarks: Producing maps and tables of corrupted and non-corrupted users and data for future transactions, and revisiting only three nodes which are proven non-infected by any attack, is a promising idea that can be carried forward

19.3 Conclusion

The popular searchable encryption techniques followed by researchers have been
surveyed and studied from different aspects. These aspects include guaranteeing
privacy to users, decomposing keywords without harm, generating secure public keys,
and maintaining a quick index of documents. The analyzed and surveyed techniques
yield better results when used in hybrid combinations chosen according to the outcome
requirements.

References

1. Ping, Y., et al.: A multi-keyword searchable encryption scheme based on probability trapdoor
over encryption cloud data. Information 11(8), 394 (2020)
2. Li, M., et al.: Multi-keyword Fuzzy search over encrypted cloud storage data. Procedia
Computer Sci. 187, 365–370 (2021)
3. Chen, J., et al.: EliMFS: achieving efficient, leakage-resilient, and multi-keyword fuzzy search
on encrypted cloud data. IEEE Trans. Services Comput. 13(6), 1072–1085 (2017)
4. Liu, Q., et al.: Prime inner product encoding for effective wildcard-based multi-keyword fuzzy
search. IEEE Trans. Services Comput. (2020)
5. Ge, X., et al.: Enabling efficient verifiable fuzzy keyword search over encrypted data in cloud
computing. IEEE Access 6, 45725–45739 (2018)
6. Zhu, H., et al.: Fuzzy keyword search and access control over ciphertexts in cloud computing.
In: Australasian Conference on Information Security and Privacy. Springer, Cham (2017)
7. Fu, S., et al.: A privacy-preserving fuzzy search scheme supporting logic query over encrypted
cloud data. Mob. Networks Appl. 26(4), 1574–1585 (2021)
8. Liu, J., et al.: Multi-keyword ranked searchable encryption with the wildcard keyword for data
sharing in cloud computing. Computer J. (2021)
9. Zhu, S., Han, Y.: Generative trapdoors for public key cryptography based on automatic entropy
optimization. China Commun. 18(8), 35–46 (2021)
10. Wang, Y., Papadopoulos, D.: Multi-user collusion-resistant searchable encryption with
optimal search time. In: Proceedings of the 2021 ACM Asia Conference on Computer and
Communications Security (2021)
Chapter 20
Surakhsha Kavach: ML-Based
Cross-platform Application
for COVID-19 Vulnerability Detection

Jasmine Kaur Wadhwa, Srushti Patil, Ruchi Raicha, Yaminee Patil,


and Sonal Jain

Abstract The Suraksha Kavach app is a way of detecting a person's vulnerability to a
COVID-19 attack. Because people were not aware at an early stage, a large number of
people suffered serious consequences. The idea of the ML-based app is to make people
aware of their vulnerability to COVID-19, which will help them take precautions at an
early stage to avoid further serious consequences and protect themselves and their
families. This app will be easily available on any platform, which will help people
access it easily and take benefit of it. We have arrived at the conclusion that the
ML-based Suraksha Kavach app is a viable solution for people to take precautionary
measures at an early stage.

Keywords COVID-19 · Suraksha Kavach · Machine learning

20.1 Introduction

On December 31, 2019, a novel pathogenic coronavirus (2019-nCoV) epidemic was
first reported in Wuhan, Hubei Province, Central China. On March 11, 2020, the
World Health Organization declared it a pandemic. COVID-19 is the official name
for the coronavirus illness.

J. K. Wadhwa (B) · S. Patil · R. Raicha · Y. Patil · S. Jain


Department of Information Technology, A.P Shah Institute of Technology, Thane, India
e-mail: jasminekaurwadhwa@apsit.edu.in
S. Patil
e-mail: srushtipatil@apsit.edu.in
R. Raicha
e-mail: ruchiraicha@apsit.edu.in
Y. Patil
e-mail: ympatil@apsit.edu.in
S. Jain
e-mail: Sajain@apsit.edu.in

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8_20

COVID-19 is an infectious disease caused by severe acute respiratory syndrome
coronavirus 2 (SARS-CoV-2). The World Health Organization (WHO) declared the
outbreak a Public Health Emergency of International Concern.
As COVID-19 has become a concern for all of us and is spreading very fast,
it is necessary for us to take precautions as early as possible.
The Suraksha Kavach app is a cross-platform application in which the user can get
the predicted vulnerability to a COVID-19 attack by entering his or her age and state.
This prediction is done using various machine learning algorithms such as logistic
regression, random forest, Naïve Bayes, and support vector machine (SVM).
According to the age and state, the data is first split into training and testing sets;
after obtaining the probability, it is classified into three classes:
0—very less vulnerable
1—vulnerable
2—more vulnerable
For example, if a user enters an age of 60 and the state Maharashtra, the chance of
getting COVID-19 is higher because the age is above 50 and the COVID-19 cases in
that state are high, so the output will be a message such as "You are more vulnerable".
This will help people, especially senior citizens, take precautions at a very early
stage and protect themselves from getting COVID-19 and severe disease.
The rest of this paper is organized as follows: Sect. 20.2 contains the literature survey,
and Sect. 20.3 presents the analysis of the project. Section 20.4 describes the existing
system architecture and methods, and the solution to its limitations is shown in the
proposed system architecture in Sect. 20.5. Section 20.6 contains the future scope,
and Sect. 20.7 concludes the paper.

20.2 Literature Survey

1. In the first paper, the authors discuss regression analysis of COVID-19 using
various machine learning algorithms such as polynomial regression and linear
regression [1].
2. In the second paper, the authors discuss an approach to predict coronavirus using
various machine learning algorithms such as support vector machine (SVM), the
KNN algorithm, decision tree, and random forest [2].
3. In the third paper, the authors discuss the detection of COVID-19 from medical
images and symptoms of patients using machine learning, carried out with deep
learning algorithms [3].

20.3 Analysis

Before moving toward the existing system architecture, we analyzed statewise
COVID-19 cases.
We took the preprocessed dataset India_Covid.csv from Kaggle and then visualized
statewise COVID-19 cases; a minimal code sketch of this analysis is given below.
Figure 20.1 shows that Maharashtra has the highest number of confirmed cases and
West Bengal the lowest.
Figure 20.2 shows statewise cured cases, in which Maharashtra is the state with the
most cured cases and Rajasthan the fewest. We then analyzed COVID-19 cases and
deaths for individual states.
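A minimal sketch of this statewise analysis is shown here; the column names assumed for India_Covid.csv ('Date', 'State/UnionTerritory', 'Confirmed') are illustrative and may need adjusting to the actual Kaggle file.

```python
# A minimal sketch of the statewise analysis. Column names are assumptions
# about the Kaggle India_Covid.csv file and may need adjusting.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("India_Covid.csv", parse_dates=["Date"])

# latest cumulative count per state, sorted for a Fig. 20.1-style bar chart
latest = df.sort_values("Date").groupby("State/UnionTerritory").last()
latest["Confirmed"].sort_values(ascending=False).plot(kind="bar", figsize=(12, 5))
plt.title("Statewise confirmed COVID-19 cases")
plt.ylabel("Confirmed cases")
plt.tight_layout()
plt.show()
```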
1. Maharashtra
Figure 20.3 shows datewise confirmed cases in Maharashtra from March 8, 2020, to
April 22, 2020.
Figure 20.4 shows datewise death cases in Maharashtra from March 8, 2020, to
April 22, 2020. So, we can see that there is an increase in death rate as COVID-19
cases are increasing.
2. Gujarat
Figure 20.5 shows datewise confirmed cases in Gujarat from March 8, 2020, to April
22, 2020.

Fig. 20.1 Statewise confirmed cases



Fig. 20.2 Statewise cured cases

Fig. 20.3 Datewise confirmed cases in Maharashtra



Fig. 20.4 Datewise death cases in Maharashtra

Fig. 20.5 Datewise confirmed cases in Gujarat



Fig. 20.6 Datewise death cases in Gujarat

Figure 20.6 shows datewise death cases in Gujarat from March 8, 2020, to April
22, 2020. So, we can see that there is an increase in death rate as COVID-19 cases
are increasing.

20.4 Existing System Architecture

There is much software available which gives information regarding COVID-19 cases
and vaccination status. With the help of such tools, people may know how many cases
are around them and may get an alert notification if there are too many COVID-19
cases nearby, so that they can take precautions. However, there is no information
regarding the degree of COVID-19 vulnerability that one may have, which can lead
people to serious consequences. The existing system includes (Fig. 20.7):
(1) The registration
(2) Login
(3) User can see COVID-19 cases statewise
(4) User can also see the information regarding vaccination
(5) However, there is no information regarding the degree of COVID-19 vulnerability
that one may have as a precautionary measure.
This project aims to solve the limitations of the existing systems and further
develop a new system.

Fig. 20.7 Existing system workflow

20.5 Proposed System Architecture

Figure 20.8 depicts the process of the proposed system, which is explained below.
(1) Login/Register
It consists of a username and password. This combination is the most widely used
authentication method because of its convenience and low cost of deployment. Once
the user has logged in, he or she can experience the app.
(2) COVID-19 Dashboard
Once the user is successfully logged in, he or she can see the daily COVID-19 cases.
(3) Check the vulnerability attack
The user can click on this button and check his or her vulnerability to a COVID-19
attack.

Fig. 20.8 Proposed system workflow



Fig. 20.9 Random forest algorithm accuracy

(4) Predict the attack
After clicking on the vulnerability attack button, the user enters his or her age and
state and then gets the predicted vulnerability to COVID-19, so that precautionary
measures can be taken at an early stage to prevent serious disease. The prediction is
based on ML algorithms. The algorithms used are described below, and a combined
code sketch is given at the end of this section.
1. Random Forest
Random forest is a popular machine learning algorithm that is part of the supervised
machine learning strategy. It can be used for both classification and regression
problems in ML. Random forest is an ensemble that builds decision trees on different
subsets of the provided data and aggregates them to improve the accuracy of the
prediction. This algorithm takes less training time compared to other algorithms,
predicts the output with high accuracy, and works well on large databases.
We applied this algorithm to our dataset, and the accuracy is 54.44% (Fig. 20.9).
2. Logistic Regression
Logistic regression is a supervised machine learning algorithm. It predicts the output
of a categorical dependent variable with the help of a given set of independent
variables. The outcome is a categorical or discrete value such as Yes or No, 0 or 1,
or true or false.
We applied this algorithm to our dataset, and the accuracy is 49.43% (Fig. 20.10).
3. Support Vector Machine
SVM is a supervised machine learning algorithm that is very helpful in classification problems. The SVM algorithm divides the dataset into classes by finding a hyperplane, a boundary that differentiates the two classes. The diagram below shows two distinct categories separated by a decision boundary, or hyperplane [4] (Fig. 20.11).
We applied this algorithm on our dataset and obtained an accuracy of 54.11% (Fig. 20.12).

Fig. 20.10 Logistic regression accuracy

Fig. 20.11 Support vector machine

Fig. 20.12 Support vector machine accuracy



Fig. 20.13 Naïve Bayes algorithm accuracy

4. Naive Bayes

Naive Bayes is a supervised machine learning algorithm used for solving classification problems. It is based on Bayes' theorem and is one of the simplest and fastest machine learning algorithms, which helps us to make quick predictions.
It is a probabilistic classifier, which means that it predicts based on the probability of an object belonging to a class.
The formula for Bayes' theorem is given as:

P(A|B) = P(B|A) P(A) / P(B)

where
P(A|B) is the posterior probability,
P(B|A) is the likelihood probability,
P(A) is the prior probability and
P(B) is the marginal (evidence) probability.
When this algorithm is applied on our dataset, the accuracy is 53.66% (Fig. 20.13).
Overall, the age and state data are first split into training and testing sets, the classifiers output class probabilities, the result is mapped to classes indicating how vulnerable the user is, and the user then receives the corresponding output.
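A minimal, hedged sketch of this pipeline is given below. It trains and compares the four classifiers discussed above on age and state features; the dataset file name, column names and label values are hypothetical placeholders for illustration, not the authors' actual data or code.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Hypothetical dataset with columns: age, state, vulnerability_class
df = pd.read_csv("covid_vulnerability.csv")
X = df[["age", "state"]].copy()
X["state"] = LabelEncoder().fit_transform(X["state"])   # encode state names as integers
y = df["vulnerability_class"]                           # e.g. low / medium / high

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {acc:.2%}")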

20.6 Future Scope

The future scope of our proposed system can be:


1. Support for various languages can be implemented in the application, so that people who do not understand English can easily access it.
2. A lockdown reminder is another feature that can be added, so that people are informed through on-screen notifications.
3. Booking of vaccination slots can also be implemented.

20.7 Conclusion

In our project, the prediction of the percentage attack is done using machine learning algorithms. Random forest is the most suitable algorithm as it gives the highest accuracy. We came up with this solution because a very large number of people are getting infected by COVID-19, so it is necessary to spread awareness and take precautions as soon as possible. This app can save the lives of many people and help them live a happy and healthy life.

References

1. Gambhir, E., Jain, R., Gupta, A.: Regression analysis of COVID-19 using machine (2020)
2. Rohini, M., Naveena, K.R., Jothipriya, G., Kameshwaran, S., Jagadeeswari, M.: A compara-
tive approach to predict corona virus using machine learning. In: International Conference on
Artificial Intelligence and Smart Systems (ICAIS-2021) (2021)
3. Siddhu, A.K., Kumar, A., Kundu, S.: Detection of COVID19 from medical images and/or symp-
toms of patient. In: International Conference on System Modeling & Advancement in Research
Trends (2020)
4. [Online]. Available: https://medium.com/analytics-vidhya/what-is-machine-learning-3977a3b08384
Chapter 21
Malware Family Categorization Using
Genetic Algorithm-CNN-Based Image
Classification Technique

Prabhsimar Singh Taneja, Shubhang Gopal, Priybhanu Yadav,


and Rahul Gupta

Abstract Malware analysis and classification of malware families using different


techniques is a prominent field of research. The presence of malware code in files
has continuously increased over time making it tedious for companies to analyze
the large number of files manually. We have used a novel machine learning image
classification-based technique to classify images of malware files into their respec-
tive families. Classifying malware images using neural networks helps to simplify
the malware detection process. This research studied drawbacks in existing machine
learning approaches and has used genetic algorithm built on the backbone of convo-
lutional neural networks to implement a model which has achieved the objective
of classifying new malware files into families on its own. This has helped achieve
98.11% classification accuracy on the MalImg dataset.

Keywords Malware classification · Machine learning · Genetic algorithm · Image


classification · Convolutional neural network

21.1 Introduction

Malware has continuously increased in volume, making it essential for us to work


on newer and robust malware classification methods [1]. Various techniques like
packing, encryption and the use of polymorphic malware have been posing newer
problems to researchers [2]. Analyzing existing machine learning malware classifi-
cation models and modifying the approaches can yield different results [3]. In order
to develop a comprehensive understanding of the problem, a review of different
models on a popular malware dataset—MalImg—was carried out. Techniques such
as convolutional neural networks (CNNs), K-nearest neighbor (KNN) and others
were analyzed to understand the approach towards malware classification [3, 4].
Finally, genetic algorithm was used with CNN to introduce a novel approach to
malware family classification. The adaptive nature of genetic programming helped
resolve several drawbacks of traditional classification techniques and thus form more functional relationships between the data and its categories through hyperparameter optimization, consequently increasing the efficiency of the CNN.

21.2 Literature Review

21.2.1 Traditional Malware Analysis

Malware analysis has traditionally been carried out to detect malware files using
conventional techniques such as static and dynamic analysis [5]. However, techniques
like dynamic analysis require running malware files in a virtual environment, and
the limited extent to which these techniques can be effectively scaled makes them
less efficient than several modern machine learning-based alternatives [6]. Various
malware families tend to share similar patterns when it comes to how they interact
and behave in systems, and hence, machine learning-based techniques have become
an insightful alternative to malware classification [7].

21.2.2 Malware Image Classification Using Machine


Learning

A lot of success has been achieved using neural networks and models based on tech-
niques such as CNN and KNN [3]. However, a lot of these CNN-based approaches
have had issues with large malware samples and have required alterations to tackle
problems in detection and classification; hence, there is a need for an optimal CNN
architecture to assist in malware classification [8]. Modern methods in malware
classification research have also included the use and implementation of feature
extractors such as recurrent neural networks such that the extracted features have
even been used to train the classifier to detect the malicious malware [4]. In some
techniques, a neural network has also been used to reduce the error rate by a signifi-
cant margin to obtain the desired results [9]. Some other works in the domain have
used graph-based approaches built upon clustering techniques [10], whereas some
researchers have also tried to use metadata to analyze import and export tables to
assist in malware analysis [11].
Researchers have also used combinations of GIST with techniques such as KNN
and SVM and these techniques have also not given very high accuracy in the clas-
sification of malware [12]. At the same time, a combination of SVM with multi-
layer perceptron or gated recurrent unit has also shown lesser accuracy than genetic
algorithm-based models [13].

MalImg dataset has often been used to build an improved CNN model for malware
family classification and improve the effectiveness of the approach [14]. The gener-
ational flow of genetic algorithm has allowed for optimal solutions from different
generations [15]. Furthermore, the ability of genetic algorithm to select the best
feature subset from the original feature vector of malware images allows for improved
training and much higher accuracy in classification [16].

21.3 Methodology and Experimentation

21.3.1 Dataset Description

MalImg dataset was introduced by Nataraj et al. [17], and it is a dataset made up of
9339 malware samples in image form belonging to 25 different malware families.
Grayscale images have been used in our implementation, and the number of malware
image samples belonging to each family varies in MalImg. The dataset is divided
into two parts—training and testing samples. In the implementation, 70% of the
images, i.e., 6538 malware samples are used for training and the other 30%, i.e.,
2801 malware samples are used for testing.

21.3.2 Visual Representation of Malware File

Malware samples are present as images in the MalImg dataset, where every byte is interpreted as a pixel. Using a malicious executable file in image form makes it easier to differentiate the different sections of the binary source code. Moreover, given the tendency of malware creators to make only small variations in new malware, visual references to existing malware help in detecting these small changes. The MalImg dataset was built by reading each binary malware file as a vector of 8-bit unsigned integers, converting each component of the vector from binary to its decimal value and saving the decimal values in a new vector [18]. We used the RGB values of the image files in the dataset to convert them into matrices with dimensions matching the image size, which were then down-sampled before being used as the input to our model.
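As an illustration of this step, the following is a minimal sketch (our reconstruction under stated assumptions, not the authors' exact code) of reading a binary file as 8-bit unsigned integers, reshaping it into a roughly square grayscale matrix and down-sampling it to a fixed input size; the 128 × 128 target size and file name are assumptions.

import numpy as np
from PIL import Image

def binary_to_image(path: str, size: int = 128) -> np.ndarray:
    data = np.fromfile(path, dtype=np.uint8)          # bytes as 8-bit unsigned integers
    width = int(np.ceil(np.sqrt(data.size)))          # choose a roughly square shape
    padded = np.zeros(width * width, dtype=np.uint8)  # pad the tail with zeros
    padded[:data.size] = data
    img = Image.fromarray(padded.reshape(width, width), mode="L")
    img = img.resize((size, size))                    # down-sample to a fixed input size
    return np.asarray(img, dtype=np.float32) / 255.0  # normalized model input

sample = binary_to_image("malware_sample.bin")        # hypothetical file name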

21.3.3 Architecture of CNN

Due to the visual representation of the malware object code, the CNN model inherently carries out an image classification task. The various malware visual representations have been classified into various categories by discrete extraction of

Fig. 21.1 Architecture of convolution neural network used for image classification

patterns within them, since the binary image files that were obtained from a particular
malware pattern family have a higher propensity to produce images of greater simi-
larity. Under this assumption, we observe that the feature extraction helps to identify patterns based on the distribution of pixels in the image.
The base CNN architecture contains three convolution layers, each convolution
layer is then backed by max pooling layers. The input signals for the image are
reduced by the max pooling layer, and this is followed by a combination of flatten
layer and fully connected layer. Also, a two-dimensional batch-normalization layer
and a rectified linear unit (ReLU) function follow each convolution layer. The best-fit
CNN architecture as determined by the application of genetic algorithm is described
in Sect. 21.3.4. The CNN architecture is as seen in Fig. 21.1.
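A hedged PyTorch sketch of this base architecture is shown below: three convolution blocks, each followed by batch normalization, ReLU and max pooling, then a flatten layer and a fully connected layer. The filter counts and kernel sizes used here are placeholders only; the genetic algorithm in Sect. 21.3.4 searches for the actual values.

import torch
import torch.nn as nn

def build_cnn(filters=(32, 64, 128), kernels=(3, 3, 3), n_classes=25):
    layers, in_ch = [], 1                     # grayscale input, so one channel
    for f, k in zip(filters, kernels):
        layers += [
            nn.Conv2d(in_ch, f, kernel_size=k, padding=k // 2),
            nn.BatchNorm2d(f),                # two-dimensional batch normalization
            nn.ReLU(),
            nn.MaxPool2d(2),                  # halves the spatial resolution
        ]
        in_ch = f
    layers += [nn.Flatten(), nn.LazyLinear(n_classes)]  # fully connected output layer
    return nn.Sequential(*layers)

model = build_cnn()
logits = model(torch.randn(1, 1, 128, 128))   # scores over the 25 malware families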

21.3.4 Implementation of Genetic Algorithm

Genetic algorithm was executed with a population size of 10 and for each set of
population, the fitness score of the CNN model was calculated. This algorithm was
executed for 10 generations, and then the individual with the best fitness score was
chosen as the best-fit CNN architecture. This optimized CNN architecture was then
used for the classification of the malware images [19]. The overview of the algorithm
is as follows:

Algorithm 1: Overview of Genetic Algorithm Used


Input: Dataset of malware files represented as images, randomly generated
initial population, number of generations N and size of population S
Output: The best-fit architecture of CNN including the selected hyperparam-
eters
1. Randomly generate hyperparameters for the initial population;
2. i←0;
3. while i < N do
4. For all individuals in Pi , calculate their fitness scores;
5. Ai ← The top individuals of the population having highest fitness
values are selected as a part of the next population;
6. Bi ← Using the mutation and crossover operators, generate offspring
from the selected parent pool Ai ;
7. Pi+1 ← Generate new population using Ai ∪ Bi ;
8. i ← i + 1;
9. end
10. return optimized set of hyperparameters for the CNN architecture.

The genetic algorithm to obtain the optimized CNN hyperparameters was


implemented using the specification mentioned in Table 21.1. The step-by-step
implementation details follow the table.

1. Population Initialization

Initial population consists of randomly generated chromosomes which define the


hyperparameters for the CNN model. In order to generate the initial population
chromosomes, we randomly generate values for the number of filters and the size of
kernels for each layer. We took the maximum size of the kernels (Smax) as 20 and set the maximum number of filters (Fmax) to 100.

Table 21.1 Description of parameters used in the implementation of genetic algorithm
GA specification | Value
Population size per generation (Np) | 10
No. of generations (N) | 10
No. of epochs per generation (Ne) | 20
Max possible no. of filters (Fmax) | 100
Max size of kernels (Smax) | 20
Population retained as elite in % (t) | 40
Mutation probability (mp) | 0.2
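A minimal sketch of the chromosome encoding and random population initialization is given below. The exact gene layout is our assumption: each chromosome stores the number of filters and the kernel size for the three convolution layers, bounded by Fmax and Smax from Table 21.1.

import random

N_LAYERS, F_MAX, S_MAX, POP_SIZE = 3, 100, 20, 10

def random_chromosome():
    # genes: [f1, f2, f3, k1, k2, k3]
    filters = [random.randint(1, F_MAX) for _ in range(N_LAYERS)]
    kernels = [random.randint(1, S_MAX) for _ in range(N_LAYERS)]
    return filters + kernels

population = [random_chromosome() for _ in range(POP_SIZE)]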

2. Fitness Evaluation

The algorithm for fitness evaluation is explained below in Algorithm 2.

Algorithm 2: Fitness Evaluation


Input: The individual (ω) from current population Pi , number of epochs (N e ),
training dataset (Dtrain ), testing dataset (Dtest )
Output: Individual (ω) along with its fitness score
1. Extract the hyperparameters from ω;
2. Using the malware image dataset, construct a CNN (cnn) with the
extracted hyperparameters;
3. sbest ← 0;
4. e ← 0;
5. while e < N e do
6. cnn.train(Dtrain );
7. acc ← cnn.test(Dtest ).accuracy;
8. s ← (acc ∗ 100);
9. if s > sbest then
10. sbest ←s;
11. end
12. end
13. sbest ← fitness score of ω;
14. return ω with its fitness.

In the current population, we defined fitness for each individual as the accuracy
of the CNN architecture for hyperparameters extracted from that individual, which
is calculated by training the CNN model on the training dataset. For population Pi ,
we select an individual, and parameters like the number of filters and the size of each filter are
extracted to generate a CNN model. We split the existing training dataset into Dtrain
(70%) and Dtest (30%). Then, this model is trained (using Dtrain ) and tested (using
Dtest ), which gives us the fitness for the respective individual.

3. Selection

The process of selection is performed at the beginning of every generation. For each
generation, certain individuals are selected from the present population as elites.
Using the current population Pi , we sort the individuals on the basis of their calculated
fitness scores in descending order. Now, we choose the top t% individuals from Pi ,
these selected elite individuals contribute directly to further generations. Moreover,
using this set of elite individuals Ai as the parent pool, we generate offsprings Bi by
using crossover and mutation techniques to complete the remaining population for
the following generation.

4. Crossover
Once the parent pool for the next generation is generated using the method specified in
the above step, we applied a uniform crossover method to generate new offsprings.
From parent population Ai , we randomly select two individuals for crossover. In
uniform crossover, to create new children, genes are selected randomly from either
of the selected parents and then these children are added to offspring population Bi .
Hence, the offspring chromosome is more likely to be diverse and different from
its parent, adding a new randomness to the population [20]. We defined Bsize , i.e.,
the size of offspring population, as the difference between size of current population
|Pi | and size of parent population |Ai |, where |x| represents size of collection x. This
whole process is repeated until the size of the newly created offspring population Bi equals Bsize.
5. Mutation
By introducing new values in the individuals, the mutation function adds some
uniqueness and allows the genetic algorithm to experiment with different parameters
rather than sticking with the same population. We have a probability of mutation
defined as mp . For each child chromosome in Bi , we generate a number randomly
between 0 and 1. If this number lies within the probability range, then the mutation
operation is performed on the individual. We then select a random gene from the
offspring and replace it with a random value within permissible range. Crossover
and mutation operations are performed as shown in Algorithm 3.
6. Population Generation
After the above processes, the acquired parent population (Ai ) and child population
(Bi ) are combined together to form the new population of the next generation. Now,
the fitness of all individuals is computed again on this newly generated population,
and this process is then repeated until we reach the maximum number of generations
(N) in order to find the best CNN architecture along with its hyperparameters.

Algorithm 3: Crossover and Mutation to Generate New Population


Input: Individuals with their fitness scores in population (Pi ), population
retained as elite in % (t), mutation probability (mp ), size of population (N p )
Output: Population for the next generation (Pi+1 )
1. On the basis of fitness scores, sort Pi in descending order;
2. Choose the top t% individuals from Pi as the parent pool (Ai );
3. Bi ← null;
4. Bsize ← N p − |Ai |;
5. while |Bi | < Bsize do
6. p1 ← Select a parent randomly from Ai ;
7. p2 ← Select a parent randomly from Ai ;
8. if p1 ≠ p2 then
9. Apply uniform crossover operation on p1 and p2 to create c1 and c2 ;
10. Bi ← Bi ∪ c1 ∪ c2 ;
11. end
12. end
13. foreach offspring c in Bi do
14. n ← For the range (0, 1), generate a number randomly;
15. if n < mp then
16. Select a random gene from offspring c and replace this gene with a
randomly generated value;
17. end
18. end
19. Pi+1 ← Ai ∪ Bi ;
20. return Pi+1 .
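The following Python sketch mirrors Algorithm 3 for the list-of-genes chromosome sketched earlier; it is an illustration under our assumptions (gene bounds, a fitness function passed in as a callable), not the authors' implementation.

import random

T_ELITE, M_PROB = 0.4, 0.2       # elite fraction and mutation probability (Table 21.1)

def uniform_crossover(p1, p2):
    # every gene of each child is picked at random from one of the two parents
    c1 = [random.choice(pair) for pair in zip(p1, p2)]
    c2 = [random.choice(pair) for pair in zip(p1, p2)]
    return c1, c2

def mutate(child, f_max=100, s_max=20, n_layers=3):
    if random.random() < M_PROB:
        i = random.randrange(len(child))            # pick one gene at random
        bound = f_max if i < n_layers else s_max    # filter genes first, kernel genes last
        child[i] = random.randint(1, bound)
    return child

def next_population(population, fitness):
    ranked = sorted(population, key=fitness, reverse=True)
    elites = ranked[: int(T_ELITE * len(population))]        # selection of the parent pool
    b_size = len(population) - len(elites)
    offspring = []
    while len(offspring) < b_size:
        p1, p2 = random.sample(elites, 2)                    # two distinct parents
        offspring += [mutate(c) for c in uniform_crossover(p1, p2)]
    return elites + offspring[:b_size]                       # new population A_i plus B_i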

21.4 Results and Comparison

The success of our implementation of genetic algorithm-CNN on the dataset was


evaluated by plotting a confusion matrix for the 25 malware families in MalImg
dataset, as shown in Fig. 21.2. The final optimized hyperparameters generated by the
genetic algorithm for the three convolutional layers were [66, 40, 89] for the number
of filters and [3, 1, 2] for the size of kernels, respectively. The implementation returned
highly promising results. The model was able to classify the input malware image
with an accuracy of 98.11%. Moreover, more than two-thirds of the malware families

Fig. 21.2 Confusion matrix for GA-CNN implementation on MalImg dataset



Table 21.2 Comparison of results obtained by different malware family classification techniques
Approach | Year | Author | Accuracy (%) | Average precision (%) | Average recall (%)
GA-CNN | – | – | 98.11 | 98.05 | 98.1
GIST-SVM | 2018 | Cui et al. [12] | 92.2 | 92.5 | 91.4
GIST-KNN | 2018 | Cui et al. [12] | 91.90 | 92.10 | 91.70
ResNet (DL) | 2018 | Bhodia et al. [21] | 94.80 | – | –
GLCM-SVM | 2018 | Cui et al. [12] | 93.40 | 93.0 | 93.20
MLP-SVM | 2020 | Bensaoud et al. [13] | 94.55 | – | –
GRU-SVM | 2020 | Bensaoud et al. [13] | 94.17 | – | –

returned perfect F1 scores and precision and recall values. The results obtained after
the implementation of genetic algorithm-CNN were better than majority of existing
conventional implementation approaches to malware image classification, as shown
in Table 21.2.
The accuracy obtained by the GA-CNN model implemented by us is higher than that of the majority of other approaches because of the constraints and limitations of the fixed, manually tuned CNN architectures used in non-GA-based algorithms.

21.5 Conclusion and Future Work

Our research work involved using visual representations of malware file data in the form of grayscale images and then using the image classification technique of GA-
CNN for automating the optimization of CNN architecture. The model shows higher
accuracy, precision and recall than several other models applied on MalImg dataset
for malware classification. Moreover, most successful CNNs require manual tuning
to design the most optimal architecture which can be used by the researcher for
image classification. However, by using a GA-CNN model, the most optimal CNN
is automatically generated, thereby removing the hindrance caused by a lack of domain
knowledge. In addition, resource utilization is better when compared to other tech-
niques. The endeavor has been benchmarked by comparing the results of genetic
algorithm on MalImg with other experiments.
Future research work in this domain can be towards improving the CNN opti-
mization techniques while applying the algorithm for malware family classification.
At the same time, application of other evolutionary algorithms in malware classifica-
tion could also provide promising results. Besides this, since evolutionary algorithms
are a modern machine learning concept inspired by biological evolution which have
shown great results with high level of accuracy, genetic programming can also be
built upon other conventional techniques. Using GA-CNN model in a domain such
as malware classification produces highly effective results.

References

1. Or-Meir, O., Nissim, N., Elovici, Y., Rokach, L.: Dynamic malware analysis in the modern
era—a state of the art survey. ACM Comput. Surv. (CSUR) 52, 1–48 (2019)
2. Schultz, M.G., Eskin, E., Zadok, F., Stolfo, S.J.: Data mining methods for detection of new
malicious executables. In: Proceedings 2001 IEEE Symposium on Security and Privacy. S&P
2001 (2000)
3. Gibert, D.: Convolutional Neural Networks for Malware Classification. University Rovira i
Virgili, Tarragona, Spain (2016)
4. Pascanu, R., Stokes, J.W., Sanossian, H., Marinescu, M., Thomas, A.: Malware classification
with recurrent networks. In: 2015 IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP) (2015)
5. Idika, N., Mathur, A.P.: A survey of malware detection techniques, vol. 48. Purdue University
(2007)
6. Rieck, K., Holz, T., Willems, C., Düssel, P., Laskov, P.: Learning and classification of
malware behavior. In: International Conference on Detection of Intrusions and Malware, and
Vulnerability Assessment (2008)
7. Siddiqui, M., Wang, M.C., Lee, J.: A survey of data mining techniques for malware detection
using file features. In: Proceedings of the 46th Annual Southeast Regional Conference (2008)
8. Ronen, R., Radu, M., Feuerstein, C., Yom-Tov, E., Ahmadi, M.: Microsoft malware classifica-
tion challenge (2018). arXiv preprint arXiv:1802.10135
9. Dahl, G.E., Stokes, J.W., Deng, L., Yu, D.: Large-scale malware classification using random
projections and neural networks. In: 2013 IEEE International Conference on Acoustics, Speech
and Signal Processing (2013)
10. Kinable, J., Kostakis, O.: Malware classification based on call graph clustering. J. Comput.
Virol. 7, 233–245 (2011)
11. Shafiq, M.Z., Tabish, S.M., Mirza, F., Farooq, M.: Pe-miner: mining structural information to
detect malicious executables in realtime. In: International Workshop on Recent Advances in
Intrusion Detection (2009)
12. Cui, Z., Xue, F., Cai, X., Cao, Y., Wang, G.-G., Chen, J.: Detection of malicious code variants
based on deep learning. IEEE Trans. Industr. Inf. 14, 3187–3196 (2018)
13. Bensaoud, A., Abudawaood, N., Kalita, J.: Classifying malware images with convolutional
neural network models. Int. J. Netw. Secur. 22, 1022–1031 (2020)
14. Kalash, M., Rochan, M., Mohammed, N., Bruce, N.D.B., Wang, Y., Iqbal, F.: Malware classi-
fication with deep convolutional neural networks. In: 2018 9th IFIP International Conference
on New Technologies, Mobility and Security (NTMS) (2018)
15. Fatima, A., Maurya, R., Dutta, M.K., Burget, R., Masek, J.: Android malware detection using
genetic algorithm based optimized feature selection and machine learning. In: 2019 42nd
International Conference on Telecommunications and Signal Processing (TSP), pp. 220–223.
IEEE (2019)
16. Martín, A., Fuentes-Hurtado, F., Naranjo, V., Camacho, D.: Evolving deep neural networks
architectures for android malware classification. In: 2017 IEEE Congress on Evolutionary
Computation (CEC) (2017)
17. Nataraj, L., Karthikeyan, S., Jacob, G., Manjunath, B.S.: Malware images: visualization and
automatic classification. In: Proceedings of the 8th International Symposium on Visualization
for Cyber Security (2011)
18. Nataraj, L., Yegneswaran, V., Porras, P., Zhang, J.: A comparative assessment of malware
classification using binary texture analysis and dynamic analysis. In: Proceedings of the 4th
ACM Workshop on Security and Artificial Intelligence (2011)

19. Bakhshi, A., Noman, N., Chen, Z., Zamani, M., Chalup, S.: Fast automatic optimisation of
CNN architectures for image classification using genetic algorithm. In: 2019 IEEE Congress
on Evolutionary Computation (CEC) (2019)
20. Anderson-Cook, C.M.: Practical genetic algorithms. J. Am. Stat. Assoc. 100(471), 1099–1099
(2005)
21. Bhodia, N., Prajapati, P., Di Troia, F., Stamp, M.: Transfer learning for image-based malware
classification (2019). arXiv preprint arXiv:1903.11551
Chapter 22
Master Data Management Maturity
Evaluation: A Case Study in Educational
Institute

Dupinder Kaur and Dilbag Singh

Abstract To deal with an organization’s essential data as a single coherent system,


Master Data Management is essential. It links all the critical data as a unified version
of truth known as “Master data”. It is responsible for data sharing, integration,
analytics and decision making. The quality of business intelligence, analytics, and
AI depends upon Master data management. A maturity model can be used to test the
effectiveness of Master Data Management program in an organization. In the present
research, a case organization has been considered to assess master data’s maturity
level using Spruit and Pietzka's maturity model. The considered model consists of 13 focus areas and five key topics. Each focus area consists of packed capabilities used to determine the maturity of master data. The findings showed that among 62 applicable capabilities, 44 (70.96%) are applied and 18 (29.03%) are absent. Thus, on the basis
of applied capabilities, overall maturity level is taken as 1. Hence, an organization
can accomplish higher development by implementing missing capabilities.

Keywords Master data management (MDM) · Maturity levels · Master data


management maturity model (MD3M)

22.1 Introduction

Information has become a crucial asset in today's digital economy. It acts as a basis of intelligence for major commercial business decisions. To guarantee that personnel have the right data for decision making, businesses have to invest in data management solutions that further improve reliability, security and scalability.
Data management is an approach for collecting, coordinating, securing and storing information to generate efficient results and predictions [1]. A Master Data Management (MDM) program is a data management practice which provides a robust and scalable platform for real-time data correction. MDM ensures
uniformity, semantic consistency, accuracy and accountability of the organization's master data assets by sharing a "single view of customer data" [2].
Master data is the data about essential business entities or objects, such as customers, products and suppliers, around which the business is conducted. Master data is the core data of an organization that exists independently and builds a foundation for the smooth implementation of business processes and well-informed business decisions [3]. Business leaders have realized that high-quality data and proper management drive business success; thus, to meet these expectations, the need to implement MDM arises [4]. MDM detects and declares data relationships, resolves duplicate records and provides access to data citizens [5]. The goal of an MDM program is to define, create, integrate and maintain the master data. It reduces the expense and complexity brought about by high data duplication and the difficulties involved in maintaining these data elements [6]. In addition to creating a "unified version of reality", MDM additionally offers automated tracking and reconciliation of data from diverse sources [7].
A master data maturity model is a measurement standard for determining an organization's ability to consistently implement MDM. Such a model helps in understanding the capabilities required for successful implementation of MDM by giving insight into the levels of maturity [8]. Various models have been designed to assess master data maturity. In the present research, Spruit and Pietzka's MD3M has been used to assess the level of MDM implementation in an educational institute. The reason behind this study is that the five factors needed for students' Master Data Management match the capabilities of the MD3M model well.
By knowing the maturity of MDM, an organization can raise awareness of good-quality data and focus on essential capabilities and key focus areas. The sections below describe the key topics and focus areas used for the assessment of MDM in this organization.

22.1.1 Data Model

The intention behind this key topic is to relate the representation of data, the infrastructure and the organization of data within it. It answers questions such as: which data are to be considered master data, which systems use master data, and how the data are structured and where they are stored. The key focus areas considered under this topic are: Definition of Master Data, Master Data Model and Data Landscape [8].

22.1.2 Data Quality

This key topic addresses data quality in all regards, including methods of access, ways to enhance data quality and the impact of data quality issues. Business decisions are highly dependent on the quality of data, and Master Data Management helps in improving data quality. The key focus areas considered here are: Assessment of Data Quality, Impact on Business, Awareness of Quality Gaps and Improvement [9].

22.1.3 Usage and Ownership

This key topic defines the role of the user in accessing the information. Read or write access is assigned, and access to data is granted or denied. Further, for data protection and privacy reasons, access rights allow only those users who have permission to access the data. Data availability is ensured at all times. The key focus areas are: Data Usage, Data Ownership and Data Access [8].

22.1.4 Data Protection

This key topic concentrates on privacy and security against probable incidents such as viruses, hackers, data leaks, system failure or service disruption. Such incidents bring huge losses and hassles. Security for master data is provided under this key topic to protect it from the above incidents [8]. The focus area is: Data Protection.

22.1.5 Maintenance

This key topic focuses on the physical storage of data and how data is treated during its lifecycle. The objective is to store the data in an efficient and persistent manner. Any change in master data during its life cycle is also covered. The main focus areas of this topic are: Storage and Data Lifecycle [9].
The rest of the paper is organized as follows. In Sect. 22.2, existing work on master data management maturity is reviewed. Section 22.3 elaborates the research methodology, with the questionnaire and data collection method used for the present research work. Section 22.4 presents the results of the case study. Finally, Sect. 22.5 concludes the paper by providing an overall maturity level for the case organization.

22.2 Literature Review

Zulwelly Murti et al. proposed an architecture for managing personnel information at the XYZ institute. The study was carried out in accordance with the institute's need to plan master data management for managing personnel data. One master entity, named master data personnel, and three strong entities (employees, attendance and performance) were created. A DFD was designed for further guidance of master
data implementation. This architecture has proven as a success factor in implementing
MDM at XYZ institute [9].
Riikka Vilminko-Heikkinen examined the challenges in establishing and developing the MDM function within an organization. In a case study with ethnographic observations, 15 issues were found and compared with existing ones. Many new elements, such as the identity of the entities to be involved in the MDM initiative, the level of granularity required and a mutual understanding of the master data domain, are considered business-driven challenges [10].
Daniel Vasquez Zuniga et al. designed a model for measuring master data maturity in the microfinance sector, with the goal of enhancing processes until entities attain the preferred maturity levels. For the implementation of the designed model, data of a Peruvian microfinance organization was utilized. After validation of the proposed model, it was found that the model serves as a means of identifying the MDM maturity level with the aim of improving the implementation process [11].
Dataflux provided a different method for master data maturity measurement. The
model centered on structure, governance, identification, integration, management,
and BPM of master data. Five levels were designed for maturity assessment. The
model developed by Spruit and Pietzka suggested one missing layer in Dataflux
model [12].
Marco Spruit et al. designed a maturity matrix covering 13 focus areas and 65 capabilities for master data maturity assessment in an organization. To validate the proposed model, a case study on an organization was conducted to check its MDM maturity level. The results showed that the overall maturity level was 0, but the ratio of implemented versus missing capabilities was 60:40. Thus, it provides a benchmark tool with which different corporations can compare and enhance their MDM maturity levels. Spruit and Pietzka designed the approach for assessing the maturity of an MDM program. Five levels, namely initial, repeatable, defined process, managed and measurable, and optimized, were included in the Master Data Management Maturity (MD3M) model [13].
Table 22.1 indicates the MDM maturity levels and their descriptions as adopted by Spruit and Pietzka.

Table 22.1 Description of maturity levels [13]
Level | Description
Initial | The organization is aware of the issues associated with the MDM process
Repeatable | Initial steps have been taken by one or more persons to solve the issues related to the MDM process
Defined | Different departments work together to resolve the MDM issues
Managed | A well-defined and documented system to control MDM is communicated to all departments of the organization
Optimized | Efficient and effective implementation of MDM within the organization

22.3 Research Methodology

In order to assess master data maturity, a study of various existing master data maturity models was performed. Spruit and Pietzka's MD3M has been adopted to assess the maturity of master data in the present research [13]. MD3M provides more focused detail about capabilities compared to other models. A questionnaire is used in the MD3M model to measure master data management maturity.

22.3.1 Questionnaire

Spruit and Pietzka designed two question sets in the questionnaire: influential factors and capabilities. The model was designed so that it can be applied in all kinds of organizations, whether or not every capability is applicable [13]. The influential factors concern the type of employer, interaction with third parties and the range of systems used by the organization. Table 22.2 presents the questionnaire for the influential factors.
In Table 22.2, the first question concerns the company's need to interact with other business units within the same organization. The second question affects the organization's capacity to judge the non-monetary impact of data. The third question affects the organization's ability to assess data quality against the data-quality needs of each business unit. The fourth question affects the organization's capability to manage data resources across the organization's data landscape. To determine the maturity level, the answers to this questionnaire are mapped onto the maturity matrix: the answer "Yes" shows that the organization has applied the stated key point, whereas "No" shows that the key point is not yet implemented.
Table 22.2 Questionnaire on organization's influential factors [13]
Question | Answer
Does the organization belong to a group, and do its employees have to interact frequently with different internal people or companies? | A "Yes" response implies activation of the "Definition of Master Data"-E capability
Is the organization a non-profit, military or government organization? | A "No" response activates the "Impact on Business"-D and "Impact on Business"-E capabilities
Does the organization exceed roughly 250 personnel? | A "Yes" answer enables the "Assessment of Data Quality"-C capability
Does the worker have to work with many different systems for everyday work and follow distinct procedures while doing so? | A "Yes" response implies activation of the "Data Landscape"-E capability
Sixty-five queries are included in the capability questionnaire, but this number can be reduced to sixty, depending upon the answers to the first questionnaire.

22.3.2 Data Collection

Two subject matter experts (SMEs) filled in the questionnaire through a group discussion. Both SMEs are IT specialists in the case organization.

22.4 Case Study

The answers given by the SMEs are used for the practical implementation of the MDM maturity assessment. Section 22.4.1 presents the organization's details, and Sect. 22.4.2 elaborates the MDM assessment results.

22.4.1 Organization’s Profile

The organization is a government educational institute. It offers courses in various science and non-science subjects. There are 24 educational departments, which provide 21 career-oriented and specialized courses to the students. It also offers career-oriented courses through distance education.

22.4.2 Results

The results of the questionnaires are presented here. The questionnaire was answered through a group discussion with the SMEs, as shown in the following tables. Table 22.3 shows the results for the organization's influential factors.

Table 22.3 Influential factor answers in the case organization
Question | Answer
Does the organization belong to a group, and do its employees have to interact frequently with different internal people or companies? | Yes
Is the organization a non-profit, military or government organization? | Yes
Does the organization exceed roughly 250 personnel? | Yes
Does the worker have to work with many different systems for everyday work and follow distinct procedures while doing so? | No

Table 22.4 MD3M measurement matrix result


Key topics L1 L2 L3 L4 L5
Data model
Definition of master data 1 1 1 1 0
Master data model 1 1 1 1 0
Data landscape 1 1 1 1 NA
Data quality
Assessment of data quality 1 1 1 1 1
Impact on business 1 0 0 NA NA
Awareness of quality gaps 1 1 1 1 1
Improvement 1 1 0 1 0
Usage and ownership
Data usage 1 0 0 1 0
Data ownership 1 1 1 1 1
Data access 1 1 0 0 0
Data protection
Data security 1 1 1 0 1
Data maintenance
Storage 1 1 1 1 1
Data life cycle 1 1 0 0 0

The answers to the influential factors are mapped into the resulting matrix in order to assess the maturity level of MDM in the case organization. The taxonomy "1" for applied, "0" for absent and "NA" for not applicable capabilities is used to fill the entries of the matrix. Table 22.4 shows the questionnaire results as a set of capabilities with maturity levels.
Table 22.4 represents the mapping of the answers within the matrix, with maturity levels shown from L1 to L5. Table 22.5 below presents the number of applied and absent capabilities at the different levels.
Table 22.5 shows the numbers of applied and missing capabilities by level. From the above results, it has been observed that out of sixty-two capabilities, 44 (70.96%) have been applied, whereas 18 (29.03%) are absent.
To determine the overall master data maturity, the implemented capabilities of each key topic are taken into consideration. For a focus area, if a higher level has been implemented but a lower level has not, the maturity level is the highest level reached before the first capability that has not yet been implemented. Hence, the overall maturity for the individual key topics is: data model 4, data quality 1, usage and ownership 1, data protection 3 and data maintenance 2. On the basis of this rule, the following graph has been formulated to show the overall maturity level.
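As an illustration of this rule, the short sketch below recomputes the per-topic and overall maturity from the capability matrix in Table 22.4 (1 = applied, 0 = absent, None = not applicable). It is our own reconstruction, assuming that a not-applicable level does not block progression and that a key topic takes the lowest maturity among its focus areas.

matrix = {                                   # values taken from Table 22.4
    "Data model": {
        "Definition of master data": [1, 1, 1, 1, 0],
        "Master data model":         [1, 1, 1, 1, 0],
        "Data landscape":            [1, 1, 1, 1, None],
    },
    "Data quality": {
        "Assessment of data quality": [1, 1, 1, 1, 1],
        "Impact on business":         [1, 0, 0, None, None],
        "Awareness of quality gaps":  [1, 1, 1, 1, 1],
        "Improvement":                [1, 1, 0, 1, 0],
    },
    "Usage and ownership": {
        "Data usage":     [1, 0, 0, 1, 0],
        "Data ownership": [1, 1, 1, 1, 1],
        "Data access":    [1, 1, 0, 0, 0],
    },
    "Data protection": {"Data security": [1, 1, 1, 0, 1]},
    "Data maintenance": {"Storage": [1, 1, 1, 1, 1], "Data life cycle": [1, 1, 0, 0, 0]},
}

def focus_area_maturity(levels):
    maturity = 0
    for value in levels:
        if value == 0:                       # the first missing capability stops progression
            break
        maturity += 1                        # applied (1) and not-applicable (None) both count
    return maturity

topic_maturity = {topic: min(focus_area_maturity(v) for v in areas.values())
                  for topic, areas in matrix.items()}
print(topic_maturity)               # data model 4, quality 1, usage 1, protection 3, maintenance 2
print(min(topic_maturity.values())) # overall maturity level: 1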
Figure 22.1 illustrates the maturity level for each individual key topic.

Table 22.5 Percentage of applied and absent capabilities by level
Level | No. of capabilities | Applied (#) | Applied (%) | Absent (#) | Absent (%)
Initial | 13 | 13 | 100 | 0 | 0
Repeatable | 13 | 11 | 84.61 | 2 | 15.38
Defined | 13 | 8 | 61.53 | 5 | 38.46
Managed | 12 | 8 | 66.67 | 4 | 33.33
Optimized | 10 | 4 | 50 | 6 | 50
Total | 62 | 44 | 70.96 | 18 | 29.04

Fig. 22.1 Overall maturity level for individual key topic

It can be seen that the overall maturity level of MDM in the case organization is 1, taking the lowest maturity level across all key topics.

22.5 Conclusion

A Master Data Management program helps in creating a single version of truth for all business entities, leading to fewer errors and less redundancy in business processes. An ideal MDM solution can support multiple business operations once the required capabilities are in place. To assess the level of maturity of an MDM program in an organization, master data maturity models are used. Thus, in the present study an examination has been performed to determine the maturity level of MDM in an educational institute. The MD3M model is used to measure maturity and to suggest areas of improvement related to master data to the organization.
The lowest implemented level across key topics has been taken as the overall MDM maturity level. Hence, the assessment showed that the MDM maturity level for this organization is 1. Nevertheless, this does not mean that MDM is not implemented: out of 62 applicable capabilities, 44 (70.96%) are implemented whereas 18 (29.03%) are missing. Thus, with comprehensive and sustainable assessment using MD3M, the organization can improve its maturity. To implement MDM fully, the organization should focus on the missing capabilities to achieve better results and higher maturity. This will give the business additional value over others.

References

1. https://www.tableau.com/learn/articles/what-is-data-management, last accessed on 04/09/2021
2. Singh, D., Kaur, D.: A master data management solution for building frameworks: a constructive
way to pilot the implementation. In: 2nd International Conference on Data Analytics and
Management (ICDAM). Springer, Berlin (2021)
3. https://help.sap.com/viewer/99218f2c48044ddc8f2ea30adc0e38a1/7.1.18/enUS/471c5928cd0412b8e10000000a1553f7.html, last accessed on 04/09/2021
4. Singh, D., Kaur, D.: Data profiling model for assessing the quality traits of master data
management. Int. J. Recent Technol. Eng. 8(6), 446–450 (2020)
5. Lepeniotis, P.: Master data management: its importance and reasons for failed implementations,
Ph.D. thesis. Sheffield Hallam University (2020)
6. Aditya Rahman, A., Dharma, G., et al.: Master data management maturity assessment: a
case study of a Pasar Rebo Public Hospital. In: 2019 International Conference on Advanced
Computer Science and Information Systems (ICACSIS), pp. 497–504. IEEE, Bali, Indonesia
(2019)
7. Pratama, F.G., Astana, S., Yudhoatmojo, S.B., Hidayanto, A.N.: Master data management
maturity assessment: a case study of organization in Ministry of Education and Culture. In:
International Conference on Computer, Control, Informatics and its Applications (IC3INA),
pp. 1–6. IEEE, Tangerang, Indonesia (2018)
8. Qodarsih, N., Yudhoatmojo, S.B., Hidayanto, A.N.: Master data management maturity assess-
ment—a case study in the supreme court of the Republic of Indonesia. In: 6th International
Conference on Cyber and IT Service Management (CITSM), pp. 1–7 (2018)
9. Murti, Z., Andarrachmi, A., Hidayanto, A.N., Yudhoatmojo, S.B.: Master data management
planning: (Case study of personnel information system at XYZ Institute). In: International
Conference on Information Management and Technology (ICIMTech), pp. 160–165. IEEE,
Jakarta (2018)
10. Vilminko, R., Pekkola, S.: Master data management and its organizational implementation: an
ethnographical study within the public sector. J. Enterp. Inf. Manag. 30(3), 454–475 (2017)
11. Zuniga, D.V., et al.: Master data management maturity model for the microfinance sector in
Peru. In: Proceedings of 2nd International Conference on Information System and Data Mining,
pp. 49–53, USA (2018)
12. DataFlux Company: MDM components and the maturity model (2010). Retrieved September 3,
2021 from http://www.knowledgeintegrity.com/Assets/MDMComponentsMaturityModel.pdf

13. Spruit, M., Pietzka, K.: MD3M: the master data management maturity model. Comput. Human
Behav. 51, 1068–1076 (2014)
Chapter 23
Malware Family Classification Using
Music Information Retrieval Techniques

Navdeep Sehrawat, Piyush Shandilya, Prajjwal Kumar, and Rahul Gupta

Abstract Malware identification and eradication have become significant tasks in system security because of the large number of dangerous programs released each year, and the quantity of such malicious programs has risen considerably in recent years. Many studies and proposals have been made in the past on malware detection approaches based on PE headers, opcodes, function calls and control flow graphs (CFG). Extraction of such high-level features is inaccurate and time-consuming due to new obfuscation techniques, whereas approaches based on program bytes are much more efficient. The goal of this research work is to perform a byte-level analysis for malware classification utilizing signals processed from .wav/audio files, and also to present a study depicting the importance of different audio features in malware classification.

Keywords Malware classification · Machine learning · Music information


retrieval · Audio signal processing · MFCC · Chromagram · MIDI

23.1 Introduction

Malware is a broad term that refers to any software which is meant to cause harm, steal information from users, or perform other undesirable behavior on users' machines without their consent.
In 2020 alone, a record number of malware attacks disrupted businesses worth billions. Security providers offer solutions for detecting and dealing with malicious software using various malware detection methodologies. Signature-based techniques usually commence by extracting a signature from a propagated malware binary; then, in order to locate a match, this signature is compared against a signature database [1]. Although signature-based procedures are quick and exact, malware modified with obfuscation techniques can swiftly evade anti-virus software and prevent detection.


In our research, we present a static method for recognizing malware that uses raw
bytes and standard signal processing techniques to extract characteristics. The study
of extracting information from music is known as music information retrieval (MIR)
[2]. The goal is to find similar musical frequency patterns in the input audio files and classify them into nine separate families of malware; we used MFCC, chromagram and Mel spectrogram as audio-specific characteristics. Previous research has shown that these properties can be useful for creating efficient ML models that reliably classify music in the form of .wav files.

23.2 Literature Review

This section provides some previous work which have been conducted in the MIR
domain to detect and classify malware.
Farrokhmanesh [1] proposed an MIR-based malware classification approach, which extracts bytes from programs and forms audio from them. In detail, the executable file's bytes are transformed into MIDI notes and audio files are created as a result. Then, machine learning models and algorithms are applied to audio features such as MFCC and chromagram. The static analysis proposed by Farrokhmanesh [1] is usually time-consuming and computationally expensive, as the entire approach depends upon the size of the byte sequence that needs to be analyzed.
Mercaldo [3] proposed a methodology for detecting malware in mobile samples. The entire focus of this study is on analyzing the bytes; compared to methods that use reverse engineering for feature extraction, the suggested method proves to be much faster and more accurate.
Kalash [4] converts the malware samples of the Microsoft malware dataset to gray-scale images and then trains an SVM and a CNN for classification. Narayanan [5] uses the Microsoft Malware Classification Challenge dataset to perform malware classification; support vector machines, artificial neural networks and k-nearest neighbors are used in the training and classification of the malware data.
Zhao [6] introduced a unique deep learning classification system for gray-scale image visualization with the goal of detecting malware samples. The framework is built on a CNN with ten hidden layers and allows for weight updates from the cloud to the device. Binary files are turned into gray-scale images without decryption or disassembly using three processes: code mapping, texture partitioning and texture extraction.

23.3 Methodology and Experimentation

In this research, classification of malware executable .bytes files using approaches from the field of music categorization is proposed, based on the resemblance between instrumental music and executable .bytes files. The following sections cover the explanation of each phase of our strategy as well as a discussion of various problems.

23.3.1 Conversion of the Executable Bytes File into MIDI


Files

Our process starts by converting the bytes of the executable bytes files into musical notes in the form of MIDI files and then creating .wav audio files. A problem arises during this conversion: a MIDI note ranges from 0 to (2^7 − 1), whereas every byte in the bytes file can take 2^8 different values. The usable number of notes is even less than 2^7, because the largest number of pitches an instrument has is 88, in the case of the piano [1]. This problem of mapping 2^8 values to 2^7 notes has been thoroughly studied previously, and we reviewed these studies to find a suitable method for solving it.
We studied six techniques which were described in the previous researches [1,
7] and chose the one that best suited our needs. In the chosen strategy, a unique
musical note is assigned to each bit if it is set, but no musical note is assigned if the
bit is unset because of which only 8 separate notes are required, and there will be no
overlapping of notes in various octaves. This method will logically show the highest
level of precision. These eight notes could be any distinct notes; we have taken notes
from 50 to 57.
Audio files generated from executable binaries form the basis of our training set. Each byte produces a frame of music containing different notes played at the same time on separate channels, and there can be multiple such channels. Between the note combinations of two successive bytes, we added an empty note of 0.25 s. This reduces the overlapping of sounds caused by the echo from successive bytes, which in turn increases the accuracy of our method.

Algorithm 1. Music Creation from Malware Files


INPUT: bytes file from the dataset of malware samples.
OUTPUT: Midi files containing musical notes mapped to each byte of a
sample.
1. procedure MIDI mapper (bytefile)    ▷ mapper function
2. time ← 0
3. for line ∈ bytefile do
4. for byte ← line do
5. byte ← Hexadecimal conversion of byte
6. bStr ← Binary string conversion of byte
7. channel ← 0
8. duration ← 1
9. pitch ← 50
10. for b ∈ bStr do
11. if b = 1 then    ▷ a note is added only when the bit is set
12. volume ← 100
13. Add a note with channel, time, pitch, volume
14. channel ← channel + 1
15. pitch ← pitch + 1
16. time ← time + 1.25    ▷ includes a 0.25 s gap to reduce echo
17. write into .mid file.
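A hedged Python sketch of Algorithm 1 using the MIDIUtil library is given below. The note range 50–57, the 1-beat note duration and the 0.25-beat gap follow the description above (with a tempo of 60 BPM, one beat corresponds to one second). The assumption that each line of a .bytes file starts with an address token and may contain "??" placeholders reflects the Microsoft .bytes format and is not part of the pseudocode.

from midiutil import MIDIFile

NOTE_GAP, BASE_PITCH = 0.25, 50

def bytes_to_midi(bytes_path: str, midi_path: str) -> None:
    midi = MIDIFile(1)                        # a single track
    midi.addTempo(track=0, time=0, tempo=60)  # 60 BPM, so one beat equals one second
    time = 0.0
    with open(bytes_path) as f:
        for line in f:
            for token in line.split()[1:]:    # skip the leading address token
                if token == "??":             # unreadable byte in the .bytes format
                    continue
                bits = format(int(token, 16), "08b")
                for offset, bit in enumerate(bits):
                    if bit == "1":            # a note only for set bits
                        midi.addNote(track=0, channel=offset, pitch=BASE_PITCH + offset,
                                     time=time, duration=1, volume=100)
                time += 1 + NOTE_GAP          # 0.25-beat silence to reduce echo
    with open(midi_path, "wb") as out:
        midi.writeFile(out)

bytes_to_midi("sample.bytes", "sample.mid")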

23.3.2 Conversion of MIDI Signals to .Wav Files

The creation of an actual audio file from a .bytes file begins after the MIDI file has
been prepared. An audio synthesizer is required for this technique. The phase in our
process that takes the greatest time, CPU and memory is music synthesizing. Signal
production and feature extraction become more expensive and time-consuming as
the number of MIDI notes increases. Considering the size of the dataset, we wanted
to keep the process fast, so, we chose to convert a portion of the bytes file into an
audio file; in our case, this portion will be the first 1024 bytes of the bytes file. We
have used pyFluidSynth [8] library for generating wav files in our research.
It is worth noting that generating audio at a lower quality is a significant factor in reducing CPU utilization and memory consumption at this stage. Lowering the sample rate reduces the quality of the audio file but also reduces the CPU utilization and memory requirements of this process. Due to the simplicity of the created music, the MFCC features are not very sensitive to the sampling rate or the bit rate, so we sample at a relatively modest rate and output audio in a single channel. As a result, the smallest sample rate that has no impact on the ultimate accuracy is chosen; based on our tests, this value has been set to 11,025 Hz.
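The paper uses the pyFluidSynth library [8] for this step; as an illustrative alternative under the same idea, the sketch below drives the FluidSynth command-line renderer from Python to render each MIDI file offline into an 11,025 Hz WAV file. The SoundFont path and file names are placeholder assumptions.

import subprocess

def midi_to_wav(midi_path: str, wav_path: str,
                soundfont: str = "piano.sf2", sample_rate: int = 11025) -> None:
    # -ni: no shell and no MIDI driver, -F: fast-render to an audio file, -r: sample rate
    subprocess.run(
        ["fluidsynth", "-ni", soundfont, midi_path, "-F", wav_path, "-r", str(sample_rate)],
        check=True,
    )

midi_to_wav("sample.mid", "sample.wav")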

23.3.3 Audio Feature Extraction

In this research work, we have studied the following audio features.

1. Mel-Frequency Cepstral Coefficients (MFCC)

MFCCs are among the most essential features extracted from an audio signal. A signal's MFCCs are a small set of features (typically 13–20) that succinctly describe the overall shape of its spectral envelope [9].

2. Chromagram

This feature is again obtained directly from the waveform and contains the chroma features, which are based on the 12 different pitch classes. It captures the harmonic and melodic characteristics of music. Typically, chroma features are examined when evaluating audio samples whose pitches can be meaningfully classified and whose tuning approximates the equal-tempered scale. We have used three types of chromagrams for the analysis:
• Chroma STFT, which uses the short-time Fourier transform to obtain the audio spectrum.
• Chroma CENS, which uses energy normalization to get chroma energy normalized statistics.
• Chroma CQT, which uses the constant-Q transform to obtain the audio spectrum.

3. Mel spectrogram

The Mel spectrogram is used to present sound information to our models in a form
close to what a person typically perceives. It is created by passing the raw audio
waveform through filter banks. After this operation, each sample has a 128 × 128
shape, signifying 128 filter banks and 128 time steps per clip [10]. The Mel
spectrogram samples the signal at evenly spaced times and at frequencies that are
evenly spaced on the Mel scale.
The Mel frequency scale is defined as:

mel = 2595 × log10(1 + hertz/700)                    (23.1)

MFCCs and chromagrams are calculated frame by frame, with each frame spanning
roughly 9–16 ms, so these two features describe each signal frame individually. Each
frame contains 13–20 MFCC values representing the audio's spectral envelope and 12
chromagram values describing its pitch profile. A 5-min piece of music generates
about 5000 frames, so the raw feature vector becomes very large and unmanageable.
To overcome this, we down-sampled the features using the mean, covariance, and
standard deviation of the data.
For all of these features, we divided each malware sample into smaller parts to
capture localized variations, which improves the accuracy of the result. For MFCC,
we divided the sample into 18 parts; the mean of the 13 MFCC values over the frames
of each part gives a feature vector of size 18 × 13 = 234, which is manageable and
still contains the localized variations. Similarly, for the chromagram we divided the
sample into 18 parts, giving a feature vector of size 18 × 12 = 216. For the Mel
spectrogram, which has 128 values, we divided the sample into three parts, resulting
in a feature vector of size 3 × 128 = 384.
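A hedged sketch of this extraction and down-sampling, written with the librosa library (which the paper does not name), reproduces the 234-, 216-, and 384-dimensional vectors described above; the chroma CQT and CENS variants are computed analogously to the STFT one shown.

import numpy as np
import librosa

def segment_mean(feature_matrix, n_parts):
    """Split a (n_features, n_frames) matrix into n_parts along time and
    average each part, giving a vector of length n_parts * n_features."""
    parts = np.array_split(feature_matrix, n_parts, axis=1)
    return np.concatenate([p.mean(axis=1) for p in parts])

def audio_features(wav_path, sr=11025):
    y, sr = librosa.load(wav_path, sr=sr, mono=True)
    mfcc = segment_mean(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13), 18)            # 18 * 13 = 234
    chroma = segment_mean(librosa.feature.chroma_stft(y=y, sr=sr), 18)              # 18 * 12 = 216
    mel = segment_mean(librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128), 3)   # 3 * 128 = 384
    return np.concatenate([mfcc, chroma, mel])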

23.3.4 Building the Model

After the audio features are extracted from the dataset containing malware files
of different families, we train machine learning classifiers on these features. The
following classifiers are used for the multi-class classification of malware samples.

1. KNN

The k-nearest neighbors algorithm is a classification approach whose output is the
class of the sample. A data point is categorized through a majority vote among its
neighbors and is assigned to the most frequent class among its k closest neighbors.
Usually, k is chosen to be an odd number so that a majority can be formed.

2. SVM
Support vector machines are supervised ML classification models that examine data
for categorization and have associated learning algorithms. An SVM model represents
each sample instance as a data point in a feature space, and new sample points are
classified according to their position in this space, which is divided into different
classes by hyperplanes.

3. Logistic Regression

Logistic regression is a method for modeling the probability of a discrete output given
an input variable. It is obtained from linear regression by applying the sigmoid
function: the vertical axis represents the probability of a particular class, whereas the
horizontal axis represents the input x. The distribution of y given x is assumed to be
Bernoulli.

F(x) = 1/(1 + e^(−(β0 + β1 x)))                    (23.2)

4. Random Forest (RF)

RF is an ensemble learning algorithm popularly used for classification and regression
problems. It generates decision trees from a variety of samples, classifying by majority
vote and regressing by the mean of the trees' outputs. In our classifier, we have set the
number of decision trees in the RF to 100 [11].

5. XGboost

XGBoost is an implementation of gradient-boosted decision trees in which the trees
are constructed sequentially. Weights play a significant role in XGBoost: they are
assigned to all of the independent variables, which are then fed into the decision tree
to predict outcomes [11]. The weights of the variables that a tree predicted incorrectly
are then increased before the next tree is built. The results of these trees are finally
combined to produce a more powerful and precise model.

6. Neural Network

The neural network (NN) is made up of artificial neurons, a collection of cross-linked
nodes. Signals traverse multiple hidden layers as they pass from the input layer (first
layer) to the output layer (last layer). The value inside each dense layer represents the
dimension of its output space; since the first layer has 256 neurons, its output
dimension is also 256. In a malware/legitimate binary classifier the output layer
contains two neurons, whereas the multi-class family identification classifier has a
number of output neurons equal to the number of malware families evaluated in the
results and comparison section.
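As an illustration of the classifier stage, the hedged sketch below trains several of the listed models with scikit-learn and XGBoost on an assumed feature matrix X and label vector y; apart from the 100 trees of the random forest, the hyperparameters are illustrative defaults rather than the paper's settings, and the neural network is omitted for brevity.

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

def evaluate_classifiers(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    models = {
        "KNN": KNeighborsClassifier(n_neighbors=5),        # odd k to form a majority
        "SVM": SVC(),
        "LR": LogisticRegression(max_iter=1000),
        "RF": RandomForestClassifier(n_estimators=100),    # 100 trees, as stated above
        "XGBoost": XGBClassifier(),
    }
    # accuracy of each model on the held-out split
    return {name: model.fit(X_tr, y_tr).score(X_te, y_te) for name, model in models.items()}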

23.4 Results and Comparison

23.4.1 Description of Dataset

The dataset used in this study is made available by Microsoft and contains known
malware files from nine different families. Every malware sample in the dataset has a
twenty-character hash value as its unique ID and a class label that maps it to one of
the nine malware families. A CSV file is also provided that maps the file IDs to their
malware families. The data itself is presented as bytes in the .bytes files [12].

23.4.2 Results

In this examination, the impact of each feature, as well as of various combinations of
features, is studied on different classifiers. The combinations of features used in the
study are listed in Tables 23.1 and 23.2. Table 23.1 shows the performance of each
feature taken individually with the various classifiers. It can be clearly seen that
MFCC gave the best accuracy of 96.91% with the random forest algorithm.

Table 23.1 Accuracy of classifiers with features when taken one at a time
Feature KNN SVM LR RF XGboost NN
Chroma STFT 93.66 84.12 80.27 95.50 94.47 85.47
Chroma CQT 94.36 86.29 81.68 95.99 95.07 91.06
Chroma CENS 94.15 84.88 80.54 96.15 95.12 87.37
MFCC 92.20 95.39 93.50 96.91 95.18 91.97
Melspectrogram 84.12 71.60 64.61 92.79 90.51 73.33

Table 23.2 Accuracy of classifiers when combinations of features are taken
Feature KNN SVM LR Random Forest XGboost NN
Chroma STFT + MFCC 92.20 95.77 93.82 96.31 95.88 92.24
Chroma CQT + MFCC 92.30 95.61 93.93 96.80 96.31 89.64
Chroma CENS + MFCC 92.30 95.67 93.44 96.53 95.93 91.38
Chroma STFT + MFCC + Mel spectrogram 92.30 95.66 93.98 97.12 96.53 91.59
Chroma CQT + MFCC + Mel spectrogram 92.30 95.56 94.04 96.80 96.21 91.59
Chroma CENS + MFCC + Mel spectrogram 92.30 95.56 93.60 96.80 95.45 91.54

In Table 23.2, the performance of the classifiers when the features are taken in
various combinations is shown. It can be seen that every classifier performed better
when features are taken in combination.
Since MFCC describes music through the overall shape of the spectral envelope, the
chromagram represents its harmonic and melodic characteristics, and the Mel
spectrogram offers the sound information itself, the classifiers receive more relevant
parameters describing the music when the features are combined, and hence they
perform better.
The best accuracy of 97.12% is achieved by the random forest classifier with the
combination of MFCC, chroma STFT, and Mel spectrogram. The XGBoost classifier
also performed best on the combination of MFCC, chroma STFT, and Mel
spectrogram, with an accuracy of 96.53%.
Figure 23.1 shows the confusion matrix of the random forest classifier when tested
with the combination of features that provided the highest accuracy of 97.12%. In
Table 23.3, we compare our best result with previous approaches and find that our
approach gives the best result among those discussed. We have added the true positive
rate (TPR), false positive rate (FPR), and F-measure for a better understanding and
comparison of our result [13].

Fig. 23.1 Confusion matrix of the best result (random forest on the combination of MFCC, chroma STFT and Mel spectrogram)

Table 23.3 Comparison of results with previous approaches
Year Author Approach Accuracy TPR FPR F-measure
– Ours Musical feature analysis and random forest 97.12 97.12 0.003 97.12
2016 Narayanan [5] Image conversion and PCA transformation 96.6 – – –
2018 Farrokhmanesh [1] Music classification random forest 95.87 95.90 8.00 95.80
2018 Kalash [4] Gist SVM 88.74 – – –
2019 Zhao [6] Maldeep 92.5 97.2 9.4 –

23.5 Conclusion and Future Work

In this work, we have researched and employed music classification techniques to
classify malware files into different malware families. The proposed study showcases
the importance of the different audio features in the classification of malware samples
with the different classifiers employed. To translate the bytes files into MIDI messages,
we define a mapping between bytes and musical notes and employ an algorithm that
ensures the maximum accuracy of the process. We studied several audio features,
including MFCCs, chromagrams, and the Mel spectrogram, and applied the classifiers
to these features both individually and in different combinations. To make the
approach feasible in terms of time and resource consumption, only the first 1024 bytes
of each sample file and a sampling rate of 11,025 Hz are used. This research makes
use of logistic regression (LR), SVM, kNN, random forest (RF), XGBoost, and neural
networks (NN).
In the future, we plan to explore evolutionary classifiers and to investigate applying
CNNs and genetic algorithms to the Mel spectrogram. We will also test our approach
and these classifiers on other datasets, such as the Android malware dataset and the
Malimg dataset. In addition, we will study other musical features, such as the spectral
centroid, bandwidth, spectral rolloff, and zero-crossing rate, and include them in our
feature vector to see whether they improve the accuracy of the classifications.

References

1. Farrokhmanesh, M., Hamzeh, A.: Music classification as a new approach for malware detection.
J. Comput. Virol. Hacking Tech. 15, 77–96 (2019)
2. Devi, J.S., Srinivas, Y., Krishna, N.M.: A study: analysis of music features for musical instru-
ment recognition and music similarity search. Int. J. Comput. Sci. Inf. (IJCSI) 2, 21–24
(2012)

3. Mercaldo, F., Santone, A.: Audio signal processing for Android malware detection and family
identification. J. Comput. Virol. Hacking Tech. 17, 139–152 (2021)
4. Kalash, M., Rochan, M., Mohammed, N., Bruce, N.D.B., Wang, Y., Iqbal, F.: Malware classi-
fication with deep convolutional neural networks. In: 2018 9th IFIP International Conference
on New Technologies, Mobility and Security (NTMS) (2018)
5. Narayanan, B.N., Djaneye-Boundjou, O., Kebede, T.M.: Performance analysis of machine
learning and pattern recognition algorithms for malware classification. In: 2016 IEEE National
Aerospace and Electronics Conference (NAECON) and Ohio Innovation Summit (OIS) (2016)
6. Zhao, Y., Xu, C., Bo, B., Feng, Y.: Maldeep: a deep learning classification framework against
malware variants based on texture visualization. Secur. Commun. Netw. 2019 (2019)
7. Cataltepe, Z., Yaslan, Y., Sonmez, A.: Music genre classification using MIDI and audio features.
EURASIP J. Adv. Signal Process. 2007, 1–8 (2007)
8. Newmarch, J.: FluidSynth, pp. 351–353 (2017)
9. Deng, J.D., Simmermacher, C., Cranefield, S.: A study on feature analysis for musical
instrument classification. IEEE Trans. Syst. Man Cybern. Part B (Cybernetics) 38, 429–438
(2008)
10. Zhang, B., Leitner, J., Thornton, S.: Audio recognition using mel spectrograms and convolution
neural networks. Final report (2019)
11. Bentéjac, C., Csörgő, A., Martínez-Muñoz, G.: A comparative analysis of gradient boosting
algorithms. Artif. Intell. Rev. 54, 1937–1967 (2021)
12. Ronen, R., Radu, M., Feuerstein, C., Yom-Tov, E., Ahmadi, M.: Microsoft malware classifica-
tion challenge (2018). arXiv preprint arXiv:1802.10135.
13. Powers, D.M.W.: Evaluation: from precision, recall and F-measure to ROC, informedness,
markedness and correlation (2020). arXiv preprint arXiv:2010.16061
Chapter 24
Text-Based Prediction of Heart Disease
Doctor Chatbot Using Machine Learning

Nehabanu H. Harlapur and Vidya Handur

Abstract Health care is important for living a happy and healthy life [1], and chatbots
can help in monitoring one's current health status before visiting a doctor physically.
A Doctor Chatbot is a means of communication between humans and machines: it
allows a user to ask queries about a health issue in the manner they would consult a
doctor, interprets the human input, and responds using Artificial Intelligence [2] and
Natural Language Processing. Therapeutic, educational, and agricultural areas are
important domains to consider, and chatbots can now be utilized anywhere and at any
time. Every day, a large number of people, both young and elderly, suffer as a result
of heart attacks. The main objective of the proposed study is to predict whether a
person has heart-related issues using machine learning techniques, and the accuracy
of the prediction is measured. A comparative analysis of the SVM, Naive Bayes, and
KNN algorithms is carried out, and SVM outperforms the other two algorithms.

Keywords Doctor chatbot · Dialogflow · Heart disease dataset · Support vector machine · Natural language processing

24.1 Introduction

A Doctor Chatbot is a software application that converses with a human through text
in natural language [3]. It helps improve the user experience by interacting with users
[4] in a human way and providing them with health-related information [5]. Most
chatbots are general purpose and are not particular to a specific health field such as
oncology, ophthalmology, gynecology, obesity, diabetic foot, lung disease, or prostate
disease.

N. H. Harlapur (B) · V. Handur


School of Computer Science and Engineering, KLE Technological University, Hubballi 580031,
India
e-mail: neha.h.harlapur@gmail.com
V. Handur
e-mail: vidya_handur@kletech.ac.in

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 231
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8_24

In general, e-healthcare facilities are valuable resources for developing countries, but
they are often difficult to build due to a lack of understanding and development of the
technology, even though a large percentage of Internet users rely on them. A chatbot
allows the user to obtain medical advice more easily during an emergency, and such
systems can also serve as channels for comparison [6]. The main goal is to create an
agent called Doctor Chatbot that uses Google Cloud Dialogflow as the front-end user
interface and the Cleveland Heart Disease dataset with the SVM algorithm to predict
whether or not the user has heart disease; the accuracy of the SVM classifier is then
measured. A Doctor Chatbot could benefit physicians and patients in many ways [7].
The majority of people are unconcerned about their health, and regular health
checkups are difficult to maintain. With the support of previously collected data, the
Doctor Chatbot assists in the early detection of cardiac disease. Diagnosis at a very
late stage can be dangerous, and it also becomes more costly.

24.2 Literature Survey

Numerous works have been carried out related to Chatbots where different techniques
and algorithms are used.
Seema J, Suman S et al. [8] described a study using Artificial Intelligence (AI). They
used algorithms such as decision tree, genetic algorithm, Naïve Bayes, and pattern
matching, and the data is preprocessed using the NLTK package available in Python
[9]. The accuracy of the various algorithms was measured, and the decision tree
outperformed the other algorithms.
Hiba Hussain et al. [10] described a study that mainly aims to provide users with an
instant and accurate disease prognosis based on their symptoms, as well as a complete
analysis of their pathology findings. They used the decision tree and KNN algorithms,
which achieved 82.6% and 85.74% accuracy, respectively. Optical Character
Recognition (OCR) is used to analyze the pathology reports.
Lekha Athota et al. [11] described a system that uses n-gram, TF-IDF, and cosine
similarity algorithms to retrieve responses to user inquiries while ensuring user
protection and security. The study shows that the proportion of correct answers given
by the chatbot was 80%.
Rashmi Dharwadkar et al. [12] described a technology that assists users in asking
medical-related questions via voice in medical institutes. The authors used the Porter
stemmer algorithm and word-order similarity between sentences, which led to 60%
accuracy.
Hameedullah Kazi et al. [13] described the development of the AIML-based
Chatterbean, which converts natural-language queries into AIML categories. They
employed 87 samples, which were then categorized, and the questions handled
correctly accounted for 47% of the total.
Nalini G et al. [14] proposed a personal healthcare assistant that keeps track of the
user's feelings and recommends careful behaviors in response. Using machine
learning techniques, the system employs a personal support chatbot to manage and
respond to user messages. The technique uses logistic regression and achieves an
accuracy of 0.96, higher than previous machine learning algorithms.
Marco Polignano et al. [15] proposed a Health Assistant Bot (HAB) composed of
different modules. Initially, the user creates a profile, and the system identifies the
condition of the user through a System Checker (SC). The team built the assistant as
a Conversational Agent (CA) to engage with people in natural language, and the CA
is made up of two parts: Intent Recognition (IR) and Entity Recognition (ER). On
real-world use scenarios, the system has a success rate of 76%.

24.3 Proposed Methodology

Natural Language Processing (NLP) is needed because computers typically require
people to “speak” to them in a specific programming language, whereas human
language is imprecise and comprises a significant number of composite variables.
NLP can be used to pose a question [16]: the machine reacts by recognizing sections
of the user's text that link to specific features in a data set. The saved data includes
text and reports, such as patient medical records and symptoms related to heart
disease, from which we may predict the disease and suggest providers or doctors for
appointments.
The following are the four steps carried out during implementation.
1. User Login into the System
The user logs in to the system and can then enquire about heart-related health concerns.
2. User Queries
The user can ask questions regarding heart issues using Google Cloud Dialogflow
for conversation.
3. Disease Prediction
The system predicts the disease based on the symptoms; if the symptoms are not
present, it suggests some precautions.
4. Accuracy
The accuracy of the algorithm is measured.
Figure 24.1 shows the high-level design of the proposed system.
The Doctor Chatbot is developed using key components of Artificial Intelligence,
such as Natural Language Processing, together with the heart disease dataset and the
Support Vector Machine (SVM) algorithm, which is used to predict heart disease from
the Cleveland heart disease dataset.

Fig. 24.1 High level design

Support Vector Machine [17]
As shown in Fig. 24.2, SVM considers the extreme data points of each class, i.e., the
support vectors. The purpose of SVM is to find the best classification function on the
training data to distinguish between the two classes [18].
Let D denote the heart disease dataset where

Fig. 24.2 Separating the dataset with a hyperplane

Table 24.1 Dataset attributes
Name of attribute Description
Age Age in years
Gender m = 0, f = 1
Cp Chest pain type (e.g., typical angina)
trestbps Resting blood pressure
Chol Cholesterol
Fbs Fasting blood sugar > 120? (yes = 1, no = 0)
restecg Electrocardiographic results (0, 1, 2)
thalach Maximum heart rate achieved
exang Exercise-induced angina (1 = yes; 0 = no)
oldpeak ST depression induced by exercise
slope The slope of the peak ST segment
Ca No. of vessels (0–3) colored by fluoroscopy
Thal 3 = normal, 6 = fixed defect, 7 = reversible defect
Diagnosis Diagnosis of heart disease

 
D = {(X1, y1), (X2, y2), . . . , (X|D|, y|D|)}                    (24.1)

where Xi are the training tuples with associated class labels yi, and each yi is either
+1 or −1.
Description of Cleveland Dataset
The data is collected from the Cleveland Clinic Foundation heart disease dataset.
Table 24.1 describes the dataset, which contains 303 observations on the following
14 parameters [19, 20].
Accuracy: Accuracy is a performance ratio that measures the proportion of
accurately predicted observations to total observations.

Accuracy = (TP + TN)/(P + N ) (24.2)

where TP = total number of true positive tuples, TN = total number of true negative
tuples, P = number of positive tuples, and N = number of negative tuples.
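As a hedged illustration of Eq. (24.2), the snippet below trains a linear SVM on a locally stored copy of the Cleveland data and derives the accuracy from the confusion-matrix counts; the file name heart.csv and the label column target are placeholders rather than details from the paper.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

df = pd.read_csv("heart.csv")                        # placeholder path to the Cleveland data
X, y = df.drop(columns=["target"]), df["target"]     # "target": 1 = disease present, 0 = absent
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = SVC(kernel="linear").fit(X_tr, y_tr)
tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)           # Eq. (24.2): (TP + TN)/(P + N)
print(f"SVM accuracy: {accuracy:.2f}")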

24.4 Implementation

The major goal of the Doctor Chatbot system is to identify whether the user has heart
disease with the highest possible accuracy, which is the main characteristic of the
project.
Steps for implementing the proposed system are as follows:
Step 1: User login.
Step 2: Run connection.py, which connects all the components and serves the
application at https://localhost:5000 (a minimal sketch of this service is given after
the step list).
Step 3: Copy the URL and run on Google chrome.
Step 4: Doctor Chatbot Dialogflow Console appears where it ask for user login.
Step 5: Enter user input query in the Doctor Chatbot window.
Algorithm 1 denotes the processing of the user input.
Step 6: Process the user's query and start predicting the symptoms based on the heart
disease dataset.
Step 7: The Doctor Chatbot suggests filling in a form based on the user's previously
diagnosed heart disease report.
Step 8: If heart disease is present, the user can book an appointment with the doctor.
Step 9: If the user needs any suggestions or precautionary measures, the Doctor
Chatbot suggests a routine to follow.
Step 10: Based on the heart disease dataset graphs are drawn.
Step 11: Accuracy is measured using Support Vector Machine Algorithm as
represented in Algorithm 2.
Step 12: Exit.
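Step 2 refers to a connection.py service; the sketch below is one hedged way such a Flask webhook for Dialogflow fulfillment could look. The route name and the predict_heart_disease() helper are illustrative placeholders, not the paper's actual code, and the request/response fields assume the Dialogflow ES fulfillment format.

from flask import Flask, request, jsonify

app = Flask(__name__)

def predict_heart_disease(params):
    # Placeholder: the real system would feed the slot values collected by
    # Dialogflow into the trained SVM model and return its prediction.
    return "Based on the details provided, please fill in the report form for a prediction."

@app.route("/webhook", methods=["POST"])
def webhook():
    req = request.get_json(force=True)
    params = req.get("queryResult", {}).get("parameters", {})   # symptom slots filled by Dialogflow
    return jsonify({"fulfillmentText": predict_heart_disease(params)})

if __name__ == "__main__":
    app.run(port=5000)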
Algorithms

Algorithm 1: Algorithm for SVM

procedure SORT FEATURES(input data)
    svm = svmtrain on input data(all features);
    remaining features = all features;
    array sorted features[];
    while remaining features > 0 do
        if remaining features < 100 then
            n = 1
        end if
        svm = svmtrain on input data(remaining features)
    end while
end procedure

Algorithm 2: Algorithm for SVM Testing

procedure SVM TEST
    array accuracy[]
    for i = 1 : length(sorted features) do
        svm = svmtrain on input data(sorted features(1:i));
        class out = svm classify on test data;
        accuracy(i) = fraction of correct class out;
    end for
    best model = sorted features(1 : argmax(accuracy));
    return accuracy;
end procedure
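Algorithms 1 and 2 leave the feature-sorting criterion implicit; one hedged reading, sketched below with scikit-learn on numpy arrays, ranks the features by the magnitude of the linear-SVM weights and then keeps the prefix of that ranking that gives the best held-out accuracy.

import numpy as np
from sklearn.svm import SVC

def sort_and_test(X_tr, y_tr, X_te, y_te):
    """X_tr/X_te are numpy arrays; returns the best feature subset and its accuracy."""
    svm = SVC(kernel="linear").fit(X_tr, y_tr)
    order = np.argsort(-np.abs(svm.coef_).sum(axis=0))     # most influential features first
    accuracies = []
    for i in range(1, len(order) + 1):                     # growing prefixes of the ranking
        cols = order[:i]
        model = SVC(kernel="linear").fit(X_tr[:, cols], y_tr)
        accuracies.append(model.score(X_te[:, cols], y_te))
    best = int(np.argmax(accuracies)) + 1
    return order[:best], accuracies[best - 1]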

24.5 Results

This section shows the graphical user interface designs for the user to communicate
with the System.
Login page
Figure 24.3 shows the login page where the user can enter the details.
Chat with doctor chatbot
Figure 24.4 shows the Doctor Chatbot asking some frequently asked heart-related
questions.
Fill the report based on the dataset
Figure 24.5 represents that the system provides detailed report predicting whether
the user is suffering from heart disease or not.
Book an appointment
Figure 24.6 signifies that the system suggests the user to visit the doctor if they are
suffering from heart disease and helps to book an appointment.

Fig. 24.3 Login page

List of doctors to take an appointment
Figure 24.7 represents the list of doctors suggested by the Doctor Chatbot.
Precautions and measures to be taken
Figure 24.8 shows the precautions and measures to control the heart disease.
Target ratio
Figure 24.9 shows the target value, i.e., whether heart disease is present or not.
Accuracy
The accuracy obtained with SVM is 90%. Figure 24.10 compares the SVM, KNN, and
Naive Bayes algorithms; among the three, the SVM algorithm outperforms the other
two.

24.6 Conclusion

The text-based Doctor Chatbot aims to improve the patient's experience without the
need for a doctor at a preliminary stage. The aim of the proposed system is to obtain
patient reports, analyze them, and determine whether the patient is suffering from
heart disease using the technology described; it also aids in the early detection of
heart disease. The system has been programmed to allow users to schedule
appointments with specialists in cardiac issues. The results show that the SVM
classifier achieved an accuracy of 90%. The SVM model assigns examples to classes
as a binary linear classifier: it separates them with a hyperplane and is therefore
called a discriminative classifier. The Doctor Chatbot also concludes with the
statement: “Please note, this is not a diagnosis. Regularly visit a doctor if you are in
doubt or if the symptoms get worse. If the situation is serious, always call the
emergency services”. This makes clear that the patient should not rely only on the
chatbot, as doing so might result in death in the worst-case scenario.

Fig. 24.4 Chat with doctor chatbot

Fig. 24.5 Report form



Fig. 24.6 Book an appointment

Fig. 24.7 List of doctors



Fig. 24.8 Precautions to be taken

Fig. 24.9 Target value



Fig. 24.10 Accuracy measured

References

1. Mohamed, M.M., Zhuopeng, W.: Artificial Intelligence healthcare chatbot system. Int. J. Adv.
Res. Comp. Comm. Eng. 9(2), February (2020)
2. Dahiya, M.: A tool of conversation: Chatbot. Int. J. Comp. Sci. Eng. 5(5), E-ISSN: 2347-693,
30 May (2017)
3. Naaz, F., Siddiqui, F.: Modified n-gram based model for identifying and filtering near-duplicate
documents detection. Int. J. Adv. Comp. Eng. Networking 5(10). ISSN: 2320-2106, October
(2017)
4. Rahman, A.M.: Abdullah A1 Mamun, Alma Islam, Programming challenges of chatbot:
Current and future prospective. Int. Islamic University Chittagong, IEEE (2017)
5. Hussain, S., Ginige, A.: Extending a conventional chatbot knowledge base to external knowl-
edge source and introducing user based sessions for diabetes education. 2018 32nd Inter-
national Conference on Advanced Information Networking and Applications Workshops.
978-1-5386-5395-1/18/$31.00 ©2018 IEEE
6. Jyothirmayi, N., Soniya, A., Grace, Y., Kishor Kumar Reddy, C., Ramana Murthy, B.V.: Survey
on chatbot Conversational system. J. Appl. Sci. Comp. VI(I), January. ISSN NO: 1076-5131
(2019)
7. Palanica, A., Flaschner, P., Thommandram, A., Li, M., Fossat, Y.: Physicians’ perceptions of
chatbots in health care: cross-sectional web-based survey. J. Medical Internet Res. 21(4), 5
April (2019)
8. Seema, J., Suman, S., Chirag, S.R., Vinay, G., Balakrishna, D.: Chatbot—Smart health
prediction. Int. J. Scient. Res. Sci. Tech. 8(3) ,May–June (2021)
9. Belfin, R.V., Mathew, A.A., Babu, B., Shobana, A.J., Manilal, M.: A graph based chatbot
for cancer patients. In: IEEE, 2019 5th International Conference on Advanced Computing &
Communication Systems (ICACCS). 978-1-5386-9533-3/19/$31.00 ©2019
10. Hussain, H., Aswani, K., Gupta, M., Thampi, G.T.: Implementation of disease prediction
chatbot and report analyzer using the concepts of NLP, machine learning and OCR. Int. Res.
J. Eng. Tech. (IRJET) 7(40, Apr (2020)
11. Athota, L., Kumar, V.: Proposed “Chatbot for healthcare system using artificial intelligence.
Infocom Tech. Optimization (Trends and Future Directions) (ICRITO) Amity University,
Noida, India. June 4–5, 2020).
24 Text-Based Prediction of Heart Disease Doctor Chatbot … 245

12. Dharwadkar, R., Deshpande, N.A.: Proposed “A Medical ChatBot”. Int. J. Comp. Trends Tech.
(IJCTT) 60(1), June (2018)
13. Hameedullah K., Chowdhry, B.S., Memon, Z.: Med chatbot: An UMLS based chatbot for
medical students. Int. J. Comp. Appl. 55(17), October (2016)
14. NaliniPriya, G., Priyadarshani, P., Puja Shree, S., RajaRajeshwari, K.: “BayMax: A smart
healthcare system provide services to millennials using machine learning technique. In: IEEE
6th International Conference on smart structures and systems ICSSS 2019 (2019)
15. Marcopolignano, F.: Health assistant bot: A personal health assistant for the Italian language
digital object identifier. https://doi.org/10.1109/ACCESS.2020.3000815.
16. Raj , P., Murali Krishna, R., Krishna, S.M., Vardhan, K.H.: Kameswara Rao Presented a method
“Emergency Patient Care System Using Chatbot”. Int. J. Tech. Res. Eng. 6 (7), March (2019)
17. Rosruen, N., Samanchuen, T.: Chatbot utilization for medical consultant system. In: IEEE, The
2018 Technology Innovation Management and Engineering Science International Conference
(TIMES-iCON2018). 978-1-5386-7573-1/18/$31.00 ©2018
18. Chen, J., Hengjinda, P.: Early prediction of coronary artery disease (CAD) by machine learning
method-A comparative study. J. Artificial Intelli. 3(1), 17–33 (2021)
19. Ali, L., Rahman, A., Khan, A., Zhou, M., Javeed, A., Khan, J.A.: An automated diagnostic
system for heart disease prediction based on statistical model and optimally configured deep
neural network. IEEE Access. 7, 34938–34945 (2019)
20. Powar, A., Shilvant, S., Pawar, V., Parab, V., Shetgaonkar, P., Aswale, S.: Data mining &
artificial intelligence techniques for prediction of heart disorders: A survey.In: 2019 Interna-
tional Conference on Vision Towards Emerging Trends in Communication and Networking
(ViTECoN), Vellore, India, pp. 1–7 (2019) .https://doi.org/10.1109/ViTECoN.2019.8899547
21. Maroengsit, W., Theeramunkong, T., Pongnumkul, S., Chaovalit, P.: A survey on eval-
uation methods for chatbots. ACM Digital Library; ICIET 2019, March 29–31 (2019)
AizuWakamatsu, Japan; ACM ISBN 978-1-4503-6639-7/19/03…$15.00
22. Dialogflow: https://dialogflow.com/
23. Flask Library: https://flask.palletsprojects.com/en/1.1.x/
24. Ngrok: https://ngrok.com/
Chapter 25
Intelligent Proctoring System

Manasa Sanjeev, Mirza Kaazima Ifrah, Nuthi Sriram, Potti Priya, and C. R. Kavitha

Abstract The purpose of this study is to develop and deploy an Intelligent Online
Proctoring System that uses artificial intelligence to monitor online tests, which is
cardinal in today’s online learning environments. Nowadays, most educational insti-
tutions around the world have adopted online learning as the best way of getting all
students on track during the pandemic. On the other hand, current online educational
systems lack the capacity to prevent students from deceiving the online examina-
tions, making it impossible to maintain the integrity and equality of all test takers as
in conventional teaching exams. According to the literature review, in order to coun-
teract the growing problem of online cheating, online proctoring should become a
standard component of online exams. Despite the fact that such proctoring systems
have been developed and commercialized, there has been no reporting on their actual
widespread use in real-world exams as they are heavy software systems which cannot
be freely used in schooling.

Keywords Artificial intelligence · Dual-camera-based proctoring · Online proctoring · Online education · Online exams

25.1 Introduction

The COVID-19 (coronavirus disease) pandemic was one of the most significant
occurrences in contemporary history. Hundreds of millions of individuals have been
affected, and it has an impact on all aspects of society. It has been forced to implement
bold and innovative reforms across the board as a result of the pandemic. One of the
most seriously affected areas by the pandemic has been education and academia.

M. Sanjeev (B) · M. Kaazima Ifrah · N. Sriram · P. Priya · C. R. Kavitha


Department of Computer Science and Engineering, Amrita School of Engineering Bengaluru,
Amrita Vishwa Vidyapeetham, Bengaluru 560035, India
e-mail: blenu4cse18065@bl.students.amrita.edu
C. R. Kavitha
e-mail: cr_kavitha@blr.amrita.edu

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 247
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8_25

The pandemic has prompted educational institutions and schools to create “Online
Classes,” [1, 2] which have thrown the traditional classroom design into disarray.
Since then, many others have embraced this idea as their own. However, in the
domain of educational evaluation there is still room for improvement. There have
been no simple or accurate ways of conducting student tests in recent years; a few
options exist, but they are either too expensive for educational institutions or too
complicated for students to use [3]. To assist institutes in verifying their students'
performance and guarding against any malpractice, there must be a system that not
only complements the “e-learning approach” but is also simple for all users. As
mentioned above, there is a need for an all-round proctoring system that
can monitor all the actions of a test taker. Most of the existing proctors involve a
single-camera proctoring system [4–6] which can be fooled when using a phone or
a book outside of the range of the camera without being caught by the proctor. To
overcome such forms of malpractice and make the proctoring system more robust,
there is a need to come up with a better system, which cannot be fooled by the test
takers easily, without using hardware products that are difficult to access. Hence, to
solve the above problems and challenges, a more efficient, available, and low-cost
solution is required. So, the main aim of this system is to develop a dual-camera
proctoring system.

25.2 Related Work

Several research publications pertinent to the field of proctoring were examined when
conducting the literature review.
of the same, the paper “A Visual Analytics Approach to Facilitate the Proctoring of
Online Exams—2021” [7] focused in-depth on areas related to video capture, head
pose estimation, and detection of face. By analyzing video recordings of the exam and
data on mouse movement, the major goal of this research paper is to propose a unique
visual analysis technique to proctoring online exams. The method described identifies
and visualizes suspicious head and mouse motions. They created a suspected case
identification engine based on rules, which can spot suspicious cases based on video
and mouse actions, as well as calculate the possibility of cheating. They hoped to
enhance their approach to real-time proctoring and test the technology in real-world
online tests in the future. Thus, it left room for future work and improvisation.
Few other studies [8–10] focused solely on face detection and verification, leaving
out other types of fraud such as voice and object detection. Paper [5] also conducted
head position estimation and audio change detection in addition to face detection.
This system, however, is limited to a single student situation. Other studies [11, 12]
include a variety of proctoring measures to prevent malpractices, but they did not
work in a real-time scenario. The system prerequisites check was also discovered to
be missing in the paper [13] which was deemed necessary to be implemented in this
solution. This is essential to ensure that the student taking up the test has camera
access, the latest browser, and a stable internet connection, needed to take up the

online proctored test. A study [14] focused on the usage of a 360 degree security
camera over the webcam. Similarly, another study [15] made use of two cameras, a
wearcam and a webcam. Though such cameras (360 degree camera, wearcam) cover
a larger view of the examination surroundings, they are not affordable as not every
student taking the test from home has access to them. Thus, to cover a larger view and
to make the entire system affordable and accessible to all, a dual-camera proctoring
system is essential that makes use of two easily available cameras, a webcam and a
mobile camera.

25.3 Design

The design and architecture of the web app are extremely simple to understand.
Figures. 25.1 and 25.2 depict a variety of use cases and user groups for this system.
There are three versions of the same web application, two of them are for the student
taking up an examination to enable dual-camera proctoring on his smartphone and
personal computer(PC)/laptop along with test taking capabilities. The teacher has a
separate dashboard to check the results and create unique exams for the students.
Several types of malpractice are detected in the PC/laptop version of the web
application. They include object detection of books, laptops, phones, and paper.
Furthermore, switching tabs, exiting full screen, no face, multiple faces, and a
different face are all recorded, along with head-pose detection. The audio throughout
the exam is recorded and stored in the database for the teacher to review later.
The workflow of the application is simple: the student logs in to the respective
version, submits biometrics, and takes the exam with the proctor enabled, while the
teacher creates an exam or views the results. Once this is completed, the users are
logged out of the system. Proctoring needs to be enabled both on the phone and on
the PC/laptop, and biometrics are collected from both devices before the examination
begins.

25.4 Implementation

25.4.1 Online Proctoring

Online proctoring, like offline invigilation, is a service that protects the integrity of the
exam by remotely monitoring the candidate’s activity throughout the examination.
Existing online tests typically require students to utilize cameras to observe and
record their activities throughout the exams to ensure effective proctoring [15]. The
types of online proctoring systems are:

Fig. 25.1 Flowchart of web app

Live Proctoring A certified proctor observes the candidate as he or she takes the
exam using live audio and video feeds in this type of proctoring [11, 16]. These
proctors have been taught to check the validity of the applicant and to watch for red
flags such as facial movements or the appearance of any unconfirmed gadget that
might suggest probable cheating. If suspicious circumstances develop, the proctor
has the option of either terminating the exam or noting the illegal conduct. This type
of proctoring allows the proctor to watch a restricted number of applicants at one
time. Both the candidate and the proctor can be at any location as long as they have
internet connectivity. The main downside of this type of proctoring is that it is reliant
on the proctor’s availability on a specific date and time. It is also costly as it requires
human intervention, much like offline proctoring, and it is not scalable for the same
reason.
Recorded Proctoring Unlike live online proctoring, recorded proctoring simply
captures the student’s activity during the examination, which is then played back at a

Fig. 25.2 Use case diagram

higher speed by a proctor to search for any suspicious behavior or occurrence over the
course of the exam [11, 16]. This type of proctoring has the advantage of requiring
no scheduling because the candidate can take the exam whenever it is convenient
for him or her. However, because it also requires human interaction to evaluate the
footage, it is costly and difficult to scale.
Automated Proctoring This is the most sophisticated type of proctoring, requiring
no physical involvement at all. Using modern audio and video analytics, this type of
proctoring captures the candidate's behaviors while simultaneously monitoring the
feed for red flags of any kind that might suggest malpractice [11]. It is the cheapest
method of proctoring since it eliminates the need for physical intervention and is
scalable for the same reason.
So, to maintain the integrity of the exams, and to make it convenient for all the
users, “Intelligent Proctoring System” has been designed as an automated proctoring
system.

25.4.2 Web App Features

• Dual Login: Both the students and the teachers can login using the same website
to proceed with their respective functions.
• System Compatibility Check: Checks for the most recent browser version, internet
speed, and access to the camera.
• Face Detection: Detects multiple faces and scenarios where the face is not visible.
• Face Recognition: The test taker’s face is recognized when the biometrics are
captured. This feature helps in identifying impersonation cases.
• Phone Detection: Detects mobile phone, tablets, calculators, and laptops.
• Book/Paper Detection: Detects reading materials like books, papers, etc.
• Tab Switch Detection: Monitors a variety of browser and system activities, such
as quitting full screen mode, launching numerous tabs, and monitors the use of
keyboard shortcuts involving Ctrl (Control) and Alt (Alternate).
• Head Pose Detection: Extreme movements of the test taker like turning their head
left or right are detected.
• Audio Recording: All the audio throughout the duration of the exam is recorded
and stored in the database for future use by the test creator.
• Dual-camera Proctoring: Proctoring is enabled both on the PC/laptop as well as
the smartphone.
• Test Creation: Create a test with a unique ID using Google forms.
• Results Dashboard: Shows the results, i.e., the number of checklisted malpractices.
• User-friendly Interface: Offers a user-friendly interface for students, and teachers
can effortlessly add new exams and generate results. If a student requests for re-
evaluation after being rejected due to malpractice detection, teachers can review
malpractice records for human verification of logs.

25.4.3 Object Detection

Object detection is a computer vision and image processing approach for finding and
locating several items in an image or video. After recognizing an item in an image
or video, this method creates bounding boxes around it; each bounding box is defined
by a point, a height, and a width. It then adds class names to the items, such as cat,
book, or pen.
Tensorflow JS Tensorflow.js is a machine learning JavaScript library. It can be used
to:

• Run ML models—This may be used to convert a previously trained Keras or
TensorFlow model into TensorFlow.js format and then load it into the browser. As
TensorFlow model into TensorFlow.js format and then load it into the browser. As
a result, it is useful for running TensorFlow models in the browser.
• Retrain existing models—Deep learning models with millions of parameters
(weights) generally require a significant quantity of data and computer resources
to train from beginning. Transfer learning is a strategy for speeding up this process
by reusing a model component, which has already been trained on a similar job
in a new model. This might aid in the retraining of an imported model, and it is a
rapid approach to train an accurate model with a minimal quantity of data. As a
result, TensorFlow.js may be used to retrain pre-existing machine learning models
utilizing sensor data via the web.
• Build new models—Using JavaScript and a high-level layer’s Application Pro-
gramming Interface (API), TensorFlow.js may also be used to design, train, and
deploy models fully in the web. Using versatile and intuitive APIs, one can con-
struct and train models directly in JavaScript.

Object detection models There are various algorithms to detect objects in images/
videos. Few such algorithms are:

• Region-based Convolutional Neural Networks (R-CNN)—R-CNN is a convolutional
neural network which takes region recommendations and merges them. It
aids in the use of a deep network to locate objects and the training of a high-capacity
model with just a limited amount of annotated detection data. It accomplishes
excellent object detection accuracy by using a deep ConvNet to categorize object
suggestions. R-CNN has the ability to expand to thousands of object classes.
• You Only Look Once (YOLO)—YOLO is a popular object detection technology
utilized by researchers all around the globe. YOLO’s unified architecture, accord-
ing to studies, is lightning fast [6]. In real time, the standard YOLO model analyzes
pictures at 45 Frames Per Second (FPS), while Fast YOLO, a smaller version of
the network, processes an incredible 150+ FPS. This approach beats other identi-
fication algorithms, like R-CNN, when generalized from natural images to various
other domains. When precision is not a major concern but when one needs super
quick prediction, YOLO is a superior alternative.
• Single-shot Detector (SSD)—SSD is a technique based on deep neural networks to
identify objects in pictures. This method discretizes the output space of bounding
boxes into a set of default boxes over a variety of aspect ratios. After discretization,
this method scales per feature map location. The SSD network combines predic-
tions from numerous feature maps with different resolutions to handle objects of
various sizes naturally. SSD, which is a single-shot detector for many classes,
is both faster than the prior approach (YOLO) and far more accurate, almost as
accurate as Faster R-CNN, which is considered as one of the slowest techniques.
To conclude, though YOLO is perfect for real-time proctoring, it might not be
suitable when an exact prediction is required. Since a genuine student should not be
flagged, SSD has been used here for object detection. As a result, common objects
in context (COCO-SSD), one of the standard object detection models for TensorFlow.js,
is utilized, which can recognize 80 different types of objects.
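The flagging decision that follows detection is simple; the web app performs it in the browser with TensorFlow.js, and the Python sketch below illustrates the same logic over predictions shaped like the COCO-SSD output (a list of records with class, score, and bbox). The 0.6 confidence threshold is an assumption, not a value from the paper.

BANNED = {"cell phone": "Phone", "book": "Book", "laptop": "Laptop"}   # COCO class names
THRESHOLD = 0.6                                                        # assumed confidence cut-off

def flag_objects(predictions):
    flags = {label: False for label in BANNED.values()}
    for p in predictions:
        if p["score"] >= THRESHOLD and p["class"] in BANNED:
            flags[BANNED[p["class"]]] = True
    return flags

# Example: flag_objects([{"class": "cell phone", "score": 0.91, "bbox": [10, 20, 80, 160]}])
# returns {"Phone": True, "Book": False, "Laptop": False}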

25.4.4 Face Recognition

Face recognition plays an important role in an online proctoring system [16–18].
A file has been built using the face recognition model that comes with the face-api
library from the Node Package Manager (NPM). In this file, the relevant models are
loaded first, and a descriptor of the face is generated once the face is identified in the
student's image given as input to the API. The API recognizes a face using facial
descriptors generated from example images of the same face. Once the descriptors
have been extracted from the photographs, they are given a label, and the labels and
descriptors of the face are saved in a JavaScript Object Notation (JSON) file that may
be used to identify the student.
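The recognition step ultimately reduces to comparing descriptor vectors; the hedged Python sketch below shows that comparison. The JSON key name and the 0.6 distance threshold (face-api's usual default) are assumptions rather than details from the paper.

import json
import numpy as np

def is_same_student(reference_json, live_descriptor, threshold=0.6):
    """Accept the face when its Euclidean distance to the stored descriptor is small."""
    reference = np.array(json.loads(reference_json)["descriptor"])   # assumed key name
    distance = np.linalg.norm(reference - np.array(live_descriptor))
    return distance < threshold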

25.4.5 Head Pose Detection

PoseNet has been used to detect the head pose of a student. PoseNet is a posture
recognition system that can be used to recognize human poses in real time. It is
a TensorFlow deep learning model that can estimate human posture by identifying
various parts of a human body and then combining these points to construct a skeletal
structure of a human pose. It is built on a lightweight CNN model that can be used
instead of libraries that are dependent on APIs. PoseNet provides us with a total of 17
important points to utilize, ranging from our ears and eyes to our ankles and knees.
Ears have been used to detect the head pose of a student. If the student turns his/her
head to the left or to the right, then the corresponding head pose is detected and the
action is recorded.
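One possible decision rule for the ear-based check is sketched below over the keypoint records that a PoseNet-style model returns (each with a part name and a confidence score); the 0.2 visibility threshold and the exact left/right mapping are assumptions, since the paper does not state them.

def head_pose(keypoints, min_score=0.2):
    """keypoints: list of {"part": str, "score": float, ...} as produced by PoseNet."""
    scores = {k["part"]: k["score"] for k in keypoints}
    left, right = scores.get("leftEar", 0.0), scores.get("rightEar", 0.0)
    if left >= min_score and right < min_score:
        return "RightTurn"     # right ear hidden from the camera
    if right >= min_score and left < min_score:
        return "LeftTurn"      # left ear hidden from the camera
    return "Forward"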

25.5 Results

As seen in Fig. 25.3, the application does not allow any unrecognized student to take
the exam in place of the intended student. Only if the facial data of the student
matches the stored student model weight file will the student be allowed to proceed
with the test.
Fig. 25.3 Face recognition

Fig. 25.4 Teacher cheat score dashboard

As shown in Fig. 25.4, the final results can be viewed on the teacher dashboard, which
is divided into two sections: a cheat score for the laptop and a cheat score for the
mobile. The teacher can check the situations where the student has been flagged. The
system-level checks for malpractice include ExitFullScreen, Alt, and Ctrl, while the
visual parameters that track malpractice include Phone, Book, Laptop, noFace,
MultipleFace, LeftTurn, and RightTurn. Finally, the entire audio recording of the
session is stored for the teacher to go back and verify.

The teacher can verify both the laptop cheat record and the mobile cheat record to
examine whether the exam taken is authentic. The ideal case is when the laptop cheat
record shows ExitFullScreen—empty, Phone—False, Book—False, Laptop—False,
noFace—False, MultipleFace—False, LeftTurn—False, RightTurn—False, Alt—0,
Ctrl—0, and the presence of an audio recording.
For the mobile cheat record, the ideal case is ExitFullScreen—empty, Phone—False,
Book—False, and Laptop—True. Here, Laptop has to be true, since the mobile camera
is responsible for keeping track of the laptop on which the exam is being taken. If the
laptop is not visible, the student is flagged, since the student may have been involved
in some kind of malpractice, such as placing a mobile phone in front of the laptop
screen or a textbook on the laptop keyboard.

25.6 Conclusion and Future Enhancements

25.6.1 Conclusion

The web application module has been completed successfully and tested thoroughly
to detect all types of malpractice. It works seamlessly on PC/laptop as well as smart-
phone browsers.
The application successfully detects all types of malpractices like book, paper,
phone, face, laptop, along with face recognition, head pose detection. It records
the audio throughout the exam duration and also detects tab switching and exiting
full screen operation. The mobile proctoring system keeps track of the field of view,
which is not reachable by the laptop webcam, so the two cameras complement one
another. Thus, this dual-camera approach minimizes the chances of a student
committing a fraudulent activity through active visual and auditory monitoring.

25.6.2 Future Enhancement

The audio is recorded and stored in the database throughout the examination period.
This can further be improved with a more intelligent audio detection system to detect
malpractice in case of talking, chatting, voice search, etc. Also, besides the face
recognition performed during the biometric phase, it can be extended to real-time
face recognition that can make sure that the student has not replaced himself/herself
with someone else during the test.
As an extension to this project, a third camera can be placed behind the screen of
the PC/laptop to detect any human who is trying to help the test taker and objects
like books and papers on the wall behind their systems.

Acknowledgements The sense of accomplishment that comes with completing a task would be
incomplete without acknowledging the individuals, whose continued support has served as a source
of inspiration throughout the project.
We offer our heartfelt pranams to “AMMA,” MATA AMRITANANDAMAYI DEVI, who has
blessed us during this project’s development. Br. Viswamrita Chaitanya Swamiji, Director, Amrita
School of Engineering, Bangalore (ASEB), deserves our thanks. We would like to thank Dr. Sriram
Devanathan, Principal and Chairperson, Department of Computer Science and Engineering, ASEB,
for his encouragement and assistance throughout the project.
With great pleasure, we express our gratitude and heartfelt thanks to our project panel members,
and Ms. Kavitha C. R., Assistant Professor, Department of Computer Science and Engineering,
ASEB, for their invaluable guidance, moral support, encouragement, and affection throughout the
project. Finally, we owe a debt of gratitude to our parents, who have always loved, supported, and
encouraged us in whatever we do.

References

1. Nigam, A., Pasricha, R., Singh, T., Churi, P.: A systematic review on AI-based proctoring
systems: past, present and future. Educ. Inf. Technol. (Dordr.) 1–25 [published online ahead
of print, 2021 June 23]. https://doi.org/10.1007/s10639-021-10597-x
2. Raman, R., Vachharajani, H., Nedungadi, P.: Adoption of online proctored examinations by
university students during COVID-19: innovation diffusion study. Educ. Inf. Technol. 26(6),
7339–7358 (2021). https://doi.org/10.1007/s10639-021-10581-5
3. Subramanian, N.S., Narayanan, S., Soumya, M.D., Jayakumar, N., Bijlani, K.: Using Aadhaar
for continuous test-taker presence verification in online exams. In: Satapathy, S., Tavares, J.,
Bhateja, V., Mohanty, J. (eds.) Information and Decision Sciences. Advances in Intelligent Sys-
tems and Computing, vol. 701. Springer, Singapore (2018). https://doi.org/10.1007/s10639-
021-10581-5
4. Raj, R.S.V., Narayanan, S.A., Bijlani, K.: Heuristic-based automatic online proctoring system.
In: 2015 IEEE 15th International Conference on Advanced Learning Technologies 2015, pp.
458–459. https://doi.org/10.1109/ICALT.2015.127
5. Prathish, S., Bijlani, K.: An intelligent system for online exam monitoring. In: 2016 Interna-
tional Conference on Information Science (ICIS) 2016, pp. 138–143. https://doi.org/10.1109/
INFOSCI.2016.7845315
6. Harish, S.: New features for webcam proctoring using python and opencv. Revista Gestão
Inovação e Tecnologias 11, 1497–1513 (2021). https://doi.org/10.47059/revistageintec.v11i2.
1776
7. Li, H., Xu, M., Wang, Y., Wei, H., Qu, H.: A visual analytics approach to facilitate the proctoring
of online exams, pp. 1–17 (2021). https://doi.org/10.1145/3411764.3445294
8. Asep, H.S.G., Bandung, Y.: A design of continuous user verification for online exam proctoring
on M-learning. In: International Conference on Electrical Engineering and Informatics (ICEEI)
2019, pp. 284–289 (2019). https://doi.org/10.1109/ICEEI47359.2019.8988786
9. Labayen, M., Vea, R., Florez, J., Guillén-Gámez, F.D., García-Magariño, I.: SMOWL: a tool
for continuous student validation based on face recognition for online learning (2014)
10. Zhang, Z., Zhang, M., Chang, Y., Esche, S.K., Chassapis, C.: A virtual laboratory system with
biometric authentication and remote proctoring based on facial recognition. Comput. Educ. J.
7, 74–84 (2016)
11. Hussein, M.J., Yusuf, J., Deb, A.S., Fong, L., Naidu, S.: An evaluation of online proctoring
tools. Open Praxis 12(4), 509–525 (2020). https://doi.org/10.3316/informit.620366163696963
12. Khanna, V., Brodiya, S., Chaudhary, D.: Artificial intelligence based automated exam proctor-
ing system. Int. Res. J. Eng. Technol. (IRJET) 8(12), 558–560 (2021)

13. Jia, J., He, Y.: The design, implementation and pilot application of an intelligent online proc-
toring system for online exams. Interact. Technol. Smart Educ. (ahead-of-print 2021). https://
doi.org/10.1108/ITSE-12-2020-0246
14. Turani, A.A., Alkhateeb, J.H., Alsewari, A.A.: Students online exam proctoring: a case study
using 360 degree security cameras. In: Emerging Technology in Computing. Communication
and Electronics (ETCCE) 2020, pp. 1–5 (2020). https://doi.org/10.1109/ETCCE51779.2020.
9350872
15. Atoum, Y., Chen, L., Liu, A.X., Hsu, S.D.H., Liu, X.: Automated online exam proctoring. IEEE
Trans. Multimedia 19(7), 1609–1624 (2017). https://doi.org/10.1109/TMM.2017.2656064
16. Labayen, M., Vea, R., Flórez, J., Aginako, N., Sierra, B.: Online student authentication and
proctoring system based on multimodal biometrics technology. IEEE Access 9, 72398–72411
(2021). https://doi.org/10.1109/ACCESS.2021.3079375
17. Ashwinkumar, J.S., Kumaran, H.S., Sivakarthikeyan, U., Rajesh, K.P., Lavanya, R.: Deep
learning based approach for facilitating online proctoring using transfer learning. In: 2021
5th International Conference on Computer, Communication and Signal Processing (ICCCSP)
2021, pp. 306–312. https://doi.org/10.1109/ICCCSP52374.2021.9465530
18. Pradeesh, N., et al.: Fast and reliable group attendance marking system using face recogni-
tion in classrooms. In: 2019 2nd International Conference on Intelligent Computing, Instru-
mentation and Control Technologies (ICICICT), pp. 986–990 (2019). https://doi.org/10.1109/
ICICICT46008.2019.8993323
Chapter 26
Project-Based Learning:
A Contemporary Approach to Blend
Theory and Practical Knowledge
of Database Management Course

Rashmi K. Dixit , Sandipkumar Sahoo, and Rachit P. Shaha

Abstract An implementation and presentation of real-time database projects was conducted, in which students were expected to use the necessary development tools to build a working prototype. The goal of the activity was not only to deepen understanding of core database applications but also to examine how data flows through a basic database structure. Students were required to form groups of four to six members, according to the needs of the project. Projects were realized by visualizing data handling in Web sites and similar Internet applications. The database itself remained the focus of each presentation, and its interaction with the supporting components was presented with due attention to the dataflow pattern. The activity culminated in a showcase of the completed projects and met all of its objectives.

Keywords Database engineering · Project-based learning · Real-world problems · Database concepts

26.1 Introduction

Database Engineering plays a central role in a plethora of computer-based real-time applications. Data is at the heart of computer science and engineering and sustains the flow of most applications. A database system provides the user with many essential tools. With these capabilities come questions about how data will be stored, where the data repository will reside, which type of data is appropriate, and whether choosing complex data structures over simple ones yields better results. Specific data structures demand a specific schema and data model, and the chosen data model defines the data, relationships, semantics, and consistency constraints. A database designer needs to know the data definition language (DDL) to define the database schema and the data manipulation

The experiment was conducted for the T.Y.B.Tech class.

R. K. Dixit (B) · S. Sahoo · R. P. Shaha


Walchand Institute of Technology, Solapur, Maharashtra, India
e-mail: rashmirajivk@gmail.com


language (DML) to access and manipulate data. Defining a schema includes specifying properties of the data, domain constraints, integrity constraints, authorization, and so on. Once the data model is finalized, it must be converted into a logical database design and then a physical database design. Database design is therefore central to a database management system (DBMS), and a sound design requires a proper understanding of the client's needs.
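As a brief illustration of the DDL/DML distinction, the following minimal sketch uses Python's built-in sqlite3 module (a MySQL connector would look almost identical); the student table and its columns are hypothetical, not taken from any course project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# DDL: define the schema (table, domain and integrity constraints)
cur.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        branch  TEXT CHECK (branch IN ('CSE', 'IT', 'ENTC'))
    )
""")

# DML: insert and query data through the defined schema
cur.execute("INSERT INTO student (roll_no, name, branch) VALUES (?, ?, ?)",
            (1, "A. Student", "CSE"))
conn.commit()

for row in cur.execute("SELECT roll_no, name FROM student WHERE branch = ?", ("CSE",)):
    print(row)

conn.close()
```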
Such knowledge can only be absorbed through experience, which comes from undertaking real-time projects rather than short assignment sessions focused on trivial applications. Bell [1] describes PBL as a student-driven, teacher-facilitated approach to learning. PBL allows students to learn a concept and apply it in a real-time scenario; it maps theoretical knowledge to practical implementation, upgrading skill sets while solving real-world problems. In the present era, project-based learning (PBL) provides an opportunity to analyze real-world problems and demands, and through this active and engaged learning students gain deeper knowledge of the respective subjects. Hence, project-based learning was chosen as the teaching paradigm for the Database Engineering course.
This approach provides steady practice on real-time projects, and no skills are undermined, since they are reinforced as industrial skills. PBL contributes more to a database course than simply reaching the culmination of a project.

26.2 Literature Review

Researchers with different perspectives use varied strategies with regard to project management tools, project-based learning, and related systems. There is a recurring confusion between project-based learning and problem-based learning, a distinction examined by Julie E. Mills et al. [2], who highlighted the evident lack of design experience, teamwork ethics, and communication skills among the majority of engineering graduates. Bell [1] presents PBL as an innovative approach built on a multitude of strategies; it not only makes the learner an advanced problem solver but also a good communicator. Thomas M. Connolly [3] stated that the complexities encountered while solving real-time problems can overwhelm some IT professionals and fresh graduates, and he used principles of constructivist epistemology to teach database design and analysis concepts to overcome this issue. Cesar Dominguez et al. [4] integrated PBL with project management techniques and called the blend 'practical task development' (PTD).
L. Helle [5] reviewed researchers who previously worked on project-based learning and who now focus on the effectiveness of individual scores. PBL implementation also comes with several difficulties. Claus Pahl et al. [6] used a multimedia system based on the virtual apprenticeship model to help students acquire knowledge and skills in database design. Hurdles for students include the identification and generalization of real-world problems, together with issues such as solution deadlines and confidence in the chosen approach. Teachers face the difficulty of ensuring equivalence among different problem statements, assessing solutions, and judging the contribution and knowledge of each team member.
We blend PBL with teaching Database Engineering in the traditional way, the primary objective being to root the key concepts of DBE and the secondary objective being to inculcate a teamwork culture and enhance the ability to handle complexities during work.

26.3 Implementation Details

We have introduced PBL while teaching the Database Engineering course to third-year Computer Science Engineering students. The course structure includes four theory lectures and two-hour lab practice sessions. The curriculum [7] includes an introduction to databases, the relational model, database design and the ER model, indexing, normalization, transactions, and concurrency control.
Initially, concepts were taught in the classroom, and later short assignments on various database topics were given during lab sessions to reinforce the concepts. A key observation was that final-year students faced difficulties with the real-time problems they had selected for their final-year projects under the respective project guides. This led us to introduce PBL to third-year students with the goal of achieving the first three outcomes. The PBL activity was organized in seven stages.
The activity ran for one semester, and progress was assessed at every phase. The phases were segregated as:
• Topic selection
• Requirement analysis
• Database design
• Frontend designing
• Connection of database using backend
• Connection of backend, frontend, and database
• Execution of higher-level queries for different requirements
• Implementation of the complete application.

26.3.1 Junctures of Activity

1. Team Formation and Topic Selection

Students were required to form teams of at most four members. The class was divided into four batches, and each batch was subdivided into four to five teams. Students were expected to develop team spirit, management skills, and communication skills (Fig. 26.1).

Fig. 26.1 Group formation

At the time of team formation, teams were given guidelines for topic selection. Topics that involve large streams of dataflow were expected to be selected. Automotive, Banking, Education, Legal, Government, Pharmaceutical, Music, Sports, and many more were among the domains chosen. Figure 26.2 shows some examples of topics chosen by the student groups.

2. Requirement Analysis
Requirements were first collected from users and analyzed. Entities and their attributes were identified, and relationships among entities and cardinality mappings were established. These requirements were then converted into conceptual designs. Redundancies were eliminated by normalizing the data, and the final schema was designed through an iterative process. The conceptual schema was converted into a logical design using an entity-relationship (ER) diagram. Tools such as EDraw, ERDPlus, and LucidChart were used to draw the ER diagrams, which were rechecked by faculty before approval.

Fig. 26.2 Topic selection

Figure 26.3 shows an example ER diagram. These ER diagrams were then converted into relational schemas.

3. Database Design
MySQL, a mainstream structured query language (SQL) database, was used for creating and manipulating the databases. DDL commands were used to create the databases and tables. Once the database was prepared, front-end programming followed (Fig. 26.4).

4. Front End Design

The frontend comprises the graphical user interface (GUI); languages and frameworks such as HTML, CSS, JavaScript, and React.js were used. The user interface was tested, and a real-time working interface was encouraged. Various forms were designed for the different operations on data, and all menus, buttons, and redirections were tested (Fig. 26.5).

Fig. 26.3 Entity relationship modeling

Fig. 26.4 Relational model

5. Connection of Front End with Backend

Application program interfaces (APIs) such as ODBC and JDBC were used to connect the front end with the database. Students used DML commands such as INSERT, UPDATE, and DELETE to manipulate data, demonstrating SQL concepts. A sample form is shown in Fig. 26.6.

Fig. 26.5 Front end design

Fig. 26.6 Higher order queries

At this stage, the connectivity of controls with the database fields, the validations, and the integrity checks were tested, along with the correctness of the manipulation operations.
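The student projects used ODBC/JDBC from their respective technology stacks; as a language-neutral sketch of the same idea, the snippet below (assuming Flask and the standard sqlite3 module; the route, table, and field names are hypothetical) shows a form submission from the front end being turned into a parameterized DML statement on the backend.

```python
import sqlite3
from flask import Flask, request

app = Flask(__name__)
DB = "college.db"  # hypothetical database file

def get_conn():
    conn = sqlite3.connect(DB)
    conn.execute("CREATE TABLE IF NOT EXISTS student (roll_no INTEGER PRIMARY KEY, name TEXT)")
    return conn

@app.route("/students", methods=["POST"])
def add_student():
    # Values come from the front-end form; placeholders keep the DML safe
    conn = get_conn()
    conn.execute("INSERT INTO student (roll_no, name) VALUES (?, ?)",
                 (request.form["roll_no"], request.form["name"]))
    conn.commit()
    conn.close()
    return "inserted", 201

if __name__ == "__main__":
    app.run(debug=True)
```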
6. Execution of Higher-Level Queries
In a well-designed database, the data that you want to present through a form or report is usually located in multiple tables. A query can pull the information from various tables and assemble it for display in the form or report. During project implementation, students executed a number of higher-order queries to obtain different results.

Fig. 26.7 Project presentation
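As a small illustration of such a higher-order query, the following sketch joins and aggregates data across two hypothetical tables (the schema is illustrative and not taken from any particular student project).

```python
import sqlite3

# Hypothetical two-table schema to illustrate a multi-table (higher-order) query
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT,
                          dept_id INTEGER REFERENCES department(dept_id));
    INSERT INTO department VALUES (1, 'CSE'), (2, 'IT');
    INSERT INTO student VALUES (101, 'Asha', 1), (102, 'Ravi', 2), (103, 'Meera', 1);
""")

# A report query that joins, filters, and aggregates across both tables
rows = conn.execute("""
    SELECT d.dept_name, COUNT(*) AS strength
    FROM student s JOIN department d ON s.dept_id = d.dept_id
    GROUP BY d.dept_name
    ORDER BY strength DESC
""").fetchall()
print(rows)   # e.g. [('CSE', 2), ('IT', 1)]
conn.close()
```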

7. Implementation of the Complete Application (Fig. 26.7)

26.4 Assessment

A time span was allocated and deadlines were set for the completion of each phase. After each deadline, the work was checked by the course instructor and suggestions were given. In this way, intermediate monitoring was carried out throughout the project-based learning activity, and improvements beneficial to the students were suggested. During each check, the instructor also verified whether the previous suggestions had been incorporated. A grading chart was prepared by the teacher to record each team's grades, which were also considered in the final presentation.
At the end of the semester, online presentations of the completed projects, which had by then turned into 'products', were made. A rubric template with stepwise marks was defined, and marks were also given by each group to the other groups (Fig. 26.8).
Nearly 80% of the teams delivered a complete presentation, while the remaining teams presented partially completed work.
The group project is graded based on the percentages listed in the description document (70% design, 30% implementation). The design is graded on how well the required sections are described and documented, and the implementation is graded using the project evaluation template. A student's individual grade is based on the project grade (80%) and the group's evaluation of their contribution (20%). When the projects are turned in, the teacher passes out a group evaluation worksheet that each student must fill out, and individual grades are assigned based on these worksheets.

Fig. 26.8 Project evaluation template

For the remaining 20%, each student contributes 20 points to a pool, which is then distributed according to the percentage of work each student completes. Thus, if one student does not work at all, they may receive 0 points, while their partner may receive 40 points.
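As a quick illustration of how these weights combine, the sketch below computes one student's final grade; the function names, team size, and scores are hypothetical and simply follow the 70/30 and 80/20 splits described above.

```python
# Hypothetical grade computation following the 70/30 project split and the
# 80/20 individual split described above.
def project_grade(design_score, implementation_score):
    """Both scores on a 0-100 scale; project grade = 70% design + 30% implementation."""
    return 0.7 * design_score + 0.3 * implementation_score

def individual_grade(project, peer_points, pool_per_student=20, team_size=2):
    """80% from the project grade, 20% from the peer-evaluation pool.

    Each teammate contributes `pool_per_student` points to a shared pool that is
    divided according to contribution; `peer_points` is this student's share."""
    pool_total = pool_per_student * team_size
    peer_component = 100 * peer_points / pool_total          # normalize to 0-100
    return 0.8 * project + 0.2 * peer_component

proj = project_grade(design_score=85, implementation_score=70)   # -> 80.5
print(round(individual_grade(proj, peer_points=40), 1))  # one partner did all the work
print(round(individual_grade(proj, peer_points=0), 1))   # the other did none
```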
Thus, in the DBE course, being able to run only DDL and DML commands on a given schema is not sufficient; students also need to build up good schemas for real-world applications. Through PBL, students were able to execute such higher-order queries and came up with good database designs for their applications.
After the successful completion of the projects at the end of the semester, not only was student feedback collected, but a peer evaluation form was also designed (Fig. 26.9).

Fig. 26.9 Peer evaluation template

26.5 Conclusion

During the COVID-19 pandemic, there was little exposure to real-life projects that test the various aspects of database engineering. The program described above covers not only the rudimentary concepts but also the deeper ones, and it demanded a substantial amount of time and knowledge. The majority of the projects were ready-to-deploy Web sites, and their data flow management was validated against the needs of deployed Web sites, which require efficient data flow, storage management, swift rendering, and more.
This chapter presented a case study of how PBL can be blended with traditional teaching to improve conceptual understanding in the DBE course. In the laboratory sessions, instead of carrying out independent assignments to demonstrate database concepts, the theoretical concepts were translated into real-world problems using PBL, which led to an improvement in the skill sets required for database design and application development.

References

1. Bell, S.: Project-based learning for the 21st century: skills for the future. The Clearing House:
A Journal of Educational Strategies, Issues and Ideas. 83(2), 39–43 (2010)
2. Mills, J.E., Treagust, D.F.: Engineering education – is problem based or project-based learning
the answer? Australas. J. Eng. Educ. 3(2), 2–16 (2003)
3. Connolly, T.M., Begg, C.E.: A constructivist- based approach to teaching database analysis and
design. J. Inform. Syst. Educ. 17, 43–53 (2005)
4. Dominguez, C., Jaime, A.: A project based learning in database design learning. J. Educ. Psychol.
20(4), 191–206 (1985)
5. Helle, L., Tynjala, P., Olkinuora, E.: Project-based learning in post- secondary education – theory,
practice and rubber sling shots. Higher Education 51, 287–314 (2006)
6. Pahl, C., Barrett, R., Kenny, C.: Supporting active database learning and training through inter-
active multimedia. In: Proceedings of the 9th Annual SIGCSE Conference on Innovation and
Technology in Computer Science Education, Leeds, 28–30 June (2004)
7. Silberschatz, A., Korth, H.F., Sudarshan, S.: Database System Concepts, 6th edn. McGraw Hill,
New York (2002)
Chapter 27
Comparative Study of CNN-Based
Multi-Disease Detection Models Through
X-Ray Images

Diwakar and Deep Raj

Abstract The adoption of computer-aided techniques in health care is continually improving diagnosis and treatment using chest X-ray images. Deep learning approaches are proving effective in offering more accurate disease detection; however, significant hurdles remain in medical imaging. This paper presents an experimental comparative analysis of popular deep learning-based convolutional neural network (CNN) models, namely ResNet50, Xception, VGG16, and VGG19, using transfer learning for multi-disease detection. Although several deep convolutional architectures were tried, only the top-performing ones are presented here. The paper addresses four-class (chest disease) classification from chest X-rays: COVID, Normal, Pneumonia, and Tuberculosis. All four models are trained, tested, and validated on the same chest X-ray dataset, which consists of 700 images per class. The comparative results include accuracy, predicted output, training and validation loss, confusion matrices, error rate, and F1-score.

Keywords X-ray image · Convolution neural network · Transfer learning · COVID detection · Pneumonia detection

27.1 Introduction

COVID-19, pneumonia, and tuberculosis all have very similar symptoms (e.g., cough, fever, and shortness of breath), and diagnosing them requires several testing steps such as blood samples, RT-PCR, microbiological examination of sputum, and other appropriate samples. This takes a long time to identify the actual disease. As a consequence, many infected patients unknowingly infect others and the disease becomes more severe, so these diseases need to be diagnosed at an early stage by other diagnostic methods. In medical science, researchers have found another effective way to diagnose these diseases using radiological images such as X-rays and computed tomography (CT) scans.

Diwakar (B) · D. Raj


Babasaheb Bhimrao Ambedkar University (A Central University), Lucknow, India
e-mail: Diwakarmsccs0@gmail.com


In computer vision, deep learning has already proven its ability to classify images with human-like accuracy, and many deep neural network models have previously been developed to detect these diseases through chest X-rays. Deep learning is an active topic in medical image processing, and researchers and scientists are continually working on improving the efficiency and accuracy of the results. Moreover, one key issue in the medical field is the lack of large datasets with reliable ground-truth labeling; data augmentation techniques and transfer learning approaches can help to address this shortage. In this paper, we experimented with four deep neural networks, ResNet50, Xception, VGG16, and VGG19, using a transfer learning approach with ImageNet weights to detect four classes: COVID-19, Normal, Pneumonia, and Tuberculosis. The experimental results show accuracies of 81.38%, 89.72%, 98.33%, and 98.05%, respectively. The components of the DCNNs, the dataset used, the functioning of the models, and other relevant information are addressed in detail in the following sections.
In medical science, many imaging technologies are used to detect abnormalities in the human body, but it can be difficult or time-consuming to recognize anomalies from the captured images. Convolutional neural networks offer an efficient way to perform this task with greater accuracy and in less time. Several related research papers have been analyzed and are summarized here.
Ikechukwu et al. [1] presented a comparative study of segmentation and classification of pneumonia using ResNet50, VGG19, and training from scratch, obtaining accuracies of 84.5%, 93.5%, and 93.60%, respectively. Kumari et al. [2] experimented with COVID detection using ResNet50, VGG16, InceptionV3, and Xception models and reported accuracies of 94%, 98%, 97%, and 97%, respectively. Ji et al. [3] presented classification of COVID-19 from X-ray images using deep learning-based feature fusion. Shazia et al. [4] studied multiple neural networks using transfer learning for COVID detection from chest X-rays with several popular models. Ammar et al. [5] proposed a comparative study of deep learning-based models, VGG16, ResNet50V2, ResNet152V2, Xception, MobileNetV2, and DenseNet, for COVID detection. Rajagopal [6] carried out a comparative analysis of COVID-19 classification using convolutional neural networks, transfer learning, and machine learning approaches.

27.2 Dataset and Methods

In this paper, all images were collected from Kaggle: the COVID-19 Radiography Database [7], Chest X-Ray Images (Pneumonia) [8], and the Tuberculosis (TB) Chest X-ray Database [9]. Chest X-ray images of COVID-19 patients, pneumonia patients, tuberculosis patients, and normal subjects are all included in this dataset, as shown in Fig. 27.1.

Fig. 27.1 Examples of four chest X-rays

27.2.1 Training, Validation and Testing

For each model, the same dataset was used: per class, 700 images, of which 100 were reserved for testing, a randomly selected 15% of the remaining 600 images (90 images) were used for validation, and the rest (510 images) were used for training the multi-class classifier (Table 27.1).
Several steps are involved in diagnosing these diseases, as shown in Fig. 27.2. The data are first preprocessed, which involves resizing, rotation, flipping, and translation, among other operations. The data are then separated into two parts, a training set and a test set, each with four classes. These sets are fed into a transfer learning model pre-trained on ImageNet to extract the key features, whose output is passed to a fully connected layer and, ultimately, to a softmax activation function that produces the class prediction.

Table 27.1 Dataset


Classes Number of images
COVID-19 700
Normal 700
Pneumonia 700
Tuberculosis 700
Total Images = 2800

Fig. 27.2 General architecture of multi-class classification model
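A minimal sketch of this preprocessing and data-split stage, assuming a Keras/TensorFlow workflow with a hypothetical directory layout (data/train/<class> and data/test/<class> for the four classes); the augmentation values and batch size are illustrative, not taken from the paper.

```python
import tensorflow as tf

# Hypothetical directory layout: data/train/<class>/... and data/test/<class>/...
train_aug = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255,          # intensity normalization
    rotation_range=10,          # rotation
    horizontal_flip=True,       # flipping
    width_shift_range=0.1,      # position translation
    height_shift_range=0.1,
    validation_split=0.15)      # 15% of the non-test images held out for validation

train_gen = train_aug.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32,
    class_mode="categorical", subset="training")
val_gen = train_aug.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32,
    class_mode="categorical", subset="validation")

test_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1.0 / 255) \
    .flow_from_directory("data/test", target_size=(224, 224), batch_size=32,
                         class_mode="categorical", shuffle=False)
```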



Fig. 27.3 CNN architecture

27.2.2 Convolutional Neural Network

Convolutional neural networks (CNNs or ConvNets) are artificial neural networks made up of neurons with learnable weights and biases. CNNs are used to automatically extract features from image datasets and classify the images. Object detection, recommendation systems, image classification, and natural language processing are some of the applications in which deep convolutional neural networks (DCNNs) are used. A convolutional network's architecture typically includes four kinds of layers: convolution, pooling, activation, and dense (fully connected) layers, as shown in Fig. 27.3 [10].
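Purely as an illustration of the four layer types named above (and not one of the architectures compared in this study), a minimal Keras CNN might look as follows; the filter counts and layer sizes are arbitrary.

```python
import tensorflow as tf
from tensorflow.keras import layers

# A toy CNN showing the four layer types: convolution, pooling, activation, dense
toy_cnn = tf.keras.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),   # convolution + activation
    layers.MaxPooling2D((2, 2)),                    # pooling
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),           # dense / fully connected
    layers.Dense(4, activation="softmax"),          # one output per chest-disease class
])
toy_cnn.summary()
```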

27.2.3 Transfer Learning

Transfer learning (TL) is a machine learning research area focused on storing the knowledge gained while solving one problem and applying it to a different but related problem, for example by adapting or retraining an existing model on a related domain. In the TL approach, the early convolutional layers of the network are frozen, and only the last few layers, which generate the prediction, are trained.
In this experimental study, all four models, VGG16, VGG19, ResNet50, and Xception, adopted ImageNet weights through transfer learning; ImageNet contains more than 14 million hand-annotated images in over 20,000 categories. Training a model from scratch (with random weight initialization) is usually difficult because it demands powerful computing machines and a large amount of data. Since this study works with a small dataset, the transfer learning approach is used to avoid the problem of overfitting.

27.2.4 Evaluation Standard

The performance of the neural network-based models is measured using the following indicators: accuracy, sensitivity (recall), precision, F1-score, and the confusion matrix. The following definitions are used for these evaluations.
A true positive (TP) is when the model correctly predicts the positive class, and a true negative (TN) is when the model correctly predicts the negative class. Likewise, a false positive (FP) is when the model incorrectly predicts the positive class, and a false negative (FN) is when the model incorrectly predicts the negative class.
Accuracy is the number of correct predictions divided by the total number of predictions in the dataset:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (27.1)$$

Recall (sensitivity) is the number of correct positive predictions divided by the total number of actual positives:

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (27.2)$$

Precision is the number of correct positive predictions divided by the total number of positive predictions:

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (27.3)$$

The F1-score is the harmonic mean of precision and recall:

$$\text{F1-score} = \frac{2\,(\text{Precision} \cdot \text{Recall})}{\text{Precision} + \text{Recall}} \qquad (27.4)$$

Model performance during training is measured using the categorical cross-entropy loss function, which is used for multi-class classification tasks. The categorical cross-entropy loss is defined as:

$$\text{Categorical cross-entropy loss} = -\sum_{c=1}^{M} y_{o,c}\,\log(p_{o,c}) \qquad (27.5)$$

where $M$ is the number of classes (COVID, Pneumonia, Normal, and Tuberculosis), $\log$ is the natural logarithm, $y_{o,c}$ is a binary indicator (0 or 1) of whether class label $c$ is the correct classification for observation $o$, and $p_{o,c}$ is the predicted probability that observation $o$ belongs to class $c$.
A confusion matrix is a table that gives the TP, TN, FP, and FN values, underpins predictive metrics such as recall, precision, and accuracy, and shows how well a classification model performs on a set of test data for which the true values are known.
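These indicators need not be computed by hand; assuming scikit-learn is available (the paper does not state which tooling was used for evaluation), the quantities above can be obtained from the true and predicted labels as follows. The label arrays here are purely illustrative.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix)

# Hypothetical labels for a handful of test images; in practice y_true comes from
# the test generator and y_pred from np.argmax(model.predict(test_gen), axis=1).
classes = ["COVID", "Normal", "Pneumonia", "Tuberculosis"]
y_true = np.array([0, 0, 1, 2, 2, 3, 3, 1])
y_pred = np.array([0, 3, 1, 2, 1, 3, 3, 1])

print("Accuracy:", accuracy_score(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))                              # rows: true, cols: predicted
print(classification_report(y_true, y_pred, target_names=classes))   # precision, recall, F1
```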

This study applied transfer learning to all four types of common deep convolutional
network models, as mentioned below.

27.2.5 Experimental Setup

The experiment was conducted on Google Colab with an Nvidia K80/T4 GPU and 12 GB of memory. All four models, VGG16, VGG19, ResNet50, and Xception, were developed using TensorFlow 2.6.0 with pre-trained ImageNet weights from the Keras applications API, freezing all layers except the last fully connected dense layer. The Adam optimizer was used to minimize the loss function with a learning rate of 0.001. To handle underfitting and overfitting, an early stopping technique was implemented: training ran for up to 30 epochs and was terminated by a callback (patience = 5) when no improvement was observed. All images in the dataset were resized to 224 × 224 pixels.
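A sketch of this setup for the VGG16 variant follows; the classifier head (a Flatten followed by a four-unit softmax layer), the restore_best_weights option, and the generator names from the earlier data-pipeline sketch are assumptions, since the paper specifies only the optimizer, learning rate, early stopping, and frozen base.

```python
import tensorflow as tf

# Frozen VGG16 base with ImageNet weights; only the classifier head is trained.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation="softmax"),   # COVID, Normal, Pneumonia, Tuberculosis
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                              restore_best_weights=True)

# train_gen / val_gen as built in the data-pipeline sketch above
history = model.fit(train_gen, validation_data=val_gen,
                    epochs=30, callbacks=[early_stop])
```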

27.2.6 VGG16

VGG16 is a CNN architecture proposed by Simonyan et al. [11] in 2014. It is made up of 16 weight layers, 13 convolutional layers and three fully connected layers, with max pooling layers used to reduce the volume size and a softmax activation following the last fully connected layer. VGG16's input is a fixed-size 224 × 224 RGB image. Instead of a large number of hyper-parameters, VGG16 relies on 3 × 3 convolution filters with stride 1 and 'same' padding, together with 2 × 2 max pool layers with stride 2. This experimental study uses only one fully connected (Dense) layer with four units and softmax activation, since there are four classes (COVID, Pneumonia, Normal, and Tuberculosis) to predict.

27.2.7 VGG19

In the VGG19 architecture, 19 refers to the number of weight layers. VGG19 also takes fixed-size 224 × 224 RGB images as input and is made up of 19 layers, 16 convolutional layers and three fully connected layers, with max pooling layers used to reduce the volume size and a softmax activation following the last fully connected layer. Pre-trained ImageNet weights are used in this experiment, with only one fully connected layer of four units and softmax activation for the four classes (COVID, Pneumonia, Normal, and Tuberculosis) to predict.

27.2.8 ResNet50

ResNet, or Residual Neural Network [12], has several variants. This study used ResNet50, which has more than 23 million parameters and consists of 48 convolutional layers, one max pooling layer, and one average pooling layer. Each convolution block has three convolution layers, and each identity block has three convolution layers as well.

27.2.9 Xception

The Xception model [13] is an extension of the Inception architecture and has approximately 23 million parameters. In this study, Xception's input is a 224 × 224 RGB image; pre-trained ImageNet weights (about 88 MB in size) are used, and the last fully connected layer is replaced with four units and softmax activation for the four classes (COVID, Pneumonia, Normal, and Tuberculosis) to predict.

27.3 Experimental Result and Discussion

This experimental study classifies the four categories (COVID, Normal, Pneumonia, and Tuberculosis) using deep learning convolutional neural network (CNN) models, ResNet50, Xception, VGG16, and VGG19, with a transfer learning approach. The final predicted output for all four architectures is shown in Fig. 27.4.

27.3.1 Training and Validation Accuracy

The training and validation accuracy obtained for all four models is illustrated in Fig. 27.5. For VGG16 and VGG19, training accuracy exceeds 90% in just four epochs and continues to increase, reaching 98% at 17 and 12 epochs, respectively. In the case of ResNet50, the curve shows validation accuracy rising above training accuracy; this could be due to the smaller dataset, the use of dropout during training, or a learning rate that is too high for this model. We used the same dataset for all models for comparison purposes, which is why this outcome is reported as it is; ResNet50 could therefore still be improved. Furthermore, in the case of the Xception model, training and validation accuracy reached more than 80% within four epochs.

Fig. 27.4 Predicted output

27.3.2 Training and Validation Loss

The performance of each model is visualized using learning curves, from which three common behaviors can be observed: overfitting, underfitting, and a good fit. The results show that the VGG16, VGG19, and Xception models reach their minimum loss at 17, 12, and 14 epochs, respectively; in the case of ResNet50, the training loss does not decrease, suggesting the model could improve further with more training images and fine-tuning techniques. The training and validation losses obtained for all four models are illustrated in Fig. 27.6.

27.3.3 Confusion Matrix

The confusion matrix displays the number of images classified correctly and incorrectly. The confusion matrix results clearly show that the VGG16 and VGG19 models perform well, with higher training and validation performance. ResNet50, in contrast, shows some confusion between the COVID and Tuberculosis classes, visible as incorrect predictions in Fig. 27.7.

Fig. 27.5 Training and validation accuracy plot

27.3.4 F1-Score, Precision and Recall

The F1-score, precision, and recall for all four models are illustrated in Fig. 27.8. The results show that the VGG16, VGG19, ResNet50, and Xception models achieved F1-scores of 0.97, 0.97, 0.84, and 0.87, respectively.

27.4 Conclusion

In medical science, it is often difficult to diagnose the actual disease or to differentiate diseases that have many similarities or similar symptoms. To address this issue, we experimented with deep learning-based convolutional neural network models for multi-disease detection. Four popular models, VGG16, VGG19, ResNet50, and Xception, were implemented using transfer learning (ImageNet weights) with TensorFlow 2.6.0 and the Keras applications API on the Google Colab platform, achieving accuracies of 98.33%, 98.05%, 81.38%, and 89.72% and F1-scores of 0.97, 0.97, 0.84, and 0.87, respectively. From the experiment, we conclude that VGG16 and VGG19 produce the best accuracy, while the Xception model achieved approximately 90% accuracy, which could be improved further through fine-tuning or by training on more data. The ResNet50 model, in contrast, was not up to the mark. A single experiment, however, cannot establish that any model is good or bad; every model depends on many factors such as dataset size, number of epochs, fine-tuning techniques, optimization techniques, and more.

Fig. 27.6 Training and validation loss plot

Fig. 27.7 Confusion Matrix

Fig. 27.8 F1-score, precision, and recall



References

1. Victor Ikechukwu, A., Murali, S., Deepu, R., Shivamurthy, R.C.: ResNet-50 vs VGG-19 vs
training from scratch: a comparative analysis of the segmentation and classification of Pneu-
monia from chest X-ray images. Glob. Transit. Proc. 2, 375–381 (2021). https://doi.org/10.
1016/j.gltp.2021.08.027
2. Kumari, S., Ranjith, E., Gujjar, A., Narasimman, S., Zeelani, H.A.S.: Comparative analysis of
deep learning models for COVID-19 detection. Glob. Transit. Proc. 2, 559–565 (2021). https://
doi.org/10.1016/j.gltp.2021.08.030
3. Ji, D., Zhang, Z., Zhao, Y., Zhao, Q.: Research on classification of covid-19 chest x-ray image
modal feature fusion based on deep learning. J. Healthc. Eng. 2021, e6799202 (2021). https://
doi.org/10.1155/2021/6799202
4. Shazia, A., Xuan, T.Z., Chuah, J.H., Usman, J., Qian, P., Lai, K.W.: A comparative study of
multiple neural network for detection of COVID-19 on chest X-ray. EURASIP J. Adv. Signal
Process. 2021(1), 1–16 (2021). https://doi.org/10.1186/s13634-021-00755-1. Accessed 25 Jan 2022
5. Chalifah, A., Purbojo, S., Umitaibatin, R., Rudiyanto, D.: Comparative study of convolutional
neural network feature extractors used for covid-19 detection from chest X-ray images. (2020)
https://doi.org/10.13140/RG.2.2.15462.24642
6. Rajagopal, R.: Comparative analysis of COVID-19 X-ray images classification using convolu-
tional neural network, transfer learning, and machine learning classifiers using deep features.
Pattern Recognit. Image Anal. 31, 313–322 (2021). https://doi.org/10.1134/S10546618210
20140
7. COVID-19 Radiography Database. https://kaggle.com/tawsifurrahman/covid19-radiography-
database. Accessed 25 Jan 2022
8. Chest X-Ray Images (Pneumonia). https://kaggle.com/paultimothymooney/chest-xray-pne
umonia. Accessed 25 Jan 2022
9. Tuberculosis (TB) Chest X-ray Database. https://kaggle.com/tawsifurrahman/tuberculosis-tb-
chest-xray-dataset. Accessed 25 Jan 2022
10. diagrams-Dec-2020_F.png (1536×593). https://i0.wp.com/www.run.ai/wp-content/uploads/
2021/01/diagrams-Dec-2020_F.png?resize=1536%2C593&ssl=1. Accessed 25 Jan 2022
11. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition. arXiv preprint arXiv:1409.1556 Cs. (2015)
12. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition.
ArXiv151203385 Cs. (2015)
13. Chollet, F.: Xception: deep learning with depthwise separable convolutions. ArXiv161002357
Cs. (2017)
Chapter 28
Student’s Employability Concern
for Colleges and Universities

Asmita S. Deshmukh and Anjali B. Raut

Abstract Educational data mining (EDM) is gaining extensive favor in the education field due to its predictive potential. Most previous efforts in this area were limited to predicting students' performance based on academic results. These days, education has become aligned with employment, so along with academic grades, an individual's skills contribute equally. Predicting students' accomplishments in campus placements at an early stage can give students an idea of the preparation they need to become market-ready. Also, for students with very poor performance, proactive or motivating actions can be taken at the college or university level to build up their performance. The proposed model is a case study that can be executed in a college for placement improvement, keeping in mind the requirements of both the companies and the students opting for placements.

Keywords Educational data mining · Job analysis · Unified model · Clustering · Classification · Predictive analytics

28.1 Introduction

Big data analytics is applied by every company interested in exploiting the abundantly available data for competitive advantage in nearly every sector. Data mining is the process of drawing out beneficial information from large datasets; it enables users to understand the data and to make useful findings from the knowledge mined from databases. Over the last decade, educational data mining (EDM) has attracted a lot of attention from researchers due to the educational data now accessible from many sources. The main aim of EDM is to make data mining (DM) models more effective in order to safeguard the large amounts of educational information and to develop a supportive atmosphere for student learning [1]. Like data mining in general, EDM is regarded as a very capable tool in

A. S. Deshmukh (B) · A. B. Raut


HVPM’s College of Engineering and Technology, Amravati, India
e-mail: asmitadeshmukh7@gmail.com


the education field. A core aim of higher education organizations is to provide their students with good opportunities in terms of employability, and EDM is an expanding research field that helps academic institutions improve their students' performance. Although employability is ultimately the individual student's responsibility, colleges, and to some extent universities, are partly responsible for it, as they shape students' social, local, and global environment. Not only the college but also teachers, classmates, and family have the role and potential to prepare students to take their place in society as responsible citizens contributing to sustainable development in the coming years. Academic institutions can develop students not only through examinations but also by making them market-ready through efforts toward their overall growth. EDM provides distinct practices for predicting students' academic achievement. Besides EDM methods, which assist decision-making in educational settings, high-quality educational datasets also contribute: they produce better results, and the decisions built on them can increase education quality by predicting student attainment.

28.2 Historical Background

Making existing data meaningful so that it yields accurate insights into past behavior is one of the key discussions in most institutes and organizations today. In this section, we review literature related to big data tools, techniques, and methods. Big data is currently among the most talked-about technology trends.

28.2.1 Big Data Analytics Tools, Methods, and Frameworks

In their review paper, the authors of [2] surveyed the latest technologies developed for big data. This assists in selecting and adopting the correct blend of big data technologies according to technological demands and specific application requirements. A global view of the main big data technologies is given, along with a comparison across different system layers; the paper classifies and discusses the features, advantages, limits, and usages of the main technologies [2]. Another paper offers a broader interpretation of big data that captures its characteristics and defines predictive analytics, its process, and its relation to machine learning. It stresses the need to construct new tools for predictive analytics using machine learning and artificial intelligence, which often use statistical techniques to give computers the capability to "learn" from data without being explicitly programmed. With so much data available, prediction models, together with the machine, are needed to make executives better at their decision-making process [3]. In another work, the authors explain the importance of the digital transformation that has emerged through big data analytics ecosystems and will be noticeable in the coming years. Big data analytics ecosystems will develop as a driver of digital renewal and sustainability as business models become progressively more oriented toward individual and societal needs. The aim is to conceptualize big data and business analytics ecosystems and to propose a model that paves the way to digital renewal and sustainable societies, namely the digital transformation and sustainability (DTS) model [4, 5]. Finally, another paper defines predictive analytics as comprising several statistical and analytical techniques that help in developing plans for future prediction. Available data mining techniques can be used to help forecast a future event, which is predictive analytics, and recommendations made on that basis constitute prescriptive analytics. In that article, these data mining techniques and predictive analytics are applied to various medical datasets to predict diseases, with their accuracy levels, pros, and cons, and the article concludes with the concerns of those algorithms and advanced approaches on big data [6].

28.2.2 Review of Educational Data Mining

The EDM process transforms raw data coming from educational systems into potential information that can greatly impact educational research and practice. The main aim of EDM is to make DM models more effective in order to safeguard the large amounts of educational information and to develop systems that help educational institutes meet their placement targets. Research on EDM is increasing because of the benefits of the knowledge obtained from machine learning processes, which helps enhance decision-making in higher education institutions. To determine the grade points of engineering students in Nigerian universities, one study used predictive analysis and a KNIME model; this creates an opportunity to recognize students who may either graduate with a low score or not graduate at all, so that early intervention may be deployed [7]. In another paper, the authors discuss how the fast growth of educational data points contributes to huge volumes of data that require a more advanced set of algorithms. Traditional data mining algorithms cannot be applied directly to every educational problem, since each problem may have a particular objective and function; a preprocessing algorithm has to be applied first, and only then can specific data mining methods be used. The preprocessing algorithm suggested there is clustering. EDM includes many studies focused on applying various data mining algorithms to educational attributes, and that paper gives a systematic literature review of clustering algorithms, their applicability, and their usability over three decades (1983–2016) in the context of EDM [1]. A survey paper discusses the problems faced by most institutes and colleges of higher education, such as student admissions, academic performance, and placements; many of them gathered and analyzed vast datasets of their students using various tools and techniques to extract useful hidden information [8]. Other authors discuss the current issue of unemployment, which adversely impacts the entire world. Several factors affect the employability of graduates, and academic grades were previously the most dominant element in determining an individual's employment status [9]. In another paper, the authors explain that data mining allows users to gain insight into the data and to draw decisions from the information extracted from databases. The EDM process can be used to analyze student performance based on numerous constraints to predict and evaluate whether a student will be placed during campus placement. Institutions can predict the performance of students in higher education and improve their standard of education by identifying pupils at risk and enhancing overall achievement, which in turn refines the educational resource management system and provides better opportunities for students to get placed [10]. Another paper presents a productive method for mining students' performance based on numerous specifications to predict and evaluate whether a student is recruited during the campus placement process; for these predictions, various machine learning algorithms such as J48, Naïve Bayes, and random forest are used [5, 11]. Finally, one study has a two-fold objective: first, to compare assessment in two subjects of the Business Administration degree between Finland and Spain, and second, to test how factors such as gender, age, subject, and students' motivation or priorities affect student assessment [12].

28.2.3 Review of Clustering-Based EDM Techniques

In one paper, the authors present an experimental study showing the correlation between students and professors for teaching students according to their talents, using clustering models such as K-means, expectation–maximization, and farthest-first. Identifying the root causes of dropout in postgraduate institutions is highly complicated; accordingly, three models combining clustering and regression are unified to predict course completion in higher education institutions (HEIs) in Brazil [13]. Another paper proposes newly developed approaches composed of K-means with linear regression and robust regression (RR), and K-means with support vector regression (SVR); four conventional approaches, SVR, bagging, linear regression, and RR, were also employed and their performance compared. That work follows the cross-industry standard process for data mining (CRISP-DM) methodology [14]. Many EDM algorithms have been used to estimate a student's grade point average in the next semester's courses, which can help identify dropout students at an early stage or help students choose suitable elective courses. The methods generally used are machine learning methods; however, accuracy varies with the dataset, and the characteristics of the dataset relative to the applied model strongly affect the performance of prediction models. For the grades of elective subjects missing from a dataset of undergraduate students, one group of authors used a distributed platform built on Spark; several methods based on a mix of collaborative filtering and matrix factorization were compared and their performance evaluated [15]. Another paper discusses EDM issues, treating the prediction of student scores in coming semesters as the problem of interest, with much work having selected various proficiency algorithms; the encouraging results can be used to recognize dropout students at an early stage or to help students choose their elective courses, which is very important. A few widely used methods for predicting student performance rely on techniques common in recommendation systems, such as collaborative filtering and matrix factorization [16]. Another study discusses the accurate prediction of students' performance in campus placement at an initial stage, so as to identify students at risk of not being employed; if such students are known beforehand, proactive actions can be taken to improve their performance. The authors propose the notion of a unified model based on clustering and classification: a consolidated predictive model is assembled by integrating clustering and classification methods to handle the collected data. At the preprocessing stage, two-level clustering (kernel K-means) plus a chi-square test is applied to automatically select suitable attributes, and then an ensemble vote classification approach combining four classifiers is applied to predict students' employability. A generalized solution for predicting student employability is proposed using this framework, and the comparative results clearly illustrate the model's superiority over individual classification approaches [16, 17]. Another study presents a fuzzy C-means clustering algorithm that uses 2D and 3D clustering to assess students' performance at Huaqiao University on the basis of exam results; from the experimental results for 2D and 3D clustering, educators can get a clear view of student performance to support professional decisions, and students can also obtain guidance about their performance from the extracted results [18]. Finally, a comparison with a traditional model has shown the effectiveness and scalability of newly deployed approaches for the school dropout problem. The key objective of the newly developed model is to identify useful predictors from learning content and to understand the inter-relationships among those metrics with the help of learning analytics (LA) and EDM, where the effects on various attributes of a student's performance are examined using disposition analysis. A K-means clustering model is applied to obtain clusters, which are then mapped to find the significant attributes of the learning content; the relationships among these features are identified and used to assess the student's performance [19].

28.2.4 Review of EDM Using Other Methods

The potential of data mining to derive important information from all available data makes it very helpful for predicting students' achievement and performance at the university level, and such studies observe that the number of students not managing to graduate on time increases greatly every year. In one paper, the authors explain how the prediction model supports both students and lecturers: it helps students select courses and prepare proper study plans, and it empowers lecturers and educational managers to observe and support students in finishing their programs with good results. A technique for anticipating student achievement using different deep learning approaches is proposed; the authors analyze and present several data preprocessing techniques and apply deep learning models such as long short-term memory (LSTM) networks and convolutional neural networks (CNNs) to the prediction tasks. As the methods considered provide a good forecast, they are expected to be adopted in practical cases [20]. In another paper, the first objective of the study is to test a structured method for implementing artificial neural networks (ANNs) to predict the academic achievement of students in higher education; the second objective is to study the relevance of several well-known predictors of academic achievement in higher education. The findings suggest that ANNs can be executed efficiently to classify students' academic performance as either high or low, and that ANNs outperform other ML algorithms in the recall and F1-score evaluation metrics. It is also found that the important predictors contributing to students' academic performance are their prior academic achievement, their socioeconomic background, and the characteristics of their high school. Toward the end, the study gives recommendations for applying ANNs and certain considerations for assessing academic achievement in higher education [16, 21]. Finally, one article provides a review of EDM during the 2015–2019 period. A large number of available algorithms are identified that allow studies to assess their findings effectively when examining students' academic achievement. Among the various algorithms, decision trees are the most used methods, with satisfying levels of efficiency; Naive Bayes, C4.5, and random forest algorithms were used more than others, while the KNN algorithm dominated among instance-based methods. In the literature studied, the most often used predictive variable is student scores. From an empirical point of view, EDM also needs to provide evidence of fulfilling educational policy or improving the learning process, something that has not yet been established. The authors' finding and suggestion is that EDM focused on earlier education levels is expected to have a deeper effect on education and on society as a whole [22].

28.3 Proposed Work

Most earlier efforts in this area, in our country probably until the last decade, were limited to predicting performance based on academic results. Although the educational system was not previously job-oriented, it has now become so, and the focus is on finding the reasons for the unemployment of undergraduate students during or after their studies. One way to understand these reasons is to observe and analyze the additional factors possessed by students who get placed at an early stage. Even in the present computerized era, most of the procedures carried out by the training and placement cell are largely manual. The task of collecting information about students therefore has to be automated first; the collected information can then be used to predict students' chances of employment or to identify areas that require improvement. Classifying a large volume of student data using a spreadsheet takes a lot of time, and coding the many conditions needed to classify a large dataset in a general-purpose programming language is also difficult [5]. Automation will not only reduce the time but also add efficiency to the prediction process. Based on this study, we propose a system that provides a competent method for mining student data based on various factors to predict and analyze the recruitment of a student during campus placement. Our proposed design has two parts: data analysis and data modeling.
Figure 28.1 shows the different phases of our system model.
Data Collection
To do this, we will start collecting information about the students in terms of academic performance and other required details. The user will have to enter the required details after logging into the system, and the data entered will be stored in our dataset, which will comprise academic and other related information about the students. We have started this collection by circulating forms shared with students. The data collected are quantitative or qualitative depending on the attributes, and collection will be through observations, surveys, and tests.

Fig. 28.1 Proposed system flow
Dataset Preprocessing
The major task once we collect the data will be preparing it for use. Because data quality plays a very important role, preprocessing is required: real-world data cannot be used as it is and needs several modifications. The collected data may be incorrect due to human or computer error, a faulty instrument used for data entry, or transmission errors, etc. Data cleaning will be done by filling missing values if any, smoothing noisy data if required, and resolving inconsistencies. If we use unprocessed data of poor quality, our model will not give the expected results.
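As a hedged illustration of this cleaning step, the minimal pandas sketch below fills missing values, smooths noisy numeric data, and removes one kind of inconsistency; the column name cgpa and the value ranges are assumptions for illustration, not fields mandated by the proposed system.

```python
import pandas as pd

def preprocess_student_data(df: pd.DataFrame) -> pd.DataFrame:
    """Basic cleaning of collected student records (illustrative sketch)."""
    df = df.drop_duplicates()

    # Fill missing values: numeric columns with the median, others with the mode
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].median())
        elif not df[col].mode().empty:
            df[col] = df[col].fillna(df[col].mode().iloc[0])

    # Smooth noisy numeric data by clipping extreme outliers to the 1st/99th percentile
    for col in df.select_dtypes("number").columns:
        low, high = df[col].quantile([0.01, 0.99])
        df[col] = df[col].clip(low, high)

    # Resolve an example inconsistency: CGPA (hypothetical column) must lie in [0, 10]
    if "cgpa" in df.columns:
        df = df[(df["cgpa"] >= 0) & (df["cgpa"] <= 10)]
    return df
```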
Data Integration
Then, we can carry out data integration of multiple databases, followed by data reduction, data transformation, and data discretization if required. Since this data is collected from surveys, test results, and other sources such as employer requirements, it needs to be brought into one place so that the required information can be extracted from it.
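A minimal sketch of this integration step is given below; the three CSV files and the student_id key are hypothetical placeholders for the actual survey, test, and academic sources.

```python
import pandas as pd

# Hypothetical exports from the different sources
academics = pd.read_csv("academic_records.csv")   # student_id, cgpa, backlogs, ...
surveys   = pd.read_csv("survey_responses.csv")   # student_id, domain_interest, ...
tests     = pd.read_csv("semester_tests.csv")     # student_id, aptitude_score, ...

# Bring everything into one table keyed on the student identifier
merged = (academics
          .merge(surveys, on="student_id", how="left")
          .merge(tests, on="student_id", how="left"))
merged.to_csv("integrated_student_dataset.csv", index=False)
```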
Analysis and Prediction
During this phase, we will use different algorithms to frame predictive models depending on the patterns observed. We will use Python and R, and also test our hypothesis using a standard statistical model. Sample input sets will be used to check the validity of the model. The model will predict the placement results by combining multiple algorithms, preferring a hybrid approach.
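One possible way to realise the hybrid approach mentioned above is a simple voting ensemble; the sketch below is only an assumption about how such a model could be built, with a hypothetical label column named placed and the integrated CSV from the previous step.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = pd.read_csv("integrated_student_dataset.csv")
y = data["placed"]                                   # 1 = placed, 0 = not placed (assumed)
X = data.drop(columns=["placed", "student_id"]).select_dtypes("number")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

hybrid = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(max_depth=5)),
        ("rf", RandomForestClassifier(n_estimators=100)),
    ],
    voting="soft",  # average the predicted class probabilities of the three models
)
hybrid.fit(X_train, y_train)
print("Placement prediction accuracy:", hybrid.score(X_test, y_test))
```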
Knowledge Extraction
After training, the model is deployed and can work in an iterative manner. The extracted knowledge can be used to judge which students require more assistance, along with who the rank holders are in academics and placements.
Result
The predicted results and improvements can be visualized for faster and better understanding.
Implementation
The working of our proposed model starts with students registering and logging in with their user credentials. Domain selection is according to the user's interest; the domains include software engineer, software tester, app developer, etc. Depending on the domain selected, the user will have to appear for a test. Tests will be conducted semester-wise. Based on the results of these tests, if the user is eligible, the system will terminate and give an indication that the user is ready for placement. If the user is not eligible, a report will be generated comprising the marks of all the tests along with suggestions for improvement. Based on the test results, the model will show the list of companies for which the student has satisfied the criteria. The student need not stop preparing but can keep improving his or her performance. The only concern in this work is that the data provided by the participants must be truthful so that the prediction model works accurately. This is ongoing work and is not yet complete. Some partial work is shown in Figs. 28.2 and 28.3.

Fig. 28.2 Register page for students

Fig. 28.3 Application form for students



28.4 Conclusions

Our proposed system will help institutions analyze the student’s performance during
the pre-placement phase. During this analysis along with the academic performance,
the technical skillset of students will also be tested which in turn can help in predicting
a good career path. This can also provide details about the student’s expertise to help
companies know them better and thereby eliminate unnecessary rounds for capable
candidates. The proposed model is expected not only to outperform the prediction
performance of different classifiers but also to help us find a suitable number of
attributes that can be considered and finalized to be a part of this model.
As future scope, similar models can be designed for class X students, adding more features that are important in helping them choose a better career path. The current system focuses on UG Engineering students but can also be extended to other graduate programmes.

References

1. Dutt, A., Ismail, M.A., Herawan, T.: A systematic review on educational data mining. IEEE
Access 5, 15991–16005 (2017). https://doi.org/10.1109/ACCESS.2017
2. Oussous, A., Benjelloun, F.Z., Ait Lahcen, A., Belfkih, S.: Big Data technologies: A survey. J. King Saud University—Computer Info. Sci. 30(4), 431–448 (2017)
3. Ongsulee, P., Chotchaung, V., Bamrungsi, E., Rodcheewit, T.: Big Data, predictive analytics,
and machine learning. 16th International Proceedings on ICT and Knowledge Engineering
(ICT&KE) IEEE, pp. 37–42 (2018)
4. Pappas, I.O., Mikalef, P., Giannakos, M.N., Krogstie, J., Lekakos, G.: Big data and business
analytics ecosystems: paving the way towards digital transformation and sustainable societies.
Info. Syst. e-Business 16, 479–491 (2018)
5. Yadav, D., Shinde, O., Singh, A., Deshmukh, A.: Placement recommender and evaluator. Int.
Res. J. Tech. 6(4), 1926–1931 (2019)
6. Poornima, S., Pushpalatha, M.: A survey of predictive analytics using big data with data mining.
Int. J. Bioinformatics Research Appl. 14(3), 269–282 (2018)
7. Adekitan, A.I., Sala, O.: The impact of engineering students’ performance in the first three
years on their graduation result using educational data mining. Elsevier Ltd., 1–21 (2019).
https://doi.org/10.1016/j.heliyon.2019.e01250
8. Gour, S., Gour, M.: Study of tools and techniques used to analyzing student’s database for
academic growth in higher education: a review. WWJMRD 4(2), 139–143 (2018)
9. Othman, Z., Shan, S.W., Yusoff, I., Kee, C.P.: Classification techniques for predicting graduate
employability. Int. J. Advanced Science Engineering Information Technology 8(4–2), 1712–
1720 (2018)
10. Agarwal, K., Maheshwari, E., Roy, C., Pandey, M., Rautray, S.S.: Analyzing student perfor-
mance in engineering placement using data mining. Proceedings of International Conference
on Computational Intelligence and Data Engineering, Lecture Notes on Data Engineering and
Communications Technologies 28 Springer Nature Singapore, pp. 171–181. https://doi.org/10.
1007/978-981-13-6459-4_18
11. Rao, K.S., Swapna, N., Kumar, P.P.: Educational data mining for student placement prediction
using machine learning algorithms. Int. J. Engineering Technological Sciences 7(12), 43–46
(2018)
12. Minano, M.C., Campo, C., Grande, E.U., Ezama, D.P., Akpinar, M., Rivero, C.: Solving the
mystery about the factors conditioning higher education students’ assessment: Finland versus
Spain. J. Education Training 62(6), 617–630 (2020). https://doi.org/10.1108/ET-08-2019-0168
13. Nájera, A.U., Calleja, D., Medina, M.A.: Associating students and teachers for tutoring in
higher education using clustering and data mining. Comput. Appl. Eng. Educ. 25(5), 823–832
(2017)
14. Lima, M.N., Soares, W.L., Silva, I.R., Fagundes, R.A.: A combined model based on clustering
and regression to predicting school dropout in higher education institution. Int. J. Comp. Appl.
176(34), 1–8 (2020)
15. Mai, L., Phat, D.N., Chung, M., Thoai N.: An apache spark-based platform for predicting
the performance of undergraduate student. 21st International Conference on High Perfor-
mance Computing and Communications, 191–199. https://doi.org/10.1109/HPCC/SmartCity/
DSS.2019.00041
16. Mai, T.L., Do, P., Chung, M., Le, V.T., Thoai, N.: Adapting the score prediction to character-
istics of undergraduate student data. International Conference on Advanced Computing and
Applications, 70–77. https://doi.org/10.1109/ACOMP.2019.00018
17. Thakar, P., Mehta, A., Manisha: A unified model of clustering and classification to improve
students’ employability prediction. Int. J. Intelligent Systems and Applications 9(9), 10–18
(2017). https://doi.org/10.5815/ijisa.2017.09.02
18. Li, Y., Gou, J., Fan, Z.: Educational data mining for students' performance based on fuzzy C-means clustering. J. Engineering 11, 8245–8250 (2019)
19. Bharara, S., Sabitha, S., Bansal, A.: Application of learning analytics using clustering data
Mining for Students’ disposition analysis. Educ. Inf. Technol. 23(2), 957–984 (2018)
20. Thanh, D.T., Nghe, N.T., Hai, N.T., Sang, L.H.: Deep learning with data transformation and
factor analysis for student performance prediction. Int. J. Advanced Computer Science and
Applications 11(8), 711–721
21. Hernandez, C.R., Musso, M., Kyndt, E., Cascallar, E.: Artificial neural networks in academic
performance prediction: Systematic implementation and predictor evaluation. Computers and
Education: Artificial Intelligence 2(2021), 1–14. https://doi.org/10.1016/j.caeai.2021.100018
22. Papadogiannis, I., Poulopoulos, V., Wallace, M.: A critical review of data mining for education:
What has been done, what has been learnt and what remains to be seen. Int. J. Educational
Research Review 5(4), 353–372. https://doi.org/10.24331/ijere.755047
Chapter 29
Clipped RBM and DBN Based
Mechanism for Optimal Classification
of Brain Cancer

Neha Ahlawat and D. Franklin Vinod

Abstract In healthcare, Big Data analytics has attracted great attention among the research community. Healthcare records are enormously challenging not only because of their volume but also because of the diversity of the data sets and their huge dimensions. Recent developments have shown deep learning models to be extremely strong generative models that can extract features automatically and achieve high predictive performance. The main aim of this work is to construct a framework that may assist in tumour detection and diagnosis from cerebrum (brain) MR images through the suggested Clipped-RBM tumour detection algorithm. Brain functioning can be analyzed by many modalities such as Ultrasound, SPECT, Computed Tomography (CT), X-rays (plain film) and Magnetic Resonance Imaging (MRI). MR images are used in the introduced mechanism because of their superior image quality and capability of identifying minute features. The results demonstrate that the proposed unsupervised feature learning architecture has more powerful feature representation and higher classification precision than existing state-of-the-art models.

Keywords Computed tomography (CT) · MRI · Binary2 RBM · PSO · Deep learning (DL) · Convolutional neural network (CNN)

29.1 Introduction

The worldwide accessibility of the internet, advanced mobile technologies, sensors, and high-performance computing devices has produced an astonishing collection of data through digital interaction with users. Extraction of useful information from such a massive amount of data is quite a challenging task that requires more developed platforms to perform Big Data analysis effectively. The fundamental task of Big Data analysis is to extract useful patterns from the colossal amount of data that can be used in decision making and prediction. The role of Big Data analytics in

N. Ahlawat (B) · D. F. Vinod
Department of Computer Science and Engineering, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Delhi-NCR Campus, Delhi-Meerut Road, Modinagar, Ghaziabad, Uttar Pradesh, India


medical imaging has been growing substantially over the previous decades. In healthcare, Big Data analytics has attracted great attention among the research community [1]. Healthcare records are enormously challenging not only because of their volume but also because of the diversity of the data sets and their huge dimensions.
Consequently, the area of Big Data and computational perception offers a splendid prospect for building a higher-quality health system. In Big Data, we handle raw data which is predominantly unlabeled and uncategorized. To manage Big Data investigation, a significant sub-field of machine learning known as deep learning is utilized to extract helpful information from the Big Data. Even though traditional image classification procedures have been extensively utilized in practical problems, they have drawbacks such as low segmentation accuracy, weak adaptive capability, etc. The deep learning model has an incredible perception capacity, which integrates the feature extraction and classification procedures into a whole to complete the image classification process. Nowadays deep learning (DL) is a major focus area in medical image analysis which works well on unlabelled data [2]. An essential strength of DL is the efficient analysis and learning of colossal amounts of unlabelled data, which makes it an ideal tool for Big Data Analytics where raw information is largely unlabeled and uncategorized.
Deep learning-based algorithms have shown promising performance as well as speed in various domains like speech recognition, text recognition, lip reading, CAD systems, face recognition, drug discovery and so on [3]. Deep learning (DL) has recently shown astonishing performance, particularly in classification and detection problems. This paper aims to construct a framework that may assist in tumour detection and identification from cerebrum (brain) MRI pictures through the introduced tumour detection algorithm. Brain tumour recognition and classification conventionally rely upon histopathological assessment of tissue removed from the lesion. The present strategy is invasive, inefficient, and prone to errors. These limitations show how significant it is to implement a fully computerised strategy for multi-classification of brain tumours based on deep learning [4].
Deep learning models are being utilised to diagnose brain tumours using magnetic resonance imaging. The precise detection of a brain tumour is a tough task because of the complicated structure of the brain. The unexpected growth of cells in the body is defined as cancer, and a cerebrum (brain) tumour is a most serious disease which occurs due to the unusual development of tissues around or inside the brain. In human anatomy, the brain is a key organ which controls the functioning of our central nervous system and many other important organs of the body [5]. Any abnormal expansion of cells in the human skull may affect the functioning of important organs of the body, and it may also spread and affect other parts. Brain tumours are mainly categorized into primary (benign or noncancerous) and secondary (metastatic) tumours. Metastatic tumours are considered cancerous or malignant. Generally, glioma is a very common kind of deadly tumour, which in its highest grade may lead to a short life expectancy. The symptoms vary according to the location and shape of the tumour as well as its type. This disease spreads quickly by entering different tissues of the brain and makes the state of a patient worse [6]. The analysis also becomes difficult because of the varying shape, size, and location of the tumour in the brain. It is also noted in the literature that it is extremely difficult to identify a tumour in its earlier stages.
In the medical field, many different diagnostic methods like ultrasound images, Computed Tomography (CT) images, X-rays, and Magnetic Resonance Images (MRI) are used to assess the function, structure and diseases that influence the human brain. In neurology, an MRI examination is most popular for visualizing the minute attributes of the brain. It is predominantly utilized for brain tumour detection because it is a painless and safe test [7]. The result of this investigation reveals whether the functioning of the brain is normal or unusual.
This paper is organized as follows: Sect. 29.1 presents the introduction and theoretical background of the paper, Sect. 29.2 discusses the existing investigations done in the area of brain tumour detection, Sect. 29.3 is divided into two subsections: Sect. 29.3.1 describes the existing methodology while Sect. 29.3.2 gives detailed knowledge of the proposed system, and Sect. 29.4 is devoted to results and discussion, followed by the conclusion in Sect. 29.5.

29.2 Related Work

Some of the prevalent works related to classification and training are discussed briefly here. In [8], Hossam H. Sultan et al. suggested a model based on a convolutional neural network to characterize diverse brain tumours by utilizing two openly accessible datasets: one taken from Nanfang Hospital and the other acquired from the General Hospital of Tianjin Medical University (TMU), China. The first dataset contains growths of different types of tumours, while the other separates the gliomas into three different grades. Their proposed architecture accomplished the highest accuracies of 96.13% and 98.7% on the above-mentioned two data records. This model is built from many layers, beginning from the input layer which holds the pre-processed pictures, which then pass through different layers of convolution and their activation functions (convolution, ReLU, normalization and max-pooling layers).
In [9], Kamanasish Bhattacharjee et al. examined a classification approach for molecular brain neoplasia performed by designing a multi-layer perceptron (MLP). The training of the MLP is carried out utilizing a Hybrid Genetic Algorithm (HGA) and Particle Swarm Optimization (PSO). The trained MLP then effectively classifies the six benchmark datasets. With the GA-based training, 100 percent accuracy was accomplished for the Sigmoid function dataset as well as for XOR. However, with the increase in complexity of the datasets, for example the Iris and Breast Cancer datasets, the accuracy deteriorates to 90% using PPSOGA2 and to 90% for the Iris dataset.
In the paper [10], Mohamed Arbane et al. suggested a model for the classification of brain cancers from MRI pictures utilizing a convolutional neural network (CNN)-based transfer learning strategy. The implemented model investigates various CNN architectures, specifically ResNet, Xception and MobileNet-V2. The dataset used in this examination is separated into three parts for training, validating, and evaluating the suggested DL model. The main subset is used to fit the model and contains 80% of the entire dataset. The remainder is correspondingly split for validating and testing the framework. This framework accomplished the finest outcomes with 98.25% and 98.43% in respect of accuracy and F1-score, respectively.
Hasan Ucuzal et al. detailed the idea of web-based software which can characterize brain cancers (glioma, pituitary, meningioma) from high-accuracy T1-contrast magnetic resonance pictures utilizing a convolutional neural network deep learning algorithm. As per the performance results, all the determined metrics are over 98% for classifying the types of cerebrum cancers on the given training dataset. There is a total of 3061 MR picture scans for glioma, meningioma, and pituitary cerebrum growths; 2590 of these MR pictures are utilized in the training stage and the remaining 466 in the testing stage [11].

29.3 Methodology

29.3.1 Introduction of Basic DBN

A Deep Belief Network (DBN), which comprises stacked restricted Boltzmann machines (RBMs), is a vital unsupervised DL model, and the RBM is fundamental for the construction of a DBN [12, 13]. An RBM is a bidirectional, symmetric stochastic neural network model with two layers. The foremost category of RBM is the Binary2 RBM, which has binary visible and hidden parameters [14]. It consists of p visible units v = (v1, …, vp) and q hidden units h = (h1, …, hq). The standard RBM is primarily used to model binary data; the two random variables (v, h) take the values (v, h) ∈ {0, 1}^(p+q). The energy function of the RBM is defined as in Eq. (29.1):


E(v, h) = -\sum_{i=1}^{p} b_i v_i - \sum_{j=1}^{q} c_j h_j - \sum_{i=1}^{p} \sum_{j=1}^{q} v_i W_{ij} h_j \quad (29.1)

where bi indicates the ith bias term related with the ith visible parameter vi, and cj indicates the jth bias term related with the jth hidden parameter hj. In addition, Wij indicates the weight related with the ith visible parameter vi and the jth hidden parameter hj [15, 16].
The structure of an RBM has connections only between the hidden layer and the visible layer; it does not have links between units in the same layer. That is the reason it is called "restricted" [17]. This restriction is useful, since the hidden variables are conditionally independent given the state of the visible units, and vice versa, from the viewpoint of probability; these conditionals are given in Eqs. (29.2) and (29.3).
 
  
P(h_j = 1 \mid v) = \mathrm{sigmoid}\Big(c_j + \sum_{i \in v} v_i W_{ij}\Big) \quad (29.2)

P(v_i = 1 \mid h) = \mathrm{sigmoid}\Big(b_i + \sum_{j \in h} h_j W_{ij}\Big) \quad (29.3)

where the sigmoid activation function is \sigma(x) = \frac{1}{1 + \exp(-x)}, and W_{ij} is the weight matrix between the visible units and the hidden units.
After the pre-training phase of an RBM, another RBM layer is stacked on top of the trained RBM to make a DBN. The input of the entire DBN corresponds to the input of the first-layer RBM, and its output corresponds to the output of the last-layer RBM. For each RBM after the first layer, the output of the previous RBM provides its input. The DBN, as displayed in Fig. 29.1, is a deep architecture whose fine-tuning (adjustment) phase optimizes the weights by minimizing the entropy error [18, 19].

Fig. 29.1 DBN architecture (visible layer, stacked RBM hidden layers 1 to N, output layer, and fine-tuning phase)


The learning process of an RBM maximizes the log-likelihood so that the distribution learned by the RBM becomes as close as possible to the distribution of the input data. The derivative of the sigmoid activation function suffers from the vanishing gradient problem, which leads to unstable behaviour during RBM training. To overcome this gradient problem, we propose the clipped RBM concept with ReLU activation for firing the neurons of the hidden units.
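To make the conditionals in Eqs. (29.2)–(29.3) and the log-likelihood training idea concrete, the following NumPy sketch performs one contrastive-divergence (CD-1) update for a plain binary RBM. It is a minimal illustration of the standard algorithm, not the authors' implementation, and it does not yet include the clipping proposed in the next subsection.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.01):
    """One CD-1 update. v0: (n, p) visible batch; W: (p, q); b: visible bias; c: hidden bias."""
    # Positive phase: sample hidden units from P(h = 1 | v), Eq. (29.2)
    ph0 = sigmoid(c + v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: reconstruct the visible layer from P(v = 1 | h), Eq. (29.3)
    pv1 = sigmoid(b + h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(c + v1 @ W)
    # Approximate log-likelihood gradients and update the parameters
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c

# Toy usage: 64 visible units, 32 hidden units, one mini-batch of binary vectors
p, q = 64, 32
W = 0.01 * rng.standard_normal((p, q))
b, c = np.zeros(p), np.zeros(q)
batch = (rng.random((16, p)) > 0.5).astype(float)
W, b, c = cd1_step(batch, W, b, c)
```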

29.3.2 Clipped Restricted Boltzmann Machine (C-RBM)

The vanishing and exploding gradient problems are common in deep neural networks. Because the networks are usually deep, the weights are adjusted many times; they may tend to shrink over time, resulting in vanishing gradients, or they may become too large, resulting in exploding gradients. The idea here is fundamentally to set up a rule for avoiding exploding gradients. A clipped ReLU layer performs a threshold operation, where any input value below zero is set to zero and any value above the clipping ceiling is set to that clipping value. Training the RBM hidden layers using this clipping concept can help reduce the problem of exploding gradients. ReLU uses the straightforward rule in Eq. (29.4); it is simple and does not require heavy computation, as there is no convoluted math.

\mathrm{ReLU}: \quad f(x) = \max(0, x) = \begin{cases} x, & \text{for } x > 0 \\ 0, & \text{for } x \le 0 \end{cases} \quad (29.4)

Gradient clipping, as in Eq. (29.5), involves forcing the gradient values (element-wise) to a particular minimum or maximum value if the gradient exceeds a normal range. For a clipping value z (> 0), it computes

Clipped ReLU (x, z) = min(max(0, x), z) (29.5)

The underflow or overflow of weights, characterised as vanishing or "exploding gradients", makes the training process unstable and the network fail to learn, so that the model is rendered unusable. By adjusting the error gradients, either by scaling the vector norm or by clipping the gradient values to a range, the training cycle can be made more stable. After the learning process of the improved RBM, we use the fine-tuning algorithm to alter the parameters of the C-RBM after the first layer of unsupervised learning with the clipped RBM. The trained RBM's hidden layer is then used as the visible layer of the next RBM. The DBN performs unsupervised and supervised learning whenever a new hidden layer is introduced. After alternating between unsupervised and supervised learning, all the network's weights reach the lowest training error.
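The clipping operations in Eqs. (29.4) and (29.5) can be expressed directly in NumPy; the snippet below is a small illustration in which the clipping ceiling z = 6.0 and the gradient range [-1, 1] are arbitrary assumptions, not values taken from the proposed model.

```python
import numpy as np

def relu(x):
    """Eq. (29.4): f(x) = max(0, x)."""
    return np.maximum(0.0, x)

def clipped_relu(x, z=6.0):
    """Eq. (29.5): min(max(0, x), z) — values below 0 become 0, values above z become z."""
    return np.minimum(np.maximum(0.0, x), z)

def clip_gradients(grad, clip_value=1.0):
    """Element-wise gradient clipping to [-clip_value, clip_value] to curb exploding gradients."""
    return np.clip(grad, -clip_value, clip_value)

x = np.array([-2.0, 0.5, 3.0, 9.0])
print(clipped_relu(x))                              # [0.  0.5 3.  6. ]
print(clip_gradients(np.array([-5.0, 0.2, 7.0])))   # [-1.   0.2  1. ]
```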
29.4 Result

29.4.1 Dataset Information

To estimate the performance of the suggested method, we conducted an experiment on the BRATS dataset, which contains images of different patients with each modality comprising various slices. The clinical image information comprises 85 multi-contrast MR scans from glioma patients, out of which 28 were acquired from low-grade (histological assessment: astrocytomas or oligoastrocytomas) and 41 from high-grade (anaplastic astrocytomas and glioblastoma multiforme tumours) glioma patients.

29.4.2 Experimental Analysis

The developed model classifies MR images of brain tumours using a deep belief network image classification approach based upon the C-RBM. To distinguish the MR images of brain cancer, the experiment uses a DBN image classification algorithm based on the C-RBM. Table 29.1 compares the suggested model to existing traditional models in terms of performance. To assess the results, four criteria are utilized, i.e. Accuracy (A), Specificity (S2), Sensitivity (S1), and F1-Score (FS). These performance criteria are determined by Eqs. 29.6, 29.7, 29.8 and 29.9, described below.

Accuracy(A) = (TP + TN)/(TP + TN + FP + FN) (29.6)

Sensitivity(S1 ) = TP/(FN + TP) (29.7)

Specificity(S2 ) = TN/(FP + TN) (29.8)

Table 29.1 Performance comparison between deep belief network vs other traditional models
Classification approaches Accuracy (A) Specificity (S 2 ) Sensitivity (S 1 ) F-score (FS)
FCM 0.85 0.83 0.87 0.84
KNN 0.77 0.80 0.78 0.76
K-SVM 0.90 0.87 0.91 0.89
CNN 0.91 0.90 0.87 0.88
RCNN 0.89 0.86 0.88 0.90
DBN 0.93 0.91 0.94 0.96
Fig. 29.2 Performance comparison chart: comparative analysis of FCM, KNN, K-SVM, CNN, RCNN, and DBN in terms of accuracy, specificity, sensitivity, and F-score

F1 − score (FS) = 2TP/(2TP + FP + FN) (29.9)

In Eqs. 29.6, 29.7, 29.8 and 29.9, TP indicates the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives [20]. Figure 29.2 compares the different methods based on their Sensitivity (S1), Specificity (S2), Accuracy (A) and F1-score (FS).
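The four criteria in Eqs. (29.6)–(29.9) can be computed directly from a confusion matrix; the short sketch below uses scikit-learn with purely illustrative labels (1 = tumour, 0 = non-tumour), not the actual BRATS results.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])   # illustrative ground truth
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 1])   # illustrative model output

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy    = (tp + tn) / (tp + tn + fp + fn)        # Eq. (29.6)
sensitivity = tp / (tp + fn)                         # Eq. (29.7)
specificity = tn / (tn + fp)                         # Eq. (29.8)
f1_score    = 2 * tp / (2 * tp + fp + fn)            # Eq. (29.9)
print(accuracy, sensitivity, specificity, f1_score)
```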

29.5 Conclusion

The main aim of this paper is to introduce a model that can detect brain tumours from MR images. The Clipped RBM and DBN based brain tumour detection algorithm is successfully implemented and applied on images. The primary objective of initializing the parameters is to prevent the layer activation outputs from exploding or vanishing during forward propagation. If either issue occurs, the loss gradients will either be excessively large or too small, and the network will take more time to converge. Our proposed architecture showed better outcomes and overcame these issues effectively. More research will be conducted in the future to evaluate how to upgrade the recommended model and test it with other cancer databases for real-time applications.

References

1. Jan, B., Farman, H., Khan, M., et al.: Deep learning in big data Analytics: A comparative study.
Elsevier 75, 275–287 (2017)
2. Somasundaram, S., Gobinath, R.: Current trends on deep learning models for brain tumor
segmentation and detection. International Conference on Machine Learning, Big Data, Cloud
and Parallel Computing (COMITCon), pp. 217–221. IEEE, Faridabad, India (2019)
3. Yu, Y., Li, M., Liu, L., Li, Y., Wang, J.: Clinical big data and deep learning: Applications,
challenges, and future outlooks. Big Data Mining Anal. TUP 2(4), 288–305 (2019)
4. Mallick, P.K., Ryu, S.H., Satapathy, S.K., Shruti, et.al.: Brain MRI image classification for
cancer detection using deep wavelet autoencoder based deep neural network. IEEE Access 7,
46278–46287 (2017)
5. Hashemzehi, R., Mahdavi, S.J.S., Kheirabadi, M., et al.: Detection of brain tumors from MRI
images base on deep learning using hybrid model CNN and NADE. Elsevier 40(3), 1225–1232
(2020)
6. Zhao, L., Jia, K.: Deep feature learning with discrimination mechanism for brain tumor
segmentation and diagnosis. International Conference on Intelligent Information Hiding and
Multimedia Signal Processing (IIH-MSP), pp. 306–309. IEEE Adelaide, SA, Australia (2015)
7. Bhanumathi, V., Sangeetha, R.: CNN based training and classification of MRI brain images.
5th International Conference on Advanced Computing & Communication Systems (ICACCS),
pp. 129–133. IEEE Coimbatore, India (2019)
8. Sultan, H.M., Salem, N.M., Al-Atabany, W.: Multi-classification of brain tumor images using
deep neural network. IEEE Access 7, 69215–69225 (2019)
9. Bhattacharjee, K., Pant, M.: Hybrid particle swarm optimization-genetic algorithm trained
multi-layer perceptron for classification of human glioma from molecular brain neoplasia data.
Cognitive Syst. Res. 58, 173–194 (2019)
10. Arbane, M., Benlamri, R., Brik, Y., et.al..: Transfer learning for automatic brain tumour classifi-
cation using MRI images. 2nd International Workshop on Human-Centric Smart Environments
for Health and Well-being (IHSH), pp. 210–214. IEEE, Boumerdes, Algeria (2020)
11. Ucuzal, H., Yasar, S., Colak, C.: Classification of brain tumour types by deep learning
with convolutional neural network on magnetic resonance images using a developed web-
based interface. 3rd International Symposium on Multidisciplinary Studies and Innovative
Technologies (ISMSIT), pp. 1–5. IEEE, Ankara, Turkey (2019)
12. Jemimma, T.A., Raj, Y.J.V.: Brain tumor segmentation and classification using deep belief
network. Intelligent Computing and Control Systems-ICICCS, pp. 1390–1394. IEEE, Madurai,
India (2018)
13. Kwon, Y.-M., Kwon, Y.-W., Chung, D.-K., et.al.: The comparison of performance according
to initialization methods of deep neural network for malware dataset. Int. J. Innovative Tech.
Exploring Eng. (IJITEE) 8(4S2), 57–62 (2019)
14. Thahseen, P., Anish, K.B.: A deep belief network based brain tumor detection in MRI images.
Int. J. Science Res. (IJSR) 6(7), 495–500 (2017)
15. Wei, J., Lv, J., Yi, Z.: A new sparse restricted Boltzmann machine. Int. J. Pattern Recognition
Artificial Intell. 33(10) (2019)
16. Zhu, Y., Zhang, Y., Pan, Y.: Large-scale restricted Boltzmann machines on single GPU. IEEE
International Conference on Big Data, pp. 169–174. IEEE Silicon Valley, CA, USA (2013)
17. Hinton, G., Osindero, S., Teh, Y.W.: A fast-learning algorithm for deep belief nets. Neural
Comput. 18, 1527–1554 (2006)
18. Ali, S.A., Raza, B., Malik, A.K., Shahid, et.al.: An optimally configured and improved deep
belief network (OCI-DBN) approach for heart disease prediction based on Ruzzo–Tompa and
stacked genetic algorithm. IEEE Access 8, 65947–65958 (2020)
19. Wenyu, Z., Lu, R., Lei, W.: A method of deep belief network image classification based on
probability measure rough set theory. Int. J. Pattern Recognition Artificial Intell. 32 (2018)
20. Zararsiz, G., Akyildiz, H.Y., Goksuluk, D., et al.: Statistical learning approaches in diagnosing
patients with nontraumatic acute abdomen. Turk. J. Electr. Eng. Comput. Sci. 24, 3685–3697
(2016)
Chapter 30
Air Quality Prediction Using Supervised
Machine Learning Techniques

Atul Lal Shrivastava and Rajendra Kumar Dwivedi

Abstract To maintain good air quality, a pollution-level monitoring framework estimates a variety of air contaminants in various locations. In the current circumstances, this is a most pressing concern. The release of hazardous gases into the environment by industries, vehicles, and other sources pollutes the air. These days, the level of pollution in the air has surpassed basic limits, and the amount of contamination in several major urban areas has exceeded the government's air quality index criteria. It has a tremendous impact on human health. With advances in machine learning technology, it is now possible to forecast future levels of toxins based on historical data. In this paper, we describe a device that can collect current toxic gas readings. With the help of the previous records of toxic gases, we executed a computation using machine learning techniques to detect contamination in the air. Machine learning can thus be used to forecast future contamination information. The information gathered is saved in an Excel document for further analysis. To collect contamination data, the respective sensors are used on an Arduino.

Keywords Machine learning · Supervised learning · Air quality prediction · Air pollution · Air quality index (AQI)

30.1 Introduction

Measurement of air pollution is becoming more important at this time because it has a big impact on human health as well as on the ecological balance. Apart from hazardous environmental effects, air pollution has a negative impact on one's health, productivity, and energy efficiency. Because air pollution has resulted in several dangerous repercussions for individuals, it should be monitored on a regular basis to ensure that it is appropriately regulated. One method for controlling it is knowing the source, intensity, and location of its origin. Monitoring frameworks record a series of
A. L. Shrivastava (B) · R. K. Dwivedi
Department of Information Technology and Computer Application, MMMUT, Gorakhpur, India
e-mail: atulsha08@gmail.com


Table 30.1 AQI category and health issues

Levels of health concern (AQI category) | AQI value range | Explanation and health issues
Good | 0–50 | The air quality is excellent, and pollution poses little or no danger
Moderate | 51–100 | Although the air quality is adequate, some pollutants may provide a moderate health risk to a small number of people who are typically sensitive to air pollution
Unhealthy for sensitive groups | 101–150 | Health issues may arise for members of vulnerable communities; it seems unlikely that the general population will be impacted
Unhealthy | 151–200 | Members of sensitive populations may have more serious health repercussions than others
Very unhealthy | 201–250 | Emergency health alerts; it is more likely that the entire population will be impacted
poisonous gases at each place. The WHO has issued a warning about the degree of pollution in the country, telling us that we can no longer afford to keep polluting the air.
Table 30.1 explains the AQI categories and the corresponding health issues. Air monitoring is a technique for determining the ambient levels of contaminants in the atmosphere. As air pollution continues to rise, monitoring has become a crucial responsibility. The monitoring data allow us to determine the source and severity of contaminants in a given location. We can use this data to take steps to minimise pollution levels so that we can breathe clean air. Air pollution has been increasing, and it has a negative influence on both the environment and human health. As the concentration of gases in the atmosphere rises, such gases have a greater impact on the human body, potentially resulting in dangerous consequences. Because of the rise in contaminants in the air, air pollution also influences seasonal rainfall; the amount of rain that falls is affected. As a result, air quality must be continually monitored.
O3, NO2, CO, and microscopic particulate matter are the most common air pollutants. They have the potential to harm humans and are primary causes of cancer, birth abnormalities, and breathing problems. Pollution levels today are rising particularly because of PM2.5, which can cause circulatory system difficulties, lung cancer, as well as other pulmonary and respiratory problems. Air pollution causes long-term harm to the liver, kidneys, brain, nerves, and other parts of the human body. The air quality index is a piecewise-linear function of pollutant concentrations; at the boundaries between air quality index categories, there is a discontinuous jump from one to the next. The equation below is an example of how the AQI is determined from the concentration. Machine learning techniques help to predict air quality in the environment [1–3]. There are various such techniques available under supervised and unsupervised machine learning categories [4, 5].
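The equation referred to above is not reproduced in the original text; a commonly used form of the piecewise-linear relation is the US EPA sub-index formula, included here only as the standard definition:

I = \frac{I_{hi} - I_{lo}}{C_{hi} - C_{lo}} \left( C - C_{lo} \right) + I_{lo}

where C is the measured pollutant concentration, C_lo and C_hi are the concentration breakpoints that bracket C, I_lo and I_hi are the AQI values corresponding to those breakpoints, and the overall AQI is the maximum of the sub-indices computed for the individual pollutants.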

30.2 Literature Review

Anikender Kumar and Pramila Goyal [6] presented a study relying on previous data: records from 2000 to 2005 were used to predict the daily air quality index for the year 2006. Then, using the multiple linear regression technique, the predicted AQI values were compared with the observed AQI values. The independent variables were identified using principal component analysis.
Huixiang Liu et al. [7] presented a study in which Beijing and an Italian city were chosen as the two sites. The first dataset includes hourly averaged AQI as well as ozone and sulphur dioxide values from December 2013 to August 2018 and has 1738 instances. The second dataset, consisting of 9358 cases, was collected from an Italian city between March 2004 and February 2005. This dataset includes hourly average concentrations of CO, NMHC, Benzene, NOx, and NO2. Because NOx is one of the most valuable targets, they concentrated mostly on its prediction. They employed SVR and RFR algorithms to estimate AQI and NOx concentrations. In terms of forecasting AQI and NOx concentration, SVR outperforms RFR.
Ziyue Guan and Richard O. Sinnott [8] described that the PM2.5 concentra-
tion was predicted using many different algorithms. Data for the city of Melbourne
were acquired from the Environmental Protection Agency’s official Webpage (EPA),
which includes the PM2.5 air parameter, as well as unofficial data from air-beam,
a mobile device designed to monitor PM2.5 values. LSTM performs the best and
predicts high-PM2.5 values with decent accuracy.
Heidar-Maleki et al. [9] employed an ANN machine learning method to predict air pollution concentrations over the period August 2009 to August 2010. Factors including meteorological measurements, air pollution concentrations, time, and date are all used as inputs to the ANN algorithms.
Aditya C.R. et al. [10] forecasted PM2.5 concentrations for a specific day. They employed the logistic regression technique to classify the air as polluted or not polluted. They also conducted research on benzene, which is among the pollutants for which they forecasted future trends. They forecasted the future levels of the listed pollutants on the basis of previous data, utilising data analytics and time series regression forecasting. The Anand Vihar and Shadipur monitoring stations in Delhi were investigated based on the findings of this study. The findings reveal a significant increase in PM10, NO2, and PM2.5 concentrations, indicating that Delhi is becoming more polluted. Mohamed Shakir and Naresh [11] performed their research on data provided by Karnataka's pollution control board. The study reveals the relationships or correlations between environmental parameters using K-means clustering methods.

Kazem Naddafi et al. [12] used WHO's AirQ programme, which offers quantifiable data on the health impact of air pollution. All-cause mortality, cardiovascular illnesses, and respira-
tory diseases were all taken into account. “The study’s findings suggest that the air
pollutant PM10 had the greatest health impact on Tehran City’s 8,700,000 residents,
causing an excess of overall mortality of 2194 deaths out of 47,284 in a year.”
Gunasekaran et al. [13] reported that the yearly average concentrations of pollutants like sulphur dioxide, nitrogen oxides, and suspended particulate matter are within national guidelines, and the studied location has therefore been determined to have no major pollution issues. However, the pollutant PM10 has a slightly higher yearly average concentration than the national standard; in the same year, except from July to October, the monthly 24-h average PM10 concentrations exceeded the national standard threshold. For the city of Athens, Greece, researchers used ANN and MLR techniques to predict PM10 concentrations over a two-year period. The dataset is partitioned into three unequal subsets before being given as input to the ANN. This study also included a performance comparison between ANN and MLR, which revealed that ANN outperforms MLR. According to the findings, if an ANN is correctly trained, it will provide sufficiently accurate prediction results.

30.3 Data and Methods

In this section, the data collection and the methods used are explained below.
A. Data Collection
Delhi, the Indian capital, has a population of over 10 million people and is one of the country's fastest-growing industrial metropolises (worldpopulationreview.com). The goal of this study was to forecast the air quality in six districts of Hyderabad that have continuous ambient air quality monitoring.
B. Methods
We have used the linear regression, decision tree, and random forest machine learning algorithms to predict air quality; an illustrative sketch of these models follows the list below. Figure 30.1 presents the methodology of predicting air quality using the machine learning approach.
i. Linear Regression: It is the most basic regression method and involves several independent variables. In this study, six air contaminants are used as independent variables to calculate the AQI value: D = B + M1 I1 + M2 I2 + M3 I3 + M4 I4 + M5 I5 + M6 I6, where the predicted dependent variable (AQI) is D and the intercept is B. The computed regression coefficients are M1, …, M6, whilst the independent variables are I1, …, I6.
ii. Decision Tree: A binary tree is used to separate the data points in this method. There are two types of nodes: internal (decision) nodes and terminal nodes, also known as leaf nodes. In general, there are three factors to consider. With the best splits, the node impurity decreases [9]; as a result, homogeneity rises for every child node. Given a new data point, the best leaf node will be chosen, and the average of all the data points in that node will be the predicted value of the new data point. Finding the tree split that most effectively lowers the impurity of the child nodes is therefore important. Because variance is the measure of impurity for regression, it is necessary to compute the variance reduction using a standard formula; the lower the impurity, the greater the variance reduction value. The split that reduces variance the most is the best split, indicating that the child nodes differ significantly from each other.

Fig. 30.1 System model
iii. Random Forest: This technique employs feature similarity to forecast the values of new data points, meaning that the new point is given a value based on how closely it resembles the training set points. Each training point's Euclidean distance from the new point is determined, and based on this distance the k closest data points are chosen. The new data point's predicted value is equal to the mean of these k data points. Depending on the value of k, the error changes.
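As referenced before the list, a minimal sketch of how the three regression models could be trained and compared with scikit-learn is shown below; the CSV file name, the six pollutant column names, and the AQI target column are hypothetical placeholders for the collected monitoring data.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

data = pd.read_csv("air_quality.csv")                       # hypothetical dataset
X = data[["PM2.5", "PM10", "NO2", "SO2", "CO", "O3"]]       # six pollutant concentrations
y = data["AQI"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "Linear regression": LinearRegression(),
    "Decision tree": DecisionTreeRegressor(max_depth=8),
    "Random forest": RandomForestRegressor(n_estimators=200),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    # score() reports the R^2 of the prediction on the held-out data
    print(name, round(model.score(X_test, y_test), 3))
```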

30.4 Results and Discussion

The figures below indicate that all of the features used to make the forecasts are correlated and may thus be used to make the model more realistic. Figure 30.2 shows the probability of prediction of CO, and Fig. 30.3 shows the probability of prediction of O3. Table 30.2 summarizes the probability of prediction of CO and O3.

Fig. 30.2 Probability of prediction of CO

Fig. 30.3 Probability of prediction of O3

Table 30.2 Probability of prediction

Machine learning algorithm | Probability of prediction of CO | Probability of prediction of O3
Linear regression | 0.02 | 0.08
Decision tree | 0.60 | 0.60
Random forest | 0.81 | 0.75

30.5 Conclusion and Future Directions

The goal of this paper is to become familiar with the air quality index (AQI), which determines whether or not the air we breathe is contaminated. According to this review, the majority of researchers worked on AQI and pollutant concentration level predictions, which provide an accurate picture of the AQI. Many researchers prefer artificial neural networks (ANNs), linear regression, and logistic regression to estimate AQI and air pollutant concentrations. The future scope of this work could embrace additional elements, such as weather parameters together with the air contaminants.

References

1. Dwivedi, R.K., Kumar, R., Buyya, R.: Gaussian distribution-based machine learning scheme
for anomaly detection in healthcare sensor cloud. Int. J. Cloud Appl. Comp. (IJCAC) 11(1),
52–72 (2021). https://doi.org/10.4018/IJCAC.2021010103
2. Dwivedi, R.K., Kumar, R., Buyya, R.: A novel machine learning-based approach for outlier
detection in smart healthcare sensor clouds. Int. J. Healthcare Info. Systems Informatics
(IJHISI) 16(4), 1–26 (2021). https://doi.org/10.4018/IJHISI.20211001.oa26
3. Dwivedi, R.K., Rai, A.K., Kumar, R.: A study on machine learning based anomaly detec-
tion approaches in wireless sensor network. 10th IEEE International Conference on Cloud
Computing, Data Science & Engineering (Confluence-2020), Amity University Noida,
pp. 200–205 (2020)
4. Dwivedi, R.K., Rai, A.K., Kumar R.: Outlier detection in wireless sensor networks using
machine learning techniques: A survey. IEEE International Conference on Electrical and
Electronics Engineering (ICE3–2020), MMMUT Gorakhpur, pp. 316–321 (2020)
5. Dwivedi, R.K., Pandey, S., Kumar, R.: A study on machine learning approaches for outlier
detection in wireless sensor network. In: The Proceeding of 2018 8th IEEE International
Conference on Cloud Computing, Data Science & Engineering (Confluence), pp. 189–192.
Amity University, Noida (2018)
6. Kumar, A., Goyal, P.: Forecasting of air quality in Delhi using principal component regression
technique. Atmospheric Pollut. Res. 2, 436–444 (2011)
7. Liu, H., Li, Q., Yu, D., Gu, Y.: Air quality index and air pollutant concentration prediction based
on machine learning algorithms. Applied Sciences, ISSN 2076-3417; CODEN: ASPCC7, 2019,
9, 4069 (2019). https://doi.org/10.3390/app9194069
8. Guan Z., Sinnot R.O.: Prediction of air pollution through machine learning on the cloud.
IEEE/ACM5th International Conference on Big Data Computing Applications and Tech-
nologies (BDCAT), 978-1-5386-5502-3/18/$31.00 ©2018 IEEE DOI https://doi.org/10.1109/
BDCAT.2018.00015
9. Maleki, H., Sorooshian, A., Goudarzi, G., Baboli, Z., Tahmasebi Birgani, Y., Rahmati, M.: Air pollution prediction by using an artificial neural network model. Clean Technol. Environ. Policy 21, 1341–1352 (2019)
10. Aditya, C.R., Chandana C.R., Deshmukh, R., Nayana, D.K., Vidyavastu, P.G.: Detection and
prediction of air pollution using machine learning models. Int. J. Engineering Trends Tech.
59(4) (2018)
11. Sharma, N., Taneja, S., Sagar, V., Bhatt, A.: Forecasting air pollution load in Delhi using data analysis tools. Procedia Computer Science 132, 1077–1085 (2018)
12. Naddafi, K., Hassanvand, M.S., Yunesian, M., Momeniha, F., Nabizadeh, R., Faridi, S., Gholam-
pour, A.: Health impact assessment of air pollution in megacity of Tehran, Iran. Iranian J.
Environ. Health Sci. Eng. 9, 28 (2012)
13. Gunasekaran, R., Kumaraswamy, K., Chandrasekaran, P.P., Elanchezhian, R.: Monitoring of ambient air quality in Salem city, Tamil Nadu. Int. J. Current Res. 4(3), 275–280 (2012). ISSN: 0975-833X
14. Shishegaran, M., Saeedi, A.K., Ghiasinejad, H.: Prediction of air quality in Tehran by devel-
oping the nonlinear ensemble model. J. Clean. Prod. 259, 120825 (2020). https://doi.org/10.
1016/j.jclepro.2020.120825.
15. Bhalgat, P., Pitale, S, Bhoite, S.: Air quality prediction using machine learning algorithms. Int.
J. Computer Appl. Tech. Res. 8(9), 367–370. ISSN 2319-8656 (2019)
16. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and regression trees.
Wadsworth & Brooks/Cole Advanced Books & Software, Monterey, CA (1984).978-0-412-
04841-8
17. Pant, P, Lal, R.M., Guttikunda, S.K., Russell, A.G., Nagpure, A.S., Ramaswami, A., Peltie,
R.E.: Monitoring particulate matter in India: recent trends and future outlook. Air Quality,
Atmosphere & Health (2018)
18. Celik, M.B., Kadi, I.: The relation between meteorological factors and pollutants concentrations
in Karabuk City. G.U. J. Sci. 20(4), 87–95 (2007)
19. Shakir, M., Rakesh, N.: Investigation on air pollutant data sets using data mining tool. IEEE
Xplore Part Number:CFP18OZV-ART; ISBN:978-1-5386-1442-6
20. Khaniabadi, Y.O., Goudarzi, G., Daryanoosh, S.M., Borgini, A., Tittarelli, A., De Marco, A.:
Exposure to PM10, NO2 , and O3 and impacts on human health. Environ. Sci. Pollut. Res.
(2016)
21. TikheShruti, S., Khare, K.C., Londhe, S.N.: Forecasting criteria air pollutants using data driven
approaches: An Indian case study. Int. J. Soft Comp. 8(4), 305–312 (2013) ISSN: 1816-9503
22. Kottur, S.V., Mantha, S.S: An integrated model using artificial neural network (Ann) and kriging
for forecasting air pollutants using meteorological data. Int. J. Adv. Res. Comp. Comm. Eng.
4(1) (2015). ISSN (Online): 2278-1021 ISSN (Print): 2319-5940
23. Raturi, R., Prasad: Recognition of future air quality index Using artificial neural network. Int.
Res. J. Eng. Tech. (IRJET) 5(3) (2018). e-ISSN: 2395-0056 p-ISSN: 2395-0072
24. Aditya, C.R., Deshmukh, C.R., Nayana, D.K., Vidyavastu, P.G.: Detection and prediction of
air pollution using machine learning models. Int. J. Eng. Trends Tech. (IJETT) 59(4) (2018)
25. Kang, G.K., Gao, J.Z., Chiao, S., Lu, S., Xie, G.: Air quality prediction: Big data and machine
learning approaches. Int. J. Environmental Sci. Develop. 9(1) (2018)
26. Manisalidis, E., Stavropoulou, A., Stavropoulos, Bezirtzoglou, E.: Environmental and health
impacts of air pollution: A review. Front. Public Heal. 8, 1–13 (2020). https://doi.org/10.3389/
fpubh.2020.00014
27. Jia, C., Batterman, S., Godwin, C.: VOCs in industrial, urban and suburban neighborhoods-
Part 2: Factors affecting indoor and outdoor concentrations. Atmos. Environ. 42(9), 2101–2116
(2008). https://doi.org/10.1016/j.atmosenv.2007.11.047
28. Klemm, R.J., Lipfert, F.W., Wyzga, R.E., Gust, C.: Daily mortality and air pollution in Atlanta:
Two years of data from ARIES. Inhal. Toxicol. 16(SUPPL. 1), 131–141 (2004). https://doi.
org/10.1080/08958370490443213
Chapter 31
A Survey on Fire Detection-Based
Features Extraction Using Deep Learning

K. Jose Triny, P. Deepak Kumar, V. Ezhilarasan, M. Santhosh Kumar, and S. Suriya

Abstract Forest fires pose a severe risk to human beings and the environment. By 2030, woodland fires are expected to have destroyed half of the world's forests. Adopting early fire detection approaches is the most effective strategy to limit the damage caused by forest fires. Various industrial fire detection sensor systems exist on the market today, but they are all problematic to use in wide open areas such as woods due to their slow response time. Image processing has been used in this study for a variety of reasons, including the rapid advancement of digital camera technology. Accurate forest fire detection algorithms remain a challenge since certain objects have characteristics similar to fire, potentially resulting in a high false alarm rate. To reduce the error rate of fire detection, the background is first separated from the foreground image regions. Second, the RGB colour space is used to select prospective fire zones. Then, feature extraction is employed to distinguish between real fire and fire-like objects, as candidate areas can also include moving fire-like things. Finally, the region of interest is classified as either true fire or non-fire using machine learning and deep learning techniques.

Keywords Deep learning · Forest fire recognition · Foreground separation · Image mining · Machine learning

31.1 Introduction

Human activities have affected the forests as a whole, and pressure on forest regions has grown as a result of the rapid rise of the population and urbanization. Forest fires are a natural hazard to the environment and the atmospheric system, and in turn affect living creatures. Satellite imaging may also be used to monitor, manage, and assess fire damage, to map burnt areas and to determine a safe fire range [1, 2]. A satellite image refers to a photo collected from a faraway place that is able to provide precise information. The

K. Jose Triny (B) · P. Deepak Kumar · V. Ezhilarasan · M. Santhosh Kumar · S. Suriya
Department of Computer Science and Engineering, M.Kumarasamy College of Engineering, Thalavapalayam, Karur, Tamil Nadu 639113, India
e-mail: josetrinyk.cse@mkce.ac.in


forest fire image is captured by a satellite sensor, which has a growing range of time
resolution and forest area coverage [3]. Satellite pictures also used as a fire monitoring
tool, as well as a management and damage-finding tool for ensuring compliance with
burn areas and determining a safe fire range [4, 5]. The color consistency is used to
classify this fire, which includes components from the original fire. It not only notices
but also differentiates between diverse types of flames, such as fire and materials.
The threshold value, the detection of matrix value, and the differential matrix value
of the system were all assumed in the system’s operation to assess the forest fire
[6, 7].
Each year, forest fires damage a substantial amount of forest land and species, resulting in the loss of many lives as well as valuable natural resources and personal property [8]. They also have a bad impact on the global climate, and the problem has worsened in comparison to past years [9, 10]. Human intrusion into forest regions is a main reason for forest fires. Traditional firefighting tactics rely on powered gadgets or humans to keep an eye on the environment [11]. Aerial photograph testing, particle sampling, and temperature sampling are the most commonly utilized fire smoke detection procedures. Unless the particles reach the sensors and activate them, an alarm is not triggered [12, 13].
Due to the rapid advancement in image processing technology, fire detection based entirely on image processing has become more viable than fire watch towers, sensors, or satellites. In the case of fire watch towers, people are forced to watch the region constantly [14, 15]. Sensors are devices that can sense their surroundings and compute data. Sensors are used to detect physical variables like pressure, temperature, and humidity, and chemical variables like carbon monoxide, carbon dioxide, and nitrogen dioxide [16].
Coverage of broad regions in wooded areas is impractical with a wireless sensor-based fire detection system, and battery cost is also a major challenge [17, 18]. Satellite-based devices can cover a large area, but satellite snapshots have a low resolution. Because a fire can only be noticed when it has grown large enough, real-time detection is not possible. Furthermore, these systems are extremely expensive. The accuracy of satellite-based woodland fire detection is also severely hampered by weather conditions (for example, clouds) [19].
Image processing has been adopted in this research for several reasons, including the rapid development of digital camera technology [20]. Fire detection systems are among the most important components of surveillance systems that protect homes and their surroundings. It is significantly better if a system can capture the first stages of a fire as part of an early warning process [21].
Almost all current fire detection systems use built-in sensors, and their performance depends mainly on the sensors' reliability and spatial distribution. For a high-precision fire detection system, it is critical that these sensors are dispersed widely. Because a sensor-based fire detection system for outdoor environments requires a dense, regular arrangement of sensors, coverage of wide regions is impractical [22].
Traditional fire detection technologies are being replaced by computer vision-based systems as a result of rapid improvements in digital hardware and video processing methodologies. Computer vision-based fire detection systems have three stages: fire pixel classification, moving object segmentation, and candidate region evaluation [23, 24]. The analysis depends on two categories of cues: the geometry of a region and its temporal changes. The success of the fire pixel classifier, which creates seed regions for the remainder of the system to work on, is critical to fire detection performance [25, 26]. Figure 31.1 shows various forest fires captured by surveillance cameras.

Fig. 31.1 Various forest fires from surveillance cameras

31.2 Related Work

Liu et al. [27] used the ZYNQ heterogeneous platform to provide a configurable accelerator architecture. The roofline model supports the highly efficient design of accelerators under this framework, and the accelerator is scalable. The CNN hardware accelerator integrates standard convolution and depthwise separable convolution on a unified computation engine. The on-chip ping-pong buffer provides high bandwidth, and the accelerator is fully pipelined. To optimize bandwidth and reduce the overhead generated by on-chip/off-chip data exchange, three on-chip data streams configure the ping-pong buffer mode and use a data-movement interface. Whether standard convolution or depthwise separable convolution is used, the design reaches full pipeline status much faster than a design without a pipeline. The clock frequency of the device is at best 100 MHz, which is low compared to other designs.

Oktay et al. [28] proposed a stand-alone attention gate (AG) variant for medical image segmentation, eliminating the need for an external object localization model. Because the suggested method is simple and modular, it can easily be applied to image classification and regression problems, such as natural image analysis and machine translation. According to the experimental data, the proposed AGs are very valuable for tissue/organ identification and localization. This is especially true for small organs of varied size, such as the pancreas, and similar behavior is expected for global classification tasks. The training behavior of AGs can benefit from transfer learning and multi-stage training schemes; for example, pre-trained U-Net weights may be reused to initialize the attention networks, and the gates may then be trained in a fine-tuning stage. Similarly, residual connections around the gate block, as used in highway networks, can provide better gradient back-propagation and slightly softer attention mechanisms.
Qiu et al. [29] used a digital lock-in amplifier (DLIA) to build a low-cost, compact in-situ/on-line early-warning fire sensor with accurate carbon monoxide detection. Thanks to a DAC integrated in the MCU for the ramp stages, the ramp scanning signal, overlaid on a sinusoidally modulated signal, converts the DFB laser's injection current into a control voltage, which simplifies the design and keeps it compact. The ADC acquired the returned absorption signal at the same moment, so the driver current was sampled in the same frame as the photodetector's returned signal. Correlation, integration, and filtering were among the DLIA's digital procedures programmed in embedded C.
Hua et al. [30] offer a summary of the principles and case studies of forest fire monitoring (FFM) using satellite and infrared remote sensing (IRRS). Bi-spectral approaches, threshold techniques, spatial contextual techniques, and multi-temporal techniques are among the FFM-applicable IRRS processes included in this study. The contextual approaches are described in depth based on satellite imagery such as MODIS, VIIRS, and Landsat 8 OLI.
Krüll et al. [31] feed images captured from a remote-controlled aerial platform into their system to detect fire indicators such as smoke over a terrain area. Microwave radiation can partially penetrate materials such as leaves and thin walls. With the use of fully chip-based components, the current layout can be further reduced in size.
Tian et al. [32] presented and validated a method for the identification and separation of smoke from background components. Specifically, an optimization technique based on the imaging model was devised using twin over-complete dictionaries, allowing the separation of quasi-smoke and quasi-background components. Extensive detection experiments have been carried out, with the findings indicating that the suggested feature outperforms existing features for smoke detection, which validates the effectiveness of the detection method. Furthermore, the proposed method can distinguish smoke from other difficult items with a similar visual appearance in a grayscale image, such as fog/haze, cloud, and shadow. Smoke separation experiments have also shown that the proposed separation approach can efficiently estimate and separate the genuine smoke and background components. Nonlinear modeling of the smoke problem, such as kernel or auto-encoder-based modeling, could provide further improvement. The suggested framework is appropriate for detecting and separating transparent, semi-transparent, and deformable objects, and it can be extended.
Saputra et al. [33] aimed to reduce the effects of fire to the bare minimum so that injuries may be kept low. When a fire breaks out in a home, the most important thing is to get as many people out as possible and let them leave safely. As a result, emergency exit access carries the most risk in this case.
Yin et al. [34] note that point-based sensors have been employed, based on smoke particle sampling, ambient temperature monitoring, and relative humidity sampling. However, these sensors have a few intrinsic flaws that are difficult to overcome. They must sample combustion products directly for particle, temperature, and humidity assessments, so they must be placed near areas where fires are likely to be ignited and can only be used in limited or enclosed spaces. Furthermore, transporting combustion products, such as smoke particles, to the sensors can take a long time, resulting in a sluggish reaction. Image-based fire detection has been extensively investigated in order to overcome these drawbacks of traditional fire detection technologies. When we look at fires, we see that smoke often spreads faster than flame.
Dimitropoulos et al. [35] note that video-based systems have received a lot of attention within the last decade. In contrast to traditional sensors, which are usually limited to indoor use and require close proximity to the fire, video cameras can detect smoke from a distance. Their approach allows the evaluation of dynamic textures using statistics from various image components. They also propose a strategy for applying it to video-based early warning systems specializing in smoke identification. Finally, they recommend combining multidimensional dynamic texture analysis with spatiotemporal smoke modeling using a particle swarm optimization technique to improve classification accuracy. A multivariate evaluation against a general LDS descriptor is used to assess the ability of the higher-order LDS (h-LDS) to probe dynamic texture information.
Adib et al. [36] used the behavior of each sub-sensor, with Linear Discriminant Analysis (LDA) as the classification approach, to train the sensor to respond to target gases and odors. This advancement makes the entire device more appealing to future customers in the electronics marketplace while avoiding difficult operating conditions. As an application, the device has been adapted to work as an intelligent fire alarm. By analogy with the human nose, a substance's pre-burning odor is a chemical aggregation released as a complex gas ensemble as a result of heating, which may be sensed and used as a pattern for ambient condition tracking. The comparative study is shown in Table 31.1.

Table 31.1 Related works

| S. No. | Title of the paper | Authors | Algorithm | Merits | Demerits |
|---|---|---|---|---|---|
| 1 | An FPGA-based CNN accelerator integrating depthwise separable convolution | B. Liu, D. Zou | Convolutional neural network (CNN) | Maximizes bandwidth while minimizing latency | Needs additional hardware |
| 2 | Attention U-net: Learning where to look for the pancreas | O. Oktay, J. Schlemper | Attention U-Net architecture | Image classification and regression are easy | Training behavior is limited |
| 3 | Development of an early warning fire detection system based on a laser spectroscopic carbon monoxide sensor using a 32-bit system-on-chip | X. Qiu, Y. Wei | Sensor-based implementation | Early warning detection of fire | Cost is high |
| 4 | The progress of operational forest fire monitoring with infrared remote sensing | L. Hua, G. Shao | Bi-spectral methods | Increases the accuracy of detection | Needs different sensor designs over important regions |
| 5 | Early forest fire detection and verification using optical smoke, gas and microwave sensors | W. Krüll, R. Tobera | AirRobot with gas sensors and a camera | Early smoke detection | Not implemented in real-time environments |
| 6 | Detection and separation of smoke from single image frames | Tian, Hongda | Dichromatic atmospheric scattering model | Developed to separate the original smoke from backdrop elements | Processing time is high |
| 7 | Prototype of early fire detection system for home monitoring based on wireless sensor network | Saputra, Ferry Astika | Fuzzy logic | Monitoring of the surroundings is easy | Computational complexity is high |
| 8 | A deep normalization and convolutional neural network for image smoke detection | Yin, Zhijian | Deep normalization and convolutional neural network | Extracts attributes for smoke detection automatically | Less efficient than other network architectures |
| 9 | Higher order linear dynamical systems for smoke detection in video surveillance applications | Dimitropoulos, Kosmas | Multidimensional dynamic texture analysis | High detection rate | Needs more infrared sensors |
| 10 | SnO2 nanowire-based aerosol jet printed electronic nose as fire detector | Adib, Mustahsin | Gas Mixing System (GMS) | False alarm possibility can be reduced | Expert knowledge is needed |

31.3 Existing Methodologies

Sensor-based traditional smoke detection systems detect the presence of combustion byproducts such as smoke, heat, and radiation [37]. The speed with which they respond is determined by how quickly the combustion products approach the sensors closely enough to activate them. Furthermore, sensors cannot offer enough information to accurately estimate the size, position, and dynamics of the fire [38, 39]. Individual sensors are not aligned with one another in these systems, resulting in alarm behavior that is unpredictable and non-synchronous. Smoke alarms are fire alarm systems that provide a local auditory or visual signal when smoke is detected [40, 41]. Smoke detectors are typically used in fire alarm systems with the presumption that smoke will be produced by the fire: when we see smoke, we know there is a fire. Even if there is a fire, however, the smoke may not appear for a long time after the surroundings have begun to burn, and some fires may not produce smoke at all [42].

31.3.1 Sensors

The sensor's accuracy, reliability, and spatial distribution determine the improvement of the system [43]. In outdoor environments, a large number of sensors must be deployed to predict fire from temperature levels. Sensors also require frequent battery charging, which is challenging to achieve in a big open area. Moreover, a sensor detects a fire only when the fire is quite close to it, which may also result in sensor damage [44].

31.3.2 Computer Vision-Based Systems

Existing frameworks apply fire detection schemes based on camera technology with video image processing. The following steps are used in computer vision applications [45] (a minimal sketch of this pipeline is given below):

1. Fire pixel classification.
2. Moving object segmentation.
3. Examination of the potential candidate region.
Fire detection systems are analyzed in terms of system performance at the level of image pixels, which introduces complexity and a high error rate [46, 47]. The results show that accurate fire pixel classification is required, with a high true positive rate and a low false positive rate. There are, however, several algorithms that deal specifically with fire pixel classification [48, 49].
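The following is a minimal sketch of the three-stage camera-based pipeline described above, assuming OpenCV is available; the HSV color thresholds and the minimum region area are illustrative assumptions rather than values taken from the cited works.

```python
# Minimal sketch of the three-stage camera-based fire detection pipeline:
# fire pixel classification, moving object segmentation, candidate region evaluation.
import cv2
import numpy as np

def detect_fire_regions(frame, bg_subtractor, min_area=200):
    # 1. Fire pixel classification: a crude color rule for bright red/orange pixels.
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    fire_mask = cv2.inRange(hsv, (0, 120, 150), (35, 255, 255))

    # 2. Moving object segmentation: background subtraction keeps only dynamic pixels.
    motion_mask = bg_subtractor.apply(frame)
    candidate_mask = cv2.bitwise_and(fire_mask, motion_mask)

    # 3. Candidate region evaluation: keep connected regions large enough to matter.
    contours, _ = cv2.findContours(candidate_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]

# Usage: feed consecutive video frames through the same background subtractor.
# cap = cv2.VideoCapture("forest_cam.mp4")
# bg = cv2.createBackgroundSubtractorMOG2()
# ok, frame = cap.read()
# while ok:
#     boxes = detect_fire_regions(frame, bg)
#     ok, frame = cap.read()
```

In practice, the crude color rule in step 1 is exactly the part that the surveyed learning-based methods replace with a trained fire pixel classifier.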

31.3.3 CCD Cameras

A large number of CCD cameras are used to predict and forecast fires. This type of method combines non-pictorial properties such as humidity and temperature with statistical properties such as the mean, standard deviation, and second-order moments [50]. The technique is also used to find smoke in images and to reduce the false positive rate.
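As a rough illustration of the statistical properties mentioned above, the sketch below computes the mean, standard deviation, and second-order moment of a grayscale patch; how these are fused with humidity and temperature readings is not specified here, so that part is left as a placeholder.

```python
# Minimal sketch of the patch statistics (mean, standard deviation,
# second-order moment) mentioned above, computed on a grayscale image patch.
import numpy as np

def patch_statistics(patch: np.ndarray) -> dict:
    p = patch.astype(np.float64)
    return {
        "mean": p.mean(),
        "std": p.std(),
        "second_order_moment": np.mean(p ** 2),
    }

# Example: stats = patch_statistics(np.random.randint(0, 256, (32, 32)))
```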

31.4 Motivation and Impact of Former Works

Forest fire detection systems are gaining a lot of attention because of the continual threat that fire poses to both economic assets and public safety. The increase in forest fires around the world has heightened the motivation for developing fire warning systems for the early detection of wildfires. In this project, we propose an algorithm that combines the color information of the fire with its edge information, and a system that automatically detects the presence of fire based on a deep learning algorithm. In the proposed system, we implement preprocessing steps to eliminate noise in the images, feature extraction to extract color features and segment the fire regions, and finally pixel classification using a deep learning algorithm, combined with an efficient mobile alert system that notifies the corresponding authorities. Comprehensive experiments are carried out to detect forest fire using conventional machine learning algorithms, object detection techniques, and deep learning models. Accuracy (AC), f-measure (FM), precision (PR), and recall (RC) are employed as evaluation metrics to demonstrate the performance of the models. The comparison between SVM and RF is shown in Fig. 31.2. The graph shows that RF achieves 80.26% and SVM 72.24% accuracy, whereas in our proposed system we incorporate a CNN algorithm with an approximate accuracy rate above 90%, achieved by implementing different frameworks and training the intermediate layers.
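For reference, these four metrics are computed from the true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) in the standard way:

$$
\mathrm{AC} = \frac{TP + TN}{TP + TN + FP + FN},\quad
\mathrm{PR} = \frac{TP}{TP + FP},\quad
\mathrm{RC} = \frac{TP}{TP + FN},\quad
\mathrm{FM} = \frac{2\,\mathrm{PR}\cdot\mathrm{RC}}{\mathrm{PR} + \mathrm{RC}}
$$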

Fig. 31.2 Comparison between SVM and RF

31.5 Conclusion

In this survey, we analyze the various existing methodologies for forest fire detection. A forest fire image recognition algorithm based on CNN yields better results than the existing algorithms. The first step is to obtain a dataset of labeled and unlabeled flame images. The effects of learning rate, batch size, and other hyperparameters on the CNN's overall performance are evaluated through trials, and the most appropriate parameters are determined. Adaptive pooling is used to avoid losing data from an image, and the rate of flame recognition on the segmented fire regions is higher than that of other state-of-the-art methods.

References

1. Yadav, G., Gupta, V., Gaur, V., et al.: Optimized flame detection using image processing based
techniques. Indian J. Comput. Sci. Eng. 3, 203–207 (2012)
2. Sam, G., Benjamin, R.B.: A comparative analysis on different image processing techniques for
forest fire detection. Int. J. Comput. Sci. Netw. 5(1), 110–114 (2016)
3. Ko, B.C., Cheong, K.-H., Nam, J.-Y.: Fire detection based on vision sensor and support vector
machines. Fire Saf. J. 44(3), 322–329 (2009)
4. Chen, Y., Zhang, Y., Xin, J., et al.: A UAV-based forest fire detection algorithm using convo-
lutional neural network. In: 2018 37th Chinese Control Conference (CCC), Wuhan, People’s
Republic of China, pp. 10305–10310 (2018)
5. Jiao, Z., Zhang, Y., Xin, J., et al.: A deep learning based forest fire detection approach using
UAV and YOLOv3. In: 2019 1st International Conference on Industrial Artificial Intelligence
(IAI), Shenyang, People’s Republic of China, pp. 1–5 (2019)
6. Murugesan, M., Thilagamani, S.: Efficient anomaly detection in surveillance videos based on
multi layer perception recurrent neural network. J. Microprocess. Microsyst. 79 (2020)
7. Yin, M., Lang, C., Li, Z., Feng, S., Wang, T.: Recurrent convolutional network for video-based
smoke detection. Multimedia Tools Appl. 78(1), 237–256 (2019)

8. Cetin, A.E., et al.: Video fire detection–review. Digit. Sign. Process. 23(6), 1827–1843 (2013)
9. Muhammad, K., Ahmad, J., Mehmood, I., Rho, S., Baik, S.W.: Convolutional neural networks
based fire detection in surveillance videos. IEEE Access 6, 18174–18183 (2018)
10. Thilagamani, S., Nandhakumar, C.: Implementing green revolution for organic plant forming
using KNN-classification technique. Int. J. Adv. Sci. Technol 29(7S), 1707–1712 (2020)
11. Rajesh Kanna, P., Santhi, P.: Unified deep learning approach for efficient intrusion detection
system using integrated spatial–temporal features. Knowl.-Based Syst. 226 (2021)
12. Lin, G., Zhang, Y., Xu, G., Zhang, Q.: Smoke detection on video sequences using 3D
convolutional neural networks. Fire Technol. 55(5), 1827–1847 (2019)
13. Zhang, Q.-X., Lin, G.-H., Zhang, Y.-M., Xu, G., Wang, J.-J.: Wildland forest fire smoke
detection based on faster R-CNN using synthetic smoke images. Procedia Eng. 211, 441–446
(2018)
14. Barmpoutis, P., Dimitropoulos, K., Kaza, K., Grammalidis, N.: Fire detection from images using
faster R-CNN and multidimensional texture analysis. In: Proceeding of IEEE International
Conference on Acoustics, Speech Signal Process (ICASSP), May, pp. 8301–8305 (2019)
15. Jiao, Z., Zhang, Y., Xin, J., Mu, L., Yi, Y., Liu, H., Liu, D.: A deep learning based forest fire
detection approach using UAV and YOLOv3. In: 1st International Conference on Industrial
Artificial Intelligence (IAI), July, pp. 1–5 (2019)
16. Deepa, K., Kokila, M., Nandhini, A., Pavethra, A., Umadevi, M.: Rainfall prediction using
CNN. Int. J. Adv. Sci. Technol. 29(7 Special Issue), 1623–1627 (2020)
17. Turgay, C.: Fast and efficient method for fire detection using image processing. ETRI J. 32(6),
1–12 (2010)
18. Deepa, K., Thilagamani, S.: Segmentation techniques for overlapped latent fingerprint
matching. Int. J. Innovative Technol. Exploring Eng. 8(12), 1849–1852 (2019)
19. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional
neural networks. In: Proceedings of Advance Neural Information Processing System (NIPS),
pp. 1097–1105 (2012)
20. Deepika, S., Pandiaraja, P.: Ensuring CIA triad for user data using collaborative filtering
mechanism. In: 2013 International Conference on Information Communication and Embedded
Systems (ICICES), pp. 925–928 (2013)
21. Santhi, P., Mahalakshmi, G.: Classification of magnetic resonance images using eight directions
gray level co-occurrence matrix (8dglcm) based feature extraction. Int. J. Eng. Adv. Technol.
8(4), 839–846 (2019)
22. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition (2014). arXiv:1409.1556 [Online]. Available http://arxiv.org/abs/1409.1556
23. Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M.,
Adam, H.: MobileNets: efficient convolutional neural networks for mobile vision applications
(2017). arXiv:1704.04861 [Online]. Available: http://arxiv.org/abs/1704.04861
24. Thilagamani, S., Shanti, N.: Gaussian and Gabor filter approach for object segmentation. J.
Comput. Inf. Sci. Eng. 14(2), 021006 (2014)
25. Rajesh Kanna, P., Santhi, P.: Hybrid intrusion detection using map reduce based black widow
optimized convolutional long short-term memory neural networks. Expert Syst. Appl. 194
(2022)
26. Muhammad, K., Khan, S., Elhoseny, M., Hassan Ahmed, S., Wook Baik, S.: Efficient fire
detection for uncertain surveillance environment. IEEE Trans. Ind. Inf. 15(5), 3113–3122
(2019)
27. Liu, B., Zou, D., Feng, L., Feng, S., Fu, P., Li, J.: An FPGA-based CNN accelerator integrating
depthwise separable convolution. Electronics 8(3), 281–282 (2019)
28. Oktay, O., Schlemper, J., Le Folgoc, L., Lee, M., Heinrich, M., Misawa, K., Mori, K.,
McDonagh, S., Hammerla, N.Y., Kainz, B., Glocker, B., Rueckert, D.: Attention U-net: learning
where to look for the pancreas (2018). arXiv:1804.03999
29. Qiu, X., Wei, Y., Li, N., Guo, A., Zhang, E., Li, C., Peng, Y., Wei, J., Zang, Z.: Development of
an early warning fire detection system based on a laser spectroscopic carbon monoxide sensor
using a 32-bit system-on-chip. Infr. Phys. Technol. 96, 44–51 (2019)

30. Hua, L., Shao, G.: The progress of operational forest fire monitoring with infrared remote
sensing. J. For. Res. 28(2), 215–229 (2017)
31. Krüll, W., Tobera, R., Willms, I., Essen, H., von Wahl, N.: Early forest fire detection and
verification using optical smoke, gas and microwave sensors. J. For. Res. 45, 584–594 (2012)
32. Tian, H., et al.: Detection and separation of smoke from single image frames. IEEE Trans.
Image Process. 27(3), 1164–1177 (2017)
33. Saputra, F.A., Udin Harun Al Rasyid, M., Abiantoro, B.A.: Prototype of early fire detec-
tion system for home monitoring based on wireless sensor network. In: 2017 International
Electronics Symposium on Engineering Technology and Applications (IES-ETA). IEEE (2017)
34. Yin, Z., et al.: A deep normalization and convolutional neural network for image smoke
detection. IEEE Access 5, 18429–18438 (2017)
35. Dimitropoulos, K., Barmpoutis, P., Grammalidis, N.: Higher order linear dynamical systems
for smoke detection in video surveillance applications. IEEE Trans. Circ. Syst. Video Technol.
27(5), 1143–1154 (2017)
36. Adib, M., et al.: SnO2 nanowire-based aerosol jet printed electronic nose as fire detector. IEEE
Sens. J. 18(2), 494–500 (2017)
37. Pandiaraja, P., Aravinthan, K., Lakshmi, N.R., Kaaviya, K.S., Madumithra, K.: Efficient
cloud storage using data partition and time based access control with secure AES encryption
technique. Int. J. Adv. Sci. Technol. 29(7), 1698–1706 (2020)
38. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolu-
tional networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
pp. 4700–4708 (2017)
39. Gunasekar, M., Thilagamani, S.: Performance analysis of ensemble feature selection method
under SVM and BMNB classifiers for sentiment analysis. Int. J. Sci. Technol. Res. 9(2), 1536–
1540 (2020)
40. Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image
segmentation. In: International Conference Medical Image Computing and Computer-Assisted
Intervention (MICCAI), pp. 234–241 (2015)
41. Wang, C., Wang, Y., Liu, Y., He, Z., He, R., Sun, Z.: ScleraSegNet: an attention assisted U-
net model for accurate sclera segmentation. IEEE Trans. Biometrics Behav. Identity Sci. 2(1),
40–54 (2020)
42. Perumal, P., Suba, S.: An analysis of a secure communication for healthcare system using
wearable devices based on elliptic curve cryptography. J. World Rev. Sci. Technol. Sustain.
Dev. 18(1), 51–58 (2022)
43. Zahangir Alom, M., Hasan, M., Yakopcic, C., Taha, T.M., Asari, V.K.: Recurrent residual
convolutional neural network based on U-net (R2UNet) for medical image segmentation (2018).
arXiv:1802.06955 [Online]. Available http://arxiv.org/abs/1802.06955
44. Logeswaran, R., Aarthi, P., Dineshkumar, M., Lakshitha, G., Vikram, R.: Portable charger for
handheld devices using radio frequency. Int. J. Innovative Technol. Exploring Eng. (IJITEE)
8(6), 837–839 (2019)
45. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization (2014). arXiv:1412.6980
[Online]. Available http://arxiv.org/abs/1412.6980
46. Pandiaraja, P., Sharmila, S.: Optimal routing path for heterogenous vehicular Adhoc network.
Int. J. Adv. Sci. Technol. 29(7), 1762–1771 (2020)
47. Chen, C.L.P., Liu, Z.: Broad learning system: an effective and efficient incremental learning
system without the need for deep architecture. IEEE Trans. Neural Netw. Learn. Syst. 29(1),
10–24 (2018)
48. Pradeep, D., Sundar, C.: QAOC: Noval query analysis and ontology-based clustering for data
management in Hadoop. Future Gener. Comput. Syst. 108, 849–860 (2020)
49. Ramnath, S., Javali, A., Narang, B., Mishra, P., Routray, S.K.: IoT based localization and
tracking. In: International Conference on IoT and Application (ICIOT), May, pp. 1–4 (2017)
50. Lu, G., Yan, Y., Colechin, M.: A digital imaging based multifunctional flame monitoring system.
IEEE Trans. Instrum. Meas. 53(4), 1152–1158 (2004)
Chapter 32
A Survey on Exploratory Mineral Data
Analysis on Geological Location Using
Deep Learning

P. Santhi, S. A. Angelin Pricila, T. Devisha, C. Madhumitha, and S. Tharani

Abstract The demand for metals has increased, owing to the rise in metal prices. As a result, seismic methods will become a more important tool for mine design and exploration, helping to unravel structures containing mineral reserves at great depth. Seismic methods can be used to pick out mineral deposits located at depth with higher success. The cost of production would fall, and the return on invested capital infrastructure would increase, if it were feasible to see clearly beneath a mining site and chart the position and extent of the resources there. Machine learning and deep learning offer many classification techniques. In machine learning, the user needs to pre-select the input characteristics, whereas deep learning for supervised learning tasks uses raw data to determine the classification features (or attributes). Deep learning provides easier methods to analyze uninterpreted images, and its efficiency is greater than that of machine learning. Compared with existing machine learning techniques, deep learning techniques offer more efficiency in mineral classification.

Keywords Seismic images · Image processing · Machine learning · Mineral data · Deep learning · Geological information · Classification techniques · Mineral classification · Categorization features

32.1 Introduction

Mineral exploration, as a general principle, works by extracting items of geologic data from varied places. Exploration proceeds in tiers of increasing sophistication, with cheap, cruder techniques employed at the start; if the resulting data are economically interesting, this warrants subsequent, more advanced (and more expensive) techniques. However, it is very rare to find sufficiently enriched mineral bodies, and so most exploration campaigns stop after the first one or two stages.


The exploration pyramid illustrates how, after the initial desk-study analysis, every subsequent stage becomes much narrower and far less likely to proceed. In Ireland, despite thousands of mining licenses, only three present-day examples of mines have opened since the 1960s. Here, Fig. 32.1 shows salt bodies in subsurface layers, and Fig. 32.2 summarizes the mineral exploration techniques.

Fig. 32.1 Salt bodies in the subsurface

Fig. 32.2 Mineral exploration techniques

32.1.1 Geological Mapping

All geological investigations, including geotechnical, groundwater, geohazard, and mineral exploration studies, start with mapping. Mapping is crucial for establishing a foundational understanding of an area's geology, with each subsequent step of work in the area building on and linking back to this first stage. Prospecting is frequently conducted alongside mapping when the objective is mineral exploration.
32 A Survey on Exploratory Mineral Data Analysis on Geological … 327

32.1.2 Soil Sampling

Soil is made up of multiple layers, one of which 'grabs' metal ions leaching from the underlying rocks, resulting in a mineral-rich horizon. This is the recommended layer for soil sampling, from which a 1 kg specimen is taken to evaluate the chemistry of the underlying geology. Owing to Ireland's vast swaths of pasture and woodland, this technique is crucial for gathering data in the countryside where the rocks are hidden by soil and vegetation.

32.1.3 Stream Sediments

The sediment material in stream beds is derived from the erosion of rocks further upstream. As a result, sampling this material in a variety of areas can reveal insights about the geology of the upstream area; the samples can also be analyzed chemically to see whether any metals of interest are present.

32.1.4 Drilling

Drilling is a particularly expensive method, so it can only be employed in the rare locations that have proved especially interesting for their mineral potential. However, in these few places, drilling offers physical proof of the rock below and is used to substantiate the theories about the underlying geology that were developed through earlier techniques such as soil sampling. Drilling is a vital component of late-stage exploration projects. This section highlights the drilling process; dedicated methods provide further clarification for readers with concerns about drilling.

32.1.5 Geophysical Methods

Geophysics determines the physical properties of the geology beneath the surface, whereas the techniques above analyze its chemistry, directly or indirectly. Because both approaches can add a new dimension to any geological theory, an exploration campaign will frequently evaluate and combine geochemical and geophysical research. Magnetic, electromagnetic, electric resistivity, induced polarization, seismic, and radiometric surveys are all described here. Deposit development and mining, without a doubt, entail far more substantial environmental and socioeconomic consequences. Construction, mining, mineral processing, waste rock disposal, and tailings management will have a significant impact on the natural environment. Local communities may be severely affected by the influx of outsiders and other changes that mining brings to a town. An influx of outsiders could put a strain on public infrastructure that was designed for a smaller population, and outsiders may have different cultural values than locals, resulting in a clash of lifestyles. Many of the social effects of a new mine can be mitigated or controlled via careful design, but some net change is unavoidable and permanent.

32.2 Related Works

Large-scale visual recognition problems are identified as the main challenges in image classification. Miles of drilled cores are stored in bins in huge warehouses; they have never been digitally described and have been ignored for many years. Lithology gives a complete rock-type description based on cores and the structure of the subsurface geology. The microscopic description and typing of rocks and sands, the specialty of petrography, is one of the most essential strategies in sedimentary and diagenetic research. Potential records obtained from thin-section analysis, in comparison to hand-specimen descriptions, include mineral distribution and percentage, pore space evaluation, and cement composition. By developing a simple website, the general public could have instant access to a rock identification tool based on transfer learning technologies [1, 2].
In seismic interpretation, an interpreter analyzes the seismic reflection patterns, finds the principal features, and tags them with distinct scores and/or colors. However, the dramatically increasing size of 3D seismic surveys now severely challenges such manual interpretation. Three-dimensional seismic images play a major role in diverse disciplines, including geohazard assessment, energy exploration, and civil engineering [3, 4]. Interpreting seismic volumes is a time-consuming and labor-intensive process that often calls for iterative interaction between geologists and geophysicists. Manual interpretation has been the most straightforward and effective technique for solving this problem, in which an interpreter visually analyzes the seismic reflection patterns, finds the relevant features, and tags them with distinct traces and/or colors. Nevertheless, the rapidly growing size of current 3D seismic surveys considerably increases the burden of this interpretation process [5–7].
Band ratioing is exploited to emphasize or exaggerate the signature of the target object. Band ratio images provide spectral enhancement by dividing the digital number values in one spectral band by the corresponding values in another band. The resulting images show the distinctive characteristics of the scene, and the band ratio of the resultant image accurately reflects the brightness variation. Band ratioing of Landsat-ETM+ and ASTER data minimizes the effects of environmental factors [8–10]. Clay minerals are a well-known but critical group within the phyllosilicates, which contain a large proportion of silicate sheets with water trapped between the sheets. Clay minerals are divided into four principal groups: kaolinite, montmorillonite/smectite, illite, and chlorite. Kaolinite, dickite, and nacrite are the three members of the kaolinite group, which are polymorphs. Illite is essentially a hydrated microscopic muscovite and is the principal component of shales. Silicate minerals are the most common of Earth's minerals and include several mineral types [11–13].
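As an illustration of the band ratioing described above, the following NumPy sketch divides the digital number values of one band by those of another; the band pairing and the stabilizing epsilon are illustrative assumptions, not values from the cited studies.

```python
# Minimal sketch of band ratioing for spectral enhancement: divide the digital
# number (DN) values of one band by the corresponding values of another band.
import numpy as np

def band_ratio(band_a: np.ndarray, band_b: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Return the per-pixel ratio band_a / band_b, guarding against division by zero."""
    a = band_a.astype(np.float64)
    b = band_b.astype(np.float64)
    ratio = a / (b + eps)
    # Rescale to 0-255 for display as an enhanced grayscale image.
    scaled = 255.0 * (ratio - ratio.min()) / (ratio.max() - ratio.min() + eps)
    return scaled.astype(np.uint8)

# Example with synthetic two-band data standing in for, e.g., two ASTER bands:
# b4 = np.random.randint(1, 255, (512, 512))
# b6 = np.random.randint(1, 255, (512, 512))
# enhanced = band_ratio(b4, b6)
```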
Structure-oriented smoothing with edge-preserving removes noise while enhancing seismic structures and sharpening structural edges in a seismic image. All of these seismic image processing tasks are related to one another, as they all involve the seismic structural features. In conventional seismic image processing systems, these tasks are frequently performed independently by different techniques, and challenges remain in each of them. A convolutional neural network (CNN) is used; to train the network, thousands of 3D noisy synthetic seismic images are automatically produced, together with the corresponding ground truth of fault images, clean seismic images, and seismic normal vectors [14–16]. Even though the network is trained on a synthetic data set, it learns to carry out all of these image processing tasks accurately on field seismic images. Multiple examples in several areas clearly show that the network considerably outperforms traditional strategies in all of these computing tasks, yielding more accurate and sharper fault detection, more accurate seismic reflection slopes or normal vectors, and better-enhanced structural edges in a better-smoothed seismic volume [17–19].
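The following Keras sketch illustrates the single-network multitask idea summarized above, with one shared encoder and three heads for fault probability, smoothed amplitude, and normal vectors; the layer sizes and loss choices are illustrative assumptions and do not reproduce the architecture of [17].

```python
# Minimal sketch of a single multitask CNN with one shared encoder and three
# heads (fault probability, smoothed image, normal vector). Sizes are illustrative.
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(64, 64, 64, 1))               # a 3D seismic patch
x = layers.Conv3D(16, 3, padding="same", activation="relu")(inp)
x = layers.Conv3D(16, 3, padding="same", activation="relu")(x)

fault = layers.Conv3D(1, 1, activation="sigmoid", name="fault")(x)    # fault probability
smooth = layers.Conv3D(1, 1, activation="linear", name="smooth")(x)   # smoothed amplitude
normal = layers.Conv3D(3, 1, activation="linear", name="normal")(x)   # normal vectors

model = Model(inp, [fault, smooth, normal])
model.compile(optimizer="adam",
              loss={"fault": "binary_crossentropy", "smooth": "mse", "normal": "mse"})
```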
For a combination of quarry blasts and tectonic earthquakes, events in Sakarya province, Turkey, were catalogued and analyzed in the frequency and time domains through digital seismograms. The main issue currently facing seismic source classification is the misclassification of events. To reduce this misclassification, direct P and S waves recorded at only one station were used rather than the whole network. A further complication is that the seismogram becomes more complex the farther the station is from the epicenter, so wave path and distance effects were reduced by using only one nearby station. Statistically, three criteria were used for discrimination analysis between earthquakes and quarry blasts in the Sakarya region, Turkey [20, 21].
A machine learning technique using SVM was adopted to classify seismic events of 1.5 < ML < 2.9 that took place within the Tianshan orogenic belt in China from 2009 to 2017. In the past few decades, the Tianshan orogenic belt has been home to many different sorts of seismic activity, including naturally occurring tectonic earthquakes (TEs), quarry blasts (QBs), and induced earthquakes (IEs) caused by the impoundment of large water reservoirs and by energy field production. The abundance of seismic activity unrelated to the tectonic system has made it hard to assess the seismic hazard in the area [22]. On the other hand, the abundance of these mixed events in the vicinity offers an opportunity to build and test various event classifiers by means of machine learning strategies [23]. The binary SVM method may be extended to multi-class classification problems by adopting a "one versus all" configuration [24, 25]. Each problem is handled as a binary SVM classifier that separates objects of a selected class from the objects of all the other classes [26].
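A minimal scikit-learn sketch of the "one versus all" configuration described above is given below; the feature matrix and the three event classes (TE, QB, IE) are synthetic placeholders rather than data from the cited studies.

```python
# Minimal sketch of the "one versus all" SVM configuration: one binary SVM per
# class, each separating that class from all others. Data are placeholders.
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))                  # 300 events, 8 waveform-derived attributes
y = rng.choice(["TE", "QB", "IE"], size=300)   # three event classes

clf = OneVsRestClassifier(SVC(kernel="rbf", gamma="scale"))
clf.fit(X, y)
print(clf.predict(X[:5]))
```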

Many hours were spent establishing new attributes and procedures for delineating each structure. In the field of seismic attribute analysis there are various techniques for seismic edge detection, texture analysis, and reflector geometry estimation to aid in the depiction of faults and salt domes from 3D seismic data, each of which measures lateral changes in the seismic reflection, including amplitude and/or waveform, using different operators [27, 28]. Three-dimensional seismic interpretation is critical for successful hydrocarbon exploration and subsurface reservoir characterization. As the size of 3D seismic surveys increases, the seismic volume becomes more difficult to interpret manually [29–31]. Artificial intelligence and machine learning strategies have been implemented in different disciplines and applied successfully in recent years. They support seismic interpretation in various tasks and mimic experienced interpreters in the seismic domain, including structure detection and facies analysis such as salt domes and faults [32]. Popular frameworks such as the convolutional neural network, the multi-layer perceptron network, and other neural network architectures have been compared on the problem of seismic salt body delineation. Among these frameworks, the convolutional neural network gives the best performance in learning from seismic signals and identifying the relevant seismic structures [33, 34].
Novel attributes and techniques/algorithms have been developed to aid in the separation of salt bodies from the non-salt components that surround them [35–45]. Salt is one of the most important subsurface structures, having a big impact on hydrocarbon storage and sealing in offshore petroleum reserves. Based entirely on a multi-attribute k-means cluster analysis, this research presents an unsupervised methodology for determining the surface of salt bodies from 3D seismic surveys. There are four steps in the workflow. First, a collection of seismic attributes is selected and computed from the original seismic amplitude volume, each of which separates the non-boundary seismic features from the salt boundaries in its own way. Second, sets of representative samples are manually picked to initialize the centers of the boundary and non-boundary clusters. Third, k-means cluster analysis is performed volumetrically. Finally, applying the k-means model to the seismic data provides a probability volume in which high values indicate the presence of salt-dome boundaries [46–48].
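The following sketch outlines the four-step multi-attribute k-means workflow summarized above; the attributes, the manually picked seed samples, and the distance-to-probability mapping are illustrative assumptions.

```python
# Minimal sketch of the four-step multi-attribute k-means workflow for
# salt-boundary delineation. Attributes and picked samples are placeholders.
import numpy as np
from sklearn.cluster import KMeans

# Step 1: seismic attributes computed per voxel, flattened to (n_voxels, n_attrs).
attrs = np.random.rand(100_000, 3)            # e.g., gradient, coherence, texture

# Step 2: manually picked representative samples initialize the two cluster centers.
boundary_seed = attrs[:50].mean(axis=0)       # stand-in for picked boundary voxels
background_seed = attrs[50:100].mean(axis=0)  # stand-in for picked non-boundary voxels
init_centers = np.vstack([boundary_seed, background_seed])

# Step 3: volumetric k-means clustering seeded with those centers.
km = KMeans(n_clusters=2, init=init_centers, n_init=1).fit(attrs)

# Step 4: map distance to the boundary center into a pseudo-probability volume;
# high values suggest salt-dome boundaries.
dist_to_boundary = np.linalg.norm(attrs - km.cluster_centers_[0], axis=1)
boundary_probability = 1.0 - dist_to_boundary / dist_to_boundary.max()
```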
The most straightforward and powerful method for resolving this issue is one in which an interpreter visually analyzes the seismic reflection patterns, recognizes the critical patterns, and tags them with distinguishing scores and/or colors. Seismic interpretation is increasingly widely used for delineating subsurface geology in a variety of fields, including environmental engineering and petroleum exploration. Some computer-assisted techniques have been developed in recent decades to speed up the interpretation process and improve accuracy. However, most of the prevailing interpretation techniques are designed for decoding one particular pattern (e.g., faults or salt domes) in a given seismic dataset at a time; correspondingly, the remaining patterns may be disregarded. Interpreting all of the important seismic patterns becomes viable with the aid of multi-class classification strategies. When applying them to the seismic domain, the principal drawback is the lower performance, especially for a huge dataset, since the classification must be repeated at each sample in the seismic volume [49, 50]. The related works are summarized in Table 32.1.

Table 32.1 Related works

| Title | Author | Algorithm | Merits | Demerits |
|---|---|---|---|---|
| Deep convolutional neural networks as a geological image classification tool [1] | Rafael Pires de Lima | Convolutional neural network (CNN) | Obtained good accuracy on the test set | Complexity can be high |
| Real-time seismic image interpretation via deconvolutional neural network [5] | Haibin Di | Deconvolutional neural network | Time-consuming process | Difficult to handle multiple datasets |
| Minerals identification and mapping using ASTER satellite image [11] | Khunsa Fatima | Maximum likelihood classification | Geological and mineral mapping of data | Supports only a limited area |
| Multitask learning for local seismic image processing: fault detection, structure-oriented smoothing with edge-preserving, and seismic normal estimation by using a single convolutional neural network [17] | Xinming Wu | Single convolutional neural network (CNN) | Trains the CNN more efficiently | Only supports synthetic data sets |
| Classification of seismic events using linear discriminant function (LDF) in the Sakarya region, Turkey [20] | Emrah Budakoğlu | Maximum frequency (fmax) function | Discrimination analysis of data | Seismic risk can occur |
| Support vector machine classification of seismic events in the Tianshan orogenic belt [26] | Lanlan Tang, Miao Zhang | SVM classifiers | Multiple clusters are formed | False positives can occur |
| Why using CNN for seismic interpretation? An investigation [33] | Zhen Wang | Multi-layer perceptron (MLP) | Features are extracted | Needs to classify multiple patches |
| Multi-attribute k-means clustering for salt-boundary delineation from three-dimensional seismic data [46] | Muhammad Shafiq | Multi-attribute k-means clustering | Clearly separates the target features | Needs to consider deep learning algorithms |
| Developing a seismic pattern interpretation network (SpiNet) for automated seismic interpretation [49] | Haibin Di | Seismic pattern interpretation network (SpiNet) | Identifies the important seismic patterns | Difficult to annotate the datasets |
| Numerical simulation of heterogeneous rock using discrete element model based on digital image processing | Xin Tan | Discrete element model | Mineral grains are extracted | Only handles homogeneous datasets |

32.3 Proposed Work

Fast and accurate substance identification is vital for a variety of Earth-based activities, including geological exploration, material and engineering sciences, and various analytical studies. Due to time and weight constraints, gathering in situ data about rock mineral composition is critical for planetary surface exploration, particularly for missions involving sample return. Taking into account the structural, chemical, and functional properties of the planetary substances to be identified, spectroscopy is a significant analytical tool for this purpose. Many robotic planetary exploration missions have already benefited from data obtained through novel spectroscopic techniques. Machine learning (ML) has opened up many applications in spectroscopy, such as being used to identify minerals and estimate elemental composition.

32.3.1 Support Vector Machine

The support vector machine is a supervised method that can be applied to both regression and classification problems. A classification function defines the linear boundary that separates the classes. To create the hyperplane, the hyperplane with the largest margin to the training data points of any class is selected, which is achieved by separating every class as widely as possible. The real power of this algorithm comes from the use of kernel functions. The following are the most commonly utilized kernels:
• Linear kernel
• Gaussian kernel
• Polynomial kernel.
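For reference, these kernels take their standard forms, with $\sigma$, $c$, and $d$ as user-chosen parameters:

$$
K_{\text{lin}}(x, x') = x^{\top} x',\qquad
K_{\text{Gauss}}(x, x') = \exp\!\left(-\frac{\lVert x - x' \rVert^{2}}{2\sigma^{2}}\right),\qquad
K_{\text{poly}}(x, x') = \left(x^{\top} x' + c\right)^{d}
$$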

32.3.2 Decision Tree

The decision tree is also a supervised machine learning method, with a simple tree data structure at its core and a series of if/else tests on the chosen features. Decision trees are built on a hierarchical rule-based system that permits class labels to be accepted or rejected at each intermediate level/stage. There are three parts to this method:
• Partitioning the nodes
• Finding the terminal nodes
• Allocating the class label to the terminal nodes.
In a decision tree, each non-leaf node tests a feature, each branch represents an outcome of that test, and each leaf node indicates a class. To use a decision tree, start at the root node and examine the relevant attributes of the sample, then select the branch based on the outcome, continuing until a leaf node is reached. The outcome is the class of the chosen leaf node. The bagged-tree procedure proceeds as follows (a minimal sketch is given below):
• Create a random sample of size N from the data with replacement.
• At each split, take a random sample of the predictors.
• Grow a Classification and Regression Tree (CART) on the partitioned data.
• Repeat Step 2 for each succeeding split until the tree is as large as desired, without pruning.
• Perform Steps 1 through 4 a large number of times (e.g., 500).
The random forest is an excellent statistical learning model because it is extremely efficient in the domain of multi-dimensional features and performs feature selection; however, when the data noise is extremely high, it tends to overfit.
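A minimal scikit-learn sketch of a single decision tree and the bagged random forest procedure listed above; the features and labels are synthetic placeholders.

```python
# Minimal sketch of a single decision tree versus the bagged ensemble (random
# forest) described above. Features and labels are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 6))                 # e.g., six spectral/seismic attributes
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)

tree = DecisionTreeClassifier(max_depth=5).fit(X, y)

# 500 bootstrap samples; each split considers a random subset of predictors.
forest = RandomForestClassifier(n_estimators=500, max_features="sqrt").fit(X, y)
print(tree.score(X, y), forest.score(X, y))
```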

32.3.3 K-Nearest Neighbor

The k-nearest neighbor algorithm is based on the distance between feature vectors and classifies unknown data elements by finding the most common class among the k closest examples. To use the k-nearest neighbor classifier, a distance metric or similarity criterion must be defined. The two most popular options are the Euclidean distance and the Manhattan distance.
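For reference, for feature vectors $x, y \in \mathbb{R}^{n}$ these two distances are defined as:

$$
d_{\text{Euclidean}}(x, y) = \sqrt{\sum_{i=1}^{n}(x_i - y_i)^{2}},\qquad
d_{\text{Manhattan}}(x, y) = \sum_{i=1}^{n}\lvert x_i - y_i\rvert
$$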

32.3.4 Recurrent Neural Network

Minerals can be found all around us. Minerals are created via natural processes and are also extracted from their raw sources. Their growth is influenced by a variety of physical and chemical factors, including mass balance, temperature, pressure, intensity, and other environmental factors. In general, minerals can be termed naturally occurring substances. Minerals are found in both organic and inorganic forms with a distinct chemical composition, and they form crystal structures that occur in nature in their purest form. Fast mineral identification is necessary for tasks such as mining, petrography, and engineering geology, driven by sophisticated improvements in geoscientific technologies. Minerals serve a variety of purposes, including providing fuel for industries such as natural gas and petroleum. Because minerals are so important for economic growth, it is critical to improve their identification and classification.
In recent years, image classification has attained greater accuracy by using deep learning. Deep learning is a subfield of machine learning in which a model learns to execute classification tasks directly from images, commonly implemented with neural network architectures. The recurrent neural network has already exhibited exceptional performance in the domain of image classification. Recurrent neural networks (RNNs) form the backbone of this image classification approach, since they take an input image and assign it a class and a label that identifies it. The network uses a ranking model to learn spatial characteristics via convolutional layers and then a fully connected output layer to categorize the classes. A recurrent neural network is a type of artificial neural network that adds loops to the connections of a feed-forward neural network. In a recurrent neural network, the inputs are processed sequentially, with each step depending on the previous step. In this way, the network displays dynamic temporal behavior.

Given sequence data $x = (x_1, x_2, \ldots, x_T)$, where $x_i$ is the data at the $i$th time step and $h_t$ is the recurrent hidden state,

$$
h_t =
\begin{cases}
0, & t = 0,\\
\varphi(h_{t-1}, x_t), & \text{otherwise,}
\end{cases}
$$

where $\varphi$ is a nonlinear function such as the logistic sigmoid or the hyperbolic tangent. In some tasks, a recurrent neural network produces multiple outputs $(y_1, y_2, \ldots, y_T)$, but in image classification it produces a single output $y_T$.

In the conventional recurrent neural network model, the recurrent hidden-state update rule is usually implemented as

$$
h_t = \varphi(W x_t + U h_{t-1}),
$$

where $W$ and $U$ are the coefficient matrices for the input at the present step and for the activation of the recurrent hidden units at the previous step, respectively. The joint probability $p(x_1, x_2, \ldots, x_T)$ is factorized as in Eq. (32.1):

$$
p(x_1, x_2, \ldots, x_T) = p(x_1)\, p(x_2 \mid x_1) \cdots p(x_T \mid x_1, x_2, \ldots, x_{T-1}). \tag{32.1}
$$

Each conditional probability distribution can be represented as in Eq. (32.2):

$$
p(x_t \mid x_1, \ldots, x_{t-1}) = \varphi(h_t). \tag{32.2}
$$

Here the seismic pixels behave like sequential data rather than plain feature vectors, allowing a recurrent network to represent the spectral sequence. RNNs, as a subset of the deep learning family, have recently demonstrated their ability to perform well in a variety of machine learning and computer vision applications. However, RNNs can be difficult to work with; it has been concluded that training RNNs is hard when dealing with long-term sequential data.
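The following NumPy sketch implements the recurrent update $h_t = \varphi(W x_t + U h_{t-1})$ from the equations above, applied to one pixel's spectral sequence with a single classification output $y_T$; the dimensions, random weights, and softmax readout are illustrative assumptions.

```python
# Minimal NumPy sketch of the recurrent update h_t = phi(W x_t + U h_{t-1}),
# applied to a pixel's spectral sequence with a single classification output y_T.
import numpy as np

def rnn_classify(x_seq, W, U, V, b, c):
    """x_seq: (T, d) spectral sequence for one pixel; returns class probabilities."""
    h = np.zeros(U.shape[0])                  # h_0 = 0, as in the piecewise definition
    for x_t in x_seq:
        h = np.tanh(W @ x_t + U @ h + b)      # recurrent hidden-state update
    logits = V @ h + c                        # single output y_T for classification
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Example: 12 time steps of 8 spectral attributes, 4 hidden units, 3 classes.
rng = np.random.default_rng(0)
T, d, n_h, n_c = 12, 8, 4, 3
probs = rnn_classify(rng.normal(size=(T, d)),
                     rng.normal(size=(n_h, d)), rng.normal(size=(n_h, n_h)),
                     rng.normal(size=(n_c, n_h)), np.zeros(n_h), np.zeros(n_c))
print(probs)
```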
As noted at the start of this section, fast and dependable identification of substances is critical for diverse applications on Earth, and acquiring in situ data on rock mineral composition is of crucial importance for planetary surface exploration. In this work, novel methods are presented for automatic mineral identification based on combining data from different spectroscopic techniques.

32.4 Conclusion and Future Work

Mineral detection from seismic data must be accurate in order to characterize and model reservoirs. The delineation method is based on the recurrent neural network (RNN) technique, which outperforms previous multi-attribute-based strategies in two ways. First, the RNN learns, classifies, and identifies local seismic reflection patterns, so that distinct artifacts and seismic noise are effectively identified and eliminated. Second, the RNN determines the mapping between the seismic signals and the minerals from the original seismic amplitude rather than from manually selected seismic attributes, which minimizes interpreter bias and requires less time from the interpreter. The good match between the generated probability volume and the original seismic images not only validates the RNN's ability to learn seismic features, but also shows that machine learning techniques offer great potential for advanced seismic data analysis and real-time feature segmentation.

Chapter 33
User’s Nostrum Spot for Consulting
and Prescribing Through Online

B. Padmini Devi, J. Rajaram, B. Jerish, S. B. Gowsick, and E. Shriram

Abstract If anyone falls sick and needs to visit a specialist for a health check-up, the individual normally has to travel to the clinic and wait until the specialist is available. The client has to wait for a response from the person concerned, and if the doctor cancels for some unforeseen reason, the patient may never learn of the cancellation. Since mobile technology is growing quickly, mobile applications can be used to overcome such problems and inconvenience for patients. The proposed application helps in exploring the specialists who are currently available in the application, so that patients can obtain the required medicines on time and on an organized plan through E-consultation, physical consultation at home, or online prescription. The proposed work is a Health Care System that makes use of a smart platform which fixes appointments based on the consultant's free timing, making the process effortless and dependable for our clients. Our system provides email notification to doctors; on the client side, users receive a confirmation email and can upload their medical reports so that, in an emergency, consultation can proceed from the already uploaded reports. The system also provides a suggestion form through which clients can notify the admin of their needs.

Keywords Appointment scheduling · Online application · E-consultation · Physical consultation · Android Health Care System

33.1 Introduction

During the COVID-19 pandemic, the governments of many nations declared full lockdowns to prevent the spread of the infection. The world economy declined as a result of these lockdowns, and the IT sector faced tremendous difficulties in maintaining business continuity during the pandemic period [1–3]. When the COVID-19 pandemic arrived and measures such as lockdowns were imposed, consulting a specialist in an emergency became a risk for everyone because of the fear of COVID.

B. Padmini Devi (B) · J. Rajaram · B. Jerish · S. B. Gowsick · E. Shriram


Department of Computer Science and Engineering, M. Kumarasamy College of Engineering,
Thalavapalayam, Karur, Tamil Nadu 639113, India
e-mail: padminidevib.cse@mkce.ac.in


Fig. 33.1 Important features of Android OS

To address this, an Android application can help us obtain first-aid consultancy from within the home itself or arrange the required safety measures [4–6].
Android is an open-source, Linux-based operating system for mobile phones and tablet PCs developed by Google. It is a platform that offers an integrated approach to developing Android applications for mobile devices [7, 8]. Hence, developers can build their own customized applications and deploy them on various Android-powered devices. The Android OS has some unique features which make it popular in the developer community [9–11]. Important features of Android applications are presented in Fig. 33.1.

33.1.1 Existing System

Over the last few years, the number of seriously ill patients visiting or admitted to hospitals has increased steadily, which usually results in overcrowding [12–15]. It may even contribute to unrest in hospitals and indirectly affects the number of patients who visit them. In most hospitals, doctor consultation booking is currently done manually, which requires manual intervention every single time [16, 17]. Paperwork has to be prepared for each patient's appointment, and there is no software to automate this process. Therefore, the proposed framework aims at developing software to automate appointment booking. This framework will give fast and accurate results without defects [18, 19].

33.1.2 Problems in Existing System

• Queues are a part of the Hospital Management System and directly affect the number of patients visiting the hospitals.
• Maintaining records manually becomes very difficult over the years.
• Appropriate doctors may be unavailable.

33.2 Related Works

A smart specialist-based consultation booking framework has been proposed in [20, 21], in which a scheduling facility is provided for customers and junior medical staff are assigned according to the required level.
The authors of [22–24] suggested a mobile application that helps clients keep track of their dosage level and next schedule through an alarm mechanism, so that they can remain fit and take their medication on time. Searching for professionals and clinics along with route details is also convenient in their system, enabling users to obtain appropriate treatment on schedule.
A mobile application-based arrangement for a government setup was suggested in [25], which makes use of application programming interfaces (APIs) from Google Maps and Google Calendar. This consulting application may be used together with other consulting-based systems. The mobile utility tracks consultations by saving the appointment record, which is synchronized with the Google calendar, and clients receive an alarm at a preset time before the respective appointment.
A fitness-based system was proposed in [26] that uses sensors attached to a PDA for collecting client details and simultaneously sends the records to a central server over the Internet for further analysis. Such Internet-based systems, however, still carry certain risks.
An Android smartphone and tablet application was described in [27] that is freely available from the Google Play store and provides practical features, including medical staff information, to track the position of the client continuously. Distance estimation is applied to determine the shortest route to the destination.
One more study consists of an Internet-based database for monitoring patients with an artificial heart. The data set is compact and maintains continuous support for its clients, including their report history. There are further investigations covering handheld healthcare and efficient scheduling algorithms for appointment booking, including surveys, as reported in [28].
The work in [29] deliberates on the main objective of building a reliable and robust healthcare framework; data integrity, successful relay communication, and strong storage of information are among its central goals.
The target of our proposed Nostrum Spot System is to reduce queues during emergencies, which results in saving patients' time. For every patient in the queue, the waiting time caused by the complete treatment schedules of the patients ahead of him at the time of getting an appointment is reduced with this approach [30]. The proposed framework therefore attempts to plan schedules through web-based arrangements. The client selects an appointment option such as E-consultation, physical consultation at home, or online prescription. The proposed work is a Health Care System with an application subsection that confirms slot arrangements according to the doctor's availability, making the process manageable and dependable for our clients. Our system provides email notification to doctors; on the client side, users receive a confirmation email and can upload their report so that, in an emergency, consultation can be carried out using the already uploaded report. The system also provides a suggestion form through which clients can notify the admin of their needs [30].

33.3 Proposed Method

Our UI is designed to be simple and effective for users to interact with. Whenever the application is accessed, the end user first has to sign up and log in. The client can then choose the preferred specialist according to the available schedule and select the mode of consultation: E-consultation, physical consultation at home, or online prescription. The list of appointments is looked after by the admin side, which remains alert to every client situation while scheduling slots around the doctors' busy periods. The admin side has rights to view doctors' records, view clients' records, and view the suggestions submitted by clients to understand their needs within the system. All of the clients' reports are monitored by our admin team synchronously, and these medical reports and client statistics are used by the system's portal accordingly.

33.3.1 Client Interface Features

In this module, we manage the login interface and the retrieval of information from the database on the server. The status attribute checks whether or not a token (appointment slot) is available. When an available token is seen by the client, he proceeds with the booking. This framework is fundamentally concerned with booking appointments for patients. Handling the information and records of such a large system is an extremely complex task when done manually; however, it becomes much simpler if the system is computerized. This framework provides anytime, anywhere service for the clients.

33.3.2 Admin Features

• Admins have their respective user id and password for authenticated login.
• With the hospital appointment framework, the administrator can manage the bookings and the schedules of the specialists and let patients select their regular check-ups.

33.3.3 Booking Features

• The clients have their particular login credentials, including attributes such as patient name, patient telephone number, patient age, patient email-id, and the patient's secret password.
• Clients are allowed to view the already allotted and booked appointments through the booking features option.

Generating Allotted Appointments

When clients log in, they can see the available specialists and timings as well as the already reserved appointments, which can be recognized by their colors. As soon as an appointment is reserved by a patient, the application sends a message to the specialist's number with the client details. The mobile number entered on the login page is used as the reference. A simplified sketch of this booking flow is given below.
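As a rough illustration only, the following Python sketch shows how a server-side slot check and booking could work; the table layout, field names, and the notification step are assumptions for illustration and not the authors' actual Android Studio implementation.

    # Illustrative sketch: minimal server-side booking check (assumed schema).
    import sqlite3

    def book_appointment(db_path, doctor_id, slot, patient_name, patient_phone):
        """Reserve `slot` with `doctor_id` if it is still free; return True on success."""
        con = sqlite3.connect(db_path)
        cur = con.cursor()
        cur.execute("""CREATE TABLE IF NOT EXISTS appointments
                       (doctor_id TEXT, slot TEXT, patient TEXT, phone TEXT,
                        UNIQUE(doctor_id, slot))""")
        try:
            # The UNIQUE constraint plays the role of the "token": a slot can
            # only be booked once.
            cur.execute("INSERT INTO appointments VALUES (?, ?, ?, ?)",
                        (doctor_id, slot, patient_name, patient_phone))
            con.commit()
            booked = True
        except sqlite3.IntegrityError:
            booked = False          # slot already reserved by another patient
        finally:
            con.close()
        if booked:
            # Placeholder for the email confirmation described in the paper.
            print(f"Notify doctor {doctor_id} and patient {patient_phone}: {slot} booked")
        return booked

    if __name__ == "__main__":
        print(book_appointment("clinic.db", "D01", "2022-05-10 10:00", "A. Client", "9999999999"))

In this sketch, the database's uniqueness constraint stands in for the availability check, so two clients cannot reserve the same slot even if they attempt to book simultaneously.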

33.4 Conclusion

The suggested interactive appointment framework has been implemented in Android Studio. The tasks associated with this work are separated into modules. The proposed framework is efficient and has a user-friendly UI. An extension of the administrator and specialist subsections of the automation system is planned for future improvement, which may help specialists register on the application and perform all of their tasks through it. A payment or small fee may be charged to clients/patients while booking in order to discourage fraudulent users, since many clients register only casually and have no real concern for their appointment. Further upcoming directions are upgrades to the client subsection, which include push updates for appointments and saving reminder dates for the respective schedules.

References

1. Hylton III, A., Sankaranarayanan, S.: Application of intelligent agents in hospital appointment
scheduling system. Int. J. Comput. Theor. Eng. 4, 625–630 (2012)
2. Murugesan, M., Thilagamani, S.: Efficient anomaly detection in surveillance videos based on
multi-layer perception recurrent neural network. J. Microprocess. Microsyst. 79 (2020)
3. Ameta, D., Mudaliar, K., Patel, P.: Medication reminder and healthcare—an android applica-
tion. Int. J. Managing Public Sect. Inf. Commun. Technol. (IJMPICT) 6, 39–48 (2015)
4. Thilagamani, S., Nandhakumar, C.: Implementing green revolution for organic plant forming
using KNN-classification technique. Int. J. Adv. Sci. Technol. 29(7S), 1707–1712 (2020)
5. Choudhari, S.B., Kusurkar, C., Sonje, R., Mahajan, P., Vaz, J.: Android application for doctor’s
appointment. Int. J. Innovative Res. Comput. Commun. Eng. (2014)
6. Thilagamani, S., Shanti, N.: Gaussian and Gabor filter approach for object segmentation. J.
Comput. Inf. Sci. Eng. 14(2), 021006 (2014)
7. Gavaskar, S., Sumithra, A., Saranya, A.: Health portal—an android smarter healthcare
application. Int. J. Res. Eng. Technol. (2013)
8. Perumal, P., Suba, S.: An analysis of a secure communication for healthcare system using
wearable devices based on elliptic curve cryptography. J. World Rev. Sci. Technol. Sustain.
Dev. 18(1), 51–58 (2022)
9. Sposaro, F., Tyson, G.: iFall: an android application for fall monitoring and response. In: 31st
Annual International Conference of the IEEE Engineering in Medicine and Biology Society,
vol. 1, pp. 6119–22 (2009)
10. Pandiaraja, P., Sharmila, S.: Optimal routing path for heterogenous vehicular Adhoc network.
Int. J. Adv. Sci. Technol. 29(7), 1762–1771 (2020)
11. Pandiaraja, P., Aravinthan, K., Lakshmi Narayanan, R., Kaaviya, K.S., Madumithra, K.: Effi-
cient cloud storage using data partition and time based access control with secure AES
encryption technique. Int. J. Adv. Sci. Technol. 29(7), 1698–1706 (2020)
12. Tsai, P.-F., Chen, I., Pothoven, K.: Development of handheld healthcare information system
in an outpatient physical therapy clinic. In: Proceedings of the 2014 IEEE 18th International
Conference on Computer Supported Cooperative Work in Design, pp. 559–602
13. Rajesh Kanna, P., Santhi, P.: Unified deep learning approach for efficient intrusion detection
system using integrated spatial–temporal features. Knowl.-Based Syst. 226 (2021)
14. Santhi, P., Mahalakshmi, G.: Classification of magnetic resonance images using eight directions
gray level co-occurrence matrix (8dglcm) based feature extraction. Int. J. Eng. Adv. Technol.
8(4), 839–846 (2019)
15. Wang, J., Fung, R.Y.K.: Adaptive dynamic programming algorithms for sequential appointment
scheduling with patient preferences. Science Direct, Artif. Intell. Med. 33–40 (2015)
16. Deepa, K., Thilagamani, S.: Segmentation techniques for overlapped latent fingerprint
matching. Int. J. Innovative Technol. Exploring Eng. 8(12), 1849–1852 (2019)
17. Mu, B., Xiao, F., Yuan, S.: A rule-based disease self-inspection and hospital registration recom-
mendation system. In: 2012 IEEE 3rd International Conference on Software Engineering and
Service Science (ICSESS), 22–24 June (2012)
18. Pradeep, D., Sundar, C.: QAOC: Noval query analysis and ontology-based clustering for data
management in Hadoop. Future Gener. Comput. Syst. 108, 849–860 (2020)
19. Symey, Y., Sankaranarayanan, S., binti Sait, S.N.: Application of smart technologies for mobile
patient appointment system. Int. J. Adv. Trends Comput. Sci. Eng. (2013)
20. Logeswaran, R., Aarthi, P., Dineshkumar, M., Lakshitha, G., Vikram, R.: Portable charger for
handheld devices using radio frequency. Int. J. Innovative Technol. Exploring Eng. (IJITEE)
8(6), 837–839 (2019)
21. Aghav, J., Sonawane, S., Bhambhlani, H.: Health track: health monitoring and prognosis system
using wearable sensors. In: IEEE International Conference on Advances in Engineering &
Technology Research, pp. 1–5 (2014)

22. Gunasekar, M., Thilagamani, S.: Performance analysis of ensemble feature selection method
under SVM and BMNB classifiers for sentiment analysis. Int. J. Sci. Technol. Res. 9(2), 1536–
1540 (2020)
23. SyMey, Y., Sankaranarayanan, S.: Near field communication based patient appointment. In:
International Conference on Cloud and Ubiquitous Computing and Emerging Technologies,
pp. 98–103 (2013)
24. Deepika, S., Pandiaraja, P.: Ensuring CIA triad for user data using collaborative filtering
mechanism. In: 2013 International Conference on Information Communication and Embedded
Systems (ICICES), pp. 925–928 (2013)
25. Nimbalkar, R.A., Fadnavis, R.A.: Domain specific search of nearest hospital and healthcare
management system. Recent Adv. Eng. Comput. Sci. (RAECS), 1–5 (2014)
26. Rajesh Kanna, P., Santhi, P.: Hybrid intrusion detection using map reduce based black widow
optimized convolutional long short-term memory neural networks. Expert Syst. Appl. 194, 15
(2022)
27. Luschi, A., Belardinelli, A., Marzi, L., Frosini, F., Miniati, R., Iadanza, E.: Careggi Smart
Hospital: a mobile app for patients, citizens and healthcare staff. In: IEEE-EMBS International
Conference on Biomedical and Health Informatics (BHI), pp. 125–128 (2014)
28. Deepa, K., Kokila, M., Nandhini, A., Pavethra, A., Umadevi, M.: Rainfall prediction using
CNN. Int. J. Adv. Sci. Technol. 29(7 Special Issue), 1623–1627 (2020)
29. Choi, J., Kang, W.Y., Chung, J., Park, J.W.: Development of an online database system for
remote monitoring of artificial heart patient. In: 2003 4th International IEEE EMBS Special
Topic Conference on Information Technology Applications in Biomedicine, 24–26 April 2003
30. Mobile Application for Doctor Appointment Scheduling (Research Paper): 2021 International
Conference on System, Computation, Automation and Networking (ICSCAN), July 2021
Chapter 34
Decentralized Waste Management
System: Smart Dustbin

Dhruv Shah , Maharshi Relia , and Sandip Patel

Abstract An exponential increase in the human population poses considerable challenges to the garbage management system and to sustaining a clean environment. Many cities around the world are suffering due to poor garbage management. Our project aims to find a solution with the help of a Smart Dustbin built using a NodeMCU ESP8266 and an ultrasonic sensor, where the ultrasonic sensor continuously measures the garbage level of the dustbin. The project thus aims to prevent dustbins from overflowing and to send the dustbin data to a server over Wi-Fi, so that the garbage disposal system becomes effective and cities become more hygienic and cleaner.

Keywords Waste management · Decentralized waste management · Cloud computing · Health

34.1 Introduction

Over the past 20 years, we have observed a tremendous increase in urbanization and, at the same time, an increase in waste production. Normally, garbage bins are evacuated and emptied regularly by cleaners. This approach has many drawbacks: (1) many garbage bins fill up quicker than the emptying rate and are already full before the next scheduled collection time, which results in overflowing bins and creates hygiene risks in the surroundings; (2) during particular time intervals such as festivals, weekends, and public holidays, specific garbage bins fill up very quickly, and there is a need to evacuate them at a higher rate [1, 2]. We should consider waste

D. Shah (B) · M. Relia · S. Patel


CSPIT-IT, Charotar University of Science and Technology, Anand, Gujarat 388421, India
e-mail: 18it119@charusat.edu.in
M. Relia
e-mail: 18it110@charusat.edu.in
S. Patel
e-mail: sandippatel.it@charusat.ac.in


management as a significant part of maintaining the well-being of society, and this paper gives a roadmap to achieve this noble cause [3].
This project has proposed a Smart Dustbin which is built on a microcontroller-
based platform NodeMCU interfaced with an ultrasonic sensor. The ultrasonic sensor
is placed at the top of the dustbin, which will measure the level of garbage in the
dustbin. The standard threshold is set as 40 cm. The NodeMCU ESP8266 is integrated so that, as the dustbin fills, the remaining height from the threshold is measured by the ultrasonic sensor and displayed on the portal. Over Wi-Fi, the NodeMCU ESP8266 continuously shares the dustbin data with the authority. Once the garbage reaches its threshold level, a red alert is raised
on the portal of that dustbin, which will alert the required authority to track the status.
Version 1: The authority is available locally.
1. The authority website will be hosted locally, and all operations will be managed
through a local database.
Version 2: The authority is located remotely.
1. The authority website is hosted globally, and NodeMCU ESP8266 will send the
data to AWS IoT core with the help of Wi-Fi.
2. Then, the AWS IoT core will store the data in the DynamoDB (A NoSQL
database in the AWS cloud).
3. With the help of AWS Lambda, the latest data will be extracted from the
DynamoDB and will be displayed to maps on that portal.
For the first instance when these intelligent garbage bins are installed on a large
scale by replacing the regular garbage bins, waste can be managed efficiently due to
a decrease in the unnecessary lumping of debris on the roadside and an automation-
based process [4]. It will also benefit cities in terms of hygiene and lower the
probability of left-over garbage.
Our primary objective is to automate such products for the municipal corporation.
It will save human resources, and it is a time-saving technology. It will also bring
a revolution in technology for the workers, resulting in the well-being of society’s
hygiene.

34.2 Related Work

See Table 34.1.


34 Decentralized Waste Management System: Smart Dustbin 349

Table 34.1 Work done till now by other authors

Work [5]. Approach: IoT, wireless sensing node, and cloud-based server with Android application. Application: waste management in the smart city project. Tools and technologies: ultrasonic sensors, Arduino UNO, Amazon Web Services.
Work [1]. Approach: IoT, outdoor nodes, analytics, workstation. Application: outdoor testbed, litter bin. Tools and technologies: GPS, ultrasonic sensors, mesh network, battery backup.
Work [6]. Approach: IoT, wireless sensing and foul smell detection. Application: smart bin project in Smart City. Tools and technologies: IR sensors (transmitters and receivers), LCD, PHP, Arduino IDE, NodeMCU, gas sensor.
Work [7]. Approach: mobile application associated with an IoT-based Smart Trash Bin. Application: Smart City project. Tools and technologies: GSM, GPRS module, ultrasonic sensor, LED signal, microcontroller.
Work [4]. Approach: trash inside the bin is decomposed compared to a daily collection. Application: economically sound regions. Tools and technologies: Arduino UNO, GSM module, servo motor, gas sensor, LED signal, ultrasonic sensor.
Work [8]. Approach: IoT, Cloud of Things, cloud computing, big data, healthcare. Application: smart cities, waste management. Tools and technologies: this is just a proposed system and has not been implemented.
Work [9]. Approach: IoT, mobile, and Web-based monitoring. Application: Smart City, waste management, smart waste-bin. Tools and technologies: load sensors, level sensors, microcontroller, GSM, Bluetooth, HC-SR04, mobile and Web application.
Work [10]. Approach: IoT, smell detection, trash bin tracking. Application: Smart City, cloud-enabled smart dustbins. Tools and technologies: ultrasonic sensors, Wi-Fi modules, APIs, cloud, MQ-135, MQ-136, microcontroller (Arduino UNO), database.

34.3 Proposed System

34.3.1 Version 1

• The ultrasonic sensor (HC-SR04) measures the dustbin level and is connected to the NodeMCU ESP8266, which transmits the dustbin data to the server over Wi-Fi. When the data arrives at the server, it is stored in the local database, and at the same time the website running on the local host with the help of Python Flask shows the updated data of the dustbin. A minimal sketch of such a local endpoint is given after this list.
• This version is used for societies or universities that have Wi-Fi across the campus.
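As a rough illustration of the Version 1 server side, the sketch below shows a small Flask application that accepts dustbin readings and returns the latest level; the route names, payload fields, alert threshold, and SQLite storage are assumptions for illustration and not the authors' actual code.

    # Hedged sketch of a Version-1 style local server (assumed routes and fields).
    import sqlite3
    from flask import Flask, request, jsonify

    app = Flask(__name__)
    DB = "dustbin.db"
    ALERT_BELOW_CM = 5        # assumed: alert when the free height drops under 5 cm

    def init_db():
        with sqlite3.connect(DB) as con:
            con.execute("""CREATE TABLE IF NOT EXISTS readings
                           (bin_id TEXT, level_cm REAL,
                            ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP)""")

    @app.route("/update", methods=["POST"])
    def update():
        data = request.get_json()          # e.g. {"bin_id": "B1", "level_cm": 23.5}
        with sqlite3.connect(DB) as con:
            con.execute("INSERT INTO readings (bin_id, level_cm) VALUES (?, ?)",
                        (data["bin_id"], data["level_cm"]))
        return jsonify(status="ok")

    @app.route("/latest/<bin_id>")
    def latest(bin_id):
        with sqlite3.connect(DB) as con:
            row = con.execute("""SELECT level_cm, ts FROM readings
                                 WHERE bin_id = ? ORDER BY ts DESC LIMIT 1""",
                              (bin_id,)).fetchone()
        level = row[0] if row else None
        return jsonify(bin_id=bin_id, level_cm=level,
                       alert=(level is not None and level < ALERT_BELOW_CM))

    if __name__ == "__main__":
        init_db()
        app.run(host="0.0.0.0", port=5000)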

34.3.2 Version 2

• The ultrasonic sensor (HC-SR04) measures the dustbin level and is connected to the NodeMCU ESP8266, which transmits the dustbin data to AWS IoT Core over Wi-Fi. When the data reaches the AWS IoT console, it is automatically stored in Amazon DynamoDB. A Lambda function extracts the newly received data and sends it to the frontend webpage with the help of AWS API Gateway.
• This version is used in smart cities to monitor their garbage.

34.4 Design Specifications

Our design is divided into four parts: Client-Side; Server-Side (Version 1 only); AWS-Side (Version 2 only); Front End.

34.4.1 Client-Side

NodeMCU ESP8266. A small Wi-Fi-enabled microcontroller board used to transfer data over Wi-Fi quickly.
Ultrasonic Sensor (HC-SR04). A sensor that transmits ultrasonic waves and receives their echo back; from the time required for sending and receiving the waves, the distance can easily be determined. A minimal firmware-style sketch of this measurement is given at the end of this subsection.
Jumper wires. These wires connect the NodeMCU and the ultrasonic sensor.
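For illustration only, the following MicroPython-style sketch shows how the free height inside the bin could be read on an ESP8266; the GPIO pin numbers and the firmware approach itself are assumptions (the authors may have used the Arduino toolchain instead).

    # Hedged MicroPython sketch for an ESP8266 reading an HC-SR04 (assumed pins).
    from machine import Pin, time_pulse_us
    import time

    TRIG = Pin(12, Pin.OUT)     # assumed GPIO wiring
    ECHO = Pin(14, Pin.IN)
    BIN_DEPTH_CM = 40           # empty-bin depth / threshold used in the paper

    def read_distance_cm():
        TRIG.off()
        time.sleep_us(2)
        TRIG.on()
        time.sleep_us(10)       # 10 us trigger pulse
        TRIG.off()
        duration = time_pulse_us(ECHO, 1, 30000)   # echo high time in us, 30 ms timeout
        return (duration * 0.0343) / 2             # speed of sound ~343 m/s, out and back

    def fill_percent():
        free_cm = read_distance_cm()
        filled = (BIN_DEPTH_CM - free_cm) / BIN_DEPTH_CM * 100
        return max(0.0, min(100.0, filled))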

34.4.2 Server-Side—Version 1 Only

Raspberry Pi. A single-board computer acting as the server in our case. It is connected to the same Wi-Fi network as the NodeMCU, so the data sent by the NodeMCU is received by the Raspberry Pi over Wi-Fi. It also hosts the local database, so when the NodeMCU sends the data to this server, the Raspberry Pi can store that data in the database.

34.4.3 AWS-Side—Version 2 Only

AWS IoT Core. This is an AWS cloud console, where we can register our IoT devices
and configure them to send the data to the AWS IoT core console [11].
AWS DynamoDB. Amazon DynamoDB is a fully managed, proprietary NoSQL database service that supports key-value and document data structures and is offered by Amazon.com as part of the Amazon Web Services portfolio. DynamoDB exposes a similar data model to, and derives its name from, Dynamo but has a different underlying implementation [12].
AWS API Gateway. Amazon API Gateway is an AWS service for creating, publishing, maintaining, monitoring, and securing REST, HTTP, and WebSocket APIs at any scale. API developers can create APIs that access AWS or other web services, as well as data stored in the AWS cloud. API Gateway creates RESTful APIs that are HTTP-based [13].
AWS Lambda. AWS Lambda is an event-driven, serverless computing platform
delivered by Amazon as a part of Amazon Web Services. It is a computing service
that runs a program in response to events and automatically manages the computing
resources needed by that program [14].

34.4.4 Front-End Side

With the help of HTML, CSS, JS, and Google Maps API, we have developed a
dashboard on local host to display our data to the authorized municipal corporation.
After successful testing, we will host it on an EC2 instance to be publicly available
or configure it as per their needs.

34.5 Methodology and Working

The ultrasonic sensor measures the distance, and it will send the data to the
NodeMCU, and the NodeMCU will send that data to the AWS IoT core through
Wi-Fi. AWS IoT core is configured to store the data in the database, i.e., DynamoDB.
Then, the Lambda will fetch the data from the DynamoDB and display it on a webpage
with the help of AWS API Gateway (Fig. 34.1).

34.6 Implementation and Execution

See Fig. 34.2.



Fig. 34.1 Flowchart of sensor

34.6.1 Client-Side [15]

Step 1. Create a thing in the AWS IoT core console and download certificates and
keys.
Step 2. Convert the certificates and keys to .der format.

Fig. 34.2 Working of proposed system

Step 3. Program the NodeMCU so that it first loads all the certificates, then reads the ultrasonic sensor, and finally sends the measured data to the AWS IoT Core over Wi-Fi. A hedged illustration of this publish flow is given below.
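The NodeMCU firmware itself is not listed in the paper; as a rough illustration of the certificate-based publish flow, the sketch below uses the paho-mqtt client from a Python host. The endpoint, topic name, and certificate file names are placeholders. The .der conversion in Step 2 is typically required by the ESP8266 TLS libraries, whereas a desktop client such as this one uses the original PEM files.

    # Hedged illustration of publishing a dustbin reading to AWS IoT Core over MQTT/TLS.
    # Endpoint, topic, and certificate paths are placeholders, not values from the paper.
    import json
    import ssl
    import time
    import paho.mqtt.client as mqtt

    ENDPOINT = "xxxxxxxx-ats.iot.us-east-1.amazonaws.com"   # placeholder AWS IoT endpoint
    TOPIC = "smartdustbin/level"                            # assumed topic name

    client = mqtt.Client()   # paho-mqtt 1.x style constructor; 2.x also takes a callback API version
    client.tls_set(ca_certs="AmazonRootCA1.pem",
                   certfile="device-certificate.pem.crt",
                   keyfile="private.pem.key",
                   tls_version=ssl.PROTOCOL_TLSv1_2)
    client.connect(ENDPOINT, port=8883)
    client.loop_start()

    payload = {"bin_id": "B1", "level_cm": 23.5, "ts": int(time.time())}
    client.publish(TOPIC, json.dumps(payload), qos=1)       # QoS 1: at-least-once delivery
    time.sleep(2)                                           # give the network loop time to send
    client.loop_stop()
    client.disconnect()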

34.6.2 AWS Cloud Configuration

Step 1. Once the data has been received in the AWS IoT core console, configure the
AWS IoT core to directly store the data in the DynamoDB database.
Step 2. Configure the AWS Lambda function to extract the newly received data from
the AWS DynamoDB and store it in one variable.
Step 3. Now, navigate to the AWS API Gateway console, create a new REST API, configure a GET method integrated with the AWS Lambda function, and make sure that CORS has been enabled. Now, whenever this API is invoked, the newly fetched data from the database will be returned. A minimal sketch of such a Lambda handler is given below.
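The handler below is a rough sketch of Steps 2 and 3 combined: it reads the most recent item from a DynamoDB table and returns it in the API Gateway proxy-response format with CORS enabled. The table name, attribute names, and timestamp field are assumptions, not the actual schema used by the authors.

    # Hedged sketch of a Lambda handler returning the latest dustbin reading.
    # Table name and attribute names are assumed for illustration.
    import json
    import boto3

    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("DustbinData")          # assumed table name

    def lambda_handler(event, context):
        # A scan is acceptable for a handful of bins; a real deployment would
        # query on a partition key instead.
        items = table.scan().get("Items", [])
        latest = max(items, key=lambda i: i.get("timestamp", 0), default=None)
        return {
            "statusCode": 200,
            "headers": {"Access-Control-Allow-Origin": "*"},   # CORS for the portal
            "body": json.dumps(latest, default=str),
        }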

34.6.3 Front-End (Portal)

Step 1. Designing the prototype of a basic structure of the index portal.



Fig. 34.3 Outcome on front-end portal

Step 2. This portal contains three pages, including the home and index page.
1. Home Page—ADMIN
2. About Us
3. Contact Us.
Step 3. Development of website with basic HTML, HTML 5, CSS, JavaScript and
Bootstrap framework for better user interface and user experience (Fig. 34.3).

34.7 Conclusion

In this paper, an embedded ultrasonic sensor has been introduced for garbage-level monitoring, and a database is created for recording the garbage level in the dustbin. Based on a study of contemporary IoT products and a relevant model, the implementation of this project automates the standard bins to save time. It can automatically monitor the garbage level and send synchronous notifications to the portal.
In the first version, we successfully transmitted the data to a local database on a local server. The website is hosted on the same local server, so whenever the NodeMCU sends data, the server stores it in the database and the front end displays the updated level of the dustbin.
We successfully transmitted the data to DynamoDB based on AWS cloud in the
second version. We configured the AWS Lambda function in such a manner that it
will extract the newly received data from the AWS DynamoDB and store it in one
variable. Then we navigated the AWS API Gateway console, created a new REST
API, and configured it for getting requests. Then we fetched the data to the portal.

References

1. Folianto, F., Low, Y.S., Yeow, W.L.: Smartbin: smart waste management system. In: 2015
IEEE Tenth International Conference on Intelligent Sensors, Sensor Networks and Information
Processing (ISSNIP), pp. 1–2. IEEE (2015)
2. Ali, T., Irfan, M., Alwadie, A.S., Glowacz, A.: IoT-based smart waste bin monitoring and
municipal solid waste management system for smart cities. Arab. J. Sci. Eng. 45(12), 10185–
10198 (2020)
3. Guerrero, L.A., Maas, G., Hogland, W.: Solid waste management challenges for cities in
developing countries. Waste Manage. 33(1), 220–232 (2013)
4. Balamurugan, S., Ajithx, A., Ratnakaran, S., Balaji, S., Marimuthu, R.: Design of smart waste
management system. In: 2017 International conference on Microelectronic Devices, Circuits
and Systems (ICMDCS), pp. 1–4. IEEE (2017)
5. Deore, S.M., Kukade, P.S., Yadav, K.L., John, J.: Waste management system using AWS. Waste
Manage. 6(01) (2019)
6. Raja, S.: IOT based on smart waste management in smart cities. Int. J. Recent Innovation Trends
Comput. Commun. 6(3), 68–73 (2018)
7. Haribabu, P., Kassa, S.R., Nagaraju, J., Karthik, R., Shirisha, N., Anila, M.: Implementation of
a smart waste management system using IoT. In: 2017 International Conference on Intelligent
Sustainable Systems (ICISS), pp. 1155–1156. IEEE (2017)
8. Aazam, M., St-Hilaire, M., Lung, C.-H., Lambadaris, I.: Cloud-based smart waste management
for smart cities. In: 2016 IEEE 21st International Workshop on Computer Aided Modelling
and Design of Communication Links and Networks (CAMAD), pp. 188–193. IEEE (2016)
9. Wijaya, A.S., Zainuddin, Z., Niswar, M.: Design a smart waste bin for smart waste management.
In: 2017 5th International Conference on Instrumentation, Control, and Automation (ICA),
pp. 62–66. IEEE (2017)
10. Misra, D., Das, G., Chakrabortty, T., Das, D.: An IoT-based waste management system
monitored by cloud. J. Mater. Cycles Waste Manage. 20(3), 1574–1582 (2018)
11. https://docs.aws.amazon.com/iot/latest/developerguide/what-is-aws-iot.html
12. https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html
13. https://docs.aws.amazon.com/apigateway/latest/developerguide/welcome.html
14. https://docs.aws.amazon.com/lambda/latest/dg/welcome.html
15. https://electronicsinnovation.com/storing-esp8266-data-into-amazon-dynamodb-using-aws-
iot-coremqtt-arduino/
Chapter 35
Ease of Doing Business: Approaching
the Context

Pawan Kumar and Dilip Kumar

Abstract According to the World Bank, ten factors form the basis for ranking different countries on Ease of Doing Business (EODB), where a higher ranking may imply a higher probability of FDI inflows and vice versa. The present paper examines the comparative applicability and weightage of the different factors used in these rankings vis-a-vis the stages of economic and social development of different countries, which may affect the inflow of investments. The methodology adopted is to review papers already published on the topic and draw on the conclusions derived from them. This exercise confirms that the rankings are indeed associated with the degree of investment inflows, but it also concludes that the parameters adopted carry different weightages on a case-to-case basis. Extra-commercial considerations driven by the strategic interests of investors also prevail. Underdeveloped countries may not be able to focus on their priority areas in the heated contest of improving their EODB rankings, as ground reality shows that the actual micro-level impact of these indicators needs further analysis, which is bound to affect the rankings at the overall macro-level. In the Indian context, although the ranking has improved continuously in recent years, the focus on self-reliance at the present stage of socio-economic development, combined with strategic and global concerns, means that the market cannot fully decide the country's priorities. The Government of India is continuously making efforts to promote start-ups and reduce the regulatory compliance burden with the required financial and technical support.

Keywords Ease of Doing Business · Foreign direct investment (FDI) · World Bank

Due to the circumstances which existed before, there was a time when it was asked - Why
India? Now after looking at the impact of the reforms that have taken place in the country,
it is being asked - ‘Why not India’?
Prime Minister Narendra Modi
(In ASSOCHAM Foundation Week 2020)

P. Kumar (B) · D. Kumar


ICFAI University Jharkhand, Ranchi, India
e-mail: pawan.kr.43@gmail.com
D. Kumar
e-mail: dilip.kumar@iujharkhand.edu.in


35.1 Introduction

Since 1991, when the government led by the late Shri P. V. Narasimha Rao assumed office, a new economic policy of a market-based economy was adopted for obvious reasons, which saved India from almost becoming a bankrupt state. Words like "economic liberalization", "globalization", "foreign direct investment (FDI)", "business friendly environment", multi-national companies (MNCs), World Bank, International Monetary Fund (IMF), etc., became household jargon, of course with substantial resistance as well. However, the reforms prevailed, and the Indian economy moved from a closed economy to an open market economy, with rich dividends expected to be reaped in the future. The industrial licensing policy was abolished for most sectors, and the private sector was given considerable liberty and facilities to invest. Labor reforms were undertaken along with other steps to ease entry and exit norms. Now, after thirty years of reforms, a whole new generation has taken over the legacy of liberalization, and terms like "disinvestment", "second generation reforms", "Ease of Doing Business rankings", "reducing compliance burden", and "agricultural sector reforms" are gaining importance, with the quite realizable dream of making India a five trillion dollar economy. However, the COVID-19 pandemic has led the world down a different path, at least for the time being, posing the most awkward challenges before mankind and bringing the health and pharma sectors to the forefront. In parallel, it has made working on virtual platforms much more popular in education, in the consultancy business, and, to a large extent, in the government sector as well. Further amendments to laws for labor reforms, taxation reforms, etc., are being pursued with an eye on the development and growth of infrastructure such as transportation, energy, skill building, and education. From the platform prepared by the first phase of economic reforms, the second phase is expected to take flight.
India aspires to be a five trillion dollar economy in the next five years. This ambitious target has been reiterated by the PM and other leaders many times. Programs like "Make in India", "Atmanirbhar Bharat", and Start-up India can be seen as flag bearers in this direction. We can see news about the offering of land "twice the size of Luxembourg" to companies planning to leave China after COVID-19. This requires strong political will and cooperation that rises above parochial interests merely related to electoral gains, along with attention to the infrastructure sector. However, one cannot deny the fact that a huge capital infusion of around 1.3 trillion dollars every five years, or even more, shall be required to facilitate this targeted growth, which may not be available from domestic investments alone. It is only natural that the country has to look for capital infusion from foreign sources, which necessitates the betterment of EODB rankings.

35.2 Defining the Term “Ease of Doing Business”

"Ease of Doing Business" may be defined as the consolidated impact of simplified laws, rules, regulations, taxation processes, etc., which facilitate smooth business operations, with the proper backup of developed infrastructural facilities like transportation, the law and order situation, and the banking and financial system, along with the presence of a conducive, quality population, which plays the role of both consumers and human resources for the business (Fig. 35.1).
The World Bank (2006) identified and compiled a set of factors which serve as the indicators of "Ease of Doing Business" and help measure the regulatory environment of the economy. The factors are the procedural formalities of starting a business, licensing issues, labor laws, property registration, credit availability, investor protection, taxation, cross-border trade, contract enforcement, and closure of business.
Interpretation: Table 35.1 depicts that New Zealand was the most favorable destination for companies to start their business, followed by Singapore, Denmark, Hong Kong, Korea, etc. In the Ease of Doing Business list, India secured rank 77 with a score of 67.23, which is still far from what is needed to attract more and more foreign direct investment into India.
Interpretation: Table 35.2 exhibits the top ten countries in the world that implemented changes in the Ease of Doing Business indicators and thereby motivated investors to invest in the respective countries. Afghanistan improved its score by +
Fig. 35.1 Major stages of Ease of Doing Business. Source Doing business database

Table 35.1 Ease of Doing Business ranking

Rank  Economy                EODB score  EODB score change
1     New Zealand            86.59       0.00
2     Singapore              85.24       +0.27
3     Denmark                84.64       +0.59
4     Hong Kong SAR, China   84.22       +0.04
5     Korea, Rep.            84.14       −0.01
6     Georgia                83.28       +0.48
7     Norway                 82.95       +0.25
8     USA                    82.75       −0.01
9     UK                     82.65       +0.33
77    India                  67.23       +6.63

Source Doing Business database

Table 35.2 Ten economies improving the most across three or more areas

S. No.  Economy        Ease of Doing Business rank  Change in Ease of Doing Business score
1       Afghanistan    167                          +10.64
2       Djibouti       99                           +08.87
3       China          46                           +08.64
4       Azerbaijan     25                           +07.10
5       India          77                           +06.63
6       Togo           137                          +06.32
7       Kenya          61                           +05.25
8       Cote d'Ivoire  122                          +04.94
9       Turkey         43                           +04.34
10      Rwanda         29                           +04.15

Source Doing Business database

10.64 in Ease of Doing Business and secured rank 167, which helps attract more companies to invest in different sectors. These changes depict the awareness and active involvement of different countries regarding foreign investment. India secured the fifth rank in terms of the change in its Ease of Doing Business score, which also shows the active involvement of the government in attracting more and more foreign investment.
Going through different research papers, World Bank reports, articles, books, etc., some major factors emerge that play a vital role in promoting foreign investment. The first factor, i.e., "starting a business", has as its prerequisite the smooth functioning of government departments, with sub-factors like the time taken for various clearances, land acquisition, single-window systems, maintenance of the law and order situation, and the attitude of the general public as well as that of the administration. The fifth factor, i.e., getting credit, owes its effectiveness to the healthy functioning of the financial system of the country. Though the law of the land has a role to play here, strengthening financial institutions is a long-drawn procedure and takes a considerable period of time. The third and eighth factors relate to infrastructure development in the power sector and in transportation in all forms, i.e., surface, air, and water, so that products and services are ready on time and delivered on time to the proper destination. Trading across borders also depends upon the country's priorities within its economic system, its political and commercial relations with other countries, the quality of the products and services offered, cost competitiveness, etc.
The seventh factor relates to one of the most important aspects, i.e., paying taxes. Kautilya (Chanakya) said in the Arthashastra that wise people pluck the fruits only up to a certain limit but never harm the trees, so that they can bear fruit for a long time to come. Similarly, a wise king never harms the taxpayers but facilitates them in paying their taxes for the foreseeable future. In modern economies, the methods that determine the degree of ease in depositing taxes, the interface with tax officials, etc., are also an integral part of tax reforms. For the taxation system, compliance cost and procedural formalities are also very important. It may be noted that the guiding principle of "One Nation, One Market, One Tax" is behind the implementation of the goods and services tax (GST) in India.

35.3 Review of Literature

There is a body of literature that provides a critical review of Ease of Doing Business. Does EODB have any role to play in attracting FDI or in the upliftment of the domestic economy, trade, and production? If yes, then to what extent? What are the factors other than EODB that are worth considering for their role in ensuring advantages to investors? A critical analysis is attempted in the forthcoming paragraphs, cutting across the spectrum of thoughts and studies by various scholars to decipher the importance of EODB.
Corcoran and Gillanders [1] showed that the eighth factor, i.e., ease of trading across borders, is the deciding factor in most FDI inflows, with very little scope left for the other factors. There is, however, a caveat that the same principle cannot be fitted to all policy decisions. This may be true for export-based economies, for example Mauritius, Singapore, etc., which mostly function as transit markets for large economies; but economies with multiple priorities, like India, may treat this factor as only one among several. Still, the fact cannot be avoided that FDI inflows need a healthy business environment, which definitely impacts EODB.
Djankov [2] talked about various newly emerging norms for measuring a successful enterprise as well as its environment, and gives the degrees of freedom of several parameters like investment, employment, product variation, etc. He also argues that real-time data from any region or country like India can be used to provide glimpses of the future business environment, as well as a theoretical base for analyzing various parameters through the Ease of Doing Business lens.
Kelley et al. [3] examined the motive behind the EODB ranking system adopted by the World Bank, along with its impacts on policy via bureaucratic, transnational, and domestic-political channels. They study the role of government machinery in improving Ease of Doing Business using observational and experimental data strategically. Their observations and proposed methodologies are useful for third-world countries.
Alemu [4], with a sample of 41 African countries from 2005 to 2012, studied the impact of each factor of good governance on EODB. Factors like government effectiveness in ensuring political stability, the rule of law, regulatory quality, and the absence of corruption, combined with other equally important factors like human capital, physical infrastructure, and the level of development of a country, create a conducive business environment. In this way, insight is developed into the government's role in ensuring a conducive business environment.
Ani [5] established and explained a positive relationship between economic growth and Ease of Doing Business in selected economies of Asia, such as Singapore, China, and Korea, for the year 2014.
Lignier [6] looked into the tax-paying aspect of small businesses. He found that, due to tax compliance requirements, record keeping and knowledge of financial affairs had improved for a majority of small businesses. Though limited by its survey of a chosen representative sample of small businesses, the study establishes that managerial awareness of taxation policy is also a part of Ease of Doing Business.
Jayasuriya [7] observed, in general, a positive relationship between EODB and FDI inflow for the average country. However, this conclusion is insignificant in the case of developing countries. The paper mainly discusses factors which are extra-commercial and cannot be related purely to the business environment. It is useful literature for the Indian case, where so many commercial and non-commercial issues intermingle.
However, Arruñada [8] showed that the World Bank's drive for improving EODB has pressurized developing countries that lack the resources for institutional reforms. This gap exists for economies in the transitional phase. India has also crossed, and is still crossing, this phase, and hence the observations are very relevant for this research. The paper points out the global dominance of the advanced countries and their trade and protectionist tactics, which use agencies like the World Bank as a tool.
Tan et al. [9] suggest departing from conventional approaches to the study of doing business, which tend to stress predominantly regulatory features. In order to get more clarity on the topic, macroeconomic factors like market potential and infrastructure build-up need to be examined in tandem with micro-level variables such as profitability and cost effectiveness, along with the management of competition by the government. Thus, attractiveness to investors, business friendliness, and competitive policies (ABC) have a very concrete role to play in deciding the ease or difficulty of doing business. They focused their analysis on the subeconomies of 33 Indonesian provinces.
Tan et al. [10] further extend the ABC model of measuring EODB, with analysis at the subeconomy level, to the Indian context. By providing a realistic view of business conditions, both actual and on paper, in 21 sub-national economies of India through a holistic framework analyzing the indicators of the EDB–ABC index described above, they find a positive correlation with competitiveness and with investment into Indian sub-national economies. They also conclude that actual implementation on the ground at the sub-national level matters more than the competitive policies written on paper. Thus, the findings do not support the existing studies which highlight the importance of the comprehensiveness of the index.
Haggard and Stephen Haggard [11] examine the effects of other, less dynamic factors with long-term impact, such as culture, religion, and legal origin, on four aspects of the ease of starting a new business, i.e., procedural formalities, credit availability, and the time and cost involved. The conclusion is that, though the cost of starting a business is unaffected by culture, legal origin, or religion, legal aspects do shape the procedural formalities and hence affect the time needed for starting a business as well as the ease of getting credit. However, gender differences in the ease of starting a business are better explained by analyzing culture, power distance, and religious aspects.

35.4 Conclusions

Different studies suggest, to differing degrees, that the international ranking in Ease of Doing Business should be improved. This involves multi-pronged actions by lawmakers and the executive machinery, education of ordinary citizens with a positive mindset, and confidence in the efforts. However, a few studies have also highlighted some gray areas of these ranking exercises, although they do not oppose the need for improvement. The observations arising from the discussion in this paper may be summarized as follows:
Usually, Ease of Doing Business has a positive correlation with the investment environment, and the two are mutually reinforcing, i.e., increased investment propels greater economic expansion, which further improves EODB. While underdeveloped countries require massive reforms of their economic and social systems, growth is the main concern for developed countries. Therefore, the two groups are bound to have different priorities, and underdeveloped countries cannot blindly follow the systems of developed countries in labor reforms and legal overhauls, as social security concerns are at stake for them and may cause social chaos and political destabilization. This ultimately harms the economic system of the country, with the EODB graph going down as well.
Many a time, extra-commercial interests such as geopolitical strategic calculations,
border disputes, security challenges, and social factors defy the EODB rankings when it
comes to investment. For small and export-based economies, trade barriers and the
liberalization of the system play a much bigger role in deciding capital inflows than
other factors. Strengthening the basics of the economy, through sectoral reforms, a healthy
business environment, and political and social stabilization, can ensure long-term positive
effects on Ease of Doing Business and investment patterns; otherwise, overexploitation of
scarce resources will only drain the capacity of the economy and society to sustain
themselves.
Besides these formal measures of EODB, it cannot be ignored that the different aspects of
starting and conducting business at the micro, or subeconomy, level play a significant
role. Whatever the macro-level indicators may say on paper, the de facto behavior of their
variables on the ground, at the micro or subeconomic level, reveals much about the Ease of
Doing Business. This is true everywhere, but particularly in developing economies, and it
may be the very reason the Government of India has placed such emphasis on start-ups. Even
if cultural, legal, and religious aspects have no bearing on the cost of starting a new
business, at the social level it is useful to study their impact on the spatial distribution
of business with respect to gender and other social biases.

35.5 Researcher’s Observations in Indian Context

The context needs to be seen in light of the scenario prevailing at present. The Indian
economy is at a transition stage, and the second round of economic reforms is being
undertaken with caution. The recent COVID-19 crisis has highlighted the importance of
self-sufficiency: with the economy already at a low, substantive reforms and major efforts
in health-sector research and development are needed. Tax reforms are yet to deliver the
desired revenue collections, which have started showing a growing trend only recently.
Given the wide disparities in economic development, social structures, cultural ethos, and
demography across India, analysis at the subeconomy level is a very useful option for this
country, as one of the ABC studies suggests.
Resources available in India, such as a large proportion of the population below thirty,
established technical and management institutions, and a large market base, are the basic
raw material for attracting investors. However, these factors are not sufficient in
themselves; much more is expected of society to create a healthy business environment that
can make the country a favorable destination for investors. According to the World Bank
report of 2020, India's ranking was 63 out of 179 countries, which is seen as a good leap
from 100 in 2017 and 77 in 2018. Besides giving a positive signal about the Indian business
environment, this improvement
in rankings also gives the government a stronger case when persuading potential investors
around the world.
It can be inferred without doubt that continuous efforts toward improving the EODB rankings
are vitally needed. This is evident in recent measures such as the disinvestment of PSUs,
reforms of labor laws, efforts to remodel the bureaucracy, and the reduction of the
compliance burden. Another step is the adoption of GST, the goods and services tax, by the
Indian Union and State governments under the driving principle of "One Nation, One Tax, One
Market." After initial hiccups and the removal of procedural and systemic lacunae, this
taxation system is bound to reduce the cost of tax collection for the government, increase
tax collections, and ease compliance formalities for taxpayers. As it is an indirect tax,
the general public will be the biggest beneficiary. The system needs to learn from the
experiences of other countries as well as to innovate methods according to its own needs.
The Government of India has also worked tirelessly to promote start-ups and reduce the
compliance burden, enhancing both the Ease of Doing Business and the ease of living for
common citizens. The Department for Promotion of Industry and Internal Trade (DPIIT) has
been the nodal department for coordinating and pushing the efforts of all ministries and
departments [12] (https://dpiit.gov.in). Efforts are under way to identify redundant rules,
regulations, and acts, and to decriminalize provisions that cause unnecessary hardship to
entrepreneurs and common citizens. Start-ups are being encouraged continuously with
financial and technical support. One mammoth and successful effort was the drone show at
this year's Beating Retreat ceremony at Rashtrapati Bhawan, and this is only one example
among many more in the pipeline.

References

1. Corcoran, A., Gillanders, R.: Foreign direct investment and the ease of doing business. Rev.
World Econ. 151(1), 103–126 (2015)
2. Djankov, S.: Measuring the Ease of Enterprise. World Bank, Washington DC (2007)
3. Kelley, J.G., Simmons, B.A., Doshi, R.: The Power of Ranking: The Ease of Doing Business
Indicator as a Form of Social Pressure. Wharton School University (2016)
4. Alemu, A.M.: The nexus between governance infrastructure and the ease of doing busi-
ness in Africa. Comparative Case Studies on Entrepreneurship in Developed and Developing
Countries, pp. 110–131. IGI Global (2015)
5. Ani, T.G.: Effect of ease of doing business to economic growth among selected countries in
Asia. Asia Paci. J. Multi. Res. 3(5), 139–145, December 2015 Part II (2015)
6. Lignier, P.: The managerial benefits of tax compliance: perception by small business taxpayers.
eJTR 7, 106 (2009)
7. Jayasuriya, D.: Improvements in the World Bank’s Ease of Doing Business Rankings: Do They
Translate into Greater Foreign Direct Investment Inflows? The World Bank (2011)
8. Arruñada, B.: How doing business jeopardises institutional reform. Eur. Bus. Org. Law Rev.
10(4), 555–574 (2009)
9. Tan, K.G., Amri, M., Merdikawati, N.: A new index to measure ease of doing business at the
sub-national level: empirical findings from Indonesia. Cross Cult. Strateg. Manage. 25(3),
515–537 (2018)
10. Tan, K.G., Gopalan, S., Nguyen, W.: Measuring ease of doing business in India’s sub-national
economies: a novel index. South Asian J. Bus. Stud. 7(3), 242–264 (2018)
11. Haggard, D.L., Stephen Haggard, K.: The impact of law, religion, and culture on the ease of
starting a business. Int. J. Org. Theor. Behav. 21(4), 242–257 (2018)
12. https://dpiit.gov.in
Chapter 36
A Survey on Diagnosis of Hypoglycemia
and Hyperglycemia Using
Backpropagation Algorithm in Deep
Learning

V. Rajeshram, M. Karthika, C. Meena, V. Srimugi, and K. Kaushik Karthikeyan

Abstract Diabetes is a metabolic disorder. It affects countless individuals across the
world, and its annual incidence rates are startling. Diabetes-related disorders in several
important organs of the body can be lethal if left untreated. Diabetes must be detected
early in order to receive adequate treatment and prevent the condition from escalating into
severe complications. A smart analytics model using deep learning should predict a patient's
risk factors and the severity of diabetes from an unseen dataset. A deep neural network
approach helps find optimal results using predictive analytics. Existing predictive models
are used to predict whether the disease is present or not based on the processed data. This
survey first reviews the machine learning algorithms utilized for identifying the disease.
Second, it considers deep neural networks that use the backpropagation neural network as the
base unit for analyzing the information by assigning weights to each part of the network.
The main focus of the survey is to analyze and discuss machine learning and deep learning
algorithms for the prediction of diabetes and to investigate solutions for obtaining the
best results in diabetic prediction with forecasting.

Keywords Machine learning algorithm · Neural network algorithm · Diabetic prediction · Deep learning algorithm · Feature selection

36.1 Introduction

In today's world, numerous people are affected by diabetes mellitus. High glucose
levels can lead to chronic disease, so it is important to take steps to reduce this
risk. Diabetes affects nearly three hundred and eighty-two million people
globally [1, 2]. Diabetes, often known as diabetes mellitus, is
a collection of chronic diseases characterized by an increase in blood glucose levels and a
reduction in insulin levels in the body [3, 4]. Polyuria (frequent urination), polyphagia
(excessive hunger), and polydipsia (increased water intake) are all indications. Different
forms of diabetes, such as gestational, type 1, and type 2 diabetes, are distinguished
during testing. Type 1 diabetes, also called insulin-dependent diabetes mellitus, is a
condition in which the patient's pancreas fails to produce adequate insulin, requiring the
use of insulin and diabetic medications as advised by a trained professional [5, 6]. People
can also develop secondary diabetes, which resembles type 1 diabetes but results from damage
to the pancreas rather than an autoimmune attack on the beta cells. The loss of beta cells
means glucose cannot enter the cells from the bloodstream without insulin, causing blood
sugar levels to rise [7]. Patients with type 1 diabetes can experience diabetic ketoacidosis,
in which the body is unable to use glucose and instead breaks down fat into ketones [8].
An existing framework uses genetic-algorithm-based semantic attributes for summarizing text,
for text classification, and for selecting features from text; a state-of-the-art algorithm
is used for the first stage of computation [9]. The second stage is achieved through latent
semantic indexing, which is based on a genetic algorithm but whose output corresponds to
higher values. In addition, a model called predictive risk has been developed, which
combines several data analysis techniques such as data mining, machine learning, and
statistics to forecast the future based on current data, in this case diabetes risk [10].
Type 1 and type 2 diabetes both emerge when the body cannot properly use glucose, which is
vital for energy. Glucose then accumulates in the blood instead of reaching the cells that
need it, leading to serious problems. Type 1 diabetes most often appears in teenagers and
children [11], but it can also occur in adults. In type 1 diabetes, the immune system
attacks pancreatic beta cells so that they can no longer produce insulin [12, 13]. There is
no way to prevent type 1 diabetes, and it is often hereditary; it accounts for about 5–10%
of diabetes cases. Type 2 diabetes becomes more likely as people age, but children may still
develop it. In this type, the pancreas produces insulin, but the body cannot use it
effectively, and lifestyle factors appear to play a part in its development. The majority of
people with diabetes have type 2 diabetes [14]. Figure 36.1 shows the various types of
diabetes.

36.2 Related Works

In Misra et al. [15], the rapid escalation of type 2 diabetes (T2D) in developing regions
worldwide is studied, with incidence differing by rural versus metropolitan habitat and
degree of urbanization. Non-white populations tend to develop diabetes at lower BMI values
than whites and roughly a decade earlier. The burden of complications, both macro- and
microvascular, is significant, yet it varies by population. Syndemics of diabetes with HIV
or tuberculosis are common in many underdeveloped regions, and they predispose to
Fig. 36.1 Types of diabetes

each other. Diabetes screening in large populations living in varied settings will not be
cost-effective, but targeted high-risk screening may have a place. The high cost of
diagnostic tests and the shortage of health personnel are major roadblocks in the diagnosis
and monitoring of patients. In many developing countries, prevention efforts are still
rudimentary. Because the quality of service is generally poor, a large majority of patients
do not achieve their treatment goals [16]. This is exacerbated by delays in seeking
treatment, "fatalistic attitudes," the high cost of medication, and the lack of availability
of insulin.
Vaishali et al. [17] aim to increase the accuracy of existing diagnostic approaches for the
prediction of type 2 diabetes using machine learning algorithms. The suggested approach
uses Goldberg's genetic algorithm to identify important features from the Pima Indians
Diabetes Dataset at the preprocessing stage, and a multi-objective evolutionary fuzzy
classifier (MOEFC) is then applied to the reduced dataset. Machine learning suffers from the
curse of dimensionality: medical datasets are frequently large and contain many redundant
features, and the risk of noise and of dependence across features increases as features
become more redundant. The hallmark of a good dataset is that its independent variables are
weakly correlated with each other but highly correlated with the target, or predicted,
variable. As a result, data preprocessing plays a crucial role in machine learning projects
using clinical datasets. Dimensionality reduction can be done in two ways: feature selection
or feature extraction. Feature selection chooses a good subset from the existing feature
space, whereas feature extraction creates new features from the feature space by extracting
key components.
In Park et al. [18], diabetes has been shown to increase the risk of coronary heart disease,
the major stroke subtypes, and mortality due to vascular causes by roughly three-fold. In
prospective observational studies, this pattern of strong associations of diabetes with a
variety of vascular diseases contrasts with that of LDL cholesterol (or non-HDL
cholesterol), which is firmly associated with coronary heart disease but
only modestly associated with ischemic stroke, and differently again with hemorrhagic
stroke. Diabetes is associated with about a third more fatal than non-fatal myocardial
infarctions, perhaps reflecting more severe coronary lesions in diabetics than in
non-diabetics, a differential response of the myocardium to ischemia, or differential coding
of deaths from coronary heart disease.
In Cho et al. [19], the overall healthcare expenditure on diabetes, as well as the mean
healthcare cost per person with diabetes, was calculated; both figures are expressed in US
dollars. This approach assumes that individuals with diabetes spend twice as much on
healthcare as people without diabetes. International dollars are a hypothetical currency in
which one international dollar has the same purchasing power in the country of interest as
one US dollar has in the USA at a given point in time. Comparisons across regions and over
time can therefore be made using international dollars [20]. Purchasing power parity
conversion factors can be used to convert the currencies of different countries into the
common unit of international dollars.
Maniruzzaman et al. [21] implemented various machine learning classifiers, such as the
Naive Bayes algorithm, the support vector machine, and the K-nearest neighbor algorithm, to
predict and prognose diabetes. When the data contain missing values or outliers, these
classifiers cannot correctly categorize diabetes patients, so when such machine-learning-
based classifiers are employed for risk stratification, they do not offer high accuracy.
The Gaussian process has been developed over the recent decade as a non-parametric tool
that can be used not only in regression but also in classification, addressing challenges
such as the inadequacy of conventional linear techniques, intricate data types, the
constraints of dimensionality, and so on. The ability to provide uncertainty estimates and
to capture the noise and smoothness characteristics of the training data is the main
advantage of this approach. Supervised learning based on Gaussian processes tries to join
the best of two distinct schools of thought: the SVM, created by Vapnik in the early 1990s,
and Bayesian methods. A GP is a collection of random variables, any finite subset of which
has a joint Gaussian distribution.
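As a rough illustration of the Gaussian process classification idea described above, the following sketch uses scikit-learn's GaussianProcessClassifier on synthetic data; the dataset, kernel choice, and settings are assumptions made for illustration, not details taken from the cited study.

```python
# Illustrative sketch (not from the surveyed paper): a Gaussian process
# classifier on synthetic two-class data using scikit-learn.
from sklearn.datasets import make_classification
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a diabetes-style feature matrix (8 features, 2 classes).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The RBF kernel encodes smoothness; predict_proba gives the uncertainty
# estimates mentioned above.
gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0), random_state=0)
gpc.fit(X_train, y_train)
print("test accuracy:", gpc.score(X_test, y_test))
print("class probabilities (first 3):", gpc.predict_proba(X_test[:3]))
```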
Sisodia et al. [22] note that, in the clinical field, classification approaches are
frequently used to assign data to specific classes according to a set of constraints, often
in combination rather than as an individual classifier. Diabetes is a disease that affects
the body's ability to produce the hormone insulin, causing carbohydrate metabolism to become
abnormal and blood glucose levels to rise. The response of diabetics to increased glucose
levels is very prominent: among other signs and symptoms, high glucose causes thirst,
hunger, and frequent urination. Diabetes can cause a plethora of issues if left untreated,
and both types of diabetes have led to complications and deaths over recent decades.
Numerous researchers are conducting trials to identify diseases using ML classification
models [23]. According to these studies, machine-learning algorithms are effective in
diagnosing certain illnesses. Data mining and machine learning algorithms draw their
strength from their capacity to handle enormous amounts of data, aggregate data from several
sources, and integrate historical data within the system.
Perveen et al. [24] boost the performance of ML algorithms with ensemble learning
approaches, using the J48 (C4.5) decision tree as a base learner within a KDD process to
classify patients with diabetes mellitus according to risk factors of the disease. Three
age-ordered adult groups from the Canadian Primary Care Sentinel Surveillance Network are
used for this classification [25]. Compared with the single J48 decision tree, the average
performance of the AdaBoost ensemble technique is superior to bagging. The decision tree is
one of the most effective and extensively used classification and prediction techniques,
and decision-tree-based methods such as AdaBoost, the J48 decision tree, bagging, and
boosting are developed with improved configurations to classify diabetic disease. The data
for this study came from a Canadian medical database. As far as the results are concerned,
the AdaBoost classifiers outperform the other decision tree algorithms. Similar ensemble
approaches could be applied in the future to a variety of disease datasets, including high
blood pressure, coronary heart disease, and dementia. Various individual techniques, such
as Naive Bayes, SVM, and neural networks, are also available and could be used as base
learners in an ensemble system.
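The comparison of boosting and bagging around a decision tree base learner can be sketched as follows. This is an illustrative scikit-learn example on synthetic data: J48 is a Weka implementation of C4.5, so a CART-style tree stands in for it here, and the dataset and hyperparameters are assumptions rather than the authors' setup.

```python
# Illustrative sketch (not the study's code): AdaBoost versus bagging with a
# shallow decision tree as the base learner.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=1)
base = DecisionTreeClassifier(max_depth=3, random_state=1)

# Note: recent scikit-learn uses `estimator=`; older versions use `base_estimator=`.
boosted = AdaBoostClassifier(estimator=base, n_estimators=50, random_state=1)
bagged = BaggingClassifier(estimator=base, n_estimators=50, random_state=1)

for name, model in [("AdaBoost", boosted), ("Bagging", bagged)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(name, "mean CV accuracy:", round(scores.mean(), 3))
```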
Nai-arun et al. [26] presented a web application based on disease classifiers and a
real-world data collection. The figures in this work are based on records of patients
collected between 2012 and 2013 from units at a regional hospital. Thirteen classification
models were analyzed in order to find a predictive model before developing the web software;
the candidates included decision trees, neural networks, and logistic regression, together
with bagging and boosting. The accuracy and ROC curve of every model were computed and
compared to assess robustness, and random forest came out on top on both measures [27].
This is most likely due to the significance of variable selection: in the random forest
method, not only are records sampled randomly, but input variables are also selected
randomly while taking important variables into account. As a result, accuracy tends to
improve, and this algorithm was therefore chosen for the diabetes risk prediction and for
developing the application.
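A minimal sketch of the random forest idea referred to above, bootstrap sampling of records combined with random feature subsets at each split, is shown below using scikit-learn; the synthetic dataset and settings are illustrative assumptions.

```python
# Illustrative sketch (not the cited study's code): a random forest combining
# bootstrap sampling of records with random feature subsets at each split.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=600, n_features=12, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

# max_features controls the random subset of variables considered at each split.
rf = RandomForestClassifier(n_estimators=200, max_features="sqrt", random_state=2)
rf.fit(X_train, y_train)

proba = rf.predict_proba(X_test)[:, 1]
print("accuracy:", rf.score(X_test, y_test))
print("ROC AUC:", roc_auc_score(y_test, proba))
print("feature importances:", rf.feature_importances_.round(2))
```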
Singh et al. [28] attempt to bring together numerous outlier detection techniques in a
structured and general description. This exercise provides a clearer picture of the
different directions of research on outlier analysis, both for the authors and for
newcomers to this research field, who can pick up the links to specific application areas
in detail. Outlier detection methods are also applied to detecting novel topics and news
stories [29]: the outliers are caused by a new interesting event or an anomalous topic. The
data in this setting are typically high-dimensional and very sparse, and they also have a
temporal component because the documents are gathered over time. A challenge for outlier
detection techniques in this domain is to handle the large variations among documents
belonging to a single class or topic.
Krstajic et al. [30] describe and compare techniques that promote reliability and boost
confidence in predictive models. Cloud computing is a major operational component of the
proposed methodologies, as it allows the routine use of hitherto infeasible procedures. In
an ideal scenario, we would have enough data to train and evaluate our models (training
samples) and separate data for assessing the quality of our final model (test samples). In
order to be representative, both training and test samples may need to be suitably large and
diversified. Such data-rich situations, however, are uncommon in the life sciences, for
example in QSAR. One of the most significant issues with model selection and assessment is
that we frequently have only training samples, making it impossible to calculate a test
error directly. Even though the test error cannot be calculated, it is possible to estimate
the expected test error using the training samples. The comparative study is shown in
Table 36.1.
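The estimation of expected test error from training data alone, as discussed above, can be sketched with nested cross-validation in scikit-learn; the model, parameter grid, and data below are illustrative assumptions rather than the setup used in the cited work.

```python
# Illustrative sketch of the cross-validation idea: estimating the expected
# test error from training data alone via a grid search wrapped in an outer CV.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score, KFold
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, random_state=3)

# Inner loop: grid search chooses hyperparameters by 5-fold cross-validation.
grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}, cv=5)

# Outer loop: a second CV around the whole selection procedure gives a less
# optimistic estimate of the expected test error (nested cross-validation).
outer = KFold(n_splits=5, shuffle=True, random_state=3)
scores = cross_val_score(grid, X, y, cv=outer)
print("nested CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```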

36.3 Proposed Work

Because machine learning models can perform advanced predictive analytics, they open up a
slew of opportunities for health care. There are already machine learning models that can
predict chronic illnesses such as heart problems, infections, and digestive disorders [31].
A few new models are also being tested to predict non-communicable diseases, bringing
further benefits to the field of health care. Researchers are working on deep learning
models to provide very early prediction of a patient's specific illness in order to develop
effective disease prevention techniques [32]. Patients' hospitalization will be reduced as
a result, and this transition could be enormously advantageous for preventive medicine.
Diabetes is a disease that impairs the body's ability to produce insulin or, in other words,
to respond to insulin synthesis. This results in abnormal carbohydrate metabolism and
elevated blood glucose levels, so diabetes must be detected as soon as possible [33]. When a
person has diabetes, their blood glucose rises to hazardous levels. After a meal, the body
produces glucose, and insulin is the hormone that regulates blood sugar and stabilizes
glucose levels; insulin insufficiency causes diabetes [34]. Figure 36.2 depicts the diabetic
prediction methods.

36.3.1 Naive Bayes Algorithm

Where class labels are drawn from some finite set, Naive Bayes is a method for building
classification models that assign class labels to problem instances represented as vectors
of feature values [35]. It is a family of algorithms based on a common principle rather
than a single algorithm for training.
Table 36.1 Summary of related works (S. No. | Title | Techniques | Remarks)

1. Diabetes in developing countries | Diagnostic tests | Merits: increases awareness of diabetes; Demerits: difficult to handle large datasets
2. Genetic algorithm based feature selection and MOE fuzzy classification algorithm on Pima Indians Diabetes Dataset | Genetic algorithm | Merits: reduces redundant data; Demerits: missing data in feature selection
3. Diabetes mellitus, fasting blood glucose concentration, and risk of vascular disease: a collaborative meta-analysis of 102 prospective studies | Statistical analyses | Merits: analyzes the risk for diabetes; Demerits: computational time is high
4. IDF Diabetes Atlas: global estimates of diabetes prevalence for 2017 and projections for 2045 | Expenditure estimates | Merits: decreases the risk factors; Demerits: time complexity is high
5. Accurate diabetes risk stratification using machine learning: role of missing value and outliers | Logistic regression framework | Merits: yields higher accuracy; Demerits: cost is high
6. Prediction of diabetes using classification algorithms | Naive Bayes algorithm | Merits: achieved accuracy in disease prediction; Demerits: does not support large datasets
7. Performance analysis of data mining classification techniques to predict diabetes | J48 (C4.5) decision tree | Merits: higher performance in classifying diabetic patients; Demerits: some features are missed in the analysis
8. Comparison of classifiers for the risk of diabetes prediction | Logistic regression | Merits: robustness in disease prediction; Demerits: a large number of rules are constructed
9. Outlier detection: applications and techniques | Supervised outlier detection | Merits: understanding of outliers; Demerits: needs a large number of datasets
10. Cross-validation pitfalls when selecting and assessing regression and classification models | Grid-search V-fold cross-validation | Merits: improved classification; Demerits: large amounts of computing power can be needed
Fig. 36.2 Machine learning algorithms (workflow: input diabetic datasets → preprocess the data → apply machine learning algorithms (Naive Bayes, decision tree, support vector machine) → results analysis)

Given the class variable, all Naive Bayes classifiers assume that the value of a particular
feature is independent of the value of any other feature. In many practical applications,
parameter estimation for Naive Bayes models uses the method of maximum likelihood [36];
alternatively, one can work with Naive Bayes models without adopting Bayesian inference or
using any Bayesian methods. The classifier inspects all the symptoms in a given dataset and
uses conditional probability to work out the likelihood of diabetes [37, 38]. There are
several deciding features, such as glucose level, BMI, weight, age, blood pressure, and
insulin level. This classifier needs only a very small amount of training data for parameter
estimation. The Naive Bayes classifier is a probability-based method that treats each class
label individually. It fundamentally works on Bayes' theorem, and for the estimation of each
feature it uses maximum likelihood independently, hence giving a probability of occurrence
of the data [39].
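A minimal sketch of such a Naive Bayes classifier is shown below, assuming numeric features of the kind listed (glucose, BMI, age, blood pressure, insulin) with synthetic values; it illustrates the method rather than the implementation evaluated in this survey.

```python
# Minimal sketch: Gaussian Naive Bayes on synthetic diabetes-style features.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Columns: glucose, BMI, age, blood pressure, insulin (all synthetic).
X = rng.normal(loc=[120, 30, 45, 80, 100], scale=[30, 6, 12, 10, 40], size=(300, 5))
y = (X[:, 0] + 2 * X[:, 1] + rng.normal(0, 20, 300) > 190).astype(int)  # toy label rule

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
nb = GaussianNB()            # assumes features are independent given the class
nb.fit(X_train, y_train)     # maximum-likelihood estimates of per-class mean/variance
print("accuracy:", nb.score(X_test, y_test))
print("P(class | x) for one sample:", nb.predict_proba(X_test[:1]).round(3))
```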

36.3.2 Decision Tree Algorithm

A supervised machine learning approach called the decision tree is used to tackle
classification problems. The major goal of employing a decision tree in this study is to
predict the target class using decision rules derived from previous data [40]. Instances
are classified starting from the root node, which splits them according to distinct
attribute values. The leaf nodes denote the classification, while the internal nodes can
have two or more branches. At each step, the decision tree selects the node by evaluating
the maximum information gain among all attributes [41].
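The following is a small illustrative sketch of a decision tree grown with an information-gain (entropy) criterion on synthetic data; the feature names and depth are assumptions.

```python
# Minimal sketch (assumed, not from the paper): a decision tree that picks each
# split by the entropy criterion, i.e., maximum information gain.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=200, n_features=4, random_state=4)
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=4)
tree.fit(X, y)

# The printed rules show the root node and the branches down to the leaf nodes.
print(export_text(tree, feature_names=["f0", "f1", "f2", "f3"]))
```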

36.3.3 Support Vector Machine Algorithm

The support vector machine algorithm is a supervised machine learning approach used for
classification [42]. Given a two-class training sample, a support vector machine's goal is
to find the optimal maximum-margin separating hyperplane between the two classes. For
improved generalization, the hyperplane should not lie too close to the data points of
either class; a hyperplane should be chosen that is far away from the data points of each
category [43].
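A maximum-margin separation of two classes can be sketched as follows with a linear SVM on synthetic blobs; the data and parameter values are illustrative assumptions.

```python
# Minimal sketch (assumed): a two-class linear SVM searching for the
# maximum-margin separating hyperplane; C trades margin width against errors.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=2, random_state=5)
svm = SVC(kernel="linear", C=1.0)
svm.fit(X, y)

# The support vectors are the points closest to the separating hyperplane.
print("support vectors per class:", svm.n_support_)
print("hyperplane coefficients:", svm.coef_, "intercept:", svm.intercept_)
```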

36.3.4 Deep Learning

In this chapter, we look at how to predict diabetes using deep learning, specifically the
backpropagation technique, within a classification system. Backpropagation trains an
artificial neural network that maps sets of input data onto a set of outputs using a
feedforward architecture [44, 45]. The network is a directed graph with multiple layers of
nodes, each fully connected to the next; every node except the input nodes is a neuron with
a nonlinear activation function. A multilayer network trained with backpropagation is a
generalization of the linear perceptron that can discriminate data that are not linearly
separable. If a basic on-off mechanism, such as a linear activation function in all neurons,
were used to control whether or not a neuron fires, any number of layers could be reduced
to the conventional two-layer input-output model using linear algebra [46]. Gradient-based
optimization methods are applied to adjust the weights and reduce the network's loss
function; consequently, in order to compute the gradient of the loss function, the algorithm
requires a known, desired output for every input. Backpropagation generalizes the delta
rule to feedforward networks, applying the chain rule iteratively to compute the gradients
for each layer. The backpropagation formula is currently being used in research on parallel
and distributed computing and in computational neurobiology [47], and the algorithm has
also attracted considerable attention in the pattern recognition field [48]. Such networks
are extremely useful in this study because of their capacity to solve difficult problems
and their function approximation capability, even when crucial predictions are made. The
BPNN model uses the same feedforward design with backpropagation for supervised learning.
It can be used in multiple areas of the healthcare domain [49]: features are provided by
the user, and the disease is then predicted [50]. The steps of the algorithm are as follows:
[50]. The steps of the algorithm are as follows:
Step 1: Set the weights and biases at random.
Step 2: Feed the training sample into the network.
Step 3: Propagate the inputs forward; compute the net input and output of each unit in the
hidden and output layers.
Step 4: Backpropagate the error to the hidden layer(s).
Step 5: Correct the weights and biases to reflect the propagated errors; the labeled
outputs are used to adjust the network's weights and biases.
Step 6: Stop when the terminating condition is met.
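A minimal NumPy sketch of these steps for a single hidden layer is given below; the layer sizes, learning rate, stopping rule, and synthetic data are assumptions made purely to illustrate the backpropagation mechanics.

```python
# Minimal sketch of the steps above for one hidden layer (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                      # 200 samples, 5 input features
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(float).reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1: set the weights and biases at random.
W1 = rng.normal(scale=0.5, size=(5, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros(1)

lr = 0.5
for epoch in range(2000):
    # Steps 2-3: feed the samples in and propagate them forward.
    hidden = sigmoid(X @ W1 + b1)
    out = sigmoid(hidden @ W2 + b2)

    # Step 4: backpropagate the error (delta rule with the sigmoid derivative).
    d_out = (out - y) * out * (1 - out)
    d_hidden = (d_out @ W2.T) * hidden * (1 - hidden)

    # Step 5: correct the weights and biases along the negative gradient.
    W2 -= lr * hidden.T @ d_out / len(X)
    b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_hidden / len(X)
    b1 -= lr * d_hidden.mean(axis=0)

# Step 6: stop (a fixed number of epochs here) and report training accuracy.
print("training accuracy:", ((out > 0.5) == y).mean())
```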
36.4 Conclusion

This study addresses the task of comparing and summarizing various data mining
methodologies used in medical prediction, as well as various machine learning and deep
learning algorithms for diabetic prediction. For intelligent and successful diabetic illness
prediction using data mining, the focus is on combining many methodologies and combinations
of multiple target attributes. A recent study employed clustering on a huge cohort of
hospitalized persons. When dealing with noisy data, the existing machine learning
classifiers provide reduced accuracy: if any noisy data are present, classification
capacity is severely hampered, and the noise not only slows down the classification
algorithm but also degrades its performance. As a result, it is critical to eliminate,
before applying a classification method, any attributes that would subsequently behave as
noisy attributes. Preprocessing steps and classification algorithms, including machine
learning and deep neural network techniques, are employed in this survey to categorize
user-supplied datasets. Based on the results of the survey, it is clear that the deep
learning technique outperforms the other strategies.

References

1. Syed, L., Jabeen, S., Manimala, S., Alsaeedi, A.: Smart healthcare framework for ambient
assisted living using IoMT and big data analytics techniques. Future Gener. Comput. Syst.
101, 136–151 (2019)
2. Murugesan, M., Thilagamani, S.: Efficient anomaly detection in surveillance videos based on
multi layer perception recurrent neural network. J. Microprocess. Microsyst. 79 (2020)
3. Perumal, P., Suba, S.: An analysis of a secure communication for healthcare system using
wearable devices based on elliptic curve cryptography. J. World Rev. Sci. Technol. Sustain.
Dev. 18(1), 51–58 (2022)
4. Deepika, S., Pandiaraja, P.: Ensuring CIA triad for user data using collaborative filtering
mechanism. In: 2013 International Conference on Information Communication and Embedded
Systems (ICICES), pp. 925–928 (2013)
5. Vishnu, S., Ramson, S.J., Jegan, R.: Internet of medical things (IoMT)—an overview. In:
Proceeding of 5th International Conference on Devices, Circuits and System (ICDCS), pp. 101–
104 (2020)
6. Shmueli, G., Koppius, O.: Predictive analytics in information systems research. MIS Quart.
35(3), 553–572 (2011)
7. Pradeep, D., Sundar, C.: QAOC: noval query analysis and ontology-based clustering for data
management in Hadoop. Future Gener. Comput. Syst. 108, 849–860 (2020)
8. Corbin, L.J., Richmond, R.C., Wade, K.H., Burgess, S., Bowden, J., Smith, G.D., Timpson,
N.J.: BMI as a modifiable risk factor for type 2 diabetes: refining and understanding causal
estimates using Mendelian randomization. Diabetes 65(10), 3002–3007 (2016)
9. Nithya, B., Ilango, V.: Predictive analytics in health care using machine learning tools and
techniques. In: International Conference on Intelligent Computer Control System (ICICCS),
pp. 492–499 (2017)
10. Birjais, R., Mourya, A.K., Chauhan, R., Kaur, H.: Prediction and diagnosis of future diabetes
risk: a machine learning approach. Soc. Netw. Appl. Sci. 1(9), 1112 (2019)
11. Marmot, M., Clemens, S., Blake, M., Phelps, A., Nazroo, J., Oldfield, Z., Oskala, A., Phelps,
A., Rogers, N., Steptoe, A.: English longitudinal study of ageing: waves 0–8, 1998–2017. Data
Service, U.K., Tech. Rep. SN: 5050 (2018)
12. Santhi, P., Mahalakshmi, G.: Classification of magnetic resonance images using eight directions
gray level co-occurrence matrix (8dglcm) based feature extraction. Int. J. Eng. Adv. Technol.
8(4), 839–846 (2019)
13. Chen, S., Bergman, D., Miller, K., Kavanagh, A., Frownfelter, J., Showalter, J.: Using applied
machine learning to predict healthcare utilization based on socioeconomic determinants of
care. Amer. J. Managed Care 26(1), 26–31 (2020)
14. Deepa, K., Thilagamani, S.: Segmentation techniques for overlapped latent fingerprint
matching. Int. J. Innovative Technol. Exploring Eng. 8(12), 1849–1852 (2019)
15. Misra, A., Gopalan, H., Jayawardena, R., Hills, A.P., Soares, M., Reza-Albarrán, A.A.,
Ramaiya, K.L.: Diabetes in developing countries. J. Diabetes 11(7), 522–539 (2019)
16. Hasan, M.K., Alam, M.A., Das, D., Hossain, E., Hasan, M.: Diabetes prediction using
ensembling of different machine learning classifiers. IEEE Access 8, 76516–76531 (2020)
17. Vaishali, R., Sasikala, R., Ramasubbareddy, S., Remya, S., Nalluri, S.: Genetic algorithm based
feature selection and MOE fuzzy classification algorithm on Pima Indians diabetes dataset. In:
International Conference on Computer Network Informatics (ICCNI), pp. 1–5 (2017)
18. Park, C.: The emerging risk factors collaboration. Diabetes mellitus, fasting blood glucose
concentration, and risk of vascular disease: a collaborative meta-analysis of 102 prospective
studies. Lancet 375, 2215–2222 (2010)
19. Cho, N.H., Shaw, J.E., Karuranga, S., Huang, Y., da Rocha Fernandes, J.D., Ohlrogge,
A.W., Malanda, B.: IDF diabetes atlas: global estimates of diabetes prevalence for 2017 and
projections for 2045. Diabetes Res. Clin. Pract. 138, 271–281 (2018)
20. Deepa, K., Kokila, M., Nandhini, A., Pavethra, A., Umadevi, M.: Rainfall prediction using
CNN. Int. J. Adv. Sci. Technol. 29(7 Special Issue), 1623–1627 (2020)
21. Maniruzzaman, M., Rahman, M.J., Al-MehediHasan, M., Suri, H.S., Abedin, M. M., El-Baz,
A., Suri, J.S.: Accurate diabetes risk stratification using machine learning: role of missing value
and outliers. J. Med. Syst. 42(5), 92 (2018)
22. Sisodia, D., Sisodia, D.S.: Prediction of diabetes using classification algorithms. Procedia
Comput. Sci. 132, 1578–1585 (2018)
23. Thilagamani, S., Nandhakumar, C.: Implementing green revolution for organic plant forming
using KNN-classification technique. Int. J. Adv. Sci. Technol. 29(7S), 1707–1712 (2020)
24. Perveen, S., et al.: Performance analysis of data mining classification techniques to predict
diabetes. Procedia Comput. Sci. 82, 115–121 (2016)
25. Haffner, S.M.: Epidemiology of type 2 diabetes: risk factors. Diabetes Care 21(3), C3–C6
(1998)
26. Nai-arun, N., Moungmai, R.: Comparison of classifiers for the risk of diabetes prediction.
Procedia Comput. Sci. 69, 132–142 (2015)
27. Thilagamani, S., Shanti, N.: Gaussian and Gabor filter approach for object segmentation. J.
Comput. Inf. Sci. Eng. 14(2), 021006 (2014)
28. Singh, K., Upadhyaya, S.: Outlier detection: applications and techniques. Int. J. Comput. Sci.
Issues (IJCSI) 9(1), 307 (2012)
29. Kocsis, O., Moustakas, K., Fakotakis, N., Hermens, H.J., Cabrita, M., Ziemke, T., Kovor-
danyi, R.: Conceptual architecture of a multidimensional modeling framework for older
office workers. In: 12th ACM International Conference on Pervasive Technologies Related
to Assistive Environment pp. 448–452 (2019)
30. Krstajic, D., et al.: Cross-validation pitfalls when selecting and assessing regression and
classification models. J. Cheminformatics 6(1), 1–15 (2014)
31. Konstantoulas, I., Kocsis, O., Fakotakis, N., Moustakas, K.: An approach for continuous sleep
quality monitoring integrated in the SmartWork system. In: IEEE International Conference on
Bioinformatics and Biomedicine (BIBM), pp. 1968–1971 (2020)
32. Kocsis, O., Stergiou, A., Amaxilatis, D., Pardal, A., Quintas, J., Hermens, H.J., Cabrita, M.,
Dantas, C., Hansen, S., Ziemke, T., Tageo, V., Dougan, P.: SmartWork: designing a smart
age-friendly living and working environment for office workers. In: 12th ACM International
Conference on Pervasive Technologies Related to Assistive Environment, pp. 435–441 (2019)
33. Bernabe-Ortiz, A., Perel, P., Miranda, J.J., Smeeth, L.: Diagnostic accuracy of the Finnish
diabetes risk score (FINDRISC) for undiagnosed T2DM in Peruvian population. Prim. Care
Diabetes 12(6), 517–525 (2018)
34. Rajesh Kanna, P., Santhi, P.: Unified deep learning approach for efficient intrusion detection
system using integrated spatial–temporal features. Knowl.-Based Syst. 226 (2021)
35. American Diabetes Association, 2. Classification and diagnosis of diabetes: Standards of
medical care in diabetes—2020. Diabetes Care 43(1), S14–S31 (2020)
36. Bujnowska-Fedak, M.M., Grata-Borkowska, U.: Use of telemedicine based care for the aging
and elderly: promises and pitfalls. Smart Homecare Technol. TeleHealth 3, 91–105 (2015)
37. Pandiaraja, P., Aravinthan, K., Lakshmi, N.R., Kaaviya, K.S., Madumithra, K.: Efficient
cloud storage using data partition and time based access control with secure AES encryption
technique. Int. J. Adv. Sci. Technol. 29(7), 1698–1706 (2020)
38. Chung, J.K.-O., Xue, H., Pang, E.W.-H., Tam, D.C.-C.: Accuracy of fasting plasma glucose
and hemoglobin A1c testing for the early detection of diabetes: a pilot study. Front. Lab. Med.
1(2), 76–81 (2017)
39. Zheng, T., Xie, W., Xu, L., He, X., Zhang, Y., You, M., Yang, G., Chen, Y.: A machine
learning-based framework to identify type 2 diabetes through electronic health records. Int. J.
Med. Inform. 97, 120–127 (2017)
40. Logeswaran, R., Aarthi, P., Dineshkumar, M., Lakshitha, G., Vikram, R.: Portable charger for
handheld devices using radio frequency. Int. J. Innovative Technol. Exploring Eng. (IJITEE)
8(6), 837–839 (2019)
41. Naz, H., Ahuja, S.: Deep learning approach for diabetes prediction using PIMA Indian dataset.
J. Diabetes Metabolic Disord. 19(1), 391–403 (2020)
42. Xu, Z., Wang, Z.: A risk prediction model for type 2 diabetes based on weighted feature
selection of random forest and XGBoost ensemble classifier. In: 11th International Conference
on Advanced Computational Intelligence (ICACI), pp. 278–283 (2019)
43. Fitriyani, N.L., Syafrudin, M., Alfian, G., Rhee, J.: Development of disease prediction model
based on ensemble learning approach for diabetes and hypertension. IEEE Access 7, 144777–
144789 (2019)
44. Rghioui, A., Lloret, J., Sendra, S., Oumnad, A.: A smart architecture for diabetic patient
monitoring using machine learning algorithms. Health Care 8(3), 348 (2020)
45. Pandiaraja, P., Sharmila, S.: Optimal routing path for heterogenous vehicular Adhoc network.
Int. J. Adv. Sci. Technol. 29(7), 1762–1771 (2020)
46. Rghioui, A., Lloret, J., Harane, M., Oumnad, A.: A smart glucose monitoring system for
diabetic patient. Electronics 9(4), 678 (2020)
47. Efat, M.I.A., Rahman, S., Rahman, T.: IoT based smart health monitoring system for diabetes
patients using neural network. In: International Conference on Cyber Security and Computer
Science, pp. 593–606. Springer, Cham, Switzerland (2020)
48. Saravanan, M., Shubha, R.: Non-invasive analytics based smart system for diabetes monitoring.
In: International Conference on IoT Technologies for HealthCare, pp. 88–98. Springer, Cham,
Switzerland (2017)
49. Gunasekar, M., Thilagamani, S.: Performance analysis of ensemble feature selection method
under SVM and BMNB classifiers for sentiment analysis. Int. J. Sci. Technol. Res. 9(2), 1536–
1540 (2020)
50. Rajesh Kanna, P., Santhi, P.: Hybrid intrusion detection using map reduce based black widow
optimized convolutional long short-term memory neural networks. Expert Syst. Appl. 194, 15
(2022)
Chapter 37
A Novel Marathi Speech-Based Question
and Answer Chatbot for the Educational
Domain

Aditya R. Samak, Prafulla B. Bafna, and Jatinderkumar R. Saini

Abstract In any interactive agent, providing a suitable answer plays a vital role in
determining the success of that chatbot. In the present research work, a speech
recognition API and the Google Translate API are used to recognize the user's voice
and translate the text, respectively. We propose the design and implementation of a
novel Marathi speech-based educational chatbot using a Naïve Bayes classifier. After
preprocessing, the Naive Bayes classifier is used to determine and evaluate the most
probable class. More than 1500 words are used to train the model. The proposed approach
is tested with several questions and their topics in the dataset. Brute-force keyword
matching and string similarity algorithms are implemented to fetch the suitable answer.
A confusion matrix is generated to evaluate the performance of the classification model.
The obtained results demonstrate the robustness of the proposed system.

Keywords Chatbot · Marathi · Naïve Bayes · Speech · Text classification

37.1 Introduction

Today, technology is changing rapidly, and applications of artificial intelligence and
machine learning are in high demand. A chatbot, which is a virtual assistant, is one such
application that tends to provide solutions to many routine problems of a user. The present
research work explains the design and implementation of an 'educational chatbot' [1] which
understands a user's question and provides a suitable answer.

After identifying the possible class of the input question, different algorithms are
implemented to retrieve the suitable answer. Keyword matching, string similarity, and a
combination of these algorithms are implemented to produce the desired output. The keyword
matching algorithm identifies the keywords [2] in a sentence. The string similarity
algorithm measures the similarity between lists of strings and provides the best suitable
answer, i.e., the closest match to the question submitted by the user. To achieve the best
possible outcome, we combine these two algorithms (a sketch is given below): even if
keyword matching fails, the system compares the input question with the prestored questions
and generates a suitable answer [3, 4].
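A minimal sketch of this two-step retrieval (keyword matching with a string-similarity fallback) is shown below; the question-answer pairs and keywords are made-up examples, and difflib's SequenceMatcher stands in for whichever string-similarity measure the system actually uses.

```python
# Minimal sketch: keyword matching first, string similarity as the fallback.
import difflib

qa_pairs = {
    "what is the duration of the course": "The course runs for three years.",
    "what is the fee structure": "The annual fee is listed on the admissions page.",
}
keywords = {"duration": "The course runs for three years.",
            "fee": "The annual fee is listed on the admissions page."}

def answer(question: str) -> str:
    q = question.lower()
    # Keyword matching: return directly if any stored keyword appears.
    for kw, ans in keywords.items():
        if kw in q:
            return ans
    # Fallback: pick the prestored question most similar to the input.
    best = max(qa_pairs,
               key=lambda stored: difflib.SequenceMatcher(None, q, stored).ratio())
    return qa_pairs[best]

print(answer("How long is the duration?"))
print(answer("Tell me what the fees are please"))
```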
Text preprocessing means performing tokenization, removing stop words such as 'the', 'a',
'able', and 'in', and removing punctuation marks. Stemming is performed to reduce each
token to its base form. Text classes or clusters are important for decision making.
'scikit-learn' is Python's machine learning (ML) library, and it features various
algorithms for clustering, classification, and regression.
In a speech-based chatbot, recognizing the human voice is the first step in understanding
the input question from the user. Voice control also makes the application more convenient
when the user's hands are busy.
The proposed approach is unique because, for the first time, the Marathi language is used
to develop a chatbot for the educational domain. In the absence of an available corpus, we
created our own corpus. This research work is also distinct in that it deploys the Porter
Stemmer in addition to generating a Marathi Wordcloud.
The rest of the paper is organized as follows. The next section presents the relevant
literature review. Section 37.3 presents the detailed methodology. This is followed by the
results and discussion, and finally, the paper ends with concluding remarks and directions
for future work.

37.2 Literature Review

A chatbot has been proposed as a personalized medical assistant, which predicts a patient's
disease by understanding the symptoms [5] and suggests medicines, a list of treatments,
etc. It uses predictive algorithms and natural language processing techniques, provides
real-time answers, and solves queries regarding medicines. Implementing forward-looking
recommendations, such as suggesting long-term tonics or medication, remains a big challenge.
An Android-based educational chatbot for the visually impaired provides education-related
information for visually impaired as well as sighted people. It offers flexibility as it
uses voice recognition and pattern matching techniques, including symbolic reduction,
divide and conquer, and keyword detection. It fetches information from Wikipedia through
the MediaWiki Application Programming Interface (API) using technologies such as natural
language processing and ontologies [6]. An education-focused chatbot solves student
queries, provides lectures, and handles student assessment. It uses artificial intelligence
markup language (AIML), an inference engine, an interaction quality tracker, etc. The
system improves engagement among students by assigning group work and projects.
The system also helps to minimize the burden on the teacher. In this fast-growing
data-driven world, understanding business insights is becoming vital for businesses. A
chatbot has been constructed for web analytics insights that helps track the success of a
website and also surfaces the essential [7] business insights needed to grow a business.
The bot user can check the performance of website visitors, and the main advantage is that
tracking and analyzing website usage becomes much simpler than with slower and more tedious
web analytics tools. An intelligent tutoring chatbot [8] has been built for solving
high-school mathematical problems with ease. It uses different algorithms for question
answering and problem solving, together with a knowledge-based system, theorems, and
functions. The solution first gives a hint to the student for solving the question, so it
also improves the student's thinking process. A chatbot has been developed for monitoring
older patients with cancer; it is a semi-automated messaging application which follows
up [9] on older patients, i.e., those over 65 years of age with cancer receiving
chemotherapy at home. 'UNIBOT' [5] solves and offers proper answers to university-specific
questions. Artificial intelligence and machine learning are used together with the PHP
language: the chatbot processes the message and provides the correct answer, and any
college or university website can include this chatbot. A chatbot built with SnatchBot [2]
uses natural language processing to provide services to users who can ask questions related
to the college and campus. The proposed system is web-based, so the entire project is
hosted on a cloud platform. It uses HTML, CSS, and JavaScript, with the SnatchBot database
as the database and Gmail as the email platform. There are four modules: an enquiry webpage
module, a feedback module, an admin module, and a database module. PharmaBot [10] is a
medicine consultant chatbot designed to recommend children's generic medication. It works
well, like a pharmacy consultant, for patients who are confused about generic drugs; the
researchers used a left-and-right parsing algorithm to provide the desired performance. An
e-commerce website-based chatbot [11] covers various products with a wide range of variety
and helps the user decide which product is suitable. The system has two major components,
the website and the chatbot. The website is coded in HTML and CSS, PHP is used for
scripting, and a MySQL database stores product details and inventory. In the chatbot,
RiveScript, a simple scripting language, is used to give the chatbot its intelligence. A
medical chatbot has been proposed to solve users' health-related problems: the user can
submit any personal health question by chat, and natural language processing is implemented
so that the system can understand the sentiment of the user. Recommendation tasks play an
important role in providing suitable suggestions, and expert guidance in implementing the
chatbot is considered. Another platform allows developers to find the right contact person
for open [12] source projects; the solution relies on natural language analysis techniques
to identify sentences and key concepts using term frequency and inverse document frequency
algorithms, respectively. An intelligent assistant system using NLP and machine learning
presents a bank chatbot which interacts with customers and solves their queries. It
improves the customer service experience, reduces the human workload, increases
productivity, and increases the number of satisfied customers. A travel
agent chatbot [13] analyzes user expectations and predicts knowledge selection. It
implements a restricted Boltzmann machine for a recommendation framework with collaborative
filtering. The system takes input from an Amazon Alexa-enabled device in the form of user
speech and analyzes it using NLP techniques to learn what the user is saying or asking,
answering accordingly. It uses tools such as MongoDB, MySQL, Elasticsearch, and a neural
network. MedBot [14] is a telemedicine-based chatbot which shows how doctors can use
telemedicine in the recent Covid-19 outbreak to connect with their patients. Telemedicine
helps the patient receive medical care without having to visit a hospital, and AI-based
applications are used for treatment. It involves a multi-language, natural language
application to provide chronic patients with health education and guidance; Firebase Cloud
features and the Google Cloud platform serve as the backend infrastructure for this
application. A study on home automation [15] using IoT and an NLP chatbot introduces a
web-based program that allows control of fans, lights, and other equipment over the
Internet. The chatbot algorithm enables the user to track the working of electrical devices
at home using text information. It employs a Raspberry Pi running Raspbian, the open-source
Python programming language, the natural-language NLTK library, and a Microsoft Azure Cloud
server hosting the web application and chat algorithm.

37.3 Research Methodology

When the user asks a question in any language, the voice is recognized using a speech
recognition API. To make the system understand the question, the text is translated from
one language to another using the Google Translate API. We used Python's NLTK [16] library
for preprocessing the question and also performed preprocessing on the questions in the
dataset. We then applied term frequency-inverse document frequency (TF-IDF) and generated a
document term matrix (DTM) from the dataset. This was followed by the use of a Naïve Bayes
classifier to determine the most probable class to which the input question belongs. We
also applied different algorithms to fetch the suitable answer: the maximum cosine
similarity is used to select the most probably correct answer, so the reply with the
maximum similarity is returned by the chatbot [17]. We took 30% of the data for testing and
the remaining 70% for training in order to apply the classifier and calculate accuracy.
Figure 37.1 shows the flow of the methodology.
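The cosine-similarity step can be sketched as follows with scikit-learn's TfidfVectorizer; the stored questions and answers below are invented examples, not entries from the actual corpus.

```python
# Illustrative sketch: the stored question most similar (by cosine similarity
# over TF-IDF vectors) to the user's translated question selects the reply.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

stored_questions = [
    "what is the admission process",
    "which documents are required for admission",
    "what is the exam schedule",
]
answers = [
    "Apply online and submit the form before the deadline.",
    "Mark sheets and an identity proof are required.",
    "The exam schedule is published on the notice board.",
]

vectorizer = TfidfVectorizer()                 # builds the DTM with TF-IDF weights
dtm = vectorizer.fit_transform(stored_questions)

user_question = "what documents do I need for admission"
q_vec = vectorizer.transform([user_question])
sims = cosine_similarity(q_vec, dtm)[0]
print("best match:", stored_questions[sims.argmax()])
print("reply:", answers[sims.argmax()])
```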

37.3.1 Recognition of Human Voice and Text Processing

Voice control makes our application more convenient and comfortable when the user's hands
are busy. A hidden Markov model (HMM) has been used to convert audio into text.
Fig. 37.1 Diagrammatic representation of the methodology

audio into text. Speech is recognized with the Recognizer class. To begin with,
several tasks were performed such as importing speech recognition API and
implementation of [18] PyAudio package to access microphone. As there are wide
variety of spoken languages in the world, aim of chatbot is to recognize these
languages. We then implemented the Google Translate API which translates the
input question to other language. When the user speaks in Marathi language, it gets
converted into English and necessary preprocessing is performed.
There are different languages in the world, and the aim of a speech-based chatbot is to understand the different languages a user speaks. The Google Translate API can be used for many tasks, such as detecting languages, quickly translating text, and setting source and destination languages, and it supports a wide variety of languages. Its basic purpose is to translate terms or phrases from one language to another. Many users also type or speak digits, which may change the context of a word. For instance, 'Is this a 4 credit course?' is cleaned into 'Is this a four credit course?'. Another example is 'What should I do 4 that?', which is cleaned to 'What should I do for that?'. The term cleaning here refers to preprocessing the input text to provide clean text for further downstream operations. During the preprocessing stage, we also converted words to their base form. This process is called stemming, and we used the Porter Stemmer algorithm. For instance, the words 'engineering' and 'college' are converted to their base forms 'engin' and 'colleg', respectively, to understand the context of the statement. We did not use a lemmatizer, as it reduces words to their root forms and hence loses the semantics of the words. 'Wordcloud' is a Python library which is useful for visualizing frequent and important terms in the dataset; the size of each term indicates its frequency.
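The sketch below illustrates the cleaning, stop-word removal, and Porter stemming steps, plus the word cloud generation; the digit-to-word mapping is a simplified stand-in for the authors' cleaning rules, and the corpus is a placeholder.

```python
# Sketch of the preprocessing stage: digit normalization, stop-word removal,
# Porter stemming, and a word cloud of frequent terms.
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from wordcloud import WordCloud

nltk.download("stopwords", quiet=True)

DIGIT_WORDS = {"2": "two", "4": "four"}        # simplified stand-in for the cleaning rules
stemmer = PorterStemmer()
stops = set(stopwords.words("english"))

def preprocess(text: str) -> list[str]:
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tokens = [DIGIT_WORDS.get(t, t) for t in tokens]       # '4 credit course' -> 'four credit course'
    tokens = [t for t in tokens if t.isalpha() and t not in stops]
    return [stemmer.stem(t) for t in tokens]               # 'engineering' -> 'engin', 'college' -> 'colleg'

print(preprocess("Is this a 4 credit course in an engineering college?"))

# Word cloud: the size of each term reflects its frequency in the question dataset.
corpus = "syllabus exam admission syllabus captain college admission syllabus"  # placeholder text
WordCloud(width=400, height=300).generate(corpus).to_file("wordcloud.png")
```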

37.3.2 Naive Bayes (NB) Classifier

Natural language processing (NLP) plays a vital role in understanding a user request, and intent classification is a major aspect of a conversation engine. When the user asks a question, the text input is processed by a software function called a classifier, which assigns the sentence to one of the predefined categories. We implemented a Naïve Bayes (NB) [19] classifier over the different classes and their sentences. NB is a probabilistic algorithm, which means that it calculates the probability of each class for a given question and then outputs the class with the highest score. This highest-scoring class is the one closest to the input question. Table 37.1 presents the training dataset. We have taken a few (8) records from our Bayes_questions.csv file to calculate the probabilities of each class.

37.4 Results and Discussions

The Google Translate API was used to translate the user's language to English. The translator converts the question into the specified language; in our case, the Marathi question is converted to English. Text preprocessing is then performed on the generated question: tokenization, stop word and punctuation removal, lowercasing, stemming, and other necessary tasks. The dataset containing 1553 words was used to train the model. Figure 37.2 shows the count of records in each class.
Based on the frequency of the words, a Marathi Wordcloud is generated to visualize the words from all the domains. As shown in Fig. 37.3, the size of a word indicates its frequency and importance.
Table 37.2 shows the document term matrix for our first few sample questions. Each row represents a question, and each column represents a frequent term in our question dataset; for example, 'India' occurs once in Question 1 (Que1). Bayes' theorem and probability theory are then applied to this scenario to compare the input question with each class. Table 37.3 shows the calculated probability of each word of the input question for each respective class.
The metrics are calculated using the values of True Negative (TN), False Negative (FN), True Positive (TP), and False Positive (FP). Table 37.4 presents the confusion matrix, which is a tabular view of the number of right and wrong predictions made by the classifier. It can be seen from the table that 20 out of 30 classifications are correct, leading to an accuracy of 0.66 (i.e., 66%).
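For instance, the reported accuracy can be recovered from the confusion matrix in Table 37.4 as the sum of the diagonal divided by the total number of test samples; a small NumPy check of this arithmetic is shown below (class order assumed as in the table).

```python
# Accuracy from the confusion matrix of Table 37.4 (rows/cols: Education, Exam, Politics, Sports).
import numpy as np

cm = np.array([[18, 0, 0, 0],
               [ 1, 0, 0, 0],
               [ 8, 0, 2, 0],
               [ 1, 0, 0, 0]])

accuracy = np.trace(cm) / cm.sum()   # correct predictions / all predictions
print(accuracy)                      # ~0.67, i.e. 20 of 30 correct (reported as 66% in the text)
```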

Table 37.1 Training dataset

| S. No. | Input Marathi question | Translated English question | Domain |
| 1 | भारतातील आयटी अभियांत्रिकीचा अभ्यासक्रम काय आहे? | What is the syllabus of IT engineering in India? | Education |
| 2 | ऑस्ट्रेलियन क्रिकेट संघाचा कर्णधार कोण आहे? | Who is the captain of Australian cricket team? | Sports |
| 3 | एमआयटी पुणे येथील आयटी विभागाचे अधिकारी कोण आहेत? | Who is the HOD of IT department at MIT Pune? | Education |

Fig. 37.2 Frequency of each class considered for the research work

Fig. 37.3 Visualization of Marathi words using Wordcloud

Table 37.2 Document term matrix

| Questions | Syllabus | It | Captain | Exams | India |
| Que1 | 1 | 1 | 0 | 0 | 1 |
| Que2 | 0 | 0 | 1 | 0 | 0 |
| Que3 | 0 | 1 | 0 | 0 | 0 |
| Que4 | 0 | 0 | 0 | 1 | 0 |
| Que5 | 0 | 0 | 0 | 0 | 0 |

Table 37.5 shows examples of the keyword matching algorithm. When the user asks a question, the algorithm finds the keywords of the question in the database. In the table, 'admission' is fetched from the first question and 'principal' from the second, and the suitable answer is provided. It is notable that although the actual system uses the correct names of the college and principal, we have intentionally removed those names while citing these results in Table 37.5.
Table 37.6 shows that the string similarity algorithm compared the user's question with the questions in the system and, ignoring keywords, responded with a similarity percentage for each, representing the possibility of similarity between the strings, and generated an effective answer.

Table 37.3 Comparison of the probabilities of input words

| S. No. | Words in input question | P(word/education) | P(word/sports) | P(word/exams) |
| 1 | List | 0.61 | 0.21 | 0.21 |
| 2 | Some | 0.61 | 0.21 | 0.21 |
| 3 | Best | 0.61 | 0.21 | 0.21 |
| 4 | IT | 0.81 | 0.21 | 0.21 |
| 5 | Institute | 0.61 | 0.21 | 0.21 |
| 6 | India | 0.71 | 0.21 | 0.21 |

Table 37.4 Confusion matrix

| Class | Education | Exam | Politics | Sports |
| Education | 18 | 0 | 0 | 0 |
| Exam | 1 | 0 | 0 | 0 |
| Politics | 8 | 0 | 2 | 0 |
| Sports | 1 | 0 | 0 | 0 |

Table 37.5 Use of keyword matching algorithm

| Questions | Answers | Keyword |
| What is the admission process? | Admission would be through an entrance | Admission |
| Who is the principal of ABC college? | Dr. XYZ is the principal of ABC college | Principal |

We have chosen to merge the two algorithms to obtain the best possible results. In addition, we have added questions with responses and keywords, as well as questions with answers but no keywords, to the database, as shown in Table 37.7. The keyword matching algorithm [20] is used to look up the keywords and check all prestored questions, with or without keywords, with the string similarity algorithm. In total, 1553 words were used to train the model. To recognize the keywords, keyword matching algorithms [21] were used first.
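As a rough illustration of this combination (not the authors' exact implementation), the sketch below tries keyword matching first and falls back to cosine similarity over TF-IDF vectors of all prestored questions; the FAQ entries and the threshold-free fallback are illustrative assumptions.

```python
# Sketch: keyword matching first, string (cosine) similarity as a fallback.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

faq = [  # (question, answer, keyword or None) -- illustrative entries
    ("What is the admission process?", "Admission would be through an entrance", "admission"),
    ("Who is the principal of ABC college?", "Dr. XYZ is the principal of ABC college", "principal"),
    ("Have you received my application form?", "Sorry, you should contact the admission office", None),
]

vec = TfidfVectorizer()
faq_matrix = vec.fit_transform(q for q, _, _ in faq)

def reply(user_question: str) -> str:
    words = set(user_question.lower().split())
    # 1) keyword matching: return the answer whose keyword appears in the question
    for question, answer, keyword in faq:
        if keyword and keyword in words:
            return answer
    # 2) string similarity: fall back to the most similar prestored question
    sims = cosine_similarity(vec.transform([user_question]), faq_matrix).ravel()
    return faq[sims.argmax()][1]

print(reply("How do I get admission here?"))
print(reply("Did you get my application form?"))
```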

37.5 Conclusion and Future Work

In a rapidly advancing world, a virtual assistant helps to solve the queries of users online, even if the enquiry center is miles away from home. Different techniques and algorithms were used to recognize the voice, preprocess the text, and determine the accuracy of the model. In total, 1553 words were used to train the model. Text translation of the Marathi language is performed to generate a suitable answer, with an accuracy of 66% for the model.

Table 37.6 Use of string similarity algorithm

| Questions | Answers | Keyword1 | Keyword2 |
| Do I need an Aadhar card? | Yes, if you are from India, you should have it | Aadhar card | No keyword |
| Can I pay with Aadhar card? | Yes, you have to pay tuition fees online after admission confirmation | Aadhar card | No keyword |

Table 37.7 Use of combination of algorithms

| S. No. | Questions | Answers | Keyword1 | Keyword2 |
| 1 | Have you received my application form? | Sorry, you should contact the admission office | Received | No keyword |
| 2 | Have you received my online payment? | Sorry, you should contact the admission office | Received | No keyword |
| 3 | Have you received my rank in the entrance test? | Sorry, you should contact the admission office | No keyword | No keyword |

The Naive Bayes classifier is used for multiclass prediction. Through this paper, we can see that there is vast scope in building education-related chatbots which will help students resolve their queries. The work is also useful for solving queries in other domains such as CRM. For further work, we are working on context-based understanding of the Marathi language using WordNet with Word Sense Disambiguation and deep neural networks.

References

1. Kumar, P., Sharma, M., Rawat, S., Choudhury, T.: Designing and developing a chatbot using
machine learning. In: 2018 International Conference on System Modeling & Advancement in
Research Trends, 23rd–24th Nov 2018, pp. 87–91 (2018)
2. Inamdar, V.A., Shivanand, R.D.: Development of college enquiry chatbot using SnatchBot.
Int. Res. J. Eng. Technol. (IRJET), 1615–1618 (2019)
3. Abbasi, S., Kazi, H.: Measuring effectiveness of learning chatbot systems on student’s
learning outcome and memory retention. Asian J. Appl. Sci. Eng. 57–66 (2014)
4. Augello, A., Pilato, G., Machi, A., Gaglio, S.: An approach to enhance chatbot semantic
power and maintainability: experiences within the FRASI project. In: 2012 IEEE Sixth
International Conference on Semantic Computing, pp. 186–193 (2012)
5. Patel, N.P., Parikh, D.R., Patel, D.A., Patel, R.R.: AI and web-based human-like interactive
University Chatbot (UNIBOT). In: IEEE Conference Record # 45616, pp. 148–150. IEEE
Xplore. ISBN: 978-1-7281-0167-5 (2019)
6. Rahman, A.M., Mamun, A.A.: Programming challenges of Chatbot: current and future. In:
2017 IEEE Region 10 Humanitarian Technology Conference (R10-HTC), pp. 100–105
(2017)
7. Bani, B.S., Singh, A.P.: College enquiry chatbot using A.L.I.C.E. Int. J. New Technol. Res.
(IJNTR), 64–65 (2017)

8. Bii, P.: Chatbot technology: a possible means of unlocking student potential to learn how to
learn. Educ. Res. 4(2), 218–221. ISSN: 2141-5161 (2013)
9. Ranoliya, B.R., Raghuwanshi, N.: Chatbot for university related FAQs, pp. 140–145. IEEE.
978-1-5090-6367-3/17/$31.00 (2017)
10. Comendador, B.V., Francisco, B.B., Medenilla, J.S., Nacion, S.T., Serac, T.E.: Pharmabot: a
pediatric generic medicine consultant chatbot. J. Autom. Control Eng. 3(2), 137–140 (2015)
11. Cerezo, J., Kubelka, J., Robbes, R., Berge, A.: Building an expert recommender chatbot. In:
2019 IEEE/ACM 1st International Workshop on Bots in Software Engineering, pp. 59–63
(2019)
12. Wijaya, H.D., Gunawan, W., Avriza, R., Sutan, A.M.: Designing chatbot for college
information management. Int. J. Inf. Syst. Comput. Sci. 8–13 (2020)
13. Paikari, E., Choi, J., Kim, S., Baek, S.: A chatbot for conflict detection and resolution. In:
IEEE/ACM 1st International Workshop on Bots in Software Engineering (BotSe), pp. 29–33
(2019)
14. Muftahu, M: Higher education and Covid-19 pandemic: matters arising and the challenges of
sustaining academic programs in developing African universities. Int. J. Educ. Res. Rev. 50–
52 (2020)
15. Argal, A., Gupta, S., Modi, A., Pandey, P., Shim, S., Choo, C.: Intelligent travel chatbot for
predictive recommendation in echo platform, pp. 176–183. IEEE. 978-1-5386-4649-6/18/
$31.00 (2018)
16. Baby, C.J., Khan, F.A., Swathi, J.N.: Home automation using IoT and a chatbot using natural
language processing. In: International Conference on Innovations in Power and Advanced
Computing Technologies [i-PACT2017], pp. 1–7 (2017)
17. Sinha, P., Bafna, P., Saini, J.R.: Hindi speech-based healthcare chatbot. In: Fifth International
Conference on Smart Computing and Informatics (SCI-2021). Springer (2022, In press)
18. Available online https://www.vyasaonline.com/category/mahabharata/stories-in-Kannada/.
Accessed 10 Jan 2022
19. Kulkarni, C., Bhavsar, A., Pingale, S., Kumbhar, S.: BANK CHAT BOT—an intelligent
assistant system using NLP and machine learning. Int. Res. J. Eng. Technol. (IRJET), 2374–
2377 (2017)
20. Galitsky, B., Ilvovsky, D.: Chatbot with a discourse structure-driven dialogue management.
In: Proceedings of the EACL 2017 Software Demonstrations, Valencia, Spain, 3–7 Apr 2017,
pp. 87–90 (2017)
21. Shakhovska, N., Basystiuk, O., Shakhovska, K.: Development of the Speech-to-Text Chatbot
Interface Based on Google API. MoMLeT 2019 (2019)
Chapter 38
Color Feature Extraction-Based
Near-Duplicate Video Retrieval

Dhanashree Phalke and Sunita Jahirabadkar

Abstract Near-duplicate video retrieval (NDVR) is a systematic approach to searching for near-duplicate videos, i.e., videos which may differ from each other in format, version, editing, and similar aspects. Using RGB color features, a vector is generated which is used for NDVR based on Euclidean distance. The precision measure is used to verify the performance of the model, and the benchmarked CC_WEB_VIDEO dataset is used.

Keywords Near-duplicate video retrieval · RGB · KeyFrame · Euclidean distance

38.1 Introduction

Near-duplicate video retrieval (NDVR) is a systematic approach to searching for near-duplicate videos, which are created by transforming original videos with or without permission. NDVR can be used to handle critical issues in various applications, such as storage cleaning, copyright protection or copyright violation detection, and video recommendation. Feature extraction is the crucial step for near-duplicate retrieval: the more effective the feature extraction process, the more efficient the retrieval results.
The paper is organized as follows: Literature review is given in Sect. 38.2 followed
by the proposed methodology in Sect. 38.3. The results are discussed in Sect. 38.4.

D. Phalke (B)
Department of Technology, Savitribai Phule Pune University, Pune, India
e-mail: daphalke@dypcoeakurdi.ac.in
S. Jahirabadkar
Cummins College of Engineering for Women, Pune, India
e-mail: sunita.jahirabadkar@cumminscollege.in


38.2 Literature Review

Various features, such as color and edges, are extracted from videos and further processed for retrieval. HOOF, LBP, and HSV features have been extracted by various authors [1]. Different deep learning techniques, such as LSTM and VGGNet, have been implemented for near-duplicate video retrieval [2]. Before working with deep learning techniques, a basic understanding of color features builds awareness of feature extraction strategies; for this purpose, color feature-based near-duplicate video retrieval is used in this paper.

38.3 Proposed Methodology

The proposed methodology for near-duplicate video retrieval has the following modules: feature vector extraction, video fingerprint generation, fingerprint dataset, and similarity evaluation. To measure the performance of the retrieved near-duplicate videos, mAP, precision, and recall are used.

38.3.1 Feature Vector Extraction

A video is simply a sequence of individual frames captured as images. These frames contain the features of the video which will be extracted. Features extracted from the video database will be used for training, whereas features extracted from the query video will be used for testing. Feature extraction aims to generate effective and discriminative representations of video contents and is a very important task of NDVR. From each video of the video dataset, 5 frames are extracted, namely frame numbers 1, 21, 41, 61, and 81. Every frame is then divided into four quadrants (Fig. 38.1), and from each quadrant, RGB features are extracted.
As shown in Table 38.1, 12 RGB features are extracted from each frame. These features are concatenated to generate a single feature vector for one frame.
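A minimal OpenCV sketch of this step is given below, assuming frames 1, 21, 41, 61, and 81 are sampled and that the per-quadrant channel means are normalized to [0, 1]; the exact frame indexing and scaling used by the authors may differ. The fingerprint described in the next subsection can then be obtained by averaging the five resulting 12-dimensional vectors.

```python
# Sketch: 12 RGB features per frame (mean R, G, B of each of the 4 quadrants).
import cv2
import numpy as np

FRAME_IDS = [1, 21, 41, 61, 81]   # five frames at an interval of 20 frames

def frame_features(frame: np.ndarray) -> np.ndarray:
    h, w = frame.shape[:2]
    quadrants = [frame[:h // 2, :w // 2], frame[:h // 2, w // 2:],
                 frame[h // 2:, :w // 2], frame[h // 2:, w // 2:]]
    feats = []
    for quad in quadrants:
        b, g, r = cv2.mean(quad)[:3]              # OpenCV stores channels as BGR
        feats.extend([r / 255.0, g / 255.0, b / 255.0])
    return np.array(feats)                        # 4 quadrants x 3 channels = 12 values

def video_features(path: str) -> list[np.ndarray]:
    cap = cv2.VideoCapture(path)
    per_frame = []
    for idx in FRAME_IDS:
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)     # jump to the sampled frame
        ok, frame = cap.read()
        if ok:
            per_frame.append(frame_features(frame))
    cap.release()
    return per_frame
```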

38.3.2 Video Fingerprint Generation

A video fingerprint is a representation of each video in the form of a signature, generated from the extracted features of the video. A video fingerprint is generated by averaging the extracted features of the five frames of each video.
In Table 38.2, colorFinal denotes the color features extracted from a single quarter of the divided frame, in the form of the red, green, and blue values of that quarter. colorFeatures is the mean of the color features extracted from the four quarters of a single frame.

Fig. 38.1 Each frame is divided into 4 quadrants to extract features

Table 38.1 Features extracted from each quadrant of the extracted frame

| Quadrant of image | Red | Green | Blue |
| colorFeature1 | 0.374005 | 0.306709 | 0.285355 |
| colorFeature2 | 0.422986 | 0.297366 | 0.278616 |
| colorFeature3 | 0.053261 | 0.028986 | 0.019979 |
| colorFeature4 | 0.121302 | 0.044323 | 0.022448 |

Table 38.2 Video fingerprint generation from extracted feature vector

| colorFinal | colorFeatures | avgFeatures |
| 0.374005 | 0.581849 | 0.11637 |
| 0.306709 | 0.43545 | 0.08709 |
| 0.285355 | 1.115446 | 0.223089 |
| 0.422986 | 3.079804 | 0.615961 |
| 0.297366 | 3.098192 | 0.619638 |
| 0.278616 | 3.254442 | 0.650888 |
| 0.053261 | 0.125207 | 0.025041 |
| 0.028986 | 0.101656 | 0.020331 |
| 0.019979 | 0.112112 | 0.022422 |
| 0.121302 | 0.836042 | 0.167208 |
| 0.044323 | 0.533438 | 0.106688 |
| 0.022448 | 0.311771 | 0.062354 |

Fig. 38.2 Testing fingerprint generation of query video

avgFeatures is the mean over the five frames extracted from a single video. The generated fingerprint of each video from the dataset is stored and used in the testing phase, where it is used to retrieve near-duplicate videos.

38.4 Testing Phase

38.4.1 Testing Phase

After the training phase, which stores a video fingerprint of each video, the testing phase begins. In this phase, a query video is given as input. From the query video, five frames are extracted and color features (RGB) are generated. Testing fingerprint generation for the query video is shown in Fig. 38.2. The generated video fingerprint is then used to retrieve near-duplicate videos from the video dataset by evaluating similarity.

38.4.2 Similarity Evaluation

The similarity between the query video and database videos is measured based on the generated signatures. There are various similarity measures, e.g., Euclidean distance and Hamming distance; here, Euclidean distance is calculated for similarity evaluation:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2} \qquad (38.1)$$

where $q_i$ denotes the signature of a database video, read from the Excel file in which all extracted features of the video database are stored as video signatures, and

Table 38.3 Training vector creation time


Number of videos Number of classes Elapsed time Unit of time Extraction %
1000 10 2433.623 sec 30

Table 38.4 Status and its


Status Meaning
meaning
E Exactly duplicate
S Similar video
V Different version
M Major change
L Long version
X Dissimilar video
−1 Video does not exist

$p_i$ denotes the signature generated through feature extraction of the query video.
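A small NumPy/pandas sketch of this search is shown below: stored fingerprints are compared against the query fingerprint with Eq. (38.1) and the closest videos are returned. The file name and column layout are assumptions, not the authors' actual storage format.

```python
# Sketch: rank database videos by Euclidean distance to the query fingerprint (Eq. 38.1).
import numpy as np
import pandas as pd

def retrieve(query_fingerprint: np.ndarray,
             signature_file: str = "fingerprints.xlsx",   # assumed file of stored signatures
             top_k: int = 3):
    db = pd.read_excel(signature_file)                    # one row per video: id + 12 feature columns
    ids = db["video_id"].to_numpy()
    signatures = db.drop(columns=["video_id"]).to_numpy()

    # d(p, q) = sqrt(sum_i (q_i - p_i)^2) against every stored signature q
    distances = np.sqrt(((signatures - query_fingerprint) ** 2).sum(axis=1))
    order = np.argsort(distances)[:top_k]
    return list(zip(ids[order], distances[order]))        # nearest near-duplicate candidates first
```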

38.4.3 Testing Results

Table 38.3 shows the testing results of the system, where 1000 videos of 10 various classes, downloaded from the CC_WEB_VIDEO dataset, are used as the video dataset. CC_WEB_VIDEO is a benchmarked dataset used for near-duplicate retrieval. It contains 13,129 videos [5, 6] divided into 24 classes, with the status labels listed in Table 38.4.
The system is tested on 1000 videos of 10 various classes using query videos which are also given in the dataset. The results are shown in Table 38.5 (Fig. 38.3).

38.4.4 Performance Evaluation

To measure the performance of the retrieved near-duplicate videos, mAP, precision [6], and recall are adopted [1–4].

$$AP = \frac{1}{n}\sum_{i=0}^{n}\frac{i}{r_i} \qquad (38.2)$$

where $n$ is the number of videos relevant to the query video, and $r_i$ is the rank of the $i$th retrieved relevant video.
Table 38.5 Testing results of 1000 videos on 10 classes

| Video ID | Query video | Time elapsed first run | Time elapsed second run | Time elapsed third run | Euclidean distance | Extracted query | Extracted class | Extracted relevant videos | Extracted non-relevant videos |
| 108 | 2_8.mp4 | 62.11933 | 28.5724 | 19.159 | 0 | 103 | 2 | 36 | 264 |
| | | | | | 0 | 107 | 2 | | |
| | | | | | 0 | 108 | 2 | | |
| 101 | 2_1.mp4 | 17.69397 | 14.4853 | 14.7695 | 0 | 101 | 2 | 21 | 279 |
| | | | | | 0.001429 | 925 | 12 | | |
| | | | | | 0.003102 | 121 | 2 | | |
| 110 | 2_10.mp4 | 21.54346 | 21.6437 | 35.6209 | 0 | 103 | 2 | 36 | 264 |
| | | | | | 0 | 107 | 2 | | |
| | | | | | 0 | 108 | 2 | | |
| 1 | 1_1.mp4 | 16.00641 | 14.6225 | 12.7838 | 0 | 1 | 1 | 60 | 240 |
| | | | | | 0 | 5 | 1 | | |
| | | | | | 0 | 14 | 1 | | |
| 6 | 1_10.mp4 | 19.29317 | 8.44167 | 7.14951 | 0 | 2 | 1 | 60 | 240 |
| | | | | | 0 | 4 | 1 | | |
| | | | | | 0 | 6 | 1 | | |
| 3 | 1_3.mp4 | 16.47029 | 14.0602 | 14.0337 | 0 | 3 | 1 | 60 | 240 |
| | | | | | 0.014812 | 131 | 2 | | |
| | | | | | 0.03098 | 367 | 5 | | |
| 302 | 5_2.mp4 | 19.91686 | 15.0976 | 13.5754 | 0 | 302 | 5 | 60 | 240 |
| | | | | | 0 | 305 | 5 | | |
| | | | | | 0 | 330 | 5 | | |
| 306 | 5_6.mp4 | 14.38794 | 13.9091 | 15.1081 | 0 | 306 | 5 | 60 | 240 |
| | | | | | 0 | 341 | 5 | | |
| | | | | | 0 | 342 | 5 | | |
| 103 | 2_3.mp4 | 16.87154 | 13.3027 | 15.1973 | 0 | 103 | 2 | 36 | 264 |
| | | | | | 0 | 107 | 2 | | |
| | | | | | 0 | 108 | 2 | | |
| 206 | 3_7.mp4 | 17.25236 | 13.267 | 13.0467 | 0 | 206 | 3 | 67 | 233 |
| | | | | | 0 | 208 | 3 | | |
| | | | | | 0 | 210 | 3 | | |
| 702 | 10_2.mp4 | 13.63483 | 13.0261 | 12.7227 | 0 | 702 | 10 | 31 | 269 |
| | | | | | 0 | 705 | 10 | | |
| | | | | | 0 | 706 | 10 | | |
| 301 | 5_1.mp4 | 13.56769 | 13.8943 | 11.842 | 0 | 301 | 5 | 60 | 240 |
| | | | | | 0 | 311 | 5 | | |
| | | | | | 0 | 384 | 5 | | |
| 401 | 6_1.mp4 | 12.19629 | 11.7925 | 12.3619 | 0 | 401 | 6 | 29 | 271 |
| | | | | | 0 | 406 | 6 | | |
| | | | | | 0 | 457 | 6 | | |
| 503 | 7_3.mp4 | 12.93046 | 11.8829 | 11.1829 | 0 | 503 | 7 | 94 | 206 |
| | | | | | 0 | 507 | 7 | | |
| | | | | | 0 | 513 | 7 | | |
| 601 | 9_1.mp4 | 15.72918 | 12.0128 | 12.3952 | 0 | 601 | 9 | 61 | 239 |
| | | | | | 0.000248 | 102 | 2 | | |
| | | | | | 0.000248 | 666 | 9 | | |
| 702 | 10_2.mp4 | 14.03967 | 11.8036 | 12.6684 | 0 | 702 | 10 | 31 | 269 |
| | | | | | 0 | 705 | 10 | | |
| | | | | | 0 | 706 | 10 | | |
| 803 | 11_3.mp4 | 16.56119 | 13.1237 | 12.5123 | 0 | 803 | 11 | 57 | 243 |
| | | | | | 0 | 827 | 11 | | |

Fig. 38.3 Graphical analysis of relevant videos (bar chart comparing extracted relevant videos with actual relevant videos per query; y-axis: count, 0–100)

Table 38.6 Precision analysis

| Query video | Extracted relevant videos (TP + FP) | Actual relevant videos (TP) | Precision = (TP / (TP + FP)) * 100 |
| 1_1.mp4 | 60 | 49 | 81.7 |
| 2_3.mp4 | 36 | 21 | 58.3 |
| 3_7.mp4 | 67 | 48 | 71.6 |
| 5_1.mp4 | 60 | 48 | 80 |
| 6_1.mp4 | 29 | 8 | 27.6 |
| 7_3.mp4 | 94 | 93 | 98.9 |
| 9_1.mp4 | 61 | 14 | 23 |
| 10_2.mp4 | 31 | 25 | 80.65 |
| 11_3.mp4 | 57 | 56 | 98.25 |

Precision is used for performance evaluation (Table 38.6).
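The sketch below shows how the precision values of Table 38.6 and the average precision of Eq. (38.2) could be computed from the retrieval output; the rank list passed to the AP function is purely illustrative.

```python
# Sketch: precision (Table 38.6) and average precision (Eq. 38.2).
def precision(true_positives: int, retrieved: int) -> float:
    """TP / (TP + FP), expressed as a percentage."""
    return 100.0 * true_positives / retrieved

def average_precision(relevant_ranks: list[int]) -> float:
    """AP = (1/n) * sum_i (i / r_i), with r_i the rank of the i-th relevant video."""
    n = len(relevant_ranks)
    return sum(i / r for i, r in enumerate(relevant_ranks, start=1)) / n

print(precision(49, 60))                 # ~81.7 for query 1_1.mp4 in Table 38.6
print(average_precision([1, 2, 4, 7]))   # illustrative rank list
```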

38.5 Conclusion and Future Work

RGB color features are prominent and useful visual features for near-duplicate video retrieval. Five frames, at an interval of 20 frames, are extracted from each video. RGB features are extracted from each quarter of every frame, and a video signature is generated. This signature is used to extract relevant near-duplicate videos from the database. For the experimentation, a total of 1000 videos from the CC_WEB_VIDEO dataset are used. The results show that the RGB feature alone is not sufficient to retrieve near-duplicate videos. In the future, other color models will be used to improve the accuracy of near-duplicate video retrieval.

References

1. Phalke, D.A., Jahirabadkar, S.: A systematic review of near-duplicate video retrieval techniques.
Int. J. Pure Appl. Math. 118(24), 1–11 (2018)
2. Phalke, D.A., Jahirabadkar, S.: A survey on near-duplicate video retrieval using deep learning
techniques and framework. In: 2020 IEEE Pune Section International Conference (PuneCon),
pp. 124–128. IEEE (2020)
3. Chou, C.-L., Chen, H.-T., Lee, S.-Y.: Pattern-based near-duplicate video retrieval and localization
on web-scale videos. IEEE Trans. Multimedia 17(3), 382–395 (2015)
4. Hao, Y., Mu, T., Hong, R., Wang, M., An, N., Goulermas, J.Y.: Stochastic multiview hashing
for large-scale near-duplicate video retrieval. IEEE Trans. Multimedia 19(1), 1–14
5. Wu, X., Hauptmann, A.G., Ngo, C.-W.: Practical elimination of near-duplicates from web video
search. In: 15th ACM International Conference on Multimedia, pp. 218–227 (2007)
6. http://vireo.cs.cityu.edu.hk/webvideo
Chapter 39
LabVIEW Software for Design
and Implementation of Particle Swarm
Optimization Tuned Controller
for the Climate Control of Greenhouse
System

Shriji V. Gandhi , Manish Thakker, and Ravi Gandhi

Abstract This paper presents the novel concept of simulation and optimized control of the greenhouse system (GHS). The greenhouse is a climate control system with protection boundaries that allow only solar light, and no other outside environmental element, to enter. To provide a favorable environment to the crop, the dynamic behavior of the greenhouse system, the interaction between its parameters, and the effects of disturbances should be determined. To fulfill these requirements, a greenhouse simulator has been developed for modeling and control of the GHS. The greenhouse climate control dynamics, based on energy and mass balance, are implemented in this simulator, which allows changing greenhouse design parameters, climate conditions, and control strategies to understand the dynamic behavior of the greenhouse system. This paper presents the simulation of the greenhouse climate control system with open-loop and closed-loop structures; for closed-loop control, a particle swarm optimization (PSO) tuned controller is implemented to meet the desired climate requirements in the greenhouse system. Simulation results obtained for different scenarios, with comparative studies, are presented.

Keywords Greenhouse system · Particle swarm optimization (PSO) · LabVIEW software · Feedback linearization and decoupling

S. V. Gandhi (B)
Instrumentation and Control Engineering, Gujarat Technological University, Gujarat, India
e-mail: shrijigandhi.007@gmail.com
M. Thakker
Instrumentation and Control Engineering, L.D. College of Engineering, Gujarat, India
R. Gandhi
Electronics and Communication Engineering, Silver Oak University, Gujarat, India


39.1 Introduction

Greenhouse systems are being implemented in most regions in order to provide suitable climate conditions for the growth and expansion of plants [1]. The greenhouse is well suited to crop raising since it creates a closed environment in which the climate and the growth of the plants can be controlled. The GHS protects the crop from the external climate in any season, and the environment can be suitably modified as per the necessities of the crop [2]. The main purpose of the GHS is to generate the maximum amount of product with optimum quality at low cost. The greenhouse system is a complex, dynamic, and nonlinear MIMO system [3]. The crop experiences internal climate conditions such as inside temperature (T_in), inside absolute humidity (H_in), and CO2 concentration, while external climate disturbances such as outside air temperature, outside air humidity, wind velocity, and solar radiation affect the GHS [1–4]. The prime parameters that can be monitored and controlled are T_in, H_in, and the inside CO2 concentration level, which can be accomplished by numerous control inputs such as heater input, ventilation input, fog input, and a CO2 injection system [5]. This paper presents a useful perspective on greenhouse mathematical modeling and simulation using energy balance and mass balance principles. Through the literature survey, it was observed that commonly two models have been developed for the GHS: a heating model achieved by the heater input and a cooling model achieved by the fog system and ventilation inputs. Climate regulation can be accomplished by different control structures such as an ON–OFF controller, a linear controller, and feedback and feedforward controllers [6]. For the investigation of the GHS, several web-based interactive tools have been developed [7]. The consequences of measurement noise can be suppressed using the Kalman filter algorithm [8]. The response of the entire system
depends on these relations, but also on the external atmosphere and the control strategies [9]. The key task is to maintain the indoor greenhouse climate within suitable ranges. The difficulties lie in the weather cycle, the saturation of the manipulated variables, and the behavior of the soil; therefore, the GHS is categorized as a complex nonlinear system [10]. The efficiency of the system depends on how well the inside climate condition can be controlled. This paper presents useful concepts of greenhouse system simulation and control using LabVIEW. The LabVIEW-based simulation allows investigating changes in the climate and other influencing parameters in order to understand the dynamic behavior of the greenhouse system, and LabVIEW represents a useful tool for the analysis of the GHS. The LabVIEW Control Design and Simulation toolkit supports the analysis of different dynamic systems and the implementation of different control strategies in real-time systems. This LabVIEW software allows the user to select the greenhouse design variables, climate conditions, and control strategies [11], and it provides a realistic estimate of the dynamic behavior of the greenhouse climate control system for different design configurations and weather data. For control purposes, a PSO-tuned PID controller is implemented for the climate control of the GHS. The paper is organized as follows: Sect. 39.1 is the introduction of the system, Sect. 39.2 describes the mathematical model of the GHS, Sect. 39.3 presents the system implementation, Sect. 39.4 presents the results, and Sect. 39.5 presents the conclusion.

39.2 Mathematical Modeling of GHS

The greenhouse climate is a multivariable process. Control inputs such as the heating system, fog system, and ventilation all act concurrently on the system variables to achieve the controlled variables T_in and H_in. The disturbances for the system are global radiation, outside temperature, outside humidity, and canopy temperature [12]. The greenhouse system is a complex multivariable system [13]. The mathematical model describing the dynamic behavior of the greenhouse climate has been derived from mass and energy balance principles.
Figure 39.1 shows the schematic diagram of the greenhouse system dynamic model. The greenhouse mathematical model can be described by energy and mass balance equations, which are produced by the differences in energy and mass between the inside and the outside environment [14, 15]. A mathematical model of a system helps to understand the dynamics of the system and to optimize the process design and operating conditions. The energy balance states that the total accumulation of energy in the system is the difference between the energy input into the system and the energy output from the system. The mass balance states that the rate of mass input to the system equals the rate of mass output [15].

Fig. 39.1 Schematic diagram of greenhouse system



$$\frac{dT_{in}}{dt} = \frac{1}{\xi C_p V_T}\left(\varphi_H + Q_S - \lambda \varphi_{fog}\right) - \frac{\varphi_{vent}}{V_T}(T_{in} - T_{out}) - \frac{\mu}{\xi C_p V_T}(T_{in} - T_{out}) + \frac{C_{canopy}}{\xi C_p V_T}(T_C - T_{in}) \qquad (39.1)$$

$$\frac{dH_{in}}{dt} = \frac{\varphi_{fog}}{V_H} - \frac{\varphi_{vent}}{V_H}(H_{in} - H_{out}) + \frac{E(Q_s, H_{in})}{V_H} \qquad (39.2)$$

Here, T_in and T_out are the inside and outside temperatures, μ is the heat transfer coefficient of the material (29 J/K), H_in and H_out are the inside and outside humidity, ξ is the air density (1.2 kg m⁻³), ϕ_vent is the ventilation rate input, C_p is the specific heat of air (1006 J kg⁻¹ K⁻¹), Q_s is the solar radiation, and V_T and V_H are the air volumes for temperature and humidity mixing. λ denotes the latent heat of vaporization, ϕ_H is the heater input, E(Q_s, H_in) ≈ (α/λ)Q_s − βH_in is the evapotranspiration rate, and ϕ_fog is the fog input. The difference between the canopy temperature and the inside temperature is set to 2–6 °C. C_canopy is the heat transfer coefficient of the canopy. α and β are lumped parameters depending on the leaf area index; their values are considered as 0.012 and 0.0015, respectively [16].
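To make the dynamics concrete, the sketch below integrates Eqs. (39.1) and (39.2) numerically with SciPy for fixed inputs and constant disturbances. Only the constants quoted above (air density, specific heat, heat transfer coefficient, α, β) come from the text; the mixing volumes, latent heat value, canopy coefficient, and the algebraic form assumed for E(Q_s, H_in) are illustrative assumptions.

```python
# Sketch: open-loop simulation of the greenhouse model, Eqs. (39.1)-(39.2), with SciPy.
import numpy as np
from scipy.integrate import solve_ivp

# Constants quoted in the text; volumes, latent heat, and canopy coefficient are assumed.
rho, cp, mu = 1.2, 1006.0, 29.0           # air density, specific heat, heat-transfer coeff.
lam = 2257.0                               # latent heat of vaporization (J/g), assumed value
alpha, beta = 0.012, 0.0015                # lumped evapotranspiration parameters
VT = VH = 4000.0                           # effective mixing volumes (m^3), assumed equal

def greenhouse(t, x, phi_h, phi_vent, phi_fog, Qs, Tout, Hout, Tc, Ccan=10.0):
    Tin, Hin = x
    E = alpha * Qs / lam - beta * Hin                     # assumed evapotranspiration form
    dTin = ((phi_h + Qs - lam * phi_fog) / (rho * cp * VT)
            - phi_vent * (Tin - Tout) / VT
            - mu * (Tin - Tout) / (rho * cp * VT)
            + Ccan * (Tc - Tin) / (rho * cp * VT))
    dHin = phi_fog / VH - phi_vent * (Hin - Hout) / VH + E / VH
    return [dTin, dHin]

# One hour of open-loop response from 20 degC and 4 g/m^3 with fixed ventilation only.
sol = solve_ivp(greenhouse, (0, 3600), [20.0, 4.0], max_step=10.0,
                args=(0.0, 5.0, 0.0, 350.0, 15.0, 10.0, 22.0))
print(sol.y[:, -1])   # inside temperature and humidity at the end of the run
```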

39.2.1 Decoupler for the Nonlinear GHS

The feedback linearization and decoupling method is proposed to decouple the GHS. Figure 39.2 shows the schematic of the decoupled structure for the GHS. The GHS is characterized by coupled and nonlinear dynamics, but it is linear in the control variables. To model this composite behavior, a state-space model is developed by considering the state of the GHS x = [x_1 x_2]^T = [T_in; H_in]^T and the control variable u = [u_1; u_2]^T = [ϕ_vent; ϕ_fog]^T.

Fig. 39.2 Decoupling and feedback linearization for the GHS

Together with the outside disturbances d = [d_1; d_2; d_3; d_4]^T = [Q_s; T_out; H_out; T_c]^T, the T_in and H_in dynamics can be written in the simplified form of a MIMO nonlinear system, as presented in the equations below. The whole closed-loop structure of the GHS is depicted in Fig. 39.2; it is observed from the diagram that the MIMO system is converted into two independent decoupled loops. Equations (39.3) and (39.4) show the linearized and decoupled GHS [2, 17].

$$\dot{x} = A(x, V) + B(x, V)u$$

where x is the state vector, V is the disturbance vector, and A(x, V) and B(x, V) are matrix functions of appropriate dimensions, so that each controlled output satisfies

$$\dot{y}_i = f_i(x, V) + g_i^T(x, V)u$$

where u_i is the new decoupled control input for output y_i. The nonsingular decoupling matrix D can then be defined as

$$D(x, V) = \begin{bmatrix} g_1^T(x, V) \\ \vdots \\ g_p^T(x, V) \end{bmatrix} \qquad (39.3)$$

$$u = D^{-1}(x, V)\left\{ -\begin{bmatrix} f_1(x, V) \\ \vdots \\ f_p(x, V) \end{bmatrix} + \begin{bmatrix} u_1 \\ \vdots \\ u_p \end{bmatrix} \right\} \qquad (39.4)$$
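As a rough illustration of the decoupling law (39.4) for this two-state model, the terms of Eqs. (39.1) and (39.2) that multiply u = [ϕ_vent; ϕ_fog]^T can be collected into B(x, V) and the remaining terms into A(x, V), after which the physical inputs follow from u = B^{-1}(v − A) for a desired decoupled dynamics v. This grouping and the numerical constants are this sketch's assumptions, not the authors' LabVIEW implementation.

```python
# Sketch: evaluating the feedback-linearizing / decoupling law u = B(x,V)^-1 (v - A(x,V))
# for the two-state greenhouse model (states Tin, Hin; inputs phi_vent, phi_fog).
import numpy as np

rho, cp, mu, lam = 1.2, 1006.0, 29.0, 2257.0   # assumed constants, as in the earlier sketch
alpha, beta = 0.012, 0.0015
VT = VH = 4000.0

def decoupling_control(x, d, v, phi_h=0.0, Ccan=10.0):
    Tin, Hin = x
    Qs, Tout, Hout, Tc = d
    E = alpha * Qs / lam - beta * Hin
    # Drift terms A(x, V): everything in Eqs. (39.1)-(39.2) not multiplied by u
    A = np.array([
        (phi_h + Qs) / (rho * cp * VT)
        - mu * (Tin - Tout) / (rho * cp * VT)
        + Ccan * (Tc - Tin) / (rho * cp * VT),
        E / VH,
    ])
    # Input matrix B(x, V): columns correspond to phi_vent and phi_fog
    B = np.array([
        [-(Tin - Tout) / VT, -lam / (rho * cp * VT)],
        [-(Hin - Hout) / VH, 1.0 / VH],
    ])
    return np.linalg.solve(B, np.asarray(v) - A)   # physical inputs [phi_vent, phi_fog]

u = decoupling_control(x=[32.0, 14.0], d=[500.0, 35.0, 10.0, 36.0], v=[-0.002, 0.001])
print(u)
```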

39.2.2 Particle Swarm Optimization (PSO) Tuned Controller

PSO is a population-based computation algorithm. PSO updates the population without any genetic operators such as crossover and mutation [18]. In PSO, the 'swarm' is initialized with a population of arbitrary solutions [19]. Every particle in the swarm is a separate candidate set of the unknown parameters to be optimized. The aim is to efficiently explore the solution space by swarming the particles toward the best-fitting solution encountered in earlier iterations, with the intent of encountering improved solutions through the course of the process and eventually converging on a minimum-error solution. The steps involved in the PSO-based optimization of the PID controller parameters (K_p, K_i, K_d) are listed below. (1) Initialize a group of particles with arbitrary positions, velocities, and accelerations. (2) Calculate the fitness of each particle. (3) Compare the fitness of each particle to its previous pbest and update pbest if the new fitness is better. (4) Compare the fitness of each particle to the previous global best; if the fitness is better, revise it as gbest. (5) Update the velocity and position of each particle according to the algorithm. (6) Return to step 2 and keep repeating until a terminating condition is met. The objective function is the sum of squared error (SSE), and the optimized decision variables (K_p, K_i, K_d) are 8.2, 2.28, and 81.3, respectively [20].
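A minimal sketch of this loop is given below: a basic PSO searches PID gains (K_p, K_i, K_d) that minimize the SSE of a step response. The first-order stand-in plant, swarm size, and inertia/acceleration coefficients are illustrative assumptions and do not reproduce the greenhouse model or the gains reported above.

```python
# Sketch: PSO searching PID gains (Kp, Ki, Kd) that minimize the SSE of a step response.
import numpy as np

rng = np.random.default_rng(0)

def sse_cost(gains, dt=0.1, steps=600, tau=50.0, setpoint=1.0):
    """Simulate a first-order stand-in plant under PID control and return the SSE."""
    kp, ki, kd = gains
    y, integral, prev_err, sse = 0.0, 0.0, setpoint, 0.0
    for _ in range(steps):
        err = setpoint - y
        integral += err * dt
        u = kp * err + ki * integral + kd * (err - prev_err) / dt
        y += dt * (-y + u) / tau                 # plant: dy/dt = (-y + u) / tau
        prev_err = err
        sse += err ** 2
    return sse

n_particles, dim, iters = 20, 3, 50
pos = rng.uniform(0.0, 100.0, (n_particles, dim))    # step 1: random positions and velocities
vel = rng.uniform(-1.0, 1.0, (n_particles, dim))
pbest, pbest_cost = pos.copy(), np.array([sse_cost(p) for p in pos])
gbest = pbest[pbest_cost.argmin()]

for _ in range(iters):
    r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)   # step 5: velocity
    pos = np.clip(pos + vel, 0.0, 100.0)                                    # ... and position
    cost = np.array([sse_cost(p) for p in pos])                             # step 2: fitness
    improved = cost < pbest_cost                                            # step 3: personal bests
    pbest[improved], pbest_cost[improved] = pos[improved], cost[improved]
    gbest = pbest[pbest_cost.argmin()]                                      # step 4: global best

print("Tuned (Kp, Ki, Kd):", gbest.round(2))
```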

39.3 System Implementation

By analyzing the greenhouse mathematical model, the dynamic behavior of the greenhouse system can be determined using LabVIEW simulation. LabVIEW programs are called virtual instruments, and the LabVIEW software provides a graphical programming language. LabVIEW contains a variety of tools for acquiring, analyzing, displaying, and storing data, as well as tools to troubleshoot the code [21]. The LabVIEW control system toolbox is utilized to formulate this simulation.
In the front panel shown in Fig. 39.3, one can select greenhouse parameters such as length (m), height (m), width (m), the heat transfer coefficient of the material, air density, etc. The front panel also provides the facility for defining disturbances such as global radiation, outside temperature, and outside humidity. For the open-loop response, manipulated variables such as heater input, ventilation rate, and fog system input can be defined, and the user can simulate both open-loop and closed-loop responses.
Figure 39.4 shows a screenshot of the control and simulation loop toolkit of LabVIEW. This toolkit executes the functions described within the loop until the simulation time is reached. The control and simulation loop has an input node,
Fig. 39.3 LabVIEW front panel displaying controls and indicators of greenhouse system

Fig. 39.4 Control and simulation loop toolkit

which is useful for defining the parameters of the simulation. The output node of this toolkit contains an error output terminal, which provides error information if any error is found during loop execution and then terminates the simulation. Signal information can also be obtained from this output node, and by double-clicking it, we can configure its parameters. Figure 39.5 shows the general flow chart of the greenhouse system simulation. The first step is the data logging arrangement to access a particular climate file or to select a particular weather condition; the next step is to select the control strategy. If it is open loop, then there is no need for a set point. For the closed-loop system, after defining the desired climate condition we need to set the PID controller parameters (proportional, integral, and derivative gains).
This LabVIEW simulation of the greenhouse system allows the user to select the climate data files and greenhouse structure parameters for the analysis. The multiple-window display gives the additional advantage of analyzing the effect of various parameters on each other, and the simulator also provides the option to simulate the greenhouse system with

Fig. 39.5 Flowchart for analysis of the greenhouse system

an open-loop or closed-loop approach; it also provides a provision to save the data in a user-defined format. Stored simulation results can be plotted on the same graph for comparative analysis. This LabVIEW-based greenhouse system simulation is interactive and user-friendly for researchers and students.

39.4 Simulation Results

Using this LabVIEW-based simulation, the greenhouse climate control system can be simulated under various conditions. The simulation provides an open-loop response as well as a closed-loop response. For the closed-loop control, a PSO tuned PID controller is implemented for both of the decoupled loops. For winter climates, the heater input can be used as a control signal, while for the summer season the heater input is set to zero. The simultaneous display shows the effect of various parameters on each other.

39.4.1 Open-Loop Response

The main objective is to implement an environment control model that replicates the actual conditions of a greenhouse in an interactive simulator. The open-loop response of the GHS is implemented to summarize the key factors that affect the greenhouse inside weather conditions, such as the outdoor climate, the greenhouse construction, the use of different climate control structures, and the control approach. For the open-loop response, the volume of the greenhouse structure is taken as 4000 m³. The greenhouse is considered a homogeneous section in which the inside weather is uniform over the entire volume.
Figures 39.6 and 39.7 show the open-loop response of the greenhouse system. Figure 39.6 shows the response of the inside temperature with an initial temperature of 20 °C. Solar radiation is given as a step function from 350 to 550 W/m², with a step time of 25 ks. Figure 39.7 shows the response of the inside humidity with an initial condition of 4 g/m³. Outside temperature and outside humidity are given as random signals in the ranges 8–20 °C and 8–14 g/m³, respectively. The simulator also allows adding a weather data file to run simulation scenarios; the weather database comprises logged measured values of T_out and H_out from the greenhouse location with time stamps.

39.4.2 Closed-Loop Control Response

The climate controller is a significant component of the greenhouse system. To achieve the required T_in and H_in conditions in the GHS, a PSO tuned PID controller is implemented. This closed-loop simulation focuses on daytime climate regulation under summer weather conditions, so the heating system is not used. The efficacy of the proposed control

Fig. 39.6 Inside temperature response

Fig. 39.7 Inside humidity response

structure is demonstrated on the GHS with a series of simulation experiments. For this simulation, the surface area of the greenhouse is assumed to be 998 m² with a height of 4 m. The maximum capacities of the fogging rate and ventilation rate are considered as 1.6 × 10³ g/s and 22 m³/s, respectively. The values of the system parameters α and β are taken as 0.1242 and 0.0014, respectively.

Fig. 39.8 Stabilizing control for inside temperature

The greenhouse shading screen reduces solar radiation by 70%. The canopy temperature during daytime is higher than the inside temperature because of global radiation; however, this difference between the two temperatures is smaller at night. The closed-loop control response is obtained for stabilizing and tracking control. Stabilizing control refers to a control system intended to compensate for the disturbances: the set points for temperature and humidity remain constant while the disturbances change. In the stabilizing control, results are obtained for fixed set points with variable disturbances and different initial conditions. Figures 39.8 and 39.9 present the stabilizing response of the temperature and humidity loops for fixed set points of 30 °C and 19 g/m³, respectively. Outside temperature (T_out), outside humidity (H_out), and canopy temperature (T_c) are considered the primary disturbances.
The response of both loops is computed under wide ranges of the disturbances at a sampling time of 30 s. The efficacy of the proposed stabilizing control is verified with a randomized outside temperature of 31–38 °C, an outside humidity of 7–13 g/m³, and a 3–5 °C deviation between the canopy temperature and the inside temperature. It is observed from Figs. 39.8 and 39.9 that the proposed control approach provides a smoother and faster response compared to the other approach. The temperature loop settles within 20 min using the proposed approach, and the transient response using the optimal control has very small overshoot. Figure 39.10 shows that the suggested method provides rapid and smooth control; the humidity loop settles within 19 min using the proposed approach. Figure 39.10 also shows the movement of the control signal for the stabilizing control. With the proposed control algorithm, the input saturation effect is minimized.
In the tracking control, the set point signal is changed and the manipulated variables are adjusted to obtain the new operating condition. For tracking control, the set points of temperature and humidity are changed at particular time intervals while keeping the disturbances constant.

Fig. 39.9 Stabilizing control for inside humidity

Fig. 39.10 Control signal for stabilizing control

The manipulated variables, such as the fog system input and ventilation input, are adjusted to reach the defined set point. Tracking control of the system is displayed in Figs. 39.11 and 39.12. Figure 39.11 displays the tracking control of T_in from 35 to 28 °C with a 37 °C initial condition, and Fig. 39.12 shows the tracking control of H_in from 16 to 24 g/m³ with a 12 g/m³ initial condition. The overshoot obtained from the proposed control algorithm is less than 7%, whereas the PID controller produces very high spikes in the transient response, and high overshoot sometimes causes actuator saturation. The tracking control responses for a variable step change are displayed in Figs. 39.13 and 39.14, where the step change is given at every 1500 s interval. In both of these responses, less

Fig. 39.11 Tracking control of inside temperature for a step change

Fig. 39.12 Tracking control of inside humidity for a step change

than 9% overshoot is obtained. Using the optimal controller, less overshoot and better transient and steady-state responses are obtained.

Fig. 39.13 Variable step change for inside temperature

Fig. 39.14 Variable step change for inside humidity

39.5 Conclusion

A PSO tuned PID controller has been implemented for the climate control of the GHS, and the simulation results show that the proposed controller provides an oscillation-free response in all the scenarios. The graphical programming-based greenhouse simulator gives a good estimate of the greenhouse system behavior under various climate conditions and greenhouse design parameters. It also enables the user to simulate the system for both open-loop and closed-loop responses. Multiple windows, an interactive display, and a graphical user interface increase the usefulness of this simulator. The simulator is planned to be used as a research tool to examine greenhouse climate concepts and to test the performance of the controller in maintaining the control variables within suitable limits. The data logging and data saving features make the system more efficient, and the multiple displays allow us to identify the dynamic interaction of different parameters. In this simulator, initial condition and parameter uncertainty are not taken into consideration; the simulator functions can be extended at the model level and the controller level in an effective manner.

References

1. Pasgianos, G.D., Arvanitis, K.G., Polycarpou, P., Sigrimis, N.: A nonlinear feedback technique
for greenhouse environmental control. Artif. Intell. Agric. Bp. 40, 153–177 (2003)
2. Mohamed, S., Hameed, I.A.: A GA-based adaptive neuro-fuzzy controller for greenhouse
climate control system. Alex. Eng. J. 57, 773–779 (2018)
3. Outanoute, M., et al.: A neural network dynamic model for temperature and relative humidity
control under greenhouse. In: 2015 Third International Workshop on RFID and Adaptive Wire-
less Sensor Networks (RAWSN), pp. 6–11 (2015). https://doi.org/10.1109/RAWSN.2015.717
3270
4. Boughamsa, M., Ramdani, M.: Adaptive fuzzy control strategy for greenhouse micro-climate.
Int. J. Autom. Control 12, 108–125 (2017)
5. Heidari, M., Khodadadi, H.: Climate control of an agricultural greenhouse by using fuzzy
logic self-tuning PID approach. In: 2017 23rd International Conference on Automation and
Computing (ICAC), pp. 1–6 (2017). https://doi.org/10.23919/IConAC.2017.8082074
6. Gandhi, S.V., Thakker, M.T.: Climate control of greenhouse system using neural predictive
controller. In: Deb, D., Dixit, A., Chandra, L. (eds.) Renewable Energy and Climate Change,
pp. 211–221. Springer, Singapore (2020)
7. Fitz-Rodríguez, E., et al.: Dynamic modeling and simulation of greenhouse environments under
several scenarios: a web-based application. Comput. Electron. Agric. 70, 105–116 (2010)
8. Shi, P., Luan, X., Liu, F., Karimi, H.R.: Kalman filtering on greenhouse climate control. In:
Proceedings of the 31st Chinese Control Conference, pp. 779–784 (2012)
9. Gao, Y., Song, X., Liu, C., He, S.: Feedback feed-forward linearization and decoupling for
greenhouse environment control. In: 2014 International Conference on Mechatronics and
Control (ICMC), pp. 179–183 (2014). https://doi.org/10.1109/ICMC.2014.7231543
10. Javadikia, P., Tabatabaeefar, A., Omid, M., Alimardani, R., Fathi, M.: Evaluation of intelligent
greenhouse climate control system, based fuzzy logic in relation to conventional systems. In:
2009 International Conference on Artificial Intelligence and Computational Intelligence, vol.
4, pp. 146–150 (2009)
11. Gandhi, S., Binwal, S., Kabariya, H., Karkari, S.K.: LabVIEW software for analyzing Langmuir
probe characteristics in magnetized plasma. J. Instrum. 11, T03003–T03003 (2016)
12. Wang, Y., Zheng, W., Li, B., Li, X.: A new ventilation system to reduce temperature fluctuations
in laying hen housing in continental climate. Biosyst. Eng. 181, 52–62 (2019)
13. Vanthoor, B.H.E., Stanghellini, C., van Henten, E.J., de Visser, P.H.B.: A methodology for
model-based greenhouse design: Part 1, A greenhouse climate model for a broad range of
designs and climates. Biosyst. Eng. 110, 363–377 (2011)
14. Su, Y., Xu, L., Li, D.: Adaptive fuzzy control of a class of MIMO nonlinear system with actuator
saturation for greenhouse climate control problem. IEEE Trans. Autom. Sci. Eng. 13, 772–788
(2016)
15. Chen, L., Du, S., Xu, D., He, Y., Liang, M.: Sliding mode control based on disturbance observer
for greenhouse climate systems. Math. Probl. Eng. 2018, 2071585 (2018)
16. Bennis, N., Duplaix, J., Enéa, G., Haloua, M., Youlal, H.: Greenhouse climate modelling and
robust control. Comput. Electron. Agric. 61, 96–107 (2008)

17. Gurban, E.H., Andreescu, G.: Comparison of modified Smith predictor and PID controller tuned
by genetic algorithms for greenhouse climate control. In: 2014 IEEE 9th IEEE International
Symposium on Applied Computational Intelligence and Informatics (SACI), pp. 79–83 (2014).
https://doi.org/10.1109/SACI.2014.6840039
18. Coelho, J.P., de Moura Oliveira, P.B., Cunha, J.B.: Greenhouse air temperature predictive
control using the particle swarm optimisation algorithm. Model. Control Agric. Process. 49,
330–344 (2005)
19. Zou, Q., Ji, J., Zhang, S., Shi, M., Luo, Y.: Model predictive control based on particle swarm
optimization of greenhouse climate for saving energy consumption. In: 2010 World Automation
Congress, pp. 123–128 (2010)
20. Chen, L., Du, S., He, Y., Liang, M., Xu, D.: Robust model predictive control for greenhouse
temperature based on particle swarm optimization. Inf. Process. Agric. 5, 329–338 (2018)
21. Guzmán, J.L., Rodríguez, F., Berenguel, M., Dormido, S.: Virtual lab for teaching greenhouse
climatic control. In: 16th IFAC World Congress, vol. 38, pp. 79–84 (2005)
Chapter 40
Cluster-Based Energy-Efficient Routing
in Internet of Things

Amol Dhumane, Shwetambari Chiwhane, K. Mangore Anirudh, and Srinivas Ambala

Abstract A lot of improvements are taking place in communication technology. Networks are becoming ubiquitous and pervasive. Devices connecting to the Internet are increasing tremendously due to the inception of the Internet of Things (IoT) in the current technology domain, and this count is going to become very large in the upcoming future. The majority of these devices are low-power and battery operated, and they need to handle their energy efficiently while communicating with the rest of the neighboring devices. The approach adopted in this paper deals with such devices and their energy usage. It uses the dragonfly algorithm for the selection of cluster heads and changes the selection of cluster heads regularly over a period of time so that the usage of energy is balanced efficiently across the network. The selection of cluster heads is done based on a specially designed fitness function. At the end, the experimental outcomes show that the proposed model outperforms the conventional approaches.

Keywords Cluster head · Energy · Position · Distance · IoT · Delay

40.1 Introduction

Nowadays, IoT has emerged as a promising area for researchers and academicians. There are a lot of challenges in this field.

A. Dhumane (B) · S. Ambala


Pimpri Chinchawad College of Engineering, Pune, Maharashtra, India
e-mail: amol.dhumane@pccoepune.org
S. Ambala
e-mail: ambala.srinivas@pccoepune.org
S. Chiwhane
Symbiosis Institute of Technology, Pune, Maharashtra, India
e-mail: shwetambari.chiwhane@sitpune.edu.in
K. Mangore Anirudh
Gharda Institute of Technology, Ratnagiri, Maharashtra, India
e-mail: akmangore@git-india.edu.in


Establishing communication between heterogeneous devices is one of them. The Internet of Things (IoT) tries to tie several different things and technologies together with the already existing network [1]. The key intention of IoT is to make human life more comfortable. An IoT network contains a huge number of heterogeneous devices of varying sizes, from very small to very large. Sensors and actuators are key components of any IoT network. The sensors sense the data continuously or periodically, as programmed. This data is further transported to the sink station for processing [2–4]. The data has limited time validity, so it must reach the sink station within a limited time period; otherwise, there are chances that the data becomes useless. During data routing, the context of the network plays a significant role, so it is essential that data routing be context-aware in order to utilize the network resources properly [5]. An IoT network consists of a huge number of small nodes having limited energy and limited computational capability, so it is necessary to route the data toward the destination using an energy-efficient mechanism to improve the network lifetime.
The key contribution of this work is to propose the dragonfly algorithm for energy-efficient clustering as well as cluster head selection in IoT.
This paper is organized as follows: The literature survey is stated in Sect. 40.2. Section 40.3 discusses the network model and the energy model; the dragonfly algorithm is discussed in Sect. 40.4. Section 40.5 focuses on the cluster head selection mechanism based on five steps. In Sect. 40.6, along with a comparative analysis, simulation is done for performance evaluation of the proposed algorithm. Finally, the paper is concluded in Sect. 40.7.

40.2 Literature Survey

40.2.1 Grey Wolf Optimization Algorithm

This algorithm [6], proposed by Mirjalili et al., is motivated by the social leadership and hunting activities of grey wolves. Like other heuristic algorithms, a pack of wolves begins the searching process, and cluster head (CH) selection can be done using this mechanism.
GWO is a meta-heuristic algorithm which solves most optimization problems. In terms of usage and implementation, it behaves in a similar way to genetic algorithms. Its convergence rate is faster, which is why GWO is often chosen over other meta-heuristic techniques. In this algorithm, the search space is reduced in a continuous manner, and it also avoids local optima.
Considering the hierarchical structure of community supremacy among the wolves, the candidate solutions are classified as alpha (α), described as the best and most optimized solution, followed by beta (β) and delta (δ) as the second-best and third-best solutions. The remaining solutions, apart from α, β, and δ, are the omega (ω) solutions, which are the least fit. This optimization process is carried out under the guidance of the α, β, δ, and ω parameters.
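A compact sketch of the standard GWO position update is given below for a generic minimization problem; it is not specific to the cluster-head fitness function used later in this paper, and the swarm size and test function are illustrative.

```python
# Sketch: one generation of Grey Wolf Optimization for a generic minimization problem.
import numpy as np

rng = np.random.default_rng(1)

def gwo_step(wolves, fitness, a):
    """Move every wolf toward the alpha, beta, and delta (three best) solutions."""
    order = np.argsort([fitness(w) for w in wolves])
    alpha, beta, delta = wolves[order[0]], wolves[order[1]], wolves[order[2]]
    new_wolves = np.empty_like(wolves)
    for i, wolf in enumerate(wolves):
        guided = []
        for leader in (alpha, beta, delta):
            A = 2 * a * rng.random(wolf.shape) - a        # exploration/exploitation coefficient
            C = 2 * rng.random(wolf.shape)
            D = np.abs(C * leader - wolf)                 # distance to the leader
            guided.append(leader - A * D)
        new_wolves[i] = np.mean(guided, axis=0)           # average of the three guided positions
    return new_wolves

# Example: minimize the sphere function; 'a' decays linearly from 2 to 0 over iterations.
sphere = lambda x: float(np.sum(x ** 2))
wolves = rng.uniform(-5, 5, (10, 4))
for t in range(100):
    wolves = gwo_step(wolves, sphere, a=2 - 2 * t / 100)
print(min(sphere(w) for w in wolves))
```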

40.2.2 The Artificial Bee Colony Algorithm

This algorithm [7] is proposed by Karaboga and Basturk [29]. This algorithm is
popular due to its robustness and simplicity. It is one of good intelligent and heuristic
algorithms for solving the clustering-based optimization problems. It requires large
amount of time for solving the clustering problems. It has been used to solve a variety
of problems having complex nature.
In the ABC algorithm, the artificial bee colony is divided into three categories: onlookers, employed bees, and scouts. The employed bees share information about their food sources with the onlooker bees. Scouts search for new food sources randomly, while onlooker bees wait in the colony and decide which food source to exploit based on the information shared by the employed bees. There is one employed bee for every food source, so the number of employed bees always equals the number of food sources. When a particular food source has been exhausted, the employed bee associated with it becomes a scout. The position of a food source represents a possible solution to the optimization problem, and the nectar amount of a food source represents the fitness value of the associated solution. Onlookers are placed on the food sources through a probability-based selection process; the probability that a food source is preferred by onlookers increases with its nectar amount.
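The probability-based selection by onlookers can be illustrated with a small roulette-wheel sketch. This is an assumption-level illustration, not code from [7]; the nectar values are hypothetical.

```python
# A minimal sketch of the onlooker bees' probability-based food-source
# selection: the chance of picking a source grows with its nectar amount.
import random

def onlooker_select(nectar):
    """Roulette-wheel selection over food-source fitness (nectar) values."""
    total = sum(nectar)
    r, acc = random.uniform(0, total), 0.0
    for idx, n in enumerate(nectar):
        acc += n
        if r <= acc:
            return idx
    return len(nectar) - 1

nectar_amounts = [0.9, 0.3, 0.5, 0.1]    # hypothetical fitness of 4 food sources
print(onlooker_select(nectar_amounts))   # index 0 is chosen most often
```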

40.2.3 Ant Colony Optimization Algorithm

Ant colony optimization (ACO) [8] is a bioinspired heuristic algorithm similar to the PSO technique. It is derived from the behaviour of ant colonies searching for food. Every member of the colony tries to discover a frequently used path to the food source. In nature, ants secrete signalling pheromone to mark the path back to the source. Following ants prefer the path with the stronger pheromone trail, so the change in pheromone concentration determines the choosing probability of every path.
The ACO technique is designed by analysing this food-searching effort of ants. It uses a graphical framework in which the search area is represented as a graph and the agents (ants) traverse the points of this graph. As the agents move, pheromone is deposited to mark the most preferred paths to the source. Every ant starts from a randomly chosen point on the graph, and follower ants use the pheromone deposited on the path to find a more capable route to the food target. Consequently, the greater the deposition of pheromone on a route, the closer it is to the optimal solution.
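The pheromone bookkeeping described above can be sketched as follows. The evaporation rate, deposit amount, and path values are illustrative assumptions rather than parameters from [8].

```python
# A minimal sketch of ACO pheromone dynamics: trails evaporate over time, ants
# reinforce the paths they traverse, and the choice probability of each path
# follows its relative trail strength.
def evaporate(pheromone, rho=0.1):
    """Reduce every trail by the evaporation rate rho."""
    return [(1.0 - rho) * tau for tau in pheromone]

def deposit(pheromone, path_index, amount=1.0):
    """An ant traversing a path reinforces its trail."""
    pheromone[path_index] += amount
    return pheromone

def choice_probabilities(pheromone):
    total = sum(pheromone)
    return [tau / total for tau in pheromone]

trails = [1.0, 1.0, 1.0]                  # three hypothetical candidate paths
trails = deposit(evaporate(trails), 1)    # path 1 was used by an ant this round
print(choice_probabilities(trails))       # path 1 now has the highest probability
```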

40.3 Network System Model

The IoT network contains a huge number of nodes; some of them may be rich in resources, while others may be constrained in terms of memory, computational power, or energy. Among these, the energy of a node is directly related to its lifetime: if a node has less energy, its lifetime is considered low. This must be taken into account when dealing with the transmission and reception of data, since unnecessary data handling may result in the earlier death of an energy-constrained node. For this reason, the sensor nodes in the deployed network are organized into clusters, and in each cluster one node is selected as cluster head depending on its abilities and properties. Every node senses and collects information on a periodic or query basis and sends it to the base station through the cluster head (CH) (Fig. 40.1).
As a considerable amount of energy is required to transmit the data, the positions of the cluster head and of the other normal nodes play a significant role while constructing a cluster. The dimension of the area and the position of the base station are shown in the figure below. All nodes except the base station are randomly deployed in this area at the time of simulation.
The dimension of the simulation area is taken as U_t × V_t, and the base station is situated at the center of the area, (0.5U_t, 0.5V_t). Every other node N_k is situated at (X_k, Y_k). It is assumed that, at the time of simulation, a total of n nodes are deployed in the network and C cluster heads are selected from them, so

Fig. 40.1 Schematic representation of deployed network



the remaining normal nodes on the field will be n − C. These nodes sense the data and send it to the base station through the cluster head.
Energy Model
As per the energy model stated in [9], the energy dissipated while transmitting and receiving data at the cluster heads and the normal nodes is computed based on the distance between them.
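A minimal sketch of this deployment and of a distance-dependent energy model is given below. Since the exact radio parameters of [9] are not reproduced in this chapter, the sketch uses a generic first-order radio model with illustrative constants.

```python
# A minimal sketch of the network model: n nodes placed uniformly at random in
# a U_t x V_t field with the base station at the centre. The energy figures use
# a generic first-order radio model with illustrative constants as a stand-in.
import random, math

U_t, V_t, n = 100.0, 100.0, 100
base_station = (0.5 * U_t, 0.5 * V_t)
nodes = [(random.uniform(0, U_t), random.uniform(0, V_t)) for _ in range(n)]

E_ELEC, E_AMP = 50e-9, 100e-12            # J/bit and J/bit/m^2 (assumed values)

def tx_energy(bits, distance):
    """Energy to transmit `bits` over `distance` (free-space term only)."""
    return bits * E_ELEC + bits * E_AMP * distance ** 2

def rx_energy(bits):
    """Energy to receive `bits`."""
    return bits * E_ELEC

d = math.dist(nodes[0], base_station)
print(tx_energy(4000, d), rx_energy(4000))
```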

40.4 Proposed Method

The intent of this work is to implement an energy-efficient routing protocol that improves the lifetime of the network. Figure 40.2 shows the schematic representation of the protocol. To achieve the objectives of energy-efficient routing and extending the lifetime of the deployed nodes, the data packets are transferred via an efficient route and by appropriately selecting the cluster heads. To this end, we use the dragonfly algorithm to select the optimal cluster heads based on several objectives: the position of the node, the lifetime of the link, the intra-cluster communication delay, and the residual energy of the nodes.
About Dragonfly Algorithm
The dragonfly algorithm [10] is a bioinspired algorithm inspired by the behaviour of dragonflies, of which more than 3000 species exist on earth. The major tasks of a swarm are to find food sources and to protect itself from enemies. Accordingly, the swarm is always attracted towards the food source (optimum solution) and distracted from the enemies (worst solution). The swarm constantly struggles for survival: it tries to reach the food source while staying away from its predators.
The swarms are of two types: static and dynamic. In a static swarm, small groups of dragonflies move back and forth in a small area for hunting purposes. The movements

Fig. 40.2 Schematic representation of proposed methodology



are local in this case, and abrupt changes may sometimes occur in the flying paths of the dragonflies in the static swarm [11]. In contrast, in a dynamic swarm, large numbers of dragonflies fly together over comparatively larger distances in one direction [12].
The static and dynamic swarming behaviours of dragonflies are analogous to the two important phases of meta-heuristic optimization algorithms known as exploration and exploitation. The exploration mechanism searches for the optimal solution at the global level, while the exploitation mechanism takes over when the search becomes concentrated at the local level.
The authors in [13] discussed swarm behaviour on the basis of three principles: separation, alignment, and cohesion. The separation mechanism avoids collisions between individuals and their neighbours. When the swarm moves from one location to another, the velocity of each individual must be matched to that of its neighbours; this is handled by the alignment mechanism. Cohesion deals with the inclination of individuals towards the center of mass of the neighbourhood.
For survival, the swarm has to update its positions continuously, which is governed by the five factors given below.
The separation mechanism is calculated as follows:

S_{ci} = \sum_{k=1}^{N} (Y - Y_k)    (40.1)

Here, the position of the current individual is represented by Y, Y_k represents the position of the kth neighbouring individual, and N represents the number of individuals in the neighbourhood.
Velocity matching is done using the alignment mechanism, calculated as given below:

A_{ci} = \frac{\sum_{k=1}^{N} Ve_k}{N}    (40.2)

where Ve_k denotes the velocity of the kth neighbouring individual.
The cohesion mechanism is mathematically modelled as follows:

C_{ci} = \frac{\sum_{k=1}^{N} Y_k}{N} - Y    (40.3)

where Y denotes the position of the current individual, Y_k represents the positions of the neighbouring individuals, and N represents the number of individuals in the neighbourhood.
The dragonflies are attracted by the food source (solution). This attraction mechanism is represented by the following equation:

F_{ci} = Y^{+} - Y    (40.4)

where Y denotes the current position of the individual and Y^{+} denotes the position of the food source (solution), so Y^{+} - Y is the distance between the food source and the current individual.
Dragonflies need to maintain their safety by moving away from their enemies. Here, the enemies are considered to be the worst solutions in the search space. This distraction mechanism is mathematically modelled as:

E_{ci} = Y^{-} + Y    (40.5)

where Y^{-} denotes the position of the enemy.
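The five factors of Eqs. (40.1)–(40.5) can be computed directly, as in the small numpy sketch below; the positions, velocities, food, and enemy values are arbitrary examples.

```python
# A minimal numpy sketch of the five dragonfly factors in Eqs. (40.1)-(40.5).
# Y is the current individual, `neigh` its neighbours' positions, `vel` their
# velocities, `food` the best solution found so far and `enemy` the worst.
import numpy as np

def dragonfly_factors(Y, neigh, vel, food, enemy):
    S = np.sum(Y - neigh, axis=0)        # separation,        Eq. (40.1)
    A = np.mean(vel, axis=0)             # alignment,         Eq. (40.2)
    C = np.mean(neigh, axis=0) - Y       # cohesion,          Eq. (40.3)
    F = food - Y                         # food attraction,   Eq. (40.4)
    E = enemy + Y                        # enemy distraction, Eq. (40.5)
    return S, A, C, F, E

Y = np.array([0.5, 0.5])
neigh = np.array([[0.4, 0.6], [0.7, 0.3]])
vel = np.array([[0.1, -0.1], [0.0, 0.2]])
print(dragonfly_factors(Y, neigh, vel,
                        food=np.array([1.0, 1.0]),
                        enemy=np.array([0.0, 0.0])))
```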

40.5 Selection of Cluster Heads

To extend the lifetime of the deployed IoT network, we need to extend the lifetime of the individual IoT nodes. This is achieved using the dragonfly algorithm, and the cluster head selection process is explained in this section. The main parameters for selecting a node as CH are the node's energy, the lifetime of its links, its position in the vicinity, and the intra-cluster delay it experiences during communication. Our aim is to select as cluster head the node that outperforms the other nodes in its vicinity with respect to all of these parameters. All the parameters are combined, together with weighting constants, in a fitness function used to select the optimal cluster head. The selection process of the cluster heads is carried out in the five steps described below.
Step I: Population and Step Vector Initialization
Initially, the proposed algorithm randomly initializes the solution set. This is termed the dragonflies' population PD_i and is represented as given below:

PD_i = {i = 1, 2, 3, . . . , m}

where {i = 1, 2, 3, . . . , m} are the randomly selected nodes which may act as cluster heads in the ith solution set. Here, PD_i is termed a search agent containing m randomly selected cluster heads.
In PD_i, the index i satisfies 1 ≤ i ≤ G; that is, there are G random solutions, and every solution contains m cluster heads.
The step vectors are used to gradually update the positions of the dragonflies according to the global and local optimum solutions in the search space. The functionality of the step vector is equivalent to that of the velocity vector in the particle swarm optimization (PSO) algorithm; the dragonfly algorithm is built on the PSO framework and uses the step vector to mimic the PSO velocity vector. The step vector is initialized as:

ΔPD_i = {i = 1, 2, 3, . . . , m}

This vector defines the direction of movement of the search agents in the search space.
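A minimal sketch of this initialization step is shown below; the population size G, the number of cluster heads m, and the total node count are assumed values.

```python
# A minimal sketch of Step I: G random search agents, each holding m randomly
# chosen candidate cluster-head node ids, plus a zero-initialised step vector
# per agent (playing the role of PSO's velocity vector).
import random

def init_population(G, m, n_nodes):
    population = [random.sample(range(n_nodes), m) for _ in range(G)]
    step_vectors = [[0.0] * m for _ in range(G)]
    return population, step_vectors

population, steps = init_population(G=10, m=5, n_nodes=100)
print(population[0], steps[0])
```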
Step II: Step Vector Calculation
As discussed in Sect. 40.4, the step vector is calculated based on separation, alignment, cohesion, the position of the food source, and the position of the enemy. Using Eqs. (40.1)–(40.5), the step vector is calculated as given below:

\Delta PD_i^{t+1} = (s\,S_{ci} + a\,A_{ci} + co\,C_{ci} + f\,F_{ci} + e\,E_{ci}) + W\,\Delta PD_i^{t}    (40.6)

where the separation factor of the ith individual is denoted by S_{ci}; A_{ci} denotes the alignment factor; C_{ci}, E_{ci}, and F_{ci} denote the cohesion, enemy, and food source factors of the ith individual, respectively; s, a, co, f, and e are the separation, alignment, cohesion, food source, and enemy weights, respectively; W is the inertia weight used while updating the position of the search agent in the search space; and t is a counter showing the iteration count.
Step III: Position Updating
The position vectors are calculated after the step vectors. If the dragonfly has at least one neighbour associated with it, its position is updated as given below:

Y_{t+1} = Y_t + \Delta Y_{t+1}    (40.7)

where t denotes the current iteration and t + 1 the next iteration.
The dragonflies fly through the search space using a random walk to improve their stochastic behaviour and the exploration of the artificial dragonflies. If the dragonfly has no neighbours associated with it, the position is updated with Eq. (40.8):

Y_{t+1} = Y_t + Y_t \times L(D)    (40.8)

where D represents the dimension, L represents the Lévy flight, t denotes the current iteration, and t + 1 the next iteration. Lévy flights represent a strategy for searching for a target randomly in a completely unknown environment; this strategy is widely observed in a large number of animal species when they search for food.

The Lévy flight is calculated as:

L(D) = 0.01 \times \frac{\alpha_1 \times \sigma}{|\alpha_2|^{1/\beta}}    (40.9)

where β is a constant, α_1 and α_2 are two random numbers between 0 and 1, and σ is calculated as given below:

\sigma = \left[ \frac{\Gamma(1+\beta)\,\sin(\pi\beta/2)}{\Gamma\!\left(\frac{1+\beta}{2}\right)\,\beta\,2^{(\beta-1)/2}} \right]^{1/\beta}    (40.10)

where \Gamma(m) = (m - 1)!.
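Putting Eqs. (40.6)–(40.10) together, a compact numpy sketch of the step-vector update, the two position-update rules, and the Lévy flight term could look as follows. The behaviour weights (s, a, co, f, e, W) shown are illustrative assumptions, not the authors' tuned values.

```python
# A minimal numpy sketch of Eqs. (40.6)-(40.10): step-vector update, the two
# position-update rules and the Levy flight term. Weights are assumed values.
import numpy as np
from math import gamma, sin, pi

def levy(dim, beta=1.5):
    # sigma per Eq. (40.10), Levy step per Eq. (40.9)
    sigma = ((gamma(1 + beta) * sin(pi * beta / 2)) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    a1, a2 = np.random.rand(dim), np.random.rand(dim)
    return 0.01 * a1 * sigma / np.abs(a2) ** (1 / beta)

def update(Y, step, S, A, C, F, E, has_neighbours,
           s=0.1, a=0.1, co=0.7, f=1.0, e=1.0, W=0.9):
    step = s * S + a * A + co * C + f * F + e * E + W * step   # Eq. (40.6)
    if has_neighbours:
        return Y + step, step                                  # Eq. (40.7)
    return Y + Y * levy(Y.size), step                          # Eq. (40.8)

zero = np.zeros(2)
Y, step = update(np.array([0.5, 0.5]), zero, zero, zero, zero,
                 np.array([0.2, 0.3]), zero, has_neighbours=True)
print(Y, step)
```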

Step IV: Fitness Function
Optimization models are driven by fitness functions. In this paper, the fitness model contains several parameters for selecting cluster heads that support a well-organized data transmission process. The fitness function is calculated on the basis of the link lifetime, the transmission delay at the node, and the current energy of the node. For better transmission and reception of data, the node from which the base station accepts data should be close to the base station; considering this fact, the node's location (position) is added as a fourth parameter in the fitness function. As a result, nodes closer to the base station are mostly chosen as cluster heads in the experiments.
Similarly, a larger link lifetime is considered better, the delay should always be small, and the maximum node energy is considered the most suitable value for the experiments. The fitness value of a search agent is calculated as shown below:

\text{fitness}(i) = \omega_1 f_1 (1 - P_i^{loc}) + \omega_2 f_2 (1 - P_i^{del}) + \omega_3 f_3 P_i^{llt} + \omega_4 f_4 P_i^{energ}    (40.11)

where P_i^{loc} is the position term, P_i^{del} denotes the delay, P_i^{energ} states the current energy of the node, and P_i^{llt} denotes the link lifetime. The sum of ω_1, ω_2, ω_3, and ω_4 is unity.
P_i^{loc} must be small to maximize the network lifetime. It is represented by the following equation:

P_i^{loc} = \frac{\sum_{x=1}^{h} \sum_{n=1,\, n \in x}^{H_C} \left( \lVert d_{N_x} - d_{C_n} \rVert + \lVert d_{C_n} - d_{B} \rVert \right)}{U_t \ast V_t}

where the numerator describes the distance from the normal nodes to the base station (via their cluster heads), normalized by the denominator, which captures the area over which the network is deployed.
The second term in the fitness function is P_{(i)}^{del}, and it deals with the intra-cluster delay based on the number of nodes participating in the communication within each cluster. The intra-cluster delay clearly increases with the number of nodes in each cluster of the search agent.

P_{(i)}^{del} = \frac{\max_{n=1,\ldots,H_C} HC_n}{h}

where h is the number of nodes in the network, H_C is the number of cluster heads on the field for that search agent, and HC_n denotes the number of nodes in the cluster of the nth cluster head. P_i^{energ} states the energy level of the nodes in the search agent:

P_i^{energ} = \frac{1}{2} \left( \frac{1}{h} \sum_{x=1}^{h} E(D_N^x) + \frac{1}{H_C} \sum_{n=1}^{H_C} E(D_C^n) \right)

where E(D_N^x) and E(D_C^n) are the energies associated with the normal nodes and the cluster head nodes, respectively. To maximize the value of the fitness function, P_i^{energ} must be high.
In every cluster, there is a link between the cluster head node and its corresponding IoT nodes. The link lifetime P_i^{llt} is stated with the help of the following equation:

P_i^{llt} = \frac{1}{h \ast H_C} \sum_{y=1}^{h} \sum_{n=1,\, n \in y}^{H_C} L_{yn}

where L_{yn} is the link between the nth cluster head node and the yth IoT node, which is further calculated as:

L_{yn} = \frac{E(D_N^y) + E(D_C^n)}{P_s + R + P_r + \lVert D_N^y - D_C^n \rVert}

where P_s represents the packet sending rate of the yth node, R is the transmission range, and P_r is the packet receiving rate of the yth node.
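A minimal sketch of evaluating Eq. (40.11) for one search agent is given below. It assumes the four terms are already normalized to [0, 1] as described above, folds the f-factors into the weights for simplicity, and uses the weight values the authors finally adopt (0.3, 0.3, 0.1, 0.3).

```python
# A minimal sketch of the weighted fitness in Eq. (40.11). The four inputs are
# assumed to be already normalised to [0, 1]; the f-factors are folded into
# the weights as a simplifying assumption.
def fitness(p_loc, p_del, p_llt, p_energ,
            w1=0.3, w2=0.3, w3=0.1, w4=0.3):
    """Higher is better: low location cost, low delay, long link life, high energy."""
    return (w1 * (1.0 - p_loc) + w2 * (1.0 - p_del)
            + w3 * p_llt + w4 * p_energ)

print(fitness(p_loc=0.2, p_del=0.1, p_llt=0.8, p_energ=0.6))
```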
Step V: Termination of the Algorithm
The process of updating the positions of the search agents continues until the maximum number of iterations is reached.

40.6 Results and Discussion

The parameters ω_1, ω_2, ω_3, and ω_4 of the fitness function were selected based on the following observations after executing the proposed algorithm for several rounds. The comparative analysis reveals that, in the majority of the alive-node and network-energy analyses, the proposed algorithm performs better than the other algorithms when ω_1 = 0.3, ω_2 = 0.3, ω_3 = 0.1, and ω_4 = 0.3 (Tables 40.1 and 40.2).
We use these values of the weighting parameters in the subsequent comparisons of the proposed algorithm with the existing algorithms.
Simulation Setup
This section discusses the simulation setup and various parameters used in the
simulation and their associated values (Table 40.3).
Comparative Analysis
In this section, the proposed algorithm is executed and evaluated, and its performance is compared with traditional algorithms such as ant colony optimization (ACO), artificial bee colony (ABC), and the grey wolf optimization (GWO) algorithm. The performance of the proposed algorithm is analyzed in terms of the number of alive nodes after a specified number of rounds and the amount of residual energy across the

Table 40.1 Alive nodes analysis of the proposed algorithm after 2000 rounds
Values Deployed nodes
100 200 300 500
ω1 = 0.5, ω2 = 0.2, ω3 = 0.2, ω4 = 0.1 9 19 25 44
ω1 = 0.4, ω2 = 0.2, ω3 = 0.2, ω4 = 0.2 8 17 24 42
ω1 = 0.3, ω2 = 0.3, ω3 = 0.1, ω4 = 0.3 10 19 25 46
ω1 = 0.7, ω2 = 0.1, ω3 = 0.1, ω4 = 0.1 9 18 23 41
ω1 = 0.2, ω2 = 0.3, ω3 = 0.4, ω4 = 0.1 10 17 21 42

Table 40.2 Network energy analysis of the proposed algorithm after 2000 rounds
Values Deployed nodes
100 200 300 500
ω1 = 0.5, ω2 = 0.2, ω3 = 0.2, ω4 = 0.1 0.0327 0.0330 0.0301 0.0308
ω1 = 0.4, ω2 = 0.2, ω3 = 0.2, ω4 = 0.2 0.0318 0.0322 0.0318 0.0311
ω1 = 0.3, ω2 = 0.3, ω3 = 0.1, ω4 = 0.3 0.0327 0.0331 0.0325 0.0335
ω1 = 0.7, ω2 = 0.1, ω3 = 0.1, ω4 = 0.1 0.0325 0.0328 0.0326 0.0334
ω1 = 0.2, ω2 = 0.3, ω3 = 0.4, ω4 = 0.1 0.0320 0.0321 0.0320 0.0332

Table 40.3 Values used in this simulation


Parameter Value
ω1 0.3
ω2 0.3
ω3 0.1
ω4 0.3
Parameters in energy model are same as in [14]

Table 40.4 Comparative analysis


Total rounds  Algorithm  Number of alive nodes (deployed nodes: 100, 200, 300)  Residual energy level of the network (deployed nodes: 100, 200, 300)
100 ACO 8 16 24 0.0289 0.0299 0.0300
GWO 8 18 25 0.0307 0.0290 0.02985
ABC 7 14 23 0.0277 0.0320 0.0306
CBEER-DA 10 19 25 0.0327 0.0331 0.0325
200 ACO 5 10 13 0.0320 0.0302 0.0307
GWO 5 11 14 0.0306 0.0277 0.0301
ABC 4 10 14 0.0298 0.0322 0.0319
CBEER-DA 6 12 14 0.0330 0.0322 0.0324
300 ACO 2 3 4 0.0306 0.0300 0.0299
GWO 1 3 4 0.0325 0.0323 0.0277
ABC 3 2 5 0.0289 0.0315 0.0327
CBEER-DA 3 3 5 0.0330 0.0329 0.0327

network. The status of both the parameters is analyzed after 100, 200, 300, and 500
rounds (Table 40.4).

40.7 Conclusion

In this paper, cluster-based energy-efficient routing using the dragonfly optimization algorithm is proposed for enhancing the lifetime of an IoT network deployed randomly on a field. The algorithm works in two stages: (1) cluster formation and cluster head selection, and (2) routing the data. A fitness function based on the specified parameters operates on the solution set to find the optimal cluster heads. The cluster head role is rotated over a fixed period of time to avoid excessive use of a node's constrained resources. This leads to a more equal usage of energy by all the nodes in the network, which further helps to balance and maximize the overall lifespan of the nodes deployed across the network. The comparison of CBEER-DA with various existing algorithms shows that it improves the network lifespan compared to those algorithms.

References

1. Dhumane, A., Prasad, R.: Routing challenges in internet of things. CSI Communications (2015)
2. Hoang, D., Kumar, R., Panda, S.: Realisation of a cluster-based protocol using fuzzy C-means
algorithm for wireless sensor networks. IET Wirel. Sens. Syst. 3(3), 163–171 (2013)
3. Singh, B., Lobiyal, D.: Energy-aware cluster head selection using particle swarm optimization
and analysis of packet retransmissions in WSN. Procedia Technol. 4, 171–176 (2012)
4. Dhumane, A., Bagul, A., Kulkarni, P.: A review on routing protocol for low power and lossy
networks in IoT. Int. J. Adv. Eng. Glob. Technol. 3(12), 1440–1444 (2015)
5. Dhumane, A., Guja, S., Deo, S., Prasad, R.: Context awareness in IoT routing. In: 2018
Fourth International Conference on Computing Communication Control and Automation
(ICCUBEA), Pune, India, pp. 1–5. https://doi.org/10.1109/ICCUBEA.2018.8697685 (2018)
6. Mirjalili, S., Mirjalili, S.M., Lewis, A.: Grey Wolf Optimizer. Adv. Eng. Softw. 69, 46–61.
https://doi.org/10.1016/j.advengsoft.2013.12.007 (2014)
7. Okdem, S., Karaboga, D., Ozturk, C.: An application of wireless sensor network routing based
on artificial bee colony algorithm. In: 2011 IEEE Congress of Evolutionary Computation
(CEC), New Orleans, LA, pp. 326–330 (2011)
8. Dhumane, A., Prasad, R., Prasad, J.: Routing issues in internet of things: A survey. In: Proceed-
ings of the International MultiConference of Engineers and Computer Scientists, Hong Kong,
China, pp. 16–18 (2016)
9. Dhumane, A., Prasad, R.: Multi-objective fractional gravitational search algorithm for energy
efficient routing in IoT. Wirel. Netw. 25, 399–413. https://doi.org/10.1007/s11276-017-1566-2
(2019)
10. Mirjalili, S.: Dragonfly algorithm: a new meta-heuristic optimization technique for solving
single-objective, discrete, and multi-objective problems. Neural Comput. Appl. 27, 1053–1073.
https://doi.org/10.1007/s00521-015-1920-1 (2016)
11. Wikelski, M., Moskowitz, D., Adelman, J.S., Cochran, J., Wilcove, D.S., May, M.L.: Simple
rules guide dragonfly migration. Biol. Lett. 2, 325–329 (2006)
12. Russell, R.W., May, M.L., Soltesz, K.L., Fitzpatrick, J.W.: Massive swarm migrations of
dragonflies (Odonata) in eastern North America. Am. Midl. Nat. 140(2), 325–334 (1998)
13. Dhumane, A., Midhunchakkaravarthy, D.: Multi-objective whale optimization algorithm using
fractional calculus for green routing in internet of things. Int. J. Adv. Sci. Technol. 29(3s),
1905–1922 (2020)
14. Lin, Y., Zhang, J., Chung, H.S.-H., Ip, W.H., Li, Y.: An ant colony optimization approach for
maximizing the lifetime of heterogeneous wireless sensor networks. IEEE Trans. Syst. Man
Cybern. Part C: Appl. Rev. 42(3), 408–420 (2012)
Chapter 41
Explainable AI for Sentiment Analysis

N. Pavitha, Pranav Ratnaparkhi, Azfar Uzair, Aashay More, Swetank Raj,


and Prathamesh Yadav

Abstract Explainable artificial intelligence (XAI) is a set of practices and methods that allows human users to understand and trust the results and outputs created by machine learning algorithms. The whole calculation process of such algorithms is often what is commonly called a "black box" that cannot be easily explained. We therefore use XAI to explain these models so that we can understand the working of an AI model or any neural network. In this work, we use XAI to understand an NLP model for sentiment analysis. The natural language processing (NLP) model for sentiment analysis helps us predict whether a given sentiment is positive, negative, or irrelevant. We use a neural network to build this model. In this research paper, we apply an XAI tool to this NLP model so that we can understand the working of each layer of the neural network and how the model takes a decision at every step.

Keywords Explainable AI (XAI) · Natural language processing (NLP) · Valence · Arousal · Sentiments · Heatmap

41.1 Introduction

According to Wikipedia, the word "explain" means "to make something clear or easy to understand; to show the meaning of." In scientific research, a scientific explanation should include at least two parts: (1) the object that is to be explained ("explanandum" in Latin) and (2) the content that does the explaining ("explanans" in Latin) [1]. Without new explanatory processes, the decisions of deep neural networks (DNNs) can be explained neither by the neural network itself, nor by an external descriptive component, nor even by the system engineer [1].
Recent achievements in the machine learning (ML) field show that the use of new artificial intelligence (AI) offers substantial benefits in a variety of fields [2]. However, many of these programs are unable to explain their own individual decisions and actions to their users. Explanations might not be important in certain kinds of

N. Pavitha (B) · P. Ratnaparkhi · A. Uzair · A. More · S. Raj · P. Yadav


Vishwakarma Institute of Technology, Upper Indira Nagar, Pune, India
e-mail: pavitha.n@vit.edu


AI programs, and some AI researchers argue that the emphasis on explainability is misplaced, too hard to achieve, and perhaps unnecessary. However, in many critical security, medical, financial, and legal settings, explanations are needed for users to understand, trust, and effectively manage these new, automatically intelligent partners (see recent work [1–3]).
Recent artificial intelligence breakthroughs have largely come from new machine learning techniques that build their models internally from data [4]. These include support vector machines (SVMs), random forests, reinforcement learning (RL), deep learning (DL) neural networks, and probabilistic graphical models [4]. Although such models show good performance, they might not be transparent. There can be a natural tension between ML performance (such as predictive accuracy) and interpretability: often the most effective methods (such as deep learning) are the least explainable, while the most interpretable ones (such as decision trees) can be less accurate. Interpreting and explaining ML algorithms has therefore become a pressing task: who is to blame when things go wrong? Can we explain why things go wrong? And when things go well, do we know why and how, so that we can use them effectively? Many papers have suggested various methods and frameworks for interpretability, and the topic of explainable artificial intelligence (XAI) has become a major theme in the ML research community [2].

41.1.1 What is XAI?

The main focus of an explainable AI (XAI) system is to make its working more understandable to humans by providing them with explanations [2]. Some general principles help us create workable, human-understandable artificial intelligence systems: the XAI system should be able to describe its capabilities and understanding; explain what it has done, what it is doing now, and what it will do next; and disclose the important information it is acting on [5]. However, every explanation is set within a context that depends on the task, abilities, and expectations of the user of the artificial intelligence system. Definitions of interpretability and explainability are therefore context-dependent and cannot be defined independently of a domain. Explanations can be complete or partial. Fully interpretable models provide full and complete explanations. Partially interpretable models reveal important parts of their reasoning process [1]. Interpretable models obey "interpretability constraints" that are defined according to the domain (e.g., monotonicity with respect to certain variables, or known relationships among variables), whereas black-box or unconstrained models do not necessarily correspond to these constraints. Partial explanations may include variable importance estimates, local models that approximate global models at specific points, and saliency maps.
Since the beginning of AI research, scientists have argued that intelligent systems must be able to explain their results, especially when it comes to decisions [6]. Popular deep learning libraries have begun to include explainable-AI components, such as PyTorch Captum and TensorFlow tf-explain. In addition, a growing number of evaluation criteria for interpretability (such as reliability, causality, and usability) help the ML community to track how algorithms are being used and how their use can be improved, providing guidelines for further development. In particular, it has been shown that visualization can help researchers find erroneous assumptions and problems that many previous researchers would otherwise have missed. In this paper, we apply the described XAI tooling to a deep NLP learning model.
In this research paper, we apply an XAI tool to an NLP model that performs sentiment analysis. The NLP model is built using a neural network, and whenever we give any statement to this model, it predicts whether the sentiment is positive, negative, or neutral [1]. However, it is very difficult to explain how this neural network works at the backend to produce a particular sentiment label, so we cannot blindly trust an ML or NLP model. An XAI tool helps us understand how the model works [3]: it generates graphs of the model's behaviour, and from these graphs we can understand the working of our neural network. In this way, an NLP model for sentiment analysis can be explained, and we can place more trust in it.

41.2 Literature Survey

In this paper [6], the authors presented an innovative approach that can assign context-specific sentiment orientations to terms. They named this model SentiCircle and demonstrated its use for sentiment recognition at both the entity level and the tweet level. Their technique surpassed other similar techniques at both levels; in fact, for tweet-level recognition, their model surpassed the other state-of-the-art methods available. Existing approaches rely on fixed groups of words and phrases, so the strength of each word remains constant. In contrast, the presented approach constructively updates the strength of many words interactively, based on the contextual semantics of the tweets.
In this paper [7], the authors observe that sentiment analysis is a rapidly progressing research field, especially after the appearance of the second phase of the evolution of the Internet, characterized by the replacement of static websites with interactive ones and the expansion of social media such as Facebook, Instagram, and Twitter. These days, GIFs, short videos such as Instagram Reels and YouTube Shorts, and especially Twitter are extremely popular, which motivated the authors to explore sentiment in such posts, GIFs, and tweets. Different ML techniques are already available for analysing words, but they have the downside of treating every tweet as a uniform statement and assigning a single total score to the entire tweet. The authors therefore put forward techniques to identify the topics discussed in a tweet and to break every tweet into topic-relevant segments, so that a score can be assigned to every distinct topic.

In this paper [8], the authors presented the use of linguistic features in sentiment recognition and probed various techniques for integrating them into the analysis: first by substitution, second by augmentation, and third by interpolation. Their results surpassed the part-of-speech baseline for recognizing different types of sentiments. They showed that adding semantic features gives greater recall and F1 score, but lower accuracy than the sentiment features when identifying negative comments. For positive comments, the authors demonstrated that the model surpassed the sentiment-topic features in terms of accuracy but scored much lower on recall and F1.
In this paper [9], the authors discussed the highlights of most similar work done
in the area of sentiment analysis of Twitter. The research in this field has advanced
very quickly but still many problems such as data sparsity, multilingual, and senti-
ment analysis stands undetermined. After going through the most relevant works,
the authors concluded that sentiment analysis is a quickly enlarging area in which
research is going in different domains and on different challenges related to task.
Also, according to the author, the high interest in this field presently prompts the
use of this technology in different domains like government administration, business
intelligence, and also the recommender system. Therefore, the authors declared that
it is necessary to expand the sentiment analysis system that can withdraw the inherent
ideas which are circulated on sites like Twitter.
In this paper [10], the authors have done the research on election sentiments
where they targeted the use of different sentiment examiners with help of different
ML algorithms in order to detect the technique with the maximum accuracy. In
lexicon-based sentiment analysis, semantic inclination is for terms, phrases as well
as for sentences which are calculated in a documented file.
In this paper [11], the authors not only built a demonstration system that worked with an AI system and a simulator, but also developed a general and adjustable design for XAI systems while constructing an explainable artificial intelligence component for a semi-automated forces objective system, which is a planned military simulation. This design kept evolving as they worked on explainable artificial intelligence for a virtual human simulation, which is designed to teach humans soft skills such as cultural sensitivity, leadership management, and teamwork.
In this paper [12], the authors have discussed various techniques in order to demon-
strate black box model on a huge scale inclusive of data mining and machine learning.
The authors went on to show a detailed categorization for the techniques of explain
ability by keeping the type of challenges faced in mind. This study highlighted only
interpretability’s mechanism and ignored the other dimensions of explainability like
evaluation, in spite of the fact that this survey considered holistic theory related to
all black box models.
In this paper [13], the author goes on to say that it can be actually very benefi-
cial in many ways if we include the counterfactuals in the interpretable models of

complex artificial intelligence systems. Nevertheless, in order to optimize the effectiveness to its full extent, it would be necessary to integrate the data from the psychological
experiments carried out. The psychological experiments had information related to
people, about the way they create and understand counterfactuals. It had information
regarding counterfactuals of different structure and content. The authors believe that
explainable artificial intelligence can be very effective if knowledge of cognitive
science about cognitive capacities of human reasoners is included in it. This would
also enable us, the humans, to simulate similar kinds of alternative methods in reality
as a human might also go for creating imaginative agents in future.

41.3 Methodology

In the world of machine learning, many approaches are considered "black box" approaches, the most notable of them being deep learning models. With the increase in popularity and interest in machine learning and its revolutionary effect on our lives, it is no wonder that there is a dire need for "explainable artificial intelligence" and "explainable machine learning." As there are no widely accepted definitions of "explainability" and "interpretability," it is also no wonder that some people use the terms interchangeably while others draw a hard distinction between them.

41.3.1 Model Used for Classification

There are many approaches through which we can compute the sentiment of a particular snippet of text, including machine learning techniques such as support vector machines (SVM), bag-of-words, and Word2vec, to list a few. With proper implementation each of these models can give good accuracy; however, that is not the scope of this paper, and we selected the SVM model for this research.
SVM is a supervised machine learning algorithm which gives the sentiment of our input data (in this case a tweet) as output. The accuracy of this particular model is about 85%, which is considered quite respectable given the complex nature of human speech. Also, the aim is to give a sufficient "explanation" regarding the classification of tweets by these "black box" models, for which our model is perfectly adequate.
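As an illustration of such an SVM classifier (not the authors' exact pipeline), a minimal scikit-learn sketch with a TF-IDF bag-of-words representation is shown below; the tiny inline dataset is purely hypothetical, whereas the paper trains on the much larger T4SA corpus.

```python
# A minimal scikit-learn sketch of an SVM sentiment classifier over TF-IDF
# features. The inline dataset is purely illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

tweets = ["I love this sunrise", "This is terrible news",
          "What a beautiful day", "I hate waiting in traffic"]
labels = ["positive", "negative", "positive", "negative"]

model = make_pipeline(TfidfVectorizer(stop_words="english"), LinearSVC())
model.fit(tweets, labels)
print(model.predict(["love the flowers in the park"]))   # e.g. ['positive']
```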

41.3.2 Dataset

The dataset on which our model was trained is called the "Twitter for Sentiment Analysis (T4SA)" dataset. This dataset was scraped from Twitter using its API over 6 months. Its original aim was to analyze the sentiment conveyed by an image through its tweet, but the textual part of the dataset works perfectly fine for our use. For testing the model, recent tweets are pulled using the Twitter API.

41.3.3 How is Sentiment Determined

When a particular tweet is passed to the model, it determines which words are keywords, i.e., which words contribute to the overall meaning and thereby the sentiment of the expression. The process of determining the keywords includes removing stop words (e.g., a, is, are, there). After this process, we obtain the arousal and valence scores for the remaining words from the dictionary created by training the model.
After obtaining the scores for these keywords, the total score for the input is calculated, and a valence and an arousal score are assigned to the input. The arousal score and valence score are treated as x and y coordinates and plotted on a graph. The model represents pleasure along a horizontal axis, with highly unpleasant on one end, highly pleasant on the other, and different levels of pleasure in between. Russell's model was used for the depiction of sentiments based on the valence and arousal scores.
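A minimal sketch of this scoring step is shown below. The stop-word list and the valence/arousal lexicon are hypothetical stand-ins for the dictionary learned during training, and the tweet score is taken as the mean of its keywords' scores.

```python
# A minimal sketch of keyword extraction and valence/arousal scoring.
STOP_WORDS = {"a", "is", "are", "there", "the", "in", "this"}

# hypothetical lexicon: word -> (valence, arousal), each roughly in [-1, 1]
LEXICON = {"love": (0.9, 0.6), "death": (-0.7, 0.4), "park": (0.4, -0.2)}

def tweet_score(tweet):
    keywords = [w for w in tweet.lower().split() if w not in STOP_WORDS]
    scored = [LEXICON[w] for w in keywords if w in LEXICON]
    if not scored:
        return (0.0, 0.0)                        # no known keywords -> neutral
    valence = sum(v for v, _ in scored) / len(scored)
    arousal = sum(a for _, a in scored) / len(scored)
    return valence, arousal                      # point on Russell's circumplex

print(tweet_score("there is love in the park"))  # roughly (0.65, 0.2)
```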

41.3.4 Applying Visualization Tool to Our Model

We probe our model by passing certain tweets which contain specific keywords; for example, as shown in Fig. 41.1, we have passed tweets containing the keyword "love." As love is generally considered a positive/pleasant word, it is no wonder that the majority of the tweets lie on the right side of the Cartesian plane, where the positive x axis shows pleasant tweets.
The graph in Fig. 41.2 depicts the keyword "love" along with the other keywords used with it in tweets. This graph helps to visualize four emotions: the first quadrant is for happy, the second quadrant for upset, the third quadrant for unhappy,
Fig. 41.1 Sentiment of tweets with keyword “love”


41 Explainable AI for Sentiment Analysis 435

Fig. 41.2 Visualizing keyword “love” with other keywords used with it in majority of the tweets

Fig. 41.3 Visualizing keyword “death” with other keywords used with it in majority of the tweets

and the fourth quadrant for relaxed. The size of each keyword is proportional to its frequency of occurrence. Figure 41.2 shows "love" being used along with other keywords such as "rose," "park," "vegas," and "sunrise," among many others.
On the other hand, Fig. 41.3 shows an evenly spread-out graph, showing that the sentiment related to "death" depends entirely on the other keywords used in the tweet.

41.3.5 Passing Multiple Keywords

Taking just a single keyword into consideration proved to be heavily dependent on what other keywords are used along with it, so let us check multiple keywords.
Figure 41.4 shows quite one-sided results of sentiment analysis if we take multiple keywords that are arguably considered positive, in this case "love" and "flower." Though we still see an equal distribution along the y axis, this is probably due to limitations of our model.
Figure 41.5 shows that the population is generally subdued for tweets related to "election," "politics," and "corruption." The distribution along the x axis tends to favor positivity, but enough nodes lie on the left side that we cannot say these keywords convey a subdued positive feeling; a more appropriate term would be relaxed.
Figure 41.6 shows that tweets related to these keywords generally tend to be singletons,

Fig. 41.4 Sentiments for keyword “love” and “flower”

Fig. 41.5 Sentiment analysis for keywords “election,” “corruption,” and “politics”

Fig. 41.6 Tag cloud for keywords “election,” “corruption,” and “politics”

i.e., they do not belong to any definite category, although during multiple runs for these keywords some tweets did fall into clusters such as "labor," "commission," and "Morrison."
The tag cloud graph used to describe Fig. 41.2 is shown for these keywords in Fig. 41.7 and gives quite similar output to the analysis of Fig. 41.5. Figure 41.8 gives a general idea of how retweets can change sentiment even if they belong to the same conversation.

Fig. 41.7 Topics cluster for keywords “election,” “corruption,” and “politics”

Fig. 41.8 Narrative map line for keywords “election,” “corruption,” and “politics”

41.4 Results and Discussion

From the observations which we gathered using the visualization tool, we can safely say that part of classifying sentiments is prone to bias, as human input is used to mark certain keywords/phrases as positive or negative. Assuming we take a good random sample of the human populace, we can then proceed to analyze the results we obtained more closely.
It can be said that taking a single keyword into consideration may not give us a clear answer for a particular tweet. This was apparent from studying the graphs plotted for the keyword "death," which according to the general populace is considered negative. But when used in a sentence/tweet, its meaning entirely depends on the other keywords used along with it, thus giving the keyword "death" the status of neutral.
When we take two keywords which are considered positive, the output is as expected, i.e., heavily favoring the positive x axis, which belongs to positive sentiment. As we are taking tweets into consideration, which are usually quite short, even two

keywords help us to get a more distinct result than the one we were getting when using a single keyword.
When we move on to a more complex analysis using "election," "corruption," and "politics," we find users are generally subdued about these topics and tend to favor the positive x axis, though not an insignificant number are around the center left. This shows people are generally relaxed/not bothering to form opinions on these matters, a passive trend for keywords which may give the impression of being neutral on their own.
Thus, we can safely conclude that our model assigns each keyword in a tweet a value which, when summed, gives us the net sentiment of the tweet. Assigning these values is generally a human's job, and therefore great care must be taken when preparing the dataset. XAI, when applied to these models, generates transparency for the end user, thus building trust between social platforms and users.

41.4.1 Limitations

As discussed earlier, human input was used for labeling the training dataset's data items. If, while preparing the dataset, human input is not taken from a large randomized sample, it may lead to a biased dataset and thus a biased model. As sentiment analysis is a crucial process in monitoring the open Internet for preventing misuse, it is the responsibility of each media house to take necessary caution during each step of data collection and preparation.

References

1. Escalera, S., Guyon, I., Escalante, H.J., Baró, X., van Gerven, M.: Explainable and Interpretable Models in Computer Vision and Machine Learning. Springer (2018)
2. Samek, W., Montavon, G., Vedaldi, A., Hansen, L.K., Müller, K.-R.: Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. Springer International Publishing (2019)
3. Biran, O., Cotton, C.: Explanation and justification in machine learning: A survey. Paper presented at the IJCAI-17 Workshop on Explainable AI (XAI), Melbourne, Australia, 20 August (2017)
4. Kulesza, T., Burnett, M., Wong, W.-K., Stumpf, S.: Principles of explanatory debugging to
personalize interactive machine learning. Proceedings of the 20th International Conference on
Intelligent User Interfaces (2015). https://doi.org/10.1145/2678025.2701399
5. Bellotti, V., Edwards, K.: Intelligibility and accountability: Human considerations in context-
aware systems. Human-Computer Int. 16(2–4), 193–212 (2001). https://doi.org/10.1207/s15
327051hci16234_05
6. Saif, H., He, Y., Fernandez, M., Alani, H.: Contextual semantics for sentiment analysis of
Twitter. Inf. Process. Manage. 52(1), 5–19 (2016). https://doi.org/10.1016/j.ipm.2015.01.005
7. Kontopoulos, E., Berberidis, C., Dergiades, T., Bassiliades, N.: Ontology-based sentiment
analysis of Twitter posts. Expert Syst. Appl. 40(10), 4065–4074 (2013). https://doi.org/10.
1016/j.eswa.2013.01.001

8. Gautam, G., Yadav, D.: Sentiment analysis of Twitter data using machine learning approaches
and semantic analysis. 2014 Seventh International Conference on Contemporary Computing
(IC3) (2014). https://doi.org/10.1109/ic3.2014.6897213
9. Martinez-Camara, E., Martin-Valdivia, M.T., Urena-Lopez, L.A.: Sentiment analysis in Twitter.
Nat. Lang. Eng. 20(1), 1–28 (2012). https://doi.org/10.1017/s1351324912000332
10. Khoo, C.S.G., Johnkhan, S.B.: Lexicon-based sentiment analysis: Comparative evaluation of
six sentiment lexicons. J. Inf. Sci. 44(4), 491–511 (2017). https://doi.org/10.1177/016555151
7703514
11. Core, M.G., Lane, H.C., van Lent, M., Gomboc, D., Solomon, S., Rosenberg, M.: Building
explainable artificial intelligence systems (2006). https://doi.org/10.21236/ada459166
12. Adadi, A., Berrada, M.: Peeking inside the black-box: A survey on explainable artificial intel-
ligence (XAI). IEEE Access 6, 52138–52160 (2018). https://doi.org/10.1109/access.2018.287
0052
13. Byrne, R.M.: Counterfactuals in explainable artificial intelligence (XAI): Evidence from human
reasoning. Proceedings of the Twenty-Eighth International Joint Conference on Artificial
Intelligence (2019). https://doi.org/10.24963/ijcai.2019/876
Chapter 42
Deep CNN Model Embedded
with Inception Layers for COVID-19
Classification

Jaya Sharma and D. Franklin Vinod

Abstract The sudden rise of the coronavirus disease has tremendously affected human health and the economy all over the world. This ongoing crisis can be minimized by identifying corona cases in a fast and accurate manner. A leading method to diagnose the seriousness of this disease is Computed Tomography (CT). Advances in deep learning technology have made the classification task simple and effective. In this paper, we introduce an automatic CNN model that performs binary classification of COVID-19 precisely. The introduced CNN model works with inception layers and CT images to achieve 98.18% classification accuracy. The simulation outcomes on the CT images prove that the proposed system improves the recognition rate and the classification performance compared with traditional methods.

Keywords COVID-19 · Lung CT scan · CNN · Classification problem

42.1 Introduction

Big Data is undoubtedly flourishing, with huge outbursts of data, and has evolved to a great extent in modern industries, creating a significant platform for researchers to explore their innovations. Following a Moore's-Law-like trend, Big Data is growing exponentially, roughly "doubling every year." Modern big data is characterized by the 5Vs: "Volume" describes the size of data generated almost every single second; "Variety" denotes the unstructured as well as structured data obtained from numerous sources; "Velocity" signifies how quickly the data is originated, processed, and distributed; "Veracity" refers to clean and accurate data; and "Value" denotes the usefulness of the data in its context. The applications of big data are in banking, media, healthcare, education, and many more. Among these applications, healthcare is a prime source of big data generation. The various issues

J. Sharma (B) · D. F. Vinod


Department of Computer Science and Engineering, Faculty of Engineering and Technology, SRM
Institute of Science and Technology, NCR Campus, Delhi-NCR Campus, Delhi-Meerut Road,
Modinagar, Ghaziabad, UP, India
e-mail: jayashaa19@gmail.com


to be addressed in big data are data silos, inaccurate data, lack of skilled workers, classification, etc. [1]. Among these, classification is vital to analyze because of the severity of its applications. The main purpose of this work is to distinguish pulmonary disease and overcome the problems in classification. Moreover, a large amount of image data is being analyzed, individually or as multiple combinations of data from different sources, to achieve more purposeful (more accurate) results.
The novel COVID-19 is an ongoing worldwide contagious disease caused by SARS-CoV-2, denominated severe acute respiratory syndrome coronavirus 2. The WHO declared COVID-19 an international emergency in January 2020 and a global pandemic in March 2020. The initial victims of the coronavirus were detected in China during December 2019 [2]. As per the latest survey, the disease has spread over more than 213 nations and regions worldwide. It has been claimed that approximately 11.5 million people have been infected by this disease, with around 503,600 deaths to date. The virus belongs to the subfamily Coronavirinae with the different classes α, β, γ, δ [3]. SARS-CoV-2 is considered to be of zoonotic origin since it has a very close interconnection with SARS-CoV, the pangolin coronavirus, and bat coronaviruses. Scientific surveys indicate that the natural signs of COVID-19 are cough, headache, diarrhea, shortness of breath, and shivering. According to the WHO, the signs of Middle East respiratory syndrome (MERS)-CoV are shortness of breath, cough, and fever, and its fatality rate is 35% [4]. Like SARS-CoV and MERS-CoV, gastrointestinal infection also occurs in COVID-19. All coronavirus diseases mainly impact the lungs in the form of pneumonia.
The virus spreads through droplets generated by speaking, coughing, and sneezing [5]. These droplets cannot travel long distances through the air; instead they fall on surfaces. However, research findings from June 2020 indicate that droplets created while talking can last in the air for 10 minutes. More often the infection is carried to a human by touching an affected surface and then touching the face or nose. The transfer rate of the virus is very fast in the first three days compared to the following days, but transmission of the virus starts before the signs of infection appear, or from asymptomatic persons.
We observe that COVID-19 transfers from human to human through the respiratory tract and highly impacts both lungs, a condition called COVID-19 pneumonia. RT-PCR, denominated real-time reverse transcription-polymerase chain reaction, is a mechanism utilized to detect the coronavirus disease accurately. The shortage of RT-PCR kits, errors, and delays in RT-PCR results have led to the fast spread of infection among humans. Nowadays, radiological evaluation is used for the detection and prevention of pneumonia at an early stage. Computed tomography (CT) and chest X-ray are the standard testing methods for the detection and prevention of pneumonia [6]. The clinical findings for COVID-19 cases through CT scans show how severe a patient's condition is and indicate whether the patient should be under normal or intensive care observation. The imaging helps detect heterogeneous consolidation in both lungs via Ground Glass Opacity (GGO) and indicates a "white lung" during extreme infection. Features such as extensive consolidation with multifocal bilateral GGOs, interstitial inflammation, and bilateral involvement are necessary to classify the pneumonia factor of the coronavirus using imaging [7].
The advancement of machine learning methods and their automatic detection capability are in demand in the medical field. Deep learning, a part of AI, qualifies as a flourishing technology in the medical field with the capability of automatic feature extraction. The rapid rise in COVID-19 initiated the requirement for intelligence in the medical field, and this requirement motivated our research on the development of an automated classification technique. The small number of radiologists demands a deep learning model in every healthcare unit. This demand motivates our proposed deep convolutional neural network model, which is embedded with inception layers for the classification of COVID-19 disease. The outcome of the introduced work will provide instant support for physicians to take timely action and overcome problems such as the unavailability of RT-PCR test kits and the delay, error, and cost of test results.

42.2 Related Work

Rajpurkar et al. [8] used a convolutional neural network to identify pneumonia on chest X-rays (CXR). The authors proposed the CheXNet model, a 121-layer convolutional neural network, to identify 14 diseases related to the lungs. To evaluate its performance, they used the ChestX-ray14 dataset, which includes 112,120 frontal-view X-ray images; these images are downsized to 224 × 224 before being given to the network. The authors compared the average F1 score of CheXNet and of radiologists on similar samples; the difference identified in this comparison is 0.051 (95% CI 0.005, 0.084). Limitations of this work are that the authors used only frontal radiographs and that the achieved accuracy is relatively low, approximately 85%.
Pereira et al. [9] introduced flat and hierarchical classification methods for the identification of the coronavirus based on chest X-rays. The authors proposed multi-class and hierarchical classification methods and noticed that texture is the main feature in CXR images; hence, for texture description they used early and late fusion techniques. The RYDLS-20 database consists of CXR images of pneumonia and images of lungs without pneumonia. They achieved a macro F1 score of 0.65 using the first, multi-class approach, while the second, hierarchical approach attained an F1 score of 0.89 for COVID-19 classification.
Ozturk et al. [10] introduced a deep-learning-based architecture that works on CXR images for the diagnosis of the coronavirus. The DarkCovidNet architecture, with 17 convolutional layers and various filtering processes in every layer, was introduced to perform both binary-class and multi-class classification. Their model attained 98.08% accuracy for the binary class and 87.02% for the multi-class case. The dataset was created using the COVID-19 X-ray images collected by Cohen JP, and the authors worked on 127 X-ray images of coronavirus patients to diagnose the disease. The small number of images used for the implementation of their model is its main disadvantage.
Wang et al. [11] introduced the COVID-Net architecture, which utilizes X-ray images for the identification of the coronavirus disease; the network was the first open-source network of its kind. For the implementation of the model, the authors assembled the COVIDx dataset, which was generated specifically for the COVID-Net evaluation and includes 13,975 chest images obtained from 13,870 different patients. The kernel sizes of the COVID-Net architecture range from 7 × 7 to 1 × 1. This architecture attained 93.3% accuracy when performing classification on multiple classes (COVID-19, normal, and viral).
Li et al. [12] introduced a network for the identification of coronavirus disease using pulmonary CT. The deep learning framework used for the identification of COVID-19 is three-dimensional; the model extracts two-dimensional local features and three-dimensional global features. The final generated feature map is the input of the fully connected layer, which performs classification into 3 classes (CAP, non-pneumonia, and COVID-19) using the softmax activation function. The CT images collected for the dataset come from various hospitals and comprise 1292 coronavirus CT images, 1735 pneumonia CT images, and 1325 non-pneumonia CT images. This model identified COVID-19 with a high specificity of 96% and a sensitivity of 90%.

42.3 Methodology

Before applying the proposed deep learning algorithm, we focus on the input image
to be processed. The input image is enhanced and made noise-free by preprocessing it
with a median filter, which is a non-linear filtering technique. The prime objective
of the median filter is to eliminate noise from the input image without changing the
edges; this is followed by background removal, for which segmentation-based color
thresholding is used [13].
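As a rough illustration of this preprocessing pipeline, the sketch below applies a median filter and a segmentation-style color threshold with OpenCV; the file name, kernel size, and threshold bounds are illustrative assumptions and are not taken from the paper.

```python
import cv2
import numpy as np

image = cv2.imread("ct_slice.png")                    # hypothetical input slice
denoised = cv2.medianBlur(image, 5)                   # non-linear median filter, 5x5 kernel

# Background removal via a simple threshold mask: keep pixels inside an assumed
# intensity range and zero out everything else.
hsv = cv2.cvtColor(denoised, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, np.array([0, 0, 40]), np.array([180, 255, 255]))
foreground = cv2.bitwise_and(denoised, denoised, mask=mask)
```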

42.3.1 Basic CNN Model

Over the years, deep learning models have received a huge level of attention in many
computer vision applications, including the classification of medical images [14].
A CNN comprises a convolutional layer that extracts features from the input image.
This layer is followed by a pooling layer, which reduces the complexity of the model
and its computational cost. The final layer is a fully connected layer that flattens
the feature maps for classification. The lower layers detect elementary features,
which are then fed as input to the next layers of the model to identify more complex
features. As we go deeper into the network layers, more specific features are
identified. In CNNs, feature extraction is achieved

using the convolutional layer, which applies the mathematical convolution operation.
For an input image I (the signal) and filter K, Eq. (42.1) represents the 2-D
convolution [10].

(I ∗ K)(i, j) = Σ_m Σ_n K(m, n) I(i − m, j − n)    (42.1)

In Eq. (42.1), ∗ denotes the convolution operation. It takes the input matrix and the
filter matrix as its two inputs, and the filter matrix is slid over the input image to
produce a feature map. Activation functions are the computational operations that
alter the output of a neuron depending on its input. The activation function utilized
in this architecture is the Rectified Linear Unit (ReLU), represented by Eq. (42.2).

f(x) = x for x > 0; f(x) = 0 for x ≤ 0    (42.2)

Equation (42.2) returns the same value for a positive input and returns 0 for a
negative input. The feature map height and width are reduced by the max pooling
operation, and softmax is utilized for the classification.
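The following minimal NumPy sketch illustrates Eqs. (42.1) and (42.2) together with a max pooling step; it is an illustration of the operations described above, not the implementation used in the paper, and the input and kernel values are random placeholders.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution as in Eq. (42.1): the flipped kernel is slid over the image."""
    kh, kw = kernel.shape
    flipped = np.flip(kernel)                     # convolution (not correlation) flips the filter
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

def relu(x):
    """Eq. (42.2): pass positive values through, zero out the rest."""
    return np.maximum(x, 0)

def max_pool(x, size=3):
    """Non-overlapping max pooling that shrinks the feature map height and width."""
    h, w = x.shape[0] // size * size, x.shape[1] // size * size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

feature_map = max_pool(relu(conv2d(np.random.rand(512, 512), np.random.rand(3, 3))))
```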

42.3.2 The Proposed CNN

The proposed architecture, described in Fig. 42.1, has three convolutional layers. The
first layer applies a 7 × 7 convolution directly to the 512 × 512 input image, followed
by a 3 × 3 convolution in the second layer. The output of the second layer, a 64 × 64
image with 32 feature maps, is consumed as the input of the inception module.
Thereafter, the third convolutional layer is supplied with the result of the inception
layers. At last, the feature vector is flattened to classify the diseases. As shown in
Fig. 42.2, we utilized three inception layers, which allow dimensionality reduction in
the deeper network. After every convolutional layer, a 3 × 3 max pooling layer is used
to obtain productive features. In our architecture, the batch normalization methodology
is utilized to standardize the inputs, speed up the training process, and reduce the
generalization error. Additionally, we have used the ReLU activation function to
compute the changes in the output of the network. A Keras sketch of this design is
given after Fig. 42.2.

Fig. 42.1 The CNN architecture

Fig. 42.2 Inception layer n, where n = 1, 2, 3
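A minimal Keras sketch of the kind of design described above is given below. The filter counts, strides, and the internal branch widths of the inception module are assumptions chosen for illustration; only the overall structure (an initial 7 × 7 convolution on a 512 × 512 input, a 3 × 3 convolution, three inception layers, 3 × 3 max pooling after each convolution, batch normalization, ReLU, flattening, and a softmax classifier trained with the Adam optimizer) follows the description in the text.

```python
from tensorflow.keras import layers, models

def inception_module(x, filters=32):
    """Parallel 1x1 / 3x3 / 5x5 convolutions plus pooling, concatenated (assumed branch sizes)."""
    b1 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    b5 = layers.Conv2D(filters, 5, padding="same", activation="relu")(x)
    bp = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    return layers.Concatenate()([b1, b3, b5, bp])

inputs = layers.Input(shape=(512, 512, 1))
x = layers.Conv2D(32, 7, strides=2, activation="relu")(inputs)   # first 7x7 convolution
x = layers.BatchNormalization()(x)
x = layers.MaxPooling2D(3, strides=2)(x)
x = layers.Conv2D(32, 3, activation="relu")(x)                   # second 3x3 convolution
x = layers.BatchNormalization()(x)
x = layers.MaxPooling2D(3, strides=2)(x)
for _ in range(3):                                               # three inception layers
    x = inception_module(x)
x = layers.Conv2D(64, 3, activation="relu")(x)                   # third convolution
x = layers.BatchNormalization()(x)
x = layers.MaxPooling2D(3, strides=2)(x)
x = layers.Flatten()(x)
outputs = layers.Dense(2, activation="softmax")(x)               # assumed binary COVID / non-COVID output
model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```

The parallel inception branches operate at several kernel sizes and are concatenated, which is one common way to keep a network compact while still capturing multi-scale features.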

42.4 Result

42.4.1 Dataset Information

Computed tomography (CT) is a radiological imaging method utilized to diagnose the
intensity of lung infection that occurs because of COVID-19. In this paper, the CT
images used for the diagnosis of COVID-19 are obtained from two different sources. One
source is the CT image repository developed by Shakouri [15] from two popular
hospitals; this repository covers more than 1000 COVID-19 patients. All images in the
database are 512 × 512 pixel grayscale images maintained in the DICOM standard. The
other source is CT images collected from local hospitals. These images are also
maintained in the DICOM standard and were rescaled to 512 × 512 pixels.
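As a rough sketch of how such DICOM slices can be loaded and rescaled to 512 × 512, assuming the pydicom and OpenCV packages and a placeholder file name:

```python
import cv2
import pydicom

ds = pydicom.dcmread("ct_scan_0001.dcm")       # hypothetical DICOM file from the local-hospital source
pixels = ds.pixel_array                        # grayscale pixel matrix stored in the DICOM object
resized = cv2.resize(pixels.astype("float32"), (512, 512), interpolation=cv2.INTER_AREA)
```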

Table 42.1 Comparison between the proposed and traditional CNN models

Model Accuracy Precision Recall F1-score AUC
VGG 16 0.74 0.72 0.95 0.82 0.84
DenseNet121 0.81 0.79 0.96 0.86 0.76
Inception V3 0.85 0.91 0.84 0.87 0.65
Xception 0.87 0.857 0.96 0.90 0.93
Proposed model 0.91 0.91 0.97 0.92 0.97

42.4.2 Performance Analysis

The performance of our model in the classification of COVID-19 is differentiated from
traditional models by the achieved accuracy. In addition, the correct classification
rate is derived from a confusion matrix, which is important in expressing the accuracy
of the proposed model. This statistical technique, the confusion matrix, is used to
evaluate the performance of our proposed model through factors such as recall,
precision, AUC-ROC curves, F1-score, and accuracy. The mathematical definitions of
recall and precision are shown in Eqs. (42.3) and (42.4) [16].

Recall = TP / (TP + FN)        (42.3)

Precision = TP / (TP + FP)     (42.4)

In Eqs. (42.3) and (42.4), TP denotes the true positive count, FN the false negative
count, and FP the false positive count. The F1-score is the weighted (harmonic) average
of precision and recall; it reaches its best score at 1 and its worst score at 0. The
proposed deep CNN model is compared with other CNNs in Table 42.1, which includes
VGG 16 [17], DenseNet121 [18], Xception [19], and Inception V3 [20]. The performance
comparison exhibited in Table 42.1 is visualized graphically in Fig. 42.3. VGG 16 and
DenseNet121 use images of a fixed size of 224 × 224 with RGB channels, whereas the
Inception and Xception models work on 299 × 299 color images (Fig. 42.3).
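The sketch below computes Eqs. (42.3) and (42.4), the F1-score, and accuracy from confusion-matrix counts; the TP/FP/FN/TN values are placeholders for illustration, not results from the paper.

```python
tp, fp, fn, tn = 90, 9, 3, 98                  # placeholder confusion-matrix counts

recall = tp / (tp + fn)                        # Eq. (42.3)
precision = tp / (tp + fp)                     # Eq. (42.4)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} accuracy={accuracy:.2f}")
```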

42.5 Conclusion

The fundamental intention of the introduced model is to identify and classify COVID-19
disease during the ongoing pandemic with the help of CT images. With no manual feature
extraction, the proposed model automatically diagnoses the COVID-19 infection and its
severity. The implemented CNN model is enhanced by


Fig. 42.3 Performance comparison chart

the inclusion of inception layers, which provide a major improvement in performance.
Additionally, an Adam optimizer is utilized to reduce the loss function, which in turn
provides efficient computation when dealing with large datasets and many parameters.
One of the main advantages of this model is that it can perform image classification
on binary classes with an accuracy of 98.18%.

References

1. Gayatri, K., Agrawal, A., Ahmad Khan, R.: A study of big data characteristics. In: International
Conference on Communication and Electronics Systems (ICCES), pp. 1–4 (2016)
2. Sohrabi, C., Alsafi, Z., O’Neill, N., et al.: World Health Organization declares Global Emer-
gency: A review of the 2019 Novel Coronavirus (COVID-19). Int. J. Surg. 78, 71–76
(2020)
3. Lalmuanawma, S., Hussain, J., Chhakchhuak, L.: Applications of machine learning and arti-
ficial intelligence for Covid-19 (SARS-CoV-2) pandemic: A review. Chaos Solitons Fractals
139, 779–960 (2020)
4. Singhal, T.: A Review of Coronavirus Disease-2019 (COVID-19). Indian J. Pediatr. 87(4),
281–286 (2020)
5. Sahin, A.R., Erdogan, A., Agaoglu, P.M., Dineri, Y., Cakirci, A.Y., et al.: Novel Coronavirus
(COVID-19) outbreak: A review of the current literature. Eurasian J. Med. Oncol. 4(1), 1–7
(2020)
6. Kanne, J.P., Little, B.P., Chung, J.H., et al.: Essentials for radiologists on COVID-19: An
update—Radiology scientific expert panel. Radiology 296(2), E113–E114 (2020)
7. Bai, H.X., Hsieh, B., Xiong, Z., Halsey, K., et al.: Performance of radiologists in differentiating
COVID-19 from non-COVID-19 viral pneumonia at chest CT. Radiology 296(2), E46–E54
(2020)
8. Rajpurkar, P., Irvin, J.A., Zhu, K., et al.: CheXNet: Radiologist-level pneumonia detection on
chest X-rays with deep learning. arXiv:1711.05225 (2017)
9. Pereira, R.M., Bertolini, D., Teixeira, L.O., et al.: COVID-19 identification in chest X-ray
images on flat and hierarchical classification scenarios. Comput. Methods Programs Biomed.
194, 105532 (2020)

10. Ozturk, T., Talo, M., Yildirim, E.A., et al.: Automated detection of COVID-19 cases using deep
neural networks with X-ray images. Comput. Biol. Med. 121, 103792 (2020)
11. Wang, L., Lin, Z.Q., Wong, A.: COVID-Net: A tailored deep convolutional neural network
design for detection of COVID-19 cases from chest X-ray images. Sci. Rep. 10(1). https://doi.
org/10.1038/s41598-020-76550-z (2020)
12. Li, L., Qin, L., Xu, Z., Yin, Y., Wang, X., et al.: Using artificial intelligence to detect COVID-
19 and community-acquired pneumonia based on pulmonary CT: Evaluation of the diagnostic
accuracy. Radiology 296(2), E65–E71 (2020)
13. Kulkarni, N.: Color thresholding method for image segmentation of natural images. Int. J.
Image, Graph. Sign. Process. (IJIGSP) 4(1), 28–34 (2012)
14. Wang, H., Xia, Y.: ChestNet: A deep neural network for classification of thoracic diseases on
chest radiography. arXiv:1807.03058 (2018)
15. Shakouri, S., Bakhshali, M.A., Layegh, P., Kiani, B., et al.: COVID19-CT-dataset: An open-
access chest CT image repository of 1000+ patients with confirmed COVID-19 diagnosis.
BMC. Res. Notes 14(1), 178 (2021)
16. Jesse, D., Goadrich, M.: The relationship between precision-recall and ROC curves. In: 23rd
International conference on Machine learning (ICML). Association for Computing Machinery,
New York, NY, USA, pp. 233–240 (2006)
17. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition. arXiv:1409.1556 (2014)
18. Huang, G., Liu, Z., Van Der Maaten, L., et al.: Densely connected convolutional networks. In:
IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
19. Chollet, F.: Xception: Deep learning with depthwise separable convolutions. In: IEEE
Conference on Computer Vision and Pattern Recognition, pp. 1251–1258 (2017)
20. Szegedy, C., Vanhoucke, V., Ioffe, S., et al.: Rethinking the inception architecture for computer
vision. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826
(2016)
Chapter 43
Design and Execution of Cyberattacks
Simulation for Practice-Oriented
Experiential Learning

Ashutosh Bahuguna and Samar Wazir

Abstract Convergence of ICT networks, if not properly secured, could lead to catastrophe,
especially in critical infrastructure. A malicious cyberattack such as a data breach can
cause substantial damage to the critical assets and services of an organization. Rapid
identification, technical analysis, and strategic and tactical response can minimize the
impact of a cyber incident or crisis. Various initiatives are taken around the globe to
ensure the cyber resiliency of the digital space; cybersecurity capacity building and
exercises are widely adopted initiatives. Attack-simulated cybersecurity exercises can be
conducted on a vulnerable testbed supported by technologies like virtualization. However,
very limited resources, artifacts, and research are available in the academic domain
related to such testbeds and exercises. In order to conduct research on how effective
practice-oriented experiential learning is in building a higher-order skillset in the
cybersecurity domain, a practice-oriented lab which replicates a real-cyberattack scenario
was developed. The developed exercise contains a malware-based scenario which leads to a
data breach and includes vulnerability assessment, lateral movement, scanning, log
analysis of the victim machine, and network traffic analysis. In this paper, we present
the design of a simulated attack-based practice-oriented testbed for building deep
cybersecurity skill sets. With the objective of providing a platform to apply and analyze
skills and assess abilities to counter cyberattacks, vulnerable machines and exploits were
developed for the experiment. The exercise type design and the selection of
vulnerabilities and exploits are derived from real-life cyber incidents.

Keywords Cybersecurity · Cyberattacks · Cybersecurity exercises · Practice-oriented learning · Experiential learning

A. Bahuguna (B) · S. Wazir
Jamia Hamdard University, New Delhi, India
e-mail: ashoo.online@gmail.com


43.1 Introduction

Adoption and reliance on ICT for critical and essential services are leading to hyper-
connected societies. In such a scenario, a cybersecurity attack or security breach can
cause disruptions and damage to critical assets, services, and interests of the victim.
A cyberattack that targets the infrastructure of an organization can effectively reduce
available resources, and undermine confidence in their services [1].
A cyberattack is a malicious attempt by an attacker to cause damage, disruption, or
destruction to the target ICT environment. A cyberattack can take the form of cyber
extortion, a data breach, automated malware, or a targeted attack, and can have various
objectives such as espionage, disruption of services, and financial gain, among others.
Cyber incidents, if not arrested quickly, may grow into a large-scale crisis and could
impact essential services such as power and healthcare. Since the impact of cyberattacks
may take any form and may threaten security and business objectives [2, 3], it is
required to develop abilities for timely detection and immediate technical analysis to
respond to malicious cyberspace activity.
Cyber exercises are a globally adopted method for practicing and assessing the
preparedness of participating entities to counter, minimize the impact of, and manage
cybersecurity incidents [4]. Cyber exercises enable participants to practice cyber
crisis scenarios and have proven to be a useful tool for improving preparedness and
skillsets to counter cyberattacks. An exercise places the participants in a simulated
situation; the goal is to learn from the exercise and be able to use the experience in
future situations that they face. For conducting cybersecurity exercises, simulated
machines with vulnerabilities and associated artifacts are developed, such as vulnerable
scripts, network packet captures, vulnerable web applications, malware artifacts, and logs.
This research is focused on conducting a simulated attack-based, practice-oriented
cybersecurity exercise to build cybersecurity skill sets and assess abilities to counter
cyberattacks. In order to conduct research on how effective practice-oriented
experiential learning is in building a higher-order skillset in the cybersecurity
domain, a practice-oriented lab which replicates a real scenario was developed.
This paper presents the development of a practice-oriented lab that could help learners
perform technical analysis of cyberattacks within a risk-free environment. The developed
exercise contains a malware-based scenario which leads to a data breach and includes
vulnerability assessment, scanning, log analysis of the victim machine, and network
traffic analysis. Sections 43.2 and 43.3 of the paper present the literature review and
the problem statement, respectively. Section 43.4 presents details of the technical
testbed and the cyberattack simulation scenario, and finally Sect. 43.5 concludes and
discusses future research directions.

43.2 Literature Review: Practice-Oriented Attack Simulation Testbed

Cybersecurity simulations and exercises of different types are conducted by economies
and organizations, each with different objectives and target participants. There is no
unique standard to classify cybersecurity exercises. Simulated attack-based exercises
are useful for assessing the people, processes, and technology implemented at an
organization from a security viewpoint [5].
Simulated attack-based exercises launch cyberattacks on the target testbed and exploit
the weaknesses that are crafted into the exercise setup. Participants are expected to
counter the attack, perform a technical investigation, and mitigate the simulated
impact [6].

43.2.1 Scope of the Exercise: Simulated Attack-Based Exercise

Cybersecurity exercises are broadly classified at the following three levels:


• Strategic level
• Operational level
• Technical level.
At the technical level, exercises involve practice and assessment of incident detection,
artifact investigation, and mitigation strategies. At the operational level, incident
response protocols and coordination are tested. The strategic level, which typically
involves the board and top management of the entity, involves decision-making to
respond to a cyber crisis situation [7]. The hierarchy of exercise types is shown in
Fig. 43.1, adopted from the European Union Agency for Cybersecurity (ENISA).
In this research, our primary goal is to provide a practice-oriented lab in which
participants learn to investigate and respond to a cybersecurity incident. The scope of
the designed testbed presented in this paper is at the technical level.

Fig. 43.1 Scope of different types of exercises, adopted from ENISA [7]

43.2.2 Theme of Attack Simulation

There are many different themes that can be used to design and conduct exercises, each
with different target participants (such as technical analysts, mid management, and top
management) and objectives. Broadly, exercises are classified in the literature as
discussion-based and operational exercises [7, 8]. Discussion-based exercises, such as
table-top exercises, focus on examining the injects and scenarios, developing
counter-response strategies, and testing incident response protocols. Operational
exercises provide a platform to assess and test technical skills, artifact analysis,
and decision-making with limited information.
Based on our objective of implementing the attack simulation-based cybersecurity testbed
for building cybersecurity skills and capabilities, an attack simulation scenario focused
on data breach and cyber extortion attacks was planned. In the design of the testbed, it
was planned that the simulated attacks would be launched on the victim machines.
Participants need to perform a technical investigation and respond to questions related
to the incident and the attacker's methodology, both for skill building and for
assessment of their technical abilities to counter and resolve cyberattacks.

43.3 Problem Statement

Simulation-based cybersecurity exercises are conducted by many economies and regional
and international organizations around the globe to test and practice the cybersecurity
preparedness of entities. However, there is a lack of systematic research in this
domain, and very limited resources, artifacts, and research are available in the
academic domain related to building the higher-order cybersecurity skill set proposed
in Bloom's Revised Taxonomy (BRT) [9]. One of the key reasons for the non-availability
of testbeds and artifacts in the academic domain is the labeling of such content as
restricted or confidential.
In order to conduct research on how effective practice-oriented experiential learning is
in building a higher-order skillset in the cybersecurity domain using the cybersecurity
exercise concept, a practice-oriented lab which replicates a real-cyberattack scenario
was developed. The development of this practice-oriented lab, which helps learners
perform technical analysis of cyberattacks within a risk-free environment, is mapped to
Bloom's Revised Taxonomy (BRT).
In this experiment, we present the design of a simulated attack-based practice-oriented
testbed for building cybersecurity skill sets. Vulnerable machines and exploits were
developed for the experiment. The exercise is planned to be conducted with participants,
and the data and results are to be analyzed to assess the effectiveness of
practice-oriented experiential learning in building the higher-order skillset proposed
in the cognitive model of Bloom's Revised Taxonomy (BRT). The following are the three
key objectives of this research:

1. Implementation and execution of the attack simulation-based testbed for building cybersecurity incident investigation skill sets.
2. Systematic research on mapping formal learning models in building cybersecurity skills.
3. Results of the experiment will inform the effectiveness of practice-oriented experiential testbeds in building cybersecurity skill sets.

43.4 Cyberattacks Simulation for Practice-Oriented Experiential Learning: Concept and Design

A practice-oriented lab which replicates a real-cyberattack scenario is developed. This
section contains the technical details of the testbed along with indicative questions
for the assessment of participants' cybersecurity skills. The developed testbed contains
a malware-based scenario which leads to a data breach [10] and includes vulnerability
assessment, lateral movement, scanning, log analysis of the victim machine, and network
traffic analysis [11].

43.4.1 High-level Scenario Story

A hypothetical victim organization with well-built infrastructure and more than 1000
employees provides services in the telecommunication sector. It stores critical customer
data as well as company project details on an internal server on its network. The
employees are not aware of the latest cybersecurity trends, which makes them likely to
become victims of a cyberattack. Since the organization is a service-based company, it
receives an enormous number of emails every day. An employee received a targeted
spear-phishing email, unknowingly opened it, and downloaded the malicious attachment.
While viewing the attachment, a suspicious file got downloaded onto the employee's
machine. The downloaded file was malware which let the attacker gain remote access to
the employee's machine. The attacker then performed lateral movement to the internal
critical server and took the critical data, including customers' personally identifiable
information (PII). Later, the attacker held the victim to ransom, threatening to leak
the confidential data into the public domain. The task of the company's cybersecurity
analyst is to perform a technical investigation of the incident and capture the root
cause and attacker methodology in order to respond to the attack.

43.4.2 Testbed Details

The testbed was created using four virtual machines with the following configurations.

• Attacker Machine: OS: Kali Linux Machine [12], RAM: 2 GB, Storage: 40 GB
• Internal Server Machine: OS: Ubuntu 20.04 Linux Machine, RAM: 2 GB,
Storage: 40 GB
• Organization Machine—Initial Access: OS: Windows Server 2012 Machine,
RAM: 2 GB, Storage: 40 GB
• Router: OS: Ubuntu 20.04 Linux Machine, RAM: 2 GB, Storage: 40 GB
The internal server machine and the organization machine (initial access) are on the
192.168.125.1/24 network, connected to each other through the router (192.168.125.1),
while the attacker machine is on the outside network at 192.168.3.11. The purpose of
this was to create separate environments for the attacker and the organization, as in a
real-life scenario.
To demonstrate the email communication, multiple email accounts were created in a
publicly available domain. After the lab setup, the employee machine was booted, and the
email client was configured with the employee's email ID. Multiple email communications
were carried out to demonstrate a real office environment. The attacker also sent
multiple emails to the victim to carry out the spear-phishing attack. For remote access
to the victim machine, Quasar RAT (a Trojan application) was downloaded and installed as
a backdoor to remotely control the victim machine. Later, the attacker created a command
and control (C2) server [13] in a public domain and siphoned off the sensitive data of
the organization.

43.4.3 Remote Access Trojan—Quasar RAT

The malware involved in the scenario is a remote access Trojan (RAT). The generic
version of the Quasar RAT was downloaded to create a backdoor on the employee (victim)
machine [14]. A remote access Trojan (RAT) is a backdoor program for remote access to
and control over the victim machine. Key features available for remote access of the
victim machine through Quasar RAT include keylogging, remote system control, remote
desktop, backdoor shell access, and file management, among others.

43.4.4 Attack Methodology

For initial entry, the attacker continuously sent spear-phishing emails to employees of
the organization. The attacker created an Excel file which downloads the malware from
the attacker's server. The malicious code was hidden in macros which get executed on the
click of a button. The malicious attachment was shared in a password-protected zip
format to prevent it from being detected by firewall and monitoring tools.
When the employee downloaded and opened the attachment, the malware file got downloaded
and executed. The malware was able to stay hidden in its folder. On execution, it
created the client and started keyloggers. The client established a communication
channel with the command and control (C2) server, which gave the attacker complete
access to the victim machine. Once the attacker had access to the victim machine, the
following actions were performed:

1. A task was scheduled to boot up the system after office hours.
2. Along with the start-up, the malicious client also needs to be started; for this, a
new key "Quasar-client" pointing to the path of the file was added to the Windows
registry.
3. Once the communication channel was created, the attacker took remote access of the
victim machine and continuously monitored the employee's activity, such as logins to
internal servers.
4. Once the employee left, the machine booted as per the scheduled time (step 1).
5. The attacker started searching the directories to find any confidential information.
6. For scanning the network, the attacker downloaded the Nmap tool onto the victim
machine and started scanning. Nmap is a network mapper, or scanner, used to detect
hosts and ports in a computer network.
7. The attacker then logged in to the internal server through lateral movement from the
victim machine. The username and password were stored inside the log file created by
the keylogger. The attacker thereby came to know about the critical information on
the internal server.
8. The attacker then exploited the internal server and copied the valuable data from
the server to the victim machine.
9. The attacker then exfiltrated the critical data along with the keylogger log files.
The log file contained many login credentials of the victim user.
10. Once the data were transferred, the attacker dropped a malware named "RedLine
Stealer" for lateral movement [15]. RedLine performs lateral movement to critical
servers or network segments by harvesting credentials and autocomplete data from the
victim machine. RedLine Stealer can also capture a system inventory to steal
usernames, passwords, hardware details, installed software, and software
configuration, and update the attacker with the details of the compromised machine.
11. Once the data were exfiltrated, the attacker scheduled another task. This task
executes a bat file which informs the user about the attack and the PII details. The
attacker further displayed a ransom note demanding that the victim pay in bitcoins
to prevent the release of their confidential data into the public domain (Fig. 43.2).

43.4.5 Incident Analysis

The testbed is intended to facilitate incident root-cause analysis in order to build and
assess participants' cybersecurity skills to counter and resolve cyberattacks.

Fig. 43.2 Attack methodology used by the attacker to compromise the victim organization

The analyst has to conduct an incident investigation and answer the key questions
related to the incident. The analysis can be carried out by analyzing the events, logs,
and artifacts on the victim machine. In the case of such an attack, the analyst is
expected to review the firewall logs, email logs, Windows logs, and server logs. For
instance, while analyzing a Windows machine, they look for new registry entries or new
software installations made by the attacker. In most cases, some footprints are left
behind in any attack, and finding those will lead participants to determine the attacker
methodology, such as the initial access, and details regarding the data breach.
As cybersecurity analysts, participants are assigned the mission of conducting forensic
analysis on the provided victim machine; in the testbed, it is a Windows machine. The
user has to determine how the data breach occurred in the organization and capture the
evidence to support the attack hypothesis. For the purpose of assessment, participants
have to answer questions related to the cyber incident. Assessment questions related to
various domains like log analysis, network traffic analysis, and Windows forensics are
included. The participants have to analyze the victim machine and submit their answers.
A minimal sketch of one such check is given below.
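For example, the following minimal sketch (Windows only, Python standard library) enumerates autorun registry values of the kind the attacker abuses for persistence in this scenario; the exact hive and key used in the testbed are assumptions.

```python
import winreg

# Common autorun locations where a persistence entry such as "Quasar-client" might appear.
RUN_KEYS = [
    (winreg.HKEY_CURRENT_USER, r"Software\Microsoft\Windows\CurrentVersion\Run"),
    (winreg.HKEY_LOCAL_MACHINE, r"Software\Microsoft\Windows\CurrentVersion\Run"),
]

for hive, path in RUN_KEYS:
    try:
        with winreg.OpenKey(hive, path) as key:
            index = 0
            while True:
                try:
                    name, value, _ = winreg.EnumValue(key, index)
                    print(f"{path}\\{name} -> {value}")   # flag anything unexpected
                    index += 1
                except OSError:
                    break                                  # no more values under this key
    except OSError:
        continue                                           # key not present in this hive
```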

Fig. 43.3 Screenshot of update .html to identify malicious URL

43.4.6 Indicative Questions for Practice and Assessment Based on the Technical Incident Investigation

The following are some of the questions designed in the testbed for the assessment of
participants' ability to counter and resolve cybersecurity incidents.
• Q1: Find the number of phishing mails received by the employee of the victim
organization on the day of the attack. (Give your answer in numbers.)
Answer: To be identified from analysis of the email client and identification of the
phishing emails.
• Q2: Identify the POST URL of the phishing mail in which the user is requested to
update their bank credentials.
Answer: To be identified from the update.html file (Fig. 43.3).

• Q3: Identify the type of malware that was downloaded?


• Reverse shell
• RAT
• Ransomware

Answer: To be identified by activity log analysis.


• Q4: Find the generic name of the malware downloaded on opening the malicious
attachment.
Answer: Can be identified by registry keys or suspicious processes (Fig. 43.4).
– Open Task Manager
• Q5: Find the IP Address of attacker command and control server?
Answer: To be identified from network packet capture (Fig. 43.5).

Fig. 43.4 Screenshot of Quasar RAT in task manager



Fig. 43.5 Screenshot of packet capture to identify attacker’s command and control server

• Q6: Find the port number on which the attacker machine was communicating with the
victim machine.
• 6743
• 5143
• 59,028
Answer: To be identified from network packet capture.
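As a hedged illustration of how Q5 and Q6 might be approached, the sketch below (assuming the scapy package and a placeholder capture file name) counts outbound destination IP and port pairs in the packet capture; it is one possible analysis aid rather than the testbed's prescribed solution.

```python
from collections import Counter
from scapy.all import rdpcap, IP, TCP

packets = rdpcap("victim_capture.pcap")        # hypothetical capture taken on the victim machine
talkers = Counter()
for pkt in packets:
    if IP in pkt and TCP in pkt:
        talkers[(pkt[IP].dst, pkt[TCP].dport)] += 1   # count destination IP/port pairs

# The most frequently contacted external endpoint is a candidate C2 server.
for (dst, port), count in talkers.most_common(5):
    print(f"{dst}:{port}  packets={count}")
```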

43.5 Conclusions and Future Work

Initiatives such as cyber exercises, capacity building, and assurance frameworks are
taken around the globe to ensure the cyber resiliency of the digital economy. However,
there is limited academic research and few associated artifacts available for
higher-order skill set development in the cybersecurity domain. There is also a lack of
application of formal pedagogical principles in cybersecurity capacity building. This
work aims to develop a testbed for attack simulation, artifact analysis, higher-order
skills development, and ability assessment, along with mapping of the learning
objectives to pedagogical principles. The work provides an opportunity to conduct
practice-oriented experiential learning sessions which allow the execution of attacks
and incident investigation in a sandboxed environment. The outcome of the research will
help in assessing the effectiveness of practice-oriented experiential learning for skill
building in the cybersecurity domain. To understand and conduct research on how
effective practice-oriented experiential learning is in building a higher-order skillset
in the cybersecurity domain, a practice-oriented lab which replicates a real-cyberattack
scenario was developed. The testbed is developed to help learners perform technical
analysis of cyberattacks within a risk-free environment. The developed scenario contains
recent cyberattack techniques: a malware-based scenario which leads to a data breach and
includes vulnerability assessment, lateral movement, scanning, log analysis of the
victim machine, and network traffic analysis. In this paper, we presented the design of
a simulated attack-based practice-oriented testbed for building cybersecurity skill
sets. Vulnerable machines and exploits were developed for the experiment.
Going forward, an experiment based on the developed testbed can be conducted with
participants. Data and results from the experiment could be analyzed to assess the
effectiveness of practice-oriented experiential learning in building the higher-order
skillset proposed in Bloom's Revised Taxonomy (BRT). Execution of this empirical
investigation will help in assessing the hypothesis about how effective
practice-oriented experiential learning is in building a higher-order skillset in the
cybersecurity domain.

References

1. White, G.B.: A grassroots cyber security program to protect the nation. In: Proceedings of the
Annual Hawaii International Conference on System Sciences. Manoa, Hawaii (2012)
2. Ministry of Electronics and IT: National Cyber Security Policy. Government of India (2013)
3. Bahuguna, A., Bisht, R., Jeetendra, P.: Country-level cybersecurity posture assessment: Study
and analysis of practices. Inf. Secur. J. A Glob. Perspect. 5, 250–266 (2020)
4. White, G.B., Dietrich, G., Goles, T.: Cyber security exercises: testing an organization’s ability
to prevent, detect, and respond to cyber security events. In: 37th Annual Hawaii International
Conference on System Sciences (2004)
5. Ahmad, A.: PhD thesis A cyber exercise post assessment framework: In Malaysia perspectives.
University of Glasgow (2016)
6. Bahuguna, A.: Cyber security exercises. In: Cyber Security Techniques, pp. 82–99. Uttarakhand
Open University (2016)
7. ENISA: Cyber Security Exercises—Survey. ENISA (2015)
8. Makrodimitris, G., Douligeris, C.: Towards a successful exercise implementation–A case study of exer-
cise methodologies. In: International Conference on Human Aspects of Information Security,
Privacy, and Trust (2015)
9. Krathwohl, D.R.: A revision of Bloom’s taxonomy: An overview. Theory Into Pract. 4, 212–218
(2002)
10. Sen, R., Sharad, B.: Estimating the contextual risk of data breach: An empirical approach. J.
Manage. Inf. Syst. 314–341 (2015)
11. Orebaugh, A., Ramirez, G., Beale, J.: Wireshark and Ethereal Network Protocol Analyzer Toolkit.
Syngress Publishing, Inc. (2007)
12. Pritchett, W.L., De Smet, D.: Kali Linux Cookbook. Packt Publishing Ltd. (2013)
13. Zeidanloo, H.R., Manaf, A.A.: Botnet command and control mechanisms. In: Second Interna-
tional Conference on Computer and Electrical Engineering, Vol. 1. IEEE, Dubai, United Arab
Emirates (2009)
14. Quasar. GitHub [Online]. Available: https://github.com/quasar/Quasar. Accessed 22 December 2021
15. Recorded Future: Shining light on redline stealer malware. Record. Fut. (2021)
Chapter 44
Agricultural App Development Using
Machine Learning and Deep Learning:
A Review
Janhavi Chavan, Nandini Dubey, Govind Kotecha, Paras Shah,
and Pranali Kosamkar

Abstract Plant diseases pose a threat to farmers, consumers, the environment, and
the global economy. In India alone, diseases and pests damage 35% of field crops,
resulting in financial losses for farmers. Excessive use of chemicals, many of which
are toxic and biomagnified, also poses a significant health risk. Early disease detec-
tion, crop monitoring, and targeted treatment all contribute to avoiding these unfa-
vorable consequences. Agronomists identify the vast majority of diseases based on
their external characteristics. On the other hand, farmers typically have limited or no
access to professionals. Agriculture is one of India’s primary sources of revenue and
has a big impact. Humanity is undergoing a digital revolution in terms of economic
progress. Rural mobile subscribership has constantly increased over the last many
years. With the proliferation of cellphones and the Internet, there is a great opportunity
for transmitting crucial data via these means. We have compiled a list of agricultural
mobile apps available in the Google Play Store for the Android operating system that
may be beneficial for farming and related jobs. There are significant opportunities
for integrating smartphones with agriculture growth in India. Its use is crucial for
rapid expansion and easy access to information for farmers.

Keywords Machine learning · Deep learning · Disease · Fertilizer · Precision agriculture · Mobile apps

44.1 Introduction

Agriculture is a branch of science that deals with soil cultivation, crop and fruit
production, and livestock rearing. It includes the processing and distribution of plant
and animal products for human use. Agriculture is one of the most significant sectors
in India because it gives work to the majority of the population. Modern technology

J. Chavan (B) · N. Dubey · G. Kotecha · P. Shah · P. Kosamkar
MIT World Peace University, Pune, India
e-mail: janhavi.a.chavan@gmail.com
P. Kosamkar
e-mail: pranali.kosamkar@mitwpu.edu.in


has a tremendous impact on the lives of farmers. Because agriculture is an old trade,
farmers are at first hesitant to adopt new technology and tend to stay with
tried-and-true procedures. However, no one can deny the impact and benefits of
technology in today's world. Advances in science have made it possible for farmers to
grow more crops and raise more livestock. In the event that they require assistance,
farmers can use a mobile app to connect with local specialists. An agricultural app can
tell them in advance whether it is going to be sunny or gloomy, and the weather forecast
allows them to plan their day's activities. Farmers, like all businesses, may benefit from
technological improvements in order to become more efficient. Mobile apps and
cutting-edge equipment boost productivity by orders of magnitude. Incorporating
smart irrigation equipment, tracking devices, and smart surveillance systems into a
farmer’s workflow saves a lot of time and money.

44.2 Literature Survey

As the world’s population increases, the demand for food increases proportionately.
Farmers are utilizing cutting-edge technologies such as sensors, drones, intelligent
irrigation, and GPS-equipped tractors to increase the sustainability of food produc-
tion. By 2050, the world's population is expected to reach 9.7 billion people, and food
demand is rising accordingly. Agriculture requires continuous and
sustainable productivity growth, but with scarce resources such as water, electricity,
and fertilizer, these must be used carefully to protect and sustain the environment and
the soil quality of arable land. This analysis delves into 25 distinct mobile applica-
tions. These apps are focused on agricultural crops, fruits, and vegetables. It informs
us of the app’s benefits, drawbacks, limitations, and other features. RiceXpert app
offered by National Rice Research Institute covers four major categories of rice
diseases and their subcategories. This app describes nine distinct varieties of rice
according to the region in which they are grown. Each of these categories has its
own set of characteristics, such as release year, grain type, yield, and total dura-
tion, which are all addressed in a separate section of the app. There are six major
sections: weeds, nematodes, diseases, and insect pests, in which all pertinent infor-
mation, as well as preventive and remedial measures, is discussed in detail. Addi-
tionally, there are features for weather forecasting, price advisory, marketing, and
fertilizer calculation. As a result, the scientists have developed a methodology for
diagnosing rice diseases and pests automatically [1]. The Farm calculator app main-
tained by Indian Council of Agricultural Research contains four sections: a fertilizer
calculator, a pesticide quantity calculator, a plant population calculator, and a seed
rate calculator. This application is universally applicable to all crops. These apps
do not automatically detect disease or recommend pesticides. It is only capable of
performing mathematical operations [2]. Bharat Agri app has some unique features,
such as the ability to add our farm so that it can access critical field information.
Our questions can be directed to the chat support system. It consists of a weather
advisory system, a call support system, satellite images, and the detection of pests

and diseases. This section discusses over 140 crops. A krushi book is included with
detailed information on all of the crops featured in the app. It has a water testing
facility, but it is a fee-based service [3]. Plant Disease Identification App detects
plant diseases in 18 different fruit and vegetable varieties, as well as six agricultural
crops. The authors proposed a machine learning model for disease prediction and
pesticide and fertilizer recommendation. The user needs only to upload images of
the damaged area. Pesticides and fertilizers are not recommended for all diseases
[4]. The next app is titled as pest and plant diseases. This application is not intended
to diagnose disease. It encompasses all crops. This app contains information about
pests and fertilizers that can be used to boost yields and prevent disease [5]. Plan-
ticus’ application contains information about seven agricultural crops and two fruits,
as well as their diseases. It detects disease and recommends pesticides after the user
uploads a photo. There is a library section that contains information about specific
crops and their diseases. This section discusses only a few crop varieties and their
diseases [6]. Cropalyser is a mobile application that assesses ten distinct agricultural
crops. It is not capable of real-time disease detection and makes no recommenda-
tions regarding pesticides or fertilizers. It merely provides information on the various
diseases associated with a particular crop and how to prevent them [7]. Kheti Point
is an app, where the user must upload an image of the diseased crop to the chat
room, where an expert will analyze the disease and make pesticide recommenda-
tions. This app contains data on all crops. The only restriction is that you must take
and upload a picture each time. Due to the lengthy nature of this procedure, the image
may occasionally be blurred [8]. Agri Doctor is the app that covers a few different
varieties of fruits, vegetables, and agricultural crops, as well as their diseases. The
app contains a single section devoted to the latest equipment and instructions on
how to use it. These applications are limited to a few crop varieties [9]. Next app
is Scouting. This app operates similarly to Agri Doctor. The only drawback is that
only a limited number of crop varieties are covered [10]. AgriCentral is an app,
which is similar to Scouting. It includes a weather report and a calendar for tracking
completed tasks. This will assist you in staying organized and on top of all tasks
and important dates [11]. AgriAI is an application that supports a limited number
of agricultural crop varieties [12]. The e-Mausam krishi app is maintained by the
government of Haryana. The app includes a plethora of features and is applicable to all
crops as well as to crop-specific pathogens. It contains detailed information
on crops, diseases, preventive measures, and general recommendations for disease
management. Weekly crop updates are provided. There is a source for a contact
number for scientists. It does not assist in real-time disease detection [13]. AgriSetu
is an application that provides information on a variety of fruits and agricultural crops.
It includes a calendar outlining the crops grown in each month. Additionally, this
app includes a soil-related feature [14]. Plantix app covers all crop-specific diseases
and categorizes them according to the stage at which they manifest. Fertilizers are
recommended based on the plot’s size. The app is only compatible with a limited
number of crop varieties [15]. Pacific pathogens of pests and weeds app include fact
sheets on every disease, weed, and mite. However, the app does not allow for user
interaction in order to provide specific disease information [16]. The Kisankraft App

discusses a variety of crops. It provides a weather forecast and calculates fertilizer
dosages using the user-supplied NPK values. There is a section devoted to renting
tools and machinery. This app does not contain all available information on diseases
[17]. Agrostar is another app that covers all crops and only provides expert advice
on diseases or pesticides over the phone. This is the sole constraint that requires us
to consult experts for each and every query [18]. AgroAi app covers two fruits and
six agricultural crops. Apart from the other apps, this one includes a soil fertility test
feature that requires the user to enter a pH value and displays the results. To diagnose
a disease, the user must upload an image, and the app will suggest a treatment. Not
all crops are covered here, and not all results are displayed [19]. Onesoil’s Scouting
app is a farming scouting tool. All crops are covered here, as the field can be located
on a map and variable fertilizer rates determined. However, while working on the
map, it lags and eventually stops working [20]. The Agrio app provides information
on two fruits and four agricultural crops. The user uploads an image of the crop,
which is then diagnosed by experts and recommended pesticides and fertilizers [21].
Leaf doctor is the next app. The app’s primary objective is to ascertain the health of
diseased leaves. The only restriction is that suggestions must be submitted via email
[22]. Plantify Dr focuses on a few agricultural and fruit-related topics. To detect the
disease, it is proposed that a machine learning model be used. It makes no recom-
mendation regarding the use of pesticides or fertilizers [23]. Crop Diagnosis app
covers only three crops. The app suggests a diagnosis via chat, and users can then
take additional measures to protect crops [24]. The Purdue Tree Doctor app, which is
offered by Purdue University, covers over 60 trees and their specific diseases. It makes
recommendations regarding treatments for specific diseases; however, pesticides are not
recommended [25]. These were the different apps evaluated, highlighting the distinct
characteristics of each app along with its shortcomings. The survey shows that most of
the apps offer three basic functionalities: disease detection, pesticide recommendation,
and fertilizer recommendation. More apps should be designed to assist farmers in
checking soil fertility.

44.3 Research Gap

After conducting the survey of agricultural applications, we have identified the
following research gaps in them. Numerous apps are confined to a single or a few
crops. This can present complications for the app’s user, as the user will be unable
to contact the app for assistance if he does not find the crop he is seeking for. Few
applications do not propose pesticides or fertilizers to treat agricultural diseases. If
the user is utilizing an app that does not recommend pesticides and fertilizers for
the disease that has been noticed on crops, he will be unable to save the crop from
disease, which could result in considerable financial loss. Many of the apps either
do not detect any ailments or detect only a few. If an app does not offer disease
detection or only covers a small number of diseases, the user will be unsure about

the type of disease afflicting the crop and how to avoid it. Numerous applications
provide no mention of the quantity of fertilizer or pesticide to be used. If the user’s
app does not suggest the appropriate amount of fertilizer or pesticide to use, it can
create significant problems for them, as using too little or too much fertilizer or
pesticide can cause serious damage to the crops. Numerous apps are incapable of
detecting soil N, P, and K values. If the user is unaware of the soil’s N, P, and K
contents, he will be unable to determine which crops are best suited to the soil type.
Agriculture-related calculators are not functioning properly. Numerous apps, such as
fertilizer calculators, pesticide calculators, and seed rate calculators, are inoperable.
Issues with logging in and signing up are evident in some apps. Many people are
unable to login and register in order to utilize the software. Numerous apps lack
up-to-date information. The information provided in a few parts, such as rates and
agricultural news, is a few days old and not the most recent. Numerous applications
have a language barrier. There is no opportunity to change the language in some
apps; the language is determined automatically based on the state selected. Certain
applications lack a location or address capability. On the other hand, some apps do
not require an address or location, making it difficult to purchase agricultural prod-
ucts through the app. Expert recommendations offered over the phone or through
the chat system are delayed. Numerous apps offer disease or pesticide advice from
experts via chat or phone. However, this is a time-consuming job, as the user must
wait for a response over an extended length of time. Numerous applications answer
within 48 hours. Thus, there is a need to develop a scalable mobile application which
will be easy to use and prove to be beneficial for the farmers.

44.4 Technology Used for Agricultural Mobile App


Development

44.4.1 Machine Learning

Machine learning approaches used in agriculture are drawn from the learning process: the
models learn from experience in order to complete a specific task. Machine learning is
used in agriculture to boost crop output and quality.
Nikita Yadav et al. have reviewed different machine learning techniques like deci-
sion tree, random forest, and image preprocessing for detection of leaf diseases.
The proposed system suggests the pesticides for particular disease [26]. Modern
agriculture collects data using a range of sensors to gain a better understanding
of the environment, which includes crop, soil, and weather variables. These facts
will enable us to make quick, result-oriented decisions. Abhishek Shah et al. have
proposed a weather-based forewarning pest prediction model. It detects different
pests for a particular crop. In future, the system can be developed as an end-to-
end product using a prototype [27]. Spoorthi S. et al. have developed a Freyr drone
which aims to reduce the work of farmers and complete the task in less amount

of time. The drone is used for spraying pesticides [28]. Prof. Swati D Kale et al.
have proposed a UAV model for spraying chemicals. This model works on the feed-
back provided by wireless sensor networks which is further deployed on the crop
field. They have developed an algorithm for adjusting the drone according to wind
[29]. Dr. Kiran Kumar Gurral et al. have proposed a disease diagnosis model which
identifies disease at an early stage. The work can further be extended to detect the
diseases using hybrid techniques [30]. Muhammad Junaid et al. have proposed a
cloud-based system on which different activities related to agriculture are monitored
and analyzed by agricultural experts. SVM algorithm is used to classify the data.
The work can further be improved by working on precision agriculture and other
critical factors [31]. Ms. Supriya shinde et al. have proposed a system which predicts
crop diseases using IoT and SVM. Parameters like temperature, humidity, rainfall,
and light intensity are taken into consideration. The work can further be extended
to build an app which could be useful for farmers [32]. Monzurul Islam et al. have
presented an approach wherein they have combined image processing and SVM
to detect the plant leaf disease of over 300 images 95% of accuracy is achieved
using SVM classifier [33]. Debasish Das et al. have implemented a framework for
detection of diseases. For classification purposes, three different algorithms are used
which are SVM, random forest, and logistic regression. SVM outperforms the other
two algorithms. This particular model can be used in real-life applications [34].
M. P. Vaishnave et al. have designed an application for detection of groundnut leaf
diseases. Four different groundnut diseases are categorized here. KNN algorithm is
used here which increases the accuracy of model. In future, extra classifiers can be
added which will decrease the false classification [35]. R. Deepika Devi et al. have
presented an IoT system along with random forest algorithm to detect and classify
the disease in banana plant. Different environmental parameters like temperature and
soil moisture are taken into consideration. The model has achieved the accuracy of
99% [36]. Mugithe et al. have proposed an application for identifying leaf disease.
It employed the K-means clustering technique in two ways: through the graphical
user interface, where it achieved an accuracy of 95.1613%, and in real-time, where
a buzzer alerts the farmer if a disease is detected [37]. Tete et al. proposed a model
which outlines numerous segmentation approaches for determining the presence of
various plant diseases. Additionally, this research studies categorization approaches
for plant diseases [38]. Vijai Singh et al. have proposed an algorithm which automati-
cally classifies and detects the diseases. The algorithm is applied on four agricultural
crops among which bean crop achieved 92% of accuracy [39]. Despite the fact that
the k-means clustering algorithm requires an a priori determination of the number
of cluster centers, this strategy is more successful than thresholding and gives the
best results when dealing with diverse datasets. After analyzing multiple machine
learning models, it is noticeable that most researchers used SVM even though all
models have given adequate accuracy. Along with SVM, the K-means algorithm is
the other algorithm which is used by most of the researchers.
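As a minimal illustration of the SVM-based classification pipeline that several of the surveyed studies adopt, the following scikit-learn sketch trains an RBF-kernel SVM on placeholder image features; the feature representation, dataset size, and class count are assumptions for illustration and are not taken from any specific paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# X: one row per leaf image (e.g. resized to 64x64x3 and flattened),
# y: integer disease labels. Random data stands in for a real dataset.
X = np.random.rand(300, 64 * 64 * 3)
y = np.random.randint(0, 4, size=300)          # four hypothetical disease classes

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = SVC(kernel="rbf", C=10, gamma="scale")   # RBF-kernel SVM classifier
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```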

44.4.2 Deep Learning

Deep learning techniques such as CNN, RNN, ANN, ResNet, and Inception can be
employed in agriculture. Because the accuracy levels of various deep learning algo-
rithms vary, the initial process of selecting an algorithm to use is critical. It is crucial
to detect and recognize diseases in crops at an early stage in the agricultural industry.
Sammy V. Militante et al. have developed a deep learning-based system to detect
and recognize diseases in several plant varieties [40]. Lili Li et al. have studied the
scientific advancement of deep learning technology in the field of crop leaf disease
detection that has been done in recent years. One of their primary learnings is that
better resilience deep learning models are required to adapt to diverse datasets [41].
Tanha Talaviya et al. look at how artificial intelligence can be used in agriculture for
irrigation, weeding, and spraying with the use of sensors and other devices integrated
in robots and drones. These technologies reduce the amount of water, pesticides, and
herbicides used, preserve soil fertility, aid in the efficient use of manpower, and
increase productivity and quality. They have conducted a survey to look at the work
of a number of researchers in order to acquire a quick overview of the present state
of automation in agriculture, including weeding systems using robots and drones
[29]. Solemane Coulibaly et al. have suggested a method for constructing a mildew
disease diagnostic system in pearl millet that combines transfer learning and feature
extraction [42]. Demonstrations show that using a transfer learning approach for
image recognition provides a quick, low-cost, and easy-to-use solution for detecting
digital plant diseases. Amanda Ramcharan et al. have used transfer learning to train
a deep convolutional neural network to identify three diseases and two types of
pest infestation using a dataset of cassava disease images captured in the field in
Tanzania [43]. Jagadish Kashinath Kamble et al. have presented an image processing
technique based on ANN for detecting plant diseases so that farmers can take timely
measures to cure them [44]. D. Devi et al. have utilized CNN and SVM for building a
model which detects diseases in fruits.
To get the information about the presence of pesticides in real time, few sensors,
Arduino, and a Wi-Fi module were used [45]. Pushkara Sharma et al. have devel-
oped an artificial intelligence-based autonomous plant leaf disease detection and
classification system that allows for quick and easy disease detection, classifica-
tion, and treatment. They used logistic regression, KNN, SVM, and CNN as the
system’s classifiers. CNN has been found to outperform all other algorithms [46].
Kaushik Kunal Singh devised a real-time diagnosis using CNN for cloud-based
image processing. To achieve higher accuracy, the model constantly learns from
user-submitted photographs and expert suggestions [47]. Michael Gomez Selvaraj
et al. have created an AI-based banana disease and pest detection system using DCNN
to assist farmers in cultivation of bananas. Their study revealed that the DCNN is a
reliable and simple-to-implement technique for detecting banana disease and pests
[48]. Melike Sardogan et al. have developed a tomato leaf diseases detection and clas-
sification method based on CNN with learning vector quantization algorithm. The
proposed approach effectively recognizes four different forms of tomato leaf diseases.
Varying filters or different sizes of convolutions can be employed to increase recogni-
tion rate in the classification process [49]. Pranali K. Kosamkar et al. have suggested
a system that uses TensorFlow technology to do preprocessing and feature extraction
of leaf pictures, followed by CNN for disease classification and pesticide recommen-
dation. To train the model, they have employed CNN with different levels (three, four,
and five layers) and an android application as the user interface [50]. Yong Ai et al.
employed the Inception-ResNet-v2 model, which was built utilizing deep learning
theory and CNN, to automatically identify agricultural illnesses. They have also inte-
grated their model with an app for detecting agricultural diseases and insect pests, as
well as providing relevant advice. The model can be expanded for more crop species
in future [51]. Wei-Jian Hu et al. utilize deep learning and IoT technology to develop
a comprehensive understanding of crop disease recognition. Apart from identifying
the disease, it also distinguishes between disease stages. On all metrics, the MDFC–
ResNet model outperforms the other models. It has the highest average accuracy, the
widest range (from zero to one), and the highest precision, recall, and F1 values [52].
Inception v3 is a very well-known image recognition model that has been shown
to perform significantly better on the ImageNet dataset than the previous versions.
It can be used to identify diseases and pests. Tejas Pandit et al. address Inception
networks, their limitations, and the difficulties encountered by some of the architectural
schemes utilized in Inception networks. The performance of various Inception
network versions was evaluated. They found that these networks are a promising area
of research, and that various models integrating these variants performed exceptionally
well in image classification problems [53]. After analyzing multiple models,
it is noticeable that most researchers used CNN even though all models have given
adequate accuracy. When compared to machine learning algorithms, deep learning
models have been demonstrated to be more accurate.
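Several of the surveyed works [42, 43] rely on transfer learning from a network pretrained on ImageNet. A minimal sketch of that idea is given below using a MobileNetV2 backbone in Keras; the directory layout, number of classes, and training settings are illustrative assumptions rather than the configuration used in any of the cited studies.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 5           # assumption: e.g. healthy + four diseases
IMG_SIZE = (224, 224)

# Assumed directory layout: data/train/<class_name>/*.jpg
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=IMG_SIZE, batch_size=32)

# ImageNet-pretrained backbone, frozen for transfer learning.
base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
base.trainable = False

model = models.Sequential([
    layers.Rescaling(1.0 / 127.5, offset=-1.0),   # scale pixels to [-1, 1]
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.2),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)
```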

44.5 Proposed System

The following diagram shows the proposed system for detection of diseases in crops
as one of the modules of our Web application, along with the fertilizer recommendation,
soil fertility testing, and land lease modules.

44.5.1 CNN Model for Plant Leaf Disease Detection

1. Image Preprocessing: Pre-processed images have a smaller image size and a
higher resolution. This step processes and improves the image; the images are colored
and resized.

Fig. 44.1 High-level diagram of CNN model

2. Classification: After the model has been trained and tested, it is used to classify
the input image: whether the plant is infected or not, the type of disease, and the
type of plant.

By using the proposed model as shown in Fig. 44.1, the image is classified after
preprocessing of the testing image. A unique disease name is then generated and
sent to an Android app, allowing farmers to take the necessary steps to minimize the
disease percentage.
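To make the two steps above concrete, the following sketch shows how a test image could be preprocessed (kept in colour and resized) and then classified, with the predicted disease name returned for display in the Android app. The model file name and the class-name list are hypothetical placeholders, not artifacts of the actual system.

```python
import numpy as np
import tensorflow as tf

# Assumptions: a CNN trained as in the earlier sketch was saved to
# "leaf_cnn.h5", and CLASS_NAMES matches the training class order.
CLASS_NAMES = ["healthy", "bacterial_blight", "leaf_spot", "rust", "mosaic"]

def predict_disease(image_path, model_path="leaf_cnn.h5", img_size=(224, 224)):
    model = tf.keras.models.load_model(model_path)
    # Preprocessing: load in colour and resize, as described above.
    img = tf.keras.utils.load_img(image_path, target_size=img_size)
    arr = tf.keras.utils.img_to_array(img)[np.newaxis, ...]
    probs = model.predict(arr)[0]
    # The returned disease name could then be sent to the Android app.
    return CLASS_NAMES[int(np.argmax(probs))], float(np.max(probs))
```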

44.6 Conclusion

Detailed information about 25 different agricultural applications has been provided


in this paper, which will be of great assistance to farmers in their efforts to increase
crop production. While some apps provide real-time assistance through a variety
of features such as calling, image uploading, and so on, others do not allow users
to communicate with crop cultivation and disease management professionals. To
achieve more accurate results, machine learning and deep learning algorithms can
be used in the development of such applications. The use of sensors can also be
advantageous in some situations. In the following stage, farmers can use drones to
monitor their fields, which is a significant step forward.

References

1. National Rice Research Institute: riceXpert (3.7), Mobile application software, Google
Playstore App (2020)
2. Koti, V.: Farm Calculator (3.6), Mobile application software, Google Playstore App (2018)
3. Best and Trusted Agriculture Digital Farming: Bharat agri: Smart Kisan App (3.3.18.10),
Mobile application software, Google Playstore App (2021)
4. Fouxa: Plant disease identification (14.14.22), Mobile application software, Google Playstore
App (2021)
5. Adek Nata: Pests and plant diseases (7.0), Mobile application software, Google Playstore App
(2020)

6. Ask Attis: Planticus (1.1.2), Mobile Application Software, Google Playstore App.
7. Bejo Zaden B.V.: Cropalyser (1.6.13), Mobile Application Software, Google Playstore App
(2020)
8. Khetipoint: Khet point: Agriculture Doctor App Indian Farmers (1.0.6), Mobile Application
Software, Google Playstore App (2021)
9. Agri Doctor: Agri Doctor—Agriculture App for Farmers Farming (1.4.1), Mobile Application
Software, Google Playstore App (2020)
10. BASF Digital Farming GmbH: SCOUTING—Automate field diagnosis (3.17.1), Mobile
application software, Google Playstore App (2021)
11. AgriCentral: Agri Central (4.0.6), Mobile Application Software, Google Playstore App (2021)
12. AgriAi BV: AgriAi (1.0), Mobile application software Google Playstore App (2020)
13. CCS Haryana Agricultural University: Emausamhau Krishi Mausam Seva (1.1.18), Mobile
Application Software, Google Playstore App (2021)
14. Agrisetu: Agri Setu—Agriculture App for Smart Farming (1.1), Mobile Application Software,
Google Playstore App (2021)
15. Plantix: Plantix—Your Crop Doctor (3.6.2), Mobile Application Software, Google Playstore
App (2021)
16. Lucid Mobile: Pacific Pests, Pathogens and Weeds (1.7.5), Mobile Application Software,
Google Playstore App (2021)
17. KisanKraft Ltd.: KisanKraft—Rent Buy Learn Earn (4.2.0), Mobile Application Software,
Google Playstore App (2021)
18. AgroStar: AgroStar: Kisan Helpline and Farmers Agriculture App (5.12.2), Mobile Application
Software, Google Playstore App (2021)
19. Neural Farms: AgroAI—Plant Diseases Diagnosis (Varies with device), Mobile Application
Software, Google Playstore App (2020)
20. OneSoil: One Soil Scouting: Farming Tool (5.4.0), Mobile Application Software, Google
Playstore App (2021)
21. Saillog Ltd.: Agrio—Precision Farming Made Easy! (3.3.3), Mobile Application Software,
Google Playstore App (2021)
22. Sarah Jane Pethybridge: Leaf Doctor (1.0), Mobile Application Software, Google Playstore
App (2017)
23. Alex Lavaee: Plantify Dr (1.2), Mobile application software, Google Playstore App (2020)
24. Ergobyte Informatics S.A.: CropDiagnosis (1.1.0), Mobile Application Software, Google
Playstore App (2016)
25. Purdue University: Purdue Tree Doctor (1.2), Mobile application software, Google Playstore
App (2020)
26. Yadav, N., Kasar, S., Abuj, D., Vadvale, A., Dharmadhikari, S.C.: Crop disease prediction
and solution. Int. Res. J. Eng. Technol. (IRJET) 08(02), 599–602 (2021). e-ISSN: 2395-0056,
p-ISSN: 2395-0072
27. Shah, A., Syeda, S.: Machine learning based prediction and recommendation system for
detection of pests and cultivation of crops. Int. J. Res. Eng. Sci. Manage. 3(12), 86–92 (2020)
28. Spoorthi, S., Shadaksharappa, B., Suraj, S., Manasa, V.K.: Freyr drone: Pesticide/fertilizers
spraying drone-an agricultural approach. In: 2017 2nd International Conference on Computing
and Communications Technologies (ICCCT), pp. 252–255. IEEE (2017)
29. Kale, S.D., Khandagale, S.V., Gaikwad, S.S., Narve, S.S., Gangal, P.V.: Agriculture drone for
spraying fertilizer and pesticides. Int. J. Adv. Res. Comp. Sci. Softw. Eng. 5(12), 804–807
(2017). ISSN: 2277 128X
30. Gurrala, K.K., Yemineni, L., Rayana, K.S.R., Vajja, L.K.: A new segmentation method for
plant disease diagnosis. In: 2019 2nd International Conference on Intelligent Communication
and Computational Techniques (ICCT), pp. 137–141. IEEE (2019)
31. Junaid, M., Shaikh, A., Hassan, M.U., Alghamdi, A., Rajab, K., Reshan, A., Saleh, M., Alkinani,
M.: Smart agriculture cloud using AI based techniques. Energies 14(16), 5129 (2021)
32. Shinde, S.S., Kulkarni, M.: Review paper on prediction of crop disease using IoT and machine
learning. In: 2017 International conference on transforming engineering education (ICTEE),
pp. 1–4. IEEE (2017)

33. Islam, M., Dinh, A., Khan, W., Bhowmik, P.: Detection of potato diseases using image segmen-
tation and multiclass support vector machine. In: 2017 IEEE 30th Canadian Conference on
Electrical and Computer Engineering (CCECE), pp. 1–4. IEEE (2017)
34. Das, D., Singh, M., Mohanty, S.S., Chakravarty, S.: Leaf disease detection using support vector
machine. In: International Conference on Communication and Signal Processing (ICCSP),
2020, pp. 1036–1040. IEEE (2020)
35. Vaishnnave, M.P., Devi, K.S., Srinivasan, P., Jothi, G.A.P.: Detection and classification of
groundnut leaf diseases using KNN classifier. In: 2019 IEEE International Conference on
System, Computation, Automation and Networking (ICSCAN), pp. 1–5 (2019)
36. Devi, R.D., Nandhini, S.A., Hemalatha, R., Radha, S.: IoT enabled efficient detection and
classification of plant diseases for agricultural applications. In: 2019 International Conference
on Wireless Communications Signal Processing and Networking (WiSPNET), pp. 447–451.
IEEE (2019)
37. Mugithe, P.K., Mudunuri, R.V., Rajasekar, B., Karthikeyan, S.: Image processing technique
for automatic detection of plant diseases and alerting system in agricultural farms. In: 2020
International Conference on Communication and Signal Processing (ICCSP), pp. 1603–1607.
IEEE (2020)
38. Tete, T.N., Kamlu, S.: Detection of plant disease using threshold, k-mean cluster and ann
algorithm. In: 2017 2nd International Conference for Convergence in Technology (I2CT),
pp. 523–526. IEEE (2017)
39. Singh, V., Misra, A.K.: Detection of plant leaf diseases using image segmentation and soft
computing techniques. Inf. Process. Agric. 4(1), 41–49 (2017)
40. Militante, S.V., Gerardo, B.D., Dionisio, N.V.: Plant leaf detection and disease recognition using
deep learning. In: 2019 IEEE Eurasia Conference on IOT, Communication and Engineering
(ECICE), pp. 579–582. IEEE (2019)
41. Li, L., Zhang, S., Wang, B.: Plant disease detection and classification by deep learning—a
review. IEEE Access 9, 56683–56698 (2021)
42. Coulibaly, S., Kamsu-Foguem, B., Kamissoko, D., Traore, D.: Deep neural networks with
transfer learning in millet crop images. Comput. Ind. 108, 115–120 (2019)
43. Ramcharan, A., Baranowski, K., McCloskey, P., Ahmed, B., Legg, J., Hughes, D.P.: Deep
learning for image-based cassava disease detection. Front. Plant Sci. 8, 1852 (2017)
44. Kamble, J.K.: Plant disease detector. In: 2018 International Conference On Advances in
Communication and Computing Technology (ICACCT), pp. 97–101. IEEE (2018)
45. Devi, D., Anand, A., Sophia, S.S., Karpagam, M., Maheswari, S.: IoT deep learning based
prediction of amount of pesticides and diseases in fruits. In: 2020 International Conference on
Smart Electronics and Communication (ICOSEC), pp. 848–853. IEEE (2020)
46. Sharma, P., Hans, P., Gupta, S.C.: Classification of plant leaf diseases using machine learning
and image preprocessing techniques. In: 2020 10th International Conference on Cloud
Computing, Data Science and Engineering (Confluence), pp. 480–484. IEEE (2020)
47. Singh, K.K.: An artificial intelligence and cloud based collaborative platform for plant disease
identification, tracking and forecasting for farmers. In: 2018 IEEE International Conference
on Cloud Computing in Emerging Markets (CCEM), pp. 49–56. IEEE (2018)
48. Selvaraj, M.G., Vergara, A., Ruiz, H., Safari, N., Elayabalan, S., Ocimati, W., Blomme, G.:
AI-powered banana diseases and pest detection. Plant Methods 15(1), 1–11 (2019)
49. Sardogan, M., Tuncer, A., Ozen, Y.: Plant leaf disease detection and classification based on
CNN with LVQ algorithm. In: 2018 3rd International Conference on Computer Science and
Engineering (UBMK), pp. 382–385. IEEE (2018)
50. Kosamkar, P.K., Kulkarni, V.Y., Mantri, K., Rudrawar, S., Salmpuria, S., N. Gadekar. Leaf
disease detection and recommendation of pesticides using convolution neural network. In:
2018 Fourth International Conference on Computing Communication Control and Automation
(ICCUBEA), pp. 1–4. IEEE (2018)
51. Ai, Y., Sun, C., Tie, J., Cai, X.: Research on recognition model of crop diseases and insect
pests based on deep learning in harsh environments. IEEE Access 8 (2020)

52. Hu, W.J., Fan, J., Du, Y.X., Li, B.S., Xiong, N., Bekkering, E.: MDFC–ResNet: an agricultural
IoT system to accurately recognize crop diseases. IEEE Access 8, 115287–115298 (2020)
53. Pandit, T., Kapoor, A., Shah, R., Bhuva, R.: Understanding Inception Network Architecture
for Image Classification (2020). https://doi.org/10.13140/RG.2.2.16212.35204
Chapter 45
Performance Evaluation of Biharmonic
Function-Based Image Inpainting
Approach

Manjunath R. Hudagi, Shridevi Soma, and Rajkumar L. Biradar

Abstract Image inpainting refers to restoring missing or damaged regions in an image;
it is a procedure for recovering corrupted and old images. The major intention of this
research is to analyze and justify the effec-
tiveness of the image inpainting technique. Accordingly, performance analysis of
image inpainting model is performed through Biharmonic functions. Moreover, the
analysis is done by varying domain size in the Biharmonic function based on the
percentage of images. Here, the performance of Biharmonic functions is evaluated
using Structural Similarity Index Measure (SSIM), peak signal to noise ratio (PSNR),
universal image quality index (UQI), second-derivative-like measure of enhance-
ment (SDME), multi-scale structural similarity (MS-SSIM), and Mean Square Error
(MSE). Thus, from this analysis, it is shown that the Biharmonic functions obtained
better performance for the image inpainting process.

Keywords Image inpainting · Biharmonic functions · Noise reduction ·


Super-resolution · Universal image quality index · Peak signal to noise ratio ·
Multi-scale structural similarity · Second-derivative-like measure of enhancement

45.1 Introduction

Inpainting is an effective approach to restoring missing regions in an image,
recreating the image based on background information [1]. In recent years, image
inpainting has played a vital role in computer graphics, and it is significant in
television special effect production, heritage conservation, film,

M. R. Hudagi (B)
Tatyasaheb Kore Institute of Engineering and Technology, Warananagar, India
e-mail: manjuhudagi@gmail.com
S. Soma
Poojya Doddappa Appa College of Engineering, Gulbarga, India
R. L. Biradar
G. Narayanamma Institute of Technology and Science (for Women), Hyderabad, India


and the elimination of redundant objects. The most important aim of image inpainting
is that the restored region should not be noticeable to human observers. In addition,
image inpainting is an ill-posed inverse problem, which has no precise unique solution
[2–4]. Generally, image inpainting is a significant research area, and the outcomes of
this research can be employed for television effects and film, text removal, cultural
relic protection, and so on [5]. The process of reconstructing missing sections or
damaged images from imperfect data, as precisely as possible, is called image
inpainting [6]. The major applications of image inpainting are the removal of scratches
and texts in early drawings, restoration of pixels lost during image transmission [7],
and the elimination of unwanted areas of a film or photograph. Besides, the inpainting
scheme assists in noise reduction, super-resolution, demosaicing, etc. [8].
Image inpainting techniques are separated into two types, namely texture
synthesis and partial differential equation techniques [5]. They are usually grouped
into exemplar-based [9], edge-based [10], statistics-based [11], sparsity-based [12],
and geometry-based [13] schemes. Furthermore, state-of-the-art techniques are divided
into two major categories, namely traditional methods [13] and learning-based
approaches [14]. The conventional approaches are further categorized into
diffusion-based and patch-based algorithms. In a traditional image inpainting model,
images are recovered from corrupted images by filling the damaged area using
diffusion coefficients [15] or candidate-patch techniques. A diffusion-based inpainting
model fills missing areas from the surrounding region at the pixel level.
The major goal of this research is to analyze the performance of the
image inpainting approach using Biharmonic functions. Here, the analysis is done
by varying the image percentage and the domain size in the Biharmonic function,
with respect to various metrics.

45.2 Image Inpainting Using Biharmonic Functions

This section explains the image inpainting approach based on Biharmonic functions [16].
The input image H is considered and processed for acquiring harmonic images. Let us
assume a simply connected region B with boundary V = ∂B, and let g denote the diameter
of this simply connected region. Moreover, let q0 be a smooth function given in a
neighborhood outside B. The function q to be identified must satisfy

q = q0, outside B    (45.1)

The function q can be constructed in various ways such that Eq. (45.1) holds. For
smooth surface completion, the function p is identified by the following equation,
 
p = d + κ(c) + qmax σv υv    (45.2)

where qmax specifies the maximum search radius at d + κ(c), σv indicates a reduction
factor, which ranges from 0 to 1, and υv denotes a three-dimensional vector. In smooth
image inpainting, q0 is known in a neighborhood outside the region B, while the data
inside the region is absent. The major intention of image inpainting is to extend the
function into the region so that the extension over the missing area is not visible to
human eyes. The image inpainting method is referred to as linear (or simple linear)
if, for any smooth image, as the diameter g of the inpainting region goes to zero,

‖q − q0‖_B = P(g²)    (45.3)

where q specifies the image attained from the inpainting method and ‖·‖_B indicates
the L∞(B) norm. Here, the notation b = P(h) means that b is bounded by a constant
multiple of h, i.e., |b| ≤ Kh for some constant K > 0. The harmonic inpainting is
specified by

Δq = 0 in B,
q|_V = q0|_V    (45.4)

Let us assume q1 is the harmonic inpainting of q0 and q is the cubic inpainting of
q0 on the region B. The function q(a) is written as

q(a) = q1(a) + q2(a);  a ∈ B    (45.5)

Moreover, the term q2 solves the Poisson equation, which is specified as

Δq2 = q_f in B,
q2|_V = 0    (45.6)

Then the cubic inpainting q of q0 satisfies

‖q − q0‖_B = P(g⁴)    (45.7)

The result generalizes to a higher-order approximation, where the Laplacian may also
be exchanged for various anisotropic operators:

Δ^p q = 0 in B,
Δ^j q|_V = Δ^j q0|_V,  j = 0, 1, . . . , p − 1    (45.8)

‖q − q0‖_B = P(g^{2p})    (45.9)

Therefore, a sharper error bound is attained. After that, Eq. (45.8) can be rewritten
as,

l_0 = 0 in B,
Δl_j = l_{j−1} in B,
l_j|_V = Δ^{p−j} q0|_V;  j = 1, 2, . . . , p    (45.10)

q = l_p    (45.11)

Thus, the problem of solving Eq. (45.8) is reduced to the problem of solving Poisson
equations of the form

Δq = o in B,
q|_V = x    (45.12)

with given data o and x. The function q can also be derived in order to enhance the
smoothness of the extension across V,

Δ^p q = 0 in B,
∂^j q/∂R^j = ∂^j q0/∂R^j on V,  j = 0, 1, . . . , p − 1    (45.13)

Thus, the Biharmonic functions are applied for the image inpainting process,
and the inpainted image is obtained.
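The procedure described above is implemented, for example, by scikit-image's inpaint_biharmonic routine, which can be used to reproduce the overall workflow (input image, mask of the missing region, inpainted output). The sketch below only illustrates that workflow on a single grayscale image; the file name and the rectangular mask are placeholders, since the paper instead masks a chosen percentage of each dataset image.

```python
import numpy as np
from skimage import io, img_as_float
from skimage.restoration import inpaint_biharmonic

# Load a test image as a float grayscale array (placeholder file name).
image = img_as_float(io.imread("corel_sample.jpg", as_gray=True))

# Mask of the region to be restored; here a simple rectangular block,
# whereas the paper removes a percentage of the image instead.
mask = np.zeros(image.shape, dtype=bool)
mask[60:120, 80:160] = True

damaged = image.copy()
damaged[mask] = 0.0

# Biharmonic inpainting fills the masked pixels from the surrounding data.
restored = inpaint_biharmonic(damaged, mask)
```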

45.3 Results and Discussion


45.3.1 Experimental Arrangement

The execution of this image inpainting approach is done in MATLAB with 4 GB


RAM, Windows 10 OS with Intel Core i-3 processor.

45.3.2 Datasets

The devised image inpainting model is evaluated on the Corel-10k (dataset-1) [17]
and GHIM-10k part 1 (dataset-2) [17] datasets.

Fig. 45.1 Experimental outcomes of Biharmonic functions for image inpainting using dataset-1
and dataset-2

45.3.3 Experimental Outcomes

The experimental outcomes of Biharmonic functions for image inpainting are
shown in Fig. 45.1. The input images for datasets 1 and 2 are given in Fig. 45.1a.
Figure 45.1b shows the mask images for datasets 1 and 2, Fig. 45.1c portrays the
inpainted images for datasets 1 and 2, and the harmonic images for datasets 1 and 2
are shown in Fig. 45.1d.

45.3.4 Performance Analysis

The performance of Biharmonic functions is evaluated using six metrics:
PSNR, SDME, SSIM, MSE, MS-SSIM, and UQI. The analysis of Biharmonic functions
with different domain sizes on the two datasets is explicated in this
section.
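Of the six metrics listed above, MSE, PSNR, and SSIM are directly available in scikit-image, so a minimal evaluation helper might look as follows; MS-SSIM, UQI, and SDME are not part of scikit-image and would have to come from separate implementations (the sewar package, for instance, provides MS-SSIM and UQI). The helper assumes float images scaled to [0, 1], continuing the earlier inpainting sketch.

```python
from skimage.metrics import (mean_squared_error,
                             peak_signal_noise_ratio,
                             structural_similarity)

def evaluate_inpainting(original, restored):
    """Compute the subset of the paper's metrics available in scikit-image.

    MS-SSIM, UQI and SDME are not provided by scikit-image and would need
    separate implementations (e.g. the sewar package for MS-SSIM/UQI)."""
    return {
        "MSE": mean_squared_error(original, restored),
        "PSNR": peak_signal_noise_ratio(original, restored, data_range=1.0),
        "SSIM": structural_similarity(original, restored, data_range=1.0),
    }

# Example, continuing the previous sketch:
# scores = evaluate_inpainting(image, restored)
```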

45.3.4.1 Performance Analysis Using Database-1

Figure 45.2 depicts analysis of Biharmonic functions using dataset-1 with various
domain sizes in terms of several metrics. Figure 45.2a represents the performance
analysis of MSE by varying image percentages with various domain sizes. The MSE
of Biharmonic functions with domain size 30 is 0.0800, 40 is 0.0775, 50 is 0.0665, and
60 is 0.0536. The performance analysis of MS-SSIM by changing image percentage
with various domain sizes is specified in Fig. 45.2b. The MS-SSIM value of Bihar-
monic functions with domain size 30, 40, 50, and 60 is 0.9310, 0.9316, 0.9347,
and 0.9411.

Fig. 45.2 Performance analysis of Biharmonic functions using database-1 with various domain
sizes a MSE, b MS-SSIM, c PSNR, d SDME, e SSIM, and f UQI

Figure 45.2c denotes performance analysis of PSNR by varying image
percentages with various domain sizes. The PSNR of Biharmonic functions with
domain size 30 is 30.26 dB, 40 is 30.59 dB, 50 is 30.80 dB, and 60 is 32.88 dB.
The performance analysis of SDME by changing image percentage with various
domain sizes is portrayed in Fig. 45.2d. The SDME of Biharmonic functions with
domain size 30, 40, 50, and 60 is 68.97 dB, 69.96 dB, 70.97 dB, and 73.10 dB.
Figure 45.2e indicates the performance analysis of SSIM by changing image percent-
ages with various domain sizes. The SSIM of Biharmonic functions with domain size
30 is 0.9307, 40 is 0.9313, 50 is 0.9345, and 60 is 0.9408. Figure 45.2f depicts the
performance analysis of UQI by changing image percentages with various domain
sizes. The UQI of Biharmonic functions with domain size 30, 40, 50, and 60 is
0.9350, 0.9356, 0.9387, and 0.9451.

45.3.4.2 Performance Analysis Using Database-2

The performance analysis of Biharmonic functions is based on dataset-2 with various


domain sizes with regards to several metrics. The performance analysis of MSE by
changing image percentage with various domain sizes is portrayed in Fig. 45.3a.
The MSE of Biharmonic functions with domain size 30, 40, 50, and 60 is 0.0601,
0.0482, 0.0353, and 0.0242. Figure 45.3b indicates the performance analysis of MS-
SSIM by changing image percentages with various domain sizes. The MS-SSIM of
Biharmonic functions with domain size 30 is 0.9004, 40 is 0.9127, 50 is 0.9200, and
60 is 0.9245. Figure 45.3c depicts the performance analysis of PSNR by changing
image percentages with various domain sizes. The PSNR of Biharmonic functions
with domain size 30, 40, 50, and 60 is 27.57 dB, 28.58 dB, 30.37 dB, and 31.49 dB.
Figure 45.3d represents the performance analysis of SDME by varying image
percentages with various domain sizes. The SDME of Biharmonic functions with
domain size 30 is 66.57 dB, 40 is 68.75 dB, 50 is 69.31 dB and 60 is 72.36 dB. The
performance analysis of SSIM by changing image percentage with various domain
sizes is specified in Fig. 45.3e. The SSIM value of Biharmonic functions with domain
size 30, 40, 50, and 60 is 0.9124, 0.9246, 0.932, and 0.9364. Figure 45.3f denotes
performance analysis of UQI by varying image percentages with various domain
sizes. The UQI value of Biharmonic functions with domain size 30 is 0.9152, 40 is
0.9275, 50 is 0.9348, and 60 is 0.9393.

45.4 Conclusion

This paper explicates the performance analysis of Biharmonic functions for the image
inpainting process. Here, the performance analysis is performed over the image
percentage by changing the domain size in the Biharmonic function. The performance
of the image inpainting approach is computed by means of six metrics, namely UQI,
SDME, PSNR, SSIM, MSE, and MS-SSIM.

Fig. 45.3 Performance analysis of Biharmonic functions using database-2 with different domain
sizes a MSE, b MS-SSIM, c PSNR, d SDME, e SSIM, and f UQI

Hence, from this analysis, it is shown that the Biharmonic functions attained enhanced
performance for the image inpainting process. Here, domain sizes of 30, 40, 50, and 60
as well as image percentages of 50, 60, 70, 80, and 90 are considered for the
performance analysis.
Furthermore, other effective datasets can be utilized for evaluating the performance
of Biharmonic functions for image inpainting.

References

1. Mahajan, M., Bhanodia, P.: Image inpainting techniques for removal of object. In Proceedings
of International Conference on Information Communication and Embedded Systems, pp. 1–4
(2014)
2. Wohlberg, B.: Inpainting by joint optimization of linear combinations of exemplars. IEEE
Signal Process. Lett. 18(1) (2011)
3. Erkan, U., Enginoğlu, S., Thanh, D.N.H.: An iterative image inpainting method based on
similarity of pixels values. In Proceedings of 6th International Conference on Electrical and
Electronics Engineering, pp. 107–111 (2019)
4. Alilou, V.K., Yaghmaee, F.: Exemplar-based image inpainting using svd-based approximation
matrix and multi-scale analysis. Multimedia Tools Appl. 76(5), 7213–7234 (2017)
5. Wu, J., Ruan, Q.: Object removal by cross isophotes exemplar based inpainting. Proc. IEEE
Int. Conf. Pattern Recognit 3, 810–813 (2006)
6. Mo, J., Zhou, Y.: The research of image inpainting algorithm using self-adaptive group structure
and sparse representation. Cluster Comput. 1–9 (2018)
7. Zheng, J., Qin, M., Yu, H., Wang, W.: An efficient truncated nuclear norm constrained matrix
completion for image inpainting. Comp. Graph. 97–106 (2018)
8. Guo, Q., Gao, S., Zhang, X., Yin, Y., Zhang, C.: Patch-based image inpainting via two-stage
low rank approximation. IEEE Trans. Visual Comput. Graph. 24(6), 2023–2036 (2018)
9. Aujol, J.-F., Ladjal, S., Masnou, S.: Exemplar-based inpainting from a variational point of view.
SIAM J. Math. Anal. 44, 1246–1285 (2010)
10. Bertalmio, M., Vese, L., Sapiro, G., Osher, S.: Simultaneous texture and structure image
inpainting. In Proceedings of the International Conference on Computer Vision and Pattern
Recognition, pp. 707–712 (2003)
11. Bertalmio, M., Sapiro, G., Caselles, V., Ballester, C.: Image inpainting. Comp. Graph. 417–424
(2000)
12. Fadili, M., Starck, J.-L., Murtagh, F.: Inpainting and zooming using sparse representations.
Comp. J. 52(1), 64–79 (2009)
13. Bertalmio, M., Bertozzi, A., Sapiro, G.: Navier-Stokes, fluid dynamics, and image and video
inpainting. In Proceedings of IEEE Computer Vision and Pattern Recognition, pp. 213–226
(2001)
14. Ballester, C., Bertalmio, M., Caselles, V., Associate Member, I.E.E.E., Sapiro, G., Member,
I.E.E.E., Verdera, J.: Filling-in by joint interpolation of vector fields and gray levels. IEEE
Trans. Image Process. 10(8), 1200–1211 (2001)
15. Newson, A., Almansa, A., Fradet, M., Gousseau, Y., Perez, P.: Video inpainting of complex
scenes. SIAM J. Imag. Sci. 7(4), 1993–2019 (2014)
16. Damelin, S.B., Hoang, N.S.: On surface completion and image inpainting by biharmonic
functions: numerical aspects. Int. J. Mathem. Mathem. Sci. 8 (2018)
17. Corel-10k and GHIM-10k datasets taken from, “http://www.ci.gxnu.edu.cn/cbir/Dataset.aspx.
Accessed on August 2021
Chapter 46
Automated Perpetrator Identification
by Face Recognition

A. Vinothini, L. K. Nandhini, M. Sreekrishna, M. Jaeyalakshmi,


and Aksheya Suresh

Abstract In the present scenario, the perpetrator identification procedure is done by


the police force with the help of automated systems. Many buildings and streets have
surveillance cameras installed to monitor the activities that occur within the focus.
The videos recorded in these cameras have become one among the shreds of evidence
for the police force to investigate the crime. Recognizing the person face from the
captured videos is the most challenging task. The objective of this paper is to propose
a face recognition model that can detect and then recognize the perpetrator’s faces
automatically from the videos captured using the surveillance camera. The system
implements three components: Face detection, facial features extraction, and face
recognition. Haar cascades is used for face detection. Algorithms like local binary
pattern histograms, fisherface, and eigenface are used for implementing the face
recognition and the results obtained are plotted.

Keywords Face recognition · Face detection · Haar cascades · Eigenface ·


Fisherface · Local binary pattern histogram

46.1 Introduction

Every human in this world has some unique features on their face that help to identify
them. The facial recognition system can be employed in perpetrator identification

A. Vinothini (B) · L. K. Nandhini · M. Sreekrishna · M. Jaeyalakshmi · A. Suresh


Rajalakshmi Engineering College, Chennai, India
e-mail: vinothini.a@rajalakshmi.edu.in
L. K. Nandhini
e-mail: nandhini.lk@rajalakshmi.edu.in
M. Sreekrishna
e-mail: sreekrishna.m@rajalakshmi.edu.in
M. Jaeyalakshmi
e-mail: jaeyalakshmi.m@rajalakshmi.edu.in
A. Suresh
e-mail: aksheya.suresh@rajalakshmi.edu.in


by comparing the digital photo of the perpetrator with the video or image captured
by the surveillance camera. It acts as a biometric application capable of recognizing
a person. The system extracts the features from the face. The pattern obtained is
then investigated based on the stored face features. This technology was used in
the initial stages on computers; it has now been developed to be employed on mobile
platforms and in other forms of technology such as robotics [1, 2]. Face recognition
systems are implemented by deploying algorithms to extract exact, typical features of
the captured human face. Human faces differ from one another by variations in eye
color, silhouette of the nose, chin, lips, etc. The extracted features are then compared
with the features of faces stored in the database [3].

In recent times, automation has accomplished developments in quality, accuracy,
and precision. The technology continues to progress, bringing new waves of
improvement and innovation in data analytics, robotics, and the vast areas of
artificial intelligence [4]. Prior works include many methods such as Convolutional
Neural Networks (CNN) and Principal Component Analysis (PCA), which identify the
face as a whole entity using many algorithms [5]. This paper emphasizes the use of
Haar cascade features, which employ a classifier to detect, in the source image, the
object it has been trained on. Gong et al. [6] proposed a heterogeneous face
recognition approach to deal with various image modalities. A common encoding model
is designed to eliminate the modality gap at the feature extraction stage itself. Using
the proposed encoding scheme, the heterogeneous face images are transformed into a
common encoding space. Robust discriminant features are extracted to enhance the
recognition ability.
Best-Rowden et al. [7] proposed the techniques for studying the quality of the
face image based on the target face image quality scores. Support vector regres-
sion model was created. The model predicted the human quality values and matcher
quality values. The features are extracted from the face using CNN. Abuzneid et al. [8]
implemented a model using back-propagation neural network (BPNN) for face recog-
nition. A new dataset was created based on the input dataset and named as T-dataset.
Zhangi et al. implemented a multiplication fusion for calculating the weighted scores
and reduced the classification error rate when compared with the existing methods.
Chen et al. [9] projected a hierarchical clustering-based spectrum band selec-
tion method to remove the noise that occurs in each spectral band. Gabor filter and
gradient algorithms were used to remove the noise. Histogram of oriented gradients
improved the accuracy and time in training the model. Face alignment method called
adaptive pose alignment is proposed for recognizing face image in various poses.
The alignment method reduced the difference in intra-class caused by the alignment
function by estimating the pose and generating the optimal templates. The model
is trained based on the generated reference templates. Pereira et al. [10] introduced
domain-specific units for heterogeneous face recognition. The proposed technique
allows the learning of low-level features that are specific to a particular image domain
while sharing the high-level features across image domains.
Lu et al. developed a semi-coupled dictionary learning scheme to improve the
recognition performance on very low-resolution and super-resolution face images. The
proposed scheme performed better when compared with the recognition rate of existing learning
scheme. Azis et al. [11] created a model to recognize human face images captured at
night under low lighting. Image enhancement and eigenface are utilized to improve the
recognition ability, and histogram equalization is implemented to enhance face images
with various contrast levels.

In this paper, a face recognition model is built for automated perpetrator
identification. The proposed model can detect and recognize faces automatically from
videos captured using the camera.
The rest of the paper is organized as follows: Sect. 46.2 explicates the proposed method.
Section 46.3 describes the experimental results and presents the discussion. Finally,
Sect. 46.4 presents the conclusion and outlines future work.

46.2 Proposed Method

The architecture of the proposed face recognition system is shown in Fig. 46.1.
The system can automatically recognize the perpetrator using various algorithms
by matching the input image with the dataset. The original video is made into
image frames that capture the face images which then go through preprocessing.
The rendered image is run against the face database and if an image is matched the
face is recognized.

Fig. 46.1 Proposed face recognition system (block diagram: video data → retrieve image frames →
face detection by applying Haar cascades → data pre-processing → feature extraction →
classification against the face database of training images → face recognition)



46.2.1 Face Detection

The first stage is creating a face detection system by applying haar cascades to the
retrieved image frames. The cascade function is trained from many images containing
a face and images that do not contain a face. The trained function is then used to
detect faces in other images. The Haar cascade classifier is grounded on the Viola-Jones
detection process. Haar features include edge, line, and rectangular features. The
algorithm is executed in four steps. The first step involves selecting the Haar features.
The next step is creating integral images. The third step involves applying an
AdaBoost algorithm to select the best features and to train the classifier. The final
step organizes the Haar features into cascade classifiers [12, 13].
Eigenface, fisherface, and local binary pattern histograms (LBPH) [14] are each
implemented independently. The training set of face images is stored in a matrix.
Eigenfaces are the set of eigenvectors computed from the covariance matrix of the
vector space of human face images [15]; PCA is applied to yield the eigenfaces.
Fisherface is built on Fisher's linear discriminant technique: a scatter matrix is
generated to compute the fisherfaces, identifying the combination of features that best
discriminates the faces [16]. LBPH involves computing the local binary pattern of each
pixel of the given input face image, and the histogram of the local binary patterns
forms the LBP features [12, 17].
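A minimal OpenCV sketch of this detection-plus-recognition chain is given below. It uses the Haar cascade bundled with OpenCV and the LBPH recognizer from the opencv-contrib package; the training images, label IDs, and the fixed 200×200 crop size are assumptions for illustration, not the exact settings used in the experiments.

```python
import cv2
import numpy as np

# Haar cascade face detector shipped with OpenCV.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(gray):
    """Return bounding boxes of detected faces in a grayscale frame."""
    return detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# LBPH recognizer (requires the opencv-contrib-python package).
recognizer = cv2.face.LBPHFaceRecognizer_create()

def train(face_images, labels):
    """face_images: list of cropped grayscale faces; labels: integer IDs."""
    recognizer.train(face_images, np.array(labels))

def recognize(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    results = []
    for (x, y, w, h) in detect_faces(gray):
        face = cv2.resize(gray[y:y + h, x:x + w], (200, 200))
        label, confidence = recognizer.predict(face)   # lower = closer match
        results.append((label, confidence, (x, y, w, h)))
    return results
```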

46.3 Experimental Results and Discussion

A sample of collected images is shown in Fig. 46.2. The images are captured with a
standard quality camera. From these input face images, a set of images with various
poses and expressions is generated. The images are labeled as ID-3, ID-5, and
ID-15. The experiment is conducted using Python.

Fig. 46.2 The tested images (ID-3, ID-5, ID-15)

Eigenface, fisherface, and LBPH
are trained using different parameters such as the number of components, threshold,
radius from the center pixel. The trained model is then evaluated using the test image.
The resulting data is plotted after finishing the tests.
The first test image is shown in Fig. 46.3 and the plots are analyzed below. The
resulting ID variation is plotted in Fig. 46.4. The ID from the face recognition system
fluctuates between two classes of similar faces. The change in confidence, which
increases with the number of components, is plotted in Fig. 46.5. The ID results from
fisherface are steadier than those from eigenface, as seen in Fig. 46.6, and the
confidence level increases steadily. For LBPH, the radius from the center pixel is
considered as the first parameter. The ID remains stable up to the maximum radius, and
the confidence level is graphed against the radius. The number of neighbors is then
varied beyond 12; the ID is steady until 10 neighbors and then changes to another ID.
The confidence fluctuates after 40, with the lowest confidence level at 2, and then
continuously improves. The cells in the X and Y scales are changed simultaneously;
the returned ID changes from ID-20 to ID-21 and then stays steady.

Fig. 46.3 First test image

Fig. 46.4 The ID on eigenface versus number of components



Fig. 46.5 Confidence on eigenface versus number of components

Fig. 46.6 Before and after LBPH calibration

The subsequent step is calibrating the trainer and testing the process on video.
The trainer phase is used to train the recognizer. The image pairs considered are
with and without calibration of the trainer. For eigenface, the number of components is
set to 15 and the threshold to 4000. For fisherface, the number of components is set to
5 and the threshold to 400. For LBPH, the number of neighbors is set to 2, the number
of cells is 7, the radius is 2, and the threshold is 15. After the calibration step, the
eigenface and fisherface recognizers
performed with a better recognition rate on new data.
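For reference, the calibration values quoted above translate into the following recognizer constructions (keyword names follow the cv2.face API of opencv-contrib); this is only a sketch of how the settings could be applied, not the authors' exact code.

```python
import cv2

# Calibration values quoted above, passed to the OpenCV contrib recognizers.
eigen = cv2.face.EigenFaceRecognizer_create(num_components=15, threshold=4000)
fisher = cv2.face.FisherFaceRecognizer_create(num_components=5, threshold=400)
lbph = cv2.face.LBPHFaceRecognizer_create(radius=2, neighbors=2,
                                          grid_x=7, grid_y=7, threshold=15)

# Each recognizer is then trained and evaluated as in the earlier sketch,
# e.g. eigen.train(face_images, labels); eigen.predict(test_face).
# Note: eigenface/fisherface require all training faces to share one size.
```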

46.4 Conclusion

The proposed face recognition system for perpetrator identification performed better
when Haar cascades and LBPH are combined. Thus, the system can be used to recognize
perpetrators in public places to provide security to the people. The accuracy of
the system increases with image quality and with the use of high-quality cameras in
recording the videos and capturing the image frames. Although the system works
really well and recognizes faces with good accuracy, it can be improved by employing
other face detection algorithms. Further, a large training data can create a most
efficient model that learns better and is capable of detecting and recognizing faces
accurately without flaws.

References

1. Abdullah, N.A., Saidi, J.: Face recognition for criminal identification. AIP Conf. Proc. 1891,
020002 (2017). https://doi.org/10.1063/1.5005335
2. Apoorva, P., Impana, H.C., Siri, S.L., Varshitha, M.R., Ramesh, B.: Automated criminal iden-
tification by face recognition using open computer vision classifiers. In: 2019 3rd Interna-
tional Conference on Computing Methodologies and Communication (ICCMC), Erode, India,
pp. 775–778 (2019). doi: https://doi.org/10.1109/ICCMC.2019.8819850
3. Singh, S., Kumar, T.: An analytic approach 3D shape descriptor for face recognition. Int. J.
Electr. Electron. Comp. Sci. Eng. special Issue—ICSC AAIT-2018, 2454–1222 (2018)
4. Qu, X., Wei, T., Peng, C., Du, P.: A fast face recognition system based on deep learning. In:
2018 11th International Symposium on Computational Intelligence and Design (ISCID) 01,
pp. 289–292 (2018)
5. Ding, C., Tao, D.: Trunk-Branch ensemble convolutional neural networks for video-based face
recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 1002–1014 (2018). https://doi.org/
10.1109/tpami.2017.2700390
6. Gong, D., Li, Z., Huang, W., Li, X., Tao, D.: Heterogeneous face recognition: A common
encoding feature discriminant approach. IEEE Trans. Image Process. 26(5), 2079–2089 (2017).
https://doi.org/10.1109/TIP.2017.2651380
7. Best-Rowden, L., Jain, A.K.: Learning face image quality from human assessments. IEEE
Trans. Inf. Forensics Secur. 13(12), 3064–3077 (2018). https://doi.org/10.1109/TIFS.2018.
2799585
8. Abuzneid, M.A., Mahmood, A.: Enhanced human face recognition using LBPH descriptor,
multi-KNN, and back-propagation neural network. IEEE Access 6, 20641–20651 (2018).
https://doi.org/10.1109/ACCESS.2018.2825310
9. Chen, Q., Sun, J., Palade, V., Shi, X., Liu, L.: Hierarchical clustering based band selection
algorithm for hyperspectral face recognition. IEEE Access 7, 24333–24342 (2019). https://doi.
org/10.1109/ACCESS.2019.2897213
10. de Freitas Pereira, T., Anjos, A., Marcel, S.: Heterogeneous face recognition using domain
specific units. IEEE Trans. Inf. Forensics Secur. 14(7), 1803–1816 (2019). doi: https://doi.org/
10.1109/TIFS.2018.288528
11. Azis, F.M., Nasrun, M., Setianingsih, C., Murti, M.A.: Face recognition in night day using
method eigenface. In: 2018 International Conference on Signals and Systems (ICSigSys),
pp. 103–108 (2018)
12. Ahmed, A., Guo, J., Ali, F., Deeba, F., Ahmed, A.: LBPH based improved face recognition
at low resolution. In: 2018 International Conference on Artificial Intelligence and Big Data
(ICAIBD), Chengdu, pp. 144–147 (2018). doi: https://doi.org/10.1109/ICAIBD.2018.8396183

13. Cuimei, L., Zhiliang, Q., Nan, J., Jianhua, W.: Human face detection algorithm via Haar cascade
classifier combined with three additional classifiers. In: 2017 13th IEEE International Confer-
ence on Electronic Measurement and Instruments (ICEMI), Yangzhou, pp. 483–487 (2017).
doi: https://doi.org/10.1109/ICEMI.2017.8265863
14. Jagtap, A.M., Kangale, V., Unune, K., Gosavi, P.: A study of LBPH, eigenface, fisherface and
haar-like features for face recognition using OpenCV. In: 2019 International Conference on
Intelligent Sustainable Systems (ICISS), Palladam, Tamilnadu, India, pp. 219–224 (2019). doi:
https://doi.org/10.1109/ISS1.2019.8907965
15. Wahyuningsih, D., Kirana, C., Sulaiman, R., Hamidah, Triwanto: Comparison Of the perfor-
mance of eigenface and fisherface algorithm in the face recognition process. In: 2019 7th
International Conference on Cyber and IT Service Management (CITSM), Jakarta, Indonesia,
pp. 1–5 (2019). doi: https://doi.org/10.1109/CITSM47753.2019.8965345
16. Anggo, M., Arapu, L.: Face recognition using fisherface method. In: 2nd International Confer-
ence on Statistics, Mathematics, Teaching, and Research Journal of Physics: Conference Series,
Vol 1028 (2017)
17. Alfy, E., Baig, Z., Aal, R.A.: A novel approach for face recognition fused GMDH-based
networks. Int. Arab J. Inf. Technol. 15(3) (2018)
18. Damanik, R.R., Sitanggang, D., Pasaribu, H., Siagian, H., Gulo, F.: An application of viola jones
method for face recognition for absence process efficiency. In: International Conference on
Mechanical, Electronics, Computer, and Industrial Technology, Journal of Physics: Conference
Series, Vol 1007, 6–8 December 2017
Chapter 47
Age Estimation in Social Network Using
Machine Learning Algorithm

R. Vikram, P. Nithish Kumar, S. Prasanth, S. Siva Sakthi Vel, and R. Surya

Abstract Human faces, as fundamental visual cues, convey a great deal of non-verbal
information, enabling genuine human-to-human communication. Therefore, modern
intelligent systems are expected to be able to recognize and understand human faces
accurately in real time. Identity, age, gender, expression, and ethnic origin are very
significant elements in real-world facial image studies, such as media communication,
human-computer interaction (HCI), and security. Law enforcement agencies can use face
mug shot retrieval to identify potential suspects in criminal investigations. Despite
the extensive research on human identification from facial photographs, there is only
a limited amount of research on how to effectively estimate and use demographic
evidence such as age, gender, and ethnicity present in facial photos. Although
automatic image-based age estimation is an important technique involved in many
real-world applications, estimating human age from face photographs remains a
difficult topic. Accurately estimating human age using facial image analysis offers a
wide scope of real-world uses in online social networking applications. For a long
time, online social networks (OSN) have been a major platform for people to engage
with each other and share information. Presently, billions of users use OSN to
communicate, and they are significant venues for (among other things) content and
opinion dissemination. Also, every social network has an age restriction for signing
up; however, enforcing it remains a difficult problem to solve in real time. In this
study, we examine how to determine age from face datasets using several facial
feature extraction algorithms. The main motivation is to restrict registration on the
social network to persons whose age is greater than a predefined threshold, using
human facial features.

Keywords Social network · Facial image analysis · Facial feature points · Age
estimation · Face detection

R. Vikram (B) · P. N. Kumar · S. Prasanth · S. S. S. Vel · R. Surya


Department of Computer Science and Engineering, M.Kumarasamy College of Engineering,
Thalavapalayam, Karur 639113, Tamilnadu, India
e-mail: vikramr.cse@mkce.ac.in


47.1 Introduction

Estimating age from photos has consistently been one of the more difficult parts of
facial analysis. The uncontrolled nature of the aging process, as well as its strong
specificity to individual samples, are some of the reasons. To effectively train the
classifier, a large and representative amount of data/images is required, just as for
most image recognition tasks. Moreover, supervised classifiers require the data/photos
to be labeled, in our case with the true age [1]. However, previously available
databases were limited and heavily skewed. This is particularly problematic in video
surveillance and forensics, where unknown people are common and frequently
non-cooperative. The human face contains a wealth of information regarding identity,
age, gender, mood, and ethnicity. It is an important component as well as a soft
biometric attribute for determining human identity [2]. People's ages are also
significant in face-to-face verbal communication. The attractiveness of one person to
another is influenced by facial features, which can convey fertility and fitness cues;
consequently, these components can contribute to an individual's productivity and
success [3–5]. One of the facial attributes that plays a major part in helping or
hindering communication is age. Culture, beliefs, taste, language, and age can all
affect how we express what we mean, as well as how we understand what others say.
Age is a factor that affects how we communicate with each other, and it can act as a
barrier when combined with other elements [6]. Human–machine communication can be
improved by improving a machine's capacity to recognize and interpret faces and facial
attributes, including age, in real time [6]. The automatic interpretation of facial
images is a topic that many researchers are interested in. Because of the wide range
of facial appearances, automatic age estimation (AAE) using face photographs is a
difficult subject. This is caused by a blend of extrinsic and intrinsic factors [7].
The living environment, health condition, lifestyle, and other external factors all
contribute to the extrinsic components. Intrinsic factors, on the other hand, include
physiological factors such as genes. Facial expressions and changes in appearance must
be addressed in robust AAE systems that depend entirely on facial images [8, 9].
Human-computer interaction (HCI), surveillance, Web content filtering, and electronic
customer relationship management (E-CRM) are a few of the applications of AAE systems
[10–12]. They are needed especially because humans are not very effective at
estimating age [13]. Thus, creating AAE systems that outperform human performance is
essential. The basic procedure for estimating the age of an individual's face is shown
in Fig. 47.1.

Fig. 47.1 Face age estimation steps (input image → face detection → cropped face →
feature extraction → prediction)
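A minimal sketch of the Fig. 47.1 pipeline is shown below: the face is detected with a Haar cascade, cropped, and passed to an age predictor, and the estimated age is compared against the network's minimum-age threshold. The age-regression model file, its 64×64 input size, and the threshold value are hypothetical assumptions used only to illustrate the flow.

```python
import cv2
import numpy as np
import tensorflow as tf

AGE_THRESHOLD = 13          # assumed minimum age for registration
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
# Hypothetical CNN trained elsewhere to regress age from a 64x64 face crop.
age_model = tf.keras.models.load_model("age_regressor.h5")

def estimate_age(image_bgr):
    """Fig. 47.1 pipeline: detect -> crop -> extract features -> predict."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    crop = cv2.resize(image_bgr[y:y + h, x:x + w], (64, 64)) / 255.0
    return float(age_model.predict(crop[np.newaxis, ...])[0, 0])

def may_register(image_bgr):
    age = estimate_age(image_bgr)
    return age is not None and age >= AGE_THRESHOLD
```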

47.2 Related Work

Agbo-Ajala and Viriri [14] propose a model that utilizes a CNN architecture to predict
the age group and gender of human faces in unfiltered real-world settings. The age and
gender labels are treated as discrete annotations in this novel CNN strategy, which is
used to train classifiers that predict a person's age and gender. They then show how
pretraining on large-scale datasets allows the age and gender CNN model to be trained
effectively, enabling the classifiers to generalize on the images and prevent
overfitting. Despite the extremely challenging nature of the images in the dataset,
the method produces considerable improvements in age classification and gender
classification accuracy over current techniques, which can meet the requirements of
several real-world applications.
Badrinarayanan et al. [15] proposed SegNet, an algorithm that consists of a stack of
encoders followed by a corresponding decoder stack that feeds into a soft-max
classification layer. Low-resolution feature maps at the output of the encoder stack
can be mapped by the decoders to feature maps at the full input image resolution. This
overcomes a critical weakness of existing deep learning algorithms that use object
classification networks for pixel-wise labeling: there is no way in these systems to
map deep-layer feature maps back to the input dimensions. They use
ad hoc measures to upsample, such as replication. This results in noisy estimates and
restricts the number of pooling layers that can be used, to prevent excessive
upsampling and, therefore, reduced spatial context. SegNet resolves these issues by
learning to map encoder outputs to image pixel labels. SegNet uses a "flat" topology,
which means that the number of features in each layer stays the same (64 in their
model), but with full connectivity. Two factors motivate this choice. First, unlike a
growing deep encoder network with full feature connectivity, it avoids parameter
explosion (the same applies to the decoder). Second, for every additional/deeper
encoder-decoder pair, the training time stays roughly constant (in their experiments
it slightly decreases), since the typical feature map size is smaller, making the
convolutions faster.
Maina Bukar et al. [16] propose enhancing the classical AAM by applying partial
least-squares (PLS) regression in the PCA domain as one of their contributions. PLS is
a dimensionality reduction strategy that maximizes the covariance between the
predictor and the response variable, yielding latent scores in which each reduced
dimension has improved predictive power; this produces a supervised appearance model
(SAM). The feature extraction model is then applied to age estimation and gender
classification challenges. Finally, the solutions are evaluated on the FGNET-AD
benchmark dataset (DB). Gender classification is likewise approached in two stages:
feature extraction and classification. The geometric and appearance-based feature
extraction strategies described in the literature can be separated into two categories.
Since geometric features mainly describe shape changes that happen in early life, and
local textures are restricted to wrinkles, which appear in adulthood, local features
alone are not adequate for precise age estimation.
Chang et al. [17] exploit the relative order of age labels, because it gives more
robust information than specific age numbers. Ranking is widely used in document
retrieval and has been categorized as a "learning to rank" technique that maps given
documents into ordered positions. Some early procedures performed ranking based on
regression or classification. Element-wise ordinal regression and pair-wise comparisons
are two other notable techniques. Furthermore, a ranking framework called Ranking SVM
was proposed, which is entirely based on hinge loss and the SVM technique. The
difference between two feature vectors is used as the input for learning, and the
better-ranked vector is mapped to higher scores during testing. Among pair-wise ranking
algorithms, RankBoost and RankNet use exponential loss and cross-entropy loss,
respectively. They also devise a cost-sensitive solution for each sub-problem. The
idea of cost-sensitive learning has recently been considered in the machine learning
community as a helpful way of modeling the severity of misclassification errors. The
goal of cost-sensitive learning is to reduce the overall cost rather than the total
number of errors, because the cost of misclassification typically differs among
different pairs of labels.
Chang et al. [18] present a single ranking-based approach that focuses on exploiting
relative-order information for more reliable age estimation. The method simplifies the
inference problem into a set of simple binary queries and then combines the binary
decisions to compute the age. According to the experimental results, the proposed
technique outperforms previous approaches that treat the task as a multiclass or
regression problem. However, over a long period a human face ages with distinct
structural, shape, and texture variations. Since the kernel functions used to assess the
pair-wise similarities between widely separated ages are shift- or time-varying, this
property makes the random process formed by ageing patterns non-stationary. Using
non-stationary kernels to solve a regression problem is challenging, however, because it
can easily lead to overfitting during learning.
Chen et al. [19] provided an early study of the impact of facial cosmetics on automated
gender and age prediction algorithms. Since automated gender and age estimation approaches
are used in a variety of commercial applications, this study suggests that makeup ought
to be taken into account. While a subject may not deliberately use cosmetics to deceive
a system, it is easy to imagine circumstances in which a malicious user could apply
ordinary makeup to mislead it. Gender spoofing caused by makeup affects automatic gender
classification systems; both male-to-female and female-to-male alterations are possible,
and the female-to-male alteration was shown to be somewhat more difficult than the
male-to-female one. In addition, the authors created a new dataset (MIAA, makeup-induced
age alteration) consisting of photographs collected from the Internet to investigate
makeup-induced age change. These photographs represent 53 subjects, with one image taken
before and one after applying makeup to each individual. Although the exact ages of the
subjects are unknown, they are estimated to be over 30 years old, and the cosmetics are
used to enhance their appearance and make them look younger.
Das and Bremond [20] proposed a gender, age, and race classification approach for reducing
between-class bias. On the UTKFace and the Bias Estimation in Face Analytics (BEFA)
challenge datasets, the recommended multi-task CNN technique used a joint dynamic loss
and showed good results. The authors intend to extend the study to more facial attributes
in future work and to examine the technique presented in the paper in the context of
mitigating face recognition biases. The widespread commercial deployment of automatic
face assessment systems (i.e., the use of the face as a reliable verification modality)
has aroused research interest [21]. Current machine learning algorithms allow relatively
reliable detection, estimation, and classification of face images based on age, ethnicity,
and gender.
Eidinger et al. [22] make two contributions: a new and substantial dataset and benchmark
for age and gender estimation, as well as a classification pipeline designed to make the
most of the limited data available. Moreover, they present a novel, robust face alignment
technique based on iterative estimation of the uncertainty of facial feature
localizations. Finally, they provide extensive experiments that demonstrate the improved
capabilities of their method as well as the increased difficulty level of the new
benchmark. Estimating a person's age from facial features in a photograph has been
investigated before, though to a lesser extent than the related problem of face
recognition. The authors also give an in-depth survey, examining every available benchmark
and its corresponding level of difficulty, as well as the capabilities of automated age
and gender estimation systems. On all of the surveyed benchmarks, their method outperforms
others by large margins.
Geng et al. [23] propose an algorithm named IIS-LLD for learning from label distributions,
an iterative optimization process based on the maximum entropy model. Experimental results
show the advantages of IIS-LLD over conventional learning methods based on single-labelled
data. While achieving good performance on facial age estimation, label distribution
learning (LLD) may also be useful for other learning problems. Broadly speaking, there are
at least three situations in which LLD can be useful: (1) the instances are originally
labelled with class distributions, which may come from expert knowledge or from data;
(2) some training samples are strongly correlated with other samples, so that a label
distribution can be naturally generated according to the correlation among the different
classes; and (3) the labels from different sources are in dispute, in which case, rather
than deciding on a single winning label, it may be better to generate a label distribution
that incorporates the information from all sources.
Guo and Mu [24] made the notable observation that only the top few components of the
projections are needed to estimate age, gender, and ethnicity. The experimental validations
were performed on a large dataset with more than 55,000 face images. They examined how
using the rank idea in CCA-based methods affects feature dimensionality. A thorough
comparison of the behaviours of the CCA- and PLS-based techniques was given, including
accuracies or errors with respect to dimensionality as well as running time. In the
experiments, the regularized CCA (rCCA) has a running time similar to that of CCA and PLS,
yet makes fewer errors. Because of its fast speed and remarkably small errors, the
regularized CCA is recommended for practical purposes under typical considerations. PLS-
and CCA-based techniques have recently demonstrated excellent overall performance in
solving computer vision and perception challenges. It is important to test and evaluate
PLS- and CCA-based approaches on a variety of vision problems so that their behaviour in
both general and specific vision applications may be better understood.
Shortcomings of the existing systems: high time complexity of face analysis, limited
overall accuracy, and image-based age estimation does not provide sufficiently accurate
results.

47.3 Age Estimation Technique

The determination of a person's age based on biometric features is known as age
estimation. Although various biometric modalities can be used to estimate age, this
article focuses on facial age estimation, which uses biometric information taken from a
person's face. The article's main points cover typical applications of facial age
estimation, the problems and obstacles associated with it, common techniques described in
the literature, and future research directions [25]. The purpose of automatic facial age
estimation is to use dedicated algorithms to approximate a person's age from features
derived from their face image [26]. The facial age estimation problem is similar to other
common face image understanding tasks in that the processing pipeline includes face
detection, localization of facial features, feature vector extraction, and classification
[27]. The output of the classification stage can be an estimate of a person's exact age,
the age group of the person, or a binary result indicating whether or not the subject's
age lies within a certain range, depending on the application for which an age estimation
system is intended [28–30]. The age-group classification is the most widely used of the
three types mentioned above, as it is considerably easier to obtain a good approximation
of a subject's age group than their actual age in many applications [31–33]. Another
fundamental aspect of the age estimation problem is the temporal range that is considered
[26, 30, 34–40]. This parameter is a critical part of the problem since distinct ageing
features appear in different age groups; hence, a system capable of handling a particular
age range may not be applicable to other age groups [12, 41–43]. The problem of facial
age estimation is comparable to the problem of age progression [44]. Age progression is
the prediction of a subject's future facial appearance based on images of their earlier
appearance [45]. Age estimation and age progression must both account for the age-related
facial deformations that occur over the course of a person's life. In some cases, the age
estimation problem is addressed on its own, while in others, age estimation and age
progression are both addressed using similar approaches [46, 47]. The two core steps,
feature extraction and classification, make up a facial age estimation system. The
following sections go over several algorithms.

47.3.1 Features Extraction

Age can be estimated from facial feature points and the landmark values derived from
those feature points for each face image. Figure 47.2 shows the main facial feature
points.

47.3.1.1 Principal Component Analysis (PCA)

PCA is an unsupervised feature extraction technique that produces the principal components
of the data, ordered by a variance criterion. In PCA-based feature extraction, the
components with higher rank are retained, while the components with lower rank are
discarded. Consequently, PCA may not perform well in classification, because it ignores
the class label information and the training bias. Principal component analysis is perhaps
the most widely used unsupervised feature extraction method; it does not use the
information contained in the class labels for supervised classification problems.

Fig. 47.2 Facial landmark points

PCA is the eigenvalue decomposition of the data along the statistical directions with the
greatest variance. Since PCA features exhibit a high degree of energy compaction, it is a
preferred feature representation. PCA produces uncorrelated output features. With PCA, a
lower-dimensional representation of the original data is generated while capturing the
directions along which the data shows the greatest variance. By associating the
eigenvalues with the eigenvectors of the covariance matrix of the original data, this
unsupervised feature extraction technique projects the data onto a new coordinate system.
The eigenvalues and eigenvectors are then sorted in descending order. Using the
eigenvector matrix, the principal components are obtained as a linear transformation of
the data. The PCA model is described as follows:
(1) Let there be N data samples $S_1, S_2, S_3, \ldots, S_N$ in an M-dimensional space,
where each $S_i$ is an M × 1 vector. Let $\bar{S}$ denote the mean vector of the input
data, expressed as in Eq. (47.1):

$\bar{S} = \frac{1}{N}\sum_{i=1}^{N} S_i$   (47.1)

(2) The covariance matrix C is defined as shown in Eq. (47.2):

$C = \frac{1}{N}\sum_{i=1}^{N} (S_i - \bar{S})(S_i - \bar{S})^{T}$   (47.2)
(3) Let $w_1, w_2, \ldots, w_n$ be the n eigenvectors corresponding to the n largest
eigenvalues of C, and form the matrix $W = [w_1, w_2, \ldots, w_n]$. The feature vectors
are then obtained as

$Y_i = W^{T}(S_i - \bar{S}), \quad \forall i = 1, 2, \ldots, N$

Large data sets are thus handled effectively using only a small number of principal
components, each of which is a linear combination of the data along different directions.
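To make the above steps concrete, a minimal NumPy sketch of this PCA feature extraction is given below; the function and variable names (pca_features, samples, n_components) are illustrative and not taken from the chapter.

```python
import numpy as np

def pca_features(samples, n_components):
    """Project M-dimensional samples onto the top principal components.

    samples: array of shape (N, M), one sample per row.
    Returns projected feature vectors of shape (N, n_components).
    """
    # Eq. (47.1): mean vector of the input data
    mean = samples.mean(axis=0)
    centered = samples - mean

    # Eq. (47.2): covariance matrix C = (1/N) * sum (S_i - mean)(S_i - mean)^T
    cov = centered.T @ centered / samples.shape[0]

    # Eigen-decomposition of the (symmetric) covariance matrix,
    # sorting eigenvectors by descending eigenvalue
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order]            # columns are the retained eigenvectors

    # Feature vectors Y_i = W^T (S_i - mean)
    return centered @ W

# Example: 100 hypothetical face-feature vectors of dimension 10 reduced to 3 components
features = pca_features(np.random.rand(100, 10), n_components=3)
print(features.shape)
```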

47.3.1.2 Linear Discriminant Analysis (LDA)

LDA algorithms are used in statistics, model identification, and machine learning to find
a linear combination of features. Using LDA, one dependent variable is expressed as a
linear combination of several features or measurements. In the same way that PCA and
factor analysis look for a linear combination of factors that best explains the data, so
does LDA. LDA makes an explicit attempt to model the differences between the classes in
the training data; PCA, on the other hand, ignores any class distinction, relying instead
on variance estimation to derive the features. The separation is based entirely on
differences rather than similarities. LDA searches for vectors in the underlying space
that best discriminate among the classes. LDA forms a linear combination of these, which
yields the largest mean differences among the desired classes, given a set of independent
features against which the data are described.
Two scatter matrices are defined. (1) The first is the within-class scatter matrix, given
by

$S_w = \sum_{j=1}^{c}\sum_{i=1}^{N_j} (x_i^{j} - \mu_j)(x_i^{j} - \mu_j)^{T}$

where $x_i^{j}$ is the i-th sample of class j, $\mu_j$ is the mean of class j, c is the
number of classes, and $N_j$ is the number of samples in class j. (2) The second is the
between-class scatter matrix

$S_b = \sum_{j=1}^{c} (\mu_j - \mu)(\mu_j - \mu)^{T}$

where $\mu$ represents the mean of all classes. The total scatter matrix is

$S_T = S_w + S_b$

$S_T$ is the total scatter matrix. Since its extracted features exploit the class
information, LDA is a more effective feature extraction approach than PCA for supervised
learning. The distributions of the samples in each class, however, are assumed to be
normal and homoscedastic; thus, if this assumption is violated, finding a suitable
discriminant projection becomes difficult.
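A small NumPy sketch of the scatter-matrix computation described above is given below; it follows the unweighted between-class formula stated in the text, and the names (lda_projection, X, y) are hypothetical placeholders.

```python
import numpy as np

def lda_projection(X, y, n_components):
    """Compute an LDA projection from the within/between-class scatter matrices.

    X: array (N, M) of feature vectors, y: array (N,) of integer class labels.
    """
    overall_mean = X.mean(axis=0)
    M = X.shape[1]
    Sw = np.zeros((M, M))            # within-class scatter S_w
    Sb = np.zeros((M, M))            # between-class scatter S_b (unweighted, as in the text)

    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        centered = Xc - mu_c
        Sw += centered.T @ centered                  # sum over the samples of class c
        diff = (mu_c - overall_mean).reshape(-1, 1)
        Sb += diff @ diff.T

    # Discriminant directions: eigenvectors of pinv(S_w) @ S_b with largest eigenvalues
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    W = eigvecs[:, order].real
    return X @ W

# Example with two hypothetical classes of 5-D features
X = np.vstack([np.random.randn(40, 5), np.random.randn(40, 5) + 2.0])
y = np.array([0] * 40 + [1] * 40)
print(lda_projection(X, y, n_components=1).shape)
```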
47.3.2 Age Classification

Face feature values are clustered and the findings are classified to determine a person's
age [21]. The geometric measurements used for classification can be defined as follows
(an illustrative sketch of computing such landmark-based features is given after the
list):
• The distance between two landmark points (for instance, the separation of the eyes)
• The distance measured along an axis between two landmarks (such as the vertical
height of the nose)
• The distance measured along a surface between two landmarks (for example, the
arc length of the upper-lip boundary)
• The inclination angle relative to an axis (for instance, the slant of the nose bridge)
• The angle between two facial positions (for instance, the angle formed at the
tip of the nose).
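The following short sketch shows how a few such landmark-based distance and angle features could be computed from (x, y) landmark coordinates; the landmark names and example coordinates are hypothetical and depend on the landmark detector used.

```python
import numpy as np

def geometric_features(landmarks):
    """Compute simple geometric features from named facial landmarks.

    landmarks: dict of (x, y) points, e.g. produced by a landmark detector.
    """
    left_eye = np.array(landmarks["left_eye"])
    right_eye = np.array(landmarks["right_eye"])
    nose_top = np.array(landmarks["nose_top"])
    nose_tip = np.array(landmarks["nose_tip"])

    eye_distance = np.linalg.norm(left_eye - right_eye)   # distance between two landmarks
    nose_height = abs(nose_tip[1] - nose_top[1])          # distance along the vertical axis
    dx, dy = nose_tip - nose_top
    nose_slant = np.degrees(np.arctan2(dx, dy))           # inclination relative to the vertical axis
    # A ratio makes the features scale-invariant across face sizes
    return np.array([eye_distance, nose_height, nose_slant, nose_height / eye_distance])

features = geometric_features({
    "left_eye": (120, 150), "right_eye": (200, 150),
    "nose_top": (160, 155), "nose_tip": (162, 210),
})
print(features)
```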

47.3.2.1 Support Vector Machine

Support vector machines (SVM) are supervised learning models with related learning
algorithms for classification and regression analysis in machine learning. It is
primarily used to solve categorization challenges. Each data item is plotted as a
point in n-dimensional space (where n is the number of features), with the value of
each feature equal to the value of a certain coordinate. The hyperplane that best distin-
guishes the two classes is then used to classify the data. SVMs may also conduct
non-linear classification, implicitly translating their inputs into high-dimensional
feature spaces, in addition to linear classification [48]. A support vector machine
(SVM) is a discriminative classifier using a separating hyperplane as its formal defi-
nition. In other words, the algorithm produces an ideal hyperplane that categorizes
fresh cases given labeled training data (supervised learning).
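As an illustration, a minimal scikit-learn sketch of SVM-based age-group classification over extracted feature vectors is given below; the feature matrix, labels, and hyperparameter values are placeholders rather than settings from this work.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: 200 face feature vectors (e.g. PCA/LDA features) with
# age-group labels 0-3 standing in for ranges such as 0-18, 19-35, 36-60, 60+.
X = np.random.rand(200, 20)
y = np.random.randint(0, 4, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# RBF-kernel SVM; standardizing the features first usually helps SVM training.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
model.fit(X_train, y_train)
print("age-group accuracy:", model.score(X_test, y_test))
```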

47.3.2.2 Probabilistic Neural Network Classifier

The probabilistic neural network (PNN) is a network-based approach to “probability
density estimation.” It is a competitive learning paradigm with a “winner takes
all” mentality and a basic notion based on multivariate probability estimation. The
computational load from the training phase is shifted to the evaluation phase, which
makes PNN unique. The fundamental advantage of PNN over back propagation
networks is that training is instantaneous, simple, and fast. The Parzen window notion
of multivariate probabilities was used to construct PNN. The Bayes technique for
decision-making is combined with a nonparametric estimator for getting the prob-
ability density function in the PNN, which is a classifier version. An input layer, a
pattern layer, a summation layer, and an output layer make up the PNN architecture.
The neuron $x_{ij}$ of the pattern layer receives a pattern x from the input layer and
computes its output as given by Eq. (47.3) below:

$\phi_{ij}(x) = \frac{1}{(2\pi)^{d/2}\sigma^{d}} \exp\left(-\frac{(x - x_{ij})^{T}(x - x_{ij})}{2\sigma^{2}}\right)$   (47.3)

where σ denotes the smoothing parameter, $x_{ij}$ denotes the neuron (stored training
pattern) vector, and d denotes the dimension of the pattern vector x.
The summation layer neurons compute the likelihood of pattern x being classified into
class $C_i$ by summing and averaging the outputs of all pattern-layer neurons that belong
to the same class, using Eq. (47.4) given below:

$\rho_i(x) = \frac{1}{(2\pi)^{d/2}\sigma^{d}} \frac{1}{N_i} \sum_{j=1}^{N_i} \exp\left(-\frac{(x - x_{ij})^{T}(x - x_{ij})}{2\sigma^{2}}\right)$   (47.4)

where $N_i$ is the total number of samples in class $C_i$. The decision layer unit
classifies the pattern x in accordance with the Bayes decision rule, based on the output
of all summation layer neurons, as

$C(x) = \arg\max_i \{\rho_i(x)\} \quad \text{for } i = 1, 2, \ldots, m$

Here, C(x) denotes the estimated class of the pattern x and m is the total number of
classes in the training samples. Excessive categorization features raise both compute
time and storage memory requirements. They can make categorization more difficult
at times. A reduction in the number of features is required. Reduced dimension refers
to a smaller collection of features that are fed into the PNN during the training and
testing phases.
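The following NumPy sketch implements the pattern, summation, and decision layers along the lines of Eqs. (47.3)–(47.4) and the Bayes decision rule; the smoothing parameter value and the toy data are purely illustrative.

```python
import numpy as np

class PNN:
    """Minimal probabilistic neural network following Eqs. (47.3)-(47.4)."""

    def __init__(self, sigma=0.5):
        self.sigma = sigma

    def fit(self, X, y):
        # The pattern layer simply stores one neuron per training sample, grouped by class.
        self.classes_ = np.unique(y)
        self.patterns_ = {c: X[y == c] for c in self.classes_}
        self.d_ = X.shape[1]
        return self

    def _summation(self, x, Xc):
        # Eq. (47.4): average Gaussian kernel over the pattern neurons of one class
        diff = Xc - x
        sq = np.sum(diff * diff, axis=1)
        norm = (2 * np.pi) ** (self.d_ / 2) * self.sigma ** self.d_
        return np.mean(np.exp(-sq / (2 * self.sigma ** 2))) / norm

    def predict(self, X):
        # Decision layer: C(x) = argmax_i rho_i(x)
        preds = []
        for x in X:
            scores = [self._summation(x, self.patterns_[c]) for c in self.classes_]
            preds.append(self.classes_[int(np.argmax(scores))])
        return np.array(preds)

# Toy example with two classes of 2-D feature vectors
X = np.vstack([np.random.randn(30, 2), np.random.randn(30, 2) + 3.0])
y = np.array([0] * 30 + [1] * 30)
print(PNN(sigma=0.8).fit(X, y).predict(np.array([[0.1, 0.2], [3.2, 2.9]])))
```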
Benefits: The time complexity is reduced compared with previous work, age is estimated
automatically in real time, and the system can automatically restrict a user from
logging in.

47.4 Conclusion

The suggested architecture is tested with real-time database images and is most effective
when a single face is present. This is the simplest method if one wishes to keep local
statistics that allow a characteristic that changes over time to be determined. The
current system first detects the face in the photograph, then estimates the approximate
age of the person by matching against the ages referenced in the database and displays a
range within which the person's age could fall. The age estimation technique with PCA and
a Euclidean distance classifier is discussed in this project. As previously said, age or
age-group identification is broken into three sub-problems: face detection, feature
extraction, and classification. Because there are issues such as head rotation, the effect
of ageing, variations in illumination due to minor influences, and so on, this strategy is
not always suitable for all of the challenges. As a result, if a neural network is
exploited, greater improvements can be obtained. Also, if a high-quality camera is
employed to capture images for database creation, the results can be improved. An
increased database size and the use of multiple classifiers could further refine the
suggested system.

References

1. Guo, G., Fu, Y., Dyer, C.R., Huang, T.S.: Image-based human age estimation by manifold
learning and locally adjusted robust regression. IEEE Trans. Image Process. 17(7), 1178–1188
(2008)
2. Zhang, C., Liu, S., Xu, X., Zhu, C.: C3AE: Exploring the limits of compact model for age esti-
mation. In: Proceedings of the IEEE/CVF Conference on Computer Vision Pattern Recognition
(CVPR), pp. 12587–12596 (2019)
3. Sawant, M.M., Bhurchandi, K.: Hierarchical facial age estimation using Gaussian process
regression. IEEE Access 7, 9142–9152 (2019)
4. Logeswaran, R., Aarthi, P., Dineshkumar, M., Lakshitha, G., Vikram, R.: Portable charger for
handheld devices using radio frequency. Int. J. Innov. Technol. Explor. Eng. (IJITEE) 8(6),
837–839 (2019)
5. Yang, T.-Y., Huang, Y.-H., Lin, Y.-Y., Hsiu, P.-C., Chuang, Y.-Y.: SSRnet: A compact soft
stagewise regression network for age estimation. In: Proceedings of the 27th International
Joint Conference Artificial Intelligence, pp. 1–7 (2018)
6. Thilagamani, S., Shanti, N.: Gaussian and gabor filter approach for object segmentation. J.
Comput. Inf. Sci. Eng. 14(2), 021006 (2014)
7. Gao, B.-B., Zhou, H.-Y., Wu, J., Geng, X.: Age estimation using expectation of label distribution
learning. In: Proceedings of IJCAI, pp. 712–718 (2018)
8. Liu, N., Chang, L., Duan, F.: PGR-Net: A parallel network based on group and regression for
age estimation. In: Proceedings of the IEEE International Conference Acoustic, Speech Signal
Process (ICASSP), pp. 2377–2381 (2019)
9. Pandiaraja, P., Sharmila, S.: Optimal routing path for heterogenous vehicular adhoc network.
Int. J. Adv. Sci. Technol. 29(7), 1762–1771 (2020)
10. Wan, J., Tan, Z., Lei, Z., Guo, G., Li, S.Z.: Auxiliary demographic information assisted age
estimation with cascaded structure. IEEE Trans. Cybern. 48(9), 2531–2541 (2018)
11. Ruder, S.: An overview of multi-task learning in deep neural networks (2017). arXiv:1706.05098.
[Online]. Available: http://arxiv.org/abs/1706.05098
12. Liu, X., Li, S., Kan, M., Zhang, J., Wu, S., Liu, W., Han, H., Shan, S., Chen, X.: AgeNet:
Deeply learned regressor and classifier for robust apparent age estimation. In: Proceedings of
the IEEE International Conference on Computer Vision Workshop (ICCVW), pp. 258–266
(2015)
13. Pandiaraja, P., Aravinthan, K., Lakshmi, N.R., Kaaviya, K.S., Madumithra, K.: Efficient
cloud storage using data partition and time based access control with secure AES encryption
technique. Int. J. Adv. Sci. Technol. 29(7), 1698–1706 (2020)
14. Agbo-Ajala, O., Viriri, S.: Deeply learned classifiers for age and gender predictions of unfiltered
faces. Sci. World J. 2020, 1–12 (2020). https://doi.org/10.1155/2020/1289408
15. Badrinarayanan, V., Handa, A., Cipolla, R.: SegNet: A deep convolutional encoder-decoder
architecture for robust semantic pixel wise labelling (2015). arXiv:1505.07293. [Online].
Available: http://arxiv.org/abs/1505.07293
16. Bukar, M., Ugail, H., Connah, D.: Automatic age and gender classification using supervised
appearance model. J. Electron. Imag. 25(6), 061605 (2016). https://doi.org/10.1117/1.JEI.25.6.061605
17. Chang, K.-Y., Chen, C.-S., Hung, Y.-P.: Ordinal hyperplanes ranker with cost sensitivities for
age estimation. Proc. CVPR. 585–592 (2011)
18. Chang, K.-Y., Chen, C.-S., Hung, Y.-P.: A ranking approach for human ages estimation based
on face images. Proc. 20th Int. Conf. Pattern Recognit. 3396–3399 (2010)
19. Chen, C., Dantcheva, A., Ross, A.: Impact of facial cosmetics on automatic gender and age
estimation algorithms. Proc. Int. Conf. Comput. Vis. Theory Appl. (VISAPP) 2, 182–190 (2014)
20. Das, A.D., Bremond, F.: Mitigating bias in gender, age, and ethnicity classification: A multi-task
convolution neural network approach. In: Proceedings of European Conference on Computer
Vision (ECCV) Workshops, pp. 1–13 (2018)
21. Choi, S.E., Lee, Y.J., Lee, S.J., Park, K.R., Kim, J.: Age estimation using a hierarchical classifier
based on global and local facial features. Pattern Recognit. 44(6), 1262–1281 (2011)
22. Eidinger, E., Enbar, R., Hassner, T.: Age and gender estimation of unfiltered faces. IEEE Trans.
Inf. Forensics Secur. 9(12), 2170–2179 (2014)
23. Geng, X., Yin, C., Zhou, Z.-H.: Facial age estimation by learning from label distributions. IEEE
Trans. Pattern Anal. Mach. Intell. 35(10), 2401–2412 (2013)
24. Guo, G., Mu, G.: Joint estimation of age, gender and ethnicity: CCA versus PLS. In: Proceedings
of 10th IEEE International Conference on Workshops Automatic Face Gesture Recognition
(FG), pp. 1–6 (2013)
25. Murugesan, M., Thilagamani, S.: Efficient anomaly detection in surveillance videos based on
multi layer perception recurrent neural network. J. Microprocess. Microsyst. 79 (2020)
26. Deepa, K., Kokila, M., Nandhini, A., Pavethra, A., Umadevi, M.: Rainfall prediction using
CNN. Int. J. Adv. Sci. Technol. 29(7 Special Issue), 1623–1627 (2020)
27. Sun, Y., Wang, X., Tang, X.: Deep learning face representation from predicting 10,000 classes.
In: Proceedings of the IEEE Conference on Computer Vision Pattern Recognition, pp. 1891–
1898 (2014)
28. Chen, B.-C., Chen, C.-S., Hsu, W.H.: Face recognition and retrieval using cross-age reference
coding with cross-age celebrity dataset. IEEE Trans. Multimedia 17(6), 804–815 (2015)
29. Pradeep, D., Sundar, C.: QAOC: Noval query analysis and ontology-based clustering for data
management in Hadoop, vol. 108, pp. 849–860 (2020)
30. Tan, Z., Wan, J., Lei, Z., Zhi, R., Guo, G., Li, S.Z.: Efficient group-n encoding and decoding
for facial age estimation. IEEE Trans. Pattern Anal. Mach. Intell. 40(11), 2610–2623 (2018)
31. Deepa, K., Thilagamani, S.: Segmentation techniques for overlapped latent fingerprint
matching. Int. J. Innov. Technol. Explor. Eng. 8(12), 1849–1852 (2019)
32. Han, H., Otto, C., Liu, X., Jain, A.K.: Demographic estimation from face images: Human versus
machine performance. IEEE Trans. Pattern Anal. Mach. Intell. 37(6), 1148–1161 (2015)
33. Li, S., Xing, J., Niu, Z., Shan, S., Yan, S.: Shape driven kernel adaptation in convolutional
neural network for robust facial traits recognition. In: Proceedings of the IEEE Conference on
Computer Vision Pattern Recognition, pp. 222–230 (2015)
34. Gunasekar, M., Thilagamani, S.: Performance analysis of ensemble feature selection method
under SVM and BMNB classifiers for sentiment analysis. Int. J. Sci. Technol. Res. 9(2), 1536–
1540 (2020)
35. Eidinger, E., Enbar, R., Herbrich, R.: Support vector learning for ordinal regression. In: Proceed-
ings of the 9th International Conference on Artificial Neural Networks (ICANN), pp. 97—102
(1999)
36. Deepika, S. Pandiaraja, P.: Ensuring CIA triad for user data using collaborative filtering mech-
anism. In: 2013 International Conference on Information Communication and Embedded
Systems (ICICES), pp. 925 –928 (2013)
37. Guo, G., Mu, G.: Simultaneous dimensionality reduction and human age estimation via kernel
partial least squares regression. Proc. CVPR 657–664 (2011)
38. Rajesh Kanna, P., Santhi, P.: Hybrid intrusion detection using map reduce based black widow
optimized convolutional long short-term memory neural networks. Expert Syst. Appl. 194, 15
(2022)
39. Chen, K., Gong, S., Xiang, T., Loy, C.C.: Cumulative attribute space for age and crowd density
estimation. In: Proceedings of the IEEE Conference of Computer Vision Pattern Recognition,
pp. 2467–2474 (2013)
40. Wang, X., Guo, R., Kambhamettu, C.: Deeply-learned feature for age estimation. In:
Proceedings of the IEEE Winter Conference on Applied Computer Vision, pp. 534–541 (2015)
41. Xing, J., Li, K., Hu, W., Yuan, C., Ling, H.: Diagnosing deep learning models for high accuracy
age estimation from a single image. Pattern Recognit. 66, 106–116 (2017)
42. Santhi, P., Mahalakshmi, G.: Classification of magnetic resonance images using eight directions
gray level co-occurrence matrix (8dglcm) based feature extraction. Int. J. Eng. Adv. Technol.
8(4), 839–846 (2019)
43. Ni, B., Song, Z., Yan, S.: Web image mining towards universal age estimator. In: Proceedings
of the 17th ACM International Conference on Multimedia (MM), pp. 85–94 (2009)
44. Niu, Z., Zhou, M., Wang, L., Gao, X., Hua, G.: Ordinal regression with multiple output
CNN for age estimation. In: Proceedings of the IEEE Conference on Computer Vision Pattern
Recognition (CVPR), pp. 4920–4928 (2016)
45. Levi, G., Hassncer, T.: Age and gender classification using convolutional neural networks.
In: Proceedings of the IEEE Conference on Computer Vision Pattern Recognition Workshops
(CVPRW), pp. 34–42 (2015)
46. Perumal, P., Suba S.: An analysis of a secure communication for healthcare system using
wearable devices based on elliptic curve cryptography. J. World Rev. Sci. Technol. Sustain.
Develop. 18(1), 51–58 (2022)
47. Li, W., Lu, J., Feng, J., Xu, C., Zhou, J., Tian, Q.: BridgeNet: A continuity-aware probabilistic
network for age estimation. In: Proceedings of the IEEE/CVF Conference on Computing Vision
Pattern Recognition (CVPR), pp. 1145–1154 (2019)
48. Thilagamani, S., Nandhakumar, C.: Implementing green revolution for organic plant forming
using KNN-classification technique. Int. J. Adv. Sci. Technol. 29(7S), 1707–1712
49. Rajesh Kanna, P., Santhi, P.: Unified Deep Learning approach for Efficient Intrusion Detec-
tion System using Integrated Spatial–Temporal Features, Knowledge-Based Systems, vol. 226
(2021)
50. Sun, Y., Wang, X., Tang, X.: Deep convolutional network cascade for facial point detection. In:
Proceedings of the IEEE Conference on Computer Vision Pattern Recognition, pp. 3476–3483
(2013)
Chapter 48
Low-Profile Elliptical Slot Antenna
for Sub-6 GHz 5G and WLAN
Applications

Aneri Pandya, Killol Vishnuprasad Pandya, Trushit Upadhyaya,


and Upesh Patel

Abstract A low-profile, miniaturized slot antenna is designed, analyzed, fabricated,
and presented in this paper. Two slots are incorporated over the elliptical-shaped patch
to achieve adequate impedance matching. FR4 is chosen as the substrate material, which
makes the developed model cost-effective. To obtain wide bandwidth, a pair of slot
patterns is engineered over the ground plane surface. The proposed structure resonates
over the 2.91–3.60 GHz and 4.53–6.83 GHz frequency bands with adequate bandwidths of 22%
and 38%, respectively. The measured results from the fabricated antenna exhibit very good
correlation with the simulated results. The satisfactory parameters make the antenna
suitable for sub-6 GHz 5G and WLAN applications.

Keywords Slot antenna · WLAN applications · Wireless communications

48.1 Introduction

In wireless communication and embedded systems, structural changes to conventional
antenna structures are essential to meet industry requirements. In the initial phase,
microstrip antennas were the first choice of researchers for various wireless applications
because they offer many advantages such as low profile, ease of fabrication, and compact
size [1]. Microstrip antennas also offer circular polarization and can resonate in
dual-band frequency operations.

A. Pandya (B)
Alpha College of Engineering and Technology, Gandhinagar, Gujarat, India
e-mail: aneri.pandya@alpha-cet.in
K. V. Pandya · T. Upadhyaya · U. Patel
Chandubhai S Patel Institute of Technology, CHARUSAT University, Changa, Gujarat, India
e-mail: killolpandya.ec@charusat.ac.in
T. Upadhyaya
e-mail: trushitupadhyaya.ec@charusat.ac.in
U. Patel
e-mail: upeshpatel.ec@charusat.ac.in

However, microstrip antennas are poorly suited to achieving satisfactory bandwidth.
Parasitic patch antennas with a thick substrate could be a feasible solution to overcome
this disadvantage of microstrip antennas [2]. These solutions invite coplanar and stacked
configurations. However, a stacked structure works against the fundamental advantages
offered by microstrip antennas, such as ease of fabrication and cost-effectiveness [3].
After applying the majority of structural
changes, the researchers have come up with a new geometrical change of inserting
a slot on a patch surface to achieve adequate bandwidth. By focusing on the above
technique, the researchers have analyzed and understood the response from U-shaped
slot antenna [4]. One thing that could be noticed from the analysis is that the utiliza-
tion of a U-shaped slot in an antenna gives stable polarization and uniform radiation
pattern. The research says the unwanted harmonics could be eliminated significantly
by using ring-shaped slot antennas [5]. Directional antennas could give moderated
gain at certain frequency applications because they have minimum radiation over the
undesired frequency spectrum. In unidirectional antennas, it is extremely important
to suppress the undesired resonances to get acceptable gain. The aforementioned
research paper covered suppression of these unwanted harmonics in a very appro-
priate manner. The insertion of different-shaped slots, close loop slots, and slot pairs
provides desired antenna parameters such as omnidirectional and stable radiation
pattern, relatively wide impedance bandwidth, uninvited harmonics, and moderate
gain. A similar technique-inspired slot antenna is discussed in [6]. In this paper, the
octa-star slot is developed on the top layer of the radiating geometry. Because of
this arrangement, flat gain, conical radiation pattern, and wide bandwidth could be
obtained. Similarly, a slot having conical shape was developed and presented in [7].
The discussed antenna offers high bandwidth from 600 MHz to 4 GHz. The literature
depicts that the adoption of slots with various techniques shows application-oriented
response. The experiments claim that the radiation surface having an open slot could
get satisfactory bandwidth and moderated gain with circulation polarization [8]. In
this geometry, the microstrip line feed is utilized to excite the slot antenna and both
are located on the edge of the ground plane to make an asymmetrical antenna. Also,
The slot antenna can also be useful for achieving dual-band performance. Like other
antennas, arrays of slot antennas have also been discussed to obtain dual polarization.
DRA antenna arrays have also been discussed by researchers to obtain dual- and
triple-band resonances; in those studies, the authors utilized an unequal power divider
circuit to excite the antenna [9–12]. Antennas have also been proposed for 5G smartphone
applications and mobile communications [13, 14]. In special applications such as
implantable and wearable antennas, slot antennas prove to be very promising candidates.
A circular ring slot antenna with an electromagnetic band gap structure was proposed for
mobile body area network applications [15].
48.2 Antenna Geometry

Figure 48.1 depicts the developed antenna design; Fig. 48.1a, b shows the top view and
the bottom view, respectively. The model is excited using a lumped port. Close
observation shows that two slots are introduced over the surface of the elliptical patch.
The literature indicates that a defected ground plane can contribute significantly to
obtaining the desired resonances [16, 17]. The ground plane design is also noteworthy:
two slot patterns are developed just below the substrate layer, and satisfactory
impedance matching is obtained by this technique. The ground plane is partial, having a
thick border developed below the substrate; the width of the border is 3 mm. The detailed
dimensions are shown in Table 48.1, which illustrates the miniaturized size of the antenna.

Fig. 48.1 Antenna design and fabricated model: (a) upper view, (b) view from backside,
(c) fabricated antenna model

Table 48.1 Detail dimensions

Notation | Dimension (mm) | Notation | Dimension (mm)
WS | 6 | WF | 4.9
LS | 1 | LF | 11.9
W | 2 | LSUB | 28.7
L | 10 | WSUB | 27

Fig. 48.2 Reflection coefficient values at various design iterations

48.3 Parametric Study

The systematic improvement in the reflection coefficient (S11) values is shown in
Fig. 48.2. The desired resonances are obtained using the elliptical-shaped antenna with
the two additional slots. In the first iteration, the output of the plain monopole antenna
is shown. In the second iteration, the inset feed is introduced to obtain the desired
output. Generally, the inset feed is useful for shifting the resonating frequency band.

48.4 Results and Discussions

The antenna structure has been primarily analyzed through its simulated reflection
coefficient values. The high-frequency structure simulator software is utilized for the
antenna analysis. For the desired frequency bands, the S11 values are below −10 dB, as
expected, owing to adequate impedance matching. The simulated antenna parameters are
verified against the measured results. For systematic analysis, both results (simulated
and measured) are plotted on the same graph in Fig. 48.3; the curve of the measured
result follows the simulated one. A miniaturized compact conductive-material-based
antenna that resonates in dual-frequency bands has also been reported for wireless
applications [11, 18, 19].
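As a quick cross-check of the reported impedance bandwidths, the fractional bandwidth of each −10 dB band can be computed from its band-edge frequencies; the small sketch below uses the band edges quoted in the abstract, and the exact percentage depends on how the band edges and centre frequency are defined.

```python
def fractional_bandwidth(f_low_ghz, f_high_ghz):
    """Fractional bandwidth (%) of a band defined by its -10 dB edge frequencies."""
    f_center = (f_low_ghz + f_high_ghz) / 2.0
    return 100.0 * (f_high_ghz - f_low_ghz) / f_center

# Band edges reported for the proposed antenna
print(round(fractional_bandwidth(2.91, 3.60), 1))  # lower band, roughly 21-22 %
print(round(fractional_bandwidth(4.53, 6.83), 1))  # upper band, roughly 40 % (the paper
                                                   # quotes about 38 %, which depends on
                                                   # the band-edge/centre definition used)
```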
Figure 48.4 depicts the current distribution at the 3.13 GHz and 5.95 GHz frequencies.
It is visible from the figure that the maximum current flow is concentrated along the
inset feed line. The slot created in the inset feed provides significant impedance
matching and shapes the current distribution. A smaller amount of current is observed
over the elliptical patch of the antenna.

Fig. 48.3 Graph of simulated and measured results of reflection coefficient values

Fig. 48.4 Current distribution over the surface of antenna

Fig. 48.5 Antenna testing setup inside anechoic chamber
The developed fabricated antenna is tested inside an anechoic chamber of size
5 m × 5 m × 5 m. Figure 48.5(i), (ii) shows the antenna fixed with φ = 0° and φ = 90° to
obtain the E-field and H-field patterns, respectively. Figure 48.6 illustrates the 2D
E-field and H-field radiation patterns. The radiation pattern is stable and uniform in
every direction, and the curve of the measured radiation pattern follows the simulated
radiation pattern curve.

48.5 Conclusion

A dual-band elliptical-shaped slot antenna is proposed for sub-6 GHz 5G and WLAN
applications. The software-generated results are analyzed with the actual results. The
comparison shows a satisfactory correlation between the two responses. The cost-
effective FR4 material is utilized to develop the structure. The antenna dimensions
are carefully finalized based on a literature study and parametric study analysis. The
primary parameters such as the radiation pattern, return loss, current distribution, and
E-field and H-field patterns are observed, which shows the positive potential of the
presented research.
Fig. 48.6 E field and H field radiation pattern for 3.13 GHz and 5.95 GHz frequencies, respectively

References

1. Singh, I., Tripathi, V.S.: Micro strip patch antenna and its applications: a survey. Int. J. Comp.
Tech. Appl. 2(5), 1595–1599 (2011)
2. Mak, C.-L., Luk, K.M., Lee, K.F., Chow, Y.L.: Experimental study of a microstrip patch antenna
with an L-shaped probe. IEEE Trans. Antennas Propag. 48(5), 777–783 (2000)
3. Radavaram, S., Pour, M.: Wideband radiation reconfigurable microstrip patch antenna loaded
with two inverted U-slots. IEEE Trans. Antennas Propag. 67(3), 1501–1508 (2018)
4. Weigand, S., Huff, G.H., Pan, K.H., Bernhard, J.T.: Analysis and design of broad-band single-
layer rectangular U-slot microstrip patch antennas. IEEE Trans. Antennas Propag. 51(3), 457–
468 (2003)
5. Li, W., Wang, Y., You, B., Shi, Z., Liu, Q.H.: Compact ring slot antenna with harmonic
suppression. IEEE Antennas Wirel. Propag. Lett. 17(12), 2459–2463 (2018)
6. Kumar, A.: Wideband circular cavity-backed slot antenna with conical radiation patterns.
Microw. Opt. Technol. Lett. 62(6), 2390–2397 (2020)
7. Raza, A., Lin, W., Chen, Y., Yanting, Z., Chattha, H.T., Sharif, A.B.: Wideband tapered slot
antenna for applications in ground penetrating radar. Microwave Opt. Technol. Lett. 62(7),
2562–2568 (2020)
8. Ellis, M.S., Effah, F.B., Ahmed, A.-R., Kponyo, J.J., Nourinia, J., Ghobadi, C., Mohammadi,
B.: Asymmetric circularly polarized open-slot antenna. Int. J. RF Microwave Comput.-Aided
Eng. 30(5), e22141 (2020)
9. Pimpalgaonkar, P.R., Chaurasia, M.R., Raval, B.T., Upadhyaya, T.K., Pandya, K.: Design of
rectangular and hemispherical dielectric resonator antenna. In: 2016 International Conference
on Communication and Signal Processing (ICCSP), pp. 1430–1433. IEEE (2016)
10. Vahora, A., Pandya, K.: Triple band dielectric resonator antenna array using power divider
network technique for GPS navigation/bluetooth/satellite applications. Int. J. Microwave Opt.
Technol. 15, 369–378 (2020)
11. Vahora, A., Pandya, K.: Microstrip feed two elements pentagon dielectric resonator antenna
array. In: 2019 International Conference on Innovative Trends and Advances in Engineering
and Technology (ICITAET), pp. 22–25. IEEE (2019)
12. Pimpalgaonkar, P.R., Upadhyaya, T.K., Pandya, K., Chaurasia, M.R., Raval, B.T.: A review
on dielectric resonator antenna. In: 1ST International Conference on Automation in industries
(ICAI), pp. 106–109 (2016)
13. Parchin, N.O., Al-Yasir, Y.I.A., Ali, A.H., Elfergani, I., Noras, J.M., Rodriguez, J., Abd-
Alhameed, R.A.: Eight-element dual-polarized MIMO slot antenna system for 5G smartphone
applications. IEEE Access 7, 15612–15622 (2019)
14. Moreno, R.M., Kurvinen, J., Ala-Laurinaho, J., Khripkov, A., Ilvonen, J., van Wonterghem,
J., Viikari, V.: Dual-polarized mm-Wave end-fire chain-slot antenna for mobile devices. IEEE
Transactions on Antennas and Propagation (2020)
15. Gao, G.-P., Bin, H., Wang, S.-F., Yang, C.: Wearable circular ring slot antenna with EBG
structure for wireless body area network. IEEE Antennas Wirel. Propag. Lett. 17(3), 434–437
(2018)
16. Pandya, A., Upadhyaya, T.K., Pandya, K.: Tri-band defected ground plane based planar
monopole antenna for Wi-Fi/WiMAX/WLAN applications. Prog. Electromagnet. Res. C 108,
127–136 (2021)
17. Pandya, A., Upadhyaya, T.K., Pandya, K.: Design of metamaterial based multilayer antenna
for navigation/WiFi/satellite applications. Prog. Electromagnet. Res. M 99, 103–113 (2021)
18. Upadhyaya, T., Desai, A., Patel, R., Patel, U., Kaur, K.P., Pandya, K.: Compact transparent
conductive oxide based dual band antenna for wireless applications. In: 2017 Progress in
Electromagnetics Research Symposium-Fall (PIERS-FALL), pp. 41–45. IEEE (2017)
19. Kosta, S.P., Manavadaria, M., Pandya, K., Kosta, Y.P., Kosta, S., Mehta, H., Patel, J.: Human
blood plasma-based electronic integrated circuit amplifier configuration. J. Biomed. Res. 27(6),
520 (2013)
Chapter 49
A Survey of Computational Intelligence
Techniques Used for Cyber-Attack
Detection

S. Deepa Rajan and R. A. Karthika

Abstract In recent times, the number of computer systems and networks has grown
dramatically. These computer systems and networks are now more vulnerable to
cyber-attacks than ever before. Due to the complexity and dynamic characteristics of
cyber-attacks, computer systems require many cyber-protecting mechanisms. In this
article, a review of computational intelligence-based cyber-attack detection methods
is presented. The fundamental issues in cybersecurity and attack detection were
described before introducing various computational intelligence-based attack detec-
tion applications. This review is focused on the cyber-attack detection approaches
based on machine learning, deep learning and reinforcement learning techniques. The
benchmark datasets used in cyber-attack detection research are first described and
the performance of several computational intelligence-based cyber-attack detection
methods was compared to validate the attack detection efficiency. Finally, a multi-
agent reinforcement learning-based cyber-attack detection method was proposed to
improve the attack detection performance when using reinforcement learning.

Keywords Cyber-attacks · Cyber-attack detection · Computational intelligence ·


Machine learning · Deep learning · Reinforcement learning

49.1 Introduction

In the present era, the number of internet-connected systems has largely increased,
and these systems are more susceptible to cyber-attacks than ever before. A cyber-
attack is a deliberate attempt to get unauthorized access to the computer systems or
networks with the intention of causing damages. Cyber-attacks deactivate or take the
control of computer systems in order to manipulate or steal the information or data
stored in that system. According to the cyber risk analytics report published by the
research team Risk Based Security [1], data breaches in the first three quarters of 2020
exposed 36 billion records. Another study conducted

S. D. Rajan (B) · R. A. Karthika


Vels Institute of Science, Technology and Advanced Studies, Chennai, Tamil Nadu, India
e-mail: deepasdinesh@gmail.com

by Deep Instinct [2] also indicates that millions of cyber-attacks happened every day
throughout 2020; the overall malware rate increased by 358% and the ransomware rate
increased by 435% compared with the previous year, 2019. Critical services such as
healthcare organizations also suffered from cybersecurity breaches: according to the
report [3], over 90% of healthcare organizations in the USA were affected by at least
one security breach in the last 3 years. Therefore, the development of cyber-attack
detection methods is considered an essential task.
Cybersecurity is a collection of techniques and processes that protect computers,
networks and stored data from cyber-attacks, unauthorized modifications or destruc-
tions [4]. Intrusion detection systems (IDS) can be used to detect unauthorized system
activities such as usage, copying, modification and destruction [5]. In an IDS, three
major types of analysis are used: the misuse-based (signature-based) approach, the
anomaly-based approach, and the hybrid approach. Signature-based approaches are suitable to
detect well-known types of cyber-attacks. Anomaly-based approaches learn the char-
acteristics of a normal network or systems to detect anomalies through the observation
of deviations from normal characteristics. A false alarm is the major limitation of
anomaly-based approaches because previously unlearned network/system characteristics
might be detected as anomalies. Hybrid approaches combine signature-based
and anomaly-based techniques to improve the cyber-attack detection rate and reduce
the false alarm rate related to unknown cyber-attacks.
Computational intelligence methods such as machine learning (ML) and deep
learning (DL) approaches have been largely used in cyber-attack and defence. On
the cyber-attacker’s side, ML is used to weaken the cyber-defence approaches. In the
cybersecurity domain, ML is used to build a strong defence against cyber-attacks to
adaptively avoid or minimize the damages. The ML approaches have been broadly
employed to detect intrusion [6], malwares [7], cyber-physical attacks [8] and data
protection also [9]. However, ML-based systems are incapable of providing dynamic
as well as sequential reactions to cyber-attacks, particularly for new or continually
growing type cyber threats. Hence, still, there is a need for an intelligent cyberse-
curity system to prevent computers and networks from the new and continuously
growing type security threats. Cybersecurity researchers are continuously focused
on the development of intelligent cyber-attack defence systems using advancements
in computational intelligence techniques.
In recent times, reinforcement learning (RL) is also preferred in the cybersecurity
field. RL is similar to human learning in that it learns from experience by exploring
and exploiting unknown situations. RL is highly adaptive and useful in real time because
it allows an autonomous agent to be modelled that carries out consecutive actions
effectively with little or no prior information about the
environment. The combination of DL and RL methods can be suitable for modern
cybersecurity systems. In this paper, a review of computational intelligence methods
used for cybersecurity is presented. This paper is organized as follows: Various ML,
DL and RL methods used in the cyber-attack detection domain are described in
Sect. 49.2. The details about the benchmark datasets used in cyber-attack detection,
performance measures used to evaluate the cyber-attack detection methods, perfor-


mance comparison of the computational intelligence methods used in cybersecurity
and the future research directions in cyber-attack detection are presented in Sect. 49.3.
Finally, Sect. 49.4 concludes this research paper.

49.2 Cyber-Attack Detection Methods

49.2.1 ML-Based Cyber-Attack Detection

Machine learning (ML) techniques are broadly used in both cyber-attacking and
cyber-defence fields. The major types of ML techniques used for cyber-attack detec-
tion are illustrated in Fig. 49.1. Generally, the supervised ML-based cyber-attack
detection methods are executed as parametric or non-parametric approaches. Feng
et al. [10] established a user-centric ML framework which is aimed at the development
of a cybersecurity operation centre (SOC) for the real environment. This ML-based
framework is capable of learning useful insights even from highly unbalanced data and
a limited number of labels. Unsupervised ML can also be used to learn and construct
baseline behavioural profiles for a variety of entities, which can subsequently be used
to identify anomalies. In Pu et al. [11], an unsupervised clustering-based method for
anomaly detection was proposed. This method combines the sub-space clustering
(SSC) technique and the one class-support vector machine (OC-SVM) technique to
identify the cyber-attacks without any kind of prior knowledge. The performance
of this method was evaluated on the standard NSL-KDD database [12]. An unsupervised
algorithm [13] for detecting unknown zero-day threats also showed significant potential
in zero-day attack detection.

Fig. 49.1 Machine learning methods for cyber-attack detection


In recent years, researchers have focused on the development of end-to-end ML frameworks
and hybrid and ensemble methods for efficient cyber-attack detection. The end-to-end
ML-based framework [14] for network IDS has several important characteristics, such as
reproducibility and being up to date with current network traffic and cyber-attacks; it
also addresses model realization and deployment issues. In [15], an ensemble model was
constructed as a weighted ensemble of different classifiers such as logistic regression,
decision tree and SVM, on top of a big data analysis architecture, to detect distributed
cyber-attacks. These hybrid ML-based cyber-attack detection methods can significantly
strengthen the defence against cyber-attacks. However, in ML-based approaches, feature
selection plays a vital role in accurate cyber-attack detection; the feature selection
process depends on the data analysis method and requires at least some prior knowledge
about the attack categories. Hence, researchers prefer DL- and RL-based cyber-attack
detection methods to overcome the limitations of ML-based approaches.
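To illustrate the kind of unsupervised anomaly detection discussed above (for example, the OC-SVM stage of [11]), the following scikit-learn sketch trains a one-class SVM on normal-traffic feature vectors only and flags deviations as possible attacks; the synthetic features and parameter values are placeholders, not the actual NSL-KDD pipeline.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Placeholder numeric feature vectors standing in for preprocessed flow records
rng = np.random.default_rng(0)
normal_train = rng.normal(0.0, 1.0, size=(500, 10))       # normal traffic only
test = np.vstack([rng.normal(0.0, 1.0, size=(50, 10)),    # unseen normal traffic
                  rng.normal(4.0, 1.0, size=(10, 10))])   # anomalous / attack-like traffic

scaler = StandardScaler().fit(normal_train)
ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")
ocsvm.fit(scaler.transform(normal_train))

# +1 = predicted normal, -1 = predicted anomaly (possible attack)
predictions = ocsvm.predict(scaler.transform(test))
print("flagged as anomalous:", int((predictions == -1).sum()), "of", len(test))
```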

49.2.2 DL-Based Cyber-Attack Detection

Deep learning (DL) is a relatively new domain in computational intelligence research.


This approach is focused on the development of a neural network-based system
similar to that of the human brain to understand the data types such as text, image
and sound. DL is widely employed in cybersecurity due to its significant potential for
building security-based applications [16]. The major categories of the DL approach
used in the cyber-attack detection field are shown in Fig. 49.2. The DL-based cyber-
attack detection methods can be unsupervised such as the autoencoder (AE) tech-
niques, the deep belief network (DBN) and the generative adversarial network (GAN)

Fig. 49.2 Deep learning methods for cyber-attack detection


techniques. In Gopalakrishnan et al. [17], a DBN-based approach was developed


for cyber-attack detection. This method focused on three main processes such as
network traffic prediction, data offloading and cyber-attack detection. The convolu-
tional DBN-based approach [18] identifies the cyber-attack features and executes
real-time intrusion detection in wireless networks. This method achieves a fast average
attack detection time of 1.14 ms as well as a detection accuracy of 0.974.
The supervised DL methods such as the deep neural network (DNN), the convolu-
tional neural network (CNN) and the recurrent neural network (RNN) can also be used
in cyber-attack detection. In Ullah and Mahmoud [19], a CNN-based multiclass clas-
sification framework is used to detect cyber-attacks in IoT Networks. In recent times,
ensemble and hybrid DL approaches are also preferred in cyber-attack detection. An
ensemble framework proposed in Saharkhizan et al. [20] detected the cyber-attacks in
IoT networks based on the network traffic. This framework is a generalized approach,
which could be easily implemented in the already available industrial network infras-
tructures with a minimum effort. In Aslan and Yilmaz [21], a DL-based hybrid
model architecture was proposed to classify different malware variants. This method
attained 97.78% accuracy, which is comparatively better than the performance of
most state-of-the-art ML-related malware detection approaches. DL-based methods
attained significant accuracy in cyber-attack detection; however, the accuracy of
DL-based approaches depends on the available training data and on the training algorithm
used. Hence, in recent times, RL-based methods have also been preferred in cyber-attack
detection, because RL algorithms are exploratory and can be trained without a
pre-existing training dataset.
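For reference, a minimal Keras sketch of a 1-D CNN multiclass classifier over tabular traffic features, loosely in the spirit of the CNN-based detectors discussed above, is shown below; the input width, number of classes, and training data are assumptions and do not reproduce the architecture of [19].

```python
import numpy as np
import tensorflow as tf

n_features, n_classes = 40, 5          # e.g. preprocessed flow features, 5 traffic classes

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_features, 1)),
    tf.keras.layers.Conv1D(32, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Conv1D(64, kernel_size=3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Placeholder data: each sample is a feature vector reshaped to (n_features, 1)
X = np.random.rand(1000, n_features, 1).astype("float32")
y = np.random.randint(0, n_classes, size=1000)
model.fit(X, y, epochs=3, batch_size=64, validation_split=0.2, verbose=0)
```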

49.2.3 RL-Based Cyber-Attack Detection

Reinforcement learning (RL)-based cyber-attack detection methods do not require any
sub-optimal action to be explicitly corrected. Instead, these methods find a balance
between exploration and exploitation. Moreover, they do not assume any prior mathematical
model of the learning environment. Hence, RL-based cyber-attack detection approaches are
flexible in learning new knowledge about cyber-attacks. In Kurt et al. [22], an online
anomaly/cyber-attack detection framework was implemented using a partially observable
Markov decision process (POMDP) and a universal, robust online cyber-attack detection
algorithm with a model-free RL approach. This method was focused on smart grid-based
applications, and its experimental results evidenced the efficacy of RL in the timely and
accurate identification of cyber-attacks in smart grid appli-
cations. The prior knowledge about cybersecurity can also be represented as the
cybersecurity knowledge graph (CKG) [23] and this CKG can be used to guide the
RL algorithm’s exploration to identify malware. This method illustrated that a CKG-
guided RL algorithm can be performed well in malware detection. The RL-based
methods can also be used to identify the phishing websites. Chatterjee and Namin
[24] proposed a deep RL-based approach to identify the phishing websites through

the analysis of their URLs. This method is capable of adapting to the dynamic characteristics of phishing websites and learning the features related to them.
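As a minimal illustration of the model-free RL idea discussed above, the sketch below trains a tabular Q-learning agent on a toy, synthetic detection task in which the agent decides whether to raise an alert for a discretized traffic observation. The environment, reward values and state discretization are assumptions made for illustration only and are not taken from the surveyed papers.

```python
# Minimal sketch (assumptions): tabular Q-learning for a toy alert/ignore
# decision task; not the POMDP formulation used in Kurt et al. [22].
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 10, 2          # discretized traffic features; 0 = ignore, 1 = alert
q = np.zeros((n_states, n_actions))
alpha, eps = 0.1, 0.1                # learning rate and exploration rate

def sample_observation():
    """Synthetic traffic: higher state indices are more likely to be attacks."""
    s = rng.integers(n_states)
    is_attack = rng.random() < s / (n_states - 1)
    return s, is_attack

for step in range(50_000):
    s, is_attack = sample_observation()
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(q[s]))
    # Reward: +1 for a correct decision (alert on attack / ignore benign), -1 otherwise
    r = 1.0 if (a == 1) == is_attack else -1.0
    q[s, a] += alpha * (r - q[s, a])   # one-step update (discount factor 0)

print("Learned alert policy per state:", np.argmax(q, axis=1))
```

In a realistic setting the state would encode flow-level features and the feedback would be delayed, which is where the exploration–exploitation balance discussed above matters.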

49.3 Comparative Analysis and Discussions

49.3.1 Benchmark Datasets

The details of benchmark datasets used for cyber-attack detection are summarized
in Table 49.1. Malimg [25], Microsoft BIG 2015 [26], and Malevis [27] are the
most used benchmark datasets for the development and evaluation of computational
intelligence-based cyber-attack identification methods. The Malimg dataset [25] contains 9339 malware samples belonging to 25 different classes, and the number of samples per class varies across the dataset.

Table 49.1 Details of benchmark datasets used in cyber-attack detection

| Dataset | Number of samples | Number of attack classes | Observations |
| Malimg [25] | 9339 | 25 | Different classes of malware are available |
| Microsoft BIG 2015 [26] | 21,741 | 09 | The quantity of malware samples is not equally distributed |
| Malevis [27] | 9100 training samples, 5126 testing samples | 25 | Different classes of malware are available |
| KDDCup 99 [28] | 3,925,650 | 05 | A large amount of redundant training and testing samples are present |
| NSL-KDD [12] | DoS: train 45,927, test 74,588; Probe: train 11,656, test 2421; R2L: train 995, test 2754; U2R: train 52, test 200 | 04 | KDDTest+ has 17 attack types that are not available in KDDTrain+ |

In the Microsoft BIG 2015 dataset [26], 21,741 malware samples belonging to nine different classes are available, and the quantity of malware samples is not equally distributed among the classes. Each sample in this dataset is represented by two files with the extensions ".byte" and ".asm": the ".byte" file contains the raw hexadecimal representation of the binary content, and the ".asm" file contains the extracted disassembled code. The Malevis dataset [27] includes 9100 training and 5126 testing malware samples from 25 different malware classes; each class contains 350 training samples and a varying number of testing samples.
Apart from the above three datasets, the KDDCup 99 [28] and NSL-KDD [12] datasets are also used in a few academic research works. Despite several limitations, such as a large number of redundant training and testing samples, the KDDCup 99 dataset is broadly used in cybersecurity research. It contains labelled training and testing samples with five label types: normal, probe, DoS, R2L and U2R. Normal refers to normal network traffic; probe refers to surveillance and probing; DoS refers to attacks in which the attacker attempts to prevent the target machine from providing service or access to the computer system; R2L refers to unauthorized access from a remote machine; and U2R refers to unauthorized escalation from a user account to root privileges. The NSL-KDD dataset [12] has two parts, KDDTrain+ and KDDTest+. KDDTest+ contains 17 attack types that are not available in KDDTrain+, which simulates a real network with unknown attack types and makes NSL-KDD more challenging than the KDDCup 99 dataset.
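A common preprocessing step when working with KDDCup 99/NSL-KDD is to collapse the fine-grained attack names into the five coarse labels described above. The sketch below shows one way to do this with pandas; the file path, the assumed header row and the (partial) attack-name mapping are illustrative assumptions, not a complete or authoritative mapping.

```python
# Minimal sketch (assumptions): collapsing NSL-KDD attack names into the five
# coarse categories. The CSV path, column layout and the mapping shown here are
# illustrative; consult the dataset documentation for the full list.
import pandas as pd

COARSE = {
    "normal": "normal",
    "neptune": "DoS", "smurf": "DoS", "back": "DoS",
    "satan": "probe", "ipsweep": "probe", "portsweep": "probe", "nmap": "probe",
    "guess_passwd": "R2L", "warezclient": "R2L",
    "buffer_overflow": "U2R", "rootkit": "U2R",
}

def load_with_coarse_labels(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)                       # assumes a header row with a 'label' column
    df["coarse_label"] = df["label"].map(COARSE).fillna("unknown")
    return df

# Example usage (hypothetical file name):
# train = load_with_coarse_labels("KDDTrain_plus.csv")
# print(train["coarse_label"].value_counts())
```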

49.3.2 Performance Measures

The performance of computational intelligence-based cyber-attack detection methods is commonly assessed by confusion matrix-based performance metrics.
The following four values are computed from the confusion matrix:
• True Positive (TP): It specifies a correctly identified intrusion/attack.
• True Negative (TN): It specifies a benign activity correctly identified as non-malicious.
• False Positive (FP): It specifies a benign activity wrongly identified as an intrusion/malicious activity.
• False Negative (FN): It specifies an intrusion that is not identified and is instead labelled as non-malicious.
Using the above four values, the following performance measures are computed to quantify the efficiency of a cyber-attack detection system:
Accuracy: It specifies the ratio of the number of correctly identified data samples to the total number of data samples.

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (49.1)

Precision: It specifies the proportion of samples detected as intrusions that are correctly classified, i.e. that are actual intrusions.

\text{Precision} = \frac{TP}{TP + FP} \quad (49.2)

Recall: It specifies the proportion of actual intrusion samples that are correctly classified.

\text{Recall} = \frac{TP}{TP + FN} \quad (49.3)

F-Measure: It is the harmonic mean of precision and recall, represented as follows:

\text{F-Measure} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN} \quad (49.4)
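The sketch below computes these four measures directly from TP, TN, FP and FN counts; it is a straightforward transcription of Eqs. (49.1)–(49.4), and the example counts shown are illustrative, not results from any surveyed system.

```python
# Direct transcription of Eqs. (49.1)-(49.4) from confusion-matrix counts.
def detection_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Example: 950 detected attacks, 8900 correctly ignored benign flows,
# 50 false alarms and 100 missed attacks (illustrative numbers only).
print(detection_metrics(tp=950, tn=8900, fp=50, fn=100))
```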

49.3.3 Performance Comparison

Recent computational intelligence-based cyber-attack detection methods are compared in Table 49.2, which summarizes their efficiency and limitations. The SSC and OC-SVM-based method [11] attained a high attack detection rate (99%) among the ML-based cyber-attack detection methods; its detection rate is slightly (0.97%) lower than that of the CNN-based DL method [19], but its complexity is also lower than that of the CNN-based method. The performance of cyber-attack detection methods can also be compared using the area under the curve (AUC) value, which represents the probability that a randomly chosen positive example is ranked above a randomly chosen negative example. The ML-based framework proposed by De Carvalho Bertoli et al. [14] attained the highest AUC (0.98) among the state-of-the-art attack detection methods considered; however, its attack detection rate is about 4% lower than that of the OC-SVM [11] and CNN [19]-based
attack detection methods.
The performance of DL- and RL-based attack detection methods can also be compared using the F1 score, which balances precision and recall. Figure 49.3 compares the F1 scores of state-of-the-art DL- and RL-based attack detection methods. The model-free RL algorithm [22] attained the highest F1 score (99.97) among them. Other performance measures, such as sensitivity and specificity, are also used in the evaluation of attack detection methods [17, 21]; these methods attained significant sensitivity and specificity values but lower F1 scores than the model-free RL algorithm [22]. Moreover, their attack detection rates are significantly lower than those of the OC-SVM [11], CNN [19] and ensemble DL [20]

Table 49.2 Comparison of recent computational intelligence-based methods used for cyber-attack detection

| Author | Techniques | Detection rate (%) | Other performance measures | Limitations |
| Feng et al. [10] | Multi-layer neural network (MNN), random forest (RF), SVM and logistic regression (LR) | MNN = 78.33, RF = 80, SVM = 80, LR = 76.67 | Average AUC = 0.80 | Difficult to separate the nonlinear distribution of the dataset; more complexity |
| Pu et al. [11] | SSC + OC-SVM | 99 | Average AUC = above 0.9 | The activation function is not a bounded one |
| Carvalho et al. [14] | AB-TRAP framework | 95 | Average AUC = 0.98 | More complexity |
| Zoppi et al. [13] | Unsupervised framework | Not reported | Precision = 0.994, Recall = 0.995 | Spectral characteristics of attack classes changed over time |
| Ullah and Mahmoud [19] | CNN | 99.97 | Precision = 99.95, Recall = 99.95, F1-score = 99.95 | Requires more training data |
| Aslan and Yilmaz [21] | Hybrid DL architecture | 96.6 | Sensitivity = 97.1, Specificity = 94.9, F1-score = 94.5 | More complexity |
| Gopalakrishnan et al. [17] | DBN | 95.30 | Specificity = 96.16, Sensitivity = 99.04, F1-score = 97.77 | No global-level optimum value is available for all layers of the network |
| Yang et al. [18] | Conditional DBN | 97.4 | Precision = 96.6, Recall = 97.6, F1-score = 97.1 | Computational complexity |
| Saharkhizan et al. [20] | Ensemble DL | 99 | Precision = 92.22, Recall = 91.21, F1-score = 91.71 | More complexity |
| Kurt et al. [22] | Model-free RL algorithm | Not reported | Precision = 99.95, Recall = 100, F1-score = 99.97 | Scalability and observability issues |
| Piplai et al. [23] | CKG + RL | Not reported | Q-value = −16.99, Rank = 1(99) | More complexity |
| Chatterjee et al. [24] | Deep RL | 90.1 | Precision = 86.7, Recall = 88, F1-score = 87.3 | More training and more computations required |

Fig. 49.3 F1-score-based comparison of ML and RL-based cyber-attack detection methods

methods. The attack detection method proposed by Piplai et al. [23] was evaluated using a rank-based metric; its complexity is high owing to the limitations of the CKG. Hence, there is a need for a method with lower complexity and a higher attack detection rate.

49.3.4 Future Research Scope

The performance comparison discussed in the previous section shows that RL-based attack detection performs well compared with other state-of-the-art cyber-attack detection methods. Moreover, the RL technique can model an agent that takes sequential optimal actions with little or no prior knowledge of the cyber-attack environment. As a result, RL-based cyber-attack detection is appropriate for sophisticated and fast-moving cyber-attacks. However, single-agent RL frameworks cannot effectively detect all cyber-attacks; hence, multiple agents can be used to increase the efficiency of cyber-attack detection systems. In this paper, we propose a multi-agent RL-based framework for cyber-attack detection, whose main motivation is to address the issues observed in single-agent RL-based detection. The proposed multi-agent RL-based cyber-attack detection framework is shown in Fig. 49.4. In multi-agent RL, if one or more agents fail, the remaining agents can take over their tasks. Moreover, in this system, a group of autonomous, interacting entities shares a common environment, which makes it suitable for detecting many forms of cyber threats and makes the proposed detection method inherently robust. The proposed model can learn faster and attain better cyber-attack detection performance due to the experience-sharing characteristics of multi-agent RL. Finally, because multi-agent systems permit the easy addition of new agents, a high degree of scalability is possible in the proposed approach.

Fig. 49.4 Multi-agent RL-based cyber-attack detection
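To make the experience-sharing idea concrete, the sketch below runs several independent tabular agents on the toy detection task from Sect. 49.2.3 while letting them write into a shared Q-table; the environment, the number of agents and the sharing scheme are illustrative assumptions and do not reproduce the architecture of Fig. 49.4.

```python
# Minimal sketch (assumptions): several agents updating a shared Q-table on the
# toy alert/ignore task; this only illustrates experience sharing, not Fig. 49.4.
import numpy as np

rng = np.random.default_rng(1)
n_states, n_actions, n_agents = 10, 2, 4
shared_q = np.zeros((n_states, n_actions))   # experience sharing via a common table
alpha, eps = 0.1, 0.1

def observe():
    s = rng.integers(n_states)
    return s, rng.random() < s / (n_states - 1)   # higher states -> more attacks

for step in range(20_000):
    for agent in range(n_agents):                 # each agent sees its own traffic
        s, is_attack = observe()
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(shared_q[s]))
        r = 1.0 if (a == 1) == is_attack else -1.0
        shared_q[s, a] += alpha * (r - shared_q[s, a])

print("Shared alert policy per state:", np.argmax(shared_q, axis=1))
```

If an agent is removed or a new one is added, the shared table is unaffected, which is one way to obtain the scalability and robustness properties highlighted above.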



49.4 Conclusions

Cyber-attacks have become a major issue owing to the growth of internet-connected computer systems; they deactivate or take unauthorized control of computer systems in order to steal the information or data stored in them. Computational intelligence-based methods are widely used to detect cyber-attacks. This article investigated various ML-, DL- and RL-based cyber-attack detection approaches. The performance comparison of state-of-the-art computational intelligence-based cyber-attack detection methods demonstrated the efficiency of RL-based detection. Based on the outcomes of this comparative analysis, a multi-agent reinforcement learning-based cyber-attack detection method was proposed in this paper to further improve attack detection performance when using RL. Moreover, this study shows that there is scope for future research in areas such as unknown attack detection and improving the attack detection rate. Advances in RL techniques can enable a large number of benefits in the cybersecurity domain.

References

1. Goddijn, I.: 2020 Q3 Report Data Breach QuickView (2020)
2. Deep Instinct: Cyber Threat Report on 2020 (2020)
3. Frost & Sullivan: US Healthcare Cybersecurity Market, 2020—Frost Radar Report (2020)
4. Yaacoub, J.P.A., Salman, O., Noura, H.N., Kaaniche, N., Chehab, A., Malli, M.: Cyber-physical
systems security: limitations, issues and future trends. Microprocess. Microsyst. 77, 103201
(2020). https://doi.org/10.1016/J.MICPRO.2020.103201
5. Milenkoski, A., Vieira, M., Kounev, S., Avritzer, A., Payne, B.D.: Evaluating computer intrusion
detection systems: a survey of common practices. ACM Comput. Surv. 48(1) (2015). https://
doi.org/10.1145/2808691
6. Xin, Y., et al.: Machine learning and deep learning methods for cybersecurity. IEEE Access 6,
35365–35381 (2018). https://doi.org/10.1109/ACCESS.2018.2836950
7. Milosevic, N., Dehghantanha, A., Choo, K.K.R.: Machine learning aided android malware
classification. Comput. Electr. Eng. 61, 266–274 (2017). https://doi.org/10.1016/J.COMPEL
ECENG.2017.02.013
8. Paul, S., Ni, Z., Mu, C.: A learning-based solution for an adversarial repeated game in cyber-
physical power systems. IEEE Trans. Neural Netw. Learn. Syst. 31(11), 4512–4523 (2020).
https://doi.org/10.1109/TNNLS.2019.2955857
9. Xiao, L., Wan, X., Lu, X., Zhang, Y., Wu, D.: IoT security techniques based on machine
learning. Cryptogr. Secur. Cornell Univ. (2018)
10. Feng, C., Wu, S., Liu, N.: A user-centric machine learning framework for cyber security oper-
ations center. In: 2017 IEEE International Conference on Intelligence Security Informatics
Security Big Data, ISI 2017, pp. 173–175 (2017). https://doi.org/10.1109/ISI.2017.8004902
11. Pu, G., Wang, L., Shen, J., Dong, F.: A hybrid unsupervised clustering-based anomaly detec-
tion method. Tsinghua Sci. Technol. 26(2), 146 (2021). https://doi.org/10.26599/TST.2019.
9010051
12. NSL-KDD | Datasets | Research | Canadian Institute for Cyber-security | UNB. https://www.
unb.ca/cic/datasets/nsl.html

13. Zoppi, T., Ceccarelli, A., Bondavalli, A.: Unsupervised algorithms to detect zero-day attacks:
strategy and application. IEEE Access 9, 90603–90615 (2021). https://doi.org/10.1109/ACC
ESS.2021.3090957
14. De Carvalho Bertoli, G., et al.: An end-to-end framework for machine learning-based network
intrusion detection system. IEEE Access 9, 106790–106805 (2021). https://doi.org/10.1109/
ACCESS.2021.3101188
15. Kotenko, I., Saenko, I., Branitskiy, A.: Detection of distributed cyber attacks based on weighted
ensembles of classifiers and big data processing architecture. In: IEEE INFOCOM 2019—
IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 1–6,
(2019)
16. Xu, X., Liu, X., Xu, Z., Dai, F., Zhang, X., Qi, L.: Trust-oriented IoT service placement for
smart cities in edge computing. IEEE Internet Things J. 7(5), 4084–4091 (2020). https://doi.
org/10.1109/JIOT.2019.2959124
17. Gopalakrishnan, T., et al.: Deep learning enabled data offloading with cyber attack detection
model in mobile edge computing systems. IEEE Access 8, 185938–185949 (2020). https://doi.
org/10.1109/ACCESS.2020.3030726
18. Yang, L., Li, J., Yin, L., Sun, Z., Zhao, Y., Li, Z.: Real-time intrusion detection in wireless
network: a deep learning-based intelligent mechanism. IEEE Access 8, 170128–170139 (2020).
https://doi.org/10.1109/ACCESS.2020.3019973
19. Ullah, I., Mahmoud, Q.H.: Design and development of a deep learning-based model for anomaly
detection in IoT networks. IEEE Access 9, 103906–103926 (2021). https://doi.org/10.1109/
ACCESS.2021.3094024
20. Saharkhizan, M., Azmoodeh, A., Dehghantanha, A., Choo, K.K.R., Parizi, R.M.: An ensemble
of deep recurrent neural networks for detecting IoT cyber attacks using network traffic. IEEE
Internet Things J. 7(9), 8852–8859 (2020). https://doi.org/10.1109/JIOT.2020.2996425
21. Aslan, O., Yilmaz, A.A.: A new malware classification framework based on deep learning
algorithms. IEEE Access 9, 87936–87951 (2021). https://doi.org/10.1109/ACCESS.2021.308
9586
22. Kurt, M.N., Ogundijo, O., Li, C., Wang, X.: Online cyber-attack detection in smart grid: a
reinforcement learning approach. IEEE Trans. Smart Grid (2018). https://doi.org/10.1109/TSG.
2018.2878570
23. Piplai, A., Ranade, P., Kotal, A., Mittal, S., Narayanan, S.N., Joshi, A.: Using knowledge graphs
and reinforcement learning for malware analysis. [Online]. Available: https://www.virustotal.
com/
24. Chatterjee, M., Namin, A.S.: Detecting phishing websites through deep reinforcement learning
(2019). https://doi.org/10.1109/COMPSAC.2019.10211
25. Nataraj, L., Karthikeyan, S., Jacob, G., Manjunath, B.S.: Malware images: visualization and
automatic classification. ACM Int. Conf. Proc. (2011). https://doi.org/10.1145/2016904.201
6908
26. Microsoft Malware Classification Challenge (BIG 2015) | Kaggle. https://www.kaggle.com/c/
malware-classification
27. Bozkir, A.S., Cankaya, A.O., Aydos, M.: Utilization and comparision of convolutional neural
networks in malware recognition. 27th Signal Process. Commun. Appl. Conf. SIU 2019, (2019).
https://doi.org/10.1109/SIU.2019.8806511
28. Lippmann, R., Haines, J.W., Fried, D.J., Korba, J., Das, K.: The 1999 DARPA off-line intru-
sion detection evaluation. Comput. Networks 34(4), 579–595 (2000). https://doi.org/10.1016/
S1389-1286(00)00139-0
Chapter 50
A Technical Review on Machine
Learning-Based Prediction on COVID-19
Diagnosis

Sandeep Kejriwal and Narendran Rajagopalan

Abstract Scientists and medical specialists have been working around the clock to find new ways to battle the virulent coronavirus disease (COVID-19), which has caused the present worldwide public health crisis. Machine learning (ML) has recently been used successfully in the healthcare sector for a variety of tasks. This review examines the most recent research and development on the most cutting-edge applications of machine learning aimed at tackling the COVID-19 epidemic, and analyzes and summarizes the latest developments in artificial intelligence (AI) technology. COVID-19 prediction, diagnostics and screening have all improved dramatically due to recent machine learning research and development, leading to better scalability, quicker responses and the most efficient and reliable outcomes yet, in some healthcare tasks even outperforming humans. This review will aid healthcare workers, research organizations and institutions, politicians and public officials by providing fresh perspectives on how ML can help manage the COVID-19 virus, and it should encourage further work to minimize the COVID-19 pandemic.

Keywords Machine learning · COVID-19 · ML techniques · Random forests · Support vector machine · Decision tree · Logistic regression

50.1 Introduction

COVID-19 is an infection caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1, 2]. Patients infected with COVID-19 commonly experience exhaustion, fever, loss of taste and smell, and a dry cough, as well

S. Kejriwal (B) · N. Rajagopalan


Department of Computer Science, NIT Puducherry, Karaikal 609609, India
e-mail: sandeep.kejriwal@outlook.com
N. Rajagopalan
e-mail: narenraj1@gmail.com


as respiratory symptoms such as shortness of breath. The cross-sectional view in Fig. 50.1 depicts the SARS-CoV-2 spike protein, the single-stranded, non-segmented RNA, the enveloped nucleocapsid protein, the hemagglutinin-esterase dimer, the matrix/membrane glycoprotein and an envelope protein [3]. Following the first confirmed case of the coronavirus, discovered in Wuhan, China in December 2019 [4], the World Health Organization (WHO) declared it a Public Health Emergency of International Concern (PHEIC) on January 30th, 2020. Many governments took measures to avert and minimize the spread of the new virus, implementing lockdowns and curfews with immediate effect [5]. Medics, researchers, paramedics and other medical professionals are toiling around the clock to find interim solutions for the COVID-19 pandemic while working on developing a vaccine [6]. Diagnostic tests and antibody tests are the current standard methods for detecting coronavirus illness; however, these conventional procedures are costly, time-consuming and lack effectiveness, and they do not provide reliable true-positive rates. Coronavirus illness therefore cannot be adequately diagnosed and tracked using standard methods alone [7–9].
There are numerous medical studies that use this advancing technology, which results in better scale-up and faster time-to-market; it also yields more dependable and effective outcomes than people in specific healthcare jobs [11–13]. ML is a subtype of artificial intelligence that can grow and improve without needing to be explicitly programmed, and an ML algorithm relies on the characteristics of the data. Some of the most popular ML techniques, namely support vector machines, random forests, decision trees and logistic regression, are represented in Fig. 50.2. Such ML-based analysis increases the accuracy and efficiency of detection by radiologists, minimizes the burden on their schedules, and affords

Fig. 50.1 SARS-CoV-2: cross-sectional view [10]



Fig. 50.2 Various ML techniques discussed: support vector machine, random forest, decision tree and logistic regression

patients with COVID-19 a prompt response and precise treatment [14, 15].
It is possible to find medications that are effective against novel disorders such as COVID-19 using machine learning and drug repurposing. Repurposing drugs is a complex process that requires a thorough understanding of the drug–disease link, which can be greatly improved through developing technologies. In light of the numerous issues raised by ML, this review compiles a list of suggestions for governments, academics, researchers and the various teams involved in the field. ScienceDirect, preprints from bioRxiv, medRxiv and arXiv, and Google Scholar are among the citation databases used in this study. Table 50.1 compares the ML methods used by various authors to predict COVID-19.

50.2 Machine Learning

ML focuses mostly on how computers can learn from past experience. It comprises two types of learning, supervised and unsupervised, and labeling is the most significant difference between them. In supervised learning, the input data must be labeled before a model is trained to recognize them; unsupervised learning, by contrast, does not require training labels. Supervised learning methods include logistic regression, neural networks, random forests, decision trees, support vector machines and kernel machines, while the K-means algorithm is a representative clustering algorithm used in unsupervised learning [21]. Deep learning relies less on human interaction than classical machine learning: it automatically extracts features from a dataset instead of requiring manual feature extraction before they are used in a model for classification or regression [30].
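As a minimal illustration of the supervised/unsupervised distinction described above, the sketch below fits a logistic regression classifier on labeled synthetic data and a K-means clusterer on the same data without labels; the synthetic data and scikit-learn usage are illustrative assumptions, not part of the surveyed studies.

```python
# Minimal sketch (assumptions): supervised vs. unsupervised learning on
# synthetic data; illustrative only, not from any study cited in this review.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Supervised: labels are required for training.
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("Supervised test accuracy:", clf.score(X_te, y_te))

# Unsupervised: K-means groups the same samples without using the labels.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_tr)
print("Cluster sizes:", [int((km.labels_ == c).sum()) for c in (0, 1)])
```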

Table 50.1 Various methods

| S. No. | Author | Method | Dataset | Accuracy (%) | Specificity (%) | Sensitivity (%) |
| 1 | Batista et al. [16] | SVM | Clinical data of 235 patients | – | 85.0 | 67.7 |
| 2 | Lamia et al. [17] | Multi-level thresholding and SVM | COVID-chest X-ray-dataset-master | 97.48 | 99.70 | 95.76 |
| 3 | Saban et al. [18] | sAE and PCA | 126 X-ray and CT images | 94.23 | 98.54 | 91.88 |
| 4 | Feng et al. [19] | Infection size-aware random forest (iSARF) | 1685 confirmed COVID and 1027 CAP CT images | 87.90 | 83.30 | 90.07 |
| 5 | Zhenyu et al. [10] | Severity assessment RF model | 176 patients' CT images (severe and non-severe) | 87.5 | 74.5 | 93.3 |
| 6 | Mohammad et al. [20] | Predictive analytics algorithm (SVM, LR, NN, RF) | 112 extracted features from 117,000 laboratory-confirmed COVID-19 CT images | 93.75 | – | – |
| 7 | Zirui et al. [21] | Multivariate logistic regression | 620 CT samples | 86.62 | – | – |
| 8 | Asmaa et al. [22] | CNN-based DeTraC model and ResNet-18 | 80 normal X-ray and 115 COVID-19 X-ray images | 93.10 | 85.18 | 100 |
| 9 | Nan-Nan et al. [23] | LR + Feature Selection (FS) model | CT images clinically obtained from 912 patients (80:20) | 91.0 | 95.0 | 87.0 |
| 10 | Bin Zhang et al. [24] | L1-norm SVM + SVM | 370 chest CT images | – | 70.07 | 87.5 |
| 11 | Seung et al. [25] | CNN + three binary DT classifiers | Chest X-ray images (COVID-19 vs TB) | 95.0 | 93.0 | 97.0 |
| 12 | Muhammad et al. [26] | ML model + HOG for feature extraction + SVM | Kaggle dataset: CT and X-ray images (1400 COVID-infected images and 800 normal chest images) | 96.0 | 92.0 | 98.0 |
| 13 | Sudhir et al. [27] | LR | Clinically obtained routine and infected blood samples | 70.0 | 92.0 | 90.0 |
| 14 | Stephanie et al. [28] | Hybrid 3D model | 1337 CT scans labeled as pneumonia and COVID-19 pneumonia | 90.8 | 93.0 | 84.0 |
| 15 | Prabira et al. [29] | ResNet 50 + SVM | Three classes (COVID-19, pneumonia and normal chest) of X-ray images | 95.33 | 97.67 | 95.33 |

Machine learning techniques and methodologies can reveal trends and patterns in data collections. According to the literature, a machine learning approach has suggested synthetic inhibitory antibodies as a potential treatment [31, 32]: the immobilization reactions of patients and 1933 virus–antibody samples were collected, thousands of potential antibody sequences were screened using graph features and machine learning, and stable antibodies against COVID-19 were detected in eight of the samples [33, 34]. In light of this finding, it is reasonable to conclude that machine learning could be effective in the fight against COVID-19. When dealing with very large datasets, it is often difficult for humans to see patterns and trends that machine learning can immediately pick up on; a machine learning model, for instance, may be able to quickly establish a relationship between two events. Beyond identification, it is capable of improving or adapting its operation over time, and as the amount of data grows, so do its efficiency and accuracy [35, 36]. Predictions made by an algorithm that learns from data are more accurate, and another key feature of this approach is its capacity to adapt instantly without human interference.

50.3 Machine Learning Application for COVID-19 Classification, Screening, Prediction and Diagnosis

Machine learning learns and develops without being explicitly programmed, and ML-based algorithms are heavily reliant on well-defined characteristics of the data. ML-based techniques can handle complicated and massive datasets and have been widely employed in the study of epidemic patterns and in the production of epidemic forecasts. For the COVID-19 pandemic, researchers have applied these tools for drug repurposing, screening, categorization, identification and forecasting [37–39]. Decision tree, support vector machine, random forest and logistic regression have all been used to fight the COVID-19 pandemic and are discussed below. These methods analyze patient health across mild-to-severe infection categories and can attain around 87% accuracy in predicting cases.

50.3.1 Support Vector Machine (SVM)

The support vector machine is a valuable technique for solving classification and regression problems. Owing to its high precision and performance, it has been used in several real-world applications, including healthcare. SVM has recently been employed to tackle the COVID-19 pandemic because of its excellent performance, and several articles on detection, classification, prediction and forecasting are discussed in this sub-section [16, 40]. For the early identification and detection of COVID-19 cases, researchers proposed an SVM-based model using X-ray images [17]. The dataset is a mix of normal and COVID-19-infected X-rays, with 15 of the 40 contrast-enhanced lung X-rays being normal. It was shown that the SVM-based model can be used effectively to identify the novel coronavirus disease because of its high performance (sensitivity of 95.76%, specificity of 99.7% and accuracy of 97.48%). An ML-based model was also built by the authors of [18] to classify COVID-19 using computed tomography (CT) scans; a total of 150 abdominal CT scans of 53 infected patients are included. After multiple feature extraction techniques are applied, SVM is used to classify the extracted characteristics and improve the categorization process.
Two-, five- and tenfold cross-validations were used during classification; the tenfold cross-validation produced comparatively high results among the implemented validations (sensitivity of 97.56%, specificity of 99.68% and accuracy of 98.71%). The authors recommend that the proposed model also be validated on a different dataset based on COVID-19 CT images. In another work, an SVM model was built to diagnose the symptoms of COVID-19-infected individuals by examining over 200 laboratory and clinical records; with an AUROC of 0.996 on the training dataset and 0.9757 on the test data, the proposed model outperforms other models. In a further investigation [41], the diagnosis of COVID-19 in emergency-room patients was predicted using five machine learning models, including GBT, SVM, RF, LR and NN, trained on data from adult hospitalized patients collected between March 17th and March 30th, 2020. SVM was shown to be the most accurate of the five ML-based methods, with an accuracy of 85%.
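The k-fold evaluation protocol mentioned above can be reproduced in a few lines; the sketch below cross-validates an SVM on synthetic feature vectors and is an illustrative assumption, not a reimplementation of the cited studies.

```python
# Minimal sketch (assumptions): SVM with 2-, 5- and 10-fold cross-validation on
# synthetic features standing in for extracted image descriptors.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=30, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

for k in (2, 5, 10):
    scores = cross_val_score(model, X, y, cv=k)
    print(f"{k}-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```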

50.3.2 Random Forests

The random forest (RF) method is among the most efficient classifiers for regression and classification problems. Multiple trees built on the data samples are used for training and prediction, and it is common practice in bioinformatics and chemometric analysis to take advantage of RF [42]. RF has been used extensively by researchers in the COVID-19 pandemic mitigation effort. Owing to its speed and ease of use, an infection size-aware random forest (iSARF) model was developed for the quick and correct recognition of COVID-19 [19]. The 1658 COVID-19-positive CT scans and 1027 community-acquired pneumonia (CAP) CT images used in this study were pre-processed before being analyzed. The model produced good results with five-fold cross-validation when screening for coronavirus illness (sensitivity of 90.7%, accuracy of 87.9% and specificity of 83.3%), and it was also claimed that incorporating radiances into the proposed model improved the findings even further. COVID-19 severity assessment is becoming increasingly complicated and time-consuming due to the large number of affected patients. Patients infected with COVID-19 can now be classified according to the severity of their infection using the machine learning-based methodology proposed by Zhenyu et al. [10]; there are 176 COVID-19-positive individuals in the RF model's training dataset, and with three-fold cross-validation the proposed model achieved 87.5% accuracy.
According to the study's authors, the severity of COVID-19 can be gauged using a variety of quantitative features. Five machine learning models for predicting the severity of COVID-19 cases in the initial stages of the disease were recommended by Albahri et al. [43]; this study reviewed the medical records of 183 patients, and the model was developed using severe COVID-19 instances.
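To illustrate how an RF severity classifier of the kind described above is typically set up, the sketch below trains a random forest on synthetic quantitative features with three-fold cross-validation and reports feature importances; the features, cohort size and data are assumptions for illustration only.

```python
# Minimal sketch (assumptions): random forest severity classification on
# synthetic quantitative features, evaluated with three-fold cross-validation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=176, n_features=12, n_informative=6,
                           random_state=0)          # y: 0 = non-severe, 1 = severe
rf = RandomForestClassifier(n_estimators=200, random_state=0)

print("3-fold accuracy:", cross_val_score(rf, X, y, cv=3).mean())
rf.fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:3]
print("Most informative (synthetic) feature indices:", top)
```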

50.3.3 Decision Tree

The decision tree (DT) technique can be used to solve both regression and classification problems. As a result of its simplicity and robustness, DT has found widespread application in a wide range of disciplines and has recently become well known in health care and medical research. The severity of COVID-19 in children was determined using decision trees [44]: clinical laboratory and epidemiological reports of 105 affected children were gathered from a Chinese hospital between February 1st and March 3rd, 2020. Of the 105 children who tested positive for COVID-19, 41 were girls and 64 were boys, i.e. a smaller percentage of females (39.05%) was infected than males (60.95%). An F1 score of 100 was attained. According to the researchers of [45], CT scans were not recommended for the initial screening of suspected COVID-19 individuals.

The proposed model was trained and validated using data collected from patients admitted to the hospital between January 14th and February 26th, 2020; the patients' medical histories, laboratory results and admission symptoms make up the bulk of the data. Feature selection and model construction were both accomplished through the use of lasso regression. Four distinct methods, including least absolute shrinkage and selection operator (LASSO) regression with LR, ridge-regularized LR, DT and the XGBoost algorithm, were compared in this study, and the model performed well on the validation cohort, with an AUROC of 0.5 and a specificity of 100. Mortality rates of COVID-19-infected patients were also predicted using different ML methods: one study used a dataset of 17,000 COVID-19-infected individuals of both sexes and predicted death rates with 93% accuracy, while DT combined with tenfold cross-validation attained an accuracy of 90.63% [20].
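As a brief illustration of the decision-tree workflow discussed above, the sketch below fits a shallow tree on synthetic admission-style features, evaluates it with tenfold cross-validation and prints the learned rules; the data and feature names are assumptions, not the cohorts used in [44, 45] or [20].

```python
# Minimal sketch (assumptions): decision tree on synthetic tabular features with
# tenfold cross-validation; illustrative only.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           random_state=0)
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
print("10-fold accuracy:", cross_val_score(tree, X, y, cv=10).mean())

tree.fit(X, y)
print(export_text(tree, feature_names=[f"feat_{i}" for i in range(6)]))
```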

50.3.4 Logistic Regression

Logistic regression (LR) is a statistical regression technique that is used when the target variable is a binary dependent variable (Field, 2012). LR has recently been used to help mitigate COVID-19. Multivariate LR was utilized to diagnose COVID-19 using a dataset of 620 laboratory samples, distributed as 431 samples in the training set and 189 samples in the testing set; the model's positive predictive value of 86.62% was found to be satisfactory [21]. In another study, an evidence-based approach for predicting early COVID-19 mortality risk was created utilizing a variety of machine learning techniques and the medical records of 183 individuals with severe COVID-19. Five distinct learning methods were used for feature selection and mortality-rate forecasting, the area under the receiver operating characteristic curve (AUROC) was used for performance evaluation, and the resulting predictive model was externally validated using 19 examples.
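A minimal version of the multivariate LR setup described above, with an (approximately) 431/189 train/test split on synthetic laboratory-style features, is sketched below; the data, the feature count and the probability threshold are illustrative assumptions rather than the cited study's protocol.

```python
# Minimal sketch (assumptions): multivariate logistic regression with a
# 431/189-style train/test split on synthetic laboratory features.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=620, n_features=15, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=431, random_state=0)

lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
probs = lr.predict_proba(X_te)[:, 1]          # predicted risk for each test sample
print("Test AUROC:", round(roc_auc_score(y_te, probs), 3))
print("Flagged as positive at 0.5 threshold:", int((probs >= 0.5).sum()))
```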
Other machine learning methods, such as multilayer perceptrons (MLPs) and neural networks more broadly, are also among the most commonly used ML-based and statistical techniques, and they have likewise aided COVID-19 forecasting, screening, prediction and detection.

50.4 Conclusion

The COVID-19 pandemic database can be updated using machine learning, which learns from the data it is given. This technology benefits patient screening and

thermal scanning of the face and body in a simple and effective way: a person's temperature is quickly detected by this initial scan, and machine learning models assist in the successful triage of these first data. Predicting the signs and symptoms of infectious pandemic diseases more accurately than currently available models, such as ordinary least squares and the auto-regressive integrated moving average, is a primary goal of machine learning models. Imaging of infectious patients, accurate and personalized therapy, detection of cough, cold and fever symptoms, effective health monitoring, patient analysis and modelling of clinical behavior are some of the most common uses of artificial intelligence in medicine, and together they offer a smart healthcare platform for handling potential disease symptoms.
An artificial intelligence system will soon be able to comprehend the COVID-19 situation, which requires swift action in the near future, and predictions about the COVID-19 epidemic will be beneficial and accurate when using such tools.

References

1. Awasthi, A., Sukriti, V., Leander, C., et al.: Outbreak of novel corona virus disease (COVID-19):
antecedence and aftermath. Eur. J. Pharmacol. 884, 173381 (2020)
2. Manigandan, S., Wu, M.-T., Vinoth, K.P., Vinay, B.R., Arivalagan, P., Kathrivel, B.: A system-
atic review on recent trends in transmission, diagnosis, prevention and imaging features of
COVID-19. Process Biochem. 98, 233–240 (2020)
3. Jalaber, C., Lapotre, T., Morcet-Delattre, T., Ribet, F., Jouneau, S., Lederlin, M.: Chest CT
in COVID-19 pneumonia: a review of current knowledge. Diagn. Interv. Imaging 101(7–8),
431–437 (2020)
4. Jia-gang, D., Xiao-tao, H., Tie-jun, Z., et al.: Carry forward advantages of traditional medicines
in prevention and control of outbreak of COVID-19 pandemic. Chinese Herbal Med. 12(3),
207–213 (2020)
5. Cartin, S., Zaid, A., Naimh, O.N., et al.: World health organization declares global emergency:
a review of the 2019 novel coronavirus (COVID-19). Int. J. Surg. 76, 71–76 (2020)
6. Gianluca, P., Stefano, S., Maria, E.M., et al.: Role of computed tomography in COVID-19. J.
Cardiovasc. Comput. Tomogr. 15(1), 27–36 (2021)
7. Alireza, T., Abdollah, A.: Real-time RT-PCR in COVID-19 detection: issues affecting the
results. Expert Rev. Mol. Diagn. 20(5), 453–454 (2020)
8. de Almeida, S.M.V., Soares, J.C.S., Dos Santos K.L., et al.: COVID-19 therapy: What weapons
do we bring into battle. Bioorg. Med. Chem. 28(23), 115757 (2020)
9. Quoc-viet, P., Dinh, C.N., Thien Huynh, T., Won-joo, H., Pubudu, P.: Artificial intelligence
(AI) and big data for coronavirus (COVID-19) pandemic: a survey on the state-of-the-arts.
IEEE Access 8, 130820–130839 (2020)
10. Zhenyu, T., Wei, Z., Xingzhi, X., et al.: Severity assessment of COVID-19 using CT image
features and laboratory indices. Phys. Med. Biol. 66(3), 035015 (2021)
11. Mark, C.: Health care, capabilities, and AI assistive technologies. Ethical Theory Moral Pract.
13(2), 181–190 (2009)
12. Tom, N., Oliver, M., Aimee, C., Damien, R.: Acceptability of artificial intelligence (AI)-led
chatbot services in healthcare: a mixed-methods study. Digital Health 5, 205520761987180
(2019)
13. Brian, W., Aline, C.G., Stefan, G., Nina, R.S.: Artificial intelligence (AI) and global health: how can AI contribute to health in resource-poor settings. BMJ Glob. Health 3(4), e000798 (2018)

14. Swapnarekha, H., Himansu, S.B., Janmenjoy, N., Bighnaraj, N.: Role of intelligent computing
in COVID-19 prognosis: a state-of-the-art review. Chaos, Solitons Fractals 138, 109947 (2020)
15. Abu, S., Anirudha, G., Ali, S., Florentin, S.: A survey on deep transfer learning to edge
computing for mitigating the COVID-19 pandemic. J. Syst. Architect. 108, 101830 (2020)
16. Batista, A.F.M., Miraglia, J.L., Donato, T.H.R., Chiavegatto Filho, A.D.P.: COVID-19
diagnosis prediction in emergency care patients: a machine learning approach (2020)
17. Lamia, N.M., Kadry, A.E., Haytham, H.E., Hassan, A.E., Aboul, E.H.: Automatic X-ray
COVID-19 lung image classification system based on multi-level thresholding and support
vector machine (2020)
18. Şaban, Ö., Umut, Ö., Mucahid, B.: Classification of coronavirus (COVID-19) from X-ray and
CT images using shrunken features. Int. J. Imaging Syst. Technol. 31(1), 5–15 (2020)
19. Feng, S., Liming, X., Fei, S., et al.: Large-scale screening to distinguish between COVID-
19 and community-acquired pneumonia using infection size-aware classification. Phys. Med.
Biol. 66(6), 065031 (2021)
20. Mohammad, P., Mahdi, S.: Predicting mortality risk in patients with COVID-19 using artificial
intelligence to help medical decision-making (2020)
21. Zirui, M., Minjin, W., Huan, S., et al.: Development and utilization of an intelligent application
for aiding COVID-19 diagnosis (2020)
22. Khanday, A.M.U.D., Rabani, S.T., Khan, Q.R., et al.: Machine learning based approaches for
detecting COVID-19 using clinical text data. Int. J. Inf. Tecnol 12, 731–739 (2020)
23. Nan-Nan, S., Ya, Y., Ling-Ling, T., Yi-Ning, D., Hai-Nv, G., Hong-Ying, P., Bin, J.: A Prediction
Model Based on Machine Learning for Diagnosing the Early COVID-19 Patients. Cold Spring
Harbor Laboratory Press (2020)
24. Zhang, B., Ni-Jia-Ti, M.Y., Yan, R., An, N., Chen, L., Liu, S., Chen, L., Chen, Q., Li, M.,
Chen, Z., You, J., Dong, Y., Xiong, Z., Zhang, S.: CT-based radiomics for predicting the rapid
progression of coronavirus disease 2019 (COVID-19) pneumonia lesions. Br. J. Radiol. (2021)
25. Yoo, S.H., Geng, H., Chiu, T.L., Yu, S.K., Cho, D.C., Heo, J., Choi, M.S., Choi, I.H., Cung, V.C.,
Nhung, N.V., Min, B.J., Lee, H.: Deep learning-based decision-tree classifier for COVID-19
diagnosis from chest X-ray imaging. Front. Med. 7, 2296–2858 (2020)
26. Imad, M., Khan, N., Ullah, F., Abul Hassan, M., Hussain, A., Faiza.: COVID-19 classification
based on chest X-ray images using machine learning techniques. J. Comput. Sci. Technol. Stud.
2, 01–11 (2020)
27. Sudhir, B., et al.: Logistic regression analysis to predict mortality risk in COVID-19 patients
from routine hematologic parameters. Ibnosina J. Med. Biomed. Sci. 12(2), 123 (2020)
28. Stephine, A.H., Thomas, H.S., Sheng, X., Evirm, B.T., et al.: Artificial intelligence for the
detection of COVID-19 pneumonia on chest CT using multinational datasets. Nat. Commun.
11, 4080 (2020)
29. Prabhira, S.K., Santi, B.K., Pradyumna, K.R., Preesat, B.: Detection of coronavirus disease
(COVID-19) based on deep features. Int. J. Math. Eng. Manage. Sci. 5(4), 643–651 (2020)
30. Jordan, M.I., Mitchell, T.M.: Machine learning: trends, perspectives, and prospects. Science
349(6245), 255–260 (2015)
31. Kuo, C.H., Shinya, N.: Applying machine learning to market analysis: knowing your luxury
consumer. J. Manage. Analytics 6(4), 404–419 (2019)
32. Hong-Xing, L., Xu, L.D.: A neural network representation of linear programming. Eur. J. Oper.
Res. 124(2), 224–234 (2000)
33. Halgurd, S.M., Kayhan, G.: A smartphone enabled approach to manage COVID-19 lockdown
and economic crisis. SN Comput. Sci. 1(5) (2020)
34. Subhankar, R., Willi, M., Sebastian, O., et al.: Deep learning for classification and localization
of COVID-19 markers in point-of-care lung ultrasound. IEEE Trans. Med. Imaging 39(8),
2676–2687 (2020)
35. Shan, F., Li, D.X.: Hybrid artificial intelligence approach to urban planning. Expert. Syst. 16(4),
248–261 (1999)
36. Kullaya Swamy, A., Sarojamma, A.: Bank transaction data modeling by optimized hybrid
machine learning merged with ARIMA. J. Manage. Analytics 7(4), 624–648 (2020)

37. Aishwarya, K., Puneeth, G., Ankita, S.: A review of modern technologies for tackling COVID-
19 pandemic. Diabetes Metab. Syndr. 14(4), 569–573 (2020)
38. Elliot, M.: Integrating emerging technologies into COVID-19 contact tracing: opportunities,
challenges and pitfalls. Diabetes Metab. Syndr. 14(6), 1631–1636 (2020)
39. Ahmad, W.S., Preety, B., Gaurav, G.: Review on machine and deep learning models for the
detection and prediction of coronavirus. Mater. Today Proc. 33, 3896–3901 (2020)
40. Barenya, B.H., Deepak, G.: Modelling and forecasting of COVID-19 spread using wavelet-
coupled random vector functional link networks. Appl. Soft Comput. 96, 106626 (2020)
41. Engy, E.S., Aboul, E.H., Karam, M.S., Abohany, A.A.: Approach for training quantum neural
network to predict severity of COVID-19 in patients. Comput. Mater. Continua 66(2), 1745–
1755 (2021)
42. Žižka, F.D., Svoboda, A.: Random forest. Text Mining with Machine Learning, pp. 193–200
(2019)
43. Albahri, O.S., Zaidian, A.A., Albahri, A.S., et al.: Systematic review of artificial intelligence
techniques in the detection and classification of COVID-19 medical images in terms of evalu-
ation and benchmarking: taxonomy analysis, challenges, future solutions and methodological
aspects. J. Infect. Public Health 13(10), 1381–1396 (2020)
44. Hui, Y., Jianbo, S., Yuqi, G., Yun, X., Chuan, S., Ye, Y.: Data-driven discovery of a clinical
route for severity detection of COVID-19 pediatric cases (2020)
45. Cong, F., Lili, W., Xin, C., et al.: A novel triage tool of artificial intelligence-assisted diagnosis
aid system for suspected COVID-19 pneumonia in fever clinics (2020)
Chapter 51
A CNN-Based Neural Network for Tumor
Detection Using Cellular Pathological
Imaging for Lobular Carcinoma

Ekta Jain, Nishi Sharma, Deepika Rawat, Shipra Varshney, Shweta Chaudhary, Neha Kashyap, and Prashant Vats

Abstract In neural network-based learning, there are several models of convolutional networks, and their applicability and appropriateness can only be determined when the methods are deployed on large datasets. Clinical and pathological pictures of lobular carcinoma are thought to exhibit a large number of random formations and textures, and working with such pictures is a difficult problem in machine learning. Numerous studies focusing on wet laboratories and their outcomes have been published with fresh commentary on this line of investigation. In this research, we provide a framework that can operate effectively on raw images of various resolutions while easing the issues caused by the presence of patterns and texturing. The suggested approach produces very good findings that may be used to support decisions in the diagnosis of cancer.

Keywords Lobular carcinoma · Convolutional neural networks (CNNs) · Deep learning · Histopathological imagery scans

E. Jain · N. Sharma
Department of Computer Applications (MCA), ABES Engineering College, Ghaziabad, Uttar
Pradesh, India
D. Rawat
H. M. R. Institute of Technology and Management, Guru Gobind Singh Indraprastha University,
New Delhi, India
S. Varshney · S. Chaudhary
Dr. Akhilesh Das Gupta Institute of Technology and Management, Guru Gobind Singh
Indraprastha University, New Delhi, India
N. Kashyap
JIMS Engineering and Management Technical Campus, Guru Gobind Singh Indraprastha
University, Greater Noida, Uttar Pradesh, India
P. Vats (B)
Department of CSE, Faculty of Engineering & Technology, SGT University, Gurugram, Haryana,
India
e-mail: prashantvats12345@gmail.com


51.1 Introduction

Despite the tremendous increase in cancer cases in recent years, the overall number of cancer deaths in most affluent nations has been significantly lowered. Many advances in early recognition and diagnostic procedures have been reported as a result of computed tomography and related analyses. Categorizing individuals as having cancer or not using medical records necessitates rigorous investigation and an examination with high sensitivity and selectivity. The crux of breast cancer assessment concerns the investigation of cellular pathological images at numerous magnification levels, which is time-consuming and has led to much research in this area. Because the purpose is to save patients' lives, investigating professionals rely on advanced computer-aided diagnostic technologies. In the context of planning the diagnostic procedure at the correct time and with appropriate treatment, the cultured cells of patients provide valuable datasets that are collected for investigative purposes. The conceptual remarks thus created aid professionals in their investigations and steer scenarios toward the projected planning process.

51.2 Related Work

The fundamental treatment of ductal carcinoma image classification and prediction is divided into two phases: early preparation and preprocessing using image recognition, followed by analysis and classification using machine learning with CNNs. The goal of histology here is to discriminate between normal and malignant tissue in order to make predictive decisions [1]. Malignancy of mammary tissue is called carcinogenesis, and the form of the body tissue defines the severity of the malignancy [2]. For standard histopathology diagnostics, the tissue material is stained with a combination of hematoxylin and eosin (H&E) stains. Automated image analysis studies utilize rule-based methodologies as well as pattern recognition and deep learning models. Using H&E-stained histological breast cancer pictures, a pre-trained CNN classifier from ImageNet with a simulation–optimization approach has been used [3]. Some fully operational computer-assisted radiology systems use computational intelligence algorithms; for example, [4] describes a CAD system for the diagnosis of breast cancer that identifies and concentrates on spots comprising masses and categorizes them without any external user interference or early processing such as pectoral muscle removal or region segmentation. Deep learning may be used to understand data's unique characteristics and connections; for image data, deep convolutional models are an effective approach that supports rational decision processing, and convolutional networks can tackle critical decision-making issues involving image representations of visual input [5]. Conventional histological approaches have been associated with variability, which can be minimized quantitatively even when the procedures are technically challenging in clinical settings [6].

Digital radiographic systems generate images of vast dimensions containing the raw depiction of the clinical scan, and these large images consume considerable storage capacity; essentially the entire slide is digitized for automated analysis, producing the whole-slide imaging (WSI) image set. In the realm of digital histopathology, assessing diagnoses from such imaging, WSI-based image sets pose several unique challenges. Robertson et al. [2] showed, in the early period of artificial intelligence in this field, that machine learning can overcome picture quality limitations and that the difficulty of tumor detection can be greatly reduced with the adoption of cognitive computing.

51.3 Proposed Work

Several ductal carcinoma datasets are available for use in creating computer-assisted diagnostic systems, and such datasets make the most of machine learning-based computational assistance and traditional frameworks [7]. Though most of these assets offer clinicians distinct alternatives, it is critical to grasp the information's comparative accessibility and intrinsic clinical relevance before advancing. Using cutting-edge artificial intelligence and machine learning techniques in medical diagnosis, the computerized evaluation of pathological samples may be sped up, reducing processing time. The collection used here is by far the most recent and therapeutically important open ductal carcinoma histopathology collection accessible to date. Images with different tumors and innocuous states are organized into four distinct groups based on intensity, with labels for the important groups (benign/malignant) and their subclasses. To aid the reader in differentiating between the two kinds of visuals, photographs are categorized as malignant or benign, and to better aid the user they are further classified into the eight groups mentioned below. For benign images, Phyllodes Tumor (PT), Fibroadenoma (F), Tubular Adenoma (TA) and Adenosis (A) are used, while for malignant images, Lobular Carcinoma (LC), Ductal Carcinoma (DC), Papillary Carcinoma (PC) and Mucinous Carcinoma (MC) are employed. These elements allow the task to be posed in the following manner: a binary classifier may accurately determine whether an input image is benign or malignant, whereas a multi-class classifier may decide to which of the (PT/F/TA/A/LC/DC/PC/MC) categories it belongs.

51.4 Feature Acquisitions

There are several ways of extracting robust characteristics from images. In general, visual features are the picture's common structural elements, such as vertices, edges, blobs, masses, ridges and so on. A feature is an intentional attribute of a system that is used for analysis in a certain field of research, and a trait is made up of both syntactic and semantic qualities. A collection of landmarks from each reference and floating picture is generated and checked for similarity [1].

Fig. 51.1 Infrastructure for fundamental image processing

Transformational variables are investigated by collecting data programmatically and in wet laboratories, and the morphology of the cytoplasmic tumor, places of divergence, cross-over locations and splitting are explored. Consequently, a critical process such as cell staining is carried out shortly before visualization; the dyeing aids in highlighting each element for more accurate structural evaluation under sophisticated microscopy (Fig. 51.1).

51.5 Machine Learning-Based Pattern Analysis

Breast cancer is one of the most prevalent cancer diagnoses around the globe and is now a major health concern. Early detection increases the likelihood of effective therapy and survival; nevertheless, detection is a difficult and tedious procedure that depends on physicians' expertise. The automated detection of lobular carcinoma through the analysis of histological pictures is therefore necessary for patients and the associated diagnosis. Conventional feature extraction approaches, on the other hand, can only recover certain low-level aspects of pictures, and prior analysis is needed to choose useful traits, a choice that may be highly influenced by people. Machine learning methods can instead extract relevant high-level abstract properties from the images. As a result, we apply them to the analysis of histopathology pictures of lobular carcinoma using unsupervised and supervised deep CNN models. Breast tumor photos are gathered and resized to various dimensions, the form of the malignancy is used to determine whether the tissue in the photos is malignant or benign, and the staged pictures are then used for training and evaluation to classify them as malignant or benign (Fig. 51.2).

Fig. 51.2 a Breast cancer screening tumor lesion specimens labeled as benign; b breast cancer screening tumor lesion specimens labeled as malignant

51.6 Experimental Results

A deep convolutional neural network is used to reveal crucial aspects of malignancy-related clinical features. A mammary image of dimension 220 * 220 is fed into a standard DCNN structure as a matrix of dimension 192 * 192. The first convolution of the design uses 3 * 3 * 2 kernel filters with stride 1 * 1, for a total of 24 filters. Pooling layer 1 then applies max-pooling with a stride of 2 * 2, halving the input size to 96 * 96. Following ReLU, the result is passed through the non-linearity and fed into the next convolution operation, where 3 * 3 * 24 kernel filters with stride 1 * 1 are applied, for a total of 48 filters. A further max-pooling layer with stride 2 * 2 again reduces the feature maps to half their previous length, 48 * 48. Non-linearity is applied to this convolutional layer's output before it is passed to the next convolution operation. In the third convolution, 3 * 3 * 96 kernel filters with stride 1 * 1 are used. When the third convolution is completed, the input is "aggregated" (made smaller) using max-pooling with a stride of 2 * 2, resulting in feature maps of half the previous length, 24 * 24. The activation function again introduces non-linear behavior, which is carried through to the fourth convolution operation, which has 192 filters and a 3 * 3 * 192 kernel size. The resulting reduction brings the feature maps down to a size of 12 * 12. Finally, the ReLU activations from all the preceding layers are supplied to the fifth convolution, which has 240 filters and a kernel size of 12 * 12 * 240, and which aggregates the outputs of the earlier layers through max-pooling with a stride of 2 * 2 (Fig. 51.3).
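To make the layer progression concrete, here is a minimal sketch, assuming tf.keras, of a five-convolution DCNN of the kind described above. The filter counts (24, 48, 96, 192, 240), 3 * 3 kernels, stride-1 convolutions, 2 * 2 max-pooling, and ReLU activations follow the text; the grayscale 192 * 192 input, the `padding="same"` choice, the global-pooling head, and the sigmoid output for benign versus malignant classification are assumptions rather than details given by the authors.

```python
# A hedged sketch of a five-convolution DCNN following the description above;
# input channels, padding, the pooling head, and the sigmoid output are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_dcnn(input_shape=(192, 192, 1)):
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(24, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),       # 192 -> 96
        layers.Conv2D(48, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),       # 96 -> 48
        layers.Conv2D(96, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),       # 48 -> 24
        layers.Conv2D(192, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),       # 24 -> 12
        layers.Conv2D(240, (3, 3), strides=1, padding="same", activation="relu"),
        layers.GlobalMaxPooling2D(),       # aggregate the final 12 x 12 maps
        layers.Dense(1, activation="sigmoid"),  # benign vs. malignant
    ])

model = build_dcnn()
model.summary()
```

With same-padded convolutions, each pooling step halves the spatial size exactly as in the text: 192 to 96, 48, 24, and finally 12.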
Each of the 100 image sequences employed in this study is labeled: 70 are used for training, of which 40 are normal and the remaining 30 defective. The remaining 30 images are used for testing, with 20 of them normal and 10 defective. Training continues as images are fed into the network until the network weights are established. Every image is processed for a preset number of rounds, 250, with a learning rate of 0.0003 for each repetition. The study below is conducted on the proposed DCNN implemented in Keras.

Fig. 51.3 a The loss-versus-epoch graphs created in Keras by a DCNN recognizing ductal cancer signs, b the accuracy-versus-epoch graphs created in Keras by a DCNN recognizing ductal cancer signs

Fig. 51.4 Graphs generated indicating the training and testing loss due to incorrectly classified
histological lobular carcinoma photos when pictures do not have cancer/tumor indications or only
weakly include the indicators

observatory’s findings are depicted in the graphic below. The testing is conducted
out with 250 data every epoch, and the system is evaluated using 150 epochs all
through the investigation. The preceding graphs depict the assessment services of
images transmitted to the proposed scheme. Both in training and test, the suggested
system was shown to have substantial drop and reliability (Figs. 51.4 and 51.5).
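Assuming the images are organized into class-labeled train and test directories, and reusing the `build_dcnn` sketch above, the stated protocol (250 rounds at a learning rate of 0.0003) could be reproduced roughly as follows; the directory names, batch size, and the choice of the Adam optimizer are assumptions, since the chapter does not state them.

```python
# Hedged training/evaluation sketch; paths, batch size, and optimizer are assumptions.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", label_mode="binary", color_mode="grayscale",
    image_size=(192, 192), batch_size=8)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "data/test", label_mode="binary", color_mode="grayscale",
    image_size=(192, 192), batch_size=8)

model = build_dcnn()  # model sketched earlier in this section
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=3e-4),
              loss="binary_crossentropy", metrics=["accuracy"])

history = model.fit(train_ds, validation_data=test_ds, epochs=250)
test_loss, test_acc = model.evaluate(test_ds)
print(f"test accuracy: {test_acc:.3f}")
```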
Furthermore, we address the various shortcomings of the current convolutional intelligent system. Classification mistakes may happen as a result of confusion between visually similar classes, which has a detrimental influence on the model's generalization capability. Experiments on the categorization of the malignant breast cancer subtypes PC, MC, LC, and DC have been carried out. The suggested CNN was successful in identifying malignancies with high accuracy. The benign/malignant assessment of the histopathology pictures was used to produce the performance measures for ductal, lobular, papillary, and mucinous cancer (Table 51.1).
The detected and accumulated groupings of histological images examined for benignity are shown in Table 51.1. The ROC curve is presented for these data, and the AUC indicates that about 89% of the photos have benign properties (Fig. 51.6; Table 51.2).
Table 51.3 illustrates the observable and accumulated groupings of histological
images that were examined for malignancy. The ROC curve is presented for the
data collected, and the AUC is 0.88 indicating that over 88% of the photos include
malignancy characteristics (Fig. 51.7; Table 51.4).
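The ROC curves and AUC values summarized in Tables 51.1–51.4 can be reproduced from a model's predicted scores with standard tooling; the sketch below uses scikit-learn with placeholder label and score arrays rather than the chapter's data.

```python
# Minimal ROC/AUC computation; y_true and y_score are placeholders standing in
# for ground-truth labels and the model's predicted probabilities.
import numpy as np
from sklearn.metrics import roc_curve, auc

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])                    # 1 = malignant
y_score = np.array([0.1, 0.4, 0.8, 0.7, 0.9, 0.3, 0.6, 0.2])   # predicted scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)
print(f"AUC = {roc_auc:.3f}")  # the chapter reports about 0.89 (benign) and 0.88 (malignant)
```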

Fig. 51.5 Charts were generated demonstrating the train and test effectiveness in histological breast
cancer pictures with significant lobular carcinoma signs

Table 51.1 CNN efficiency for tumor samples with benign features

Benignity | Perceived TRUE | Perceived FALSE | Progressive TRUE | Progressive FALSE | FPR | TPR | AUC
0 | – | – | 0 | 0 | 1 | 0.89 | 0.94
1 | 33 | 4 | 33 | 4 | 0.98 | 0.88 | 0.96
2 | 64 | 8 | 98 | 11 | 0.96 | 0.94 | 0.95
3 | 87 | 12 | 186 | 22 | 0.95 | 0.96 | 0.89
4 | 103 | 15 | 291 | 36 | 0.94 | 0.95 | 0.88
5 | 121 | 24 | 412 | 52 | 0.87 | 0.94 | 0.98
6 | 96 | 61 | 506 | 117 | 0.92 | 0.87 | 0.96
7 | 8 | 76 | 512 | 192 | 0.89 | 0.94 | 0.95
8 | 7 | 42 | 522 | 236 | 0.88 | 0.87 | 0.94
9 | 5 | 32 | 523 | 267 | 0.94 | 0.87 | 0.87
10 | 1 | 16 | 523 | 278 | 0.87 | 0.92 | 0.92
Total | – | – | 528 | 278 | – | – | 0.891

Overall, it was found that the model obtains excellent quantitative performance when the histology images are resized to different resolutions to decide malignancy. The presented CNN architecture is designed so that cancer images with a resolution of 240 * 240 pixels can be supplied to the system, together with convolutional and max-pooling layers with ReLU activations. The system is quite customizable and is constructed in Python using data-flow graphs and Keras. Furthermore, to show the usefulness of the proposed technique, the test was

Fig. 51.6 AUC for photos of benign tumors is shown by the ROC curve

Table 51.2 ROC curve values for Table 51.1

Correctness (CORR) | 0.7955
Sensitivity (TPR) | 0.7821
Specificity (TNR) | 0.7684
False positive rate (FPR) | 0.2344
Positive predictive rate (PPR) | 0.6785
Negative predictive rate (NPR) | 0.8567

Table 51.3 CNN effectiveness in histopathology images with malignant features

Malignity | Perceived TRUE | Perceived FALSE | Progressive TRUE | Progressive FALSE | FPR | TPR | AUC
0 | – | – | 0 | 0 | 1 | 0.92 | 0.87
1 | 45 | 7 | 45 | 7 | 0.89 | 0.94 | 0.96
2 | 73 | 9 | 119 | 5 | 0.89 | 0.94 | 0.95
3 | 94 | 11 | 211 | 27 | 0.88 | 0.87 | 0.94
4 | 107 | 16 | 314 | 42 | 0.94 | 0.87 | 0.87
5 | 116 | 27 | 438 | 25 | 0.87 | 0.92 | 0.92
6 | 93 | 69 | 543 | 133 | 0.98 | 0.88 | 0.96
7 | 14 | 79 | 567 | 135 | 0.96 | 0.94 | 0.95
8 | 7 | 54 | 545 | 211 | 0.95 | 0.96 | 0.89
9 | 8 | 32 | 557 | 292 | 0.94 | 0.95 | 0.88
10 | 0 | 11 | 591 | 314 | 0.87 | 0.94 | 0.98
Total | – | – | 557 | 324 | – | – | 0.8675

Fig. 51.7 AUC for malignant breast cancer pictures is depicted via a ROC curve

Table 51.4 ROC curve values for Table 51.3

Correctness (CORR) | 0.7465
Sensitivity (TPR) | 0.7821
Specificity (TNR) | 0.7667
False positive rate (FPR) | 0.2452
Positive predictive rate (PPR) | 0.6654
Negative predictive rate (NPR) | 0.8654

carried out in a modest GPU configuration, using a single 12 GB NVIDIA K80 GPU as an illustration.

51.7 Conclusion

In this research, we introduced a deep CNN system consisting of four levels that can readily detect malignancy in histopathological pictures by using different image sizes and filtering out numerous unnecessary portions of the image. The system has been trained to comprehend the fundamental structural and morphological characteristics of carcinoma images at varying resolutions across its layers. The model's sensitivity for different cases of histopathological breast cancer images was substantially greater than predicted for this challenging task.

References

1. Elston, C.W., Ellis, I.O.: Pathological prognostic factors in breast cancer. I. The value of
histological grade in breast cancer: experience from a large study with long-term follow-up.
Histopathology 19(5), 403–410 (1991)
2. Robertson, S., et al.: Digital image analysis in breast pathology—from image processing
techniques to artificial intelligence. Transl. Res. 194, 19–35 (2018)
3. Rakhlin, et al.: Deep convolutional neural networks for breast cancer histology image analysis.
In: International Conference Image Analysis and Recognition, pp. 737–744. Springer, Cham
(2018).
4. Shayma’a, et al.: Breast cancer masses classification using deep convolutional neural networks
and transfer learning. Multimedia Tools Appl. 79(41), 30735–30768 (2020)
5. Khan, et al.: A survey of the recent architectures of deep convolutional neural networks. Artif. Intell. Rev. 53(8), 5455–5516 (2020)
6. Srinidhi, et al.: Deep neural network models for computational histopathology: a survey. Med. Image Anal., 101813 (2020)
7. Spanhol, et al.: A dataset for breast cancer histopathological image classification. IEEE Trans.
Biomed. Eng. 63(7), 1455–1462 (2015)
Chapter 52
A Survey on Identification of Grocery
Store Items Using Deep Learning
in Retail Store

Nidhi Savani and Munindra Lunagaria

Abstract Identifying an item on a retail store's shelf is an essentially human capability. Machine vision systems confront diverse obstacles when dealing with the same recognition problem. Automatic item localization on a retail store's shelves improves the shopper experience and adds value, while retailers also benefit financially. Machine vision-based object recognition approaches offer a higher success rate for the automatic detection of retail stock in a store environment. In this paper, we survey machine vision-based retail object recognition frameworks and present an up-to-date scientific categorization of the subject. We also discuss the problem's difficulties. We review the features that have been utilized in state-of-the-art efforts in this comprehensive investigation of published works. The report closes by proposing further research topics in related areas.

Keywords Data stream · Mining · Classification · Method · Challenges

52.1 Introduction

Computer vision experts have been working for a long time on creating a machine vision framework that can recognize items on supermarket shelves. Detection refers to the process of recognizing and accurately locating (or identifying) items on supermarket shelves. The vision framework is assumed to have access to the ideal advertising picture of the particular item. The essential objective of such a vision system is typically to build an inventory of products available for purchase at any given time from pictures of shelves stacked with items, to compare and confirm the item display arrangement (regularly referred to as a planogram) with the genuine display of products

N. Savani (B) · M. Lunagaria


Marwadi University, Rajkot, India
e-mail: nidhi.savani111003@marwadiuniversity.ac.in
M. Lunagaria
e-mail: munindra.lunagaria@marwadieducation.edu.in

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 553
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8_52

Fig. 52.1 Illustration of item pictures (a) and rack photographs (b) in which the items from (a) must be recognized. Many endeavors have been made to handle the concerns listed here using RFID, sensors, and barcodes; there are numerous sensor-based systems (such as Amazon Go [1]) that monitor a consumer's product selection

(referred to as a planogram compliance issue), and to supply customers with a value-added experience (referred to as a shopping assistance issue).
This paper examines product detection systems for products displayed in a grocery store.
Figure 52.1 portrays the block diagram of the machine vision framework under consideration. Throughout the rest of the paper, the rack picture serves as the scene image, and the item picture serves as an item template. Figure 52.1a shows a selection of typical item pictures from the freely accessible GroZi-120 dataset, and Fig. 52.1b the corresponding rack pictures.
The issue with per-product sensors is that they cannot assess the current status of multiple items at the same time; as a result, such sensors are not capable of addressing the planogram compliance issue.
In this paper, we provide a comprehensive review of strategies and findings published during the last 20 years in the area of item detection in large stores. The authors of a recently published conference article present a brief overview of item recognition in shelf shots. In addition, one of the essential objectives of this comprehensive review is to propose a modern taxonomy of state-of-the-art computer vision-based algorithms for identifying items on grocery store shelves.
The rest of the article is organized as follows: automatic item detection from photographs of retail store shelves, with its challenges and benefits; methods for detecting products; and an outline of freely accessible datasets, together with a comparison of the results of an assortment of strategies based on already reported results.

52.2 Benefits and Challenges

Table 52.1 summarizes the likely issues with an item detection framework. The racks are usually cluttered and not arranged in a uniform way. Ideal marketing

Table 52.1 Obstacles to automatic retail product recognition

No. | Category | Sub-category
1 | The environment in a retail store | Complexity of scene; distribution of data; product features; classification at a finer level
2 | Imaging in the digital age | Blurring; inconsistent lighting conditions; unusual vantage point; specularity

photographs of the different items available to the vision framework are often captured with numerous cameras, resulting in different image brightness distributions. Due to changing imaging settings, the size of the object package (in physical units such as cm) is also mapped to different pixel resolutions in the object and rack pictures. Figure 52.2a, b shows illustrations of how item templates and rack pictures differ.
Product packages come in several distinctive shapes, and an item detection framework must be able to distinguish between minor, time-limited promotional modifications in item packaging.
Fig. 52.2 GroZi-120 dataset [2]: a test item pictures for marketing, b test rack pictures for recognizing and localizing items. The spatial coordinates of the upper-left and bottom-right corners of a recognized bounding box are (x1, y1) and (x2, y2), respectively

Fig. 52.3 Fine-grained color and content alterations, as well as size variations

Fig. 52.4 a Rack picture distorted by angled viewing and showing dazzling specular reflection from glossy item packaging, and b rack picture with vertically stacked items where oblique viewing causes parallel lines to appear to converge

The detection of modest shape or color changes across a broad range of items requires categorization at a finer scale. Figure 52.3 portrays some illustrations of apparently similar items that differ only in color, font, or size. Hand-held devices are used to capture rack photographs, so image blur is common as a result of camera shake (see the center rack image in Fig. 52.2b). The picture of the rack is distorted (see Fig. 52.4a) due to angled viewing (a non-fronto-parallel position of the camera with respect to the rack) and lighting changes. Because of the glossy item packaging, the captured picture of a rack sometimes shows specular reflection (see Fig. 52.4a). Because of the stacked products' top and bottom borders, object-to-image distortions are amplified (see Fig. 52.4b, where oblique viewing causes parallel lines to appear to converge toward a vanishing point). A gap in the rack (the absence of an item on the rack) is frequently mistaken for the presence of a sticker on the surface (as seen in Fig. 52.5a) or for the presence of an item because of shadows and uneven lighting (as seen in Fig. 52.5b, marked with the red curve). These characteristics place a heavy burden on the conventional object detection techniques considered in computer vision. Multiple object detection [3–5], several instances of the same object [6, 7], multiple object localization [8, 9], object detection in multiple views [10], and fine-grained categorization [11–14] are all combined within the retail item detection problem.
The advantages of a vision-based item detection framework are listed below. 1. Better Buyer Experience: It is worth noting that the world's blind population

Fig. 52.5 A gap in the rack (the absence of an item on the rack) is frequently mistaken for the presence of a sticker on the surface (a) or for the presence of an item because of shadows and uneven lighting (b, marked with the red curve)

is believed to exceed 30 million individuals [15]. Even for an occasional customer, real-time information on the availability of a certain item at a particular store location reduces shopping time by a small amount. 2. Commercial Benefits: Out-of-stocks in grocery stores happen between 5 and 10% of the time, according to Metzger [16]. In [17], Gruen et al. investigate the effects of out-of-stocks on retail establishments around the world. Due to shortages of stock, they report the following statistics: customers switch retailers 31% of the time, purchase a different brand of product 22% of the time, and do not purchase at all 11% of the time. The publicly available datasets are summarized in Table 52.2.

Table 52.2 Freely accessible datasets listed by publication year, number of product types, item photographs, and rack pictures. The datasets are briefly discussed in the following sections

No. | Year | Dataset | Product grouping | Images of the product | Images from the rack
1 | 2007 | GroZi-120 [2] | 120 | 676 | 29*
2 | 2007 | WebMarket [18] | 100 | 300 | 3153
3 | 2014 | Grocery Products [19] | 27**/3235 | 3235 | 680
4 | 2015 | Grocery Dataset [20] | 10 | 3600 | 354
5 | 2016 | Freiburg Groceries Dataset [21] | 25 | 4947 | 74

52.3 Methods for Detecting Products

1. Deep generative model: uses a multi-view deep generative model, which is a deterministic deep neural net. Conclusion: It contains images of various raw and packed items, which can be split into fine-grained classes and grouped into more general (coarse-grained) classes. A clear iconic image and a text description of the item are available for each class.
2. Pruning with a Design (a greedy algorithm). Conclusion: It offers a model that is 11 percentage points simpler and more accurate than the state of the art.
3. Few-shot Learning (FSL) combining ResNeXt-101, one-shot learning, a convolutional net, and a Local Maximal Occurrence (LOMO) descriptor. Conclusion: This promising combination of techniques within a Siamese net allows solving the problem of one-shot learning.
4. New triplet mining, new sampling metaphor, and metric learning. Conclu-
sion: When compared with the traditional way, our mining strategy increases
Recall@1 by up to 5%.
5. The experimental results demonstrate the effectiveness of our approach, as
well as the improvement given by the color pre-selection step in both detection
accuracy and efficiency.
6. For the first time, a method that combines two concepts: “semi-supervised
learning” and “on-shelf availability” (SOSA). Conclusion: The proposed
SOSA method’s effectiveness was evaluated on image datasets with different
ratios of labeled data ranging from 20 to 80%.
7. A new method is Shelf Scanner. Conclusion: It shows how to design a system
capable of detecting items on a shopping list by leveraging the characteristics
of grocery stores. The key to this technique is mosaic building.
8. Food recognition method based on CNN. Conclusion: Inception-ResNet
converges much faster, achieving 72.55% top-1 accuracy and 91.31% top-5
accuracy.
9. Propose an end-to-end architecture that comprises a GAN to solve domain
shift throughout training and a deep CNN trained on the GAN’s samples to
build a product image embedding that enforces a hierarchy between product
categories. Finally, we have suggested an architecture to recognize a product
item derived from a shelf image, and we have shown that CNN can be success-
fully trained to extract bounding boxes surrounding grocery products from an
image of the whole shelf.
10. We proposed a hybrid approach in this research. Conclusion: Object detection
on planogram images can aid in improved classification; it has been observed
that when the image class was established using a hybrid classification and
detection model, the classification rate is greater.
11. Alternatively, the entire network is utilized to extract features and classify them [21–24, 84]; deep learning references are listed in Table 52.2. Three methods were used: HOG, BOW, and Vote Map. Conclusion: The most advantageous option is a detection system based on vote-map voting, which allows multiscale processing with minimal slowdown. (A minimal sketch of the generic feature-extraction-and-matching idea is given after this list.)
12. The YOLO (You Only Look Once) algorithm is used to detect objects in shelf
images using deep learning algorithms. Conclusion: Increasing the number of
repeats lowers the total loss value.
13. A computer vision pipeline for identifying products on shelves and ensuring that the desired layout is followed; unconstrained product recognition refers to the ability to identify any product. Conclusion of the graph-based consistency check: a higher recognition rate (Recall) of 90.2% versus 87.4%, with a Precision of 90.4%.
14. Proposes the use of deep learning-based object detectors to obtain a first, product-agnostic recognition stage, and combines different tactics in a final refinement stage to help weed out false positives and distinguish among comparable products. Conclusion: It can identify products quickly and accurately.
15. Locount refers to simultaneous object localization and counting, as well as
anchor-based and anchor-free methods. Conclusion: The bounding boxes of
objects are gradually classified and correlated with the predicted numbers of
instances enclosed in the bounding boxes.
16. In this research, we present a method for recovering text from product photos
captured with a cell phone camera with a resolution of 5 megapixels. Conclu-
sion on Feature Extraction: It achieves an 89 percent CRR for all characters,
including those in our database.
17. A new large-scale retail product dataset for fine-grained picture classifica-
tion, RP2K, has been released. Conclusion: On this dataset, even the most
advanced fine-grained classification method fails to win a simple ResNet
baseline, indicating that there is still significant room for improvement.
18. GroZi-120 is a new multimedia database for studying object recognition, CHM, SIFT, ADA, and other related topics. Conclusion: For a blind or visually impaired person using a gadget that identifies products in a grocery shop, collecting in situ data every time the system needs to be trained would be impractical, so in vitro findings obtained from the Web are a good source of training data.
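As a generic illustration of the feature-extraction idea referenced in item 11 (and implicit in several other entries above), a product template and a shelf crop can be embedded with a pretrained CNN and compared by cosine similarity. The sketch below uses an ImageNet-pretrained ResNet50 and placeholder file paths; it stands for the general approach, not for any single surveyed method.

```python
# Generic template-vs-crop matching via a pretrained CNN embedding and cosine
# similarity; file paths are placeholders and the backbone choice is an assumption.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input

embedder = ResNet50(weights="imagenet", include_top=False, pooling="avg")

def embed(path):
    img = tf.keras.utils.load_img(path, target_size=(224, 224))
    x = tf.keras.utils.img_to_array(img)[np.newaxis, ...]
    return embedder.predict(preprocess_input(x), verbose=0)[0]

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

template = embed("templates/cereal_brand_a.jpg")    # placeholder product template
shelf_crop = embed("crops/shelf_region_03.jpg")     # placeholder shelf crop
print("similarity:", cosine(template, shelf_crop))  # higher means a likelier match
```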

52.4 Comparison of Retail Product Detection Methods

As previously stated, there are over 35 published papers on product detection in retail establishments. To evaluate the performance of the different methodologies, we present the results that have already been published. However, there are two major drawbacks to reviewing the outcomes: (a) differences in evaluation strategy and (b) variations in the item templates and rack pictures. We begin by giving the details of the publicly available datasets.

1. GroZi-120 [2]: The GroZi-120 data is the first grocery item benchmark dataset ever published. A sample of item and rack photos from the dataset appears in Fig. 52.2. Item photographs were assembled from grocery-related websites such as Froogle. The set of item photos incorporates pictures with several lightings, sizes, and positions, as well as images from other suppliers or photo galleries.
2. WebMarket [25]: The WebMarket dataset, with photographs of sizes 2272 × 1704 or 2592 × 1944, was gathered from 18 shelves in a retail store, each with a length of 30 m. In terms of size, position, and lighting, rack shots differ from product images. An item and a rack are shown in Fig. 52.5. Each of the 100 product categories has three examples in this dataset.
3. Grocery Products [26]: The Grocery Products data was created to deal with the categorization and localization of objects that are fine-grained (i.e., similar but not identical). The product photographs were found on the Internet. Figure 52.6 shows several examples of rack and item photos from the dataset. There are 80 broad product categories in the dataset.
4. Grocery Store Dataset [27]: The product pictures in the Grocery Store dataset (see Fig. 52.7a) were taken indoors with four types of cameras in a controlled environment. The shelf photos (as seen in Fig. 52.7b) were taken at forty distinct grocery storefronts with four different camera brands at various rack-to-camera distances.
5. Freiburg Groceries Dataset [79]: The Freiburg Groceries dataset is a collection of real-world item and rack photos. In Freiburg, Germany, four different cameras were used to record product photographs in grocery stores, houses, and offices. Figure 52.8 shows some of the product and rack photos included in the collection. Following this, we give a comparison of existing methodologies' reported results.

Fig. 52.6 a Rack image dataset b from the grocery products dataset [26]

Fig. 52.7 Product photos (a) as well as rack pics (b) from the grocery dataset [27]

Fig. 52.8 Product images (a) rack images (b) Freiburg groceries dataset [79]

Table shows the survey’s significant findings. The table gives answers to the taking
after request. (a) Which approaches are the foremost fitting for certain application
scenarios? (b) What are the foremost squeezing problems that have arisen tended
to? (c) What are the last few options challenges to which machine vision specialists
ought to commit more time?

52.5 Summary and Concluding Remarks

52.5.1 Characteristics of a Desirable Key System

Real-time: From the consumer’s perspective, the framework ought to work in real
time to permit expedient item accessibility checks. Given that the store must react
to customer requests for stock repurchases, the framework ought to run near to real
time from the retailer’s perspective.

Accuracy: Customers and retailers will embrace the system only if it consistently works at a high level of accuracy for a wide assortment of items. User interaction should be minimal or non-existent.
Robustness: Major difficulties include scale mismatches between the item templates and the rack picture, uneven lighting, camera viewpoint changes, and unsteady image capture due to hand-held devices. To increase the performance of a machine learning-based technique, the synthetic production of training images through data augmentation demands specific attention.
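A minimal sketch of such synthetic training-image production, using Keras preprocessing layers, is shown below; the particular transforms and their ranges are assumptions chosen to mimic the nuisances listed above (lighting, viewpoint, hand-shake), not recommendations from the survey.

```python
# Simple augmentation pipeline; transform choices and ranges are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.05),         # small in-plane rotations
    layers.RandomZoom(0.1),              # rack-to-camera distance changes
    layers.RandomBrightness(0.2),        # uneven store lighting
    layers.RandomTranslation(0.1, 0.1),  # hand-held camera shake
])

images = tf.random.uniform((4, 224, 224, 3), maxval=255.0)  # stand-in shelf crops
augmented = augment(images, training=True)
print(augmented.shape)  # (4, 224, 224, 3)
```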

52.5.2 Future Directions

The following are a few significant research directions that have emerged from the difficulties highlighted thus far.
(a) The key to a learning model's success is the development of region proposals based on key points or saliency that can effectively capture a plausible region containing an item. Generating multiple region proposals and discovering their layouts on the racks all at once, using a graph-theoretic or constraint-optimization approach, ought to be a major research goal.
(b) Semantic segmentation: Potential regions can be determined by applying a measure of objectness and then assigning a class label to them. Semantic segments may be identified using non-maximal suppression of objectness scores.
(c) Scalability and adaptability: Item packages change frequently, necessitating scalability and adaptability. The number of items on the market is rapidly increasing, so machine learning systems that require frequent retraining face a significant barrier.
(d) Symbol and optical character recognition: Both symbol identification and optical character recognition are well-known research areas, with mixed results reported so far.
(e) Use of depth information: Using an RGBD [154]-based system, especially for identifying gaps and absent products (see Fig. 52.9a), should be examined.

Fig. 52.9 a Manual mishandling results in missing stock (indicated by the red contour) and b skewed items on the rack

The look of gaps is typically influenced by uneven lighting.

References

1. Bishop, T.: How Amazon works: the technology behind the online retailer's groundbreaking new grocery store. GeekWire. Retrieved from https://www.geekwire.com/2016/amazon-works-technology-behind-online-retailers-groundbreaking-new-grocery-store
2. Merler, M., Galleguillos, C., Belongie, S.: Recognizing groceries in situ using in vitro training
data. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR’07. IEEE,
pp. 1–8 (2007)
3. Vo, B.-N., Vo, B.-T., Pham, N.-T., Suter, D.: Joint detection and estimation of multiple objects
from image observations. IEEE Trans. Sig. Process. 58(10), 5129–5141 (2010)
4. Villamizar, M., Garrell, A., Sanfeliu, A., Moreno-Noguer, F.: Interactive multiple object
learning with scanty human supervision. Comput. Vis. Image Understanding 149, 51–64 (2016)
5. Oh, K., Lee, M., Kim, G., Kim, S.: Detection of multiple salient objects through the integration
of estimated foreground clues. Image Vis. Comput. 54 (2016)
6. Haladová, Z., Šikudová, E.: Multiple instances detection in rgbd images. In: International
Conference on Computer Vision and Graphics, pp. 246–253. Springer (2014)
7. Aragon-Camarasa, G., Siebert, J.P.: Unsupervised clustering in Hough space for recognition of multiple instances of the same object in a cluttered scene. Patt. Recogn. Lett. 31(11), 1274–1284 (2010)
8. He, H., Chen, S.: Imorl: Incremental multiple-object recognition and localization. IEEE Trans. Neural Netw. 19(10), 1727–1738 (2008)
9. Foresti, G.L., Regazzoni, C.: A change-detection method for multiple object localization in real scenes. In: 20th International Conference on Industrial Electronics, Control and Instrumentation, IECON'94, vol. 2, pp. 984–987. IEEE (1994)
10. Torralba, A., Murphy, K.P., Freeman, W.T.: Sharing visual features for multiclass and multiview
object detection. IEEE Trans. Patt. Anal. Mach. Intell. 29(5), 854–869 (2007)
11. Ge, Z., Bewley, A., McCool, C., Corke, P., Upcroft, B., Sanderson, C.: Fine-grained classifi-
cation via a mixture of deep convolutional neural networks. In: 2016 IEEE Winter Conference
on Applications of Computer Vision (WACV), pp. 16. IEEE (2016)
12. Yao, H., Zhang, D., Li, J., Zhou, J., Zhang, S., Zhang, Y.: DSP: discriminative spatial part
modeling for fine-grained visual categorization. Image Vis. Comput. 63, 24–37 (2017)
13. Sun, T., Sun, L., Yeung, D.-Y.: Fine-grained categorization via CNN-based automatic extraction
and integration of object-level and part-level features. Image Vis. Comput. 64, 47–66 (2017)
14. Huang, D., Zhang, R., Yin, Y., Wang, Y., Wang, Y.: Local feature approach to dorsal hand vein
recognition by centroid-based circular key-point grid and fine-grained matching. Image Vis.
Comput. 58, 266–277 (2017)
15. WHO.: World health organization fact sheet, N_282 (2014). Accessed on 24 June 2017. http://
www.who.int/mediacentre/factsheets/fs282/en/
16. Metzger, C.P.: High delity shelf stock monitoring. Ph.D. thesis, ETH Zurich, Zurich,
Switzerland (2008)
17. Gruen, T.W., Corsten, D., Bharadwaj, S.: Retail out of stocks: a worldwide examination of
causes, rates, and consumer responses. Grocery Manufacturers of America, Washington, DC
18. Zhang, Y., Wang, L., Hartley, R., Li, H.: Where’s the sweet-bix. In: Asian Conference on
Computer Vision, pp. 800–810. Springer (2007)
19. George, M., Floerkemeier, C.: Recognizing products: a per exemplar multi-label image clas-
sification approach. In: European Conference on Computer Vision, pp. 440–455. Springer
(2014)

20. Varol, G., Kuzu, R.S.: Toward retail product recognition on grocery shelves. In: Sixth Interna-
tional Conference on Graphic and Image Processing (ICGIP 2014), International Society for
Optics and Photonics, pp. 944309–944309 (2015)
21. Jund, P., Abdo, N., Eitel, A., Burgard, W.: The Freiburg groceries dataset. arXiv preprint arXiv:
1611.05799
22. Zientara, P., Advani, S., Shukla, N., Okafor, I., Irick, K., Sampson, J., Datta, S., Narayanan, V.: A multitask grocery assistance system for the visually impaired: smart glasses, gloves, and shopping carts provide auditory and tactile feedback. IEEE Cons. Electron. Mag. 6(1), 73 (2017)
23. Dingli, A., Mercieca, I.: Multimedia interfaces for people visually impaired. In: Advances in
Design for Inclusion, pp. 487–495. Springer (2016)
24. Chong, T., Bustan, I., Wee, M.: Deep learning approach to planogram compliance in retail
stores
25. Baz, I., Yoruk, E., Cetin, M.: Context-aware hybrid classification system for fine-grained retail
product recognition. In: Image, Video, and Multidimensional Signal Processing Workshop
(MSP), 2016 IEEE 12th, pp. 1–5. IEEE (2016)
26. Zhang, Q., Qu, D., Xu, F., Jia, K., Jiang, N., Zou, F.: An improved method for object
instance detection based on object center estimation and convex quadrilateral verification.
In: Information Technology, Networking, Electronic and Automation Control the Conference,
pp. 174–177. IEEE (2016)
27. Novak, C.L., Shafer, S.A.: Anatomy of a color histogram. 1992. In: Proceedings CVPR’92,
IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 599–605.
IEEE (1992)
28. Shapiro, M.: Executing the best planogram. In: Professional Candy Buyer, Norwalk, CT, USA
(2009)
29. Medina, M.O.M., Fan, Z., Ranatunga, T., Barry, D.T., Sinha, U., Kaza, S., Krishna, V.: Customer service robot and related systems and methods, US Patent App. 14/921,899 (Oct. 23, 2015)
Chapter 53
Termino-ontology Resources
of Endogenous Agro-Sylvo-Pastoral
Practices for the Adaptation to Climate
Change

Halguieta Trawina, Yaya Traore, Sadouanouan Malo, and Ibrahima Diop

Abstract For the construction of the ontology of endogenous technical or practices


for the adaptation of the agro-sylvo-pastoral domain to climate change, an important
step is to identify the Termino-Ontological Resources (TOR). This article presents
an overview of the main resources identified in the agro-sylvo-pastoral domain in
general and the agricultural domain in particular. This will facilitate the knowledge
acquisition in the construction process of this ontology.

Keywords Ontological resources · Termino-ontology resources ·


Agro-sylvo-pastoral · Endogenous knowledge

53.1 Introduction

Burkina Faso, a country located in the heart of the Sahel, is one of the most vulnerable
regions in Africa, with 86% [1] of its population engaged in agricultural activities.
In the past, local know-how in the agricultural sector enabled some communities to adapt to their environmental and rainfall conditions. With climate change, where trends are sometimes inverted from one area to another, it is essential to be able to exploit and share this existing local know-how for the benefit of other communities that find themselves in similar environmental and climatic situations.
The popularization and sharing of this local knowledge are a solution to the question "How can rural agriculture be made resilient to climate change using the knowledge of the populations?".
Attempts at solutions have been proposed with the creation of certain platforms
or web portals (AfricAdapt, Enda Energie, WASCAL, etc.) for the dissemination of

H. Trawina (B) · Y. Traore


University Joseph Ki-Zerbo, 03 BP 7021 Ouagadougou, Burkina Faso
e-mail: halguieta@gmail.com
S. Malo
University Nazi Boni, 01 BP 1091 Bobo-Dioulasso, Burkina Faso
I. Diop
University Assane Seck, BP 523 Ziguinchor, Senegal

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 565
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8_53

local know-how. There is a lack of interoperability between these platforms and the
data are in standard formats (pdf, html, docx, etc.) not understandable by software
agents.
To answer this question, it is essential that this knowledge be described in a formal language.
With the opportunities offered by the semantic web [2], it is important that data on local agricultural knowledge be structured and put into a format that can be exploited not only by humans but also by software agents.
So that actors can easily find and exploit these interdependent data, especially in a context of climate change, it is important that all data concerning the agro-sylvo-pastoral domain be well organized and correctly structured.
Given this need to construct, co-construct, open, and link data on local endogenous practices, we proposed in [3] and [4] the architecture of a semantic wiki. It is the architecture of a social and semantic web platform allowing actors to describe, share, and co-construct their endogenous practices in the agro-sylvo-pastoral domain.
In this architecture, an ontology covering knowledge of the agro-sylvo-pastoral domain, climate change, and the endogenous practices used by rural populations is planned. But the construction of this knowledge requires preliminary work to identify the TOR of the agro-sylvo-pastoral domain, climate change, and endogenous practices.
It is therefore necessary to make an inventory of the existing resources in the domain. They will be analyzed and used to build an ontology of this knowledge. This ontology will be integrated into the social and semantic web platform whose architecture was proposed in [3].
This identification of resources is our objective in this article. It will thus provide the raw material for the construction of an ontology of endogenous agro-sylvo-pastoral practices.
The rest of this paper is organized as follows: in Sect. 53.2, we define some of the main concepts and terminologies used in the rest of the paper. Section 53.3 presents a literature review of the existing TORs in the agro-sylvo-pastoral domain. Section 53.4 presents a synthesis and analysis showing the importance of these existing TORs for the construction of the ontology and announces some perspectives on ontology construction methodologies, which will be the subject of our next article. Finally, Sect. 53.5 concludes this article.

53.2 Notion of TOR and Its Role in the Ontology Construction Process

This section defines the notions of Resource, Terminology, and Endogenous


Knowledge used in this paper.
The Larousse dictionary defines terminology as “a set of rigorously defined terms
that are specific to a science, a technique, or a particular field of human activity.”

In the field of agro-sylvo-pastoral knowledge, terminology refers to all the terms,


concepts, or any other data specific to this field, and the definition of which has been
agreed upon by experts or specialists in the field.
In the literature, several definitions of ontology exist, but the one proposed by Studer et al. [5] best characterizes its essence. According to them, "An ontology is a formal, explicit specification of a shared conceptualization."
Thus, the Termino-Ontological Resources (TOR) of the agro-sylvo-pastoral
domain are constituted by the terminological resources of the domain, which exist
in standard formats, uniquely comprehensible by humans (terminologies). They also
include all the resources that are already conceptualized, formalized, explicit and,
above all, understandable by machines (ontologies).
According to the authors of [6], endogenous knowledge, as already mentioned in the previous section, is any particular knowledge held by an indigenous or native people.
As stated in Studer's definition above, an ontology is a shared conceptualization because it refers to an abstract model of a certain phenomenon in the world, obtained by identifying the relevant concepts of this phenomenon.
These concepts are identified through the resources available in the given domain.
In the following section, a state of the art on ontological resources is given. A brief analysis based on comparative characteristics will allow us to highlight the limits of these existing ontological resources in terms of detailed information on agro-sylvo-pastoral techniques, and the need to move toward an ontology based on knowledge of endogenous agricultural techniques.

53.3 Termino-Ontology Resources of the Domain of Endogenous Agro-sylvo-pastoral Practices

In this section, we present the existing TORs (Termino-Ontology Resources) that will
serve as a basis for the construction of the ontology of this domain. It is structured
in two sub-sections: non-ontological resources and ontological resources.

53.3.1 Non-ontological Resources (NOR)

Non-ontological resources are existing resources that take into account knowledge of
a particular domain, represented with some degree of consensus but whose semantics
has not yet been formalized [7]. These resources can be free texts, textual corpora,
web pages, web directories, catalogs, classifications, thesaurus, lexicons, etc. [8].
To reuse these NORs, we propose to categorize them. We will use the three
classifying criteria of NOR which have been proposed in [7], namely (1) the type
of the NOR which refers to the type of knowledge encoded by the resource, (2) the

data model, i.e., the design model pattern used to represent the knowledge encoded
by the resource, and finally (3) the implementation of the resource.
These resources are elaborated in some specific frameworks and have received
the consensus of all the actors of the concerned domain.
(1) Category 1: glossaries, thesauri, catalogs, terminologies: These are glossaries, thesauri, or catalogs in the fields of endogenous knowledge, climate change, adaptation, or governance. We can cite:
• AGROVOC Thesaurus: a multilingual structured thesaurus in all fields related
to agriculture, forestry, fisheries, food, and other related fields;
• UNISDR Terminology on Disaster Risk Reduction 2009: The UNISDR Termi-
nology aims at promoting a common understanding and use of disaster risk
reduction concepts and at supporting the reduction of disaster risk efforts of
authorities, practitioners, and the public.
(2) Category 2: Resources using and encoding system: ERA (Evidence for
Resilient Agriculture) is a platform that provides data and tools designed
to accurately determine where every agricultural technology works. ERA
provides a complete synthesis of the effects of switching from one technology
to another on key indicators of productivity, system resilience, and climate change mitigation.
(3) Category 3: Web resources and standard format: Resources in XML file format,
Excel spreadsheets, and standard files (in pdf, word, etc.) are the type of
resources where records do not have structured interrelationships. In this cate-
gory, the data in XML format comes from web portals for the popularization
of endogenous good practices. Among these platforms, we can mention:
• AfricaAdapt: a continent-wide platform for promoting and sharing knowledge
on climate change adaptation;
• AAKN (Africa Adaptation Knowledge Network): an information portal for
sharing knowledge, research, successful initiatives, and collaborative partner-
ships [4];
• ENDA Energy: an International Non-Governmental Organization that
compiles results of studies on adaptation measures, resilience to climate
change, and renewable energy;
• GAN (Global Adaptation Network): a global platform for disseminating and
exchanging knowledge on climate change adaptation in different ways.
The other resources in standard format (Excel, Word, PDF, etc.) were collected
from experts in the field during our various interviews and also on the Web. An
extract of the corpus is listed in Table 53.1.

Table 53.1 An extract of the corpus


No. Documents references
1 Basga Emile DIALLA. Peasant practices and knowledge in Burkina Faso: a presentation
of some case studies. Center for Economic and Social Policy Analysis (CAPES), 2005
2 Knowledge management network in Burkina. Ethnobotany and traditional medicine:
farming practices and systems. CAPES, RGC-B, 2006
3 West African Water Partnership and (GWP/AO), “Inventory of climate change adaptation
strategies of local populations and exchange of experiences of good practices between
different regions in Burkina Faso,” 978–2918639-05 3, 2010. Online. Available at: https://
www.gwp.org
4 Savadogo, Moumini, Somda, J., Seynou, O., et al. Catalogue of good practices for
adaptation to climate risks in Burkina Faso. Ouagadougou, IUCN, 2011
5 S. J. Ouedraogo, P. Zounrana, E. Botoni, F. de V. Compaore, and J. C. Ouedraogo, “Good
Agro Sylvo Pastoral Practices for sustainable soil fertility improvement in Burkina Faso,”
CILSS 2012
6 The “Local Knowledge Bank” project of the enda energie Project, Nov 2016
https://portailqualite.acodev.be/fr/ressources
7 Dr. Aby Drame and André Kiema, “ Endogenous knowledge: good practices in climate
change mitigation and adaptation in West Africa,” 2016
8 P. J.-M. Dipama, “Climate change and sustainable agriculture in Burkina Faso: resilience
strategies based on local knowledge study report. Promoting Resilience in Semi-Arid
Economies (PRESA) and Innovation Environment Development Africa (IEDA),” p. 36,
2016
9 A. Sambo, “Vulgarization of local agricultural knowledge as adaptation strategies to
climate change in the Far North region of Cameroon,” Science et Technique, p. 173, 2017
10 D’haen S., Theokritoff E. 2019. State of play of climate change integration, “Assessment
of the integration of climate change into national adaptation and development policies in
Burkina Faso,” 2019

53.3.2 Ontological Resources (OR)

In the literature, several authors have proposed ontologies in the field of agri-
culture, including ONTAgri [9], the ontology based on AGROVOC [10], AGRO-
NOMIC TAXON [11], OntoAgroHidro [12], Ontology of the culture of the vine
[13], OntoCLUVA [14], Agroforestry [15], etc.
In 2019, Prachi Dalvi, in his work [8], proposed a review of ontological resources in the field of agriculture, and we will focus here on some of them. Let us also note that there are other ontological resources in the field of climate change and the agro-sylvo-pastoral domain, some of which we also discuss below.
Rehman and Shaikh [9] propose ONTAgri, an evolving service-oriented agricultural ontology for precision agriculture. This ontology focuses on agricultural practices such as irrigation, fertilization, and pesticide spraying. ONTAgri uses sensor technology to support the agricultural domain, which helps in acquiring real-time values.

In order to facilitate the use of AGROVOC for the development of terminologies, including ontologies, in the field of agriculture (this avoids having to reconstruct terminologies from scratch), AGROVOC was transformed into an OWL model [10] (Agricultural Ontology Service/Concept Server, AOS/CS). This allows the development of applications using semantic techniques and enables interoperability between applications using these ontologies. AGROVOC now exists as a SKOS-XL concept scheme published as Linked Open Data. It contains links and references to many other linked datasets in the LOD cloud. AOS has been developed to provide the concepts behind the terms and the specification of the relationships between them. Agricultural resources in several languages are combined using AOS, which provides a framework for sharing ideas and terms within the agricultural community.
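Because AGROVOC is published as SKOS-XL Linked Open Data, its concepts can be queried directly with SPARQL. The sketch below uses the SPARQLWrapper library; the endpoint URL is an assumption to be checked against the current FAO/AGROVOC documentation, and plain skos:prefLabel is used here for simplicity.

```python
# Hedged example of querying AGROVOC concepts over SPARQL; the endpoint URL
# is an assumption and should be verified against the AGROVOC documentation.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://agrovoc.fao.org/sparql")  # assumed endpoint
sparql.setQuery("""
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    SELECT ?concept ?label WHERE {
        ?concept a skos:Concept ;
                 skos:prefLabel ?label .
        FILTER (lang(?label) = "en" && CONTAINS(LCASE(STR(?label)), "agroforestry"))
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["concept"]["value"], "->", row["label"]["value"])
```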
AGRONOMIC TAXON [11] is a modular ontology developed by Roussey and Chanet to allow the annotation of document corpora on crop biological monitoring, crop protection, and good agricultural practices. It will also serve as a schema to store spatiotemporal data related to observations of crop development and pest attacks on the same crops. It is dedicated to the description of living organisms using taxonomy. The objective of the ontology is to facilitate data sharing in the Linked Open Data between farmers and agronomists. The module allows the automatic classification of organisms according to their taxonomic description, where the names of organisms are automatically deduced from their taxa.
In [12], authors propose the construction of a top-level ontology for OntoAgri-
Hidro by reusing existing representations. OntoAgriHidro is an ontology that repre-
sents knowledge about the impacts of agricultural activities and climate change on
water resources.
In [13], the authors, whose objective was to build a traceability and prediction tool for the vine and wine cycle, propose an ontology of vine growing. This tool is a platform that combines data from sensors (Internet of Things) with business knowledge to build innovative services. The platform interacts with an ontology built from the capture of four types of data: raw data from simple sensors (temperature, CO2 level, etc.) at the foot of the vines, processed data from intelligent sensors (on-board cameras with image-processing algorithms), raw data from sensors integrated into the vat (which provide raw parameters such as temperature, pH, turbidity, etc.), and finally a corpus of theoretical and scientific references.
After capturing the data, the authors use data mining or terminology mining methods to process them.
I. Diop and his colleagues propose OntoCLUVA [14], a generic ontology of climate change that can be reused in the design of ontologies of this domain. In order to satisfy the need for a knowledge management system (KMS) composed of several modules (tasks) in the domain of climate change, the authors propose the construction of an ontology design pattern, or generic ontology, named OntoCLUVA for the climate change domain.
Other ontologies exist in the field of agroforestry, forestry, and agronomy. These
include:

• Agroforestry [15] is an ontology for the organization of the different components of agroforestry systems. User-centric, the ontology is built for a search engine whose development process was proposed and described by Julien Ingram and his colleagues [16]. This search engine aims to help practitioners (farmers and advisors) in the fields of agriculture and forestry to find search results that meet their specific demands;
• AgroPortal [17] is a portal of semantic resources and ontologies in agronomy built from a corpus of ontologies, terminologies, and thesauri. For its construction, more than 400,000 mappings between concepts, based either on reuse or on similarity, were generated, extracted, and analyzed.

53.4 Synthesis and Analysis of the State of the Art of TOR

In this work, we have reviewed the state of the art on terminological resources (TRs) on the one hand and ontological resources (ORs) on the other. This allowed us to build two main collections: the collection of terminological resources and the collection of ontological resources, some of which can be reused.
Based on the collection of existing ontological resources (ORs), we propose an analysis using four criteria, namely the type of resource, the sector of application (agriculture, forestry, hydraulics, pastoral), the consideration of endogenous cultivation techniques in these ORs, and the link with climate change (CC). Table 53.2 summarizes this comparative analysis.

Table 53.2 Comparative table of the corpus of ontological resources

OR | Type | Area of application | Includes endogenous practices | Link to CC
ONTAgri [9] | Ontology | Agriculture | – | Yes
AGROVOC [10] | Thesaurus | Agriculture | – | No
AGRONOMIC TAXON [11] | Modular ontology | Biological monitoring of crops | Yes | No
Ontology of vine cultivation [13] | Event ontology | – | No | No
OntoAgroHidro [12] | Ontology | Agriculture, hydraulics | No | Yes
OntoCLUVA [14] | Generic ontology | Climate change | No | Yes
Agroforestry [15] | Ontology | Agroforestry | Yes | No
AgroPortal [17] | Semantic and ontological resource portal | – | – | No

Fig. 53.1 Sub-domain network of climate change [14]

The analysis of this table shows that none of the ontological resources is specifically concerned with local knowledge or endogenous techniques in the field of agriculture while taking into account the context of climate change. This can be seen in the last two columns, where the two criteria are never simultaneously verified for a given ontological resource.
Hence our interest in proposing an ontology of endogenous agricultural techniques or knowledge, which we call OntoEndo, by reusing existing termino-ontological resources such as OntoCLUVA.
We propose to reuse this ontology because it already proposes a pattern for building a climate change ontology. This ontology pattern takes into account the risks and disasters (1) related to climate change and their impact on urban vulnerability (2). The governance dimension, for effective management of urban vulnerabilities (3), risks and disasters (4), and climate change (5), was also taken into account. The inter-relationships between these sub-domains identified in that work are illustrated in Fig. 53.1.
Building on this work, our main contribution is the consideration of endogenous knowledge or local know-how (6) as a tool for climate change governance. This new governance tool will ensure better resilience of the agro-sylvo-pastoral domain (7), which is also affected by risks and disasters related to climate change, as illustrated in Fig. 53.2.
For the construction of OntoEndo, an ontology of endogenous cultivation knowledge for better adaptation of cultivation techniques in a context of climate change, the TORs identified in this work will be reused; an alignment with OntoCLUVA will then make it possible to take into account the dimension of adaptation to climate change.
To facilitate knowledge acquisition in the ontology construction process, we propose to organize the knowledge into six sub-domains, which are:
• Climate change, which contains concepts and relationships related to climate, greenhouse gases, risks, vulnerabilities, resilience, disasters, etc.;
• Risks and disasters, which contains concepts and relationships related to risks, disasters, damages, etc.;
• Governance, which includes concepts and relationships related to the actors of governance, the roles or missions of the actors, and the instruments of the actors (which include, among others, good practices), etc.;

Fig. 53.2 Breakdown into sub-domains

• Urban vulnerability, which contains concepts and relationships related to the urban system and its vulnerabilities;
• Agro-sylvo-pastoral sector, made up of the set of concepts and relationships that describe the different application sectors for endogenous knowledge;
• Local know-how or endogenous knowledge, which contains all the concepts and relations related to endogenous good practices and which will depend on soil types, climatic zones, and application sectors.
The relationships between the sub-domains indicate that climate change brings risks and disasters, which in turn impact the agro-sylvo-pastoral sectors. To reduce the impacts on the agro-sylvo-pastoral sector, it is important to have a governance policy for climate change-related risks and disasters, but also for the agro-sylvo-pastoral sector itself. One of the elements of governance is the consideration of endogenous knowledge in these sectors for better adaptation or resilience to climate change.
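As a purely illustrative sketch of how these six sub-domains and their relations could be encoded as an OWL skeleton for OntoEndo, the snippet below uses rdflib; the namespace, class names, and property names are assumptions made for illustration, not the final ontology.

```python
# Illustrative OWL skeleton for the OntoEndo sub-domains with rdflib;
# namespace, class, and property names are placeholders.
from rdflib import Graph, Namespace, RDF, RDFS, OWL

ONTO = Namespace("http://example.org/ontoendo#")  # placeholder namespace
g = Graph()
g.bind("ontoendo", ONTO)

# One OWL class per sub-domain identified above
for name in ["ClimateChange", "RiskAndDisaster", "Governance",
             "UrbanVulnerability", "AgroSylvoPastoralSector",
             "EndogenousKnowledge"]:
    g.add((ONTO[name], RDF.type, OWL.Class))

# Object properties mirroring the inter-relations described in the text
relations = [
    ("presents", "ClimateChange", "RiskAndDisaster"),
    ("impacts", "RiskAndDisaster", "AgroSylvoPastoralSector"),
    ("governs", "Governance", "RiskAndDisaster"),
    ("usesAsInstrument", "Governance", "EndogenousKnowledge"),
    ("improvesResilienceOf", "EndogenousKnowledge", "AgroSylvoPastoralSector"),
]
for prop, domain, rng in relations:
    g.add((ONTO[prop], RDF.type, OWL.ObjectProperty))
    g.add((ONTO[prop], RDFS.domain, ONTO[domain]))
    g.add((ONTO[prop], RDFS.range, ONTO[rng]))

print(g.serialize(format="turtle"))
```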
It should be noted that the sub-domains of climate change, risks and disasters, urban vulnerability, and governance were already taken into account in the work of I. Diop and his colleagues [14].
In order to better exploit all the TORs identified for the construction of OntoEndo, it is essential to start from a methodological approach to ontology construction. This methodological choice must take into account the possibility of reuse and/or reengineering and alignment of existing ORs or TRs of the studied domain; that will be the objective of our next paper.

53.5 Conclusion and Perspectives

After recalling the context of the need for an ontology of endogenous knowledge
for the adaptation of cultivation techniques in a context of climate change, we were
able to identify the terminological resources that will be used to build this Ontology.
The literature review reveals that there is a plethora of terminological resources on
this endogenous knowledge, but they remain non-formalized data that cannot be
understood by machines. Nevertheless, this work has allowed us to build a well-stocked
collection of terminological and ontological resources in the agro-sylvo-pastoral field
in general and in agriculture in particular. We have managed to propose a division
into sub-domains and to establish the interrelations between them.
As perspectives, we propose in our next work to make a state of the art on ontology
construction methodologies which will lead us to our methodological choice for
OntoEndo taking into account the sub-domain division presented here. This choice
must take into account the possibility of reusing and/or reengineering the available
termino-ontological corpora identified in the present work. Transformation rules will
be used to create a source model that will be used in the construction of the OntoEndo
ontology.

References

1. Dipama, P.J.-M.: Changement climatique et agriculture durable au Burkina Faso: stratégies de
résilience basées sur les savoirs locaux, rapport d'étude, p. 36 (2016)
2. Buffa, M., Ereteo, G., Gandon, F.: Wiki et Web Sémantique, p. 13 (2007)
3. Trawina, H., Malo, S., Diop, I., Traore, Y.: Towards a social and semantic web platform for
sharing endogenous knowledge to adapt to climate change. In: 2020 15th Iberian Conference
on Information Systems and Technologies (CISTI), pp. 1–5. IEEE (2020)
4. Trawina, H., Diop, I., Malo, S., Traore, Y.: Architecture of a platform on sharing endogenous
knowledge to adapt to climate change. In: New Perspectives in Software Engineering, pp. 131–
141. Cham (2021). https://doi.org/10.1007/978-3-030-63329-5_9
5. Studer, R., Benjamins, V.R., Fensel, D.: Knowledge engineering: principles and methods. Data
Knowl. Eng. 25(1–2), 161–197 (1998)
6. Sambo, A.: Vulgarisation des savoirs locaux agricoles comme stratégies d’adaptation au
Changement climatique dans la région de l’Extrême Nord du Cameroun. Science et Technique,
p. 173 (2017)
7. García-Silva, A., Gómez-Pérez, A., Suárez-Figueroa, M.C., Villazón-Terrazas, B.: A pattern
based approach for re-engineering non-ontological resources into ontologies. In: Domingue, J.,
Anutariya, C. (éds.) The Semantic Web, vol. 5367, pp. 167–181. Springer, Berlin, Heidelberg
(2008). https://doi.org/10.1007/978-3-540-89704-0_12
8. Dalvi, P.D., Mandave, V., Gothkhindi, M., Patil, A., Kadam, S., Pawar, S.S.: Overview of
agriculture domain ontologies (2016)
9. Rehman, A., Shaikh, Z.: ONTAgri: scalable service oriented agriculture ontology for precision
farming (2011)
10. Liang, A.C., Lauser, B., Sini, M., Keizer, J., Katz, S.: D’AGROVOC à l’Agricultural Ontology
Service/Concept Server Un modèle OWL pour la création d’ontologies dans le domaine de
l’agriculture, p. 11
11. Roussey, C., Chanet, J.P. : Le premier module d’une ontologie agricole sur la protection des
cultures : Agronomic Taxon. In : Atelier INtégration de sources/masses de données hétérogènes
et Ontologies, dans le domaine des sciences du VIVant et de l’Environnement, IN-OVIV
2013 associé à la Plate-forme IA 2013 (PFIA 2013) et aux 24ème journées d’Ingéniérie des
Connaissances (IC 2013), Lille, France, juill. 2013, pp. 5–16
12. Bonacin, R., Nabuco, O.F., Pierozzi, I., Jr.: Ontology models of the impacts of agriculture and
climate changes on water resources: scenarios on interoperability and information recovery. Fut.
Gener. Comput. Syst. 54, 423–434, janv. 2016. https://doi.org/10.1016/j.future.2015.04.010
13. Hugol-Gential, C. et al.: Une ontologie de la culture de la vigne: des savoirs académiques aux
savoirs d’expérience. ReC, vol. 48, avr. (2019). https://doi.org/10.14428/rec.v48i48.45493
14. Diop, I., Lo, M.: An ontology design pattern of the multidisciplinary and complex field of
climate change
15. Salazar, R.C., Liagre, F., Mougenot, I.: Vers une démarche ontologique pour la capitalisation
des données de l’agroforesterie. In: Conférence Nationale d’Intelligence Artificielle, p. 89
(2020)
16. Ingram, J., Gaskell, P.: Searching for meaning: co-constructing ontologies with stakeholders
for smarter search engines in agriculture. NJAS—Wageningen J. Life Sci. 90–91, 100300, déc.
2019. https://doi.org/10.1016/j.njas.2019.04.006
17. Laadhar, A., Abrahão, E., Jonquet, C.: Analysis of term reuse, term overlap and extracted
mappings across agroportal semantic resources. In: Keet, C. M., Dumontier, M. (éds.) Knowl-
edge Engineering and Knowledge Management, vol. 12387, pp. 71–87. Springer International
Publishing, Cham (2020). https://doi.org/10.1007/978-3-030-61244-3_5
Chapter 54
Survey of Protocol-Based Approaches
Using NS3 for Congestion Control

Hemali Moradiya and Kalpesh Popat

Abstract Transmission Control Protocol (TCP) is a communications standard
which enables programs and computing devices to exchange messages over a
network. An important part of the TCP/IP suite, TCP ensures error-free data transmis-
sion. TCP breaks the data down into multiple packets, and each packet is transmitted
over the link established between source and destination by TCP. Network conges-
tion, however, can adversely impact the transmission of such data packets. Causes
of congestion include network overflow (transmission of more data packets than the
network can handle), loss of packets, poor network configuration, inadequate
bandwidth, obsolete hardware, etc. TCP also has mechanisms that can address the
network congestion issue. Various TCP variants have been developed in response
to changing congestion control requirements. This paper presents an analysis of
TCP variants such as TCP Reno, TCP NewReno, TCP Sack, TCP Vegas, TCP Tahoe,
and TCP Westwood. It also presents a performance study of 3 TCP variants, viz.
TCP NewReno, TCP Vegas, and TCP Westwood+, through simulation done in NS3.

Keywords TCP · Congestion · Congestion window (cwnd) · Bandwidth

54.1 Introduction

A large number of end-user devices connected over the Internet transmit and receive
data through a number of smaller networks connected by devices like routers and
switches. Congestion occurs when the network comprising these devices is not
capable of handling the data being transmitted. These interconnecting devices
have limited capacity, and effective congestion control measures need to be deployed
to control congestion. TCP-based congestion control techniques are deployed at end
user devices, thereby reducing burden on low capability devices like routers and
switches.
ACK in TCP refers to the acknowledgment sent by the receiver to the transmitter
after receipt of a data packet. Once a data packet is received by the receiver, it
relays back an ACK signal to the transmitter confirming the data receipt. Most TCP-
based congestion control algorithms rely on controlling and modifying a congestion
window (cwnd) which limits the sending rate based on the rate of receipt of ACKs;
i.e., depending on how many ACKs are received and the time taken for them, the trans-
mitter adjusts the number of segments transmitted. Cwnd limits the number
of outstanding unacknowledged data packets between transmitter and receiver [1].
If the rate of ACKs increases, the number of outstanding unacknowledged data packets
reduces and so the sending rate can be increased, and vice versa when ACKs slow down.

54.1.1 TCP Congestion Control

TCP undertakes Congestion Control in 3 phases which are as below:


Slow Start: Slow Start is used by sender to adjust the data flow rate to the receiver
with every ACK received [2]. Slow Start threshold (ssthresh) is the level of cwnd at
which TCP would start checking for congestion. In Slow Start, cwnd is set at 1 MSS
and cwnd is increased by 1 MSS (Maximum Segment Size) every time an ACK is
received. This implies that cwnd would be increased exponentially after each Round
Trip Time (RTT). However, such exponential increase in cwnd would also result in
increase in transmission rate over the network and can lead to congestion.
Congestion Avoidance: Congestion avoidance starts when cwnd equals ssthresh.
When cwnd increases to ssthresh level, TCP decides to slow down transmission rate
to avoid congestion. Accordingly, cwnd is now increased additively in linear manner,
rather than exponentially. Even after this slowing down, Congestion risk still remains,
and TCP still keeps on checking for Congestion which is called Congestion Detection.
Congestion Detection: Congestion Detection is done when network requires any
particular packet to be re-transmitted. Re-transmission is required in 2 scenarios
(1) Network time out which happens when no ACK is received for a data packet
within the stipulated timeframe, and thus, packet is deemed to have been lost and
(2) 3 duplicate ACKs are received implying that some data segments may have been
dropped.
TCP reacts strongly in case of timeout scenario by re-setting the ssthresh to a level
which is half of the prevailing cwnd size and re-setting the cwnd to 1 MSS. In the
scenario when 3 duplicate ACKs are received, TCP’s reaction is relatively moderate
whereby it resets the ssthresh to half of prevailing cwnd size and resets the cwnd to
revised ssthresh. In effect, in case of time out, TCP returns back to Slow Start again,
whereas in case of 3 duplicate ACKs, TCP returns back to Congestion Avoidance.
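To make the three phases concrete, the following C++-style sketch shows one way a sender-side cwnd (counted in MSS) could be updated; the variable names, the initial ssthresh, and the per-ACK bookkeeping are illustrative assumptions rather than an implementation taken from this paper.

```cpp
// Illustrative sketch of TCP's three-phase cwnd management (units of MSS).
struct TcpState {
    double cwnd     = 1.0;   // congestion window, starts at 1 MSS (Slow Start)
    double ssthresh = 64.0;  // slow start threshold (assumed initial value)
};

// Called once per new (non-duplicate) ACK.
void onAck(TcpState &s) {
    if (s.cwnd < s.ssthresh)
        s.cwnd += 1.0;            // Slow Start: +1 MSS per ACK (doubles each RTT)
    else
        s.cwnd += 1.0 / s.cwnd;   // Congestion Avoidance: roughly +1 MSS per RTT
}

// Congestion Detection: the reaction depends on how the loss was detected.
void onTimeout(TcpState &s) {
    s.ssthresh = s.cwnd / 2.0;    // strong reaction
    s.cwnd = 1.0;                 // back to Slow Start
}

void onTripleDupAck(TcpState &s) {
    s.ssthresh = s.cwnd / 2.0;    // moderate reaction
    s.cwnd = s.ssthresh;          // continue in Congestion Avoidance
}
```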

This paper now discusses various TCP variants used for Congestion Control, viz.
TCP Tahoe, TCP Reno, TCP New Reno, TCP Vegas, TCP Sack, and TCP Westwood.

54.2 TCP Tahoe

TCP Tahoe is based on the TCP congestion control algorithm proposed by Van
Jacobson [3]. Tahoe uses a combination of slow start, congestion avoidance, and fast
re-transmit [4]. It works on packet conservation technique whereby a new packet is
not injected in the network, unless a packet is taken out of the network as well, such
that overall number of packets in the system does not exceed network capacity. This
is implemented by monitoring outgoing packets through ACK receipts. Tahoe also
maintains a cwnd and ssthresh as described above [3].
Tahoe conceptualizes a Fast Re-Transmission phase, post the Slow Start and
Congestion Avoidance phases. In Tahoe, packet loss can be considered to have
happened even before transmission time out, in case 3 duplicate ACKs are received.
Tahoe treats congestion detected through 3 duplicate ACKs in the same way as
congestion detected through time out: in both cases, it goes back to Slow Start rather
than to Congestion Avoidance. Because the lost packet is re-transmitted without waiting
for the timeout to expire, this is called Fast Re-transmission.
As Tahoe releases a new packet only after receiving ACKs, it creates a problem
when a connection first starts as there are no ACKs at the start. To handle this, Tahoe
requires that any time a network starts or it re-starts post-detection of Congestion, it
should deploy Slow Start. This is suggested to avoid network getting overwhelmed
by initial burst of data [3]. Tahoe’s drawback is that it normally waits for an entire
timeout interval to identify packet losses and once packet loss is identified, it resorts
back to Slow Start process which makes overall network too slow [4].

54.3 TCP Reno

Reno adds a new phase called Fast Recovery in place of Tahoe's Fast Re-transmission.
Reno also adds some intelligence that allows it to detect congestion sooner
and avoids emptying the entire network pipeline when congestion happens.
Response of Tahoe and Reno is same in case of congestion detected through time
out, i.e., the system returns to Slow Start phase in case of time out in both. However,
when 3 duplicate ACKs are received which indicates congestion may have happened,
Tahoe goes into Fast Re-transmission by returning to Slow Start, thereby emptying
network pipeline and slowing down the system. On the other hand, when 3 duplicate
ACKs are received, Reno applies Fast Recovery technique whereby the network does
not return to Slow Start phase but reduces its ssthresh to half of existing cwnd and
sets cwnd to ssthresh + 3 (where 3 accounts for 3 duplicate ACKs). Fast recovery is
a phase between Slow Start and Congestion avoidance. TCP Reno stays in the Fast
Recovery and increases cwnd by 1 every time it receives further ACKs. TCP Reno
terminates Fast Recovery phase once a proper non-duplicate ACK is received and
switches back to Congestion Avoidance [3]. For example, if 3 duplicate ACKs are
received when cwnd is 13 MSS, then TCP Reno enters the Fast Recovery mode by
setting ssthresh to 6 and cwnd is set at 9 (i.e., ssthresh + 3), instead of setting cwnd
to 1 MSS as in case of Slow Start.
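The Fast Recovery behavior above can be sketched as follows; the struct layout and the deflation of cwnd back to ssthresh on exit follow the standard Reno description and are illustrative assumptions, not code from the paper.

```cpp
// Illustrative sketch of TCP Reno's Fast Recovery entry/exit (cwnd in MSS).
struct RenoState {
    double cwnd = 13.0;      // example from the text: cwnd = 13 MSS
    double ssthresh = 32.0;
    bool   inFastRecovery = false;
};

void onTripleDupAck(RenoState &s) {       // 3 duplicate ACKs received
    s.ssthresh = s.cwnd / 2.0;            // 13 / 2 -> 6 (integer MSS in the text)
    s.cwnd = s.ssthresh + 3.0;            // +3 for the three duplicate ACKs
    s.inFastRecovery = true;
}

void onDupAck(RenoState &s) {             // each further duplicate ACK
    if (s.inFastRecovery) s.cwnd += 1.0;  // inflate the window by 1
}

void onNewAck(RenoState &s) {             // first proper non-duplicate ACK
    if (s.inFastRecovery) {
        s.cwnd = s.ssthresh;              // deflate and resume Congestion Avoidance
        s.inFastRecovery = false;
    }
}
```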

54.4 TCP New Reno

TCP NewReno builds further on Reno and is capable of identifying multiple packet losses
[3]. In Reno, Fast Recovery terminates and the system switches to Congestion Avoidance
on the first instance when a proper non-duplicate ACK is received, without considering
whether all packets outstanding in the window prior to Fast Recovery have been
delivered properly. Thus, Reno performs well only when there is a single packet
loss in a window. In case of multiple packet losses, Reno would result in system
switching between Fast Recovery and Congestion Avoidance multiple times. This
results in window deflation multiple times, thus slowing down the system.
In NewReno, ACKs are either full ACKs or partial ACKs. A full ACK is received
when all packets outstanding at the start of Fast Recovery are acknowledged, whereas
a partial ACK acknowledges some but not all of those packets. On receipt of partial
ACKs, NewReno does not terminate Fast Recovery but transmits next packet in
sequence and reduces the cwnd to 1 less than the number of packets acknowledged
by partial ACKs. On receipt of full ACK, NewReno terminates Fast Recovery and
switches to Congestion Avoidance by setting cwnd equal to ssthresh [5, 6].
Reno does not distinguish between ACKs as full ACKs and partial ACKs and
considers all ACKs as same, thereby terminating Fast Recovery and switching to
Congestion Avoidance multiple times in cases where there is multiple packet loss in
same window. However, New Reno stays in Fast Recovery till the time a full ACK
is received which indicates that all packets outstanding at the time of Fast Recovery
are transmitted successfully and switches to Congestion Avoidance only after that.
Thus, NewReno is capable of handling multiple packet losses and is also faster than
Reno [6].
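A compact sketch of the full-ACK/partial-ACK logic is given below; the recoveryPoint bookkeeping, the retransmitNextUnacked() helper, and the partial-deflation rule follow the usual NewReno formulation and are assumptions for illustration.

```cpp
// Illustrative sketch of NewReno's handling of full vs. partial ACKs during
// Fast Recovery (cwnd in MSS). Names and the deflation rule are assumptions.
#include <cstdint>

void retransmitNextUnacked() { /* resend the first unacknowledged segment (stub) */ }

struct NewRenoState {
    double   cwnd = 9.0;
    double   ssthresh = 6.0;
    bool     inFastRecovery = true;
    uint32_t recoveryPoint = 100;   // highest sequence outstanding at entry
};

void onAckInFastRecovery(NewRenoState &s, uint32_t ackedUpTo, double newlyAckedSegments) {
    if (!s.inFastRecovery) return;
    if (ackedUpTo >= s.recoveryPoint) {
        // Full ACK: everything outstanding at entry is acknowledged.
        s.cwnd = s.ssthresh;          // exit to Congestion Avoidance
        s.inFastRecovery = false;
    } else {
        // Partial ACK: stay in Fast Recovery, retransmit the next missing
        // segment, and partially deflate the window.
        retransmitNextUnacked();
        s.cwnd = s.cwnd - newlyAckedSegments + 1.0;
    }
}
```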

54.5 TCP Vegas

TCP Vegas is more efficient than Reno. Vegas operates through a Modified Slow
Start, Enhanced Congestion Avoidance, and Modified New Re-transmission [4].

54.5.1 Modified New Re-transmission Technique

Reno and Tahoe can detect network congestion only when there is some packet loss.
Congestion signals in Reno and Tahoe could either be network time out or receipt
of 3 duplicate ACKs. TCP Vegas, however, is proactive and recognizes congestion
even before packet loss. Unlike Reno and Tahoe, Vegas does not wait for 3 duplicate
ACKs to recognize possibility of congestion and instead relies on RTT to detect
congestion. Thus, on receipt of 1st duplicate ACK, Vegas calculates the time elapsed
between when packet was sent and when corresponding duplicate ACK was received
and if the same is greater than defined time out interval, it immediately initiates re-
transmission without waiting for 2 further duplicate ACKs. When packet losses are
very high or when window is too small, it is possible that sender will never receive
3 duplicate ACKs, and hence, Reno would not be able to detect congestion in that
scenario. Vegas resolves this drawback of Reno.

54.5.2 Enhanced Congestion Avoidance

In Vegas, cwnd is modified based on the 'difference' between expected throughput
(cwnd/base RTT) and actual throughput (cwnd/actual RTT). Vegas calculates this
'difference', and whenever the 'difference' goes beyond a defined threshold (thereby
indicating that actual throughput is falling well below expected throughput), Vegas exits
Slow Start phase and enters Congestion Avoidance phase [7]. Vegas uses Enhanced
Congestion Avoidance whereby it defines two thresholds for throughput ‘difference’
mentioned above with Alpha being lower threshold and Beta being upper threshold.
As long as Alpha < ‘difference’ < Beta, cwnd is kept unchanged. When ‘difference’
< Alpha, cwnd is increased by 1 and when ‘difference’ > Beta, cwnd is decreased
by 1 [4].
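The Alpha/Beta rule can be sketched as follows; the threshold values and the conversion of the throughput difference into "segments of extra data in the network" follow the usual Vegas formulation and are assumptions for illustration.

```cpp
// Illustrative sketch of Vegas-style congestion avoidance based on the
// difference between expected and actual throughput.
const double ALPHA = 1.0;   // lower threshold (segments of extra data, assumed)
const double BETA  = 3.0;   // upper threshold (assumed)

void vegasUpdate(double &cwnd, double baseRtt, double currentRtt) {
    double expectedTput = cwnd / baseRtt;      // best-case throughput
    double actualTput   = cwnd / currentRtt;   // measured throughput
    // Express the difference as segments queued in the network.
    double diff = (expectedTput - actualTput) * baseRtt;

    if (diff < ALPHA)
        cwnd += 1.0;        // network under-used: grow linearly
    else if (diff > BETA)
        cwnd -= 1.0;        // queue building up: shrink linearly
    // otherwise keep cwnd unchanged
}
```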

54.5.3 Modified Slow Start

In case of Slow Start too, Vegas uses modified Slow Start technique. While cwnd
doubles on every RTT in case of conventional Slow Start (thus resulting in exponential
increase in cwnd), Vegas doubles the cwnd only after every other RTT (i.e., after every
alternative RTT), thereby slowing down the cwnd expansion rate.

54.6 TCP Sack

TCP variants like Tahoe, Reno, and NewReno undertake analysis of cumulative
ACKs. They can only detect one packet loss per RTT. If there is more than one packet
loss, these variants would re-transmit some of the already transmitted segments.
SACK stands for Selective Acknowledgements. SACK modifies the ACK mechanism
so that receiver can tell the sender which segments of a packet are received and which
are not, by sending SACKs, rather than sending cumulative ACK for entire packet.
Normal ACK mechanism allows receiver to tell the sender only about packets that
are received fully, i.e., sequentially. To implement SACK, both receiver and sender
must agree to a mechanism which allows for receiver to send out SACKs and this
should be done at the time of establishing TCP connection. SACK allows receiver
to inform sender about receipt of non-contiguous, out of order segments [5].
SACK uses this additional information to prevent re-transmission of already deliv-
ered segments, which is an improvement over Tahoe/Reno/NewReno variants. SACK
re-transmits only the lost segments. SACK deploys a variable called Pipe to store
information about outstanding data. When Pipe < cwnd, SACK sends data and
changes Pipe to Pipe + 1. When the receiver sends an ACK, SACK decrements Pipe
to Pipe − 1. This continues till the entire outstanding data in the network is
transmitted. SACK normally operates in Fast Recovery and terminates Fast Recovery
to enter into Congestion Avoidance once all outstanding data is successfully deliv-
ered. For example, if there are 5 segments in the window, and segments 1, 2, 4, and
5 are successfully delivered while segment 3 is lost, SACK would only re-transmit
segment 3 whereas Tahoe/Reno/New Reno would re-transmit segments 3, 4, and 5
[5].
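The Pipe bookkeeping can be sketched as follows; the sendNextSegment() helper and the loop structure are illustrative assumptions, not the SACK implementation itself.

```cpp
// Illustrative sketch of the SACK "Pipe" bookkeeping described above.
void sendNextSegment() { /* transmit the next lost or new segment (stub) */ }

struct SackState {
    double cwnd = 10.0;   // congestion window in segments
    int    pipe = 0;      // estimate of segments currently in the network
};

void trySend(SackState &s) {
    while (s.pipe < s.cwnd) {   // room in the network: send more
        sendNextSegment();
        s.pipe += 1;            // Pipe = Pipe + 1 on every transmission
    }
}

void onAckOrSack(SackState &s) {
    if (s.pipe > 0) s.pipe -= 1;   // Pipe = Pipe - 1 when data leaves the network
    trySend(s);
}
```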

54.7 TCP Westwood

TCP Westwood is an upgrade over Reno. Reno follows a simple rule whereby it
simply resets cwnd to half its value when congestion is detected through time out or
3 duplicate ACKs. However, in wireless connections especially, radio channel related
issues can lead to sporadic losses which can be wrongly considered as congestion. In
such cases, TCP Reno, by halving cwnd, can cause unnecessary reduction in cwnd.
TCP Westwood takes a different approach of constantly measuring the sender side
bandwidth based on rate of returning ACKs. Cwnd and ssthresh are dynamically
adjusted based on the effective bandwidth at the time when Congestion is detected.
TCP Westwood response is similar to Reno in Slow Start and Congestion Avoid-
ance. However, it has a different response once Congestion is detected. When Conges-
tion is detected through timeout, cwnd is set to 1 MSS whereas ssthresh is set at
Estimated Bandwidth (BWE) at the time of detection. In case Congestion is detected
through 3 duplicate ACKs, ssthresh is adjusted based on BWE, RTT and Segment
Size, whereas if cwnd > ssthresh then cwnd is set equal to ssthresh [8].
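The bandwidth-based reaction can be sketched as follows; converting BWE and the minimum RTT into segments using an assumed segment size follows the usual Westwood formulation and is not taken from the paper.

```cpp
// Illustrative sketch of TCP Westwood's bandwidth-based reaction to loss.
// "bwe" is the bandwidth estimate derived from the ACK stream (bytes/s).
const double SEG_SIZE = 1460.0;               // bytes per segment (assumption)

void westwoodOnTimeout(double &cwnd, double &ssthresh, double bwe, double rttMin) {
    ssthresh = (bwe * rttMin) / SEG_SIZE;      // ssthresh from estimated bandwidth
    cwnd = 1.0;                                // back to Slow Start
}

void westwoodOnTripleDupAck(double &cwnd, double &ssthresh, double bwe, double rttMin) {
    ssthresh = (bwe * rttMin) / SEG_SIZE;      // adapt to the available bandwidth
    if (cwnd > ssthresh)
        cwnd = ssthresh;                       // as described in the text
}
```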

54.8 TCP Westwood+

TCP Westwood+ is an upgrade over Westwood and is used for estimation of the
end-to-end available bandwidth. TCP Westwood+ flexibly sets
cwnd and ssthresh at a level which considers the system bandwidth at the time when
it has experienced congestion. It increases throughput in case of wireless links and
fairness in case of wired networks, as compared to throughput/fairness in case of
New Reno.
The basis of TCP is discussed with default congestion control in [9, 10]. The
traditional TCP variants which are used for wired networks are discussed in [11].
TCP Vegas and TCP with FACK are discussed in [12] and [13], respectively. We
have also reviewed some recent and modern TCP variants [14]. A discussion on
TCP in wired-cum-wireless networks is given in [15]. The future of TCP on Wi-
Fi is discussed in [16]. We have reviewed six approaches of advanced congestion
control schemes for TCP. Fairness-aware TCP-BBR Algorithm is discussed in [17].
Congestion control in high-speed lossless data center networks is discussed in [18].
High precision congestion control and Congestion Control Management in High
Speed Networks are discussed in [19] and [20], respectively. A simple and fast traffic
flow control algorithm and adaptive congestion control algorithm are discussed in
[21] and [22], respectively.

54.9 Simulation

We have run simulations to study performance of 3 TCP variants, viz. Westwood+,
NewReno, and Vegas, using Network Simulator 3 (NS3).

54.9.1 Network Topologies

In our simulation, we have simulated a network with 3 nodes. A source node is
connected to a destination node through an intermediate node. The connection from the
source node to the intermediate node is set as a reliable link with high-speed capacity,
while the connection from the intermediate node to the destination node is set as an
unreliable link with low-speed capacity. When the source node transmits data to the
destination node, the intermediate node experiences congestion, because the
source-to-intermediate link delivers data fast but the intermediate-to-destination link
delivers it slowly. Our simulation also considers packet drops in addition to congestion:
the unreliable link experiences packet drops based on a given probability. Parameters
considered for the simulation include (1) reliable link bandwidth, (2) reliable link delay,
(3) unreliable link bandwidth, (4) unreliable link delay, and (5) unreliable link error
probability. A sketch of such a topology in NS3 is given below.
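A minimal ns-3 script for such a topology could look like the sketch below; the data rates, delays, error rate, application choice (BulkSend/PacketSink), and simulation times are illustrative values, not the authors' exact script. Throughput can then be estimated from the packet sink's total received bytes over the flow duration.

```cpp
// Sketch of the 3-node chain described above using the ns-3 C++ API.
#include "ns3/core-module.h"
#include "ns3/network-module.h"
#include "ns3/internet-module.h"
#include "ns3/point-to-point-module.h"
#include "ns3/applications-module.h"

using namespace ns3;

int main (int argc, char *argv[])
{
  // Select the TCP variant under test (e.g., ns3::TcpNewReno, ns3::TcpVegas).
  Config::SetDefault ("ns3::TcpL4Protocol::SocketType",
                      TypeIdValue (TypeId::LookupByName ("ns3::TcpNewReno")));

  NodeContainer nodes;
  nodes.Create (3);   // 0 = source, 1 = intermediate, 2 = destination

  PointToPointHelper reliable;       // source -> intermediate: fast, reliable
  reliable.SetDeviceAttribute ("DataRate", StringValue ("10Mbps"));
  reliable.SetChannelAttribute ("Delay", StringValue ("2ms"));
  NetDeviceContainer d01 = reliable.Install (nodes.Get (0), nodes.Get (1));

  PointToPointHelper unreliable;     // intermediate -> destination: slow, lossy
  unreliable.SetDeviceAttribute ("DataRate", StringValue ("3Mbps"));
  unreliable.SetChannelAttribute ("Delay", StringValue ("10ms"));
  NetDeviceContainer d12 = unreliable.Install (nodes.Get (1), nodes.Get (2));

  // Random packet drops on the unreliable link's receiving device.
  Ptr<RateErrorModel> em = CreateObject<RateErrorModel> ();
  em->SetAttribute ("ErrorRate", DoubleValue (0.005));
  d12.Get (1)->SetAttribute ("ReceiveErrorModel", PointerValue (em));

  InternetStackHelper stack;
  stack.Install (nodes);

  Ipv4AddressHelper addr;
  addr.SetBase ("10.1.1.0", "255.255.255.0");
  addr.Assign (d01);
  addr.SetBase ("10.1.2.0", "255.255.255.0");
  Ipv4InterfaceContainer if12 = addr.Assign (d12);

  Ipv4GlobalRoutingHelper::PopulateRoutingTables ();

  uint16_t port = 8080;              // bulk TCP flow from source to sink
  BulkSendHelper sender ("ns3::TcpSocketFactory",
                         InetSocketAddress (if12.GetAddress (1), port));
  sender.SetAttribute ("MaxBytes", UintegerValue (0));   // 0 = unlimited
  ApplicationContainer srcApp = sender.Install (nodes.Get (0));
  srcApp.Start (Seconds (1.0));
  srcApp.Stop (Seconds (20.0));

  PacketSinkHelper sink ("ns3::TcpSocketFactory",
                         InetSocketAddress (Ipv4Address::GetAny (), port));
  ApplicationContainer dstApp = sink.Install (nodes.Get (2));
  dstApp.Start (Seconds (0.0));
  dstApp.Stop (Seconds (20.0));

  Simulator::Run ();
  Simulator::Destroy ();
  return 0;
}
```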

Table 54.1 Simulation result—1 (all values in Mbps)

S. No. | Reliable link BW | Unreliable link BW | Throughput Westwood+ | Throughput NewReno | Throughput Vegas
1      | 10               | 1                  | 0.98                 | 0.99               | 0.59
2      | 10               | 2                  | 1.93                 | 1.98               | 0.74
3      | 10               | 3                  | 2.86                 | 2.96               | 0.68
4      | 10               | 4                  | 3.74                 | 3.94               | 1.09
5      | 10               | 5                  | 4.56                 | 4.86               | 1.83

Fig. 54.1 Simulation result—1 (throughput in Mbps of Westwood+, NewReno, and Vegas for scenarios 1–5)

54.9.2 Simulation Result—1

In the first simulation, we keep the reliable link BW fixed while changing the
unreliable link BW to see its impact on throughput. As we can see, throughput is bounded
by the unreliable link bandwidth, because it is the lowest in the path. The throughput is
even lower than the unreliable link bandwidth because congestion happens at the
intermediate node. The results are shown in Table 54.1 (also see Fig. 54.1).

54.9.3 Simulation Result—2

In the second simulation, we keep both the reliable link BW and the unreliable link BW
fixed but vary the error probability for packet drops. We can see that the
throughput is further reduced as the error probability increases. The results are
shown in Table 54.2 (also see Fig. 54.2). Here, the measurements are in Mbps.

54.9.4 Simulation Result—3

We have also increased the number of nodes to 10 so that the path has multiple
bottleneck links. The results are shown in Table 54.3 (also see Fig. 54.3).

Table 54.2 Simulation result—2 (bandwidth and throughput in Mbps)

S. No. | Reliable link BW | Unreliable link BW | Error probability | Throughput Westwood+ | Throughput NewReno | Throughput Vegas
1      | 10               | 3                  | 0.000             | 2.86                 | 2.96               | 0.68
2      | 10               | 3                  | 0.005             | 2.61                 | 0.40               | 0.72
3      | 10               | 3                  | 0.010             | 2.57                 | 0.23               | 0.61
4      | 10               | 3                  | 0.020             | 2.13                 | 0.15               | 0.33
5      | 10               | 3                  | 0.025             | 1.66                 | 0.13               | 0.30

Fig. 54.2 Simulation result—2 (throughput in Mbps of Westwood+, NewReno, and Vegas for scenarios 1–5)

Table 54.3 Simulation result—3 (all values in Mbps)

S. No. | Reliable link BW | Unreliable link BW | Throughput Westwood+ | Throughput NewReno | Throughput Vegas
1      | 10               | 1                  | 0.96                 | 0.98               | 0.40
2      | 10               | 2                  | 1.92                 | 1.98               | 0.53
3      | 10               | 3                  | 2.82                 | 2.96               | 0.61
4      | 10               | 4                  | 3.69                 | 3.92               | 0.63
5      | 10               | 5                  | 4.63                 | 4.86               | 0.64
6      | 10               | 6                  | 5.35                 | 5.77               | 0.72
7      | 10               | 7                  | 6.05                 | 6.64               | 0.66
8      | 10               | 8                  | 7.13                 | 7.50               | 0.66
9      | 10               | 9                  | 7.86                 | 8.29               | 0.93
10     | 10               | 10                 | 8.49                 | 8.95               | 0.67

Fig. 54.3 Simulation result—3 (throughput in Mbps of Westwood+, NewReno, and Vegas for scenarios 1–10)

54.10 Conclusion

The performance of existing TCP variants has been analyzed, and we have found that they
may not perform well in all cases. Some networks have unknown traffic, and
for that case, we need intelligent policies. These intelligent policies can be devel-
oped with machine learning techniques. Out of Westwood+, NewReno, and Vegas,
Westwood+ and NewReno perform better than Vegas. Westwood+ is mainly used
for wireless networks. Our research focuses on wired networks, where NewReno
performs comparatively more stably. At present, Westwood+ estimates bandwidth, but
the estimate is not always accurate. Our work will try to improve the performance of
Westwood+ with machine learning techniques to perform more appropriate congestion control.
In all these research works, we have identified that there is still scope for further
improvement in the accuracy of estimation when predicting and controlling congestion.
Machine learning techniques are efficient in decision-making, so we
are undertaking research to use machine learning for efficient congestion control.

References

1. Callegari, C., et al.: A survey of congestion control mechanisms in Linux TCP. In: International
Conference on Distributed Computer and Communication Networks. Springer, Cham (2013)
2. Abed, G.A., Ismail, M., Jumari, K.: Exploration and evaluation of traditional TCP congestion
control techniques. J. King Saud Univ—Comput. Inf. Sci. 24(2), 145–155 (2012)
3. Taruk, M., Budiman, E., Setyadi, H.J.: Comparison of TCP variants in long term evolu-
tion (LTE). In: 2017 5th International Conference on Electrical, Electronics and Information
Engineering (ICEEIE). IEEE (2017)
4. Chaudhary, P., Kumar, S.: Comparative study of TCP variants for congestion control in wireless
network. In: 2017 International Conference on Computing, Communication and Automation
(ICCCA). IEEE (2017)
5. Kaur, H., Singh, G.: TCP congestion control and its variants. Adv. Comput. Sci. Technol. 10(6),
1715–1723 (2017)
6. Parvez, N., Mahanti, A., Williamson, C.: An analytic throughput model for TCP NewReno.
IEEE/ACM Trans. Netw. 18(2), 448–461 (2009)
7. Sangolli, S.V., Thyagarajan, J.: An efficient congestion control scheme using cross-layered
approach and comparison of TCP variants for mobile ad-hoc networks (MANETs). In: 2014
First International Conference on Networks and Soft Computing (ICNSC2014). IEEE (2014)
8. Kanani, V.I., Panchal, K.J.: Performance Analyses of TCP Westwood 1 (2014)
9. Forouzan, B.: TCP/IP Protocol Suite. McGraw-Hill
10. Stevens, W.R.: TCP/IP illustrated. In: The Protocols, vol. 1. Addison-Wesley Professional
Computing Series
11. Abed, G.A., Ismail, M., Jumari, K.: Exploration and evaluation of traditional TCP congestion
control techniques. J. King Saud Univ—Comput. Inf. Sci. 24, 145–155 (2012)
12. Brakmo, L.S., O’Malley, S.W., Peterson, L.L.: TCP vegas: new techniques for congestion detec-
tion and avoidance. In: SIGCOMM’94 Proceedings of the Conference on Communications
Architectures, Protocols and Applications
13. Mathis, M., Mahdavi, J.: Forward acknowledgment refining TCP congestion control. In:
Pittsburgh Supercomputing Center—ACM SIGCOMM vol. 26, no. 4, October 1996
14. Abdeljaouad, I., Rachidi, H., Fernandes, S., Karmouch, A.: Performance analysis of modern
TCP variants: a comparison of cubic, compound and New Reno. In: 2010 25th Biennial
Symposium on Communications (QBSC), pp. 80–83 (2010)
15. Pentikousis, K.: TCP in wired-cum-wireless environments. IEEE Commun. Surv. Tutorials
3(4), 2–14 (2000)
16. Grazia, C.A.: Future of TCP on Wi-Fi 6. IEEE Access 9, 107929–107940 (2021)
17. Jia, M., et al.: MFBBR: an optimized fairness-aware TCP-BBR algorithm in wired-cum-
wireless network. In: IEEE INFOCOM 2020-IEEE Conference on Computer Communications
Workshops (INFOCOM WKSHPS). IEEE (2020)
18. Huang, S., Dong, D., Bai, W.: Congestion control in high-speed lossless data center networks:
a survey. Futur. Gener. Comput. Syst. 89, 360–374 (2018)
19. Li, Y., et al.: HPCC: high precision congestion control. In: Proceedings of the ACM Special
Interest Group on Data Communication, pp. 44–58 (2019)
20. Bazi, K., Nassereddine, B.: Congestion control management in high speed networks. In: WITS
2020, pp. 527–537. Springer, Singapore (2022)
21. Millán, G., et al.: A simple and fast algorithm for traffic flow control in high-speed computer
networks. In: 2018 IEEE International Conference on Automation/XXIII Congress of the
Chilean Association of Automatic Control (ICA-ACCA). IEEE (2018)
22. Verma, L.P., Verma, I., Kumar, M.: An adaptive congestion control algorithm. http://iieta.org/
journals/mmc_a 92(1), 30–36 (2019)
Chapter 55
IOT-Based Smart Baby Cradle: A Review

Thilagamani Sathasivam, T. A. Janani, S. Pavithra, and R. Preethy

Abstract The number of working mothers has increased substantially nowadays.
At the same time, infant care has become a big problem for many families, which
forces most parents to leave their infants at a grandparent's residence or take them
to a childcare home. Parents are then not able to monitor their babies' day-to-day
activities. To overcome these risks, this work is primarily based on the Smart Cradle
(IoT-BBMS), described as an eco-friendly, economical, IoT-based monitoring system.
We also propose a new algorithm for our system that performs key functions in
offering high-quality infant care when parents are not near. In the designed system,
the Node MCU/Arduino controller board collects the data read by the sensors and
updates it online as required.

Keywords IOT-BBMS · ARDUINO UNO · Node MCU · Raspberry Pi · Cayenne · Gyro Sensor-ADXL 335

55.1 Introduction

Mother’s care is very important for the baby, so most of the parents have lots of
stress and workload. That they were unable to take care of the babies and searched
for a baby caretaker and play schools for prevention. It is used to secure the baby’s
protection on time. A baby cradle contains a camera with a video attachment and
a microphone. It refers to the bunch of objects that are connected via the network
connection. It can transfer sensor data on the Internet without any new inventions.
The sensors are used to collect the data and alert the parents when they are in an
emergency [1]. Nowadays, most babies are born prematurely. Just born babies are at
high risk. So, babies are under a controlled and enclosed environment which is said
to be isolation or incubation which gives heat to keep a constant body temperature
and is secured [2]. The automation process and data exchange using the Internet

of Things (IoT) is growing continuously. IoT comprises connected devices, software
systems, actuators, cyber-physical systems, and computing devices associated with a
particular object. It runs with the help of the Internet and exchanges data between
objects or people automatically, without any human intervention [3].
The IoT healthcare application system handles the patient by placing sensors
in the room. The patient's temperature changes and other parameters are monitored;
depending on the readings, the health status is notified and a projection is given
periodically. Patients can thus be watched and notified remotely, from wherever they
actually are. IoT also looks after cost validation, power usage, and safety. The machine
models used helped apply the information to protect the infant [4]. Abnormal body
conditions are recorded with a smartwatch, avoiding wrong readings. Various sensors
are built into the highly efficient watch for measuring various health values, and this
information is sent to cloud storage for further processing [5]. The system analyzes the
data and provides feedback to patients based on the health measures. Wireless healthcare
observation is also done with other technology such as a Raspberry Pi 3 board, with an
Arduino board employed as a gateway; it may be connected to the cloud for capturing
real-time data [6]. The ongoing vital information from these devices is compiled every
hour, which requires handling a large quantity of possibly corrupted data. It includes
cardiac patient monitoring, blood-pressure level, temperature changes, pulse rate, and
respiration, to keep up the continuous flow of oxygen with a breathing machine.
All the project units are connected with the baby to capture the important records on
an everyday check [7].
The incubator is used to maintain the warmth at a constant level without disturbing
the child inside it. Once the software successfully updates a child's well-being and
growth, emergencies can be handled in a short period; such incubators are used in many
hospitals [8]. A heart monitor is used to count the pulses of the babies in the several
incubators of an institution, so it becomes troublesome to spot which particular
incubator has issues; to resolve this issue, a bulb and buzzer are employed [9]. The
choice of network resources needed to provide the correct communication speeds is
the most notable issue. A number of problems have to be considered to improve the
performance of an IoT network, depending on the sort of application to be implemented,
and the products should have low energy consumption. Several factors exist
that have an effect on the operating method of an IoT network [10].
IoT solutions are steadily increasing and are transforming the way users work and
process data. Building on prior experience, and to display output to the users, a
bi-directional IoT gateway was created, with wireless protocols attached [11].
These protocols are ZigBee and Wi-Fi, which control and cooperate in the data transfer
process [12]. Many child monitoring systems have already been built and are available
on the market. Most of them have the functionality to monitor heart rate, temperature,
and movement, and some research measures lung respiration and heart pumping with
wireless technology. A baby monitor consists of many important aspects concerning
monitoring [13]. The Raspberry Pi is used to push
the data gathered by the measurements to an HTML website, which can be viewed
directly from any country by the particular user [14]. Using PHP, an HTML page that
drives the GPIO pins is created. By linking the webpage to the Raspberry Pi, the doctor
is given access to control the heater or any other function for the users. An app created
with App Inventor is used to control the temperature and other factors [15].
A terminal device’s square measure can be connected to IoT and turn out huge,
numerous knowledge on a daily or regular basis. The process of computation of the
cloud is troublesome to satisfy the necessities of IoT for fast replay, high quality,
earthy distribution, place awareness, and small latency [16]. The fog computing
method was introduced by the Cisco Company, which allocates storage and alter-
native methods of cloud computing technology from the middle to the sting of the
technology wherever all methods square measure nearer to end users. Fog computing
provides a system-level horizontal design [17]. Health experts play a crucial position
in advising dad and mom away to sleep their toddlers properly to minimize the chance
of unexpected toddler dying syndrome and napping accidents. Infants must be located
supine to sleep in a cot with a corporation well-becoming bed inside the parental
bedroom without a tender or free bedding that could hinder the airway. Exposure to
smoking each earlier than and after delivery ought to be minimized [18]. The layout
of the hardware module of IoT- primarily based toddler incubator tracking system.
The hardware module includes a microcontroller and records acquisition submodule,
and information communique submodule. In this research, the microcontroller used
is Arduino Uno Rev3 together with a frame temperature sensor because the illustra-
tion of biosensors and atmosphere temperature sensor, humidity sensor, and fuel line
sensor because the illustration of surroundings tracking sensors [19]. The function
of IoT technology withinside the clever metropolis concept (Janik et al. 2020) is
essential to bridge the already cited worldwide infrastructural demanding situations
in towns, that are connected with the present-day growth of the populace in towns
[20]. IoT technology in clever towns might permit the usage of various gadgets,
which might boom the lifestyles first-rate in towns in addition to the performance
of various each day offerings which include transportation, security (surveillance),
clever metering, clever strength systems, clever water management, etc. Different
sensing gadgets might acquire information, which could be processed in the direction
of green and beneficial solutions [21].
The thermoregulatory mechanism during sleep may impair arousal mechanisms,
respiratory drive, cerebral oxygenation, and cardiac responses [22]. Studies have
observed that bedroom heating increases SIDS risk, while well-ventilated bedrooms
and the use of a fan are associated with a reduced chance of SIDS. These findings
suggest that indoor warmth is a crucial risk factor for SIDS. In addition, indoor and
outdoor temperatures have been shown to correlate strongly, especially in the heating
season [23]. A general framework for collaborative production is proposed, followed
by a discussion of such an IoT-based framework for the area of micro-device assembly.
The design of this collaborative framework is discussed in the context of cloud
computing as well as the emerging Next Internet, which is the focus of
recent initiatives in the USA, the EU, and other countries [24]. The data exchange
among the various software and physical components is modeled using the Unified
Modeling Language (UML), which offers a structured basis for designing and
developing this IoT-based collaborative framework. The key cyber-physical components
and modules are defined, followed by a discussion of the implementation of this
framework [25]. Another study aimed to determine, explore, and outline the most
appropriate IT governance enablers to help managers in IoT implementation. The study
followed the Design Science Research methodology, including systematic literature
reviews and a Delphi technique, to construct the artifact [26]. The artifact was
validated and evaluated in a real organization. The results suggest that data privacy,
data protection, and data analysis are presently the most applicable enablers to
consider in an IoT implementation because they increase the performance of the
solution and enhance data credibility [27]. Data mining, together with emerging
computing strategies, has strongly influenced the healthcare industry. Researchers have
used different data mining and Internet of Things (IoT) techniques for building an
automated solution for diabetes and heart patients. However, a more advanced and
unified solution is still needed which can provide a therapeutic opinion to individual
diabetic and cardiac patients [28].
With the evolution of smart things, gateways play a major role in interconnecting
numerous sensor nodes to gather data. Using different wireless protocols and standards
with the sensor nodes, a gateway can transform data into a suitable format that is
transmitted to the cloud for further use [29]. The sensor values can be observed, and
the devices can be managed by a cellphone from a remote location. A proof of concept
for controlling smart home appliances is also demonstrated [30].
Most of those devices have cables and are quite large in size, which can disturb
an infant's ordinary life, and they need nonstop monitoring from guardians. One
paper presents a tracking device that measures the most critical vital signs of infants
and transmits the results over a wireless link to a management tool that may be any
smartphone [31]. The device can measure blood oxygen level, heart rate, breathing
rate, body temperature, body posture, and leg activity. Combining all of these raw
signals, it is feasible to apply this device to other, possibly life-threatening conditions
during long-term monitoring. Compared with other comparable solutions, it has tiny
dimensions, low weight, increased dependability of the photoplethysmography
measurement, and prolonged battery life due to the use of the Bluetooth Smart wireless
protocol [32]. Stable temperature regulation is needed to prevent the hypothermia or
hyperthermia that can arise in premature infants. Care of the infant in an incubator
causes the separation of mother and infant, and mothers of premature infants have been
found to lack confidence in caring for their infants compared with mothers of full-term
infants. The aim of that work was the evaluation of temperature in an infant incubator
control system. To get a stable and accurate
temperature, the temperature sensor is located according to the standard of the infant
incubator at the hospital [33].
Variation in physiological parameters during manipulations and procedures can be
related to poor health outcomes. The fast-paced, stressful NICU environment may also
adversely affect how manipulations are performed, which might not be captured in
procedure documentation in an EMR. Recent research has tried to capture neonatal
video streams by positioning a digital camera at the top of the neonate's crib to
overcome manual documentation limitations [34].

55.1.1 Embedded System

An embedded system is a kind of computing system mainly designed to perform
dedicated tasks, such as accessing, processing, and storing information, inside many of
the products we currently work with. An embedded system is a mixture of hardware and
software, where the software is usually called firmware and is embedded into the
hardware. The embedded part plays a vital role in such systems and delivers its results
within strict deadlines. Embedded systems help make the work more accurate and easier,
so we often use embedded systems in simple as well as sophisticated devices. The
applications of embedded systems are involved in many devices in our real life.

55.2 Related Works

The paper [1] explained a new algorithm that plays a vital role in the controlling
device. The control board utilizes the data read by the sensors and updates it online to
a connected platform. The model of the baby cradle design uses this software [35]. The
cradle uses sensors to measure body temperature, dryness level, and abnormal activities.
An object placed on the cradle top helps the baby feel sleepy, and the cradle swings
automatically when the baby starts crying. An external Wi-Fi camera has been installed
for real-time vision monitoring. Another project proposed a model of an automatic baby
checking system based on several parameters: it checks temperature, moisture, heart
pumping, gas molecules, and the baby's position through a video camera. A SODI
microcontroller is used in this process and carries the qualities that make it simple to
computerize with the interfaced sensors. The video camera connects to the Raspberry Pi
[36]. Through a Blink channel screen, the measured data are checked; the product is the
portable Blink channel, which controls the sensor values so that the baby is kept safe
[3]. The difficulty lies in gaining knowledge of the day-to-day activities of patients;
it helps us to pick out constraints with older people and to understand the way of
tracking the daily activities of a person, from brushing to dressing. Many technologies
exist for such sensing devices.
With such systems, the caretaker no longer needs to monitor 24/7 and the patient can
be at home [4]. People with abnormal health conditions can be monitored so that a baby
cradle can give a fast precaution; the prediction uses portable gadgets that constantly
observe the user. A smartwatch is a device that protects our internal organs with
correct measurement of cardiac pumping through ECG and PPG signals, and the signal
output is shown on the smartwatch screen [37]. It consists of different components like
free-fall sensors and Bluetooth Low Energy to reduce the power used for communication
with other devices [6]. The body temperature of the baby in the cradle is normally
maintained between 36.5 and 37.2 °C, but for some reason this temperature can rise
suddenly. The mother needs to provide the required warmth to the baby; if the mother
is not capable of fulfilling the baby's need, then an incubator is required. It is
troublesome for the doctor to keep a continuous watch on every incubator, so a buzzer
is used to signal the baby's status [9]. IoT is used in several fields like smart
computing, on-demand computing, high-performance computing, and accelerator computing.
The relationship between IoT and other parts of computing should be inspected,
particularly regarding the execution of the IoT nodes. Several problems need
improvement for better execution of the network nodes of IoT. Special hardware that
has been added for cryptography needs to be updated regularly. IoT structures are
complex as the connection is completed over many mediums, and the cost of gadgets
joined to an IoT network is high [10].

55.3 Proposed System

Here, we propose IoT-based monitoring of the health index of babies. The system
will monitor infants and smooth the way for taking care of their health. The structured
diagnostic model can suit parents who have no time. Pressure sensors, pulse sensors,
temperature sensors, and a gyro sensor are interfaced with the controller to detect body
temperature, pressure, pulse, and body movement. In case of emergency, the buzzer
turns on to alert the neighbors. The monitored values are uploaded to the IoT platform,
and an LCD is used for displaying the monitored parameters. IoT controls the ordinary
system as mentioned in Fig. 55.1; a minimal controller sketch is given below.
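A minimal Arduino-style sketch of the monitoring loop is given below; the pin assignments, thresholds, sensor scaling, and the sendToIoT() placeholder are assumptions for illustration and not part of the proposed system's published code.

```cpp
// Illustrative monitoring loop: read sensors, report values, buzz on emergency.
const int buzzerPin = 8;
const int tempPin   = A0;   // analog temperature sensor (assumed LM35-style)
const int pulsePin  = A1;   // analog pulse sensor (assumed)

const float TEMP_LIMIT_C = 37.2;   // upper bound mentioned in the review
const int   PULSE_LIMIT  = 160;    // assumed alert threshold (bpm)

void sendToIoT(float temperatureC, int pulseBpm) {
  // Placeholder: in the full system the Node MCU would upload these values
  // to the IoT platform and the LCD would display them.
  Serial.print("temp="); Serial.print(temperatureC);
  Serial.print(" pulse="); Serial.println(pulseBpm);
}

void setup() {
  Serial.begin(9600);
  pinMode(buzzerPin, OUTPUT);
}

void loop() {
  // Assumed LM35-style scaling: 10 mV per degree C with a 5 V reference.
  float temperatureC = analogRead(tempPin) * (5.0 / 1023.0) * 100.0;
  int   pulseBpm     = map(analogRead(pulsePin), 0, 1023, 40, 200); // rough mapping

  sendToIoT(temperatureC, pulseBpm);

  // Turn the buzzer on in any emergency, as described in Sect. 55.3.
  bool emergency = (temperatureC > TEMP_LIMIT_C) || (pulseBpm > PULSE_LIMIT);
  digitalWrite(buzzerPin, emergency ? HIGH : LOW);

  delay(1000);
}
```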

55.3.1 Difference Between Existing System and Proposed System

In the existing system, extra sensors have been added to help in parameter measure-
ment, but the product's price is not affordable and most ordinary people cannot buy it;
the SODI board is also a complicated system with a restricted number of executions.
We recommend using the Internet of Things to track the babies' health index: only the
necessary sensors are added to the structured diagnostic model, at low cost.

Fig. 55.1 Block diagram of proposed system

As a new feature, we have included a gyro sensor. Instead of a SODI board, we use
an Arduino UNO, because the SODI board combined two functionalities, Wi-Fi and
microcontroller, whereas the Arduino provides the microcontroller alone, as mentioned
in the flow diagram of Fig. 55.2.

55.4 Software Details

55.4.1 Arduino IDE

The Arduino Integrated Development Environment (IDE) is the tool used for the coding
part. It is used to write and upload programs to a programmable board. The programs
written using the Arduino IDE are called sketches; they are written in a text editor
and saved as files. The user can easily manipulate the code with the editor. The message
area gives feedback while saving and exporting and also displays errors, while the
console shows text output from the IDE, including complete error messages and other
information. The bottom right-hand corner of the window displays the configured board
and serial port. The toolbar shortcuts assist with verifying and uploading programs,
creating, opening, and saving sketches, and opening the serial monitor.

Fig. 55.2 Flow diagram of proposed system

55.4.1.1 Connecting the Arduino

1. Connect the USB cable: one end to the computer and the other end to the Arduino
board.
2. If prompted after downloading the Arduino IDE, choose to search the computer for
the driver and select the folder where the Arduino IDE download was extracted.
3. If Windows warns that the driver is not certified, choose "Install anyway."
4. Your system is now ready; a minimal test sketch is shown below.
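The minimal test sketch below, assuming the board's built-in LED, can be uploaded to verify that the IDE and board are set up correctly; it is illustrative only.

```cpp
// Minimal "blink" sketch to verify the board and IDE setup.
// LED_BUILTIN maps to the on-board LED (pin 13 on an Arduino UNO).
void setup() {
  pinMode(LED_BUILTIN, OUTPUT);   // configure the LED pin as an output
}

void loop() {
  digitalWrite(LED_BUILTIN, HIGH);  // LED on
  delay(1000);                      // wait one second
  digitalWrite(LED_BUILTIN, LOW);   // LED off
  delay(1000);
}
```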

55.4.2 Proteus

The Proteus Design Suite is a proprietary software suite used for electronic design
automation. The software is employed principally by electronic design engineers to
create schematics and printed circuit boards. Proteus is a design tool developed by
Labcenter Electronics for digital circuit simulation, schematic capture, and PCB layout.
Its simplicity and user-friendly layout have made it famous among electronics hobby-
ists. It can simulate components such as LEDs, LDRs, and USB devices. It is a schematic
capture and layout tool whose simulation is driven by an underlying engine and which
offers features at different stages of the design flow.

55.5 Hardware Details

55.5.1 Arduino Uno

The Arduino Uno is an open-source microcontroller board developed by Arduino. It can
be fitted with various expansion boards (shields) through its connectors. The board has
digital and analog input/output pins for physical measurements and control through
different kinds of connectors. It can be powered through a USB cable or an external
battery and accepts input voltages of roughly 7 to 20 V. The design and production
files for the hardware are also available. The name "Uno" means "one" in Italian and
marked the release of the Arduino IDE 1.0. The board is programmed over a serial
connection through its USB connector.

55.5.2 Power Supply Circuit

A device that supplies electric power to an electrical load is referred to as a power
supply. Most electronic equipment is driven by such supplies, which provide the
electricity that the digital circuits need. A power supply may be a simple linear design
or a regulated switching design that delivers power at the required rating; the latter
can be more compact for contemporary gadgets but also more complicated.

55.5.2.1 Linear Power Supply

A linear power supply uses a transformer to convert the voltage from the mains into a
different, usually lower, voltage. To produce DC, a rectifier is employed, and a filter
smooths the pulsating current coming from the rectifier. Tiny periodic deviations from
the smooth direct current can remain; these ripples occur at a frequency related to the
AC power frequency.

55.5.3 Temperature Sensor

The temperature and humidity sensor is a low-cost digital device that provides high
reliability and long-term stability. It uses a capacitive humidity sensing element and a
thermistor to measure the surrounding air and outputs the readings as a digital signal.
Libraries and sample code are available for the device, and it is straightforward to
connect to a controller; only a pull-up resistor is needed for the simple connection.
Its high reliability and excellent long-term stability come from its digital signal
acquisition technique and its temperature and humidity sensing technology; a minimal
reading sketch is given below.
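The sketch below assumes a DHT11-type sensor on digital pin 2 and the commonly used Adafruit DHT library, neither of which is specified in the original text.

```cpp
// Sketch assuming a DHT11 temperature/humidity sensor on digital pin 2
// and the Adafruit "DHT sensor library" (DHT.h); both are assumptions.
#include <DHT.h>

#define DHTPIN 2
#define DHTTYPE DHT11

DHT dht(DHTPIN, DHTTYPE);

void setup() {
  Serial.begin(9600);   // serial monitor for readings
  dht.begin();          // start the sensor
}

void loop() {
  float humidity = dht.readHumidity();        // relative humidity in %
  float temperature = dht.readTemperature();  // temperature in degrees C

  if (isnan(humidity) || isnan(temperature)) {
    Serial.println("Failed to read from DHT sensor");
  } else {
    Serial.print("Temperature: ");
    Serial.print(temperature);
    Serial.print(" C, Humidity: ");
    Serial.print(humidity);
    Serial.println(" %");
  }
  delay(2000);  // the DHT11 needs about 2 s between reads
}
```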

55.5.4 Pressure Sensors

Together with temperature, pressure is one of the basic physical quantities in our
surroundings. Pressure is a parameter in fluid mechanics, aerodynamics, acoustics,
hydraulics, soil mechanics, and thermodynamics. As an example of significant
industrial applications of pressure measurement, we might recall power engineering.
From a phenomenological point of view, pressure p, as a macroscopic parameter, is
defined starting from an element of force dF_G exerted perpendicularly on an element
of surface area dA_G of the wall by the fluid contained within the box, Eq. (55.1).

p = dF_G / dA_G (55.1)

The force dF_G, and hence the pressure p, acts perpendicular to the surface element
dA_G. For a fluid at rest under gravity, the hydrostatic relation is given by Eq. (55.2).

p = p0 + ρgh (55.2)

In addition, the equation of state of an ideal gas is pV = N kB T, where p is the
pressure, N the number of molecules, T the temperature, V the volume, and kB the
Boltzmann constant.
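As an illustrative numerical check (not taken from the original text), Eq. (55.2) can be evaluated for water at a depth of h = 1 m, assuming p0 ≈ 101.3 kPa, ρ ≈ 1000 kg/m³, and g ≈ 9.81 m/s²:

```latex
p = p_0 + \rho g h
  = 1.013\times10^{5}\,\mathrm{Pa}
    + (1000\,\mathrm{kg/m^3})(9.81\,\mathrm{m/s^2})(1\,\mathrm{m})
  \approx 1.11\times10^{5}\,\mathrm{Pa}.
```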

55.5.5 Gyro Sensor—ADXL335

Acceleration is the rate of change of velocity with time; it is a vector quantity,
having both magnitude and direction. The acceleration of an object can arise in two
ways: a change in speed or a change in direction, and both can change at the same
time. The ADXL335 is a small three-axis accelerometer, a device often used for
measuring the acceleration of any object. It measures acceleration in the form of
analog outputs along three axes, X, Y, and Z. It is a low-noise, low-power device.
Once it is used for acceleration sensing, it can be interfaced with any variety of
controllers, including a microcontroller or an Arduino; a minimal reading sketch is
given below.
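The sketch below shows one way to read the ADXL335 from an Arduino; the analog pin assignments, the zero-g offset, and the counts-per-g sensitivity are rough assumptions for illustration.

```cpp
// Minimal Arduino-style sketch for reading an ADXL335 accelerometer.
const int xPin = A0;   // assumed wiring: X -> A0
const int yPin = A1;   // assumed wiring: Y -> A1
const int zPin = A2;   // assumed wiring: Z -> A2

void setup() {
  Serial.begin(9600);
}

void loop() {
  int xRaw = analogRead(xPin);  // raw 0-1023 ADC value for the X axis
  int yRaw = analogRead(yPin);  // raw 0-1023 ADC value for the Y axis
  int zRaw = analogRead(zPin);  // raw 0-1023 ADC value for the Z axis

  // Convert to approximate g values; the zero-g offset (~512 counts) and
  // sensitivity (~67 counts per g with a 5 V reference) are rough assumptions.
  float xG = (xRaw - 512) / 67.0;
  float yG = (yRaw - 512) / 67.0;
  float zG = (zRaw - 512) / 67.0;

  Serial.print("X: "); Serial.print(xG);
  Serial.print(" Y: "); Serial.print(yG);
  Serial.print(" Z: "); Serial.println(zG);
  delay(500);
}
```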

55.5.6 Node MCU

The Node MCU is based on the ESP8266 Wi-Fi system-on-chip from Espressif Systems;
unlike a bare module, the board includes a USB-to-serial chip for programming. It is
breadboard-friendly and can be powered via its micro-USB connector. Node MCU is a very
small, low-cost device made to connect things to the Internet at minimal expense. It
can be programmed directly over a data cable using the Arduino IDE and simple code,
and it can set up an Internet connection and much more. Node MCU plays a role similar
to an Ethernet module: it combines the functions of access point and station with a
microcontroller. These capabilities make the Node MCU an effective device for
networking; a minimal Wi-Fi connection sketch is given below.
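The sketch below joins a Wi-Fi network from a NodeMCU; the network credentials are placeholders, and the ESP8266WiFi library is the one shipped with the ESP8266 Arduino core, an assumption not stated in the text.

```cpp
// Minimal NodeMCU (ESP8266 Arduino core) sketch that joins a Wi-Fi network.
#include <ESP8266WiFi.h>

const char* ssid = "your-network";      // hypothetical credentials
const char* password = "your-password";

void setup() {
  Serial.begin(115200);
  WiFi.mode(WIFI_STA);          // station mode, as described above
  WiFi.begin(ssid, password);   // start connecting

  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
  }
  Serial.println();
  Serial.print("Connected, IP address: ");
  Serial.println(WiFi.localIP());
}

void loop() {
  // Sensor readings could be sent to an IoT platform from here.
}
```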

55.5.7 Digital Display

A liquid crystal display (LCD) is a flat-panel display device that acts as a digital
display. It makes use of the light-modulating properties of liquid crystals, which do
not emit light directly. LCDs can show arbitrary pictures or fixed images that may be
made visible or invisible, and they use the same basic technology of images built up
from small pixels. An LCD can be small and is straightforward to interface with a small
controller because it is a fixed monitor; such controller-driven modules can show
information smoothly as a single unit, which is why this type of display is preferred
over several other display types. LCDs are used in a broad selection of applications
such as electronic device displays, instrument consoles, and aircraft cockpits, as well
as consumer equipment such as digital media players, gaming devices, clocks, watches,
and calculators. They have replaced cathode ray tube (CRT) displays in most
applications, are available in a much broader range of screen sizes than CRT displays,
and, since they do not use phosphors, do not suffer from image burn-in.

55.5.8 Buzzer

A buzzer is a signaling device. It is used to produce alerts in cars, household appliances such as microwave ovens, and game shows. It has a switch or sensor connected to a control unit that determines whether a button has been pushed or a preset time has elapsed; it usually illuminates a light on the corresponding button or control panel and raises a warning in the form of a continuous or intermittent buzzing or beeping sound. Initially, the buzzer was an electromechanical device, equivalent to an electric bell without the metal gong, and these devices were often fixed to a building wall or to the ceiling, using the wall or ceiling as a sounding board. In game shows, the buzzer also acts as a "lockout system": when one person signals ("buzzes in"), all others are locked out from signaling, and the large buzzer buttons used are often identified as "plungers." The word "buzzer" comes from the rasping noise created by these early mechanical devices, which ran on an AC line at a particular number of voltage cycles. Other sounds ordinarily used to indicate that a switch has been pressed include rings and beeps.

55.5.9 DC Motor

A DC motor is a mechanically commutated electric motor powered from direct current (DC). The stator field is stationary in space, and the current in the rotor is switched by the commutator so that the rotor field is also stationary in space. In this way, the relative angle between the stator and rotor magnetic fluxes is maintained near ninety degrees, which generates the maximum torque. The working of the DC motor is based on this action: by applying the relevant rule, it is clear that the force on every conductor tends to rotate the coil in the anticlockwise direction, and all of these forces add together to produce a driving torque that sets the coil rotating. When a conductor moves from one side of a brush to the other, the current in that conductor is reversed and, at the same time, it comes under the influence of the next pole, which has the opposite polarity. It should be noted that the function of the commutator in a motor is the same as in a generator. By reversing the current in every conductor as it passes from one pole to another, the commutator helps to develop a continuous and unidirectional torque. Motors have a rotating armature winding (the winding in which a voltage is induced), but a non-rotating armature magnetic flux, and a static field winding (the winding that produces the main magnetic flux) or a static permanent magnet. Different connections of the field and armature windings give different types of speed/torque regulation characteristics. The speed of a DC motor can be controlled by changing the voltage applied to the armature or by changing the field current. The introduction of variable resistance in the armature or field circuit allows speed control. Modern DC motors are usually controlled by power electronics systems.

55.6 Conclusion

A smart cradle is integrated with an infant-monitoring system through IoT. It has been designed and developed to display the infant's vital parameters such as crying condition, humidity, and ambient temperature. Arduino and NodeMCU serve as the main controller boards in the circuit design of the project. The system has an integrated online connection, which realizes the IoT concept. The NodeMCU was chosen for the IoT functionality because of its simplicity and open-source nature.

Chapter 56
Pressure Prediction System in Lung
Circuit Using Deep Learning

Nilesh P. Sable, Omkar Wanve, Anjali Singh, Siddhesh Wable, and Yash Hanabar

Abstract A massive number of patients infected with the SARS-CoV-2 and Delta variants of COVID-19 have developed acute respiratory distress syndrome (ARDS), which needs intensive care, including mechanical ventilation. Due to the huge number of patients, the workload and stress on healthcare infrastructure and related personnel have grown exponentially. This has resulted in a huge demand for innovation in the field of automated health care, which can help reduce the stress on the current healthcare infrastructure. This work offers a solution to the problem of pressure prediction in mechanical ventilation. The algorithm suggested by the researchers tries to predict the pressure in the respiratory circuit for various lung conditions. Prediction of pressure in the lungs is a type of sequence prediction problem, and long short-term memory (LSTM) is the most efficient solution to such problems. Due to its ability to selectively remember patterns over the long term, LSTM has an edge over the normal RNN: RNNs are good for short-term patterns, but for sequence prediction problems, LSTM is preferred.

Keywords Pressure prediction · RNN · LSTM · PyTorch · COVID-19 pandemic

56.1 Introduction

Many diseases, such as pneumonia, heart failure, and COVID-19, lead to lung failure
for many different reasons. A person who cannot breathe on his own or has difficulty
in breathing needs to be given some external support to help with breathing [1]. This
help is provided to the patient with the help of a mechanical ventilator, and once
the patient is stable, they are weaned off the ventilator. A mechanical ventilator is a
machine that helps pump oxygen into a person’s body through a tube that goes in the

N. P. Sable (B)
Bansilal Ramnath Agarwal Charitable Trust’s Vishwakarma Institute of Information Technology,
Pune, India
e-mail: drsablenilesh@gmail.com
O. Wanve · A. Singh · S. Wable · Y. Hanabar
JSPM’s Imperial College of Engineering and Research, Wagholi, Pune, India


mouth and down to the windpipe. According to the patient’s condition and needs,
the doctor programs the ventilator to push air when the patient needs help.
Mechanical ventilators use a proportional–integral–derivative (PID) control algo-
rithm to automatically adjust the oxygen concentration in the patient according to
the patient’s requirements. These controllers use several physiological data points
of the patient, such as the breathing frequency and oxygen level, to help the patient
get stable and provide an appropriate amount of oxygen. The input to the system
includes the carbon dioxide level, oxygen level, and air resistance. These ventilators
help adjust the breathing frequency in a clinically appropriate manner in response to
the changes in the patient’s breathing frequency. Mechanical ventilators are clinically
controlled and operated by doctors and nurses who are trained to handle them.
However, PID controllers have certain limitations: for processes that are integrating and have large time delays, performance is poor, and small changes or deviations are not reflected easily. The purpose of the proposed system is to eliminate these limitations. Using the concepts of deep learning, these limitations can be almost entirely removed and the system can work with greater efficiency. The proposed technology is budget-friendly for hospitals. The amount of manpower required to ventilate a single patient will be reduced, which can be an enormous benefit, especially in pandemic situations.
According to the coronavirus resource of Johns Hopkins Medicine, 2.2% of the people affected by COVID-19 worldwide have died due to acute respiratory syndrome, based on data analyzed since November 2019 [2]. Ground-glass opacity (GGO) has been observed, and COVID-19 variants, especially the Delta variant, cause pneumonia in both lungs [3]. A large number of people infected with Delta and other variants have acute respiratory distress syndrome (ARDS), and they need high-level medical facilities such as invasive mechanical ventilation. The effect of COVID-19 variants on the immune system, ground-glass opacity, and the different neoplastic changes in the lungs caused by SARS-CoV-2 and other variants are the key focus areas [4].

56.2 Related Works

In [5], the authors suggest that it is convenient to use the HMM algorithm to predict patient–ventilator behavior in sedated patients and to estimate the probability that the number of asynchrony events exceeds a given threshold value. Unlike other studies based on very limited observation periods in patients with specific conditions, the authors analyzed the whole period of mechanical ventilation in a wide population of ICU patients with a variety of critical illnesses.
In [6], different mechanical ventilation settings are analyzed depending on the condition of the particular patient's lungs; the determination of these parameters depends on the patient's observed medical history and the experience of the clinicians involved. In that research, a graded particle swarm optimizer (GPSO) was used for the analysis of patients' medical data. The main limitation of the study is that all patient data were recorded manually, and the ventilator parameter readings were taken at random, non-continuous intervals of time.
The works in [7, 8] present a retrospective multicentre study of all COVID-19 patients who presented to the emergency room at Beaumont Health, the largest healthcare system in Michigan. The authors developed two separate decision tree-based ensemble ML models. The study has two objectives: the first is the prediction of mechanical ventilation and the second is mortality.
In [9], the authors designed an end-to-end pipeline for learning a controller and improved upon PID controllers for tracking ventilator pressure waveforms. All improvement measures are considered with respect to the ISO standards for ventilator support parameters.

56.3 Recommended Algorithm

56.3.1 Recurrent Neural Networks

To understand an RNN, we first need to understand what a neural network is: a neural network is a group of algorithms that tries to find patterns in the provided data in a way loosely inspired by the human brain. Neural networks recognize patterns in numerical or vector data, so other types of data such as audio, video, and images first need to be translated into mathematical form. An RNN is a progression of the feedforward type of neural network in which the output from the last computation is supplied as input for the next computation; this is where the word "recurrent" comes from. In RNNs, the output from the last computation is copied and stored as a hidden state. For the next computation, both the input for that step and the value in the hidden state are taken into consideration. The dataset available in [10] contains data in the form of sequences, and RNNs are found to be among the better models for dealing with such sequential data (Fig. 56.1).
The formula for the current state is

Fig. 56.1 Operational principle of RNNs



h_t = f(h_{t−1}, I_t)    (56.1)

Applying the activation function,

h_t = tanh(W_ph h_{t−1} + W_ic I_t)    (56.2)

In the above equation, which uses the tanh activation function, W denotes a weight, h the hidden state, W_ph the weight applied to the previous hidden state, and W_ic the weight applied to the present input.

O_t = W_ho h_t    (56.3)

where O_t is the output state and W_ho is the weight at the output state. The advantage of an RNN is that it can model a sequence of data so that each output depends on the previous data. However, even with an activation function, it does not perform well for long sequences of data. This is where the LSTM framework comes into play.
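As a brief illustration of the recurrence in Eqs. (56.1)–(56.3), the sketch below uses PyTorch's nn.RNN on dummy data. The feature count, hidden size, and sequence length are assumptions for illustration only, not the configuration used by the authors.

```python
import torch
import torch.nn as nn

# Illustrative sizes only: 4 input features per time step, 64 hidden units.
rnn = nn.RNN(input_size=4, hidden_size=64, batch_first=True)
head = nn.Linear(64, 1)            # O_t = W_ho h_t  (Eq. 56.3)

x = torch.randn(8, 80, 4)          # batch of 8 sequences, 80 time steps each
h0 = torch.zeros(1, 8, 64)         # initial hidden state h_0

# nn.RNN applies h_t = tanh(W_ih x_t + W_hh h_{t-1} + b) at every step (Eq. 56.2)
hidden_seq, h_last = rnn(x, h0)
pressure_pred = head(hidden_seq)   # one predicted value per time step
print(pressure_pred.shape)         # torch.Size([8, 80, 1])
```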

56.3.2 Long Short-Term Memory Network

The LSTM network is a modern version of the recurrent neural network that has a memory, making it easier to process long sequences of data. The RNN's problem of insufficient memory to retain previously stored hidden-state information over the long term is solved here. The LSTM operates somewhat like a logic circuit, and it has three gates that help it in its decision-making process (Fig. 56.2).

Fig. 56.2 Long short-term memory architecture



The input gate, forget gate, and output gate are the three gates in an LSTM. The notation used here is X_t for the input and H_t for the hidden state, similar to the RNN. To obtain the values of all three gates, the sigmoid function is employed, so the values range from 0 to 1.

I_t = σ(X_t W_xi + H_{t−1} W_hi + b_i)    (56.4)

F_t = σ(X_t W_xf + H_{t−1} W_hf + b_f)    (56.5)

O_t = σ(X_t W_xo + H_{t−1} W_ho + b_o)    (56.6)

The mathematical formulas above are used to calculate the values of the three gates, where I, F, and O stand for the input, forget, and output gates, respectively, and X, H, W, and b represent the input data, hidden state, weights, and biases, respectively. Next are the memory cell and the candidate memory cell C̃. The value of the candidate is computed similarly to the three gates, except that the tanh function is used in place of the sigmoid function, so its value lies between −1 and 1.
In the LSTM, the forget gate decides how much of the memory from the previous computation is kept, and the input gate decides how much of the candidate memory is added during the current computation. The formula below gives a mathematical representation of this:

C_t = F_t ⊙ C_{t−1} + I_t ⊙ C̃_t    (56.7)

The last thing remaining to compute is the hidden state. The hidden state is the tanh of the memory cell, which lies between −1 and 1, multiplied by the output gate, which lies between 0 and 1. When the output gate is near 1, it passes everything on, and when it is near 0, it passes nothing on.

H_t = O_t ⊙ tanh(C_t)    (56.8)
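A minimal sketch of how the gate equations (56.4)–(56.8) are applied step by step follows, using PyTorch's nn.LSTMCell on dummy data; the sizes are illustrative assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn

# nn.LSTMCell implements Eqs. (56.4)-(56.8) internally: the three gates,
# the candidate memory, and the cell and hidden state updates.
cell = nn.LSTMCell(input_size=4, hidden_size=64)   # sizes are illustrative

x = torch.randn(8, 80, 4)                          # batch, time steps, features
h = torch.zeros(8, 64)                             # H_0
c = torch.zeros(8, 64)                             # C_0

for t in range(x.size(1)):                         # unroll over the sequence
    h, c = cell(x[:, t, :], (h, c))                # gates + Eqs. (56.7), (56.8)

print(h.shape, c.shape)                            # torch.Size([8, 64]) each
```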

56.3.3 MLP Multilayer Perceptron

Understanding the multilayer perceptron network helps us understand what underlies the higher-level models of deep learning. Simple regression problems can be easily solved with the help of an MLP. A multilayer perceptron tries to remember patterns in sequential data; because of this, processing a multidimensional dataset requires a large number of parameters (Fig. 56.3).
The input layer, the hidden layer, and the output layer are the three layers of nodes in an MLP. For model training, an MLP typically employs the supervised learning technique known as backpropagation. The two main characteristics that distinguish an MLP from a linear perceptron are its multiple layers and nonlinear activation.

Fig. 56.3 Multilayer perceptron (MLP)
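For illustration, a small multilayer perceptron for a regression target could be written in PyTorch as below; the layer widths and the four input features are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

# Input -> hidden -> output layers with nonlinear activations; the nonlinearity
# is what distinguishes an MLP from a single linear perceptron.
mlp = nn.Sequential(
    nn.Linear(4, 128),   # input layer: 4 assumed features per sample
    nn.ReLU(),
    nn.Linear(128, 64),  # hidden layer
    nn.ReLU(),
    nn.Linear(64, 1),    # output layer: one regression value (e.g. pressure)
)

out = mlp(torch.randn(16, 4))   # a batch of 16 samples
print(out.shape)                # torch.Size([16, 1])
```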

56.3.4 K-Fold Technique

In most cases, the holdout method is employed: the dataset is divided into two parts, a training set and a testing set, and error metrics are used to check whether the model trained on the training data performs well on the test data. Can we, however, rely on this method alone? The answer is no. The evaluation obtained this way depends heavily on which data points fall into the training set and which into the test set, and thus on the way the data are split. Let us see how K-Fold is better than the conventional holdout method.
K-Fold is a validation technique in which the data are divided into k subsets and the holdout method is applied k times. In the process, each subset is used once as a test set while the remaining subsets are used for training. The mean error is then calculated over all k outcomes, which is more reliable than the standard holdout method. Group K-Fold is an advanced version of the K-Fold technique which ensures that data from the same sample group are not represented in both the test set and the train set. Consider an example: a dataset contains marks for different subjects, and each subject has 50 samples of marks from 50 students. In such a situation, the model may learn highly person-specific features and then fail to generalize to every new subject. Group K-Fold is considered the best method to detect this kind of overfitting situation. Because the dataset used here encompasses 50 different measurements for the same patient, the same applies in our case. This requires first grouping, then folding the set, and then forming every possible combination of train and test set to train the model and make it more robust and accurate. So Group K-Fold is the most suitable method to follow in such cases.
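A minimal sketch of Group K-Fold with scikit-learn on synthetic data follows; the array shapes and the use of a per-patient (or per-breath) id as the group label are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Hypothetical arrays: X holds features, y the target, and groups holds a
# patient/breath id so the same subject never appears in train and test folds.
X = np.random.rand(100, 4)
y = np.random.rand(100)
groups = np.repeat(np.arange(20), 5)      # 20 groups of 5 samples each

gkf = GroupKFold(n_splits=5)
for train_idx, test_idx in gkf.split(X, y, groups=groups):
    # the ids in the train and test folds never overlap
    assert set(groups[train_idx]).isdisjoint(groups[test_idx])
```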

56.3.4.1 Cross-Validation

It is not good practice to learn the parameters of a model on the training data and test it on the same data. A model that is built by just repeating the labels of the samples it has already seen would give perfect results on those samples but may fail to predict anything useful on unseen data. This kind of problem is known as overfitting. To overcome overfitting, it is suggested to hold out part of the available data as a test set (X_test, y_test) when performing a supervised learning experiment. Then comes the most important part, training the model and experimenting on it, because the experiment phase is an integral part of model development; even commercial machine learning usually starts experimentally. Figure 56.4 shows a flowchart of the cross-validation workflow during the training phase.

56.4 Libraries

56.4.1 Pandas

Pandas is a powerful Python library that provides all the important tools, along with fast and flexible data structures, designed to make working with "relational" or "labeled" data both easy and intuitive. Pandas provides fundamental building blocks for real-world data analysis in Python. It is the most suitable library for handling datasets, whether the train set or the test set. Datasets are handled by Pandas through the Series and DataFrame data structures, and data cleaning and preprocessing are carried out with Pandas.
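A minimal sketch of loading and inspecting the training data with Pandas is shown below; the file name "train.csv" and the column names follow the Kaggle ventilator dataset referenced in [10] (breath_id, u_in, u_out, pressure, ...) and are assumptions here.

```python
import pandas as pd

# Hypothetical file name; the competition data ships as a CSV.
train = pd.read_csv("train.csv")

print(train.head())                  # quick look at the first rows
print(train.isna().sum())            # basic cleaning check: missing values
per_breath = train.groupby("breath_id")["pressure"].mean()  # simple aggregation
```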

56.4.2 NumPy

NumPy is a Python library whose name stands for Numerical Python. As the name suggests, it is used to carry out numerical operations. It contains multidimensional array objects and a set of tools, utilities, and routines for processing those arrays. NumPy works on arrays as an alternative to Python lists; compared with arrays, processing lists is heavy and time-consuming. Using NumPy, mathematical and logical operations can be performed on arrays, which brings computational power to the machine learning process.
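A tiny illustration of vectorised mathematical and logical operations on NumPy arrays (the values are made up):

```python
import numpy as np

u_in = np.array([0.0, 3.2, 7.5, 12.1])      # sample inflow values
u_out = np.array([0, 0, 0, 1])              # exhalation valve flag

scaled = (u_in - u_in.mean()) / u_in.std()  # vectorised arithmetic, no Python loop
inhaling = u_out == 0                       # logical operation -> boolean mask
print(scaled[inhaling])                     # values while the valve is closed
```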

56.4.3 Data Visualization

To analyze and interpret the available data, data visualization is a required process in machine learning and deep learning development [11]. Data visualization is the representation of data using plots, histograms, or boxplots so that a better understanding can be gained. It converts the available data into pictorial graphs, which helps in analyzing the data and making predictions. Matplotlib and Seaborn are the libraries considered the backbone of data visualization in Python programming. In data science, visualization makes complex data more handy and accessible.

56.4.4 Matplotlib

Matplotlib in Python is a powerful tool used for the graphical representation of data, with the help of other libraries like Pandas and NumPy. It is like performing MATLAB functions and methods in Python. It is used for generating statistical inferences and plotting 2D graphs of arrays, and sometimes 3D graphs as well. Matplotlib was used to create the graph describing the variation of u_in, u_out, and pressure for a particular patient over different time steps. Figure 56.5 depicts a graph plotting example using Matplotlib.

Fig. 56.5 Change in u_in, u_out, and pressure for a particular patient over different time steps
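A sketch of how a plot like Fig. 56.5 could be produced with Matplotlib, assuming the `train` DataFrame from the Pandas example above and a hypothetical breath_id of 1:

```python
import matplotlib.pyplot as plt

one = train[train["breath_id"] == 1]     # one breath (hypothetical id)

plt.figure(figsize=(8, 4))
plt.plot(one["time_step"], one["u_in"], label="u_in")
plt.plot(one["time_step"], one["u_out"], label="u_out")
plt.plot(one["time_step"], one["pressure"], label="pressure")
plt.xlabel("time step")
plt.legend()
plt.show()
```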

56.4.5 PyTorch

PyTorch is one of the significant libraries used for building machine learning and deep learning models. It is primarily used for building application models that run on GPUs and CPUs. It is built mainly around Python, object-oriented programming, and the torch library, which supports computations on tensors (arrays that can be processed on graphics processing units). It is an optimized tensor library. RNN and LSTM modules, both of which are part of the PyTorch library, were used to build the model in this study.
Tensor—A tensor is the fundamental unit of data in PyTorch. It can be a number, vector, matrix, or any n-dimensional array. It is like a NumPy array, but it can be processed on graphics processing units.
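A few lines illustrating basic tensor handling in PyTorch:

```python
import torch

t = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # a 2-D tensor (matrix)
print(t.shape, t.dtype)                      # torch.Size([2, 2]) torch.float32

if torch.cuda.is_available():                # tensors can be moved to a GPU
    t = t.to("cuda")

n = t.cpu().numpy()                          # round trip to a NumPy array
```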

56.4.5.1 Feature Engineering

Feature engineering is the process of using domain knowledge to extract and transform the most relevant variables from raw data while building a predictive model with supervised learning, deep learning, or statistical modeling. The main purpose of the feature engineering phase is to enhance the performance of machine learning (ML) and deep learning (DL) algorithms (Fig. 56.6).

56.4.5.2 Datasets and Data Loaders

When processing data samples, researchers frequently run into the issue of code becoming messy and complex, which makes the model difficult to maintain. It is considered good practice to decouple dataset code from model training code for better modularity and readability, a bit like the normalization and denormalization process in relational databases. PyTorch provides two data primitives: torch.utils.data.DataLoader and torch.utils.data.Dataset. Researchers use these primitives to access both pre-loaded datasets and their own data. The Dataset primitive is mainly used to store the samples and their corresponding labels, while the DataLoader primitive wraps an iterable around the Dataset, which allows easy access to the samples.
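A minimal sketch of the two primitives on random dummy data follows; the class name and the tensor shapes are assumptions for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class VentilatorDataset(Dataset):
    """Hypothetical wrapper pairing feature sequences with pressure targets."""

    def __init__(self, features, targets):
        self.features = torch.as_tensor(features, dtype=torch.float32)
        self.targets = torch.as_tensor(targets, dtype=torch.float32)

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        return self.features[idx], self.targets[idx]

ds = VentilatorDataset(torch.randn(100, 80, 4), torch.randn(100, 80))
loader = DataLoader(ds, batch_size=16, shuffle=True)

for xb, yb in loader:            # batches are assembled by the DataLoader
    print(xb.shape, yb.shape)    # torch.Size([16, 80, 4]) torch.Size([16, 80])
    break
```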

56.4.5.3 Transformers

Transformation means manipulating and rearranging the predictor variables to enhance model performance. The available data are not in the final processed state required for training a machine learning model, so transforms are used to perform some manipulation of the data and make it suitable for training. All TorchVision datasets have two parameters for this purpose—transform to modify the features and target_transform to modify the labels.
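The same idea can be applied to custom data; the sketch below shows a hypothetical dataset that accepts transform and target_transform callables, mirroring the two parameters described above.

```python
import torch
from torch.utils.data import Dataset

class TransformedDataset(Dataset):
    """Illustrative dataset applying `transform` to the features and
    `target_transform` to the labels (both names mirror the text above)."""

    def __init__(self, features, targets, transform=None, target_transform=None):
        self.features, self.targets = features, targets
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        x, y = self.features[idx], self.targets[idx]
        if self.transform:
            x = self.transform(x)                  # e.g. normalisation
        if self.target_transform:
            y = self.target_transform(y)           # e.g. scaling the label
        return x, y

ds = TransformedDataset(
    torch.randn(10, 4), torch.randn(10),
    transform=lambda x: (x - x.mean()) / (x.std() + 1e-8),
    target_transform=lambda y: y / 100.0,
)
```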

56.4.6 Scikit Learn

Scikit-learn is a widely used machine learning library for the development of supervised and unsupervised learning models. The library provides a large set of tools and utilities covering many tasks such as data preprocessing, model selection, and model fitting. Scikit-learn gives access to many built-in machine learning models and algorithms, known as estimators, and the well-known fit method is used to fit estimators to datasets.

56.5 Conclusion

This study will help develop a machine learning algorithm that assists clinicians in predicting pressure levels in the lungs of patients with varying body conditions while they are on mechanical ventilation. This will reduce the workload of healthcare workers, which keeps increasing in conditions such as pandemics. Predictions for mechanical ventilation become easier with parameters such as age, resistance in the respiratory tract, and lung compliance, which helps clinicians in their decision-making process.

References

1. Mahalle, P.N., Sable, N.P., Mahalle, N.P., Shinde, G.R.: Predictive analytics of COVID-19 using
information, communication, and technologies. https://doi.org/10.20944/preprints202004.025
7.v1
2. Coronavirus Stats. https://www.worldometers.info/coronavirus/
3. Sadhukhan, P., Ugurlu, M., Hoque, M.: Effect of COVID-19 on lungs: focusing on prospective
malignant phenotypes. Cancers 12, 3822 (2020)
4. Parums, D.: Editorial: revised World Health Organization (WHO) terminology for variants of
concern and variants of interest of SARS-CoV-2. In: Medical Science Monitor, vol. 27 (2021)
5. Marchuk, Y., Magrans, R., Sales, B., Montanya, J., López-Aguilar, J., de Haro, C., Gomà, G.,
Subirà, C., Fernández, R., Kacmarek, R., Blanch, L.: Predicting patient-ventilator asynchronies
with hidden Markov models. Scien. Rep. 8 (2018)
6. Oruganti Venkata, S., Koenig, A., Pidaparti, R.: Mechanical ventilator parameter estimation
for lung health through machine learning. Bioengineering 8, 60 (2021)
7. Yu, L., Halalau, A., Dalal, B., Abbas, A., Ivascu, F., Amin, M., Nair, G.: Machine learning
methods to predict mechanical ventilation and mortality in patients with COVID-19. PLoS
ONE 16, e0249285 (2021)
8. Mamandipoor, B., Frutos-Vivar, F., Peñuelas, O., Rezar, R., Raymondos, K., Muriel, A., Du,
B., Thille, A., Ríos, F., González, M., del-Sorbo, L., del Carmen Marín, M., Pinheiro, B.,
Soares, M., Nin, N., Maggiore, S., Bersten, A., Kelm, M., Bruno, R., Amin, P., Cakar, N., Suh,
G., Abroug, F., Jibaja, M., Matamis, D., Zeggwagh, A., Sutherasan, Y., Anzueto, A., Wernly,
B., Esteban, A., Jung, C., Osmani, V.: Machine learning predicts mortality based on analysis
of ventilation parameters of critically ill patients: multi-centre validation. In: BMC Medical
Informatics and Decision Making, vol. 21 (2021)
9. Sayed, M., Riaño, D., Villar, J.: Predicting duration of mechanical ventilation in acute
respiratory distress syndrome using supervised machine learning. J. Clin. Med. 10, 3824 (2021)
10. Data Set: Google Brain—Ventilator Pressure Prediction | Kaggle
11. Ghodke, S.V., Sikhwal, A. Sable, D.N.: Novel approach of automatic disease prediction and
regular check-up system using Ml/Dl. In: Design Engineering, pp. 2885–2896 (2021)
Chapter 57
Twitter Sentiment Analysis Using
Machine Learning and Deep Learning

Pallavi Tiwari, Deepak Upadhyay, Bhaskar Pant, and Noor Mohd

Abstract Twitter is a widely used microblogging site where millions of people share their feelings, views, or opinions regarding different things, be it a product, a service, or an event. Huge volumes of data are being produced every hour because of the increase in the number of users. These data are unstructured in nature, and thus it is a difficult task to analyze them and extract meaning from them. This paper concentrates on sentiment analysis of Twitter data. We perform text mining, or opinion mining, to obtain a better understanding of public sentiment. In this paper, Python is used to acquire, preprocess, and analyze tweets; after that, three different machine learning algorithms and one deep learning algorithm are used for sentiment analysis, and a comparison is made among them to determine which approach or model gives the best accuracy.

Keywords Twitter · Sentiment analysis · Machine learning · Deep learning · Python

57.1 Introduction

Sentiment analysis is the approach of computationally identifying and categorizing the sentiment expressed in a piece of text, and of predicting whether the user's sentiment regarding a specific topic, service, or product is positive, negative, or neutral.
During the decision-making process in an industry, or when planning a campaign, the most important piece of information has always been "what other people think." The success of a company or a product, of a movie, an event, or a concert, or even the win of a particular candidate in an election, depends directly on people: if customers like the product, candidate, movie, or event, it is a

P. Tiwari (B) · D. Upadhyay · B. Pant · N. Mohd


Department of Computer Science and Engineering, Graphic Era Deemed to be University,
Dehradun, India
e-mail: Pallavitiwari_20061041.cse@geu.ac.in
D. Upadhyay
e-mail: deepakupadhyay.ece@geu.ac.in


success; if not, it certainly needs to be improved by making some changes. But the question arises of how to analyze the customer. Earlier, surveys were used to learn people's opinions, and this was expensive; now, one way of analyzing customers is to analyze their sentiments, and this is where sentiment analysis comes into the picture.
With the rise of Web 2.0 platforms such as Twitter, Facebook, blogs, Instagram, and various others, people now have access to platforms with exceptional reach and the power to share their opinions, views, experiences, and reviews about any product or service. Nowadays, people are so accustomed to social media that they cannot envision even a single day without it. Among these platforms, Twitter has become one of the most popular microblogging sites, where millions of people exchange their thoughts in the form of tweets. Tweets are brief messages limited to 140 characters each. People express their joy, rage, grief, and every other emotion on Twitter, which makes it one of the richest sources of opinionated data; Twitter has therefore become a sentiment-rich platform. The data produced are unstructured in nature, and thus working with them is a challenging task.
In this paper, three machine learning classifiers and a recent neural network model, LSTM, are used to analyze people's sentiments. A comparison among these methods is made based on their accuracy and f1_score to conclude which technique gives the better accuracy.

57.2 Literature Review

Poornima et al. in their paper use the concept of polarity generated from term frequency. They compared the accuracy of three machine learning algorithms, multinomial Naive Bayes, SVM, and logistic regression, to check which one provides higher accuracy for sentiment analysis; the best result, an accuracy of 86.23%, was obtained by logistic regression applied with n-gram and bigram models [1]. The authors in [2] used the machine learning classifiers Naïve Bayes, SVM, and maximum entropy and compared their accuracy and precision. The result was that Naïve Bayes performed better than the other two algorithms, with an accuracy of 86% and a precision of 88.695652%, while in some cases maximum entropy was effective. In [3], Zahoor and Rohilla worked with unsupervised machine learning, that is, a rule-based or lexicon-based approach. They concluded that without supervised algorithms, i.e., classifying the data directly without training and testing, the results can be inaccurate or can change when short forms of sentences are used. The authors in [4] worked with fine-grained sentiment analysis. They used three algorithms, SVM, decision tree, and random forest, to analyze the sentiments and concluded that decision tree and random forest, with accuracies of approximately 99%, outperform the SVM algorithm. In [5], NLTK corpora resources were used for the Twitter dataset. The authors aim to explain Twitter sentiment analysis with ordinal regression using four different machine learning classifiers: multinomial logistic regression, support vector regression, decision tree, and random forest. Among these four techniques, the decision tree classifier produced the highest accuracy, 91.81%. Anand et al. discuss the general procedure of Twitter sentiment analysis and how they used Python and its libraries to analyze Twitter data regarding important decisions taken by the government, such as how people react to GST or BHIM and their sentiments about the same [6].
Goel et al. proposed a model using deep learning techniques such as deep neural networks [7]. They used an RNN algorithm for sentiment analysis and, for the purpose of comparison, also used machine learning algorithms, concluding that the accuracy of the RNN is better than that of the machine learning algorithms. Their results show that for scanning negative comments, the LSTM model gives a score of 0.34 while the machine learning algorithms give a score of 0.54. In [8], sentiment analysis on Twitter is performed with three different kinds of methods, namely machine learning, polarity-based, and deep learning models. The authors used seven different machine learning techniques and a voting-based classification system. The accuracy of the machine learning classifiers was between 81 and 91%, and deep learning models like CNN-RNN and LSTM produced accuracies between 85 and 97%. Talpada et al. worked on sentiment analysis using deep learning and lexical-semantic methods for understanding demographic trends in telemedicine. For the lexical and sentiment models, TextBlob and VADER were used, and it was concluded that they give better predictions for general use; between the two libraries, VADER proved to be more accurate than TextBlob, and these models are not affected by domain knowledge [9]. Goularas et al. present a study comparing different deep learning techniques. They used the SemEval dataset, preprocessed it, and performed word embedding with Word2vec and Global Vectors for Words (GloVe); the paper concludes that GloVe provides better performance. For the analysis of sentiments, two neural network algorithms, CNN and LSTM, were used in different configurations, individually and in various combinations, to find better accuracy. It was concluded that these models perform much worse individually than in combination, and it was also suggested that rather than combining models, it is better to focus on providing better quality training datasets [10]. The authors in [11] proposed a model in which they used a Python crawler to collect datasets from different social networking sites, preprocessed the data, used the same word embedding tool as [10], and then analyzed the sentiments using three different deep learning models: LSTM, BiLSTM, and GRU. They present the results in terms of accuracy, precision, recall, specificity, and f1_score; the results show that BiLSTM scores highest on all these measures.
The authors in [12] proposed an ensemble classifier that combines logistic regression, SVM, Naïve Bayes, and random forest classifiers and concluded that this ensemble performs better and gives better accuracy than the individual models. They also explore data preprocessing and feature extraction in their paper, as these play a significant role in sentiment analysis. The main computational steps in their paper were first finding the polarity of the sentiment and then classifying it as positive or negative. Sharma and Ghose [13] compare the popularity of two candidates and, through sentiment analysis, predict which one is likely to win the election in 2019. They used a Twitter dataset and performed acquisition, preprocessing, analysis, and then sentiment analysis on it. From their literature review, they concluded that unsupervised algorithms are better suited than supervised ones, and they therefore used lexicon-based and dictionary-based approaches for the sentiment analysis of Twitter. Their research concludes that candidate-1 is more popular and receives more positive views from the people. The authors in [14] performed sentiment analysis on Twitter with different datasets. They compare machine learning techniques, specifically supervised versus unsupervised algorithms, and concluded that supervised techniques can have a major drawback due to the datasets provided, while the lexicon-based approach needs a good dictionary of words or it will not perform well. The authors in [15] used a specific dataset of hate and abusive speech on Twitter and were the first to carry out a comparative study on these data. They applied various machine learning techniques and deep learning models and concluded that, among the machine learning techniques, logistic regression produces better accuracy, while in deep learning, RNN with LTC generated better accuracy for "spam" tweets and CNN performs better for "hateful" tweets; hence, more work needs to be done before the accuracy of the models becomes clear.

57.2.1 Contribution

1. Current surveys have already covered a range of sentiment analysis work; this paper uses the approaches that are considered best in the above papers and compares those approaches.
2. In addition, unlike past sentiment analysis work, this paper deliberately and exhaustively analyzes sentiment on Twitter using the Python language, using the Twitter API and Kaggle for the dataset and Python libraries and packages to implement the various sentiment analysis approaches.

57.3 Methodology

The methodology used to implement the sentiment analysis and compare the models is shown in Fig. 57.1. Firstly, the Twitter dataset is used for training and testing the models; the data are then preprocessed, and the preprocessed data are used by the four models for sentiment analysis, after which a comparison is made among them to show which model performs better.

Fig. 57.1 Proposed methodology

57.3.1 Dataset

The dataset is one of the pillars of deep learning and machine learning; for training, a large dataset is needed, and open-source datasets allow everyone to easily access a huge quantity of data for model training. Datasets can be obtained from the Twitter API, Kaggle, and sometimes GitHub. Tweepy is the Python library used to access the Twitter API. The Twitter API does not give direct access to the data: a Twitter developer account must be created by answering a series of questions, and only after it is approved does Twitter provide keys such as the consumer key, access token key, and secret key [2]. These keys are unique for every user. After access is granted, the data collected from Twitter are stored in a CSV file containing the fields: created (date), text (the actual tweet), retweet, hashtag, followers, and friends [3]. The dataset used in this paper is taken from Kaggle. Figure 57.2 shows a sample of the training dataset.

57.3.2 Preprocessing

Preprocessing is an essential step in sentiment analysis. Different preprocessing steps need to be performed on the tweets before sentiment analysis so that the tweet text is ready for analysis. As Twitter data are in an unstructured format, they need to be cleaned: they are noisy and dirty, containing emoticons, mentions, hashtags, misspellings, and much more, and therefore the datasets need to be standardized by preprocessing.

Fig. 57.2 Sample training dataset

In the preprocessing step, each tweet is cleaned by the following steps: (i) "@" mentions are removed, (ii) "#" symbols are removed, (iii) retweets, numbers, and hyperlinks are removed, and (iv) any punctuation mark except "?", "." and "!" is removed. These steps are applied to the training, test, and health tweet datasets. Figure 57.3 shows the preprocessed data.
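A minimal sketch of cleaning steps (i)–(iv) with Python regular expressions follows; the exact patterns are assumptions, since the paper does not list its regular expressions.

```python
import re

def clean_tweet(text: str) -> str:
    """Apply the cleaning steps (i)-(iv) described above; a minimal sketch."""
    text = re.sub(r"@\w+", "", text)                      # (i) remove @mentions
    text = re.sub(r"#", "", text)                         # (ii) drop the '#' symbol
    text = re.sub(r"\bRT\b|https?://\S+|\d+", "", text)   # (iii) retweets, links, numbers
    text = re.sub(r"[^\w\s?.!]", "", text)                # (iv) punctuation except ? . !
    return text.strip()

print(clean_tweet("RT @user: loving the #sunshine!! https://t.co/xyz 100%"))
# -> "loving the sunshine!!"
```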

57.3.3 Sentiment Analysis

Decision Tree Classifier


A decision tree is basically a predictive modeling approach and a supervised algorithm, hence it needs to be trained. The overall concept is similar for any text or sentiment classification: given a set of tweets together with their labels, the algorithm calculates how much each word correlates with a particular label. For example, it would find that the word "excellent" frequently appears in documents categorized as positive, while the word "terrible" normally appears in negative documents.

Fig. 57.3 Preprocessed tweets

Fig. 57.4 Output of decision tree

By combining all such observations, it builds a model capable of assigning a label to any document [16, 17].
We used the sklearn Python package, from which the decision tree classifier, confusion_matrix, and f1_score are used to build the model, produce the confusion matrix, and calculate the f1_score. The model.score() function is used to check the accuracy, and the Seaborn package is used to plot the confusion matrix. Figures 57.4 and 57.5 show the result and confusion matrix for the decision tree.
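A sketch of this sklearn workflow on toy data is shown below; the TF-IDF vectorisation and the toy tweets and labels are assumptions (the paper does not state how text is converted to features). The same pattern applies to the logistic regression and random forest classifiers in the next subsections by swapping in LogisticRegression or RandomForestClassifier.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix, f1_score

# Toy stand-in data; the cleaned Kaggle tweets and their 0/1 labels would be
# used instead in the actual experiment.
texts = ["great service, loved it", "terrible delay, very angry",
         "excellent experience", "worst product ever"]
labels = [0, 1, 0, 1]                      # hypothetical 0/1 sentiment labels

vec = TfidfVectorizer()                    # one possible text-to-feature step
X = vec.fit_transform(texts)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X, labels)

pred = clf.predict(X)
print("accuracy:", clf.score(X, labels))
print("f1_score:", f1_score(labels, pred))
print(confusion_matrix(labels, pred))
```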
Logistic Regression
Logistic regression is a predictive analysis algorithm used for classification. It is primarily based on the idea of probability. It makes use of the "sigmoid" function, which limits the hypothesis to values between zero and one.

0 ≤ h_θ(z) ≤ 1

σ(z) = 1 / (1 + e^{−z})

Fig. 57.5 Confusion matrix of decision tree

Fig. 57.6 Output of logistic regression

We used the sklearn Python package, from which the logistic regression classifier, confusion_matrix, and f1_score are used to build the model, produce the confusion matrix, and calculate the f1_score. The model.score() function is used to check the accuracy, and the Seaborn package is used to plot the confusion matrix. Figures 57.6 and 57.7 show the result and confusion matrix for the logistic regression.
Random Forest
Random forest, also known as random decision forest, works by constructing a multitude of decision trees during training. It is basically an ensemble learning method that can be used for both classification and regression. For classification, the output is the class selected by most trees; for regression, the output is the mean or average prediction of the individual trees [18–20].
We used the sklearn Python package, from which the random forest classifier, confusion_matrix, and f1_score are used to build the model, produce the confusion matrix, and calculate the f1_score. The model.score() function is used to check the accuracy, and the Seaborn package is used to plot the confusion matrix. Figures 57.8 and 57.9 show the result and confusion matrix for the random forest.

Fig. 57.7 Confusion matrix of logistic regression

Fig. 57.8 Output of random forest

Fig. 57.9 Confusion matrix for random forest



LSTM
Long short-term memory (LSTM) networks are a type of recurrent neural network
capable of learning order dependence in sequence prediction problems. LSTMs are a
complex area of deep learning. It can be hard to get your hands around what LSTMs
are, and how terms like bidirectional and sequence-to-sequence relate to the field.
An LSTM layer consists of a set of recurrently connected blocks, known as memory
blocks. These blocks can be thought of as a differentiable version of the memory
chips in a digital computer. Each one contains one or more recurrently connected
memory cells and three multiplicative units—the input, output, and forget gates—
that provide continuous analogs of write, read, and reset operations for the cells.
The Keras library is used to implement the model. The model is trained for 10 epochs using a training dataset of 31,962 tweets. There are four layers in the model. The first layer is an embedding layer with 256,000 parameters and output shape 31 by 128. The second layer is a spatial dropout 1D layer with a 40% input drop. The LSTM layer, with an output dimension of 196, dropout of 20%, and recurrent dropout of 20%, is the third layer. The last layer is a dense layer with one output and sigmoid activation. The model was built with binary cross-entropy as the loss function, Adam as the optimizer, and accuracy as the metric. There are 612,177 trainable parameters in the final model (Figs. 57.10 and 57.11).
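A sketch of the four-layer architecture described above, written with Keras, is shown below. The vocabulary size (max_features) is an assumption chosen to match the reported embedding output shape, and the commented fit() call uses hypothetical variable names for the padded training sequences.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Embedding, SpatialDropout1D, LSTM, Dense

max_features, embed_dim, max_len = 2000, 128, 31   # vocabulary size is assumed

model = Sequential([
    Input(shape=(max_len,)),                        # padded token ids, length 31
    Embedding(max_features, embed_dim),             # output shape (None, 31, 128)
    SpatialDropout1D(0.4),                          # 40% input dropout
    LSTM(196, dropout=0.2, recurrent_dropout=0.2),  # third layer
    Dense(1, activation="sigmoid"),                 # positive / negative output
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()
# model.fit(X_train_padded, y_train, epochs=10)     # hypothetical variable names
```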

Fig. 57.10 Result of LSTM model

Fig. 57.11 Confusion matrix of LSTM



Table 57.1 Performance metrics of the classifiers

Algorithm             Accuracy   f1_score
Logistic regression   0.94       0.59
Decision tree         0.93       0.53
Random forest         0.95       0.61
LSTM                  0.94       0.60

57.4 Result Analysis and Comparison

In this paper, sentiment analysis of Twitter data is successfully implemented using four different classifiers, namely logistic regression, decision tree, random forest, and LSTM. The experiments are run on a Colaboratory notebook, which allows one to write and execute code through the browser. The libraries used to execute the machine learning and deep learning algorithms are scikit-learn and Keras, respectively. The comparison between the four algorithms is based on accuracy and f1_score, in which random forest proves to be the most accurate model with the highest f1_score among the four, and decision tree the least accurate. Table 57.1 displays the performance metrics of the four classifiers used in this paper.

57.5 Conclusion

In this paper, three machine learning algorithms and one deep learning algorithm are used for classifying Twitter data into two classes, positive and negative; these algorithms were chosen because the literature review indicated that they produce the best accuracy compared with other machine learning algorithms. After this experiment, it is concluded that random forest produces the best accuracy of 0.95. The future scope of this work includes multiclass classification of Twitter data into three or five different classes and the use of different models to compare which performs better for multiclass sentiment classification. A combination of models can also be used for better performance.

References

1. Poornima, A., Priya, K.S.: A comparative sentiment analysis of sentence embedding using
machine learning techniques. In: 6th International Conference on Advanced Computing and
Communication Systems 2020, ICACCS, pp. 493–496. IEEE (2020)
2. Mandloi, L., Patel, R.: Twitter sentiments analysis using machine learning methods. In:
International Conference for Emerging Technology 2020, INCET, pp. 1–5. IEEE (2020)
3. Zahoor, S., Rohilla, R.: Twitter sentiment analysis using lexical or rule based approach: a case
study. In: 8th International Conference on Reliability, Infocom Technologies and Optimization

(Trends and Future Directions 2020), ICRITO, pp. 537–542. IEEE (2020)
4. Tiwari, S., Verma, A., Garg, P., Bansal, D.: Social media sentiment analysis on Twitter datasets.
In: 6th International Conference on Advanced Computing and Communication Systems 2020,
ICACCS, pp. 925–927. IEEE (2020)
5. Saad, S.E., Yang, J.: Twitter sentiment analysis based on ordinal regression. IEEE Access 7,
163677–163685 (2019)
6. Anand, T., Singh, V., Bali, B., Sahoo, B.M., Shivhare, B.D., Gupta, A.D.: Survey paper: senti-
ment analysis for major government decisions. In: International Conference on Intelligent
Engineering and Management 2020, ICIEM, pp. 104–109. IEEE (2020)
7. Goel, A.K., Batra, K.: A deep learning classification approach for short messages sentiment
analysis. In: International Conference on System, Computation, Automation and Networking
2020, ICSCAN, pp. 1–3. IEEE (2020)
8. Chandra, Y., Jana, A.: Sentiment analysis using machine learning and deep learning. In:
7th International Conference on Computing for Sustainable Global Development 2020,
INDIACom, pp. 1–4. IEEE (2020)
9. Talpada, H., Halgamuge, M.N., Vinh, N.T.Q.: An analysis on use of deep learning and lexical-
semantic based sentiment analysis method on Twitter data to understand the demographic trend
of telemedicine. In: 11th International Conference on Knowledge and Systems Engineering
2019, KSE, pp. 1–9. IEEE (2019)
10. Goularas, D., Kamis, S.: Evaluation of deep learning techniques in sentiment analysis from
Twitter data. In: International Conference on Deep Learning and Machine Learning in Emerging
Applications 2019, Deep-ML, pp. 12–17. IEEE (2019)
11. Cheng, L.C., Tsai, S.L.: Deep learning for automated sentiment analysis of social media. In:
Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks
Analysis and Mining, pp. 1001–1004. IEEE (2019)
12. Saleena, N.: An ensemble classification system for Twitter sentiment analysis. Procedia Comp.
Sci. 132, 937–946 (2018)
13. Sharma, A., Ghose, U.: Sentimental analysis of Twitter data with respect to general elections
in India. Procedia Comp. Sci. 173, 325–334 (2020)
14. Mittal, A., Patidar, S.: Sentiment analysis on Twitter data: a survey. In: Proceedings of the
2019 7th International Conference on Computer and Communications Management, pp. 91–95
(2019)
15. Lee, Y., Yoon, S., Jung, K.: Comparative studies of detecting abusive language on Twitter.
In: Association for Computational Linguistics/Proceedings of the 2nd Workshop on Abusive
Language Online ALW 2, Brussels, Belgium, pp. 101–106 (2018)
16. Katal, A., Wazid, M., Goudar, R.H.: Big data: issues, challenges, tools, and good practices. In:
Sixth International Conference on Contemporary Computing, pp. 404–409. IEEE (2013)
17. Sharma, G., Tripathi, V., Mahajan, M., Srivastava, A.K.: Comparative analysis of supervised
models for diamond price prediction. In: 11th International Conference on Cloud Computing,
Data Science and Engineering (Confluence), pp. 1019–1022. IEEE (2021)
18. Chauhan, H., Kumar, V., Pundir, S., Pilli, E.S.: A comparative study of classification tech-
niques for intrusion detection. In: International Symposium on Computational and Business
Intelligence, pp. 40–43. IEEE (2013)
19. Tripathi, V., Pant, B., Kumar, V.: CNN based framework for sentiment analysis of tweets
20. Kumar, I., Mohd, N., Bhatt, C., Sharma, S.K.: Development of IDS using supervised machine
learning. In: Soft Computing: Theories and Applications, pp. 565–577. Springer, Singapore
(2020)
Chapter 58
A Survey of Methods and Techniques
in Offline Telugu Character
Segmentation and Recognition

Chandrakala Mukku and Miriala Santhosh

Abstract Telugu is one of South India’s oldest languages. It has a complicated


orthography with various different character shapes. For so many decades, offline
character recognition has always been a popular area of study. Handwriting segmen-
tation and recognition are difficult, which has stimulated the interest of industry and
academic researchers. Methods for recognizing handwriting have become much more
prominent in recent years. Until now, the connected component approach, vertical
and horizontal profiles, and other approaches were presented in the literature for the
segmentation of printed character’s work. The selection of the best discriminative
features is an important concern in character recognition. Various statistical and struc-
tural features, as well as various combinations of them, are discussed in the research
work. Researchers used ANN, SVM, KNN, and CNN to classify offline Telugu
characters. The purpose of this article is to closely analyze various feature extrac-
tion methods and classification models to understand the problem and challenges
encountered by earlier research. This identification is meant to provide numerous
recommendations for advancements and scope.

Keywords k-nearest neighbor (KNN) · Support vector machines (SVM) ·


Convolution neural networks (CNN) · Optical character recognition (OCR) ·
Character recognition (CR) · Natural language processing (NLP) · Connected
component (CC) · Histogram of oriented gradients (HOG) · Fast Fourier transform
(FFT) · Discrete cosine transform (DCT) · Discrete wavelet transform (DWT) ·
Principal component analysis (PCA) · Fourier transform (FT) · Particle swarm
optimization (PSO) · Differential evolution (DE)

C. Mukku (B) · M. Santhosh


Department of ECE, Anurag University, Hyderabad, Telangana, India
e-mail: mchandrakala_ece@mgit.ac.in
M. Santhosh
e-mail: msanthoshece@cvsr.ac.in
C. Mukku
Department of ECE, Mahatma Gandhi Institute of Technology, Hyderabad, India


58.1 Introduction

OCR has recently gained prominence in the field of pattern recognition due to its
numerous applications. There is a wide range of commercial OCR systems on the
market. The identification of zip codes in an automatic mail sorting application
attracted the attention of researchers in character recognition. There are numerous
advantages to document digitization, including faster sorting and less storage space.
Despite the widespread use of computers in document processing, handwritten docu-
ments remain an essential part of life. In the early studies, the majority of the research
was contributed to printed text utilizing template matching approaches. Because of
the difficulties in recognizing handwritten text, very little research priority has been
given to it. Character recognition in handwriting is a challenging task because various
writing styles and different writing variability can result in drastic differences in char-
acters. South Indian languages such as Tamil, Kannada, and Telugu, among others,
have received relatively little research attention due to a lack of standard datasets.
Telugu has an extremely complicated orthography with so many different character
shapes made up of sixteen vowels, 36 consonants, and a wide range of other vowel
and consonant combinations. Most Telugu characters are similar, making it difficult
to identify them. Sometimes consonants and vowel modifiers are combined to form
a complex or compound character. Researchers face a difficult task in recognizing
compound characters. Developing an OCR for the Telugu language is needed to
preserve old and historical documents, automatic sorting of books in the library,
postal services, automate the banking services, tax forms, reading aid for the blind,
forensic challenges, check processing in banks, natural language processing (NLP),
almost all form processing systems, etc.
The primary objective of this paper is to examine CR methodologies in relation
to the stages of a CR system. The article was primarily concerned with Telugu
offline character recognition techniques. Section 58.2 explores the methodologies of
the Telugu character recognition system. Section 58.3 discusses the challenges in
Telugu handwritten character segmentation and recognition. Section 58.4 concludes
with a brief discussion of future research directions.
Literature Survey
The most important step would be to create a database of handwritten characters
to work with. Databases make things easier for many researchers by providing a
very convenient and easy way out. The researchers created a proposed dataset that
is now available on the IEEE Data port [1], and HP Labs created a Telugu hand-
written dataset [2]. Image binarization, also known as thresholding, is a useful tool
in image processing and computer vision for distinguishing the object pixels in an
image from the background pixels. Numerous methods for document binarization
have been reported in the literature, namely the Otsu method [3], NiBlack’s [4],
Sauvola and Pietikäinen [5], Wolf and Jolion [6], Bradley and Roth [7], and Khur-
shid et al. [8]. In general, an angle of tilt can occur when a machine or a human
operator is involved in document scanning. The Hough transform [9] and projec-
tion profile [10] are used to calculate a document’s skew angle. After the document
image has been binarized and skew corrected, the text content must be extracted. The
canonical syllable segmentation model [11], the projection profile method [12], and
the connected component method [13] are three popular segmentation algorithms in
document image analysis. Feature extraction addresses the problem of identifying the
most relevant, unique, and constrained set of features to improve recognition accu-
racy. Many efficient feature extraction methods have been proposed by researchers,
namely the geometric feature-based approach [14], features based on histograms
and distance profiles [15], HOG features [16], zone-based features [17], DCT and
Gabor features [18], and DCT and DWT features [19]. Many experts have investi-
gated developing an offline Telugu character classification method using a number
of techniques. Panyam et al. [19] reported that models trained using the two-stage
KNN classification method and transformed-based DWT and FFT features extracted
from Telugu palm leaf characters achieved recognition accuracy ranging from 85.7
to 96.4% in all three planes “XY”, “YZ”, and “XZ”. Aradhya et al. [20] described a
method for improving the performance of a multilingual OCR system trained with
PCA and FT features. PSO and DE are used by Vijaya Lakshmi et al. [16] to optimize
cell-based and HOG features extracted from Telugu palm leaf handwritten characters.
Finally, a model trained with KNN classifiers achieved 93.1% recognition accuracy.
Sastry et al. [15] made a significant contribution to their work on palm leaf docu-
ments by obtaining 3D features and proposing a depth sensing approach to eliminate
noise for such documents. The extracted characters from Telugu palm scripts are
then recognized using distance and histogram profile features.

58.2 Methodologies of Telugu CR Systems

The architecture of the CR system is shown in Fig. 58.1.

58.2.1 Image Acquisition

Image digitization takes place in this section. At first, characters are written on paper.
The paper is now scanned with the scanner. A bitmap image will be obtained after
scanning the image. After digitization, the bitmap image is fed into the prepro-
cessing stage as an input. According to the literature, there is no standard database
for Telugu handwritten characters for validating the results. The researcher created
their proposed dataset, which is now available in the IEEE Data port [1]. HP labs
developed a Telugu handwritten dataset that includes 166 isolated character classes
scribbled by 146 native Telugu writers [2].

Fig. 58.1 Architecture of the CR system

58.2.2 Preprocessing

The goal of preprocessing is to create data that is simple enough for CR systems
to recognize characters correctly. The data flow model of preprocessing is shown in
Fig. 58.2.

58.2.2.1 Noise Removal

The term “noise” refers to unwanted information in document images. Noise from
the optical scanning device or writing instrument or poor quality document causes
disjointed line segments, gaps in lines, filled loops, bumps, and other anomalies.
Distortion is also an issue, including local variations, dilation, and erosion, and
corner rounding. It is necessary to remove these imperfections before proceeding
with the CR system [21]. There are several methods for removing noise, such as
low-pass filters, median filters, and so on. The low-pass filter is the most commonly
used method. Figures 58.3 and 58.4 show a noisy image and the image after noise
removal, respectively.

Fig. 58.2 Dataflow model of preprocessing

Fig. 58.3 Noisy image

Fig. 58.4 Image after noise removal

58.2.2.2 Binarization

The majority of document analysis algorithms rely on underlying binarized image
data. Binarization thus distinguishes between foreground and background informa-
tion. This separation of text and background is required for subsequent operations like
segmentation, feature extraction, and classification [22]. Threshold-based binariza-
tion techniques can be categorized as either global or local (adaptive). If the grayscale
value of a pixel is less than the threshold, the pixel is converted to black; otherwise, it
is converted to white. Adaptive thresholding methods employ different threshold
values for different regions of an image on a pixel-by-pixel basis, whereas global
methods employ a single threshold value for the whole image. The Otsu method [3]
is a global thresholding method for histograms that are bimodal or multimodal.
The Otsu method, on the other hand, fails when the histogram is unimodal. Gener-
ally, document images are degraded due to uneven illumination, poor source type,
image contrast variation, shadows, smear, strain, and scanning errors. Especially for
degraded document images, local adaptive thresholding methods like Niblack's [4],
Sauvola and Pietikäinen [5], Wolf and Jolion [6], Bradley and Roth [7], and Khurshid
et al. [8] perform better than the global threshold.
The processing time in local adaptive thresholding is determined by region statis-
tics. Niblack’s computes the mean and standard deviation over a small sliding
window. Sauvola’s threshold is calculated using the image gray-value standard devi-
ation. Wolf first normalizes the contrast and computes the image gray-level mean.
Darek Bradley proposed a thresholding technique that takes into account spatial
differences in illumination. Nick’s algorithm solves the problem of low-contrast
images as well as the problem of black noise. To demonstrate the efficacy, locally
adaptive thresholding methods such as Niblack, Sauvola, Wolf’s, Darek Bradley,
and Nick’s thresholding were applied to DIBCO-2009 [23] handwritten degraded
text document images. Figure 58.5 depicts the ground truth and segmentation results
of different thresholding techniques applied on the DIBCO-2009 dataset.
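As an illustration only (not the exact procedure of any of the cited works), the
following Python sketch applies a global Otsu threshold and two of the local adaptive
methods mentioned above, Niblack and Sauvola, using scikit-image; the file name
and window size are assumptions.

# Global versus local adaptive binarization with scikit-image (illustrative values).
from skimage import io
from skimage.filters import threshold_otsu, threshold_niblack, threshold_sauvola

gray = io.imread("document.png", as_gray=True)        # placeholder file name

binary_otsu = gray > threshold_otsu(gray)                        # one global threshold
binary_niblack = gray > threshold_niblack(gray, window_size=25, k=0.2)
binary_sauvola = gray > threshold_sauvola(gray, window_size=25)  # per-pixel thresholds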

58.2.2.3 Skew Detection and Correction

Document images are generated by scanning, which can be done mechanically or
by humans. The involvement of a machine or a human operator may result in an
angle of tilt. This angle (skew) is formed by the horizontal lines of text in the image
of the document. The Hough transform is used in the majority of skew detection in
document images to detect straight lines in images. This algorithm converts each edge
pixel in an image space into a curve in a parametric space [24]. The dominant line and
its skew are represented by the peak in the Hough space. The main disadvantage of
this method is that, when the text becomes sparse, it is difficult to select a peak in the
Hough space [9]. To determine the skew angle of a document, the projection profile
[10] is calculated over several angles, and the difference between peak and trough
heights is evaluated for each angle. Figures 58.6 and 58.7 show the skewed image and
the deskewed image, respectively, after applying the Hough transform.

Fig. 58.5 Thresholding of the handwritten DIBCO dataset: a degraded handwritten text, b histogram, c Niblack's, d Sauvola's, e Wolf's, f Darek Bradley, g Nick's, and h ground truth

Fig. 58.6 Image with skew

Fig. 58.7 Image after deskew
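A minimal sketch of the projection-profile idea for skew estimation is given below:
the binary image is rotated over a range of candidate angles, and the angle whose
horizontal profile shows the strongest peak-trough contrast is kept. The angle range,
step, and variance criterion are illustrative choices, not taken from the cited works.

# Projection-profile skew estimation (illustrative sketch).
import numpy as np
from scipy.ndimage import rotate

def estimate_skew(binary_img, max_angle=10.0, step=0.5):
    img = binary_img.astype(float)
    best_angle, best_score = 0.0, -1.0
    for angle in np.arange(-max_angle, max_angle + step, step):
        rotated = rotate(img, angle, reshape=False, order=0)
        profile = rotated.sum(axis=1)          # horizontal projection profile
        score = np.var(profile)                # peak/trough contrast of the profile
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle

# deskewed = rotate(binary_img.astype(float), -estimate_skew(binary_img), reshape=False)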

58.2.2.4 Size Normalization

It is used to make the size of the character in the image correspond to a specific
standard. The original sizes of the handwritten characters vary greatly, which may
affect feature extraction. In most cases, the character’s size is normalized by retaining
the character’s shape. Linear interpolation, cubic spline interpolation in polynomial,
poly-phase network forms, and decimation scaling methods are widely being used
to create input images of the same dimension [25].

58.2.3 Segmentation

The preprocessing stage generates a “smooth” text by obtaining sufficient shape
information, low noise, and high compression on a normalized image. The document
is then divided into segments. The ability to separate lines, words, or characters
in the segmentation approach has a direct impact on the recognition rate of the
script. There are a variety of segmentation algorithms. There are three types of
segmentation approaches, namely recognition-based segmentation, dissection, and
holistic methods [26].
In dissection, segmentation is calculated by “character-like” attributes. This tech-
nique makes use of white space, connected components, projection analysis, and
pitch. The system aims at highlighting words as a whole using holistic methods,
trying to eliminate the need to segment them into characters.
For the segmentation of printed documents, dissection approaches are used,
because the line, word, and character spacing are fixed. A holistic approach to hand-
written character recognition can be beneficial [8]. Telugu script is extremely difficult
to segment due to the varying gaps between words and characters, as well as char-
acter overlapping [11]. The segmentation of Telugu characters is a difficult task due
to the overlapping of characters [27].
For segmentation, the projection profile method [23] makes use of white spaces
in line and word. The horizontal projection profile is popularly used to extract text
lines from a text document by adding up the ON pixels in a row. The horizontal
projection profile contains peaks and valleys that serve as text line separators. Line
boundaries can be found by using valleys with a value of zero. The vertical projection
profile is commonly used to extract text lines from a document image by summing
the ON pixels in a column. Words could be separated by examining the minima in a
single line’s vertical projection profile. The text lines are segmented using horizontal
projection profiles. Words and characters can be segmented using vertical projection
profiles.
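The following sketch illustrates text-line segmentation with the horizontal projection
profile described above: rows whose ON-pixel count is zero act as the valleys that
separate lines. It assumes a binary image with text pixels set to 1; word segmentation
along columns (vertical profile) is analogous.

# Text-line segmentation via the horizontal projection profile (illustrative sketch).
import numpy as np

def segment_lines(binary_img):
    """binary_img: 2-D numpy array with text pixels = 1 and background = 0."""
    profile = binary_img.sum(axis=1)             # ON-pixel count per row
    lines, start = [], None
    for row, count in enumerate(profile):
        if count > 0 and start is None:          # entering a text line
            start = row
        elif count == 0 and start is not None:   # leaving a text line (valley)
            lines.append(binary_img[start:row, :])
            start = None
    if start is not None:                        # text touching the bottom edge
        lines.append(binary_img[start:, :])
    return lines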
The connected component method [24] begins by labeling the pixels in the given
image. A four or eight-neighborhood relationship is referred to as “adjacency”. Adja-
cent pixels have the same number and are regarded as connected components. The
characters are represented by the labeled components (glyphs). The CC method is
suitable for single glyphs and solves the issue of overlapping character segmentation.
Figure 58.8 depicts a sample character segmentation result using the connected
component approach.
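As a rough illustration of this approach, the sketch below labels 8-connected
components with SciPy and returns the glyph regions; the minimum-pixel filter is an
added assumption used to drop small noise blobs and is not part of the cited methods.

# Connected-component (glyph) extraction with 8-neighbourhood adjacency.
import numpy as np
from scipy.ndimage import label, find_objects

def extract_glyphs(binary_img, min_pixels=20):
    structure = np.ones((3, 3), dtype=int)         # 8-neighbourhood adjacency
    labels, num = label(binary_img, structure=structure)
    glyphs = []
    for slc in find_objects(labels):
        component = binary_img[slc]
        if component.sum() >= min_pixels:          # drop tiny noise components
            glyphs.append(component)
    return glyphs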
The over-segmentation of characters is a limitation of this method. Essentially, the
CC method divides the compound character into two characters (glyphs), resulting
in a change in the character’s meaning. To form a valid character, the glyphs must be
rejoined. Pratap Reddy et al. [28] presented character extraction that falls primarily
in the middle zone, as well as in the top-middle and top-middle-bottom zones, which
are extracted using vertical and horizontal profiles together with knowledge of the top
and bottom baselines, as shown in Fig. 58.9.

Fig. 58.8 Segmentation using connected components

Fig. 58.9 Zones of Telugu handwritten characters
Phases of canonical syllable segmentation [11], zone separation, and component
extraction are treated separately. The canonical syllable model describes how they are
positioned and which zone the character belongs to. The information obtained from
the zone separation and component separation models is combined in the canonical
syllable separation model. In the zone separation model [28], peaks and valleys of
vertical and horizontal profiles, as well as slope information, are used to identify
three zones. In the component separation model, the 8 × 8 connectivity bounding
box technique is used to separate these distinct units from the canonical syllable.

58.2.4 Feature Extraction Methods

Handwritten character recognition relies heavily on feature extraction, because it
extracts different features that enable one character to be distinguished from others.
Feature extraction is concerned with the problem of identifying the most relevant,
distinct, and limited set of features to improve recognition accuracy. The features are
derived from the entire character image or from a specific part of it by separating it
into different overlapping or non-overlapping zones (windows or cells). The perfor-
mance of recognition is determined by the method of feature extraction used, and a
good choice of features also reduces the complexity of the system [29]. The strategies
used for extracting the
features are categorized into three types.
a. Structural features
b. Statistical features
c. Transformed features.

58.2.4.1 Structural Features

These features describe the structure of the character, such as whether its shape is a
straight line, curved, or circular.
The positional and geometrical properties of the character serve as the basis for
structural features. Maxima, minima, the line ends, cross points, extreme points,
loops, branch points, and stroke direction are examples of topological features.
Geometrical features include textures, ridges, and edges features. The geometric
feature-based approach [14] was used on the English handwritten alphabet to extract
structural features such as lines.

58.2.4.2 Statistical Features

The statistical distribution of points used to represent a character image accounts for
style variations. Although this type of feature extraction does not permit image
reconstruction, it can be used to minimize the dimensions of the feature set while
sustaining high speed and low complexity. The most important statistical features
are zoning, crossings and distances, and projections. Features based on histograms
and distance profiles [15], HOG features [16], and zone-based features [17] were
extracted from Telugu palm leaf handwritten characters and developed excellent
recognition accuracy.
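A hedged sketch of extracting the HOG features mentioned above from a size-
normalized character image with scikit-image is shown below; the image size, cell,
and block parameters are illustrative and are not the settings used in [16].

# HOG feature extraction for a character image (illustrative parameters).
from skimage.feature import hog
from skimage.transform import resize

def hog_features(char_img, size=(64, 64)):
    normalized = resize(char_img.astype(float), size)   # size normalization
    return hog(normalized,
               orientations=9,
               pixels_per_cell=(8, 8),
               cells_per_block=(2, 2),
               feature_vector=True)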

58.2.4.3 Transformed Features

The image’s transform domain representation generally emphasizes information that
cannot be visualized in a spatial domain image and can be used to generate features.
The major transformed-based features used for character representation are FFT,
DCT, and Gabor transform. DCT and Gabor features [18] were extracted from eight
Indian script characters and achieved a good recognition rate. DCT and DWT features
[19] were extracted from Telugu handwritten characters using two-level character
recognition and showed good recognition accuracy.
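To illustrate transformed features, the sketch below concatenates a low-frequency
block of 2-D DCT coefficients with the level-1 DWT approximation coefficients of a
character image and feeds the result to a KNN classifier. This conveys only the general
idea and is not the exact two-level scheme of [19].

# DCT + DWT feature extraction with a KNN classifier (illustrative sketch).
import numpy as np
import pywt
from scipy.fft import dctn
from sklearn.neighbors import KNeighborsClassifier

def transform_features(char_img, keep=8):
    img = np.asarray(char_img, dtype=float)
    dct_block = dctn(img, norm="ortho")[:keep, :keep]   # low-frequency DCT block
    cA, (cH, cV, cD) = pywt.dwt2(img, "haar")           # level-1 DWT decomposition
    return np.concatenate([dct_block.ravel(), cA.ravel()])

# X = np.array([transform_features(img) for img in train_images])
# knn = KNeighborsClassifier(n_neighbors=3).fit(X, train_labels)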

58.2.5 Telugu Offline Character Recognition System

Many researchers have proposed using a variety of techniques to develop an offline
Telugu character recognition system. The multi-level algorithm [19] considers two
levels of classification using a KNN classifier. The model is trained using 1232
training samples and tested on 308 samples. DCT is used in the first level to extract
Telugu palm leaf character features, and DWT is used in the second level. The second
level is used to put the unidentified characters from the first level to the test. Aradhya
et al. presented a new approach to improve the performance of a multilingual OCR
[11] system that was trained and built using PCA and FT features to identify vowels,
consonants, and vowel modifiers from degraded documents written in South Indian
script.
Using zonal-based features, a Telugu handwritten character recognition system
was created [17]. The KNN model was trained with 18,750 samples and tested with
500 samples, yielding a recognition accuracy of 78%. Telugu palm leaf character
recognition based on distance and histogram profiles was performed in [15]. A character
database of 56
compound characters was gathered from various people. After being trained with a
KNN classifier, the model scored 82.5% recognition accuracy. Various dimension-
ality reduction techniques, like PSO and DE, could be used for HOG and cell-based
directional features [16]. The KNN model was trained and tested on 26 different
characters created by five various writers, and it achieved a 93% accuracy rate.
Telugu handwritten diacritic character recognition [1] was performed using convolu-
tional neural networks on the IEEE Data port dataset and achieved a recognition
accuracy of 76.6%. Ukil et al. [30] presented a method for combining CNNs for Indian word-
level handwritten script recognition and used small, separately trainable CNNs with
varying levels of variation in their architectures and achieved 94.04% accuracy.
Sulaiman et al. [21] proposed a deep learning model for accurate handwritten word
recognition that simultaneously learns character-level and word-level information.
Character recognition can help improve word recognition methods. They proposed
a weakly separated approach to character segmentation and then classified the
handwritten word at the character level using a series of LSTM layers. Table
58.1 illustrates the performance of Telugu handwritten characters based on various
features, with various characters depicted.
The good classification accuracy for Telugu handwritten script has so far been
reported to be 96.4% [19]. This recognition accuracy is achieved by training a KNN
classifier with transformed-based features such as FFT and DWT extracted from
a dataset of 28 different handwritten character images. The next highest accuracy
for Telugu script is 93.04% [20], which was achieved by training a PCA classifier
with transformed-based features, namely Fourier transform features, extracted from 30
different handwritten characters in the proposed dataset. The accuracy for Telugu
script is 93% [16], which was achieved by training a KNN classifier with statistical
features, namely HOG and cell features, extracted from 26 different handwritten charac-
ters in the proposed dataset. According to the literature, training a KNN classifier with
statistical and transformed-based features resulted in higher recognition accuracy.

Table 58.1 Performance of Telugu handwritten characters based on various features

References  Year  Classifier  Feature extraction              No. of characters/samples  Dataset           Accuracy (%)
[1]         2020  CNN         Automatic                       516 samples                IEEE data port    78.6
[15]        2017  KNN         Distance and histogram profile  56 compound characters     Proposed dataset  82.5
[16]        2017  k-NN        HOG and cell-based              26 characters              Proposed dataset  93.1
[19]        2016  k-NN        FFT and DWT                     28 characters              Proposed dataset  96.4
[20]        2008  PCA         Fourier transform               31 characters              Proposed dataset  93.04

58.3 Challenges in Telugu Handwritten Character Segmentation and Recognition

There have been numerous studies over the last several decades into the recognition
of Telugu characters. There is a lot of ambiguity in Telugu when it comes to character
segmentation and recognition.
(a) The segmentation of Telugu characters is a difficult task due to the overlapping
of lines and the variable gaps between words and characters.
(b) The character's vertical and horizontal extent and spacing vary depending on the
modified vowel or consonant, which results in improper segmentation.
(c) Because of the large number of character shapes and personnel independence,
the Telugu script is structurally complex.
(d) Consonant conjuncts and compound characters have a curvilinear structure,
with the majority of them having a high similarity index. As a result, character
classification in Telugu handwriting is extremely difficult.
(e) Irregular handwriting makes it difficult to group characters and determine their
relationships.
(f) It is difficult to recognize handwriting characters because different writing
styles and handwriting variability can result in dramatic differences in
characters.

58.4 Conclusion and Future Scope

Individual researchers have been able to develop algorithms and techniques that can
recognize manuscripts with greater accuracy due to the advances of machine learning
and deep learning algorithms. Recognition accuracy is affected by character style,
writing style, distorted strokes, similar characters, variable character thickness, illu-
mination, and dataset quality. Researchers obtained lower recognition accuracy due to
a lack of datasets in Telugu handwritten characters. This research can be expanded to
improve recognition accuracy by properly segmenting characters, employing hybrid
feature extraction techniques, and applying hybrid classification approaches. Future
work also includes feature optimization approaches for improving the character
classification ratio and lowering the computational burden of the recognition system,
as well as incorporating a neural network into an Android application for handwritten
text-to-speech conversion.

References

1. Muppalaneni, N.B.: Handwritten Telugu compound character prediction using convolutional
neural network. In: International Conference on Emerging Trends Information Technology Eng
IC-ETITE 2020, pp. 1–4 (2020)
2. Prasad, S.D., Kanduri, Y.: Telugu handwritten character recognition using adaptive and static
zoning methods. In: 2016 IEEE Students’ Technology Symposium TechSym 2016, no. i,
pp. 299–304 (2017)
3. Otsu, N.: A Threshold selection method from gray-level histograms. IEEE Trans. Syst. Cybern.
9 (1979)
4. Niblack, W.: An Introduction to Digital Image Processing, pp. 115–116. Prentice Hall,
Englewood Cliffs, N.J. (1986)
5. Sauvola, J., Pietikäinen, M.: Adaptive document image binarization. Pattern Recogn. 33(2),
225–236 (2000)
6. Wolf, C., Jolion, J.M.: Extraction and recognition of artificial text in multimedia documents.
Pattern Anal. Appl. 6(4), 309–326 (2004)
7. Bradley, D., Roth, G.: Adaptive thresholding using the integral image: read but to check with
SIR. Jgt 12, 13–21 (2007)
8. Khurshid, K., Siddiqi, I., Faure, C., Vincent, N.: Comparison of Niblack inspired binarization
methods for ancient documents. Doc. Recognit. Retr. XVI 7247, 72470U (2009)
9. Shivakumara, P., Kumar, G.H., Guru, D.S., Nagabhushan, P.: A novel technique for estimation
of skew in binary text document images based on linear regression analysis. Sadhana—Acad.
Proc. Eng. Sci. 30(1), 69–85 (2005)
10. Hull, J.J.: Document image skew detection: survey and annotated bibliography, pp. 40–64
11. Reddy, L.P., Sastry, A.S.C.S., Rao, N.V.: Canonical syllable segmentation of Telugu document
images. In: IEEE Region 10 Annual International Conference Proceedings/TENCON, 2008
12. Sagar, B.M., Shobha, G., Ramakanth Kumar, P.: Character segmentation algorithms for
Kannada optical character recognition. In: Proceedings 2008 International Conference Wavelet
Anal. Pattern Recognition, ICWAPR, vol. 1, pp. 339–342 (2008)
13. Jindal, M.K., Sharma, R.K., Lehal, G.S.: Segmentation of horizontally overlapping lines in
printed Gurmukhi script. In: Proceedings—2006 14th International Conference on Advanced
Computing and Communication ADCOM 2006, vol. 1, pp. 226–229 (2006)
14. Gaurav, D.D., Ramesh, R.: A feature extraction technique based on character geometry for
character recognition, pp. 1–4 (2012)
15. Lakshmi, T.R.V., Narahari, P., Rajinikanth, T.V.: A novel 3D approach to recognize Telugu
palm leaf text. Eng. Sci. Technol. Int. J. 20(1), 143–150 (2017)
16. Vijaya Lakshmi, T.R.: Reduction of features to identify characters from degraded historical
manuscripts. Alexandria Eng. J., 0–6 (2017)
17. Sastry, P.N., Lakshmi, T.R.V., Rao, N.V.K., Rajinikanth, T.V., Wahab, A.: Telugu hand-
written character recognition using zoning features. In: 2014 International Conference on
IT Convergence and Security ICITCS 2014, pp. 0–3 (2014)
18. Rajput, G.G., Anita, H.B.: Handwritten script recognition using DCT, Gabor filter and wavelet
features at line level. Stud. Comput. Intell. 395, 33–43 (2012)
19. Panyam, N.S., Vijaya, V.L., Krishnan, R.K., Koteswara, K.R.: Modeling of palm leaf character
recognition system using transform based techniques. Pattern Recognit. Lett. 84, 29–34 (2016)
20. Aradhya, V.N.M., Kumar, G.H., Noushath, S.: Multilingual OCR system for South Indian
scripts and English documents: an approach based on Fourier transform and principal
component analysis. Eng. Appl. Artif. Intell. 21(4), 658–668 (2008)
21. Sulaiman, A., Omar, K., Nasrudin, M.F.: Two streams deep neural network for handwriting
word recognition. Multimed. Tools Appl. 80(4), 5473–5494 (2021)
22. Bunke, H., Wang, P.S.P.: Handbook Character Recognition Document Image Analysis
23. https://doi.org/10.1007/s10032-010-0115-7
24. Chaudhuri, B.B., Pal, U.: Skew angle detection of digitized Indian script documents. IEEE
Trans. Pattern Anal. Mach. Intell. 19(2), 182–186 (1997)
25. De Oliveira, J.J., Veloso, L.R., De Carvalho, J.M.: Interpolation/decimation scheme applied
to size normalization of characters images. Proc. Int. Conf. Pattern Recognit. 15(2), 577–580
(2000)
26. Casey, R.G., Casey, R.G., Lecolinet, E., Lecolinet, E.: Survey of methods and STR in character
segmentation. Analysis 18(7), 690–706 (1996)
27. Shafait, F., Keysers, D., Breuel, T.: Performance evaluation and benchmarking of six-page
segmentation algorithms. IEEE Trans. Pattern Anal. Mach. Intell. 30(6), 941–954 (2008)
28. Pratap Reddy, L., Satyaprasad, L., Sastry, A.S.C.S.: Middle zone component extraction and
recognition of Telugu document image. In: Proceedings International Conference on Document
Analysis and Recognition, ICDAR, vol. 2, pp. 584–588 (2007)
29. Trier, Ø.D.: Feature extraction methods for character recognition—a survey. Pattern Recognit.
29(4), 641–662 (1996)
30. Ukil, S., Obaidullah, S.M., Roy, K., Das, N.: Improved word-level handwritten Indic script
identification by integrating small convolutional neural networks. Neural Comput. Appl. 32(7),
2829–2844 (2020)
Chapter 59
A Machine Learning Approach in 5G
User Prediction

Deepak Upadhyay, Pallavi Tiwari, Noor Mohd, and Bhaskar Pant

Abstract 5G is the latest generation of mobile networks, and it is a huge
step up from what we have available to us today. In today's era of communication,
there is a huge spike in terms of data rate and capacity. The current 4G-LTE
network is almost at its last stage and will soon be replaced by the next-generation
technology, which is 5G. 5G, however, will take time to roll out completely in
algorithms in the field of communication technology, particularly 5G, is a difficult
proposition. We really have to develop a model that runs quickly and functions well, as
measured by its key metrics, because we want minimal latency and rapid calculations.
Our goal in this paper is to determine whether or not the user is connected to a 5G
network. This data, 1,000,000 × 60, has been posted to Kaggle.

Keywords 5G · User prediction · Logistic regression · Random forest · XGB ·


Decision tree

59.1 Introduction

5G is an entirely new kind of network designed to connect virtually everyone and
everything together. 5G is the most promising technology in today's world; it
promises fast data rates and low latency with which everyone can do almost anything
from anywhere. The Internet of Things (IoT), which requires ultra-low latency to work
[1], can only be achieved through the 5th generation of mobile communication systems.
Self-driving cars, which require low latency as well as high reliability, are
only possible through 5G. In this paper, user prediction is done for 5G; the dataset
is downloaded from Kaggle. This paper uses Python for doing all the necessary
D. Upadhyay (B) · P. Tiwari · N. Mohd · B. Pant


Department of Computer Science and Engineering, Graphic Era Deemed to be University,
Dehradun, India
e-mail: deepakupadhyay.ece@geu.ac.in
P. Tiwari
e-mail: pallavitiwari_20061041.cse@geu.ac.in


prediction using machine learning algorithms such as logistic regression, random
forest, decision tree, and the XGB classifier [2]. There are three sets of data, sample,
train, and test, all of which are .csv files. The prediction of 5G users has been done
for different areas based on attributes such as age type and sex type; after this, the
precision and accuracy of each machine learning algorithm have been tested. In the
very first step, the data is visualized; in the second step, preprocessing is done; and
finally, training and testing are performed. This prediction can be used by telecom
operators before installing the actual infrastructure so that additional cost can be
saved and the subscribers and users can get the most benefit out of it. 5G will co-exist
with existing 4G networks until the coverage is expanded significantly, but it will
eventually evolve into a standalone network that operates independently. 5G also
has the potential to solve challenges in the Internet of Things [3].

59.2 Methodology

59.2.1 Data Collection

The dataset had missing values; in order to predict accurately, we need proper
values, and this problem of missing values is handled in the next step, which is data
preprocessing (Fig. 59.1).

59.2.2 Data Preprocessing

In this scenario, the missing data is classified as missing not at random (MNAR),
which signifies that the missingness is directly related to what is missing. Suppose
a device has knowingly not taken part; then it would not be counted. The remaining
missing data has been handled through Python libraries. The resulting dataset was then
divided into training and testing data, with training data accounting for around 80%
of the total and testing data accounting for the remaining 20%. The train_test_split()
method from the Python library scikit-learn is used to divide the dataset.
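A minimal sketch of this split is shown below; the file name "train.csv" and the
target column name "is_5g" are assumptions based on the dataset description, and
dropna() stands in for the missing-value handling described above.

# 80/20 train-test split with scikit-learn (file and column names are assumptions).
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("train.csv")
df = df.dropna()                       # simple handling of remaining missing values
X = df.drop(columns=["is_5g"])
y = df["is_5g"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)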

59.2.3 Building the Model

Python is used to create the model. Different machine learning algorithms are imple-
mented using the Scikit-learn toolkit. Scikit-learn is a Python package that combines
a variety of cutting-edge machine learning approaches for supervised and unsu-
pervised learning problems. This package aims at bringing machine learning to
non-specialists via a general-purpose high-level language. The following features
are prioritized: ease of use, performance, documentation, and API consistency.
Scikit-learn has few prerequisites and is released under the BSD license, making it
appropriate for use in both academic and commercial settings. Because it is built
on the professional Python ecosystem, it can be easily integrated into applications
outside the normal range of statistical data analysis. Furthermore, because the
algorithms are written in a high-level programming language, they can be used as
basic components of strategies adapted to a specific use case. This will also help in
providing security to 5G-based devices through IDS, as in [4, 5].

Fig. 59.1 Methodology diagram
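As an illustrative sketch (default hyperparameters, not necessarily the authors'
settings), the four models compared in this paper can be instantiated as follows. The
XGB classifier is assumed to come from the xgboost package, and X_train and
y_train refer to the split sketched in Sect. 59.2.2.

# Fitting the four classifiers used in this paper (illustrative sketch).
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

models = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Decision tree": DecisionTreeClassifier(),
    "Random forest": RandomForestClassifier(),
    "XGB classifier": XGBClassifier(),
}
for name, model in models.items():
    model.fit(X_train, y_train)        # assumes numeric features from the split above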

59.3 Data Representation

As can be seen in Fig. 59.2, for is_5g = 0 the city_5g_ratio shown is the most probable
across all of the city_5g_ratios.
In Fig. 59.3, the maximum density of is_5g appears at prov IDs 85, 81, 51, 31,
and 11.
Considering the is_5g likelihood, service type 5 is the most prevalent across all
channel types from 1 to 10 (Fig. 59.4).
Considering the is_5g likelihood, product type 5 is the most prevalent among all
channel types from 1 to 10 (Fig. 59.5).

Fig. 59.2 Visualization of the city 5G ratio for classes 1 and 0

Fig. 59.3 Prov ID investigation of 5G or not



Fig. 59.4 5G on channel type and service type

Fig. 59.5 5G on channel type versus product type

Considering the is_5g likelihood, product type 5 is also the most prevalent among
all product types from 2 to 6, as can be seen in Fig. 59.6.
The age column will be of little relevance because it comprises 43 people, all of
whom have a sex 1 count that is higher than the sex 0 count, with the exception of
age 59 (Fig. 59.7). As a result, it provides no new information to predict is_5g, as can
be seen in Fig. 59.8.

Fig. 59.6 5G on product type versus service type

Fig. 59.7 5G with term type



Fig. 59.8 Comparison of devices used by particular age and gender

59.4 Performance Metric of the Model

For the study and evaluation, the following performance metrics of the four models
were used. Table 59.1 presents a tabular comparison and analysis of the four machine
learning methods. The performance metrics are computed on the training dataset,
and the reported values are weighted averages. The weighted average or weighted
sum ensemble is a machine learning strategy that aggregates predictions from several
models, with each model's contribution weighted according to its competence or
performance. The voting ensemble is related to the weighted average ensemble [6].
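A small sketch of how such weighted-average metrics can be computed with
scikit-learn is shown below; it assumes the fitted models and the training split from
the previous sketches.

# Weighted-average precision/recall/F1 on the training data (illustrative sketch).
from sklearn.metrics import precision_recall_fscore_support, accuracy_score

for name, model in models.items():
    pred = model.predict(X_train)                 # metrics on the training dataset
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_train, pred, average="weighted")
    print(name, accuracy_score(y_train, pred), recall, precision, f1)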
I. Accuracy
Accuracy is the total number of correct predictions divided by the total number
of predictions made by the model:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

where TP is true positive, TN is true negative, FP is false positive, and FN is false
negative.

Table 59.1 Performance metric table of different algorithms

Model                Accuracy  Recall  Precision  F1-score
Logistic regression  0.99      0.99    0.97       0.98
Decision tree        0.98      0.98    0.98       0.98
Random forest        0.99      0.99    0.98       0.98
XGB classifier       0.99      0.99    0.97       0.98

II. Precision
Precision is defined as the number of correct positive predictions divided by
the total number of positive predictions made:

Precision = TP / (TP + FP)

III. Recall
Recall measures how many of the samples that actually belong to a specific class
are correctly predicted as belonging to that class:

Recall = TP / (TP + FN)

IV. F1-score
The F1-score is the harmonic mean of precision and recall:

F1-score = 2 × (P × R) / (P + R)

where P is precision and R is recall.


See Figs. 59.9 and 59.10.
See Table 59.2.

Fig. 59.9 Performance metric chart based on the data from Table 59.1

Fig. 59.10 Chart of prediction result based on the data from Table 59.2

Table 59.2 Prediction count of different algorithms used

Model                Correct predictions  False predictions
Logistic regression  300,000              0
Decision tree        298,005              1995
Random forest        299,416              584
XGB classifier       299,992              8

59.5 Result

The results of our experiment show that the logistic regression and XGBoost machine
learning classifiers performed best, with the highest accuracy, whereas the decision
tree comes in at the lowest place among these (Fig. 59.11).

Fig. 59.11 Comparison of different algorithms with accuracy



59.6 Conclusion

It can finally be concluded that, among the machine learning models used in this
paper, logistic regression and XGBoost achieved the highest accuracy. We analyzed
and compared various machine learning models on the classification of various areas.
This result will help the network service provider (NSP) to plan the implementation
of 5G when 5G rolls out.

References

1. Pundir, S., Wazid, M., Singh, D.P., Das, A.K., JPC Rodrigues, J., Park, Y.: Designing efficient
sinkhole attack detection mechanism in edge-based IoT deployment. Sensors 20(5), 1300 (2020)
2. Sharma, G., Tripathi, V., Srivastava, A.: Recent trends in big data ingestion tools: a study. In:
Research in Intelligent and Computing in Engineering, pp. 873–881. Springer, Singapore (2021)
3. Matta, P., Pant, B.: Internet of things: genesis, challenges and applications. J. Eng. Sci. Technol.
14(3), 1717–1750 (2019)
4. Mohd, N., Singh, A., Bhadauria, H.S.: A novel SVM based IDS for distributed denial of sleep
strike in wireless sensor networks. Wireless Pers. Commun. 111(3), 1999–2022 (2020)
5. Pundir, S., Wazid, M., Singh, D.P., Das, A.K., Rodrigues, J.J., Park, Y.: Intrusion detection
protocols in wireless sensor networks integrated to internet of things deployment: survey and
future challenges. IEEE Access 8, 3343–3363 (2019)
6. https://machinelearningmastery.com/weighted-average-ensemble-with-python/
Chapter 60
An Incremental Approach to Classify
Healthcare URLs Using a Novel ‘Web
Document Classification Model’
Yashoda Barve, Jatinderkumar R. Saini, Ketan Kotecha,
and Hema Gaikwad

Abstract In the present research work, we proposed a novel Web document classi-
fication model (WDCM) to classify healthcare-related URLs. This is required as the
Web is flooded with millions of URLs and they may be duplicated or irrelevant most
of the time. It is a challenging task to extract and organize Web documents specific to a
certain domain. Another challenge is to provide the updated and latest information to
the users. In this research, the authors have proposed WDCM which tackles the above-
mentioned challenges with the incremental learning approach. Authors have also
used sentiment analysis and document similarity measures to classify URLs in the
healthcare domain. The analysis of the experimental results shows that with machine
learning classifiers, cosine similarity measure with logistic regression showed the
highest accuracy of 93.75%. Euclidean distance (ED) measure with logistic regres-
sion showed minimum accuracy of 86.6%. When implemented as an algorithm ED
measure showed the highest accuracy of 96.60% for training and 95.83% for the
testing dataset in five iterations. Jaccard distance measure showed lower accuracy
of 65.71% for training and 85.33 for the testing dataset in five iterations. It is also
observed that the maximum URLs fetched are with the .com domain.

Keywords Document classification · Document similarity · Health care ·


Incremental learning · Sentiment analysis

Y. Barve
Suryadatta College of Management Information Research and Technology, Pune, India
J. R. Saini (B) · H. Gaikwad
Symbiosis Institute of Computer Studies and Research, Symbiosis International (Deemed
University), Pune, India
e-mail: saini_expert@yahoo.com
K. Kotecha
Symbiosis Centre for Applied Artificial Intelligence, Symbiosis International (Deemed
University), Pune, India
e-mail: head@scaai.siu.edu.in


60.1 Introduction

The World Wide Web (WWW) has become the source of information to millions
of users to get the latest information related to healthcare or medical domain. It
is available in the form of textual content, images, audio, and videos. Extracting
relevant information from the Web has become a topic of interest among researchers
[1–3]. However, there are millions of URLs available on the Web related to health
care and they might be duplicated and irrelevant which creates information overload
for the users. Users have to put in a lot of effort to search relevant Web pages on the
Web. Also, the researchers and scientists working related to the healthcare domain
face challenges in getting relevant information over the Web. It is a challenging task
to extract and organize Web information relevant to a certain domain like health care
[1, 2]. Also, data in the URLs may change over a period of time, and these changes
have to be recorded to provide updated information [1].
The efforts are taken in the literature to extract information from the Web using
HTML parser, tree-based techniques, and Web wrapper. Further, to classify Web
documents supervised machine learning approaches are proposed, viz. Naïve Bayes,
k-nearest neighbor algorithm, support vector machine, decision trees, etc. [3]. In
another research, authors have used Word2Vec and K-means clustering approach to
classifying Persian documents based on semantic similarity [4]. To classify scientific
documents, authors have used sequential minimal optimization (SMO) and WEKA
support vector machine [5]. In [6], authors have used support vector machines to
classify news articles. In another research, authors have used principal component
analysis and a one-class support vector machine to classify the documents [7]. In
research by Lin et al. [8], authors have used LSTM and deep learning. In another
approach, authors have used the nearest centroid class for adding new classes during
incremental learning [9]. In [10], authors have used the ensemble approach to classify
Indonesian documents. Although these techniques deal with document classification,
the authors have not found any research that classifies Web URLs related to the health-
care domain and also updates the dataset with the latest information. Therefore, there is
a pressing need to devise a model which can classify Web URLs related to the
healthcare domain into the relevant and irrelevant category as the new influx of data
appear incrementally.
The objective of the paper is to propose the WDCM for retrieving URLs
from the Web using healthcare-related keywords and further classify URLs as rele-
vant and irrelevant based on sentiment filter and document similarity filter. This will
benefit not only the healthcare professional to get relevant Web pages but also aid
the researchers to develop models related to the healthcare domain. Thus, the unique
research contributions are as follows:
1. The model is able to classify Web URLs into relevant and irrelevant categories
related to the healthcare domain and remove duplicate URLs.
2. The model can adapt to the newly arriving influx of data at any instance of time.
3. The model is able to detect changes in URLs and provides up-to-date records
of data.
4. The experimental evaluation of the proposed model.

60.2 Literature Survey

This section elaborates on the techniques used by researchers for Web document
classification. Section 60.2.1 focuses on document classification using distance
measures, Sect. 60.2.2 discusses incremental learning for document classification,
and Sect. 60.2.3 discusses sentimental features for document classification.

60.2.1 Document Classification Using Distance Measures

To find relevant information from the documents, it is necessary to perform docu-
ment classification [11]. In the literature, researchers have used document similarity
measures to classify Web documents. In the research [12], authors have proposed a
new similarity measure that defines a formula to compute the distance between two
vectors. However, the proposed new similarity measures outperformed other clas-
sifiers. In another research [13], authors have proved with sufficient evidence that
soft cosine similarity measure outperforms techniques like cosine similarity. Also,
researchers have used cosine similarity and Euclidean distance measure to clas-
sify documents in Gujarati language and multiple domain-based Bangla text docu-
ments [2, 14]. To identify the similarity between medical documents, authors have
used the Bag-of-Words approach along with Latent Dirichlet Allocation (LDA) [15].
Although all these researches have performed standard techniques, authors did not
find a methodology used to compute the similarity between documents using Jaccard
Distance, Euclidean distance, and cosine similarity to classify healthcare-related Web
URLs.

60.2.2 Incremental Learning for Document Classification

Incremental learning adapts to the newly arriving chunk of data without a need to
retrain the entire model. In a research, incremental neural network based on neural
perceptron was adopted to identify newly evolving features and classes in a document
[16]. In another research, authors have used partial supervision techniques with Hier-
archical Dirichlet Process (HDP) using incremental learning approach. Thus make
the model adapt to quick changes occurring in the data and record the latest informa-
tion [17]. Silambarasan and Shathik [18] predicted new classes as the chunk of data
arrives incrementally, using an ensemble text classifier (ETC). However,
ensemble techniques are more suitable for sudden concept drifts [19]. Therefore, the
incremental learning approach is more suitable to deal with the smooth arrival of data
and keeping the documents with the latest information in the healthcare domain.

60.2.3 Sentimental Features for Document Classification

Sentiment analysis determines the feelings of the people in terms of positive, negative,
and neutral categories. This certainly helps in categorizing documents. In research by
Yang et al. [20], authors have clustered documents with Latent Dirichlet Allocation
(LDA) using terms from the medical domain. In another research, authors have used
sentiment analysis to classify healthcare documents into True and False based on
the percentage of misinformation in the documents [21]. In this research, authors
have used a similar approach as that of [21] to classify Web URLs related to the
healthcare domain. Although all the above sections are studied independently, there
is no research that would combine the techniques to classify Web URLs related to the
healthcare domain using document similarity measures based on sentiment analysis
and incremental learning. Thus the proposed Web document classification model
(WDCM) is an innovation in this area of research.

60.3 Methodology

In this section, the authors discuss the methodology of the Web document classi-
fication model (WDCM). The architecture of WDCM is shown in Fig. 60.1. The
methodology has four main components: fetching the URLs, applying filter 1 (the
sentiment filter), applying filter 2 (the document similarity filter), and incremental
learning.

Fig. 60.1 Diagrammatic representation of methodology with 'Web document classification model
(WDCM)'

60.3.1 Fetching of URLs

The authors have fetched URLs through the most popular search engine Google
by entering keywords related to the healthcare domain. The master-level or level-1
URLs are collected, duplicates are identified and removed, and a dataset of 2000
URLs is generated. These 2000 URLs are scraped at three different time
intervals T1, T2, and T3. Further, 100 seed URLs are manually labeled as relevant
and irrelevant.

60.3.2 Sentimental Filter

In this filter, all the hyperlinks present in the level-1 URL are fetched. These hyper-
links are termed level-2 URLs. Further, level-2 URL names are scraped, prepro-
cessed, and term frequency (TF) is generated. These terms are matched with the senti-
mental Bag-of-Words (BoW). The sentimental BoW is generated manually, which
is a one-time process, based on trusted Web sources from the healthcare domain. To
filter out the most relevant level-2 URLs, the count of positive and negative words is
generated and the threshold is computed based on these counts. The threshold is an
average of positive and negative word counts. Any URL, having a count of negative
and positive words greater than the threshold generated falls into the relevant cate-
gory of URLs and passed further into filter 2 which is a document similarity filter.
Other URLs are referred to as irrelevant and ignored.
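One possible reading of this filter is sketched below: sentiment-word counts are
accumulated per level-2 URL, the threshold is taken as the average count across
URLs, and URLs above the threshold are kept. The data structures and the exact
threshold rule are assumptions, not the authors' implementation.

# Sentiment filter sketch (one possible reading of the description above).
def sentiment_filter(url_terms, positive_bow, negative_bow):
    """url_terms maps each level-2 URL to its list of preprocessed terms."""
    counts = {}
    for url, terms in url_terms.items():
        pos = sum(t in positive_bow for t in terms)    # positive-word matches
        neg = sum(t in negative_bow for t in terms)    # negative-word matches
        counts[url] = pos + neg
    threshold = sum(counts.values()) / len(counts)     # average sentiment-word count
    return [url for url, c in counts.items() if c > threshold]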

60.3.3 Document Similarity Filter

In filter 2, the term frequency of the relevant URLs from the previous filter is accepted.
Based on these terms, the distance between the seed URLs and newly fetched URLs
is computed and recorded. This is done with help of three distance measures, viz.
Euclidean distance, Jaccard distance, and cosine similarity measure. Further, the
threshold value is computed separately for each distance measure which is an average
of distance values. If the newly arriving URL has a distance value greater than the
threshold, then the URL is considered relevant and stored in the URL repository along
with its timestamp. Otherwise, the URL is discarded and considered irrelevant.
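The three measures can be computed on term-frequency vectors as in the sketch
below (cosine similarity from scikit-learn, Euclidean and Jaccard distances from
SciPy); the vector representation and the thresholding comment are assumptions that
mirror the description above.

# Distance/similarity measures between a seed URL and a newly fetched URL.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from scipy.spatial.distance import euclidean, jaccard

def similarity_scores(seed_vec, new_vec):
    seed = np.asarray(seed_vec, dtype=float)
    new = np.asarray(new_vec, dtype=float)
    return {
        "cosine": float(cosine_similarity(seed.reshape(1, -1), new.reshape(1, -1))[0, 0]),
        "euclidean": euclidean(seed, new),
        "jaccard": jaccard(seed > 0, new > 0),   # on term presence/absence
    }

# A URL is kept when its score against the seed URLs exceeds the threshold
# (the average value) computed separately for each measure.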

60.3.4 Incremental Learning: Change Detection

Incremental learning is achieved in two ways. First, iteratively accessing the URLs
by classifying URLs into train and test split with five different iterations. Secondly,
detecting changes in the existing URLs at two different time intervals T2 and T3.
In this case, the Level-1 URLs are fetched once again to detect changes if any. In
this module, the first step is to verify whether the URL is relevant based on the seed
URLs; if it is found to be relevant, the term frequency of the level-1 URL contents is
generated, otherwise the URL is ignored and the next URL is fetched. Further, the master URL textual contents
are fetched and changes are detected and recorded. If there are no changes in the
document then the URL timestamp is recorded along with the content changes status
marked as unchanged.
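The change-detection step could be sketched as follows; hashing the fetched textual contents is only one possible way to detect changes and is an assumption, since the chapter does not specify the comparison mechanism.

```python
import datetime
import hashlib

def detect_changes(previous_hashes, current_texts):
    """previous_hashes: dict url -> content hash recorded at the earlier interval.
    current_texts: dict url -> freshly fetched textual contents.
    Returns (url, timestamp, status) entries and updates the stored hashes."""
    log, now = [], datetime.datetime.now().isoformat()
    for url, text in current_texts.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        status = "unchanged" if previous_hashes.get(url) == digest else "changed"
        log.append((url, now, status))
        previous_hashes[url] = digest        # keep the repository up to date for the next interval
    return log
```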

60.4 Results and Discussion

This section discusses the analysis of results and evaluates the performance of the
proposed model WDCM. Section 60.4.1 elaborates on the evaluation based on performance metrics, Sect. 60.4.2 discusses the analysis related to harvest ratios, and Sect. 60.4.3 explains the results of incremental learning.

60.4.1 Performance Metrics

This section shows the evaluation of results in terms of accuracy, precision, recall,
and F1-measure for the algorithm based on document similarity measures, viz. Jaccard
distance, Euclidean distance, and cosine similarity. Also, the authors have evaluated
the model with machine learning classifiers, viz. logistic regression (LR) and support
vector machine (SVM). Figures 60.2, 60.3, 60.4, and 60.5 display the accuracy,
precision, recall, and F1-score of the model. It can be seen from Fig. 60.2 that
cosine similarity measure with logistic regression showed the highest accuracy of
93.75% while Euclidean distance measure with logistic regression showed minimum
accuracy of 86.6%.
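For reference, a minimal sketch of how such an evaluation could be run with scikit-learn is shown below; the similarity-based feature matrix X and labels y are assumed inputs and do not reproduce the authors' exact pipeline.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def evaluate(X, y):
    """X: similarity-based feature vectors per URL; y: relevant (1) / irrelevant (0) labels."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    for name, clf in [("LR", LogisticRegression(max_iter=1000)), ("SVM", SVC())]:
        pred = clf.fit(X_tr, y_tr).predict(X_te)
        print(name,
              "acc=%.4f" % accuracy_score(y_te, pred),
              "prec=%.4f" % precision_score(y_te, pred),
              "rec=%.4f" % recall_score(y_te, pred),
              "f1=%.4f" % f1_score(y_te, pred))
```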

Fig. 60.2 Accuracy of the model based on distance measures with machine learning classifiers

Fig. 60.3 Precision of the model based on distance measures with machine learning classifiers

Fig. 60.4 Recall of the model based on distance measures with machine learning classifiers

Fig. 60.5 F1-score of the model based on distance measures with machine learning classifiers

60.4.2 Harvest Ratios

This section evaluates the results based on harvest ratio. The harvest ratio is computed
using the formula in Eq. 60.1. Figures 60.6 and 60.7 display the harvest ratio of URLs fetched in batches of 100 using logistic regression and support vector machine,

Fig. 60.6 Harvest ratio of URLs with distance measures for logistic regression classifier

Fig. 60.7 Harvest ratio of URLs with distance measures for support vector machine classifier

Table 60.1 Train and test split of data for five iterations of incremental learning

Iterations   Train   Test   Total
I1           280     120    400
I2           560     240    800
I3           840     360    1200
I4           1120    480    1600
I5           1400    600    2000

respectively. It can be seen that the Jaccard distance measure-based algorithm outperformed the other distance measures with a harvest ratio of 0.91 in the batch of 200–300 URLs with both logistic regression and support vector machine. Also, all the distance measures showed a harvest ratio of zero during batches 400–500 and 900–1000. This means that in these two batches, there were no relevant URLs.

\[ \text{Harvest Ratio} = \frac{\text{Number of relevant URLs}}{\text{Total number of URLs fetched}} \tag{60.1} \]
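A small sketch of Eq. (60.1) applied batch-wise, as in Figs. 60.6 and 60.7, is given below; the per-URL relevance flags are assumed to come from the classifier output.

```python
def harvest_ratios(relevance_flags, batch_size=100):
    """relevance_flags: one boolean per fetched URL, in fetch order.
    Returns the harvest ratio of Eq. (60.1) for each consecutive batch."""
    ratios = []
    for start in range(0, len(relevance_flags), batch_size):
        batch = relevance_flags[start:start + batch_size]
        ratios.append(sum(batch) / len(batch))
    return ratios
```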

60.4.3 Analysis of Results for Incremental Learning

This section elaborates on the results generated with the incremental learning technique. Table 60.1 shows the train and test split of data for five different iterations of 2000
URLs. It can be seen that every iteration has newly collected 400 URLs in addition to
URLs from previous iterations. The train and test splits are 70% and 30%, respectively
per iteration. Table 60.2 shows the accuracy of the model with respect to five iterations
of incremental learning based on distance measures. It can be seen from Table 60.2
that Euclidean distance outperformed other models with an accuracy of 96.071429
for train and 95.833 for testing while the Jaccard distance measure showed lower
accuracy on training and testing in comparison with other distance measures. At time
T2, it was observed that 72 URLs got changed and 33 URLs changed at time T3.
Figure 60.8 displays the number of relevant and irrelevant URLs at time T1, T2, and
T3. Table 60.3 shows the percentage of URLs fetched from various domains, viz.
.com, .edu, .gov, and others at different time intervals. It can be seen that maximum
URLs are with the .com domain. Figure 60.9 displays the accuracy of the proposed
model with distance measures at times T2 and T3. It can be observed that Euclidean distance outperformed the other measures with an accuracy of 94.44 at time T2, while the Euclidean and cosine similarity measures showed the highest accuracy of 93.93 at time T3.
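As a reference for the iteration scheme of Table 60.1, a minimal sketch of generating the five incremental 70/30 splits is shown below; the feature matrix and labels are assumed inputs.

```python
from sklearn.model_selection import train_test_split

def incremental_iterations(features, labels, chunk=400, n_iters=5):
    """Yield (X_train, X_test, y_train, y_test) for iterations I1..I5: each iteration adds
    `chunk` newly collected URLs to the data seen so far and uses a 70%/30% split."""
    for i in range(1, n_iters + 1):
        X, y = features[: i * chunk], labels[: i * chunk]
        yield train_test_split(X, y, test_size=0.3, random_state=0)
```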

Table 60.2 Accuracy (in %) with train and test split of data for five iterations of incremental learning based on distance measures

Distance measure     Iterations   Train        Test
Euclidean distance   I1           96.071429    95.833333
                     I2           96.607143    94.166667
                     I3           95.47619     92.5
                     I4           84.196429    92.291667
                     I5           87           91.5
Jaccard distance     I1           65.714286    83.333333
                     I2           71.785714    77.083333
                     I3           77.5         78.055556
                     I4           70.892857    81.25
                     I5           76           83.333333
Cosine similarity    I1           93.214286    95.833333
                     I2           94.821429    93.333333
                     I3           93.928571    90.833333
                     I4           83.125       91.041667
                     I5           85.928571    90.333333

Fig. 60.8 Number of relevant and irrelevant URLs at time T1, T2, and T3

Table 60.3 Domain-wise classification of URLs (in %)

Domain name   T1     T2      T3
.com          61.2   52.78   58.97
.edu          9.3    29.17   12.82
.gov          5.4    6.94    12.82
Others        24.1   11.11   15.38

Fig. 60.9 Accuracy of the model with distance measures at times T2 and T3

60.5 Conclusion and Future Enhancements

In this research, the authors have proposed an innovative Web document classification model (WDCM) to classify healthcare-related URLs using sentiment and document similarity filters. The model can also adapt to incremental data appearing in chunks at different instances of time. The analysis of the results shows that URLs tend to change their contents at different time intervals, and these changes have to be recorded to keep the data up-to-date. It is observed that, with machine learning classifiers, the cosine similarity measure with logistic regression showed the highest accuracy of 93.75%, while the Euclidean distance measure with logistic regression showed the minimum accuracy of 86.6%. When implemented as an algorithm, the Euclidean distance measure showed the highest accuracy of 96.60% for training and 95.83% for the testing dataset over five iterations, while the Jaccard distance measure showed a lower accuracy of 65.71% for training and 83.33% for the testing dataset. It is also observed that the maximum number of URLs fetched are from the .com domain. In the future, the authors want to propose a new distance measure to classify URLs as relevant and irrelevant and evaluate the performance of the model.

Chapter 61
An Empirical Research on the Impact
of Digital Marketing and Data Science
on Indian Education System

S. Sushitha and Chethan Shetty

Abstract The Indian Education System, being one of the largest in the world, has approximately 760 universities and 38,498 colleges. The rapid growth of web technologies, the Internet of Things and digital media has transformed the education sector because of the country's youth's dependence on the Internet. By influencing how people interact, work and live, digital marketing is playing an incredible role in current youth's education, not only in connecting but also in the process of teaching and learning. The basic strategy behind this education-based marketing is the sharing of knowledge by establishing credibility and trust. It is possible to achieve astonishing results when education is combined with marketing. The paper describes the impact of applying advanced data science techniques on digital marketing strategies and its influence on the Indian Education System.

Keywords Data analytics · Internet of things (IoT) · Education sectors · Digital marketing · Data science · Machine learning

61.1 Introduction

In India, education is a holistic process which focuses on physical, emotional and social development. This sector also plays a foundational role in the economic growth of an individual. The adoption of IoT [1] and augmented reality, along with digital marketing strategies ranging from Website development and Web casting to online advertisement, delivers the required information to customers by exploiting technology [2]. The dawn of the IoT [3], with the revolutionary arrival of social media and smart phone technologies, has made the education sector more dependent on digital strategies to reach out to potential students.

S. Sushitha (B)
Department of MCA, Nitte Meenakshi Institute of Technology, Bangalore, India
e-mail: sushitha.s@nmit.ac.in
C. Shetty
Nitte Meenakshi Institute of Technology, Bangalore, India
e-mail: chethan.shetty@nitte.edu.in


The evolution of digital marketing along with current data science technologies
has developed a digital ecosystem which can connect the user at any point of time [4].
By targeting a huge online database with a large data repository, referred to as big data [5], and building a scientific knowledge discovery database [6], digital marketing strategies have provided remarkable insight into the Indian Education System during the pandemic crisis. The usage of machine learning algorithms and methodologies [7], along with online-marketing strategies, has created a data-driven environment in the education sector.

61.1.1 The Rise of Digital Marketing

The term “Digital Marketing” refers to the usage of data with respect to the marketing
objectives. It is the skill of building marketing plans focused on prospective customers digitally, utilizing consumer insights. In order to connect with existing and potential customers, digital marketers leverage digital channels such as social media, search engines and website building [8]. By developing an analytical framework [9] using advanced computer vision methods, the collected digital data can be analysed more effectively. The automation-augmented [10], technology-enabled innovations, along with marketing knowledge, create a huge opportunity for diverse marketing contexts in real time. Major studies have shown that emerging data analytics [11] methods give rise to new marketing innovations in the education sector.
The remainder of the paper is organized as follows: the next section presents the theoretical background of digital marketing, followed by marketing skills and an introduction to data science and its relation to digital marketing. The sections "Research Methodology" and "Result and Discussion" present the empirical methodology and its results. Finally, the last section draws the conclusion and provides the future aspects of the current work.

61.2 Theoretical Background

61.2.1 Digital Marketing

According to experts, digital marketing [12] is a collaborative endeavour in which specialists from various domains combine their knowledge and available digital data resources to achieve the organization's mission. Some of the frequently used digital marketing skills are explained below:
a. Search Engine Optimization (SEO) [13]: SEO is the art of learning how search
engines like Google index data by optimizing the actual page content and its

quality to increase the Web users. This provides a reference to and from a web
page and its contents, keywords and the count of users who access it.
b. Search Engine Marketing (SEM) [13]: This type of marketing mainly focuses
on methods to increase the visibility of search results. It is also referred as a
“paid-search”. Keywords play a vital role in this type of marketing. These are
the most recognizable terms associated with a specific item or brand.
c. Social Media Marketing (SMM) [14]: This type of marketing majorly uses the
social media platform to build brand, to connect audience and to drive website
traffic. By publishing a great content on the media platforms like Facebook,
Twitter, Instagram, LinkedIn and YouTube, they engage their audience and
analyse the results. Running a social ad by using social media management
tools like buffer platforms helps to target more people.
d. Affiliate Marketing [15]: Affiliate marketing is a process in which an affiliate
gains reward by promoting a product of some other individual or organization.
In this type, it is possible to combine the skills of a diverse group of people to create a more effective marketing strategy.
e. Content Marketing: This is a strategic approach which involves the creation
and distribution of relevant, consistent content to attract and retain the targeted
audience actions. A high quality, entertaining and informative content in front
of the right audience becomes the heart of any marketing strategy. It includes
blogs, infographics, videos, pod ads, ads, social media posts and web pages.
Planning, creating and analysis of a content play a vital role in this type of
marketing.
f. Website Retargeting [16]: Retargeting is an online advertising technique that
allows you to keep your brand in front of bounced Website visitors. This strategy
involves Google ads, Facebook-Instagram Retargeting, Banner ads and Digital
retargeting search ads. This type of marketing requires considerable planning
and insights.

61.2.2 Data Science

The term data science is all about gathering data and analysing it for a better decision-making system. The entire process relies on identifying patterns in the data through statistical analysis and making future predictions based on them. Since data is complex in nature (unstructured/semi-structured), the analysis of such data becomes a tedious process [16]. With the help of inference, computer science technologies and machine learning algorithms, along with some predictive models, it is possible to gain insight from such complex data.
To define the process of data science, it is required to understand the life-cycle and the different stages associated with it. The pipeline workflow involves steps such as acquiring the data, extracting it, analysing it, managing it and so on. Fundamental to data science is the data pre-processing step, which involves data classification, clustering and data modelling, using insights gleaned from the data in

Fig. 61.1 Life-cycle of data science process [17]

order to create effective data for further processing. Figure 61.1 represents the life-cycle of the data science process.

61.2.3 Data Science in Digital Marketing

Data science is the process of analysing large sets of structured and unstructured data to detect patterns and to extract the required insights from them. The basic steps of analysis include acquiring the data, data extraction, data cleansing, data pre-processing, data staging, etc. The rapid revolution of the Internet in the pandemic crisis clearly shows that the optimization of digital marketing strategy with data science technology [4] can create the best professional opportunities as well as a better lifestyle in the digital era. Some of the data science technologies used in digital marketing strategy are listed in Table 61.1.

61.3 Digital Marketing in Educational Sector

The student community is the most dominant and active group, for whom the Internet has become a primary source of information. The usage of the Internet and online engagement is not only limited to gaining knowledge about subjects, syllabus, projects, etc., but they

Table 61.1 Optimization of data science and digital marketing

1.  Pattern recognition using machine learning or artificial intelligence: To identify duplicate content in the search engine, which is a basis for SEO
2.  Predictive analysis: To understand future patterns in order to reach more of the audience and to qualify and prioritize the leads
3.  IoT and Google Analytics: To analyse and understand the purchase path of a customer, and to provide adequate exposure to the marketing team as well as the customer through a smooth online platform
4.  Sentimental analysis: To analyse the behaviour pattern of a customer through reviews and opinions
5.  Classifications: To understand and segregate the customers
6.  Cloud computing: To analyse the large social conversations targeting a high level of social activity
7.  Unsupervised data mining (market-basket analysis): To understand the buying behaviour and the correlation between the customer and the purchase
8.  Time series analysis: To identify the right channels for the right audience
9.  Artificial intelligence: To study and understand customer psychology
10. Big data: To store the huge collected data for further analysis

also use it to find out about courses, colleges, admission processes, rankings, placement records, etc. In this digital era, before visiting any education sector, every parent conducts online research to know more about academics.
Benefits of digital marketing on education sectors are:
a. Create Credibility: The good thing about having a web presence for an
educational establishment is that it establishes enormous credibility.
b. Boost Brand awareness: Through social media platforms, it is easy to achieve
online visibility to generate effective brand awareness by reaching a proper
target group.
c. Cost Effective: With a low investment, it is possible to focus on a bigger audience and achieve considerable benefits.
d. Virtuous feedback: The feedback, responses and queries are easily recorded in no time by using social media channels.
e. More reach to potential students: Through social networking sites, banners, email and online ads, a lot of web traffic can be generated; this increases the conversion rate on the Internet for an education sector and provides more visibility to potential students.

61.4 Research Methodology

The authors have adopted the empirical research methodology [18] to analyse the
impact of applying advanced data science techniques on digital marketing strate-
gies and its influence on Indian Education System. The evidence is gathered using
qualitative and quantitative market research methods [19].
a. Quantitative Research methods
1. Survey Research: The authors have performed survey research using a predetermined set of closed questions which are easy to answer in less time. The survey is conducted to gather the opinions of students belonging
to different educational sectors.
2. Longitudinal study [20]: To analyse the behaviour or attitude of a participant, a "Panel study" is performed targeting the same group of students at specified intervals of time.
b. Qualitative Research methods
The authors have performed the focus group [21] qualitative research method targeting a limited number of participants. The survey was conducted in online mode using "why", "what", "how" format-based questions. The authors have conducted a mini-focus group survey for a minimum set of target participants.
The survey sample size was limited to 25 participants, and the questions were
in English language. This methodology completely focused on the education
sector decision-making system.

61.5 Result and Discussion

The authors have performed descriptive research analysis to investigate the impact
of digital marketing strategy and data science technologies on education sectors.
The data is collected using a questionnaire survey method, and it is empirically tested using both qualitative and quantitative methods. The questionnaire was prepared in English.
a. Sample Profile
The authors have adopted “simple random probability sampling method [22]”
for data collection. A total of 324 responses were received for the survey, out of which 200 responses were further examined after eliminating duplicate and irrelevant responses. In order to study the major behaviour of the participants with respect to the education sector, the authors have conducted a "Panel study" keeping a 30-day time variation. Apart from this, around 25 responses were collected in the qualitative approach using the "mini-focus group survey" method targeting the academic students of different education sectors.

Table 61.2 Variables of bivariate analysis

Explanatory variables               Response variables
Age-group: 18–25, 26–35, 36–50      Digital marketing strategies; data science technologies; optimization of both
Gender: Male, Female                Digital marketing strategies; data science technologies; optimization of both

b. Statistical Analysis
To determine the empirical relationship between the variables, the authors have
performed the bivariate analysis considering the following variables (Table
61.2).
To find out the associations between these variables, the analysis has been done in two major steps. In the first step, we have analysed the relationship of different age-groups with the outcome variables. The authors have considered three age-groups: group 1 targets participants from 18 to 25 years, group 2 targets participants from 26 to 35, and group 3 targets participants from 36 to 50. The result of the analysis is presented in Table 61.3.
In the second step of the analysis, the association between gender and the outcome variables is calculated. Here a separate analysis was done on the survey results for both the male and female groups. The detailed result of the same is represented in Table 61.4.
In the further analysis, the authors have estimated the effect of different age-groups with respect to gender on the outcome variables. The result of the same is shown in Fig. 61.2.

Table 61.3 Result of bivariate analysis on age-group versus response variables

Age-group (x)   Knowledge on digital marketing   Knowledge on data science technologies   Optimization of both
18–25           135 (65%)                        145 (72.5%)                              152 (76%)
26–35           50 (25%)                         41 (20.5%)                               38 (19%)
36–50           15 (10%)                         14 (7%)                                  10 (5%)

Table 61.4 Result of bivariate analysis on gender versus response variable

Gender   Knowledge on digital marketing   Knowledge on data science technologies   Optimization of both
Male     99 (49.5%)                       115 (57.5%)                              100 (50%)
Female   101 (50.5%)                      85 (42.5%)                               100 (50%)

Fig. 61.2 Result of the estimated analysis w.r.t. age-group, gender and outcome variables with data table

The trend line in the above graph clearly indicated that age-group 18–25 in both
male and female categories had a positive impact on the outcome variables.
c. Correlation Coefficient
In order to measure the strength of the linear relationship between the variables and to compute their association, the correlation coefficient for the collected data, considering the explanatory and outcome variables, is calculated. Table 61.5 represents the result of the same.
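For illustration, a correlation coefficient of this kind could be computed as in the sketch below; the coded survey responses are hypothetical values, not the collected data.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical coded responses for respondents in the 18-25 age group:
# 1 = reports knowledge of the topic, 0 = does not (illustrative values only).
knows_digital_marketing = np.array([1, 1, 0, 1, 1, 0, 1, 1, 1, 0])
knows_data_science      = np.array([1, 1, 0, 1, 0, 0, 1, 1, 1, 1])

# Pearson correlation coefficient between the two response variables;
# a value greater than zero means the variables move in the same direction.
r, p_value = pearsonr(knows_digital_marketing, knows_data_science)
print(f"correlation = {r:.3f}, p-value = {p_value:.3f}")
```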
In the above result, it is clearly observed that, for the age-group 18–25 (the target group for the education sector) with respect to gender (M/F), the correlation between the two variables is greater than zero. Hence, it indicates a strong positive correlation, signifying that both variables move in the same direction. When we consider the age-groups 26–35 and 36–50, a negative correlation is observed. Hence, the research clearly indicates that the impact of data science technologies, digital marketing strategy and the optimization of both has a significant dependence on the Indian education sector, because the majority of Indian students belong to the age-group 18–25.
The targeted age-group 18–25 is further analysed with respect to the education sector by conducting a panel study survey analysis. The variable age-group was restricted here with the condition Age ≥ 18 and Age ≤ 25 against the outcome variables. The survey was conducted for a period of two months, keeping a 30-day time variation. Both sets of results were analysed, and very few opinion changes were observed. The result of the study is shown in Fig. 61.3.
In the qualitative research method, the authors have performed a mini-focus group analysis targeting the Indian education sector. Here, the age-group was restricted to 18–25 and a limited number of participants (25) were considered for the study. The study
Table 61.5 Result of correlation coefficient calculation of variables

Fig. 61.3 Result of panel-study survey of variables for age-group 18–25

mainly focused on the queries related to the education sector decision-making system. In the analysis, it was clearly observed that 90% of the Indian students use the Internet and digital media like social media and online ads to make their education (academic)-related decisions. The result is shown in Fig. 61.4.

Fig. 61.4 Result of mini-focused study for the education sector decision-making system (response to social media ads 95%, decision on reviews 92%, organization ranking 90%, faculty profile 62%, organization events 40%, alumni database 55%, placement statistics 90%, admission related queries 96%, brand awareness 90%)

61.6 Conclusion and Implications

The findings from the empirical analysis clearly indicate that there is a substantial association between digital marketing strategy, data science technology and the Indian education sector. The results revealed that 98% of the students belong to the age-group 18–25 and are dependent on the digital medium for their education-based decisions. Hence, the optimization of both digital marketing and data science has a significant effect and dependency on education decision behaviours. The limitations relate to the sample collected and the geographical restrictions. The research focused mainly on the analysis of effectiveness in the education sector; in future research, other sectors like medical and agriculture could be considered, targeting different age-groups.

Acknowledgements The authors would like to thank the students, parents and the staff members of various education sectors, whose responses and cooperation have contributed to a major part of the research.

References

1. Spaltro, C.E.: Connecting the dots: how IoT is going to revolutionize the digital marketing landscape for millennials (2016)
2. Malar, P.J.M.A.J.: Innovative digital marketing trends 2016. In: 2016 International Conference
on Electrical, Electronics, and Optimization Techniques (ICEEOT), March, pp. 4550–4556.
IEEE (2016)
3. Mohammadian, A., Mirbagheri, F., Khanlari, A.: Identification and classification of innovative
applications of internet of things in digital marketing. J. Bus. Manag. 11(4), 719–741 (2019)
4. Saura, J.R.: Using data sciences in digital marketing: framework, methods, and performance
metrics. J. Innov. Knowl. 6(2), 92–102 (2021)
5. Gandomi, A., Haider, M.: Beyond the hype: big data concepts, methods, and analytics. Int. J.
Inf. Manag. 35(2), 137–144 (2015)
6. Piatetsky-Shapiro, G.: Notes of IJCAI’89 Workshop Knowledge Discovery in Databases
KDD’89. Detroit, Michigan (1989)
7. Caseiro, N., Coelho, A.: The influence of Business Intelligence capacity, network learning and
innovativeness on startups performance. J. Innov. Knowl. 4(3), 139–145 (2019). https://doi.
org/10.1016/j.jik.2018.03.009
8. Htun, H.L.: The Impact of Digital Marketing Channels on Consumer Purchasing Behavior
during COVID-19 Pandemic in Myanmar (2021)
9. Bharadwaj, N., Ballings, M., Naik, P.A., Moore, M., Arat, M.: A new livestream retail analytics
framework to assess the sales impact of emotional displays. J. Mark. 86(1), 27–47 (2022)
10. Raisch, S., Krakowski, S.: Artificial intelligence and management: the automation-
augmentation paradox. Acad. Manag. Rev. 46(1), 192–210 (2021)
11. Cheng, S., Ma, L., Lu, H., Lei, X., Shi, Y.: Evolutionary computation for solving search-based
data analytics problems. Artif. Intell. Rev. 54(2), 1321–1348 (2021)
12. Das, S.: Search Engine Optimization and Marketing: A Recipe for Success in Digital Marketing.
CRC Press (2021)
13. Li, F., Larimo, J., Leonidou, L.C.: Social media marketing strategy: definition, concep-
tualization, taxonomy, validation, and future agenda. J. Acad. Mark. Sci. 49(1), 51–70
(2021)

14. Suryanarayana, S.A., Sarne, D., Kraus, S.: Information design in affiliate marketing. Auton.
Agent. Multi-Agent Syst. 35(2), 1–28 (2021)
15. Maurer, C.: Digital Marketing in Tourism (2021)
16. Naeem, M., Jamal, T., Diaz-Martinez, J., Butt, S.A., Montesano, N., Tariq, M.I., De-la-Hoz-
Franco, E., De-La-Hoz-Valdiris, E.: Trends and future perspective challenges in big data. In:
Advances in Intelligent Data Analysis and Applications, pp. 309–325. Springer, Singapore
(2022)
17. Rose, D.: Using a data science life cycle. In: Data Science. Apress, Berkeley, CA (2016). https://
doi.org/10.1007/978-1-4842-2253-9_12
18. Hoddy, E.T.: Critical realism in empirical research: employing techniques from grounded theory
methodology. Int. J. Soc. Res. Methodol. 22(1), 111–124 (2019)
19. Queirós, A., Faria, D., Almeida, F.: Strengths and limitations of qualitative and quantitative
research methods. Eur. J. Educ. Stud. (2017)
20. Haugan, J.A., Frostad, P., Mjaavatn, P.E.: A longitudinal study of factors predicting students’
intentions to leave upper secondary school in Norway. Soc. Psychol. Educ. 22(5), 1259–1279
(2019)
21. Conway, G., Doherty, E., Carcary, M.: Evaluation of a focus group approach to developing
a survey instrument. In: Proceedings of the European Conference on Research Methods for
Business & Management Studies, pp. 92–98 (2018)
22. Zaman, T.: An efficient exponential estimator of the mean under stratified random sampling.
Math. Popul. Stud. 28(2), 104–121 (2021). Kannan, P.: Digital marketing: a framework, review
and research agenda. Int. J. Res. Mark. 34(1), 22–45 (2017)
Chapter 62
A Text Classification Optimization
Framework for Prodigious Datasets

Gunjan Singh and Arpita Nagpal

Abstract This paper introduces the Tunicate Swarm Algorithm-based Hierarchical Attention Network (TSA-HAN). TSA-HAN is the combination of the Tunicate Swarm Optimization Algorithm (TSA), which uses jet propulsion and swarm intelligence, and the Hierarchical Attention Network (HAN), which makes use of the leveled document structure. The proposed optimized algorithm is used for text classification. The performance of TSA-HAN is evaluated based on five parameters, namely accuracy, TPR, TNR, precision, and FNR. For this purpose, a self-created dataset named the real-time dataset, consisting of 5000 documents, and the popular Reuters and 20-Newsgroup datasets have been utilized, and the potency of the said optimized algorithm has been further compared with an existing improved sine cosine algorithm (ISCA). The comparative analysis results show that TSA-HAN performs slightly better than ISCA.

Keywords Tunicate · Hierarchical networks · Attention layers · Text classification · Deep learning optimizations

62.1 Introduction

With the explosion of data resources on the Web and corporate intranets [1], there is an imminent requirement for more effective and efficient technologies to assist with searching and managing these resources. However, the increase in the availability of huge text data from different sources creates various issues and difficulties for text analysis and classification [2].
Text classification can be performed either through manual annotation [3, 4] or via automatic labeling [4]. With the growing size of text data in modern applications, automatic text classification is becoming increasingly significant [4]. Approaches to automatic text classification [4] can be grouped into two

G. Singh (B) · A. Nagpal


G. D. Goenka University, Gurugram, India
e-mail: gunjan9010@gmail.com


classes, i.e., rule-based methods and machine learning (data-driven)-based methods [3, 4].
Deep learning [5, 6] is a machine learning method [6] based on representation [6] learning, where the framework automatically learns and discovers the features required for classification from the processing of multiple layers of data. It has become a standard machine learning strategy with capabilities in various nonlinear modeling tasks, for example, classification and feature extraction from complex datasets [1]. Recently, many deep learning techniques have been investigated for text classification [1]. Deep learning has been shown to be effective for end-to-end learning of hierarchical feature representations, and deep neural networks have exhibited superior performance for hierarchical text classification [1, 5]. Rule-based techniques group text into various classes using a set of pre-defined rules and require deep knowledge of the domain [7]. The machine learning-based [7] methodologies learn to make classifications based on past observations of the data. Using pre-labeled examples as training data, a machine learning algorithm can learn the inherent relationship between pieces of text and their labels. Machine learning models have attracted a lot of attention in recent years [4]. Most traditional machine learning-based models follow the well-known two-step procedure, which has limitations [4]. These models cannot exploit large amounts of training data because the features are pre-defined. Neural approaches have been investigated to address the constraints arising from the use of hand-crafted features as previously mentioned. The core of these approaches is a machine-learned embedding model that maps text into a low-dimensional continuous feature vector, and thus no hand-crafted feature is required [4, 5].
Subsequently, a new neural architecture, the Hierarchical Attention Network (HAN), was designed to capture two fundamental insights about document structure [3]. First, since documents have a hierarchical structure (words form sentences, sentences form a document), a document representation is likewise built by first constructing representations of sentences and then aggregating those into a document representation [3]. Second, it is observed that different words and sentences in documents are differently informative [3]. In addition, the importance of words and sentences is highly context-dependent, i.e., the same word or sentence may be differently important in different contexts [3].
Real-life problems have enormous solution spaces that involve non-linear constraints. They face several issues, such as high computational cost and a large number of variables, and are complicated in nature. The classical approaches, giving local optimum solutions, do not guarantee the best solution. Hence, metaheuristic algorithms, which are generally computationally cheap, flexible, and simple, have been proposed by researchers. Metaheuristic algorithms can be either single solution-based algorithms (SSBA) [5] or population-based algorithms (PBA) [5]. PBAs can find the global optimum and rely on the behavior of physics-based algorithms [5], the swarm intelligence of particles [5], and the biological behavior of bio-inspired algorithms [5]. TSA is one such algorithm. It is a bio-inspired metaheuristic algorithm inspired by the swarm [5] behavior of tunicates [5] that enables them to survive in the depths of the ocean. It can optimize nonlinear constrained problems. It works on the jet propulsion and swarm behavior of tunicates [4]. Along these lines, HAN is trained by TSA to give the global best solution for text classification.
The paper is organized as follows [1]. This section [1] begins with a brief introduction to text classification to provide the basic concept and background knowledge [1]; Sect. 62.2 discusses related works [8]. In Sect. 62.3 [8], the research methodology is discussed; Sect. 62.4 discusses text classification; Sect. 62.5 is further subdivided into four parts and discusses the results and analysis in detail; finally, Sect. 62.6 concludes the research work [9].

62.2 Related Works

Stein, R. A. et al. introduced text classification (TC), a.k.a. text categorization or topic categorization, as the field that studies solutions for automatically classifying a portion of textual data so that information system users can more effectively retrieve, extract, and manipulate data to recognize patterns and create knowledge [1]. Organizing electronic documents into classes has become of growing interest to many individuals and organizations [1].
Belazzoug et al. [10] developed an improved sine cosine algorithm (ISCA) [11] for selecting the features [11]. This algorithm [11] explores new regions of the search space and focuses on the best solution for generating the optimal solution. It considers two solutions, namely the location of the best solution and a random location from the search space. It helps to eliminate premature convergence and increases performance. It is efficient in solving feature selection problems, but the method was not applied to other, more complicated problems.
Kaur et al. [4] developed the Tunicate Swarm Algorithm (TSA), which is a bio-inspired algorithm [12] based on the swarm [12] behavior and the jet propulsion of tunicates [12] during foraging and navigation [12]. Tunicates are brilliantly bio-luminescent creatures that produce light which can be seen from a long distance. A tunicate is a creature that moves around the ocean with a propulsion mechanism much like a fluid jet. The swarm behavior and jet propulsion of TSA help to increase the convergence speed of the optimization and thus enable it to generate the global best solution by discarding local optima. Kaur et al. [4] investigated a set of 74 benchmark test functions belonging to the CEC-2015 and CEC-2017 test suites [5]. The statistical outcomes demonstrated the effectiveness of TSA toward achieving globally optimal [5] solutions, with better convergence in comparison with the Spotted Hyena Optimizer (SHO) [13], Gray Wolf Optimizer (GWO) [14], Particle Swarm Optimization (PSO) [15], Multi-Verse Optimizer (MVO) [16, 23], Gravitational Search Algorithm (GSA) [9, 23], Genetic Algorithm (GA) [17], Emperor Penguin Optimizer (EPO) [12], JSO [18], and the sine cosine algorithm (SCA) [8]. TSA shows superiority over the other algorithms in terms of best, mean, and median values [4]. For the CEC-2015 [5] and CEC-2017 [5] benchmark test functions, the competitor algorithms rarely found the globally optimal solutions; in contrast, the performance of TSA is found to be precise and reliable [5]. They also investigated the impact of scalability and sensitivity on the effectiveness of TSA, and the simulation results reveal that the proposed algorithm is less vulnerable when compared with other algorithms [5]. Kaur et al. [4] inferred that the proposed TSA is applicable to real case studies with unknown search spaces. TSA can be extended to tackle multi-objective optimization.
Yang et al. [3] developed the Hierarchical Attention Network (HAN). The advantage of using HAN for text classification is to capture two essential insights regarding document structure. First, as documents have a hierarchical structure, the document representation is built by modeling sentence representations and then aggregating them into a text representation. Second, it is noticed that different sentences and words in documents are differently informative. To improve classification performance, HAN incorporates two distinct levels of attention, one at the word level [3] and the other at the sentence level [3]. Additionally, this network not only improves performance; it also gives insight into which words and sentences contribute to the classification decision, which can be of value in application and analysis [3]. Yang et al. [3] use an attention mechanism combined with the hierarchical structure, improving on previous models. It captures diverse contexts and assigns context-dependent weights to the words, and can thus capture context-dependent word importance. This model progressively builds a document vector by aggregating important words into sentence vectors and then aggregating important sentence vectors into document vectors [3]. This model performs significantly better compared to previous methods (LSTM-GRNN) [7], as neural network-based techniques that do not exploit hierarchical document structure, for example, LSTM [7, 19], CNN-word [6], and CNN-char [20], have minimal advantage over conventional methods at large scale. Visualization of these attention layers is effective in selecting significant words and sentences.
We studied many research papers and found that, though there is ample work existing on text classification, there is scope for improvement [2]. Hence, the motivation is to optimize the text classification process through an algorithm that is capable of handling large data to provide better performance and accuracy [2].

62.3 Research Methodology

Figure 62.1 represents the block diagram of the proposed method [14]. The intention is to design and develop a new method for text classification using TSA-HAN [21]. The proposed [21] text classification process involves different phases, such as pre-processing, feature extraction, feature selection, and text classification, in order to classify the text data [17]. At first, the input text data is fed to the pre-processing step, where stop word removal and stemming are carried out in such a way that the pre-processed result helps to increase the performance of text classification [10]. The pre-processed result is fed to the feature extraction

Fig. 62.1 Block diagram of the proposed methodology [14]: input text data → pre-processing → feature extraction → feature selection → text classification (TSA-HAN) → classified text

phase, where features like term frequency-inverse document frequency (TF-IDF), WordNet-based features, and co-occurrence-based features are effectively extracted from the pre-processed text data [10, 13]. After extracting the features, feature selection is done by the Tunicate Swarm Algorithm (TSA) [4]. After the completion of the feature selection process, the text classification process begins in such a way that the text classification is accomplished using the Hierarchical Attention Network (HAN), which is trained by the Tunicate Swarm Optimization algorithm (TSO). Finally, we get the classified text.
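A minimal sketch of the pre-processing and TF-IDF feature-extraction phases is given below, assuming NLTK and scikit-learn as the supporting libraries; the WordNet-based and co-occurrence-based features and the TSA-based selection step are omitted, and the sample documents are hypothetical.

```python
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer

nltk.download("stopwords", quiet=True)
STOP_WORDS = set(stopwords.words("english"))
STEMMER = PorterStemmer()

def preprocess(document):
    # Pre-processing phase: stop-word removal and stemming
    tokens = [t.lower() for t in document.split() if t.isalpha()]
    return " ".join(STEMMER.stem(t) for t in tokens if t not in STOP_WORDS)

def tfidf_features(documents):
    # Feature-extraction phase: TF-IDF over the pre-processed corpus
    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform([preprocess(d) for d in documents])
    return X, vectorizer

# Hypothetical usage with two toy documents:
# X, vec = tfidf_features(["Cloud computing in healthcare systems",
#                          "Big data analytics for bioinformatics"])
```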

62.4 Text Classification

HAN consists of a word encoder, a word attention layer, a sentence encoder, and a sentence attention layer [3, 13]. In the word encoder, the input feature is embedded into vectors using the embedding matrix, and a bidirectional gated recurrent unit [20] (GRU) is used to get the annotations of the words by summarizing the information from both directions of the text [20]. In the word attention layer, an attention module is used for extracting the words that are significant in the sentence. The representations of the informative features are aggregated to form the sentence vector. During the training process, the word context vector is initialized randomly and learned jointly. In the sentence encoder, the document vector is derived from the sentence vectors, using a bidirectional GRU for encoding sentences. In sentence attention, the sentence-level context vector is used to find the importance of sentences.

\[ d_v = \tanh(B_w i_v + t_w) \tag{62.1} \]

\[ \alpha_v = \frac{\exp\!\left(d_v^{\top} U\right)}{\sum_{v} \exp\!\left(d_v^{\top} U\right)} \tag{62.2} \]

\[ D = \sum_{v} \alpha_v \cdot i_v \tag{62.3} \]

where D denotes the document vector that contains the information of the text in the document. The output text data is generated at the output layer.
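A NumPy sketch of the attention aggregation in Eqs. (62.1)–(62.3) is shown below; the GRU encoder outputs, dimensions, and parameter values are random placeholders rather than the trained HAN.

```python
import numpy as np

def attention_pool(I, B_w, t_w, U):
    """Attention aggregation of Eqs. (62.1)-(62.3).
    I: annotations i_v from the encoder (e.g. a bidirectional GRU), shape (n, h)
    B_w: weight matrix (h, h); t_w: bias (h,); U: context vector (h,)."""
    d = np.tanh(I @ B_w + t_w)                       # Eq. (62.1): d_v = tanh(B_w i_v + t_w)
    scores = d @ U                                    # d_v^T U
    alpha = np.exp(scores) / np.exp(scores).sum()     # Eq. (62.2): softmax weights alpha_v
    return (alpha[:, None] * I).sum(axis=0)           # Eq. (62.3): D = sum_v alpha_v * i_v

# Hypothetical usage with random annotations for a sentence of 5 words and hidden size 8:
rng = np.random.default_rng(0)
h = 8
D = attention_pool(rng.normal(size=(5, h)), rng.normal(size=(h, h)),
                   np.zeros(h), rng.normal(size=h))
```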
Training process of HAN: The training procedure of HAN is carried out with the TSO algorithm.
Solution encoding [21]: It is the representation of the solution vector that determines the optimal solution more accurately [21].
Fitness function: The fitness measure is employed to obtain the optimal solution by computing the error difference between the target and the actual output value and is specified as [21],

\[ L = \frac{1}{m} \sum_{\beta=1}^{m} \left( A_\beta - T_\beta \right)^2 \tag{62.4} \]

where L denotes the fitness measure, A indicates the actual value, and T signifies the classified result.
As the training of HAN is done by TSA for text classification to get the global optimum solution, it is required to update the solution so that it converges towards the position of the best search agent. The process of TSA includes initialization of the tunicate population, computing the fitness measure, updating the solution, evaluating the feasibility, and termination. Therefore, the position update equation can be given as,

\[ M_b^{c+1} = E + L \cdot P \tag{62.5} \]

where E is the best search agent or food source in TSA, L denotes the vector, M_b^{c+1} is the position of the swarm in the next iteration, and P is the distance between the food source and the search agent [5].
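A highly simplified sketch of such a swarm-based training loop is given below; it uses the fitness of Eq. (62.4) and a position update of the form of Eq. (62.5), but the coefficient vector, population size, and network-evaluation function are placeholders, and it does not reproduce the full TSA update rules.

```python
import numpy as np

def fitness(weights, predict_fn, targets):
    # Eq. (62.4): mean squared error between actual outputs and target (classified) values
    return np.mean((predict_fn(weights) - targets) ** 2)

def swarm_train(predict_fn, targets, dim, pop_size=30, iterations=100, seed=0):
    """predict_fn: maps a candidate weight vector to the network's outputs (placeholder)."""
    rng = np.random.default_rng(seed)
    swarm = rng.uniform(-1.0, 1.0, size=(pop_size, dim))                    # tunicate population
    best = min(swarm, key=lambda w: fitness(w, predict_fn, targets)).copy()
    for _ in range(iterations):
        for i in range(pop_size):
            P = np.abs(best - swarm[i])                  # distance to the food source E
            L = rng.uniform(0.0, 1.0, size=dim)          # placeholder coefficient vector
            swarm[i] = best + L * P                      # update of the form of Eq. (62.5)
        candidate = min(swarm, key=lambda w: fitness(w, predict_fn, targets)).copy()
        if fitness(candidate, predict_fn, targets) < fitness(best, predict_fn, targets):
            best = candidate
    return best
```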

62.5 Results and Analysis

62.5.1 Experimental Setup [21]

The implementation of the proposed method is carried out in the PYTHON tool with Windows 10 OS, an Intel processor, and 2 GB RAM [21].

62.5.2 Dataset Description

The dataset used for the implementation includes Reuters dataset [22], 20-Newsgroup
dataset [11], and real-time data.
Reuters dataset: This dataset contains 21,578 instances, where 19,043 documents
are selected for the classification process. Here, the documents are assembled and indexed based on their categories. It contains 5 attributes and 206,136 web hits. It does not contain any missing values.
20-Newsgroup dataset: It contains a collection of newsgroup documents and is popular for experiments in text applications of machine learning methods, like text clustering and text classification. Here, the duplicate messages are removed such that the original messages contain 'from' and 'subject' headers.
Real-time data: Here 20 topics are selected, and for each topic, 250 papers are collected from the Springer and ScienceDirect web sources. Some of the topics include
advances in data analysis, artificial intelligence, big data, bioinformatics, biometric,
cloud computing, and so on. Hence, it contains 5000 documents for the text
classification process.

62.5.3 Evaluation Metrics [21]

The performance [21] of the developed method [21] is analyzed by considering measures [21] such as accuracy, TPR, TNR, FNR, and precision.
Accuracy: It is the measure that shows the proportion of correct observations to the total number of observations and is indicated as,

\[ \text{Accuracy} = \frac{F_p + F_n}{F_p + F_n + K_p + K_n} \tag{62.6} \]

where F_p denotes true positive [18], F_n indicates true negative [18], K_p signifies false positive [18], and K_n represents false negative [18].
TPR: It is the measure that detects the positive result more accurately among the
total observations and is specified as,

\[ \text{TPR} = \frac{F_p}{F_p + K_n} \tag{62.7} \]

TNR: It is used to detect the negative result more accurately and is specified with the
below equation as,
684 G. Singh and A. Nagpal

\[ \text{TNR} = \frac{F_n}{F_n + K_p} \tag{62.8} \]

FNR: It is termed as the proportion of positive values that yield negative results, and
it is represented as,

\[ \text{FNR} = \frac{K_n}{K_n + F_p} \tag{62.9} \]

Precision: It is the ratio of correctly classified relevant text instances to the total instances classified as relevant and is represented as,

\[ \text{Precision} = \frac{F_p}{F_p + K_p} \tag{62.10} \]
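A small sketch computing Eqs. (62.6)–(62.10) directly from confusion-matrix counts, in the F_p/F_n/K_p/K_n notation used above, is shown below; the example counts in the usage line are hypothetical.

```python
def classification_metrics(Fp, Fn, Kp, Kn):
    """Fp: true positives, Fn: true negatives, Kp: false positives, Kn: false negatives."""
    return {
        "accuracy":  (Fp + Fn) / (Fp + Fn + Kp + Kn),   # Eq. (62.6)
        "TPR":       Fp / (Fp + Kn),                    # Eq. (62.7)
        "TNR":       Fn / (Fn + Kp),                    # Eq. (62.8)
        "FNR":       Kn / (Kn + Fp),                    # Eq. (62.9)
        "precision": Fp / (Fp + Kp),                    # Eq. (62.10)
    }

# Hypothetical usage: classification_metrics(Fp=90, Fn=85, Kp=10, Kn=15)
```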

62.5.4 Comparative Analysis

(a) Analysis with Reuters dataset


We experimented with the Reuters dataset using different proportions of training data: 60%, 70%, 80%, and 90%, respectively, and compared the results with the improved sine cosine algorithm (ISCA) in terms of five well-known parameters, namely accuracy, precision, TPR, TNR, and FNR [2]. The experimental results are represented in the graphs in Fig. 62.2 and in Table 62.1 [2].

(b) Analysis with 20-Newsgroup dataset


Also, we experimented with the 20-Newsgroup dataset using different proportions of training data: 60%, 70%, 80%, and 90%, respectively, and compared the results with the improved sine cosine algorithm (ISCA) in terms of the same five parameters, namely accuracy, precision, TPR, TNR, and FNR [2]. The experimental results are represented in the graphs in Fig. 62.3 and in Table 62.2 [2].

(c) Analysis with real-time data


We experimented with the real-time dataset using different proportions of training data: 60%, 70%, 80%, and 90%, respectively, and compared the results with the improved sine cosine algorithm (ISCA) in terms of the same five parameters, namely accuracy, precision, TPR, TNR, and FNR [2]. The experimental results are represented in the graphs in Fig. 62.4 and in Table 62.3 [2].

Fig. 62.2 Analysis with Reuters dataset: a accuracy, b TPR, c TNR, d precision, e FNR

62.6 Conclusion

In this study, we discussed in detail a new optimized hybridized framework, the Tunicate Swarm Optimization Algorithm-based Hierarchical Attention Network (TSA-HAN), for text classification. Three datasets, namely the Reuters dataset, the 20-Newsgroup dataset, and a real-time dataset, have been considered. We find that TSA-HAN works marginally better as compared with the existing improved sine cosine

Table 62.1 Comparative analysis of TSA-HAN and ISCA with Reuters dataset
Training data (%) TPR TNR Accuracy Precision FNR
ISCA TSA-HAN ISCA TSA-HAN ISCA TSA-HAN ISCA TSA-HAN ISCA TSA-HAN
60 0.70021 0.79954 0.72126 0.81255 0.701254 0.80126 0.714856 0.87413 0.27864 0.18638
70 0.71246 0.81254 0.73254 0.82155 0.721352 0.82155 0.735241 0.88975 0.25872 0.17854
80 0.74865 0.84126 0.76413 0.83126 0.754126 0.83252 0.748652 0.90746 0.20479 0.13785
90 0.78126 0.86522 0.79422 0.90126 0.794126 0.87413 0.756542 0.91525 0.17858 0.11585

Fig. 62.3 Analysis with 20-newsgroup dataset: a TPR, b TNR, c accuracy, d precision, e FNR

algorithm (ISCA) in terms of the five popular parameters, namely accuracy, precision, TPR, TNR, and FNR, providing 0.874, 0.915, 0.865, 0.901, and 0.137, respectively, with the Reuters dataset; 0.913, 0.935, 0.884, 0.913, and 0.115, respectively, for the 20-Newsgroup dataset; and 0.921, 0.9471, 0.889, 0.899, and 0.111, respectively, for the real-time dataset. Further, it has been noticed that with an increase in the number of iterations and the percentage of training data, it shows better results. However, a specific optimization algorithm [5] does not solve every problem, according to the no free lunch (NFL) theorem [10]. Hence, we can design new optimization algorithms which can solve specific problems in various domains [5].

Table 62.2 Comparative analysis of TSA-HAN and ISCA with 20-newsgroup dataset
Training data (%) TPR TNR Accuracy Precision FNR
ISCA TSA-HAN ISCA TSA-HAN ISCA TSA-HAN ISCA TSA-HAN ISCA TSA-HAN
60 0.72137 0.81363 0.73216 0.82416 0.712365 0.81255 0.723652 0.90745 0.27864 0.18638
70 0.74128 0.82146 0.75814 0.85413 0.732155 0.83215 0.735865 0.91425 0.25872 0.17854
80 0.79521 0.86216 0.78413 0.89413 0.775215 0.86542 0.758641 0.92458 0.20479 0.13785
90 0.82143 0.88416 0.82515 0.91256 0.812545 0.91254 0.768541 0.93541 0.17858 0.11585

Fig. 62.4 Analysis with real-time dataset: a TPR, b TNR, c accuracy, d precision, e FNR

Table 62.3 Comparative analysis of TSA-HAN and ISCA with real-time dataset
Training data (%) TPR TNR Accuracy Precision FNR
ISCA TSA-HAN ISCA TSA-HAN ISCA TSA-HAN ISCA TSA-HAN ISCA TSA-HAN
60 0.73216 0.83216 0.74126 0.84946 0.721533 0.82321 0.735241 0.91542 0.26784 0.16785
70 0.76521 0.88413 0.77126 0.87413 0.732514 0.84126 0.748451 0.92451 0.23479 0.11587
80 0.78215 0.88325 0.79325 0.89413 0.784126 0.87125 0.765413 0.93541 0.21785 0.11675
90 0.78327 0.88943 0.83541 0.89942 0.823659 0.92125 0.776541 0.94713 0.21673 0.11057

References

1. Stein, R.A., Jaques, P.A., Valiati, J.F.: An analysis of hierarchical text classification using word
embeddings. Inf. Sci. 471, 216–232 (2019)
2. Zhou, X., Gururajan, R., Venkataraman, R., Tao, X., Bargshady, G., Barua, P.D., Kondalsamy-
Chennakesavan, S.: A survey on text classification and its applications. Web Intell. 18, 205–216
(2020). https://doi.org/10.3233/WEB-200442
3. Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., Hovy, E.: Hierarchical attention networks
for document classification. In: Proceedings of the 2016 Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language Technologies,
June 2016, pp. 1480–1489
4. Kaur, S., Awasthi, L.K., Sangal, A.L., Dhiman, G.: Tunicate Swarm Algorithm: a new bio-
inspired based metaheuristic paradigm for global optimization. Eng. Appl. Artif. Intell. 90,
103541 (2020)
5. Minaee, S., Kalchbrenner, N., Cambria, E., Nikzad, N., Chenaghlu, M., Gao, J.: Deep learning
based text classification: a comprehensive review. arXiv preprint arXiv:2004.03705 (2020)
6. Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint.
arXiv:1408.588 (2014)
7. Tang, D., Qin, B., Liu, T.: Document modeling with gated recurrent neural network for senti-
ment classification. In: Proceedings of the 2015 Conference on Empirical Methods in Natural
Language Processing, pp. 1422–1432 (2015)
8. Mirjalili, S.: SCA: a Sine Cosine Algorithm for solving optimization problems. Knowl.-Based
Syst. 96, 120–133 (2016). ISSN 0950-7051. https://doi.org/10.1016/j.knosys.2015.12.022
9. Rashedi, E., Nezamabadi-pour, H., Saryazdi, S.: GSA: a gravitational search algorithm. Inf.
Sci. 179, 2232–2248 (2009). https://doi.org/10.1016/j.ins.2009.03.004
10. Belazzoug, M., Touahria, M., Nouioua, F., Brahimi, M.: An improved sine cosine algorithm to
select features for text categorization. J. King Saud Univ.—Comput. Inf. Sci. 32(4), 454–464
(2020)
11. 20 Newsgroup dataset. https://www.kaggle.com/crawford/20-newsgroups. Accessed Nov 2020
12. Dhiman, G., Singh, P., Maini, R.: DHIMAN: a novel algorithm for economic dispatch problem
based on optimization method using Monte Carlo simulation and astrophysics concepts. Mod.
Phys. Lett. A 34, 1950032 (2019). https://doi.org/10.1142/S0217732319500329
13. Dhiman, G., Chahar, V.: Spotted hyena optimizer: a novel bio-inspired based metaheuristic
technique for engineering applications. Adv. Eng. Softw. (2017). https://doi.org/10.1016/j.advengsoft.2017.05.014
14. Mirjalili, S., Mirjalili, S.M., Lewis, A.: Grey wolf optimizer. Adv. Eng. Softw. 69, 46–61 (2014).
ISSN 0965-9978. https://doi.org/10.1016/j.advengsoft.2013.12.007
15. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of ICNN’95—Interna-
tional Conference on Neural Networks, 1995, vol. 4, pp. 1942–1948. https://doi.org/10.1109/
ICNN.1995.488968
16. Mirjalili, S., Mirjalili, S.M., Hatamlou, A.: Multi-Verse Optimizer: a nature-inspired algorithm
for global optimization. Neural Comput. Appl. 27, 495–513 (2016)
17. Holland, J.H.: Genetic algorithms. Sci. Am. 267(1), 66–73 (1992)
18. Shen, Y., Liang, Z., Kang, H., Sun, X., Chen, Q.: A modified jSO algorithm for solving
constrained engineering problems. Symmetry 13, 63 (2020). https://doi.org/10.3390/sym13010063
19. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9, 1735–1780
(1997). https://doi.org/10.1162/neco.1997.9.8.1735
20. Zhang, X., Zhao, J.J., LeCun, Y.: Character-level convolutional networks for text classification.
arXiv:1509.01626 (2015)
21. Kanavos, A., Nodarakis, N., Sioutas, S., Tsakalidis, A., Tsolis, D., Tzimas, G.: Large scale
implementations for twitter sentiment classification. Algorithms 10, 1–21 (2017)
22. Reuters-21578 Text Categorization Collection Data Set. https://archive.ics.uci.edu/ml/datasets/
reuters-21578+text+categorization+collection. Accessed Nov 2020
Chapter 63
An Overview of Indian Language
Datasets Used for Text Summarization

Shagun Sinha and Girish Nath Jha

S. Sinha (B) · G. N. Jha
Jawaharlal Nehru University, New Delhi, India
e-mail: shagunsinha5@gmail.com

Abstract In this paper, we survey text summarization (TS) datasets in Indian
languages (ILs), which are also low-resource languages (LRLs). We seek to answer
one primary question—is the pool of Indian language text summarization (ILTS)
datasets growing, or is there serious resource poverty? To answer the primary ques-
tion, we pose two sub-questions about ILTS datasets—first, what characteristics
(format and domain) do ILTS datasets have? Second, how different are those
characteristics from those of high-resource languages (HRLs), particularly English?
The survey of ILTS and English datasets reveals two similarities and one contrast.
The first similarity is that the most common dataset domain is news (Hermann et al.
in Adv. Neural Inf. Process. Syst. 28:1693–1701, 2015, [19]). The second similarity
is that datasets come in both extractive and abstractive formats. The contrast is in
how research on dataset development has progressed: ILs face slow development
and public release of datasets as compared with English. We conclude that the
relatively small number of ILTS datasets is due to two reasons—first, the absence of
a dedicated forum for developing TS tools; and second, the lack of shareable
standard datasets in the public domain.

Keywords Text summarization dataset · Indian languages · Sanskrit

63.1 Introduction

In this paper, we present an overview of text summarization (TS) datasets available
in Indian languages (ILs). TS is a sub-domain of natural language processing (NLP)
which is a field that lies at the intersection of computer science and linguistics. NLP
develops computational “models and algorithms that enable a computer to interpret,
process, and generate natural language” utterances ([33], p. 1). Text summarization
(TS) is an NLP task in which computers are expected to summarize long texts. Thus,
a TS tool produces summaries of input texts. The summary so produced can be one
of the two types—first, extractive summary in which sentences from the source text
are quoted verbatim; second, abstractive summary which is composed by rephrasing
contents of the source text. Abstractive summaries are more coherent and human-like
than extractive summaries due to which developing abstractive summarizers is tough
([30], p. 814).
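To make the extractive notion concrete, the sketch below is a minimal frequency-based extractive summarizer in the spirit of Luhn [32]. It is an illustrative toy of our own, not a system reported in any of the surveyed papers, and the scoring scheme is a deliberate simplification.

import re
from collections import Counter

def extractive_summary(text, num_sentences=2):
    """Score sentences by the average frequency of their words and
    return the top-scoring sentences in their original order."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'\w+', text.lower()))
    scored = []
    for i, sent in enumerate(sentences):
        words = re.findall(r'\w+', sent.lower())
        score = sum(freq[w] for w in words) / max(len(words), 1)
        scored.append((score, i, sent))
    top = sorted(sorted(scored, reverse=True)[:num_sentences], key=lambda t: t[1])
    return ' '.join(sent for _, _, sent in top)

An abstractive summarizer, by contrast, would generate new sentences rather than select existing ones, which is why it typically needs document-summary training pairs.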
Since the first work in TS by Luhn [32], TS research has gradually gained attention.
The rise of machine learning and deep learning methods has further enhanced the growth
of TS research [4, 49]. Two key factors influence TS development—the algorithm and
the language for which TS is needed. The algorithm requires a dataset in the given
language for summarization training. TS datasets, like datasets for other NLP tasks, are easier
to build for high-resource languages (HRLs) than for low-resource languages (LRLs).
Indian languages (ILs) are an example of LRLs because they lack sufficient digital
content.
Most TS review articles cover TS approaches and algorithms but do not specif-
ically focus on the availability of datasets for ILs. Hasan et al. [18] observe that
many languages lack publicly available datasets (p. 4700). In this paper, we present a
survey of TS datasets in English, an HRL, and ILs on the LRL front—Hindi, Malay-
alam, Sanskrit, Urdu, Kannada, Konkani, Punjabi, and Bangla—with one primary
objective of comparing their characteristics and pace of dataset growth. To assess
the growth of TS efforts in ILs, it is important to review not just the algorithms and
approaches to TS but also the availability of datasets that have been developed.
In other words, we present this survey to compare ILTS datasets with the HRL
English datasets. Through the comparison, we seek to answer one key question—
is the ILTS dataset pool growing, or is it marred by resource poverty? We limit
ourselves to the academic research papers written around TS and the datasets reported
therein. The survey methodology is similar to that of Nazar et al. [42]; that is, we
use a keyword search for a given language and TS on Google Scholar to locate
academic papers within the chosen period.

63.2 TS Datasets and Resources in English

Text summarization has been the focus of many conferences which evolved out of the
need to discuss better document- and information-processing technologies. Docu-
ment, or text, summarization has been a regular part of these conferences organized
by three US-based agencies NIST, DARPA, and ARDA.1 Some top conferences that
evolved in the USA in the early 2000s include2 Document Understanding Confer-
ences, Translingual Information Detection Extraction and Summarization (TIDES)
conference, and Text Retrieval Conference (TREC).
A few other venues evolved as language resource forums: the Linguistic Data
Consortium (LDC), which started in 1992 at the University of Pennsylvania; the

1 https://www-nlpir.nist.gov/projects/duc/intro.html.
2 https://www-nlpir.nist.gov/projects/duc/intro.html.

Language Resources and Evaluation Conference (LREC), which commenced to
encourage research in language resources and development; the Annual Conference
of the Association for Computational Linguistics (ACL); conferences of the
Association for Computing Machinery (ACM); Neural Information Processing
Systems (NIPS); and others. Some of the key summarization papers that emerged
from NLP conferences include [3, 5, 31, 53, 57]. A detailed analysis of those
conferences is out of the scope of this work.

63.2.1 Datasets

Some widely used training datasets in English include XSum, Newsroom, Multi-
news, CNN/DailyMail, MLSUM, and DUC datasets. Narayan et al. [40] introduced
XSum, a summarization dataset with 200 K news articles in English with single
line summary of each. Scialom et al. [52] present a “multilingual extension” of the
CNN/DailyMail dataset in French, German, Spanish, Russian, and Turkish, called
MLSUM (p. 8053).
Rush et al. [49] developed a sentence summarization system using the news-
headline pair of the Gigaword3 dataset (p. 385) and evaluated it using the DUC-
2003 and DUC-2004 datasets (p. 384). The DUC datasets have human generated
summaries of news articles (p. 384) due to which they may be termed abstractive.
Fabbri et al. [10] present MultiNews which is a multi-document summarization
dataset with summaries written by human editors. Newsroom4 is a collection of
article-summary pairs of 38 news publishers.5 Similarly, CNN/DailyMail is a news
corpus by Hermann et al. [19]. An exception to the news domain is the
BIGPATENT dataset by Sharma et al. [53], which is drawn from US patent records
(p. 2204).
Datasets from the Document Understanding Conferences (DUC) have been regu-
larly used in summarization evaluation tasks [8, p. 1814]. Similarly, Jin et al. [23]
note that the CNN/DailyMail dataset is widely used in summarization evaluation
(p. 2000), although it is not restricted to evaluation and has also been widely used
for training and testing, as in Mishra and Gayen [30, 36]. A detailed account of key
summarization datasets can be found in Jin et al. ([23], p. 2000) and Dernoncourt
et al. [7].6
As Sharma et al. [53] argue, two factors may be observed in these datasets—
first, news remains a widely used domain for TS datasets; second, these datasets are
mostly extractive (p. 2204). We believe news articles remain a popular choice for
developing TS datasets because news articles and their headlines provide a rough

3 https://www.ldc.upenn.edu/ accessed January 20, 2022 at 21.12 h.


4 Link: Papers with Code dataset page accessed January 25, 2022 at 21.09 h [14].
5 https://paperswithcode.com/dataset/newsroom.
6 https://docs.google.com/spreadsheets/d/1b1-NpM1jDK7KVHd_CwrxhpNZ1zAE8m-7M0pZ0

gfZTMQ/edit#gid=0 accessed January 10, 2022 at 23.21 h.



document-summary pattern, making it easier to obtain a dataset than having to
manually summarize articles.

63.3 TS Resources and Datasets in Indian Languages

NLP research in Indian languages (ILs) is being undertaken by various institutions.
While no standardized conferences exist for IL text summarization (TS), some
conferences and individual researchers have attempted to build resources in both
the extractive and abstractive spheres for ILs. One characteristic of a TS dataset is
the document-summary parallel format, which suits many supervised machine
learning algorithms, as opposed to monolingual corpora, which are more suitable
for unsupervised methods. We survey such parallel datasets in the 22 scheduled
languages of the Indian constitution for the last 10 years (2012–2022). However, if
a language does not have any parallel TS corpus, we cite the reported datasets in
whatever format is available. For every TS research work, we observe whether the
authors report the details of the dataset used and whether the dataset is currently
accessible. We could not find any TS research in six scheduled languages: Bodo,
Kashmiri, Manipuri, Maithili, Santhali, and Sindhi. Hence, those six languages have
been excluded.

63.3.1 IL Datasets

Assamese and Bengali. Assamese TS by Kalita et al. [26] uses the WordNet database
for summarization (p. 150) and thus uses no text corpus. Talukder et al. [56] present
an abstractive summarization approach for Bengali using a dataset drawn from a news
corpus and social media data (p. np) but provide no links to the dataset. Masum et al.
[35] develop a Bengali corpus for measuring sentence similarity for abstractive
summarization, but the corpus is not accessible. Chowdhury et al. [6] report using
document-summary pairs from printed NCTB books (p. np) to develop an abstractive
TS dataset but do not release the data for the research community.
Hindi. Two sources of Hindi TS are iNLTK7 and CDAC. iNLTK offers publicly
available ‘Short and Large Corpus’ for Hindi TS. CDAC hosts Saaranshak, a multi-
document summarizer that it developed.8 However, Saaranshak is ontology-based
and neither the tool nor the dataset is publicly available. Giri et al. [13] observe that a
TS system for Hindi was proposed by Garain et al. [12] and later adapted by CDAC
(p. 54), but no details are present on the CDAC website to ascertain whether it is the
same tool, as the authors argue.

7 https://www.kaggle.com/datasets/disisbig/hindi-text-short-and-large-summarization-corpus.
8 https://www.cdac.in/index.aspx?id=mc_cli_saranshak.

Malayalam. Kabeer and Idicula [25] develop Malayalam TS methods on news arti-
cles paired with human summaries (p. 146) and report their work as the first Malay-
alam TS (p. 150). Nambiar et al. [39] report an abstractive TS on a BBC corpus that
was translated in Malayalam (p. 351). Kishore et al. [29] present a Malayalam TS
method based on Paninian Grammar (p. 197). None of these provides links to the
datasets they used.
Punjabi TS. Jain et al. [22] use particle swarm optimization for Punjabi text summa-
rization using two corpora, a monolingual Punjabi corpus, and a Punjabi-Hindi
parallel dataset from the Indian Language Corpora Initiative (p. 12). Similarly, Gupta
and Kaur [15] test their extractive methods on 150 random documents from two
corpora—Punjabi text obtained by translating Hindi corpora by CILT, IIT Bombay,
and a Punjabi news corpus (p. 267). Other Punjabi TS works, by Gupta and Lehal
[16] on a news dataset (p. 200) and by Gupta and Lehal [17], provide no links to their
datasets. Only the Punjabi monolingual corpus is available on TDIL.9
Urdu. Humayoun et al. [21] present an abstractive summarization corpus, the Urdu
Summary Corpus (UC), built from 50 articles taken mostly from news articles and
blogs (pp. 796–797), and they provide a dataset link. Nawaz et al. [41] present
extractive TS (ETS) models for Urdu for which they use the abstractive summary
corpus provided by Humayoun et al. [21], but they also develop an extractive version
of the UC, the Urdu Corpus expert ground truth (UCE) (p. 9), and an Urdu Training
Dataset (UTD) obtained from news sources (p. 10). The UC corpus is claimed to be
open access.
Kannada and Konkani. Kannada has witnessed many TS works. Shilpa and Shashi
Kumar [54] present a Kannada summarizer, Abs-Sum-Kan, which uses POS and
NER techniques for summarizing documents. The authors do not mention any details
about the dataset except a line about Kannada gazetteers compiled by Saha et al.
[50] (p. 1031). An earlier extractive or information retrieval (IR)-based approach to
Kannada summarization by Embar et al. [9] does not provide any datasets. Guided
summarization in Kannada [27] offers no datasets either. Konkani Literature-based
TS dataset is offered by D’Silva and Sharma [8], and it has 71 stories (p. 1814).
While the dataset is limited in number, the domain of folk tales is unique.
Sanskrit, Dogri, and Oriya. The sole experiment in Sanskrit TS so far is Barve et al.
[2] which is an extractive TS on Sanskrit Wikipedia articles. Authors Barve et al. [2]
do not provide any dataset links, but Sanskrit Wikipedia articles have been presented
in Arora [1]. Dogri TS work by Gandotra and Arora [11] is not accessible, and the
paper metadata does not provide any details about the data. In [43], the authors report a
manually built dataset of 200 cricket news articles with human summaries (p. 826),
and in Pattnaik and Nayak [44], the authors report news article-summary texts used as
input to their proposed system (p. 397), but the authors do not release those datasets.

9 http://www.tdil-dc.in/ accessed on March 23, 2022 at 13.15 h.



Gujarati, Marathi, and Nepali. Sarwadnya and Sonawane [51] use Marathi news
articles and the Marathi portion of the EMILLE10 dataset (p. np). Ranchhodbhai [47]
also evaluates his extractive summarization approach on the Gujarati EMILLE corpus
(p. 117). Khanal et al. [28] collect Nepali news articles to implement extractive
methods (p. 989), but the dataset is not released.
Telugu and Tamil. Rani et al. [48] use a Telugu corpus that is openly accessible on
Kaggle.11 Naidu et al. [38] use Telugu e-newspaper data but do not release it.
Mohan Bharath et al. [37] use “manually created dataset” as per their abstract, but
the paper is not accessible, and hence, no further details are available. Priyadharshan
and Sumathipala [46] use Tamil sports news as the corpus for TS evaluation. Their
paper is not openly accessible to gather further details.
A positive but slow development in ILTS is the appearance of translated and publicly
accessible TS datasets. A multilingual corpus from PIB and Mann ki Baat data has been
presented in Siripragada et al. [55]. Hasan et al. [18] offer a TS dataset derived from
the BBC in 44 languages, including a few major ILs such as Bengali. XL-SUM [18] is
claimed to be the first public abstractive summarization dataset for many languages
(p. 4700). However, Arora [1] had released a Hindi summarization corpus prior to
their work.

63.4 Observations

We observe two similarities and one contrast between English and ILTS datasets.
First, as noted earlier, news articles are the common source of TS datasets in English.
A similar pattern has been observed in ILTS, as seen above. ILTS datasets derive from
news articles with a few exceptions like Konkani folktales dataset [8]. Second, the
reported datasets have been used in extractive as well as abstractive TS development.
We observe a stark contrast in the dataset development efforts, though. ILTS datasets
remain largely difficult to find. Datasets like CNN/DailyMail in English, on the other
hand, are not only open access but also widely adopted. There are two possible reasons
for the limited reach of ILTS datasets—first, ILTS has not been the focus of any Indic
NLP group in India. We noted a few forums for English TS, like DUC, LDC, and TREC,
which have had dedicated summarization tracks for years, whereas ILTS has no
dedicated platform. While TDIL provides access to monolingual and parallel corpora,
which are usually used for machine translation, a dedicated platform for TS is still
missing. Second, open access datasets are lacking. As can be seen in Table 63.1 and in
the previous section, many works report datasets but never release them, so datasets are
used and cited but not made openly accessible to the research community, thereby
impeding research progress in TS. Similar arguments have been made for NLP and TS
in general—for example, on the role of forums in generating LRL NLP data
([20, 24], p. 6290, [45], p. 178) and on the lack of public datasets in TS [20] and in
some ILTS work [18, 34]. Some slow progress is indeed being made in the public
release of datasets, as noted earlier.
10 https://www.lancaster.ac.uk/fass/projects/corpus/emille/.
11 https://www.kaggle.com/sudalairajkumar/telugu-nlp.

Table 63.1 Text summarization datasets in Indian languages

Language | Dataset used | Source | TS approach used | Open access?
Malayalam | 1. Malayalam news articles | [25] | Extractive and abstractive | No
Malayalam | 2. Translated BBC corpus | [39] | Abstractive | No
Kannada | Gazetteer data by Saha et al. [50] | [54] | Abstractive | No
Marathi | EMILLE dataset by Lancaster Univ. | [51] | Extractive | NA
Assamese | WordNet database | [26] | Extractive | NA
Konkani | Folktales dataset | [8] | Abstractive | No
Telugu | News corpus from Kaggle | [48] | Extractive | NA
Telugu | Telugu dataset | [37] | Abstractive | NA
Tamil | e-News corpus | [46] | Extractive | e-Newspaper could be accessible
Bengali | NCTB book dataset | [6] | Abstractive | No
Urdu | 1. Urdu Summary Corpus (UC) | [21] | Abstractive | Claimed to be open access
Urdu | 2. UCE and UTD | [41] | Extractive | NA
Sanskrit | Wikipedia articles | [2] | Extractive | Available in Arora [1]
Gujarati | EMILLE corpus for evaluation | [47] | Extractive | NA
Nepali | Nepali news corpus | [28] | Extractive | No
Oriya | News corpus with human summaries | [43, 44] | NA | No
Hindi | Hindi news articles | [1] | Abstractive | Yes
Punjabi | ILCI monolingual corpus from TDIL | [22] | Extractive | Monolingual corpus available on request
Dogri | NA | [11] | NA | NA


There are two exceptions to this limited dataset availability. The first is the set of
online platforms like HuggingFace,12 iNLTK,13 Tensorflow,14 nlpprogress,15 and
paperswithcode,16 which provide open access datasets across languages, including
some ILs; a detailed analysis of those platforms is out of the scope of this work, and
a dedicated TS platform for ILs is still missing. The second is that, in some cases of
extractive summarization, researchers use monolingual corpora that may be open
access, but such corpora are general-purpose and have not been developed for TS
specifically. Thus, a TS-focused dataset, especially in the form of text-summary
pairs, is difficult to obtain.
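As an illustration of what such open platforms make possible, the short sketch below loads an IL portion of the XL-Sum corpus [18] through the HuggingFace datasets library. The dataset identifier csebuetnlp/xlsum, the language configuration name, and the column names are assumptions about how the corpus is commonly hosted, not details reported in this chapter.

import datasets  # pip install datasets (assumed available)

# Assumed Hub id and config name; both may change over time.
xlsum_bengali = datasets.load_dataset("csebuetnlp/xlsum", "bengali", split="train")

sample = xlsum_bengali[0]
# XL-Sum stores article/summary pairs; field names assumed to be "text" and "summary".
print(sample["summary"])
print(len(xlsum_bengali), "training article-summary pairs")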

63.5 Conclusion

Language datasets are important for training NLP algorithms including TS algo-
rithms.
ILs are LRLs due to which dataset development for TS requires special attention.
In this paper, we surveyed TS datasets as reported by various researchers and devel-
opers and found two similarities and one contrast between ILTS datasets and datasets
in English, which is an HRL. The similarities lie in the domain and approach of
summarization.
Like English datasets, many ILTS datasets derive from news sources and are used in
both extractive and abstractive summarization tasks.
The contrast, however, is in the accessibility of datasets. Unlike English datasets,
most IL datasets are limited in availability and are not open access. More TS-focused
efforts and research forums could make more datasets available.

References

1. Arora, G.: iNLTK: natural language toolkit for Indic languages. In: Proceedings of Second
Workshop for NLP Open Source Software (NLP-OSS) (2020)
2. Barve, S., Desai, S., Sardinha, R.: Query-based extractive text summarization for Sanskrit. In:
Proceedings of the 4th International Conference on Frontiers in Intelligent Computing: Theory
and Applications (FICTA) 2015 (2016)
3. Cabral, L.d.S., Lins, R.D., Mello, R.F., Freitas, F., Ávila, B., Simske, S., Riss, M.: A platform
for language independent summarization. In: Proceedings of the 2014 ACM Symposium on
Document Engineering, Fort Collins, Colorado, USA (2014). https://doi.org/10.1145/2644866.
2644890

12 https://huggingface.co/datasets last accessed on March 23, 2022 at 18.12 h.


13 https://inltk.readthedocs.io/en/latest/index.html last accessed on March 23, 2022 at 18.12 h.
14 https://www.tensorflow.org/resources/models-datasets last accessed on March 23, 2022 at

18.12 h.
15 https://github.com/sebastianruder/NLP-progress last accessed on March 23, 2022 at 18.12 h.
16 https://paperswithcode.com/task/abstractive-text-summarization/codeless last accessed on

March 23, 2022 at 18.12 h.



4. Chen, J., Zhuge, H.: Abstractive text-image summarization using multi-modal attentional hier-
archical RNN. In: Proceedings of the 2018 Conference on Empirical Methods in Natural
Language Processing (2018)
5. Chen, Y.-C., Bansal, M.: Fast abstractive summarization with reinforce-selected sentence
rewriting. arXiv preprint arXiv:1805.11080 (2018)
6. Chowdhury, R.R., Nayeem, M.T., Mim, T.T., Chowdhury, M., Rahman, S., Jannat, T.: Unsu-
pervised Abstractive Summarization of Bengali Text Documents. arXiv preprint arXiv:2102.
04490 (2021)
7. Dernoncourt, F., Ghassemi, M., Chang, W.: A repository of corpora for summarization. In:
Proceedings of the Eleventh International Conference on Language Resources and Evaluation
(LREC 2018) (2018)
8. D’Silva, J., Sharma, U.: Development of a Konkani language dataset for automatic text
summarization and its challenges. Int. J. Eng. Res. Technol. (2019). ISSN: 0974-3154
9. Embar, V.R., Deshpande, S.R., Vaishnavi, A., Jain, V., Kallimani, J.S.: sArAmsha—a Kannada
abstractive summarizer. In: 2013 International Conference on Advances in Computing,
Communications and Informatics (ICACCI) (2013)
10. Fabbri, A.R., Li, I., She, T., Li, S., Radev, D.R.: Multi-news: a large-scale multi-document
summarization dataset and abstractive hierarchical model. arXiv preprint arXiv:1906.01749
(2019)
11. Gandotra, S., Arora, B.: Feature selection and extraction for Dogri text summarization. In:
Rising Threats in Expert Applications and Solutions, pp. 549–556. Springer (2021)
12. Garain, U., Datta, A.K., Bhattacharya, U., Parui, S.K.: Summarization of JBIG2 compressed
Indian language textual images. 18th International Conference on Pattern Recognition
(ICPR’06) (2006)
13. Giri, V.V., Math, M., Kulkarni, U.: A survey of automatic text summarization system for
different regional language in India. Bonfring Int. J. Softw. Eng. Soft Comput. 6, 52–57 (2016).
Special Issue on Advances in Computer Science and Engineering and Workshop on Big Data
Analytics Editors: Dr. SB Kulkarni, Dr. UP Kulkarni, Dr. SM Joshi and JV Vadavi
14. Grusky, M., Naaman, M., Artzi, Y.: Newsroom: a dataset of 1.3 million summaries with
diverse extractive strategies. In: Proceedings of the 2018 Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language Technologies,
Volume 1 (Long Papers) (2018)
15. Gupta, V., Kaur, N.: A novel hybrid text summarization system for Punjabi text. Cogn. Comput.
8(2), 261–277 (2016)
16. Gupta, V., Lehal, G.S.: Complete pre processing phase of Punjabi text extractive summarization
system. In: Proceedings of COLING 2012: Demonstration Papers (2012)
17. Gupta, V., Lehal, G.S.: Automatic text summarization system for Punjabi language. J. Emerg.
Technol. Web Intelli. 5(3), 257–271 (2013)
18. Hasan, T., Bhattacharjee, A., Islam, M.S., Mubasshir, K., Li, Y.-F., Kang, Y.-B., Rahman,
M.S., Shahriyar, R.: XL-sum: large-scale multilingual abstractive summarization for 44
languages. In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
(2021)
19. Hermann, K.M., Kocisky, T., Grefenstette, E., Espeholt, L., Kay, W., Suleyman, M., Blunsom,
P.: Teaching machines to read and comprehend. Adv. Neural Inf. Process. Syst. 28, 1693–1701
(2015)
20. Hu, B., Chen, Q., Zhu, F.: LCSTS: a large scale Chinese short text summarization dataset.
In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
(2015)
21. Humayoun, M., Nawab, R.M.A., Uzair, M., Aslam, S., Farzand, O.: Urdu summary corpus.
In: Proceedings of the Tenth International Conference on Language Resources and Evaluation
(LREC’16) (2016)
22. Jain, A., Yadav, D., Arora, A.: Particle swarm optimization for Punjabi text summarization.
Int. J. Oper. Res. Inf. Syst. (IJORIS) 12(3), 1–17 (2021)

23. Jin, H., Cao, Y., Wang, T., Xing, X., Wan, X.: Recent advances of neural text generation: core
tasks, datasets, models and challenges. Sci. China Technol. Sci. 1–21 (2020)
24. Joshi, P., Santy, S., Budhiraja, A., Bali, K., Choudhury, M.: The state and fate of linguistic
diversity and inclusion in the NLP world. In: Proceedings of the 58th Annual Meeting of the
Association for Computational Linguistics (2020)
25. Kabeer, R., Idicula, S.M.: Text summarization for Malayalam documents—an experience. In:
2014 International Conference on Data Science & Engineering (ICDSE) (2014)
26. Kalita, C., Saharia, N., Sharma, U.: An extractive approach of text summarization of Assamese
using wordnet. In: Global WordNet Conference (GWC-12) (2012)
27. Kallimani, J.S., Srinivasa, K., Reddy, B.E.: A comprehensive analysis of guided abstractive
text summarization. Int. J. Comput. Sci. Iss. (IJCSI) 11(6), 115 (2014)
28. Khanal, R.S., Adhikari, S., Thapa, S.: Extractive method for Nepali text summarization using
text ranking and LSTM. In: 10th IOE Graduate Conference (2021)
29. Kishore, K., Gopal, G.N., Neethu, P.: Document summarization in Malayalam with sentence
framing. In: 2016 International Conference on Information Science (ICIS) (2016)
30. Kouris, P., Alexandridis, G., Stafylopatis, A.: Abstractive text summarization: enhancing
sequence-to-sequence models using word sense disambiguation and semantic content gener-
alization. Comput. Linguist. 47(4), 813–859 (2021). https://doi.org/10.1162/coli_a_00417
31. Liu, F., Liu, Y.: From extractive to abstractive meeting summaries: Can it be done by sentence
compression? In: Proceedings of the ACL-IJCNLP 2009 Conference Short Papers (2009)
32. Luhn, H.P.: The automatic creation of literature abstracts. IBM J. Res. Dev. 2(2), 159–165
(1958)
33. Lusetti, M., Ruzsics, T., Göhring, A., Samardžić, T., Stark, E.: Encoder-decoder methods for
text normalization (2018)
34. Mamidala, K.K., Sanampudi, S.K.: Text summarization for Indian languages: a survey. Int.
J. Adv. Res. Eng. Technol. (IJARET). 12(1), 530–538 (2021)
35. Masum, A.K.M., Abujar, S., Tusher, R.T.H., Faisal, F., Hossain, S.A.: Sentence simi-
larity measurement for Bengali abstractive text summarization. In: 2019 10th International
Conference on Computing, Communication and Networking Technologies (ICCCNT) (2019)
36. Mishra, R., Gayen, T.: Automatic lossless-summarization of news articles with abstract
meaning representation. Procedia Comput. Sci. 135, 178–185 (2018)
37. Mohan Bharath, B., Aravindh Gowtham, B., Akhil, M.: Neural abstractive text summarizer for
Telugu language. In: Soft Computing and Signal Processing, pp. 61–70. Springer (2022)
38. Naidu, R., Bharti, S.K., Babu, K.S., Mohapatra, R.K.: Text summarization with automatic
keyword extraction in Telugu e-newspapers. In: Smart Computing and Informatics, pp. 555–
564. Springer (2018)
39. Nambiar, S.K., Peter, S.D., Idicula, S.M.: Abstractive summarization of Malayalam docu-
ment using sequence to sequence model. In: 2021 7th International Conference on Advanced
Computing and Communication Systems (ICACCS) (2021)
40. Narayan, S., Cohen, S.B., Lapata, M.: Don’t give me the details, just the summary! Topic-aware
convolutional neural networks for extreme summarization. arXiv preprint arXiv:1808.08745
(2018)
41. Nawaz, A., Bakhtyar, M., Baber, J., Ullah, I., Noor, W., Basit, A.: Extractive text summarization
models for Urdu language. Inf. Process. Manag. 57(6), 102383 (2020)
42. Nazar, N., Hu, Y., Jiang, H.: Summarizing software artifacts: a literature review. J. Comput.
Sci. Technol. 31(5), 883–909 (2016)
43. Pattnaik, S., Nayak, A.K.: A simple and efficient text summarization model for Odia text
documents. Indian J. Comput. Sci. Eng. 11(6), 825–834 (2020). https://doi.org/10.21817/ind
jcse/2020/v11i6/201106132
44. Pattnaik, S., Nayak, A.K.: Automatic text summarization for Odia language: a novel approach.
In: Intelligent and Cloud Computing, pp. 395–403. Springer (2021)
45. Philip, J., Siripragada, S., Namboodiri, V.P., Jawahar, C.: Revisiting low resource status
of Indian languages in machine translation. In: 8th ACM IKDD CODS and 26th COMAD,
pp. 178–187 (2021)

46. Priyadharshan, T., Sumathipala, S.: Text summarization for Tamil online sports news using
NLP. In: 2018 3rd International Conference on Information Technology Research (ICITR)
(2018)
47. Ranchhodbhai, S.J.: Designing and Development of Stemmer and String Similarity Measure
for Gujarati Language and their Application in Text Summarization System (2016)
48. Rani, B.K., Rao, M.V., Srinivas, K., Madhukar, G.: Telugu text summarization using LSTM
deep learning. Pensee J.
49. Rush, A.M., Chopra, S., Weston, J.: A neural attention model for abstractive sentence summa-
rization. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language
Processing (2015)
50. Saha, S.K., Sarkar, S., Mitra, P.: Gazetteer preparation for named entity recognition in Indian
languages. In: Proceedings of the 6th Workshop on Asian Language Resources (2008)
51. Sarwadnya, V.V., Sonawane, S.S.: Marathi extractive text summarizer using graph based
model. In: 2018 Fourth International Conference on Computing Communication Control and
Automation (ICCUBEA) (2018)
52. Scialom, T., Dray, P.-A., Lamprier, S., Piwowarski, B., Staiano, J.: MLSUM: The multilin-
gual summarization corpus. In: Proceedings of the 2020 Conference on Empirical Methods in
Natural Language Processing (EMNLP) (2020)
53. Sharma, E., Li, C., Wang, L.: Bigpatent: a large-scale dataset for abstractive and coherent
summarization. arXiv preprint arXiv:1906.03741 (2019)
54. Shilpa, G., Shashi Kumar, D.R.: Abs-Sum-Kan: an abstractive text summarization technique
for an Indian regional language by induction of Tagging rules. Int. J. Recent Technol. Eng.
8(2S3), 1028–1036 (2019). https://doi.org/10.35940/ijrte.B1193.0782S319
55. Siripragada, S., Philip, J., Namboodiri, V.P., Jawahar, C.: A multilingual parallel corpora collec-
tion effort for Indian languages. In: Proceedings of the 12th Language Resources and Evaluation
Conference (2020)
56. Talukder, M.A.I., Abujar, S., Masum, A.K.M., Faisal, F., Hossain, S.A.: Bengali abstractive
text summarization using sequence to sequence RNNs. In: 2019 10th International Conference
on Computing, Communication and Networking Technologies (ICCCNT) (2019)
57. Woodsend, K., Lapata, M.: Automatic generation of story highlights. In: Proceedings of the
48th Annual Meeting of the Association for Computational Linguistics (2010)
Chapter 64
Scattering Wavelet Network-Based Iris
Classification: An Approach
to De-duplication

Parmeshwar Birajadar, Meet Haria, and Vikram Gadre

P. Birajadar (B)
VES Institute of Technology, Chembur, Mumbai 400074, India
e-mail: parmeshwar.birajdar@ves.ac.in
URL: https://www.parmeshwarbirajadar.in/
M. Haria · V. Gadre
Indian Institute of Technology Bombay, Mumbai 400076, India

Abstract In a large-scale iris-based identification system, iris classification is an
important indexing task to reduce the search time in a large database for accurate
matching, especially in a de-duplication application. Because of the considerable
intra-class variability and small inter-class variability, iris classification is a diffi-
cult pattern recognition challenge. In this paper, we propose a novel approach to
iris classification based on iris fiber structures. Translation and minor deformation
invariant local iris features are extracted using a scattering wavelet network. A sim-
ple generative PCA affine classifier is used to classify the resulting invariant feature
vectors. Experiments on two benchmark iris databases reveal that the proposed iris
classification algorithm is successful and robust in terms of classification accuracy.

Keywords Biometrics · Iris classification · De-duplication · Scattering wavelet


network · PCA affine classifier

64.1 Introduction

Iris is widely used [1] as a biometric trait in commercial, health care, border control,
and large-scale national identification projects, and hence, iris indexing is becoming
increasingly important. Coarse iris classification and iris recognition are two dif-
ferent problems. Iris classification is intended to group iris images into pre-defined
classes based on some common characteristics (iris fiber structure), whereas in iris
recognition, each individual’s iris has a unique class. Comprehensive work on iris
recognition [2, 3] has been reported in the literature. Although iris classification is a
significant research topic, it has received little attention in the literature. The
large-scale national identification program [4], launched by the Government of
India, uses a multimodal biometric [5] system, that is, a combination of more than
one biometric trait in a single identification system. Iris, fingerprint, and face are
the three biometric traits used in the Aadhaar project to provide a unique identity
(a 12-digit number) to Indian citizens. The act of deleting instances of repeated
enrollment (duplicates) by the same person is known as de-duplication [6].
The biometric attributes of a person are matched against the biometric attributes
of previously enrolled people during de-duplication. Although all of these methods
produce good recognition results, they require the input iris image to be matched
against a large number of iris images in a database. This takes a long time, notably
as the size of the iris databases used for identity verification increases. It would be
useful to label an iris image prior to matching, so that the query iris image is
compared only within the same category, but iris classification has received little
attention in the literature thus far. Iris classification is commonly used for the initial
coarse classification of the database into different classes to reduce the search space
and is very useful for large databases. Continuous classification and exclusive
(fixed) classification are the two main approaches used in most automatic biometric
identification systems. There are only a few methods in the literature for classifying
iris images into pre-defined classes [7, 8], and they use iris fiber
structure information.
stream, flower, and jewel. The stream texture class is characterized by the ordering
of the white fibers originating from the iris center. The flower texture class is deter-
mined according to the flower-like petals present in the arrangement of fibers. The
jewel texture class can be identified by the presence of colored dots or pigmentation
on top of the fibers. Detailed information about the iris classes and their charac-
teristics can be found in [10]. Recently, Ref. [11] studied and analyzed the surface
features of irises of diverse ancestry. The sample images of these
classes from two different standard iris databases are shown in Fig. 64.1. The UPOL
database [12] is widely used by researchers for fixed classification; however, the
database is composed of only 384 iris images of 64 subjects, which is not sufficient
for performance evaluation of classification algorithms. In this work, we have
selected the larger Warsaw-BioBase-Smartphone-Iris database [13, 14], which
includes 1340 iris images of 67 subjects. We have divided the database into three
classes—stream, flower, and jewel—by performing subjective analysis, as reported
in Table 64.2.
For biometric research, the availability of standard databases plays an important
role. Due to the availability of the benchmark NIST-SD4 fingerprint database [15]
used for fingerprint classification, a large amount of research [16] has been done in
the literature. In the case of iris, however, such a database is not available, which
may be the reason why iris-based indexing has received less attention. Recently,
though, coarse iris classification based on iris fiber structure has been receiving
more attention in the biometric research community.
In this work, we have proposed a novel feature extraction method for coarse iris
classification using a scattering wavelet network [17]. Features extracted from the


Fig. 64.1 Iris classes from left to right: a stream b flower c jewel. Top row sample images are from
UPOL database and bottom row images are from Warsaw-BioBase-Smartphone-Iris database

scattering wavelet network are invariant to translation and small deformation [18].
The key contributions of this paper can be summarized as follows:
1. A novel scattering wavelet-based feature extraction method for coarse iris classi-
fication based on iris fiber structures.
2. Subjective analysis is performed on a large Warsaw-BioBase-Smartphone-Iris
database for coarse iris classification which will be useful for further research in
this area.
3. A detailed comparative analysis is performed using scattering wavelet features on
two benchmark databases using principal component analysis (PCA) affine and
support vector machine (SVM) classifier.

64.2 Related Work

A few approaches for the classification of iris have already been explored in the
literature. Some of the approaches use continuous classification, while others use exclusive
(fixed) classification. In [19], the fractal dimension of the iris is estimated using block-
based box-counting method for automatic coarse classification of iris images into four
categories. The experiments are performed on their implemented database consisting
of 872 images. In [20], the authors extracted the block-based texture information from
the iris images and used it for coarse classification. They partitioned the database
into predetermined clusters using the principal direction divisive partitioning (PDDP)
technique. The experiments are conducted on UPOL and CASIA V3 databases. In

[21], the authors used iris color information to extract a feature called the color iris
texton using different color spaces. The experiments are conducted on three different
color iris image databases, and the continuous classification is performed using the K-
means clustering approach. Reference [22] proposed a hierarchical visual codebook
(HVC) which is a method for encoding iris texture patterns and classifying iris
images. The authors performed a continuous coarse iris classification on CASIA-Iris-
Thousand database [23] using HVC and K-means clustering approach to classify iris
images into different categories. Only a few approaches [7, 8] for the categorization
of iris images based on pre-defined iris classes have been developed recently.
The authors of [7, 8] used information on iris fiber structure to divide iris images
into three categories: stream, flower, and jewel. A sparse representation of features
obtained from log-Gabor wavelet technique using online dictionary learning (ODL)
is used by authors to classify iris images. Classification experiments are performed on
UPOL, CASIA V3, and IIT-Delhi databases. In [8], authors also used the fiber struc-
ture information to classify iris images. The authors employed 3-class and 4-class
fixed classification approaches. Local intensity order pattern [25] feature extraction
method is used by the authors to extract iris features. The experiments are conducted
on UPOL database, and SVM classifier is used for the classification of iris images.
The nonlinear deformation of the iris fiber structure originates from pupil dilation
and contraction, which, in turn, is caused by changes in illumination. As discussed
above, the other methods in the literature do not utilize deformation-invariant
features for coarse iris classification in either case (continuous or fixed iris
classification). The proposed scattering wavelet network-based approach tackles
the problem of nonlinear iris fiber structure deformation by extracting deformation-
and translation-invariant features.
In this work, we have used a scattering wavelet network to extract robust iris fea-
tures that are invariant to translation and small deformation. We use these features
for classifying iris images based on iris fiber structure, following a 3-class fixed
classification approach. The experiments are conducted on the widely used UPOL
standard database. We have also used the Warsaw-BioBase-Smartphone-Iris
database, on which subjective analysis is performed and which is further used for
automatic classification using ScatNet features. Classification experiments are
conducted using PCA affine and SVM classifiers, and a performance analysis is
reported.

64.3 Databases Used for Classification

64.3.1 UPOL Database

The UPOL database [12] consists of 384 visible iris images of 64 subjects with left
and right eyes contributing three images per subject. The resolution of each image is
768 × 576 (24-bit RGB), captured using a TOPCON TRC50IA optical device with
a Sony DXC-950p camera. Sample images from the database are shown in row-

Table 64.1 Iris classes of UPOL database defined based on the iris fibers [24]
Class No. of images Subject Idsa
Stream 192 (50%) 001, 006, 007, 008, 011, 013, 014, 016, 018, 019, 020, 021, 023,
024, 026, 027, 028, 033, 041, 042, 044, 045, 050, 051, 052, 053,
058, 059, 060, 061, 062, 064
Flower 102 (26.56%) 002, 009, 010, 015, 017, 022, 031, 036, 037, 040 043, 047, 048,
049, 054, 056, 063
Jewel 90 (23.44%) 003, 004, 005, 012, 025, 029, 030, 032, 034, 035, 038, 039, 046,
055, 057
a Each subject id consists of 6 images (3 for the left iris and 3 for the right iris). There are a total of 64
subjects, and the database comprises a total of 384 images

1 of Fig. 64.1. The UPOL database consists of only 384 images of 64 subjects,
which is too few to test the performance of any classifier or feature extraction
algorithm. This is the reason for moving to the Warsaw-BioBase-Smartphone-Iris
database with a larger number of images. The iris classes of the UPOL database,
defined based on the iris fibers [7, 8], are listed in Table 64.1.

64.3.2 Warsaw-BioBase-Smartphone-Iris Database

The database was collected by the Warsaw University of Technology in Poland [13,
14]. It consists of 3291 visible iris images of 70 subjects, captured with an Apple
iPhone 5s using its 8-megapixel camera. The size of each image is 640 × 480.
Sample images from the database are shown in row-2 of Fig. 64.1. This smartphone-
captured database is large and contains challenging images (occlusions by
eyelashes), which is why it was selected for classification based on iris fiber
structure. We performed a subjective analysis on this database to divide it into the
three primary classes: stream, flower, and jewel. Five subjects were asked to
categorize the iris images into the three classes based on the iris fiber structure they
perceived, and the class of each iris image was finalized based on the majority
perception. The analysis was performed on the iris images of 67 subjects (subjects
24, 29, and 62 are not considered due to the unavailability of a sufficient number of
images). The corresponding subject IDs along with the classes are tabulated in
Table 64.2. We have selected 10 images each of the left and right iris per subject,
which sums to 1340 well-segmented images.

Table 64.2 Proposed subjective analysis-based iris classes of Warsaw-BioBase-Smartphone-Iris
database defined based on the iris fibers
Class No. of images Subject Idsa
Stream 760 (56.71%) 01, 03, 04, 05, 07, 09, 10, 14, 16,19, 22, 26, 30, 33, 34, 35, 36,
41, 42,43, 45, 49, 50, 51, 52, 54, 55, 56, 58, 59, 60, 61, 63, 66,
67, 68, 69, 70
Flower 400 (29.85%) 02, 08, 11, 12, 13, 15, 17, 18, 20, 21, 23, 25, 27, 28, 31, 32, 37,
38, 47, 57
Jewel 180 (13.43%) 06, 39, 40, 44, 46, 48, 53, 64, 65
a Each subject id consists of 20 images (10 for the left iris and 10 for the right iris). There are a total of 67
subjects, and the database comprises a total of 1340 images

64.4 Scattering Wavelet Network

In the spirit of deep convolutional networks, Mallat developed the scattering wavelet
network (ScatNet) in [17], which uses a cascade of wavelet transforms with a
modulus operator to build representations of images.
Let the image be given by f(g), where g ∈ R^2. To build a scattering wavelet trans-
form, consider φ_J(g) = 2^(−2J) φ(2^(−J) g), a Gaussian low-pass filter with scaling
factor J, and a mother wavelet ψ whose rotated and dilated versions are denoted by
{ψ_λ}, where λ = (θ, j), θ gives the orientation, and 2^j represents the binary scale,
with j ∈ {1, 2, . . . , J}. The calculation of the scattering coefficients [29] of each
layer is pictorially represented in Fig. 64.2. The theoretical and implementation
details of the ScatNet are available in [26] and the references given therein.
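For readers who want to reproduce this kind of representation, the sketch below computes second-order 2D scattering coefficients with the open-source Kymatio package. Kymatio is our suggestion and is not the implementation used in this chapter; the Scattering2D parameter names follow its documented interface as we recall it, so treat the call as an assumption rather than a reference.

import numpy as np
from kymatio.numpy import Scattering2D  # pip install kymatio (assumed available)

# A toy 64 x 64 "iris block"; in practice this would be one tile of the rubber-sheet.
block = np.random.rand(64, 64).astype(np.float32)

# J controls the number of dyadic scales, L the number of orientations,
# max_order the number of scattering layers (m in Fig. 64.2).
scattering = Scattering2D(J=3, shape=block.shape, L=8, max_order=2)
coeffs = scattering(block)            # roughly (n_paths, 64 / 2**J, 64 / 2**J)

feature = coeffs.mean(axis=(-2, -1))  # average each path -> one invariant descriptor
print(coeffs.shape, feature.shape)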

Fig. 64.2 ScatNet with J = 3, L = 8 and m = 0, 1, 2. As a result, λ = (θ, j) can range from λ1
to λ24. The scattering coefficients and wavelet modulus are shown by white and black circles,
respectively

64.5 Iris Classification Using ScatNet

The block diagram of the proposed ScatNet-based iris classification approach is
shown in Fig. 64.3. It consists of the following stages:
1. Extraction of the iris region of interest (ROI) from the iris image using segmen-
tation algorithm proposed in [27].
2. Conversion of circular iris region to a rectangular strip using Daugman’s rubber-
sheet model [2].
3. Weakly illuminated iris images are enhanced using fusion approach [28].
4. Square tessellation (tiling the rectangular rubber-sheet into smaller square blocks)
of the enhanced iris rubber-sheet into 16 blocks to extract the local texture features.
5. Estimation of the mean value of the scattering wavelet coefficients for each block of
the ROI and concatenation of the mean features of all blocks to create the final
feature vector.
6. The feature vector is then fed to a trained principal component analysis (PCA)
affine classifier to perform coarse iris classification.
The iris preprocessing and ROI extraction stages are summarized in Fig. 64.6.

64.5.1 Iris Segmentation and Enhancement

A robust iris segmentation is an essential step to extract the iris region from the
iris image accurately. Various approaches are available for iris segmentation. Since
visible iris images are very different from the NIR captured images, we have used the
recently published robust iris segmentation approach [27] which uses total variation
model to effectively segment iris images. Figure 64.4 shows the example of iris
segmentation with and without occlusion. In order to enhance the iris images acquired
under different illumination conditions, such as low light images, back-lit images, and
non-uniformly illuminated images, we have used a fusion-based image enhancement
algorithm [28]. Using this algorithm, the incoming image is first decomposed into its
illumination and reflectance components. From the estimated illumination, three
inputs are derived. To avoid artifacts, these three inputs are weighted and blended using a
multi-scale method. Figure 64.5 shows the example of the enhanced iris rubber-sheet
images. The processed (segmented and enhanced) iris rubber-sheet images will be
made available for further research in this field.
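Step 2 of the pipeline, Daugman's rubber-sheet model [2], maps the annular iris region to a fixed-size rectangular strip by sampling along radial lines between the pupil and iris boundaries. The sketch below is a simplified version assuming circular, concentric boundaries with known centers and radii; real segmenters (including [27]) return more general contours, so this is an illustration rather than the pipeline's actual code.

import numpy as np

def rubber_sheet(image, cx, cy, r_pupil, r_iris, height=180, width=360):
    """Unwrap the annular iris region into a height x width rectangular strip.
    Assumes concentric circular pupil/iris boundaries (a simplification)."""
    thetas = np.linspace(0.0, 2.0 * np.pi, width, endpoint=False)
    radii = np.linspace(0.0, 1.0, height)
    strip = np.zeros((height, width), dtype=image.dtype)
    for i, r in enumerate(radii):
        # Interpolate between the pupil and iris boundary along each radial line.
        rho = r_pupil + r * (r_iris - r_pupil)
        xs = np.clip((cx + rho * np.cos(thetas)).astype(int), 0, image.shape[1] - 1)
        ys = np.clip((cy + rho * np.sin(thetas)).astype(int), 0, image.shape[0] - 1)
        strip[i, :] = image[ys, xs]
    return strip

The default 180 x 360 strip size matches the rubber-sheet size used later for the Warsaw-BioBase-Smartphone-Iris images.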

Fig. 64.3 Flowchart of the proposed ScatNet-based iris classification



Fig. 64.4 Examples of iris segmentation with rubber-sheets: The top row shows proper iris seg-
mentation without occlusion. The bottom row shows iris segmentation with occlusion

Fig. 64.5 Iris rubber-sheet image enhancement: The left column shows the original rubber-sheet
images, while the right column shows the enhanced rubber-sheet images

Fig. 64.6 Summary of iris preprocessing and ROI extraction stages

64.5.2 Feature Vector Construction

To construct a ScatNet framework, any wavelet can be used; however, directed
complex wavelets are utilized here to capture the directional fiber structure
information of the iris. The most obvious choice is the Gabor wavelet, which is used
in many image processing applications because of its excellent joint spatial and
frequency localization. However, the Gabor wavelet has a nonzero mean, which
makes the feature vectors non-sparse. As a result, the Morlet wavelet, a zero-mean
variant of the Gabor wavelet, is utilized for classification.
An elliptical Morlet wavelet with eccentricity parameter ε is used, as given below:

ψ_Morlet(x, y) = (ε / (2πσ^2)) exp(−(x^2 + ε^2 y^2) / (2σ^2)) (exp(iωx) − β)    (64.1)

Fig. 64.7 Morlet filters with L = 8 and J = 3

where β is chosen so that the wavelet has zero mean. In all of the classification
experiments, we utilized a Morlet wavelet with ε = 0.5, ω = 3π/4, and σ = 0.8, as
shown in Fig. 64.7.
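As a rough illustration of Eq. (64.1), the sketch below builds a discrete elliptical Morlet filter with NumPy. The grid size, unit pixel spacing, and the way β is solved for zero mean are our own choices and are not taken from the paper.

import numpy as np

def elliptical_morlet(size=32, sigma=0.8, omega=3 * np.pi / 4,
                      eccentricity=0.5, theta=0.0):
    """Discrete elliptical Morlet filter following Eq. (64.1), rotated by theta.
    A unit pixel grid spacing is assumed."""
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half].astype(float)
    # Rotate coordinates so the oscillation lies along direction theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (eccentricity * yr)**2) / (2 * sigma**2))
    carrier = np.exp(1j * omega * xr)
    # Choose beta so that the filter sums to zero (zero-mean wavelet).
    beta = np.sum(envelope * carrier) / np.sum(envelope)
    return (eccentricity / (2 * np.pi * sigma**2)) * envelope * (carrier - beta)

filters = [elliptical_morlet(theta=k * np.pi / 8) for k in range(8)]  # L = 8 orientations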
To create the iris scattering features using Morlet wavelet filters, we choose the
number of orientations L = 8 and scales J = 1. We adopted a block-based approach
to extract scattering wavelet features and build a locally invariant model: the
enhanced iris rubber-sheet is divided into 16 blocks, and rubber-sheet sizes of
180 × 360 and 192 × 1536 are used for the Warsaw-BioBase-Smartphone-Iris and
UPOL database images, respectively. Each block undergoes a three-layer scattering
wavelet transform. The scattering vector of the pth layer has L^p · C(J, p)
coefficients, where C(J, p) is the binomial coefficient and p is the layer number;
after concatenating all layers up to order m, the feature vector size is
Σ_{p=0}^{m} L^p · C(J, p). The mean of each block is then calculated, and the main
feature vector (FV) is given by:

FV = (FV_b1, FV_b2, FV_b3, . . . , FV_b16)                    (64.2)
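A minimal sketch of this block-based construction (Eq. 64.2) is given below. The tiling into a 4 x 4 grid and the generic scattering callable are our assumptions about details the text leaves implicit (the paper only fixes the total of 16 blocks).

import numpy as np

def block_feature_vector(rubber_sheet, scattering, grid=(4, 4)):
    """Tile the rubber-sheet into grid[0] x grid[1] blocks, average the
    scattering coefficients of each block, and concatenate them (Eq. 64.2)."""
    h, w = rubber_sheet.shape
    bh, bw = h // grid[0], w // grid[1]
    features = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = rubber_sheet[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            coeffs = scattering(block.astype(np.float32))   # assumed shape (paths, h', w')
            features.append(coeffs.mean(axis=(-2, -1)))     # mean per scattering path
    return np.concatenate(features)                          # FV = (FV_b1, ..., FV_b16)

Here, scattering can be, for example, a Scattering2D object like the one in the earlier sketch, constructed with shape equal to the block size.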

To avoid the effect of occlusion, the partial iris (lower and side iris) information
is utilized. In these cases, we have used 8 blocks to construct the feature vector, as
described in Sect. 64.6. In this work, the iris classification with scattering wavelet
features is performed using two machine learning algorithms, namely PCA affine
classifier and SVM. The PCA affine classifier is a generative model, whereas the

SVM is a discriminative one. For the multi-class SVM implementation, we have used
the LIBSVM library [31] with a linear kernel, a regularization constant γ = 10^−4,
and all other hyperparameters kept at their default values.
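The generative PCA affine classifier is described in the scattering literature [18]. As a rough sketch of the general idea (our own simplified formulation, not the exact implementation used here), each class is modeled by its mean feature vector plus a low-dimensional principal subspace, and a query is assigned to the class with the smallest approximation error.

import numpy as np

class PCAAffineClassifier:
    """Per-class affine model: class mean plus d principal directions.
    A query is assigned to the class with the smallest residual after
    projection onto that class's affine subspace."""
    def __init__(self, n_components=5):
        self.d = n_components
        self.models = {}          # label -> (mean, basis)

    def fit(self, X, y):
        for label in np.unique(y):
            Xc = X[y == label]
            mu = Xc.mean(axis=0)
            # Principal directions of the centered class data via SVD.
            _, _, Vt = np.linalg.svd(Xc - mu, full_matrices=False)
            self.models[label] = (mu, Vt[: self.d])
        return self

    def predict(self, X):
        labels = list(self.models)
        residuals = np.empty((X.shape[0], len(labels)))
        for j, label in enumerate(labels):
            mu, V = self.models[label]
            centered = X - mu
            proj = centered @ V.T @ V                 # projection onto the class subspace
            residuals[:, j] = np.linalg.norm(centered - proj, axis=1)
        return np.array([labels[j] for j in residuals.argmin(axis=1)])

# Usage: clf = PCAAffineClassifier(n_components=5).fit(train_features, train_labels)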

64.6 Experimental Results

On the two databases, UPOL and Warsaw-BioBase-Smartphone-Iris, the classification performance of scattering wavelet features is assessed using two classifiers, namely PCA affine and SVM [18]. We have used fixed training and testing sets for classifying iris images of the UPOL database with the SVM classifier. We have also followed the same approach for a fair comparison using scattering wavelet network features, and additionally considered three different iris regions from Fig. 64.6, namely the complete iris (all 16 blocks), the lower iris (lower 8 blocks: 5, 6, 7, 8, 13, 14, 15, 16), and the side iris (side 8 blocks: 1, 4, 5, 8, 9, 12, 13, 16), to test the effect of full and partial iris information on classification accuracy. The reason for selecting the lower and side iris is to minimize the effect of occlusion by eyelids and eyelashes. The results are tabulated in Tables 64.3 and 64.4, respectively. Because a fixed training and testing split is not ideal for assessing classification, we have also evaluated the classification performance using tenfold stratified cross-validation (SCV) from a machine learning perspective; the results are reported in Table 64.5. The findings of the same SCV technique on the larger Warsaw-BioBase-Smartphone-Iris database are provided in Table 64.6. Bruna and Mallat [18] showed that for small training sets the generative PCA affine classifier outperforms the discriminative SVM classifier, and it can be clearly observed from the reported results that the generative PCA affine classifier outperforms the SVM classifier here as well. The proposed approach's performance is compared to existing approaches, and the results are presented in Table 64.8 (Table 64.7).
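A minimal sketch of such a tenfold stratified evaluation with scikit-learn is given below; the feature matrix and labels are random placeholders standing in for the block-scattering features and coarse iris classes, and the linear SVM stands in for either classifier.

# Hedged sketch: tenfold stratified cross-validation of a linear SVM on
# scattering features. X and y below are random placeholders.
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

X = np.random.rand(100, 144)      # placeholder block-scattering feature vectors
y = np.tile(np.arange(4), 25)     # placeholder coarse iris-class labels (4 classes)

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(SVC(kernel="linear"), X, y, cv=skf)
print(f"mean accuracy: {scores.mean():.4f} +/- {scores.std():.4f}")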

Table 64.3 Performance of PCA affine classifier on UPOL database using pre-defined training and
testing data set
Iris region Complete iris Lower iris Side iris
Train/test ratio 2/1 1/2 2/1 1/2 2/1 1/2
Accuracy (%) 99.22 97.26 100 97.65 98.44 95.70

Table 64.4 Performance of SVM classifier on UPOL database using pre-defined training and
testing data set
Iris region Complete iris Lower iris Side iris
Train/test ratio 2/1 1/2 2/1 1/2 2/1 1/2
Accuracy (%) 79.68 73.43 71.87 72.65 74.21 73.82

Table 64.5 Performance on UPOL database with tenfold stratified cross-validation (SCV) using
both classifiers
Iris region Complete iris Lower iris Side iris
Classifier PCA SVM PCA SVM PCA SVM
Accuracy (%) 81.93 74.87 84.42 67.65 81.78 69.96

Table 64.6 Performance on Warsaw-BioBase-Smartphone-Iris database with tenfold stratified cross-validation (SCV) using both classifiers
Iris region Complete iris Lower iris Side iris
Classifier PCA SVM PCA SVM PCA SVM
Accuracy (%) 92.31 81.05 91.46 73.75 92.33 74.65

Table 64.7 Performance on Warsaw-BioBase-Smartphone-Iris database with tenfold stratified cross-validation (SCV) using both classifiers
Iris region Complete iris Lower iris Side iris
Classifier PCA SVM PCA SVM PCA SVM
Accuracy (%) 92.31 81.05 91.46 73.75 92.33 74.65

Table 64.8 Classification performance comparison with proposed approaches on UPOL-Iris database
Algorithm Classifier Features Accuracy
Nalla and Chalavadi [7] SVM Online dictionary learning 100% (with dictionary size 90)
Emerich et al. [8] SVM Local intensity patterns 100%
ScatNet (proposed with lower iris) PCA affine Scattering wavelet features 99.22%
ScatNet (proposed with complete iris) PCA affine Scattering wavelet features 100%

64.7 Conclusion

In this research work, we have used a novel scattering wavelet network-based approach for coarse iris classification based on iris fiber structure. The experimental results are evaluated using both the generative (PCA affine) and discriminative (SVM) classifiers. It is observed that the PCA affine classifier works well with scattering wavelet features. The presented classification algorithm's usefulness is demonstrated on the UPOL and Warsaw-BioBase-Smartphone-Iris databases, achieving a classification accuracy of 100% with the lower iris in the fixed training/testing configuration (Table 64.3) and 92.33% with the side iris in the tenfold SCV configuration (Table 64.6), respectively. We have also studied the impact of partial iris information (lower iris and side iris) on classification performance. It is observed that partial iris information is sufficient for the classification of the iris based on its fiber structure. The proposed approach's performance is compared to that of other approaches in the literature, and the findings are presented in Table 64.8; the performance of the proposed ScatNet-based approach is comparable to these methods. The iris fiber structure information is useful for coarse iris classification, which is an important iris-indexing task that helps to decrease the search time over a large database. This will reduce the matching time during the de-duplication process in large-scale national identification programs.

References

1. Burge, M., Bowyer, K.: Handbook of Iris Recognition. Springer, New York, NY, USA (2013)
2. Daugman, J.: How iris recognition works. IEEE Trans. Circ. Syst. Video Technol. 14(1), 21–30
(2004)
3. Daugman, J.: Information theory and the IrisCode. IEEE Trans. Inf. Forens. Secur. 11(2),
400–409 (2016)
4. Aadhaar: Unique Identification Authority of India. https://uidai.gov.in/
5. Jain, A., et al.: Introduction to Biometrics. Springer, New York, NY, USA (2011)
6. De-duplication–the complexity in the Unique ID Context. Tech Report, 4G Identity Solutions,
pp. 1–9
7. Nalla, P., Chalavadi, K.: Iris classification based on sparse representations using on-line dic-
tionary learning for large-scale de-duplication applications. SpringerPlus 4(1), 1–10 (2015)
8. Emerich, S., et al.: Iris indexing based on local intensity order pattern, Proceedings of SPIE
10341, Ninth International Conference on Machine Vision, pp. 10341–10341-5 (2017)
9. Reddy, E., et al.: Biometric template classification: a case study in iris textures. In: Proceedings
of the International Conference on Biometrics (2007)
10. Unitree Foundation: The Rayid model of iris interpretation. https://rayid.com/iris-
patternsstructures/
11. Edwards, M., et al.: Analysis of iris surface features in populations of diverse ancestry. Royal
Soc. Open Sci. 3(1) (2016)
12. Dobes, M., Machala, L.: The database of human iris images. http://www.inf.upol.cz/iris/
13. Trokielewicz, M.: Exploring the feasibility of iris recognition for visible spectrum iris images
obtained using smartphone camera. In: Proceedings of SPIE, Photonics Applications in Astron-
omy, Communications, Industry, and High-Energy Physics Experiments, pp. 9662–9662-8
(2015)
14. Trokielewicz, M.: Iris recognition with a database of iris images obtained in visible light
using smartphone camera, pp. 1–6. In: IEEE International Conference on Identity, Security
and Behavior Analysis (ISBA) (2016)
15. Watson, C., Wilson, C.: NIST special database 4, print database. Technical report, National
Institute of Standards and Technology (1992)
16. Galar, M., et al.: A survey of fingerprint classification. Part I: Taxonomies on feature extraction
methods and learning models. Knowl.-Based Syst. (KB) 81, 76–97 (2015)
17. Mallat, S.: Group invariant scattering. Commun. Pure Appl. Math. 65(10), 1331–1398 (2012)
18. Bruna, J., Mallat, S.: Invariant scattering convolution networks. IEEE Trans. Pattern Anal.
Mach. Intell. 35(8), 1872–1886 (2013)
19. Yu, L., et al.: Coarse iris classification using box-counting to estimate fractal dimensions.
Pattern Recognit. 38(11), 1791–1798 (2005)
20. Ross, A., Sunder, M.: Block based texture analysis for iris classification and matching. In:
Proceedings of IEEE Computer Society Workshop on Biometrics, CVPR Conference, San
Francisco, USA, June 2010, vol. 3 (2010)

21. Zhang, H., et al.: Iris image classification based on color information. In: International Con-
ference on Pattern Recognition (ICPR), Japan, pp. 11–15 (2012)
22. Sun, Z., et al.: Iris image classification based on hierarchical visual codebook. IEEE Trans.
Pattern Anal. Mach. Intell. 36(6), 1120–1133 (2014)
23. Casia iris image database v4.0. Institute of Automation, Chinese Academy of Sciences. http://
biometrics.idealtest.org/
24. Nalla, P., Chalavadi, K.: Sparsity-based iris classification using iris fiber structures. In: Pro-
ceedings of International Conference of the Biometrics Special Interest Group-BIOSIG (2015)
25. Wang, Z., et al.: Local intensity order pattern for feature description. In: Proceedings of the
International Conference on Computer Vision. IEEE (2011)
26. Scattering wavelet network. https://www.di.ens.fr/data/scattering/
27. Zhao, Z., Kumar, A.: An accurate iris segmentation framework under relaxed imaging con-
straints using total variation model. In: IEEE International Conference on Computer Vision
(ICCV), pp. 3828–3836 (2015)
28. Fu, X., et al.: A fusion-based enhancing method for weakly illuminated images. Signal Process.
129, 82–96 (2016)
29. Birajadar, P., et al.: Unconstrained ear recognition using deep scattering wavelet network. In:
IEEE Bombay Section Signature Conference (IBSSC), pp. 1–6 (2019)
30. Thainimit, S., et al.: Iris surface deformation and normalization. In: 13th International Sympo-
sium on Communications and Information Technologies (ISCIT) (2013)
31. Chang, C., Lin, C.: LIBSVM: a library for support vector machine. ACM Trans. Intell. Syst.
Technol. 2, 27:1–27:27 (2011)
Chapter 65
Portfolio Optimization Using
Reinforcement Learning: A Study
of Implementation of Learning
to Optimize

Manish Sinha

Abstract Portfolio optimization is defined as the process of asset distribution to achieve optimum expected returns and/or minimize the associated financial risk. It is crucial for a financial risk manager to provide the best returns possible in the market and to calculate risks like value-at-risk. The problem of portfolio optimization is not new to the financial world, and approaches like the efficient frontier are already known. While the work on portfolio optimization is voluminous, this paper describes a portfolio optimization approach using reinforcement learning. This approach is particularly useful where the search space is very large and asset distribution must be done in real time; algorithmic trading could be a good candidate for this optimization approach. This paper also describes a few well-known optimization algorithms used in the financial industry. The main idea of this reinforcement learning (RL)-based approach is that the agent learns the weight distribution across the portfolio by rewarding and punishing the weight ratios, and by continuously doing so it can produce a real-time distribution of the weights. The study has been done on a portfolio of four stocks, though it can be extended to any number of stocks in the portfolio.

Keywords Reinforcement learning · Portfolio optimization · Weights distribution · Algorithmic trading · Constrained gradient descent

65.1 Introduction

Portfolio optimization is defined as the task of finding the best asset distribution in a portfolio such that the objective of maximizing expected returns or the objective of minimizing financial risk is optimized. The task needs historical and/or simulated data to learn from, as well as the learning technique itself.
Let us consider a portfolio consisting of n stocks of different firms. The returns of the stocks at a particular point in time are X = {x1, x2, …, xn}.

M. Sinha (B)
Research & Innovation, Analytics and Insight, Tata Consultancy Services, Mumbai, India
e-mail: manish.sinha2@tcs.com


The weights distribution for the stocks at the same time is W = {w1, w2, …, wn}. Therefore, the task is to find the weights such that the expected return is maximized:
W = \arg\max_{w} \sum_{i=1}^{n} w_i x_i

and constrained such that \sum_{i=1}^{n} w_i = 1.
In this paper, reinforcement learning (RL) is used to effectively calculate the weights in real time given the returns. This technique uses reward or punishment to find the optimal weights distribution. The approach is inspired by the question of how an action can learn the constrained weights toward the optimum, how a reward function can be designed to mimic human judgment, and what the state should be so that the agent can apply the learnt weights to it. The above constrained optimization problem is reformulated into a reinforcement learning problem. This paper also includes the weight distributions calculated using the efficient frontier, SLSQP, and L2 regularization in the efficient frontier on the same set of stocks.

65.1.1 Related Works

One of the most famous works on portfolio optimization is by Markowitz [1, 2]. The standard economic behavioral assumption is that the intention is to optimize the payoff for a given risk measure; a portfolio with such an allocation and distribution of weights is an optimized portfolio. Markowitz introduced the concept of the efficient frontier. This technique is very popular, and it chooses the best portfolio considering the trade-off between expected return and risk. Merton [3] showed that the efficient frontier is the positively sloped tangent to the Markowitz bullet. Low et al. [4] used Monte Carlo simulation to enhance mean–variance portfolio optimization. Rad et al. [5] found that weighting schemes based on risk minimization dominate weighting schemes based on expected return maximization for commodity portfolios. Davidson [6] developed a linear programming approach assuming the objective function is quadratic.
Sefiane and Benbouziane [7] use a genetic algorithm to select the portfolio based on the optimization of returns. They formulated an objective function that determines the weight of each asset in the portfolio so as to maximize return and minimize risk. An important aspect of using a genetic algorithm is the choice of mutation and crossover methodology. Li and Malik [8] showed that any optimization problem can be attempted using a reinforcement learning policy approach. Achiam et al. [9] proposed constrained policy optimization as a policy search algorithm for constrained reinforcement learning that assures near-constraint satisfaction. Hieu [10] explored a deep reinforcement learning approach in which the reward function considers the Sharpe ratio in addition to the wealth change.

Fig. 65.1 Standard agent-environment interface

65.2 Methodology of Using Reinforcement Learning

Reinforcement learning is a branch of machine learning in which an agent takes actions that try to maximize an objective function on the state provided by the environment, and learns from the resulting reward.

65.2.1 The Standard Agent-Environment Interface

A pictorial view of the standard agent-environment interface is given in Fig. 65.1. The agent takes input from the environment in the form of a state and a reward. Based on the policy applied to these inputs, the agent decides the action to be taken.
Formally, for each time t, the agent receives the environment state st ∈ S and reward r ∈ R, where t = {1, 2, 3, …}, S = {s1, s2, s3, …}, and R is a set of numerical rewards.
Based on the above, the agent decides the action at ∈ A(r, st). The action is decided by the policy p(s, a) that maps states to actions based on the reward.
Once the action is learned, it is applied to the environment to get the next state and reward. A is the set of all possible decision-making behaviors when the system is in a given state.

65.2.2 Reinforcement Learning Formulation for Portfolio Optimization

We have the historical closing prices of the stocks. Using the formula below, the daily return of each stock is determined:

\mathrm{Return}_t = (\mathrm{Price}_t - \mathrm{Price}_{t-1}) / \mathrm{Price}_{t-1}
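Assuming the closing prices are held in a pandas DataFrame (the hypothetical prices table below), this return computation is a one-liner:

# Hedged sketch: daily simple returns from closing prices held in a
# hypothetical pandas DataFrame with one column per stock.
import pandas as pd

prices = pd.DataFrame({"BRK": [280.0, 282.1, 281.5], "JPM": [120.0, 121.3, 119.8]})
returns = prices.pct_change().dropna()   # (Price_t - Price_{t-1}) / Price_{t-1}
print(returns)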



State
The approach is that, for a time t, the returns calculated from all the stocks are considered as one state st. This implies st is [stock1_returnt, stock2_returnt, stock3_returnt, …, stockn_returnt] for the n stocks in the portfolio.
Environment
The environment consists of the set of states; in our case, it is the set of returns calculated for each stock in the portfolio at time t. The environment calculates the expected return using the state (the stock returns) and the weights produced by the action, and then the reward is decided.
Reward
In this case, the reward function punishes the weight ratio provided by the action if the expected return comes out to be negative for the next state; otherwise, it rewards the weight distribution by sending it to the agent, which continues learning from the next state and the weight ratio.
Agent
The agent takes input from the environment in the form of the next state and the associated reward. If the reward is negative (a punishment), then the agent rejects the action (which comprises the newly created weight distribution), and the previous action (with its associated weight distribution) becomes the current action to be applied to the next state. If the reward is positive, then the agent uses the policy to get a new action (a new set of weights).
Action
In this approach, a new combination of the weight distribution at a time t is an action. We also keep a record of the previous action. If the reward is negative, we punish the action and make the previous action the current action.
Policy
The policy implemented in this case is called constrained gradient descent. The approach is basically gradient descent, but the constraint is that the weights calculated by the gradient step are rescaled so that the distribution sums to one.
The Reinforcement Learning Algorithm
The approach is inspired by Li and Malik [8], in which an RL problem is formulated out of an optimization problem. The learning-to-optimize approach is modified such that the weights are learnt subject to the constraint, and the algorithm is extended and modified for the portfolio optimization problem. The pseudo code is outlined below.
[Pseudo code figure: the algorithm is organized into Policy, Reward, and Agent routines.]
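Because the detailed listing appears only as a figure in the original, the sketch below reconstructs the spirit of the loop under stated assumptions: the state is the vector of daily returns, the policy takes a gradient step on the expected return and rescales the weights to sum to one, and a negative expected return on the next state is treated as a punishment that reverts to the previous weights. The 0.01 learning rate and the reward rule come from the text; the clipping step and everything else are illustrative guesses, not the author's exact code.

# Hedged sketch of the constrained-gradient-descent RL loop described in the text.
# states holds the daily returns, one row per day; weights always sum to one.
import numpy as np

def constrained_gradient_step(weights, state, lr=0.01):
    """Policy: gradient ascent on the expected return w.x, then rescale so sum(w) = 1."""
    w = weights + lr * state          # the gradient of w.x with respect to w is x
    w = np.clip(w, 1e-6, None)        # keep weights positive before rescaling (an assumption)
    return w / w.sum()                # enforce the sum-to-one constraint

def run_agent(states, lr=0.01):
    n = states.shape[1]
    weights = np.full(n, 1.0 / n)     # start from a uniform distribution
    prev_weights = weights.copy()
    for t in range(len(states) - 1):
        action = constrained_gradient_step(weights, states[t], lr)
        expected_return = float(action @ states[t + 1])
        if expected_return < 0:       # punishment: reject the new action
            weights = prev_weights
        else:                         # reward: keep the action and learn from the next state
            prev_weights, weights = weights, action
    return weights

rng = np.random.default_rng(0)
daily_returns = rng.normal(0.0005, 0.01, size=(1256, 4))  # stand-in for the four-stock data
print(run_agent(daily_returns))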

65.3 Evaluation

65.3.1 Experimental Setup

For simplicity, we consider a four-stock portfolio with the historical closing prices of four stocks from 12/07/2016 to 12/3/2021 from the New York Stock Exchange. The four stocks are BRK (Berkshire Hathaway), JPM (JP Morgan), BAC (Bank of America), and MS (Microsoft).
We considered the daily returns as the state of the day, so we have 1256 data points in total. We have implemented the agent, policy, action, and reward algorithms ourselves using the Python programming language.
For experimental purposes, the optimized weights were also calculated using the following well-known optimization algorithms:
1. Maximizing the Sharpe ratio using GRG nonlinear
2. Maximizing the Sharpe ratio using sequential least squares programming (SLSQP)
3. Minimizing volatility using sequential least squares programming
4. Maximizing the Sharpe ratio using the efficient frontier
5. Maximizing the Sharpe ratio using L2 regularization in the efficient frontier.
For the first two optimization approaches, the Python library scipy is used, and for the last two optimization approaches, the Python library pyportfolioopt is used.
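For reference, the sketch below shows how two of these baselines (an SLSQP max-Sharpe solve and the PyPortfolioOpt efficient frontier) can be computed; the synthetic price DataFrame is a placeholder for the actual NYSE data, and library defaults are assumed everywhere else.

# Hedged sketch: two of the baseline optimizers on a synthetic price DataFrame.
import numpy as np
import pandas as pd
from scipy.optimize import minimize
from pypfopt import EfficientFrontier, expected_returns, risk_models

prices = pd.DataFrame(np.random.lognormal(0.0005, 0.01, (500, 4)).cumprod(axis=0),
                      columns=["BRK", "JPM", "BAC", "MS"])
rets = prices.pct_change().dropna()
mu = rets.mean().to_numpy() * 252          # annualized expected returns
cov = rets.cov().to_numpy() * 252          # annualized covariance matrix

# Maximize the Sharpe ratio with SLSQP by minimizing the negative Sharpe ratio.
def neg_sharpe(w):
    return -(w @ mu) / np.sqrt(w @ cov @ w)

n = len(mu)
res = minimize(neg_sharpe, np.full(n, 1 / n), method="SLSQP",
               bounds=[(0, 1)] * n,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}])
print("SLSQP weights:", res.x.round(2))

# Maximize the Sharpe ratio with PyPortfolioOpt's efficient frontier.
ef = EfficientFrontier(expected_returns.mean_historical_return(prices),
                       risk_models.sample_cov(prices))
print("Efficient frontier weights:", ef.max_sharpe())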

65.4 Experimental Results

The first task was to calculate the daily returns, which was very simple to achieve using an Excel spreadsheet. Once we had the daily returns, we created the environment that contains those states and the reward function. For each state, we calculate the reward associated with the weights supplied by the action (initially based on a random selection). If the agent receives a reward, it learns the weights according to the newly arrived state; if it receives a punishment, it rejects the learnt weights and marks the previously learnt weights as the latest. The learning rate is kept at 0.01 for all four stocks.
A bar chart representing the resulting weight distributions (summarized in Table 65.1) is shown below.
[Bar chart of the weight distributions for BRK, JPM, BAC, and MS under each optimization approach; the values correspond to Table 65.1.]

65.5 Discussion

The incorporation of reinforcement learning in portfolio optimization is worth pursuing, considering the increasing number of works that use data-driven approaches. This work shows that it is achievable.
One may point out that no explicit goal is defined while we are learning through gradient descent. The explanation is that the very purpose of using reinforcement learning here is to learn in real time: the weights are updated based on the state the agent is in. Had it been a batch process, the well-known established optimization approaches would already suffice.
Another point that can be raised concerns the states we considered: are returns sufficient as a state? As this work is intended to study learning to optimize in a constrained manner for the portfolio, the returns were sufficient to achieve the result. We could also take the variance-covariance matrix of the stocks in the portfolio, or a statistic like the Sharpe ratio, as the state. However, computing such statistics from the returns takes time, and a system like algorithmic trading is time-critical.
One more argument can be made about the reward function: just as we adjust the weights when we get a reward, the weights could similarly be adjusted while punishing. Here, attention should be drawn to the objective of the learning, which is to optimize the distribution of the weights; the punishment is given to a weight distribution by rejecting it, since it cannot give the optimized weight distribution. An analogy is learning to ride a bicycle. While learning to ride, we adjust the speed, handlebar, brakes, position, etc. across the various states of riding to get the optimum result. If at any state an adjustment is not working, it is rejected, and we return to the previous adjustment and try to learn again from there.

Table 65.1 Weight distribution through different optimization approaches
Optimization approach BRK JPM BAC MS
GRG nonlinear 0.19 0.35 0 0.45
Maximize Sharpe ratio using SLSQP 0.18 0 0 0.82
Minimizing volatility using SLSQP 1 0 0 0
Maximize Sharpe ratio using efficient frontier 0.46 0 0 0.54
Maximizing Sharpe ratio using L2 regularization in efficient frontier 0.26 0.16 0.14 0.42
Reinforcement learning 0.18 0.2 0.3 0.3

It can be seen in Table 65.1 that some optimization algorithms assign a zero fraction to certain stocks. Practically, this is possible only when we are doing end-of-day calculations of the stocks; while doing algorithmic trading, we try to reduce the fraction slowly.
The strength of this approach is that it learns the weight distribution quickly and can behave optimally under normal conditions. There is no library dependency as such for this RL approach, as we have implemented it from scratch. Also, the technique here is not specific to stocks only; it can very well be applied to other instruments. A cash balance can also be considered while implementing the approach.

65.5.1 Future Work

In future work, it would be intriguing to consider a portfolio consisting of all kinds of instruments and let the algorithm learn the weight distribution for it. To achieve this, real-time data for the other instruments should also be available.
Different reinforcement learning techniques for obtaining the optimized weight distribution can also be tried out.
The state can also be enhanced by including industry and market indicators, to see whether the approach beats the market.
Deep learning can be tried instead of gradient descent, which is simple in this kind of setting as it is just a plug-and-play pattern. It will be interesting to see how an actor-critic learning architecture performs in portfolio optimization and wherever portfolio optimization is useful. In this work, we have used historical data to calculate the returns; simulation approaches like Monte Carlo or bootstrapping can also be used to generate returns, and calculating the optimum weight distribution using reinforcement learning on such simulated data will be interesting.
Another interesting direction is the calculation of value-at-risk using RL approaches; the probabilistic approach in RL can be used in conditional value-at-risk calculation.

References

1. Markowitz, H.M.: Portfolio selection. J. Finance 7(1), 77–91 (1952)


2. Markowitz, H.M.: Portfolio Selection: Efficient Diversification of Investments. Yale University
Press reprint (1970)
3. Merton, R.: An analytic derivation of the efficient portfolio frontier. J. Financ. Quant. Anal. 7
(1972)
4. Low, R., Faff, R., Aas, K.: Enhancing mean–variance portfolio selection by modeling
distributional asymmetries. J. Econ. Bus. 85, 49–72 (2016)
5. Rad H., Low, R., Miffre J., Faff R.: Does sophistication of the weighting scheme enhance the
performance of long-short commodity portfolios? J. Empir. Finance (2020)
6. Davidson, M.: Portfolio optimization and linear programming. J. Money Invest. Bank. 20
(2011)
7. Sefiane, S., Benbouziane, M.: Portfolio selection using genetic algorithm. J. Appl. Finance
Bank. 2(4), 143–154 (2012)
8. Li, K., Malik, J.: Learning to optimize. In: International Conference on Learning Representa-
tions (2017)
9. Achiam, J., Held, D., Tamar, A., Abbeel, P.: Constrained policy optimization. In: International
Conference on Machine Learning (2017)
10. Hieu, L.T.: Deep reinforcement learning for stock portfolio optimization. Int. J. Model. Optim.
10(5), 139–144 (2020)
11. Hutter, F., Hoos, H.H., Leyton-Brown, K.: Sequential model-based optimization for general
algorithm configuration. In: Learning and Intelligent Optimization, pp. 507–523. Springer
(2011)
12. Maclaurin, D., Duvenaud, D., Adams, R.: Gradient-based hyperparameter optimization through
reversible learning. arXiv preprint arXiv:1502.03492 (2015)
13. Humphrey, E., Benson, K.L., Low, R.K.Y., Lee, W.L.: Is diversification always optimal? Pac.
Basin Financ. J. 35, 521–532 (2015)
14. Fernando, K.V.: Practical portfolio optimization. Technical report, The Numerical Algorithms
Group, Technical Report (TR2/00): NP3484 (2000)
15. Fernholz, R.E.: Stochastic Portfolio Theory, vol. 48. Springer (2002)
16. Figueroa-Lopez, J.E.: A Selected Survey of Portfolio Optimization Problems (2005)
17. Geyer, A., Hanke, M., Weissensteiner, A.: A stochastic programming approach for multi-period
portfolio optimization. Comput. Manag. Sci. 6(2), 187–208 (2009)
18. Giannakouris, G., Vassiliadis, V., Dounias, G.: Experimental study on a hybrid nature-
inspired algorithm for financial portfolio optimization. In Konstantopoulos, S., Perantonis, S.,
Karkaletsis, V., Spyropoulos, C., Vouros, G. (eds.) Artificial Intelligence: Theories, Models
and Applications. Lecture Notes in Computer Science, vol. 6040, pp. 101–111. Springer,
Berlin/Heidelberg (2010)
19. Greyserman, A., Jones, D.H., Strawderman, W.E.: Portfolio selection using hierarchical
Bayesian analysis and MCMC methods. J. Bank. Finance 30(2), 669–678 (2006)
20. Meketa Investment Group. Risk parity. Technical report (2010)
21. Guastaroba, G.: Portfolio Optimization: Scenario Generation, Models and Algorithms. PhD
thesis, Universit’a degli Studi di Bergamo (2010)
22. von Neumann, J., Morgenstern, O.: Theory of Games and Economic Behavior. Princeton
University Press, 3rd edn. (1953)
23. Nishimura, K.: On mathematical models of portfolio selection problem. Manag. Rev. 26(1),
369–391 (1990)
24. Nocedal, J., Wright, S.J.: Numerical Optimization: Springer Series in Operations Research,
2nd edn. Springer (1999)
25. Ortobelli, S., Rachev, S.T., Stoyanov, S.V., Fabozzi, F.J., Biglova, A.: The proper use of risk
measures in portfolio theory. Int. J. Theor. Appl. Finance 8(8), 1–27 (2005)
26. Still, S., Kondor, I.: Regularizing portfolio optimization. New J. Phys. 12, 075034 (2010)

27. Thong, V.: Constrained Markowitz Portfolio Selection Using Ant Colony Optimization.
Erasmus Universiteit (2007)
28. Zhou, G.: Beyond Black-Litterman: letting the data speak. J. Portfolio Manag. 36(1), 36–45
(2009)
29. Karatzas, I., Fernholz, R.E.: Stochastic portfolio theory: an overview. Handb. Numer. Anal. 15,
89–167 (2008)
Chapter 66
License Plate Detection Techniques:
Conventional Methods to Deep Learning

Sahil Khokhar, Deepak Kedia, and Pawan Kumar Dahiya

Abstract With the explosive growth in the computational power and the field of
artificial intelligence, the monotonous chores in many applications have been auto-
mated. One of these applications is the monitoring of vehicles, be it on the roads,
parking of shopping malls, toll booths, or other such places. However, the accuracy
of the automatic license plate recognition (ALPR) systems is still not adequate. This
accuracy is still improving at a steady pace. The key areas of research in ALPR
systems are license plate detection, character segmentation, and character recogni-
tion. This paper will focus on the license plate detection part. The techniques used for
the license plate detection ranging from conventional techniques to machine learning
and deep learning techniques are reviewed and the incremental increase in the accu-
racy over the years is presented. An F-score of 97.63 with an average processing
time of 21.76 ms per image was obtained as a result of implementing the YOLOv3
algorithm.

Keywords ALPR · Object detection · Artificial intelligence · Deep learning ·


Intelligent transport systems

66.1 Introduction

The ALPR systems are used to recognize a vehicle’s license plate when an image
or video of the vehicle is provided. Some key applications of ALPR system are in
parking systems of malls, traffic surveillance and monitoring, road toll collections,
etc. The system used to focus on the following aspects [1]:
• Image Preprocessing: Image resizing, noise removal, color enhancement, etc.
• License Plate Detection: Locating the license plate of the image.
• Character Segmentation: Segmenting each character from the license plate.

S. Khokhar (B) · D. Kedia


GJUS&T, Hisar, Haryana 125001, India
e-mail: khokhar.sahil0809@gmail.com
P. K. Dahiya
DCRUST, Murthal, Sonepat, Haryana 131039, India


• Character Recognition: Recognizing the segmented character.


However, with the advancements in the field of machine learning and deep
learning, the need for image preprocessing has been on the decline and the areas
of key focus have been limited to license plate detection, character segmentation,
and character recognition.
The license plate detection started with traditional techniques which made use
of features of license plate such as geometry, color, etc. These techniques were not
robust enough to be deployed for field application. Significant improvements were
observed in the object detection field with the advent of machine learning and artificial
intelligence technologies.
The purpose of the paper is to provide a brief look into the techniques used for license plate detection over recent years. Moreover, a deep learning algorithm is also applied to the detection of the plate, and its results are compared with earlier techniques.
The rest of the paper is organized as follows. Section 66.2 chronologically
discusses the various techniques used for license plate detection. In Sect. 66.3, a
brief about the datasets and algorithm used in the experiment is given. In Sect. 66.4,
the results obtained are presented and in Sect. 66.5, the conclusion of the paper is
given.

66.2 License Plate Detection Techniques

The earliest techniques used for license plate detection were based on the geometric
features of the license plate. Cowdrey and Malekian [2] used the width, height, aspect
ratio, area, and internal contours of the license plate to validate it. In [3], Wang et al.
used the position of the license plate relative to the tail lights of the vehicle and then
applied morphological operations to extract the plate. These sorts of conventional methods use few computational resources, but at the cost of being relatively less robust.
As the computational resources became less expensive and with the explosive
growth in the field of artificial intelligence and machine learning, the research in
ALPR systems also followed suit. Rafique et al. [4] trained a support vector machine
(SVM) classifier with histogram of gradients (HoG) as the feature vector. The SVM
classifier offered better accuracy; however, it was too slow to be used in real-time
applications.
In simple machine learning algorithms, the accuracy of the system depended
heavily on the features that were provided to the algorithm. This problem was solved
by deep learning algorithms as the algorithm was itself responsible for choosing the
features that would offer the best accuracy. Rafique et al. [4] used the faster-RCNN
architecture and while comparing the results with the SVM-based detector, it was
apparent that the faster-RCNN method was more accurate and even significantly

faster. In [5], Liu and Chang proposed a cascaded architecture of three detectors to
improve the accuracy of the system.

66.3 License Plate Detection Using YOLOv3

In this section, the implementation of YOLO algorithm is discussed for the detection
of license plates.

66.3.1 Datasets

In contrast to using the same dataset for training and testing the algorithm, different datasets were utilized for the training and testing purposes. 2000 images from Google's Open Image Dataset version 6 (OIDv6) [6], which is an open-source project, were used for training the model. The model was tested on the UCSD dataset [7] and the dataset created by the License Plate Detection, Recognition and Automated Storage (LPDRA) project [8–13].

66.3.2 YOLOv3

The You-Only-Look-Once (YOLO) algorithm was developed by Redmon et al. [14] in 2015. YOLO is a CNN-based one-stage detector. The algorithm was further improved and released in subsequent versions, YOLOv2 and YOLOv3 [15]; YOLOv3 was released in 2018 by Redmon and Farhadi. The YOLOv3 model is a deep learning architecture that is made up of 53 convolutional layers and detects objects at multiple scales, i.e., small, medium, and large sizes. For the detection of license plates, the model was trained on the OIDv6 dataset for 6000 iterations. The loss and mean average precision (mAP) can be seen in Fig. 66.1.
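As a hedged illustration of deploying such a detector, the sketch below runs inference with a trained YOLOv3 model through OpenCV's DNN module; the configuration/weights file names, input resolution, and thresholds are assumptions, and the Darknet training itself (the 6000 iterations mentioned above) is not reproduced here.

# Hedged sketch: license-plate inference with a trained YOLOv3 model via OpenCV DNN.
# "yolov3-lp.cfg" / "yolov3-lp.weights" / "car.jpg" are placeholder file names.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3-lp.cfg", "yolov3-lp.weights")
layer_names = net.getUnconnectedOutLayersNames()        # the three YOLO output layers

img = cv2.imread("car.jpg")
h, w = img.shape[:2]
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)

boxes, confidences = [], []
for output in net.forward(layer_names):
    for det in output:                                   # det = [cx, cy, bw, bh, objectness, class scores...]
        score = float(det[4] * det[5:].max())            # objectness times best class score
        if score > 0.5:                                  # assumed confidence threshold
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(score)

keep = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)    # non-maximum suppression
for i in np.array(keep).flatten():
    print("plate box:", boxes[i], "confidence:", round(confidences[i], 2))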

66.4 Results

The detailed performance metrics of YOLOv3 are given in Table 66.1. On average, an accuracy of 95.38%, an F-score of 97.63, and a processing time of 21.76 ms/image were obtained when testing the YOLOv3 model on the UCSD and LPDRA datasets. Some results of successful detection from different angles and under varying illumination conditions are shown in Fig. 66.2.
The results of YOLOv3 are compared with other techniques in Table 66.2. As can
be seen, the accuracy of the system improved as more advanced techniques started

Fig. 66.1 Loss and mean average precision (mAP) chart of YOLOv3

Table 66.1 Performance of YOLOv3


Dataset Precision Recall F-score Accuracy Error
LPDRA 100 94.69 97.27 94.69 5.31
UCSD 99.64 96.9 98.26 96.57 3.43
Overall 99.87 95.5 97.63 95.38 4.62

being used rather than conventional techniques. Machine learning techniques were at first too slow to use, as the SVM took 3 s/image; nevertheless, it is now possible to use these techniques even in real time since the average processing time has dropped to 21.76 ms/image. It is clear that YOLOv3 not only has greater accuracy compared to the other referenced techniques, it is also the only technique that can be used for real-time applications.

66.5 Conclusion

The object detection technology has come a long way from the time when simple
geometric features such as the aspect ratio, width, height, etc. were the only options

Fig. 66.2 Successful detection of license plates using YOLOv3

Table 66.2 Comparison of various techniques to detect license plates


Technique used Accuracy Processing time (ms)
Color information and morphological operations [3] 75.8 –
Geometric projection [2] 87 –
SVM [4] 92.81 3000
Faster-RCNN [4] 94.45 70
CNN-based cascade structure [5] 91.6 202.3
YOLOv3 95.38 21.76

to recognize an object, to using much deeper features such as histograms of gradients, to letting the machine itself decide which features are worth utilizing. The accuracy of such systems has been constantly improving, with a steady decrease in processing time.
One of the main problems to address right now is the lack of a large and diverse dataset. The datasets used in this paper had only a few heavy vehicles like buses and trucks, so the model was not properly trained for those kinds of vehicles. Hence, a relatively lower accuracy was obtained for heavy vehicles as compared to cars.

References

1. Khokhar, S., Dahiya, P.K.: A review of recognition techniques in ALPR systems. Int. J. Comput.
Appl. 170(6), 30–32 (2017). https://doi.org/10.5120/ijca2017914867
2. Cowdrey, K.W.G., Malekian, R.: Home automation—an IoT based system to open security
gates using number plate recognition and artificial neural networks. Multimed. Tools Appl.

77(16), 20325–20354 (2018). https://doi.org/10.1007/s11042-017-5407-1


3. Wang, J., Bacic, B., Yan, W.Q.: An effective method for plate number recognition. Multimed.
Tools Appl. 77(2), 1679–1692 (2018). https://doi.org/10.1007/s11042-017-4356-z
4. Rafique, M.A., Pedrycz, W., Jeon, M.: Vehicle license plate detection using region-based convo-
lutional neural networks. Soft Comput. 22(19), 6429–6440 (2018). https://doi.org/10.1007/s00
500-017-2696-2
5. Liu, C., Chang, F.: Hybrid cascade structure for license plate detection in large visual surveil-
lance scenes. IEEE Trans. Intell. Transp. Syst. 20(6), 2122–2135 (2019). https://ieeexplore.
ieee.org/document/8447437
6. Google’s Open Image Dataset v6. https://opensource.google/projects/open-imagesdataset.
Accessed 15 Nov 2020
7. Dlagnekov, L., Belongie, S.: UCSD/Calit2 car license plate, make and model database. http://
vision.ucsd.edu/car_data.html. Accessed 15 Nov 2020
8. License Plate Detection, Recognition and Automated Storage project. http://www.zemris.fer.
hr/projects/LicensePlates/english/results.shtml. Accessed 15 Nov 2020
9. Srebrić, V., Ribarić, S.: Postupci poboljšanja kontrasta sivih slika. Graduation Thesis No. 1397,
Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, 2003
10. Adrinek, G., Ribarić, S.: Segmentacija slike na temelju pokreta. Graduation Thesis No. 1392,
Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, 2003
11. Kraupner, K., Ribarić, S.: Uporaba višeslojnog perceptrona za raspoznavanje brojčano-
slovčanih znakova na registarskim tablicama. Graduation Thesis No. 1396, Faculty of Electrical
Engineering and Computing, University of Zagreb, Zagreb, 2003
12. Haluška, J., Ribarić, S.: Razvoj programskih sustava u okruženju Khoros Pro 2001. Graduation
Thesis No. 1402, Faculty of Electrical Engineering and Computing, University of Zagreb,
Zagreb, 2003
13. Ribarić, S., Adrinek, G., Segvić, S.: Real-time active visual tracking system. In: Proceedings
of the 12th IEEE Mediterranean Electrotechnical Conference (IEEE Cat. No. 04CH37521),
Dubrovnik, Croatia, Nov 2004. https://doi.org/10.1109/MELCON.2004.1346816
14. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object
detection. arXivLabs (2015). https://arxiv.org/abs/1506.02640
15. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXivLabs (2018). https://
arxiv.org/abs/1804.02767
Chapter 67
Inverse Imaging: Reconstructing
High-Resolution Images from Degraded
Images

Samir Mishra, Mrunal Chide, Aishwarya Manmode, Rohit Pandita,


and Mansi Bhonsle

Abstract Inverse problems such as image denoising, super-resolution and image reconstruction are often addressed with complex unsupervised machine learning techniques. We propose a supervised deep learning approach for image denoising and super-resolution. Our proposed system follows the U-Net architecture, which takes noisy images as input. In this work, we show that a U-Net architecture consisting of convolutional and de-convolutional (transpose convolutional) neural networks does a good job of removing noise from the image and producing high-resolution images. The task belongs to a general class of inverse problems that fall under the mathematical branch of problems based on the posterior probability distribution, i.e., the probability of the parameter theta given the evidence X: Pr(theta|X). We show that supervised learning algorithms based on deep learning are capable of producing equivalent or better results than the unsupervised machine learning approach while maintaining the complexity even when the generality is extended.

Keywords Deep learning · Convolutional · Inverse problems · Posterior


distribution · Supervised · Transpose convolutional · U-Net · Unsupervised

67.1 Introduction

Denoising images, i.e., removing noise or disturbance from images and producing good quality images, is a complex task that can be achieved by deep learning, or more formally by a generalization of convolutional neural networks.

S. Mishra (B) · M. Chide · A. Manmode · R. Pandita · M. Bhonsle


G H Raisoni College of Engineering and Management, Pune, Maharashtra, India
e-mail: samir.mishra.cs@ghrcem.raisoni.net
M. Chide
e-mail: mrunal.chide.cs@ghrcem.raisoni.net
R. Pandita
e-mail: rohit.pandita.cs@ghrcem.raisoni.net
M. Bhonsle
e-mail: mansi.bhonsle@raisoni.net


Deep learning has proved to be not only efficient but also powerful for solving many such tasks. Denoising can produce a great impact by giving a high-resolution version of an image containing noise, which provides more clarity without losing the generality of the image.
In the proposed system, neural network techniques such as convolution neural
networks, MaxPooling, Dropout, concatenation and Lambda Layers are used. The
model adopts the U-Net [1] architecture and has approximately 15 million trainable
parameters.
Convolutional neural network or CNN has established a state-of-the-art approach
to image processing and pattern recognition problems such as face recognition, object
detection and other computer vision application [2].
CNNs are regularized versions of multilayer perceptrons [3]. A multilayer perceptron (MLP) contains many perceptrons divided into layers, where each artificial neuron or perceptron is connected to all neurons in the next layer, i.e., the network is fully connected. It consists of an input layer, hidden layers, and an output layer. In any neural network architecture, the middle layers are called hidden because their inputs and outputs are masked by the activation function. In a convolutional neural network, the hidden layers include layers that perform convolutions.
Here, CNNs are used to solve inverse problems such as denoising and super-resolution. An inverse problem is the inverse of a forward problem, i.e., calculating from a set of observations the factors that produced them. The proposed work includes convolutions and de-convolutions performed on the images. A de-convolutional or transpose convolutional neural network basically performs the inverse of the convolutional operation: in a CNN we reduce the height and width of the image, whereas in de-convolution we increase the height and width.
MaxPooling is basically used for downsampling the image representation, reducing the dimensions of the input data by combining the outputs of groups of neurons in one layer into single outputs in the next layer.
Dropout (or dilution) is a regularization technique for reducing overfitting by thinning the weights, i.e., randomly dropping out units (hidden and visible) during the training process of the network.
Concatenation is a neural network technique which takes a number of input tensors and concatenates them into a single tensor. The concatenation layer is used for skip connections, which improve feature processing and give a more efficient result.
Lambda layers are used to wrap arbitrary expressions as a layer object; they are used in the model to give it better generality in terms of image resizing (Fig. 67.1).
The proposed model consists of deep CNN, MaxPooling2D, Dropout, concatenation, and Lambda layers to improve system performance. Here, the initial layers perform the extension of patches, the hidden layers form the U-Net architecture with skip connections, and the end layer performs restoration.

Fig. 67.1 Denoising the image

67.2 Background

Over the past few decades, considerable research has been devoted to denoising images. Among the proposed methods, supervised deep convolutional networks and the deep image prior have gained popularity.
A supervised deep convolutional network requires a large set of training data.
Also, if the image to be denoised is significantly different from the training images
then it might lead to inferior results and may also create hallucinations.
Deep image prior (DIP) [4] overcomes this drawback to some extent. DIP is
capable of capturing the low-level statistics of the natural image using an unsuper-
vised learning model that does not require training images other than the image itself.
Also, it is more flexible. Nevertheless, the accuracy of DIP is usually inferior to the
supervised learning-based methods using deep convolutional neural networks and is
also susceptible to over-fitting problems.
To solve this problem of DIP, a paper proposed a novel deep generative network
with multiple target images and an adaptive termination condition. Specifically, they
utilized mainstream denoising methods to generate two clear target images to be
used with the original noisy image, enabling better guidance during the convergence
process and improving the convergence speed. Moreover, they adopted the noise level
estimation (NLE) technique to set a more reasonable adaptive termination condition,
which can effectively solve the problem of over fitting.
However, this approach requires a large number of gradient updates, resulting in long inference times; thus, its execution efficiency is relatively low.
The proposed model uses the U-Net architecture to perform low-level noise reduction while maintaining the generality of the image. It extracts the rich features embedded in the image and uses them to produce a high-resolution output [5]. It follows from the fact that, given a rich set of features (the evidence), the model tries to reproduce images by maximizing the instances that might have resulted in such features [6].

67.3 Model Architecture and Methods

In this paper, we propose a neural framework based on the U-Net architecture and skip connections resulting from the concatenation of layers. Skip connections forward the essential rich features of the image, which results in better processing of the images. To simplify, consider that noisy images are basically a combination of latent noiseless images and noise. At the start of the network, the input data carries the necessary features and patterns that describe the latent image. These features thus provide what is needed to generate images without loss of composition. In our model, we have maintained these features throughout and processed them in the orderly fashion that the U-Net architecture suggests (Fig. 67.2).
The model consists of three building blocks, named the downsampler, the middle block, and the upsampler; a minimal Keras sketch of this layout follows the list below.
1. Downsampler: It downsamples the training images and is divided into three stages. The first stage uses three convolutional layers (CNNs), followed by a concatenation of the 3rd CNN with the 1st CNN and then a MaxPooling layer. We use "same" padding in the CNN layers to preserve the shape of the tensor for concatenation; the concatenation is responsible for forwarding rich features. The second and third stages perform the same operations as the first stage. The MaxPooling layers are responsible for the downsampling of the images, and they also produce the output of each stage. The activation is "ReLU" for every CNN in the downsampler, and the number of units is doubled in every stage.

Fig. 67.2 Model

2. MiddleBlock: The output of the downsampler is passed to this block, which does the same as the first stage of the downsampler. Here as well the activations are "ReLU", but the number of units or neurons remains the same for every CNN.
3. Upsampler: This block upsamples the input passed by the middle block using transpose CNNs. It also has three stages, like the downsampler, and all perform the same job of upsampling the input. Each stage has a Conv2DTranspose layer which performs the transpose convolution, followed by a concatenation with the corresponding stage of the downsampler (upper layers of the upsampler are concatenated with lower layers of the downsampler). This is followed by a Dropout layer and then two back-to-back CNNs. The activation remains "ReLU" throughout, the padding is "same", and the number of units decreases going down the stages.
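As a rough illustration of this three-block layout, the sketch below builds a much smaller Keras model with a single stage per block; the filter counts and depth are illustrative assumptions and do not reproduce the roughly 15-million-parameter network described above.

# Hedged sketch: a compact U-Net-style denoiser with one stage per block.
# Filter counts and depth are illustrative, not the authors' exact configuration.
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_toy_unet(input_shape=(128, 128, 3)):
    inp = layers.Input(shape=input_shape)

    # Downsampler stage: stacked convs, a skip concatenation, then MaxPooling.
    c1 = layers.Conv2D(32, 3, padding="same", activation="relu")(inp)
    c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(c1)
    c3 = layers.Conv2D(32, 3, padding="same", activation="relu")(c2)
    d1 = layers.Concatenate()([c3, c1])
    p1 = layers.MaxPooling2D()(d1)

    # Middle block: same pattern, constant number of units.
    m = layers.Conv2D(64, 3, padding="same", activation="relu")(p1)
    m = layers.Conv2D(64, 3, padding="same", activation="relu")(m)

    # Upsampler stage: transpose conv, skip connection to the downsampler, dropout, convs.
    u1 = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(m)
    u1 = layers.Concatenate()([u1, d1])
    u1 = layers.Dropout(0.2)(u1)
    u1 = layers.Conv2D(32, 3, padding="same", activation="relu")(u1)
    out = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(u1)  # restored RGB image
    return Model(inp, out)

model = build_toy_unet()
model.summary()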

67.4 Training

67.4.1 Working on Data Sets and Creating Labels

For the first task, we need to acquire a data set to work on. Since denoising is a general noise-reduction problem, only a small data set is needed for training. We have chosen the BSDS500 data set from Berkeley.edu, which consists of 500 different files.
The second step is to decode the images from the BSDS500 data set, which can be easily done with the help of libraries such as OpenCV and TensorFlow.
The third step is to divide the data set such that we have labels. Since our model is based on a supervised technique to produce high-resolution images, we need labels to map the inputs to their correct outputs. The Keras built-in function is used to divide the data set into training and validation data sets.
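A hedged sketch of this data-preparation step is shown below; the Gaussian noise level, the 128 × 128 resizing, the 90/10 split, and the directory path are illustrative assumptions rather than the exact pipeline used by the authors.

# Hedged sketch: build (noisy, clean) training pairs and a 90/10 split.
# The glob pattern is a placeholder for the decoded BSDS500 image files.
import glob
import cv2
import numpy as np

image_paths = sorted(glob.glob("BSDS500/images/*.jpg"))
clean = []
for path in image_paths:
    img = cv2.imread(path)
    img = cv2.resize(img, (128, 128)).astype(np.float32) / 255.0
    clean.append(img)
clean = np.array(clean)

noisy = np.clip(clean + np.random.normal(0, 0.1, clean.shape), 0.0, 1.0)  # additive Gaussian noise

split = int(0.9 * len(clean))                     # 90% training / 10% validation
x_train, y_train = noisy[:split], clean[:split]
x_val, y_val = noisy[split:], clean[split:]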

67.4.2 Optimizer

The proposed system uses the Adam optimizer with beta_1 = 0.9 and beta_2 = 0.999. The input shape is taken to be (128, 128, 3). The input layer is passed through the downsampling block, which performs the extension of patches; the output of this is passed as input to the middle block. The output of the middle block is then passed as input to the de-convolutional block, which performs transpose convolution (de-convolution) of the input and generates the output, which is then checked against the label.
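A minimal training call under those settings might look like the sketch below, reusing the toy model and the noisy/clean arrays from the earlier sketches; the beta values, MSE loss, and 500 epochs come from the text, while the batch size is an assumption.

# Hedged sketch: compiling and training the denoiser with Adam and the MSE loss,
# reusing the toy model and data arrays from the earlier sketches.
from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(beta_1=0.9, beta_2=0.999), loss="mse", metrics=["accuracy"])
history = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    epochs=500, batch_size=16)    # the batch size is an assumption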

The system also made use of the skip connection in the de-convolutional block.
As we talk about the shallow layers or layers which are not deep in the network
consist of rich features of the image such as corners, edges, diffusion of colours, etc.
But as we go deeper in the network, layers tend to learn different interesting features,
by using skip connection we allow our network not just to learn those new features
but also carry on the rich features [7, 8] about the image giving greater power to our
network. The output of our de-convolutional block is an image with the same shape
as our image, i.e. (128, 128, 3).
The model seems to do a very good job of matching the patterns to reduce the
noise. The generality is achieved in conjunction with mean squared error (MSE) loss
function and Adam optimizer to achieve momentum regularization.
One great advantage of the proposed model is the absolute result. No matter how
deep the noise is introduced in the image, the noise reduction does its job the same
every time.
Here, the system has divided the representation into three parts, 1st part shows
the label image that is the true image, 2nd part shows the true image induced with
different intensity of noises and 3rd part shows that when given noised image to our
model it produces the absolute same results.

67.5 Results and Discussions

The work resulted in an accuracy of 92.01% on validation data and 85.09% on training data, where the training data consisted of 90% of the 100 images and the validation data consisted of the remaining 10%. The model was trained for just 500 epochs and performed very well in producing high-resolution images while maintaining the generality of the image.

67.6 Conclusion

Thus, we have shown that the supervised deep learning technique suffices for denoising images and producing high-resolution images without losing the generality of the image. We have tried to show the power of deep learning for computer vision applications on inverse problems, and to show that this general representation of state-of-the-art technology can also be extended to the proper reconstruction of degraded images.

References

1. U-Net: Convolutional Networks for Biomedical Image Segmentation



2. O’Shea, K., Nash, R.: An Introduction to Convolutional Neural Networks


3. Wikipedia: Convolutional Neural Network
4. Ulyanov, D., Vedaldi, A., Lempitsky, V.: Deep Image Prior
5. Mastan, I.D., Raman, S.: Multi-level Encoder-Decoder Architectures for Image Restoration
6. McCann, M.T.: Convolutional Neural Networks for Inverse Problems in Imaging: A Review
7. Ongie, G., Jalal, A., Metzler, C.A., Baraniuk, R.G.: Deep Learning Techniques for Inverse
Problems in Imaging
8. Zhang, C., Bengio, S., Hardt, M., Recht, B., Vinyals, O.: Understanding deep learning requires
rethinking generalization. In: ICLR (2017)
Chapter 68
Cognitive Baby Care Solution for Smart
Parenting

V. Sireesha, Nagaratna P. Hegde, Anisha Kollipara, Meghana Ganapa,


and M. S. V. Sashi Kumar

Abstract In the twenty-first century, families lead busy lives, especially in metro cities, where a heavy degree of work-life imbalance is quite common. Paediatric care of the child by the parents is challenging since the child's patterns keep changing over time. Patterns such as the baby getting hungry, wetting the mattress, or having happy/cheerful moments can be monitored, and using cognitive services, the child can be better looked after. In this work, an IoT solution is proposed that aims at assisting parents in capturing the changing baby routines, which can be monitored using the cognitive data generated by the Smart Cradle. The prototype was designed and tested using a dataset and achieved a promising success rate of 95%.

Keywords Facial expression recognition · IoT · Cognitive services · IFTTT ·


Arduino IDE

68.1 Introduction

Parenting is an art, and over the years, young couples are expected to have learnt it from their previous generations. They did, in making the kids happy, earning more, taking up new challenges for a better living, and competing every moment, but in return they have forgotten to take essential care of the newborn. A Catalyst study says that for 46% of parents, both mother and father work as full-time employees. The busy schedule of parents has created a lifestyle that transforms their lives into a vicious circle of accomplishments rather than focusing on the needs of the child at all times. In this work, the product was designed to offer an enriching experience for young parents to learn from the patterns the child makes over time. Infants evolve at

V. Sireesha (B) · N. P. Hegde · A. Kollipara · M. Ganapa


Department of CSE, Vasavi College of Engineering, Hyderabad, Telangana, India
e-mail: v.sireesha@staff.vce.ac.in
N. P. Hegde
e-mail: nagaratnaph@staff.vce.ac.in
M. S. V. S. Kumar
Alight, Hyderabad, Telangana, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 743
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8_68

a unique pattern of their own, which is quite challenging for anyone other than the mother to resonate with. Thus, we developed a continuous surveillance system to keep a watch on the baby and notify the parents/guardians to be vigilant at all times [1–7]. This smart system was carefully designed and implemented using mature present-day technologies, i.e. image processing, artificial intelligence, and IoT.

68.2 Literature Survey

Lavner et al. [8] developed an automatic baby cry detection system using audio signals. They did this using machine learning algorithms such as a logistic regression classifier with Mel-frequency cepstrum coefficients. Asthana et al. [9] analysed the cry patterns of babies; their work dealt with analysing the audio signals using signal processing methods such as autocorrelation and linear prediction analysis of the frequency of baby sounds.

68.3 Design

The prototype proposes a unique solution that helps parents keep a continuous surveillance/watch on the baby and get notified. As shown in Fig. 68.1, it uses a microcontroller which is interfaced with a sensor used to detect moisture, and a camera module to capture the facial expressions of the child. When the microcontroller records a rise in moisture, the sensor detects it and the microcontroller notifies the user on his mobile phone. There is a continuous monitoring mode in which the system takes pictures of the baby in the cradle every 2 min and sends a message to the parent if the baby is crying or is in fear. Based on the situation, the prototype also plays out loud a voice, which could be a pre-recorded audio of the mother or father, a song that the child likes, etc., which can be customized on the parents' device, to make the baby feel comfortable.
The design of the solution consists of five stages: capturing the image, plotting the image, facial expression recognition/baby mood detection, IFTTT webhooks, and playing a sound/storing images.
A. Capturing of Image
OpenCV is a huge open-source library for computer vision, machine learning, and image processing. Using it, one can process images and videos to identify objects, faces, or even human handwriting. The image of the baby is captured using the OpenCV module, and the captured image is then placed in the same folder as the source code.
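As an illustration of this step (not the authors' exact code), a minimal OpenCV capture could look like the following; the camera index 0 and the file name baby.jpg are our assumptions.

import cv2

# Minimal sketch: grab a single frame from the default webcam and save it
# alongside the source code, as described above.
camera = cv2.VideoCapture(0)        # 0 = default camera (assumed index)
ok, frame = camera.read()           # ok is False if no frame could be read
camera.release()

if ok:
    cv2.imwrite("baby.jpg", frame)  # "baby.jpg" is an illustrative file name
else:
    print("Could not read a frame from the camera")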
B. Plotting of Image

Fig. 68.1 System design

Matplotlib is one of the most popular Python packages used for data visualization. It is a cross-platform library for making 2-D plots from data in arrays. We used the matplotlib library for plotting the captured image.
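A small illustrative snippet (ours, not taken from the paper) for this plotting step, assuming the frame was saved earlier as baby.jpg:

import matplotlib.pyplot as plt
import matplotlib.image as mpimg

img = mpimg.imread("baby.jpg")   # "baby.jpg" is the assumed output of the capture step
plt.imshow(img)                  # display the captured frame
plt.axis("off")
plt.title("Captured frame")
plt.show()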
C. Facial Expression Recognition
Facial expression recognition software is a technology which uses biometric markers to detect emotions in human faces. More precisely, this technology is a sentiment analysis tool and is able to automatically detect the six basic or universal expressions: happiness, sadness, anger, surprise, fear, and disgust. We have used a Python module named deepface for the facial expression recognition of the baby, to know whether the baby is crying or not. The deepface module takes its name from the DeepFace face recognition system created by Facebook.
In modern face recognition there are three steps:
1. Detect: the OpenCV module of Python is used to detect the face in the picture taken.
2. Align: a frontal face is generated from the input image, which might contain various angles.
3. Represent and classify: the deepface module is trained to classify the images of people based on their emotions, identities, race, etc.
If the baby’s expression is sad or scared then the message goes to the parent and
simultaneously a song is played to calm down the baby.
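A hedged sketch of this emotion-detection step is given below; it is not the authors' notebook code. The file name baby.jpg is carried over from the capture sketch, and the exact return format of deepface can differ between library versions.

from deepface import DeepFace

IMG = "baby.jpg"   # assumed file name from the capture step

# Analyse only the emotion; the output mirrors the dictionaries shown in Sect. 68.5.
result = DeepFace.analyze(img_path=IMG, actions=["emotion"], enforce_detection=False)
if isinstance(result, list):      # newer deepface versions return a list of results
    result = result[0]

print(result["emotion"])          # percentage score for each emotion
dominant = result["dominant_emotion"]

if dominant in ("sad", "fear"):
    print("Baby seems distressed - notify the parent and play a song")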
D. IFTTT webhooks
Webhooks allow you to make or receive a web request with IFTTT. This means that we can get applications not already supported by IFTTT to talk to IFTTT. We use processing to create an HTTP request that alerts our webhook, triggering an action in IFTTT. We used IFTTT webhooks in two places: one to send a message to the parent if the cradle is wet, and another to send a message to the parent if the baby is crying. We used the SMS applet for sending the messages.
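The Webhooks request itself is a plain HTTP POST to the standard maker endpoint; the event name baby_alert and the key placeholder in the sketch below are our assumptions, not values from the paper.

import requests

IFTTT_KEY = "YOUR_WEBHOOKS_KEY"     # found in the Webhooks applet settings
EVENT = "baby_alert"                # hypothetical event name

url = f"https://maker.ifttt.com/trigger/{EVENT}/with/key/{IFTTT_KEY}"
payload = {"value1": "The baby is crying"}   # value1..value3 become ingredients of the SMS

response = requests.post(url, json=payload, timeout=10)
print(response.status_code, response.text)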
E. Play sound and store images

We used playsound for playing an mp3 song when the baby is crying. We need to download a song into the folder and provide its path. In case the baby is not crying, the images that are captured are stored.

68.4 Implementation

The prototype implementation is done in modules, i.e. to check the cradle wetness
and the baby mood.
A. Design to check if the cradle is wet or not:
We have used a water sensor to detect the wetness of the cradle. The water sensor has 3 pins:
1. Sensor (S)
2. Positive
3. Ground
The sensor pin is connected to the D2 pin of the NodeMCU, and the positive and ground pins are connected to voltage and ground, respectively. When the sensor detects water, the D2 pin goes low. At this very instant, we trigger an event and send a message to the parent using the IFTTT webhooks service. IFTTT allows us to create applets, and each applet has its own key. We used the SMS applet for this purpose to connect to the Android phone. When the event is triggered, a notification is sent. This event can only be triggered when the NodeMCU is connected to Wi-Fi, so we imported the WiFiESP8266 library and connected the NodeMCU to Wi-Fi (Fig. 68.2).

Fig. 68.2 Circuit diagram

B. Design to detect if the baby is crying

To execute the programme, we used a Jupyter notebook launched from Anaconda. Initially, we had to create the environment in Anaconda. Then, we opened the Anaconda command prompt and installed the libraries using the pip install tensorflow command. After that, we installed deepface and matplotlib.
The AI approach used in our project is deep learning with neural networks. We used the deepface library in Python for emotion recognition, and the matplotlib library for plotting the graph. The graph and analysis of the image are required to recognise the emotions. Then, we ran all the cells. The code executes automatically every 30 minutes. The percentages of each emotion, such as sad, happy, fear, surprise, neutral, and anger, are displayed and the dominant emotion is highlighted. Whenever the dominant emotion is sadness or fear, the IFTTT service is triggered using the requests library. This posts the request and triggers the IFTTT service to send the message to the parent. At the same time, a song is played automatically so that the baby can relax and stop crying. We did this using the playsound library, which we imported into our Jupyter notebook.
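To show how the pieces above could fit together, a compact monitoring loop is sketched below. It is an illustration under our own assumptions (file names, event name, the 30-minute interval taken from the text), not the authors' notebook.

import time
import cv2
import requests
from deepface import DeepFace
from playsound import playsound

IFTTT_URL = "https://maker.ifttt.com/trigger/baby_alert/with/key/YOUR_WEBHOOKS_KEY"
SONG_PATH = "lullaby.mp3"            # assumed local path of the soothing song
INTERVAL = 30 * 60                   # 30-minute cycle mentioned in the text

def capture_frame(path="frame.jpg"):
    cam = cv2.VideoCapture(0)
    ok, frame = cam.read()
    cam.release()
    if ok:
        cv2.imwrite(path, frame)
    return ok, path

while True:
    ok, path = capture_frame()
    if ok:
        result = DeepFace.analyze(img_path=path, actions=["emotion"],
                                  enforce_detection=False)
        if isinstance(result, list):
            result = result[0]
        if result["dominant_emotion"] in ("sad", "fear"):
            requests.post(IFTTT_URL, json={"value1": "Baby needs attention"})
            playsound(SONG_PATH)     # play the pre-recorded song to calm the baby
    time.sleep(INTERVAL)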

68.5 Results

The prototype was designed for the parents and tested using a dataset, and it gave the right result 95% of the time. It will surely make parenting easier, as the changing baby routines can be monitored using the cognitive data generated by the Smart Cradle. The results are as follows (Figs. 68.3 and 68.4):

Fig. 68.3 Dataset used to test the prototype

Fig. 68.4 Set up



Results obtained when we ran the code on the dataset to test our prototype (Figs. 68.5 and 68.6):

Fig. 68.5 Circuit diagram for connecting water sensor to NodeMCU

Fig. 68.6 Output of messages sent

{'region': {'x': 120, 'y': 18, 'w': 89, 'h': 126}, 'emotion': {'angry': 8.18801075220108, 'disgust': 5.9294899301676196e-05, 'fear': 0.048710257397033274, 'happy': 54.901885986328125, 'sad': 36.74006164073944, 'surprise': 0.09823146974667907, 'neutral': 0.02303722285432741}, 'dominant_emotion': 'happy'}

{'region': {'x': 342, 'y': 122, 'w': 278, 'h': 346}, 'emotion': {'angry': 21.85674011707306, 'disgust': 0.00016552100987610174, 'fear': 31.24159574508667, 'happy': 18.50680261850357, 'sad': 28.361457586288452, 'surprise': 0.0044103780965087935, 'neutral': 0.028828406357206404}, 'dominant_emotion': 'fear'}

{'region': {'x': 69, 'y': 26, 'w': 89, 'h': 138}, 'emotion': {'angry': 2.1042696571828734e-14, 'disgust': 0.0, 'fear': 2.3757470529006847e-18, 'happy': 100.0, 'sad': 1.7677300762185984e-17, 'surprise': 1.8309259387535523e-25, 'neutral': 6.345754096925349e-28}, 'dominant_emotion': 'happy'}

{'region': {'x': 172, 'y': 11, 'w': 121, 'h': 176}, 'emotion': {'angry': 1.4828342615524922e-16, 'disgust': 3.127301701316922e-32, 'fear': 2.5302097844804647e-17, 'happy': 99.9982476234436, 'sad': 3.398077808702539e-15, 'surprise': 0.001086571046471363, 'neutral': 0.0006724389550072374}, 'dominant_emotion': 'happy'}

{'region': {'x': 162, 'y': 26, 'w': 124, 'h': 167}, 'emotion': {'angry': 8.344077816252579, 'disgust': 2.1511985143125476e-08, 'fear': 1.6744335248728708, 'happy': 0.02866490274626182, 'sad': 4.77254270151512, 'surprise': 0.0018878599109691887, 'neutral': 85.17839224209739}, 'dominant_emotion': 'neutral'}

{'region': {'x': 56, 'y': 43, 'w': 109, 'h': 149}, 'emotion': {'angry': 6.8149141441153915e-06, 'disgust': 3.280159076480042e-16, 'fear': 0.0028966051104362123, 'happy': 99.80378746986389, 'sad': 3.4208846955152694e-05, 'surprise': 0.19327725749462843, 'neutral': 1.203642185920728e-07}, 'dominant_emotion': 'happy'}

68.6 Conclusion

We developed a baby monitoring tool that dynamically sends messages to the parent when the cradle is wet and when the baby is crying or is in a state of fear. We did this using the NodeMCU, the IFTTT service and some Python libraries. We could still develop the tool further by procuring more hardware, such as a weight sensor and a few other sensors that can track the pulse rate, etc., of the baby.

68.7 Future Work

As noted, the prototype was built using the NodeMCU, the IFTTT service, and some Python libraries. The tool could be further developed by procuring more hardware, such as a weight sensor and a few other sensors that can track the pulse rate of the baby. Also, this work can further be extended to capture the candid moments the baby makes over time.

References

1. Karkhanis, D., Kendre, Y., Hande, S., Dhawale, S.: Smart cradle system. Int. J. Creat. Res.
Thoughts 9(12) (2021)
2. Gare, H.S., Shahane, B.K., Jori, K.S., Jachak, S.G.: IoT based smart cradle system for baby
monitoring. IJCRT 8(3) (2020)
3. Srivastava, A., Yashaswini, B.E., Jagnani, A., Sindhu, K.: Smart cradle system for child
monitoring using IoT. Int. J. Innov. Technol. Expl. Eng. (IJITEE) 8(9) (2019). ISSN: 2278-3075
4. Gare, H.S., Shahne, B.K., Jori, K.S., Jachak, S.G.: IOT based smart cradle system for baby
monitoring. Int. Res. J. Eng. Technol. (IRJET) 06(10) (2019)
5. Kumaravel, A., Ramesh, S., Ramya, M., Ranjani, J.: Smart cradle for baby monitoring using
IOT. Int. J. Adv. Res. Comput. Commun. Eng. 10(5) (2021)
6. Kavitha, S., Neela, R.R., Sowndarya, M., Madhuchandra, Harshitha, K.: Analysis on IoT based
smart cradle system with an android application for baby monitoring. In: 2019 International
Conference on Advanced Technologies in Intelligent Control, Environment, Computing &
Communication Engineering (ICATIECE) (2019)
7. Mehetre, D.C., Joshi, M.P.: IoT based smart cradle system with an android app for baby
monitoring. In: 2017 International Conference on Computing, Communication, Control and
Automation (ICCUBEA), 17–18 Aug 2017

8. Lavner, Y., Cohen, R., Ruinskiy, D., Ijzerman, H.: Baby cry detection in domestic environment
using deep learning. In: IEEE International Conference on the Science of Electrical Engineering
(ICSEE), 2016
9. Asthana, S., Varma, N., Mittal, V.K.: Preliminary analysis of causes of infant cry. In: IEEE
International Symposium on Signal Processing and Information Technology, ISSPI, 15–17 Dec
2014
Chapter 69
Present NLP Status for Dogri Language

Vipul Saluja, Annie Rajan, and Jyotshna Dongardive

Abstract As a result of the recent information boom, the contents of the Internet
are increasing in the multilingual form and mostly in the form of natural languages.
Therefore, the research on Natural Language Processing (NLP) tasks of regional
languages is very important. Dogri, the state language of Jammu and Kashmir (J&K),
has remained as a low-resourced language in the field of NLP. In this survey, the
contributions of various researchers for Dogri NLP are reviewed and discussed.
Sources of Dogri dataset and tools for processing Dogri are explored. Challenges such
as the complex linguistic tasks and lack of digital resources are also discussed. The
future scope and basic requirements for improving various Dogri language processing
tasks are highlighted.

Keywords Dogri · Natural language processing · Low resource language · Digital resources

69.1 Introduction

India has more than 833 million Internet users [1], but only a fraction of this population is fluent in English. Most of the online services and content currently available on the web are, however, exclusively in English. This language barrier leads to a digital divide in the world's second-largest Internet market. Therefore, the development of NLP tasks for regional languages is very important. Nowadays, people in India and across the world are increasingly using regional languages on the social

V. Saluja · J. Dongardive (B)


University Department of Computer Science, University of Mumbai, Mumbai, India
e-mail: jyotsna@udcs.mu.ac.in
V. Saluja
RD & SH National College and SWA Science College, Mumbai, India
A. Rajan
University of Computer Science, University of Mumbai, Mumbai, India
DCT’s Dhempe College of Arts and Science, Goa, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 753
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8_69

media platforms to stay connected via their native languages. Also, many multinational IT companies like Google and Amazon have started to support their Indian consumers by providing apps in their native languages. A lot of NLP research is being done for different Indian languages, but Dogri has remained low-resourced. Thus, there arises a need for further NLP research in Dogri.
The objective of this paper is to provide an all-inclusive review of the various NLP work done in Dogri. Before 2011, work on Dogri was restricted for various reasons, such as the non-availability of a corpus and the language not being recognized as a state language. This paper therefore covers research work from 2011 to 2021.
After the abrogation of Article 370 in J&K, the J&K Official Languages Bill, 2020 was passed in September 2020 to make Dogri one of the official languages of J&K [2]. Dogri being a language spoken by around five million people across the borders, and now declared a state language of J&K, leads to the need for further NLP research in Dogri.

69.2 Dogri Language

In India, about 2.6 million people speak the Indo-Aryan language Dogri, which is the second most prominent language of J&K. It is spoken in the Jammu, Reasi, Udhampur, Kathua, and Poonch areas of J&K, and in some parts of Punjab (Gurdaspur, Pathankot, Nurpur, Hoshiarpur) and Himachal Pradesh (Kangra, Chamba, Kullu, Mandi, Suket) [3, 4]. It is also spoken in the Sialkot and Shakargarh tehsils of Pakistan [3].
On 2nd August 1969, Dogri was given the status of an independent literary language of India by the Sahitya Akademi, Delhi [5]. Subsequently, Dogri was added to the 8th Schedule of the Constitution of India as one of the 22 official languages on 22nd December 2003 [3]. In 2020, it was recognized as an official language of J&K [2]. This shows that Dogri is a progressive language.

69.2.1 Facts of Dogri Language

• Before being declared an independent language, Dogri was listed as a dialect of Punjabi in the national census [2].
• Dogra Akkhar, the official script of J&K, was used to write Dogri during the rule of Ranbir Singh. The Devanagari script later replaced Dogra Akkhar in India, while in Pakistan the Nastaliq form of the Perso-Arabic script was used [2].
• The area where Dogri is spoken is known as Duggar, and the people who speak Dogri are called Dogras [3].

Table 69.1 Words in LDC-IL Dogri text corpora

Sr. No.  Domains                  Words     % of total corpus
1        Aesthetics               594,609   74.16
2        Mass media               156,756   19.55
3        Social sciences          46,326    5.78
4        Science and technology   2730      0.34
5        Commerce                 1350      0.17

• The famous poet of Hindi and Persian, Amir Khusro, first mentioned Duger (Dogri) in the line "Sindhi-o-Lahori-o-Kashmiri-o-Duger" whilst describing the languages and dialects of India [3].

69.3 NLP Status of Dogri

69.3.1 Corpus Resources

The Linguistic Data Consortium for Indian Languages (LDC-IL), a scheme of the Department of Higher Education, Ministry of Human Resource Development, Government of India, implemented by the Central Institute of Indian Languages, Mysore, has developed a corpus for Dogri [2]. This corpus contains 801,771 words covering domains related to aesthetics, mass media, social sciences, science and technology, and commerce. Table 69.1 shows the distribution of words across the five domains in this corpus.
LDC-IL has also created a Dogri speech corpus of 17 h collected from 61 speakers of both genders and different age groups [6]. Table 69.2 shows the number of audio segments and their durations across different domains.
A methodology to construct a standard corpus using Portable Document Format (PDF) versions of an online newspaper, "Jammu Prabhat", has also been discussed. From 200 documents, a total of 23,398 sentences with 472,271 tokens and 24,893 unique tokens were extracted [7].

69.3.2 Digital Resources

The Ministry of Electronics and Information Technology (MEITY), Government of India, has developed tools to bring about software localization in Dogri. Tools like Open Office, Firefox, Thunderbird, Pidgin messenger, Sunbird calendar, and the Scribus page layout application have been developed for operating systems such as Windows and Linux [8].

Table 69.2 Dogri speech corpora of LDC-IL

Sr. No.  Domains                          No. of audio segments  Domain duration (in hh:mm:ss)
1        Phonetically balanced words      2050                   1:50:38
2        Frequently used word—full set    2000                   1:16:27
3        Frequently used words—part       1831                   1:18:06
4        Words of command and control     1830                   1:24:31
5        Sentences                        1527                   1:24:48
6        Names of persons                 1222                   1:23:41
7        Words of form and function       724                    0:29:25
8        Names of places                  609                    0:29:10
9        Date formats                     122                    0:14:07
10       Creative text                    61                     2:51:42
11       News and contemporary text       60                     4:27:51

Table 69.3 Tools developed by MEITY for Dogri

Operating system    Tools                              Purpose
Windows             Unicode typing tool                Typing utility
Windows and Linux   Indian LibreOffice for Dogri       Open office
Windows and Linux   Indian Firefox Dogri version       Web browser
Windows             Indian Thunderbird Dogri version   e-mail client
Windows and Linux   Indian Pidgin Dogri version        Messenger
Windows and Linux   Indian Tuxpaint Dogri version      Paint application
Windows             Indian Joomla Dogri version        Content management system
Linux               Linux Operating System in Dogri    Operating system

Table 69.3 enlists the different tools developed by MEITY for Dogri.

69.4 Literature Survey of NLP Tasks in Dogri

Research work on NLP for Dogri has been carried out on tasks such as corpus creation, machine translation, morphological analysis, named entity recognition (NER), part-of-speech (PoS) tagging, stemming, stop word generation/removal, summarization, and other research areas. Table 69.4 contains a collection of references to NLP in Dogri.
This list can be helpful in identifying the research areas in Dogri NLP that require further exploration. As seen from the list, topics such as sentiment analysis and transliteration have not yet been explored for the Dogri language. Figure 69.1 shows the NLP works in Dogri. A review of the work done in these research areas is summarized next.

69.4.1 Machine Translation

A rule-based Hindi to Dogri machine translation system using ASP.NET and an MS-Access database has been developed with an overall accuracy of 98.54% [9, 15]. A comparative study of the grammatical and inflectional analysis of the Hindi and Dogri languages with respect to machine translation has been carried out in order to develop rules for inflectional analysis [10]. A Hindi-Dogri machine translation system has been used to develop a parallel Dogri-Hindi corpus [11]. The challenges and problems faced during the development of a Dogri-Hindi statistical machine translation system
Table 69.4 References of NLP in Dogri

Sr. No.  NLP task                         References
1        Machine translation              [9–21]
2        Morphological analysis           [22]
3        NER                              [23]
4        PoS tagging                      [24–27]
5        Stemming                         [28]
6        Summarization                    [29]
7        Stop word generation/removal     [30, 31]
8        Spell checker                    [32]
9        Polysemy identification          [33]
10       Translation of code mixed text   [34]
11       Verb generation                  [35]
12       WordNet                          [36]

Fig. 69.1 NLP works in Dogri

and the steps to be considered to overcome these challenges are discussed [12]. A statistical machine translation (SMT) toolkit called Moses [13] has been used to translate English to Dogri and vice versa with accuracies of 80% and 87%, respectively [14]. The performance of Moses has been tested using parameters such as translation table size, stack size, language model, reordering model and word penalty for better speed and quality [16]. Other machine translation systems for Hindi to Dogri translation have also been described [17–21].

69.4.2 Morphological Analysis

Morphological analysis is the process of determining the morphemes from which a given word is constructed. An effort to develop a morphological analyzer using the open-source platform Apertium (LT-Toolbox) has been described [22].

69.4.3 NER

NER is an entity detection method that finds and classifies named entities in text into pre-defined categories. An algorithm for noun identification is applied after the processing of the test data, and an average accuracy of 72.51% is obtained over four test runs [23].

69.4.4 PoS Tagging

A model based on an unsupervised learning technique has been proposed for the development of a PoS-tagged corpus [24]. A Hidden Markov Model (HMM) is used for PoS tagging, achieving an accuracy of 83% (with the same training and test datasets) and 59% (when different datasets are used for training and testing) [25]. A rule-based PoS tagger has been evaluated over five datasets for six different parts of speech [26]. A hierarchical tagset for annotating a Dogri corpus has been developed, which can further be used for PoS tagging, chunking, morphological analysis and other NLP tasks [27].

69.4.5 Stemming

Stemming reduces a word to its word stem by removing suffixes and prefixes, or to the root form of the word known as the lemma. Unsupervised learning has been used for the development of a stemmer for Dogri, and an accuracy of 69% is achieved [28].

69.4.6 Summarization

The technique of computationally shortening a set of data to generate a summary that contains the most significant or relevant information within the original material is known as summarization. An extractive Dogri text summarization approach has been proposed that makes use of various statistical features, such as term frequency, sentence length, sentence position and Term Frequency-Inverse Sentence Frequency (TF-ISF), as well as linguistic features, such as the presence of proper nouns, numerical information, and English-Dogri words, for summary generation [29].
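For reference, TF-ISF is usually defined by analogy with TF-IDF at the sentence level; a common formulation (our paraphrase, not necessarily the exact weighting used in [29]) is

$\mathrm{TF\text{-}ISF}(t, s) = \mathrm{tf}(t, s)\times \log\frac{N_s}{n_s(t)}$

where $\mathrm{tf}(t, s)$ is the frequency of term $t$ in sentence $s$, $N_s$ is the total number of sentences in the document and $n_s(t)$ is the number of sentences containing $t$.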

69.4.7 Stop Word Generation/Removal

An algorithm based on the frequency of occurrence of words, after removing named entities from a corpus, is used to generate stop words [30]. The data preprocessing task of removing stop words, after identifying them based on the frequency of their occurrence, has also been discussed [31].
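As an illustration of this idea (not the actual code of [30]), a frequency-based stop-word list can be sketched as follows; the cut-off top_k and the toy tokens are arbitrary assumptions.

from collections import Counter

def candidate_stop_words(tokens, top_k=50):
    """tokens: corpus tokens with named entities already removed."""
    counts = Counter(tokens)
    return [word for word, _ in counts.most_common(top_k)]

# Toy token list standing in for a real Dogri corpus:
corpus_tokens = ["w1", "w2", "w1", "w3", "w1", "w2", "w4"]
print(candidate_stop_words(corpus_tokens, top_k=2))   # -> ['w1', 'w2']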

69.4.8 Other Works

A spell checker for Dogri has been implemented using the AVL and Trie data structures. A comparative study suggested that the Trie data structure performs better than the AVL data structure when implementing the Dogri dictionary for the spell checker [32]. Models for the translation of code-mixed Dogri text written in English and Dogri, and for polysemy identification in Dogri, have been proposed [33, 34]. A hybrid model for the generation of inflected forms of the Dogri language is applied, and an accuracy of 69.21% is obtained on five data sets [35]. A prototype model for developing a WordNet for Dogri, similar to the WordNet of the Hindi language, has been proposed [36].

69.5 Challenges of Dogri

Even after adopting Devanagari as its script more than half a century ago, Dogri writing has not evolved into a scientific language. The semantics and vocabulary, and hence the meaning, of Dogri change in the lowlands as compared to the mountainous regions. Due to the lack of substantial Dogri literature, Dogri as a medium of teaching, even at the primary education level, has not evolved in the post-independence period of India. Dogri has also been unable to develop scientifically due to the lack of a medium of instruction and teaching at the middle, high school, and undergraduate levels. A further challenge is the number of dialects spoken and written, which complicates teaching and research at various levels.

69.6 Conclusion and Future Work

As seen in all of the above works, the size of the corpus used is limited to between a few hundred and a few thousand words. As the size of the corpus increases, or the test dataset varies from the training dataset, the accuracy of these tools may drop. This is because of the lack of a proper annotated corpus for Dogri. Thus, there is a need for a fully annotated Dogri corpus of significant size to promote various NLP research initiatives. Also, some NLP tasks, such as sentiment analysis, transliteration, and segmentation, have gone unexplored for the Dogri language, for which further work remains to be done.
The future work is to create a large annotated corpus for the Dogri language in order to promote further Dogri NLP research initiatives. This can be achieved by involving a large number of native speakers of Dogri who can assist in generating and further validating the annotated corpus.
This paper presents a comprehensive survey of the NLP research done in the Dogri language, which can be a benchmark for other researchers working on this language, and it shows the scope for ample research in this area.

References

1. The Indian Telecom Services Performance Indicators, April–June 2021. https://www.trai.gov.in/sites/default/files/PIR_21102021_0.pdf. Accessed 15 Jan 2021
2. Tribune News Service: Hindi, Dogri Among 5 Official Languages in J&K (2020). https://www.
tribuneindia.com/news/j-k/hindi-dogri-among-5-official-languages-in-jk-135510. Accessed
10 Dec 2021
3. Dogri Language | Dogri History and Facts. https://www.ritiriwaz.com/dogri-language-dogri-
history-and-facts/. Accessed 20 Dec 2021
4. Ramamoorthy, L., Choudhary, N., Kumar, S.: A gold standard dogri raw text corpus. In:
Linguistic Resources for AI/NLP in Indian Languages. Central Institute of Indian Languages,
Mysore (2019)
5. Rao, S.: Five Decades: The National Academy of Letters, India: A Short History of Sahitya
Akademi, Sahitya Akademi (2004)
6. Choudhary, N., Choudhary, S., Rajesha, N., Manasa, G.: Dogri Raw Speech Corpus Linguistic
Resources for AI/NLP in Indian Languages. Central Institute of Indian Languages, Mysore
(2021)
7. Gandotra, S., Arora, B.: On creation of Dogri language corpus. J. Crit. Rev. 7(9), 2337–2343
(2020). https://doi.org/10.31838/jcr.07.09.380
8. Ministry of Electronics & Information Technology, Govt. of India—Technology Development
for Indian Languages Programme (TDIL). http://www.tdil.meity.gov.in. Accessed 15 Dec 2021
9. Dubey, P.: The Hindi to Dogri machine translation system. In: Proceedings of the 17th Interna-
tional Conference on Natural Language Processing (ICON): System Demonstrations, pp. 19–20
(2020)
10. Dubey, P.: The Hindi to Dogri machine translation system: grammatical perspective. Int. J. Inf.
Technol. 11(1), 171–182 (2019)
11. Moudgil, M., Dubey, P.: Testing and applying tools to develop Dogri-Hindi SMT system. Int.
J. Manag. Appl. Sci. (IJMAS) 4(9), 17–21 (2018)
12. Moudgil, M., Dubey, P., Kumar, A.: Challenges in building Dogri-Hindi statistical MT system.
Res. Cell Int. J. Eng. Sci. 24(1), 20–25 (2017)
13. Moses statistical machine translation system. https://www.statmt.org/moses/. Accessed 20 Jan
2021
14. Singh, A., Kour, A., Jamwal, S.: English-Dogri translation system using MOSES. Circ. Comput.
Sci. 1(1), 45–49 (2016). https://doi.org/10.22632/ccs-2016-251-25
15. Dubey, P.: Testing and results of Hindi-Dogri machine translation system. Indian J. Sci. Technol.
8(27) (2015)
16. Jamwal, S., Dutt, S.: Tuning of Moses decoder for Dogri SMT. Int. J. Comput. Sci. Commun.
6(1), 145–147 (2015)
17. Dubey, P.: Need for Hindi-Dogri machine translation system. In: International Conference on
Computing for Sustainable Global Development (INDIACom), pp. 136–140. IEEE (2014)
18. Dubey, P.: Machine translation system for Hindi-Dogri language pair. In: International
Conference on Machine Intelligence and Research Advancement, pp. 422–425. IEEE (2013)
19. Dubey, P.: Study and Development of Machine Translation System from Hindi Language to
Dogri Language an Important Tool to Bridge the Digital Divide (2013)
20. Dubey, P., Pathania, S.: Comparative study of Hindi and Dogri languages with regard to machine
translation. Lang. India 11(10) (2011)

21. Dubey, P.: Overcoming the digital divide through machine translation. Transl. J. 15(1) (2011)
22. Kumar, S.: Generating Dogri morphological analyzer using Apertium tool: an overview.
Criterion: Int. J. Engl. 8(VI), 212–221 (2017)
23. Jamwal, S.: Named entity recognition for Dogri using ML. Int. J. IT & Knowl. Manag. 10,
141–144 (2017)
24. Jamwal, S.: Modeling automatic POS tagger for the Dogri. Int. J. Comput. Sci. Commun. 12(2),
34–37 (2021)
25. Jamwal, S.: Development of POS tag set for the Dogri language using SMT. Development
13(1), 12–15 (2021)
26. Dutta, S., Arora, B.: Parts of speech (POS) tagging for Dogri language. In: Proceedings
of Second International Conference on Computing, Communications, and Cyber-Security,
pp. 529–540. Springer, Singapore (2021)
27. Kumar, S.: Developing POS tagset for Dogri. Lang. India 18(1) (2018)
28. Gupta, P., Jamwal, S.: Designing and development of stemmer of Dogri using unsupervised
learning. Soft Computing for Intelligent Systems, pp. 147–156, Springer, Singapore (2021)
29. Gandotra, S., Arora, B.: Feature selection and extraction for Dogri text summarization. In:
Rising Threats in Expert Applications and Solutions, pp. 549–556. Springer, Singapore (2021)
30. Gandotra, S., Arora, B.: Automated stop-word list generation for Dogri corpus. Int. J. Adv. Sci.
Technol. 28(19), 884–889 (2019)
31. Gandotra, S., Arora, B.: Pre-processing of Dogri text corpus. ICT for Competitive Strategies,
pp. 227–236. CRC Press (2020)
32. Jamwal, S.: AVL and TRIE loading time in Dogri spell checker. Int. J. Comput. Sci. Commun.
6(1), 140–144 (2015)
33. Jamwal, S.: Polysemy identification for Dogri language. Res. Cell: Int. J. Eng. Sci. 34, 28–31
(2021)
34. Jamwal, S.: Modeling translation of code mixed English-Dogri language. Int. J. IT & Knowl.
Manag. 14(2), 22–25 (2021)
35. Jamwal, S., Gupta, P., Sen, V.: Hybrid model for generation of verbs of Dogri language.
In: Data Driven Approach Towards Disruptive Technologies: Proceedings of MIDAS 2020,
pp. 497–508. Springer, Singapore (2021)
36. Kumar, R., Mansotra, V., Kumar, T.: A first attempt to develop a lexical resource for Dogri
language: Dogri WordNet using Hindi WordNet. In: International Conference on Computing
for Sustainable Global Development (INDIACom), pp. 575–578. IEEE (2014)
Chapter 70
Multi-objective and Seagull Optimization
Enabled Traffic Signal Controlling
for Traffic Management in Cities

Seelam Meghana Reddy, S. Sai Satyanarayana Reddy, and Seelam Sanjana

Abstract Traffic signal control is significant for solving real-world issues such as fuel wastage, time wastage, environmental pollution, and accidents due to traffic congestion, among several other factors. Hence, this research introduces a novel traffic management system based on a multi-objective function. Initially, the smart city map is taken from a real satellite image and is then segmented to gather more detailed information. Then, using network simulation in MATLAB, the information regarding the traffic is gathered, and the paths are identified for efficient routing to avoid traffic congestion. From the identified paths, the traffic signal control is employed optimally based on solution encoding, a multi-objective function and the seagull optimization algorithm (SOA). Finally, the performance of the proposed method is evaluated based on performance metrics such as travel time, distance and average traffic density.

Keywords Traffic signal control · Optimization · Routing · Vehicle · Path

70.1 Introduction

Traffic congestion is one of the emerging issues prevailing in metropolitan areas, with obstructive outcomes for both society and public traveling. These adverse effects grow over time because most people migrate toward the cities. Migration has various benefits, such as societal, economic and environmental benefits. Signals at the intersections are considered a frequent

S. M. Reddy (B)
Vardhaman College of Engineering, Hyderabad, India
e-mail: ssmegu900@gmail.com
S. Sai Satyanarayana Reddy
Sreyas Institute of Engineering and Technology, Hyderabad, India
e-mail: saisn90@gmail.com
S. Sanjana
Kennesaw State University, Kennesaw, GA, USA
e-mail: sseelam4@students.kennesaw.edu

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 763
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8_70

obstruction in metropolitan areas, and they play a wide role in the traffic management system [1]. Public transportation sectors supply precise and real-time details to the public and generate references for staff in order to deal with traffic violations and emergency situations. Detecting the vehicles at the crossings in all directions enhances the algorithms for signal control, their alterations and the efficiency of traffic on the basis of the observation results [2]. Traffic congestion is categorized into non-recurrent congestion and recurrent congestion. Recurrent congestion appears because of repeated routine travel patterns, whereas non-recurrent congestion is due to unpredicted events like breakdowns, accidents and so on [3, 4]. The most complex circumstances occur arbitrarily at various locations in urban areas at different instances of time but do not happen repeatedly [3, 4]. Handling the non-recurrent circumstances using optimization methods is one of the vulnerable situations in space and in time [3]. Discovering the features in the traffic and detecting the patterns are the major processes in intelligent transport systems (ITS), such as advanced control systems and traffic management, advanced guidance systems, traveler information and so on [5].
Managing traffic incidents is one of the important phenomena in transportation systems because of its effect on control and safety operations. In order to handle arbitrary situations, most traffic management centers (TMC) generate methodologies and feedback plans for minimizing the clearance time [3]. Controlling the traffic signals is one of the effective methods for protecting the participants in traffic at the crossings where various streams of traffic interact. Adaptive traffic signal control (ATSC) systems have been employed for responding to the fluctuating demand in traffic and are widely deployed, which has attracted many research sectors [6, 7]. In order to predict the traffic situations with the use of sensors, a method called the task allocation approach is employed, which gathers the traffic details upon efficient monitoring of traffic signals [2, 8]. Rather than modeling the traffic system, the traffic controlling method is also an important step in the metropolitan traffic network. For controlling the traffic signal, there are four types of control methods: coordinated-intersection approaches, fixed time approaches, isolated-intersection approaches and traffic responsive approaches [9]. Nowadays, a large number of methodologies concentrate on the detection of short-term traffic flow, which enhances the prediction accuracy by linking the spatio-temporal correlations in the detection methods [5].
The ultimate goal of the proposed method is an efficient traffic management technique based on a multi-objective strategy. Hence, a satellite image from a real database is utilized for the smart city map generation, segmentation is then devised, and the significant paths are identified for traffic signal control based on an ON/OFF criterion. The major contributions of the research article are:
• Proposed multi-objective function: The proposed multi-objective function utilized for traffic signal control combines traffic density, traffic flow direction, vehicle speed and collision factor, which are obtained through network simulation based on MATLAB.

• Proposed SOA-based TSC: The proposed traffic signal controlling based on the
optimization is obtained through solution encoding, multi-objective function and
seagull optimization algorithm.
The paper is organized as: Sect. 70.2 discusses the motivation for the research,
Sect. 70.3 discusses the proposed methodology and the results of the proposed method
are demonstrated in Sect. 70.4. Finally, Sect. 70.5 concludes the paper.

70.2 Motivation

The conventional literature on traffic signal controlling is detailed in this section, along with the advantages and limitations. The limitations of the conventional methods motivate the proposal of a novel technique for traffic signal controlling.

70.2.1 Literature Survey

Mao et al. [3] combined fast-running machine learning (ML) algorithms with reliable genetic algorithms (GA) and found that this method significantly reduced the traveling time and was applicable to real-time applications; yet the detection model required further improvement. Li et al. [6] introduced Knowledge Sharing Deep Deterministic Policy Gradient (KS-DDPG) and obtained better performance with low computation cost, yet with reduced communication efficiency. Zhang and Su [9] implemented a heterogeneous traffic network that performs well against cyber threats, but this method assumed that the flow of traffic was homogeneous. Zha et al. [2] employed an improved adaptive clone genetic algorithm (IACGA) and solved the issue of task allocation, but the computational efficiency required enhancement. Tang et al. [5] deployed a hybrid model, a Genetic Algorithm with Attention-based Long Short-Term Memory (GA-LSTM), combined with spatial-temporal correlation analysis. The prediction errors of this method were low, but occupancy and speed needed to be included to improve the prediction results.

70.2.2 Challenges

• One of the major challenges is that, at the time of designing the execution process, all the agents were required to communicate, which lowers the efficiency of communication [6].
• Most of the methods obtained better results under low traffic conditions, but in the case of higher traffic situations, as in urban areas, it is challenging because, even with the application of non-signalized networks, the prediction results fail [9].
• Even though the traffic prediction model performed with less computational time, it is challenging when employed in real-time cases, as it did not generate accurate results in less time [3].
• As traffic data are complex, accessing them and boosting the performance is complex, and resolving the task allocation issues is a tedious process [2].
• Integrating multi-source information improves the robustness and generalization of the model, but it is challenging to acquire multi-source data [5].

70.3 Proposed Method for Traffic Signal Control in the Urban Areas

Traffic signal management is essential in the current scenario for congestion and accident avoidance at the intersections. Hence, to avoid such scenarios, traffic signal controlling is proposed using the SOA-based traffic signal control technique. Initially, the input image is taken from a real database containing the full view of a city, acquired from a satellite image. To obtain the most significant information and ease the controlling process, segmentation of the areas from the city map is performed, followed by simulation of the segmented map/areas using MATLAB to obtain the information of the VANET environment, such as traffic density, traffic flow direction, vehicle speed and collision factor. Based on the gathered information, the significant path is identified for the individual segmented areas in order to execute the traffic signal control. Based on the information acquired from the significant path of the areas, the traffic control is executed using the multi-objective function and the seagull optimization algorithm (SOA). The workflow of the proposed SOA-based TSC is depicted in Fig. 70.1.

Fig. 70.1 Block diagram of the proposed SOA-based TSC



70.3.1 Smart City Map

A smart city is composed of smart technologies, smart governance, smart infrastructure management and smart transportation. In the proposed traffic signaling control method, the input image is obtained from the satellite and can be expressed as

$S = \{M_1, M_2, M_3, \ldots, M_s, \ldots, M_N\}$  (70.1)

where $S$ refers to the dataset utilized to evaluate the performance of the proposed method, $N$ refers to the total number of samples in the dataset and $M_s$ refers to the $s$th image in the dataset. Let us consider that the image $M_s$ is utilized for further processing.

70.3.2 Map Segmentation and Significant Route Discovery

From the acquired image $M_s$, segmentation is done, which provides the areas in the city; the segmentation is performed with identical sizes. Thus, the single input image is converted into several squares of identical size. The aim behind the image segmentation is to analyze and gather information more efficiently. The information gathered from the segmented image is more informative compared to the whole image. Thus, the segmented map consists of $[w \times w]$ identical maps. Once the segmented areas are determined, the significant routes are found, for which the MATLAB simulation is done.
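A small sketch of this tiling step is given below (in Python rather than MATLAB); the image size and the grid parameter w = 4 are our assumptions, not values from the paper.

import numpy as np

def split_into_tiles(image: np.ndarray, w: int):
    """Split an image into a w x w grid of identical square areas."""
    h_px, w_px = image.shape[:2]
    tile_h, tile_w = h_px // w, w_px // w
    return [image[r * tile_h:(r + 1) * tile_h, c * tile_w:(c + 1) * tile_w]
            for r in range(w) for c in range(w)]

city_map = np.zeros((1200, 1200, 3), dtype=np.uint8)   # stand-in for the map M_s
areas = split_into_tiles(city_map, w=4)                # 4 x 4 = 16 identical areas
print(len(areas), areas[0].shape)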

70.3.3 Network Simulation for Information Extraction

The segmented area is then simulated in MATLAB for the collection of information regarding the significant routes and vehicles. Thus, through the simulation, information such as available roads, traffic density, traffic flow direction, vehicle speed, and collision factor is gathered for the VANET-based control of traffic signaling. The detailed description is given below.
Traffic density ($T_d$): For efficient traffic signal controlling, the traffic density is considered a significant aspect. It measures the vehicles present on the road and can be formulated as

$T_d = \dfrac{V_t}{L}$  (70.2)

where the length of the road is notated as $L$, the total number of vehicles on the road is represented as $V_t$ and $T_d$ refers to the traffic density.
768 S. M. Reddy et al.

Traffic flow direction ($T_f$): The traffic flow is measured based on the vehicles passing at a time. For the effective maintenance of the road traffic at the intersection, in the proposed method the value is '−1' for vehicles moving toward the traffic signal and '+1' for vehicles moving away from the traffic signal. It is expressed as

$T_f = \begin{cases} -1 & \text{if moving towards the signal} \\ +1 & \text{if moving away from the signal} \end{cases}$  (70.3)

Average vehicle speed ($A_s$): For the selection of the significant route, the average speed of the vehicles on the path is estimated. The average speed of a vehicle is calculated as the distance traveled divided by the time taken. Here, considering all the vehicles in the lane, the speeds of all the vehicles in the lane are averaged, which is formulated as

$A_s = \dfrac{\text{Sum of the speeds of the vehicles}}{\text{Total number of vehicles}}$  (70.4)

Collision factor ($C_f$): The collision factor for the traffic signaling control depends on the drunk-and-drive condition, over-speeding and the vehicle's condition, such as brakes, accelerator and several other factors concerning the vehicle's condition. It is formulated as

$C_f = (1 - \alpha)C_{DD} + \alpha(1 - \alpha)C_{OS} + \alpha(1 - \alpha)(2 - \alpha)C_{VC}$  (70.5)

where the drunk-and-drive indicator is denoted as $C_{DD}$, the over-speed indicator as $C_{OS}$ and the vehicle condition as $C_{VC}$. The information gathered for the maintenance of the traffic through the traffic signal controlling is the normalized value. Here, the value of $C_{DD}$ is 1 for drunk-and-drive, else it is 0. Likewise, for $C_{OS}$ and $C_{VC}$, the corresponding value is 1 when the factor is true, else it is 0.
In the proposed SOA-based traffic signal controlling for efficient traffic management, the available paths need to be detected. From the network simulation, the available paths are identified, and based on the traffic density, the vehicles' paths are re-routed to avoid traffic congestion based on the traffic signal control mechanism detailed in the next section.
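For illustration, the four quantities of Eqs. (70.2)–(70.5) can be computed from simulated vehicle records as sketched below (in Python rather than MATLAB); the Vehicle fields and the value of alpha are assumptions made for the sketch.

from dataclasses import dataclass
from typing import List

ALPHA = 0.5   # weighting factor alpha of Eq. (70.5); value assumed for the sketch

@dataclass
class Vehicle:
    speed: float           # assumed units, e.g. km/h
    towards_signal: bool
    drunk_drive: bool
    over_speed: bool
    bad_condition: bool    # brake/accelerator faults etc.

def traffic_density(vehicles: List[Vehicle], road_length: float) -> float:
    return len(vehicles) / road_length                      # Eq. (70.2)

def flow_direction(v: Vehicle) -> int:
    return -1 if v.towards_signal else +1                   # Eq. (70.3)

def average_speed(vehicles: List[Vehicle]) -> float:
    return sum(v.speed for v in vehicles) / len(vehicles)   # Eq. (70.4)

def collision_factor(v: Vehicle, a: float = ALPHA) -> float:
    c_dd, c_os, c_vc = int(v.drunk_drive), int(v.over_speed), int(v.bad_condition)
    return (1 - a) * c_dd + a * (1 - a) * c_os + a * (1 - a) * (2 - a) * c_vc   # Eq. (70.5)

lane = [Vehicle(40.0, True, False, False, False), Vehicle(55.0, False, False, True, False)]
print(traffic_density(lane, road_length=1.0), average_speed(lane))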

70.3.4 Traffic Signal Controlling

Traffic light signal controlling for the effective management of the traffic in the smart city is proposed using the SOA-based traffic signal controlling. The steps of the proposed method are: solution encoding, the multi-objective function and the seagull optimization algorithm (SOA). The traffic signal ON/OFF is controlled for the traffic management in all the significant areas based on the multi-objective function and SOA. Thus, the traffic during the peak hours and the routing of emergency vehicles without waiting time are handled effectively using the proposed method, by signaling OFF the current path and re-directing to a new path with reduced traffic by switching the signal ON in the new path.

TSC1  TSC2  TSC3  …  TSC10

Fig. 70.2 Multi-objective function of the proposed SOA-based TSC

70.3.4.1 Solution Encoding

The solution encoding is utilized for the control of traffic signaling. Let us consider that there are 10 traffic signals in the VANET; the solution attained based on this is presented in Fig. 70.2. The dimension of the solution encoding is then assumed as [1 × 10]. Here, the traffic signal controller is represented as TSC. For each TSC, the optimal solution is provided using the SOA algorithm.

70.3.4.2 Multi-objective Function

The ON/OFF state of the traffic signal is optimally controlled by the SOA optimization algorithm, in which four objectives are considered. The traffic density, traffic flow direction, average vehicle speed and the collision factor are the four objectives employed for the evaluation of the multi-objective function, and it is formulated as

$\mathrm{TSC}(O_f) = [1 - T_{d,i}] + T_{f,i} + A_{s,i} + [1 - C_{f,i}]$  (70.6)

where $\mathrm{TSC}(O_f)$ refers to the multi-objective function of the proposed SOA-based traffic signal controller, $T_{d,i}$ refers to the $i$th solution of $T_d$, $T_{f,i}$ refers to the $i$th solution of $T_f$, $A_{s,i}$ refers to the $i$th solution of $A_s$ and $C_{f,i}$ refers to the $i$th solution of $C_f$.
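A sketch of how the score of Eq. (70.6) could be evaluated for one candidate [1 × 10] solution is given below; the measurements are hypothetical and assumed to be normalised to [0, 1], except the flow direction, which is ±1.

def tsc_objective(t_d, t_f, a_s, c_f):
    # Eq. (70.6): low density, outgoing flow, high average speed and a low
    # collision factor all increase the score.
    return (1.0 - t_d) + t_f + a_s + (1.0 - c_f)

# Hypothetical per-signal measurements for a candidate solution vector:
measurements = [
    (0.30, +1, 0.60, 0.10),   # (traffic density, flow direction, avg. speed, collision factor)
    (0.75, -1, 0.20, 0.40),
    # ... one tuple per traffic signal controller (10 in total)
]
fitness = sum(tsc_objective(*m) for m in measurements)
print("candidate fitness:", fitness)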

70.3.4.3 Seagull Optimization Algorithm

The seagull [10] is an omnivorous and intelligent bird, which migrates from one place to another in search of food. Seagulls live in colonies and are skilled in attacking prey. The attacking and migratory behaviors of the seagull are considered for solving optimization issues. Initially, in order to avoid collisions, the positions of the search agents are taken to be different. Then, they move toward the best neighbor and remain close to the seagull with the best fitness value. While attacking the prey they move spirally; the behavior in the x, y and z planes is considered, and it is expressed as

$x' = a \times \cos(h)$  (70.7)

$y' = a \times \sin(h)$  (70.8)

$z' = a \times h$  (70.9)

$a = v \times e^{hp}$  (70.10)

where $x'$, $y'$ and $z'$ refer to the behavior of the search agent in the respective planes. The spiral shape movement is defined using the constants $v$ and $p$. Here, $h$ is a random number in the range $[0, 2\pi]$, and the radius of the spiral movement is denoted as $a$. Then, the position update of the search agent is formulated as

$S_n(x) = (Q_s \times x' \times y' \times z') + S_{bn}(x)$  (70.11)

where $S_n(x)$ refers to the updated position of the search agent, which is considered the best. The position of the search agent close to the best search agent is referred to as $Q_s$, and $S_{bn}(x)$ is the fittest seagull. Thus, by considering the solution encoding, the multi-objective function and the SOA algorithm, the traffic signal controlling is devised using the proposed method. The pseudo-code for the SOA is presented in Algorithm 1.

Algorithm 1 Pseudo-code for the SOA

1  Begin: Initialize the parameters
2  Output: S_bn(x)
3  While (end criteria is not attained)
4      Estimate the best position S_bn(x)
5      Update the new position of the search agent using Eq. (70.11)
6      Evaluate the fitness
7      If fitness of S_n(x) < S_bn(x)
8          S_bn(x) = S_n(x)
9      End if
10 End while
11 Return the best solution
12 End
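A compact Python sketch of the attack-phase update of Eqs. (70.7)–(70.11) is given below; the spiral constants v_const and p_const and the toy vectors are assumptions, not values from the paper.

import math
import random

def soa_update(position, best_position, v_const=1.0, p_const=0.5):
    """One position update following Eqs. (70.7)-(70.11)."""
    h = random.uniform(0.0, 2.0 * math.pi)   # random angle h in [0, 2*pi]
    a = v_const * math.exp(h * p_const)      # Eq. (70.10): spiral radius
    x_ = a * math.cos(h)                     # Eq. (70.7)
    y_ = a * math.sin(h)                     # Eq. (70.8)
    z_ = a * h                               # Eq. (70.9)
    # Eq. (70.11): scale the agent's position relative to the best seagull by the
    # spiral terms and add the position of the fittest seagull.
    return [q_s * x_ * y_ * z_ + s_bn for q_s, s_bn in zip(position, best_position)]

best = [0.6, 0.2, 0.9]                               # fittest seagull S_bn(x)
current = [b - 0.1 * random.random() for b in best]  # agent's relation to the best, Q_s
print(soa_update(current, best))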

70.4 Result and Discussion

The evaluation of the proposed SOA-based TSC is implemented using the MATLAB tool and is carried out based on performance metrics such as travel time, distance and average traffic density.
Dataset: The dataset utilized for the evaluation of the proposed method is BigEarthNet-S2 [11]. It consists of 590,326 non-overlapping images. Each image patch in BigEarthNet has one directory under the archive root directory. GeoTIFF files for the spectral bands, multi-labels and metadata of each patch are in the corresponding patch directory.

70.4.1 Analysis of the Proposed Method

The analysis of the proposed method is devised by comparing it with the conventional PSO-based TSC, SFLA-based TSC and K-path-based TSC. The results obtained by varying the number of vehicles between 50 and 100 are depicted below.

70.4.1.1 Analysis Using 50 Vehicles

Figure 70.3 depicts the analysis of the proposed method using 50 vehicles. When considering 35 vehicles, the travel times evaluated by the K-path-based TSC, SFLA-based TSC, PSO-based TSC and the proposed SOA-based TSC are 26.43, 36.27, 13.73 and 13.73, respectively, as shown in Fig. 70.3a. Figure 70.3b depicts the distance traveled by the vehicles. The distance measured by the proposed SOA-based TSC is 44.67, which is 11.91, 9.82 and 11.70% better than the conventional K-path-based TSC, SFLA-based TSC and PSO-based TSC methods with 25 vehicles. The average traffic density is depicted in Fig. 70.3c. The average traffic density evaluated by the proposed SOA-based TSC is 0.03, which is 19.13, 11.60 and 5.84% better than the conventional K-path-based TSC, SFLA-based TSC and PSO-based TSC methods at 50 min.

Fig. 70.3 Analysis using 50 vehicles in terms of a travel time, b distance and c average traffic
density

70.4.1.2 Analysis Using 100 Vehicles

Figure 70.4 depicts the analysis of the proposed method using 100 vehicles. When considering 45 vehicles, the travel time measured by the proposed SOA-based TSC is 4.93, which is 16.12, 91.41 and 80.00% better than the conventional K-path-based TSC, SFLA-based TSC and PSO-based TSC methods. The distance measured by the proposed SOA-based TSC is 14.27, which is 38.05, 80.58 and 74.19% better than the conventional K-path-based TSC, SFLA-based TSC and PSO-based TSC methods with 45 vehicles. The average traffic density is depicted in Fig. 70.4c. The average traffic density evaluated by the proposed SOA-based TSC is 0.018, which is 59.42, 55.80 and 57.92% better than the conventional K-path-based TSC, SFLA-based TSC and PSO-based TSC methods at 50 min.

Fig. 70.4 Analysis using 100 vehicles in terms of a travel time, b distance and c average traffic
density

70.5 Conclusion

Traffic management is the key factor for transportation efficiency. Hence, this research introduces a novel technique for traffic signal control using the SOA-based TSC. In this method, the information required for the traffic signal is obtained by the network simulator from the input satellite image segmented into identical sizes. Finally, the traffic signal control based on ON/OFF is employed through the solution encoding, the multi-objective function and SOA. The proposed SOA-based TSC obtained better performance compared to the existing systems in terms of travel time, distance and average traffic density. In future, optimization-based deep learning will be developed for more accurate traffic signal controlling.

References

1. Wei, H., Zheng, G., Gayah, V., Li, Z.: Recent advances in reinforcement learning for traffic
signal control: a survey of models and evaluation. ACM SIGKDD Expl. Newsl. 22(2), 12–18
(2021)
2. Zha, Z., Li, C., Xiao, J., Zhang, Y., Qin, H., Liu, Y., Zhou, J., Wu, J.: An improved adaptive
clone genetic algorithm for task allocation optimization in ITWSNs. J. Sens. 2021 (2021)
3. Mao, T., Mihăită, A.-S., Chen, F., Vu, H.L.: Boosted genetic algorithm using machine learning
for traffic control optimization. IEEE Trans. Intell. Transp. Syst. (2021)
4. Anbaroglu, B., Heydecker, B., Cheng, T.: Spatio-temporal clustering for non-recurrent traffic
congestion detection on urban road networks. Transp. Res. Part C: Emerg. Technol. 48, 47–65
(2014)
5. Tang, J., Zeng, J., Wang, Y., Yuan, H., Liu, F., Huang, H.: Traffic flow prediction on urban road
network based on License Plate Recognition data: combining attention-LSTM with Genetic
Algorithm. Transp. A: Transp. Sci. 17(4), 1217–1243 (2021)
6. Li, Z., Yu, H., Zhang, G., Dong, S., Xu, C.-Z.: Network-wide traffic signal control optimization
using a multi-agent deep reinforcement learning. Transp. Res. Part C: Emerg. Technol. 125,
103059 (2021)
7. De Weg, V., Sterk, G., Vu, H.L., Hegyi, A., Hoogendoorn, S.P.: A hierarchical control frame-
work for coordination of intersection signal timings in all traffic regimes. IEEE Trans. Intell.
Transp. Syst. 20(5), 1815–1827 (2018)
8. Nguyen, T.D.T., Nguyen, V.D., Pham, V.-N., Huynh, L.N.T., Hossain, M.D., Huh, E.-N.:
Modeling data redundancy and cost-aware task allocation in MEC-enabled internet-of-vehicles
applications. IEEE Internet Things J. 8(3), 1687–1701 (2020)
9. Zhang, Y., Su, R.: An optimization model and traffic light control scheme for heterogeneous
traffic systems. Transp. Res. Part C: Emerg. Technol. 124, 102911 (2021)
10. Dhiman, G., Singh, K.K., Soni, M., Nagar, A., Dehghani, M., Slowik, A., Kaur, A., Sharma, A.,
Houssein, E.H., Cengiz, K.: MOSOA: a new multi-objective seagull optimization algorithm.
Expert Syst. Appl. 167 (2021)
11. BigEarthNet database. https://bigearth.net/#downloads. Accessed 2022
Chapter 71
Design of Smart Spectacle in 5G-IoT
Environment to Detect and Prevent
Corona Virus Variants

S. Thamizharasan, Paruchuri Chandra Babu Naidu, M. Vasuja Devi, Lourdes Emperatriz Paredes Castelo, A. K. P. Kovendan, and J. N. Swaminathan

Abstract Corona virus, a novel virus, was first reported in Wuhan, China in December 2019. Since then, it has spread to the entire world. The virus can spread easily from an infected person to a normal person. The infected person's body develops a high fever, which is the main symptom of this novel Corona virus. To prevent the spread, infrared thermometers for thermal screening have been utilized. To find the infectee among a crowd, each person's body temperature has to be checked manually. This process is time-consuming and carries a threat of infection for the person who does the screening. Hence, to overcome these problems, we propose a novel way of screening using a smart spectacle with minimal human interaction. With the help of IoT technology and thermal camera technology, the screening process becomes safer and faster compared to conventional screening. Additionally, a facial recognition method is exploited to get pedestrian information. The proposed smart spectacle system has great potential to identify the Corona virus and reduce its spread.

Keywords COVID-19 · Smart spectacle · IoT technology · Corona virus

S. Thamizharasan (B)
School of Computer Science (SCOPE), Department of IOT, VIT, Vellore, Tamilnadu, India
e-mail: thamizharasan.s@vit.ac.in
P. C. B. Naidu
V R Siddhartha Engineering College, Kanuru, Vijayawada, Andhra Pradesh, India
M. Vasuja Devi
Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India
L. E. P. Castelo
Facultad de Ciencias, Escuela Superior Politécnica de Chimborazo (ESPOCH), Riobamba,
Ecuador
A. K. P. Kovendan
St. Joseph’s College of Engineering, Chennai, India
J. N. Swaminathan
QIS College of Engineering & Technology, Ongole, Andhra Pradesh 523272, India


71.1 Introduction

A novel Corona virus can cause illness in people [1, 2] and animals [3]. The normal functioning of the human body is disrupted by the activity of such a virus, which breaks into the cells of the host and exploits them to replicate. The Corona virus resembles a royal crown because of its spiked shell, and hence its name is derived from the Latin term 'Corona', which means crown. In January 2020, this new virus, called nCoV-2019, was identified and officially announced by the World Health Organization (WHO) [4]. The virus is recognized as a member of the family of Corona viruses that includes the common cold [4, 5] and SARS. Wuhan, China reported the first case, and the virus had infected 7711 individuals before it was declared a worldwide pandemic. It causes an illness officially designated as COVID-19, which has spread to almost the entire world and caused nearly 1.51 million deaths. A person infected by COVID-19 shows symptoms such as fever, tiredness, dry cough, runny nose with sore throat, and, in some cases, joint pains and aches [6–8]. Notably, a few people infected with the virus do not show any symptoms and feel comfortable. Also, without any special treatment, around 80% of infected people recover [9]. However, the probability of serious illness and of developing breathing difficulty is higher for older people and for people with pre-existing serious illness. At the time of writing, no proven vaccine or specific drug for the treatment of COVID-19 was available; candidate vaccines and specific medications were under scrutiny and being subjected to exhaustive clinical trials, and WHO was coordinating phenomenal efforts to produce successful vaccines and medication to prevent and cure COVID-19. In the current scenario, the spread has started to saturate in many countries, which leads to the lifting of lockdowns in consideration of economic growth. Hence, the administrators of various countries are very much concerned about the spreading rate during the unlock process.
To control the spread, it is crucial to recognize infected persons by screening body temperature with an infrared thermometer. In the current scenario of manual screening, the probability of covering all people is low. Additionally, there is a likelihood that an infected person contaminates the individuals around while the health official screens each person in the queue. An innovative technology is required to remove this flaw. Smart cities exploit IoT as a key technology for infrastructure development. Sensors attached to physical objects, together with the associated software to transmit and receive information, are interconnected by IoT with minimal human intervention [10]. IoT-based medical care is a current paradigm for conveying clinical and administration-related information to remote areas. The IoT framework in health care is now an advanced setup that contains numerous kinds of instruments such as clinical data systems, sensors, clinical hardware, distributed computing, telemedicine, and more. The processes realized using the IoT technique include remote patient observation, remote monitoring and tracking of a patient's health, interactive RFID activity monitoring, etc. [10–13]. Productive treatment and contagion-free diagnosis with safety are the key properties
of IoT healthcare systems. The principal real-time applications of IoT in the medical field [14–16] include therapeutic information management, telemedicine, mobile clinical care, and health management. The main intention of this study is to design a system that can recognize possible Corona virus infection through thermal images using the proposed smart spectacle, with minimal human intervention. In order to obtain real-time data, thermal camera technology is combined with IoT technology in the proposed design, so that the screening process becomes safer and faster than conventional manual screening. Additionally, a facial recognition method is exploited to obtain pedestrian information.

71.2 Methodology

In this section, the working principle of all the subsystems and their interconnection protocol are explained. Additionally, the concepts of optical and thermal data processing, the decision-making process, and the various system elements are discussed. Initially, the task of collecting thermal and optical images is assigned to the proposed smart spectacle. Interfacing of the proposed system is done using the IoT technique and GSM. Primarily, the smart spectacle scans the suspected zone with a thermal camera and sends a notification if the recognized temperature is greater than 98.6 °F. Along with the temperature, the person's exact location coordinates determined by the GPS module are sent to a smart mobile through GSM. Hence, the infected person's temperature and face are received by the officials, as described in Fig. 71.1. The process explained here results from the interconnection of three sections. The mobile phone application, thermal camera, and optical camera are the input sources, which comprise the first section. The second section is the processing stage: the Arduino integrated development environment (IDE) software is integrated with the microcontroller processor to execute the source code.

Fig. 71.1 Work flow of the smart spectacle system
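The chapter's firmware itself runs on the NodeMCU via the Arduino IDE and is not reproduced here; purely as an illustrative sketch of the decision flow just described (threshold check, then forwarding temperature and GPS coordinates to the official's phone), the following Python fragment uses hypothetical function and field names:

from typing import Optional

FEVER_THRESHOLD_F = 98.6  # notification threshold stated in the text

def screen_person(temp_f: float, lat: float, lon: float) -> Optional[dict]:
    # Return an alert payload only when the thermal reading exceeds the threshold.
    if temp_f <= FEVER_THRESHOLD_F:
        return None  # normal temperature: no notification is sent
    # In the real system this payload would be pushed over GSM/IoT to the
    # official's smartphone together with the face from the optical camera.
    return {"temperature_f": temp_f, "location": (lat, lon), "alert": True}

# Example: a 101.2 F reading at an arbitrary coordinate triggers an alert
print(screen_person(101.2, 13.0827, 80.2707))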
The Arduino IDE software compiles the source code and uploads the necessary commands to the NodeMCU v2 processor. The output of the proposed system forms the third section. Facial and temperature information are collected from the two different types of camera equipped on the smart spectacle. The thermal camera, or thermal imager, produces an image by exploiting IR radiation; the module segments the image according to the recorded temperature into color-shaded pictures. A high human body temperature is identified through the thermal camera by comparing it with the temperatures of the other people in the scanned zone. In this system, the Arduino IDE, a Java-based cross-platform application, is employed; it offers features such as auto indentation, brace matching, and syntax highlighting, and its one-click mechanism facilitates easy uploading of compiled programs. A Haar feature-based cascade classification algorithm [14] is employed for face detection. A machine learning algorithm has been used for training on negative and positive images. The cascade object detector available in the OpenCV library is utilized for recognizing the faces in the captured image.
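The chapter names Haar-cascade classification and OpenCV's cascade object detector but does not list code; the following minimal Python/OpenCV sketch of that face-detection step is illustrative only, and the image path is a placeholder:

import cv2

# Pre-trained frontal-face Haar cascade shipped with OpenCV; the cascade
# trained by the authors on their own positive/negative images is not published
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

frame = cv2.imread("captured_frame.jpg")        # frame from the optical camera
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# detectMultiScale returns bounding boxes (x, y, w, h) of detected faces
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("faces_marked.jpg", frame)
print(f"{len(faces)} face(s) detected")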

71.3 Result and Analysis

To confirm the feasibility and robustness of the control strategy, Proteus software simulation is utilized at the initial stage. Also, to ensure that all assertions are exercised, the testing stage was carried out with checks at regular intervals. This process helps in validating the system experimentally. The behavior of the electronic devices and circuits is emulated by the simulation software.

In Fig. 71.2, it can be noticed that the smart spectacle is configured with thermal and optical cameras. The thermal imaging camera for high-temperature detection is placed on the right side of the spectacle, and the optical camera is placed at the center to capture the human face, as shown in Fig. 71.3. The Arduino board and IoT module are interfaced to the spectacle through connecting wires and placed at the back of the neck. The proposed system may be employed to visualize the thermal image with excellent resolution to find infected patients. The smart spectacle triggers an alarm when a high-temperature deviation is observed in the crowd. This permits people with infections to be recognized rapidly and separated from the crowd for more accurate testing. Face recognition has been analyzed comprehensively by researchers for several decades (Fig. 71.4).
Fig. 71.2 Interfacing smart spectacles with Arduino and IoT module

Fig. 71.3 Working principle of smart spectacle

Fig. 71.4 Arduino configuration

Portable smart gadgets have become practical thanks to recent rapid advances in processing performance and memory, and this advancement also enables live video processing on such devices. In addition, in the proposed project, Google Location History (GLH) is utilized to trace the history of places visited by an individual, as depicted in Fig. 71.6. Whenever the user travels with the smart mobile and the smart spectacle, all the GLH locations are saved in the developed application, which helps to identify the locations visited by an infected person. These location histories will help government officials locate the affected areas and sanitize the infected patients' surroundings. As shown in Fig. 71.3, the collected body temperature and the infected patient's location are communicated to cloud storage; this is done by the Arduino through serial communication using the external Blynk server. The system alerts the official if the thermal camera reads a higher temperature, as depicted in Fig. 71.5b. The cases identified using the developed technology are shown in Fig. 71.7. From the graph, it can be observed that the developed methodology helps in tracking COVID-19 cases effectively. Further, to evaluate the performance of the developed smart spectacle, it has been subjected to a comparative analysis with competing methods. From the analysis, it is found that the developed smart spectacle has higher accuracy in detecting COVID-19 patients and can be very useful during an epidemic (Table 71.1).

Fig. 71.5 Original image and thermal image

Fig. 71.6 Google map-assisted location tracking interface

71.4 Conclusion

A creative early COVID-19 recognition system using a smart spectacle has been proposed in this work. The smart spectacle, which incorporates thermal and optical cameras, facilitates an easy and safe way of screening patients. The thermal camera in the smart spectacle identifies persons with a high temperature in a group and communicates this to the officials easily via smartphone. As we are aware, the spread of COVID-19 has gained much consideration and awareness among individuals. Hence, the appropriate approach to forestall the spread is to identify the symptoms early. An uninterrupted screening arrangement that automatically presents the thermal image of individuals is required; this makes the screening process less tedious and reduces human interaction. Hence, the proposed smart spectacle, which incorporates remote sensing methods, has great potential to satisfy this demand from the healthcare framework.

Fig. 71.7 COVID-19 cases identified

Table 71.1 Comparative analysis

Measured data \ technique   Developed smart spectacle   [14]
Computation time (s)        25                          66
MSE                         1.843 × 10⁻³                1.262 × 10⁻³
Accuracy (%)                86                          83

References

1. Xie, C., Jiang, L., Huang, G., Pu, H., Gong, B., Lin, H., Ma, S., Chen, X., Long, B., Si, G.,
Yu, H.: Comparison of different samples for 2019 novel coronavirus detection by nucleic acid
amplification tests. Int. J. Infect. Dis. (2020)
2. Shen, M., Zhou, Y., Ye, J., AL-maskri, A.A., Kang, Y., Zeng, S., Cai, S.: Recent advances and
perspectives of nucleic acid detection for coronavirus. J. Pharm. Anal. 10 (2020)
3. Singh, S., Singh, R., Singh, K.P., Singh, V., Malik, Y.P., Kamdi, B., Singh, R., Kashyap,
G.: Immunohistochemical and molecular detection of natural cases of bovine rotavirus and
coronavirus infection causing enteritis in dairy calves. Microb. Pathog. 138(2019), 1038–14
(2020)

4. World Health Organization.: Laboratory testing for 2019 novel coronavirus (2019-nCoV) in
suspected human cases. 2019, 1–7 (2020)
5. Sitharthan, R., Rajesh, M.: Application of machine learning (ML) and internet of things (IoT)
in healthcare to predict and tackle pandemic situation. Distrib. Parall. Databases 1–19 (2021)
6. Sitharthan, R., Rajesh, M., Madurakavi, K., Raglend, J., Kumar, R.: Assessing nitrogen dioxide
(NO2) impact on health pre-and post-COVID-19 pandemic using IoT in India. Int. J. Pervasive
Comput. Commun. (2020)
7. Ramanujam, P., Venkatesan, P.G., Arumugam, C., Ponnusamy, M.: Design of a compact
printed lowpass filtering antenna with wideband harmonic suppression for automotive
communication. Int. J. RF Microwave Comput.-Aided Eng. 30(12), e22452 (2020)
8. Ramanujam, P., Ponnusamy, M., Ramanujam, K.: A compact wide-bandwidth antipodal vivaldi
antenna array with suppressed mutual coupling for 5G mm-wave applications. AEU-Int. J.
Electron. Commun. 133, 153668 (2021)
9. Gomathy, V., Janarthanan, K., Al-Turjman, F., Sitharthan, R., Rajesh, M., Vengatesan, K.,
Reshma, T.P.: Investigating the spread of coronavirus disease via edge-AI and air pollution
correlation. ACM Trans. Internet Technol. 21(4), 1–10 (2021)
10. Ndiaye, M., Oyewobi, S.S., Abu-Mahfouz, A.M., Hancke, G.P., Kurien, A.M., Djouani, K.: IoT
in the wake of COVID-19: a survey on contributions, challenges and evolution. IEEE Access
8, 186821–186839 (2020)
11. Fong, S.L., Wui Yung Chin, D., Abbas, R.A., Jamal, A., Ahmed, F.Y.H.: Smart city bus application with QR code: a review. In: 2019 IEEE International Conference on Automation, Control and Intelligent Systems (I2CACIS 2019)—Proceedings, pp. 34–39 (2019)
12. Zamani, N.S., Mohammed, M.N., Al-Zubaidi, S.: Design and development of portable digital microscope platform using IoT technology. In: IEEE International Colloquium on Signal Processing and its Applications (CSPA 2020) (2020)
13. Hu, F., Xie, D., Shen, S.: On the application of the internet of things in the field of medical and health care. In: Proceedings of the 2013 IEEE International Conference on Green Computing and Communications, IEEE Internet of Things, and IEEE Cyber, Physical and Social Computing (GreenCom-iThings-CPSCom 2013), pp. 2053–2058 (2013)
14. Mohammed, M.N., Hazairin, N.A., Syamsudin, H., Al-Zubaidi, S., Sairah, A.K., Mustapha, S., Yusuf, E.: 2019 novel coronavirus disease (COVID-19): detection and diagnosis system using IoT-based smart glasses. Int. J. Adv. Sci. Technol. 954–960 (2020)
15. Sardianos, C., Varlamis, I., Bouras, G.: Extracting user habits from google maps history logs. In: Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2018), pp. 690–697 (2018)
16. Ruktanonchai, N.W., Ruktanonchai, C.W., Floyd, J.R., Tatem, A.J.: Using google location
history data to quantify fine-scale human mobility. Int. J. Health Geogr. 17(1), 1–13 (2018)
Chapter 72
Telecommunication’s Customer
Experience Prediction Using Hybrid
Machine Learning Model

Rupon Kumar Ghosh and Amitabha Chakrabarty

Abstract Communication technology has developed considerably over the years. Telecom operators provide network services so that users can communicate with each other using mobile phones. The customer life cycle in telecom depends on the overall network quality provided by the operator, so telecom operators always try to provide better service. A company that can provide cost-efficient service with better network performance can attract more users than its competitors. Decision tree, logistic regression, support vector machine, Naive Bayes, and K-nearest neighbor classifiers have been used both individually and as an ensemble learning model in this work to predict users' experience. The ensemble learning model has been found to be more efficient than the individual approaches. An accuracy of 98.13% has been obtained in predicting the actual network experience of customers using the ensemble learning model. Therefore, the ensemble learning model is proposed for predicting the user's network usage experience.

Keywords Customer experience · Life cycle of telecom user · Machine learning · Network performance data analysis · Predictive analytics

72.1 Introduction

Rivalry in the mobile telecommunication market is very intense, and in each market there are a large number of competitors. The industry is extremely dynamic, with new services, technologies, and carriers constantly altering the landscape. The degree of this competition is reflected in the storm of advertisements for mobile network services in daily newspapers and other mass media. Telecom operators announce new offers and tariffs frequently to attract new customers and to keep existing ones from moving to the competition.
R. K. Ghosh (B) · A. Chakrabarty


Department of Computer Science and Engineering, Brac University, Dhaka, Bangladesh
e-mail: rupomcse@gmail.com
A. Chakrabarty
e-mail: amitabha@bracu.ac.bd


The number of mobile network users in Asia in 2010 was 2570 million. This figure has grown substantially over the following ten years, reaching 4764 million, roughly double the 2010 value. Different markets have also grown over time; for example, the GDP growth rate of Bangladesh was 6.4% in 2010 and reached 9.5% in 2020. Although there is significant room for growth in most markets, the industry growth rate is declining and competition is rising. Along with controlling the market, it has become very important for telecom operators to control the loss of customers. A technology named MNP has been introduced recently: MNP stands for mobile number portability, which allows customers to change telecom operators without changing their number. Previously, some customers did not attempt to change their network so as to keep their contact number; with the new technology, they too are willing to change their network. Luna et al. [1] showed that in 1998, domestic monthly churn rates were 2–3% of the customer base. At an average cost of $400 required to acquire a subscriber, churn cost the industry nearly $6.3 billion in 1998; the total annual loss rose to nearly $9.6 billion when lost monthly revenue from subscriber cancelations is considered. It costs roughly five times as much to sign on a new subscriber as to retain an existing one. Consequently, for a carrier with 1.5 million subscribers, reducing the monthly churn rate from 2 to 1% would yield an increase in annual earnings of at least $54 million and an increase in shareholder value of approximately $150 million. That survey was conducted about 25 years ago, and the amounts at stake have only increased since then. One way to hold existing customers and gain new ones is to provide strong network coverage; strong coverage means better performance for calls as well as for Internet browsing. Several studies in recent years have addressed identifying churn for telecom operators, but most of them were not meant to resolve the underlying problem. We wanted to introduce a system that can predict users' network experience concurrently. After analyzing different approaches, we decided to use network performance data in this work. Network performance data is gathered by the telecom operator automatically whenever a user tries to use the network for calls or the Internet. This makes it possible to predict the network experience of a large user base automatically and in a very short time, and may therefore help resolve network-related problems before they impact a large number of users.
The main objectives of this work are the following:

(i) To develop and use a machine learning model to efficiently predict a customer's network usage experience from network performance data.
(ii) To compare the results of our proposed model with those of other models in order to validate its performance efficiency.

72.2 Literature Review

The Twitter feed has been analyzed using neural networks and deep convolution by the authors of [2]; a deep CNN was implemented to train on and forecast the sentiment of users. Text preprocessing has a great impact in the field of Twitter data analysis, and multiple methods to preprocess texts have been analyzed in [3]. Sentiment analysis involves multiple categories of sentiment, so multi-class sentiment is analyzed in [4]. SENTA, a tool that forecasts multi-class sentiment, was developed by the authors of [5]. The context, the evolution of the opinion-mining process, and events related to Twitter conversations have been discussed in [6]. The widespread usage of Twitter has kindled more research toward understanding sentiment from Twitter data. A hybrid framework based on a genetic algorithm for performing sentiment analysis has been discussed in [7]; enhancing scalability was the main focus of that work. To rank candidate topics effectively and to predict foreground topics, the authors of [8] proposed a mathematical model; the work was oriented toward discussion topics and is an interesting approach to classifying sentiment. Topic-adaptive sentiment classification over tweets has been proposed by the authors of [9]. A novel multi-class sentiment analysis model has been developed, and the various challenges encountered while performing multi-class sentiment analysis have been discussed, in [10]. An efficient process to convert text to speech based on sentences sourced from Twitter data has been proposed by the authors of [11]. A process to model Twitter data hierarchically for online analytical processing (OLAP) has been introduced by the authors of [12]. They proposed very large OLAP databases such as Teradata DB [13], Greenplum [14], or Vertica [15] to increase the efficiency of analysis-related queries. To enhance efficiency, Vertica [15] utilizes projections; instead of building conventional indexes on columns, it retains details about min/max ranges, which leads to less effective pruning and hence higher latency. Machine learning and data mining have been used to automate customer relationship management [16]. Various methods have been applied for churn detection [17], such as analyzing calling habits according to demography [18]. Data and information sourced from social networks are being used in recent churn prediction works with positive impact [19, 20]. In [21], NN, DT, LR, and boosting were used, and in [22] SVM was proposed. Service center call records have been analyzed to identify service requests more efficiently [23]. Call usage over mobile networks and the Internet has become useful for relating different aspects of human life, such as crime [24] or urban dynamics [25].
This work is concerned with predicting customer experience from customer network data using a machine learning model. No related work that uses customers' telecom network data to predict their network experience was found during the review.

72.3 Methodology

72.3.1 Data Collection

Network performance data is not a public resource; online and publicly available resources do not contain the data required for this work. A leading telecom company in Bangladesh provided us with data to use in this work. The company has millions of subscribers, so to identify our experiment base we analyzed customer care calls and user complaints. More than 5000 complaints were arriving for different services; such complaints over a period of 7 days were selected for analysis first. We filtered out the complaints irrelevant to poor customer experience from the list, after which 1200 users remained for this work. When a user tries to make a call, the attempt from the handset first goes to a base transceiver station (BTS). The BTS connects mobile devices to the network and then communicates with the mobile switching center (MSC), which acts as the control center of the network switching subsystem (NSS). The MSC connects calls between subscribers by switching the digital voice packets between network paths. The customer usage and network performance data are stored in the core network (signaling).
We collected our required data from the core nodes. The data categories were Subscriber Information, Call Information, Coverage Information, Internet Information, Video Information, and SMS Experience, and we collected data in each category for each user on each of the 7 days. Each individual record was saved as a CSV file, so the total number of files became 6 × 7 × 1200 = 50,400. Files generated from each category were stored according to their type. The average of each KPI was calculated for each user; after averaging, a single row of data remains per user for each type of KPI. Client-side round trip time (RT), download throughput (DT), total connection failure (TCF), call setup delay (CSD), failure SMS (FS), call failure (CF), call drop (CD), and signal strength average (SS) are the features required to predict customer experience. So, the KPIs necessary for deriving those features were selected from the six files and gathered in a common CSV file. The percentage of failed call attempts for each user was calculated by evaluating all call attempts; call failure is computed using Eq. (72.1).
Call Failure = 100 × (Number of Call Status Failed / Number of Total Call Attempts)    (72.1)

Similar to call failure, failure SMS has been calculated from the SMS Experience data using Eq. (72.2).

Failure SMS = 100 × (Number of SMS Status Failed / Number of Total SMS Attempts)    (72.2)

To derive the call drop percentage, dropped calls are first filtered using the Fail Cause ID of the Call Information data. Then, the call drop percentage is calculated using Eq. (72.3).

Call Drop = 100 × (Number of Total Dropped Calls / Number of Total Successful Calls)    (72.3)
Now, each data row needs to be labeled as good, moderate, or bad. To do so, the sum of each customer's 7-day weighted average data is calculated.

Fig. 72.1 Data labeling

SW = (RT + DT + TCF + CSD + FS + CF + CD + SS) / 7    (72.4)
Using Eq. (72.4), the sum of the weighted average data is calculated, where SW denotes this sum. If the calculated sum is between −283 and 760, the row is labeled as bad network quality experience; if it is between 760 and 1149, it is labeled as moderate (average) network quality experience; and if it is between 1149 and 18,262, it is labeled as good network quality experience. Figure 72.1 shows the process of data labeling.
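The KPI ratios of Eqs. (72.1)–(72.3) and the threshold labeling of Eq. (72.4) can be sketched in a few lines of Python. The snippet below is only an illustration: the column names are hypothetical, and df is assumed to be a pandas DataFrame holding one row per user with the eight averaged KPIs.

import pandas as pd

KPI_COLS = ["RT", "DT", "TCF", "CSD", "FS", "CF", "CD", "SS"]  # hypothetical names

def failure_pct(n_failed: int, n_total: int) -> float:
    # Ratio pattern shared by Eqs. (72.1)-(72.3), expressed as a percentage
    return 100 * n_failed / n_total

def label_user(row: pd.Series) -> str:
    # Eq. (72.4), applied literally: sum of the eight averaged KPIs divided by 7
    sw = row[KPI_COLS].sum() / 7
    if sw <= 760:        # approx. -283 .. 760   -> bad experience
        return "bad"
    if sw <= 1149:       # 760 .. 1149           -> moderate experience
        return "moderate"
    return "good"        # 1149 .. 18,262        -> good experience

# Usage (df assumed prepared): df["label"] = df.apply(label_user, axis=1)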

72.3.2 Feature Selection

We have collected network performance data with thousands of features from the mobile operator. By analyzing them, we found that not all of them are relevant to network performance or can impact user experience, so we have selected only eight features among them. We did not use any external tool to reduce the dimensionality of the data, as the number of selected features was only eight, and we have used the features directly from our prepared dataset. The selected features were:
i. Client-Side Round Trip Time
ii. Download Throughput
iii. Total Connection Failure
iv. Call Setup Delay
v. Failure SMS
vi. Call Failure
vii. Call Drop
viii. Signal Strength Average.
Here, client-side round trip time refers to the server's response time to a client request. Download throughput refers to the maximum Internet download speed experienced by the user. Total connection failure refers to the number of failures to connect to the network within a unit of time while the user was trying to call, send SMS, or browse the Internet. Call setup delay refers to the delay experienced while the user was trying to make a call using the network. Failure SMS refers to the number of SMSes that the user tried to send but that were not transmitted due to a network problem in a unit of time. Call failure refers to the number of calls that the user tried to make but that were not established due to a network problem in a unit of time. Call drop refers to the number of dropped calls while calling in a unit of time. Signal strength average refers to the quality of network coverage while the user was making calls, sending SMSes, or using the Internet.

72.4 Model Selection

The authors of [17] used a decision tree classifier and a backpropagation neural network to predict churn for a telecom operator and found both efficient. The authors of [21] used logistic regression, decision tree, and neural network models to identify churn for a telecom operator.

Algorithm 1 Formation of Hybrid Classifier

1: Start
2: Set D = Dataset
3: Split D into training and test sets as X_Train, X_Test = Train_Test_Split(X, TestSize = 0.4, random_state = 0, stratify = X)
4: Initialize Estimator to hold sub-models
5: Import Decision Tree as DT, Logistic Regression as LR, Support Vector Machine as SVM, Naive Bayes as NB, and K-Nearest Neighbor as KNN from sklearn
6: For i = 0 : 4
7: Repeat
8:   Model_DTi = DT()
9:   Model_DTi → Estimator
10:  Model_LRi = LR()
11:  Model_LRi → Estimator
12:  Model_SVMi = SVM()
13:  Model_SVMi → Estimator
14:  Model_NBi = NB()
15:  Model_NBi → Estimator
16:  Model_KNNi = KNN()
17:  Model_KNNi → Estimator
18: End Loop
19: Import VotingClassifier from sklearn.ensemble
20: Ensemble = VotingClassifier(Estimator)
21: Train the model using the training data: ensemble_fit = Ensemble.fit(X_Train)
22: Prediction = Ensemble.predict(X_Test)
23: End

When multiple machine learning models work together as a single unit to perform regression or classification, the technique is called ensemble learning. It has been shown many times that ensemble learning performs better in different machine learning applications. Ensemble learning has two popular categories, bagging and boosting: boosting works on the whole dataset, whereas bagging works on only parts of the dataset. We have developed an ensemble machine learning model in this work, and it is heterogeneous in type. Instead of using one type of machine learning model, we have used five variants: LR, DT, SVM, KNN, and NB. Regular ensemble models use a homogeneous collection of models, but in this work the collection is heterogeneous; for that reason, we use the term hybrid to describe our model. We have used each of the five models five times and combined their prediction results, so internally 25 machine learning models work together. The input passes through the 25 models, which produce 25 predictions. We have used a max-voting classifier at the end to combine the results and make the final prediction. The voting classifier is provided by scikit-learn [26]; the main job of VotingClassifier is to aggregate the results of multiple classifiers and then predict one result based on the majority of votes. The complete process can be better understood from Algorithm 1. We have used the Python programming language to develop the system. 60% of the total data was used to train the system, and the remaining 40% was used to test it. We have stored the prediction results on the test data in a CSV file for further analysis.
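Since scikit-learn's VotingClassifier and the five base learners are named explicitly, the hybrid (heterogeneous) ensemble can be sketched as below. This is an illustrative reconstruction rather than the authors' code; X and y denote the prepared eight-feature matrix and the good/moderate/bad labels and are assumed to exist.

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import VotingClassifier

# 60/40 train/test split, as described in the text
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0, stratify=y)

# Five copies of each of the five base learners -> 25 sub-models in total
base_models = [DecisionTreeClassifier, LogisticRegression, SVC,
               GaussianNB, KNeighborsClassifier]
estimators = [(f"{cls.__name__}_{i}", cls())
              for i in range(5) for cls in base_models]

# Hard voting aggregates the 25 individual predictions by majority vote
ensemble = VotingClassifier(estimators=estimators, voting="hard")
ensemble.fit(X_train, y_train)
predictions = ensemble.predict(X_test)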

72.5 Result Analysis

Before evaluating the efficiency of the machine learning models, we contacted all 1200 users to take their feedback. We then performed a survey to validate the observations of the machine learning models against the actual observations of the users. Next, we used the four parameters of the confusion matrix to evaluate the performance of the models and determine how correct they were in predicting users' experience. True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) are the parameters of the confusion matrix. TP refers to actual good user experience, and FP refers to actual poor user experience. TN refers to cases where users were experiencing good performance but were predicted as poor by the models, and FN refers to cases where a user was getting poor performance but was predicted as good by the models. Table 72.1 shows the values of the confusion matrix parameters, and Eq. (72.5) gives the formula for calculating accuracy from these parameters.
Accuracy = (TP + TN) / (TP + TN + FP + FN)    (72.5)

DT, LR, SVM, NB, KNN, and the hybrid approach (HA) show accuracies of 97.50%, 91.25%, 92.50%, 84.79%, 95.00%, and 98.13%, respectively. So, from the performance analysis of the different models, it can be concluded that the hybrid approach provides better accuracy in predicting the customer's experience. Precision measures the fraction of the predicted positives that are actually positive. Equation (72.6) calculates precision, where TP = True Positive and FP = False Positive.
Precision = TP / (TP + FP)    (72.6)

Table 72.1 Value of different parameters of confusion matrix

Classifier name          TP    TN   FP   FN
Decision tree            462   8    5    5
Logistic regression      434   28   4    14
Support vector machine   440   19   4    17
Naive Bayes              405   52   2    21
K-nearest neighbor       452   8    4    16
Hybrid approach          465   7    5    3

The precision for the proposed method is 98.93%. NB showed the best precision, 99.50%, among the six models; the precisions for DT, LR, SVM, and KNN are 98.93%, 99.08%, 99.09%, and 99.12%, respectively. Recall measures the fraction of actual positives that the algorithm identifies correctly. Equation (72.7) calculates recall, where TP = True Positive and FN = False Negative.
Recall = TP / (TP + FN)    (72.7)

Recall for the proposed method is 99.36%, the best among the six models; recall for DT, LR, SVM, NB, and KNN is 98.93%, 96.87%, 96.28%, 95.07%, and 96.58%, respectively. The F1-score can be interpreted as a weighted average of precision and recall, reaching its best value at 1 and its worst at 0. Equation (72.8) calculates the F1-score.
F1 score = 2 × Precision × Recall / (Precision + Recall)    (72.8)

The F1-score for the proposed method is 0.9914, the best among the six models; the F1-scores for DT, LR, SVM, NB, and KNN are 0.989, 0.979, 0.976, 0.972, and 0.978, respectively. Thus, the proposed method achieves the highest F1-score.
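The four metrics of Eqs. (72.5)–(72.8) follow directly from the confusion-matrix counts; the short sketch below applies them to the hybrid-approach row of Table 72.1 (the printed percentages in the text may differ slightly because of rounding):

def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    # Eqs. (72.5)-(72.8)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hybrid-approach counts from Table 72.1
print(confusion_metrics(tp=465, tn=7, fp=5, fn=3))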
Figure 72.2 shows the efficiency of the different classification models based on the confusion matrix parameters. Precision for the proposed model was not the best among the six models, but accuracy, recall, and F1-score were the highest for the proposed model. So, in short, the proposed method performs better than the other models in predicting customer experience.
Fig. 72.2 Efficiency of models using confusion matrix parameters

Now, the system cost is analyzed. We have used a computing device built with an Intel(R) Core(TM) i7-10700 CPU and 8 GB of DDR4 RAM, so the system was powerful enough to run resource-intensive tasks. We have used a total of 1200 data records for training and testing the models. As the number of records was not large, the individual and hybrid approaches did not show much difference in system cost. Naive Bayes provided the fastest training time of 0.21 s, while the hybrid approach took 0.26 s; the average execution time of all approaches was 0.226 s. K-nearest neighbor consumed more CPU (2.56%) than the other approaches; the hybrid approach consumed 2.53%, and the average CPU usage of all approaches was 2.52%. SVM consumed 682 MB of memory, the most among all approaches; the hybrid approach consumed 672 MB of RAM, and KNN consumed 626 MB, the lowest among all. So, it is clear that the hybrid approach did not consume many more system resources than the other approaches.

72.6 Conclusion

We have introduced an approach to predict users' experience on telecom networks from a rich set of network performance data using a hybrid machine learning model. The main challenge of the work was to collect network performance data from the core nodes and use it to predict experience. We have used confusion matrix parameters to determine the efficiency of the approaches. Decision tree, logistic regression, support vector machine, Naive Bayes, K-nearest neighbor, and the hybrid approach showed accuracies of 97.50%, 91.25%, 92.50%, 84.79%, 95.00%, and 98.13%, respectively. So, it is clear that the hybrid approach outperforms the individual approaches, and we therefore propose using the hybrid machine learning model instead of an individual one. As the actual network usage experience can be predicted using our model, in the future we shall continue our work to automate actions that resolve poor network quality and ensure a better network for the users.

References

1. Luna, L.: Churn is epidemic. Radio Commun. Rep. (1998)


2. Jianqiang, Z., Xiaolin, G., Xuejun, Z.: Deep convolution neural networks for twitter senti-
ment analysis. IEEE Access 6, 23253–23260 (2018). https://doi.org/10.1109/ACCESS.2017.
2776930
3. Jianqiang, Z., Xiaolin, G.: Comparison research on text pre-processing methods on twitter
sentiment analysis. IEEE Access 5, 2870–2879 (2017). https://doi.org/10.1109/ACCESS.2017.
2672677
4. Bouazizi, M., Ohtsuki, T.: Multi-class sentiment analysis in twitter: what if classification is
not the answer. IEEE Access 6, 64486–64502 (2018). https://doi.org/10.1109/ACCESS.2018.
2876674
5. Bouazizi, M., Ohtsuki, T.: A pattern-based approach for multi-class sentiment analysis in
twitter. IEEE Access 5, 20617–20639 (2017). https://doi.org/10.1109/ACCESS.2017.2740982
6. Ebrahimi, M., Yazdavar, A.H., Sheth, A.: Challenges of sentiment analysis for dynamic events.
IEEE Intell. Syst. 32(5), 70–75 (2017). https://doi.org/10.1109/MIS.2017.3711649
7. Iqbal, F.: A hybrid framework for sentiment analysis using genetic algorithm based fea-
ture reduction. IEEE Access 7, 14637–14652 (2019). https://doi.org/10.1109/ACCESS.2019.
2892852
8. Tan, S., et al.: Interpreting the public sentiment variations on twitter. IEEE Trans. Knowl. Data
Eng. 26(5), 1158–1170 (2014). https://doi.org/10.1109/TKDE.2013.116
9. Liu, S., Cheng, X., Li, F., Li, F.: TASC: topic-adaptive sentiment classification on dynamic
tweets. IEEE Trans. Knowl. Data Eng. 27(6), 1696–1709 (2015). https://doi.org/10.1109/
TKDE.2014.2382600
10. Bouazizi, M., Ohtsuki, T.: Multi-class sentiment analysis on twitter: classification performance
and challenges. Big Data Min. Anal. 2(3), 181–194 (2019). https://doi.org/10.26599/BDMA.
2019.9020002
11. Trilla, A., Alias, F.: Sentence-based sentiment analysis for expressive text-to-speech. IEEE
Trans. Audio Speech Lang. Process. 21(2), 223–233 (2013). https://doi.org/10.1109/TASL.
2012.2217
12. Yu, D., Xu, D., Wang, D., Ni, Z.: Hierarchical topic modeling of twitter data for online analyti-
cal processing. IEEE Access 7, 12373–12385 (2019). https://doi.org/10.1109/ACCESS.2019.
2891902
13. Teradata Solution Technical Overview. https://www-50.ibm.com/partnerworld/gsd/
showimage.do?id=29988. Accessed 14 Nov 2021
14. Greenplum Database. https://greenplum.org/. Accessed 25 Nov 2021
15. Lamb, A.F., Varadarajan, M., Tran, R., Vandier, N., Doshi, B.L., Bear, C.: The Vertica analytic
database: C-store 7 years later. arXiv:1208.4173 (2012)
16. Ngai, E.W., Xiu, L., Chau, D.C.: Application of data mining techniques in customer relationship
management: a literature review and classification. Expert Syst. Appl. (2009)
17. Hung, S.-Y., Yen, D.C., Wang, H.-Y.: Applying data mining to telecom churn management.
Expert Syst. Appl. (2006)
18. Wei, C.-P., Chiu, I.-T.: Turning telecommunications call details to churn prediction: a data
mining approach. Expert Syst. Appl. (2002)
19. Richter, Y., Yom-Tov, E., Slonim, N.: Predicting customer churn in mobile networks through
analysis of social groups. In: SDM (2010)
20. Rowe, M.: Mining user lifecycles from online community platforms and their application to
churn prediction. In: ICDM (2013)
21. Mozer, M.C., Wolniewicz, R., Grimes, D.B., Johnson, E., Kaushansky, H.: Predicting subscriber
dissatisfaction and improving retention in the wireless telecommunications industry. IEEE
Trans. Neural Netw. (2000)
22. Zhao, Y., Li, B., Li, X., Liu, W., Ren, S.: Customer churn prediction using improved one-class
support vector machine. In: Advanced Data Mining and Applications. Springer (2005)

23. Tan, P.-N., Blau, H., Harp, S., Goldman, R.: Textual data mining of service center call records.
In: KDD (2000)
24. Bogomolov, A., Lepri, B., Staiano, J., Oliver, N., Pianesi, F., Pentland, A.: Once upon a crime:
towards crime prediction from demographics and mobile data. In: ICMI (2014)
25. Reades, J., Calabrese, F., Sevtsuk, A., Ratti, C.: Cellular census: explorations in urban data
collection. IEEE Pervas. Comput. (2007)
26. Voting Classifier. https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.
VotingClassifier.html. Accessed 17 May 2021
Chapter 73
An Analysis on Predicting Social Media
Ads Using Kernel SVM Function

Swathi Jayaprakash, Dutta Yasaswi, and V. Pattabiraman

Abstract This paper predicts which social media users will buy a car, based on a previously observed dataset. The dataset is extracted from users who have searched for cars and been shown car advertisements on the Internet, and it has attributes such as User ID, Gender, Age, Estimated Salary, and Purchased. Using certain data mining methods, we predict the customers who are most likely to buy a car so that the focus falls entirely on those users, ignoring those who are not ready to purchase a car at that time. This helps the marketing department decide where to concentrate, saving money and time, and provides an analysis of whether a user is likely to buy a car or not. It indicates which users to monitor, so that ads are pushed to the users most likely to purchase while less attention is paid to those unlikely to buy according to the extracted dataset. Based on the analysis of the extracted data, we apply different kernel SVM techniques and existing data mining models, compare which algorithm gives the most effective result, and conclude which algorithm is best. The evaluation is done via the confusion matrix, from which accuracy, precision, specificity, etc. are derived.

Keywords SVM · Kernel SVM · Accuracy · RBF · Polynomial kernel

73.1 Introduction

Nowadays, social media has become more commercial than ever because of its widespread usage. It has become a boon to e-commerce companies, as it works as a platform to present their products to people in a much easier way and to grow their market with less time and money. On that note, we present a social media ads prediction system in this paper.

S. Jayaprakash (B) · D. Yasaswi · V. Pattabiraman


Vellore Institute of Technology Chennai, Chennai, India
e-mail: swathijayaprakash2000@gmail.com
D. Yasaswi
e-mail: yasaswidutta666@gmail.com
V. Pattabiraman
e-mail: pattabiraman.v@vit.ac.in


The application is designed entirely for marketing business management. Our work analyzes the users' data accurately and tells whether a particular user has bought the product or not. The user data contains user ID, estimated salary, age, gender, and whether the product was purchased. If a user appears able to afford the product, or has not yet purchased it but is looking to buy it, the relevant social media ad is sent to that user.
The application we have built would reduce the money that has to be spent on marketing, and companies need not suggest their products blindly, which saves time, because targeting is based on the user dataset. Our application applies nine different data mining models: linear kernel SVM, naive Bayes classifier, logistic regression, polynomial kernel SVM, radial (RBF) kernel SVM, random forest classifier, sigmoid kernel SVM, KNN, and decision tree classifier. These nine models give different results; we compare them and conclude which model is best suited to predicting social media ad response. The comparison is based on the confusion matrix and on data visualizations of both the training and testing data. Eventually, e-commerce companies can rely on this work to obtain better results.
There are four major kernel functions used in this paper, namely the linear, radial basis, polynomial, and sigmoid kernels. The SVM algorithm is very powerful when it comes to classification in data mining and machine learning, and it has a solid mathematical formulation for separating groups of classes. The software used for implementation was RStudio with the R programming language; ElemStatLearn, caTools, and e1071 are some of the packages used. e1071 is an important R package that provides statistical and classification functions such as SVM, naive Bayes, Fourier transforms, etc.
There are several packages for implementing SVM, but we have used the e1071 package because it delivers a powerful interface and provides all the kernel functions used in this paper. The caTools package provides fundamental functions in R such as fast calculations, rounding, LogitBoost, etc., while the ElemStatLearn package provides better data visualization and also includes many statistical functions.

73.2 Literature Survey

Gaye et al. [1] basically inspect SVM and various modified SVM methods. The paper also discusses problems faced by SVM, such as uneven class distribution, sensitivity to noise, and the presence of outliers. Srivastava et al. [2] deal with the SVM algorithm, applying SVM and other data mining algorithms to four different datasets: a diabetes dataset, a satellite dataset, a shuttle dataset,

Table 73.1 Social media Ads prediction dataset


User Id Gender Age EstimatedSalary Purchased
15,624,510 Male 19 19,000 0
15,810,944 Male 35 20,000 0
15,668,575 Female 26 43,000 0
15,603,246 Female 27 57,000 0
15,804,002 Male 19 76,000 0

and a heart dataset. They concluded that the choice of kernel type depends on the dataset used, and that the best results were obtained by the RBF kernel because the data was multi-class. For solving the SVM [3], the authors proposed three iterative approaches; the suggested approaches made use of two functions, the convex Huber loss and a robust loss, for measuring the error. Dinesh et al. [4] brief the merits of using prediction algorithms in the field of education; in addition, they state that applying prediction algorithms to collected data makes a huge difference. Hachesu et al. [5] stressed the prediction of the length of stay in hospitals; the prediction was based on cardiac patients admitted to hospitals. Based on the highest accuracy rate, the length of stay was predicted, which in turn reduces the burden on hospital management. Charles et al. [6] deal with predicting the type of soil suitable for the agriculture process; the authors implemented prediction algorithms in order to identify the best soil for better results.

73.3 Datasets Used

We have used a social media advertisement dataset for prediction. It basically records whether users have purchased a product after clicking on the advertisements shown to them. The dataset has five variables, namely user ID, gender, age, estimated salary, and purchased (a Boolean value, with 0 representing that the user has not purchased and 1 representing that the user bought the product). We extracted this dataset from Kaggle; a sample of the first five records is shown in Table 73.1. The dataset contains about 400+ records.
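The implementation described in this chapter is in R; purely as an illustration of the preprocessing implied by Table 73.1, a Python sketch could look as follows. The file name is a placeholder, and the 25% test split is an assumption (consistent with the 100-sample confusion matrices reported later) rather than something the chapter states explicitly.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("social_network_ads.csv")          # placeholder file name

# User ID carries no predictive signal; Age and EstimatedSalary are the two
# numeric predictors used for the 2-D kernel-SVM plots in Sect. 73.4
X = df[["Age", "EstimatedSalary"]].values
y = df["Purchased"].values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Feature scaling keeps Age and EstimatedSalary on comparable scales for SVM
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)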

73.4 Methodology

For the implementation of the kernel SVM and the other algorithms, we used RStudio. Even though the SVM model works well for regression analysis, SVM is most widely used for classification. We apply it by plotting the data in an n-dimensional space and then searching for the best hyperplane that separates the two classes.

The points closest to the hyperplane affect its orientation, arrangement, and location; such points are called the "support vectors". They are basically the coordinate representations of individual observations, and the hyperplane is the mechanism for separating the two classes. There are three main steps to follow in SVM:
1. Start with data in a low dimension.
2. Move the data into a higher dimension. In our case, we are dealing with 2-dimensional data [x axis = age, y axis = estimated salary], so we convert the 2-D data into 3-D data.
3. Find the support vector classifier that separates the higher-dimensional data into two groups.
For the conversion of two-dimensional data to three-dimensional data, one may wonder how we decide how to transform the data from 2-D to 3-D. In order to make the mathematics tractable, SVM makes use of so-called kernel functions to systematically discover support vector classifiers in higher dimensions.
There are many types of kernels. A few of the kernels available in RStudio are radial, linear, sigmoid, and polynomial. We explore all the above-mentioned models and find which type of kernel SVM gives a good classification result.
Each of these kernels has a unique mathematical equation that helps in forming the hyperplane separating the two classes of data. The algorithm stays the same, and the only difference between the different types of kernel SVM is the kernel function used. These kernel functions are difficult to solve manually, so we use RStudio to do that.
The algorithm of kernel SVM is as follows (an illustrative sketch of the same workflow is given after this list):
1. First generate the data in two dimensions and separate it into two matrices.
2. Choose two hyperplanes (in 2-D) that can separate the data with no points between them.
3. Try to maximize the distance between them (the margin).
4. The average line will be the decision boundary.
5. Load the package e1071, which contains the svm() function.
6. As we are using two predictor variables, take y as the response variable and the other variables as predictors.
7. Fit the model using the svm() function from e1071.
8. Plot the confusion matrix to find the accuracy of the model.
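The workflow above is implemented in R with the e1071 package; the following scikit-learn sketch is only an illustration of the same steps (it is not the authors' code) and assumes the scaled arrays X_train, X_test, y_train, y_test from Sect. 73.3, looping over the four kernels compared in this paper.

from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score

# The four kernel functions compared in this chapter, with default parameters
for kernel in ["linear", "rbf", "poly", "sigmoid"]:
    clf = SVC(kernel=kernel)               # e.g. degree=3 is the default for "poly"
    clf.fit(X_train, y_train)              # step 7: fit the model
    y_pred = clf.predict(X_test)
    cm = confusion_matrix(y_test, y_pred)  # step 8: confusion matrix
    print(kernel, cm.ravel(), f"accuracy = {accuracy_score(y_test, y_pred):.2%}")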
The Different kernel functions used are:

73.4.1 Polynomial Type Kernel SVM

The polynomial kernel is an SVM kernel function that represents the similarity between vectors in a feature space, allowing the model to learn non-linear relationships.

Fig. 73.1 a Plot of training dataset using polynomial kernel, b plot of testing dataset using
polynomial kernel

Table 73.2 Confusion matrix for polynomial type kernel SVM

Y predictions    0     1
0                60    4
1                18    18

Additionally, the polynomial kernel not only looks at the given input features to discover the similarity between variables, but also considers their combinations. In regression analysis, such combinations are referred to as interaction features. When the input features are binary-valued (0 or 1), the features correspond to the logical co-occurrence of input features. The polynomial kernel is defined as

K(X_i, X_j) = (X_i · X_j + c)^d    (73.1)

where X_i and X_j are two different observations in the dataset, c is a constant coefficient of the polynomial, and d is its degree. When d = 1, the polynomial kernel computes the relationship between each pair of observations in one dimension, and the calculated relationships are used to find a support vector classifier. When d = 2, a second dimension based on the squared terms is obtained, and the polynomial kernel computes the relationship between each pair of observations in two dimensions, and so on. To conclude, the polynomial kernel systematically increases the dimensionality simply by setting the value of d. Applying the polynomial kernel function to the dataset generates the following output: Fig. 73.1a and b show the visual representation of the training and testing datasets with the polynomial kernel SVM, respectively, and Table 73.2 gives the confusion matrix generated.

73.4.2 Radial Type Kernel SVM

The radial kernel is one of the most powerful kernel SVM techniques and is most useful when the data is not linearly separable.

Fig. 73.2 a Plot of training dataset using RBF kernel, b plot of testing dataset using RBF kernel

Table 73.3 Confusion matrix for radial type kernel SVM

Y predictions    0     1
0                58    6
1                4     32

Such a problem is resolved by applying a non-linear transformation to the feature variables and then mapping them to a higher dimension (e.g., from a two-dimensional to a three-dimensional space), which is often referred to as the feature space. By doing this, the irregular (non-uniform) data can be separated with a non-linear partition. RBF is the abbreviation for radial basis function; it is typically used when the user does not have any prior knowledge about the dataset. It is represented as:
   
K(X_i, X_j) = exp(−γ ‖X_i − X_j‖²)    (73.2)

In (73.2), X_i and X_j are two different observations; the difference between them
is squared, which gives us the squared distance between the two observations. The
amount of influence one observation has on another is therefore a function of this
squared distance. The parameter γ, which is typically tuned by cross-validation,
scales the squared distance and hence scales the influence. If we plug in values
for X_i and X_j that are relatively close, the kernel value will be higher; for
points that are distant, the value will be lower.
The radial kernel effectively finds support vector classifiers in infinite
dimensions. In this type of model, the nearest data points have a lot of influence
on how we classify a new observation. Fig. 73.2a and b show the visual
representation of the training and testing datasets with the radial basis kernel
SVM, respectively, and Table 73.3 shows the confusion matrix generated for it.
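As a small numerical illustration of (73.2), not taken from the chapter, the sketch below computes the RBF kernel value for a close pair and a distant pair of points and shows how γ scales the influence; the point coordinates and γ values are arbitrary assumptions.

```python
# Sketch of equation (73.2): K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2).
# Close points give a kernel value near 1; distant points give a value near 0,
# and gamma controls how quickly the influence decays with squared distance.
import numpy as np

def rbf_kernel(x_i, x_j, gamma):
    sq_dist = float(np.sum((np.asarray(x_i) - np.asarray(x_j)) ** 2))
    return np.exp(-gamma * sq_dist)

a, near, far = [0.0, 0.0], [0.5, 0.5], [3.0, 3.0]
for gamma in (0.1, 1.0, 10.0):
    print(f"gamma={gamma}: K(a, near)={rbf_kernel(a, near, gamma):.4f}, "
          f"K(a, far)={rbf_kernel(a, far, gamma):.6f}")
```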

73.4.3 Linear Type Kernel SVM

Fig. 73.3 a Plot of testing dataset using linear kernel, b plot of training dataset using linear kernel

Table 73.4 Confusion matrix for linear type kernel SVM

Y predictions    0    1
0               57    7
1               13   23

This kernel is one-dimensional and is the simplest form of kernel in SVM. The
equation is:

 
K(X_i, X_j) = X_i · X_j    (73.3)

A linear kernel SVM forms a linear hyperplane. This model is effective when we
work with uniform (linearly separable) data. Since we are dealing with a
non-uniform set of data, this type of kernel SVM is less suitable: it creates a
hyperplane that is a straight line, and points may be misclassified.
So, Fig. 73.3a and b show the visual representation of the training and testing
datasets of the linear kernel SVM, respectively, and Table 73.4 shows the
confusion matrix generated for it.
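The claim that a linear kernel suits only linearly separable data can be illustrated with a small experiment. The following is an assumed setup (scikit-learn's make_circles toy data, not the chapter's dataset): the linear kernel's straight boundary misclassifies many points, while the RBF kernel separates the classes.

```python
# Assumed illustration (not the authors' experiment): on concentric-circle data,
# which is not linearly separable, a linear-kernel SVM performs near chance while
# an RBF-kernel SVM separates the two classes well.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_circles(n_samples=400, factor=0.4, noise=0.1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

for kernel in ("linear", "rbf"):
    acc = SVC(kernel=kernel).fit(X_tr, y_tr).score(X_te, y_te)
    print(kernel, round(acc, 2))
```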

73.4.4 Sigmoid Type Kernel SVM

The sigmoid kernel originates from the neural network field, where the bipolar
sigmoid function (ranging between −1 and +1) is frequently applied as an
activation function for artificial neural networks. An SVM model that makes use of
the sigmoid kernel function is equivalent to a two-layer perceptron neural
network. We can represent this as:
   
K(X_i, X_j) = tanh(α X_i^T X_j + c)    (73.4)

So, Fig. 73.4a and b show the visual representation of the training and testing
datasets of the sigmoid kernel SVM, respectively, and Table 73.5 shows the
confusion matrix generated.


Fig. 73.4 a Plot of training dataset using sigmoid kernel, b plot of testing dataset using sigmoid kernel

Table 73.5 Confusion matrix for sigmoid type kernel SVM

Y predictions    0    1
0               53   11
1               14   22
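For reference, the four kernel functions of (73.1)-(73.4) can be written out directly. The sketch below is an assumption of how they might be coded (parameter names c, d, gamma, and alpha follow the text; the default values and test vectors are arbitrary) and is not taken from the chapter.

```python
# The four kernel functions of equations (73.1)-(73.4), written as plain NumPy
# functions; default parameter values are illustrative only.
import numpy as np

def polynomial_kernel(x_i, x_j, c=1.0, d=2):          # (73.1)
    return (np.dot(x_i, x_j) + c) ** d

def rbf_kernel(x_i, x_j, gamma=1.0):                  # (73.2)
    return np.exp(-gamma * np.sum((x_i - x_j) ** 2))

def linear_kernel(x_i, x_j):                          # (73.3)
    return np.dot(x_i, x_j)

def sigmoid_kernel(x_i, x_j, alpha=0.01, c=0.0):      # (73.4)
    return np.tanh(alpha * np.dot(x_i, x_j) + c)

x_i, x_j = np.array([1.0, 2.0]), np.array([2.0, 0.5])
for name, k in [("polynomial", polynomial_kernel), ("rbf", rbf_kernel),
                ("linear", linear_kernel), ("sigmoid", sigmoid_kernel)]:
    print(name, round(float(k(x_i, x_j)), 4))
```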

73.5 Observation

For the social media ads dataset, we have worked with several algorithms, and
Tables 73.6 and 73.7 compare the correctness of the models. The accuracy of each
model is calculated using the confusion matrix, which generates four values,
namely True Positive [Tp], False Positive [Fp], True Negative [Tn], and False
Negative [Fn].
The correctness (accuracy) of any system is basically calculated using the formula
Accuracy = (Tp + Tn)/(Tp + Fp + Tn + Fn).

Table 73.6 Calculated accuracy for all the algorithms used

S.No.  Algorithm used                        Tp   Fp   Tn   Fn   Accuracy (%)
1      Logistic regression                   57    7   26   10   83
2      KNN                                   59    5   30    6   89
3      Decision tree classifier              53   11   30    6   83
4      Random forest classifier              53   11   30    6   86
5      Naive Bayes                           57    7   29    7   86
6      Kernel SVM using radial kernel        58    6   32    4   90
7      Kernel SVM using linear kernel        57    7   23   13   80
8      Kernel SVM using sigmoid kernel       53   11   22   14   75
9      Kernel SVM using polynomial kernel    60    4   18   18   78
Table 73.7 Measures of correctness for the existing algorithms (values in percentage; LR = logistic regression, DT = decision tree, RF = random forest, NB = Naive Bayes; the last four columns are the kernel SVMs)

Measure (formula)                        LR    KNN   DT    RF    NB    Radial   Linear   Sigmoid   Polynomial
Sensitivity Tp/(Tp + Fn)                 85    90    89    86    89    93       81       79        76
Specificity Tn/(Tn + Fp)                 78    85    73    84    80    84       76       66        81
Precision Tp/(Tp + Fp)                   89    92    82    92    89    90       89       82        93
False positive rate Fp/(Fp + Tn)         21    14    26    15    19    15       23       33        18
False negative rate Fn/(Fn + Tp)         14     9    10    13    10     6       18       20        23
F1 score 2Tp/(2Tp + Fp + Fn)             87    91    86    89    89    92       85       80        84
False discovery rate Fp/(Fp + Tp)        10     7    17     7    10     9       10       17         6
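The measures reported in Tables 73.6 and 73.7 all follow from the four confusion-matrix counts. The sketch below is an illustration, not the authors' code; it computes them for the radial-kernel row of Table 73.6, and the printed values match the radial-kernel column of Table 73.7 up to rounding.

```python
# Correctness measures derived from the confusion-matrix counts (Tp, Fp, Tn, Fn);
# the example values are the radial-kernel row of Table 73.6.
def measures(tp, fp, tn, fn):
    return {
        "accuracy":             100 * (tp + tn) / (tp + fp + tn + fn),
        "sensitivity (recall)": 100 * tp / (tp + fn),
        "specificity":          100 * tn / (tn + fp),
        "precision":            100 * tp / (tp + fp),
        "false positive rate":  100 * fp / (fp + tn),
        "false negative rate":  100 * fn / (fn + tp),
        "F1 score":             100 * 2 * tp / (2 * tp + fp + fn),
        "false discovery rate": 100 * fp / (fp + tp),
    }

for name, value in measures(tp=58, fp=6, tn=32, fn=4).items():
    print(f"{name}: {value:.1f}%")
```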

73.6 Result

The output is an analysis of predicting whether a social media user will purchase
a car or not. Our analysis is based on nine different data mining algorithms. We
obtained the confusion matrix for each of the nine methods, along with a data
visualization of the results. This shows how efficient the nine data mining
methods are relative to each other and what their respective outcomes are. We have
further calculated the other correctness measures to identify the most efficient
algorithm.
Tables 73.6 and 73.7 present the various calculations made from the results of the
confusion matrices; the values are percentages. Polynomial kernel SVM has the
highest precision value (see Fig. 73.8), which in essence depicts the measure of
quality. KNN has the highest specificity measure (see Fig. 73.7). For the decision
tree classifier, sensitivity tops the list of its measures; sensitivity captures
the actual positives that get predicted as positive. For the random forest
classifier, another classifier we implemented, the highest percentage is
precision, which again speaks about quality. Naive Bayes has similar percentage
values for sensitivity, precision, and F1 score (Table 73.7). The innovative work
in this paper deals with kernel SVM: the radial kernel has the highest sensitivity
(see Fig. 73.6), which reflects how well positives are predicted. The linear
kernel excels at the quality aspect, precision, and for the sigmoid kernel
precision is likewise its highest measure, following the linear kernel. The last
type of kernel SVM, the polynomial kernel, has the highest precision measure of
all (see Fig. 73.8). The highest false positive rate is found in the sigmoid
kernel SVM (see Fig. 73.9), while the polynomial kernel SVM tops the list for the
highest false negative rate (see Fig. 73.10). When it comes to accuracy (see
Fig. 73.5) and F1 score (see Fig. 73.12), it is very clear that the radial basis
kernel SVM tops the chart. The false discovery rates of the decision tree and the
sigmoid kernel SVM turn out to be similar (see Fig. 73.11).

Fig. 73.5 Graphical representation of accuracy of the algorithms used



Fig. 73.6 Graphical representation of sensitivity measure of the algorithms used

Fig. 73.7 Graphical representation of specificity measure of the algorithms used

73.7 Conclusion

For the Social Media Ads Prediction dataset, we have used a total of nine
algorithms and found the accuracy and other measures of each and every algorithm
(Tables 73.6 and 73.7). Out of the nine algorithms, the four kernel functions of
the SVM algorithm, namely radial kernel SVM, linear kernel SVM, sigmoidal kernel
SVM, and polynomial kernel SVM, are claimed to be the novelty of this work.
Observing the accuracy (see Fig. 73.5) of the system with these four kernel SVMs
and the already existing data mining/machine learning algorithms, we conclude that
the polynomial kernel SVM has the highest precision value (see Fig. 73.8).
Precision deals precisely with the quality of the measure, where the ratio is
driven by the predicted positive notions to the total predicted positive notions.
The kernel SVM using the radial basis function has given the maximum accuracy (see
Fig. 73.5) and F1 score (see Fig. 73.12) for the system, which in turn signifies
the performance of the model. We thereby declare that the kernel SVM using the
radial kernel function is the best algorithm compared with all other existing
algorithms, with respect to the Social Media Ads Prediction dataset.

Fig. 73.8 Graphical representation of precision measure of the algorithms used

Fig. 73.9 Graphical representation of FPR measure of the algorithms used

Fig. 73.10 Graphical representation of FNR measure of the algorithms used

Fig. 73.11 Graphical representation of FDR measure of the algorithms used



Fig. 73.12 Graphical representation of F1 measure of the algorithms used

References

1. Gaye, B., Zhang, D., Wulamu, A.: Improvement of support vector machine algorithm in big
data background. Math. Prob. Eng. 2021(5594899), 9 (2021)
2. Srivastava, D.K., Bhambhu, L.: Data classification using support vector machine. J. Theor. Appl.
Inf. Technol. 12(1), 1–7 (2010)
3. Borah, P., Gupta, D.: Functional iterative approaches for solving support vector classification
problems based on generalized Huber loss. Neur. Comput. Appl. 32(1), 1135–1139 (2020)
4. Dinesh Kumar, A., Pandi Selvam, R., Sathesh Kumar, K.: Review on prediction algorithms in
educational data mining. Int. J. Pure Appl. Math. 118(8) (2018)
5. Hachesu, P.R., Ahmadi, M., Alizadeh, S., Sadoughi, F.: Use of data mining techniques to deter-
mine and predict length of stay of cardiac patients. Healthcare Inf. Res. PMCID: PMC3717435
(2013)
6. Baskar, S.S., Arockiam, L., Charles, S.: Applying data mining techniques on soil fertility
prediction. Int. J. Comput. Appl. Technol. Res. 2(6), 660–662 (2013), ISSN: 2319–8656
Chapter 74
COVID-19 Triggers a Paradigm Shift
in Technology for Insurance Industry

Ravi Shankar Jha , Priti Ranjan Sahoo , and Arvind Tripathy

Abstract As the global economy grapples with the advent of novel coronavirus
and its variants, the aftermath has left all industries with ongoing uncertainties and
incalculable loss of life and livelihood in most countries worldwide. In such unpre-
dictable situations, the insurance industry and governments worldwide have become
the prominent source of optimism to sail through the situation. This applies to the
insurance industry globally, which is currently in the grip of fear due to the COVID-
19 outbreak and anticipating significant economic slowdown and hardship because
insurance rides on the back of other Industries. Therefore, to overcome a few of the
tenacious roadblocks due to the COVID outbreak, Insurers will be forced to reassess
all aspects of their business life cycle and take necessary steps to continue operations
with minimum disruption. Precisely, the impact of COVID on General Insurers and
Life and Health Insurers varied depending on the lines of business, product lines,
and a bouquet of benefits offered by the insurers. The pandemic has taken a hit on
new gross written premiums on specific lines of business, such as medical, travel,
commercial, and business insurance. Few lines of business such as motor and home
have remained muted during the COVID timeframe. However, the claims volumes
for personal insurance (e.g., motor) have significantly decreased due to the lock-
down and travel restriction; the industry has witnessed the highest claims volumes
in life and health compared with the past several decades. As they say, "every dark
cloud has a silver lining": the pandemic has given many insurers an opportunity to
develop new products (e.g., pay-per-mile auto insurance) and to push toward greater
productivity, i.e., digital capability across the product range, which will put them
in an elevated position to understand and address customer and intermediary
self-service (such as portals) and their implicit and explicit needs. Notably, the Insurance industry is likely to
lean toward offering personalized yet custom-made products and services, which
are sharply focused on preventative care and embracing digitalization across the
value chain. Besides enabling scalability and connectivity, insurers are strategically

R. S. Jha · P. R. Sahoo (B) · A. Tripathy


KIIT School of Management, KIIT University (Institution of Eminence), Bhubaneswar, India
e-mail: prsahoo@ksom.ac.in
A. Tripathy
e-mail: arvind@ksom.ac.in

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8_74

focused on digitizing the core of the business and cloud implementation; automa-
tion across the insurance value chain is necessary to compete successfully with new
innovative product development or inclusive business models. Around the globe,
the insurance industry is continuously putting a deep focus on revitalizing the tech-
nology paradigm to grow and strive to achieve cost-effectiveness amid emerging
markets, rapidly changing economic conditions and stiff competition from Insurtech.
According to industry experts across geographies, growth may be a balanced blend of
preventative and protective approaches, with a gamut of new and improved services
and products, and insurers are deeply fostering redefining service-oriented strategies
and innovative products.

Keywords COVID-19 · Insurance · Digital · Blockchain · Artificial Intelligence · Technology · Insurtech

74.1 Current ‘Global Insurance Stance’

The global insurance penetration is 7.3%, as per reports from Statista [15]. Around the
world, insurers are facing tough economic challenges in terms of flat interest rates
(primarily for life insurers), rising inflation, negative yields from the government
bonds and corporate market that are driving the industry’s Return on Investment
(ROI) and Return on Equity (ROE) down. Consistent price hikes for raw materials in
the construction business, rental vehicles (personal and commercial usage), and
auto parts (computer chips for smart cars and semiconductors) are pushing the
expense lever, threatening to drive up insurer loss costs in 2022. As per the
survey from Willis Towers Watson, the implementation of International Financial
Reporting Standard (IFRS 17), which is due to come into effect in January 2023, is
likely to cost global insurers between USD 15 billion and USD 20 billion.
As per reports from McKinsey [4], the insurance industry is projected to be a $10
trillion industry by 2030. The same reports also highlighted that, across the
world, Insurtechs are predominately offering last-mile capability driven through
digital innovation and the integration of disruptive technologies, paving the way
to multiple lines of business and product lines within the insurance value chain [11].
Notable investments in Insurtechs worldwide are showing an upward trajectory from
$1 billion in 2004 to $7.2 billion in 2019 to $14.6 billion in 2021 (as per the recent
report in Deloitte [6, 12]). Around 40% of Insurtechs and a few big insurance carriers
are primarily targeting disruptive distribution channels and radical marketing strate-
gies, fueling them to meet the needs of end customers and improve the customer
journey through a digitally enhanced and enriched client experience [5]. In the
pre-COVID era, digitalization and technology were already in place across the
insurance value chain for many insurers; after COVID, an unprecedented shift in the
technology paradigm and the adoption of technologies accelerated to the maximum in
order to keep doing business, and it is no longer optional for any insurer globally (Fig. 74.1).

Fig. 74.1 Insurance value chain

There are several promising examples in which insurers leveraged emerging
technologies and reaped the benefits of early movers in the given space. For
instance, Allstate has come up with a concept of connected cars, which offers
competitive premiums using telematics. Similarly, AIG has developed 'Attune,' a
data intelligence tool for underwriters. Aetna has implemented fraudulent claim
detection using advanced machine learning. Based on the recent survey from
McKinsey, Insurtechs and some big insurance carriers will be focusing on marketing
and distribution channels in the coming time.
Another factor continuing to cut into general insurers' profitability is financial
losses from climate risks. As per the Deloitte report [12] of 2021, globally
projected insured natural disaster property losses amounted to USD 40 billion until
June. This has resulted in many insurance regulators across the globe launching
their own initiatives. In the United States, regulators are anticipated to outline
new directions, leaning on technology, for how insurers should disclose and deal
with financial risks spawned by climate change.
Coming on to the Global insurance merger and acquisition (M&A) domain, deals
will be strategic rather than opportunistic, with the help of sizable capital infusion by
private equity investment firms already prevalent in the insurance brokerage space.
Life and Annuity (L&A) Insurers who are still deemed legacy insurers are more likely
to explore inorganic growth and acquire Insurtech in 2022 than General Insurance
Insurer. The latter is already actively engaged in Insurtech investment. Recently
Voya Financial sold its annuities, life, and wealth businesses and doubled down
on retirement, asset management, and group insurance. MassMutual, meanwhile,
divested its US direct-contribution business, Oppenheimer Funds, and its businesses
in Asia.
However, with the points mentioned earlier, the global insurance market is expected
to exceed US$7 trillion in premium terms by mid-2022 due to rising risk awareness
among consumers and businesses [17]. The leading insurance institute [18] firmly
anticipates that the global economy will rebound steadily in 2022 from the COVID-19
pandemic because of the accelerated pace of the vaccination drive and global fiscal
stimulus through government bodies [1]. However, this recovery may be impacted by
upcoming virus variants (Delta, Omicron). After witnessing a drop of 2.9% in real
growth in 2020, the institute estimates that total global insurance premium growth
[8] will slow to 3.3% in 2022 and 3.1% in 2023. Slower economic growth in 2022 and
2023 is expected because of supply chain challenges, labor shortages, and inflated
energy prices [2]. Global life premiums are estimated to grow 2.8% from 2022 to
2023. Nonlife/general insurance premiums are projected to grow 3.5% from 2022 to
2023, driven by rate hardening in commercial lines of business. The institute
expects global insurance premiums to exceed $7 trillion by mid-2022 (Fig. 74.2).

Fig. 74.2 Global insurance marketplace position of premium growth
As per the Deloitte report [12], consolidated premiums for all lines rebounded by
3.3% for full year 2021 and are likely to grow 3.9% in 2022. China is expected to
lead the way with 9% growth in 2022, followed by emerging markets at 7.4%, while
advanced markets are likely to see more moderate gains averaging 3%.
The industry will exhibit an unusual growth trajectory and profound financial
performance in 2022. Yet global surveys from the Big 4 firms indicate that there
are multiple challenges cutting across finance, talent, technology, and marketing
as insurers continue to acclimate to the pandemic's aftershock and seek massive
enterprise transformation to propel faster growth and safeguard their enterprising
future.

74.2 Emerging Trends Reshaping the Insurance Industry

In 2022, insurers are evaluating, scaling, and refining the numerous digital
adoptions they implemented as the need of the hour to meet requirements during and
after the pandemic era, supported by the virtual workplace (equivalent to a hybrid
work model), and are redefining the customer engagement journey to mark a clear
difference between the pre- and post-COVID generations [14].
the process and ICT landscape with the help of emerging and disruptive technology
strategies to achieve sustainable growth and fulfill the long-term vision. Though
core system modernization remains the top priority for insurers, integrating data
analytics, rejuvenating business processes and models, creating personalized
offerings, and automation solutions will take center stage to increase efficiency
and revenue; the majority of these will be deployed on cloud platforms. Post COVID,
insurers have realized that accelerated technology adoption has become evident, and
the battle for talent resources is expected to be more ferocious in 2022. Deloitte's
recent survey [12] expects technology budgets to rise by 13.7% in 2022 (Fig. 74.3).

Fig. 74.3 Emerging technologies where carriers expect to increase spending in 2022
From a technology point of view, the ever-increasing usage of Artificial Intelli-
gence (AI), Big Data Analytics, and cloud adoption will revamp the technology/ICT
landscape [14] of the insurance value chain. Insurers are increasingly investing in
conventional AI or chatbots, Big data capabilities to communicate effectively among
various stakeholders, improve the user experience, and reduce wait times [10]. Insurer
data can generate analytics and insight to transform customer experience via person-
alized products as per customer needs, product bundling, and subscription models.
For example, Mitsui Sumitomo Insurance has an AI-powered “agent support system”
to accelerate the potential need identification of customers by analyzing internal
and external data. AI is also helping to adopt new business models such as AI-
driven Underwriting/Pricing, and insurers collaborate with online retailers to provide
required insurance products and benefits in real-time while purchasing consumer
goods.
Another emerging trend gaining sound traction among insurers is striving
to build a digital-ready workplace (hybrid work model) encompassing onshore
presence, remote working capabilities, and offshore presence, offering a digital
working ecosystem systematically orchestrated by virtual collaboration using stan-
dard communication tools. This implies that the talent approach should be adopted
in parallel. The insurer needs to examine their productivity, collaboration, and
innovation by doing trial and error on different plans.

74.3 Discussion

With digital strategies being implemented across insurance companies (e.g.,
self-service portals, chatbots, core modernizations, etc.), the reduced
face-to-face contact is problematic for potential customers (e.g., the older
population in Japan) who still want to buy services that way. Hence, carriers
should start adapting and find other ways
to integrate with legacy approaches to insurance operation, personalized product
development, distribution, and marketing. Insurance companies should not neglect
the human touch with the digitalization embedded in the insurance life cycle. The
insurance carrier needs to consider which form of insurance interaction channel
users will prefer, digital versus human intervention, to create a unique yet exciting
experience for its customer community.
Insurers are moving toward a ready-made COTS product that requires little or
no modification (product flexibility) to roll out their new and existing line of busi-
ness. Selection of these COTS products is a challenging and cumbersome process,
and hence the collaboration with Insurtech or IT service provider will play a huge
role. Citing reference example, AXA AL is establishing a tripartite ecosystem with
contractor customers, technology vendors, and innovation leaders to create an enter-
prising construction ecosystem using cameras and sensors to gather data to draw
impactful insight, which would rejuvenate engagement experience with commercial
customers.
Innovating and penetrating with new insurance products (for cyber risks, climate
change, pandemics, and intangible assets) is also becoming a discussion point. Many
insurers are yet to participate in the crypto insurance market, leaving a
relatively broad market segment open at the current time while they determine how
to underwrite such policies profitably and how to market them. Another one is cyber
insurance, which remains an unexploited opportunity with less than 1% of direct
written premiums. With remote working and increased digitalization, underwriting
and pricing coverage are enormous challenges that insurers need to address. 60% of
organizations are expected to invest in some form of cybersecurity [3] by 2022–23. Scaling usage-based insurance to
multiple lines of business is another hot topic that will impact the market globally. It
is widely used for auto businesses, and companies are getting a good return on their
investment in this COVID era. In a nutshell, the insurer needs to make their prod-
ucts modular, reallocate capital between commercial and personal lines, and move
speedily to establish strong market positions in the new risks [13].
Environmental, Social, and Governance (ESG) have become a top priority as the
direct impacts of the COVID-19 pandemic have receded [16]. Until 2021, discussions
around sustainability were largely theoretical. Still, in 2022, many insurers would be
taking credible steps toward embracing hard metrics as they would be committed to
addressing the full range of ESG issues and opportunities [16].

74.4 Conclusion

The COVID-19 pandemic has made significant changes in the insurance industry
(General and Life Insurance) in terms of people, processes, and technology (know-
how) and changed the perspective of the end customers. Insurance organizations are
quickly integrating emerging technologies, focusing on the digital stack, committing
to agile way, upskill or reskilling workforce, adapting innovative business models
and products (such as usage-based insurance) to sustain in such turbulent times and
maintain growth and profitability. Also, organizations across the globe are adopting
new norms like making their employees work remotely from their homes, crafting a
hybrid work model, adhering to safe distancing measures, and contactless transac-
tions (especially by agents and brokers) to negate the effects of business disruption
and meet customer expectations. Also, with COVID-19, insurers have responded to the
situation differently depending on their demographic regions.
The above-mentioned imperatives would enable carriers to answer the "how to play"
question in 2022–23. Insurers must invest quickly and massively in high-volume
activities with technology, digital skills, data and analytics capabilities, customer
experience, and compliance competencies to keep pace in the changing environment.
Several players have already adapted, and others will be changing and refocusing their
footprint and business model—in effect, rebalancing their portfolio of activities and
reviewing their capital allocation, mainly through M&A and asset disposals. Many
insurers believe they will be able to renew value creation by offloading existing
legacy liabilities to owners better positioned to manage them and by changing their
business model.
Insurance companies will be using transformation levers to sustain and prosper in
the post COVID world. Though core value chain elements will remain in insurance,
all critical business processes (from quote to policy issuance to claims) will be more
streamlined, which will be enabled by digitalization, investment in technology, and
automation with no code/low code. Another notable emphasis would be enhancing
and personalizing customer engagement and experience with simplification of prod-
ucts portfolio, which can be customized according to individual needs. The insurer
needs to focus on business strategies (peer-to-peer insurance), technologies (Telem-
atics, IoT platforms, Drones, Blockchain, machine learning/deep learning), structural
simplification, marketplaces, and enterprise agility to achieve full potential [9].
In any geography, the insurers that quickly integrate technology, upskill their
workforce, and adopt innovative business models will taste success in such
turbulent times. In a nutshell, insurers need to reinvent themselves to become
future-ready. Enterprises need to prepare themselves by focusing on "Trends and
Trinity" optimization: "Trends" in terms of technology, regulations, and business
models, and "Trinity," the three dimensions of speed, efficiency, and risk.
Insurance companies around the globe are standing at a crossroads wherein they need
to assess their investment portfolios, experience a new hybrid work model, increase
focus on technology spending and adoption, and apply macro and micro hedge
strategies to strengthen their capital and operational efficiency [7]. As Peter
Drucker famously said, "Luck never built a business. Prosperity and growth come
only to the business that systematically finds and exploits its potential." Therefore, insurance
companies need to develop strategies based on a holistic understanding of current
and future trends and reassess their product depth and breadth, geographic focus,
technology capabilities, operating model, and core business process capabilities.
The verdict is clear: a proper balance between disruptive innovation (embracing
disruptive technologies) and operational innovation (refining existing business
processes across the value chain while keeping customer centricity at center
stage), together with novel business models, will help the insurer
disproportionally succeed and build a sustainable and enterprising future.

References

1. Aizpún, F.C., Dia, X., Lechner, R.: World insurance: the recovery gains pace. Swiss Re
Management Ltd, No 3, 44 (2021)
2. Antonelli, T., Cook, D.: Is the Market Rotation the Real Deal or Just a Head-Fake? Wellington
Management (2021)
3. Babuna, P., Yang, X., Gyilbag, A., Awudi, D.A., Ngmenbelle, D., Bian, D.: The impact of
COVID-19 on the insurance industry. Int. J. Environ. Res. Public Health 17(16), 5766 (2020)
4. Balasubramanian, R., Libarikian, A., Doug, M.: Insurance 2030—The Impact of AI on the
Future of Insurance. McKinsey (2021)
5. Bernard, P.-I., Binder, S., D’Amico, A., Nayves, H. de C. de, Ellingrud, K., Klais, P., Kotanko,
B., et al.: Creating Value, Finding Focus: Global Insurance Report 2022. McKinsey’s Insurance
Practice (2022)
6. Deloitte.: Impact of COVID-19 on the Insurance Sector. Deloitte Review, vol. 6 (2020)
7. Erk, A., Patiath, P., Pedde, J., van Ouwerkerk, J.: Insurance Productivity 2030 : Reimagining
the Insurer for the Future. McKinsey & Company (2020)
8. Insurance Information Institute: Insurance Handbook (2020). Available at: https://www.iii.org/publications/insurance-handbook/economic-and-financial-data/world-insurance-marketplace. Accessed on 6 March 2022
9. Jha, R.S., Sahoo, P.R.: Internet of things (IOT)—enabler for connecting world. ICT for Compet-
itive Strategies: Proceedings of 4th International Conference on Information and Commu-
nication Technology for Competitive Strategies (ICTCS 2019), December 13th-14th, 2019,
Udaipur, India. CRC Press, p. 1 (2020)
10. Jha, R.S., Sahoo, P.R.: Influence of big data capabilities in knowledge management—MSMEs,
pp. 513–524. Springer, ICT Systems and Sustainability (2021)
11. Jha, R.S., Sahoo, P.R.: Relevance of Disruptive Technologies Led Knowledge Management
System and Practices for MSME, pp. 139–147. Springer, ICT Systems and Sustainability
(2022)
12. Li, T.: Global insurance market to hit a record of US$7 trillion by mid-2022. SHINE News
(2021). Available at: https://www.shine.cn/biz/finance/2112149481/. Accessed 6 March 2022
13. Mishra, P., Mishra, N., Sant, G.: Impact of COVID-19 on Business Industry and Management:
Pandemic Challenges and Responses (2021)
14. Nicoletti, B.: Insurance 4.0: Benefits and Challenges of Digital Transformation. Springer Nature
(2020)
15. Rudden, J.: Global insurance industry—statistics and facts. In: Statista Research Department (2021). Available at: https://www.statista.com/topics/6529/global-insurance-industry/#topicHeader__wrapper. Accessed 7 March 2022

16. Santenac, I., Bong, S.Y., Majkowski, E., Manchester, P.: 2022 Global Insurance Outlook. EY
Global Insurance Outlook (2022)
17. Shaw, G.: A Report from the Deloitte Center for Financial Services 2022 Insurance Industry
Outlook About the Center for Financial Services (2021)
18. Smith, R.: Global insurance industry could hit new record in 2022. In: Insurance Business America (2021). Available at: https://www.insurancebusinessmag.com/us/news/breaking-news/global-insurance-industry-could-hit-new-record-in-2022-317017.aspx. Accessed 6 March 2022
Chapter 75
Internet of Things in Saudi Arabia
Universities: State of the Art, Future
Opportunities, and Open Challenges

Norah Alyahya and Bader Aljaber

Abstract Internet of Things technology has radically changed our universities; it
allows different devices to communicate with other physical devices and contributes
to effective education and management. It has the chance to enable smart
universities by using different IoT devices. This study focuses on investigating
the state of the art of implementing the Internet of Things in government
universities in Riyadh, Saudi Arabia. In addition, it aims to understand the
limitations, challenges, risks, forecasted benefits, and future opportunities of
applying Internet of Things applications and to recommend applications for
universities in Saudi Arabia. A qualitative research design has been utilized:
open-ended interviews were conducted with five informants in the deanships of
Information Technology in government universities in Riyadh. The result of this
paper shows that Internet of Things technology was not applied in these
universities except Saudi Electronic University, and all of them have a direction
to adopt it. Moreover, the study highlights the challenges, limitations, risks,
future opportunities, and benefits of adopting the Internet of Things in
universities in Riyadh and the Internet of Things applications that can be implemented.

Keywords Internet of things · IoT · IoT in universities · Saudi Arabia universities

75.1 Introduction

In this era of more and more mobile devices and computer systems, there is a need
to adopt teaching and managing approaches that make the best use of these
technologies [1]. IoT is one of the new initiative technologies that would change
our environments from simple objects to interconnected and smart objects [11]. It
allows prevalent interaction between environments, things, and people. It is also
able to gather data from embedded sensors and other devices and then send these collected data to specific

N. Alyahya (B) · B. Aljaber


Al Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia
e-mail: n-m-y-1@hotmail.com
B. Aljaber
e-mail: baljaber@imamu.edu.sa

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8_75

applications to produce useful and meaningful information [29]. This technology


has played an important role in the enhancement of education and management
at university, college, school, and other educational institutions. From campus to
classroom, teacher to learner, everything can obtain benefits from this technology
[2]. Moreover, IoT enables innovation in education through a growing group of
smart Internet-connected devices and other technologies, for example, connected
university buses, security cameras, smart lighting, etc. All of them provide real-time
and valuable data to learners, their parents, administration, lecturers, and other faculty
[21].
Although privacy, security, trust, mobility, reliability, availability, scalability,
performance, interoperability, and management are the main challenges of IoT in all
sectors [2], there have been a lot of investigations in IoT applications in the education
sector. IoT is a major area of interest within the field of education, and it has been
studied by many researchers. Previous studies have reported that a lot of universities
across the world have used IoT within their laboratories, classrooms, campuses,
libraries, and other locations as a technology to automate education process, enhance
education, improve student outcomes, control security, manage students’ health care,
measure air pollution, humidity, temperature, etc., make teachers' and students' lives
easier [2, 24], achieve learners’ satisfaction [11], and enhance management [24]. The
unique communication between lecturers and learners and the decrease in educa-
tional institution operation cost and energy consumption are the most IoT benefits in
the education sector [11]. For instance, New Richmond schools in Ohio are approxi-
mately saving $128,000 each year through utilizing a Web-based system that manages
and controls all mechanical tools inside the buildings [23].
In fact, many of Saudi Arabia universities have traditional campuses that are not
able to gather and analyze data of learners, lecturers, and staff all day and tradi-
tional classrooms which have speakers, PCs, projectors, etc., that are not able to
detect and record all learners’ activities in the class in any form. Furthermore, these
traditional universities consume energy, increase cost and time, raise difficulties of
maintenance and managing learners and staff, and reduce the security of universi-
ties campuses [17]. To overcome these problems and to conform to Saudi Arabia
vision 2030, the universities should take the advantages of the rapid improvement of
IoT technology and its applications especially in the education sector. On the other
hand, recent developments in IoT have heightened the need for IoT applications and
different Internet-connected smart devices. But unfortunately, there is a massive gap
in universities in Saudi Arabia to implement this technology as a service for each
learner and employee and a way to facilitate university management and make it
smarter.
Most studies have only focused on specific IoT applications and their benefits, and
the few of them that discussed the IoT state of the art, future opportunities, and
challenges were carried out only in specific universities within specific
countries. This paper is distinguished from other studies in that it discusses the
state of the art, future opportunities, and open challenges of IoT applications in
government universities located in Riyadh, Saudi Arabia. Moreover, it aims to investigate the plan, usage,

and implementation of the Internet of Things (IoT) in government universities, iden-


tify the existing limitations and challenges of implementing IoT-based applications,
illustrate the forecasted risks and benefits of using IoT in universities, and deter-
mine future opportunities of IoT applications in Saudi Arabia universities located
in Riyadh. This paper seeks to address the following research questions: (1) What
is IoT state of the art in the main government universities of Riyadh, Saudi Arabia?
(2) What are the main challenges and limitations of applying IoT applications in
universities of Riyadh, Saudi Arabia? (3) What are the benefits of applied IoT in the
universities of Riyadh, Saudi Arabia? (4) What are the forecasted risks of adapting
IoT in the universities of Riyadh, Saudi Arabia? (5) What are the future opportunities
and IoT applications in the universities of Riyadh, Saudi Arabia?
Finally, the findings of this paper would be a reference for university leaders,
who would gain knowledge of IoT applications, the benefits and opportunities of
implementing this technology, and the different challenges and limitations that
they may face in adopting it on their campuses. Furthermore, it will help the
employees of the IT deanships in Saudi Arabia universities to consider the
importance of IoT and how it would help them achieve digital transformation in line
with Saudi Arabia's Vision 2030.

75.2 Literature Review

75.2.1 IoT in Education and Universities

Several tools, strategies, Internet technologies, social media, etc., have driven
education innovation. The Internet supports education in different ways, and IoT is
the advanced generation of this technology. The prevalent interaction between
objects, people, and the environment is enabled by IoT, which is a new technology
and paradigm in the education sector [5]. According to a research report [56], the
global market size of IoT in education is expected to grow to $11.3 billion in
2023. In addition, there are many IoT vendors who offer various IoT solutions in
the education sector.
Many researchers have argued that IoT technology has changed universities in many
different areas [11]. This technology can enhance things and make them smarter,
which complements education in universities in different ways [57].
Moreover, it is able to gather data from actuators, sensors, and wearable devices to
perform significant action. It also provides dynamic services for university commu-
nity members [11]. In addition, the usage of IoT in universities has brought a lot
of opportunities and possibilities to enhance process and quality of learning [5] and
teaching [2], improve the learning outcomes, student performance, and administrative
process [7], improve universities infrastructure [2], enable learners to explore their
campus, personalized interaction with lecturers and other learners [11], and create
innovative and new ideas in their life [2]. Furthermore, university management is
interested in developing a classy campus for academics and learners and in making

access to classrooms, laboratories, and other locations in the universities more secure
for them. The campus also needs to adapt and embed IoT technology which helps
responsible in the university to track, control, and manage everything occurs within
its campus [11].
In addition, and compared with conventional campus, smart campus reduces effort
and operational cost and provides services at the proper time. It will adopt innova-
tive technologies to track and control facilities on campus automatically and provide
high-quality services to learners and staff. This led to increasing the responsive-
ness and efficiency of the campus, learners experience, space utilization, and making
better decisions [6]. A lot of universities have adapted IoT technology in their campus
such as the University of Melbourne in Australia turns to IoT technology to under-
stand campus use, provide accurate data sources for researcher, and utilize university
campus as a test bed for research [32]. Furthermore, University of New South Wales
(UNSW) in Australia integrates IoT technology and smart sensors into facilities of
Kensington campus to manage campus operations, increase energy saving and enable
new learning modes, and meet learners’ requirements [27]. As well, the Arizona State
University (ASU) has used the IoT to provide high-quality education, smart dorm,
and comfortable living for its learners [38].
In this era of fast-moving technology, learners are more willing and demanding
to utilize innovative teaching and learning methods. In addition, they will be looking
forward to surviving and living in the new sustainable, innovative, and smart univer-
sity campus environment. This will lead to enhance the efficiency and delivery
of daily activities with consideration of the environmental and social interactions
[45]. In authors’ views, universities in Saudi Arabia need to improve the quality of
learning, teaching, and management processes by using IoT. Smart university and
smart campus can attract learners, lecturers, and staffs by using different integrated
smart devices which will give an enormous impact on education sector in Saudi
Arabia.

75.2.2 Applications of IoT in Universities and Other Education Institutions

Smart education can be delivered by the current IoT applications. These
applications are boundless, and we are seeing them in smart universities and other
smart educational institutions in many countries [20]. They help to make university
facilities such as campuses, buildings, classrooms, libraries, offices, and
laboratories smarter than before [45]. Moreover, applying them in universities will
help to enhance all educational institutions [20] and bring them a lot of valuable
benefits, such as saving and managing energy and resources, tracking the health and
safety of university learners, enhancing the environments of campuses, classrooms,
and other facilities, enhancing learning and teaching, automating learners'
attendance [11], tracking and monitoring transport [20], and more. Table 75.1
illustrates IoT applications and their benefits in universities and other
educational institutions.
Table 75.1 IoT applications and their benefits in the universities and other educational institutions

(1) Smart classroom
Benefits: 1. Create a productive teaching and learning environment [2, 59]. 2. Support security and foster safety of the class community [59]. 3. Decrease distraction from learning and help learners to keep task focused [59]. 4. Facilitate and organize the stream of learning activities [59]. 5. Make learners feel more welcome and comfortable [59]. 6. Build a good relationship between learners and lecturers and offer them the opportunity to be motivated and engaged in the teaching process [59]. 7. Provide innovative methods to manage classrooms [2]. 8. Automate the education process and enable learners to learn in any place and at any time [17]. 9. Enhance learner achievement [14]. 10. Reduce energy usage [14].
Example institutions that have used the application: Miami-Dade County Public Schools in Miami, Florida [14].

(2) Smart laboratory
Benefits: 1. Track, control, and manage electrical devices and systems in the laboratory easily [58]. 2. Increase security and efficiency of energy [58]. 3. Provide comfort to the learners, lecturers, and other staff [58]. 4. Automate lab resources and reduce the human effort [58].

(3) Smart library
Benefits: 1. Create a smart library system [30]. 2. Control and manage all library information [25]. 3. Track transactions and manage library processes [25, 30]. 4. Create a secured library asset [30]. 5. Track students, lecturers, and staff in the library [42]. 6. Enhance the efficiency of operations and improve user learning experiences [30]. 7. Improve experiences and comfort [30].
Example institutions: Inonu University Central Library [42].

(4) Smart attendance
Benefits: 1. Track learners' attendance in an effective way [5]. 2. Save lecturers' time and effort in taking student attendance [54]. 3. Let lecturers focus on teaching, solve learners' problems, and answer their questions [22].
Example institutions: Arizona State University (ASU) [41]; The Penn State University [44].

(5) Safe learning environment
Benefits: 1. Manage learners' access to the university [29]. 2. Automate tracking and monitoring to create a secure and safe place in the university [29].
Example institutions: Arizona State University (ASU) [38]; Sookmyung Women's University (SWU) [33].

(6) Smart parking
Benefits: 1. Enhance the parking facilities [4]. 2. Reduce fuel consumption [18] and drivers' time and effort in searching for a parking slot [18].

(7) Smart vehicles
Benefits: 1. Manage and track vehicles effectively [47]. 2. Ensure safety and security of learners by reporting the exact location of the bus and driver speeding [53].
Example institutions: The Raytown School District in Raytown, Missouri [53].

(8) Smart building
Benefits: 1. Reduce energy consumption and cost [55]. 2. Decrease electricity bills and increase building efficiency [55]. 3. Provide a safer and more comfortable environment for the university learners and staff [55]. 4. Provide better building management and better decisions [55].
Example institutions: The University of Washington (UW) [41]; The University of Hawaii [31]; Birmingham City University in the UK [16].

(9) Smart dorm
Benefits: 1. Improve security [39]. 2. Save energy [39]. 3. Make dorm life more comfortable for learners [39].
Example institutions: Arizona State University (ASU) and Saint Louis University (SLU) [39]; New York University (NYU) [35].

(10) Smart water and waste management
Benefits: 1. Reduce costs [52]. 2. Improve efficiency [52]. 3. Manage water and waste services easily [52]. 4. Conserve water [51] and reduce unnecessary water loss [52]. 5. Solve the water wastage problem [51].
Example institutions: KLE Technological University, India [43].

(11) Smart university stores
Benefits: 1. Eliminate paper labels and make it easier to alter or change products' prices within minutes by using electronic labels [48]. 2. Personalized advertisements [48]. 3. Control and manage temperatures [48]. 4. Help retailers track and manage their inventory [48].

(12) Smart teaching
Benefits: 1. Supplant papers, chalkboards, and pencils to enable new efficient instructional methods [57]. 2. Enable education institutions to use the learner-centered approach [19]. 3. Enhance teaching and engage more learners in the education process at the same time [64].

(13) Smart gym
Benefits: 1. Help to increase members' attendance [15]. 2. Enhance the overall exercise experience [15]. 3. Allow instructors to understand the needs of gym members [15]. 4. Provide gym owners with accurate records of how much equipment has been used [12]. 5. Track a single group member's work rate and health status [15]. 6. Enable members to record [15] and share their success with health professionals or trainers [12] by using fitness applications and wearable devices. 7. Evaluate member movements and give correct guidance [15] to prevent sports-related injuries [12].

(14) Smart management of university offices
Benefits: 1. Provide an attractive and comfortable workplace to enhance employees' productivity [8]. 2. Streamline employees' routine tasks [13]. 3. Automate workplace functions and reduce or eliminate human involvement in office management [8]. 4. Save energy [13]. 5. Increase safety of the workplace [13]. 6. Enable communication and collaboration easily [13].

(15) Smart university stadium
Benefits: 1. Provide university teams and their audience with valuable information, for instance parking availability, seat upgrades, the length of waiting lines, and special and personalized offers [28]. 2. Enable the audience to order refreshments, know the status of their orders from their seats, and receive detailed information to navigate crowded stadiums [28]. 3. Monitor and track all the venue corners and keep teams and audience safe [28]. 4. Optimize energy use [28]. 5. Monitor behavior of people and quickly identify and detect unruly audience behavior [28]. 6. Prevent unauthorized fan access to sensitive areas [28].
Example institutions: Sun Devil Stadium on the Arizona State University (ASU) campus in the USA [41].

(16) Smart health
Benefits: 1. Reduced care cost [29]. 2. Track and monitor healthcare of learners to enhance the quality of health care [37].
Example institutions: Oral Roberts University (ORU).

(17) Smart news management
Benefits: 1. Facilitate the campus community to quickly check information on smart digital devices [26]. 2. Participants can engage in any topic, comment on comments, and share university news at any time and from any place [26].

75.2.3 IoT Impact, Challenges, and Opportunities in Universities and Other Education Institutions

To help educational institutions with their activities, a lot of schools and
universities have utilized advanced and new technologies to enhance education
quality. In fact, IoT brings huge opportunities to universities and other
educational institutions, along with a lot of issues and challenges that need to be
addressed [40]. There are few papers that have presented IoT opportunities,
benefits, and challenges in the education sector. The authors in [2, 6, 9] presented
surveys discussing IoT challenges and their impact on the future of education. They
found that educational institutions may have to face some difficulties impeding the adoption of IoT, such as
reliable and efficient Wi-Fi connection, network bandwidth, privacy, security, web
analytics, availability of students’ devices, management of applications and devices,
cost of devices and equipment, lecturers training, system integration, information
processing, interoperability, fear of introducing new technologies, and poor battery
life. Moreover, they found that IoT technology will enhance learning and teaching
process in the future. Learners will study better where this technology has improved
user interfaces of physical objects, as a tool to exploratory learning and as devices
to collect data about the education process, and lecturers will be able to carry out
their duties more efficiently, monitor learners progress, and help them to understand
hard concepts. Furthermore, educational institutions have the chance to save time
and cost by promoting smart management of water, energy and waste, automate
maintenance by notifying maintenance team to take proper action at a proper time,
protect the environment by bringing effective building and campus surveillance and
incidents warnings, attain efficient parking which will help learners and staff to find
the closest available car parking, automate attendance monitoring for learners and
staff, and provide learners a map of the campus to help them to navigate the campus
effectively.
On the other hand, authors in [3] discussed state of the art, possibilities and
opportunities, and limitations of applying smart university buildings, classroom,
library, and laboratory in the Hajee Mohammad Danesh Science and Technology
University (HSTU) in Bangladesh. They found that the concept of IoT is yet a
very hard model for Bangladesh because of some main common barriers such as
infrastructure, equipment, and devices availability, technological, network, storage
and software, security, and legal, cost optimization barriers. Also, using IoT will
help universities especially HSTU to automate and facilitate learning and teaching
processes, manage, and monitor university classrooms, buildings, laboratory, and
library efficiently, automate student attendance, reduce electricity consumption, and
track laboratory and library objects. In addition, author in [7] presented a case study
that discussed IoT in education in Malaysia. They found IoT technology can enhance
education and attract learners in an effective learning experience. It also makes
learners more active when they can communicate by using smart devices. Learners can
also share knowledge in real time, which can enhance the learning and research
process. Furthermore, this technology is used to automate administration activities
and reduce time and effort. In addition, they found that educational institutions
may face some challenges in implementing this technology in the education sector,
such as the capability to integrate IoT equipment in the classroom, the difficulty
of managing IoT programs, and the security and privacy of sensitive data of learners.

Table 75.2 Participants' characteristics (Participant: Role, University, Experience)
P1: Application department administrator, Imam Mohammad Ibn Saud Islamic University, 1 year and 8 months
P2: CIO, King Saud University, 1 year and 5 months
P3: Operation and maintenance manager, Princess Nourah Bint Abdul Rahman University, 10 years
P4: Dean of IT Department, Saudi Electronic University, 5 years
P5: IT operation manager, King Saud bin Abdulaziz University for Health Sciences, 4 years

75.3 Research Method

The researchers used a qualitative method design. Structured open-ended interviews
were utilized to collect data. All interviews were conducted by phone with five
informants and experts selected for their knowledge of IT systems and applications
in Saudi Arabia public universities located in Riyadh. Table 75.2 shows the
characteristics of the participants. In this piece of work, qualitative analysis
was performed using the Grounded Theory method. Through the processing and analysis
of the collected data, the researchers used proportional analysis to explore the
impact of the important results. The data analysis process followed these steps:
(1) conduct the interview, (2) transcribe the detailed information into word
processing files, (3) code the collected data into themes and place them into
categories, and (4) generalize the questions and explain them in the light of the
current study.

75.4 Research Results

1. Universities working on IoT

Regarding universities working on the IoT, all participants agreed that they had no idea which universities in Riyadh had applied it, as this participant explained.

Fig. 75.1 State of the art of the IoT in public universities in Riyadh (has used IoT: 20%; does not use IoT: 80% of participants)

“I have no idea, but I think Princess Nourah Abdul Rahman University has a
direction to implement IoT applications” (P5).
2. State of the art of the IoT
One participant reported that a smart IoT attendance system is already in use, whereas the others reported that the IoT has not been used on their campuses so far, as these participants explained.
“Yes, we use Smart IoT attendance system. Where the system can detect
attendance of students once they enter the campus” (P4).
“Not used in the university until now but there is a direction to adapt a smart IoT
attendance system on the university campus” (P1) (Fig. 75.1).
3. Impact of the IoT
All participants agreed that the IoT has a positive impact, as this participant explained.
“Very high positive impact whenever they find the best use for it” (P2).
All participants stated that the IoT would reduce costs for the university, as these participants explained.
“It has a positive impact where it is an alternative for using the old methods, and
it will reduce cost for the long term” (P5).
Two participants added that the IoT would improve the quality of the services provided to the university community, since it helps to provide a lot of information related to its members, as this participant explained.
“It has a positive impact. Because it will help to decrease cost and provide high-
quality services to all students and employees in the university based on collected
data from multiple sources” (P3) (Fig. 75.2).
4. University community satisfaction
All participants agreed that the satisfaction of the university community would increase when the IoT is used, as they explained.

Fig. 75.2 Impact of the IoT on universities (reducing cost: 100%; improving the quality of provided services: 40% of participants)



“Yes, it will increase user satisfaction. Because they need and interest of smart
and new services and applications” (P1).
“Yes, it will. Because it will make their life easier” (P3).
5. Possibility to shift to IoT
All participants agreed that a shift to the IoT in Saudi Arabia's universities is possible, as these participants explained.
“Yes, it is easy but Top management should be aware of the importance of IoT
and set their plan to implement it” (P3).
“Yes, and we need such this technology to achieve vision 2030” (P5).
6. Support use of the IoT
All participants expressed support for using the IoT, finding it a convenient way to facilitate processes and procedures.
7. Promote use of IoT
One participant reported promoting the use of the IoT by training users to increase adoption, and another by working according to a clear strategy, as this participant explained.
“We have a clear strategy to adapt new technology such as IoT and we work on
that” (P3).
Two participants reported promoting the use of the IoT by submitting proposals for projects that adopt it in the university, as this participant explained.
“Yes, we promote use of the IoT by searching for IoT applications and submitting
proposals to university management” (P4).
Three participants reported promoting the use of the IoT by setting up an infrastructure for the IoT and the university systems (Fig. 75.3).
Fig. 75.3 Promote use of the IoT (set infrastructure: 60%; submit IoT project proposals: 40%; work with a clear strategy: 20%; train users: 20% of participants)

8. Challenges of the implementation of IoT
Four participants cited a suitable infrastructure as a challenge of implementing the IoT in universities. One participant cited old devices, the integration between systems and devices, and, additionally, the monopolization of the IoT applications by a specific vendor. Three participants cited poor experience and a lack of specialists, which calls for raising awareness and training users. One participant cited device maintenance and choosing the best IoT systems, and another cited privacy and security, as these participants explained.
“Yes, there are a lot of challenges to implement IoT in the university and the most
challenge that we face is an infrastructure. We need to make university infrastructure
suitable for IoT. In addition, privacy and security” (P1).
“I think infrastructure, identifying and choosing the best IoT systems, device
maintenance and users training are the most challenges of IoT” (P2).
“Not all devices, we have old devices and systems and applications in the univer-
sity are not ready for IoT. Also, Integration between them is not easy. Moreover,
users’ awareness and monopolization of the IoT applications” (P5).
One participant cited the lack of clear standards, as this participant explained.
“Yes, Infrastructure and there are not clear standards” (P3).
Another participant cited the distance between buildings (a large campus) and high isolation (Fig. 75.4).
9. Limitations of implementation of the IoT
Four participants cited the budget as a limitation of implementing the IoT in universities. One participant cited the infrastructure capacity, and another cited the support from top management, as these participants explained.
“Yes, I think budget is the most limitation of applying IoT in university” (P1).
“Budget and university management may see that this technology is luxury and
not primary, so we need their support” (P4) (Fig. 75.5).

Fig. 75.4 Challenges of the implementation of IoT (categories: suitable infrastructure; old devices; monopolization of the IoT applications; poor experience and lack of specialists; devices maintenance; choosing the best IoT systems; privacy and security; clear standards; distance between buildings (large campus) and high isolation; integration between systems and devices)



Fig. 75.5 Limitations of the implementation of IoT (budget: 80%; infrastructure capacity: 20%; support from top management: 20% of participants)

Fig. 75.6 Risks of applying IoT (privacy: 80%; security: 80%; protect data from hacking and unauthorized access: 80%; users resistance: 20%; gap in integration and accurate infrastructure: 20% of participants)

10. Risks of applying IoT


Four participants cited privacy, security, and protecting data from hacking and unauthorized access as risks of implementing the IoT in the university. One participant cited users' resistance to change, and another cited a gap in integration and in an accurate infrastructure (Fig. 75.6).
11. Benefits of IoT
All participants agreed that IoT applications in the university would provide many benefits. All of them stated that the IoT would reduce cost, time, and effort and increase the quality of services and outcomes. One participant added that it would decrease the consumption of energy and water, another that it would increase efficiency, and another that it would improve monitoring, support taking the right decisions, and help people with special needs, as this participant explained.
“Once the requirements matching possibilities of systems, the benefits will be
achieved: Improve monitoring for everything on the campus especially end users,
take the right decision based on analyzing valuable collected data from different
resource, reduce cost, time and effort, and help people with special needs to move
on the campus in an easy way” (P2) (Fig. 75.7).
12. Future opportunities for IoT applications
All participants agreed that there are future opportunities for IoT applications in Saudi Arabia's universities, as these participants explained.

Fig. 75.7 Benefits of IoT (categories: reduce cost, time, and effort; increase quality of services and outcomes; decrease consumption of energy and water; increase efficiency; improve monitoring; take the right decision; help people with special needs)

“Yes, I think. These applications will change university community life and will
reduce consumption cost of university” (P1).
“Yes, it is the government direction to apply the technologies in the different
sectors” (P2).
13. Successfulness of IoT
All participants agreed that IoT applications will be successful in the coming years in Saudi Arabia's universities, as these participants explained.
“Yes, the implementation of IoT and other technologies are the direction of all
the world” (P1).
“Yes, and I think it depends on implemented applications” (P2).
“Yes, like this technology will support digital transmission. In addition, there is a need and support from government” (P4).
14. IoT applications in universities
Three participants cited the smart attendance system, smart tracking and monitoring system, and smart parking as IoT applications that will change education and institutions in the future. One participant cited a smart bus system, two participants cited smart classrooms, laboratories, and buildings, one participant cited smart identity, and another cited smart devices for the disabled, as this participant explained.
“I think there are many IoT applications that will change university future such
as smart monitoring system which helps to monitor and track students, employees,
devices and anything on the campus, smart attendance system which helps to reduce
time and effort of lecturers and smart devices to helps people with special needs to
move in the campus” (P2) (Fig. 75.8).

Fig. 75.8 IoT applications in universities (categories: smart attendance system; smart tracking and monitoring system; smart parking system; smart bus system; smart classroom; smart laboratory; smart building; smart identity; smart devices for the disabled)

75.5 Result Discussion

The primary aims of this research were (1) to investigate the state of the art of the implementation of the IoT in government universities in Riyadh, (2) to illustrate the existing limitations of and challenges to implementing IoT-based applications in the universities of Saudi Arabia, (3) to identify the forecasted risks and benefits of using the IoT, and, finally, (4) to determine the future opportunities for IoT applications in universities.

The results of this research show that none of the government universities in Riyadh apply IoT technology on their campuses, except Saudi Electronic University, which has applied a smart IoT attendance system. However, all of these universities have a direction toward digital transformation and the adoption of advanced technologies such as the IoT, since such technologies help to achieve Vision 2030, improve efficiency, and increase the quality of education and institutions [50]. In addition, there are several challenges that Saudi Arabia's universities need to address, such as a suitable infrastructure, old devices, the integration between systems and devices, the monopolization of the IoT applications and choosing the best ones, poor experience and a lack of specialists, device maintenance, clear standards, and the distance between buildings.

In fact, the success of the IoT university in Saudi Arabia will face some limitations that need to be overcome before IoT technology is implemented, namely the budget, the infrastructure capacity, and the support from top management. To overcome them, the universities should set a plan to change the infrastructure to make it suitable for the IoT. The university should focus on the components of an IoT infrastructure, which are devices to generate data, an IoT network to transmit data, data storage to store the data, and cloud computing to analyze big data, together with applications that support actions [49].
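To make these components concrete, the following minimal sketch strings them together for a single hypothetical campus sensor: a simulated device generates a reading, a stand-in "network" step serializes it as it would be transmitted, an in-memory store takes the place of data storage, and a trivial "cloud" step analyzes the stored data and suggests an action. The sensor name, threshold, and in-memory store are illustrative assumptions, not part of the study or of reference [49].

import json
import random
import statistics

# Device layer: a hypothetical classroom sensor generates a temperature reading.
def generate_reading(sensor_id: str) -> dict:
    return {"sensor": sensor_id, "temperature_c": round(random.uniform(18.0, 30.0), 1)}

# Network layer: serialize the reading as it would be transmitted, e.g., as a JSON payload.
def transmit(reading: dict) -> str:
    return json.dumps(reading)

# Storage layer: an in-memory stand-in for a database or data lake.
data_store: list[dict] = []

def store(payload: str) -> None:
    data_store.append(json.loads(payload))

# Cloud/analytics layer: analyze the stored data and suggest an action for a campus application.
def analyze_and_act(threshold_c: float = 26.0) -> str:
    average = statistics.mean(r["temperature_c"] for r in data_store)
    return "increase cooling" if average > threshold_c else "no action needed"

for _ in range(10):
    store(transmit(generate_reading("classroom-101")))
print("suggested action:", analyze_and_act())

In a real deployment the "network" step would use a protocol such as MQTT or HTTP and the store would be a managed database, but the division of responsibilities mirrors the device, network, storage, and cloud components listed above.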
In addition, to choose the best IoT application and make the right decision, the university should follow these steps: (1) determine the university's needs, (2) create a list of requirements for the IoT application, (3) search for applications, (4) exclude unsuitable applications, (5) evaluate the selected applications, and (6) choose the right application that addresses the university's needs [46].
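One simple way to operationalize steps (2)-(6) is to score each candidate application by how many of the listed requirements it covers, exclude candidates that miss a mandatory requirement, and choose the highest-scoring remainder, as in the sketch below. The requirement names, the candidate applications, and the scoring rule are purely hypothetical illustrations, not recommendations from the study or from reference [46].

# Hypothetical requirements; entries marked True are mandatory and must be covered.
requirements = {
    "attendance tracking": True,
    "integrates with campus Wi-Fi": True,
    "energy monitoring": False,
    "Arabic interface": False,
}

# Hypothetical candidate applications and the requirements each one covers.
candidates = {
    "SmartCampus A": {"attendance tracking", "integrates with campus Wi-Fi", "Arabic interface"},
    "SmartCampus B": {"attendance tracking", "energy monitoring"},
    "SmartCampus C": {"integrates with campus Wi-Fi", "energy monitoring", "Arabic interface"},
}

def choose_application(reqs: dict, apps: dict):
    mandatory = {name for name, must in reqs.items() if must}
    # Step (4): exclude candidates that do not cover every mandatory requirement.
    eligible = {app: covered for app, covered in apps.items() if mandatory <= covered}
    if not eligible:
        return None
    # Steps (5)-(6): score by total requirement coverage and choose the best-scoring application.
    return max(eligible, key=lambda app: len(eligible[app] & reqs.keys()))

print("chosen application:", choose_application(requirements, candidates))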
Furthermore, to overcome the poor experience and increase the employees' experience and knowledge, the university should (1) provide appropriate training, (2) share knowledge and improve internal communications, and (3) set up regular meetings. Moreover, the university should create an effective regular maintenance plan by defining and analyzing the situation, establishing support documentation that defines the new processes, roles, and responsibilities in detail, training employees until the new processes have become the way the work is done, conducting an assessment and identifying lessons learned, and developing a plan to support the change [36]. This will help universities reduce device maintenance costs, improve device efficiency, extend device life, increase productivity, and decrease downtime [34]. Additionally, the IT department staff in a university should gain support from top management by, for example, holding an executive workshop, showing clear cases, and conducting surveys that highlight the issues [10].
In fact, the future of the IoT university in Saudi Arabia will be successful because it will provide many benefits, such as reducing cost, time, and effort; increasing the quality of services and outcomes; decreasing the consumption of energy and water; increasing efficiency; improving the monitoring of anything on the campus; and supporting the right decisions based on the valuable information obtained by analyzing the collected data. Fundamentally, the responsible parties will be able to make the shift to the IoT university through a joint effort of the Ministry of Education, the top management of the universities, and the IT departments. Furthermore, there are several risks that need to be handled, including privacy, security, protecting data from hacking and unauthorized access, users' resistance to change, and a gap in integration and in an accurate infrastructure. On the other hand, the future opportunities for IoT applications in universities in Saudi Arabia are numerous. The main IoT applications that can change education and institutions in the future in Saudi Arabia include the smart attendance system, smart tracking and monitoring system, smart parking, smart bus system, smart classroom and laboratory, smart building, smart identity, and smart devices for the disabled. Finally, universities need to provide the platforms and tools that make use of the IoT university, ensure the security and privacy of the collected data, automate the infrastructure and make it suitable for the IoT and for the integration between systems and devices, set a plan to train the workers, and have clear policies (to exchange and maintain data and ensure business continuity).

75.6 Conclusion

Various universities are now realizing the importance of introducing IoT technology on their campuses and in their facilities. This research presents a literature review showing the importance of the IoT in education and universities, IoT applications and their benefits and challenges, and the opportunities of the IoT in universities and other educational institutions. Moreover, this research uses the interview method to achieve the research aim, which is to present the benefits, future opportunities, challenges, and applications of the Internet of Things that have been adopted in government universities in Riyadh, Saudi Arabia. The results show that, apart from one smart attendance system, IoT technology has not yet been applied in the government universities in Riyadh. In addition, the study highlights the future opportunities, the benefits, the challenges that need to be addressed, the risks that need to be handled, and the limitations that need to be overcome for IoT-based applications to be implemented in universities in Saudi Arabia.

References

1. Xheladini, A., Deniz Saygili, S., Dikbiyik, F.: An IoT-based smart exam application. IEEE
EUROCON 2017-17th International Conference on Smart Technologies, Ohrid, pp 513–518
(2017)
2. Gul, S., Asif, M., Yasir, M., Malik, M., Majid, M., Ahmad S.: A survey on role of internet
of things in education. IJCSNS Int. J. Computer Sci. Network Sec. 17(5):159–165 (2017).
Available at: http://paper.ijcsns.org/07_book/201705/20170520.pdf [Accessed: 19 Feb 2019]
3. Sultan, M., Ali, E., Ali, M., Habib, M.: Smart campus using IoT with Bangladesh perspective:
A possibility and limitation. Int. J. Res. Appl. Sci. Eng. Tech. 5, 1681–1690 (2017). Available
at: https://doi.org/10.22214/ijraset.2017.8239 [Accessed: 19 Feb 2019].
4. Khanna, A., Anand, R.: IoT based smart parking system. 2016 International Conference on
Internet of Things and Applications (IOTA), Pune, pp 266–270 (2016)
5. Mrabet, H.E., Moussa, A.A.: Research and design of smart management system in class-
room. Proceedings of ACM Mediterranean Symposium on Smart City Applications, Tangier,
Morocco, pp 1–26 (2017)
6. Abuarqoub, A., et al.: A survey on internet of things enabled smart campus applications.
Proceedings of the International Conference on Future Networks and Distributed Systems,
Cambridge, United Kingdom, pp 1–50 (2017)
7. Jusman, M.F.B., Mastan, N.B.M.: A case study review: future of Internet of Things (IoT)
in Malaysia. ASCENT International Conference Proceedings—Information Systems and
Engineering, pp. 82–95 (2017)
8. Biage, E.: How a smart office can increase your productivity. Iottechtrends.com (2019)
[Online]. Available at: https://www.iottechtrends.com/how-smart-office-increase-productiv
ity/ [Accessed: 29 Oct 2019].
9. Domínguez, F., Ochoa, X.: Smart objects in education: An early survey to assess opportuni-
ties and challenges. 2017 Fourth International Conference on eDemocracy & eGovernment
(ICEDEG), Quito, pp. 216–220 (2017)
10. Morgan, J.: 6 ways to build management support for collaboration [online] Information-
Week. Available at: https://www.informationweek.com/government/6-ways-to-build-manage
ment-support-for-collaboration [Accessed: 5 Sep 2021]
11. Majeed, A., Ali, M.: How Internet-of-Things (IoT) making the university campuses smart? QA
higher education (QAHE) perspective. 2018 IEEE 8th Annual Computing and Communication
Workshop and Conference (CCWC), Las Vegas, NV, 646–648 (2018)
12. Sward, H.: Gyms of the future: How IoT will change fitness and sports. dzone.com
[Online]. Available: https://dzone.com/articles/the-gyms-of-future-how-iot-will-change-wor
kout-and [Accessed: 23 Oct 2019]
13. Aleksandrova, M.: IoT in the workplace: smart office applications for better productivity. IoT
For All [Online]. Available at: https://www.iotforall.com/iot-smart-office-applications/amp/.
[Accessed: 29 Oct 2019]
14. Lighting the way to better learning outcomes Cisco (2016). [Online]. Available
at: https://www.cisco.com/c/dam/en/us/products/collateral/enterprise/operational-efficiency/
miami-dade-schools-voc-case-study.pdf. [Accessed: 19 Feb 2019]
15. IoT trends: How will they affect the future of gyms?. Perfect Gym (2018) [Online].
Available: https://www.perfectgym.com/en/blog/business/iot-trends-how-will-they-affect-fut
ure-gyms [Accessed: 23 Oct 2019].
16. Sims, B.: Smart thinking at Birmingham City University drives sustainability and system
interoperability—Risk UK. Risk UK (2016) [Online]. Available at: https://www.risk-uk.com/
birmingham-city/. [Accessed: 20 Feb 2019]

17. Zhamanov, A., Sakhiyeva, Z., Suliyev, R., Kaldykulova, Z.: IoT smart campus review and
implementation of IoT applications into education process of university. 2017 13th International
Conference on Electronics, Computer and Computation (ICECCO), Abuja, pp 1–4 (2017)
18. Cynthia, J., Bharathi Priya, C., Gopinath, P.: IOT based smart parking management system.
Int. J. Recent Tech. Eng. (IJRTE) 7(4), 374–379 (2018)
19. Patel, M.: Internet of Things solutions: 6 applications to transform the education sector—
eLearning industry. eLearning Industry (2018) [Online]. Available at: https://elearningind
ustry.com/internet-of-things-solutions-applications-transform-education-sector. [Accessed: 8
Sep 2019]
20. Kalluri, R.: The applications of IoT in education. MS&E 238 Blog leading trends in information
technology (2017) [Online]. Available at: https://mse238blog.stanford.edu/2017/07/rahulkal/
the-applications-of-iot-in-education/. [Accessed: 31 Jul 2019].
21. Internet of things in the world of school. Medium (2019) [Online]. Available at: https://med
ium.com/sciforce/internet-of-things-in-the-world-of-school-15492e3e2a90. [Accessed: 30 Jul
2019]
22. Shanmugasundaram, M.: The role of IoT in providing security, efficiency & accessibility in
education. Happiest Minds, pp. 3–5 (2014) [Online]. Available at: https://www.happiestm
inds.com/wp-content/uploads/2016/05/The-Role-of-IoT-in-Providing-Security-Efficiency-
and-Accessibility-in-Education.pdf [Accessed: 7 Oct 2019]
23. Meola, A.: How IoT in education is changing the way we learn. Business Insider (2016)
[Online]. Available at: https://www.businessinsider.com/internet-of-things-education-2016-9.
[Accessed: 17 Mar 2019]
24. Abed, S., Alyahya, N., Altameem, A.: IoT in education: its impacts and its future in Saudi
Universities and educational environments. First International Conference on Sustainable Tech-
nologies for Computational Intelligence. Advances in Intelligent Systems and Computing, vol.
1045. Springer, Singapore, pp 47–62 (2019)
25. Pandey, J., Kazmi, S.I.A., Hayat, M.S., Ahmed, I.: A study on implementation of smart library
systems using IoT. 2017 International Conference on Infocom Technologies and Unmanned
Systems (Trends and Future Directions) (ICTUS), Dubai, pp. 193–197 (2017)
26. Muhamad, W., Kurniawan, N.B., Suhardi, Yazid, S.: Smart campus features, technologies, and
applications: A systematic literature review. 2017 International Conference on Information
Technology Systems and Innovation (ICITSI), Bandung, pp. 384–391 (2017)
27. University of NSW becomes testbed for IoT and smart city tech. Smart Energy International
(2016) [online] Available at: https://www.smart-energy.com/regional-news/australia-new-zea
land/university-nsw-becomes-testbed-iot-smart-city-tech/ [Accessed: 19 Oct 2019]
28. McLaughlin, M.: How sports and entertainment venues use IoT to make stadiums smarter.
biztechmagazine.com (2018) [Online]. Available at: https://biztechmagazine.com/article/2018/
05/how-sports-entertainment-venues-use-IoT-make-stadiums-smarter. [Accessed: 21 Oct
2019]
29. Bagheri, M., Movahed, S.: The effect of the internet of things (IoT) on education business
model. 2016 12th International Conference on Signal-Image Technology & Internet-Based
Systems (SITIS), Naples, pp. 435–441 (2016)
30. Upala, M., Wong, W.: IoT solution for smart library using facial recognition. IOP Conference
Series: Materials Science and Engineering 495, 012030 (2019)
31. Hawaii creates ‘energy-smart university’ with new energy IoT platform. Smart Energy Inter-
national (2019) [online]. Available at: https://www.smart-energy.com/industry-sectors/smart-
energy/hawaii-creates-energy-smart-university-with-new-energy-iot-platform/. [Accessed: 19
Oct 2019]
32. Crozier, R.: Melbourne uni turns to IoT to understand campus use. iTnews (2018) [online].
Available at: https://www.itnews.com.au/news/melbourne-uni-turns-to-iot-to-understand-cam
pus-use-499875 [Accessed: 19 Oct 2019]
33. Kang, Y.: S Korea’s KT explores Internet of Things on campus, 4 August (2014) [Online].
Available at: https://asia.nikkei.com/Business/S-Korea-s-KT-explores-Internet-of-Things-on-
campus.

34. BAASS (2019). The importance of planned maintenance [online]. Available at: https://www.
baass.com/blog/the-importance-of-planned-maintenance [Accessed: 5 Sep 2021]
35. Student housing energy management solutions. Telkonet [online]. Available at: https://www.
telkonet.com/markets/student-housing/ [Accessed: 19 Oct 2019].
36. Hupjé, E.: How to implement maintenance planning and scheduling [online]. Road to
Reliability. Available at: https://roadtoreliability.com/implement-maintenance-planning-sch
eduling/ [Accessed: 5 Sep 2021]
37. Straumsheim, C.: Wearable, 1 April 2015 [Online]. Available at: https://www.insidehighered.
com/news/2015/04/01/oral-roberts-u-smartwatches-provide-entry-internet-things
38. Gutierrez, D.: Dorm of the future: student residences can change with IoT.
Collegepuzzle.stanford.edu. (2017) [online]. Available at: https://collegepuzzle.stanford.edu/
dorm-of-future-student-residence-can-change-with-iot/ [Accessed: 19 Oct 2019]
39. Wedeking, S.: Student housing is a Nascent market for smart home IoT technology. Navigantre-
search.com (2019) [online]. Available at: https://www.navigantresearch.com/news-and-views/
student-housing-is-a-nascent-market-for-smart-home-iot-technology [Accessed: 19 Oct 2019]
40. Pratama, A.Y.N., Zainudin, A., Yuliana, M.: Implementation of IoT-based passengers moni-
toring for smart school application. 2017 International Electronics Symposium on Engineering
Technology and Applications (IES-ETA), Surabaya, pp. 33–38 (2017)
41. Raths, D.: ‘Smart’ campuses invest in the Internet of Things. Campus Technology (2017)
[online]. Available at: https://campustechnology.com/Articles/2017/08/24/Smart-Campuses-
Invest-in-the-Internet-of-Things.aspx?m=1&Page=1 [Accessed: 19 Oct 2019]
42. Sabancı, K., Yigit, E., Üstün, D., Toktaş, A., Çelik, Y.: Thingspeak based monitoring IoT system
for counting people in a library. 2018 International Conference on Artificial Intelligence and
Data Processing (IDAP), Malatya, Turkey, pp. 1–6 (2018)
43. Kulkarni, V., Kadakol, S.P., Hiremath, S., Uppin, S.: IoT-based RF framework for smart
water management system. Iotchallengekeysight.com [online]. Available at: https://www.iot
challengekeysight.com/2019/entries/smart-water/205-0515-022653-iot-based-rf-framework-
for-smart-water-management-system [Accessed: 19 Oct 2019]
44. Fusco, D.: Penn State University taking attendance using beacons. Evothings, 22 Mar (2016)
[Online]. Available at: https://evothings.com/penn-state-university-taking-attendance-using-
beacons/. [Accessed: 18 Jul 2019]
45. Mahmood, S., Palaniappan, S., Hasan, R., Sarker, K.U., Abass, A., Rajegowda, P.M.: Raspberry
PI and role of IoT in education. 2019 4th MEC International Conference on Big Data and Smart
City (ICBDSC), Muscat, Oman, pp. 1–6 (2019)
46. MiddleStone (2017). 7 steps to choosing software that is right for your business [online]. Avail-
able at: https://www.middlestone.ltd/blog/steps-to-choosing-software-that-is-right-for-your-
small-business [Accessed: 5 Sep 2021]
47. Bhat, S.: IOT application in education. Int. J. Advance Research Development. 2(6), 20–24
(2017)
48. Bandoim, L.: How smart shelf technology will change your supermarket. forbes.com
(2018) [online]. Available at: https://www.forbes.com/sites/lanabandoim/2018/12/23/how-
smart-shelf-technology-will-change-your-supermarket/#7a7c0b64114c [Accessed: 19 Oct
2019]
49. Devasia, A.: The basics of IoT infrastructure [online]. Control Automation (2021). Available at:
https://control.com/technical-articles/the-basics-of-iot-infrastructure/ [Accessed: 5 Sep 2021]
50. Unified National Platform. Digital transformation [online]. Available at: https://www.my.gov.
sa/wps/portal/snp/aboutksa/digitaltransformation [Accessed: 3 Sep 2021]
51. Rangnekar, P.: How can IoT help in water management system?. Smart Water Magazine
[online]. Available at: https://www.google.com/amp/s/smartwatermagazine.com/blogs/parija-
rangnekar/how-can-iot-help-water-management-system%3famp [Accessed: 19 Oct 2019]
52. Smart waste management and smart water management. Telit [online]. Available at: https://
www.telit.com/industries-solutions/smart-cities-smart-transportation/waste-and-water-man
agement/ [Accessed: 19 Oct 2019]

53. Raytown: Improve bus behavior. Kajeet.net (2018) [Online]. Available at: https://www.kaj
eet.net/success-stories/raytown-school-bus-wifi?__hstc=215977800.3abac98271a3dc9b36
14c446ae55a636.1563845768708.1563845768708.1563845768708.1&__hssc=215977800.
1.1563889210452&__hsfp=1248828304&hsCtaTracking=fb56b642-4339-4c79-9be8-e96
d2e9cc9c5%7Cb21ac8ea-47b0-4ad2-a3c7-25c7e961a3d3. [Accessed: 24 Jul 2019]
54. IoT in education: Internet of Things applications, benefits and importance of IoT in education.
IoTDunia.com (2019) [Online]. Available at: https://iotdunia.com/iot-in-education-internet-of-
things-applications-benefits-importance-iot-in-education/ [Accessed: 3 Aug 2019]
55. Govil, N.: IoT enabled Smart Building Technology. IoTDunia-Helping you to succeed with
the Internet of Things (2017) [Online]. Available at: https://iotdunia.com/smart-building-tec
hnology/. [Accessed: 5 Aug 2019]
56. IoT in education market by component (Hardware, solutions & services), end user (academic
institutions & corporates), application (learning management, classroom management, admin-
istration management & surveillance), and region—Global Forecast to 2023. Marketsandmar-
kets.com [Online]. Available at: https://www.marketsandmarkets.com/Market-Reports/iot-edu
cation-market-115520843.html. [Accessed: 11 Oct 2019]
57. Thomas, M.: The connected classroom: 9 examples of IoT in education. Built In (2019)
[Online]. Available at: https://builtin.com/internet-things/iot-education-examples. [Accessed:
27 Jul 2019]
58. Poongothai, M., Karupaiya, A.L., Priyadharshini, R.: Implementation of IoT based Smart
Laboratory. Int. J. Computer Applications 182(15), 31–34 (2018)
59. Tin, HHK.: Role of Internet of Things (IoT) for smart classroom to improve teaching and
learning approach. Int. J. Res. Innovation in Applied Science (IJRIAS), 45–49 (2019)
Author Index

A Dave, Jaykumar, 179


Agasti, Sarmistha, 43 Deepak Kumar, P., 313
Ahlawat, Neha, 295 Deshmukh, Asmita S., 283
Ahmed, Afsana, 133 Deshmukh, Ganesh, 81
Aljaber, Bader, 821 Devisha, T., 325
Alyahya, Norah, 821 Dhumane, Amol, 415
Ambala, Srinivas, 415 Diop, Ibrahima, 565
Amesara, Sameer, 25 Diwakar, 271
Angelin Pricila, S. A., 325 Dixit, Rashmi K., 259
Dongardive, Jyotshna, 753
Doshi, Nishant, 61, 71
B Dubey, Nandini, 463
Bafna, Prafulla B., 379 Dwivedi, Rajendra Kumar, 305
Bahuguna, Ashutosh, 451
Barve, Yashoda, 653
Behal, Sunny, 123 E
Bhandari, Abhinav, 155 Ezhilarasan, V., 313
Bhavsar, Hetal, 25
Bhonsle, Mansi, 735
Biradar, Rajkumar L., 475
Birajadar, Parmeshwar, 705 F
Farshid, Mohammad, 133

C
Castelo, Lourdes Emperatriz Paredes, 775 G
Chakrabarty, Amitabha, 785 Gadre, Vikram, 705
Chanchlani, Akshita, 7 Gaikwad, Hema, 653
Chaudhary, Shweta, 35, 541 Ganapa, Meghana, 743
Chauhan, Ankit, 111 Gandhi, Ravi, 399
Chavan, Janhavi, 463 Gandhi, Savita R., 165
Chide, Mrunal, 735 Gandhi, Shriji V., 399
Chiwhane, Shwetambari, 415 Ghosh, Rupon Kumar, 785
Gindani, Prashant, 25
Gopal, Shubhang, 199
D Gowsick, S. B., 339
Dahiya, Pawan Kumar, 729 Gupta, Rahul, 199, 221
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Smart Innovation, Systems
and Technologies 311, https://doi.org/10.1007/978-981-19-3571-8

H Kumar, Mukesh, 155


Hanabar, Yash, 605 Kumar, Pawan, 1, 357
Handur, Vidya, 231 Kumar, P. Nithish, 493
Haria, Meet, 705 Kumar, Prajjwal, 221
Harlapur, Nehabanu H., 231
Hegde, Nagaratna P., 743
Hudagi, Manjunath R., 475 L
Langalia, Parth, 15
Lunagaria, Munindra, 553
I
Islam, A.K.M. Muzahidul, 133
M
Madhumitha, C., 325
Malo, Sadouanouan, 565
J
Mangat, Veenu, 1
Jaeyalakshmi, M., 485
Mangore Anirudh, K., 415
Jahirabadkar, Sunita, 389
Manmode, Aishwarya, 735
Jain, Dax, 165
Meena, C., 367
Jain, Ekta, 35, 541
Mishra, Samir, 735
Jain, Priyanka Sohanlal, 15
Mohd, Noor, 617, 643
Jain, Sonal, 187
Mohideen, S. Ismail, 91
Janani, T. A., 589
Mondal, Bidhan, 43
Jani, Nidhi, 61
Monika, 1
Jayaprakash, Swathi, 797
Moradiya, Hemali, 577
Jerish, B., 339
More, Aashay, 429
Jeyavani, M., 145
Mukku, Chandrakala, 629
Jha, Girish Nath, 693
Jha, Ravi Shankar, 811
Jivani, Anjali, 25 N
Jose Triny, K., 313 Nagpal, Arpita, 677
Naidu, Paruchuri Chandra Babu, 775
Nandhini, L. K., 485
K Nayak, Sinkon, 51
Kaazima Ifrah, Mirza, 247 Nikunj, Panchal Mital, 179
Kadam, Radha, 81
Kamble, Anjali, 81
Karim, Injamamul, 133 P
Karthika, M., 367 Padmini Devi, B., 339
Karthika, R. A., 515 Panchal, Mitali, 111
Karuppasamy, M., 145 Pandey, Manjusha, 51
Kashyap, Neha, 541 Pandita, Rohit, 735
Kaur, Dupinder, 211 Pandit, Jyoti Kumari, 43
Kaushik Karthikeyan, K., 367 Pandya, Aneri, 507
Kavitha, C. R., 247 Pandya, Killol Vishnuprasad, 507
Kedia, Deepak, 729 Pant, Bhaskar, 617, 643
Kejriwal, Sandeep, 529 Patel, Sandip, 347
Khokhar, Sahil, 729 Patel, Sohamkumar, 25
Khuspe, Aditi, 81 Patel, Upesh, 507
Kollipara, Anisha, 743 Patil, Srushti, 187
Kosamkar, Pranali, 463 Patil, Yaminee, 187
Kotecha, Govind, 463 Pattabiraman, V., 797
Kotecha, Ketan, 653 Pavitha, N., 429
Kovendan, A. K. P., 775 Pavithra, S., 589
Kumar, Dilip, 357 Phalke, Aditi, 81
Kumar, M. S. V. Sashi, 743 Phalke, Dhanashree, 389

Popat, Kalpesh, 577 Shriram, E., 339


Prasanna, A., 91 Shrivastava, Atul Lal, 305
Prasanth, S., 493 Shukla, Dhirendra Kumar, 91
Preethy, R., 589 Singh, Anjali, 605
Premkumar, M., 91 Singh, Dilbag, 211
Priya, Potti, 247 Singh, D. P., 97
Singh, Gunjan, 677
Singh, Jagdeep, 123
R Sinha, Manish, 719
Raicha, Ruchi, 187 Sinha, Shagun, 693
Rajagopalan, Narendran, 529 Sireesha, V., 743
Rajan, Annie, 753 Soma, Shridevi, 475
Rajan, S. Deepa, 515 Sreekrishna, M., 485
Rajaram, J., 339 Srimugi, V., 367
Raj, Deep, 271 Sriram, Nuthi, 247
Rajeshram, V., 367 Suresh, Aksheya, 485
Raj, Swetank, 429 Suriya, S., 313
Rathod, Dushyantsinh B., 179 Surya, R., 493
Ratnaparkhi, Pranav, 429 Sushitha, S., 665
Raut, Anjali B., 283 Swaminathan, J. N., 775
Rautaray, Siddharth S., 51
Rawat, Deepika, 541
Reddy, Seelam Meghana, 763 T
Relia, Maharshi, 347 Taneja, Prabhsimar Singh, 199
Thakare, Vilas M., 7
Thakker, Manish, 399
Thamizharasan, S., 775
S Thapliyal, Siddhant, 97
Sable, Nilesh P., 605 Tharani, S., 325
Sahoo, Priti Ranjan, 811 Tiwari, Pallavi, 617, 643
Sahoo, Sandipkumar, 259 Traore, Yaya, 565
Saini, Jatinderkumar R., 379, 653 Trawina, Halguieta, 565
Sai Satyanarayana Reddy, S., 763 Tripathy, Arvind, 811
Saluja, Vipul, 753
Samak, Aditya R., 379
Sanjana, Seelam, 763 U
Sanjeev, Manasa, 247 Upadhyaya, Trushit, 507
Santhi, P., 325 Upadhyay, Deepak, 617, 643
Santhosh Kumar, M., 313 Uzair, Azfar, 429
Santhosh, Miriala, 629
Satani, Jigna, 165
Sathasivam, Thilagamani, 589 V
Savani, Nidhi, 553 Varshney, Shipra, 35, 541
Sehrawat, Navdeep, 221 Vasava, Dhaval, 71
Shaha, Rachit P., 259 Vasuja Devi, M., 775
Shah, Dhruv, 347 Vats, Prashant, 35, 541
Shah, Paras, 463 Vel, S. Siva Sakthi, 493
Shah, Smeet, 25 Vikram, R., 493
Shah, Zankhana H., 15 Vinod, D. Franklin, 295, 441
Shandilya, Piyush, 221 Vinothini, A., 485
Sharma, Jaya, 441
Sharma, Nishi, 35, 541
Shetty, Chethan, 665 W
Shiddike, Jasrin, 133 Wable, Siddhesh, 605

Wadhai, Vijay M., 7 Yadav, Priybhanu, 199


Wadhwa, Jasmine Kaur, 187 Yasaswi, Dutta, 797
Wanve, Omkar, 605 Yesmin, Tahamina, 43
Wazid, Mohammad, 97
Wazir, Samar, 451

Y Z
Yadav, Prathamesh, 429 Zambre, Pranav, 111
