
Proceedings in Adaptation, Learning and Optimization 15

Harish Sharma
Vijay Kumar Vyas
Rajesh Kumar Pandey
Mukesh Prasad
Editors

Proceedings
of the International
Conference
on Intelligent Vision
and Computing
(ICIVC 2021)
Proceedings in Adaptation, Learning
and Optimization

Volume 15

Series Editor
Meng-Hiot Lim, Nanyang Technological University, Singapore, Singapore
The roles of adaptation, learning and optimization are becoming increasingly
essential and intertwined. The capability of a system to adapt either through
modification of its physiological structure or via some revalidation process of
internal mechanisms that directly dictate the response or behavior is crucial in many
real world applications. Optimization lies at the heart of most machine learning
approaches while learning and optimization are two primary means to effect
adaptation in various forms. They usually involve computational processes
incorporated within the system that trigger parametric updating and knowledge
or model enhancement, giving rise to progressive improvement. This book series
serves as a channel to consolidate work related to topics linked to adaptation,
learning and optimization in systems and structures. Topics covered under this
series include:
• complex adaptive systems including evolutionary computation, memetic com-
puting, swarm intelligence, neural networks, fuzzy systems, tabu search, sim-
ulated annealing, etc.
• machine learning, data mining & mathematical programming
• hybridization of techniques that span across artificial intelligence and compu-
tational intelligence for synergistic alliance of strategies for problem-solving.
• aspects of adaptation in robotics
• agent-based computing
• autonomic/pervasive computing
• dynamic optimization/learning in noisy and uncertain environment
• systemic alliance of stochastic and conventional search techniques
• all aspects of adaptations in man-machine systems.
This book series bridges the dichotomy of modern and conventional mathematical
and heuristic/meta-heuristics approaches to bring about effective adaptation,
learning and optimization. It propels the maxim that the old and the new can
come together and be combined synergistically to scale new heights in
problem-solving. To reach such a level, numerous research issues will emerge
and researchers will find the book series a convenient medium to track the
progress made.
Indexed by INSPEC, zbMATH.
All books published in the series are submitted for consideration in Web of Science.

More information about this series at https://link.springer.com/bookseries/13543


Harish Sharma · Vijay Kumar Vyas ·
Rajesh Kumar Pandey · Mukesh Prasad

Editors

Proceedings
of the International
Conference on Intelligent
Vision and Computing
(ICIVC 2021)

Editors

Harish Sharma
Department of Computer Science and Engineering
Rajasthan Technical University
Kota, India

Rajesh Kumar Pandey
Department of Mathematical Sciences
Indian Institute of Technology (BHU) Varanasi
Varanasi, Uttar Pradesh, India

Vijay Kumar Vyas
Sur University College
Sur, Oman

Mukesh Prasad
School of Computer Science
University of Technology Sydney
Sydney, NSW, Australia

Proceedings in Adaptation, Learning and Optimization
ISSN 2363-6084          ISSN 2363-6092 (electronic)
ISBN 978-3-030-97195-3  ISBN 978-3-030-97196-0 (eBook)
https://doi.org/10.1007/978-3-030-97196-0
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Contents

Assessment of Weight Factor in Genetic Programming Fitness
Function for Imbalanced Data Classification . . . . . . . . . . . . . . . . . . . . . 1
Arvind Kumar, Shivani Goel, Nishant Sinha, and Arpit Bhardwaj
Mass Transfer Past an Exponentially Stretching Surface
with Variable Wall Concentration and MHD in Porous Medium . . . . . 10
Praveen Kumar Dadheech, Priyanka Agrawal, Anil Sharma,
Kottakkaran Sooppy Nisar, Mahesh Bohra, and S. D. Purohit
A Novel Approach of Using Materialized Queries for Retrieving
Results from Data Warehouse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Sonali Chakraborty
A Trust-Based Mechanism to Improve Security of Wireless
Sensor Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Sangeeta Rani, Dinesh Kumar, and Vikram Singh
Multikernel Support Vector Machine Approach with Probability
Distribution Analysis for Classifying Parkinson Disease Using
Gait Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Arunraj Gopalsamy and B. Radha
Impact of Convolutional Neural Networks for Recognizing Facial
Expressions: Deep Learning Perspective . . . . . . . . . . . . . . . . . . . . . . . . 74
Ridhima Sabharwal and Syed Wajahat Abbas Rizvi
Handwritten Bengali Digit Classification Using Deep Learning . . . . . . . 85
Amitava Choudhury and Kaushik Ghosh
IoT Based COVID Patient Health Monitoring System
in Quarantine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Rajat Kumar, Shivam Dixit, Japjeet Kaur, Kriti, and Krishna Murari Singh


Self-attention Convolution for Sparse to Dense Depth Completion . . . . . 105
Tao Zhao, Shuguo Pan, and Hui Zhang
Using Algorithm in Parametric Design as an Approach to Inspire
Nature in Architectural Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Mohamed Ibrahim Abdelhady, Ayman K. Abdelgadir, Fatma Al-Araimi,
and Khulood AL-Amri
Sybil Account Detection in Social Network Using Deep
Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Preety Verma, Ankita Nigam, Garima Tiwari, and G. Mallesham
Docker Container Orchestration Management: A Review . . . . . . . . . . . 140
Jigna N. Acharya and Anil C. Suthar
Locally Weighted Mean Phase Angle (LWMPA) Based Tone Mapping
Quality Index (TMQI-3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
Inaam Ul Hassan, Abdul Haseeb, and Sarwan Ali
Age Estimation of a Person by Compound Stratum Practice
in ANN Using n-Sigma Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
M. R. Dileep, Ajit Danti, and A. V. Navaneeth
Improved Artificial Fish School Search Based Deep Convolutional
Neural Network for Prediction of Protein Stability upon
Double Mutation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
Juliet Rozario and B. Radha
Statistical Distribution and Socio-Economics in Accordance
with the Indian Stock Market in the COVID19 Scenario . . . . . . . . . . . . 193
Bikram Pratim Bhuyan and Ajay Prasad
Business, Finance and Decision Making Process - The Influence
of Culture on Foreign Direct Investments (FDI) . . . . . . . . . . . . . . . . . . . 207
Zoran Đikanović and Anđela Jakšić-Stojanović
Towards a Problematization Framework of 4IR Formalisms:
The Case of QUALITY 4.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
John Andrew van der Poll
Investment or Gambling in the Crypto Market: A Review . . . . . . . . . . 227
Aditi Singh
Solving Partial Differential Equations on Radial Basis Functions
Networks and on Fully Connected Deep Neural Networks . . . . . . . . . . . 240
Mohie M. Alqezweeni, Roman A. Glumskov, Vladimir I. Gorbachenko,
and Dmitry A. Stenkin

Smart Technologies to Reduce the Spreading of COVID-19: A Survey
Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Abdul Cader Mohamed Nafrees, P. Pirapuraj,
M. S. M. Razeeth, R. K. A. R. Kariapper, and Samsudeen Sabraz Nawaz
Development of a Real Time Wi-Fi Based Autonomous Corona
Warrior Robot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
P. Shubha and M. Meenakshi
The Impact of Job Satisfaction Level on Employee Work Performance
in the Pharmaceutical Industry: An Empirical Study . . . . . . . . . . . . . . . 281
Geeta Kumari, Jyoti Kumari, and K. M. Pandey
Vocal Psychiatric Simulator for Public Speaking Anxiety Treatment . . . 299
Sudhanshu Srivastava and Manisha Bhattacharya
IoT Automation Test Framework for Connected Ecosystem . . . . . . . . . 309
Chittaranjan Pradhan, Sunil A. Kinange, Jayavel Kanniappan,
and Rajesh Kumar Jayavel
Application of Magic Squares in Cryptography . . . . . . . . . . . . . . . . . . . 321
Narbda Rani and Vinod Mishra
Application of Artificial Intelligence in Waste Classification
Management at University . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
Dongxu Qu
The Distributed Ledger Technology as Development Platform
for Distributed Information Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
Itzhak Aviv
Design of Chebyshev Bandpass Waveguide Filter for E-Band Based
on CSRR Metamaterial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
Mahmoud Abuhussain and Ugur Cem Hasar
The Pedagogical Aspect of Human-Computer Interaction in
Designing: Pragmatic Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
Zahra Hosseini, Kimmo Hytönen, and Jani Kinnunen
A Brief Literature Review About Bioinspired Metaheuristics to Solve
Vehicle Routes Problem with Time Window . . . . . . . . . . . . . . . . . . . . . 377
Braynner Polo-Pichon, Alexander Troncoso-Palacio,
and Emiro De-La-Hoz-Franco
Markov Decision Processes with Discounted Costs: Improved
Successive Over-Relaxation Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
Abdellatif Semmouri, Mostafa Jourhmane,
and Bahaa Eddine Elbaghazaoui

Control Quality Analysis in Accordance with Parametrization in MPC
Automation System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
Tomas Barot, Mikhail Perevozchikov, Marek Kubalcik, Jaromir Svejda,
and Ladislav Rudolf
Enhancing the Social Learning Ability of Spider Monkey
Optimization Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
Apoorva Sharma, Nirmala Sharma, and Kavita Sharma
Multi-objective Based Chan-Vese Method for Segmentation of Mass
in a Mammogram Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
Pramod B. Bhalerao and Sanjiv V. Bonde
Securing the Adhoc Network Data Using Hybrid Malicious Node
Detection Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
Atul B. Kathole and Dinesh N. Chaudhari
Embedded Digital Control System of Mobile Robot with Backlash
on Drives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Eugeny Larkin and Aleksandr Privalov
A Novel Approach in Breast Cancer Diagnosis with
Image Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
Nahida Nazir, Baljit Singh Saini, and Abid Sarwar
Model Based Model Reference Adaptive Control of Dissolved Oxygen
in a Waste Water Treatment Process . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
Mohamed Bahita, Mouatez Bilah M’Haimoud, and Abdelmoula Ladjabi
Breast Cancer Diagnosis Using Deep Learning . . . . . . . . . . . . . . . . . . . 490
Salman Zakareya and Habib Izadkhah
Topological Data Analysis - A Novel and Effective Approach
for Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
Dhananjay Joshi, Kapil Kumar Nagwanshi, Nitin S. Choubey,
Milan A. Joshi, and Sunil Pathak
Object Detection Using Microsoft HoloLens by a Single Forward
Propagation CNN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
Reza Moezzi, David Krcmarik, Jindřich Cýrus, Haythem Bahri,
and Jan Koci
Significance of Dimensionality Reduction in Intrusion Detection
Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518
Ghanshyam Prasad Dubey, Rakesh Kumar Bhujade, and Puneet Himthani
Comparative Analysis of Bioactive Compounds for Euphorbia Hirta L.
Leaves Extract in Aqueous, Ethanol, and Methanol Solvents
Using GC-MS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 530
Shikha Dixit and Sugandha Tiwari

An Efficient Edge Localization Using Sobel and Prewitt Fuzzy
Inference System (FIS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
Ali A. Al-Jarrah and R. Bremananth
Application of Artificial Intelligence in Recommendation Systems
and Chatbots for Online Stores in Fast Fashion Industry . . . . . . . . . . . 558
Meshal Alduraywish, Bhuvan Unhelkar, Sonika Singh, and Mukesh Prasad
Application of Artificial Intelligence in Healthcare by Industries
in Australia: Opportunities and Challenges . . . . . . . . . . . . . . . . . . . . . . 568
Priyanka Singh, Aruna S. Manjunatha, Ayesha Baig, Pooja Dhopeshwar,
Huan Huo, Gnana Bharathy, and Mukesh Prasad

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581


Assessment of Weight Factor in Genetic
Programming Fitness Function
for Imbalanced Data Classification

Arvind Kumar1(B), Shivani Goel1, Nishant Sinha2, and Arpit Bhardwaj3

1 Department of Computer Science Engineering, Bennett University, Noida, India
ak2815@bennett.edu.in, arvind.jki@gmail.com
2 Pitney Bowes Software, Noida, India
3 Mahindra University, Hyderabad, India

Abstract. In real-world data classification, applications often have an
imbalanced distribution of data over various classes. This imbalanced
distribution imposes intense challenges, and because of this, traditional
classification methods are not effective in this case. This problem also
influences genetic programming (GP). One approach to resolve this issue
is to assign a custom high weight to the classes during training. This cus-
tom weight assignment may nullify the impact of higher counts of any
classes during the learning phase of the classifier. The GP fitness func-
tion may introduce the custom weight assignment for the minority class
samples. The fitness function performs an essential role in GP and influ-
ences each building block of GP. This research work assesses the impact
of weight factors in GP’s fitness function for imbalanced data classifi-
cation. For this assessment, eight imbalanced classification problems are
taken from the UCI repository, and intensive experimentation is done on
the different weight factors.

Keywords: Unbalanced data classification · Genetic programming ·
Weight assignment · Fitness function

1 Introduction
A data classification task is said to be an imbalanced data classification when the
number of samples differs significantly across the classes [1,2]. In
binary imbalanced data classification, many samples belong to one class (major-
ity class), and a significantly lower number of samples belong to the other class
(minority class). This binary imbalanced data classification is common in the real
world and creates difficulty for any machine learning technique. In the imbal-
anced scenario, most classification algorithms strongly favour the
majority class [3,4]. Therefore, designing classification algorithms suited for
imbalanced data classification is an active and independent field of research.
Imbalance classification problems include, but are not limited to, medical diag-
nosis, natural disaster, fraud detection, image segmentation, etc.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
H. Sharma et al. (Eds.): ICIVC 2021, PALO 15, pp. 1–9, 2022.
https://doi.org/10.1007/978-3-030-97196-0_1

To handle the classification of the imbalanced data-set, three types of
approaches are common. The first approach focuses on the over-sampling or
under-sampling of samples belonging to the minority or majority class, respec-
tively. This approach tries to counterbalance the skewed sample distribution over
different classes [5,6]. The second approach, also called the ensemble-based
approach, combines different well-known algorithms to produce a better result. The
third approach, also called the internal approach, focuses on the modification
of existing classification techniques internally or designing a new algorithm to
tackle the imbalance nature of data [7–9].
Genetic Programming (GP) is an evolutionary technique based on the Dar-
winian principles of natural selection and the evolution of the species that survive
in nature [10,11]. GP automatically produces computer programs represented as
trees in memory. GP is successfully used as a solution to solve numerous classi-
fication problems. However, like other ML techniques, GP is also influenced by
this imbalanced data distribution over different classes and generates biased clas-
sifiers. The biased classifiers give higher accuracy performance for the majority
class due to the higher sample count of the majority class.
An unequal distribution of class samples in the training (learning) process biases
the learning algorithm towards the majority class.
Resolving this biased learning of the classifier so that it gives precise results over
each class becomes a vital part of the research. In GP, the learning process of
classifiers is governed by the fitness function. Much research work in the area of
GP has focused on directly adapting the fitness function to address class imbal-
ance problems. One approach may be to calculate the classification errors and
assign a different weightage to different class samples so that the learning bias of
the classifier may be nullified [12–16]. However, assigning a custom weight to the
minority class without domain-specific knowledge of the data set is challenging.
In this work, we performed extensive experiments on eight imbalanced prob-
lems based on the UCI repository data-set to assess the impact of weight factor
in fitness function for generating a classifier that gives accurate results over each
class.

Table 1. Summary of unbalanced data-set used

Name   Imbalanced ratio   Features   Minority count   Majority count   Total count   Description
ABL-18 99:1 8 42 4135 4177 Abalone (class 18 vs all)
ABL-9-18 94:6 8 42 689 731 Abalone (class 9 vs 18)
ABL-9 83:17 8 689 3488 4177 Abalone (class 9 vs all)
YEAST2 89:11 8 163 1321 1484 Yeast (MEA3 vs all)
YEAST1 84:16 8 244 1240 1484 Yeast (MIT vs all)
ION 64:36 34 126 225 351 Ionosphere
SPECT 59:41 22 110 157 267 SPECT heart
SONAR 53:47 60 97 111 208 Sonar

2 Benchmark Data-Sets
In this research work, eight imbalanced problems with various levels of the imbal-
ance factor are taken. These imbalanced classification problems are based on the
publicly available UCI data-set repository [17]. Three problems ABL-18, ABL-
9-18, and ABL-9 are based on the Abalone data-set. The ABL-18 is created by
class 18 versus other class samples, ABL-9 is created by class 9 versus other
class samples, and ABL-9-18 is created by class 9 versus class 18 sam-
ples. YEAST-1 is generated by the MIT class versus other classes' samples, and
YEAST-2 is generated by the ME2 class versus other classes' samples. The ION,
SPECT, and SONAR problems represent the ionosphere data set, the single-photon
emission computed tomography (SPECT) heart data set, and the sonar data set,
respectively. For our extensive experimentation, we have taken data sets with
various imbalance ratios, varying from 99:1 to 53:47. These imbalanced
problems are summarized in Table 1.
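As an illustration (not the authors' preprocessing), an Abalone-based problem such as ABL-18 can be built by relabelling one class against all others; the file name and column layout below are assumptions about the raw UCI files.

```python
import pandas as pd

cols = ["sex", "length", "diameter", "height", "whole_wt",
        "shucked_wt", "viscera_wt", "shell_wt", "rings"]
df = pd.read_csv("abalone.data", names=cols)          # UCI Abalone data file
df["sex"] = df["sex"].map({"M": 0, "F": 1, "I": 2})   # encode sex to keep 8 features

# ABL-18: class 18 (rings == 18) versus all other samples
y = (df["rings"] == 18).astype(int)    # 1 = minority class, 0 = majority class
X = df.drop(columns=["rings"]).to_numpy()
print(y.value_counts())                # roughly 42 minority vs 4135 majority
```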

Fig. 1. A sample GP program [18]

3 Genetic Programming
Genetic programming (GP) is a nature-motivated algorithm proposed by Koza
[11,18]. The algorithm is inspired by Darwin's principle of survival of the fittest in
nature. The primary constructs of GP are individuals, termed programs. These
programs express various mathematical formulas for solving the considered prob-
lem. A program is expressed as a tree in memory (Fig. 1). Processing of a set
of programs is performed until the satisfaction criteria are met or for a fixed
number of generations. This process is conducted using three genetic operators:
mutation, reproduction, and crossover. These three operators are inspired by
nature, where a better individual can generate a copy of itself and be elected to
go into the next generation (called reproduction), two individuals can produce a new
individual, which carries some genes from the first and some genes from the second
(called crossover), and, in rare cases, some genes become mutated in the child
(called mutation). The working of these operators relies on an evaluation function
termed the fitness function. Depending on the admitted problem, the goal will
be to either maximize or minimize the value of the fitness function. Genera-
tion by generation, the value of the fitness function converges. Thus, the central
steps of GP are population initialization and processing of this population for a
fixed number of generations by utilizing the nature-inspired operations: crossover,
reproduction, and mutation. The GP framework is summarized in Fig. 2, and a
minimal sketch follows the figure.

Fig. 2. GP framework
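The cycle of Fig. 2 can be made concrete with a minimal, self-contained Python sketch (an illustration, not the authors' implementation). Programs are nested tuples, the reproduction/crossover/mutation rates follow Table 2 later in this paper, and truncation selection stands in for whatever selection scheme was actually used; `fitness` is any function mapping a tree to a value to be minimized.

```python
import random
import operator

FUNCS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
         "/": lambda a, b: a / b if abs(b) > 1e-9 else 1.0}  # protected division
N_FEATURES = 8  # e.g. the eight features of the Abalone-based problems

def rand_tree(depth=4):
    """Grow a random program tree: ('op', left, right) or a terminal."""
    if depth == 0 or random.random() < 0.3:
        return random.choice([("x", random.randrange(N_FEATURES)),  # feature
                              ("c", random.uniform(-1.0, 1.0))])    # constant
    op = random.choice(list(FUNCS))
    return (op, rand_tree(depth - 1), rand_tree(depth - 1))

def evaluate(tree, x):
    """Evaluate a program tree on one feature vector x."""
    tag = tree[0]
    if tag == "x":
        return x[tree[1]]
    if tag == "c":
        return tree[1]
    return FUNCS[tag](evaluate(tree[1], x), evaluate(tree[2], x))

def paths(tree, path=()):
    """Enumerate positions of all subtrees (children sit at tuple slots 1, 2)."""
    yield path
    if tree[0] in FUNCS:
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, sub):
    if not path:
        return sub
    t = list(tree)
    t[path[0]] = put(tree[path[0]], path[1:], sub)
    return tuple(t)

def crossover(a, b):  # graft a random subtree of b onto a random spot of a
    return put(a, random.choice(list(paths(a))),
               get(b, random.choice(list(paths(b)))))

def mutate(a):        # replace a random subtree with a fresh random tree
    return put(a, random.choice(list(paths(a))), rand_tree(2))

def gp(fitness, pop_size=200, generations=100):
    pop = [rand_tree() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness)       # minimise the fitness value
        parents = ranked[:pop_size // 4]        # truncation selection
        nxt = ranked[:pop_size // 10]           # reproduction: best 10% copied
        while len(nxt) < pop_size:
            # remaining 90% of slots split 70:20 between crossover and mutation
            if random.random() < 0.70 / 0.90:
                nxt.append(crossover(*random.sample(parents, 2)))
            else:
                nxt.append(mutate(random.choice(parents)))
        pop = nxt
    return min(pop, key=fitness)
```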

3.1 Fitness Function


In GP, accuracy may be taken as the standard fitness function, with the objective
of converging the fitness value of the best program to 100%. As accuracy treats
the samples of each class blindly, this leads to biased learning toward the majority
class [19]. Therefore, we take Eq. 1 as the fitness function, which is based on
the Euclidean distance between predicted and expected values and assigns
different weightage to the different classes. We term this the Euclidean
distance and weight-based fitness function.
F_{edwb} = (1 - W) \sum_{i=1}^{N_{maj}} \frac{|dist_{maj_i}|^2}{2\,N_{maj}} + W \sum_{i=1}^{N_{min}} \frac{|dist_{min_i}|^2}{2\,N_{min}}   (1)

where

W: weight value for the minority class samples,
N_{maj}: the majority samples count,
N_{min}: the minority samples count,
dist_{maj_i}: distance of the actual value from the predicted value for the i-th majority class sample,
dist_{min_i}: distance of the actual value from the predicted value for the i-th minority class sample.

Equation 1 has two parts. The first part accommodates the performance eval-
uation for the majority class, and the second part accommodates the perfor-
mance evaluation of the minority class. In summary, it gives different weightage
to the different class samples.
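A direct NumPy transcription of Eq. (1) could look as follows; treating the program outputs and target values as arrays is our assumption about the encoding.

```python
import numpy as np

def fitness_edwb(predicted, expected, is_minority, w=0.5):
    """Euclidean-distance and weight-based fitness of Eq. (1); lower is better.

    predicted, expected : 1-D arrays of program outputs and target values
    is_minority         : boolean array marking the minority-class samples
    w                   : weight W given to the minority class
    """
    d2 = (predicted - expected) ** 2                  # |dist_i|^2 per sample
    n_min = is_minority.sum()
    n_maj = (~is_minority).sum()
    maj_part = d2[~is_minority].sum() / (2 * n_maj)   # first term of Eq. (1)
    min_part = d2[is_minority].sum() / (2 * n_min)    # second term of Eq. (1)
    return (1 - w) * maj_part + w * min_part
```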

4 Experimental Details
Our objective is to assess the impact of the weight factor in the GP fitness func-
tion for generating a classifier that gives accurate results over each class. For
that, we performed extensive experiments on different imbalanced problems with
different weight factors. We performed these experiments by assigning minority
class weight values of 0.2, 0.4, 0.5, 0.6, and 0.8. For each weight value, 30 exper-
iments are executed and mean AUC values are calculated for each imbalanced
problem. In each experiment, 80% data is used for training, and 20% data is
used to evaluate the classifier’s performance. This partitioning is done randomly
for each experiment. In all these experiments GP parameter values are given in
Table 2.
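A hedged sketch of this protocol is given below; `train_gp` stands in for the GP training loop sketched earlier (its name and signature are hypothetical), `evaluate` is the tree evaluator from that sketch, and the splitting and AUC computation use scikit-learn.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

def mean_auc(X, y, minority_weight, runs=30):
    """Mean/std AUC over 30 random 80/20 train-test splits for one W value."""
    aucs = []
    for seed in range(runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.20, random_state=seed)          # random 80/20 split
        best_tree = train_gp(X_tr, y_tr, w=minority_weight)   # hypothetical trainer
        scores = [evaluate(best_tree, x) for x in X_te]       # program outputs
        aucs.append(roc_auc_score(y_te, scores))
    return np.mean(aucs), np.std(aucs)

# e.g.: for w in (0.2, 0.4, 0.5, 0.6, 0.8): print(w, mean_auc(X, y, w))
```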

5 Results and Discussions


Accuracy treats the performance of each class blindly and uses the total sam-
ple count irrespective of how many samples each class has. As majority class
samples have higher sample counts, performance in terms of accuracy is always
biased toward the majority class. A better performance measure for imbal-
anced classification is the area under the ROC (receiver operating characteristic)
curve, known as AUC. The AUC gives similar consideration to both minority
and majority classes when evaluating the performance of an algorithm [20,21].

Table 2. GP parameter setting

Parameter Value
Population initialization Ramp half and half
Population size 200
Maximum generation 100
Max tree height 15
Min tree height 5
Function-set +, −, ∗, /, sigmoid
Tree terminals Feature variables & random constants
Reproduction 0.10
Mutation 0.20
Crossover 0.70

As AUC is a non-biased performance measurement, we evaluate our results based
on the AUC values.
Experimental results are summarized in Table 3, in which the mean value of
AUC and the standard deviation is given. For visual comparison Fig. 3 shows
the bar chart of average AUC values for each imbalanced problem with different
minority class weight values. For the minority class weight vector, W = [0.2, 0.4,
0.5, 0.6, 0.8], the AUC values for ABL-18 dataset are 0.570, 0.734, 0.752, 0.767,
and 0.702 respectively. For this weight vector W, the AUC values for ABL-9-18
dataset are 0.718, 0.801, 0.829, 0.807, and 0.774 respectively. For ABL-9 dataset
and weight vector W, the AUC values are 0.500, 0.578, 0.646, 0.615, and 0.540
respectively. For the same weight vector W, YEAST-1 dataset the AUC values
are, 0.699, 0.772, 0.773, 0.748, and 0.628 respectively. For YEAST-2 dataset and
the same weight vector W, the AUC values are 0.879, 0.915, 0.924, 0.906, and
0.892 respectively. For ION dataset, and the same weight vector W, the AUC
values are 0.852, 0.859, 0.868, 0.854, and 0.704 respectively. For SPECT dataset,
and the same weight vector W, the AUC values are 0.669, 0.712, 0.686, 0.595,
and 0.567 respectively. For SONAR dataset, and the same weight vector W, the
AUC values are 0.659, 0.759, 0.746, 0.724, and 0.684 respectively.
Out of the eight imbalanced problems in this work, five produced the
best result values when the minority weightage is set to 0.5. These
five imbalanced problems are ABL-9, ABL-9-18, YEAST-1, YEAST-2, and ION.
For the remaining three problems, ABL-18, SONAR, and SPECT, the results are slightly
lower at W = 0.5 than at W = 0.4 or W = 0.6, but better than for the other values of W.
For ABL-18, the performance difference is 0.015 compared to W = 0.6, and for
SPECT and SONAR, the difference is only 0.026 and 0.013 compared to W = 0.4.
Setting the value of W in an a priori and domain-independent manner is required
for designing any generic method. Thus, for five of the eight considered imbalanced
problems, W = 0.5 gives the best result, and for the other three it is only slightly
lower than the best result compared to other W values. If we

Table 3. Weight vs AUC

Data-Set W = 0.2 W = 0.4 W = 0.5 W = 0.6 W = 0.8


ABL-18 0.570 ± 0.061 0.734 ± 0.071 0.752 ± 0.053 0.767 ± 0.054 0.702 ± 0.071
ABL-9-18 0.718 ± 0.101 0.801 ± 0.086 0.829 ± 0.063 0.807 ± 0.050 0.774 ± 0.062
ABL-9 0.500 ± 0.001 0.578 ± 0.037 0.646 ± 0.017 0.615 ± 0.025 0.540 ± 0.025
YEAST-1 0.699 ± 0.034 0.772 ± 0.026 0.773 ± 0.032 0.748 ± 0.026 0.628 ± 0.034
YEAST-2 0.879 ± 0.039 0.915 ± 0.025 0.924 ± 0.014 0.906 ± 0.023 0.892 ± 0.020
ION 0.852 ± 0.035 0.859 ± 0.045 0.868 ± 0.035 0.854 ± 0.038 0.704 ± 0.147
SPECT 0.669 ± 0.039 0.712 ± 0.053 0.686 ± 0.063 0.595 ± 0.077 0.567 ± 0.035
SONAR 0.659 ± 0.072 0.759 ± 0.046 0.746 ± 0.060 0.724 ± 0.051 0.684 ± 0.066

define a range, then the optimal range of the minority class weightage is
0.50 ± 0.10. Therefore, we can conclude that, as a generic approach, we can set
the value of W equal to 0.50.

Fig. 3. Weight vs AUC

6 Conclusions
The imbalanced distribution of data across various classes generates intense chal-
lenges to classification algorithms. One way to solve these challenges is to assign
a custom weight to the classes during model training. Genetic programming
(GP) is an evolutionary technique used in various domains to solve classification
problems. This imbalanced distribution of data also influences GP. In GP, the
learning process of classifiers is governed by the fitness function, and this fit-
ness function can easily incorporate custom weight assignments. We performed
extensive experimentation on eight UCI repository-based imbalanced problems in
this work, with various values of minority class weight. Based on experimental
results, we can conclude that for better handling of imbalanced classification
problems in a generic manner, we can set the weight value to 0.5. This custom
weight assignment for the minority class samples during training generated bet-
ter solutions, which tackle the challenges of imbalanced data classification
problems. Thus, the custom weight factor in the GP fitness function gives more
balanced and accurate solutions for imbalanced data classification problems.

References
1. Haixiang, G., Yijing, L., Shang, J., Mingyun, G., Yuanyue, H., Bing, G.: Learn-
ing from class-imbalanced data: review of methods and applications. Expert Syst.
Appl. 73, 220–239 (2017)
2. Hassib, E., El-Desouky, A., Labib, L., El-kenawy, E.S.M.: WOA + BRNN: an
imbalanced big data classification framework using whale optimization and deep
neural network. Soft Comput. 24(8), 5573–5592 (2020)
3. Zhang, C., Tan, K.C., Li, H., Hong, G.S.: A cost-sensitive deep belief network for
imbalanced classification. IEEE Trans. Neural Netw. Learn. Syst. 30(1), 109–122
(2018)
4. Zhu, M., et al.: Class weights random forest algorithm for processing class imbal-
anced medical data. IEEE Access 6, 4641–4652 (2018)
5. Han, W., Huang, Z., Li, S., Jia, Y.: Distribution-sensitive unbalanced data over-
sampling method for medical diagnosis. J. Med. Syst. 43(2), 39 (2019)
6. Kovács, G.: An empirical comparison and evaluation of minority oversampling
techniques on a large number of imbalanced datasets. Appl. Soft Comput. 83,
105662 (2019)
7. Bhowan, U., Johnston, M., Zhang, M.: Developing new fitness functions in genetic
programming for classification with unbalanced data. IEEE Trans. Syst. Man
Cybern. Part B (Cybern.) 42(2), 406–421 (2012)
8. Devarriya, D., Gulati, C., Mansharamani, V., Sakalle, A., Bhardwaj, A.: Unbal-
anced breast cancer data classification using novel fitness functions in genetic pro-
gramming. Expert Syst. Appl. 140, 112866 (2020)
9. Kumar, A., Sinha, N., Bhardwaj, A.: Predicting the presence of newt-amphibian
using genetic programming. In: Tiwari, S., Trivedi, M.C., Kolhe, M.L., Mishra,
K., Singh, B.K. (eds.) Advances in Data and Information Sciences. LNCS, vol.
318, pp. 215–223. Springer, Singapore (2022). https://doi.org/10.1007/978-981-
16-5689-7_19
10. Koza, J.R.: Human-competitive results produced by genetic programming. Genet.
Program. Evolvable Mach. 11(3–4), 251–284 (2010)
11. Koza, J.R.: Genetic Programming: On the Programming of Computers by Means
of Natural Selection. MIT Press, Cambridge (1992)
12. Cheng, K., Gao, S., Dong, W., Yang, X., Wang, Q., Yu, H.: Boosting label weighted
extreme learning machine for classifying multi-label imbalanced data. Neurocom-
puting 403, 360–370 (2020)
13. Kumar, A., Sinha, N., Bhardwaj, A.: A novel fitness function in genetic program-
ming for medical data classification. J. Biomed. Inform. 112, 103623 (2020)
14. Tao, X., et al.: Self-adaptive cost weights-based support vector machine cost-
sensitive ensemble for imbalanced data classification. Inf. Sci. 487, 31–56 (2019)
15. Zhao, J., Jin, J., Chen, S., Zhang, R., Yu, B., Liu, Q.: A weighted hybrid ensemble
method for classifying imbalanced data. Knowl.-Based Syst. 203, 106087 (2020)

16. Kumar, A., Sinha, N., Bhardwaj, A., Goel, S.: Clinical risk assessment of chronic
kidney disease patients using genetic programming. Comput. Methods Biomech.
Biomed. Eng. 1–9 (2021). PMID: 34726985
17. Dua, D., Graff, C.: UCI machine learning repository (2017)
18. Poli, R., Langdon, W.B., McPhee, N.F., Koza, J.R.: A field guide to genetic pro-
gramming. Lulu.com (2008)
19. Ballabio, D., Grisoni, F., Todeschini, R.: Multivariate comparison of classification
performance measures. Chemometr. Intell. Lab. Syst. 174, 33–44 (2018)
20. Mullick, S.S., Datta, S., Dhekane, S.G., Das, S.: Appropriateness of performance
indices for imbalanced data classification: an analysis. Pattern Recogn. 102, 107197
(2020)
21. Cuadros-Rodrı́guez, L., Pérez-Castaño, E., Ruiz-Samblás, C.: Quality performance
metrics in multivariate classification methods for qualitative analysis. TrAC Trends
Anal. Chem. 80, 612–624 (2016)
Mass Transfer Past an Exponentially Stretching
Surface with Variable Wall Concentration
and MHD in Porous Medium

Praveen Kumar Dadheech1, Priyanka Agrawal2, Anil Sharma1,
Kottakkaran Sooppy Nisar2, Mahesh Bohra3, and S. D. Purohit4(B)
1 Department of Mathematics, University of Rajasthan, Jaipur, India
2 Department of Mathematics, College of Arts and Sciences, Wadi Aldawaser, Prince Sattam
Bin Abdulaziz University, Al-Kharj, Saudi Arabia
n.sooppy@psau.edu.sa
3 Department of Mathematics, Government Mahila Engineering College, Ajmer, India
4 Department of HEAS (Mathematics), Rajasthan Technical University Kota, Kota 324010,
India

Abstract. The present study examines the impacts of chemical reactions over an
exponentially expanding sheet of viscous incompressible fluid flow in a porous
medium with the imposed magnetic field. Here reaction rate and wall concen-
tration are exponential variables. The basic equations of the governing flow are
transformed into ordinary differential equations with the help of similarity anal-
ysis. Then the reduced system of equations was dealt with Shooting Technique
alongside the Runge-Kutta method of order four. Numerical results are presented
graphically for velocity and mass field in terms of the parameter of Reaction Ratio,
parameter of permeability, parameter of Magnetic field, and Schmidt Number.

Keywords: Mass transfer · Porous medium · Variable wall concentration ·
Boundary layer flow · Exponentially expanding sheet · MHD

1 Introduction

The analysis of the mass transfer of a laminar boundary layer flow over an expanding
surface in a porous medium has rich implications in materials science and
chemical sciences [7, 8]. Crane [11] was the first to investigate the exact analytic
solution of a boundary layer flow past a stretching sheet. Many researchers assumed that
the velocity of the expanding surface is linearly proportional to the distance from the
origin. Gupta and Gupta [17] have shown that a plastic sheet need not stretch
linearly. Kumaran and Ramanaiah [25] analyzed a boundary
layer flow where the expansion of the surface is quadratic. Elbashbeshy [5] studied the
exchange of heat of a boundary layer flow where the surface expands exponentially
with a heat sink. The exchange of heat of a laminar flow with an applied magnetic field,
where the sheet expands exponentially, was investigated by Al-Odat et al. [16].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022


H. Sharma et al. (Eds.): ICIVC 2021, PALO 15, pp. 10–21, 2022.
https://doi.org/10.1007/978-3-030-97196-0_2

The study of radiative MHD boundary layer laminar flow and the rate of exchange of heat
along an expanding surface was presented by Ishak [1].
The study of the flow of the fluid, which is electrically conducted with an imposed
magnetic field, is defined as Magnetohydrodynamics (MHD). Applications of MHD
can be seen in various technological and engineering fields, for instance, geophysics,
petroleum industries, MHD flow meters, MHD electricity generators, crystal magnetic
infiltration control, advanced magnetic filtration control, and MHD pumps, etc. [10, 13,
14, 20, 21]. MHD can be used as a very important tool for controlling mass and heat trans-
fer. The impact of imposed magnetic field where the sheet is expanding exponentially in
a variety of states was studied by the researchers like Andersson [9], Mukhopadyay et al.
[22], Dadheech et al. [18], and Dadheech et al. [12]. An investigation on free convective
MHD fluid flow with the imposed magnetic field was reported by Watanabe et al. [24].
The impact of the magnetic field with viscous dissipation on Non-Newtonian fluid has
been analyzed by researchers like Sharma [3] and Jat [19].
The study of the transfer of mass and heat with chemical reaction has become a
requirement for hydro-metallurgical industries in the present era. The buoyancy forces
and mass exchange effects play a significant role in various transport processes in
industries like solar collectors, metallurgical and chemical engineering, nuclear reactor
safety, combustion systems, etc. Khan et al. [23] analyzed the fluid flow with exchange of
heat of a viscoelastic fluid where the surface is stretching exponentially. The combined
impact of transfer of mass and heat of a viscoelastic MHD flow along with porous
stretching surface has been studied by Kar et al. [15]. An investigation of transfer of mass
and heat of a natural convective fluid flow, taking surface porous and the concentration of
the wall is variable with the imposed magnetic field has presented by Chen [4]. Banerjee
et al. [2] have studied the effect of chemical reaction on a boundary layer fluid flow
where the surface is stretching exponentially and concentration of the wall is variable.
The motivation behind this analysis is to examine the impacts of chemical reaction
due to an exponentially expanding sheet of viscous incompressible fluid flow in
a permeable medium with an imposed magnetic field. We focus mainly on the effects
of the magnetic field, in the presence of porosity, on mass transfer. A chemical reaction
term is included because the fluid can be chemically reactive. Here the reaction rate and wall
concentration are exponential variables. The basic equations of the flow are transformed
into ordinary differential equations with the help of similarity analysis. Then the reduced
equations were dealt with Shooting Technique alongside the Runge-Kutta method of
order four. Numerical results are presented graphically for velocity and mass field in
terms of parameter of Reaction Ratio, parameter of permeability, parameter of Magnetic
field and Schmidt Number.

2 Mathematical Model
Here we consider a 2-D, incompressible, viscous, electrically conducting, steady
fluid flow with an imposed transverse magnetic field, where the surface stretches
exponentially in a porous medium. The induced magnetic field is presumed to be small
compared with the applied magnetic field. Hence our governing equations of fluid flow and exchange
of mass are the following, under the usual boundary layer approximations:

\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0,   (1)

u \frac{\partial u}{\partial x} + v \frac{\partial u}{\partial y} = \nu \frac{\partial^2 u}{\partial y^2} - \frac{\nu}{k}\, u - \frac{\sigma_c B_0^2}{\rho}\, u,   (2)

u \frac{\partial C}{\partial x} + v \frac{\partial C}{\partial y} = D \frac{\partial^2 C}{\partial y^2} + R\,(C - C_\infty).   (3)
Here the velocity components in the x and y directions are u and v, respectively;
ν (= μ/ρ) represents the kinematic viscosity, D is the coefficient of diffusion, ρ represents
the density, μ is the coefficient of viscosity, B_0 represents the magnetic field coefficient, C
represents the concentration, C_∞ represents the ambient concentration, k represents the
permeability parameter of the porous medium, and R(x) = R_0 e^{x/L} represents the variable
rate of chemical reaction, where R_0 is a constant and L represents a reference length.
For the above problem the boundary conditions are defined as:

u = U_w(x), \quad v = 0, \quad C = C_w = C_\infty + C_0\, e^{\lambda x/2L} \quad \text{at } y = 0,   (4)

u \to 0, \quad C \to C_\infty \quad \text{as } y \to \infty.   (5)

Here, in the above boundary conditions, C_w represents the variable concentration on the
sheet, C_0 represents a reference concentration, and λ is a parameter to control the
reaction rate of the mass transfer; the parameter λ plays a significant role
since it controls the exponential increment of the surface concentration. U_w, the velocity
of expansion of the sheet, is given by

U_w(x) = a\, e^{x/L},

where a > 0 is the expansion constant.


By using a stream function ψ(x, y), one can define:

u = \frac{\partial \psi}{\partial y}, \quad v = -\frac{\partial \psi}{\partial x}.   (6)
Thus the continuity Eq. (1) is identically satisfied, and the momentum and concentration
Eqs. (2) and (3) are respectively transformed into the following equations:

\frac{\partial \psi}{\partial y} \frac{\partial^2 \psi}{\partial x \partial y} - \frac{\partial \psi}{\partial x} \frac{\partial^2 \psi}{\partial y^2} = \nu \frac{\partial^3 \psi}{\partial y^3} - \frac{\nu}{k} \frac{\partial \psi}{\partial y} - \frac{\sigma_c B_0^2}{\rho} \frac{\partial \psi}{\partial y},   (7)

\frac{\partial \psi}{\partial y} \frac{\partial C}{\partial x} - \frac{\partial \psi}{\partial x} \frac{\partial C}{\partial y} = D \frac{\partial^2 C}{\partial y^2} + R\,(C - C_\infty).   (8)
And the corresponding conditions (4) and (5) reduce to:

\frac{\partial \psi}{\partial y} = U_w(x), \quad \frac{\partial \psi}{\partial x} = 0, \quad C = C_w = C_\infty + C_0\, e^{\lambda x/2L} \quad \text{at } y = 0,   (9)

\frac{\partial \psi}{\partial y} \to 0, \quad C \to C_\infty \quad \text{as } y \to \infty.   (10)
Now we introduce the following similarity transformations for the solution of the
governing equations:

\psi(x, y) = \sqrt{2 \nu L a}\, f(\eta)\, e^{x/2L},   (11)

C = C_\infty + (C_w - C_\infty)\, \phi(\eta).   (12)

Here η is the similarity variable, suggested by Magyari and Keller [6] as

\eta = y \sqrt{\frac{a}{2 \nu L}}\, e^{x/2L}.   (13)
After using the similarity transformations (11)–(12) in the momentum and
concentration Eqs. (2) and (3), we get the following reduced equations:

f''' - 2f'^2 - 2Kf' + ff'' - 2Mf' = 0,   (14)

\phi'' + Sc \left( f \phi' - \lambda f' \phi - \beta \phi \right) = 0.   (15)

And the boundary conditions become:

\eta = 0:\quad f(\eta) = 0,\; f'(\eta) = 1,\; \phi(\eta) = 1;
\eta \to \infty:\quad f'(\eta) \to 0,\; \phi(\eta) \to 0.   (16)

where the dimensionless parameters are

K = \frac{\nu}{ak} e^{-x/L} \quad \text{(permeability parameter)},

M = \frac{\sigma_c B_0^2}{\rho a} e^{-x/L} \quad \text{(magnetic parameter)},

\beta = \frac{2 L R_0}{a} \quad \text{(reaction ratio parameter)},

Sc = \frac{\nu}{D} \quad \text{(Schmidt number)}.

3 Numerical Solution for Momentum and Mass Equations


In the present problem, Eqs. (14) and (15), together with the boundary conditions (16),
form non-linear coupled differential equations. To solve them, we adopt the numerical
shooting technique alongside the fourth-order Runge-Kutta method, reducing
the above system of equations to first-order initial value problems by setting:

f = t_1, \quad f' = t_2, \quad f'' = t_3, \quad t_3' = 2t_2^2 - t_1 t_3 + 2Kt_2 + 2Mt_2,

\phi = t_4, \quad \phi' = t_5, \quad t_5' = Sc \left[ \beta t_4 + \lambda t_2 t_4 - t_1 t_5 \right],

along with the initial boundary conditions

t_1(0) = 0, \quad t_2(0) = 1, \quad t_4(0) = 1.

These are three initial conditions, while we need five, i.e. t_3(0) and t_5(0) are also
required. We obtained the solution by the Runge-Kutta numerical method of order four,
taking initial predicted values for t_3(0), t_5(0), and η(→ ∞), say η_∞. After this we
computed f'(η) and φ(η) at η_∞ (taken as 10) along with the boundary conditions
f'(η_∞) = 0 and φ(η_∞) = 0, and for the desired approximate result (degree of
accuracy 10^{-6}) we adjusted the values of f''(0) and φ'(0), taking the step size as
Δη = 0.01. A sketch of this procedure is given below.
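In the sketch below (ours, not the authors' code), scipy's Runge-Kutta integrator and a root solver take over the manual adjustment of f''(0) and φ'(0); the parameter values, η_∞, and the initial guesses are illustrative and may need tuning per parameter set.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import fsolve

K, M, Sc, beta, lam = 2.0, 2.0, 0.5, 0.5, 1.0   # sample parameter values
eta_inf = 10.0                                  # numerical "infinity"

def rhs(eta, t):
    f, fp, fpp, phi, phip = t                   # t1..t5 of the text
    return [fp, fpp,
            2*fp**2 - f*fpp + 2*K*fp + 2*M*fp,    # t3' from Eq. (14)
            phip,
            Sc*(beta*phi + lam*fp*phi - f*phip)]  # t5' from Eq. (15)

def residual(slopes):
    """Mismatch of the far-field conditions f'(inf) = 0 and phi(inf) = 0."""
    y0 = [0.0, 1.0, slopes[0], 1.0, slopes[1]]  # t1(0)=0, t2(0)=1, t4(0)=1
    sol = solve_ivp(rhs, [0.0, eta_inf], y0, max_step=0.01)  # RK stepper
    return [sol.y[1, -1], sol.y[3, -1]]

fpp0, phip0 = fsolve(residual, [-1.0, -1.0])    # shoot for f''(0), phi'(0)
print("f''(0) =", fpp0, " phi'(0) =", phip0)
```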

4 Results and Discussion


In the present analysis we have computed numerical results for several physical dimen-
sionless parameters, which are represented by the graphs. The results are obtained to
explain the impact of the magnetic parameter M, the reaction ratio parameter β, the
permeability parameter K, the Schmidt number Sc, and the parameter λ.
Figure 1 depicts the dimensionless velocity profile f' versus η for several values of M.
The curve of f' decreases for increased M; equivalently, this leads to an increase in the

[Figure: velocity profile f'(η) versus η for M = 1, 2, 3, 4]

Fig. 1. Velocity curve for M, taking K = 2, β = Sc = 0.5 and λ = 1.


[Figure, panels (a) and (b): concentration profile φ(η) versus η for M = 1, 2, 3, 4]

Fig. 2. (a) Concentration curve for M, taking K = 2, β = Sc = 0.5 and λ = 1. (b) Concentration
curve for M, taking K = 2, β = Sc = 0.5 and λ = −1.

velocity field for decreased M. This is because of the resisting force of the magnetic
field, which opposes the flow of the fluid and leads to an increase in the mass transfer rate.
Figures 2(a) and 2(b) depict the dimensionless concentration curve φ versus η for M,
when λ = 1 and λ = −1, respectively. When M increases, the concentration profile
also increases for λ = 1 and decreases for λ = −1, i.e. the mass transfer efficiency is
lower at higher values of M for λ = 1, and the reverse effect can be seen for λ = −1.
Figure 3 depicts the dimensionless velocity profile f' versus η for distinct values of K.
It is concluded that for increased values of K, the velocity profile decreases. From
this we can conclude that the momentum of the fluid is lower for larger values of K.
Figures 4(a) and (b) depict the dimensionless concentration profile φ versus η for distinct
values of K, for λ = 1 and λ = −1, respectively. It is observed that when K increases,

[Figure: velocity profile f'(η) versus η for K = 0, 1, 2, 3, 4]

Fig. 3. Velocity curve for K, taking M = 2, β = Sc = 0.5 and λ = 1.

[Figure, panels (a) and (b): concentration profile φ(η) versus η for K = 0, 1, 2, 3, 4]

Fig. 4. (a) Concentration curve for K, taking M = 2, β = Sc = 0.5 and λ = 1. (b) Concentration
curve for K, taking M = 2, β = Sc = 0.5 and λ = −1.

φ decreases for λ = 1 and the reverse effect is seen for λ = −1. As a result, for smaller values
of K the mass transfer efficiency is higher for λ = 1 and lower for λ = −1.
In controlling the reaction rate of the mass exchange, the parameter λ plays
a significant role since it controls the exponential increment of the surface concentration.

[Figure: concentration profile φ(η) versus η for λ = −1.5, −1, −0.5, 0, 0.5, 1, 1.5]

Fig. 5. Concentration curve for λ, taking K = M = 2, β = 0.5 and Sc = 0.5.

Figure 5 depicts the dimensionless concentration profile φ versus η for various
values of the parameter λ. We see that when λ increases, the concentration profile φ decreases.
So we can also enhance the mass transfer rate using the parameter λ.

[Figure: concentration profile φ(η) versus η for Sc = 0.1, 0.15, 0.2, 0.3, 0.5, 1]

Fig. 6. Concentration curve for Sc, taking K = M = 2, β = 0.5 and λ = 1.



[Figure: concentration profile φ(η) versus η for Sc = 0.1, 0.15, 0.2, 0.3, 0.5, 1]

Fig. 7. Concentration curve for Sc, taking K = M = 2, β = 0.5 and λ = −1.

Figures 6 and 7 depict the dimensionless concentration profile φ versus η for
various values of the Schmidt number Sc, for λ = 1 and λ = −1, respectively. From the
sketches, it is noticed that when Sc increases, in both the direct and inverse variation of the
exponential wall concentration, the concentration profile decreases, i.e. the mass transfer
rate of the fluid can be enhanced by Sc.
Figures 8(a) and 8(b) depict the dimensionless concentration profile φ versus
η for various values of the reaction ratio parameter β, for λ = 1 and λ = −1,
respectively. From the sketches, we notice that when we increase the
value of β, the concentration profile decreases in both cases, so the mass
transfer rate of the fluid can be enhanced by a larger β.

[Figure, panels (a) and (b): concentration profile φ(η) versus η for β = 0.5, 0.75, 1.0, 1.25, 1.5]

Fig. 8. (a) Concentration curve for β, taking K = M = 2, Sc = 0.5 and λ = 1. (b) Concentration
curve for β, taking K = M = 2, Sc = 0.5 and λ = −1.

5 Conclusion
Here an investigation of the chemical reaction of an electrically conducting, incompress-
ible, viscous laminar fluid flow is considered in a permeable medium over an exponen-
tially expanding sheet. Here the chemical reaction rate and wall concentration are taken as
variable. The present study is directed especially towards the mass transfer effect. The
thickness of the concentration boundary layer decreases with increasing Schmidt number Sc,
reaction ratio parameter β, parameter λ, and permeability parameter K, i.e. the mass
transfer efficiency can be enhanced by enhancing the values of these parameters. We
also observed that the mass transfer reduces with increased M, the magnetic field
parameter. The velocity profile is lower for increasing values of the parameters M and
K. In future these results can be useful for the examination of mass transfer with
different geometries, different fluids, and different solution methods.

References
1. Ishak, A.: MHD boundary layer flow due to an exponentially stretching sheet with radiation
effect. Sains Malaysiana 40, 391–395 (2011)
2. Banerjee, A., Mahato, S.K., Bhattacharyya, K.: Mass diffusion with chemical reaction in
boundary layer flow due to an exponentially expanding sheet with variable wall concentration.
Acta Tech. CSAV 63(2), 157–168 (2018)
3. Sharma, A., Jha, A.K., Dadheech, P.K.: Heat transfer over a flat plate in porous medium
with heat source and viscous dissipation in slip flow regime. Int. J. Math. Arch. 8(1), 99–108
(2017)
4. Chen, C.H.: Heat and mass transfer in MHD flow by natural convection from a permeable
inclined surface with variable wall temperature and concentration. Acta Mech. 172(3–4),
219–235 (2004)
5. Elbashbeshy, E.M.A.: Heat transfer over an exponentially stretching continuous surface with
suction. Arch. Mech. 53(6), 643–651 (2001)
6. Magyari, E., Keller, B.: Heat and mass transfer in the boundary layers on an exponentially
stretching continuous surface. J. Phys. D: Appl. Phys. 32(5), 577–585 (1999)
7. Ali, F.M., Nazar, R., Arifin, N.M., Pop, I.: MHD Mixed convection boundary layer flow
toward a stagnation point on a vertical surface with induced magnetic field. J. Heat Transf.
133(2), 1–6 (2011). https://doi.org/10.1115/1.4002602
8. Andersson, H.I., Hansen, O.R., Holmedal, B.: Diffusion of a chemically reactive species from
a stretching sheet. Int. J. Heat Mass Transf. 37, 659–664 (1994)
9. Mathur, P., Mishra, S.R., Purohit, S.D., Bohra, M.: Entropy generation in a micropolar fluid
past an inclined channel with velocity slip and heat flux conditions: Variation parameter
method. Heat Transf. 50, 7425–7439 (2021). https://doi.org/10.1002/htj.22236
10. Agrawal, P., Dadheech, P.K., Jat, R.N., Baleanu, D., Purohit, S.D.: Radiative MHD hybrid-
nanofluids flow over a permeable stretching surface with heat source/sink embedded in porous
medium. Int. J. Numer. Meth. Heat Fluid Flow 31(8), 2818–2840 (2021). https://doi.org/10.
1108/HFF-11-2020-0694
11. Crane, L.J.: Flow past a stretching plate. Z Angew Math Phys 21(4), 645–647 (1970). https://
doi.org/10.1007/BF01587695
12. Dadheech, P.K., Agrawal, P., Sharma, A., Dadheech, A., Al-Mdallal, Q., Purohit, S.D.:
Entropy analysis for radiative inclined MHD slip flow with heat source in porous medium
for two different fluids. Case Stud. Therm. Eng. 28, 101491 (2021). https://doi.org/10.1016/
j.csite.2021.101491
13. Dadheech, P.K., Agrawal, P., Mebarek-Oudina, F., Abu-Hamdeh, N., Sharma, A.: Compara-
tive heat transfer analysis of MoS2/C2H6O2 and SiO2-MoS2/C2H6O2 nanofluids
with natural convection and inclined magnetic field. J. Nanofluids 9(3), 161–167 (2020)
14. Agrawal, P., Dadheech, P.K., Jat, R.N., Nisar, K.S., Bohra, M., Purohit, S.D.: Magneto
Marangoni flow of γ−AL2O3 nanofluids with thermal radiation and heat source/sink effects
over a stretching surface embedded in porous medium. Case Stud. Therm. Eng. 23, 100802
(2021). https://doi.org/10.1016/j.csite.2020.100802
15. Kar, M., Sahoo, S.N., Rath, P.K., Rath, G.C.: Heat and mass transfer effects on a dissipative
and radiative viscoelastic MHD flow over a stretching porous sheet. Arab. J. Sci. Eng. 39(5),
3393–3401 (2014)

16. Al-Odat, M.Q., Damseh, R.A., Al-Azab, T.A.: Thermal boundary layer on an exponentially
stretching continuous surface in the presence of magnetic field effect. Int. J. Appl. Mech. Eng.
11(2), 289–299 (2006)
17. Gupta, P.S., Gupta, A.S.: Heat and mass transfer on a stretching sheet with suction or blowing.
Can. J. Chem. Eng. 55, 744–746 (1977)
18. Dadheech, P.K., Agrawal, P., Sharma, A., Nisar, K.S., Purohit, S.D.: Marangoni convection
flow of γ–Al2O3 nanofluids past a porous stretching surface with thermal radiation effect in
the presence of an inclined magnetic field. Heat Transf. 51, 534–550 (2021). https://doi.org/
10.1002/htj.22318
19. Jat, R.N., Agrawal, P., Dadheech, P.K.: MHD boundary layer flow and heat transfer of casson
fluid over a moving porous plate with viscous dissiption and thermal radiation effects. J.
Rajasthan Acad. Phys. Sci. 16(3–4), 211–232 (2017)
20. Jat, R.N., Agrawal, P.: MHD boundary layer slip flow and heat transfer over a porous flat
plate embedded in a porous medium. Int. J. Math. Arch. 8(1), 146–155 (2017)
21. Agrawal, P., Dadheech, P.K., Jat, R.N., Bohra, M., Nisar, K.S., Khan, I.: Lie similarity analysis
of MHD flow past a stretching surface embedded in porous medium along with imposed heat
source/sink and variable viscosity. J. Market. Res. 9(5), 10045–10053 (2020)
22. Mukhopadyay, S.: Slip Effects on MHD boundary layer flow over an exponentially stretching
sheet with suction/blowing and thermal radiation. Ain Shams Eng. J. 4(3), 485–491 (2013)
23. Khan, S.K., Sanjayan, E.: Viscoelastic boundary layer flow and heat transfer over an
exponential stretching sheet. Int. J. Heat Mass Transf. 48, 1534–1542 (2005)
24. Watanabe, T., Pop, I.: Magnetohydrodynamic free convection flow over a wedge in the
presence of a transverse magnetic field. Int. Commun. Heat Mass Transf. 20(6), 871–881
(1993)
25. Kumaran, V., Ramanaiah, G.: A note on the flow over a stretching sheet. Acta Mech. 116,
229–233 (1996)
A Novel Approach of Using Materialized Queries
for Retrieving Results from Data Warehouse

Sonali Chakraborty(B)

Department of Mathematical and Computational Sciences, National Institute of Technology,
Mangalore, Karnataka 575025, India
chakrabartysonali@gmail.com

Abstract. The organization's data warehouse stores historical records collected
from various operational sources for management strategy making. In the case of
frequent management queries, generating the same results by repeated invocations
of the data warehouse is relatively time consuming. In order to extract results
from a data warehouse, data-cubes and materialized-views are used. They incur
more processing, storage area, and maintenance cost. The present study enhances
result fetching for frequent queries on a data warehouse by loading queries,
results, and some meta-data into a database referred to as the Materialized-Query-
Database, denoted MQDB. When a query is given by the user, the MQDB is
checked to determine whether the query already exists in the database. In case the
query exists in MQDB and the stored results do not require any updating,
the results are simply fetched from the database. The approach of storing the
results of the input query and thereafter simply fetching them when the same query
is executed again reduces the query processing time substantially. The evaluation
of the novel approach, done by placing the data warehouse on the central as well
as on a remote cloud-server, shows a noteworthy reduction in the time taken to
retrieve query results as compared to the prevailing approaches.
The approach is suitable for making use of past records in the data warehouse for
management decision making.

Keywords: Data Warehouse · Online analytical processing queries ·
Materialized queries · Faster query result · Central-server · Cloud-server

1 Introduction
The enterprise data warehouse provides a huge volume of historical data from various oper-
ational sources. OLAP queries are implemented on the warehouse data by the management
[15] for analysis and strategic decision making. The results of the queries are created by
navigating a large amount of warehouse data. For recurrent queries, the data warehouse
is accessed repeatedly while generating the same results, which is fairly time consum-
ing. Moreover, the current approaches for storing and extracting query output from a
data warehouse, such as data-cubes and materialized-views, sustain additional cost [12,
13, 21]. In this study, the implemented queries, their results, and meta-data are stored in a
database referred to as the Materialized-Query-Database (MQDB). The meta-data information
contains the time-stamp of the query when it was last executed, the frequency representing
the count of query executions, a threshold value indicating the approximate number of
times the query is expected to be executed in a year, the number of output records, and
the path of the result table.
For a given query, the Materialized-Query-Database is checked for a similar existing
query. Query existence is checked by matching the table-field-function-relational
operator-criteria of the input and warehoused queries. An input query and a stored query
fetching the same results are considered synonymous to each other. Thereafter, the
time-stamp of the warehoused query is compared with the preceding warehouse refresh date
to determine whether the stored query requires an incremental update of results. If the
time-stamp of the stored query is greater than the last data warehouse refresh date,
then no update for incremental data is required. In this case existing stored results
are simply fetched from MQDB, eliminating the need for data warehouse invocation for
result generation. In this study the processing time of synonymous queries with no
incremental update is measured with the data warehouse placed on the central and on the
cloud-server. The experimental results show that the approach of reusing the stored
results of executed OLAP queries substantially decreases query processing time as
compared to the existing approaches.
The quality and performance of a data warehouse used by an organization for strategic
decision making is an important aspect [19]. The effectiveness of BI systems is
improved using Online Analytical Processing queries, the data warehouse and SQL
query techniques [2]. Materialized-views [5, 9, 10, 24], multidimensional data-cubes
[12, 13] and various indexing techniques [17, 23] are used for retrieving data warehouse
results. The major drawback of data-cubes is that they can preserve the results of
queries with aggregate functions only. Moreover, storing enormous results computed on
multiple dimensions requires massive space and therefore incurs more time to fetch the
results. Cube materialization and the materialization cost are inversely related to each
other [14]. Many procedures have been proposed by researchers for reducing the
materialization cost of cubes [1, 6–8, 20]. The major limitation of materialized-views,
which store results with a view description, is that they are not supported by all DBMSs.
Moreover, materialized-views are executed each time, and therefore users must know the
query and the table-fields used in the materialized-views [24]. Refreshing view results
in line with the corresponding updates in the data warehouse is a major concern. Researchers
[10, 11, 16, 18, 25] discuss various issues faced while performing maintenance of the
views, such as overhead issues, and also discuss approaches to overcome those issues.

2 Implementation of MQDB

The steps for implementing the proposed approach while processing synonymous queries
are depicted in the following algorithm; a Python sketch of this flow is given after the
algorithm.

I. Initialization: Generating the table-field-aggregate function-criteria-relational operator
identifiers likely in the query (Sect. 2.1)
II. When the input query is given;
a. Create its query-identifier-element (depicted in Sect. 2.3)
b. If a similar query-identifier-element is found in the database
i. Compare the input and stored query criteria
If both of them match then,
• Fetch existing results stored in MQDB
• Update the query time-stamp and query frequency
c. Otherwise, if no similar query is found in the database;
i. Create the results of the given input query using the DW
ii. Store the results of the query along with meta-data information in MQDB
(Sect. 2.2)
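The following minimal Python sketch illustrates this lookup flow. The in-memory
dictionary, the record layout and the helper names are illustrative assumptions; the
study itself stores queries and results in MySQL tables.

from datetime import datetime, timedelta

# A minimal sketch of the MQDB lookup flow; the in-memory dict stands in for
# the 'Stored-query'/'Materialized-query' MySQL tables used in the study.
MQDB = {}

def process_query(element, criteria, run_on_warehouse, last_refresh):
    """element: the query-identifier-element (frozenset of identifier-codes);
    criteria: the query's criteria values; run_on_warehouse: callable that
    generates results from the data warehouse when no stored query matches."""
    record = MQDB.get(element)
    if (record is not None and record["criteria"] == criteria
            and record["timestamp"] > last_refresh):
        # Step II.b: synonymous query found and its results are up to date.
        record["frequency"] += 1
        record["timestamp"] = datetime.now()
        return record["results"]
    # Step II.c: generate results from the data warehouse and materialize them.
    results = run_on_warehouse()
    MQDB[element] = {"criteria": criteria, "results": results,
                     "frequency": 1, "timestamp": datetime.now()}
    return results

# Example call with identifier-codes from Sect. 2.3 and a stub warehouse query.
element = frozenset({"010304", "010902", "010240", "010500"})
rows = process_query(element, {"age.group": ("=", "80-84")},
                     run_on_warehouse=lambda: [("Rajasthan", 1200.0)],
                     last_refresh=datetime.now() - timedelta(days=30))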
In this novel approach of reusing stored queries, the OLAP queries are stored in the
database in the form of integers. The main purpose of doing so is to ease the query
searching process in MQDB. Since a query can be written in multiple ways, the
occurrence of a string mismatch is quite likely if the query is stored as text in the
database. One query written in multiple ways would be stored repeatedly in the database,
causing unnecessary duplication of queries and their results. The working of the MQDB
model is illustrated using the education data of various age groups of males and females
of India for the years 1991, 2001 and 2011. The education data is from the population
census dataset of India [22].

2.1 The Initialization Process

During this phase [4], identifiers are generated for the table-field-function-criteria-relational
operators required for the OLAP query. The initialization phase is executed once, during
loading time. The tables and their respective fields differ with the application, whereas
the identifiers for functions, operators and criteria clauses are the same for all types
of applications. The identifier generation is depicted as follows:

a. In this case a data warehouse dw.edu in a de-normalized form is considered and
therefore the table identifier assigned to it is ‘01’.
b. The field identifiers in dw.edu are depicted in Table 1.
• The identifiers for the most probable functions to be used in queries and other clauses
are: (NULL function = 00), (Sum = 01), (Average = 02), (Min = 03), (Max = 04),
(Count = 05), (Std Dev = 06), (Var = 07), (Order by ascending = 10), (Order by
descending = 20), (Limit by = 30), (Group by = 40)
• Identifiers for the clauses used for specifying criteria: (no-criteria = 00),
(where-criteria = 01), (having-criteria = 02)
• The criteria of a query with a numeric value are specified with a minimum and maximum
range. Therefore, the operators (<, >, <= and >=) are converted to ‘BETWEEN’.
For example, the query criterion ‘literate_females < 100’ is converted to ‘0 to 99’.
The relational operator identifiers are: (no operator = 00), (‘=’ = 01), (‘!=’ = 02),
(‘BETWEEN’ = 03). A minimal sketch of these mappings is given below.
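The following Python sketch shows one possible encoding of these mappings and of the
conversion of numeric operators to BETWEEN ranges; the dictionary names and integer
bounds are illustrative assumptions.

# A minimal sketch of the identifier mappings described above; the dictionary
# names are assumptions, while the codes follow the paper.
FUNC_IDS = {None: "00", "sum": "01", "avg": "02", "min": "03", "max": "04",
            "count": "05", "stddev": "06", "var": "07",
            "orderby_asc": "10", "orderby_desc": "20",
            "limitby": "30", "groupby": "40"}
CRITERIA_IDS = {None: "00", "where": "01", "having": "02"}
OPERATOR_IDS = {None: "00", "=": "01", "!=": "02", "BETWEEN": "03"}

def normalize_numeric_criterion(op, value):
    """Convert <, >, <= and >= criteria to a BETWEEN range, as in the paper.
    The integer bounds are an assumption for illustration."""
    if op == "<":
        return "BETWEEN", (0, value - 1)   # 'literate_females < 100' -> 0 to 99
    if op == "<=":
        return "BETWEEN", (0, value)
    if op == ">":
        return "BETWEEN", (value + 1, float("inf"))
    if op == ">=":
        return "BETWEEN", (value, float("inf"))
    return op, value

print(normalize_numeric_criterion("<", 100))  # ('BETWEEN', (0, 99))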
Table 1. Field identifiers for table dw.edu

Field Name | Field identifier | Field Name | Field identifier
edu.record.id | 01 | middle_female | 19
cen.year | 02 | total_middle | 20
st.name | 03 | secondary_male | 21
tn.name | 04 | secondary_female | 22
age.group | 05 | total_secondary | 23
illiterate_male | 06 | higher_sec_male | 24
illiterate_female | 07 | higher_sec_female | 25
total_illiterate | 08 | total_higher_sec | 26
literate_male | 09 | diploma_male | 27
literate_female | 10 | diploma_female | 28
total_literate | 11 | total_diploma | 29
bel_primary_male | 12 | graduate_male | 30
bel_primary_female | 13 | graduate_female | 31
total_bel_primary | 14 | total_graduate | 32
primary_male | 15 | unclassified_male | 33
primary_female | 16 | unclassified_female | 34
total_primary | 17 | total_unclassified | 35
middle_male | 18 | entry_date | 36
2.2 Query Storage in MQDB

The executed OLAP queries with meta-data are stored in MQDB [3, 4]. Two relational
tables are used to store the query details. The ‘Stored-query’ table stores the queries
using identifiers, while the query meta-data is kept in the ‘Materialized-query’ table,
shown in Table 2 and Table 3 respectively. Consider the following queries as the test
query set executed for recording the experimental results.

• Test query set
Q1: Find the first three towns in ‘Jharkhand’ state with the greatest number of unclassified
females in the age group ‘60–64’ in census year ‘2011’.

select tn.name, unclassified_female from dw.edu
where cen.year = 2011 and st.name = ‘jharkhand’ and age.group = ‘60–64’
order by unclassified_female limit 3

Q2: List states having an average greater than ‘1000’ for literate males in the age group
‘80–84’ years for the three census years.

select cen.year, st.name, avg(literate_male) from dw.edu
where age.group = ‘80–84’ group by cen.year, st.name
having avg(literate_male) > 1000

Q3: List states having a number of literate females less than ‘1000’ in census year ‘2011’.

select st.name, sum(literate_female) from dw.edu
where cen.year = 2011 group by st.name
having sum(literate_female) < 1000

Q4: Compute the variance of the number of diploma holders for each state for each
census year for the age group ‘30–34’.

select cen.year, st.name, var(total_diploma) from dw.edu
where age.group = ‘30–34’ group by cen.year, st.name

Q5: Compute the standard deviation of the number of literate males for the age group
‘20–24’ years for each state for each census year.

select cen.year, st.name, stdev(literate_male) from dw.edu
where age.group = ‘20–24’ group by cen.year, st.name

The following two tables depict the ‘Stored-query’ and ‘Materialized-query’ tables.
Table 2. ‘Stored-query’ table in MQDB.
The records in ‘Stored-query’ for query Q1 are explained below:

1. The first row, having ID sq1, corresponds to ‘tn.name’. The function value ‘30’
depicts the LIMIT clause. ‘00’ for criteria and operator depicts no criteria, and
therefore the values of the columns list, minimum and maximum are NULL.
2. Row sq2 explains the field ‘unclassified_female’. Func_id ‘10’ indicates the use of
the ORDER BY clause in ascending order. ‘00’ for criteria and operator shows that
the query has no criteria on this field, and therefore list, minimum and maximum
are NULL.
3. The third, fourth and fifth rows depict the fields census year, state name and age
group respectively. The value ‘00’ as Func_id in all three rows shows that no
aggregate function is used for these fields. ‘01’ for criteria and operator shows that
the criteria specified using the where clause use the ‘=’ operator. List contains
‘2011’, ‘Jharkhand’ and ‘60–64’ for ‘cen.year’, ‘st.name’ and ‘age.group’ respectively.
The minimum and maximum range is NULL, which indicates that there are no numeric
range criteria.
Table 3. ‘Materialized-query’ table in MQDB.
2.3 Finding a Synonymous Query from MQDB

For a given query, the ‘Stored-query’ table is searched for a synonymous query. The
following steps are performed to find a synonymous query [4]:

• In the first step, all the table-field-function combinations of the given query and the
stored queries are compared. The identifiers table-field-function are grouped and
denoted as an identifier-code. The set of all identifier-codes of a query forms the
query-identifier-element. The number of identifier-codes in a query-identifier-element
depends on the number of fields used in the input query.
• In the next step, the criteria of the given query are compared with the stored query
criteria. If the criteria of both queries match, then the queries are considered
synonymous to each other. A sketch of this matching is given after the example below.
The following example illustrates the generation of a query-identifier-element.

Example: Consider Input_query1: List the states having more than 1000 average
number of literate males in age.group ‘80–84’ years for the three census years.

select st.name, avg(literate_male), cen.year from dw.edu
where age.group = ‘80–84’ group by cen.year, st.name
having avg(literate_male) > 1000

The identifiers of Input_query1 are depicted in the following tables.

Referring to Tables 4 and 5, the four identifier-codes of Input_query1 are (010304),
(010902), (010240) and (010500). By combining the identifier-codes into a set, the
query-identifier-element is formed. The members of the query-identifier-element are
compared with all the queries saved in the ‘Stored-query’ table. In this case, while
matching the query-identifier-elements, it is observed that all the identifier-codes of
Input_query1 and Q2 are similar. Further, the criteria, operator and criteria values of
both queries are matched, and here it is found that Input_query1 and Q2 are synonymous.
The average time for searching [4] whether a synonymous query exists or not is calculated
by populating MQDB with 50, 100, 200 and up to 600 queries. The search times are
shown in Table 6.
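The following rough Python sketch shows the two-step synonymous-query check for this
example; the record layout and criteria encoding are illustrative assumptions, not the
paper's schema.

# A rough sketch of the two-step synonymous-query check described above.
def is_synonymous(input_query, stored_query):
    """Step 1: the query-identifier-elements must be equal.
    Step 2: the criteria (clause, operator, value per field) must match."""
    if input_query["element"] != stored_query["element"]:
        return False
    return input_query["criteria"] == stored_query["criteria"]

input_query = {
    "element": frozenset({"010304", "010902", "010240", "010500"}),
    "criteria": {"age.group": ("where", "=", "80-84")},
}
stored_q2 = {
    "element": frozenset({"010304", "010902", "010240", "010500"}),
    "criteria": {"age.group": ("where", "=", "80-84")},
}
print(is_synonymous(input_query, stored_q2))  # True: Input_query1 matches Q2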
Table 4. Identifiers for Input_query1

Table 5. Criteria clause, relational operator and criteria for Input_query1

Table 6. Time required for searching a synonymous query from MQDB [4]
2.4 Finding the Requirement of Incremental Data [3]

The time-stamp of the query from ‘Materialized-query’ is compared with the last data
warehouse refresh date to determine the requirement of incremental updates; a small
sketch of this check follows. Here, the time-stamp of the query Q2 (‘2020-11-18’)
exceeds the data warehouse refresh date (‘2020-11-02’), and so Q2 does not require any
incremental processing of the stored results. This implies that the existing stored
results of query Q2 in MQDB are up to date.
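A small sketch of the check, assuming dates are stored as ISO date values:

from datetime import date

def needs_incremental_update(query_timestamp, last_refresh):
    """Stored results are up to date when the query was executed after the
    last data warehouse refresh."""
    return query_timestamp <= last_refresh

# Q2 from the example: executed 2020-11-18, warehouse refreshed 2020-11-02.
print(needs_incremental_update(date(2020, 11, 18), date(2020, 11, 2)))  # False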
3 Experimental Observations

The data warehouse ‘dw.edu’ is formed with approximately 1,80,000 records. MQDB
contains 200 queries. The results are recorded by executing programs written in Python
with MySQL 5.7.19. The system specifications are: Intel(R) Core(TM) 2 Duo
CPU E8400 @ 3 GHz, 3000 MHz processor with 2 GB RAM and Microsoft Windows 7
Professional OS. Further, for reducing the cost of the central-server, the data warehouse
is also placed on a cloud-server using the Amazon RDS service for MySQL, with MySQL
Workbench 8.0 CE as the client application. The instance is placed in us-east-1d
(North Virginia).
The following computations are done while collecting the experimental results (a small
sketch follows the list):

• Average time for query processing over five executions
• Standard deviation
• Standard error of mean = std dev / sqrt(n)
• Coefficient of variation = (std dev × 100) / average
• % reduction in processing time between methods, calculated as
100 × (Time of Method 1 − Time of Method 2) / Time of Method 1
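These statistics can be computed with Python's statistics module, as in the sketch
below; the sample timings are made-up illustrative values.

import statistics

timings = [0.42, 0.40, 0.43, 0.41, 0.44]  # five executions, in seconds

avg = statistics.mean(timings)
std_dev = statistics.stdev(timings)        # sample standard deviation
sem = std_dev / (len(timings) ** 0.5)      # standard error of mean
cv = (std_dev * 100) / avg                 # coefficient of variation (%)

def pct_reduction(time_method_1, time_method_2):
    """Percentage reduction of Method 2's time relative to Method 1's."""
    return 100 * (time_method_1 - time_method_2) / time_method_1

print(round(pct_reduction(2.0, 0.1), 1))   # e.g. 95.0 (% reduction)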
3.1 Processing with Data Warehouse on Central-Server for Synonymous Queries

The following methods are used for computing the time required for processing
synonymous queries:

a. Syn_DW: The query processing time is the same as that for generating results using
the data warehouse.
b. Syn_DC: The results of non-aggregate queries (here query Q1) are computed through
the data warehouse. The time required for processing aggregate queries is the same
as that taken to navigate the records of the data-cubes and fetch the aggregates.
c. Syn_MV: The total time required to invoke the views and thereafter retrieve the
stored results.
d. Syn_MQDB: The following process timings are considered [4]:
i) Time to search for a synonymous query in MQDB
ii) Finding if an incremental update is required
iii) Retrieving the results from MQDB and updating the meta-data information

The synonymous query search time using MQDB depends on the following parameters:

i. Generation of the query-identifier-element (shown in Sect. 2.3)
ii. Total records navigated for finding a similar query

Generation of the query-identifier-element depends on the number of field-function-
criteria in a query, and this disparity in time is negligible. As observed in Table 6, the
search time for a synonymous query in MQDB varies with the number of records stored
in it. Therefore, the key factor determining the total processing time of queries is the
number of queries in MQDB. Considering 200 queries in MQDB, Table 7 depicts the
average search time for queries Q1 to Q5.
Table 7. Average search time for queries Q1 to Q5 using MQDB

Table 8 depicts the total time for processing synonymous queries without updated
data using a data warehouse, data-cubes, materialized-views and MQDB. Using MQDB,
the total processing time includes the search time for a synonymous query in MQDB
(Table 7) followed by the sum of the processing times of operations (ii) and (iii). The
processing time using MQDB is depicted with six significant digits since the variation
in it is insignificant.
Table 8. Processing time for synonymous queries with no incremental data and data warehouse
on central-server

Table 9 shows the decrease in the processing time of queries between the following
methods (see Fig. 1):
• Syn_MQDB compared to Syn_DW
• Syn_MQDB compared to Syn_DC
• Syn_MQDB compared to Syn_MV
3.2 Synonymous Query Processing with Data Warehouse on Cloud-Server

A major advantage of placing the data warehouse on a remote cloud-server is that the
storage space requirement and the cost of maintaining the central-server are minimized.
Placing the warehouse data on the cloud-server increases the accessibility of the records.

Table 9. Reduction in processing time of synonymous queries using MQDB as compared to using
data warehouse, data-cubes and materialized-views

Fig. 1. Synonymous query processing time using data warehouse, data-cubes, materialized-views
and MQDB

Moreover, load balancing, overall performance, automated backups, etc. are taken care of
by the cloud-server. The experimental results are collected by keeping the data warehouse
and the cube on the remote server while MQDB and the result table reside on the
central-server. While recording the experimental results using a cloud-server, the factors
affecting the time to connect to a remote cloud instance, such as the provider, the location
of the instance on the cloud-server, the connection bandwidth and the time latency, are
identified. An additional time of approximately 3.6272 s is required to connect to a remote
cloud instance regardless of the query processing. The following methods are used to record
the observations:

a. Syn_DW_Cld: Since the data warehouse is placed on the remote server, total time
includes the time the connection time to cloud instance thereafter generating the
results using the data warehouse.
b. Syn_DC_Cld: The time for queries with no aggregate function is same as
Syn_DW_Cld whereas for aggregate queries, the total time is the sum of time
required connecting to the remote cloud instance followed by navigating the records
of data-cubes in order for fetching the stored aggregate.
c. Syn_MQDB: This approach does not require accessing the data warehouse and so
there is no requirement to connect to the cloud instance. The results are retrieved
32 S. Chakraborty

from MQDB from central-server. Therefore, the total time of synonymous queries
without updated results using MQDB is same as depicted in Sect. 4.1. Table 10
shows the time for processing synonymous queries without updated results using
data warehouse, cubes and MQDB. Table 11 shows decrease in processing time of
queries between the methods Syn_MQDB and Syn_DW_Cld and Syn_MQDB and
Syn_DC_Cld. (Fig. 2)
Table 10. Processing time for synonymous queries with no incremental data and data warehouse
on cloud-server

Table 11. Reduction in processing time of synonymous queries using MQDB as compared to
using data warehouse and data-cubes on cloud-server

Fig. 2. Processing time of synonymous queries using data warehouse, data-cubes and MQDB
with data warehouse on cloud-server
4 Conclusion and Future Scope

The study depicts an innovative approach for generating results of OLAP queries from
the data warehouse. The queries, their results and meta-data are stored in the
Materialized-Query-Database. The following conclusions are drawn from the research study:

• A substantial decrease in the processing time of synonymous queries is achieved by
using MQDB in comparison to other approaches such as the data warehouse and
data-cubes.
• With the data warehouse placed on the central-server, the decrease in processing time
of synonymous queries without updated results using MQDB is almost 95% and 84% in
comparison to using the data warehouse and data-cubes respectively.
• When the data warehouse is placed on the cloud-server, an additional time of 3.6272 s
on average is needed to connect to the remote cloud instance. However, the connection
time varies depending on the provider, the location of the cloud instance, the bandwidth
and the time latency.
• While processing synonymous queries without incremental updates, a significant
reduction in processing time of almost 99% and 98% is achieved in comparison to using
the data warehouse and data-cubes respectively.

As a part of future work, the MQDB application and the query result tables can be placed
on the cloud-server to broaden the use of the approach.
References
1. Agrawal, R., Gupta, A., Sarawagi, S.: Modeling multidimensional databases. In: Proceedings
of the 13th International Conference on Data Engineering, pp. 232–243 (1997)
2. Bara, A., Lungu, I., Velicanu, M., Diaconita, V., Botha, I.: Improving query performance in
virtual data warehouses. WSEAS Trans. Inf. Sci. Appl. 5(5), 632–641 (2008)
3. Chakraborty, S., Doshi, J.: Materialized queries with incremental updates. Springer Smart
Innov. Syst. Technol. 1, 31–40 (2018)
4. Chakraborty, S., Doshi, J.: Faster result retrieval from health care product sales data warehouse
using materialized queries. Evol. Computational Intell. Adv. Intell. Syst. Computing 1176,
1–9 (2020)
5. Chaudhuri, S., Dayal, U.: An overview of data warehousing and OLAP technology. ACM
SIGMOD Rec. 26(1), 65–74 (1997)
6. Chun, S., Chung, C., Lee, J., Lee, S.: Dynamic update cube for range-sum queries. In:
Proceedings of the 27th VLDB Conference, pp 521–530 (2001)
7. Datta, A., Thomas, H.: The cube data model: a conceptual model and algebra for on-line
analytical processing in data warehouses. Decis. Support Syst. 27(3), 289–301 (1999)
8. Deshpande, P., Agarwal, S., Naughton, J., Ramakrishnan, R.: Computation of multi-
dimensional aggregates. In: Proceedings of the 22nd VLDB Conference, pp. 506–521
(1996)
9. Gupta, A., Mumick, I., Subrahmanian, V.: Maintaining views incrementally. In: Proceedings
of the 1993 ACM SIGMOD International Conference on Management of Data, Washington,
pp. 157–166 (1993)
10. Gupta, A., Mumick, I.: Maintenance of materialized-views: problems, techniques and
applications. Bull. Tech. Commit. Data Eng., IEEE Comput. Soc. 18(2), 3–18 (1995)
11. Gupta, A., Jagadish, H.V., Singh Mumick, I.: Data integration using self-maintainable views.
In: Apers, P., Bouzeghoub, M., Gardarin, G. (eds.) EDBT 1996. LNCS, vol. 1057, pp. 140–
144. Springer, Heidelberg (1996). https://doi.org/10.1007/BFb0014149
12. Gupta, G.: Introduction to data mining with case studies. PHI Learning Private Limited (2014)
13. Han, J., Kamber, M., Pei, J.: Data Mining-Concepts and Techniques, Third edn. Morgan
Kaufman Publishers (2011)
14. Harinarayan, V., Rajaraman, A., Ullman, J.: Implementing data-cubes efficiently. In: Pro-
ceedings of the 1996 ACM SIGMOD International Conference on Management of Data,
pp. 205–216 (1996)
15. Laudon, K., Laudon, J., Dass, R.: Management Information Systems, Eleventh edn., pp. 45–
49. Pearson Education (2010)
16. Mumick, I., Quass, D., Mumick, B.: Maintenance of data-cubes and summary tables in a ware-
house. In: Proceedings of the 1997 ACM SIGMOD International Conference on Management
of Data, pp. 100–111 (1997)
17. Neil, P., Quass, D.: Improved query performance with variant indexes. In: Proceedings of the
1997 ACM SIGMOD International Conference on Management of Data, Tucson, pp. 38–49
(1997)
18. Quass, D.: Maintenance Expressions for Views with Aggregation. Views’96. http://ilpubs.stanford.edu:8090/183/1/1996-54.pdf (1996)
19. Serranoa, M., Trujillo, J., Calero, C., Piattini, M.: Metrics for data warehouse conceptual
model’s understandability. Inf. Softw. Technol. 49(8), 851–870 (2007)
20. Shanmugasundaram, J., Fayyad, U., Bradley, P.: Compressed data-cubes for OLAP aggregate
query approximation on continuous dimensions. In: Proceedings of the 5th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining, pp. 223–232 (1999)
21. Thareja, R.: Data Warehousing. Oxford University Press (2009)
22. The Office of the Registrar General and Census Commissioner, Ministry of Home Affairs.
http://censusindia.gov.in (2020)
23. Vanichayobon, S.: Indexing Techniques for Data Warehouses’ Queries. http://www.cs.ou.edu/~database/documents/vg99.pdf (1999)
24. Zhou, J., Larson, P., Elmongui, H.: Lazy maintenance of materialized-views. In: Proceedings
of the 33rd International Conference on Very large Databases, pp. 231–242 (2007)
25. Zhuge, Y., Molina, H., Hammer, J., Widom, J.: View maintenance in a warehousing environ-
ment. In: Proceedings of the 1995 ACM SIGMOD International Conference on Management
of Data, pp. 316–327 (1995)
A Trust-Based Mechanism to Improve Security
of Wireless Sensor Networks

Sangeeta Rani1, Dinesh Kumar1, and Vikram Singh2(B)

1 Department of Computer Science and Engineering, University College of Engineering and
Technology, Guru Kashi University, Talwandi Sabo, Punjab, India
2 Department of Computer Science and Engineering, Chaudhary Devi Lal University, Sirsa,
India
vikramsinghkuk@yahoo.com
Abstract. Wireless sensor networks are highly vulnerable to security attacks
due to their dynamic nature. Trust-based mechanisms offer avenues for the design
of efficient schemes that can improve the security of a wireless sensor network.
In this research work, a trust calculation scheme based on the packet delivery
ratio (PDR) is proposed. The trust represents the reputation of a wireless sensor
node. The packet delivery ratio is calculated as the number of packets received
divided by the total number of packets transmitted. A threshold value of PDR is
defined, and a node whose trust is below the threshold value is tagged as the
least trusted node. The trust is calculated both directly and indirectly, and
trust values are updated from time to time. The proposed trust model has been
implemented in NS2, and results were analysed in terms of network throughput,
packet loss, and average energy consumption. The results of the trust-based
scheme show considerable improvement over the shield-based scheme. It was
concluded from the results that the proposed technique detects malicious nodes
earlier than the shield-based technique. Further, the proposed security scheme
increases the throughput of the network, while packet loss and energy consumption
are reduced as compared to the shield-based scheme in a wireless sensor network.

Keywords: WSN · Trust scheme · Shield scheme · Direct trust · Indirect trust
1 Introduction
A wireless sensor network is a distributed network that comprises multiple randomly
scattered sensor nodes. Inexpensive sensor nodes make wireless sensor networks extremely
suitable for different applications including warzone monitoring, environment scrutiny,
traffic management, health care, and other domains. These nodes sense physical or
environmental parameters such as pollution, climate, acoustics, tremor, gravity, etc. at
various places. Monitoring results are forwarded to the sink, which, after collecting all
the data, sends them to the customer over the web. Hundreds of nodes are scattered in a
free and hostile atmosphere to receive data from the sensor area. This task needs
collaboration amongst an enormous number of nodes for region surveillance [1]. Since the
capacity of a node is restricted with regard to monitoring zone and transmission radii,
the nodes are left with no option but to cooperate in the network. Therefore, the nodes’
collaboration contributes significantly to improve the performance of wireless sensor
networks. Figure 1 shows the typical model of a wireless sensor network.
Fig. 1. Topological composition of a WSN
As shown in Fig. 1, the sensor nodes sense data from time to time and send it to
the base station in a multi-hop fashion. The network has a permanent base station at a
central location that manages the information of the nodes. N sensor nodes are randomly
scattered in a square region L × L, and the transmission
range of every sensor node is fixed to R. Further, the power source on every node has
limited energy, and $E_s$ denotes the initial energy [2]. The topological composition of
the network is an undirected graph $G(V, C)$, where $V = \{1, 2, \ldots, i, \ldots, N\}$
is the set of all sensor nodes and $C = \{C_{12}, C_{13}, \ldots, C_{ij} \mid i \neq j,\; i, j \in V\}$
represents the set of fixed-route channels among the nodes. The set of forwarding
neighbour nodes of node i is given as:

$$FN(i) = \{\, j \mid d_{ij} < R,\; d_{js} < d_{is},\; i, j \in V \,\} \qquad (1)$$

Here, $d_{ij}$ is the distance between node i and node j, and $d_{js}$ is the distance
from node j to the sink [3].
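As a rough illustration of Eq. (1), the following Python sketch computes the forwarding
neighbour set from node coordinates; the distance helper and the sample topology are
assumptions for illustration.

import math

# The forwarding neighbours of node i are the nodes within radio range R
# that are closer to the sink than i is. Coordinates below are made up.
def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def forwarding_neighbours(i, positions, sink, R):
    d_is = dist(positions[i], sink)
    return {j for j in positions
            if j != i
            and dist(positions[i], positions[j]) < R   # d_ij < R
            and dist(positions[j], sink) < d_is}       # d_js < d_is

positions = {1: (0, 0), 2: (30, 10), 3: (60, 5), 4: (10, 80)}
print(forwarding_neighbours(1, positions, sink=(100, 0), R=50))  # {2}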
1.1 Security in WSN

Randomly deployed sensor nodes, for example in an aggressive war zone, the inability
to enforce physically protected regions, and the properties of wireless channels make
wireless sensor networks more vulnerable to security risks in contrast to classic
networks. The characteristics of sensor nodes, for example restricted transmission
bandwidth, memory space, energy, computational power, and positioning area, are the
main reasons for these additional security risks. All these risks reject the suitability
of outdated cryptosystem-based security schemes for wireless sensor networks [4].
The network security domain greatly depends on asymmetric cryptography to combat
outside attacks in web, end-to-end, and ad hoc networks.

Nevertheless, the restricted processing ability and resources make encryption
algorithms inappropriate for wireless sensor networks due to the complexity involved and
the requirement for immense computing memory. Moreover, encryption-assisted security
approaches can only counter external security threats and are not effective against
internal intrusions. In wireless sensor networks, unlike other networks, nodes may deny
cooperation with service seekers to reduce energy consumption, and such nodes are
referred to as selfish nodes. Even though their active participation in launching attacks
in the network is nil, plenty of selfish nodes can lead to dire consequences. Apparently,
current encryption systems cannot identify the risks resulting from validated selfish
nodes [5]. Therefore, an effective security mechanism is essentially required to deal
with these issues. In recent years, trust management has been considered an efficient
complementary tool to guarantee the security of sensor networks.
1.2 Trust Management

Trust management evaluates nodes based on their earlier performance to detect reliable
and untrusted nodes. In addition, it ensures that nodes cooperate through compensations
and consequences. Cryptographic mechanisms are unable to ensure complete defense of the
network since sensor nodes are restricted in terms of tamper-resistant hardware and get
compromised early. Therefore, trust management incessantly upgrades security by
observing the nature and performance of the nodes, evaluating their reliability, and
discovering trustworthy nodes to cooperate with [6]. In particular, trust establishment
in a network brings many advantages, which are described as follows:

i. Trust offers a remedy for allowing related access control as per the behavior of
sensor nodes and their performance, which is impossible to obtain through outdated
security systems.
ii. Trust provides trusted routing paths without including any malicious, selfish or
faulty nodes to help the routing process.

Trust improves the robustness and trustworthiness of conventional security mechanisms
by confirming that only trusted nodes take part in validation, certification, or key
management.
1.2.1 Trust Management in Wireless Sensor Networks

In wireless sensor networks, trust can be referred to as the confidence of one node that
another node will behave in a specific manner. Furthermore, trust represents the ability
of a node to deliver an essential service. Trust is a measure of belief in the subsequent
actions of other nodes, depending on past understanding and observation of the behaviour
of the nodes. There is a need for a strong understanding of trust management to construct
a strong and protected trust management mechanism for wireless sensor networks. Trust
management deals with the management of trust relationships, for example, information
gathering, the evaluation of the measures concerning the trust association, the
observation and re-evaluation of current associations [7], and decision-making regarding
the trust. The four characteristics of autonomic trust management are as follows:

i. Trust establishment: It refers to the process of creating a trust relationship between
two interacting parties.
ii. Trust monitoring: This process is to observe and record the performance or
behaviour of the trustee by the trustor or by the trustor's representative.
iii. Trust assessment: Based on the verified information, this process evaluates the
trustworthiness of the trustee by the trustor or by the trustor's representative.
iv. Trust control and re-establishment: Trust relationships are re-created, or equivalent
steps are enforced, to manage trust relationships based on the trust assessment.

In wireless sensor networks, trust management is concerned with observing and recording
the performance and behaviour of nodes to assess trust and initiate trust relationships,
managing trust relationships [8], trust evaluation norms and strategies, and assisting
security services such as access control, key management, and misconduct detection, as
illustrated in Fig. 2:
Fig. 2. Components of trust management system

Therefore, the framework of trust management includes three fundamental operational
segments that ensure effectual trust management. The functions of every segment are
defined below in brief:

i. Monitoring and learning: This block monitors and learns the behaviour and
performance of a node and provides it as input to the next block, the trust evaluation.
It is associated with a network edge to gather information about nodes.
ii. Trust evaluation: This is the central block of the TM system that estimates and
integrates trust and reputation values, quantizes the decision-making trust value,
handles information aging, etc. It delivers its output to the Recommendation
Management module [9].
iii. Trust propagation unit: This block addresses the delivery and reception of
recommendations (belief values). Moreover, it offers the nodes' trust values for
different types of services.
1.3 Trust-Based Security Models for WSN

Over time, several trust management plans have been put forward for wireless sensor
networks. Some plans are universal trust systems and others are developed for special
objectives including access management, aggregation, routing, surveillance, and attack
detection. In this part, different types of security-based trust management models are
elaborated in the following way:
a. BTGMA: BTGMA (Behavioral Trust-based on the Geometric Mean Approach)
is a distributed trust management mechanism for wireless sensor networks. Based on the
geometric mean of the QoS parameters, it calculates direct and indirect trusts to measure
nodes' trustworthiness [10]. Each network node can compute the direct trust of another
node in the network. The following formula is used to compute the direct trust value
$DT_I(J)$ of node I on node J over K dissimilar trust metrics m:

$$DT_I(J) = \left( \prod_{K} m_{I,J,K} \right)^{1/K} \qquad (2)$$
At the same time, depending on the trust values given by L neighbors of node J,
the indirect trust value $IT_I(J)$ computed by node I for node J is measured using the
following formula [11]:

$$IT_I(J) = \left( \prod_{L} DT_L(J) \right)^{1/L} \qquad (3)$$

Once both the direct as well as the indirect trust are obtained, a weighted mechanism is
used to combine them to obtain the total trust in the following manner:

$$TT = DT \cdot W_a + IT \cdot W_b \qquad (4)$$

in which $W_a$ and $W_b$ are the weights of direct trust and indirect trust respectively.
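A small Python sketch of Eqs. (2)–(4) follows; the sample metric values and the
weights are illustrative assumptions.

import math

# Geometric-mean direct trust, geometric-mean indirect trust, and the
# weighted total trust of Eqs. (2)-(4). Sample values are made up.
def geometric_mean(values):
    return math.prod(values) ** (1 / len(values))

def direct_trust(metrics):
    """Eq. (2): geometric mean of K QoS metrics m_{I,J,K} observed by I on J."""
    return geometric_mean(metrics)

def indirect_trust(neighbour_direct_trusts):
    """Eq. (3): geometric mean of the direct trusts of L neighbours on J."""
    return geometric_mean(neighbour_direct_trusts)

def total_trust(dt, it, w_a=0.7, w_b=0.3):
    """Eq. (4): weighted combination of direct and indirect trust."""
    return dt * w_a + it * w_b

dt = direct_trust([0.9, 0.8, 0.85])          # K = 3 QoS metrics
it = indirect_trust([0.75, 0.8, 0.9, 0.7])   # L = 4 neighbour opinions
print(round(total_trust(dt, it), 3))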
b. TSRF: TSRF (Trust-Aware Secure Routing Framework) is a customised routing
algorithm that treats the combination of trust metrics and QoS metrics as routing metrics.
In this scheme, some steps are followed to obtain the best path from the source node to
the fixed destination. First, the source node (for example, N1) forwards a TREQ (trust
request) packet to the nearby node (such as N2). If the neighboring node discovers
that it has already received a similar packet, it will reject the TREQ packet, or else it
will forward it to all its nearby nodes. It is mandatory for all the nodes receiving the
packet to deliver their assessment regarding N2 to N1. Then, N1 unifies the direct and
indirect trust values to calculate the trust value of N2 and decide its trustworthiness
[12]. N1 can use this step to figure out a group of trustworthy communicating nodes, and
then it relays them an RREQ (route request). If any of the trusted intermediate nodes
receiving the RREQ has the best path to the destination, it will deliver an RREP to N1,
or else it will iterate the earlier steps to obtain the subsequent trustable node. The
destination node receiving an RREQ will send an RREP to N1 through the chosen path.
Afterward, the source node N1 can use this path to deliver any packet. TSRF is
independent of explicit routing protocols; hence it may function even if there is a change
in the network's routing protocol. Furthermore, it can fruitfully deal with many attacks
on trust models. However, when the malevolent proportion turns out to be large, the
performance declines.
c. ATSR: ATSR (Ambient Trust Sensor Routing) is a location-based trust-aware routing
protocol. This protocol shields wireless sensor networks against routing intrusions.
It is a decentralised trust paradigm depending on both direct as well as indirect trust
information. In this protocol, based on a set of trust metrics, nodes compute a direct
trust value by observing the behavior of their nearby nodes. One such metric is the
packet forwarding metric, using which nodes that reject or selectively forward packets
are detected. Also, each node requests the trust information of its nearby nodes from its
neighboring nodes to collect indirect trust values. Lastly, routing decisions are made
based on both geographic information (distance from the sink) as well as the total trust
information [13].
d. TBMD: The TBMD (Trust Based Abuse Detection) model is a distributed trust model
inspired by the behaviour of the nodes. It can tackle the abuse by selfish or malicious
nodes in a wireless sensor network. It calculates trust using fuzzy logic, which considers
both direct as well as indirect observations for trust computation. Each node has a trust
table containing the trustworthiness of other motes. The model has three phases called
trust calculation, trust aggregation and abuse detection, and finally the propagation
phase. Trust calculation relies upon two kinds of trust values. The first is the past
trust value, which is calculated by observing the node's behaviour and provides an
estimate of the ability of the node. The next is the current trust of the node; to
calculate this value, fuzzy rule prediction is used. Trust values are forwarded to the
sink to distinguish good nodes from malevolent nodes based on a threshold. Afterward,
the trust is sent to the nodes in the network and the trust value is updated in the
status record [14].
e. BTRM-WSN: The BTRM-WSN is an ACS (Ant Colony System) based meta-heuristic trust and
reputation paradigm for wireless sensor networks. This model aims at selecting the most
trusted node over the most prestigious route delivering a specific amenity. This model
uses some nodes as customers (requesting certain services) and some nodes as servers
(providing those amenities). The main risk in this approach is malicious servers that
provide forged services to the customers. This model follows several steps to achieve
its goal, as illustrated in Fig. 3.

Initially, the customer launches a bunch of ants across the network. The ants follow
the pheromone trail left by other ants to discover a path directed to the sensor nodes
that provide the necessary service [15]. Next, the model provides scores and ranks for
each explored route. Each pathway is given a score depending on its pheromone
concentration. Afterward, the route having the highest pheromone trail is selected,
because the most reputed pathway leads to the most reliable service offered by sensor
nodes. Then, the client demands the service from the chosen trustable server, assesses
the obtained service, and calculates its satisfaction level with the service. Eventually,
the customer rewards or punishes the selected route depending on its level of
satisfaction; the pheromone value of this pathway is increased or decreased accordingly.

Fig. 3. Steps in BTRM-WSN and UTM schemes
f. UTM: UTM (Unified Trust Model) is a trust management model using history, reference,
suggestion, and platform validation as features for computing trust. It is inspired by
the idea of remote attestation, in which a platform (the attester) proves its credibility
by reporting the status of its assets to a remote party (the verifier). The verifier
forwards a verification request or challenge to the attester/entrant to calculate a
cryptographic checksum/hash of its internal state and then return it, to be compared
against the recognised good replicas. This model assumes that network nodes are deployed
with a TPM (Trusted Platform Module) and can validate the truthfulness of their nearby
nodes. It follows five steps for trust computation. First, information is collected
regarding the nodes providing the aggregation of essential services. Next, the trust of
every node is computed, and then the most trusted node is selected as the service
provider. Afterward, the obtained service is transacted and evaluated. Ultimately, the
service provider is rewarded or punished based on the satisfaction level with the
service [16].
g. ARTMM: The ARTMM (Attack-Resistant Trust Model) is developed for UASNs and uses
multi-dimensional trust metrics. This model is built because the classic trust schemes
for TWSNs are inefficient when used directly to shield UASNs, due to certain issues
including the unreliable underwater transmission medium, low-quality links, and the
mobile network setting. This model uses link trust, data trust, and node trust as the
trust metrics. Link trust and node trust represent the reliability of the data, while
data trust evaluates the fault tolerance and data consistency. To compute link trust,
both link quality as well as link capacity are considered. At the same time, node trust
is obtained based on the integrity and capability of the node. ARTMM follows the idea of
a sliding time window to calculate and update trust values based on both direct and
indirect observations. Nevertheless, indirect trust is not included when the number of
packets shared between a node pair exceeds the set limit.
2 Literature Review

2.1 Trust-Based Optimisation Techniques for WSN Security

Baohe Pang et al. [17] suggested a strategy, FTM-ABC, to detect malicious nodes based on
an FTM (fuzzy trust model) and the ABC (artificial bee colony) algorithm. The indirect
trust was evaluated using the FTM model [17]. The ABC algorithm assisted in optimising
the trust model so that dishonest recommendation attacks were detected. Moreover, the
recommendation deviation and interaction index deviation were included in the fitness
function for improving the efficiency of the suggested approach. The simulation outcomes
depicted that the suggested approach was capable of attaining a higher recognition rate
and a lower FPR. A trust-based ACO (ant colony optimisation) algorithm was introduced by
Ziwen Sun et al. [18], in which a node trust evaluation model based on D-S evidence
theory was utilised for enhancing the security of wireless sensor networks [18]. The NS2
simulator was applied to simulate the introduced approach to evaluate security issues in
the presence of attacks. The results revealed that the introduced algorithm offered
resistance against malicious attacks concerning E2E (end-to-end) delay and throughput.
A novel trust and reputation model called BTRMC (Bio-inspired Trust and Reputation Model
using Clustering) was developed by Ichraf El Haj Hamad et al. [19] for wireless sensor
networks. The trust management was integrated with the distribution of the clusters for
estimating suitable trust values. TRMSim-WSN was used to perform the experiments [19].
The experimental results indicated that the developed model yielded superior accuracy
and a higher probability of reaching the genuine sensors in the transactions. A trust
model was proposed by Tarek Gaber et al. [20] and implemented for calculating a trust
level for each node [20]. The BOA (Bat Optimisation Algorithm) was exploited for
selecting the cluster heads based on three metrics. The outcomes validated that the
proposed approach provided energy efficiency and extended the duration of the network.
In addition, this technique attained an average trust value of around 30–50% for
detecting the malicious nodes. Table 1 summarises the trust-based optimisation
techniques for enhancing security in wireless sensor networks.
2.2 Trust-Based Hierarchical Routing Protocol for WSN Security

Weidong Fang et al. [21] designed a trust management scheme based on an energy-efficient
hierarchical routing algorithm known as LEACH-TM [21]. This algorithm was capable of
dealing with internal attacks. The simulation outcomes exhibited that the designed
algorithm performed more efficiently as compared to others: it enhanced the life span of
the network, balanced the energy utilisation and alleviated the impact of malicious nodes
while selecting the cluster head, as the entire network was secured using it. An
energy-efficient hierarchical trust management method was established by Reshmi et al.
[22] for diminishing the power consumption rate of sensor nodes [22]. For this, the trust
values were computed on demand. The system did not stop working in
Table 1. Trust-based optimisation techniques for WSN security

Author | Technique used | Parameters | Results
Baohe Pang, Zhijun Teng, Huiyang Sun, Chunqiu Du, Meng Li, Weihua Zhu | A fuzzy trust model and artificial bee colony algorithm (FTM-ABC) | False-positive rate | This technique provided a higher recognition rate and a lower FPR
Ziwen Sun, Zhiwei Zhang, Cheng Xiao, Gang Qu | Ant colony algorithm | End-to-end delay, throughput | The introduced algorithm offered resistance against malicious attacks
Ichraf El Haj Hamad, Mohamed Abid | Bio-inspired trust and reputation model using clustering | Accuracy | The developed model yielded superior accuracy
Tarek Gaber, Sarah Abdelwahab, Aboul Ella Hassanien | Bat optimisation algorithm | Energy efficiency, network duration | This technique attained an average trust value of around 30–50% for detecting the malicious nodes
case of a compromised cluster head. The intrusion was detected using the trust management
model. The established model assisted in maximizing the packet delivery ratio in contrast
to other systems and effectively recognizing the malicious nodes. A hierarchical trust
model was formulated by Li Ma et al. [23] for cluster-based wireless sensor networks. The
differences between cluster heads and general nodes were considered to integrate the
distributed and centralised trust management systems [23]. The defined trust value was
efficient to tackle various security attacks that occur in wireless sensor networks. The
experimental results exhibited that the formulated model was highly adaptable, had fault
tolerance and had potential for recognising malicious nodes. Consequently, the security
of the network was improved. A highly scalable cluster-based hierarchical trust
management protocol was designed by Fenye Bao et al. [24] for wireless sensor networks to
address the issue of malicious nodes [24]. The effectiveness of this protocol was
demonstrated using trust-based geographic routing and trust-based intrusion detection
systems. The performance of each application was increased with the recognition of the
best trust composition and formation. The outcomes confirmed that the intended model
performed well with flooding-based routing concerning delivery ratio and delay, and
showed the superiority of this model over conventional models. Table 2 presents a summary
of the trust-based hierarchical routing protocols for dealing with the security of
wireless sensor networks.
2.3 Distributed Trust Models for WSN Security

Meenu Mathew et al. [25] recommended an efficient distributed technique based on
TCNPR [25]. The Trust Calculation based on Nodes Properties and Recommendations (TCNPR)
technique is planned based on the components of the node. This technique was utilised
to compute
Table 2. Trust-based hierarchical routing protocols for WSN security

Author | Technique used | Parameters | Results
Weidong Fang, Wuxiong Zhang, Yinxuan Yang | LEACH-TM | Network lifetime, power consumption | The designed algorithm performed more efficiently as compared to others
Reshmi V, Sajitha M | An energy-efficient hierarchical trust management scheme | Packet delivery ratio, energy consumption | The established model assisted in maximizing the PDR in contrast to other systems
Li Ma, Guangjie Liu | A hierarchical trust model | Adaptability, fault tolerance | The formulated model was highly adaptable and had fault tolerance
Fenye Bao, Ing-Ray Chen, MoonJeong Chang, Jin-Hee Cho | A highly scalable cluster-based hierarchical trust management protocol | Packet delivery ratio, overhead and delay | The outcomes confirmed the superiority of this model over conventional models
the direct and indirect trust values following some trust metrics. The reliability of
nodes was determined and the availability of malicious nodes in the one-hop communication
model was recognised using this technique. The analysis results exhibited the superiority
of the recommended technique as compared to other models. An EDTM (Efficient Distributed
Trust Model) was constructed by Jinfang Jiang et al. [26] for wireless sensor networks
[26]. Initially, the number of packets that the sensor nodes had received was considered
to compute the direct trust and recommendation trust. Subsequently, the direct trust was
computed based on communication trust, energy trust, and data trust. Moreover, trust
reliability and familiarity were described for enhancing the accuracy of trust. The
constructed model was capable of quantifying the reliability of sensor nodes with more
exactness and preventing security breaches more effectively. The experimental results
revealed that the constructed model performs better in comparison with other similar
models. A fuzzy, fully distributed TMS (Trust Management System) was investigated by
Hossein Jadidoleslamy et al. [27] for wireless sensor networks [27]. Unlike traditional
TMSs, this system had diverse attributes such as fuzzy-natured trust calculation
criteria, a fuzzy trust calculation process and the potential to predict trust. In the
end, simulations were conducted to compare this system with other systems. The outcomes
depicted that the investigated system was scalable and accurate and provided enhanced
fault tolerance and execution speed. An innovative T-IDS (trust-based IDS) was designed
for RPL by Faiza Medjek et al. [28]. This system was planned as a distributed trust-based
Intrusion Detection System for detecting new intrusions [28]. For this, deviations in
network behavior were compared. Every node was regarded
as a monitoring node and utilised for collaborating with its peers for detecting
intrusions and reporting them to the 6BR (6LoWPAN Border Router). Furthermore, every node
contained a Trusted Platform Module co-processor to handle the computation cost and
storage related to identification and to off-load security-related operations. The system
proved effective, and the designed system had resistance against the Sybil attack.
Table 3 below lists distributed trust models proposed for WSN security.
Table 3. Distributed trust models for WSN security

Author | Technique used | Parameters | Results
Meenu Mathew, I. K. Gayathri, Aiswariya Raj | Trust calculation based on nodes properties and recommendations | Accuracy, scalability | The analysis results exhibited the superiority of the recommended technique as compared to other models
Jinfang Jiang, Guangjie Han, Feng Wang, Lei Shu, Mohsen Guizani | Efficient Distributed Trust Model (EDTM) | Threshold value | The experimental results revealed that the constructed model performs better in comparison with other similar models
Hossein Jadidoleslamy, Mohammad Reza Aref, Hossein Bahramgiri | A fuzzy fully distributed trust management system | Energy consumption, accuracy, scalability, fault tolerance, and execution speed | The outcomes depicted that the investigated system was scalable and accurate
Faiza Medjek, Djamel Tandjaoui, Imed Romdhani, Nabil Djedjig | Trust-based IDS | Computation cost and storage | The designed system had resistance against the Sybil attack
3 Issues and Challenges in WSN Security

The wireless sensor network is a decentralised type of network, due to which security is
a major network concern. Wireless sensor networks are vulnerable to various types of
security attacks. Various schemes have been proposed to date to improve the security of
wireless sensor networks. The trust-based mechanism is a popular approach to increase the
security of wireless sensor networks. The trust value defines the reputation of a sensor
node. It is observed from previous studies that the trust of sensor nodes is generally
calculated with bio-inspired algorithms like ant colony, bee colony, etc. Bio-inspired
algorithms follow an iterative process to calculate the final value, which is the trust
value of the sensor node. Data often needs to be transmitted urgently to the base station
by a sensor node, but applying bio-inspired techniques increases the delay in the
network, and increased delay in turn increases the energy consumption of the network. An
approach for trust calculation therefore needs to be proposed which can overcome the
delay issue in a wireless sensor network.
4 Research Methodology

The implementation of the network can be done using a limited number of sensor nodes and
sink nodes. The sensor nodes are utilised for sensing diverse kinds of natural conditions
such as temperature, pressure, etc. The sensor devices are heterogeneous in nature, which
implies the sensors have diverse battery and processing power. This research work
suggests a trust-based technique to alleviate the VNA (version number attack). This
technique is executed in three stages in which pre-processing is performed, trust is
computed and trust is updated. The sensor devices having lower trust are considered
malicious nodes in the network. The trust methods utilised to compute the direct and
indirect trust are defined as:

$$T_{ij}^X(t) = (1 - \alpha)\, T_{ij}^X(t - \Delta t) + \alpha\, T_{ij}^{X,direct}(t), \quad \text{if } j == k \qquad (5)$$

where X ∈ {community-interest, cooperativeness, honesty}.
• Direct trust observations

Fig. 4. Direct trust observation

$T_{ij}^{honesty,direct}$ denotes the honesty value based on direct observation of node j
by node i. The degree of cooperativeness of node j towards i is obtained from
$T_{ij}^{cooperativeness,direct}$ based on direct observations over the range [0, t].
Figure 4 represents the process of quantifying node j by node i with the help of direct
observation and past experiences.
• Indirect recommendations

Fig. 5. Indirect trust recommendation

$$T_{ij}^X(t) = \frac{\beta\, T_{ik}^X(t)}{1 + \beta\, T_{ik}^X(t)} \left[ (1 - \gamma)\, T_{kj}^X(t - \Delta t) + \gamma\, T_{kj}^{X,recom}(t) \right], \quad \text{if } j \neq k \qquad (6)$$

Here, $T_{ik}^X$ denotes the trust value of node i towards the recommending node k.
Increasing $T_{ik}^X$ or β proportionally expands the contribution of the recommended
trust. Figure 5 illustrates the process of computing node j through node i by means of
recommendations and past experience.
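The following rough Python sketch implements the two update rules of Eqs. (5) and (6);
the α, β, γ values and the trust-table layout are illustrative assumptions.

# alpha, beta and gamma values below are assumptions for illustration.
ALPHA, GAMMA = 0.5, 0.5

def update_direct_trust(old_trust, direct_observation, alpha=ALPHA):
    """Eq. (5): blend past trust with the new direct observation (j == k)."""
    return (1 - alpha) * old_trust + alpha * direct_observation

def update_indirect_trust(trust_in_k, old_trust_kj, recommendation,
                          beta=1.0, gamma=GAMMA):
    """Eq. (6): weight node k's recommendation about j by i's trust in k (j != k)."""
    weight = (beta * trust_in_k) / (1 + beta * trust_in_k)
    return weight * ((1 - gamma) * old_trust_kj + gamma * recommendation)

# Example: update honesty trust of node j from a direct observation, then
# fold in a recommendation received from node k.
direct = update_direct_trust(old_trust=0.8, direct_observation=0.6)
indirect = update_indirect_trust(trust_in_k=0.9, old_trust_kj=0.7,
                                 recommendation=0.65)
print(round(direct, 3), round(indirect, 3))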

4.1 The Proposed Algorithm

Input: Sensor nodes


Output: Detection of malicious nodes
1. Employ network with a limited number of sensor nodes
2. Divide the entire network into finite-sized clusters
3. Choose Cluster Head in every cluster based on distance and energy consumption
4. Calculate Trust
4.1. Check the number of packets transmitted through the sensor node
4.1.1. Compute the number of packets forwarded by the sensor node
4.1.2. Calculate the PDR (packet delivery ratio) as the ratio of packets successfully delivered to packets transmitted

5. If (PDR < Threshold PDR)


5.1. Mark the sensor node as malicious
6. Construct a new path from source to destination
7. Broadcast data through the newly constructed path (a sketch of steps 4-6 follows this listing)
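As a minimal Python sketch of steps 4-6, the check below flags nodes whose PDR falls under a threshold. The threshold value and the per-node packet counters are illustrative assumptions; the paper does not state concrete values.

# Minimal sketch of the PDR-based malicious-node check (steps 4-6).
PDR_THRESHOLD = 0.5  # assumed threshold, for illustration only

def packet_delivery_ratio(packets_forwarded, packets_received):
    """PDR: fraction of packets handed to the node that it actually forwards."""
    if packets_received == 0:
        return 1.0  # nothing to forward, no evidence of misbehaviour
    return packets_forwarded / packets_received

def find_malicious(nodes):
    """nodes: dict node_id -> (forwarded, received) counters."""
    return [n for n, (fwd, rcv) in nodes.items()
            if packet_delivery_ratio(fwd, rcv) < PDR_THRESHOLD]

counters = {1: (95, 100), 2: (20, 100), 3: (70, 80)}
print(find_malicious(counters))  # [2] -> isolate node 2 and reroute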

5 Results and Discussion


The simulation is carried out to compute a number of performance parameters such as throughput, energy consumption, packet loss, and delay. These parameters are quantified in the case of an occurrence of VNA. Table 4 describes the simulation parameters and their values for the introduced Internet of Things network.

Table 4. Simulation parameters

Parameter | Value
Simulator | NS2-2.35
Number of nodes | 32
Area | 800 m × 800 m
Antenna type | Omni-directional
Channel | Wireless channel
Propagation model | Two ray

Figure 6 depicts the implementation of the new approach. It is a trust-based approach in which sensor nodes with minimal trust are regarded as malicious in the network.

Fig. 6. Screenshot of simulation of the proposed security model

Fig. 7. Plot of packet loss over time

Packet loss refers to the total number of packets lost during data delivery. The blue line in Fig. 7 (based on Table 5) reflects the packet loss values obtained in the attack case, the green line denotes the values obtained with the shield technique, and the red line shows the packet loss values of the presented trust-based approach.
Figure 8 (based on Table 6) presents a throughput-based comparison between the new and the old algorithm. The comparative analysis indicates that the new approach provides higher throughput than its rival approaches.

Table 5. Packet loss analysis

Simulation time | Attack scenario | Shield technique | Trust-based technique
2 s | 0.60 bytes | 0.40 bytes | 0.10 bytes
6 s | 1.20 bytes | 0.55 bytes | 0.30 bytes
10 s | 1.25 bytes | 0.57 bytes | 0.40 bytes
14 s | 1.35 bytes | 0.60 bytes | 0.48 bytes

Fig. 8. Plot of throughput analysis

Table 6. Throughput analysis

Simulation time | Attack scenario | Shield technique | Trust-based technique
2 s | 20 packets | 30 packets | 32 packets
6 s | 58 packets | 90 packets | 110 packets
10 s | 60 packets | 100 packets | 150 packets
14 s | 63 packets | 150 packets | 190 packets

Fig. 9. Plot of average energy usage

Figure 9 (based on Table 7) displays the comparison amongst three cases: the attack case, the shield approach and the presented scheme. In this analysis, the shield case is the standard method for isolating VNA and the new one is the trust-based approach. Unlike the other methods of isolating VNA, the new approach consumes less overall energy than its rival approaches.

Table 7. Average energy usage

Simulation time | Attack scenario | Shield technique | Trust-based technique
2 s | 0.20 J | 0.15 J | 0.050 J
6 s | 0.50 J | 0.23 J | 0.17 J
10 s | 0.98 J | 0.37 J | 0.28 J
14 s | 1 J | 0.75 J | 0.55 J

6 Conclusion

Wireless sensor networks are decentralised networks in which no central controller is present. Due to the dynamic nature of the network, security is a major concern that affects network performance. Various schemes have been proposed in recent years to improve the security of wireless sensor networks, of which the trust-based mechanism is the most popular. In this research work, a trust calculation scheme is proposed which can calculate the direct and indirect trust in a wireless sensor network. The trust of a node is calculated based on the packet delivery ratio. The trust-based mechanism

is implemented in Network Simulator version 2 and the results are analysed in terms of throughput, packet loss, and average power consumption. The proposed scheme is compared with an existing shield-based scheme. The analysis shows that the shield-based scheme has higher packet loss and average power consumption compared to the trust-based scheme, while the throughput of the trust-based scheme is higher than that of the shield-based scheme.

References
1. Kodali, R.K., Soratkal, S.R.: Trust model for WSN. In: 2015 International Conference on
Applied and Theoretical Computing and Communication Technology (iCATccT) (2015)
2. Gautam, A.K., Kumar, R.: A robust trust model for wireless sensor networks. In: 2018, 5th
IEEE Uttar Pradesh Section International Conference on Electrical, Electronics and Computer
Engineering (UPCON) (2018)
3. Khiabani, H., Idris, N.B., Ab Manan, J.-L.: Leveraging remote attestation to enhance the unified trust model for WSNs. In: 2012 International Conference on Cyber Security, Cyber Warfare and Digital Forensic (CyberSec) (2012)
4. Zhang, M.: Trust computation model based on improved Bayesian for wireless sensor net-
works. In: 2017, IEEE 17th International Conference on Communication Technology (ICCT)
(2017)
5. Singh, V.P., Hussain, M., Raina, C.K.: Authentication of base station by HDFS using trust-
based model in WSN. In: 2016, International Conference on Communication and Electronics
Systems (ICCES) (2016)
6. Karthik, S., Vanitha, K., Radhamani, G.: Trust management techniques in wireless sensor
networks: an evaluation. In: 2011, International Conference on Communications and Signal
Processing (2011)
7. Ahmed, A., Bhangwar, A.R.: WPTE: weight-based probabilistic trust evaluation scheme
for WSN. In: 2017, 5th International Conference on Future Internet of Things and Cloud
Workshops (FiCloudW) (2017)
8. Ghugar, U., Pradhan, J.: NL-IDS: trust based intrusion detection system for network layer in wireless sensor networks. In: 2018 Fifth International Conference on Parallel, Distributed and Grid Computing (PDGC) (2018)
9. Voitovych, O., Kupershtein, L., Shulyatitska, O., Malyushytskyy, V.: The authentication
method in wireless sensor network based on trust model. In: 2017, IEEE First Ukraine
Conference on Electrical and Computer Engineering (UKRCON) (2017)
10. Dogan, G., Avincan, K., Brown, T.: DynamicMultiProTru: an adaptive trust model for
wireless sensor networks. In: 2016, 4th International Symposium on Digital Forensic and
Security (ISDFS) (2016)
11. Yussoff, Y.M., Hashim, H., Baba, M.D.: Analysis of trusted identity based encryption (IBE-
trust) protocol for wireless sensor networks. In: 2012, IEEE Control and System Graduate
Research Colloquium (2012)
12. Reddy, V.B., Venkataraman, S., Negi, A.: Communication and data trust for wireless sensor
networks using D–S theory. IEEE Sens. J. 17(12), 3921–3929 (2017)
13. Dai, H., Jia, Z., Dong, X.: An entropy-based trust modeling and evaluation for wireless sensor
networks. In: 2008, International Conference on Embedded Software and Systems (2008)
14. Sahoo, R.R., Singh, M., Sardar, A.R., Mohapatra, S., Sarkar, S.K.: TREE-CR: trust based
secure and energy efficient clustering in WSN. In: 2013, IEEE International Conference on
Emerging Trends in Computing, Communication and Nanotechnology (ICECCN) (2013)

15. Prabhu, S., Mary Anita E.A.: Trust based secure routing mechanisms for wireless sensor
networks: a survey. In: 2020, 6th International Conference on Advanced Computing and
Communication Systems (ICACCS) (2020)
16. Liu, Z., Zhang, Z.G., Liu, S.S., Ke, Y.Q., Chen, J.: A trust model based on Bayes theorem
in WSNs. In: 2011, 7th International Conference on Wireless Communications, Networking
and Mobile Computing (2011)
17. Pang, B., Teng, Z., Sun, H., Du, C., Li, M., Zhu, W.: A malicious node detection strategy
based on fuzzy trust model and the ABC algorithm in wireless sensor network. In: 2021, IEEE
Wireless Communications Letters (2021)
18. Sun, Z., Zhang, Z., Xiao, C., Qu, G.: D-S evidence theory based trust ant colony routing in
WSN. China Commun. 15, 27–41 (2018)
19. Hamad, I.E.H., Abid, M.: BTRMC, a bio-inspired trust and reputation model using clustering
in WSNs. In: 2017, International Conference on Smart, Monitored and Controlled Cities
(SM2C) (2017)
20. Gaber, T., Abdelwahab, S., Hassanien, A.E.: Trust-based secure clustering in WSN-based
intelligent transportation systems. Comput. Netw. 146, 151–158 (2018)
21. Fang, W., Zhang, W., Yang, Y.: Trust management-based and energy efficient hierarchical
routing protocol in wireless sensor networks. Digit. Commun. Netw. 7, 470–478 (2021)
22. Reshmi, V., Sajitha, M.: Energy efficient hierarchical trust management scheme for solving
cluster head compromising problem in wireless sensor networks. In: 2015, International Con-
ference on Innovations in Information, Embedded and Communication Systems (ICIIECS)
(2015)
23. Ma, L., Liu, G.: A hierarchical trust model for cluster-based wireless sensor network. In:
2015, International Conference on Control, Automation and Information Sciences (ICCAIS)
(2015)
24. Bao, F., Chen, I.-R., Chang, M.J., Cho, J.-H.: Hierarchical trust management for wireless
sensor networks and its applications to trust-based routing and intrusion detection. IEEE
Trans. Netw. Serv. Manag. 9, 169–183 (2012)
25. Mathew, M., Gayathri, I.K., Raj, A.: An efficient distributed TCNPR method for wireless sen-
sor networks. In: 2017, International Conference on Energy, Communication, Data Analytics
and Soft Computing (ICECDS) (2017)
26. Jiang, J., Han, G., Wang, F., Shu, L., Guizani, M.: An efficient distributed trust model for
wireless sensor networks. IEEE Trans. Parallel Distrib. Syst. 26, 1228–1237 (2015)
27. Jadidoleslamy, H., Aref, M.R., Bahramgiri, H.: A fuzzy fully distributed trust management
system in wireless sensor networks. AEU - Int. J. Electron. Commun. 70, 40–49 (2016)
28. Medjek, F., Tandjaoui, D., Romdhani, I., Djedjig, N.: A trust-based intrusion detection system
for mobile RPL based networks. In: 2017, IEEE International Conference on Internet of Things
(iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber,
Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData) (2017)
Multikernel Support Vector Machine Approach
with Probability Distribution Analysis
for Classifying Parkinson Disease Using Gait
Parameters

Arunraj Gopalsamy1,2(B) and B. Radha3


1 Sree Saraswathi Thyagaraja College, Pollachi 642107, Tamil Nadu, India
arunrajgs.phd@gmail.com
2 Lead IT Architect, IBM, Bangalore 560068, Karnataka, India
3 Sri Krishna Arts and Science College, Coimbatore 641008, Tamil Nadu, India

Abstract. Parkinson disease has become one of the most common diseases among humans, and diagnosing it earlier greatly improves the quality of life of an individual having the disease. However, diagnosing the disease using machine learning algorithms is challenging. Several works have been carried out to diagnose Parkinson disease using speech and voice parameters. Unfortunately, the use of speech signals alone does not provide accurate results. The other main parameter affected in Parkinson patients is gait, which represents the way or pattern of walk of a person. The existence of the disease causes difficulty in walking or changing direction, causing freezing of gait. This paper suggests a multikernel support vector machine with probability distribution analysis-based classifier for detecting Parkinson disease using various gait parameters. The model utilizes Bayesian optimization for tuning the hyperparameters. In order to provide accurate results, a probability distribution analysis is carried out for verifying the classification results and making appropriate decisions. Experimental analysis of the proposed model has been performed with various datasets related to gait parameters, and the obtained results are compared with existing models. The proposed model has an accuracy, true positive rate, false positive rate and AUC of 92.5%, 90.8%, 22.3% and 0.85 respectively. The model achieves better results with reduced computational complexity.

Keywords: Parkinson disease · Gait parameters · Multikernel SVM · Bayesian


optimization · Probability distribution analysis

1 Introduction
Parkinson’s disease (PD) is the second most common neurodegenerative disorder worldwide after Alzheimer’s disease, with increasing prevalence [1]. According to an estimate by parkinsonsnewstoday.com, nearly 10 million people have this neurodegenerative disorder, with the disease occurring in at least 1,900 of every 100,000 people [2]. However, it is expected that the number of


cases will be doubled by the year 2040 [3]. Though the disease is commonly known as a movement disorder, the symptoms include both movement and non-movement disorders, known as motor and non-motor symptoms respectively [4].
The disease has a wide range of symptoms that vary from one person to another, including motor symptoms such as trouble with balance and falls, problems in walking and turning, rigidity, freezing and shuffling of gait, and soft speech, along with non-motor symptoms such as loss of smell, weight loss, mood disorder, sleep disorder and vision problems. With the progress of Parkinson disease, the brain gradually reduces production of the chemical dopamine, which is responsible for movement control, causing an increased risk of falling and difficulty in body movements [5]. Generally, for diagnosing Parkinson disease, the speech vocals or the gait parameters of individuals are analyzed. Human gait comprises the movements during locomotion, which can be described as the movements that take place in the joints during a walk [6]. The analysis of gait characteristics helps to diagnose Parkinson disease and can be divided into two major types: kinematics and kinetics [7]. The taxonomy of gait analysis is shown in Fig. 1.

[Figure: taxonomy diagram. Gait analysis divides into kinematics and kinetics. Kinematics covers descriptive components (joint angles for the upper and lower extremities, i.e., biomechanical parameters such as pelvic rotation, knee flexion, and foot and ankle motions) and temporal/spatial measures, split into time variables (step time, stride time, stance time, swing time, speed, cadence) and distance variables (step length, stride length, foot angle, walking base). Kinetics covers measurements of force such as power, torque and pressure.]
Fig. 1. Taxonomy of gait analysis.

Fundamentally, the gait parameters for the classification of Parkinson disease are usually evaluated by examining the walk and body movements of a subject wearing shoes fitted with sensors and force-sensitive resistors. During the examination, the sensors record the force applied to the ground and the pressure exerted by the subject. The produced data, in the form of analog samples, are converted to digital samples and stored in a database. These values are further processed to evaluate the speed, stride time, stance time, cadence, swing time, foot angle and other attributes that are necessary for disease identification [8, 9].
Several works have been proposed in the literature to classify Parkinson’s disease among individuals. However, most of the studies concentrate only on voice data, as they assert that most subjects with Parkinson’s disease have one or another form of vocal

impairment. However, for accurate diagnosis of the disease, analyzing vocal data alone is not sufficient. A few works in the literature focus on applying machine learning algorithms to gait parameters; however, they suffer from computational and time complexity, and the classification accuracy of the prediction models still has to be increased.
This paper presents a classification model using a multikernel support vector machine approach and probability distribution analysis for accurate prediction. The model analyses various gait parameters to classify the disease. It uses a multiple ranks with majority vote based relative aggregate scoring model for feature selection, Bayesian optimization for parameter tuning, an equiweight based multikernel support vector machine classifier for classifying the disease, and probability distribution analysis for verifying the classification results to make a clinical decision. Performance analysis of the proposed model has been carried out using three publicly available datasets related to gait parameters, and the results are compared with several existing models.
The paper is organized as follows. Section 2 presents the various works available in the literature related to the study. Section 3 describes the proposed multikernel support vector machine with probability distribution analysis-based classification model along with the algorithm and overall workflow of the model. Section 4 discusses the datasets used and analyses the performance of the proposed classification model and the result comparison with existing models. Finally, Sect. 5 concludes the paper and outlines future work.

2 Related Works

Several works are available in the literature for predicting Parkinson disease accurately at an early stage. Machine learning and data mining techniques are widely utilized by researchers for effective classification [10–12] and feature selection [13]. Most applications employ eager learners for constructing a learning model due to their effective classification results [14]. An analysis of various feature selection models such as ranker search, PSO search and Tabu search on gait data showed that the ranker method provides better results [15]. Several studies have also been carried out for analysing gait dynamics [16, 17]. The most common classifiers used in detecting Parkinson’s disease are variations of decision trees [18], artificial neural networks (ANN) [19] and support vector machine (SVM) models. Recently, a mixed model that utilizes decision tree, k-nearest neighbor (k-NN) [20], K-means, Naïve Bayes, random forest, support vector machine (SVM) [21], and Gaussian mixture model was proposed, and this model has higher accuracy for classifying the disease [22]. Several SVM-based classifiers have been utilized for classifying the records.
An SVM classifier with RBF kernel and an artificial neural network (ANN) have been employed for analysing gait parameters extracted from walking, suitable for diagnosing Parkinson disease with motor symptoms [23]. An approach to detect Parkinson disease by analysing various gait parameters collected from vertical ground reaction forces from sensors placed under the foot was suggested, in which the records are classified as balanced and unbalanced gaits [24].

A nonlinear classifier that makes use of a decision tree for evaluating gait features was suggested to categorize Parkinson disease; it utilizes a Recursive Feature Elimination model to select significant features and provides better accuracy with a minimum number of features [9]. A few models that employ wavelet transforms and statistical analysis to extract features from gait characteristics have been suggested [25, 26]. Principal Component Analysis (PCA) for feature compression, with several statistical functions to generate representative data for better classification of the disease, was suggested and evaluated with various classifiers [27]. However, these models suffer from high computational complexity.
A supervised CNN learning architecture for evaluating stereotypical motor movement using various sensors to detect autism disorders has been proposed [28]. A deep normative modeling approach that detects abnormalities in subjects having Parkinson’s Disease and Autism Spectrum Disorders (ASD) as a probabilistic novelty detection method was suggested; another variation of the model that also includes the reconstruction error was evaluated as well [29]. The model outperforms the 1C-SVM and CNN models. Here, the reduced-rank latent space learned through the autoencoder model was utilized to train the 1C-SVM [30].
Similarly, a two-sensor system generates data evaluated from hand motor function using a sensor pen, apart from the gait dynamics extracted from a sensor shoe. The data created from the sensor pen utilizes pattern recognition models, and the data from the two-sensor system evaluated using an AdaBoost classifier produced promising results [31]. A multikernel SVM classifier was suggested that makes use of various kernels with equal weights to improve the performance of the learning algorithm and its generalization ability. The model uses chaos particle swarm optimization for selecting significant features. However, this model was suggested for credit scoring applications, and the evaluation was made on a credit scoring database [32].
A PCA-based multikernel SVM was suggested for early diagnosis of Alzheimer disease [33]. Though the method has improved performance, its complexity and execution time are high. Owing to the advantages of the multikernel SVM, the proposed model utilizes a multikernel SVM for detecting Parkinson’s disease. However, finding optimized values for the parameters is a challenging task in SVM. Several standard techniques exist for tuning the SVM parameters, such as grid search and random search. Apart from these standard techniques, a few parameter tuning methods have also been introduced by researchers. A computation of class separability using cosine similarity in the kernel space is one such method suggested to find optimum values for the parameters [34]. Here, the optimal parameter is selected as the one that maximizes inter-class separability and minimizes intra-class separability. A similar method was suggested in the literature to optimize the parameters by calculating the classification reliability of kernel minimum distance [35].

3 Proposed Classification Approach

The overall framework of the proposed Multikernel Support Vector Machine with Proba-
bility Distribution Analysis based Classification (MSVMPDA) model is shown in Fig. 2.
It takes the Parkinson dataset as an input and classifies the given test data as Parkinson or

healthy. Initially, the training dataset is pre-processed in order to make it suitable for further processing. This includes data cleaning and data transformation, in which records having missing values are processed. Additionally, significant features relevant to the study are extracted and selected, thereby eliminating irrelevant and redundant attributes. This phase is mandatory, as it influences the classification accuracy.

[Figure: block diagram. In the training phase, the training dataset passes through data pre-processing (data cleaning, data transformation, feature selection) and parameter tuning for SVM, yielding optimized parameter values. In the classification phase, the pre-processed test data is classified by the multikernel SVM using the optimized parameters, and the result is verified by probability distribution analysis.]

Fig. 2. Overall framework of the proposed classification model.

The preprocessed dataset is used to train the proposed classification algorithm for accurate prediction of Parkinson disease. After training the model with the training dataset, test data can be fed as input for predicting the disease effectively. In the proposed framework, a Support Vector Machine is applied over the training dataset. The critical hyperparameters, namely the cost C and gamma γ of the SVM, are tuned in order to govern the overfitting and the degree of randomness respectively. To tune the parameters, Bayesian optimization is applied, as it provides better performance than many other methods [36]. The parameter tuning process identifies suitable values for the parameters that provide accurate classification with the SVM algorithm. Once the optimized parameter values are obtained during the training phase, the test record is passed to the classification model. The classification model employs a multikernel Support Vector Machine in which the individual kernel functions are evaluated and the results are accumulated using a fixed rule. As the classification of Parkinson disease is highly sensitive, the classified results are verified using probability distribution analysis to confirm the accuracy of the prediction results.

3.1 Data Preprocessing


The dataset pertaining to gait parameters are created from various Parkinson patients
and healthy individuals. This dataset act as a training set. As the dataset may contain

incomplete and missing data, it must undergo a preprocessing step [37]. This is a mandatory phase for improving the classification performance of any underlying model. It normally cleans the raw data by filling in missing values or removing records having missing values, and transforms the data into a form suitable for mining. It prepares the data for mining or evaluation by reducing the size of the data through selection of significant attributes and normalization of the data [38].
In the proposed model, records having only a few missing values are filled based on a semi-parametric imputation method called predictive mean imputation [39]. For data normalization, min-max normalization is applied in order to transform the large range of feature values into a small range [40]. Another significant step in preprocessing is data reduction, in which the size of the dataset is reduced by selecting important features from the dataset. The proposed classification model employs a multiple ranks with majority vote based relative aggregate scoring model [41]. The method incorporates various feature selection techniques such as Pearson’s correlation, gain ratio, information gain, relief and symmetrical uncertainty instead of using a single technique. The model utilizes the ranks obtained from the various techniques, for which majority vote based relative scores are computed. The features having higher scores are selected for the further classification process. A simplified sketch of these steps is given below.
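The following Python sketch illustrates min-max normalisation together with a simplified stand-in for the multi-rank aggregate scoring of [41]; only mutual information and Pearson's correlation are used as rankers here, and the exact voting and scoring rule of [41] is not reproduced.

# Sketch of the preprocessing steps: min-max normalisation plus a
# simplified rank-aggregation stand-in for the scoring model of [41].
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def min_max_normalise(X):
    """Scale each feature into [0, 1]."""
    mins, maxs = X.min(axis=0), X.max(axis=0)
    return (X - mins) / np.where(maxs > mins, maxs - mins, 1.0)

def aggregate_ranks(X, y, top_k=5):
    """Rank features by several criteria and keep those with the best
    average rank (lower rank = more relevant)."""
    mi = mutual_info_classif(X, y, random_state=0)
    corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    ranks = np.vstack([np.argsort(np.argsort(-mi)),
                       np.argsort(np.argsort(-corr))])
    avg_rank = ranks.mean(axis=0)
    return np.argsort(avg_rank)[:top_k]  # indices of selected features

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)  # features 0 and 3 matter
Xn = min_max_normalise(X)
print(aggregate_ranks(Xn, y))  # features 0 and 3 should rank highly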

3.2 Multikernel Support Vector Machine Approach


Support Vector Machine is a discriminative supervised learning algorithm which is employed for classification problems and is also suitable for regression. The algorithm identifies the characteristics that distinguish the classes, focusing on the training points that lie closest to the decision boundary, often termed support vectors, and uses them to classify the given test data. With the help of the training data it identifies the optimal hyperplane, which may be a line in a two-dimensional space, that partitions the data for classification. Typically, the number of dimensions corresponds to the number of features in the dataset. The partition line is normally termed the decision boundary. As there may be several candidate decision boundaries, the best decision boundary can be selected by averaging the two hyperplanes that completely segregate the data in the two-dimensional space with maximum distance between them.
However, several parameters contribute to the accuracy of a support vector model, such as the kernel (k), regularization (C) and gamma (γ). The kernel parameter is decisive, as it is responsible for transforming the data into a suitable representation for obtaining accurate classification results. The commonly used kernel types are linear, polynomial and radial basis function (RBF). The regularization or penalty parameter, denoted C, represents the amount of error or misclassification tolerated. Higher values of C produce smaller-margin hyperplanes that lead to lower training misclassification, whereas lower values of C produce larger-margin hyperplanes that tolerate a higher misclassification rate. Similarly, a lower gamma value fits the model loosely, whereas a higher gamma value fits the model more tightly.

Linear Kernel: It is a simple one-dimensional kernel employed when the given input data is linearly separable. The linear function is fast compared with other kernel types, and is selected when the given input dataset contains a large number of attributes. The linear kernel is given in Eq. (1).

$k_{lin}(x_n, x_j) = x_n \cdot x_j \qquad (1)$

Polynomial Kernel: It is a generalisation of the linear kernel and is less frequently used due to its lower efficiency. However, it is highly preferred in image processing applications. The polynomial kernel is given in Eq. (2).

$k_{poly}(x_n, x_j) = (x_n \cdot x_j + 1)^d \qquad (2)$

Here the function $k_{poly}(x_n, x_j)$ represents the decision boundary with degree d.

RBF Kernel: It is the most frequently used kernel for nonlinear data, as it helps to separate the data properly when knowledge about the data is unavailable. The RBF kernel is given in Eq. (3).

$k_{RBF}(x_n, x_j) = \exp\left(-\gamma \, \|x_n - x_j\|^2\right) \qquad (3)$

The value of γ > 0 and it usually lies between 0 and 1.
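For concreteness, the three kernels of Eqs. (1)-(3) can be written out directly in NumPy; Eq. (3) is read here as the usual squared-Euclidean-distance RBF form.

# The kernels of Eqs. (1)-(3) in NumPy, evaluated on single vectors.
import numpy as np

def k_lin(xn, xj):
    return np.dot(xn, xj)                           # Eq. (1)

def k_poly(xn, xj, d=2):
    return (np.dot(xn, xj) + 1) ** d                # Eq. (2)

def k_rbf(xn, xj, gamma=0.5):
    return np.exp(-gamma * np.sum((xn - xj) ** 2))  # Eq. (3)

a, b = np.array([1.0, 2.0]), np.array([0.5, 1.0])
print(k_lin(a, b), k_poly(a, b), k_rbf(a, b))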

Multikernel Approach. In the proposed model, in order to take advantage of all the kernels, a multikernel SVM is applied using the fixed-rule approach [42]. The fixed-rule approach is a linear combination model in which set rules combine the kernels; generally, summation or multiplication is used to integrate multiple kernels. Additionally, weights can be assigned to each kernel based on its significance. In the proposed multikernel approach, the linear, polynomial and RBF kernels are integrated using summation with equal weights. The simple form of the multikernel representation is presented in Eq. (4).

$k_{mkSVM} = \frac{1}{3}\left(k_{lin} + k_{poly} + k_{RBF}\right) \qquad (4)$
The final support vector machine classifier can be defined as the decision function shown in Eq. (5).

$\mathrm{sgn}\left(\sum_{i}^{N} \alpha_i\, y_i\, k_{mkSVM}(x_i, x_j) + b\right) \qquad (5)$

Here, N represents the number of training vectors, $\alpha_i$ lies between 0 and the regularization parameter C, and $y_i$ represents the class label.
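A minimal sketch of Eqs. (4) and (5) is given below using scikit-learn's precomputed-kernel interface; the synthetic data and the C and γ values are illustrative placeholders for the tuned values obtained in the next step.

# Sketch of the equal-weight multikernel SVM of Eqs. (4)-(5).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel, rbf_kernel
from sklearn.svm import SVC

def combined_kernel(A, B, gamma=0.005, d=2):
    """Eq. (4): equal-weight sum of linear, polynomial and RBF kernels."""
    return (linear_kernel(A, B)
            + polynomial_kernel(A, B, degree=d, gamma=1.0, coef0=1)
            + rbf_kernel(A, B, gamma=gamma)) / 3.0

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = X[:200], X[200:], y[:200], y[200:]

clf = SVC(kernel="precomputed", C=10.0)         # C is a placeholder value
clf.fit(combined_kernel(X_tr, X_tr), y_tr)      # Gram matrix on training set
y_pred = clf.predict(combined_kernel(X_te, X_tr))  # test rows vs training columns
print("accuracy:", (y_pred == y_te).mean())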

Parameter Tuning. As the proposed model utilizes a multikernel SVM approach, the parameters C and γ are to be identified for the RBF kernel, and the parameter d must be identified for the polynomial kernel. In this study, the value of the parameter d is chosen as d = 2 and d = 3 [32]; the value d = 1 corresponds to the linear kernel. The hyperparameters are tuned for the RBF

kernel using Bayesian optimization [43]. The main advantage of preferring Bayesian optimization over grid search or random search is that it requires a minimal number of iterations, thereby reducing the execution time of parameter tuning in the process of finding optimal hyperparameter values. It also limits the number of times the model has to be trained for validation. The method makes use of Bayes’ probability theorem to find the maximum or minimum of an objective function. Specifically, it requires a search space, an objective function, and surrogate and selection functions. The objective function accepts the hyperparameters and produces a validation score based on their performance. The posterior probability acts as a surrogate function, mapping the hyperparameters to a probability score of the objective function. The criterion by which candidate hyperparameters are evaluated against the surrogate function is denoted the selection function. This method is highly effective on datasets with a large number of attributes.
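One possible realisation of this search, sketched with the scikit-optimize library (the paper does not prescribe a particular implementation), minimises the negative cross-validated accuracy of an RBF-kernel SVM over a log-uniform search space for C and γ:

# Bayesian hyperparameter search for an RBF SVM, one possible realisation.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from skopt import gp_minimize
from skopt.space import Real
from skopt.utils import use_named_args

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

space = [Real(1e-2, 1e2, prior="log-uniform", name="C"),
         Real(1e-4, 1e0, prior="log-uniform", name="gamma")]

@use_named_args(space)
def objective(C, gamma):
    clf = SVC(kernel="rbf", C=C, gamma=gamma)
    # negate accuracy because gp_minimize minimises the objective
    return -cross_val_score(clf, X, y, cv=5).mean()

result = gp_minimize(objective, space, n_calls=25, random_state=0)
print("best C, gamma:", result.x, "accuracy:", -result.fun)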

3.3 Probability Distribution Analysis


Medical diagnosis for disease prediction is a critical process, as it involves the lives of individuals; automatic classification or prediction of diseases therefore has to be handled with proper care to obtain accurate results. In the proposed model, once the class labels are predicted using the multikernel support vector machine approach, the predicted labels are verified using probability distribution analysis. As an initial step, the training dataset is clustered based on the class labels. The proposed model utilizes the normal distribution for evaluating prediction accuracy. For each cluster, a cluster centre is computed that contains the mean and standard deviation of each attribute over the training instances in the cluster. To verify the predicted class label of a test instance x, the normal distribution is applied at each cluster centre to compute the likelihood of the disease. The formula to compute the probability of the instance belonging to a cluster is given in Eq. (6).
$p(x, c_i) = \prod_{j=1}^{n} \frac{1}{\sigma_j \sqrt{2\pi}}\, e^{-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}} \qquad (6)$

Here the variable x represents the test instance to be verified, $c_i$ represents the cluster, j indexes the attributes and varies from 1 to n where n is the number of attributes, and $\sigma_j$ and $\mu_j$ represent the standard deviation and mean stored in the cluster centre for attribute j. A minimal sketch of this verification step follows.
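In the sketch below, per-class cluster centres store the mean and standard deviation of each attribute, and the log of Eq. (6) is used for numerical stability. The decision rule shown (flagging a prediction when its class is not the most likely one) is an illustrative reading of the verification described next.

# Sketch of the verification step: cluster centres and the Gaussian
# likelihood of Eq. (6), computed in log space.
import numpy as np

def cluster_centres(X, y):
    """Return {label: (mu, sigma)} computed per attribute."""
    return {c: (X[y == c].mean(axis=0), X[y == c].std(axis=0) + 1e-9)
            for c in np.unique(y)}

def log_likelihood(x, mu, sigma):
    """Log of Eq. (6): sum of per-attribute normal log-densities."""
    return np.sum(-np.log(sigma * np.sqrt(2 * np.pi))
                  - (x - mu) ** 2 / (2 * sigma ** 2))

def verify(x, predicted, centres):
    scores = {c: log_likelihood(x, mu, sg) for c, (mu, sg) in centres.items()}
    best = max(scores, key=scores.get)
    return best == predicted  # False -> refer for further examination

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(3, 1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
centres = cluster_centres(X, y)
print(verify(X[0], predicted=0, centres=centres))  # likely True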
Thus, from the probability analysis, if the normal-distribution likelihood of an instance with respect to the cluster into which it was classified by the multikernel support vector machine is high, the classification made by the model is considered more accurate. On the other hand, if the probability for the classified class is lower than that of the other classes, the prediction is considered less reliable; it requires further analysis, and the patient is advised to undergo further medical examination for a final result. The algorithm steps for the proposed Multikernel Support Vector Machine with Probability Distribution Analysis based Classification (MSVMPDA) algorithm are presented in Fig. 3.

Algorithm: Multikernel Support Vector Machine with Probability Distribution Analysis
based Classification
Input: training instances with n attributes, test instances
Output: Classified test instance
Procedure MSVMPDA ()
Begin
//Preprocessing Phase
1. Preprocess the input training set by evaluating the missing records and transform the
data using normalization.
//Feature selection using majority vote based relative aggregate scoring
2. Compute the ranks for all the attributes with each attribute evaluator.
3. Convert the ranks into votes using majority voting and compute the relative score for each attribute.
4. Perform relative score aggregation for each attribute.
5. Select the attributes having score greater than the threshold.
//Bayesian Optimization for parameter tuning
6. Perform Bayesian optimization until best values are achieved
a. Optimize the acquisition function for sample selection
b. Assess the sample using objective function.
c. Update the data and surrogate function.
7. Save the best hyperparameters C and γ value.
//Train the multikernel support vector machine model
8. Train the classifier using multikernel SVM with best hyperparameter values.
a. Apply various kernel such as linear, polynomial and RBF kernel
b. Combine the kernels using fixed approach with equal weights.
//Classification of test instance
9. For each test instance
a. Classify the instance using multikernel SVM with optimized parameters C
and γ.
//Perform probability analysis using normal distribution
10. Test the prediction results
a. Cluster the training data based on the values of class variable.
b. Compute the cluster centre by computing the mean and standard deviation
for each attribute.
c. Perform probability analysis
i. Evaluate the class probability using normal distribution with the
instance and the cluster centre
ii. Make the decision based on the obtained results from the proba-
bility analysis.
End Procedure

Fig. 3. Algorithm steps for the proposed MSVMPDA classification.

The main advantage of utilizing multiple kernels in SVM is that it reduces the bias involved in selecting a kernel and its parameters. Instead of creating a new kernel suited to the application, multiple kernels can be combined to exploit the benefits of already established kernels. Thus, it takes advantage of various kernels in a single process to produce effective classification. Also, the parameters are tuned to find the best values using Bayesian optimization. Parameter tuning is an important step in SVM, as it

increases the model efficiency through increased accuracy and decreased computational
time and cost. The workflow of the proposed classification model is shown in Fig. 4.

[Figure: flowchart. The input training dataset undergoes data preprocessing, followed by relative aggregation based feature selection and clustering of the training dataset. SVM parameters are tuned and the individual SVM kernels are evaluated; if the accuracy is not high enough, the parameters are modified, otherwise the kernels are combined into a multikernel SVM with fixed rules. Incoming test data is then classified for Parkinson disease, the result is checked by probability analysis, and a decision is made on the predicted results.]

Fig. 4. Overall workflow of the proposed classification model.

4 Experimental Analysis
This section presents the details of the experimental analysis, including the datasets used and the performance evaluation.

4.1 Dataset Used

Three datasets related to gait parameters are used for the experimental study, of which two are available at PhysioNet [44] and the third at the UCI Repository. The first dataset used in the study is Gait in Parkinson’s Disease, referred to as GaitPDB [45]. It contains gait measures from 8 sensors on each foot, recorded for 93 patients with Parkinson (59 males and 34 females) and 73 healthy subjects (40 males and 33 females). The sensors capture the vertical ground reaction force while the subjects walk for approximately 2 min. These 16 sensors record the details at a rate of 100 samples per second. The data were generated

at the Laboratory for Gait Neurodynamics, Movement Disorders Unit of the Tel-Aviv Sourasky Medical Center, and made public on February 25th, 2008 through PhysioNet. The attributes in the dataset are the stride time, the vertical ground reaction force (VGRF, in Newton) of each sensor at the left and right foot, and the total force at each foot. From the analysis made on the attribute values, several ground truths exist that clearly differentiate Parkinson and healthy subjects [46]. As subjects with Parkinson disease undergo interruptions in walking, termed freezing of gait (FOG), their stride time and VGRF values are generally high, whereas their speed and stride length are lower than those of normal individuals [15].
The second dataset used for the study covers common neurodegenerative diseases related to gait analysis and is referred to as GaitNDD. It contains data recorded from 15 subjects having Parkinson (5 males and 10 females), 20 subjects having Huntington’s disease (6 males and 14 females), 13 subjects having Amyotrophic Lateral Sclerosis, and 16 healthy subjects (13 males and 3 females). It also records the force under the foot, with attributes such as time, left and right stride interval, swing interval, stance interval for both feet, and double support interval. The data collection was approved by the Massachusetts General Hospital Institutional Review Board, and the dataset was made public through PhysioNet on December 21st, 2000.
The Daphnet Freezing of Gait dataset, referred to as FOG, has 237 instances with 9 attributes specifying the annotated readings obtained from 3 acceleration sensors placed at the hip and legs of subjects having Parkinson disease, and is available at the UCI repository [47]. The freezing of gait (FoG) was recorded during various types of walks, including straight-line walking, numerous turns and realistic activities of daily living. The recordings were carried out at the Tel Aviv Sourasky Medical Center in 2008, approved by the local Human Subjects Review Committee in accordance with the ethical standards of the Declaration of Helsinki. The attributes included in the dataset are the acceleration of the ankle, upper leg and trunk.

4.2 Classification Performance Analysis

The proposed model has been evaluated by training the SVM classifier with various kernel functions. The model is implemented in Python on a 64-bit Windows operating system with an Intel(R) i3-4005U CPU at 1.7 GHz. Initially, the dataset was preprocessed by evaluating the missing values, and the significant features relevant for the study were selected. The hyperparameters were optimized using Bayesian optimization; the optimized values of C and γ for GaitPDB are 46.23 and 0.0018 respectively, whereas those for GaitNDD are 10.41 and 0.005 respectively. The model is trained using all combinations of the kernel functions, namely linear, RBF and polynomial with degrees 2 and 3, using 10-fold cross validation. Equal weights are assigned to the kernel functions, symbolized as β1, β2 and β3, which represent the weights for the linear, polynomial and RBF kernels respectively. The accuracy obtained from the experimental analysis of the proposed multikernel SVM approach on the GaitPDB and GaitNDD datasets is presented in Table 1, and a sketch of the evaluation procedure follows the table.

Table 1. Performance of multikernel SVM for GaitPDB and GaitNDD datasets.

Kernel function | β1 | β2 | β3 | Accuracy GaitPDB (%) | Accuracy GaitNDD (%)
Linear | 1 | 0 | 0 | 86.72 | 88.75
Polynomial (d = 2) | 0 | 1 | 0 | 87.89 | 89.98
Polynomial (d = 3) | 0 | 1 | 0 | 87.01 | 89.12
RBF | 0 | 0 | 1 | 88.32 | 90.32
RBF + Linear | 1/2 | 0 | 1/2 | 87.89 | 90.56
RBF + Polynomial (d = 2) | 0 | 1/2 | 1/2 | 89.23 | 91.24
RBF + Polynomial (d = 3) | 0 | 1/2 | 1/2 | 88.97 | 90.87
Linear + Polynomial (d = 2) | 1/2 | 1/2 | 0 | 87.61 | 90.11
Linear + Polynomial (d = 3) | 1/2 | 1/2 | 0 | 86.54 | 90.08
RBF + Linear + Polynomial (d = 2) | 1/3 | 1/3 | 1/3 | 90.47 | 91.53
RBF + Linear + Polynomial (d = 3) | 1/3 | 1/3 | 1/3 | 89.92 | 90.88
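As a sketch of how such kernel-weight combinations can be scored, the snippet below runs 10-fold cross-validation on precomputed Gram matrices for a few equal-weight combinations; the synthetic data and parameter values are illustrative stand-ins for the gait datasets and tuned parameters.

# Scoring equal-weight kernel combinations with 10-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel, rbf_kernel
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
kernels = {"lin": linear_kernel(X, X),
           "poly2": polynomial_kernel(X, X, degree=2, gamma=1.0, coef0=1),
           "rbf": rbf_kernel(X, X, gamma=0.005)}

for combo in [("rbf",), ("lin", "rbf"), ("lin", "poly2", "rbf")]:
    K = sum(kernels[k] for k in combo) / len(combo)  # equal weights
    acc = cross_val_score(SVC(kernel="precomputed", C=10.0), K, y, cv=10).mean()
    print("+".join(combo), f"{acc:.3f}")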

From the values presented in Table 1, among the single-kernel SVMs the RBF kernel and the polynomial kernel with degree 2 provide the better accuracy rates of 88.32% and 87.89% for the GaitPDB dataset. With multiple kernels, the combination of RBF and polynomial with degree 2 reaches an accuracy of 89.23% for the GaitPDB dataset. The combination of all three kernel functions with polynomial degree 2 has the highest accuracy rates of 90.47% and 91.53% for the GaitPDB and GaitNDD datasets respectively. Thus, the proposed model utilizes the RBF, linear and polynomial (degree 2) kernels with equal weights in its multikernel SVM approach. The values are presented as a graph in Fig. 5.
Similarly, the selection of optimized parameters for the multikernel SVM has been evaluated using standard and existing methods with various kernel functions, namely linear, polynomial with degree d = 2, polynomial with degree d = 3 and RBF, on the GaitPDB dataset. The classification accuracy and the execution time of the parameter tuning process are evaluated. The classification accuracy of the SVM classifier using the optimized parameters and the time taken (in seconds) to identify the optimized parameters are recorded; the accuracy is evaluated using 5-fold cross validation. The results obtained for the standard parameter tuning models grid search and random search, the existing models based on minimum distance [35] and cosine similarity [34], and Bayesian optimization are presented in Table 2.

[Figure: two horizontal bar charts of accuracy (%) for each kernel combination; panel (a) accuracy for the GaitPDB dataset, panel (b) accuracy for the GaitNDD dataset.]
Fig. 5. Performance of multikernel SVM approach.

Table 2. Performance analysis of various parameter tuning models.

Tuning model | K(lin) Acc (%) / Time (s) | K(poly, d = 2) Acc (%) / Time (s) | K(poly, d = 3) Acc (%) / Time (s) | K(RBF) Acc (%) / Time (s)
Grid search | 86.45 / 2.89 | 86.15 / 3.28 | 87.24 / 3.56 | 88.11 / 5.12
Random search | 85.53 / 2.96 | 85.89 / 3.01 | 86.73 / 3.05 | 87.96 / 5.3
Minimum distance | 84.12 / 2.59 | 85.23 / 3.11 | 86.14 / 3.21 | 86.57 / 5.4
Cosine similarity | 85.12 / 3.61 | 85.61 / 3.57 | 85.27 / 3.2 | 86.79 / 3
Bayesian optimization | 86.72 / 2.98 | 87.89 / 3.23 | 87.01 / 3.4 | 88.32 / 4.89

From the analysis, the classification accuracy of Bayesian optimization is higher than that of grid and random search, while grid and random search perform better than the minimum distance and cosine similarity models. With execution time as the metric, the cosine similarity method takes the least time to tune the parameters, and random search and minimum distance take less time than the Bayesian optimization model. Though the Bayesian optimization model has the highest execution time, given its accuracy the increase in execution time by seconds or milliseconds is negligible. The accuracy of the various parameter tuning models is presented in Fig. 6.

[Figure: grouped bar chart of accuracy (%) for the linear, polynomial (d = 2), polynomial (d = 3) and RBF kernels across the parameter tuning models: grid search, random search, minimum distance, cosine similarity and Bayesian optimization.]

Fig. 6. Accuracy of classifiers using various parameter tuning models.

The proposed classifier model is compared, on the GaitPDB dataset, across feature selection techniques, namely ranker, PSO search and Tabu search [15], and the multiple ranks with majority vote based relative aggregate scoring model [41]. The classifiers used in the analysis are the Best First Decision Tree (BFT) [18], Back-Propagation Artificial Neural Network (BPANN) [19], k-Nearest Neighbor (k-NN) with the Euclidean distance metric [20] and Support Vector Machine (SVM) with the linear, polynomial and RBF kernels [21]. The parameters of the various models are set as given by Lim et al. [15].
Each model used in the analysis has certain parameters to be tuned. BFT takes the number of records at the terminal node (M = 5) and the number of folds in cross validation (N = 4) as parameters. BPANN takes momentum (MO = 0), learning rate (L = 0.1) and hidden layers (HL = 20) as parameters. The k-NN model uses the number of neighbors (k = 7). For SVM, the linear kernel takes C = 2, the polynomial kernel C = 4 with gamma 0, and the RBF kernel C = 7 with gamma 0. The obtained results for classification rate, true positive rate and false positive rate are presented in Table 3.
From the analysis, it is clear that in general the ranker search based feature selection offers better performance than the other models. The SVM classifiers with ranker search achieve a good classification rate of approximately 90%, a high true positive rate and a low false positive rate, and BPANN reaches an accuracy of 92.2% with the ranker search method. The proposed multikernel SVM with probability distribution analysis, combined with the multiple ranks with majority vote based relative aggregate scoring model for feature selection, has the highest accuracy rate of 92.5%, with a true positive rate of 90.8% and the lowest false positive rate of 22.3%.

Table 3. Classification results for various classifiers and feature selection models.

Classifier | Feature selection | Classification rate | True positive rate | False positive rate
Best First Decision Tree (BFT) | Ranker | 67.6 | 84.1 | 54.7
Best First Decision Tree (BFT) | PSO search | 65.8 | 83.3 | 51.7
Best First Decision Tree (BFT) | Tabu search | 65.9 | 83.3 | 54.9
Best First Decision Tree (BFT) | Relative scoring | 70.2 | 85.6 | 52.3
Back-Propagation Artificial Neural Network (BPANN) | Ranker | 92.2 | 89.7 | 24.7
BPANN | PSO search | 91.1 | 88.1 | 34.7
BPANN | Tabu search | 86.6 | 85.7 | 32
BPANN | Relative scoring | 91.5 | 89.5 | 24
k-Nearest Neighbor (k-NN) with Euclidean distance | Ranker | 88.9 | 86.5 | 41.4
k-NN | PSO search | 89.5 | 87.3 | 44.4
k-NN | Tabu search | 82.6 | 84.9 | 54.5
k-NN | Relative scoring | 89.2 | 87.1 | 43.5
SVM + Linear kernel | Ranker | 89.9 | 88.9 | 34.5
SVM + Linear kernel | PSO search | 89.3 | 88.9 | 34.5
SVM + Linear kernel | Tabu search | 86.2 | 86.5 | 41.4
SVM + Linear kernel | Relative scoring | 88.4 | 89.2 | 33.5
SVM + Polynomial kernel | Ranker | 89.5 | 90.5 | 27.7
SVM + Polynomial kernel | PSO search | 87.7 | 89.7 | 34.3
SVM + Polynomial kernel | Tabu search | 83.2 | 88.9 | 37.7
SVM + Polynomial kernel | Relative scoring | 88.9 | 90.2 | 27.1
SVM + RBF kernel | Ranker | 90.4 | 89.7 | 31.1
SVM + RBF kernel | PSO search | 87.6 | 89.7 | 34.3
SVM + RBF kernel | Tabu search | 84.6 | 88.1 | 37.9
SVM + RBF kernel | Relative scoring | 88.3 | 88.1 | 31.3
Proposed Multikernel SVM | Ranker | 91.3 | 90.3 | 23.8
Proposed Multikernel SVM | PSO search | 90.3 | 89.7 | 28.3
Proposed Multikernel SVM | Tabu search | 87.3 | 86.2 | 29
Proposed Multikernel SVM | Relative scoring | 92.5 | 90.8 | 22.3

Also, the average AUC values are measured for the individual subjects in the FOG dataset, and the results are compared across the normative and reconstruction models [29], 1C-SVM [30] and supervised CNN learning [28], along with the proposed multikernel SVM. The obtained values are shown in Table 4.
From the analysis, the reconstruction and 1C-SVM models have better AUC values for only a few individual subjects. Though the normative model has good AUC values for several individual subjects, its overall mean is lower than those of the proposed model and the supervised learning model. The proposed model has the best average AUC of approximately 0.85, similar to the supervised CNN model.

Table 4. Average AUC values for FOG dataset.

Subject | Normative | Reconstruction | 1C-SVM | Supervised | Proposed
Sub1 | 0.87 | 0.64 | 0.73 | 0.80 | 0.88
Sub2 | 0.80 | 0.80 | 0.54 | 0.95 | 0.89
Sub3 | 0.87 | 0.77 | 0.43 | 0.90 | 0.89
Sub5 | 0.83 | 0.70 | 0.60 | 0.80 | 0.85
Sub6 | 0.60 | 0.50 | 0.70 | 0.80 | 0.82
Sub7 | 0.79 | 0.66 | 0.67 | 0.92 | 0.90
Sub8 | 0.64 | 0.48 | 0.70 | 0.65 | 0.72
Sub9 | 0.77 | 0.51 | 0.62 | 0.94 | 0.88
Mean | 0.77 | 0.63 | 0.62 | 0.84 | 0.85

5 Conclusion
This study presents the Multikernel Support Vector Machine with Probability Distribution Analysis based Classification (MSVMPDA) model for classifying Parkinson disease from gait parameters. The model utilizes the multiple ranks with majority vote based relative aggregate scoring model for selecting significant features, and suggests an equiweight based multikernel support vector machine classifier with Bayesian optimization for classifying the disease effectively. It also utilizes probability distribution analysis for verifying the obtained classification results, which is then used to make decisions for medical diagnosis. The performance of the proposed multikernel SVM model has been analysed with various gait datasets, in which the use of the linear, polynomial and RBF kernel functions together yields an accuracy rate above 90%. The proposed model also has an average AUC value of about 0.85, which is the highest among the models under comparison. The model reduces the computational complexity and also reduces the bias in the process of selecting a kernel and its parameters.

Future work will focus on utilizing other machine learning techniques to push the accuracy towards 100%, and on classifying Parkinson disease with automated learning algorithms applied to other medical examinations such as brain MRI.

References
1. Lajoie, A.C., Lafontaine, A.L., Kaminska, M.: The spectrum of sleep disorders in Parkinson’s
disease: a review. Chest 159, 818–827 (2020)
2. Parkinson’s Disease Statistics. https://parkinsonsnewstoday.com/parkinsons-disease-statis
tics/. Accessed 20 June 2021
3. Orozco, J.L., et al.: Parkinson’s disease prevalence, age distribution and staging in Colombia.
Neurol. Int. 12(1), 9–14 (2020)
4. Ryman, S.G., Poston, K.L.: MRI biomarkers of motor and non-motor symptoms in Parkinson’s
disease. Parkinsonism Relat. Disord. 73, 85–93 (2020)
5. Masato, A., Plotegher, N., Boassa, D., Bubacco, L.: Impaired dopamine metabolism in
Parkinson’s disease pathogenesis. Mol. Neurodegener. 14(1), 1–21 (2019)
6. Pinto, C., et al.: Movement smoothness during a functional mobility task in subjects with
Parkinson’s disease and freezing of gait–an analysis using inertial measurement units. J.
Neuroeng. Rehabil. 16(1), 110 (2019)
7. Barroso, F.O., et al.: Combining muscle synergies and biomechanical analysis to assess gait
in stroke patients. J. Biomech. 63, 98–103 (2017)
8. Hahn, M.E., Farley, A.M., Lin, V., Chou, L.S.: Neural network estimation of balance control
during locomotion. J. Biomech. 38(4), 717–724 (2005)
9. Aich, S., Choi, K., Park, J., Kim, H.C.: Prediction of Parkinson disease using nonlinear clas-
sifiers with decision tree using gait dynamics. In: Proceedings of the International Conference
on Biomedical and Bioinformatics Engineering, pp. 52–57 (2017)
10. Miljkovic, D., Aleksovski, D., Podpečan, V., Lavrač, N., Malle, B., Holzinger, A.: Machine
learning and data mining methods for managing Parkinson’s disease. In: Holzinger, A. (ed.)
Machine Learning for Health Informatics. LNCS (LNAI), vol. 9605, pp. 209–220. Springer,
Cham (2016). https://doi.org/10.1007/978-3-319-50478-0_10
11. Ricciardi, C., et al.: Using gait analysis’ parameters to classify Parkinsonism: a data mining
approach. Comput. Methods Programs Biomed. 180, 105033 (2019)
12. Ricciardi, C., et al.: Machine learning can detect the presence of mild cognitive impair-
ment in patients affected by Parkinson’s disease. In: International Symposium on Medical
Measurements and Applications (MeMeA), pp. 1–6. IEEE (2020)
13. Sathya Bama, S., Saravanan, A.: Efficient classification using average weighted pattern score
with attribute rank based feature selection. Int. J. Intell. Syst. Appl. 10(7), 29 (2019)
14. Shrivastava, P., Shukla, A., Vepakomma, P., Bhansali, N., Verma, K.: A survey of nature-
inspired algorithms for feature selection to identify Parkinson’s disease. Comput. Methods
Programs Biomed. 139, 171–179 (2017)
15. Lim, C.M., Ng, H., Yap, T.T.V., Ho, C.C.: Gait analysis and classification on subjects with
parkinson’s disease. Jurnal Teknol. 77(18), 1–6 (2015)
16. Manap, H.H., Tahir, N.M., Yassin, A.I.M.: Statistical analysis of Parkinson disease gait clas-
sification using artificial neural network. In: International Symposium on Signal Processing
and Information Technology (ISSPIT), pp. 060–065. IEEE (2011)
17. Zheng, H., Yang, M., Wang, H., McClean, S.: Machine learning and statistical approaches to
support the discrimination of neuro-degenerative diseases based on gait analysis. In: McClean,
S., Millard, P., El-Darzi, E., Nugent, C. (eds.) Intelligent Patient Management, vol. 189,
pp. 57–70. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-00179-6_4

18. Shi, H.: Best-first decision tree learning doctoral dissertation. The University of Waikato
(2007)
19. Gullu, M., Yilmaz, M., Yilmaz, I.: Application of back propagation artificial neural network
for modelling local GPS/levelling geoid undulations: a comparative study. In: FIG Working
Week, pp. 18–22 (2011)
20. Fix, E., Hodges, J.L.: Discriminatory analysis. Nonparametric discrimination: consistency properties. Int. Stat. Rev./Rev. Int. Stat. 57(3), 238–247 (1989)
21. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
22. Khoury, N., Attal, F., Amirat, Y., Oukhellou, L., Mohammed, S.: Data-driven based approach
to aid Parkinson’s disease diagnosis. Sensors (Basel) 19(2), 242 (2019)
23. Tahir, N.M., Manap, H.H.: Parkinson disease gait classification based on machine learning
approach. J. Appl. Sci. (Faisalabad) 12(2), 180–185 (2012)
24. Alkhatib, R., Diab, M.O., Corbier, C., El Badaoui, M.: Machine learning algorithm for gait
analysis and classification on early detection of Parkinson. IEEE Sens. Lett. 4(6), 1–4 (2020)
25. Lee, S.H., Lim, S.: Parkinson’s disease classification using gait characteristics and wavelet-
based feature extraction. Expert Syst. Appl. 39(8), 7338–7344 (2012)
26. Baby, M.S., Saji, A.J., Kumar, C.S.: Parkinsons disease classification using wavelet transform
based feature extraction of gait data. In: International Conference on Circuit, Power and
Computing Technologies (ICCPCT), pp. 1–6. IEEE (2017)
27. Mittra, Y., Rustagi, V.: Classification of subjects with Parkinson’s disease using gait data
analysis. In: 2018 International Conference on Automation and Computational Engineering
(ICACE), pp. 84–89. IEEE (2018)
28. Rad, N.M., et al.: Deep learning for automatic stereotypical motor movement detection using
wearable sensors in autism spectrum disorders. Signal Process. 144, 180–191 (2018)
29. Mohammadian Rad, N., Van Laarhoven, T., Furlanello, C., Marchiori, E.: Novelty detec-
tion using deep normative modeling for IMU-based abnormal movement monitoring in
Parkinson’s disease and autism spectrum disorders. Sensors 18(10), 3533 (2018)
30. Erfani, S.M., Rajasegarar, S., Karunasekera, S., Leckie, C.: High-dimensional and large-
scale anomaly detection using a linear one-class SVM with deep learning. Pattern Recogn.
58, 121–134 (2016)
31. Barth, J., et al.: Combined analysis of sensor data from hand and gait motor function improves
automatic recognition of Parkinson’s disease. In: 2012 Annual International Conference of
the IEEE Engineering in Medicine and Biology Society, pp. 5122–5125. IEEE (2012)
32. Ling, Y., Cao, Q., Zhang, H.: Credit scoring using multi-kernel support vector machine and
chaos particle swarm optimization. Int. J. Comput. Intell. Appl. 11(03), 1250019 (2012)
33. Alam, S., Kang, M., Pyun, J.Y., Kwon, G.R.: Performance of classification based on PCA,
linear SVM, and Multi-kernel SVM. In: International Conference on Ubiquitous and Future
Networks (ICUFN), pp. 987–989. IEEE (2016)
34. Liu, Z., Xu, H.: Kernel parameter selection for support vector machine classification. J.
Algorithms Comput. Technol. 8(2), 163–177 (2014)
35. Zhang, D., Chen, S., Zhou, Z.H.: Learning the kernel parameters in kernel minimum distance
classifier. Pattern Recogn. 39(1), 133–135 (2006)
36. Gardner, J.R., Kusner, M.J., Xu, Z.E., Weinberger, K.Q., Cunningham, J.P.: Bayesian
optimization with inequality constraints. In: ICML 2014, pp. 937–945 (2014)
37. Alasadi, S.A., Bhaya, W.S.: Review of data preprocessing techniques in data mining. J. Eng.
Appl. Sci. 12(16), 4102–4107 (2017)
38. Tamilselvi, R., Sivasakthi, B., Kavitha, R.: An efficient preprocessing and postprocessing
techniques in data mining. Int. J. Res. Comput. Appl. Robot. 3(4), 80–85 (2015)
39. Sim, J., Lee, J.S., Kwon, O.: Missing values and optimal selection of an imputation method
and classification algorithm to improve the accuracy of ubiquitous computing applications.
Math. Probl. Eng. 2015(12), 1–14 (2015)
40. Jain, S., Shukla, S., Wadhvani, R.: Dynamic selection of normalization techniques using data
complexity measures. Expert Syst. Appl. 106, 252–262 (2018)
41. Gopalsamy, A., Radha, B.: Feature selection using multiple ranks with majority vote-based
relative aggregate scoring model for parkinson dataset. In: Saraswat, M., Roy, S., Chowdhury,
C., Gandomi, Amir H. (eds.) Proceedings of International Conference on Data Science and
Applications. LNNS, vol. 287, pp. 1–19. Springer, Singapore (2022). https://doi.org/10.1007/
978-981-16-5348-3_1
42. Gönen, M., Alpaydın, E.: Multiple kernel learning algorithms. J. Mach. Learn. Res. 12,
2211–2268 (2011)
43. Nguyen, V.: Bayesian optimization for accelerating hyper-parameter tuning. In: Second
International Conference on Artificial Intelligence and Knowledge Engineering (AIKE),
pp. 302–305. IEEE (2019)
44. Flagg, C., Frieder, O., MacAvaney, S., Motamedi, G.: Streaming gait assessment for
Parkinson’s disease. In: HSDM@ WSDM, pp. 34–42 (2020)
45. Goldberger, A.L., et al.: PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101(23), e215–e220 (2000)
46. Bohnen, N.I., et al.: Extra-nigral pathological conditions are common in Parkinson’s disease
with freezing of gait: an in vivo positron emission tomography study. Move. Disord. 29(9),
1118–1124 (2014)
47. Bachlin, M., et al.: Wearable assistant for Parkinson’s disease patients with the freezing of
gait symptom. IEEE Trans. Inf Technol. Biomed. 14(2), 436–446 (2009)
Impact of Convolutional Neural Networks for Recognizing Facial Expressions: Deep Learning Perspective

Ridhima Sabharwal and Syed Wajahat Abbas Rizvi(B)

Department of Computer Science and Engineering, Amity University, Lucknow, Uttar Pradesh, India
swarizvi@lko.amity.edu
Abstract. Automatic emotion recognition from facial expressions has become an active area of research, and a significant number of new studies are added to this domain every day. Facial expression analysis has been contributing to areas such as human–machine interfaces and healthcare. Many researchers have been developing techniques for interpreting facial expressions and extracting the underlying features. With this motivation, the present study addresses facial emotion recognition in the context of deep learning with convolutional neural networks.

Keywords: Convolutional neural networks · Facial emotion · Deep learning
1 Introduction

Facial expression recognition, commonly abbreviated as FER, offers technology a means of detecting sentiment. It can be considered one of the most widely used applications of AI and pattern analysis [1]. A comparatively novel and promising trend in categorizing a person's emotions from facial expressions is the development and use of software that automates this process using advanced machine learning techniques [2]. Even though great progress has been made, recognizing facial expressions with high precision remains difficult because of their complexity and variety. In everyday life, people ordinarily perceive emotions through characteristic features displayed as part of a facial expression. For example, happiness is evidently associated with a smile, an upward movement of the corners of the lips [3]. Similarly, other emotions are characterized by other deformations typical of a particular expression. Research into the automatic recognition of facial expressions addresses the problems surrounding the representation and classification of the static or dynamic attributes of these deformations [4].
Gauging sentiment is difficult. Feelings are frequently brief, concealed, and sometimes conflicted. Asking a participant what exactly he or she is feeling during a
conversation or through a survey may not always be the most effective approach. Many participants express what they think we want to hear, or simply find it difficult to articulate what they are actually feeling [5]. Many are also cautious or afraid to confess their real feelings to a complete stranger. The participant's emotional state while performing a task is often the real object of interest. Consider the variety of sentiments a participant could feel while calculating the amount of money he or she will have collected at retirement, learning about a disease he or she already has, or simply playing a sport with family [6]. Facial recognition is a biometric technology that uses distinguishable facial features to identify an individual. Today we are immersed in data of all kinds, and the abundance of available photo and video data provides the datasets needed to make facial recognition technology work [7]. Facial recognition systems analyze visual data from the huge quantity of images and videos produced by high-quality Closed-Circuit Television (CCTV) cameras installed in our cities for security, by cell phones, by social media, and by other online activity. Machine learning and artificial intelligence capabilities in the software map observable facial features numerically, search for patterns in the visual data, and compare new images and videos with data stored in facial recognition databases to determine identity [8].
The rest of the paper is organized as follows: Sect. 2 describes deep learning and its benefits; Sect. 3 presents an overview of convolutional neural networks along with their internals; Sect. 4 discusses the training, testing and validation datasets utilized in this study; results and discussion are presented in Sect. 5; and finally the paper concludes in Sect. 6.
2 Significance of Deep Learning

Many specialists speak of deep learning as the next technological revolution and the fourth industrial revolution, and the use of this kind of computational process was expected to multiply twelvefold by 2021. Deep learning is the computational framework behind artificial intelligence. It uses structured machine learning algorithms in the form of artificial neural networks. These algorithms allow a machine to learn by itself, to establish its own rules, and to set up new parameters with which to make and execute decisions [9]. These deep networks are formed by many layers in which, as in biological neural networks, the signals are adjusted and passed progressively through various layers of processing and transformation. Unlike classical machine learning, deep learning makes it possible to automate the training process and build models automatically, without the need for human mediation [10].
The advantages of deep learning apply to any system that is built on data. That is why most economic sectors already have artificial intelligence processes in place to improve their services and performance. Deep learning is likewise responsible for several everyday technological processes at the consumer level [11].
2.1 Recognition of Speech

It was observed during the review of the literature that requirement-level measures are generally ignored while quantifying software reliability, even though errors introduced by improper requirements analysis have a severe impact on the reliability of the finally delivered software. Therefore, considering requirements-stage measures along with their effect on software reliability is an effort to fill this gap.
2.2 Face Recognition and Computational Vision

This application is in full momentum for cell phones and web search engines. Thanks to these computer networks, it is possible to learn characteristic facial features and recognize faces. Similarly, these networks make it possible to perceive and separate useful information contained in images [12].
2.3 Recreation of Scenes

Deep learning has enabled computational work with images, permitting identity recognition, restoration of images and reconstruction of scenes, key components of technologies as advanced as autonomous vehicles [13].
2.4 Semantic Translation and Natural Language

Deep learning applied in this field makes it possible to respond to commands issued in natural language, to have machines understand users' comments, and to obtain information about their conversations. Deep learning permits the intelligent combination of words to acquire a semantic view and to locate the most precise words depending on the specific context.
3 Convolutional Neural Networks

Convolutional neural networks were inspired by biological processes, in that the connectivity pattern among their neurons resembles the organization of the animal visual cortex. Individual cortical neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. The receptive fields of different neurons partially overlap in such a way that together they cover the entire visual field. Similar to how a baby learns to recognize objects, we have to show our algorithm a huge number of pictures before it is able to generalize from the input and make predictions for images it has never seen [14]. Computers 'see' in a different way than we do: their world consists of nothing but numbers. Every image can be represented as a 2-dimensional array of numbers, known as pixels [15]. The fact that they perceive images in a different manner does not mean we cannot train them to recognize patterns as we do. We simply have to think of what an image is in a different way.
CNNs use relatively little preprocessing compared with other image classification algorithms: the network learns the filters that in traditional algorithms were hand-engineered. This independence from prior knowledge and human effort in feature design is a major advantage [16]. Within deep learning, a convolutional neural network (CNN) is a class of deep neural network generally used for the analysis of visual images. They are frequently also referred to as shift-invariant or, at times, space-invariant artificial neural networks. A convolution is nothing but a simple application of a filter to an input that results in an activation. Repeatedly applying the same filter to the input produces a map of activations called a feature map. A feature map indicates the locations and strength of a detected feature in an input, such as an image [17].
Convolutional neural networks are able to learn a very large number of filters in parallel on a training dataset, under the constraints of a specific predictive modelling problem such as image classification. The outcome of this process is highly specific features that can be detected anywhere on the input images [18].

Fig. 1. Internals of CNN
Looking carefully at Fig. 1, there is an RGB input picture of width W, height H and three channels. If this is the first layer of the model, the input is the image itself; in any other situation, the input to the layer is a set of feature maps. A feature map, the yellow block in the figure, is a collection of N two-dimensional "maps", each of which represents a particular "feature" that the model has detected within the picture. This is the main reason convolutional layers are known as feature extractors. The fundamental question is how we get from the input (whether image or feature map) to a feature map [19]. This is done through kernels, also called filters or channels. These filters, of which we configure some number N per convolutional layer, "slide" over the input; they have the same number of channels as the input but much smaller widths and heights. For example, in the situation above, a filter might be 3 × 3 pixels wide and high, yet it always has 3 channels because the input has 3 channels as well [20].
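To make the sliding-filter idea concrete, here is a minimal PyTorch sketch (the shapes and filter count are illustrative, not the authors' exact configuration): a convolutional layer with N = 16 filters of size 3 × 3 slides over a 3-channel input and produces 16 feature maps.

```python
import torch
import torch.nn as nn

# One convolutional layer: 3 input channels (RGB), N = 16 filters of size 3x3.
# Each filter has 3 channels, matching the input, as described above.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

x = torch.randn(1, 3, 48, 48)   # a single RGB image, 48 x 48 pixels
feature_maps = conv(x)          # the filters "slide" over the input

print(feature_maps.shape)       # torch.Size([1, 16, 48, 48]): 16 feature maps
```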
4 Data Set Utilized

To train and test the model, we have made use of the Kaggle facial expression dataset, which provides images depicting a certain facial expression, each with a label attached to it. The labels cover the 5 classes of facial expressions discussed throughout this paper, namely anger, disgust, sadness, happiness and surprise. We further cropped and reshaped each image to 48 × 48 pixels and converted the colored images to gray scale. In all, the Kaggle dataset provided us with 24568 emotion-feature images along with the associated labels. The adopted methodology is depicted in Fig. 2.
Fig. 2. Stepwise process
4.1 Training Data

To successfully develop a machine learning model, proper data must be used to train it; training data is fundamental to the way these innovations work. The training data is an initial collection of data used to help a program understand how to apply concepts like neural networks in order to learn and produce refined outcomes. Algorithms learn from data: they discover relationships, build understanding, make decisions, and assess their confidence from the training data they are given. The better the training data is, the better the model performs; indeed, the quality and quantity of the training data have as much to do with the success of a data project as the algorithms themselves. Even when an immense quantity of well-organized data has been stored, it may not be labeled in a way that actually functions as a training dataset for the model. For instance, autonomous vehicles do not simply require photos of the street; they need labeled pictures in which every vehicle, pedestrian, road sign, and more is annotated. Sentiment analysis projects require labels that help an algorithm understand when somebody is using slang or sarcasm. Chatbots need entity extraction and careful syntactic analysis, not simply raw language. It is therefore critical that the data used for training be enriched and labeled, and more of it may need to be gathered to power the algorithms; chances are that the data already stored is not quite fit to be used to train algorithms. Building a great model requires a solid foundation, which means excellent training data: more than 3 billion lines of data have reportedly been labeled for some of the most innovative organizations on the planet, and regardless of whether it is images, text, audio, or any other sort of data, a well-prepared training set is what makes models successful.
4.2 Testing Data

Testing data in software testing is the input given to a software program during test execution. It represents data that affects, or is affected by, software execution during testing. Test data is used both for positive testing, to verify that functions produce expected results for given inputs, and for negative testing, to examine the software's ability to handle unusual, extreme, or unexpected inputs. Poorly designed testing data may not cover all possible test scenarios, which hampers the quality of the software.
4.3 Validation Data

A validation dataset is a sample of data held back from the training of the model that is used to give an estimate of model skill while tuning the model's hyperparameters. The validation dataset is different from the test dataset, which is also held back from the training of the model but is instead used to give an unbiased estimate of the skill of the final tuned model when comparing or selecting between candidate models. There is a lot of confusion in applied machine learning about what a validation dataset is exactly and how it differs from a test dataset.
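A common way to realize such a three-way split in PyTorch, reusing the hypothetical dataset object from the earlier sketch (the 80/10/10 proportions are an assumption, not the paper's exact figures):

```python
import torch
from torch.utils.data import random_split

# Illustrative 80/10/10 split into training, validation and test subsets.
n = len(dataset)
n_val = n_test = n // 10
n_train = n - n_val - n_test

train_set, val_set, test_set = random_split(
    dataset, [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(42),  # reproducible split
)
```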
5 Result

Initially, using the Kaggle dataset, the number of images for each of the 5 emotions, namely anger, disgust, sadness, surprise and happiness, is shown in a histogram visualized with the matplotlib library, to get a clear understanding of the kind of data we are dealing with. Further, after applying the required CNN operations, we ran the code to find its best fit over around 50 epochs. An epoch is a term used with CNNs that denotes the number of passes through the whole training dataset the learning algorithm has completed. Datasets are normally grouped into batches, especially when the amount of data is very large.
Later, we use data visualization once again to plot a graph checking the accuracy of our built and trained model. The accuracy of a classification algorithm is one way to quantify how often the algorithm classifies a data point correctly, as shown in Fig. 3. Accuracy is the number of correctly predicted data points out of all the data points. When training a model, one of the fundamental things to avoid is overfitting: the situation in which the model fits the training data well but cannot generalize and make accurate predictions for data it has not seen before.
To see whether their model is overfitting, data scientists use a method called cross-validation, in which the data is split into two parts: the training set and the validation set. The training set is used to train the model, while the validation set is used only to evaluate the model's performance.
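A minimal sketch of such an epoch-based training loop with a held-out validation pass (the model, data loaders and hyperparameters are placeholders, not the authors' exact configuration):

```python
import torch
import torch.nn as nn

def run_training(model, train_loader, val_loader, epochs=50):
    """Train for a number of epochs, checking validation accuracy each time."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for epoch in range(epochs):
        model.train()
        for images, labels in train_loader:   # one pass over all batches = one epoch
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

        # Validation pass: measures generalization, never used for weight updates.
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for images, labels in val_loader:
                preds = model(images).argmax(dim=1)
                correct += (preds == labels).sum().item()
                total += labels.size(0)
        print(f"epoch {epoch + 1}: val_acc = {correct / total:.3f}")
```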
Fig. 3. Softmax function

Fig. 4. Training & validation accuracy
Metrics on the training set show how the model is progressing in terms of its training, but it is the metrics on the validation set that measure the quality of the model, that is, how well it is able to make new predictions based on data it has not seen before. Accordingly, loss and acc are measures of loss and accuracy on the training set, while val_loss and val_acc are measures of loss and accuracy on the validation set, as in Fig. 4 above.
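These four quantities are typically recorded once per epoch and plotted; a minimal matplotlib sketch, assuming the per-epoch values have been collected into plain Python lists during training:

```python
import matplotlib.pyplot as plt

def plot_history(history):
    """Plot training vs. validation accuracy; a widening gap signals overfitting.

    `history` is assumed to be a dict of per-epoch lists, e.g.
    {"acc": [...], "val_acc": [...], "loss": [...], "val_loss": [...]}.
    """
    epochs = range(1, len(history["acc"]) + 1)
    plt.plot(epochs, history["acc"], label="training accuracy")
    plt.plot(epochs, history["val_acc"], label="validation accuracy")
    plt.xlabel("epoch")
    plt.ylabel("accuracy")
    plt.legend()
    plt.show()
```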
Next, a learning curve has been drawn, as shown in Fig. 5. As per the literature, a learning curve can be considered a plot of model learning performance over experience or time. Learning curves are a widely used diagnostic tool in machine learning for algorithms that learn from a training dataset incrementally.

Fig. 5. Training & validation loss

The model can be evaluated on the training dataset and on a holdout validation dataset after every update during training, and plots of the measured performance can be made to show learning curves, as in Fig. 5. Reviewing the learning curves of models during training can be used to diagnose problems with learning, such as an underfit or overfit model, as well as whether the training and validation datasets are suitably representative.
Finally, a confusion matrix has been computed using the sklearn library to measure the effectiveness of the model. There are a variety of approaches to checking the performance of a classification model, but none has stood the test of time like the confusion matrix. It helps us evaluate how our model performed and where it went wrong, and offers guidance for correcting our course. It is a performance measurement for machine learning classification problems where the output can be two or more classes.
As can easily be noticed from Fig. 6, which contains the full connection results, the confusion matrix is an N × N matrix used for evaluating the performance of a classification model, where N is the number of target classes. The matrix compares the actual target values with those predicted by the model. This gives us a holistic view of how well our classification model is performing and what kinds of errors it is making. It is very helpful for estimating Recall, Accuracy, Specificity, Precision, and above all the AUC-ROC curve.
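A minimal scikit-learn sketch, assuming the true and predicted labels for the test images have already been gathered into two arrays:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

LABELS = ["anger", "disgust", "sad", "happiness", "surprise"]

def show_confusion(y_true, y_pred):
    """Rows are the actual classes, columns the predicted classes."""
    cm = confusion_matrix(y_true, y_pred)  # 5x5 matrix for the 5 expressions
    ConfusionMatrixDisplay(cm, display_labels=LABELS).plot()
    plt.show()
```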
Fig. 6. Full connection result

Fig. 7. Final result
The main result, the prediction of the system designed for facial expression recognition, is shown in Fig. 7. The model has achieved an accuracy of 72%. The proposed method is thus shown to be effective for emotion recognition.
6 Conclusion

The main crux of this paper is facial expression recognition, and convolutional neural networks were used because they deal well with images and are useful for large datasets. The dataset utilized in this study consists of about 25000 images classified into different expressions, mainly anger, sadness, happiness, disgust, etc. First the input image is cropped to remove irrelevant parts that do not bear on facial expression; subsequently it is converted to gray scale, as it is easier to extract features from a gray-scale image. The results are quite encouraging and will definitely help other researchers working in the same domain. As far as future extensions of this work are concerned, a better and more general dataset with a larger number of images can be used in order to further enhance the obtained results. Besides that, more expressions can be incorporated to cover a wider spectrum of facial expressions.
References
1. Sariyanidi, E., Gunes, H., Cavallaro, A.: Automatic analysis of facial affect: a survey of registration, representation, and recognition. IEEE Trans. Pattern Anal. Mach. Intell. 37(6), 1113 (2014). https://doi.org/10.1109/TPAMI.2014.2366127
2. Anagnostopoulos, C.-N., Iliou, T., Giannoukos, I.: Features and classifiers for emotion recog-
nition from speech: a survey from 2000 to 2011. Artif. Intell. Rev. 43(2), 155–177 (2012).
https://doi.org/10.1007/s10462-012-9368-5
3. Shu, L., et al.: A review of emotion recognition using physiological signals. Sensors 18(7),
2074 (2018). https://doi.org/10.3390/s18072074
4. Rizvi, S.W.A., Singh, V.K., Khan, R.A.: The state of the art in software reliability prediction:
software metrics and fuzzy logic perspective. In: Satapathy, S.C., Mandal, J.K., Udgata, S.K.,
Bhateja, V. (eds.) Information Systems Design and Intelligent Applications. AISC, vol. 433,
pp. 629–637. Springer, New Delhi (2016). https://doi.org/10.1007/978-81-322-2755-7_65
5. Catherine, M., et al.: Survey on AI-based multimodal models for emotion detection. In: High-Performance Modelling and Simulation for Big Data Applications, pp. 307–324. Springer International Publishing (2019)
6. Alkawaz, M.H., Mohamad, D., Basori, A.H., Saba, T.: Blend shape interpolation and FACS
for realistic avatar. 3D Res. 6(1), 6 (2015). https://doi.org/10.1007/s13319-015-0038-7
7. Rouast, P.V., Adam, M., Chiong, R.: Deep learning for human affect recognition: insights
and recent developments. IEEE Trans. Affect. Comput. 12(2), 524–543 (2018). https://doi.
org/10.1109/TAFFC.2018.2890471
8. Shan, C., Gong, S., McOwan, P.W.: Facial expression recognition based on local binary patterns: a comprehensive study. Image Vis. Comput. 27(6), 803–816 (2009). https://doi.org/10.1016/j.imavis.2008.08.005
9. Jabid, T., Kabir, M.H., Chae, O.: Robust facial expression recognition based on local
directional pattern. ETRI J. 32(5), 784–794 (2010). https://doi.org/10.4218/etrij.10.1510.
0132
10. Zhang, S., Li, L., Zhao, Z.: Facial expression recognition based on wavelets and sparse
representation. In: 2012 IEEE 11th Int. Conference on Sig. Processing, vol. 2, issue 5, pp. 816–
819 (2012). https://doi.org/10.1109/ICSP.2012.649176
11. Gross, R., Matthews, I., Cohn, J., Kanade, T., Baker, S.: Multi-PIE. Image Vis. Comput. 28(5), 807–813 (2010). https://doi.org/10.1016/j.imavis.2009.08.002
12. Pantic, M., Valstar, M., Rademaker, R., Maat, L.: Web-based database for facial expression
analysis. In: IEEE Int. Conf. on Multimedia and Expo. p. 5, (2005). https://doi.org/10.1109/
ICME.2005.1521424
13. Valstar, M.F., Jiang, B., Mehu, M., Pantic, M., Scherer, K.: The first facial expression recognition and analysis challenge. In: Face and Gesture 2011, pp. 921–926 (2011). https://doi.org/10.1109/FG.2011.5771374
14. Rizvi, S.W.A., Khan, R.A.: Improving software requirements through formal methods. Int. J.
Inform. Comput. Technol. 3(11), 1217–1223 (2013)
15. Dhall, A., Goecke, R., Lucey, S., Gedeon, T.: Static facial expression analysis tough con-
ditions: evaluation protocol and benchmark. In: IEEE International Conference on Com-
puter Vision Workshops (ICCV Workshops), pp. 2106–2112 (2011). https://doi.org/10.1109/
ICCVW.2011.6130508
16. Rizvi, S.W.A., Singh, V.K., Khan, R.A.: Software reliability prediction using fuzzy inference
system: early stage perspective. Int. J. Comput. Appl. 145(10), 16–23 (2016)
17. Lucey, P., Cohn, J.F., Kanade, T., Saragih, J., Ambadar, Z., Matthews, I.: The extended cohn-
kanade dataset: a complete dataset for action and emotion-specified expression. In: 2010 IEEE
Computer Society Conf. on Comp. Vision and Pattern Recognition – Workshops, pp. 94–101
(2010). https://doi.org/10.1109/CVPRW.2010.5543262
18. Mohammadpour, M., Khaliliardali, H., Hashemi, S.M.R., AlyanNezhadi, M.M.: Facial emotion recognition using deep convolutional networks. In: 2017 IEEE 4th International Conference on Knowledge-Based Engineering and Innovation (KBEI), pp. 17–21 (2017). https://doi.org/10.1109/KBEI.2017.8324974
19. Zhang, S., Zhang, S., Huang, T., Gao, W.: Multimodal deep convolutional neural network for
audio-visual emotion recognition. In: Proc. 2016 ACM on International Conf. on Multimedia
Retrieval, NY, NY, USA, pp. 281–284 (2016). https://doi.org/10.1145/291196.2912051
20. Kim, D.M., Baddar, W.J., Jang, J., Ro, Y.M.: Multi-objective based spatio-temporal representation learning robust to expression intensity variations for expression recognition. IEEE Trans. Affect. Comput. 10(2), 223–236 (2019)
Handwritten Bengali Digit Classification Using Deep Learning

Amitava Choudhury1(B) and Kaushik Ghosh2

1 Department of Computer Science and Engineering, Pandit Deendayal Energy University, Gandhinagar, Gujarat, India
a.choudhury2013@gmail.com
2 School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India

Abstract. Bengali Character Recognition has recently become a field of increased
academic interest due to its wide and varying applications. In this paper, a recog-
nition system based on transfer learning has been proposed for robust recognition
of Bengali numeric digits. Several preprocessing techniques have been used on
the images to aid in the training process. ResNet-18 has been used for the transfer
learning. All deep learning applications have been achieved by the extensive use
of the PyTorch library. The dataset used has been chosen for its unbiasedness,
which allows the proposed model to perform better on digits it has never encoun-
tered before. The proposed model has shown state-of-the-art performance, with an accuracy of 92% in just 25 epochs. Thus, the proposed model not only provides good accuracy, but does so with fewer parameters than other leading approaches while taking fewer epochs to reach the final result.

Keywords: Numeral recognition · Deep learning · PyTorch · Pattern recognition
1 Introduction
Numeral systems represent numeric digits in a well-defined and understandable man-
ner. In India, numerous regional numeral systems along with the widely used Western
numeral system are used in day-to-day life. The use of handwritten text is more prevalent in India as compared to digitized scripts. Handwritten character recognition
is gaining a lot of attention in the academic world. In India a large number of lan-
guages are spoken and written and each of the regional numeric systems bring along
with it a different challenge, so far character recognition is concerned. By 2019, the
projected number of internet users is 627 million out of a population of about 1.37 bil-
lion. This shows that only about 46% of the Indian population is connected to the web.
The remaining chunk of the population has no access to the Internet and relies on hand-
written methods for official and personal work. This large population generates a huge
volume of handwritten data and thereby creates need for automated systems capable of
recognizing and indexing the characters. The problem becomes more intense as most of
the Indian languages are under resourced. In this paper, we have taken Bangla numeral
recognition for case study.

Bangla is an Indo-Aryan script used in both Bengali and Assamese language. It is the
official language of Bangladesh and is the second most widely spoken language of India.
In India, the language is in common use especially in the states of West Bengal, Tripura,
and Assam. Worldwide, it is spoken by about 260 million people and is the 6th most
spoken language throughout the world. Bangla’s cultural significance and importance
therefore is clearly undeniable.
In this paper, an offline Bangla handwritten numeric character recognition system
has been proposed. Handwritten character recognition in itself is challenging due to
several factors like the relative similarity between characters in both size and shape, and
the relative dissimilarity between the same characters written by different people. These
challenges become more difficult when Bangla characters are taken into consideration.
Bangla numerical system like the Arabic numeral system consists of 10 digits, the com-
bination of which represents all required numbers. Bangla characters have more curves
in comparison to their Arabic counterparts. Furthermore, there are a lot of characters
that are similarly shaped. All this makes the development of a generalized recognition
system difficult [1]. Figure 1 shows a writing pattern of Bangla numerals.

Fig. 1. Writing pattern of Bengali numerals.
While some work has been done in the field of Bangla handwritten character recog-
nition, detailed in Sect. 2 of the paper, there is a need for a leaner system that takes less time to train. The system proposed here achieves this, providing state-of-the-art accuracy while taking just 25 epochs to train. The availability of an unbiased and vast dataset is integral to the performance of any machine learning or deep learning model. The dataset used in this paper is discussed in Sect. 3. Before any
training can begin, it is necessary to pre-process the images. Several pre-processing tech-
niques have been applied to the images in our dataset. The images have been randomly
cropped and normalized. Thereafter the brightness, contrast and saturation of the images
have been randomly changed to provide the model with a more generalized framework,
such that an increased accuracy level could be achieved. All the pre-processing tech-
niques used have been detailed in Sect. 4. Convolutional Neural Networks (CNN) have
proven to be a proficient feature for extractors [2]. Hence, they perform very well on
tasks like recognition and classification. A variant of Residual Network, ResNet-18 has
been used in this study for recognition. The model has been explained and detailed in
Sect. 5.
2 Literature Review
Pradeep et al. proposed a novel approach of feature extraction for offline character
recognition, from an image of English alphabets [3]. The work divided the image into
equal sized zones through a diagonal feature extraction mechanism. This method out-
performed the horizontal and vertical feature extraction mechanisms. A similar neural
network based English alphabet handwritten character recognition method was proposed
by Patil and Shimpi [4]. Gradient features were extracted using Sobel operators instead
of directional element feature in [5]. For classification, the paper used the modified
quadratic discriminant function. By using the Mahalanobis function, the capability of
this quadratic discriminant function to differentiate between similar looking characters
was enhanced. In [6], a recognition system was developed for Bangla basic characters.
The proposed 5-layer Convolutional Neural Network (CNN) based system was also
used in the character recognition of Devnagri, Oriya, Telugu, and English numerals as
well. Here, the classification was carried out by the use of a Support Vector Machine
(SVM). An accuracy of 98.375% was achieved for Bangla Numerals using this approach.
Another CNN based approach for the handwritten character recognition of Bangla char-
acters was introduced by Rahman, et al. Shill [7]. The used model had two convolutional
layers, with 5 × 5 filters. The model achieved an accuracy of 85.96%. In [8], an offline handwritten character recognition system for Bangla characters, with emphasis on discriminating between similarly shaped characters, had been put forth. Preprocessing techniques like
binarization, normalization, and salt-and-pepper noise cleaning were done. The work
proposed a two step classification process, where the first stage consisted of MGDF
and the second stage consisted of the neural classifier. An accuracy of 95.84% was
achieved through this framework. Gaur and Yadav in their paper [9] handled the problem
of handwritten Devnagri character recognition. Binarization was performed in the pre-
processing phase. The horizontal bar present on every Hindi character had been detected.
Lastly, segmentation was carried out to separate the characters. For feature extraction
K-Means algorithm [10] was used. K-means shows promising results for poorly illu-
minated images. The classification was performed using SVM with a linear kernel. An
accuracy of 95.86% has been reported through this methodology. Authors in [11] sug-
gested a handwritten character recognition system for Marathi characters using R-HOG
features. Here, Rectangular Histogram Oriented Gradient (R-HOG) method was used to
extract features from an image. For classification, a comparative study was performed
between SVMs and a Feed Forward Artificial Neural Network (FFANN). The neural
network performed considerably better than the SVM with an accuracy of 97.15%. The
SVM's accuracy was 95.64%. A deep convolutional neural network based handwritten Bangla character recognition system was designed to recognize such characters. Here, feature extraction was carried out by the convolution and sub-sampling layers, and fully connected layers carried out classification [12]. Choudhury et al. presented a robust Bengali numeral classification technique using a HOG-based feature extraction algorithm [13]. Adaptive coefficient template matching, one of the simplest techniques, is described in [14], whereas an attempt towards recognition of size- and shape-independent Bangla handwritten numerals is discussed in [15]. By looking into the works of different
researchers, we drew the inference that there is always a tradeoff between accuracy and
computation. While some of the works studied provided a very high degree of accuracy
and lacked in reducing the cost of computation, there were other works that are com-
putationally efficient with accuracy level below par. The work proposed in this paper
therefore tries to bridge the gap between the two and thereby provide a computationally
inexpensive model without compromising on the accuracy part.
3 Methodology

3.1 Dataset
The dataset used in this study consists of 6000 images of handwritten Bangla numbers.
The images are all 32 × 32 pixels and each image is a unique handwritten variant of
the required Bangla numeral. The dataset has been divided into testing, training and
validation subsets to aid the training process of the neural network. The training set is used by the model to learn the features of an image and the corresponding classifications in a supervised learning paradigm. This split contains ten classes, one for each digit
in the Bangla script numeral. Each class contains 420 unique handwritten characters
for that numeral. The second split is the validation split, which is used to tune hyper
parameters like the learning rate, number of epochs, and so forth. This split too contains
ten classes each having 162 images. Finally, the test split to check the performance of
the model has been created. This subset has ten classes with 10 images in each class.
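A sketch of how such a pre-split dataset is typically loaded with torchvision (the folder names and layout are hypothetical; the paper does not specify them):

```python
from torchvision import datasets

# Hypothetical layout: bangla_digits/{train,val,test}/<digit>/image.png
splits = {name: datasets.ImageFolder(f"bangla_digits/{name}")
          for name in ("train", "val", "test")}

for name, ds in splits.items():
    print(name, len(ds), "images across", len(ds.classes), "classes")
```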
3.2 Pre-processing and Augmentation

Pre-processing is an important aspect of any neural network based classifier. Applying feature extraction and classification to raw, unprocessed images gives poor results, as is evident from the work done by Pal and Sudeep [16]. The pre-processing techniques applied in this paper are discussed below.

Random Resize, Crop, and Flip
Data augmentation is an important means of increasing the size of a dataset. It also
proves useful in providing more vantage points to the model to learn from. This helps
in generalizing the framework. In the paper by Szegedy, random minute distortions
were made into the original images of some datasets including the MNIST dataset. It is
shown that small distortions that are visibly hard to distinguish provide incorrect or no
predictions. For all these reasons, random resizing, cropping and random vertical and
horizontal flipping has been performed on the dataset. This augmentation has been done
by randomly cropping the original image. This cropping has been set to be about 0.08 to
1.0 times the original image. A random aspect ratio has also been chosen which is 3/4th
to 4/3rd of the original image. These settings have been verified to provide good results
in image recognition tasks using CNNs. After these steps the image is resized back to
the required image size for the CNN.
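A minimal torchvision sketch of this augmentation with the area and aspect-ratio ranges quoted above (the flip probabilities and the 24 × 24 target size, which the paper uses later, are assumptions where not stated):

```python
from torchvision import transforms

augment = transforms.Compose([
    # Crop covering 0.08-1.0 of the original area, aspect ratio 3/4 to 4/3,
    # then resize back to the input size required by the CNN.
    transforms.RandomResizedCrop(24, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3)),
    transforms.RandomHorizontalFlip(p=0.5),  # assumed probability
    transforms.RandomVerticalFlip(p=0.5),    # assumed probability
])
```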
Random Changes in Brightness, Saturation, Hue, and Contrast
In order to create a framework capable of recognizing Bengali handwritten characters in
a number of situations, a pre-processing technique that randomly varies the properties
Handwritten Bengali Digit Classification Using Deep Learning 89

of the original image is used. This augmentation allows the model to predict characters
in situations with low brightness, differing colors of the background/paper and the ink
used, and the saturation of the image. The brightness is chosen from a uniform distribu-
tion lying between 1 and 3. Similarly, contrast and saturation is chosen from the uniform
distribution that is between 1 and 3. The hue of the image is allowed to be changed from
the distribution −0.1 to 0.5. This process makes the proposed model more robust.
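In torchvision these random photometric changes can be expressed with a ColorJitter transform; a sketch using the ranges quoted above:

```python
from torchvision import transforms

# Brightness, contrast and saturation factors drawn uniformly from [1, 3];
# hue shift drawn from [-0.1, 0.5], as described above.
jitter = transforms.ColorJitter(
    brightness=(1, 3), contrast=(1, 3), saturation=(1, 3), hue=(-0.1, 0.5),
)
```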
Normalize
Normalization is the process of centering an image. In an image, the ranges of the differing features in different color channels can be vastly different. This could cause some features to dominate over others based only on numerical magnitude rather than the importance of the feature. Normalization helps in converting the images to have zero mean and unit variance, and has been shown to increase the accuracy of neural network-based classifiers like CNNs. In this study, the mean and standard deviation were calculated for all images across all color channels, and the values for each color channel were specified. Then each channel's value is modified using the following formula:

\[ input_{ch} = \frac{input_{ch} - mean_{ch}}{stddev_{ch}} \tag{1} \]
Interpolation
Interpolation is the technique by which new data points are predicted when the range
between which the new data point is to be predicted is known. When cropping and
resizing images, interpolation aids in the prediction of new pixel values of the resultant
image. This is achieved by predicting the value of a given pixel by looking at the pixels
neighboring the pixel in question and using the values of the neighboring pixels to pre-
dict value of the current pixel. In this study, Bicubic interpolation has been used. This
technique provides a sharper image with reduced interpolation artifacts like Aliasing,
Blurring, etc. It produces sharper images when compared to Linear and Bilinear interpo-
lation. Instead of considering 4 pixels in the nearest 2 × 2 matrix of neighboring pixels
like in Bilinear interpolation, Bicubic interpolation uses 16 pixels in the nearest 4 ×
4 matrix of neighboring pixels. This increases the number of calculations but provides
smoother images.
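Putting the pieces of Sect. 3.2 together, a sketch of one possible preprocessing pipeline with bicubic interpolation (the normalization statistics are placeholders for the per-channel values computed from the dataset; recent torchvision versions expose the interpolation mode as shown):

```python
from torchvision import transforms
from torchvision.transforms import InterpolationMode

# Placeholder statistics; the paper computes these per channel from the data.
MEAN, STD = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]

pipeline = transforms.Compose([
    transforms.RandomResizedCrop(
        24, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3),
        interpolation=InterpolationMode.BICUBIC,  # 4x4 neighborhood, sharper images
    ),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.ColorJitter(brightness=(1, 3), contrast=(1, 3),
                           saturation=(1, 3), hue=(-0.1, 0.5)),
    transforms.ToTensor(),
    transforms.Normalize(MEAN, STD),  # Eq. (1) applied per channel
])
```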
Convolutional Neural Networks

Convolutional neural networks have proven to be quite robust in their feature extraction. This makes them useful in image recognition, natural language processing, etc.

Convolution
The convolutional layer is the first layer in a Convolutional Neural Network and it takes
the handwritten image. Receiving an entire image with all pixel values of an image as
input, as done in the fully connected layers of an Artificial Neural Network, not only
drastically increases the computation overhead but also introduces irrelevant features
into the learning. This negatively impacts the model’s performance and generalization.
Convolutional Layers instead uses a filter to scan a particular region of the image. This
region is known as the receptive field. The parameters in the filter are learnable and these
parameters are shared for the convolutional layer, this means that for a convolutional
layer, the filter has the same weights. This reduces the number of parameters to optimize
while making the convergence faster. Element-wise multiplication is performed between
the pixels of the image within the receptive field of the filter, and the weights of the filter.
A feature map is produced as an output of the convolutional layer. Fast Fourier transform
turns the convolution operation into element wise multiplication reducing computation.
The formula used for the convolution operation is

\[ \text{featuremap} = \text{input} * \text{kernel} = \sqrt{2\pi}\, F^{-1}\!\left( F[\text{input}] \cdot F[\text{kernel}] \right) \tag{2} \]

In Eq. (2), the convolution operation is denoted by $*$, $F$ is the Fourier transform, $F^{-1}$ is the inverse Fourier transform, and $\sqrt{2\pi}$ is the normalization constant.
Convolution layers extract features from the image. They start with learning simple
features like edges, etc., and with subsequent convolution layers learn more intricate and
abstract patterns from the image.
The size of the output produced as a result of the convolution operation is given by the formula (for each dimension)

\[ o_{\text{dimension}} = \left\lfloor \frac{i_{\text{dimension}} + 2p - k}{s} \right\rfloor + 1 \tag{3} \]
In Eq. (3), $i_{\text{dimension}}$ is the length of that dimension of the image (height or width), $p$ is the padding applied to the image, $k$ is the kernel size, and $s$ is the stride length, i.e. the distance between successive kernel positions. Padding is the process of adding zeroes along the height and width of the image. Without padding, the kernel lands on the corners much less frequently than on the pixels in the center, which skews the learning of the network. Furthermore, for the same reason, the feature map shrinks after each convolution operation, which would hinder the stacking of layers. For all of these reasons, padding is performed on the images.
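A quick numerical check of Eq. (3) in PyTorch (shapes chosen for illustration):

```python
import torch
import torch.nn as nn

# i = 24, k = 3, p = 1, s = 2  ->  o = floor((24 + 2*1 - 3) / 2) + 1 = 12
conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1, stride=2)
out = conv(torch.randn(1, 1, 24, 24))
print(out.shape)  # torch.Size([1, 8, 12, 12])
```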
Activation Function
Activation functions compute the weighted sum of the inputs along with the bias; based on this weighted sum, it is decided whether a node fires or not. Activation functions can be linear or non-linear. Non-linear activation functions are used to allow more complex learning by the network. Some activation functions used in the network are:

Rectified Linear Units (ReLU)
Rectified Linear Units activation function clamps the negative value at 0. Basically, for
values of x that are less than 0, the output becomes zero. For values of x greater than 0,
a linear function is produced as an output. The function used for the implementation of
ReLU is computationally cheaper than activation functions like tanh and sigmoid, which
involve expensive operations like exponentiation. The formula that lies behind rectified
linear units is:

\[ h^{(i)} = \max\!\left(w^{(i)T}x,\, 0\right) = \begin{cases} w^{(i)T}x, & \text{if } w^{(i)T}x > 0 \\ 0, & \text{otherwise} \end{cases} \tag{4} \]
In Eq. (4), $h^{(i)}$ gives the activation of a hidden unit, $w^{(i)}$ is the hidden weight matrix of a hidden layer, and $x$ is the input.
ReLU faces an issue where, for negative pre-activations, the output is zero, so optimization algorithms will not update that neuron. In addition, gradients flow back through a unit only if its output was positive during the forward pass. To combat these issues with ReLU, leaky ReLUs have been proposed.
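A small sketch contrasting the two activations (input values chosen for illustration):

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, -0.5, 0.0, 1.5])

relu = nn.ReLU()
leaky = nn.LeakyReLU(negative_slope=0.01)  # small slope keeps negative units trainable

print(relu(x))   # tensor([0.0000, 0.0000, 0.0000, 1.5000])
print(leaky(x))  # tensor([-0.0200, -0.0050, 0.0000, 1.5000])
```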
Pooling
Pooling layers perform down sampling and help in dimensionality reduction which aids
in achieving translational invariance. This layer also helps in avoiding overfitting by
making the network’s learning more general. Like the convolution operation, pooling too
has hyperparameters like filter size, stride, and padding. There are two types of pooling
operations, Max pooling and Global Average pooling. In this paper, Max pooling has
been used. In Max pooling a filter is applied on the feature map. The filter is then moved
all over the feature map with the value specified by the stride.
\[ a_j = \max_{N \times N}\left( a_i^{n \times n}\, u(n, n) \right) \tag{5} \]
Equation (5) specifies the max pooling operation; it finds the maximum value
encountered by the filter. Here, u(n, n) is the filter applied on the feature map.
The output dimensions are given by:

\[ o_{\text{dimension}} = \left\lfloor \frac{i_{\text{dimension}} - k}{s} \right\rfloor + 1 \tag{6} \]
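A minimal max-pooling sketch matching Eq. (6), with a 2 × 2 window and stride 2 (shapes illustrative):

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)
x = torch.randn(1, 8, 12, 12)
print(pool(x).shape)  # torch.Size([1, 8, 6, 6]): o = (12 - 2)/2 + 1 = 6
```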
Loss Function
For a network to learn, it is important to first evaluate how distant from the actual value
the predictions are. To do this in a quantitative manner, loss functions are used. Easily
differentiable functions are chosen as loss functions to ease the task of back propagation.
In this paper, Cross-Entropy loss has been used as the loss function.
\[ \text{Loss} = -\frac{1}{N} \sum_{i=1}^{N} \log \frac{e^{W_{y_i}^{T} x_i + b_{y_i}}}{\sum_{j=1}^{n} e^{W_j^{T} x_i + b_j}} \tag{7} \]
In Eq. (7), $W$ is the weight matrix, $b$ is the bias, $x_i$ is the $i$-th training sample, $y_i$ is the class of the $i$-th training sample, $N$ is the total number of samples, and $W_j$ and $W_{y_i}$ are the $j$-th and $y_i$-th columns of the weight matrix.
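In PyTorch, Eq. (7) corresponds to nn.CrossEntropyLoss applied to raw class scores (logits); a small sketch with illustrative values:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()  # combines log-softmax and NLL, as in Eq. (7)

logits = torch.randn(4, 10)            # N = 4 samples, 10 digit classes
targets = torch.tensor([0, 3, 9, 1])   # class index y_i for each sample
print(criterion(logits, targets))      # scalar: mean loss over the batch
```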
Gradient Descent
Gradient Descent is performed on the learnable parameters of the network. In this oper-
ation, the parameters P are varied by a small change in the parameters δP << P. The
small variation is chosen in such a manner that the loss of the network reduces. In this
paper, Stochastic Gradient Descent (SGD) has been used. In SGD, the parameters are
updated for each training example, because of which redundancy of computation is
reduced which increases the speed of learning.
\[ P = P - \eta \cdot \nabla_{\theta} J\!\left(\theta;\, x^{(i)};\, y^{(i)}\right) \tag{8} \]
In Eq. (8), $P$ are the parameters, $\eta$ is the learning rate, $J(\theta; x^{(i)}; y^{(i)})$ is the loss function, $x^{(i)}$ is the $i$-th training example, $y^{(i)}$ is the label of the $i$-th training example, and $\nabla_{\theta}$ is the gradient of the loss function.
SGD faces difficulty in finding the local minima of an error space characterized by
difference in “steepness” across different dimensions. In such scenarios, SGD makes
slower progress towards the minima and tends to oscillate. Momentum diminishes this
oscillation and increases the speed of SGD in the required direction:
\[ v_t = \gamma v_{t-1} + \eta \nabla_{\theta} J(\theta), \qquad P = P - v_t \tag{9} \]
In Eq. (9), γ is the momentum term and in this paper, it has been set to 0.9. Learning
rate has been set to 0.001 in this study. The learning rate was made to decay after every
7 epochs by a factor of 0.1. Decaying the learning rate leads to faster convergence to the
local minima and higher accuracy.
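These choices map directly onto PyTorch's optimizer and scheduler APIs; a sketch with the hyperparameters quoted above (the model is a placeholder for the actual network):

```python
import torch

model = torch.nn.Linear(10, 10)  # placeholder for the actual network

# SGD with momentum 0.9 and learning rate 0.001, as described above.
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Decay the learning rate by a factor of 0.1 every 7 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

# In the training loop, call scheduler.step() once per epoch.
```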
Regularization
A major problem faced while training CNNs is overfitting. Overfitting leads to good performance on the training set but extremely poor performance on the validation set: the network learns the training data too well and loses its capability to generalize. To combat this problem, regularization techniques like L2, L1, Dropout, etc. are used. In this study, Dropout has been used for regularization [17].
In Dropout [18], co-adaptations are reduced by randomly dropping some connections in the network, so that there is no guarantee of the availability of a particular hidden neuron.
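A sketch of dropout placed between fully connected layers (the drop probability is an assumption; the paper does not report it):

```python
import torch.nn as nn

head = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # assumed probability; randomly zeroes activations
    nn.Linear(256, 10),  # 10 Bangla digit classes
)
```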
Pre-trained Networks

ImageNet
This network was the top performer of the ILSVRC 2010 [19]. It contains eight layers in total, five of which are convolutional and three fully connected; finally, the Softmax function is used to output the class scores. The activation function used is the Rectified Linear Unit (ReLU). To prevent overfitting, data augmentation techniques and Dropout [18] are used. The number of parameters is about 60 million. Owing to its smaller size and smaller number of parameters, it is easier to train than VGGNet, but this light weight comes at the cost of accuracy.
ResNet
ResNet seeks to solve the problem of loss in accuracy as the network becomes deeper. This problem of vanishing gradients and degradation of accuracy is dealt with through skip, or shortcut, connections in the ResNet model. A diagrammatic representation of the residual block is shown in Fig. 2. Instead of approximating a function directly, the layers try to approximate a residual function: formally, if F(x) is the function that the layers are trying to approximate and x is the input, the residual function is denoted by R(x) = F(x) − x, and the original function to approximate now becomes R(x) + x (Fig. 2).
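A minimal sketch of this idea, assuming matching input and output shapes so the skip connection can be a plain addition:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """y = R(x) + x: the convolutional layers learn only the residual R(x)."""

    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(self.body(x) + x)  # skip connection adds the input back
```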
Fig. 2. Working block diagram of ResNet [18].
4 Results and Discussion

In this paper, the 18-layer variant of the residual network, ResNet-18, has been used. It contains eighteen layers, seventeen of which are convolutional, followed by one fully connected layer which produces the final output. A Batch Normalization layer is present after each convolutional layer. Batch Normalization is used for the normalization of the inputs inside the network: every mini-batch is normalized to a unit standard deviation and a mean of zero. The images were resized to 24 × 24 pixels. Data augmentation techniques like random cropping and random changes in brightness and saturation, along with several affine transformations, were applied as discussed in the earlier sections.

Fig. 3. Graphical representation of training and validation accuracy and training and validation loss.
Two approaches were used for the recognition task. In one approach, the pre-trained ResNet-
18 model was fine-tuned on our dataset. In this approach all weights were updatable; it was only the architecture of ResNet-18 that was reused. The final fully connected layer was transformed to better match our dataset. A decaying learning rate was used to further improve performance, and regularization techniques like Dropout were also used. This approach yielded an accuracy of 96%. The accuracy for each character is shown in Fig. 3. The model performed exceptionally well with the digits 0, 2, 3, 4, 5, 6, and 8, producing an accuracy of 100%. It did not perform as well with the digits 9 and 1, perhaps due to ambiguity in their structure.
In the second approach, the ResNet-18 model was used as a feature extractor, and hence the weights of the underlying network were not allowed to change; only the final fully connected layer was fine-tuned on the dataset. The same data augmentation techniques were applied as in the previous approach, and a decaying learning rate was also used. This approach yielded an accuracy of 60%, which was considerably worse than the fine-tuned approach.
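A sketch of the two transfer-learning modes with torchvision's ResNet-18 (the pretrained flag reflects the torchvision API of the time; newer releases use a weights argument instead):

```python
import torch.nn as nn
from torchvision import models

def make_model(finetune_all: bool = True) -> nn.Module:
    model = models.resnet18(pretrained=True)  # start from ImageNet weights
    if not finetune_all:
        # Second approach: freeze the backbone, train only the new head.
        for param in model.parameters():
            param.requires_grad = False
    # Both approaches: replace the final fully connected layer
    # with a 10-way classifier for the Bangla digits.
    model.fc = nn.Linear(model.fc.in_features, 10)
    return model
```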
5 Conclusion

In this paper a deep learning technique has been presented to achieve high accuracy in Bengali numeral recognition. The dataset used was chosen for its unbiasedness, which allows the proposed model to perform better on digits it has never encountered before. The proposed model has shown state-of-the-art performance, with an accuracy of 96% in just 25 epochs. Thus, the proposed model not only provides strong accuracy, but does so with fewer parameters than other leading approaches while taking fewer epochs to reach the final result.
References
1. Zahangir Alom, M., Sidike, P., Hasan, M., Taha, T.M.: Handwritten Bangla Character Recog-
nition Using The State-of-Art Deep Convolutional Neural Networks (2017). http://arxiv.org/
abs/1712.09872
2. Hijazi, S., Kumar, R., Rowen, C.: Using Convolutional Neural Networks for Image Recognition, pp. 1–12. Cadence Design Systems Inc., San Jose, CA, USA (2015)
3. Pradeep, J., Srinivasan, E., Himavathi, S.: Diagonal based feature extraction for handwritten
character recognition system using neural network. In: 2011 3rd International Conference on
Electronics Computer Technology, vol. 4. IEEE (2011)
4. Patil, V., Shimpi, S.: Handwritten English character recognition using neural network. Elixir
Comput. Sci. Eng. 41, 5587–5591 (2011)
5. Liu, H., Ding, X.: Handwritten character recognition using gradient feature and quadratic clas-
sifier with multiple discrimination schemes. In: Eighth International Conference on Document
Analysis and Recognition (ICDAR’05). IEEE (2005)
6. Durjoy Sen, M., Bhattacharya, U., Parui, S.K.: CNN based common approach to handwritten
character recognition of multiple scripts. In: 2015 13th International Conference on Document
Analysis and Recognition (ICDAR). IEEE (2015)
7. Nair, P.P., James, A., Saravanan, C.: Malayalam handwritten character recognition using convolutional neural network. In: 2017 International Conference on Inventive Communication and Computational Technologies (ICICCT). IEEE (2017)

8. LeCun, Y., et al.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11),
2278–2324 (1998)
9. Mahbubar Rahman, M., Akhand, M.A.H., Islam, S., Shill, P.C., Hafizur Rahman, M.M.:
Bangla handwritten character recognition using convolutional neural network. Int. J. Image
Graphics Signal Process. 7(8), 42–49 (2015). https://doi.org/10.5815/ijigsp.2015.08.05
10. Bhattacharya, U., et al.: Offline recognition of handwritten Bangla characters: an efficient
two-stage approach. Pattern Anal. Appl. 15(4), 445–458 (2012)
11. Acharya, S., Ashok Kumar, P., Prashnna Kumar, G.: Deep learning based large scale hand-
written Devanagari character recognition. In: 2015 9th International Conference on Software,
Knowledge, Information Management and Applications (SKIMA). IEEE (2015)
12. Chowdhury, S.P., Majumdar, R., Kumar, S., Singh, P.K., Sarkar, R.: Genetic algorithm based
global and local feature selection approach for handwritten numeral recognition. In: Oliva,
D., Houssein, E.H., Hinojosa, S. (eds.) Metaheuristics in Machine Learning: Theory and
Applications. SCI, vol. 967, pp. 745–769. Springer, Cham (2021). https://doi.org/10.1007/
978-3-030-70542-8_30
13. Choudhury, A., Rana, H.S., Bhowmik, T.: Handwritten Bengali numeral recognition using
hog based feature extraction algorithm. In: 2018 5th International Conference on Signal
Processing and Integrated Networks (SPIN), pp. 687–690. IEEE (2018)
14. Choudhury, A., Negi, A., Das, S.: Recognition of handwritten Bangla numerals using adaptive
coefficient matching technique. Procedia Comput. Sci. 89, 764–770 (2016)
15. Choudhury, A., Mukherjee, J.: An approach towards recognition of size and shape independent
Bangla handwritten numerals. Int. J. Sci. Res. 2, 223–226 (2013)
16. Pal, K.K., Sudeep, K.S.: Preprocessing for image classification by convolutional neural net-
works. In: 2016 IEEE International Conference on Recent Trends in Electronics, Informa-
tion & Communication Technology (RTEICT) (2016). https://doi.org/10.1109/rteict.2016.
7808140
17. Clevert, D.-A., Unterthiner, T.: Fast and Accurate Deep Network Learning by Exponential
Linear Units (ELUs) (2015). http://arxiv.org/abs/1511.07289
18. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple
way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014)
19. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10),
1345–1359 (2010)
IoT Based COVID Patient Health Monitoring
System in Quarantine

Rajat Kumar(B) , Shivam Dixit, Japjeet Kaur, Kriti, and Krishna Murari Singh

Department of Electronics and Instrumentation Engineering, Galgotias College of Engineering


and Technology, Greater Noida, India
rajatsinghgcet29@gmail.com, {japjeet.kaur,
kriti}@galgotiacollege.edu

Abstract. During the pandemic period, various Covid-19 isolation centers have been established to treat affected patients. Because the disease is highly contagious, it is very important to isolate patients, but at the same time doctors must also monitor their health. Different strains of this novel virus produce widely fluctuating symptoms, making it difficult for health-line workers to remain unaffected. In some cases, it is challenging to track the health of many remote or home-quarantined patients, and the severity and number of patients requiring medical supervision keep increasing. To solve this problem, a remote healthcare system with monitoring of medical parameters is proposed, allowing multiple coronavirus patients to be monitored quickly via the Internet of Things. The microcontroller-based system receives heart rate, SpO2 and body temperature readings through sensors and transmits this data to the ThingSpeak platform for remote viewing. In case of abnormality, or if the patient presses the emergency button, an alarm is sent. The system enables doctors to remotely monitor patients without the risk of infection; one doctor can care for more than 500 patients at the same time. If there are extreme fluctuations in health, the doctor is alerted immediately. In this challenging time, this system helps in taking readings of different body parameters of patients, collecting and transferring the data to a server where it is used by the monitoring system, while protecting healthcare workers from infection.

Keywords: IoT · Health monitoring · Medical devices · Sensors · Covid · SpO2

1 Introduction
In today's conditions, there is a sudden need for measures that can withstand the current situation and save lives. The use of technology is the best available option to address the problems of the healthcare sector, where there is an urgent need for modern communication systems that deliver information systematically and uniformly to the user. Understanding these needs, the best option is to use the IoT (Internet of Things), which is well suited to helping doctors and other healthcare workers. For many years, traditional tests in professional medical institutions have been the standard method for measuring blood sugar, blood pressure, and heart


rate. With the advent of modern technology, various sensors that detect vital signs have appeared, such as blood pressure monitors, blood glucose meters, and heart rate monitors, including electrocardiographs, which patients can use for daily recording. Doctors receive the readings every day and recommend medication and exercise so that patients can improve their quality of life and overcome these diseases. Such use for patient care is becoming more and more common in healthcare, improving people's quality of life.
The Internet of Things is defined as the integration of all devices connected to the network, which are controlled via the Internet and provide information in real time. The Internet of Things can also be viewed through its paradigms, namely Internet-oriented middleware, information objects, and information semantics. Arduino is a tool that can be programmed to sense and interact with its environment. It is a good open-source microcontroller platform that allows electronics enthusiasts to build projects quickly, easily, and cheaply, with minimal construction effort and minimal supervision. It offers a new way to integrate the Internet of Things into patient systems: the microcontroller board collects data from sensors and transmits it wirelessly to the Internet of Things website. Transmitting medical information and making correct decisions based on knowledge of the patient can be a complex task on the Internet of Things. In this project, a patient health monitoring system is built mainly on the Internet of Things. The system senses the specified parameters (heart rate, body temperature and blood-oxygen level) and displays them on a screen. This continuously monitored data can be transmitted to a health professional through IoT for timely diagnosis. The patient health monitoring system with Arduino also provides proactive patient notification.

2 Literature Review
An IoT-based monitoring system was proposed in which temperature was the basic quantity used for measurement and treatment [1]. Some existing systems were also implemented to help COVID-19 patients in self-isolation wards, through which a patient can measure his or her health parameters independently, reducing interaction with medical staff [2]. An automatic system with surveillance of the patient's ward was implemented, which can be monitored online and remotely through a dedicated application, helping medical staff treat patients as needed [3, 4]. For detection and observation of major symptoms, an IoT system was implemented that works on parameters such as temperature and heartbeat rate and communicates all the data to the desired platform remotely over the Internet [5]. These existing systems came with the revolution of technology in the medical field and automated much of the biomedical workflow. These instruments help not only medical staff but also patients: patients do not need to worry about reaching an unavailable doctor, and one staff member can monitor several patients simultaneously, reducing the effort of attending to different patients in different wards [6]. A remote healthcare and monitoring system was implemented for smart regions that is cost-effective and also efficient in terms of energy consumption [7–15]. Recently, most of the research interest has shifted
towards the advances in remote health support and automation of the daily parameter
monitoring for the patients to ensure the isolation of both the medical professionals as
well as the family members.

3 Proposed System
The objective is to develop and implement the patient monitoring system shown in Fig. 1. All sensors communicate with the microcontroller: the pulse oximeter sensor records the heart rate and SpO2, and the temperature sensor senses the patient's body temperature at home. The various sensors detect the parameters of the patient's health and send them to the controller, which processes them and displays them on the LCD screen. In addition, it uploads the data via IoT; this is where the Wi-Fi module is used to access the Internet. If a patient's health parameters exceed the set limits, an emergency text message can be sent to the doctor over Wi-Fi. There is also a buzzer for sounding an alert. The entire system requires a 5 V power supply.

Fig. 1. Proposed IoT monitoring system

This system measures key parameters such as body temperature, heart rate and oxygen level. The device measures them through dedicated sensors, and the information is sent over Wi-Fi. All the information is shown on the ThingSpeak dashboard in graphical form.
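As a rough illustration of the upload step (a sketch, not the authors' firmware; the field numbers and API key are placeholders that must match the channel configuration), the ThingSpeak HTTP update endpoint can be called as follows:

```python
import requests

THINGSPEAK_URL = "https://api.thingspeak.com/update"
WRITE_API_KEY = "YOUR_WRITE_API_KEY"  # hypothetical placeholder

def upload_reading(temperature_f, heart_rate_bpm, spo2_percent):
    """Push one set of patient readings to a ThingSpeak channel."""
    payload = {
        "api_key": WRITE_API_KEY,
        "field1": temperature_f,   # body temperature (F), assumed field1
        "field2": heart_rate_bpm,  # heart rate (BPM), assumed field2
        "field3": spo2_percent,    # blood oxygen (%), assumed field3
    }
    r = requests.post(THINGSPEAK_URL, data=payload, timeout=10)
    r.raise_for_status()
    return r.text  # ThingSpeak returns the new entry id ("0" on failure)
```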

4 Methodology

This project is proposed as a basis for monitoring the health of patients connected through the Internet of Things, serving as a patient body-monitoring system (for parameters such as heart rate and body temperature). The heartbeat sensor is attached to the patient's finger, and the temperature sensing element is attached to the patient's body. The temperature sensing element is a resistive element whose resistance varies with the patient's vital signs, while the pulse rate sensor (a vibration or flow-based sensor) transmits its readings within the corresponding signal range.

Fig. 2. Flow chart of system

The system has a dual function: monitoring health and managing basic household appliances. This enables users to enjoy social life while maintaining control and monitoring of their health, especially during pandemics. The proposed approach, as depicted in the flowchart (Fig. 2), may have a large impact on the handling of virus spread by lowering the transmission rate of infectious diseases. Once a disease such as COVID-19 is recognized and treated remotely, there is no need for frequent movement, so patient safety can be ensured and the transmission rate reduced. The current stage of the system is the physical deployment of the IoT devices. The test phase of the mobile application uses real scenarios and documented feedback for improvement. The proposed system has been rigorously tested and can be used in multiple departments. Once fully developed, the web and mobile applications can be used as a portal to connect to the existing network domain of a clinic, or launched as a new application for a clinic without an existing domain.

Fig. 3. Architecture of health monitoring system

To use this platform, users need a Wi-Fi connection. The proposed design of the IoT-based health monitoring system is based on the Arduino microcontroller, which is the brain of the project. The Arduino collects real-time health data from the sensors that measure the patient's health parameters. The Arduino board is connected to the Wi-Fi network through the Wi-Fi module, as shown in Fig. 3. The Arduino board processes the raw sensor signals into meaningful values and then sends them to the IoT cloud through the Wi-Fi module. The measured values are shown on the 16 × 2 LCD, and the data is then sent to the IoT platform for further monitoring, where every measured parameter is displayed. When a received value is greater or less than the set range, a notification is sent to ThingSpeak, which is an IoT analytics platform. A physiological data collection device must be included in the current system. In the future, we plan to extend the application from the IoT platform to other systems to ensure broad adaptability. With the technology presented here, it is believed that this research can be extended to other areas of the Internet of Things; in addition, the new system can be extended to the pharmaceutical industry, where the doctor can send a prescription to the pharmacist, who provides dosage guidelines and dispenses the medicine to the patient. The overall performance of the proposed system was evaluated using dedicated mathematical and statistical analysis tools.
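The out-of-range check that triggers the notification can be sketched as follows (an illustration only; the safe ranges shown are assumed values, not thresholds from the paper):

```python
# Assumed safe ranges for each parameter (illustrative values only)
SAFE_RANGES = {
    "temperature_f": (97.0, 100.4),
    "heart_rate_bpm": (60, 100),
    "spo2_percent": (94, 100),
}

def out_of_range(readings):
    """Return the parameters whose values fall outside their safe range."""
    alerts = []
    for name, value in readings.items():
        low, high = SAFE_RANGES[name]
        if not (low <= value <= high):
            alerts.append(name)
    return alerts
```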

5 Results
In this proposed model, three sensors are used. The DS18B20 waterproof temperature sensor is used to measure the patient's body temperature; it achieved the highest accuracy when placed on the armpit or tongue. The heart rate sensor measures the heart rate from the intensity of LED light reaching the light sensor; maximum accuracy is obtained when the sensor is placed on the fingertip or ear. All sensors are connected to analog pins on the microcontroller board. These sensors produce a voltage that varies with the input parameters, and these voltages are converted to output values: the output of the temperature sensor is converted to degrees Celsius, and the output of the heart rate sensor is converted to heart rate in BPM (beats per minute). The plots were captured from the serial plotter of the Arduino IDE, where each pulse appears as a peak. Each analog input provides a digital value from 0 to 1023, because the ATmega328 controller has an onboard 10-bit ADC. A threshold of 520 is set to detect pulses: to calculate the heart rate in BPM, the rate at which the output crosses above this threshold is counted. These results are displayed on the LCD screen, as depicted in Figs. 4, 5 and 6. All parameters change according to changes in the body.
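The threshold-crossing pulse count described above can be sketched as follows (an illustrative reconstruction, assuming the 10-bit samples have already been read from the ADC at a known rate):

```python
def estimate_bpm(samples, sample_rate_hz, threshold=520):
    """Estimate heart rate from 10-bit ADC samples (values 0-1023).

    Counts rising edges through the threshold over the capture window;
    the threshold of 520 follows the paper, the rest is an assumption.
    """
    beats = 0
    above = False
    for s in samples:
        if s > threshold and not above:  # rising edge = one pulse peak
            beats += 1
            above = True
        elif s <= threshold:
            above = False
    duration_min = len(samples) / sample_rate_hz / 60.0
    return beats / duration_min if duration_min > 0 else 0.0
```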

Fig. 4. Body temperature measurement

In Fig. 4, the real-time body temperature of the patient is monitored and varies with time; body temperature is shown on the y-axis in Fahrenheit (F) and time on the x-axis of the graph. Figure 5 depicts the heart rate pulses plotted in real time, with pulses obtained over a particular time interval.
The graph in Fig. 6 displays the oxygen level present in the blood of the patient at a particular instant of time. These continuous readings can be transmitted to medical professionals for health support.

Fig. 5. Heart rate measurement

Fig. 6. Oxygen level monitoring

6 Conclusion

The proposed IoT-based patient health monitor in quarantine helps patients measure their health parameters on their own, and the medical staff and doctors can act according to the data collected from the patient. It gives relief to the overcrowded health infrastructure and prevents medical practitioners from unnecessary exposure. With this system, critical patients can be identified in less time and treated as soon as possible. The system provides monitoring of patients with minimal staff, and patients can also reassure themselves by checking their own parameters. It helps to reduce overloading in the hospital, which helps critical patients get treatment on time. In the future, an integrated database network can be formed in which every hospital is connected to a database of patient health parameters, treatments, and other valuable information that can be recorded by the doctor, who can then treat the patient accordingly. With the help of IoT, many other sensors can be integrated into the system, so that the patient only goes to the hospital when advised by the doctor and otherwise receives treatment in home isolation.

References
1. Krishnan, D.S.R., Gupta, S.C., Choudhury, T.: An IoT based patient health monitoring system.
In: International Conference on Advances in Computing and Communication Engineering
(ICACCE), pp. 1–7. IEEE (2018)
2. Priambodo, R., Kadarina, T.M.: Monitoring self-isolation patient of COVID-19 with Internet
of Things. In: IEEE International Conference on Communication, Networks and Satellite
(COMNETSAT), pp. 87–91 (2020)
3. Seyed Shahim, V., et al.: COVID-SAFE: an IoT-based system for automated health monitoring
and surveillance in post-pandemic life. IEEE Access 8, 188538–188551 (2020)
4. Ghosh, A.M., Halder, D., Hossain, S.A.: Remote health monitoring system through IoT. In:
5th International Conference on Informatics, Electronics and Vision (ICIEV), pp. 921–926.
IEEE (2016)
5. Swaroop, K.N., Chandu, K., Gorrepotu, R., Deb, S.: A health monitoring system for vital
signs using IoT. Internet of Things 5, 116–129 (2019)
6. Valsalan, P., Baomar, T.A.B., Baabood, A.H.O.: IoT based health monitoring system. J. Crit.
Rev. 7(4), 739–743 (2020)
7. Taiwo, O., Ezugwu, A.E.: Smart healthcare support for remote patient monitoring during
covid-19 quarantine. Inform. Med. Unlock. 20, 100428 (2020). https://doi.org/10.1016/j.imu.
2020.100428
8. Mohammad, N., Pouriyeh, S., Parizi, R.M., Dorodchi, M., Valero, M., Arabnia, H.R.: Internet
of Things for current COVID-19 and future pandemics: an exploratory study. J. Healthc.
Inform. Res. 4, 1–40 (2020)
9. Javaid, M., Khan, I.H.: Internet of Things (IoT) enabled healthcare helps to take the challenges
of COVID-19 pandemic. J. Oral Biol. Craniofac. Res. 11(2), 209–214 (2021). https://doi.org/
10.1016/j.jobcr.2021.01.015
10. Klarskov, C.K., Lindegaard, B., Pedersen-Bjergaard, U., Kristensen, P.L.: Remote continuous
glucose monitoring during the COVID-19 pandemic in quarantined hospitalized patients in
Denmark: a structured summary of a study protocol for a randomized controlled trial. Trials
21(1), 968 (2020). https://doi.org/10.1186/s13063-020-04872-4
11. Khoi, N.M., Saguna, S., Mitra, K., Åhlund, C.: IReHMo: an efficient IoT-based remote
health monitoring system for smart regions. In: 17th International Conference on E-health
Networking, Application & Services (HealthCom), pp. 563–568. IEEE (2015)
12. Elagan, S.K., Abdelwahab, S.F., Zanaty, E.A., Alkinani, M.H., Alotaibi, H., Zanaty, M.E.A.:
Remote diagnostic and detection of coronavirus disease (COVID-19) system based on intel-
ligent healthcare and internet of things. Results Phys. 22, 103910 (2021). https://doi.org/10.
1016/j.rinp.2021.103910
13. Singh, V., Chandna, H., Kumar, A., Kumar, S., Upadhyay, N., Utkarsh, K.: IoT-Q-band: a
low cost internet of things based wearable band to detect and track absconding COVID-19
quarantine subjects. EAI Endorsed Trans. Internet of Things 6(21), 163997 (2020). https://
doi.org/10.4108/eai.13-7-2018.163997

14. Doaa Mohey, E., Hassanein, A.E., Hassanien, E.E., Hussein, W.M.E.: E-quarantine: a smart
health system for monitoring coronavirus patients for remotely quarantine. arXiv preprint
arXiv:2005.04187 (2020)
15. Taiwo, O., Ezugwu, A.E.: Smart healthcare support for remote patient monitoring during
covid-19 quarantine. Inform. Med. Unlock. 20, 100428 (2020). https://doi.org/10.1016/j.imu.
2020.100428
Self-attention Convolution for Sparse to Dense
Depth Completion

Tao Zhao , Shuguo Pan(B) , and Hui Zhang

School of Instrument Science and Engineering, Southeast University, Nanjing, China


{zhaotao,psg,amzhanghui}@seu.edu.cn

Abstract. Depth completion from a sparse set of depth measurements and a single
RGB image has been shown to be an effective method for generating high-quality
depth images. However, traditional convolutional neural network methods tend
to interpolate and replicate the output from the surrounding depth values. The
underutilization of sparse information leads to blurred boundaries and loss of
structural information. To further improve the accuracy of depth completion, we
extend the original U-shaped network by self-attention convolution to extract more
useful information from the sparse depth measurements. The experimental results
validate the effectiveness of self-attention convolution using the U-net architecture
on the NYUv2 depth dataset. The accuracy of the proposed method has been
improved by 16.9% compared to the original Unet network.

Keywords: Depth completion · Self-attention convolution · Sparse samples

1 Introduction
Depth completion of a scene plays an important role in autonomous navigation and
smart body localization. For example, applications such as advanced driver assistance
systems, autonomous driving, and intelligent robotics rely on accurate depth percep-
tion [1]. Perceiving the three-dimensional structure of a scene and solving geometric
relationships helps to achieve intelligent understanding of the real-world environment.
In many intelligent scene perception applications, in addition to acquiring depth infor-
mation, visual tasks such as target detection, video tracking, semantic segmentation,
and feature matching must be processed in parallel. It is with the help of scene depth
information that these computer vision tasks are optimized [2]. For example, segmenta-
tion based on depth information in video segmentation tasks can significantly improve
algorithm performance [3], and RGB-D-based pose estimation algorithms in visual
odometry systems offer higher accuracy and robustness [4]. Therefore, depth estimation
is an important topic in the field of computer vision.
Current depth estimation methods are mainly divided into traditional stereo matching
methods, hardware device-based acquisition and depth learning-based methods. Among
them, the traditional stereo matching-based methods [5] assume that the colors of the
matched points are similar and the depth information changes continuously in different
views, and optimally solve the depth information problem of the scene by constructing


an energy function. These methods achieve good results in textured areas with negligible
changes in brightness. However, in practice, there are a large number of occluded, poorly
textured, and light-varying areas in different views, which violate the basic assumptions
of stereo matching methods and make the depth solution ambiguous [6]. At the same
time, the stereo matching-based method requires a large number of computations to
estimate the depth of each pixel, which makes it difficult to obtain dense depth maps in
real time [7]. With the rapid development of the semiconductor industry, various types
of depth information acquisition devices (e.g., structured light sensors, radars, etc.) have
facilitated the measurement of depth information. However, each of these hardware
devices also has corresponding disadvantages. For example, high-end radar devices are
very expensive and can only obtain very sparse point clouds. Commonly used structured
light devices (Kinect, etc.) are susceptible to sunlight interference and can only detect
a very limited spatial range, often only for indoor scenes.
In recent years, the successful development of deep learning has led to increasing interest in depth prediction methods based on convolutional neural networks (CNNs). Some of these methods use monocular images as input and exploit the high-level semantic reasoning capabilities of deep neural networks to predict scene depth. However, due to the inherent ill-posedness of monocular depth estimation, accurate depth results are not yet available.
Based on the above problems, the use of reliable sparse depth input combined with
deep learning methods (see Fig. 1) for accurate scene depth recovery has become a new
research topic [8, 9]. The goal is to build a more efficient and accurate depth completion
model for this application, solving problems based on sparse samples.

Fig. 1. Sparse to dense depth completion. The input information includes a single RGB image
and a sparse depth map, and the output is a high-quality, dense depth prediction.

2 Related Work
2.1 Sparse to Dense Depth Completion

Depth reconstruction from sparse measurements is an important problem in the field of depth estimation, offering new possibilities for lowering camera cost and energy consumption. One of the challenges of this problem is maintaining good efficiency while reducing costs and saving energy. Hawe et al. [10] proposed a sub-gradient method to solve the resulting optimization problem and effectively recover the disparity map from very few measured values. Liu et al. [11] provided a multi-scale, warm-started alternating direction method of multipliers to predict depth values and achieve more accurate reconstruction. Ma et al. [12] systematically studied depth estimation based on sparse samples, using the second derivative of the depth image as a sparsity basis, with good performance in reconstruction accuracy and speed. Recently, plug-in modules based on deep learning have been used in visual (or visual-inertial) odometry algorithms to create more accurate and dense local map points. Ma et al. [13] proposed a self-supervised method based on a visual odometry algorithm and achieved good performance on the KITTI dataset. In addition, some studies also combine semantic segmentation to enhance depth prediction [14]. In terms of sensor fusion of visual images and sparse depth sensors, these studies have significantly improved the accuracy of monocular depth estimation and provided valuable methodological insights for subsequent researchers. The main difference in our work is that we focus on precise feature extraction and propagation in high-precision models for sparse-sample depth estimation.

2.2 Self-attention Convolution

Feature-based gating has been extensively studied in areas such as vision [15], language [16], and speech [17]. For example, highway networks use self-attention gating to simplify the gradient-based learning of very deep networks; squeeze-and-excitation networks [15] explicitly multiply each channel by a learned sigmoid gate value to recalibrate the feature response; and WaveNet [17] models speech signals using gates of the form y = tanh(w1 x) · sigmoid(w2 x), with good results.

3 Method
3.1 Self-attention Convolution

Self-attention convolutions are applied to improve the accuracy of sparse-to-dense depth completion. First, we explain why the vanilla convolutions used in other tasks [18] are ill-fitted for depth estimation, especially for sparse-to-dense depth completion. Vanilla convolution is the most common form in deep learning, in which a series of filters is applied to the input feature map to produce the outputs. Assuming that

the input image and output image are respectively $I$ and $O$, then each pixel at $(y, x)$ in the output is calculated as

$$O_{y,x} = \sum_{i=-k_h'}^{k_h'} \sum_{j=-k_w'}^{k_w'} W_{k_h'+i,\,k_w'+j} \cdot I_{y+i,\,x+j} \tag{1}$$

where $y, x$ denote the y-axis and x-axis of the output map, $k_h$ and $k_w$ are the kernel sizes, and $k_h' = \frac{k_h - 1}{2}$, $k_w' = \frac{k_w - 1}{2}$. For simplicity, $W$ represents the convolutional filters, and the bias is
ignored. The formula reveals that vanilla convolutions use the same filters to produce outputs for all locations $(y, x)$. This is reasonable for object recognition and detection, which extract local image features and treat all input pixels in a sliding window as valid. However, in sparse-to-dense depth completion, the input consists of known and unknown depth pixels, or combined pixels (an RGB image and sparse depth samples). This can lead to ambiguity in the training phase and limit performance when completing full depth maps from sparse samples in the testing phase. To solve this problem, we propose the use of feature gating in the convolutional layer. This allows the convolutional stage to focus on accurate feature values and transfer useful information. A network combined with self-attention convolution is more robust in generating accurate depth values, as it is able to distinguish and process all input pixels rather than treating them all as valid. We propose a self-attention convolution for depth completion from sparse samples and RGB images. The self-attention convolution is applied to the feature maps in the network layers and is formulated as follows:

$$\text{Gating}_{y,x} = W_g \cdot \text{Input} \tag{2}$$

$$\text{Feature}_{y,x} = W_i \cdot \text{Input} \tag{3}$$

$$\text{Output}_{y,x} = \phi\left(\text{Feature}_{y,x}\right) \odot \sigma\left(\text{Gating}_{y,x}\right) \tag{4}$$

where $\odot$ represents per-pixel multiplication, and $W_g$ and $W_i$ are different filters; $\phi$ is the ReLU activation function and $\sigma$ is the sigmoid function. By using the operation $\phi(\text{Feature}_{y,x}) \odot \sigma(\text{Gating}_{y,x})$ as the realization of self-attention convolution, our model can emphasize the importance of each spatial location and learn effective dynamic feature selection. According to the above formula, since $\text{Gating}_{y,x}$ can learn to recognize useful and important regions, the model retains useful feature regions in the output.
We used the well-known U-net architecture [19] and modified it with self-attention convolutions as the proposed network for depth completion. Specific details of the U-net can be found in [19]. Our proposed self-attention U-net is shown in Fig. 2.

3.2 Sampling Strategy


In the previous literature, two main sampling modes were mentioned: Chen [29] intro-
duced the regular grid mode, and Ma [8] focused on the robust random sampling mode.
Self-attention Convolution for Sparse to Dense Depth Completion 109


Fig. 2. Illustration of self-attention U-net architecture

However, the difference in distribution due to the different sampling modes would have
a significant impact on the accuracy of depth completion for the same network. The reg-
ular grid mode may not be easily applicable to many other systems that cannot acquire
regular grid measurements (e.g., direct methods and visual SLAM systems based on fea-
ture points). In order to improve the robustness of sparse sensors, we focus on random
patterns of depth sampling.
On the NYUv2 dataset, the ground-truth depth map $GT$ is sampled to obtain the sparse depth map $\text{Sparse}$; coordinates without depth values are set to 0 in $\text{Sparse}$. In generating sparse depth maps, a Bernoulli probability $p = m/n$ is used, where $m$ is the target number of sampled depth pixels and $n$ is the total number of valid depth pixels in $GT$. Then, for any pixel $(i, j)$,

$$\text{Sparse}(i,j) = \begin{cases} GT(i,j), & \text{with probability } p \\ 0, & \text{otherwise} \end{cases} \tag{5}$$
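A small NumPy sketch of this sampling rule (an illustration under the stated Bernoulli model; the function and argument names are chosen here):

```python
import numpy as np

def sample_sparse_depth(gt, m, rng=None):
    """Keep roughly m valid depth pixels from ground-truth map gt (Eq. 5).

    Each valid (non-zero) pixel survives independently with
    Bernoulli probability p = m / n, n being the valid-pixel count.
    """
    rng = rng or np.random.default_rng()
    valid = gt > 0
    n = valid.sum()
    p = m / n
    keep = rng.random(gt.shape) < p
    return np.where(valid & keep, gt, 0.0)
```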

4 Experiments
4.1 Dataset and Evaluation Metrics
NYUv2 is an RGB-D indoor dataset collected with a Microsoft Kinect. It consists of high-quality 480 × 640 RGB and depth maps, which were downscaled and cropped to a size of 224 × 224 for the experiments. Following the official data split, 249 scenes (26,331 images) were used for training and 215 scenes (654 images) were used for testing. For a fair comparison, the same image preprocessing methods and evaluation metrics as described in [8] were used. The three metrics are as follows.

• Root Mean Square Error: $\text{RMSE} = \sqrt{\frac{1}{N}\sum \|\hat{y} - y\|^2}$
• Mean Absolute Relative Error: $\text{REL} = \frac{1}{N}\sum \frac{|\hat{y} - y|}{y}$
• Delta Thresholds: $\delta_i$ = percentage of predicted pixels with $\max\!\left(\frac{\hat{y}}{y}, \frac{y}{\hat{y}}\right) < 1.25^i$

where $\hat{y}$ is the predicted depth, $y$ is the ground-truth depth, and $\delta_i$ is the percentage of pixels whose relative error is under a threshold controlled by the constant $i$.
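These metrics can be computed in a few lines of NumPy (a sketch; the valid-pixel mask convention is an assumption):

```python
import numpy as np

def depth_metrics(pred, gt):
    """Return RMSE, REL, and delta_1..3 over valid ground-truth pixels."""
    mask = gt > 0
    p, g = pred[mask], gt[mask]
    rmse = np.sqrt(np.mean((p - g) ** 2))
    rel = np.mean(np.abs(p - g) / g)
    ratio = np.maximum(p / g, g / p)
    deltas = [100.0 * np.mean(ratio < 1.25 ** i) for i in (1, 2, 3)]
    return rmse, rel, deltas
```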

4.2 Impact of Self-attention Convolution

The performance of self-attention convolution is evaluated with both L1 and L2 losses. The testing results are listed in Table 1: rows 1–4 give the results of RGB-based depth estimation, and rows 5–8 give the results of sparse-to-dense depth completion (with an average of 200 sparse depth samples).

Table 1. Performance of self-attention convolution with the L1 and L2 loss functions.

Problem              Loss  S.A.-Conv.  RMSE   REL    δ1    δ2    δ3
RGB-based            L1    ×           0.958  0.291  50.3  80.9  93.7
                     L1    √           0.678  0.211  66.5  90.4  96.4
                     L2    ×           1.422  0.396  34.6  65.9  82.3
                     L2    √           0.688  0.226  64.8  90.1  96.2
Sparse samples-based L1    ×           0.243  0.052  97.2  99.3  99.8
                     L1    √           0.202  0.041  97.9  99.5  99.9
                     L2    ×           1.312  0.482  37.6  66.8  84.2
                     L2    √           0.216  0.044  97.7  99.5  99.9

As shown in Table 1, all the experimental data reveal that the self-attention con-
volution can improve the accuracy of depth prediction to some extent. However, this
improvement is appreciably more pronounced when it is used for sparse samples-based
problems. The above experiments show that the self-attention network can focus on
accurate feature values in the convolutional phase and transfer useful information to
improve the accuracy of depth prediction. Comparing the L1 loss function with the L2
loss function, we can see something interesting. In Table 1, we can see that the two prob-
lems performed reasonably well with L1 loss, but U-net (no self-attention convolution)
had difficulty converging with L2 loss. This is because L2 is sensitive to outliers in the
training data, and errors in L1 are not squared. However, L1 has similar results to L2
when using self-attention. This is because the self-attention convolution processes the
input pixels discriminatively, reducing the sensitivity of L2 to outliers.

4.3 Comparison with Prior Work

In this section, we study the relationship between the number of depth samples and the estimation accuracy on the NYUv2 dataset. We compare our proposed network (self-attention U-net) with existing sparse-samples-based methods (Ma et al. [8] and Shivakumar et al. [20]) on the NYUv2 dataset. The impact of the number of depth samples on the estimation accuracy is shown in Fig. 3. As seen in Fig. 3, as the number of sparse samples increases, the depth completion accuracy becomes higher. Our method outperformed Ma's method when the number of samples was less than 200 and outperformed Shivakumar's method when the number was more than 200. This shows that our approach achieves good performance on the NYUv2 dataset.

Fig. 3. Impact of the number of depth samples on the estimation accuracy on the NYUv2 dataset.
RMSE and REL: lower is better; δ higher is better.

5 Summary
We propose to use self-attentive convolution to improve the performance of sparse-to-
dense depth completion task. The proposed self-attentive U-net can focus on accurate
feature values in the convolution stage and transfer useful information to improve the
accuracy of depth prediction. Extensive experiments show the effectiveness of self-
attention convolutions on the NYUv2 dataset, which presents a new idea to improve the
accuracy for depth completion tasks.

Acknowledgment. This research was financially supported by the National Natural Science
Foundation of China (Grant No. 41774027).

References
1. Fu, H., Gong, M., Wang, C., Batmanghelich, K., Tao, D.: Deep ordinal regression network for
monocular depth estimation. In: Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition, pp. 2002–2011 (2018)
2. Wang, W., Neumann, U.: Depth-aware CNN for RGB-D segmentation. In: Ferrari, V., Hebert,
M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11215, pp. 144–161. Springer,
Cham (2018). https://doi.org/10.1007/978-3-030-01252-6_9
3. Fu, H., Xu, D., Lin, S.: Object-based multiple foreground segmentation in RGBD video. IEEE
Trans. Image Process. 26(3), 1418–1427 (2017)

4. Loo, S.Y., Amiri, A.J., Mashohor, S., Tang, S.H., Zhang, H.: CNN-SVO: Improving the
Mapping in Semi-Direct Visual Odometry Using Single-Image Depth Prediction, arXiv:1810.
01011 [cs], Oct. 2018. http://arxiv.org/abs/1810.01011. Accessed 26 Nov 2020
5. Scharstein, D., Szeliski, R.: A taxonomy and evaluation of dense two-frame stereo correspon-
dence algorithms. Int. J. Comput. Vis. 47(1), 7–42 (2002)
6. Chang, J.-R., Chen, Y.-S.: Pyramid stereo matching network. In: Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, pp. 5410–5418 (2018)
7. Hamzah, R.A., Kadmin, A.F., Hamid, M.S., Ghani, S.F.A., Ibrahim, H.: Improvement of
stereo matching algorithm for 3D surface reconstruction. Signal Process. Image Commun.
65, 165–172 (2018)
8. Ma, F., Karaman, S.: Sparse-to-dense: depth prediction from sparse depth samples and a
single image. In: 2018 IEEE International Conference on Robotics and Automation (ICRA),
pp. 4796–4803 (2018)
9. Uhrig, J., Schneider, N., Schneider, L., Franke, U., Brox, T., Geiger, A.: Sparsity Invariant
CNNs. In: 2017 International Conference on 3D Vision (3DV), Qingdao, pp. 11–20 (October
2017). https://doi.org/10.1109/3DV.2017.00012
10. Hawe, S., Kleinsteuber, M., Diepold, K.: Dense disparity maps from sparse disparity
measurements. In: 2011 International Conference on Computer Vision, pp. 2126–2133 (2011)
11. Liu, L.-K., Chan, S.H., Nguyen, T.Q.: Depth reconstruction from sparse samples: rep-
resentation, algorithm, and sampling. IEEE Trans. Image Process. 24(6), 1983–1996
(2015)
12. Ma, F., Carlone, L., Ayaz, U., Karaman, S.: Sparse sensing for resource-constrained depth
reconstruction. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and
Systems (IROS), pp. 96–103 (2016)
13. Ma, F., Cavalheiro, G.V., Karaman, S.: Self-supervised Sparse-to-Dense: Self-supervised
Depth Completion from LiDAR and Monocular Camera, arXiv:1807.00275 [cs], July 2018.
http://arxiv.org/abs/1807.00275. Accessed 26 Nov 2020
14. Jaritz, M., De Charette, R., Wirbel, E., Perrotton, X., Nashashibi, F.: Sparse and dense data
with CNNs: depth completion and semantic segmentation. In: 2018 International Conference
on 3D Vision (3DV), pp. 52–60 (2018)
15. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
16. Dauphin, Y.N., Fan, A., Auli, M., Grangier, D.: Language modeling with gated convolutional
networks. In: International Conference on Machine Learning, pp. 933–941 (2017)
17. Van den Oord, A., et al.: Wavenet: a generative model for raw audio, arXiv preprint arXiv:
1609.03499 (2016)
18. Iizuka, S., Simo-Serra, E., Ishikawa, H.: Globally and locally consistent image completion.
ACM Trans. Graph. (ToG) 36(4), 1–14 (2017)
19. Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image
segmentation. In: International Conference on Medical Image Computing and Computer-
Assisted Intervention, pp. 234–241 (2015)
20. Shivakumar, S.S., Nguyen, T., Miller, I.D., Chen, S.W., Kumar, V., Taylor, C.J.: Dfusenet:
Deep fusion of RGB and sparse depth information for image guided dense depth completion.
In: 2019 IEEE Intelligent Transportation Systems Conference (ITSC), pp. 13–20 (2019)
Using Algorithm in Parametric Design
as an Approach to Inspire Nature
in Architectural Design

Mohamed Ibrahim Abdelhady(B) , Ayman K. Abdelgadir, Fatma Al-Araimi,


and Khulood AL-Amri

Engineering Department, Sur University College, Sur City, Oman


dr.abdelhady@suc.edu.om

Abstract. The aim of this study is to broaden architectural designers' understanding of using algorithms in parametric design and to find easier ways to draw inspiration from nature. To achieve these goals, recent buildings in which parametric design was used as a main tool in the design process were analyzed. The main points of analysis are: inspiration approach, parametric patterns in nature, goals of nature inspiration, and ways of nature inspiration. A flowchart was also used for the parametric design workflow, explaining how parametric design proceeds from the idea to the output. As a result, the selected buildings were evaluated on several points (inspiration approach, parametric patterns in nature, and goals of nature inspiration), each branching into several sub-points. It was noted that the most used technique in the buildings was direct and partial metaphor, observed in 6 of the 6 analyzed buildings under the inspiration point. This was followed by the sustainability technique under the goals of nature inspiration, used in 5 of the 6 buildings, while the least used technique, indirect metaphor, was not observed in any of the 6 buildings. In conclusion, the parametric design method is more flexible than traditional design methods, although it is based on rigid shapes such as the square, triangle and circle. A flowchart is proposed to help architectural designers use the parametric design technique for inspiration from nature during the design process.

Keywords: Algorithm · Parametric design · Nature inspiration methods · Parametric inspiration in nature and contemporary architecture

1 Introduction
Parametric design is a design method that differs from well-known traditional methods. The concept of parametric design is based on the principle of form-finding: shape design proceeds by experiment and exploration, and the shape is produced and executed in a computer environment, unlike the traditional method of inspiring, codifying and renewing shape. It addresses how to build and construct, how to achieve complexity, interconnectedness and superposition of components, and how to transfer these methods to

the design with the aim of adding a little systematic complexity in the construction of the
shape, which achieves creativity and optimization in building shapes. Benefiting from
parametric design in solving the problems of codifying complex and irregular design
shapes inspired by nature by presenting and reviewing the most important theories on
which parametric design relies in the inspiration of formal formulas such as those found
in nature. Understanding how to implement technology and develop strategies for para-
metric architectural design. Developing strategy in parametric architectural design is
driven by parametric popularity, and monitor the role of a new specialist dealing with
the implementation these methods. Disclosure of the strategies and principles underly-
ing the parametric basis. Linking design theory to architectural practice [2]. As a result,
Authors found many problems in previous researches related to the necessity of activat-
ing the use of parametric design in obtaining design solutions inspired by nature. Also
it can be found that design issues are more complex in practice and different modeling
platforms require the user to have different levels of understanding and interchange in the
process. From this standpoint, authors aim to broaden the perceptions of understanding
of parametric design, solve its complexity and ambiguity, and facilitate the understand-
ing of the algorithms associated with it. Authors also aim to find easier ways to draw
inspiration from nature and cite examples of several buildings in which the parametric
design has been applied and analyzed and the programs can be used for implementation
[1].

2 Parametric Design
2.1 The Concept and Definition of Parametric Design

The concept of parametricity can be summarized as follows: all design elements and components are parametrically adaptive and interrelated, and an effect on one of them affects all of them. This is known as the rubber effect, and it causes an ontological shift in understanding the basic elements and components of design. As shown in Fig. 1, parametric models present dynamic models and expectations within the concepts of differentiation and interdependence, unlike the classical and modern approaches to design (Fig. 2), which deal with each element separately. Instead of the classical and modernist focus on the absolute, ideal inertia of geometric forms (cube, cylinder, pyramid, sphere, hemisphere), the new parametric priorities are live (dynamic, adaptive) geometric dependencies such as subdivision surfaces, NURBS and splines as the basis of geometric structures, and dynamic systems such as metaballs, blobs, cloth and hair shapes that interact with external forces and influence the overall composition, as shown in Fig. 3, controlled by their parameters and the resulting shapes [5].

Fig. 1. Conceptual model as an ontology of the parametric model [1]

Fig. 2. The ontology of classicism and modernity [1]

Fig. 3. The Soho Galaxy building designed by Zaha Hadid, showing the contrast between parametricism and modernity

2.2 The Importance of Parametric Design


Figure 4 shows that the importance of parametric design lies in the creation of small elements, which are the cornerstone of large projects, as they do not need a long time to implement. The elements of architecture have become parametrically soft, based on the relationship between exterior and interior design. The elements are no longer solid and separate, but soft and connected with each other, reflecting the strength and aesthetics of the various parametric elements [7].

2.3 Parametric Design Process


Whether there is a specific thinking process in parametric design has no definite answer yet. However, every type of design has its own general design process; Fig. 5 shows a typical parametric design thinking process [8].

2.4 Parametric Strategies and Principles


The main strategies and principles of parametric design are as follows [13]:

Fig. 4. The importance of parametric design

Fig. 5. The process of parametric design [14]

• Smooth, flowing lines, like a piece of cloth.
• Originality and uniqueness of decorative elements and furniture.
• Curved geometric lines.
• Avoidance of basic geometric primitives (squares, triangles, rhombuses, etc.), because they are difficult to align and shape.
• Avoidance of minor duplication of elements.
• Maximum environmental friendliness.
• Magnitude.
• Practicality and versatility (Fig. 6).

3 The Methods of Nature Inspiration

3.1 Definition and Concept of Inspiration

Inspiration is a direct comparison between two or more subjects, one of which is a source, here nature in its manifestations and elements (total or partial), and the other the architectural composition, a container which receives the first source (nature) on its walls or inside. According to Christine and Nosipotro, it is the process of transforming abstractions (of nature) into physical or visual images, as in Lloyd Wright's building from and to nature [3].

Fig. 6. Parametric strategies and principles

3.2 Relationship Between Architectural Designs with Nature

The entry of natural metaphor into contemporary architecture can be clearly inferred from the work of some pioneering architects in this field, who support the idea of nature as a donor granting some of its advantages to the building, which becomes the receiver of some of these natural elements. Some architects and designers are still influenced by nature in some of their designs, despite the technological developments of the present time, for example Norman Foster, Zaha Hadid, Frank Gehry and James Law, etc. This may give an indication of hope for the relationship between designers and nature in the future [3].

3.3 The Importance of Nature for the Architectural Designer

A metaphor constitutes a new vital energy in design, leading to the development of new concepts and characteristics through the incorporation of the design idea and nature (the natural environment) in order to meet human needs. The metaphor is thus an important and effective tool for creativity, giving designers the ability to express their ideas and find many design solutions in harmony with nature [4].

3.4 Types of Inspiration


3.4.1 In Terms of the Goal: Direct Metaphor and Indirect Metaphor
• Direct metaphor: the metaphor of a form in harmony with its surroundings and with the recipient, through shape and by evoking things that have been seen before and have become familiar and part of memory.
• Indirect metaphor: the metaphor that embodies a formation or form imported from nature after refining and reworking it into a new and distinct form.

3.4.2 In Terms of Quantity and Type: General and Partial


• General metaphor: the metaphor of the details of the entire environment (to the degree of containment or complementarity).
• Partial metaphor: the borrowing of some elements of the surroundings in the envelope or composition.

3.4.3 In Terms of Natural Intervention: Structural Substitution and Mixed Substitution
• Structural substitution: the borrowing of the surrounding or element with structural intervention on it (metal or something similar).
• Mixed substitution: light and shadow, creation, color, height [11].

4 Parametric Inspiration from Nature


4.1 Parametric Concept in Nature
Parametric design operates in programmatic spaces that contain one or more mathematical algorithms and processes; it is based on geometric foundations and concepts with a mathematical logic inspired by nature.

4.2 The Ways of Nature Inspiration


Figure 7 shows the methods of transferring nature's principles to architecture, and what architecture can mimic from nature in order to solve architectural problems.
Biomimicry levels: three levels of biomimicry that can be applied to a design issue are usually given as forms, processes and ecosystems [2, 12]. Shape and mechanism are features of an organism or ecosystem that can be imitated when studying it, while the ecosystem as a whole should be examined in order to identify particular aspects to replicate. There are thus three degrees of imitation: organism level, behavior level and ecosystem level. The organism level refers to a particular organism, such as a plant or an animal (Fig. 8). Raviolis and Knight discuss a more specific material biomimicry at the organism level, where the surface of the stenocara beetle has been studied and mimicked for other possible uses, such as removing fog from airport runways and upgrading dehumidification equipment [2].

Fig. 7. Transference from nature to architecture [2]

Fig. 8. Matthew Parkes' Hydrological Center for the University of Namibia and the stenocara beetle [2].

4.3 Parametric Patterns

Parametric patterns are models of form in nature that follow particular criteria in a repetitive or systematic manner, based on the repetition of an element, and are realized entirely in nature. Parametric patterns in nature can be classified as shown in Table 1.

4.3.1 Regular Parametric Pattern


There are patterns found in nature that are characterized by precision, as well as patterns in which a person intervenes, which are characterized by rhythm, design and consistency, and by the ability to calculate them with controls and rules.

Table 1. Parametric patterns in nature

• Non-organic repetitive patterns: non-organic natural systems are used as inspirational models to formulate conceptions and construct spatial forms; many repetitive shape patterns exist in nature (sand dunes, sea waves, spirals).
• Repetitive patterns in living organisms: parametric design emulates repetitive biological models found in nature (models that perform vital functions), such as beehives.
• Swarm pattern: simulation that focuses on collective swarm behavior rather than individual behavior; swarms usually move uniformly, creating simple curved lines made up of organic blocks.
• Format creation pattern: parametric design is used to simulate highly repetitive patterns, such as the vital repetitive structures found in the configuration of trees and coral reefs.

4.3.2 Irregular Parametric Pattern


There are patterns that express visual excitement and are characterized by unorganized randomness and, in most cases, asymmetry. Their parts, as well as their asymmetry, are difficult to quantify, such as the repeated patterns of living objects [2].

4.4 Parametric Design Theories Based on Nature Formations

Parametric design codifies a number of shapes that exist in nature, such as Voronoi formations and fractal shapes, into formal theories that make it easier for the designer to codify shapes inspired by nature [1].

4.4.1 Fractal Geometry in Nature

Fractal geometry is a theory developed by mathematicians to codify certain natural phenomena. With the development of this theory, fractal geometry has become an experimental and expressive trend in the fields of the visual arts. Fractals are simply defined as geometric elements that are divided into parts, each of which is similar in shape to the whole from which it is derived, and they can be used in design [14]. Examples of some popular fractal structures are shown in Fig. 9 and Fig. 10:

1. Von Koch's Snowflake Curve: a six-pointed star with hexagonal self-symmetry, much like a natural snowflake. It can be obtained from an equilateral triangle by splitting each of its sides into three equal parts, replacing the middle section of each side with an equilateral triangle whose base is then removed, and repeating this method over and over (see the sketch after this list). In practice it is difficult to obtain the true shape of Koch's curve, since in theory it consists of an infinite number of iterations.
2. Sierpinski gasket: named after Wacław Sierpiński, a Polish mathematician. It consists of a solid equilateral triangle from which the inner triangle formed by connecting the midpoints of the original triangle's sides is removed, so that we get three inner triangles; the process is repeated on each inner triangle to get nine triangles, and so on [5].
3. Mandelbrot set: a fractal shape that is commonly recognized even beyond the field of mathematics, since it overlaps with so-called fractal art, presenting an artistic picture characterized by beauty and abstraction. It is the set of complex numbers c for which the iteration z_{n+1} = z_n^2 + c, starting from z_0 = 0, remains bounded; what distinguishes the Mandelbrot set is the complex structure that arises from this simple rule [12].
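To make the segment-replacement rule in item 1 concrete, the following is a minimal Python sketch (not from the paper; the point coordinates and recursion depth are illustrative):

```python
# A sketch of the Koch construction: each segment is replaced by four
# segments one third as long, with the middle third replaced by the two
# sides of an equilateral triangle erected on it.
import math

def koch(p0, p1, depth):
    """Return the points of the Koch curve from p0 to p1 after `depth` iterations."""
    if depth == 0:
        return [p0, p1]
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = (x1 - x0) / 3.0, (y1 - y0) / 3.0
    a = (x0 + dx, y0 + dy)              # one-third point
    b = (x0 + 2 * dx, y0 + 2 * dy)      # two-thirds point
    side = math.hypot(dx, dy)           # length of the middle third
    ang = math.atan2(dy, dx) + math.pi / 3
    apex = (a[0] + side * math.cos(ang), a[1] + side * math.sin(ang))
    pts = []
    for q0, q1 in ((p0, a), (a, apex), (apex, b), (b, p1)):
        pts += koch(q0, q1, depth - 1)[:-1]   # drop duplicated joints
    return pts + [p1]

# Each iteration multiplies the segment count by four; the true curve is
# the limit of infinitely many iterations.
print(len(koch((0.0, 0.0), (1.0, 0.0), 4)) - 1)   # 256 segments
```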

Fig. 9. Fractal figures as repetitions of geometrical progressions that may extend to infinity and never vanish [5]

Fig. 10. From left: the Von Koch Snowflake Curve, the Sierpinski gasket, the Mandelbrot set [5]

4.4.2 Voronoi Theory


Voronoi geometry is regarded as an organizing phenomenon in nature. Voronoi diagrams simulate several formations occurring in nature, and 2D and 3D Voronoi diagrams can be found in a variety of life forms, such as bubbles, urchins, sponges and crystals [6] (see Fig. 11).

Fig. 11. Voronoi diagram models in nature [6]

A manual Voronoi diagram can be drawn using the following steps, as shown in (Fig. 12); a code sketch follows the list.

1. Determine a group of generating points.
2. Connect straight lines between these points.
3. Determine the midpoints of the previous straight lines.
4. Draw lines perpendicular to the previous lines through their midpoints (the perpendicular bisectors).
5. The perpendicular bisectors intersect each other to form a new network known as the Voronoi diagram [6] (see Fig. 13).
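The same construction can be computed programmatically. Below is a minimal sketch assuming NumPy and SciPy are available (neither is used in the paper); scipy.spatial.Voronoi performs internally what the manual perpendicular-bisector steps do by hand:

```python
# A sketch of computing a Voronoi diagram for a set of generating points.
import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(seed=1)
points = rng.random((10, 2))        # step 1: ten generating points in the unit square

vor = Voronoi(points)               # steps 2-5 performed internally
print(vor.vertices[:3])             # vertices where perpendicular bisectors intersect
print(vor.regions[vor.point_region[0]])  # the cell of the first generating point
```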

Fig. 12. The steps in drawing a Voronoi diagram

Fig. 13. Types of Voronoi diagram regular, random and clustered [6]

5 The Parametric Attributes and Properties


Parametric features are the attributes and properties through which the most important concepts governing parametric design are presented, as follows:

5.1 Formal Features


Formal parametric features are those by which the rules and principles that allow the preparation and evaluation of the form can be described. For parametric architecture, the ideals for the achievement of beauty are expressed in the following points:

1. The shapes must be smooth, taking into account that they are parametrically connected, so that an effect on one of them affects the formation as a whole.
2. Avoid grouping elements that have nothing to do with each other, as this causes isolation within the formation.
3. Avoid solid shapes (square, triangle, circle, cube, pyramid, and sphere).

5.2 Functional Features


These are the features by which the rules and principles that clarify and test the functionality of a parametric design can be explained. They are concepts that drive success and can be expressed as follows: all functions that take place within parametric scenarios must be defined in parametric terms that can be adapted, in addition to achieving interdependence between them, since one operation affects the rest of the activities [1].
Figure 14 shows the parametric part, represented by the triangle that forms the basic unit of the six-pointed star shape: the middle triangle is the parametric part of the Islamic hexagon, while the right and left triangles represent geometric changes in the unit obtained by changing the position of the constrained point in the middle parametric part, resulting in different star shapes as new decorative units, as shown in Table 2 [9].

Fig. 14. New parametric unit of a star [9]
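As an illustration of the idea above, here is a minimal, hypothetical Python sketch (not the authors' tool) in which one constrained parameter, the inner-vertex radius, regenerates a whole family of six-pointed star units:

```python
# A sketch of a parametric decorative unit: a single constrained
# parameter (the inner-vertex radius) regenerates the whole star.
import math

def star(n_points=6, r_outer=1.0, r_inner=0.5):
    """Return the 2 * n_points vertices of a star as (x, y) tuples."""
    verts = []
    for k in range(2 * n_points):
        r = r_outer if k % 2 == 0 else r_inner   # alternate outer/inner radius
        a = math.pi * k / n_points               # evenly spaced angles
        verts.append((r * math.cos(a), r * math.sin(a)))
    return verts

# Each value of the constrained parameter yields a new decorative unit.
for r in (0.3, 0.5, 0.7):
    print(f"r_inner={r}: first vertices {star(r_inner=r)[:2]}")
```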

5.3 Parametric Design Patterns


In order to examine the application of parametric design patterns, the authors conducted practical studies of parametric design patterns in some design cases by recreating parametric design patterns of building facades. Figure 15 shows the steps of using design patterns in design: select the shape, optimize the structure and surfaces, format the structure, and distribute the structure. Figure 16 shows the use of a diagram pattern in design [10].

Table 2. Conventional and parametric methods

Fig. 15. Using design patterns in design

6 Contemporary Architectural Case Studies


The practical part of the current research focuses on contemporary architects' work in a range of styles, from post-modern architecture and high-tech architecture to highly imaginative and expressive forms and structures that imitate sculpture on a monumental scale. The buildings have been selected based on their ways of inspiration from nature. Table 3 shows the main information of the selected case studies, while Table 4 shows the analysis.

Fig. 16. An overview diagram of patterns used in design (Hangzhou Stadium) [10]

Table 3. Selected case studies

7 Results and Discussions

The results of the current work are presented in Figs. 17, 18 and 19, which explain parametric inspiration from nature in the form of graphs with their tables, based on the final selection of 6 buildings. The bar chart below (Fig. 17) shows the inspiration approach on three different points, each containing two sub-points, where the number of buildings using each of the three main types was determined out of the 6 buildings.
The following bar chart (Fig. 18) shows the parametric patterns in nature on four different points, where the number of buildings was determined out of the 6 buildings.

Table 4. Case studies analysis

[Bar chart: number of buildings (0 to 7) per inspiration approach: direct metaphor, indirect metaphor, general, partial, structural substitution, mixed, switch]

Fig. 17. The analysis of inspiration approach of selected case studies



The authors noticed that the percentage of buildings using the "Format creation pattern" is greater compared to the other types used in the different buildings.

[Bar chart: number of buildings (0 to 3.5) per parametric pattern in nature]

Fig. 18. The analysis of parametric patterns in nature

The next bar chart (Fig. 19) shows the goals of nature inspiration through four different points, where the number of buildings was determined out of the 6 buildings. The authors noticed that the proportions between the buildings varied in the ways of inspiration; despite this, the highest focus was on sustainability as a goal of nature inspiration.

Fig. 19. The analysis of nature inspiration goals



The bar chart below (Fig. 20) shows the ways of nature inspiration on three different points. The number of buildings using system inspiration is 3 out of the 6 buildings, making it the most used among all the ways.

Fig. 20. The ways of nature inspiration

The authors conclude the mechanism of parametric design (Fig. 21): it depends on the variables specified for it, produces many alternatives from them, and the final solution is then chosen based on a set of performance-related determinants. The proposed mechanism is summarized in the following graphic flowchart.

Fig. 21. The mechanism of parametric design
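A minimal, hypothetical Python sketch of this generate-and-select mechanism follows; the design variables and the performance determinant are illustrative, not from the paper:

```python
# A sketch of the mechanism in Fig. 21: vary the design parameters,
# generate many alternatives, then choose the final solution with a
# performance-related determinant (here: minimum surface area).
import itertools

def generate(params):
    """Toy 'design': a box defined by width, depth, height."""
    w, d, h = params
    return {"params": params, "volume": w * d * h,
            "surface": 2 * (w * d + w * h + d * h)}

widths, depths, heights = (2, 3, 4), (2, 3), (3, 4)
alternatives = [generate(p) for p in itertools.product(widths, depths, heights)]

# Performance determinant: minimum surface area for a required volume.
feasible = [a for a in alternatives if a["volume"] >= 24]
best = min(feasible, key=lambda a: a["surface"])
print(best["params"], best["surface"])
```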



8 Conclusion
Parametric design can be considered an introduction to nature's inspiration, based on a new design trend called parametricism and supported by several theories that facilitate inspiration from nature and the use of organic forms, such as the fractal theory and the Voronoi theory. The current research focuses on the integration of parametric design processes, approaches and theories in architecture, which rely on:

1. Creating architectural forms that were previously difficult to generate
2. Saving time
3. The ability to modify design elements according to design variables
4. The ability to generate new forms, structures and constructions with different behaviors
5. The ability to generate more complex forms
6. Enhancing the creativity of the designer
7. Suitability for the figurative stage of the design process

As a result, it is noticed that the parametric design method is more flexible than traditional design methods, which are based on rigid shapes such as the square, triangle and circle. At present, parametric design has become more popular, as the designer can easily draw inspiration from nature in several ways using different patterns, such as imitation, abstraction and inspiration. The current work proposes a flowchart for using parametric design as an approach to draw inspiration from nature in architectural design in order to generate and create new forms.

References
1. Osama, M., Ahmed, R., Eslam, E.: Parametric design as an approach to inspire nature in product design 11 (January 2019)
2. Ahmad, F., Ahmed, S.: Parametric patterns inspired by nature for responsive building façade. 4(9) (September 2015)
3. Saeed, A., Bader, R., Khalas, M.: Nature is a source of inspiration and metaphor in the process of architectural embodiment. Nature influences on architecture designs. Tishreen Univ. J. Res. Sci. Stud. Eng. Sci. Series 14(3) (2019)
4. Hussein, W.: Flexible design and its effect on developing the urbanism and architecture, a comparative study of parametric design and its role in promoting the development of regionalism in design. 10(1), 1–12 (2020). p-ISSN: 2168-4995, e-ISSN: 2168-5002
5. Syrian Researchers. https://www.syr-res.com/article/3240.html (9 January 2014)
6. Merhej, S., Mahmoud, S.: Applications of Voronoi diagram on facades/comparative study of contemporary architectural projects. Al-Baath Univ. J. 39(7) (2017)
7. Makhlouf, M.: Twenty two. Available at: https://twentytwo-group.org/documents, last accessed 12 November 2020 (2019)
8. EDP Sciences: Research on parametric form design based on natural patterns. MATEC Web of Conferences 176, 01012 (2018)
9. Dowa Khaled, H., Diaa El din Abd El Dayem, O., Nisreen Saied, A.M., Hind, E.S.: Parametric design as a contemporary design solution for the Islamic ornaments. Int. Design J. 10(2), Article 1 (2020)

10. Chien, S., Choo, S., Schnabel, M.A., Nakapan, W., Kim, M.J., Roudavski, S.: Micro-utopias: using parametric design patterns in building façade design. Research in Asia CAADRIA, 167–176 (2016)
11. Shakra, K.: Biophilic design and biomimicry in architecture. Faculty of Engineering, Material Credit Hour System Programs, Architecture by Digital Technology (2020)
12. Herrera, P.C.: issuu. Available at: https://issuu.com/pabloherrera/docs/12012012_parastrat_issuu_original_2009/14, last accessed 21 November 2020 (2012)
13. Roland, H.: Strategies for parametric design in architecture: an application of practice-led research. University of Bath, Department of Architecture and Civil Engineering (2010)
14. Liu, Y.C.: Parametric design: method, thinking strategy and framework. Architecture Technique Z1, 34–37 (2011)
Sybil Account Detection in Social Network Using
Deep Neural Network

Preety Verma1 , Ankita Nigam2(B) , Garima Tiwari3 , and G. Mallesham4


1 Department of Computer Science and Engineering, Greater Noida Institute of Technology
(GNIOT), Noida, India
2 Department of Computer Science, Princeton Institute of Engineering and Technology for
Women, Hyderabad, India
ankita270481@gmail.com
3 Department of Electronic and Telecommunication, Jabalpur Engineering College, Jabalpur,
India
4 Department of ECE, Indur Institute of Engineering and Technology, Ponnala, Siddipet,

Telangana, India

Abstract. Everyone’s social life was intertwined with internet social networks
during the show’s run. These objectives have resulted in a significant shift in
how we seek fulfillment in our social lives. Making new friends and staying in
touch with them, as well as keeping up with their progress, has become less
difficult. However, due to their rapid growth, issues such as sybil (false) profiles
and online impersonation have arisen. According to a subsequent analysis, the
number of accounts that appear within social media is significantly lower than the
number of people who use it. This implies that sybil profiles have been increased
in the long run. Recognizing these sybil profiles poses a challenge for online
social media providers. There is no feasible course of action available to address
these issues. In this paper, we developed a machine learning method for revealing
sybil profiles that is feasible and competent. In this section, we will use business
classification techniques based on Notable Neural Organize calculation to classify
the professionals.

Keywords: Classification · Sybil attack · Sybil profile detection · Neural


network · Machine learning · Online social network

1 Introduction
A social networking site is a website where each user has a profile and can keep in touch with friends, share updates, and meet new people who have similar interests. These Online Social Network (OSN) companies provide web platforms that allow users to connect with one another. These social networking sites are growing quickly and changing the way people communicate with one another. Online communities bring people with similar interests together, making it easier for users to meet new people. Everyone's social life has become intertwined with online social networks in the present day. These platforms have resulted in a significant change in the

way we pursue our social lives. Making new friends, staying in touch with them and following their updates has become easier. Online social networks have an impact on science, education, grassroots organising, commerce, and so on. Researchers are studying these online social networks to see what effect they have on people. Teachers can reach students effectively by creating an inviting environment for them; nowadays teachers make themselves familiar to students through these platforms by creating online classroom pages, giving homework, holding discussions, and so on, which considerably improves education. Employers can use these social networking sites to hire people who are skilled and interested in the work, and background checks can often be completed effectively using this method. The majority of OSNs are free, but several charge a fee for membership and use it for business purposes, while the rest make money through advertising. Governments can also use these platforms to quickly gauge public opinion.

2 Neural Network
A neural network has tens of thousands, if not millions, of artificial neurons known as units. The units are organised into layers: an input layer, an output layer, and hidden layers. The input layer receives data from the outside, and the hidden layers process that data; the hidden layers are an essential part of producing the output, and the output layer delivers the result to the user. There are two types of learning in neural networks: supervised learning and unsupervised learning. In supervised learning, both the input and the expected output are provided. The network processes the input and compares its actual output against the expected output. The error is the difference between the actual output and the expected output. Errors are then propagated back through the network in order to adjust the weights.
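As a concrete illustration of this supervised-learning loop, here is a minimal sketch assuming NumPy (the paper's own model is built later with different tooling): a forward pass through one hidden layer, an error between expected and actual output, and backpropagated weight updates.

```python
# A sketch of supervised learning in a tiny feedforward network.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((8, 4))                                 # 8 samples, 4 input features
y = (X.sum(axis=1, keepdims=True) > 2).astype(float)   # toy expected outputs

W1, W2 = rng.normal(size=(4, 5)), rng.normal(size=(5, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(500):
    h = sigmoid(X @ W1)                 # hidden layer
    out = sigmoid(h @ W2)               # output layer (actual output)
    err = y - out                       # expected minus actual output
    # backpropagate the error and adjust the weights
    d_out = err * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 += 0.5 * h.T @ d_out
    W1 += 0.5 * X.T @ d_h

print(np.round(out.ravel(), 2))         # outputs move toward the labels
```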

3 Literature Review
According to [1], an increasing number of people have accounts on social media platforms (SMPs) but hide their identities for malicious motives. Surprisingly, little or no research has been done to identify sybil identities created by humans, notably on SMPs, although various incidents exist in which sybil accounts created by bots or computers were effectively recognised using machine learning algorithms. According to [2], harmful activities in online social networks, such as Sybil attacks and malicious use of fake identities, can have a significant impact on the social activities in which users engage while they are online. This problem, for example, can influence content dissemination, community formation, recommendation, profile browsing, and commenting. Fake identities and user accounts (also known as "Sybils") in online forums have become a gold mine for adversaries looking to disseminate fake product reviews, malware and spam on social media, and astroturfed political campaigns, according to [3]. Automated Turing Tests and graph-based Sybil detectors are the state of the art among the defences [4–7]. In today's society, social media platforms are used on a daily basis and have taken over a considerable portion of our lives, and the number of people on social media platforms being targeted for malicious use is growing at an alarming rate.

4 Problem Definition
Social networking sites are improving our social life, yet there are several drawbacks to using them: security risks, online bullying, the potential for mistreatment, trolling, and other difficulties. These are, for the most part, carried out using sybil profiles. Among the many issues with social networking sites, sybil accounts are one that can lead to distinct troubles, affecting users of social networking sites in a variety of ways. Detecting individual sybil profiles would improve online social networks. On Twitter, there are relatively few methods for finding sybil accounts, and in actuality, current procedures do not offer a high level of precision [8–15].

5 Proposed Work

Recognizing sybil accounts on online social platforms is difficult. Individuals who use online social networks suffer from a variety of challenges that affect their personal and professional lives, and the number of sybil accounts on social networks is multiplying. Detecting sybil accounts supports the health of online social networks, as sybil accounts spread fake news, manipulated online ratings, and spam. Our suggested system recognises sybil accounts on Twitter. There are various approaches for identifying sybil accounts on online social networks, each with its own central emphasis and goals; existing approaches, however, do not reach a high level of precision. To achieve better results, this proposed work combines a weighted feature set with machine learning techniques. Using the recommended strategy for identifying sybil accounts on Twitter improves the accuracy. The technique used in this work, the neural network, classifies legitimate and sybil accounts; neural networks give better results in data classification. Because of their ability to learn and handle various real-world challenges, machine learning procedures have been widely used in prediction processes. They can change their internal configuration without requiring human participation in order to produce the estimated outcome for the desired problem and to establish a link between input and output. In this way, neural networks can be used to recognise sybil accounts on Twitter with greater precision.
Figure 1 shows the proposed block diagram, in which we perform 4 steps (a code sketch of the pipeline follows).

Step 1: We start by gathering the genuine and sybil profile datasets.
Step 2: We remove the undesired attributes from both datasets, then change the data types of the attributes according to the algorithm and replace any N/A or null values with 0.
Step 3: After pre-processing both datasets, we combine the sybil and legitimate profile datasets to form a single dataset, which we then divide into two sections: training and testing.
Step 4: We create a classification model based on the neural network using the training set of sybil and legitimate profiles. Once training has been completed, we test the model on the testing dataset, and based on the results of that testing, we calculate the performance of the algorithm.
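A minimal, hypothetical sketch of these four steps follows; the file names and dropped columns are placeholders (the paper does not specify them), and scikit-learn's MLPClassifier stands in for the deep neural network:

```python
# A sketch of the four-step pipeline in Fig. 1.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Step 1: load the genuine and sybil profile datasets (hypothetical files).
legit = pd.read_csv("legit_profiles.csv").assign(label=0)
sybil = pd.read_csv("sybil_profiles.csv").assign(label=1)

# Step 2: drop unwanted attributes and replace N/A values with 0.
drop_cols = ["id", "name", "screen_name"]          # assumed attribute names
data = pd.concat([legit, sybil]).drop(columns=drop_cols).fillna(0)

# Step 3: merge and split into training and testing sets.
X, y = data.drop(columns="label"), data["label"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Step 4: train a neural network classifier and evaluate it.
model = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500)
model.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))
```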

Fig. 1. Proposed block diagram

6 Experimental and Result Analysis

This section serves as the experimental validation of the completed project and the recommended strategy. For the experimental study, we used Jupyter Notebook, a Python IDE mostly used for machine learning model development.
Once the dataset has been loaded, we perform feature extraction on it and remove any unnecessary columns from the datasets. Once we have extracted the dataset's basic properties, we pre-process the data in order to modify the data types of the attributes and replace any N/A or null values, as shown in Fig. 2.

Fig. 2. Data pre-processing

When all of the profiles have been pre-processed, the legit and sybil profiles are merged into a single dataset, which is then divided into training and testing datasets, as shown in Fig. 3.
A predictive machine learning model based on a neural network is built using the training dataset. Once the predictive model is built, we fit the training data to the model, as shown in Fig. 4.
When the predictive model has been trained, we test the model's performance on the test dataset and its prediction of profiles; based on the assessment results, we evaluate the model's performance and the accuracy we have obtained.

Fig. 3. Merging and splitting dataset

Fig. 4. Building predictive model



7 Evaluation Metrics
A confusion matrix and its related estimation variables TP, FP, TN, and FN were used, defined as follows (a sketch of the computation follows this list):

• True Positive (TP): the number of sybil profiles correctly identified as sybil profiles.
• False Positive (FP): the number of genuine profiles mistakenly identified as sybil profiles.
• True Negative (TN): the number of genuine profiles correctly identified as genuine profiles.
• False Negative (FN): the number of sybil profiles mistakenly identified as genuine profiles.
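A minimal sketch of computing these quantities, and the accuracy derived from them, on illustrative labels and predictions:

```python
# A sketch of the confusion-matrix counts and the accuracy metric.
def confusion(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

y_true = [1, 0, 1, 1, 0, 0, 1, 0]       # 1 = sybil, 0 = genuine
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tp, fp, tn, fn = confusion(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + tn + fn)
print(tp, fp, tn, fn, accuracy)          # 3 1 3 1 0.75
```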

Accuracy is used to rate the classifier: it is the percentage of accounts that are correctly recognised, calculated as the ratio of correctly classified accounts to all accounts, that is, (TP + TN) / (TP + TN + FP + FN). A good detection strategy should maximize it. Random forest [16–24] is a baseline algorithm against which we evaluate the performance of our neural network proposal. The comparative results are shown in Table 1 and Fig. 5.

Table 1. Result Comparison

Model | Accuracy
Random Forest [1] | 87.11%
Neural Network | 97.19%

Fig. 5. Result comparison



8 Conclusion
In the suggested technique, a deep neural network algorithm is used to recognise sybil Twitter accounts. The technique makes use of the most persuasive classifier, the neural network, to increase precision compared to other learning approaches such as the random forest algorithm. We have developed a predictive model based on a neural network that can identify sybil profiles in any online social network with a high accuracy of up to 97%.

References
1. Van Der Estee, W., Jan, E.: Using machine learning to detect fake identities: bots vs humans.
IEEE Access 6, 6540–6549 (2018)
2. Muhammad, A.-Q., Mabrook, A.-R., Atif, A., Majed, A.: Sybil defense techniques in online
social networks: a survey. In: IEEE (2017)
3. Mansour, A., Abdulrahman, A., AbdulMalik, A.-S., Mohammed, A., Abdulmajeed, A.: TSD:
Detecting Sybil Accounts in Twitter. In: IEEE (2016)
4. Singh, N., Sharma, T., Thakral, A., Choudhury, T.: Detection of fake profile in online social
networks using machine learning. In: IEEE (2018)
5. Secchiero, M.: FakeBook: detecting fake profiles in on-line social networks. In: IEEE (2012)
6. El Azab, A., Idrees, A.M., Mahmoud, M.A., Hefny, H.: Sybil account detection in twitter
based on minimum weighted feature set. In: IEEE (2016)
7. Maind, M.S.B.: Research paper on basic of artificial neural network. In: IJRITCC (2014)
8. Shuang-Hong, Y., Bo, L., Alex, S., Narayanan, S., Zhaohui, Z., Hongyuan, Z.: Like like
alike: joint friendship and interest propagation in social networks. In: Proceedings of the 20th
WWW, pp. 537–546 (2011)
9. Gupta, G.K.: Introduction to Data Mining with Case Studies. Prentice Hall, India (2008)
10. Chattamvelli, R.: Data Mining Methods. Narosa (2010)
11. Kannan, S., Gurusamy, V.: Preprocessing Techniques for Text Mining (2015)
12. Adikari, S., Dutta, K.: Identifying sybil profiles in LinkedIn. In: PACIS 2014 Proceedings,
AISeL (2014)
13. Adebowale, M.A., Lwin, K.T., Hossain, M.A.: Intelligent phishing detection scheme using
deep learning algorithms. J. Enterprise Inform. Manag. (2020). https://doi.org/10.1108/jeim-
01-2020-0036
14. Shorten, C., Khoshgoftaar, T.M.: A survey on image data augmentation for deep learning. J.
Big Data 6(1), (2019)
15. Gong, Q., et al.: Deepscan: exploiting deep learning for malicious account detection in
location-based social networks. IEEE Commun. Mag. 56, 21–27 (2018)
16. Pang, G., Shen, C., Cao, L., Hengel, A.V.D.: Deep learning for anomaly detection. ACM
Comput. Surv. 54(2), 1–38 (2021)
17. Su, X., et al.: A comprehensive survey on community detection with deep learning. arXiv
preprint arXiv:2105.12584 (2021)
18. Zhang, Q.-S., Zhu, S.-C.: Visual interpretability for deep learning: a survey. Frontiers Inform.
Technol. Electronic Eng. 19(1), 27–39 (2018)
19. Erfani, S.M., Rajasegarar, S., Karunasekera, S., Leckie, C.: High- dimensional and large-scale
anomaly detection using a linear one-class SVM with deep learning. Pattern Recognit. 58,
121–134 (2016)
20. Bulusu, S., Kailkhura, B., Li, B., Varshney, P.K., Song, D.: Anomalous instance detection in deep learning: a survey. arXiv preprint arXiv:2003.06979 (2020)

21. Kwon, D., Kim, H., Kim, J., Suh, S.C., Kim, I., Kim, K.J.: A survey of deep learning-based
network anomaly detection. Clust. Comput. 22(1), 949–961 (2017). https://doi.org/10.1007/
s10586-017-1117-8
22. Wang, R., Nie, K., Wang, T., Yang, Y., Long, B.: Deep learning for anomaly detection. In:
WSDM, pp. 894–896 (2020)
23. Wang, H., Zhou, C., Wu, J., Dang, W., Zhu, X., Wang, J.: Deep structure learning for fraud
detection. In: ICDM, pp. 567–576 (2018)
24. Li, P., Chen, X., Jing, L., He, Z., Yu, G.: Swisslog: Robust and unified deep learning based
log anomaly detection for diverse faults. In: ISSRE, pp. 92–103 (2020)
Docker Container Orchestration Management:
A Review

Jigna N. Acharya(B) and Anil C. Suthar

Gujarat Technological University, Ahmedabad, Gujarat, India


jignaforever@gmail.com

Abstract. Cloud computing is an online technology in which computing resources like hardware, software and applications are available as per the user's needs. In a cloud computing architecture, a microservices-based application involves multiple microservices deployed, updated, and redeployed on a lightweight virtualization technology called the Docker container, rather than on hypervisor-based virtualization. Docker Swarm, Kubernetes and Apache Mesos are container orchestration tools for scheduling and managing the individual containers of a microservice application within a cluster spanning private and public clouds. Docker container orchestration can include creating and scheduling containers, ensuring the availability of containers and host machines, rescheduling failed containers, scaling containers to balance the workload on the infrastructure, and securing the interaction between containers. This survey provides a complete description of Docker container orchestration approaches, analyzing the framework and the classification of container orchestration management.

Keywords: Docker · Container · Orchestration · Micro-services

1 Introduction
Nowadays, cloud computing is becoming popular for deploying microservices-based applications on lightweight, self-contained container virtualization technology rather than on virtual machine technology. A microservices architecture breaks an application down into minor, autonomous, independent parts, which can be arranged more simply in the cloud [26]. As a result, it reduces maintenance cost and increases development efficiency as the application expands. In cloud computing, either hypervisor-based virtualization or container-based virtualization is used for running microservices [27]. Virtual machines require a whole dedicated operating system, consume considerable CPU, RAM and storage, and take a long time to start. Containers are lightweight and flexible: a microservice application can be created and deployed within seconds, and no dedicated operating system is required.
Figure 1 shows the architectural difference between hypervisor-based and container-based virtualization.


Fig. 1. The framework of hypervisor based and container based virtualization technologies

Amazon EC2 [28], Azure [29] and Google Cloud Engine [30] are cloud providers that offer containers as a service nowadays. Container-based virtualization is provided by the Docker platform, which aims to make container technology easy to use. To deploy large-scale container applications in a production environment, container orchestration is required. Container orchestration systems have been developed to deploy, run and manage containers on host machines; they also offer features like scheduling, load balancing, autoscaling and fault tolerance. Docker Swarm [31], Kubernetes [32] and Apache Mesos [33] are popular orchestration tools used in cluster cloud environments. Each application running in a container has its own characteristics and requires different resources. Hence, the orchestration system considers all these constraints and places containers on available hosts to maximise resource utilisation and minimise operational cost. In this paper, we survey different container orchestration features, such as scheduling, load balancing, scaling to a large number of systems, fault tolerance and system availability, in Docker, together with existing work.
The rest of this paper is organized as follows. Section 2 presents a Container Orches-
tration Software System. Section 3 presents a Container Orchestration Framework, and
Sect. 4 describes challenges and issues.

2 Container Orchestration Software System

Nowadays, in the cloud computing environment, monolithic applications are increasingly being replaced: instead, microservice applications are divided into smaller containerized components that work together. Container orchestration in cloud technology allows the application provider to deploy, run and control its applications. Kubernetes, Docker Swarm, Apache Mesos and Apache Marathon [34] are various software systems for container orchestration. These systems manage microservice applications in a Docker container environment.

Google Kubernetes is a portable, open-source platform for handling containerized applications. Kubernetes makes applications easy to deploy and operate in a microservice architecture. The Pod is the essential building block, which makes it possible to release more than a single container at a time. A Kubernetes cluster consists of physical or virtual machines called nodes [35]; a node is a machine on which pods can be run and scheduled [1]. The default Kubernetes scheduler guarantees that the total compute resources requested by the pods placed on a node cannot exceed the node's capacity [2].
Docker Swarm is another important orchestration tool that allows users to manage multiple containers deployed on host machines. In a Docker swarm, there is a master node and one or more worker nodes. Swarm uses scheduling strategies called spread, binpack and random. The default strategy is spread, which executes a container on the most lightly loaded node; the binpack strategy executes a container on the most heavily loaded node that still has capacity, and the random strategy selects a random node to execute a container [3] (a sketch of the three strategies follows). Each strategy can execute containers with predefined resources. Docker Swarm uses a filtering mechanism to identify resources and a health-check mechanism to check host machines' status. Docker Swarm also notices if any container on a host machine fails and automatically redeploys the failed container on another host machine [36].
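The following minimal Python sketch illustrates the three placement strategies (an illustration of the idea, not Docker's implementation; node names and load values are made up):

```python
# A sketch of the spread, binpack and random placement strategies.
import random

nodes = {"node1": 0.2, "node2": 0.7, "node3": 0.5}   # fraction of used capacity

def place(strategy, nodes):
    if strategy == "spread":    # most lightly loaded node
        return min(nodes, key=nodes.get)
    if strategy == "binpack":   # most heavily loaded node with spare capacity
        return max((n for n, u in nodes.items() if u < 1.0), key=nodes.get)
    return random.choice(list(nodes))                # "random" strategy

for s in ("spread", "binpack", "random"):
    print(s, "->", place(s, nodes))
```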
Another open-source cluster management platform is Apache Mesos, designed at UC Berkeley. Resource isolation and sharing for distributed applications are provided by the Apache Mesos platform. The main components of Mesos are a master daemon that manages slave daemons executing on each cluster machine; slaves are either physical or virtual machines, typically from the same provider. Application components run in containers that use computing resources, which Mesos assigns to the application without changing its configuration [4].
Marathon is an example of an open-source Mesos framework that supports container orchestration. The specific resource requirements of an application are automatically forwarded to Mesos for deployment. Marathon enables and deploys JSON-described cloud applications, and dependency constraints can be used by containers [4].
Table 1 lists container management software that works on the concept of container-based virtualization.

Table 1. List of container management software

Software | Category | Feature | Cost | Link
Docker | Application Container | Supports any type of application; uses reliable images | Free | https://www.docker.com
AWS | Orchestration Framework | Runs thousands of containers in a second; heterogeneous cluster facility | Chargeable | https://aws.amazon.com/ecs/
Kubernetes | Orchestration Framework | Supports Docker images; optimized containers | Based on usage cost | https://kubernetes.io/
LXC | System Container, Application Container | Restriction and ranking of resources; namespace isolation | Free | https://linuxcontainers.org/
Core OS Linux Container | Application Container | Monolithic kernel; userspace instances for container partition | Free | https://www.openshift.com
Microsoft Azure | Orchestration Framework | Open-source Docker CLI; runs on a cloud, hybrid platform | Billing | https://azure.microsoft.com/en-us/product-categories/containers/
Google Cloud | Orchestration Framework | No container orchestration; effective load balancing | Free | https://cloud.google.com/kubernetes-engine
Portainer | Container Manager | Web UI to manage containers; easy to add nodes | Free | https://www.portainer.io
Apache Mesos | Orchestration Framework | Inbuilt Web UI; uses HTTPS API; linear scalability | Free | https://mesos.apache.org

3 Container Orchestration Framework


In orchestration, containers run on different nodes. Container orchestration supports various features like resource control, scheduling, autoscaling, health checking, load balancing, and fault tolerance [5]. A reference framework is shown in Fig. 2; its components are common to most container orchestration systems. Three main components are identified in the presented framework: web application and service, containerized cluster master, and physical machine.

Fig. 2. Container orchestration framework

Web Application and Services: Cloud users deploy their applications, either web applications or services, in cloud computing. The cloud provider packages these applications in containers to execute in a cluster environment; a single application may use more than one container, and various applications are executed on a common platform or cluster. Applications can be web services, core backend real-time services, statistical analytics jobs, cron tasks, etc. [6].
Containerized Cluster Master: The main component of orchestration is called the cluster master. It is the interface between the compute machines and the application. The main functionality of the cluster master is service management, scheduling and resource management. Service management includes labels, groups, namespaces, dependencies, and load balancing of containers. Scheduling is responsible for allocation, replication, rescheduling, rolling deployment, and upgrades and downgrades of containers, while resource management manages resources such as memory, CPU, GPU, volumes, ports and IPs of containers.
Physical Machine: The physical machines in the cluster provide the infrastructure for deploying a microservice-based application. Cluster machines can be virtual or physical machines in the cloud.

3.1 Classification of Orchestration Management

Orchestration management is classified into six categories: resource control, scheduling, load balancing, health checking, auto-scaling, and fault tolerance, as shown in Fig. 3. These classifications are helpful for future researchers in the cloud container environment.

Fig. 3. Classification of orchestration management

The first classification focuses on resource control, which includes the management of resources like CPU, memory, IO and volumes in a cloud environment. Scheduling, the second category, defines a policy to place a container on a node subject to constraints such as resource constraints or node affinity. The third category, load balancing, distributes load among multiple container instances. The fourth category, health checking, is used to check the status of nodes. The fifth category, fault tolerance, deals with replica controllers and high-availability controllers, both of which are used to maintain a desired number of containers. Finally, the sixth category, auto-scaling, allows containers to be added or removed automatically. The details of the classification are explained in Sect. 3.2.

3.2 Analysis of Container Orchestrator Management

This section reviews container orchestration management concepts and focuses on the algorithms used in current work. Analyzing and understanding current methods is essential for developing further valid methods and systems; a new algorithm can improve the current methods or draw benefits from earlier studies [7].

1) Resource Control
Effective scheduling decisions can be made using resource limit control, which reserves a fixed amount of CPU, memory and volume for a container. The Dynamic and Resource-Aware Placement Scheme (DRAPS) targets distributing containers by finding the best available worker node with the best resources, considering the different resource demands of containers [8].
Dynamic CPU allocation in a cloud environment is useful when several con-
tainers share a single resource. An adaptive control scheme is used to decrease the
overutilization of resources and decrease execution time because of caching and
synchronization conflicts [9].
The MEER system is proposed to perform an online estimation of the minimum memory reservation that attains close-to-optimal performance. Using memory reservations, memory over-provisioning is separated from under-provisioning, and containers are permitted to run with a smaller memory reservation than they request [10].
Containers share the host OS to decrease their memory footprint and performance overheads. However, ordinary operating systems are limited in how they can manage a container's memory, since they lack the essential concepts and methods to precisely track and isolate a container's memory footprint. The Container-Level Address Space (CLAS) concept is proposed to encapsulate and track a container's memory footprint across all relevant operations; CLAS provides customized memory management facilities to a container [11]. A sketch of per-container resource limits follows.
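As an illustration of resource limit control, the sketch below assumes the docker-py Python SDK (not discussed in the paper) to start a container with fixed CPU and memory reservations:

```python
# A hypothetical sketch using the docker-py SDK: reserving a fixed amount
# of CPU and memory for a container so the scheduler can rely on limits.
import docker

client = docker.from_env()                 # talk to the local Docker daemon
container = client.containers.run(
    "nginx:latest",                        # any image; nginx is illustrative
    detach=True,
    mem_limit="256m",                      # hard memory limit
    nano_cpus=500_000_000,                 # 0.5 CPU (in units of 1e-9 CPUs)
)
print(container.short_id, container.status)
```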
2) Scheduling
Scheduling is the method of placing a service task on a node in the cluster so as to maintain the desired state of a service. Without a proper scheduling policy, resources will be under- or over-utilized and single points of failure will arise. Scheduling algorithms have been designed to utilize resources efficiently and to maximize application performance. Scheduling can be done using resource constraints as well as node affinity.
In past years, many scheduling algorithms have been proposed for the management of cloud computing resources. An energy-aware scaling algorithm considers energy usage as an essential criterion while scheduling the load. Jobs are not allocated to lightly loaded containers, which are switched off to prevent energy waste. Furthermore, spawning a new container guarantees that the current containers do not receive so many tasks that a process becomes overloaded. This gives a high degree of consistency in load balancing [12].
An ant colony algorithm has been developed to solve the multi-objective container scheduling problem. It focuses on the resource constraints of physical machines and aims to reduce network transmission overhead, balance the load on resources and improve services [13].
Stable matching theory was proposed to find the best mapping from containers to host servers. This resource scheduling approach decreases the customer's job response time and increases the provider's resource utilisation rate. Physical machines are free to choose the best containers in order to reduce wasted resource fragments [14].
A new cluster scheduler called Stratus is dedicated to orchestrating batch job execution on virtual clusters. It dynamically allocates collections of virtual machine instances to a job for execution. Jobs are packed tightly, taking possible packings and instance types into account. The Stratus scheduler avoids highly utilized leased machines and uses task migration to remove underutilized instances [15].
Multiopt is a new container scheduling algorithm based on a multi-objective optimization technique. It aims to minimize the time consumed in transmitting an image over the network and considers the association between nodes and containers. The needs of the business are fulfilled by combining the advantages of the spread, binpack and random algorithms [16].
A particle swarm optimization (PSO) based container scheduling algorithm addresses multi-objective optimization. Its primary focus is to solve the problems of load balancing and resource utilization. The PSO algorithm allocates container applications on Docker physical machines so as to fully utilise every resource of a node. Compared to the current Swarm scheduler, which uses the spread and random strategies, the PSO algorithm improves the application's performance [17].
3) Load Balancing
Load balancing of containers helps to manage the load of a container service application efficiently. Containers run on a common platform called Docker, which provides facilities like management, distribution and packaging of container applications.
Container load balancing also helps to keep a running application secure and efficient. The advantages of efficient load balancing are balanced distribution, better visibility and security. The default strategy is round-robin (sketched below); the default can also be replaced with custom load balancing. Kubernetes, Docker Swarm, Apache Mesos, Apache Marathon and YARN are container orchestration software systems that use different load balancing algorithms.
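A minimal sketch of the round-robin idea (instance names are illustrative): requests are handed to container instances in circular order.

```python
# A sketch of round-robin load balancing across container instances.
import itertools

instances = ["container-a", "container-b", "container-c"]
rr = itertools.cycle(instances)          # circular iterator over instances

for request_id in range(5):
    print(f"request {request_id} -> {next(rr)}")
```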
Ingress load balancing in Docker Swarm is used to distribute service load even if a service runs on only one node. Nodes in a cluster that are not running a task for a service can still accept connections on its published port, possibly via external components such as cloud load balancers, and the ingress connection is routed by the swarm's nodes to an instance that is running.
The internal load balancer and the external load balancer are the two types of load balancer in Kubernetes. A strategy called dynamic balance was designed to support web systems; it focuses on achieving high performance in terms of concurrency and availability, and functions like automatic management and monitoring of the system are appropriately implemented [18].
4) Health Checking and Fault Tolerance
Health checking is used to verify that a container can handle requests and to test whether the container is working or not. Fault tolerance can be achieved using a high-availability controller or a replica control method. Replica control keeps a set number of containers running (a sketch of this reconciliation idea follows). High-availability controllers are used to keep an application under control if its host machine fails or is highly loaded; they can also be used to implement a scalable controller. The Raft algorithm, with a Docker container running on each host machine, is capable of dynamically creating replicas and permits clients to send requests to any Raft replica [19].
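A minimal sketch of the replica-control idea (an illustration, not the Raft or Kubernetes implementation): compare the desired replica count against the running replicas and restore the difference.

```python
# A sketch of a replica controller's reconciliation loop.
def reconcile(desired, running):
    """Return the actions needed to restore the desired replica count."""
    if len(running) < desired:
        return [("start", f"replica-{i}") for i in range(len(running), desired)]
    if len(running) > desired:
        return [("stop", name) for name in running[desired:]]
    return []                                  # already at the desired state

print(reconcile(3, ["replica-0"]))             # two replicas failed: restart them
print(reconcile(3, ["r0", "r1", "r2", "r3"]))  # one too many: stop it
```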

The availability of container instances can be improved using replication of containers in Kubernetes. Kubernetes can also recreate a container from a predefined image when a container fails; the failed container itself is not restored. The state of applications is maintained using external volumes, which also protects the volumes against failures. Moreover, concurrency with respect to volumes and state replication is handled by the application. In [20], the authors presented an integrated coordination scheme in the Kubernetes system: the shared memory component of Kubernetes is used to guide coordination, and DORADO (ordering over shared memory) was created for state replication.
In [21], the authors suggest an availability-aware scheduling algorithm for containers in cloud computing. It proposes a new strategy to select the best node for a container based on constraints specified in the application, which also increases application availability.
5) Auto Scaling
Auto-scaling is used to add and remove containers in Docker automatically. It is implemented with policies that are either custom or threshold-based (on CPU or memory utilization), as sketched below.
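A minimal sketch of a threshold-based policy (the thresholds and replica counts are illustrative): add a replica when utilization exceeds an upper threshold, remove one when it falls below a lower threshold.

```python
# A sketch of threshold-based autoscaling on CPU utilization.
def autoscale(replicas, cpu_util, upper=0.8, lower=0.3,
              min_replicas=1, max_replicas=10):
    """Return the new replica count for the observed CPU utilization."""
    if cpu_util > upper and replicas < max_replicas:
        return replicas + 1          # scale out
    if cpu_util < lower and replicas > min_replicas:
        return replicas - 1          # scale in
    return replicas                  # within thresholds: keep as is

replicas = 2
for util in (0.9, 0.85, 0.5, 0.2):   # simulated utilization samples
    replicas = autoscale(replicas, util)
    print(f"util={util:.2f} -> replicas={replicas}")
```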
Energy-Based Auto-Scaling (EBAS) is a resource auto-scaling method that scales CPU resources in terms of frequency and number of cores. It dynamically adjusts CPU frequencies using the Dynamic Voltage and Frequency Scaling (DVFS) method [22].
In [23], the author described the ATHENA system, a defence-oriented system developed using stateful and stateless Docker-based separation and consolidation scaling for deployment across the cloud, and explained the autoscaling of pods using Kubernetes. All parts of the ATHENA defence-oriented system are live and functional; nowadays, the platform serves refined training needs, submarine personnel and offshore/onshore constraints.
In [24], the authors improved the automatic scaling algorithm for the flexibility and reliability of Unified Communication server deployments using Kubernetes; the system also introduces new services with strong resiliency.
In [25], the authors defined a multi-objective optimization problem as a four-fold auto-scaling decision problem and developed a control architecture for provisioning VMs and containers that adjusts dynamically and elastically. Using a prototype implementation based on the IBM CPLEX optimization solver, they compared it against two naive scaling strategies.
In [5], the authors discussed the KHPA (Kubernetes Horizontal Pod Autoscaling) algorithm, which uses absolute metrics that permit proper control of the application's response time, keeping it below the threshold set by service-level objectives (Table 2).

Table 2. Summary of the comparison of related works on container orchestration management algorithms. Each work is listed with its objective and the technologies considered; the original table additionally marks which of resource control, scheduling, load balancing, health check & fault tolerance, and autoscaling each work addresses.

Work | Objective | Technology
Y. Mao et al. [8] | CPU/Memory | Swarm, Kubernetes
J. Monsalve et al. [9] | CPU | Swarm, Kubernetes
G. Xu et al. [10] | Memory | Swarm, Kubernetes
T. Li et al. [11] | Memory | Swarm, Kubernetes
M. Sureshkumar et al. [12] | Energy | Swarm, Kubernetes, Marathon, Cloudify
M. Lin et al. [13] | Network transmission load balancing | Swarm, Kubernetes, Marathon, Cloudify
X. Xu et al. [14] | Response time | Swarm, Kubernetes, Marathon, Cloudify
A. Chung et al. [15] | Resource utilization | Swarm, Kubernetes, Marathon, Cloudify
B. Liu et al. [16] | Minimize time consumption | Swarm, Kubernetes, Marathon, Cloudify
L. Li et al. [17] | Resource utilization | Swarm, Kubernetes, Marathon, Cloudify
W. Ren et al. [18] | Resource availability | Swarm, Kubernetes, Marathon
M. Rusek et al. [19] | Resource availability | Swarm, Kubernetes, Marathon, Cloudify
H. V. Netto et al. [20] | Resource availability | Swarm, Kubernetes, Marathon, Cloudify
E. J. Alzahrani et al. [22] | CPU resources | Swarm, Marathon
E. Casalicchio et al. [5] | Response time | Swarm, Kubernetes

4 Challenges and Issues


After studying container orchestration management, some open challenges and issues
were identified. They are presented below, followed by a discussion on the significance
to address each one:

1. Service Availability for Containerized Applications: Containers should be appropriately placed so that the end-user gets a higher level of service availability without violating the SLA. The strategy should assign containers to the machines that have higher availability values in order to increase service application availability. A placement strategy needs to search all possible combinations of container placements on the machines and then select the best placement solution for service availability, while decreasing the number of running containers to balance the host machines' load in terms of CPU and memory.
2. Network Dependencies: The containers of a distributed microservice often have strong network dependencies due to data communication. Placing dependent containers on different nodes can increase communication latency and waste network resources, which also has to be considered in the deployment phase.
3. Logging and Monitoring: In the cloud, many services run simultaneously, and manually checking every service for problems is not feasible given the number of servers and resources involved. Cloud application workloads also differ from container workloads, so continuous monitoring of the host machines running container applications is required. The challenge is to monitor and control your cloud resources in order to truly reap the benefits: no matter the size of your deployment, you still need to know the health of your resources at any given time.
4. Security: Containers make it possible to deliver applications rapidly. Multiple containers are required to deploy a microservice application on the same host or distributed across hosts. Containers are based on the concept of sharing the host machine's kernel, so malicious containers can leak information about other containers on the same host machine.
5. Aligning Auto-Scaling with Workload Demand: Auto-scaling groups can horizontally scale in and out based on actual workload requirements. However, it can be tricky to configure the node type and size and the scaling parameters: you will need to interpret raw utilization data and predictively decide what and how to scale.

5 Conclusion and Future Scope

Orchestration systems responsible for handling containerized applications in the cloud computing environment have been discussed. In a cloud computing architecture, a microservices-based application involves multiple microservices deployed, updated, and redeployed on a lightweight virtualization technology called the Docker container rather than on hypervisor-based virtualization. For a better understanding of container orchestration systems, we first studied the container orchestration framework, which has three main components: web application and service, containerized cluster master, and physical machine. We then classified container orchestration management, with a detailed analysis of each parameter. The majority of recent research still concentrates on the management of virtual machines in cloud computing. We identified several challenges and issues, such as service availability of containerized applications, network dependencies, monitoring, and security, for container orchestration management in today's cloud environments. In the future, we will work on the rescheduling of containers and analyze the availability of hosts for containerized applications.

Locally Weighted Mean Phase Angle (LWMPA) Based Tone Mapping Quality Index (TMQI-3)

Inaam Ul Hassan1, Abdul Haseeb2, and Sarwan Ali3(B)

1 Department of Computer Science, Lahore University of Management Sciences, Lahore, Pakistan
16030050@lums.edu.pk
2 Department of Computer Science, CECOS University, Peshawar, Pakistan
3 Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA
sali85@student.gsu.edu

Abstract. High Dynamic Range (HDR) images are the ones that contain a greater range of luminosity as compared to standard images. HDR images have higher detail and clarity of structure, objects, and color, which standard images lack. HDR images are useful in capturing scenes that pose high brightness, darker areas, and shadows. An HDR image comprises multiple narrow-range-exposure images combined into one high-quality image. As these HDR images cannot be displayed on standard display devices, the real challenge comes while converting these HDR images to Low Dynamic Range (LDR) images. The conversion of an HDR image to an LDR image is performed using tone mapping operators (TMOs). This conversion results in the loss of much valuable information in structure, color, naturalness, and exposure. The loss of information in the LDR image may not be directly visible to the human eye. To calculate how good an LDR image is after conversion, various metrics have been proposed previously. Some are not noise resilient, some work on color channels separately (Red, Green, and Blue one by one), and some lack the capacity to identify the structure. To deal with this problem, we propose a metric in this paper called the Tone Mapping Quality Index (TMQI-3), which evaluates the quality of the LDR image based on its objective score. TMQI-3 is noise resilient, takes account of structure and naturalness, and works on all three color channels combined into one luminosity component. This eliminates the need to use multiple metrics at the same time. We compute results for several HDR and LDR images from the literature and show that our quality index metric performs better than the baseline models.

Keywords: Tone mapping · HDR · LDR · Mean phase · Objective quality assessment · Tone mapping operator

I. U. Hassan and A. Haseeb—Joint first authors.



1 Introduction
A picture taken from a camera is a combination of two components, the luminance component and the chrominance component. The visibility of the chrominance component is highly dependent on the intensity of white light presented by the luminance component [15,27,29]. A normal picture consists of shadows and brighter portions. Sometimes, due to the brighter parts, the detail in the shadow portions becomes very low, and in the worst cases, the objects in that portion of the image become nearly invisible. Such an image is termed a Low Dynamic Range (LDR) image [24]. A way of balancing the bright and shadow areas, such that both are visible clearly and no detail is missed, is called High Dynamic Range (HDR) Imaging [14].
According to [19], when we capture a high-contrast scene with a camera, either the dark regions or the bright regions are saturated in the output image. This occurs because the camera sensors have a very limited capacity to capture luminosity in shadowed or over-bright areas; conventional cameras cannot reproduce a scene the way the human eye perceives it. HDR photography addresses this limitation. Traditionally, an HDR photograph is created by combining several LDR images. The prime issue, however, is that most devices are incapable of displaying HDR images. Therefore, it is important to convert an HDR image to an LDR image without losing information.
The process of creating HDR images is done by capturing multiple shots (3 or more) at various exposures and combining them into one better image. High-dynamic-range imaging (HDRI) [26,30] is a technique in which images are processed or captured with a higher amount (or level) of luminosity. This higher dynamic range of luminosity cannot be achieved through standard digital imaging. To capture an HDR image, several narrow-range-exposure images are taken and combined into one. A limited exposure range mainly results in the loss of important information such as highlights and shadows. The method of capturing HDR images is quite common and mostly the same across techniques. Still, the significant issue is the representation of those high-contrast, high-dynamic-range images on our typical day-to-day devices, which have very limited display capabilities and can display LDR images only. Another crucial part of HDR image rendering and displaying is tone mapping [22,32].
Tone mapping corresponds to rendering an HDR image on a standard monitor or printing device. This process is carried out by tone mapping operators (TMOs). Rendering HDR to LDR is crucial because an HDR image cannot be directly displayed on a standard display or printing device due to its high contrast and colour ratios. Different TMOs have been proposed previously for the conversion of HDR to LDR [17,20,21,32]. A natural question that arises concerns the quality of the images resulting from applying TMOs to them. Two types of evaluation can be applied to the images, namely subjective evaluation and objective evaluation. Subjective evaluation is done using human eyes, while objective evaluation is done by analyzing the actual structure of the images.
HDR contains a full noticeable scope of luminance and shading of the images
[34]. However, while converting images from HDR to LDR (using TMOs), there
is a possibility of missing essential image structures in the resultant LDR image


[35]. It is challenging for human eyes to catch these structural problems in the
LDR image (called subjective evaluation). This problem creates room for an
objective image quality measure of the tone mapped images so that the overall
quality of an image can be measured in detail.
The tone mapping quality index (TMQI) is used to objectively measure the quality of the images (called objective evaluation) converted from HDR to LDR by the TMOs. Previously, different objective quality measures have been proposed, such as TMQI-1 [35], TMQI-2 [21], and the Feature Similarity Index for Tone-Mapped Images (FSITM) [25]. Although FSITM shows better results as compared to TMQI-1 and TMQI-2, the major problem with it is that it works with the Red, Blue, and Green channels separately, rather than combined. We propose a new approach, called TMQI-3, that overcomes the limitations of FSITM. Our contributions in this paper are the following:

1. We study different tone mapping operators and through experiments we identify the strengths and weaknesses of each operator.
2. We propose an algorithm for objective quality assessment of images called the tone-mapped image quality index (TMQI-3).
3. Our proposed quality index algorithm shows good correlations between the subjective ranking scores of the images and the objective ranking scores computed using TMQI-3.

2 Literature Review

Objective assessment is a common approach to analyse the output of algorithms in many domains, such as graph analytics [1,11], protein sequence study [4,5,9,10], smart grid [6–8], information processing [2], network security [3], and pattern recognition [33]. In the image quality assessment domain, instead of operating on the original HDR pixel values, almost every tone mapping algorithm performs the task on the logarithm of the luminance values of the HDR pixels. Olivier Lezoray in [19] proposes to use a manifold-based ordering in which the mapping of pixel values is more non-linear than a log-luminance curve. The proposed method builds a new HDR image representation (IR) by learning the manifold of HDR image pixels. The IR is in the form of an index image, which is associated with an ordering of the HDR pixel vectors. It is done in the following manner:

1. Ordering of the HDR pixel vectors
2. Construction of the new depiction of the HDR image according to the ordering obtained in the first step

Authors in [23] propose a generic tone mapping algorithm which can be used in the black-box analysis of existing TMOs, backward compatibility of HDR image compression, and the synthesis of new algorithms that are a combination of
existing operators. Their paper shows that the behaviour of most TMOs can be estimated by one model comprising a tone curve accompanied by a spatial modulation function. Moreover, nearly the same image processing techniques are used by approximately all TMOs; only the selection of parameters varies.
In [13], Eilertsen et al. present a broad comparison between different TMOs and point out the drawbacks of video TMOs. Their prime focus is a descriptive analysis of the recent changes and evolution in tone mapping pipelines. They also devise a new, generic tone mapping algorithm that can best suit the needs of future HDR videos.
In [21], Ma et al. introduce TMOs that compress high dynamic range (HDR) pictures to low dynamic range (LDR) so that HDR photos can be viewed on standard displays. They propose a substantially different way to design a TMO: rather than using any predefined systematic computational structure for tone mapping, they directly search the space of all images for the image that achieves an enhanced TMQI. Specifically, they first enhance the two building blocks of TMQI, the structural fidelity and statistical naturalness components, leading to a TMQI-2 metric. They then propose an iterative algorithm that alternately enhances the structural fidelity and statistical naturalness of the next image. Numerical and subjective tests show that the proposed approach reliably delivers better-quality tone-mapped images, even when the most competitive TMOs produce the initial images of the iteration. These outcomes also validate the superiority of TMQI-2 over TMQI-1.
In [25], the authors suggest a feature similarity index for tone-mapped images (FSITM) that works on the local phase information of images. For assessing a tone mapping operator (TMO), the suggested index compares the locally weighted mean phase angle map of the tone-mapped image produced by the TMO with that of the original high dynamic range (HDR) image. For the experiments, they take two sets of images; the results show that FSITM outperforms the tone-mapped quality index (TMQI) algorithm. Furthermore, they combine FSITM with TMQI and show better results compared to the typical TMQI.
Authors in [17] introduce a new hybrid method combining local and global tone mapping operators. Several images are amalgamated into a single HDR image, which results in an enhanced HDR image. An enhancement map is constructed either with a threshold value or with the luminance value of each pixel. Using the enhancement map, the original luminance map is separated into a base layer and a detail layer by running bilateral filtering (a noise-reducing filter for images). The detail layer is used to enhance the result of global tone mapping. The performance of hybrid tone mapping is then compared to the individual local and global operators, and the results show that the hybrid operator gives better performance.

3 Proposed Approach
This section presents our algorithm, TMQI-3, for objectively evaluating LDR images (i.e., the performance of TMOs). In the literature, there are two popular types of models for measuring the quality of images:

1. Peak Signal-to-Noise Ratio (PSNR)
2. Structural Similarity Index Metric (SSIM)

However, the above two quality measures assume that the reference and compared images have the same dynamic range. Since that assumption is not valid for LDR images, we cannot directly apply these models in our research.

Definition 1. TMQI-1 assesses the quality of individual LDR images by combining an SSIM-motivated structural fidelity measure with statistical naturalness. The expression for TMQI-1 is given in Eq. (1).

TMQI-1(x, y) = a · S(x, y)^α + (1 − a) · N(y)^β   (1)

S represents structural fidelity, N represents statistical naturalness, x represents the HDR image, and y represents the LDR image. The exponents α and β determine the sensitivities of the two components, and a adjusts their relative importance. The range of the parameter a is the following:

0 ≤ a ≤ 1   (2)

Remark 1. Note that the parameters structural fidelity and statistical natural-
ness are upper bounded by 1. Therefore, TMQI-1 is also upper bounded by 1.

To the best of our knowledge, TMQI-1 was the first approach for measuring an LDR image's quality across the dynamic range. It provides a better assessment of LDR images (compared to the traditional methods discussed above) resulting from applying a TMO on the HDR image. However, TMQI-1 has the following limitations:

1. It can only be applied to greyscale images. However, most HDR images nowadays are in colour.
2. The statistical naturalness measure used in TMQI-1 is based on intensity statistics only. However, many sophisticated statistical models in the literature can also capture other properties of the image, such as structural regularities in space, scale, and orientation.

Remark 2. The problem of TMQI-1 only working with the greyscale images can
be solved by applying the TMQI-1 on each colour channel separately. Although
this may allow TMQI-1 to work on colour images, its performance will not be
very good [35].

We propose a new approach in which we combine existing techniques differently to achieve better results, in terms of subjective evaluation, than both TMQI-2 [21] and FSITM [25], which are the approaches we use as references for our research. Our proposed method (TMQI-3) takes into account the following properties of the images:
1. Structural fidelity
2. Statistical naturalness
3. The locally weighted mean phase angle of the image
By combining the properties mentioned above, we can get a better objective score for the input images. We now discuss all the above properties one by one.

3.1 Improved Structural Fidelity

The structural fidelity [21] of TMQI-1 can be computed using a sliding window across the whole image. This process results in a quality map, hence preserving the local structural detail of the image. The local structural fidelity measure given by TMQI-1 is given in Eq. (3).

Slocal(x, y) = [(2σ̂x σ̂y + C1) / (σ̂x² + σ̂y² + C1)] · [(σxy + C2) / (σx σy + C2)]   (3)
where σx and σy denote the local standard deviations (std), respectively.
The σxy denotes the covariance between two corresponding patches. The two
(positive) constant terms C1 and C2 are used to avoid any possible instability.
Overall structural fidelity is calculated using Eq. (4).

S(X, Y) = (1/M) Σ_{i=1}^{M} Slocal(xi, yi)   (4)

The updates in structural fidelity are done through the gradient ascent method stated in Eq. (5).

Ŷk = Yk + λ∇Y S(X, Y)|Y=Yk   (5)

where Yk is the image resulting from the k-th iteration and λ is the step size. This works as the contrast visibility model for the local luminance model. A minimal computational sketch of Eqs. (3) and (4) is given below.
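To make the sliding-window computation concrete, the following is a minimal Python sketch of Eqs. (3) and (4). It is a sketch under stated assumptions: the window is a simple uniform filter, the constants C1 and C2 are illustrative placeholders, and the raw local standard deviations are used where [21] applies an additional nonlinear visibility mapping (the hatted σ terms); our own implementation is in Matlab.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def structural_fidelity(x, y, win=11, C1=0.01, C2=10.0):
    # x: HDR luminance map, y: LDR luminance map (same shape).
    # win, C1 and C2 are illustrative placeholders, not the TMQI constants.
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    mu_x, mu_y = uniform_filter(x, win), uniform_filter(y, win)
    # Local variances and covariance over the sliding window.
    var_x = uniform_filter(x * x, win) - mu_x ** 2
    var_y = uniform_filter(y * y, win) - mu_y ** 2
    cov_xy = uniform_filter(x * y, win) - mu_x * mu_y
    sd_x = np.sqrt(np.maximum(var_x, 0.0))
    sd_y = np.sqrt(np.maximum(var_y, 0.0))
    # Eq. (3): local quality map (raw stds used in place of the mapped ones).
    s_local = ((2.0 * sd_x * sd_y + C1) / (sd_x ** 2 + sd_y ** 2 + C1)) \
            * ((cov_xy + C2) / (sd_x * sd_y + C2))
    return float(s_local.mean())  # Eq. (4): mean over all local windows
```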

3.2 Improved Statistical Naturalness

The model for statistical naturalness [21] proposed in TMQI-1 is given in Eq. (6).

N(Y) = (1/K) · Pm · Pd   (6)

where Pm represents a Gaussian density function, Pd represents a Beta density function, and K is a normalization factor. However, Eq. (6) has the following limitations.

1. The Gaussian density function and the Beta density function are considered independent of the image content, which may not entirely be true.
2. The model for statistical naturalness is derived from high-quality images while having no information regarding how an unnatural image may look.

The updates above can be abstractly defined by the equations below:


Pm = 1/(√(2π) θ1) ∫_{−∞}^{μ} exp(−(t − T1)² / (2θ1²)) dt,   μ ≤ μe   (7)

Pm = 1/(√(2π) θ2) ∫_{−∞}^{2μr − μ} exp(−(t − T2)² / (2θ2²)) dt,   μ > μe   (8)

Pd = 1/(√(2π) θ3) ∫_{−∞}^{σ} exp(−(t − T3)² / (2θ3²)) dt,   σ ≤ σe   (9)

Pd = 1/(√(2π) θ4) ∫_{−∞}^{2σr − σ} exp(−(t − T4)² / (2θ4²)) dt,   σ > σe   (10)

Remark 3. Note that the acceptable luminance changes saturate at both small
and large luminance levels without significantly tampering with the image visual
naturalness [21].
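Since the integrand in Eqs. (7)–(10) is a Gaussian density, each Pm and Pd value is simply a Gaussian cumulative distribution function evaluated at the image mean μ or standard deviation σ (mirrored about μr or σr on the upper branch). The following Python sketch illustrates this; every numeric constant is a hypothetical placeholder, since the fitted values of T, θ, μe, μr, σe, and σr come from [21] and are not reproduced here.

```python
import numpy as np
from scipy.stats import norm

# Placeholder constants (NOT the fitted values of [21]).
MU_E, MU_R = 115.0, 130.0          # mean threshold / reflection point
SIGMA_E, SIGMA_R = 28.0, 32.0      # std threshold / reflection point
T = (115.0, 115.0, 28.0, 28.0)     # T1..T4 in Eqs. (7)-(10)
THETA = (30.0, 30.0, 10.0, 10.0)   # theta1..theta4 in Eqs. (7)-(10)

def statistical_naturalness(y, K=1.0):
    """Sketch of Eq. (6): N(Y) = (1/K) * Pm * Pd via Eqs. (7)-(10)."""
    mu, sigma = float(np.mean(y)), float(np.std(y))
    if mu <= MU_E:                                        # Eq. (7)
        p_m = norm.cdf(mu, loc=T[0], scale=THETA[0])
    else:                                                 # Eq. (8)
        p_m = norm.cdf(2 * MU_R - mu, loc=T[1], scale=THETA[1])
    if sigma <= SIGMA_E:                                  # Eq. (9)
        p_d = norm.cdf(sigma, loc=T[2], scale=THETA[2])
    else:                                                 # Eq. (10)
        p_d = norm.cdf(2 * SIGMA_R - sigma, loc=T[3], scale=THETA[3])
    return (1.0 / K) * p_m * p_d
```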

3.3 Use of Mean Phase Angle of Local Weights

An image quality measure based on the local phase information of an image is proposed in [25]. Their model is noise independent; hence, no parameter is required for noise estimation.
Remark 4. Note that multiple methods related to the quality assessment of images using phase information have already been proposed in the literature [12,16,25,28,31,36].
One drawback of the methods proposed in the literature that consider phase information is that their results for evaluating tone-mapped images are not reliable compared to other well-known quality assessment metrics like SSIM. The Locally Weighted Mean Phase Angle (LWMPA) is robust to noise and is computed using the following expression, according to [25].
 
ph(x) = arctan2( Σ_{ρ,r} e_{ρ,r}(x), Σ_{ρ,r} o_{ρ,r}(x) )   (11)

where e_{ρ,r}(x) and o_{ρ,r}(x) are the even- and odd-symmetric filter responses, ρ represents the scale, and r represents the orientation of the image [25]. The values of the pixels (vph) of ph(x) are the following:

−π/2 ≤ vph ≤ +π/2   (12)

The value −π/2 in Eq. (12) represents a dark line, while +π/2 represents a bright line. The pixels of ph(x) take the value 0 at step edges. For further detail, readers are referred to [18,25].
The ph(x) in Eq. (11) provides a good representation of image features. This
representation includes the edges of the objects within the image and the shapes
of those objects. From Eq. (12), we know that ph(x) represents both dark and
bright lines. Therefore, it can be used to identify colours within the image. This
colour detection is a useful property for a TMO.
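For concreteness, the sketch below computes a local phase angle map in the spirit of Eq. (11). It sums even and odd quadrature responses over a few isotropic log-Gabor scales and obtains the odd responses from the Riesz transform (the monogenic signal); the oriented log-Gabor filter bank of [25] is replaced here by this isotropic variant, and the wavelengths and bandwidth parameter are illustrative assumptions.

```python
import numpy as np

def lwmpa(img, wavelengths=(4, 8, 16), sigma_onf=0.55):
    """Sketch of Eq. (11): phase = arctan2(sum of even, sum of odd)."""
    img = np.asarray(img, dtype=np.float64)
    rows, cols = img.shape
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0                      # avoid log(0) / div-by-0 at DC
    riesz = (1j * fx - fy) / radius         # combined Riesz-transform kernel
    F = np.fft.fft2(img)
    even = np.zeros((rows, cols))
    odd = np.zeros((rows, cols))
    for wl in wavelengths:
        # Isotropic log-Gabor transfer function centred at frequency 1/wl.
        lg = np.exp(-np.log(radius * wl) ** 2 / (2 * np.log(sigma_onf) ** 2))
        lg[0, 0] = 0.0
        even += np.fft.ifft2(F * lg).real            # even-symmetric response
        odd += np.abs(np.fft.ifft2(F * lg * riesz))  # odd response magnitude
    # Pixels near +pi/2 mark bright lines, near -pi/2 dark lines, 0 steps.
    return np.arctan2(even, odd)
```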
Remark 5. Note that LWMPA has the property of ignoring noise in the image, which is not the case with TMQI-1, which uses phase-derived features.
FSITM works on the Locally Weighted Mean Phase Angle (LWMPA), but it operates on the Red, Blue, and Green channels separately, not on all of them combined. This is a significant issue that we noticed in FSITM: the image should be judged as a whole, according to the sensitivity of the human eye, not on separate channels only. Combining the locally weighted mean phase angle with structural fidelity and statistical naturalness requires it to be represented in luminance, because structural fidelity and statistical naturalness operate on the luminance of the image. So, to combine the LWMPA with the others, its quality index score should be mapped to the luminance component too, since the human eye is much more responsive to luminance than to RGB separately. To solve the problem that we identified in FSITM, we use the YUV model. The equation of 'Y' is used to connect Red, Green, and Blue to luminance. We use this equation for the following reasons.
1. The conversion of RGB to Y requires just a linear transform, which is very easy to do and cheap to compute numerically.
2. 'Y' is perceived as brightness, to which the human eye is more sensitive. Luminance gives a measure of the amount of energy an observer perceives from the light source. That is why separate RGB sensitivity values are used in Y.
The equation for Y is the following:

Y = 0.299 ∗ R + 0.587 ∗ G + 0.114 ∗ B (13)


TMQI-1, TMQI-2, and FSITM rely on different methods to evaluate an image. Structural fidelity and statistical naturalness are considered in TMQI-2, while LWMPA is considered in FSITM. Separately, these methods are suitable to assess some types of image features (but not an image as a whole), so they should be combined into one quality assessment model. For this purpose, we combine all of these to design our TMQI-3 and give equal weight to each component, as given below:

Q = (1/3) ∗ N + (1/3) ∗ F + (1/3) ∗ L (14)


Where N, F, L denote statistical naturalness, structural fidelity, and LWMPA,
respectively.
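A minimal end-to-end sketch of the combination is given below. rgb_to_luma implements Eq. (13) and tmqi3 implements Eq. (14), reusing the structural_fidelity, statistical_naturalness, and lwmpa sketches from the previous subsections; the phase term, which scores agreement between the sign labels of the two phase maps, is our illustrative stand-in for the FSITM-style phase comparison of [25], not its exact formulation.

```python
import numpy as np

def rgb_to_luma(img):
    """Eq. (13): map the R, G, B channels to one luminance component."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def tmqi3(hdr_rgb, ldr_rgb):
    """Eq. (14): equal-weight combination of the three components."""
    x = rgb_to_luma(np.asarray(hdr_rgb, dtype=np.float64))
    y = rgb_to_luma(np.asarray(ldr_rgb, dtype=np.float64))
    F = structural_fidelity(x, y)
    N = statistical_naturalness(y)
    # Fraction of pixels whose phase labels (dark/step/bright) agree.
    L = float(np.mean(np.sign(lwmpa(x)) == np.sign(lwmpa(y))))
    return (N + F + L) / 3.0
```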

The improvements in structural fidelity and statistical naturalness are already provided in TMQI-2, but it lacks noise resilience, as do SSIM and other metrics. These improvements give better results due to the gradient ascent method in fidelity and the solving of a parameter optimization problem for the point-wise intensity transformation in statistical naturalness. Further noise reduction can be achieved using the locally weighted mean phase angle map, because it is robust and error resilient. By combining the improvements in structural fidelity and statistical naturalness with the phase angle map, we improve the assessment quality and hence enhance our quality index.
Our metric takes the following three inputs.

1. An HDR image, used as a reference
2. The LDR image being compared (either a colour or a greyscale image with its dynamic range equal to 255)
3. A local window for statistics

The default window for our approach is Gaussian. The quality metric first computes the structural fidelity and statistical naturalness of the image using the method proposed by [21]. Then it computes the quality index for R, G, and B separately by the method provided in [25]. The main issue comes at the point of combining these three. For this purpose, the RGB quality index values are mapped to luminance quality index values, as stated in the previous paragraphs; they are combined according to the sensitivity of the human eye to the RGB values. After that, the three components are combined with equal weight given to each.
Combining these three different approaches comes from the fact that each method checks the image from a different perspective. Structural fidelity focuses upon the visibility of image details, which further depends upon the sampling density of the image, the distance between the image and the observer, the resolution of the display, and the perceptual capability of the observer's visual system. For statistical naturalness, the studies in [35] show that among all attributes, such as brightness, contrast, colour reproduction, visibility, and reproduction of details, brightness and contrast have more correlation with perceived naturalness. The real challenge of combining these three was due to the LWMPA from FSITM: the image is not observed based on chrominance, luminance (brightness), or contrast, but based on the three primary colours only, that is, Red, Green, and Blue, and the quantities of these three in an image. LWMPA is not susceptible to noise or error, which makes it more robust in terms of quality. Combining these three results in a quality metric with the capability of capturing luminance, contrast, structure, naturalness, and the phases of the image.
In this way, we effectively distinguish the visible and invisible local contrast in images and provide a good representation of image features, indicating the changes in colours and the dark or bright lines.

4 Results and Discussion


In this section, we first give information about the dataset and other implementation details. Then we show our results and discuss how the behaviour of our proposed algorithm compares to the baselines.

4.1 Experimental Setup

To evaluate our metric, we used the dataset proposed in [35]. It contains 15 HDR images, each with its 8 associated LDR images. The LDR images are subjectively scored in the range of 1 to 8. A subjective score of 1 means the image is best converted from HDR to LDR, and a score of 8 means the worst conversion. The subjective scores were obtained based on an assessment by 20 individuals. The implementation of our algorithm is done in Matlab, and the experiments are performed on an Intel Core i3 2.4 GHz machine with 8 GB of RAM.
The evaluation metric that we are using is Kendall's rank-order correlation coefficient (KRCC) [35]. KRCC is a non-parametric rank correlation metric whose formula is the following:

KRCC = (Nc − Nd) / ((1/2) · N(N − 1))   (15)

where Nc is the number of consistent rank-order (concordant) pairs and Nd is the number of inconsistent rank-order (discordant) pairs in the data set.
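Assuming no tied ranks, Eq. (15) can be computed directly by counting pair orderings, as in the minimal sketch below; a tie-aware implementation is also available as scipy.stats.kendalltau.

```python
from itertools import combinations

def krcc(subjective, objective):
    """Eq. (15): Kendall's rank-order correlation (ties not handled)."""
    n = len(subjective)
    nc = nd = 0
    for i, j in combinations(range(n), 2):
        s = (subjective[i] - subjective[j]) * (objective[i] - objective[j])
        if s > 0:
            nc += 1   # concordant pair: same rank order in both lists
        elif s < 0:
            nd += 1   # discordant pair: opposite rank order
    return (nc - nd) / (0.5 * n * (n - 1))
```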

4.2 Kendall Correlation Based Results

Kendall’s correlation coefficient was run on the quality indices generated for
each method (TMQI-3, TMQI-2, TMQI-1, FSITM), and the results are shown
in Table 1. We can observe that the TMQI-3 process is comparable (also better
for some images) to the TMQI-2 approach. In reference to TMQI-1, the TMQI-3
performs better in many cases. In FSITM, TMQI-3 is better for image set 7 and
comparable for image set 1, 6, 11, and 15.
We can argue that although there is not any clear winner from Table 1.
FSITM looks to be better than the other methods. However, authors of the
FSITM in [25] argues that different TMO algorithms perform differently on
additional HDR images. Their behaviour depends on the (type of) HDR image
to be converted. From this uncertainty of the TMO’s, we can conclude that the
best. TMO approach must be found for each case (no TMO can be generalized
on all HDR images).

4.3 Visual Results on LDR Images

In this section, we explore the visual effect of different LDR images generated by different TMOs [25].

Table 1. KRCC values between subjective score and different TMQI’s score (higher
score is better). For each of the 15 HDR images, we have 8 LDR images. We computed
each LDR image’s objective score separately and then reported their average in this
table (for all techniques). The Average, Minimum, Maximum, and Standard Deviation
values are also reported.

Image set TMQI-3 TMQI-2 TMQI-1 FSITM
1 0.7857 0.7857 0.3571 0.7857
2 0.3571 0.2857 0.6429 0.7143
3 0.5714 0.5714 0.6429 0.7857
4 0.5000 0.5000 0.7143 0.7857
5 0.5000 0.5000 0.6429 0.6429
6 0.7857 0.7143 0.7143 0.7857
7 0.7857 0.7143 0.5714 0.7143
8 0.5714 0.5000 0.5714 0.6429
9 0.7143 0.7143 0.5714 0.8571
10 0.7857 0.7857 0.8571 0.8571
11 0.7143 0.7143 0.7143 0.7143
12 0.4286 0.4286 0.5714 0.5714
13 0.6071 0.6071 0.5357 0.6786
14 0.5714 0.5714 0.6429 0.6429
15 0.7857 0.7857 0.7857 0.7857
Average 0.6309 0.6119 0.6357 0.7309
Min 0.3571 0.2857 0.3571 0.5714
Max 0.7857 0.7857 0.8571 0.8571
Std 0.1402 0.1445 0.1138 0.0812

Indoor House Images: This first set of LDR images in Fig. 1 is of indoor houses generated from their corresponding HDR images. Referring to Table 2, we can identify that the worst subjective score is given to Fig. 1(b) (which is 5.95).
Remark 6. A higher subjective score (maximum 8) refers to a bad conversion from HDR to LDR, while a lower subjective score (minimum 1) refers to a better conversion.
We can observe that the best subjective score belongs to Fig. 1(c). For each resultant figure, TMQI-3, TMQI-2, TMQI-1, and FSITM are run to produce objective scores. The results can be seen in Table 2, where the correspondence between the objective and subjective scores can be observed.
Remark 7. The objective score of a metric should be low if the subjective score
is high and high if the subjective score is low.

The FSITM score for Fig. 1(b) should be less than that of Fig. 1(a) because Fig. 1(b) has a higher subjective score. However, in reality, FSITM gives the opposite. The same behavior is observed with TMQI-1 regarding Fig. 1(b) and Fig. 1(a). TMQI-2 and TMQI-3 perform as expected, giving a lower objective score to the figure with the higher subjective score and vice versa.

Table 2. Subjective and Objective score for different methods on indoor house LDR
image.

Figure Subjective score TMQI-3 TMQI-2 TMQI-1 FSITM
Figure 1 (a) 4.4 0.5484 0.4106 0.8016 0.8077
Figure 1 (b) 5.95 0.5194 0.3814 0.8799 0.8311
Figure 1 (c) 2 0.6645 0.5770 0.9191 0.8774

Fig. 1. Indoor house LDR images created using different TMOs [35]. The subjective
and objective scores for each image are given in Table 2.

Open Area Wide Shots Images: Figure 2 shows the LDR images of open-area wide shots computed by different TMOs from the corresponding HDR image. Fig. 2(a) has the highest subjective score, Fig. 2(b) has the lowest, and Fig. 2(c) is in the middle. Fig. 2(a) has the highest score because of its over-brightness and unclear image quality, and Fig. 2(b) has the lowest score because of its clear foreground and background. (As stated earlier, a lower subjective score means a better image, and a higher subjective score means a worse image.) The corresponding objective scores of the metrics should be inversely related to the subjective scores of each image. In this case, all the metrics provide the right objective scores. The objective and subjective scores for the figures are shown in Table 3.

Main Object Upfront Images: Figure 3 shows the LDR images produced from the HDR image by different TMOs. Subjectively, the best figure is Fig. 3(b) and

Table 3. Subjective and Objective score for different methods on open area wide shots
LDR image.

Figure Subjective score TMQI-3 TMQI-2 TMQI-1 FSITM
Figure 2 (a) 7.1 0.4547 0.3207 0.7004 0.6987
Figure 2 (b) 3.65 0.5774 0.4732 0.8328 0.8155
Figure 2 (c) 4.75 0.8637 0.9008 0.8589 0.8272

Fig. 2. Open area wide shots LDR images created using different TMOs [35]. The
subjective and objective scores for each image are given in Table 3.

the worst is Fig. 3(c) (lowest and highest subjective scores, respectively). The objective and subjective scores for the images are shown in Table 4. We can notice that the good image has a higher objective score and the bad image has a lower objective score under all of TMQI-3, TMQI-2, TMQI-1, and FSITM; thus they all perform well on images where the main object upfront is more obvious than the background. The only difference that can be noticed is that FSITM and TMQI-1 were not able to provide a larger objective-score gap between Fig. 3(b) and Fig. 3(c). The difference is very minute, where it should have been greater, as the subjective score of Fig. 3(c) is much greater than that of Fig. 3(b). This is not the case with TMQI-2 and TMQI-3; they both provide better differences in objective scores in correspondence to their subjective scores.

Table 4. Subjective and Objective score for different methods on Main object upfront
LDR image.

Figure Subjective score TMQI-3 TMQI-2 TMQI-1 FSITM
Figure 3 (a) 1.65 0.7672 0.7337 0.9363 0.8862
Figure 3 (b) 1.45 0.8801 0.9051 0.9475 0.8926
Figure 3 (c) 5.6 0.6199 0.5544 0.9119 0.8332

Fig. 3. Main object upfront LDR images created using different TMOs [35]. The sub-
jective and objective scores for each image are given in Table 4.

Indoor with Background Scenery: Figure 4 shows the LDR images produced from the HDR image by different TMOs. The results are given in Table 5. Subjectively, the best image is Fig. 4(a) and the worst is Fig. 4(c) (scores 1.55 and 5.65, respectively). The objective scores from TMQI-3, TMQI-2, TMQI-1, and FSITM are calculated for each image. The objective score of a metric should be low if the subjective score is high and high if the subjective score is low. The FSITM objective score for Fig. 4(b) should be less than that of Fig. 4(a) because Fig. 4(b) has a higher subjective score, but in actuality FSITM gives the opposite. The same holds for TMQI-1 regarding Figs. 4(b) and 4(a) in this case. TMQI-2 and TMQI-3 perform as expected, giving a lower objective score to the image with the higher subjective score and a higher objective score to the image with the lower subjective score.

Table 5. Subjective and Objective score for different methods on Indoor with back-
ground scenery LDR image.

Figure Subjective score TMQI-3 TMQI-2 TMQI-1 FSITM-TMQI
Figure 4 (a) 1.55 0.8592 0.8704 0.9476 0.8942
Figure 4 (b) 3.9 0.8316 0.8166 0.9548 0.9050
Figure 4 (c) 5.65 0.3785 0.2142 0.8766 0.7484

Outdoor Natural Scenery Images: This set of LDR images in Fig. 5 is of outdoor natural scenery produced from the respective HDR image through various TMOs. The results are given in Table 6. It can be observed that the best image (with the lowest subjective score) is Fig. 5(a) and the worst is Fig. 5(c); their respective subjective scores are 3.65 and 7.1. It can also be observed that TMQI-3, TMQI-2, TMQI-1, and FSITM all produce the right results. The only difference that can be noticed is that FSITM and TMQI-1

Fig. 4. Indoor with background scenery LDR images created using different TMOs
[35]. The subjective and objective scores for each image are given in Table 5.

could not provide a larger objective-score gap for each image. Whereas TMQI-2
and TMQI-3 were able to provide better objective score differences with respect
to the corresponding subjective scores.

Table 6. Subjective and Objective score for different methods on Outdoor natural
scenery LDR image.

Figure Subjective score TMQI-3 TMQI-2 TMQI-1 FSITM
Figure 5 (a) 3.65 0.8260 0.7768 0.9487 0.9332
Figure 5 (b) 4.75 0.7153 0.6142 0.8952 0.9035
Figure 5 (c) 7.1 0.4975 0.3156 0.8008 0.8167

Fig. 5. Outdoor natural scenery created using different TMOs [35]. The subjective
and objective scores for each image are given in Table 6.

5 Conclusion
We have proposed an objective quality index, called the locally weighted mean phase angle based tone mapping quality index (TMQI-3), which is based upon the combination of three basic properties, namely statistical naturalness, structural fidelity, and the locally weighted mean phase angle. All three properties
were used separately in the literature. In this paper, we integrate all of them into one quality index because they can provide better results (when combined) while objectively evaluating an LDR image. The results are accurate and better than those offered by previous quality metrics. Our metric is noise resilient, includes the structure and naturalness of the image, and provides better objective scores. In the future, we will focus on developing a better weight-assessment strategy for each individual component, based on the sensitivity and structure of each image separately.

References
1. Ahmad, M., Ali, S., Tariq, J., Khan, I., Shabbir, M., Zaman, A.: Combinatorial
trace method for network immunization. Inf. Sci. 519, 215–228 (2020)
2. Ali, S.: Cache replacement algorithm. arXiv preprint arXiv:2107.14646 (2021)
3. Ali, S., Alvi, M.K., Faizullah, S., Khan, M.A., Alshanqiti, A., Khan, I.: Detecting
DDoS attack on SDN due to vulnerabilities in OpenFlow. In: 2019 International
Conference on Advances in the Emerging Computing Technologies (AECT), pp.
1–6 (2020)
4. Ali, S., Ciccolella, S., Lucarella, L., Vedova, G.D., Patterson, M.: Simpler and
faster development of tumor phylogeny pipelines. J. Comput. Biol. 28(11), 1142–
1155 (2021)
5. Ali, S., Khan, M.A., Khan, I., Patterson, M., et al.: Effective and scalable clustering
of SARS-CoV-2 sequences. In: International Conference on Big Data Research
(ICBDR) (2021, to appear)
6. Ali, S., Mansoor, H., Arshad, N., Khan, I.: Short term load forecasting using smart
meter data. In: Proceedings of the Tenth ACM International Conference on Future
Energy Systems, pp. 419–421 (2019)
7. Ali, S., Mansoor, H., Khan, I., Arshad, N., Faizullah, S., Khan, M.A.: Fair alloca-
tion based soft load shedding. In: Arai, K., Kapoor, S., Bhatia, R. (eds.) IntelliSys
2020. AISC, vol. 1251, pp. 407–424. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-55187-2_32
8. Ali, S., Mansoor, H., Khan, I., Arshad, N., Khan, M.A., Faizullah, S.: Short-term
load forecasting using AMI data. arXiv preprint arXiv:1912.12479 (2019)
9. Ali, S., Patterson, M.: Spike2Vec: an efficient and scalable embedding approach for
COVID-19 spike sequences. In: IEEE International Conference on Big Data (Big
Data), pp. 1533–1540 (2021)
10. Ali, S., Sahoo, B., Ullah, N., Zelikovskiy, A., Patterson, M., Khan, I.: A k-mer
based approach for SARS-CoV-2 variant identification. In: Wei, Y., Li, M., Skums, P., Cai, Z. (eds.) ISBRA 2021. LNCS, vol. 13064, pp. 153–164. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-91415-8_14
11. Ali, S., Shakeel, M.H., Khan, I., Faizullah, S., Khan, M.A.: Predicting attributes
of nodes using network structure. ACM Trans. Intell. Syst. Technol. (TIST) 12(2),
1–23 (2021)
12. Concetta Morrone, M., Burr, D.C.: Feature detection in human vision: a phase-
dependent energy model. Proc. Roy. Soc. London. Ser. B. Biol. Sci. 235(1280),
221–245 (1988)
13. Eilertsen, G., Mantiuk, R.K., Unger, J.: A comparative review of tone-mapping
algorithms for high dynamic range video. Comput. Graph. Forum 36(2), 565–592
(2017)

14. Fairchild, M.D.: The HDR photographic survey. In: Color and Imaging Conference
2007, no. 1, pp. 233–238 (2007)
15. Gijsenij, A., Gevers, T., Van De Weijer, J.: Computational color constancy: survey
and experiments. IEEE Trans. Image Process. 20(9), 2475–2489 (2011)
16. Hassen, R., Wang, Z., Salama, M.M.A.: Image sharpness assessment based on local
phase coherence. IEEE Trans. Image Process. 22(7), 2798–2810 (2013)
17. Ka, S., Punithavathanib, D.S.: Local and global tone mapping operators in HDR
image processing with amalgam technique. Int. J. Adv Eng. Tech. VII(I), 476–485
(2016)
18. Kovesi, P., et al.: Edges are not just steps. In: Proceedings of the Fifth Asian
Conference on Computer Vision, Melbourne, vol. 8, pp. 22–28 (2002)
19. Lézoray, O.: High dynamic range image processing using manifold-based ordering.
In: International Conference on Pattern Recognition (ICPR), pp. 289–294. IEEE
(2016)
20. Ma, K., Yeganeh, H., Zeng, K., Wang, Z.: High dynamic range image tone map-
ping by optimizing tone mapped image quality index. In: 2014 IEEE International
Conference on Multimedia and Expo (ICME), pp. 1–6. IEEE (2014)
21. Ma, K., Yeganeh, H., Zeng, K., Wang, Z.: High dynamic range image compres-
sion by optimizing tone mapped image quality index. IEEE Trans. Image Process.
24(10), 3086–3097 (2015)
22. Mantiuk, R., Daly, S., Kerofsky, L.: Display adaptive tone mapping. In: Special
Interest Group on Computer Graphics and Interactive Techniques (SIGGRAPH),
pp. 1–10. ACM (2008)
23. Mantiuk, R., Seidel, H.-P.: Modeling a generic tone-mapping operator. Comput.
Graph. Forum 27(2), 699–708 (2008)
24. Marnerides, D., Bashford-Rogers, T., Hatchett, J., Debattista, K.: ExpandNet: a
deep convolutional neural network for high dynamic range expansion from low
dynamic range content. Comput. Graph. Forum 37(2), 37–49 (2018)
25. Nafchi, H.Z., Shahkolaei, A., Moghaddam, R.F., Cheriet, M.: FSITM: a feature
similarity index for tone-mapped images. IEEE Sig. Process. Lett. 22(8), 1026–
1029 (2014)
26. Nayar, S.K., Mitsunaga, T.: High dynamic range imaging: spatially varying pixel
exposures. In: Proceedings IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), vol. 1, pp. 472–479. IEEE (2000)
27. Oliva, A., Schyns, P.G.: Diagnostic colors mediate scene recognition. Cogn. Psy-
chol. 41(2), 176–210 (2000)
28. Oppenheim, A.V., Lim, J.S.: The importance of phase in signals. Proc. IEEE 69(5),
529–541 (1981)
29. Párraga, C.A., Brelstaff, G., Troscianko, T., Moorehead, I.R.: Color and luminance
information in natural scenes. J. Opt. Soc. Am. A 15(3), 563–569 (1998)
30. Reinhard, E., Heidrich, W., Debevec, P., Pattanaik, S., Ward, G., Myszkowski, K.:
High Dynamic Range Imaging: Acquisition, Display, and Image-based Lighting.
Morgan Kaufmann (2010)
31. Saha, A., Wu, Q.M.J.: Perceptual image quality assessment using phase deviation
sensitive energy features. Sig. Process. 93(11), 3182–3191 (2013)
32. Salih, Y., Wazirah binti Md-Esa, W., Malik, A.S., Saad, N.: Tone mapping of
HDR images: a review. In: 2012 4th International Conference on Intelligent and
Advanced Systems (ICIAS 2012), vol. 1, pp. 368–373 (2012)
33. Ullah, A., Ali, S., Khan, I., Khan, M.A., Faizullah, S.: Effect of analysis window
and feature selection on classification of hand movements using EMG signal. In:
Proceedings of SAI Intelligent Systems Conference, pp. 400–415 (2020)

34. Van Den Wymelenberg, K., Inanici, M., Johnson, P.: The effect of luminance dis-
tribution patterns on occupant preference in a daylit office environment. LEUKOS
7(2), 103–122 (2010)
35. Yeganeh, H., Wang, Z.: Objective quality assessment of tone-mapped images. IEEE
Trans. Image Process. 22(2), 657–667 (2012)
36. Zhang, L., Zhang, L., Mou, X., Zhang, D.: FSIM: a feature similarity index for
image quality assessment. IEEE Trans. Image Process. 20(8), 2378–2386 (2011)
Age Estimation of a Person by Compound Stratum Practice in ANN Using n-Sigma Limits

M. R. Dileep1, Ajit Danti2, and A. V. Navaneeth1(B)

1 Department of Master of Computer Applications, Nitte Meenakshi Institute of Technology, Yelahanka, Bengaluru, Karnataka, India
avnavaneeth25@gmail.com
2 School of Computer Science and Engineering, Christ (Deemed to be) University, Bengaluru, Karnataka, India

Abstract. A face of a person reveals many facts, such as gender, identity, and age. The human face plays a crucial part in the estimation of a person's age. In this paper, a model is designed to categorize the age of a person as per the features retrieved from facial images of humans using an Artificial Neural Network (ANN). In recent years, ANNs have been widely implemented as classification tools for resolving numerous decision modelling problems. In this study, a feed-forward propagation Artificial Neural Network is built for an age group classification method for gray-scale frontal face images of humans. Three age clusters, namely Kids, Mid aged, and Senior aged adults, are considered in the clustering system. The facial images of the Kids age group range from 1 to 16 years. The facial images of the Mid aged group range between 16 and 60 years. The Senior aged group is considered as age 60 and above. In this research paper, the compound stratum practice is proposed, in which the outcome obtained from the ANN is refined at the compound stratum for the purpose of improving the precision of the detection rate. The accuracy of the algorithm is investigated by varying the boundaries of the age clusters. The competence of the proposed system is validated and reflected in the experimental outcomes.

Keywords: Age estimation · Feed forward propagation · Facial features · Artificial neural network · Stratum practice

1 Introduction

Advanced image processing techniques related to human faces have been a dynamic and motivating topic of research for years. Since facial structures and appearances provide ample information, numerous issues have attracted much attention and have therefore been studied deeply.
In research, invention, and data science, the neural network is an interesting arrangement of programs and data structures that approximates the functionality of the human brain. The term "Neuron" denotes a simple data processing element. A simple neuron has a cell body known as the Soma, a number of fibers called Dendrites,

and a single long fiber known as the Axon. The Soma fires at varied frequencies, while the Dendrites accept electrical signals driven by chemical activities. A perceptron is a basic unit of a neural network. The connections among neurons are known as Synapses. In the network, neurons are connected by directed, weighted paths. The weights can be excitatory or inhibitory. Figure 1 represents the classical structure of a neuron.

Fig. 1. Classical neural structure

An Artificial Neural Network consists of a huge number of processors functioning simultaneously, each with its own small scope of understanding and access to the data in its local storage. Characteristically, an ANN is initially "trained", i.e., fed a large volume of data, rules, and constraints on data relationships. A program then tells the network how to act in response to an external stimulus or to start an action on its own. In a feed-forward architecture, trained associations on data "feed forward" to the next higher strata of knowledge. ANNs also absorb progressive notions and are widely implemented in the analysis of time series and in signal processing.
This study invents an efficient approach for the estimation of a person's age from frontal facial images. In this paper, the clustering of the human facial images is achieved at two levels, viz.,

• Principal Level
• Compound Level

In the Principal Level, the images of the human face are clustered using an ANN. The Compound Level involves clustering built on the result of the Principal Level to improve the detection rate effectively. The proposed system is applied to cluster the input images into one of the following age clusters, viz., Kids, Mid aged, and Senior aged, by means of feed-forward neural networks along with n-Sigma limits.

1.1 Literature Survey

Human mood analysis based on facial features, facial expression recognition and classification, and related problems of facial appearance are prominent research areas in surveillance and recognition applications. To date, ample research experiments have been conducted on detecting and recognizing human faces based on tools, patterns, intelligent systems, and example-based methods.

Yet, these techniques are comparatively luxurious, difficulty in maintenance and also
too complex to operate. Chellappa et al. [1], demonstrated the Humanoid and engine
acknowledgment of appearances in terms of investigational study. Dileep M R and Ajit
Danti [2], modelled a methodical algorithm for Regulated Connectivity-Face model for
the recognition of Human Facial Expressions. Er et al. [3], introduced the method on
appearance acknowledgment in fast track built on distinct cosine transmute and RBF
ANN. Gonzalez and Woods [4], demonstrated different concepts of Digital Image Pro-
cessing with demonstration examples and the method of coding them in Matlab with
various real time scenarios in his 3rd edition book by name Digital Image Processing
Using Matlab. Graf et al. [5], invented a technique that classifies human faces with the
help of man machine using neural computation. Hayashi et al. [6], designed an archi-
tecture about age and gender prediction using pre-processing of facial images. Kirby
et al. [7], designed on the uses of KL Process for the classification of human facial pic-
tures. Vitoria Lobo and Kwon [8], invented the process that clusters the facial images of
humans based on age. Lanitis et al. [9], conducted an analysis and real time experiment
on comparing various classifiers for age estimation of a person automatically. Looney
[10], introduced a process on configuration acknowledgment by means of ANN. Luu
et al. [11], explained a scheme on shadowy deterioration centered estimation of oldness.
Peng et al. [12], presented a methodology for LDA/SVM focused classification using
adjacent neighbor approach by means of NN. Turk & Pentland [13], given the systematic
approach on eigen faces for recognition. Wen-Bing Horng et al. [14], developed a tool
for the sorting of age collections constructed on topographies of face. Zheng and Lu
[15], explained an approach based on the classifier SVM with automated poise & its
presentation to organization of human sex.
In the present research study, three age clusters, viz., Kids, Mid aged, and Senior aged, have been taken into consideration. An algorithmic approach is invented here to estimate the ages using an artificial neural network as the classification tool at the Principal level. The precision and exactness of estimating the defined age clusters in a picture have been compared with the approaches invented by various investigators and scientists. The proposed technique and application is not restricted to any particular dataset, but can be applied to various databases, including pictures that can be downloaded from internet sources.

1.2 Face Database

The methodology proposed in this paper is investigated on a dataset of frontal faces of persons of various age clusters. This standard database is considered a benchmark dataset for performance evaluations and an indicator for the estimation of human age. There are 700 grayscale frontal-view facial images in this dataset. Among the 700 frontal-view images of human faces, 350 images were taken for training purposes, and the remaining images are considered for testing. Each picture is standardized to dimension 64 × 64. Figure 2 below represents the face dataset of individuals of various age clusters.

Fig. 2. Examples of front view face pictures database.

2 Proposed Methodology
In this research, initially the face images are read. The intensity values of each image vary between 0 and 255. To increase the efficiency of the process, rather than feeding the full set of 4096 pixel values to the ANN as neurons, the aggregate values of each image, i.e., 64 mean values, are fed to the ANN. A minimal sketch of this feature reduction and network setup is given after Fig. 3.
During the Principal level, the ANN clusters the human face images into the various age categories, namely Kids, Mid aged, and Senior aged. This result cannot be considered a final decision, as there remains a probability of misclassification.
To boost the Principal level, a Compound Level scheme is invented to diminish the misclassification probability and increase the success rate of the system. The experimental outcome of the facial image training by means of the ANN is presented in Fig. 3 below.

Fig. 3. Building and training of an ANN
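The following Python sketch illustrates the feature reduction and network setup described above: each 64 × 64 image is reduced to 64 aggregate (row-mean) values before being fed to a feed-forward network. The hidden-layer size, the training settings, and the use of scikit-learn's MLPClassifier are illustrative assumptions, not the configuration used in this paper.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def aggregate_features(img):
    """Reduce a 64x64 grayscale face (values 0-255) to 64 row means."""
    return np.asarray(img, dtype=np.float64).mean(axis=1) / 255.0

def train_age_ann(images, labels):
    """labels: 0 = Kids, 1 = Mid aged, 2 = Senior aged (placeholder codes)."""
    X = np.stack([aggregate_features(im) for im in images])
    # Hidden-layer size and iteration budget are illustrative assumptions.
    ann = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                        random_state=0)
    return ann.fit(X, labels)
```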

2.1 Compound Level

In this research article, the approximate age of a person is estimated using the Compound Level Stratum Practice implemented on the ANN. Figure 4 represents the diagrammatic depiction of the Principal Level Practice.
As depicted in the above construction, the outcome is an aggregate value that estimates the age cluster of an individual at the Principal level. The current process is represented as,

Y = £ (trainImg , testImg) (1)



Fig. 4. Principal-level practice

Here, £ stands for the simulation of the ANN, trainImg is the trained assessment in the ANN, and testImg is the test image.
Y takes one of the values Ykid, Ymid, Ysenior, which signifies the estimated age cluster of an individual at the Principal level. Based on this Principal level outcome, the further enhancement in the Compound Stratum Practice is achieved.
In the Compound Stratum Practice, the succeeding steps have to be performed.
The mean value of all Ykid images is computed by,

Mkid = (Σ Ykid) / N   (2)

Here, Mkid is the mean value over all Kids faces, Σ Ykid is the sum of all the aggregate values representing the pictures of Kids, and N is the number of pictures of Kids.
The standard deviation over all images of Kids faces is calculated as

SDkid = √Vkid   (3)

where Vkid = (1/N) Σ (Xi − Mkid)², 1 ≤ i ≤ N.
Similarly, Mmid, SDmid, Msenior, and SDsenior are calculated over the entire training image dataset.
Now, the age of an individual is estimated by the Compound Stratum Practice with the implementation of the ANN, where a 3-Sigma limit is imposed on the ANN classifier. 3-Sigma control limits encapsulate approximately 99.7% of the population of the dataset under consideration for making a decision. Figure 5 demonstrates the 3σ limit, which indicates how data are spread around their mean.
The thresholds for all three age clusters are found using 3-Sigma limits as given below,

Lkid = Mkid − 3σ   (4)

Ukid = Mkid + 3σ   (5)

where Ukid and Lkid are the upper boundary and lower boundary of the Kids cluster, respectively.

Fig. 5. 3σ limits distributed around the mean.

Likewise, Lmid, Umid and Lsenior, Usenior are estimated. The rational threshold value is found experimentally by examining the facial images in the dataset.
While testing an image, this proposed process is implemented for estimation and for drawing the conclusion, as sketched below.
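A minimal sketch of the compound-level decision is given below: the trained mean M and standard deviation SD of each cluster (Eqs. (2) and (3)) define the 3σ band of Eqs. (4) and (5), and a test value is assigned to the first cluster whose band contains it, otherwise to the "Other Category" seen in Fig. 6(d). The cluster statistics are illustrative placeholders, not values from the paper.

```python
# Placeholder cluster statistics learned at the Principal level.
CLUSTERS = {
    "Kids":   {"M": 0.2, "SD": 0.05},
    "Mid":    {"M": 0.5, "SD": 0.06},
    "Senior": {"M": 0.8, "SD": 0.04},
}

def compound_level_decision(y):
    """y: aggregate ANN output for the test image, as in Eq. (1)."""
    for name, stats in CLUSTERS.items():
        lower = stats["M"] - 3 * stats["SD"]   # Eq. (4)
        upper = stats["M"] + 3 * stats["SD"]   # Eq. (5)
        if lower <= y <= upper:
            return name
    return "Other Category"                    # outside all 3-sigma bands
```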
The experimental outcomes of the estimation of human age are presented in Fig. 6 below.

(a) Kid age (b) Mid age (c) Senior age (d) Other Category
Fig. 6. Investigational outcomes of the projected approach

3 Projected Algorithm
The projected algorithm for estimating human age using the Compound Stratum Practice is shown below:

Phase 1: Feed face images to the ANN as training input.
Phase 2: Fix the target values for the groups of Kids, Mid aged and Senior aged.
Phase 3: Build the ANN.
Phase 4: Train the images using the neural network.
Phase 5: Calculate Ykid, Ymid, Ysenior using Eq. (1).
Phase 6: Calculate Mkid, SDkid, Mmid, SDmid, Msenior, SDsenior using Eqs. (2) and (3), respectively.
Phase 7: Fix the threshold (ϴ) for each of the age clusters, viz., ϴkid, ϴmid, ϴsenior, i.e., Kids, Mid aged and Senior aged individuals, by means of the 3σ limits using Eqs. (4) and (5), respectively.

Phase 8: Test the probe image, find its value (ϴ), and investigate which group ϴ belongs to, i.e., ϴkid, ϴmid or ϴsenior.
Phase 9: On the basis of the cluster where ϴ fits in the Compound Stratum Practice, infer that the test image belongs to that precise cluster (see the sketch below).
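A minimal sketch of Phases 8 and 9: the test value ϴ is assigned to the cluster whose 3-sigma band contains it, or to the other category when no band matches (cf. Fig. 6(d)); the numeric bands below are illustrative placeholders, not values from the paper.

def classify_age(theta, bands):
    # Assign theta to the first cluster whose band contains it.
    for cluster, (low, high) in bands.items():
        if low <= theta <= high:
            return cluster
    return "Other Category"

bands = {
    "Kids": (30.0, 55.0),      # (L_kid, U_kid) from Eqs. (4)-(5)
    "Mid": (55.0, 80.0),       # (L_mid, U_mid)
    "Senior": (80.0, 110.0),   # (L_senior, U_senior)
}
print(classify_age(62.3, bands))   # -> "Mid"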

4 Experimental Outcomes

In the current experiment, 700 grayscale face images with 256 gray levels were taken. Each image is normalized to 64 × 64 pixels. The age clusters were distinguished as reflected in Table 1 below. The subjective average age of the individual decisions is then computed. Among the 700 experimental images, 350 images were used for the training dataset, and the remaining were used for the test set. The age-cluster boundaries obtained are imposed on the ANN training and, lastly, are used for assessing the performance of the clustering scheme on the test images by the Compound Stratum Practice method.

Table 1. Age clusters & age boundaries.

Age Clusters Age Boundaries


Kids 1 to 16
Mid Age 16 to 60
Senior Age Greater than 60

In the experimentation stage, among 300 test images, 100 images were taken from each age cluster. The identification rate for the Kids cluster is 90.9%, the precision rate for the Mid aged cluster is 91.04%, and the detection rate of the Senior aged cluster is found to be 97.01%. Consequently, the overall success rate on the test images is 93%. The average detection time per test image is 0.50 s on an Intel Core i3 computer with 4 GB of RAM.
Conversely, the proposed algorithm fails to detect side-view faces, occluded faces and partial face images. This is because the proposed architecture is limited to detecting only faces in frontal pose. The proposed approach was compared with other existing methodologies and achieved a higher success ratio, as presented in Fig. 7 and Fig. 8.

Fig. 7. Success ratio of proposed compound stratum practice model


Fig. 8. Comparing proposed compound stratum practice architecture with existing models.

5 Inferences and Deliberations


In the present article, a quick and effective methodology for estimating a person's age has been devised to categorize a face image into one of three age clusters, viz., Kids, Mid aged and Senior aged individuals. In the current research, the age of an individual is estimated by a feed-forward propagation ANN, and further clustering is achieved by imposing validation using 3-Sigma limits extracted from the Principal level of investigation. The proposed algorithm is superior with respect to speed and accuracy. A single frontal human face across the various age clusters is estimated successfully with a detection rate of 93%. Some misclustering is reported because clear age clusters are not consistent, although crisp boundaries are well defined for each cluster here. The misclustering cases observed in the trials are displayed in Fig. 9 below.

Fig. 9. Sample images with misclassification.
In future experiments, this misclustering will be reduced with the help of a fuzzy inference system (FIS) for further enhancement of the proposed architecture, making it more relevant to the design of a real-time surveillance application.

Improved Artificial Fish School Search Based
Deep Convolutional Neural Network
for Prediction of Protein Stability upon Double
Mutation

Juliet Rozario1(B) and B. Radha2


1 Department of Computer Science, Nehru Arts and Science College, Sree Saraswathi Thyagaraja College, Coimbatore, India
julietjuana@gmail.com
2 Sri Krishna Arts and Science College, Coimbatore, India

Abstract. Recent advances in molecular mechanisms and pharmaceutical drug design greatly influence the process of predicting a protein's stability upon mutation by understanding the effects of amino acid substitutions. Thus, researchers started focusing on protein engineering to investigate protein structure, discover disease-causing variants at an early stage and guide the design of pharmaceutical drugs. Though many protein stability prediction models are available, issues related to uncertainty and to in-depth understanding of protein and molecular structure remain. Hence it is very challenging to construct a novel prediction model that handles these issues. This paper aims to develop a metaheuristic-based deep neural network model for more accurate prediction of protein stability. This research work proposes a Fish School Search improved Deep Convolutional Neural Network (CNN-FSS) for predicting protein stability upon double mutation. The parameters of the CNN are handled by the fish school searching behaviour, which assigns the best values so as to support accurate prediction in the presence of uncertainty and the inconsistent structure of the protein stability energy change at double mutation sites. The complex pattern of double mutation is handled precisely by the proposed CNN-FSS by improving the learning rate of the CNN. Instead of gradient-descent-based weight assignment, the fish schooling induces an optimized result. The simulation is conducted on the ProTherm database, and the results prove that CNN-FSS achieves the most accurate prediction of protein stability upon double mutation compared with conventional CNN and DNN.

Keywords: Double mutation · ProTherm database · Protein stability ·


Convolutional neural network · Fish school search · Energy change · Deep neural
network

1 Introduction
One of the vital processes in protein engineering is predicting protein stability by analyzing its mutations. Various factors affect protein folding along

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022


H. Sharma et al. (Eds.): ICIVC 2021, PALO 15, pp. 181–192, 2022.
https://doi.org/10.1007/978-3-030-97196-0_15

with compositional and conformational stability [1]. Other environmental factors are pH, temperature, and primary and secondary structure. The prediction of protein stability is mainly used in medicine to construct enhanced immunotherapeutic agents and to identify drugs [2]. The structure of a protein is determined by the interactions between its amino acids and their environment. The thermodynamic stability of a protein is generally denoted by the Gibbs free energy (ΔG). When a mutation results in an amino acid substitution in a protein, it affects the protein's stability. The change in protein thermodynamic stability (ΔΔG) due to mutation is very essential for medicine and biotechnology.
Protein stability upon double mutation is essential for understanding stability change caused by hydrophobic packing, disulfide bonds, ion pair formation and salt bridge removal [3]. Removing specific interactions or substituting both residues leads to a stability change. Double mutant stability is well described by the notion of additivity: the effect is termed additive if the sum of the stability changes of the two single mutants is identical to the stability change upon double mutation. When the structural regions influenced by the mutants do not overlap with each other, they are also known as additive [4].
In this work, an improved deep convolutional neural network is designed to predict protein stability change upon double mutation. To enhance the prediction of stability change it is important to gain in-depth knowledge about protein structure. Thus, the proposed model aims to improve the accuracy of stability change prediction with reduced computational complexity.

2 Related Work
In their study, Arun Prasad et al. [5] used interaction parameters, residues, and packing density to construct a knowledge-based strategy. They compute a distance score between the mutant protein and the wild type protein to anticipate the effect of mutations on protein stability. Fang [6] used a comparative analysis of five different machine learning models for predicting mutation-induced protein stability changes. Overfitting is discouraged in this study owing to a lack of training instances and attributes that do not provide adequate information for the task. Wilson et al. [7] focused on the consequences of 12 different cancer mutations. They provide additional information on the molecular dynamics of mutants that could not be obtained from experiments. The most notable result of this research is that it demonstrates that structural stability is not under risk.
Montanucci et al. [8] create a prediction model for single point protein stability changes. To create simple anti-symmetric features, evolutionary information is integrated with an untrained model in this work. From sequence and structural data, it can also anticipate multipoint variations. The difference between projected and measured energy change is found using the Pearson correlation. Panday [9] created a sequence-based technique to anticipate the effect of single point mutations on protein stability. It is a gradient boosting decision tree that predicts how amino acid alterations affect folding free energy. To develop predictions, this study examined sequence characteristics, physicochemical parameters, and evolutionary information traits.
Li et al. [11] developed ThermoNet for structure-based prediction of protein thermodynamic stability after point mutation using a 3D convolutional neural network. Both stabilizing and destabilizing mutations are predicted by this paradigm.

Cao et al. [10] devised a neural network approach for predicting changes in protein stability as a result of point mutations. It is a deep neural network that predicts changes in protein stability, evaluated using Pearson correlation coefficients. It also demonstrates that the most essential factor for determining protein stability is the solvent accessible surface. Alvarez [12] proposed a two-step technique that combined a Holdout Random Sampler and neural network regression. The cumulative distribution function is used to find the energy changes using the Holdout Random Sampler. The HRS output is used to train a neural network to anticipate changes in protein stability.

3 Methodology: Fish School Search Improved Deep Convolutional Neural Network for Prediction of Protein Stability upon Double Mutation
This research work constructs a novel Deep Convolutional Neural Network whose performance is improved by applying the behavioural inspiration of the Artificial Fish School algorithm for prediction of protein stability upon double mutation. Unlike other conventional deep learning models, which use gradient-descent-based weight assignment for each hidden node, the proposed work gains prior knowledge from the artificial fish school search algorithm, whose contribution is to improve the learning rate of the CNN. The architecture receives the input data from the ProTherm dataset and passes it to a convolutional layer with a ReLU activation function, followed by a pooling layer. After processing through a few such layers repeatedly, it reaches a fully connected dense layer with a softmax activation function. The parameters of the Deep Convolutional Neural Network, such as weights and biases, are optimized using the searching behaviour of the fish school. Figure 1 depicts the complete workflow of the proposed CNN-FSS.

3.1 Dataset Description


This work collected the double mutant dataset from the ProTherm database [16], known as S2648, which has 180 double mutants from 27 different proteins. The attributes in the S2648 double mutant dataset comprise two variant sets of attributes to denote the double mutation: the first mutant is represented as M1 and the second mutant as M2. Both mutants have eight attributes, with the wild-type residue and the mutant residue as attributes 1 and 2, respectively. Attributes 3–8 are the three neighbouring residues of the mutation point in both directions. M12 comprises all the attributes. The energy change (ΔΔG) is the label, which indicates whether the energy increased or decreased.

3.2 Deep Convolutional Neural Network


An alternative neural network model is the Convolutional Neural Network (CNN) [13], whose main objective is to decrease the spectral variance and model the correlations present in the dataset. A conventional CNN is depicted in Fig. 2; the hidden activations of the fully connected layer are calculated by multiplying the input vector I with the weights of that layer. The weight factor (wf) is shared across the whole input data, as


Fig. 1. Overall flow of fish school search improved convolutional neural network for double
mutation-based protein stability

shown in the figure. Once the hidden units (hu) are computed, the max-pooling layer (Mi) helps remove inconsistency in the hidden units. In this work, the lowest layer is the convolutional layer and the higher layers are fully connected.


Fig. 2. Architecture of CNN



• Convolutional Layer

– It computes the outputs of nodes that are connected to local regions of the input matrix of the double mutation protein dataset.
– It calculates the dot product between the set of weights and the local region values of the input [14].

• ReLU/Activation Layer

– The output produced by the convolutional layer is passed to an element-wise activation operation termed a Rectified Linear Unit (ReLU).
– ReLU determines whether an input node will trigger on the passed input data or not; the data volume remains unchanged.

• Pooling Layer

– The feature size is reduced in this pooling layer using a down-sampling policy.

• Fully Connected Layer

– The convolved features are passed to the fully connected layers; as in a conventional feed-forward model, each node is connected to the next layer's nodes by links, and a weight parameter is assigned to each link.
– It computes the class probability depending on the learned pattern (see the sketch after this list).
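As a concrete illustration of this stack, the sketch below (assuming PyTorch, which the paper does not name, and a 16-value numeric encoding of the M1/M2 residue attributes from Sect. 3.1) wires a small convolution–ReLU–pooling–softmax network; it is a minimal stand-in, not the authors' implementation.

import torch
import torch.nn as nn

class ProteinCNN(nn.Module):
    def __init__(self, n_features=16, n_classes=2):
        super().__init__()
        self.conv = nn.Conv1d(1, 8, kernel_size=3, padding=1)   # convolutional layer
        self.pool = nn.MaxPool1d(2)                             # pooling layer
        self.fc = nn.Linear(8 * (n_features // 2), n_classes)   # fully connected layer

    def forward(self, x):                          # x: (batch, n_features)
        x = x.unsqueeze(1)                         # -> (batch, 1, n_features)
        x = self.pool(torch.relu(self.conv(x)))    # convolution + ReLU + pooling
        x = x.flatten(1)                           # convolved features
        return torch.softmax(self.fc(x), dim=1)    # class probabilities

model = ProteinCNN()
probs = model(torch.randn(4, 16))   # 4 hypothetical encoded mutant records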


Fig. 3. Convolutional neural network for protein stability prediction

Figure 3 shows the complete structure of the Convolutional Neural Network for predicting protein stability using the double mutant dataset.
Fish School Search (FSS) is a population-based metaheuristic model inspired by the swimming behavior of fish schools, which contract and expand while searching for food [15]. Each fish location represents an n-dimensional candidate solution to the search problem. A candidate solution is evaluated to determine its success using a cumulative account signified by its weight factor. Feeding and movement are the operators of the Fish School Search algorithm, as shown in Fig. 4.
The movement operator is divided into three components: individual, collective-instinctive and collective-volitive. Each fish can perform a random local search for promising areas using the individual component of the movement operator.

Fig. 4. Fish school searching process

The individual component is mathematically represented as

Yi(t + 1) = Yi(t) + r · stpdis  (1)

where Yi(t) and Yi(t + 1) denote the position of the ith fish before and after the movement triggered by the individual component, respectively; r ∈ R^N with rj ~ Uniform[−1, 1] for j = {1,…,n}; and stpdis is a variable responsible for setting the maximum displacement of this movement. A new position Yi(t + 1) is only accepted if O(Yi(t + 1)) > O(Yi(t)), where O is the objective function. Otherwise, the fish remains in the same position and its next position is not changed: Yi(t + 1) = Yi(t). The collective-instinctive component of the movement is computed as the weighted average of the individual movements of all fishes. A vector H ∈ R^N is the weighted average of the displacements ΔYi, mathematically formulated as
H = ( Σ_{i=1}^{sz} ΔYi · ΔOi ) / ( Σ_{i=1}^{sz} ΔOi ),  (2)

where sz is the school size, ΔYi is shorthand for Yi(t + 1) − Yi(t), and ΔOi is shorthand for O(Yi(t + 1)) − O(Yi(t)). The displacement embodied by H is defined such that fishes with a higher improvement attract the other fishes towards their location. After calculating H, every fish moves according to:

Yi (t + 1) = Yi (t) + H (3)

To regulate the fish school's exploitation or exploration, the collective-volitive component is used during the search process. It starts by computing the barycenter BC ∈ R^N of the school relative to the positions Yi of the fishes and their weights wti, as represented in the equation below:

BC(t) = ( Σ_{i=1}^{sz} Yi(t) · wti(t) ) / ( Σ_{i=1}^{sz} wti(t) )  (4)

The fishes move towards the barycenter BC only if the entire school weight Σ_{i=1}^{sz} wti has increased from t to t + 1.

wti(t + 1) = wti(t) + ΔOi / max(|ΔOi|),  (5)

wti(t) ranges between 1 and wtscale, where wtscale is a hyper-parameter. Each weight is initialized to wtscale/2. A sketch of these operators is given below.
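A minimal numpy sketch of the operators in Eqs. (1)–(5); the objective O and the dimensions are illustrative, and the collective-volitive move around the barycenter (Eq. (4)) is omitted for brevity.

import numpy as np

rng = np.random.default_rng(0)

def fss_step(Y, O, w, stp_dis=0.1, wt_scale=10.0):
    sz, n = Y.shape
    # Individual movement, Eq. (1): accept only improving moves.
    cand = Y + rng.uniform(-1, 1, size=(sz, n)) * stp_dis
    dO = np.array([O(c) - O(y) for c, y in zip(cand, Y)])
    improved = dO > 0
    dY = np.where(improved[:, None], cand - Y, 0.0)
    Y = Y + dY
    dO = np.where(improved, dO, 0.0)
    # Feeding, Eq. (5): weights grow with normalized improvement.
    if np.abs(dO).max() > 0:
        w = np.clip(w + dO / np.abs(dO).max(), 1.0, wt_scale)
    # Collective-instinctive movement, Eqs. (2)-(3).
    if dO.sum() > 0:
        H = (dY * dO[:, None]).sum(axis=0) / dO.sum()
        Y = Y + H
    return Y, w

O = lambda y: -np.sum(y ** 2)           # toy objective to maximize
Y = rng.uniform(-5, 5, size=(20, 4))    # 20 fish in 4 dimensions
w = np.full(20, 5.0)                    # initial weight wt_scale / 2
for _ in range(100):
    Y, w = fss_step(Y, O, w)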

3.3 Fish School Search Algorithm Improved Convolutional Neural Network for Protein Stability Prediction upon Double Mutation

The proposed work improves the accuracy of the Deep Convolutional Neural Network's protein stability prediction by examining the double mutation. The conventional CNN suffers from overfitting, and the parameters of the fully connected layer are tuned through backpropagation. To overcome this difficulty, and to exploit the structure of the double mutation points in the protein for predicting the stabilizing or destabilizing energy change, the fish school search algorithm is adopted in this work to enrich the learning phase by understanding the deep pattern of double mutants. Figure 5 depicts the adaptation of FSS in the fully connected layer of the CNN; the fish school search mimics the swimming behavior of fishes, which contract and expand during the search for food.


Fig. 5. Detailed structure of CNN-FSS for protein stability prediction upon double mutation

Algorithm: Fish School Search improved Convolutional Neural Network

Input: Double Mutation Dataset
Output: Prediction of Protein Stability

Begin
  Split the dataset into training and testing sets
  Initialize the required parameters
  t = 0
  While t < max_iter && loss(t) > expected_error
    For all training samples
      Train the input dataset with class labels by feeding the computation
      forward through the ReLU, Max Pooling and softmax layers
    End for
    The Fully Connected Layer predicts protein stability from the convolved
    features of the previous layers by multiplying each node value with the
    weight assigned to its link
    Fish School Search discovers the best weight values using its
    food-searching behavior, based on exploitation/exploration in the fully
    connected layer:
    For each fish in the population
      Perform the individual movement
      Perform the feeding process
      Perform the collective-instinctive movement
      Compute the fish school's barycenter
      Apply the collective-volitive movement
    End for
  End while
  Return the best weight values discovered
End
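To make the core idea concrete, the hedged sketch below replaces gradient descent on a final linear (fully connected) layer with accept-if-improving FSS individual moves over candidate weight vectors; the data, dimensions and fitness (mean log-likelihood) are illustrative assumptions, not the paper's setup.

import numpy as np

rng = np.random.default_rng(1)

def fitness(weights, X, y):
    # Mean log-likelihood of a softmax linear layer (higher is better).
    logits = X @ weights.reshape(X.shape[1], -1)
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return np.mean(np.log(probs[np.arange(len(y)), y] + 1e-9))

X = rng.normal(size=(64, 8))            # toy convolved features
y = rng.integers(0, 2, size=64)         # toy stability labels

school = rng.normal(size=(20, 8 * 2))   # 20 candidate weight vectors (fish)
for _ in range(200):                    # individual FSS moves only
    cand = school + rng.uniform(-1, 1, size=school.shape) * 0.05
    for i in range(len(school)):
        if fitness(cand[i], X, y) > fitness(school[i], X, y):
            school[i] = cand[i]         # accept only improving moves
best = school[np.argmax([fitness(s, X, y) for s in school])]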

4 Results and Discussions

This section discusses in detail the performance of the proposed CNN-FSS for predicting protein stability upon double mutation. The S2648 dataset of 180 double mutants is collected from the ProTherm database. The dataset comprises information about the wild-type residue, the mutant residue and the three neighbouring residues on both sides of the mutant residues. The CNN-FSS performance is compared with a standard Deep Neural Network and a Convolutional Neural Network. The evaluation metrics used are Accuracy, Precision, Recall and F-Measure (Table 1).

Table 1. Performance evaluation

Accuracy Precision Recall F-Measure


CNN-FSS 98.9 98.6 99.1 98.74977
CNN 90.1 89.3 89.6 89.69822
DNN 82.4 81.9 82.2 82.14924

The table shows the results obtained by the three deep learning models in predicting the double mutation protein stability change. For training, 90% of the dataset is used, and the remaining 10% is used to validate the efficiency of the prediction models. Tenfold cross validation is used for determining the prominence of the deep learning

models. The CNN-FSS achieves better results compared to the other conventional state of the art.

Accuracy = (Number of mutants correctly predicted) / (Total number of mutants)


Fig. 6. Comparison based on accuracy

Figure 6 displays the accuracy comparison of the three deep learning variants in double mutation-based protein stability prediction. The proposed Fish School Search improved convolutional neural network produced the best accuracy rate of 98.9%, while the conventional CNN attains 90.1% and the DNN produces 82.4%. In CNN-FSS, with its knowledge of the memetic nature of Fish School Search, the parameters involved in the prediction of protein stability are fine-tuned, whereas in the conventional CNN and DNN the parameters are assigned without any prior knowledge.
Precision = (Number correctly predicted as stabilized) / (Total number of stabilized mutants predicted)


Fig. 7. Comparison based on precision

The precision values obtained for double mutation-based protein stability prediction by CNN-FSS, CNN and DNN are shown in Fig. 7. The proposed CNN-FSS achieves the highest precision rate of 98.6% because the feature vectors involved in extracting significant features in each pooling and softmax layer are improved by applying the artificial fish schooling strategy; its food-searching nature optimizes the selection of potential features to produce a more accurate result. The CNN produced 89.3% and the DNN generated 81.9% because they perform feature reduction arbitrarily.
Recall = (Number of correctly predicted stabilized mutants) / (Number of correctly predicted stabilized and incorrectly predicted destabilized mutants)


Fig. 8. Comparison based on recall

Figure 8 shows the performance comparison based on the recall rate produced by CNN-FSS, CNN and DNN for predicting double mutation-based protein stability. The proposed model investigates the complex structure of the proteins in depth by discovering the important features in the dataset using the food-foraging nature of the fish school: the fish with the highest fitness values are considered the best searching agents, the corresponding features are considered the most suitable and passed to the other layers, and finally only the best features, those most relevant to the prediction process, are used. Thus, the proposed CNN-FSS produced the highest recall rate of 99.1%, while CNN and DNN generate 89.6% and 82.2%, respectively.
F-Measure = 2 × (Precision × Recall) / (Precision + Recall)
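All four reported metrics follow directly from the confusion counts; a small sketch (with illustrative counts, not the paper's) is:

def metrics(tp, fp, fn, tn):
    # tp/fp/fn/tn are confusion counts for the "stabilized" class.
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_measure

print(metrics(tp=90, fp=5, fn=4, tn=81))   # hypothetical counts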
Figure 9 shows the F-Measure values of the newly constructed CNN-FSS, CNN and DNN. The F-Measure metric is influenced by the results of both precision and recall; since CNN-FSS produced the best precision and recall rates, its F-Measure value is also high, at 98.8%, compared to 89.7% and 82.1% for CNN and DNN, respectively. The CNN and DNN use a random method for selecting the features, which may fail to take the most relevant ones into account, so during prediction they produce a lower correct detection rate than CNN-FSS in double mutation-based protein stability prediction.


Fig. 9. Comparison based on F-Measure

5 Conclusion

This paper concentrates on the prediction of protein stability upon double mutation by developing an improved fish school search based convolutional neural network. This work overcomes the existing issues of predicting the energy change caused by uncertainty and vagueness in double mutation. The convolutional neural network is used to perform accurate prediction, validated by ten-fold cross validation. The performance of the CNN is improved by inducing the fish school searching behavior, which uses an exploitation and exploration policy to find the best solution. The optimized weight values are used for predicting protein stability with reduced error and the highest accuracy compared to the standard convolutional neural network and deep neural network. The ProTherm dataset is used for the simulation of double mutation-based protein stability prediction. From the obtained results, it is observed that the proposed CNN-FSS achieves the highest accuracy in the presence of uncertainty in the double mutants.

References
1. Stein, A., Fowler, D.M., Hartmann-Petersen, R., Lindorff-Larsen, K.: Biophysical and mech-
anistic models for disease-causing protein variants. Trends Biochem. Sci. 44, 575–588
(2019)
2. Wei, L., Xing, P., Shi, G., Ji, Z., Zou, Q.: Fast prediction of protein methylation sites using
a sequence-based feature selection technique. IEEE/ACM Trans. Comput. Biol. Bioinform.
16, 1264–1273 (2019)
3. Savojardo, C., Martelli, P.L., Casadio, R., Fariselli, P.: On the critical review of five machine
learning-based algorithms for predicting protein stability changes upon mutation. Brief
Bioinform. 21(5), 1856–1858 (2019)
4. Savojardo, C., Petrosino, M., Babbi, G., Bovo, S., Corbi-Verge, C., et al.: Evaluating the
predictions of the protein stability change upon single amino acid substitutions for the FXN
CAGI5 challenge. Hum. Mutat. 40, 1392–1399 (2019)
5. Pandurangan, A.P., Ochoa-Montaño, B., Ascher, D.B., Blundell, T.L.: SDM: a server for
predicting effects of mutations on protein stability. Nucleic Acids Res. 45(W1), W229–W235
(2017)
192 J. Rozario and B. Radha

6. Fang, J.: A critical review of five machine learning-based algorithms for predicting protein
stability changes upon mutation. Brief Bioinform. 21(4), 1285–1292 (2020). https://doi.org/
10.1093/bib/bbz071. PMID:31273374; PMCID:PMC7373184
7. Wilson, C.J., Chang, M., Karttunen, M., Choy, W.Y.: KEAP1 cancer mutants: a large-scale
molecular dynamics study of protein stability. Int. J. Mol. Sci. 22(10), 5408 (2021). https://
doi.org/10.3390/ijms22105408. PMID:34065616; PMCID:PMC8161161
8. Montanucci, L., Capriotti, E., Frank, Y., Ben-Tal, N., Fariselli, P.: DDGun: an untrained
method for the prediction of protein stability changes upon single and multiple point
variations. BMC Bioinform. 20(14), 1–10 (2019)
9. Li, G., Panday, S.K., Alexov, E.: SAAFEC-SEQ: a sequence-based method for predicting the
effect of single point mutations on protein thermodynamic stability. Int. J. Mol. Sci. 22(2),
606 (2021)
10. Cao, H., Wang, J., He, L., Qi, Y., Zhang, J.Z.: DeepDDG: predicting the stability change of
protein point mutations using neural networks. J. Chem. Inf. Model. 59(4), 1508–1514 (2019)
11. Li, B., Yang, Y.T., Capra, J.A., Gerstein, M.B.: Predicting changes in protein thermodynamic
stability upon point mutation with deep 3D convolutional neural networks. PLoS Comput.
Biol. 16(11), e1008291 (2020)
12. Alvarez Machancose, O., De Andres Galiana, E.J., Fernández Martínez, J.L., Kloczkowski,
A.: Robust prediction of single and double protein mutations stability changes. Biomolecules
10(1), 67 (2019)
13. Kandathil, S.M., Greener, J.G., Jones, D.T.: Recent developments in deep learning applied to
protein structure prediction. Proteins 87(12), 1179–1189 (2019)
14. Dahl, G.E., Yu, D., Deng, L., Acero, A.: Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Trans. Audio Speech Lang. Process. 20(1), 30–42 (2012)
15. Bastos-Filho, C.J., Guimaraes, A.C.: Multi-objective fish school search. Int. J. Swarm Intell.
Res. (IJSIR) 6(1), 23–40 (2015)
16. Kumar, M., et al.: Thermodynamic databases for proteins and protein-nucleic acid interac-
tions. Nucleic Acids Res. 34, D204–D206 (2005)
Statistical Distribution
and Socio-Economics in Accordance
with the Indian Stock Market
in the COVID19 Scenario

Bikram Pratim Bhuyan1(B) and Ajay Prasad2


1 Department of Informatics, School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India
bikram23bhuyan@gmail.com
2 Department of Computer Application, School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India
aprasad@ddn.upes.ac.in

Abstract. The Covid pandemic had devastating effects on the human population, resulting in fatalities and social and economic losses. The exponential spread of the virus in recent times has had assertive impacts on financial markets all over the world; India is no exception. With the agglomeration of cases throughout the country, on 23rd March the stock markets dipped by a record 13.15%, with a huge impact on the country's GDP. This fall is by far the largest ever in Indian history. The returns of the stock market have been studied extensively in the literature, where most models assume that the returns are normally distributed, ignoring the contribution of market crashes and volatility to the fat-tailed structure of the distribution. In this research paper we exploit and understand the statistical distribution of the crash (market returns), which is found to be strongly fat-tailed, labelling it a Black Swan event. We have used the NIFTY 50 daily return as the primary dataset for the analysis and observations using Extreme Value statistics. The role of socio-economics in market fluctuation is predominantly explored in this article. The correlation between the Indian market and GDP growth is analysed statistically, concluding that the distribution of the returns generated is fat-tailed and that a positive correlation exists between the market and GDP. Hypothesis testing is performed to guarantee the proper distribution and analysis of the results, and the power law distribution is applied.

Keywords: Black Swan event · COVID-19 · Extreme value ·


Fat-tailed distribution · Indian stock market

1 Introduction
India is fighting the Covid-19 pandemic with over 31.1 million detected cases, resulting in 0.414 million reported deaths (as of 18th July 2021). The
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
H. Sharma et al. (Eds.): ICIVC 2021, PALO 15, pp. 193–206, 2022.
https://doi.org/10.1007/978-3-030-97196-0_16

imminent first preventive reaction from the Government of India came in the form of a nationwide lockdown for 21 days, constraining the movement of the entire 1.3 billion population, with an amalgamation of lockdowns that followed. In the final lockdown, India saw a classification of its 733 districts into three zones, Red, Orange and Green, with all travel, including air, bus and trains, remaining shut. All educational institutions, hotels, movie theatres, shopping malls, superstores, gyms, swimming pools and bars were also ordered to be closed. The series of setbacks, such as prolonged lockdowns and slower recovery, has hurt India, similar to the rest of the world [1]. The battered Indian economy is projected to contract by 4.5% this fiscal year, as per the International Monetary Fund [2]. Such predictions appear in the World Economic Outlook (WEO) forecast report. The United States' economy was projected, as of June 2020, to contract by 8.0%.
The financial capital of India, Mumbai, with its Bombay Stock Exchange (BSE) and National Stock Exchange (NSE), is the main seat of our observations. Here we elaborate our case beginning with a historical perspective of the market and the economy. Although the blame for the GDP fall could easily be attributed to the pandemic and lockdown effects, the beginning of the fall was felt earlier, as the economy was suffering and reflecting a slowdown [3]. However, a few months after the onset of the pandemic, the markets seem to be on a road to recovery despite the initial crash. Nonetheless, the economy appears to be in a state of confusion. In this exercise the authors statistically analysed the data collected from the top fifty companies of the Indian stock market (according to market capitalisation), correlating the same with the GDP of the country to ascertain the factual effect of one on the other in the long run. Finally, a thorough analysis of the market is carried out with proper statistical tools to comprehend its distribution, so as to support proper decisions in future.

1.1 COVID-19 and India: Not an Impasse


India shares the medical and economic imperatives with the rest of the affected nations: while lockdown is a medical exigency, the economic imperatives are profound. With an already pre-existing huge discrepancy in the economic foreground [3], urban India is fragile in the service sector when compared with the developed nations, requiring policies to be coherent and the resulting actions to reinforce each other [4]. According to a survey [5], 8.76 trillion rupees of gross value added (GVA) is the estimated loss in this component of Indian GDP. A loss of 2.81 trillion rupees is approximated in the professional sectors along with real estate and finance. Likewise, a 2.42 trillion rupee loss may be encountered in the hotel, trade and transport sector. Health facilities in India are a vivid reminder of the fact that the global average is far above India, whose expenditure is approximated at 1% of GDP in the union budget of 2020–21, far less than the targeted 2.5% [6]. The discrepancy in the medical sector can be highlighted by the fact that, in contrast to the WHO norm of a doctor-to-people ratio of 1:1000, India's stands at 1:1457; the nurse-to-people ratio, needed at 3:1000, is only 1:675 [7]. On one hand, while we notice a declining pace of new infections and hospital intensive care

occupancy rates elsewhere, India is touching new highs with each passing day. Consumers rely on their savings or other social safety nets as the recession digs deeper, resulting in faint consumption. The biggest hit is taken by the labour market, with a catastrophic high in job losses, where unemployment rose from 6.7% on 15th March to 26% on 19th April [4]. India, being a developing country, faces an employment crisis in the informal sectors like agriculture, construction, small trade and daily-wage work, as employment in such sectors is reported as 81% of the total [8]. More acutely affected are the low-skilled workers who do not have the option of working from home. An estimated 140 million people lost employment or had a salary cut [9]. According to an article [10], in April alone India saw more than 122 million people losing their jobs, around 75% of them small traders and wage labourers, with Tamil Nadu being the worst-hit state. The small traders, including hawkers, may return to work after the lockdown, but the salaried workers will take a greater toll, according to the survey by the Centre for Monitoring Indian Economy (CMIE).
Computing the likely economic impact: the first round of preventive measures India took was to restrict international travel and to impose stringent social distancing to slow the rate of spread of the virus [11]. These necessary measures took a direct toll on per capita income in all service sectors, ranging from tourism, travel and retail to trade and transport.
As alluded to before, India had been witnessing a pre-pandemic slowdown, and the coronavirus effect has exacerbated the depreciation of India's growth, which went down to 3.1% according to the Ministry of Statistics [12]. The World Bank and other credit rating agencies have already downgraded the GDP estimates to negative figures, signalling a deep, or even the worst, recession since independence [13].
The Government of India took the cresset into its hands and announced a variety of measures to deal with the situation, ranging from food security and funds for health care to sector-related incentives and tax deadline extensions. As economic relief to the poorer sections, a total of over 24 billion was released in various forms. The RBI (Reserve Bank of India) lent out 7 billion dollars to special financial institutions dealing with agriculture, small industry and housing, like NABARD, SIDBI and NHB [14]. The government changed India's foreign investment policy to protect Indian companies big or small. On 12th May, the current Prime Minister of the country announced an overall economic package worth 280 billion dollars, which makes up 10% of India's GDP [16]. With a dramatic reduction in the price of crude oil, the trade of the nation also received a positive income transfer.

2 Indian Stock Market Crisis: A Historical Perspective

The Indian stock market is one of the oldest in Asia, dating back to the close of the 18th century, when loan securities were transacted by the East India Company [15]. It was in the mid-1850s that a group of 22 stockbrokers began trading with an amount of one rupee under a banyan tree opposite the

Town Hall of Bombay. In 1875, the naïve association was named the Bombay Stock Exchange (BSE). Finally, in 1956, the Government of India recognized the BSE as the first stock exchange in the country under the Securities Contracts Act. The other exchange recognized by the government is the National Stock Exchange (NSE), which started trading on 4th November 1994. So far India has seen three historic stock market crashes [17] in which the markets plummeted by more than 10%, the most recent one gifted by the coronavirus pandemic.
The 1992 Indian stock market scam, also known as the Harshad Mehta scam [18], is historically recorded as the biggest scam in Indian history; the markets dipped by 12.77% on 28th April 1992. The main perpetrator of the scam, Harshad Mehta, made the impossible happen, as the impact made the entire securities system collapse, leading investors to lose thousands of rupees in the exchange system. The scam was orchestrated in such a way that the banking organizations provided secured securities to the broker Mehta against forged cheques signed by corrupt officials, which they later failed to recover. Known as the big bull of his times, Mehta made the prices of stocks with no sound fundamentals soar to record highs through fictitious practices, which he would later sell for humongous profits.
Another major crisis appeared in 2007–2008, which needs no introduction, popularly known as the housing market collapse in the United States of America [19]. This conglomerated into a butterfly effect, with a worldwide recession leading to market crashes in many countries that were not directly involved. Known as the biggest disaster after the Great Depression, the financial crisis was caused by the bubble created by the housing market in the US, which killed many American as well as Indian dreams. The markets fell after multiple dips throughout the year 2008, with a huge negative return of 10.95% on 24th October 2008 [20].
Fast forward 12 years, and we are now dealing with a medical crisis against the backdrop of an already tumbling economy, in which we witnessed a fall of 13.15% on 23rd March 2020 [21].

2.1 Contexting the Fall of Markets


The IMF has projected India's growth rate at −4.5% for 2020, a historic low since 1961, leaving the government in uncharted territory [2]. It cited the fact that the contraction is due to all economic activity being stalled thanks to the unprecedented nature of the coronavirus pandemic. The IMF's projection is by and large in line with the estimates from investment banks and other international rating agencies. With strained government finances and crashing tax revenues, India could see a 90% spike in the debt-to-GDP ratio. Now the primary question is: was India doing well before this pandemic?
Before declaring that the economy had already been debilitated through years of mismanagement, we should refer to some statistics provided by the national statisticians at the end of May. In the financial year ending in March, the GDP of India grew only by 4.2% [22]. A deeper analysis of the quarterly GDP growth numbers shows that from 7% growth we shrank to 6.2%,

then to 5.6%, 5.7%, 4.4% and finally 3.1% in the March quarter, the slowest growth in 8 years [22]. So the country was already going through a slowdown [23].
Answers can be found when we look at the investment and debt figures. Investment shrank by almost 3% over the year and government spending increased by 12%, almost twice the growth rate of private consumption [4]. As a result, the government had a fiscal deficit of 4.6%, higher than its predecessor. With consumer demand already in the doldrums, accrued with record borrowing abroad of 22 billion dollars, India's rating was a step closer to junk territory. Along with these, the Indian central bank tried to twist its balance sheet by selling short-term government bonds to banks and buying long-term securities in turn.
According to the National Statistical Office data [24], the manufacturing sector had grown by merely 0.03% in 2019–20 and the construction sector had declined by 1.3%. Gross capital formation too remained low, with the growth in bank deposits declining to 7.9%, depicting low savings. Bank credit growth reduced by more than half to 6.1%, showing lower consumption [25].
Thus we can see that the Indian economy had already taken a heavy knock even before the pandemic had shown its proper face. In this paper we study the market data and its statistical distribution during and before the pandemic to understand the future of Indian markets with respect to the economy.

3 Materials and Methods

In a theoretical environment, the stock market and the GDP of a country should be closely connected [26]. While the stock market is often depicted as a sentiment indicator, the GDP, or gross domestic product, reflects the output of all goods and services in an economy. The economy of any country translates into companies' profits, which drive stock prices. However, the link is not exact, as not only domestic companies are allowed to be listed in the market, and valuations are not constant, since sentiment and confidence vary and are vastly impacted by politics and the media. The economy of a state is measured from past collected data, whereas markets are fuelled by expectations. Still, a relation can be mined out of both indices: in a bull market, with rising market prices, consumers and companies have more wealth and confidence, leading to higher spending and higher GDP. The inverse is true in the case of a bearish market.
Let us try to understand the relation between India's GDP and market returns. Data is collected for the past 26 years, from 1996 to 2021. For each year, the data consists of India's GDP growth (in percentage) and NIFTY 50 returns (in percentage).
Based on the data collected, we have plotted the numbers in line graphs as shown in Fig. 1. It can easily be observed that the market data has more fluctuations than the corresponding GDP figures. It can be seen that in some years both of them increase or decrease together, but this is not the

Fig. 1. NIFTY returns vs GDP of India

case that can be observed globally. To understand more about the correlation between the two, we take the help of statistical covariance, which indicates whether any relation (positive, negative or none) exists between the variables in question.

Fig. 2. Scatter plot NIFTY returns vs GDP of India

We now plot our data in a scatter plot, i.e., for each GDP data point, the corresponding NIFTY return is plotted. We can see that there might be some kind of relation between the two: when one increases, the other tends to increase. We then computed the correlation between the two variables using the following formulation.


(GDPi − GDP )(N IF T Yi − N IF T Y )
r(GDP, N IF T Y ) =   (1)
(GDPi − GDP )2 (N IF T Yi − N IF T Y )2

where
r(GDP, NIFTY) denotes the correlation between GDP and NIFTY returns,
GDPi denotes the GDP in year i,
GDP̄ denotes the average of the GDP values from 1996 to 2021,
NIFTYi denotes the NIFTY return in year i, and
NIFTȲ denotes the average of the NIFTY returns from 1996 to 2021.
Once computed, the value turned out to be 0.63. This shows that the variables in question are positively correlated, i.e., when GDP increases, the NIFTY return is also seen to increase. This is a very important result, as it indicates that even if, for a short duration, the NIFTY return might not track the exact GDP growth, both of them are going to converge over a longer duration. The current situation of India has the same property: the markets seem to be on a road to recovery although the economy appears to be in a state of confusion, but in due course they will share the same distribution. Let us hence now discuss the statistical properties of the market data; a small sketch of the computation follows.
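A minimal sketch of the Eq. (1) computation, with illustrative stand-in series rather than the paper's data:

import numpy as np

gdp = np.array([7.0, 6.2, 5.6, 5.7, 4.4, 3.1])        # hypothetical % growth
nifty = np.array([12.0, 3.0, -1.0, 9.0, -4.0, -8.0])  # hypothetical % returns

r = np.corrcoef(gdp, nifty)[0, 1]   # equivalent to Eq. (1)
print(round(r, 2))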

4 Methodology and Procedure of Interpretation


As discussed in the previous section, the Indian market has a positive correlation with the country's GDP growth in the long run. In this paper, we discuss the statistical distribution of the Indian stock market crash in terms of returns, along with the robustness of the estimates, which can be used for serious policy-making decisions in future. The returns of the stock market have been studied extensively in the literature, where most models assume that the returns are normally distributed, ignoring the fact that market crashes and volatility contribute to the fat-tailed structure of the distribution. In this research work we exploit and understand the statistical distribution of the crash (market returns), which is found to be strongly fat-tailed, labelling it a Black Swan event [27]. With proper extreme value statistics for the tail analysis, we guarantee the robustness of our estimates. We have used the NIFTY 50 daily returns as the primary dataset for the findings and observations using Extreme Value statistics. The statistics of the dataset, such as the mean, variance, standard deviation, skewness and kurtosis of the generated returns, are studied. The following hypotheses were tested to understand the proper distribution of the data:
Hypothesis 1.1 - The distribution followed by the collected data is the same as the normal distribution.
Hypothesis 1.2 - The distribution followed by the collected data has a better fit to the power law distribution as compared to the lognormal distribution.
Hypothesis 1.3 - The distribution followed by the collected data has a better fit to the power law distribution as compared to the exponential distribution.

5 Results
The NIFTY 50 dataset is fetched from the NSE and comprises the top 50 companies across India according to their market capitalisation. The data is

collected for the last five years ranging from July 20, 2015 to July 15, 2021.
The total number of trading days in this duration happened to be 1235 days.
With these bevy of days in our collection, we began our observation taking into
consideration the closing price of the market, which basically is the price at
which the market settles after the day.
The graph in Fig. 3 clearly shows the price movement of the index with respect to the number of days taken into account. The huge drop in the index price can be clearly observed in the graph. This drop developed gradually during March 2020 and finally took a nosedive on March 23, 2020, when the index reported a closing price of 7610.25. The drop was about 30% below the index price of 12035.8 reported on January 30, 2020 (the day the first case was reported in India).

Fig. 3. Price movement vs number of days

To analyse the data, establishing its distribution is the very first step. Figure 4 depicts the frequency distribution of our data. It can be observed that the market traded in the range between 10500 and 11000 for a majority of the days taken into consideration. Apart from that, the distribution is uneven and no proper analysis can be done on it. Hence we look at the daily market returns instead of the price movement, given by the formulation below.
Rt = (Pt − Pt−1) / Pt−1  (2)

where Rt is the return computed on day t from the price Pt on day t and the price Pt−1 on the previous day. A sketch of this computation and of the summary statistics discussed next is given below.
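A small sketch (assuming numpy and scipy) of Eq. (2) and the moments discussed in the following paragraphs, with a random placeholder price series:

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
prices = 10000 + np.cumsum(rng.normal(0, 50, size=1235))   # placeholder series
returns = (prices[1:] - prices[:-1]) / prices[:-1]         # Eq. (2)

print(returns.mean(), returns.var())                       # mean, variance
print(stats.skew(returns), stats.kurtosis(returns, fisher=False))  # normal -> 3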

Fig. 4. Frequency distribution of the price

After computing the daily returns for each day, the data is plotted in a graph as shown in Fig. 5. Huge volatility may be observed during the days already discussed for Fig. 3.
Upon plotting the frequency distribution of the daily returns, a type of bell curve can be observed, as shown in Fig. 6. At first look the graph suggests a good normal distribution upon which further analysis could be done. To be sure of it, however, we examined its statistics.

Fig. 5. Daily Return movement vs number of days

The minimum daily return observed is −12.98% (−0.1298) and the maximum is 8.76% (0.0876). The minimum return was recorded at the close of March 23, 2020. The mean of the distribution is found to be 0.000237 and the variance 0.000132, so the distribution can be taken to have a near-zero mean with a standard deviation of 0.011489. In this case, the minimum daily return is more than 11 standard deviations away from the mean.

Fig. 6. Frequency distribution of daily returns

The skewness is observed to be −1.3648, which means the distribution is negatively skewed and the tail on the left side of the distribution is fatter than that on the right. The kurtosis is seen to be 22.4926. For a proper normal distribution, the value of the kurtosis is 3, and any distribution with a greater value is observed to have heavier tails. In our case, the value is far greater than 3.
To gain more insight into our data, we conducted a hypothesis test where the null hypothesis was that our distribution is the same as the normal distribution. After the chi-squared test, the p-value was found to be 8.55 × 10⁻¹²³, far less than the significance level of 0.05, which means we cannot accept the null hypothesis: our distribution is not the same as the normal distribution. A sketch of such a test is given below.
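One way to reproduce such a normality test, assuming scipy, is sketched below; normaltest combines skewness and kurtosis into a chi-squared statistic, and the fat-tailed sample here is only a stand-in for the returns.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
returns = rng.standard_t(df=3, size=1235) * 0.01   # fat-tailed stand-in
stat, p = stats.normaltest(returns)                # chi-squared statistic
print(p < 0.05)                                    # True -> reject normality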
Our goal is to find the correct distribution of our data in order to exploit its properties for further prediction and decision making.

5.1 Power Law Distribution Utilised

As the name implies, the power law states that a relative change in one quantity results in a proportional relative change in another. A common example is the area of a square: if we increase the side of the square by a factor of 2, the area is enlarged by a factor of 4. This distribution comes in handy when distributions do not follow the typical central limit theorem behaviour of the normal distribution; thus, distributions bearing large tails are generally considered. In such a case, we analyse the exponent of the distribution to gain more insight. If we denote the exponent by α, its value determines which of the moments (mean, variance, skewness and kurtosis) are finite. Mathematically speaking, only the moments of order less than α − 1 exist; all the rest are infinite. In the context of this paper, only the mean exists, and the rest, i.e., variance, skewness and kurtosis, remain undefined. The extreme skewness of such distributions has also led to the 80-20 rule, which in general economic terms means that 80% of the wealth is held by 20% of individuals.

Due to the heavy-tailed nature of the data, the power law probability distribution is theoretically interesting and is widely used throughout nature, in fields ranging from linguistics and neuroscience to astrophysics. But the catch lies in the fact that assessing the goodness of fit of the power law distribution to empirical data is non-trivial. A power law distribution has the form

Y = f(X) = kX^(−α)  (3)

where X and Y are the variables of interest, α is the law's exponent, and k is a constant.


Figure 7 represents the probability density function (PDF) of our data of
daily returns. On fitting our distribution to the power law, it was found that
α = 2.9401 with a standard error of 0.2109. It could be easily observed that
the function is not smooth. This is because of the highly improbable negative
returns that was observed as mentioned in the previous sections.

Fig. 7. Probability density function of daily returns

We cannot directly conclude that our data follow the power law distribution.
We have to compare it with other similar distributions and find the goodness
of fit. For this we have used the Kolmogorov–Smirnov test to generate the p-value
for an individual fit, and log-likelihood ratios to determine which of two
fits is better. Upon comparing the power law distribution with the lognormal and
exponential distributions, the p-values were found to be 0.2355 × 10⁻¹² and
0.455 × 10⁻¹³ respectively; as these statistics are very small, we fail
to reject Hypotheses 1.2 and 1.3. With the distribution confirmed to follow a power
law with a heavy tail, we can in future concentrate on extreme value
statistics dealing with the stochastic behaviour of rare events. Ontological
works like [28] and [29] can also be used for further analysis.
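A sketch of how such a comparison can be run with Alstott et al.'s powerlaw package is shown below; the input series is a synthetic heavy-tailed stand-in, and fitting the magnitudes of returns rather than the signed returns is our simplifying assumption:

import numpy as np
import powerlaw   # pip install powerlaw (Alstott, Bullmore & Plenz)

# Hypothetical stand-in for the return magnitudes; the actual study fits
# the empirical distribution of Indian market daily returns.
rng = np.random.default_rng(1)
data = np.abs(0.01 * rng.standard_t(df=3, size=5000))

fit = powerlaw.Fit(data)                 # estimates alpha and x_min
print("alpha =", fit.power_law.alpha)    # the paper reports ~2.9401
print("std err =", fit.power_law.sigma)  # and a standard error of ~0.2109

# Log-likelihood ratio tests against rival heavy-tailed candidates:
# R > 0 favours the power law; p is the significance of that sign.
for rival in ("lognormal", "exponential"):
    R, p = fit.distribution_compare("power_law", rival)
    print(f"power_law vs {rival}: R = {R:.3f}, p = {p:.3e}")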

6 Discussion and Conclusions

The paper dealt with various factors that were responsible for the slowdown of
the economy before the pandemic. In this paper the authors undertook a socio-
economic perspective of the Indian economy amidst the COVID-19 pandemic.
On the basis of the historical perspective of the Indian stock market, its
major crashes, and previous experience with recovery, the writers of this paper are
confident that India appears to have the capacity to return to reasonable and positive
growth after the current slowdown. Although in recent times a disconnect
was displayed between the market and the economy, the Indian government's interventions
are very likely to correct this anomaly. Signals to that effect are
beginning to be observed through the growing positive correlation between the
market and GDP. Often it is surmised that if one increases, the other does the
same and vice versa, almost like a guarantee.
On the market returns data collected, the statistical analysis showed that
the data are heavily dispersed, with the minimum return more than 11 standard
deviations away from the mean. In a typical normal distribution, the data follow
the 68-95-99.7 rule, meaning that 99.73% of the data lie within 3 standard
deviations of the mean. This opens a horizon for further analysis, as the dispersion
in our data is large. When the kurtosis is observed, our data showed a
value of more than 22, far more than the typical normal distribution's kurtosis
of 3. To confirm this, a hypothesis test was conducted; the
chi-squared test confirmed that our data cannot be said to follow the normal
distribution and instead have a heavy tail. Once the data were fit to the power law
distribution, chosen for its accommodation of the heavy-tailed property, the fit
was further subjected to goodness-of-fit analysis using likelihood ratio tests against
other known distributions such as the lognormal and exponential distributions. The
power law outperformed both of them, confirming it as a valid distribution for the
data. As we obtained α with 2 < α < 3 by fitting our data to the power law,
we surmise that the first moment (mean) is finite and the higher moments
(variance, skewness and kurtosis) are infinite. We suggest that future studies in
this area can now apply the 80-20 rule, stating that 80% of the market returns
are generated by 20% of the stocks.
Finally, we conclude the work by stating that the markets of India have seen a
great plummet, yet investor confidence has since returned and the recovery
rate of the pandemic cases has increased. Against the backdrop of a soon-to-be-
realised vaccine in the news after a rat race in the testing phase, the economy
and markets are sure to heal, return to their earlier track and see more glorious
days in future.

References
1. Ray, D., Subramanian, S.: India’s lockdown: an interim report. Indian Econ. Rev.
55(1), 31–79 (2020)
2. Nam, C.W.: World economic outlook for 2020 and 2021. In: CESifo Forum 2020,
vol. 21, no. 02, pp. 58–59. ifo Institut – Leibniz-Institut für Wirtschaftsforschung
an der Universität München, München
3. Vijayakumar, N.: A Study on the Causes and Consequences for the Present Slow-
down in the Indian Economy
4. Singh, M.K., Neog, Y.: Contagion effect of COVID19 outbreak: another recipe for
disaster on Indian economy. J. Public Aff. 20(4), e2171 (2020)
5. Mishra, H.H.: Coronavirus in India: COVID-19 lockdown may cost the economy
Rs. 8.76 lakh crore; here’s how. Business Today, March 2020

6. Bhaskarabhatla, A.: Maximum resale price maintenance and retailer cartel profits:
evidence from the Indian pharmaceutical industry. Antitrust Law J. 83(1), 41–73
(2020)
7. Dutta, R., Chowdhury, S., Singh, K.K.: IoT-based healthcare delivery services to
promote transparency and patient satisfaction in a corporate hospital. In: Machine
Learning and the Internet of Medical Things in Healthcare, 1 January 2021, pp.
135–151. Academic Press (2021)
8. Bonnet, F., Vanek, J., Chen, M.: Women and men in the informal economy: a
statistical brief. International Labour Office, Geneva.
http://www.wiego.org/sites/default/files/publications/files/Women. Accessed 20 January 2019
9. Mehta, R.: COVID impact: 2 in 5 employees are facing salary cuts, finds survey.
Retrieved 21 June 2020
10. Bhattacharyyaa, R., Sarmab, P.K., Nathc, M.M.: COVID-19 and India’s Labour
migrant crisis. Int. J. Innov. Creat. Change (2020)
11. Ghosh, A., Nundy, S., Mallick, T.K.: How India is dealing with COVID-19 pan-
demic. Sens. Int. 1(1), 100021 (2020)
12. Jha, P.: Factors responsible for slowdown of Indian economy 2020 and methods to
mitigate them. Name Page No. 189
13. No, D.: Ministry of Statistics and Programme Implementation
14. Garcini, L.M., Domenech Rodríguez, M.M., Mercado, A., Paris, M.: A tale of two
crises: the compounded effect of COVID-19 and anti-immigration policy in the
United States. Psychol. Trauma: Theory Res. Pract. Policy 12(S1), S230 (2020)
15. Das, S.: Governor's Statement. RBI Website, New Delhi, India (2020)
16. Mukherjee, D.: Comparative analysis of Indian stock market with international
markets. Great Lakes Herald 1(1), 39–71 (2007)
17. Das, G.: Jobonomics: India’s employment crisis and what the future holds,
Hachette UK, 20 January 2019
18. Safeer, M., Kevin, S.: A study on market anomalies in Indian stock market. Int. J.
Bus. Admin. Res. Rev. 1, 128–137 (2014)
19. Basu, D., Dalal, S.: The Scam: From Harshad Mehta to Ketan Parekh. KenSource
Information Services (2007)
20. Goodhart, C.A.: The background to the 2007 financial crisis. International Eco-
nomics and Economic Policy (2008)
21. Ali, R., Afzal, M.: Impact of global financial crisis on stock markets: evidence from
Pakistan and India. J. Bus. Manag. Econ. 3(7), 275–282 (2012)
22. Rakshit, B., Basistha, D.: Can India stay immune enough to combat COVID19
pandemic? An economic query. J. Public Aff. 20(4), e2157 (2020)
23. Arora, P., Suri, D.: Redefining, relooking, redesigning, and reincorporating HRD in
the post Covid 19 context and thereafter. Hum. Resour. Dev. Int. 23(4), 438–451
(2020)
24. Shetty, G., Nougarahiya, S., Mandloi, D., Sarsodia, T.: COVID-19 and global com-
merce: an analysis of FMCG, and retail industries of tomorrow. Int. J. Curr. Res.
Rev. 12(17), 23–31 (2020)
25. Roy, A.: The pandemic is a portal. Financ. Times 3(4) (2020)
26. Reddy, D.L.: Impact of inflation and GDP on stock market returns in India. Int.
J. Adv. Res. Manag. Soc. Sci. 1(6), 120–136 (2012)
27. Taleb, N.N.: The black swan: the impact of the highly improbable. Random House,
17 April 2007
28. Bhuyan, B.P., Karmakar, A., Hazarika, S.M.: Bounding stability in formal concept
analysis. In: Bhattacharyya, S., Chaki, N., Konar, D., Chakraborty, U., Singh,
C. (eds.) Advanced Computational and Communication Paradigms. AISC, vol.
706, pp. 545–552. Springer, Singapore (2018). https://doi.org/10.1007/978-981-10-8237-5_53
29. Bhuyan, B.P.: Relative similarity and stability in FCA pattern structures using
game theory. In: 2017 2nd International Conference on Communication Systems,
Computing and IT Applications (CSCITA), 7 April 2017, pp. 207–212. IEEE (2017)
Business, Finance and Decision Making Process
- The Influence of Culture on Foreign Direct
Investments (FDI)

Zoran Ðikanović1(B) and Anđela Jakšić-Stojanović2
1 Faculty of International Economics, Finance and Business, University of Donja Gorica,
Podgorica, Montenegro
zorandjikanovic@udg.edu.me
2 Faculty of Culture and Tourism, University of Donja Gorica, Podgorica, Montenegro

Abstract. This paper deals with the influence of culture and cultural values on
foreign direct investments (FDI). It tries to identify the level of influence of
culture on the expectations, beliefs, attitudes, behavior, etc. of investors from different
cultural backgrounds, as well as the cultural values which most affect
investors' decision-making process. Additionally, the paper intends to identify the
main trends and perspectives regarding cultural globalization such as hybridiza-
tion, homogenization and conflict intensification and the level of their influence
on the foreign direct investments. The speed and level of the influence of these
cultural trends and perspectives on the global market will still be significantly
influenced by the model of cultural values that dominantly exist in particular cul-
ture (Power Distance; Individualism or Collectivism; Masculinity or Femininity;
Uncertainty Avoidance; Long Term Orientation; Indulgence or Restraint).

Keywords: Culture · Cultural values · Influence · Foreign direct investments ·


Globalization

1 Introduction
Globalization, internationalization and the implementation of modern ICT technologies
have influenced the creation of a more dynamic and competitive business environment than ever before.
Companies from all around the world discuss the possibilities of investing in foreign
markets and these decisions are not easy. They regularly depend on many aspects such as
economic, geographical, socio-political, cultural, etc. Culture is very often considered to
be one of the most important factors that influence foreign direct investments (FDI).
There are many open questions regarding the level of correlation between some cultural
values and expectations, beliefs, attitudes, behavior etc. of investors from different cul-
tural backgrounds [1]. On the other hand, given the process of globalization
and the ongoing changes in the structure of financial markets, it is to be expected that
new trends and perspectives will appear and that, depending on the level of their influence,
the role and influence of culture on foreign direct investments (FDI) may change
considerably as well.


2 The Influence of Culture and Cultural Values on Foreign Direct Investments

“Culture comprises those aspects of human activity which are socially rather than genet-
ically transmitted. Each social group is characterized by its own culture, which informs
the thought and activity of its members in myriad ways, perceptible and imperceptible”
[2].
Cultural factors are very often considered to be very important in the process of
decision-making regarding investments, especially regarding two types of decisions [3]:
making a direct investment abroad and choosing a host country. Culture and its values
have significantly influenced investment behavior and the evolution of the financial
system. According to some authors, there are six cultural values that influence and affect
foreign direct investments (FDI) and these are [4]: Power Distance; Individualism or Col-
lectivism; Masculinity or Femininity; Uncertainty Avoidance; Long Term Orientation;
Indulgence versus Restraint. They are presented in Fig. 1.

Fig. 1. Hofstede's model – cultural values that influence foreign direct investments (FDI)

Many research papers and analyses have shown a positive correlation between different
cultural values and foreign direct investments. For example, research shows that [1]:
similarities in Power Distance between two countries positively affect foreign direct
investments, while the presence of high Uncertainty Avoidance in one of them also
plays a role. A common language has a positive influence on bilateral FDI, which means,
for example, that a company first invests in a country in which the same language is
spoken [5]. According to the results of the research, countries with similar cultural
values have more chances to establish

strong cooperation and improve the internationalization process, especially from the
foreign direct investments point of view.
Every culture has its own views on the investing process, deeply rooted in its cultural
and historical background and highly influenced by different cultural factors and values.
For example, some research [6] showed that Anglo-Saxon investors tolerate the biggest
losses, while German investors are the most patient ones. At the same time, the same
research describes Nordic investors as more patient and with greater risk tolerance,
willing to pay a higher price for shares than investors with high risk aversion, such as
investors from Eastern Europe. The research also showed that investors in Anglo-Saxon
countries are willing to pay more for shares than investors in other countries.
Another interesting finding refers to market development which is highly correlated
with the degree of individualism of the countries themselves. For example, in individu-
alistic countries, there are more “ego-traders” in search of quick profits, which leads to
greater market development, such as in the United States. In some other countries, such
as the Nordic countries and Germany, there are many more traders who prefer to wait to
earn higher yields than to cash in on lower earnings today.
Also, in this context it is interesting to mention that European clients, for example,
often choose to delegate their investments to financial institutions, especially banks,
while Asian investors of similar age and wealth profile do not like to delegate investment
decisions and usually prefer the advisory role of financial institutions and banks.
All these results clearly show that culture, and especially some cultural factors,
significantly influences foreign direct investments and that, according to the results of
much theoretical and empirical research, the level of correlation is quite high.

3 The Relationship Between Culture and Globalization and Its Influence on Foreign Direct Investments (FDI)

Although globalization is very often perceived as an economic process, the fact is that it has
a significant impact on other fields such as culture, politics, ecology, technology,
etc. Its role is extremely important in culture, having in mind the fact that it implies the
establishment of a multiethnic and multicultural society in which different cultures coexist,
complementing and permeating each other. Thanks to the development of
modern ICT technologies and mass media, the information revolution, intensive migrations
of people all around the world, etc., the world has become more global than ever before.
In the last decade, some important cultural trends and perspectives have emerged on
the global market that may affect the level of influence of culture on foreign direct
investments (FDI), as presented in Fig. 2.
The first trend refers to hybridization. Some authors [7, 8] understand globalization
as a long-term historical process that started a long time ago and that tends to bring
different cultures into interrelation. The second refers to homogenization, which is
dominantly focused on consumer culture [9] and which would lead to the end of cultural
diversity and the beginning of an era of human monoculture [10]. Some authors [11] even state
that the globalization of culture may lead to conflict intensification and that a serious
clash of civilizations may happen.

Fig. 2. Cultural trends and perspectives on the global market

4 Conclusion
The process of globalization has significantly changed the financial market and its structure
[1]. It is quite clear that trends and perspectives regarding cultural globalization, such
as hybridization, homogenization and conflict intensification, may affect the level of
influence of culture on foreign direct investments (FDI), but it is quite difficult to predict
and estimate their speed and level. What is quite certain is the fact that the speed and level
of the influence of the mentioned cultural trends and perspectives on the global market will
still be significantly influenced by the model of cultural values that dominantly exists
in a particular culture (Power Distance; Individualism or Collectivism; Masculinity or
Femininity; Uncertainty Avoidance; Long Term Orientation; Indulgence or Restraint).

References
1. Ðikanović, Z.: Investicije i kultura, pp. 307–315. Institut društvenih nauka, Centar za
ekonomska istraživanja (2015)
2. Routledge Encyclopedia of Philosophy. https://www.rep.routledge.com/articles/thematic/culture/v-1. Accessed 04 Aug 2021
3. Ribeiro Goraieb, M., Reinert do Nascimento, M., Cortez Verdu, F.: Cultural influences on
foreign direct investment. Revista Eletrônica de Negócios Internacionais (Internext) 14(2),
128–144 (2019)
4. Hofstede, G.: Dimensionalizing cultures: the hofstede model in context. Online Readings in
Psychol. Cult. 2(1), 1–26 (2011)
5. Tang, L.: The direction of cultural distance on FDI: attractiveness or incongruity? Cross
Cultural Management 19(2), 233–256 (2012)
6. Hens, T., Meier, A.: Behavioral finance: the psychology of investing. Credit Suisse AG (2015)

7. Pieterse, J.N.: Globalization and Culture. Rowman & Littlefield (2003)


8. Ghosh, B.: Cultural changes in the era of globalisation. J. Dev. Soc. 27(2), 153–175 (2011)
9. Kraidy, M.: Hybridity, or the Cultural Logic of Globalization. Philadelphia, PA: Temple
University Press. pp. 1–23 (2005)
10. Jaffe, E.D.: Globalization and Development. Infobase Publishing. p. 48 (2006)
11. Huntington, S.: The clash of civilizations. Foreign Affairs. 72(3), 22–23, 25–32, 39–41, 49
(1993)
Towards a Problematization Framework of 4IR
Formalisms: The Case of QUALITY 4.0

John Andrew van der Poll(B)

Digital Transformation and Innovation, Graduate School of Business Leadership (SBL),


University of South Africa (Unisa), Midrand, South Africa
vdpolja@unisa.ac.za

Abstract. The use of Formal Methods (FMs) as a software engineering paradigm


remains contentious. Advocates of the use of FMs point out the enhanced quality
of software that may be obtained through reasoning about the properties of a
formal specification, while critics mention the steep learning curve in mastering
the underlying mathematics and logic for the efficient use of these techniques. Be
that as it may, with the advent of the Fourth Industrial Revolution (4IR) in which
humans and intelligent machines are anticipated to work together, it is imperative
that the software driving these machines be provably correct or at least highly
dependable, making a strong case for the use of FMs. In this paper, following
an inductive research approach we problematize the use of FMs for software
development by considering numerous aspects like a formal specification of the
Quality 4.0 framework, problematization aspects in the literature, and the role of
upper managers in promoting or prohibiting the use of FMs. A problematization
framework for FMs use in the 4IR is developed and validated through a brief
theoretical analysis. Future work in this area may involve validating the framework
among stakeholders in the software industry and developing a solution framework.

Keywords: Automation · Formal Methods (FMs) · Fourth Industrial Revolution


(4IR) · Management · Problematization · Quality 4.0

1 Introduction

The Fourth Industrial Revolution (4IR), also known as Industry 4.0, continues to make
inroads into society, having significant influence on the digitalization strategies of companies.
It featured as an important item on the World Economic Forum (WEF) agenda in
January 2016 and is a standing item on their agenda annually [1]. The 4IR is characterized
by a fine interplay in which humans and machines work closely together and machines
take over some of the tasks previously performed by humans [2]. It blurs the distinctions
among biological, digital and physical spheres and integrates cyber-physical systems and
Additive manufacturing, Advanced Analytics, Artificial intelligence-based systems, Big
data, Cloud/Edge computing (C/EC), the Internet of Things (IoTs), Quantum computing
and Robotics [2, 3].


Reference [4] indicated that the Fourth Industrial Revolution is being driven by
extreme automation and connectivity; hence it is vital that the software that drives these
machines be reliable and highly dependable.
The use of software engineering Formal Methods (FMs) based on discrete mathe-
matics and formal logic to achieve software reliability and correctness has been a bone
of contention. Advocates of the use of these techniques point to the advantages to be
gained, for example, showing that a system meets its specification, or that undesirable
consequences are absent from the running software, achieved through reasoning about
the properties of the specification [5]. Opposition to the use of FMs include questioning
the return on investment (ROI) for system design and implementation owing to the steep
learning curve in mastering the underlying formalisms.
With the advent of the 4IR, numerous spheres of life took an interest in the oppor-
tunities offered by this new digitalization. Frameworks for incorporating 4IR aspects
into their existing ones emerged. Examples of these include Airport 4.0 [6] aimed at
developing smart aerotropoli in which 4IR technology is included in, amongst others,
flight information, customer (passenger) relationship care, and transforming the tradi-
tional low-cost airline model into a profit-generating business, and Quality 4.0 [7], a
framework for quality assessment of processes and data in the 4IR. Implicit in these
frameworks are considerations of structured and unstructured data which have implica-
tions for FMs in the 4IR. The Quality 4.0 framework is presented in Sect. 3.4 of this
work.
While the 4IR may be in its infancy with respect to solutions to the challenges brought
about by its use, it may be appropriate to take a step back and conduct a comprehensive
synthesis of the underlying challenges through problematization [8, 9].
Consequently, in this work we define a problematization framework for promoting the
use of Formal Methods in the 4IR, i.e., enhancing the quality of software and underlying
data in the new norm.
The layout of this paper is: Following the introduction, the research questions (RQs)
and research objective underlying this work are given in Sect. 1.1. Section 2 defines
the research methodology used, while a literature review on aspects of the 4IR; FMs;
problematization; a formalization of the Quality 4.0 framework; and aspects on technol-
ogy adoption appear in Sect. 3. Our problematization framework for the use of FMs in
the 4IR is presented in Sect. 4. Conclusions and directions for further work in this area
appear in Sect. 5, followed by a list of references.

1.1 Research Questions (RQs)

This paper aims to find answers to the following:

1. What are the advantages and disadvantages of the use of FMs in software
engineering? (RQ1)
2. To what extent does the Quality 4.0 framework lend itself to formalization? (RQ2)

The RQs inform an objective:

• Develop a problematization framework to unpack the challenges of the use of FMs in


the 4IR, inspired by, amongst others, Quality 4.0.

2 Research Methodology
The research in this paper follows the layout of Saunders et al.’s Research Onion [10]
depicted in Fig. 1.

Fig. 1. Saunders et al.’s research onion [10]

Following the onion from the outer layer, our philosophy is both interpretive and
positivist. It is interpretive since much of the literature on the 4IR are in discussion- and
development stages using qualitative textual and diagrammatic descriptions. We also
have a positivist angle since part of our work involves the use of mathematical text and
logic. These are conducive to specifics, amongst other involving formal reasoning.
At the 2nd layer our approach to theory development is inductive, since a problematization
framework is developed. There is also a sense of a deductive approach, since
we validate the framework through theoretical analyses afterwards. Our methodological
choice is mixed – both qualitative and quantitative. The qualitative part, in line with
our philosophy, involves analyses of text and diagrams, while the formal methods add
a quantitative component, albeit not statistics per se, but discrete mathematics
and formal logic.
At the strategy layer we use a case study in the sense that Quality 4.0 is viewed as
a case of a 4IR framework for a specific domain. Our time horizon is cross-sectional since
this research is performed at a specific point in time; the data collection and procedures
may involve surveys among humans in the future, but for this paper data was collected
from the literature involving 4IR and FMs literature.

3 Literature Review
As indicated in the layout section we analyze the literature with respect to general 4IR
concepts, FMs, the Quality 4.0 framework, technology adoption, and finally aspects
around problematization.

3.1 Fourth Industrial Revolution


The notion of the Fourth Industrial Revolution or Industry 4.0 may be traced back to
2011 in Germany [11]. As is well known, three industrial revolutions occurred through-
out modern history and each carried certain technological disruptions which resulted in
significant economic and social changes [12]. The 1st industrial revolution started with
the development and application of the steam engine for industrial production. The 2nd
industrial revolution was powered by electricity which ushered in the era of mass pro-
duction. The 3rd revolution was characterized by developments in electronics – computer
technology signaling the start of automated production processes.
The 4IR is characterized as blurring the lines between living beings and machines,
i.e., digital, physical, and biological spheres. From these, numerous cyber-physical sys-
tems for, amongst other, interoperability [12]; operationalization of production processes
[13]; and self-aware and self-maintenance machines [14] were defined.
The precise effect of the 4IR in which humans and machines work closely together
may be unclear at present, and while many developments and research are needed to tap
into these, it is clear that software driving the human-machine interoperability and the
working of the self-aware and self-maintenance machines must be correct, calling for
research into the use of FMs in the 4IR.

3.2 Formal Methods in Software Engineering


Traditionally the use of FMs involves the development of a formal specification through
iterative reasoning about the specification. The purpose is to identify ambiguity in
natural-language constructs (e.g., “What car does Michael Schumacher drive?”, mean-
ing his personal car, not the Formula 1 racing car). To this end, a simple set-theoretic
specification illustrating an interesting consequence is given in Example 1.

Example 1. Consider the natural-language claim that every person has/had a biological
parent (avoiding complications of the very 1st humans). Two set-theoretic formulations
come to mind (Parent (y, x) indicates ‘y is a parent of x’):

• ((∀ x) (∃ y) | Parent (y, x)) ≡ P1


• ((∃ y) (∀ x) | Parent (y, x)) ≡ P2

Predicates P1 and P2 are not the same. P1 states that every person has a parent, in line
with what is intended, while P2 states there is one person who is the parent of everyone

else. Predicate P2 is too strong, hence P2 → P1, but not vice versa, illustrating the value
of FMs in specification work.
Example 1 gives a partial answer to our RQ1 above.
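The asymmetry between P1 and P2 can also be checked mechanically. The brute-force Python sketch below (a toy finite model of our own devising, not part of the original example) evaluates both quantifier orderings and exhaustively confirms that P2 → P1 holds over every parent relation on a three-person world, while the converse fails:

from itertools import product

# A toy finite world, used to test the two quantifier orderings from
# Example 1 by brute force.
people = ("alice", "bob", "carol")

def p1(parent):   # ((forall x)(exists y) | Parent(y, x))
    return all(any((y, x) in parent for y in people) for x in people)

def p2(parent):   # ((exists y)(forall x) | Parent(y, x))
    return any(all((y, x) in parent for x in people) for y in people)

# A cyclic parent relation: P1 holds (everyone has a parent) but P2
# fails (nobody is the parent of everyone), so P1 does not imply P2.
cycle = {("alice", "bob"), ("bob", "carol"), ("carol", "alice")}
print(p1(cycle), p2(cycle))            # True False

# Exhaustively confirm P2 -> P1 over all 2**9 relations on this world.
pairs = list(product(people, repeat=2))
for bits in product((0, 1), repeat=len(pairs)):
    rel = {pr for pr, b in zip(pairs, bits) if b}
    assert (not p2(rel)) or p1(rel)    # no counterexample found
print("P2 -> P1 holds on all", 2 ** len(pairs), "relations")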

Formal Specification Languages. With respect to software development, various spec-


ification styles have been defined, aimed at achieving different purposes with the spec-
ification. The algebraic style [15] is declarative in nature and defines a specification
as a set of axioms that are to hold throughout the operation of the system. Prominent
examples of algebraic specification languages are Larch and OBJ [15, 16]. A second
specification style is the process-based style. These specifications involve process algebras to
describe the ordering of time-based events. They are useful for specifying concurrent
processes through modal logics, e.g., temporal logics. Prominent implementations of
process-based specification languages are CCS and CSP [17].
A third formal specification style favored by the researcher is the model-based spec-
ification style in which a system is described as moving from state to state, showing the
relationships between the before and after states of the variables. Model-based specifica-
tions are based on mathematical set theory and formal logic. Examples of model-based
specification languages are TLA+ [18, 19], VDM, Z [20] and the B method [21].
As indicated before the use of a formal specification for describing a system allows the
specifier to reason about the properties of the specification and by implication, discover
features of the resultant system. It can show that the system meets its specification,
and any undesirable properties are absent from the specification, hence absent from the
resulting system as well, provided that the sequence of transformations in moving from
the specification to the implemented system preserves correctness [22].
Formal specifications are not without challenges, however, as illustrated next.

Example 2. Consider an example from the ERP (Enterprise Resource Planning) domain
where students registering for a qualification have the option of paying up front or
delaying payment to a later date. In either case the student is registered, and payment is
indicated either as pending, or paid in full.
A Z specification illustrating the two actions appears in schemas Register and Pay
respectively [23].

Register
ΔStudents
s? : Student
s? ∉ registered
registered′ = registered ∪ {s?}
payed′ = payed

The system receives as input a student (s?). The student system is (might be) changed
(ΔStudents). The student to be registered is not in the system already; they are added
to the registration system and no payment has been made, indicated by the before
and after states of the payment being invariant (payed′ = payed).

Sometime later the student makes a payment as captured by schema Pay.

Pay
ΔStudents
s? : Student
amount? : ℕ
s? ∈ registered
payed′ = payed ∪ {s? ↦ amount?}
registered′ = registered

The student (s?) is in the system already (s? ∈ registered) by virtue of schema
Register. A payment is specified accordingly, and the student's status as being registered
is not affected.
Using Z’s schema calculus to combine schemas we define an operation for a student
who registers and pays at the same time as:

Register_and_Pay = Register ∧ Pay

Schema Register_and_Pay is given by:

Register_and_Pay
ΔStudents
s? : Student
amount? : ℕ
s? ∉ registered ∧ s? ∈ registered
payed′ = payed ∧ (payed′ = payed ∪ {s? ↦ amount?})
registered′ = registered ∧ (registered′ = registered ∪ {s?})

Schema Register_and_Pay displays inconsistencies – its predicate requires both
s? ∉ registered and s? ∈ registered, so no state can satisfy it – illustrating that care
needs to be exercised in the writing of a formal specification. What is illustrated here is one
of the so-called 'dark corners' of the Z schema calculus [24], which may be cited as
an objection to the use of FMs in software development. This shows the need for an
iterative process of reasoning about the properties of a formal specification, or at least a
iterative process of reasoning about the properties of a formal specification, or at least a
manual inspection of the specification. Numerous reasoners are available for this task.
Some prominent reasoning environments are Event-B/Rodin [25] which parses the proof
obligation (PO) and then decides which reasoner or model checker to invoke. Reasoners
like Z/Eves [26] include type-checking of Z specifications as well.
It should be noted that, as is the case with any tool, a user of such software should
become proficient in its use, and should, at least at an intuitive level, comprehend how a
proof of a property of the specification should be approached. In this regard, interactive
reasoning assistants (e.g., Boole [27]) may assist with scaffolding [28] whereby a user
learns by doing and the mentor gradually withdraws assistance.
The discussions in this section illustrate the value in formalizing specifications
(Example 1) and the pitfalls involved in the use of FMs (Example 2), hence empha-
size the importance of formality in the automated world of 4IR software development
[29].
The above discussions present the full answer to our RQ1.
Next, we present an important 4IR construct, namely, Quality 4.0 used to evaluate
the quality of, amongst other, data and processes in the 4IR.

3.3 The Quality 4.0 Framework


The Quality 4.0 framework is depicted in Fig. 2.

Fig. 2. Quality 4.0 framework [7]

Discussion of Quality 4.0. Quality in the 4IR is divided into three outer sectors based
on People, Process(es) and Technology. These sectors rest on 11 pillars – Leadership,
Culture, Compliance, Management System(s), Analytics, Data, App Development, Con-
nectivity, Scalability, Collaboration, and Competency. All these pillars are subdivided
as indicated, and standard (pre 4IR) quality is embedded at the core (the blue part) of
the larger Quality 4.0 framework.
As indicated, the correct functioning of 4IR machines is dependent on the quality
of the underlying software – data and operations. The use of FMs allows a software
engineer to reason about the correctness of the formal specification involving both data
and operations.

As illustration, we attempt a formal specification of the Leadership pillar of the
Quality 4.0 structure in Fig. 2 to investigate its properties. To simplify the analyses, we
resort to a basic set-theoretic specification, instead of, for example, a Z specification,
avoiding possible schema-calculus challenges elucidated in Sect. 3.2.

3.4 Formal Specification of Quality 4.0

To further simplify the specification, we present it as a bulleted list:

• Taking the 11 pillars of Quality 4.0 we have (no ordering among the sectors):
Quality_4.0_Ver1 ≙ {Leadership, Culture, Compliance, Management System(s),
Analytics, Data, App Development, Connectivity, Scalability, Collaboration, Compe-
tency}.
• If, however, an order (ranking) with leadership being the most prominent sector is
implied, then:
Quality_4.0_Ver2 ≙ Leadership × Culture × Compliance × Management Sys-
tem(s) × Analytics × Data × App Development × Connectivity × Scalability ×
Collaboration × Competency
• It is plausible that no order among the components is implied by [7], hence we define:

Quality_4.0 ≙ Quality_4.0_Ver1 – AMBIGUITY #1 ASSUMPTION

• Next, we specify the Leadership component in the diagram:


• The subcomponent “Connected” appears in at least three of the pillars, hence we
qualify which one is being referred to. For the leadership sector we indicate it as
Leadership.Connected = set of X = P(X) (for use below)
• There appears to be a one-to-one mapping between leadership and its connectedness,
so we define:

Leadership ≙ Leadership.Connected = {X.Objective Alignment, X.Executive
Ownership, X.Quality KPI} – AMBIGUITY #2 ASSUMPTION

• The Leadership sublayer again includes a "QUALITY" component. This could be
traditional quality (refer Fig. 2) or, recursively, Quality 4.0, leading to our 3rd
ambiguity: AMBIGUITY #3 ASSUMPTION

• At the next level there appears to be EXECUTIVE or CROSS FUNCTIONAL leadership,
and it is not clear whether the cross-functional style is a subcomponent of the
executive style, leading to our 4th ambiguity: AMBIGUITY #4 ASSUMPTION

• Should cross functional not be included in executive, then these two leadership styles
(functions) would involve an exclusive or – ⊕ (either one but not both).
So the X attribute above would be: X.executive ⊕ X.cross functional.
• Further, according to the diagram, each leadership style would have three finer
subdivisions, but what would they be? Since these are not indicated, the Quality 4.0
framework appears to be an organic (growing), abstract structure, with a different
Quality 4.0 diagram seemingly instantiated for each application.

Hence we have: AMBIGUITY #5 ASSUMPTION

From the above lightweight formalization of Quality 4.0 we note numerous ambiguities
with respect to the Leadership pillar. A mere inspection of the diagram indicates
that a formalization of the other pillars would reveal similar challenges. In defense of
the framework, one should note that it is conceptual, aimed at guiding quality officers
in their day-to-day tasks, and not intended for a hard-core formal-methods treatment.
Nevertheless, our investigation illustrates the value proposition of FMs for 4IR structures.
The above formalization gives an answer to our RQ2.
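As a small aside, the gist of AMBIGUITY #1 – whether the eleven pillars form an unordered set or a ranked tuple – can be rendered in a few lines of Python (pillar names abbreviated; purely illustrative, not part of the original formalization):

# Pillar names abbreviated; illustrative of the set-vs-tuple reading only.
pillars = ["Leadership", "Culture", "Compliance", "MgmtSystems",
           "Analytics", "Data", "AppDev", "Connectivity",
           "Scalability", "Collaboration", "Competency"]

quality_ver1 = frozenset(pillars)    # Ver1: no ordering among sectors
quality_ver2 = tuple(pillars)        # Ver2: a fixed ranking is imposed

# Reordering the pillars leaves Ver1 unchanged but changes Ver2,
# which is precisely the substance of AMBIGUITY #1:
reordered = list(reversed(pillars))
print(frozenset(reordered) == quality_ver1)   # True  (order invisible)
print(tuple(reordered) == quality_ver2)       # False (order significant)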

3.5 Technology Adoption


While the use of FMs as a software engineering paradigm ought to be exercised with
care, its use may be facilitated by turning to the various technology acceptance mod-
els, for example, the Technology Acceptance Model (TAM) [30] or enhanced model,
UTAUT [31]. The use of these models may be key and deserves to be incorporated in a
problematization of 4IR FMs.
Next, we turn our attention closer to the purpose of this article, namely, aspects
around problematization, whereafter we present our problematization framework for
FMs in the 4IR.

3.6 Problematization Frameworks


Seminal work on aspects around problematization appears in [32] and was further developed
in [33] and [8]. The traditional notion of problematization is to use it to investigate
an existing scenario or object aimed at questioning the status quo. Existing notions or
truths are challenged, aimed at improving on it. In a way problematization works like
disruptive innovation where current practice or technology are interrupted in favor of a
newer approach. In our work, however, problematization is utilized slightly differently.
Instead of problematizing an existing notion, namely, the adoption of FMs in standard
software development, we consider the challenges in bringing the use of FMs into the 4IR
embedding, amongst other, the challenges of humans and machines working together.
As indicated, the precise effect of the 4IR on software development may be unknown at
present, hence it is imperative to initiate a problematization framework for the FMs-4IR
interplay. As [34] puts it “when purpose is not known, misuse is inevitable”, i.e., in
finding mechanisms in the sustained adoption of 4IR FMs, one ought to first identify,
unpack, and interlink the obstacles in the sustained adoption and long-term use of these

techniques, i.e., problematize an anticipated scenario and thereafter devise solutions for
the parts of the framework.
Morgan [33] discusses problematization around aspects of paradigms, metaphors,
and puzzles (problems) in the context of organization theory. In our context the puzzle
solving part maps onto the functional use of FMs, while a paradigm is viewed as an
alternative reality in which, again in our work, the use of FMs may be viewed as an
alternative, or complementary to traditional software engineering work. Metaphors in
our context map onto pro FMs or anti FMs notation or language, arguably being one
sided from either of these opposing views. For example, pro FMs software engineers
often point only to the advantages of the use of FMs, while anti FMs developers elucidate
just the disadvantages of the use of FMs. Both viewpoints have been discussed in this
paper.
Problematization aspects with respect to scholarly literature are described in [9] as
being of three progressive kinds. The authors follow a constructionist approach to how
extant literature is problematized rather than taken for granted, which would render it
inaccessible for investigation. A large volume of qualitative publications from a prominent
journal was analyzed, and they established that many of these works start out by re-presenting
and organizing existing knowledge, followed by, in a way, "turning on themselves" by
evaluating and criticizing (i.e., problematizing) their own contribution, similar to the
challenge pointed out with Z's schema calculus in Sect. 3.2.
With respect to extant literature as well as own contribution the authors identify three
progressive ways to problematize these: incompleteness, inadequacy, and incommensu-
rability. Incompleteness is when research claims that previous research is not finished,
and the current work will further develop it – previous work is useful yet needs to be
furthered. When problematizing existing literature as inadequate, a claim is made that
the literature does not sufficiently incorporate different views or perspectives of the
phenomenon under investigation. Oversights and a lack of pointing out alternatives are
elicited. An inadequate problematization stops short, however, of claiming that the extant
literature is wrong; it simply claims that alternatives can co-exist with existing ones.
Problematization through incommensurability goes further than inadequacy. Instead
of just pointing out that extant literature is incomplete or inadequate in lacking alternatives,
it claims such extant literature overlooked different and relevant perspectives
and is incorrect – simply wrong – and needs to be replaced by alternatives. The new
research rejects previous work; it does not want to co-exist with extant literature, it
wants to replace it. Incommensurability problematization posits its own perspectives
as better than the extant ones.
Reference [8] presents problematization in terms of the nursing profession. Attention
is paid to Foucault's work on discourse analysis and its relationship with problematization.
According to Foucault, discourse analysis involves not only linguistic activities like
language and speech, but also the forms and patterns that discourses follow. These have
implications for the use of FMs in our problematization framework to follow – intelligent
4IR machines will deal with these discourses, specifically the challenging notations of some
of the formal specification techniques. The nursing profession grapples often with power
relationships in life-and-death situations, and these have implications for the use of FMs
in the new industry, specifically with software applications where human life is at stake
(e.g., nuclear power plants).

Table 1. Problematization framework for FMs use in the 4IR.

Aspect | Components | Notes
Legislation | Policies; Standards; Company software development house rules | House rules relate to Standards
Standards | FMBoK (Formal Methods Body of Knowledge); Best practices (BPs) | FMBoK still underdeveloped; BPs to be established
FMs Literature | Problematization (3 types): incompleteness, inadequacy, incommensurability; Dark corners of specification | Links with Validation & Verification (V&V) – reasoning about properties of a specification
Organizational Theory | Psychological aspects: objections to FMs; Skill sets: hard mathematical notation, formal logic | Links with Training
Technology Adoption | TAM, UTAUT; Culture | Links with Management Perspectives
Specification Styles | Declarative; Procedural-like; Communicating sequential; FMs as a language game | Style depends on what the specification is to be used for in the 4IR context: refinement for system development, or just investigating system properties; together with V&V an NB part of FMs in the 4IR
Autonomy | Machine automation; Cultural aspects; Ethics | Machine as an organizational organism
Ethics | Accepted ethical principles | Vital for correct behavior of 4IR robots; to be built into the algorithms
Validation & Verification (V&V) | Type checking; Reasoning about a spec: automated, interactive, manual (by hand); Scaffolding | Arguably the most NB aspect of 4IR FMs at the functional level
Training | Discrete mathematics & logic; Workshops: distributed repetition; Tool use; Mindset (links with Culture): willingness to learn FMs (life-long learning), success stories, publications | Key aspects for developers eventually embracing the use of FMs in the 4IR; links with machine automation and learning – training of robots
Management Perspectives | Power relationships; Views of developers | Links with all of Autonomy, Culture, and Technology Adoption

A case of education is elucidated in [8], and the same holds for FMs education with
respect to discrete mathematics and logic. The value of FMs education, coupled with
scholarly publications, success stories and workshops for software engineers in
commercial software development, is likewise emphasized in [35].

4 Problematization of FMs in the 4IR

Frameworks or models may take many forms, be it a set of mathematical equations
to model the behavior of a missile following a target, or a qualitative diagram for
sustainability in the coalmining industry [36].
Our framework, depicted in Table 1 is presented as a 3-column table with links
among the entries indicated by text italicization.

Discussion. Our problematization framework gives cognizance to aspects elucidated in
this paper. Traditional FMs acceptance aspects with respect to functional correctness are
indicated, and the level of declarativeness of the specification depends on its use with
respect to 4IR intelligence. Adoption aspects for FMs as a technology are embedded,
together with the role to be played by upper management, in the context of organization
theory, in promoting or prohibiting the use of FMs in a company. The training of software
engineers together with a positive mindset is vital for the successful adoption of these
techniques. Within the 4IR, machine autonomy together with the ethical aspects around it
are important problematizations in the new technology.

5 Conclusions and Future Work

In this work we problematized the acceptance of Formal Methods in software engi-


neering in the 4IR. Cognizance was given to traditional challenges around these and
aspects of using FMs in the fourth industrial revolution. In this context literature on
problematization and how these relate to our focus was presented. On the strength of
these discussions a problematization framework was presented in the form of a table. Of
particular importance is the role of upper managers in facilitating or hindering the use
of these techniques. The role of intelligent machines and ethical concerns around these
are included in the framework.
Since this work presented a problematization framework as a first step, directions
for future work in this area may be pursued along a number of avenues. Following the
problematization framework a solution framework would be on the cards, but even before
that it could be prudent to validate the relative completeness of the framework among
different strata of stakeholders in industry – software developers, managers in the form
of Chief Information Officers (CIOs), customers and so forth. Deeper analyses around
the role to be played by the various technology adoption models, namely the TAM and
UTAUT should also be conducted.

Acknowledgements. This work is based on the research supported in part by the National
Research Foundation of South Africa (Grant Number 119210).

References
1. Nicholas, D.: Fourth industrial revolution WEF Agenda, Weforum (2016). https://www.wef
orum.org/agenda/2016/01/what-is-the-fourth-industrial-revolution/. Accessed 21 Oct 2018
2. Nankervis, A., Connell, J., Montague, A., Burgess, J. (eds.): The Fourth Industrial Revolution.
Springer, Singapore (2021). https://doi.org/10.1007/978-981-16-1614-3
3. Bayode, A., van der Poll, J.A., Ramphal, R.R.: 4th industrial revolution: challenges and
opportunities in the South African context. In: Conference on Science, Engineering and Waste
Management (SETWM-19), pp. 174–180, 18–19 November (2019)
4. Baweja, B., Donovan, P., Haefele, M., Siddqi, L., Smiles, S.: Extreme automation and con-
nectivity: the global, regional, and investment implications of the fourth industrial revolution.
World Econ. Forum (2016)

5. Basile, D., et al.: Designing a demonstrator of formal methods for railways infrastructure
managers. In: Margaria, T., Steffen, B. (eds.) ISoLA 2020. LNCS, vol. 12478, pp. 467–485.
Springer, Cham (2020). https://doi.org/10.1007/978-3-030-61467-6_30
6. Halpern, N., Budd, T., Suau-Sanchez, P., Brathen, S., Mwesiumo, D.: Towards Airport
4.0: Airport digital maturity and transformation. In: Proceedings of the 23rd Air Transport
Research Society World Conference, Amsterdam, Netherlands, p. 21 (2019)
7. Juran, J.M.: Quality 4.0: the future of quality? Web blog. https://www.juran.com/blog/qua
lity-4-0-the-future-of-quality/. Accessed 7 July 2021
8. Frederiksen, K., Lomborg, K., Beedholm, K.: Foucault’s notion of problematization: a
methodological discussion of the application of Foucault’s later work to nursing research.
Nurs. Inq. 22(3), 202–209 (2015)
9. Locke, K., Golden-Biddle, K.: Constructing opportunities for contribution: structuring Inter-
textual coherence and “problematizing” in organizational studies. Acad. Manage. J. Acad.
Manage. 40(5), 1023–1062 (1997)
10. Saunders, M., Thornhill, A., Lewis, P.: Research Methods for Business Students, 8th edn.
Pearson, London (2018)
11. Industry 4.0: On the way to the 4th industrial revolution with the Internet of
Things (2013). http://www.vdi-nachrichten.com/artikel/Industrie-4-0-Mit-dem-Internet-der-Dinge-auf-dem-Weg-zur-4-industriellen-Revolution/52570/1. Accessed 7 July 2021
12. Lu, Y.: Industry 4.0: a survey on technologies, applications and open research issues. J. Ind.
Inf. Integr. 6, 1–10 (2017). https://doi.org/10.1016/j.jii.2017.04.005. ISSN: 2452–414X
13. Fatorachian, H., Kazemi, H.: A critical investigation of industry 4.0 in manufacturing: theo-
retical operationalisation framework. Prod. Plan. Control, 29(8), 633–644 (2018). https://doi.
org/10.1080/09537287.2018.1424960
14. Bagheri, B., Yang, S., Kao, H-A., Lee, J.: Cyber-physical systems architecture for self-aware
machines in industry 4.0 environment. IFAC-PapersOnLine, 48, 1622–1627 (2015). https://
doi.org/10.1016/j.ifacol.2015.06.318
15. Goguen, J.A., Winkler, T., Meseguer, J., Futatsugi, K., Jouannaud, J.P.: Introducing OBJ. In:
Goguen, J., Malcolm G. (eds) Software Engineering with OBJ. Advances in Formal Methods,
vol. 2. Springer, Boston (1999).https://doi.org/10.1007/978-1-4757-6541-0_1
16. Goguen, J.A., Winkler, T., Meseguer, J., Futatsugi K., Jouannaud, J.P.: Introducing OBJ. In:
Goguen, J., Malcolm, G. (eds) Software Engineering with OBJ. Advances in Formal Methods,
vol. 2. Springer, Boston, MA (2000). https://doi.org/10.1007/978-1-4757-6541-0_1
17. Palshikar, G.: Applying formal specifications to real-world software development. IEEE
Softw. 18, 89–97 (2001). https://doi.org/10.1109/52.965810
18. Newcombe, C., Rath, T., Zhang, F., Munteanu, B., Brooker, M., Deardeuff, M.: Use of Formal
Methods at Amazon Web Services. Available online at: https://lamport.azurewebsites.net/tla/
formal-methods-amazon.pdf (2014)
19. Newcombe, C.: Why Amazon chose TLA+. In: Ait Ameur, Y., Schewe, K.D. (eds.) Abstract
State Machines, Alloy, B, TLA, VDM, and Z. ABZ 2014. Lecture Notes in Computer Science,
vol. 8477. Springer, Berlin, Heidelberg (2014). https://doi.org/10.1007/978-3-662-43652-3_3
20. Azeem, M., Ahsan, M., Minhas, N.M., Khadija, N.: Specification of e-Health system using Z:
a motivation to formal methods. In: International Conference for Convergence of Technology,
pp. 1–6 (2014). https://doi.org/10.1109/I2CT.2014.7092123
21. Tolmach, P., Li, Y., Lin, S.-W., Liu, Y., Li, Z.: A survey of smart contract formal specification
and verification. ACM Comput. Surv. Association for Computing Machinery, New York,
54(7), 1–38 (2021). https://doi.org/10.1145/3464421
22. Thangaraj, J., Ulaganathan, S.: A comparative study on transformation of UML/OCL to other
specifications. Recent Adv. Comput. Sci. Commun. (Formerly: Recent Pat. Comput. Sci.),
13(2), 256–264 (2020)

23. Senaya, S.K., van der Poll, J.A., Schoeman, M.A.: Categorisation of Enterprise Resource
Planning (ERP) failures: an opportunity for formal methods in computing. In: Conference on
Science, Engineering and Waste Management (SETWM-19), pp. 181–187, 18–19 November
(2019)
24. Brien, S.M., Martin, A.P.: A calculus for schemas in Z. J. Symbolic Comput. 30(1), 63–91
(2000). https://doi.org/10.1006/jsco.1999.0347. ISSN 0747-7171
25. Ackermann, J.G., van der Poll, J.A.: Reasoning heuristics for the theorem-proving platform
Rodin/event-B. In: The 2020 International Conference on Computational Science and Compu-
tational Intelligence (CSCI 2020), pp. 1800–1806, 16–18 December 2020, Las Vegas (2020).
https://www.american-cse.org/csci2020
26. Z/Eves. https://www.swmath.org/software/10262. Accessed 01 Aug 2021
27. The Boole interactive reasoning assistant. https://github.com/avigad/boole. Accessed 01 Aug
2021
28. Kashora, T.: E-Learning technologies for open distance learning knowledge acquisition in
management accounting, D.Com thesis, Department of Management Accounting, University
of South Africa (2018)
29. Van der Poll, J.A.: Can I Trust my Android?, News and Events, Graduate School of Business
Leadership. University of South Africa, SBL (2020)
30. Hassler, E., MacDonald, P., Cazier, J., Wilkes, J.: The sting of adoption: the Technology
Acceptance Model (TAM) with actual usage in a hazardous environment. In: 2020 Proceedings
of the Conference on Information Systems Applied Research, pp. 1–8 (2020). ISSN: 2167–
1508
31. Chang, A.: UTAUT and UTAUT 2: a review and Agenda for future research. J. Winners 13(2),
106–114 (2012)
32. Kuhn, T.S.: The Structure of Scientific Revolutions. University of Chicago Press, Chicago
(1962)
33. Morgan, G.: Paradigms, metaphors, and puzzle solving in organization theory. Adm. Sci. Q.
605–622 (1980)
34. Akor, O.: Problematization: The foundation of sustainable development. In: International
Conference on African Development Issues (CU-ICADI), pp. 77–83 (2015)
35. Nemathaga, A., van der Poll, J.A.: Adoption of formal methods in the commercial world. In:
Eighth International Conference on Advances in Computing, Communication and Information
Technology (CCIT 2019), pp. 75–84, 23–24 April 2019.
https://doi.org/10.15224/978-1-63248-169-6-12. ISBN: 978-1-63248-169-6
36. Mbedzi, M.D., van der Poll, H.M., van der Poll, J.A.: Enhancing a decision-making framework
to address environmental impacts of the South African coalmining industry. Energies, 13(18),
4897, 1–23 (2020) https://doi.org/10.3390/en13184897. ISSN 1996 – 1073
Investment or Gambling in the Crypto Market:
A Review

Aditi Singh(B)

Department of Electronics and Telecommunication Engineering, Mukesh Patel School of


Technology Management and Engineering, SVKM’s NMIMS (Deemed-to-be-University),
Shirpur, India
Adisingh7715@gmail.com

Abstract. Although a prodigious amount of exploration can be observed in this
field, it is still seen as minuscule and there is a lot of work left to accomplish.
The inadequate number of published studies encourages further writing
in this area. This paper predominantly targets the available work in
this field and builds a roadmap for future studies, not only for the particularly
popular cryptocurrencies but also for those that can be in the "Race to Ace" the
crypto market. Based on investigations that have been carried out in recent years
in many different dimensions of the economic side of the crypto market, this paper
traces the evolution, downfall and future scope of cryptocurrencies. It is said that
cryptocurrency has the potential to provide instant access to all sorts of financial
services and thereby assist global or monetary growth. Cryptocurrencies are
encrypted and protected and do not even depend on financial institutions; they
charge a low fee, significantly lower than what these institutions charge for the
same processing of credit cards. Hence this paper will serve to examine whether
these currencies can be seen as an investment or a gamble in the financial market.

Keywords: Cryptocurrency · Bitcoin · Ethereum · Litecoin · Tether · Dogecoin · Ripple · Cardano · Polkadot · Uniswap · Binance coin · Govcoin

1 Cryptocurrency
1.1 Introduction
Cryptocurrency is witnessing a rise in the field of digital payments these days [3]. Cryptocurrencies are anticipated as the currency of the future and might eventually replace current paper money around the world. Although the concept has grabbed the eye of users, many are unaware of its possibilities, pitfalls and upcoming challenges. Research on this peer-to-peer system, which got its name "cryptocurrency" from its use of encryption to verify transactions, is still lacking and at its infancy stage. To provide substantial aid and perspective to the academic field and to users [14], this paper focuses on highlighting the Darwinism of cryptocurrencies. With the rise in the use of digital currency and its unpredictability, cryptocurrencies are being used in a variety of legal and unlawful activities all over the world.
The returns made from cryptocurrency endowments lately have been huge, yet there have been questions everywhere about their actuality and their eminence [8]. The industry related to the crypto market believes that blockchain cannot be separated from cryptocurrency; the two have to go hand in hand [25]. To begin with, blockchain technology can be thought of as a chain of blocks, where each block has access to the information or contents of the previous block. Cryptocurrencies work on the same principle and contain records of all the transactions that were made. Blockchain technology shows its presence in numerous areas such as finance, smart property and so on. When a transaction is initiated, all the nodes in the blockchain activate. Once activated, they perform calculations to verify the authenticity of the information and the transaction. If a large number of the nodes arrive at a positive result, a new block is added to the existing chain; in any other case the transaction is denied [2]. In cryptocurrencies, there is no predictable evidence of exposure to systematic factors [16].
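To make the chain-of-blocks description above concrete, the following minimal Python sketch may help; every name in it (Block, is_valid, the toy transactions) is illustrative and is not drawn from any particular cryptocurrency. It shows only how each block commits to the hash of its predecessor, so that altering any past record invalidates every later link:

import hashlib
import json
import time

class Block:
    def __init__(self, transactions, prev_hash):
        self.timestamp = time.time()
        self.transactions = transactions   # records carried by this block
        self.prev_hash = prev_hash         # link to the previous block
        self.hash = self.compute_hash()

    def compute_hash(self):
        payload = json.dumps({"t": self.timestamp, "tx": self.transactions,
                              "prev": self.prev_hash}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

def is_valid(chain):
    # a chain is accepted only if every stored link matches the recomputed
    # hash of the preceding block
    return all(chain[i].prev_hash == chain[i - 1].compute_hash()
               for i in range(1, len(chain)))

genesis = Block(["genesis"], prev_hash="0" * 64)
chain = [genesis, Block(["A pays B 5"], genesis.hash)]
print(is_valid(chain))                    # True
chain[0].transactions = ["A pays B 500"]  # tamper with history
print(is_valid(chain))                    # False: the link to block 0 breaks

A real network additionally requires a majority of nodes to agree on each new block, as described above; the sketch only illustrates the tamper-evident linking.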
The ascent of the value of cryptocurrencies in the market and their developing popularity worldwide open a wide variety of challenges and unsettlement for business and modern financial systems [1]. An ample number of public conversations around cryptocurrencies have been set off by significant changes in their prices, with claims that the crypto market is a delusion without any actual worth. These concerns have prompted calls for increased regulations and guidelines or even a total ban. Such conversations often generate more heat than light [1]. India, unlike other countries, already has parameters revolving around the ongoing currency. It is also seen that although these currencies do not behave like normal currencies, they have some similarities with each other [23].

1.2 The Path to Pitfall

Cryptocurrencies are a resource on a blockchain that can be traded or transferred between network users and hence used as a method of payment, while offering no other benefits [1]. In spite of cryptocurrency being a protected platform for its users, there is a need to secure and tackle the problems and issues related to it. The user should double-check the receiving address; otherwise, there is a chance of money being transferred to some other account. Fraud is considered an unavoidable part of life. Using fiat currency, the expense and payment vulnerabilities can be avoided; however, no system without a trusted counterparty exists to make payments over communication channels [6]. Additionally, an ordinary cryptocurrency user could be deceived when saving their wallets and passwords. This can give easy access to hackers, who can misuse the money without much of a stretch [2]. Most allies of cryptocurrency fear interference by the government, and with a valid reason. The issue with governments is that they rule the roost and seem to bias their choices on fear, which automatically builds up military and police enforcement along with control over the population. Yet the government has a crucial part to play in society: managing the conduct of individuals, pulling things together in favor of the economy to keep away from monopoly, and sustaining equity so that the common public does not take the law into its own hands. Rather than no government, it should work to secure a healthy arrangement in our society and keep up harmony [2] (Table 1).

Table 1. General ways by which this pitfall can be overcome [2]

- User information: due to the threat of personal details getting exposed, user information should not be shared.
- Creating a secure email address: solid, strong passwords should be used for email and crypto accounts, together with a password manager, for security over the account. Making a protected, secure email with a strong password should be considered to secure the account against the risk of hacking.
- Use of a strong antivirus on the system: avoiding fake websites and emails, known as "phishing", which only aim to steal information or records, is also important for security reasons.
- Spreading cryptocurrency among several wallets: when cryptocurrency is saved in only one wallet, there is a high chance of that wallet getting hacked, whereby the information and money of the user can be lost; therefore spreading is a sensible idea for safety.

2 Darwinism of Some Cryptocurrencies

Cryptocurrencies are the latest phenomenon comparable to traditional fiat currencies and assets such as gold. Bitcoin, the first cryptocurrency to catch the public eye, was launched in the year 2009 by an individual under the name Satoshi Nakamoto. As of September 2015, there were over 14.6 million bitcoins in circulation, with an absolute market worth of $3.4 billion [8].
Aside from Bitcoin, Ethereum today is likewise among the most known and significant cryptocurrencies. Vitalik Buterin, a computer programmer and researcher in cryptocurrency, first introduced Ethereum in 2013. Software development subsidized with Ethereum was funded by an online crowd sale between July and August 2014, and the resulting framework went live on 30 July 2015 [8].
Further, we have the silver against the gold, i.e., "Litecoin". Litecoin, rivaling Bitcoin, was designed with one main purpose: to speed up the transaction process. Founded in October 2011 by Charlie Lee, a Google employee who later became an engineering director at Coinbase, Litecoin was released on GitHub as open-source code [7].
Tether, founded in 2014 by Brock Pierce, Craig Sellars, and its current CEO Reeve Collins, has a combined market value due to its stable nature in the cryptocurrency market [10]. Originally known as "Realcoin", Tether is a blockchain-based cryptocurrency that has its roots in the U.S. and aims to keep a fixed 1:1 exchange ratio with the U.S. dollar. It was the first cryptocurrency to peg its value directly to an existing fiat currency.

Presented as a "joke currency", Dogecoin was created with the intention of making a fun currency that would reach a more extensive segment than the rival cryptocurrencies of the time. Billy Markus, a programmer by profession from Portland, Oregon, officially launched Dogecoin on December 6, 2013. While the initial complete supply was intended to be 100 billion Dogecoins, it was later announced that the network would deliver infinite Dogecoins, thereby building inherent inflationary momentum [12].
A technology entrepreneur, Chris Larsen, started Ripple in 2012 with its cofounder Jed McCaleb. Ripple transactions use less energy than Bitcoin. The payment system of Ripple is quick, empowering the exchange of assets in any currency to another client on the Ripple network within seconds [8].
In 2015, a mathematician and entrepreneur, Charles Hoskinson, together with a scientist known as Jeremy Wood, organized the IOHK (Input Output Hong Kong) platform. Both programmers had worked on Ethereum, global payments and so on, and decided to create their own digital financial service technology. Starting in the beginning of 2015, Charles Hoskinson worked with full focus for two years on developing the Cardano cryptocurrency. Surprisingly, when Cardano entered the crypto market in mid-autumn 2017, it acquired hundreds of millions of dollars of capitalization in no time. On 4 January 2018, this cryptocurrency reached its peak of almost $35 billion in market capitalization and a price of $1.32 per ADA token [26].
Dr. Gavin Wood (co-founder of Ethereum), with his fellow cofounders Peter Czaban and Robert Habermeier, began pondering the sharding obstacles that the blockchain would confront while developing the Ethereum and Ethereum 2.0 specifications. In November 2016, Dr. Wood released the Polkadot white paper; it took him almost four months to come up with the idea for a multi-chain heterogeneous framework, i.e., the Polkadot protocol. Polkadot not only enables cross-blockchain transfers of tokens, but any type of data or asset can be transferred with its help [27].
In November 2018, Hayden Adams, a former mechanical engineer at Siemens, created Uniswap, the largest decentralized exchange by daily trading volume. Hayden Adams' idea made Uniswap the fourth largest cryptocurrency exchange, and from March 2021 it started generating fees of approximately US$2-3 million per day for the providers who facilitate the liquid market [28].
Binance coin made its place as the largest cryptocurrency around the globe in terms of trading volume in April 2021. The coin was initially launched in July 2017 during an ICO (initial coin offering) by Changpeng Zhao, or "CZ", a business executive by profession. Zhao, who previously worked for OKCoin, left his position as chief technology officer and started Binance. Through the ICO process, the coin offered 20 million BNB tokens to angel investors, around 80 million tokens to the founding team, and the rest to various participants [29] (Fig. 1 and Table 2).

Fig. 1. Values of the respective cryptocurrencies on 15th September 2020 and 15th September 2021 (a complete one-year gap).

Table 2. Is the crypto market the future or a risky gamble?

BITCOIN (BTC; 1 BTC = 26,56,419.98 INR)
Downside:
- Bitcoin is fairly new and has not yet received the public attention or assurance that it needs to thrive, not only in India but in many other countries as well. Government rules and policies in these countries are not in accord with Bitcoin either, thus becoming a hindrance that does not let its value increase in the crypto market [3].
- Due to its unpredictable and rapidly changing nature, Bitcoin, even though a cryptocurrency, does not behave much like a currency, and hence some empirical studies consider it an investment [4].
- Bitcoin is prohibitively costly for the normal public due to its volatility [3].
- Double spending is not at all possible in Bitcoin [3].
Facts:
- Bitcoin, without the help of any centralized banking system, allows transactions to be performed smoothly. It is a hidden digital code in data blocks which can be mined by various software available on the market [3]. People get rewarded if they find the hidden blocks: 6.25 BTC per valid block mined.
- Since Bitcoin has a fixed supply cap, unlike gold, it is possible that demand growth will exceed supply growth in the future [4].
- In 2010, a Bitcoin user made the news just for his statement, "I just want to report that I successfully traded 10,000 bitcoins for pizza". Surprisingly, in December 2017 the Bitcoin price crossed the level of $19,000, which intrigued many people's interest in Bitcoin [5].
- Elon Musk's tweets have been playing a large role in the crypto market; the first unexpected fluctuation was seen on January 29, 2021, when his Twitter account bio was changed to #bitcoin. Suddenly the price of Bitcoin rose from about $32,000 to over $38,000, increasing its market by $111 billion in a matter of hours [11].

ETHEREUM (ETH; 1 ETH = 1,65,844.78 INR)
Downside:
- Ethereum uses an algorithm known as Ethash to allow normal computers to mine blocks, but Ethash, being memory heavy, is less suitable for ASIC mining [7].
- Recovering from downfalls has always been a task for Ethereum. Since 2018, Ethereum's price has plummeted by nearly 94%.
Facts:
- Ethereum's scalability has always been a concern; it was recorded that within 24 hours the Ethereum network managed over one million unique transactions, at an average of around 11 transactions per second [7].
- It is indicated that Ethereum's wide range of positive and negative outcomes should be included in the investment portfolio, because it has a much greater variance due to its strong relation with news, speculation and the hype surrounding it, even though it has a very low expected value [8].
- The Bitcoin and Ethereum blockchains are said to be alike; the only difference is that Ethereum blocks include not only block numbers, nonce, etc., but also the transaction list along with the recent state. A new state is created for every transaction by applying the previous state to the transaction list [7].

LITECOIN (LTC; 1 LTC = 11,756.29 INR)
Downside:
- The Litecoin market shows an absence of efficiency [9].
- Returns in Litecoin are multifractal in nature. Litecoin as an investment asset is a very new concept and is still at a debut stage [9].
Facts:
- Litecoin shows proof of work with the help of "scrypt", which can be decoded by consumer-grade CPUs [24].
- The price performance of Litecoin is more stable than Ethereum prices [8]. Litecoin shows less variation than Ethereum, hence it can be preferred as a new investment option.
- Claiming phenomenal growth in prices against Bitcoin, Litecoin has marked its rank in the crypto market (its price has stood at 7291% growth against Bitcoin's 1731%) [9].

TETHER (USDT; 1 USDT = 74.11 INR)
Downside:
- When traders at Bitfinex exchanged the Tether coin instead of Bitcoin, its price gradually decreased in October 2018 to $0.88 due to the perceived credit risk, and this in turn raised the value of Bitcoin.
Facts:
- Tether Limited believes that using these virtual currencies allows users to move fiat for an exchange more quickly and cheaply. Also, other cryptocurrencies already have a rocky relationship with the banks, and it is believed that Tether is a way to overcome that.
- Acting as a digital dollar that is everywhere in the realm of the crypto market, Tether provides liquidity and can facilitate transactions in different cryptocurrencies.

DOGECOIN (DOGE; 1 DOGE = 21.60 INR)
Downside:
- Dogecoin encountered its first major crash, an 80% loss, because enormous mining pools took advantage of the little time required to mine the coins. The issue emerged as mining pools redirected towards Dogecoin and seized on the effortlessness of mining this coin [12].
- The very first theft attempt on Dogecoin occurred when a hack led to the loss of millions of coins from the Doge wallet. The hacker accessed the platform's filesystem and modified its receive and send pages to transfer any and all coins to a static address [12].
Facts:
- Compared with the rest of the cryptocurrencies, Dogecoin holds the fastest coin production schedule, with 100 billion coins in mid-2015 and an additional production rate of around 5.256 billion coins per year since then.

RIPPLE (XRP; 1 XRP = 58.37 INR)
Downside:
- In 2020, Ripple faced a solid breakout, ran into a brick wall, and hence had bad XRP news in the form of a claim from the SEC. Claiming that XRP tokens are an unregistered security, the chief US regulator says that Ripple and its head office have violated the act by offering unregistered securities to investors in the United States.
- Ripple has the potential to disrupt numerous industries and is upheld by a group of specialists at Ripple Labs and a number of investors who are striving to push funding of the virtual credit.
- There is as yet no secure or lightweight platform for Ripple [15].
Facts:
- Ripple, being a centralized cryptocurrency unlike Bitcoin and Ethereum, is operated by Ripple Labs [13].
- To avoid overloading, Ripple charges a minimum transaction fee [15].
- Ripple was designed in such a way that it achieves some of the lowest costs and fastest transaction times of any cryptocurrency. Ripple's major objective is to attract clients who want to move enormous amounts quickly and at a cheap cost, making it incredibly attractive to banks.

CARDANO (ADA; 1 ADA = 109.78 INR)
Downside:
- The Cardano network is still in process; managing scaling issues could take months or even years.
- Cardano developers have to work at a very fast pace if they want to deliver the Goguen release before Ethereum makes its way any further, since Ethereum already has a powerful first-mover advantage in NFTs [18].
Facts:
- Cardano is one of the most reliable projects addressing the two most pressing issues in the crypto market, namely scalability and Ethereum hacks, giving cryptocurrency developers new-found optimism [17].
- Cardano developers claim that their currency will be able to outperform Bitcoin, which can only handle seven transactions per second [17].
- The Cardano ecosystem will be self-sustaining and capable of self-development and change in response to changing situations.
- After billionaire Elon Musk stated that Tesla will not accept Bitcoin due to its high environmental cost, Cardano's price surged as crypto traders rushed to buy the token that promised to be a lot less carbon intensive [19].

POLKADOT (DOT; 1 DOT = 1,705.19 INR)
Downside:
- The staking process continues almost daily and is basically a 28-day competition in which those who want to be a validator must stake the maximum amount of DOT. Since there is a maximum capacity of just 1000 validators, the validators with the highest DOT bonds become the validators for Polkadot for one day [20].
- According to the Polkadot network rules, a limit of 128 nominators qualifies for staking compensation from a validator. If a validator has 130 nominators, just the 128 nominators with the most DOT bonded will get compensated; the rest will receive nothing [20].
- To be a Polkadot council member, one needs at least 11 million DOT backing, which is only possible if you are a super-rich celebrity; there is no other way [20].
Facts:
- The first parachain is estimated to be launched later this year, with a throughput of 1 million transactions per second.
- The Polkadot parachain is said to have the fastest processing of transactions. Polkadot rates could see growth at the start of 2022; the price is estimated to swing from US$64.24 to US$70.54 in January 2022.
- In the year 2023, 1 DOT is anticipated to reach US$81.78, and so on.
- Expecting the Bitcoin and wider bull market to come to an end, many traders are looking for an alternative, and some are even convinced that Polkadot is their solution [22].

UNISWAP (UNI; 1 UNI = 1,645.80 INR)
Downside:
- A severe bug can easily bring the Uniswap market down due to its significant issues, creating problems for those that are listed there.
- Even though it is Ethereum's favorite project, Uniswap's existence was at stake due to the emergence of SushiSwap.
- Sushi gained control of a staggering amount of Uniswap's liquidity just by making Sushi's liquidity mining dependent on staking Uniswap LP tokens.
Facts:
- Uniswap could be worth around $50 to $80 by the year 2026, as predicted by Brave New Coin [21].
- Uniswap has been keeping token trading fully automated and open to whoever holds the tokens, and improves on the efficiency of traditional exchanges.
- The liquidity problems that EtherDelta has been facing are also a concern for Uniswap, and they are working on solving this issue; hence it is not just another decentralized method of exchange.

BINANCE COIN (BNB; 1 BNB = 24,588.42 INR)
Downside:
- Even though Binance coin has a top cap of 200 million tokens, tokens are destroyed on a regular basis to reduce the total overall supply, which is said to stabilize their value over time.
- In May 2021, the Binance market was reported to be under investigation for tax offences and money laundering by two well-known organizations, i.e., the Internal Revenue Service and the United States Department of Justice.
Facts:
- Binance Coin can also be used to trade with other cryptocurrencies and is accepted in most major crypto exchanges around the world.
- Binance coin does not believe in smart contracts, unlike Ethereum.
- Binance coin supports a BFT (byzantine-fault-tolerant) method, which basically allows the different nodes in Binance, i.e., the validator, accelerator and witness nodes, to operate.

Govcoin - Unlike these decentralized cryptocurrencies, Govcoin is a new currency in the picture nowadays, run by government central banks. It can be a digital rupee, e-dollar or even e-yuan.

3 Conclusion

Crypto markets are said to ease financial transactions, but this market is very unpredictable, and one should take price predictions with a pinch of salt. Although cryptocurrencies have been present for a decade, the trust towards them, at the beginning and even today, is not even close to 50% for common people specifically. Cryptocurrency behaves as an unnecessary gamble for the public, especially due to the shortfall in financial protection. Using cryptocurrency can be the best way to prevent theft, as blockchain technology is there to protect the transactions, since they cannot be altered and can be used whenever or wherever required. Nowadays, one can get an idea just by looking at the rates of exchange against the existing fiat currencies in the crypto market, and this is only possible because of the numerous cryptocurrency exchanges, which continuously provide price records for all actively traded cryptocurrencies. Although exchange rates are said to be highly volatile, it is said that for those who pay fiat currency in order to purchase them, cryptocurrencies have a non-zero value. Companies like Microsoft, KFC, or, to take another example, Subway have accepted cryptocurrency because of its efficiency in carrying out payments. With the rise of this type of innovation in the financial market sectors, we found that only 3% of papers, or not even that, focus on currencies other than some well-known ones such as Bitcoin, Dogecoin and Ethereum. Hence this paper will be among the very few to also discuss some of the emerging cryptocurrencies.

References
1. Giudici, G., Milne, A., Vinogradov, D.: Cryptocurrencies: market analysis and perspectives.
J. Ind. Bus. Econ. 47, 1–18 (2020). https://doi.org/10.1007/s40812-019-00138-6
2. Chauhan, V., Arora, G.: A review paper on cryptocurrency & portfolio management. In: 2nd
International Conference on Power Energy, Environment and Intelligent Control (PEEIC),
pp. 60–62 (2019)
3. Swamy, T., Shukla, P., Iyer, S.G., Gupta, R.: Review paper on emergence of bitcoin in India,
its technological aspects and legal implications. Glob. Manag. Rev. 10(3), 55–60 (2016)
4. Baur, D.G., Dimpfl, T.: The volatility of Bitcoin and its role as a medium of exchange and
a store of value. Empirical Econ. 61(5), 2663–2683 (2021). https://doi.org/10.1007/s00181-
020-01990-5
5. Sharma, G., Jain, M., Mahendru, M., Bansal, S., Kumar, G.: Emergence of bitcoin as an
investment alternative: a systematic review and research agenda (2019). https://doi.org/10.6702/ijbi.201903_14(1).0003
6. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system. Cryptography Mailing List (2009). https://metzdowd.com
7. Vujičić, D., Jagodić, D., Ranđić, S.: Blockchain technology, bitcoin, and ethereum: a brief overview, pp. 1–6 (2018). https://doi.org/10.1109/INFOTEH.2018.8345547
8. Bhosale, J., Mavale, S.: Volatility of select crypto-currencies: a comparison of bitcoin,
ethereum and litecoin, Ann. Res. J. SCMS 6 (2018)
9. Jana, R.K., Tiwari, A.K., Hammoudeh, S.: The inefficiency of litecoin: a dynamic analysis.
J. Quant. Econ. 17(2), 447–457 (2018). https://doi.org/10.1007/s40953-018-0149-0
10. Wei, W.: The impact of tether grants on bitcoin. Econ. Lett. 171 (2018). https://doi.org/10.
1016/j.econlet.2018.07.001
11. Ante, L.: How Elon Musk’s twitter activity moves cryptocurrency markets (2021)
12. Chohan, U.W.: A history of Dogecoin. SSRN Electron. J. (2017). https://doi.org/10.2139/ssrn.3091219
13. Soylu, P.K., Okur, M., Özgür Çatıkkaş, Z., Altintig, A.: Long memory in the volatility of
selected cryptocurrencies: bitcoin, ethereum and ripple. J. Risk Finan. Manag. 13(6), 107
(2020). https://doi.org/10.3390/jrfm13060107
14. Fauzi, M.A., Paiman, N.: Bitcoin and cryptocurrency: challenges, opportunities and future works. J. Asian Finan. Econ. Bus. 7(8), 695–704 (2020). https://doi.org/10.13106/jafeb.2020
15. Jani, S.: An overview of ripple technology & its comparison with bitcoin technology (2018)
16. Liu, Y., Tsyvinski, A.: Risks and returns of cryptocurrency. Glob. Bus. Issues eJournal (2018)
17. Cardano (ADA) Price Prediction for 2020, 2021, 2023, 2025, 2030. https://medium.com/
stormgain-crypto/cardano-ada-price-prediction-for-2020-2021-2023-2025-2030-2f9304
8168a5. Accessed 10 Apr 2021
18. We Need to Take Cardano (ADA) Very, Very Seriously. https://www.nasdaq.com/articles/we-need-to-take-cardano-ada-very-very-seriously-2021-03-23. Accessed 10 Apr 2021
19. Cardano Surges During $300 Billion Crypto Crash as Musk Eyes Sustainable Bitcoin. https://
www.forbes.com/sites/jonathanponciano/2021/05/13/cardano-surges-during-300-billion-cry
pto-crash-as-musk-eyes-sustainable-bitcoin-alternatives/. Accessed 10 May 2021
20. Here’s Why Polkadot Will Fail? https://provscons.com/heres-why-polkadot-will-fail.
Accessed 10 May 2021
21. Uniswap Price Prediction. https://www.cryptonewsz.com/forecast/uniswap-price-predic
tion/. Accessed 10 May 2021
22. Some Asian traders are using Polkadot to predict Bitcoin's future. https://www.coindesk.com/asia-traders-bitcoin-polkadot-prediction. Accessed 10 Aug 2021
23. Klose, J.: Cryptocurrencies and gold: similarities and differences (2021)

24. Edmund, C., Muhamad, A., Muhamad, A., Muhamad, A.A., Muhamad, J.: Role of blockchain
and cryptocurrency to redefine the future economy. Turkish Online J. Qual. Inquiry 12, 3579–
3593 (2021)
25. James, B., Parashar, M.: Cryptocurrency: an overview on its impact on Indian economy.
IJCRT1813170. 6(2) (2018)
26. Cardano (blockchain platform). https://en.wikipedia.org/w/index.php?title=Cardano_(blockc
hain_platform)&oldid=1045732060. Accessed 10 Aug 2021
27. Polkadot (cryptocurrency). https://en.wikipedia.org/w/index.php?title=Polkadot_(cryptocur
rency)&oldid=1038609283. Accessed 10 Aug 2021
28. Uniswap. https://en.wikipedia.org/w/index.php?title=Uniswap&oldid=1041527345.
Accessed 10 Aug 2021
29. Binance. https://en.wikipedia.org/w/index.php?title=Binance&oldid=1045137205.
Accessed 10 Aug 2021
Solving Partial Differential Equations on Radial
Basis Functions Networks and on Fully
Connected Deep Neural Networks

Mohie M. Alqezweeni1(B), Roman A. Glumskov2, Vladimir I. Gorbachenko2, and Dmitry A. Stenkin2
1 Kerbala University, Karbala, Iraq
mohieit@mail.ru
2 Penza State University, Penza, Russia

Abstract. Physics-informed neural networks learn not by example, but by check-
ing physical patterns in a limited set of sampling points. Fully connected deep neu-
ral networks, implemented in deep learning libraries, for example, TensorFlow,
are usually used as physics-informed neural networks for solving partial differ-
ential equations. An important role in the popularity of these libraries is played
by the automatic differentiation implemented in them and modern learning algo-
rithms. It is proposed to use radial basis functions networks as physics-informed
neural networks, which are distinguished by a simple structure and the ability to
adjust not only linear, but also nonlinear parameters. The authors have developed
a fast algorithm for the Levenberg-Marquardt method for learning radial basis
functions networks. Extensions of the TensorFlow library have been developed
to implement the Levenberg-Marquardt algorithm and radial basis functions net-
works. The model problems solution has shown the advantages of using the radial
basis functions networks implemented in TensorFlow as physics-informed neural
networks.

Keywords: Physics-informed neural networks · Radial basis functions
networks · Fully connected deep neural networks · Partial differential equations ·
Neural network learning · Adam’s algorithm · Levenberg-Marquardt algorithm

1 Introduction

The solution of partial differential equations (PDE) on fully connected neural networks
has been known for a long time [1, 2]. The theoretical basis of the PDE solution on
neural networks is the universal approximation theorem [3]. This means that the neural
network, minimizing the solution residuals in the sampling points set during learning,
approximates the unknown solution. But the universal approximation theorem is not
constructive and only speaks of the fundamental possibility of constructing a neural
network containing one hidden layer that approximates a smooth function. In practice,
such a network may require a very large number of neurons in the hidden layer, which


practically does not allow network learning. In addition, "shallow" networks have a tendency to overfit.
These problems are largely solved by using deep neural networks (DNNs) to solve PDE. As proved in [4], DNNs with the ReLU activation function are universal function approximators. A decisive role in the use of DNNs for the PDE solution was played by free libraries, for example, TensorFlow, which support automatic differentiation [5] when learning neural networks. TensorFlow's automatic differentiation relies on computational graphs to compute gradients. Moreover, TensorFlow 2 uses a special GradientTape to calculate the gradient, which does not require the user to build a graph [6]. DNNs implemented using libraries that support automatic differentiation are called physics-informed neural networks [7]. A feature that unites physics-informed neural networks is that such networks are learned not by examples, but by checking physical laws. When solving PDE, in the learning process the network parameters are adjusted in such a way that the residuals in a certain set of sampling points inside and at the region boundary become small.
Existing deep learning libraries are focused not on PDE solving, but on building large networks, for example, for image recognition. Such networks are learned on large data sets. First-order gradient algorithms are used for learning, since faster second-order algorithms require unacceptably large resources. The learned network is used to solve many recognition problems. At the same time, the main focus is on network accuracy, and not on learning time. When solving a PDE, each problem requires retraining the network on a relatively small set of sampling points. At the same time, it is necessary to ensure high accuracy and short learning time, which first-order algorithms cannot provide. Therefore, for physics-informed neural networks, it is relevant to develop fast second-order learning algorithms.
To solve PDE, neural networks of another type are also used: radial basis function networks (RBFNN) [2, 8]. RBFNNs are universal approximators [9, 10]. RBFNNs differ from DNNs in simplicity, as they contain only two layers and allow adjusting not only the weights, but also nonlinear parameters: the parameters of the radial basis functions (RBF). These features make RBFNNs open to wide application of second-order gradient learning algorithms. RBFNNs make it possible to obtain an approximate differentiable analytical solution at any point of the investigated area.
However, at present, fast second-order gradient algorithms are practically not used for RBFNN learning. The authors of this article have developed RBFNN learning algorithms for solving PDE by the trust region method [11] and by the equivalent, but simpler, Levenberg-Marquardt method [12, 13]. These algorithms tune both the weights and the parameters of the basis functions at the same time. The parameter calculations of the algorithms were carried out analytically, which speeds up the calculations but requires preparatory work.
Currently, there is no RBFNN implementation for the PDE solution in libraries like TensorFlow, and there are no comparisons of DNN and RBFNN for the PDE solution.
The aim of this work is to develop and implement in TensorFlow 2 learning algorithms for DNN and RBFNN for solving PDE, and to compare the solution of model problems on DNN and RBFNN.

2 Development of Learning Neural Networks Algorithms for Solving PDE

Consider a stationary PDE in operator form

Lu(x) = f(x), x ∈ Ω,   (1)

Bu(x) = p(x), x ∈ ∂Ω,   (2)

where u is the desired solution, L is a differential operator, the operator B sets the boundary conditions, Ω is the solution area, ∂Ω is the boundary of the area, and f and p are known functions.
When solving non-stationary problems, time can be considered as one of the coordinates, i.e., the problem keeps the form (1)-(2). But this approach increases the dimension of the problem, which may be unacceptable. A more common approach implies an explicit difference approximation of the time derivatives and the solution of the stationary problem (1)-(2) at each time step, which does not increase the dimension of the problem.
Choose a set of sampling points inside and on the boundary of the solution area, {x_i | i = 1, ..., N − M} ⊂ Ω and {x_i | i = N − M + 1, ..., N} ⊂ ∂Ω, where N is the total number of sampling points and M is the number of boundary sampling points from ∂Ω.
Network learning is based on minimizing the loss function, which is the sum of the squared residuals at the sampling points:

J(w, p) = Σ_{i=1}^{N} [Lu(x_i) − f(x_i)]² + λ Σ_{j=1}^{K} [Bu(x_j) − p(x_j)]² → min,   (3)

where λ is a penalty factor.
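As a hedged illustration of (3) for a Dirichlet problem like those in Sect. 4 (so that B is simply the identity on the boundary), the loss could be assembled as below; laplacian stands for a routine such as the one shown in Sect. 3, and all names here are chosen for illustration only:

import tensorflow as tf

def pinn_loss(u_net, laplacian, x_in, f_in, x_bnd, p_bnd, lam=1.0):
    # squared PDE residuals Lu - f at the interior sampling points
    r_in = laplacian(u_net, x_in) - f_in
    # boundary residuals Bu - p (Dirichlet case: Bu = u)
    r_bnd = tf.squeeze(u_net(x_bnd)) - p_bnd
    return tf.reduce_sum(r_in ** 2) + lam * tf.reduce_sum(r_bnd ** 2)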
The PDE solution on an RBFNN has the form

u_RBF(x) = Σ_{j=1}^{N} w_j φ_j(x), x ∈ Ω̄ = Ω ∪ ∂Ω,   (4)

where φ_j are the RBFs and w_j are the RBFNN weights.
Solution (4) is an approximate analytic differentiable solution at any point of the investigated region.
The Gaussian function φ(‖x − c‖, a) = exp(−‖x − c‖² / (2a²)) is used as the RBF, where c is the vector of coordinates of the function center and a is the shape parameter (width).
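A minimal NumPy sketch of evaluating (4) with Gaussian RBFs follows; the function and variable names are illustrative, not the authors' code:

import numpy as np

def rbfnn(x, centers, widths, weights):
    # squared distances ||x - c_j||^2, shape (n_points, n_rbf)
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    phi = np.exp(-sq_dist / (2.0 * widths ** 2))  # Gaussian RBF values
    return phi @ weights                          # weighted sum, Eq. (4)

rng = np.random.default_rng(0)
centers = rng.uniform(0.0, 1.0, size=(25, 2))  # 25 RBF centers on [0, 1]^2
widths = np.full(25, 1.0)                      # initial widths, cf. Sect. 4
weights = rng.uniform(0.0, 1e-3, size=25)      # initial weights, cf. Sect. 4
x = rng.uniform(0.0, 1.0, size=(100, 2))       # 100 interior sampling points
print(rbfnn(x, centers, widths, weights).shape)  # (100,)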
Among the first-order gradient algorithms, the Adam (adaptive moments) algorithm [14], implemented in TensorFlow, is currently recognized as the best. This algorithm is an adaptive learning rate algorithm that provides different learning rates for different components of the parameter vector. To learn the RBFNN, consider a single vector of the weights and RBF parameters for a two-dimensional problem, which has the form

θ = [w_1, w_2, ..., w_{nRBF}, c_{11}, c_{21}, ..., c_{nRBF,1}, c_{12}, c_{22}, ..., c_{nRBF,2}, a_1, a_2, ..., a_{nRBF}]^T,   (5)

where w_j are the weights, j = 1, 2, ..., n_RBF, n_RBF is the number of RBFs, c_{j1} and c_{j2} are the center coordinates, and a_j is the width.
At the (k+1)-th step of network learning, the parameter vector (5) is corrected by the formula

θ^(k+1) = θ^(k) − η m̂^(k+1) ./ (√ŝ^(k+1) + ε),

where m̂^(k+1) = m^(k+1) / (1 − β₁^(k+1)), m^(k+1) = β₁ m^(k) + (1 − β₁) g(θ^(k)), ŝ^(k+1) = s^(k+1) / (1 − β₂^(k+1)), s^(k+1) = β₂ s^(k) + (1 − β₂) g(θ^(k)) ⊗ g(θ^(k)); here ./ denotes elementwise division, ⊗ elementwise multiplication, g the loss function gradient, ε ≈ 10⁻⁸ is a smoothing term, and β₁ = 0.9, β₂ = 0.999 are the recommended parameters.
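One learning step according to these formulas can be sketched as follows; grad_fn, returning the loss gradient g(θ), is assumed to be supplied, e.g. by automatic differentiation:

import numpy as np

def adam_step(theta, m, s, k, grad_fn, eta=1e-3,
              beta1=0.9, beta2=0.999, eps=1e-8):
    g = grad_fn(theta)
    m = beta1 * m + (1 - beta1) * g        # first-moment estimate
    s = beta2 * s + (1 - beta2) * g * g    # second-moment estimate
    m_hat = m / (1 - beta1 ** (k + 1))     # bias corrections
    s_hat = s / (1 - beta2 ** (k + 1))
    theta = theta - eta * m_hat / (np.sqrt(s_hat) + eps)
    return theta, m, s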
In the Levenberg-Marquardt method [15, 16], the weights at the k-th epoch of neural network learning are corrected by the vector Δθ_k of corrections to the network parameters. The correction vector is formed as a result of solving the system

(J_{k−1}^T J_{k−1} + μ_k E) Δθ_k = J_{k−1}^T r^(k−1),   (6)

where J_{k−1} is the Jacobi matrix calculated from the network parameter values at the (k−1)-th learning epoch, E is the identity matrix, μ_k is the regularization parameter, and r^(k−1) is the residual vector at the sampling points at the (k−1)-th learning epoch.
Marquardt [16] recommended changing the regularization parameter by a constant factor: if the error functional decreases, the new value of the regularization parameter is obtained by dividing the current regularization parameter by a factor v > 1; if the error functional increases, the current regularization parameter is multiplied by v. In [17], it is proposed to decrease the regularization parameter more strongly than to increase it, that is, to use different values of the coefficient, v₁ > v₂. Initial values of μ and v are selected; at the beginning of learning, the regularization parameter should take a relatively large value.
The RBFNN learning algorithm based on the Levenberg-Marquardt method was proposed in [12, 13]. The algorithm provides for the analytical calculation of the Jacobi matrix elements. This article uses automatic differentiation to implement Levenberg-Marquardt learning for both DNN and RBFNN.
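A hedged sketch of one Levenberg-Marquardt step with the μ-adaptation described above follows; jac_fn and res_fn, the Jacobian J = ∂r/∂θ and the residual vector r at the sampling points, are assumed to be supplied, e.g. via automatic differentiation, and all names are illustrative:

import numpy as np

def lm_step(theta, mu, jac_fn, res_fn, v1=10.0, v2=2.0):
    J, r = jac_fn(theta), res_fn(theta)
    # solve (J^T J + mu E) dtheta = J^T r for the correction vector, Eq. (6)
    A = J.T @ J + mu * np.eye(theta.size)
    dtheta = np.linalg.solve(A, J.T @ r)
    theta_new = theta - dtheta             # with J = dr/dtheta this descends
    # decrease mu strongly on success, increase it mildly on failure [17]
    if np.sum(res_fn(theta_new) ** 2) < np.sum(r ** 2):
        return theta_new, mu / v1
    return theta, mu * v2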

3 Developing TensorFlow Library Extensions


To create models describing the learning of both DNN and RBFNN, classes were written that inherit from a base class of the Keras library. The Keras library provides a fairly wide range of objects for describing neural networks; however, they cannot be used directly to create physics-informed neural networks. For this purpose, Keras has base classes that provide, on the one hand, the ability to use the Keras interface and, on the other hand, allow the programmer to flexibly configure the necessary parameters by overloading the base class methods. To describe the learning models of physics-informed neural networks, classes were created that inherit the tf.keras.Model class, in which the train_step(self, data) method was overloaded; it describes one step of network learning. First, the value of the differential operator (1) is calculated with

respect to the output of the neural network at the current learning iteration for the sampling points. This is done by using TensorFlow's automatic differentiation tools. Below is a snippet of Python code that implements the calculation of ∂²u/∂x₁² + ∂²u/∂x₂².
with tf.GradientTape() as tape1:
    tape1.watch(x)
    with tf.GradientTape() as tape2:
        tape2.watch(x)
        y_pred1 = self(x, training=False)   # network output u at the sampling points
    d = tape2.gradient(y_pred1, x)          # first derivatives of u
    dx = d[:, 0]                            # du/dx1
dxx = tape1.gradient(dx, x)[:, 0]           # d2u/dx1^2
with tf.GradientTape() as tape3:
    tape3.watch(x)
    with tf.GradientTape() as tape4:
        tape4.watch(x)
        y_pred1 = self(x, training=False)
    d = tape4.gradient(y_pred1, x)
    dy = d[:, 1]                            # du/dx2
dyy = tape3.gradient(dy, x)[:, 1]           # d2u/dx2^2
y_1 = dyy + dxx                             # the Laplacian, i.e. the operator L in (1)

After the value of the differential operator is found, the value of the loss function is computed in accordance with (3). Further, for models with first-order learning algorithms, the loss function gradient is calculated with respect to the weights of the neural network, and for models learned by the Levenberg-Marquardt algorithm, the Jacobian is calculated.
Both the gradient and the Jacobian are found using the corresponding automatic differentiation functions. Then the optimization function is called. For optimization by the Adam algorithm, the built-in function of the same name, tf.keras.optimizers.Adam, was used. The function implementing the Levenberg-Marquardt optimization algorithm is not built into TensorFlow and for this reason was implemented from scratch. The weight correction vector in the Levenberg-Marquardt method is formed as a result of solving system (6). To solve the system, the LU decomposition algorithm built into TensorFlow was used. Then the network weights and the regularization parameter are updated.
To describe the RBFNN layers, specially created objects were also used, which inherit from the tf.keras.layers.Layer class. They overload the initialization and call functions in order to conform to the RBFNN structure.
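A possible shape of such a layer, given here only as a hedged sketch and not as the authors' exact implementation, is:

import tensorflow as tf

class RBFLayer(tf.keras.layers.Layer):
    def __init__(self, n_rbf, **kwargs):
        super().__init__(**kwargs)
        self.n_rbf = n_rbf

    def build(self, input_shape):
        dim = int(input_shape[-1])
        # trainable RBF parameters: centers c_j and widths a_j
        self.centers = self.add_weight(
            name="centers", shape=(self.n_rbf, dim),
            initializer=tf.keras.initializers.RandomUniform(0.0, 1.0))
        self.widths = self.add_weight(
            name="widths", shape=(self.n_rbf,),
            initializer=tf.keras.initializers.Constant(1.0))

    def call(self, x):
        # squared distances ||x - c_j||^2 -> Gaussian activations
        sq = tf.reduce_sum((x[:, None, :] - self.centers[None, :, :]) ** 2, -1)
        return tf.exp(-sq / (2.0 * self.widths ** 2))

# the output is a plain linear combination of the RBF activations, as in (4)
model = tf.keras.Sequential([RBFLayer(25), tf.keras.layers.Dense(1, use_bias=False)])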

4 Experimental Study

For the experiments, a computer with the following characteristics was used: processor Intel Core i5 2310, frequency 2.9 GHz; RAM 24.0 GB.

To study the developed learning algorithms, problems described by the Poisson equation were selected:

∂²u/∂x₁² + ∂²u/∂x₂² = f(x₁, x₂), (x₁, x₂) ∈ Ω,
u = p(x₁, x₂), (x₁, x₂) ∈ ∂Ω,

where ∂Ω is the boundary of the area, and f and p are known functions.
The first PDE was solved for f(x₁, x₂) = sin(πx₁) sin(πx₂) and p(x₁, x₂) = 0. The problem has the analytical solution (Fig. 1a) u = −(1/(2π²)) sin(πx₁) sin(πx₂). The square [0, 1]² was chosen as the solution area. 100 sampling points were randomly generated inside the solution area and 40 points on the boundary (Fig. 1).

Fig. 1. Solution of the first PDE: a) analytical solution; b) solution obtained on the RBFNN

The number of RBFs was chosen to be 25. To initialize the weights, random numbers were used, uniformly distributed over the interval from 0 to 0.001. The initial values of the RBF widths were set equal to 1.0. The RBFNN was learned by the Adam and Levenberg-Marquardt algorithms. Figure 2 shows the values of the RBFNN parameters at initialization and after network learning using the Levenberg-Marquardt method: the RBF centers are plotted as circles, the RBF widths are conventionally shown by the diameters of the circles, and the weight values are shown by the filling of the circles. Figure 2 demonstrates the importance of tuning not only the weights, but also the RBF parameters.
A fully connected network learned by the Adam algorithm was set up with 3 hidden layers of 200 neurons each. A fully connected network learned by the Levenberg-Marquardt algorithm was set up with one hidden layer of 100 neurons. The weights of both fully connected networks were initialized with random numbers uniformly distributed from −1 to 1. The sigmoid function was chosen as the layer activation function. For networks using the Levenberg-Marquardt algorithm as the optimization algorithm, the initial value of the regularization parameter was set equal to 0.1.
Table 1 shows the learning results of networks of various configurations when solving the first PDE. Due to the dependence of the experimental results on the random generation of network parameters, a series of 10 experiments was carried out; the tables present the averaged results over the series of experiments.

Fig. 2. RBFNN parameters when solving the first PDE: a) at initialization; b) after learning

Table 1. Comparison of networks of different configurations on the solution of PDE 1

Network type | Optimization algorithm | Loss function | Learning epochs | Learning time, s
RBFNN | Adam | 1.6429e-05 | 2000 | 42
RBFNN | Levenberg-Marquardt | 4.0341e-07 | 100 | 5
DNN | Adam | 1.5427e-05 | 2000 | 43
DNN | Levenberg-Marquardt | 2.2194e-07 | 100 | 5

For the second series of experiments, a PDE is chosen whose analytical solution is a multimodal function [18] (Fig. 3):

u(x, y) = 3(x − 1)² e^(−x² − (y+1)²) − 10 (x/5 − x³ − y⁵) e^(−x² − y²) − (1/3) e^(−(x+1)² − y²).   (7)

The right-hand side of the PDE is obtained by double differentiation of (7), f = ∂²u/∂x² + ∂²u/∂y²; the resulting closed-form expression is lengthy and is not reproduced here.

The solution area was square [−3, 3]2 . The boundary conditions are equal to the
analytical solution at the boundary of the region. 150 internal sampling points and 50
boundary sampling points were randomly selected.
In RBFNN, the number of RBFs is chosen equal to 196. To initialize the weights,
random numbers were used, evenly distributed over the interval from 0 to 0.001. The
initial values of the RBF width were set equal to 1.0. RBFNN parameters are shown
in Fig. 4. DNN was set with 3 hidden layers of 200 neurons each. The weights were
initialized with random numbers uniformly distributed from −1 to 1. The sigmoidal
function was selected as the layer activation function.
When implementing the Levenberg-Marquardt algorithm, a problem arose with the
solution of systems of linear equations of a sufficiently large order. Solving such sys-
tems of linear equations using TensorFlow’s built-in functions is too time-consuming.

Fig. 3. Solution of the second PDE: a) analytical solution; b) solution obtained on the RBFNN

Moreover, if fewer neurons are used, then it is impossible to approximate the solution with sufficient accuracy. To reduce the number of learning epochs of the Levenberg-Marquardt algorithm, the RBFNN parameters were initialized not randomly, but with the weights obtained after 100 learning epochs of the Adam algorithm, which took 4 s. Further adjustment of all RBFNN parameters over 10 epochs, which took 63 s, made it possible to obtain a loss function value of 1.0087e-05. For the DNN, even this approach turned out to be unacceptable due to the excessive time spent on learning.

Fig. 4. RBFNN parameters when solving the second PDE: a) at initialization; b) after learning

For the second PDE, RBFNN showed the best results.



Table 2. Comparison of networks of different configurations on the solution of PDE 2

Network type | Optimization algorithm | Loss function | Learning epochs | Learning time, s
RBFNN | Adam | 1.3412e-04 | 2000 | 37
RBFNN | Adam | 4.6429e-05 | 20000 | 402
RBFNN | Levenberg-Marquardt | 1.0087e-05 | 110 | 67
DNN | Adam | 1.4409e-04 | 2000 | 46

5 Conclusion
The solution of partial differential equations on fully connected deep neural networks
and on radial basis functions networks is investigated. The TensorFlow library extensions
have been developed that implement radial basis functions networks and the algorithm
for learning networks proposed by the authors based on the Levenberg – Marquardt
method. Using model problems, a comparison of the popular Adam learning algorithm
and the developed algorithm of the Levenberg-Marquardt method is carried out. From the
results of the experiments, it can be seen that both types of networks have approximately
the same accuracy when learned with the Adam algorithm. On the other hand, although
learning with the Levenberg-Marquardt algorithm took a long time, it was possible to
obtain an accuracy that was not achievable with the Adam algorithm. It is promising to
pre-learn RBFNN to low accuracy using the Adam algorithm, followed by additional
learning of all network parameters using the Levenberg – Marquardt method algorithm.
At the same time, the Adam algorithm can be successfully used for tasks of sufficiently large dimension for which high accuracy is not required. As further areas of work, the
authors consider the implementation and study of fast learning algorithms for various
architectures neural networks for solving partial differential equations.

References
1. Lagaris, E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial
differential equations. IEEE Trans. Neural Netw. 9(5), 987–1000 (1998)
2. Yadav, N., Yadav, A., Kumar, M.: An Introduction to Neural Network Methods for Differential
Equations. Springer, Dordrecht (2015). https://doi.org/10.1007/978-94-017-9816-7
3. Cybenko, G.: Approximation by superposition of a sigmoidal function. Math. Control Signals
Syst. 2, 303–314 (1989). https://doi.org/10.1007/BF02551274
4. Hanin, B.: Universal function approximation by deep neural nets with bounded width and
ReLU activations (2017). https://arxiv.org/abs/1708.02691v2
5. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in
machine learning: a survey. J. Mach. Learn. Res. 18(1), 1–43 (2018)
6. Raschka, S., Mirjalili, V.: Python Machine Learning: Machine Learning and Deep Learning
with Python, Scikit-learn, and TensorFlow 2. Packt Publishing, Birmingham (2019)
7. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: a deep
learning framework for solving forward and inverse problems involving nonlinear partial
differential equations. J. Comput. Phys. 378, 686–707 (2019)

8. Tarkhov, D., Vasilyev, A.: Semi-Empirical Neural Network Modeling and Digital Twins
Development. Academic Press, Cambridge (2019)
9. Park, J., Sandberg, I.W.: Universal approximation using radial-basis-function networks.
Neural Comput. 3(2), 246–257 (1991)
10. Park, J., Sandberg, I.W.: Approximation and radial-basis-function networks. Neural Comput.
5(2), 305–316 (1993)
11. Gorbachenko, V.I., Zhukov, M.V.: Solving boundary value problems of mathematical physics
using radial basis function networks. Comput. Math. Math. Phys. 57(1), 145–155 (2017).
https://doi.org/10.1134/S0965542517010079
12. Gorbachenko, V.I., Alqezweeni, M.M.: Learning radial basis functions networks in solv-
ing boundary value problems. In: 2019 International Russian Automation Conference
— RusAtoCon, Sochi, Russia, 8–14 September, pp. 1–6 (2019)
13. Gorbachenko, V.I., Alqezweeni, M.M.: Modeling of objects with distributed parameters on
neural networks. Models Syst. Netw. Econ. Technol. Nat. Soc. 4(32), 50–64 (2019). (in
Russian)
14. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. https://arxiv.org/abs/1412.
6980
15. Levenberg, K.: A method for the solution of certain non-linear problems in least squares. Q.
Appl. Math. 2(2), 164–168 (1944)
16. Marquardt, D.: An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Ind.
Appl. Math. 11, 431–441 (1963)
17. Transtrum, M.K., Sethna, J.P.: Improvements to the Levenberg-Marquardt algorithm for
nonlinear least-squares minimization. https://arxiv.org/abs/1201.5885
18. Safarpoor, M., Takhtabnoos, F., Shirzadi, A.: A localized RBF-MLPG method and its applica-
tion to elliptic PDEs. Eng. Comput. 36(1), 171–183 (2019). https://doi.org/10.1007/s00366-
018-00692-y
Smart Technologies to Reduce the Spreading
of COVID-19: A Survey Study

Abdul Cader Mohamed Nafrees1(B), P. Pirapuraj1, M. S. M. Razeeth2, R. K. A. R. Kariapper1, and Samsudeen Sabraz Nawaz1
1 South Eastern University of Sri Lanka, Oluvil, Sri Lanka
nafrees@seu.ac.lk
2 Uva Wellassa University, Badulla, Sri Lanka

Abstract. Smart technologies can help people stay healthy during a pandemic and avoid the disease. Engineers and technology professionals have come out with long-term technological solutions to assist human activities while staying at home during the pandemic. The Internet of Things, Artificial Intelligence, wireless communication technologies, and 5G networks are just some of the ideas that have been developed. Smart technologies can provide smooth and secure functions to fight against pandemic diseases such as COVID-19. This study analyzed data on "Smart Technologies" and "COVID-19" after the coronavirus pandemic crisis, and the findings revealed that various smart technologies were used in the medical sector to reduce the pandemic. A wearable device can be developed to show the temperature of humans while maintaining social distance. Google Glass and thermal sensors can be used to monitor people's body temperature using infrared sensors. Data privacy and data security were the major issues while implementing the smart concept.

Keywords: Smart technology · Internet of Things · COVID-19 · Pandemic · Cloud computing

1 Introduction

A pandemic is a new kind of disease that spreads across countries or continents and affects a considerable number of lives on the earth. It can affect areas of human life such as the economy, education, health, and society. In that sense, researchers mentioned that the most recent pandemic is the coronavirus disease called COVID-19, which emerged from Wuhan, China, in December 2019 [1]. This virus is believed to typically transmit via respiratory droplets formed while an infected patient talks, coughs, or sneezes [2]. This virus has therefore affected not only human health but also the economic status of more than 211 countries to date by stopping human movement across countries, which almost stopped business activities within and between countries.

We are still facing problems completing our daily activities and getting essentials from outside, because the spread of COVID-19 keeps increasing while we fight with unstable treatments, and the death rate is also high, which leads us into a stressful life inside the home. Moreover, this is the breaking point at which technology comes onto the ground to make human life easier via virtual platforms that support day-to-day human tasks such as education, shopping, work life, engineering, and health care activities [3] while we quarantine. Further, a study pointed out that digital technologies and easy data access make it possible to work from home, learn online, and quarantine during the COVID-19 pandemic [4].
While medical scientists are working hard to create new treatment formulas, and some countries have already started to issue vaccines for COVID-19, engineers and technology professionals, on the other hand, have come out with different long-term technological solutions, such as the Internet of Things, Artificial Intelligence, wireless communication technologies, and 5G networks, to assist human activities while people stay at home during any kind of pandemic. In that sense, China's smart cities led the government to control the country in a techno-driven approach to contain virus transmission [1].
Countries face severe challenges in implementing innovative solutions, where the best solution for controlling these pandemics is social distancing until a permanent cure comes out [5]. However, another study firmly showed that an epidemic can be controlled by using smart health monitoring and electronic health records (EHR) to avoid false information among the public [6]. However, we humans cannot stay at home for an extended period, as this can lead to huge damage to a country's economic status and to human mental health. Therefore, as mentioned previously, as technological researchers we carried out a research study about how smart technologies can help humans avoid the pandemic and move on with our regular lives even while staying at home.
Smart technologies can provide smooth and secure functioning to the entire world, covering travel, business activities, educational organizations, the construction industry, and healthcare sectors with or without direct human interaction, and inventors are trying their best to make devices work independently of humans. In that sense, Singapore used state-led technologies such as TraceTogether and SafeEntry, built bottom-up, to fight COVID-19 during the second wave of the pandemic [7], and computer scientists have proposed protecting ourselves using smart textiles and wearable technologies instead of typical face masks [8]. Furthermore, virus outbreaks can be managed by drones, IoT, AI, blockchain, machine learning, and 5G [9].
This review paper presents an overview of the available solutions, modern practices, and applications of smart technologies in the fight against pandemics such as COVID-19. We also discuss the challenges posed by these technologies, such as security, privacy, and threats of erroneous information, and give recommendations to overcome those challenges. Finally, we conclude the paper by proposing the best technologies and devices identified in our literature review for surviving any pandemic situation, especially when we are forced to stay at home to avoid deadly diseases.

2 Existing Research and Developments


Daniel G. Costa et al. [10] carried out an analysis of how detection, alerting, and mitigation mechanisms can be applied to manage outbreaks of infectious diseases in large cities. The COVID-19 pandemic raised many questions about how potential outbreaks can be detected, alerted, and mitigated. In addition, given that cities will be at the centre of significant contagious spread, current and future smart cities should be designed, implemented, and managed to address disease outbreaks as a critical emergency. They analyzed an intelligent system for this scenario in which sensor-based monitoring stations detect patterns related to flu-like outbreaks, using multi-sensors to measure the heart rate, body temperature, and respiration rate of passengers.
Furthermore, email, SMS messages, or even television broadcasts and social media posts could serve as alerting channels. Automated clinics and healthcare systems are, in fact, the most transparent approach to pandemic mitigation: AI algorithms process statistical data so that the number of available hospital beds, medical staff, and medicine can be managed in advance.
Another study, by Vibhutesh Kumar Singh et al. [11], proposed an IoT-based wearable quarantine band (IoT-Q-Band) to detect absconding patients before they infect more people. They argued that current methods of monitoring COVID-19-positive patients seem inadequate, often carried out through mobile applications (e.g., Aarogya Setu) or visual-indicator-based tracking (e.g., medical authorities stamping hands with non-washable ink). To overcome these issues, a wearable band paired with a mobile application is suggested in this research. Using global positioning system (GPS) based geofencing, the tracking system generates real-time alerts and allows the authorities to detect absconding quarantine patients in real time. The IoT-Q-Band includes a GPS module and a Bluetooth module.
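The geofencing check at the heart of such a band reduces to a distance test against the registered quarantine location. The following is a minimal C++ sketch of that idea, not code from [11]; the haversine helper, the coordinates, the 200 m radius, and the alert message are all illustrative assumptions.

```cpp
#include <cmath>
#include <cstdio>

// Great-circle distance between two GPS fixes (haversine formula), in metres.
static double haversineMetres(double lat1, double lon1, double lat2, double lon2) {
    const double R = 6371000.0;                          // mean Earth radius, metres
    const double toRad = 3.14159265358979323846 / 180.0;
    const double dLat = (lat2 - lat1) * toRad;
    const double dLon = (lon2 - lon1) * toRad;
    const double a = std::sin(dLat / 2) * std::sin(dLat / 2) +
                     std::cos(lat1 * toRad) * std::cos(lat2 * toRad) *
                     std::sin(dLon / 2) * std::sin(dLon / 2);
    return 2.0 * R * std::asin(std::sqrt(a));
}

int main() {
    // Hypothetical registered quarantine location and geofence radius.
    const double homeLat = 6.9271, homeLon = 79.8612;
    const double radiusMetres = 200.0;

    // A GPS fix as it might arrive from the band's GPS module.
    const double fixLat = 6.9290, fixLon = 79.8630;

    if (haversineMetres(homeLat, homeLon, fixLat, fixLon) > radiusMetres) {
        std::printf("ALERT: subject has left the quarantine geofence\n");
    }
    return 0;
}
```

In a deployed band, the alert would of course be pushed to the authorities over the network rather than printed; the distance test itself is the same.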
A survey analysis was done by Nasir Saeed et al. [12] about the involvement of wireless communication in the COVID-19 pandemic from multiple perspectives. They analyzed how wireless communication technologies help combat the pandemic by monitoring the spread of the virus, enabling healthcare automation, and encouraging virtual education and conferencing. They also discussed challenges posed by wireless technologies, including privacy concerns, safety, and misinformation. The importance of wireless technologies in the automation of industries and the supply chain, e-commerce, and supporting occupations during the pandemic was also discussed in this study.
Another survey study was done by Zaheer Allam and David S. Jones [13] about how urban health data can be shared among smart cities during the COVID-19 pandemic. Through such a platform, scientists from other regions can gain access to information and are therefore able to act much faster, as in the case of scientists from the Virus Identification Laboratory at the Doherty Institute, Australia, who managed to grow a similar virus in the laboratory after accessing data shared by Chinese scientists. Thermal cameras or Internet of Things (IoT) sensors can be used to collect urban health data automatically. As the study concludes, sharing urban health data among all cities can help restrict the outbreak of the disease and improve the urban economy and urban safety.

Nenad Petrovic and Dorde Kocic [14] introduced a cost-effective IoT-based solution to increase COVID-19 indoor safety. An Arduino Uno microcontroller board with a contactless temperature sensor, and a Raspberry Pi single-board computer equipped with a camera, are used to enforce three rules: people with high body temperature should stay at home, wearing a mask is obligatory, and the distance between persons should be at least 1.5–2 m. The OpenCV library and a cascade machine learning approach (face and body detection algorithms) run on the Raspberry Pi to monitor mask-wearing, while the Arduino Uno, equipped with an infrared thermometer (such as the MLX90614) or a thermal camera sensor (AMG8833), performs the contactless temperature check. When anyone breaks these rules while entering or inside the building, a notification is sent to the security guards' smartphones.
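To make the contactless temperature check concrete, the Arduino-style sketch below reads an MLX90614 over I2C using the widely available Adafruit_MLX90614 library. It is a minimal illustration rather than the authors' code; the 38 °C threshold and the serial alert are our own assumptions.

```cpp
#include <Wire.h>
#include <Adafruit_MLX90614.h>

Adafruit_MLX90614 mlx = Adafruit_MLX90614();
const float FEVER_THRESHOLD_C = 38.0;   // illustrative cut-off, not from [14]

void setup() {
  Serial.begin(9600);
  mlx.begin();                          // MLX90614 communicates over I2C
}

void loop() {
  // Temperature of the object in the sensor's field of view, in Celsius.
  float objectTempC = mlx.readObjectTempC();
  Serial.print("Object temperature: ");
  Serial.println(objectTempC);
  if (objectTempC > FEVER_THRESHOLD_C) {
    // In the full system this event would trigger the smartphone notification.
    Serial.println("High temperature detected - entry should be refused");
  }
  delay(1000);
}
```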
Mwaffaq Otoom et al. [15] proposed a real-time COVID-19 detection and monitoring system that uses an Internet of Things (IoT) framework to collect real-time symptom data from users for earlier detection of suspected coronavirus cases, to assess the clinical outcomes of those who have already recovered from the virus, and to understand the nature of the virus by collecting and analyzing the relevant data. They used eight machine-learning algorithms to identify potential coronavirus cases quickly from this real-time symptom data: Support Vector Machine (SVM), Neural Network, Naïve Bayes, K-Nearest Neighbor (K-NN), Decision Table, Decision Stump, OneR, and ZeroR. In addition, another paper argued that IoT-enabled healthcare devices connected through cloud computing and built with smart sensors to measure and record individuals' body temperature help identify affected individuals and maintain a social distance from them [16]. Furthermore, IoT-enabled devices and IoT-based healthcare applications and systems help reduce virus spread by collecting and analyzing data, which leads to the creation of big data. Data privacy and security, however, remain major concerns for these technologies, while big data quality is the primary factor in improving the control and monitoring of pandemics [17].
A study explored how IoT-enabled blended learning provides a new normal for the education sector, which can help to reduce virus spread. However, data privacy and security issues arise here too, as external sources can access the data collected through the IoT devices; these issues can be mitigated by using pseudo-anonymized information [18]. Meanwhile, in another analysis, the authors confirmed that online education could help to reduce the spread of COVID-19 by keeping students learning from home [3].
Another survey analysis, by Md. Siddikur Rahman et al. [19], asked how the Internet of Things (IoT) can help save the world from the novel coronavirus (COVID-19) outbreak. They said that IoT-enabled health monitoring systems provide real-time surveillance through the use of wearable health-monitoring devices, cloud-based remote health testing, and artificial intelligence (AI). They also said that when AI and machine learning merge with the distributed cloud, practical blockchain, system software automation, and AI speech collection, health monitoring systems enable a responsive remote monitoring link between the patient and the doctor. Other authors have proposed an IoT-based detection and monitoring system that not only identifies asymptomatic COVID-19 patients as early as possible to reduce the
spread and infection rate of the virus but also measures the temperature, blood pressure, and heartbeat of the quarantined person without visiting the patient physically; this system is developed using an oximeter sensor, and the readings can be sent to the doctor's mobile phone or laptop via a web server [20].
A survey analysis by Jobie Budd et al. [21] examined how digital technologies are being used to support the public health response to COVID-19 globally, including demographic monitoring, case identification, contact monitoring, assessment of initiatives based on mobility data, and communication with the public.
In another review study, Tan Yigitcanlar et al. [22] discussed the artificially intelligent (AI) city. They argued that sustainable practices based on AI technologies can form the basis of urban locations as robust systems whose economic, societal, environmental, and governmental activities help achieve desirable social outcomes and futures for humans and non-humans alike. Meanwhile, another study showed that robotics applications using various AI techniques, such as object recognition, emotional intelligence, face recognition, flight-path optimization, NLP, and fall detection, can help health workers with patient monitoring, virus disinfection, identifying blood veins, and supply delivery during the pandemic [23]. Further research on the AI technologies of two companies, BlueDot and Metabiota, explained how AI-driven algorithms help with early detection and prediction of COVID-19 using massive amounts of shared data, although at the cost of compromising data privacy and security [4].
Ravi Pratap Singh et al. selected twelve significant applications of IoT and analyzed how these applications can be used to fight the COVID-19 pandemic. They concluded that IoT effectively detects symptoms and quickly delivers improved care to an infected COVID-19 patient, helping the patient, doctor, physician, and hospital management system [11]. Furthermore, researchers mentioned that IoT-based equipment such as wearables, drones, robots, buttons, and smartphone applications assists patients and healthcare workers in three different phases of the fight against COVID-19: early diagnosis, quarantine time, and after recovery [24].
Tanweer Alam proposed a four-layer architecture using IoT and blockchain to detect COVID-19 and prevent individuals from contracting it. IoT-based devices collect valuable information, provide additional insight into symptoms and behaviours, enable remote surveillance, and generally provide more self-determination and treatment for people. Blockchain enables the safe transfer of the patient's health information and controls the medical distribution network; it can also help to check the quality of medical supplies. Furthermore, the author suggested the concept of a smart ambulance connected to cloud computing that can use technologies such as GPS, IoT, AI, speech recognition, biosensors, and automatic sanitizing operations, helping to reduce the risk of infection for healthcare workers [25].
Another study discusses the implications of IoT implementation for scanning business processes and fast-moving consumer goods (FMCG) supply chain sustainability under the COVID-19 pandemic lockdown policy of reduced contact and physical
distance. The IoT idea focuses on installing a virtual network that integrates all data about manufacturing and service operations within the supply chain via Radio Frequency Identification (RFID) tags, bar codes, wireless sensors (WS), and smart devices [26]. In addition, the Internet of Things and crowdsourcing play major roles in urban planning, and distance learning benefits from them; these factors support social distancing, which is the major factor in reducing the spread of COVID-19 [27].
Syyada Abeer Fatima et al. proposed an IoT-enabled Smart Monitoring of Coronavirus empowered with a Fuzzy Inference System (IoTSMCFIS), which smartly monitors and predicts whether a human is a victim of coronavirus or not; the proposed IoTSMCFIS system uses MATLAB 2019a for simulations. In another research work, the authors presented a decentralized IoT-based face detection system, validated against a state-of-the-art face detection system, that helps avoid crowds during the lockdown period; the system uses a CNN-based multitask cascaded framework [28]. A group of researchers developed a deep-learning infodemiology platform called CollaborativeHealth, which helps health professionals access real-time data through a configurable dashboard, with the data gathered from social networks, public networks, and voluntary citizen participation [29]. Furthermore, researchers proposed a solution based on existing work: an IoT system built using sensors, RFID tags, and smartphones that can be used to prevent and monitor the epidemic and reduce virus spread; the system can be created using physical, virtual, and hybrid objects along with protocols, network devices, and a server [30].
Researchers recommended that smart textiles play significant roles in the development of personal protective equipment (PPE) and telemedicine: such devices can not only protect against but also detect viruses, are capable of self-decontamination, durable, and biodegradable, and help to reduce overcrowding and human exposure, especially in hospitals [8]. Meanwhile, researchers confirmed that countries using smart technologies reduced the spread of COVID-19 and the death rate compared to the standard lockdown procedures implemented during the initial stage of the pandemic; these smart concepts included contact-free technologies, drones and robots for smart healthcare systems, and online tools for education and meetings [31].
Hameed Khan et al. did a survey review [32] on the impact of smart technologies in tackling the COVID-19 pandemic. They analyzed the following technologies: robotics and drone technology-driven approaches, artificial intelligence, and fabrication methods like 3D printing of masks and sensors. Several sensors and smart technologies used to tackle the pandemic were discussed in this research, such as the COVID-19 FET sensor, 3D-printed masks, temperature and face recognition helmets, and social robots.
Mohammad Nasajpour et al. [33] conducted a survey on recently proposed IoT devices for healthcare workers and emergency management services to aid in the containment of the COVID-19 pandemic. The researchers analyzed
the IoT solutions used in three phases: early diagnosis, quarantine time, and recovery. Several wearable IoT devices are discussed, such as smart thermometers, smart helmets, smart glasses, the IoT-Q-Band, Easy Band, and Proximity Trace, along with drones, robots, IoT buttons, and smartphone applications. Each technology was analyzed separately across the three phases.
M. N. Mohammed et al. proposed an automated coronavirus detection system [24] with less human involvement: a smart helmet with a mounted thermal imaging system. The proposed system overcomes the issues faced with the currently used method of thermal screening with handheld infrared thermometers. An IoT-based system is proposed using an Arduino development board, an infrared camera, GPS, and facial-recognition technology.
Jung Won Sonn and Jae Kwang Lee discussed how South Korea minimized COVID-19 infections and the death rate by applying smart city technologies [34]. The three critical technologies used were credit and debit cards, cell phones, and CCTV. As 94.4% of all transactions in South Korea are cashless, transaction records reveal where purchases were made with a card; because tracking those movements requires excellent coverage of people's mobility, mobile phone records and CCTV footage, with their multiple functions, were used for the same purpose.
Several studies have introduced artificial intelligence methods to identify COVID-19-related issues. Lin et al. [35] utilized a deep learning model to identify COVID-19 from chest CT images, achieving 96% accuracy on their dataset. Chuansheng et al. [36] applied the same kind of model and algorithm, which gave them only 90.1% accuracy. Fatima et al. [37] obtained 97% accuracy for COVID-19 detection with convolutional neural networks (CNN). In addition, Gozes et al. [38] obtained 99.6% accuracy with their trained deep learning model for COVID-19 classification from CT images. Prediction of COVID-19 cases reached 92.77% accuracy with a support vector regressor (SVR) model in the work of Matheus et al. [39]. The least-squares support vector machine (LSSVM) is another model used for prediction in the machine learning approach; Sarbjit et al. [40] obtained 99% accuracy with LSSVM for COVID-19 confirmed-case prediction.
Since the start of the COVID-19 pandemic, many scientists have built medical applications to overcome at least a few of its issues. Nasajpour et al. [24] describe the "DetectaChem" application for identifying COVID-19 at low cost. Benny and Eyal [41] presented the "Hamagen" application for finding close contacts of COVID-19-positive cases, Thiele [42] introduced "Stopp Corona" to identify COVID-19 cases, and "COVIDSafe" was described by David [43] for tracking COVID-19 cases. Further details of the technologies used are summarized in Table 1.

Table 1. Technologies applied in the previous studies

Author(s)          Technology
[10]               IoT, broadcasting, machine learning
[11, 33]           IoT, IoT-Q-Band, geofencing, Bluetooth
[12]               Wireless technology
[13, 17, 18, 27]   IoT
[14]               IoT, OpenCV, machine learning
[15, 16]           IoT, machine learning (SVM, NN, Naïve Bayes)
[19]               IoT, cloud computing, AI
[22, 23]           AI, machine learning
[24]               IoT, drones
[25]               IoT, blockchain
[26]               IoT, WSN, virtual network, supply chain
[28]               IoT, machine learning (CNN)
[29]               Deep learning
[30]               IoT, server
[31, 32]           Drones, robots
[34]               Smart city
[35–40]            AI
[41–43]            IoMT

3 Methodology
This paper follows a systematic review [44]: a single consolidated review of many previous relevant studies consisting of a comprehensive and unbiased synthesis, which goes beyond a general literature review process [45]. Furthermore, this study used a qualitative technique to analyze the data collected from the systematic review. Qualitative data analysis can mean various things, as it is often aligned with a particular methodology, theoretical perspective, research tradition, and/or field [46].

3.1 Time Span

This study mainly focused on feasible solutions to reduce the COVID-19 pandemic and monitor COVID-19 victims using smart technologies; it therefore covers research on COVID-19 and smart technologies conducted over the period 2019–2021, after the pandemic began. COVID-19 was found in China in 2019 [1]. Healthcare professionals have made many attempts to diagnose and cure COVID-19 patients [47], while technology experts have worked on various research projects to develop smart technology solutions for diagnosing and monitoring COVID-19 victims [48]. Publications on "COVID-19 and Smart Technologies" from various indexing digital libraries were searched using different strings and keywords related to the pandemic and smart concepts; relatively few research articles published after 2019 were found.

3.2 Research Strings and Databases to Search


Research articles were searched according to two main aspects: "smart technologies" and "COVID-19". In addition, a few terms related to these two subjects were also considered while searching for research papers, such as "pandemic", "IoT", "systematic review", "wearable devices", and "robots". These search strings were combined with the help of the logical operators AND and OR. The required papers were downloaded from well-known research digital libraries, publishers, databases, and tools such as IEEE, Springer, Elsevier, Atlantis Press, Thomson Reuters, Emerald, and Google Scholar. Finally, all the downloaded papers were filtered and summarized according to the criteria and research questions in Sects. 3.3 and 3.4. Figure 1 shows the classification diagram for the systematic literature review.
3.3 Criteria for Study Selection

We collected the required data from previously published peer-reviewed journal papers, book chapters, and proceedings of international conferences. These articles were selected based on the following significant factors:

• Only full-paper articles
• Articles published in high-index citation databases, publishers, and digital libraries
• Freely accessible research papers
• Published from 2019 onwards
• Research papers related to ICT, COVID-19, and pandemics
• Published in the English language

3.4 Development of Research Questions

Around 175 research papers were downloaded under the above criteria, and from those 175 papers, only 41 were retained for this review article based on the following research questions (see Table 2).

Table 2. Research questions (RQ)

RQ1: What are the usable features of smart technologies? (Motivation: find the smart technologies)
RQ2: Recognize the features of smart applications which can help to reduce the chance of the COVID-19 pandemic. (Motivation: select the most suitable smart technologies)
RQ3: What are the smart devices available to avoid the virus spreading? (Motivation: recommend devices from the existing developed and proposed smart devices and systems)
RQ4: Select the most suitable device which can reduce the spreading of the virus. (Motivation: select the best device from RQ3 to avoid virus spreading)

3.5 Analysis of the Selected Studies

Following the systematic approach, we summarized all the smart technologies and smart devices discussed in the previous research works to reduce the COVID-19 pandemic, guided by the research questions in Sect. 3.4, and noted both the positive and negative faces of those technologies, concepts, and devices. Finally, we drew the conclusions, limitations, recommendations, and future work plan based on the summarized information.

[Figure: PRISMA-style flow — Identification (downloaded articles, excluded if not in English or not relevant), Screening by abstract (excluded if not relevant), Eligibility (excluded if the full text was not available), Included (46 full-text articles for qualitative analysis).]

Fig. 1. Classification diagram of Research Articles

4 Discussion

Various smart healthcare systems and applications have been developed and proposed to avoid spreading the virus, monitor patients, reduce crowds, and detect COVID-19 early, using technologies that help not only healthcare workers but also other citizens, such as students, teachers, and patients. In that manner, smart cities provided resilient services during the pandemic crisis [49].

Sensors can be used to identify the vital signs of patients [50]. They have been used to develop smart systems that assist health workers in hospitals and other public places: sensor-based systems locate the number of empty beds in the hospital, measure human body temperature, and notify the staff via SMS or email. Temperature sensors, such as contactless thermometers and thermal cameras, can be used to design IoT-based smart devices that measure human temperature from the doctor-recommended distance. It has been confirmed that social distancing can be maintained using sensors [51]. Furthermore, sensors like oximeters can help measure patients' blood pressure, body temperature, and heartbeat without visiting them.
Wearable devices are used in various healthcare applications [52]. They can monitor and locate quarantined patients in real time using GPS, which helps to reduce the spread of COVID-19 and keep a social distance from those patients. Moreover, IoT can be used to develop wearable devices with the help of AI and remote cloud computing, and such systems can assist in finding asymptomatic COVID-19 patients. These IoT-enabled devices mainly help healthcare workers, alongside robots, buttons, and smartphone applications. Furthermore, IoT can reduce virus spread through smart ambulances that help health workers while transferring patients. Any IoT-enabled wearable device offers great support in reducing human interference [53].
Robots can be used in medical work [54]. Robots based on various AI techniques can be used in the medical and supply fields, helping to reduce virus spread and detect COVID-19. These robots mainly assist health workers while treating patients; they can also deliver medicine and food during quarantine periods. Robotic applications can collect samples from patients for screening, disinfect hospitals, supply logistics and food to infected patients, and collect physiological readings [55]. Meanwhile, IoT can play a significant role in the FMCG supply chain industry through RFID, WSN, and barcodes, which help reduce virus transmission by supporting social distancing.
An IoT-based real-time data collection system can help with early identification, monitoring, and prediction of future treatment of COVID-19 victims [56]. Detection and monitoring systems can be built on an IoT framework for real-time data collection from corona-affected patients; the collected data can be stored in the cloud to create big data, which can be analyzed with machine learning algorithms and help to avoid misinformation and panic among the public. Furthermore, the security of the data can be ensured by blockchain technology [57].
Smart textiles can be used to create PPE and masks that prevent the spread of viruses, detect the virus, and help avoid overcrowding. One study proposed a five-layer smart mask designed with a smart filter to prevent the coronavirus from spreading [58].
Based on the above reviews and studies, it is confirmed that information and communication technologies can create devices and applications, called smart systems, that help to reduce virus spread and detect it early with the help of the latest technologies such as AI, cloud services, blockchain, and the Internet of Things. Furthermore, these studies show that such technologies mainly help health workers as well as reducing overcrowding.

However, none of the available smart devices and proposed solutions is fully fledged for the general public. Based on doctors' and WHO recommendations, social distancing is the top way to reduce corona spread. Therefore, a smart device must be developed that is low-cost, user-friendly, and able to identify people who have COVID-19 symptoms; at minimum, the device must measure human temperature so that people can keep their distance from an affected person.
Apart from the above discussion, a significant issue arises with the data collected through these smart technologies, which remains questionable: data privacy and security. Experts are continuously working to overcome these two matters while developing their inventions; they have explained the technical challenges and implications of COVID-19 data collection [59], and a possible solution using blockchain technology has been proposed [57].
Therefore, a simple device must be developed that all people can afford in the name of a smart device. With the above technologies, a wearable device can be built using Google Glass and thermal sensors, such as infrared sensors, designed to show the temperature of humans while maintaining the WHO-recommended social distancing measures. Furthermore, this work can be extended to identify the location and face of people with a higher temperature using machine learning and image processing techniques, and to notify the relevant authorities.
While conducting this research work, we faced some limitations: few research works have been published on "Smart Technologies and COVID-19", some high-index research works could not be accessed due to lack of funding, and no systematic review articles were available in the same context as this paper.

5 Conclusion and Recommendation


The COVID-19 pandemic crisis is a significant health issue worldwide for which there is so far neither a stable cure nor a solution for identifying victims while keeping social distance. Some medical treatments are available in different countries, not yet fully tested but used directly on humans. Hundreds of researchers have proposed or suggested solutions to reduce the spread of COVID-19 through medical treatment or smart technologies. Therefore, as an information technology research team, we came forward to identify feasible smart technologies that reduce the spread of the coronavirus while keeping the social distance recommended by the WHO.
This study used a systematic literature review with a qualitative analysis of fully published research articles on "Smart Technologies" and "COVID-19" written since the coronavirus pandemic began. The studies revealed that various smart technologies and devices are used in the medical sector to reduce the pandemic, monitor patients, and maintain social distance in crowded places: smart wearable devices, smartphone applications, drones, and robots using various sensors, IoT, blockchain, and AI technologies. Patient monitoring using smart technologies has been a successful innovation, but none of the available innovations identifies the symptoms of COVID-19 victims. Furthermore, data privacy and data security were the major issues in implementing the smart concept, although blockchain technology could be used to reduce this issue.

References
1. Kummitha, R.K.R.: Smart technologies for fighting pandemics: The techno- and human-
driven approaches in controlling the virus transmission. Gov. Inf. Q. 37(3), 101481 (2020)
2. Karia, R., Gupta, I., Khandait, H., Yadav, A., Yadav, A.: COVID-19 and its modes of trans-
mission. SN Compr. Clin. Med. 2(10), 1798–1801 (2020). https://doi.org/10.1007/s42399-
020-00498-4
3. Nafrees, A.C.M., Roshan, A.M.F., Nuzla Baanu, A.S., Shibly, F.H.A., Maury, R., Kariapper,
R.K.A.R.: An investigation of Sri Lankan university undergraduates’ perception about online
learning during covid-19 : with superior references to South Eastern university. Solid State
Technol. 63(6), 8829–8840 (2020)
4. Maalsen, S., Dowling, R.: Covid-19 and the accelerating smart home. Big Data Soc. 7(2),
1–5 (2020)
5. Tešić, D.B., Lukić, A.: Bringing ‘smart’ into cities to fight pandemics: with the reference to
the COVID-19. Zb. Rad. Departmana za Geogr. Turiz. i Hotel., (49–1), 99–112 (2020)
6. Avdić, A.R., Marovac, U.M., Janković, D.S.: Smart health services for epidemic control. In:
ICEST 2020, pp. 46–49 (2020)
7. Das, D., Zhang, J.J.: Pandemic in a smart city: Singapore’s COVID-19 management through
technology & society. Urban Geogr. 42(3), 1–9 (2020)
8. Ivanoska-Dacikj, A., Stachewicz, U.: Smart textiles and wearable technologies – opportunities
offered in the fight against pandemics in relation to current COVID-19 state. Rev. Adv. Mater.
Sci. 59(September), 487–505 (2020)
9. Chamola, V., Hassija, V., Gupta, V., Guizani, M.: A Comprehensive review of the COVID-19
pandemic and the role of IoT, Drones, AI, Blockchain, and 5G in managing its impact. IEEE
Access 8(April), 90225–90265 (2020)
10. Costa, D.G., Peixoto, J.P.J.: COVID-19 pandemic: a review of smart cities initiatives to face
new outbreaks. IET Smart Cities 2(2), 64–73 (2020)
11. Singh, V., Chandna, H., Kumar, A., Kumar, S., Upadhyay, N., Utkarsh, K.: IoT-Q-Band: a
low cost internet of things based wearable band to detect and track absconding COVID-19
quarantine subjects. EAI Endorsed Trans. Internet Things 6(21), 163997 (2020)
12. Saeed, N., Bader, A., Al-Naffouri, T.Y., Alouini, M.S.: When wireless communication faces
COVID-19: combating the pandemic and saving the economy. arXiv, vol. 1, no. November,
pp. 1–15 (2020)
13. Richards, T.J., et al.: On the coronavirus (COVID-19) outbreak and the smart city network:
universal data sharing standards coupled with artificial intelligence (AI) to benefit urban health
monitoring and management. Tackling Coronavirus Contrib. Glob. Effort 8(1), 2–25 (2020)
14. Petrovic, N., Kocic, D.: IoT-based system for COVID-19 indoor safety monitoring. In:
IcETRAN 2020 (2020)
15. Otoom, M., Otoum, N., Alzubaidi, M.A., Etoom, Y., Banihani, R.: An IoT-based framework
for early identification and monitoring of COVID-19 cases. Biomed. Signal Process. Control
62(April), 102149 (2020)
16. Kumar, K., Kumar, N., Shah, R.: Role of IoT to avoid spreading of COVID-19. Int. J. Intell.
Netw. 1(May), 32–35 (2020)
17. Ndiaye, M., Oyewobi, S.S., Abu-Mahfouz, A.M., Hancke, G.P., Kurien, A.M., Djouani, K.:
IoT in the wake of COVID-19: a survey on contributions, challenges and evolution. IEEE
Access 8, 186821–186839 (2020)
18. Siripongdee, K., Pimdee, P., Tungwongwanich, S.: A blended learning model with IoT-
based technology: effectively used when the COVID-19 pandemic? J. Educ. Gift. Young
Sci. 8(June), 905–917 (2020)

19. Rahman, M.S., Peeri, N.C., Shrestha, N., Zaki, R., Haque, U., Hamid, S.H.A.: Defending
against the Novel Coronavirus (COVID-19) outbreak: how can the Internet of Things (IoT)
help to save the world? Heal. Policy Technol. 9(2), 136–138 (2020)
20. Arun, M., Baraneetharan, E., Kanchana, A., Prabu, S.: Detection and monitoring of the asymptomatic COVID-19 patients using IoT devices and sensors. Int. J. Pervasive Comput. Commun. (2020)
21. Budd, J., et al.: Digital technologies in the public-health response to COVID-19. Nat. Med.
26(8), 1183–1192 (2020)
22. Yigitcanlar, T., Butler, L., Windle, E., Desouza, K.C., Mehmood, R., Corchado, J.M.: Can
building ‘artificially intelligent cities’ safeguard humanity from natural disasters, pandemics,
and other catastrophes? An urban scholar’s perspective. Sensors (Switzerland) 20(10), 1–20
(2020)
23. Fong, S.J., Dey, N., Chaki, J.: AI-enabled technologies that fight the coronavirus outbreak. In:
Fong, S.J., Dey, N., Chaki, J. (eds.) Artificial Intelligence Coronavirus Outbreak. Springer-
Briefs in Applied Science and Technolology, pp. 23–45. Springer, Singapore (2021). https://
doi.org/10.1007/978-981-15-5936-5_2
24. Nasajpour, M., Pouriyeh, S., Parizi, R., Dorodchi, M., Valero, M., Arabnia, H.: Internet
of things for current COVID-19 and future pandemics: an exploratory study. J. Healthcare
Inform. Res. 4(4), 325–364 (2020). https://doi.org/10.1007/s41666-020-00080-6
25. Kumar, S., Raut, R.D., Narkhede, B.E.: A proposed collaborative framework by using artifi-
cial intelligence-internet of things (AI-IoT) in COVID-19 pandemic situation for healthcare
workers. Int. J. Healthc. Manag. 13(4), 337–345 (2020)
26. Končar, J., Grubor, A., Marić, R., Vučenović, S., Vukmirović, G.: Setbacks to IoT imple-
mentation in the function of FMCG supply chain sustainability during COVID-19 pandemic.
Sustain. 12(18), 7391 (2020)
27. Abusaada, H., Elshater, A.: COVID-19 challenge, information technologies, and smart cities:
considerations for well-being. Int. J. Commun. Well-Being 3(3), 417–424 (2020). https://doi.
org/10.1007/s42413-020-00068-5
28. Kolhar, M., Al-turjman, F., Alameen, A., Abualhaj, M.: A Three layered decentralized IoT
biometric architecture for city lockdown during COVID-19 outbreak. IEEE Access 8, 163608–
163617 (2020)
29. Apolinario-Arzube, Ó., et al.: CollaborativeHealth: Smart Technologies to Surveil Outbreaks
of Infectious Diseases Through Direct and Indirect Citizen Participation. In: Silhavy, R. (ed.)
CSOC 2020. AISC, vol. 1226, pp. 177–190. Springer, Cham (2020). https://doi.org/10.1007/
978-3-030-51974-2_15
30. Kaushalya, S.A.D.S., Kulawansa, K.A.D.T., Firdhous, M.F.M.: Internet of things for epidemic
detection: a critical review. In: Bhatia, S.K., Tiwari, S., Mishra, K.K., Trivedi, M.C. (eds.)
Advances in Computer Communication and Computational Sciences: Proceedings of IC4S
2018, pp. 485–495. Springer Singapore, Singapore (2019). https://doi.org/10.1007/978-981-
13-6861-5_42
31. Jaiswal, R., Agarwal, A., Negi, R.: Smart solution for reducing the COVID-19 risk using
smart city technology. IET Smart Cities 2(2), 82–88 (2020)
32. Hameed Khan, K.K., Kushwah, S., Urkude, H., Maurya, M., Sadasivuni, K.: Smart tech-
nologies driven approaches to tackle COVID-19 pandemic: a review. 3 Biotech 11(2), 1–22
(2021). https://doi.org/10.1007/s13205-020-02581-y
33. Mohammed, M.N., et al.: Novel COVID-19 detection and diagnosis system using IoT based
smart helmet. Int. J. Adv. Sci. Technol. 29(7), 954–960 (2020)
34. Sonn, J.W., Lee, J.K.: The smart city as time-space cartographer in COVID-19 control: the
South Korean strategy and democratic control of surveillance technology. Eurasian Geogr.
Econ. 61(4–5), 482–492 (2020)

35. Li, L., et al.: Artificial intelligence distinguishes COVID-19 from community acquired
pneumonia on chest CT. Radiology (2020)
36. Zheng, C., et al.: Deep learning-based detection for COVID-19 from chest CT using weak
label. MedRxiv (2020)
37. Salman, F.M., Abu-Naser, S.S., Alajrami, E., Abu-Nasser, B.S., Alashqar, B.A.M.: Covid-19
detection using artificial intelligence (2020)
38. Gozes, O., et al.: Rapid ai development cycle for the coronavirus (Covid-19) pandemic: initial
results for automated detection & patient monitoring using deep learning CT image analysis.
arXiv Prepr. arXiv:2003.05037 (2020)
39. Ribeiro, M.H.D.M., da Silva, R.G., Mariani, V.C., dos Santos Coelho, L.: Short-term forecast-
ing COVID-19 cumulative confirmed cases: perspectives for Brazil. Chaos Solitons Fractals
135, 109853 (2020)
40. Singh, S., Parmar, K.S., Makkhan, S.J.S., Kaur, J., Peshoria, S., Kumar, J.: Study of ARIMA
and least square support vector machine (LS-SVM) models for the prediction of SARS-CoV-2
confirmed cases in the most affected countries. Chaos Solitons Fractals 139, 110086 (2020)
41. Pinkas, B., Ronen, E.: Hashomer-a proposal for a privacy-preserving bluetooth based contact
tracing scheme for hamagen. GitHub (2020)
42. Thiele, C.: Stop Corona-Theses and antitheses on the use of tracking apps in the corona crisis.
J. Inf. Law 7(2), 152–158 (2020)
43. Watts, D.: COVIDSafe, Australia's digital contact tracing app: the legal issues (2 May 2020)
44. Aldera, M.A., Alexander, C.M., McGregor, A.H.: Prevalence and incidence of low back pain
in the Kingdom of Saudi Arabia: a systematic review. J. Epidemiol. Glob. Health 10(4),
269–275 (2020)
45. Aromataris, E., Pearson, A.: The systematic review: an overview. Am. J. Nurs. 114(3), 53–58
(2014)
46. Lochmiller, C.R., Lester, J.N.: An Introduction to Educational Research: Connecting Methods
to Practice. Sage (2017)
47. Wang, W., Huang, X., Li, J., Zhang, P., Wang, X.: Detecting COVID-19 patients in X-Ray
images based on MAI-Nets. Int. J. Comput. Intell. Syst. 14(1), 1607–1616 (2021)
48. Setianingrum, V.M., et al.: Design development of infographics content for Covid- 19 pre-
vention socialization. In: Advances in Social Science, Education and Humanities Research,
vol. 491, no. Ijcah, pp. 1411–1416 (2020)
49. Kim, H.M.: Smart cities beyond COVID-19. Smart Cities Technol. Soc. Innov. 299–308
(2021)
50. Malasinghe, L., Ramzan, N., Dahal, K.: Remote patient monitoring: a comprehensive study.
J. Ambient Intell. Humaniz. Comput. 10(1), 57–76 (2017). https://doi.org/10.1007/s12652-
017-0598-x
51. Paksi, H.P., Wicaksono, V.D., Sucahyo, W.W.I.: Development of physical distancing detector
(PDD) integrated smartphone to help reduce the spread of Covid-19. In: Proceedings of the
International Joint Conference on Science and Engineering (IJCSE 2020), vol. 196, no. Ijcse,
pp. 361–364 (2020)
52. Iqbal, S.M.A., Mahgoub, I., Du, E., Leavitt, M.A., Asghar, W.: Advances in healthcare
wearable devices. npj Flex Electron. 5(1), 1–14 (2021)
53. Krishnamurthi, R., Gopinathan, D., Kumar, A.: Wearable devices and COVID-19: state of
the art, framework, and challenges. In: Al-Turjman, F., Devi, A., Nayyar, A. (eds.) Emerging
Technologies for Battling Covid-19. SSDC, vol. 324, pp. 157–180. Springer, Cham (2021).
https://doi.org/10.1007/978-3-030-60039-6_8
54. Sagitov, A., Tsoy, T., Li, H., Magid, E.: Automated open wound suturing: detection and
planning algorithm. J. Robot. Netw. Artif. Life 5(2), 144–148 (2018)

55. Kaiser, M.S., Al Mamun, S., Mahmud, M., Tania, M.H.: Healthcare robots to combat COVID-
19. In: Santosh, K.C., Joshi, A. (eds.) COVID-19: Prediction, Decision-Making, and its
Impacts. LNDECT, vol. 60, pp. 83–97. Springer, Singapore (2021). https://doi.org/10.1007/
978-981-15-9682-7_10
56. Otoom, M., Otoum, N., Alzubaidi, M.A., Etoom, Y., Banihani, R.: An IoT-based framework
for early identification and monitoring of COVID-19 cases. Biomed. Signal Process. Control
62, 102149 (2020)
57. Odoom, J., Soglo, R.S., Danso, S.A., Xiaofang, H.: A privacy-preserving Covid-19 updatable
test result and vaccination provenance based on blockchain and smart contract. In: 2019 Inter-
national Conference on Mechatronics, Remote Sensing, Information Systems and Industrial
Information Technologies (ICMRSISIIT), pp. 1–6 (2019)
58. Ghatak, B., et al.: Design of a self-powered smart mask for COVID-19 (2021)
59. Majeed, A.: Towards privacy paradigm shift due to the pandemic: a brief perspective.
Inventions 6(2), 24 (2021)
Development of a Real Time Wi-Fi Based
Autonomous Corona Warrior Robot

P. Shubha(B) and M. Meenakshi

Dr. Ambedkar Institute of Technology, Bengaluru, Karnataka, India


shubhap.ei@drait.edu.in

Abstract. This paper describes the development of a real-time autonomous robot that helps patients and hospital staff control the spread of the corona pandemic. As social distancing is the main criterion to be followed, the robot provides a visual as well as aural supervision and inspection feature through an Android application called IP Webcam. The work highlights three main objectives: delivering essentials such as food, medicine, and medical accessories to patients without direct contact between individuals; an obstruction detector that gives a buzzer warning when an obstacle is detected on the pathway; and a sanitizing option capable of sanitizing the hospital wards or spraying sanitizer onto the hand of an individual placed near the sanitizer nozzle. At the front end, a robotic arm is used for contagious waste disposal and can also serve as an obstruction remover. The back end of the robot includes a sweeper for sweeping the floor.
The overall movements and operations of the robot are controlled manually by a supervisor from the control station. Instructions are given from a TCP/UDP Android application by sending characters serially through a Wi-Fi module. An Arduino microcontroller acts as the brain of the robot and controls its overall operation.

Keywords: Autonomous robot · Robotic arm · TCP/UDP · Microcontroller

1 Introduction

As most people in the world are now acutely aware, an outbreak of COVID-19 was detected in mainland China in December 2019. Since then, every continent in the world has been affected by this highly contagious disease, with nearly a million cases diagnosed in over 200 countries worldwide. The cause of this outbreak is a new virus, known as Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). On February 11, 2020, the WHO officially named the disease caused by the novel coronavirus Coronavirus Disease 2019 (COVID-19). Coronaviruses can cause illnesses ranging from mild to moderate upper-respiratory tract infections, such as the common cold, to severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS).

Coronavirus, now globally carrying the status of a pandemic, has led to a worldwide crisis. Putting the human at the center of the situation, the virus generates deep fear and confusion and affects us emotionally in a way this generation has never felt; on top of this, physical confinement aggravates the situation. The increase in the death rate everywhere has placed a heavy burden on hospitals and medical staff. According to reports from the WHO (World Health Organization), social distancing, the use of sanitizers to stay hygienic, and wearing masks are some of the preventive measures that have to be taken to get rid of this pandemic.
For many years, humans have tried to assign their work to machines. These machines, referred to as robots, are faster and more effective than people. The term robotics is practically defined as the study, design, and use of robot systems for manufacturing. Robots are generally used to perform unsafe, hazardous, highly repetitive, and unpleasant tasks. Many elements of robots are built with inspiration from nature; the construction of the manipulator, the arm of the robot, is based on the human arm. The robot has the ability to manipulate objects, such as in pick-and-place operations, and is also able to function by itself. Robot system technology in the electronics industry has expanded steadily, and service robots with machine vision capability have been developed recently.
Keeping this in mind, in order to help the medical staff handle patients, a robot named the "Corona Warrior robot" is implemented.

2 Problem Statement

While the whole country faces the COVID-19 crisis, those on the frontline are working excruciatingly hard for the welfare of the patients and to minimize the damage to society caused by the pandemic. The lack of a definite cure for the infection makes their work extra hard, and there is the added pressure of calming the nerves of the patients and their families. For this, the doctors not only have to attend to patients in person but also have to stay connected with them over mobile phones and through video conferencing.
Since social distancing and quarantine measures are the main strategies for dealing with COVID-19, it is crucial for health workers to treat infected patients without being in contact with them. To provide an alternative, the "Corona Warrior robot" is built, which is useful in hospitals for inspecting patients from a distance.
The proposed work consists of a robotic car useful for delivering necessary items to desired places without any physical contact. An arm is attached at the front end to dispose of trash or clear obstructions (pick-and-place operation). As the main preventive measure is to sanitize the surroundings, this project includes a sanitizer spray that sanitizes a person's hand when it is placed near the nozzle. The sanitizing can also be manually controlled for floor cleaning or sanitization.

2.1 Literature Review

A robot is defined as a "re-programmable, multi-functional manipulator designed to move parts, materials, tools or specialized devices through various programmed motions for the performance of a variety of tasks". Robots can be categorized as autonomous or semi-autonomous.
An autonomous robot is not controlled by a human and acts on its own decisions by sensing the environment. The majority of industrial robots are autonomous, as they are required to operate at high speed and with great accuracy. Semi-autonomous robots are human-controlled; some of the most common control schemes are voice recognition, tactile or touch control, motion control, and hand gesture recognition.
A pick-and-place robot was designed in [1] using SolidWorks software and fabricated using 3D printing technology. The 3D-printed models were assembled with servomotors and connected to an Arduino and a Bluetooth module. The robot arm was controlled using a smartphone.
Other studies [2, 3] present a pick-and-place robotic arm with a soft-catching gripper designed to avoid extra pressure on suspect objects, such as bombs, for safety reasons. The movement was controlled by an Android application; at the receiving end of the robot, four DC motors are interfaced with the microcontroller for gripper movement and body movement. Reference [4] addresses an IP webcam application deployed with an IR sensor to provide better overall surveillance for real-time monitoring; the video record of movement stored in the playback devices helps detect unauthorized movements in the workspace. This work was mainly designed to provide advanced security-based surveillance systems [5, 6]. Reference [7] describes the evolving role of robots in healthcare, with special concern for managing and controlling the spread of the novel coronavirus disease. The prime use of such robots is to minimize person-to-person contact and to ensure cleaning, sterilization, and support in hospitals.
The development of all the above types of medical robots calls for an IP webcam application that enables aural and visual supervision and inspection of patients and wards in the hospital.

3 Proposed System

Figure 1 shows the block diagram illustrating how the components are interfaced with the Arduino to form the robot. Here, the Arduino is used as the robot control unit through which all the peripherals are controlled and monitored. The connections are made according to the diagram shown in Fig. 1.

The basic components of the robot include:

a. Manipulator
b. Power Supply
c. A Robot Control Unit (Control System)
d. Sensor Control Unit

a. Manipulator: an assembly of various axes capable of providing motion in various directions. Here, the robotic arm is used for the pick-and-place operation.
b. Power Supply: the source of energy used to drive the robot's mechanisms. A rechargeable 12 V lead-acid battery powers the robot, and a 7805 voltage regulator regulates the supply down to 5 V.
c. Robot Control System (RCS): acts as the brain of the robot. It coordinates and sequences the motion of the various axes of the robot and provides communication with external devices and machines. Programs are written in a high-level language developed specifically for industrial robot applications, and the operator interacts with the RCS through a standard video terminal used to create and edit programs, enter robotic commands, execute programs, and generate data points during the robot training phase.
Here, instructions are fed through the Arduino IDE software; the TCP/UDP Test Tool mobile application is used for robot control, and the IP Webcam application for video and audio streaming.
d. Sensor Control Unit (SCU): provides visual information about the scene to be analyzed. It takes the information from each camera, analyzes the image, and transposes that information into the robot's coordinate system. An image of an object comes through the camera lens and falls on an image plane inside the camera. The most useful camera for machine vision is the solid-state matrix array camera, originally designed for military/space use. The camera consists of a grid of light-sensitive elements called pixels (picture elements). The information from this grid is acquired by rapidly scanning the entire field, line by line; the varying light intensities associated with each pixel are translated into varying voltage levels and transmitted to the interface electronics. Adaptive control of a robot eliminates the need for accurate fixturing of workpieces, precise fabrication tolerances of equipment, and accurate teaching of the coordinate data. The 3D coordinate information is analyzed by the SCU and sent to the RCS so that action can be taken by the control unit and delivered to the manipulator.

Here, a smartphone with a dual camera is used for capturing and transmitting visuals, and an IR proximity sensor and an ultrasonic sensor are used for obstacle sensing (see the ranging sketch below).
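For reference, ultrasonic obstacle sensing works by emitting a short ping and timing the returning echo. The sketch below shows the standard pattern, assuming an HC-SR04-style module; the exact sensor part, pin numbers, and the 20 cm threshold are our assumptions, as the paper does not specify them.

```cpp
// Ultrasonic ranging (HC-SR04-style module assumed; pins are illustrative).
const int TRIG_PIN = 9;
const int ECHO_PIN = 10;

void setup() {
  Serial.begin(9600);
  pinMode(TRIG_PIN, OUTPUT);
  pinMode(ECHO_PIN, INPUT);
}

void loop() {
  // Send a 10 microsecond trigger pulse, then time the returning echo.
  digitalWrite(TRIG_PIN, LOW);
  delayMicroseconds(2);
  digitalWrite(TRIG_PIN, HIGH);
  delayMicroseconds(10);
  digitalWrite(TRIG_PIN, LOW);

  unsigned long durationUs = pulseIn(ECHO_PIN, HIGH, 30000UL); // 30 ms timeout
  float distanceCm = durationUs * 0.034f / 2.0f;   // speed of sound ~340 m/s

  if (durationUs > 0 && distanceCm < 20.0f) {      // 20 cm threshold (our choice)
    Serial.println("Obstacle detected");
  }
  delay(100);
}
```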

Fig. 1. Block diagram of the proposed robot

As shown in Fig. 2, the TCP/UDP test tool is an Android application used to control robot operations. The application connects to the server/client over the TCP/UDP protocol, and the movement of the robot is observed when characters are sent serially through the Wi-Fi module.

Fig. 2. TCP/IP test tool operation

3.1 Software Implementation

Successful implementation of the Wi-Fi robot mainly depends on its movement and on its performance in the pick-and-place operation, i.e., on the two test points given below.

Test Point 1: Robot Movement

The robot can move forward, backward, left, and right, and rotate through 360 degrees with the help of DC motors [8]. The movements of the robot depend on the rotation of the DC motors in the clockwise and anticlockwise directions. The principle used to control the movements is: when the input is 01, the motor turns clockwise; when the input is 10, the motor turns anticlockwise; and the motor stays idle for inputs 00 and 11. Once an obstacle is detected along the path, the robot comes to a halt and sounds a buzzer warning; otherwise, it proceeds according to the commands received from the user. In this task, the motor driver drives the motors in the desired direction according to the commands received, the algorithm for which is given below.
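The original algorithm listing is not reproduced in this text, so the Arduino-style sketch below is a hedged reconstruction of the logic just described. The pin assignments, the obstacleDetected() placeholder, and the single-character commands 'F', 'B', 'L', 'R' arriving from the Wi-Fi module are our own assumptions, not values from the paper.

```cpp
// Reconstruction of Test Point 1 (pins and command characters are illustrative).
// Each motor takes a 2-bit input: 01 = clockwise, 10 = anticlockwise,
// 00/11 = idle, exactly as described in the text.
const int L_IN1 = 2, L_IN2 = 3;   // left motor driver inputs
const int R_IN1 = 4, R_IN2 = 5;   // right motor driver inputs
const int BUZZER = 8;

bool obstacleDetected() {
  // Placeholder: wire in the IR/ultrasonic reading shown earlier.
  return false;
}

void driveMotor(int in1, int in2, int bits) {
  digitalWrite(in1, (bits & 0b10) ? HIGH : LOW);
  digitalWrite(in2, (bits & 0b01) ? HIGH : LOW);
}

void setup() {
  Serial.begin(9600);               // serial link to the Wi-Fi module
  const int pins[] = {L_IN1, L_IN2, R_IN1, R_IN2, BUZZER};
  for (int p : pins) pinMode(p, OUTPUT);
}

void loop() {
  if (obstacleDetected()) {         // halt and warn, as the text describes
    driveMotor(L_IN1, L_IN2, 0b00);
    driveMotor(R_IN1, R_IN2, 0b00);
    digitalWrite(BUZZER, HIGH);
    return;
  }
  digitalWrite(BUZZER, LOW);

  if (Serial.available()) {
    switch (Serial.read()) {        // one command character per action
      case 'F': driveMotor(L_IN1, L_IN2, 0b01); driveMotor(R_IN1, R_IN2, 0b01); break;
      case 'B': driveMotor(L_IN1, L_IN2, 0b10); driveMotor(R_IN1, R_IN2, 0b10); break;
      case 'L': driveMotor(L_IN1, L_IN2, 0b10); driveMotor(R_IN1, R_IN2, 0b01); break;
      case 'R': driveMotor(L_IN1, L_IN2, 0b01); driveMotor(R_IN1, R_IN2, 0b10); break;
      default:  driveMotor(L_IN1, L_IN2, 0b00); driveMotor(R_IN1, R_IN2, 0b00); break;
    }
  }
}
```

On the actual robot, this loop would run on the Arduino Mega, with the Wi-Fi module bridging the TCP/UDP application to the serial port.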

Test Point2: Pick and Place Operation


Here, the initial settings are the same as in test point 1. The movement of the end effector
is controlled for the pick and place operation [9, 10]. This operation is controlled
manually through the TCP/UDP test tool by sending characters through the Wi-Fi module. The pick
and place function is controlled by two DC motors: one for the UP/DOWN movement of
the robotic arm (DC1), and the other DC motor (DC2) for the movement of the end effector used
for hold/release operations. When the DC1 input is 10, the arm moves upward;
when the input is 01, the arm moves downward. For the gripper, when the DC2 input is 01,
the gripper closes, and when it is 10, the gripper opens.
The following algorithm implements the above-mentioned logic.
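As with the drive logic, a minimal Arduino-style sketch of the arm and gripper control is given below; the command characters ('U', 'D', 'C', 'O') and pin numbers are illustrative assumptions, not values from the paper.

```cpp
// Minimal sketch of the pick-and-place logic (illustrative; pins and commands assumed).
const int DC1_A = 2, DC1_B = 3;   // arm motor driver inputs (DC1: up/down)
const int DC2_A = 11, DC2_B = 12; // gripper motor driver inputs (DC2: hold/release)

void setup() {
  Serial.begin(9600); // command characters arrive via the Wi-Fi module
  pinMode(DC1_A, OUTPUT); pinMode(DC1_B, OUTPUT);
  pinMode(DC2_A, OUTPUT); pinMode(DC2_B, OUTPUT);
}

void loop() {
  if (!Serial.available()) return;
  switch (Serial.read()) {
    case 'U': // DC1 input 10: arm moves upward
      digitalWrite(DC1_A, HIGH); digitalWrite(DC1_B, LOW);  break;
    case 'D': // DC1 input 01: arm moves downward
      digitalWrite(DC1_A, LOW);  digitalWrite(DC1_B, HIGH); break;
    case 'C': // DC2 input 01: gripper closes (hold)
      digitalWrite(DC2_A, LOW);  digitalWrite(DC2_B, HIGH); break;
    case 'O': // DC2 input 10: gripper opens (release)
      digitalWrite(DC2_A, HIGH); digitalWrite(DC2_B, LOW);  break;
    default:  // inputs 00/11: both motors idle
      digitalWrite(DC1_A, LOW); digitalWrite(DC1_B, LOW);
      digitalWrite(DC2_A, LOW); digitalWrite(DC2_B, LOW);
  }
}
```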

3.2 Hardware Implementation

The actual hardware implementation of the Wi-Fi based robot is illustrated in Fig. 3a.
The Wi-Fi module is fixed at the right side of the Arduino (Fig. 3b). The robot is
programmed in such a way that when the smartphone is connected to Wi-Fi, the operator
manually controls its functionalities and movements through the TCP/UDP application
by sending commands to the Arduino.


Fig. 3. (a) Hardware implementation (b) Wi-Fi module in a robot



4 Results and Discussions


TCP/IP configuration is enabled after powering on the system. Once the pairing between the
two devices is done, the robot waits for commands from the user. When the user presses
a button in the TCP/IP tool, the corresponding ASCII code is sent to the controller serially,
and the controller checks it against the preprogrammed value. A pictorial representation
of the prototype Wi-Fi based robot, developed using an Arduino Mega microcontroller
and other accessories, is given in Fig. 4. The movement required of the robot is
demonstrated by the pick and place operations of the arm (Figs. 5a and 5b), and the open and
close functions of the gripper are shown in Figs. 6a and 6b respectively.

Fig. 4. A prototype model of a Wi-Fi based autonomous robot

Fig. 5. (a) Pick operation (b) Place operation



Fig. 6. (a) Downward and upward arm movement (b) Gripper close and open operation

4.1 False Rejection Rate (FRR)

FRR is defined as the condition where the robot is unsuccessful in performing its tasks. This
condition prevails when the robot fails to follow the commands. FRR is given by:
FRR = Number of false rejections / Number of targets tested (1)
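For example, using the first row of Table 1 below (movement without an obstacle, 50 trials, 2 failures):

\[ \mathrm{FRR} = \frac{2}{50} = 0.04 = 4\% \]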

Table 1. Analysis of FRR

Robot movement Number of times tested Not reached the target FRR (%)
Without obstacle 50 02 4
With obstacle 50 05 10
Turned right 50 04 8
Turned left 50 04 8

The FRR gathered from the system is shown in Table 1. It can be concluded that a high
FRR is produced when the robot encounters an obstacle, since it has to avoid the obstacle on its
path to reach the destination, whereas a low FRR is produced when the system does not
encounter any obstacle.
The efficiency of the robot is demonstrated in real time by considering the
supply of medicines to a patient lying at a fixed location. The robot's movements are
achieved based on the commands given. Experimental results demonstrated its successful
implementation and hence its extension to real-time usage.

5 Case Study: Operation of the Robot Extended to Control the Spread of the Corona Pandemic

The application of the Wi-Fi based robot designed above is extended to help patients
and hospital staff control the spread of the COVID-19 virus, which recently spread
all over the world. As social distancing is the main criterion that has to be
followed, the robot provides a visual/aural supervision and inspection feature through
an Android application called IP Webcam. The four-wheeled robot is able to deliver
essentials like food, medicine, and medical accessories to patients without direct
contact with individuals. The obstruction detector gives a buzzer warning when an obstacle
is detected on the pathway. The robot also includes a sanitizing option capable of
sanitizing the wards or spraying sanitizer onto the hand of an individual placed
near the sanitizer nozzle. The robotic arm is used for picking and placing and for disposing of
contagious waste, and it can also be used as an obstruction remover. The back end of the
robot includes a roller for sweeping the floor.
The additional functionalities added are:

• Sanitizer spray application
• Hand sanitizing application controlled manually and automatically
• Aural/visual inspection and supervision

5.1 Sanitizer Spray Application


During the COVID-19 pandemic, the utmost priority is to keep the surroundings clean and
hygienic, so a manually controlled sanitizer spraying option is incorporated. The sanitizer
spray option is controlled manually by passing instructions to the Arduino, sent
serially through the Wi-Fi module. The AT command '$M' is sent to control the sanitizer
spray operation. As the Arduino receives the signal, it triggers the relay to pump the
sanitizer from the tank using a mini submersible pump. Sanitizer sprayed on the floor
acts as a disinfectant, and a roller attached at the back end of the robot is used to
sweep the floor.
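A minimal sketch of this command handling is shown below; the '$M' trigger follows the text above, while the relay pin and the 3 s spray duration are assumptions.

```cpp
// Illustrative sketch: trigger the sanitizer-spray relay on the '$M' command.
const int RELAY_PIN = 13; // assumed relay pin driving the submersible pump

void setup() {
  Serial.begin(9600);     // commands arrive serially via the Wi-Fi module
  pinMode(RELAY_PIN, OUTPUT);
  digitalWrite(RELAY_PIN, LOW);
}

void loop() {
  // Match '$' followed by 'M' so stray characters are ignored.
  if (Serial.available() && Serial.read() == '$') {
    while (!Serial.available()) {}   // wait for the second character
    if (Serial.read() == 'M') {
      digitalWrite(RELAY_PIN, HIGH); // relay on: pump sprays sanitizer
      delay(3000);                   // assumed 3-second spray burst
      digitalWrite(RELAY_PIN, LOW);  // relay off
    }
  }
}
```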

• Results and Analysis


The Arduino receives the command for the sanitizer spray operation; the relay then activates
the submersible pump, and the robot sprays the floor as it moves. Figure 7(a) demonstrates
the manually controlled sanitizer spray operation. The roller attached at the back end of the
robot, used to sweep the floor, is shown in Fig. 7(b).

Fig. 7. (a) Operation of manually controlled sanitizer spray (b) Roller is at the back end of the
robot

5.2 Hand Sanitizing Application Controlled Automatically by an IR Sensor


The hand sanitizing option operates using an IR proximity sensor that detects the hand
and sprays the sanitizer. When a hand is placed near the sanitizer tank, the IR proximity
sensor senses it and sends an analog input to the Arduino, which triggers the relay to pump
the sanitizer from the tank onto the hand.
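A minimal sketch of this sensor-triggered dispensing is shown below; the analog pin, detection threshold, and dispense duration are illustrative assumptions.

```cpp
// Illustrative sketch: automatic hand sanitizing via an IR proximity sensor.
const int IR_PIN = A0;      // IR proximity sensor analog output
const int PUMP_RELAY = 13;  // relay driving the mini submersible pump
const int THRESHOLD = 500;  // assumed ADC level indicating a nearby hand

void setup() {
  pinMode(PUMP_RELAY, OUTPUT);
  digitalWrite(PUMP_RELAY, LOW);
}

void loop() {
  if (analogRead(IR_PIN) > THRESHOLD) { // hand detected near the nozzle
    digitalWrite(PUMP_RELAY, HIGH);     // pump sanitizer onto the hand
    delay(1500);                        // assumed dispense duration
    digitalWrite(PUMP_RELAY, LOW);
    delay(1000);                        // settle time before the next detection
  }
}
```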

• Result and analysis


The demonstration of the automatic hand sanitizer operation is shown in Fig. 8.
When a hand is placed near the sanitizer nozzle, the IR proximity sensor senses it
and the relay activates the submersible pump to spray onto the hand.

Fig. 8. Demonstration of automatic hand sanitizer operation

5.3 Aural/Visual Inspection and Supervision


The block diagram of the audio/visual inspection and streaming operation is given in Fig. 9.
The surveillance system provides a live video streaming option and transmits audio.

The aural and visual inspection feature uses the IP Webcam application (IPWA), which can be
installed on a smartphone. The IPWA can serve live or recorded video to its clients.
Instead of a traditional USB camera for transmitting video, a commonly available Android phone
is preferred for both transmission and reception of the video. This smartphone is attached to the
chassis and streams the video to devices connected to the same network. After opening the
application, audio mode is enabled to transmit and receive audio. The webcam IP address
192.168.1.2:8080 is entered in the browser of a smartphone or a laptop to view the video
stream. Both the front and the back camera of the mobile can be used and swapped to
view both sides of the robot. The video can be sent to the guardian or the doctor.

Fig. 9. Block diagram of audio/visual inspection and streaming operation

• Result and analysis


The patients are monitored by performing audio/visual inspection and streaming operations.
A mobile phone is used for visual inspection at home or at the hospital (ward).
The smartphone is attached to the robot as shown in Fig. 10. Figure 11 shows a snapshot
of the visual inspection at home scenario.


Fig. 10. A mobile is placed and interfaced for visual inspection



Fig. 11. A snap shot of visual inspection at home scenario

6 Conclusion
The Corona Warrior robot is mainly built with the motive of helping corona warriors like
health workers and doctors in inspecting and supervising COVID-19 victims without being
in contact with them. The robot provides clear two-way audio-visual communication, which helps the
health workers interact with the patients more effectively. The pick and place robotic arm
is very useful in transferring any object from one place to another; it can cover an angle of
approximately 125 degrees and has a stroke length of 230 mm.
The robotic arm has up/down movement as well as a gripper or end effector to hold and release
objects accordingly. During this pandemic, it is of utmost importance to keep ourselves and
the environment hygienic and infection free. The robot includes a sanitizer spray option
to sanitize the floor as well as the hands of people. The sweeper attached at the back of
the robot helps to clean the floor. The UDP/TCP tool is used for controlling
the operations of the robot by sending characters serially through the Wi-Fi medium. The IP
Webcam Android application enables both audio and visual communication. Features
like zoom, autofocus, and the LED torch facility in the mobile make the video streaming even
more effective. The entire surveillance can be recorded, and photos can be captured using
this application. The future scope of the work is a prototype that can be fully automated
using image processing, where it is possible for the robot to deliver any
necessary equipment to a particular person automatically.

References
1. ArkaSain, J., Mehta, D.M.: Design and implementation of wireless control of pick and place
robot arm. Int. J. Adv. Res. Eng. Technol. (IJARET) 9(3), (2018). p-ISSN 0976-6480, e-ISSN
0976-6499
2. Sawarkar, M.R., Raut, T.R., Nutan, P.N., Meshram, S.C., Pournima, P.T.: Pick and place
robotic arm using android device. Int. Res. J. Eng. Technol. (IRJET) 04(03) (2017). e-ISSN
2395-0056, p-ISSN 2395-0072
3. Muhammed, J.N., Neetha, J., Muhammed, F., Midhun, M., Mithun, S., Safwan, C.N.: Wireless
control of pick and place robotic arm using an android application. Int. J. Adv. Res. Electr.
Electron. Instrum. Eng. 4(4) (2015). p-ISSN 2320-3765, e-ISSN 2278-8875
4. Bokade, A.U., Ratnaparkhe, V.R.: Video surveillance robot control using smartphone and
Raspberry Pi. In: 2016 International Conference on Communication and Signal Processing
(ICCSP), Melmaruvathur, pp. 2094–2097 (2016)
5. Ikhankar, R., Kuthe, V., Ulabhaje, S., Balpande, S., Dhadwe, M.: Pibot: the raspberry Pi
controlled multi-environment robot for surveillance & live streaming. In: 2015 International
Conference on Industrial Instrumentation and Control (ICIC), Pune, pp. 1402–1405 (2015)
6. Chen, W.-T., Chen, P.-Y., Lee, W.-S., Huang, C.-F.: Design and implementation of a real time
video surveillance system with wireless sensor networks. IEEE (2014)
7. Mandrupkar, T., Kumari, M., Mane, R.: Smart video security surveillance with mobile remote
control. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 3, 5 (2013)
8. Lin, Y.-C., Wei, S.-T., Yang, S.-A., Fu, L.-C.: Planning on searching occluded target object
with a mobile robot manipulator. In: Proceedings of the IEEE International Conference on
Robotics and Automation (ICRA), May 2015, pp. 3110–3115 (2015)
9. Nagatani, K., et al.: Redesign of rescue mobile robot Quince. In: Proceedings of the IEEE
International Symposium Safety, Security, and Rescue Robotics (SSRR), November 2011,
pp. 13–18 (2011)
10. Arif, M., Samani, H., Yang, C.-Y., Chen, Y.-Y.: Adaptation of mobile robot to intelligent
vehicles. In: Proceedings of 2013 IEEE 4th International Conference on Software Engineering
and Service Science, 23–25 May 2013, pp. 550–553 (2013)
The Impact of Job Satisfaction Level
on Employee Work Performance
in the Pharmaceutical Industry: An Empirical
Study

Geeta Kumari1(B) , Jyoti Kumari1 , and K. M. Pandey2


1 Eternal University, Baru Sahib, Himachal Pradesh, India
geekumari@gmail.com
2 National Institute of Technology Silchar, Silchar, Assam, India

Abstract. The objective of this paper is to explore the impact of satisfaction level
on the efficiency of the employees working in the pharmaceutical industry. Perfor-
mance appraisal is related to the assessment of employees’ current performance,
work activity, and their capacity for future performance. The research study
is based upon the performance appraisal system in Meridian Medicare Limited.
The purpose of this study was to determine the degree of employee satisfaction and
its influence on employee performance in the pharmaceutical sector. The sample
size of the employees was 60 and the data were collected using a well-designed
questionnaire. A percentage, mean, standard deviation, and regression model were
used to evaluate the data. Many respondents were pleased with the performance
assessment method, according to the study’s findings. Multiple linear regressions
revealed that chosen explanatory factors explained 79.2% of the variation in the
performance assessment system. It shows that if these variables are taken into
consideration by the industry, it may give the best results. The practical implica-
tions of this research paper are beneficial for the management of pharmaceutical
industries for designing performance appraisal criteria and programs that increase
the satisfaction level of employees because a satisfied employee can give a better
and effective performance to the industry. The limitations of the research paper
were data privacy, the policy of the company, time limit, and unavailability of
persons concerned.

Keywords: Performance appraisal · Pharmaceutical industry · Employees · Satisfaction level · Motivation

1 Introduction
Performance appraisal is a dynamic element of a larger set of human resource func-
tions that are used to analyze the degree to which each employee’s daily performance is
connected to the organization’s goals [1]. A systematic, structured procedure for assess-
ing an employee’s effectiveness about their job obligations is known as a performance
evaluation [2]. Ultimately, the goal is to learn everything there is to know about an

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022


H. Sharma et al. (Eds.): ICIVC 2021, PALO 15, pp. 281–298, 2022.
https://doi.org/10.1007/978-3-030-97196-0_23

employee’s present job performance so that they may enhance it more efficiently in
the future [3]. As a result, it may benefit the personnel, the enterprise, and society.
Performance evaluation is an extensively debated concept. The relevance of perfor-
mance assessment systems derives in part from the present business climate, which is
characterized by the requirement to fulfill organization objectives while also remaining
competitive in highly competitive markets through outstanding work performance [4].
Numerous studies indicate that businesses have little influence over their workers’ con-
duct in this situation [5]. Companies, on the other hand, influence how workers carry out
their duties. Furthermore, a performance management study reveals that a substantial
proportion of employees want to do a good job as part of their ambitions and as a symbol
of commitment to the company [6].

1.1 The Objective of the Study


To determine the job satisfaction of the employees and its influence on performance.

1.2 Hypothesis Formulation


• Ho1: There is a significant relationship between salary and work performance.
• Ho2: There is a significant relationship between promotion and work performance.
• Ho3: There is a significant relationship between job security and work performance.
• Ho4: There is a significant relationship between working conditions and work performance.
• Ho5: There is a significant relationship between goal achieved and work performance.
• Ho6: There is a significant relationship between co-workers and work performance.
• Ho7: There is a significant relationship between the relationship with the supervisor and work performance.
• Ho8: There is a significant relationship between the nature of work and work performance.
• Ho9: There is a significant relationship between managing stress at the workplace and work performance.
• Ho10: There is a significant relationship between quality improvement and work performance.
• Ho11: There is a significant relationship between the performance appraisal system being regulated properly and work performance.

2 Literature Review
Virani [7] examined the related inputs from different representative Information Technology Enabled Services (ITES) organizations and performed a deductive/inductive
review of the qualitative and quantitative data needed to achieve the desired results. The
outcome result shows that using the questionnaire study, the current performance assess-
ment processes of the selected ITES companies were evaluated based on the parame-
ters considered for evaluation, standards communication process, input mechanism, and
performance appraisal method, consistency of performance assessment process, and
accountability of performance assessment process. Chavda and Yagnik [8] analyzed
that there is a major difference between selected dairy units in performance assess-
ment. As far as performance evaluation is concerned, Amul dairy had the best perfor-
mance assessment setup, followed by Baroda, Dudhsagar, SUMUL, and Mother dairy.
Amul and Baroda dairy had better performance management practices than SUMUL,
Dudhsagar, and Mother dairy. Jain and Garg [9] appreciated the performance
assessment system adopted by the Historical Resort Hotels (HRH) group of hotels for employee
awareness; this indicates the level of satisfaction and understanding of the staff performance
appraisal method. According to Mishra [10], the Hong Kong and Shanghai
Banking Corporation showed proper collaboration between training, improvement, and
HR management. The study recognized that there is a requirement for performance appraisal
and management in the company, but at the same time it focused on the issue of allocating
funds to invest in appraisal methods and training development programs. Rana and Lokhande [11] assessed
Maruti Suzuki India Limited's performance regarding exports, imports, production,
and the distribution network. The findings indicate that Maruti Suzuki has established a
quality standard for its research and innovation operations since the firm thinks that this
operation would enable it to provide better and environmentally sustainable goods to
its consumers with total contentment. The ecological impact of Maruti Suzuki is mini-
mal. Renganayaki [12] found that performance evaluation is inextricably related to the
determination of training and development programs, which are beneficial to both the
company and the personnel. The emphasis must therefore be on optimizing talent capa-
bilities through adequate TD reward strategies. Mohanty et al. [13] inspected in their
study the relationship between organizational assessment and employee productivity
among staff members in the manufacturing sector in India. Lakshmi and Mohideen [14]
examined the performance appraisal of automobile industries in Chennai.
The study indicated that the degree of satisfaction did not reach higher levels with
the current appraisal method being adopted by the industry. Yoganandan et al. [15] in
their study concluded that performance assessment is a vital practice for companies in
this ever-increasing competitive world searching for growth and benefit maximization.
Their results unveiled that the essential elements of an excellent performance assessment
system are the understanding of its foundations and the essential steps that lay the foun-
dation. Another research was performed by Agarwal [16] to determine the influence of
performance review on employee behavior and the relationship between performance
assessment and employee performance. The results revealed that using a performance
evaluation approach may help workers become more efficient, which leads to long-term
organizational growth. Choudhary and Puranik [17] concluded that the purpose of hav-
ing a performance evaluation program in a hospital is to monitor the performance of
employees, encourage employees, and enhance morale in hospitals. Cowandy [18] ana-
lyzed the workers' perceptions of their performance evaluation program. The results show that
the fair performance evaluation has a substantial effect on staff motivation to enhance
performance and performance satisfaction. Deepa et al. [19] observed the success or
failure of the organization and summarizes the performance appraisal framework, con-
ceptual framework, and its relation between job satisfaction, organizational engagement,
employee loyalty, and productivity. Gautam [20] analyzed that Simbhawli sugar limited
performance evaluation is a supporting pillar for comparing employees' actual performance and potential with existing benchmarks, to serve as motivating equipment for
improved employee performance. The demands of permanent and contract employ-
ees about the manner of program evaluation vary markedly, according to Malik and
Bakhtawar [21]. Consequently, their respective views have a powerful influence on
their overall results. Rahman [22] found that the management of Square Pharmaceuti-
cals Ltd. has been able to maintain stability and generate positive profit trends while run-
ning. Sippy and Varma [23] gave an extensive review about the performance appraisal
and concluded that the Performance assessment should not be viewed as a daily practice
but should be accepted and interacted with all the workers down the track. Workers with
outstanding results can be used as mentors for other workers who can inspire others to
achieve better output. Begum et al. [24] analyzed in their study the factors which ensure
the performance effectiveness Evaluation program in the Bangladesh Pharmaceutical
industry. The result shows that the most important factor is rater accuracy, and therefore
this factor explicitly and very strongly affects the efficiency of the performance evalu-
ation method. The effectiveness of the quality assessment technique is also influenced
by the PA process, interaction, attitudes of employees, and training. Chetana et al. [25]
recognized that the key factor affecting OCL Iron and Steel Ltd. is a successful perfor-
mance assessment program. The study revealed that ‘Scope and Performance Evaluation
Strategy’ followed by ‘Evaluation Co-ordination, Approaches for Performance Evalua-
tion,’ ‘Performance-Based Initiatives,’ ‘Attributes Evaluated in Performance Evaluation’
and ‘Perception of Employees for Performance Evaluation’ are the key determinants of
effective performance assessment in order. Khedekar [26] explored the “performance
appraisal method” in the education field. The assessment system, which is regularly
introduced in various educational institutions during the academic year, helps to under-
stand the performance of employees in their respective organizations. It is quite clear
from this study that the variables identified for assessment are somewhat similar across
different educational institutions. The annual assessment system includes several key
parameters that revolve around the development and sustainability of both organizations
and employees. Phin [27] investigated the factors that affect the implementation
of performance appraisal in Malaysia's private education sector, to determine how
a performance assessment program can be effectively implemented in this industry. The
results showed that the effectiveness of performance evaluation was strongly and posi-
tively linked to system architecture, management system procedures, and system support.
Rekha [28] concluded that the main need is to acquire practical knowledge on the subject
and to know the mechanism and intent of the evaluation system at Yashoda Hospital. The
outcomes of the study showed that an effective and efficient assessment system is essen-
tial for a company to achieve financial success and organizational goals. Xavier [29]
found that in the current business environment, a company must ensure the full performance
of its workers to succeed in the marketplace on an ongoing basis. Traditionally,
this objective was achieved through an employee performance assessment that
was more concerned with informing employees of their lack of performance. Employee
development is directly related to the results: when efficiency is improved, the
organization's output should improve. Gupta and Swaroop [30] investigated the
impact of the performance evaluation procedure on job performance in administrative
staff at Hawassa University. The results of the descriptive and inferential review showed
that there are large gaps in the application of all components of the performance evalu-
ation process, such as performance standards, communication of established standards,
measurement of actual performance and comparison with expectations, discussion of
the evaluation with the employee and input and corrective action, is well associated
with the success of employees. Jahan [31] studied the standard of competence agreed on
in the assessment practices used among employees at Square Pharmaceuticals Lim-
ited. Most employees were satisfied with the company's current performance appraisal
actions, according to the findings, even though they wanted a more detailed and system-
atic method of calculating performance. Wararkar and Wararkar [32] found that the
cotton industry’s requirement has risen considerably in today’s world due to inherent
advantages such as decreased setup costs, little space consumption, simple expansion,
aesthetic benefits, and expanded product options. So, employee performance evaluation
is important to recognize the abilities, competencies, and relative quality, and impor-
tance of each employee for the company. A well-designed performance management
program facilitates an integrated human resource plan that allows the achievement of
organizational and company objectives. Maheswari [33] suggested that performance
evaluation is considered to be a common procedure, but that its significance should be
recognized and conveyed to all employees. Performance assessment can also be used
to recognize high-performing employees. Job satisfaction of the employees with the
performance assessment technique was investigated by Bhatia and Patel [34] in this
study. The results indicate that both employee satisfaction and performance evaluation
are interrelated. Performance evaluation should satisfy the employees. If the employees
are not pleased with the performance appraisal, then it should be clarified to them that
why their performance is unacceptable. Dharmadhikari and Bampoori [35] investi-
gated in their study the structures and processes of employee performance evaluation
dependent on hospitals. the result showed that giving time and energy is essential to give
performance initiatives and managing processes and to provide training and resources
for the workspace. These help to create loyalty, both to the goals and to the company,
which encourages people to do more than they are expected to do, which translates into
better results. Reddy et al. (2018) observed that independent variables (age, gender, edu-
cation, designation, and department) have a greater impact on performance assessment
and methods. The organization needs to evaluate and solve these problems and boost
employee satisfaction rate context and assessment methods for better future performance.
In the Indian steel sector, Sharma and Rao [36] studied the impact of performance
assessment on staff morale and efficiency. The performance agreement has played a cru-
cial role in improving and assessing productivity, directing public sector development,
and linking productivity to monetary incentive schemes. According to the findings, there
is a positive and substantial relationship between employee performance assessment and
efficiency at Indian Steel. Kumari et al. [37] disclosed in their study that optimistic
and clear attitudes about one's employment lead to job contentment, while
harmful and negative attitudes point to job discontent. Instrumentation engineers are con-
fronted with many issues in their work today. Several studies have disclosed that job
satisfaction among application professionals is one of the most necessary
characteristics related to success. Kumari and Pandey [38] deduced in their examination
that if an organization is thought to have a high turnover rate in comparison with its
rivals, it implies that its workers stay for a shorter period than employees at
other organizations in the same industry. If skilled workers leave consistently, it can
significantly affect an organization's productivity, and there is a high level of job
mobility among the working population. Kumari et al. [39] clarified that the
results of their examination found that the greater part of respondents were happy with the
performance appraisal system. The survey found six components which affect the employee
performance appraisal system in Meridian Medicare Limited, Himachal Pradesh, India.
The results of the study showed that all the components, working environment, pay and
management, work productivity, training and work performance, achievements and
upgrades, performance through motivation, and job satisfaction, had a strong positive
relationship with the performance appraisal system. The multiple linear regression also
found that 79.2% of the variation in the performance appraisal system was explained by
the chosen explanatory factors. It shows that if these factors are taken into consideration by the
organization, it may give the best results. Kumari et al. [40] found that most workers
have accepted that workers have been provided with appropriate procedures and instruc-
tions before completing the task. Therefore, it can be concluded that management takes
it seriously that workers understand the exact course of action before carrying out the
task so that it is safer for workers to carry out the operations. At the same time, some
employees denied that they had proper procedures and instructions. This may be due to
a lack of employee awareness. Most employees agreed that companies regularly follow
the procedures for documenting the investigation of the incident, and employees appear
to be contented with this provision. Thus, it can be concluded that the administration
appropriately reviews each incident that occurs during the execution of the task and
follows the correct documentation system to determine the real cause for the incident.
It is noted that most employees have accepted that companies have followed the proper
procedures for inspecting and assessing equipment hazards and that workers are satisfied
with them. It can therefore be concluded that the organization has recognized the need
to review and investigate the risks that exist or may exist in the facilities and affect
workers' health. Bhanawat et al. [41] explained in their study that performance-based
analysis of a worker, understanding the abilities of that individual, and offering
opportunities to develop is the suitable procedure to advance an employee's career.
The study focused deeply on operational work in the human resource area and
assessed the performance appraisal system of the organization. Further, the authors con-
cluded that the assessment of employees' abilities gives good outcomes and is acceptable to workers, and
from the results of the regression model it is inferred that almost 91.2% of the variation in
the dependent variable is explained. The study proposed that women's participation ought to
be expanded in all areas, which brings about the general empowerment of females. The
awareness of employees ought to be enhanced while carrying out the performance
appraisal task.

3 Research Methodology
This is descriptive research aimed at determining how performance assessment methods
affect employee job satisfaction. The present paper concerns the town of Solan in
Himachal Pradesh's Solan district. Many pharmaceutical units have been established in
this town for a long time, and among these, Meridian Medicare Limited is one of the
well-known industries. The present study was confined to Meridian Medicare Limited
for achieving the stated objectives. The total sample size for the study was 60 employees
of Meridian Medicare Limited, Solan, Himachal Pradesh: 30 employees from the production
department, 10 from the marketing department, 10 from the finance department, and
10 from the finished goods department. Of these 60 employees, 37 were male and 23 were
female. The questionnaires, in the form of psychological tests, were administered individually
to employees of the pharmaceutical industry. Age, gender, origin, and organizational and
educational status were controlled as subject-relevant variables. To achieve the study's
stated goals, both primary and secondary data were analysed. The primary data were
gathered using the survey method with a well-designed questionnaire. The secondary data
were gathered from research papers, articles, books, magazines, etc. A convenience sampling
method was followed; the respondents were selected based on accessibility and willingness
to participate in the survey. To attain the aim of the research work, i.e., to assess job
satisfaction and its impact on employees' performance in the pharmaceutical industry of
Himachal Pradesh, a questionnaire measuring the level of job satisfaction and its dimensions
along with the job performance level of the employees was adopted. The questionnaire on the
performance assessment system was organized on a five-point Likert scale from strongly
disagree to strongly agree, with 1 = strongly disagree, 2 = disagree, 3 = neither agree nor
disagree, 4 = agree, and 5 = strongly agree. SPSS version 20 was used for data input and
interpretation. Descriptive and inferential statistics were used to describe the sample's
demographic characteristics. The findings were analyzed using a variety of techniques,
including percentage, mean, standard deviation, regression model, ANOVA, and other
appropriate statistics.

4 Results and Discussion

4.1 Reliability Analysis of the Variables

Overall, the Cronbach's alpha values of all variables are satisfactory, exceeding the
suggested values of 0.50 by Nunnally [43] and 0.60 by Moss et al. [44]; this shows that
the 18 items were reliable and valid for measuring the opinions of employees.
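For reference, Cronbach's alpha for a scale of k items is the standard statistic (the formula is not restated in the source), where \(\sigma_i^2\) is the variance of item i and \(\sigma_t^2\) the variance of the total score:

\[ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_t^2}\right) \]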

Table 1. Reliability test of variables

Sr. no Scale Items Cronbach alpha


1 Job satisfaction 12 0.712
2 Performance appraisal 06 0.694

4.2 Socio-demographic Information of the Sample

The socio-demographic profile characteristics were measured for age, gender, marital
status, experience, salary, and academic qualification. Table 1 compiles the respondent’s
profile. This section provides a detailed finding of the demographic aspects of the sample
respondents, which are further discussed with their respective tables and figures (Table 2).

Table 2. Socio-demographic profile of respondents

Sr. no Categories No. of respondents (N = 60) Percentage
1 Gender
Male 37 61.67
Female 23 38.33
Total 60 100
2 Age
Below 25 years 9 15
25–30 years 14 23.33
31–40 years 23 38.34
Above 40 years 14 23.33
Total 60 100
3 Marital Status
Unmarried 34 56.67
Married 26 43.33
Total 60 100
4 Work Experience
0–5 years 36 60
6–10 years 14 23.33
11–15 years 8 13.33
Above 15 years 2 3.34
Total 60 100
5 Qualification
Undergraduate 37 61.67
Graduate 17 28.33
Postgraduate 6 10
Others 0 00
Total 60 100
6 Designation
Manager 2 3.33
Executive 20 33.33
General Worker 38 63.34
Total 60 100
7 Salary in rupees
Less than 5 lakh 58 96.67
5–10 lakh 2 3.33
Total 60 100
Source: Field Survey, 2020

Table 3. Descriptive statistics job satisfaction level

Dimensions of satisfaction level Mean (M) Standard Deviation (SD) Rank
Salary 1.82 1.242 XI
Promotion 1.98 1.127 VIII
Job Security 1.85 1.147 X
Working Condition 1.90 1.245 IX
Goal achieved 2.23 1.533 III
Relationship with coworker 2.23 0.927 IV
Relationship with supervisor 2.18 1.396 V
Nature of work 1.80 1.338 XII
Managing stress at the workplace 2.12 0.885 VII
Workload 2.12 1.316 VI
Quality improvement in employee performance appraisal system 2.77 1.031 II
Performance appraisal system regulated in a proper way 3.37 1.235 I
Average 2.197 1.201
Source: As per the SPSS Output

In Table 3, the performance appraisal system regulated in a proper way shows the highest satisfaction, with a mean
of 3.37 and a standard deviation of 1.235, followed by quality improvement in the employee
performance appraisal system with a mean of 2.77 and standard deviation 1.031, goal
achieved with mean 2.23 and standard deviation 1.533, relationship with coworker with
mean 2.23 and standard deviation 0.927, relationship with supervisor with mean 2.18
and standard deviation 1.396, workload with mean 2.12 and standard deviation 1.316,
managing stress at the workplace with mean 2.12 and standard deviation 0.885, promotion
with mean 1.98 and standard deviation 1.127, working condition with mean 1.90
and standard deviation 1.245, job security with mean 1.85 and standard deviation 1.147,
salary with mean 1.82 and standard deviation 1.242, and nature of work with mean 1.80 and standard
deviation 1.338 (Fig. 1).

[Bar chart, "Dimensions of Job satisfaction level": y-axis 0 to 4 in steps of 0.5; series: Mean (M) and Standard Deviation (SD) for each dimension]

Fig. 1. The graphical representation of Job Satisfaction level at different dimensions with a mean
(M) and Standard deviation (SD)

4.3 Pearson’s Correlation Among Job Satisfaction Dimensions on Work


Performance
To determine the relationship between the factors, i.e., the indicators of job satisfaction,
Pearson's correlation coefficient (r) was computed, as shown in Table 4. It reveals that all variables, namely salary,
promotion, job security, working condition, goal achieved, relationship with coworker,
relationship with supervisor, nature of work, managing stress at the workplace, workload,
quality improvement in the work performance appraisal system, and performance appraisal system
regulated in a proper way, were significantly correlated (0.481 ≤ r ≤ 0.745;
sig < 0.01). It also confirmed that there is no multicollinearity issue, as the highest
r-value was 0.745**.
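For reference, the coefficient reported in Table 4 is the standard Pearson correlation; for two variables x and y measured over n respondents,

\[ r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} \]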

4.4 Multiple Regression Analysis for Testing the Job Satisfaction Level Impact of Performance Appraisal
Regression analysis describes the statistical connection between one or more predictor
variables and a response variable, yielding an equation. Multiple linear regression was employed
Table 4. Pearson’s correlation among job satisfaction dimensions

Job satisfaction dimensions Pearson's correlation, r
Performance appraisal 1
Salary 0.503**
Promotion 0.507**
Job security 0.6588**
Working conditions 0.481**
Goal achieved 0.468**
Relationship with co-worker 0.562**
Relationship with supervisor 0.542**
Nature of work 0.612**
Managing stress at the workplace 0.745**
workload 0.532**
Quality improvement on work performance appraisal 0.607**
system
Performance appraisal systems regulated at the 0.482**
workplace
**Correlation is significant at the 0.01 level (2-tailed).

in this study to examine the influence of the independent factors on the dependent factor. The
amount of total variation in the dependent variable due to the independent variables is
measured in the regression table.
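In general form (a standard formulation, with the paper's eleven job satisfaction dimensions as the predictors \(x_1, \dots, x_{11}\)), the fitted model is

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_{11} x_{11} + \varepsilon, \]

where y is work performance measured through the performance appraisal system, \(\beta_0\) is the intercept, and the \(\beta_j\) are the coefficients reported later in Table 7.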

Table 5. Model summary of variables

Model summary
Model R R Square Adjusted R Square Std. error of the estimate
1 0.890a 0.792 0.744 0.625
a. Predictors: (Constant), Performance appraisal system regulated in a proper way, salary,
relationship with supervisor, goal achieved, job security, managing stress at workplace, quality
improvement in employee performance appraisal system, working conditions, relationship
with coworker, performance appraisal method, nature of work
b. Dependent Variable: Performance Appraisal System
Source: As per the SPSS Output

The R and R² values are listed in Table 5. The simple correlation is represented by
the R-value, which is 0.890 (the "R" column), indicating a high degree of correlation.
The R² value (the "R Square" column) indicates how much variation in the dependent
variable, the performance assessment system, is explained by the factors, i.e., salary,
relationship with supervisor, goal achieved, job security, managing stress at the workplace,
quality improvement in the employee performance appraisal system, working conditions,
relationship with coworker, performance appraisal method, and nature of work.
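As a check on Table 5, the adjusted R² follows from the standard correction for the number of predictors (here k = 11 predictors and n = 60 respondents):

\[ R^2_{\mathrm{adj}} = 1 - (1 - R^2)\,\frac{n-1}{n-k-1} = 1 - (1 - 0.792)\times\frac{59}{48} \approx 0.744, \]

which matches the value reported in Table 5.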

4.5 Multiple Linear Regression Analysis: ANOVA of Job Satisfaction Variables on Performance Appraisal
The ANOVA in Table 6 indicates how well the regression equation represents the data. In
this case, the regression model significantly predicted the dependent variable.

Table 6. ANOVA

ANOVA
Model Sum of squares Df Mean square F Sig
1 Regression 71.211 11 6.474 16.597 0.000b
Residual 18.722 48 0.390
Total 89.933 59
a. Dependent Variable: Performance Appraisal System
b. Predictors: (Constant), Performance appraisal system regulated in a proper way, salary,
relationship with supervisor, goal achieved, job security, managing stress at workplace,
quality improvement in employee performance appraisal system, working conditions,
relationship with coworker, performance appraisal method, nature of work
Source: As per the SPSS Output

Since F(11, 48) = 16.597 with p = 0.000 < 0.05, the overall regression model is statistically significant.
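The F statistic in Table 6 can be reproduced directly as the ratio of the regression and residual mean squares:

\[ F = \frac{\mathrm{MS}_{\mathrm{reg}}}{\mathrm{MS}_{\mathrm{res}}} = \frac{71.211/11}{18.722/48} = \frac{6.474}{0.390} \approx 16.597 \]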

4.6 Multiple Regression Coefficients of Job Satisfaction Variables on Employee Performance
The coefficient table (Table 7) gives all the details needed to anticipate how the independent
variables affect the dependent variable.
The results presented in Table 7 show that the independent variable salary has a beta value
of 0.100, with a t value of 1.236 and a significance level of 0.222. Promotion has a beta
value of 0.162, with a t value of 1.835 and a significance level of 0.073. Job security has
a beta value of 0.257, with a t value of 3.532 and a significance level of 0.001. Working
condition has a beta value of 0.200, with a t value of 2.608 and a significance level of 0.012.
Goal achieved has a beta value of 0.079, with a t value of 0.982 and a significance level
of 0.331. Relationship with coworker has a beta value of 0.346, with a t value of 4.082
and a significance level of 0.000.

Table 7. Multiple linear regression model coefficients of job variables on employee performance

Coefficients
Model B Std. error Beta T Sig Tolerance VIF
1 (Constant) −0.384 0.462 −0.832 0.410
Salary 0.100 0.081 0.100 1.236 0.222 0.658 1.520
Promotion 0.177 0.096 0.162 1.835 0.073 0.559 1.788
Job security 0.277 0.078 0.257 3.532 0.001 0.816 1.225
Working condition 0.199 0.076 0.200 2.608 0.012 0.735 1.360
Goal achieved 0.063 0.065 0.079 0.982 0.331 0.673 1.486
Relationship with coworker 0.461 0.113 0.346 4.082 0.000 0.604 1.656
Relationship with supervisor 0.104 0.069 0.117 1.514 0.137 0.720 1.389
Nature of work −0.034 0.082 −0.037 −0.415 0.680 0.549 1.823
Managing the stress level at the workplace 0.086 0.103 0.062 0.837 0.407 0.793 1.262
Quality improvement in employee performance appraisal system −0.146 0.069 −0.155 −2.108 0.040 0.800 1.250
Performance appraisal system regulated in a proper way 0.404 0.110 0.338 3.666 0.001 0.512 1.954
a. Dependent Variable: Performance Appraisal System
Source: As per the SPSS Output

Relationship with supervisor has a beta value of 0.117, with a t value of 1.514 and a
significance level of 0.137. Nature of work has a beta value of −0.037, with a t value of
−0.415 and a significance level of 0.680. Managing stress at the workplace has a beta value
of 0.062, with a t value of 0.837 and a significance level of 0.407. Quality improvement in
the employee performance appraisal system has a beta value of −0.155, with a t value of
−2.108 and a significance level of 0.040. Performance appraisal system regulated in a proper
way has a beta value of 0.338, with a t value of 3.666 and a significance level of 0.001.
The eleven factors in the model serve as predictors of the dependent variable, the
performance assessment system, as shown in Table 7. The ANOVA results show
F(11, 48) = 16.597 with a significance value p < 0.05. From the conventional regression
analysis, the model's degree of forecasting the dependent variable is R = 0.890, and the
degree of prediction of the dependent variable by the model is R square, equal to 0.792.
As a result, the model is a good fit for the dependent variable. The degree of significance
of the independent variables is reflected by the absolute value of beta in Table 7; the
independent variable with the greatest beta value is the most significant. From Table 7,
the relationship with coworkers made the largest contribution, with a beta value of 0.346,
followed by job security, working conditions, quality improvement in the employee
performance appraisal system, and salary; managing the stress level at the workplace
contributes less to the model. Based on the multiple linear regression analysis results,
the regression equation obtained is shown below: Performance appraisal system
= −0.384 + 0.100 (salary) + 0.177 (promotion) + 0.277 (job security) + 0.199 (working
condition) + 0.063 (goal achieved) + 0.461 (relationship with coworker) + 0.104 (relationship
with supervisor) − 0.034 (nature of work) + 0.086 (managing stress level) − 0.146 (quality
improvement in employee performance appraisal system) + 0.404 (performance appraisal
system regulated in a proper way). This study was performed to highlight the relationship
of salary, promotion, job security, working condition, relationship with co-worker,
relationship with supervisor, nature of work, managing stress level at the workplace, and
quality improvement in the employee performance appraisal system with work performance.
Eleven hypotheses were formulated in this study to examine the significant relationship
between the independent and dependent variables, of which only four were accepted.
Hypothesis 3 (Ho3) posited a significant relationship between job security and performance
appraisal; the multiple linear regression analysis revealed a positive and significant
relationship between job security and job performance, which explains that as job security
increases, employees' work performance also increases. Similarly, Hypothesis 4 (Ho4)
posited a significant relationship between working conditions and job performance; the
standardized beta coefficient (β) is 0.200 with a significance value p of 0.012, revealing
that as working conditions improve, employees' work performance also increases.
Hypothesis 6 (Ho6) posited a significant relationship between the relationship with
coworkers and job performance; from Table 7, the relationship-with-coworker beta (β) is
0.346 with a significance value p of 0.000, revealing a positive and significant relationship
between coworker relationships and employees' work performance, so if an employee's
coworker relationships are not good at the workplace, the employee's work performance
will decrease. For the performance appraisal system regulated in a proper way, the beta (β)
is 0.338 (t = 3.666) with a significance value p of 0.001; this finding suggests that if
employees experience a properly regulated performance appraisal system, their work
performance will also increase in the same direction. Therefore, based on the results of the
multiple linear regression analysis, hypothesis Ho11 is accepted, while hypotheses Ho1,
Ho2, Ho5, Ho7, Ho8, Ho9, and Ho10 are rejected because their significance values are all
greater than 0.05.

5 Conclusion

The following conclusions are drawn from the results and discussion:

A. Most respondents (61.67%) were male. The main reason for the difference in the
ratio of males to females is that the actual ratio of employees recruited in the selected
pharmaceutical industries is approximately 70:30. The results revealed that the number
of female workers must be increased to boost participation and women's empowerment.
Most respondents (38.34%) fall in the 31–40 years age group; apart from this group, the
25–30 years and above-40 years groups each hold the second-largest share, around
23.33% of the total sample size, which reveals that there is considerable scope for
employment at an early age and that the companies tend to hire young workers rather
than older workers. Unmarried respondents (56.67%) slightly outnumbered married ones
(43.33%). It was found that 60% of employees have work experience of 0 to 5 years and
only 3.34% have more than 15 years. From the findings, all predictors (independent
variables) together contribute 79.2% in explaining work performance.
B. From the Pearson’s correlation, the variable stress management at workplace is
found strong statistically significant i.e., 0.745**at the 0.01 level. This concluded
that if in the organisation stress managing training is provided to the employees,
then the employees work performance will increase and they give their best output
to the organisation.
C. The ANOVA results show F(11, 48) = 16.597 with a significance value p < 0.05.
From the conventional regression analysis, the model's degree of forecasting the
dependent variable is R = 0.890, and the degree of prediction of the dependent variable
by the model is R square, equal to 0.792. As a result, the model is a good fit for the
dependent variable. It indicates that the independent variables, i.e., the job satisfaction
dimensions, explain 79.2% of the variation in the dependent variable, i.e., work
performance at Meridian Medicare Private Limited.
D. This research study was performed to highlight the relationship of salary, promotion,
job security, working condition, relationship with co-worker, relationship with supervisor,
nature of work, managing stress level at the workplace, and quality improvement in the
employee performance appraisal system with work performance. Eleven hypotheses were
formulated to examine the significant relationship between the independent and dependent
variables. Only four hypotheses were found statistically significant and accepted, i.e.,
Ho3, Ho4, Ho6, and Ho11; the other hypotheses, i.e., Ho1, Ho2, Ho5, Ho7, Ho8, Ho9, and
Ho10, were rejected because their significance values are all greater than 0.05.
E. Table 7, where the multiple linear regression analysis is given, reveals a positive and
significant relationship between job security and job performance; the result explains that
as job security increases, employees' work performance also increases. The
relationship-with-coworker beta (β) is 0.346 with a significance value p of 0.000, revealing
a positive and significant relationship between coworker relationships and employees'
work performance: if an employee's coworker relationships are not good at the workplace,
the employee's work performance will decrease.

References
1. Coutts, L.M., Schneider, F.W.: Police officer performance appraisal systems: how good are
they? Int. J. Police Strat. Manag. 27(1), 67–81 (2004)
2. Mondi, R., Mondi, W., Bandy, J.: Human Resource Management, 13th edn. Prentice Hall,
Hoboken (2014)
3. Dessler, G.: Human Resource Management, 13th edn. Prentice Hall, Hoboken (2013)
4. Chen, J., Eldridge, D.: Are standardized performance appraisal practices preferred? A case
study in China. Chin. Manag. Stud. 4(3), 244–257 (2012)
5. Attorney, A.: Performance Appraisal Handbook, New York (2007)
6. Wright, P., Cheung, K.: Articulating appraisal system effectiveness based on managerial
cognition. Pers. Rev. 36(2), 206–230 (2007)
7. Virani, S.R.: An analytical study of performance appraisal system of the selected information
technology enabled services companies. Zenith Int. J. Multi. Res. 2(5), 135–145 (2012)
8. Chavda, D.C., Yagnik, D.P.: Depth study on performance appraisal practices of selected dairy
units in Gujarat State. Int. J. Res. Manag. Pharm. 2(8), 19–24 (2013)
9. Jain, D., Garg, S.: Awareness towards the performance appraisal systems in HRH group of
hotels: a case study. Int. J. Mark. Finan. Serv. Manag. Res. 2(4), 29–48 (2013)
10. Mishra, L.: A research study on employees appraisal system: case of Hong Kong and shanghai
banking corporation (HSBC Bank). Int. J. Bus. Manag. Invention 2(2), 60–67 (2013)
11. Rana, M.V.S., Lokhande, D.M.A.: Performance evaluation of Maruti Suzuki India limited:
an overview. Asia Pac. J. Mark. Manag. Rev. 2(2), 120–129 (2013)
12. Renganayaki, N.: A study on the effectiveness of performance appraisal in GB Engineering
Enterprises Pvt. Ltd. Thuvakudi, Trichy Tamil Nadu, India. Int. J. Sci. Res. 2(2), 466–467
(2013)
13. Singh, R., Mohanty, M., Monthy, A.K.: Performance appraisal practices in Indian service and
manufacturing sector organizations. Asian J. Manag. Res. 4(2), 256–265 (2013)
14. Lakshmi, S., Mohideen, A.: Issues in reliability and validity of research. Int. J. Manag. Res.
Rev. 3(4), 2752–2758 (2013)
15. Yoganandan, G., Saravanan, R., Priya, N., Ruby, N.: A study on performance appraisal system
in EID Parry (India) Ltd, Pugalur Tamil Nadu India. Int. J. Sci. Res. 2(8), 242–245 (2013)
16. Agarwal, S.: A critical study on impact of performance appraisal system on employees pro-
ductivity with special reference to an FMCG company. J. Radix Int. Educ. Res. Consortium
3(1), 1–5 (2014)
17. Choudhary, G.B., Puranik, S.: A study on employee performance appraisal in health care.
Asian J. Manag. Sci. 2(3), 59–64 (2014)
18. Cowandy, C.J.: The impact of fair performance appraisal to employee motivation and sat-
isfaction towards performance appraisal: a case of PT XYZ. Bus. Manag. 2(2), 21–28
(2014)
19. Deepa, E., Palaniswamy, R., Kuppusamy, S.: Effect of performance appraisal system in orga-
nizational commitment, job satisfaction, and productivity. J. Contemp. Manag. Res. 8(1),
72–82 (2014)
20. Gautam, A.: A study on performance appraisal system practiced in sugar mills, and its impact
on employees’ motivation. A case study of Simbhawli Sugar Limited India. Asian J. Manag.
Res. 4(3), 350–360 (2014)
21. Malik, K.A., Bakhtawar, B.: Impact of appraisal system on employee performance: a com-
parison of permanent and contractual employees of Pakistan Telecommunications Company
Limited (PTCL). Eur. Sci. J. 1(1), 98–109 (2014)
22. Rahman, D.: Financial performance of pharmaceutical industry in Bangladesh with special
reference to square pharmaceuticals limited. IOSR J. Bus. Manag. 16(10), 38–46 (2014)
23. Sippy, N., Varma, S.: Performance appraisal systems in the hospital sector: a research-based
on hospitals in Kerala. Int. J. Bus. Manag. Res. 4(1), 97–106 (2014)
24. Begum, S., Hossain, M., Sarker, M.A.H.: Factors determining the effectiveness of perfor-
mance appraisal system: a study on pharmaceutical industry in Bangladesh. J. Cost Manag.
J. 43(6), 27–35 (2015)
25. Chetana, N., Patnaik, L., Mohapatra, A.D.: Determinants of performance appraisal: an
empirical study. Int. J. Adv. Res. Comput. Sci. Manag. Stud. 3(11), 150–161 (2015)
26. Khedekar, D.E.: Analysis of performance appraisal systems in the education sector. Int. J.
Manag. Sci. Bus. Res. 4(6), 105–110 (2015)
27. Phin, L.W.: The effectiveness of performance appraisal in the private education industry in
Malaysia. Int. J. Bus. Inf. 10(1), 95–124 (2015)
28. Rekha, G.S.: Performance appraisal in Yashodha hospital. Int. J. Eng. Manag. Res. 5(2),
254–263 (2015)
29. Xavier, J.V.: A study on the effectiveness of performance appraisal system and its influence on
the socio-demographic factors of the employees of a manufacturing industry in Tamil Nadu.
Int. J. Res. Manag. Bus. Stud. 2(1), 26–31 (2015)
30. Gupta, V., Swaroop, A.: Comparative study of performance appraisal on two pharmaceutical
organizations in Madhya Pradesh. Int. J. Eng. Sci. Manag. 2(2), 248–260 (2012)
31. Jahan, D.S.: Employee performance appraisal system: a study on square pharmaceuticals
limited. J. Bus. Stud. 37(1), 49–61 (2016)
32. Wararkar, P., Wararkar, K.: Study of performance appraisal practiced at textile industry in
India. Int. J. Textile Eng. Process. 2(1), 23–29 (2016)
33. Maheswari, R.S.: A study of performance appraisal system at IBM, Bangalore. Int. J. Adv.
Res. Ideas Innov. Technol. 3(5), 139–145 (2017)
34. Bhatia, M.V.A., Patel, M.R.: A study on employees satisfaction towards performance appraisal
system at power generation company. J. Emerg. Technol. Innov. Res. 5(6), 577–582 (2018)
35. Dharmadhikari, D.S.P., Bampoori, M.: Study of employee performance appraisal methods in
hospitals. Int. J. Acad. Res. Dev. 3(2), 1149–1153 (2018)
36. Sharma, N., Prakash Rao, B.: Impact of performance appraisal on employee motivation with
special reference to Indian steel industry. IOSR J. Bus. Manag. 20(2), 44–47 (2018)
37. Kumari, G., Joshi, G., Alam, A.: A comparative study of job satisfaction level of software
professionals: a case study of private sector in India. In: Ray, K., Sharma, T.K., Rawat, S.,
Saini, R.K., Bandyopadhyay, A. (eds.) Soft Computing: Theories and Applications. AISC,
vol. 742, pp. 591–604. Springer, Singapore (2019). https://doi.org/10.1007/978-981-13-0589-
4_55
298 G. Kumari et al.

38. Kumari, G., Pandey, K.M.: Factors influencing employees turnover and measuring its impact
in pharmaceutical industry: an analytical analysis with SPSS method. In: Kumar, S., Purohit,
S.D., Hiranwal, S., Prasad, M. (eds.) Proceedings of International Conference on Commu-
nication and Computational Technologies. AIS, pp. 519–539. Springer, Singapore (2021).
https://doi.org/10.1007/978-981-16-3246-4_41
39. Kumari, J., Kumari, G., Pandey, K.M.: Factors affecting of employee performance appraisal
system in the pharmaceutical industry: an analytical study. In: Interdisciplinary Research in
Technology and Management, Chapter 45, pp. 290–300. Taylor and Francis Group, CRS
Press (2021). https://doi.org/10.1201/9781003202240. ISBN 978-1-003-20224-0 (ebook)
40. Kumari, G., Khanna, S., Bhanawat, H., Pandey, K.M.: Occupational health and safety of work-
ers in pharmaceutical industries, Himachal Pradesh, India. Int. J. Innov. Technol. Exploring
Eng. 8(12), 4166–4171 (2019)
41. Bhanawat, H., Kumari, G., Bijayasankar, B.P.: The satisfaction level of employees towards the
prevailing performance appraisal system. Turk. J. Comput. Math. Educ. 12(11), 1508–1514
(2021)
Vocal Psychiatric Simulator for Public Speaking
Anxiety Treatment

Sudhanshu Srivastava1(B) and Manisha Bhattacharya2


1 University of Petroleum and Energy Studies, Dehradun, Uttarakhand, India
shobhitsri1996@gmail.com
2 Remote Sensing Application Center, Lucknow, Uttar Pradesh, India

Abstract. Speech anxiety is an anxiety rooted in the subconscious. It arises when we
are inner-directed instead of outer-directed and undergo various reactions such as the
fight-or-flight response and physiological and psychological reactions, which often
lead to mental disorders and trauma. In India, where anxiety and depression are
treated as taboo and people are often reluctant to visit a psychiatrist, the researchers
in this paper have devised a strategy that tries to eliminate this human aspect by
creating an intelligent system capable of evaluating a person's mental state using his
facial expressions and simultaneous vocal answers to a set of psychiatric questions.
The generated simulator can then be utilized for Public Speaking Anxiety (PSA)
training and treatment, as well as for determining cues to which speakers have a
higher sensitivity.

Keywords: Public speaking anxiety · Facial expression recognition · Sentiment analysis · Support Vector Machines · Convolution Neural Network

1 Introduction
Public speaking anxiety (PSA) is the most common social phobia among the general
public. Anxiety disorders are classified as a set of disorders marked by anxiety and fear
in the Diagnostic and Statistical Manual of Mental Disorders (DSM V) [1]. Anxiety is
defined as "an unpleasant emotional state or situation marked by subjective emotions of
anxiety, apprehension, and concern, as well as autonomous system activation or arousal"
[2]. Users with anxiety problems find it difficult to engage in a variety of daily activities,
such as conversing with strangers or staying in crowded places [3]. There are various
kinds of anxiety disorders according to DSM V: panic disorder, obsessive-compulsive
disorder, agoraphobia (specific phobia) and social phobia [2, 4]. First, we must define
social phobia in order to better grasp public speaking fear. People with social phobia
have a severe dread of social performance and of acting in a humiliating or embarrassing
manner that will cause others to judge them negatively [4].
According to estimates, 85% of the general population feels nervous while giving
a public speech [5]. People who are afraid of giving a public speech expressed concern
that they: are ashamed when they make mistakes that make them appear "dumb" in
front of others [1], feel uneasy about being the center of attention [2], and fear that no
one will be interested in what they have to say [6]. Physical, vocal, and nonverbal signs
accompany public speaking anxiety. Some of the symptoms include shivering/shaking,
cold hands, rapid heartbeats, sweating, blushing, dizziness, internal pain, shaky voice,
stuttering, speaking rapidly or slowly, fidgeting, inability to stand still, avoiding eye
contact, and wiping hands [7].
PSA is not merely a social phobia; it can lead to a range of mental health issues. The
number of persons suffering from mental health issues is steadily rising. The optimal
psychiatrist-to-population ratio, according to reports from the Indian Union Ministry
of Health and Family Welfare, should be 1:8000 (at the very least), while the current
ratio is 1:3500 [8]. Furthermore, a bigger issue than the shortage of psychiatrists is that
people in India, particularly students, are hesitant to seek help from a counsellor or a
psychiatrist for fear of being judged. People in other locations do not have
easy access to such services and amenities, and as a result, they are unable to interpret
and comprehend their true feelings. This issue may become more serious in the future,
threatening a person's well-being. People are often unaware of mental health issues and
have a negative attitude towards those who suffer from them, which is one of the main
reasons why people are hesitant to admit that they are suffering from them.
The facial expression is the most common nonverbal communication tool for under-
standing a person’s mentality. Automatic Facial Expression Recognition (FER) has
become a research focus due to its wide range of applications. A facial expression
recognition system is an automated system that can classify face expressions using
facial features extracted from a static image or a live video dataset. A facial expression
is a fundamental indicator of one’s mental and emotional state. According to Psycholo-
gist Mehrabian’s research, only 7% of genuine information is transferred verbally, while
38% is passed through language’s auxiliary elements, such as speech rhythm and pace,
tone, and so on. The information ratio communicated by a person’s facial expression
has reached 55%. As a result, the majority of useful information may be gathered by
facial expression recognition, which is the most effective technique to assess a per-
son’s mental state [9]. In recent years, FER has become a more popular study topic,
owing to its numerous applications in the disciplines of computer vision, robotics, and
human-computer interaction. Paul Ekman [10] presented six universal expressions. In his
research, he has described the location of faces as well as the physical motions required
to form various expressions [11]. The Facial Action Coding System (FACS), created
by Swedish anatomist Carl-Herman Hjortsjö, is a classification system for human facial
motions based on how they appear on the face. This methodology, which was later
adopted by Ekman & Friesen [12] is also a good way to categorize human expressions.
In the past, FACS was commonly used to implement FER systems. However, there has
recently been a trend to use classification methods like SVM, neural networks, and the
Fisherface algorithm to construct FER [13–15]. The Japanese Female Facial Expressions
(JAFFE) dataset, the Extended Cohn Kanade dataset (CK+), and the FER2013 dataset
are all available for research in the field of Facial Expression Recognition [16–18]. Each
dataset has a different type and number of photos, as well as a different way of labelling
the images. The FACS system is used to label faces in the CK+ dataset, which includes
the Action Units (AUs) for each facial image. This research describes a Convolutional
Neural Networks (CNN) based approach to Facial Expression Recognition (FER). This
CNN-based algorithm may be used to detect real-time facial expressions. While participants
answer psychiatric questions, this technology can be utilized to analyze their emotions.
Another concept used in this research is Sentiment Analysis of the live psychiatric
questionnaire that is asked of the user. Sentiment Analysis is a natural language processing
technique that analyses text to identify whether the author's attitude toward a
given topic, product, or other entity is positive, negative, or neutral. It is a set of approaches,
tactics, and tools for detecting and extracting subjective data from text, such as opinions
and attitudes. Historically, sentiment analysis has concentrated on polarity of opinion, that is,
whether someone has a positive, neutral, or negative attitude about something. A product
or service whose review has been made public on the internet has often been the target of
sentiment analysis. This could explain why sentiment analysis and opinion mining are
often used interchangeably, despite the fact that sentiments should be considered emotionally
charged opinions. The desire to know what others have to say is almost as old as vocal
communication itself [19].
Finally, the simulator is created after training both, the Facial Expression Recognition
module and the Sentiment Analysis module and a report is generated based on the
whole conversation between the person and the system evaluating the sentiment for
each answered question.

2 Research Objectives
The goal of this research is to create an intelligent system that can anticipate a person’s
mental state based on their audio responses to a series of psychiatric questions as well as
their facial expressions. It is an attempt to remove the human element by developing an
intelligent system capable of assessing a person's mental state. The researchers also aim
to raise mental health awareness in the Indian rural areas where people are less aware
of such issues. Researchers want to take their project to these locations and encourage
people to test out the mental simulator, which includes a series of questions (provided
by a legal psychiatrist). They will be able to assess their true emotions as a result of this,
and the Indian society will become more conscious of such issues.

3 Design Methodology
3.1 Facial Expression Recognition Module
The FER system was implemented using the FER2013 dataset from the Kaggle compe-
tition on FER [18]. There are 35,887 tagged photos in the dataset, separated into 3589
test and 28709 train images. Another 3589 private test photographs make up the dataset,
on which the challenge’s final test was performed. FER2013 dataset images are black
and white and measure 48 × 48 pixels. The FER2013 collection includes images with a
variety of perspectives, illumination, and scale (see Fig. 1). The following Table 1 gives
the dataset description.
In the simulator, the user’s live video feed is given frame by frame to the Facial
Expression Classifier, which then uses a Convolutional Neural Network (CNN), a deep
learning architecture (see Fig. 2) to conduct the classification of the user’s emotion into
one of the seven classes (angry, disgust, fear, happy, sad, surprise, neutral).
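As a concrete illustration, the following is a minimal sketch of a CNN classifier of this kind, assuming Keras/TensorFlow; the paper does not publish its exact layer configuration, so the architecture below is illustrative rather than the authors' own.

import tensorflow as tf
from tensorflow.keras import layers, models

def build_fer_cnn(num_classes: int = 7) -> tf.keras.Model:
    # 48 x 48 grayscale inputs, matching the FER2013 image format
    model = models.Sequential([
        layers.Input(shape=(48, 48, 1)),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),
        # one output per emotion: angry, disgust, fear, happy, sad, surprise, neutral
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_fer_cnn()
model.summary()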

Fig. 1. Sample images from the FER2013 dataset

Table 1. Description of the FER2013 dataset

Label Number of images Emotion


0 4594 Angry
1 547 Disgust
2 5121 Fear
3 8929 Happy
4 6077 Sad
5 4002 Surprise
6 6198 Neutral

Fig. 2. CNN architecture



3.2 Sentiment Analysis Module


The user's audio feed is used for the sentiment analysis, in order to classify the user's
mood into the appropriate category, i.e. either positive or negative. After splitting the
training dataset, an instance of the CountVectorizer class is created for tokenizing and
building the vocabulary of the training dataset. Using a Web API, the audio feed is
translated to text (see Fig. 3); a minimal sketch of this audio-to-text step follows.
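The paper does not name the Web API; as an assumption, the sketch below uses the open-source SpeechRecognition package, whose recognize_google() call wraps the free Google Web Speech API, to stand in for that step.

import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:                  # requires the PyAudio package
    recognizer.adjust_for_ambient_noise(source)  # reduce background noise
    audio = recognizer.listen(source)            # record the user's spoken reply

try:
    reply_text = recognizer.recognize_google(audio)  # Web API transcription
    print("Your Reply:", reply_text)
except sr.UnknownValueError:
    print("Speech was not intelligible")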

Fig. 3. Flowchart of methodology

The Support Vector Machines (SVM) algorithm has been used for the sentiment analysis.
Linear SVC (Support Vector Classification) is a linear classification algorithm which
tries to find a hyperplane that separates two classes (it is applicable to the multiclass
case as well). The difference from Logistic Regression is that here we are trying to find
a 'margin maximizing' hyperplane.
The primal form of SVC is

$$w^*, b^* = \operatorname*{argmin}_{w,b}\; \frac{\|w\|^2}{2} + C \sum_{i=1}^{n} \zeta_i \quad \text{s.t.}\quad y_i\left(w^T x_i + b\right) \ge 1 - \zeta_i \ \text{ for } i = 1 \text{ to } n \qquad (1)$$

And the dual form is

$$\alpha_i^* = \operatorname*{argmax}_{\alpha_i}\; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j\, x_i^T x_j \quad \text{s.t.}\quad \alpha_i \ge 0,\ \sum_{i=1}^{n} \alpha_i y_i = 0 \qquad (2)$$

Fig. 4. Flowchart for FER and sentiment analysis

For training the SVM for sentiment analysis, a self-created psychiatric questionnaire
has been used. The computer vision concept is also used, which allows the system to
observe and perceive its surroundings while recording the user's video feed (see Fig. 4).
In the simulator, the user is asked a few psychiatric questions, and his live audio is
converted to text for the sentiment analysis. A minimal sketch of such a training
pipeline follows.
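A minimal sketch of the pipeline, assuming scikit-learn's CountVectorizer and LinearSVC; the four training sentences below are illustrative stand-ins for the actual questionnaire replies, not the authors' data.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# tiny illustrative training set; the real model is trained on the
# self-created psychiatric questionnaire replies
train_texts = [
    "I am in a bad mood and get irritated at small things",
    "I am really active and stay healthy and fresh",
    "it is difficult for me to concentrate and I get distracted",
    "the conversation was really helpful",
]
train_labels = ["negative", "positive", "negative", "positive"]

# CountVectorizer tokenizes and builds the vocabulary; LinearSVC fits a
# margin-maximizing separating hyperplane, cf. Eqs. (1) and (2)
clf = make_pipeline(CountVectorizer(), LinearSVC())
clf.fit(train_texts, train_labels)

print(clf.predict(["I feel very uneasy and restless"]))  # e.g. ['negative']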
Finally, both modules are integrated together using a Flask GUI to generate the
simulator, which classifies the user's answers to a set of psychiatric questions into
either a positive or a negative sentiment while simultaneously categorizing the
emotions in his live video feed.

4 Results
The model successfully evaluates the mental state of the user and displays the
sentiment for each answered question. The facial expression is also displayed
alongside while the questions are being answered. The detailed results of the different
sub-modules of the research are as follows:

4.1 Sentiment Analysis Module

The user-defined Support Vector Machine classifier gives an accuracy of
around 80% on the sentiment classification (see Table 2).

Table 2. Output of SVM user-defined

Question 1: How is your mood most of the time?


Begin Speaking!
Your Reply: I am in bad mood I get irritated and angry at small things for no reason I also yell at the people for no reasons
when I am in a bad mood
Negative Sentiment
Question 2: Is there any fluctuation in the mood?
Begin speaking!
Your Reply: a lot of fluctuation in the mood one moment I would be happy and then something would happen and it would
change my mood completely
Positive Sentiment
Question 3: How is your sleep? What is the pattern of sleep? Any difficulty in falling asleep or in getting up?
Begin speaking!
Your Reply: I do get proper sleep I have a very bad sleeping pattern and also it is very difficult for me to fall asleep
Negative Sentiment
Question 4: Do you feel Difficulty in Concentration?
Begin speaking!
Your Reply: It is difficult for me to concentrate on things for long time I easily get distracted and it is really affecting my
professional life.
Negative Sentiment
Question 5: Do you feel low or active most of the time?
Begin speaking!
Your Reply: Most of the time I am really active I go out and play football or cricket with my friends I also come home and
then workout and stay healthy and fresh all the time.
Positive Sentiment
Question 6: Do you feel uneasy or restless?
Begin speaking!
Your Reply: whenever I am sitting idle I feel very uneasy and restless I need to be engaged in something or the other or it
makes me anxious and worried
Negative Sentiment
Question 7: How is your appetite? Do you have decreased or increased feeling of eating?
Begin speaking!
Your Reply: I have really bed appetite I have decreased feeling of eating whenever I am feeling anxious also I eat very
unhealthy food it is affecting my health in a bad way
Negative Sentiment
Question 8: How is your orientation towards sex? Interest in sex decreases?
Begin Speaking!
Your Reply: yes, my interest in sex is decreasing and I am really worried about my sexual orientation
Positive Sentiment
Question 9: Are you losing interest in day to day activities?
Begin speaking!
Your Reply: most of the day today activities don’t seem to interest me anymore
Negative Sentiment
Question 10: Do you see any variation in your weight? How would you rate this conversation? How helpful was it?
Begin Speaking!
Your Reply: conversation was really helpful I would rate it 9 out of 10
Positive Sentiment

4.2 Facial Expression Recognition (FER) Module

The facial expression classifier gives an average accuracy of around 60% for the training
and validation tests carried out during the testing of the code (see Fig. 5).

Fig. 5. FER output

The final integration of all the modules using the Flask GUI generates a sentiment
analysis report at the end of the question set, while the facial expression classifier
keeps classifying the live feed alongside (see Fig. 6).

Fig. 6. GUI and facial expression classification alongside the question set

The sentiment analysis here works with an average accuracy of around 60% on the
input questionnaire (see Fig. 7).

Fig. 7. Final sentiment analysis report

5 Conclusion and Future Work

All the modules integrated together successfully carry out the sentiment analysis
along with the Facial Expression Recognition, hosted on the local host with the help
of the Flask GUI; the simulator was successful in predicting the user's mental state and
would thus help in treating Public Speaking Anxiety and eliminating the need for
psychiatrists in the future. The user-defined SVM gives a high accuracy on the psychiatric
questionnaire. Some webcam- and GPU-based limitations were also encountered during
the research. Moreover, using the Web API for speech recognition reduced the accuracy
of that module, as it does not work very efficiently.
For future work, the researchers can create a multisource knowledge base for
the psychiatrists' questionnaire and can try other deep learning and machine learning
architectures to achieve even higher accuracy for the simulator.

References
1. Vahia, V.N.: Diagnostic and statistical manual of mental disorders 5: a quick glance. Indian
J. Psychiatry 55, 1–5 (2013)

2. Spielberger, C.D., Reheiser, E.C.: Assessment of emotions: anxiety, anger, depression, and
curiosity. Appl. Psychol.: Health Well-Being 1(3), 271–302 (2009)
3. Gorini, A., Riva, G.: Virtual reality in anxiety disorders: the past and the future. Expert Rev.
Neurother. 8(2), 215–233 (2008)
4. Pertaub, D.P., Slater, M.: An experiment on public speaking anxiety in response to three
different types of virtual audience. Presence: Teleoper. Virtual. Environ. 11(1), 68–78 (2002)
5. Burnley, M.C.E., Cross, P.A., Spanos, N.P.: The effects of stress inoculation training and skills
training on the treatment of speech anxiety. Imagin. Cogn. Pers. 12(4), 355–366 (1993)
6. Ahs, F., Mozelius, P., Dobslaw, F.: Artificial intelligence supported cognitive behavioral ther-
apy for treatment of speech anxiety in virtual reality environments. In: Proceedings of the
European Conference on the Impact of Artificial Intelligence and Robotics, ECIAIR 2020
(2020)
7. Harris, S.R., Kemmerling, R.L., North, M.M.: Brief Virtual Reality Therapy for Public
Speaking Anxiety (2002)
8. Agarwal, T.: Centre for Budget and Governance Accountability, India, December 2019.
https://www.cbgaindia.org/blog/mental-illness-not-talking/
9. Roopa, S.: Research on face expression recognition. Int. J. Innov. Technol. Explor. Eng. 8(9),
81–91 (2019)
10. Ekman, P.: Strong evidence for universals in facial expression: a reply to Russell’s mistaken
critique. Psychol. Bull. 115(2), 268–287 (1994)
11. Rosenberg, E.L., Ekman, P.: What the Face Reveals: Basic and Applied Studies of Sponta-
neous Expression using the Facial Action Coding System. Oxford Scholarship Online, San
Francisco (1997)
12. Ekman, P., Friesen, W.V.: Unmasking the face: A guide to recognizing emotions from facial
clues (2003)
13. Alshamsi, H., Kepuska, V.: Real time automated facial expression recognition app develop-
ment on smart phones. In: 2017 8th IEEE Annual Information Technology, Electronics and
Mobile Communication Conference (IEMCON), October 2017
14. Fathallah, A., Abdi, L., Douik, A.: Facial expression recognition via deep learning. In: 2017
IEEE/ACS 14th International Conference on Computer System and Application (AICCSA),
pp. 745–750. IEEE (2017)
15. Lyons, M.J., Budynek, J., Akamatsu, S.: Automatic classification of single facial images.
IEEE Trans. Pattern Anal. Mach. Intell. 21(12), 1357–1362 (1999)
16. Kanade, T., Cohn, J.F., Tian, Y.: Comprehensive database for facial expression analysis.
In: Proceedings Fourth IEEE International Conference on Automatic Face and Gesture
Recognition (Cat. No. PR00580), pp. 46–53 (2000)
17. Lucey, P., Cohn, J.F., Kanade, T.: The extended Cohn-Kanade dataset (CK+): a complete
dataset for action unit and emotion-specified expression. In: IEEE Computer Society Con-
ference on Computer Vision and Pattern Recognition-Workshops, pp. 94–101. IEEE, June
2010
18. Goodfellow, I.J., Erhan, D., Carrier, P.L., Courville, A., Mirza, M., Hamner, B., Cukierski, W.,
Tang, Y., Thaler, D., Lee, D.-H., Zhou, Y., Ramaiah, C., Feng, F., Li, R., Wang, X., Athanasakis,
D., Taylor, J., Milakov, M., Park, J., Ionescu, R., Popescu, M., Grozea, C., Bergstra, J., Xie,
J., Romaszko, L., Xu, B., Chuang, Z., Bengio, Y.: Challenges in representation learning: a
report on three machine learning contests. In: Lee, M., Hirose, A., Hou, Z.-G., Kil, R.M.
(eds.) ICONIP 2013. LNCS, vol. 8228, pp. 117–124. Springer, Heidelberg (2013). https://
doi.org/10.1007/978-3-642-42051-1_16
19. Mäntylä, M.V., Graziotin, D.: The evolution of sentiment analysis—A review of research
topics, venues, and top cited papers. Comput. Sci. Rev. 27, 16–32 (2018)
IoT Automation Test Framework for Connected
Ecosystem

Chittaranjan Pradhan(B), Sunil A. Kinange, Jayavel Kanniappan, and Rajesh Kumar Jayavel

Intelligence & IoT, Samsung R&D Institute, Bangalore, India


{chittaranjan.p,sun.kinange,jay.sds,rajesh.j}@samsung.com

Abstract. Nowadays, the number of Internet of Things (IoT) devices keeps
growing, as many new devices are developed and connected through the IoT ecosystem
to become smarter. Today, the IoT connects everything from TVs, bulbs, alarms,
air conditioners, door locks and safety systems from different vendors to build the IoT
ecosystem. IoT is transforming the way people live and work, and assuring the
quality of the huge variety of IoT devices is correspondingly challenging, as it involves
verification across three layers: client (device), server and IoT device end-to-end
states. IoT verification involves diverse components such as device hardware, hubs,
sensors, application software, server software, networks and client platforms. In this
paper, we propose an IoT automation framework which validates all IoT layers:
end-to-end hardware state, client responses, cloud responses and IoT applications.
Our framework has 4 novel aspects: (1) a power based validation technique,
(2) an ML based image validation technique, (3) an OCR based validation technique
and (4) an ML based sound validation technique. The proposed method was applied to
an IoT project and played a major role in differentiating the quality of our IoT services
across devices.

Keywords: AI · ML · IoT (Internet of Things) · Cloud validation · Software engineering · Test automation framework · Model-based validation · Home automation · ZigBee · Z-Wave · OCF

1 Introduction
As per statistical data, more than 10 billion IoT devices are connected and active globally.
Ensuring the quality of the IoT ecosystem is challenging, as these products are developed
by a variety of vendors on a variety of platforms. Building an automation framework that
validates real hardware states/actions for all IoT products is quite complex. To address
this problem, we designed a scalable automation test framework with different validation
techniques for different IoT devices and all components (client and server responses),
which helps to increase test coverage and quality. The framework is revolutionizing the
way IoT devices are tested, allowing testing to become faster, smarter and more efficient.
The IoT home ecosystem is connected with a variety of devices loaded with rich
functionality, user interfaces and end-to-end behavior. The quality assurance of these
devices requires huge manual effort for development and production software releases,
as the end-user experience is the most important factor. An analysis of all IoT home
connected device types and the human interface layer is required to arrive at an
automation framework suitable for the IoT domain.
This research paper focuses on the end-to-end quality parameters of IoT home
connected device types such as Wi-Fi, Bluetooth Low Energy, infrared, and ZigBee
(e.g. smart bulb, sensor, door lock, camera, siren, TV, air conditioner, refrigerator,
robot cleaner, washing machine, mobile, watch, speaker…) and proposes an automation
framework. The framework incorporates intelligence to validate a variety of IoT
end-to-end device output values using screen UI, text, power, image, video, audio and
OCR concepts. It is powered by rule-based and CNN-based AI/ML models to improve
end-to-end validation accuracy.
To deliver complete functionality, the framework is broken into a multi-stage
verification that includes client request/response, server request/response, IoT device
request/response, IoT device end-to-end state, latency computations, and engineering
KPIs (memory leak, CPU, thermal, battery, stability etc.).

2 Related Work

Currently there are few frameworks in the market that support end-to-end IoT validation,
and those that exist work with limitations. Most of the IoT automation tools available
in the market are focused on security and interoperability testing, and there is no
complete end-to-end IoT test framework in the market which can support all stages of
testing, starting from the application down to real IoT devices.
Competitor tool A [1] supports end-to-end testing by validating all stages and is
similar to our proposed framework, but its validation is based only on client, server
and device logs, whereas our framework is novel in validating real IoT devices using
different validation techniques.
Competitor tool B [2] is oriented mainly towards security testing and communication
protocol testing, which primarily ensures data privacy. The multi-stage validations are
not covered as part of its automation; hence other tools are required for multi-stage
validation.
So there is no common framework available to validate the complete IoT ecosystem.
As IoT devices proliferate in the market, testing each new IoT device becomes quite
difficult and hugely time consuming when assuring quality.

3 Proposed Framework

The framework follows a master and client architecture, i.e. a Windows/PC based
solution acts as the master controller and the client (mobile) acts as the user (see Fig. 1).
The master controller sends requests to the client (mobile) to perform actions on the
SmartThings application, and it collects client and server logs from the client device
and the cloud server. The framework validates the client and server log parameters, the
SmartThings application layout elements and the real IoT device hardware state, and
generates the final end-to-end evaluation report.

Fig. 1. Layered framework architecture

The framework architecture incorporates five major layers: the application layer,
framework layer, communication layer, service layer and device layer.
The application layer provides the user interface to communicate with the framework
layer. The framework layer is the core part of the architecture; it embeds the major
components of the system and controls the complete end-to-end automation framework.
This layer communicates with the Android service and interacts with the SmartThings
application based on input scripts.
The service layer mainly contains 3rd party libraries and the AI/ML based models
used for multi-stage validation of the device interaction layer, while the device layer
represents the set of supported IoT devices and the respective device capabilities and
communication protocols.
The framework layer provides a remote execution feature, where the user can
prepare the test setup in one place and trigger test execution from anywhere. The
framework manages this with two approaches: (a) remote execution through the remote
desktop concept, and (b) remote execution through a web service, with a user interface
that lets the user send test requests remotely by providing basic test information (test
device name, test suite etc.).
The log parser module helps to request and receive data from the client, server and
IoT devices. The report manager pulls functional and non-functional execution data
from the client side and generates an Excel dashboard. The device connection manager
helps to connect multiple devices, which act as different users (clients), and IoT devices
in a parallel, thread-invocation manner.

Fig. 2. Novel validation approach for IoT devices

The framework has been designed to handle testing of multiple IoT devices at the
same time with different novel advanced validation techniques (see Fig. 2), using an
ML based image model, an OCR based model, a power based model and a sound based
model for the real IoT device hardware state; this achieves 93% test coverage, reduces
huge manual effort and ensures IoT product quality.
The validation mechanism for the real hardware state of all IoT products with these
advanced techniques is the novel feature of this framework and is unique in nature.

4 Advanced Validation Techniques


The framework takes a novel approach to validating IoT device states; the details of
these techniques are explained below.

4.1 ML Based Image Validation Technique


The framework uses image segmentation and classification techniques (a CNN based
model) [3] implemented with the TensorFlow framework. The model has been trained
with known image datasets of different states of IoT devices, with varied angles and
light exposures, to identify the right states (e.g. door locked and unlocked, lamp on
and off) in an accurate way.
Figure 3 shows the experimental result for object detection using the object
segmentation technique. First, the object segmentation technique is used to find all
possible devices as one or more bounding boxes in a frame with an object label (see
Fig. 3); these are then fed into the image classification model to predict the state of each
device, which is compared with the original state to verify the device functionality in
the IoT ecosystem, similar to how a human would visualize and verify the same.
Classification of images involves assigning an image to a label, while localization of
objects involves drawing a bounding box around one or more objects in an image (see
Table 1).
Fig. 3. Objects detection by segmentation technique

Table 1. Accuracy based on ~11 objects

Detected/actual      Actual positive    Average accuracy    Average error
Detected positive    10                 90.54%              9.45%

In order to train the model, the framework adopted transfer learning techniques to
avoid training on huge datasets: training with limited datasets of augmented IoT state
images can be achieved by adding a separate convolution layer on top of the base
pre-trained model (see Fig. 4).

Fig. 4. CNN model with transfer learning techniques

The above figure shows the CNN model [4] with transfer learning techniques, where
the model is trained by adding a separate convolution layer (see Table 2).
The real environment contains many IoT devices in any given visual frame. The
sample set added for the top layer has 5838 images covering 14 states and yields an
output accuracy of ~92.33%.
Table 2. Accuracy of IoT device state prediction

Device state          Training size    Accuracy %
Smart bulb off        676              100%
Smart bulb on         685              100%
Smart bulb red        471              99%
Smart bulb blue       455              93%
Water valve open      367              95%
Water valve close     380              99%
Door lock locked      448              98%
Door lock unlocked    441              95%

This framework has the flexibility to auto-train the model with the images captured
for the device states of new IoT devices added to the IoT ecosystem. The captured
datasets should be labelled for each device state and should comprise device state
images with varied angles, distances and light exposures, so that the right states can be
identified for the device (e.g. door lock open, close). A minimal transfer-learning sketch
of this setup follows.
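This sketch assumes Keras/TensorFlow with MobileNetV2 as the frozen pre-trained base; the paper does not name its base network, so that choice and the layer sizes are illustrative assumptions.

import tensorflow as tf
from tensorflow.keras import layers, models

NUM_STATES = 14  # e.g. bulb on/off/red/blue, valve open/close, lock locked/unlocked, ...

# frozen pre-trained convolutional base (assumed choice: MobileNetV2/ImageNet)
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False

# separate trainable convolution layer added on top of the frozen base, so a
# limited dataset of augmented IoT device-state images suffices for training
model = models.Sequential([
    base,
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(NUM_STATES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])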

4.2 Power Based Validation Technique

The framework offers a power based validation technique to validate the state of IoT
devices such as sirens and speakers, which are connected through a smart plug and have
no user interface or visible change of device state; it does so by capturing energy
consumption values from the smart plug. The framework captures the user action on the
connected IoT device, the controller then sends a request to the power module to capture
the energy consumption values through client logs, and these values are analyzed so
that the action is determined using them together with the threshold value before the
action.
The power value change when an action is performed is calculated as

$$P_{ch} = \frac{E_{act} - E_{th}}{t} \qquad (1)$$

where $P_{ch}$ is the power value change when the action is performed, $E_{act}$ is the
energy consumption when the action is performed, $E_{th}$ is the threshold energy
consumption, and $t$ is the time period.
Figure 5 shows the power change for different IoT devices from the idle state to the
action state and from the action state back to the idle state.

Fig. 5. Power change graph (idle-action-idle)
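A minimal sketch of this check follows; the helper names, the units (joules and seconds) and the per-device minimum delta are illustrative assumptions rather than the framework's actual values.

def power_change(e_action: float, e_threshold: float, period_s: float) -> float:
    """P_ch = (E_act - E_th) / t, per Eq. (1)."""
    return (e_action - e_threshold) / period_s

def device_acted(e_action: float, e_threshold: float, period_s: float,
                 min_delta_w: float = 1.0) -> bool:
    # the device is judged to have changed state when the power change
    # exceeds a per-device minimum delta (assumed value, in watts)
    return power_change(e_action, e_threshold, period_s) > min_delta_w

# e.g. a siren drawing 540 J during a 60 s action window against a 30 J idle baseline
print(device_acted(e_action=540.0, e_threshold=30.0, period_s=60.0))  # True (8.5 W)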

4.3 OCR Based Validation Technique


The framework has a novel approach to recognizing text using OCR [5] and validating
the real device end-to-end state. A Pre Post OCR based validation technique is used to
validate the displays of GUI oriented IoT devices (e.g. TV, air conditioner, air purifier
etc.); this method is newly derived from the parts below.

[1] Image pre-processing (before OCR)
[2] OCR of pre-processed image (OCR)
[3] Word/character correction of OCR result (after OCR)

Fig. 6. Pre Post OCR Model
Figure 6 shows the Pre Post OCR Model, an advanced text recognition model
combining image pre-processing and text processing [6]. As shown in Fig. 6, the
captured raw image is fed to the image correction module, which corrects the image
using multiple image processing algorithms; the corrected image is then sent to the text
area detection module.
In this module, text areas are detected and cropped into small images which contain
only text. After this, each cropped image is fed to the OCR model, and the result is sent
to the word/character correction module to correct the words and characters in the
obtained results.
The word/character correction model [7] contains detector and translator sub-models.
First, the OCR output is provided to the detector model, which detects erroneous
word/character sequences; these erroneous characters/words are then sent to the
translator model, where the sequences are corrected using a language dictionary.
Finally, we receive improved and corrected text results, which are sent to the validation
module to validate the end-to-end results.
Table 3 shows the improved performance of our OCR method using the Pre Post
OCR model. A sketch of this pre/post OCR flow is given after the table.

Table 3. Accuracy based on ~300 dataset

Test dataset OCR model Accuracy %


300 Normal OCR Accuracy 86.00%
300 Pre Post OCR Accuracy 96.40%
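The sketch below assumes OpenCV and pytesseract (which requires the Tesseract binary); the real detector/translator correction models are learned, so the dictionary fix-up here is only an illustrative stand-in, and the file path and vocabulary are assumptions.

import cv2
import pytesseract

def pre_post_ocr(image_path: str, vocabulary: set) -> str:
    img = cv2.imread(image_path)
    # image pre-processing (before OCR): grayscale + Otsu binarization
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # OCR of the pre-processed image
    raw_text = pytesseract.image_to_string(binary)
    # word correction (after OCR): keep known words, flag the rest
    corrected = [w if w.lower() in vocabulary else "<" + w + "?>"
                 for w in raw_text.split()]
    return " ".join(corrected)

# e.g. validating an air conditioner display against its expected display words
print(pre_post_ocr("ac_display.png", vocabulary={"cool", "18", "fan", "auto"}))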

4.4 ML Based Sound Validation Technique


The framework uses a TensorFlow sound classification model [8]. We formalize the
sound classification model over a series of test sets and classify the sounds. The
prediction model compares the real IoT device sound recorded by the framework with
the trained reference sound (e.g. doorbell, SmartTag) and sends the matched sound to
the validation module to validate the final results. The sound recording [9] feature is an
integral part of the framework; it helps to record IoT device sounds when an action
happens on the device. Sounds are recorded continuously through the system mic, with
surrounding or external noise masked by a noise cancellation algorithm, and media
recorder [10] APIs are used to capture the sounds.

Fig. 7. Sound prediction model

Figure 7 shows the sound prediction model. This framework has the flexibility to
auto-train the model with audio samples recorded for various device states and triggers,
for the validation of new IoT devices added to the IoT ecosystem. New audio samples
need to be trained with different sound level variations to classify the relevant sound
[11] for the specific device states (see Table 4). A small sketch of such reference-sound
matching follows.
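This sketch assumes librosa; the framework itself uses a trained TensorFlow classifier, so the MFCC cosine-similarity matcher below is a simplified stand-in, and the file paths are assumptions.

import numpy as np
import librosa

def mfcc_signature(path: str) -> np.ndarray:
    # load mono audio and summarize it as a mean MFCC vector
    y, sr = librosa.load(path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20).mean(axis=1)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_match(recorded_path: str, references: dict) -> str:
    # pick the trained reference sound most similar to the recorded device sound
    rec = mfcc_signature(recorded_path)
    return max(references,
               key=lambda name: cosine(rec, mfcc_signature(references[name])))

refs = {"doorbell": "ref_doorbell.wav", "smarttag": "ref_smarttag.wav"}
print(best_match("captured_action.wav", refs))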

5 Performance Analysis of Validation Techniques


The results below show the processing time for the different validation techniques used
to validate the end-to-end state of different IoT devices. The results represent the test
execution of a ~500 test set across different IoT devices (see Table 5).

Table 4. Accuracy based on ~100 dataset

Average accuracy 93.04%


Average error 06.55%

Table 5. Performance analysis on ~500 test set

Validation techniques Avg. execution time in [s]


Power Based Validation Technique 37.11
ML Based Image Validation Technique 40.45
ML Based Sound Validation Technique 36.14
OCR Based Validation Technique 38.33

6 Unified Script Creator for All IoT Products

Every automation framework is equipped with a script creation mechanism, which plays
a pivotal role as the input for any automation tool's execution. Hence this framework
incorporates an intelligent script creator which creates scripts in no time for all IoT
devices. The framework requires the user to perform the execution manually in the
SmartThings application, and the tool automatically creates test scripts, capturing all
core parameters such as pre-conditions, execution steps, client response, server response
and application layout elements for complete end-to-end validation (see Fig. 8). The
average speed of script creation is approximately 180 s, depending on the precondition,
execution flow and type of validation technique.

Fig. 8. End-to-end script creation flow



To avoid script modification for each new release due to UI changes, the framework
has another intelligent technique for auto-updating the scripts based on changes in the
user interface of the application under test, for example when the SmartThings UI
layout or elements change in a new release and existing scripts start failing due to the
UI changes.
The framework provides the option to execute the same scripts on both the old and
the new app releases, and it auto-corrects the scripts by comparing both execution
results. Based on the results, the framework checks each step and scenario which failed,
corrects each failed step or scenario by traversing the layout XMLs of both releases,
and finally updates the complete scripts and re-executes the corrected scripts to ensure
quality. Due to this script auto-correction feature, we are able to reduce the huge manual
effort of script review and modification and speed up the product quality evaluation
process.

7 Seamless Execution and Engineering Values


An IoT home system connected with many devices requires a scalable and stable
automation framework that supports parallel execution on single and multiple
user-account devices simultaneously (see Fig. 9).

Fig. 9. Seamless execution for multiple devices

This seamless method is well exercised in the framework; it has reduced execution
time and shortened the time to market for new devices and major enhancements.
The framework also measures core engineering values of the IoT home system,
such as latency from all layers (client, server, device interaction and the end-to-end
pipeline), runtime memory consumption, and user device battery and thermal behavior.
The framework can be deployed in the early stages of IoT device feature development,
as it helps to assure the quality and stability of specific use cases in regular iterations.

8 Results and Impact


Framework deployed for all 25 IoT home devices (e.g. water leak sensor, door lock,
cctv, smart bulb, door camera, siren, air conditioner, TV, refrigerator…) and performed
452 rounds of test execution (see Table 6).

Table 6. Results and impact analysis

IoT devices Execution rounds Man days saved Test coverage


ZigBee 154 770 91%
Z-Wave 160 800 94%
Wi-Fi 138 690 93%
Total 452 2260 93%

Manual Effort Reduction Calculation

Let us assume that the number of TCs used for testing is $x$ and the number of TCs
kept for one test cycle is $y$; hence $x/y$ is the number of test cycles.
For manual testing, one tester can test at most 100 TCs in one day, so $y/100$ is the
number of TCs for one test cycle divided by the number of TCs executed by one tester.
The total number of testers required is

$$\frac{y}{100} \cdot \frac{x}{y} = \frac{x}{100} \qquad (2)$$

For automation, 1400 TCs can be executed in one day using one test setup, so
$y/1400$ is the number of TCs for one test cycle divided by the number of TCs executed
by automation. The total number of testers required is

$$\frac{y}{1400} \cdot \frac{x}{y} = \frac{x}{1400} \qquad (3)$$

Hence, using (2) and (3), the percentage decrease in manpower is calculated as

$$\frac{x/100 - x/1400}{x/100} \times 100 = \frac{13}{14} \times 100 \approx 92.85\% \qquad (4)$$

where $x$ is the number of test cases per test cycle, the number of test cases executed
manually by one engineer per day is 100 TCs, and the number of test cases executed by
automation per day is 1400 TCs.

Effectiveness Results
The framework played a vital role in assuring the quality and stability of IoT products
in the ecosystem, saving huge manual effort across all products:

• 356 defects were found across all IoT products.
• Test coverage improved from ~68% to ~93%.
• ~92.85% of manual effort was reduced.

9 Conclusion and Future Scope


This framework has been successfully deployed for all IoT products; it improved the
test coverage from 68% to 93% and played a crucial role in differentiating the quality
of IoT products in the ecosystem. The framework enabled end-to-end automation of
25 IoT home devices (e.g. water leak sensor, door lock, cctv, siren, smart bulb, door
camera, air conditioner, TV, refrigerator…) with a high and consistent accuracy of
~98%. The framework has been successfully deployed on the Samsung IoT system
(SmartThings) and supports Korean and European languages.
As the prevalence of IoT innovations grows, quality and automation teams should
expect the opportunities and challenges posed by ever newer types of connected devices
in the ecosystem. The framework has been designed to expand to upcoming IoT devices
in the market, as it already incorporates all the multimedia validation techniques with
an AI/ML powered system. As future work, the framework has the potential to be
enhanced further into an open-source global community tool providing a common
framework to validate all IoT home products in the market.

References
1. Yadav, G.: IoT-PEN: an E2E penetration testing framework for IoT. J. Inf. Process. 28, 633–
642 (2020)
2. Bures, M.: Framework for integration testing of IoT solutions. In: International Conference
on Computational Science and Computational Intelligence, CSCI (2017)
3. Saha, P., Nath, U.K., Bhardawaj, J., Paul, S., Nath, G.: Building image classification using
CNN. In: Gunjan, V.K., Suganthan, P.N., Haase, J., Kumar, A. (eds.) Cybernetics, Cognition
and Machine Learning Applications. AIS, pp. 307–312. Springer, Singapore (2021). https://
doi.org/10.1007/978-981-33-6691-6_34
4. Indolia, S.: Conceptual understanding of convolutional neural network - a deep learning
approach (2018)
5. Smith, R.: An overview of the Tesseract OCR engine. In: Document Analysis and Recognition
(ICDAR 2007) Ninth International Conference (2007)
6. Dome, S.: OCR using Tesseract and classification. In: IEEE Internal Conference (2021)
7. Kissos, I.: OCR error correction using character correction and feature-based word classifi-
cation. IEEE (2016)
8. Khamparia, A.: Sound classification using convolutional neural network and tensor deep
stacking network. IEEE Publication, January 2019
9. Young, G.: Sound Recording Technologies: Data Storage Media and Playback Equipment.
Techno-science.ca (2006)
10. Yaohuan, W.: Windows API Based Recording and Playback for Waveform Audio.
en.cnki.com.cn (2005)
11. Hershey, S.: CNN architectures for large-scale audio classification. In: International Confer-
ence on Acoustics, Speech and Signal Processing (ICASSP). IEEE (2017)
Application of Magic Squares
in Cryptography

Narbda Rani(B) and Vinod Mishra

Department of Mathematics, Sant Longowal Institute of Engineering and Technology,


Longowal, Sangrur 148106, Punjab, India
narmadasharma1990@gmail.com

Abstract. In this paper, the magic square is used as a substitution
cipher in cryptography. A method for encryption and decryption
of information is proposed by constructing magic squares with the help
of Narayana's folding method and the Knight's move method. The validity
of the proposed model has been analyzed through the encryption and decryption
of alphabets, numeric digits and symbols. Also, both methods used for
the construction of magic squares are represented in a different way
compared to the traditional presentation.

Keywords: Magic square · Cryptography · Knight's move method · Narayana's folding method · Arithmetic progression

1 Introduction

A magic square is a square matrix of order n with the additional property that
the sum of elements in each row, each column, main diagonal and anti diagonal
always remain the same. The fixed sum associated with magic square is known as
magic sum. The magic square in which only row and column sum remains fixed
and the condition for diagonals is not required is known as semi-magic square.
Emperor Yu of China is supposed to have been the first to discover a magic
square, marked symbolically on the back of a divine tortoise. After that,
a lot of work on magic squares was done in India. The work of the ancient seer
Garga contain several 3 × 3 magic squares. The Buddhist philosopher Nagarjuna
(c. 2nd century AD) gave a general class of 4×4 magic squares. In Brihatsamhita
of Varahamihira (c. 550 AD), a description of a 4 × 4 magic square, referred to
as sarvatobhadra, was found. The 4 × 4 pan-diagonal magic square was found
at the entrance of Jaina Temple at Khajuraho in 12th century as described
in [4]. The construction of magic squares was done in 1356 AD by Narayana
Pandita in his celebrated work Ganitakaumudi. He discussed the general methods
for the construction of samagarbha (doubly-even), visamagarbha (singly-even)
and visama (odd) magic squares [7]. During 16th century, the Italian and the
Japanese mathematicians made an extensive study on the properties of magic
squares. Even these days, the study of magic squares is widespread in Tibet
and Malaysia, which have close connections with China and India. The conditions
that assured the magic properties of magic and semi-magic squares are given
in [2] with the help of their powers. Sreeranjini and Mallayya in [5,6] made an
extensive study on some special properties of magic squares, and also worked
on the eigenvalues and dot products of third order magic squares. An extensive
study on the magic squares and cubes is elaborated in [1] by Andrews.
Cryptography is the art of hiding secret information from intruders. In a
substitution cipher, the plain text is encrypted by swapping each letter or symbol
in the plain text for a different symbol or letter [10]. In [8], Tomba and Shibiraj
describe the construction of doubly-even magic squares using basic
Latin squares. They have also introduced an algorithm for the implementation
of the Hill and magic square ciphers with the help of singly-even magic squares
[9]. Lok and Chin in [3] used two orthogonal Latin squares for the generation
of cipher text.
In this paper, the magic squares constructed using Narayana's folding
method and the Knight's move method have been used for the encryption and
decryption of information to ensure the secure transfer of data from sender to
receiver. The algorithm is explained for the case of a fifth order
magic square, but it can be applied to any fifth order magic square constructed
using Narayana's folding method or the Knight's move method. In addition,
both methods of construction of magic squares have been represented
using arithmetic progressions rather than numeric digits. The proposed
encryption/decryption method is valid for alphabets, numeric digits and symbols,
not alphabets alone. The level of security has also been increased by using the
elements of magic squares rather than any fixed sequence of characters or digits.

2 Narayana’s Folding Method

(a) For construction of Samagarbha magic squares:


Narayana’s folding method involves the construction of two auxiliary magic
squares, called as covered and coverer, which are added together to produce the
required magic square like the folding of the palms. The method is explained by
taking the case of fourth order magic squares. Now, consider a, a+d, a+2d, a+3d
as the first sequence and b, b + D, b + 2D, b + 3D as the second sequence. Let S
be the given magic sum of the required magic square. Let S1 and S2 be the sum
of the terms of the first and second sequence respectively. Firstly, find S − S1
and then evaluate guna by dividing S − S1 by S2 . After that each term of the
sequence is multiplied by guna for getting product sequence as under:
bg, (b + D)g, (b + 2D)g, (b + 3D)g

where

$$g = \text{guna} = \frac{S - S_1}{S_2} = \frac{S - (4a + 6d)}{4b + 6D}$$
Now, construct the covered and coverer from the first sequence and the product
sequence. The two sequences are reversed after half of the square is filled. The
cells of the coverer are filled horizontally and those of the covered vertically.
Half of the square is completed in order and the other half in reverse order. On
adding these two squares so formed, like the folding of palms, the required magic
square is obtained as under:
(a + d) + (b + 3D)g    (a + 2d) + (b + 2D)g    (a + d) + bg     (a + 2d) + (b + D)g
a + bg                 (a + 3d) + (b + D)g     a + (b + 3D)g    (a + 3d) + (b + 2D)g
(a + 2d) + (b + 3D)g   (a + d) + (b + 2D)g     (a + 2d) + bg    (a + d) + (b + D)g
(a + 3d) + bg          a + (b + D)g            (a + 3d) + (b + 3D)g    a + (b + 2D)g

On interchanging the covered and the coverer, another magic square of order four is
obtained.
(b) For construction of visama magic squares:
For the construction of odd ordered magic squares, the first sequence and the
product sequence are determined by following the procedure used in the construction
of samagarbha magic squares. The steps for constructing the covered and coverer
are as follows: start from the center cell of the first row by filling in the first
term of the first sequence, and then write all the other terms in order below the
first term. The rest of the terms are inserted in order from above to complete the
covered. The coverer is also completed in the same way. The addition of the covered
and coverer like the folding of palms gives the desired magic square as shown below:
(a + 2d) + (b + D)g    a + bg                  (a + d) + (b + 2D)g
a + (b + 2D)g          (a + d) + (b + D)g      (a + 2d) + bg
(a + d) + bg           (a + 2d) + (b + 2D)g    a + (b + D)g

3 Generation of Cipher Text Using Narayana's Folding Method

Step-1: Start the construction of a magic square of order five with magic sum
115 by applying Narayana’s folding method. Begin with the construction of two
auxiliary squares named as covered and coverer. So, firstly take two sequences
1, 2, 3, 4, 5 as the base sequence and 0, 1, 2, 3, 4 as the second sequence respec-
tively.

Sum of terms of base sequence = 15


Sum of terms of second sequence = 10
Multiplier = (115 − 15)/10 = 10
Now, multiplication of the elements of second sequence by 10 gives the product
sequence 0, 10, 20, 30, 40. To construct the covered and coverer, insert the terms
of the base sequence and product sequence by starting at the center cell of the
topmost row and below this place the remaining terms in order in downward
direction. The rest of the cells are filled by entering the numbers of the sequence,
going from left to right in order, beginning at the previously occupied center column.
Covered:
4 5 1 2 3
5 1 2 3 4
1 2 3 4 5
2 3 4 5 1
3 4 5 1 2

Coverer:
30 40 0 10 20
40 0 10 20 30
0 10 20 30 40
10 20 30 40 0
20 30 40 0 10

Step-2: Add the elements of the covered and coverer by the process of folding, which
involves the covering of the covered by the coverer, just like in the folding of the
palms. Mathematically, if A1 and A2 are two auxiliary squares of order n, then the
elements of the resulting magic square A are obtained by the formula below (note
that the fold mirrors the coverer's columns):

$$A(i, j) = A_1(i, j) + A_2(i, n - 1 - j) \quad \forall\, 0 \le i, j \le n - 1$$

Thereby, the resulting magic square is shown as under:

24 15 1 42 33
35 21 12 3 44
41 32 23 14 5
2 43 34 25 11
13 4 45 31 22
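A short sketch of this construction follows; the helper names are ours, and the folding step uses the column-mirroring rule stated above, which reproduces the square just shown.

def auxiliary_square(seq, n):
    # place seq down the centre column of the top row, shifting right by one
    # column per copy, exactly as in the covered/coverer squares above
    mid = n // 2
    return [[seq[(row + col - mid) % n] for col in range(n)] for row in range(n)]

def narayana_fold(base, product, n):
    covered = auxiliary_square(base, n)
    coverer = auxiliary_square(product, n)
    # fold like the palms: the coverer is mirrored left-to-right onto the covered
    return [[covered[i][j] + coverer[i][n - 1 - j] for j in range(n)]
            for i in range(n)]

square = narayana_fold([1, 2, 3, 4, 5], [0, 10, 20, 30, 40], 5)
for row in square:
    print(row)                                 # reproduces the square above
assert all(sum(row) == 115 for row in square)  # magic sum 115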

Step-3: Read each entry of the above magic square as a base-6 numeral and convert
it to base 10 (e.g. 24 becomes 2 × 6 + 4 = 16), reducing the elements of the magic
square.

16 11 1 26 21
23 13 8 3 28
25 20 15 10 5
2 27 22 17 7
9 4 29 19 14
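A one-line sketch of this base change follows, applied to the Step-2 square written out as a literal:

square = [[24, 15, 1, 42, 33], [35, 21, 12, 3, 44], [41, 32, 23, 14, 5],
          [2, 43, 34, 25, 11], [13, 4, 45, 31, 22]]  # the Step-2 square
# e.g. 24 -> int("24", 6) = 2*6 + 4 = 16
reduced = [[int(str(v), 6) for v in row] for row in square]
print(reduced[0])  # [16, 11, 1, 26, 21]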

Step-4: (a) Arrange the alphabets in the form of a 5 × 5 array as follows:

A B C D E
J I H G F
K L M/N O P
U T S R Q
V W X Y Z

All the letters are arranged in a horizontal zigzag pattern, with M and N in the
same center cell, because a magic square of order five contains only 25 entries.

(b) For encryption of the numerical digits and the 16 special characters available on
a computer keyboard, arrange the digits from 0 to 9 along with the special characters
in a 5 × 5 array in a vertical zigzag pattern as follows:

0 9 ! | \
1 8 @ ∗ ;
2 7 , or . & :
3 6  ˆ 
4 5 $ % ?

Step-5: (a) Assign numerical values to each alphabet for the substitution process.

A B C D E F G H I J K L M
1 2 3 4 5 6 7 8 9 10 11 12 13
N O P Q R S T U V W X Y Z
14 15 16 17 18 19 20 21 22 23 24 25 26
30 29 28 27

(Continuing the zigzag into a third row, the letters W, X, Y, Z additionally receive
the values 30, 29, 28, 27.)

(b) Assign the numerical values to all the digits and special characters for performing
the substitution process, but in reverse order, by assigning 1 to the last entry.

0 1 2 3 4 5 6 7 8 9 ! @ ,
26 25 24 23 22 21 20 19 18 17 16 15 14
27 28 29 30

(Continuing the zigzag, the digits 0, 1, 2, 3 additionally receive the values 27, 28,
29, 30.)

.  $ % ˆ & ∗ | \ ; : ?
13 12 11 10 9 8 7 6 5 4 3 2 1

If the entries in a magic square are large enough even after the change of base, then
assign the numeric values in the zigzag form up to that particular large value in
both tables.
Step-6: (a) The comparison of tables given in Steps-3 and 4(a) generates the
following table:

A  B  C  D  E  F  G  H  I  J  K  L  M/N
16 11 1  26 21 28 3  8  13 23 25 20 15

O  P  Q  R  S  T  U  V  W  X  Y  Z
10 5  7  17 22 27 2  9  4  29 19 14
(b) The comparison of tables given in Steps-3 and 4(b) forms the table as below:

0  1  2  3  4  5  6  7  8  9  !  @  ,/.
16 23 25 2  9  4  27 20 13 11 1  8  15

#  $  %  ˆ  &  *  |  \  ;  :  /  ?
22 29 19 17 10 3  26 21 28 5  7  14

Step-7: (a) Replacing the numerical values in Step-6(a) by the alphabets of
Step-5(a) gives the table shown below:

A B C D E F G H I J K L M/N
P K A Z U Y C H M W Y T O

O P Q R S T U V W X Y Z
J E G Q V Z B I D X S N
(b) The replacement of the numerical values in Step-6(b) with the digits and
special characters of the table in Step-5(b) leads to the table shown below:

0 1 2 3 4 5 6 7 8 9 ! @ ,/.
! 3 1 / ˆ ; 0 6 . $ ? & @

# $ % ˆ & * | \ ; : / ?
4 2 7 9 % : 0 5 1 \ * ,
In the above tables, the first line of each is used for the plain text and the second
for the cipher text. Finally, these tables are used to perform encryption and
decryption of information that needs to be sent through an unsecured network
from one place to another. Decryption is done by following the tables depicted
in Step-7 above in reverse order. The complexity of this approach can be
increased by exchanging the vertical and horizontal patterns of the tables formed
in Steps-4 and 5 repeatedly after the encryption of each alphabet/digit/special
character.

Example 1: The encrypted form of a message “The monk who sold his ferrari”
is “ZHU OJOY DHJ VJTZ HMV YUQQPQM”.
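For illustration, the substitution of Step-7(a) can be applied mechanically. The
following Python sketch (an illustration, not part of the original paper)
hard-codes that table and reproduces Example 1:

```python
# Substitution table from Step-7(a) of Narayana's folding method.
plain_letters  = "ABCDEFGHIJKLMOPQRSTUVWXYZ"   # M and N share one entry
cipher_letters = "PKAZUYCHMWYTOJEGQVZBIDXSN"
table = dict(zip(plain_letters, cipher_letters))
table["N"] = table["M"]                        # N shares the centre cell

def encrypt(text: str) -> str:
    """Substitute each letter; leave spaces and other symbols unchanged."""
    return "".join(table.get(c, c) for c in text.upper())

print(encrypt("The monk who sold his ferrari"))
# ZHU OJOY DHJ VJTZ HMV YUQQPQM
```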

Example 2: The encrypted form of the plain text “Krishan ki Maya” is given by
“YQMVHPO YM OPSP”.

Example 3: The registration number of some student given in plain text
“PMA/1801” can be encrypted as “EOP∗3.!3”.

4 Knight’s Move Method


Begin by inserting the first term of an arithmetic progression (A.P.) in the middle
cell of the first row. Then proceed from one column to the next by using the
knight's move: one cell to the right and two cells downward. Using this move,
insert the terms of the A.P. sequentially until the nth term has been placed.
Whenever a side of the square is reached, continue the knight's move on the
opposite side, treating the square as wrapped around. When the nth term has
been placed and no further movement is possible, because the next cell is
already filled, apply a break-move. Since many break-moves are available, any
one of them may be applied; here the break-move of one cell downward has been
used. Apply these knight's moves repeatedly to complete the magic square. If
x, x + y, x + 2y, ..., x + (n² − 1)y is any A.P. with x the initial term and y the
common difference, then the construction of the fifth-order magic square is
given as under:
x + 22y  x + 11y  x        x + 19y  x + 8y
x + 3y   x + 17y  x + 6y   x + 20y  x + 14y
x + 9y   x + 23y  x + 12y  x + y    x + 15y
x + 10y  x + 4y   x + 18y  x + 7y   x + 21y
x + 16y  x + 5y   x + 24y  x + 13y  x + 2y

Substituting x = y = 1 yields the magic square consisting of the numbers from
1 through n², i.e., a normal magic square. Note that only one knight's move has
been introduced here, but many other knight's moves exist, and any of them can
be used for odd-order magic squares.
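The construction lends itself to a short program. The sketch below (an
illustration assuming the one-cell-down break-move described above)
reproduces, for x = y = 1, the fifth-order square of Step-1 in the next
subsection:

```python
import numpy as np

def knight_magic_square(n, x=1, y=1):
    """Odd-order magic square by the knight's move: one column to the
    right and two rows down, wrapping around the edges, with a break-move
    of one cell down whenever the target cell is already occupied."""
    square = np.zeros((n, n), dtype=int)
    i, j = 0, n // 2                        # middle cell of the first row
    for k in range(n * n):
        square[i, j] = x + k * y            # k-th term of the A.P.
        ni, nj = (i + 2) % n, (j + 1) % n   # knight's move
        if square[ni, nj]:                  # occupied: apply break-move
            ni, nj = (i + 1) % n, j
        i, j = ni, nj
    return square

print(knight_magic_square(5))   # matches the square in Step-1 below
```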

4.1 Generation of Cipher Text Using Knight’s Move Method

Step-1: Construct the fifth-order magic square with the above knight's move
method as below:

23 12  1 20  9
 4 18  7 21 15
10 24 13  2 16
11  5 19  8 22
17  6 25 14  3

Step-2: Same as Step-4 of the generation of cipher text using Narayana's folding
method.
Step-3: Same as Step-5 of the generation of cipher text using Narayana's folding
method, except that the values are assigned only up to 25.
Step-4: (a) On comparing the tables shown in Steps-1 and 2(a), the required
table is as under:

A  B  C  D  E  F  G  H  I  J  K  L  M/N
23 12 1  20 9  15 21 7  18 4  10 24 13

O  P  Q  R  S  T  U  V  W  X  Y  Z
2  16 22 8  19 5  11 17 6  25 14 3
(b) Similarly, the comparison of the tables formed in Steps-1 and 2(b) results in
the following table:

0  1  2  3  4  5  6  7  8  9  !  @  ,/.
23 4  10 11 17 6  5  24 18 12 1  7  13

#  $  %  ˆ  &  *  |  \  ;  :  /  ?
19 25 14 8  2  21 20 9  15 16 22 3

Step-5: (a) The replacement of the numerical values of the table in Step-4(a) by
the letters of the table made in Step-3(a) is represented as below:

A B C D E F G H I J K L M/N
W L A T I O U G R D J X M

O P Q R S T U V W X Y Z
B P V H S E K Q F Y N C
(b) Replace the numerical values in the table of Step-4(b) with the digits and
special characters of the table in Step-3(b) as follows:

0 1 2 3 4 5 6 7 8 9 ! @ ,/.
3 ; % $ 9 | \ 2 8 # ? * .

# $ % ˆ & * | \ ; : / ?
7 1 , & / 5 6 ˆ @ ! 4 :
These final tables are used for the encryption and decryption processes.
Decryption is done by applying the same steps in reverse. Also, to increase the
complexity of the cipher text, the horizontal and vertical patterns used above in
Steps-2 and 3 can be interchanged randomly.

Example 4: The encrypted cipher text of “The monk who sold his ferrari” is
“EGI MBMJ FGB SBXT GRS OIHHWHR”.
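The whole pipeline of Steps 1-5 for the knight's method can also be composed
programmatically. The sketch below (an illustration under the same assumptions
as the earlier snippets) derives the letter substitution from the magic square and
the zigzag array and reproduces Example 4:

```python
# Knight's-move magic square from Step-1 and the zigzag array of letters.
square = [[23, 12,  1, 20,  9],
          [ 4, 18,  7, 21, 15],
          [10, 24, 13,  2, 16],
          [11,  5, 19,  8, 22],
          [17,  6, 25, 14,  3]]
zigzag = ["ABCDE", "JIHGF", "KLMOP", "UTSRQ", "VWXYZ"]  # M/N share a cell

value_of = {ch: square[i][j]            # Step-4(a): letter -> square value
            for i, r in enumerate(zigzag) for j, ch in enumerate(r)}
value_of["N"] = value_of["M"]

letters = "ABCDEFGHIJKLMNOPQRSTUVWXY"   # Step-3(a): A = 1, ..., Y = 25
cipher = {p: letters[v - 1] for p, v in value_of.items()}

msg = "The monk who sold his ferrari".upper()
print("".join(cipher.get(c, c) for c in msg))
# EGI MBMJ FGB SBXT GRS OIHHWHR
```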

5 Conclusion

In this paper, the encryption and decryption of information have been performed
by constructing magic squares with the help of the famous Narayana's folding
method and the knight's move method. The use of the elements of a magic square
makes the process of converting plain text to cipher text complex, so that it
becomes difficult for any hacker to decode it. Filling the matrices in horizontal
and vertical patterns raises the level of randomization in the cipher text, which
enhances the security of the data that needs to be transmitted through a network.
The validity of the proposed model has been analyzed by the encryption and
decryption of alphabets, numeric digits, and symbols. Thus, it is possible to use
magic squares constructed by any method for the encryption and decryption of
text messages. The level of security and complexity can be increased further by
randomly applying rotations and flips to the magic squares. In the future, we
will try to propose an image encryption model using the elements of a magic
square instead of its properties only.

Conflict of Interests. The authors declare that they have no known competing finan-
cial interests or personal relationships that could have appeared to influence the work
reported in this paper.

Application of Artificial Intelligence in Waste
Classification Management at University

Dongxu Qu1,2(B)
1 Sumy National Agrarian University, Sumy, Ukraine
qudongxu.123@163.com
2 Henan Institute of Science and Technology, Xinxiang, Henan, China

Abstract. Economic development and the abundance of human material life have
brought about the increasingly serious resource waste and garbage accumulation.
Waste classification recycling is the main approach to realize waste recycling
and reduction. Currently, the achievements of universities in waste classification
management are still far from the expectations, so people are increasingly aware
of the problems associated with conventional methods of waste collection. The
development of artificial intelligence (AI) technology provides a new alternative to
achieve efficient waste sorting and recycling. This study systematically reviews the
application of AI technology in waste classification management, mainly involv-
ing the relevant research and practical products of AI in the processes of waste
dumping, classification collection, and sorting. To explore the potential of AI tech-
nology applied in waste classification management on campus, this study takes
H University in China as a case to analyze the actual situation of the waste clas-
sification management program on campus, including the basic situation of all
kinds of the waste streams from generation to final treatment. Based on the above
results, we attempt to propose an alternative scheme for the existing waste clas-
sification management program in H University, that is, the framework of waste
classification management based on AI technology, to improve the performance
of waste classification recycling on campus. Our findings could provide references
for the universities that are committed to waste classification recycling and for the
promotion of the municipal intelligent waste management system in cities.

Keywords: Artificial intelligence · Automatic sorting · Recycling · Waste


classification · Waste stream

1 Introduction
Resource recycling and environmental protection are the sustainable lifestyles advocated
by all countries in the world. The emergence of a significant amount of waste has
added obstacles to realizing this vision [1]. The more waste we produce, the more
we waste resources and create hazardous situations for biological life [2].
Implementing waste classification recycling could reduce the waste of valuable resources
and lower the negative impact of waste on the environment. Since 2019, China has
entered a new stage of institutionalization of waste classification, which is reflected in


the fact that many cities have begun to fully implement the mandatory classification of
domestic waste. As a large-scale community, colleges and universities have relatively
slow progress in their waste classification management programs. Previous studies have
shown that the waste collection scheme, classification method, and management model
on university campuses are still in an improper state, and the effect of classification and
recycling is also unsatisfactory [3, 4]. To improve the effectiveness of waste classification
recycling at universities, researchers generally argue that it is a good solution to improve
the circular economy awareness, attitude, and behaviors through publicity and education
[5, 6]. However, the implementation effect of these strategies in practice cannot be
guaranteed due to the inherent limitations of education. Currently, developments in AI
technology are opening up new alternatives to remedy these manual classification
deficiencies.
AI, based on learning and the construction of thinking models, could improve the
efficiency of waste classification recycling and reduce the amount of solid waste entering landfills
and incineration power plants through intelligent identification, collection, and sorting
of waste. Moreover, the AI waste classification recycling device could make use of Big
Data, Internet of Things and other technologies to realize intelligent waste classification
management with the combination of supervision functions [7]. It can be predicted that
the application of AI technology in waste classification management is a development
direction in the future, which will change the traditional crowd management tactics. At
present, companies committed to AI technology development have launched several AI-based
waste collection and sorting products and applied them in some areas. However, the high
price tag has deterred most business users and communities from promoting these products
to a wider audience. Besides specialized technology companies, universities are also crucial
subjects of AI technology research and development. To date, there
have already been 35 universities in China that have set up AI faculties dedicated to AI edu-
cation promotion and technology research [8], which provides convenient conditions
for the implementation of AI waste classification management programs on campus.
A new low-cost AI waste classification system developed in Liverpool Hope Univer-
sity has confirmed the advantages of universities in applying AI technology to waste
classification recycling management [9].
To further explore the application prospect of AI, this study systematically reviews
the application of AI technology in waste classification management. Then, based on the
actual situation of the waste classification management program of H university in China,
a case study is conducted to analyze various types of waste streams on campus, including
waste generation, classification and disposition. Finally, we propose an AI based waste
classification management framework to replace the existing waste management system
in H universities. Our findings contribute to the efficiency of waste classification in H
University by providing an AI based waste classification management scheme for the
decision makers in universities. Also, it could provide references for the promotion of
intelligent waste management in the smart city construction of the city where it is located.

2 Methodology
H University is located in Xinxiang City of Henan Province and is a provincial general
undergraduate university. Its educational and development scale is at the middle level
among universities in China, so it is a representative choice for a case study. The campus
of H university covers an area of 165.2 ha, with a total floor area of 620,000 square meters,
including Xinxiang campus (undergraduate level) and Huixian campus (junior college
level). There are 21 teaching faculties, among which the AI faculty was just established in
2020. Also, there are 1759 staff members and 29130 full-time students, including 24202
undergraduates and postgraduates in total on Xinxiang campus and 4928 junior college
students on the Huixian campus. The main areas on campus of H university include
teaching areas (12 teaching buildings and two laboratory buildings), residential areas (14
student hostels and 41 teacher apartments), dining areas (five restaurants), public office
areas (one administrative office building and two libraries), and affiliated school (one
primary school and one kindergarten). In this study, we focus on the Xinxiang campus
as the object of investigation on the status quo of waste classification management. The
investigation period is from May to June 2021.
The research process of this study is divided into three stages, and different research
methods are adopted in each stage. The first stage is to use the literature research method
to systematically review the application of AI technology in waste classification manage-
ment, to provide a theoretical basis for the construction of AI based waste management
scheme in H university. In the second stage, representative sites and related dustmen are
selected to investigate the status quo of waste classification management of H university
by combining field sampling survey and interview, including waste output estimation,
waste collection equipment configuration and sorting method. In the third stage, we
attempt to put forward an AI based waste classification management framework suitable
for H University, aiming to address the waste management problems on campus.

3 Literature Review
AI technology has been utilized to improve existing solid waste management schemes
throughout the different stages, from collection to final disposal [10, 11]. In the con-
text of mandatory promotion of waste classification, researchers began to focus on the
application of AI technology in waste classification recycling, which is a crucial approach
to solving the problem of surging waste generation and the low efficiency of manual clas-
sification. The literature review conducted by Abdallah et al. [12] indicated that the
frequently used AI models in waste classification included Artificial Neural Network,
Support Vector Machine, Linear Regression, Decision Trees and Genetic Algorithm. As
for the launched AI products in the field of waste classification recycling, it is mainly
divided into three categories, namely, waste classification software based on AI tech-
nology, AI waste classification container, and AI waste sorting equipment. These three
types of products are respectively applied in the process of waste dumping, classification
collection and sorting treatment.

3.1 AI Application in Waste Dumping Process

As the first step of waste classification, proper waste dumping by residents could not
be replaced by any other automatic classification method. The requirement of classified
dumping is conducive to the promotion of residents’ awareness and behavior of circular

economy, while AI equipment is easily contaminated and damaged by randomly
dumped waste. However, the rules, standards, and subsequent treatment in the process
of waste classification recycling are so complex that they require long-term education
and learning to master.
in some cities, and the knowledge related waste classification is still in an extremely
difficult stage of popularization. Therefore, the AI waste identification and classification
software based on computer vision technology and Big Data has been launched, which
is helpful to assist residents to correctly classify waste in the dumping process [13].
AI software to assist waste classification is mainly based on machine vision technol-
ogy to identify and compare objects for achieving waste type confirmation. Such software
generally comes in the form of WeChat mini programs, Apps and new platform features,
and relies on user groups for promotion and AI database reserve. Specifically, users
could use this kind of AI software to quickly identify the type of waste so that they can
dump waste in correct containers. Meanwhile, the data and feedback generated by users
when using the software could also be used to train AI to improve its performance in
waste classification. For example, AI waste classification software based on Convolu-
tional Neural Network with real-time recognition, voice query, text retrieval and other
functions was designed by some researchers [14, 15]. Also, Chinese companies, such as
Baidu, Alibaba and Tencent, have launched AI waste classification mini programs and
new AI identification function on their respective platforms: WeChat, Taobao and Ali-
pay, and Baidu App. Among them, Alibaba’s AI waste identification and classification
program achieved 90% accuracy in waste classification within half a month of its launch,
with the help of massive user usage data.
In addition, many auxiliary types of AI software have added extra user interac-
tion functions to determine the accuracy of residents’ waste classification, record and
guide residents’ waste classification behaviors. The American Intuitive AI company
designed the Oscar waste classification system that features human-computer interac-
tion and reminds residents of misclassification. The domestic Vivo AI assistant Jovi has
added a voice interactive waste classification function. Also, Xiaoji Technology com-
pany launched the Xiaoji Life mini program and App as an integrated waste classification
platform to assist users quickly classify waste.

3.2 AI Technology Applied in Waste Collection Process


In the waste collection process, AI classification trash cans are currently the most pop-
ular research field and among the most welcomed AI products on the market [16]. In the ideal
state, users only need to put in the waste, and the smart trash cans could automatically
complete all the detailed classification. Researchers have explored the construction of
intelligent waste collection system based on different AI models. Sudha et al. [17] pro-
posed an automatic recognition system that uses deep learning algorithms to classify
biodegradable and non-biodegradable objects, which can recognize and classify objects
almost accurately in real-time. Lokuliyana et al. [18] highlighted an IOT based waste
collection framework that could automate solid waste identification, localization, and
collection processes through layered optimization algorithms. According to Rajamanikam
and Solihin [19], the classification specification of an AI trash bin could be realized by
using machine vision and machine learning technology. Besides, an automatic garbage

classification system based on machine vision was designed to improve the efficiency
of front-end classification in waste collection process [20]. Ruiz et al. [21] compared
the automatic classification of waste types by different deep learning architectures and
concluded that the combined Inception-ResNet model provides the best classification
results. Vrancken et al. [22] discussed the construction of a deep neural network training
database to identify common materials in solid waste streams. Alonso et al. [23] argued
that the automatic recycling of reusable materials from waste could be reached through
the use of Convolutional Neural Networks and image identification. Similarly, Adedeji
and Wang [24] proposed an intelligent waste classification system by using Convolu-
tional Neural Network model which serves as the extractor, and Support Vector Machine
which is used to classify the waste into different types. The AI waste classification sys-
tem based on Convolutional Neural Network could transform ordinary waste cans into
intelligent ones at a lower cost [25, 26].
However, the current AI based automatic classification collection trash bins have
limited processing capacity and throughput in practical application. It is only suitable
for public areas with a small amount of waste, a large flow of people and great difficulty
of supervision, such as shopping malls, parks and office buildings. Polish Bin-E company
and Chinese companies such as Alpheus, Little Yellow Dog and Langdun Technology
have developed AI trash cans, among which the Rui Bucket launched by Alpheus has
been put into use in Shanghai Zhangjiang AI Island with a classification accuracy of
more than 95%.

3.3 AI Technology Applied in Waste Sorting Process


Waste sorting and treatment is the final process and the most critical link to realize waste
recycling and reduction, but it is also the most difficult aspect of waste treatment in
countries worldwide [27]. The relevant research is still in the initial stage because of the high
investment and technical difficulty. Tehrani and Karbasi [28] discussed an electronic
waste plastic identification and separation technology combining hyperspectral imaging
technology and neural network algorithm. Similarly, Chidepatil et al. [29] demonstrated
how to achieve mixed plastic separation by using AI based on multi-sensor data fusion
algorithms, thereby increasing the use of recycled plastic raw materials. According to
latest empirical findings, the main results of the training and operation of the AI robotic
sorting system for the separation of bulky municipal solid waste are promising with
regard to the purity of the sorted waste fractions, while the waste recovery was not so
successful [30].
Additionally, many enterprises and research institutions in the world have carried
out research and development in the field of waste sorting robots. For example, FANUC
in Japan, MIT and Alphabet in the United States and Zorn Robotics in Finland have
all set up waste sorting robot projects. Chinese companies such as CCGC and ONKY
Robotics have also launched waste sorting robot products. However, sorting mainly
targets recyclable waste with high economic value, and there are few practical products
for ordinary recyclable materials. At present, the waste sorting process
in China is still dominated by manual sorting, with only a small amount of mechanical
assistance.

Overall, the research on AI application in the field of waste classification management


covers the three important processes of waste classification recycling, namely, dumping,
collection and sorting. According to the findings of scholars and technology companies,
the AI software for classification education and guidance applied in the dumping process
has been relatively mature, and the results in the collection process are the most abundant.
Also, the relevant research and products in the AI sorting process is still in its infancy due
to the limitations of technology and capital investment. Anyway, the previous research
provides an important basis for the construction of AI based waste classification recycling
system at university, which has tremendous potential for promoting waste classification
into the orbit of circular economy [31].

4 Results and Discussion

From the perspective of waste stream, waste classification management involves the
process of dumping, collection, sorting, transfer and disposal after the waste generation.
However, as a small unit in the social system, universities do not have the conditions and
capacity to manage the entire waste streams of all types from generation to final disposal.
Generally, campus waste is transferred by specialized outsourcing waste transportation
companies to transfer stations designated by the local government for unified disposal
after classification collection and sorting on campus, such as reusing, remanufacturing,
landfill or incineration. Only a small part of the waste can be disposed on campus in some
universities, such as the direct donation of old clothes and the fermentation of garden
waste for fertilizer. Therefore, to be precise, the current waste classification manage-
ment in universities mainly involves the dumping, collection, sorting processes. The AI
based waste classification management discussed in this study also mainly considers the
application of AI technology in the three links of waste stream.

4.1 Estimation of Waste Generation on Campus

After confirming with the waste transportation company of H University, four waste
trucks are arranged to remove the campus waste four times every day; that is, a total
of 16 truckloads of waste are cleared out every day, each weighing about 650 kg. To
estimate the daily waste output per capita, we calculated the total population from the
number of permanent residents on campus, which is the sum of the number of full-time
students on the Xinxiang campus (24202) and the number of resident staff and their
families (2338) (see Table 1).
The calculations are based on the daily waste output while the students are studying
at school, not on data from the summer and winter vacations when students are absent,
so the figure can be compared directly with other Chinese universities that use similar
calculation methods. To understand the difference from foreign universities, this study
selected some universities in the Netherlands for comparison, because the Netherlands
was one of the first countries to implement waste classification and is also one of the
countries that does it best. Unlike Chinese universities, Dutch universities mostly use
annual waste output per capita as an indicator. Considering that students spend 273
days in school per year, excluding summer and winter vacations according to the
university calendar, the average annual output per student at H University is about
90.2 kg.

Table 1. Estimation parameters of waste output on campus.

Number of waste trucks:            4
Clearances per vehicle per day:    4
Vehicle load (kg):                 650
Permanent residents on campus:     26540
Daily waste output (kg):           10400
Per capita daily output (kg):      0.39
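The per capita figure in Table 1 follows directly from these parameters; a quick
arithmetic check (a sketch, not part of the original study):

```python
# Per capita daily waste output from the Table 1 parameters.
trucks, trips_per_day, load_kg = 4, 4, 650
residents = 24202 + 2338            # full-time students + resident staff
daily_output_kg = trucks * trips_per_day * load_kg   # 10400 kg per day
print(round(daily_output_kg / residents, 2))         # 0.39 kg per person
```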

This figure is relatively low compared with some other Chinese universities, but it far
exceeds that of universities in the Netherlands (see Table 2). Therefore, the introduction
of AI technology into waste classification management at H University has significant
potential; it is expected to effectively reduce the per capita waste output and contribute
to the sustainable development of the city.

Table 2. The per capita garbage output of some universities.

Colleges and universities Waste output per capita (kg)


Nanjing Agricultural University 0.873 (per day)
Nanchang University 0.428 (per day)
Zhejiang Sci-Tech University 1.5 (per day)
China West Normal University 0.4 (per day)
Leiden University 31 (per year)
Utrecht University 55 (per year)
University of Groningen 29 (per year)
University of Amsterdam 29 (per year)
Source: the environmental coordinators of the universities concerned and the university website

4.2 Current Situation of Waste Classification Management in H University

With the in-depth advancement of waste classification management in China, waste


classification standards have also become more unified and clearer. China's national standard
Signs for Classification of Municipal Solid Waste (GB-T19905-2019) was implemented
in 2019. There are four major categories and 11 sub-categories of waste classifica-
tion that are clearly defined in the stipulation, including Recyclable (paper, plastic, metal,
glass and textiles), Hazardous Waste (tubes, household chemicals, and batteries), Food
Waste (Household food waste, restaurant food waste and other food waste), and Residual
Waste. This waste classification standard is generally applicable to the whole process of
classified dumping, collection, sorting, transportation and treatment of domestic waste
throughout China. However, the current waste classification system in many colleges and

universities has different characteristics and is not uniformly classified in accordance


with national standards.
At present, waste classification standard in H university is still in accordance with
the traditional dichotomy of recyclable waste and non-recyclable waste. The recyclable
waste contains six sub-categories, including newspapers, glass cans, plastic, computers,
pop cans, and clothes. Similarly, the six sub-categories of non-recyclable waste are con-
taminated paper, cigarette butts, pet droppings, broken ceramics, large bones, disposable
tableware. By randomly selecting trash cans in different functional areas for inspection,
it is found that the correct classification of waste on campus is insufficient. It is com-
mon to use trash bags to mix all the waste and dump them in the trash bins. Actually,
this phenomenon also exists to a certain extent in many Chinese universities [32–34].
It is mainly due to the lack of attention to waste classification in campus management,
which is reflected in the outdated waste classification facilities that cannot meet the
requirements of the new standards, as well as the shortage of education and guidance
for students’ behaviors. Also, the inadequate development of municipal waste classifi-
cation system is also a considerable reason for this problem. Under the existing system,
the domestic waste removal has always adopted a mixed collection and transportation
method to transport the campus waste to the transfer station for unified classification
(see Fig. 1). Mixed collection and transportation not only mean that the categorization
in dumping process is meaningless but also leads to more serious secondary pollution,
resulting in the reduction of the recycling rate and the increase of labor costs.
We have to admit that the efforts of waste classification on campus are ineffective
and meaningless under the existing waste classification recycling model. However, it
should be acknowledged that the junkmen at the university have contributed to waste
classification, although they bring inconvenience to waste management on campus. They mainly
collect uncontaminated cardboard, plastic bottles and books to resell them to the garbage
transfer station, which saves the value of these recyclable materials and reduces the
disadvantages brought by mixed collection and transportation. Besides, the special waste,
such as canteen food waste and laboratory hazardous waste, is assigned to various types
of qualified waste recycling companies designated by market regulatory authority for
unified disposal. Therefore, the focus needs to be on the transformation from the mixed
model to the classification model, which is also the pivotal link where AI technology is
applied in this study.

4.3 Waste Classification Management Framework Based on AI Technology

On the basis of the waste classification situation in H university, integrating AI into


the waste classification management process seems to be the best alternative solution
to resolve the mess before the urban waste classification system gets upgraded. Imple-
menting waste classification recycling requires abundant human resources to guide and
supervise, and also numerous labor efforts in the waste collection and sorting process. AI
technology could not only replace part of the labor force in waste classification but also
improve the performance of waste classification on campus. We propose a framework
of waste classification management based on AI technology, which mainly involves the
main processes including dumping, collection and sorting. Specifically, the AI software

Fig. 1. Waste streams in waste classification management system at H University

could be applied in the user’s waste dumping process to reduce the amount of guid-
ance and supervision personnel that are invested in the publicity and education of waste
classification knowledge. The application of AI classification terminal in waste collec-
tion process can realize secondary classification on the basis of manual classification to
maximize recycling and minimize landfill or incineration. AI sorting equipment could
be used in waste sorting process to reduce the labor cost caused by inefficient manual
sorting, and save the value of all kinds of recyclable materials.
The main processes of waste classification on campus require the AI participation
to ensure the action effects. In the four categories of waste, Food Waste and Hazardous
Waste are difficult to be classified by AI due to their characteristics of pollution, so it
is of great importance to guide and supervise users to correctly classify them through
AI technology in the dumping process. Residual Waste is complex and eventually flows
to waste incineration power plants or landfills, so AI classification equipment should be
applied to re-classify them to increase recycling and reduce incineration and landfills.
All Recyclable Waste should be finely sorted by AI sorting equipment and then exported
to make recycled products for consumers. Finally, all waste streams are directed to
corresponding docking institutions, including renewable resource plants, food waste
recycling plants, hazardous waste treatment plants and municipal sanitation systems
(see Fig. 2). In this framework, we removed the role of junkmen because their work is
replaced by AI sorting equipment. Also, it would reduce the potential health risks posed
by direct exposure to waste, especially at the critical time of the global COVID-19
pandemic.
We acknowledge that the framework of AI based waste classification management
in H university proposed in this study is only in the stage of theoretical exploration and
needs the support of subsequent empirical studies. In fact, some colleges and universities
have already begun to explore similar practice of AI waste management system, from
where we could gain the enlightenment. For example, Nanchang University is the first
university in China to fully implement waste classification management with AI tech-
nology. The AI waste classification and recycling bins developed by the student-initiated
Application of Artificial Intelligence in Waste Classification Management 339

Fig. 2. A framework of AI based waste classification management on campus

“Leaf Regeneration” entrepreneurial team has been widely promoted on campus with
the support of the university, which take on the functions of the real-name waste delivery,
full category waste collection and intelligent supervision. The smart bins could verify
the user’s identity through their campus cards or facial recognition when they dump
the waste, and then determine whether the user is correctly classified by using sampling
analysis and automatically generate delivery records. Every user could receive voice and
message prompt with correct or incorrect classification on the mobile App after dumping
their waste. If users correctly dump the waste in matched bins, they will be rewarded
with bonus points, which can be used to redeem food or daily necessities. Alternatively,
when users make mistakes, the waste bin could estimate the probability of their mistakes
by analyzing the past dumping behavior with algorithm, and then determine whether to
deduct the user’s personal credit score or get them retrained with waste classification
knowledge. For the waste sorting process, the “Leaf Regeneration” team set up a transfer
station on campus to realize the fine sorting of recyclable waste, and then delivered them
to various renewable resources recycling factories. Another case worth mentioning is
Guangdong Industry Polytechnic, which established an AI waste management system
on campus with the adoption of the Public-Private-Partnership model.
From the brief case description above, it can be seen that the construction of intel-
ligent waste management system in universities mainly focuses on AI guidance and
supervision in the waste dumping process, and there are still few practices in mixed
waste classification and fine sorting of recyclable waste by using AI technology. On the
upside, the flourishing AI faculty established in colleges and universities could carry out
empirical research on AI waste classification management programs on campus, which
could bridge the gap between theory and reality. Besides, AI based waste classifica-
tion products, especially AI classification and sorting equipment, cannot achieve mass
waste disposal. So, the AI equipment could be firstly put into public office areas, such
as administrative office buildings and libraries, and then it can be promoted in other
areas when conditions are ripe. This study shows the relevant data of waste generation

of H university, which can provide important data reference for subsequent research,
especially for comparison with the daily waste output after the implementation of AI
based waste classification management in the future.

5 Conclusions
The achievement of efficient waste classification recycling on campus is conducive to
maximize the utilization of recyclable resources and minimize the adverse impact of
waste on the environment. With the development of AI technology, the government,
academia and related enterprises are increasingly considering how to use AI technology
to assist the realization of waste classification. This study conducts a systematic review
of the relevant literature of AI based waste classification management and the launched
AI products, and reveals that there are three main categories of AI products in the field of
waste classification recycling, namely the AI education and guide software in dumping
process, AI classification trash bins in collection process, and AI sorting equipment in
sorting process. To explore the potential effects of the introduction of AI technology
to the waste classification management at university, this study takes H university as
a case to investigate and analyze the status quo of waste classification recycling on
campus, including the basic situation of waste generation, dumping and classification.
The results show that the daily waste output per capita of H University is relatively lower
than that of some other universities in China, but it is significantly higher than that of Dutch
universities. In addition, the waste classification standard of H University still adopts
the traditional dichotomy of the recyclable and non-recyclable, which does not match
the current unified standard.
Based on the above results, we argue that it is of practical significance to apply AI
technology in waste classification management at H University to improve the perfor-
mance of waste classification. In this study, we propose a framework of AI based waste
classification management on campus, which mainly covers the application of AI tech-
nology in the main process of waste classification recycling. With the assistance of AI
technology, correct dumping and accurate collection and sorting of all kinds of waste
could be realized. However, the framework of AI based waste classification management
at university proposed in this study is only the theoretical exploration, which is needed
to be further proved its availability with future experimental studies. Also, subsequent
research could carry out technical research and development of AI products that can be
widely promoted in all processes based on this framework, especially the research and
development of AI terminal in classification collection and sorting process. Moreover,
although waste classification has significant features of positive externalities and pub-
lic welfare, it is still worth looking forward to the research on the economic benefits
brought by AI based waste classification recycling in universities, which could enhance
the confidence of university decision makers in implementing the programs.

References
1. Shevchenko, T., Kronenberg, J., Danko, Y., Chovancová, J.: Exploring the circularity potential
regarding the multiple use of residual material. Clean Technol. Environ. Policy 23(7), 2025–
2036 (2021). https://doi.org/10.1007/s10098-021-02100-4

2. Gupta, P.K., Shree, V., Hiremath, L., Rajendran, S.: The use of modern technology in smart
waste management and recycling: artificial intelligence and machine learning. In: Kumar, R.,
Wiil, U.K. (eds.) Recent Advances in Computational Intelligence. SCI, vol. 823, pp. 173–188.
Springer, Cham (2019). https://doi.org/10.1007/978-3-030-12500-4_11
3. Zhang, K., Zheng, L., Wu, X., Huang, Y.: Investigation on classified collection and resource
utilization of domestic waste in colleges and universities--taking Pukou campus of Nanjing
Agricultural University as an example. Environ. Dev. 32(17), 171(10), 224–225+227 (2020).
https://doi.org/10.16647/j.cnki.cn15-1369/X.2020.10.122
4. Li, Z., Huang, Y., Hu, Y., Cai, Y.: Analysis on the status quo of domestic waste recycling
in colleges and universities and suggestions for standardized management–taking the main
campus of Central China Normal University as an example. Recycl. Resour. Circ. Econ.
13(11), 21–24 (2020)
5. Qu, D., Shevchenko, T.: University curriculum education activities towards circular economy
implementation. Int. J. Sci. Technol. Res. 9(5), 200–206 (2020)
6. Ghazali, A., Tjakraatmadja, J.H., Pratiwi, E.Y.D.: Resident-based learning model for sustain-
able resident participation in municipal solid waste management program. Global J. Environ.
Sci. Manag. 7(4), 599–624 (2021). https://doi.org/10.22034/gjesm.2021.04.08
7. Nowakowski, P., Szwarc, K., Boryczka, U.: Combining an artificial intelligence algorithm
and a novel vehicle for sustainable e-waste collection. Sci. Total Environ. 730, 138726 (2020).
https://doi.org/10.1016/j.scitotenv.2020.138726
8. Fu, Y., Heyenko, M., Zhang, W., Xia, Y., Niu, L.: Research on the promotion of discipline
development by the innovation of management system in Chinese universities. In: 37th IBIMA
Conference, Cordoba, Spain, pp. 30–31 (2021)
9. Myers, K., Secco, E.L.: A low-cost embedded computer vision system for the classification
of recyclable objects. In: Sharma, H., Saraswat, M., Kumar, S., Bansal, J.C. (eds.) CIS 2020.
LNDECT, vol. 61, pp. 11–30. Springer, Singapore (2021). https://doi.org/10.1007/978-981-
33-4582-9_2
10. Kolekar, K.A., Hazra, T., Chakrabarty, S.N.: A review on prediction of municipal solid waste
generation models. Procedia Environ. Sci. 35, 238–244 (2016). https://doi.org/10.1016/j.pro
env.2016.07.087
11. Vitorino, A., Melaré, D.S., Montenegro, S., Faceli, K., Casadei, V.: Technologies and decision
support systems to aid solid-waste management: a systematic review. Waste Manage. 59,
567–584 (2017). https://doi.org/10.1016/j.wasman.2016.10.045
12. Abdallah, M., Talib, M.A., Feroz, S., Nasir, Q., Abdalla, H., Mahfood, B.: Artificial intelli-
gence applications in solid waste management: a systematic research review. Waste Manage.
109, 231–246 (2020). https://doi.org/10.1016/j.wasman.2020.04.057
13. Zeng, W., Zhou, T., Meng, F.: Influencing factors and empirical analysis on the willingness
to use garbage classification software. Office Informatiz. 26(3), 56–59 (2021)
14. Yu, D., Jing, C.: Design and implementation of intelligent garbage sorting software based on
neural network. Sci. Technol. Innov. 26, 120–122 (2020)
15. Lv, W., Wei, X., Chen, Z., Tong, H., Ma, Y.: The implementation of garbage classification
software based on Convolutional Neural Network. Comput. Knowl. Technol. 16(5), 203–204
(2020). https://doi.org/10.14004/j.cnki.ckt.2020.0583
16. Zhou, F., Zhang, W.: Research on characteristics and optimization path of artificial intelligence
application in waste classification. J. Xinjiang Normal Univ. Ed. Philos. Soc. Sci. 41(4),
135–144 (2020). https://doi.org/10.14100/j.cnki.65-1039/g4.20200122.002
17. Sudha, S., Vidhyalakshmi, M., Pavithra, K., Sangeetha, K., Swaathi, V.: An automatic clas-
sification method for environment: friendly waste segregation using deep learning. In: 2016
IEEE Technological Innovations in ICT for Agriculture and Rural Development (TIAR).
IEEE, Chennai (2016). https://doi.org/10.1109/TIAR.2016.7801215

18. Lokuliyana, S., Jayakody, J., Rupasinghe, L., Kandawala, S.: IGOE IoT framework for waste
collection optimization. In: 2017 6th National Conference on Technology and Management
(NCTM), pp.12–16. IEEE, Malabe (2017). https://doi.org/10.1109/NCTM.2017.7872820
19. Rajamanikam, A., Solihin, M.I.: Solid waste bin classification using Gabor wavelet transform.
Int. J. Innov. Technol. Explor. Eng. 8(4S), 114–117 (2019)
20. Kang, Z., Yang, J., Guo, H.: Automatic garbage classification system based on machine vision.
J. Zhejiang Univ.: Eng. Sci. 54(7), 1272-1280+1307 (2020)
21. Ruiz, V., Sánchez, Á., Vélez, J.F., Raducanu, B.: Automatic image-based waste classification.
In: Ferrández Vicente, J.M., Álvarez-Sánchez, J.R., de la Paz López, F., Toledo Moreo, J.,
Adeli, H. (eds.) IWINAC 2019. LNCS, vol. 11487, pp. 422–431. Springer, Cham (2019).
https://doi.org/10.1007/978-3-030-19651-6_41
22. Vrancken, C., Longhurst, P., Wagland, S.: Deep learning in material recovery: development
of method to create training database. Expert Syst. Appl. 125, 268–280 (2019). https://doi.
org/10.1016/j.eswa.2019.01.077
23. Nañez-Alonso, S.L., Reier-Forradellas, R.F., Pi-Morell, O., Jorge-Vazquez, J.: Digitalization,
circular economy and environmental sustainability: the application of artificial intelligence
in the efficient self-management of waste. Sustainability 13, 2092 (2021). https://doi.org/10.
3390/su13042092
24. Adedeji, O., Wang, Z.: Intelligent waste classification system using deep learning convo-
lutional neural network. Procedia Manuf. 35, 607–612 (2019). https://doi.org/10.1016/j.pro
mfg.2019.05.086
25. Wu, B., Deng, X., Zhang, Z., Tang, X.: Intelligent garbage classification system based on
convolutional neural network. Phys. Exp. 39(11), 44–49 (2019). https://doi.org/10.19655/j.
cnki.1005-4642.2019.11.009
26. Lv, C.: Automatic garbage sorting based on deep learning. Electron. Manuf. 24, 36–38 (2019).
https://doi.org/10.16589/j.cnki.cn11-3571/tn.2019.24.015
27. Shuptar-Poryvaieva, N., Gubanova, E., Andryeyeva, N., Shevchenko, T.: Examining of
portable batteries externalities with focus on consumption and disposal phases. Econ. Environ.
4(75), 8–22 (2020). https://doi.org/10.34659/2020/4/30
28. Tehrani, A., Karbasi, H.: A novel integration of hyper-spectral imaging and neural networks
to process waste electrical and electronic plastics. In: 2017 IEEE Conference on Technologies
for Sustainability (SusTech), pp.1–5. IEEE, Phoenix (2017). https://doi.org/10.1109/SusTech.
2017.8333533
29. Chidepatil, A., Bindra, P., Kulkarni, D., Qazi, M., Kshirsagar, M., Sankaran, K.: From trash
to cash: how blockchain and multi-sensor-driven artificial intelligence can transform circular
economy of plastic waste? Adm. Sci. 10(2), 23 (2020). https://doi.org/10.3390/admsci100
20023
30. Wilts, H., Garcia, B.R., Garlito, R.G., Gómez, L.S., Prieto, E.G.: Artificial intelligence in the
sorting of municipal waste as an enabler of the circular economy. Resources 10(4), 28 (2021).
https://doi.org/10.3390/resources10040028
31. Shevchenko, T., Vavrek, R., Hubanova, O., Danko, Y., Chovancova, J., Mykhailova, L.: Clar-
ifying a circularity phenomenon in a circular economy under the notion of “potential” new
dimension and categorization. Probl. Sustain. Dev. 16(1), 79–89 (2021). https://doi.org/10.
35784/pe.2021.1.09
32. Liu, G., Zhou, H., Peng, X., Xu, W.: Investigation and analysis of domestic refuse in Western
Normal University in China. Sci. Technol. Vis. 20, 271–272 (2014). https://doi.org/10.19694/
j.cnki.issn2095-2457.2014.20.215

33. Liu, T., Li, Q., Lan, J.: Study on the construction and countermeasures of household garbage
classification system in colleges and universities–a case study of universities in Nanchang
city, Jiangxi province. Mark. Wkly. 34(1), 130–132 (2021)
34. Zhu, Y.: Present situation and suggestions of campus garbage recycling –taking Hefei
University of Technology as an example. J. Changchun Educ. Inst. 28(4), 131–132 (2012)
The Distributed Ledger Technology
as Development Platform for Distributed
Information Systems

Itzhak Aviv1,2(B)
1 Tel Aviv University, 6997801 Ramat Aviv, Israel
cko150@yahoo.com
2 The Academic College of Tel Aviv–Yaffo, Tel-Aviv, Israel

Abstract. Distributed Ledger Technology (DLT) has emerged as a technology


enabler for developing trusted and decentralized solutions for various distributed
systems worldwide. Because of their sophisticated architectural patterns, current
Information System (IS) architectural frameworks are primarily designed for cen-
tralized information systems and can no longer ensure the requisite degree of
availability and dependability for Distributed Information Systems (DIS). In the
current study, I am the first to declare and define the term “DLT-Native”. A “DLT-
Native” DIS is built on the DLT platform and uses DLT design patterns to grow
internationally, support thousands of distributed nodes, and withstand operational
system failures and cyber-attacks. This research creates a reference architecture for
Permissioned DLTNS organizing dispersed information system components that
may share information, issue and service requests, and conduct outcome-focused
activities. The proposed DLTNS architecture combines prominent security, pri-
vacy, and trust domain aspects. DLT overcomes conventional DIS solutions with
an inherently centralized governance approach and a lack of transparency, data
traceability, and trust.

Keywords: Distributed ledger technology · Distributed information systems

1 Introduction
The DIS has received a lot of attention [1–4], especially in the realm of the Internet of
Things (IoT). Most DIS are now found in widely distributed environments, such as
hybrid cloud use cases and edge-based IoT domains like Military, Financial, Supply
Chain, or Telco. These use-cases are fundamentally distributed to provide the required
scalability, availability, and security to ensure a sustainable service level agreement. The
DIS is a geographically distributed system made up of computers connected via data
channels. Workloads and data are stored in various nodes and hosted on multiple servers,
hardware, and software systems [5]. All distributed DIS nodes should function as a single
logical, coherent system, with each node acting independently of the others [1]. The DIS
will comprise many nodes, from enormous high-performance compute units to smaller
devices in IoT use cases. To construct a multi-environmental digital area, DIS employs


distributed cloud models. A hybrid cloud-based on a mix of suitable cloud types (private,
public, community) and edge computing is used in a multi-environment, primarily for
distributed scenarios [6]. In the information management sector, the transition to cloud-
based deployment models is well-established, particularly with the emphasis on edge
cloud and distributed cloud topologies [7]. Despite the increasing rise of distributed
cloud topologies, there is an extensive research gap focusing on distributed architecture
in multi-node edge-based IoT scenarios.
The DIS architecture differs from the centralized IS design in various ways. DIS is a
set of geo-distributed services deployed in a multi-environment that must communicate
to enable DIS component orchestration and a logically coordinated system. Currently,
any DIS project must devote a significant amount of time and effort to designing and
developing a distributed system framework for DIS data, process and service replica-
tion, system state management and consistency, coordination and synchronization of
system nodes, conflict resolution, and high availability; all while ensuring reliable, sta-
ble, and trustworthy connectivity among service nodes. With the increasing complexity
of DIS, adding DLT into one appears to be a viable option. DLT provides a shared
and preserved record of digital events validated, mutually agreed upon, immutable, and
cryptographically protected by numerous widely scattered entities [7, 8]. DLT technol-
ogy is notable for tolerating single points of failure and protecting against cyber-attacks
[9]. DLT is viewed as a crucial catalyst that highlights and can become a decentral-
ized application (dApp) development platform [10–12], as well as DLT applications for
permissioned scenarios [13–15].
A DLT solution comprises a distributed network of peer-to-peer nodes; a ledger data
structure replicated across the DLT network; a network protocol specifying privileges,
roles, and controlled data ownership; programmable smart contracts; and communication,
verification, validation, and consensus-based mechanisms across the network nodes [16].
The term “DLT solution” is
frequently used in the context of permissioned implementation, which can be divided
into two categories: private DLT and consortium DLT. The private DLTs have the most
control over system involvement. Within a single organization, the network owner con-
trols all rights for reading, transacting, and mining. Consortium DLTs, on the other hand,
differ from private DLTs in that system entities in consortium DLT systems are governed
and overseen by member organizations [13]. Many recent research studies have proposed
DLT-based solutions for sharing, monitoring, and managing data and processes between
multi-party distributed systems in various use cases across several industries [16–18],
including government agencies, healthcare, legal firms, accounting firms, financial ser-
vices, supply chain, energy, education, and service providers in specific industries. These
studies look into the role of distributed ledger technology (DLT) in establishing trust
between untrustworthy parties and delivering secure distributed data and services.
However, I could not locate a paper investigating the DLT platform as a development framework for a single DIS to exchange, track, and govern its internal data and services in a secure and trustworthy manner across nodes in several environments. Permissioned private DLT platforms are becoming more popular in today’s market, yet there is a lack of study in this area [19]. In this study, I define a comprehensive view of DIS Architectural Design Properties (ADPs) and identify sixteen common ADPs. I then examine how DLT technology supports the DIS ADPs, how DIS can use DLT, and what is missing from DLT that can be addressed with complementary tools. Based on an analysis of the DLT platform as a DIS development framework, I design a reference architecture for private permissioned DLT-native distributed information systems (DLTNS).
The remainder of this work is arranged as follows. The DIS ADPs are defined in Sect. 2. Section 3 examines how DLT technology is used to meet the DIS ADPs, and Sect. 4 describes the DLTNS reference architecture. Finally, I discuss the findings, study limitations, conclusions, and future research.

2 DIS Architectural Design Properties

This section explores the DIS ADPs and the technical challenges of meeting them with the traditional design of distributed systems.
I conducted a systematic study to explore the ADPs of distributed information systems in greater depth. The investigation uncovered sixteen DIS ADPs that are common in DIS services. In the following, each ADP is numbered for further review and reference. Interaction of the DIS resources and services requires a System Coordina-
tion (ADP1) and synchronization framework at both the network and middleware level
[20]. In synchronous coordination, all components of a distributed system are coordi-
nated in time. In asynchronous coordination, separate entities take steps in arbitrary
order and operate at random speeds. A total order of events needs to be established through collective interactions. Group Communication (ADP2) addresses the communication schemas available to ensure reliable delivery of messages across the distributed entities [21]. These can involve simple peer-to-peer (P2P) direct messaging to provide high availability under network partitioning. It also facilitates concurrent updates, which allow two or more nodes to alter the same data while disconnected and merge the logical data upon reconnecting. Alternatively, multicast provides an ordering of messages
that can be used along with the more sophisticated publish-subscribe form of group com-
munication. P2P DIS are Internet-based systems that operate in an entirely distributed manner, aiming at unlimited scalability with millions of nodes, continuous availability regardless of node and network failures, self-stabilization in the presence of rapidly evolving node participation, and high robustness despite potentially selfish node behavior [22]. In a P2P environment, the entities connect directly to each other and transfer data
without using a centralized system in the process. The P2P network can be either hierarchical or flat; a hierarchy implies a network where specific nodes (master nodes) have more responsibilities than other nodes. The gradient P2P network topology can be exploited by a master-slave database replication strategy [23]. It can deliver several advantages to a network, such as faster synchronization, better message throughput, and increased scalability. P2P DIS can be very unpredictable, as all network nodes control their own actions and may decide at any time to join or leave the network. This, in turn, has an impact on network performance. Because data is transferred across
numerous nodes, some nodes may have conflicting data sets or outdated information
[24]. The networking part of a P2P system can also present problems, as queries must (in certain situations) be broadcast across the network, which can cause congestion if there are many of them. Furthermore, security is often in question, as the network is often open to all participants. Trust is a concern because each node in the network is accountable
for its actions, and incentives are needed to encourage participants to obey the network’s
rules and support it [19].
DIS requires support for managing a System State (ADP3), an approach that provides a mathematical abstraction and maintenance of a common system state across a collection of nodes. State machine replication is a common technique for implement-
ing distributed, fault-tolerant services [20]. Commonly, replicated state machine imple-
mentations are centered on using a consensus protocol, as replicas must sequentially
implement the same operations in the same order to prevent divergence.
Existing consensus protocols such as Paxos or Raft can be used to build a replicated
state machine based on a command log [25]. Once a replica learns one or multiple com-
mands by consensus, it appends them to its persistent local command log. However,
implementing such a command log incurs additional challenges such as log truncation,
snapshotting, and log recovery. These issues must be addressed independently on top
of the consensus method in Paxos, which is a challenging and error-prone task [25].
Other consensus solutions, e.g., Raft, consider some of these issues as part of the core
protocol while sacrificing the ability to make consensus decisions without an elected
leader. In either case, implementing consensus sequences requires extensive state man-
agement [26]. The challenge here is that the DIS framework requires a synchronization service that both ensures synchronization and guarantees that synchronization is achievable. A fundamental challenge in developing a reliable DIS is to support the cooperation of dispersed entities that need to execute common operations, even when some of these entities, or the communication across them, fails. The need is to order service actions and to prevent partitions of the distributed resources, resulting in an overall “coordinated” group of resources.
A P2P DIS must deal with the Concurrent Modification (ADP4) prevention challenge. In distributed systems, double-spending prevention is possible by integrating a broker server that observes conflicting operations of distributed servers and guarantees data integrity [27]. The traditional way of preventing double-spending is to use a centralized authority responsible for asset allocation [28]. Accordingly, DIS must deal with the problem of a digital asset being allocated more than once. In DIS, a significant drawback of double-spending detection techniques is the risk that multiple distributed nodes spend the same digital asset numerous times in a short period before being detected [28].
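A minimal sketch of the broker-based prevention just described (the class, method, and field names are illustrative assumptions, not a standard protocol):

```python
# Hypothetical sketch of a centralized broker that prevents double-spending by
# recording which asset identifiers have already been allocated.

class AssetBroker:
    def __init__(self):
        self.spent = set()  # identifiers of assets already transferred

    def transfer(self, asset_id, sender, receiver):
        if asset_id in self.spent:
            raise ValueError(f"double-spend rejected: {asset_id} already transferred")
        self.spent.add(asset_id)
        return {"asset": asset_id, "from": sender, "to": receiver}

broker = AssetBroker()
broker.transfer("asset-42", "node-A", "node-B")      # accepted
try:
    broker.transfer("asset-42", "node-A", "node-C")  # conflicting second spend
except ValueError as err:
    print(err)
```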
Scalable DIS has to maintain its Service Continuity in all situations [20]. A trade-
off has to be decided upon among the three properties: replica Consistency (ADP5),
Availability (ADP6), and Network Partition Tolerance (ADP7). According to the CAP
theorem, only two of these features may be ensured simultaneously in a distributed service [27]. Availability and partition tolerance are binary: they are either respected or not. Service availability means that every request receives a response. Network partition tolerance means that the network continues to operate despite some messages being dropped. Service replica consistency, on the other hand, allows for a range of consistency levels. In the strongest sense, “consistency” means that all instances maintain the same values for each variable at the same time, providing behavior identical to that of a single copy. The initial demonstrations of the CAP theorem assumed consistency in this sense [29]. With the advent
of multi-environment cloud computing, and because DIS is deployed in various data centers, network partition resilience is a prerequisite for those operations. DIS regularly prioritizes availability when dealing with the CAP theorem constraints, and consistency is the property being sacrificed. However, that sacrifice need not be complete. Brewer (2012) [30] explains that network partitions are rare, even for worldwide geo-replicated services. If services demand partition tolerance and availability, their consistency may be quite strong most of the time, relaxing only when a temporary network partition arises.
Therefore, there is still an open gap to explore which levels of consistency are strong
enough to be directly implied by the CAP constraints, i.e., those CAP-constrained mod-
els are not supported when the network becomes partitioned. When a network partition
occurs, however, numerous flexible modes remain available. They make up the CAP-free
set of modes, and there is an unknown boundary between them and the CAP-constrained
models. An additional consistency level in DIS is eventual consistency, where DIS compromises consistency in favor of high availability. Conflicts can occur in a replication environment when two or more nodes change the same piece of data at multiple sites. The synchronization engine then tries to merge those changes into a single database. In this context, DIS must consider data integrity, which ensures the accuracy and consistency of data throughout its lifecycle and is an essential feature of a system. All data characteristics, including business rules, rules for how data relate, dates, definitions, and lineage, must be correct for data integrity to be complete. When functions operate on the data, the procedures must ensure integrity.
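As one concrete illustration of how a synchronization engine can merge concurrent updates under eventual consistency, the sketch below uses a last-writer-wins rule keyed on timestamps; this is an assumed, simplified policy, and real engines may instead use version vectors or application-specific conflict resolution:

```python
# Illustrative last-writer-wins (LWW) merge of two disconnected replicas.
# Each replica maps key -> (timestamp, value); the newer timestamp wins.

def merge_lww(replica_a, replica_b):
    merged = dict(replica_a)
    for key, (ts, value) in replica_b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged

a = {"price": (10, 99), "stock": (12, 5)}   # updates made while disconnected
b = {"price": (11, 95)}                     # conflicting update on another node
print(merge_lww(a, b))                      # {'price': (11, 95), 'stock': (12, 5)}
```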
Facing CAP theorem challenges, DIS must deal with Scalability (ADP8), Band-
width Utilization (ADP9), and Latency (ADP10). The scalability of DIS can be mea-
sured along at least three different dimensions - Size, Geography, Administration [20].
Size scalability refers to a system that can easily add more users and resources without
noticeable performance loss. Geographical scalability refers to scaling up when users
and resources are spread out geographically but with minimal communication delays.
An administratively scalable system can still be easily managed even if it spans many
independent administrative organizations. Bandwidth utilization requires dealing with DIS nodes located on bandwidth-constrained edge networks. The distribution framework must provide a synchronization service for data pre-positioning and guaranteed data delivery under limited bandwidth or temporary network disconnection. The latency-based design must ensure that all nodes in the network have a reasonably up-to-date, consistent version of the data; transactions must become available to all nodes within a relatively small time interval.
An important distributed system property is Transaction Finality (ADP11) [31]. It refers to the general understanding that an operation, once completed by the system, is completed for good and will not change in the future. Finality is a statutory, regulatory, and contractual construct: some entity agrees that a party has discharged an obligation or transferred a digital information asset to another entity, and this becomes unconditionally irrevocable even in the event of insolvency. Finality is challenged by the possibility of system corruption and information change.
DIS architecture faces additional challenges such as Nodes Management (ADP12), Filtering (ADP13), and Flow Control (ADP14). A nodes-management framework supports DIS in managing membership and organizing the collection of nodes. In a closed permissioned group, only group members can interact with one another, and joining and leaving the group requires a distinct procedure. Filtering is used to restrict the data that is synchronized; typically, the source provider applies the filter during change enumeration to specify the changes that are sent. The technique of regulating the data transmission rate between two nodes to ensure that a rapid transmitter does not outrun a slow receiver is known as flow control. It must include a mechanism for the receiver to adjust the transfer rate so that data from the transmitting node does not overwhelm the receiving node.
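A minimal sketch of receiver-driven, credit-based flow control, one common way to realize ADP14 (the class and credit scheme here are illustrative assumptions):

```python
# The receiver grants "credits" and the sender transmits only while credits
# remain, so a fast transmitter cannot outrun a slow receiver.

class CreditFlowControl:
    def __init__(self, initial_credits):
        self.credits = initial_credits

    def try_send(self, message, network):
        if self.credits <= 0:
            return False          # sender must wait for the receiver
        self.credits -= 1
        network.append(message)
        return True

    def grant(self, n):
        """Receiver grants more credits once it has processed messages."""
        self.credits += n

network = []
fc = CreditFlowControl(initial_credits=2)
for i in range(4):
    if not fc.try_send(f"msg-{i}", network):
        fc.grant(1)               # receiver caught up; allow one more message
        fc.try_send(f"msg-{i}", network)
print(network)                    # all four messages delivered without overrun
```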
A great deal of research effort has been reported to deal with subsets of these ADPs. Some patterns, such as disconnected mode and data integrity, may conflict with each other, and no general-purpose solution is yet available to meet all these DIS ADPs satisfactorily. These ADPs are relevant to the interoperation of the DIS services, and existing research studies face challenges in supporting the interoperation of DIS data, processes, and services. In summary, many ADPs of a DIS service rely on the interoperations among the distributed resources and services. No general-purpose out-of-the-box solution is available to meet all ADPs satisfactorily due to the challenges of data validity, heterogeneity, uncertainty, and Trust (ADP15) assurances of data, process executions, and QoS information of available services.
Consequently, distributed systems Security (ADP16) addresses the threats arising
from exploiting vulnerabilities in the attack surfaces created across the distributed sys-
tem’s resource structure and functionalities. It covers the risks to the data flows that
can compromise the integrity of the distributed system’s resources and structure, access
control mechanisms (for resource and data accesses), the data transport mechanisms, the
middleware resource coordination services characterizing the distributed system model
(replication, failure handling, transactional processing, and data consistency), and finally
the distributed applications based on them (e.g., web services, storage, and databases)
[32].
Given the significant challenges that DIS ADPs pose for system designers and developers, there is a growing need for enabling solutions and DIS development frameworks that address all the aforementioned issues. The following section examines the ability of DLT to fill this gap.

3 DLT Patterns Supporting DIS Development Principles

DLT combines a few approaches to create a distributed, immutable, append-only log of ordered transactions, in which everyone involved agrees on the order of transactions using a distributed consensus process. All transactions are organized into blocks to reduce network consumption, addressing the DIS bandwidth requirement (ADP9) [33]. Every transaction is cryptographically signed, and blocks are hash-chained together after all parties have agreed on the block content, producing a distributed ledger and recording the blocks in it. Because no single party can be trusted, each network participant keeps its own copy of the ledger, making it nearly impossible for a single party to falsify recorded transactions or repudiate the agreement. Any attempt to falsify or replace portions of the transactions will be discovered, assuring data integrity and finality. All transactions that have taken place are recorded in the ledger.
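The hash-chaining just described can be illustrated with a short Python sketch (a toy example without networking or consensus; the block fields are assumed for illustration):

```python
# Each block commits to the previous block's hash, so altering any recorded
# transaction changes all subsequent hashes and is detectable.
import hashlib, json

def block_hash(block):
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_block(chain, transactions):
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "transactions": transactions})

def verify(chain):
    """Re-derive every link; any tampering breaks the chain."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True

ledger = []
append_block(ledger, [{"from": "A", "to": "B", "amount": 5}])
append_block(ledger, [{"from": "B", "to": "C", "amount": 2}])
print(verify(ledger))                         # True
ledger[0]["transactions"][0]["amount"] = 500  # attempted falsification
print(verify(ledger))                         # False: tampering is discovered
```

Because each block commits to the hash of its predecessor, changing any recorded transaction invalidates every later link, which is what makes falsification detectable.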
On the other hand, a world-state database reflects the system’s current state at a specific point in time, meeting the DIS System State property (ADP3). Many organizations and industrial applications are exploring the incorporation of distributed ledger technologies into the core of their solutions, prompting more significant research into efficiently utilizing them in various fields. The following characteristics are commonly associated with DLT systems:

a) Decentralization: The essential feature of the DLT system is that each network
participant has a copy of the shared and distributed ledger. The majority of the
mutually distrusted parties agree based on the local version of the global state,
devolving authority from the central authority [34].
b) Non-repudiation: Because each transaction is signed using public-key cryptography, no DLT participant may deny the transaction or its content. Non-repudiation’s major purpose is to keep an indisputable and verifiable record of transaction issuance, backed by an immutable and tamper-proof shared ledger.
c) Provenance: A ledger is an append-only log of records capable of storing a complete
history of transactions across time, with all participants seeing the same data. Data
provenance allows for the investigation of the data’s origin and evaluation, revealing
distinct stages of data modification lifecycles, who approved and initiated these
changes, and computation fingerprints that indicate how these changes were made.
d) Transparency: Everyone has equal access to the ledger because it is shared and
replicated across the network’s participants, making all records traceable and trans-
parent. Permissioned DLT (PB) platforms typically have fine-grained access control
techniques that limit access to data from ineligible players. As a result, PB offers
a toolset that is well-suited to DIS application design, meeting the need to limit
synchronized data access.

Permissioned DLT (PB) based DIS often employ a layered architecture with two
tiers:

a) DLT platform layer: concentrating on the formation of a business network among distrusted parties, encoding company rules and policies into smart contracts that define the agreement and trust models.
b) DLT backend tier: focused on data-schema modeling and providing DLT characteristics, APIs, and services. This layer uses DLT components and capabilities as building blocks to create a reliable and trustworthy DIS with ADPs to handle data integrity problems, application data availability, and consistency. Business and application logic are usually embedded at this layer to address the required DIS functionality.

The DLT layer comprises essential components that operate together to provide
secure communication between mutually untrusted nodes, shared transaction logs, and
data updates that are synchronized using the replicated state machine [35]. As a result,
DLT is ideal for ensuring reliable and consistent coordination between DIS services
(ADP1).
Cryptography: To secure DLT operations, the PB network employs public-key cryptography techniques. Private keys are used to digitally sign transactions, ensuring undeniability and preventing adversarial interference with the user’s data (ADP15 and ADP16).
Ledger: The DLT supports data interchange by using a shared ledger with peer-to-peer replication, ensuring openness and information availability. To optimize bandwidth consumption, data is transferred between nodes in the form of blocks that are hash-chained together. With each transaction and block consecutively appended to the ledger, the system’s state is continuously updated, ensuring linearizability and preventing forks.
Consensus Mechanism: The challenge of obtaining agreement among mutually untrustworthy nodes in the DLT is a version of the Byzantine Generals (BG) problem [36]. The replicated state machine abstraction explains the replication synchronization mechanism between distinct nodes (i.e., consensus). Because DLT is expected to be trustworthy, it must have resilient features, such as a secure, highly available, dependable, and safe system that provides continuous and uninterrupted service. In distributed networks, the consensus algorithm supporting state replication provides DIS constructed on top of the DLT with network partition tolerance (ADP7) and strong consistency (ADP5).
Smart Contracts: The DLT adds the ability to handle programmable transactions, and application rules can now be embedded and written in an executable form, with the DLT assisting with execution. Ethereum was the first to suggest the concept of smart contracts as a general-purpose programming platform [37], allowing the implementation of business rules as Turing-complete transaction logic. A smart contract is an abstraction that mimics the functionality of a trusted distributed application, employing the underlying DLT architecture to ensure DIS safety and consistency (ADP5).
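Conceptually, a smart contract can be viewed as a set of state variables plus operations that enforce business rules before any state change. The sketch below illustrates this in Python with an invented escrow rule; it is not Ethereum or Solidity code:

```python
# Illustrative contract: state variables plus guarded operations.
class EscrowContract:
    def __init__(self, buyer, seller, amount):
        self.state = {"buyer": buyer, "seller": seller,
                      "amount": amount, "status": "funded"}

    def release(self, caller):
        # Business rule encoded in the contract: only the buyer may release,
        # and only while the escrow is still funded.
        if caller != self.state["buyer"]:
            raise PermissionError("only the buyer can release the funds")
        if self.state["status"] != "funded":
            raise ValueError("escrow is not in a releasable state")
        self.state["status"] = "released"
        return f'{self.state["amount"]} paid to {self.state["seller"]}'

contract = EscrowContract(buyer="org-A", seller="org-B", amount=100)
print(contract.release(caller="org-A"))   # allowed by the encoded rules
```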

4 DLT-Native System (DLTNS) Reference Architecture


I defined the DLTNS reference architecture, which organizes the essential basic concepts into the DLTNS design patterns given in Fig. 1, to meet as many of the 16 DIS ADPs as feasible, following the DLT principles specified in Sect. 3. The following examines these concepts and how they can be realized.
The “Infrastructure Services” layer of the DLTNS standard design is based on lower-
level infrastructure components such as network, storage, and virtualization services.
The fundamental DLT Middleware Services are proposed in the second tier of the
DLTNS reference design. Nodes Services provide node-group management and identification services to all nodes on the network, regardless of their functionality, enabling them to act as contributors to the network or simply access the network services. A Peer-to-Peer Connectivity Service is included: DLTNS projects can utilize out-
of-the-box peer-to-peer connectivity, which is the foundation for peer discovery and
distributed consensus among DLTNS participants. This layer also includes a crypto-
graphic mathematical algorithm service that effectively converts any size input to a
fixed-size output. Messaging services let data be passed between nodes and accounts
for discovery, transactions, coordination, and consensus. Consensus Services attempt to reach a distributed consensus on the system state without relying on trusted third parties. This opens the possibility of creating and using a DLTNS in which any allowed
organization can verify any possible state or interaction. Only approved and trusted enti-
ties will participate in ledger activities, according to the Permissioned Ledger Service.
A private ledger can protect ledger data by allowing only permitted entities to access
it, useful in some DLTNS scenarios. Smart Contracts can be utilized as object codes
stored immutably on a DLT network and execute autonomously to react to internal or
external events in the DLTNS architecture. Each object specifies a collection of state
variables and related operations to determine the business logic and rules that must be
followed before changing the object state. Ecosystem Services provide tools to make
the governance processes of the DLTNS ecosystem easier. It has features to automate
the DLTNS governance needs, give the flexibility to expand the system on demand, and
monitor access to DLTNS, business functions, or state.

Fig. 1. DLTNS reference architecture

Application Services for DLTNS projects comprise the third layer. SDK for DLT
systems, Infrastructure APIs, and Application Services are all part of the development
stack. Third-party integration for external services is included in this layer. An external
database service must be employed and connected to the DLTNS as a distinct component to overcome DLT’s restrictions and enable offline mode. As a result, a significant portion of an
application can be developed on the DLT platform, while the remainder can be developed
elsewhere. Which parts of functionality should be allocated within the DLT framework
or held off-chain is a consideration in any DLTNS design decision. Off-Chain Data
Integration can help with issues like privacy and scalability. It does, however, necessitate
careful thought for secure storage and retrieval. Off-chain storage services are frequently
used when it is preferable to store the data itself off-chain while keeping the proof-of-
data and a pointer to the data on-chain. Another roadblock in the DLTNS instance is
DLTNS interoperability services, which provide compatibility between business services running on different platforms or across different ledger implementations.

5 Discussion and Conclusion


In this research, I am the first to use the term “DLT-native.” A “DLT-native” distributed information system (DLTNS) is built on DLT infrastructure. It uses DLT design patterns to grow internationally, support thousands of distributed nodes, and withstand system failures and cyber-attacks. “DLT-native” is a term that is used loosely but rarely defined beyond “built on the DLT platform” as opposed to a “conventional development framework.” This study uncovers new ideas and DIS design patterns that are gaining traction; after describing these qualities, I show how they originate from technological design patterns. Finally, I examine the technological restrictions of DLT platforms, which may provide insight into the future of DLT applications for researchers.
I defined 16 architectural design properties (ADPs) of any distributed information system in Sect. 2. In Sect. 3, I reviewed DLT principles and design patterns that can be used to realize DIS ADPs with DLT. DLT principles are recurrent principles for
achieving DIS ADP characteristics and realizing DLTNS transferability. Cryptography,
Distributed Ledger, P2P Data Distribution Service, Consensus Mechanism, and Smart
Contracts were used to demonstrate how they can be used to meet the functional require-
ments of the DIS ADPs. In Sect. 4, the DLTNS reference architecture is outlined, which
follows these concepts. The DLT platform does not provide some ADPs “out of the box”.
While considering the platform’s limitations and paradigms, ADPs must be designed
and constructed on or next to it.
In the near future, the development of distributed systems in the IoT area will be critical. In addition, software businesses are gradually incorporating DLT technology into their dispersed networks. As a result, I offered a revolutionary DLTNS approach for designing and constructing distributed systems based on their needs and requirements while maintaining broad functionality and security. Using an empirical proof-of-concept imple-
mentation of DLTNS, I developed a DLT-Native framework. I discovered that the DLT
platform might be used to apply the essential architectural aspects of DIS.
The current study has some shortcomings that hint at future research options. The
number of DLT platforms on the market is rapidly growing. A more advanced version
of DLTNS comparing leading developing DLT systems like Corda or Hyperledger via
empirical case study could be a potential extension of this research. I propose looking
into the status of the DLT platform releases and seeing whether a new release adds new
features that support ADPs that aren’t addressed in this report.

References
1. Van Steen, M., Tanenbaum, A.: Distributed systems principles and paradigms. Network 2(28)
(2002)
2. Gong, W., Qi, L., Xu, Y.: Privacy-aware multidimensional mobile service quality prediction
and recommendation in distributed fog environment. Wirel. Commun. Mobile Comput. (2018)
3. Zhu, X., Yang, L.T., Jiang, H., Thulasiraman, P., Di Martino, B.: Optimization in distributed
information systems. J. Comput. Sci. 26, 305–306 (2018)
4. Sahni, Y., Cao, J., Zhang, S., Yang, L.: Edge mesh: a new paradigm to enable distributed
intelligence in Internet of Things. IEEE access 5, 16441–16458 (2017)
5. Pleskach, V., Pleskach, M., Zelikovska, O.: Information security management system in dis-
tributed information systems. In: 2019 IEEE International Conference on Advanced Trends
in Information Theory (ATIT), pp. 300–303 (2019)
6. Darwish, A., Hassanien, A.E., Elhoseny, M., Sangaiah, A.K., Muhammad, K.: The impact
of the hybrid platform of internet of things and cloud computing on healthcare systems:
opportunities, challenges, and open problems. J. Ambient. Intell. Humaniz. Comput. 10(10),
4151–4166 (2017). https://doi.org/10.1007/s12652-017-0659-1
7. D’souza, S., Koehler, H., Joshi, A., Vaghani, S., Rajkumar, R.: Quartz: time-as-a-service for
coordination in geo-distributed systems. In: Proceedings of the 4th ACM/IEEE Symposium
on Edge Computing, pp. 264–279 (2019)
8. Shinde, S., Tak, S., Tiwari, K., Barapatre, O., Mishra, S.K.: An introduction of distributed
ledger technology in blockchain and its applications. Des. Eng. 2290–2299 (2021)
9. Pahlevan, M., Voulkidis, A., Velivassaki, T.H.: Secure exchange of cyber threat intelligence
using TAXII and distributed ledger technologies-application for electrical power and energy
system. In: The 16th International Conference on Availability, Reliability and Security, pp. 1–8
(2021)
10. Leiponen, A., Thomas, L.D., Wang, Q.: The dApp economy: a new platform for distributed
innovation? Innovation 1–19 (2021)
11. Li, J., Kassem, M.: Applications of distributed ledger technology (DLT) and blockchain-
enabled smart contracts in construction. Autom. Constr. 132, 103955 (2021)
12. Johnson, M., Jones, M., Shervey, M., Dudley, J.T., Zimmerman, N.: Building a secure biomed-
ical data sharing decentralized app (DApp): tutorial. J. Med. Internet Res. 21(10), e13601
(2019)
13. Hamilton, M.: Blockchain distributed ledger technology: an introduction and focus on smart
contracts. J. Corp. Account. Finance 31(2), 7–12 (2020)
14. Olnes, S., Ubacht, J., Janssen, M.: Blockchain in government: benefits and implications of
distributed ledger technology for information sharing. Gov. Inf. Q. 34(3), 355–364 (2017)
15. Riley, L.J., Kotsialou, G., Dhillon, A., Mahmoodi, T., McBurney, P.J., Pearce, R.: Deploying a
shareholder rights management system onto a distributed ledger. In: International Conference
on Autonomous Agents and Multiagent Systems (AAMAS) (2019)
16. Burke, J.J.: Distributed ledger technology. In: Financial Services in the Twenty-First
Century, pp. 131–154. Palgrave Macmillan, Cham (2021)
17. Chen, J., Chen, X., He, K., Du, R., Chen, W., Xiang, Y.: DELIA: distributed efficient log
integrity audit based on hierarchal multi-party state channel. IEEE Trans. Dependable Secure
Comput. (2021)
18. Wang, Z., Liffman, D.Y., Karunamoorthy, D., Abebe, E.: Distributed ledger technology for
document and workflow management in trade and logistics. In: Proceedings of the 27th ACM
International Conference on Information and Knowledge Management, pp. 1895–1898 (2018)
19. Polge, J., Robert, J., Le Traon, Y.: Permissioned DLT frameworks in the industry: a
comparison. ICT Express (2020)
20. Birman, K.: Reliable Distributed Systems. Springer, New York (2005). https://doi.org/10.
1007/0-387-27601-7
21. Crompton, C.J., Ropar, D., Evans-Williams, C.V., Flynn, E.G., Fletcher-Watson, S.: Autistic
peer-to-peer information transfer is highly effective. Autism 24(7), 1704–1712 (2020)
22. Chasin, F., von Hoffen, M., Cramer, M., Matzner, M.: Peer-to-peer sharing and collaborative
consumption platforms: a taxonomy and a reproducible analysis. IseB 16(2), 293–325 (2017).
https://doi.org/10.1007/s10257-017-0357-8
23. Saghiri, A.M., Meybodi, M.R.: An adaptive super-peer selection algorithm considering peers
capacity utilizing asynchronous dynamic cellular learning automata. Appl. Intell. 48(2), 271–
299 (2017). https://doi.org/10.1007/s10489-017-0946-8
24. Wang, J., Gao, Y., Liu, W., Sangaiah, A.K., Kim, H.J.: An intelligent data gathering schema
with data fusion supported for mobile sink in wireless sensor networks. Int. J. Distrib. Sensor
Netw. 15(3), 1–9 (2019)
25. Skrzypczak, J., Schintke, F., Schütt, T.: Linearizable state machine replication of state-
based CRDTs without logs. In: Proceedings of the 2019 ACM Symposium on Principles
of Distributed Computing, pp. 455–457 (2019)
26. Howard, H., Mortier, R.: Paxos vs Raft: have we reached consensus on distributed consensus?
In: Proceedings of the 7th Workshop on Principles and Practice of Consistency for Distributed
Data, pp. 1–9 (2020)
27. Chakrabarti, C.: iCredit: a credit based incentive scheme to combat double spending in post-
disaster peer-to-peer opportunistic communication over delay tolerant network. Wireless Pers.
Commun. 121(3), 2407–2440 (2021). https://doi.org/10.1007/s11277-021-08829-x
28. Hoepman, J.-H.: Distributed double spending prevention. In: Christianson, B., Crispo, B.,
Malcolm, J.A., Roe, M. (eds.) Security Protocols. LNCS, vol. 5964, pp. 152–165. Springer,
Heidelberg (2010). https://doi.org/10.1007/978-3-642-17773-6_19
29. Campêlo, R.A., Casanova, M.A., Guedes, D.O., Laender, A.H.F.: A brief survey on replica
consistency in cloud environments. J. Internet Serv. Appl. 11(1), 1–13 (2020). https://doi.org/
10.1186/s13174-020-0122-y
30. Brewer, E.: CAP twelve years later: how the “rules” have changed. IEEE Computer 45(2),
23–29 (2012)
31. Cachin, C., Guerraoui, R., Rodrigues, L.: Introduction to Reliable and Secure Distributed
Programming. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-15260-3
32. Lun, Y.Z., D’Innocenzo, A., Smarra, F., Malavolta, I., Di Benedetto, M.D.: State of the art
of cyber-physical systems security: an automatic control perspective. J. Syst. Softw. 149,
174–216 (2019)
33. Messié, V., Fromentoux, G., Labidurie, N., Radier, B., Vaton, S., Amigo, I.: BALAdIN:
truthfulness in collaborative access networks with distributed ledgers. Ann. Telecommun. 77,
1–13 (2021). https://doi.org/10.1007/s12243-021-00855-x
34. Garcia-Font, V.: Conceptual technological framework for smart cities to move towards
decentralized and user-centric architectures using DLT. Smart Cities 4(2), 728–745 (2021)
35. Vukolić, M.: Rethinking permissioned blockchains. In: Proceedings of the ACM Workshop
on Blockchain, Cryptocurrencies and Contracts, pp. 3–7 (2017)
36. Berger, C., Reiser, H.P.: Scaling byzantine consensus: a broad analysis. In: Proceedings of the
2nd Workshop on Scalable and Resilient Infrastructures for Distributed Ledgers, pp. 13–18
(2018)
37. Buterin, V.: A next-generation smart contract and decentralized application platform. White
Pap. 3(37), 1–36 (2014)
Design of Chebyshev Bandpass
Waveguide Filter for E-Band Based
on CSRR Metamaterial

Mahmoud Abuhussain(B) and Ugur Cem Hasar

Electrical and Electronics Engineering Department, Gaziantep University, Gaziantep, Turkey
ma41147@mail2.gantep.edu.tr

Abstract. A waveguide bandpass filter (WBPF) based on the Chebyshev response, operating in the E-band downlink channel [71 GHz–76 GHz] at a 73.5 GHz resonant frequency, has been designed and simulated. The new WBPF design uses complementary split-ring resonators (CSRRs) in which both rings are located transversely on the metallic sheet. A lumped-element circuit of the filter has been implemented and discussed as well. The circuit of the prospective bandpass filter has been designed and demonstrated via the electromagnetic full-wave simulator CST. By selecting proper physical dimensions of the CSRRs, a shorter physical length, a flat and low-loss passband, and a better return loss than the traditional waveguide filter are achieved. Subsequently, the proposed waveguide BPF and the traditional WBPF, which is coupled with inductive H-plane resonators, have been compared at the same 73.5 GHz resonant frequency. The new waveguide bandpass filter shortens the physical length of the WBPF by 37.5% and improves the return loss by up to 6.7%.

Keywords: Microwave filter · Metamaterials · Waveguide · Resonators · E-band

1 Introduction
Waveguides, in general, are electromagnetic structures constructed as hollow metallic guiding structures used in microwave communication and broadcasting components such as filters, couplers, combiners, and amplifiers, because of the high power handling and low power consumption of waveguides [1]. Recently, research on microwave bandpass filters (BPFs) has increased enormously, aiming to improve frequency selectivity and shorten the overall physical dimensions of the filter. Such demand has resulted in various proposed and implemented designs of waveguide filters with improved sharpness, bandwidth, and physical size [2–4].
Microwave filters are vital components employed for selecting and transmitting signals over a specified band while rejecting others [1]. They are available in several structures in the literature, such as waveguide [5] or microstrip [6]
Fig. 1. Lumped elements circuit model for CSRR.

bandpass filters; in either case, they have a typical form whose performance is evaluated using distributed-network concepts. Filters involve periodic structures exhibiting stopband and passband characteristics in various frequency channels [1].
For a transceiver communication system, maximum achievable data rate
requires minimum distortion of its Radio Frequency (RF) transfer character-
istics [7]. The E-band offers reasonable wideband frequency capability to reach
the high gigabit rate required in wireless transmission systems [8]. The fre-
quency band 71 GHz to 76 GHz is allocated for several applications by the
International Telecommunication Union (ITU) and The European Telecommu-
nication Standards Institute (ETSI) [9]. A collection of microwave filters with diverse characteristics of good insertion loss S21, better quality factor Q, and good selectivity for E-band applications has been proposed [2,10–18]. These studies have been implemented based on direct coupling phenomena [1,19].
In [2,5], the authors designed a two-channel waveguide diplexer for the E-band to be employed for point-to-point broadband wireless gigabit connectivity. The bandpass waveguide filter for the downlink channel [71 GHz to 76 GHz] with a 73.5 GHz center frequency uses H-plane inductive iris-coupled rectangular resonators and is implemented to have a five-pole Chebyshev response. In the downlink section of the diplexer, the BPF gives a 5 GHz bandwidth and −15 dB return loss with −60 dB selectivity. In [5], the waveguide bandpass filter is designed for the downlink channel 71 GHz to 76 GHz and a fractional bandwidth (FBW) of 6.8% at a 73.5 GHz center frequency. A five-pole Chebyshev waveguide filter with a passband ripple of 0.0432 dB and direct H-plane inductive irises is chosen. However, this method of filter design is not able to minimize the overall physical filter dimensions.
Lately, engineered materials [20], coined metamaterials (MMs) and studied earlier by Veselago [21], have been considered a new area that has opened the field of filter implementation due to their uncommon, odd, and exotic electromagnetic properties. For instance, split-ring resonators (SRRs) and their complementary counterparts (CSRRs) have played a crucial role in different filter
Fig. 2. Conventional waveguide bandpass filter model.

implementations [22]. Due to the resonant structures and behaviour of split-ring resonators, they can be employed for miniaturizing microwave components such as waveguide bandpass filters.
Several metamaterial resonators in different forms with diverse electromagnetic features have been implemented to miniaturize filter size by employing split-ring resonator (SRR) or complementary SRR (CSRR) structures [23–29]. Various rectangular CSRR structures implementing bandpass waveguide filters for both single and dual modes are shown and discussed in [30], and double SRRs were used to reduce the size of a dual-band waveguide bandstop filter, as demonstrated in [31].
Beyond the foregoing research, MMs have progressively been employed in many telecommunication designs to enhance the performance of microwave filters, such as the bandwidth, return loss, or insertion loss, using coupled SRRs and CSRRs [32–35].
In this paper, in order to enhance the MM-based WBPF over a traditional WBPF for the E-band at a 73.5 GHz center frequency, and to improve the selectivity and insertion loss while decreasing the number of waveguide cavities, a simple double-symmetric unit-cell CSRR, ridged from the middle to ease tuning of the resonant frequency at 73.5 GHz, is synthesized and implemented as shown in Fig. 1. Changing the physical dimensions of the CSRR in a systematic manner and selecting proper parameters x, w, t, h, s, as shown in Fig. 1, gives an optimal solution for the overall filter design. The cutoff frequency f0, quality factor Q, and scattering parameters S11, S21 are calculated and discussed using the full-wave CST simulator, and a circuit simulator (ADS 2017) has been employed to design and simulate a lumped-element RLC model of the proposed filter.
Contrary to traditional forms of bandpass waveguide filters, shown in Fig. 2, which use directly coupled H-plane junctions [2,5], the desired bandpass waveguide filter has been implemented using the new metamaterial-based technology, designed with coupled resonators and without inserting waveguide junctions. Excluding waveguide junctions between coupled resonators gives the design crucial characteristics for microwave applications and components: compactness, simplicity, and good performance. Given the above features, employing the designed MM-based waveguide filter with a low-noise amplifier in E-band receiver circuits operating at the
73.5 GHz resonant frequency is highly desirable; the designed bandpass filter is employed to suppress unwanted signal bands in backhaul links.

2 The Chebyshev Bandpass Filters

To design a bandpass filter based on waveguide technology and the Chebyshev response, the WR-12 standard parameters, a > b, a = 3.0988 mm, b = 1.5494 mm, are used and simulated. A bandpass filter based on the Chebyshev response exhibits a maximally flat stopband and an equal-ripple passband, as discussed in [1]. The order n of the filter affects the ripple of the microwave filter response, along with the channel bandwidth and the sharpness. Namely, increasing the order n of the filter circuit directly affects the response, such as the cutoff and bandwidth; however, increasing the filter’s order complicates the design and requires more time for simulating and optimizing the response. The amplitude-squared transfer function that describes this type of response is given in [1]:
$$|S_{21}(j\Omega)|^{2} = \frac{1}{1 + \epsilon^{2}\,T_{n}^{2}(\Omega)}, \qquad (1)$$

where $T_n(\Omega)$ is a Chebyshev function of order $n$ and $\epsilon$ is the ripple constant, related to a given passband ripple $L_{Ar}$ in dB by

$$\epsilon = \sqrt{10^{L_{Ar}/10} - 1}. \qquad (2)$$
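As a numerical illustration of (1) and (2), the short Python sketch below evaluates the ripple constant for the 0.0432 dB passband ripple used later in this paper and the fifth-order Chebyshev magnitude response; it is a sketch for checking the formulas, not part of the CST design flow:

```python
import math

def ripple_constant(L_Ar_dB):
    """Equation (2): ripple constant from the passband ripple in dB."""
    return math.sqrt(10 ** (L_Ar_dB / 10) - 1)

def chebyshev_Tn(n, omega):
    """Chebyshev function of the first kind of order n."""
    if abs(omega) <= 1:
        return math.cos(n * math.acos(omega))
    return math.cosh(n * math.acosh(abs(omega)))

def s21_squared(n, omega, eps):
    """Equation (1): amplitude-squared transfer function."""
    return 1.0 / (1.0 + eps**2 * chebyshev_Tn(n, omega)**2)

eps = ripple_constant(0.0432)             # ~0.1 for a 0.0432 dB ripple
for omega in (0.0, 0.5, 1.0, 1.5, 2.0):   # normalized frequency
    print(omega, 10 * math.log10(s21_squared(5, omega, eps)), "dB")
```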

If the dominant TE101 propagation pattern in the E-band cavity waveguide is considered and the propagation is along the z-axis, then the longitudinal electric field Ez = 0; in other words, there is no electric-field component in the direction of propagation inside the waveguide cavity. Similarly, for transverse magnetic modes, the magnetic component along the direction of propagation satisfies Hz = 0; this part is discussed comprehensively in [1].

2.1 Design of Traditional Bandpass Filter

A fifth-order waveguide bandpass filter, n = 5 [2,5], has been designed with a Chebyshev response based on (1) and (2), as shown in Fig. 2. The waveguide bandpass filter has been designed for the downlink band using iris-coupled resonators [19]. The waveguide filter is employed for E-band applications in the [71 GHz–76 GHz] channel with a 73.5 GHz center frequency, a 5 GHz bandwidth, and a −20 dB return loss, as shown in Fig. 2. The input and output external quality factors and the coupling coefficients are computed for a fractional bandwidth FBW = 6.8%.
The calculated values are Kc12 = Kc45 = 0.0589, Kc23 = Kc34 = 0.0432, and
Q = 14.279.
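These values are consistent with the classical direct-coupled-resonator design formulas K(i,i+1) = FBW/sqrt(g_i g_{i+1}) and Qe = g0 g1/FBW (cf. [19]); assuming those standard formulas and the g-values listed in Sect. 2.3, the short Python sketch below reproduces them:

```python
import math

FBW = 0.068                                   # fractional bandwidth, 6.8%
g = [1, 0.9714, 1.3721, 1.8014, 1.3721, 0.9714, 1]  # g-values from Sect. 2.3

Qe = g[0] * g[1] / FBW                        # external quality factor
K = [FBW / math.sqrt(g[i] * g[i + 1]) for i in range(1, 5)]

print(f"Qe = {Qe:.3f}")                       # ~14.29, close to Q = 14.279
for i, k in enumerate(K):
    print(f"Kc{i+1}{i+2} = {k:.4f}")
# close to Kc12 = Kc45 = 0.0589 and Kc23 = Kc34 = 0.0432 quoted above
```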
Fig. 3. S-parameters comparison between conventional and proposed resonator at resonant frequency (|S11| and |S21| in dB versus frequency in GHz).

2.2 The Proposed Bandpass Waveguide Filter

In the near past, Veselago was the first physicist to explain and cover the topic of such unnatural and unique materials [21]. The newly exposed material has negative constitutive parameters μ and ε, the permeability and permittivity, respectively. Significantly later, a medium with simultaneously negative parameters μ and ε was demonstrated experimentally [20]. In [37], the authors used CSRRs (the complementary image of SRRs) to design a highly compact microwave filter by engraving the unit cell into the metallic sheet. In this paper, based on [38], a double-ring CSRR whose rings are coupled to each other directly, without any junctions, and which is loaded into the waveguide as shown in Fig. 1, is designed and simulated. Selecting proper CSRR physical dimensions and exploiting the coupling between both rings moves the resonant frequency and controls the bandwidth as desired.
As shown in Fig. 3, the CST full-wave simulator is employed to design and simulate a loaded waveguide cavity whose initial physical length is d = λ/2 at the 73.5 GHz center frequency. The lower cutoff frequency is fl = 71 GHz and the higher cutoff frequency is fh = 76 GHz, with a 0.043 dB passband ripple. In addition, a substrate material (RT/Duroid) with a thickness of 0.5 mm and relative permittivity εr = 2.2 is used, and an appropriate mesh density of 10 steps per wavelength with a minimum of 20 steps is selected. Annealed copper (lossy material) with conductivity σ = 5.8 × 10^7 S/m and thickness 30 µm is chosen for the metallic layer, and the impedance at the resonant frequency is Z0 = 499 Ω. Figure 3 shows the frequency response of both the conventional and the proposed resonator. At high frequencies, the microwave resonant circuit behaves like a low-frequency RLC lumped-element circuit that can be excited by an external magnetic source.
Fig. 4. S-parameters comparison between lumped element circuit and proposed resonator at resonant frequency (|S11| and |S21| in dB versus frequency in GHz).

Fig. 5. Proposed three resonators waveguide bandpass filter.

To obtain the lumped-element model from the distributed model, the component values are calculated using (3)–(5), as proposed in [30]. The corresponding lumped-element RLC circuit for the unit cell of the proposed WBPF is presented in Fig. 1.
Fig. 6. S-parameters comparison between conventional and proposed waveguide bandpass filter at 73.5 GHz resonant frequency (|S11| and |S21| in dB versus frequency in GHz).

The response of both scattering parameters, S11 and S21, for the CST model and the circuit-simulator model is shown in Fig. 4.

$$R_i = Z_0\,\frac{|S_{21}(j\omega_0)|}{2\,\bigl(1 - |S_{21}(j\omega_0)|\bigr)}, \qquad (3)$$

$$L_i = B_{3\mathrm{dB}}\,Z_0\,\frac{|S_{21}(j\omega_0)|}{2\,\omega_0^{2}}, \qquad (4)$$

$$C_i = \frac{2}{B_{3\mathrm{dB}}\,Z_0\,|S_{21}(j\omega_0)|}, \qquad (5)$$

where ω0 is the angular frequency (rad/s), B3dB is the 3 dB bandwidth at the considered frequency, Z0 is the port impedance, and S21 is the passband S-parameter at the considered resonant frequency. Based on what is proposed in [30] and the definition of the circuit R, L, and C parameters shown in Fig. 1, and because of the symmetry between the two CSRRs, the values are obtained at the 73.5 GHz resonant frequency using (3)–(5): Z0 = 499 Ω, B3dB = 1.0925 GHz, R1,2 = 105,100 Ω, C1,2 = 0.036375 pF, L1,2 = 0.12895 nH.
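A small sketch of the extraction in (3)–(5) is given below. The |S21(jω0)| magnitude is a placeholder assumption (in practice it is read from the simulated passband response), and the bandwidth-unit convention is assumed to be rad/s, so the printed numbers only follow the formulas rather than reproduce the table above:

```python
import math

def extract_rlc(s21_mag, Z0, f0_hz, b3db_hz):
    w0 = 2 * math.pi * f0_hz                 # angular resonant frequency (rad/s)
    b3db = 2 * math.pi * b3db_hz             # 3 dB bandwidth, assumed in rad/s
    R = Z0 * s21_mag / (2 * (1 - s21_mag))   # equation (3)
    L = b3db * Z0 * s21_mag / (2 * w0**2)    # equation (4)
    C = 2 / (b3db * Z0 * s21_mag)            # equation (5)
    return R, L, C

# Z0, f0, and B3dB are the values stated above; |S21| = 0.95 is an
# illustrative assumption.
R, L, C = extract_rlc(s21_mag=0.95, Z0=499, f0_hz=73.5e9, b3db_hz=1.0925e9)
print(f"R = {R:.1f} ohm, L = {L*1e9:.4f} nH, C = {C*1e12:.4f} pF")
```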
2.3 Discussion
Based on the initial design specifications of the H-plane bandpass filter (resonant frequency f0, number of resonators n, bandwidth BW, low-pass prototype element values gi, and an assumed filter insertion loss), the conventional resonators are designed to obtain the desired WBPF shown in Fig. 2. When a bandpass filter is needed, the first step is to examine the form of the low-pass prototype circuit. The prototype circuit consists of g parameters, which in our WBPF design are as follows: g0 = 1, g1 = 0.9714, g2 = 1.3721, g3 = 1.8014, g4 = 1.3721, g5 = 0.9714, and g6 = 1. In the second step of designing the WBPF, the frequency-transformation approach has been used and calculated [36], as demonstrated in Fig. 2. When the traditional E-band bandpass waveguide filter designed with the insertion-loss method (ILM) [2,5], which uses H-plane inductive irises, and the desired metamaterial-based bandpass waveguide filter demonstrated in Fig. 5 are compared, we notice that the filter using metamaterial technology shortens the physical length of the WBPF by 37.5% and improves the return loss S11 (or S22) by up to 6.7% at the same 73.5 GHz center frequency, as shown in Fig. 6.

3 Conclusion
A straightforward procedure and a new technique to design and implement waveguide filters using metamaterial technology have been presented. The filter design methodology is clearly shown and discussed in this paper. The proposed filter is able to control the bandwidth and the resonant frequency of the overall filter response. A third-order waveguide bandpass filter loaded with CSRR meta-resonators has been designed and simulated. The initial response of the designed bandpass filter has been tuned and optimized to reach the final response of the proposed filter at the 73.5 GHz resonant frequency. The filter’s size has been reduced by 37.5% compared with the traditional BPF with H-plane junctions. In future work, the authors will propose and design a waveguide diplexer and multiplexer based on the new approach presented here.

References
1. Pozar, D.M.: Microwave engineering. In: Electrical Engineering, 4th edn., pp. 102–
110. Wiley, Hoboken (2012). Ch. 3, Sect. 3.2
2. Skaik, T., AbuHussain, M.: Design of diplexers for e-band communication systems.
In: 2013 13th Mediterranean Microwave Symposium (MMS), Saida, pp. 1–4 (2013)
3. Atia, A.E., Williams, A.E.: Narrow-bandpass waveguide filters. IEEE Trans.
Microwave Theor. Tech. 20(4), 258–265 (1972)
4. Bonetti, R.R., Williams, A.E.: Application of dual TM modesto triple-and
quadruple-mode filters. IEEE Trans. Microwave Theor. Tech. 35(12), 1143–1149
(1987)
5. Abuhussain, M.M.: An E-band Diplexer for Gigabit Wireless Communications Sys-
tems. M.S. Thesis, Electronics Engineering Department, IUG, Gaza (2013)
6. Hong, J.S., Lancaster, M.J.: Microstrip Filters for RF/Microwave Applications. In:
Electronics, 1st edn., pp. 77–87. Wiley, Hoboken (2001). Ch. 4
7. Stander, T.: A review of key development areas in low-cost packaging and inte-
gration of future E-band mm-wave transceivers. In: AFRICON, Addis Ababa, pp.
1–5 (2015)
8. Boes, F., et al.: Multi-gigabit E-band wireless data transmission. In: IEEE MTT-S
International Microwave Symposium, pp. 1–4 (2015)
9. Frecassetti, M.G.L.: E-band and v-band - survey on status of worldwide regulation.
ETSI ISG mWT White Paper (2015)
10. Chan, K.Y., Ramer, R., Mansour, R.R., Guo, Y.J.: 60 GHz to E-band switchable
bandpass filter. IEEE Microwave Wirel. Compon. Lett. 24(8), 545–547 (2014)
11. Dilek, S.M., Henneberger, R., Kallfass, I.: Performance analysis of e-band duplex
transceiver based on waveguide diplexer filters. In: 48th European Microwave Con-
ference (EuMC), pp. 1069–1072. IEEE (2018)
12. Ding, D.Z., Xu, J.P.: Low conversion loss full E-band seventh-harmonic mixer with
compact filter. Electron. Lett. 50(7), 526–528 (2014)
13. Fang, D., Zhang, B., He, J.: A E-band E-plane type waveguide bandpass filter.
In: IEEE 9th UK-Europe-China Workshop on Millimetre Waves and Terahertz
Technologies (UCMMT), pp. 180–182. IEEE (2016)
14. Vosoogh, A., et al.: Compact integrated full-duplex gap waveguide-based radio
front end for multi-Gbit/s point-to-point backhaul links at E-band. IEEE Trans.
Microwave Theor. Techn. 67(9), 3783–3797 (2019)
15. Xu, X., Zhang, M., Hirokawa, J., Ando, M.: E-band plate-laminated waveguide
filters and their integration into a corporate-feed slot array antenna with diffu-
sion bonding technology. IEEE Trans. Microwave Theor. Techn. 64(11), 3592–3603
(2016)
16. Zhang, B., Zirath, H.: 3D printed iris bandpass filters for millimetre-wave applica-
tions. Electron. Lett. 51(22), 1791–1793 (2015)
17. Zhang, B., Zirath, H.: A metallic 3-D printed E-band radio front end. IEEE
Microwave Wirel. Compon. Lett. 26(5), 331–333 (2016)
18. Zou, T., Zhang, B., Fan, Y.: Design of a 73 GHz waveguide bandpass filter. In: IEEE
9th UK-Europe-China Workshop on Millimetre Waves and Terahertz Technologies
(UCMMT), pp. 219–221. IEEE (2016)
19. Cohn, S.B.: Direct-coupled-resonator filters. In: Proceedings of the IRE, vol. 45,
no. 2, pp. 187–196 (1957)
20. Pendry, J.B., Holden, A.J., Robbins, D.J., Stewart, W.J.: Magnetism from con-
ductors and enhanced nonlinear phenomena. IEEE Trans. Microwave Theor. Tech.
47(11), 2075–2084 (1999)
21. Veselago, V.G.: Electrodynamics of substances with simultaneously negative elec-
trical and magnetic permeabilities. Sov. Phys. Usp. 10(4), 504–509 (1968)
22. Garcia, J., Bonache, J., Gil, I., Martin, F., Velazquez-Ahumada, M.C., Martel, J.:
Miniaturized microstrip and CPW filters using coupled metamaterial resonators.
IEEE Trans. Microwave Theor. Tech. 54(6), 2628–2635 (2016)
23. Hrabar, S., Jankovic, G., Zivkovic, B., Sipus, Z.: Numerical and experimental inves-
tigation of field distribution in waveguide filled with anisotropic single negative
metamaterial. In: 18th International Conference on Applied Electromagnetics and
Communications, Dubrovnik, pp. 1–4 (2005)
24. Liu, Y., Ma, H.: A broadband bandpass rectangular waveguide filter based on
metamaterials. In: International Workshop on Metamaterials (Meta), Nanjing, pp.
1–4 (2012)
25. Yelizarov, A.A., Nazarov, I.V., Sidorova, T.V., Malinova, O.E., Karavashkina,
V.N.: Modeling of a waveguide stop-band filter with a mushroom-shaped metama-
terial wall and dielectric substrates. In: Systems of Signal Synchronization, Gener-
ating and Processing in Telecommunications (SYNCHROINFO), Minsk, pp. 1–3
(2018)
26. Bahrami, H., Hakkak, M., Pirhadi, A.: Using Complementary Split Ring Res-
onators (CSRR) to design bandpass waveguide filters. In: Asia-Pacific Microwave
Conference, Bangkok, pp. 1–4 (2007)
27. Hidayat, M.R., Munir, A.: Rectangular waveguide BPF using split ring resonator
metamaterials. In: 22nd Asia-Pacific Conference on Communications (APCC),
Yogyakarta, pp. 604–608 (2016)
28. Becharef, K., Nouri, K., Abes, T.: Enhanced performance of substrate integrated
waveguide bandstop filter based on metamaterials SCSRRs. In: 6th International
Conference on Image and Signal Processing and their Applications (ISPA), Mosta-
ganem, Algeria, pp. 1–5 (2019)
29. Kiumarsi, H., Wasa, K., Ito, H., Ishihara, N., Masu, K.: E-band filters based on
substrate integrated waveguide octagonal cavities loaded by complementary split-
ring resonators. In: IEEE MTT-S International Microwave Symposium, Phoenix,
AZ, pp. 1–4 (2015)
30. Stefanovski, S. L., Potrebić, M. M., Tošić, D. V.: Design and analysis of band-
pass waveguide filters using novel complementary split ring resonators. In: 11th
International Conference on Telecommunications in Modern Satellite, Cable and
Broadcasting Services (TELSIKS), Nis, pp. 257–260 (2013)
31. Fallahzadeh, S., Bahrami, H., Tayarani, M.: A novel dual-band bandstop waveg-
uide filter using split ring resonators. Prog. Electromagnet. Res. Lett. 12, 133–139
(2009)
32. Odabasi, H., Teixeira, F.L.: Electric-field-coupled resonators as metamaterial load-
ings for waveguide miniaturization. J. Appl. Phys. 114(21), 214901 (2013)
33. Nassar, S.O., Meyer, P.: Pedestal substrate integrated waveguide resonators and
filters. IET Microwaves Antennas Propag. 11(6), 804–810 (2017)
34. Torabi, Y., Dadashzadeh, G., Oraizi, H.: Miniaturized sharp band-pass filter based
on complementary electric-LC resonator. Appl. Phys. A 122, 273 (2016). https://
doi.org/10.1007/s00339-016-9787-2
35. Dong, Y., Itoh, T.: Miniaturized dual-band substrate integrated waveguide fil-
ters using complementary split-ring resonators. In: IEEE MTT-S International
Microwave Symposium, Baltimore, MD, pp. 1–4 (2011)
36. Hong, J.S.G., Lancaster, M.J.: Microstrip Filters for RF/Microwave Applications,
vol. 167. Wiley, Hoboken (2004)
37. Ortiz, N., et al.: Complementary split-ring resonator for compact waveguide filter
design. Microwave Opt. Technol. Lett. 46(1), 88–92 (2005)
38. AbuHussain, M., Hasar, U.C.: Design of X-bandpass waveguide Chebyshev filter
based on CSRR metamaterial for telecommunication systems. Electronics 9(1),
101 (2020)
The Pedagogical Aspect of Human-Computer
Interaction in Designing: Pragmatic Examples

Zahra Hosseini1(B) , Kimmo Hytönen2 , and Jani Kinnunen3


1 Tampere University, Kalevantie 4, 33100 Tampere, Finland
zahra.hosseini@tuni.fi
2 Independent M.Sc. Engineering Researcher, Tampere, Finland
3 Åbo Akademi University, Tuomiokirkontori 3, 20500 Turku, Finland

jani.kinnunen@abo.fi

Abstract. In the last two decades, behavioural science has influenced technology
design and developed the models and methods of user-centred design, human-
technology interaction, and user experience to increase user satisfaction. Cur-
rently, designers are utilizing the results of human science studies to develop the
effectiveness of technology and increase acceptance, ease of use, desirability, and
accessibility of technology for their users. Pedagogy is a purposeful and impor-
tant aspect of human science that can guide designers to direct human mind and
behaviour. Integrating pedagogy into technology is suggested in the recent Tech-
nological Pedagogical Content Design (TPCD) model of the authors. This article
explains four pragmatic examples of using pedagogical theories and principles in
designing technology: (1) utilizing learning dimensions for directing users’ minds
(e.g., Bloom taxonomy); (2) utilizing pedagogy to understand users (e.g., VARK
learning styles); (3) utilizing pedagogy to organize the content (e.g., concept maps or Gestalt principles); and (4) utilizing pedagogy to provide user-interaction methods (e.g., Edgar Dale's cone of experience).

Keywords: Pedagogical integration · TPCD · Pedagogical interface · Technology design

1 Introduction
We are experiencing a period when digital technology influences every aspect of life.
With the development of technology during the last two decades, technology designers have paid more attention to making technology functional, usable, and meaningful for people and to increasing user satisfaction. It has led to the development of several disci-
plines of information technology such as Human-Computer Interaction (HCI), Human-
Technology Interaction (HTI), User-Centered Design (UCD), User Experience (UX),
and Life-Based Design (LBD). These various disciplines are associated with the under-
standing of human thought and behaviour interacting with technology. Pedagogy is an
important part of human science that has been defined as “the study of methods of
teaching and gaining a systematic understanding of how the human mind acquires new
information” [1]. Although learning is not merely the result of teaching, the concept
of pedagogy is mostly used in formal educational contexts. In this article, pedagogical
theories and principles are applied to direct and extend users’ minds in both educational
and non-educational settings through technology design.

2 Pedagogical Interface in Designing

This study utilizes the concepts of pedagogy and learning not only to explain the changes in the human mind or the growth occurring in formal educational environments, but also those occurring through other life experiences. People's cognition, attitudes, and skills change throughout life, particularly through interaction with technology. The Covid-19 pandemic has increased people's dependency on digital services, and technology interfaces
have gained increasing importance [2]. Today, people are living with technology, not merely using it. New knowledge, attitudes, and skills are transferred through technology at every moment and, given the prevalent pedagogical aspects, changes in human beings can be viewed as either purposeful or non-purposeful learning. Hence, pedagogy has an important role in human science, and the integration of pedagogy into technology needs to be considered in the field of human-technology interaction. This article suggests the new term pedagogical interface (using pedagogical theories and principles in technology design), alongside established terms in information technology such as user interface (connecting the user with an operating system), software interface (letting programs and code communicate with each other), and hardware interface (letting hardware components communicate with each other).
Within the last two decades, the term Human-Computer Interaction has been attached to computer design for studying the way users interact with computer technology. HCI uses behavioural science to find ways to enhance usability, desirability, etc., among computer users. Based on HCI, the concepts of user experience research and user-centered design have been developed and, consequently, theories and models have been introduced for designing, analysing, and evaluating digital services to increase the adoption of technology by end-users. These models facilitate the classification of barriers that users face and guide designers to analyse and evaluate the different factors affecting the satisfaction of users.
nizational levels (e.g., reengineering, management quality and organization culture), (ii)
group levels (e.g., professional values and culture, and user satisfaction) and (iii) individ-
ual levels (e.g., attitudes, user satisfaction, motivation, user involvement). In this regard,
many theories and models of the psychological and behavioural sciences can explain the use of information technology, such as the theory of planned behaviour, the theory of reasoned action, and social cognitive theory [3].
users’ behaviour toward technology adoption, the pedagogical interface emphasizes the
integration of pedagogy into technology design to strengthen users’ minds in interacting
with technology and directing their knowledge, attitudes, and skills.

3 Integrating Pedagogy into Technology


Integrating pedagogy into technology is elaborated through the Technological Pedagog-
ical Content Design (TPCD) model [4]. This model has already suggested a system-
atic approach for designing public service websites [5]. Based on TPCD, pedagogy is considered an important factor for increasing the quality of technology design (cf.
Fig. 1). TPCD is the result of the integration of technology, pedagogy, and content, and
it makes understanding the content (information) easier, faster, and more effective for
end-users. It has three main components, i.e., technology, content, and pedagogy, as well
as three sub-integrations, i.e., technological content (TC), pedagogical content (PC), and technological pedagogy (TP), as depicted in Fig. 1.

Fig. 1. Technological Pedagogical Content Design (TPCD) (adapted from [4]).

TPC (cf. Fig. 1) and its components are known in the educational context as the
Technological Pedagogical Content Knowledge or TPACK [6]. The aim of this study is
to develop the framework and transfer it to practice in the area of information technology
with selected examples. In this regard, practical uses of pedagogical findings (theories
and principles) for technology design are discussed.

4 Aspects of Pedagogy in Technology Design

The human mind acquires new knowledge and skills based on certain rules and functions, which have been studied and presented as learning theories. Educators use these theories to define teaching and learning methods that support teaching students. On the other side, through marketing, companies use the results of human science to find ways to direct customers' minds to accept and buy products, while policymakers use them to share knowledge and to affect public opinion and the decisions of the audience. With developing technology, the influence of human science and human-technology interaction on the human mind and action has become so vigorous that the new term "mind hacking" has been coined to depict it [7–9].
Behaviourism, cognitivism, and constructivism are three important basic theories in
human science related also to pedagogical theories. Pedagogical theories explain how to change the knowledge, behaviour, attitudes, and skills of learners for educational purposes. Accordingly, educators have defined several principles, rules, and stages to elaborate the process of directing the human mind. In this article, pedagogical theories and principles are adapted to designing technology through TPCD.
Next, we introduce some pragmatic examples of adopting pedagogy to design tech-
nology including: (1) utilizing learning dimensions to direct users’ minds with the appli-
cation of Bloom taxonomy to determine the domain and level of goals of a digital service;
(2) utilizing pedagogy to understand users with the application of Edgar Dale’s cone of
experience and VARK learning styles; (3) utilizing pedagogy to organize content by
applying the concept map and Gestalt principles; and (4) utilizing pedagogy to provide
interaction methods for users.

4.1 Utilizing Learning Dimensions for Directing Users’ Minds


Based on constructivism, the construction of human knowledge, values, and skills is
built by facing new phenomena and experiences. Therefore, the background of users
is important for presenting new information. In 1956, Benjamin Bloom with his collab-
orators introduced a taxonomy to classify human learning [10]. According to Bloom’s
taxonomy, learning objectives are divided into three domains including: (1) the cogni-
tive domain (knowledge-based), (2) the affective domain (emotion-based), and (3) the
psychomotor domain (action-based). Each domain has several levels or stages. While
Bloom’s taxonomy is mostly used for determining learning objectives in educational
environments, this paper applies these three dimensions and their levels to designing
digital services.
As an example, many companies and organizations are connecting to their end-users
through their websites. Based on TPCD, the first step of web designing is to define the
goal of the website. Regarding Bloom’s taxonomy, TPCD suggests that designers define
the aim of the organization, focusing on the behavioural actions or reactions the user
is expected to make. Bloom’s taxonomy can guide the designer to define the goal and
determine the starting point together with further progress of a design process.

The Cognitive Domain. Many organizations’ websites are designed to transfer infor-
mation to users through content-based websites. An aim of such websites is to change
the users’ cognition and users are expected to gain knowledge and understand a phe-
nomenon or a process. Based on Bloom’s taxonomy, the goal of the website is in the
cognitive domain. Therefore, it is suggested to determine the level (depth) that the users
are expected to understand. This may concern the range of memorizing, understand-
ing, applying, analysing, evaluating, and creating (cf. Fig. 2). Memorizing relates to
recalling data or information, while understanding refers to translation, extrapolation, and interpretation of instructions and problems in one’s own words. Using a concept in a new situation, or the unprompted use of an abstraction, is defined as applying. A person who is able to analyse can separate materials or concepts into components to understand their organizational structure. Evaluating refers to the ability to make judgments about the value of an opinion or material, and at the creating (synthesis) level the person builds a structure or pattern from diverse elements: putting parts together forms a whole with an emphasis on creating a new meaning or structure. Based on each level of cognition, a suitable technology can be selected (see Fig. 2).

Fig. 2. The cognitive domain (adapted from [10]).
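As an illustrative sketch of this step (our hypothetical example, not a prescription from [10]), a designer could make the mapping from target cognitive level to candidate technologies explicit at design time:

```python
# Hypothetical sketch: mapping Bloom's cognitive levels to candidate
# interaction techniques a designer might consider; the technique lists
# are illustrative assumptions, not recommendations from the taxonomy.
BLOOM_COGNITIVE_TECHNIQUES = {
    "remember":   ["glossaries", "FAQ pages", "flash-card widgets"],
    "understand": ["illustrated explanations", "short explainer videos"],
    "apply":      ["interactive tutorials", "guided forms"],
    "analyse":    ["filterable dashboards", "comparison tables"],
    "evaluate":   ["rating and review tools", "decision checklists"],
    "create":     ["editors", "configurators", "sandbox environments"],
}

def suggest_techniques(target_level: str) -> list:
    """Return candidate techniques for the cognitive level a service targets."""
    return BLOOM_COGNITIVE_TECHNIQUES.get(target_level.lower(), [])

print(suggest_techniques("apply"))  # ['interactive tutorials', 'guided forms']
```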

The Affective Domain. Digital services can change the attitude of users. The attitude
may create some new value and/or affect the values (preferences) of users related to
different phenomena or aspects of life such as healthcare, lifestyle, or culture (cf. the
stage of receiving phenomena in Fig. 3).

Fig. 3. The affective domain (adapted from [10]).

Currently, the most successful digital services are creating and changing users’ values
in a way resembling mind hacking. Information is part of the content that is presented
by a digital service. However, as the aim of companies is to encourage users to request
services or products, they need to follow the steps on the affective levels to influence
users’ minds and direct their values and preferences to the desired direction. Nowadays,
advertisements (covertly or openly) through digital platforms (e.g., social media) are
meant to direct the attention of users. The success of such advertising efforts depends
on how they affect the feelings of users and guide them to the subsequent level, which
makes them respond favourably (by buying a service or product, voting, etc.) and then

creates a new value for users, which is the third level of progressing with users’ feelings. The experience of a user with a specific service or product can guide her/him to the next level and reorganize her/his values and preferences, making the user a loyal customer of a brand, product, etc. with internalized values (see Fig. 3).

The Psychomotor Domain. Users may aim to promote manual or physical skills
through digital services. Knowledge and attitude (perception in Fig. 4) toward phys-
ical activity are important to make a user act, specifically in the first stage, but knowledge and a positive attitude alone may not bring a user to act.
In fact, in the second stage, a user needs to have some interaction with technology
(computer or program). For example, games represent a huge digital industry, which
effectively involves users in a kind of physical activity. Some digital services are built
to improve users’ abilities such as driving, typing, etc. However, for such an aim, a user
needs to obtain good instructions and perception before finding a mental and physical
position that makes him/her respond in a certain way to a situation (set) in the second
stage of Fig. 4.

Fig. 4. The psychomotor domain (adapted from [10]).

On the third level, a user receives guidelines and begins acting by trial and error
(guided response). When the responses become habitual, this is defined as mechanism, the fourth level. On the fifth level, called complex overt response, a user interacting with a digital service may make only minimal mental effort yet have a high rate of success.
On the sixth level, a user is skilled enough to make at least two actions at the same time
(adaptation). On the last level (origination), the performance is combined, sequenced,
and automatic with little physical or mental exertion (see Fig. 4).

Overlapping Domains. Understanding Bloom’s taxonomy helps a designer define the


steps needed to achieve the aim of a digital service. Although each goal focuses on one of the cognitive, affective, or psychomotor domains, the other domains partly overlap, which should not be neglected in designing (see Fig. 5).

Fig. 5. The overlapping cognitive (a), affective (b), and psychomotor (c) domains.

4.2 Utilizing Pedagogy to Understand Users


Attention to a user’s age, gender, prior knowledge, interests, skills, culture, education,
abilities, and experience is necessary to make decisions to design a product. Currently,
many models of user-experience research and design claim to have identified the factors increasing users’ acceptance and adaptation. In the last two decades, designing has moved from making things attractive to fulfilling users’ needs and understanding them [11]. Designers should be aware of end-users. They need to know their target group, e.g., adults, adolescents, or kids, and base the design on their psychology, literacy, and physiology to make decisions for selecting, organizing, and presenting the content. Beyond demographic characteristics and the background of users, aspects such as culture, education, and experience are essentially important for understanding the content in designing a particular technology.
As an empirical example with the pedagogical view, people understand content in different ways, driven, e.g., by their learning styles, which should likewise be considered in designing technology. As mentioned before, understanding users is the key to a successful design, and considering how users prefer to get information is essential in this process. People receive and understand information from their environment with different learning styles, and one of the most widely accepted learning-style models is VARK. The VARK model was designed by Neil Fleming in 1987 to explain how people understand information more easily through visual, auditory, read/write, and kinaesthetic ways [12]. A designer of a digital service (e.g., a website designer) must aim to reach the maximum share of end-users and, therefore, should consider various learning styles to increase the chances of accessibility and desirability of the content. Users with a visual understanding
style (learning style) may prefer to use a lot of diagrams and graphic organizers, which
are often color-coded, or have other visual ways of making distinctions or classifications
of the content. For them, a combination of pictures and texts enables the information
processing to be quicker and more meaningful. Auditory users, instead, prefer to receive
information by listening and they often focus on the tone and the rate of speech, for
example. Some supplementary resources like videos, storytelling strategies, or podcasts
can be further beneficial for them. Nowadays, many websites include recorded audio for
the users who would rather listen to a player than read text. Such an additional option increases the possibility of reaching a broader user base. Read/write users best understand the content
through text format. They can understand abstract concepts that have been presented in
words and essays. The users with kinaesthetic understanding style prefer to experience

and receive information through action and projects instead of being passive receivers. A
technology designer needs to provide some computer-interaction opportunities to attract
or satisfy this type of user.
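A minimal sketch of this idea, with hypothetical asset names: the same content is kept available in all four VARK modalities, and a user's stated preference only changes what is offered first.

```python
# Minimal sketch: alternative renderings of one piece of content keyed by
# VARK style. The asset names are hypothetical; a real service would serve
# all modalities and let the user choose rather than guess a single style.
CONTENT_ASSETS = {
    "visual":       "how-it-works-diagram.svg",
    "auditory":     "how-it-works-podcast.mp3",
    "read_write":   "how-it-works-article.html",
    "kinaesthetic": "how-it-works-interactive-demo.html",
}

def assets_for(preferences):
    """Assets matching the preferred styles first, falling back to all."""
    chosen = [CONTENT_ASSETS[p] for p in preferences if p in CONTENT_ASSETS]
    return chosen or list(CONTENT_ASSETS.values())

print(assets_for(["kinaesthetic", "visual"]))
```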

4.3 Utilizing Pedagogy to Organize the Content


For a designer, the key to success lies in how understandable the content is for end-users. It depends on how the content is organized and how well users can find, access, and understand it. This may be compared to the role of an educator, who is responsible for selecting the content based on goals, organizing it, and presenting it in a way that the audience can receive it easily and effectively. Therefore, pedagogical principles can help designers develop the content in many ways. Next,
two examples of applying the pedagogical principles in technology design are presented.

Concept/Mind Map. Concept/mind mapping represents a content-organizing strategy with which some instructional designers, engineers, and technical writers are familiar. It has been defined as the visual presentation of the relationships between different concepts. Concept mapping is a practical strategy of constructivism: a simple graph that presents knowledge/concepts in the form of nodes and the relationships between concepts as the branches of a tree. Mind mapping is a simple form of concept mapping that is advised in many learning theories to give meaning to a new concept (e.g., the Meaningful Learning Theory introduced by Ausubel in 1967 [13]). Wiezel stated: “If the instructor is successful in implementing level two of the second round, the mind map exercise will cover all six levels of learning as presented in Bloom’s taxonomy” [14, p. 339]. To draw a correct concept map for a particular content, knowledge of the structure of the subject as well as familiarity with the human mind and logic are essential. For example, mind mapping tools have been applied to website design in previous studies [15, 16].
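As a small illustration with hypothetical concepts, a concept map reduces to labelled node-to-node relations, from which, for instance, a website's navigation structure could be derived:

```python
# Sketch: a concept map as (concept, relation, concept) triples; the
# example concepts are hypothetical. Drawing and layout are left to tools.
concept_map = [
    ("public service", "offers", "appointment booking"),
    ("public service", "offers", "document requests"),
    ("appointment booking", "requires", "identification"),
    ("document requests", "requires", "identification"),
]

def linked_from(concept):
    """Concepts directly reachable from `concept`, with relation labels."""
    return [(rel, dst) for src, rel, dst in concept_map if src == concept]

print(linked_from("public service"))
```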

Gestalt Principles of Design. Gestalt principles are the result of the work of a group of German psychologists in the 1920s. They developed a theory to explain how the human brain attempts to simplify and organize complex images to understand the environment by giving structure,
logic, and pattern to designs. Some Gestalt principles include similarity, continuation,
proximity, common region, figure-ground, closure, focal point, and symmetry. Three of
these design principles are explained next.

(i) Similarity refers to how human mind intends to group elements by similar shape,
colour, fonts, etc.: “In user experience design, using similarity makes it clear to
your visitors which items are alike. For example, in a features list using repetitive
design elements (such as an icon accompanied by 3–4 lines of text), the similarity
principle would make it easy to scan through them. In contrast, changing the design
elements for features you want to highlight makes them stand out and gives them
more importance in the visitor’s perception” [17].
(ii) Continuation refers to how the human eye follows lines, curves, or a sequence of
shapes, colours, or forms to create a relationship between elements: “The Conti-
nuity principle strengthens the perception of grouped information, creating order

and guiding users through different content segments. Disruption of continuity can
signal the end of a section drawing attention to a new piece of content” [17].
(iii) Proximity refers to elements that are closer to each other being perceived by the human brain as a group. For a designer, it is important to help users find related content by keeping it close together, while irrelevant information is kept apart.

In general, Gestalt principles are seen as essential for user-experience design in understanding the human brain and using that knowledge to direct users’ attention in a wanted direction. Using these principles helps users make associations between related clusters and avoid confusion and struggling to meet their needs among the huge
amount of information. Gkogka states: “User Interface Design isn’t all about pretty
pixels and sparkly graphics. It’s mainly about communication, performance, and con-
venience. Gestalt principles are always current helping us achieve these goals, creating
a pleasant experience for the users and a great success for the business” [18].

4.4 Utilizing Pedagogy to Provide Interaction Methods for Users

Selecting a suitable computer interaction to ensure that end-users reach the goal that
designers expect is a crucial decision. One helpful pedagogical guide for designers to
make this selection is the cone of experience introduced by Edgar Dale [19] (see Fig. 6).
During the 1960s, Edgar Dale theorized that a human retains more information by
what they “do” as opposed to what they “hear”, “read”, or “observe”. Dale related richer experience to higher involvement of the audience, resulting in better retention of the presented information. He also claimed that action experiences result in up to 90% retention. Based on the cone of experience, the more sensory channels are involved in interacting with a resource, the greater the possibility of engaging users [19]. It is notable
that the selection of the type of experience to be provided must be done by keeping in
mind the limitations of the context. Particularly, the characteristics of the audience, the
nature of the content to be presented and the technical features of the presentation tools
available may determine this selection.
Currently, a wide range of tools, applications, and programs are designed to increase the interaction between users and computers in user-centred design. The game industry has grown, and it specifically attracts adolescent and young users. Online collaboration games have increased their motivation and engagement by providing a more dynamic environment. Further, websites are using chatbots to keep interacting with users. Podcasts, images, and videos are added to enhance sensory channels and increase users’ satisfaction. Preparing many computer-interaction opportunities is in line with Dale’s cone of experience theory, which designers may have found by trial and error. Theoretical knowledge of human science and technology interaction increases the quality of design work towards purposeful and systematic user experience and user-centered design.

Fig. 6. Edgar Dale’s cone of experience (adapted from [19]).

5 Conclusion

Designing stylish products without paying attention to end-users’ needs and expectations is most likely a waste of money and effort. User-centered design with a focus on Human-Computer Interaction (HCI) offers a new perspective for designers, bringing new factors, such as psychological ones, into design. By extending and specifying this process,
TPCD makes the content more effective and easier for users to understand through
introducing ways to integrate pedagogy into technology. The functioning of the human brain in receiving and processing new information while interacting with the environment has been broadly theorized and studied by pedagogy scientists. TPCD utilizes these
findings in user-centered design and user-experience efforts.
TPCD offers practical instructions to design digital services and products. It sug-
gests utilizing pedagogical principles to determine the aim, provide the content, select
the materials and techniques for presenting the content, and to plan user-interaction
activities. Accordingly, selected examples were proposed to explain how pedagogical findings can be applied in each step of user-centered design, drawing attention toward integrating pedagogy into technology design. The authors plan to develop the
approach by accumulating empirical experiences from different domain-specific case
studies.

References
1. Hawk, R.: What is pedagogical science? (2021). https://www.wise-geek.com/what-is-pedagogical-science.htm. Accessed 8 Sep 2021
2. Kinnunen, J., Georgescu, I.: Disruptive pandemic as a driver towards digital coaching in
OECD countries. Rev. Romaneasca pentru Educatie Multidimensionala 12(2), 55–61 (2020).
https://doi.org/10.18662/rrem/12.2Sup1/289

3. Kukafka, R., Johnson, S.B., Linfante, A., Allegrante, J.P.: Grounding a new information
technology implementation framework in behavioral science: a systematic analysis of the
literature on IT use. J. Biomed. Inform. 36(3), 218–227 (2003)
4. Hosseini, Z., Kinnunen, J.: Integration of pedagogy into technology: a practical paradigm.
In: Carmo, M. (ed.) Education and New Developments, pp. 406–410. Science Press, Lisbon
(2021). https://doi.org/10.36315/2021end086
5. Hosseini, Z., Kinnunen, J., Hytönen, K.: Utilizing technological pedagogical content (TPC)
for designing public service websites. In: Nagar, A.K., Jat, D.S., Marín-Raventós, G.,
Mishra, D.K. (eds.) Intelligent Sustainable Systems. LNNS, vol. 334, pp. 129–137. Springer,
Singapore (2022). https://doi.org/10.1007/978-981-16-6369-7_12
6. Mishra, P., Koehler, M.J.: Technological pedagogical content knowledge: a framework for
teacher knowledge. Teach. Coll. Rec. 108(6), 1017–1054 (2006). https://doi.org/10.1111/j.
1467-9620.2006.00684.x
7. Ienca, M., Haselager, P.: Hacking the brain: brain–computer interfacing technology and the
ethics of neurosecurity. Ethics Inf. Technol. 18(2), 117–129 (2016). https://doi.org/10.1007/
s10676-016-9398-9
8. Rugge, F.: Mind hacking: information warfare in the cyber age. ISPI 20(319), 1–
8 (2018). https://www.lumsa.it/sites/default/files/UTENTI/u1236/analisi319_rugge_11.01.
2018_2.pdf. Accessed 8 Sep 2021
9. Anderson, R.: Hacking the Mind: A Study of Russian and Chinese Cyberspace Influence
Operations. Doctoral Dissertation, Utica College (2020)
10. Hoque, M.E.: Three domains of learning: cognitive, affective and psychomotor. J. EFL Educ.
Res. 2(2), 45–52 (2016)
11. Norman, D.A.: The Design of Everyday Things. Currency Doubleday, New York (1990)
12. Fleming, N.D., Mills, C.: Not another inventory, rather a catalyst for reflection. Improve Acad.
11(1), 137–155 (1992). https://doi.org/10.1002/j.2334-4822.1992.tb00213.x
13. Ausubel, D.P.: The Acquisition and Retention of Knowledge: a Cognitive View. Kluwer
Academic Publishers, Amsterdam (2000)
14. Wiezel, A.: Empowering power points—using mind maps in construction education. In:
Proceedings of 2nd Specialty Conference on Leadership and Management in Construction,
pp. 334–340. PM Publishing, Louisville (2006)
15. Hosseini, Z., Okkonen, J.: Web-based learning for cultural adaptation: constructing a digital portal for Persian speaking immigrants in Finland. In: Arai, K. (ed.) Intelligent Computing. LNNS, vol. 283, pp. 930–945. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-80119-9_62
16. Hosseini, Z., Hytönen, K., Kinnunen, J.: Technological Pedagogical Content Design (TPCD) for User-Centered Website: A User Case Study in Finland (unpublished)
17. Chapman, C.: Exploring the gestalt principles of design (2021). https://www.toptal.com/designers/ui/gestalt-principles-of-design. Accessed 8 Sep 2021
18. Gkogka, E.: Gestalt principles in UI design (2018). https://medium.muz.li/gestalt-principles-in-ui-design-6b75a41e9965. Accessed 8 Sep 2021
19. Dale, E.: Audio-Visual Methods in Teaching, 3rd edn. Holt, Rinehart & Winston, New York
(1969)
A Brief Literature Review About Bioinspired
Metaheuristics to Solve Vehicle Routes Problem
with Time Window

Braynner Polo-Pichon1, Alexander Troncoso-Palacio2(B), and Emiro De-La-Hoz-Franco1
1 Department of Computer Science and Electronics, Universidad de la Costa CUC,
Barranquilla, Colombia
{bpolo6,edelahoz}@cuc.edu.co
2 Department of Productivity and Innovation, Universidad de la Costa CUC, Barranquilla,
Colombia
atroncos1@cuc.edu.co

Abstract. The Vehicle Routing Problem with Time Windows has become a widely studied optimization problem, because the time constraint greatly expands the space of possible solutions and allows the model to get closer to reality. In this investigation, the restrictions, conclusions, and future work of around two hundred papers indexed in Web of Science, Scopus, and Sage were consulted. In these, the most widely used bioinspired metaheuristic algorithms for solving the vehicle routing problem with time windows were analyzed. The objective of this research is to provide a reference that gives a broad, documented vision to future research on bioinspired metaheuristics. Among the findings: little written evidence was found about the periodic restriction, and only some papers involve heterogeneous fleets. From the perspective of metaheuristic techniques, the algorithms most used to solve this problem are genetic algorithms. In addition, little information was found on the application of the buffalo and cuckoo algorithms, which can become a great opportunity for new research in this field. Based on these findings, it can be concluded that the application of hybrid algorithms could be analyzed in depth, because there is little information on it and it represents a wide and interesting spectrum of research that can be used to improve solutions to these problems.

Keywords: Metaheuristic literature review · Metaheuristic algorithms · Time windows · Vehicle routing problem · Restrictions on vehicle routing

1 Introduction
In the world, the transport of people and objects from one place to another has increased over the years, and transport systems have become an important part of everyday life. From the government’s perspective, in some countries, according to [1, 2], the absence or inadequate planning of the social aspect is undeniable, and this turns out to be a factor of inequality in fundamental issues such as access to education and health: an inequity that determines whether or not the inhabitants of a city have access to the city and to its different activities and services. Based on this, through the 17 Sustainable Development Goals, the 2030 Agenda proposed peace and prosperity for all of society, and it is clearly evidenced there that transport implicitly supports the achievement of these objectives. Regarding the opportunities to reduce inequalities, Goal 10 shows that the lack of transport is an obstacle to achieving the targets, which is confirmed in the research of [3]. On the health side, incorrect planning of transport routes is evidenced in [4] and [5]. According to item 13 of the Pan American Health Organization (PAHO), supported by the World Health Organization (WHO), it is estimated that 9 out of 10 people worldwide breathe polluted air, with fuel burning being the main culprit [6].
All the above aspects have alerted the responsible authorities, motivating them to apply different measures to remedy this situation: in some places, strategies such as car-free days or circulation restrictions based on the number plate; elsewhere, the use of cleaner technologies such as electric or natural-gas buses for public transport and the regulation of heavy-duty vehicles. Therefore, proper planning of the transportation service could contribute to reducing inequality biases among low-income families and, in terms of health, it would be less stressful for drivers and users, since the traffic congestion rate would decrease. There are also models based on the simulation of systems; according to [7, 8], in these, a series of studies is carried out to understand the current behavior of a process, make changes or modifications to improve it, and predict future behaviors with greater precision, allowing a better evaluation that supports decision-making in a real solution. The VRP is considered a complex optimization problem, and its solution is therefore a challenge for all providers. Due to this, heuristic and metaheuristic techniques have been applied, allowing near-optimal solutions to be obtained in less time.

2 Literature Review on Vehicle Routing

The vehicle routing problem (VRP), according to [9], is based on planning routes for a set of vehicles with a common starting point, where each vehicle must be assigned a route that meets the previously known demand of customers while respecting the vehicle capacity restrictions. In [10], it is said that the VRP aims to find a set of feasible routes for vehicles, where each customer is visited once on the route, at the lowest possible cost, without exceeding the capacity of the vehicle and finishing the route where it started. The VRP is not the first problem of its kind: it was initially modeled as the Traveling Salesman Problem (TSP), defined in [11] as the shortest closed tour that a salesman must take between a number n of cities, where the salesman must visit all cities once and return to the city of origin [12, 13]. Within the literature, the TSP and the VRP have very similar theoretical structures. Both are modeled as directed graphs G(V, E), since both represent the routes of the network with weights wij on the edges and, in some cases, travel times between nodes i and j, as done for the VRP in [14] and for the TSP in [13]. Furthermore, the first appearance of the VRP is registered as a large-scale TSP [15].

In that research, one also sees the application of linear programming, one of the methods traditionally used for solving combinatorial optimization problems of the NP-hard type. Due to their similar characteristics, such techniques are inherited by the VRP from investigations of the TSP in works such as [16], where linear programming is applied to optimize a tour of 85,900 cities, and [17], where it is applied to a bicycle-sharing system as a TSP; for the VRP, restrictions were incorporated for the solution of a transport demand system [18]. These are restrictions that the VRP shares with the TSP, such as delivery times, travel time, and type of fleet. The research trend up to 2020 in the databases was surveyed using keywords such as the capacitated VRP (CVRP), which is the traditional VRP that starts the VRP theme, and the Open VRP (OVRP), where it is not necessary for vehicles to return to the place where they started (depots of origin). The VRP with Time Windows (VRPTW) presented an average of approximately 142 publications, where the three countries that contributed the most were Canada, China and the R Peoples, with 72, 122, and 182, respectively. Given this prevalence, the time-window restriction will be the focus of this document.
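To make concrete what the time-window restriction adds to the model, the following minimal sketch (our illustration, not drawn from any cited work) checks the hard-window feasibility of a single route; the travel-time matrix, service times, and window data are assumed inputs, with waiting allowed on early arrival:

```python
# Minimal sketch: hard time-window feasibility of one route (depot = node 0).
# Assumes a travel-time matrix t, service times s, and windows [e[i], l[i]];
# arriving early means waiting until e[i], arriving after l[i] is infeasible.
def route_is_feasible(route, t, s, e, l):
    time, prev = 0.0, 0                      # leave the depot at time 0
    for node in route:
        time += t[prev][node]                # drive to the next customer
        if time > l[node]:                   # too late: window already closed
            return False
        time = max(time, e[node]) + s[node]  # wait if early, then serve
        prev = node
    return time + t[prev][0] <= l[0]         # must also reach the depot in time

# Tiny illustrative instance: depot (0) and two customers (1, 2).
t = [[0, 10, 15], [10, 0, 5], [15, 5, 0]]
s = [0, 3, 3]
e, l = [0, 0, 20], [100, 30, 40]
print(route_is_feasible([1, 2], t, s, e, l))  # True
```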
The VRP has also been associated with environmental sustainability, due to the opportunity and responsibility that transport has toward the environment, in what is known as the Green VRP. For this theme, each year showed an increase of approximately 50 publications over the previous year, and the three countries that contributed the most research were Peoples R China with 35, Iran with 21, and the United States with 15. In [19], variants of the VRP very similar to these are presented. Most of the results of the previous searches proposed heuristic, metaheuristic, and hybrid techniques. This trend was largely due to the evolution of hardware and software, which improved aspects such as reasonable calculation times, gave way to real-time applications, and helped find better solutions in problems with local minima, where exact algorithms can get caught in partial optima. This is supported by approximately 232 publications on exact methods applied to the VRP, a figure surpassed by heuristic and metaheuristic techniques with approximately 775 studies proposing solutions to the VRP. Figure 1 shows this comparison.

Fig. 1. Comparison of researches containing the VRP keywords, exact methods vs. heuristic techniques, published annually (exact/heuristic counts: 2015: 33/121; 2016: 35/136; 2017: 38/135; 2018: 66/186; 2019: 60/197).

3 Proposed Methodological Approach


Following the analytical method, a methodology organized in four steps was used: 1. information search; 2. contextualization of the problem under study; 3. analysis of various bioinspired metaheuristic techniques used to solve the vehicle routing problem; and 4. conclusions.

4 Metaheuristics Case Study Applied to the VRP

According to [20], the term heuristics is synonymous with finding or inventing. In psychology, a heuristic is a method that helps answer difficult questions quickly and consistently, even though the precise solution is not always attainable [21]. In other disciplines such as engineering, which is the direction of this review, it is known as a technique that is able to find highly satisfactory solutions in terms of computing time, despite not guaranteeing that the results are the most optimal or reliable. Because of this, heuristic techniques have the disadvantage of getting trapped in local optima, which is where metaheuristic techniques appear, as they overcome this problem. The word “metaheuristics” is attributed to Fred Glover [22, 23], who defined it as a high-level search in a space of possible solutions, given prior knowledge and applying one or more rules. In [24], it is defined as a guided interaction through intelligent combinations, which uses learning strategies for searches and structures information to find the best near-optimal solutions. The metaheuristic techniques on which this review focuses, due to the growth they have had in recent years, are the so-called bioinspired ones, as they rely on nature’s behavior to solve complex problems. This group can be divided into single-solution metaheuristics, which work on a single candidate solution by continuously enhancing it during the process, and population-based metaheuristics, which handle different solutions at the same time. This division is described in [24, 25].
Within the first category, we have Tabu Search (TS), which was the first algorithm to use memory, extracting information from what has happened and banning moves. This is logically expressed as a list containing a tabu history of previously found optima, so that partial optima are not allowed to be revisited and the search does not fall back into a local optimum [23].
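As a minimal sketch of the mechanism just described (ours, not the implementation of any cited work), a tabu list with a fixed tenure and the usual aspiration criterion might look as follows; the move representation and the `evaluate` function are assumptions for illustration:

```python
# Minimal sketch of the tabu mechanism (minimization assumed): recently
# applied moves are banned for a fixed tenure, except when a tabu move
# would beat the best value found so far (the aspiration criterion).
from collections import deque

def admissible_moves(candidates, tabu, best_value, evaluate):
    """Filter out tabu moves unless they improve on the best value so far."""
    return [m for m in candidates if m not in tabu or evaluate(m) < best_value]

tabu = deque(maxlen=7)  # tabu tenure: the last 7 moves stay banned
# after applying a chosen move m in the main loop:  tabu.append(m)
```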
Similarly, we find Simulated Annealing (SA), inspired by the metal annealing process. Among its characteristics, a move that does not improve the solution can still be accepted during local search, with a probability that decreases as the search cools down. More information can be found in [26].
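A minimal sketch of this acceptance rule (the Metropolis criterion, assuming minimization; the cooling schedule shown is one common choice, not prescribed by [26]):

```python
# Sketch of SA's acceptance rule (minimization): worse candidates are
# accepted with probability exp(-delta / T), which shrinks as T cools.
import math
import random

def accept(delta: float, temperature: float) -> bool:
    """delta = candidate_cost - current_cost; always accept improvements."""
    return delta <= 0 or random.random() < math.exp(-delta / temperature)

# A common geometric cooling schedule: T_k = T0 * alpha**k, e.g. alpha = 0.95.
```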
On the other side, population metaheuristics are subdivided into evolutionary algorithms, which in each generation or iteration create a new population based on the previous one that is closer to the global solution, rewarding the individuals with better results against the objective function. In this group, we find one of the most used algorithms, the genetic algorithm (GA). It is based on Charles Darwin’s theory of natural selection, using chromosomes encoded as character strings or bit arrays, and implements strategies such as mutation, crossover, and recombination that allow it to approach the optimal solutions of the problem [27].
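For routing, GA chromosomes are commonly encoded as customer permutations; as an illustrative sketch (a standard operator choice on our part, not the specific operator of [27]), the classic order crossover (OX) produces children that remain valid permutations:

```python
# Sketch: order crossover (OX) on permutation-encoded routes. A random
# slice of parent 1 is kept; the remaining customers are filled in the
# order in which they appear in parent 2, so the child stays a permutation.
import random

def order_crossover(p1, p2):
    a, b = sorted(random.sample(range(len(p1)), 2))
    child = [None] * len(p1)
    child[a:b] = p1[a:b]
    remaining = [c for c in p2 if c not in child[a:b]]
    for i in range(len(child)):
        if child[i] is None:
            child[i] = remaining.pop(0)
    return child

print(order_crossover([1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1]))
```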
The memetic algorithm (MA) is based on a genetic structure, but with cooperative local and global search, achieving a synergy between the two that improves the search process [28]. Continuing with the subdivision raised above, swarm intelligence algorithms are based on the concept of collective intelligence, where each individual has the ability to understand and alter its environment. In this group, we find the particle swarm optimization (PSO) algorithm, where the search paths of individual agents or particles are adjusted and, when one finds a better solution, it is shared to update the global search [29].
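A sketch of the canonical velocity and position update for one particle follows (continuous form; routing applications typically decode or discretize the position, and the inertia and acceleration coefficients below are typical illustrative values):

```python
# Sketch of the canonical PSO update for one particle: the velocity is
# pulled toward the particle's own best (pbest) and the swarm's best
# (gbest), then the position moves along the new velocity.
import random

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    r1, r2 = random.random(), random.random()
    v = [w * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
         for xi, vi, pb, gb in zip(x, v, pbest, gbest)]
    x = [xi + vi for xi, vi in zip(x, v)]
    return x, v
```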
Ant colony optimization (ACO) is also found here: as in nature, when individuals find food, they mark the place and the path with pheromones. This algorithm takes into account the concept of evaporation, which is essential for the convergence and self-organization of the process. In this way, the ants, or individual agents, follow the paths with more pheromone, creating trend routes that are usually the best solutions [30].
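A sketch of the pheromone update this describes: evaporation on every edge, followed by deposits along each ant's route, with shorter routes depositing more (a generic formulation on our part, assuming a pheromone matrix `tau` and routes given as node sequences):

```python
# Sketch of the ACO pheromone update: all edges evaporate at rate rho,
# then each ant deposits Q / route_length along the edges it used, so
# shorter routes are reinforced more strongly.
def update_pheromone(tau, routes, lengths, rho=0.1, Q=1.0):
    n = len(tau)
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - rho)          # evaporation on every edge
    for route, length in zip(routes, lengths):
        for i, j in zip(route, route[1:]):    # consecutive edges of the route
            tau[i][j] += Q / length           # deposit, stronger when shorter
    return tau
```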
Similar to the previous one, bee colony optimization (BCO) is inspired by how these insects search for food; it takes into account the individual search and the recruitment of other individuals of the same species when food is found, either by pheromones or by the waggle dance [31]. The firefly algorithm (FA) builds on the flashing patterns of fireflies and considers three rules: 1) fireflies are attracted by the intensity of the brightness of others; 2) the brighter firefly attracts the other, and the greater the distance between them, the more the intensity decreases; and 3) the solution area determines the brightness of the fireflies, which is expressed as a mathematical function [32]. The bat algorithm (BA) is inspired by the echolocation of these mammals and rests on three idealized rules: 1) bats use echolocation to detect food or obstacles and know how to tell them apart; 2) bats fly randomly with velocity vi at position xi with a minimum frequency fmin and certain variations in wavelength and loudness; and 3) since the loudness can vary, it is assumed to decrease from a constant [33]. There is also the buffalo optimization algorithm (ABO), based on the behavior of a buffalo herd when exploring the search space: premature stagnation is avoided by making each buffalo update its position relative to the best position among all buffaloes in the herd [34]. The aggressive reproductive behavior of the cuckoo bird is the basis for the cuckoo search (CS) algorithm, in which the eggs of this species are deposited in the nests of other birds (randomly, with jumps drawn from Lévy flights); at birth, the chick is programmed to push the other species’ eggs out of the nest. The best-quality eggs survive to the next generation, provided they overcome hurdles such as the egg being similar enough to those of the host so that the host mother does not get rid of the intruder egg or abandon the nest [35].

Figure 2 shows a Pareto chart with the number of articles on metaheuristic algorithms published where the VRP was applied.

Fig. 2. Pareto chart of the number of published researches applying each metaheuristic algorithm to the VRP. Obtained from the Web of Science database.

In the previous figure, it can be seen that the GA, TS, and BCO algorithms have been used in approximately 80% of the published literature. Heuristic and metaheuristic techniques have two evaluation metrics that define their feasibility, known as the convergence time and the quality of the solution of the objective function or functions. The first is based on the time required to arrive at the optimal solution, considering the computational expense. The second tells us how close the result is to the solution of the whole problem (the global optimum). Considering the above, in general the works used for this article evaluated the algorithms they applied based on these two metrics.
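As a sketch of how these two metrics are typically measured on benchmark instances (the `solver` interface and the use of best-known values are assumptions for illustration):

```python
# Sketch: measuring the two metrics above for one run of a solver --
# wall-clock time to finish, and solution quality as the percentage gap
# to the best-known value of the benchmark instance.
import time

def evaluate_run(solver, instance, best_known):
    start = time.perf_counter()
    value = solver(instance)                  # assumed to return the objective
    elapsed = time.perf_counter() - start
    gap = 100.0 * (value - best_known) / best_known
    return elapsed, gap
```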
TS has been applied to the VRP with time windows as first developed in [36–38]. It has been measured against the genetic algorithm in [37, 39], and it has also been compared with SA and BCO [38]. Furthermore, TS has been hybridized with variable neighborhood search (VNS) algorithms [40, 41], ACO [42], BCO [42, 43], harmony search (HS) [44], and others [43]. Likewise, improvements such as Granular TS [44], Iterated Granular TS (IGTS) [45], Reactive TS [46], and improved TS [47] can be found in the literature. For TS metaheuristics, there are references such as [48, 49], from which one can highlight results of TS working together, or in phases, in some specific stage of the implementation of a metaheuristic algorithm with other algorithms [50]. Among the TW variations addressed with TS are soft time windows (STW) [44], TW focused on the access to some roads within the routes [51], and multiple TW [45].
This review also found the use of tools such as the CPLEX software to compare the performance of the algorithms [43], and in [37] the use of Google Maps. Within the review, no publications were found that considered the stochastic behavior of traffic density and congestion, which [38] suggests investigating. Another unexplored gap is the application of the TS algorithm to the reduction of gas emissions with the VRPTW, an aspect of great importance for the research community in recent years [41].
Next, we find that SA was applied to the VRPTW in its traditional form in [52]. This metaheuristic has been compared with FA and a discrete improved BA (DaIBA), where DaIBA gave promising results according to the authors [53], and with GA together with the artificial immune system (AIS) [54]. In the same way, SA has been measured against GA, with both algorithms reported as efficient in terms of objective-function value and computational time [55], and SA is also compared with GA using CPLEX, where SA surpasses both [56]. The only hybridizations that could be found in the literature were with PSO [57] and HS [58], leaving a wide range of interesting possible investigations. Finally, for SA we found investigations that improve the algorithm with a restart strategy [59], network reduction [60], parallelization [61], and the use of memory [62–64].
The GA has been compared with hybrids of VNS with PSO (HVN PSO), VNS with an electromagnetism mechanism algorithm (HVNSEMA), and VNS with BCO (HVN BCO) [65]. GA hybrids have also been applied in the literature with the adaptive large neighborhood search algorithm (ALNS) [66], PSO [67, 68], TS [69], the space-filling curve method (SFC) [70], and a hybrid that is very interesting due to the variety of algorithms it uses, such as K-means, Clarke-Wright, and the extended non-dominated sorting GA II (NSGA-II) [71]. Likewise, in [72] a heuristic-based parthenogenetic algorithm (HPGA) is proposed and compared with GA and a GA hybrid; other hybrids with GA were found in [73]. GA has also been measured against parallel SA, PSO, and cloud-theory PSO [74], local search [75], HS [76], TS and SA using the CPLEX software [58], and a hybrid algorithm that includes an insertion heuristic, a local search algorithm, and a metaheuristic algorithm [60]. The literature contains several proposals for improvements to GA, such as multi-objective versions [77]. In fact, the improvement for which the most works were found was the non-dominated sorting GA (NSGA) [78] and its second version, NSGA-II, proposed in [79] to measure an improvement of PSO and used in [80] with a 2-phase pulse algorithm. In continuity, the improvement mentioned above is compared with a multi-objective PSO and the Pareto envelope-based selection algorithm (PESA-II) [81]. Further enhancements identified for the GA include a modified cache [82], an aggregate fitness [83], fuzzy logic [84], a stochastic partially optimized cyclic displacement crossover for multi-objective GA [85], a Markov trust region [86], nesting within GA [87], and a cooperative co-evolutionary multi-objective quantum GA (CCMQGA), which in the same document is compared with NSGA-II and a multi-objective quantum evolutionary algorithm (MQEA) [88]. On the other hand, [89] presents GA with VNS and space-time (ST) features, and in [90] GA is used for global search while PSO is used as a local search method to identify potentially better solutions, the GA being applied in phases in each generation.
Another of the most used algorithms in the literature is PSO [91, 92], which has been compared with the differential evolution (DE) algorithm, where DE was superior in convergence and PSO in computational time [93]. It was also measured against GA in [94], obtaining good results both in solution quality and computation time according to the authors. Advancing in the theme, hybrids of PSO with LS [95] and with VNS [96] were found. For PSO applied in phases, a comparison of PSO together with VNS was found, as well as a two-stage decomposition method (TDM) [97]. Among the improvements, multi-objective proposals [98] were found, such as a self-learning PSO [99] and an improved PSO that is also compared with a traditional PSO [100].
In the case of ACO, the review found its application in the way it was first proposed [101]. ACO was compared with a hybrid of the same technique and with the CPLEX and GRASP tools, which can be found in [102] and [103], respectively, and in [104] it is compared with GRASP using ACO in its traditional form. Likewise, investigations were found where the algorithm worked in phases using feedback techniques [105]. Within the hybrids, ACO appeared with mutation operators [106], the memetic algorithm [107], and GRASP together with VNS [108]. ACO also received several improvements with LS, where stochastic demand was considered (SACO) and compared with TS [109], and where multiple attribute labels (LACS) were considered [110], again with comparisons against TS [111]. Regarding restrictions, two articles with STW were found [112]. To finish with ACO, [113] applies this algorithm to a robot system for routing the arrival of first aid, one of the few such applications of the technique in the literature.
For the BCO, the search retrieved its first implementation; in [114], a hybrid with a two-step restricted LS was developed for neighborhood selection. Improvements applied to the BCO can be found in [115], including a BCO with an adaptive insertion heuristic (SIH) that is also compared with the plain BCO. Continuing with the improvements, there is a BCO with lexicographic optimization principles [116], and two improvements of the BCO (BCO with a Daemon Algorithm (DA) and BCO with Old Bachelor Acceptance (OBA)) are compared against the record-to-record travel algorithm (RRT), with both measured in performance [117]. Articles that implemented STW were also found, such as [118], as well as an article whose objectives include the reduction of fossil-fuel use, approached through a fuel consumption function inspired by the comprehensive modal emission model (CMEM) developed by Barth [119]. With reference to the BA, articles were found that applied it to the VRP and compared it with SA, GA, FA, the Gurobi solver, and the CPLEX software.
Few articles were found in the case of the FA, both using a discrete version of it [120]. In the latest research referenced, it was used in conjunction with a composite neighborhood algorithm and compared, using the CPLEX software, against a hybrid TS-GA enhancement and TS, against a variation of VNS and a variation of TS, and in another work against GA.
With respect to the MA, there is research on the application of the algorithm as first proposed [121], although very little was found. Hybrid algorithms exist, such as one implemented with ACO, which achieves satisfactory optimization regardless of the settings and is compared with MA and ACO individually. Similarly, among improvements to the MA, one was found with multiple populations [122], and another with different variable neighborhood descent (VND) configurations and local searches that return a set of non-dominated solutions, measured against LS [123]. A parallel MA is applied in [124], and a multi-objective MA is compared to a multi-objective LS in [125]. The incorporation of three local methods and a specially designed selection operator was observed in [126]. There is also an MA with a custom recombination operator [124], and an MA used as an enhancement, referred to as a memetic local search, to improve the solution accuracy of the modified discrete glowworm swarm optimization algorithm based on time-window division (MDGSOTWD) [127].

5 Conclusions
At the end of this research, it can be concluded that, from the point of view of restrictions, few publications address the TWVRP, HFVRP, and OVRP restrictions; similarly, few works implement TWVRP, MDVRP, and OVRP with TS. The use of the self-learning PSO algorithm to solve the vehicle routing problem is also evidenced.
It was also possible to show that: 1) the ACO metaheuristic was used to create a model with VRPTW, HFVRP, and PDVRP; 2) GA is the algorithm most applied to the VRP with TW, highlighting the participation of GA and the NSGA improvements in its two versions; 3) we did not find many works applying the ABO and CS algorithms to the VRP with TW, this being an opportunity to create new works; and 4) interesting contributions can be made with hybrid algorithms, on which, to date, there is little literature.
As future work, it is suggested to take into account an ecological approach in which restrictions such as multiple loading (MC) are included, given the current global interest in minimizing environmental pollution.

References
1. García-Schilardi, M.E.: Collective public transport: its role in the process of social inclusion. Bitácora 24(1), 35–42 (2014)
2. Mignot, D., Aguiléra, A., Bloy, D., Caubel, D., Madre, J.L.: Formas urbanas, movilidad
y segregación. Urban Public Econ. Rev. (12), 73–104 (2010). https://www.redalyc.org/art
iculo.oa?id=50414006003. ISSN 1697-6223
3. Kwanele, M.: School mergers worsen the school transport problem. GroundUp (2018). https://www.groundup.org.za/article/department-puts-distance-between-children-and-education/. (Accessed 22 Sept 2019)
4. Robles, Y.: Lack of transportation options means school 'choice' is illusory for many Denver families. Denverite (2017). https://denverite.com/2017/03/22/lack-transportation-options-means-school-choice-illusory-many-denver-families/. (Accessed 22 Sept 2019)
5. Espíritu Salinas, N.: Transportation and stress in Lima city, pp. 1–18 (2018). https://repositorio.urp.edu.pe/handle/URP/1486
6. PAHO/WHO: Ambient air pollution (2017). https://www.paho.org/es/temas/calidad-aire. (Accessed 22 Sept 2019)
7. Uribe-Martes, C.J., Rivera-Restrepo, D.X., Filippo, A.-D., Silva, J.: Simulation model of
internal transportation at a container terminal to determine the number of vehicles required.
In: Smys, S., Bestak, R., Rocha, Á. (eds.) ICICIT 2019. LNNS, vol. 98, pp. 912–919.
Springer, Cham (2020). https://doi.org/10.1007/978-3-030-33846-6_100
8. Troncoso Palacio, A.: Modeling, process simulation and digital twins: support for decision making. In: Modelado y simulación de procesos. VirtualPro, no. 227, p. 3 (2020). https://www.virtualpro.co/revista/modelado-y-simulacion-de-procesos/3
9. Golden, B., Raghavan, S., Wasil, E.: The vehicle routing problem latest advances and new
challenges (2008). https://link.springer.com/book/10.1007/978-0-387-77778-8
10. Braekers, K., Ramaekers, K., Van Nieuwenhuyse, I.: The vehicle routing problem: state of
the art classification and review. Comput. Ind. Eng. 99, 300–313 (2016)
11. Albayrak, N., Allahverdi, M.: Development a new mutation operator to solve the traveling
salesman problem by aid of genetic algorithms. Expert Syst. Appl. 38(3), 1313–1320 (2011)
12. Menger, K.: Das Botenproblem. In: Ergebnisse eines Mathematischen Kolloquiums 2 (1932)
13. Laporte, G., Osman, I.H.: Routing problems: a bibliography. Ann. Oper. Res. 61(1), 227–262
(1995)
14. Guo, X., Liu, Y., Samaranayake, S.: Solving the school bus routing problem at scale via a
compressed shareability network. In: IEEE Conference Intelligent Transportation Systems
Proceedings, ITSC, vol. 2018-Novem, pp. 1900–1907 (2018)
15. Dantzig, G., Fulkerson, D.R., Johnson, S.M., Cook, W.: Solution of a large-scale traveling-salesman problem, pp. 7–28 (2010)
16. Applegate, D.L., et al.: Certification of an optimal TSP tour through 85,900 cities. Oper.
Res. Lett. 37(1), 11–15 (2009)
17. Dell’Amico, M., Hadjicostantinou, E., Iori, M., Novellani, S.: The bike sharing rebalancing
problem: mathematical formulations and benchmark instances. Omega (United Kingdom)
45, 7–19 (2014)
18. Ropke, S., Cordeau, J.-F.: Branch and cut and price for the pickup and delivery problem
with time windows. Transp. Sci. 43(3), 267–286 (2009). https://doi.org/10.1287/trsc.1090.
0272
19. Gogna, A., Tayal, A.: Metaheuristics: review and application. J. Exp. Theor. Artif. Intell.
25(4), 503–526 (2013)
20. Meaning of heuristics (what it is, concept and definition) – Significados. https://www.significados.com/heuristica/. (Accessed 15 Nov 2019)
21. Kahneman, D.: Thinking, Fast and Slow (2011)
22. Glover, F.: Future paths for integer programming and links to artificial intelligence. Comput.
Ops. Res 13(5), 533–549 (1986)
23. Martínez, C., Medina, D.: Modeling of a heuristic for analyzing the performance of a deterministic multi-depot vehicle routing model under a stochastic environment. Repositorio Institucional CUC (2014). https://repositorio.cuc.edu.co/handle/11323/4854
24. Melián, B., Pérez, J.A.M., Vega, J.M.M.: Metaheuristics: a global view. Iberoamerican J. Artif. Intell. 7, 7–28 (2003). http://journal.iberamia.org/public/Vol.1-14.html#2003
25. Elshaer, R., Awad, H.: A taxonomic review of metaheuristic algorithms for solving the
vehicle routing problem and its variants. Comput. Ind. Eng. 140, 1–19 (2020)
26. van Laarhoven, P.J.M., Aarts, E.H.L.: Simulated Annealing: Theory and Applications.
Springer Netherlands, Dordrecht (1987)
27. Sivanandam, S.N., Deepa, S.N.: Genetic algorithms. In: Sivanandam, S.N., Deepa, S.N.
(eds.) Introduction to Genetic Algorithms, pp. 15–37. Springer, Heidelberg (2008). https://
doi.org/10.1007/978-3-540-73190-0
28. Knowles, J.D., Corne, D.: M-PAES: a memetic algorithm for multiobjective optimization (2000)
29. Shi, Y., Eberhart, R.C.: Empirical study of particle swarm optimization. In: Proceedings of the 1999 Congress on Evolutionary Computation, CEC 1999, vol. 3, pp. 1945–1950 (1999)
30. Dorigo, M., Di Caro, G.: Ant colony optimization: a new meta-heuristic. In: Proceedings of the 1999 Congress on Evolutionary Computation, CEC 1999, vol. 2, pp. 1470–1477 (1999)
31. Yang, X.-S., Cui, Z., Xiao, R., Gandomi, A.H., Karamanoglu, M.: Swarm Intelligence and Bio-Inspired Computation: Theory and Applications. Elsevier (2013). https://www.elsevier.com/books/swarm-intelligence-and-bio-inspired-computation/yang/978-0-12-405163-8
32. Yang, X.-S.: Cuckoo search and firefly algorithm. Intell. Comput. Int. 516, 1–26 (2014)
33. Yang, X.-S.: A new metaheuristic bat-inspired algorithm. In: González, J.R., Pelta, D.A.,
Cruz, C., Terrazas, G., Krasnogor, N. (eds.) Nature Inspired Cooperative Strategies for
Optimization, pp. 65–74. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-
12538-6_6
34. Odili, J.B., Kahar, M.N.M.: African Buffalo Optimization (ABO) - a new meta-heuristic algorithm. In: National Conference for Postgraduate Research, pp. 1–7 (2015). https://doi.org/10.13140/RG.2.1.4547.9846
35. Yang, X.-S., Deb, S.: Engineering optimisation by cuckoo search. Int. J. Math. Model. Numer. Optim. 1(4), 330–343 (2010)
36. Alcaraz, J.J., Caballero-Arnaldos, L., Vales-Alonso, J.: Rich vehicle routing problem with
last-mile outsourcing decisions. Transp. Res. Part E: Logist. Transp. Rev. 129, 263–286
(2019). https://doi.org/10.1016/j.tre.2019.08.004
37. Kim, S.H., Bae, S.H.: Optimal solution to the vehicle routing problem by adopting a meta-
heuristic algorithm. Transp. Plan. Technol. 39(6), 574–585 (2016). https://doi.org/10.1080/
03081060.2016.1187808
38. Nowakowski, P., Szwarc, K., Boryczka, U.: Vehicle route planning in e-waste mobile collec-
tion on demand supported by artificial intelligence algorithms. Transp. Res. Part D: Transp.
Environ. 63, 1–22 (2018). https://doi.org/10.1016/j.trd.2018.04.007
39. Shi, Y., Boudouh, T., Grunder, O.: A robust optimization for a home health care routing and
scheduling problem with consideration of uncertain travel and service times. Transp. Res.
Part E: Logist. Transp. Rev. 128, 52–95 (2019). https://doi.org/10.1016/j.tre.2019.05.015
40. Molina, J.C., Eguia, I., Racero, J.: An optimization approach for designing routes in
metrological control services: a case study. Flex. Serv. Manuf. J. 30(4), 924–952 (2018)
41. Niu, Y., Yang, Z., Chen, P., Xiao, J.: Optimizing the green open vehicle routing problem
with time windows by minimizing comprehensive routing cost. J. Clean. Prod. 171, 962–971
(2017)
42. He, P., Li, J., Jiang, Y., Wang, X.: Collaborative transportation planning distinguishing old
and new shippers for small-medium enterprise. J. Ind. Prod. Eng. 35(3), 170–180 (2017)
43. Ye, F., Zhang, D., Si, Y.-W., Zeng, X., Thanh Nguyen, T.: A hybrid algorithm for a vehicle
routing problem with realistic constraints. Inf. Sci. 394–395, 167–182 (2017). https://doi.
org/10.1016/j.ins.2017.02.028
44. Yassen, E.T., Ayob, M., Nazri, M.Z.A., Sabar, N.R.: An adaptive hybrid algorithm for vehicle
routing problems with time windows. Comput. Ind. Eng. 113, 382–391 (2017). https://doi.
org/10.1016/j.cie.2017.09.034
45. Huang, Z., Huang, W., Guo, F.: Integrated sustainable planning of self-pickup and door-
to-door delivery service with multi-type stations. Comput. Ind. Eng. 135, 412–425 (2019).
https://doi.org/10.1016/j.cie.2019.06.022
46. Bernal, J., Escobar, J.W., Linfati, R.: A granular Tabu search algorithm for a real case study
of a vehicle routing problem with a heterogeneous fleet and time windows. J. Ind. Eng.
Manage. 10(4), 646–662 (2017). Special Issue
47. Hoogeboom, M., Dullaert, W.: Vehicle routing with arrival time diversification. Eur. J. Oper.
Res. 275, 93–107 (2019)
48. Shiri, S., Huynh, N.: Optimization of drayage operations with time-window constraints. Int.
J. Prod. Econ. 176, 7–20 (2016)
49. He, P., Li, J., Fang, E., deVoil, P., Cao, G.: Reducing agricultural fuel consumption by minimizing inefficiencies. J. Clean. Prod. 236, 1–13 (2019). https://doi.org/10.1016/j.jclepro.2019.117619
50. Shen, L., Tao, F., Wang, S.: Multi-depot open vehicle routing problem with time windows
based on carbon trading. Int. J. Environ. Res. Public Health 15(9), 1–20 (2018)
51. Lin, S., Bard, J.F., Jarrah, A.I., Zhang, X., Novoa, L.J.: Route design for last-in, first-out
deliveries with backhauling. Transp. Res. Part C: Emerg. Technol. 76, 90–117 (2017). https://
doi.org/10.1016/j.trc.2017.01.005
52. Wang, C., Dessouky, M., Ordóñez, F.: Routing courier delivery services with urgent demand.
INFOR Inf. Syst. Oper. Res. 53(1), 26–39 (2015)
53. Grosso, R., Muñuzuri, J., Escudero-Santana, A., Barbadilla-Martín, E.: Mathematical for-
mulation and comparison of solution approaches for the vehicle routing problem with access
time windows. Complexity 2018, 1–10 (2018). https://doi.org/10.1155/2018/4621694
54. Guitián De Frutos, R.M., Casas-Méndez, B.: Routing problems in agricultural cooperatives:
a model for optimization of transport vehicle logistics. IMA J. Manage. Math. 30, 387–412
(2019)
55. Osaba, E., Yang, X.-S., Fister, I., Del Ser, J., Lopez-Garcia, P., Vazquez-Pardavila, A.J.: A
discrete and improved bat algorithm for solving a medical goods distribution problem with
pharmacological waste collection. Swarm Evol. Comput. 44, 273–286 (2019)
56. Mrowczynska, B., Krol, A., Czech, P.: Artificial immune system in planning deliveries in a
short time. Bull. Pol. Acad. Sci. Tech. Sci. 67, 969–980 (2019)
57. Yousefi, H., Tavakkoli-Moghaddam, R., Taheri, M., Oliaei, B., Mohammadi, M., Mozaffari,
A.: Solving a bi-objective vehicle routing problem under uncertainty by a revised multi-
choice goal programming approach. Int. J. Ind. Eng. Comput. 8, 283–302 (2017)
58. Setak, M., Azizi, V., Karimi, H., Jalili, S.: Pickup and delivery supply chain network with
semi soft time windows: metaheuristic approach. Int. J. Manage. Sci. Eng. Manage. 12(2),
89–95 (2017)
59. Chen, J., Shi, J.: A multi-compartment vehicle routing problem with time windows for urban
distribution-a comparison study on particle swarm optimization algorithms. Comput. Ind.
Eng. 133, 95–106 (2019). https://doi.org/10.1016/j.cie.2019.05.008
60. Manisri, T., Mungwattana, A., Janssens, G.K., Caris, A.: A hybrid algorithm for the vehicle
routing problem with soft time windows and hierarchical objectives. J. Inf. Optim. Sci. 36(3),
283–300 (2015)
61. Vincent, F.Y., Redi, A.P., Hidayat, Y.A., Wibowo, O.J.: A simulated annealing heuristic for
the hybrid vehicle routing problem. Appl. Soft Comput. 53, 119–132 (2017)
62. Khodabandeh, E., Bai, L., Heragu, S.S., Evans, G.W., Elrod, T., Shirkness, M.: Modelling
and solution of a large-scale vehicle routing problem at GE appliances & lighting. Int. J.
Prod. Res. 55(4), 1100–1116 (2015)
63. Wang, C., Mu, D., Zhao, F., Sutherland, J.W.: A parallel simulated annealing method for
the vehicle routing problem with simultaneous pickup-delivery and time windows. Comput.
Ind. Eng. 83, 111–122 (2015). https://doi.org/10.1016/j.cie.2015.02.005
64. Küçükoğlu, İ, Ene, S., Aksoy, A., Öztürk, N.: A memory structure adapted simulated
annealing algorithm for a green vehicle routing problem. Environ. Sci. Pollut. Res. 22(5),
3279–3297 (2015)
65. Cruz-Chávez, M.A., Rodríguez-León, A., Rivera-López, R., Cruz-Rosales, M.H.: A grid-
based genetic approach to solving the vehicle routing problem with time windows. Appl.
Sci. 9(18), 1–23 (2019)
66. Zheng, J., Zhang, Y.: A fuzzy receding horizon control strategy for dynamic vehicle routing
problem. IEEE Access 7, 151239–151251 (2019). https://doi.org/10.1109/ACCESS.2019.
2948154
67. Meng, F., Ding, Y., Li, W., Guo, R.: Customer-oriented vehicle routing problem with environ-
ment consideration: two-phase optimization approach and heuristic solution. Math. Probl.
Eng. 2019, 1–19 (2019). https://doi.org/10.1155/2019/1073609
68. Kang, H.Y., Lee, A.H.I.: An enhanced approach for the multiple vehicle routing problem
with heterogeneous vehicles and a soft time window. Symmetry (Basel) 10(11), 1–20 (2018)
69. Hsiao, Y.H., Chen, M.C., Lu, K.Y., Chin, C.L.: Last-mile distribution planning for fruit-and-
vegetable cold chains. Int. J. Logist. Manage. 29(3), 862–886 (2018)
70. Shahparvari, S., Abbasi, B., Chhetri, P.: Possibilistic scheduling routing for short-notice
bushfire emergency evacuation under uncertainties: an Australian case study. Omega 72,
96–117 (2017). https://doi.org/10.1016/j.omega.2016.11.007
71. Agrawal, V., Lightner, C., Lightner-Laws, C., Wagner, N.: A bi-criteria evolutionary
algorithm for a constrained multi-depot vehicle routing problem. Soft Comput. 21(17),
5159–5178 (2016)
72. Qin, J., Ye, Y., Cheng, B.R., Zhao, X., Ni, L.: The emergency vehicle routing problem with
uncertain demand under sustainability environments. Sustainability 9(2), 1–24 (2016)
73. Escuín, D., Larrodé, E., Millán, C.: A cooperative waiting strategy based on elliptical areas
for the Dynamic Pickup and Delivery Problem with Time Windows. J. Adv. Transp. 50(8),
1577–1597 (2016)
74. Ahkamiraad, A., Wang, Y.: Capacitated and multiple cross-docked vehicle routing problem
with pickup, delivery, and time windows. Comput. Ind. Eng. 119, 76–84 (2018). https://doi.
org/10.1016/j.cie.2018.03.007
75. Xu, S.H., Liu, J.P., Zhang, F.H., Wang, L., Sun, L.J.: A combination of genetic algorithm
and particle swarm optimization for vehicle routing problem with time windows. Sensors
(Switzerland) 15(9), 21033–21053 (2015)
76. Zhang, Y., Yang, Y., Yang, R.: Distribution path optimization method of gas cylinder based
on Genetic-Tabu hybrid algorithm. Int. J. Innov. Comput. 15(2), 773–782 (2019)
77. Yue, Y.-X., Zhang, T., Yue, Q.-X.: Improved fractal space filling curves hybrid optimization
algorithm for vehicle routing problem. Comput. Intell. Neuro-Sci. 2015, 1–10 (2015). https://
doi.org/10.1155/2015/375163
78. Wang, Y., et al.: Collaborative multi-depot logistics network design with time window
assignment. Expert. Syst. Appl. 140, 1–24 (2020). https://doi.org/10.1016/j.eswa.2019.
112910
79. Shi, C., Li, T., Bai, Y., Zhao, F.: A heuristics-based parthenogenetic algorithm for the VRP
with potential demands and time windows. Sci. Program. 2016, 1–13 (2016). https://doi.org/
10.1155/2016/8461857
80. Shi, Y., Boudouh, T., Grunder, O.: A hybrid genetic algorithm for a home health care routing
problem with time window and fuzzy demand. Expert Syst. Appl. 72, 160–176 (2016)
81. Barkaoui, M., Berger, J., Boukhtouta, A.: Customer satisfaction in dynamic vehicle routing
problem with time windows. Appl. Soft Comput. J. 35, 423–432 (2015)
82. Rabbani, M., Pourreza, P., Farrokhi-Asl, H., Nouri, N.: A hybrid genetic algorithm for multi-
depot vehicle routing problem with considering time window repair and pick-up. J. Model.
Manage. 13(3), 698–717 (2018)
83. Yang, B., Hu, Z.-H., Wei, C., Li, S.-Q., Zhao, L., Jia, S.: Routing with time-windows for
multiple environmental vehicle types. Comput. Ind. Eng. 89, 150–161 (2015). https://doi.
org/10.1016/j.cie.2015.02.001
84. Ma, Y., Xu, J.: A cloud theory-based particle swarm optimization for multiple decision maker
vehicle routing problems with fuzzy random time windows. Eng. Optim. 47(6), 825–842
(2015)
85. Zhou, Y., Wang, J.: A local search-based multiobjective optimization algorithm for multi-
objective vehicle routing problem with time windows. Syst. J. 9, 1100–1113 (2015). https://
doi.org/10.1109/JSYST.2014.2300201
86. Yassen, E.T., Ayob, M., Mohd Zakree, A.N., Sabar, N.R.: Meta-harmony search algorithm
for the vehicle routing problem with time windows. Inf. Sci. (Ny) 325, 140–158 (2015)
87. Oyola, J.: The capacitated vehicle routing problem with soft time windows and stochastic
travel times. Rev. Fac. Ing. 28(50), 19–33 (2019)
88. Zhao, P.X., Luo, W.H., Han, X.: Time-dependent and bi-objective vehicle routing problem
with time windows. Adv. Prod. Eng. Manage. 14(2), 201–212 (2019)
89. Wang, S., Wang, X., Liu, X., Yu, J.: A bi-objective vehicle-routing problem with soft
time windows and multiple depots to minimize the total energy consumption and customer
dissatisfaction. Sustainability 10(11), 1–21 (2018)
90. Kumar, R.S., Kondapaneni, K., Dixit, V., Goswami, A., Thakur, L.S., Tiwari, M.K.: Multi-
objective modeling of production and pollution routing problem with time window: a self-
learning particle swarm optimization approach. Comput. Ind. Eng. 99, 29–40 (2016). https://
doi.org/10.1016/j.cie.2015.07.003
91. Chai, H., He, R., Ma, C., Dai, C., Zhou, K.: Path planning and vehicle scheduling optimization
for logistic distribution of hazardous materials in full container load. Discret. Dyn. Nat. Soc.
2017, 1–14 (2017). https://doi.org/10.1155/2017/9685125
92. Momenikiyai, M., Student, M., Ebrahimnejad, S., Vahdani, B.: A Bi-objective mathematical
model for inventory distribution-routing problem under risk pooling effect: robust meta-
heuristics approach. J. Econ. Comput. Econ. Cybern. Stud. Res. 52(4), 257–274 (2018).
https://doi.org/10.24818/18423264/52.4.18.17
93. Tikani, H., Setak, M.: Efficient solution algorithms for a time-critical reliable transportation
problem in multigraph networks with FIFO property. Appl. Soft Comput. J. 74, 504–528
(2019)
94. Sivaramkumar, V., Thansekhar, M.R., Saravanan, R., Joe Amali, S.M.: Demonstrating the
importance of using total time balance instead of route balance on a multi-objective vehicle
routing problem with time windows. Int. J. Adv. Manuf. Technol. 98(5–8), 1287–1306 (2018)
95. Wang, Y.M., Yin, H.L.: Cost-optimization problem with a soft time window based on an
improved fuzzy genetic algorithm for fresh food distribution. Math. Probl. Eng. 2018, 1–17
(2018). https://doi.org/10.1155/2018/5743287
96. Pierre, D.M., Zakaria, N.: Stochastic partially optimized cyclic shift crossover for multi-
objective genetic algorithms for the vehicle routing problem with time-windows. Appl. Soft
Comput. 52, 863–876 (2017)
97. Abdulaal, A., Cintuglu, M.H., Asfour, S., Mohammed, O.A.: Solving the multivariant EV
routing problem incorporating V2G and G2V options. IEEE Trans. Transp. Electrification
3(1), 238–248 (2017). https://doi.org/10.1109/TTE.2016.2614385
98. Wang, X., Sun, X., Dong, J., Wang, M., Ruan, J.: Optimizing terminal delivery of perishable
products considering customer satisfaction 1–12 (2017)
99. Lightner-Laws, C., Agrawal, V., Lightner, C., Wagner, N.: An evolutionary algorithm app-
roach for the constrained multi-depot vehicle routing problem. Int. J. Intell. Comput. Cybern.
9(1), 2–22 (2016)
100. He, J., Huang, Y., Yan, W.: Yard crane scheduling in a container terminal for the trade-off
between efficiency and energy consumption. Adv. Eng. Inform. 29(1), 59–75 (2015)
101. Harbaoui Dridi, I., Ben Alaïa, E., Borne, P., Bouchriha, H.: Optimisation of the multi-depots
pick-up and delivery problems with time windows and multi-vehicles using PSO algorithm.
Int. J. Prod. Res. 58(14) (2020). https://doi.org/10.1080/00207543.2019.1650975
102. Guo, Y.-N., Cheng, J., Luo, S., Gong, D., Xue, Y.: Robust dynamic multi-objective vehicle
routing optimization method. IEEE/ACM Trans. Comput. Biol. Bioinform. 15, 1891–1903
(2018)
103. Zhang, Y., Shi, L., Chen, J., Li, X.: Analysis of an automated vehicle routing problem in
logistics considering path interruption. J. Adv. Transp. 2017, 1–10 (2017). https://doi.org/
10.1155/2017/1624328
104. Kachitvichyanukul, V., Sombuntham, P., Kunnapapdeelert, S.: Two solution representations
for solving multi-depot vehicle routing problem with multiple pickup and delivery requests
via PSO. Comput. Ind. Eng. 89, 125–136 (2015). https://doi.org/10.1016/j.cie.2015.04.011
105. Norouzi, N., Sadegh-Amalnick, M., Alinaghiyan, M.: Evaluating of the particle swarm
optimization in a periodic vehicle routing problem. Measur. J. Int. Measur. Confed. 62,
162–169 (2015)
106. Zhang, Q., Xiong, S.: Routing optimization of emergency grain distribution vehicles using
the immune ant colony optimization algorithm. Appl. Soft Comput. 71, 917–925 (2018)
107. Niroomand, I., Nsakanda, A.L.: Improving collection flows in a public postal network with
contractor’s obligation considerations. Int. J. Prod. Econ. 198, 79–92 (2018). https://doi.org/
10.1016/j.ijpe.2018.01.025
108. Verbeeck, C., Vansteenwegen, P., Aghezzaf, E.H.: The time-dependent orienteering problem
with time windows: a fast ant colony system. Ann. Oper. Res. 254(1–2), 481–505 (2017)
109. Brito, J., Martínez, F.J., Moreno, J.A., Verdegay, J.L.: An ACO hybrid metaheuristic for
close-open vehicle routing problems with time windows and fuzzy constraints. Appl. Soft
Comput. J. 32, 154–163 (2015)
110. López-Santana, E., Rodríguez-Vásquez, W.C., Méndez-Giraldo, G.: A hybrid expert system,
clustering and ant colony optimization approach for scheduling and routing problem in
courier services. Int. J. Ind. Eng. Comput. 9, 369–396 (2018)
111. Zhang, H., Zhang, Q., Ma, L., Zhang, Z., Liu, Y.: A hybrid ant colony optimization algorithm
for a multi-objective vehicle routing problem with flexible time windows. Inf. Sci. (Ny) 490,
166–190 (2019)
112. Decerle, J., Grunder, O., Hajjam, A., Hassani, E., Barakat, O.: A hybrid memetic-ant colony
optimization algorithm for the home health care problem with time window, synchronization
and working time balancing. Swarm Evol. Comput. 46, 171–183 (2019)
113. Ren, X.Y., Chen, C.F., Xiao, Y.L., Du, S.C.: Path optimization of cold chain distribution
with multiple distribution centers considering carbon emissions. Appl. Ecol. Environ. Res.
17, 9437–9453 (2019)
114. Wu, L., He, Z., Chen, Y., Wu, D., Cui, J.: Brainstorming-based ant colony optimization for
vehicle routing with soft time windows. IEEE Spec. Sect. Theory Algorithms Appl. Sparse
Recover. 7, 19643–19652 (2018)
115. Rouky, N., Boukachour, J., Boudebous, D., El Alaoui Hilali, A.: A robust metaheuristic for
the rail shuttle routing problem with uncertainty: a real case study in the Le Havre Port.
Asian J. Shipping Logist. 34(2), 171–187 (2018)
116. Huang, S.-H., Huang, Y.-H., Blazquez, C.A., Paredes-Belmar, G.: Application of the ant
colony optimization in the resolution of the bridge inspection routing problem. Appl. Soft
Comput. 65, 443–461 (2018)
117. Verbeeck, C., Vansteenwegen, P., Aghezzaf, E.-H.: Solving the stochastic time-dependent
orienteering problem with time windows. Eur. J. Oper. Res. 255, 699–718 (2016)
118. Yang, Z., et al.: Dynamic vehicle routing with time windows in theory and practice. Nat.
Comput. 16, 119–134 (2016)
119. Wu, W., Tian, Y., Jin, T.: A label based ant colony algorithm for heterogeneous vehicle
routing with mixed backhaul. Appl. Soft Comput. 47, 224–234 (2016)
120. Gao, J., Gu, F., Hu, P., Xie, Y., Yao, B.: Automobile chain maintenance parts delivery
problem using an improved ant colony algorithm. Spec. Issue Artic. Adv. Mech. Eng. 8(9),
1–13 (2016)
121. Allali, S., Benchaïba, M., Ouzzani, F., Menouar, H.: No-collision grid based broadcast
scheme and ant colony system with victim lifetime window for navigating robot in first aid
applications. Ad Hoc Netw. 68, 85–93 (2017)
122. Ju, C., Zhou, G., Chen, T.: Disruption management for vehicle routing problem with time-
window changes. Int. J. Shipping Transp. Logist. 9(1), 4–28 (2017)
123. Iqbal, S., Kaykobad, M., Rahman, M.S.: Solving the multi-objective vehicle routing problem
with soft time windows with the help of bees. Swarm Evol. Comput. 24, 50–64 (2015)
124. Yao, B., Yan, Q., Zhang, M., Yang, Y.: Improved artificial bee colony algorithm for vehicle routing problem with time windows. PLoS One 12, 1–18 (2017). https://doi.org/10.1371/journal.pone.0181275
125. Yu, S., Tai, C., Liu, Y., Gao, L.: An improved artificial bee colony algorithm for vehicle
routing problem with time windows: a real case in Dalian. Spec. Issue Artic. Adv. Mech.
Eng. 8(8), 1–9 (2016)
126. Jawarneh, S., Abdullah, S.: Sequential insertion heuristic with adaptive bee colony opti-
mization algorithm for vehicle routing problem with time windows. PLoS One 10(7) (2015).
https://doi.org/10.1371/journal.pone.0130224
127. Nikolić, M., Teodorović, D.: Vehicle rerouting in the case of unexpectedly high demand in
distribution systems. Transp. Res. Part C Emerg. Technol. 55, 535–545 (2015)
Markov Decision Processes with Discounted Costs: Improved Successive Over-Relaxation Method

Abdellatif Semmouri1, Mostafa Jourhmane1, and Bahaa Eddine Elbaghazaoui2

1 Faculty of Sciences and Techniques, Laboratory of Information Processing and Decision (TIAD), Sultan Moulay Slimane University, Campus Mghilla, Beni Mellal, Morocco
abd semmouri@yahoo.fr
2 Faculty of Sciences, Laboratory of Computer Sciences, Ibn Tofail University, Kenitra, Morocco
bahaaeddine.elbaghazaoui@uit.ac.ma
https://fstbm.ac.ma, https://fs.uit.ac.ma

Abstract. In the mathematics field, discrete-time Markov chains are stochastic control processes. They give a particular plan of rules and ideas, based on mathematics, for describing and analyzing situations related to agent decisions. Such situations are governed by randomness (chance) and are controlled by the intervention of a person or machine. To reduce the computational effort as much as possible, action elimination procedures have been used with considerable success and have appeared in the stochastic optimization literature to solve this problem.
We focus our efforts on the computational complexity of situations consisting in finding an optimal policy in Markov decision problems (MDPs) under the discounted cost criterion with an infinite planning horizon. It is well known that those problems are solvable in polynomial time via the dynamic programming approach. In this regard, we provide an overview of the successive over-relaxation (SOR) algorithm, which is part of the successive approximation methods. Next, we establish a new test for identifying non-optimal decisions and invest it in order to improve the SOR algorithm. Finally, we present an illustrative example demonstrating our contribution and compare our test to results already established along the same axis.

Keywords: Markov decision processes · Discounted utility · Sub-optimal decision · Optimization

1 Introduction
The area of Markov chains is a part of stochastic processes, containing several types of knowledge and dealing with models requiring probabilities. Although the results are applied to many real-world areas, they remain well understood owing to their mathematical character. Markov chains have widely modeled a broad range of decision problems related to uncertainty governed by randomness. They provide a mathematical framework for solving these situations in many areas such as systems engineering, industrial engineering, management science, finance and artificial intelligence. In this regard, the main goal consists in optimizing some utility function in order to calculate optimal policies for the given situation via the dynamic programming tool based on the Bellman optimality equation.
A wealth of tests of non-optimal decisions have been established in this framework by calculating extreme bounds on the optimal utility function of discrete MDPs with finite state and action spaces under the discounted criterion and with infinite horizon. In this literature, various results have been proposed to solve the complexity problem. Once an action is identified as sub-optimal in an MDP, it should not be factored into calculations after that period. Remember that MacQueen [1, 2] was the first author who used the action elimination approach, after determining bounds on the performance of the system in question.
Porteus [3] continued to extend the previous work to the general case of decision problems modeled by Markov chains, finding other bounds on the performance of the system. The established test thus gave rise to other works in the same literature (see [4–8] and [9–11]) in order to face this challenge.
In this regard, we will establish a novel test of sub-optimality. By combining the given test with the SOR scheme, we obtain an improved algorithm which computes the optimal value vector while eliminating the non-optimal actions. This allows the size of the current problem to be reduced.
The structure of this manuscript is as follows: Sect. 1 exhibits the preceding results that focus on the same framework as our work. In Sect. 2, we give the objective function defined by expectation and some variants of the popular VIA. Then, we propose a new testing approach in Sect. 3. Finally, we end our proposal with a conclusion and the future scope in Sect. 4.

2 Preliminaries
2.1 MDP Model
Consider a discrete-time Markov control model represented by the tuple

\Big( S;\ A = \bigcup_{i \in S} A(i);\ Q = \{ p_n(j/i, a) \};\ c = \{ c_n(i, a) \};\ \gamma \Big)

where
– S is a finite state space;
– A is the collection of available finite action sets which control the state dynamics;
– p(·/·, ·) are the transition probabilities;
– c(·, ·) is the cost function;
– 0 < γ < 1 is the discount factor (Fig. 1).
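For concreteness, such a model can be stored as plain arrays; the following is a minimal Python sketch (the class name and field layout are our own, not from the paper):

from dataclasses import dataclass
import numpy as np

@dataclass
class FiniteMDP:
    """Finite discounted model matching the tuple (S, A, Q, c, gamma); a sketch."""
    P: np.ndarray   # P[i, a, j] = p(j/i, a), transition probabilities
    c: np.ndarray   # c[i, a] = immediate cost of action a in state i
    gamma: float    # discount factor, 0 < gamma < 1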
Fig. 1. A control system model (Photo: [10])

In discrete epochs n = 1, 2, ..., the decision maker controls a system occupying states from S according to the process {X_n, n ≥ 1} that resides in the set of states S. At each stage n, the decision-making agent selects a control a in the current state. Hence, an immediate cost c_n(i, a) is paid on the way to the following state (Fig. 2).

Fig. 2. Markov decision process

For further details, readers are advised to see Semmouri et al. [9–12].

2.2 Expected Discounted Cost Criterion

Here, a decision maker aims to optimize an objective function for determining optimal policies (see Semmouri et al. [12], Howard [13], Bellman [14], Bertsekas [15], White [16] and Puterman [17]).
MDPs - Action Elimination Procedures 395

Definition 1. Let π ∈ Π and i ∈ S. The utility function over the infinite horizon is defined as follows:

J_\gamma(i, \pi) := E_i^\pi \Big[ \sum_{n=1}^{\infty} \gamma^{n-1} c(X_n, A_n) \Big] \quad (1)

By series theory, a permutation of the symbols E and \sum gives:

J_\gamma(i, \pi) = \sum_{n=1}^{\infty} \gamma^{n-1} E_i^\pi [ c(X_n, A_n) ]
               = \sum_{n=1}^{\infty} \gamma^{n-1} \sum_{j,a} P_i^\pi (X_n = j, A_n = a) \, c(j, a)
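For a stationary policy, the series above sums to the solution of a linear system; a small sketch of this standard evaluation step (function and variable names are illustrative):

import numpy as np

def policy_cost(P_pi, c_pi, gamma):
    """Discounted cost (1) of a stationary policy pi:
    J_pi solves (I - gamma * P_pi) J_pi = c_pi, where P_pi and c_pi are the
    transition matrix and cost vector induced by pi."""
    n = len(c_pi)
    return np.linalg.solve(np.eye(n) - gamma * P_pi, c_pi)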

Definition 2. The desired performance is expressed as follows:

J_\gamma(i) := \inf_{\pi \in \Pi} J_\gamma(i, \pi), \quad i \in S \quad (2)

and a strategy π* ∈ Π is called optimal if J_γ(i) = J_γ(i, π*) for all i ∈ S.

2.3 Popular Derivatives of the VIA


Because of its simplicity in terms of implementation, the VIA is widely used as a dynamic programming tool for solving Markov decision problems.
• Pre-Jacobi (PJ)

V_i^{n+1} \leftarrow \inf_a \Big\{ c_i(a) + \gamma \sum_{j=1}^{|S|} p(j/i, a) V_j^n \Big\} \quad (3)

• Jacobi (J)

V_i^{n+1} \leftarrow \inf_a \Big\{ \Big[ c_i(a) + \gamma \sum_{j \neq i} p(j/i, a) V_j^n \Big] / \big[ 1 - \gamma p(i/i, a) \big] \Big\} \quad (4)

• Pre-Gauss-Seidel (PGS)

V_i^{n+1} \leftarrow \inf_a \Big\{ c_i(a) + \gamma \sum_{j < i} p(j/i, a) V_j^{n+1} + \gamma \sum_{j \geq i} p(j/i, a) V_j^n \Big\} \quad (5)
• Gauss-Seidel (GS)

V_i^{n+1} \leftarrow \inf_a \Big\{ \Big[ c_i(a) + \gamma \sum_{j < i} p(j/i, a) V_j^{n+1} + \gamma \sum_{j > i} p(j/i, a) V_j^n \Big] / \big[ 1 - \gamma p(i/i, a) \big] \Big\} \quad (6)

• Successive Over-Relaxation (SOR)

V_i^{n+1} \leftarrow \omega \Big[ \inf_a \Big\{ c_i(a) + \gamma \sum_{j < i} p(j/i, a) V_j^{n+1} + \gamma \sum_{j \geq i} p(j/i, a) V_j^n \Big\} \Big] + (1 - \omega) V_i^n \quad (7)

where \omega \in (0, \omega^*] is the relaxation factor such that \omega^* = \inf_{i,a} \frac{1}{1 - \gamma p(i/i, a)}.

3 Results - Experiments
3.1 A Brief Overview on the Successive Over-Relaxation (SOR)
In the numerical analysis framework, the Successive Over-Relaxation method (in short, SOR) is a derivative of the GS method, and its convergence is generally faster. The method was discovered simultaneously by David M. Young, Jr. and Stan Frankel in 1950, with the aim of automatically solving linear systems on computers. Later, over-relaxation methods were used to solve Markov decision problems as a variant of the standard VIA.
For ω ∈ (0, ω*], we define the quantities γ_i(ω), i ∈ S, by the recursion formula:

\gamma_i(\omega) = \omega \Big[ \min_{a \in A(i)} \Big\{ c(i, a) + \gamma \sum_{j < i} p(j/i, a) \gamma_j(\omega) + \gamma \sum_{j \geq i} p(j/i, a) \Big\} \Big] + 1 - \omega

and

\rho(\omega) = \max_{i \in S} \gamma_i(\omega)

We consider the Bellman mapping \Phi_\omega : \mathbb{R}^{|S|} \to \mathbb{R}^{|S|} defined for all V = (V_i)_{i \in S} \in \mathbb{R}^{|S|} as follows:

(\Phi_\omega V)_i = \omega \Big[ \min_{a \in A(i)} \Big\{ c(i, a) + \gamma \sum_{j < i} p(j/i, a) (\Phi_\omega V)_j + \gamma \sum_{j \geq i} p(j/i, a) V_j \Big\} \Big] + (1 - \omega) V_i \quad (8)

It was established by Denardo [18] and Reetz [19] that \Phi_\omega is a contraction operator with contraction parameter ρ(ω*) = 1 + (γ − 1)ω* and ρ(ω*) ≤ γ < 1.
Since the optimal objective function J_γ is the unique fixed point of \Phi_\omega, the SOR algorithm follows (Algorithm 1):
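The listing of Algorithm 1 is rendered as a float in the original; a minimal Python sketch of one way to implement the iteration (7), assuming the array layout of the earlier FiniteMDP sketch, is:

import numpy as np

def sor(P, c, gamma, omega, tol=1e-8, max_iter=10000):
    """SOR value iteration (7): an in-place Gauss-Seidel sweep blended
    with the previous iterate through the relaxation factor omega."""
    n_states, n_actions = c.shape
    V = np.zeros(n_states)
    for _ in range(max_iter):
        V_old = V.copy()
        for i in range(n_states):
            # states j < i already hold V^{n+1}; states j >= i still hold V^n
            q = [c[i, a] + gamma * P[i, a] @ V for a in range(n_actions)]
            V[i] = omega * min(q) + (1 - omega) * V_old[i]
        if np.max(np.abs(V - V_old)) < tol:
            break
    return V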
Example 1. We consider an MDP given by the following probabilistic graph (Fig. 3):

Fig. 3. Probabilistic graph

To see the behavior of the number of iterations necessary to reach the desired minimum, we vary the discount factor γ from 0.1 to 0.9. The VIA and SOR algorithms give the results in the form of the following graphic representation: Fig. 4 confirms that the convergence of the SOR algorithm (Algorithm 1) is faster than that of the VIA for values of γ greater than 0.3 in this example.

3.2 New Test for Non-optimal Controls


Suitable choice of ω = ω  , first ensure the convergence of the SOR2 algorithm.
Second, this convergence is faster than VIA to the fixed point of the contraction
Ψω .
398 A. Semmouri et al.

Fig. 4. Graphic representation

Set

C = \max_{i,a} |c(i, a)| \quad \text{and} \quad \mu = \rho(\omega^*)

Throughout this subsection, we assume that all parameters of the system are time-independent. The following lemma gives lower and upper bounds on the optimal utility J_γ.

Lemma 1. Let (V^n)_{n \geq 0} be the vectors provided by the iterative method SOR. Then we get

V_i^n - \frac{\mu^n C}{1 - \mu} \leq J_\gamma(i) \leq V_i^n + \frac{\mu^n C}{1 - \mu}, \quad \text{for all } i \in S, \ n \geq 0. \quad (9)
Proof. The lemma is proved by the contraction mapping theory.
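A sketch of that standard argument, under the additional assumption that \|V^1 - V^0\|_\infty \leq C (which holds, for instance, when starting from V^0 = 0):

\|V^n - J_\gamma\|_\infty \;\leq\; \sum_{m=n}^{\infty} \|V^{m+1} - V^m\|_\infty \;\leq\; \sum_{m=n}^{\infty} \mu^m \, \|V^1 - V^0\|_\infty \;\leq\; \frac{\mu^n C}{1 - \mu},

which yields both inequalities in (9) componentwise.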

Inspired by Puterman [17] and (9), we establish a new test in the theorem below:

Theorem 1 (Non-optimality). Let (V^n)_{n \geq 0} be the vectors provided by Algorithm 1. If

c(i, b) + \gamma \sum_{j \in S} p(j/i, b) V_j^n > V_i^n + \frac{\mu^n (1 + \gamma) C}{1 - \mu} \quad (10)

at some point of time n, then any strategy which uses action b in state i is non-optimal.

Proof. If action b ∈ A(i) were optimal at point of time n, then we would get:

J_\gamma(i) = c(i, b) + \gamma \sum_{j \in S} p(j/i, b) J_\gamma(j) \quad (11)
MDPs - Action Elimination Procedures 399

From (9) and (11) it follows that

J_\gamma(i) \geq c(i, b) + \gamma \sum_{j \in S} p(j/i, b) V_j^n - \frac{\mu^n C \gamma}{1 - \mu} \quad (12)

Combining (9), (10) and (12), we obtain

J_\gamma(i) > V_i^n + \frac{\mu^n C}{1 - \mu} \geq J_\gamma(i)

which is a contradiction. This terminates the proof.

The practical application of our contribution requires the improvement of the SOR algorithm. Therefore, a combination of the test (10) and the SOR algorithm leads to the following algorithm (ISOR):
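As with Algorithm 1, the combined listing is a float in the original; a hedged Python sketch of the combination, reusing the earlier array layout (the name isor and the bookkeeping are ours, not the authors' implementation), could read:

import numpy as np

def isor(P, c, gamma, omega, mu, n_iter=100):
    """SOR sweep plus the elimination test (10): an action that fails the
    test in state i is removed from A(i) for all subsequent iterations."""
    n_states, n_actions = c.shape
    C = np.max(np.abs(c))
    A = [set(range(n_actions)) for _ in range(n_states)]  # surviving actions
    V = np.zeros(n_states)
    for n in range(1, n_iter + 1):
        V_old = V.copy()
        for i in range(n_states):
            q = [c[i, a] + gamma * P[i, a] @ V for a in A[i]]
            V[i] = omega * min(q) + (1 - omega) * V_old[i]
        bound = mu**n * (1 + gamma) * C / (1 - mu)
        for i in range(n_states):
            # test (10): eliminate b if its one-step value exceeds the bound
            A[i] = {a for a in A[i]
                    if c[i, a] + gamma * P[i, a] @ V <= V[i] + bound}
    return V, A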

Example 2. We take the previous example with discount factor γ = 0.5. Applying the ISOR algorithm, we get (Table 1):

The color shows the importance of each value of S^n(·, ·). Moreover, we conclude that:

J_{0.5} = (-2, 1.529418)

and

A(1) = {a_3} and A(2) = {a_1}.
Table 1. Eliminated actions by the test function S^n

Iter  S^n(1, a1)   S^n(1, a2)   S^n(1, a3)   S^n(2, a1)   S^n(2, a2)   S^n(2, a3)
1     -16.088235   -13.735294   -26.676470   -23.631488   -19.756055   -21.970588
2       0.499389     3.031447   -10.984429   -10.699471    -6.131997    -8.069713
3       7.044629     9.593449    -4.523000    -4.405664     0.161810    -1.775907
4       9.739728    12.295450    -1.862411    -1.814097     2.753377     0.815661

To prove the efficiency of our work, we present a comparison with previous results. Let N_{SJ-01}(γ) and N_{SJ-02}(γ) be the numbers of iterations necessary to identify all the non-optimal actions by the test SJ-01 (our test (10)) and by the test SJ-02 (test (3) in Semmouri [11]), respectively, for the given value of γ. For variations of the parameter γ, the behavior of the numbers N_{SJ-01}(γ) and N_{SJ-02}(γ) is given by Fig. 5 below:

Fig. 5. Comparison of tests

From the graph, we observe that the number of iterations N_{SJ-02} grows faster than the number of iterations N_{SJ-01}. This means that the new test is of significant importance for reducing computational complexity.

4 Conclusion
Markov chains have long been an effective tool for solving decision problems in the presence of uncertainty governed by chance, and their practical applications are constantly increasing in various fields.
In this work, we have provided an overview of some previous works that have arisen in the same literature of action elimination procedures related to the discounted cost utility over the infinite horizon. Next, we have presented the well-known cost objective function over the same horizon, and we have provided a distinguished test of non-optimality to eliminate non-optimal decisions from the planning. Since the successive over-relaxation (SOR) method is efficient and faster than the classic Value Iteration Algorithm (VIA), we have focused our attention on improving this algorithm by adding the test presented in this manuscript. In order to show its novelty, we have compared our result with that of the Semmouri and Jourhmane test (2020) through an illustrative example. There is a significant difference between them, since the main goal of the action elimination procedures is to reduce the computational effort. In this regard, the framework of identifying non-optimal decisions is still an open area, and more new results remain to be given, especially in MDPs using the discounted criterion.
Hence, future work will be applying the present result in Artificial Intelligence.

Acknowledgements. We are grateful to Professor Dr. C. Daoui of Sultan Moulay Slimane University, Beni Mellal, Morocco, for his help and encouragement, and to Mr. Lekbir Tansaoui. We would also like to thank ICIVC 2021 (Sur University College, Oman) and the referees for their careful reading of the manuscript and for useful comments and suggestions for improving this paper.

References
1. MacQueen, J.B.: A modified dynamic programming method for Markovian decision
problems. J. Math. Anal. Appl. 14(81), 38–43 (1965). https://doi.org/10.1016/
0022-247X(66)90060-6
2. MacQueen, J.B.: A test for suboptimal actions in Markovian decision problems.
Oper. Res. 15(3), 559–561 (1967). https://doi.org/10.1287/opre.15.3.559
3. Porteus, E.L.: Some bounds for discounted sequential decision processes. Manage.
Sci. 18(1), 7–11 (1971). https://doi.org/10.1287/mnsc.18.1.7
4. Grinold, R.C.: Elimination of suboptimal actions in Markov decision problems.
Oper. Res. 21(3), 848–851 (1973). https://doi.org/10.1287/opre.21.3.848
5. Hastings, N.A.J., Mello, J.M.C.: Tests for suboptimal actions in discounted Markov
programming. Manage. Sci. 19(9), 1019–1022 (1973). https://doi.org/10.1287/
mnsc.19.9.1019
6. Puterman, M.L., Shin, M.C.: Modified policy iteration algorithms for discounted
Markov decision problems. Manage. Sci. 24(11), 1127–1137 (1978). https://doi.
org/10.1287/mnsc.24.11.1127
7. Sadjadi, D., Bestwick, P.F.: A stagewise action elimination algorithm for the discounted semi-Markov problem. J. Oper. Res. Soc. 30(7), 633–637 (1979). https://doi.org/10.1057/jors.1979.156
8. White, D.J.: The determination of approximately optimal policies in Markov decision processes by the use of bounds. J. Oper. Res. Soc. 33(3), 253–259 (1982). https://doi.org/10.1057/jors.1982.51
9. Semmouri, A., Jourhmane, M.: Markov decision processes with discounted cost:
the action elimination procedures. In: ICCSRE 2nd International Conference of
Computer Science and Renewable Energies, pp. 1–6. IEEE Press, Agadir, Morocco
(2019). https://doi.org/10.1109/ICCSRE.2019.8807578
402 A. Semmouri et al.

10. Semmouri, A., Jourhmane, M.: Markov decision processes with discounted costs
over a finite horizon: action elimination. In: Masrour, T., Cherrafi, A., El Hassani,
I. (eds.) A2IA 2020. AISC, vol. 1193, pp. 199–213. Springer, Cham (2021). https://
doi.org/10.1007/978-3-030-51186-9 14
11. Semmouri, A., Jourhmane, M., Elbaghazaoui, B.E.: Markov decision processes with discounted costs: new test of non-optimal actions. J. Adv. Res. Dyn. Control Syst. 12(05-SPECIAL ISSUE), 608–616 (2020). https://doi.org/10.5373/JARDCS/V12SP5/20201796
12. Semmouri, A., Jourhmane, M., Belhallaj, Z.: Discounted Markov decision processes
with fuzzy costs. Ann. Oper. Res. 295(2), 769–786 (2020). https://doi.org/10.
1007/s10479-020-03783-6
13. Howard, R.A.: Dynamic Programming and Markov Processes. Wiley, New York,
London (1960)
14. Bellman, R.E.: Dynamic Programming. Princeton University Press, New Jersey
Google Scholar (1957)
15. Bertsekas, D.P., Shreve, S.E.: Stochastic Optimal Control. Academic Press, New
York (1978)
16. White, D.: Markov Decision Processes. Wiley, England (1993)
17. Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)
18. Denardo, E.V.: Contraction mappings in the theory underlying dynamic programming. SIAM Rev. 9(2), 165–177 (1967). https://doi.org/10.1137/1009030
19. Reetz, D.: Solution of a Markovian decision problem by successive overrelaxation. Zeitschrift für Oper. Res. 17(1), 29–32 (1973). https://doi.org/10.1007/BF01951368
Control Quality Analysis in Accordance with Parametrization in MPC Automation System

Tomas Barot1, Mikhail Perevozchikov2, Marek Kubalcik3, Jaromir Svejda1, and Ladislav Rudolf4

1 Department of Mathematics with Didactics, Faculty of Education, University of Ostrava, Fr. Sramka 3, 709 00 Ostrava, Czech Republic
{Tomas.Barot,Jaromir.Svejda}@osu.cz
2 Faculty of International Relations, Saint Petersburg State University, Smolny str. 1/3, 8 entrance, 191060 Saint-Petersburg, Russia
3 Department of Process Control, Faculty of Applied Informatics, Tomas Bata University in Zlin, Nad Stranemi 4511, 760 05 Zlin, Czech Republic
Kubalcik@fai.utb.cz
4 Department of Technical and Vocational Education, Faculty of Education, University of Ostrava, Fr. Sramka 3, 709 00 Ostrava, Czech Republic
Ladislav.Rudolf@osu.cz

Abstract. In automation systems, the parametrization of a controller during its synthesis has an important influence on improving the control quality aspects and on decreasing the computational complexity. In the field of process control, Model Predictive Control (MPC) has been considered one of the modern control methods. The MPC strategy is a novel and modern approach. In MPC, the control process is influenced by parameters dependent on the strategy of the receding horizons. These horizons can be set by the programmers of the controller. However, the parametrization has not yet been widely bound to the statistical analysis of the MPC progressions of the control quality criteria. Quantitative research techniques can be one of the appropriate approaches to deciding which parameters are suitable. In this paper, statistical methods for testing differences between the MPC control criteria are proposed as an extended method with regard to the guarantee of statistical significance. In particular, the testing of differences between the criteria of the MPC automation system is demonstrated on the control of a multivariable model of a process, including a consideration of the parametrical (ANOVA) or non-parametrical (Kruskal-Wallis) statistical approaches based on the significance level 0.001.

Keywords: MPC automation system · Model predictive control · Parametrization · Horizons setting · Control quality · Statistical analysis · Statistical significance

1 Introduction
The methods of statistical analysis [1] can bring, and can guarantee, statistical significance [2] across a wide spectrum of decisions in automation theory [3] and in signal processing [4]. In automation systems, an analysis of the parametrization of a controller during its synthesis can have an important impact on improving the control quality aspects [5] while decreasing the computational complexity [6].
For demonstrating these aims, the Model Predictive Control (MPC) automation system [7–10] with a multivariable mathematical model of the controlled system [9] can be the appropriate example of process control, where the computations can be classified as time-consuming on scales of computational complexity.
Model Predictive Control (MPC) [7] has been considered a novel and modern approach across the research community in automation theory. In MPC, the control process is influenced by parameters related to the strategy of the receding horizon [8]. The receding horizon [8] is an area in which the predictions of the future increments of the manipulated variables [7] are computed from the previous values of the variables in MPC and in the mathematical model of the process [7]. The computations are limited to the size of a horizon window [7] given by the horizon parameters. The determination of predictions in MPC is realized in the cooperation of two bound subsystems – the predictor and the optimizer [10]. The predictor [7] is characterized by the prediction equation including the CARIMA model [10]. In the optimizer [7], a non-linear constrained optimization problem is solved [11], often in the form of quadratic programming [12], frequently solved by the Hildreth method [13]. From the horizon, only the first element of the computed information is applied to the feedback control, and the receding horizon strategy is then repeated [7–13].
The control quality progressions of the multivariable MPC automation control process can be further analyzed statistically. The influence of the MPC horizon parameters on the control quality progressions has not been widely considered yet. In addition, including statistical significance [1, 2] can bring a guarantee into decisions about which set of parameters is more efficient for the MPC controller in the phase of its synthesis, e.g. as in [14].
For the purposes of testing significant differences [1] across a set of parameters (three or more), the ANOVA method [1] (parametrical approach) or the Kruskal-Wallis method [1] (non-parametrical approach) can be appropriately utilized. The choice between parametrical and non-parametrical methods is given by testing the normality of the data [15]. Each parameter in the MPC controller [7] can be defined as the grouping item for these methods, and the progressions of the control quality criteria can be grouped by the integer settings of the horizon parameters. Using these statistical methods [1] for testing statistically significant differences, the setting of a controller in the simulation of the MPC automation system [7] is presented, with the aim of achieving a decision layer for the parametrization in the synthesis phase. On the significance level 0.001, the proposed analysis is demonstrated on mathematical models of a controlled multivariable process, where the computational complexity increases rapidly with higher parameters.
2 Selected Methods for Testing Differences in Favor of Analysis of Parametrization in Automation Systems
The quantitative research techniques [1] can be one of the appropriate approaches to deciding which parameters of an automation system [3] are efficient and suitable. In this paper, the statistical methods of testing differences [1] between the MPC control quality criteria [5] are proposed as an extended method with regard to the guarantee of statistical significance [2].
In general, the quantitative techniques are characterized by the guarantee of the significance level [1]. According to this approach, the conclusions of their methods of mathematical induction [1] can be further generalized. In the frame of this type of research, appropriate connections to the applied research field can be widely projected, as can be seen e.g. in [16–18].
The main classification of the methods of mathematical induction is built on the principle of measuring the normality property [15] of the data. According to this assumption, the appropriate particular test is then selected.
Therefore, testing normality is the primary phase, realized on the data using the Shapiro-Wilk [15] or Anderson-Darling [15] tests, which are based on testing against the normal probability density f(x) (1) with parameters the mean value μ and the standard deviation σ. Following the results of these normality tests, the methods of mathematical induction divide into parametrical tests and non-parametrical tests [1, 15].

f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\Big( -\frac{(x - \mu)^2}{2\sigma^2} \Big) \quad (1)

Representatives of the parametrical tests are e.g. the T + F test and the ANOVA test. These tests are determined for data fulfilling the property of normality. In the opposite case, non-parametrical tests should be utilized; the non-parametrical tests are e.g. the Mann-Whitney test and the Kruskal-Wallis test [1, 15].
For these denoted representatives of the parametrical and non-parametrical tests, the same aim is assumed: testing the statistical significance of the differences among means (in the case of parametrical tests) or medians (in the case of non-parametrical tests). With regard to a numbered cardinal variable ordered under the items of a nominal variable, the particular utilization of the tests is as follows: T + F test and Mann-Whitney test (for 2 items of the nominal variable); ANOVA and Kruskal-Wallis test (for 3 or more items of the nominal variable) [1, 15].
In each of the intended tests, the result is considered in the form of a p value. In relation to the application area of the quantitative research, the p value is then compared with the significance level α, with conclusions about the tested hypothesis (Table 1). For a numbered variable X divided into n subgroups, the parameters λ_i, i = 1, ..., n are mean values in the case of parametric tests (ANOVA for n ≥ 3, T + F test for n = 2) and medians in the case of non-parametric tests (Kruskal-Wallis for n ≥ 3, Mann-Whitney for n = 2). For p > α, the zero hypothesis H_0 fails to be rejected on the significance level α. In the opposite case, the zero hypothesis H_0 is rejected in favor of the alternative hypothesis H_1 on the significance level α. The significance level is usually set as 0.05 (social science), 0.01 (technical science) or 0.001 (medical science). Concrete realizations of quantitative research can be seen e.g. in [1, 15–18].

Table 1. Consideration of the structure of hypotheses for testing statistical differences

Hypothesis                   Assumption
Zero hypothesis H_0          H_0: λ_1 = λ_2 = · · · = λ_n
                             There are no statistically significant differences between the progressions of the numbered values of the stochastic variable X categorized into n subgroups
Alternative hypothesis H_1   H_1: λ_1 ≠ λ_2 ≠ · · · ≠ λ_n
                             There are statistically significant differences between the progressions of the numbered values of the stochastic variable X categorized into n subgroups
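The decision procedure summarized above (normality test first, then ANOVA or Kruskal-Wallis) can be sketched in a few lines of Python with SciPy; the function name and data layout are illustrative, not from the paper:

from scipy import stats

ALPHA = 0.001  # significance level used in this paper

def test_differences(groups):
    """groups: one sample of criterion values per parameter setting (n >= 3).
    Returns the p value and the decision on H0 from Table 1."""
    # primary phase: Shapiro-Wilk normality test on each subgroup
    normal = all(stats.shapiro(g).pvalue > ALPHA for g in groups)
    if normal:
        p = stats.f_oneway(*groups).pvalue   # parametrical: ANOVA on means
    else:
        p = stats.kruskal(*groups).pvalue    # non-parametrical: medians
    return p, ("H0 not rejected" if p > ALPHA else "H0 rejected")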

3 Structure of Feedback Control in MPC Automation System

In automation systems, the control quality criteria [5] can suitably express the efficiency of each control strategy, whose main aims are either the minimization of the increments of the manipulated variable – the criterion J_1 (with Δu_1(k), Δu_2(k)) – or the minimization of the output errors (the difference between the reference signal w and the output variable y) – the criterion J_2 (with w_1(k), w_2(k), y_1(k), y_2(k)) – considering a two-input two-output system, where k is the index of the order in the sequence of the discrete form of the realized control strategy. In this paper, the progress values of the control quality criteria (2)–(3) are observed and further statistically analyzed with regard to the influence of the parametrization of the automation system on them [5].

J_1 = \sum_k (\Delta u_1(k))^2 + \sum_k (\Delta u_2(k))^2 \quad (2)

J_2 = \sum_k (w_1(k) - y_1(k))^2 + \sum_k (w_2(k) - y_2(k))^2 \quad (3)
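For illustration, the criteria (2)–(3) can be evaluated directly from logged simulation sequences; a small Python sketch with illustrative array names:

import numpy as np

def control_quality(du1, du2, w1, w2, y1, y2):
    """Criteria (2)-(3) from sequences indexed by k."""
    J1 = np.sum(du1**2) + np.sum(du2**2)              # increment effort (2)
    J2 = np.sum((w1 - y1)**2) + np.sum((w2 - y2)**2)  # output errors (3)
    return J1, J2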

For demonstrating these aims, the appropriate example of process control can be the Model Predictive Control (MPC) automation system [7] with a multivariable mathematical model of the controlled system [10], where the computations can be classified as consuming on scales of computational complexity.
Model Predictive Control (MPC) [7–10] has been considered a novel and modern approach across the research community in automation theory. In MPC, the control process is influenced by parameters related to the receding horizon strategy. The receding horizon is an area in which the predictions of the future increments of the manipulated variables are computed from the previous values of the variables in MPC and in the mathematical model of the process. The computations are limited to the size of the horizon window given by the horizon parameters. The determination of the prediction in MPC is realized in the cooperation of two bound subsystems – a predictor and an optimizer. The predictor is characterized by the prediction equation including the CARIMA model. In the optimizer, a non-linear constrained optimization problem is solved, often in the form of quadratic programming, frequently solved by the Hildreth method [13]. From the horizon, only the first element of the computation is applied to the feedback control, and the receding horizon strategy is then repeated [7–10, 17].
For the simulation of the MPC automation system, the following mathematical model (4)–(7) of a two-input two-output multivariable process is controlled [7–10, 17].

G(z^{-1}) = A^{-1}(z^{-1}) B(z^{-1}) \quad (4)

A(z^{-1}) = \begin{bmatrix} \alpha_{11}(z^{-1}) & \alpha_{12}(z^{-1}) \\ \alpha_{21}(z^{-1}) & \alpha_{22}(z^{-1}) \end{bmatrix}, \quad B(z^{-1}) = \begin{bmatrix} \beta_{11}(z^{-1}) & \beta_{12}(z^{-1}) \\ \beta_{21}(z^{-1}) & \beta_{22}(z^{-1}) \end{bmatrix} \quad (5)

\alpha_{11}(z^{-1}) = 1 + \alpha_{111} z^{-1} + \alpha_{112} z^{-2}, \quad \alpha_{12}(z^{-1}) = \alpha_{121} z^{-1} + \alpha_{122} z^{-2},
\alpha_{21}(z^{-1}) = \alpha_{211} z^{-1} + \alpha_{212} z^{-2}, \quad \alpha_{22}(z^{-1}) = 1 + \alpha_{221} z^{-1} + \alpha_{222} z^{-2} \quad (6)

\beta_{11}(z^{-1}) = \beta_{111} z^{-1} + \beta_{112} z^{-2}, \quad \beta_{12}(z^{-1}) = \beta_{121} z^{-1} + \beta_{122} z^{-2},
\beta_{21}(z^{-1}) = \beta_{211} z^{-1} + \beta_{212} z^{-2}, \quad \beta_{22}(z^{-1}) = \beta_{221} z^{-1} + \beta_{222} z^{-2} \quad (7)

The predictor subsystem includes the system of equations based on the CARIMA model (8) [8], from which the rule (9) for the predicted output variables is expressed. The vector of the two increments of the manipulated variable for the two-dimensional system is \Delta u(k) = [\Delta u_1(k), \Delta u_2(k)]^T. The partial vectors \Delta u_1(k), \Delta u_2(k) of the future manipulated variable \Delta u(k) have N_u elements. The CARIMA model also includes the output signal y(k) and the error term e_s(k) [7–10, 17].

A(z^{-1}) y(k) = B(z^{-1}) u(k) + \Delta^{-1}(z^{-1}) C(z^{-1}) e_s(k), \quad \Delta(z^{-1}) = \begin{bmatrix} 1 - z^{-1} & 0 \\ 0 & 1 - z^{-1} \end{bmatrix} \quad (8)

y(k) = A_1 y(k-1) + A_2 y(k-2) + A_3 y(k-3) + B_1 \Delta u(k-1) + B_2 \Delta u(k-2),

A_1 = \begin{bmatrix} 1 - \alpha_{111} & -\alpha_{121} \\ -\alpha_{211} & 1 - \alpha_{221} \end{bmatrix}, \quad A_2 = \begin{bmatrix} \alpha_{111} - \alpha_{112} & \alpha_{121} - \alpha_{122} \\ \alpha_{211} - \alpha_{212} & \alpha_{221} - \alpha_{222} \end{bmatrix},

A_3 = \begin{bmatrix} \alpha_{112} & \alpha_{122} \\ \alpha_{212} & \alpha_{222} \end{bmatrix}; \quad B_1 = \begin{bmatrix} \beta_{111} & \beta_{121} \\ \beta_{211} & \beta_{221} \end{bmatrix}, \quad B_2 = \begin{bmatrix} \beta_{112} & \beta_{122} \\ \beta_{212} & \beta_{222} \end{bmatrix} \quad (9)
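One step of the prediction model (9) is a plain matrix recursion; a Python sketch (names are illustrative):

import numpy as np

def model_step(A1, A2, A3, B1, B2, y1, y2, y3, du1, du2):
    """y(k) from (9); yi = y(k-i) and dui = Delta u(k-i) are 2-vectors."""
    return A1 @ y1 + A2 @ y2 + A3 @ y3 + B1 @ du1 + B2 @ du2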

Using the matrices P and G, the system of predictive equations becomes more unified (10)–(12) and representative in the sense of which considered values are used in the predictor subsystem as previous values and which are predicted. N_1 and N_2 are the minimum and maximum prediction horizons. The maximum prediction horizon expresses the number of predictive equations in the MPC controller synthesis. The matrix Z is a zero matrix [7–10, 17].
\underbrace{\begin{pmatrix} \hat{y}(k+N_1) \\ \vdots \\ \hat{y}(k+N_2) \end{pmatrix}}_{\hat{y}} = P \begin{pmatrix} y(k) \\ y(k-1) \\ y(k-2) \\ \Delta u(k-1) \end{pmatrix} + G \underbrace{\begin{pmatrix} \Delta u(k) \\ \Delta u(k+1) \\ \vdots \\ \Delta u(k+N_u-1) \end{pmatrix}}_{\Delta u},
P = \begin{pmatrix} P_{11} & P_{12} & \cdots & P_{14} \\ P_{21} & P_{22} & \cdots & P_{24} \\ \vdots & & \ddots & \vdots \\ P_{i1} & P_{i2} & \cdots & P_{i4} \end{pmatrix}, \quad G = \begin{pmatrix} G_{11} & G_{12} & \cdots & G_{1j} \\ G_{21} & G_{22} & \cdots & G_{2j} \\ \vdots & & \ddots & \vdots \\ G_{i1} & G_{i2} & \cdots & G_{ij} \end{pmatrix}   (10)

P \in \mathbb{R}^{2N_2 \times 8};
P_{11} = A_1; \quad P_{12} = A_2; \quad P_{13} = A_3; \quad P_{14} = B_2;
P_{21} = A_1^2 + A_2; \quad P_{22} = A_1 A_2 + A_3; \quad P_{23} = A_1 A_3; \quad P_{24} = A_1 B_2;
P_{31} = A_1^3 + A_1 A_2 + A_2 A_1 + A_3; \quad P_{32} = A_1^2 A_2 + A_1 A_3 + A_2^2;
P_{33} = A_1^2 A_3 + A_2 A_3; \quad P_{34} = A_1^2 B_2 + A_2 B_2;
P_{ij} = A_1 P_{(i-1)j} + A_2 P_{(i-2)j} + A_3 P_{(i-3)j}, \quad i = 4, \ldots, N_2; \; j = 1, \ldots, 4   (11)

G = Z; \quad Z \in \mathbb{R}^{(2N_2 - 2N_1 + 2) \times 2N_2};
G_{11} = G_{22} = G_{33} = B_1;
G_{21} = G_{32} = A_1 B_1 + B_2;
G_{31} = A_1^2 B_1 + A_1 B_2 + A_2 B_1;
G_{i1} = A_1 G_{(i-1)1} + A_2 G_{(i-2)1} + A_3 G_{(i-3)1},
G_{i(j-1)} = A_1 G_{(i-1)(j-1)} + A_2 G_{(i-2)(j-1)} + A_3 G_{(i-3)(j-1)} + B_2, \quad G_{ij} = B_1,
i = 4, \ldots, N_2; \; j = 1, \ldots, i   (12)
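For illustration, the recursive build-up of the block matrices P and G in (11)–(12) can be sketched in code as follows (a minimal NumPy sketch under the structure stated above, not the authors' MATLAB implementation; function and variable names are illustrative, the 2×2 blocks A1, A2, A3, B1, B2 are assumed to be given as in (9), and the lower-triangular block-Toeplitz pattern of G implied by (12) is used):

import numpy as np

def build_prediction_matrices(A1, A2, A3, B1, B2, N2, Nu):
    """Recursive build-up of the block matrices P and G of (11)-(12)."""
    # --- P: first three block rows explicitly, the rest by the recursion ---
    P = [[A1, A2, A3, B2],
         [A1 @ A1 + A2, A1 @ A2 + A3, A1 @ A3, A1 @ B2],
         [A1 @ A1 @ A1 + A1 @ A2 + A2 @ A1 + A3,
          A1 @ A1 @ A2 + A1 @ A3 + A2 @ A2,
          A1 @ A1 @ A3 + A2 @ A3,
          A1 @ A1 @ B2 + A2 @ B2]]
    for i in range(3, N2):
        P.append([A1 @ P[i-1][j] + A2 @ P[i-2][j] + A3 @ P[i-3][j]
                  for j in range(4)])
    # --- G: first block column recursively, then the Toeplitz shifts ---
    col = [B1, A1 @ B1 + B2, A1 @ A1 @ B1 + A1 @ B2 + A2 @ B1]
    for i in range(3, N2):
        col.append(A1 @ col[i-1] + A2 @ col[i-2] + A3 @ col[i-3])
    Z = np.zeros_like(B1)
    G = [[col[i-j] if i >= j else Z for j in range(Nu)] for i in range(N2)]
    return np.block(P), np.block(G)

The first block column of G is propagated down the diagonals, which matches the pattern G11 = G22 = G33 and G21 = G32 stated in (12).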

The system of prediction equations (8)–(12) is connected with the optimization subsystem (13), which is frequently solved by the Hildreth method, based on the principle of quadratic programming. Quadratic programming is a special case of the class of non-linear programming methods. In (13), the cost function J is minimized with respect to the constrained vector of increments of the manipulated variable in the MPC feedback control process. The constraints are given by the matrix M and the vector γ, and I is an identity matrix. The reference signal is w(k); like the other variables, each vector variable in the following equations is two-dimensional [7–13, 17].
\min_{\Delta u} \; J = \frac{1}{2}\,\Delta u^T H\,\Delta u + b^T \Delta u \quad \text{subject to} \quad M\,\Delta u \le \gamma,
H = G^T G + I, \quad b = G^T \left( P \begin{pmatrix} y(k) \\ y(k-1) \\ y(k-2) \\ \Delta u(k-1) \end{pmatrix} - w \right)   (13)
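A minimal sketch of Hildreth's coordinate-wise dual iteration for the quadratic program (13) is given below (illustrative only; the function name, iteration limit and stopping rule are assumptions, and the full method is described in [13]):

import numpy as np

def hildreth_qp(H, b, M, gamma, iters=100, tol=1e-9):
    """Sketch of Hildreth's method for min 0.5 u'Hu + b'u  s.t.  M u <= gamma.
    The non-negative dual variables lam are updated one coordinate at a time."""
    Hinv = np.linalg.inv(H)
    P = M @ Hinv @ M.T                 # dual Hessian
    K = gamma + M @ Hinv @ b           # dual linear term
    lam = np.zeros(len(gamma))
    for _ in range(iters):
        lam_old = lam.copy()
        for i in range(len(lam)):
            w = K[i] + P[i] @ lam - P[i, i] * lam[i]
            lam[i] = max(0.0, -w / P[i, i])
        if np.linalg.norm(lam - lam_old) < tol:
            break
    return -Hinv @ (b + M.T @ lam)     # primal solution: Delta-u

Because each dual variable is updated in turn and simply clipped at zero, no factorization of the active constraint set is needed, which is one reason this method suits repeated on-line use in MPC.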

4 Results

To analyze statistically significant differences in the progress of the control quality criteria using the proposed statistical approach, the two-input two-output system (14) was selected.
A(z^{-1}) = \begin{pmatrix} 1 - 0.95 z^{-1} + 0.226 z^{-2} & 0.099 z^{-1} - 0.009 z^{-2} \\ 0.194 z^{-1} + 0.086 z^{-2} & 1 + 0.427 z^{-1} + 0.218 z^{-2} \end{pmatrix},
B(z^{-1}) = \begin{pmatrix} -0.087 z^{-1} + 0.01 z^{-2} & 0.796 z^{-1} - 0.214 z^{-2} \\ 0.235 z^{-1} + 0.092 z^{-2} & 0.152 z^{-1} + 0.062 z^{-2} \end{pmatrix}   (14)

The MPC automation system described in Sect. 3 was simulated in the MATLAB environment with the minimum horizon N_1 set to 1. The discrete control was considered for k from 0 to 200. The reference signals were chosen as w1 = 0.05 (for k = 0 … 50 and k = 101 … 150) and w2 = 0.1 (for k = 51 … 100 and k = 151 … 200). Constraints on the variables in MPC were not considered. The control process can be seen in Fig. 1.
Both the control horizon N_u and the maximum horizon N_2 were set jointly from 5 to 30 in steps of 5. The total criteria for these horizon values can be seen in Table 2. As the table shows, the minimal criteria are obtained for the lowest horizon setting.
For both the control horizon N_u and the maximum horizon N_2, statistically significant differences were analyzed using the Kruskal-Wallis test (for the non-parametric character of the data) at the significance level 0.001. As can be seen in Table 3, the measured values of the progressing parts of the computations did not exhibit significant differences, since the achieved p-values were always greater than the significance level; this is also visible in Fig. 2. In Table 3, the column k denotes the intervals of the realizations of the discrete control using the MPC control strategy, over which the values of the parts of the control quality criteria (2) and (3) were obtained.
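The per-interval tests can be reproduced with a standard statistics library; the following sketch (with illustrative random data in place of the recorded criterion parts) shows the intended use of the Kruskal-Wallis test at the 0.001 significance level:

import numpy as np
from scipy.stats import kruskal

# Illustrative data: one sample vector per horizon setting N2 = Nu = 5,...,30,
# e.g. the values of (w1(k) - y1(k))^2 on one interval of k.
rng = np.random.default_rng(0)
criterion_parts = [rng.random(26) for _ in range(6)]

stat, p_value = kruskal(*criterion_parts)
print(p_value > 0.001)   # True -> no significant difference at the 0.001 level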

Fig. 1. Progress of variables in simulation of realized MPC automation system

Table 2. Control quality criteria obtained in MPC control

Criterion   N2 = Nu = 5   N2 = Nu = 10   N2 = Nu = 15   N2 = Nu = 20   N2 = Nu = 25   N2 = Nu = 30
J1          10.06298      20.05288       30.05232       40.05241       50.05245       60.05246
J2          10.03033      20.03312       30.03278       40.03258       50.03254       60.03254

Table 3. Achieved p-values from the Kruskal-Wallis tests for the analysis of significant differences across partial computations of the control criteria on intervals of k in MPC control

k         (w1(k) − y1(k))^2   (w2(k) − y2(k))^2   (Δu1(k))^2   (Δu2(k))^2
0–25 0.6146 0.8499 0.9959 0.9959
26–50 0.2961 0.02234 0.8544 0.939
51–75 0.2754 0.2917 0.8858 0.8725
76–100 0.2604 0.0026 0.7207 0.9203
101–125 0.2655 0.2904 0.8879 0.8623
126–150 0.2565 0.002571 0.7249 0.9173
151–175 0.2759 0.2954 0.8928 0.8653
176–200 0.9464 0.9825 0.9742 0.9658

Fig. 2. Achieved p values for parts of control quality criterions in realized MPC

5 Conclusion
With a guaranteed statistical significance level of 0.001, the control quality criteria of the realized MPC automation system were analyzed, both in total and in their partial computations, for a controlled multivariable model in accordance with the controller parametrization. The MPC feedback control was realized in MATLAB, and the influence of the maximum and control horizon parameters of the MPC controller on the criteria computations was observed. The partial computations of the criteria did not differ statistically across the increasing horizon settings; therefore, the progress of the criteria in the presented simulations can be assessed as parametrically independent of the horizon values. However, according to the total values of the measured criteria, lower values of the control and maximum horizons are recommended with regard to the aim of their total minimization.

References
1. Gauthier, T.D., Hawley, M.E.: Statistical methods. In: Introduction to Environmental Forensics, 3rd edn., pp. 99–148. Elsevier (2015). https://doi.org/10.1016/B978-0-12-404696-2.00005-9
2. Stockemer, D.: Quantitative Methods for the Social Sciences. Springer, Heidelberg (2019).
https://doi.org/10.1007/978-3-319-99118-4
3. Navratil, P., Pekar, L., Klapka, J.: Load distribution of heat source in production of heat and
electricity. Int. Energy J. 17(3), 99–111 (2017). ISSN 1513-718X
4. Corriou, J.P.: Process Control: Theory and Applications. Springer, Heidelberg (2004)
5. Kubalcik, M., Bobal, V.: Adaptive control of coupled drives apparatus based on polynomial
theory. In: Proceedings of the IMechE Part I: Journal Systems and Control Engineering, vol.
220, no. I7, pp. 641–654. IEEE (2006). https://doi.org/10.1109/CCA.2002.1040252
6. Kubalcik, M., Bobal, V., Barot, T.: Modified Hildreth’s method applied in multivariable
model predictive control. In: Machado, J., Soares, F., Veiga, G. (eds.) HELIX 2018. LNEE,
vol. 505, pp. 75–81. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-91334-6_11.
ISBN 978-3-319-91333-9

7. Rossiter, J.A.: Model Based Predictive Control: A Practical Approach, CRC Press (2003)
8. Kwon, W.H.: Receding Horizon Control: Model Predictive Control for State Models. Springer,
Heidelberg (2005)
9. Navratil, P., Balate, J.: One of possible approaches to control of multivariable control loop.
IFAC Proc. Vol. 40(5), 207−212 (2007). https://doi.org/10.3182/20070606-3-MX-2915.
00033. ISSN 1474-6670
10. Camacho, E.F., Bordons, C.: Model Predictive Control. Springer, Heidelberg (2004)
11. Li, C., Mao, Y., Yang, J., Wang, Z., Yanhe, X.: A nonlinear generalized predictive control for
pumped storage unit. Renew. Energy 114, 945–959 (2017). https://doi.org/10.1016/j.renene.
2017.07.055. ISSN 0960-1481
12. Dostal, Z.: Optimal Quadratic Programming Algorithms: With Applications to Variational
Inequalities. Springer, Heidelberg (2009)
13. Wang, L.: Model Predictive Control System Design and Implementation Using MATLAB.
Springer, Heidelberg (2009)
14. Barot, T., Burgsteiner, H., Kolleritsch, W.: Comparison of discrete autocorrelation functions
with regards to statistical significance. In: Silhavy, R. (ed.) CSOC 2020. AISC, vol. 1226,
pp. 257–266. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-51974-2_24
15. Vaclavik, M., Sikorova, Z., Barot, T.: Particular analysis of normality of data in applied
quantitative research. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds.) CoMeSySo 2018.
AISC, vol. 859, pp. 353–365. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-
00211-4_31. ISBN 978-3-030-00210-7
16. Pivarc, J.: Ideas of Czech primary school pupils about intellectual disability. Educ. Stud.
[Article in Press] (2018). https://doi.org/10.1080/03055698.2018.1509784. ISSN 0305-5698
17. Barot, T., Krpec, R., Kubalcik, M.: Applied quadratic programming with principles of statis-
tical paired tests. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds.) CoMeSySo 2019. AISC,
vol. 1047, pp. 278–287. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-31362-
3_27. ISBN 978-3-030-31361-6
18. Simbartl, P., Honzikova, J.: Using demonstration to design creative products focused on folk
traditions. In: 8th International Conference on Education and New Learning Technologies,
pp. 2832–2837. IATED (2016). https://doi.org/10.21125/edulearn.2016.1613
Enhancing the Social Learning Ability
of Spider Monkey Optimization
Algorithm

Apoorva Sharma1(B), Nirmala Sharma1, and Kavita Sharma2

1 Rajasthan Technical University, Kota, India
apoorvashrma2628@gmail.com
2 Government Polytechnic College, Kota, India

Abstract. In the arena of swarm intelligence algorithms, spider monkey optimization (SMO) is a very powerful algorithm. In this article, an efficient modification of SMO is presented. In the proposed method, the social learning of a spider monkey is enhanced by using the local leader of a neighboring group. The proposed algorithm is titled the social learner spider monkey optimization algorithm (SLSMO). This modified variant exploits the search space efficiently, and the convergence speed of the algorithm towards the optimal solution is also enhanced. For validating the authenticity of the proposed SLSMO, it is evaluated on three benchmark sets, i.e. 23 global optimization problems, 3 engineering design problems, and 16 constraint optimization problems. The attained outcomes are also compared with significant approaches available in the literature, and the obtained outcomes prove the authenticity of the propounded approach.

1 Introduction
Swarm intelligence (SI) algorithms are designed to solve the complicated multi-dimensional functions of complicated systems [15]. These are nature-inspired algorithms (NIAs), developed by taking inspiration from colonies of social animals, like fish schools, bees, bird flocks, ants, etc. Spider monkey optimization (SMO) is a widely known and recent strategy in the field of SI algorithms. It was first introduced in 2014 by J. C. Bansal et al. [5]. The main feature of every SI-based algorithm is the fitness-based update of all the potential solutions; to perform this operation, every algorithm needs to follow two methods. The first is the variation method, in which different areas of the search space are explored, and the second is the selection method, which assures the exploitation of previous experience. Besides all its features, SMO also has the drawback of occasionally stopping its progress towards the global optimum even though the set of potential solutions has not converged to a local optimum [6].
For any SI-based algorithm, just as it is necessary to use the variation process intelligently to explore the search space, it is equally important to use the selection process in an organized way to exploit previous experience. Hence a new strategy, namely the social learner spider monkey optimization (SLSMO) algorithm, is developed. In SLSMO, a spider monkey learns not only from the global leader and its local leader but also from the local leader of a randomly selected neighbouring group. This intensifies the exploration and convergence ability of the algorithm. To prove the authenticity of the proposed algorithm, it is tested over 23 global optimization functions (GOPs) [2], 3 engineering design problems (EDPs) [12] and 16 constrained optimization problems (COPs) [7]. The obtained outcomes for the GOPs are compared with 7 state-of-the-art algorithms, the EDP results are compared with three state-of-the-art algorithms, and the COP results are compared with 5 SI-based algorithms. The obtained results validate the propounded approach.

The remainder of the article is organized as follows: Sect. 2 introduces SMO, Sect. 3 introduces the SLSMO algorithm, and the experimental results and discussions are given in Sect. 4. At the end, Sect. 5 concludes the summary.

2 Spider Monkey Optimization (SMO) Algorithm

This intelligent optimization technique is inspired by the fission-fusion behavior of spider monkeys. It is a social grouping pattern in which the spider monkeys of a large community divide themselves into small parties according to the availability of food sources [5].
At first, the initialization phase is executed. After that, all phases of SMO are executed. SMO contains six phases, specifically the Local Leader Phase, Global Leader Phase, Global Leader Learning Phase, Local Leader Learning Phase, Local Leader Decision Phase, and Global Leader Decision Phase.

2.1 Initialization Phase

Firstly, the population of N spider monkeys is distributed uniformly in the specified search region. Each monkey SM_p (p = 1, 2, ..., N) is a vector of dimension D, initialized as follows:

SM_pq = SMmin_q + U(0, 1) × (SMmax_q − SMmin_q)   (1)

where SMmin_q and SMmax_q are the minimum and maximum bounds in the q-th dimension and U(0, 1) is a uniformly distributed random number in the interval [0, 1].
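A minimal sketch of this initialization in Python/NumPy (the function and variable names are illustrative, not from the original paper):

import numpy as np

def initialize_swarm(N, D, sm_min, sm_max, rng=None):
    """Uniform initialization of N spider monkeys in D dimensions, Eq. (1)."""
    rng = rng or np.random.default_rng()
    u = rng.random((N, D))                   # U(0,1) for every component
    return sm_min + u * (sm_max - sm_min)    # SM_pq within [SMmin_q, SMmax_q]

swarm = initialize_swarm(50, 30, -5.12, 5.12)   # e.g. the swarm size and bounds used for gop1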

2.2 Local Leader Phase

During this stage, every spider monkey updates its position by learning from its local leader as well as from an arbitrarily selected neighbour of that particular group. The position update equation for the local leader phase is expressed as follows:

SMnew_pq = SM_pq + U(0, 1) × (LL_kq − SM_pq) + U(−1, 1) × (SM_rq − SM_pq)   (2)



The fitness of the new position is calculated, and then the greedy selection mechanism is applied to choose the better position. SM_pq is the q-th dimension of the p-th spider monkey, LL_kq is the position of the k-th group's local leader, and SM_rq is the position of a spider monkey taken arbitrarily from the k-th group such that r ≠ p.
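A sketch of this update with greedy selection, for one group of monkeys (illustrative names; `evaluate` is assumed to return the fitness of a candidate position, with higher fitness taken as better):

def local_leader_phase(SM, fitness, LL_k, group, pr, rng, evaluate):
    """Eq. (2) applied to every monkey of one group, with greedy selection."""
    for p in group:
        r = p
        while r == p:                          # random neighbour with r != p
            r = rng.choice(group)
        new = SM[p].copy()
        for q in range(SM.shape[1]):
            if rng.random() >= pr:             # perturbation rate pr
                new[q] = (SM[p, q]
                          + rng.random() * (LL_k[q] - SM[p, q])
                          + rng.uniform(-1, 1) * (SM[r, q] - SM[p, q]))
        f_new = evaluate(new)
        if f_new > fitness[p]:                 # greedy selection
            SM[p], fitness[p] = new, f_new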

2.3 Global Leader Phase

The execution of this phase is carried out after the execution of the local leader phase. The experience of the global leader and of a random member of the group is used to modify the positions of the spider monkeys. The position update equation is as follows:

SMnew_pq = SM_pq + U(0, 1) × (GL_q − SM_pq) + U(−1, 1) × (SM_rq − SM_pq)   (3)

Here q ∈ {1, 2, ..., D} is an arbitrarily taken index and GL_q specifies the q-th dimension of the global leader. SM_rq is a randomly chosen spider monkey from the k-th group such that r ≠ p. In this phase, a probability factor is used to enhance the efficiency of the update process. This probability is calculated for each spider monkey by the following expression:

prob_p = 0.9 × (fitness_p / max_fitness) + 0.1   (4)

Here fitness_p is the fitness of the p-th spider monkey and max_fitness is the maximum fitness in the group. Further, the better position is adopted by applying the greedy selection mechanism.
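The probability-weighted update of this phase can be sketched as follows (again with illustrative names; fitness is assumed to be stored as a NumPy array so that Eq. (4) can be evaluated group-wise):

import numpy as np

def global_leader_phase(SM, fitness, GL, group, rng, evaluate):
    """Eqs. (3)-(4): updates guided by the global leader GL."""
    group = np.asarray(group)
    prob = 0.9 * fitness[group] / fitness[group].max() + 0.1   # Eq. (4)
    for prob_p, p in zip(prob, group):
        if rng.random() < prob_p:
            q = rng.integers(SM.shape[1])      # one randomly chosen dimension
            r = p
            while r == p:                      # random member with r != p
                r = rng.choice(group)
            new = SM[p].copy()
            new[q] = (SM[p, q]
                      + rng.random() * (GL[q] - SM[p, q])
                      + rng.uniform(-1, 1) * (SM[r, q] - SM[p, q]))
            f_new = evaluate(new)
            if f_new > fitness[p]:             # greedy selection
                SM[p], fitness[p] = new, f_new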

2.4 Global Leader Learning Phase

In this phase, the solution with the best fitness is selected as the global leader. It is then checked whether the position of the global leader has been updated; if not, the value of GlobalLimitCount is incremented by 1.

2.5 Local Leader Learning Phase

In this phase, the local leader's position is updated by applying the greedy selection technique within the specified group. The old position of the local leader is then compared with the new one to find whether the position has been updated; if not, the LocalLimitCount value is incremented by 1.

2.6 Local Leader Decision Phase

In this stage, if the LocalLimitCount exceeds the LocalLeaderLimit for a particular group, then all the members of that group update their locations, either by random initialization or by utilizing the information of the global leader as well as the local leader:

SMnew_pq = SM_pq + U(0, 1) × (GL_q − SM_pq) + U(0, 1) × (SM_pq − LL_kq)   (5)

It is clear from Eq. 5 that by this update process, the spider monkey is repelled from the local leader and attracted towards the global leader.

2.7 Global Leader Decision Phase

If the GlobalLimitCount exceeds the predetermined threshold, i.e. the GlobalLeaderLimit, then the global leader splits the population into smaller groups. The population is uniformly distributed into two groups, then three groups, and so on, until the maximum number of groups (MG), defined in Sect. 4.4.1, is formed.

3 Social Learner Spider Monkey Optimization

In the local leader decision phase, when the local leader of a group is not learning, i.e. it is not updating its position and is stuck in a local optimum, all the spider monkeys are re-initialized, either randomly or by taking the knowledge of the global leader as well as the local leader. But in the case of stagnation, if the global leader itself is stuck at a point, then the benefit of this phase becomes nil. Likewise, in the local leader phase, the local leader of the group and a randomly selected spider monkey are chosen for modifying the positions of the spider monkeys in the group. For exploiting the search space efficiently, however, the spider monkeys should also learn from the best solutions of nearby groups; by doing this, the algorithm can enhance its social learning ability. For fulfilling this purpose, two modifications are made in this article: the first in the local leader phase and the second in the local leader decision phase. The modified phases are described as follows:

3.1 Modified Local Leader Phase

In the local leader phase, Eq. 2 is replaced by Eq. 6. Here, in place of a randomly selected spider monkey, a random group s (s ≠ k) is selected and the position is updated by taking the local leader of that randomly selected group. By this, the spider monkey learns not only from its own local leader but also from the best solution of another group.

SMnew_pq = SM_pq + U(0, 1) × (LL_kq − SM_pq) + U(−1, 1) × (LL_sq − SM_pq)   (6)

The modified local leader phase is depicted in Algorithm 1.


for each member SM_p ∈ k-th group do
  for each q ∈ {1, ..., D} do
    if U(0, 1) ≥ pr then
      SMnew_pq = SM_pq + U(0, 1) × (LL_kq − SM_pq) + U(−1, 1) × (LL_sq − SM_pq)
    else
      SMnew_pq = SM_pq
    end if
  end for
end for
Algorithm 1: Position update operation for the modified local leader phase

3.2 Modified Local Leader Decision Phase

The second modification is made in the local leader decision phase, where Eq. 5 is replaced by Eq. 7: instead of learning from the global leader, the solutions learn from the best-fitted solution of another randomly selected group.

SMnew_pq = SM_pq + U(0, 1) × (LL_sq − SM_pq) + U(0, 1) × (SM_pq − LL_kq)   (7)

Here LL_sq is the q-th dimension of the randomly chosen group's local leader (where s ≠ k). Based on the above clarification, the modified local leader decision phase is depicted in Algorithm 2.

for k ∈ {1, ..., MG} do
  if LocalLimitCount_k ≥ LocalLeaderLimit then
    LocalLimitCount_k = 0
    GS = k-th group size
    for p ∈ {1, ..., GS} do
      for each q ∈ {1, ..., D} do
        if U(0, 1) ≥ pr then
          SMnew_pq = SMmin_q + U(0, 1) × (SMmax_q − SMmin_q)
        else
          SMnew_pq = SM_pq + U(0, 1) × (LL_sq − SM_pq) + U(0, 1) × (SM_pq − LL_kq)
        end if
      end for
    end for
  end if
end for
Algorithm 2: Position update operation in the modified local leader decision phase
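Both modified update rules can be summarized in code as follows (a simplified sketch of the cores of Algorithms 1 and 2 with illustrative names; the random re-initialization branch of Algorithm 2 is kept in the pseudocode above and omitted here for brevity, and LL is assumed to be an array holding one local-leader position per group):

def modified_local_leader_update(SM, p, LL, k, s, pr, rng):
    """Eq. (6): monkey p of group k also learns from the local leader of a
    randomly selected neighbouring group s (s != k), cf. Algorithm 1."""
    new = SM[p].copy()
    for q in range(SM.shape[1]):
        if rng.random() >= pr:
            new[q] = (SM[p, q]
                      + rng.random() * (LL[k, q] - SM[p, q])
                      + rng.uniform(-1, 1) * (LL[s, q] - SM[p, q]))
    return new

def modified_decision_update(SM, p, LL, k, s, rng):
    """Eq. (7): attraction towards LL_s and repulsion from the stagnated
    local leader LL_k, cf. Algorithm 2."""
    D = SM.shape[1]
    return (SM[p]
            + rng.random(D) * (LL[s] - SM[p])
            + rng.random(D) * (SM[p] - LL[k]))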

4 Experimental Results and Discussion


To verify the performance of SLSMO three different datasets have been taken.
The definition, parameter settings and results comparisons for the benchmark
problems for all three datasets are as follows:

4.1 Dataset 1 (Global Optimization Problems)

4.1.1 Benchmark Problems Under Consideration


For verifying the performance of SLSMO, it is tested on 23 different global
optimization problems (GOPs) [2] as presented by Table 1

Table 1. Global optimization problems (GOPs); D: dimension, AE: acceptable error

1. Parabola: gop1(x) = Σ_{i=1}^{D} x_i^2; search range [−5.12, 5.12]; optimum 0; D = 30; AE = 1.0E−05
2. De Jong f4: gop2(x) = Σ_{i=1}^{D} i·x_i^4; search range [−5.12, 5.12]; optimum 0; D = 30; AE = 1.0E−05
3. Griewank: gop3(x) = 1 + (1/4000) Σ_{i=1}^{D} x_i^2 − Π_{i=1}^{D} cos(x_i/√i); search range [−600, 600]; optimum 0; D = 30; AE = 1.0E−05
4. Rosenbrock: gop4(x) = Σ_{i=1}^{D−1} (100(x_{i+1} − x_i^2)^2 + (x_i − 1)^2); search range [−30, 30]; optimum 0; D = 30; AE = 1.0E−02
5. Rastrigin: gop5(x) = 10D + Σ_{i=1}^{D} (x_i^2 − 10 cos(2πx_i)); search range [−5.12, 5.12]; optimum 0; D = 30; AE = 1.0E−05
6. Ackley: gop6(x) = −20 + e + exp(−(0.2/D) √(Σ_{i=1}^{D} x_i^3)); search range [−1, 1]; optimum 0; D = 30; AE = 1.0E−05
7. Alpine: gop7(x) = Σ_{i=1}^{D} |x_i sin x_i + 0.1 x_i|; search range [−10, 10]; optimum 0; D = 30; AE = 1.0E−05
8. Cosine Mixture: gop8(x) = Σ_{i=1}^{D} x_i^2 − 0.1 Σ_{i=1}^{D} cos(5πx_i) + 0.1D; search range [−1, 1]; optimum −D × 0.1; D = 30; AE = 1.0E−05
9. Exponential: gop9(x) = −exp(−0.5 Σ_{i=1}^{D} x_i^2) + 1; search range [−1, 1]; optimum −1; D = 30; AE = 1.0E−05
10. Zakharov: gop10(x) = Σ_{i=1}^{D} x_i^2 + (Σ_{i=1}^{D} i·x_i/2)^2 + (Σ_{i=1}^{D} i·x_i/2)^4; search range [−5.12, 5.12]; optimum 0; D = 30; AE = 1.0E−02
11. Brown3: gop11(x) = Σ_{i=1}^{D−1} ((x_i^2)^{(x_{i+1}^2 + 1)} + (x_{i+1}^2)^{(x_i^2 + 1)}); search range [−1, 4]; optimum 0; D = 30; AE = 1.0E−05
12. Salomon Problem: gop12(x) = 1 − cos(2π√(Σ_{i=1}^{D} x_i^2)) + 0.1√(Σ_{i=1}^{D} x_i^2); search range [−100, 100]; optimum 0; D = 30; AE = 1.0E−01
13. Axis parallel hyper-ellipsoid: gop13(x) = Σ_{i=1}^{D} i·x_i^2; search range [−5.12, 5.12]; optimum 0; D = 30; AE = 1.0E−05
14. Inverted cosine wave: gop14(x) = −Σ_{i=1}^{D−1} exp(−(x_i^2 + x_{i+1}^2 + 0.5 x_i x_{i+1})/8) × I, with I = cos(4√(x_i^2 + x_{i+1}^2 + 0.5 x_i x_{i+1})); search range [−5, 5]; optimum −D + 1; D = 10; AE = 1.0E−05
15. Levy montalvo: gop15(x) = 0.1(sin^2(3πx_1) + Σ_{i=1}^{D−1} (x_i − 1)^2 (1 + sin^2(3πx_{i+1})) + (x_D − 1)^2 (1 + sin^2(2πx_D))); search range [−5, 5]; optimum 0; D = 30; AE = 1.0E−05
16. Colville function: gop16(x) = 100(x_2 − x_1^2)^2 + (1 − x_1)^2 + 90(x_4 − x_3^2)^2 + (1 − x_3)^2 + 10.1((x_2 − 1)^2 + (x_4 − 1)^2) + 19.8(x_2 − 1)(x_4 − 1); search range [−10, 10]; optimum 0; D = 4; AE = 1.0E−05
17. Branin's function: gop17(x) = a(x_2 − b x_1^2 + c x_1 − d)^2 + e(1 − f) cos x_1 + e; search range −5 ≤ x_1 ≤ 10, 0 ≤ x_2 ≤ 15; optimum 0.3979; D = 2; AE = 1.0E−05
18. Kowalik function: gop18(x) = Σ_{i=1}^{11} (a_i − x_1(b_i^2 + b_i x_2)/(b_i^2 + b_i x_3 + x_4))^2; search range [−5, 5]; optimum 0.000307486; D = 4; AE = 1.0E−04
19. Shifted Rosenbrock: gop19(x) = Σ_{i=1}^{D−1} (100(z_i^2 − z_{i+1})^2 + (z_i − 1)^2) + f_bias, z = x − o + 1, x = [x_1, x_2, ..., x_D], o = [o_1, o_2, ..., o_D]; search range [−100, 100]; optimum 390; D = 10; AE = 1.0E−01
20. Six-hump camel back: gop20(x) = (4 − 2.1x_1^2 + x_1^4/3)x_1^2 + x_1 x_2 + (−4 + 4x_2^2)x_2^2; search range [−5, 5]; optimum −1.0316; D = 2; AE = 1.0E−05
21. Easom's function: gop21(x) = −cos x_1 cos x_2 exp(−(x_1 − π)^2 − (x_2 − π)^2); search range [−10, 10]; optimum −1; D = 2; AE = 1.0E−13
22. Sinusoidal: gop22(x) = −(A Π_{i=1}^{D} sin(x_i − z) + Π_{i=1}^{D} sin(B(x_i − z))), A = 2.5, B = 5, z = 30; search range [0, 180]; optimum −(A + 1); D = 10; AE = 1.0E−02
23. Schwefel problem 2.22: gop23(x) = Σ_{i=1}^{D} |x_i| + Π_{i=1}^{D} |x_i|; search range [−10, 10]; optimum 0; D = 30; AE = 1.0E−05

4.2 Dataset 2 (Engineering Design Problems)

4.2.1 Benchmark Problems Under Consideration


The engineering design problems (EDPs) [12] are also considered for this performance test. The characteristics and properties of the considered EDPs are depicted in Table 2.

Table 2. EDPs: Engineering design problems (fn* signifies the optimal value and AE the acceptable error)

EDP                                                              Decision variables   fn*        AE
Sinusoidal Problem (edp1)                                        10                   −(A + 1)   1.0E−02
Lennard Jones (edp2)                                             15                   −9.1038    1.0E−03
Parameter Estimation for frequency modulated sound wave (edp3)   6                    0          1.0E−02

4.3 Dataset 3 (Constraint Optimization Problems)

4.3.1 Benchmark Problems Under Consideration


16 constrained optimization problems (COPs) from CEC-2006 [7] have also been taken for this performance test. The characteristics and properties of the considered COPs are depicted in Table 3.

Table 3. Constrained optimization problems (COPs) [7]; Nec denotes the count of equality constraints, Niec the count of inequality constraints, and Nac the count of active constraints

COPs D Type of function Nec Niec Optimal value Nac


cop1 13 Quadratic 0 9 −15.0000000 6
cop2 20 Nonlinear 0 2 −0.8036191 1
cop3 10 Polynomial 1 0 −1.0005001 1
cop4 5 Quadratic 0 6 −30665.5386718 2
cop5 4 Cubic 3 2 5126.4967140 3
cop6 2 Cubic 0 2 −6961.81387558 2
cop7 10 Quadratic 0 8 24.3062091 6
cop8 2 Nonlinear 0 2 −0.0958250 0
cop9 7 Polynomial 0 4 680.6300574 2
cop10 2 Quadratic 1 0 0.7499000 1
cop11 5 Nonlinear 3 0 0.0539442 3
cop12 3 Quadratic 2 0 961.7150223 2
cop13 5 Nonlinear 0 38 −1.9051553 4
cop14 20 Linear 0 0 193.7245101 6
cop15 2 Linear 0 0 −5.5080133 2
cop16 2 Quadratic 1 1 1.3935 2

4.4 Parameter Settings


In the case of GOPs, to check the authenticity and efficiency of SLSMO, it is compared with spider monkey optimization (SMO) [5], power law-based local search in SMO (PLSMO) [9], levy flight SMO (LFSMO) [12], modified limacon SMO (MLSMO) [10], the artificial bee colony algorithm (ABC) [6], the differential evolution algorithm (DE) [8] and the particle swarm optimization algorithm (PSO) [14]. To prove the efficiency of SLSMO for EDPs, it is also compared with SMO, DE and PSO, and for COPs, SLSMO is compared with SMO, MLSMO, ABC, Best-So-Far ABC (BSFABC) [3] and modified ABC (MABC) [1]. The parameter settings for the experiments are as follows:

4.4.1 Parameter Setting for GOPs

Table 4. Parameter settings: GOPs

Swarm size N = 50
Number of simulations = 100
GlobalLeaderLimit = N
LocalLeaderLimit = D × N
pr = 0.1 + (0.4/Max iterations), incremented each iteration
MG = N/10
MG = N/10

– Termination criteria: the number of function evaluations reaches its limit (i.e. 200000) or the acceptable error is attained.
– The parameter settings for SMO [5], PLSMO [9], LFSMO [12], MLSMO [10], ABC [6], DE [8] and PSO [14] are taken from their respective original research papers (Table 4).

4.4.2 Parameter Setting for EDPs

Table 5. Parameter settings: EDPs

Swarm size N = 50
Number of simulations = 100
GlobalLeaderLimit = N
LocalLeaderLimit = D × N
pr = 0.1 + (0.4/Max iterations), incremented each iteration
MG = N/10
Maximum number of function evaluations = 300000

– The parameter settings for SMO [5], DE [8] and PSO [14] are taken from
their original research papers respectively (Table 5).

4.5 Parameter Setting for COPs

Table 6. Parameter settings: COPs

Swarm size N = 50
Number of simulations = 100
Maximum number of function evaluations = 300000
Maximum number of iterations = 5000

– All the other settings of the considered algorithms are the same as for the global optimization problems.
– The handling of equality and inequality constraints plays a very important role in attaining feasible solutions for constraint optimization problems. In this article, the experiments are performed with the help of the adaptive penalty function of [4] (Table 6); a generic penalty-style formulation is sketched below for illustration.
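For illustration only, a generic penalty-style fitness can be written as follows. This is not the exact adaptive scheme of [4], whose penalty weights change over the run, but it shows how constraint violations can be folded into the objective:

def penalized_objective(f, g_list, h_list, x, penalty=1e6, eps=1e-4):
    """Generic (static) penalty formulation for COPs, illustrative only.
    g_list: inequality constraints g_i(x) <= 0;
    h_list: equality constraints treated as |h_j(x)| <= eps."""
    violation = sum(max(0.0, g(x)) for g in g_list)
    violation += sum(max(0.0, abs(h(x)) - eps) for h in h_list)
    return f(x) + penalty * violation   # minimized by the search algorithm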

4.6 Results Comparison

4.6.1 Global Optimization Problems


For validating the performance of the propounded SLSMO over the global optimization problems, it is compared with SMO, DE, PSO and three recent variants of SMO, namely PLSMO, LFSMO and MLSMO. The comparison is performed in terms of the standard deviation (SD), mean error (ME), average number of function evaluations (AFEs) and success rate (SR). The results, exhibited in Table 7, demonstrate the better performance of SLSMO compared to the other considered algorithms.

Table 7. Results: global optimization problems (GOPs)

GOPs Algorithm SD ME AFEs SR


gop1 SLSMO 2.32E−06 5.59E−06 1292.94 100
SMO 8.18E−07 9.08E−06 12627.45 100
LFSMO 8.58E−07 8.87E−06 15753.81 100
PLSMO 9.10E−07 8.92E−06 13122.07 100
MLSMO 1.24E−04 3.35E−04 200495.00 0
ABC 2.02E−06 8.17E−06 20409.00 100
DE 1.37E−06 8.77E−06 23226.00 100
PSO 5.31E−07 9.41E−06 37880.00 100
gop2 SLSMO 2.57E−06 3.39E−06 940.50 100
SMO 1.15E−06 8.59E−06 10664.28 100
LFSMO 1.37E−06 8.25E−06 13438.71 100
PLSMO 1.39E−06 8.71E−06 11172.44 100
MLSMO 2.54E−06 4.71E−06 78693.04 100
ABC 3.11E−06 4.90E−06 9578.50 100
DE 1.84E−06 7.93E−06 21337.50 100
PSO 9.55E−07 9.02E−06 32529.50 100
gop3 SLSMO 2.24E−06 5.43E−06 1805.76 100
SMO 8.46E−03 4.91E−03 89396.33 66
LFSMO 5.50E−03 2.99E−03 84609.17 71
PLSMO 5.72E−03 3.21E−03 85460.89 70
MLSMO 7.44E−02 2.46E−01 200495.00 0
ABC 1.26E−03 2.28E−04 45942.82 97
DE 4.22E−03 2.13E−03 69395.50 78
PSO 7.46E−03 4.59E−03 138821.00 64
gop4 SLSMO 4.53E+00 8.07E−01 46707.47 80
SMO 5.29E+01 4.86E+01 200054.28 0
LFSMO 4.36E+01 4.81E+01 200069.05 0
PLSMO 3.70E+01 3.43E+01 200056.07 0
MLSMO 2.18E+03 1.41E+03 200495.00 0
ABC 2.40E+00 1.34E+00 196024.50 5
DE 2.86E+01 3.39E+01 199100.00 1
PSO 4.75E+01 4.07E+01 250050.00 0
gop5 SLSMO 0.00E+00 0.00E+00 99.00 100
SMO 1.66E−06 8.37E−06 94867.02 100
LFSMO 0.00E+00 0.00E+00 99.00 100
PLSMO 0.00E+00 0.00E+00 99.00 100
MLSMO 0.00E+00 0.00E+00 99.00 100
ABC 0.00E+00 0.00E+00 50.00 100
DE 0.00E+00 0.00E+00 100.00 100
PSO 0.00E+00 0.00E+00 100.00 100
gop6 SLSMO 2.91E−06 5.82E−06 26166.77 100
SMO 9.27E−02 9.32E−03 31142.43 99
LFSMO 0.00E+00 0.00E+00 99.00 100
PLSMO 0.00E+00 0.00E+00 99.00 100
MLSMO 0.00E+00 0.00E+00 99.00 100
ABC 0.00E+00 0.00E+00 50.00 100
DE 0.00E+00 0.00E+00 100.00 100
PSO 0.00E+00 0.00E+00 100.00 100
gop7 SLSMO 1.91E−06 7.50E−06 17537.02 100
SMO 4.00E−05 1.69E−05 81051.97 95
LFSMO 3.48E−04 1.36E−04 154368.20 62
PLSMO 4.94E−04 1.15E−04 136353.51 71
MLSMO 4.35E−01 4.77E−01 200495.00 0
ABC 1002.21E−06 7.63E−06 74925.50 100
DE 8.31E−07 9.20E−06 61554.00 100
PSO 5.50E−01 5.53E−02 101265.00 98
gop8 SLSMO 5.13E−02 2.07E−02 34828.24 85
SMO 6.16E−02 2.81E−02 60075.49 82
LFSMO 1.03E−01 6.65E−02 86144.42 66
PLSMO 1.08E−01 7.09E−02 102391.84 64
MLSMO 1.46E−01 2.34E−01 200495.00 0
ABC 2.36E−06 7.05E−06 23061.00 100
DE 4.01E−02 1.18E−02 38076.50 92
PSO 5.28E−02 2.22E−02 72122.50 85
gop9 SLSMO 1.95E−06 6.37E−06 1306.80 100
SMO 7.77E−07 8.97E−06 9712.89 100
LFSMO 8.23E−07 8.83E−06 12081.12 100
PLSMO 6.61E−07 8.99E−06 10082.67 100
MLSMO 1.62E−06 7.57E−06 142449.46 99
ABC 2.32E−06 7.67E−06 16834.00 100
DE 1.24E−06 8.73E−06 17713.00 100
PSO 6.09E−07 9.40E−06 28466.50 100
gop10 SLSMO 2.25E−03 6.24E−03 2654.19 100
SMO 8.32E−04 9.28E−03 131924.09 100
LFSMO 7.14E−04 9.44E−03 159832.29 100
PLSMO 5.30E−04 9.60E−03 114865.22 100
MLSMO 1.19E−02 3.34E−02 200495.00 0
ABC 1.53E+01 9.64E+01 200000.00 0
DE 1.09E−03 8.93E−03 69636.00 100
PSO 4.43E−04 9.59E−03 209557.50 100
gop11 SLSMO 1.47E−06 7.11E−06 1713.69 100
SMO 8.52E−07 8.86E−06 12627.45 100
LFSMO 7.44E−07 9.06E−06 15709.36 100
PLSMO 1.01E−06 8.85E−06 13098.56 100
MLSMO 1.19E−04 3.23E−04 200495.00 0
ABC 2.28E−06 7.91E−06 21117.00 100
DE 1.36E−06 8.58E−06 22772.50 100
PSO 6.89E−07 9.27E−06 34964.50 100
gop12 SLSMO 2.43E−05 1.00E−01 2029.50 100
SMO 4.00E−02 1.80E−01 184138.76 19
LFSMO 2.86E−02 1.91E−01 191430.39 9
PLSMO 2.96E−02 1.95E−01 194160.90 7
MLSMO 1.02E−01 6.49E−01 200495.00 0
ABC 1.45E−01 8.99E−01 200025.70 0
DE 3.56E−02 1.07E+00 195630.00 9
PSO 4.88E−02 2.68E−01 250050.00 0
gop13 SLSMO 2.08E−06 6.01E−06 1937.43 100
SMO 8.42E−07 8.84E−06 14685.66 100
LFSMO 8.28E−07 9.02E−06 18269.67 100
PLSMO 7.35E−07 9.05E−06 15335.61 100
MLSMO 1.37E−02 1.86E−02 200495.00 0
ABC 2.34E−06 7.50E−06 22683.00 100
DE 1.26E−06 8.46E−06 26637.50 100
PSO 7.76E−07 9.22E−06 44395.00 100
gop14 SLSMO 9.35E−02 1.98E−02 43025.70 95
SMO 1.35E−01 2.28E−02 80570.25 95
LFSMO 2.61E−02 2.67E−03 79135.99 97
PLSMO 1.16E−01 1.57E−02 69565.07 97
MLSMO 7.49E−01 4.60E−01 200095.00 1
ABC 1.04E−01 2.21E−02 78668.27 95
DE 6.13E−01 9.51E−01 178280.50 15
PSO 7.46E−01 1.27E+00 241711.50 10
gop15 SLSMO 1.30E−06 8.57E−06 46247.70 100
SMO 1.09E−03 1.18E−04 13770.47 99
LFSMO 2.80E−03 7.77E−04 27941.24 93
PLSMO 2.39E−03 5.58E−04 29817.33 95
MLSMO 3.14E−03 1.06E−03 200495.00 0
ABC 2.45E−06 7.56E−06 21720.00 100
DE 1.54E−03 2.29E−04 24558.00 98
PSO 2.16E−03 4.49E−04 48340.00 96
gop16 SLSMO 1.48E−04 9.26E−04 8664.50 100
SMO 2.17E−04 7.60E−04 55430.55 100
LFSMO 6.45E−05 9.72E−04 17197.28 100
PLSMO 1.49E−04 8.58E−04 16498.55 100
MLSMO 6.53E−02 9.17E−02 195286.09 4
ABC 9.36E−02 1.55E−01 200024.66 0
DE 4.11E−01 1.03E−01 30963.50 87
PSO 2.01E−04 8.31E−04 48225.50 100
gop17 SLSMO 7.10E−06 6.22E−06 21845.37 90
SMO 6.49E−06 6.34E−06 31054.77 85
LFSMO 6.17E−06 5.61E−06 10871.00 95
PLSMO 6.51E−06 5.52E−06 13076.80 94
MLSMO 2.48E−06 1.24E−05 192480.15 5
ABC 6.71E−06 5.70E−06 29746.60 86
DE 2.48E−06 5.36E−06 6563.50 98
PSO 3.69E−06 6.57E−06 42698.00 84
gop18 SLSMO 1.15E−04 1.11E−04 15639.24 98
SMO 1.17E−04 1.05E−04 40074.56 98
LFSMO 1.15E−04 1.14E−04 33668.76 98
PLSMO 8.30E−05 9.98E−05 33950.54 99
MLSMO 3.36E−03 1.02E−03 191683.12 10
ABC 7.88E−05 1.86E−04 182719.65 20
DE 3.71E−04 2.71E−04 58183.00 73
PSO 1.21E−05 9.01E−05 36611.00 100
gop19 SLSMO 3.66E−01 1.59E−01 35020.02 87
SMO 1.01E+01 2.15E+00 164522.39 40
LFSMO 1.54E+00 8.35E−01 147231.97 63
PLSMO 1.08E+00 4.17E−01 134689.98 68
MLSMO 4.49E+01 4.18E+01 200095.00 1
ABC 2.34E+00 1.11E+00 179439.91 20
DE 2.06E+00 2.26E+00 196380.00 2
PSO 1.06E+01 2.09E+00 203737.50 81
gop20 SLSMO 1.45E−05 1.80E−05 110645.51 45
SMO 1.46E−05 1.81E−05 112402.68 44
LFSMO 1.52E−05 1.35E−05 70494.01 65
PLSMO 1.45E−05 1.87E−05 116398.67 42
MLSMO 1.76E−08 2.86E−05 200495.00 0
ABC 1.38E−05 1.89E−05 118435.82 41
DE 5.39E−06 1.08E−05 49702.50 76
PSO 1.18E−05 1.47E−05 104426.50 59
gop21 SLSMO 3.02E−14 4.61E−14 8806.05 100
SMO 2.96E−14 4.65E−14 12038.47 100
LFSMO 3.00E−14 5.10E−14 12250.92 100
PLSMO 2.94E−14 4.91E−14 11909.68 100
MLSMO 4.57E−06 2.82E−06 200495.00 0
ABC 3.98E−05 1.22E−05 185760.72 14
DE 1.29E−14 4.27E−14 5214.50 100
PSO 3.01E−14 4.95E−14 9783.50 100
gop22 SLSMO 5.40E−02 1.44E−02 122145.14 89
SMO 5.04E−03 1.04E−02 154142.74 69
LFSMO 3.98E−03 9.46E−03 89610.24 88
PLSMO 1.41E−03 8.45E−03 28226.66 100
MLSMO 2.38E−03 6.35E−03 107964.32 99
ABC 2.13E−03 7.72E−03 53773.58 100
DE 1.98E−01 3.53E+00 200000.00 0
PSO 2.41E−01 3.13E−01 212851.00 29
gop23 SLSMO 2.22E+00 4.61E−01 15738.72 96
SMO 4.26E+00 2.19E+00 93900.82 77
LFSMO 4.80E+00 3.24E+00 123243.44 62
PLSMO 4.62E+00 2.78E+00 104537.93 70
MLSMO 6.49E+00 1.26E+01 181663.32 15
ABC 5.25E+00 7.31E+00 200033.46 0
DE 6.73E+00 5.69E+00 100629.00 55
PSO 4.52E+00 2.16E+00 101530.00 81

A boxplot analysis [13] has been carried out, as shown in Fig. 1, to measure the performance of SLSMO against all the considered algorithms in terms of the average number of function evaluations. The low interquartile range and median of SLSMO in Fig. 1 prove its reliability compared with the other considered algorithms.
Fig. 1. The average number of function evaluations boxplot for GOPs



Further, the SLSMO’s convergence speed is compared with the considered


algorithms by calculating the acceleration rate (AR) [11]. The AR is defined as
follows:
AF E ALGO
AR = (8)
AF E SLSM O
The results of AR for all the considered algorithms are shown in Table 8. AR
> 1 shows the faster performance of SLSMO compared with the other algorithms.
The outcome in Table 8 show that for the greater part of the considered GOPs,
SLSMO converges faster than the considered algorithms. The Mann-Whitney U
rank (MWUR) sum test [13] is also used for the evaluation of the algorithms.
The results for 100 simulations are recorded in Table 9. Table 9 shows that in
comparison with the basic SMO and its recent variants, this evinced algorithm
performs outstanding for 22 GOPs in case of SMO, 19 GOPs in compare with
PLSMO, 18 GOPs in compare with LFSMO, 21 GOPs in case of MLSMO. In
compare with ABC, DE, and PSO, SLSMO performs better for 19, 19 and 21
GOPs out of 23 GOPs.
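Both measures can be computed with standard tools, as in this sketch (random placeholder AFE samples stand in for the recorded 100-run data):

import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
afe_slsmo = rng.random(100) * 1e4      # placeholder AFEs of SLSMO over 100 runs
afe_other = rng.random(100) * 2e4      # placeholder AFEs of a compared algorithm

AR = afe_other.mean() / afe_slsmo.mean()        # Eq. (8): AR > 1 favours SLSMO
stat, p = mannwhitneyu(afe_slsmo, afe_other)    # MWUR sum test on the AFEs
sign = '+' if p < 0.05 and afe_slsmo.mean() < afe_other.mean() else '='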

Table 8. Acceleration rate (AR) of SLSMO compared to SMO, PLSMO, LFSMO, MLSMO, ABC, DE and PSO

GOPs    vs SMO   vs PLSMO   vs LFSMO   vs MLSMO   vs ABC   vs DE   vs PSO
gop1 9.77 10.15 12.18 155.07 15.78 17.96 29.30
gop2 11.34 11.88 14.29 83.67 10.18 22.69 34.59
gop3 49.51 47.33 46.86 111.03 25.44 38.43 76.88
gop4 4.28 4.28 4.28 4.29 4.20 4.26 5.35
gop5 958.25 1.00 1.00 1.00 0.51 1.01 1.01
gop6 1.19 0.00 0.00 0.00 0.00 0.00 0.00
gop7 4.62 7.78 8.80 11.43 4.27 3.51 5.77
gop8 1.72 2.94 2.47 5.76 0.66 1.09 2.07
gop9 7.43 7.72 9.24 109.01 12.88 13.55 21.78
gop10 49.70 43.28 60.22 75.54 75.35 26.24 78.95
gop11 7.37 7.64 9.17 117.00 12.32 13.29 20.40
gop12 90.73 95.67 94.32 98.79 98.56 96.39 123.21
gop13 7.58 7.92 9.43 103.49 11.71 13.75 22.91
gop14 1.87 1.62 1.84 4.65 1.83 4.14 5.62
gop15 0.30 0.64 0.60 4.34 0.47 0.53 1.05
gop16 6.40 1.90 1.98 22.54 23.09 3.57 5.57
gop17 1.42 0.60 0.50 8.81 1.36 0.30 1.95
gop18 2.56 2.17 2.15 12.26 11.68 3.72 2.34
gop19 4.70 3.85 4.20 5.71 5.12 5.61 5.82
gop20 1.02 1.05 0.64 1.81 1.07 0.45 0.94
gop21 1.37 1.35 1.39 22.77 21.09 0.59 1.11
gop22 1.26 0.23 0.73 0.88 0.44 1.64 1.74
gop23 5.97 6.64 7.83 11.54 12.71 6.39 6.45

Table 9. Comparison based on the MWUR sum test and AFEs ('+' symbolizes that SLSMO is better, '−' that SLSMO is worse, and '=' that there is no observable difference)

GOPs    vs SMO   vs PLSMO   vs LFSMO   vs MLSMO   vs ABC   vs DE   vs PSO
gop1 + + + + + + +
gop2 + + + + + + +
gop3 + + + + + + +
gop4 + + + + + + +
gop5 + + + + − + +
gop6 + − − − − − −
gop7 + + + + + + +
gop8 + + + + − + +
gop9 + + + + + + +
gop10 + + + + + + +
gop11 + + + + + + +
gop12 + + + + + + +
gop13 + + + + + + +
gop14 + + + + + + +
gop15 − − − + + + +
gop16 + + + + + + +
gop17 + − − + + − +
gop18 + + + + + + +
gop19 + + + + + + +
gop20 + + − + + − −
gop21 + + + + + − +
gop22 + − − − − + +
gop23 + + + + + + +
Total number of '+' signs   22   19   18   21   19   19   21

4.6.2 Engineering Design Problems


The engineering design problems are used to further evaluate the proposed algorithm. The experimental results of SLSMO, SMO, DE, and PSO over the EDPs are depicted in Table 10. It is evident from the results that, in terms of reliability, accuracy, and efficiency, SLSMO performs better than the other three algorithms.

Table 10. Results: Engineering design problems (EDPs)

EDPs Algorithm SD ME AFEs SR


edp1 SLSMO 3.36E−03 8.64E−03 137750.74 90
SMO 5.59E−03 1.04E−02 166707.31 79
DE 2.65E−01 3.15E+00 248725.50 2
PSO 2.44E−01 3.89E−01 222022.00 19
edp2 SLSMO 1.27E−04 8.89E−04 54357.34 100
SMO 6.13E−04 1.11E−03 200288.47 71
DE 1.83E−04 7.33E−04 69566.50 100
PSO 2.38E−01 1.13E−01 156483.50 67
edp3 SLSMO 2.52E+00 5.60E−01 29510.48 95
SMO 3.32E+00 1.29E+00 129090.43 85
DE 6.65E+00 5.00E+00 109614.00 61
PSO 4.86E+00 2.49E+00 104672.00 76

The boxplots for SLSMO, SMO, DE and PSO with respect to AFEs, success rate and mean error are shown in Figs. 2, 3 and 4. The boxplot analysis exhibits the superiority of the proposed approach.

Fig. 2. Boxplot of average number of function evaluation for EDPs


Fig. 3. Boxplot of success rate for EDPs




Fig. 4. Boxplot of mean error for EDPs

Further, the AR for the EDPs is calculated by Eq. 8. The AR results in Table 11 prove the high convergence speed of the proposed algorithm over the other algorithms.

Table 11. AR of SLSMO compared to SMO, DE and PSO for the engineering design problems (EDPs)

EDPs SLSMO vs SMO SLSMO vs DE SLSMO vs PSO


edp1 1.21 1.81 1.61
edp2 3.68 1.28 2.88
edp3 4.37 3.71 3.55

The results of the Mann-Whitney U rank (MWUR) sum test on the AFEs are depicted in Table 12. In the results, the '=' sign indicates no significant difference, '−' symbolizes higher (worse) AFEs of SLSMO, and '+' signifies lower (better) AFEs of SLSMO compared with the respective algorithms.

Table 12. Mann-Whitney U rank (MWUR) sum test, EDPs: Engineering design prob-
lems

EDPs SLSMO vs SMO SLSMO vs DE SLSMO vs PSO


edp1 + + +
edp2 + + +
edp3 + + +

4.7 Constraint Optimization Problems

The proposed algorithm is also tested over the considered constrained optimization problems defined in Table 3. The comparison is based on the mean, best and worst values. Table 13 shows the results of the comparison of SLSMO with SMO, MLSMO, BSFABC, MABC and ABC. In terms of the mean value, SLSMO outperforms the others for 4 problems (cop1, cop2, cop8, cop10). According to the best-value criterion, SLSMO outperforms the others for 5 problems (cop1, cop6, cop8, cop12 and cop20) and shows the same results as SMO for 11 problems. To observe the significant distinctions among the results, the Mann-Whitney U rank (MWUR) sum test is performed with the same settings as in Sect. 4.6.1, and the results are shown in Table 14; the best value of each run is taken as the data set for this test. 32 '+' signs out of 80 comparisons confirm that SLSMO is significantly better than the other compared algorithms.

Table 13. Results: Constraint optimization problems (COPs)

COPs Algorithm Mean value Best value Worst value SD


cop1 SLSMO −14.7191900 −15.0000000 −10.8308200 0.8897245
SMO −14.9600000 −15.0000000 −13.0000000 0.2800000
MLSMO −14.3495100 −15.0000000 −11.9999900 1.0300400
BSFABC −15.0000000 −15.0000000 −15.0000000 0.0000000
MABC −15.0000000 −15.0000000 −15.0000000 0.0000000
ABC −15.0000000 −15.0000000 −15.0000000 0.0000000
cop2 SLSMO −0.8032827 −0.8036191 −0.7779998 0.0026582
SMO −0.8028443 −0.8036173 −0.7846938 0.0031999
MLSMO −0.6737861 −0.7796201 −0.5834191 0.0413682
BSFABC −0.6241749 −0.7045666 −0.5811237 0.0244862
MABC −0.8020298 −0.8034815 −0.7970106 0.0016102
ABC −0.6347318 −0.7084850 −0.5866927 0.0200852
cop3 SLSMO −1.0004000 −1.0004000 −1.0004000 0.0000000
SMO −1.0003870 −1.0004000 −1.0001550 0.0000372
MLSMO −0.8902484 −1.0003940 0.0000000 0.3129772
BSFABC −0.0845507 −0.5133494 −0.0006156 0.0841175
MABC −0.9317702 −0.9926876 −0.8747236 0.0355228
ABC −0.2175168 −0.6633530 −0.0476625 0.1142939
cop4 SLSMO −30665.5400000 −30665.5400000 −30665.5400000 0.0000067
SMO −30665.5400000 −30665.5400000 −30665.5400000 0.0000000
MLSMO −30649.5200000 −30665.5400000 −30004.4700000 80.4453400
BSFABC −30608.6400000 −30658.9900000 −30488.9600000 36.1735100
MABC −30665.0700000 −30665.5100000 −30663.8000000 0.4239074
ABC −30460.5700000 −30621.2700000 −30252.9200000 75.7279000
cop5 SLSMO 5208.3960000 5126.4340000 5599.8700000 108.4911000
SMO 5168.7420000 5126.4340000 5821.1530000 101.0651000
MLSMO 5802.9620000 5126.4480000 16791.0500000 1638.1400000
BSFABC 5141.5390000 5127.0640000 5250.3950000 15.9809500
MABC 5159.4930000 5127.2800000 5265.6140000 32.4640000
ABC 5155.9580000 5128.5270000 5343.8240000 26.5812500
cop6 SLSMO −6961.7410000 −6961.8140000 −6954.9930000 0.6788181
SMO −6961.7720000 −6961.8140000 −6961.2640000 0.0804134
MLSMO −6959.0450000 −6961.6680000 −6951.3040000 1.9182480
BSFABC −6879.4340000 −6955.8220000 −6781.5570000 33.8840300
MABC −6948.6470000 −6961.0520000 −6929.5990000 7.7424300
ABC −6933.2230000 −6961.2710000 −6789.5850000 33.1872200
cop7 SLSMO 24.5583200 24.3125800 31.4717300 0.7358444
SMO 24.5039700 24.3199000 26.2277600 0.2901096
MLSMO 26.8828800 24.3963200 33.0812400 2.2521480
BSFABC 29.1056900 25.0102500 37.5539300 2.0695680
MABC 24.5742200 24.4497800 24.8642400 0.0882832
ABC 27.9700200 25.3455100 31.8509500 1.3623120
cop8 SLSMO −0.0958250 −0.0958250 −0.0958250 0.0000000
SMO −0.0958250 −0.0958250 −0.0958250 0.0000000
MLSMO −0.0958250 −0.0958250 −0.0958250 0.0000000
BSFABC −0.0958250 −0.0958250 −0.0958250 0.0000000
MABC −0.0958250 −0.0958250 −0.0958250 0.0000000
ABC −0.0958250 −0.0958250 −0.0958250 0.0000000
cop9 SLSMO 680.6324000 680.6301000 680.7269000 0.0120756
SMO 680.6366000 680.6316000 680.6478000 0.0033417
MLSMO 681.2696000 680.6590000 685.6863000 0.7096391
BSFABC 684.2161000 681.7955000 685.8528000 0.6987432
MABC 680.7441000 680.6683000 680.8090000 0.0356390
ABC 683.5838000 681.6116000 685.9368000 0.9288611
cop10 SLSMO 0.7499946 0.7499946 0.7499946 0.0000000
SMO 0.7499950 0.7499946 0.7500090 0.0000015
MLSMO 0.7503904 0.7499946 0.7635190 0.0015426
BSFABC 0.7505069 0.7499950 0.7581977 0.0009308
MABC 0.7504464 0.7500279 0.7517118 0.0003434
ABC 0.7505203 0.7499951 0.7543077 0.0006493
cop11 SLSMO 0.2016434 0.0539219 0.4525155 0.1762180
SMO 0.1213571 0.0539220 0.4393654 0.1389558
MLSMO 0.6168114 0.0543327 1.1333330 0.2521326
BSFABC 0.0870480 0.0550903 0.3365873 0.0396248
MABC 0.1827610 0.0703079 0.3300603 0.0695442
ABC 0.0727823 0.0545834 0.1015380 0.0114572
cop12 SLSMO 962.1799000 961.7151000 967.4693000 1.2214750
SMO 961.8301000 961.7151000 964.1617000 0.3819369
MLSMO 965.1546000 961.7152000 972.3165000 3.0285070
BSFABC 962.4051000 961.7169000 964.9010000 0.7040268
MABC 963.9697000 961.7960000 966.5719000 1.3071420
ABC 963.0764000 961.7233000 966.5880000 1.0862450
cop13 SLSMO −1.9051230 −1.9051550 −1.9027200 0.0002492
SMO −1.9051550 −1.9051550 −1.9051550 0.0000000
MLSMO −1.9000260 −1.9049730 −1.8363140 0.0077522
BSFABC −1.6744960 −1.8652850 −1.3956960 0.1026300
MABC −1.8926390 −1.9027750 −1.8713060 0.0068455
ABC −1.6584040 −1.8770400 −1.4694230 0.0955801
cop14 SLSMO 218.7530000 187.1226000 430.6193000 48.1559900
SMO 225.5466000 187.3549000 434.5061000 55.0671800
MLSMO 409.0251000 189.1123000 617.9673000 137.0103000
BSFABC 300.5630000 197.0424000 452.1618000 63.2793600
MABC 203.0488000 189.0351000 235.6139000 11.8592700
ABC 243.2079000 189.1685000 350.8709000 36.8965300
cop15 SLSMO −5.5080130 −5.5080130 −5.5080130 0.0000000
SMO −5.5080130 −5.5080130 −5.5080130 0.0000000
MLSMO −5.5069970 −5.5080120 −5.4959160 0.0018240
BSFABC −5.4969900 −5.5075590 −5.4689670 0.0086045
MABC −5.5079830 −5.5080120 −5.5079140 0.0000236
ABC −5.5040950 −5.5078850 −5.4968620 0.0026417
cop16 SLSMO 1.5457400 1.3934650 7.3422400 0.7080264
SMO 1.3946350 1.3934650 1.4457090 0.0054258
MLSMO 3.8877340 1.3935420 15.1285500 3.9035650
BSFABC 1.4432910 1.3936220 1.7148560 0.0442622
MABC 1.4692560 1.4054690 1.6278640 0.0539231
ABC 1.4163080 1.3953320 1.4572510 0.0125063

Table 14. Collation for constraint optimization problems (COPs) formed on best
values of 100 runs by applying Mann-Whitney U rank (MWUR) sum test at α = 0.05
significance level

COPs SLSMO vs SLSMO vs SLSMO vs SLSMO vs SLSMO vs


SMO MLSMO BSFABC MABC ABC
cop1 = = = = =
cop2 − − − − −
cop3 = − − − −
cop4 = = = − −
cop5 = + + + +
cop6 = − − − −
cop7 + + + + +
cop8 = = = = =
cop9 + + + + +
cop10 = = = + +
cop11 + + + + +
cop12 = = = + +
cop13 = − − − −
cop14 + + + + +
cop15 = = = − −
cop16 = + + + +

5 Conclusion

This article introduces two modified position update strategies, for the local leader phase and the local leader decision phase of the spider monkey optimization (SMO) algorithm, to make it more efficient and flexible. The modification is based on a learning strategy that uses the best-fitted solution of a randomly selected neighboring group. Further, a performance test is carried out on three datasets to check the reliability of SLSMO: global optimization problems (GOPs), engineering design problems (EDPs) and constraint optimization problems (COPs). SLSMO is analyzed over 23 GOPs, 3 EDPs and 16 COPs. The GOP results are compared with spider monkey optimization (SMO), levy-flight SMO (LFSMO), power law-based local search in SMO (PLSMO), modified limacon SMO (MLSMO), artificial bee colony (ABC), differential evolution (DE) and particle swarm optimization (PSO). The EDP results are compared with the SI-based algorithms SMO, DE and PSO, and the COP results are compared with SMO, MLSMO, BSFABC, MABC and ABC. The results prove that SLSMO is a competitive variant of SMO. The proposed algorithm could be applied to different datasets in the future to resolve complex problems of the real world.

References
1. Akay, B., Karaboga, D.: A modified artificial bee colony algorithm for real-
parameter optimization. Inf. Sci. 192, 120–142 (2012)
2. Montaz Ali, M., Khompatraporn, C., Zabinsky, Z.B.: A numerical evaluation of
several stochastic algorithms on selected continuous global optimization test prob-
lems. J. Glob. Optim. 31(4), 635–672 (2005)
3. Banharnsakun, A., Achalakul, T., Sirinaovakul, B.: The best-so-far selection in
artificial bee colony algorithm. Appl. Soft Comput. 11(2), 2888–2901 (2011)
4. Bansal, J.C., Joshi, S.K., Sharma, H.: Modified global best artificial bee colony for
constrained optimization problems. Comput. Electr. Eng. 67, 365–382 (2018)
5. Bansal, J.C., Sharma, H., Jadon, S.S., Clerc, M.: Spider monkey optimization
algorithm for numerical optimization. Memetic Comput. 6(1), 31–47 (2014)
6. Karaboga, D., Akay, B.: A comparative study of artificial bee colony algorithm.
Appl. Math. Comput. 214(1), 108–132 (2009)
7. Liang, J.J., et al.: Problem definitions and evaluation criteria for the CEC 2006
special session on constrained real-parameter optimization. J. Appl. Mech. 41(8),
8–31 (2006)
8. Price, K., Storn, R.M., Lampinen, J.A.: Differential Evolution: A Practical Approach to Global Optimization. Springer, Heidelberg (2006)
9. Sharma, A., Sharma, H., Bhargava, A., Sharma, N.: Power law-based local search
in spider monkey optimisation for lower order system modelling. Int. J. Syst. Sci.
48(1), 150–160 (2017)
10. Sharma, A., Sharma, H., Bhargava, A., Sharma, N., Bansal, J.C.: Optimal place-
ment and sizing of capacitor using limaçon inspired spider monkey optimization
algorithm. Memetic Comput. 9(4), 311–331 (2017)
11. Sharma, H., Bansal, J.C., Arya, K.V.: Opposition based Lévy flight artificial bee
colony. Memetic Comput. 5(3), 213–227 (2013)
12. Sharma, H., Bansal, J.C., Arya, K.V., Yang, X.-S.: Lévy flight artificial bee colony
algorithm. Int. J. Syst. Sci. 47(11), 2652–2670 (2016)
13. Sharma, N., Sharma, H., Sharma, A.: Beer froth artificial bee colony algorithm for
job-shop scheduling problem. Appl. Soft Comput. 68, 507–524 (2018)
14. Trelea, I.C.: The particle swarm optimization algorithm: convergence analysis and
parameter selection. Inf. Process. Lett. 85(6), 317–325 (2003)
15. Yang, X.-S., Karamanoglu, M.: Swarm intelligence and bio-inspired computation:
an overview. In: Swarm Intelligence and Bio-Inspired Computation, pp. 3–23. Else-
vier (2013)
Multi-objective Based Chan-Vese Method
for Segmentation of Mass in a Mammogram
Image

Pramod B. Bhalerao1(B) and Sanjiv V. Bonde2


1 Computer Science and Engineering, SGGSIE & T, Nanded, India
pramod2604@gmail.com
2 Electronics, and Telecommunication Engineering, SGGSIE & T, Nanded,
Vishnupuri, Nanded 431606, Maharashtra, India
svbonde@sggs.ac.in

Abstract. In the modern world, breast cancer is one of the most serious health problems after skin cancer, and the main solution is early detection. The digital mammogram has arisen as the most mainstream screening method accepted for breast cancer detection, which gives researchers the freedom to create innovative algorithms for computer-aided detection (CAD). For medical image segmentation, level-set and fuzzy-set methods are mainly used, but these methods have limitations in managing local and global features. The active contour method is prominent for segmentation, but some improvement is needed because of the limitation of the initial contour placement. There are no solid strategies for detecting breast tumors, which motivates the creation of an innovative method for identifying cancerous mass in the mammogram. A hybrid approach, a multi-objective Chan-Vese method, is introduced in this article. The main objective is to reduce the computational complexity and to detect the mass in a mammogram accurately. The findings are analyzed using the Mini Mammogram Image Analysis Society (MIAS) database and compared with other approaches. The success measurements indicate that the proposed approach is superior to the other methods.

Keywords: Breast cancer · Chan-vese method · Computer-aided detection ·


Mammogram

1 Introduction
According to worldwide statistics, breast cancer is one of the most prevalent female cancers globally, accounting for a large share of newly diagnosed cancers and cancer-related deaths, rendering it a significant public health concern in today's society. In the modern environment, one out of every fourteen females will be diagnosed with breast cancer at some point in her life, and nearly one million cases are diagnosed every year. On average, 75 to 80% of those affected are in the condition's early stages, which significantly improves the odds of effective therapy. Mammography, a type of medical imaging that uses lower doses of radiation than traditional radiography, is mainly used to diagnose this disease. There are two main object identification methods, based on edges and on regions [1].
The most critical and complex stage in image recognition schemes is image segmentation. One histogram-based approach to image segmentation [2] uses a simple method based on the information provided by histograms: after pre-processing the mammographic images to smooth the picture histograms, the optimum thresholds are calculated for each mammogram, which can cope with uni-modal and bi-modal histograms.
Various methodologies have been proposed for image segmentation based on feature separation, also recognized as active contour models, which are used to identify object boundaries in images. Even so, the success of all these methods depends on the contour initialization and the convergence criterion. If the initial shape is not correctly located, unsuccessful results follow, such as failure to converge to the desired result within the expected number of steps, or convergence to a locally optimal value of the energy function. This restriction makes active contours highly vulnerable to the initialization. Besides, such models are susceptible to incorrect boundaries, leakage through weak edges, and poor anti-noise capability. An active contour model driven by dynamic and fuzzy operations is implemented here. Active contours have grown in popularity for various applications during the last decade, primarily segmentation techniques and motion detection. Fuzzy set theory is a helpful tool for representing and processing human knowledge [3].
A computer-aided diagnosis (CAD) program can be built to successfully support radiologists with digitized mammograms by using conventional, efficient image processing techniques. CAD methods are easy-to-use and affordable instruments that can aid radiologists in their judgment processes by reviewing mammograms automatically.
In this paper, the paper’s main objective is to improve the segmentation accuracy
and reduce the computational complexity using a hybrid approach based on the multi-
objective optimization framework and the Chan-Vese method. In the case of the Chan-
Vese process, the energy function is a combination of inner energy and outer energy.
This method has the benefit of easy calculation. In the literature on image segmentation,
an active contour is also helpful to perform image segmentation. Chan-vese model [4]
is projected by chan and vese; the energy task mainly has two parts: inner energy, and
outer energy.
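For reference, the standard Chan-Vese energy functional as introduced in [4] (where u0 is the image, C the evolving contour, c1 and c2 the average intensities inside and outside C, and μ, ν, λ1, λ2 are weighting parameters) can be written as:

F(c_1, c_2, C) = \mu \, \mathrm{Length}(C) + \nu \, \mathrm{Area}(\mathrm{inside}(C)) + \lambda_1 \int_{\mathrm{inside}(C)} |u_0(x, y) - c_1|^2 \, dx \, dy + \lambda_2 \int_{\mathrm{outside}(C)} |u_0(x, y) - c_2|^2 \, dx \, dy

The two fitting integrals are the inner and outer energy terms referred to above, while the length and area terms regularize the evolving contour.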
Image processing is a tool used for extracting information from images. Segmentation is the part of image processing designed to separate the specific target area of the image. Different techniques are used to segment regions of interest from an image; among them are active contour models, which use the constraints and movements of an object's boundary to identify the region of interest. Nevertheless, an inappropriate initial contour may cause the Chan-Vese model to become trapped in a local minimum, which produces bad outcomes. In particular, this issue arises in images with significant variability in intensity between local and global structures.
Chan and Vese suggested another type of active contour with a different approximation of the Mumford-Shah energy functional [5]. The Chan-Vese model, also known as active contours without edges, is a region-centered segmentation mechanism. It uses the curve evolution method. The Chan-Vese model does not depend on the object's gradient; therefore, it does not use control points. The basic principle of the Chan-Vese model is to take into account data within regions, not only within their boundaries. As a result, the model [6] can attain objects whose boundaries are not well established. After initialization, the Chan-Vese model evolves the contour iteratively by decreasing the energy functional. The minimization of the Chan-Vese energy function relies primarily on the gradient descent approach, which ultimately reduces the energy function in a local context. In a nutshell, the steepest descent approach returns the local minimum closest to the original contour. As a result, the model is sensitive to the original contour position. It would be preferable to guarantee that a global minimum can be acquired regardless of the original contour. Therefore, the Chan-Vese model's segmentation achievement can be badly hindered by a poor selection of the initial contour.
The rest of the paper is organized as follows: Sect. 2 consists of related work, Sect. 3 describes the methodology, Sect. 4 explains the performance evaluation results, and Sect. 5 gives the concluding remarks.

2 Related Work
A hybrid methodology incorporating the Chan-Vese process and multi-objective-based image segmentation is used in this article. In [7], the edge function is not used to stop the contour from moving past the target boundary. Another study employs the active contour model as a segmentation methodology to reach strong segmentation outcomes for mammogram images with multiple part restrictions; the Chan-Vese active contour technique, which cannot accurately segment the lesions on its own, is used with a localized active contour solution to improve the results [8]. A novel form of segmentation embeds the Chan-Vese (CV) segmentation model with Particle Swarm Optimization (PSO) to segment images; the model gives a noise-free and precisely segmented output, and after segmentation the intensity inhomogeneity is corrected simultaneously by way of bias field correction with a reduced energy function [9].
In [10], an algorithm based on a fuzzy inference method is suggested for the classification of benign and malignant states; comparison using measures of device reliability such as precision, specificity, and sensitivity shows its effectiveness. The work in [11] uses fuzzy procedures on mammograms to classify and analyze potential breast cancer lesions, illustrating how fuzzy measurements can be computed from the images and how this detail can be used in the various processing phases. After the classical snake algorithm, the geometric active contour concept was developed; it is a deformable model that, in particular, enables three-dimensional image modelling. As in the snake process, a control term is used while the geometric active contour algorithm is performed, but instead of the internal and external energies, values defined as the gradient's energy and the curve's energy are used [12]. Chan and Vese suggested another type of active contour, with a different approximation of the Mumford-Shah energy functional [4]. The Chan-Vese model is also known as active contours without edges, a segmentation mechanism centered on the area. It utilizes the curve evolution method and the level-set framework.
In the Chan-Vese model, the segmentation output depends on the initial contour location. The fundamental principle of the Chan-Vese model is to take the data within the regions into account, not just within their borders. Consequently, objects whose borders are not well described may be attained by the model [13]. The image segmentation approach is considered a multi-objective operation involving multiple criteria,
including feature extraction and feature selection. Many authors are working on multi-
objective problems [14, 15]. In [16], because of the inhomogeneous thickness of CT
clinical pictures, the segmentation method depends on the Chan-Vese model, and the
fuzzy clustering method is created. In [17], a gradient descent algorithm is used along with the Chan-Vese model for the segmentation of ultrasound images. Reviewing previous
studies on emerging CAD solutions for breast cancer detection, it has been discovered
that more emphasis is placed on machine learning-based systems to simplify the system
[18]. The results of many machine learning approaches for automating mammogram
image recognition are studied in [19]. Figure 1 shows the general working of the system.
A growing trend in image segmentation research is to employ methods that can deal with multiple objectives in the course of decision-making. The objective functions stated in multi-objective challenges are frequently incompatible, preventing simultaneous optimization of each target. Many, if not all, real-world image segmentation challenges involve several goals. Much work based on multi-objective optimization is found in the literature [20–22].

Fig. 1. Proposed operations in the system

3 Proposed Model

In this research paper, a multi-objective optimization model is introduced. It is based on developing an improved Chan-Vese model and a fuzzy model: a combination of the Chan-Vese model with a fuzzy model, plus an SVM for classification, used for breast cancer monitoring in women. The fundamental principle of the Chan-Vese model is to take data within the regions into account, not just within their borders. The Chan-Vese model develops the contour iteratively after the contour's initialization by minimizing the energy functional. The Chan-Vese model uses a level-set formulation to express the model's energy:

F(c1, c2, φ) = μ ∫ δ0(φ(x, y))|∇φ(x, y)| dx dy + ν ∫ H(φ(x, y)) dx dy + λ1 ∫ |μ(x, y) − c1|² H(φ(x, y)) dx dy + λ2 ∫ |μ(x, y) − c2|² (1 − H(φ(x, y))) dx dy   (1)
In the Chan-Vese model, the energy of every pixel is determined not only by its intensity but is also weighted by the corresponding fuzzy membership value.

3.1 Fuzzy Based Chan-Vese Model

The Chan-Vese model has been broadly utilized in the edge-detection field, including the segmentation of clinical data. Nonetheless, given the intricacy of clinical images, fuzzy logic is regularly used for image segmentation. A fuzzy-based Chan-Vese framework that executes the fuzzy clustering approach is designed for clinical image handling. Indeed, the fuzzy Chan-Vese model improves on the underlying Chan-Vese model. The overall image is divided into two parts, background and foreground. To obtain a fine segmentation, the same process is repeated, minimizing the fuzzy Chan-Vese energy functional:
 
F(c1, c2, φ) = μ ∫ δ0(x, y)|∇φ(x, y)| dx dy + ν ∫ H(φ(x, y)) dx dy
+ λ1 ∫ (Σ_{x,y} kc1(x, y) dc1) H(φ(x, y)) dx dy
+ λ2 ∫ (Σ_{x,y} kc2(x, y) dc2)(1 − H(φ(x, y))) dx dy   (2)

where

Sc1 = {(x, y)1, (x, y)2, (x, y)3, . . . , (x, y)C1},   (3)

Sc2 = {(x, y)1, (x, y)2, (x, y)3, . . . , (x, y)C1+C2},   (4)

dc1 = |μ(x, y) − c1|²,   (5)

dc2 = |μ(x, y) − c2|².   (6)

The numbers of segments needed to smooth the background and foreground parts of the image are defined by c1 and c2; they are often considered prototypes. (x, y)i in Eqs. (3) and (4) represents the coordinates of the ith point on the curve C. kc(x, y) is set to either 0 or 1: if the prototype c(i) approximates μ(x, y) at point i, the value of kc(x, y) is set to 1; otherwise, it is 0.
This approach examines both the image's global and local properties. Fuzzy clustering methods consider local properties but disregard the global features of objects during segmentation; from this arises the problem of sensitivity to initial conditions, so an image might obtain a different classification with various classification effects. On the contrary, methods oriented to the global characteristics of the image, such as the Chan-Vese model, take those characteristics into consideration but have some limitations in describing local image content. The fuzzy Chan-Vese paradigm reflects the flexibility of the two types of technique, allowing them to balance one another soundly and to bring their compensating strengths into action. Compared with the Chan-Vese paradigm and fuzzy clustering techniques alone, the fuzzy Chan-Vese design better accounts for the global and regional characteristics of images.
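As an illustration of how the membership weighting enters the computation, a minimal Python sketch is given below; the use of the zero level set of φ as the region boundary and all array and function names are assumptions made for exposition, not the exact implementation of this work.

import numpy as np

def fuzzy_region_means(image, phi, membership, eps=1e-9):
    # Fuzzy-weighted foreground/background prototypes: each pixel's
    # contribution to c1 and c2 is scaled by its membership value,
    # in the spirit of the data terms of Eq. (2).
    inside = (phi > 0).astype(float)
    w_in = membership * inside
    w_out = membership * (1.0 - inside)
    c1 = (w_in * image).sum() / (w_in.sum() + eps)
    c2 = (w_out * image).sum() / (w_out.sum() + eps)
    return c1, c2

def fuzzy_cv_data_energy(image, phi, membership, c1, c2, lam1=1.0, lam2=1.0):
    # Membership-weighted fitting energies of the two regions.
    inside = (phi > 0).astype(float)
    e1 = lam1 * (membership * (image - c1) ** 2 * inside).sum()
    e2 = lam2 * (membership * (image - c2) ** 2 * (1.0 - inside)).sum()
    return e1 + e2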
3.2 Multi-objective-Based Chan-Vese Model

In the image segmentation phase, the fundamental objects are extracted. This is one of the crucial steps in many image processing applications. Researchers focus mainly on the difficulty involved in this process and on increasing the accuracy of separating objects from the background [23]. Multi-objective optimization is one of the effective methods for accomplishing the final objective of segmentation. Researchers' recent primary focus is on relating multiple objectives in medical image segmentation issues [24]. It has been observed that most segmentation approaches involve multiple objectives. Multi-objective problems can be handled in two different ways: in one method, the multiple objectives are combined into one single suite; in the second, multiple solutions that do not dominate each other are found.
A problem involving a single objective is represented in Eq. (7) as:

min f(x), where x ∈ S,   (7)

where S is the feasible set defined by the constraints, given in Eq. (8) as

S = {x ∈ Rm : h(x) = 0, g(x) ≥ 0}.   (8)

Here only a single objective is involved; when considering multiple objectives, the multi-objective function is described in Eq. (9) as

min [f1(x), f2(x), f3(x), . . . , fm(x)]T,   (9)

where x = (x1, x2, . . . , xn)T, x ∈ Ω, and fq(x) is the qth objective to be minimized or maximized.
In most problems involving multiple objectives, one target is achieved at the expense of another, leading to a competitive situation. In such a scenario, one has to identify trade-offs between the targets to fulfill the final requirement. The Pareto optimal set is one solution concept for such problems, in which no objective function can be enhanced without degrading another objective function. In the case of the fuzzy Chan-Vese model, multiple parameters are available for the computation. A hybrid approach using multi-objective optimization and the Chan-Vese model is implemented to segment the boundaries of mammogram images.
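The two ways of handling multiple objectives mentioned above can be made concrete with a short Python sketch; the candidate objective values below are invented for illustration, and both objectives are assumed to be minimized.

import numpy as np

def dominates(a, b):
    # True if objective vector a Pareto-dominates b (all objectives minimized):
    # a is no worse everywhere and strictly better somewhere.
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a <= b) and np.any(a < b))

def pareto_front(points):
    # Second handling style: keep the solutions not dominated by any other.
    return [p for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]

def weighted_sum(objectives, weights):
    # First handling style: collapse f1..fm into one scalar target.
    return sum(w * f for w, f in zip(weights, objectives))

# Hypothetical candidates scored by (contour energy, region energy):
candidates = [(0.9, 0.2), (0.5, 0.5), (0.3, 0.8), (0.6, 0.6)]
print(pareto_front(candidates))              # (0.6, 0.6) is dominated and drops out
print(weighted_sum((0.5, 0.5), (0.7, 0.3)))  # 0.5 for equal objective values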

4 Simulation and Discussion

In this work, the performance of the proposed method is evaluated using the Mammographic Image Analysis Society (MIAS) database. This dataset offers the ground truth for every mass as an estimated circle radius covering the mass. The MIAS dataset contains 322 images. Table 1 shows the results of the proposed multi-objective-based Chan-Vese method and of other standard methods. The results show that the multi-objective-based Chan-Vese method gives an accuracy of 98.29%, which is better than the 97.06% and 90.36% of the fuzzy-based Chan-Vese method [16] and the Chan-Vese model [9], respectively. Figure 2 shows the segmentation result for image mdb028 from the MIAS database.
Table 1. Performance evaluation of proposed work

Method                                    Recall   Precision   Specificity   F-score   Accuracy
Chan-Vese model [7]                       71.41    72.82       94.01         70.68     90.36
Fuzzy based Chan-Vese method [4]          92.45    94.39       97.88         93.30     97.06
Multi-objective based Chan-Vese method    93.78    91.61       98.67         92.57     98.29

Fig. 2. Evaluation results for segmentation (MIAS images mdb028) (a) Input image; (b) Chan-
Vese method; (c) Fuzzy based Chan-Vese; (d) multi-objective-based Chan-Vese

Figure 3 shows the segmentation result for image mdb271 from the MIAS database. The results are tested using various performance measures (recall, precision, specificity, F-score, and accuracy) based on the true positive and true negative fractions.
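These measures follow the standard confusion-matrix definitions; a small Python helper, given below as a sketch, makes the computation behind Table 1 explicit (the pixel counts tp, fp, tn, fn are assumed inputs, and the example values are hypothetical).

def segmentation_scores(tp, fp, tn, fn):
    # Standard confusion-matrix measures used in Table 1.
    recall = tp / (tp + fn)                    # true positive fraction
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)               # true negative fraction
    f_score = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return recall, precision, specificity, f_score, accuracy

# Example with hypothetical pixel counts:
print(segmentation_scores(tp=938, fp=86, tn=9867, fn=62))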
The SVM is a commonly used classification technique for detecting breast tumours. Using the SVM for image classification consists of two phases: training and testing. During training, the SVM takes as input breast images consisting of normal and abnormal cases, and the training algorithm solves the task of splitting a set of training vectors belonging to two different groups. It attempts to find an ideal separating hyperplane and to maximize the margin between the classes to attain the optimum generalization potential. The benefit is that the issue of over-fitting is minimized by selecting a single hyperplane from the many that can separate the data in the feature space. Figure 4 represents the support vector machine with a hyperplane.
Fig. 3. Evaluation results for segmentation (MIAS image mdb271) (a) Input image; (b) Chan-Vese
method; (c) Fuzzy based Chan-Vese; (d) multi-objective-based Chan-Vese

Fig. 4. Support vector machine with a hyperplane

The efficiency of the SVM classifier is checked with Chan-Vese, Fuzzy Chan-Vese,
and Multi-objective-based Chan-Vese. The classification results are shown in Table 1
(Fig. 5).
Result analysis shows that the multi-objective-based Chan-Vese method gives better results; this method is more computationally effective than previous methods. Precision and F-score values are better in the fuzzy Chan-Vese method than in the proposed multi-objective Chan-Vese method, whereas accuracy, specificity, and recall values are better in our proposed work.
Fig. 5. Chart representing comparative analysis of proposed work

5 Conclusion

In previous methods, a fuzzy Chan-Vese model was used for medical image segmentation by combining the Chan-Vese model with fuzzy C-means. The model combines the versatility of the two kinds of techniques to balance global and local features. An adaptive set of prototype information for the fuzzy Chan-Vese model was still a challenge. An automated multi-objective algorithm is suggested in the context of an optimized C-V model; it applies and analyses the C-V paradigm. As our approach has proven helpful, the algorithm can distinguish multiple objects more effectively and achieve the necessary results. The results show that the accuracy (98.29%) of the multi-objective-based Chan-Vese method is better than the accuracy (97.06%) of the fuzzy-based Chan-Vese method. In the future, work will be done using nature-inspired optimization techniques.

Acknowledgement. I would like to communicate thankfulness to the persons from the Depart-
ment of Radiology, Mahatma Gandhi Mission, Medical College and Hospital, Aurangabad, M.S.
India, for their valuable support.

Funding. Not Applicable.

Conflict of Interest. The authors state that they have no conflict of interest.

References
1. Torre, L.A., Islami, F., Siegel, R.L., Ward, E.M., Jemal, A.: Global cancer in women: burden
and trends. Cancer Epidemiol. Biomarkers Prev. 26(4), 444–457 (2017). https://doi.org/10.
1158/1055-9965.EPI-16-0858. Epub 2017 Feb 21 PMID: 28223433
2. Suhail, Z., Zwiggelaar, R.: Histogram-based approach for mass segmentation in mammo-
grams. In: Proceedings of SPIE, 15th International Workshop on Breast Imaging (IWBI2020),
vol. 11513, p. 1151325, 22 May 2020. https://doi.org/10.1117/12.2563621
3. Algorri, M.E., Flores-Mangas, F.: Classification of anatomical structures in MR brain images using fuzzy parameters. IEEE Trans. Biomed. Eng. 51(9), 1599–1608 (2004). https://doi.org/10.1109/TBME.2004.827532. PMID: 15376508
4. Chan, T.F., Vese, L.A.: Active contours without edges. IEEE Trans. Image Process. 10(2),
266–277 (2001). https://doi.org/10.1109/83.902291
5. Mumford, D., Shah, J.: Optimal approximations by piecewise smooth functions and associated
variational problems. Commun. Pure Appl. Math. 42, 577–685 (1989). https://doi.org/10.
1002/cpa.3160420503
6. Ahmed, M.N., Yamany, S.M., Mohamed, N., Farag, A.A., Moriarty, T.: A modified fuzzy
c-means algorithm for bias field estimation and segmentation of MRI data. IEEE Trans. Med.
Imaging 21(3), 193–199 (2002). https://doi.org/10.1109/42.996338
7. Huo, B., Li, G., Yin, F.: Medical and Natural image segmentation algorithm using M-F based
optimization model and modified fuzzy clustering: a novel approach. Int. J. Signal Process.
Image Process. Pattern Recogn. 8, 223–234 (2015). https://doi.org/10.14257/ijsip.2015.8.7.21
8. Xia, R., Liu, W., Zhao, J., Li, L.: An optimal initialization technique for improving the
segmentation performance of chan-vese model. In: 2007 IEEE International Conference on
Automation and Logistics, pp. 411–415 (2007)
9. Wang, X.-F., Huang, D.-S., Xu, H.: An efficient local Chan-Vese model for image seg-
mentation. Pattern Recogn. 43(3), 603–618 (2010). https://doi.org/10.1016/j.patcog.2009.
08.002
10. Johra, F., Shuvo, M.M.H.: Detection of breast cancer from histopathology image and clas-
sifying benign and malignant state using fuzzy logic. In: 2016 3rd International Conference
on Electrical Engineering and Information Communication Technology (ICEEICT), pp. 1–5
(2016). https://doi.org/10.1109/CEEICT.2016.7873137
11. Bakoš, M.: Active contours and their utilization at image segmentation. In: 5th Slovakian-
Hungarian Joint Symposium on Applied Machine Intelligence and Informatics, Poprad,
Slovakia, pp. 313–317 (2007)
12. Rick, A., Bothorel, S., Bouchon-Meunier, B., Muller, S., Rifqi, M.: Fuzzy techniques in
mammographic image processing (2000). https://doi.org/10.1007/978-3-7908-1847-5_12
13. Zhang, N., Zhang, J., Shi, R.: An improved Chan-Vese model for medical image segmentation.
In: 2008 International Conference on Computer Science and Software Engineering, pp. 864–
867 (2008). https://doi.org/10.1109/CSSE.2008.826
14. Datta, N.S., Dutta, H.S., Majumder, K., Chatterjee, S., Wasim, N.A.: A Survey on the appli-
cation of multi-objective optimization methods in image segmentation. In: Mandal, J.K.,
Mukhopadhyay, S., Dutta, P. (eds.) Multi-Objective Optimization, pp. 269–278. Springer,
Singapore (2018). https://doi.org/10.1007/978-981-13-1471-1_12
15. Chakraborty, S., Mali, K.: Application of multiobjective optimization techniques in biomedi-
cal image segmentation—a study. In: Mandal, J.K., Mukhopadhyay, S., Dutta, P. (eds.) Multi-
Objective Optimization, pp. 181–194. Springer, Singapore (2018). https://doi.org/10.1007/
978-981-13-1471-1_8
16. Chen, J., Chen, L., Li, L.: A local fuzzy-based Chan-Vese method for the segmentation of CT
medical images. J. Nanhua Univ. Sci. Technol. 29(2), 108–113, 128 (2015)
17. Ramu, S.M., Rajappa, M., Krithivasan, K., Jayakumar, J., Chatzistergos, P., Chockalingam,
N.: A method to improve the computational efficiency of the Chan-Vese model for the seg-
mentation of ultrasound images. Biomed. Signal Process. Control 67, 102560 (2021). https://
doi.org/10.1016/j.bspc.2021.102560. ISSN1746–8094
18. Bagchi, S., Tay, G., Huong, A., Debnath, S.: Image processing and machine learning tech-
niques used in computer-aided detection system for mammogram screening - a review. Int. J.
Electr. Comput. Eng. (IJECE) 10, 2336 (2020). https://doi.org/10.11591/ijece.v10i3.pp2336-
2348
19. Meenalochini, G., Ramkumar, S.: Survey of machine learning algorithms for breast cancer
detection using mammogram images. Mater. Today Proc. 37, Part 2, 2738–2743 (2021).
https://doi.org/10.1016/j.matpr.2020.08.543. ISSN2214-7853
20. Bong, C.-W., Mandava, R.: Multi-objective optimization approaches in image segmentation –
the directions and challenges. Int. J. Adv. Soft Comput. Appl. 2, 40–64 (2010)
21. Pare, S., Kumar, A., Singh, G.K., Bajaj, V.: Image segmentation using multilevel thresholding:
a research review. Iran. J. Sci. Technol. Trans. Electr. Eng. 44(1), 1–29 (2019). https://doi.
org/10.1007/s40998-019-00251-1
22. Raychaudhuri, A., De, D.: Bio-inspired algorithm for multi-objective optimization in wireless
sensor network. In: De, D., Mukherjee, A., Kumar Das, S., Dey, N. (eds.) Nature Inspired
Computing for Wireless Sensor Networks. STNC, pp. 279–301. Springer, Singapore (2020).
https://doi.org/10.1007/978-981-15-2125-6_12
23. Nakib, A., Oulhadj, H., Siarry, P.: Image histogram thresholding based on multi-objective
optimization. Signal Process. 87, 2516–2534 (2007). https://doi.org/10.1016/j.sigpro.2007.
04.001
24. Ganesan, T., Elamvazuthi, I., Shaari, K.Z.K., Vasant, P.: An algorithmic framework for multi-
objective optimization. Sci. World J. 2013, 11 (2013). https://doi.org/10.1155/2013/859701.
Article ID 859701
Securing the Adhoc Network Data Using Hybrid
Malicious Node Detection Approach

Atul B. Kathole1,2(B) and Dinesh N. Chaudhari1,2


1 Department of Computer Engineering, Pune, India
atul.kathole1910@gmail.com
2 Department of Computer Science and Engineering, Pune, India

Abstract. As Adhoc networks are fundamentally dynamic, security vulnerabilities may arise from various assaults. As a result, several strategies for preventing packet routing mistakes have been suggested in these networks. The deployment scenario demonstrates how the packet transmission rate and performance are affected when a Sybil attack is present in the network. Our objective is to offer a clustering technique that will enhance latency, packet transfer rate, and other performance metrics. The suggested technique consists of two phases: the first phase is based on the packet delivery rate, and the second phase verifies the node's behavior by determining the precise reason for the performance decline. Software that authenticates the cluster network should be utilized to increase security. The false accusation algorithm includes techniques for revocation of certificates of harmful entities. By comparing the proposed system to the current system, we are attempting to enhance the proposed system's performance. As the number of shared nodes in the system grows, the system's performance and ability to defend against different threats improve.

Keywords: Adhoc network · DSR · Security · Malicious node detection

1 Introduction

A wireless ad hoc network is a clustered wireless network. It does not have a pre-defined infrastructure, and nodes may communicate directly with one another [1, 2]. It is made up of several nodes that are connected through links. Because the links are insecure and wireless, security inside the ad hoc network is a significant problem [4]. The nodes need merely to exchange a private key with authorized neighbor nodes to offer encryption, which enables multiple security goals such as secrecy, honesty, non-repudiation, availability, and authentication to be achieved. The fundamental advantages of the ad hoc network are its low cost and simplicity of setup. Ad hoc networks are very susceptible to DoS attacks at the network layer. Sybil attacks are broad-based assaults on the ad hoc network. In a Sybil attack, malicious nodes obstruct network data transit by delivering inaccurate routing information [2]. Using a black hole technique, malicious nodes send erroneous routing information to surrounding nodes, indicating a short route to the target node. After obtaining this bogus information, the source transmits packets through these malicious
nodes, which discard the packets; the packets thus never reach the destination node. A gray hole is considered an extension of the black hole, since the presence of malicious nodes cannot be anticipated. A Sybil attack amounts to momentarily establishing new node identities or network entities and transferring bogus data to other network nodes. On some occasions such a node may act maliciously, while on others it may behave normally. Both of these attacks disable the route discovery method, lowering the throughput and packet distribution ratio [3].
In [2, 3], the CBDS method described the platform's efficiency as evaluated by the Packet Distribution Ratio (PDR) of the source and destination link nodes. However, this detection increases the false positive rate by classifying genuine nodes as hostile and is less effective in locating the truly harmful nodes. Such inadequacies stem from the assumption that protracted packet losses or a dropped PDR occur only due to malicious activities of nodes, an assumption not covered by specific security strategies.
There are many diverse reasons for a declining PDR in a MANET: high data rate, blocking or excessive load, high mobility, and so on. In any case, there are instances when MANET protection mechanisms struggle to account for alternative reasons for a dropped-PDR scenario. For instance, using conventional approaches results in erroneous identification of malicious nodes [1–3].
This study develops a critical safety mechanism, dubbed the Hybrid method, for securing interaction and avoiding assaults on the DSR protocol. The procedure determines whether any nodes in the network are malicious, since the network is composed of several nodes [5]. To eliminate these rogue nodes, the enhanced DSR protocol is used [6], so that all malicious nodes are removed. If an adjacent node obtains incorrect routing information from an intermediary node, that node should be regarded as malevolent. The intermediate node informs the other nodes, and any node that learns of the malicious nodes updates its routing database to classify them as malicious. When an RREQ is delivered, a malicious-node list is attached, and the other nodes update the receiving node's routing table [7]. Thus, by detecting inaccurate routing information or analyzing the routing table, nodes may identify rogue nodes and alert other nodes not to accept the malicious nodes' routing information [8]. Links connect numerous nodes to form the network. A unique ID identifies each node, and each packet is stamped with the identity of the matching source node. This critical information is maintained at each computer node in the network [9].
MANET, or Mobile Ad hoc Networking, is a novel technology based on a wireless multi-hop architecture that does not need a rigid infrastructure or prior configuration of the network nodes. The crucial characteristics of this networking model are the following: (I) essential networking functions, such as routing and data transfer, are jointly supported; (II) the network nodes have no hierarchical boundaries; (III) the network lacks a central entity; and (IV) in general, node-to-node network connections are dynamic.
Strategies often fail to understand the true nature of the reasons behind an unfortunate event. This leads to a high number of false positives for nodes that are not malicious and low detection rates for malicious nodes. Such vulnerabilities exist as a result of the assumption made by these confidence-based security systems [10] that packet losses occur only due to malevolent behavior of misbehaving nodes. In reality, losses are due to a variety of factors, including insufficient agility propagation, congestion, and wireless connections. Without a fine-grained examination of packet losses, conventional detection algorithms might produce erroneous confidence estimations, mainly when node mobility and data rates are high [11].
The rest of the paper is organized as follows: Sect. 2 presents related work on malicious node identification, Sect. 3 the research method, and Sect. 4 the malicious activity detection mechanism. Section 5 details the simulation results. Finally, Sect. 6 gives a brief summary of the study.

2 Related Work
Several approaches have been developed over the past two decades to solve the issue of rogue nodes in MANETs. Many of these tactics target the detection of a single suspicious node, or need enormous time and financial resources to police coordinated blackhole strikes. Additionally, each of these techniques requires a distinct context [3, 5] or set of assumptions to work. Malicious node detection techniques are classified into three categories: proactive, reactive, and hybrid detection approaches. Proactive detection mechanisms [12, 13] are required to monitor surrounding mobile nodes to identify malicious nodes frequently. As a result, detection overhead is constantly created even in the absence of malicious nodes, and the resources necessary for detection are often exhausted [14]. One advantage of these systems is that they may facilitate the prevention or avoidance of such assaults at their first stages. Identification of malicious nodes is launched by the reactive detection procedures [15–17] only when anomalous packets are reported at the destination node [16]. The hybrid detection systems [1, 8] effectively incorporate proactive and reactive techniques for locating malicious nodes [18]. These approaches benefit from both proactive and reactive routing strategies [19].
In [20], 2ACK suggested a technique for locating and identifying routing misbehavior in MANETs. This design sends two-hop acknowledgment packets in the opposite direction of the routing path to verify that the data packets were received successfully. A parameter called the acknowledgment ratio, Rack, controls the fraction of received data packets for which acknowledgment is required. This method falls under proactive designs, which results in excessive routing overhead while paying less attention to risky nodes. In [21], a component dubbed best-effort fault-tolerant routing, or BFTR, was developed. We experimented with an FGA scheme implementation on top of the routing protocol. Several diverse tactics have been described for MANET defense, for example, trust-based approaches.
The suggested FGA approach was evaluated in this study using a Sybil attack scenario. In this sort of attack architecture, the malicious nodes drop data packets randomly with a 25% probability. The percentage of bogus nodes deployed fluctuates between 5% and 20%. As a result, this affects the efficiency of the system [22].

3 Research Method
In this paper, we recommend the Hybrid Bait Detection System, or HBDS, as a malicious-node detection tool. It addresses the problems encountered by earlier detection methods, which depended mainly on packet losses as the sole parameter for identifying misbehaving nodes. The presented HBDS is a two-stage detection method that provides MANET defense and reduces malicious-node detection mistakes. Numerous other security mechanisms make it impossible to identify rogue nodes with certainty [23].
The graphs illustrate node mobility against the packet delivery ratio, throughput, and end-to-end latency. In the presence of a Sybil attack, HBDS has a packet delivery ratio of 96.5%, a throughput of 38.37%, and an end-to-end latency of 0.34%, which is comparable to FGA. Utilizing the suggested method enables the network's overall preference to be improved to achieve maximum throughput in the shortest amount of time [24].

4 Malicious Activity Detection

The primary goal of this study is to offer insight into the malicious node detection mechanism used in MANETs to enhance security and performance. The software is built on the HBDS framework, which is capable of defending against a variety of MANET assaults. To optimize end-to-end latency and PDR performance, the hybrid CBDS (HBDS) technique is needed. The proposed work would significantly improve performance against a range of network attacks compared with existing methods. We utilize the suggested method to avoid a Sybil attack by a malicious node inside a particular MANET [25].

Algorithm
The node credibility value identified by a source node is calculated and compared to the actual behavior of every node to assess the accuracy of the credibility criterion estimated by the underlying trust-based framework. Following the algorithm of [3], the pseudo-code for our HBDS scheme is introduced.
The above algorithm considers different parameters together with packet loss and PDR to evaluate how a node is working. If a particular node drops more packets than the set threshold value and its PDR-based suspicion measure is also greater than 0.5, the node ID is captured and stored in a separate table, since under the above prediction it can be a malicious node [26].
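A minimal Python sketch of this first-phase rule is given below; the threshold, the suspicion level, and the statistics fields are illustrative assumptions rather than the exact values of the HBDS implementation.

def hbds_first_phase(stats, drop_threshold=10, suspicion_level=0.5):
    # First-phase screening: a node whose drop count exceeds the threshold
    # and whose drop ratio is also above the suspicion level is recorded
    # for the second, behaviour-verification phase.
    suspects = {}
    for node_id, s in stats.items():
        drop_ratio = s["dropped"] / max(s["sent"], 1)
        if s["dropped"] > drop_threshold and drop_ratio > suspicion_level:
            suspects[node_id] = drop_ratio
    return suspects

stats = {"n1": {"sent": 100, "dropped": 2},
         "n7": {"sent": 100, "dropped": 62}}
print(hbds_first_phase(stats))   # only n7 is flagged for verification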
The network environment is 500 m × 500 m. The simulation parameters, including the number of nodes, are shown in Table 1 below. In addition, the proposed phenomenon has been tested against malicious situations where attackers have infected various legitimate nodes [27].

Table 1. Parameters used during execution

Parameter            Value
Simulation time      80 s
Grid facet           500 m × 500 m
Ad hoc nodes         250
Transmission range   140 m
Data size            512 bytes
MAC protocol         IEEE 802.11
5 Simulation and Results


The performance review of the collaborative hybrid bait detection system (HBDS) tests the proposed system model in a self-organizing network and compares it with routing schemes based on DSR, CBDS, and FGA with respect to packet delivery, latency, and throughput, to determine whether the scheme successfully secures the routing protocol against the Sybil attack [28].
The performance of the proposed HBDS system model is analyzed based on the false alarm rate, detection rate, power consumption, packet loss rate, packet transfer rate, processing delay, throughput, and other parameters, and compared with traditional DSR, CBDS, and FGA to test the effectiveness of the secure routing protocol under attack. Below, some analysis of HBDS performance against the corresponding methods is given [29].

1) False Positives

The false positive rate is the ratio of valid nodes classified as unsafe to the total number of legitimate nodes. We compared the proposed architecture with DSR, CBDS, and FGA in terms of false positives. This time we covered all structures with our optimization model and included malicious nodes in the network [12]. The false alarm rate as the speed of the nodes increases is shown in Fig. 1(a) below. As can be seen from the diagram, compared with the other solutions, the number of false positives in our HBDS system is dramatically reduced. The method we propose can better investigate the common possible causes of packet loss events and then determine the reliability of a node. In general, the statistics show that as node speed grows, the rate of false positives also increases. There is a reason for this trend: if a node moves faster, the possibility of the source node overhearing badly or of routing information becoming out of date increases significantly, resulting in genuine nodes being declared malicious. Figure 1(b) shows that the frequency of false alarms also grows as the density of nodes increases, at a node speed of 4 m/s. As the number of nodes in the architecture increases, the number of source/destination pairs grows and more packets are lost due to network collisions. Compared with many other schemes that treat all dropped packets as malicious activity, the number of false positives in the HBDS scheme is lower because the drop rate of each packet is measured before the behavior of the node is evaluated.

2) Detection Rate

Compared to the rates of the other schemes, our HBDS framework offers higher identification efficacy. Figure 2(a) illustrates the detection rate as a function of increasing node speed for the HBDS methodology and the other techniques; the detection rate likewise varies as a function of increasing node density. The number of data connections inside the network increases with node density, as more packets are lost due to collisions. The alternative paradigms view these packet drops as malicious actions from genuine nodes. Consequently, as shown in the figure, the detection rate is greater in our HBDS process than in the other processes.
Fig. 1. Effect of a node moving speed and density on false positives. (a) False positives vs. node
moving speed.

Fig. 2. Effect of a node moving speed and density on detection rate. (a) Detection rate vs. node
speed.

3) Packet Loss Rate

The packet loss rate as a function of increasing node speed is shown in Fig. 3 for the FGA and HBDS schemes. Our HBDS system, as shown in the picture, has a lower packet loss rate than the FGA system. In the HBDS system, more reliable nodes are chosen for routing, resulting in fewer dropped packets and a higher packet transmission ratio.

4) Packet Delivery Ratio

The relationship between the number of malicious hosts and packet delivery is shown in the figure below. The data packet transfer rate of the conventional routing protocol DSR is 86.19%. Under the Sybil attack, the routing protocols CBDS and FGA reach up to 94.70%, while the recommended HBDS protocol reaches 97.54% (Fig. 4).
Fig. 3. Effect of node moving packet loss rate

Fig. 4. Performance analysis of packet delivery ratio using DSR, CBDS, FGA & HBDS

5) Throughput

By generating over 38.37 average throughput, the HBDS surpassed DSR, CBDS, and FGA. The DSR is the smallest of the group, and its throughput fluctuated significantly during the Sybil attack. The throughput characteristics of the traditional DSR, CBDS, and FGA are compared, and the graph illustrates the findings (Fig. 5).
Fig. 5. Performance analysis of throughput using DSR, CBDS, FGA & HBDS

6 Conclusion
It is well known that Sybil attacks are the most lethal kind of ad hoc network attack. While there are several techniques to guard ad hoc networks against such assaults, standard preventive measures in this area have substantial limits and drawbacks, and numerous established procedures are inaccurate in practice. DSR often fails to exclude rogue nodes during the route discovery process, so under Sybil assaults not all data packets can be delivered to the target. Additionally, as the number of malicious nodes in these attacks increases, the Packet Distribution Ratio (PDR) may decrease.
As a consequence, a novel technique dubbed HBDS for safeguarding ad hoc networks has been presented.

References
1. Kathole, A.B., Chaudhari, D.N.: Pros & cons of machine learning and security methods, vol.
21, no. 4 (2019). http://gujaratresearchsociety.in/index.php/JGRS. ISSN: 0374-8588
2. Kathole, A.B., Halgaonkar, P.S., Nikhade, A.: Machine learning & its classification tech-
niques. Int. J. Innov. Technol. Explor. Eng. (IJITEE) 8(9S3) (2019). ISSN: 2278-3075
3. Kathole, A.B., Chaudhari, D.N.: Fuel analysis and distance prediction using machine learning.
Int. J. Future Revolut. Comput. Sci. Commun. Eng. 5(6) (2019)
4. Hasrouny, H., Samhat, A.E., Bassil, C., Laouiti, A.: VANet security challenges and solutions:
a survey. Veh. Commun. 7, 7–20 (2017)
5. Yaqoob, I., Ahmad, I., Ahmed, E., Gani, A., Imran, M., Guizani, N.: Overcoming the key
challenges to establishing vehicular communication: is SDN the answer. IEEE Commun.
Mag. 55(7), 128–134 (2017)
6. Ahmad, I., Noor, R.M., Ali, I., Imran, M., Vasilakos, A.: Characterizing the role of vehicular
cloud computing in road traffic management. Int. J. Distrib. Sensor Netw. 13(5) (2017). Art.
no. 1550147717708728
7. Khan, M.S., Midi, D., Khan, M.I., Bertino, E.: Fine-grained analysis of packet loss in manets.
IEEE (2017). 2169-3536
8. Ahmad, I., Ashraf, U., Ghafoor, A.: A comparative QoS survey of mobile ad hoc network
routing protocols. J. Chin. Inst. Eng. 39(5), 585–592 (2016)
9. Li, L., Lee, G.: DDoS attack detection and wavelets. Telecommun. Syst. 28(3–4), 435–451
(2005)
10. Buragohain, C., Kalita, M.J., Singh, S., Bhattacharyya, D.K.: Anomaly-based DDoS attack
detection. Int. J. Comput. Appl. 123(17) (2015)
11. Manvi, S.S., Tangade, S.: A survey on authentication schemes in VANETs for secured
communication. Veh. Commun. 9, 19–30 (2017)
12. Srinivasa Rao, Y., Hussain, M.A.: Dynamic MAC protocol to enhancing the quality of real
time traffic in MANET using network load adaptation. J. Adv. Res. Dyn. Syst. 10(7 Special
Issue), 1612–1617 (2018)
13. Suma, P., Hussain, M.A.: Secure and effective random paths selection (SERPS) algorithm for
security in MANETs. Int. J. Eng. Technol. (UAE) 7(2), 134–138 (2018). https://doi.org/10.
14419/ijet.v7i2.8.10345
14. Sinha, A., Mishra, S.K.: Preventing VANET from DOS & DDOS attack. Int. J. Eng. Trends
Technol. 4(10), 4373–4376 (2013)
15. Boddu, N., Vatambeti, R., Bobba, V.: Achieving energy efficiency and increasing the network
life time in MANET through fault-tolerant multi-path routing. Int. J. Intell. Eng. Syst. 10(3),
166–172 (2017). https://doi.org/10.22266/ijies2017.0630.18
16. Kolagani, P., Aditya, K., Venkatesh, N., Kiran, K.V.D.: Multi cross-protocol with hybrid
topography control for manets. J. Theor. Appl. Inf. Technol. 95(3), 457–467 (2017)
17. Verma, K., Hasbullah, H.: Bloom-filter based IP-CHOCK detection scheme for denial of
service attacks in VANET. Secure. Commun. Netw. 8, 864–878 (2015)
18. Ghorsad, S.A., Karde, P.P., Thakare, V.M., Dharaskar, R.V.: DoS attack detection in vehicular
ad-hoc network using malicious node detection algorithm. Int. J. Electron. Commun. Soft
Comput. Sci. Eng. 3, 36 (2014)
19. Cynthia, C., Saguturu, P.K., Bandi, K., Magulluri, S., Anusha, T.: A survey on MANET
protocols in wireless sensor networks. Int. J. Eng. Technol. (UAE) 7(2), 1–3 (2018). https://
doi.org/10.14419/ijet.v7i2.31.13384
20. Abdul, A.M., Umar, S.: Notification of data congestion intimation for IEEE 802.11 adhoc
network with power save mode. Indon. J. Electr. Eng. Comput. Sci. 5(2), 317–320 (2017).
https://doi.org/10.11591/ijeecs.v5.i2.pp317-320
21. Hong, X., Huang, D., Gerla, M., Cao, Z.: SAT: situation-aware trust architecture for vehic-
ular networks. In: Proceedings of 3rd International Workshop Mobility Evolving Internet
Architecture, pp. 31–36, August 2008
22. Hortelano, J., Ruiz, J.C., Manzoni, P.: Evaluating the usefulness of watchdogs for intrusion
detection in VANETs. In: Proceedings of IEEE International Conference on Communication
Workshops, pp. 1–5, May 2010
23. Liao, C., Chang, J., Lee, I., Venkatasubramanian, K.K.: A trust model for vehicular network-
based incident reports. In: Proceedings of IEEE 5th International Symposium on Wireless
Vehicular Communications (WiVeC), pp. 1–5, June 2013
24. Ding, Q., Li, X., Jiang, M., Zhou, X.: Reputation-based trust model in vehicular ad hoc
networks. In: Proceedings of International Conference on Wireless Communications & Signal
Processing (WCSP), pp. 1–6, October 2010
25. Chowdary, K., Satyanarayana, K.V.V.: Malicious node detection and reconstruction of
network in sensor actor network. J. Theor. Appl. Inf. Technol. 95(3) (2017)
26. Hartenstein, H., Laberteaux, L.P.: A tutorial survey on vehicular ad hoc networks. IEEE
Commun. Mag. 46(6), 164–171 (2008)
27. Chen, C., Zhang, J., Cohen, R., Ho, P.-H.: Secure and efficient trust opinion aggregation for
vehicular ad-hoc networks. In: Proceedings of IEEE 72nd Vehicular Technology Conference-
Fall, pp. 1–5, September 2010
28. Wei, Y.-C., Chen, Y.-M.: An efficient trust management system for balancing the safety and
location privacy in VANETs. In: Proceedings of IEEE 11th International Conference on Trust,
Security and Privacy in Computing and Communications, pp. 393–400, June 2012
29. Nagendram, S., Rao, K.R.H., Bojja, P.: A review on recent advances of routing protocols for
development of MANET. J. Adv. Res. Dyn. Control Syst. 9(2 Special Issue), 114–122 (2017)
Embedded Digital Control System of Mobile
Robot with Backlash on Drives

Eugeny Larkin1(B) and Aleksandr Privalov2


1 Tula State University, Tula 300012, Russia
Elarkin@mail.ru
2 Tula State Lev Tolstoy Pedagogical University, Tula 300026, Russia

Abstract. An embedded digital control system of mobile robots (MR) based on a Von Neumann type computer is considered. The control system under investigation has the following features: multiple control loops, time delays of feedback signals, and backlash at the joints of drives with executive units. A methodology for studying a system with the highlighted features is proposed. According to the method, the system model is divided into two parts: the first, linear part, which runs from the output of the nonlinear element around the loop to its input, and the second part, which includes a description of the backlash. The first part of the model is based on the apparatus of transfer functions, which makes it possible to describe cross-links quite simply and to take into account the time lags of the object under investigation. To evaluate the lag parameters, the embedded control algorithm, which programmatically generates transactions to sensors and drives, is naturally represented as an ergodic semi-Markov process and then artificially transformed into a process with starting and absorbing states. The method is validated with an example that shows how backlash and time delays affect the operation of the MR embedded digital control system.

Keywords: Embedded digital control system · Control algorithm · Von Neumann computer · Semi-Markov process · Lag · Transfer function · Backlash

1 Introduction
Aerial, terrestrial, above-water, and underwater mobile robots (MR) are widely used for solving mission tasks in transport, the military sphere, ecological monitoring, etc. [1–4]. The problems solved by the embedded digital control system of an MR are as follows: stabilization of the robot's spatial orientation, establishing the required longitudinal velocity, and managing the execution of a job with the onboard equipment. Solving any of the outlined problems implies the use of multiple control loops, each of which has a sensor and a drive. Due to the physics of the processes under control, the mentioned loops have cross-links, which influence system performance and stability. Furthermore, digital control systems are realized with the Von Neumann computer [5, 6] as hardware and embedded software as the managing subject. The managing program quests sensors, computes control actions, and outputs them to the MR drives, realizing a polling cycle that unfolds in physical time, due to which delays emerge between quests [7].

In real structures, the drives are loaded onto rudders, throttles, manipulator links, etc., which presupposes the presence of backlash in the joints of a drive with an executive unit. Both the time delays in control contours generated by the Von Neumann type computer and the natural cross-links between contours in the object under control, as well as the backlash at drives, give rise to the problem of estimating the multi-loop system performance at the design stage [8–10]. To solve the problem, one should develop an adequate model which takes into account the dynamics of the MR as a complex object under control, the dynamics of the digital controller, and the backlash at drives. As for the cross-links and backlash, their structure and parameters are determined by the onboard equipment used in the MR, and after construction completion they are poorly changeable. So the MR designer may deal with software performance only, and for the estimation of time lags he should have an adequate approach which permits evaluating the time intervals between quests of sensors and drives. Below, the semi-Markov simulation model [11–13] is proposed for time interval estimation in the wide practice of MR software design [7, 14].
Methods of assessing mobile robot performance at the design stage, which take into account both the natural physical processes in the onboard hardware and the time complexity of the embedded software, are not widely used, which proves the topicality of the problem being solved. In addition, the proposed method may be recommended for study by students specializing in computer sciences.

2 System Structure
The general flowchart of the system under investigation is shown in Fig. 1a [15]. The structure consists of the controllable equipment itself, linked with the digital controller (DC) through the interface. The system operates as follows. Managing vector codes uc are generated at the output of the DC and transmitted through the interface as vector signal u to the drives. The drives, by means of vector signal v, actuate the onboard equipment, whose state x is measured by sensors; their signal vector x0 is transformed through the interface into a sequence of vector codes x0c, which are inputted to the digital controller.
A special case, namely the flowchart of a two-parameter control system (i.e., longitudinal velocity and heading rotation), is shown in Fig. 1b. In the system under investigation, F(s) = [F1(s), F2(s)]θ is the desired MR state, generated by an external signal source; s is the Laplace differentiation operator; θ is the transpose operation sign. The feedback vector X0(s) = [X0,1(s), X0,2(s)]θ is generated by the proper sensors. When transmitted through the interface, analogue signals are converted into digital code sequences Fc(s) = [Fc,1(s), Fc,2(s)]θ and Xc,0(s) = [Xc,0,1(s), Xc,0,2(s)]θ. On the basis of Fc(s) and Xc,0(s), the controller computes the control data vector Uc(s) = [Uc,1(s), Uc,2(s)]θ, which, being transmitted through the interface, is transformed into the analogue vector U(s) = [U1(s), U2(s)]θ, which in turn feeds the drive inputs. Signal U(s) actuates the drive shafts, and their rotations Ṽ(s) = [Ṽ1(s), Ṽ2(s)]θ are transmitted through the joints with backlash to the rudder positions V(s) = [V1(s), V2(s)]θ, which in turn actuate the mechanisms of the robot equipment, whose state is determined by the vector X(s) = [X1(s), X2(s)]θ. The equipment state X(s) is measured by the sensors, which form the feedback signal X0(s) = [X0,1(s), X0,2(s)]θ.
Fig. 1. General structure (a) and flowchart (b) of two-loop MR digital control system

The dynamics of the physical processes in the MR is described by the matrix equation

X(s) = W(s) · V(s),   (1)

where W(s) is the matrix of transfer functions:

W(s) = [ W11(s)  W12(s)
         W21(s)  W22(s) ].   (2)
The forming of the feedback signal X0(s) is described by the matrix equation

X0(s) = W0(s) · X(s),   (3)

where

W0(s) = [ W0,1(s)  0
          0        W0,2(s) ]

is the diagonal feedback matrix; W0,1(s) and W0,2(s) are the transfer functions of the first and second sensors, correspondingly.
For processing, the signal vectors F(s) and X0(s) are transformed into the data vectors Fc(s) and Xc,0(s) and inputted into the controller element by element. Input and transformation of the vectors unfold in physical time, so between the forming of the vectors at the sensor outputs and the beginning of their processing there are time delays, and the vectors Fc(s) and Xc,0(s) may be defined as follows:

Fc(s) = Nf(s) · F(s);   (4)

X0,c(s) = N0(s) · X0(s);   (5)

where Nf(s) and N0(s) are delay matrices;

Nf(s) = [ exp(−τf,1 s)  0
          0             exp(−τf,2 s) ];   (6)

N0(s) = [ exp(−τ0,1 s)  0
          0             exp(−τ0,2 s) ];   (7)

τf,1, τf,2 are the time intervals spent for processing signals F1(s), F2(s), correspondingly; τ0,1, τ0,2 are the time intervals spent for processing signals X0,1(s), X0,2(s), correspondingly.
Similarly, the elements of vector U(s) emerge at the input of the drives with time delays τu,1, τu,2, and

U(s) = Nu(s) · Uc(s),   (8)

where

Nu(s) = [ exp(−τu,1 s)  0
          0             exp(−τu,2 s) ].   (9)

Signal U(s) acts on the drives, forming the movement of their shafts as follows:

Ṽ(s) = [ WA,1(s)  0
         0        WA,2(s) ] · U(s),   (10)

where WA,1(s), WA,2(s) are the transfer functions of the linear part of the drive description (from the input to the mechanical assemblies with a backlash).
The Von Neumann computer processes the digital signals Fc(s) and Xc,0(s) according to the control algorithm, in which the differentiation and integration operations in the calculation of the control action are expressed as finite-difference and finite-summation operations. Applying the Laplace transform to finite differences and finite summations, one obtains the so-called Z-transform of a discrete function [16–18], which has all the main properties of the Laplace transform of continual functions, such as linearity, the possibility of substituting, for the finite-difference operation in the discrete-argument domain, the multiplication of the Z-transform by the differentiating operator, and the possibility of substituting, for the finite-summation operation in the discrete-argument domain, the division of the Z-transform by the differentiating operator. So, when the sampling period tends to zero, finite-difference and finite-summation operations may be treated as continual differentiation and integration [19], and for the simulation of digital controller data processing the same mathematical apparatus of transfer functions may be used as for the simulation of the linear part of the MR description.
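This limiting behavior is easy to observe numerically; the following Python sketch (an illustration, not part of the authors' toolchain) builds a control law from finite differences and finite summations, which converges to the continuous PID law as the sampling period dt tends to zero (gain values are arbitrary).

import numpy as np

def discrete_pid(error, dt, kp=1.0, ki=0.5, kd=0.1):
    # Finite-summation approximates the integral, the backward finite
    # difference approximates the derivative; both become exact as dt -> 0.
    u = np.zeros_like(error)
    integral = 0.0
    prev_e = error[0]
    for n, e in enumerate(error):
        integral += e * dt                 # finite summation ~ integral
        deriv = (e - prev_e) / dt          # finite difference ~ derivative
        u[n] = kp * e + ki * integral + kd * deriv
        prev_e = e
    return u

t = np.arange(0.0, 1.0, 0.01)
print(discrete_pid(np.ones_like(t), 0.01)[:3])   # halving dt refines the law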
Using the notion of transfer functions, linear data processing in the controller may be performed as

Uc(s) = Wf,c(s) · Fc(s) − W0,c(s) · X0,c(s),   (11)

where Wf,c(s) and W0,c(s) are transfer function matrices;

Wf,c(s) = [ Wf,c,11(s)  Wf,c,12(s)
            Wf,c,21(s)  Wf,c,22(s) ];   (12)

W0,c(s) = [ W0,c,11(s)  W0,c,12(s)
            W0,c,21(s)  W0,c,22(s) ].   (13)

In (12) and (13), Wf,c,ij(s) are the transfer functions describing the processing of data Fc,i(s) to obtain data Uc,j(s), i ∈ {1, 2}, j ∈ {1, 2}; W0,c,ij(s) are the transfer functions describing the processing of data X0,c,i(s) to obtain data Uc,j(s), i ∈ {1, 2}, j ∈ {1, 2}.
So, expressions (1), (3), (4), (5), (8), (10), (11) form the linear part of the MR control system model, from the rudder positions to the positions of the drive shafts.
The static characteristic of the backlash nonlinearity in the mechanical joints may be simulated as shown in Fig. 2.

Fig. 2. Backlash static characteristics

The static characteristic is analytically simulated as follows:

v = κ(ṽ − α), when v = κ(ṽ − α) and dṽ(t)/dt > 0;
v = κ(ṽ + α), when v = κ(ṽ + α) and dṽ(t)/dt < 0;   (14)
v = const, when κ(ṽ − α) ≤ v ≤ κ(ṽ + α),

where α is the backlash width; κ is the transmitting coefficient; ṽ is the position of the drive shaft; v is the position of the rudder; t is the time.
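Equation (14) defines a play-type hysteresis, which in a time-stepped simulation reduces to clamping the previous output between the two engagement lines; the following Python sketch is an illustrative discretization, with the κ and α values chosen arbitrarily.

import numpy as np

def backlash_step(v_prev, v_tilde, kappa=1.0, alpha=0.05):
    # Discretized play operator of Eq. (14): the rudder position is clamped
    # between the two engagement lines kappa*(v_tilde -/+ alpha); inside the
    # gap the output simply holds its previous value.
    lo = kappa * (v_tilde - alpha)
    hi = kappa * (v_tilde + alpha)
    return min(max(v_prev, lo), hi)

# Drive shaft sweeps forward and back; the rudder trails by the gap width.
shaft = np.concatenate([np.linspace(0.0, 1.0, 50), np.linspace(1.0, 0.0, 50)])
v, trace = 0.0, []
for s in shaft:
    v = backlash_step(v, s)
    trace.append(v)
print(max(trace), trace[-1])   # peak ~0.95, final ~0.05 for alpha = 0.05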
The expressions

ṽ(t) = L⁻¹[Ṽ(s)];  V(s) = L[v(t)],   (15)

where ṽ(t) = [ṽ1(t), ṽ2(t)]θ, v(t) = [v1(t), v2(t)]θ, and L[...], L⁻¹[...] are the direct and inverse Laplace transforms, correspondingly, close the analytical description of the mobile robot digital control system.
As follows from the MR control system description, there are two aspects which decrease the performance of the mobile robot: time delays in the control loops and backlash in the drives. The latter may be remedied by the mechanical designer and/or the manufacturer of the drives. The time delays depend entirely on the controller hardware speed and the software time complexity, so at the software design stage it is necessary to estimate the time delays and take precautions to ensure that the lags caused by the software are minimized.

3 Estimation of Delays
Time intervals between quests to the elements of the vectors F(s), X0(s), U(s) may be estimated with the semi-Markov model of the managing software, represented by quest operators only, as follows:

h(t) = [hkl(t)] = [gkl(t) ⊗ pkl],   (16)

where h(t) is the K × K semi-Markov matrix; K is the number of quest operators; hkl(t) is the weighted time density of sojourn of the process in the k-th state before switching to the l-th state; pkl and gkl(t) are the probability (weight) and the pure time density.
The exemplary managing algorithm (Fig. 3) has the following particularities:
it is cyclic, with no looping;
it generates quests in an arbitrary sequence, but the same transaction cannot be generated twice per cycle;
there are no unenforceable operators in the algorithm [20].

Fig. 3. Structures of ergodic (a) and non-ergodic (b) semi-Markov process

Due to the particularities described above, the semi-Markov process (16) may be classified as an ergodic one, with the restrictions

0 < Tkl^min ≤ arg gkl(t) ≤ Tkl^max < ∞,  1 ≤ k, l ≤ K;   (17)
Σ_{k=1}^{K} pkl = 1;   (18)

∫_{Tkl^min}^{Tkl^max} gkl(t) dt = 1,   (19)

where Tkl^min and Tkl^max are the boundaries of the domain of the density gkl(t).
Due to the complex structure and ergodicity, there are certain difficulties in determining the time of wandering through the semi-Markov process (16) from an arbitrary state $k$ to an arbitrary state $l$. For a non-ergodic process there is a common formula for estimating such a parameter [7], namely

$$g_{kl}^{\Sigma}(t) = I_k^r \cdot L^{-1}\left\{\sum_{y=1}^{\infty}\big(L[h'(t)]\big)^y\right\} \cdot I_l^c, \quad (20)$$

where $I_k^r$ is the row vector whose $k$-th element is equal to one, with all other elements equal to zero; $I_l^c$ is the column vector whose $l$-th element is equal to one, with all other elements equal to zero; $h'(t)$ is the semi-Markov matrix of the non-ergodic process.
The transformation

$$h(t) \to h'(t) = \big[g'_{kl}(t) \cdot p'_{kl}\big] \quad (21)$$

(Fig. 3b) should be carried out according to the following technique:


the $k$-th state of $h'(t)$ becomes the starting one, so the $k$-th column should be zeroed;
the $l$-th state of $h'(t)$ becomes the absorbing one, so the $l$-th row should be zeroed too;
after zeroing the $k$-th column, the probability restriction (18) ceases to be met, so the probabilities should be recalculated as follows:

$$p'_{ij} = \frac{p_{ij}}{1 - p_{ik}}, \quad 1 \le i, j \le K, \; i \ne k, \; j \ne l, \quad (22)$$

where $p_{ij}$ are the probabilities of (16) and $p'_{ij}$ are the recalculated probabilities of $h'(t)$.


For the time density (20), the expectation and the dispersion may be calculated as usual [21]:

$$T_{kl}^{\Sigma} = \int_0^{\infty} t \cdot g_{kl}^{\Sigma}(t)\,dt; \quad (23)$$

$$D_{kl}^{\Sigma} = \int_0^{\infty} \big(t - T_{kl}^{\Sigma}\big)^2 \cdot g_{kl}^{\Sigma}(t)\,dt. \quad (24)$$

For the worst case of managing software operation, delays may be estimated according to the "three sigma rule" [22]:

$$\tau_{kl} = T_{kl}^{\Sigma} + 3\sqrt{D_{kl}^{\Sigma}}. \quad (25)$$

The estimations $\tau_{kl}$ give the numeric parameters of the delay matrices (6), (7) and (9).
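As a numerical illustration of (23)-(25) (not the authors' code), the expectation, dispersion, and worst-case delay can be computed from a discretized time density with NumPy; the grid and the bell-shaped density below are hypothetical placeholders for a density obtained from (20).

```python
import numpy as np

# Hypothetical discretized time density g(t) on its domain [T_min, T_max]
t = np.linspace(0.001, 0.1, 1000)          # time grid, s
g = np.exp(-((t - 0.03) ** 2) / 2e-5)      # unnormalized bell-shaped density
g /= np.trapz(g, t)                        # enforce restriction (19)

T = np.trapz(t * g, t)                     # expectation (23)
D = np.trapz((t - T) ** 2 * g, t)          # dispersion (24)
tau = T + 3.0 * np.sqrt(D)                 # "three sigma" estimate (25)
print(f"T = {T:.4f} s, D = {D:.3e}, tau = {tau:.4f} s")
```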

4 Two-Loop Digital Control System Analysis


The digital system of MR longitudinal movement and direction control with the structure shown in Fig. 1 is analyzed. The transfer functions of the linear part of the object under control are as follows:

$$W_{11}(s) = W_{21}(s) = W_{22}(s) = \frac{1}{0.1s + 1}; \quad W_{12}(s) = \frac{-1}{0.1s + 1}.$$

The feedback signal is formed by non-inertial sensors with transfer functions $W_{0,1}(s) = W_{0,2}(s) = 1$. The input signals are standard Heaviside functions:

$$\eta(t) = \begin{cases} 0, & \text{when } t \le 0, \\ 1, & \text{otherwise.} \end{cases}$$

The linear part of the drives' description is represented by the transfer functions

$$W_{A,1}(s) = \frac{1.2}{s(0.05s + 1)}; \quad W_{A,2}(s) = \frac{10}{s(0.05s + 1)}.$$
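For reference, the step response of one drive channel can be reproduced with SciPy; this is an illustrative sketch that ignores the cross-links, delays, and backlash analyzed in the figures below.

```python
import numpy as np
from scipy import signal

# Drive transfer function W_A1(s) = 1.2 / (s * (0.05 s + 1))
W_A1 = signal.TransferFunction([1.2], [0.05, 1.0, 0.0])

# Response to the Heaviside input eta(t)
t = np.linspace(0.0, 2.0, 500)
t, y = signal.step(W_A1, T=t)
# The integrating drive ramps indefinitely for a constant input,
# which is why the position feedback lines close the loop in Fig. 1.
print(y[-1])
```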
Fig. 4 shows the system performance without lag and backlash.

Fig. 4. System performance without lag and backlash

Fig. 5 shows the system performance with lag and without backlash. The lag in line $X_{0,1}(s)$ is equal to zero; the lags in lines $U_1(s)$, $U_2(s)$, and $X_{0,2}(s)$ are equal to 0.01, 0.015 and 0.02 s, correspondingly.
Fig. 6 shows the system performance with lag and backlash. The backlash has the following parameters: $\kappa = 1$, $\alpha = 0.01$.
The plots indicate that cross-links, backlash, and the delays caused by embedded software operation are destabilizing factors which worsen digital control system performance, namely, they increase overshooting and the response time to the input signal. Therefore, when designing a control system based on a Von Neumann computer, one should simulate it taking all three factors into account.

Fig. 5. System performance with lag and without backlash

Fig. 6. System performance with lag and backlash

5 Conclusion

A common method of MR control system simulation at the design stage has been worked out. Destabilizing factors, such as backlash in the actuators, cross-links between channels in the object under control, and time delays caused by the embedded software, were taken into account in the model. The proposed method of estimating the runtime between quests to actuators and sensors permits the analysis of control algorithms of arbitrary complexity. Complicating the control algorithm to improve system performance may have the opposite effect due to the appearance of excessive delays in the control contours.
Further investigations in this domain may be directed to analytically linking the structural-parametric complexity of the control algorithm with the system performance as a whole, and to working out methods of algorithm synthesis that are optimal with respect to the complexity-quality ratio.

Acknowledgement. The study was carried out with the support of the Russian Ministry of Education, contract No. 073-03-2021-019/2 of 21.07.2021.

References
1. Tzafestas, S.G.: Introduction to Mobile Robot Control. Elsevier, United States of America
(2014)
2. Kahar, S., Sulaiman, R., Prabuwono, A.S., Akma, N., Ahmad, S.A., Abu Hassan, M.A.:
A review of wireless technology usage for mobile robot controller. In: 2012 International
Conference on System Engineering and Modeling (ICSEM 2012). International Proceedings
of Computer Science and Information Technology IPCSIT, vol. 34, pp. 7–12 (2012)
3. Cook, G.: Mobile robots: Navigation, Control and Remote Sensing, p. 319. Wiley-IEEE Press,
Hoboken (2011)
4. Siciliano, B.: Springer Handbook of Robotics, p. 1611. Springer-Verlag, Berlin, Heidelberg
(2008)
5. Landau, I.D., Zito, G.: Digital Control Systems, Design, Identification and Implementation,
p. 484. Springer, Heidelberg (2006)
6. Åström, K.J., Wittenmark, B.: Computer Controlled Systems: Theory and Design, p. 557.
Tsinghua University Press. Prentice Hall, Hoboken (2002)
7. Larkin, E., Ivutin, A., Kotov, V., Privalov, A.: Semi-Markov modelling of commands exe-
cution by mobile robot. In: Ronzhin, A., Rigoll, G., Meshcheryakov, R. (eds.) Interactive
Collaborative Robotics. ICR 2016. LNCS, vol. 9812, pp. 189–198. Springer, Cham (2016).
https://doi.org/10.1007/978-3-319-43955-6_23
8. Wu, R., Fan, D., Iu, H.H.-C., Fernando, T.: Adaptive fuzzy dynamic surface control for
uncertain discrete-time non-linear pure-feedback mimo systems with network-induced time-
delay based on state observer. Int. J. Control 92(7), 1707–1719 (2019)
9. Li, D., Chen, G.: Impulses-induced p-exponential input-to-state stability for a class of
stochastic delayed partial differential equations. Int. J. Control 92(8), 1805–1814 (2019)
10. Wu, M., He, Y., She, J.H., Liu, G.P.: Delay-dependent criteria for robust stability of time-
varying delay systems. Automatica 40(8), 1435–1439 (2004)
11. Limnios, N., Swishchuk, A.: Discrete-time semi-Markov random evolutions and their
applications. Adv. Appl. Probab. 45(1), 214–240 (2013)
12. Bielecki, T.R., Jakubowski, J., Niew˛egłowski, M.: Conditional Markov chains: properties,
construction and structured dependence. Stoch. Process. Appl. 127(4), 1125–1170 (2017)
13. Janssen, J., Manca, R.: Applied Semi-Markov Processes, p. 310. Springer US, Heidelberg
(2005)
14. Arnold, K.A.: Timing analysis in embedded systems. In: Ganssler, J., Arnold, K., et al. (eds.)
Embedded hardware MA. 01803 USA, pp. 239–272. Elsevier Inc. (2008)
15. Fadali, M.S., Visioli, A.: Digital Control Engineering: Analysis and Design, pp. 239–272.
Elsevier Inc., Amsterdam (2013)
16. Pavlov, A.V.: About the equality of the transform of Laplace to the transform of Fourier.
Issues Anal. 5(23(4(76))), 21–30 (2016)
17. Li, J., Farquharson, C.G., Hu, X.: Three effective inverse Laplace transform algorithms for
computing time-domain electromagnetic responses. Geophysics 81(2), E75–E90 (2015)
18. Yeh, Y.-C., Chu, Y., Chiou, C.W.: Improving the sampling resolution of periodic signals by
using controlled sampling interval method. Comput. Electr. Eng. 40(4), 1064–1071 (2014)
19. Pospíšil, M.: Representation of solutions of delayed difference equations with linear parts
given by pairwise permutable matrices via Z-transform. Appl. Math. Comput. 294, 180–194
(2017)
20. Larkin, E.V., Bogomolov, A.V., Privalov, A.N.: A method for estimating the time intervals
between transactions in speech-compression algorithms. Autom. Doc. Math. Linguist. 51(5),
214–219 (2017)
21. Kobayashi, H., Marl, B.L., Turin, W.: Probability, Random Processes and Statistical Analysis,
p. 812. Cambridge University Press, Cambridge (2012)
22. Pukelsheim, F.: The three sigma rule. Am. Stat. 48(2), 88–91 (1994)
A Novel Approach in Breast Cancer Diagnosis
with Image Processing

Nahida Nazir1(B) , Baljit Singh Saini1 , and Abid Sarwar2


1 Lovely Professional University, Punjab, India
nahidanzir449@gmail.com
2 University of Jammu, Jammu, India

Abstract. Declaring breast lumps as cancerous through segmentation techniques alone is not possible, since mammography reports should be followed by a biopsy. We propose an Otsu method with two threshold values, followed by an iterative approach, to split the image into cancerous and non-cancerous regions, since both techniques are time efficient. Thresholding with the Otsu and iterative approaches does differentiate the cancerous from the non-cancerous parts of the image, but the challenging task is setting the threshold value; with minimal threshold values of 0.5 and 0.8, the algorithm performs well, and the area damaged by the disease is also estimated by implementing various Matlab functions. This research will assist radiologists in properly analyzing the mammography report and acting accordingly on the affected part.

Keywords: Malignant · Threshold · Lump · Image processing

1 Introduction
When cells keep growing unmanageably and squeeze out normal cell growth, cancer arises [1]. Cancer can grow in any part of the body. Cancerous cells not only affect the confined portion of the body but spread to other regions [2]. Metastasis is the spreading of cancer cells, and breast cancer is no exception; the influence of single-agent antibodies against programmed death-ligand 1 (PD-L1) as maintenance therapy is undisclosed in patients with metastatic breast cancer, so women are highly vulnerable to death in such cases [3]. Breast cancer can spread to different body parts such as the brain, liver, bone marrow, and lungs. Screening of the breast is carried out with a low-dose X-ray, that is, a mammogram [7]. The magnetic resonance imaging technique is sometimes recommended, but it does not find cancers with as high an efficiency as mammography does. Breast cancer can develop due to various factors, such as estrogen level, gene mutation, family history, and pathogens that may promote breast cancer. Mutation of oncogenes along with anti-oncogenes initiates tumor growth. Breast cancers are the second-highest cause of death among women. Approximately twenty-five percent of all females having cancer are confirmed with breast cancer yearly [4]. Thirty percent of women in America suffer from breast cancer. An early breast cancer diagnosis is a challenging task [5]. Some cells grow fast, others grow slowly. A tumor lump also gets formed due

to cancer cells [6]. A piece of the lump is taken out to diagnose cancer. The medical practitioner needs to know how far the cells have spread and where the cancer started, which is called the cancer stage. Different tests are done to determine the cancer stage; stage 1 or 2 is a lower stage, while stage 3 or 4 is a higher-stage cancer. The highest stage is stage 4, which means the cancer has spread the most.
Due to limitations in 2D ultrasound imaging, such as a suboptimal projection angle, which makes it very difficult to locate an anomaly, 3D imaging has arisen; it is far better than 2D and allows medical experts to analyze and visualize the 3D report more precisely. Image segmentation has achieved great success in the diagnosis and follow-up of breast cancer [20].
As mentioned above, MRI and mammography are commonly used by doctors for screening cancer due to their affordable price and handiness [10]. However, MRI and mammography screening have certain drawbacks: MRI has a high examination cost and is not able to detect all types of breast cancer, so mammography is widely used, but it also has limitations, such as reliability issues and false-positive and false-negative consequences. A false-negative (FN) analysis represents a negative result although the cancer is present; conversely, a mammogram may show an abnormality although the cancer is not present. The purpose of this research is therefore to provide an enhanced technique for better visualization of breast cancer in various mammograms. It is concluded that implementing thresholding segmentation has a beneficial effect on breast cancer diagnosis in mammograms, helping medical experts reach a better diagnosis.

2 Organization of the Work

The paper is organized as follows: Sect. 3 contains the related research on breast cancer. Section 4 explains the methodology and techniques adopted during the research. The 5th section presents the general discussion of the research and the experimental results. A comparative analysis is discussed in the 6th section, and the conclusion and the purpose of the research are highlighted in the 7th section. Finally, the future scope is discussed in the 8th section of the paper.

3 Related Work


Kekre et al. explained a segmentation technique for the diagnosis of tumors in mammogram images using the vector quantization method [8]. Vector quantization is frequently used for data compression and is based on probability functions applied in data compression techniques. Their approach outperformed the watershed algorithm and LBG (Linde-Buzo-Gray), and it does not suffer from under-segmentation and over-segmentation problems.
Wener Borges Sampaio et al. explained a computational method for the diagnosis of tumors in mammogram images. The first step was preprocessing the images by applying various filters so that the internal breast structure could be highlighted properly. The second step implemented a cellular neural network to segment the tumor area; shape descriptors were analyzed, thus differentiating tumor and non-tumor cells [9].

Abo et al. presented a technique to discover suspicious regions in mammography reports. The technique uses the Fisher information measure. Prakash et al. highlighted a concept for diagnosing infectious tissues in the following way: (1) background information is filtered with a thresholding method; (2) a preprocessing filter is applied, contours are extracted from the binary image, which gives an alternative representation of the image, and the affected portion is detected with a thresholding function.
Shruti Dalmiya et al. applied the K-means algorithm with wavelet transformation for the detection of tumors. Wavelet transformation facilitated the visualization of images at different orientations and scales, and K-means proved a better algorithm for the detection of tumors [11]. S. Saheb et al. implemented fuzzy c-means clustering with morphological operators to detect breast cancer; early diagnosis of mass cells can be achieved with this approach [12].
Bovis et al. presented a technique based on feature extraction for breast cancer detection with testing and training of the data [13]. A difficult task in breast cancer is to determine the boundaries of masses precisely. Fuzzy logic differentiates malignant tumors from non-malignant ones based on tissue density, so two different segmentation techniques that integrate fuzzy logic were implemented. The region-growing method discovers the boundaries of the tumor; a preprocessing step is also followed to improve the region of interest. The fuzzy region-growing method accounts for the uncertainty present around the tumor boundaries. The fuzzy ribbon classifies tumors as malignant or benign with 0.8 sensitivity and 0.9 specificity [14].
Images are associated with different types of anomalies, due either to the sampling of data or to different types of noise. The authors suggested interval analysis with an edge detection method that implements the Laplacian of Gaussian to perform the segmentation. The MIAS database was used for the experiment, and noises such as salt-and-pepper and Gaussian noise were introduced to check the efficiency of the proposed method. Performance comparison with Prewitt, LoG, and Canny filters based on PSNR reveals that the suggested method outperforms the others [15].
Punitha et al. noted that early diagnosis of breast masses will reduce the death rate among women; thus, an automatic technique for diagnosing breast masses will assist practitioners in making a precise diagnosis. Pre-processing of the images was done with a Gaussian filter on the DDSM database, and segmentation used an optimized region-growing technique with a particle swarm optimization variant known as Dragonfly optimization. Feature extraction from the area of interest was done with GLCM and GLRLM techniques, and training was performed with the back-propagation algorithm on 300 images, categorizing them as benign or malignant. Comparison of the techniques was carried out through ROC analysis; the sensitivity and specificity achieved were 98.1% and 97.8% [16].
Zeebaree et al. observed that breast cancer is one of the main causes of death among women, and ultrasound is an imaging technique used for its diagnosis. The initial step in recognizing an anomaly of breast cancer is to locate the region of interest (ROI). The authors suggested a new method to extract the ROI and minimize false positives, based on local pixel information and a neural network. During the training phase, a model was developed by extracting various patches from the region of interest and the background. The testing phase is based on scanning the images with a fixed-size window to highlight the ROI against the background; a distance transform is then implemented to differentiate the ROI and remove non-ROI regions. The dataset consists of 250 ultrasound images (150 benign and 100 malignant), and the accuracy achieved is 95.4% [17].
Zhang et al. diagnosed breast cancer with deep learning. Image pre-processing was done to enhance image quality, and classification was achieved with transfer learning; the comparison used three parameters (AUC, sensitivity, and specificity) for a support vector machine, AlexNet, and GoogLeNet. The study revealed that combining deep learning features with photoacoustic imaging produces outstanding results. Mammography images were categorized into six grades with the help of segmentation; the database used was LAPIMO EESC/USP (Laboratory of Analysis and Processing of Medical and Dental Images), and the algorithm showed satisfactory results [18].
Zebari et al. proposed an enhanced threshold technique and a trainable segmentation method to highlight the ROI. A hybrid approach was used that combines thresholding with machine learning. The preliminary breast boundary was recognized through a thresholding technique, and various masks were implemented to refine the overestimated boundaries. HOG features and a neural network were used to determine the ROI. The database used was mini-MIAS. The pectoral muscle segmentation proved better than manual segmentation, with average accuracies of 99.31% for the breast region boundary segmentation and 98.13% for the pectoral muscle segmentation [19] (Fig. 1).

Fig. 1. Thresholding hierarchy

4 Proposed Methodology
One of the fundamental segmentation techniques is the threshold technique. Mathematically it can be represented as:

G(x, y) = 1 if the value of f(x, y) is greater than the threshold T;
G(x, y) = 0 otherwise.

Thresholding techniques have been categorized into two major groups: global thresholding and local thresholding. The former is further categorized into Otsu thresholding, iterative thresholding, and triclass techniques.

4.1 Global Thresholding

A single value of the threshold is applied to all the pixels. The foreground and background have different pixel intensities, so a single threshold value differentiates between the foreground and background pixels. Thresholding techniques that follow global values are the Otsu method, the iterative method, triclass, and entropy-based methods. The Otsu method is based on iterating over threshold values; the threshold value is searched in a way that minimizes the intra-class variance. Global thresholding has applications in pattern recognition and preprocessing. The iterative method starts by applying the Otsu method so that the image is classified into two separate regions and an Otsu threshold is defined. Three regions are then formed: the foreground region, the background region, and the to-be-determined region. The foreground region's pixel intensity is higher than the larger of the two class means, the background pixel intensity lies below the smaller mean value, and the to-be-determined region has pixel intensity in between the two.
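To make the Otsu criterion concrete, a minimal NumPy implementation of the threshold search (maximizing the between-class variance, which is equivalent to minimizing the intra-class variance) is sketched below; this is an illustration, not the authors' Matlab code.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the Otsu threshold of an 8-bit grayscale image array."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()                      # grey-level probabilities
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()   # class weights
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = (levels[:t] * prob[:t]).sum() / w0  # background mean
        mu1 = (levels[t:] * prob[t:]).sum() / w1  # foreground mean
        var_between = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t
```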

4.2 Local Thresholding

Applying a single threshold value is not suitable for an uneven distribution of pixel intensity, so a local threshold value is applied in such images. The value of the threshold depends on the grey levels of the image f(x, y) and on features such as the variance of neighboring pixels. Adaptive thresholding converts a colored or grey image into a simple binary image. Foreground pixels are set where the intensity is greater than the threshold, and background pixels are set where the intensity is below it. In adaptive thresholding, distinct threshold values are used for distinct regions. Chow and Kaneko's method splits images into numerous sub-images, but this technique suffers from a high computational cost for each region and hence is not applied to real-time applications.
In this paper, we highlight a basic method to find the cancerous regions of a mammography image and the area covered by the disease. The blueprint of our research is portrayed in Fig. 2.
Since mammography reports are rich in artifacts due to the patient's body movement and electromagnetic variations, a filter is applied so that the variations are suppressed while the fine details of the breast area and the edge properties are preserved. This is achieved by a median filter; image enhancement and sharpening are done so that key features can be easily identified. The reports are converted into black and white so that foreground and background areas are differentiated for further analysis. The threshold value is then varied to examine the different results on a single mammography image; it was observed that threshold values of 0.5 and 0.8 are better than 0.2 or values above 0.8, since the former leads to an under-segmentation problem and the latter suffers from over-segmentation.

Fig. 2. Flowchart of various steps followed

5 Experimental Results and Discussion

Mammography reports were collected from different diagnostic centres in Kashmir. After collecting the data, the next step is pre-processing of the images; since the images contain noise that arises either from the patient's body movement during the scans or from electromagnetic fluctuations, various filters could be applied to clean the data, but median filters were used for specific reasons, such as filtering random noise and Gaussian noise while preserving edge properties. The experiment was conducted with the Matlab tool, as it is rich in packages, including ones for visualizing the results in different ways. The next step after pre-processing is importing the mammography reports as input through Matlab. After setting the path in Matlab where the image is present, the function imread('name of image.jpg') reads an image with image information as <350*270*3 uint8>. The colored image is converted to grey, and the threshold is set manually, which is the most challenging task in segmentation. In the figure, the threshold value is set at 0.5, and the white patch, that is, the mass, is slightly differentiated from the image, but the result is still not clear with this threshold. In the next step, the threshold value is increased to 0.8 so that the cancerous mass is differentiated from the normal mass of the breast. The black region of the breast is the normal part, and the whiter patches represent the cancerous region. In the first and second images, the Otsu method is applied, which is one of the simpler techniques for setting thresholds. In the third figure, an iterative approach is implemented, for which we used two for-loops to get the result. If we observe the third image closely, it shows the over-segmentation problem, that is, a normal mass of the breast was also declared cancerous, which is the main pitfall of applying the iterative approach. The over-segmented part is circled at the top of the image. After successful segmentation, the area of the breast that has been damaged by the disease is calculated; this will assist the radiologist in deciding whether chemotherapy, radiotherapy, or breast removal is the better treatment to prevent further spread of the disease, depending on how much area has been affected (Figs. 3, 4 and 5).
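The authors implement this pipeline with Matlab functions (imread, binarization with a manual threshold, and area calculation). A rough Python equivalent, shown only as an illustration and using a hypothetical file name, is sketched below.

```python
import numpy as np
from PIL import Image
from scipy.ndimage import median_filter

img = np.asarray(Image.open("mammogram.jpg").convert("L"))  # hypothetical file
img = median_filter(img, size=3)       # suppress noise while preserving edges
norm = img / 255.0                     # intensities scaled to [0, 1]

for T in (0.5, 0.8):                   # the two thresholds studied above
    bw = norm > T                      # white pixels = candidate cancerous mass
    area_px = int(bw.sum())            # pixel count, analogous to Matlab's bwarea
    frac = area_px / bw.size
    print(f"T = {T}: affected area {area_px} px ({100 * frac:.2f}% of image)")
```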

Fig. 3. Threshold = 0.5 Fig. 4. Threshold = 0.8 Fig. 5. Iterative approach

6 Comparative Analysis
The Otsu method of segmenting the target area is one of the most convenient methods for extracting the area of interest. The cancerous area is clearly differentiated between foreground and background, so that the damaged area can be visualized easily, even with the naked human eye, making this technique efficient in radiology. The proposed methodology was compared with pre-existing techniques, namely the watershed algorithm and the iterative approach. The watershed algorithm suffers from overcutting and discards numerous segments, even from the area of interest, resulting in segmentation errors, while the iterative approach suffers from over-segmentation and thus produces vague results; both problems are prevented in the Otsu method, so Otsu performed best among these techniques.

7 Conclusion
In our observation, mammography reports are often interpreted inaccurately because not all the lumps formed in the breast are cancerous. Lumps may or may not be cancerous; here, the extracted mass was proved to be cancerous through biopsy tests. The main aim of this research is to elucidate breast mass detection with segmentation techniques and to calculate the affected area with the help of Matlab. Applying these techniques, a cancerous region was differentiated from a non-cancerous one by setting different threshold values, and once a region was declared cancerous, one of Matlab's functions for area calculation was used to estimate the affected part of the breast: bwarea(BW) aids in calculating the cancerous mass. The calculated mass of the above breast was estimated at 3.257. The purpose is to expose the wrong interpretation of mammography reports when applying segmentation alone, since it should be followed by a biopsy test to reach a conclusion and to suggest an appropriate treatment on time through early diagnosis of cancerous and non-cancerous masses with different thresholds.

8 Future Scope
Different investigations and critical analyses of other segmentation processes were left out due to lack of time. In future research, we would like to apply various deep learning and segmentation techniques to cervical images to address the different issues faced by most segmentation techniques. Automatic image segmentation could also be explored by training the algorithms on primary or secondary datasets to provide more self-regulating segmentation techniques that do not need manual interference.

References
1. Jain, A., Jain, A., Jain, S.: Artificial Intelligence Techniques in Breast Cancer Diagnosis and
Prognosis, vol. 39. World Scientific, Singapore (2000)
2. Medline Plus. http://www.medlineplus.gov/breastdisease. Accessed 1 Aug 2021
3. Bachelot, T., et al.: Durvalumab compared to maintenance chemotherapy in metastatic breast
cancer: the randomized phase II SAFIR02-BREAST IMMUNO trial. Nat. Med. 27(2), 250–
255 (2021)
4. Sun, Y.S., et al.: Risk factors and preventions of breast cancer. Int. J. Biol. Sci. 13(11), 1387
(2017)
5. Siegel, R.L., Miller, K.D., Jemal, A.: Cancer statistics, 2016. CA Cancer J. Clin. 66(1), 7–30
(2016)
6. Abdul Aziz, A.A., Md Salleh, M.S., Ankathil, R.: Clinicopathological and prognostic char-
acteristics of Malaysian triple negative breast cancer patients undergoing TAC chemotherapy
regimen. Int. J. Breast Cancer 1(8) (2020)
7. Guzman-Cabrera, R., et al.: Digital image processing technique for breast cancer detection.
Int. J. Thermophys. 34(8–9), 1519–1531 (2013)
8. Kekre, H.B., Sarode, T.K., Gharge, S.M.: Tumor detection in mammography images using
vector quantization technique. Int. J. Intell. Inf. Technol. Appl. 2(5), 237–242 (2009)
9. Sampaio, W.B., Diniz, E.M., Silva, A.C., De Paiva, A.C., Gattass, M.: Detection of masses
in mammogram images using CNN, geostatistic functions and SVM. Comput. Biol. Med.
41(8), 653–664 (2011)
10. Bethapudi, P., Reddy, E.S., Srinivas, Y.: Detection and identification of mass structure in
digital mammogram. Int. J. Comput. Appl. 78(14), 17–20 (2013)
11. Dalmiya, S., Dasgupta, A., Datta, S.K.: Application of wavelet based K-means algorithm in
mammogram segmentation. Int. J. Comput. Appl. 52(15), 15–19 (2012)
12. Basha, S.S., Prasad, K.S.: Automatic detection of breast cancer mass in mammograms using
morphological operators and fuzzy c-means clustering. J. Theor. Appl. Inf. Technol. 5(6),
704–709 (2009)
13. Bovis, K., Singh, S.: Classification of mammographic breast density using a combined classi-
fier paradigm. In: Proceedings of the 4th International Workshop on Digital Mammography,
pp. 177–180 (2002)
14. Guliato, D., Rangayyan, R.M., Carnielli, W.A., Zuffo, J.A., Desautels, J.L.: Segmentation of
breast tumors in mammograms using fuzzy sets. J. Electron. Imaging 12(3), 369–378 (2003)
15. Liu, Q., Liu, Z., Yong, S., Jia, K., Razmjooy, N.: Computer-aided breast cancer diagnosis
based on image segmentation and interval analysis. Automatika 61(3), 496–506 (2020)
16. Punitha, S., Amuthan, A., Joseph, K.S.: Benign and malignant breast cancer segmentation
using optimized region growing technique. Future Comput. Inform. J. 3(2), 348–358 (2018)
17. Zeebaree, D.Q., Haron, H., Abdulazeez, A.M., Zebari, D.A.: Machine learning and region
growing for breast cancer segmentation. In: International Conference on Advanced Science
and Engineering 2019, pp. 88–93. IEEE, Iraq (2019)
18. Zhang, J., Chen, B., Zhou, M., Lan, H., Gao, F: Photoacoustic image classification and
segmentation of breast cancer: a feasibility study. IEEE Access 20(7), 5457–5466 (2018)
476 N. Nazir et al.

19. Zebari, D.A., Zeebaree, D.Q., Abdulazeez, A.M., Haron, H., Hamed, H.N.A.: Improved
threshold based and trainable fully automated segmentation for breast cancer boundary and
pectoral muscle in mammogram images. IEEE Access 8, 203097–203116 (2020)
20. Gu, P., Lee, W.M., Roubidoux, M.A., Yuan, J., Wang, X., Carson, P.L.: Automated 3D ultra-
sound image segmentation to aid breast cancer image interpretation. Ultrasonics 65, 51–55
(2016)
Model Based Model Reference Adaptive Control
of Dissolved Oxygen in a Waste Water
Treatment Process

Mohamed Bahita(B) , Mouatez Bilah M’Haimoud, and Abdelmoula Ladjabi

Department of Chemical Engineering, Constantine 3 University, Constantine, Algeria


mbahita@yahoo.fr, mohamed.bahita@univ-constantine3.dz

Abstract. With the development of new advanced control techniques, adaptive systems theory has become one of the diverse solutions playing an important role in the domain of industrial process control, in particular for complex and nonlinear systems. In this work, based on the recursive least squares identification method, we propose a Model Reference Adaptive Control (MRAC) study applied to the control of the dissolved oxygen concentration in a nonlinear system, an activated sludge bioreactor. This process is widely used for wastewater treatment and purification. The simulation results obtained with the proposed MRAC are compared to those of a classical PI control method. The results are validated by simulation in the MATLAB environment.

Keywords: MRAC and PI controllers · Recursive least square method · DO in


waste water treatment process

1 Introduction

The automatic control of systems in general is a complicated problem because of nonlinearities, disturbances that are difficult to measure, and uncertainties in the parameters of the systems to be controlled. Regulation arose especially through control methods designed to maintain certain variables (temperature, pressure, flow, level, concentration, speed, etc.) in the vicinity of their desired value, called the set point, which can be fixed or variable with time. One of the challenges of automatic control is to offer a controller suited to the system to be controlled, guaranteeing the achievement of the desired task (generally given in a specification). Fuzzy logic is one of the artificial intelligence methods used for this purpose [1]. The purpose of the control or regulation systems installed on processes is to ensure stable operation of the processes, to minimize the influence of disturbances, and to optimize the overall performance [2].
In fact, conventional regulators of the PID type (Proportional, Integral, and Derivative) with adjustable parameters can solve a large number of control problems. However, researchers have found that traditional PID controllers with fixed parameters are not always able to provide the desired performance, especially when the parameters and the static and dynamic characteristics of the system to be controlled vary

over time. This sometimes requires new theoretical developments in control. In these critical cases, more advanced control techniques are used, such as fuzzy PIDs [3] or, more particularly, adaptive PIDs based on adaptive control. The latter differs from an ordinary controller in that its parameters are variable and adjusted by a mechanism that acts in real time, based on the state of the controlled system.
The main objective of this work is to apply the technique of Model Reference Adaptive Control (MRAC) to the processes of treatment and purification of wastewater (regulation of the Dissolved Oxygen (DO) concentration in an activated sludge bioreactor), and to study two types of control: a conventional (proportional-integral) control and an adaptive control. This work is divided into three parts:

• The second part gives a brief description of the activated sludge wastewater treatment process in general.
• The third part is devoted to the presentation of the theoretical basis of the model reference adaptive control (MRAC) method.
• In the fourth part, we present comparative simulation results of the two control structures, MRAC and a conventional Proportional-Integral (PI) regulator, applied to a bioreactor.

2 Waste Water Treatment Process


Water is a precious commodity and an essential resource for humans, their activities, and their environment. The continued increase in the world's population, its concentration in cities, and industrialization have resulted in a continual increase in the need for water. Water is subject to various forms of pollution and degradation, and ecosystems and human health are directly impacted. Pollution in water comes from various sources: industrial, domestic, or agricultural. The use of wastewater treatment plants has reduced the impact of wastewater on the natural environment by de-polluting it. Water treatment processes are made up of several phases; due to its exceptional performance, the biologically activated sludge treatment stage represents a key stage in the entire treatment chain. However, its function depends on the development of bacterial populations, and it is also the most difficult to control [4].

2.1 Biological Treatment with Activated Sludge


The treatment line is composed of a bioreactor, a clarifier/settler tank (decanter), and a sludge recycling loop [5], as shown in Fig. 1.

3 Adaptive Control
Adaptive control is a set of techniques used for on-line, real-time automatic adjustment of control loops (regulators) to achieve or maintain a certain level of performance when the parameters of the process to be controlled are unknown and/or change with time [6].

Fig. 1. Diagram of the activated sludge purification process.

3.1 Model Reference Adaptive Control (MRAC)


Model reference adaptive control is one of the most widely used adaptive control approaches, in which the desired performance is specified in a model that can be imposed in closed loop using a corrector; hence the term reference model. This model gives an indication of how the system output should ideally respond to a reference (input) signal. The technique consists of adjusting the parameters of the regulator according to the error between the process and the reference model [7].

3.2 MIT Method


This method is used to design a Model Reference Adaptive Control (MRAC). It is known as the MIT rule, developed at the instrumentation laboratory of MIT (Massachusetts Institute of Technology) in America. This solution was then improved, and from it comes adaptive control. To apply the MIT rule, we consider a closed-loop system in which the controller has a vector θ of parameters, and the desired closed-loop response (desired specification) is specified in terms of a reference model output ym. The general structure of the MRAC is given in Fig. 2.
The adjustment of the regulator parameters is done in such a way as to minimize a quadratic cost function defined by the following equation [6]:

$$J(\theta) = \frac{1}{2}\,e^2(t) \quad (1)$$

$\theta$ contains the parameters of the PI regulator $U(t)$, which has the following law:

$$U(t) = K_P e(t) + K_i \int_0^t e(t)\,dt, \quad \text{where } \theta = \begin{bmatrix} K_p \\ K_i \end{bmatrix} \quad (2)$$

$$\text{with: } e(t) = ym(t) - C(t) \quad (3)$$

ym(t) is the reference model output and C(t) is the output of the controlled process (dissolved oxygen).

Fig. 2. General structure of MRAC with a PI controller

4 Application of the MRAC Method for the Control of the Dissolved Oxygen in a Wastewater Treatment Process
The wastewater treatment process is a highly nonlinear system characterized by several dynamic effects, such as changes in the concentration and composition of wastewater substrates and different pollutants. All of these parameters complicate the control tasks in these processes. Wastewater treatment strategies are based on the selection of parameters that ensure a better assessment of pollutants. The dissolved oxygen (DO) concentration is therefore an important parameter to control in the activated sludge process. This concentration must be kept far from the critical value in order to provide sufficient oxygen to maintain microbial activity. It can be controlled by adjusting the air flow into the bioreactor.
Among the many control methods that have been successfully applied to the regulation of DO are the classical PI and the fuzzy PI [1]. Other advanced methods also exist, as in [8], for example, where the authors proposed a fuzzy logic method based on a fuzzy controller of the Takagi-Sugeno type and a fuzzy estimator of the Mamdani type to control the DO in the wastewater treatment process.
In what follows, we apply in simulation an advanced control that is suitable for a variable environment (external disturbances, changes in process parameters, etc.) and compare it with the classical PI: the MRAC method. Model reference adaptive control (MRAC) can be considered an adaptive control system in which the desired performance is expressed in terms of a reference model. The particularity of this adaptive control scheme is that the parameters of the controller (a PI regulator in our case) are adapted and changed with the changes in the environment and with the variations of the parameters of the process to be controlled.

4.1 Mathematical Model of the DO Process Dynamics


It is assumed that the volume of the bioreactor is constant; the table below collects the
values of the initial feed and the parameters of the bioreactor. These values correspond to

the characteristics of wastewater treated at constant temperature. The following Table 1 gives a summary of all typical values of the parameters and the initial conditions for feeding the bioreactor [8].

Table 1. Typical values of the parameters and initial conditions for feeding the bioreactor.

Parameters Signification Values


C DO concentration 2 mg/l
Cs DO Saturation concentration 9.17 mg/l
C0 DO concentration in the influent 0.1 mg/l
χ Concentration of micro-organisms 150 mg/l
S Substrate concentration 142 mg/l
S0 Substrate concentration in the influent 270 mg/l
Qin Influent flow 4.2 m3 /min
QL Air flow 0.1 m3 /min
V Bioreactor volume 103 m3
r = QR /Qin Recycling rate 2
w = Qw /Qin Purge rate 0.01

KA Constant 1
KC Creation Coefficient (yield) 0.6
KD Decay rate of endogenous organism 0.06
KLa Oxygen transfer coefficient 0.65
Ks Saturation coefficient 70 mg/l
K1 Constant 12 × 10−5
K2 Constant 7 × 10−5
η Specific rate of growth 0.4
ηmax Maximum specific rate of growth 0.009

The differential equations representing this model are as follows [8].

• In terms of biomass:

$$\frac{d\chi}{dt} = -\frac{1}{V}(1 + r)Q_{in}\chi + \frac{1}{V}\,r\,Q_{in}\,\chi_R - K_D\chi + \eta\chi \quad (4)$$

$$\frac{dS}{dt} = -\frac{1}{V}(1 + r)Q_{in} S + \frac{1}{V} Q_{in} S_0 + \frac{1}{V}\,r\,Q_{in} S - \frac{\eta}{K_C}\chi \quad (5)$$

$$\eta = \eta_{max}\,\frac{S}{K_s + S} \quad (6)$$

• At the decanter level (according to the mass balance):

$$(r + w)\chi_R + (1 - w)\chi_E = (1 + r)\chi \quad (7)$$

$$\chi_E = K_A(1 + r)Q_{in}\chi \quad (8)$$

• At the DO level:

$$\frac{dC}{dt} = \frac{Q_{in}}{V}\big[C_0 - (1 + r)C\big] + K_{La} Q_L\big[C_s - C\big] - K_1\eta\chi - K_2\chi \quad (9)$$

The complete model, represented by the set of the three preceding nonlinear differential equations, is used for the simulation of the process control.
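To show how the model can be integrated numerically (the authors use RK4 with h = 0.05 in Matlab, see Sect. 4.5), a compact Python sketch follows; the parameter values are those of Table 1, χR is eliminated through the decanter balance (7)-(8), and the code is an illustration rather than the authors' implementation.

```python
import numpy as np

# Parameter values from Table 1
V, Qin, QL = 103.0, 4.2, 0.1
r, w, KA, KC, KD = 2.0, 0.01, 1.0, 0.6, 0.06
KLa, Ks, K1, K2 = 0.65, 70.0, 12e-5, 7e-5
eta_max, C0, Cs, S0 = 0.009, 0.1, 9.17, 270.0

def f(x):
    """Right-hand side of the bioreactor model (4)-(9); x = [chi, S, C]."""
    chi, S, C = x
    eta = eta_max * S / (Ks + S)                          # growth rate (6)
    chi_E = KA * (1 + r) * Qin * chi                      # effluent biomass (8)
    chi_R = ((1 + r) * chi - (1 - w) * chi_E) / (r + w)   # decanter balance (7)
    dchi = (-(1 + r) * Qin * chi + r * Qin * chi_R) / V - KD * chi + eta * chi
    dS = (-(1 + r) * Qin * S + Qin * S0 + r * Qin * S) / V - eta * chi / KC
    dC = (Qin / V) * (C0 - (1 + r) * C) + KLa * QL * (Cs - C) \
         - K1 * eta * chi - K2 * chi
    return np.array([dchi, dS, dC])

def rk4_step(x, h=0.05):
    """One classical 4th-order Runge-Kutta step, as used by the authors."""
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

x = np.array([150.0, 142.0, 2.0])   # initial chi, S, C from Table 1
for _ in range(1000):
    x = rk4_step(x)
```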

4.2 Control by Conventional PI Regulator

This type of regulator is the most used in industry. Figure 3 below shows the general structure of the process control by a conventional PI regulator.

Fig. 3. Functional diagram of a process control loop using a classic PI regulator.

The control law of the classic PI regulator depends on the error e(t), i.e.,

$$U(t) = f(e(t)) \quad \text{or} \quad U(t) = K_P e(t) + K_i \int_0^t e(t)\,dt \quad (10)$$

The constant $K_P$ is the proportional gain and $K_i$ is the integral gain of the PI regulator. This control law of the PI regulator is developed in order to subsequently compare its results with those obtained by the MRAC method (adapted PI regulator).

4.3 Model Reference Adaptive Control (MRAC) Based on the MIT Approach

The MRAC method used here is based on the gradient algorithm (MIT rule). This law rests on an optimization problem whose decision variables are the parameters of the regulator. The adjustment of these parameters is done in such a way as to minimize a quadratic cost function defined by:

$$J(\theta) = \frac{1}{2}\,e^2(t) \quad (11)$$

$\theta$ contains the parameters of the PI regulator $U(t)$ given by (10):

$$\theta = \begin{bmatrix} K_p \\ K_i \end{bmatrix} \quad (12)$$

$$\text{with: } e(t) = ym(t) - C(t) \quad (13)$$

ym(t) is the model reference output and C(t) is the output of the controlled process (dissolved oxygen). The resolution of this problem is done using the gradient algorithm:

$$\frac{d\theta(t)}{dt} = -\lambda\,\frac{dJ(\theta)}{d\theta} \quad (14)$$
where λ is an adaptation parameter to be chosen by the designer. We calculate the derivative of (11) with respect to the parameter vector θ:

$$\frac{dJ(\theta)}{d\theta} = \frac{d\big(\tfrac{1}{2} e^2(t)\big)}{d\theta} \quad (15)$$

$$\frac{dJ(\theta)}{d\theta} = e(t)\,\frac{de(t)}{d\theta} \quad (16)$$
Since only the output C(t) of the system depends on the regulator parameter vector (i.e., the output of the reference model ym(t) is constant or slowly varying, so $\frac{d\,ym(t)}{d\theta} = 0$), we have:

$$\frac{de(t)}{d\theta} = \frac{d}{d\theta}\big(ym(t) - C(t)\big) \quad (17)$$

$$\frac{de(t)}{d\theta} = -\frac{dC(t)}{d\theta} \quad (18)$$

Replacing Eq. (18) in Eq. (16), and the result in Eq. (14), we obtain the adaptation law of the regulator parameter vector:

$$\frac{d\theta(t)}{dt} = \lambda\,e(t)\,\frac{dC(t)}{d\theta} \quad (19)$$

Remark 1: In fact, the term $\frac{dC(t)}{d\theta}$ in (19) cannot be easily evaluated, because the output C(t) of the system does not depend directly on the parameter vector θ of the regulator, so a solution must be found to remedy this problem. The recursive least squares method [7] (a method for approximating the parameters of a model in order to identify it) is part of our solution, which is clarified in the following.

4.4 Identification by Recursive Least Squares Method


The objective of this part is to characterize a model of the system from the input-output knowledge of the real system (which is, in our case, the wastewater treatment process denoted DO and described by Eqs. (4) to (9)), such that there is identity of output behavior

for identical inputs. A first-order model is chosen for the DO process identification; it is given by the following form:

$$y(t) = z^{-1}\cdot\frac{b \cdot U(t)}{1 + a \cdot z^{-1}} \quad (20)$$

Since $z^{-1} y(t) = y(t-1)$ and $z^{-1} U(t) = U(t-1)$, the model in (20) can be written in the following form (which is suitable for applying the recursive least squares method):

y(t) = −a · y(t − 1) + b · U (t − 1) (21)

After applying the recursive least squares method [7] using the Matlab programming
language, the following results were obtained:

Fig. 4. Evolution of the parameter b = 2.1535e−004 of the identification model.

Fig. 5. Evolution of the parameter a = −1 of the identification model.

From Fig. 6 it can be seen that the resulting model follows the output of the real DO system well after training with random input values. From Figs. 4 and 5 it is clear that the parameters a and b tend towards their steady values, which are respectively a = −1 and b = 2.1535e−004. The identification model is therefore:

$$y(t) = z^{-1}\cdot\frac{b \cdot U(t)}{1 + a \cdot z^{-1}} = z^{-1}\cdot\frac{(2.1535 \times 10^{-4})\,U(t)}{1 - z^{-1}} \quad (22)$$
Or, based on (21):

$$y(t) = y(t-1) + (2.1535 \times 10^{-4})\,U(t-1) \quad (23)$$
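A minimal NumPy sketch of the recursive least squares recursion used to obtain such estimates from input-output records of the form (21) is given below; it is an illustration, not the authors' Matlab code.

```python
import numpy as np

def rls_identify(y, u, lam=1.0):
    """Estimate theta = [-a, b] in y(t) = -a*y(t-1) + b*U(t-1) by RLS.

    y, u -- recorded output and input sequences of the DO process
    lam  -- forgetting factor (1.0 = ordinary recursive least squares)
    """
    theta = np.zeros(2)                 # parameter estimates
    P = 1e6 * np.eye(2)                 # large initial covariance
    for t in range(1, len(y)):
        phi = np.array([y[t - 1], u[t - 1]])    # regressor vector
        k = P @ phi / (lam + phi @ P @ phi)     # gain vector
        theta += k * (y[t] - phi @ theta)       # prediction-error update
        P = (P - np.outer(k, phi @ P)) / lam    # covariance update
    return theta                        # theta[0] -> -a, theta[1] -> b
```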

Remark 2: Based on the work done in [9], where the authors used a fuzzy model of the system to solve the problem cited in Remark 1, in our case we use the first-order identification model obtained (given by (22) and (23)) to compute the term $\frac{dC(t)}{d\theta}$, which cannot be calculated from the real system (as already mentioned in Remark 1) because of the strong nonlinearity of the wastewater treatment process; finally, the simulation results are presented.

Fig. 6. The process output DO in blue and its model output in red.

Remark 3: Since the model determined and given by Eq. (23) represents the real system exactly (as shown in Fig. 6), i.e., C(t) = y(t), the term $\frac{dC(t)}{d\theta}$ is replaced by $\frac{dy(t)}{d\theta}$, which is equal to

$$\frac{dC(t)}{d\theta} = \frac{dy(t)}{d\theta} = \frac{dy(t)}{dU(t)} \cdot \frac{dU(t)}{d\theta} \quad (24)$$

According to Eq. (23), $\frac{dy(t)}{dU(t)} = b = 2.1535 \times 10^{-4}$, and it remains only to compute the term $\frac{dU(t)}{d\theta}$, which can clearly be computed from Eq. (10), that is,
$$U(t) = K_p e(t) + K_i \int_0^t e(t)\,dt = \theta_1 \cdot e(t) + \theta_2 \cdot \int_0^t e(t)\,dt \quad (25)$$

On the other hand, we know that $\theta = \begin{bmatrix} K_p \\ K_i \end{bmatrix} = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}$, so $\frac{dU(t)}{d\theta}$ can easily be calculated from Eq. (25), that is to say:

$$\frac{dU(t)}{d\theta_1} = e(t) \quad \text{and} \quad \frac{dU(t)}{d\theta_2} = \int_0^t e(t)\,dt \quad (26)$$
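Putting (19) and (24)-(26) together gives a very simple discrete-time update. The Python sketch below is an illustration (the step size, adaptation rate, and names are assumptions, not the authors' values):

```python
import numpy as np

def mit_update(theta, e, e_int, b=2.1535e-4, lam=0.1, dt=0.05):
    """One MIT-rule update of theta = [Kp, Ki] following (19), (24)-(26).

    e     -- current error ym(t) - C(t)
    e_int -- running integral of the error
    b     -- model gain dy/dU from the identified model (23)
    lam   -- adaptation rate lambda (a design choice)
    """
    dU_dtheta = np.array([e, e_int])          # from (26)
    dC_dtheta = b * dU_dtheta                 # chain rule (24)
    return theta + dt * lam * e * dC_dtheta   # Euler step of (19)

# Inside the simulation loop the adapted PI law (25) is then applied:
#   U = theta[0] * e + theta[1] * e_int
```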
0

4.5 Simulation Results of the Dissolved Oxygen (DO) Control Task


The simulation task was carried out using the fourth-order Runge-Kutta method (RK4), with an integration step h = 0.05, for the resolution of the nonlinear differential Eqs. (4)-(9) of the process mathematical model. The variations in the values of the reference Cc(t) (input of the reference model) over its variation intervals are given in the following Table 2:

Table 2. Variations of the DO reference signal.

Iterations [1 600] [600 1200] [1200 1800] [1800 2400]


Cc(t) [mg/l] 7 8 5 8

A series of changes in the set point Cc(t), and consequently in the output of the reference model (the desired dissolved oxygen (DO) concentration), is applied at the input of the control loop. In order to observe the behavior of the studied process in this simulation, a comparison of the two control methods (the classical PI regulator and the adaptive MRAC control) was considered. The controlled variable is the dissolved oxygen concentration DO at the output of the bioreactor during the operation of the process. Control is achieved by manipulating the air flow (the control signal provided by the regulator) at the input of the bioreactor (Fig. 7).


Fig. 7. Evolution of the DO concentration at the output of the bioreactor during the two cases
(without and with) adaptation of the regulator parameters.

From this figure, we notice that the outputs corresponding to the PI regulator without adaptation and to the PI regulator with adaptation (MRAC method) both converge towards the desired value (the reference) in spite of the variations of the latter, but we can clearly see that the evolution of the DO achieved by the PI regulator with adaptation (MRAC) approaches the desired response quickly and without any overshoot, compared to the evolution under the corresponding PI without adaptation (Fig. 8).


Fig. 8. Evolution of the error signal e(t) during both cases (without and with) adaptation of the
regulator parameters.

We see from the figure above that the error signals stay around zero except for peaks during the changes of the reference values. As can be seen, these peaks show less overshoot with the MRAC method than with the PI method.

4.6 Robustness Study


In order to observe and evaluate how the regulators (without adaptation (classical PI) and with adaptation (adapted PI, or MRAC)) operate, we introduce a perturbation at the output of the process, as shown in Fig. 3. After some tests, the acceptable evolution of the process output (DO) under the PI regulator (without adaptation) before diverging was obtained when the disturbance amplitude had a maximum value of 0.35 (in the worst case). The PI regulator with adaptation (MRAC controller) was always capable of eliminating the disturbance with a much smaller amplitude (peak). The following figure shows the evolution of the output of the bioreactor (DO) in both cases, without and with adaptation (Fig. 9).
From this figure, we notice that the disturbance is canceled in both cases (without and with adaptation), but with a tight and reduced amplitude peak in the case with adaptation (MRAC), compared to the case without adaptation, which presents a peak with high amplitude that may be an undesirable effect.
Remark 4: According to our experience (realized tests), if we increase the amplitude of the disturbance, for example to 0.5, the output of the simple PI regulator (without adaptation) moves away (diverges) completely from the desired set point, whereas the regulator with adaptation (MRAC) cancels the disturbance easily with a limited peak.


Fig. 9. Evolution of the DO concentration at the output of the bioreactor during both cases (without
and with) adaptation.

5 Conclusion
The objective of the work presented was to control the dissolved oxygen (DO) concentration in a wastewater purification process. One of the major challenges is to develop reliable, robust, and inexpensive processes allowing simultaneous or sequential treatment of the various sources of pollution. The work aimed to apply adaptive control technology (MRAC) to complex nonlinear systems and to compare the results obtained with those of a classical PI regulator, where the studied system is a bioreactor (more exactly, the activated sludge wastewater treatment process). The work was based on a model of the system designed through an identification of the actual process via the recursive least squares method. The simulation was performed using the MATLAB programming language, and the equations were integrated using the fourth-order Runge-Kutta method. The simulation results showed the interest of modern automatic control tools (in particular, the MRAC method) in process engineering.

References
1. Ulucan-Altuntas, K., Ilhan, F., Kasar, C., et al.: Implementation of fuzzy logic model on textile
wastewater treatment by electrocoagulation process. J. Water Chem. Technol. 43, 255–260
(2021)
2. Corriou, J.-P.: Commande des procédés, 3rd edn. Lavoisier, Paris (2012)
3. Boutana, W., Ykhelfoune, N.: Etude comparative en simulation entre un régulateur PID et un
régulateur flou. Mémoire de Master2. Université de Jijel (2019)
4. Assaf, A.: Réduction de modèle et commande prédictive d’une station d’épuration d’eaux
usées. Doctorat de l'Université de Lorraine (2012)

5. Aouaouda, A.: Modélisation multimodèle et commande prédictive d'une station d'épuration. Thèse de doctorat, Université Badji Mokhtar-Annaba (2012)
6. Landau, I., Dugard, I.: Commande Adaptative, Aspects Pratiques et Théoriques. New York
(1986)
7. Åström, K.J., Wittenmark, B.: Adaptive Control, 2nd edn. Addison-Wesley, Lund (1995)
8. Bahita, M., Belarbi, K.: Fuzzy adaptive control of dissolved oxygen in a waste water treat-
ment process. In: 16th IFAC (International Federation of Automatic Control) Conference on
Technology, Culture and International Stability, pp. 24–27. Elsevier, Sozopol (2015)
9. Bahita, M., Belarbi, K.: Fuzzy modeling and model reference neural adaptive control of the
concentration in a chemical reactor (CSTR). J. Knowl. Cult. Commun. 33(2), 189–196 (2018)
Breast Cancer Diagnosis Using Deep Learning

Salman Zakareya(B) and Habib Izadkhah

Department of Computer Science, University of Tabriz, Tabriz, Iran


selman33111@hotmail.com, izadkhah@tabrizu.ir

Abstract. Breast cancer is one of the diseases that is gradually becoming more prevalent in today's society. Machine learning is helping in the early detection of mammographic lesions. In this work, a fully connected neural network (FCNN) deep learning architecture is used to diagnose breast cancer. In addition, we study the effect of different techniques to avoid overfitting and improve the performance of the designed deep neural network, and we then select the best model. In this study, the Wisconsin Breast Cancer Dataset (WBCD) is used. The dataset is split into a training set and a test set, with percentages of 80% and 20%, respectively. The model with the reduced network size obtained the highest accuracy and least loss on the training set, and the model adding L2 weight regularization (FCNN + L2) achieves the highest accuracy and the lowest loss on the test set.

Keywords: Breast cancer · Deep learning · Overfitting · Classification

1 Introduction

In 2020, 2.3 million women were diagnosed with breast cancer and there were 685,000 deaths from it globally. At the end of 2020, 7.8 million women had been diagnosed with breast cancer within the past 5 years, making it the world's most prevalent cancer.
Machine learning plays an important role in assisting medical professionals in the early detection of mammographic lesions [1, 2]. In recent years, it has drawn a remarkable amount of research attention and has led to many practical applications. Computer-aided detection (CAD) systems are designed to support radiologists in the process of screening mammograms to avoid misdiagnosis caused by fatigue, eyestrain, or lack of experience. The use of an accurate CAD system for early detection could definitely save precious lives [4, 5].
Deep learning is a machine learning method that uses deep neural networks. A deep neural network is a multilayered neural network that has several hidden layers [1–3].
The main challenge deep learning faces is overfitting. The best solution to reduce overfitting is to get more training data. When no further training data can be accessed, an alternative solution is to limit the amount of information that the model can store or is allowed to store [7, 8]. This is called regularization. Three common regularization techniques are used to reduce overfitting and to improve the performance of a deep neural network model. These techniques, illustrated in the code sketch after the following list, are:

1. Reducing the network’s size: in this technique, the number of layers and neurons is reduced, aiming to limit the capacity of the model.
2. Dropout: in this technique, randomly selected neurons are dropped during training, aiming to reduce the complexity of the model.
3. Weight regularization: that is, do not let the weights change too freely. The structure stays the same, i.e., the number of layers or neurons is not reduced.

In this work, deep learning is used to diagnose breast cancer. We also study the effect of different techniques to avoid overfitting and improve the performance of the designed deep neural network, and then select the best model. In [1], the authors proposed deep learning with a neural network algorithm for the diagnosis and detection of breast cancer. A standardization method and the PCA algorithm were applied for preprocessing the Wisconsin breast cancer dataset. They reported 99.67% accuracy on the training set.
This work is organized as follows: a description of the dataset used, the data preparation for modeling, and then a description of the various model versions. A fully connected neural network (FCNN) is used for model building, and then different regularization techniques are applied to study their effect on the performance of the deep neural network model. These regularization techniques are: adding dropout layers to the network, adding L2 weight regularization, adding L1 and L2 weight regularization, adding L2 weight regularization together with dropout layers, and reducing the size of the network.

2 Model Implementation
2.1 Data Set
The breast cancer dataset used in this study is the Wisconsin Breast Cancer Dataset (WBCD) [6]. It was created by Dr. William H. Wolberg at the University of Wisconsin Hospitals, Madison, and made available online in 1992. It consists of nuclear features of FNAC biopsy test result data taken from patients’ breasts. The dataset contains records from 699 patients, with 458 (65.5%) cases having a benign BC tumor and 241 (34.5%) cases having a malignant BC tumor. It includes ten features plus the class feature. Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass and describe characteristics of the cell nuclei present in the image. The nine independent features available in this
dataset are:

1. Clump Thickness
2. Uniformity of Cell Size
3. Uniformity of Cell Shape
4. Marginal Adhesion
5. Single Epithelial Cell Size
6. Bare Nuclei
7. Bland Chromatin
8. Normal Nucleoli
9. Mitoses

The value of each of these features is in the range [1, 10], where 1 represents a normal state and 10 represents the most abnormal state. The last feature indicates the class, where 2 denotes benign and 4 denotes malignant.
From the 699 records, 16 records have missing values for the ‘Bare Nuclei’ feature
(denoted by “?”). The class distribution is as follows in Table 1 and Fig. 1:

Table 1. The Wisconsin breast cancer dataset (WBCD) distribution

Benign 458 65.5%
Malignant 241 34.5%
Total 699

Fig. 1. The Wisconsin Breast Cancer Dataset (WBCD) distribution.

2.2 Data Preparations

In this stage, data are prepared for model building. The data are pre-processed to maximize the performance of the machine learning algorithms. The data are normalized since the features are on different scales; the min-max scaling method is used for normalization. Samples containing a missing feature value, denoted by “?”, have it replaced with the numeric value −9999. The class labels 2 (benign) and 4 (malignant) are converted to 0 and 1, respectively.
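A minimal sketch of this preparation step in Python (the file name and column names below are illustrative assumptions; the paper does not publish its code):

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical file name; the WBCD file as distributed has no header row.
df = pd.read_csv("breast-cancer-wisconsin.data", header=None)
df.columns = ["id", "clump_thickness", "cell_size", "cell_shape",
              "adhesion", "epithelial_size", "bare_nuclei",
              "chromatin", "nucleoli", "mitoses", "class"]

# Replace the missing-value marker "?" with -9999, as described above.
df = df.replace("?", -9999).astype(float)

# Map class labels 2 (benign) and 4 (malignant) to 0 and 1.
df["class"] = df["class"].map({2.0: 0, 4.0: 1})

# Min-max scaling of the nine features to [0, 1].
X = MinMaxScaler().fit_transform(df.drop(columns=["id", "class"]))
y = df["class"].values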

2.3 Model Building

In this work, we use a fully connected neural network (FCNN), in which layers are fully (densely) connected. Each neuron in a layer receives input from all the neurons in the previous layer, so a densely connected layer learns features from all combinations of the features of the previous layer. Due to the network architecture and the available data, there is a possibility of overfitting. To improve the performance of the deep neural network and avoid overfitting, six different versions of the FCNN model are defined by adding dropout layers to the network, adding weight regularization, or reducing the network’s size. After each model is defined, it is trained on the dataset and then evaluated.

Fully Connected Neural Network (FCNN)


The first version of the FCNN model has five dense layers. The first layer has 20 neurons, with 9-dimensional input and the relu activation function. The second layer has 27 neurons and the relu activation function. The third layer has 54 neurons and the relu activation function. The fourth layer has 20 neurons and the relu activation function. The fifth layer has two neurons; since we have a binary classification problem, the last layer uses the sigmoid activation function, as shown in Fig. 2. The loss function used to compile the model is the mean squared logarithmic error, Adam is used as the optimizer, and the accuracy metric is used.
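For illustration, this architecture can be expressed in Keras as the following sketch (the two-neuron sigmoid output, which implies one-hot encoded targets, is reproduced exactly as described in the text):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(20, activation="relu", input_shape=(9,)),
    layers.Dense(27, activation="relu"),
    layers.Dense(54, activation="relu"),
    layers.Dense(20, activation="relu"),
    # Two output neurons with sigmoid, as described in the text.
    layers.Dense(2, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="mean_squared_logarithmic_error",
              metrics=["accuracy"])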

Fig. 2. Fully Connected Neural Network (FCNN)



Adding Dropout to the Network (FCNN + Dropout)


In the 2nd model, dropout is added to the 1st-version FCNN network. Dropout is a training technique in which randomly selected neurons are ignored; they are “dropped” at random. This means that on the forward pass their contribution to the activation of downstream neurons is removed temporarily, and on the backward pass no weight updates are applied to them. Dropout offers a computationally cheap and remarkably effective regularization method to reduce overfitting and improve the generalization error in deep neural networks of all kinds.
In this version, three dropout layers with a 0.5 fraction of the input units to drop are added after the 2nd, 3rd and 4th dense layers.
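A sketch of this variant, under the same assumptions as the previous snippet:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(20, activation="relu", input_shape=(9,)),
    layers.Dense(27, activation="relu"),
    layers.Dropout(0.5),   # after the 2nd dense layer
    layers.Dense(54, activation="relu"),
    layers.Dropout(0.5),   # after the 3rd dense layer
    layers.Dense(20, activation="relu"),
    layers.Dropout(0.5),   # after the 4th dense layer
    layers.Dense(2, activation="sigmoid"),
])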

Adding L2 Weight Regularization (FCNN + L2)


In the 3rd model, L2 weight regularization is added to the 1st-version FCNN network. Neural network regularization is a technique used to reduce the likelihood of overfitting by adding a regularization term that penalizes large weight values in the cost (loss) function. There are several forms of regularization; the two most common are called L1 and L2. L1 regularization adds the absolute values of the weights as a penalty term to the loss function, and L2 regularization adds the squared weights as a penalty term to the loss function. L1 regularization penalizes the network weights by driving values that are already small toward exactly zero. The general intuition behind L1 is that if a weight value is close to 0 or very small, its effect on the overall performance of the model will be very small, so setting this weight to 0 will not affect the performance of the model and can reduce the memory consumption of the model.
L2 regularization is the most common form of regularization. It is also known as weight decay, as it forces the weights to decay towards zero (but not exactly zero). L2 regularization also penalizes weight values: for small and relatively large weight values alike, it shrinks the values toward, but not completely to, 0. L2 regularization tries to reduce the possibility of overfitting by keeping the values of the weights and biases small, and it tries to estimate the mean of the data to avoid overfitting. In this version, L2 weight regularization only is added to the 2nd, 3rd and 4th dense layers.
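In Keras this amounts to attaching a kernel regularizer to those layers; the coefficient 0.001 below is an illustrative assumption, not a value taken from the paper:

from tensorflow.keras import layers, regularizers

# Drop-in replacement for the 2nd dense layer of the base model; the
# 3rd and 4th layers are modified the same way (coefficient assumed).
dense2_l2 = layers.Dense(27, activation="relu",
                         kernel_regularizer=regularizers.l2(0.001))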

Adding L2 Weight Regularization and Dropout (FCNN + L2 + Dropout)


In the 4th model, dropout is added to the 3rd-version FCNN network to study the effect of combining L2 weight regularization and dropout. In this version, dropout layers with a 0.5 fraction of the input units to drop are added after the 2nd and 3rd dense layers.

Adding L1 and L2 Weight Regularization (FCNN + L1 & L2)


In the 5th model, both L1 and L2 weight regularization are applied to the 1st-version FCNN network. L1 regularization adds the absolute values of the weights as a penalty term to the loss function, and L2 regularization adds the squared weights. In this version, L1 and L2 weight regularization are added to the 2nd, 3rd and 4th dense layers.
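Keras exposes a combined regularizer for this case; the coefficients are again assumed for illustration:

from tensorflow.keras import layers, regularizers

# Combined L1 + L2 penalty (coefficients are illustrative assumptions).
dense2_l1l2 = layers.Dense(27, activation="relu",
                           kernel_regularizer=regularizers.l1_l2(l1=0.001,
                                                                 l2=0.001))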

Reducing the Size of the Network


The final version of the FCNN network has four dense layers. The first layer has 20 neurons, with 9-dimensional input and the relu activation function. The second layer has 27 neurons and the relu activation function. The third layer has 20 neurons and the relu activation function. The fourth layer has two neurons and uses the sigmoid activation function. The loss function used to compile the model is the mean squared logarithmic error, Adam is used as the optimizer, and the accuracy metric is used.

3 Results

The dataset used in this study is the Wisconsin Breast Cancer Dataset (WBCD). It contains records collected from 699 patients, of which 458 were from patients who had a benign BC tumor and 241 from patients with a malignant BC tumor. It includes ten features plus the class feature. The dataset was split into a training set and a test set, with percentages of 80% and 20%, respectively. Data preparation included min-max scaling for normalization; samples containing a missing feature value, denoted by “?”, had it replaced with the numeric value −9999; and the class labels 2 (benign) and 4 (malignant) were converted to 0 and 1, respectively.
A fully connected neural network (FCNN) was used for model building, and then different regularization techniques were applied to improve the performance of the deep neural network and avoid overfitting. These regularization techniques were: adding dropout layers to the network, adding L2 weight regularization, adding L1 and L2 weight regularization, adding L2 weight regularization together with dropout layers, and reducing the size of the network. Thus, six different versions of the FCNN model were defined. The 1st version of the FCNN model has five dense layers with 20, 27, 54, 20 and 2 neurons; relu is used as the activation function for the first four layers, and the fifth layer uses the sigmoid activation function. In the 2nd model, dropout is added to the 1st-version FCNN network: three dropout layers with a 50% fraction of the input units to drop were added after the 2nd, 3rd and 4th dense layers. In the 3rd model, L2 weight regularization is added to the 1st-version FCNN network; it was added to the 2nd, 3rd and 4th dense layers only. In the 4th model, dropout is added to the 3rd-version FCNN network to study the effect of combining L2 weight regularization and dropout; dropout layers with a 50% fraction of the input units to drop were added after the 2nd and 3rd dense layers. In the 5th model, both L1 and L2 weight regularization were applied to the 1st-version FCNN network, added to the 2nd, 3rd and 4th dense layers. The final version of the FCNN network has four dense layers with 20, 27, 20 and 2 neurons; relu is used as the activation function for the first three layers, and the 4th layer uses the sigmoid activation function. The loss function used to compile the models is the mean squared logarithmic error, and the accuracy metric is used. The following parameters were considered: the Adam optimizer with a learning rate of 0.0001, 1000 epochs, a batch size of 10, and 50% dropout. The implementation and model training code were written in Python and run on Kaggle. Table 2 summarizes the performance of the proposed models in terms of accuracy and loss.
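With those hyperparameters, the training call might look like this sketch (variable names are illustrative; with two output neurons the targets are assumed to be one-hot encoded):

from tensorflow import keras

y_train_oh = keras.utils.to_categorical(y_train, num_classes=2)
y_test_oh = keras.utils.to_categorical(y_test, num_classes=2)

model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0001),
              loss="mean_squared_logarithmic_error",
              metrics=["accuracy"])
history = model.fit(X_train, y_train_oh,
                    epochs=1000, batch_size=10,
                    validation_data=(X_test, y_test_oh))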
The performance of each model was evaluated during the training process: during the 1000-epoch training of the six models, the accuracy was computed after every epoch.

Table 2. Comparing the performance of several models on breast cancer problem

Model Test loss Test accuracy
FCNN 0.0173 0.9643
FCNN + Dropout 0.0137 0.9714
FCNN + L2 0.0101 0.9857
FCNN + l1_l2 0.0136 0.9714
FCNN + l2 + Dropout 0.0144 0.9714
Reducing the size of the network 0.0202 0.9571

Figures 3 and 4 show the training accuracies and losses during the training process.

Fig. 3. Training accuracies during the training process

Further model verification was done on the test dataset. Figures 5 and 6 show the accuracies and losses during the testing process.

Fig. 4. Training loss during the training process

Fig. 5. Model’s performance in terms of accuracy

4 Discussion
As shown in Fig. 3, the technique of reducing the size of the network achieves the highest accuracy, and the least loss, as shown in Fig. 4, during the training process. Table 2, Fig. 5 and Fig. 6 summarize the results obtained by all models during the test process: as shown in Fig. 5, the model with added L2 weight regularization (FCNN + L2) achieves the highest accuracy, and the lowest loss, as shown in Fig. 6.
It can be seen that the reduced-size network model obtained the highest accuracy and least loss on the training data. What matters, however, is performance on data that the

Fig. 6. Model’s performance in terms of loss

model has not seen before. For a more comprehensive comparison, the models were run several times and the average accuracies and losses were compared. It can be seen that the model with added L2 weight regularization (FCNN + L2) has the best performance on the test data compared to the rest of the models.

5 Conclusion
Diagnosis and treatment of breast cancer have been addressed by many studies; timely and accurate detection of this disease is lifesaving. The breast cancer dataset used in this study is the Wisconsin Breast Cancer Dataset (WBCD), and deep learning architectures are used in this paper. The reduced-size network model obtained the highest accuracy and least loss on the training data, but what matters is performance on data that the model has not seen before. For a more comprehensive comparison, the models were run several times and the average accuracies and losses were compared.

References
1. Khuriwal, N., Nidhi, M.: Breast cancer diagnosis using deep learning algorithm. In: 2018
International Conference on Advances in Computing, Communication Control and Networking
(ICACCCN). IEEE (2018)
2. Izadkhah, H.: Deep Learning in Bioinformatics: Techniques and Applications in Practice, 1st edn. Department of Computer Science, University of Tabriz (2020)
3. Chugh, G., Kumar, S., Singh, N.: Survey on machine learning and deep learning applications in breast cancer diagnosis. Cogn. Comput. 1–20 (2021)
4. Nguyen, C., Wang, Y., Nguyen, H.N.: Random forest classifier combined with feature selection
for breast cancer diagnosis and prognostic. J. Biomed. Sci. Eng. J. 6(5) (2013)

5. Rajaguru, H., Prabhakar, S.K.: Bayesian linear discriminant analysis for breast cancer classi-
fication. In: 2017 2nd International Conference on Communication and Electronics Systems
(ICCES), pp. 266–269. IEEE (2017)
6. Breast cancer statistics. https://www.wcrf.org/dietandcancer/breast-cancer-statistics/.
Accessed 24 June 2021
7. Abdel-Zaher, A.M., Eldeib, A.M.: Breast cancer classification using deep belief networks.
Expert Syst. Appl. 46, 139–144 (2016)
8. Keles, A., Keles, A., Yavuz, U.: Expert system based on neuro-fuzzy rules for diagnosis breast
cancer. Expert Syst. Appl. 38, 5719–5726 (2011). https://doi.org/10.1016/j.eswa.2010.10.061
Topological Data Analysis - A Novel
and Effective Approach for Feature Extraction

Dhananjay Joshi1(B) , Kapil Kumar Nagwanshi1 , Nitin S. Choubey2 , Milan A. Joshi3 ,


and Sunil Pathak1
1 Amity School of Engineering and Technology, Amity University Jaipur, Jaipur 303007,
Rajasthan, India
dj4query@gmail.com
2 Department of Information Technology, Mukesh Patel School of Technology Management
and Engineering Shirpur, SVKM’s NMIMS (Deemed to be University), Mumbai, India
3 Department of Applied Mathematics, Mukesh Patel School of Technology Management and

Engineering Shirpur, SVKM’s NMIMS (Deemed to be University), Mumbai, India

Abstract. The paper demonstrates how Topological Data Analysis (TDA) can be effectively used for qualitative feature extraction and for studying the shape of data. This paper has a twofold aim: the first is using persistent homology for extracting important image features, and the other is using mapper to generate topological networks. Medical imaging plays an essential role in the diagnosis of various diseases. Feature extraction is required to apply a predictive model for any disease diagnosis; one can use TDA to extract features via persistent homology. Every real-world dataset can be visualized and explored using various data visualization techniques; in short, every dataset has a shape. Why shape? Because data points in proximity share qualitative behavior. Why TDA? Because it deals with the shape of the data, we can extract meaning from that shape, and, importantly, it is a branch of mathematics. TDA summarizes away irrelevant detail to reveal something interesting; one can do this using mapper. An experimental study using persistent homology and mapper is explained, showing how they can be effectively used for feature extraction and for finding hidden patterns in data, respectively.

Keywords: Topological data analysis · Persistent homology · Topological network · Mapper

1 Introduction

The paper will provide examples of persistent homology (PH) and mapper applied to medical imaging data. PH and mapper are both techniques of TDA. The mapper approach to TDA was pioneered by Gurjeet Singh, Facundo Mémoli, and Gunnar Carlsson at Stanford University. Topology is a branch of pure mathematics that deals with shape; therefore, topology has a connection with data analysis and can be applied to study the structure of data.
TDA is a young and pioneering data analysis technique, which can be combined with ML for feature extraction, prediction, and pattern determination. Traditional

machine learning techniques or methods like PCA and cluster analysis fail to detect some geometric features that TDA effectively captures. So, what extra insights do we get? TDA reveals insights and hidden patterns in data that are quite difficult to obtain with traditional methods.
TDA can be applied to any domain; one can use it in healthcare. Medical imaging is a part of the healthcare system and incorporates radiology, including radiography, magnetic resonance imaging, ultrasound, endoscopy, thermography, tomography, etc. Topology has the properties of coordinate invariance, stretch or deformation invariance, and compressed representation. These properties make TDA suitable for image analysis [1]. The general flow of TDA is represented in Fig. 1; one can use either persistent homology or mapper.

Fig. 1. General flow of TDA, either mapper or persistent homology.

This paper discusses the practical implementation of the persistent homology and mapper techniques. Topological Data Analysis (TDA) is well explained in [2–8].
Persistent Homology (PH):
PH is a branch of algebraic topology used to extract qualitative features from data.
General flow of persistent homology:
Image Database → Filtration → Persistent Homology → Persistence Diagram → Features → Machine Learning Model → Output.
Mapper:
Mapper is another crucial TDA technique. This algorithm aims to extract, simplify, and visualize high-dimensional data. Mapper uses dimensionality reduction and clustering to form a topological network. Mapper is mainly used to visualize the shape of the data, detect clusters and interesting topological structures, and select features that best discriminate the data and support model interpretability [9].
General flow of mapper:
Image Database → Point Cloud → Connected Network → Any valuable insights?
In this article, we apply PH and mapper to the brain tumor image dataset [10]. The main aim is to extract features using PH and to check for hidden insights using mapper. We do not intend to build an ML/DL model for classification.
The work is planned as follows: the next section discusses persistent homology, followed by the mapper section. Both sections demonstrate how these techniques are effectively used.

2 Persistent Homology
Persistent homology is an important technique in TDA for extracting quality features from data. Qualitative features play an important role in improving the performance of ML. Persistent homology is a branch of algebraic topology. It can capture more complicated structures, such as loops and voids, that are invisible to other methods [11].
The author in [12] presented a way to use topological data analysis for machine learning tasks on grayscale images, applying persistent homology to the MNIST dataset of handwritten digits to generate so-called topological features. The author also showed that TDA is an effective dimensionality reduction technique that provides very good accuracy [12].
In [13], the author explained the motivation for topological data analysis and the importance of topological simplification and persistent homology. The author in [14] described what types of shape TDA detects and why these shapes have meaning, specifically the development of the concepts of persistent homology and barcodes.
The author in [15] investigated random linear projections of point clouds followed by topological data analysis for computing persistence diagrams and Betti numbers. The author found that Betti numbers can be recovered accurately with high probability after random projection down to certain reduced dimensions, below which the probability of recovery decreases to zero, and discussed the resulting persistence diagrams.
Here, we used persistent homology to extract features from brain images using Giotto-TDA. Giotto-TDA is a high-performance topological machine learning toolbox built on top of scikit-learn; it is part of the Giotto family of open-source projects [16, 17].
We adapted the code for classifying handwritten digits [12, 16, 17] and applied it to the brain tumor dataset. The flow used to extract features from brain tumor images is shown in Fig. 2. Here the brain image is converted to a binarized image, and a radial filtration is obtained. The radial filtration is used to obtain cubical persistence, which further provides a
Fig. 2. Flow of topological feature extraction [12, 16, 17]



Fig. 3. Topological network representing the brain tumor dataset.

heat kernel from which amplitudes (features) are extracted. To learn about these steps in detail, refer to [12, 16, 17].
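A condensed sketch of this pipeline using giotto-tda, following its handwritten-digit example [12, 16, 17]; the binarization threshold, filtration center and heat-kernel sigma are illustrative assumptions, not the exact values used here:

import numpy as np
from sklearn.pipeline import make_pipeline
from gtda.images import Binarizer, RadialFiltration
from gtda.homology import CubicalPersistence
from gtda.diagrams import Scaler, Amplitude

# images: array of shape (n_samples, height, width) of grayscale brain MRIs
feature_pipeline = make_pipeline(
    Binarizer(threshold=0.4),                    # grayscale -> binary image
    RadialFiltration(center=np.array([20, 6])),  # radial filtration values
    CubicalPersistence(),                        # cubical persistence diagrams
    Scaler(),                                    # normalize the diagrams
    Amplitude(metric="heat",                     # heat-kernel amplitude:
              metric_params={"sigma": 0.15}),    # one feature per homology dim.
)
# features = feature_pipeline.fit_transform(images)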

3 Mapper

The advancement of image processing and deep learning creates hope for devising more enhanced applications that can be used for the early detection of breast cancer [18]. Many authors also use machine learning or deep learning for image classification, and TDA can play an essential role in machine learning and deep learning as well. In this section, we introduce topological data analysis using mapper. The mapper is an excellent tool for visualizing data, and that visualization may reveal hidden structures. The mapper algorithm became popular after its use in the identification of subgroups of breast cancers [19].
To implement this, we used KeplerMapper [20] and applied the mapper technique to the brain tumor dataset [10]. In Fig. 3, red clusters represent brain tumors, and the green cluster in the bottom-right corner represents no tumor. The green portion above the red cluster on the upper-left side also represents a tumor cluster; even though its distance from x_min is very small, it is still a tumorous cluster. The orange cluster has a minimum distance from x_min and also represents tumor clusters. The region on the upper-left side, green and orange from the center, is an interesting region that can be used to extract features: it is interesting because the upper green cluster represents a tumor cluster even though its distance from x_min is significantly small. In Figs. 4 and 5, nodes are colored according to the proportion of tumors and the cluster member distribution.
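A minimal sketch of how such a network can be produced with KeplerMapper [20]; the projection, cover and clustering parameters are illustrative assumptions, not the exact settings used for Fig. 3:

import kmapper as km
import sklearn.cluster

# X: (n_images, n_pixels) array of flattened brain MRI images
mapper = km.KeplerMapper(verbose=1)

# Project the point cloud to a low-dimensional lens.
lens = mapper.fit_transform(X, projection="l2norm")

# Build the topological network (simplicial complex).
graph = mapper.map(lens, X,
                   cover=km.Cover(n_cubes=10, perc_overlap=0.3),
                   clusterer=sklearn.cluster.DBSCAN(eps=0.5, min_samples=3))

# Write an interactive HTML visualization; nodes can then be colored,
# e.g., by the proportion of tumor images they contain.
mapper.visualize(graph, path_html="brain_tumor_mapper.html")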

Fig. 4. Topological network representing the brain tumor dataset & Nodes are colored according
to the proportion of Tumors

Fig. 5. Topological network representing the brain tumor dataset & with cluster member
distribution

4 Conclusion
A computer-aided automated expert system can reduce the overhead on medical experts and can increase diagnosis accuracy. The work proposed here mainly focuses on enhancing the performance of healthcare systems through a novel and effective feature extraction technique based on persistent homology, and on finding hidden patterns in given data using mapper. TDA is a fast, data-first, comprehensive approach that offers deep insights, is efficient, and may deliver valuable results in the coming years. As earlier research in the field of TDA shows, machine learning and deep learning work remarkably well with TDA. The mapper can also be used to extract features from images.

References
1. Bernstein, A., Burnaev, E., Sharaev, M., Kondrateva, E., Kachan, O.: Topological data analysis
in computer vision. In: Proceedings of SPIE 11433, Twelfth International Conference on
Machine Vision (ICMV 2019), p. 114332H, 31 January 2020. https://doi.org/10.1117/12.256
2501
2. Snasel, V., Nowakova, J., Xhafa, F., Barolli, L.: Geometric and topological approaches to big
data. Future Gener. Comput. Syst. 67, 286–296 (2016)
3. AYASDI White Paper: Understanding Ayasdi Core (2014)
4. Herman, D., Johnson, A.: Application of topological data analysis to target detection and
environment understanding, feature identification from noisy data. Final Technical report,
AYASDI (2014)
5. AYASDI White Paper: Topology and Topological Data Analysis (2014)
6. AYASDI White Paper: TDA and Machine Learning: Better Together (2016)
7. AYASDI White Paper: Clinical Variation Management using Ayasdi Care (2015)
8. Carlsson, G.: Topology and data. Am. Math. Soc. 46(2), 255–308 (2009)
9. https://www.quantmetry.com/blog/topological-data-analysis-with-mapper/. Accessed 8 Feb
2021
10. https://www.kaggle.com/navoneel/brain-mri-images-for-brain-tumor-detection. Accessed 10
Aug 2021
11. Munch, E.: A user’s guide to topological data analysis. J. Learn. Anal. 4(2), 47–61 (2017). https://doi.org/10.18608/jla.2017.42.6
12. arXiv:1910.08345 [cs.LG]
13. Patania, A.: Topological analysis of data. EJP Data Sci. Springer Open J. 6, 1–6 (2017)
14. Murphy, N.: Topological data analysis. Thesis (2016)
15. Ramamurthy, K.: Computing Persistent homology under random projection. In: IEEE
Workshop on SSP (2014)
16. Tauzin, G., et al.: giotto-tda: a topological data analysis toolkit for machine learning and data
exploration. arXiv:2004.02551 (2020)
17. https://giotto-ai.github.io/gtda-docs/0.5.1/library.html. Accessed 17 July 2021
18. Heenaye-Mamode Khan, M., et al.: Multi-class classification of breast cancer abnormalities
using deep convolutional neural network (CNN). PLoS One 16(8), e0256500 (2021). https://
doi.org/10.1371/journal.pone.0256500

19. Nicolau, M., Levine, A., Carlsson, G.: Topology based data analysis identifies a subgroup of
breast cancers with a unique mutational profile and excellent survival. PNAS 108, 7265–7270
(2011)
20. van Veen, H.J., Saul, N., Eargle, D., Mangham, S.W.: Kepler mapper: a flexible python
implementation of the mapper algorithm (Version 1.4.1). Zenodo, 14 October 2019. https://
doi.org/10.5281/zenodo.4077395
Object Detection Using Microsoft HoloLens
by a Single Forward Propagation CNN

Reza Moezzi1,2(B) , David Krcmarik1 , Jindřich Cýrus1 , Haythem Bahri3 ,


and Jan Koci1
1 Institute for Nanomaterials, Advanced Technologies and Innovation, Technical University of
Liberec, Liberec, Czech Republic
Reza.Moezzi@tul.cz
2 Faculty of Mechatronics, Informatics and Interdisciplinary Studies, Technical University of
Liberec, Liberec, Czech Republic
3 University of Grenoble Alpes, GIPSA-Laboratory, CNRS, Saint-Martin d’Hères, France

Abstract. In this paper, the application of the HoloLens mixed reality device developed by Microsoft is demonstrated for an enhanced, machine-learning-based object detection task. The most important objective of this implementation is to enhance object recognition for AR (Augmented Reality) using a machine learning algorithm on the HoloLens. This system helps the HoloLens user recognize and detect objects in the real world. To ensure proper object detection, the single-forward-propagation convolutional neural network algorithm YOLO is used. In line with the definition of YOLO (You Only Look Once), prediction over the whole scene and image is performed in a single run of the algorithm. The system delivers an annotation for each detected object and shows bounding boxes via the HoloLens. It can also identify the new position of a moving object within a few milliseconds. The results demonstrate an increased object detection rate compared with the other methods mentioned.

Keywords: Object detection · Mixed reality · HoloLens · YOLO · Machine learning

1 Introduction
Object detection in machine vision is the process by which a computer obtains information and identifies and detects objects in the camera’s field of view (FOV). In addition, the extracted information must be well organized and easy to use for the desired application. Object detection can overcome the flaws of outdated identity-validation methods and offers a more reliable and secure mechanism. Consequently, object recognition has been broadly employed in public security, industry, and manufacturing as a fundamental application of AI (artificial intelligence). Such a process reduces social and industrial costs to some extent and has improved management and service efficiency.


As human-computer interaction develops, it is necessary to provide more practical, intelligent, and easy-to-use experiences for users. Object detection can reduce the number of manual operations previously required of users, and it provides an innovative interaction method different from touching a screen with a finger. This allows freedom of hand movement and fewer physical interactions. Nevertheless, the majority of interactive computer devices still rely on the mouse, keyboard, and touch screen; these devices lack intelligence.
The benefit of recognizing objects with video-streaming observation systems has been demonstrated by developments in machine learning algorithms. These systems demand more than just a device that recognizes and detects objects: the device first stores the observed scene as a sequence of images and then applies a machine learning algorithm to deliver recognition outcomes and detect objects. With today’s advanced augmented reality (AR) technologies such as the Microsoft HoloLens, it is possible to visualize, recognize, and detect the objects in front of the user in the real world. This can enrich the quality of daily life and assist with human-computer interaction and environmental orientation.
Modern machine learning algorithms such as the Convolutional Neural Network (CNN), Region-based CNN (R-CNN), Fast R-CNN, and You Only Look Once (YOLO) can be implemented on the Microsoft HoloLens for object detection, enabling users to experience these machine learning algorithms in the AR domain. Some previous research employing YOLO algorithms is described in the following.
YOLO was introduced as a unified model by Redmon et al. [1] for object detection in 2016. In this approach, the neural network can be trained directly on full images with a single forward propagation. Unlike other methods, which repurpose classifiers, the entire model is trained jointly using a cost function that directly relates to detection performance. YOLO pushed the state of the art in the object detection field and was the fastest machine-learning-based vision detector, defining an architecture that delivers robust, real-time, fast object detection.
Redmon offered a new method in 2017, developing YOLO v2 and YOLO-9000 as the second generation of the YOLO algorithm for real-time detection systems. Redmon et al. [2] showed that YOLO v2 is considerably quicker than previous detection techniques across a range of detection systems, and it delivers the best results when run at different image sizes, providing an effortless tradeoff between accuracy and speed. YOLO-9000, on the other hand, offers real-time detection of more than 9,000 object categories, jointly optimizing classification and detection. As described in [2], a WordTree was used to merge information from various datasets in order to train on Microsoft COCO [3] and ImageNet [4] at the same time. When this hierarchical arrangement of classification datasets was applied in the segmentation and classification domains, the method produced more comprehensive output details for image classification. Redmon also applied joint training of the neural network to provide value across a range of visual tasks. This new YOLO provided a robust step toward closing the gap between classification and detection in terms of dataset size.

In 2018, Redmon et al. [5] developed a newer version of YOLO with some updates to their previous work, called YOLO v3. It introduced a set of small design changes to improve on YOLO v2; the network is slightly larger but more precise. Numerous studies have utilized the YOLO algorithms mentioned above to detect objects. For example, Benjdira et al. [6] present an experimental comparison between YOLO v3 and Faster R-CNN for car detection. To evaluate performance, five metrics were adopted: precision, recall, F1 score, processing time, and quality. Comparing the results of both algorithms shows that YOLO v3 is better than Faster R-CNN in precision sensitivity and is more efficient, capturing all vehicles in an image with 99% exactness. Similarly, YOLO v3 shows better performance than Faster R-CNN in terms of image-recognition processing time. A further study employed YOLO v3 for recognition in high-resolution images in the article by Liu et al. [7]. This research used an aggregated channel features region proposal (ACF-PR) stage together with a YOLO v3 network, and a post-processing phase was implemented to extract possible regions from the high-resolution image. Initially, Liu et al. [7] applied the ACF object detector to obtain candidates, and then an extension method and bounding-box merging were designed to join the bounding boxes into proper region proposals arranged for YOLO. Next, the study used YOLO v3 to perform the actual detection in the potential regions created by the aggregated channel features region proposal. Finally, a comparative study in [7] between Liu’s work and other methods on the public Tsinghua-Daimler Cyclist Benchmark (TDCB), together with the experimental results, revealed that the YOLO v3-based technique beats the others by 13.7% in correctness.
An upgraded version of the YOLO algorithm was presented by Derakhshani et al. [8] with respect to mean average precision (mAP), without conceding speed. In this study, the assisted excitation is reduced gradually to zero during the last stages of training. This procedure improved mAP by 2.2% for YOLO v3 and 3.8% for YOLO v2 on the Microsoft COCO dataset.
In this paper, an object detection technique is developed to recognize and detect objects using the HoloLens, applying YOLO on the server side. The algorithm processes information from the client (user) side. Furthermore, a TCP/IP protocol is used to allow the HoloLens to exchange data with the server. This transmission channel operates over a local network, and it can carry both the detection results and the streaming video.
The present paper is organized as follows: after this introduction, which covered related studies on YOLO, its extensions, machine learning algorithms and comparisons between previous works, the use of the HoloLens for augmented reality is overviewed in Sect. 2. In Sect. 3, our object detection system is described in detail, and its deployment on the Microsoft HoloLens and the effectiveness of YOLO are presented. In the fourth section, a primary assessment of the detection outcomes is discussed; finally, conclusions and possible future studies are offered in the last section.

2 Object Detection Using Augmented Reality


In recent years, Microsoft HoloLens has been used for object detection tasks by some
researchers, Wang et al. [9] has employed Microsoft HoloLens for a precise and effective
510 R. Moezzi et al.

manufacturing assembly fault detection, as an in/out unit. He applied Faster R-CNN


algorithm to acquire mass data extraction feature data for targeting recognition instead
of conventional manual inspection. In the training phases, 2000 object samples are used,
and the result showed the target detection achieves 85% in term of the mAP.
Eckert et al. [10] reviewed recent advances in mixed reality hardware and machine learning, which enable swifter improvement of assistive technologies. Their study proposes an approach that can simplify the daily life of blind or visually impaired persons; it is able to find basic objects without any prior training on the HoloLens. The model is intended to substitute for the damaged eye of the operator with technological sensors by visualizing the surroundings. It implements YOLO v2, which delivers a positional indication of the objects in the HoloLens FOV. The name of an observed object can be chosen by the user through voice commands and then serves as a guiding direction for the person. The depth distance of a chosen object is calculated using the HoloLens’s spatial modelling and stereo cameras. This wearable solution offers an excellent opportunity to localize objects inexpensively and to support orientation without massive client training.
The HoloLens has also been used with YOLO v2 for Building and Asset Information Models (BIM/AIM) by Naticchia et al. [11, 12]. Their investigations aim to provide technical aid across the building life cycle for facility management processes. Both cases showed significant improvement in terms of efficiency and accuracy in feasibility assessments performed by training a YOLO v2 neural network and perceiving through the AR device.

3 HoloLens Used for Object Detection

In this part, the main procedure of the system is described, ensuring proper object detection via the HoloLens with YOLO. As shown in Fig. 1, the software architecture is based on a server-client connection. The hardware used consists of two parallel NVIDIA Quadro P4000 GPUs (graphics processing units) on the server, which runs the YOLO algorithm. A HoloLens v1 and its built-in cameras are used on the client side as the input scene for object detection in the environment.
The built-in cameras are on the front of the HoloLens, which allows applications to observe what the user observes. The device’s developer mode makes it possible to access and control the camera input images for the desired processing and application runs. The Microsoft HoloToolkit library is imported into the Unity project to control gesture, gaze, cameras, and voice. Redmon’s darknet library [13] is deployed on the server side to run YOLO, which is implemented in the Compute Unified Device Architecture (CUDA) and C programming languages. All required components for the system are reviewed in Table 1.

Fig. 1. General architecture of the recognition system, client side and server side

Table 1. Required components for client and server side

Development requirements: Visual Studio (2017); Unity (2018.3.7f1); HoloToolkit (2017.4.3.0)
Server requirements: Linux Ubuntu (18.04); darknet (YOLO library); YOLO (v1, v2, v3); GPU (2× Quadro P4000); NVIDIA CUDA Toolkit (v10); NVIDIA cuDNN (v7.3); OpenCV (v3.4)
Client requirement: HoloLens (Version 1)

3.1 Server Side
YOLO is applied on the server side to recognize and detect objects, which is the most computationally demanding part of the system. The darknet library is responsible for feeding the data received from the client (HoloLens) into the server’s processing pipeline as the YOLO algorithm’s input.

For real-time detection, the system starts with the server requesting the input data from the HoloLens cameras on the client side. The frames are stored as sequences of RGB images with a resolution of 898 × 506. Once the designed application is deployed to the HoloLens, it can connect to the server via a unique IP address, so the server can receive the images and execute object detection with the YOLO network. A pre-trained YOLO network is used to return the results to the HoloLens client. The bounding boxes, the annotation and the box color for every recognized object are presented as the outcome of detection. In addition to these results, the probability that the detected object in the bounding box belongs to a certain class is evaluated and displayed with the annotation. The server sends the results to the HoloLens for each frame so that they can be used appropriately and efficiently; the HoloLens user can then see the detected objects in real time with the features mentioned. The object’s coordinates are passed to the Microsoft HoloLens in 2D by the server side, and the object’s 3D coordinates are computed on the device with respect to the 3D model produced by the Microsoft HoloLens and the received 2D data. It is important to mention that the 3D model is generated in real time, aided by the Microsoft HoloLens light cameras and the built-in TOF (Time of Flight) system; this is essential for computing the depth distance of a specific detected object. Thanks to the spatial mapping capability of the HoloLens, developers can estimate real-world coordinates easily; the scene is generated by mapping and scanning the surrounding environment with the RGB cameras and the TOF system. A TCP/IP connection, implemented in the C language, is used for server-client communication so that several clients can connect to the server.
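As an illustration of this exchange pattern only (the authors’ implementation is in C; the 4-byte length-prefix framing and the address and port below are assumptions), a client-side request could look like this in Python:

import socket
import struct

SERVER_ADDR = ("192.168.0.10", 9999)   # hypothetical server IP and port

def recv_exact(sock, n):
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("server closed the connection")
        buf += chunk
    return buf

def detect(jpeg_bytes):
    """Send one encoded camera frame; return the detection results (bytes)."""
    with socket.create_connection(SERVER_ADDR) as sock:
        # 4-byte length prefix (an assumed framing convention) so the
        # server knows where each image ends on the stream.
        sock.sendall(struct.pack("!I", len(jpeg_bytes)) + jpeg_bytes)
        size = struct.unpack("!I", recv_exact(sock, 4))[0]
        return recv_exact(sock, size)   # e.g. boxes, labels, confidences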

3.2 Client Side

On the client side, the HoloLens camera must capture the objects to be detected and recognized. A project is created in the Unity software and then deployed to the HoloLens via a Wi-Fi connection; in addition, the TCP/IP connection links the client side to the server side. The HoloToolkit package is used to access the central camera scene and to initialize the spatial mapping feature of the HoloLens, which builds a mesh map of the real surroundings. HoloToolkit also enables us to import gesture and gaze components to create an interactive virtual keyboard for entering the server IP address. Finally, the software shows an annotation for each detected object in the user’s scene from the HoloLens central camera. In Fig. 2, a scheme of the system’s major stages is illustrated.

4 Discussion and Results

The described methodology for recognizing and detecting objects via the HoloLens yields excellent detection results. YOLO v1, YOLO v2 and YOLO v3 are used to obtain results in terms of processing time and mean average precision. For higher efficiency in the detection procedure, a specific frame rate (fps, number of frames per second) was measured for each version of YOLO. Figure 3 demonstrates object detection via the HoloLens with YOLO, with an annotation header for every object including the object’s name and precision percentage.

(a) Application deployment on HoloLens and launch; (b) Server connection and YOLO launch; (c) Recognition of the object and its demonstration on HoloLens
Fig. 2. Conceptual diagram of the system in server and client side

Fig. 3. Object recognition by HoloLens with YOLO on the server screen

Figure 3 was captured from the server screen during the application run, using the output on the screen and the input from the HoloLens. Object recognition using the output and input frames on the HoloLens itself is presented in Fig. 4. Both outcomes were sent to the HoloLens simultaneously and shown on its screen, in exact conformance with the detection results, the detection precision ratio, and the object positions.

Fig. 4. Object recognition and detection in the client (HoloLens) screen


4.1 Detection Precision

The method’s performance is judged by the best precision achieved within the best time to recognize and detect an object. To measure precision, the three versions of YOLO are compared to confirm which one is the most accurate.

Table 2. Average-precision of object detection for different versions

Model Keyboard Bottle Phone Monitor Mouse Chair Person
YOLO v3 91.93 97.83 92.23 95.55 99 98.41 99
YOLO v2 89.3 66.72 78.2 83.45 87.11 65.87 87.55
YOLO v1 79.09 63.14 66.56 76.07 78.61 50.12 86.52

As shown in Table 2 and plotted in Fig. 5, the findings show the effectiveness of YOLO in terms of the average precision for various objects in our workplace. Based on the results, YOLO v3 gives good detection precision, higher than 90%. On the other hand, to verify the obtained results, it was necessary to calculate the mAP (mean average precision) of the different YOLO versions. As shown in Table 3, YOLO v3 reaches a mAP above 96%, whereas YOLO v1 and v2 do not even achieve 80%.

4.2 Processing Time

As a second validation factor, the processing time of this methodology is computed. Table 3 also presents the mean processing time of the three YOLO versions in terms of fps. YOLO v3 shows higher performance than YOLO v1 and v2, both in detection accuracy and in processing time: it reaches 5 fps while identifying all objects with high correctness via the HoloLens device.


Fig. 5. Percentage of precision with respect to different detected objects by different YOLO
versions

Table 3. Mean average precision and mean processing time of different YOLO versions

Measure YOLO v1 YOLO v2 YOLO v3
Mean average precision (mAP) 71.44 79.74 96.27
Frames per second (FPS) 4.2 4.6 5

5 Conclusion and Future Studies

In this article, we have demonstrated an interesting application of machine learning algorithms, specifically YOLO (You Only Look Once), on the Microsoft HoloLens augmented reality device for an object detection task. In our study, we designed a system with a desktop computer as the server and the HoloLens device as the client, connected via a TCP/IP channel. The system processes object recognition and detection for the HoloLens using a YOLO-based server in a real environment. The results show effective and accurate detection by the HoloLens, with 96% mean average precision and 5 fps detection processing time, using YOLO v3.
Future studies can be extended to use the AI coprocessor built into the HoloLens HPU as the central vision-processing chip. Implementing other deep neural networks can also be considered, as this is a fast-growing area in AI. Other possibilities include applying the described algorithm to different applications such as autonomous vehicles,

on which the authors are currently focused [14, 15], simultaneous localization and mapping (SLAM) problems [16], UAV control, etc.

Acknowledgment. This work was supported by the Student Grant Scheme at the Technical
University of Liberec through project nr. SGS-2021-3059 and by Ministry of Education, youth
and Sports of the Czech Republic and the European Union in the frames of the project “Mod-
ular platform for autonomous chassis of specialized electric vehicles for freight and equipment
transportation”, Reg. No. CZ.02.1.01/0.0/0.0/16 025/0007293.

References
1. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detec-
tion. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
pp. 779–788 (2016)
2. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, pp. 7263–7271 (2017)
3. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele,
B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014).
https://doi.org/10.1007/978-3-319-10602-1_48
4. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K. and Fei-Fei, L.: ImageNet: a large-scale
hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern
Recognition, pp. 248–255 (2009). https://doi.org/10.1109/CVPR.2009.5206848
5. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767
(2018)
6. Benjdira, B., Khursheed, T., Koubaa, A., Ammar, A., Ouni, K.: Car detection using unmanned aerial
vehicles: comparison between faster r-cnn and yolov3. In: Proceedings of the 1st International
Conference on Unmanned Vehicle Systems Oman (UVS), pp. 1–6 (2019)
7. Liu, C., Yu, G., Shuang, L., Faliang, C.: ACF based region proposal extraction for YOLOv3
network towards high-performance cyclist detection in high resolution images. Sensors
19(12), 2671 (2019)
8. Derakhshani, M.M., et al.: Assisted excitation of activations: a learning technique to improve
object detectors. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, pp. 9201–9210 (2019)
9. Wang, S., Ruifeng, G., Hongliang, W., Yuanjing, M., Zixiao, Z.: Manufacture assembly fault
detection method based on deep learning and mixed reality. In: Proceedings of the IEEE
International Conference on Information and Automation (ICIA), pp. 808–813 (2018)
10. Eckert, M., Matthias, B., Christoph, M.F.: Object detection featuring 3D audio localiza-
tion for Microsoft HoloLens. In: Proceedings of the 11th International Joint Conference on
Biomedical Engineering Systems and Technologies, vol. 5, pp. 555–561 (2018)
11. Naticchia, B., Corneli, A., Carbonari, A., Bonci, A., Pirani, M.: Mixed reality approach for the
management of building maintenance and operation. In: Proceedings of the International Sym-
posium on Automation and Robotics in Construction, vol. 35, pp. 1–8. IAARC Publications
(2018)
12. Corneli, A., Naticchia, B., Carbonari, A., Bosche, F.: Augmented reality and deep learning
towards the management of secondary building assets. In: Proceedings of the International
Symposium on Automation and Robotics in Construction, vol. 36, pp. 332–339. IAARC
Publications (2019)
13. Redmon, J.: YOLOv3. https://pjreddie.com/darknet/yolo/. Accessed 15 July 2021

14. Cyrus, J., Krcmarik, D., Moezzi, R., Koci, J., Petru, M.: Hololens used for precise position
tracking of the third party devices - autonomous vehicles. Commun. Sci. Lett. Univ. Zilina
21(2), 18–23 (2019)
15. Moezzi, R., Krcmarik, D., Bahri, H., Hlava, J.: Autonomous vehicle control based on
hololens technology and raspberry pi platform: an educational perspective. Pap. Present.
IFAC-PapersOnLine 52(27), 80–85 (2019). https://doi.org/10.1016/j.ifacol.2019.12.737
16. Moezzi, R., Krcmarik, D., Hlava, J., Cýrus, J.: Hybrid SLAM modelling of autonomous robot
with augmented reality device. Pap. Present. Mater. Today Proc. 32, 103–107 (2020). https://
doi.org/10.1016/j.matpr.2020.03.036
Significance of Dimensionality Reduction
in Intrusion Detection Dataset

Ghanshyam Prasad Dubey1(B) , Rakesh Kumar Bhujade1 , and Puneet Himthani2


1 Department of CSE, Mandsaur University, Mandsaur, MP, India
ghanshyam_dubey2@yahoo.com
2 Department of CSE, SISTec, Bhopal, MP, India

Abstract. Due to the extremely large size of datasets, it is sometimes difficult to process the whole dataset at once, multiple times, to obtain proper outcomes. The training time and the computing time are two major factors that affect the processing or execution of a complete dataset. Dimensionality reduction is one possible solution to overcome this problem, but selecting the proper method for dimensionality reduction is again a matter of concern. The selection of the relevant or desired feature subset from the existing features of the dataset is difficult; it can result in the loss of information or improper selection criteria, and irrelevant features may lead to inappropriate results. This paper incorporates three different dimensionality reduction techniques, namely Dense_FR, Sparse_FR and ACO_FR, where Dense_FR and Sparse_FR are based on mutual information and Kendall’s correlation coefficient, and ACO_FR is based on ant colony optimization. The reduced datasets are used for training four different IDS classifier models, namely KNN, SVM, NB and logistic regression. The results shown by these methods with existing classifiers are promising, as they reduce the processing time and improve the performance metrics.

Keywords: Dimensionality reduction · Intrusion detection systems · Feature selection · KDD dataset · Ant colony optimization · Mutual information · Correlation

1 Introduction
An Intrusion Detection System (IDS) is a tool or security mechanism used to investigate network activity and traffic and to detect a possible breach of security based on this analysis. Sometimes it generates alerts, and sometimes it takes proactive measures. An IDS is used to classify programs or actions that seek to undermine the confidentiality, integrity, and, most notably, availability of a network resource. It is a type of security mechanism that analyses the operation of a system and assists in the detection of undesirable behavior such as security breaches, theft of computing resources, data loss, and so on. With the advancement of networks and the internet, a slew of new security concerns emerges [1]. Traditionally, IDS can be classified into misuse detection and anomaly detection. Misuse-based IDS are highly efficient in the detection of already known attacks and have a very low false alarm rate, but they have the bottleneck of being unable to identify new

or unknown attack patterns with the same efficiency and performance. Anomaly-based IDS mainly build a model of normal behavior and look for deviations from that behavior [2]. According to the reaction to an intrusion, active intrusion detection systems (Active-IDS) and passive intrusion detection systems (Passive-IDS) are available. The classification based on analysis time (real-time vs. periodic intrusion detection) relates to how the tool conducts its investigation [3]. Based on location, IDS can be classified into three categories, namely host-based IDS, network-based IDS and hybrid IDS [4].
The larger size of Intrusion Detection Datasets, such as KDD CUP 99, NSL KDD,
and others results in a big challenge for building an effective and efficient training model
for machine learning. Reduced Dataset has a significant impact on the performance of
the algorithm or model being developed. Dataset with reduced size requires less Memory
as compared to overall Dataset; the model will also take less time in learning over the
reduced Dataset. This ultimately improves the efficiency, classification accuracy and
performance of the model. Feature Extraction and Feature Selection are two approaches
for applying feature engineering. Feature Extraction is the process of creating a new
feature set from existing features, while Feature Selection is the process of selecting a
subset of features from an existing Feature Set such that the resulting Feature subset is
appropriate to reflect the original Dataset and all of its insights [5]. Feature Selection is an
integral part of the process of Classification that plays a significant role in the elimination
of non-relevant, redundant and inconsistent Features; which ultimately improves the
Classification Accuracy of the algorithm. Relevancy and Redundancy are two important
parameters for the selection or removal of Features. Relevant Features are necessary for
the model to understand the pattern and take decisions; while Redundant Features are
highly correlated with each other. To get a consistent and best possible reduced Dataset,
it is necessary to keep all the Relevant Features and remove the Redundant ones [6].
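To make these two criteria concrete, the short sketch below (illustrative only, not code from the paper) scores relevancy as the mutual information of each feature with the class label and flags redundancy through pairwise Kendall correlation; the 0.9 threshold and all helper names are assumptions chosen for the example.

```python
# Illustrative sketch (not code from the paper): score relevancy as mutual
# information with the class label; flag redundancy via pairwise Kendall
# correlation above an assumed threshold of 0.9.
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

def relevancy_redundancy(X: pd.DataFrame, y, corr_threshold=0.9):
    relevancy = pd.Series(mutual_info_classif(X, y), index=X.columns)
    corr = X.corr(method="kendall").abs()
    cols = list(X.columns)
    redundant = [(a, b) for i, a in enumerate(cols)
                 for b in cols[i + 1:] if corr.loc[a, b] > corr_threshold]
    return relevancy.sort_values(ascending=False), redundant
```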
Feature Ranking and Selection play a significant role in improving the performance
of an IDS. Highly significant Features are taken into consideration for developing, train-
ing and testing the model, as they help in classifying the patterns as normal or malicious.
Feature Reduction helps in developing a simple model of Classification. It improves the performance of the model by developing an understanding of the Features and removing the redundant ones, reducing the size of the Dataset and improving its consistency, thereby leading to the development of a more effective and accurate Classification model [1]. This work offers three dimensionality reduction techniques, namely Dense Feature Reduction (Dense_FR), Sparse Feature Reduction (Sparse_FR) and Ant Colony Optimization Oriented Feature Reduction (ACO_FR). The three methods are tested over the KDD-99 dataset [7], which contains 41 features and a target attribute. Dense_FR reduces it to 20 features, Sparse_FR to 7 features, and ACO_FR to 28 features. The Dense_FR and Sparse_FR schemes are based on Mutual Information and Kendall's correlation coefficient; ACO_FR is based on swarm intelligence.
520 G. P. Dubey et al.

2 Literature Review
Feature Reduced IDS using ANN Classifier was proposed by Manzoor & Kumar [1].
They reduced the features based on entropy (information gain) and correlation. They
used 10% of the KDD-99 dataset for testing and training purposes. The features of the KDD-99 dataset are ordered by entropy value and by correlation separately, and the ordered features are formed into 3 groups: the top 10 in the first group, the next 20 in the second, and the remaining 11 least-valued features in the third. The common features of each group are taken separately and finally combined to obtain a complete set of reduced features. They removed 16 of the 41 available features, yielding a reduced dataset with 25 features and a target variable. A simple 3-layer Feed-Forward NN with 25 neurons in the input layer is implemented for training the model using the Back Propagation approach with the Levenberg–Marquardt training method. According to the results obtained, the detection accuracy of the model for various types of samples is 94.80% for Normal, 99.93% for DoS, 99.54% for R2L, 98.79% for Probe, and 19.7%
for U2R.
An Ant Colony Optimization (ACO) based Feature Selection approach for IDS was proposed by Mehmod & Rais [6]. The KDD CUP 99 Dataset is used for evaluating the performance of the proposed ACO-based Feature Selection IDS. SVM is used as a binary classifier for classifying an instance of the dataset as normal or attack. Results show that the proposed ACO-based Feature Selection IDS based on SVM performs better than the traditional SVM model. Chiba et al. [8] propose the implementation of an optimal anomaly-based NIDS using a Back Propagation NN. Correlation-based filters and Information Gain are used for identifying the essential features (Feature Selection). Categorical encoding of data is performed in the preprocessing stage. Min-max and statistical approaches are used for normalizing the data. The detection module is based on a Back-Propagation Neural Network. Salih & Abdulrazaq [9] propose a Feature Selection technique based on the principle of voting. They evaluated their results on KDD-99 with three methods: KNN, Naïve Bayes and MLP. With reduced datasets of six features, the three methods offer accuracies of 99% (KNN), 93% (Naïve Bayes) and 97% (MLP). The TPR values are 98.9, 93.3 and 96.5 for KNN, NB and MLP respectively, along with FPRs of 1.2, 6.7 and 3.5. The precision values are 98.9, 93.3 and 96.5.
Liu [10] suggested a PSO- and BPNN-based feature reduction scheme over the 41 features. For developing an IDS classifier model, the author proposes a PSO-based BPNN approach. The proposed PSO-BPNN approach outperforms the conventional BPNN approach and is exceptional in detecting attack types such as DoS, R2L, U2R, and Probe. The accuracy of PSO-BP is 98%, with a TPR of 98.24, an FPR of 1.76 and a precision of 98.2. Rais & Mehmood [11] proposed a Dynamic Ant Colony System with Three-
Level Update Feature Selection for Intrusion Detection. Here ACO is used for the Feature
Selection and SVM is used for the Classification of patterns as normal or attack. The
model attains an accuracy of approx. 98.7% when used as a binary classifier. Wan et al. [12] propose a technique for Feature Selection based on the Modified Binary Coded Ant Colony Optimization (MBACO) Algorithm, a combination of the Binary Coded Ant Colony Optimization (BACO) Algorithm and Genetic Algorithm ACO (GA-ACO). ACO is employed to identify the heuristic real-time IDS features, which are highly relevant and non-redundant for the identification of samples and
learning of the model. Relative Fuzzy Entropy acts as the heuristic factor for the selection of a feature as relevant and non-redundant in the reduced dataset. This approach reduces the training, testing and classification time of the IDS to a certain extent [13].

3 Proposed Methodology
Numerous researchers have already proposed various dimension reduction techniques. In this section, Dense_FR, Sparse_FR and ACO_FR are suggested for the same purpose. Dense_FR and Sparse_FR are based on mutual information and Kendall's correlation, while ACO_FR is based on ant behavior, i.e., swarm intelligence. The KDD-99 dataset has a total of 41 features and 1 target attribute; Dense_FR selects 20 features, Sparse_FR selects 7 features and ACO_FR selects 28 features from the 41 features individually [14]. Brief procedures for the proposed schemes are explained below.

3.1 Dense_FR
In this method, features are ranked according to MI and Correlation Coefficient in two
different lists separately. These two lists are further sub-divided into three parts each with
the top 30% features in part 1, the next 30% features in part 2 and the remaining 40%
features in part 3 respectively. Common features from adjacent parts are selected and
their union will result in the optimal feature subset. Let, M1, M2 and M3 be the sub-lists
based on MI and C1, C2 and C3 be the lists based on Kendall’s Coefficient, then the
intersection of M1 and C1, M2 and C2 and M3 and C3 will result in MC1, MC2 and MC3
respectively. Now, the optimal feature subset is computed by taking the Union of MC1,
MC2 and MC3. The approach is termed Dense_FR because it takes into consideration
features belonging to all the sub-lists, considering top-ranked, intermediate ranked and
low ranked features for identifying the optimal features required for target prediction
[14].
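A minimal sketch of this procedure is given below, assuming the features are ranked against the class label; the helper names (split_30_30_40, dense_fr) are illustrative choices, not the authors' implementation.

```python
# A minimal sketch of Dense_FR (illustrative, not the authors' code).
import pandas as pd
from scipy.stats import kendalltau
from sklearn.feature_selection import mutual_info_classif

def split_30_30_40(ranked):
    """Split a ranked feature list into top 30%, next 30%, remaining 40%."""
    a, b = int(0.3 * len(ranked)), int(0.6 * len(ranked))
    return set(ranked[:a]), set(ranked[a:b]), set(ranked[b:])

def dense_fr(X: pd.DataFrame, y):
    mi = pd.Series(mutual_info_classif(X, y), index=X.columns)
    tau = pd.Series({c: abs(kendalltau(X[c], y)[0]) for c in X.columns})
    M1, M2, M3 = split_30_30_40(mi.sort_values(ascending=False).index.tolist())
    C1, C2, C3 = split_30_30_40(tau.sort_values(ascending=False).index.tolist())
    # MC_i = M_i ∩ C_i; the optimal subset is the union of all three
    return sorted((M1 & C1) | (M2 & C2) | (M3 & C3))
```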

3.2 Sparse_FR
Sparse_FR initially proceeds like Dense_FR: after initialization of the dataset, the Mutual Information and Kendall's correlation coefficient values are calculated and ranked separately. The MI and correlation lists are then divided into 3 parts each, as M1, M2 and M3 and C1, C2 and C3, with the top 30%, next 30% and remaining 40% of attributes respectively. The common attributes are found from MC1 and MC2 only, and their union gives the final selected features [14].
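Reusing the helpers from the Dense_FR sketch above, Sparse_FR would differ only in dropping the third sub-list intersection; again an illustrative sketch, not the authors' code:

```python
# Sparse_FR sketch, reusing split_30_30_40 and the ranking code from the
# Dense_FR sketch above: only MC1 and MC2 are kept, MC3 is dropped.
def sparse_fr(X, y):
    mi = pd.Series(mutual_info_classif(X, y), index=X.columns)
    tau = pd.Series({c: abs(kendalltau(X[c], y)[0]) for c in X.columns})
    M1, M2, _ = split_30_30_40(mi.sort_values(ascending=False).index.tolist())
    C1, C2, _ = split_30_30_40(tau.sort_values(ascending=False).index.tolist())
    return sorted((M1 & C1) | (M2 & C2))
```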

3.3 ACO_FR
In this algorithm, swarm intelligence is used to classify the features of the dataset based on correlation and heuristic information. Starting with dataset initialization and the selection of artificial ants, the method identifies a subset of features based on a subset-size estimation scheme and lets the ants evaluate the heuristic information of the selected subset. After evaluating the selected subset, the evaluation criteria are applied to add selected features to the final list; the method then identifies and evaluates another subset, until all the features of the dataset are evaluated. The final feature selection list is updated according to the principles of pheromone updating and heuristic knowledge estimation [6, 14].
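The sketch below captures only the overall shape of such an ACO feature-selection loop; the actual pheromone-update and heuristic-estimation rules of ACO_FR are not reproduced here, and the ant count, iteration count and the KNN cross-validation fitness are assumptions for illustration (the default subset size of 28 simply mirrors the ACO_FR result reported above).

```python
# Generic ACO feature-selection loop (illustrative shape only, not ACO_FR's
# exact pheromone or heuristic rules). X is assumed to be a NumPy array.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def aco_feature_selection(X, y, n_ants=10, n_iters=20, subset_size=28,
                          rho=0.1, rng=None):
    rng = rng or np.random.default_rng(0)
    n = X.shape[1]
    pheromone = np.ones(n)
    best_subset, best_score = None, -np.inf
    for _ in range(n_iters):
        for _ in range(n_ants):
            p = pheromone / pheromone.sum()   # selection probabilities
            subset = rng.choice(n, size=subset_size, replace=False, p=p)
            # heuristic information: classifier accuracy on the candidate subset
            score = cross_val_score(KNeighborsClassifier(),
                                    X[:, subset], y, cv=3).mean()
            if score > best_score:
                best_subset, best_score = subset, score
        pheromone *= (1 - rho)                     # evaporation
        pheromone[best_subset] += rho * best_score  # reinforce best subset
    return np.sort(best_subset), best_score
```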

4 Results and Performance Evaluation

The KDD CUP 99 and NSL KDD intrusion datasets are used to test the proposed methods. The KDD CUP 99 Dataset is based on the DARPA Dataset and is one of the most widely used network datasets for IDS. It contains basic attributes of the TCP connection along with other attributes such as the number of failed logins. It contains approx. 20 million samples belonging to around 20 different types of attacks. The target can be binary, i.e., normal sample or attack sample, or multiclass, i.e., normal or any type of attack such as DoS, Probe, R2L and U2R [15, 16]. The proposed methods are tested for both binary and multiclass classification and also compared with the original datasets and
various predefined classifiers. The performance of any Classifier is characterized by a
Confusion Matrix comprising 4 components, namely, True Positive (TP), True Negative
(TN), False Positive (FP) and False Negative (FN). Based on these components, various
Performance Parameters are evaluated, which will justify the performance of an IDS
Classifier [17]. Accuracy, Precision, Recall and F1-Score are evaluated for analyzing
the performance of the binary classifiers; F1-Score, Hamming Loss and Jaccard Score
are computed to analyze the performance of the multi-class classification [18].
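For reference, all of these metrics are available in scikit-learn; a minimal sketch (assuming `y_true` and `y_pred` come from one of the trained classifiers, not from the paper itself) could look like:

```python
# Computing the reported metrics with scikit-learn (illustrative helper).
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, hamming_loss, jaccard_score)

def report(y_true, y_pred, multiclass=False):
    avg = "macro" if multiclass else "binary"
    scores = {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average=avg),
        "recall": recall_score(y_true, y_pred, average=avg),
        "f1": f1_score(y_true, y_pred, average=avg),
    }
    if multiclass:  # the multi-class analysis additionally reports these
        scores["hamming_loss"] = hamming_loss(y_true, y_pred)
        scores["jaccard"] = jaccard_score(y_true, y_pred, average="macro")
    return scores
```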

Table 1. Precision and recall for binary classification over KDD CUP 99 dataset

Dataset     | Precision (LR / NB / SVM / KNN) | Recall (LR / NB / SVM / KNN)
KDD [7]     | 97.33 / 99.93 / 99 / 99.74      | 95.55 / 67.41 / 98.26 / 99.71
IG Base [1] | 97.59 / 99.19 / 99.32 / 99.05   | 98.94 / 93.84 / 98.91 / 99.09
Dense_FR    | 96.66 / 97.6 / 98.73 / 98.66    | 97.8 / 96.37 / 98.97 / 99.07
Sparse_FR   | 95.14 / 93.93 / 98.53 / 98.32   | 96.33 / 95.17 / 97.97 / 98.35
ACO_FR      | 89.25 / 89.56 / 99.95 / 99.81   | 99.84 / 64.38 / 99.63 / 99.83

Table 1 presents the precision assessment of the proposed feature reduction methods along with the original KDD dataset (named KDD) and the Information Gain and correlation-based method (named IG Base) [1]. The reduced datasets produced by the methods are tested with LR [19], SVM [20, 21], KNN [22] and the Naïve Bayesian classifier (NB) [23]. Dense_FR, Sparse_FR and the ACO-based feature reduction scheme (ACO_FR) are proposed and tested against the existing methods.
As far as precision values are concerned, ACO_FR outperforms the other dimension reduction techniques for the KNN and SVM-based IDS classifier models. Concerning recall, Dense_FR performed well with Naïve Bayes (NB), with a recall of 96.37, while ACO_FR offers the best results with the other classifier models, having a recall above 99.6 with all of them.

Table 2. F1-Score and accuracy for binary classification over KDD CUP 99 dataset

Dataset   | F1-Score (LR / NB / SVM / KNN) | Accuracy (LR / NB / SVM / KNN)
KDD       | 96.43 / 80.51 / 99.12 / 99.73  | 94.34 / 73.87 / 98.6 / 99.56
IG Base   | 98.26 / 96.44 / 99.12 / 99.07  | 97.19 / 94.45 / 98.6 / 98.5
Dense_FR  | 97.23 / 96.98 / 98.85 / 98.86  | 95.53 / 95.19 / 98.15 / 98.18
Sparse_FR | 95.73 / 94.55 / 98.25 / 98.34  | 93.08 / 91.23 / 97.2 / 97.33
ACO_FR    | 94.25 / 74.91 / 99.79 / 99.82  | 89.15 / 71.48 / 99.67 / 99.71

Table 2 provides the analysis of F1-Score and Accuracy of the classifier models over
different variants of feature-reduced KDD CUP 99 datasets. It is clear that Dense_FR
is the best feature reduction technique for the Naïve Bayesian classifier having an F1-
Score of 96.98; however, ACO_FR is most relevant for SVM and KNN, having the
highest F1-Score of 99.79 and 99.82 respectively. Considering the Accuracy metric,
again Dense_FR technique is most suitable for the Naïve Bayesian classifier with an
accuracy of 95.19, while SVM and KNN have the highest classification accuracy for the
ACO_FR feature reduction technique with an accuracy of above 99 with both models.

Table 3. Precision and recall for binary classification over NSL KDD dataset

Dataset      | Precision (LR / NB / SVM / KNN) | Recall (LR / NB / SVM / KNN)
NSL KDD [15] | 80.45 / 88.26 / 68.51 / 97.13   | 74.4 / 65.87 / 97.75 / 99.94
IG Base      | 83.29 / 88.72 / 68.1 / 95.15    | 73.01 / 65.14 / 97.07 / 99.5
Dense_FR     | 85.06 / 81.9 / 80.57 / 96.49    | 78.51 / 86.66 / 96.08 / 83.85
Sparse_FR    | 86.24 / 88.48 / 79.64 / 97.87   | 77.87 / 74.21 / 97.35 / 80.56
ACO_FR       | 87.51 / 68.04 / 82.75 / 98.01   | 78.98 / 99.63 / 97.98 / 99.95

Table 3 shows the analysis of Precision and Recall for NSL KDD dataset and its
feature-reduced variants. ACO_FR has the best precision for SVM, KNN and Logis-
tic Regression classifier models. For Recall, ACO_FR outperforms the other feature
reduction techniques across all classifier models.
Fig. 1. Comparison of F1-Score and accuracy for binary classification over KDD CUP 99 dataset

Figure 1 shows the comparative analysis of F1-Score and the Accuracy for different
classifier models over different feature reduced KDD CUP 99 datasets.

Table 4. F1-Score and accuracy for binary classification over NSL KDD dataset

Dataset   | F1-Score (LR / NB / SVM / KNN) | Accuracy (LR / NB / SVM / KNN)
NSL KDD   | 77.3 / 75.44 / 97.44 / 81.29   | 74.65 / 75.07 / 97.02 / 73.59
IG Base   | 77.79 / 75.13 / 96.1 / 80.86   | 75.81 / 75.1 / 95.45 / 68.66
Dense_FR  | 81.65 / 84.21 / 96.28 / 82.17  | 79.61 / 81.26 / 95.54 / 78.91
Sparse_FR | 81.84 / 80.72 / 97.61 / 80.1   | 79.81 / 79.554 / 97.18 / 76.95
ACO_FR    | 83.03 / 80.86 / 97.99 / 90.54  | 81.24 / 68.02 / 97.7 / 85.34

Table 4 shows the analysis of F1-Score and Accuracy for the NSL KDD Dataset and its
different feature reduced variants. Dense_FR has the best F1-Score of 84.21 for the Naïve
Bayesian classifier; while ACO_FR is best for all other classifier models. For Accuracy,
again ACO_FR is the best feature reduction technique for all classifier models except for
Naïve Bayesian, which has the best accuracy of 81.26 with Dense_FR feature reduced
dataset.
Figure 2 shows the comparative analysis of F1-Score and Accuracy across different classifier models over the NSL KDD Dataset and its feature-reduced variants.

Fig. 2. Comparison of F1-Score and accuracy for binary classification over NSL KDD dataset

Jaccard Score is an excellent metric for evaluating the output of classifiers that can distinguish between multiclass and multi-category class data.

Table 5. F1-Score and Jaccard accuracy for multi-class classification over KDD CUP 99 dataset

Dataset   | F1-Score (NB / OVA (LR) / SVM / KNN) | Jaccard Accuracy (NB / OVA (LR) / SVM / KNN)
KDD       | 76.42 / 89.62 / 98.44 / 99.3         | 61.84 / 81.19 / 94.38 / 98.61
IG Base   | 89.54 / 96.54 / 98.25 / 98.18        | 81.06 / 93.11 / 96.55 / 96.36
Dense_FR  | 87.46 / 93.21 / 98.21 / 98.11        | 77.71 / 87.28 / 94.58 / 95.31
Sparse_FR | 88.3 / 90.35 / 97.46 / 97.48         | 79.05 / 82.4 / 92.23 / 93.2
ACO_FR    | 71.7 / 92.79 / 99.61 / 99.7          | 55.88 / 86.54 / 99.23 / 99.4

Table 5 represents the analysis of F1-Score and Jaccard Accuracy for Multi-Class
classification over KDD CUP 99 Dataset and its feature reduced variants. In terms of
both F1-Score and Jaccard Accuracy, ACO_FR is the most suitable dataset for both
SVM and KNN models.
Fig. 3. Comparison of F1-Score and Jaccard accuracy for multi-class classification over KDD CUP 99 dataset

Figure 3 represents the comparative analysis of F1-Score and Jaccard Accuracy


for multi-class classification over the KDD CUP 99 dataset. Table 6 represents the
comparative analysis of F1-Score and Jaccard Score for multi-class classification over the
NSL KDD dataset and its feature-reduced variants. Concerning F1-Score, the Sparse_FR
dataset is most suitable for the Naïve Bayesian classifier with an F1-Score of 70.02; while
Dense_FR is most relevant for the OVA (LR) classifier model having an F1-Score of
73.73. ACO_FR is best for the SVM and KNN classifiers. The Naïve Bayesian classifier has the best Jaccard Accuracy of 53.87 with the Sparse_FR dataset, while the OVA (LR) classifier has the best Jaccard Accuracy of 58.4 with the Dense_FR dataset. SVM and KNN have the best Jaccard Accuracy with the ACO_FR feature-reduced dataset.

Table 6. F1-Score and Jaccard accuracy for multi-class classification over NSL KDD dataset

Dataset   | F1-Score (NB / OVA (LR) / SVM / KNN) | Jaccard Accuracy (NB / OVA (LR) / SVM / KNN)
NSL KDD   | 64.34 / 64.96 / 79.61 / 96.04        | 48.48 / 48.1 / 58.49 / 92.38
IG Base   | 66.05 / 65.18 / 79.47 / 93.67        | 49.91 / 48.35 / 57.61 / 88.08
Dense_FR  | 67.64 / 73.73 / 84.6 / 97.43         | 51.65 / 58.4 / 63.57 / 92.67
Sparse_FR | 70.02 / 71.88 / 83.99 / 98.08        | 53.87 / 56.11 / 62.74 / 93.23
ACO_FR    | 68.96 / 69.87 / 85.02 / 98.28        | 51.7 / 53.7 / 65.72 / 95.94
Fig. 4. Comparison of F1-Score and Jaccard accuracy for multi-class classification over NSL KDD dataset

Figure 4 shows the comparative analysis of F1-Score and Jaccard Accuracy for multi-class classification over the NSL KDD dataset and its variants. Hamming Loss is another important performance metric used for evaluating the efficiency of multi-class classifiers; a lower Hamming value indicates higher accuracy. It is defined as the fraction of incorrectly classified samples.
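For N samples with true labels y_i and predictions ŷ_i, this definition can be written as:

```latex
\mathrm{HammingLoss} \;=\; \frac{1}{N}\sum_{i=1}^{N} \mathbb{1}\left[\hat{y}_i \neq y_i\right]
```

so for single-label multi-class prediction it equals one minus the classification accuracy.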

Table 7. Comparison of hamming loss for multi-class classification over KDD CUP 99 and NSL KDD datasets

Dataset   | KDD CUP 99 (NB / OVA (LR) / SVM / KNN) | NSL KDD (NB / OVA (LR) / SVM / KNN)
Original  | 23.58 / 10.38 / 1.56 / 0.7             | 35.66 / 35.04 / 20.39 / 3.96
IG Base   | 10.46 / 3.46 / 1.75 / 1.82             | 33.95 / 34.82 / 20.53 / 6.33
Dense_FR  | 12.54 / 6.79 / 1.79 / 1.89             | 32.36 / 26.27 / 15.4 / 2.57
Sparse_FR | 11.7 / 9.65 / 2.54 / 2.52              | 29.98 / 28.12 / 16.01 / 1.92
ACO_FR    | 28.3 / 7.21 / 0.39 / 0.3               | 31.04 / 30.13 / 14.98 / 1.72

Table 7 shows the comparison of Hamming Loss for different classifier models over different feature-reduced variants of the KDD CUP 99 and NSL KDD datasets. ACO_FR has the lowest Hamming Loss with the SVM and KNN classifiers for both datasets. Dense_FR over NSL KDD has the lowest Hamming Loss for the OVA (LR) model, while Sparse_FR over NSL KDD is the best feature reduction technique for the Naïve Bayesian classifier. In general, it is observed that the ACO-based feature reduction scheme (ACO_FR) performs well, not only in binary classification but in multiclass classification too.

5 Conclusion
High dimensionality is a big problem concerning datasets: they are collections of huge amounts of data scattered across numerous classes and features, and extremely high execution or computation power is required to process them. Another problem associated with larger datasets is the training time, along with the learning capacity of the system. To overcome the above-mentioned lacunae, dimensionality reduction is a practical solution.
This work postulates and thoroughly examines three feature reduction methods: Dense feature reduction (Dense_FR), Sparse feature reduction (Sparse_FR), and ACO feature reduction (ACO_FR). ACO_FR is based on heuristic information and correlation estimation, while Dense_FR and Sparse_FR consider mutual information and Kendall's coefficient. After applying the feature reduction techniques, the final reduced datasets were found to be computable, qualified, and appropriate for the IDS models: they were qualified in terms of accuracy, precision, and recall. The ACO_FR results show that the selected features retain a useful relationship for the overall prediction, reflected in the F1-Score. For dense and sparse features, Dense_FR and Sparse_FR both yield positive results. In terms of training time, Dense_FR and Sparse_FR require fewer resources than ACO and BPNN, but their acquisition time is longer.

References
1. Manzoor, I., Kumar, N.: A feature reduced intrusion detection system using ANN classifier.
Expert Syst. Appl. 88, 249–257 (2017)
2. Dubey, G.P., Bhujade, R.K.: Impact of ant colony optimization on the performance of network
based intrusion detection systems: a review. Int. J. Sci. Technol. Res. 8(9), 1830–1834 (2019)
3. Dubey, S., Dubey, J.: KBB: a hybrid method for intrusion detection. In: IEEE 2015
International Conference on Computer, Communication and Control (IC4), pp. 1–6 (2015)
4. Dubey, G.P., Bhujade, R.K.: Improving the performance of intrusion detection system using
machine learning based approaches. Int. J. Emerg. Trends Eng. Res. 8(9), 4947–4951 (2020)
5. Cai, J., Luo, J., Wang, S., Yang, S.: Feature selection in machine learning: a new perspective.
Neurocomputing 300, 70–79 (2018)
6. Mehmod, T., Rais, H.B.M.: Ant colony optimization and feature selection for intrusion detec-
tion. In: Soh, P., Woo, W., Sulaiman, H., Othman, M., Saat, M. (eds.) Advances in Machine
Learning and Signal Processing. LNEE, vol. 387, pp. 305–312. Springer, Cham (2016). https://
doi.org/10.1007/978-3-319-32213-1_27
7. KDD Cup 1999 dataset. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html. Accessed
July 2021
8. Chiba, Z., Abghour, N., Moussaid, K., El Omri, A., Rida, M.: A novel architecture combined
with optimal parameters for back propagation neural networks applied to anomaly network
intrusion detection. Comput. Secur. 75, 36–58 (2018)
9. Salih, A.A., Abdulrazaq, M.B.: Combining best features selection using three classifiers in
intrusion detection system. In: IEEE 2019 International Conference on Advanced Science
and Engineering (ICOASE), pp. 94–99 (2019)
10. Liu, R.X.: A computer network intrusion detection technology based on improved neural
network algorithm. Telecommun. Radio Eng. 79(7), 593–601 (2020)
11. Rais, H.M., Mehmood, T.: Dynamic ant colony system with three level update feature selection
for intrusion detection. Int. J. Netw. Secur. 20(1), 184–192 (2018)
12. Wan, Y., Wang, M., Ye, Z., Lai, X.: A feature selection method based on modified binary
coded ant colony optimization algorithm. Appl. Soft Comput. 49, 248–258 (2016)
13. Varma, P.R.K., Kumari, V.V., Kumar, S.S.: Feature selection using relative fuzzy entropy and
ant colony optimization applied to real-time intrusion detection system. Procedia Comput.
Sci. 85, 503–510 (2016)
14. Dubey, G.P., Bhujade, R.K.: Optimal feature selection for machine learning based intrusion
detection system by exploiting attribute dependence. Mater. Today Proc. (2021). https://doi.
org/10.1016/j.matpr.2021.04.643
15. Choudhary, S., Kesswani, N.: Analysis of KDD-Cup’99, NSL-KDD and UNSW-NB15
datasets using deep learning in IoT. Procedia Comput. Sci. 167, 1561–1573 (2020)
16. Aggarwal, P., Sharma, S.K.: Analysis of KDD dataset attributes-class wise for intrusion
detection. Procedia Comput. Sci. 57, 842–851 (2015)
17. Almseidin, M., Alzubi, M., Kovacs, S., Alkasassbeh, M.: Evaluation of machine learning
algorithms for intrusion detection system. In: 2017 IEEE 15th International Symposium on
Intelligent Systems and Informatics (SISY), pp. 000277–000282 (2017)
18. Toupas, P., Chamou, D., Giannoutakis, K.M., Drosou, A., Tzovaras, D.: An intrusion detection
system for multi-class classification based on deep neural networks. In: 2019 18th IEEE
International Conference on Machine Learning and Applications (ICMLA), pp. 1253–1258
(2019)
19. Gupta, G.P., Kulariya, M.: A framework for fast and efficient cyber security network intrusion
detection using apache spark. Procedia Comput. Sci. 93, 824–831 (2016)
20. Al Mehedi Hasan, M., Nasser, M., Pal, B.: On the KDD'99 dataset: support vector machine based intrusion detection system (IDS) with different kernels. Int. J. Electron. Commun. Comput. Eng. 4(4), 1164–1170 (2013)
21. Dubey, G.P., Gupta, N., Bhujade, R.K.: A novel approach to intrusion detection system using
rough set theory and incremental SVM. Int. J. Soft Comput. Eng. (IJSCE) 1(1), 14–18 (2011)
22. Benaddi, H., Ibrahimi, K., Benslimane, A.: Improving the intrusion detection system for
nsl-kdd dataset based on pca-fuzzy clustering-knn. In: 2018 6th International Conference on
Wireless Networks and Mobile Communications (WINCOM), pp. 1–6 (2018)
23. Gumus, F., Sakar, C.O., Erdem, Z., Kursun, O.: Online naive Bayes classification for network
intrusion detection. In: 2014 IEEE/ACM International Conference on Advances in Social
Networks Analysis and Mining (ASONAM 2014), pp. 670–674 (2014)
Comparative Analysis of Bioactive Compounds
for Euphorbia Hirta L. Leaves Extract
in Aqueous, Ethanol, and Methanol Solvents
Using GC-MS

Shikha Dixit(B) and Sugandha Tiwari

D.G.P.G College, C.S.J.M University Kanpur, Kanpur, India


dixitshikha0730@gmail.com

Abstract. The Objective: Euphorbia hirta L. is a traditional folk remedy used as an anti-inflammatory, anti-allergic, antidiarrheal, antiasthmatic, antidiabetic, antioxidant, and anti-tumor agent. In this study, the comparative chemical profiling of Euphorbia hirta L. leaves has been performed in ethanol, methanol, and aqueous extracts using GC-MS analysis.
Materials and Methods: 10 g each of powdered leaves were extracted separately in ethanol, methanol and aqueous solvents for the analysis.
Results: The experimental outcomes of GC-MS study of E. hirta L. leaves
witnessed the existence of 6 bioactive compounds in aqueous extract, 24 in ethanol
extract, and 36 in methanol extract. The molecular structure, formula, and weight
of the compounds are determined with the help of National Institute Standard and
Technology (NIST) database.
Conclusion: The experimental outcomes demonstrate that the methanolic extract of E. hirta L. leaves contains the highest number of bioactive compounds among the ethanol, methanol and aqueous extracts. Extraction and identification of these bioactive compounds is extremely important for validating the medicinal potential of the herb.

Keywords: Bioactive compounds · Biological activity · Euphorbiaceae ·


Euphorbia hirta L. · GC-MS

1 Introduction

Medicinal plants are the precious gift of the nature which plays a prominent role in curing
a number of diseases. The medicinal trait of these plants is marked due to the existence of
several bioactive compounds having varying structural arrangements and characteristics
[1]. Gezahegn et al. [2] reported that about eighty percent of the human health care system depends upon the traditional therapeutic activities of these compounds. More-
over, the modern drugs developed from the various plant sources are widely used in
pharmaceutical industry [3]. Furthermore, the current health problems, alarming rise in
communicable diseases and disorders have elevated mortality rate across the world [4].

The evolving scientific medical world is challenged by the increased rate of infectious
diseases and antimicrobial resistance among the various pathogens. However, these cir-
cumstances have also produced a number of opportunities for the researchers to develop
more efficient medical therapy [5]. Thus, the demand for medicinal plants and their products has increased vigorously all over the world. Moreover, crude drugs are pharmacologically active, cost effective, and least toxic, whereas synthetic drugs suffer from much adulteration and many side effects. Crude drugs also provide an uncomplicated medication compared to synthetic drugs [6]. However, the investigation of medicinal plants
and their therapeutic role mainly depends upon the available traditional data gathered
from the healers and local population [7, 8]. Therefore, the discovery and pharmaceutical
validation of the active compounds of important medicinal plants are inevitable.
In India, experiential evidence is generally the backbone of the majority of herbal drugs, which provides vast observational therapeutics and opportunities for reverse pharmacological investigations [9–11]. However, crude herbal drugs are still not
considered as natural drugs due to insufficient evidence, quality control, standardization,
and efficacy studies [12]. Furthermore, the crude drugs also exhibit some variations in
the chemical profile due to season of collection, growing location and storage conditions
[13]. Although various researches have already been done on safety of herbal medicines
and their formulations but still there is a need for validation and authentication of herbal
drugs which have high quality, safety and efficacy [14].
Euphorbia hirta Linn is an annual medicinal weed that belongs to family Euphor-
biaceae and generally known as milk weed, asthma plant, and cat’s hair. It occurs mostly
in the temperate and tropical parts of India, Africa, Asia, and Australia. It is a rud-
eral plant, which grows mostly in paddy fields, lowland, gardens, and roadsides [15].
E.hirta L. is a prominent medicinal plant among the tribal population used commonly to
heal wounds [16]. The preceding pharmacological analysis reported that E.hirta L. has
anti-inflammatory, antimicrobial, anti-asthmatic, antifungal, anti-parasitic, analgesic,
antipyretic, anti-cancer, anti-histaminic, anti-malarial, anti-diabetic, sedative-anxiolytic, diuretic, wound healing, and hepatoprotective properties [17, 18]. In addition, the plant
is widely used to treat cough, gastritis, diarrhoea, eye sores, and dengue fever [19].
It also exhibits galactogenic property, water consumption property, hepato-protective
property, herbicidal property, and wound healing property [19]. The remedial activities
of the plant are anticipated because of the existence of several phytoconstituents such
as phenolic compounds, lignans, tannins, flavonoids, alkaloids, glycosides, coumarins,
saponins, and triterpenoids [20, 21]. E. hirta L. is also a rich source of calcium, zinc, potassium and magnesium [22, 23]. Furthermore, quercetin, quercitrin, camphol,
flavonol glycoside xanthrhamnin, gallic acid, myricitrin, ellagic acid, maleic acid, α-
amyrin, β-amyrin, isocoumarin and hydroxytinnamic acid are some reported bioactive
phytocompounds [18, 20]. These bioactive compounds are investigated using GC-MS
analysis of the E.hirta L. leaves extract.
The present study aims to compare the phytocompounds in the ethanol, methanol,
and aqueous extract of E. hirta L. leaves through a convenient and reliable method. The
comparative results of this work may serve as a useful tool for establishing the medicinal
importance of the herb.
2 Materials and Methods


2.1 Procurement of Plant Material
Fresh plants of E. hirta L. were harvested from different areas of Kanpur district, U.P., between September and October 2019. The plant was cross-verified by the taxonomist Dr. Archana Srivastava, D.G.P.G. College, Kanpur. The voucher specimens were kept in the herbarium of the botany department, D.G.P.G. College, Kanpur. The leaves were segregated from the plant and cleaned properly with tap water 2–3 times. Further, the leaves were kept to dry in shade for 15 days. Thereafter, the dried leaves were powdered with the help of an electric grinder and stored in zip pouches. This research does not contain any studies with humans or animals.

2.2 Extraction of Plant


10 g of E. hirta L. leaves were dissolved in 100 ml each of ethanol, methanol, and water to get the crude extracts. The mixture was stored at room temperature for a duration of 48 h. Thereafter, it was filtered using Whatman No. 1 filter paper. Further, the solvents were removed from the filtrates to obtain the dry mass of the extracts [22, 23].

2.3 Gas Chromatography-Mass Spectrometry Analysis


The GC-MS analysis of the ethanol, methanol and aqueous extracts of E. hirta L. leaves was performed at IISER Bhopal, India. The experiments were performed on an Agilent 7890A GC system equipped with an Agilent 5975C MS system. The oven temperature was programmed from 50 °C to 280 °C at 4 °C/min with a 5 min hold, and the interface temperatures ranged from 250 °C to 280 °C. The carrier gas was set at a flow rate of 1.0 ml/min. Further, a sample of 0.2 ml was injected with a split ratio of 20:1. The NIST database with more than sixty-two thousand patterns was used for the interpretation of the mass spectra. The spectra of known compounds were matched against the NIST library software.

Fig. 1. Chromatogram in the aqueous extract of Euphorbia hirta L. leaves obtained from GC-MS

Fig. 2. Structure and mass spectrum of phytocompound obtained from GC-MS in aqueous extract

Fig. 3. GC-MS chromatogram of ethanol extract of Euphorbia hirta L. leaves

3 Results and Discussion

Now a day’s, majority of the drugs are produced from medicinal plants, directly or
indirectly. Medicinal plants have yielded various elements which can fight with different
diseases. Extraction and identification of bioactive compounds play crucial contribution
in controlling, reconstruction, quality, and development of modern drug formulations.
Moreover, pharmacological investigation of these plants also explores the natural toxins
534 S. Dixit and S. Tiwari

and their role in humans and animals. Therefore, the current study was performed to
recognize and compare various bioactive compounds in aqueous, ethanol, and methanol
extract of Euphorbia hirta L. leaves using GC-MS study. Compounds recognized in
aqueous, ethanol and methanol extract of Euphorbia hirta L. leaves have been listed in
Table 1, 2, and 3 respectively. The molecular formula, molecular weight, matching factor
(MF), reverse matching factor (RMF), and probability of each compound is presented
in the tables. The GC-MS chromatogram of E.hirta L. leaves in aqueous, ethanol, and
methanol extract have been presented in Fig. 1, 3 and 5 respectively whereas, mass
spectra of identified compounds have been presented in Fig. 2, 4 and 6 respectively.

Fig. 4. Structure and mass spectrum of Phytocompounds recognized by GC-MS in ethanol extract

Fig. 5. GC-MS chromatogram of methanol extract of Euphorbia hirta L. leaves

The comparative chemical profiling of E. hirta L. leaves witnessed the presence of 6 compounds in the aqueous extract, 24 in ethanol, and 36 in methanol. Out of these, 1 compound in the aqueous extract and 2 compounds in the ethanol extract showed a matching probability above 50%, while 4 compounds in the methanol extract had a matching probability above 20% when compared with the standard compounds of the NIST Library Data Bank. No compound in the methanol extract had a matching probability greater than 50%, and the matching probability of the remaining compounds was observed to be less than 10%. Furthermore, the dominant compounds ranged from C6 to C11 in the aqueous extract, C5 to C38 in the ethanol extract, and C10 to C39 in the methanol extract.

Fig. 6. Structure and mass spectrum of Phytocompounds identified by GC-MS in methanol extract

Table 1. Details of phytoconstituents recognized in the aqueous extract of Euphorbia hirta L. leaves

S.N. | Compound | Molecular Formula | Molecular Weight | MF | RMF | Probability
1 | 1,2-Benzenediol | C6H6O2 | 110 | 807 | 874 | 58.6
2 | Hydroquinone | C6H6O2 | 110 | 763 | 819 | 12.7
3 | Resorcinol | C6H6O2 | 110 | 758 | 813 | 10.2
4 | Phenol, 2-(1-methylethoxy)-, methylcarbamate | C11H15NO3 | 209 | 752 | 816 | 8.04
5 | 2-Isopropoxyphenol | C9H12O2 | 152 | 719 | 762 | 2.10
6 | Hydroquinone, acetate | C8H8O3 | 152 | 718 | 766 | 2.02

Table 2. Details of phytoconstituents recognized in the ethanol extract of Euphorbia hirta L. leaves

S.N. | Compound | Molecular Formula | Molecular Weight | MF | RMF | Probability
1 | 1,2,3-Benzenetriol | C6H6O3 | 126 | 732 | 864 | 62.3
2 | 1,2,4-Benzenetriol | C6H6O3 | 126 | 704 | 836 | 18.0
3 | Phenol, 2-[(1-methylpropyl)thio]- | C10H14OS | 182 | 643 | 718 | 3.13
4 | Phenol, 2-(butylthio)- | C10H14OS | 182 | 623 | 712 | 1.42
5 | Phenol, o-(tert-butylthio)- | C10H14OS | 182 | 622 | 697 | 1.37
6 | Pyrazole-5-carboxylic acid, 3-methyl- | C5H6N2O2 | 126 | 620 | 768 | 1.26
7 | 3,7,11,15-Tetramethyl-2-hexadecen-1-ol | C20H40O | 296 | 823 | 905 | 19.8
8 | 17-Octadecynoic acid | C18H32O2 | 280 | 779 | 785 | 4.29
9 | E-2-Tetradecen-1-ol | C14H28O | 212 | 769 | 793 | 3.03
10 | Phytol | C20H40O | 296 | 766 | 775 | 2.67
11 | 3-Eicosyne | C20H38 | 278 | 764 | 805 | 2.47
12 | Ethanol, 2-(9-octadecenyloxy)-, (Z)- | C20H40O2 | 312 | 762 | 765 | 2.27
13 | n-Hexadecanoic acid | C16H32O2 | 256 | 886 | 891 | 71.2
14 | l-(+)-Ascorbic acid 2,6-dihexadecanoate | C38H68O8 | 652 | 847 | 847 | 16.8
15 | Pentadecanoic acid | C15H30O2 | 242 | 788 | 843 | 3.03
16 | Palmitic anhydride | C32H62O3 | 494 | 785 | 786 | 2.68
17 | Isopropyl Palmitate | C19H38O2 | 298 | 763 | 764 | 1.06
18 | i-Propyl 14-methyl-pentadecanoate | C19H38O2 | 298 | 761 | 764 | 0.97
19 | Phytol | C20H40O | 296 | 818 | 826 | 43.0
20 | 3,7,11,15-Tetramethyl-2-hexadecen-1-ol | C20H40O | 296 | 767 | 813 | 8.67
21 | 1-Hexadecen-3-ol, 3,5,11,15-tetramethyl- | C20H40O | 296 | 766 | 799 | 8.34
22 | Isophytol | C20H40O | 296 | 755 | 776 | 5.72
23 | 6-Octen-1-ol, 3,7-dimethyl-, (±)- | C10H20O | 156 | 750 | 823 | 4.61
24 | Oxirane, hexadecyl- | C18H36O | 268 | 734 | 747 | 2.65
Table 3. Details of phytoconstituents recognized in the methanol extract of Euphorbia hirta L. leaves

S.N. | Compound | Molecular Formula | Molecular Weight | MF | RMF | Probability
1 | 1,3-Dioxane, 4-(hexadecyloxy)-2-pentadecyl- | C35H70O3 | 538 | 564 | 610 | 17.4
2 | N'-(2-Nitrobenzylidene)isonicotinohydrazide, N-tert.-butyldimethylsilyl- | C19H24N4O3Si | 384 | 555 | 614 | 12.7
3 | 1,3-Dioxane, 5-(hexadecyloxy)-2-pentadecyl-, trans- | C35H70O3 | 538 | 551 | 579 | 10.7
4 | 2-Benzo[1,3]dioxol-5-yl-8-methoxy-3-nitro-2H-chromene | C17H13NO6 | 327 | 541 | 610 | 7.54
5 | (5β)Pregnane-3,20β-diol, 14α,18α-[4-methyl-3-oxo-(1-oxa-4-azabutane-1,4-diyl)]-, diacetate | C28H43NO6 | 489 | 516 | 536 | 2.30
6 | 2,5-Dihydroxyacetophenone, bis(trimethylsilyl) ether | C14H24O3Si2 | 296 | 515 | 653 | 2.21
7 | Ethyl iso-allocholate | C26H44O5 | 436 | 555 | 601 | 6.13
8 | 2,7-Diphenyl-1,6-dioxopyridazino[4,5:2',3']pyrrolo[4',5'-d]pyridazine | C20H13N5O2 | 355 | 546 | 651 | 4.45
9 | Benzoic acid, 2,5-bis(trimethylsiloxy)-, trimethylsilyl ester | C16H30O4Si3 | 370 | 541 | 612 | 3.61
10 | 3,9-Epoxypregn-16-ene-14-18-diol-20-one, 7,11-diacetoxy-3-methoxy- | C26H36O9 | 492 | 549 | 557 | 8.61
11 | Testoster-3,11-dione, 9-thiocyanato-, acetate | C22H27NO4S | 401 | 536 | 625 | 5.56
12 | Pregnane-3,20-dione, 11-[(trimethylsilyl)oxy]-, bis(O-methyloxime), (5β,11β)- | C26H46N2O3Si | 462 | 535 | 579 | 5.34
13 | Cholestan-3-one, cyclic 1,2-ethanediyl acetal, (5β)- | C29H50O2 | 430 | 532 | 582 | 4.72
14 | 3,9β;14,15-Diepoxypregn-16-en-20-one, 3,11β,18-triacetoxy- | C27H34O9 | 502 | 531 | 557 | 4.54
15 | N-[3,5-Dinitropyridin-2-yl]proline | C10H10N4O6 | 282 | 528 | 687 | 4.01
16 | (5β)Pregnane-3,20β-diol, 14α,18α-[4-methyl-3-oxo-(1-oxa-4-azabutane-1,4-diyl)]-, diacetate | C28H43NO6 | 489 | 605 | 619 | 28.5
17 | Gibb-3-ene-1,10-dicarboxylic acid, 2,4a,7-trihydroxy-1-methyl-8-methylene-, 1,4a-lactone, 10-methyl ester, (1α,2β,4aα,4bβ,10β)- | C20H24O6 | 360 | 566 | 642 | 6.71
18 | 3,9-Epoxypregn-16-ene-14-18-diol-20-one, 7,11-diacetoxy-3-methoxy- | C26H36O9 | 492 | 555 | 565 | 4.60
19 | Octadecane, 1,1'-[1,3-propanediylbis(oxy)]bis- | C39H80O2 | 580 | 551 | 565 | 3.89
20 | Cyclohexane-1,3-dicarboxylic acid, 6-hydroxy-6-methyl-4-oxo-2-(2-phenylethenyl)-, diethyl ester | C21H26O6 | 374 | 540 | 611 | 2.66
21 | Cholestan-3-one, cyclic 1,2-ethanediyl acetal, (5β)- | C29H50O2 | 430 | 536 | 585 | 2.25
22 | Ethyl iso-allocholate | C26H44O5 | 436 | 634 | 657 | 20.3
23 | 4-Piperidineacetic acid, 1-acetyl-5-ethyl-2-[3-(2-hydroxyethyl)-1H-indol-2-yl]-α-methyl-, methyl ester | C23H32N2O4 | 400 | 623 | 651 | 13.9
24 | Propanoic acid, 2-(3-acetoxy-4,4,14-trimethylandrost-8-en-17-yl)- | C27H42O4 | 430 | 606 | 618 | 7.61
25 | 2,7-Diphenyl-1,6-dioxopyridazino[4,5:2',3']pyrrolo[4',5'-d]pyridazine | C20H13N5O2 | 355 | 595 | 680 | 5.01
26 | Prost-13-en-1-oic acid, 9-(methoxyimino)-11,15-bis[(trimethylsilyl)oxy]-, trimethylsilyl ester, (8.xi. | C30H61NO5Si3 | 599 | 582 | 610 | 3.24
27 | Cyclopropanebutanoic acid, 2-[[2-[[2-[(2-pentylcyclopropyl)methyl]cyclopropyl]methyl]cyclopropyl]methyl]-, methyl ester | C25H42O2 | 374 | 655 | 723 | 22.4
28 | Pentadecanoic acid, 13-methyl-, methyl ester | C17H34O2 | 270 | 630 | 733 | 6.84
29 | Hexadecanoic acid, 14-methyl-, methyl ester | C18H36O2 | 284 | 624 | 692 | 5.38
30 | Octadecanoic acid, 3-hydroxy-2-tetradecyl-, methyl ester | C33H66O3 | 510 | 617 | 642 | 4.12
31 | Hexadecanoic acid, methyl ester | C17H34O2 | 270 | 614 | 771 | 3.64
32 | Oxiraneundecanoic acid, 3-pentyl-, methyl ester, cis- | C19H36O3 | 312 | 613 | 662 | 3.50
33 | 9,12,15-Octadecatrienoic acid, 2,3-dihydroxypropyl ester, (Z,Z,Z)- | C21H36O4 | 352 | 750 | 819 | 20.2
34 | 8,11,14-Eicosatrienoic acid, methyl ester | C21H36O2 | 320 | 724 | 825 | 7.77
35 | n-Propyl 9,12,15-octadecatrienoate | C21H36O2 | 320 | 724 | 816 | 7.77
36 | 9,12,15-Octadecatrienoic acid, methyl ester, (Z,Z,Z)- | C19H32O2 | 292 | 723 | 801 | 7.47
4 Conclusion
This paper presents the GC-MS study of E. hirta L. leaves. Three extracts were used for the analysis, namely aqueous, ethanol, and methanol. A total of 66 compounds were confirmed in the study, of which 6 were reported in the aqueous, 24 in the ethanol and 36 in the methanol extract. The biological activities of the compounds with a higher probability of matching were also discussed. The presence of various novel active compounds in the leaves of Euphorbia hirta L. confirms its medicinal potential. The study provides information about the variation in the chemical constituents of E. hirta L. leaves across the aqueous, ethanol, and methanol extracts. The different chemical compounds reported in the three extracts may also serve a wide range of pharmacological activities. The current preliminary study concludes that investigation of these compounds in various therapeutic activities may create a fruitful path for modern crude drug research.

References
1. Asha, K.R., Priyanga, S., Hemmalakshmi, S., Devaki, K.: GC-MS analysis of the ethanolic
extract of the whole plant Drosera indica L. Int. J. Pharmacogn. Phytochem. Res. 9(5), 685–
688 (2017)
2. Gezahegn, Z., Akhtar, M.S., Woyessa, D., Tariku, Y.: Antibacterial potential of Thevetia
peruviana leaf extracts against food associated bacterial pathogens. J. Coast. Life Med. 3(2),
150–157 (2015)
3. Swamy, M.K., Sinniah, U.R.: A comprehensive review on the phytochemical constituents
and pharmacological activities of Pogostemon cablin Benth.: an aromatic medicinal plant of
industrial importance. Molecules 20(5), 8521–8547 (2015)
4. Lobo, D.A., Velayudhan, R., Chatterjee, P., Kohli, H., Hotez, P.J.: The neglected tropical
diseases of India and South Asia: review of their prevalence, distribution, and control or
elimination. PLoS Negl. Trop. Dis. 5(10), e1222 (2011)
5. Swamy, M.K., Sinniah, U.R., Akhtar, M.: In vitro pharmacological activities and GC-MS
analysis of different solvent extracts of Lantana camara leaves collected from tropical region
of Malaysia. Evid. Based Complement. Altern. Med. (2015)
6. Mujeeb, F., Bajpai, P., Pathak, N.: Phytochemical evaluation, antimicrobial activity, and
determination of bioactive components from leaves of Aegle marmelos. BioMed. Res. Int.
(2014)
7. Mohanty, S.K., Mallappa, K.S., Godavarthi, A., Subbanarasiman, B., Maniyam, A.: Eval-
uation of antioxidant, in vitro cytotoxicity of micropropagated and naturally grown plants
of Leptadenia reticulata (Retz.) Wight & Arn.-an endangered medicinal plant. Asian Pac. J.
Trop. Med. 7, S267–S271 (2014)
8. Dixit, S., Tiwari, S.: Investigation of anti-diabetic plants used among the ethnic communities
of Kanpur division, India. J. Ethnopharmacol. 253, 112639 (2020)
9. Vaidya, A.D., Devasagayam, T.P.: Recent advances in Indian herbal drug research guest editor:
thomas paul asir devasagayam current status of herbal drugs in India: an overview. J. Clin.
Biochem. Nutr. 41(1), 1–11 (2007)
10. Dixit, S., Tiwari, S.: Review on plants for management of diabetes in India: an ethno-botanical
and pharmacological perspective. Pharmacogn. J. 12(6s) (2020)
11. Calixto, J.B.: Efficacy, safety, quality control, marketing and regulatory guidelines for herbal
medicines (phytotherapeutic agents). Braz. J. Med. Biol. Res. 33(2), 179–189 (2000)
12. Bandaranayake, W.M.: Quality control, screening, toxicity, and regulation of herbal drugs.
Mod. Phytomed. 25–57 (2006)
13. Sarkar, M.K., Mahapatra, S.K., Vadivel, V.: Oxidative stress mediated cytotoxicity in leukemia
cells induced by active phyto-constituents isolated from traditional herbal drugs of West
Bengal. J. Ethnopharmacol. 251, 112527 (2020)
14. Ghosh, P., Ghosh, C., Das, S.: Botanical description, phytochemical constituents and phar-
macological properties of euphorbia hirta linn: a review. Int. J. Health Sci. Res. 9(3), 273–286
(2019)
15. Tuhin, R.H., et al.: Wound healing effect of Euphorbia hirta linn. (Euphorbiaceae) in alloxan
induced diabetic rats. BMC Complement. Altern. Med. 17(1), 423 (2017)
16. Al-Snafi, A.E.: Pharmacology and therapeutic potential of Euphorbia hirta (Syn: Euphorbia
pilulifera)-a review. IOSR J. Pharm. 7(3), 7–20 (2017)
17. Nyeem, M.A.B., Haque, M.S., Akramuzzaman, M., Siddika, R., Sultana, S., Islam, B.R.:
Euphorbia hirta Linn. A wonderful miracle plant of mediterranean region: a review. J. Med.
Plants Stud. 5(3), 170–175 (2017)
18. Lam, H.Y., Montaño, M.N.E., Sia, I.C., Heralde III, F.M., Tayao, L.: Ethnomedicinal uses of
tawatawa (Euphorbia hirta Linn.) in selected communities in the Philippines: a non-invasive
ethnographic survey using pictures for plant identification. Acta Med. Philipp. 52(5) (2018)
19. Mekam, P.N., Martini, S., Nguefack, J., Tagliazucchi, D., Stefani, E.: Phenolic compounds
profile of water and ethanol extracts of Euphorbia hirta L. leaves showing antioxidant and
antifungal properties. S. Afr. J. Bot. 127, 319–332 (2019)
20. Pandey, P., Tiwari, S.: Identification of different phytochemicals in methanolic extract of
Chenopodium album (L.) leaf through GC-MS
21. Abu Bakar, F.I., Abu Bakar, M.F., Abdullah, N., Endrini, S., Fatmawati, S.: Optimization of extraction conditions of phytochemical compounds and anti-gout activity of Euphorbia hirta L. (Ara Tanah) using response surface methodology and liquid chromatography-mass spectrometry (LC-MS) analysis. Evid. Based Complement. Altern. Med. 2020 (2020)
22. Basumatary, A.R.: Preliminary phytochemical screening of some compounds from plant stem
barks of Tabernaemontana divaricata Linn. used by Bodo community at Kokrajhar district,
Assam, India. Arch. Appl. Sci. Res. 8(8), 47–52 (2016)
23. Kim, M.G., Lee, H.S.: 1, 2-benzendiol isolated from persimmon roots and its structural
analogues show antimicrobial activities against food-borne bacteria. J. Korean Soc. Appl.
Biol. Chem. 57(4), 429–433 (2014)
An Efficient Edge Localization Using Sobel
and Prewitt Fuzzy Inference System (FIS)

Ali A. Al-Jarrah and R. Bremananth(B)

Sur University College, Sur, Oman


{aljarrah,bremananth}@suc.edu.om, bremresearch@gmail.com

Abstract. This paper addresses the problem of edge detection, aiming to overcome the artifact factors, mainly visual factors, that appear in traditional edge detection methods, using a Sobel and Prewitt Fuzzy Inference System (FIS). In this research, we propose a framework for Sobel and Prewitt FIS to enhance the edge detection process as compared to the traditional Sobel and Prewitt filter operations. Results show a considerable improvement in the edge detection process, especially on images with diversely varying intensity properties.

Keywords: Contrast · Edge enhancement · Edge localization · Fuzzy imaging ·


Membership function

1 Introduction
Fuzzy sets have many advantages for imaging, such as representation, interpretation, restoration, understanding, identification, and recognition. In frequency- and spatial-domain imaging, linear and non-linear transformation algorithms utilize diverse techniques such as image negatives, logarithmic transformations, power-law transformations, and piecewise linear transformations. The fuzzy membership function is employed to exhibit the edges' volume of pixels (voxels) under diverse factors such as illumination, lighting variation and others. The degree of the membership function is used to precisely assign the voxel to classes that directly represent partial membership of the different voxels. Image information can be represented at different levels with fuzzy sets, such as local, regional, or global, as well as under different forms, either numerical or symbolic. For instance, classifications based only on gray levels involve very local information at the pixel level [1, 2]. Image edges are exhibited with imprecision at several levels, such as imprecise limits between structures or objects, limited resolution, numerical reconstruction methods, and imprecision induced by filtering, for instance [3, 4]. Intensity variation analysis is a computational challenge for non-linear, linear and local intensity variation [5, 6].
In this paper, we propose a consistent framework for edge localization by introducing a new deep fuzzy-logic-based inference system which reproduces the image edge enhancements and enables edge localization. In this framework, the Fuzzy Sobel and Prewitt inference system learns local, global, qualitative, and quantitative enhancements for human perception and localizes active regions, respectively. Quality improvement is a process of subjective concern, especially in mobile imaging applications, which in turn is based on the personal feelings, tastes, perceptual factors, and impressions of the human users, whereas image localization is an objective process, not influenced by personal feelings or opinions in considering and representing facts, and not dependent on the mind for its existence, because it depends on factual, empirical, and verifiable information.
The remainder of this paper is organized as follows: reviews on fuzzy imaging are given in Sect. 2. Section 3 proposes the framework for fuzzy imaging. Section 4 illustrates the results and analysis of the proposed framework, and the concluding remarks are provided in Sect. 5.

2 Related Work

Color image enhancement using fuzzy logic is proposed in [1]. It utilized three membership functions: S-shape, sigma and trapezoidal. The fuzzy logic imaging technique requires fuzzification, a membership function for the application, and defuzzification. Estimation of the image enhancement was performed by a mathematical model designed from an explicit knowledge base and the experience of artists of natural images. Degrees of value from zero to one were determined by the fuzzy logic [1, 7]. In the paper [2], edge detection was performed by fuzzy logic and fuzzy set theory. Edge detection is an important process to locate the active region of interest (AROI) and to extract the unique features which are spread over the digital image. An edge is a set of connected outline pixels that appears at the intensity variation between foreground and background. The outline appears as an edge segment of the AROI whose intensity may be higher or lower than the background pixels [10]. Moreover, such pixels are considerably different from dynamic backgrounds in video processing. Edge detection is a most important function in imaging applications, and perfectly locating the boundaries of the AROI is a computationally challenging task. In the literature, Sobel, Roberts and Prewitt [3, 9] are the gradient-based edge detectors. The gradient is determined by the level of variance between the neighboring pixels and the background pixels. Pixel derivatives are often employed for the computation of gradients. In 2D imaging, a 3 × 3 convolution kernel is utilized for edge detection. If the convolution operation provides a value above the threshold given by the application, then the pixel is considered an edge.
The review papers [4, 12, 13] collected fuzzy image processing applications, covering fuzzy imaging work on image edge detection, filtering, segmentation, and classification. Type-2 fuzzy sets are classified into interval and general type-2 applications; a membership function, with a 2D or 3D formation respectively, is required for implementing these two types of fuzzy sets. Using a fuzzy set-theoretic framework, image ambiguity, uncertainty measures and other imaging issues were discussed [14, 15]. A flexible approach for choosing fuzzy membership functions was discussed for skeleton extraction, segmentation, edge detection, and feature extraction. In [5], a fuzzy algorithm was implemented for motion frame analysis, scene abstraction, detecting man-made objects from remote sensing images, and modeling of face images.
In [6], an image enhancement method was proposed to remove noise using fuzzy logic control; the algorithm mainly involves removing impulse noise and improving the contrast of the image. Fuzzy logic based image enhancement was proposed in [7, 9, 10]. This research addresses existing enhancement techniques that produce color artifacts and steadily reduce the intensity of the image; to resolve the problem, a fuzzy-based algorithm is utilized to boost the contrast in the images [15].
A method of image enhancement for mobile devices was proposed in [8]. In this research, a convolutional neural network was employed to predict the coefficients of a locally-affine model in bilateral space. It processes high-resolution images on a smartphone, provides a viewfinder at 1080p resolution, is trained offline from data, and does not need access to the original operator at runtime [8, 9, 11].
Referring to the aforesaid state-of-the-art techniques, we have proposed a consistent framework for a Fuzzy Inference System (FIS). Section 3 provides the details of our approach.

3 A Framework for Fuzzy Inference System (FIS)

3.1 Block Diagram of the Proposed Framework

We propose a consistent framework to provide edge-detected images from real-time images. Dimension reduction is performed to improve efficiency, and the pixel regions are converted to double-precision space, which is suitable for evaluation by the Fuzzy Inference System (FIS). Obtaining the image gradient for the FIS edge detector is a challenging task: it must locate the pixel discontinuities in the homogeneous regions of the objects present in the images. The image gradients of both the x and y axes are computed based on the Sobel and Prewitt operators, using 2D convolution to perform the image filtering. To define the Sobel and Prewitt FIS, we need fuzzy rules based on the image properties from the predefined knowledge base.
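As an illustration of this gradient stage (a sketch under an assumed NumPy/SciPy implementation, not the authors' code), the image is cast to double precision and convolved with the 3 × 3 Sobel or Prewitt kernels along both axes:

```python
# Gradient stage sketch: double-precision conversion plus 2D convolution
# with the Sobel/Prewitt kernels (illustrative implementation).
import numpy as np
from scipy.signal import convolve2d

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
PREWITT_X = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=float)

def gradients(image, kernel_x):
    """Return (Gx, Gy); the y-kernel is the transpose of the x-kernel."""
    img = image.astype(float) / 255.0   # double-precision pixels in [0, 1]
    gx = convolve2d(img, kernel_x, mode="same", boundary="symm")
    gy = convolve2d(img, kernel_x.T, mode="same", boundary="symm")
    return gx, gy
```

For example, `gx, gy = gradients(img, SOBEL_X)` would produce the gradient pair that Sect. 3.2 feeds into the edge-FIS.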
Figure 1 shows the block diagram of the proposed framework. It has an expert-system knowledge base with a set of Fuzzy Production Rules (FPR) which are based on the gradients of the Sobel and Prewitt operators. Due to noise, illumination, poor backgrounds, low contrast, and day and night light conditions, discontinuities occur in the boundaries of the subjects. To overcome these factors, the FPR are formed to determine, for each pixel, whether or not it belongs to an edge of the subject. They further ensure that the important properties of the edges are collected and that the continuity of the edges is maintained. The FPR play an important role in defining the FIS, transferring the gradients, adding inputs to the FIS and specifying the membership functions. The standard deviation of the membership functions is specified for both image axes; it adjusts the lower and higher sensitivities of the detected edges in the image. The intensity of the detected edges is specified by tuning the output of the FIS. The membership functions are also specified based on expert knowledge to adjust the edges of the images. Evaluating the FIS is an important process that computes the edges of the image for each pixel with respect to the Sobel and Prewitt gradients. Defuzzification is the final process, mapping the fuzzy values back to real-world pixel values to produce the resulting edge-detected images.
Fig. 1. Block diagram of the proposed framework for the fuzzy inference system. (Pipeline: real-time image → convert pixels to double precision for FIS → set Sobel and Prewitt gradients → 2D convolution to obtain gradients for the image → define Sobel and Prewitt FIS → transfer gradients into FIS → add the inputs to the FIS → specify Gaussian, Pi, sigmoid and Gaussian-2D membership functions → specify the standard deviation for the membership functions → add the output to the FIS → specify the intensity of the edge-detected image → evaluate the fuzzy inference system edge detector → defuzzification using centroid, MOM, SOM and bisector → fuzzy edge-detected image. Fuzzy rules are based on image properties from expert knowledge.)

3.2 Mathematical Model of Proposed Sobel and Prewitt FIS


Fuzzy logic is utilized to control the image edge detection process by formulating the mapping from the input image's gradients to the resultant image. The membership function is defined as given in Eq. (1):

I = \{((I_x, I_y), \mu_I(x, y)) \mid I_x \in X, I_y \in Y\},   (1)

where \mu_I(x, y) represents the membership function for the image's fuzzy set I. X and Y denote the sets of image elements mapped to [0.0, 1.0], i.e., \mu_I : X \times Y \to [0, 1]; the values are double-precision numbers of the image pixels.
To trace ruptures in identical regions of the subjects, image gradient computation is an essential process. The Sobel and Prewitt gradients are employed as in (2):

G = G_x + G_y, \quad G_x = S_x * I,\ G_y = S_y * I \ \text{(Sobel)}, \quad G_x = P_x * I,\ G_y = P_y * I \ \text{(Prewitt)},   (2)

where G denotes the combined gradient of the x and y axes, and G_x and G_y represent the x- and y-direction gradients, respectively. S_x, S_y, P_x and P_y denote the horizontal and vertical derivative approximations of Sobel and Prewitt, respectively.
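For reference, the standard 3 × 3 horizontal derivative kernels of Sobel and Prewitt, with the vertical kernels given by their transposes, are:

S_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix}, \qquad
P_x = \begin{pmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{pmatrix}, \qquad
S_y = S_x^{\top}, \quad P_y = P_x^{\top}.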
Both gradients' directions are computed using (3):

\theta_s = \tan^{-1}\!\left(\frac{G_y}{G_x}\right), \qquad \theta_p = \mathrm{atan2}(G_y, G_x),   (3)

where \theta_s and \theta_p are the Sobel and Prewitt gradient directions, respectively.
The image gradients are specified as the input to the edge-FIS using (4):

F_e = \mathrm{Add_{in}}(G, I),   (4)

where F_e denotes the edge-FIS and \mathrm{Add_{in}} is a function to include the image gradients in the FIS.
A set of Membership Functions (MF) is specified for each input; if G for a pixel is zero, then it belongs to the 'zero' MF with a degree of one. This is done using (5):

F_e = \mathrm{Add_{mf}}(G, I, MF),   (5)

where \mathrm{Add_{mf}} is a function to include an MF in the edge-FIS.
The standard deviation is specified for the MFs of the image gradients. The edge detector's behavior can be changed through the values of \sigma_x and \sigma_y using (6):

F_e = \mathrm{Add_{mf}}(G, I, MF, [\sigma_x\ 0], [\sigma_y\ 0]),   (6)

An initial value of 0.1 is set for \sigma_x and \sigma_y, respectively; raising it further formulates more sensitivity and improves the intensity of the detected edges.
The edge-detected image's intensity is specified for the output of the edge-FIS using (7):

F_e = \mathrm{Add_{o}}(I_o, I_L, I_H),   (7)

where I_o, I_L and I_H denote the output edge-detected image, the low intensity and the high intensity of the image, respectively.
An MF is included in the output of the edge-FIS using (8), with a set of predefined parameters:

F_e = \mathrm{Add_{mf}}(I_o, MF, MF(p_1, p_2, \ldots, p_n)),   (8)
The edge-FIS is evaluated using (9):

F_I = \sum_{i=1}^{n} \sum_{j=1}^{m} F_e[I_x, I_y],   (9)

where F_I denotes the result of the edge-FIS based on the image gradients, evaluated over all n × m pixels of the image.

Fig. 2. Membership function for image I_x, I_y gradients and its output using a triangular membership function (MF).
Figure 2 shows the fuzzy inference system: the membership functions of the gradients (X, Y) with degree of belief, and the image output with a triangular membership function (MF). This system has the fuzzy operators And, Or, Implication, Aggregation, and Defuzzification with the minimum, maximum, and centroid parameters, respectively. These parameters lead the edge detection process to enhance the surface filter, especially for poor edges on the subjects. The two input variables I_x and I_y have an input range and display range between −1 and 1. A Gaussian membership function (MF) is employed for the input variables. The combination of I_x, I_y and the fuzzy parameters is utilized to produce the output of the FIS, employed with a triangular MF. The FIS requires implementing fuzzy rules to make a set of inferences that estimate the pixel values on the objects' edges. The inference rules are: If (I_x = 0) && (I_y = 0) ⇒ (I_out = 1 (white)); If (I_x ≠ 0) || (I_y ≠ 0) ⇒ (I_out = 0 (black)). Normally, the inference rules and membership functions are combined to locate the edginess of the pixels exactly suitable for the brightness of the edges.
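A minimal Python sketch of this per-pixel inference, assuming NumPy, illustrative standard deviations for the Gaussian 'zero' MFs, and a simple weighted average in place of full Mamdani implication and aggregation, could look as follows (function and variable names are our own):

import numpy as np

def gauss_mf(x, sigma, c=0.0):
    """Gaussian membership function centered at c."""
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def fuzzy_edge(ix, iy, sx=0.1, sy=0.1, white=1.0, black=0.0):
    """Per-pixel fuzzy edge inference on gradient images ix, iy.
    Rule 1: Ix is zero AND Iy is zero -> Iout is white (no edge).
    Rule 2: Ix is not zero OR Iy is not zero -> Iout is black (edge).
    AND = minimum, OR = maximum."""
    mu_x = gauss_mf(ix, sx)                  # degree to which Ix is "zero"
    mu_y = gauss_mf(iy, sy)
    w1 = np.minimum(mu_x, mu_y)              # both gradients near zero
    w2 = np.maximum(1 - mu_x, 1 - mu_y)      # at least one gradient large
    return (w1 * white + w2 * black) / (w1 + w2 + 1e-12)

# Usage: iout = fuzzy_edge(gx, gy); darker pixels indicate edges.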
Initially, the given input image is converted into a set of fuzzy values, yielding a set of fuzzy pixels. These pixels are processed to sharply locate the suitable edges on the objects present in the images. The MF is not unique in determining the pixels on the edges: diverse factors such as illumination, lighting, and changes in the brightness of natural image intensities could produce diverse edge locations. In our approach, we employed the Sobel filter with FIS and the Prewitt filter with FIS. This approach provides a fusion for deciding the exact pixels that are appropriately suitable as the edge outcomes.

Fig. 3. FIS-2: I_x, I_y gradients with pi-shaped MFs and I_out with a generalized bell MF.
In the second FIS (see Fig. 3), the output is fine-tuned with a set of parameters and degrees of membership functions. In order to fetch the exact edges on the boundary of the objects, it is important to fine-tune the input MFs, which are based on a spline pi shape. It has four parameters defining the feet and shoulders of the pi shape. For the input variables I_x and I_y, the feet's left and right base points are set as [−0.6026, −0.2786] and the shoulders as [−0.07175, 0.1696]. I_out is tuned using a generalized bell MF, employed using (10):

MF(I; a, b, c) = \frac{1}{1 + \left|\frac{I - c}{a}\right|^{2b}},   (10)

where a, b and c define the width, vertical transition, and center of the MF, respectively.
The mean of maximum is used for defuzzification: the pixel with the highest MF value is obtained, and when more than one pixel has the maximum value, the mean of those maxima is acquired. This is performed using (11):

D_g = \frac{\sum_{i=1}^{N} \mu(MF(I_i))}{|N|},   (11)

where D_g represents the generalized defuzzification, MF(I_i) denotes the pixels with the highest MF values, and |N| is the cardinality of the set MF(I_i).
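For illustration, the defuzzification rules used across the proposed FISs (centroid, bisector, and the mean/smallest/largest of maximum) can be sketched over a discretized aggregated output set as follows (a sketch assuming NumPy; x is the output universe and mu the aggregated membership values):

import numpy as np

def defuzzify(x, mu, method='centroid'):
    """Defuzzify an aggregated fuzzy output (x: universe, mu: memberships)."""
    if method == 'centroid':                  # center of gravity
        return np.sum(x * mu) / np.sum(mu)
    if method == 'bisector':                  # splits the area into two halves
        half = np.cumsum(mu) >= np.sum(mu) / 2.0
        return x[np.argmax(half)]             # first index past half the area
    peaks = x[mu == mu.max()]                 # support of the maximum
    if method == 'mom':                       # mean of maximum
        return peaks.mean()
    if method == 'som':                       # smallest of maximum
        return peaks.min()
    if method == 'lom':                       # largest of maximum
        return peaks.max()
    raise ValueError('unknown method: ' + method)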
In the third FIS, the boundaries of objects affected by diverse illumination, which exhibit poor transitions of discontinuous pixels, are fixed. The input variables' MF is fine-tuned using (12):

MF(I; a, c) = \frac{1}{1 + e^{-a(I - c)}},   (12)

where a and c define the width and center of the transition area of the membership. The input variables' a and c are set as 30.59 and 0.1177, respectively.
The output MF is defined using (13):

MF(I; \sigma, c) = e^{-\frac{(I - c)^2}{2\sigma^2}},   (13)

where \sigma and c denote the standard deviation and mean of the Gaussian MF, respectively. The output variable's [\sigma, c] is set as [0.3058, 0.91]; this is shown in Fig. 4. Defuzzification of FIS-3 is performed with the smallest-of-maximum MF value. The fuzzy operators And, Or, Implication, and Aggregation are set with the product, probabilistic, product and probabilistic parameters, respectively.
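The membership-function shapes used across FIS-2, FIS-3 and FIS-4 follow directly from (10), (12) and (13); a short NumPy sketch of the three shapes (helper names are our own):

import numpy as np

def gbell_mf(x, a, b, c):
    """Generalized bell MF, Eq. (10): width a, slope b, center c."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def sigmoid_mf(x, a, c):
    """Sigmoidal MF, Eq. (12): slope a, center c."""
    return 1.0 / (1.0 + np.exp(-a * (x - c)))

def gauss_mf(x, sigma, c):
    """Gaussian MF, Eq. (13): standard deviation sigma, mean c."""
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))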
In the fourth FIS (see Fig. 5), the input variables I_x and I_y and the output I_out are defined using two-sided Gaussian MFs. The input variables' \sigma and c are set as 0.02847 and 0.08418, respectively, and the output variable's [\sigma, c] is set as [0.3469, 0.2384]. Defuzzification is employed with the bisector. And, Or, Implication, and Aggregation are employed with minimum, probabilistic, product, and maximum, respectively.

Fig. 4. FIS-3 Ix , Iy gradients difference sigmoid MF and Iout using two Gaussian MF.

4 Results Analysis for Fuzzy Inference System

The proposed framework was implemented in Matlab. The algorithms were tested on an Intel Core i5-3210M CPU with a 64-bit Windows operating system. Standard images were collected for testing the efficiency of the system. Initially, the prominence of the Fuzzy Inference Systems' MFs was tested, assessing how important, eminent, and noticeable the detected edges are through three-dimensional surface plots. Each plot depicts the I_out variable of the FIS surface against the input variables I_x, I_y. The surface plots of FIS-1, 2, 3 and 4 with degrees of membership are shown in Fig. 6.

4.1 Experiment I

Fig. 5. Membership functions for image I_x, I_y gradients and the output I_out using two Gaussian MFs.

Fig. 6. Surface plots of the output I_out of FIS-1, 2, 3 and 4 against the input variables I_x, I_y.

The surface plot of FIS-1 depicts sharp edges on the output against its inputs, which is more suitable for poor transitions between background and foreground images. The FIS-2 output surface plot shows that its MF produces an output that is important for locating active regions of interest when the foreground is captured with the same illumination as the background. Eminent edges are exhibited when the capture environment changes diversely; FIS-3's MF is more suitable in this circumstance. The FIS-4 output surface plot ensures that, based on the input variables, the noticeable pixels are prominently extracted with reference to the knowledge-base fuzzy rules. A gradient kernel of [−1 1] and its transpose were utilized to detect the edges of the images, implemented on FIS-1, 2, 3 and 4. This reveals prominent edges on the boundaries; however, 40% to 60% discontinuity occurs on the foreground images. Figure 7 shows the gradient output of the edge detection process using FIS-1, 2, 3 and 4. It further reveals gradual improvement in edge continuity with respect to changes of the input, fuzzy operators and output variables of the fuzzy inference systems.

Fig. 7. The gradient [−1, 1] output of edge detection process using FIS-1, 2, 3 and 4.
4.2 Experiment II
Figure 8 illustrates the proposed Sobel-Fuzzy Inference System-1, 2, 3 and 4 results. Images were convolved using the Sobel kernel and then fed forward to the proposed FIS to find important, eminent, and noticeable edge pixels on the foreground boundaries. Compared to the gradient-FIS, Sobel-FIS produced 80%–90% continuity of pixels on the objects' boundaries, which reveals that this framework can essentially be utilized by any computer vision application to localize active regions of interest. In addition, changing the MF and the fuzzy operators such as And, Or, Implication, Aggregation and Defuzzification provided improvements on the foreground edges and produced fuzzy edge-detected images. Knowledge-base fuzzy rules were also employed to enhance the noticeable edges.

Fig. 8. The Sobel-FIS-1, 2, 3, and 4 fuzzy edge-detection images.

4.3 Experiment III


The third experiment involved Prewitt-FIS result analysis. In this framework, images were convolved using the Prewitt kernel; the gradient is then fed forward to a sequence of steps in the framework. Compared to the traditional gradient-FIS, Prewitt-FIS produced from 85% to 90% continuity of pixels on the objects' boundaries of the same images employed for the analysis. Furthermore, altering the MF's fuzzy operators such as And, Or, Implication, Aggregation, and Defuzzification provided improvements on the foreground edges and produced prominent edge-detected images. A set of fuzzy rules was built to enhance the noticeable edges based on the knowledge base. Figure 9 shows the results of Prewitt-Fuzzy Inference System-1, 2, 3 and 4. It reveals that this framework can essentially be utilized by any computer vision and pattern recognition application to localize active regions of interest.

Fig. 9. The Prewitt-FIS-1, 2, 3, and 4 fuzzy edge-detection images.

4.4 Experiment IV

In the fourth experiment, we compared the results of traditional Sobel and Sobel-FIS-1. For that, MSE (Mean Square Error) and PSNR (Peak Signal-to-Noise Ratio) measures were employed to verify the edge-detected images. The same penguin images were utilized to compare the results. MSE and PSNR were 1621.05 and 16.03 for traditional Sobel, respectively, whereas Sobel-FIS-1 produced 0.074 and 59.39, respectively. Figure 10 depicts the comparison between these two resultant images. It reveals that the proposed Sobel-FIS-1 method produced an MSE 1620.976 lower and a PSNR 43.36 higher. From these results, the proposed method can easily be utilized for computer vision applications, especially region detection; tracking of multiple objects of the same subjects is also possible.
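For reproducibility, the two quality measures can be computed as follows (a sketch assuming NumPy and 8-bit reference images, so the peak value is 255):

import numpy as np

def mse(ref, test):
    """Mean square error between two equally sized images."""
    return np.mean((ref.astype(float) - test.astype(float)) ** 2)

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer images."""
    e = mse(ref, test)
    return float('inf') if e == 0 else 10.0 * np.log10(peak ** 2 / e)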
Fig. 10. Comparison between Sobel and Sobel FIS-1 PSNR and MSE.

4.5 Experiment V

The fifth experiment was conducted to verify the difference between traditional Sobel and Sobel-FIS-4. The two aforementioned measures were employed for the verification, and the same penguin images were utilized for this analysis. Sobel-FIS-4 produced 0.12501 and 57.1612 as MSE and PSNR, respectively. It reveals that the proposed Sobel-FIS-4 method produced an MSE 1620.92 lower and a PSNR 41.1284 higher. This discloses that 99% less MSE and 71.93% more PSNR were exhibited by Sobel-FIS-4, which in turn can be utilized by any pattern recognition or artificial intelligence system to detect multiple objects in a scene (see Fig. 11 for the resultant images).
Fig. 11. Comparison between Sobel and Sobel FIS-4 PSNR and MSE.

4.6 Experiment VI

The sixth experiment was conducted to validate the difference between traditional Prewitt and Prewitt-FIS-1. For this experiment, the standard rice image in Matlab was employed. MSE was 1878.6715 and PSNR was 15.3923 for traditional Prewitt, whereas for Prewitt-FIS-1, MSE was 0.0822 and PSNR was 58.9811. This discloses that an MSE 1878.5893 lower and a PSNR 43.5888 higher were produced by Prewitt-FIS-1, i.e., 99% less MSE and 73.88% more PSNR (see Fig. 12 for the results).
Fig. 12. Comparison between Prewitt and Prewitt-FIS-1 PSNR and MSE.

4.7 Experiment VII

In the seventh experiment, the differences between traditional Prewitt and Prewitt-FIS-4 were examined. The same MSE and PSNR measures were utilized for the verification, and the same rice images were used for the analysis. Prewitt-FIS-4 produced 0.10132 and 58.074 as MSE and PSNR, respectively. This discloses that the proposed Prewitt-FIS-4 produced an MSE 1878.57 lower and a PSNR 42.6817 higher, i.e., nearly 100% less mean square error and 73.49% more peak signal were exhibited by Prewitt-FIS-4. Figure 13 depicts the resultant images of both traditional Prewitt and the proposed Prewitt-FIS-4.
Fig. 13. Comparison between Prewitt and Prewitt-FIS-4 PSNR and MSE.

5 Conclusions

In this research paper, a consistent framework for edge detection using Sobel and Prewitt Fuzzy Inference Systems (FIS) was proposed, implemented and tested on standard images. The four proposed FISs overcome the visual-perception artifacts on the edges of objects that appear in traditional edge detection methods. Seven stages of experiments were performed to verify the prominence of the proposed framework. The proposed FIS found the important, eminent, and noticeable edge pixels on the foreground boundaries. Compared to the gradient-FIS, Sobel-FIS and Prewitt-FIS produced from 80% to 90% continuity of pixels on the objects' boundaries, which discloses that any video tracking, computer vision, imaging, pattern recognition, IoT-based video surveillance, or localization application can essentially utilize this proposed framework to enhance the localization of active regions of interest. Furthermore, changing the Sobel and Prewitt FIS's MF and fuzzy operators such as And, Or, Implication, Aggregation and Defuzzification provided improvements on the foreground edges and produced fuzzy edge-detected images effectively. In addition, expert knowledge-base fuzzy rules were also employed to enhance the noticeable edges. Experimental results demonstrate a considerable improvement in the edge detection process, especially on images with diversely varying intensity properties. This paper opens a new path for the edge detection process that can easily be adopted in existing computer vision applications. In further development, this proposed method will include more experts' knowledge bases to acclimatize to any kind of application without diverse changes in the system and to minimize the artifacts of illumination diversity.

Acknowledgment. The authors would like to thank the management of Sur University College
for providing the necessary support while carrying out this research in a successful manner.

References
1. Jeon, G.: Histogram-based color image transformation using fuzzy membership functions.
Int. J. Softw. Eng. Appl. 8(5), 63–72 (2014)
2. Peri, N.: Fuzzy logic and fuzzy set theory based edge detection algorithm. Serb. J. Electr.
Eng. 12(1), 109–116 (2015)
3. Zhang, Y., Han, X., Zhang, H., Zhao, L.: Edge detection algorithm of image fusion based on
improved Sobel operator. In: IEEE 3rd Information Technology and Mechatronics Engineer-
ing Conference (ITOEC), Chongqing, pp. 457–461 (2017). https://doi.org/10.1109/ITOEC.
2017.8122336
4. Castillo, O., Sanchez, M.A., Gonzalez, C.I., Martinez, G.E.: Review of recent type-2 fuzzy
image processing applications. Information 8(3), 97 (2017)
5. Pal, S.K.: Fuzzy image processing and recognition: uncertainty handling and applications.
Int. J. Image Graph. 1(2), 169–195 (2001)
6. Kundra, H., Aashima, E., Verma, M.: Image enhancement based on fuzzy logic. Int. J. Comput.
Sci. Netw. Secur. 9(10), 141–145 (2009)
7. Kaur, T., Sidhu, R.K.: Performance evaluation of fuzzy and histogram based color image
enhancement. In: Second International Symposium on Computer Vision and the Internet
(VisionNet 2015) (2015). Procedia Comput. Sci. 58, 470–477
8. Gharbi, M.: Deep bilateral learning for real-time image enhancement. ACM Trans. Graph.
36(4), 118.1–118.11 (2017)
9. Topno, P., Murmu, G.: An improved edge detection method based on median filter. In: IEEE
Proceedings of 2019 Devices for Integrated Circuit (DevIC), Kalyani, India, pp. 378–381
(2019). https://doi.org/10.1109/DEVIC.2019.8783450
10. Singh, N.V., Rani, A., Goyal, S.: Improved depth local binary pattern for edge detection of depth image. In: IEEE Proceedings of 2020 7th International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, pp. 447–452 (2020). https://doi.org/10.1109/SPIN48934.2020.9070820
11. Dong, X., Li, M., Miao, J., Wang, Z.: Edge detection operator for underwater target image.
In: 2018 IEEE 3rd International Conference on Image, Vision and Computing (ICIVC),
Chongqing, pp. 91–95 (2018). https://doi.org/10.1109/ICIVC.2018.849274
12. Li, J., Qiu, D., Liu, K., Yang, H.: A novel image edge detection method for workpiece based
on improved extreme learning machine and information measure. In: 2019 Chinese Automa-
tion Congress (CAC), Hangzhou, pp. 1592–1597 (2019). https://doi.org/10.1109/CAC48633.
2019.8996649
13. Mittal, M.: An efficient edge detection approach to provide better edge connectivity for image
analysis. IEEE Access 7, 33240–33255 (2019). https://doi.org/10.1109/ACCESS.2019.290
2579
14. Liu, Y., Xie, Z., Liu, H.: An adaptive and robust edge detection method based on edge
proportion statistics. IEEE Trans. Image Process. 29, 5206–5215 (2020). https://doi.org/10.
1109/TIP.2020.2980170
15. Ofir, N., Galun, M., Alpert, S., Brandt, A., Nadler, B., Basri, R.: On detection of faint edges
in noisy images. IEEE Trans. Pattern Anal. Mach. Intell. 42(4), 894–908 (2020). https://doi.
org/10.1109/TPAMI.2019.2892134
Application of Artificial Intelligence
in Recommendation Systems and Chatbots
for Online Stores in Fast Fashion Industry

Meshal Alduraywish1 , Bhuvan Unhelkar2 , Sonika Singh3 , and Mukesh Prasad1(B)


1 School of Computer Science, FEIT, University of Technology Sydney, Sydney, Australia
mukesh.prasad@uts.edu.au
2 Muma College of Business, University of South Florida, Tampa, USA
3 Marketing Discipline Group, Business School, University of Technology Sydney, Sydney,

Australia

Abstract. The competition in the fast fashion industry is getting more complex
as online-only retailing has emerged in the industry. The fast fashion online-only
companies rely on e-commerce platforms to merchandize products and provide
services to customers. Online-only companies use artificial intelligence (AI) and
machine learning (ML) through recommendation systems and chatbots to improve
business functions and enhance online customer shopping experience. Fast fash-
ion multichannel companies (online and physical stores) have a concern about
their physical stores’ performance, and the role that physical stores can play to
enhance customer experience. Multichannel companies are continuously looking
for new ways to optimize services and enhance customer experience in both chan-
nels. Physical stores can play a crucial role in enhancing customer experience. By
collecting data on in-store customers, their interests and interactions with prod-
ucts, multichannel companies can analyze such data using AI and ML to optimize
production, marketing, and customers experience. This paper highlights the appli-
cation of AI in ecommerce by fast fashion companies. Also, this paper discusses
the effect of fast fashion online-only companies on fast fashion multichannel com-
panies’ physical stores, aiming to understand the current impact of online-only
business on physical stores performance. This paper, further, discusses the future
role of fashion physical stores in the industry.

Keywords: Online-only companies · Multichannel companies · Artificial intelligence · Machine learning · E-Commerce · Fast fashion · Customer experience

1 Introduction

The fashion industry has witnessed a significant evolution due to internal and external
factors. Intense cultural expressions in streets, events, and other popular social settings
as well as televisions and the movie industry have significantly influenced the fashion
industry. The industry has witnessed an increased rate of fashion shows and catwalks that

have increased the rate of fashion awareness among customers. Customers have begun
adopting fast-paced living standards, leading to rapid changes in customers’ preferences
and demand. “There were high rates of expansion, intense competition, increased num-
ber of fashion seasons and increased manipulation in the supply chain that has led to
significant changes in the industry" [1]. Companies in the fashion industry have changed their pace of operations by compressing lead times, leading to an increased rate of demand satisfaction by making the right products available at the right time [2]. This change in the industry has led to the emergence of fast fashion. Fast fashion is a sector of the fashion industry whereby companies frequently update inventories with products that follow the latest fashion trends emerging in the market, adopt marketing approaches based on big data analysis, and respond to demand by offering the right products to the right audience at low prices [3].
Along with adopting flexibility by operating shorter, comprehensive, and agile supply chains, fast fashion online-only companies have adopted an e-commerce retailing model that helps them reach a broad customer base and improve business functions. "E-commerce refers to the deals of buying and selling products and services on the internet" [4]. The growth of e-commerce has required online companies to use advanced technologies such as AI and ML to improve business functions and understand customers, resulting in an enhanced customer experience. Companies use AI and ML through recommendation systems to analyze customer behavior and search history to predict what customers prefer [5]. The use of AI and ML through recommendation systems saves customers time, as customers can find preferred products quickly and easily [6]. The application of AI and ML in e-commerce is also found in chatbots, which provide automated communication through different forms including text, voice, and pictures [7]. AI and ML help companies automate service processes, which reduces costs and enhances service quality, resulting in an enhanced customer experience [7].
In-store customer experience is a core objective for fast fashion multichannel companies. The growth of in-store fast fashion businesses is aligned with the continuous understanding of customer behavior [8]. Technology is an elemental key in enabling fast fashion businesses to understand and anticipate customers' needs through in-store consumer behavior analysis. According to [9], technology has transformed the art of shopping, with objects becoming the center of operations, enhancing service delivery by speeding up processes, minimizing errors, and incorporating flexible organization systems. The major concern for fast fashion multichannel companies is the effect of e-commerce on their physical stores' performance. Even though multichannel companies maintain online stores, their business, specifically physical stores, has been affected, and many multichannel companies have announced the closure of their physical stores, including H&M, GAP, Abercrombie & Fitch, and Forever 21.
This paper is divided into three sections. The first section presents the adoption of e-commerce by fast fashion companies and the use of AI and ML in e-commerce platforms. The second section discusses the effect of e-commerce on multichannel companies' physical stores. The last section discusses the future role of multichannel companies' physical stores.
2 Ecommerce and Fast Fashion


The internet has become a marketplace wherein customers can find, compare, and purchase fashion goods according to their preferences. Customers use fashion e-commerce platforms as an alternative to traditional commerce. Traditional commerce is defined as the buying and selling of goods and services with physical interaction
between sellers and buyers [10]. “E-commerce, on the other hand, is the purchasing and
selling of goods and services on the internet without physical interaction between sellers
and buyers” [10]. Increased online shopping for fashion-related goods has transformed
fashion companies’ activities as companies use online platforms as a channel to sell and
market products. Companies use data analytics tools to analyze data from e-commerce
platforms to design customized marketing campaigns and target a specific audience.
Companies collect data on customers from e-commerce platforms along with fashion
data that fashion companies generate, to personalize products and services to enhance
online customer experience. Big data and data analytics tools such as AI and ML have
become a source of competitive edge in the fast fashion industry (Table 1).

Table 1. Comparison between traditional commerce and e-commerce.

Definition. Traditional commerce: physical stores that buy and/or sell products and services. E-commerce: online stores that buy and/or sell products and services over the internet.
Mode. Traditional commerce: direct and physical, face-to-face interaction with customers. E-commerce: electronic interaction between sellers and buyers (online transactions).
Availability. Traditional commerce: physical stores have a limited time of operation based on each country's regulations. E-commerce: online stores are not obligated to certain operation times.
Product inspection. Traditional commerce: customers can check and/or try products before purchasing. E-commerce: customers cannot check and/or try products before purchasing.
Presence. Traditional commerce: physical stores have limited presence due to the business strategy of expansion, budgeting, and business type. E-commerce: online stores are based on the internet and can be reached from anywhere in the world.
Establishment and maintenance. Traditional commerce: physical stores require more investment to establish and maintain attractive stores. E-commerce: online stores require less investment than physical stores to establish and maintain the platform.
Payment methods. Traditional commerce: physical stores accept traditional payment methods such as credit cards, cash, cheque, etc. E-commerce: online stores accept traditional payment methods along with a variety of online payment methods such as PayPal, Amazon Pay, etc.
Delivery of goods. Traditional commerce: physical stores provide instant delivery of goods. E-commerce: online stores deliver goods to customers some time after the purchase is made.
E-commerce sales were valued at 4.28 trillion USD in 2020 and are estimated to reach 5.4 trillion USD in 2022 [11]. "57% of the international internet users purchased fashion-related items via the internet, which made the apparel segment the most common online shopping category globally" [11]. 24% of global fashion revenue is projected to be produced online by 2024, as shown in Fig. 1 [11]. The growth of online store sales, in general, is attributed to the integration of shopping functionality into the content displayed on social media and the improvement in mobile browsing [11].

Fig. 1. Global fashion sales and fashion online sales in 2024.

In a global comparison of online fashion revenue, 55% of fashion revenue in China ($221 billion USD) is going to be produced online by 2024, which makes China's fashion online sales the largest fashion online market worldwide [12]. 26% of the fashion revenue in the United States ($43 billion USD) is going to be generated online by 2024 [13]. 29% of the fashion revenue in the U.K. ($10.4 billion USD) is projected to be produced online by 2024 [14]. 21% of the fashion revenue in Germany ($5.7 billion USD) is anticipated to be generated online by 2024 [15]. 17% of the fashion revenue in Japan ($4.6 billion USD) is anticipated to be produced online by 2024 [16]. Figure 2 represents the 5 largest fashion online markets around the world in 2024.

Fig. 2. Global comparison for the fashion online revenue of the largest 5 fashion markets in 2024.
The vast usage of e-commerce for fashion has significantly revolutionized the fast fashion industry, and it has become an integral part of business development. The fast fashion industry has transformed as fashion companies with a physical store presence expand their business to the online environment (multichannel: physical and online stores) and newcomers enter the market with online-based businesses [17]. Based on these two business types, there are three proposed online-based market entries for fast fashion companies: entering a host market with a physical store first and then expanding to an online store (multichannel), entering a foreign market with an online store and then expanding to a physical store (multichannel), and/or entering a market with an online store only (e-commerce) [17], as described in Table 2.

Table 2. Developmental process of online-based internationalization [17].

– High ownership-specific advantages: strong multinational knowledge and experience and brand identity [17].
– High location-specific advantages: psychic distance among countries such as language, legal and political systems, and educational levels [17].

Type 1: Companies with high ownership-specific advantages and high location-specific advantages generally establish the business with a physical store first and then expand the existence of the business to online [17]. Although intercontinental companies enter markets with physical stores first, they would not expand their physical existence to the rural areas of the same country due to demand uncertainty there [17]; they prefer to expand to online stores to reach a wider base of customers in rural areas [17].
Type 2: Companies that have either low ownership-specific advantages or low location-specific advantages establish an online business at first and then expand the business to physical stores after increasing the ownership or location advantages [17]. This strategy is followed because it requires less investment, which reduces risks for companies [17]. After lowering the risk by gaining knowledge and experience of the market, and once customers become aware of the brand, companies expand to physical stores [17].
Type 3: Companies that have low ownership-specific advantages and low location-specific advantages establish an online-only business because of the high uncertainties and risks [17]. This type of business has a wide range of assortment, which makes it difficult to open physical stores attractive to customers [17]. Restrictive government policies also play a role in preventing this type of business from establishing physical stores in foreign countries [17].
2.1 Application of AI and ML in Ecommerce

Intense competition in the fast fashion industry makes companies look for ways to leverage the latest technology to create a niche and gain a competitive advantage in the market. Companies seek different ways to create demand and build brand awareness to help them become leading players in the industry [18]. Companies constantly look for new ways that help them sell their products before they go out of fashion. "AI and ML continue to change the way companies do business across several sectors to boost their sales growth and optimizing e-commerce operations" [5]. Artificial Intelligence is a technology with the capability to develop theoretical methods, technologies and applications that efficiently simulate and extend human intelligence [5]. Machine learning is the application of technology to recognize patterns, enhancing data mining and probability theory and leading to useful statistical outputs. "Artificial intelligence and machine learning emphasize more on computation, perception, reasoning, and action" [4].
Online companies increasingly use AI and ML to improve sales processes and
enhance online customers’ shopping experience. AI and ML provide companies with
different applications in e-commerce that enable companies to analyze large datasets
regarding customer behavior and usage patterns. Online companies use recommendation
systems to create a holistic shopping experience for their customers. “Such recommen-
dation systems use machine learning algorithms to achieve deep learning and analysis of
customer behaviors, thereby making it easier for the companies to analyze large datasets
and effectively predict types of products that customers find more attractive” [5]. “Al-
gorithms in recommendation systems analyze customers’ searches and record crucial
details” [6]. “After obtaining the key details, the recommendation system analyses avail-
able data and displays relevant suggestions to customers, thereby making it easier for
customers to find products quickly without wasting time" [6]. AI and ML enhance the effectiveness of recommendation systems, making them comprehensive systems that depend on human-computer interaction [6].
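To illustrate the pattern-matching step such systems rely on, a minimal item-based collaborative-filtering sketch is shown below (Python with NumPy; the rating matrix and function names are hypothetical illustrations, not any company's production system):

import numpy as np

def recommend(ratings, user, k=3):
    """Item-based collaborative filtering over a user-by-item rating matrix.
    Scores unseen items by their cosine similarity to items the user has
    already interacted with, then returns the indices of the top-k items."""
    norms = np.linalg.norm(ratings, axis=0) + 1e-9
    sim = (ratings.T @ ratings) / np.outer(norms, norms)  # item-item cosine
    scores = sim @ ratings[user]               # weight by the user's history
    scores[ratings[user] > 0] = -np.inf        # skip already-seen items
    return np.argsort(scores)[::-1][:k]

# Usage: top_items = recommend(ratings_matrix, user=0, k=3)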
Companies like Amazon and Walmart have implemented AI and ML in e-commerce. The rapid adoption of technology and the inevitable need to compete have made several other e-commerce retailers invent and implement AI and ML in their operations [18]. For instance, Alibaba extensively uses AI and ML to improve its e-commerce retail system and processes. Alibaba uses "Taobao", a recommendation system that uses ML to recommend products matching customers' demands and preferences [19]. Since Alibaba
offers online market for sellers, an online buyer can log into the platform and search
for fashion products, machine learning algorithm quickly sorts out different products
that meet the customer’s preferences. Forever 21 is also a fast fashion company that
uses AI and ML in their online store. Forever 21 allows customers to use visual search
navigation powered by artificial intelligence. Forever 21 introduced AI-enabled visual
search and navigation feature that allows online shoppers to search for dresses, pants,
shorts, jeans, and tops with results appearing as a standalone “Discover Your Style”
module on the Forever 21 webpage and mobile app [20]. Besides, Forever 21 Company
uses AI and ML technology to improve merchandising, recommendations and lifecycle
analysis. Fast fashion companies continue to implement AI and ML in their online store
to optimize sales and online customers shopping experience.
The other application of AI and ML in e-commerce is found in chatbots. Chatbots help online companies automate service processes, hence reducing labor expenses and enhancing service quality [7]. The primary function of chatbots is to provide automated support to customers online. "Chatbots allow communication via text, voice and pictures" [7]. Chatbots recognize simple commands, which saves customers time and enhances their satisfaction. Besides, chatbots help customers conduct purchases in a conversational text format, as chatbots use natural language processing techniques [4]. Through such AI assistants, customers can easily find suitable products, check the availability of products, do quick comparisons with other products, and speed up the payment process [7]. AI and ML play a crucial role in enhancing the online customer experience, as they improve customers' conversational shopping and directions as well as allowing businesses to operate effectively.
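A toy keyword-based intent matcher conveys the basic idea of such automated support (pure Python; the intents and replies are invented for illustration and are far simpler than the natural language processing techniques cited above):

INTENTS = {
    "availability": (["stock", "available", "in store"],
                     "Let me check availability for that product."),
    "compare":      (["compare", "difference", "versus"],
                     "Here is a quick comparison of the two products."),
    "payment":      (["pay", "checkout", "card"],
                     "I can take you to the payment page."),
}

def reply(message):
    """Return the reply of the first intent whose keyword appears."""
    text = message.lower()
    for keywords, answer in INTENTS.values():
        if any(word in text for word in keywords):
            return answer
    return "Sorry, could you rephrase that?"

# Usage: print(reply("Is this jacket available in store?"))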

3 The Effect of Online-Only Business on Physical Stores


Online-only companies have powerful capabilities, which have evidently affected multichannel companies' physical stores. Online stores are more cost-effective than physical stores [21], as they are not burdened with physical stores' operating costs. Online retailers benefit from running online-only, wherein maintaining online stores and a centralized warehouse helps retailers reduce operating costs. "Online retailing works as the main channel for reducing the advertising cost, direct managing promotion campaign using visual power and latest technologies online" [22]. "Online companies can easily reach out an uncountable number of audiences in a short time who will be target customers" [22]. Online companies also reduce merchandise consumption cycles [21]. When a new design becomes available, customers can simply review it online and purchase it rather than waiting for physical stores to make it available.
The impact of online-only retailing on multichannel companies' physical stores is evident at H&M. H&M declared the closure of 160 stores, caused by accumulated stocks of more than $4 billion of unsold items, which forced the company to offer significant discounts to clear out the goods [23]. H&M has planned to put more resources into online sales and is upgrading its online store, including improved product navigation and display and shorter delivery times [23]. The U.S. fashion industry, including home-grown icons such as GAP and Abercrombie & Fitch, announced closures of 4,500 stores in 2019 alone [23]. Charlotte Russe announced bankruptcy during 2019 and could end up liquidating if it cannot find an investor to keep the business running [23]. Forever 21 would cut operations in 40 countries as its revenue diminished from more than $4 billion to $3 billion in 2016 [24]. Forever 21 is planning to close 350 stores overall and has laid off more than 10,000 employees [24]; the company, however, will continue operating the online store [24]. According to [25], the expansion strategy left Forever 21 unable to invest in its supply chain, which made the company take more time to get fresh clothing styles to market at a time when shoppers were hungry for newness. Forever 21 also failed to understand markets outside of America as it opened more shops in Asia and Europe [25].
4 Discussion and Recommendations


The competition in the fast fashion industry has become more complex as online-only business has emerged in the industry. E-commerce by online-only companies will continue to shift the industry from a strictly brick-and-mortar sector to a mobile and social-media-based one, opening doors for more adoption of AI and ML to enhance business functions and customer experience. AI and ML are the main tools used by online-only companies to handle online customers, and online-only companies will increase the involvement of AI and ML in their businesses in the future. The increasing adoption of AI and ML through recommendation systems and chatbots will improve marketing strategies, customize online offerings, and personalize customer experiences, which will help sustain competition in the market in the future.
Multichannel companies, on the other hand, can capitalize on the advantage that they possess both physical and online stores. Multichannel companies' physical stores can play a crucial role in enhancing online customer experience and help companies optimize business functions. By gathering in-store data on customers and discovering their interests, companies can personalize online services, which enhances customer satisfaction and customer retention. By integrating in-store data with online data, companies can develop effective recommendation systems and chatbots that make personalized suggestions and provide superior automated, personalized support for online customers. Gathering in-store data about customers and their interests can also help companies customize production: by discovering fashion interests, companies can produce products that match customers' needs and preferences, resulting in increased sales, reduced waste, and better resource customization. Therefore, multichannel companies should put more resources into in-store customer behavior analysis methods such as loyalty cards, QR codes, smart mirrors, and smart hangers, among other technologies, aiming to collect more data about customers. It is crucial for multichannel companies to prepare plans for the transformation in the industry, trade tensions and market uncertainties due to unexpected circumstances such as health pandemics that caused the closure of physical stores. Multichannel companies should reconsider the role of their physical stores, as physical stores can be used as a source of data about customers and their interests to enhance online customer experience and business functions.

5 Conclusion and Future Work


The fashion industry has significantly changed due to the influence of television, the
movie industry, the change in supply chains’ structure, and increased number of fashion
shows and seasons. This change has yielded the so-called fast fashion. The competition in
the fast fashion industry has become more complex as online-only business has emerged
in the industry relying on e-commerce. Online-only companies integrate AI and ML
through recommendation systems and chatbots on ecommerce platforms to optimize
business functions and enhance online customer experience. Online retailing in the fast fashion industry has affected multichannel companies' physical stores' performance, as multichannel companies have closed shops in the last few years. Multichannel companies,
however, will have opportunities in the market if they adopt AI and ML along with the
in-store data effectively. They will be able to understand customers as well as personalize products and services and customize marketing campaigns. Future work should focus on developing a method that helps multichannel companies effectively involve physical stores in enhancing customers' experience. Innovative methods should concentrate on collecting in-store data about customers and their interests with the aim of enhancing the same customers' experience when they shop online. This will help companies improve recommendation systems and chatbots to enhance online customer experience as well as boost sales.

References
1. Gupta, S., Gentry, J.: Evaluating fast fashion: fast fashion and consumer behavior. In: Eco-
Friendly and Fair. Taylor and Francis Group, London, U.K (2018)
2. Bhardwaj, V., Fairhurst, A.: Fast fashion: response to changes in the fashion industry. Int.
Rev. Retail 20(1), 165–173 (2010)
3. Long, X., Nasiry, J.: Sustainability in the fast fashion industry. SSRN, p. 44 (2019)
4. Shyna, K., Vishal, M.: A study on artificial intelligence E-commerce. Int. J. Adv. Eng. Sci.
Res. 4(4), 62–68 (2017)
5. Song, X., Yang, S., Huang, Z., Huang, T.: The application of artificial intelligence in electronic
commerce. J. Phys.: Conf. Ser. 1302, 6 (2019)
6. Sahu, S.P., Nautiyal, A., Prasad, M.: Machine learning algorithms for recommender system – a
comparative analysis. Int. J. Comput. Appl. Technol. Res. 6(2), 97–100 (2017)
7. Soni, V.D.: Emerging roles of artificial intelligence in ecommerce. Int. J. Trend Sci. Res. Dev.
4(5), 223–225 (2020)
8. Blázquez, M.: Fashion shopping in multichannel retail: the role of technology in enhancing
the customer experience. Int. J. Electron. Commer. 18(4), 97–116 (2014)
9. Guo, B., Zhang, D., Wang, Z., Yu, Z., Zhou, X.: Opportunistic IoT: exploring the harmonious
interaction between human and the Internet of Things. J. Netw. Comput. Appl. 36(6), 1531–
1539 (2013)
10. Kamberaj, B.: Consumer trust in E-Commerce. SSRN, p. 10 (2020)
11. Statista. https://www.statista.com/outlook/dmo/ecommerce/fashion/worldwide. Accessed 10
Aug 2021
12. Statista. https://www.statista.com/outlook/dmo/ecommerce/fashion/china. Accessed 10 Aug
2021
13. Statista. https://www.statista.com/outlook/dmo/ecommerce/fashion/unitedstates. Accessed
10 Aug 2021
14. Statista. https://www.statista.com/outlook/dmo/ecommerce/fashion/unitedkingdom.
Accessed 10 Aug 2021
15. Statista. https://www.statista.com/outlook/dmo/ecommerce/fashion/germany. Accessed 10
Aug 2021
16. Statista. https://www.statista.com/outlook/dmo/ecommerce/fashion/japan. Accessed 10 Aug
2021
17. Lee, J.E.: Fast-fashion retailers-types of online-based internationalization. Res. J. Costume
Cult. 27(1), 33–45 (2019)
18. Forbes. https://www.forbes.com/sites/cognitiveworld/2019/07/16/thefashion-industry-is-get
ting-more-intelligent-with-ai/?sh=18a61f083c74. Accessed 23 Apr 2021
19. Harvard Business Review. https://hbr.org/2018/09/alibaba-and-the-future-ofbusiness.
Accessed 10 Jan 2021
20. RIS News. https://risnews.com/forever-21s-ai-powered-visual-searchincreases-conversions-and-basket-size. Accessed 05 Feb 2021
21. Wei, Z., Zhou, L.: E-commerce case study of fast fashion industry. In: Du, Z. (eds.) Intelligence
Computation and Evolutionary Computation, vol. 180, pp. 261–270. Springer, Heidelberg
(2013). https://doi.org/10.1007/978-3-642-31656-2_39
22. Theseus. https://www.theseus.fi/handle/10024/148582. Accessed 01 June 2021
23. Forbes. https://www.forbes.com/sites/sanfordstein/2019/02/10/how-couldchanging-con
sumer-trends-affect-fast-fashion-leaders-hm-andzara/#7f5a96866f48. Accessed 01 May
2021
24. New York Times. https://www.nytimes.com/2019/09/29/business/forever21-bankruptcy.
html. Accessed 01 Mar 2021
25. CNBC. https://www.cnbc.com/2019/12/11/heres-why-forever-21-wentbankrupt.html.
Accessed 03 July 2021
Application of Artificial Intelligence
in Healthcare by Industries in Australia:
Opportunities and Challenges

Priyanka Singh, Aruna S. Manjunatha, Ayesha Baig, Pooja Dhopeshwar, Huan Huo,
Gnana Bharathy, and Mukesh Prasad(B)

School of Computer Science, FEIT, University of Technology Sydney, Sydney, Australia


mukesh.prasad@uts.edu.au

Abstract. Artificial Intelligence (AI) has proven its potential in various sectors
including the healthcare system. It is changing the landscape of the healthcare
sector across the world. AI is becoming progressively sophisticated over time
in helping detect and diagnose diseases early and predict treatment outcomes
in patients. Like many developed countries, Australia is already on the path to
implementing artificial intelligence in the healthcare and medical technologies
sector. This paper provides a broad analysis of applied AI in the health care system,
how industries are contributing towards it, and the challenges they face. This
paper has primarily used desktop research using resources like journal articles,
government reports, media articles, corporation-based documents, blogs and other
publicly available data. This paper focuses on the use of AI by industries and start-
ups in the healthcare space and their contributions to improving patient’s health
and the challenges they face in the adoption and implementation of AI solutions.
Start-ups lead in providing a range of AI solutions in health in Australia. However, they face various challenges in the form of a lack of proper technology infrastructure and funding issues.

Keywords: Artificial Intelligence · Healthcare · Industries · Start-ups

1 Introduction
Artificial Intelligence (AI) is a broad term used to describe a set of technologies that help solve problems and perform tasks without explicit human intervention. A few of these include machine learning, computer vision, natural language processing, robotics and deep learning [1]. AI is changing the landscape of every sector, including healthcare, across the world. It is becoming progressively sophisticated over time, not only performing tasks that are usually done by human beings but doing so more efficiently and at a lower cost. The idea behind incorporating AI in healthcare is to help common people lead a healthy lifestyle, make the healthcare system more efficient, reduce cost, optimize available resources, and assist healthcare providers with better-informed decision-making. The use of Artificial Intelligence and the Internet of Medical Things (IoMT) in the healthcare sector is already underway and has started to realize its potential [2]. The

algorithms used in AI have existed for quite some time; a few have evolved significantly and will continue to evolve in the future. The volumes of data and the widespread availability of affordable computation have enabled Australia to leverage the pace of AI growth [1].
AI in healthcare is the usage of complex computer algorithms and applications to emulate human cognition by analyzing complex medical data. AI can also be defined as algorithms that approximate meaningful and reliable conclusions without direct input or guidance from a human being. What distinguishes traditional technologies in healthcare from Artificial Intelligence is that the usage of AI gives more reliable, quick results even when processing complex data. AI performs tasks with the help of machine learning algorithms; such algorithms recognize meaningful patterns in behaviors and create their own logic. Of course, an AI algorithm must always be tested before implementation to ensure error-free operation. In the healthcare sector, industry analysts have predicted that the AI market will grow at a 23% annual rate between 2017 and 2023 [3]. In one of its analyses, Accenture reported the top 10 AI applications in healthcare and their projected potential benefits by 2026 from the USA perspective, as shown in Fig. 1. From Fig. 1 it is clear that the top 3 AI applications with the highest value are robotic-assisted surgery, virtual nursing assistants and administrative workflow assistance, though AI in cybersecurity within the healthcare sector still has room for improvement in the future [4].

Fig. 1. Estimation of potential benefits by AI healthcare application by 2026. Source: Accenture Analysis 2020 [4]
Other than the above-mentioned domains, AI in healthcare can be advantageous in expanding core areas of medicine like antibiotic resistance, brain-computer interfaces, cardiology, electronic health records, health and wellness, immunotherapy for cancer treatment, medical diagnosis, neurology, pathology images, radiology tools, smart devices, and surgery [5]. This review is secondary desktop research using resources like journal articles, government reports, media articles, corporation-based documents, blogs and other publicly available data. Relevant information from these sources was collected using keywords of the topic. The collected information was synthesized and analysed to inform the outputs in the paper.

2 Literature Review
This section provides an overview of the start-ups and industries providing AI solutions
to health and healthcare challenges in Australia. The literature review is subdivided into
the following four sections.

2.1 Current Situation of AI in Healthcare


Healthcare in Australia is primarily funded by the public Medicare program and delivered by highly regulated public and private health care providers. It provides a wide range of services, from population health and prevention through to general practice and community health, hospital care, rehabilitation and palliative care. However, the start-ups that have boomed in recent years have used AI to deliver healthcare, whether diagnostic services or support for informed treatment decisions, in innovative ways and at competitive prices. The Australian government is supporting AI in Australia by providing $33.7 million over 4 years to support Australian businesses partnering with the government to pilot projects for AI-based solutions to national challenges [6]. The National AI Centre (within CSIRO's Data61) said "they will coordinate Australia's AI expertise and capabilities, and address barriers that small-to-medium-sized enterprises (SMEs) face in adopting and developing AI and emerging technology" [6]. Australia has been striving hard in recent years to nurture its healthcare start-ups, which have grown in leaps and bounds.

2.2 Start-Ups Using AI in Healthcare Delivery


Several start-ups have emerged in the last decade leveraging AI for improved healthcare delivery in terms of early and accurate diagnostic services and improved clinician decision support systems for physicians. These start-ups provide solutions across a variety of challenges like management of chronic diseases, supporting clinical trial recruitment, and use of and adherence to medication. As depicted in Fig. 1, the applications and innovations in AI in Australia are consistent with Accenture's findings from the USA. Moreover, start-ups are providing consulting and AI solutions to a range of other unique healthcare challenges. There are 93 AI-in-healthcare start-ups in Australia [7]. Table 1 provides a list of a few start-ups that use AI in healthcare in Australia.
Even though Australia has seen a boom in the number of start-ups coming up to provide AI solutions for real-world health problems, most of these start-ups still find it
Table 1. List of a few start-ups in the healthcare space in Australia [7, 8]. The founding year is given in parentheses ("-" where not reported).

Diagnostic services
Harrison.AI (2017): A provider of artificial intelligence systems for multiple diagnostic analyses. It offers a platform based on deep-learning artificial intelligence that supports services such as pregnancy prediction from time-lapse video and fundoscopic images, along with tuberculosis case analysis and canine parvovirus infection [7].
Maxwell Plus (2016): A provider of diagnostic imaging solutions. The team is developing algorithms to detect prostate cancer, breast cancer, lung diseases and neurodegenerative diseases such as Alzheimer's [9, 10].
Alixir (2017): A developer of a mammographic detection platform built on machine learning and data science that can help detect early signs of breast cancer [9].
Omniscient (2004): A provider of a suite of cloud-based software for the neurosurgeon. It provides clinical and research solutions for a variety of brain-related disorders, including depression, chronic pain, cancer, bipolar disorder, Alzheimer's, dementia and others [7].
Life Whisperer (2016): Uses AI-driven image analysis for embryo selection in IVF. It claims to use proprietary algorithms based on machine learning, statistics and physics to identify the morphological features of a healthy embryo. It is a web-based decision support tool for IVF clinics [7].
Presagen (-): Creates globally scalable image-based medical diagnostics software by bringing together deep learning (DL) and computer vision (CV). It analyses large datasets of medical images to rapidly create web-based medical diagnostic tools that can be used on demand by medical institutes anywhere in the world [10].

Clinician decision support tools
Auxita (2016): An AI-based platform that analyses patient data to provide treatment recommendations. It works under guidelines provided by specialists [7].
See-Mode Technologies (2017): A provider of an AI-based platform for stroke prediction. The platform analyses patient health data provided by physicians to predict the chances of stroke and to plan treatment and prevention of such incidents in the future [7].
DocLink (2017, closed): A tool to aid clinicians in the screening and diagnosis of pathologies using computer-based machine learning algorithms that provide rapid, probabilistic diagnoses from medical imaging [9]. Unfortunately, the start-up has closed its business operations in Australia.
QURO by Medius Health (-): Provides a health assistant that uses advanced machine learning to give more accurate health assessments, bypassing doctor shortages, waiting rooms and false internet diagnoses in the process [7].

Supporting clinical trials
HealthMatch (2017): A provider of a patient recruitment platform that enables patients to access clinical trials from a network of care teams around the world. The company applies artificial intelligence to clinical data so that patients are matched in real time to clinical trials tailored to their medical profile [7].

Disease management, including chronic disease
Pearlii (2019): A provider of a mobile app to manage oral health. It uses artificial intelligence to scan photos of teeth and check for dental problems such as tooth decay, calculus, gingivitis and others [7].
Clevertar (2012): A provider of digital coaches for managing chronic health conditions such as type 2 diabetes, chronic pain, anxiety, depression and heart failure. The coaching programs, designed by clinical experts, teach skills for managing the condition [7].
Pioneera (2018): A provider of an AI-based platform for managing workplace stress. The company offers an online analytics platform that helps employees detect stress habits and gives real-time clinical insights for better health management. It also offers a digital chatbot, Indie, for discussing and sharing an employee's emotional stress, happiness and more [7].
PredictBGL (-): Provides patient-specific predictive analytics for diabetes management, giving patients better control of their diabetes without being connected to a device [8, 11].

Other health areas
PainChek (2016): The world's first pain assessment tool that gives a voice to those who cannot verbalize their pain. Using AI and facial recognition technology, it provides carers with three important new clinical benefits: the ability to identify the presence of pain, to quantify its severity and to monitor the impact of treatment to optimize overall care [8, 12].
EmergiSim (2019): Develops VR technology that lets first responders train and prepare for high-stress emergencies [8, 13].
Prospection (2012): Develops and delivers world-class consulting, data and technology solutions across the healthcare sector, with a focus on the supply and use of medications [8, 14].

difficult to sustain themselves due to lack of funding. A couple have stopped operating for various reasons, among which unsustainable funding was one of the key ones. This has been accentuated in current times: with a large portion of healthcare funding channelled into fighting the Covid-19 pandemic, other healthcare issues have suffered a setback in funding prioritization by governments, hospitals and other organizations alike. There is some positive development on this front from the Australian government, which has announced $12 million over five years to catalyze the AI opportunity in the regions through co-funding [6], though this is not restricted to the health sector alone. The Covid-19 pandemic has brought some aspects of digital health and telehealth into the limelight because face-to-face consultations became impossible, though it has also caused healthcare resources to be repurposed towards fighting it, leaving asymmetrical focus and attention on other pressing health issues.

2.3 Start-Ups Using AI for Healthcare Efficiency and Improved Patient Care

A few other companies, such as AMAX, PureProfile, VitalConnect (maker of the VitalPatch biosensor) and PointClickCare, have emerged in the last couple of years focused on using AI in healthcare to improve efficiency and patient care. They mainly focus on capabilities such as virtual desktops and deep learning, wearable biosensor detection, interoperability, and augmented reality practice in healthcare.

AMAX. It is a global deep learning, AI and enterprise IT technology company. It has launched virtual desktop infrastructure platforms and deep learning/AI compute cluster systems for the Australian healthcare industry, optimized for development, research and large-scale data-center deployments in the health sector. The company's systems are designed to redefine traditional workstation- and PC-based deployments in clinics, hospitals and doctors' offices. Advanced analysis of critical data, such as the images used by physicians, radiologists and specialists, and the delivery of valuable insights are the capabilities that distinguish competitive healthcare information technology companies from their rivals. AMAX offers purpose-built platforms for health organizations aspiring to the most advanced medical computing systems.

VitalConnect. It is a well-known vendor of wearable biosensor technologies and has developed a software system called Vista Solution, which helps caregivers detect specific health outcomes based on National Early Warning Scores (NEWS). Vista Solution adds pulse oximetry, blood pressure, weight and core temperature to the eight vital signs measured by the VitalPatch biosensor: respiratory rate, heart rate, single-lead EKG, heart rate variability, body posture, skin temperature, activity and fall detection. VitalPatch is the lightest and smallest medical device cleared by the Food and Drug Administration (FDA) and approved for use in post-discharge home settings and hospitals. According to VitalConnect, the new system will intelligently reduce risk to patients irrespective of care intensity. With NEWS integration, the software gives doctors continuous access to information on their patients' condition, backed by clinically proven predictive analysis. NEWS is an internationally recognized standard for assessing acute illness and continuously monitoring patient well-being throughout treatment.

PointClickCare. It has brought in interoperable technology designed to empower health systems and hospitals with the insight and analytics necessary for a coordinated, collaborative approach to care delivery. Dubbed Harmony, the technology enables post-acute and acute providers to leverage valuable insight by eliminating data silos between care partners, connecting them on a simplified, integrated and secure platform. Its features are as follows. Locating: allows health systems and hospitals to discover where their patients are. Sync: conveniently monitors a patient's discharge to a Long-Term and Post-Acute Care (LTPAC) facility and provides access to updated, detailed patient progress for efficient decision making, achieved through meaningful data exchange between hospitals, care settings and health systems. Visualization: enables easy real-time visibility into, and evaluation of, patient discharges, giving a detailed view of LTPAC partners' performance metrics and indicators. A minimal sketch of what such an exchange event might look like follows.
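
As a purely illustrative sketch, the Python snippet below encodes a hospital-to-LTPAC discharge event as an HL7 FHIR R4 Encounter resource, a common vehicle for the kind of "meaningful data exchange" described above. The resource shape follows the public FHIR standard, but the identifiers, dates and endpoint are hypothetical, and this is not PointClickCare's actual payload format.

# Illustrative sketch only: a discharge notification shaped as a FHIR R4
# Encounter resource; identifiers are hypothetical.
import json

discharge_event = {
    "resourceType": "Encounter",
    "status": "finished",                      # the hospital stay has ended
    "class": {
        "system": "http://terminology.hl7.org/CodeSystem/v3-ActCode",
        "code": "IMP",                         # inpatient encounter
    },
    "subject": {"reference": "Patient/example-123"},  # hypothetical patient id
    "period": {"start": "2021-06-01T08:00:00Z", "end": "2021-06-05T14:30:00Z"},
    "hospitalization": {
        "dischargeDisposition": {
            "coding": [{
                "system": "http://terminology.hl7.org/CodeSystem/discharge-disposition",
                "code": "snf",                 # discharged to a skilled nursing facility
                "display": "Skilled nursing facility",
            }]
        }
    },
}

# A receiving LTPAC system could ingest this JSON from its FHIR endpoint.
print(json.dumps(discharge_event, indent=2))

Standardized resources like this are what let acute and post-acute systems exchange discharge status without bespoke point-to-point integrations.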

Sudler Sydney. This company, in collaboration with Webling Interactive, is using augmented and virtual reality to let doctors step into the shoes of a patient suffering from Rheumatoid Arthritis (RA). It has developed a Virtual Reality (VR) system that virtually places a doctor inside the perspective of an RA sufferer, to experience and understand the hardships of living with the condition. By virtually experiencing these difficulties, healthcare practitioners and doctors can better empathize with patients. Medical practitioners were able to try the VR experience at a conference organized by the Australian Rheumatology Association (ARA) in New Zealand, and most of the doctors believed the VR system reflected the RA experience exactly as described by their patients.

As depicted in Fig. 1, innovation in applied AI in Australia is consistent with Accenture's findings from the USA: the majority of AI companies in Australia are focusing on improving virtual doctor and nursing assistant services, improving administrative workflow assistance, and contributing towards automated image diagnosis. Developing and testing an AI algorithm is not a solution in itself; these are tools that must be integrated and embedded in healthcare workflows for doctors, nurses and healthcare decision-makers to use. There is already a lot happening in the space of improving healthcare efficiency and patient care, and more can be done by providing appropriate data infrastructure and connecting the right stakeholders in the healthcare ecosystem, so that they leverage each other's expertise and keep patients at the center of decision making.

2.4 Big Tech Support

A few top established global AI companies are investing in AI in healthcare. These companies now offer AI as a service or build their intelligent apps using cloud-based AI services. Furthermore, big data and AI in healthcare are fast becoming influencing factors [5]. There are a few instances in Australia where global IT firms have partnered with universities to invest and innovate in the application of AI in healthcare.

Google Inc.: It has awarded $1 million to the University of Sydney to invest in the study of AI in healthcare. The Westmead Applied Research Centre, part of The University of Sydney, is working to support research that develops a digital health program to reduce the risk of heart attack, one of the biggest causes of death in the world. By combining consumer-derived data from wearables and mobile applications with AI and clinical data, the accuracy and precision of risk assessment are expected to improve, while creating digitally adaptive health solutions. A proper focus on a scalable illness-prevention program would make a major difference in people's lives and help tackle the rising burden of chronic preventable illnesses.

Movember Foundation and Genie Solutions: They offer digital tools for prostate cancer care. Genie Solutions, a medical practice management software provider based in Brisbane, and the Movember Foundation, a men's health charity, announced a partnership to deliver a suite of digital tools to help patients with prostate cancer. Through this partnership, the Movember Foundation's True North digital products will be integrated with Genie Solutions' software. The software can also help those affected obtain tailored, personalized insights on managing treatment side effects and share responses with their clinicians for follow-up and efficient care management. The partnership should accelerate the provision of patient and clinical data to the National Prostate Cancer Outcomes Registry (NPCOR), a large-scale registry that records the outcomes and care provided for patients diagnosed with prostate cancer in New Zealand and Australia; faster data provision helps hospitals and clinicians streamline documentation and reporting. The partnership between Genie Solutions and the Movember Foundation can also provide urologists with training to fight prostate cancer.

IBM: It is spending $10 million to build an Artificial Intelligence Centre at the University of Melbourne to study AI in healthcare. According to the Financial Review (2019), the Australian Research Council has initiated studies around the globe to solve some of these healthcare challenges. The main areas of focus for AI application are Alzheimer's trajectory prediction, brain-controlled prosthetics, epilepsy and natural language processing. A team of researchers is working on a wearable device for epilepsy patients that provides real-time alerts before a seizure, prompting patients to take medication in time to prevent it. The second major project the University is working on is brain-controlled prosthetics, to help patients who have lost limbs virtually command artificial prosthetic limbs; the study aims to examine electroencephalogram (EEG) signals and compare them with real-time data. To encourage these projects and boost entrepreneurial activity, licenses are granted to students and researchers and funds are provided through the research centre.

Big Tech support in the form of grants for research on applying AI to a range of healthcare issues, such as the treatment and prevention of serious diseases like cancers and heart attacks and the management of chronic conditions like epilepsy and Alzheimer's, is very welcome. It contributes towards leveraging the research capability that already exists in Australian academia to make a meaningful contribution to healthcare. More can be done in this space by channelling additional funding through grants and awards for innovation. One of the main advantages of big-tech collaboration is that such companies can provide not only funding but also the technology infrastructure required for AI-related research. Furthermore, there is a strong need for accreditation of the new expert groups that specialize in the development and successful implementation of AI solutions in the healthcare space; this will help foster trust in the academic and healthcare institutions implementing them [15]. One important point to highlight is the need to disseminate learning and research experience from successful projects to other sectors, promoting an exchange of knowledge among AI researchers.

3 Opportunities and Challenges


The healthcare ecosystem around the world, including in Australia, is under pressure due to the Covid-19 pandemic. There are a number of other pressing issues in healthcare, such as the aging population, lifestyle-related chronic diseases, rising cancer prevalence, disability care, and shortages of medical staff and essential resources. With limited resources to deal with multiple challenges, this is a perfect window of opportunity for leveraging AI to enhance and improve healthcare and make it more efficient. Start-ups and industries play a big role in providing solutions by applying AI to real-world health and healthcare problems, which is different from researching and experimenting with AI and machine learning algorithms [16]. Unlike creating and experimenting with AI models that map inputs to outputs, often against the backdrop of ideal situations, applying AI is complicated and involves cracking problems with many unknown confounders and dynamic factors. Moreover, as mentioned earlier, developing and testing an AI algorithm is not a solution in itself; these are tools that must be integrated and embedded in healthcare workflows so that healthcare decision-makers can use them. Decision-makers such as clinicians, doctors, hospitals, healthcare funding bodies and patients look for relevant, precise information that helps them make decisions, not to be overloaded with data or the technical jargon of AI through these tools. They need information that supports more informed decisions with a positive impact on health and the healthcare system at large [16].
Beyond integration, industries and start-ups face challenges related to interoperability between different IT and data systems. These two challenges, integration and interoperability, are distinctive of applied AI; they are not an issue in academic AI research, where health datasets are well curated [16]. In the real world there are barriers in the form of a lack of proper data infrastructure, poor health data linkages, difficulty accessing data due to long and tedious ethics approval processes and, in some instances, the absence of guidelines for accessing data already available in the healthcare system. This reflects how fragmented healthcare currently is. For industries and start-ups to thrive and provide meaningful solutions, the approach has to be multi-pronged: government support in the form of funding, and a conducive environment in the form of a clear legal and ethical framework for the application of AI. Moreover, industries need to come up with proactive solutions to data privacy concerns and ways to harness health data at the population level.
To realize the full potential of the new AI capabilities, Accenture suggested a few areas of preparedness in healthcare, focused mainly on workforce, institutional readiness, care reach and security. The first focus area, workforce, refers to how AI can help fill gaps amid rising staff shortages in healthcare; for this, the human capability to use AI-enabled tools for decision-making needs to improve [4]. The second, institutional readiness, refers to streamlining an organization's structure and governance, along with compatible data systems, for easy integration of AI tools. The third is improving care reach by providing a connected, seamless care experience for patients using the integrated AI tools. The fourth focus area suggested by Accenture is security, which emphasizes the need to work ethically and to maintain a secure system for managing and sharing critical patient information [4]. These concerns and challenges overlap with sectors other than healthcare when it comes to the application and implementation of AI. The AI Roadmap was developed by the Australian Government in 2019 in collaboration with CSIRO's Data61 and the Department of Industry, Innovation and Science [1]. The report identifies broad challenges and barriers faced by AI researchers in adopting, adapting and developing AI solutions. The challenges mentioned in the report overlap with those discussed above: developing an AI specialist workforce, better-quality access to datasets, building trust in the companies providing AI technology solutions, focusing on Research & Development (R&D) to continue innovation, providing the right digital infrastructure, and tackling cybersecurity risks. Last but not least, it calls for high ethical and performance standards among the stakeholders involved in the process [1].
There are some positive indications in the form of growing interest in AI wearable devices. This suggests that digital infrastructure and attitudes towards technology are slowly changing. Research by PureProfile (2017) confirms that Australians are among the world's most eager and fastest-growing adopters of healthcare-oriented AI wearable devices, with a growing proportion adopting AI wearables into their lifestyle to analyse and track their day-to-day activities [17]. In a PureProfile survey of Australian internet users who were aware of AI fitness wearables, 45% said they used smartphone-based fitness wearables, a wrist-based device, or a combination of the two. Even though the overall proportion of people using wearable fitness devices is under 50%, this growth is pushing demand for AI-oriented wearables in Australia to high levels [17]. Research by Kantar Worldpanel ComTech (2016) confirmed 16% AI wearable device penetration in Australia in December 2016, higher than in other countries, with health remaining a key driver of AI wearable adoption [18].
With the growing base of data, AI researchers have expressed a strong need for a centralized database for such information that is safe and secure against the risk of breaches. In the absence of such provision, one company, Presagen, has overcome the lack of a centralized database: its decentralized training and delivery system ensures private health data remains local to where it originated, while enabling the creation of accurate and robust AI from a single large decentralized global dataset [10]. This opens up potential opportunities for partnering with Medtech companies and other stakeholders in the healthcare ecosystem to harmonize data for developing future solutions in start-ups and industries.
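
The general technique behind such decentralized training is federated learning, in which only model weights, never raw patient records, leave each site. The Python sketch below shows federated averaging (FedAvg) over synthetic data; it is a generic illustration under stated assumptions, not Presagen's proprietary system, and all names in it are invented.

# Illustrative sketch only: federated averaging across "clinics" whose data
# never leaves the site; not Presagen's actual system.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One clinic's in-house training step: logistic-regression SGD on local data."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))   # sigmoid predictions
        w -= lr * X.T @ (preds - y) / len(y)   # gradient step on local data only
    return w

def federated_average(global_w, clinic_datasets, rounds=10):
    """Each round: clinics train locally, then only weights (never data) are pooled."""
    for _ in range(rounds):
        sizes = [len(y) for _, y in clinic_datasets]
        local_models = [local_update(global_w, X, y) for X, y in clinic_datasets]
        # Weighted average of the clinics' models, proportional to dataset size
        global_w = sum(n * w for n, w in zip(sizes, local_models)) / sum(sizes)
    return global_w

# Example with two synthetic "clinics"; in practice the data stays on-premise.
rng = np.random.default_rng(0)
clinics = [(rng.normal(size=(50, 3)), rng.integers(0, 2, 50).astype(float))
           for _ in range(2)]
model = federated_average(np.zeros(3), clinics)

The design choice matters here: because only weight vectors cross site boundaries, the approach sidesteps both the ethics hurdles of pooling raw health records and the breach risk of a single central database.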
Alongside all the above factors, another challenge that may explain the lower uptake of applied AI solutions offered by industries is trust. A majority of people, including healthcare decision-makers, see AI as a black box and find it difficult to trust the insights such tools generate. It will take time and awareness to develop comfort in using applied AI in healthcare and to improve how it is perceived. There is a strong need to provide accreditation for institutions successfully implementing AI technology solutions in the healthcare space [15]. Another concern among medical staff is the fear of being replaced as AI takes over the mundane day-to-day tasks they perform. All these intangible aspects need to be addressed for industries to apply and implement AI successfully in the healthcare sector.
Government bodies are working to plug gaps in the legislation and ethics involved in the application of AI by outlining guidance documents. For example, to address the skilled-workforce challenge, the Australian government has announced an investment of $24.7 million over six years in the skills of the future, establishing the Next Generation AI Graduates Program to attract, train and retain home-grown, job-ready AI specialists [6].

4 Conclusion
This paper provides an understanding of the challenges and opportunities that exist in the application and implementation of AI in healthcare by industries and start-ups. AI in healthcare is expanding into core areas of medicine to help medical practitioners understand and tackle medical problems in a clear, efficient and accurate manner. Start-ups lead in providing a range of innovative AI solutions for health in Australia. However, the application of AI is dynamic and still evolving; niche sectors like healthcare must be considered through a specific lens, as they have different needs and barriers to overcome than other sectors. Legislation, ethics and privacy requirements need to be revisited and incrementally adapted as the situation in this space evolves.

References
1. Donnellan, A.: Three Australian industries that have an AI advantage. CSIRO Data61, 18 November 2019 (2021). https://algorithm.data61.csiro.au/ai-for-australia-its-current-and-future-impact/
2. Wilson, T.: No longer science fiction, AI and robotics are transforming healthcare. PwC (2021). https://www.pwc.com/gx/en/industries/healthcare/publications/ai-robotics-new-health/transforming-healthcare.html
3. Fossat, Y., Scolley, J., Hogan, L.: Artificial intelligence and industries in Australia. J. AI-Commonw. Aust. 3(2), 103 (2018)
4. AI: Healthcare's nervous system, 30 July 2020 (2021). https://www.accenture.com/au-en/insights/health/artificial-intelligence-healthcare
5. Morgan, L.: Artificial intelligence in healthcare: how AI shapes medicine, 2019 (2020). https://www.datamation.com/artificial-intelligence/artificial-intelligence-in-healthcare.html
6. Government, A.: Australian government digital economy strategy. 6 May 2021 (2021). https://digitaleconomy.pmc.gov.au/fact-sheets/artificial-intelligence
7. Tracxn: AI in healthcare startups in Australia. 19 July 2021 (2021). https://tracxn.com/explore/AI-in-Healthcare-Startups-in-Australia
8. Startups, M.: Top 28 medical and healthcare startups in Australia. 7 August 2021 (2021). https://www.medicalstartups.org/country/Australia/
9. Cornick, T.: Top 5: Artificial Intelligence (AI) healthcare startups in Australia. 11 April 2019 (2021). https://www.doctology.com.au/post/top-5-artificial-intelligence-ai-healthcare-startups-in-australia
10. University of Adelaide: Bachelor of Computer Science (2019). https://www.adelaide.edu.au/degree-finder/2019/bcomp_bcmpsci.html
11. Diabetes, J.: JADE website (2021). http://www.jadediabetes.com/
12. Saini, R.: Top 10 healthcare startups in Australia. 27 March 2021 (2021). https://www.vcbay.news/2021/03/27/top-10-healthcare-startups-in-australia/
13. EMERGISIM (2021). https://vr.emergisim.com/
14. Prospection: Prospection company website (2021). https://www.prospection.com/
15. Quinn, T.P., et al.: Trust and medical AI: the challenges we face and the expertise needed to overcome them. J. Am. Med. Inform. Assoc. 28(4), 890–894 (2020)
16. Dickson, B.: Meeting the challenges of AI in health care (2021). https://bdtechtalks.com/2021/02/17/ai-healthcare-tina-manoharan-philips/
17. Pureprofile: Fitness Tracking Devices: Usage and Attitudes Towards Smartphone and Wearable Devices (2017)
18. ComTech, K.W.: Kantar Launches Quarterly Report for Wearable Technology. Global News Wire (2016)
Author Index

A
Abdelgadir, Ayman K., 113
Abdelhady, Mohamed Ibrahim, 113
Abuhussain, Mahmoud, 356
Acharya, Jigna N., 140
Agrawal, Priyanka, 10
AL-Amri, Khulood, 113
Al-Araimi, Fatma, 113
Alduraywish, Meshal, 558
Ali, Sarwan, 154
Al-Jarrah, Ali A., 541
Alqezweeni, Mohie M., 240
Aviv, Itzhak, 344

B
Bahita, Mohamed, 477
Bahri, Haythem, 507
Baig, Ayesha, 568
Barot, Tomas, 403
Bhalerao, Pramod B., 436
Bharathy, Gnana, 568
Bhardwaj, Arpit, 1
Bhattacharya, Manisha, 299
Bhujade, Rakesh Kumar, 518
Bhuyan, Bikram Pratim, 193
Bohra, Mahesh, 10
Bonde, Sanjiv V., 436
Bremananth, R., 541

C
Chakraborty, Sonali, 22
Chaudhari, Dinesh N., 447
Choubey, Nitin S., 500
Choudhury, Amitava, 85
Cýrus, Jindřich, 507

D
Dadheech, Praveen Kumar, 10
Danti, Ajit, 172
De-La-Hoz-Franco, Emiro, 377
Dhopeshwar, Pooja, 568
Ðikanović, Zoran, 207
Dileep, M. R., 172
Dixit, Shikha, 530
Dixit, Shivam, 96
Dubey, Ghanshyam Prasad, 518

E
Elbaghazaoui, Bahaa Eddine, 392

G
Ghosh, Kaushik, 85
Glumskov, Roman A., 240
Goel, Shivani, 1
Gopalsamy, Arunraj, 55
Gorbachenko, Vladimir I., 240

H
Hasar, Ugur Cem, 356
Haseeb, Abdul, 154
Hassan, Inaam Ul, 154
Himthani, Puneet, 518
Hosseini, Zahra, 366
Huo, Huan, 568
Hytönen, Kimmo, 366

I
Izadkhah, Habib, 490

J
Jakšić-Stojanović, Anđela, 207
Jayavel, Rajesh Kumar, 309
Joshi, Dhananjay, 500
Joshi, Milan A., 500
Jourhmane, Mostafa, 392

K
Kanniappan, Jayavel, 309
Kariapper, R. K. A. R., 250
Kathole, Atul B., 447
Kaur, Japjeet, 96
Kinange, Sunil A., 309
Kinnunen, Jani, 366
Koci, Jan, 507
Krcmarik, David, 507
Kriti, 96
Kubalcik, Marek, 403
Kumar, Arvind, 1
Kumar, Dinesh, 36
Kumar, Rajat, 96
Kumari, Geeta, 281
Kumari, Jyoti, 281

L
Ladjabi, Abdelmoula, 477
Larkin, Eugeny, 458

M
M'Haimoud, Mouatez Bilah, 477
Mallesham, G., 131
Manjunatha, Aruna S., 568
Meenakshi, M., 266
Mishra, Vinod, 321
Moezzi, Reza, 507
Mohamed Nafrees, Abdul Cader, 250

N
Nagwanshi, Kapil Kumar, 500
Navaneeth, A. V., 172
Nawaz, Samsudeen Sabraz, 250
Nazir, Nahida, 468
Nigam, Ankita, 131
Nisar, Kottakkaran Sooppy, 10

P
Pan, Shuguo, 105
Pandey, K. M., 281
Pathak, Sunil, 500
Perevozchikov, Mikhail, 403
Pirapuraj, P., 250
Polo-Pichon, Braynner, 377
Pradhan, Chittaranjan, 309
Prasad, Ajay, 193
Prasad, Mukesh, 558, 568
Privalov, Aleksandr, 458
Purohit, S. D., 10

Q
Qu, Dongxu, 330

R
Radha, B., 55, 181
Rani, Narbda, 321
Rani, Sangeeta, 36
Razeeth, M. S. M., 250
Rizvi, Syed Wajahat Abbas, 74
Rozario, Juliet, 181
Rudolf, Ladislav, 403

S
Sabharwal, Ridhima, 74
Saini, Baljit Singh, 468
Sarwar, Abid, 468
Semmouri, Abdellatif, 392
Sharma, Anil, 10
Sharma, Apoorva, 413
Sharma, Kavita, 413
Sharma, Nirmala, 413
Shubha, P., 266
Singh, Aditi, 227
Singh, Krishna Murari, 96
Singh, Priyanka, 568
Singh, Sonika, 558
Singh, Vikram, 36
Sinha, Nishant, 1
Srivastava, Sudhanshu, 299
Stenkin, Dmitry A., 240
Suthar, Anil C., 140
Svejda, Jaromir, 403

T
Tiwari, Garima, 131
Tiwari, Sugandha, 530
Troncoso-Palacio, Alexander, 377

U
Unhelkar, Bhuvan, 558

V
van der Poll, John Andrew, 212
Verma, Preety, 131

Z
Zakareya, Salman, 490
Zhang, Hui, 105
Zhao, Tao, 105

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
H. Sharma et al. (Eds.): ICIVC 2021, PALO 15, pp. 581–582, 2022.
https://doi.org/10.1007/978-3-030-97196-0
