Analysis of Various Mobility Models and Their Impact on QoS in MANET

Chapter · April 2021


DOI: 10.1007/978-981-16-0407-2_10



Studies in Computational Intelligence 950

Jagdish Chand Bansal
Marcin Paprzycki
Monica Bianchini
Sanjoy Das   Editors

Computationally Intelligent Systems and their Applications
Studies in Computational Intelligence

Volume 950

Series Editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
The series “Studies in Computational Intelligence” (SCI) publishes new develop-
ments and advances in the various areas of computational intelligence—quickly and
with a high quality. The intent is to cover the theory, applications, and design
methods of computational intelligence, as embedded in the fields of engineering,
computer science, physics and life sciences, as well as the methodologies behind
them. The series contains monographs, lecture notes and edited volumes in
computational intelligence spanning the areas of neural networks, connectionist
systems, genetic algorithms, evolutionary computation, artificial intelligence,
cellular automata, self-organizing systems, soft computing, fuzzy systems, and
hybrid intelligent systems. Of particular value to both the contributors and the
readership are the short publication timeframe and the world-wide distribution,
which enable both wide and rapid dissemination of research output.
Indexed by SCOPUS, DBLP, WTI Frankfurt eG, zbMATH, SCImago.
All books published in the series are submitted for consideration in Web of Science.

More information about this series at http://www.springer.com/series/7092


Jagdish Chand Bansal · Marcin Paprzycki ·
Monica Bianchini · Sanjoy Das
Editors

Computationally Intelligent
Systems and their
Applications
Editors

Jagdish Chand Bansal
Department of Applied Mathematics
South Asian University
New Delhi, India

Marcin Paprzycki
Systems Research Institute
Polish Academy of Sciences
Warsaw, Poland

Monica Bianchini
Department of Information Engineering and Mathematics
University of Siena
Siena, Italy

Sanjoy Das
Department of Computer Science and Engineering
Indira Gandhi National Tribal University
Imphal, India

ISSN 1860-949X ISSN 1860-9503 (electronic)


Studies in Computational Intelligence
ISBN 978-981-16-0406-5 ISBN 978-981-16-0407-2 (eBook)
https://doi.org/10.1007/978-981-16-0407-2

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface

A computationally intelligent system is a new concept for advanced information
processing. The objective of such a system is to realize a new approach to analysing
and creating flexible information processing covering sensing, learning, recognizing,
and action taking. Computational intelligence is a part of artificial intelligence (AI)
that studies adaptive mechanisms which enable or facilitate intelligent behaviour
in complex and changing environments. This new trend of computational intelligence
applications seeks the adaptation of computational neural network algorithms and
techniques in many application domains, including software systems, cybersecurity,
human activity recognition, and behavioural modelling. As such, computational
neural network algorithms can be refined to address problems in data-driven applica-
tions. Unlike classical AI, the computationally intelligent system relies heavily on
numerical information supplied by manufacturers. The book covers all the core
technologies, such as neural networks, fuzzy systems, and evolutionary computation,
and their applications in such systems.
This book aims to foster the computationally intelligent system, its features, and its
applications. Original research and review articles with models and computationally
intelligent system applications using computational algorithms are included as
chapters. The reader will learn about various computationally intelligent system
applications and their behaviours in order to extract key features. The book will enable
researchers from academia and industry to share innovative applications and creative
solutions to common problems using computational intelligence.

New Delhi, India Jagdish Chand Bansal


Warsaw, Poland Marcin Paprzycki
Siena, Italy Monica Bianchini
Imphal, India Sanjoy Das

Contents

Single Identity Clustering-Based Data Anonymization
in Healthcare . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Pritam Khan, Yasin Khan, and Sudhir Kumar
Optimization Model for Production Planning: Case of an Indian
Steel Company . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Tuhin Banerjee and Saroj Koul
Vision-Based User-Friendly and Contactless Security for Home
Appliance via Hand Gestures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Richa Golash and Yogendra Kumar Jain
Vulnerability Analysis at Industrial Internet of Things Platform
on Dark Web Network Using Computational Intelligence . . . . . . . . . . . . . 39
Anand Singh Rajawat, Romil Rawat, Kanishk Barhanpurkar,
Rabindra Nath Shaw, and Ankush Ghosh
Sentiment Analysis of Healthcare Big Data: A Fundamental Study . . . . . 53
Saroj Kushwah, Bharti Kalra, and Sanjoy Das
A Neuro-Fuzzy based IDS for Internet-Integrated WSN . . . . . . . . . . . . . . . 71
Aditi Paul, Somnath Sinha, Rabindra Nath Shaw, and Ankush Ghosh
Sleep Apnea Detection Using Contact-Based
and Non-Contact-Based Using Deep Learning Methods . . . . . . . . . . . . . . . 87
Anand Singh Rajawat, Romil Rawat, Kanishk Barhanpurkar,
Rabindra Nath Shaw, and Ankush Ghosh
Drift Compensation of a Low-Cost pH Sensor by Artificial Neural
Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
Punit Khatri, Karunesh Kumar Gupta, and Raj Kumar Gupta
Sentiment Analysis at Online Social Network for Cyber-Malicious
Post Reviews Using Machine Learning Techniques . . . . . . . . . . . . . . . . . . . . 113
Romil Rawat, Vinod Mahor, Sachin Chirgaiya, Rabindra Nath Shaw,
and Ankush Ghosh

Analysis of Various Mobility Models and Their Impact on QoS
in MANET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Munsifa F. Khan and Indrani Das
Analysis of Classifier Algorithms to Detect Anti-Money Laundering . . . . 143
Ashwini Kumar, Sanjoy Das, Vishu Tyagi, Rabindra Nath Shaw,
and Ankush Ghosh
Design and Development of an ICT Intervention for Early
Childhood Development in Minority Ethnic Communities
in Bangladesh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Md Montaser Hamid, Tanvir Alam, and Md Forhad Rabbi
Editors and Contributors

About the Editors

Dr. Jagdish Chand Bansal is Associate Professor at South Asian University New
Delhi and Visiting Faculty at Maths and Computer Science, Liverpool Hope Univer-
sity UK. Dr. Bansal has obtained his Ph.D. in Mathematics from IIT Roorkee. Before
joining SAU New Delhi, he has worked as Assistant Professor at ABV-Indian Insti-
tute of Information Technology and Management Gwalior and BITS Pilani, India.
His primary area of interest is swarm intelligence and nature-inspired optimiza-
tion techniques. Recently, he proposed a fission–fusion social structure-based opti-
mization algorithm, Spider Monkey Optimization (SMO), which is being applied
to various problems from the engineering domain. He has published more than 60
research papers in various international journals/conferences. He has also received
Gold Medal at UG and PG levels. Apart from SADIC, he is also Series Editor of
the series Algorithms for Intelligent Systems (AIS) published by Springer. He is
Editor-in-Chief of International Journal of Swarm Intelligence (IJSI) published by
Inderscience. He is also Associate Editor of IEEE ACCESS (IEEE) and ARRAY
(Elsevier). He is the steering committee member and the general chair of the annual
conference series SocProS. He is the general secretary of Soft Computing Research
Society (SCRS).

Marcin Paprzycki is Associate Professor at the Systems Research Institute, Polish
Academy of Sciences. He holds an M.S. from Adam Mickiewicz University in
Poznań, Poland, a Ph.D. from Southern Methodist University in Dallas, Texas, and a
Doctor of Science from the Bulgarian Academy of Sciences. He is a senior member
of the IEEE, a senior member of the ACM, Senior Fulbright Lecturer, and IEEE CS
Distinguished Visitor. He has contributed to more than 450 publications and was
invited to the program committees of over 500 international conferences. He serves
on the editorial boards of 12 journals and one book series.

Monica Bianchini received the Laurea cum laude in Mathematics and the Ph.D.
degree in Computer Science from the University of Florence, Italy, in 1989 and


1995, respectively. After receiving the Laurea, for two years, she was involved in a
joint project of Bull HN Italia and the Department of Mathematics (University of
Florence), aimed at designing parallel software for solving differential equations.
From 1992 to 1998, she was a Ph.D. student and Postdoc Fellow with the Computer
Science Department of the University of Florence. Since 1999, she has been with the
University of Siena, where she is currently Associate Professor at the Information
Engineering and Mathematics Department. Her main research interest is in the field
of artificial intelligence & applications, machine learning, with emphasis on neural
networks for structured data and deep learning, approximation theory, information
retrieval, bioinformatics, and image processing. M. Bianchini has authored more
than seventy papers and has been Editor of books and special issues on interna-
tional journals in her research field. She has been a participant in many research
projects focused on machine learning and pattern recognition, funded by both the
Italian Ministry of Education (MIUR) and the University of Siena (PAR scheme), and
she has been involved in the organization of several scientific events, including the
NATO Advanced Workshop on Limitations and Future Trends in Neural Computa-
tion (2001), the 8th AI*IA Conference (2002), GIRPR 2012, the 25th International
Symposium on Logic Based Program Synthesis and Transformation, and the ACM
International Conference on Computing Frontiers 2017. Prof. Bianchini served as
Associate Editor for IEEE Transactions on Neural Networks (2003–2009), Neuro-
computing (from 2002), and International Journal of Computers in Healthcare (from
2010). She is a permanent member of the Editorial Board of IJCNN, ICANN, CPR,
ICPRAM, ESANN, ANNPR, and KES.

Sanjoy Das is currently working as Associate Professor, Department of Computer
Science, Indira Gandhi National Tribal University (A Central Government Univer-
sity), Amarkantak, M.P. (Manipur Campus)—India. He received his Ph.D. in
Computer Science from Jawaharlal Nehru University New Delhi India. Before
joining IGNTU, he has worked as Associate Professor, School of Computing Science
and Engineering, Galgotias University, India, from July 2012 to September 2017.
Earlier, he was Assistant Professor at G. B. Pant Engineering College, Uttarakhand, and
at Assam University, Silchar, from 2001 to 2008. His current research interest includes
mobile ad hoc networks and vehicular ad hoc networks, distributed systems, and data
mining. He has published numerous papers in international journals and conferences
including IEEE and Springer.

Contributors

Tanvir Alam Department of Computer Science and Engineering, Shahjalal University of Science and Technology, Sylhet, Bangladesh
Tuhin Banerjee Centre for Supply Chain and Logistics Management, OP Jindal
Global University, NCR Delhi, India

Kanishk Barhanpurkar Department of Computer Science and Engineering, Sambhram Institute of Technology, Bengaluru, Karnataka, India
Sachin Chirgaiya Department of Computer Science Engineering, Shri Vaishnav
Vidyapeeth Vishwavidyalaya, Indore, India
Indrani Das Department of Computer Science, Assam University, Silchar, India
Sanjoy Das Department of Computer Science, Indira Gandhi National Tribal
University-RCM, Imphal, India
Ankush Ghosh School of Engineering and Applied Sciences, The Neotia Univer-
sity, Sarisha, West Bengal, India
Richa Golash Samrat Ashok Technological Institute, Vidisha, Madhya Pradesh,
India
Karunesh Kumar Gupta Department of Electrical and Electronics Engineering,
Birla Institute of Technology and Science (BITS), Pilani, Rajasthan, India
Raj Kumar Gupta Department of Physics, Birla Institute of Technology and
Science (BITS), Pilani, Rajasthan, India
Md Montaser Hamid Department of Computer Science and Engineering, Ranada
Prasad Shaha University, Narayanganj, Bangladesh
Yogendra Kumar Jain Samrat Ashok Technological Institute, Vidisha, Madhya
Pradesh, India
Bharti Kalra Noida International University, Noida, India
Munsifa F. Khan Department of Computer Science, Assam University, Silchar,
India
Pritam Khan Indian Institute of Technology Patna, Bihta, India
Yasin Khan Indian Institute of Technology Patna, Bihta, India
Punit Khatri Department of Electrical and Electronics Engineering, Birla Institute
of Technology and Science (BITS), Pilani, Rajasthan, India
Saroj Koul Centre for Supply Chain and Logistics Management, OP Jindal Global
University, NCR Delhi, India
Ashwini Kumar Department of Computer Science and Engineering, Graphic Era
Deemed To Be University, Dehradun, India
Sudhir Kumar Indian Institute of Technology Patna, Bihta, India
Saroj Kushwah Noida International University, Noida, India
Vinod Mahor Department of Computer Science Engineering, Gwalior Engineering
College, Gwalior, India
Aditi Paul Department of Computer Science, Banasthali Vidyapith, Tonk, India

Md Forhad Rabbi Department of Computer Science and Engineering, Shahjalal University of Science and Technology, Sylhet, Bangladesh
Anand Singh Rajawat Department of Computer Science Engineering, Shri
Vaishnav Vidyapeeth Vishwavidyalaya, Indore, India
Romil Rawat Department of Computer Science Engineering, Shri Vaishnav
Vidyapeeth Vishwavidyalaya, Indore, India
Rabindra Nath Shaw Department of Electrical, Electronics and Communication
Engineering, Galgotias University, Greater Noida, India
Somnath Sinha Department of Computer Science, Amrita School of Arts and
Sciences, Vidyapeetham, Mysuru, India
Vishu Tyagi Department of Computer Science and Engineering, Graphic Era
Deemed To Be University, Dehradun, India
Single Identity Clustering-Based Data
Anonymization in Healthcare

Pritam Khan, Yasin Khan, and Sudhir Kumar

Abstract The modern technology in today's world is largely dependent on data.
At times, data proves to be more valuable than money. Data constitutes information,
and this information can help to keep peace or even cause wars. The impact of data has
increased with the rise of artificial intelligence, machine learning, and deep
learning techniques, for which data is the essential raw material. Consequently, privacy of data has
also become an important aspect, to be protected from potential hackers and
unscrupulous persons. The healthcare sector is a sensitive area often targeted by hackers.
Various anonymization techniques like k-anonymization, l-diversity, and t-closeness
contribute to maintaining data privacy, thereby securing it. We propose another
anonymization technique, namely single identity clustering, which is inspired by the
t-closeness method. Our proposed method enhances the data privacy already
provided by the existing anonymization techniques.

1 Introduction

The advances in technology in today's world are happening alongside new dimensions of
data science. Current-day technology, being mostly data-driven, surpasses
traditional engineering in terms of accuracy and quality of output. However,
securing the data from the hands of unscrupulous persons is a major concern.
The healthcare domain is often a soft target for potential hackers. Data in the possession
of healthcare units includes personal and medical data such as healthcare device
information, gender, age, heart rate, and the patient's condition. There remains the
possibility of data theft from the healthcare unit, thereby posing a threat to the privacy

P. Khan (B) · Y. Khan · S. Kumar


Indian Institute of Technology Patna, Bihta 801106, India
e-mail: pritam_1921ee05@iitp.ac.in
Y. Khan
e-mail: khaanyasin@gmail.com
S. Kumar
e-mail: sudhir@iitp.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 1
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_1

and security issues. In order to cope with this security issue, the data acquired
from the patients can be anonymized by the healthcare unit. Using anonymization
techniques, we can protect the medical data and the privacy of patients [1]. This
anonymized data can be de-anonymized only by the corresponding healthcare unit, and
the anonymized data is stored on the server. Theft or alteration of data on the server will
not pose a serious threat, as the original data still rests with the healthcare
unit and any misuse of data on the server will get traced.
Different security-based measures are adopted by the medical service providers.
McLeod and Dolezel [2] discuss multiple factors related to healthcare data mutila-
tion. For ensuring data security and privacy of patients, Rani and Alzubi [3] propose a
lightweight block cipher model on sensed data. A share generation model is discussed
in the same work for improving the privacy issues. However, detailed implementa-
tion of the algorithms with different measures is not carried out. A medical security
monitor (MedMon) based on wireless channel monitoring and anomaly detection is
discussed in Zhang et al. [4] for personal healthcare systems. However, the commu-
nication channel remains unsecured from attacks and strictness in security policies
also makes the proposed system liable to false alarms. In [5], a modular system called
Pseudonymization and Anonymization with the eXtensible Access Control Markup
Language (XACML) standards (PAX) is developed, which depends on client and
server applications. PAX is an authorisation system that works with the electronic
health record (EHR), thereby providing a security solution to the privacy issues of
patients’ data in the EHR. Although the transmission of health records gets secured,
the patients’ database continues to remain under risk if it gets accessed by an adver-
sary. Two novel privacy models based on l-diversity and t-closeness are proposed in
Sei et al. [6] for treating sensitive quasi-identifiers in a medical dataset. We secure
the cardiac patients’ data in Khan et al. [7] using the well-known k-anonymization
technique. In this work, we propose the single identity clustering anonymization
technique for enhancing the security of data in the IoT healthcare network.

2 Data Anonymization Methods

Different anonymization techniques like k-anonymization, l-diversity, and t-closeness are used for data security approaches. We discuss the existing anonymiza-
tion techniques and then illustrate our proposed single identity anonymization tech-
nique that further enhances the data privacy.

2.1 k-Anonymization

In the k-anonymization technique, k represents the least number of identical rows of anonymized data. Anonymization is carried out either by suppression or by gen-
eralization. Each row indicates a tuple or record whereas a column comprises an

attribute. On anonymization, each row of the original dataset should be identical to


minimum k − 1 rows of the anonymized dataset [8, 9]. The attributes or features are
either insensitive, quasi-sensitive, or sensitive. The sensitive attributes should not be
made available to the adversary, while the insensitive and the quasi-sensitive attributes
can be left open. In order to achieve k-anonymity, we suppress some quasi-sensitive
attributes with an asterisk symbol and generalize some others into a certain range.
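The property described above can be checked mechanically: after suppression and generalization, every combination of quasi-sensitive values must occur at least k times. The following is an illustrative sketch, not the authors' code; the column names and records are hypothetical, in the style of the chapter's later examples.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_cols, k):
    """True if every combination of quasi-sensitive values occurs at least k times."""
    counts = Counter(tuple(row[c] for c in quasi_cols) for row in rows)
    return all(n >= k for n in counts.values())

# Hypothetical anonymized records: "*" marks a suppressed attribute and the
# remaining values are generalized ranges.
rows = [
    {"gender": "Person", "age": ">=70", "hr": "*", "disease": "Bradycardia"},
    {"gender": "Person", "age": ">=70", "hr": "*", "disease": "Atrial fibrillation"},
    {"gender": "Person", "age": "*", "hr": ">=100", "disease": "Tachycardia"},
    {"gender": "Person", "age": "*", "hr": ">=100", "disease": "Tachycardia"},
]
print(is_k_anonymous(rows, ["gender", "age", "hr"], 2))  # True
```

Note that the sensitive attribute ("disease") is deliberately excluded from the counted columns, matching the text's remark that it is exempt from anonymization.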

2.2 l-Diversity

Anonymization using k-anonymity fails at times when the sensitive attribute is the
same for the k anonymized rows in the anonymized dataset. This is because the adver-
sary will then be certain of the result or outcome (sensitive attribute) of an identity in
spite of k-anonymization. However, l-diversity is a solution to this problem. l-diversity
ensures that if there are l identities with the same quasi-sensitive and sensi-
tive attributes, then one of the quasi-sensitive attributes of those l identities is
disclosed, creating diversity among them.
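Under the common formulation of l-diversity, each equivalence class (rows sharing the same quasi-identifier values) must contain at least l distinct sensitive values. A check under that formulation can be sketched as follows; the record layout and function name are illustrative, not the authors' code.

```python
from collections import defaultdict

def is_l_diverse(rows, quasi_cols, sensitive_col, l):
    """Group rows into equivalence classes by quasi-identifier values and
    require at least l distinct sensitive values in every class."""
    classes = defaultdict(set)
    for row in rows:
        classes[tuple(row[c] for c in quasi_cols)].add(row[sensitive_col])
    return all(len(values) >= l for values in classes.values())

# Hypothetical anonymized records with two equivalence classes.
rows = [
    {"age": "*", "address": "70025*", "disease": "Tachycardia"},
    {"age": "*", "address": "70025*", "disease": "Atrial fibrillation"},
    {"age": ">=70", "address": "70030*", "disease": "Bradycardia"},
    {"age": ">=70", "address": "70030*", "disease": "Atrial fibrillation"},
]
print(is_l_diverse(rows, ["age", "address"], "disease", 2))  # True
```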

2.3 t-Closeness

In the t-closeness anonymizing technique, we measure the probability distribution of the different identities in the original dataset and also in the equivalent dataset
(where identities are grouped into equivalent classes based on same quasi-sensitive
attributes). The difference between the probability distribution of every identity in the
original dataset and that in an equivalent class of the anonymized equivalent dataset
is calculated. t-closeness ensures that this difference is lower than the threshold t.
The value of t is determined from this difference: it is the maximum difference of
probabilities obtained. Let there be n categories of sensitive attributes in a dataset,
denoted by a1, a2, . . . , an, and let P(ai) denote the probability of category ai in the
original dataset. We assume that there are two equivalent classes in the equivalent
dataset, and let the probabilities of the sensitive attributes a1, a2, . . . , an in those two
classes be P'(ai) and P''(ai), respectively. Then we calculate the value of t as:

t = max( max_i |P(ai) − P'(ai)|, max_i |P(ai) − P''(ai)| ),  ∀ i ∈ {1, . . . , n}    (1)

where the value of t must satisfy D ≤ t, with D being the distance measure between
the probabilities P(ai) and P'(ai), or P(ai) and P''(ai).

2.4 Single Identity Clustering

In this work, we propose a single identity clustering method which strengthens the
anonymization obtained using t-closeness technique. By using single identity clus-
tering, we prevent any sensitive attribute from occurring multiple times in a single
cluster thereby increasing the complexity of anonymization of an identity to an
adversary. The proposed method for creating single identity clustering is given by
Algorithm 1.

Algorithm 1 Proposed single identity clustering method


Data: Sensitive and quasi-sensitive attributes of all records/identities in t-closeness based
anonymized dataset
Result: Clustered anonymized identities
1. Construct the t-closeness based anonymized dataset.
2. Identify the equivalent classes.
3. Note the difference of probability of occurrence of every sensitive attribute in the particular
equivalent class with that in the original dataset.
4. Add the first identity to cluster Ci where i begins with a minimum value of 1.
5. Go to next identity.
6. If the sensitive attribute of the identity matches with that of previous identity, then go to Ci+1
in Step 7, else add the identity to Ci and go to Step 5.
7. Check Ci+1 for the presence of any identity with the same sensitive attribute. If present, then
continue the same in next cluster, else add the identity to Ci+1 . Go to Step 5.
8. Terminate after clustering all the identities in groups.
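The goal of Algorithm 1, that no sensitive value occurs twice within one cluster, can be sketched in Python as a greedy first-fit pass. This is my illustrative rendering of that goal, not the authors' implementation; the record structure and function name are assumptions.

```python
def single_identity_clusters(records, sensitive_key):
    """Greedy first-fit clustering: place each record in the first cluster that
    does not already contain its sensitive attribute value, so no sensitive
    value repeats within any cluster (the property Algorithm 1 targets)."""
    clusters = []   # each entry is a list of records
    seen = []       # parallel list: set of sensitive values already in that cluster
    for rec in records:
        value = rec[sensitive_key]
        for members, values in zip(clusters, seen):
            if value not in values:
                members.append(rec)
                values.add(value)
                break
        else:  # every existing cluster already holds this value: open a new one
            clusters.append([rec])
            seen.append({value})
    return clusters

# The seven diseases from the chapter's cardiac example, in order of appearance.
diseases = ["Bradycardia", "Atrial fibrillation", "Bradycardia",
            "Tachycardia", "Tachycardia", "Bradycardia", "Atrial fibrillation"]
records = [{"id": i, "disease": d} for i, d in enumerate(diseases)]
clusters = single_identity_clusters(records, "disease")
for c in clusters:
    print([r["disease"] for r in c])  # each cluster lists each disease at most once
```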

However, the anonymization technique incurs some information loss, which is measured using conditional entropy [6]. De-anonymization is used to retrieve the
original data from anonymized data and this is done by the data miners. The goal of
de-anonymization is to find a mapping that maximizes the similarity between original
dataset D and anonymized dataset D∗ [10]. We use the cosine similarity measure Sim
to trace the similarity between D and D∗. It is given by:

Sim(D, D∗) = Σ_{i=1}^{N} cos(D_i, D_i∗) = Σ_{i=1}^{N} (D_i · D_i∗) / (‖D_i‖ ‖D_i∗‖)    (2)

where N is the total number of records, while D_i and D_i∗ denote the individual records
of the original and anonymized datasets, respectively, in vector format. Therefore,
Sim(D, D∗) can have a minimum value of 0 and a maximum value of N.

3 Anonymization of Health Data

We illustrate the different anonymization techniques using a cardiac micro-dataset. Security of the stored data in healthcare units is a major concern. Table 1 shows an
example of cardiac patient dataset. Different features like “name,” “gender,” “age,”
“address,” “heart-rate,” and “diseases” are assigned as different attributes. The types
of attributes used are “explicit,” “quasi-sensitive,” and “sensitive.” Therefore, the
probability of sensitive attributes in the original dataset following the order atrial
fibrillation, bradycardia, and tachycardia is {2/7, 3/7, 2/7}.

3.1 Anonymization with k-Anonymity

We use k-anonymity privacy model with k = 2. The “name” attribute in Table 1 being
an explicit identifier is excluded from the k-anonymized table. Table 2 represents the
k-anonymized cardiac patient data with “gender,” “age,” “address,” and “heart-rate”
being the quasi-sensitive attributes. “Diseases” is the sensitive attribute. Some of the
quasi-sensitive attributes are suppressed while the others are generalized. Suppres-
sion is indicated by asterisk symbol and generalization is executed by assigning the
attribute to a particular range. It is observed from Table 2 that the quasi-sensitive
attributes of first and second records, third and sixth records, fourth, fifth, and sev-
enth records are identical. Therefore, minimum of two records appear identical after
using 2-anonymity. However, “diseases” column is exempted from anonymization
as it serves as a sensitive attribute.
We evaluate the similarity between the original dataset D and the anonymized
dataset D ∗ given in Tables 1 and 2, respectively. We consider the first row of both
the tables and then find the overall similarity between the two datasets.
Let D1 and D1∗ represent the first rows in Tables 1 and 2 respectively, compris-
ing only the quasi-sensitive attributes. Assume that each quasi-sensitive attribute

Table 1 Cardiac patient data with attributes

Name Gender Age Address Heart-rate Diseases
James Male 75 700135 58 Bradycardia
David Male 70 700309 98 Atrial fibrillation
Alex Male 68 700435 51 Bradycardia
Hena Female 69 700252 160 Tachycardia
Diana Female 56 700238 140 Tachycardia
Smith Male 61 700309 51 Bradycardia
Maria Female 52 700258 100 Atrial fibrillation

Table 2 Cardiac patient data with 2-anonymity


Gender Age Address Heart-rate Diseases
Person ≥70 700100-700399 * Bradycardia
Person ≥70 700100-700399 * Atrial fibrillation
Person * 700200-700499 * Bradycardia
Person * 700100-700299 ≥100 Tachycardia
Person * 700100-700299 ≥100 Tachycardia
Person * 700200-700499 * Bradycardia
Person * 700100-700299 ≥100 Atrial fibrillation

is represented by 1 in D1 . On the other hand, consider the suppressed attributes


that are anonymized by asterisk to be 0 in D1∗ assuming the worst case of get-
ting erroneous results on de-anonymization. Also, the generalized attributes are
denoted by 1 in D1∗ with the original attribute being a sub-set of the generalized
quasi-sensitive attribute. For example, male is a subset of person and 700135 is
a subset of [700100-700399]. Calculating the cosine similarity for the first row, we get

cos(D1, D1∗) = (D1 · D1∗) / (‖D1‖ ‖D1∗‖) = (1·1 + 1·1 + 1·1 + 1·0) / (√4 · √3) = √3/2

Similarly, if we calculate the cosine similarity for the rest of the rows, we get values
√3/2, 1/√2, √3/2, √3/2, 1/√2, and √3/2 for the second, third, fourth, fifth, sixth,
and seventh rows, respectively. Hence, we get a minimum similarity measure of
Sim(D, D∗) = Σ_{i=1}^{7} cos(D_i, D_i∗) ≈ 5.744 for the 2-anonymized cardiac
dataset with respect to the original dataset.
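The row-by-row arithmetic above can be reproduced with a short script. This is a sketch; the 0/1 encoding follows the worst-case convention described in the text (1 for a generalized quasi-sensitive attribute, 0 for a suppressed one), and the helper name is mine.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Each original record is all ones over the four quasi-sensitive attributes
# (gender, age, address, heart-rate); in the anonymized record a generalized
# attribute stays 1 and a suppressed attribute becomes 0.
original = [[1, 1, 1, 1]] * 7
anonymized = [
    [1, 1, 1, 0],  # row 1: heart-rate suppressed
    [1, 1, 1, 0],  # row 2
    [1, 0, 1, 0],  # row 3: age and heart-rate suppressed
    [1, 0, 1, 1],  # row 4: age suppressed, heart-rate generalized
    [1, 0, 1, 1],  # row 5
    [1, 0, 1, 0],  # row 6
    [1, 0, 1, 1],  # row 7
]
sim = sum(cosine(o, a) for o, a in zip(original, anonymized))
print(round(sim, 3))  # 5.744
```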
However, considering the k-anonymization in Table 2, if any adversary targets
Hena’s details, then the adversary will be sure based on fourth and fifth records that
she is suffering from Tachycardia and also about other relevant details of her from
the dataset. Therefore, in order to deal with this problem, we use 2-diversity where
the equivalent classes with same sensitive attributes are diversified by the “address”
quasi-sensitive attribute.

3.2 Anonymization with l-Diversity

We use l = 2 in the l-diversity anonymizing method. It can be observed from Table 3 that the fourth and fifth records have addresses 70025* and 70023*, respectively. Hence,
diversity between similar fourth and fifth records of Table 2 is created through the
“address” quasi-sensitive attribute in Table 3. However, the address of seventh record
is again 70025* but the disease is atrial fibrillation in this case. Therefore, the adver-
sary will get confused from fourth, fifth, and seventh records whether Hena has
tachycardia or atrial fibrillation. Now, the probability of atrial fibrillation, bradycar-
dia, and tachycardia for Hena is {1/2, 0, 1/2}, respectively, from similar fourth and

Table 3 Cardiac patient data with 2-diversity


Gender Age Address Heart-rate Diseases
Person ≥70 70013* * Bradycardia
Person ≥70 70030* * Atrial fibrillation
Person * 70043* * Bradycardia
Person * 70025* ≥100 Tachycardia
Person * 70023* ≥100 Tachycardia
Person * 70030* * Bradycardia
Person * 70025* ≥100 Atrial fibrillation

seventh records in Table 3. On calculating the similarity measure, here also we get a
minimum cosine similarity of about 5.744 out of 7 with respect to the original dataset.
However, l-diversity does not secure the data from similarity attacks owing to
semantic similarities [11]. This problem is overcome using the t-closeness technique,
in which the anonymization is strengthened with more suppression and/or generalization
governed by a threshold limit t. The difference between the probability of
occurrence of a sensitive attribute in the original distribution and that in the anonymized
distribution should not exceed the threshold t, and hence the anonymity of the
distribution increases under t-closeness.

3.3 Anonymization with t-Closeness

We calculate the threshold value t = 0.286 for this example, with Table 4 satisfying
0.286-closeness. It is observed from Table 4 that there are two equivalent classes
in the equivalent dataset. One equivalent class comprises the first two records while
the other class contains the rest of the records. So the probability of atrial fibrillation,
bradycardia, and tachycardia is {1/2, 1/2, 0} for the first class and
{1/5, 2/5, 2/5} for the second class. We calculate the value of t according to the formula given by Eq. 1 and
get t = (2/7 − 0) = 0.286. However, on calculating the cosine similarity measure
for the 0.286-closeness-based anonymized dataset against the original dataset, we get
a lower value of minimum similarity owing to the increase in the number of suppressed
attributes. t-closeness ensures that the probability of occurrence of a sensitive attribute
in the anonymized dataset does not deviate beyond the threshold limit t, beyond
which t-closeness would fail to be satisfied. Therefore, anonymity against the adversary
prevails in a controlled way.
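Eq. 1 is not reproduced in this excerpt; assuming it computes the maximum absolute difference (variational distance) between each equivalence class's disease distribution and the overall distribution, the value t = 2/7 ≈ 0.286 above can be reproduced with a short sketch:

```python
from collections import Counter

def t_closeness(classes, overall):
    """Maximum absolute difference between each class's sensitive-value
    distribution and the overall distribution (variational distance)."""
    total = sum(overall.values())
    p_all = {d: c / total for d, c in overall.items()}
    t = 0.0
    for cls in classes:
        n = len(cls)
        p_cls = Counter(cls)
        for disease, p in p_all.items():
            t = max(t, abs(p_cls.get(disease, 0) / n - p))
    return t

# Overall disease counts in Table 4 and its two equivalence classes.
diseases = Counter({"Bradycardia": 3, "Tachycardia": 2, "Atrial fibrillation": 2})
classes = [
    ["Bradycardia", "Atrial fibrillation"],                       # records 1-2
    ["Bradycardia", "Tachycardia", "Tachycardia",
     "Bradycardia", "Atrial fibrillation"],                       # records 3-7
]
print(round(t_closeness(classes, diseases), 3))  # 0.286
```

The maximum is reached at tachycardia in the first class: |0 − 2/7| = 0.286, matching the value computed in the text.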
8 P. Khan et al.

Table 4 Cardiac patient data with 0.286-closeness


Gender Age Address Heart-rate Diseases
Person ≥70 7001*-7003* * Bradycardia
Person ≥70 7001*-7003* * Atrial fibrillation
Person * 7001*-7004* * Bradycardia
Person * 7001*-7004* * Tachycardia
Person * 7001*-7004* * Tachycardia
Person * 7001*-7004* * Bradycardia
Person * 7001*-7004* * Atrial fibrillation

3.4 Anonymization with Single Identity Clustering

We enhance the performance of the t-closeness-based anonymization technique with
the proposed single identity clustering method, which further increases the diversity
of the anonymized dataset. From Table 5, it can be observed that there are three
equivalent classes. First and second records, third and fourth records, and fifth, sixth
and seventh records form the three classes. The probability distribution following
the order of atrial fibrillation, bradycardia and tachycardia for first class is {1/2, 1/2,
0}, for second class is {0, 1/2, 1/2}, and for third class is {1/3, 1/3, 1/3}. Calculating
the value of t using Eq. 1, we get t = 0.286 in this case also. Grouping into clusters
is carried out using Algorithm 1, as per which no sensitive attribute occurs
more than once in a cluster, thereby creating more diversity and hence increased
anonymity against the adversary. Table 5 shows three clusters denoted by three colors
(yellow, orange, and cyan). The sensitive attribute corresponding to each identity
is highlighted in bold; a disease in bold denotes its presence in the
particular cluster. It can be observed that the yellow cluster contains bradycardia
and atrial fibrillation (first and second records) only once. Similarly, the orange
cluster contains bradycardia and tachycardia only once, that is, the third and
fourth records. Again, the third equivalent class contains tachycardia (fifth record),
bradycardia (sixth record), and atrial fibrillation (seventh record). Had a sensitive
attribute been repeated, for example, two tachycardia records in the third
equivalent class, the two tachycardia records would have formed two separate
clusters within the same equivalent class.
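Algorithm 1 itself is given earlier in the chapter and is not reproduced in this excerpt; the clustering rule it describes (no sensitive value more than once per cluster, with a repetition opening a new cluster in the same equivalence class) can be sketched as:

```python
def single_identity_clusters(sensitive_values):
    """Split a sequence of sensitive attribute values into clusters in
    which each value appears at most once; a repeated value opens a new
    cluster (a sketch of the rule described for Algorithm 1)."""
    clusters, current = [], []
    for value in sensitive_values:
        if value in current:          # repetition -> start a new cluster
            clusters.append(current)
            current = []
        current.append(value)
    if current:
        clusters.append(current)
    return clusters

# Third equivalence class of Table 5: all distinct, hence one cluster.
print(single_identity_clusters(["Tachycardia", "Bradycardia", "Atrial fibrillation"]))
# A hypothetical repetition of Tachycardia splits the class into two clusters.
print(single_identity_clusters(["Tachycardia", "Tachycardia", "Bradycardia"]))
```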

Table 5 Cardiac patient data anonymized using single identity clustering

Gender Age Address Heart-rate Diseases
(the data rows, color-coded into the three clusters, are not reproduced here)

4 Conclusion

In this work, various data anonymization techniques are discussed, especially from
the perspective of the healthcare sector. The proposed single identity clustering method
strengthens the existing anonymization methods in enhancing the privacy of sensitive
data. These data anonymization techniques are equally applicable to other
domains such as the defense, banking, and telecom sectors. Securing data from
cyberattacks and unscrupulous actors is the need of the hour, given the increasing
digitization of public services and monetary transactions. Research is being carried
out worldwide to secure data, which at times proves to be more precious than
anything else.

Acknowledgements This work acknowledges the support rendered by the Early Career Research
(ECR) award scheme project “Cyber-Physical Systems for M-Health” (ECR/2016/001532) (dura-
tion 2017–2020), under Science and Engineering Research Board (SERB), Govt. of India.

References

1. F. Prasser, F. Kohlmayer, Medical Data Privacy Handbook (Springer, Berlin, 2015), pp. 111–
148
2. A. McLeod, D. Dolezel, Decis. Support Syst. 108, 57 (2018)
3. S.S. Rani, J.A. Alzubi, S. Lakshmanaprabu, D. Gupta, R. Manikandan, Multimedia Tools Appl., pp. 1–20 (2019)
4. M. Zhang, A. Raghunathan, N.K. Jha, IEEE Trans. Biomed. Circuits Syst. 7(6), 871 (2013)
5. M. Al-Zubaidie, Z. Zhang, J. Zhang, Int. J. Environ. Res. Public Health 16(9), 1490 (2019)
6. Y. Sei, H. Okumura, T. Takenouchi, A. Ohsuga, IEEE Trans. Dependable Secure Comput.
(2017)
7. P. Khan, Y. Khan, S. Kumar, in 2020 International Conference on COMmunication Systems &
NETworkS (COMSNETS) (IEEE, New York, 2020), pp. 658–661
8. X. Xu, M. Numao, in 2015 Third International Symposium on Computing and Networking
(CANDAR) (IEEE, New York, 2015), pp. 499–502
9. P. Shi, L. Xiong, B. Fung, in Proceedings of the 19th ACM International Conference on Information and Knowledge Management (ACM, New York, 2010), pp. 1389–1392
10. A. Narayanan, V. Shmatikov, University of Texas at Austin (2008)
11. N. Li, T. Li, S. Venkatasubramanian, in 2007 IEEE 23rd International Conference on Data
Engineering (IEEE, New York, 2007), pp. 106–115
Optimization Model for Production
Planning: Case of an Indian Steel
Company

Tuhin Banerjee and Saroj Koul

Abstract Steelmaking has evolved in the past decade to incorporate advanced
technologies in line with global development. However, the use of data analytics,
artificial intelligence, and advanced optimization processes in day-to-day
operations is yet to evolve. This paper aims to develop an optimization model for
planning the production sequence and allocating resources, keeping in view the
technical and resource constraints of the steel plant. Multi-criteria decision
modeling is used to generate the product mix, and sequencing of the product mix is
done through an iterative route-finding method, based on the selected objective,
using transportation algorithms.

Keywords Product mix · Integrated plant control · LP model · Computer


application · Steel plant · India

1 Introduction

Metal industries like steel are considered the backbone of the economy. Hence, the
level of production and consumption of steel is an indicator of a country's economic
progress. The past decade has seen a tremendous rise in steel demand across the
globe and in India, backed by the rapid industrialization and globalization of economies.
India's steel output is expected to rise to 128.6 MT by 2021 and envisages reaching
300 MT in the coming ten years [1]. Keen competition in the steel market has
forced several steel manufacturing companies to use resources in the most efficient
ways to maintain their profit levels and competitiveness. The rise in steel demand
has led to the modernization of the steelmaking process, adopting high-speed
continuous processes with advanced manufacturing setups. In line with the global

T. Banerjee · S. Koul (B)


Centre for Supply Chain and Logistics Management, OP Jindal Global University, NCR Delhi,
India
e-mail: skoul@jgu.edu.in
T. Banerjee
e-mail: btuhin02e@gmail.com

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 11
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_2
12 T. Banerjee and S. Koul

trends, steel manufacturing in India has shifted focus to the production of high-
quality steel with different product mixes at a lower cost in a JIT delivery setup to meet
market needs. The competitive market demands the use of a computer-
integrated manufacturing system to improve productivity and reduce waste and cost
while meeting production needs, making the "production planning and scheduling
system" the critical component of an integrated manufacturing setup.
The current work analyzes the "production planning and scheduling system" of
a steel manufacturing company in India to develop a computer-integrated
manufacturing system using mixed-integer linear programming for providing
optimum solutions. The steel company analyzed produces economical and efficient
steel through backward and forward integration, and it is continuously scaling capacity
utilization and efficiencies to capture opportunities in the sector. Product mix and
availability are the two concerns in satisfying the customers. Both decisions are
guided by customer demand, considering the capacity and capability of the
company in producing the needed products. To satisfy customer demand, optimum
management of the raw materials and intermediate products is of prime concern
for the planning and scheduling process.
The decision about which item needs to be produced and in what quantity, along
with the allocation of the raw materials and intermediate products, dictates the
planning process at a plant with a wide variety of product offerings for the steel
market. Production planning is thus the most critical operation, needing
constant coordination with various departments like marketing and sales, and
logistics. A proactive approach to remaining competitive is to be efficient and produce
the right product in the optimum quantity at the right time. The lack of optimization
tools in the planning process in both large- and small-scale steel manufacturing
points to the need for a small-scale adaptive optimization tool using a simple
computational algorithm. The analysis attempts to devise an optimization model for
the planning process, allocating the resources and planning the plant production
schedule through mathematical modeling and advanced computer algorithms.

2 Literature Review

Production planning in the manufacturing setup helps in efficient resource
utilization in the production process to meet the customer's demand [2], while
proper scheduling helps achieve optimum productivity while minimizing waste
[3]. An integrated planning and scheduling process improves the competitive
position of manufacturing by reducing inventory and lead times while increasing
on-time order fulfillment and machine utilization [3]. Researchers have
extensively studied production planning and optimization problems and used
mathematical modeling for optimization. While the optimization of production
planning and material resource planning has been studied extensively, an integrated
approach with scheduling in a multi-stage production system has not been
considered [4, 5].
Optimization Model for Production Planning: Case of an Indian … 13

The conventional planning and scheduling problem is accomplished using a
sequential approach where "planning optimization" determines the production
targets and "sequencing activity" determines the tactical plan to achieve those
targets effectively [6]. However, this approach results in significant trade-offs and
suboptimal decision making in a multi-stage integrated steel manufacturing
operation and, as such, leads to higher costs and failures to meet production targets in
closely linked operations [7, 8]. Research in the past decade has pointed out the
limitation of increasing complexity in modeling an integrated planning and
scheduling optimization. This limitation has forced researchers to scale down the
objective to a simplified problem and to focus more on developing efficient
strategies [9].
In this case, at a steel company, the "production planning and scheduling"
problem has been optimized through mixed-integer linear programming. If the
scheduling of operations and diversification are not integrated, the result is
inefficient resource utilization along with suboptimal results. Also, in today's
competitive environment, the plant operates with more than one production
goal. Goals like maximizing profit and cash velocity or delivering products
on time are essential to staying buoyant in the market. As such, optimization with the
singular goal of production maximization with efficient resource utilization might
not result in an optimally competitive solution.
The current work tries to address these challenges with an integrated approach to
optimizing the production and scheduling problem of a multi-stage steel production
environment; still, it can be used for similar small- and large-scale operations. The
production planning optimization model developed in the study uses a
"multi-objective mixed-integer linear programming model" to generate the optimal
production plan, which is integrated with scheduling and resource planning to build
an integrated end-to-end planning and optimization tool.

3 Methodology

The production planning tool in the study is developed on the flexible Microsoft Excel
platform to be cost-efficient, user-friendly, and adaptable, while allowing
further integration with an ERP system if available. The model draws on several pieces
of research in steel production management [10–12] and uses a linear programming
model [13] for identifying the product mix and a transportation algorithm using LP
to schedule the production process. The input–output model takes the current order
status, plant performance data, and planning parameters as input and generates an
optimum product mix, production sequence, and raw material requirement as output.
The optimization model offers various objective functions, such as profit
maximization, productivity maximization, and profit or productivity maximization
in combination with OTIF (On-Time-In-Full) maximization, for deciding the product
mix. The model so devised is a fully functional model for the rail and structure mill.

A similar model can be used for other mills and output from all the mills can be
integrated for devising the planning optimization model for the steel melting shop.
In the next section, first, the production planning process, model development,
and model output are briefly explained. This is followed by the analysis and future scope
of improvement and implementation of the model, concluding with suggestions
for the plant.

3.1 The Planning Process

The planning department in the manufacturing industry studied integrates and
analyzes data from several departments, such as sales and order data from the
marketing department, production data from the plant, payment data from marketing finance,
and dispatch data from logistics, before generating the production plan for the
department and the product mix. Figure 1 portrays the flowchart of the planning
process.
The planning department on receiving the current order status of the department
checks FI (finance indicator) release or SBU (strategic business unit) release along
with plant and QC (quality control) release before considering the order for planning.
All the orders in the section are clubbed to form a campaign for the particular section.
Limitations of the plant to take the campaign above a predefined tonnage are taken
into consideration. Also, the promise date of the order is taken into consideration
while scheduling the campaign, and the raw material requirement is coordinated
with the supplying department. The plan so created with all these considerations is
communicated to the department and is continuously monitored for compliance.
The manual process so followed takes a lot of human effort and mostly does not yield the
optimum plan or sequence, as the process lacks a mathematical optimization model
and is based solely on human experience. The tool developed takes all of the above
manual steps into consideration and adds an optimization model to generate
the product mix and the production sequence. The flexibility of the tool allows the
user to plan for the day, week, or month based on the requirement and to add specific
constraints that arise now and then during the continuous operations of the unit.

Fig. 1 Flowchart of the planning process



Fig. 2 Opening screen of the tool

3.2 The Optimization Model

3.2.1 Overview

The optimization tool is developed using MS Office Visual Basic for Applications
(Office 2019) [14], User forms, and ActiveX objects to create the user interface and
Open Solver (2019) for linear programming and optimization. The snapshot of the
opening screen of the tool is in Fig. 2.
Each tab in the tool has a specific function that is explained in detail in the
subsequent section.
The first group of tabs, containing "Import COS", "Update Plant Data", "Planning
Parameter", "Optimize Production Plan", and "Optimize Production Schedule", is
for user inputs. The second group of tabs, containing "Production Plan", "Rolling
Schedule", and "Raw Material Plan", gives the ready-to-use output for the user
and is the final objective of the model. The third group of tabs, containing "Show
COS", "Show Plant Data", and "Show P. Parameter", gives the user the option to
view and check all the inputs from his end for the planning and optimization model
(Table 1).

4 Functional Details

4.1 Import COS Data

The "Import COS Data" tab is the starting sequence of the model and allows the user
to import the current order status, as available in the ERP system, from a worksheet
named "COS" with header data in the first row into the model.

Table 1 Nomenclature adopted

Term                           Description
Import COS data                For user input
Update plant data              For user input
Planning parameter             For user input
Optimize production schedule   For user input
Production plan                Ready-to-use output for the user
Rolling schedule               Ready-to-use output for the user
Raw material plan              Ready-to-use output for the user
Show COS                       Ability to view and check all the inputs
Show plant data                Ability to view and check all the inputs
Show P. parameter              Ability to view and check all the inputs

The "COS data" contains all pending orders as on a given date in the plant. The model
then checks all the criteria, like "FI release", "SBU release", and "Plant Release", as
well as the minimum order quantity, to find the executable "BTR" (balance to roll) and
generate the section-wise available "BTR" for rolling, the input for the optimization
model. The imported data can be viewed and checked using the "Show COS" tab,
which opens all the imported data in a new workbook for viewing.
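As a rough illustration of this filtering step (the field names below are invented for illustration and are not the plant's actual ERP columns), the section-wise executable BTR might be derived as:

```python
def executable_btr(orders, min_order_qty):
    """Filter pending orders into executable balance-to-roll (BTR),
    summed per section: an order qualifies only if all release flags
    are set and its balance meets the minimum order quantity."""
    btr = {}
    for order in orders:
        released = (order["fi_release"] and order["sbu_release"]
                    and order["plant_release"])
        balance = round(order["order_qty"] - order["billed_qty"], 3)
        if released and balance >= min_order_qty:
            btr[order["section"]] = btr.get(order["section"], 0.0) + balance
    return btr

orders = [
    {"section": "UB 406 X 178", "order_qty": 12.0, "billed_qty": 8.63,
     "fi_release": True, "sbu_release": True, "plant_release": True},
    {"section": "UB 406 X 178", "order_qty": 25.0, "billed_qty": 14.46,
     "fi_release": False, "sbu_release": True, "plant_release": True},
]
print(executable_btr(orders, min_order_qty=1.0))  # {'UB 406 X 178': 3.37}
```

The second order is dropped because its FI release is missing, mirroring the release checks described above.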

4.2 Update Plant Data

This tab gives the option to the user to add plant-specific data in the model. The
steel plant data contains all the key performance indicators of each product produced
by the plant, such as hot hour rate, utilization, yield, and EBITDA along with raw
material required, minimum and maximum campaign size, and sequence changeover
time (Fig. 3).
The user can add new data, update existing data, and delete existing data as and
when required. Along with these, the user can input a new raw material, edit existing
raw material, and delete raw material from the raw material data group (Fig. 4).
The sequence change time tab allows the user to update the sequence change
time from one section to other. All the data entered or available in the model can be
checked in the “Show Plant Data” tab by the user, which opens the data in the model
in a new user workbook.

Fig. 3 Update plant production data

Fig. 4 Update plant production data–2

4.3 Planning Parameter

The planning parameter tab allows the user to input the planning parameters. There
are options to enter the planning date from which the model is to plan
production, the target production for the planning tool, and the optimization
scheme. The user selects the optimization scheme (Fig. 5) from a drop-down list
containing options like EBITDA, Min Rolling Time, EBITDA + OC, and Min Rolling
Time + OC. Based on the user's selection, the model will choose the optimization
objective and produce the production plan. There are additional tabs to enter raw
material constraints, campaign constraints, and critical orders. Once all the inputs are
done, the user needs to update the data to add it to the database.
The raw material constraint allows the user to add any limitation of raw material
while planning. The user can see the required quantity of raw material and add
available quantity which will be considered as a constraint. The campaign constraint

Fig. 5 Production planning parameters

allows the user to add the minimum campaign quantity that must be rolled for a section.
If no such constraint exists, the user can leave the field empty, while if a section is
not to be considered in the rolling plan, it must be updated with 0.
The critical order tab allows the user to add, through the drop-down list, an order
from COS that does not qualify as an executable order due to criteria like "FI release"
or "Plant Release" but must be considered in the current rolling plan. All the data
entered by the user in the model can be checked in the "Show P. Parameter" tab,
which opens the data in the model in a new user workbook.

5 Optimization Details

The planning tool has two optimization algorithms that execute through Open Solver:
one for generating the product mix, i.e., the production plan, and the other for generating
the production sequence. The working of the model is as follows:

5.1 Optimize Production Plan

The production plan optimization algorithm uses a simple linear programming tool
to generate an optimum product mix, and the objective function depends on the
planning parameters.
The options currently available for the user in the model are as follows:
• EBITDA: For the optimization under EBITDA scheme, the objective function is
as follows:

1. Maximize Campaign Completion;


2. Maximize EBITDA.
• Min Rolling Time: For the optimization under Min Rolling Time scheme, the
objective function is as follows:
1. Maximize Campaign Completion
2. Minimize Rolling Time.
• EBITDA + OC: For the optimization under EBITDA + OC scheme, the objective
function is as follows
1. Maximize Campaign Completion
2. Maximize EBITDA
3. Maximize Order Completion.
• MRT + OC: For the optimization under MRT + OC scheme, the objective function
is as follows
1. Maximize Campaign Completion
2. Minimize Rolling time
3. Maximize Order Completion.
• Various constraints under consideration for the optimization are as follows:
(A) Campaign Constraint—Must be higher than the minimum campaign size
decided for the section else at zero (0).
(B) Target Campaign Constraint—Must be greater than or equal to the target
campaign plan entered in the planning parameter.
(C) Raw Material Constraint—Required raw material must be less than the
available raw material in the planning parameter.

5.2 Mathematical Formulation

The indices used in the formulation are as follows:

• i ∈ {1, 2, 3, …, N}, the sales orders;
• j ∈ {1, 2, 3, …, m}, the products manufactured;
• k ∈ {1, 2, 3, …, p}, the raw materials used in the manufacturing unit;
• n ∈ {1, 2, 3, 4}, the selected objective function.

In this optimization problem, the decision variables are:

1. PQ_i: the optimized production quantity against sales order i; and
2. PS_j: the (binary) selection of product j for production in the manufacturing unit.
The different optimization variables used in the objective formulation are as
follows:

1. CR_j is the percentage order completion against product j,
   CR_j = (Σ_{i ∈ j} OP_i) / QTR_j, where QTR_j is the available production
   order for product j.
2. RT_j is the normalized rolling time against product j,
   RT_j = PC_j / PC_jmax ∀ j, where PC_j is the ratio of production rate and
   facility utilization while producing j.
3. OC_j is the normalized order compliance index against product j,
   OC_j = (Σ_{i ∈ j} OCI_i) / QTR_j, where OCI_i is calculated based on the
   duration left to fulfill the order and the share of order i in the total
   product order quantity.
4. PR_j is the normalized profit realized for product j,
   PR_j = EBITDA_j / EBITDA_jmax ∀ j.
The multi-criteria decision objective is formulated through multiplicative models
where the weights of the respective objectives w1, w2, w3, and w4 are obtained through
the complex decision-making AHP technique [15, 16].
The objective function for the various cases is thus formulated as:

• Case 1 (n = 1): Z_Maximize = Σ_{j=1}^{m} (w1·CR_j)·(w4·PR_j)
• Case 2 (n = 2): Z_Maximize = Σ_{j=1}^{m} (w1·CR_j)·(w2·RT_j)
• Case 3 (n = 3): Z_Maximize = Σ_{j=1}^{m} (w1·CR_j)·(w4·PR_j)·(w3·OC_j)
• Case 4 (n = 4): Z_Maximize = Σ_{j=1}^{m} (w1·CR_j)·(w2·RT_j)·(w3·OC_j).
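The AHP weighting referenced above can be sketched with a stdlib-only power iteration on a pairwise comparison matrix; the comparison judgments below are invented for illustration and are not the study's actual values:

```python
def ahp_weights(pairwise, iterations=100):
    """Approximate the principal eigenvector of a pairwise comparison
    matrix by power iteration, normalized to sum to 1 (the standard
    AHP priority vector)."""
    n = len(pairwise)
    w = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]
    return w

# Hypothetical judgment: campaign completion is 3x as important as EBITDA.
matrix = [
    [1.0, 3.0],
    [1 / 3.0, 1.0],
]
print([round(x, 3) for x in ahp_weights(matrix)])  # [0.75, 0.25]
```

For a perfectly consistent matrix such as this one, the priority vector is simply the normalized column of relative importances.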

The solution is optimized subject to the following constraints:

1. PQ_i ≥ 0 ∀ i (non-negativity);
2. PS_j is binary;
3. PQ_i ≤ AOQ_i ∀ i, where AOQ_i is the available order quantity against
   sales order i;
4. Σ_{j=1}^{m} RPT_j ≤ AT, where RPT_j is the required production time for
   product j and AT is the total available time;
5. Σ_{i ∈ j} PQ_i ≥ PT_j ∀ j, where PT_j is the minimum target production for
   product j and PQ_{i ∈ j} is the optimized production quantity of all orders
   i belonging to product j;
6. Σ_{i ∈ j} PQ_i ≤ PS_j·QTR_j ∀ j;
7. PS_j·QTR_j ≥ MPQ_j ∀ j, where MPQ_j is the minimum production quantity for
   product j;
8. ORM_k ≤ ARM_k ∀ k, where ORM_k is the type-k raw material requirement based
   on the optimized production quantity and ARM_k is the available quantity of
   that raw material;
9. Σ_{i=1}^{n} PQ_i ≤ PP, where PP is the target production in the planning
   horizon.
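As a toy stand-in for the Open Solver MILP (the product data and weights below are invented for illustration), the Case 1 objective together with a rolling-time budget can be explored by brute force over the binary selections PS_j:

```python
from itertools import product as cartesian

def best_product_mix(products, w1, w4, available_time):
    """Exhaustively search binary product selections PS_j maximizing the
    Case 1 objective sum_j (w1*CR_j)*(w4*PR_j), subject to a total
    rolling-time budget.  A toy stand-in for the Open Solver MILP."""
    best = (None, float("-inf"))
    names = list(products)
    for ps in cartesian([0, 1], repeat=len(names)):
        time = sum(products[n]["time"] for n, s in zip(names, ps) if s)
        if time > available_time:
            continue  # violates the available-time constraint
        z = sum((w1 * products[n]["CR"]) * (w4 * products[n]["PR"])
                for n, s in zip(names, ps) if s)
        if z > best[1]:
            best = (ps, z)
    return dict(zip(names, best[0])), best[1]

# Hypothetical per-product completion ratios, profit indices, and hours.
products = {
    "UB 406": {"CR": 0.9, "PR": 1.0, "time": 6},
    "UC 254": {"CR": 0.7, "PR": 0.8, "time": 5},
    "WPB 700": {"CR": 0.5, "PR": 0.9, "time": 7},
}
mix, z = best_product_mix(products, w1=0.6, w4=0.4, available_time=12)
print(mix)  # {'UB 406': 1, 'UC 254': 1, 'WPB 700': 0}
```

Brute force is feasible only for a handful of products; the actual tool delegates this search to the LP/MILP solver.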

5.3 Optimize Production Sequence

The optimized production sequence is generated through the transportation algorithm


using linear programming to minimize the changeover time along with algorithms
written through VBA scripts to generate the sequence of production. The algorithm
is designed to generate the optimum production sequence through an iterative route-
finding method based on the objective selected.
There are two algorithms in this model to generate the production sequence. The
first algorithm tries to optimize the plant productivity considers the changeover
time from one product to other to devise the sequence having the minimum
changeover delay. The second algorithm is based on delivery optimization that
takes into consideration both the sequence change over time and promise date of
the material to devise a sequence that is best suitable to meet the new demand of
the product minimizing the total changeover time.
The model finally uses the output of the sequencing optimization tool to generate
the raw material requirement planning based on the manufacturing parameters.
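The iterative route-finding procedure is not fully specified in this excerpt; a greedy nearest-neighbour simplification of changeover-time minimization, with an invented changeover matrix, conveys the idea:

```python
def sequence_by_changeover(changeover, start):
    """Greedy nearest-neighbour ordering of campaigns: from the current
    section, always roll next the unvisited section with the smallest
    changeover time.  A simplification of the chapter's iterative
    route-finding approach."""
    remaining = set(changeover) - {start}
    order, current, total = [start], start, 0
    while remaining:
        nxt = min(remaining, key=lambda s: changeover[current][s])
        total += changeover[current][nxt]
        order.append(nxt)
        remaining.discard(nxt)
        current = nxt
    return order, total

# Hypothetical changeover hours between three sections.
changeover = {
    "MC 400": {"MC 400": 0, "UC 356": 1, "NPB 450": 3},
    "UC 356": {"MC 400": 1, "UC 356": 0, "NPB 450": 2},
    "NPB 450": {"MC 400": 3, "UC 356": 2, "NPB 450": 0},
}
print(sequence_by_changeover(changeover, "MC 400"))
# (['MC 400', 'UC 356', 'NPB 450'], 3)
```

A greedy pass gives a good, not necessarily optimal, route; the LP-based transportation formulation in the tool can do better, and promise dates (the second algorithm) would enter as additional terms in the selection rule.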

6 Outputs

The various outputs for the user-generated by the model are as follows:

6.1 Production Plan

The production plan button generates an Excel workbook containing all the orders
to be executed after optimization in the current production plan, along
with the optimized quantity to be produced (Fig. 6).
The generated sheet has all the necessary information for the planning user as
well as the plant.

6.2 Production Sequence

The production sequence button generates the optimized production sequence to be
followed, as well as the day-wise campaign quantity. This sheet has the added information
of the production time and changeover time (Fig. 7).
The generated sheet gives the user the option to input shutdown time, if
any; it will automatically recalculate the day-wise production quantity based
on the entered shutdown time.
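Assuming a constant rolling rate, the day-wise recalculation around a shutdown can be sketched as follows (the interface is hypothetical, not the tool's actual VBA):

```python
def daywise_quantities(campaign_qty, rolling_hours, start_hour, shutdown_hours=0.0):
    """Spread a campaign's tonnage over calendar days at a constant
    rolling rate; shutdown hours simply push the campaign start later
    (a sketch of the recalculation the sheet performs on user input)."""
    rate = campaign_qty / rolling_hours          # tonnes per rolling hour
    clock = start_hour + shutdown_hours          # shutdown delays the start
    produced, days = 0.0, {}
    while produced < campaign_qty - 1e-9:
        day = int(clock // 24) + 1
        hours_available = (int(clock // 24) + 1) * 24 - clock
        hours_rolled = min(hours_available, (campaign_qty - produced) / rate)
        days[day] = days.get(day, 0.0) + hours_rolled * rate
        produced += hours_rolled * rate
        clock += hours_rolled
    return days

# 100 t campaign rolling at 20 t/h from hour 22: splits across two days.
print(daywise_quantities(100, 5, start_hour=22))                     # {1: 40.0, 2: 60.0}
# A 24 h shutdown shifts the same campaign one day later.
print(daywise_quantities(100, 5, start_hour=22, shutdown_hours=24))  # {2: 40.0, 3: 60.0}
```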

Fig. 6 Production plan (sample output listing, per order: sales office, SO type, age,
last production date, ship-to party name and city, section, section weight, order
quantity, billed quantity, balance quantity, region, and the optimized planning
quantity)

Fig. 7 Production sequence (sample output listing, per campaign: reference number,
section, campaign quantity, changeover time, hot hour, utilization, rolling time,
shutdown, start and stop times, and day-wise quantities Day1–Day5)

6.3 Raw Material Plan

The Raw Material Plan button generates the day-wise required quantity of raw
material based on the optimized production sequence, as shown in Fig. 8.

Raw Material Requirement


S.No Raw Material Day1 Day2 Day3 Day4 Day5
1 BB 480 x 420 x 120 997.2 943.5 1527.3 1274.1 570.4
2 BB 355 x 280 x 90 1408.5
3 RAIL BLOOM 285 x 390
4 ANGLE BM 285 x 390 47.1 962.7
5 CRANE BM 285 x 390
6 Slab / Slit Slab / Sheet Pile IV

Fig. 8 Raw material planning

7 Conclusion

The model developed uses MS Excel and Open Solver (2019), with limitations in
processing large amounts of data. To optimize the performance of the software, the
number of line items of COS is limited to 3000. Also, the number of sections it
can process is limited to 50, while the raw materials are limited to 20. In case the user
needs more flexibility, these limits can be increased by changing the code. However, if
a considerable amount of data needs to be processed, the user should use a
professional solver tool as well as database software, which can be programmed
accordingly. Also, the current model does not take into account the time needed for
the finished product to be ready for dispatch after production, where different kinds
of inspection schemes need to be taken into consideration.
This can be addressed in future optimization. The model takes the weighted average of the number of days from the promised date to the planning date when considering on-time order completion; this has further scope for improvement and can be optimized toward a just-in-time approach. The tool presented takes into consideration all present aspects of the planning process to decide the executable BTR; any change in the decision criteria must be updated in sync with the tool. Finally, when the tool is used for other mills at the plant, all decision criteria and the optimization scheme must be updated accordingly.
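The stated limits (3000 COS line items, 50 sections, 20 raw materials) can be expressed as a pre-flight check before the solver is invoked. A hypothetical sketch, separate from the workbook's own code:

```python
# Hypothetical pre-flight check for the tool's stated limits; the real
# workbook enforces these inside its Excel/Open Solver setup.

LIMITS = {"cos_line_items": 3000, "sections": 50, "raw_materials": 20}

def check_input_sizes(cos_line_items, sections, raw_materials):
    sizes = {"cos_line_items": cos_line_items,
             "sections": sections,
             "raw_materials": raw_materials}
    violations = [f"{name}={n} exceeds limit {LIMITS[name]}"
                  for name, n in sizes.items() if n > LIMITS[name]]
    return violations  # an empty list means the model can be solved as-is

assert check_input_sizes(2500, 40, 15) == []
assert check_input_sizes(3200, 40, 15) == ["cos_line_items=3200 exceeds limit 3000"]
```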

8 Recommendations

The model is limited to the planning process of the finished product. Similar types
of models can be implemented for the other finishing mills. The output of all the
mills can be used as the input for generating the optimization model for the steel
melting shop through backward integration, thereby completing the optimization
cycle. Further, backward integration of the model can be done for the raw material
planning and procurement process.
24 T. Banerjee and S. Koul

In the current competitive scenario, the optimization model provides a tool for the planning department to optimize production planning and productivity while remaining competitive by global standards.

Acknowledgements The authors thank the executives and administrators at the integrated steel
plant in Central India for deliberations and data for the action learning project undertaken at the
Centre for Supply Chain and Logistics Management from July 2019 to December 2019. We are also
grateful to our reviewers for their valuable suggestions that have made the paper more systematic
and instructive.

References

1. IBEF (2020). https://www.ibef.org/industry/steel-presentation. Accessed 28 Apr 2020


2. W.J. Davis, S.D. Thompson, Production planning and control hierarchy using a generic
controller. IIE Trans. 25(4), 26–45 (1993)
3. P.B. Luh, J.H. Wang, J.L. Wang, R.N. Tomastik, Near-optimal scheduling of manufacturing
systems with the presence of batch machines and setup requirements. Ann. CIRP 46(1), 397–
402 (1997)
4. L. Ozdamar, G. Barbarosoglu, Hybrid heuristics for the multi-stage capacitated lot sizing and
loading problem. J. Oper. Res. Soc. 50(8), 810–825 (1999)
5. B.M. Beamon, J.M. Bermudo, A hybrid push/pull control algorithm for multi-stage, multi-line
production systems. Prod. Plann. Control 11(4), 349–356 (2000)
6. Z.K. Weng, Managing production with flexible capacity deployment for serial multi-stage
manufacturing systems. Eur. J. Oper. Res. 109(3), 587–598 (1998)
7. S. Engell, I. Harjunkoski, Optimal operation: scheduling, advanced control and integration.
Comput. Chem. Eng. 47, 13 (2012)
8. E. Munoz, E. Capon-Garcia, M. Moreno-Benito, A. Espuna, L. Puigjaner, Scheduling and
control decision-making under an integrated information environment. Comput. Chem. Eng.
35, 774–786 (2011)
9. P. Prasad, C.T. Maravelias, Batch selection, assignment and sequencing in multi-stage multi-
product processes. Comput. Chem. Eng. 32, 1106–1119 (2008)
10. Y. Inoue, K. Takanashi, N. Miyazawa, Y. Hyuga, Management and control systems in the steel
industry. Comput. Ind. 5(2), 143–152 (1984)
11. L. Tang, J. Liu, A. Rong, Z. Yang, A review of planning and scheduling systems and methods
for integrated steel production. Eur. J. Oper. Res. 133(1), 1–20 (2001)
12. M. Chen, W. Wang, A linear programming model for integrated steel production and distribution planning. Int. J. Oper. Prod. Manage. (MCB UP Ltd, 1997)
13. L. Yang, G. Jiang, X. Chen, G. Li, T. Li, X. Chen, Design of integrated steel production
scheduling knowledge network system. Cluster Comput. 22(4), 10197–10206 (2019)
14. MS Office Excel Software (Version 2019)
15. A. Emrouznejad, M. Marra, The state of the art development of AHP (1979–2017): a literature
review with a social network analysis. Int. J. Prod. Res. 55(22), 6653–6675 (2017)
16. T.L. Saaty, The Analytical Hierarchy Process (McGraw-Hill, NY, 1980), p. 287
Vision-Based User-Friendly
and Contactless Security for Home
Appliance via Hand Gestures

Richa Golash and Yogendra Kumar Jain

Abstract In the current scenario, where new technologies have proliferated, people want to be safe and secure in every corner of their lives, and that includes their home appliances. People usually do not appreciate someone entering the house and handling the appliances ineptly, and the concern grows when the family includes elderly or differently abled members, as it increases the risk of spreading contagious or infectious diseases. It is therefore the need of the hour to secure home appliances with a security technique that is both user-friendly and pocket-friendly. A security system built on vision-based hand gesture recognition is undoubtedly a good fix for all the problems mentioned. The uniqueness of this concept is the exemption from managing too many passwords. In this methodology, we use dynamic hand gestures as the password to operate different home appliances. The use of a region-based convolutional neural network (RCNN) for localizing hand movement makes the technique robust against the challenges associated with hand gesture recognition (HGR). Another advantage of using region-based deep learning is that the technique does not require a large database for training and testing the architecture. The technique proposed in this chapter is simple, user-friendly, and cost-effective, since it utilizes a simple camera and is invariant to hand shape. Above all, it is designed with full consideration for senior citizens and differently abled people.

Keywords Computational intelligence · Intelligent systems · Region-based convolutional neural network · Deep learning · Home automation · Natural password

1 Introduction

The smart home and smart environment are emerging concepts in which systems are equipped with security and special ways of interaction. This field is also

R. Golash (B) · Y. K. Jain


Samrat Ashok Technological Institute, Vidisha, Madhya Pradesh 464001, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 25
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_3
referred to as home automation, where household appliances such as the microwave, washing machine, air conditioner, and refrigerator are controlled remotely. It provides a feeling of comfort to the user, but what if an intruder gets control of these appliances? For this reason, a security system must be put in place so that these home appliances are not misused by intruders. Nowadays, there are many electronic gadgets in our homes, which in turn require many passwords to be remembered to enable security in these systems. Remembering many passwords is difficult, and even more challenging if the appliances are operated by senior citizens or differently abled family members [1, 2].
The solution to the aforementioned problems is biometric or natural passwords. Conventional biometric systems that are commonly used and enable high-level security include iris recognition, face recognition, fingerprint recognition, voice recognition, behavior recognition, and hand geometry recognition. These biometric authentication systems have gained traction because every human being has unique features that can act as authenticating information, and above all, there is no need for the user to remember them. Of these, fingerprints, iris, and face are the most unique human features and are used in secure biometric recognition systems in areas dealing with sensitive information [3, 4]. Fingerprint identification is one of the oldest techniques; it is feasible in mobile phones and laptop computers but requires a high-resolution camera. The limitation of accepting the fingerprint as a password format in home appliances is that fingerprint recognition is based on accurately identifying minutia points, and these points are of low quality in the case of senior citizens and are difficult to capture for differently abled people. The efficiency of this security system is also affected by the location of the registration point and by translation of the image. The technique requires extra physical movement by the user to enable and disable the security system, along with advanced cameras that increase the cost of the system; therefore, it is not booming in the consumer market. Similarly, the iris pattern and facial appearance do not remain stable in the case of senior citizens and physically challenged people. Additionally, these systems require large memory space, advanced cameras, and extra computation for matching [4, 5]. In summary, the installation of conventional biometric authentication systems in the home environment has the following issues:
1. Operations of home appliances are simple, and the users operating them are not experts. Conventional security systems are complex; they increase the overall complexity of the appliances and are thus not appreciated by ordinary users.
2. The sensors that enable conventional security methods are costly and increase the overall cost of the appliances, which common users do not accept easily. Moreover, appliances such as the microwave, washing machine, and music system do not demand a very high level of security.
Encapsulating the above-mentioned issues, we can say that users at home desire a user-friendly and pocket-friendly security system and are hesitant to use complex systems. Users want to enable or disable security remotely with reduced physical effort, and this is where vision-based natural passwords are preferred. The aim of this chapter is to propose a contactless, vision-based security system that fulfills these user requirements. Additionally, the technique is congenial for senior members and special people in the sense that it reduces the physical movement needed to enable security. We have developed a behavioral security system based on hand gestures using a low-cost RGB camera. We selected four hand postures and two types of hand movement to open and close the operations of any general-purpose appliance used in the home environment (as shown in Fig. 1). To give the security system a fast and accurate response, we use a region-based convolutional neural network (RCNN), a deep learning algorithm. In this chapter, a prototype scheme for generating eight vision-based passwords is illustrated, which can be extended to create more passwords on demand.

Fig. 1 Visualization of the proposed technique

2 Related Work on Hand Gesture Recognition in Home Security and Home Automation

The survey carried out is divided into two sections. In the first section, we discuss researchers' different outlooks and purposes in the development of home automation systems, and we analyze the hand gesture recognition schemes involved in various home-related fields and the natural user interfaces developed. In the second section, we discuss the challenges associated with detection and recognition in real-time applications.
Greichen [6] discussed that manufacturing cost, development cost, and installation cost are some of the limitations in the design of a home automation system. A lack of standards, unfamiliarity with the technology, and the complexity involved in such systems make senior and special people hesitant to get acquainted with the latest trends in the automated home. In their literature survey on the challenges and types of home automation systems, Purohit and Ghosh [7] highlighted that the most prominent requirement in home automation is a contactless and cost-efficient approach to control home appliances in a versatile manner. Guerra-Casanova et al. [8] discussed that any security system designed around a biological characteristic must have the following properties: first, universality, i.e., every user should possess the biometric feature; second, acceptability, i.e., the user must be comfortable using the biometric feature; and third, permanence, meaning the feature should not vary over time.
Premaratne et al. [9] discussed that high accuracy with little confusion, robustness to light variations, and time are some of the important factors in real-time applications. They used the handcrafted techniques of moment invariants and template matching to detect the hand region, the Lucas–Kanade algorithm to track hand motion, and a neural network with support vector machines for classification. Handcrafted features are greatly affected by environmental conditions, and hence such applications are mainly designed with a restricted background [9]. Similarly, Zeng et al. [10] proposed a natural user interface based on handcrafted features for people with limited physical movement due to nervous system disorders. They utilized a multi-cue system of color, shape, and motion for hand detection, and motion history images with a state transition network for motion recognition. Dinh et al. [11] used depth hand silhouettes to control home appliances through hand postures. They created a synthetic database using a commercial 3D graphics package to train a random forest (RF) classifier; commands are synthesized through 21 decision trees with a maximum depth of 20, and sample pixels are selected to extract features. Likewise, Hsiao et al. [12] used a Kinect sensor camera to derive moment-invariant features and an interpretive structural model (ISM) to recognize patterns for a natural user interface. Real-time detection and recognition of a moving hand is highly affected by occlusion, the background environment, and above all the non-rigid characteristics of the hand [13]; thus, a fast and accurate response is a challenging task with handcrafted features.
Xu et al. [14] presented a different concept, matching the user's eye line with the fingertip of the index finger to develop secure cursor movement. The eye region is first detected by an Adaboost classifier, the hand region is segmented to find the index finger using convex hull and convexity defect features, and the cursor is then moved along the eye–fingertip line. Such security systems are complex and beyond user interest. In the study by Ramadhani et al. [15], hand gestures are used as a password to unlock an electronic key from 1–9 using depth images. AlAyubi et al. [16] argued that hand gesture recognition is very useful for independent living and used a Raspberry Pi to process images on the machine; their technique controls connected devices at home with an efficiency of 87.5% at a distance of 0.5 m. Trong et al. [17] proposed a home automation system that uses actuators and sensors embedded in smart mobile phones to capture hand gesture patterns. To construct the gesture vocabulary and recognize the patterns, they used two deep neural networks, a convolutional neural network and DeepConvLSTM.

It is observed that researchers prefer accelerometers, actuators, or mobile sensors to ease the detection phase of hand gesture recognition while designing home automation systems. These are hardware components that senior people are reluctant to wear, and thus automated home appliances are more popular with the younger generation. In contrast, vision-based HGR does not require any wearable device, but the simultaneous detection and recognition of a moving hand region is a challenging task: the area occupied by the moving hand in an image frame is small, and being a non-rigid object, its shape is highly affected by the camera view angle. Additionally, the hand has an uneven surface, so its edges are not very clear when it moves, and this problem worsens when the video is recorded with an average-quality RGB camera [18, 19].
In this chapter, we resolve the aforementioned problems while upholding the tradeoff between simplicity of the model and accuracy of the technique. The proposed system is based on a region-based convolutional neural network (RCNN), a deep learning algorithm [20]. The advantage of using an RCNN is that it reports the presence of an object as well as its location. Hence, the proposed technique has high accuracy in tracing the movement of the hand without any segmentation or background detection.

3 Proposed Methodology

The complete system, as shown in Fig. 2, is divided into three stages. The first stage deals with the design and training of the RCNN architecture as per our requirements. The second stage deals with the testing and validation of the proposed system, and the region of interest (ROI) is detected in this stage: the hand region is detected and then localized in each frame of the video sequence, and its movement is traced to determine the centroid of motion. The third stage performs the decoding of the hand movement as a password to operate the machine. The mathematical modeling of the proposed method is described as follows:

Fig. 2 Schematic diagram of methodology

3.1 First Stage: Design and Training of Region-Based Convolutional Neural Network (RCNN)

Deep learning networks have given new dimensions to object detection and classification because of their powerful way of learning features automatically with minimal manual support. Unsupervised pretrained networks, recurrent neural networks, convolutional neural networks (CNN), and recursive neural networks are the popular types of deep learning algorithms. CNNs are specially designed to derive spatial features from image data by computing 2D convolutions (given by Eq. (1)) and are best suited for image classification and recognition of image patterns.


$$(f * g)(x, y) = \sum_{n=-M}^{M} \sum_{m=-N}^{N} f(x - n,\, y - m)\, g(n, m) \qquad (1)$$
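Eq. (1) can be evaluated directly. The sketch below computes the full 2D convolution as the double sum, with kernel indices taken as nonnegative and f zero-extended outside its support:

```python
import numpy as np

# Direct NumPy evaluation of Eq. (1): the 2D convolution a CNN layer computes,
# written exactly as the double sum.

def conv2d(f, g):
    """Full 2D convolution of image f with kernel g, per Eq. (1)."""
    H, W = f.shape
    kh, kw = g.shape
    out = np.zeros((H + kh - 1, W + kw - 1))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            s = 0.0
            for n in range(kh):        # kernel row index
                for m in range(kw):    # kernel column index
                    i, j = x - n, y - m
                    if 0 <= i < H and 0 <= j < W:  # f is zero outside its support
                        s += f[i, j] * g[n, m]
            out[x, y] = s
    return out

f = np.array([[1., 2.], [3., 4.]])
g = np.array([[1., 0.], [0., 1.]])  # identity plus diagonal shift
out = conv2d(f, g)
# out[1, 1] == 5.0: the centre sample sums f(1,1)*g(0,0) + f(0,0)*g(1,1)
```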

Compared to a traditional artificial neural network, a CNN requires a small number of parameters, but a standard CNN can only classify a picture and predict the probability of an object; it does not give the position of the object [20]. Therefore, we use a region-based convolutional neural network in our technique. An RCNN is able to predict the probability of the presence of an object and, furthermore, to estimate the object's position. There are four important building blocks of an RCNN: region proposal, feature extraction through a CNN network, linear classification, and a bounding box regressor.

3.1.1 Regional Proposal

In this stage, the image is sub-segmented and all potential regions of interest are identified, a process known as selective search. Selective search is based on a greedy algorithm: the sub-segmented parts are clustered hierarchically using color, texture, or region-based similarity to generate region proposals for object detection. Nearly 2000 category-independent rectangular regions are generated for each individual image [13].
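A toy illustration of the selective-search mechanics: start from small regions and greedily merge the most similar neighbours, recording a bounding box at every merge. Real selective search works on a graph-based over-segmentation with colour, texture, and size similarities; the fixed grid cells and mean-intensity similarity below are simplifications for illustration only:

```python
import numpy as np

# Toy sketch of the selective-search idea: greedy hierarchical merging of
# regions by similarity, emitting a box proposal at every merge step.

def grid_regions(img, cell=4):
    """Seed regions: fixed grid cells with their mean intensity."""
    regs = []
    for r in range(0, img.shape[0], cell):
        for c in range(0, img.shape[1], cell):
            patch = img[r:r + cell, c:c + cell]
            regs.append({"box": (r, c, r + patch.shape[0], c + patch.shape[1]),
                         "mean": float(patch.mean())})
    return regs

def merge_proposals(regs, steps=3):
    proposals = []
    regs = list(regs)
    for _ in range(steps):
        if len(regs) < 2:
            break
        # greedy step: merge the pair with the most similar mean intensity
        pairs = [(abs(a["mean"] - b["mean"]), i, j)
                 for i, a in enumerate(regs) for j, b in enumerate(regs) if i < j]
        _, i, j = min(pairs)
        a, b = regs[i], regs[j]
        box = (min(a["box"][0], b["box"][0]), min(a["box"][1], b["box"][1]),
               max(a["box"][2], b["box"][2]), max(a["box"][3], b["box"][3]))
        merged = {"box": box, "mean": (a["mean"] + b["mean"]) / 2}
        regs = [r for k, r in enumerate(regs) if k not in (i, j)] + [merged]
        proposals.append(box)
    return proposals

img = np.zeros((8, 8)); img[0:4, 0:4] = 1.0   # bright "object" in one corner
props = merge_proposals(grid_regions(img), steps=2)
```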

3.1.2 Feature Extraction Through CNN

The regions proposed are cropped to form mini-images, which are scaled to the input-layer size of the selected pretrained CNN architecture. The selected network is pretrained on a large database spanning 1000 object classes. The pretrained network is then fine-tuned on the ground truths, i.e., the cropped images of our database. For every image, 4096 features are extracted. The softmax layer, which initially performs 1000-way classification, is replaced by a classification layer for 1 + 1 classes: the hand region and a background class.

3.1.3 Linear Classifier

After receiving the 4096-dimensional feature vector for each image from the penultimate layer of the pretrained CNN, we pass this feature vector to binary support vector machines (SVM) trained for each class independently. The SVM model takes the feature vector and produces a confidence score for the existence of the trained object in the input test image [13].
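The per-class scoring can be sketched as a signed linear score w·x + b over the 4096-dimensional feature, with a proposal accepted when the score is positive. The weights below are random stand-ins, not trained values:

```python
import numpy as np

# Minimal sketch of per-class linear SVM scoring over a CNN feature vector.
# Weights and bias are random stand-ins for trained SVM parameters.

rng = np.random.default_rng(0)
FEAT_DIM = 4096

class LinearSVM:
    def __init__(self, w, b):
        self.w, self.b = w, b
    def score(self, feat):
        return float(self.w @ feat + self.b)   # signed distance to the margin

hand_svm = LinearSVM(rng.normal(size=FEAT_DIM) / FEAT_DIM, b=-0.1)

feat = rng.normal(size=FEAT_DIM)               # one proposal's CNN feature
is_hand = hand_svm.score(feat) > 0.0           # accept/reject this proposal
```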

3.1.4 Bounding Box Regressor

This is the final desired output of the RCNN, obtained after a positive score from the SVM. In this stage, we precisely locate the object by placing the bounding box, using a scale-invariant linear regression model. In this model, we first define a box through four dimensions (x, y, w, h), where x, y are the coordinates of the center pixel and w and h are the width and height of the bounding box, respectively. The regression targets are then calculated between a probable proposal P = (P_x, P_y, P_w, P_h) and the ground-truth box G = (G_x, G_y, G_w, G_h) using Eqs. (2)–(5):

$$t_x = \frac{G_x - P_x}{P_w} \qquad (2)$$

$$t_y = \frac{G_y - P_y}{P_h} \qquad (3)$$

$$t_w = \log(G_w / P_w) \qquad (4)$$

$$t_h = \log(G_h / P_h) \qquad (5)$$

$$d_i(P) = w_i^{T} \phi_5(P) \qquad (6)$$

The predicted transformation of a probable proposal, d_i(P), is calculated using Eq. (6), and from it the location of the target box is obtained.
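Eqs. (2)–(5) and their inverse (used at test time to refine a proposal with predicted offsets) can be checked numerically; the boxes below are arbitrary example values:

```python
import numpy as np

# Worked sketch of Eqs. (2)-(5): regression targets from a proposal box
# P = (Px, Py, Pw, Ph) to a ground-truth box G = (Gx, Gy, Gw, Gh), plus the
# inverse transform used to refine a proposal with predicted offsets.

def box_targets(P, G):
    Px, Py, Pw, Ph = P
    Gx, Gy, Gw, Gh = G
    tx = (Gx - Px) / Pw          # Eq. (2)
    ty = (Gy - Py) / Ph          # Eq. (3)
    tw = np.log(Gw / Pw)         # Eq. (4)
    th = np.log(Gh / Ph)         # Eq. (5)
    return tx, ty, tw, th

def apply_offsets(P, t):
    """Invert the transform: predicted offsets t refine proposal P."""
    Px, Py, Pw, Ph = P
    tx, ty, tw, th = t
    return (Px + tx * Pw, Py + ty * Ph, Pw * np.exp(tw), Ph * np.exp(th))

P = (10.0, 20.0, 40.0, 60.0)     # proposal (centre x, centre y, width, height)
G = (14.0, 26.0, 50.0, 75.0)     # ground truth
t = box_targets(P, G)
refined = apply_offsets(P, t)    # recovers G exactly, by construction
```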

3.2 Second Stage: Detection of Region of Interest (ROI) in Test Image

This is mainly the testing stage. Here, we load the video sequence captured from a simple camera, convert it into frames, and resize each frame according to the input layer of the network architecture. The network, already trained on the required data, predicts a score for the presence of the hand region in each frame and locates the object.
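The per-frame preprocessing can be sketched as resizing each frame to the 32 × 32 × 3 input size before detection; the nearest-neighbour resize below is a dependency-free stand-in for whatever resizing routine the pipeline actually uses:

```python
import numpy as np

# Sketch of the second-stage preprocessing: frames are resized to the
# network's 32x32x3 input before per-frame hand detection.

def resize_nearest(frame, out_h=32, out_w=32):
    """Nearest-neighbour resize via index sampling (no external libraries)."""
    h, w = frame.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return frame[rows][:, cols]

video = [np.zeros((240, 320, 3), dtype=np.uint8) for _ in range(5)]  # dummy frames
batch = np.stack([resize_nearest(f) for f in video])
# batch.shape == (5, 32, 32, 3): five frames ready for detection
```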

3.3 Third Stage: Centroid Plotting and Password Generation

In this stage, the centroids of the hand movement in all frames are collected and plotted in such a manner that a password is generated. Each hand gesture behavior can be seen as a finite set of features, and the database is created using the features of each hand gesture. Each user selects a hand posture and a direction of movement, left-to-right or right-to-left. Whenever a user performs a gesture in front of the camera, a feature set of the movement is created and processed as per the methodology. The feature set generates the password, which is matched against the database created. The password authenticates the user, and the machine is enabled for operation. The top-to-bottom flow of the process is explained in Fig. 3.

Fig. 3 Flow of process from top to bottom
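The third stage can be sketched as deriving the movement direction from the centroid trail and indexing the posture–direction pair into Table 2's eight codes. The 4-bit encoding below is our reading of Table 2, not code from the chapter:

```python
# Sketch of the third stage: the centroid trail decides movement direction,
# and posture class plus direction select one of Table 2's eight codes.

def movement_direction(centroids):
    """centroids: list of (x, y); net x-displacement gives the direction."""
    dx = centroids[-1][0] - centroids[0][0]
    return "left_to_right" if dx > 0 else "right_to_left"

def password(posture_index, direction):
    """posture_index in 0..3; returns the 4-bit code of Table 2."""
    low = 0b01 if direction == "left_to_right" else 0b10
    return format((posture_index << 2) | low, "04b")

trail = [(12, 40), (30, 42), (55, 41), (80, 43)]   # centroids moving right
code = password(posture_index=2, direction=movement_direction(trail))
# code == "1001": the left-to-right password of the third posture in Table 2
```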



4 Experimental Results

Figure 3 shows the top-to-bottom flow of the process and the algorithm. In the proposed technique, there are two types of data. The first is the training database collected to train the RCNN network, for which we accumulated images from a publicly available database [21] and self-captured images, as shown in Fig. 4a. The second is the testing data, a video sequence of 40–50 frames as shown in Fig. 4b. The video is captured using a normal camera installed on a system with a 2.16 GHz processor and 4.0 GB RAM.
In the proposed RCNN network, there are 15 layers; the first and the last layers are the input and output layers, respectively. The size of the input layer is 32 × 32 × 3 with zero-center normalization. To improve the convergence rate on the training data, a standard deviation of 0.0001 is selected to initialize the weight parameters of the first convolutional layer. Stochastic gradient descent with momentum is used for training with an initial learning rate of 0.001. This learning rate is reduced after every 8 epochs (one epoch is one complete pass over the training data). The proposed algorithm runs for 100 epochs. The network is initially trained on the CIFAR-10 dataset, which has 50,000 training images. This pretrained RCNN is then fine-tuned on hand-sign postures using 27 training images (Fig. 4a), a collection of the four prominent hand postures shown in Fig. 5 that were selected for password creation. The training outcomes in Table 1 demonstrate that accuracy reaches 100% after 17 epochs.

Fig. 4 a Database for training RCNN architecture. b Test video sequence
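The step-decay learning-rate schedule described above can be sketched as follows; the decay factor of 0.1 is an assumption, since the chapter states the drop interval but not the multiplier:

```python
# Sketch of a step-decay schedule: the base rate of 0.001 drops every
# 8 epochs. The 0.1 decay factor is an illustrative assumption.

def learning_rate(epoch, base_lr=0.001, drop_every=8, factor=0.1):
    return base_lr * factor ** (epoch // drop_every)

rates = [learning_rate(e) for e in (0, 7, 8, 16)]
# 0.001 for epochs 0-7, then 0.0001 from epoch 8, then 1e-05 from epoch 16
```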

Fig. 5 Four hand postures collected for password generation

Table 1 Training of RCNN network


Epoch  Iteration  Time elapsed (hh:mm:ss)  Mini-batch accuracy (%)  Mini-batch loss  Base learning rate
1 1 00:00:01 25.00 0.7838 0.0010
17 50 00:00:29 100.00 0.0054 0.0010
34 100 00:00:57 100.00 0.0011 0.0010
50 150 00:01:26 100.00 0.0009 0.0010
67 200 00:01:54 100.00 0.0004 0.0010
84 250 00:02:22 100.00 9.6509e−05 0.0010
100 300 00:02:50 100.00 0.0001 0.0010

Fig. 6 Intermediate tracking results on data sequences; detection scores are shown in yellow boxes. a, c Results on self-collected data sequences; b results on a publicly available data sequence [22]

To test the proposed methodology, we recorded more than 100 data sequences of hand movements with different subjects of different age groups. Hand movements were performed against different backgrounds, from plain backgrounds to real-time cluttered backgrounds. Some data sequences were recorded when the subject was not in the view of the camera, and in some sequences the subject was in the camera view. The results obtained (shown in Fig. 6) indeed show effective and robust tracking and, finally, password generation. Table 2 shows the interlinking of hand postures, their movements, and passwords.

5 Conclusion

This chapter presents an unconventional way of creating robust passwords for home appliances. Hand language is widely accepted as a natural way of communication, and therefore hand gestures can be appreciably useful in developing a natural system for security and user authentication. The proposed technique is simple and cost-efficient, as it utilizes a simple camera, and using a deep learning algorithm it achieves an accuracy of 100% in hand region detection and password generation. Since hand gestures are easier to remember than other physical passwords, the method is user-friendly for senior citizens as well as differently abled people. In future, the

Table 2 Detection of hand posture tracking and password generation (the input-image and tracking columns contain images in the original)

Posture  Left-to-right movement password  Right-to-left movement password
1        0001                             0010
2        0101                             0110
3        1001                             1010
4        1101                             1110

real-time implementation of this technique can create a smarter automated home system.

References

1. A.J. Bernheim Brush, B. Lee, R. Mahajan, S. Agarwal, S. Saroiu, C. Dixon, Home automation
in the wild: challenges and opportunities. in CHI ’11 Proceedings of the SIGCHI Conference
on Human Factors in Computing Systems, (2011), pp. 2115–2124
2. V.S. Gunge, P.S. Yalagi, Smart home automation: a literature review. Int. J. Comput. Appl. 975,
8887 (2016)
3. E. Verbitskiy, P. Tuyls, D. Denteneer, J.P. Linnartz, Reliable biometric authentication with
privacy protection. in Presented at the SPIE Biometric Technology for Human Identification
Conference (Orlando, FL, 2004)
Vision-Based User-Friendly and Contactless Security … 37

4. R. Golash, C.R.R. Kinkar, A. Upadhyay, Optimal and user friendly technique for enhancing
security by robust password creation using biometric characteristic. in International Conference
on Computing and Communication Systems (Springer, Berlin, Heidelberg, 2011), pp. 406–413
5. D. Bhattacharyya, R. Ranjan, F. Alisherov, M. Choi, Biometric authentication: a review. Int. J.
u-and e-Serv. Sci. Technol. 2(3), 13–28 (2009)
6. J.J. Greichen, Value based home automation or today’s market. IEEE Trans. Consum. Electron.
38(3), 34–38 (1992)
7. D. Purohit, M. Ghosh, Challenges and types of home automation systems. Int. J. Comput. Sci.
Mobile Comput. 6(4), 369–375 (2017)
8. J. Guerra-Casanova, C. Sánchez-Ávila, G. Bailador, A. de Santos Sierra, Authentication in
mobile devices through hand gesture recognition. Int. J. Inf. Secur. 11(2), 65–83 (2012)
9. P. Premaratne, S. Ajaz, M. Premaratne, Hand gesture tracking and recognition system using
Lucas–Kanade algorithms for control of consumer electronics. Neurocomputing 116, 242–249
(2013)
10. J. Zeng, F. Wang, Y. Sun, A natural hand gesture system for people with brachial plexus injuries.
Comput. Inf. 34(2), 367–382 (2015)
11. D.-L. Dinh, J.T. Kim, T.-S. Kim, Hand gesture recognition and interface via a depth imaging
sensor for smart home appliances. Energy Procedia 62(62), 576–582 (2014)
12. S.-W. Hsiao, C.-H. Lee, M.-H. Yang, R.-Q. Chen, User interface based on natural interaction
design for seniors. Comput. Hum. Behav. 75, 147–159 (2017)
13. Z.-Q. Zhao, P. Zheng, S.-T. Xu, X. Wu, Object detection with deep learning: a review. IEEE
Trans. Neural Netw. Learn. Syst. 30(11), 3212–3232 (2019)
14. J. Xu, X. Zhang, M. Zhou, A high-security and smart interaction system based on hand gesture
recognition for Internet of Things. Secur. Commun. Netw. 2018, (2018)
15. A. Ramadhani, A. Rizal, E. Susanto, Development of hand gesture based electronic key using
Microsoft Kinect. MATEC Web Conf. 218, 02014 (2018)
16. S. AlAyubi, D.W. Sudiharto, E.M. Jadied, E. Aryanto, The prototype of hand gesture recognition
for elderly people to control connected home devices. J. Phys. Conf. Ser. 1201(1), 012042
(2019)
17. K.N. Trong, H. Bui, C. Pham, Recognizing hand gestures for controlling home appliances with
mobile sensors. in 2019 11th International Conference on Knowledge and Systems Engineering
(KSE) (IEEE, 2019), pp. 1–7
18. P.K. Pisharady, M. Saerbeck, Recent methods and databases in vision-based hand gesture
recognition: a review. Comput. Vis. Image Underst. 141, 152–165 (2015)
19. R.N. Shaw, P. Walde, A. Ghosh, IOT based MPPT for performance improvement of solar
PV arrays operating under partial shade dispersion. in 2020 IEEE 9th Power India International Conference (PIICON) (SONEPAT, India, 2020), pp. 1–4. https://doi.org/10.1109/PIICON49524.2020.9112952
20. S. Mandal, V.E. Balas, R.N. Shaw, A. Ghosh, Prediction analysis of idiopathic pulmonary
fibrosis progression from OSIC dataset. in 2020 IEEE International Conference on Computing,
Power and Communication Technologies (GUCON) (Greater Noida, India, 2020), pp. 861–865.
https://doi.org/10.1109/GUCON48875.2020.9231239
21. J. Singha, A. Roy, R.H. Laskar, Dynamic hand gesture recognition using vision-based approach
for human–computer interaction. Neural Comput. Appl. 29(4), 1129–1141 (2018)
22. M. Kumar, V.M. Shenbagaraman, R.N. Shaw, A. Ghosh, Predictive data analysis for energy
management of a smart factory leading to sustainability. in Innovations in Electrical and Elec-
tronic Engineering, ed. by M. Favorskaya, S. Mekhilef, R. Pandey, N. Singh. Lecture Notes
in Electrical Engineering, vol. 661 (Springer, Singapore, 2021). https://doi.org/10.1007/978-
981-15-4692-1_58
Vulnerability Analysis at Industrial
Internet of Things Platform on Dark Web
Network Using Computational
Intelligence

Anand Singh Rajawat, Romil Rawat, Kanishk Barhanpurkar, Rabindra Nath Shaw, and Ankush Ghosh

Abstract Due to the potentially catastrophic effects in the event of an attack, security-enabled designs and algorithms are required to protect automated applications and instruments based on the Industrial Internet of Things (IIoT). The most promising techniques for analyzing, designing, and protecting Internet of Things (IoT) technologies are computational intelligence and big data analysis. These strategies can also help enhance the protection of IIoT networks (home automation, traffic lighting, power stations, oil and gas stations, smart warehouses, automated vehicles, smart robotics). In this article, we first present popular IIoT computational intelligence (CIA) algorithms and their related vulnerabilities. We then conduct a cyber-threat vulnerability review by investigating the use of the CIA model to combat illicit behaviors in the dark Web environment. The proposed work, based on an analysis of the literature on available solutions for the prevention of cyber-terrorism threats using computational intelligence (CIA) algorithm models, is then discussed. Finally, we present a scenario of real-world hidden cyber-world activities designed to carry out a cyber-terrorist attack, build a structure for a cyber threat, and stage device attacks to illustrate how a CIA-based vulnerability analysis system can detect such attacks. To take a rational view of the success of the approaches, we measured performance across representative metrics.

A. S. Rajawat · R. Rawat
Department of Computer Science Engineering, Shri Vaishnav Vidyapeeth Vishwavidyalaya,
Indore, India
K. Barhanpurkar
Department of Computer Science and Engineering, Sambhram Institute of Technology,
Bengaluru, Karnataka, India
R. N. Shaw
Department of Electrical, Electronics and Communication Engineering, Galgotias University,
Greater Noida, India
A. Ghosh (B)
School of Engineering and Applied Sciences, The Neotia University, Sarisha, West Bengal, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 39
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_4

Keywords Cybersecurity · Computational intelligence · Cyber terrorist attack · Vulnerability · Dark Web

1 Introduction

This chapter examines the structures of the Industrial Internet of Things and the related threats. It should be remembered that the goals of information- and cyber-protection techniques against prevalent illicit cyber behavior differ from those in conventional technological systems because of their radically different nature. These disparities have been addressed extensively elsewhere and lie outside the scope of this article; we refer readers to [1, 2]. To keep data secure, confidentiality, integrity, and availability are the security properties that must be maintained in every system. In addition, authentication, authorization, and accounting (AAA) are the security controls for protecting system functionality against the most prevalent attacks mounted by dark Web actors, which are the focus of this chapter. Accounting has been left out, since it is usually a logging function. Our systematic analysis of related works (such as [1, 3–6]) shows that these are the most frequent challenges for SCADA systems. Unlike existing works, however, we provide a detailed collection of extensive attacks: we identify each attack by the security property it compromises; clarify how it affects the efficiency of the IIoT; conduct a risk assessment of the severity of damage and the likelihood of occurrence in specific systems; and review security designs to counter every class of cyber threat. Information technology (IT) has brought dramatic improvements to our lives, and countless users have benefited from the Internet. In recent years, the rapid growth of the IIoT, in which many utilities and computers are linked to the network, has added another transition to this IT revolution. However, cyber attacks that target new device vulnerabilities are becoming more severe, even as IT and IIoT systems grow in complexity. The effect of the Mirai IoT malware, in particular, was immense. Mirai is a worm that detects IoT devices sharing the same design flaw and propagates itself to them. It carries out cyber-terrorism attacks by altering the packet information sent to target hosts, and thus manipulates many IoT devices by infecting them as Mirai bots. To deal intelligently with cyber-terrorist threats, it is important to develop a system capable of wide-ranging observation of cyber attacks happening on the Internet. The darknet, also known as a network telescope, has been researched for several years [7]. A darknet is an unused, unallocated network address space. Since no devices are mounted on it, no legitimate communication should occur there; nevertheless, many packets still arrive. These packets are mainly triggered by scanning activity or by backscatter of response packets from hosts under cyber-terrorist attack; hence, packets observed in the darknet can be attributed to malware. A portion of cyber attacks on the Internet can therefore be detected by studying darknet packets. The proposed study uses computational intelligence to analyze scan-attack behavior observed in the darknet, focusing specifically on vulnerability analysis for the Industrial Internet of Things platform.

2 Related Work

Oztemel et al. [7] note that a detailed Industry 4.0 taxonomy can be established by evaluating the findings of their analysis. Möller [8] introduces the reader to modern production, one of the key principles of digitalization, relating to the automation of IoT-based industries. Command-injection and structured query language (SQL) injection attacks designed to backdoor a system have been studied, illustrating how such attacks can be detected by machine-learning-based cyber-vulnerability detection systems [9, 10]. For traffic analysis, Liu and Fukuda [9] define the most fitting time bin for their traces and highlight the general applicability of their taxonomy to various darknet datasets.

The model of Shaikh et al. [11] can be used by businesses to identify compromised IoT systems operating in their own network surroundings and to find targeted scanning operations. Montieri et al. [12] trained and evaluated four classifiers on their dataset: (i) Naïve Bayes (NB), (ii) random forest (RF), (iii) C4.5, and (iv) Bayesian network (BN). Their results indicate that it is straightforward to distinguish the three anonymity networks (Tor, I2P, and JonDonym) with an accuracy of 99.99%, and to identify the particular program producing the traffic with an accuracy of 98.00%.

Burbano et al. [13] concentrate on extending studies on this topic by reviewing Spanish-language data drawn from multiple sources of knowledge, such as social media, the dark Web, and online publications, to detect trends linked to human trafficking. Spitters et al. [14] applied authorship-analysis techniques to a Tor discussion platform devoted to drug trafficking, demonstrating high precision for both tasks using a mixture of character-level n-gram, stylometric, and user post time-stamp features. The autonomy of an agent depends on its simulation capability [1]; to simplify this procedure, puzzle-system requirements and restrictions have been suggested. Wang et al. [2] find it more difficult on the smartphone platform than in the personal-computer environment to identify and distinguish Tor traffic types (L2) and basic Tor applications (L3), motivating the use of numerous attributes and early recognition and classification [3]. A proposed framework dynamically crawls dark Web sites and extracts malicious URLs assessed by VirusTotal and the Gred engine [4]. Machine learning is used to train the system to identify fed data, and convolutional neural networks make judgments alone, improving prediction accuracy.

3 Computational Intelligence for Analysis of Darknet Traffic Data

Computational intelligence (CI) comprises the theory, design, application, and development of biologically and linguistically motivated computational paradigms. Neural networks, fuzzy systems, and evolutionary computation have historically been its three principal foundations. We use artificial neural network algorithms based on the principles of computational intelligence in the study of dark Web data. A darknet is one such surveillance structure for detecting disruptive activity and cyber-terrorism attack trends in cyberspace. It consists of bogus traffic arriving in empty address space, i.e., a collection of globally routable Internet Protocol (IP) addresses that are not allocated to any hosts or devices. In an ideally protected network environment, no traffic should arrive in such darknet IP address space. In practice, however, noticeable volumes of traffic are found there, mainly due to malicious operations and attacks on the Internet, and sometimes due to network-level misconfigurations. Analyzing such traffic databases of cyber-terrorism attacks and identifying the attack patterns found in them is one possible procedure for inferring attack trends in the live network. In this article, a simple and extended ANN-based methodology for darknet traffic analysis is evaluated: a numerical AGM data-frame format with 50 attribute values, suited to analyzing source-IP-authenticated TCP connections (the three-way handshake), is proposed to find cyber-terrorist threat trends in this traffic using long short-term memory (LSTM) networks. The study of cluster patterns yields traces of different cyber attacks, such as brute force, Mirai bot, and SQL injection. A possible cyber-defense strategy for detecting threat patterns in the network is thus the combined study of source-IP-validated TCP connections, site scraping, post and notification analysis, and inactive IP and device address location in darknet traffic analysis.
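As a minimal illustration of this strategy, the sketch below flags likely scanning sources in darknet traffic by counting the distinct ports and unused addresses each source touches. The packet tuples, thresholds, and function name are hypothetical, not taken from the chapter:

```python
from collections import defaultdict

def flag_scanners(packets, port_threshold=10, host_threshold=10):
    """Flag source IPs whose darknet traffic looks like scanning:
    many distinct ports probed on the telescope (vertical scan) or
    many distinct unused addresses touched (horizontal scan)."""
    ports_by_src = defaultdict(set)
    hosts_by_src = defaultdict(set)
    for src, dst, dport in packets:
        ports_by_src[src].add(dport)
        hosts_by_src[src].add(dst)
    return {src for src in ports_by_src
            if len(ports_by_src[src]) >= port_threshold
            or len(hosts_by_src[src]) >= host_threshold}

# Hypothetical capture: (source IP, darknet destination, destination port)
packets = [("203.0.113.7", f"198.51.100.{i}", 23) for i in range(1, 21)]
packets.append(("192.0.2.9", "198.51.100.1", 80))
print(flag_scanners(packets))  # flags only 203.0.113.7
```

A source hitting twenty unused addresses on the telnet port is characteristic scan behavior, while a single stray packet is more likely a misconfiguration.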
For cyber-terrorist attack pattern learning, the convolution output is fed into a long short-term memory (LSTM) unit, a neural network that captures the latent temporal features of the text; its architecture and operation are as follows. As in all standard RNNs, the LSTM unit is repeated at every time step. At each step s, the output of the module is controlled by a set of gates in R^d, operating as a function of the previous hidden state HS_{s-1} and the input at the current time step x_s: the forget gate f_s, the input gate i_s, and the output gate Q_s. Together, these gates decide how to update the current memory cell E_s and hidden state HS_s. Here d denotes the dimensionality of the LSTM memory, and all vectors have compatible dimensions. The LSTM transition equations are defined in [15] as follows:
$$i_s = \sigma\big(v_i\,[\mathrm{HS}_{s-1}, x_s] + b_i\big)$$

$$f_s = \sigma\big(v_f\,[\mathrm{HS}_{s-1}, x_s] + b_f\big)$$

$$P_s = \tanh\big(v_P\,[\mathrm{HS}_{s-1}, x_s] + b_P\big)$$

$$Q_s = \sigma\big(v_Q\,[\mathrm{HS}_{s-1}, x_s] + b_Q\big)$$

$$E_s = f_s \odot E_{s-1} + i_s \odot P_s$$

$$\mathrm{HS}_s = Q_s \odot \tanh(E_s)$$

Here σ is the logistic sigmoid function, with output range [0, 1]; tanh is the hyperbolic tangent function, with output range [−1, 1]; and ⊙ denotes element-wise multiplication. LSTM handles long-term dependencies (avoiding the vanishing-gradient problem) and is therefore placed after the convolutional layer. The LSTM layer uses 100 units, a dropout of 0.2, and a recurrent dropout of 0.2, and is followed by a batch-normalization layer [16–18].
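The gate equations above can be checked numerically. The following sketch implements one LSTM step in NumPy using the chapter's notation (i_s, f_s, P_s, Q_s, E_s, HS_s); the weight shapes and dimensions are illustrative, not the 100-unit layer used in the model:

```python
import numpy as np

def lstm_step(x_s, HS_prev, E_prev, V, b):
    """One LSTM transition: input gate i_s, forget gate f_s,
    candidate P_s, output gate Q_s, cell state E_s, hidden state HS_s.
    Each V[g] acts on the concatenated vector [HS_{s-1}, x_s]."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    concat = np.concatenate([HS_prev, x_s])
    i_s = sigmoid(V["i"] @ concat + b["i"])   # input gate
    f_s = sigmoid(V["f"] @ concat + b["f"])   # forget gate
    P_s = np.tanh(V["P"] @ concat + b["P"])   # candidate memory
    Q_s = sigmoid(V["Q"] @ concat + b["Q"])   # output gate
    E_s = f_s * E_prev + i_s * P_s            # element-wise cell update
    HS_s = Q_s * np.tanh(E_s)                 # element-wise hidden update
    return HS_s, E_s

d, n_in = 4, 3                                # illustrative dimensions
rng = np.random.default_rng(0)
V = {g: 0.1 * rng.standard_normal((d, d + n_in)) for g in "ifPQ"}
b = {g: np.zeros(d) for g in "ifPQ"}
HS, E = lstm_step(rng.standard_normal(n_in), np.zeros(d), np.zeros(d), V, b)
assert HS.shape == (d,) and np.all(np.abs(HS) < 1)  # bounded by Q_s * tanh
```

Because the output gate lies in (0, 1) and tanh in (−1, 1), every component of the hidden state is strictly bounded by 1, which the final assertion verifies.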
Batch normalization reduces the covariate shift in the hidden unit values passed between layers. It also permits each layer of the network to learn somewhat independently of the others. Subsequently, four fully connected layers are added to the model with appropriate numbers of neurons, using the ReLU activation function for the dense layers except the last, which has two neurons with a softmax activation [19, 20]. Two of these layers carry a dropout of 20% to prevent over-fitting while the cyber-terrorism attack is being assessed. The whole model is trained by minimizing the binary cross-entropy error. Given a training sample x_i with true label y_i ∈ {0, 1} and estimated probability p(y_i) ∈ [0, 1], the error value is defined as:


$$\mathrm{HS}_p(P) = -\sum_{i=1}^{N}\Big[\,y_i \log p(y_i) + (1 - y_i)\log\big(1 - p(y_i)\big)\Big]$$
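A direct NumPy implementation of this loss (averaged over the N samples rather than summed, and clipped to keep the logarithms finite) can serve as a sanity check; the labels and probabilities below are illustrative:

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy between true labels and predicted
    attack probabilities; clipping avoids log(0) at p = 0 or 1."""
    p = np.clip(p_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y = np.array([1.0, 0.0, 1.0, 1.0])   # illustrative attack/benign labels
p = np.array([0.9, 0.1, 0.8, 0.7])   # predicted attack probabilities
loss = binary_cross_entropy(y, p)
print(round(loss, 4))  # 0.1976
```

Confident, mostly correct predictions give a small loss; pushing any probability toward the wrong label drives the corresponding log term, and hence the loss, upward.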

Stochastic gradient descent (SGD) is used to estimate the model parameters, with the Adam optimizer and precision as the evaluation metric [21, 22]. The tokenized text is then embedded in the vector space. When training starts, the features are taken as a window vector, which the LSTM unit accepts for sequence analysis and learning. The trained model is then tested on the held-out split of the data. Consider a cyber-terrorist attack comprising brute force, SQL injection, and Mirai bot activity: the source-IP-validated TCP connections, Web scraping output, and post and message analysis are grouped together in an array, and the analysis is computed with the proposed unified LSTM architecture [23]. The cyber-terrorist attack data can then be summarized in vector form with respect to the convolution step:

$$\int_{0}^{n} \frac{\partial\,(\hat{w} * \hat{e})}{\partial x\,\partial y} = J(x, y)$$

In the above equation, an integration-based computation unit is formed for the cyber-terrorist attack data, where ŵ is the auto-correlated weight provided at each iteration and ê is the enumeration unit obtained during the iterative computing process [24–26]. The denominator contains the partial derivatives with respect to the x- and y-axes. J(x, y) is a variable-dependent unit obtained by integrating over the range 0 to n cyber-terrorist attacks. Thus, the LSTM components work together to compute the dense patterns in cyber-terrorist attacks, which develop day by day.

4 Proposed Methodology

This chapter proposes the implementation and experimental evaluation of machine learning techniques to resolve security problems in diverse contexts, in particular the detection and analysis of cyber-terrorist threat behavior. To test the efficacy of computational intelligence algorithms for identifying vulnerabilities, we focus on: identification and analysis of datasets using computational intelligence algorithms based on discrimination techniques; and detection and analysis of cyber-terrorist attack trends, with specific regard to structural-entropy-based classification [27–29]. The derived findings affirm the usefulness of computational intelligence in fostering the security of diverse environments. Based on SQL injection attacks, brute-force attacks, and the Mirai bot, we examine validated TCP source IPs, site scraping, post and notification analysis, unused IPs, and device-address identification in darknet traffic analysis. The LSTM was then successfully applied to analyze and detect the Carna botnet [7]. The main aim of the dark Web collective archive is to provide a research environment, platform, and infrastructure for communities of scientists, computer and knowledge experts, policy experts, security researchers, and others working on social trends and computational problems [30, 31]. The anonymous dark Web platform provides open access for communities investigating international jihadist Web-based forums, which involve sharing of hate speech, radicalization, recruitment, fund-raising, lone-wolf attacks, sharing of terrorist training manuals, escalating propaganda, fake female accounts on social sites, social network analysis (SNA) for searching out targets, money laundering, arms trafficking, and human and organ trafficking [32, 33]. The emphasis of the proposed work is on extending previous literature approaches, including: increasing the breadth of our collected data repository; introducing incremental spidering for regular data updates; and developing search and browsing functions (Table 1).
Table 1 Darknet (cyber-terrorist attack) datasets and attacking-data classification accuracy (%) per computational intelligence algorithm

| Cyber-terrorist attack (dark Web dataset) | Dataset source | SVM | SVMG-RBF | BPNN | S3VM | LSTM | Proposed LSTM |
|---|---|---|---|---|---|---|---|
| Alphabay marketplace: anonymized dataset | https://www.impactcybertrust.org/dataset_view?idDataset=896 | 74.25 | 88.25 | 91.80 | 91.80 | 92.00 | 92.33 |
| Darknet Market Cocaine Database | https://www.kaggle.com/everling/cocaine-listings | 74.34 | 74.83 | 79.83 | 85.00 | 89.95 | 92.83 |
| Darknet Market (darknet markets Valhalla, Dream Market, Silk Road) | https://www.gwern.net/DNM-archives | 72.58 | 74.58 | 87.23 | 86.97 | 87.97 | 88.58 |
| Dream, Trade Route, Berlusconi and Valhalla marketplaces, 2017–2018: anonymized datasets | https://www.impactcybertrust.org/dataset_view?idDataset=1200 | 75.53 | 79.53 | 82.53 | 83.93 | 86.53 | 91.53 |
| Cyber Crime Database and Statistics | https://www.ptsecurity.com/ww-en/analytics/darkweb-2018/ | 76.61 | 80.61 | 91.61 | 92.91 | 93.61 | 94.61 |
| Attack dataset | https://www.azsecure-data.org/dark-net-markets.html | 74.25 | 75.00 | 88.21 | 89.21 | 89.21 | 90.22 |
| Merit Telescope Darknet Scanners dataset | https://www.impactcybertrust.org/dataset_view?idDataset=654 | 71.00 | 82.00 | 83.91 | 85.12 | 87.23 | 89.00 |
| CAIDA UCSD Network Telescope Darknet Scanners Dataset | https://www.caida.org/data/passive/telescope-darknet-scanners_dataset.xml | 72.55 | 73.55 | 73.55 | 82.00 | 82.95 | 85.83 |
| Darkweb Criminal Activities Dataset | https://webhose.io/industries/cyber-security/financial-fraud/ | 75.33 | 76.28 | 76.28 | 80.21 | 85.58 | 92.22 |
| 20 NG dataset (20 Newsgroups) | https://qwone.com/~jason/20Newsgroups/ ; https://www.csl.sri.com/users/vinod/papers/atol_kdd2017.pdf | 72.53 | 78.53 | 78.53 | 90.93 | 90.53 | 91.32 |
| Darknet Usage Text Addresses Dataset | https://gvis.unileon.es/dataset/duta-darknet-usage-text-addresses-10k/ | 83.61 | 85.61 | 85.61 | 89.91 | 85.61 | 90.61 |
| Twitter dataset: Time-Zone Geolocation of Crowds in the Dark Web | https://wwwusers.di.uniroma1.it/~stefa/webpage/Publications_files/lamorgia_icdcs18.pdf | 73.01 | 74.02 | 79.75 | 80.34 | 80.34 | 87.01 |
| Darknet and Cryptocurrencies INTERPOL dataset | https://www.interpol.int/en/How-we-work/Innovation/Darknet-and-Cryptocurrencies | 72.33 | 75.44 | 80.35 | 87.01 | 85.23 | 90.32 |

5 Discussion

The darknet, an anonymous platform for committing criminal activities, is highly popular among cyber hackers. Police and security agencies require much time to find criminals in the dark Web environment, which benefits illicit users. The dataset represented here contains records of criminal activities:
• Human trafficking.
• Drug trafficking.
• Organ trafficking.
• Arms trafficking.
• Contract crimes.
• Sale of stolen information.
The information is collected from trusted sources and contains dark Web crime data gathered over a particular time period using different techniques such as Web scraping, post and message analysis, unused IP and system address finding, and traffic analysis (Table 2).
The given dataset information is not exhaustive; it can also be used to generate new attack patterns by scraping the Web over a given time frame. With window-frame-size matching, it shows the percentage of hosts meeting the three conditions defined above. Surprisingly, almost all the hosts paired by window-frame-size matching meet the three Mirai criteria. Consequently, the search operations in Fig. 1 may mean that attackers were performing experiments or preparations for the eventual delivery of Mirai malware.
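One widely reported Mirai fingerprint (offered here as an illustration; the chapter's three criteria are not enumerated in the text) is that the original Mirai scanner set the TCP sequence number equal to the destination IPv4 address while probing the telnet ports 23 and 2323. A minimal check of that heuristic:

```python
import ipaddress

def looks_like_mirai(dst_ip, tcp_seq, dst_port):
    """Heuristic fingerprint of the original Mirai scanner:
    TCP sequence number == destination IPv4 address, telnet ports."""
    return (tcp_seq == int(ipaddress.IPv4Address(dst_ip))
            and dst_port in (23, 2323))

ip = "198.51.100.7"                       # illustrative darknet address
seq = int(ipaddress.IPv4Address(ip))      # Mirai-style sequence number
print(looks_like_mirai(ip, seq, 23))      # True
print(looks_like_mirai(ip, 12345, 23))    # False
```

Applied to darknet captures, such a check lets an analyst separate Mirai-style probes from other scanning traffic before feeding the labeled records to the classifier.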

Table 2 Performance evaluation

| Algorithm | Traditional | Proposed MR-DFL |
|---|---|---|
| SVM | Requires huge memory and high computation time on large datasets; high computation cost | Our proposed model reduces computation time and boosts accuracy |
| SVMG-RBF | High computation cost | Lower computation cost |
| BPNN | Error evaluation very slow | Error evaluation very fast |
| S3VM | Data classification very slow | Data classification response very fast |
| LSTM | Lower accuracy, more computation on big data (large datasets) | Higher accuracy, less computation on big data (large datasets) |
| Proposed LSTM | Lower accuracy, more computation on big data; high memory utilization | Higher accuracy, less computation on big data; low memory utilization |
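The representative metrics referred to in this chapter, detection rate and false-alarm rate, can be computed directly from a confusion matrix. The counts below are illustrative only, not results from the chapter:

```python
def detection_metrics(tp, fp, tn, fn):
    """Detection rate = recall on the attack class;
    false-alarm rate = fraction of benign traffic wrongly flagged."""
    detection_rate = tp / (tp + fn)
    false_alarm_rate = fp / (fp + tn)
    return detection_rate, false_alarm_rate

# Illustrative confusion-matrix counts for a binary attack detector
dr, far = detection_metrics(tp=920, fp=30, tn=970, fn=80)
print(f"detection rate = {dr:.2%}, false-alarm rate = {far:.2%}")
```

A good detector maximizes the first quantity while keeping the second low; reporting both guards against a classifier that trivially flags all traffic as malicious.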

Fig. 1 LSTM-based darknet scanning for cyber attacks

6 Conclusion and Future Work

Present methods of identifying cyber-terrorism network threats suffer from unsatisfactory specificity and a lack of generalization. It is impossible to fight unknown cyber-terrorist threats, and comparatively easy to circumvent rule-based monitoring of cyber-terrorist attacks. We therefore suggest a new method of vulnerability detection on the dark Web network, leveraging computational intelligence on the Industrial Internet of Things platform. The mechanism is focused on evaluating cyber-terrorist attack datasets where only some preprocessing is needed, and the LSTM itself performs automatic feature extraction. The experimental findings on the LSTM dataset are excellent: in the detection of cyber-terrorist threats, the system has obtained state-of-the-art outcomes, achieving a peak detection rate while retaining a low false-alarm rate. The primary objective of this study was to recognize the most prevalent vulnerabilities in order to provide a stronger defense against them. Month by month, we observed the progression of each weakness relative to the others, which led us to believe that the most common were also the most dangerous. We must bear in mind, though, that some of these flaws might be riskier than their rating suggests.

References

1. I. Marinova, V. Jotsov, Node-based system for optimizing the process of creation of intelligent
agents for intrusion detection and analysis. in 2018 International Conference on Intelligent
Systems (IS) (Funchal–Madeira, Portugal, 2018), pp. 557–563. https://doi.org/10.1109/IS.2018.
8710536
2. L. Wang, H. Mei, V.S. Sheng, Multilevel Identification and Classification Analysis of Tor on
Mobile and PC Platforms. IEEE Trans. Ind. Inf. 17(2), 1079–1088 (2021). https://doi.org/10.
1109/TII.2020.2988870

3. Y. Kawaguchi, S. Ozawa, Exploring and identifying malicious sites in dark web using machine
learning. in Neural Information Processing. ICONIP 2019, ed. by T. Gedeon, K. Wong, M.
Lee. Lecture Notes in Computer Science, vol. 11955 (Springer, Cham, 2019). https://doi.org/
10.1007/978-3-030-36718-3_27
4. J. Ramya, G.K. Raj Kumar, C.J. Peniel, ‘Agaram’—Web application of tamil characters using
convolutional neural networks and machine learning. in Emerging Trends in Computing and
Expert Technology. COMET 2019, ed. by D. Hemanth, V. Kumar, S. Malathi, O. Castillo,
B. Patrut. Lecture Notes on Data Engineering and Communications Technologies, vol. 35
(Springer, Cham, 2020). https://doi.org/10.1007/978-3-030-32150-5_65
5. S. Hao, J. Long, Y. Yang, BL-IDS: detecting web attacks using Bi-LSTM model based on deep
learning. in Security and Privacy in New Computing Environments. SPNCE 2019, ed. by J. Li,
Z. Liu, H. Peng. Lecture Notes of the Institute for Computer Sciences, Social Informatics and
Telecommunications Engineering, vol. 284 (Springer, Cham, 2019). https://doi.org/10.1007/
978-3-030-21373-2_45
6. I.V. Mashechkin, M.I. Petrovskiy, D.V. Tsarev et al., Machine learning methods for detecting
and monitoring extremist information on the internet. Program. Comput. Soft. 45, 99–115
(2019). https://doi.org/10.1134/S0361768819030058
7. E. Oztemel, S. Gursev, Literature review of industry 4.0 and related technologies. J. Intell.
Manuf. 31, 127–182 (2020). https://doi.org/10.1007/s10845-018-1433-8
8. D.P.F. Möller, Digital manufacturing/industry 4.0. in Guide to Computing Fundamentals in
Cyber-Physical Systems. Computer Communications and Networks (Springer, Cham, 2016).
https://doi.org/10.1007/978-3-319-25178-3_7
9. J. Liu, K. Fukuda, Towards a taxonomy of darknet traffic. in 2014 International Wireless
Communications and Mobile Computing Conference (IWCMC) (Nicosia, 2014), pp. 37–43.
https://doi.org/10.1109/IWCMC.2014.6906329
10. M. Zolanvari, M.A. Teixeira, L. Gupta, K.M. Khan, R. Jain, Machine learning-based network
vulnerability analysis of industrial internet of things. IEEE Internet Things J. 6(4), 6822–6834
(2019). https://doi.org/10.1109/JIOT.2019.2912022
11. F. Shaikh, E. Bou-Harb, J. Crichigno, N. Ghani, A machine learning model for classifying
unsolicited IoT devices by observing network telescopes. in 2018 14th International Wireless
Communications and Mobile Computing Conference (IWCMC) (Limassol, 2018), pp. 938–943.
https://doi.org/10.1109/IWCMC.2018.8450404
12. A. Montieri, D. Ciuonzo, G. Aceto, A. Pescapé, Anonymity services Tor, I2P, JonDonym:
classifying in the dark. in 2017 29th International Teletraffic Congress (ITC 29) (Genoa, 2017),
pp. 81–89. https://doi.org/10.23919/ITC.2017.8064342
13. D. Burbano, M. Hernandez-Alvarez, Identifying human trafficking patterns online. in 2017
IEEE Second Ecuador Technical Chapters Meeting (ETCM) (Salinas, 2017), pp. 1–6. https://
doi.org/10.1109/ETCM.2017.8247461
14. M. Spitters, F. Klaver, G. Koot, M. van Staalduinen, Authorship analysis on dark marketplace
forums. in 2015 European Intelligence and Security Informatics Conference (Manchester,
2015), pp. 1–8. https://doi.org/10.1109/EISIC.2015.47
15. M. Zhang, B. Xu, S. Bai, S. Lu, Z. Lin, A deep learning method to detect web attacks using a
specially designed CNN. in Neural Information Processing. ICONIP 2017, ed. by D. Liu, S.
Xie, Y. Li, D. Zhao, E.S. El-Alfy. Lecture Notes in Computer Science, vol. 10638 (Springer,
Cham, 2017). https://doi.org/10.1007/978-3-319-70139-4_84
16. I. Orsolic, D. Pevec, M. Suznjevic et al., A machine learning approach to classifying YouTube
QoE based on encrypted network traffic. Multimed. Tools Appl. 76, 22267–22301 (2017).
https://doi.org/10.1007/s11042-017-4728-4
17. F. Mehmood, I. Ullah, S. Ahmad et al., Object detection mechanism based on deep learning
algorithm using embedded IoT devices for smart home appliances control in CoT. J. Ambient
Intell. Hum. Comput. (2019). https://doi.org/10.1007/s12652-019-01272-8
18. A. Safari Khatouni, N. Seddigh, B. Nandy et al., Machine learning based classification accuracy
of encrypted service channels: analysis of various factors. J. Netw. Syst. Manage. 29, 8 (2021).
https://doi.org/10.1007/s10922-020-09566-5

19. A. Cuzzocrea, F. Martinelli, F. Mercaldo et al., Experimenting and assessing machine learning
tools for detecting and analyzing malicious behaviors in complex environments. J. Reliable
Intell. Environ. 4, 225–245 (2018). https://doi.org/10.1007/s40860-018-0072-3
20. D. Lacey, P.M. Salmon, It's dark in there: using systems analysis to investigate trust and
engagement in dark web forums. in Engineering Psychology and Cognitive Ergonomics. EPCE 2015,
ed. by D. Harris. Lecture Notes in Computer Science, vol. 9174 (Springer, Cham, 2015). https://
doi.org/10.1007/978-3-319-20373-7_12
21. H. Chen, Improvised explosive devices (IED) on dark web. in Dark Web. Integrated
Series in Information Systems, vol. 30 (Springer, New York, NY, 2012). https://doi.org/10.
1007/978-1-4614-1557-2_16
22. H. Chen, Nuclear threat detection via the nuclear web and dark web: framework and preliminary
study. in Intelligence and Security Informatics. EuroIsI 2008, ed. by D. Ortiz-Arroyo, H.L.
Larsen, D.D. Zeng, D. Hicks, G. Wagner. Lecture Notes in Computer Science, vol. 5376
(Springer, Berlin, Heidelberg, 2008). https://doi.org/10.1007/978-3-540-89900-6_11
23. H. Chen, From terrorism informatics to dark web research. in Counterterrorism and Open
Source Intelligence, ed. by U.K. Wiil. Lecture Notes in Social Networks (Springer, Vienna,
2011). https://doi.org/10.1007/978-3-7091-0388-3_16
24. K. Barhanpurkar, A.S. Rajawat, P. Bedi, O. Mohammed, Detection of sleep apnea and cancer
mutual symptoms using deep learning techniques. in 2020 Fourth International Conference
on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) (Palladam, India, 2020),
pp. 821–828. https://doi.org/10.1109/I-SMAC49090.2020.9243488
25. S. Mandal, V.E. Balas, R.N. Shaw, A. Ghosh, Prediction analysis of idiopathic pulmonary
fibrosis progression from OSIC dataset. in 2020 IEEE International Conference on Computing,
Power and Communication Technologies (GUCON) (Greater Noida, India, 2020), pp. 861–865.
https://doi.org/10.1109/GUCON48875.2020.9231239
26. M. Kumar, V.M. Shenbagaraman, R.N. Shaw, A. Ghosh, Predictive data analysis for energy
management of a smart factory leading to sustainability. in Innovations in Electrical and Elec-
tronic Engineering, ed. by M. Favorskaya, S. Mekhilef, R. Pandey, N. Singh. Lecture Notes
in Electrical Engineering, vol. 661 (Springer, Singapore, 2021). https://doi.org/10.1007/978-
981-15-4692-1_58
27. S. Mandal, S. Biswas, V.E. Balas, R.N. Shaw, A. Ghosh, Motion prediction for autonomous
vehicles from Lyft dataset using deep learning. in 2020 IEEE 5th International Conference on
Computing Communication and Automation (ICCCA) (Greater Noida, India, 2020), pp. 768–
773. https://doi.org/10.1109/ICCCA49541.2020.9250790
28. Y. Belkhier, A. Achour, R.N. Shaw, Fuzzy passivity-based voltage controller strategy of grid-
connected PMSG-based wind renewable energy system. in 2020 IEEE 5th International Confer-
ence on Computing Communication and Automation (ICCCA) (Greater Noida, India, 2020),
pp. 210–214. https://doi.org/10.1109/ICCCA49541.2020.9250838
29. R.N. Shaw, P. Walde, A. Ghosh, IOT based MPPT for performance improvement of solar
PV arrays operating under partial shade dispersion. in 2020 IEEE 9th Power India Interna-
tional Conference (PIICON) (SONEPAT, India, 2020), pp. 1–4. https://doi.org/10.1109/PII
CON49524.2020.9112952
30. S. Paul, J.K. Verma, A. Datta, R.N. Shaw, A. Saikia, Deep learning and its importance for
early signature of neuronal disorders. in 2018 4th International Conference on Computing
Communication and Automation (ICCCA) (Greater Noida, India, 2018), pp. 1–5. https://doi.
org/10.1109/CCAA.2018.8777527
31. A. Singh Rajawat, S. Jain, Fusion deep learning based on back propagation neural network
for personalization. in 2nd International Conference on Data, Engineering and Applications
(IDEA) (Bhopal, India, 2020), pp. 1–7. https://doi.org/10.1109/IDEA49133.2020.9170693
32. A.S. Rajawat, O. Mohammed, P. Bedi, FDLM: fusion deep learning model for classifying
obstructive sleep apnea and type 2 diabetes. in 2020 Fourth International Conference on I-
SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) (Palladam, India, 2020), pp. 835–
839. https://doi.org/10.1109/I-SMAC49090.2020.9243553

33. A.S. Rajawat, A.R. Upadhyay, Web personalization model using modified S3VM algorithm for
developing recommendation process. in 2nd International Conference on Data, Engineering
and Applications (IDEA) (Bhopal, India, 2020), pp. 1–6. https://doi.org/10.1109/IDEA49133.
2020.9170701
Sentiment Analysis of Healthcare Big
Data: A Fundamental Study

Saroj Kushwah , Bharti Kalra, and Sanjoy Das

Abstract Healthcare sentiment analysis aims to determine patients' perceptions of healthcare-related concerns. It takes the views of patients into consideration to devise strategies and improvements that may resolve their concerns directly. Sentiment analysis has seen considerable success for commercial goods and has been applied to many other fields of use, including evaluations of goods and services. In health care, too, vast volumes of relevant knowledge can be accessed electronically, such as personal journals, social media, and medical-condition rating pages. Analysis of these opinions provides a range of advantages, such as enhancing standards of treatment through diagnostic knowledge. In the context of a healthcare study, health facilities and therapies are not only recommended but are also distinguished by their salient characteristics. Machine learning methods are used to evaluate millions of review documents and ultimately produce an effective and correct judgment. Supervised techniques are highly effective but cannot be applied to unknown domains, while unsupervised techniques perform poorly. More research is required to increase the precision of unsupervised strategies so that they become more practical in this era of information flood. This chapter presents a fundamental study that gives a short analysis of the field, the research context, and relevant problems and challenges, and also deals with the various challenges in the field together with possible solutions to the identified problems.

Keywords Fake news detection · Social media · N-gram · Text · Object · Data mining · Web page · Machine learning

S. Kushwah (B) · B. Kalra
Noida International University, Noida, India
e-mail: sarojkushwahsiem@gmail.com
B. Kalra
e-mail: bharti.kalra@niu.edu.in
S. Das
Department of Computer Science, Indira Gandhi National Tribal University-RCM, Imphal, India
e-mail: sdas.jnu@gmail.com

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 53
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_5
54 S. Kushwah et al.

1 Introduction

In this age of technology, people post their concerns online and seek suggestions, much as they previously did from friends and relatives. This online data can be found across a broad variety of outlets, such as journals, forums, and social networking sites [1, 2]. There are health-related Web sites and forums where people discuss their concerns, symptoms, illnesses, medication, etc. Experiences with visited healthcare centers may be exchanged in terms of accessibility, operation, environment, satisfaction, convenience, and so on. It is very helpful for new patients to hear from other people when assessing their fitness and their medication, or when picking a health clinic. This input is also of significant importance to health centers in recognizing and answering patients'
needs. Patients express this material with their own emotions and feelings, which is the motivating force behind this line of study [3].
Emotion analysis explains how people's emotions and traits are established. The health-related content accessible online is vast, and evaluating all of it manually to reach a swift and effective conclusion is not realistic. Sentiment detection methods carry out this mission automatically, with little to no user assistance [4]. Historically, costly and time-consuming surveys and questionnaires were used for this purpose. Experts' technical papers are limited, do not solve the patient's issues, and seldom take the patient's viewpoint into consideration. Sentiment study, in contrast, takes into account the views of patients distributed through various channels across millions of documents. The output of sentiment analysis may be classified into two groups of health decisions: recommended or not recommended. It is often important to look further into the elements or symptoms of the health issue. Aspects of the target entity may include cost, flavor, packaging, accessibility, side effects, and time-effectiveness. This need contributed to the development of aspect-based sentiment analysis [5].
Aspect-based sentiment analysis conducts sentiment analysis at the level of each aspect of the target entity. This method of analysis is more practical since not all facets of a successful medication or therapy are equally valuable. It allows patients to pursue therapies and drugs that are highly rated for their particular concern. The explanations for sentiment orientation are exposed by further research in the field of sentiment analysis. This method does not just reveal the patients' happiness but also the factors behind their emotions. It gives much more focused detail, as it specifies the reasons to change [6].
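As a rough sketch of how such aspect-level judgments could be computed, consider the following minimal example. The aspect cues and sentiment word lists below are invented for illustration only; a real system would use learned or curated resources.

```python
# Toy aspect-based sentiment sketch: score each sentence, then attach the
# score to whichever aspects the sentence mentions. All word lists here
# are hypothetical examples, not from any published lexicon.
ASPECTS = {"cost": {"cost", "price", "fees"},
           "staff": {"doctor", "nurse", "staff"},
           "waiting": {"wait", "queue", "appointment"}}
POSITIVE = {"good", "skilled", "affordable", "friendly", "quick"}
NEGATIVE = {"bad", "expensive", "rude", "slow", "long"}

def aspect_sentiment(review):
    """Assign a polarity to each aspect mentioned, sentence by sentence."""
    results = {}
    for sentence in review.lower().split("."):
        words = set(sentence.split())
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        for aspect, cues in ASPECTS.items():
            if words & cues:
                results[aspect] = ("positive" if score > 0 else
                                   "negative" if score < 0 else "neutral")
    return results

print(aspect_sentiment("The doctor is skilled. The wait was long and slow."))
```

Note how a single review yields different polarities per aspect, which is exactly the extra detail that entity-level classification would discard.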

2 Background Study

Sentiment analysis methods (also known as opinion mining techniques) were used successfully on consumer goods in the last decade. They became popular because people want to hear other viewpoints before choosing. Sentiment research identifies and expresses common opinions in a clearly understood fashion. In the context of health care, such opinions can contribute to procedures and choices by which the majority of patients manage their disease. Sentiment research became a popular trend and spread into other fields including the social sciences, governance, geography, management science, health care, the stock exchange, etc.; originally the focus was on business intelligence and commercial goods. Analysis of emotions is categorized into sub-streams that involve pattern analysis, bias analysis, vulnerability analysis, emotional analysis, etc. [7], with interesting results reported on recognizing the gender of an e-mail sender through sentiment analysis. Sentiment analyses are also used to classify sentiment trends in novels and fairytales. Much of the theoretical analysis was done from the viewpoint of machine learning (ML) and artificial intelligence (AI). The field has crossed paths with other disciplines such as NLP, computational linguistics, and psychology, and the effects of ML and AI strategies cannot be substantially improved without considering these disciplines. NLP has several open problems that, due to the richness of natural language, are not addressed satisfactorily, and this limitation carries over to sentiment research.
ML methods for sentiment analysis are trained on labeled datasets, and the test reviews are categorized accordingly. Naive Bayes, KNN, centroid-based classifiers, and the support vector machine (SVM) are the typical classifiers used to measure sentiment. The findings of these classifiers have been promising in text classification and summarization when dealing with impartial material focused on accurate facts; they belong to the field of knowledge extraction and exploration. Although high precision is obtained with ML classifiers, generalization is missing, and thus each domain must be trained separately. Labeled datasets are scarce and costly to generate by any detailed method. They also may not address recent concerns, and for certain problems the findings lose their relevance if not generated in time. Supervised classifiers should therefore be reserved for sensitive problems, whereas unsupervised methods are favored for efficient analysis. Unsupervised methods can be applied directly to every kind of data, though precision suffers. Unsupervised techniques follow either a dictionary-based or a corpus-based approach. To identify the feelings conveyed, the dictionary approach needs a reference to an external sentiment dictionary. Such a dictionary offers the polarity of feelings based on a popular vocabulary; however, polarity can be domain-dependent, so that, for example, "being positive at a medical test" expresses a pessimistic emotion. The corpus-based method does not face this dilemma, since it derives sentiment from a probabilistic study of the co-occurrence of terms and thus exposes domain-specific feelings. It needs a huge corpus, though, and precision falls as the corpus gets smaller. To boost the outcomes of corpus-backed techniques, semi-supervised techniques are often recommended, where the probabilistic paradigm needs certain domain-specific approaches. These hybrid approaches need a limited training set to define initial parameter values, which are later used to achieve better results with an unsupervised probabilistic model of the subject [6].
The aim and objectives of this chapter, i.e., to study the sentiments of healthcare data, are described in Sect. 3. Section 4 describes the application areas of sentiment analysis: mining of emotions and sentiment research in Sect. 4.1, its research approaches in Sect. 4.2, the various social media where sentiments are posted in Sect. 4.3, and the study of sentiment findings in Sect. 4.4. Section 5 introduces a review of sentiment and healthcare data: the sentiments of health details in Sect. 5.1, sentiment analysis in health care in Sect. 5.2, and health care and emergency treatment in Sect. 5.3. Section 6 summarizes the literature review. Existing problems and their suggested solutions are described in Sect. 7, followed by the conclusion in Sect. 8.

3 Aim and Objectives

The key purpose of this research is to evaluate the task of analyzing emotions in the area of health care. The sensitivity of this sector is quite critical; therefore, the aim is to analyze whether current SA strategies can adapt to healthcare needs. The way the health system operates has been fundamentally altered by technology, through the advent of electronic health records (EHRs), consultancy portals, and computerized office programs. In this work, patient input and appraisal results from healthcare providers were mainly regarded, assuming that a proper natural-language review may relate favorably to the interactions of both patients and physicians. In standard review writing, the consumers may not be specialists in any particular domain, and they do not realize that their review will be processed further. In that scenario, the user typically writes in his own manner and uses different styles of writing, such as slang, abbreviations, and expanded terms, to cover the bulk of the content in a minimum of words. This makes it difficult to detect the feelings in the review. When parameters are separated into explicit and implicit ones, the implicit parameters influence the efficiency and accuracy of the study.
The following is a summary of the study's objectives:
• To evaluate and research the context of sentiment research in health care.
• To analyze the different facets of the study of emotions.
• To research and analyze the numerous facets of healthcare sentiment analysis.
• To provide a timely approach to the specified problems and a quick discussion of the findings and the mentioned technique.

4 Sentiment Analysis

Sentiment analysis (SA) deals with the examination of views, concepts, and emotions. It is used as a natural language processing (NLP) comprehension tool. It is meant to classify the thoughts of a speaker or writer on a particular topic [8], or merely to decide the general polarity of a text. In other terms, it derives and gathers data from unstructured raw data, typically presented as an appraisal or examination, representing some sort of subjective feeling. It can also be made useful in these areas:
• Polls: analysis of open-ended survey questions [9].
• Industry and Governments: ensure that knowledge can be reliably and correctly presented to aid decision-makers [10], and track channels for increasing aggressive or derogatory communication.
• Customer feedback: analyze reports to enhance consumer loyalty and experience.
• Health: handle and review healthcare product texts that are helpful in curing sickness and in speeding up the phase of medical product development, to boost human health [11].
The area of sentiment analysis utilizes methods from NLP to interpret the words of users and then correlates feelings with what the consumer provided. This area is shaped by societal values. The following sentence, for instance, may be viewed very differently: "This latest machine is terrible!" While the apparent meaning alludes to the consumer's misfortune with the gadget, a user population belonging to a certain age demographic would view this assertion as clear support of the gadget, the slang usage inverting the literal meaning. In addition, the moment at which the consumer conveyed his emotion or opinion matters, since the same consumer can cloud his judgment under stress. Therefore, compiling declarations over a period of time gives greater confirmation of the feelings. The social networking forum poses both obstacles and rewards.
Anonymity on the network makes users more open to sharing their thoughts [12], which works in the analyst's favor. Also, data can be obtained at regular intervals, which improves accuracy. The data thus collected will include dominant facts supporting the researcher's theory and will provide a sound base for empirical inferences. Web data collection is also the alternative of choice in areas such as advertising. Google, YouTube, and Amazon are examples of how businesses offer personalized content to end customers. Many fields, such as the number of likes or the overall number of goods sold per age group, rely significantly on objective metrics. Psychology and psychiatry do not have this luxury, though, since their data is published as free text by people on different platforms such as journals and social media. This creates further difficulty through (a) the use of language particular to a certain subject or blog, (b) the use of non-standard terms not included in any dictionary, and (c) the use of emojis and symbols. Experts from the NLP domain and others who operate in the field of emotion analysis are discussing these concerns. Social scientists and psychiatrists require the necessary vocabulary and simple resources to capture site data, interpret and evaluate it correctly, and glean qualitative knowledge.
This work aims to make this move. In brief, the chapter:
1. Provides essential knowledge on many prevalent natural language processing (NLP) theories,
2. Explains the conventional NLP methods and figures,
3. Considers the studies carried out in the context of sentiment research and the difficulties in terms of mental well-being disorders,
4. Gives a short overview of different uses of NLP principles for mental health disorders and sentiment research [13].
4.1 Mining of Emotion and Sentiment Research

Understanding what other individuals think has always been an integral aspect of knowledge collection. As sentiment sources such as Internet polls and private blogs become more accessible and more in demand, unique opportunities and challenges arise, in that individuals can now consciously use learning technology to seek out and evaluate other people's feelings. Thus, the unforeseen explosion of the field of sentiment analysis and opinion mining, which covers the measurement methodologies used to evaluate sentiment, feeling, and subjectivity in documents, occurred at least in part as a direct response to the surge of interest in new systems that distribute ideas as a primary subject [14].
The Internet is the most crucial source of polls, observations on an object, and administrative or movement polls. A huge number of reviews of news posts and objects is generated every day on the Internet. For example, many people use online platforms, such as Twitter, to express their comments, polls, and feelings in their own words. Processing structures for different languages are growing, in particular because blogging and micro-blogging are becoming prominent [15].
There is a way to evaluate individual emotions: sentiment analysis. The Web offers textual data that is freely available and expands every day. Shoppers can report their notes and opinions on items offered online on retail pages, and this tends to boost purchases of goods and increase shoppers' loyalty [16]. It is challenging to manage a vast number of documents or remarks in sentiment analysis, so it is difficult to derive a general sentiment from such a volume by hand [17]. This tremendous volume of sentiment cannot be manually evaluated, and automated sentiment analysis thus plays an important role in solving this issue [18].
Yet another review-weighting scheme for quality analysis was proposed in [19]. Likewise, it was suggested [20] that an Arabic sentiment classification system for institutional reviews in Lebanon should be paired with term frequency-inverse document frequency (TF-IDF). The reviews covered public-facing services, including accommodations, bars, stores, etc. The authors obtained Google and Zomato content comprising 3916 reviews. The tests reveal three central findings: (1) the classifier is confident when positive reviews are expected; (2) the model is biased toward a pessimistic feeling in anticipatory reviews; and (3) the low proportion of unfavorable reviews in the corpus limits the fitted regression model. This problem can be overcome by other optimization strategies [21–23].

4.2 Sentiments Research Approaches

Lexicon’s approach is regarded as an unregulated method of study. The Lexicon


approach needs no training data and is only dictionary-dependent. Much of the thesis
adapts the sensitivity analysis to SentiWordNet and TF-IDR process. This method
Sentiment Analysis of Healthcare Big Data: A Fundamental Study 59

is determined based on terms in-text details that appear in predeveloped lexicons


such as SentiWordNet with other positive or negative words [24]. Concerning the
TF-IDR process, it is measured using the term frequency-inverse text frequency [25]
by translating the terms into a number.
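The TF-IDF weighting just mentioned can be sketched in a few lines. This is a minimal stdlib-only illustration of the standard formula (term frequency times the log of inverse document frequency); practical systems would use a tuned library implementation.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: tf-idf weight} mapping per document."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for doc in tokenized for t in set(doc))
    weights = []
    for doc in tokenized:
        tf = Counter(doc)
        weights.append({t: (tf[t] / len(doc)) * math.log(n / df[t])
                        for t in tf})
    return weights

docs = ["the clinic staff were friendly",
        "the wait at the clinic was long"]
w = tfidf(docs)
# "clinic" appears in every document, so its idf (and hence weight) is zero,
# while terms unique to one review receive a positive weight.
```

In this way a review is translated into a numeric vector in which distinctive terms dominate and ubiquitous ones vanish.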
Lexical resources are used by these methods, and the success of the entire method relies significantly on the consistency of the lexical resources. The approach is centered on the polarity of a text object, obtained from the polarity of the words comprising it. Due to the complexities of natural languages, this method does not cover all facets of language, particularly slang, sarcasm, and negation [26]. Emotion words alone are not enough. Some difficulties remain: terms can carry domain-specific senses, terms with a certain expression may not articulate any viewpoint, and certain phrases without emotion words may still indicate a viewpoint [27]. However, the lexicon approach has its benefits, since it offers easy positive and negative counting, is versatile in adjusting to various languages, and enables a fast study.
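The "easy positive and negative counting" of the lexicon approach can be sketched as follows, including a crude handling of the negation problem noted above. The tiny word lists are invented for illustration; a real system would draw on a resource such as SentiWordNet.

```python
# Minimal lexicon-based polarity sketch: count positive vs. negative words,
# flipping the sign after a negator. Word lists are hypothetical examples.
POSITIVE = {"good", "great", "helpful", "clean", "recommended"}
NEGATIVE = {"bad", "dirty", "rude", "slow", "painful"}
NEGATORS = {"not", "no", "never"}

def polarity(text):
    score, negate = 0, False
    for word in text.lower().split():
        if word in NEGATORS:
            negate = True          # flip the next sentiment word
            continue
        if word in POSITIVE:
            score += -1 if negate else 1
        elif word in NEGATIVE:
            score += 1 if negate else -1
        negate = False
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(polarity("the doctor was not helpful"))
```

Even this toy version shows why lexicon methods are fast and language-portable (only the word lists change), while slang and sarcasm remain out of reach.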
Machine learning approaches are supervised: the system learns from training data. SVM and Naive Bayes are the most commonly used and most widely employed machine learning frameworks. Naive Bayes is effective on well-formed text [28], while the support vector machine provides reasonable output for low-format datasets. However, on Facebook, where users comment randomly, the machine learning approach does not function well, as it requires extensive training; the size and consistency of the data collection influence the quality of the performance [27, 29]. Besides, machine learning takes time to evaluate, particularly if training is needed for a complex model [30]. With less training data the method is quicker, but at the cost of less consistent grading [31].
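To make the supervised side concrete, here is a compact multinomial Naive Bayes classifier with Laplace smoothing, trained on a handful of invented healthcare reviews. It is a sketch of the general technique, not of any system evaluated in the surveyed papers; libraries such as scikit-learn provide tuned implementations.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts, Laplace-smoothed."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        def log_prob(label):
            counts = self.word_counts[label]
            total = sum(counts.values())
            # Log prior plus smoothed log likelihood of each token.
            lp = math.log(self.class_counts[label] / sum(self.class_counts.values()))
            for w in text.lower().split():
                lp += math.log((counts[w] + 1) / (total + len(self.vocab)))
            return lp
        return max(self.class_counts, key=log_prob)

# Hypothetical training reviews, invented for the example.
reviews = ["great doctor very helpful", "clean and friendly clinic",
           "rude staff long wait", "terrible service very slow"]
labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(reviews, labels)
print(model.predict("helpful friendly doctor"))
```

The per-domain vocabulary built in `fit` is also why such classifiers generalize poorly across domains: a model trained on clinic reviews knows nothing about the sentiment words of, say, drug reviews.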
It is worth noting that both kinds of approaches achieve very good accuracy [31]. Researchers report ways to merge the two key methods: a lexical sentiment component, which provides a sentiment-score feature, and a multinomial event model from the Naive Bayes family of machine learning, to forecast sentiment direction. Studies have shown that integrating these two approaches increases performance compared with depending on a single method [28]. To improve the outcome, it is also proposed that all approaches be merged, since they complement each other, and the outcome improves in comparison with any single method. The integrated method is useful in the detection of phenomena [31], and the handling of unstructured data may also be improved [32].

4.3 Social Media Types

Content communities (YouTube, Instagram), social networks (Facebook, LinkedIn), blogs (Reddit, Quora), and microblogs (Twitter, Tumblr) [33] can be listed as four forms of social media depending on their application use. Based on the articles examined, microblogging platforms, primarily Twitter, are the top social networking tools used to gather user sentiment across the four forms of social network systems: 85 percent of the papers analyzed utilize Twitter to gather sentiment-analysis data. Twitter is one of the 10 most viewed platforms; it helps people share brief messages and communicate with each other. People also use Twitter to share their opinions, and it provides scientists, business organizations, and the government with very useful knowledge. Twitter is a global microblogging site on which people communicate their emotions about a person, event, or product. The material accessible for public access is what makes Twitter famous. By using the API, one may use keywords or a hashtag to copy and access data on any desired topic. Twitter supports intimate public-opinion and real-time research, as it carries approximately 500 million tweets a day and provides public access to its data through its API [34]. Twitter is used for browsing and storing messages in eight countries in the west and east. There are Twitter users around the globe, and so people from foreign cultures, languages, and perceptions enhance it with their thoughts and views [35]. For example, tweets directed at individual politicians during elections have been gathered [36], as well as tweets posted about group program creation activities [37]. Messages from users to a UK energy firm have been collected on Twitter [38], and the tweets uploaded from the official Twitter handle of London Heathrow Airport were further examined through emotional analysis [39]. Facebook has the world's biggest user base among social media platforms, but it is not common for sentiment analysis, because the data is chaotic, loosely organized, often uses short forms, and often contains orthographic errors, which makes the results more challenging to interpret. An example is retrieving pages, status notifications, and feedback on Facebook and Twitter [40]. One survey collected data from different social media outlets including forums, journals, Expedia, newsletters, mass media, WordPress, YouTube, Twitter, aggregators, Facebook, etc., and the outcome suggests that 88% of the data is from Twitter [41]. Due to the restricted details and views that can be extracted, other social networking sources such as BlogSpot, YouTube, and WordPress are no stronger [42, 43].
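After tweets are pulled through the Twitter API by keyword or hashtag as described above, a simple filtering step selects the on-topic posts. The sketch below illustrates only that filtering step on invented sample texts; the actual API collection is omitted.

```python
import re

# Sketch of filtering collected posts by hashtag or keyword.
# The tweet texts and the keyword set are hypothetical examples.
def extract_hashtags(text):
    """Return lowercased hashtag words, e.g. '#NHS' -> 'nhs'."""
    return [t.lower() for t in re.findall(r"#(\w+)", text)]

def matches(text, keywords):
    tokens = set(re.findall(r"\w+", text.lower()))
    return bool(tokens & keywords or set(extract_hashtags(text)) & keywords)

tweets = ["Long queue at the clinic again #healthcare #NHS",
          "Loving the new phone!",
          "My doctor was fantastic today #grateful"]
selected = [t for t in tweets if matches(t, {"healthcare", "doctor"})]
# keeps the first and third tweets; the off-topic one is dropped
```

The same pattern scales from three sample strings to the millions of tweets per day the chapter mentions, since each post is tested independently.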

4.4 Sentiment Study Findings

The goal of utilizing sentiment research methods is to condense millions of users' health perceptions into valuable knowledge. Therefore, the findings of sentiment measurement must be clear and definitive and must be usable for decision-making purposes. The outcomes of sentiment study may take the form of binary groups that reflect positive and negative feelings. If finer levels are required, the aggregated results are spread across numerous groups, such as outstanding, nice, poor, terrible, and worst. Machinery is found to be more efficient than humans at ingesting this sort of data presented in numerical figures. To render the result user-friendly, it becomes a brief summary that provides further perspectives into why individuals believe what they do, which is more familiar to humans. An abstractive summary is used when a public consensus can be drawn from the examination papers, whereas numerical analyses produce an extractive summary. As the description is in natural language, it is challenging to generate it succinctly and accurately, without mistakes, for the end consumer.
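The step from an aggregated numeric score to the binary or finer-grained groups described above is a simple binning, sketched here. The score range and thresholds are invented for illustration.

```python
# Map an aggregated sentiment score in [-1, 1] either to a binary class
# or to the five ordered classes named in the text. Thresholds are
# hypothetical choices, not taken from any cited system.
LABELS = [(0.6, "outstanding"), (0.2, "nice"),
          (-0.2, "poor"), (-0.6, "terrible")]

def to_label(score):
    """Finer granularity: five ordered classes."""
    for threshold, label in LABELS:
        if score >= threshold:
            return label
    return "worst"

def to_binary(score):
    """Coarse granularity: positive vs. negative."""
    return "positive" if score >= 0 else "negative"

print(to_label(0.75), to_binary(0.75))
```

Both outputs are the machine-friendly numerical form; turning them into the human-friendly natural-language summary remains the hard part noted above.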

5 Review of Sentiment and Health Care Data

5.1 Sentiments of Health Details

In a healthcare sentiment dataset, subjective details are supposed to reflect the authors' own opinions on the issue. However, since these data are obtained from non-regulated Internet sites, all manner of alternatives must be anticipated. Some review documents display no polarity of feeling and pass on only general knowledge. Such reviews are of no use and are thus filtered out, so that the analysis relies on the opinionated documents only. Subjective opinions also often include factual claims, statistics, and figures, which are screened out for the same reason. One research effort performs such filtering using a dataset derived from Web blogs on deafness; another uses psychiatric narratives as a dataset. As sentiment analyses are common and have an effect on people's minds, they are a perfect target for spammers promoting their agendas. The following segment addresses the issues of spamming and future limitations.
As these review documents are created by inexperienced writers, all kinds of inconsistencies are expected. They include orthographic errors, capitalization and grammatical errors, shortened terms, regional slang and cursing, and so forth [44], reflecting the various writing styles of the authors of online material. Blogs have been used more commonly as they provide more information for insightful analyses. Often the blog posts are debated by the blogger and the commentators, which yields more detail but is challenging to assess. In such a debate, the order is still quite relevant since it contributes to shared ground. Participants share not only emotions about a health issue but also clear arguments supporting their beliefs, which makes them extremely useful [45].
Various forms of phrases can be contained in a review document. Although this issue is rooted in NLP, it has several effects on sentiment analysis results. A simple phrase is one which contains the target of the feeling (the target entity or aspect) along with a sentiment term; for example, "Medicines are provided at subsidized rates by the XYZ healthcare center, and they are the easiest to obtain." A hybrid phrase includes numerous emotions regarding numerous things addressed together; for example, "Doctor XYZ is very skilled, but it takes weeks to get a check-up." Multi-link phrases attach many emotions, or several targets, to the same sentiment term, as in "They have the best doctors, equipment and treatment facilities available," or "They have good qualifications, accessibility, and extensive experience." Comparative phrases contrast the feelings or aspects of two target entities; "Medicine X is extremely comfortable, but medicine Y is available from all chemists" would be a comparative phrase. Complex phrases imply cynicism, carrying a polarity implicitly opposite to the directly illustrated one: "What a wonderful treatment! In just a few hours, the disease came back." Sentiment-processing methods confront challenges with such phrases, which are often condensed or filtered out in advance [46].

5.2 Study in Health Care

Sentiment analysis in health care is the application of text-analysis strategies to health-related emotional data. This knowledge is ideally derived from online outlets to investigate popular opinion. Examination of health opinions helps recognize places that are respected, criticized, recommended, or depended upon for results. Researchers employ machine learning methods for emotion analysis to mine those trends with considerable reliability and precision. Sentiment research has crossed paths with other fields such as NLP, computational linguistics, and psychology, and its future results cannot easily improve without understanding them. NLP has many study problems which, due to the wealth of natural languages, are challenging to overcome.
The reliability of sentiment-processing methods is diminished by noise in the data collection. Thus, datasets are cleaned in pre-processing to render them appropriate for the research technologies. Online data collection is often viewed as somewhat unreliable owing to its unmonitored communication medium. Non-relevant tags and advertisements that play no role in the review phase are eliminated. Review records indicating no emotional polarity are also often filtered out, while polar emotion documents with factual arguments are either deleted before processing or allocated to a neutral class alongside the positive and negative classes. Sentences that display dual polarity are also omitted: they are polar, but their overall polarity is difficult to estimate. For instance, "RMI in Peshawar has better doctors, but the place is too busy to find one." Although a sentence such as this has both positive and negative emotions, its polarity cannot be settled; marking it with the neutral class is not a successful judgment, nor is a positive or a negative one. In reality, characteristics or measurements can be defined and evaluated to produce results at the aspect level. Entity-level research, which considers the target entity to be positive, negative, or neutral, does not retain this kind of knowledge. This dilemma is dealt with by aspect-based emotion analysis, which allows a sentiment analysis at the level of the aspect [46].
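The pre-processing pipeline just described (stripping mark-up noise, then dropping non-polar and dual-polarity reviews) can be sketched as follows. The polarity word lists and sample reviews are invented for illustration.

```python
import re

# Sketch of review pre-processing: strip mark-up/advertising noise, then
# keep only reviews with a clear, one-sided polarity. Word lists and
# sample reviews are hypothetical examples.
POSITIVE = {"better", "good", "clean"}
NEGATIVE = {"busy", "dirty", "rude"}

def clean(review):
    review = re.sub(r"<[^>]+>", " ", review)      # drop HTML tags
    review = re.sub(r"http\S+", " ", review)      # drop ad/tracking links
    return re.sub(r"\s+", " ", review).strip().lower()

def keep(review):
    words = set(clean(review).split())
    pos, neg = bool(words & POSITIVE), bool(words & NEGATIVE)
    return pos != neg   # keep only clearly one-sided reviews

reviews = ["<b>Good</b> clean clinic http://ad.example",
           "Opening hours are 9 to 5",           # no polarity: dropped
           "better doctors but too busy"]        # dual polarity: dropped
kept = [r for r in reviews if keep(r)]
```

Note that the third review is exactly the dual-polarity case discussed above: rather than forcing it into a neutral class, this sketch simply excludes it, leaving aspect-based analysis to recover its per-aspect detail.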

5.3 Health Care and Emergency Treatment Sentiment Study

The emergence of social networking creates an immense amount of critical data that is readily available online. Many users post debates, images and videos, data, and opinions on different social networking Web pages, Twitter being one of the popular digital media [47–49]. Much emotional data is continuously generated by informal networks such as Facebook and Twitter. These subjective data demonstrate the assumptions, opinions, desires, and mental frames that people communicate about different topics of interest [50]. This form of data is often important for organizations, groups, or individuals, since it enables them to perform operations that help them. Furthermore, an appraisal examination reviews emotional evidence utilizing techniques such as text processing, computational linguistics, data retrieval, and data mining. The study of opinion is especially beneficial in various spaces, for example, policy problems, travel, etc. Currently, sentiment research in the vast social insurance (healthcare) sector is in its early days, covering, for example, details about the condition of the patient, injuries, unfavorable reactions to drugs, epidemics, and others. Until recently, there was practically no inquiry into the social insurance space.
It is incredibly challenging to extract valuable knowledge from Twitter since it is unstructured, and this is a huge obstacle. Most Arabic Twitter users compose their articles and tweets in the Arabic language. There is a lot of literature regarding emotional analysis in English, but it is really scarce in Arabic. The authors implemented an Arabic sentiment-analysis dataset on medical services obtained from Twitter [51–54].

6 Related Works

Subjectivity is a central principle in SA, as subjective texts by design convey emotions and opinions directly. Much research is therefore based on identifying and understanding subjective sentences [55]. However, because of the strong relation between subjectivity and opinion, researchers tend to disregard objective text in the expectation that little information is lost. Yet emotions can be present in all sentence forms, which calls for analyzing both the emotional and the analytical variations of a sentence.
Subjectivity and objectivity are essential to evaluating feelings, but other phenomena, especially in survey results, are more abstract and complex to handle. Sarcasm is one of them and is widespread in ratings, comments, and survey details. With sarcasm, people state the absolute opposite of what they mean: a reviewer who offers a truthful opinion on a product or service sarcastically is not being deceptive, yet sarcasm is difficult to detect in language, particularly for artificial intelligence systems. Humans are usually better at recognizing sarcasm, but even they can be uncertain about whether a statement is satirical because of its ambiguous nature. This is addressed in [56], which demonstrates that considerable work is needed to recognize and understand sarcasm.
Based on a sarcasm strategy, the authors of [57] developed a semi-supervised model that processes labeled data using pattern-based features and punctuation (high-frequency terms and punctuation marks were used as features); the results showed strong precision and recall overall. In [58], it was demonstrated that punctuation can be useful in sarcasm detection; that research explored the numerous uses and forms of the expression "Yeah right", a common expression in satirical phrases.
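A rough sketch of pattern- and punctuation-based features of the kind described in [57, 58] is shown below. The cue-word set and feature names are illustrative assumptions, not the authors' exact feature design:

```python
# Illustrative sarcasm feature extractor: counts punctuation patterns and
# assumed high-frequency cue terms. Not the exact design of [57] or [58].
import re

HIGH_FREQ_TERMS = {"yeah", "right", "sure", "great"}  # assumed cue words

def sarcasm_features(tweet: str) -> dict:
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return {
        "exclamations": tweet.count("!"),
        "questions": tweet.count("?"),
        "ellipses": tweet.count("..."),
        "quotes": tweet.count('"'),
        "cue_terms": sum(t in HIGH_FREQ_TERMS for t in tokens),
        "all_caps_words": sum(w.isupper() and len(w) > 1 for w in tweet.split()),
    }

print(sarcasm_features('Yeah right... ANOTHER "short" wait at the clinic!!'))
```

Such feature vectors would then feed a classifier; the semi-supervised step in [57] additionally mined sarcastic patterns from weakly labeled tweets.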
Negation is a term related to sarcasm; both act as sentiment shifters, since they can alter the polarity of a phrase. Many SA researchers handle negation by scanning their data for a set of keywords, such as "no," "not" (or "-n't"), "never," and "without." One study compared three separate approaches for evaluating negation and concluded that negating the first sentiment word following (or around) the negation keyword is the best approach, since it substantially increases precision.
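The preferred rule above (flip the polarity of the first sentiment word after a negation keyword) can be sketched as follows; the polarity lexicon is an invented toy example:

```python
# Toy negation-aware polarity scorer. The lexicon is an illustrative assumption.
NEGATIONS = {"no", "not", "n't", "never", "without"}
POLARITY = {"good": 1, "happy": 1, "effective": 1, "bad": -1, "painful": -1}

def sentence_polarity(sentence: str) -> int:
    # split clitic negations ("wasn't" -> "was n't") before tokenizing
    tokens = sentence.lower().replace("n't", " n't").split()
    score, negate = 0, False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True           # arm negation for the next sentiment word
        elif tok in POLARITY:
            score += -POLARITY[tok] if negate else POLARITY[tok]
            negate = False          # only the first following sentiment word flips
    return score

print(sentence_polarity("The drug was not effective"))
print(sentence_polarity("The staff were good"))
```

The key design choice, per the rule quoted above, is that negation stays armed across intervening non-sentiment words but is disarmed after flipping a single sentiment word.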
Narayanan et al. also analyzed conditional sentences by splitting them into two clauses; [14] showed that a single classification using the polarity of the whole phrase is marginally better than classifying the two clauses individually (Fig. 1).
Despite this, the widespread assumption among scholars that a sentence holds only a single opinion is seldom valid [15], a point that has been stressed from a wider viewpoint. Specifically, both optimistic and negative emotions can be conveyed in a phrase, each linked to a particular part of the sentence. Likewise, it makes little sense to assign a single label to a text, which is only a few sentences combined. As quoted by [15], McDonald et al. aimed to fix this by combining text- and sentence-level classification, applying labels to all sentences while the whole text was analyzed; for both classifications, training with both levels yielded more precise results [15].

Fig. 1 Mostly used techniques in sentiment analysis. Healthcare sentiment analysis draws on a machine learning approach, either supervised (support vector machine, boosting, k-nearest neighbour, maximum entropy) or unsupervised (k-means clustering, DBSCAN, hierarchical agglomerative clustering, mean shift clustering, etc.), and on a lexicon-based approach (corpus-based or dictionary-based)
One attitude to treating phrases that involve both optimistic and pessimistic emotions is to take the prevailing polarity of the sentence. This works; nevertheless, a different method is needed when a document holds equal numbers of negative and positive views. Adding a neutral polarity class is a good solution to this issue, despite the challenge of accounting for neutrality: partisan comments do not necessarily contain equal amounts of positive and negative views, and neutral text may often simply express no viewpoint at all. Many academic articles neglect neutrality, which aggravates the issue [15].
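The prevailing-polarity rule with a neutral fallback can be sketched as follows; the positive and negative word lists are illustrative assumptions:

```python
# Toy prevailing-polarity labeler: the dominant polarity wins, and ties
# (or the absence of opinion words) fall back to "neutral".
POSITIVE = {"good", "improved", "caring", "clean"}   # illustrative word lists
NEGATIVE = {"bad", "worse", "dirty", "rude"}

def prevailing_polarity(text: str) -> str:
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"   # tie, or no opinion words at all

print(prevailing_polarity("clean rooms and caring nurses but rude reception"))
```

Note that the neutral branch conflates two distinct cases, balanced opinion and no opinion, which is precisely the ambiguity discussed above.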

7 Existing Work Problem

Given the related work above, adequate text-based modeling is required to capture the components of text classifiers and their relationships. It is a challenging task in which all the components, such as the classifier, input features, and train-test samples, are put together to get the classification done. Another challenge concerns the input text representation: message text is highly sparse, and researchers have tried to utilize such online/offline data for building expert systems. Moreover, textual content, owing to its large feature dimension, often belongs to multiple categories. Keeping in mind these challenging issues and the related literature surveys, the following problems are identified that researchers have not yet solved:
• The role various components of the classification model play in achieving the desired accuracy.
• A generic process to carry out training and testing of the classification model.
• A formal metric-based link between the input data characterization and the performance of the classification model.
• A method to make the classification system not only accurate but also robust.
• The role of incremental feature enrichment in improving feature segregation.
• The role of input data characterization (pre-processing configurations) in the context of machine learning algorithms.
• Feature-based improvement in multi-label (ML) machine learning algorithms.

7.1 Identified Problems and Possible Solutions

There are different problems in existing work that most researchers ignore during the process of healthcare sentiment analysis. We describe those problems

Table 1 Various problems and solutions

Problem: Using the meta-model approach to use different components [5, 14]
Area: Text classification
Description: The main issue is to capture the components of metadata and their relationships; the classifier, data cleaning, and aspect (explicit and implicit) sets are put together to get the classification done
Suggested solution: Create a text meta-model that comprises all the components of the text classification system; the model represents text classification in general as multi-class and multi-label

Problem: To develop a generic robust performance evaluation protocol [16]
Area: Text as well as feature classification
Description: The performance evaluation of any text classification system is a challenging task, as it uses several components such as classifiers, feature selection methods, train-test criteria, and different sizes of input data; input data quality contributes largely to classification system performance
Suggested solution: Evolve a composite approach (comprising multiple sets of data, classifiers, and train-test criteria) rather than an individualistic approach (a single set of data, classifier, and train-test criterion)

Problem: To characterize input data and link it with the performance of the classification system [3, 18]
Area: ML application (text classification)
Description: The presence of noisy data distorts text information and largely affects ML applications; it damages text input interpretation during the learning phase and can raise serious issues in classification tasks
Suggested solution: Develop a characterization of the input data in terms of two ratios, gauge repeatability and reproducibility (GRR) and monthly recurring revenue (MRR), used to measure the performance of the classification system

Problem: To understand the dynamics of (a) the choice of pre-processing and (b) aspect (implicit, explicit) enrichment, or a classification model with varying training and testing features [2, 18, 25]
Area: Text and feature classification
Description: Pre-processing configurations (raw data and processed data) are taken to understand the perturbation issue in a real-time classification scenario at different labels
Suggested solution: Propose a feature enrichment strategy based on text feature-based segregation indexing (TFSI): first, use a bag of words (BOW) model on the corpus; second, prepare a refined feature set by applying TF-IDF

Problem: To work on different multi-label sentiment classification algorithms at different pre-processing configurations [28–30]
Area: Text (sentiments) classification
Description: Addressing the multiple scenarios of sentiment documents, which is a generalized category of multi-class problems
Suggested solution: Use machine learning (ML) algorithms such as k-nearest neighbors (KNN) or SVM with N-gram or TF-IDF features, decision trees (DT), or multi-label k-nearest neighbor (ML-KNN)

and suggest their possible solutions in Table 1, based on an analysis of various research works.
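The two-stage feature preparation suggested in Table 1 (a bag-of-words pass followed by TF-IDF refinement) can be sketched on a toy corpus; the documents and the exact TF-IDF variant below are illustrative assumptions:

```python
# Stage 1: bag-of-words counts; stage 2: TF-IDF reweighting. The tiny corpus
# is illustrative; real pipelines use far larger vocabularies.
import math
from collections import Counter

corpus = [
    "nurses were caring and helpful",
    "waiting time was long and staff rude",
    "helpful staff short waiting time",
]

bow = [Counter(doc.split()) for doc in corpus]   # stage 1: raw term counts
n_docs = len(corpus)
df = Counter(t for doc in bow for t in doc)      # document frequency per term

def tfidf(doc_index: int) -> dict:
    """Stage 2: refine raw counts by inverse document frequency weighting."""
    counts = bow[doc_index]
    total = sum(counts.values())
    return {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}

weights = tfidf(0)
# "caring" occurs only in document 0, so it outweighs the common word "and"
print(weights["caring"] > weights["and"])
```

The refinement step is what separates discriminative terms from background vocabulary, which is the point of the second stage in the suggested TFSI strategy.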

8 Conclusion

Due to limits on human literacy and cognitive capacity, and the lack of a computational approach, distinguishing false from true opinions is a highly significant issue. In the current era, social media platforms are the common channels through which reviews spread, because opinions (sentiments) can reach many people at almost no cost in time or money. As social media connectivity around the globe is enormous and growing exponentially, it is the most favorable platform for sentiment. This chapter offers details on social network analyses of well-being emotions. Sentiment analysis was introduced in the early 2000s, while aspect-based sentiment analysis was introduced in 2004. Sentiment analysis is expected to assist people in every organization, give them access to knowledge, and motivate them. It has a wide scope and can be applied in many areas, such as enhancing efficiency and market practices, and political prediction. Higher authorities will act on a concern only if they notice it promptly, and it is not realistic to scan social networking and other consumer content channels manually to recognize problems; sentiment analysis automates this. We also explain numerous challenges in big-data health care in this chapter and discuss how to solve them. Studies and collected evidence show that sentiments on social media platforms today often arrive in an embedded format, such as images embedded with objects and text; recognizing the objects makes understanding easier, and once objects and text are grouped, the content takes a proper form for interpretation. Aspect-based analysis of emotion and meaning is more insightful and enables consumers to make wise decisions. This problem engages the emotion-processing methods of artificial intelligence and data science. Healthcare sentiment analysis gives health organizations and clinicians the ability to gather useful data from the Internet and other outlets. Further study is needed to build a universal sentiment analysis model that, as a potential suggestion, can be applied to a particular form of data. A hybrid solution that blends human and machine effort is also required. The authors would also like to apply sentiment research to detect numerous health problems and areas.

References

1. Hunt Allcott, Matthew Gentzkow, Social media and fake news in the 2016 election. in Technical
Report. (National Bureau of Economic Research, 2017)
2. T.A. Rana, Y. Cheah, Aspect extraction in sentiment analysis: comparative analysis and survey.
Artif. Intell. Rev. 46(4), 459–483 (2016)
3. S. Kushwah, S. Das, Hierarchical agglomerative clustering approach for automated attribute
classification of the health care domain from user generated reviews on web 2.0. in 2020 IEEE
International Conference on Computing, Power and Communication Technologies (GUCON).
(Galgotias University, Greater Noida, UP, India, Oct 2–4, 2020)
4. N.J. Conroy, V.L. Rubin, Y. Chen, Automatic deception detection: methods for finding fake
news. Proc. Assoc. Inf. Sci. Technol. 52(1), 1–4 (2015)
5. M. Hu, B. Liu, Mining and summarizing customer reviews. in Proceedings of the Tenth ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining. (2004), pp. 168–177
6. Muhammad Taimoor Khan, Shehzad Khalid, Chapter 31 Sentiment Analysis for Health Care.
(IGI Global, 2016)
7. S. Mohammad, Once upon a time to happily ever after: tracking emotions in novels and fairy
tales. in Proceedings of ACL Workshop on LaTeCH. (2011)
8. B. Liu, Sentiment analysis and opinion mining. AAAI-2011 tutorial. (2011), pp. 1–198
9. J. Gottfried, E. Shearer, News use across social media platforms 2016. 26 May 2016.
[Online]. Available: https://www.journalism.org/2016/05/26/news-use-across-social-media-
platforms-2016/
10. V.L. Rubin, N.J. Conroy, Y. Chen, S. Cornwell, Fake news or truth? Using satirical cues to detect
potentially misleading news. in Association for Computational Linguistics: Human Language
Technologies. (San Diego, California, 2016)
11. S. Kushwah, S. Das, Sentiment analysis of big-data in healthcare: issue and challenges. in 2020
IEEE 5th International Conference on Computing Communication and Automation (ICCCA).
(Galgotias University, Greater Noida, UP, India, 30–31 Oct 2020)
12. A. Rajadesingan, R. Zafarani, H. Liu, Sarcasm detection on twitter: a behavioral modeling
approach. in Proceedings of the Eighth ACM International Conference on Web Search and
Data Mining. (ACM, Feb 2015), pp. 97–106
13. Sentiment Analysis, and Clinical Analytics. (Elsevier BV, 2020)
14. B. Pang, L. Lee, Opinion mining and sentiment analysis. Found. Trends Inf. Retrieval
2(1–2), 1–135 (2008)
15. N. Mukhtar, M.A. Khan, Urdu sentiment analysis using supervised machine learning approach.
Int. J. Pattern Recognit. Artif. Intell. 32(02), 1851001 (2018)
16. Bharat Singh, Saroj Kushwah, Sanjoy Das, Multi-feature segmentation and cluster based
approach for product feature categorization. in Proceedings of International Journal of
Information Technology and Computer Science. (2014), pp. 1–3
17. B. Singh, S. Kushwah, S. Das, P. Johri, Issue and challenges of online user generated reviews
across social media and E-commerce website. in Proceeding of IEEE International Conference
on Computing Communication and Automation (ICCCA-2015), (15–16 May 2015), pp. 818–
822. https://doi.org/10.1109/CCAA.2015.7148486
18. S. Das, B. Singh, S. Kushwah, P. Johri, Opinion based on polarity and clustering for product
feature extraction. Int. J. Inf. Eng. Electron. Bus. (IJIEEB), 8(5), 33–42 (2016). https://doi.org/
10.5815/ijieeb2016. ISSN: 2074-9023 (Print), ISSN: 2074-9031 (Online)
19. M. Shehab, A.T. Khader, M.A. Al-Betar, L.M. Abualigah, Hybridizing cuckoo search algorithm
with hill climbing for numerical optimization problems. in 2017 8th International Conference
on Information Technology (ICIT). (IEEE, May 2017), pp. 36–43
20. H. Mulki, H. Haddad, C. Bechikh Ali, I. Babaoğlu, Tunisian dialect sentiment analysis: a
natural language processing-based approach. Comput. Sistemas 22(4), (2018)
21. L.M. Abualigah, A.T. Khader, E.S. Hanandeh, A combination of objective functions and hybrid
Krill Herd algorithm for text document clustering analysis. in Engineering Applications of
Artificial Intelligence. (2018)
22. L.M. Abualigah, A.T. Khader, E.S. Hanandeh, A novel weighting scheme applied to improve
the text document clustering techniques. in Innovative Computing, Optimization and Its
Applications. (Springer, Cham, 2018), pp. 305–320
23. Z.A. Al-Sai, L.M. Abualigah, Big data and E-government: a review. in 2017 8th International
Conference on Information Technology (ICIT). (IEEE, May 2017), pp. 580–587
24. Basant Agarwal, Namita Mittal, Pooja Bansal, Sonal Garg, Sentiment analysis using common-
sense and context information. J. Comput. Intell. Neurosci. 9, (2015)
25. Bijoyan Das, Sarit Chakraborty, An Improved Text Sentiment Classification Model Using
TF-IDF and Next Word Negation. (2018)
26. M.T. Khan, M. Durrani, A. Ali, I. Inayat, S. Khalid, K.H. Khan, Sentiment analysis and the
complex natural language. Complex Adapt. Syst. Model. 4(1), 2 (2016)
27. Sanjida Akter, Muhammad Tareq Aziz, Sentiment analysis on Facebook group using Lexicon
based approach. in The 2016 3rd International Conference on Electrical Engineering and
Information Communication Technology (ICEEICT). (2016)
28. Anees U.I. Hassan, Jamil Hussain, Musarrat Hussain, Muhammad Sadiq, Sungyoung Lee,
Sentiment analysis of social networking sites (SNS) data using machine learning approach for
the measurement of depression. in International Conference on Information and Communica-
tion Technology Convergence (ICTC). (IEEE, Jeju, South Korea, 2017)
29. S. Arafin Mahtab, N. Islam, M. Mahfuzur Rahaman, Sentiment analysis on Bangladesh Cricket
with support vector machine. in The 2018 International Conference on Bangla Speech and
Language Processing (ICBSLP). (21–22 Sept 2018)
30. Khalifa Chekima, Rayner Alfred, Sentiment analysis of Malay social media text. (2018),
pp. 205–219
31. C. Dhaoui, C.M. Webster, L.P. Tan, Social media sentiment analysis: Lexicon versus machine
learning. J. Consum. Mark. 34(6), 480–488 (2017)
32. S.A. El Rahman, F.A. AlOtaibi, W.A. AlShehri, Sentiment analysis of Twitter data. in The 2019
International Conference on Computer and Information Sciences (ICCIS). (3–4 April 2019)
33. Kashif Ali, Hai Dong, Athman Bouguettaya, Abdelkarim Erradi, Rachid Hadjidj, Sentiment
analysis as a service: a social media based sentiment analysis framework. in IEEE International
Conference on Web Services (ICWS). (IEEE, Honolulu, HI, USA, 2017)
34. J. Hao, H. Dai, Social media content and sentiment analysis on consumer security breaches. J.
Financ. Crime 23(4), 855–869 (2016)
35. S. Mansour, Social media analysis of user’s responses to terrorism using sentiment analysis
and text mining. Procedia Comput. Sci. 140, 95–103 (2018)
36. Brandon Joyce, Jing Deng, Sentiment analysis of tweets for the 2016 US Presidential election.
in IEEE MIT Undergraduate Research Technology Conference (URTC). (IEEE, Cambridge,
MA, USA, 2017)
37. S. Yuliyanti, T. Djatna, H. Sukoco, Sentiment mining of community development program
evaluation based on social media. TELKOMNIKA (Telecommun. Comput. Electron. Control)
15(4), 1858–1864 (2017)
38. Victoria Ikoro, Maria Sharmina, Khaleel Malik, Riza Batista-Navarro, Analyzing sentiments
expressed on Twitter by UK energy company consumers. in Fifth International Conference on
Social Networks Analysis, Management and Security (SNAMS). (IEEE, 2018), pp. 95–98
39. W. Chen, Z. Xu, X. Zheng, Q. Yu, Y. Luo, Research on sentiment classification of online travel
review text. J. Appl. Sci. 10, 5275 (2020). https://doi.org/10.3390/app10155275
40. Haruna Isah, Paul Trundle, Daniel Neagu, Social media analysis for product safety using text
mining and sentiment analysis. in 14th UK Workshop on Computational Intelligence (UKCI).
(IEEE, 2014)
41. Shahid Shayaa, Phoong Seuk Wai, Yeong Wai Chung, Ainin Sulaiman, Noor Ismawati Jaafar,
Shamshul Bahri Zakaria, Social media sentiment analysis on employment in Malaysia. in The
Proceedings of 8th Global Business and Finance Research Conference, (Taipei, Taiwan, 2017)
42. M. Itani, C. Roast, S. Al-Khayatt, Developing resources for sentiment analysis of informal
Arabic text in social media. Procedia Comput. Sci. 117, 129–136 (2017)
43. Zulfadzli Drus, Haliyana Khalid, Sentiment analysis in social media and its application:
systematic literature review. Procedia Comput. Sci. (2019)
44. M. Chau, J. Xu, Mining communities and their relationships in blogs: A study of online hate
groups. Int. J. Hum.-Comput. Stud. 65(1), 57–70 (2007)
45. Y. Qiang, Z. Ziqiong, L. Rob, Sentiment classification of online reviews to travel destinations
by supervised machine learning approaches. Expert Syst. Appl. 36(1), 6527–6535 (2009)
46. Muhammad Taimoor Khan, Shehzad Khalid. Sentiment analysis for health care. Int. J. Priv.
Health Inf. Manage. (2015)
47. F. Greaves, D. Ramirez-Cano, C. Millett, A. Darzi, L. Donaldson, Use of sentiment analysis
for capturing patient experience from free-text comments posted online. J. Med. Internet Res.
15(11), e239 (2013)
48. F.J. Ramírez-Tinoco, G. Alor-Hernández, J.L. Sánchez-Cervantes, M. del Pilar Salas-Zárate, R.
Valencia-García, Use of sentiment analysis techniques in healthcare domain. in Current Trends
in Semantic Web Technologies: Theory and Practice. (Springer, Cham, 2019), pp. 189–212
49. E. Refaee, V. Rieser, An Arabic Twitter corpus for subjectivity and sentiment analysis. in LREC.
(May 2014), pp. 2268–2273
50. M. Al-Ayyoub, A.A. Khamaiseh, Y. Jararweh, M.N. Al-Kabi, A comprehensive survey of
Arabic sentiment analysis. Inf. Process. Manage. 56(2), 320–342 (2019)
51. H. Htet, S.S. Khaing, Y.Y. Myint, Tweets sentiment analysis for healthcare on big data
processing and IoT architecture using maximum entropy classifier. in International Confer-
ence on Big Data Analysis and Deep Learning Applications. (Springer, Singapore, May 2018),
pp. 28–38
52. Laith Abualigah, Hamza Essam Alfar, Mohammad Shehab, Alhareth Mohammed Abu Hussein,
Chapter 7 Sentiment Analysis in Healthcare: A Brief Review. (Springer Science and Business
Media LLC, 2020)
53. C. Sindhu, Binoy Sasmal, Rahul Gupta, J. Prathipa, Subjectivity detection for sentiment analysis
on Twitter data. in Artificial Intelligence Techniques for Advanced Computing Applications.
(Springer, 24 July 2020), pp. 467–476. https://doi.org/10.1007/978-981-15-5329-5_43
54. P. Mehndiratta, D. Soni, S. Sachdeva, Detection of sarcasm in text data using deep convolutional
neural networks. Scalable Comput.: Pract. Experience 18(3), (2017). https://doi.org/10.12694/
scpe.v18i3.1302. ISSN: 1895-1767
55. World Scientific News, Int. Sci. J. WSN 113, 218–226 (2018). EISSN 2392-219
56. S. Mandal, S. Biswas, V.E. Balas, R.N. Shaw, A. Ghosh, Motion prediction for autonomous
vehicles from Lyft dataset using deep learning. in 2020 IEEE 5th International Conference on
Computing Communication and Automation (ICCCA). (Greater Noida, India, 2020), pp. 768–
773. https://doi.org/10.1109/ICCCA49541.2020.9250790
57. S.K. Bharti, B. Vachha, R.K. Pradhan, K.S. Babu, S.K. Jena, Sarcastic sentiment detection in
tweets streamed in real time: a big data approach. Digit. Commun. Netw. 2(3), 108–121 (2016).
https://doi.org/10.1016/j.dcan.2016.06.002
58. T. Wilson, J. Wiebe, P. Hoffmann, Recognizing contextual polarity: an exploration of features
for phrase-level sentiment analysis. Comput. Linguist. 35(3), 399–433 (2009). https://doi.org/
10.1162/coli.08-012-R1-06-90
A Neuro-Fuzzy based IDS
for Internet-Integrated WSN

Aditi Paul , Somnath Sinha, Rabindra Nath Shaw, and Ankush Ghosh

Abstract The most important part of IoT architecture is WSN which constitutes
the physical world in which heterogeneous devices are able to collect data required
for processing and providing meaningful information to the users through Internet.
However, WSN and IoT are two independent structures with subtle differences.
Hence, the integration of WSN to the IoT is not a mere aggregation of tiny sensors
and actuators to the Internet. This integration leads to the combinations of heteroge-
neous systems to collaborate and serve a common goal. Connecting sensor nodes to
the Internet opens up security threats which have to be addressed during integration
as because the combination of WSN and IoT can become more vulnerable to new
attacks. This chapter represents the design challenges of Internet-integrated WSN
with specific attack types and the corresponding security measures. Nevertheless,
these security measures are not ultimate as IoT evolves through time and vast domain
in which it has to be deployed with specific applications. With the diversity of appli-
cations, the WSN and IoT should be merged in such a way that would give the best
performance with appropriate security backbone. The design taxonomy and specific
protocols for low-power sensor devices require lightweight security mechanisms
such as intrusion detection system (IDS) implemented with soft computing-based
tools. This chapter introduces such a lightweight IDS for WSN-integrated IoT along
with its performance analysis which will help the reader to understand the design
methodology of such approach.

A. Paul
Department of Computer Science, Banasthali Vidyapith, Tonk, India
S. Sinha
Department of Computer Science, Amrita School of Arts and Sciences, Amrita Vishwa,
Vidyapeetham, Mysuru, India
R. N. Shaw
Department of Electrical, Electronics and Communication Engineering, Galgotias University,
Greater Noida, India
e-mail: r.n.s@ieee.org
A. Ghosh (B)
School of Engineering and Applied Sciences, The Neotia University, Sarisha, West Bengal, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 71
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_6
Keywords Integrated WSN · WSN-IoT design challenges · Neuro-fuzzy-based IDS

1 Introduction

The notion of integrating WSN with IoT is to utilize the potential of tiny sensor nodes to collect environmental data and send it to various IoT applications [1] such as health care, agriculture, and home automation, connecting smart objects with each other. WSN is a standalone technology with large-scale deployment capabilities; the tiny sensor nodes can easily be placed in any harsh environment and in any number.
Since these sensor nodes have the potential to sense real environment and acquire
and process data which are otherwise not possible, WSN has a tremendous scope in
integrating with IoT. However, IoT is a larger network than wireless sensor network
and has heterogeneous applications. Thus, WSN can be integrated inside IoT frame-
work to independently gather information and send it through a switch to an IoT
framework. Most of the sensors in the IoT network send their data directly to the
Internet. WSN nodes are low-memory, low-power devices, and thus they communicate in an ad hoc manner. The data collected via sensors are sent to a base station (a sink node), processed, and then redirected to the client in some indirect way. To integrate WSN with the Internet, the sensors should be identifiable individually and their data directly accessible in some manner. The internetworking between WSN and IoT is a crucial aspect, as these two technologies are parallel and have their
own design constraints. The challenge is to merge these two independent paradigms
in a common framework in order to interconnect the objects in simple way even
in a complex environment. The methods should guarantee interoperability between
these two platforms, efficient transformation of the protocols and quality of service.
However, the integration of WSN [2] with Internet creates another challenge which is
the security of the user data and services. Other than the security issues related to the
low memory and low-powered sensors, the communication in a wireless channel and
interconnection with most vulnerable WWW network demands a stronger security
backbone for Internet-integrated WSN. The emergence of new attacks signature in
the Internet inevitably increases the threats of variety of attacks on the whole frame-
work. Thus, with the design of such a mixed infrastructure, the security perspective
is also required for a secure communication among smart objects.
Designing an Internet-integrated WSN is a challenging task, as it not only has to address the constraints of sensor nodes but also handle the transformation from one framework to another carefully. The standard IoT protocols [3] such as HTTP, TCP, and IPv6 are not directly suitable for this integration; rather, they must be updated or modified for the low-powered sensors. The heavyweight protocols designed for IoT are transformed into lightweight, energy-efficient protocols to support this integration.
From the security perspective, this integration is not merely the development of a framework combining two independent technologies; more importantly, it must ensure security at each level of integration. Thus, the real challenge is integrating two different security paradigms specifically designed for these two different technologies. Moreover, this security perspective should be global for the integrated infrastructure rather than a set of specific security measures; otherwise, integrating it with other technologies will impose new security requirements each time.
The perception of security in WSN [4] is to design highly constrained security techniques specifically deployable on the sensor nodes. The limited power and memory capacity of sensors have led to the evolution of lightweight, specially designed security tools over the decades. But this does not serve the security perspective of Internet-integrated WSN, because the security structure for WSN alone is not sufficient against the attack signatures of the Internet, which is an open, vulnerable network.
The key discussion of this chapter is to understand the design taxonomy of Internet-integrated WSN and, for each such design, to identify possible attack models and the corresponding security measures. Protecting Internet-integrated WSN against such attacks demands a powerful and efficient intrusion detection system that can identify any abnormal network behaviour and attack signature efficiently with a low false-positive rate.

2 WSN as Part of IoT

WSN is the information source of IoT because of its capability to collect data from the surrounding environment. The potential to collect and process information from the environment makes WSN an integral part of ubiquitous networking, also known as pervasive networking. Since IoT is an element of pervasive computing, WSN serves as an essential part of IoT. The requirement of continuous connectivity with the environment and timely delivery of required information is the most crucial aspect of IoT technology. To make the 'things' connected with each other, there must be a request-response flow among them, and this is possible only when information is available and processed in a timely manner without fail. To understand how this
information is retrieved in an IoT environment, one has to understand the concept of integrating WSN with IoT. If we study the IoT architecture [2], we find that there are basically three layers, as shown in Fig. 1. The WSN devices may be deployed at the
edge of this communication infrastructure through capillary communication. Since
the short-range sensor nodes in WSN are deployed to form a local network topology, we call it a capillary network. This type of network connects objects to each other and exchanges information within a short range. In order to connect this capillary network to the global network infrastructure (IoT), a backhaul connection is required, and this is best provided by a cellular network. Backhaul communication provides the link between a core network and the small networks at its edge. For example, the
communication of multiple cell phones with a single cell site forms a subnet work or
capillary network and connection of a cell tower to the Internet service provider (ISP)
74 A. Paul et al.

Fig. 1 IoT communication infrastructure
is done by a backhaul link [2]. Backhaul communication technology thus supports the
communication between the capillary network (WSN) and the core IoT technology.

3 Design Taxonomy of WSN–IoT Framework

Integrating WSN with IoT can be done in two ways: stack-based [5] and topology-
based. The stack-based design approach is defined on the basis of similar network
stacks between the two networks. This can be further divided into three
sub-categories, namely the centralized approach, the Gateway-based partial solution
and TCP/IP-based full integration. The topology-based design approach depends on the
actual location of the nodes that provide access to the Internet. This can be
further divided into two approaches, viz. hybrid and access point-based.

3.1 Centralized Approach

This approach is used when the WSN does not directly connect to the IoT framework.
The sensor nodes and IoT hosts are independent of each other. In such a situation,
data exchange is done through a common interface or centralized server, such as a
base station, which forwards data from the sensors to the Internet. Any request from
IoT hosts is also passed through the base station; this is indirect communication.
Here, the WSN can implement its own set of protocols independent of the external
hosts. There is no requirement for protocol translation or a hybrid approach. All
interactions are done via a common interface between two
A Neuro-Fuzzy based IDS for Internet-Integrated WSN 75

independent frameworks. Another important aspect of the centralized approach is that
the specific, unaltered WSN protocols work more efficiently in their specific
application environment, which is not possible once the protocols are translated in
order to achieve interoperability.
Challenges: If a sensor node is cut off, data cannot be retrieved by any other means,
as there is no way to store sensor data other than in the node's own memory. In a
Gateway-based approach, however, data is stored by some mechanism regardless of the
state of the sensor node.
In order to exchange data between two independent objects, the base station has to
implement some communication interface, such as a web service, which is not the
case in direct integration, where nodes are addressable by their IP addresses.

3.2 Gateway-Based Approach

In this approach, an application-layer Gateway plays the role of transmitting
information between the WSN and the IoT network. A base station can be considered
the Gateway here; it translates the lower-layer protocols (TCP/IP) [6] of both
networks for information exchange. This way, a sensor node and an IoT host exchange
information by addressing each other without necessarily setting up a direct
connection. However, the WSN is still independent of the Internet here, and data has
to travel through the Gateway to reach other points. The Gateway-based approach
provides interoperability at the network layer by translating TCP/IP and at the
application layer by providing various web services. Node failure at this stage does
not lead to data loss, as the same data can be retrieved from other nodes operating
in the same context using some mechanism implemented in the Gateway. The
Gateway-based approach is flexible enough to be accessed by different vendors due to
its interoperability.
Challenges: To map a sensor node's address to an Internet IP address, a translation
table has to be maintained. However, this table can be vulnerable in some web
service solutions. Moreover, assigning IP addresses to a vast number of sensor nodes
is a crucial task.
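A translation table of the kind described can be sketched as a simple bidirectional map. The class, method names and address values below are purely illustrative assumptions, not part of any specific gateway implementation.

```python
# Hypothetical sketch of a Gateway address-translation table: it maps short
# sensor-node addresses to routable IPv6 addresses and back.

class TranslationTable:
    def __init__(self):
        self._to_ip = {}    # sensor short address -> IPv6 address
        self._to_node = {}  # IPv6 address -> sensor short address

    def register(self, node_addr: int, ipv6: str) -> None:
        """Assign an Internet-facing IPv6 address to a sensor node."""
        self._to_ip[node_addr] = ipv6
        self._to_node[ipv6] = node_addr

    def outbound(self, node_addr: int) -> str:
        """Translate a sensor address for packets leaving the WSN."""
        return self._to_ip[node_addr]

    def inbound(self, ipv6: str) -> int:
        """Translate an IPv6 address for packets entering the WSN."""
        return self._to_node[ipv6]

table = TranslationTable()
table.register(0x0001, "2001:db8::1")
assert table.outbound(0x0001) == "2001:db8::1"
assert table.inbound("2001:db8::1") == 0x0001
```

The vulnerability noted above stems from exactly this kind of shared state: if the table can be polluted, traffic for a legitimate node can be redirected.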

3.3 TCP/IP-Based Solution

The independent operation of WSN does not support many applications on the Internet
efficiently. In such situations, the WSN needs to be fully integrated with the
global network for monitoring and controlling the sensor nodes. The TCP/IP-based
approach does this integration by interconnecting the WSN protocols with the TCP/IP
suite. Since TCP/IP is a heavyweight protocol stack that cannot be implemented on
the constrained sensor nodes, there are some mechanisms by which the protocol
translation is done. Two approaches: the IP overlay network and overlay

Fig. 2 Architecture of overlay gateway

sensor networks are used for this purpose. The IP overlay implements the TCP/IP
protocol in the tiny sensor nodes and assigns IP addresses to them. However, it is
not always possible to assign and manage IP addresses for the sensor nodes. Thus,
the second approach employs the technique of the overlay Gateway (Fig. 2). The
overlay Gateway [7] is responsible for transferring the sensor node's network,
transport and application layer (N, T, A) packet headers to the Internet by
encapsulating them inside the TCP/IP packet headers. The overlay Gateway effectively
translates the sensor node protocols to the Internet's application layer and above.
Challenges: Security is a greater concern in the TCP/IP approach than in the other
two. Since the sensor nodes are exposed to the Internet, the possibilities of new
attacks increase accordingly. Thus, a strong authentication mechanism should be
designed to avoid illegal access to the nodes.
In the previous two approaches, the WSN works independently of other networks, so
energy balancing is easily achievable. But in the TCP/IP overlay solution, the
sensor nodes are attached at the edge of the IoT infrastructure. The global demand
from the Internet causes more energy consumption in the nodes, which is challenging
in terms of QoS metrics like throughput, packet drop, network delay, etc.
Comparisons of the above three integration approaches are described in Fig. 3, from
which it is evident that the TCP/IP solution is the most promising as well as the
most challenging technique, as it directly integrates WSN with the Internet.

3.4 Topology-Based Integration Approach

In this approach, the integration of sensor nodes with IoT depends on the nodes'
positions for connecting to the Internet. There are two approaches: the hybrid and
the access point solutions.

Hybrid solution approach: In the hybrid solution approach, the sensor nodes are
located at the edge of the network to directly access the Internet. These nodes
behave as base stations through which all other nodes forward data to the external
network. For this, these nodes are higher in power, memory and longevity than the
other nodes in the network. There can be multiple base stations to ensure data
redundancy. Since the base-station nodes are connected to the Internet directly,
they are able to implement different Internet protocols. For this reason, the base
stations carry the network intelligence.
Access point solution: In the access point solution, a tree-structured topology is
followed in the WSN. The roots of the tree are Internet-enabled nodes and the leaves
are the normal nodes. Since there are multiple Internet-enabled nodes, the tree is
unbalanced in nature. The advantage of this topological feature is that every sensor
node can access the Internet through a root node in one hop. The root nodes, which
are part of the backbone network, have higher capabilities, i.e. more resources, so
as to perform faster. These nodes are capable of implementing standard protocols for
the backbone network like IEEE 802.15.4.

4 Security Challenges in the Integration of WSN in IoT

As discussed previously, it is evident that the TCP/IP solution is the best way to
fully integrate WSN with the Internet. However, certain factors that exist
inherently in the two independent networks have to be considered during this
integration. Integrating WSN into the Internet using TCP/IP opens up many security
challenges [8] which must be addressed in order to develop a secure system. The
following factors are to be taken into consideration when designing a secure
Internet-integrated WSN.
Characteristics of WSN: WSN has some inherent weaknesses due to its specific
characteristics, which create vulnerabilities to a variety of attacks from the
Internet [9, 10]. Thus, WSN should be deployed on an application basis, so that
sensor nodes, when not required, can be kept isolated from the Internet. This
minimizes the possibility of attacks on the edge network.
Robustness: As the WSN directly communicates with external entities, it becomes
susceptible to Internet vulnerabilities. Thus, the integration Gateway and the
sensor nodes should implement adequate security mechanisms to become more resilient
against attacks such as denial-of-service (DoS).
User authentication and authorization: The user of the IoT is any person or smart
object connected to the Internet. There should be a mechanism defining who can
access what, in order to restrict users, protect information and maintain privacy
[11, 12]. The Internet-enabled sensors should implement proper authentication and
authorization mechanisms to secure applications from attacks on authentication such
as brute-force and man-in-the-middle attacks; for ensuring authorization, sign-on
systems like Kerberos may be used.

Secure communication medium: The communication medium between the nodes should
ensure end-to-end security. Internet Protocol Security (IPSec), designed for the
Internet, is a heavyweight protocol, and hence implementing it in WSN is
challenging. Thus, an alternative lightweight secure-channel mechanism like TLS has
to be explored and implemented for sensor-node communication.
Sensor node hardware: The tiny structure of the sensor nodes is itself challenging
for adapting any security mechanism used for IP communication. The constrained
memory and low-power features of these nodes should be considered when implementing
the cryptographic techniques used to secure communication in the presence of
adversaries.
Network redundancy: IP communication needs the IP addresses of nodes for
communicating specific information. In such circumstances, if a node is cut off (out
of range), then its data should be recoverable through other redundant nodes. A
specific mechanism for retrieving information from a group of other nodes has to be
developed in the TCP/IP solution.
Protocol optimization: Most WSN-specific protocols are optimized according to the
constrained network requirements. These protocols have self-healing and
self-organizing capacity. An Internet-enabled WSN protocol stack such as 6LoWPAN has
to incorporate this optimization.
A number of network-related attacks [13] can be initiated. Attacks affecting the
routing mechanism, like selective forwarding, the sinkhole attack and the Sybil
attack, interrupt network services by compromising nodes within the network. The
wormhole attack is one of the most severe attacks, as there is no need to compromise
a node: the attacker eavesdrops on packets and tunnels them to any desired node of
the network. Moreover, the attack can be initiated at the very start, during the
neighbour-discovery phase. Even at the transport layer, by injecting messages, a
compromised node can forcibly request retransmissions from the end-points over the
network. Application-related data can be captured or impersonated by any unlawful
user, and attacks can also be initiated by disrupting data aggregation.

5 Attacks on Different Layers of WSN-Integrated IoT

As discussed earlier in this chapter, integrating WSN in IoT opens numerous attack
surfaces on the whole system, as the threat now comes from both sides. 6LoWPAN [14],
introduced as a standard protocol stack for Internet-integrated WSN, is susceptible
to a variety of attacks in all layers, which are essentially part of the security
perspective of Internet-integrated WSN. We concisely discuss these attacks
layer-wise for the knowledge and further exploration of the reader.
Physical layer attacks: The sensor nodes are prone to physical attacks such as
resource exhaustion, the denial-of-sleep attack, the jamming attack, wireless
traffic growth, misuse of device software and many more. The target of the intruder
here is to increase the resource expense of legitimate nodes in order to disrupt the
network. Replay attacks to exploit cryptographic keys, packet injection, and node
cloning or reprogramming are severe attacks on authentication and confidentiality.
Link layer attacks: Node outage, collision, resource exhaustion, traffic
manipulation and link-layer jamming are types of attacks in which the notion of the
attacker is to break communication, create confusion among the nodes, stop node
services, compromise availability, degrade WSN performance and launch other attacks.
Some attacks spoof the identity of the nodes or fabricate information, modify
messages, or pass false information. Eavesdropping, spoofing, sinkhole, the Sybil
attack, selective forwarding, unfairness and impersonation are attacks of this
category. These attacks break the confidentiality and authenticity of the nodes and
are very hard to detect. The fragmentation attack on the adaptation layer is a
severe attack in which the attacker sends multiple duplicate fragments (all but the
first fragment) to the receiver and thus floods the network.
Network layer attacks: In this layer, several attacks are launched on the IPv6 and
RPL routing protocols. Selective forwarding, sinkhole and Sybil attacks are
specifically designed to disrupt network operation by falsifying routing
information. Legitimate nodes are compromised and their identities are used to
attract data packets towards the malicious nodes [15]. In the wormhole attack, the
attacker eavesdrops on a packet and tunnels it to another node of the network; this
is hard to detect until disruption occurs.
Transport layer attacks: A compromised node can inject retransmission-request
messages towards the sender node to flood the network; this is called a flooding
attack. Sometimes the attacker repeatedly spoofs error messages, exhausting the
legitimate node with recovery from fake messages.
Application layer attacks: Functionalities like RESTful web services, which
integrate WSNs with smart objects through the web, are possible only due to the use
of the Constrained Application Protocol (CoAP) at the application layer. By using
web services on top of IP, WSNs make it easy to reuse software and diminish the
complexity of the application development process. The CoAP protocol [16] is
susceptible to multiple attacks. The overwhelming attack sends large packets to the
base station, directly or via compromised nodes. The path-based DoS attack is
launched by flooding forged messages to exhaust resources. Attacks like parsing
attacks and caching attacks may pose a severe threat to a client exchanging data
through a proxy or with an unknown intruder in the network. In parsing attacks,
remote nodes are crashed by executing random code, and in caching attacks, a proxy
with the ability to cache may be taken over. In another type of attack, known as the
amplification attack, an attacker uses the end devices to convert smaller packets
into larger ones. Amplification attacks are largely controlled by using
blocking/slicing modes in the CoAP server. Still, amplification attacks are a great
concern: it has been shown that the amplification factor of CoAP can reach up to 32,
which means that an attacker with 1 Mbps of network connectivity can target another
link with a capacity of 32 Mbps. In other types of attack, e.g. spoofing attacks and
cross-protocol attacks, the translation between TCP and UDP is responsible for the
attacks.
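The amplification arithmetic quoted above can be checked directly; the helper function is illustrative only.

```python
# CoAP amplification as cited in the text: an amplification factor of 32
# multiplies the bandwidth an attacker can direct at a victim link.
def amplified_mbps(attacker_link_mbps: float, factor: float = 32.0) -> float:
    """Effective attack bandwidth given the attacker's own link capacity."""
    return attacker_link_mbps * factor

# An attacker with 1 Mbps of connectivity can saturate a 32 Mbps link.
assert amplified_mbps(1.0) == 32.0
```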

6 Intrusion Detection System for DoS Attack Detection in WSN

The most common attacks on WSN, as discussed earlier in this chapter, are DoS
attacks such as hello flood, sinkhole, wormhole and many more. These attacks can be
launched from inside as well as outside. Outside attacks, like the fragmentation
attack, the botnet attack, etc., are not too hard to detect. However, insider
attacks target a weak node or portion of the network that has been compromised by
the attackers. Some of these attacks are so severe that the attacker remains
unnoticed until the whole network is disrupted. The solution to such attacks is a
second line of defence that tracks the behaviour of the nodes (intrusion) through an
intrusion detection system (IDS). The rapid development of IoT technology has made
the IDS an essential security component of this architecture. Since RPL works in a
resource-constrained environment, it is very hard to develop a security structure
protecting this protocol from some severe insider attacks, namely the hello flood,
sinkhole and wormhole attacks.
There are several state-of-the-art techniques designed for IDS which can detect
specific DoS attacks on 6LoWPAN. 6LoWPAN-based intrusion detection systems such as
SVELTE [17] are efficient at identifying individual attacks, like the sinkhole
attack on RPL, by using location information and Rank. However, this technique
depends on the availability of the Rank, which can be forcefully dropped by the
attacker. For the hello flood attack, a signal-strength-based location detection
technique [18] and probability estimation techniques [19] have been proposed;
however, these techniques generate high false-positive rates. An IDS designed for
detection of the wormhole attack uses RSSI-based location information to identify
the two collaborating nodes involved in tunnelling packets.
In the next section, we introduce a neuro-fuzzy-based IDS for detecting DoS attacks
on the 6LoWPAN protocol stack.

7 A Neuro-Fuzzy-Based IDS for WSN-IoT Framework—A Case Study

The proposed system is an anomaly-based IDS specifically designed for WSN using
fuzzy and neural network techniques [15]. The aim is to increase efficiency compared
to other contemporary IDSs. A two-step methodology is used here to identify the
malicious nodes more precisely.

The first step uses fuzzy logic, a simple rule-based technique whose strengths are
flexibility and light weight. It is used in situations where it is hard to predict
the outcome of a system. The behaviour of nodes in WSN is highly unpredictable due
to channel and node constraints; some genuine nodes sometimes show unpredictable,
abnormal energy loss. Fuzzy logic can address this issue when the rules are designed
carefully from observed normal node behaviour and trust values are assigned to the
nodes.
The NN used in this design makes the system more precise through its prior training
in a normal environment. Since a NN requires more resources to execute, the
intelligent use of fuzzy inference rules in the first step minimizes the number of
nodes that have to be passed through the NN. This reduces the overhead of the system
and thus makes it efficient.
The whole system works in two steps. In the first step, fuzzy rules are generated
and applied by perceiving the behaviour of the nodes in the network and then
assigning a trust value to each node during communication. Trust is a measure of a
node's behaviour in communication: the greater the trust, the lower the likelihood
of the node being malicious. In order to limit the algorithm's complexity, the
authors consider a single but evident parameter, packet drop, for calculating trust.
Packet drop is an obvious phenomenon in WSN communication and can occur for multiple
reasons, such as heavy traffic or multi-hop communication. However, WSN exhibits a
low traffic rate due to its constrained nature, and hence another reason for packet
drop is the malicious activity of some nodes causing intentional packet dropping. In
an attack such as selective forwarding (a DoS attack), malicious nodes can cleverly
forward only some of the data packets, at random intervals, so as to remain
unnoticed. Alternately, a compromised node may convey wrong routing information to
attract data traffic towards itself and then drop packets. Both events cause a large
quantity of packet drop, either due to the non-availability of the next hop or due
to false routing information. In a flooding attack, the malicious nodes send a huge
number of request packets through bots to a server, which leads to unavailability of
resources within a short time. These types of abnormality are not caused by
legitimate nodes, and hence they indicate the existence of spiteful nodes in the
network. In designing the IDS, the authors observed the normal packet drop during
offline monitoring and estimated a cut-off value. During real-time deployment, the
IDS observes the current packet drop of each node to find the nodes whose packet
drop exceeds the cut-off. Depending on these values, trust is generated using a
fuzzy membership function; simple Mamdani fuzzy inference rules are then derived to
isolate the nodes whose packet drop is greater than the cut-off and to segregate
them as trust, distrust and enemy nodes according to the amount of packet drop.
Among these three categories, the distrust and enemy nodes are tested again for
further confirmation.
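The fuzzy first step can be sketched roughly as below. The triangular membership functions, their breakpoints and the packet-drop thresholds are illustrative assumptions for the sketch; the authors' actual membership functions and Mamdani rules are not reproduced here.

```python
# Minimal sketch of the fuzzy first step: the packet-drop ratio of a node
# is fuzzified with triangular membership functions, and Mamdani-style
# rules label the node trust / distrust / enemy. All breakpoints are
# illustrative, not the authors' values.

def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def classify(packet_drop_ratio: float) -> str:
    low  = tri(packet_drop_ratio, -0.1, 0.0, 0.3)   # normal behaviour
    mid  = tri(packet_drop_ratio,  0.2, 0.45, 0.7)  # suspicious
    high = tri(packet_drop_ratio,  0.6, 1.0, 1.1)   # malicious-looking
    # Rule evaluation: the label of the strongest-firing rule wins.
    degrees = {"trust": low, "distrust": mid, "enemy": high}
    return max(degrees, key=degrees.get)

assert classify(0.05) == "trust"
assert classify(0.45) == "distrust"
assert classify(0.95) == "enemy"
```

In the actual system, only the nodes labelled distrust or enemy would be forwarded to the neural network for the second check.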
However, this segregation is not 100% accurate, as it depends on a specific instant
of time and the fuzzy rule base is a predictive system. Hence, a double filtering of
the nodes is required, which is accomplished by a trained neural network that
finally separates the malicious nodes from the distrust nodes.
An artificial neural network is a powerful soft-computing tool that performs well
for long-term predictions. A NN is also suitable for a scalable network, as large
amounts of historical data are available from such a network. Over the years, ANNs
have been used to detect different types of DoS attacks, like the Sybil attack, the
black hole attack, etc. In the proposed system, a NN is trained with node parameters
offline in a controlled environment where there is no attack. A supervised learning
technique is used for training a feed-forward network. The node parameters taken are
packet drop, packets forwarded, request/reply packets sent, request/reply packets
received and residual energy, and the output gives the possibility of the node being
malicious (an attacker). The distrust and enemy nodes detected in the first step are
passed through the NN, and the output produces the probability of each node being
malicious.
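The second-step classifier can be sketched structurally as a small feed-forward network over the five node parameters listed above. The layer sizes and the random, untrained weights are illustrative only; in the actual system the network is trained offline on attack-free traces.

```python
# Structural sketch of the second step: a feed-forward network mapping
# five node parameters to a malicious-probability. Weights here are
# random placeholders, not a trained model.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 5 inputs -> 8 hidden units -> 1 output probability
W1, b1 = 0.1 * rng.normal(size=(8, 5)), np.zeros(8)
W2, b2 = 0.1 * rng.normal(size=(1, 8)), np.zeros(1)

def malicious_probability(features):
    """features: packet drop, packets forwarded, request/reply sent,
    request/reply received, residual energy (each normalised to [0, 1])."""
    h = np.tanh(W1 @ np.asarray(features, dtype=float) + b1)
    z = W2 @ h + b2
    return float(sigmoid(z[0]))

p = malicious_probability([0.9, 0.1, 0.8, 0.2, 0.3])
assert 0.0 <= p <= 1.0
```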
The performance of the IDS is evaluated in the NS2.34 simulator. Traces of packet
drop and the other parameters are generated through simulation. The results in
Figs. 3 and 4 show the variation of the false-positive and true-positive rates with
the speed of the attacker at different node densities. However, in a real
deployment, this may vary depending on the real parameter values. The proposed IDS
is distributed and specific, which is the most desirable design methodology for
real-time attack detection. The system works independently without increasing
complexity. The system is adaptive, as the fuzzy rules are flexible and modifiable
according to the complexity of the WSN. Also, node mobility and density do not
affect the structure of the NN, as the input parameters train the NN accordingly.
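The false-positive and true-positive rates plotted in Figs. 3 and 4 are the standard confusion-matrix ratios; they can be computed as follows (the counts below are made up purely for illustration).

```python
# Standard IDS evaluation ratios from confusion-matrix counts.
def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of benign nodes wrongly flagged as malicious."""
    return fp / (fp + tn)

def true_positive_rate(tp: int, fn: int) -> float:
    """Fraction of malicious nodes correctly detected."""
    return tp / (tp + fn)

# Example: 3 of 100 benign nodes flagged; 18 of 20 malicious nodes caught.
assert false_positive_rate(3, 97) == 0.03
assert true_positive_rate(18, 2) == 0.9
```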

Fig. 3 False positive rate of the IDS

Fig. 4 True positive rate of the IDS



8 Conclusion

Designing an IDS for the 6LoWPAN protocol has tremendous scope in defending against
DoS attacks. Developing an IDS using machine-learning-based algorithms that is
lightweight, reliable and easy to deploy is the real challenge in IoT; at the same
time, the heterogeneous devices communicating over the open Internet lead to more
complicated and unpredictable attacks on the protocols. Thus, more advanced IDS
design methodologies will be worked on in the near future, addressing the
interoperability between WSN and IoT along with distributed device security. These
types of IDS will focus on a hybrid approach along with the design specification of
the devices. Anomaly-based IDSs are the most promising among these, as they are able
to monitor abnormal behaviour of the system in real time. Thus, future defence
mechanisms in Internet-integrated IoT against DoS attacks will rely heavily on
advanced IDSs designed for newly generated attack signatures on specifically
designed IoT devices.

Glossary

Artificial neural network Artificial neural networks mimic the neurons of the
human brain and are used as part of expert systems in computer science. They are
specifically used for decision making in unpredictable environments such as
weather forecasting and disaster management.
Backhaul A segment of a network which connects the backbone network with other
networks through links. For example, WSNs form an edge network and are connected
to the IoT infrastructure through backhaul. The most common form of backhaul is
the mobile network.
Capillary network It provides short-range connectivity to devices and objects by
forming a local area network within its radio range.
Constrained Application Protocol (CoAP) An application-layer protocol in the IoT
stack which provides an application interface between small objects and devices
and the low-power constrained network, helping to transfer information in a
constrained network environment.
Data aggregation attacks Data aggregation is the process of collecting and
managing data in WSN; aggregation algorithms manage the limited energy and other
resources of the network, thus enhancing network lifetime. Data aggregation
attacks disrupt this process.
DoS attack A denial-of-service (DoS) attack is a type of resource-exploitation
attack aiming to disrupt network activity. The notion of this attack is to flood
the network with unnecessary packets within a short interval so that the server's
response capacity is overwhelmed and the server crashes. This causes genuine users
to wait long or even not get responses. A DoS attack can be launched by an
external attacker in many ways, and it is the most severe attack on routing
protocols.

Eavesdropping attack A type of spoofing or packet-monitoring activity by an
intermediate node trying to accumulate confidential information exchanged between
two parties. The attacker tries to spoof the message or the identity of the nodes
and drop it; in either case, the attacker may alter the information and act as a
man-in-the-middle.
False positives and false negatives Measures of accuracy, used specifically in
machine-learning approaches in the context of WSN. A false positive is a true
(benign) node being detected as malicious; a false negative is a malicious node
being detected as normal. The accuracy of a system improves by minimizing false
positives and false negatives as much as possible.
Internet of things (IoT) A paradigm which facilitates the communication of all
objects, devices and things across the globe through the Internet. This technology
aims to ensure that every entity can communicate securely even over an insecure
network like the Internet. However, security remains a great challenge in IoT.
Internet-integrated WSN The integration of wireless sensor networks with the
Internet of things, in which the WSN plays an important role as the edge network.
Mamdani fuzzy inference rules A set of simple if-then rules derived from the
attributes of a system. A fuzzy inference system is a set of knowledge-base and
decision-making units that together process crisp inputs by applying fuzzy logic
and produce crisp outputs.
Neuro-fuzzy In the field of artificial intelligence, neuro-fuzzy refers to combinations
of artificial neural networks and fuzzy logic.
Neuro-fuzzy system An expert system made up of a neural network and a fuzzy-logic
system, used for decision making in situations where mathematical models cannot be
applied effectively. This type of system utilizes prior knowledge of the system's
behaviour through fuzzy linguistic variables, along with a trained neural network,
which does not need any prior knowledge of the system attributes.
Selective forwarding attack In a selective forwarding attack, an attacker node
(external or internal) alternately drops or delays data packets being forwarded
towards the destination, causing network congestion. Since the dropping is done
selectively, the attacker remains unnoticed for a long time.
Sinkhole attack A type of attack in which an external or internal attacker node
attracts the data traffic of its neighbours by falsely claiming that it has the
shortest route towards the destination. Once packets are received, it can tamper
with confidential data and even launch more powerful attacks.
Sybil attack A type of identity-spoofing attack in which a malicious node
compromises some legitimate nodes and uses their identities for launching the
attack. 'Sybil' (multiple personality) here indicates that a single node uses
multiple IDs to falsely create a route towards the destination; when packets pass
through these false nodes, it consumes or drops them. This attack severely affects
network throughput and disrupts communication.

Wireless Sensor Network (WSN) Wireless sensor network is a set of sensor nodes
communicating each other wirelessly by following mesh topology. These types
of networks are either completely backboneless or have a backbone called base
station for processing data collected and delivered by the sensors.
Wormhole attack An attack in which the attacker eavesdrops on packets at one point
of the network and tunnels them to another node, disrupting routing. It is severe
because it does not require compromising a node and is hard to detect until
disruption occurs.
IPv6 over Low-Power Wireless Personal Area Networks (6LoWPAN) An Internet protocol
stack designed specifically for the low-power, tiny devices participating in
Internet of things communication. It compresses and encapsulates the IPv6 packet
header in order to send it over an IEEE 802.15.4 network.

References

1. S. Rani, R. Maheswar, G.R. Kanagachidambaresan, P. Jayarajan (eds.), Integration
of WSN and IoT for Smart Cities (Springer International Publishing, Cham, 2020)
2. R. Fantacci, T. Pecorella, R. Viti, C. Carlini, A network architecture solution for efficient
IOT WSN backhauling: challenges and opportunities. IEEE Wirel. Commun. 21(4), 113–119
(2014). https://doi.org/10.1109/MWC.2014.6882303
3. G.A. da Costa, J.H. Kleinschmidt, Implementation of a wireless sensor network
using standardized IoT protocols, in 2016 IEEE International Symposium on Consumer
Electronics (ISCE) (Sao Paulo, Brazil, Sep. 2016), pp. 17–18. https://doi.org/10.1109/ISCE.2016.7797327
4. J. Granjal, E. Monteiro, J.S. Silva, Security in the integration of low-power wireless sensor
networks with the internet: a survey. Ad Hoc Netw. 24, 264–287 (2015). https://doi.org/10.
1016/j.adhoc.2014.08.001
5. C. Hennebert, J.D. Santos, Security Protocols and Privacy Issues into 6LoWPAN Stack: A
Synthesis. IEEE Internet Things J. 1(5), 384–398 (2014). https://doi.org/10.1109/JIOT.2014.
2359538
6. M.R. Kosanovic, M.K. Stojcev, Implementation of TCP/IP protocols in wireless sensor
networks. ICEST 2007, 4 (2007)
7. Q. Zhu, R. Wang, Q. Chen, Y. Liu, W. Qin, IOT Gateway: bridging wireless sensor
networks into internet of things, in 2010 IEEE/IFIP International Conference on
Embedded and Ubiquitous Computing (Hong Kong, China, Dec. 2010), pp. 347–352.
https://doi.org/10.1109/EUC.2010.58
8. R. Roman, J. Lopez, Integrating wireless sensor networks and the internet: a security analysis.
Internet Res. 19(2), 246–259 (2009). https://doi.org/10.1108/10662240910952373
9. Sridipta Misra, Muthucumaru Maheswaran, Salman Hashmi, Security challenges and
approaches. in Internet of Things. Springer Briefs in Electrical and Computer Engineering
(Springer, Cham, 2016)
10. S. Sinha, A. Paul, Neuro-fuzzy based intrusion detection system for wireless sensor network.
Wirel. Pers. Commun. (2020). https://doi.org/10.1007/s11277-020-07395-y
11. S. Arvind, V.A. Narayanan (2019) An overview of security in CoAP: attack and analysis. in
5th International Conference on Advanced Computing and Communication Systems (ICACCS)
(Coimbatore, India, 2019). pp. 655–660. https://doi.org/10.1109/ICACCS.2019.8728533
86 A. Paul et al.

12. A. Rghioui, M. Bouhorma, A. Benslimane (2013) Analytical study of security aspects in


6LoWPAN networks. in 5th International Conference on Information and Communication
Technology for the Muslim World (ICT4M) (Rabat, Mar. 2013), pp 1–5. https://doi.org/10.
1109/ICT4M.2013.6518912
13. R. Hummen, J. Hiller, H. Wirtz, M. Henze, H. Shafagh, K. Wehrle, 6LoWPAN fragmentation
attacks and mitigation mechanisms. in Proceedings of the Sixth ACM Conference on Security
and Privacy in Wireless and Mobile Networks—WiSec ’13 (Budapest, Hungary, 2013), pp. 55,
https://doi.org/10.1145/2462096.2462107
14. S. Raza, L. Wallgren, T. Voigt, SVELTE: Real-time intrusion detection in the internet of things.
Ad Hoc Netw. 11(8), 2661–2674 (2013). https://doi.org/10.1016/j.adhoc.2013.04.014
15. V.P. Singh, A.S.A. Ukey, S. Jain, Signal strength based hello flood attack detection and preven-
tion in wireless sensor networks. Int. J. Comput. 62(15), 1–6 (2013). https://doi.org/10.5120/
10153-4987
16. S. Mandal, V.E. Balas, R.N. Shaw, A. Ghosh, Prediction analysis of idiopathic pulmonary
fibrosis progression from OSIC dataset. in 2020 IEEE International Conference on Computing,
Power and Communication Technologies (GUCON) (Greater Noida, India, 2020), pp. 861–865.
https://doi.org/10.1109/GUCON48875.2020.9231239
17. S. Mandal, S. Biswas, V.E. Balas, R.N. Shaw, A. Ghosh, Motion prediction for autonomous
vehicles from Lyft dataset using deep learning. in 2020 IEEE 5th International Conference on
Computing Communication and Automation (ICCCA) (Greater Noida, India, 2020), pp. 768–
773. https://doi.org/10.1109/ICCCA49541.2020.9250790
18. K. Grgic, D. Zagar, V.K. Cik, System for malicious node detection in IPv6-based wireless
sensor networks. J. Sensors. 2016, 6206353 (2016). https://doi.org/10.1155/2016/6206353
19. P. Pongle, G. Chavan, Real time intrusion and wormhole attack detection in internet of things.
Int. J. Comput. Appl. 121(9), 1–9 (2015). https://doi.org/10.5120/21565-4589
20. M. Kumar, V.M. Shenbagaraman, R.N. Shaw, A. Ghosh Predictive data analysis for energy
management of a smart factory leading to sustainability. In: Innovations in Electrical and
Electronic Engineering, ed. by M. Favorskaya, S. Mekhilef, R. Pandey, N. Singh Lecture
Notes in Electrical Engineering, vol. 661 (Springer, Singapore, 2021). https://doi.org/10.1007/
978-981-15-4692-1_58
Sleep Apnea Detection Using
Contact-Based and Non-Contact-Based
Using Deep Learning Methods

Anand Singh Rajawat, Romil Rawat, Kanishk Barhanpurkar,


Rabindra Nath Shaw, and Ankush Ghosh

Abstract Sleep apnea is a syndrome in which breathing repeatedly stops and starts
during sleep. It causes a major reduction in the quality of sleep and affects daily
activities. It can be treated once diagnosed with sleep apnea disorder through
laboratory tests or imaging. Numerous researchers have proposed and implemented
automatic scoring processes to address these issues, based on fewer sensors and
automatic classification algorithms. The proposed work develops an optimized CNN
and LSTM smart deep learning model that classifies data sets collected with and
without physical contact with the patient and detects the obstructive sleep apnea
(OSA) condition of the patient. We propose a deep learning model for detecting the
torso and head across various sleep patterns. We achieved 93.02%, 94.50%, and
98.30% accuracy on frames using the optimized CNN model, and 92.1%, 90.2%,
80.50% and 89.6% accuracy using the optimized CNN-LSTM architecture.

Keywords Breath rate · Contact-based methods · Convolutional neural network ·


Heart rate · Long short-term memory

A. S. Rajawat · R. Rawat
Department of Computer Science Engineering, Shri Vaishnav Vidyapeeth Vishwavidyalaya,
Indore, India
e-mail: rajawat_iet@yahoo.in
R. Rawat
e-mail: rawat.romil@gmail.com
K. Barhanpurkar
Department of Computer Science and Engineering, Sambhram Institute of Technology,
Bengaluru, Karnataka, India
e-mail: kanishkbarhanpurkar@yahoo.com
R. N. Shaw
Department of Electrical, Electronics and Communication Engineering, Galgotias University,
Greater Noida, India
e-mail: r.n.s@ieee.org
A. Ghosh (B)
School of Engineering and Applied Sciences, The Neotia University, Sarisha, West Bengal, India
e-mail: ankushghosh@gmail.com

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 87
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_7

1 Introduction

In the year 2020, the exponential spread of the COVID-19 virus disease across the
globe was a significant health concern, because the disease spreads largely through
physical contact with surfaces and other persons [1]. According to World Health
Organization (WHO) guidelines, social distancing is the best practice to prevent
the spread of the disease. Hospitals, health clinics, rehabilitation centres and sleep
clinics are following new protocols, and the entire healthcare industry is looking for
non-contact-based methodologies. Detection techniques fall into two types of
methods: contact-based methods (CBM) and non-contact-based methods (NCBM).
CBM requires physical contact with the patient's body [2] for diagnosis of the
disease, whereas NCBM is a technique in which physical contact with the patient's
body is not required; physical contact can be considered minimal or zero. Sleep [2]
research groups and sleep centres are working persistently on sleep studies.
Respiration is a complicated process involving the collaboration of the nervous
system and the muscles of the lungs. The efficiency of the respiratory system also
depends on the heart and blood vessels [3]. The central nervous system (CNS)
regulates respiratory drive according to input from central and peripheral
chemoreceptors such as baroreceptors (blood pressure), central chemoreceptors
(pH), and aortic or carotid chemoreceptors (O2 and CO2) [4]. Sleeping position is
one of the essential factors in the quality of sleep; sleep apnea occurs with higher
frequency in a horizontal position. Several researchers have observed that snoring
occurs more often in horizontal positions than in others [5, 6].
Finally, the sleeping position is periodically observed in intensive care units (ICUs)
to detect behavioral patterns of the patient. A polysomnogram is the diagnostic tool
for sleep distress in sleep laboratories, where the sleep position is usually recorded
using an obtrusive contact sensor [7] attached to the chest region. However, this
method is troubling and is appropriate only for a minimum number of studies. We
therefore propose an unconventional approach using a single depth camera.
Camera-based systems [8] benefit from the ability to use the same method to track a
range of observations such as agitation, breathing patterns, incident detection and
behavior recognition. Comparing physical-contact and non-physical-contact
approaches [9], both sensor-based techniques and camera-based strategies are
inexpensive and can be installed by the inexpert. These devices are also suitable for
assisted living and elderly treatment. Khan [10] worked on a deep learning
technique, and Xavier and Bengio [11] worked on feed-forward networks; those
techniques suffer from a number of optimization and classification-accuracy issues.
We propose a smart deep learning model for sleep apnea data classification with
physical contact and without physical contact. In this research chapter, the major
highlights of the research work for detection of sleep apnea disorder are as follows:
1. We study and analyze existing deep learning models for the classification of
different data sets, i.e., data sets based on physical contact and without
physical contact.

2. Our smart model performs the classification of the sleep apnea data sets
efficiently.
3. The optimized CNN and LSTM smart deep learning model classifies the data
sets based on physical contact and without physical contact.
4. The accuracy of our proposed model is high compared to conventional CNN
and LSTM.
5. We compare the different data sets using conventional CNN and optimized
CNN.

2 Related Work

CBM and NCBM sleep monitoring systems were designed to monitor the patient's
body movement, sleeping position and breathing rate with the help of sensors and
cameras. These are used to measure the accuracy of the system: head and torso
detection accuracy, breath measurement, as well as sleep movement. There are some
breathing-related disorders like apnea and hyperventilation syndrome (HVS).
Various methods have been designed for screening breathing complaints using
different sensing features such as airflow measurement devices and blood oxygen
devices. This work is divided into two significant categories: one is monitoring head
and torso detection, and the second is breath measurement.

2.1 Head and Torso Detection

The connected component analysis method is applied, in which the components of
every cross-section are extracted and then used for the next iteration. The method
looks for spheres in each cross-section and then moves to the next cross-section.
The search proceeds substantially from the top section to the bottom, analyzing the
sphere in each portion, so the algorithm identifies a number of candidate fields. We
collect the individual samples and choose the highest value for the sample in each
cross-section [12].
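The per-slice search described above can be sketched with a connected-component pass over each cross-section. This is an illustrative reconstruction, not the authors' code: it assumes a thresholded depth volume and uses `scipy.ndimage` to keep the largest component (the head/torso candidate) per slice.

```python
import numpy as np
from scipy import ndimage

def largest_component_per_slice(volume, threshold):
    """For each cross-section (top to bottom), keep the largest connected
    component above `threshold` and return its area and centroid — a rough
    stand-in for the sphere search described in the text."""
    results = []
    for z in range(volume.shape[0]):
        mask = volume[z] > threshold
        labels, n = ndimage.label(mask)          # connected component analysis
        if n == 0:
            results.append(None)
            continue
        sizes = ndimage.sum(mask, labels, range(1, n + 1))
        best = int(np.argmax(sizes)) + 1         # label of the largest component
        cy, cx = ndimage.center_of_mass(labels == best)
        results.append({"area": float(sizes[best - 1]), "centroid": (cy, cx)})
    return results

# Toy volume: a 3x3 blob (head candidate) in the top slice, a speck below.
vol = np.zeros((2, 8, 8))
vol[0, 2:5, 2:5] = 1.0
vol[1, 6, 6] = 1.0
out = largest_component_per_slice(vol, 0.5)
print(out[0]["area"])  # 9.0
```

The largest component in each slice is what the cross-section-to-cross-section iteration would carry forward.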

2.2 Breathing Rate Estimation

A group of images is given as input, from which a region of interest is obtained for
each pixel, producing a trajectory in the time domain. Both infrared and depth
cameras were evaluated as input sources for data collection. For monitoring
respiration, we preferred distance readings, which correlate with chest movements
and generate a stronger signal compared to infrared cameras [13]. The respiratory
rate is a most prominent physiological parameter, yet it is not examined in a range
of healthcare settings. For more accurate, convenient, automatic, and constant
respiratory rate (R.R.) monitoring, numerous sensors and various physical
mechanisms based on networking technologies have been proposed. The R.R.
monitoring technique consists of many networking techniques, but these techniques
provide accurate results based on various wearable devices and modes of
connectivity [14]. In sleep apnea, snoring is an essential factor that can be
considered by the sleep experts during the diagnosis process.
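As an illustration of how a breathing rate can be recovered from such a distance trajectory, the following sketch (an assumption on our part, not the chapter's pipeline) locates the dominant spectral peak of a mean-ROI depth signal within a plausible breathing band:

```python
import numpy as np

def breathing_rate_bpm(depth_signal, fs):
    """Estimate breaths per minute from a mean-ROI depth trajectory by
    locating the dominant spectral peak in the 0.1–0.7 Hz band
    (~6–42 breaths/min). Illustrative only — real pipelines also
    detrend and reject motion artifacts."""
    x = depth_signal - np.mean(depth_signal)      # remove the DC offset
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum = np.abs(np.fft.rfft(x))
    band = (freqs >= 0.1) & (freqs <= 0.7)        # plausible breathing band
    peak = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak

fs = 10.0                                         # 10 depth frames per second
t = np.arange(0, 60, 1 / fs)
sig = 5 + 0.4 * np.sin(2 * np.pi * 0.25 * t)      # 0.25 Hz -> 15 breaths/min
print(round(breathing_rate_bpm(sig, fs)))         # 15
```

On a clean sinusoidal chest signal the spectral peak lands exactly on the breathing frequency; real depth data would need the additional filtering noted in the comments.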
Sleeping on the side is the natural and simple remedy for snoring, compared to
sleeping on the back. For automatic detection of snoring, a deep learning model was
developed, and the transferred model was deployed on an embedded system; it also
helps in the estimation of breathing rate. To deliver a vibration alert, a unique
wearable gadget is worn on the arm until the sleeper turns to the recommended side.
This rechargeable gadget listens over Bluetooth Low Energy [7]. In Table 1, a
comparison has been made between contact-based and non-contact-based methods
of sleep apnea detection. It contains the parameters used as input to the algorithms
on which the comparison has been made. The dependency on body position plays
an essential role in data measurements. Additionally, the different machine learning
algorithms were
Table 1 Comparative analysis between contact-based and non-contact-based sleep apnea methods

| Author | Algorithm | Number of volunteers | Feature | Dependency on body position | Mode of method |
|---|---|---|---|---|---|
| Yu et al. [4] | Depth image processing | 8 | Head/torso detection | No | Non-contact based |
| Martínez-García et al. [27] | Early Fourier fusion algorithm | 67 | Breathing rate | Yes | Contact-based |
| Benetazzo et al. [15] | Respiratory rate measurement algorithm | 5 | Breathing rate | Yes | Contact-based |
| Grimm et al. [2] | Convolutional neural network | 78 | Head/torso detection | No | Non-contact based |
| Yang et al. [12] | Generic pose estimation techniques | N/A | Chest/abdominal movements | Yes | Non-contact based |
| Gu et al. [16] | Exploited CSI Wi-Fi physical layer algorithm | 5 | Body motion | Yes | Contact-based |
| Liu et al. [14] | Respiratory rate algorithm | 8 | Respiratory rate measurement | No | Contact-based |

Table 2 Comparison of accuracy of machine learning algorithms used for detecting sleep apnea

| Author | Machine learning algorithm | Accuracy (%) |
|---|---|---|
| Balaei et al. [17] | Neural network classifier | 62 |
| Altaf et al. [18] | Convolutional neural network | 68.75 |
| Martínez-García et al. [27] | Recurrent neural network | 85 |
| Khan [10] | Convolutional neural network | 97 |
| Eiseman [19] | Support vector machine | 82.7 |

used in the detection of sleep apnea disorder with respect to the accuracy of the
model (Table 2).

3 Proposed Methodology

A CNN can also work with input and output sequences of vectors, which allows it
to process a data sequence. Let us understand how this is implemented. In
conventional techniques, the neural network is given a fixed input for a number of
epochs: for example, we feed the network a single image and obtain a classification
result for that image only [7]. Sequence models let us work with sequences of data
vectors, where the output depends on previous inputs. The first case maps a fixed
input to a fixed output and can be used for grading images; for example, we
consider a data set of sleep apnea pictures, where images of persons without a sleep
problem and the same number of images with sleep apnea are labeled. The second
case maps one input into several outputs; for instance, an image can be given as the
input and a sequence of classified objects can be identified. The third case maps
many inputs to one output and can be used to evaluate single-channel ECG data: we
may insert a sequence of data and obtain a single object representing the output
produced by these inputs. The last case maps a combined input and output
sequence and can be used to monitor the location of objects in an image; we supply
an image sequence as input and get a processed image sequence as output. A CNN
alone has only short-term memory, although it is efficient for processing
prediction-sequence data. This is why research teams have combined the CNN with
LSTM. In simple terms, LSTM addresses the problem of missing long-term
dependencies by ensuring that each neuron in the hidden layer performs four
operations rather than one. See how LSTM functions in [20] (Figs. 1 and 2).

Fig. 1 Operation of conventional CNN and LSTM

Fig. 2 Working of conventional CNN and LSTM

4 Optimized CNN and LSTM

A sequential model is used for building the embedding layer, after which the
convolutional [4] layers are applied. The combination of the LSTM and CNN
layers is a new type of architecture made to exploit the benefits of both the CNN
and the LSTM. The CNN is a model that provides the ability to learn local
responses across time as well as space from related data; on the other hand, the
LSTM layer is specialized for dealing with sequential data, and the convolutional
layer passes a high-level transformation of the data on to it [21]. From the
pretrained word vectors, the CNN layer turns the embedding matrix into
higher-order representations (N-grams). The LSTM learns the sequential
correlations denoted by these higher-order arrangements of the data. Organized by
the convolutional layer, the LSTM input is provided as a feature map in the form of
a series [4], as shown in Fig. 3. The LSTM constructs the sentence representation
from the different sequential windows (N-grams), capturing the variation in the
sentences; the LSTM can also be applied directly, operating on the sentences
themselves.
The network is initiated by the creation of the input layer; the recorded video is
used as the source of entry for the neural network, and features are extracted by
convolution. Let x_i ∈ R^d [13] be the vector for the i-th word (frame), with
embedding dimension d, and let x ∈ R^{L×d} denote an input of length L. Here, k
is the filter length, and the vector m ∈ R^{kd} is the convolution filter. For a given
input size, each location j can be considered [13], and a vector comprising a series
of k frames is represented as:

    v_j = [x_j, x_{j+1}, ..., x_{j+k-1}]                                   (1)

Hence, the commas represent row-direction concatenation [4]. The filter m is
mapped onto the k-gram vectors [13], where each element of the feature map is
formed as:

    E_j = f(v_j · m + b)                                                   (2)

With n filters of the same size, n feature maps are formed [13], which can be
arranged appropriately as representations for all windows w_j:

    V = (E_1, E_2, E_3, E_4, ..., E_n)                                     (3)

The semicolons denote the segments implying vector concatenation, and E_i is the
feature map shaped by the i-th filter. Each row of V ∈ R^{(L−k+1)×n} is the
unique component delivered from the n features at the position given by j. These
higher-order windows are now given as input to the LSTM. In the CNN, the
number of filters is selected as 32 with a filter length of 5, which corresponds to
filter dimensions of 5 × 100 because of the 100-dimensional embedding. The
activation function is set to "ReLU", and all the hyperparameters are chosen using
hyperparameter tuning (Fig. 4).
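The shapes in Eqs. (1)–(3) can be checked with a toy NumPy sketch; the values are random stand-ins for real embeddings, and only the window/feature-map dimensions matter here:

```python
import numpy as np

# Toy check of Eqs. (1)-(3): L = 7 embeddings of dimension d = 4,
# n = 3 filters of length k = 2 produce a feature map of shape
# (L - k + 1) x n. Random values are illustrative only.
rng = np.random.default_rng(0)
L, d, k, n = 7, 4, 2, 3
x = rng.normal(size=(L, d))          # input sequence of embeddings
m = rng.normal(size=(n, k * d))      # n flattened filters of length k*d
b = rng.normal(size=n)               # one bias per filter

relu = lambda z: np.maximum(z, 0)    # the activation f in Eq. (2)
V = np.stack([
    relu(m @ np.concatenate(x[j:j + k]) + b)   # Eq. (1) window, Eq. (2) map
    for j in range(L - k + 1)
])
print(V.shape)                        # (6, 3), i.e. (L - k + 1, n) as in Eq. (3)
```

In the chapter's configuration this corresponds to k = 5, d = 100 and n = 32, giving filter dimensions of 5 × 100.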

Fig. 4 Proposed fusion model

At every time step, the module output in R^d is controlled by the previous hidden
state HS_{s−1} and the current input at the present time step x_s. The forget gate
f_s, the input gate i_s and the output gate Q_s together decide how to fill in the
current memory cell E_s and hidden state HS_s. Here d denotes the dimension of
the LSTM memory, and the vectors are sized to make the architecture compatible.
The LSTM transition equations are well defined in [9] and are shown in:

    i_s = σ(W_i [HS_{s−1}, x_s] + b_i)                                     (4)

    f_s = σ(W_f [HS_{s−1}, x_s] + b_f)                                     (5)

    P_s = tanh(W_P [HS_{s−1}, x_s] + b_P)                                  (6)

    Q_s = σ(W_Q [HS_{s−1}, x_s] + b_Q)                                     (7)

    E_s = f_s ⊙ E_{s−1} + i_s ⊙ P_s                                        (8)

    HS_s = Q_s ⊙ tanh(E_s)                                                 (9)

The sigmoid function σ of each layer always outputs values in the range [0, 1];
tanh denotes the hyperbolic tangent, whose values lie in [−1, 1]; and ⊙ denotes
element-wise multiplication. Because the LSTM handles long-term dependencies
(so the vanishing-gradient problem is not met), it has been selected for the next
layer.
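Eqs. (4)–(9) can be exercised with a minimal single-step LSTM cell in NumPy; the weight containers `W` and `b` below are illustrative names, not the chapter's implementation:

```python
import numpy as np

def lstm_step(x_s, HS_prev, E_prev, W, b):
    """One LSTM step following Eqs. (4)-(9). W maps the concatenated
    [HS_{s-1}, x_s] to each gate pre-activation; names match the text
    (P = candidate cell, Q = output gate, E = memory cell, HS = hidden)."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    zcat = np.concatenate([HS_prev, x_s])
    i_s = sigmoid(W["i"] @ zcat + b["i"])       # input gate,  Eq. (4)
    f_s = sigmoid(W["f"] @ zcat + b["f"])       # forget gate, Eq. (5)
    P_s = np.tanh(W["P"] @ zcat + b["P"])       # candidate,   Eq. (6)
    Q_s = sigmoid(W["Q"] @ zcat + b["Q"])       # output gate, Eq. (7)
    E_s = f_s * P_s * 0 + f_s * E_prev + i_s * P_s   # cell update, Eq. (8)
    HS_s = Q_s * np.tanh(E_s)                   # hidden state, Eq. (9)
    return HS_s, E_s

rng = np.random.default_rng(1)
h, dx = 4, 3                                     # memory size, input size
W = {g: rng.normal(size=(h, h + dx)) for g in "ifPQ"}
b = {g: np.zeros(h) for g in "ifPQ"}
HS, E = lstm_step(rng.normal(size=dx), np.zeros(h), np.zeros(h), W, b)
print(HS.shape)   # (4,)
```

Because Q_s ∈ (0, 1) and tanh(E_s) ∈ (−1, 1), every component of the hidden state is bounded in (−1, 1), which matches the range discussion above.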

Similarly, batch normalization permits the layers of a network to learn somewhat
independently of the other layers. Four fully connected layers are utilized in the
model, containing an appropriate number of neurons with the "ReLU" activation
function, ending in two neurons with a softmax activation function. In two places, a
dropout with a probability of 20% is incorporated to avoid overfitting, preventing
the model from memorizing the estimated data set. Given a training sample x_i and
its actual label y_i ∈ [0, 1], the estimated probability p(y_i) lies in the range [0, 1].
Thus, the error formula can be denoted in the following form:

    HS_p(P) = − Σ_{i=1}^{N} [ y_i · log(p(y_i)) + (1 − y_i) · log(1 − p(y_i)) ]   (10)
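Eq. (10) is the binary cross-entropy. A minimal sketch of computing it follows; the clipping constant is our own addition for numerical safety, not part of the equation:

```python
import numpy as np

def binary_cross_entropy(y_true, p):
    """Eq. (10) averaged over N samples: the negative log-likelihood of
    the true labels under the predicted probabilities. Probabilities are
    clipped away from 0 and 1 to avoid log(0)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y = np.array([1, 0, 1, 1])            # true labels
p = np.array([0.9, 0.1, 0.8, 0.7])    # predicted probabilities of class 1
print(round(binary_cross_entropy(y, p), 4))  # 0.1976
```

Confident predictions on the correct class drive the loss toward zero, which is what the dropout and batch-normalization choices above aim to achieve without overfitting.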

5 Data Set Comparison

A comparison has been made on the attributes of well-known data sets such as the
children's sleep [2] and health study data set, the ISRUC-sleep data set, etc. These
data sets are based on the contact-based method and contain the parameters studied
during polysomnography in sleep clinics. Biosensors connected to the patient's
body in the sleep clinic measure the vital parameters of the body. The biosensors
measure physiological signals like heart rate, breath rate, ECG, EOG, etc., and
provide the information (Tables 3, 4, 5 and 6).

Table 3 Description of sleep and health study data set [22]

| S. No. | Attribute | Description |
|---|---|---|
| 1 | ECG | Records the electrical impulses through the heart muscles in the chest area |
| 2 | Leg sensor | Measures the electrical activities of muscles and nerves |
| 3 | Respiratory belts | Measure changes in thoracic circumference during respiration |
| 4 | Snore microphone | Detects higher frequencies of snoring |
| 5 | Position sensor | Measures the mechanical position |
| 6 | Oximeter | Measures the quantity of oxygen carried in the body |
| 7 | Thermistor | Detects the body temperature |

Table 4 Description of ISRUC-sleep data set [23]

| S. No. | Attribute | Description |
|---|---|---|
| 1 | EEG | Detects the electrical activity of the brain |
| 2 | EOG | Measures the retinal standing potential |
| 3 | EMG | Measures the electrical activity produced by movements of skeletal muscles |
| 4 | H.R. | Detects the heart rate in the chest area |
| 5 | SaO2 | Detects the oxygen saturation by blood analysis |

Table 5 Description of ISRUC-sleep data set [23]

| S. No. | Attribute | Description |
|---|---|---|
| 1 | ECG | Records the electrical impulses through the heart muscles in the chest area |
| 2 | EMG | Measures the electrical activity produced by movements of skeletal muscles |
| 3 | EEG | Detects the electrical activity of the brain |
| 4 | Breath rate | Measures changes in thoracic circumference during respiration |

Table 6 Description of sleep EDFx [24]

| S. No. | Attribute | Description |
|---|---|---|
| 1 | Name | Laboratory name of the volunteer |
| 2 | Sampling frequency | Frequency at which the observations are recorded |
| 3 | Age | Age of the volunteer |
| 4 | Sex | Sex of the volunteer |
| 5 | Recording duration (in hours) | The total duration of recorded sessions |
| 6 | PLMS | Periodic limb movement syndrome |
| 7 | SAHS | Sleep apnea–hypopnea syndrome |
| 8 | TST | Total sleep time |

6 Result and Discussion

The data sets are based on physical contact and without physical contact and
consist of the readings of polysomnography. They contain several attributes such as
body position, SaO2, nasal/oral airflow, abdominal effort, thoracic effort, EEG,
ECG, EMG and snoring MIC. The data obtained through this test is in the form of a
waveform time series. The most commonly used file extension for polysomnography
is the European Data Format (.edf). The data set also consists of visual recordings
of the volunteers involved in this experiment. The camera is mounted such that it
records all the head and torso movements of the body. We also implemented an
object detection method for the visual recordings to increase surveillance efficiency.
The complete system is developed in Python 3.7.x, and the libraries used are
NumPy, Pandas, TensorFlow, Keras and the ImageAI library. We have taken a
number of classes from sleep apnea data sets like the health study data set [14], the
ISRUC-sleep data set and the PhysioNet database sleep EDFx. We achieved
93.02%, 94.50%, and 98.30% accuracy on frames using the optimized CNN model,
and 92.1%, 90.2%, 80.50% and 89.60% accuracy on the different data sets using
the optimized CNN and LSTM. Figure 5 shows the recognition accuracy of the
training and testing process for sleep apnea ECG data set classification. In Table 7,
contact-based sleep apnea data sets are compared based on CNN and LSTM
techniques with accuracy and F1-score as features. Similarly, non-contact-based
sleep apnea methods (Table 8) are compared based on CNN with accuracy and
standard deviation as features. Table 9 shows the contact-based method for
detection of sleep apnea disorder using the optimized CNN and LSTM and the
conventional CNN and LSTM methods. Furthermore, Table 10 shows the
non-contact-based method for detection of sleep apnea disorder using the optimized
CNN and the conventional CNN methods.
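Reading the accuracy columns of Table 9 directly, the per-data-set gain of the optimized model over the conventional one can be tabulated as follows (the (conventional, optimized) pairs are copied from the table):

```python
# Accuracy gains of the optimized CNN+LSTM over the conventional
# CNN+LSTM on the contact-based data sets, values from Table 9.
contact = {
    "Health study": (89.0, 92.1),
    "ISRUC-sleep": (84.61, 90.2),
    "PhysioNet": (73.0, 80.50),
    "Sleep EDFx": (81.0, 89.60),
}
for name, (conv, opt) in contact.items():
    print(f"{name}: +{opt - conv:.2f} percentage points")
```

The gain ranges from about 3.1 points (health study data set) to 8.6 points (sleep EDFx), with the largest improvements on the data sets where the conventional model was weakest.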

Fig. 5 Recognition accuracy of this network training and testing process

Table 7 Comparative study of different data sets with contact-based sleep apnea

| Data set | Method | Accuracy (%) | F1-score (%) |
|---|---|---|---|
| Health study data set | CNN + LSTM | 89 | 81 |
| ISRUC-sleep data set | CNN + LSTM | 84.61 | 84.34 |
| PhysioNet database | CNN + LSTM | 73 | 72 |
| Sleep EDFx | CNN + LSTM | 81 | 74 |

Table 8 Comparative study of different data sets with non-contact-based sleep apnea

| Study | Method | Accuracy (%) | Standard deviation (SD) |
|---|---|---|---|
| Yu et al. [4] | CNN | 89.70 | 0.014 |
| Grimm et al. [2] | CNN | 91.20 | 0.034 |
| Gu et al. [16] | CNN | 96.63 | 0.021 |

Table 9 Numerical results of conventional CNN and LSTM and optimized CNN and LSTM during the validation process for contact-based sleep apnea

| Data set | Conventional method | Conventional accuracy (%) | Conventional F1-score (%) | Optimized method | Optimized accuracy (%) | Optimized F1-score (%) |
|---|---|---|---|---|---|---|
| Health study data set | CNN + LSTM | 89 | 81 | Optimized CNN and LSTM | 92.1 | 94.92 |
| ISRUC-sleep database | CNN + LSTM | 84.61 | 84.34 | Optimized CNN and LSTM | 90.2 | 96.83 |
| PhysioNet database | CNN + LSTM | 73 | 72 | Optimized CNN and LSTM | 80.50 | 91.03 |
| Sleep EDFx | CNN + LSTM | 81 | 74 | Optimized CNN and LSTM | 89.60 | 92.23 |

7 Contact-Based Sleep Apnea Detection Methods Evaluation

See Figs. 6 and 7.

7.1 Non-Contact-Based Sleep Apnea Detection Method

The graphical representation shows the results for the non-contact-based detection
methods using torso detection and head detection. It shows the head and torso
movement as soon as an obstruction in the respiration is observed. A sudden
movement can be observed in both head and torso due to an abnormality in

Table 10 Numerical results of conventional CNN and optimized CNN during the validation process for non-contact-based sleep apnea

| Author | Conventional method | Conventional accuracy (%) | Standard deviation (SD) | Optimized method | Optimized accuracy (%) | Standard deviation (SD) |
|---|---|---|---|---|---|---|
| Yu et al. [4] | CNN | 89.70 | 0.014 | Optimized CNN | 93.02 | 0.0021 |
| Grimm et al. [2] | CNN | 91.20 | 0.034 | Optimized CNN | 94.50 | 0.0040 |
| Gu et al. [16] | CNN | 96.63 | 0.021 | Optimized CNN | 98.30 | 0.0018 |

Fig. 6 Joint-plot graph representing the relation between ECG and respiration rate for a healthy
person

Fig. 7 a, b Count-plot represents the number of times in which abnormal deflections have been
observed in volunteers during polysomnography throughout the test time in the ECG and EEG
parameters

the respiration process. Here, the X-axis displays [25] the number of seconds and
the Y-axis displays the rate of movement captured through the mounted camera (Fig. 8).

Fig. 8 a Sudden deflection observed in head, b torso movement

8 Conclusion and Future Work

In this research chapter, the authors studied the labeled data sets using the proposed
smart optimized CNN and LSTM model; pre-processing was done in Python, and
pre-trained embedding weights were taken. Detecting sleep apnea using the
combined convolutional and LSTM layers provides improved feature extraction in
less time, also supporting the model's high accuracy [26]. The ongoing and studied
work relates to a smart deep learning model for sleep apnea data classification with
physical contact and without physical contact. The obtained results compare
conventional CNN and LSTM against optimized CNN and LSTM during the
validation process for contact-based sleep apnea, and conventional CNN against
optimized CNN during the validation process for non-contact-based sleep apnea. In
future, we will implement the system in a real-time scenario.

References

1. H. Nishiura, N.M. Linton, A.R. Akhmetzhanov, Serial interval of novel coronavirus (COVID-
19) infections. Int. J. Infect. Dis. 93, 284–286 (2020). https://doi.org/10.1016/j.ijid.2020.02.06
2. T. Grimm, M. Martinez, A. Benz, R. Stiefelhagen, Sleep position classification from a depth
camera using bed aligned maps. in 2016 23rd International Conference on Pattern Recognition
(ICPR) (2016). https://doi.org/10.1109/icpr.2016.7899653
3. M. Marin-Oto, E.E. Vicente, J.M. Marin, Long term management of obstructive sleep apnea
and its comorbidities. Multidiscip. Respir. Med. 14, 21 (2019). https://doi.org/10.1186/s40248-
019-0186-3
4. M.C. Yu, H. Wu, J.L. Liou, M.S. Lee, Y.P. Hung, Multiparameter sleep monitoring using a
depth camera. in Biomedical Engineering Systems and Technologies (BIOSTEC 2012), ed. by
J. Gabriel et al. Communications in Computer and Information Science, vol. 357 (Springer,
Berlin, Heidelberg, 2013)

5. C.L. Rosen, E.K. Larkin, H.L. Kirchner, J.L. Emancipator, S.F. Bivins, S.A. Surovec, R.J.
Martin, S. Redline, Prevalence and risk factors for sleep-disordered breathing in 8- to 11-year-
old children: association with race and prematurity. J Pediatr. 142(4):383–389 (2003). PubMed
PMID: 12712055
6. J.C. Spilsbury, A. Storfer-Isser, D. Drotar, C.L. Rosen, H.L. Kirchner, S. Redline, Effects of the
home environment on school-aged children’s sleep. Sleep 28(11):1419–1427 (2005). PubMed
PMID: 16335483
7. M. Haescher, D.J.C. Matthies, J. Trimpop, B. Urban, SeismoTracker: upgrade any smart wear-
able to enable a sensing of heart rate, respiration rate, and microvibrations. in Proceedings
of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems
(2016).
8. K. He, X. Zhang, S. Ren, J. Sun, Delving deep into rectifiers: surpassing human-level perfor-
mance on ImageNet classification. in Proceedings of the IEEE International Conference on
Computer Vision (2015)
9. R. Ravichandran, E. Saba, K. Chen, M. Goel, M. Gupta, S. Patel, Wibreathe: estimating respi-
ration rate using wireless signals in natural settings in the home. in IEEE PerCom Conference
(2015)
10. T. Khan, A deep learning model for snoring detection and vibration notification using a smart
wearable gadget. MDPI Electron. (2019). https://doi.org/10.3390/electronics8090987
11. J. Nagi, F. Ducatelle, G.A. di Caro, D. Ciresan, U. Meier, A. Giusti, F. Nagi, J. Schmid-
huber, L.M. Gambardella, Max-pooling convolutional neural networks for vision-based hand
gesture recognition. in Proceedings of the IEEE International Conference on Signal and Image
Processing Applications (ICSIPA2011) (2011)
12. C. Yang, G. Cheung, V. Stankovic, K. Chan, N. Ono, Sleep apnea detection via depth video
and audio feature learning. IEEE Trans. Multimedia 19(4), 822–835 (2017). https://doi.org/10.
1109/tmm.2016.2626969
13. M. Martinez, R. Stiefelhagen, Breathing rate monitoring during sleep from a depth
camera under real-life conditions. in 2017 IEEE Winter Conference on Applications of
Computer Vision (WACV) (2017). https://doi.org/10.1109/wacv.2017.135
14. H. Liu, J. Allen, D. Zheng, F. Chen, Recent development of respiratory rate measurement
technologies. Inst. Phys. Eng. Med. 40(7), (2019). https://doi.org/10.1088/1361-6579/ab299e
15. F. Benetazzo, S. Longhi, A. Monteriù, A. Freddi, Respiratory rate detection algorithm based
on RGB-D camera: theoretical background and experimental results. Healthc. Technol. Lett.
1(3), 81–86 (2014). https://doi.org/10.1049/htl.2014.0063
16. Y Gu, X. Zhang, Z. Liu, F. Ren, WiFi-based real-time breathing and heart rate monitoring
during sleep. (2019)
17. A. Balaei, K. Sutherland, P. Cistulli, P. Chazal, Automatic detection of obstructive sleep apnea
using facial images. 215–218 (2017) https://doi.org/10.1109/ISBI.2017.7950504
18. F. Altaf, S. Islam, N. Akhtar, N. Janjua, Going deep in medical image analysis: concepts,
methods, challenges and future directions. IEEE Access. 1–1 (2019). https://doi.org/10.1109/
ACCESS.2019.2929365.
19. N.A. Eiseman, M.B. Westover, J.E. Mietus, R.J. Thomas, M.T. Bianchi, Classification algo-
rithms for predicting sleepiness and sleep apnea severity. J. Sleep Res. 21(1), 101–112
(2012)
20. D.P. Kingma, J. Ba, Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.
6980, (2014)
21. R. Awan, N.A. Koohbanani, M. Shaban, A. Lisowska, N. Rajpoot, Convideo-aware learning
using transferable features for classification of breast cancer histology images. in International
Conference on Image Analysis and Recognition (Springer, 2018), pp. 788–795
22. D.A. Dean, A.L. Goldberger, R. Mueller, M. Kim, M. Rueschman, D. Mobley, S.S. Sahoo,
C.P. Jayapandian, L. Cui, M.G. Morrical, S. Surovec, G.Q. Zhang, S. Redline, Scaling up
scientific discovery in sleep medicine: the national sleep research resource. Sleep 39(5), 1151–
1164 (2016). https://doi.org/10.5665/sleep.5774. Review. PubMed PMID: 27070134; PubMed
Central PMCID: PMC4835314
Sleep Apnea Detection Using Contact-Based … 103

23. S. Khalighi, T. Sousa, J. Santos, U. Nunes, ISRUC-sleep: a comprehensive public data-set for
sleep researchers. Comput. Methods Programs Biomed. 124, (2015). https://doi.org/10.1016/
j.cmpb.2015.10.013
24. B. Kemp, A.H. Zwinderman, B. Tuk, H.A.C. Kamphuisen, J.J.L. Oberyé, Analysis of a sleep-
dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG. IEEE-BME
47(9), 1185–1194 (2000)
25. M. Hall, E. Frank, E. Holmes, B. Pfahringer, P. Reutemann, I. Witten, The WEKA data mining
software: an update. ACM SIGKDD Explor. Newsl. (2009)
26. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: a simple
way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014)
27. M.A. Martínez-García, F. Capote, F. Campos-Rodríguez, P. Lloberes, M.J.D. de Atauri, M.
Somoza, J.M. Montserrat, Effect of CPAP on blood pressure in patients with obstructive sleep
apnea and resistant hypertension: the HIPARCO randomized clinical trial. Jama 310(22), 2407–
2415 (2013)
Drift Compensation of a Low-Cost pH
Sensor by Artificial Neural Network

Punit Khatri, Karunesh Kumar Gupta, and Raj Kumar Gupta

Abstract Over the past two decades, sensor technology has enabled the manufacturing of low-cost, portable sensors that can be used for different environmental applications such as water quality, air quality, and soil quality monitoring. Sensors used for environmental monitoring sooner or later face the problem of drift after installation. The drift may occur due to sensor aging, temperature and humidity variation, poisoning within the sensor array, or a combination of these. Sensor drift degrades the calibration model of any instrument. The issue can be addressed by recalibrating the sensors, which is, however, a challenge for field-deployable instruments. In this chapter, an alternative solution for drift compensation based on an artificial neural network (ANN) is provided. A low-cost pH sensor is used for the research work and its explanation. The pH sensor readings were recorded 66 times in a reference solution during the measurement session. Drift was observed in the pH sensor readings and compensated using a feed-forward neural network. The simulation was performed on the Python platform. Drift compensation was successfully achieved with the ANN model, the RMSE being reduced to as low as 0.0001.

Keywords Environmental monitoring · pH sensor · Drift compensation · Artificial neural network · Reference solution

1 Introduction

Water quality monitoring is essential these days as the available water is being
polluted because of modernization, urbanization, and industrialization. This pollu-
tion may cause severe diseases like dysentery, cholera, diarrhea, etc. The concern

P. Khatri (B) · K. K. Gupta


Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science
(BITS), Pilani, Rajasthan, India
e-mail: P20170009@pilani.bits-pilani.ac.in
R. K. Gupta
Department of Physics, Birla Institute of Technology and Science (BITS), Pilani, Rajasthan, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 105
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_8

of pollution has become a major issue worldwide [1]. A WHO report warns that many people will be living in water-stressed areas by 2025 [2]. Water quality should therefore be monitored before consumption [3]. Water quality is defined by various parameters such as pH, electrical conductivity, and turbidity, among others. pH plays a vital role in deciding water quality. Most available pH sensors are based on electrochemical technology. Electrochemical sensors suffer from drift in their measurements, which may destabilize already established calibration models over time. Sensor drift may occur due to temperature and humidity variation, aging, poisoning among the sensors, or layer deposition on the sensor [4–7].
One method to counter drift is routine calibration of the sensors. This increases cost, since calibration standards are expensive, and consumes considerable time. An alternative is to apply machine learning (ML) techniques or statistical analysis, which can extend the calibration lifetime. Many authors have applied statistical approaches to counter drift in various sensors (gas sensors as well as water quality sensors) [8, 9], and these have proved quite efficient for drift correction of gas sensors. Over the past decade, the use of ML techniques in complex computing applications has grown [10]. Artificial neural networks (ANNs), inspired by biological neural networks, are a subclass of ML. An ANN is a set of interconnected artificial nodes in which any node can send a signal to any other node [11]. ANNs solve problems in a manner similar to the human brain and are thus used in many complex real-life applications [12].
In this chapter, drift compensation of a low-cost commercial pH sensor is carried out using an ANN model. The ANN is implemented in Python, an open-source programming language that supports complex computation and data processing. The ANN model was created using the scikit-learn (sklearn) Python library. The readings for the ANN model were taken from the water quality monitoring setup developed in our previous work [13]. The RMSE was calculated for the developed ANN model, and the trained model was finally validated using a tenfold cross-validation (CV) method.
The rest of the chapter is arranged as follows: Materials and Methods provides an overview of the pH sensor and its drift readings; Results presents the training and validation of the ANN model on the acquired data, followed by a Discussion; finally, concluding remarks are given.

2 Materials and Methods

2.1 The pH Sensor and Its Drift Readings

In this chapter, we have used a low-cost commercial pH sensor. It is an electrochemical sensor with a measurement range from 0 to 14. The other specifications of the pH sensor are given in Table 1, and the structure of the sensor is shown in Fig. 1.

Table 1 pH sensor specification

  Range                         0–14
  Resolution                    ±0.0001
  Accuracy                      ±0.002
  Response time                 95% in 1 s
  Temperature range             −5 to 99 °C
  Internal temperature sensor   No

Fig. 1 pH sensor structure

The measurements were taken with the pH sensor in a reference solution of pH 7, using the setup developed in our previous work (as stated in the Introduction), and were recorded 66 times during the measurement session. The tests were uniformly distributed over 132 days, with one day's gap between successive measurements (i.e., a reading every second day), giving 66 values. The reference solution used in this work was of analytical grade and non-toxic. Figure 2 shows the recorded pH sensor readings: the x-axis shows the day on which each measurement was taken, and the y-axis the corresponding pH value. Throughout the measurement procedure, the sample temperature was kept constant at 25 °C so that no deviation in measurement would arise from temperature variation.

Fig. 2 pH sensor readings

2.2 ANN Structure for Drift Compensation

The network used in this work is a fully connected feed-forward network, trained by backpropagation without momentum or weight decay. The simulation was carried out with the sklearn library in Python 3.7 [14]. The dataset was split in a 70:15:15 ratio for training, testing, and validation; for validation, a tenfold CV was performed.
Three different activation functions (ReLU, logistic, and tanh) were used. The learning rate was fixed at 0.001, and the number of hidden neurons was set to 40 through an optimization strategy. The structure of the ANN model is shown in Fig. 3: a two-input, one-output feed-forward model. The drifted pH and the target pH values are fed to the input layer, and the output layer gives the corrected pH values.
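The configuration just described can be sketched with scikit-learn, the library the authors name. The synthetic data below merely stands in for the recorded readings (which are not published with the chapter), so the drift shape, noise level, and the single hold-out split shown here are assumptions; the chapter's actual protocol is a 70:15:15 split plus tenfold CV.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Stand-in for the 66 readings: a pH-7 reference solution with an assumed
# slow linear drift plus noise, one reading every second day over 132 days.
days = np.arange(0, 132, 2)
drifted_ph = 7.0 + 0.004 * days + rng.normal(0.0, 0.01, days.size)
target_ph = np.full(days.size, 7.0)

# Two inputs (drifted pH, target pH) and one output (corrected pH).
X = np.column_stack([drifted_ph, target_ph])
y = target_ph

# Single hold-out split for brevity; the chapter uses 70:15:15 with tenfold CV.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# 40 hidden neurons, learning rate 0.001, ReLU activation, as in the chapter.
model = MLPRegressor(hidden_layer_sizes=(40,), activation="relu",
                     learning_rate_init=0.001, max_iter=5000, random_state=0)
model.fit(X_tr, y_tr)

rmse = float(np.sqrt(np.mean((model.predict(X_te) - y_te) ** 2)))
print(f"hold-out RMSE: {rmse:.4f}")
```

Swapping `activation` to `"logistic"` or `"tanh"` reproduces the chapter's three-way comparison.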

Fig. 3 Two-input, one-output ANN model (input layer: drifted pH and target pH; one hidden layer; output layer: corrected pH)

3 Results

The pH sensor data were recorded under laboratory conditions in the reference solution of pH 7. It can be observed from Fig. 2 that there is a significant drift in the pH readings, which should be compensated by an ML technique. These drifted data were fed to the ANN model described in Sect. 2 and implemented in Python. In the ANN model, compensation was performed with the three activation functions given in Eqs. (1), (2), and (3), with the other simulation parameters set as discussed earlier.
ReLU function:

    f(x) = 0 for x ≤ 0;  f(x) = x for x > 0                  (1)

Logistic function:

    f(x) = 1 / (1 + e^(−x))                                  (2)

Tanh function:

    f(x) = 2 / (1 + e^(−2x)) − 1                             (3)
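The three activation functions can be written directly in Python; a short, library-independent sketch (the function names are ours):

```python
import math

def relu(x):
    # Eq. (1): 0 for x <= 0, x for x > 0
    return x if x > 0 else 0.0

def logistic(x):
    # Eq. (2): 1 / (1 + e^-x)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Eq. (3): 2 / (1 + e^-2x) - 1, identical to math.tanh(x)
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

print(relu(-2.0), logistic(0.0), round(tanh(1.0), 4))  # → 0.0 0.5 0.7616
```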

The predicted and cross-validated pH values for different activation functions


are shown in Figs. 4, 5, and 6. The red points correspond to predicted pH values
after modeling, and the blue points are the cross-validated pH values. The root mean
square error (RMSE) [15] has been calculated using the formula given in Eq. (4).

Fig. 4 Predicted and cross-validated pH values for the ReLU function



Fig. 5 Predicted and cross-validated pH values for the logistic function

Fig. 6 Predicted and cross-validated pH values for the tanh function

    RMSE = sqrt( (1/n) · Σ_{i=1}^{n} (y_p − y_o)^2 )         (4)

where y_p denotes the predicted values, y_o the observed values, and n the number of observations. A tenfold RMSECV was also calculated for the model. The RMSE and RMSECV for the different activation functions are given in Table 2. The minimum and

Table 2 RMSE and RMSECV for different activation functions

  Activation function   RMSE     RMSECV
  ReLU                  0.0001   0.0001
  Logistic              0.0799   0.0782
  Tanh                  0.0014   0.0012

maximum RMSE are 0.0001 and 0.0799, respectively, and the minimum and maximum RMSECV are 0.0001 and 0.0782, respectively.
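Equation (4) is easy to verify with a few lines of plain Python. The numbers below are invented for illustration; note that a real RMSECV additionally retrains the model on each fold.

```python
import math

def rmse(predicted, observed):
    # Eq. (4): square root of the mean squared prediction error.
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

# Hypothetical corrected readings against the pH-7 reference solution.
pred = [7.00, 7.01, 6.99, 7.02]
obs = [7.00, 7.00, 7.00, 7.00]
print(round(rmse(pred, obs), 4))  # → 0.0122
```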

4 Discussion

The effect of drift on a pH sensor has been studied in this chapter, and compensation of the drift using an artificial neural network (ANN) has been investigated. The pH sensor readings were recorded over the measurement period, and drift was noticed in the sensor output. A two-input, one-output feed-forward neural network was proposed. Different activation functions were used in the proposed network architecture, and their suitability for compensation was studied. Based on the results presented in Table 2 and Figs. 4, 5, and 6, it can be concluded that ReLU and tanh are quite suitable as activation functions, since their RMSE and RMSECV values are very low, whereas the logistic function does not give satisfactory results. The number of iterations and the number of hidden nodes were also varied, but the results did not improve; the reason behind the unsuitability of the logistic activation function for pH drift compensation needs further investigation. Overall, an ANN may be useful for drift compensation of a low-cost commercial pH sensor in field-deployed systems, where regular site visits are not possible. A further advantage of the presented work is that it can easily be integrated into our previous work [16] at no additional cost, since Python and its libraries are open-source and freely available to everyone. The work can also be extended to other water quality parameters, such as the electrical conductivity (EC), dissolved oxygen (DO), and oxidation–reduction potential (ORP) sensors.

5 Concluding Remarks

In this chapter, a simple approach for drift compensation of a commercial pH sensor using an artificial neural network was demonstrated. The readings were taken from the setup developed in our previous work over 132 days under laboratory conditions. The ANN model was trained on the measurement data with three different activation functions. The drift was removed with minimal RMSE and validated using a tenfold cross-validation method. The ANN technique performs well for pH sensor drift compensation. In the future, we plan to carry out drift analysis and compensation for the sensor array used for water quality monitoring.

References

1. L.Y. Li, H. Jaafar, N.H. Ramli, Preliminary study of water quality monitoring based on WSN
technology. In: 2018 International Conference on Computational Approach in Smart Systems
Design and Applications, ICASSDA 2018 (Institute of Electrical and Electronics Engineers
Inc., 2018)
2. World Health Organization (WHO), Drinking Water, Fact Sheet No. 391 (2017). Available from: https://www.who.int/mediacentre/factsheets/fs391/en/
3. N. Vijayakumar, R. Ramya, The real time monitoring of water quality in IoT environment. in
2015 International Conference on Innovations in Information, Embedded and Communication
Systems (ICIIECS) (IEEE, 2015), pp. 1–5
4. M. Padilla, A. Perera, I. Montoliu et al., Drift compensation of gas sensor array data by orthog-
onal signal correction. Chemom. Intell. Lab. Syst. 100, 28–35 (2010). https://doi.org/10.1016/
j.chemolab.2009.10.002
5. K. Yan, D. Zhang, Correcting instrumental variation and time-varying drift: a transfer learning
approach with autoencoders. IEEE Trans. Instrum. Meas. 65, 2012–2022 (2016). https://doi.
org/10.1109/TIM.2016.2573078
6. S. Liu, L. Feng, J. Wu et al., Concept drift detection for data stream learning based on angle
optimized global embedding and principal component analysis in sensor networks. Comput.
Electr. Eng. 58, 327–336 (2017). https://doi.org/10.1016/j.compeleceng.2016.09.006
7. V. Panchuk, L. Lvova, D. Kirsanov et al., Extending electronic tongue calibration lifetime
through mathematical drift correction: Case study of microcystin toxicity analysis in waters.
Sens. Actuators B Chem 237, 962–968 (2016). https://doi.org/10.1016/J.SNB.2016.07.045
8. T. Artursson, T. Eklov, I. Lundstrom et al., Drift correction for gas sensors using multivariate
methods. J. Chemom. 14, 711–723 (2000). https://doi.org/10.1002/1099-128X(200009/12)14:
5/6%3c711::AID-CEM607%3e3.0.CO;2-4
9. A. Ziyatdinov, S. Marco, A. Chaudry et al., Drift compensation of gas sensor array data by
common principal component analysis. Sens. Actuators, B Chem. (2010). https://doi.org/10.
1016/j.snb.2009.11.034
10. T. Mitchell, Machine Learning, Chapter 6 (1997). https://doi.org/10.1007/s10994-009-5101-2
11. L.A. Gatys, A.S. Ecker, M. Bethge, A Neural Algorithm of Artistic Style (2015)
12. R. Bhardwaj, S. Majumder, P.K. Ajmera, et al., Temperature compensation of ISFET based pH
sensor using artificial neural networks. in Proceedings of the 2017 IEEE Regional Symposium
on Micro and Nanoelectronics, RSM 2017 (2017)
13. P. Khatri, K. Kumar Gupta, R. Kumar Gupta, Raspberry Pi-based smart sensing platform for
drinking-water quality monitoring system: a python framework approach. Drink Water Eng.
Sci. 12, 31–37 (2019). https://doi.org/10.5194/dwes-12-31-2019
14. scikit-learn: machine learning in Python—scikit-learn 0.22 documentation. https://scikit-learn.
org/stable/index.html. Accessed 13 Dec 2019
15. Root-mean-square deviation—Wikipedia. https://en.wikipedia.org/wiki/Root-mean-square_
deviation. Accessed 16 Dec 2019
16. P. Khatri, K.K. Gupta, R.K. Gupta, Assessment of water quality parameters in real-time
environment. SN Comput. Sci. 1, 340 (2020). https://doi.org/10.1007/s42979-020-00368-9
Sentiment Analysis at Online Social
Network for Cyber-Malicious Post
Reviews Using Machine Learning
Techniques

Romil Rawat, Vinod Mahor, Sachin Chirgaiya, Rabindra Nath Shaw,


and Ankush Ghosh

Abstract The growing importance of sentiment analysis coincides with the growth of online platforms such as cyber-vulnerability reviews, discussion forums for cyber-threat studies, blogs reviewing malicious activity, microblogging sites, and social networks concerned with the study of cybercriminal activities. The choices and opinions people use to interpret the world are largely shaped by how others perceive and evaluate it. For this reason, it is common, when making a decision, to look at the conclusions and evaluations of others. This is true for individuals as well as for organizations and society. This work is a broad sentiment analysis study: applying natural language processing (NLP) to decide whether a piece of text contains subjective information and what emotional information it conveys, using cyber-malicious post reviews, i.e., whether the intent behind the content is positive (+) or negative (−). Since radical actors promote criminal cyber-events through online social networks, security agencies authenticate users and block such content. Understanding the emotions behind web-user-generated content and document collections is therefore of great help for business and individual use, among others. The task can be carried out at different levels of text processing, classifying the polarity of words, sentences, or entire document collections. Here, the methodology investigates a system for cyber-vulnerability reviews based on the machine learning approach.

R. Rawat · S. Chirgaiya
Department of Computer Science Engineering, Shri Vaishnav Vidyapeeth Vishwavidyalaya,
Indore, India
V. Mahor
Department of Computer Science Engineering, Gwalior Engineering College, Gwalior, India
R. N. Shaw
Department of Electrical, Electronics and Communication Engineering, Galgotias University,
Greater Noida, India
e-mail: r.n.s@ieee.org
A. Ghosh (B)
School of Engineering and Applied Sciences, The Neotia University, Sarisha, West Bengal, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 113
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_9
114 R. Rawat et al.

Keywords Machine learning · Artificial intelligence · Cyber-malicious post · Sentiment analysis

1 Introduction

Sentiment analysis of cyber-malicious post review data is a complex problem; here, analyses were carried out using the naive Bayes, logistic regression, and multinomial naive Bayes (MNB) classifiers [1]. For the naive Bayes algorithm, a supervised sentiment-classification model is used. Naive Bayes is a very simple probabilistic model that tends to perform well on textual data classification and usually takes orders of magnitude less time to train compared with other machine learning classifiers.
A high degree of accuracy can be obtained with the naive Bayes classifier model, comparable to other models in sentiment classification. In addition, the distinct classifiers (naive Bayes, logistic regression, MNB) are used for further accuracy comparison [1–3]. With the availability of a huge volume of online social communication channels, review records such as IMDB, and various websites, sentiment analysis is becoming increasingly important. A sentiment classifier is built which assesses the polarity of a piece of textual data as either positive (+) or negative (−) [3]. The advent of web platforms such as Twitter, Instagram, LinkedIn, Facebook, blogs, and IMDB lets people share their comments, opinions, feelings, and judgments on a myriad of topics. These platforms contain a very large amount of information in the form of tweets, comments, blogs, post updates, ratings, reviews, and so on [4]. Sentiment analysis aims to determine the polarity of feelings such as happiness, sorrow, good, bad, hate, anger, and affection in the textual records, opinions, comments, and posts available online on all these platforms. Opinion mining and sentiment analysis find the sentiment of textual records with respect to a given source of content. Sentiment analysis is complicated by slang words, misspellings, short forms, repeated characters, use of regional languages, and newly emerging emoticons, so it is a considerable task to identify the appropriate sentiment of each word. Sentiment analysis is one of the most active research areas and is also widely studied in data mining. It is performed in almost every business and social domain, because emotions are central to most human activities and behaviors [5].
Sentiment analysis, or opinion mining, is one of the fastest-growing areas, with its applications and potential benefits increasing every day. With the spread of the web and modern technology, there has been progressive growth in the amount of data. Every individual can express opinions freely via social media. All of this data can be analyzed and used, for example in education, to draw support and improve the quality of learning. One such application is sentiment analysis of reviews, where the opinion about an issue is identified and the required information is extracted, whether it concerns product analysis or human feeling about anything material. A few such applications of sentiment analysis, and the manner in which they are applied, are explained in [6, 7].
However, finding and tracking opinion sites on the web and distilling the information contained in them remains a formidable task because of the proliferation of diverse sites. Each site typically contains a large amount of opinionated text that is not always easily interpreted in long blog and discussion-board postings [8–10]. The average reader will have difficulty identifying relevant sites and extracting and summarizing the opinions in them. Automatic sentiment analysis systems are therefore needed. Because of this, many start-ups are focusing on providing sentiment analysis services, and many large organizations have also built in-house capabilities. These practical applications and commercial interests have provided strong motivation for research in sentiment analysis. Existing studies have produced several techniques for various tasks of sentiment analysis, including both supervised and unsupervised methods. In the supervised setting, early papers used a wide range of supervised machine learning methods (including support vector machines, maximum entropy, naive Bayes, and many others) and feature combinations. Unsupervised methods comprise various techniques that exploit sentiment lexicons, linguistic analysis, and syntactic patterns. Several survey books and papers have been published which cover these early methods and applications extensively [11–13].

2 Related Work

There are numerous studies on the use of social media to prevent threats in both the physical world and cyberspace. Regarding the insider threat, there are two types of research area: technological and psychological. The first is addressed through technological analysis. Ortiz et al. [9] introduced a method for detecting malicious insiders through host- and network-based user profiling. Host-based user profiling provides a way to identify users in environments such as UNIX, Windows, and the web, while network-based user profiling identifies users by analyzing network traffic such as HTTP, SMB, SMTP, and FTP. Additionally, they set rules for identifying an "impostor" and an "internal traitor" through two analysis techniques. An impostor is a type of malicious insider who steals and impersonates a legitimate insider's identity; an internal traitor is another type of malicious insider who has been granted access to the system. Kranz et al. [10] simulated malicious behavior in an organization's system. They collected data from seven types of sources: mouse, keystrokes, host activities, network traffic, and so on. Over several periods, they collected a benign/malicious user behavior

dataset. This dataset enables the study of malicious insider behavior in an organization's system. These analytical methods have the merit of enabling systematic analysis, but they have the weakness that, as noted by Salem et al. [14], it is hard to reflect the tendencies of the individual. Accordingly, applying the technique presented in this study makes it possible to detect malicious insiders more effectively. Legg et al. [15] presented a tool for detecting malicious insiders called Corporate Insider Threat Detection (CITD) [11]. They performed profiling of users and roles based on system logs, such as logins, portable storage usage, and email transmission. This research showed the possibility of adjusting the decision process and reducing the false-positive rate when a system detects malicious behavior. Finally, Tuor et al. [12] investigated the CERT Insider Threat Dataset v6.2 of Carnegie Mellon University. They found that a neural network model outperforms principal component analysis (PCA), support vector machines (SVM), and isolation forests based on anomaly detection. They modeled the system's normal behavior and treated unusual behavior as instances of anomaly, approached as part of a process [13]. They decomposed insider behavior into a process model in which specific behaviors are broken down into subprocesses. Research in this area has contributed to establishing a theory for detecting malicious insiders based on system behavior. Nevertheless, it is essential to understand insiders' emotional state in order to investigate why they commit malicious acts.
In this context, there is also research based on psychological analysis. Salem et al. [14] showed that psychological traits of negative attitude are strongly related to an insider attacker's malicious behavior. They compared three types of techniques according to learning accuracy: a machine learning approach, a dictionary-based approach, and flat data classification. They used data from the well-known online video service YouTube, including comments, subscribers, and video access. Among these techniques, the machine learning approach showed the highest accuracy, because many words are learned by the model; by contrast, in the dictionary-based approach, words must be searched through an entire list to find which words match the dictionary, and flat data classification finds similar words with even more difficulty. Harilal et al. [15] showed how to detect crime incidents by gathering and analyzing social media data; to demonstrate this, they checked how closely the estimated event matched the real incident. Many studies have focused on analyzing the behavior of both systems and users in an organization, approaching the insider threat from different areas such as cyber-behaviors and communication behaviors, as described by Legg et al. [16]. This research is significant for analyzing insider threats not only from the perspective of cyber-behavior and communication, but also of biometric and psychosocial behavior; furthermore, they proposed future directions for insider-threat research from these perspectives. Tuor et al. [17] approached the problem from the perspective of OCEAN personality analysis: they demonstrated the relationship between Internet usage logs and personal traits. Through this research, they were able to identify psychological traits from patterns in Internet usage logs and ultimately use them to identify potentially malicious insiders. Along with system behavior analysis, studies

about psychological aspects contribute to overcoming the limitations of technological approaches by understanding the emotions of insiders.

3 Proposed Work

In the proposed work, sentiment analysis of cyber-malicious post review data is performed using naive Bayes, multinomial naive Bayes, and logistic regression. The natural language toolkit (NLTK) is used to prepare a dataset of cyber-malicious post reviews, after which the classifier algorithms are applied to produce positive and negative accuracy figures [14].
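A minimal end-to-end sketch of this pipeline, here using scikit-learn's vectorizer and two of the three classifiers named above; the example posts are invented stand-ins for the cyber-malicious post review dataset:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

# Invented stand-ins for labeled cyber-malicious post reviews.
posts = [
    "patch released vulnerability fixed great response",
    "helpful advisory clear mitigation steps",
    "malware spreading stolen credentials leaked",
    "phishing scam fake login page dangerous",
]
labels = ["positive", "positive", "negative", "negative"]

# Bag-of-words features, the usual input to these classifiers.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(posts)

preds = {}
for clf in (MultinomialNB(), LogisticRegression(max_iter=1000)):
    clf.fit(X, labels)
    new_post = vectorizer.transform(["vulnerability fixed clear steps"])
    preds[type(clf).__name__] = clf.predict(new_post)[0]

print(preds)  # both classifiers label the unseen post "positive"
```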

3.1 NLTK

The Natural Language Toolkit (NLTK) is used for building Python programs that work with human-language data. It has easy-to-use interfaces and provides a large number of lexical and corpus resources, for instance WordNet. It also provides modules for stemming, tokenization, classification, indexing, and semantic reasoning. These modules are useful for building wrappers around high-quality NLP libraries [18]. NLTK has been described as a powerful toolkit for working on machine learning [19] algorithms in Python, and it is also a remarkable library for working with regular expressions.
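As a library-independent illustration of the tokenization and feature-building step, the following uses only the standard library's regular-expression module; `bag_of_words` produces the feature-dictionary format accepted by NLTK's `NaiveBayesClassifier` (the function names are ours):

```python
import re
from collections import Counter

def tokenize(text):
    # Lowercase word tokenization with a regular expression -- a small
    # stand-in for nltk.word_tokenize.
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(tokens):
    # Token-count feature dictionary, a format accepted by
    # NLTK's NaiveBayesClassifier.
    return dict(Counter(tokens))

feats = bag_of_words(tokenize("Malicious post detected, account blocked!"))
print(feats)
```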

3.2 Machine Learning Classifier

In machine learning and statistics, classification is a supervised learning approach in which a computer program learns from the input data given to it and then uses this learning to classify new observations. The dataset may simply be bi-class (for example, deciding whether a temperature is high or low, or whether a person is male or female), or it may be multiclass. Some examples of classification problems are speech recognition, handwriting recognition, biometric [20] identification, and document classification [12–14]. Here, a naive Bayes classifier is used for the sentiment analysis of online comments and post review data.
118 R. Rawat et al.

3.3 Supervised Learning

The supervised learning approach makes predictions based on a set of examples of
online comments and post reviews. The model used for training and testing [21] is
labelled with the sentiment of each online comment or post as a positive or
negative review. Supervised learners such as naïve Bayes, logistic regression, and
MNB look for patterns in those value labels. They can use any information that may
be relevant to the online comments and post review data, such as the season, the
type of industry, or the proximity of troublesome events, and each algorithm looks
for a different kind of pattern in the data. After an algorithm has found the best
model it can, it uses that model to make predictions for unlabelled testing data
that resembles new online comments and post review information [15].
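A minimal, self-contained sketch of this supervised workflow (the labelled comments and the keyword rule are invented for illustration; the chapter's actual pipeline uses the classifiers described below):

```python
# Toy supervised flow: learn malicious cue words from labelled training
# comments, then predict labels for a held-out test comment.
labelled = [
    ("you won a free lottery ticket", "malicious"),
    ("share your account details", "malicious"),
    ("nice article, thanks for sharing", "benign"),
    ("great explanation of the topic", "benign"),
]
train, test = labelled[:3], labelled[3:]

# "Training": words seen in malicious examples but not in benign ones.
malicious_words = {w for text, lab in train if lab == "malicious" for w in text.split()}
benign_words = {w for text, lab in train if lab == "benign" for w in text.split()}
cues = malicious_words - benign_words

def predict(text):
    # Flag a comment when it shares any cue word with the malicious set.
    return "malicious" if cues & set(text.split()) else "benign"

for text, expected in test:
    print(text, "->", predict(text), "(expected:", expected + ")")
```

Real classifiers replace the hand-built cue set with probabilities learned from the labelled training data, but the train/predict structure is the same.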

3.4 Naive Bayes Classifier

It is a classification technique based on Bayes' theorem with an assumption of
independence between features. In simple terms, a naïve Bayes classifier assumes
that the presence of a particular feature in a class is unrelated to the presence
of any other feature [16, 17].
Bayes' theorem presents a way of calculating the posterior probability P(c|x)
from the prior probability P(c), the prior probability P(x), and the likelihood
P(x|c). Take a look at the equation below (Fig. 1).
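For reference, the equation that Fig. 1 depicts is presumably the standard form of Bayes' theorem, which, with the naive independence assumption over attributes x1, …, xn, reads:

```latex
P(c \mid x) = \frac{P(x \mid c)\,P(c)}{P(x)},
\qquad
P(c \mid x_1, \ldots, x_n) \propto P(c)\prod_{i=1}^{n} P(x_i \mid c)
```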
In the equation above:
• The posterior probability P(c|x) is computed for the class (c, target) given the
supplied predictor (x, attributes).
• The prior probability P(c) is the probability of the class.
• The likelihood P(x|c) is the probability of the predictor given the class.
• The prior probability P(x) is the probability of the predictor.
A small example using naïve Bayes is shown below (Table 1).
Table 2 outlines samples of the suspicious, malicious, and crime-related words
used in posts or mails by cyber-criminals.

Fig. 1 Equation—Naive
Bayes classifier

Table 1 Classification of text sentence [36, 37]

Set           Document  Review Sentence                                    Class
Training set  1         You won free lottery ticket, provide your mobile   Pos (malicious comment)
                        number and account number
              2         Share your details for personal chat               Pos (malicious comment)
              3         Online gun and drugs are available                 Pos (malicious comment)
              4         Recruitment is on at known place                   Pos (malicious comment)
Test set      –         Hate and violent comments                          Pos (malicious comment)
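The kind of calculation that underlies Table 1 can be reproduced with a tiny word-count naïve Bayes (an illustrative sketch with an invented benign class and add-one smoothing, not the chapter's exact model or data):

```python
import math
from collections import Counter

# Tiny training corpus in the spirit of Table 1; a benign class is added
# so there is a contrast for the classifier to learn.
train = [
    ("you won free lottery ticket provide your mobile number", "malicious"),
    ("share your details for personal chat", "malicious"),
    ("online gun and drugs are available", "malicious"),
    ("thanks for the helpful article", "benign"),
]

counts = {"malicious": Counter(), "benign": Counter()}
class_docs = Counter()
for text, label in train:
    counts[label].update(text.split())
    class_docs[label] += 1

vocab = {w for c in counts.values() for w in c}

def log_posterior(text, label):
    # log P(label) + sum of log P(word | label), with add-one smoothing.
    lp = math.log(class_docs[label] / sum(class_docs.values()))
    total = sum(counts[label].values())
    for w in text.split():
        lp += math.log((counts[label][w] + 1) / (total + len(vocab)))
    return lp

def classify(text):
    return max(counts, key=lambda lab: log_posterior(text, lab))

print(classify("free lottery for your mobile"))  # malicious
```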

Table 2 Word samples related to cyber-malicious posts [34, 35]

Cyber-malicious words
Scammers                  Virus
Worm                      Hacker
Exploit                   Malware
Phishing                  DoS
Funding and hate speech   Infosec
Spam                      Virus
Cyber-crime               Terrorism
Funding                   Iraq
Lottery                   Secret meeting

Real-time prediction: Naïve Bayes is an eager learning classifier and is
undeniably fast. Consequently, it can be used for making predictions in real time.
Multiclass prediction: Naïve Bayes handles multiple class variables and is well
suited for multiclass prediction, since it computes the probability of each class
directly.
Sentiment analysis and mining: Because of its high accuracy [21] in multiclass
text mining, the naïve Bayes algorithm is widely used for opinion mining and
sentiment analysis of the textual data available on the web, such as social-media
blogs, social-networking sites, and cyber-malicious post review sites. The naïve
Bayes classifier's success rate compares well with other algorithms; thus it is
widely used in sentiment mining of social-media records to distinguish genuine
from uninterested users.
Recommendation systems: Collaborative filtering combined with a naïve Bayes
classifier [22] forms the basis of recommendation systems that use data mining
and machine learning methods to filter unseen information and decide whether an
individual receiving a recommendation would accept it or not.
Gaussian: Used in analysis when the features follow a normal distribution [23].
Multinomial: Widely applied when discrete counts are available in the records,
for instance the frequency with which words occur in a document. For document
classification, the number of occurrences of each term in the content is required,
and that can be obtained by counting terms in the sample documents.
Logistic regression: Logistic regression is a statistical method for analysing a
dataset in terms of the independent [24, 25] and dependent variables of a
deterministic outcome. The method is suited to a dichotomous variable (there are
only two results, positive or negative, that is, true or false based on a binary
outcome of 1 or 0) for large-scale classification of the data. Logistic regression
is an extension of the linear regression technique based on positive and negative
factors [18].
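The dichotomous decision described here rests on the logistic (sigmoid) function; a minimal sketch with illustrative, hand-picked weights rather than fitted ones:

```python
import math

def sigmoid(z):
    # Logistic function: maps any real score to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias):
    # Linear score followed by the logistic squashing; >= 0.5 -> class 1.
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = sigmoid(z)
    return p, int(p >= 0.5)

# Hypothetical features: [count of malicious cue words, comment length / 10]
p, label = predict([3, 1.2], weights=[1.5, -0.2], bias=-2.0)
print(round(p, 3), label)
```

In practice the weights and bias would be fitted to the labelled training data by maximum likelihood rather than chosen by hand.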
MNB classifier: Simple probabilistic classifiers [26] based on the naïve Bayes
machine learning algorithm are used for data classification. This classifier was
introduced under various names in the early 1960s and remains a baseline method
for text grouping and classification of subjective content [19]. The task of
assigning documents of arbitrary nature to different categories, such as sports or
politics, treats multiple occurrences of a word as a feature. Various
pre-processing techniques are used in this highly competitive area, and a naïve
Bayes variant is used in our proposed algorithm.
It is used extensively in prediction frameworks. The multinomial event-based
model [27] uses feature vectors extracted from samples and represented as
frequencies.
Here, the probability of each class is represented as (p1, …, pn), where pi is the
number of events occurring for feature i (K being the number of classes in the
multinomial model). The vector x = (x1, …, xn) is the feature vector, a histogram
representation of a particular instance counting the various events. In
particular, the event model is used in text-document analysis for classification
purposes, where the number of occurrences of each word is counted over the events
present in the sample documents. The process of observing data expressed as a
histogram x is given by the following formula (Fig. 2).
When log-space is used, multinomial naïve Bayes acts like a linear naïve Bayes
classification method.
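The formula referenced here (Fig. 2a, b) is presumably the standard multinomial naive Bayes likelihood for a histogram x = (x1, …, xn) and its log-space, linear form:

```latex
p(\mathbf{x} \mid C_k) = \frac{\left(\sum_{i} x_i\right)!}{\prod_{i} x_i!}\,\prod_{i} p_{ki}^{x_i},
\qquad
\log p(C_k \mid \mathbf{x}) \propto \log p(C_k) + \sum_{i} x_i \log p_{ki}
```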

Fig. 2 a, b Equation—MNB
classifier

4 Dataset Collection

The Sentiment140 web service is used for finding the malicious information,
together with the IMDB dataset [38]. This dataset contains 50,000 positive and
negative post reviews crawled from online sources, with an average length of
215.63 words per sample.

5 Methodology

Different sentiment-analysis techniques are combined with various machine
learning classifiers and various features for analysing cyber-malicious [28] post
reviews, together with several pre-processing steps. Features such as positive and
negative word detection are used. Finally, different machine learning algorithms
with various classifiers have been applied to the data in the earlier works of
other researchers.
The overall workflow is applicable to a range of problems and starts with very
simple steps such as data collection. Most of the work consists of obtaining the
unstructured data and cleaning it, which is a delicate task that must be handled
carefully. Several steps are needed to decide where to begin the work and to move
towards completing it through modelling the raw data. A review corpus of
cyber-malicious posts is prepared from the text data, and sentiment analysis is
performed step by step [29]. Various procedures are required for loading the
data, then cleaning it and removing errors such as unexpected words. A
vocabulary is also created, made usable, and saved to a file (Fig. 3).

Fig. 3 Data processing framework

The cyber-malicious post reviews require a properly thorough cleaning cycle and
vocabulary. The cyber-malicious post reviews are characterized beforehand and
then saved into new files to be used subsequently [17, 18] (Fig. 4). The steps
are:
• Collect the new cyber-malicious post review dataset
• Load the raw available text data
• Clean the raw available data
• Design the vocabulary (malicious content)
• Collect and save the prepared data.
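The steps above can be sketched as a small pipeline (file handling is omitted, and the cleaning rules are illustrative assumptions, not the chapter's exact code):

```python
import re
from collections import Counter

def clean(text):
    # Drop HTML remnants, lowercase, and keep alphabetic tokens only.
    text = re.sub(r"<[^>]+>", " ", text)
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(reviews, min_count=1):
    # Count tokens over all reviews; keep those seen at least min_count times.
    counts = Counter(tok for review in reviews for tok in clean(review))
    return {tok for tok, n in counts.items() if n >= min_count}

reviews = [
    "<b>Free lottery!</b> Share your account number",
    "Online guns and drugs are available",
]
vocab = build_vocabulary(reviews)
print(sorted(vocab))
```

The resulting vocabulary would then be saved to a file and reused when turning each review into features.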
Scripted tooling has been used for collecting the cyber-malicious post data,
drawing on different cyber-malicious post sources and also using content from the
website pythonprogramming.net. The various positive and negative cyber-malicious
post reviews from the dataset are used for checking the accuracy of the
algorithm [30].

Fig. 4 Process model flow
The content used in our proposed model consists of several record sets and data
files, with data taken from various cyber-malicious post reviews. The dataset is
separated into training and test sets for benchmarking; the sentences are
decoupled from their original order so as to produce a more effective training
dataset. Sentences in the dataset are parsed using well-known resources such as
the Stanford tools, so that each phrase receives a phrase ID and each sentence a
sentence ID. Words that are repeated in several places are included only once,
purifying the dataset of duplicates. For data cleaning, the algorithm is trained
and filtration [31] is used to delete the missing values.
Online content used in several datasets contains various noise terms, such as
tags and HTML scripts inserted for advertising. Such words cause various problems
and make classification considerably harder, so much of the legitimate work goes
into processing and reducing the noise values, so that the cleaned text recovers
the performance of the algorithm and the speed of the classification methods [31].
Around 25,000 reviews [32] are taken from various sites, containing both positive
and negative reviews; these reviews are placed in separate text files for the
positive and the negative classes [33–36]. Eighty percent of the sentences are
used for training, and 20% of the sentences are used for testing purposes [37]. A
score for each word in the dataset is computed from the training dataset [38–42].
The list is transformed into a dictionary from which the training set is derived
and stored, together with each of the calculated scores.
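The shuffling and 80/20 division described above can be written as follows (the seed and placeholder sentences are illustrative):

```python
import random

sentences = [f"review {i}" for i in range(25000)]  # placeholder review texts

random.seed(42)            # fixed seed so the split is reproducible
random.shuffle(sentences)  # decouple sentences from their original order

split = int(0.8 * len(sentences))
train_sentences = sentences[:split]  # 80% for training
test_sentences = sentences[split:]   # 20% for testing
print(len(train_sentences), len(test_sentences))  # 20000 5000
```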

6 Implementation

This section covers the implementation part of the methodology, based on the
various classifiers used in this work with supervised machine learning. For the
cyber-malicious post review dataset, a division of the text into positive and
negative examples is used and required. The dataset is additionally split into
three different configurations, each with a separate ratio used for training and
testing.

training_dataset = feature_dataset[:18000]

testing_dataset = feature_dataset[7000:]

In the second step, we classify the data using well-known classifiers and train our
classifier like:

classifier = nltk.NaiveBayesClassifier.train(training_dataset)

After the training portion of the classifier, there is a test step in the next section.

print("Classification accuracy (%):",
      nltk.classify.accuracy(classifier, testing_dataset) * 100)

This classifier is based on the NLTK classifier, and all of the methods in this
study use Python and the NLTK classifier interface.

from nltk.classify import ClassifierI
from statistics import mode

Now a voting classifier class can be developed:

class DivisionClassifier(ClassifierI):

    def __init__(self, *classifiers):
        self._classifiers = classifiers

By inheriting from NLTK’s classifier, [33] calling of class division classifier is


to be done followed by assignment of classifiers list that is passed to the class for
self-classification. And, to further invoke, calling is required for classification.

    def classify(self, features):
        division = []
        for sc in self._classifiers:
            value = sc.classify(features)
            division.append(value)
        return mode(division)

Classification is then done based on the features; each classifier's vote is
treated as a division, and when iteration is complete the most common division
(the mode) is returned. A confidence measure is also provided for the algorithm;
the confidence method calculates the confidence over the features:

    def confidence(self, features):
        division = []
        for sc in self._classifiers:
            vote = sc.classify(features)
            division.append(vote)

        choice_division = division.count(mode(division))
        confidence = choice_division / len(division)
        return confidence
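Put together, the voting classifier sketched above can be exercised end-to-end; here stub classifiers stand in for trained NLTK models, and the NLTK ClassifierI base class is omitted so the example is self-contained:

```python
from statistics import mode

class DivisionClassifier:
    """Majority-vote wrapper mirroring the class defined above."""

    def __init__(self, *classifiers):
        self._classifiers = classifiers

    def classify(self, features):
        votes = [c.classify(features) for c in self._classifiers]
        return mode(votes)

    def confidence(self, features):
        votes = [c.classify(features) for c in self._classifiers]
        return votes.count(mode(votes)) / len(votes)

class StubClassifier:
    # Stand-in for a trained classifier: always returns a fixed label.
    def __init__(self, label):
        self._label = label

    def classify(self, features):
        return self._label

voter = DivisionClassifier(StubClassifier("pos"),
                           StubClassifier("pos"),
                           StubClassifier("neg"))
print(voter.classify({}))    # pos
print(voter.confidence({}))  # 2 of 3 classifiers agree
```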

The dataset has positive and negative statements, and with their help we can
train our model. The dataset is divided into two parts of 25,000 (positive and
negative) cyber-malicious post reviews. The new dataset is represented here in a
compatible form, as before.

short_positive = open("short_analyses/positive_record.txt", "r").read()

short_negative = open("short_analyses/negative_record.txt", "r").read()

documents = []

for record in short_positive.split('\n'):
    documents.append((record, "positive"))

for record in short_negative.split('\n'):
    documents.append((record, "negative"))

full_words = []

short_positive_review_words = word_tokenize(short_positive)

short_negative_review_words = word_tokenize(short_negative)

for word in short_positive_review_words:
    full_words.append(word.lower())

for word in short_negative_review_words:
    full_words.append(word.lower())

full_words = nltk.FreqDist(full_words)

With the feature-finding function applied, feature sets are created from the
tokenized words for each new sample document, thereby building up the record of
common words.

word_features = list(full_words.keys())[:5000]

def find_features(document):
    words = word_tokenize(document)
    features = {}
    for word in word_features:
        features[word] = (word in words)
    return features

feature_sets = [(find_features(review), category) for (review, category) in documents]
random.shuffle(feature_sets)

Each word is then tested: if it exists in the word-score list, its score is added
to the review score. Otherwise, the word in the word-score list closest to the
unidentified word is found and its score is added to the review score. Finally,
the classifier's precision is checked and the result is displayed (Fig. 5).
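The feature-extraction step can be checked on a toy vocabulary (a self-contained rerun of find_features with a trivial whitespace tokenizer in place of NLTK's):

```python
word_features = ["free", "lottery", "thanks"]  # toy top-word list

def find_features(document):
    # Boolean bag-of-words over the chosen vocabulary.
    words = set(document.lower().split())
    return {w: (w in words) for w in word_features}

features = find_features("You won a free lottery ticket")
print(features)  # {'free': True, 'lottery': True, 'thanks': False}
```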

7 Result and Outcomes

The cyber-malicious post review dataset of 25,000 records is divided and
evaluated with the three classifiers under different ratios: 90% training data
and 10% testing data, 80% training and 20% testing, and 70% training and 30%
testing. Naïve Bayes, multinomial naïve Bayes, and logistic regression are the
three classifiers used to find the accuracy on negative and positive
cyber-malicious post reviews. The accuracy rates are shown in Table 3 below.

8 Conclusion

This chapter assessed cyber-malicious post reviews using different classifiers,
drawing on corpora and online social data-collection sites such as Twitter, IMDB,
Instagram, and Facebook. The evaluation used several feature sets and different
learning methods, namely multinomial naïve Bayes, logistic regression, and naïve
Bayes, on a collection of cyber-malicious post reviews labelled by their polarity
(positive/negative). The results show that even a simple classifier model can
perform reasonably well, and it can be further refined by choosing features based
on syntactic and semantic information from the cyber-malicious post review
content. This study explored the effect of the feature vector on classification
accuracy. The corpus examined here contains sentences from cyber-malicious post
comments and reviews; the results reveal that the corpus reflects the relative
polarity of the words. Furthermore, the NLTK technique also characterizes

Fig. 5 Flow chart of proposed design

Table 3 Classifier results with evaluated values

Sr. No.  Dataset (ratio) of cyber-malicious post reviews  Classifier           Accuracy (%)
1        2000 (90/10)                                     Naive Bayes          83.37
2        2000 (90/10)                                     MNB                  86.25
3        2000 (90/10)                                     Logistic regression  97.68
4        2000 (80/20)                                     Naive Bayes          81.37
5        2000 (80/20)                                     MNB                  85.25
6        2000 (80/20)                                     Logistic regression  94.68
7        2000 (70/30)                                     Naive Bayes          82.37
8        2000 (70/30)                                     MNB                  84.25
9        2000 (70/30)                                     Logistic regression  92.68

the data for us. The proposed model is only an initial step towards improving
sentiment-analysis techniques. It is worth investigating the limits of the model
on dynamic data and extending the evaluation using hybrid methods for concept
analysis. There is significant scope for improvement in corpus creation and in
suitable pre-processing and feature selection, which would further raise
classifier accuracy.

9 Future Work

The proposed model works on an available dataset. In further research, complete
social-media analysis of suspicious posts is to be performed on live streams or
postings, together with vulnerability analysis; if any malicious behaviour is
traced by an intelligent engine on the web platform, an alert is generated to the
administrator or the mapped security agencies so that legal action can be taken.
Another direction is autonomous-vehicle-based sentiment analysis, since
information is shared on the IoT platform, together with the design of an
automatic web-content crawler that runs at scheduled intervals to obtain a log of
online social vulnerabilities for entire defined networks.

References

1. B. Pang, L. Lee, S. Vaithyanathan, Thumbs up?: sentiment classification using machine learning
techniques. in Association for Computational Linguistics (2002), pp. 79–86
2. B. Liu, Sentiment analysis and subjectivity. in Handbook of Natural Language Processing, 2nd
edn. (2010)
3. B. Liu, Sentiment Analysis and Opinion Mining. Synthesis Lectures on Human Language
Technologies (2012)
4. B. Liu, Sentiment Analysis and Opinion Mining (Morgan & Claypool Publishers, 2012)
5. W. Li, H. Chen, Identifying top sellers in underground economy using deep learning-based
sentiment analysis. IEEE Joint Intell. Secur. Inf. Conf. 2014, 64–67 (2014)
6. Q. Peng, M. Zhong, Detecting spam review through sentiment analysis. J. Softw. 9(8), 2065–
2072 (2014)
7. Statista, Internet of Things (IoT) connected devices installed base worldwide from 2015 to
2025 (in billions). (2016)
8. Statista, Number of social media users worldwide from 2010 to 2021 (in billions). (2017)
9. A.M. Ortiz, D. Hussein, S. Park, S.N. Han, N. Crespi, The cluster between internet of things
and social networks: review and research challenges. IEEE Internet Things J. 1(3), 206–215
(2014)
10. M. Kranz, L. Roalter, F. Michahelles, Things that twitter: social networks and the internet
of things, what can Internet Things do Citiz. work. in Proceedings of the 8th International
Conference on Pervasive Computing (Pervasive ‘10) (2010), pp. 1–10
11. X. Liu, M. Zhao, S. Li, F. Zhang, W. Trappe, A security framework for the internet of things
in the future internet architecture. Future Internet, 9(3), (2017)
12. Cybersecurity Insiders and Crowd Research Partners, Insider threat 2018 (2017)
13. C. Colwill, Human factors in information security: the insider threat—who can you trust
these days? Inf. Secur. Tech. Rep. 14(4), 186–196 (2009); N.S. Safa, R.V. Solms, S. Furnell,
Information security policy compliance model in organizations. Comput. Secur. 56, 1–13 (2016)

14. M.B. Salem, S. Hershkop, S.J. Stolfo, A survey of insider attack detection research. Adv. Inf.
Secur. 39, 69–70 (2008)
15. A. Harilal, F. Toffalini, J. Castellanos, J. Guarnizo, I. Homoliak, M. Ochoa, TWOS: a dataset
of malicious insider threat behavior based on a gamified competition. in Proceedings of the
2017 International Workshop on Managing Insider Security Threats (ACM, 2017), pp. 45–56
16. P.A. Legg, O. Buckley, M. Goldsmith, S. Creese, Caught in the act of an insider attack: detection
and assessment of insider threat. in 2015 IEEE International Symposium on (IEEE, 2015),
pp. 1–6
17. A. Tuor, S. Kaplan, B. Hutchinson, N. Nichols, S. Robinson, Deep learning for unsupervised
insider threat detection in structured cybersecurity data streams. https://arxiv.org/abs/1710.
00811
18. M. Bishop, H.M. Conboy, H. Phan, et al., Insider threat identification by process analysis. in
Proceedings of the 2014 IEEE Security and Privacy Workshops (SPW) (San Jose, California,
USA, May 2014) pp. 251–264
19. M. Kandias, V. Stavrou, N. Bozovic, D. Gritzalis, Proactive insider threat detection through
social media: the YouTube case. in Proceedings of the 1st ACM Workshop on Language
Support for Privacy-Enhancing Technologies, PETShop 2013—Co-located with the 20th
ACM Conference on Computer and Communications Security, CCS 2013 (Germany, 2013)
pp. 261–266
20. V. Marivate, P. Moiloa Catching crime: detection of public safety incidents using social media.
in Proceedings of the 2016 Pattern Recognition Association of South Africa and Robotics and
Mechatronics International Conference, PRASA-RobMech 2016 (South Africa, 2016)
21. L.L. Ko, D.M. Divakaran, Y.S. Liau, V.L.L. Thing, Insider threat detection and its future
directions. Int. J. Secur. Netw. 12(3), 168–187 (2017)
22. B.A. Alahmadi, P.A. Legg, J.R. Nurse, Using internet activity profiling for insider-threat detec-
tion. in Proceedings of the 12th Special Session on Security in Information Systems (Barcelona,
Spain, 2015), pp. 709–720
23. Dataset of Sentiment140. https://help.sentiment140.com/for-students/
24. E. Loper, S. Bird, NLTK: the natural language toolkit. in Proceedings of the 42nd Annual
Meeting of the Association for Computational Linguistics (2004), pp. 1–4
25. D.M. Blei, Probabilistic topic models. Commun. ACM 55(4), 77–84 (2012)
26. A. Dundar, J. Jin, E. Culurciello, Convolutional Clustering for Unsupervised Learning (2015),
pp. 1–11
27. M. Ester, H.-P. Kriegel, J. Sander, X. Xu, A density-based algorithm for discovering clusters
in large spatial databases with noise. in Proceedings of the 2nd International Conference on
Knowledge Discovery and Data Mining (KDD ‘96), vol. 96 (1996), pp. 226–231
28. W. Park, Y. You, K. Lee, Twitter sentiment analysis using machine learning. in Research
Briefs on Information and Communication Technology Evolution. https://rbisyou.wixsite.com/
rebicte/volume-3-2017, (2017)
29. S. Feng, J.S. Kang, P. Kuznetsova, Y. Choi, Connotation lexicon: a dash of sentiment beneath
the surface meaning. in Proceedings of the 51st Annual Meeting of the Association for
Computational Linguistics, vol. 1 (2005), pp. 1774–1784
30. M. Losada, E. Heaphy, The role of positivity and connectivity in the performance of business
teams: a nonlinear dynamics model. Am. Behav. Sci. 47(6), 740–765 (2004)
31. S. Ben-David, S. Shalev-Shwartz, Understanding Machine Learning: From Theory to Algo-
rithms (2014)
32. T. Joachims, Text categorization with support vector machines: learning with many relevant
features. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial
Intelligence and Lecture Notes in Bioinformatics): Preface, vol. 1398 (1998), pp. 137–142
33. S. Mandal, V.E. Balas, R.N. Shaw, A. Ghosh, Prediction analysis of idiopathic pulmonary
fibrosis progression from OSIC dataset. in 2020 IEEE International Conference on Computing,
Power and Communication Technologies (GUCON) (Greater Noida, India, 2020), pp. 861–865.
https://doi.org/10.1109/GUCON48875.2020.9231239

34. M. Kumar V.M. Shenbagaraman, R.N. Shaw, A. Ghosh Predictive data analysis for energy
management of a smart factory leading to sustainability. in Innovations in Electrical and Elec-
tronic Engineering, ed. by M. Favorskaya, S. Mekhilef, R. Pandey, N. Singh. Lecture Notes
in Electrical Engineering, vol. 661 (Springer, Singapore, 2021). https://doi.org/10.1007/978-
981-15-4692-1_58
35. S. Mandal, S. Biswas, V.E. Balas, R.N. Shaw, A. Ghosh, Motion prediction for autonomous
vehicles from Lyft dataset using deep learning. in 2020 IEEE 5th International Conference on
Computing Communication and Automation (ICCCA) (Greater Noida, India, 2020), pp. 768–
773. https://doi.org/10.1109/ICCCA49541.2020.9250790
36. Y. Belkhier, A. Achour, R.N. Shaw, Fuzzy passivity-based voltage controller strategy of grid-
connected PMSG-based wind renewable energy system. in 2020 IEEE 5th International Confer-
ence on Computing Communication and Automation (ICCCA) (Greater Noida, India, 2020),
pp. 210–214. https://doi.org/10.1109/ICCCA49541.2020.9250838
37. R.N. Shaw, P. Walde, A. Ghosh, IOT based MPPT for performance improvement of solar
PV arrays operating under partial shade dispersion. in 2020 IEEE 9th Power India Interna-
tional Conference (PIICON) (SONEPAT, India, 2020), pp. 1–4. https://doi.org/10.1109/PII
CON49524.2020.9112952
38. S. Schrauwen, Machine Learning Approaches to Sentiment Analysis Using the Dutch Netlog
Corpus (2010)
39. https://sersc.org/journals/index.php/IJAST/article/view/19025/9664
40. https://arxiv.org/pdf/1812.05271.pdf
41. https://rpubs.com/hoakevinquach/SMS-Spam-or-Ham-Text
42. https://www.kaggle.com/ishansoni/sms-spam-collection-dataset
Analysis of Various Mobility Models
and Their Impact on QoS in MANET

Munsifa F. Khan and Indrani Das

Abstract Mobile ad hoc networks (MANETs) provide communication of mobile


nodes in an infrastructure-less environment. Factors like the mobility of nodes,
dynamic topology of the nodes, path breakages, limited availability of resources, etc.,
affect QoS. This chapter examines important mobility models that are used for
simulation. We have considered different mobility models, namely Random Way Point,
Gauss Markov, Random Walk-2D, Random Direction-2D, and Constant Velocity.
The parameters used for the experiment are PDR, delay, and throughput to check the
impact on QoS in MANET using NS-3. It is noted that the Gauss Markov model
furnished better results with respect to PDR and delay, whereas the Constant
Velocity and Random Way Point models perform better for throughput among the
mobility models considered. Researchers will benefit from this chapter, as it
shows which mobility model gives better QoS support in MANET.

Keywords AODV · MANET · Mobility model · Performance analysis · QoS

1 Introduction

In MANET, nodes are self-organizing, self-describing, independent, and adaptive due


to the absence of any central coordinator. As MANET is a wireless network, commu-
nication occurs by broadcasting messages to its neighbor nodes. In MANETs, nodes
are in motion, and their associations are dynamic. The absence of a framework makes
the network more flexible and robust. The distinctive qualities of a MANET are
its changing node topology, lack of central coordination, lack of precise state
information, node mobility, and inadequate resources such as bandwidth and
battery power, which make it an emerging topic for research [1, 2]. It is also
necessary to use energy efficiently to improve network lifetime [3, 4].

M. F. Khan (B) · I. Das


Department of Computer Science, Assam University, Silchar, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 131
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_10
132 M. F. Khan and I. Das

In MANET, users are allowed to exchange information regardless of their


geographic position. Today, MANET is considered one of the best and fastest-
growing emerging technologies for mobile computing due to the increase in reli-
able, influential, and manageable devices. It is obtaining significance in well-known
application fields like commercial, military, and private sectors due to the wireless
network’s growth [5]. Some of the application areas are:
• Military Sector: These days, military tools consist of computer equipment. The
ad hoc network helps to generate a network among the soldiers, military tanks,
and headquarters in order to exchange information among them. It is mainly used
in the war fields for communication. It is one of the crucial applications of the
MANET. The key QoS parameters considered in this application are delay, jitter,
and bandwidth [1].
• Emergency operation: For natural disasters like fire, flood, or earthquake, ad hoc
networks can be used for emergency operations such as search and rescue and
disaster-relief efforts. In those cases it is very beneficial and easy to use ad
hoc networks, as all of the fixed equipment may be destroyed, or the area may be
too inaccessible for communication. In this application, network availability and
battery life are the QoS parameters [1].
• Sensor Networks: A massive number of small sensors communicate with each
other via a wireless network. The capacity of each sensor is very restricted, and
every sensor depends on other sensors for the transmission of data to a central
computer. The computational power of a sensor device is minimal [6].
Routing plays a vital role in providing QoS support in MANET. QoS is the efficiency
level of attributes provided to the user by a network during communication. To
obtain higher network performance in MANET, it is required to provide better QoS
[7, 8]. We analyze various mobility models to examine their impact on QoS in
MANET using a standard routing protocol, namely AODV. The mobility of
nodes has an enormous influence on the network [9]. It is essential to observe node
mobility while routing to improve QoS. The motion of nodes can be
predicted using a mobility model in the simulation. During simulation,
the mobility model produces different position traces of the mobile nodes for different scenarios.
Position traces are the locations of the mobile nodes at different instants of time.
These traces help to reveal the differences in the performance of routing protocols
in simulation. Mobility models are used primarily for simulation, but they can also
be used for analytical and test-bed methods. Mobility models can be classified into
three types [8]:
• Stochastic Mobility Models: These models mainly rely on the node’s arbitrary
motions and are not restricted to any precise scenario.
• Detailed Mobility Models: Unlike the stochastic model, these models are
personalized, and they are designed for specific scenarios.
• Hybrid Mobility Models: These models are designed using both the stochastic
and detailed mobility models.
Analysis of Various Mobility Models and Their Impact … 133

We have done a simulation analysis of a few mobility models, namely Random
Way Point, Gauss Markov, Random Walk-2D, Random Direction-2D, and Constant
Velocity, on AODV. Our work gives a clear experimental view of the different mobility
models, making it easier for researchers to decide which mobility model to choose
for a given environment. Moreover, it is noted that Gauss Markov yields better
results with respect to PDR and delay, whereas the Constant Velocity and Random
Way Point models perform better for throughput among the other mobility models.
The chapter is arranged as follows: Sect. 2 presents a summary of the existing mobility
models; Sect. 3 describes the AODV routing protocol, while Sect. 4 covers the simulation
and performance analysis. Section 5 discusses the results, and Sect. 6 concludes
the chapter.

2 Existing Mobility Models

The mobility model is an essential function for simulation in MANET. It is a process
of modeling the mobility of nodes: it expresses the changes in the mobile nodes
by demonstrating their variation in position, speed, velocity, direction, and acceleration
at a particular instant of time. In reality, it is challenging to predict how a
node moves, and a mobility model helps us to understand the movement of nodes. It is
important to select a mobility model suited to the environment in order to simulate in a
more realistic manner. The features of each mobility model vary. This chapter
gives a brief idea of the impact of different mobility models on AODV, and how
they affect the QoS in MANET. It will also give a clear view regarding the selection
of a mobility model for the corresponding environment.

2.1 Random Way Point

Johnson and Maltz proposed this model [10]. In this model, the initial
deployment of nodes is arbitrary, and each node is independent [11]. The functioning
of the model is as follows:
At first, a node selects a certain location as a target and begins to move in its
direction with a velocity chosen from [min_vel, max_vel] [12–14]. When it reaches
the selected target, the node halts for some time, called the pause time. When
the pause time ends, the node starts moving toward another target with a newly
chosen velocity [12–15]. This model is memoryless [14].
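The movement rule above can be sketched in a few lines of Python. This is only an illustrative trace generator, not the NS-3 implementation; the bounds, speed range, pause time, and seed are arbitrary choices for the example:

```python
import random

def random_waypoint(bounds, min_vel, max_vel, pause, steps, seed=0):
    """Generate (time, x, y) position traces for one Random Way Point node."""
    rng = random.Random(seed)
    x, y = rng.uniform(0, bounds[0]), rng.uniform(0, bounds[1])
    t, trace = 0.0, [(0.0, x, y)]
    for _ in range(steps):
        tx, ty = rng.uniform(0, bounds[0]), rng.uniform(0, bounds[1])  # new target
        v = rng.uniform(min_vel, max_vel)                              # newly chosen velocity
        travel = ((tx - x) ** 2 + (ty - y) ** 2) ** 0.5 / v            # time to reach target
        t += travel
        x, y = tx, ty
        trace.append((t, x, y))        # node has reached the target
        t += pause                     # pause time before the next leg
    return trace

trace = random_waypoint((500, 400), 1.0, 20.0, pause=2.0, steps=5)
```

Each trace entry records when the node arrives at a target; the memoryless property shows up in the fact that the next leg depends only on the current position.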
134 M. F. Khan and I. Das

2.2 Random Walk-2D

In physics, the Random Walk was originally developed to imitate the erratic motion
of particles. It is used to imitate the movement of mobile nodes that
move randomly. It is also memoryless: each node begins moving to a new
position by selecting a new speed and direction. The node motion is updated either
after a constant time period t or a constant distance d. At a specific time period t,
the speed v(t) of a node is drawn from [min_speed, max_speed], and a direction
θ(t) from [0, 2π] is obtained. Consequently, during the time period t, the node moves with a
velocity of [v(t)cosθ(t), v(t)sinθ(t)]. If a mobile node reaches the simulation area's
border, it bounces back with an angle θ(t); this situation is termed
the border effect. This model has been further extended to 3D and n-D [13, 16].
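A minimal sketch of the Random Walk-2D rule with the border effect, assuming uniform draws for speed and direction and simple reflection at the edges (the bounds, speed range, and step time are illustrative):

```python
import math
import random

def random_walk_2d(bounds, min_speed, max_speed, step_time, steps, seed=1):
    """One node's Random Walk-2D trace; direction in [0, 2π), reflect at borders."""
    rng = random.Random(seed)
    x, y = bounds[0] / 2, bounds[1] / 2          # start in the middle of the area
    trace = [(x, y)]
    for _ in range(steps):
        v = rng.uniform(min_speed, max_speed)
        theta = rng.uniform(0, 2 * math.pi)
        x += v * math.cos(theta) * step_time     # move for one constant time period
        y += v * math.sin(theta) * step_time
        # border effect: bounce the node back into the simulation area
        if x < 0 or x > bounds[0]:
            x = -x if x < 0 else 2 * bounds[0] - x
        if y < 0 or y > bounds[1]:
            y = -y if y < 0 else 2 * bounds[1] - y
        trace.append((x, y))
    return trace

trace = random_walk_2d((500, 400), 1.0, 20.0, step_time=2.0, steps=200)
```

The reflection branch is what keeps nodes inside the area; without it a random walk drifts off the simulation boundary.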

2.3 Random Direction-2D

Random Direction is a memoryless mobility model like the previous two models,
where the motion of nodes depends on an arbitrary direction d. At a time period
t, the speed v(t) of a node is drawn from [min_speed, max_speed] and a direction d
from [0, 180] is acquired. The node moves forward with these values until it reaches
the boundary. When it reaches the edge, it pauses, chooses a new direction and speed,
and proceeds accordingly [2, 17, 18].
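The leg-by-leg behaviour can be sketched as follows. This is an illustrative version only: the heading is drawn from the full [0, 2π) circle rather than the [0, 180] range quoted above, and the boundary hit is computed analytically:

```python
import math
import random

def random_direction_2d(bounds, min_speed, max_speed, pause, legs, seed=2):
    """Random Direction-2D: travel along a heading until an edge, then pause."""
    rng = random.Random(seed)
    x, y = bounds[0] / 2, bounds[1] / 2
    t, trace = 0.0, [(0.0, x, y)]
    for _ in range(legs):
        v = rng.uniform(min_speed, max_speed)
        theta = rng.uniform(0, 2 * math.pi)
        dx, dy = math.cos(theta), math.sin(theta)
        # distance to the first boundary hit along (dx, dy)
        dists = []
        if dx > 0: dists.append((bounds[0] - x) / dx)
        if dx < 0: dists.append(-x / dx)
        if dy > 0: dists.append((bounds[1] - y) / dy)
        if dy < 0: dists.append(-y / dy)
        d = min(dists)
        x, y = x + d * dx, y + d * dy            # node is now on the boundary
        t += d / v + pause                       # travel time plus pause at the edge
        trace.append((t, x, y))
    return trace

trace = random_direction_2d((500, 400), 1.0, 20.0, pause=2.0, legs=10)
```

The contrast with Random Walk-2D is that a node here always travels all the way to the edge before changing direction, which spreads nodes more evenly over the area.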

2.4 Gauss Markov

Liang and Haas presented this memory-based model [19]. The speed and direction
of a node are computed by the following formulas [12, 20–22]:

St = αSt−1 + (1 − α)Ś + √(1 − α²) Sxt−1 (1)

Dt = αDt−1 + (1 − α)D́ + √(1 − α²) Dxt−1 (2)

where St and Dt are the new speed and direction at time interval t, Ś and D́ are the
mean speed and mean direction, Sxt−1 and Dxt−1 are Gaussian random variables,
and α is a tuning parameter with 0 < α < 1.
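Equations (1) and (2) can be iterated directly. The sketch below assumes Gaussian random terms with illustrative standard deviations and a mean speed of 10 m/s; it is not the exact Table 2 configuration:

```python
import math
import random

def gauss_markov(steps, alpha, mean_speed, mean_dir, seed=3):
    """Iterate Eqs. (1)-(2): each new speed/direction mixes the previous value,
    the mean, and a Gaussian random term scaled by sqrt(1 - alpha^2)."""
    rng = random.Random(seed)
    s, d = mean_speed, mean_dir
    out = []
    scale = math.sqrt(1 - alpha ** 2)
    for _ in range(steps):
        s = alpha * s + (1 - alpha) * mean_speed + scale * rng.gauss(0, 1)    # Eq. (1)
        d = alpha * d + (1 - alpha) * mean_dir + scale * rng.gauss(0, 0.2)    # Eq. (2)
        out.append((s, d))
    return out

samples = gauss_markov(1000, alpha=0.85, mean_speed=10.0, mean_dir=math.pi / 2)
```

With α near 1 the node's speed and direction change smoothly (strong memory); with α near 0 the model degenerates toward a memoryless random walk.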

2.5 Constant Velocity

This is also a memoryless mobility model: the velocity of a node is fixed once it is
initialized and changes only when set explicitly [18].

3 Ad Hoc On-demand Distance Vector Routing Protocol

It is an on-demand routing protocol: a route is discovered when a node wishes to
communicate with other nodes [1, 2, 23–27]. A source node broadcasts Route
Request packets to its neighboring nodes. When a node receives a Route Request
packet, it sends back a Route Reply packet if it has a route to the
destination; otherwise, it forwards the Route Request packet to its adjacent nodes [28, 29]. In
case of route failure, a Route Error packet is sent to the neighbor nodes [24, 25, 29].
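In the simplest case, the discovery procedure above reduces to a breadth-first flood. The sketch below is a toy model only (no sequence numbers, timeouts, or route maintenance), and the topology is invented for the example:

```python
from collections import deque

def aodv_route_discovery(adj, src, dst):
    """Simplified RREQ flood: nodes rebroadcast the Route Request to their
    neighbours; the reverse path recorded along the way becomes the route
    once the destination is reached (plain BFS here)."""
    parent = {src: None}
    q = deque([src])
    while q:
        node = q.popleft()
        if node == dst:                      # destination answers with a Route Reply
            route, n = [], node
            while n is not None:
                route.append(n)
                n = parent[n]
            return route[::-1]
        for nb in adj.get(node, []):         # forward the RREQ to adjacent nodes
            if nb not in parent:
                parent[nb] = node
                q.append(nb)
    return None                              # no route found: Route Error case

topology = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
print(aodv_route_discovery(topology, "A", "E"))  # prints ['A', 'B', 'D', 'E']
```

The real protocol additionally uses destination sequence numbers to keep routes fresh and Route Error messages to tear down broken ones; none of that is modelled here.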

4 Performance Evaluation

4.1 Simulation Environment

The experiment is done in NS-3. The parameters and their values are given in Table
1. We have done rigorous experiments on AODV with sets of 25, 50, 75, and 100
nodes using various mobility models. The parameters and their values for the GM, RWP,
RW, and RD mobility models are given in Tables 2, 3, 4 and 5, respectively.

Table 1 Simulation environment


Parameters Values
Routing protocols AODV
Mobility models Random way point, Random direction-2D, Random walk-2D, Gauss
Markov, and Constant velocity
Propagation delay model Constant speed propagation delay model
Propagation loss model Friis propagation loss model
Position allocator Random rectangle
Number of nodes 25, 50, 75 and 100
Total simulation time 100 s
Number of flows 10
Type of traffic CBR
Packet size 512 kbps
Data rate 2048 bps

Table 2 Parameters for GM


Parameters Values
Bounds X [0, 150000] Y [0, 150000] Z [0, 10000]
Time step 0.5 s
Alpha 0.85
Mean velocity Min: 0 and Max: 20
Mean direction Min: 0 and Max: 6.283185307
Mean pitch Min: 0 and Max: 0.05
Normal velocity Mean: 0, Variance: 0, and Bound: 0
Normal direction Mean: 0, Variance: 0.2, and Bound: 0.4
Normal pitch Mean: 0, Variance: 0.02, and Bound: 0.04

Table 3 Parameters for RWP


Parameters Values
Speed Min: 0 and Max: 20 m/s
Pause time 0s
Position allocator Random rectangle
X [0, 480] Y [0, 380]

Table 4 Parameters for RW


Parameters Values
Node speed Min: 0 and Max: 20 m/s
Mode Time: 2 s
Bounds X [0, 500] Y [0, 400]

Table 5 Parameters for RD


Parameters Values
Node speed Min: 0 and Max: 20 m/s
Pause time 0s
Bounds X [0, 500] Y [0, 400]

4.2 Simulation Results

For the simulation, the parameters considered are throughput, delay, and PDR, which
are used to analyze the mobility models GM, RWP, RW, RD, and CV with
sets of 25, 50, 75, and 100 nodes.
Throughput. It is expressed as the number of messages transmitted per second
during communication. We have measured different throughput values of AODV
using various mobility models, namely GM, RWP, RW, RD, and CV as given in
Table 6. The highest throughput value for AODV is 108.89 kbps with the CV mobility
model for 25 nodes, and the lowest is 0.26 kbps for 25 nodes using the GM mobility

Table 6 Throughput of AODV using different mobility models


No. of nodes Mobility models
RWP GM RW RD CV
25 102.59 0.26 6.21 72.99 108.89
50 18.84 5.25 28.54 37.72 16.75
75 44.56 68.09 57.58 53.53 40.52
100 25.86 57.77 23.82 32.17 26.53

Fig. 1 Comparison of throughput

model, as given in Table 6. It is observed that with 25 nodes, three of the mobility
models, namely RWP, RD, and CV, give the highest throughput values for AODV,
whereas GM and RW give the lowest. It is seen that the
performance of RWP is similar to CV, and the performance of GM is similar to RW,
as presented in Fig. 1. The overall throughput of RWP is promising compared with
the other mobility models, as represented in Table 6.
Packet-Delivery-Ratio (PDR). It is expressed as the ratio of the total data packets
received by the destination to the total data packets transmitted by the source. The
highest PDR value, 1.000, is achieved by the GM mobility model
with 25 nodes, which indicates that 100% of the packets are delivered, whereas
the lowest PDR value is 0.838 for 100 nodes using the RW model, which implies some
packet drops, as presented in Table 7. It is seen in Fig. 2 that the PDR value
decreases with an increasing number of nodes for the RWP mobility model. It is
found that the average PDR value is promising with the CV mobility model.
Delay. It is defined as the total time required by a source to transmit a message to
its destination. The delay values for different numbers of nodes using the
mobility models RWP, RD, RW, GM, and CV are given in Table 8. The
minimum delay value in Table 8 is 34.821 s for 25 nodes, whereas the maximum
delay value is 648.668 s for 100 nodes using the RWP mobility model. It is seen that

Table 7 PDR of AODV using different mobility models


No. of nodes Mobility models
RWP GM RW RD CV
25 0.926 1.000 0.857 0.938 0.956
50 0.930 0.923 0.849 0.887 0.904
75 0.899 0.849 0.879 0.927 0.953
100 0.878 0.882 0.838 0.876 0.957

Fig. 2 Comparison of PDR

Table 8 Delay of AODV using different mobility models


No. of nodes Mobility models
RWP GM RW RD CV
25 34.821 192.875 72.250 131.95 214.72
50 183.867 140.638 174.79 173.29 157.69
75 309.173 209.353 226.62 184.23 247.34
100 648.668 176.852 329.15 225.02 184.71

delay value increases for a larger number of nodes using mobility models like RD,
RWP, and RW, as shown in Fig. 3.
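For concreteness, the three metrics can be computed from per-packet send/receive records. The sketch below uses an invented log format and toy values; it is not the chapter's simulation data:

```python
def qos_metrics(packets, duration):
    """Compute throughput (kbps), PDR, and mean end-to-end delay (s) from
    per-packet records of the form (send_time, recv_time_or_None, size_bytes)."""
    delivered = [p for p in packets if p[1] is not None]
    throughput_kbps = sum(p[2] for p in delivered) * 8 / duration / 1000
    pdr = len(delivered) / len(packets)                      # received / sent
    delay = sum(p[1] - p[0] for p in delivered) / len(delivered)
    return throughput_kbps, pdr, delay

# toy log: 4 packets sent, 1 lost (recv_time None), 512-byte payloads
log = [(0.0, 0.2, 512), (1.0, 1.4, 512), (2.0, None, 512), (3.0, 3.3, 512)]
print(qos_metrics(log, duration=10.0))
```

On this toy log the PDR is 0.75 (three of four packets delivered), which mirrors how the values in Tables 6, 7, and 8 are obtained from the NS-3 flow monitor at a larger scale.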

5 Discussion and Result

We have done thorough experiments on AODV with the mobility models RWP, RD,
RW, GM, and CV to analyze their impact on QoS in MANET. For the 25-node set,
it is noted that the throughput value is higher in both RWP and CV, whereas it
is lower in the GM mobility model. The higher the throughput value, the better the
QoS.

Fig. 3 Comparison of delay

The impact of RWP and CV is almost similar on AODV based on throughput,
as shown in Table 6 and Fig. 1. The throughput increases with an increasing number
of nodes in GM and RW, whereas for the rest of the models, it decreases as
the number of nodes grows. It is also seen that the PDR value decreases with a
growing number of nodes in RWP, whereas it remains similar in the CV mobility
model for the 25-, 75-, and 100-node sets, as shown in Fig. 2. It is also observed that
the GM mobility model gives the highest PDR value of 1.000, which means that all the
packets were delivered for 25 nodes, whereas the lowest PDR value is 0.838 for 100
nodes using the RW mobility model, as given in Table 7. The QoS performance
degrades with an increasing number of nodes. It is observed in Table 8 and Fig. 3 that
with an increasing number of nodes, the delay also increases for the RWP, RW, and RD
mobility models. The minimum delay value is 34.821 s for 25 nodes, and the maximum
delay value is 648.668 s for 100 nodes using the RWP mobility model, as given in
Table 8. The overall delay value is comparatively lower for all sets of nodes using
the GM and RD mobility models.

6 Conclusion and Future Work

QoS is an essential factor in MANET. We have observed and analyzed the influence
of different mobility models on AODV using throughput, delay, and PDR as the QoS
measures. It is observed that for large sets of nodes, QoS degrades. We know
that for better QoS, throughput and PDR should be higher, and delay
should be lower. After performing the simulation experiments, we have seen that every
mobility model differs from the others. In some mobility models, throughput and
delay are higher while PDR is lower, whereas in others, throughput
and delay are lower but PDR is higher. By analyzing the performance of the distinct
mobility models, we conclude that every mobility model is unique and has great
significance in providing QoS in MANET. In different situations, we use different

mobility models. For large networks, it is observed that GM gives better
throughput, whereas for small networks it gives better PDR, and it provides better
delay for all network sizes. Furthermore, the RWP mobility
model is better for small networks because it provides higher throughput and PDR
and lower delay for a smaller number of nodes, whereas its performance degrades
for a larger number of nodes. It is also noted that using the CV model, the overall
performance is better for the QoS metrics delay and PDR for all sets of nodes,
whereas throughput decreases with a growing number of nodes. When we compare
all the memoryless mobility models, namely CV, RWP, RW, and RD, we observe
that the CV mobility model provides better QoS than the other memoryless
mobility models. On the other hand, when we compare all the mobility models, we
observe that GM gives better performance on the QoS metrics PDR and
delay in comparison with the other mobility models. The above discussion and analysis
will be beneficial for students and research fellows seeking to better understand the
impact of various mobility models on QoS in MANET. This chapter will also
help them learn how to use mobility models for different scenarios, and to see
that different QoS parameters give different performance using the
same mobility model with varying parameters. The future scope of this chapter is to
improve the QoS in MANET by modifying the described mobility models.

References

1. C.S.R. Murthy, B.S. Manoj, Ad hoc Wireless Networks: Architectures and Protocols (Prentice
Hall PTR, May, 2004)
2. M.F. Khan, I. Das, Implementation of random direction-3D mobility model to achieve better
QoS support in MANET. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 11(10), 195–203 (2020)
3. A. Dhungana, E. Bulut, Peer-to-peer energy sharing in mobile networks: Applications,
challenges, and open problems. Ad Hoc Netw. 97(102029), 1–16 (2020)
4. A. Mikitiuk, K. Trojanowski, Maximization of the sensor network lifetime by activity schedule
heuristic optimization. Ad Hoc Netw. 96(101994), 1–15 (2020)
5. M.F. Khan, I. Das, A study on quality-of-service routing protocols in mobile ad hoc networks.
in International Conference on Computing and Communication Technologies for Smart Nation
(IC3TSN) (IEEE, 2017), pp 95–98
6. I. Das, R.N. Shaw, S. Das, Performance analysis of wireless sensor networks in presence of
faulty nodes. in 2020 IEEE 5th International Conference on Computing Communication and
Automation (ICCCA) (Greater Noida, India, 2020), pp. 748–751. https://doi.org/10.1109/ICC
CA49541.2020.9250724
7. M.F. Khan, I. Das, Effect of different propagation models in routing protocols. Int. J. Eng. Adv.
Technol. (IJEAT) 9(2), 3975–3980 (2019)
8. L.M. Sichitiu, Mobility models for ad hoc networks. in Guide to Wireless Ad Hoc Networks
(Springer, London, 2009), pp. 237–254.
9. X. Zhong, F. Chen, Q. Guan, J.F.H. Yu, On the distribution of nodal distances in random
wireless ad hoc network with mobile node. Ad Hoc Netw. 97(102026), 1–30 (2020)
10. J. Broch, D.A. Maltz, D.B. Johnson, Y.C. Hu, J. Jetcheva, A performance comparison of
multi-hop wireless ad hoc network routing protocols. in Proceedings of ACM MobiCom
(1998)

11. V. Vasanthi, M. Hemalatha, Simulation and evaluation of different mobility models in ad-hoc
sensor network over DSR protocol using Bonnmotion tool. in International Conference on
Security in Computer Networks and Distributed Systems (Springer, Berlin, Heidelberg, 2012)
12. J.D.M.M. Biomo, T. Kunz, M.S. Hilaire, An enhanced Gauss-Markov mobility model for
simulations of unmanned aerial ad hoc networks. in 7th IFIP Wireless and Mobile Networking
Conference (WMNC) (IEEE, 2014)
13. T. Camp, J. Boleng, V. Davies, A survey of mobility models for ad hoc network research.
Wireless Commun. Mobile Comput. 2(5), 483–502 (2002)
14. D.B. Johnson, D.A. Maltz, Dynamic source routing in ad hoc wireless networks. in Mobile
Computing (Springer, Boston, MA, 1996), pp. 153–181
15. D.A. Guimarães, P.F. Edielson, J.S. Lucas, Influence of node mobility, recharge, and path loss
on the optimized lifetime of wireless rechargeable sensor networks. Ad Hoc Netw. 97(101994),
1–28 (2020)
16. F. Bai, A. Helmy, A survey of mobility models. in Wireless Adhoc Networks (University of
Southern California, USA, 2004) vol. 206
17. V.G. Menon, Analyzing the performance of random mobility models with opportunistic routing.
Adv. Wireless Mobile Commun. 10, 1221–1226 (2017)
18. The Network Simulator website, [Online]. Available: https://www.nsnam.org
19. B. Liang, Z.J. Haas, Predictive distance-based mobility management for PCS networks.
in IEEE INFOCOM’99. Conference on Computer Communications. Proceedings. Eighteenth
Annual Joint Conference of the IEEE Computer and Communications Societies. The Future is
Now (Cat. No. 99CH36320), vol. 3 (IEEE, 1999)
20. L. Ghouti, T.R. Sheltami, K.S. Alutaibi, Mobility prediction in mobile ad hoc networks using
extreme learning machines. Procedia Comput. Sci. 19, 305–312 (2013)
21. N. Meghanathan, Impact of the Gauss-Markov mobility model on network connectivity, lifetime
and hop count of routes for mobile ad hoc networks. J. Netw. 5(5), 509 (2010)
22. D. Broyles, A. Jabbar, J.P.G. Sterbenz, Design and analysis of a 3–D Gauss-Markov mobility
model for highly-dynamic airborne networks. in Proceedings of the International Telemetering
Conference (ITC) (San Diego, CA, 2010)
23. M.F. Khan, I. Das, An investigation on existing protocols in MANET. in Innovations in
Computer Science and Engineering, ed. by H.S. Saini et al. Lecture Notes in Networks and
Systems, vol. 74, (Springer Nature Singapore Pte Ltd., 2019), pp. 215–224
24. A. Ahmed, A. Hanan, I. Osman, AODV routing protocol working process. J. Convergence Inf.
Technol. 10(2), (2015)
25. P.K. Maurya, G. Sharma, V. Sahu, A. Roberts, M. Srivastava, An overview of AODV routing
protocol. Int. J. Mod. Eng. Res. (IJMER) 2(3), 728–732 (2012)
26. M.F. Khan, I. Das, Performance evaluation of routing protocols in NS-2 and NS-3 simulators.
Int. J. Adv. Trends Comput. Sci. Eng. (IJATCSE) 9(4), 6509–6517 (2020)
27. I. Das, R.N. Shaw, S. Das, Analysis of effect of fading models in wireless sensor networks. in
2020 IEEE International Conference on Computing, Power and Communication Technologies
(GUCON) (Greater Noida, India, 2020), pp. 858–860. https://doi.org/10.1109/GUCON48875.
2020.9231201
28. P. Landge, A. Nigavekar, Modified AODV protocol for energy efficient routing in MANET.
Int. J. Eng. Sci. Res. Technol. 5(3), 523–529 (2016)
29. I.D. Chakeres, E.M.B. Royer, AODV routing protocol implementation design. in 24th Inter-
national Conference on Distributed Computing Systems Workshops, Proceedings (IEEE,
2004)
Analysis of Classifier Algorithms
to Detect Anti-Money Laundering

Ashwini Kumar, Sanjoy Das, Vishu Tyagi, Rabindra Nath Shaw, and Ankush Ghosh

Abstract In financial sectors like banking, anti-money laundering (AML) is a
very challenging issue. To prevent money laundering, various sets of procedures,
government policies, and ordinances, collectively known as anti-money laundering,
are designed so that income from illegal actions, e.g., market operations, dealing in
illegal commodities, corruption of public funds, and tax evasion, can be stopped.
The major objective of this chapter is to correctly classify a transaction as legal or
illegal. To achieve this, we have used big data analytics techniques on
a dataset to identify money laundering activities. A dataset with 10,000 transactions
is used in our analysis. The overall process includes data cleaning, statistical
analysis, and a data mining process. The linear support vector machine and decision
tree classifiers are used to find money laundering activities. The analysis has been
done using Python and customized datasets. The results obtained through the analysis
are very significant and show that accuracy is higher with the decision tree classifier.
The other parameters, namely recall and precision, are also better with the decision tree.

A. Kumar (B) · V. Tyagi
Department of Computer Science and Engineering, Graphic Era Deemed To Be University,
Dehradun, India
e-mail: ashwinipaul@gmail.com
V. Tyagi
e-mail: tyagi.vishi@gmail.com
S. Das
Department of Computer Science, Indira Gandhi National Tribal University-RCM, Imphal, India
e-mail: sdas.jnu@gmail.com
R. N. Shaw
Department of Electrical, Electronics and Communication Engineering, Galgotias University,
Greater Noida, India
e-mail: r.n.s@ieee.org
A. Ghosh
School of Engineering and Applied Sciences, The Neotia University, Sarisha, West Bengal, India
e-mail: ankushghosh@gmail.com

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 143
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_11

Keywords Big data analytics · SVM · Decision tree · Anti-money laundering · Precision

1 Introduction

Today, money laundering is a process of acquiring money from illegitimate
sources. Money laundering converts monetary funds obtained illegally into clean
funds using international investment media or banks [1, 2]. The detection of money
laundering is also referred to as anti-money laundering [3]; the purpose of AML is to
take legal action against illegal income, and the specifics depend on the country. The
main goal of anti-money laundering is to detect and reduce drug trafficking, to prevent
terrorist activities, and to curb crimes like robbery, blackmail, and forgery [4]. Most
financial institutions also face the risk of money laundering. Therefore, they need
to implement legal procedures to detect, prevent, and reduce money laundering and
other illegal activities. Currently, fraud cases in financial institutions such as banking
systems include activities like money laundering, fraudulent e-banking transactions,
and credit or debit card fraud [5].
Classic approaches to anti-money laundering followed a
manual process for detecting money laundering. These traditional approaches can
be categorized into detection, identification, and avoidance of money laundering
activities. Many data mining techniques are expected to apply
well to AML.
Big data analytics helps in securing banking systems against money laundering
activities in a cost-effective manner [6, 7]. On a synthetic dataset, we have used a
machine learning approach to classify activities as legal or illegal.
Our main contribution in this chapter is the analysis of the synthesized data. We
use the linear support vector machine (linear SVM) and decision tree (DT)
classification approaches to detect whether a transaction falls in the category of money
laundering.
The chapter is organized as follows: In Sect. 2, we discuss various works
addressing anti-money laundering detection mechanisms. Section 3
includes the experimental setup and methodology. In Sect. 4, the machine learning
approaches along with the result analysis of the classification techniques are presented.
Finally, the chapter is concluded in Sect. 5.


2 Literature Reviews

In this section, works related to the detection of money laundering through machine
learning and other techniques are discussed to provide background
on money laundering.


In [1], the author discussed current methods and machine learning algorithms to
effectively detect suspicious money laundering activities. Many algorithms that offer
scalability, data security, accuracy, and the performance to detect AML activities
very fast in a suspicious banking transaction system are discussed. The author has
also explored the dataset and the feature selection of the variables used in the in-depth
examination.
In [8], the authors explore approaches to overcome the existing problems in
anti-money laundering and provide solutions based on classical methods
used in money laundering prevention. A novel multi-agent architecture
provides a set of unique functionalities for anti-money laundering solutions. In [9],
the authors present a case study of a knowledge-based approach to detecting money
laundering patterns with the help of a combination of data mining and
natural language processing techniques. The authors used techniques like artificial
neural networks and clustering and applied a heuristic approach to the AML solution.
In [10], the authors constructed a support vector machine for the detection
of suspicious transaction activities in money laundering. The paper proposed a novel
cross-validation SVM method to find the parameters that increase
the overall performance of the model and obtain better classification of the test dataset.
In [11], the authors used a decision tree rule-based method to detect money
laundering from customer profiles of a commercial bank in China. The authors
used a sample of twenty-eight customers with the main features or attributes to
validate their decision tree method.
In [12], the authors combined distributed mining techniques for a distributed
environment with AML. In [13], the author described a solution for automatic
detection of anti-money laundering in a live intelligent environment at
AUSTRAC. They constructed a network model that represents the relationship
between individual transactions and supplementary evidence. For automatic detection
of anti-money laundering, they considered SVM and random forest using the R
library e1071. Using the transaction network dependencies, the model achieved an
accuracy of 0.86 with SVM and 0.92 with random forest.
In [14], the authors presented an overview of various data mining techniques used
for AML, proposing several techniques and their implementation for AML.
According to the authors' analysis, rule-based AML systems should first be replaced
with a machine learning approach for money laundering. Second, machine learning
methods can effectively be applied to AML to build an efficient solution for automatic
detection of money laundering patterns. Finally, classification and clustering are the
most important classic machine learning methods for obtaining better results in
anti-money laundering. In [15], the authors proposed a suspicious activity detection
model based on scan statistics, which identifies illegal financial transactions at the
transaction level for financial institutions. They used sensitivity/recall and specificity
as performance metrics for the proposed algorithm, treating false positives as
incorrectly detected suspicious transactions. In the SARs detection algorithm, the
sensitivity of the proposed algorithm is 0.516, and the authors are still working on
increasing the sensitivity to correctly detect suspicious money laundering
transactions. In [16], the authors

considered the client's privacy while detecting frauds. Kernel principal component
analysis is combined with the extreme gradient boosting algorithm to propose
a new hybrid unsupervised and supervised learning model for fraud detection. The
grid search algorithm is used to avoid over-fitting. The experiments are done on a
real-life dataset. The performance analysis of XGBoost and P-XGBoost shows that
P-XGBoost outperforms XGBoost in fraud detection. In [17], the author examined
Brazilian exporters for possible frauds in exports. An unsupervised deep learning
model is used for classification; an autoencoder detects anomalous
situations within the vast volumes of the dataset of exported goods and
products. In [18], a supervised machine learning model is proposed and trained
on various types of data. The objective is to predict the probability
that a new transaction will be reported, based on historical data. The machine
learning model is tested on the large dataset of DNB, Norway's largest bank.
The proposed model outperforms the bank's existing techniques.

3 Methodology and Experimental Setup

In this section, we discuss the support vector machine (SVM) and decision tree
(DT) that are used in the analysis of algorithms to detect anti-money laundering. We
used a synthesized banking transaction dataset with the given methods. The proposed
methodology is implemented in Python 3.7. Python is an open-source platform and
among the most popular for machine learning and visualization. In Python, a large
number of libraries with inbuilt functions, called packages, help to find an optimized
solution easily.
In the detection of anti-money laundering, our first step is to identify features in the
labeled dataset; this is needed for proper analysis. The second step is data cleaning,
which removes emoticons, noisy data, and punctuation from the dataset. In the third
step, we apply machine learning methods, i.e., text analytics with the SVM and DT
classifiers, on the anti-money laundering dataset to classify bank transactions as legal
or illegal. Finally, the classifier algorithms SVM and DT
are used on the dataset to correctly detect anti-money laundering transactions. Figure 1
clearly depicts the step-wise activities framed in our methodology for AML [5].
In the beginning phase, identification of the input dataset is done. The second phase
deals with the data cleaning process, which removes the noisy data from the dataset. In
the third phase, the text analytics algorithms, the decision tree and the support
vector machine classifier, are applied to identify legal and illegal transactions. The final
phase shows that the data are classified into legal and illegal transactions
[5, 19].
A model is developed to classify transactions based on the dataset attributes
using the SVM and DT classifiers. The money laundering dataset has a total of 45
objects and seven variables.


Fig. 1 Step-by-step illustration of anti-money laundering activities [5]


Table 1 Description of data acquisition

Field          Description
Type           Type of transactions
Amount         Available amount in the customer's account
Account no     Account number where transactions were executed
Old balance    Previous balance
New balance    Updated amount after executing a transaction
Date and time  Time and date at which transactions were executed

3.1 Step-1 Data Acquisition


The raw dataset used for AML detection is either collected from secure sites or created
as a customized or synthesized dataset [5]. The dataset contains various fields, whose
descriptions are given in Table 1.

3.2 Step-2 Feature Selection

The dataset contains various columns; unwanted columns that do not affect the
transaction are filtered out. The fields AMOUNT and OLD BALANCE are removed
because they do not affect a transaction. The remaining fields are retained and used
to build the model.

148 A. Kumar et al.
Column Name (money_laundering)

money_laundering: “Type”, “AMOUNT”, “Account No.”, “OLD BALANCE”, “New
balance”, “Type of account”, “Date and Time”
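Dropping the filtered-out columns can be sketched with pandas. The column names below mirror the listing above, but the rows are invented placeholders, not the chapter's synthesized dataset:

```python
import pandas as pd

# Illustrative rows standing in for the synthesized money_laundering dataset.
df = pd.DataFrame({
    "Type": ["transfer", "cash_out"],
    "AMOUNT": [9000.0, 1500.0],
    "Account No.": ["AC101", "AC102"],
    "OLD BALANCE": [12000.0, 1500.0],
    "New balance": [3000.0, 0.0],
    "Type of account": ["savings", "current"],
    "Date and Time": ["2020-01-05 10:30", "2020-01-05 11:10"],
})

# Drop the fields that, per the feature selection step, do not affect a transaction.
features = df.drop(columns=["AMOUNT", "OLD BALANCE"])
print(list(features.columns))
```

The remaining five columns are what the model-building steps below would consume.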

3.3 Step-3 Divide Dataset

The whole dataset is split into training and validation datasets. The training dataset
is used to train the model; the validation dataset is used to validate the model and make
predictions. We considered 80% of the data as the training set and 20% as the validation set.
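The 80/20 split can be done with scikit-learn's train_test_split. The feature matrix and labels here are synthetic stand-ins (the real dataset's 45 objects are assumed, not reproduced):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy feature matrix and legal (+1) / illegal (-1) labels: 45 objects, 5 features.
X = np.arange(45 * 5, dtype=float).reshape(45, 5)
y = np.where(np.arange(45) % 3 == 0, -1, 1)

# 80% training, 20% validation, as described above; stratify keeps class balance.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.20, random_state=42, stratify=y
)
print(X_train.shape, X_val.shape)  # (36, 5) (9, 5)
```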

3.4 Step-4 Implement Model

Our technique is built with the linear SVM and DT classifiers using the Scikit-learn
library on the training dataset. The resulting models are called
money_laundering_SVM and money_laundering_DT.
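A minimal sketch of fitting both classifiers with Scikit-learn, on synthetic data. The model names mirror those used in the text; the data and any hyper-parameters are assumptions, not the chapter's configuration:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Synthetic, roughly linearly separable transactions: 45 rows, 5 features.
X = rng.normal(size=(45, 5))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)   # +1 legal, -1 illegal

# Fit the two models described in the text.
money_laundering_SVM = LinearSVC(max_iter=10000).fit(X, y)
money_laundering_DT = DecisionTreeClassifier(random_state=0).fit(X, y)

print(money_laundering_SVM.score(X, y), money_laundering_DT.score(X, y))
```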

3.5 Step-5 Optimization Model

The optimization model is used to refine and enhance the performance of the model.
This helps in achieving the highest accuracy, precision, and recall values on the
validation dataset.
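Refinement of this kind is commonly done with a cross-validated hyper-parameter search. The sketch below uses GridSearchCV over a decision tree; the parameter grid and data are assumptions for illustration, not the settings used in the chapter:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(45, 5))
y = np.where(X[:, 0] > 0, 1, -1)

# Search over tree depth and split criterion to refine the model.
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 4, 8, None], "criterion": ["gini", "entropy"]},
    scoring="accuracy",
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 2))
```

The best estimator found by the search is then evaluated on the held-out validation set.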

4 Machine Learning Approaches

4.1 Linear Support Vector Machine

The SVM [6] is a supervised learning technique used to analyze data for classification
problems. It is among the most robust prediction models and is based on the statistical
learning framework. We used a linear SVM for linearly separable data, which means that
our dataset will be classified as legal (+1) or illegal (−1). Linearly separable data
can be split by one or more lines. The standard line function is y = ax + b. Replacing
x with x1 and y with x2, we get:

ax1 − x2 + b = 0 (1)

Equation (1) is derived from 2D vectors and the equation of the hyperplane. Once
we have defined our hyperplane, we can use it to make predictions on given values.
So, we define our hypothesis function h as:

h(xi) = +1 if w · x + b ≥ 0
        −1 if w · x + b < 0

Finally, the goal of our support vector machine is to find a hyperplane that splits the
data accurately.
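The hypothesis above reduces to taking the sign of w · x + b. A minimal sketch, with illustrative (assumed) hyperplane parameters w and b:

```python
import numpy as np

def h(x, w, b):
    """Classify x as legal (+1) or illegal (-1) from the hyperplane w . x + b = 0."""
    return 1 if np.dot(w, x) + b >= 0 else -1

# Assumed hyperplane parameters, for illustration only.
w = np.array([2.0, -1.0])
b = -0.5

print(h(np.array([1.0, 0.0]), w, b))   # 2*1 - 0 - 0.5 = 1.5 >= 0  -> +1
print(h(np.array([0.0, 1.0]), w, b))   # 0 - 1 - 0.5  = -1.5 < 0  -> -1
```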

4.2 Decision Tree Classifier

The decision tree is a supervised learning model [6] mostly used for solving classifi-
cation problems, but it is also used for regression. The decision tree is a tree-structured
classifier comprising internal nodes that represent the features of a dataset, terminal
nodes that represent the output, and branches that represent the decision rules. It is a
graphical representation of all possible outcomes of a decision based on given conditions.
Decision trees use multiple algorithms to decide how to split a node into two or more
sub-nodes. The creation of sub-nodes increases the homogeneity of the resultant sub-
nodes. The algorithm selection also depends on the type of output variable. We
used the ID3 algorithm to build our decision trees.
The ID3 algorithm builds decision trees by expanding all possible branches with no
backtracking. In ID3, we calculate the entropy (H) and information gain (IG) of each
attribute.


Information Gain = Entropy(before) − Σ_{j=1}^{K} Entropy(j, after)

After selecting the attribute with the smallest entropy (i.e., the largest information
gain), the next subtree is split on another attribute to produce a subset of the given
data.
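The entropy and information gain computations can be sketched as follows. The labels are a toy example, not the chapter's data; subset entropies are weighted by subset size, as is conventional for ID3:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H = -sum(p * log2(p)) over the class proportions."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(parent, subsets):
    """Entropy(before) minus the size-weighted entropy of the K subsets after the split."""
    total = len(parent)
    return entropy(parent) - sum(len(s) / total * entropy(s) for s in subsets)

parent = ["legal"] * 5 + ["illegal"] * 5        # H = 1.0 (two equal classes)
split = [["legal"] * 5, ["illegal"] * 5]        # pure subsets, H = 0 each
print(information_gain(parent, split))          # 1.0
```

A perfectly separating attribute, as here, yields the maximum possible gain, so ID3 would pick it first.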

Table 2 Models with classical machine learning methods

Model                          Precision  Recall  Accuracy
Linear support vector machine  0.87       0.87    0.87
Decision tree classifier       0.96       0.96    0.95

4.3 Results and Analysis

The results of the various techniques used to detect anti-money laundering, in terms
of precision, recall, and accuracy, are listed in Table 2.
Next, we use machine learning methods to detect anti-money laundering. Our
model achieves high accuracy on the anti-money laundering detection task, and
we found that it significantly outperforms other models. We analyze the
performance of the linear support vector machine and decision tree classifier as
shown in Figs. 2 and 3.
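Metrics of the kind reported in Table 2 can be obtained with scikit-learn. The labels and predictions below are hypothetical, not the chapter's validation results:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical validation labels and predictions (+1 legal, -1 illegal).
y_true = [1, 1, 1, 1, -1, -1, -1, 1, -1]
y_pred = [1, 1, 1, -1, -1, -1, 1, 1, -1]

print(precision_score(y_true, y_pred))  # of transactions predicted legal, fraction truly legal
print(recall_score(y_true, y_pred))     # of truly legal transactions, fraction found
print(accuracy_score(y_true, y_pred))   # fraction classified correctly overall
```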

Fig. 2 Performance of linear support vector machine

Fig. 3 Performance of decision tree classifier

Figure 2 shows the performance of the linear SVM: the precision value increases
gradually, in small steps. Figure 3 shows the performance of the decision tree, where
the precision value is also gradually increasing. In anti-money laundering, the
performance of the proposed model is very important, because higher accuracy
indicates greater correctness of the model. Our proposed model achieved better
results for accuracy, precision, and recall on the given dataset, as shown in Table 2.

5 Conclusion

In this chapter, we used big data analytics methods to classify money laundering
actions into two categories, i.e., legal or illegal. We applied the linear support
vector machine and decision tree classification algorithms to selected attributes of
the dataset. This analysis revealed that such classification can help in fraud
management tasks such as money laundering detection. Through result analysis,
in terms of precision, accuracy, and recall values, decision tree classification
outperformed linear support vector machine classification. The present work can
be extended with more datasets and other machine learning techniques.

References

1. Z. Chen, L.D. Van Khoa, E.N. Teoh, A. Nazir, E.K. Karuppiah, K.S. Lam, Machine learning
techniques for anti-money laundering (AML) solutions in suspicious transaction detection: a
review. Knowl. Inf. Syst. 57(2), 245–285 (2018). https://doi.org/10.1007/s10115-017-1144-z
2. C.H. Tai, T.J. Kan, Identifying money laundering accounts. in Proceedings of the 2019 International
Conference on System Science and Engineering (ICSSE) (2019), pp. 379–382. https://doi.org/10.1109/ICSSE.2019.8823264
3. M. Chen, S. Mao, Y. Liu, Big data: a survey. Mobile Netw. Appl. 19(2), 171–209 (2014)
4. M. Abourezq, A. Idrissi, Database-as-a-service for big data: an overview. Int. J. Adv. Comput.
Sci. Appl. (IJACSA) 7(1), (2016)
5. A. Kumar, S. Das, V. Tyagi, Anti money laundering detection using Naïve Bayes classifier. in
2020 IEEE International Conference on Computing, Power and Communication Technologies
(GUCON) (Greater Noida, India, 2020), pp. 568–572. https://doi.org/10.1109/GUCON48875.
2020.9231226
6. C. Cortes, V. Vapnik, Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
7. L. Breiman, et al., Classification and regression trees. Statistics/Probability Series (1984)
8. S. Gao, D. Xu, H. Wang, Y. Wang, Intelligent anti-money laundering system. in 2006 IEEE
International Conference on Service Operations and Logistics, and Informatics, (SOLI 2006)
(2006), pp. 851–856. No. 7001805. https://doi.org/10.1109/SOLI.2006.235721
9. N.A. Le Khac, M.T. Kechadi, Application of data mining for anti-money laundering detection:
a case study. in Proceedings of IEEE International Conference on Data Mining, ICDM (2010),
pp. 577–584. https://doi.org/10.1109/ICDMW.2010.66
10. L. Keyan, Y. Tingting, An improved support-vector network model for anti-money laun-
dering. in Proceedings of the 2011 International Conference on Management of e-Commerce and
e-Government (ICMeCG 2011) (2011), pp. 193–196. https://doi.org/10.1109/ICMeCG.2011.50
11. S.N. Wang, J.G. Yang, A money laundering risk evaluation method based on decision tree. in
Proceedings of the Sixth International Conference on Machine Learning Cybernation, (ICMLC
2007) (2007), pp. 283–286. https://doi.org/10.1109/ICMLC.2007.4370155
12. C. W. Zhang, Y.B. Wang, Research on application of distributed data mining in anti-money laun-
dering monitoring system. in Proceedings of 2nd IEEE International Conference on Advance
Computing Control (ICACC 2010) (2010), pp. 133–135. https://doi.org/10.1109/ICACC.2010.
5487272
13. David Savage, et al., Detection of money laundering groups: supervised learning on small
networks. in AAAI Workshops (2017)
14. N.A. Le Khac, et al., An investigation into data mining approaches for anti money laundering. in
Proceedings of International Conference on Computer Engineering and Applications (ICCEA
2009) (2009)
15. X. Liu, P. Zhang, A scan statistics based suspicious transactions detection model for anti-money
laundering (AML) in financial institutions. in 2010 International Conference on Multimedia
Communications (IEEE, 2010)
16. H. Wen, F. Huang, Personal loan fraud detection based on hybrid supervised and unsupervised
learning. in 2020 5th IEEE International Conference on Big Data Analytics (ICBDA) (Xiamen,
China, 2020), pp. 339–343. https://doi.org/10.1109/ICBDA49040.2020.9101277
17. E.L. Paula, M. Ladeira, R.N. Carvalho, T. Marzagão, Deep learning anomaly detection as
support fraud investigation in Brazilian exports and anti-money laundering. in 2016 15th IEEE
International Conference on Machine Learning and Applications (ICMLA) (Anaheim, CA,
2016), pp. 954–960. https://doi.org/10.1109/ICMLA.2016.0172
18. M. Jullum, A. Løland, R.B. Huseby, G. Ånonsen, J. Lorentzen, Detecting money laundering
transactions with machine learning. J. Money Laundering Control 23(1), 173–186 (2020). https://
doi.org/10.1108/JMLC-07-2019-0055
19. B.A. Omran, Q. Chen, Trend on the implementation of analytical techniques for big data in
construction research (2000–2014). in Construction Research Congress 2016 (2016), pp. 990–
999
20. E.A. Lopez-Rojas, S. Axelsson, Multi agent based simulation (mabs) of financial transactions
for anti money laundering (aml). in Nordic Conference on Secure IT Systems (Blekinge Institute
of Technology, 2012)
21. A.K. Saha, A. Kumar, V. Tyagi, S. Das, Big data and internet of things: a survey. in 2018
International Conference on Advances in Computing, Communication Control and Networking
(ICACCCN) (2018 Oct 12), pp. 150–156
22. S. Mukherjee, R. Shaw, Big data–concepts, applications, challenges and future scope. Int. J.
Adv. Res. Comput. Commun. Eng. 5(2), 66–74 (2016)
Design and Development of an ICT
Intervention for Early Childhood
Development in Minority Ethnic
Communities in Bangladesh

Md Montaser Hamid, Tanvir Alam, and Md Forhad Rabbi

Abstract Early childhood development (ECD) encompasses cognitive, physical, and
socio-emotional growth, which is largely absent in the minority ethnic groups
of Bangladesh. In ethnic communities, local beliefs and traditions, rather than
scientific methods, are practiced in bringing up a child. Due to this practice, many
indigenous children’s futures are compromised in terms of physical and mental well-being.
We have designed a smartphone application for the parents of minority ethnic groups
to guide them on early childhood development, so that every child can get
proper attention. Our main objective was to develop a smartphone application for
indigenous children, so that parents can monitor the mental and physical development
of their children from a very early stage. During the development of the app, the
cultural and educational context of the indigenous people was considered. Finally,
the User Interface (UI) and User Experience (UX) of this intervention were assessed
by members of several indigenous groups to ensure the appropriateness and
usability of the application.

1 Introduction

In this world of advanced technology, smartphone devices have reached every
corner. The pervasive usage of smartphones is influencing human lives in an unprece-
dented way. Children of this modern world are not excluded from this influence.
Smartphone devices are extensively used for children’s education, entertainment,
communication, and development. Several smartphone applications have

M. M. Hamid
Department of Computer Science and Engineering, Ranada Prasad Shaha University,
Narayanganj, Bangladesh
T. Alam (B) · M. F. Rabbi
Department of Computer Science and Engineering, Shahjalal University of Science and
Technology, Sylhet, Bangladesh
M. F. Rabbi
e-mail: frabbi-cse@sust.edu

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 153
J. C. Bansal et al. (eds.), Computationally Intelligent Systems and their Applications,
Studies in Computational Intelligence 950,
https://doi.org/10.1007/978-981-16-0407-2_12

been developed whose target audience includes the children and the parents. Parents
are relying on smartphone applications to provide basic education to their children.
This basic education includes knowledge of the alphabet, elementary
mathematics, drawing, the environment, living nature, and more.
Developing a mobile application for children is a controversial and delicate matter.
Many people want to keep children away from smartphones. From their viewpoint,
overuse of smartphones can have detrimental impacts on a child’s mental
and physical health. However, in this modern world, it is not feasible to keep children
out of the influence of smartphones. Therefore, it is required to develop smartphone
applications that will address the requirements and needs of the children and their
parents in the most sensible way.
The majority of applications being developed for children are for education and
entertainment purposes. Children use such applications from a very young age.
The parents also use these applications alongside the children.
Developing such applications requires multidisciplinary effort. A lot of factors need
to be taken into consideration before developing a children’s app. Defining
the design principles is the most challenging part of the development stage of a
children’s app. It is also crucial to consider the emotional and psychological needs
of a child while developing the applications.
The User Interface (UI) and User Experience (UX) of children’s applications
are different from other traditional smartphone applications. In traditional applica-
tions, minimalistic design strategies are widely used. However, in a children’s app,
the contents are colorful and flashy. In a children’s app, the icons are supposed to
be comprehensible for children of very young ages. Moreover, the navigation and
interaction techniques should be easily adaptable.
In this study, we tried to work with children of the indigenous and ethnic commu-
nities of Bangladesh. We wanted to define a design principle for smartphone appli-
cations that can better mitigate the needs of the children of these less technologi-
cally advanced communities. More than 2 million indigenous people are living in
different parts of Bangladesh. They have their language, traditions, and cultural iden-
tity. However, their socio-economic status is not the same as the mainland people of
Bangladesh. They are less privileged in terms of education, economic investments,
and technological infrastructures.
Our main objective was to develop a smartphone application for the indigenous
children, so that the parents can monitor the mental and physical development of
their child from a very early stage. Before developing such an application, it was
imperative to study the target users and take their opinions in this regard. In
collaboration with the Department of Anthropology of Shahjalal University of Science
and Technology, we tried to find out what indigenous people expect from an
application for their children.
Our research can be divided into three major stages. In the first stage, we went
to the indigenous communities and arranged focus group discussion (FGD) and key
informant interview (KII) for better understanding and collecting their opinions. In
the next stage, based on the qualitative data collected in the previous stage, we have
developed a smartphone application. In the last stage, we deployed our application

among a few indigenous users. We then took the feedback from the users in terms
of UI/UX, navigation, contents, and adaptability of our application. Lastly, after
conducting the user study, we made a few changes in the UI to make the app more
comprehensible.
While interacting with the indigenous communities, we were careful about ethical
considerations. The indigenous communities of Bangladesh are peace-loving and
friendly. However, they are very aware and protective of their culture and
traditional identity. We were careful about the nature of our data collection process.
During the FGDs and KIIs, we took help from translators so that we could interact with the
indigenous people properly. All participation was completely voluntary. We did
not provide any reward for participation. No children participated in this research.

2 Related Works

The target audience of our application is the indigenous mothers and children. It is
always challenging to define the expectations and requirements of a focus group who
have minimal knowledge of technology. The design principles of a successful mobile
application rely on the user study before and after the deployment of the application.
Oinas-Kukkonen et al. [1] discussed that there are seven key design principles
of mobile applications. These key design principles include mobility, usefulness,
relevance, ease of use, fluency of navigation, user-centeredness, and personalization.
It is also mentioned that a successful mobile service should have natural adaptability.
There are a lot of applications that provide the same features and functionalities.
According to the authors, the application which offers the most natural adaptation
will be more popular among general users.
Interface design is another vital factor of a successful mobile application. Place-
ment of different keys and choosing the appropriate sizes are very crucial for ensuring
the acceptance of an application [2]. Developing prototypes and taking user feed-
back based on the prototypes help the UX/UI developers in defining user needs. A
lot of applications fail due to the lack of collaboration of the target audience in the
development stage.
The number of functionalities in modern-day smartphone applications is
increasing rapidly. This is creating a contradictory viewpoint for the developers.
The richness of features makes the navigation of an app more difficult. However,
the users expect ease and clarity in the functionalities of an app [3]. In a
sophisticated application environment, usability is an important deciding factor of
the acceptance of an app.
UI design requires multidimensional and multidisciplinary approaches.
A successful UI design model should include market research, user study, product
design specification, conceptualization, prototyping, user evaluation, pre-narration,
and post-narration [4]. The social and psychological aspects of the target audience
are also important elements of UI design.

Developing suitable smartphone applications for marginal users, especially people
with little technological knowledge, is a strenuous task. In a country like Bangladesh,
a large number of people are illiterate and have minimal knowledge of smart-
phone technologies. The inclusion of such users is very crucial for the sustainable
growth of technology in a country. A mobile application should be represented differ-
ently among such people. The icons, signs, and pictures of apps for low literate
people should be self-explanatory [5]. Providing textual content in the local language
will also make an app more acceptable to low literate users. Combining Graph-
ical User Interface (GUI) and linguistic approaches offer easiness and reduction
of complexity in terms of interaction for the illiterate people [6]. In another study,
different UI/UX design principles have been proposed for developing indigenous-
centric smartphone applications [7]. It has been mentioned that for such users,
additional features such as voice navigation and audio assistance are required to
make a smartphone application suitable for the indigenous people.
A huge number of smartphone applications have been developed for children.
Different learning, gaming, and entertainment apps are used by children of all ages.
While developing a smartphone app for children, a lot of factors should be taken
into consideration. MFolktales, a storytelling application for children, implements
numerous children-specific theories in the app [8]. The app integrates cognitive
learning, social learning, and sensory stimulation theory. Task-specific apps are also
getting very popular among the children. For example, healthcare apps are used by
the parents to make their children more aware of certain diseases and health issues.
In an Android app developed for educating children about type 1 diabetes mellitus,
the developers followed the gaming trends of the other children apps for making the
app more acceptable to the children [9]. Different learning apps for children also
follow this type of strategy.
In our attempt to develop an application for the children of the indigenous commu-
nities, we wanted to help both the parents and the children. We selected and developed
the content of the application based on the socio-cultural aspects of the indigenous
communities. Our goal was to engage our target audience interactively [10–13].

3 Defining Design Principle

The most challenging part of our research was to find out the appropriate design
principle. Our goal was to develop a smartphone application that addresses different
issues of indigenous users and is also suitable for use by children [14].
We needed to study our target audience for that purpose. To collect data on
the viewpoints of indigenous users, we used the following two approaches:
• Focus group discussion (FGD)
• Key informant interview (KII).

3.1 Focus Group Discussion

In the FGDs, we conducted interactive group discussions among different indigenous
people. We conducted three FGDs, each with a group of 8–10 people. In each
group, there were people from different backgrounds [15]. The nature of the questions
at the FGDs was semi-structured. We discussed the following topics in the FGDs:
• Smartphone usage history,
• Challenges of smartphone usage,
• Limitations of the traditional smartphone application,
• Content of a children’s app,
• Interface of the children’s app.

3.2 Key Informant Interview

In the KIIs, we talked to six different indigenous people in one-to-one interviews.
The six KIIs covered the following people:
• Two midwives,
• Two indigenous chiefs,
• Two indigenous mothers.
The nature of the KIIs was completely unstructured. We gave the participants
an opportunity to freely discuss their remarks and opinions about smartphone
application design.

3.3 Findings of the FGD and KII

From the responses of the participants of both FGD and KII, we tried to define the
design principles suitable for the indigenous children and their parents. We found
that the indigenous users face difficulties in terms of language, contents of the app,
navigation, and icon comprehensibility. As most content of smartphone applications
is in English, it is difficult for the less tech-savvy indigenous users to understand the
meaning. They expect the inclusion of their indigenous language in the application.
In the indigenous community, a few people, especially of the younger generation, are
receiving a proper education. As a result, they want to defy many superstitions and
ancient beliefs of the indigenous community. The younger people are eager to
use smartphone applications to learn about the modern world and establish
contact with it. However, the older generation relies on its traditional knowl-
edge base for treatment, education, and entertainment. These conflicting views are
creating tensions between the two generations. The older indigenous people oftentimes

consider modern technology such as smartphone applications an intrusion into their
own cultural identity. Therefore, they are often reluctant to use smartphone
applications.

3.4 Defining User-Specific Design Principle

We wanted to make the app interactive and entertaining for our target audience.
Defining the requirements of the indigenous people seemed very difficult for us
[16]. Therefore, we made a prototype of the app with limited content. We then took
feedback from different indigenous users to make our app suitable for their usage.
The prototype contained three videos demonstrating three different early childhood
development activities. We asked a few questions about the comprehensibility,
interpretation, and usability of the prototype during the FGDs and KIIs. We
first wanted to know whether the participants could understand what the video
contents were for. We got the following responses:
• 78% of indigenous mothers understand the content of the video.
• 82% of indigenous men understand the content of the video.
Next, we asked whether they could relate the ECD checklist activities to the video
content. Their responses are as follows:
• 75% of indigenous mothers can relate the videos with the ECD checklist.
• 80% of indigenous men can relate the videos with the ECD checklist.
To demonstrate the activities of the ECD checklist, we initially selected three
media: (i) describing through images, (ii) describing through videos,
and (iii) describing through training. We wanted to know which medium is most
suitable for the indigenous people. From their points of view, we found the following:
• 51% think that images can be used for describing the contents.
• 82% think that video contents are more appropriate.
• 67% think that training programs can be arranged for ECD awareness.
The problem with image content is that indigenous people often face difficulties
interpreting the nature of an ECD activity from an image alone. Video content
provides additional information, and descriptive animation makes it easier for an
indigenous mother to understand the type of activity. Conducting a training program
requires a huge amount of time and resources; it is not feasible to arrange one for
every indigenous mother. Moreover, the number of trained personnel who can provide
training is very small due to the language barrier. After analyzing all this, we decided
to develop our application based on video content. We used animated videos to make
the app entertaining.

4 Development of the Application

The smartphone application has been developed based on the findings of the FGDs and
KIIs. The content of the application focuses on early childhood development (ECD)
activities. For a child aged 0–3 years, there is a defined list of activities according
to the child’s age. We collected this list from BRAC, a prominent non-
government organization in Bangladesh. This ECD checklist provides a chart which
contains different activities based on the age of the children. The activities are divided
into seven categories, which are as follows:
• Major activity,
• Minor activity,
• Vision ability,
• Hearing ability,
• Language,
• Intelligence,
• Attitude.
The ECD checklist divides the babies into different age levels, as follows:
• New-born,
• 1–2 months,
• 3–5 months,
• 6–8 months,
• 9–11 months,
• 12–17 months,
• 18–23 months,
• 24–29 months.
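The checklist's structure, with the seven categories tracked across the age levels, can be sketched as a simple mapping. The activity texts below are invented placeholders, not BRAC's actual checklist entries:

```python
# Each age level maps every category to the activities expected at that age.
ecd_checklist = {
    "New-born": {
        "Major activity": ["Moves arms and legs"],
        "Vision ability": ["Briefly fixes gaze on a face"],
    },
    "3-5 months": {
        "Major activity": ["Rolls over"],
        "Hearing ability": ["Turns head toward a sound"],
    },
}

def activities_for(age_level, category):
    """Look up what a child of a given age is expected to do in one category."""
    return ecd_checklist.get(age_level, {}).get(category, [])

print(activities_for("3-5 months", "Hearing ability"))
```

In the app, each category page iterates over such a mapping to list activities per age level.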
For each stage, there are one or more activities from every category which a baby of that
age should be able to perform. This checklist is used by the field workers of BRAC
to monitor the mental and physical development of children. Our goal was to
create the app with the contents of the ECD checklist for indigenous mothers and
their children. To make the app suitable for indigenous mothers, we needed to
consider additional design and development issues. Figure 1 shows the homepage of our
application. On the homepage, we describe each category name with a short text.
The language of the text is Bangla, the official language of Bangladesh.
From this main menu, the user can select the desired category. Upon selecting a
category, a new page opens.
Figure 2 shows the UI of a category page. On this page, a complete list of
all the activities for each age level is displayed. There is an audio icon
next to every activity; upon pressing it, an audio description
of that activity is played. This feature is useful for members of minority ethnic
groups who are illiterate and for people who cannot read Bangla. Clicking on
an activity opens another page. In Fig. 3, we can see that page with a video

Fig. 1 Homepage of the application

thumbnail and a textual description of the activity. The video contains an animation
of that activity; clicking the thumbnail starts the video. This type of interactive
video content helps indigenous mothers and their children understand the nature of
the activity. Without relying on textual description, we tried to make the application
enjoyable.

5 Deployment and User Study of the Application

After developing the app, we deployed it among 12 mothers and midwives
from indigenous communities. We tried to find out how they adapted to our application
and how they utilized its contents. The indigenous mothers liked the concept of
monitoring their children’s growth with a smartphone app. They also provided posi-
tive opinions about the video contents of the app. They think the animated contents
are more effective than the textual contents. However, they struggled in terms of
navigation. The participants of our study were mostly illiterate or low literate. For

Fig. 2 List of activities in a category

them, it was difficult to understand the content of the homepage of the application.
On the homepage, the name of a category is written in Bangla, and an icon of a baby
is added with each category. Indigenous mothers and midwives have limited knowl-
edge of Bangla. For them, the names of the categories were not comprehensible. To
resolve this issue, we implemented an icon-based strategy [17]. On our homepage,
we have selected an icon that represents the type of the category properly. Our goal
was to provide such icons, so that the indigenous mothers can easily understand the
nature of the category. Figure 4 shows the updated homepage of our application.
After changing the homepage, we returned to the participants of our previous
user study. This time, the homepage was comprehensible to them. The icons helped
them understand the contents of the application better. Implementing such a
strategy can make a smartphone application acceptable to indigenous and other
less tech-savvy communities.

Fig. 3 Interactive activity page

6 Conclusion

Our deployment of the application makes it clear that traditional smartphone
application design strategies are not suitable for a less tech-savvy target audience.
Indigenous and ethnic communities represent only a small fraction of the stakeholders
of traditional software development, so their needs and expectations are often
overlooked during smartphone application development. As a result, people from
minority ethnic groups fall behind in utilizing digital interventions. For their
digital inclusion, it is crucial to define design principles appropriate for them.
These communities are deprived of many basic human rights in Bangladesh, and it is
difficult for them to reach medical or expert help for their children. With our
application, we therefore wanted to support indigenous mothers. We plan to analyze
the design principles of our application more rigorously, and with a more refined
and complete version of the app, we will deploy it among a larger number of
indigenous mothers and midwives for a longer period.

Fig. 4 Updated icon-based homepage

Acknowledgements The Department of Anthropology of Shahjalal University of Science and
Technology, Bangladesh, helped us immensely in studying the indigenous communities’
expectations and requirements in smartphone applications. The translators who worked with
us gave exceptional efforts in establishing communication with the participants of our
research. The indigenous mothers who took part in our study were very helpful and willing
to share their opinions. We got tremendous support from the indigenous communities while
conducting FGD and KII.

References

1. H. Oinas-Kukkonen, V. Kurkela, Developing successful mobile applications. J. Comput.
Sci. Technol. (JCST) (2003)
2. S. Ickin, K. Wac, M. Fiedler, L. Janowski, J. Hong, A.K. Dey, Factors influencing quality of
experience of commonly used mobile applications. IEEE Commun. Mag. 50(4), 48–56 (2012).
https://doi.org/10.1109/MCOM.2012.6178833
3. F. Gündüz, A.-S. Pathan, On the key factors of usability in small-sized mobile touch-screen
application. Int. J. Multimedia Ubiquitous Eng. 8, 115–138 (2013)
4. C.Y. Wong, C.W. Khong, K. Chu, Interface design practice and education towards mobile apps
development. Procedia Soc. Behav. Sci. 51, 12 (2012). https://doi.org/10.1016/j.sbspro.2012.
08.227
5. M.A. Ahmed, M.N. Islam, F. Jannat, Z. Sultana, Towards developing a mobile application
for illiterate people to reduce digital divide. in 2019 International Conference on Computer
Communication and Informatics (ICCCI). (Coimbatore, Tamil Nadu, India, 2019), pp. 1–5.
https://doi.org/10.1109/ICCCI.2019.8822036
6. S. Mandal, V.E. Balas, R.N. Shaw, A. Ghosh, Prediction analysis of idiopathic pulmonary
fibrosis progression from OSIC dataset. in 2020 IEEE International Conference on Computing,
Power and Communication Technologies (GUCON). (Greater Noida, India, 2020), pp. 861–865.
https://doi.org/10.1109/GUCON48875.2020.9231239
7. M. Kumar, V.M. Shenbagaraman, R.N. Shaw, A. Ghosh, Predictive data analysis for energy
management of a smart factory leading to sustainability. in Innovations in Electrical and Elec-
tronic Engineering, ed by M. Favorskaya, S. Mekhilef, R. Pandey, N. Singh. Lecture Notes
in Electrical Engineering, vol. 661. (Springer, Singapore, 2021). https://doi.org/10.1007/978-
981-15-4692-1_58
8. S. Mandal, S. Biswas, V.E. Balas, R.N. Shaw, A. Ghosh, Motion prediction for autonomous
vehicles from Lyft dataset using deep learning. in 2020 IEEE 5th International Conference on
Computing Communication and Automation (ICCCA). (Greater Noida, India, 2020), pp. 768–
773. https://doi.org/10.1109/ICCCA49541.2020.9250790
9. Y. Belkhier, A. Achour, R.N. Shaw, Fuzzy passivity-based voltage controller strategy of grid-
connected PMSG-based wind renewable energy system. in 2020 IEEE 5th International Confer-
ence on Computing Communication and Automation (ICCCA). (Greater Noida, India, 2020),
pp. 210–214. https://doi.org/10.1109/ICCCA49541.2020.9250838
10. R.N. Shaw, P. Walde, A. Ghosh, IOT based MPPT for performance improvement of solar
PV arrays operating under partial shade dispersion. in 2020 IEEE 9th Power India Interna-
tional Conference (PIICON). (SONEPAT, India, 2020), pp. 1-4. https://doi.org/10.1109/PII
CON49524.2020.9112952
11. R.N. Shaw, P. Walde, A. Ghosh, A new model to enhance the power and performances of 4×
4 PV arrays with puzzle shade dispersion. Int. J. Innov. Technol. Explor. Eng. 8(12), 456–465
(2019). https://doi.org/10.35940/ijitee.L3338.1081219
12. M. Saxena, R.N. Shaw, J.K. Verma, A novel hash-based mutual RFID tag authentication
protocol. in Advances in Intelligent Systems and Computing, vol. 847. (Springer Verlag, 2019),
pp. 1–12. https://doi.org/10.1007/978-981-13-2254-9_1
13. R.N. Shaw, D. Basu, P. Walde, A. Ghosh, Effects of solar irradiance on load sharing of integrated
photovoltaic system with IEEE standard bus network. Int. J. Eng. Adv. Technol. 9(1), 424–429
(2019). https://doi.org/10.35940/ijeat.A3188.109119
14. M.I.P. Nasution, S. Dewi Andriana, P. Diana Syafitri, E. Rahayu, M.R. Lubis, Mobile device
interfaces illiterate. in 2015 International Conference on Technology, Informatics, Manage-
ment, Engineering & Environment (TIME-E). (Samosir, 2015), pp. 117–120. https://doi.org/
10.1109/TIME-E.2015.7389758
15. T. Alam, M.M. Hamid, M.F. Rabbi, An approach to design and develop UX/UI for smartphone
applications of minority ethnic group. in TENCON 2019 - 2019 IEEE Region 10 Conference
(TENCON). (Kochi, India, 2019), pp. 1357–1362. https://doi.org/10.1109/TENCON.2019.892
9623
16. N. Ibrahim, W.F. Wan Ahmad, A. Shafie, Multimedia mobile learning application for children’s
education: The development of MFolktales. Asian Soc. Sci. 11(24) (2015). https://doi.org/10.5539/
ass.v11n24p203
17. V. Mcculloch, S. Hope, B. Loranger, P. Rea, Children and mobile applications: How to effec-
tively design and create a concept mobile application to aid in the management of type 1
diabetes in adolescents. Int. Technol. Educ. Dev. Conf. 03, 6045–6053 (2016). https://doi.org/
10.21125/inted.2016.0436
