
Lecture Notes in Networks and Systems 339

Nikhil Marriwala
C. C. Tripathi
Shruti Jain
Dinesh Kumar   Editors

Mobile Radio
Communications
and 5G Networks
Proceedings of Second MRCN 2021
Lecture Notes in Networks and Systems

Volume 339

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences,
Warsaw, Poland

Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA,
School of Electrical and Computer Engineering—FEEC, University of Campinas—
UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering,
Bogazici University, Istanbul, Turkey
Derong Liu, Department of Electrical and Computer Engineering, University
of Illinois at Chicago, Chicago, USA
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of
Alberta, Alberta, Canada
Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering,
KIOS Research Center for Intelligent Systems and Networks, University of Cyprus,
Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong,
Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest
developments in Networks and Systems—quickly, informally and with high quality.
Original research reported in proceedings and post-proceedings represents the core
of LNNS.
Volumes published in LNNS embrace all aspects and subfields of, as well as new
challenges in, Networks and Systems.
The series contains proceedings and edited volumes in systems and networks,
spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor
Networks, Control Systems, Energy Systems, Automotive Systems, Biological
Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems,
Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems,
Robotics, Social Systems, Economic Systems, and others. Of particular value to both
the contributors and the readership are the short publication timeframe and
the world-wide distribution and exposure which enable both a wide and rapid
dissemination of research output.
The series covers the theory, applications, and perspectives on the state of the art
and future developments relevant to systems and networks, decision making, control,
complex processes and related areas, as embedded in the fields of interdisciplinary
and applied sciences, engineering, computer science, physics, economics, social, and
life sciences, as well as the paradigms and methodologies behind them.
Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago.
All books published in the series are submitted for consideration in Web of Science.
For proposals from Asia please contact Aninda Bose (aninda.bose@springer.com).

More information about this series at https://link.springer.com/bookseries/15179


Nikhil Marriwala · C. C. Tripathi · Shruti Jain ·
Dinesh Kumar
Editors

Mobile Radio
Communications and 5G
Networks
Proceedings of Second MRCN 2021
Editors

Nikhil Marriwala
Department of Electronics and Communication Engineering
University Institute of Engineering and Technology
Kurukshetra, Haryana, India

C. C. Tripathi
University Institute of Engineering and Technology
Kurukshetra, Haryana, India

Shruti Jain
Department of Electronics and Communication Engineering
Jaypee University of Information Technology
Solan, Himachal Pradesh, India

Dinesh Kumar
Department of Electrical and Computer System Engineering
RMIT University
Melbourne, VIC, Australia

ISSN 2367-3370 ISSN 2367-3389 (electronic)


Lecture Notes in Networks and Systems
ISBN 978-981-16-7017-6 ISBN 978-981-16-7018-3 (eBook)
https://doi.org/10.1007/978-981-16-7018-3

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface

This conference provides a platform and aid to researchers involved in designing
systems that permit the societal acceptance of ambient intelligence. The overall
goal of this conference is to present the latest snapshot of the ongoing research
as well as to shed further light on future directions in this space. The conference
aims to serve research professionals from academia and industry who are working
to improve the quality of life of the general public through recent advances and
upcoming technologies: cellular systems, 2G/2.5G/3G/4G/5G and beyond, LTE,
WiMAX, WMAN, and other emerging broadband wireless networks; WLAN, WPAN,
and other home/personal networking technologies; pervasive and wearable computing
and networking; small cells and femtocell networks; wireless mesh networks;
vehicular wireless networks; cognitive radio networks and their applications; wireless
multimedia networks; green wireless networks; standardization activities of emerging
wireless technologies; and power management, signal processing, and energy
conservation techniques. This conference supports researchers involved in designing
decision support systems that will permit the societal acceptance of ambient
intelligence, and it presents the latest research being conducted on diverse topics in
intelligent technologies to advance knowledge and applications in this rapidly evolving
field. The conference is seen as a turning point in developing the quality of human life
and performance in the future, and this has therefore been identified as the theme of the
conference. Authors were invited to submit papers presenting novel technical studies as
well as position and vision papers comprising hypothetical/speculative scenarios.
5G technology is a truly revolutionary paradigm shift, enabling multimedia
communications between people and devices from any location. It also underpins
exciting applications such as sensor networks, smart homes, telemedicine, and auto-
mated highways. This book will provide a comprehensive introduction to the under-
lying theory, design techniques, and analytical tools of 5G and wireless communica-
tions, focusing primarily on the core principles of wireless system design. The book
will begin with an overview of wireless systems and standards. The characteristics of
the wireless channel are then described, including their fundamental capacity limits.
Various modulation, coding, and signal processing schemes are then discussed in


detail, including state-of-the-art adaptive modulation, multicarrier, spread spectrum,
and multiple antenna techniques. The book will be a valuable reference for
engineers in the wireless industry and will be extremely valuable not only to graduate
students pursuing research in wireless systems but also to engineering professionals
who have the task of designing and developing future 5G wireless technologies.
Every received manuscript was first checked for plagiarism and then sent to three
reviewers. The committee members were involved in this process, and the whole
process was monitored and coordinated by the general chair. The Technical Program
Committee comprised senior academicians and researchers from various reputed
institutes, from India as well as abroad. A total of 260 research papers were received,
out of which 51 papers were accepted, registered, and presented during the three-day
conference, an acceptance ratio of 19.5%.
An overwhelming response was received from researchers, academicians, and
industry from all over the globe. Papers were received from across India, from
places such as Guwahati, Kerala, Tamil Nadu, Raipur, Pune, Hyderabad, Rajasthan,
Uttarakhand, Ranchi, Uttar Pradesh, Punjab, Delhi, Himachal Pradesh, Andhra
Pradesh, etc. Authors from premier institutes such as IITs, NITs, Central Universities,
NSIT, PU, and many other reputed institutes participated in the conference.
The organizers of MRCN-2021 are thankful to the University Institute of Engineering
and Technology (UIET), which was established by Kurukshetra University in 2004 to
develop as a "Centre of Excellence," offer quality technical education, and undertake
research in engineering and technology. The ultimate aim of UIET is to become a
role model of engineering and technology education, not only for the state of Haryana
but the world over, to meet the challenges of the twenty-first century.
The editors would like to express their sincere gratitude to Patron of the Confer-
ence MRCN-2021 Prof. (Dr.) Som Nath Sachdeva, Honorable Vice-Chancellor,
Kurukshetra University, Kurukshetra; Prof. Dinesh Kant Kumar, Professor at
RMIT University, Melbourne; Prof. Schahram Dustdar, Professor of Computer
Science and Head Research Division of Distributed Systems at the TU Wien, Austria;
Dr. Utkarsh Srivastava, Western Michigan University, USA; general chairs; plenary
speakers; invited speakers; reviewers; Technical Program Committee members;
International Advisory Committee members; and Local Organizing Committee
members of MRCN-2021, without whose support the quality and standards of the
conference could not have been maintained. Special thanks to Springer and its team
for this valuable publication. Over and above, we would like to express our deepest sense
of gratitude to UIET, Kurukshetra University, Kurukshetra, for hosting this conference.
We are thankful to the Technical Education Quality Improvement Program
(TEQIP-III) for sponsoring the International Conference MRCN-2021 Event.

Kurukshetra, India    Nikhil Marriwala
Kurukshetra, India    C. C. Tripathi
Solan, India    Shruti Jain
Melbourne, Australia    Dinesh Kumar
Contents

Manual and Automatic Control of Appliances Based on Integration
of WSN and IOT Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
S. Karthikeyan, Adusumalli Nishanth, Annaa Praveen, Vijayabaskar,
and T. Ravi
Face Recognition System Based on Convolutional Neural Network
(CNN) for Criminal Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Ankit Gupta, Deepika Punj, and Anuradha Pillai
A Novel Support Vector Machine-Red Deer Optimization
Algorithm for Enhancing Energy Efficiency of Spectrum Sensing
in Cognitive Radio Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Vikas Srivastava and Indu Bala
Deceptive Product Review Identification Framework Using
Opinion Mining and Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Minu Susan Jacob and P. Selvi Rajendran
Detecting Six Different Types of Thyroid Diseases Using Deep
Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
R. Mageshwari, G. Sri Kiruthika, T. Suwathi, and Tina Susan Thomas
Random Forest Classifier Used for Modelling and Classification
of Herbal Plants Considering Different Features Using Machine
Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Priya Pinder Kaur and Sukhdev Singh
EEG Based Emotion Classification Using Xception Architecture . . . . . . . 95
Arpan Phukan and Deepak Gupta
Automated Discovery and Patient Monitoring of nCOVID-19:
A Multicentric In Silico Rapid Prototyping Approach . . . . . . . . . . . . . . . . . 109
Sharduli, Amit Batra, and Kulvinder Singh

Estimation of Location and Fault Types Detection Using Sequence
Current Components in Distribution Cable System Using ANN . . . . . . . . 119
Garima Tiwari and Sanju Saini
LabVIEW Implemented Smart Security System Using National
Instruments myRIO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
S. Krishnaveni, M. Harsha Priya, and P. A. Harsha Vardhini
Design and Analysis of Modified Sense Amplifier-Based 6/3T
SRAM Using CMOS 45 nm Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Chilumula Manisha and Velguri Suresh Kumar
Design and Implementation of Low Power GDI-LFSR
at Low-Static Power Consumption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Velguri Suresh Kumar and Tadisetty Adithya Venkatesh
Lion Optimization Algorithm for Antenna Selection in Cognitive
Radio Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Mehak Saini and Surender K. Grewal
Sub-band Selection-Based Dimensionality Reduction Approach
for Remote Sensing Hyperspectral Images . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
S. Manju and K. Helenprabha
An Efficient Network Intrusion Detection System Based on Feature
Selection Using Evolutionary Algorithm Over Balanced Dataset . . . . . . . 179
Manisha Rani and Gagandeep
A Novel Hybrid Imputation Method to Predict Missing Values
in Medical Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Pooja Rani, Rajneesh Kumar, and Anurag Jain
Optimal LQR and Smith Predictor Based PID Controller Design
for NMP System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
Divya Pandey, Shekhar Yadav, and Satanand Mishra
Detection of Cracks in Surfaces and Materials Using Convolutional
Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
R. Venkatesh, K. Vignesh Saravanan, V. R. Aswin, S. Balaji,
K. Amudhan, and S. Rajakarunakaran
Performance Analysis of TDM-PON and WDM-PON . . . . . . . . . . . . . . . . . 243
Raju Sharma, Monika Pathak, and Anuj Kumar Gupta
Detection of Microcalcifications in Mammograms Using Edge
Detection Operators and Clustering Algorithms . . . . . . . . . . . . . . . . . . . . . . 255
Navneet Kaur Mavi and Lakhwinder Kaur
Analysis of the Proposed CNN Model for the Recognition
of Gurmukhi Handwritten City Names of Punjab . . . . . . . . . . . . . . . . . . . . 267
Sandhya Sharma, Sheifali Gupta, Neeraj Kumar, and Himani Chugh
Modeling and Analysis of Positive Feedback Adiabatic Logic
CMOS-Based 2:1 Mux and Full Adder and Its Power Dissipation
Consideration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
Manvinder Sharma, Pankaj Palta, Digvijay Pandey, Sumeet Goyal,
Binay Kumar Pandey, and Vinay Kumar Nassa
A Smart Healthcare System Based on Classifier DenseNet 121
Model to Detect Multiple Diseases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Mohit Chhabra and Rajneesh Kumar
An Effective Algorithm of Remote Sensing Image Fusion Based
on Discrete Wavelet Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
Richa, Karamjit Kaur, and Priti Singh
QoS-Based Load Balancing in Fog Computing . . . . . . . . . . . . . . . . . . . . . . . 331
Shilpi Harnal, Gaurav Sharma, and Ravi Dutt Mishra
DoMT: An Evaluation Framework for WLAN Mutual
Authentication Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
Pawan Kumar and Dinesh Kumar
Multiword Expression Extraction Using Supervised ML for Dogri
Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
Shubhnandan S. Jamwal, Parul Gupta, and Vijay Singh Sen
Handwritten Text to Braille for Deaf-Blinded People Using Deep
Neural Networks and Python . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
T. Parthiban, D. Reshmika, N. Lakshmi, and A. Ponraj
Two Element Annular Ring MIMO Antenna for 2.4/5.8 GHz
Frequency Band . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
Kritika Singh, Shadab Azam Siddique, Sanjay Kumar Soni,
Akhilesh Kumar, and Brijesh Mishra
A Secure Hybrid and Robust Zone Based Routing Protocol
for Mobile Ad Hoc Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
Vishal Gupta, Hemant Sethi, and Surinder Pal Singh
Multimedia Content Mining Based on Web Categorization
(MCMWC) Using AlexNet and Ensemble Net . . . . . . . . . . . . . . . . . . . . . . . . 415
Bhavana and Neeraj Raheja
DDSS: An AI Powered System for Driver Safety . . . . . . . . . . . . . . . . . . . . . . 429
Neera Batra and Sonali Goyal
Improving Performance and Quality of Service in 5G Technology . . . . . . 439
Stuti Jain, Ritika Jain, Jyotika Sharma, and Ashish Sharma
Machine Learning-Based Ensemble Classifier Using Naïve
Bayesian Tree with Logit Regression for the Prediction
of Parkinson's Disease . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
Arunraj Gopalsamy and B. Radha
Analysis of Radiation Parameters of Kalasam-Shaped Antenna . . . . . . . . 471
Samyuktha Saravanan and K. Gunavathi
Political Sentiment Analysis: Case Study of Haryana Using
Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
Dharminder Yadav, Avinash Sharma, Sharik Ahmad,
and Umesh Chandra
Efficient Key Management for Secure Communication Within
Tree and Mesh-Based Multicast Routing Protocols . . . . . . . . . . . . . . . . . . . . 501
Bhawna Sharma and Rohit Vaid
Detection of Signature-Based Attacks in Cloud Infrastructure
Using Support Vector Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
B. Radha and D. Sakthivel
A Modified Weighed Histogram Approach for Image Enhancement
Using Optimized Alpha Parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521
Vishal Gupta, Monish Gupta, and Nikhil Marriwala
Spectrum Relay Performance of Cognitive Radio between Users
with Random Arrivals and Departures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
V. Sreelatha, E. Mamatha, C. S. Reddy, and P. S. Rajdurai
Energy Efficient Void Avoidance Routing for Reduced Latency
in Underwater WSNs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
Swati Gupta and N. P. Singh
Sentiment Analysis Using Learning Techniques . . . . . . . . . . . . . . . . . . . . . . 559
A. Kathuria and A. Sharma
Design of Novel Compact UWB Monopole Antenna on a Low-Cost
Substrate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 583
P. Dalal and S. K. Dhull
Multi-Objective Genetic Algorithm for Job Planning in Cloud
Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 591
Neha Dutta and Pardeep Cheema
Facial Expression Recognition Using Convolutional Neural
Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
Vandana and Nikhil Marriwala
FPGA Implementation of Elliptic Curve Point Multiplication Over
Galois Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 619
S. Hemambujavalli, P. Nirmal Kumar, Deepa Jose, and S. Anthoniraj
Blockchain Usage in Theoretical Structure for Enlarging
the Protection of Computerised Scholarly Data . . . . . . . . . . . . . . . . . . . . . . . 635
Preeti Sharma and V. K. Srivastava
A Framework of a Filtering and Classification Techniques
for Enhancing the Accuracy of a Hybrid CBIR System . . . . . . . . . . . . . . . . 645
Bhawna Narwal, Shikha Bhardwaj, and Shefali Dhingra
Microgrid Architecture with Hybrid Storage Elements and Its
Control Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657
Burri Ankaiah, Sujo Oommen, and P. A. Harsha Vardhini
A Robust System for Detection of Pneumonia Using Transfer
Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 667
Apoorv Vats, Rashi Singh, Ramneek Kaur Khurana, and Shruti Jain
Reduce Overall Execution Cost (OEC) of Computation Offloading
in Mobile Cloud Computing for 5G Network Using Nature
Inspired Computing (NIC) Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 679
Voore Subba Rao and K. Srinivas

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 691


Editors and Contributors

About the Editors

Dr. Nikhil Marriwala received his Ph.D. from NIT Kurukshetra in the Electronics and
Communication Department. He received his postgraduate degree in Electronics and
Communication Engineering from IASE University, Sardarshahar, Rajasthan, and his
B.Tech. in Electronics and Instrumentation from MMEC, Mullana, affiliated to
Kurukshetra University, Kurukshetra. He holds additional charge of the Training and
Placement Office, UIET, Kurukshetra University, Kurukshetra, and has headed the
T&P cell for more than 7 years. He was also the M.Tech. Coordinator of ECE at UIET,
KUK, for more than 3 years. He has more than 16 years of experience teaching
graduate and postgraduate students. More than 31 students have completed their
M.Tech. dissertations under his guidance, and 2 students are presently working with
him. He has more than 30 publications to his credit in national and international
journals, with 5 publications in reputed SCI and Scopus international journals. He
also has one patent published to his credit. He has been Chairman of Special
Sessions in more than 5 international/national conferences. His areas of interest
are Software Defined Radios, Wireless Communication, Fuzzy System Design, and
Microprocessors.

Prof. C. C. Tripathi received his Ph.D. (Electronics) from Kurukshetra University,
Kurukshetra. Since 2016, he has been working as Director, University Institute of
Engineering and Technology (an autonomous institute), Kurukshetra University,
Kurukshetra. The institute has more than 75 faculty members, over 150 non-teaching
technical, laboratory, and administrative staff, and over 1600 students. As Director,
he also heads institute academic bodies such as the board of studies and the academic
council, with four UG and 8 PG programs, and spearheads research in the institute's
various engineering and applied sciences departments. Microelectronics, RF MEMS
for communication, and industrial consultancy are his specialization areas. He has
developed a micro-fabrication R&D lab and an RF MEMS R&D lab. He is a member
of more than 14 professional bodies. He has published more than 80 papers in
conferences and journals and has also filed one patent. He has implemented TEQIP-II
grants of Rs. 10.00 crores by preparing the Institution Development Plan (IDP).

Shruti Jain is an Associate Professor in the Department of Electronics and
Communication Engineering at Jaypee University of Information Technology,
Waknaghat, Himachal Pradesh, India, and has received her Doctor of Science (D.Sc.)
in Electronics and Communication Engineering. She has a teaching experience of
around 15 years. She has filed three patents, of which one has been granted and one
published. She has published more than 9 book chapters and 100 research papers in
reputed indexed journals and international conferences. She has also published six
books. She has completed two government-sponsored projects. She has guided 6
Ph.D. students and now has 2 registered students. She has also guided 11 M.Tech.
scholars and more than 90 B.Tech. undergraduates. Her research interests are Image
and Signal Processing, Soft Computing, Bio-inspired Computing, and Computer-Aided
Design of FPGA and VLSI circuits. She is a senior member of IEEE, life member
and Editor-in-Chief of the Biomedical Engineering Society of India, and a member of
IAENG. She is a member of the editorial boards of many reputed journals, a reviewer
for many journals, and a member of the TPC of different conferences. She received
the Nation Builder Award in 2018–19.

Prof. Dinesh Kumar, B.Tech. from IIT Madras and Ph.D. from IIT Delhi, is a
Professor at RMIT University, Melbourne, Australia. He has published over 400
papers, authored 5 books, and is on a range of Australian and international committees
for biomedical engineering. His passion is affordable diagnostics and making a
difference for his students. His work has been cited over 5600 times, and he has had
multiple successes with technology translation. He is a member of the Therapeutic
Goods Administration (TGA), Ministry of Health (Australia), for medical devices.
He is also on the editorial boards of IEEE Transactions on Neural Systems and
Rehabilitation Engineering and Biomedical Signals and Controls. He has been the
chair of a large number of conferences and has given over 50 keynote speeches.

Contributors

Sharik Ahmad Computer Science Department, Glocal University, Saharanpur, UP,
India
K. Amudhan Department of Mechanical Engineering, Mepco Schlenk Engineering
College, Sivakasi, India
Burri Ankaiah REVA University, Bengaluru, Karnataka, India
S. Anthoniraj Department of ECE, MVJ College of Engineering, Bangalore, India
V. R. Aswin Department of Mechanical Engineering, Ramco Institute of Tech-
nology, Rajapalayam, India
Indu Bala Lovely Professional University, Jalandhar, India
S. Balaji Department of Mechanical Engineering, Ramco Institute of Technology,
Rajapalayam, India
Amit Batra Department of CSE, University Institute of Engineering and Tech-
nology, Kurukshetra University, Kurukshetra, Haryana, India
Neera Batra Department of CSE, Maharishi Markandeshwar Engineering College,
Maharishi Markandeshwar (Deemed To Be University) Mullana, Ambala, Haryana,
India
Shikha Bhardwaj Department of Electronics and Communication Engineering,
University Institute of Engineering and Technology, Kurukshetra University, Kuruk-
shetra, Haryana, India
Bhavana CSE Department, Maharishi Markandeshwar (Deemed To Be University),
Mullana, India
Umesh Chandra Computer Science, Banda University of Agriculture and Tech-
nology, Banda, UP, India
Pardeep Cheema Acet Eternal University Baru Sahib, Baru Sahib, Himachal
Pradesh, India
Mohit Chhabra CSE Department, MMEC, Maharishi Markandeshwar Deemed to
be University, Mullana, Ambala, Haryana, India
Himani Chugh Chandigarh Group of Colleges, Ajitgarh, India
P. Dalal Guru Jambheshwar University of Science and Technology, Hisar, India
Shefali Dhingra Department of Electronics and Communication Engineering,
University Institute of Engineering and Technology, Kurukshetra University, Kuruk-
shetra, Haryana, India
S. K. Dhull Guru Jambheshwar University of Science and Technology, Hisar, India
Neha Dutta Acet Eternal University Baru Sahib, Baru Sahib, Himachal Pradesh,
India
Gagandeep Department of Computer Science, Punjabi University, Patiala, India
Arunraj Gopalsamy Sree Saraswathi Thyagaraja College, Pollachi, Tamil Nadu,
India;
IBM, Bangalore, Karnataka, India
Sonali Goyal Department of CSE, Maharishi Markandeshwar Engineering College,
Maharishi Markandeshwar (Deemed To Be University) Mullana, Ambala, Haryana,
India
Sumeet Goyal Department of Applied Science, Chandigarh Group of Colleges,
Landran, Punjab, India
Surender K. Grewal D. C. R. University of Science and Technology, Murthal,
India
K. Gunavathi PSG College of Technology, Coimbatore, India
Ankit Gupta Department of Computer Engineering, J.C. Bose University of
Science & Technology, YMCA, Faridabad, Haryana, India
Anuj Kumar Gupta Chandigarh Group of Colleges, Mohali, Punjab, India
Deepak Gupta Department of Computer Science and Engineering, National Insti-
tute of Technology, Jote, Arunachal Pradesh, India
Monish Gupta Electronics and Communication Engineering Department, Univer-
sity Institute of Engineering and Technology, Kurukshetra University, Kurukshetra,
India
Parul Gupta PGDCSIT, University of Jammu, Jammu, Jammu & Kashmir, India
Sheifali Gupta Chitkara University Institute of Engineering and Technology,
Chitkara University, Chandigarh, Punjab, India
Swati Gupta Department of Electronics and Communication Engineering,
National Institute of Technology, Kurukshetra, India;
Panipat Institute of Engineering and Technology, Panipat, India
Vishal Gupta Department of Computer Science and Engineering, MMEC,
MM(DU), Mullana, Ambala, India;
University Institute of Engineering and Technology, Kurukshetra University, Kuruk-
shetra, India
Shilpi Harnal Chitkara University Institute of Engineering and Technology,
Chitkara University, Chandigarh, Punjab, India
M. Harsha Priya Department of ECE, CMR College of Engineering and Tech-
nology, Hyderabad, Telangana, India
P. A. Harsha Vardhini Department of ECE, Vignan Institute of Technology and
Science, Deshmukhi, Telangana, India
K. Helenprabha Department of Electronics and Communication Engineering,
R.M.D. Engineering College, Kavaraipettai, India
S. Hemambujavalli Department of ECE, CEG, Anna University, Chennai, Tamil
Nadu, India
Minu Susan Jacob Department of Computer Science and Engineering, KCG
College of Technology, Chennai, India
Anurag Jain School of Computer Sciences, University of Petroleum and Energy
Studies, Dehradun, Uttarakhand, India
Ritika Jain Department of Computer Science and Engineering, Maharaja Agrasen
Institute of Technology, New Delhi, India
Shruti Jain Department of Electronics and Communication Engineering, Jaypee
University of Information Technology, Solan, Himachal Pradesh, India
Stuti Jain Department of Computer Science and Engineering, Maharaja Agrasen
Institute of Technology, New Delhi, India
Shubhnandan S. Jamwal PGDCSIT, University of Jammu, Jammu, Jammu &
Kashmir, India
Deepa Jose Department of ECE, KCG College of Technology, Chennai, Tamil
Nadu, India
S. Karthikeyan Department of ECE, Sathyabama Institute of Science and Tech-
nology, Chennai, India
A. Kathuria Department of Computer Science and Engineering, Maharishi
Markandeshwar Engineering College, Maharishi Markandeshwar (Deemed to be
University), Haryana, India
Karamjit Kaur Department of Electronics and Communication Engineering,
Amity School of Engineering and Technology, Amity University, Gurugram, India
Lakhwinder Kaur Department of Computer Science and Engineering, Punjabi
University, Patiala, Punjab, India
Priya Pinder Kaur Punjabi University, Patiala, Punjab, India
Ramneek Kaur Khurana Department of Biotechnology, Jaypee University of
Information Technology, Solan, Himachal Pradesh, India
S. Krishnaveni Department of ECE, CMR College of Engineering and Technology,
Hyderabad, Telangana, India
Akhilesh Kumar Madan Mohan Malaviya University of Technology, Gorakhpur,
Uttar Pradesh, India
Dinesh Kumar Department of Information Technology, D.A.V. Institute of Engi-
neering and Technology, Jalandhar, India
Neeraj Kumar Chitkara University Institute of Engineering and Technology,
Chitkara University, Solan, Himachal Pradesh, India
Pawan Kumar Department of Computer Science, DAV College, Bathinda, India
Rajneesh Kumar Department of Computer Engineering, MMEC, Maharishi
Markandeshwar (Deemed to be University), Mullana, Ambala, Haryana, India
N. Lakshmi Department of Electronics and Communication Engineering, Easwari
Engineering College, Chennai, India
R. Mageshwari Department of Information Technology, KCG College of Technology,
Chennai, India
E. Mamatha School of Science, GITAM University, Bangalore, India
Chilumula Manisha Department of ECE, Maturi Venkata Subba Rao Engineering
College, Telangana State, Hyderabad, India
S. Manju Department of Medical Electronics, Saveetha Engineering College,
Thandalam, Chennai, India
Nikhil Marriwala Electronics and Communication Engineering Department,
University Institute of Engineering and Technology, Kurukshetra University, Kuruk-
shetra, India
Navneet Kaur Mavi Department of Computer Science and Engineering, Punjabi
University, Patiala, Punjab, India
Brijesh Mishra Madan Mohan Malaviya University of Technology, Gorakhpur,
Uttar Pradesh, India;
Shambhunath Institute of Engineering and Technology, Prayagraj, India
Ravi Dutt Mishra Seth Jai Parkash Mukand Lal Institute of Engineering and
Technology, Radaur, India
Satanand Mishra AcSIR & CSIR—Advanced Materials and Processes Research
Institute (AMPRI), Bhopal, India
Bhawna Narwal Department of Electronics and Communication Engineering,
University Institute of Engineering and Technology, Kurukshetra University, Kuruk-
shetra, Haryana, India
Vinay Kumar Nassa Department of Computer Science Engineering, South Point
Group of Institutions, Sonepat, Haryana, India
P. Nirmal Kumar Department of ECE, CEG, Anna University, Chennai, Tamil
Nadu, India
Adusumalli Nishanth Department of ECE, Sathyabama Institute of Science and
Technology, Chennai, India
Sujo Oommen REVA University, Bengaluru, Karnataka, India
Pankaj Palta Department of ECE, Chandigarh Group of Colleges, Landran,
Punjab, India
Binay Kumar Pandey Department of IT, College of Technology, Govind Ballabh
Pant University of Agriculture and Technology, Pantnagar, Uttarakhand, India
Digvijay Pandey Department of Technical Education, IET, Dr. A. P. J. Abdul Kalam
Technical University, Lucknow, Uttar Pradesh, India
Divya Pandey Madan Mohan Malaviya University of Technology, Gorakhpur, U.P.,
India
T. Parthiban Department of Electronics and Communication Engineering, Easwari
Engineering College, Chennai, India
Monika Pathak Multani Mal Modi College, Patiala, Punjab, India
Arpan Phukan Department of Computer Science and Engineering, National Insti-
tute of Technology, Jote, Arunachal Pradesh, India
Anuradha Pillai Department of Computer Engineering, J.C. Bose University of
Science & Technology, YMCA, Faridabad, Haryana, India
A. Ponraj Department of Electronics and Communication Engineering, Easwari
Engineering College, Chennai, India
Annaa Praveen Department of ECE, Sathyabama Institute of Science and Tech-
nology, Chennai, India
Deepika Punj Department of Computer Engineering, J.C. Bose University of
Science & Technology, YMCA, Faridabad, Haryana, India
B. Radha Department of Information Technology, Sri Krishna Arts and Science
College, Coimbatore, Tamil Nadu, India
Neeraj Raheja CSE Department, Maharishi Markandeshwar (Deemed To Be
University), Mullana, India
S. Rajakarunakaran Department of Mechanical Engineering, Ramco Institute of
Technology, Rajapalayam, India
P. S. Rajdurai Department of Mathematics, Srinivasa Ramanujan Center,
SASTRA University, Kumbakonam, India
Manisha Rani Department of Computer Science, Punjabi University, Patiala, India
Pooja Rani MMICT&BM (A.P.), MMEC, Maharishi Markandeshwar (Deemed to
be University), Mullana, Ambala, Haryana, India
Voore Subba Rao Department of Physics and Computer Science, Dayalbagh
Educational Institute (DEI), Agra, India
T. Ravi Department of ECE, Sathyabama Institute of Science and Technology,
Chennai, India
C. S. Reddy Department of Mathematics, Cambridge Institute Technology - NC,
Bangalore, India
D. Reshmika Department of Electronics and Communication Engineering, Easwari
Engineering College, Chennai, India
Richa Department of Electronics and Communication Engineering, Amity School
of Engineering and Technology, Amity University, Gurugram, India
Mehak Saini D. C. R. University of Science and Technology, Murthal, India
Sanju Saini Department of Electrical Engineering, DCRUST Murthal, Sonipat,
Haryana, India
D. Sakthivel Department of Computer Science, Sree Saraswathi Thyagaraja
College, Pollachi, Tamil Nadu, India
Samyuktha Saravanan PSG College of Technology, Coimbatore, India
P. Selvi Rajendran Department of Computer Science and Engineering, KCG
College of Technology, Chennai, India
Vijay Singh Sen PGDCSIT, University of Jammu, Jammu, Jammu & Kashmir,
India
Hemant Sethi Department of Computer Science and Engineering, MMU, Sadopur,
Ambala, India
Sharduli Department of Biotechnology, University Institute of Engineering and
Technology, Kurukshetra University, Kurukshetra, Haryana, India
A. Sharma Department of Computer Science and Engineering, Maharishi
Markandeshwar Engineering College, Maharishi Markandeshwar (Deemed to be
University), Haryana, India
Ashish Sharma Department of Computer Science and Engineering, Maharaja
Agrasen Institute of Technology, New Delhi, India
Avinash Sharma Computer Science, Maharishi Markandeshwar Deemed To Be
University, Ambala, Haryana, India
Bhawna Sharma Department of Computer Science and Engineering, MMEC, MM
(Deemed To Be University), Mullana, Ambala, India
Gaurav Sharma Seth Jai Parkash Mukand Lal Institute of Engineering and
Technology, Radaur, India
Jyotika Sharma Department of Computer Science and Engineering, Maharaja
Agrasen Institute of Technology, New Delhi, India
Manvinder Sharma Department of ECE, Chandigarh Group of Colleges, Landran,
Punjab, India
Preeti Sharma Department of Computer Science, Baba Mastnath University,
Rohtak, Haryana, India
Raju Sharma Baba Banda Singh Bahadur Engineering College, Fatehgarh Sahib,
Punjab, India
Sandhya Sharma Chitkara University Institute of Engineering and Technology,
Chitkara University, Solan, Himachal Pradesh, India
Shadab Azam Siddique Madan Mohan Malaviya University of Technology,
Gorakhpur, Uttar Pradesh, India
Kritika Singh Madan Mohan Malaviya University of Technology, Gorakhpur,
Uttar Pradesh, India
Kulvinder Singh Faculty of CSE, Department of CSE, University Institute of
Engineering and Technology, Kurukshetra University, Kurukshetra, Haryana, India
N. P. Singh Department of Electronics and Communication Engineering, National
Institute of Technology, Kurukshetra, India
Rashi Singh Department of Computer Science Engineering, Jaypee University of
Information Technology, Solan, Himachal Pradesh, India
Sukhdev Singh M.M. Modi College, Patiala, Punjab, India
Surinder Pal Singh Department of Computer Science and Engineering, MMEC,
MM(DU), Mullana, Ambala, India
Priti Singh Department of Electronics and Communication Engineering, Amity
School of Engineering and Technology, Amity University, Gurugram, India
Sanjay Kumar Soni Madan Mohan Malaviya University of Technology,
Gorakhpur, Uttar Pradesh, India
V. Sreelatha School of Science, GITAM University, Bangalore, India
G. Sri Kiruthika Department of Information Technology, KCG College of Tech-
nology, Chennai, India
K. Srinivas Department of Electrical Engineering, Dayalbagh Educational Institute
(DEI), Agra, India
V. K. Srivastava Faculty of Engineering, Baba Mastnath University, Rohtak,
Haryana, India
Vikas Srivastava PSIT, Kanpur, India;
Lovely Professional University, Jalandhar, India
Velguri Suresh Kumar Department of ECE, Maturi Venkata Subba Rao Engi-
neering College, Hyderabad, Telangana, India
T. Suwathi Department of Information Technology, KCG College of Technology,
Chennai, India
Tina Susan Thomas Department of Information Technology, KCG College of
Technology, Chennai, India
Garima Tiwari Department of Electrical Engineering, DCRUST Murthal, Sonipat,
Haryana, India
Rohit Vaid Department of Computer Science and Engineering, MMEC, MM
(Deemed To Be University), Mullana, Ambala, India
Vandana University Institute of Engineering and Technology, Kurukshetra University,
Kurukshetra, India
Apoorv Vats Department of Computer Science Engineering, Jaypee University of
Information Technology, Solan, Himachal Pradesh, India
R. Venkatesh Department of Mechanical Engineering, Ramco Institute of Tech-
nology, Rajapalayam, India
Tadisetty Adithya Venkatesh Department of ECE, Maturi Venkata Subba Rao
Engineering College, Hyderabad, Telangana, India
K. Vignesh Saravanan Department of Computer Science and Engineering, Ramco
Institute of Technology, Rajapalayam, India
Vijayabaskar Department of ECE, Sathyabama Institute of Science and Tech-
nology, Chennai, India
Dharminder Yadav Computer Science Department, Glocal University, Saha-
ranpur, UP, India
Shekhar Yadav Madan Mohan Malaviya University of Technology, Gorakhpur,
U.P., India
Manual and Automatic Control
of Appliances Based on Integration
of WSN and IOT Technology

S. Karthikeyan, Adusumalli Nishanth, Annaa Praveen, Vijayabaskar,
and T. Ravi

Abstract According to the Internet of Things vision, the future home, the so-called
Smart Home, will be a seamless integration of physical smart objects, interconnected
among themselves and with the surrounding environment. Moreover, since handsets
are often used to manage many aspects of people's lives, the capability to control and
monitor Smart Homes from them will become a requirement. The purpose of the home
automation system is to monitor parameters such as voltage, current, and temperature
using a wireless network architecture and to display them on a screen. The main aim
is to reduce a smart home's excessive energy consumption, and the system also helps
improve control performance. The aim of this project is to design a well-thought-out
intra-smart-home system that allows the consumer to monitor all of their electric and
electronic devices from any mobile device. The project's applications include features
such as screen surveillance, light and fan control, fire warning, and greenhouse service.
The sensors are linked to the microcontroller, which sends the sensors' status to the
user's email address. The Arduino is used to interface with the devices, and a Wi-Fi
module is connected to the Arduino to obtain an IP address from a router. With the use
of WSN, this research framework provides a solution for highly accurate monitoring
and scheduling of the current state of equipment.

Keywords Home Automation · Arduino · Bluetooth · WSN · Smartphone

1 Introduction

A Wireless Sensor Network (WSN) is a networked arrangement of distributed
autonomous nodes that monitor physical or environmental conditions using sensors.
The WSN is formed by combining self-organizing nodes with a gateway and switch.
According to some reports, home or building automation may use resources more
effectively than conventional systems. As a result, several researchers have advocated
the use of remote monitoring to reduce
energy consumption. In the literature, smart home designs based on the Wireless
Sensor Network (WSN) [1] have also been suggested. Because of its flexibility and
low power consumption, the WSN has been widely adopted in control and monitoring
applications in place of Wi-Fi. Human supervision requires proper details and
knowledge of the environment and its general situation. Monitoring systems are used
to obtain data about structures and their ambient characteristics in an automated or
dynamic manner to reduce energy consumption [2]. A sensor is a device that can
augment or replace one of the five human senses: sight, hearing, taste, smell, and
touch. A sensor network has been shown to be versatile and effective in a variety of
settings, including home automation, industrial engineering, and security monitoring,
among others. A wireless sensor network [3] uses spatially distributed autonomous
devices to measure and regulate quantities such as temperature, voltage, and power
budgets. Because of the transparency and ease offered by computer connectivity and
the Internet, demand for home automation has been rising remarkably in recent years.
A home automation system connects the electronic components in a residence. The
methods used in home automation cover the management of local tasks such as
entertainment setup, houseplant and yard watering, and pet care, among other things.
Devices can be connected through a home network and granted web access so they
can be operated from a PC. The use of technology and control systems to reduce
human labour is referred to as automation. Rapid advancement in technology
encourages the use of mobile phones to control home equipment remotely. Automated
devices have the potential to operate with agility, consistency, and a low rate of errors.
For home automation researchers and companies, user support is a major concern.
When the client is away from the house, user convenience depends on automation
operating the intelligent home gadgets. Home automation technology allows a person
to monitor and control various aspects of their home remotely. A home appliance is a
system or tool designed to perform a particular function, usually a mechanical or
electrical unit for home use, such as a refrigerator. Home automation combines the
control of lighting, temperature, and equipment to provide convenience, functionality,
efficiency, and protection. Home automation can also substitute for manual effort for
disabled and elderly people. Home or building automation, combined with energy
management, makes living in today's world considerably easier. It includes centralized
control over all electrical and electronic devices in the household, accessible via
long-distance messaging. This arrangement allows unified control of lighting, water
heating, audio/video facilities, security mechanisms, kitchen machinery, and all the
other devices used in home frameworks. A smart home can be described as an
"advanced home" because it combines control of devices with intelligence. People
with developmental disabilities, the elderly, those with poor vision, the blind, and
mentally impaired people can all benefit from such an electronic system.

One of the primary goals of smart households is to reduce energy consumption.
Smart control systems can be used to accomplish this objective. Furthermore, lighting
management systems should take natural light (daylight) into account. A few studies
have shown that in industrial or administrative buildings, daylight can stand in for
much of the artificial lighting. Sensors and fast controls use daylight measurements to
reduce the amount of energy used to operate artificial light [4], while ensuring that a
room is adequately illuminated. Despite numerous proposals for intelligent lighting
for energy savings in smart houses, a precise electric lighting framework of strong
reliability is yet to be developed. The current arrangement's main goals are to broaden
the scope of automatic monitoring and reduce the effect of wireless interference on the
WSN data-collection interface of the smart home structure [5]. A distributed sensor
network is made up of a large number of short-range wireless sensor nodes, merging
two different forms of technology: sensing and long-distance connectivity. Before
submitting results to the controller, the WSN [6] senses, receives, and processes
information in its coverage region. The WSN is commonly used in control and
tracking applications due to its flexibility and light weight. The WSN in a smart
home [3] consists of distributed autonomous devices with extensive applications,
such as surveillance, military tracking, and intelligent motorised vehicles. Every node
in a WSN for the smart home [7] is deployed in a different place; the nodes are
battery powered, small in scale, and have good sensing capability. In this article, a
smart home current controller using a wireless sensor network has been developed
for household products. In the future, it can operate without the involvement of the
user. As a result, it is simple to securely monitor and manage the smart household
using web-based technologies.

2 Related Work

Pawar et al. [1] have designed a home automation system concept that includes
the Arduino ATmega328 microcontroller, a Wi-Fi module, and motion sensors. The
microcontroller is the core computing unit, which communicates with the Wi-Fi
module and accepts requests to view and control devices. A server manages the
communication between the application and the chip, as well as between the users
and the appliances. The platform for the IoT communication system is an Android
device that establishes a connection for the client to communicate with the processor.
The proposed setup involves a server, a client, and several communication networks,
all managed by connection tools.
Choudhary et al. [2] have suggested a flexible wireless home automation system
that uses Wi-Fi technology to link its distributed sensors to a home network server.
In their design, an Arduino Uno R3 microcontroller and a voltage regulator power
a variety of hardware and software modules. The sensors used in the system include
a PIR sensor (PIR-based motion detectors are used to discern the movement of
humans, pets, or other objects), a temperature and humidity sensor for measuring heat
and moisture content, and modules for identification and video monitoring [5].
Sagar et al. [3] propose building a wired home automation system using an
Arduino-compatible Galileo board with a built-in Wi-Fi card slot, together with
serial-communication sensors. The Galileo board functions as a web server, allowing
the control layout to be accessed from web applications on any PC in the same LAN
using the server IP, and from any PC or network globally using the network IP. Wi-Fi
is the networking technology used in this work. Temperature and humidity, building
security, fire and smoke detection, light level sensing, and on/off control for multiple
devices are all included in the proposed components [8].
Pooja et al. [4] describe a remote keyword-operated consumer appliance monitoring
program whose aim was to analyse data from the microcontroller, initialise the LCD
and Arduino display, and show the appliance status alongside the physical readings
on the LCD. The system is wired into the wall switches, and by using direct-current
devices, the danger of dangerous electrocution is kept to a minimum. The design
employs two user interfaces, one for the PC and the other for the mobile device. The
GUI may be used to determine the status of the appliances, such as whether they are
on or off. Any time the state of the hardware changes, a prompt notification appears
on the GUI. Once the device's wireless link is connected to the PC, the screen GUI
acts as an agent to forward information to both the mobile device and the main power
panel. If the wireless link between the PC and the power panel fails, the connection
may be restored by connecting via USB. The user can control and operate the devices
using IoT from any mobile location on the network.
Dhakad Kunal et al. [5] divide the system's components into three parts: a PCB,
a humidity sensor, and a microcontroller. On the PCB, the LPT port, transistors, and
diode shunt resistor are all connected, and two appliances, such as a machine and a
lamp, are attached to it. The Arduino is connected to the humidity sensor and can
distinguish between temperature and moisture. The PC is connected to the Arduino
and the PCB, and the Arduino and PCB can communicate with each other through
this link. The system measures the temperature and the amount of moisture in the
atmosphere at fixed intervals, consistently servicing temperature and humidity readings
at regular intervals. Advantages: (a) increases safety by controlling appliances;
(b) secures the home through web access; (c) increases usability by regulating the
temperature; (d) saves time; (e) reduces costs and adds comfort; (f) allows devices to
be controlled while the user is away [8].
Santhi et al. [6] capture how innovations in communication technology are used
to realise home automation through cloud-based information technology. Cloud-based
data registration helps in connecting with surrounding things, since one can easily
access the system at any time and in any location in a simple manner through
well-defined interfaces. As a result, the cloud will most likely serve as a gateway to
the IoT. Inspiring opportunities remain for building the organisation and connection
of mobile gadgets for home automation purposes over the Internet.
The authors of [7] depict the Internet of Things (IoT) as a scenario in which
objects, animals, and people are given unique identifiers and the ability to transfer
data over a network without requiring human interaction. The term "Internet of
Things" refers to a variety of capabilities and investigative tools that allow the Internet
to reach into the real world of physical objects. Contactless technologies such as
RFID and wireless sensor networks (WSNs) are all components of the IoT. IoT
technology and large networks can digitalize everyday activities by integrating basic
digital units, with the devices operated remotely via a website. The paper covers the
potential of WSN, IoT, and smart home architecture.

2.1 Existing System

Bluetooth, ZigBee, and Snapdragon-based software are currently used for home
automation. Such a system necessitates a time-consuming and expensive wired
installation as well as the use of an end PC. One approach organises a home control
system around gestures: the controller transmits hand gestures to the system using a
glove. Another uses a cell phone as a remote controller for home automation; that
system differs in that all communications are conducted over a fixed phone line
rather than over the network, and any mobile phone that supports dual-tone
multi-frequency (DTMF) signalling can be used to access it. Our project's existing
setup does not use the Internet of Things concept to control home appliances, even
though IoT is the fitting modern technology for monitoring devices from anywhere on
the globe. As a result, we have decided to use IoT engineering as the basis for our
smart home project.

3 Proposed System

The plan suggested in this paper is not identical to that of regular WSN-based smart residences [10], which use WSNs to pass control commands to home operating frameworks. The main motivations for the proposed design are to realise a simple home remote controller and to reduce the effect of remote interference on the WSN [11] for a smart layout and data collection device. The proposed platform, which is intended for protection and monitoring, consists primarily of an Arduino board and sensors. To monitor the hardware, the microcontroller board is connected to the internet through a modulator-demodulator (MODEM) interface. When an individual enters the house and the PIR sensor detects them, it triggers a control task, such as turning on the light and fan. We may use IoT to control domestic electrical equipment

Fig. 1 Overview of the Proposed System

such as light bulbs, fans, and motors. A metal detector sensor is used to identify metal and detect theft; if an unknown party enters the monitored zone, the property owners are alerted through the phone app. A water level sensor is then used to assess the water available in the pipe or tank. An LDR is used to automatically switch on the lighting in the evenings and turn the lights off throughout the day (Fig. 1).
A PIR sensor can also be placed above a hospital patient's bed so that when the patient wakes up, a switch at the staff station turns on, alerting the hospital personnel. The LDR sensor measures illumination inside the hospital and controls the lighting in the corridors and other areas. The metal detector sensor can be placed at the doctor's office so that if someone approaches the room carrying metal items, the alarm sounds and the doctors and nurses are notified.

4 Hardware Implementation

4.1 Arduino UNO

Arduino is a tool that lets you sense and control more of the physical world than your desktop computer. It is an open-source physical computing platform based on a simple microcontroller board, as well as a development environment for writing programs for the board. The Arduino Uno uses the ATmega328 microcontroller. A USB connection, a power jack, a reset button, and a 16 MHz ceramic resonator accompany the 14 digital input/output pins (of which 6 can be used as PWM outputs). It includes everything needed to support the microcontroller; simply attach it to a PC through USB, or power it with an AC-to-DC adapter or battery, to get started. Arduino is an open-source physical computing platform built on a basic I/O board and a development environment. The microcontroller can be used to make self-contained connected devices, or it can be linked to software running on the computer: a real Input/Output (I/O) board with a programmable integrated circuit (IC) (Figs. 2 and 3).

Fig. 2 Arduino UNO

Fig. 3 Hardware Arduino UNO



Fig. 4 Bluetooth Hardware

4.2 Bluetooth

Bluetooth HC-05 module: The HC-05 Bluetooth module is used to communicate between the Arduino and another device over a distance. The HC-05 is a slave device that runs on 3.6 to 6 V. State, RXD, TXD, GND, VCC, and EN are its six pins. For the hardware, connect the TXD pin of the Bluetooth module to RX (pin 0) of the Arduino board and the RXD pin to TX (pin 1) of the Arduino Uno. The Arduino's connection with the Bluetooth (BT) unit is sketched in Fig. 4.
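For illustration only, a minimal PC-side Python sketch of driving this serial link is given below; the port name, baud rate, and one-character command protocol are assumptions for illustration, not part of the original design:

    # Sketch: talking to the HC-05 over a serial link (assumed port name,
    # baud rate, and command bytes; adjust to the actual wiring and sketch).
    import serial

    PORT = "/dev/rfcomm0"   # hypothetical serial port bound to the HC-05
    BAUD = 9600             # HC-05 modules commonly default to 9600 baud

    with serial.Serial(PORT, BAUD, timeout=1) as link:
        link.write(b"1")            # e.g., '1' could switch one appliance on
        reply = link.readline()     # optional acknowledgement from the Arduino
        print(reply.decode(errors="ignore"))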

4.3 PIR

A PIR sensor senses activity by detecting the heat emitted by a living body. These are often paired with security lights so that the light switches on as soon as an individual approaches, which makes them convincing upgrades for home surveillance systems. The sensor is passive: instead of emitting light or radiation, it must be crossed by a passing individual to "sense" the person, since the PIR is sensitive to the infrared radiation emitted by all living things (Figs. 5 and 6).
As an intruder walks into the field of view of the sensor, the tracker "sees" a sudden increase in thermal energy. A PIR sensor light is expected to turn on as someone

Fig. 5 PIR

Fig. 6 Hardware PIR

approaches, but it won't react to someone merely passing by; the lighting is set up in this manner.
The sensor output of the PIR is shown in Fig. 7, and its digital output is shown in Figs. 8 and 9. The PIR sensor is primarily used for motion tracking and to reduce power usage. When an individual walks into the room, the light switches on, and the signal produced is received in the PC, as seen in Fig. 6.

Fig. 7 PIR Sensor output

4.4 Metal Detector

A metal detector is an electronic instrument that detects the presence of metal in the immediate vicinity. Metal detectors are useful for locating metal inclusions hidden inside objects, or metallic articles buried underground. They routinely involve a handheld unit with a sensor probe that can be swept over the ground or other objects. A changing tone in earphones, or a needle moving on an indicator, appears when the sensor passes over a piece of metal. The device frequently provides a sign of distance: the closer the metal is, the louder the sound or the further the needle goes. Fixed "walk-through" metal detectors used for security screening at entrances to confinement facilities, civic centres, and airports, to detect concealed metal weapons on a person's body, are another basic kind. When the LC circuit, which is L1 and C1, receives a resonant frequency from any metallic material nearby, an electromagnetic field is generated, which causes a current to flow and alters the signal through the coil. A reference voltage is used to adjust the proximity indicator in comparison to the circuit. It is much smarter to sense at a distance in this way, since there is only a coil and no need to touch the material.
The LC circuit's signal is altered when metal is detected. The modified signal is sent to the proximity detector (TDA 0161), which detects the shift in the signal and reacts accordingly. The detector circuit draws 1 mA when no object is observed, and around 10 mA once the coil is near metal. When the trigger output is high, the resistor R3 provides a reference voltage to transistor Q1; Q1 is then switched on, the LED glows, and the buzzer sounds. The resistor R2 limits the current (Figs. 8, 9 and 10).
The real output of the metal detector sensor is seen in Fig. 11, while the digital signal of the metal detector sensor is shown in Fig. 12. The metal detector is mostly used to locate metal and for safety purposes. When an object is detected, an alarm sounds; the display can be seen in Fig. 10, and the sensor signal received in a PC is seen in Fig. 12.

Fig. 8 Digital sensor PIR output

Fig. 9 Digital sensor PIR



Fig. 10 Hardware of Metal Detector

Fig. 11 Output of Metal Detector

4.5 LDR Sensor

A light-dependent resistor, also known as a photoresistor (LDR) or photoconductor, is a light-controlled variable resistor. A photoresistor's resistance decreases as the incident illumination increases, exhibiting photoconductivity. In light-sensitive detector

Fig. 12 Digital Output of Metal Detector

circuitry and in light- and dark-activated switching circuits, a photoresistor may be used. A photoresistor is made of a semiconductor with high resistance. A photoresistor can have a resistance as high as a few megaohms (MΩ) in the dark, while in the light it can have a resistance as low as a few hundred ohms. If the light incident on a photoresistor exceeds a predetermined frequency, photons absorbed by the semiconductor give bound electrons enough energy to jump into the conduction band. Following that, the free charge carriers (and their hole companions) conduct electricity, lowering the resistance. The resistance range and sensitivity of a photoresistor can vary greatly between different devices. In fact, particular photoresistors may react differently to photons within particular wavelength bands (Figs. 13 and 14).

Fig. 13 LDR Sensor



Fig. 14 Hardware of LDR Sensor

The visible output of the LDR sensor is shown in Fig. 15, and the digital signal of the LDR sensor is shown in Fig. 16. The LDR sensor senses the light level and turns on the lighting when the intensity of light falling on it is extremely low, as seen in Fig. 15, which is convenient for users who would otherwise have to toggle the lights at a certain time. Figure 16 depicts the signal displayed on a PC.

Fig. 15 Output of LDR Sensor



Fig. 16 Digital output of LDR Sensor

4.6 Water Level Sensor

Level sensors detect various liquids and fluidized solids such as slurries, granular materials, and powders with a high solid content. Level sensors can detect the level of free-flowing substances, as their name suggests. Such materials include liquids such as water, oil, mixtures, and other emulsions, as well as solids in granular or powder form (solids which can flow). These compounds, because of gravity, settle at balance in storage vessels, maintaining their level in the closed condition. The level of the substance is measured against a standard reference (Figs. 17 and 18).
The visible output of the water level sensor is obtained in Fig. 19, and the digital signal of the water level sensor is obtained in Fig. 20. The water level sensor is primarily used for detecting the water level and switching the motor, as seen in Fig. 19, and the signal obtained is observed in Fig. 20.

4.7 Relay Module

A relay is an electrically driven switch that can be turned on and off. This device is used in the field of power control. It is applicable to all AC and DC systems, and it is small, consistent, and secure. It switches a load current. The relay component is depicted in Fig. 22. The first section is the main relay apparatus, which is connected to the existing wiring of appliances in the home, such as a ceiling fan and lights, to control their electricity. This unit takes its inputs

Fig. 17 Water Level Sensor

Fig. 18 Hardware of Water Level Sensor



Fig. 19 Water Level Sensor Output

Fig. 20 Digital output of water level sensor

from the appliances and related home loads that are linked to the relay module. With a 5 V relay-style DC power and Wi-Fi connection, 240 VAC is converted from AC to DC. The relay component acts as a usual switch: toggling it "ON" and "OFF" will turn a light on and off. An ambient sensing system includes an ultrasonic sensor as well as a relay module and an Arduino Wi-Fi connection. Wi-Fi is an RF module with a lot of features that can be used on a central network

Fig. 21 Relay Module

system. The IEEE 802.15.4 standard greatly reduces the amount of code required to ensure knowledge exchange. Aside from its data-handling capabilities, Wi-Fi has a slew of other advantages for use within a WSN (Fig. 21).

4.8 GSM

The GSM protection platform sends instant messages based on the sensors used in this module. If the PIR sensor detects something out of the norm, the server will be notified immediately; this indicates that there is someone close to the house who is not the customer. With the aid of a smoke detector, the GSM module also plays an important role in the fire alarm system: when smoke is detected, the module sends an SMS alert via GSM.
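As a sketch of how such an alert could be issued, the Python snippet below drives a generic GSM modem with the standard text-mode AT commands (AT+CMGF, AT+CMGS); the serial port and phone number are placeholders, and this is not code from the original system:

    # Sketch: sending an SMS alert through a GSM modem using standard
    # text-mode AT commands; port and recipient number are placeholders.
    import time
    import serial

    with serial.Serial("/dev/ttyUSB0", 9600, timeout=2) as modem:
        modem.write(b"AT+CMGF=1\r")                  # select SMS text mode
        time.sleep(0.5)
        modem.write(b'AT+CMGS="+911234567890"\r')    # recipient (placeholder)
        time.sleep(0.5)
        modem.write(b"Smoke detected at home!\x1a")  # message; Ctrl+Z sends it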

Fig. 22 Relay Driver Hardware

5 Conclusion

The proposed kit is a practical method for interacting with your house. It also has a management structure that is used for interaction, improving the intelligence and efficiency of household operations. An architecture and production concept for a smart home automation device based on the Arduino microcontroller board is published in this research. The microcontroller board is connected to the sensors. Based on the signals received from these sensors, the residential mechanical installations can be tested, managed, and accessed normally. The device is reliant on electricity: if the power source falls short of expectations, the connection will be terminated and SMS-ready restrictions will be lifted. For defensive productivity, every power outlet energises the security mechanism; the framework cannot be accessed without the security framework. Only a modest cost is incurred in realising the conceptual proposal for a dream home or buildings. It avoids the problem of establishing a control centre in WSNs and reduces the effect of remote interference. The suggested system is thus a foundational, economically informative, and adaptable system.

References

1. Pawar, Nisha P. Singh, "A home automation system using Internet of Things", International Journal of Innovative Research in Computer and Communication Engineering, Vol. 4, Issue 4, (2016).
2. Vinod Choudhary, Aniket Parab, Satyajeet Bhapkar, Neetesh Jha, Medha Kulkarni, "Design and Implementation of Wi-Fi based Smart Home System", International Journal of Engineering and Computer Science, Vol. 5, Issue 2, (2016).
3. Vinay Sagar K N, Kusuma S M, "Home Automation using Internet of Things", International Research Journal of Engineering and Technology (IRJET), Vol. 2, Issue 3, (2015).
4. Pooja N. Pawar, Shruti Ramachandran, Varsha V. Wagh, "A Survey on Internet of Things Based Home Automation System", International Journal of Innovative Research in Computer and Communication Engineering, Vol. 4, Issue 1, January (2016).
5. Dhakad Kunal, Dhake Tushar, Undegaonkar Pooja, Zope Vaibhav, Vinay Lodha, "Smart Home Automation using IOT", International Journal of Advanced Research in Computer and Communication Engineering, Vol. 5, Issue 2, (2016).
6. H. Santhi, P. Gayathri, "A Review of Home Automation using IoT Applications", International Journal of Computer Science & Engineering Technology, ISSN: 2229-3345, Vol. 7, No. 07, (2016).
7. S. A. Jain, Stevan Maineka, Pranali Nimgade, "Application of IoT-WSN in Home Automation System: A Literature Survey", Multidisciplinary Journal of Research in Engineering and Technology, Vol. 3, Issue 1, pp. 916–922, (2017).
8. Gwang Jun Kim, Chang Soo Jang, Chan Ho Yoon, Seung Jin Jang, Jin Woo Lee, "The implementation of smart home system based on 3G and ZigBee in wireless network system", International Journal of Smart Home, Vol. 7, No. 3, pp. 311–320, (2013).
9. Andrea Goldsmith, "Wireless sensor networks technology for smart buildings", Funded Project Final Survey Report, Precourt Energy Efficiency Center, pp. 1–5, (2017).
10. Jayashri Bangali, Arvind Shaligram, "Design and implementation of security systems for smart home based on GSM technology", International Journal of Smart Home, Vol. 7, No. 6, pp. 201–208, (2013).
11. P. Anuja, T. Murugeswari, "A novel approach towards building automation through DALI-WSN integration", International Journal of Scientific & Research Publications, Vol. 3, Issue 4, pp. 1–5, (2013).
12. Firdous Kausar, Eisa Al Eisa, Imam Bakhsh, "Intelligent home monitoring using RSSI in wireless sensor networks", International Journal of Computer Networks and Communications, Vol. 4, No. 6, pp. 33–46, (2011).
Face Recognition System Based
on Convolutional Neural Network (CNN)
for Criminal Identification

Ankit Gupta , Deepika Punj, and Anuradha Pillai

Abstract In this research paper, we give emphasis to the face recognition system (FRS) and the evolution of the various approaches adopted for FRS by making use of the CNN model. From recent studies, we can conclude that existing face recognition systems (FRS) were easily attacked and tricked by using faked images of the targeted person obtained from various social networks like Facebook, Instagram, Twitter, etc. The main task involved in image classification is extracting important features. By using the important features of image processing
methods like scaling, thresholding and contrast enhancement which is based on deep
neural networks, the face recognition system can classify the results more efficiently
and achieve high accuracy. This paper gives emphasis on extracting various important
features required for face recognition using a CNN model and gives a brief overview
of various techniques employed by researchers in this area. In this research, we have
used Flickr-Faces-HQ Dataset (FFHQ) and test images mixed with some real time
images captured from camera devices like CCTV, mobile camera, and when appro-
priate match is found, it gives information according to the matched faces. Accuracy
obtained by using this CNN model using Google Colab is 96%.

Keywords Artificial intelligence · Cloud computing · Convolutional neural network · Criminal identification · Deep learning · Face recognition

1 Introduction

There is a significant need for high security due to the accumulation of data and information in surplus amounts. Face recognition systems intend to recognize people in videos or images captured by surveillance cameras using pattern recognition methods and techniques. In recent times, face recognition has been a flourishing, demanding, and fascinating area in various real-time applications happening throughout the

A. Gupta (B) · D. Punj · A. Pillai


Department of Computer Engineering, J.C. Bose University of Science & Technology, YMCA,
Faridabad, Haryana, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_2

world. In the last decade, an enormous number of face recognition algorithms have been developed for recognizing faces.
FRS is an immensely difficult task because of various aspects varying from personalized to natural circumstances. An individual can present contrasting aspects of the face by making adjustments to the expression, facial hair, cap, haircut, spectacles, etc. Moreover, due to various obstacles such as bad lighting conditions, illumination, occlusions, falsification of faces, and difficult poses, capturing devices such as CCTV cameras and mobile cameras are not able to capture good-quality face images. In spite of this, the face recognition system has immense potential to automate various applications in real-life scenarios.
In the current scenario, the government agencies of various countries such as the USA, China, India, and Singapore are working on the expansion of FRS to counter terrorism. Authentication systems of the majority of government agencies and offices are built on the soundness of information established on physical and behavioral characteristics known as biometrics. Such a system extracts invaluable features by processing raw data such as fingerprints, eyes, face, etc. The attributes characterize the essence of the data allocated to the system, and a conclusion is formed relative to it.
Automated face recognition systems are extensively used in applications varying from social media platforms to futuristic authentication systems.
Due to the gap between subjective emotions and hand-crafted features, it is a very difficult task to recognize facial expressions or impressions in video sequences. In this paper, we emphasize various algorithms needed for face recognition and detection problems.
Among all biometric techniques, the face recognition modus operandi has one significant advantage: its user-friendliness and adaptable nature.
The computer algorithm behind a facial recognition system is akin to human visual recognition. But where people accumulate visual data in the brain and instinctively recollect it once required, a computer must call up data from a database stored in the system and match it so as to recognize a human face.
In a concise manner, an automated computer system furnished with a capturing device like a camera or CCTV recognizes and detects a person's face and draws out facial attributes such as the nose length, the distance between the eyes, and the shape of the forehead and cheekbones.
The most significant measurements for face recognition programs are the width of the nostrils, the distance between the eyes, the length of the nose, the shape and height of the cheekbones, the height of the forehead, the width of the chin, and other parameters. The individual is picked out if these parameters of the data obtained from the capturing devices coincide with those available in the database stored in the cloud.
CNN can work directly on the authentic image, in such a way that image preprocessing turns out to be easy. CNN employs the approach of local receptive fields, pooling, and sharing of weights to immensely minimize the parameters required for training. Simultaneously, CNN also possesses translation

Fig. 1 Face recognition system

and rotational invariance. In deep learning, the CNN is exclusively applied to computer vision, which includes both object recognition and image classification. Before CNN was broadly accepted, most pattern recognition jobs were accomplished through an initial phase of manual feature extraction followed by a classifier. Nevertheless, the development of CNN entirely changed pattern recognition: in place of conventional hand-crafted feature extraction, the CNN makes use of the input image's unprocessed pixel intensities as a flat vector. Predominantly, CNN has outstanding transformation capability for 2D data, for instance images and voice.
The CNN basically contains a stack of modules, each of them performing three functions: convolution, ReLU, and pooling (Fig. 1).

1.1 Convolution

This layer extracts tiles of the input feature map and applies filters to them to evaluate features; the convolution layer thereby produces output feature maps. The produced output is also known as the convolved feature (which might possess a different depth and size in comparison to the input feature map). The following are the parameters of convolutions (Figs. 2 and 3):
• Tile size (5 × 5 or 3 × 3 pixels).
• Output feature map depth.

Fig. 2 A 3 × 3 convolution of depth 1 implemented across a 5 × 5 input map with depth 1. This convolution results in a 3 × 3 output feature map

Fig. 3 How a value on the output feature map is calculated over the 5 × 5 input feature map
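The arithmetic of Figs. 2 and 3 can be reproduced in a few lines of NumPy; the sketch below (with illustrative input and filter values, not taken from the paper) slides a 3 × 3 filter over a 5 × 5 input map and yields the 3 × 3 output feature map described above:

    import numpy as np

    def conv2d_valid(x, k):
        # slide the filter over the input (no padding, stride 1)
        kh, kw = k.shape
        h = x.shape[0] - kh + 1
        w = x.shape[1] - kw + 1
        out = np.empty((h, w))
        for i in range(h):
            for j in range(w):
                # each output value is the sum of an element-wise product
                out[i, j] = np.sum(x[i:i+kh, j:j+kw] * k)
        return out

    x = np.arange(25.0).reshape(5, 5)          # 5 x 5 input map, depth 1
    k = np.array([[1., 0., 1.],
                  [0., 1., 0.],
                  [1., 0., 1.]])               # 3 x 3 filter
    print(conv2d_valid(x, k).shape)            # -> (3, 3) output feature map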

1.2 ReLU

After convolution comes a ReLU step, which is used to transform the convolved feature and introduce nonlinearity into the model. The rectified linear activation function F(x) = max(0, x) returns x for every value of x greater than 0, and returns zero for every value of x less than or equal to 0.

1.3 Pooling

This step comes after the ReLU step; in this, downsampling of the convolved feature takes place (minimizing the time required for processing), which in turn reduces the dimensions of the feature map. The max pooling algorithm is used for this process.

Fig. 4 Max pooling operation

Max pooling operates likewise to convolution. In this, we move over the feature map extracting tiles of the given size. For every tile, the maximal value is carried into the new feature map; every other value is rejected. Max pooling utilizes two parameters, which are as follows (see the sketch after this list):
• The first parameter of pooling is the max pooling filter size.
• The second parameter, known as stride, is the distance in pixels between extracted tiles. In contrast to convolution, where filters move pixel-wise over the feature map, in max pooling the stride decides the location at which every tile is extracted. For a 2 × 2 filter, a stride of 2 means that max pooling extracts every non-overlapping 2 × 2 tile from the feature map (Fig. 4).
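A minimal sketch of these two steps, using a toy 4 × 4 feature map and the 2 × 2 filter with stride 2 described above (values are illustrative), might look as follows:

    import numpy as np

    def relu(x):
        return np.maximum(0, x)      # F(x) = max(0, x), element-wise

    def max_pool(x, size=2, stride=2):
        h = (x.shape[0] - size) // stride + 1
        w = (x.shape[1] - size) // stride + 1
        out = np.empty((h, w))
        for i in range(h):
            for j in range(w):
                tile = x[i*stride:i*stride+size, j*stride:j*stride+size]
                out[i, j] = tile.max()   # keep only the maximum of each tile
        return out

    fmap = np.array([[ 1., -3.,  2.,  4.],
                     [ 0.,  5., -1.,  1.],
                     [ 2.,  2.,  7., -2.],
                     [-1.,  0.,  3.,  8.]])
    print(max_pool(relu(fmap)))      # 4 x 4 map -> 2 x 2 map of tile maxima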

2 Literature Survey

2.1 Uchôa, Britto, Veras, Aires and Paiva [1]

They proposed a face recognition method applying data augmentation and transfer learning to pretrained convolutional neural networks (CNNs). Features are extracted from the images, and a KNN classifier is trained on VGG-Face CNN features. The tests show that with the saturated dataset the trained classifier obtained the supreme accuracy of 98.43%. For the proprietary dataset, the accuracy obtained was 95.41%, with the combined contrast, brightness, and saturation version.

2.2 Srivastava, Dubey and Murali [2]

They proposed a PSNet CNN model making use of a PSN layer. According to this paper, they resolve the problem that the existing final layers of CNNs normally generate inferior class scores by including a PSN (parametric sigmoid norm) layer prior to the final fully connected layer. The datasets used for testing are the YouTube Faces (YTF) and Labeled Faces in the Wild (LFW) datasets.
Various experiments were performed, and by using the Angular-Softmax and ArcFace loss functions across the PSNet and ResNet CNN models, the results can easily be compared. In PSNet, the hyperparameter settings are retained with α equal to 1, while β and γ are network-trainable. On the LFW dataset, the best performance of 97.79% is achieved at ResNet18 and 99.26% using PSNet50 with the ArcFace loss function; on the YTF dataset, the best performance of 94.65% is attained at PSNet18 and 97.10% at PSNet50 using the ArcFace loss function.

2.3 Mr. Deepak B. Desai, Prof. Kavitha S. N. [3]

They used the NUAA database for distinguishing between real and fake face images. In total, 1300 face images were selected for this model. The imperative features were drawn out using the CNN model, along with an SVM classifier which was used to differentiate between spoofed and un-spoofed face images. The model built by the authors scored 95% accuracy.

2.4 Radhika C. and Bazeshree V. [4]

They proposed a system using a machine learning approach along with computer vision for FRS. With the use of a robust DNN-based detector, faces can be detected. For face recognition evaluation, classifiers such as CNN, MLP, and SVM are utilized. On a self-generated database, the best testing accuracy of 98% is achieved by the CNN, and 89% accuracy is achieved on real-time input via camera.

2.5 C. Nagpal and S. R. Dubey [5]

They conducted a performance evaluation of CNN models for face anti-spoofing. According to this paper, various architectures of CNN are utilized, for instance, ResNet152, ResNet50, and Inception-v3. The dataset

which is used to recognize the face in this model is mobile face spoofing database
(MFSD).
The bottommost learning rate for ResNet152 is effective, whereas more advanced
learning rate is effective for ResNet50. The experiments were concluded by exam-
ining the different aspects, for instance, the depth of the model, random initializing of
weights versus transferring of weight, training vs tune up through dissimilar learning
rate along with scratch. The model built by the authors scored 99.59% for ResNet152
in conjunction with 10−3 learning rate and 94.26% accuracy for ResNet50 alongside
learning rate of 10−5 throughout transfer learning.

2.6 E. Winarno, H. Februariyanti, Anwar and I. H. Al Amin [6]

They introduced an attendance model based on face recognition. The model used in this research is a CNN-PCA model for extracting the features of the face images (objects) taken by the camera. The object in this model is used as an identity and is stored in the database for further use. This model gave high accuracy and provides more accurate results of feature extraction, obtaining 98% accuracy.

2.7 Yinfa Yan, Cheng Li, Yuanyuan Lu [7]

They implemented an improved CNN model trained by adding LBP feature information to the dataset, and it shows high recognition accuracy and robustness. From the facial expression image, a local binary pattern (LBP) image is extracted, combining the original face image and the LBP image as a training dataset. The FER2013 expression dataset is used.
With the combination of the LBP feature map, the results show an excellent recognition effect, and the improved CNN is better than other methods. Illumination and expression problems can be effectively avoided by making full use of the rotation-invariance and gray-invariance characteristics of the local binary mode. 95% accuracy is achieved by using this method.

2.8 Anugrah Bintang Perdana, Adhi Prahara [8]

They proposed a method to recognize faces with a minimal dataset by using a light CNN built on an improved VGG16 model. Here, 120 × 120 pixel image sizes are used, and the proposed architecture contains convolutional layers of two different types followed by max pooling. In this design, every convolutional layer is accompanied by a rectified linear activation function (ReLU). 512 neurons are present in the two fully connected layers, and the final classification layer has 30 neurons. The model built by the authors scored 94.4% accuracy; it shows exceptional results on the finite dataset with a little quantity of labels (Table 1).

3 Implementation and Result Analysis

In this research, we have used a CNN model and enhanced its accuracy by using the GPU (Tesla V100, 16 GB) of Google Colab, with 27.4 GB of available RAM. We first use 2D pooling, and then the rectified linear activation function (ReLU). Then we use Softmax activation with a dense ReLU-activated layer, and the result is evaluated epoch by epoch; from this we calculate accuracy and loss. This is evaluated layer by layer, and we have used binary cross-entropy for evaluating the classification report. We then summarize the history of accuracy and loss using graphs and plot the confusion matrix. The dataset used in this research is the Flickr-Faces-HQ Dataset (FFHQ). In the project, we have used Google Colab, which results in the following advantages:
• Faster computation than a typical 8–16 GB RAM machine
• Better accuracy than the normally calculated accuracy
• Less cost involved in running the project, as we do not have to buy a good-quality PC with a GPU in it.
The accuracy we obtained by using this method is 96%. In the future, we can further improve the accuracy by combining CNN with SVM, or by further scaling the model using cloud services like Amazon Web Services, Microsoft Azure, Google Cloud, etc. A sketch of this pipeline follows; the outputs of the research conducted using the CNN model in Google Colab are shown in Figs. 5, 6, 7 and 8.
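A minimal Keras sketch consistent with the pipeline described above is given below; the layer widths, input shape, and two-class output are illustrative assumptions, since the paper does not list its exact architecture:

    # Sketch of the described CNN: Conv2D + ReLU, 2D max pooling, a dense
    # ReLU layer, a softmax output, and binary cross-entropy loss.
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu",
                      input_shape=(128, 128, 3)),   # assumed input size
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(2, activation="softmax"),      # two-class face output
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",       # as used in the paper
                  metrics=["accuracy"])
    # history = model.fit(x_train, y_train, epochs=10,
    #                     validation_data=(x_val, y_val))
    # history.history feeds the accuracy/loss graphs (Figs. 6 and 8).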

4 Conclusion

The main emphasis of this research article was to study the facial expression recognition research area extensively. In this study, we exclusively examined techniques for FER which employ CNN as a deep learning perspective. Recent research was scrutinized in detail, concluding with the problems faced by researchers while executing models for FER. The conclusion that can be drawn from this research, using various types of convolutional neural network methods, is that faces can be recognized very efficiently and effectively.

Table 1 Comparison of various CNN techniques in face recognition

S. No. | Researcher/Year | Technique used | Dataset used | Accuracy
1 | Valeska Uchôa, Rodrigo Veras, Kelson Aires, Anselmo Paiva, Laurindo Britto (2020) | KNN classifier by implementing VGG-Face CNN | LFW dataset | 98.43% for saturation dataset and 95.41% with proprietary dataset
2 | Yash Srivastava, Shiv Ram Dubey and Vaishnav Murali (2019) | PSNet CNN model by making use of the PSN layer | Labeled Faces in the Wild (LFW) and YouTube Faces (YTF) | 97.79% at ResNet18 and 99.26% at PSNet50 on LFW dataset; 94.65% at PSNet18, 97.10% at PSNet50 on YTF dataset
3 | Mr. Deepak B. Desai, Prof. Kavitha S. N. (2019) | FRM using CNN for feature extraction; SVM used for classification | NUAA database | 95% accuracy is achieved
4 | Radhika C. Damale, Prof. Bageshree V. Pathak (2018) | SVM, MLP and CNN utilized as classifiers for evaluation | Self-generated database | 98% accuracy on self-generated database and 89% on real-time input via camera
5 | Chaitanya Nagpal, Shiv Ram Dubey (2019) | CNN architectures, for instance Inception-v3, ResNet152 and ResNet50 | MSU Mobile Face Spoofing Database (MFSD) dataset | 99.59% accuracy for ResNet152 and 94.26% for ResNet50
6 | Edy Winarno, Imam Husni Al Amin, Herny Februariyanti (2019) | CNN-PCA (convolutional neural network with principal component analysis) | 2D-3D image reconstruction process used for storing the database | 98% accuracy is obtained
7 | Yinfa Yan, Cheng Li, Yuanyuan Lu (2019) | Improved CNN model trained by adding LBP feature information | FER2013 expression dataset | This research produces an accuracy of 95%
8 | Anugrah Bintang Perdana, Adhi Prahara (2019) | Light-CNN formed on modified VGG16 model | Limited dataset (ROSE-Youtu Face Liveness Detection Database, with a few images taken from camera) | Light-CNN method produces high accuracy of 94.4%

Fig. 5 Confusion matrix

This paper contains an interpretation of various CNN techniques which can be used to solve problems like illumination, inadequate resolution, falsification of faces, similar faces, position of faces, occlusions, etc. while recognizing faces. Accuracy obtained by using the CNN model on the Flickr-Faces-HQ (FFHQ) dataset is 96%. Therefore, we should use more images as training data in deep learning, and better cloud computing resources like AWS, Azure, or Google Cloud for faster computing, so that better results can be achieved.

Fig. 6 Training of our deep learning model (CNN)

Fig. 7 Classification report



Fig. 8 Summarized history for accuracy and loss using graphs

References

1. Uchôa V, Aires K, Veras R, Paiva A, Britto L (2020) Data augmentation for face recognition
with CNN transfer learning. In: 2020 International conference on systems, signals and image
processing (IWSSIP). IEEE, pp 143–148
2. Srivastava Y, Murali V, Dubey SR (2019) PSNet: parametric sigmoid norm based CNN for face
recognition. In: 2019 IEEE Conference on information and communication technology. IEEE,
pp 1–4
3. Desai DB, Kavitha SN (2019) Face anti-spoofing technique using CNN and SVM. In: 2019
International conference on intelligent computing and control systems (ICCS). IEEE, pp 37–41
4. Damale RC, Pathak BV (2018) Face recognition based attendance system using machine learning
algorithms. In: 2018 Second international conference on intelligent computing and control
systems (ICICCS). IEEE, pp 414–419
5. Nagpal C, Dubey SR (2019) A performance evaluation of convolutional neural networks for
face anti spoofing. In: 2019 International joint conference on neural networks (IJCNN). IEEE,
pp 1–8
6. Winarno E, Al Amin IH, Februariyanti H, Adi PW, Hadikurniawati W, Anwar MT (2019) Attendance system based on face recognition system using CNN-PCA method and real-time camera. In: 2019 International seminar on research of information technology and intelligent systems (ISRITI). IEEE, pp 301–304
7. Yan Y, Li C, Lu Y, Zhou F, Fan Y, Liu M (2019) Design and experiment of facial expression
recognition method based on LBP and CNN. In: 2019 14th IEEE Conference on industrial
electronics and applications (ICIEA). IEEE, pp 602–607
8. Perdana AB, Prahara A (2019) Face recognition using light-convolutional neural networks
based on modified Vgg16 model. In: 2019 International conference of computer science and
information technology (ICoSNIKOM). IEEE, pp 1–4
A Novel Support Vector Machine-Red
Deer Optimization Algorithm
for Enhancing Energy Efficiency
of Spectrum Sensing in Cognitive Radio
Network

Vikas Srivastava and Indu Bala

Abstract The 5G Cognitive Radio Network (CRN) has a drawback in the energy efficiency and spectrum sensing trade-off. When it comes to building a cognitive radio, it is essential to get the most energy-efficient system. Spectrum gaps are identified with a hybrid Support Vector Machine (SVM) and Red Deer Algorithm (RDA); using the values of transmitting power, sensing bandwidth, and probability of detection helps in utilizing the spectrum gaps. RF energy harvesting (RH) is seen as highly effective and realistic in next-generation wireless technology, especially for things that don't need a
lot of energy, such as the Internet of Things (IoT). It is believed that all secondary users
in a network system can harvest several bands of RF from each Primary User location.
The proposed algorithm can identify the spectrum holes through sensing bandwidth,
transmitting power, and the probability of detection. This leads to an improvement in energy efficiency (EE). The simulation results show that energy efficiency increases with the help of SVM-RDA with respect to sensing bandwidth, transmission power, and probability of detection, compared with the existing Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO), and hybrid PSO-GSA (Particle Swarm Optimization and Gravitational Search Algorithm) algorithms.

Keywords Energy efficiency · Spectrum sensing · Support vector machine-Red Deer algorithm · Cognitive radio

1 Introduction

The widespread use of wireless networking has certainly revolutionized the way
people live. In 1901, after the successful demonstration of wireless communication by
Marconi, a new era of communication was born. After evolving over a century, these
wireless applications are an essential part of the daily lifestyle. As a result, electronic

V. Srivastava (B) · I. Bala


Lovely Professional University, Jalandhar, India
V. Srivastava
PSIT, Kanpur, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_3

technologies and applications, such as mobile and satellite communications and mobile networks, as well as privately used Wi-Fi networks, have seen significant growth in the twenty-first century. Concerning the issue of spectrum scarcity, a recent Federal Communications Commission (FCC) study is an eye-opener for the global telecommunication market: it has been realized that most of the bandwidth is idle much of the time, and because of strict spectrum distribution rules, these spectra have been ignored entirely [1]. In the year 2000, J. Mitola proposed cognitive radio (CR) as a viable alternative to the spectrum shortage due to its ability to interact with the RF environment and to allocate radio resources as per the end-user quality of service (QoS) requirements [2]. In the 2002 FCC Spectrum Working Group study, cognitive radio is seen as the most important way to overcome the virtual spectrum scarcity through learning [3].
One of the spectrum management technologies that may help address the scarcity problem is cognitive radio technology. The basic idea of increasing spectrum utilization through cognitive radio technology was introduced in J. Mitola III's Ph.D. thesis. Cognitive radio's main characteristics that make it different from other radios are cognitive capability and reconfigurability [3, 4]. The three main stages in the cognitive process are sensing the spectrum, analyzing the spectrum, and deciding on the spectrum. Spectrum sensing helps cognitive radio to obtain information from available bands, identify spectrum opportunities, and explore spectrum variants. A core element of the dynamic spectrum access requirement is dynamic spectrum allocation/scheduling, which ensures affordable and effective use of the spectrum. Along with spectrum shortage, another critical parameter, energy efficiency (EE), must be considered: deficient EE will impact the wireless technological environment, so it is important to enhance energy efficiency. Since cognitive radio devices are battery-powered, a concentrated study is needed to increase the energy efficiency of cognitive radio devices. Spectrum sensing accuracy can be enhanced by the optimization of the spectrum sensing parameters. Here, energy harvesting is one of the common successful ways to reduce energy costs for mobile networks (of which 30% comes from networking) [5]. Previously, several circuit designs have been introduced in the literature which use multiple antennas [6, 7] or a single antenna [8, 9] to collect from multiple bands, where the multi-band principle increases the energy-to-area ratio. The authors of [10, 11] proposed the ABC (Artificial Bee Colony), PSO (Particle Swarm Optimization), and related metaheuristic algorithms for efficient spectrum sensing. The author in [12] proposed a fish school algorithm to improve spectrum sensing efficiency. For optimizing resource allocation, the author in [13] proposed a novel grey wolf optimization algorithm for spectrum sensing in CRN. To improve spectrum sensing in CRN, a bio-inspired algorithm can be used. The author in [14] discussed a hybrid PSO-GSA algorithm to improve EE in the context of spectrum sensing parameters. Here, a Support Vector Machine (SVM) with the Red Deer Optimization (RDO) algorithm is used to optimize spectrum sensing in multi-band CRN. The SVM algorithm is utilized to sense the spectrum in CRN, while the RDO algorithm is applied to optimize SVM parameters to enhance the sensing accuracy.
The prime focus of the paper is the evaluation of energy efficiency, sensing bandwidth, and probability of detection through the Red Deer Optimization algorithm with SVM, also compared with the PSO, ABC, and PSO-GSA optimization algorithms. This paper is structured as follows: related works are found in Sect. 2. Section 3 discusses energy-efficient optimization factors. Section 4 contains a discussion of the existing methodology. Section 5 describes the proposed algorithm. Details of the simulation results are described in Sect. 6, and Sect. 7 contains the conclusion.

2 Related Works

Energy efficiency is an essential parameter of 5G technology. In future wireless technology, we consider energy harvesting from radio frequency. In [15], multi-band energy harvesting schemes under the cognitive radio inter-band approach have been analyzed, where all secondary nodes are allowed to harvest energy from different RF sources. In this work, a mutually beneficial arrangement has been proposed where SUs can sense the spectrum to decide the correspondence between geographical areas; in this way, they can choose to either harvest energy or communicate information, and the number of sensing tests is jointly enhanced. Energy consumption depending on the data rate is addressed in [16] with the suggestion of a wireless power transfer scheme; the throughput and power consumption of IoT devices are improved by a multichannel IoT with dynamic spectrum sharing [16]. Wideband spectrum sensing is discussed in [17, 18]. In traditional radio, power optimization parameters are crucial for the system. The works [16, 19-23] focused on improving the EE of spectrum sensing. For efficient spectrum prediction, the multilayer perceptron and hidden Markov model are used in [24]; the energy efficiency and spectrum efficiency have improved, but the total computational complexity and prediction energy have increased [24]. The researchers presented an iterative algorithm of convex transformations in [25] as a spectrum sharing scheme for improving energy efficiency, formulating the problem as a non-linear programming issue. In [16], power allocation using channel state information is implemented for CRN. Sudhamani and Satya Sai Ram [21] use cooperative spectrum sensing with fusion rules (AND and OR), with EE enhancement depending on the number of users. Ghosh et al. [22] designed a game theory method to improve spectrum efficiency and EE for massive Multi-Input Multi-Output cognitive radio networks based on 5G networks; the disadvantage of this game theory method is that it converges to a suboptimal solution rather than a globally optimal solution [26]. As per the literature, many studies have treated improving spectrum sensing as a convex optimization issue; to the best of the authors' awareness, energy efficiency optimization has so far only been developed as a non-convex function [23]. The ABC algorithm was used to improve a CRN's EE in recent literature [27]. GSA is incorporated to improve PSO's exploitation ability, as Particle Swarm Optimization is recognized for lagging in exploitation despite improved exploration [28].

3 Energy-Efficient Optimization Factor

This paper aims to enhance energy efficiency while addressing various spectrum sensing theories. Furthermore, EE is enhanced by using a hybrid SVM-RDA to improve the EE metrics. Scanning the spectrum for spectrum holes is the initial step in the traditional spectrum sensing method; for smooth communication, periodic sensing is needed.
If a vacant channel is observed during spectrum sensing, data transfer is completed, and the secondary user repeats the procedure in the next sensing time. The SU's bandwidth 'B' is divided into two sections for this purpose: Bs for spectrum sensing and B-Bs for data transmission, with marginal energies beyond the spectrum [29]. If the PU is not observed, the SU conducts data transfer. The data transfer is completely disconnected if new spectrum sensing senses the Primary User. Depending on the existence of the PU, data transfer by secondary users can turn off and on over bandwidth B-Bs in this scheme. However, the spectrum sensing carried out in this manner over the bandwidth 'B' is constant. As a result, this consistent spectrum sensing method is referred to as an efficient spectrum sensing scheme.

3.1 The EE Through Spectrum Sensing and Data Transfer in Terms of Overall Opportunistic Throughput and Energy Utilization

Signal sampling is a significant contributor to energy utilization during spectrum sensing. Because the sampling process accounts for a large portion of the energy used during spectrum sensing, it is interesting to consider how much energy is used per sampling time. Let E_s denote the energy used by the secondary user during the sensing phase per sampling time, and G_t denote the power spectral density (PSD) of the secondary user signal used during data transfer throughout the channel. The SU's maximal transmitting power is Q_{t,max}, which is spread uniformly over the range B-Bs.
In this CR system layout, the two hypotheses for energy detection-based spectrum sensing are represented by Eq. (1) [30, 31]:

H_0: x(n) = w(n)
H_1: x(n) = h s(n) + w(n)    (1)

where n = 1, 2, …, M; M denotes the number of samples. w(n) is AWGN noise with zero mean and variance E[|w(n)|^2] = \sigma_w^2; s(n) is the PU signal with zero mean and variance \sigma_s^2 [32]; h is the AWGN block-faded channel gain.
The p_on and p_off values show whether or not the PU currently occupies the channel. Since p_on + p_off = 1, the probability of false alarm and probability of detection can be expressed as Eqs. (2) and (3):

P_f = q_f \, p_{off}    (2)

P_d = q_d \, p_{on}    (3)

where q_d is the probability of detection and q_f is the probability of false alarm of the energy detector.
The normalized energy detection statistic used for sensing the spectrum can be written as Eq. (4):

T_H = \frac{1}{\sigma_w^2} \sum_{n=1}^{2 T_p B_s} |x_n|^2    (4)

where T_p is the frame period and 2T_p B_s is the number of samples. Due to the large value of 2T_p B_s,

T_H \sim \begin{cases} N(2 T_p B_s, \; 2 T_p B_s), & H_0 \\ N(2(1 + SNR_{pr}) T_p B_s, \; 2(1 + SNR_{pr})^2 T_p B_s), & H_1 \end{cases}    (5)

SNR_pr is the signal-to-noise ratio of the Primary User measured at the secondary user.

Probability of detection: q_d = probability(T_H > \gamma' \mid H_1)

q_d = Q\left( \frac{\gamma'/(1 + SNR_{pr}) - 2 T_p B_s}{\sqrt{2 T_p B_s}} \right)    (6)

Probability of false alarm: q_f = probability(T_H > \gamma' \mid H_0)

q_f = Q\left( \frac{\gamma' - 2 T_p B_s}{\sqrt{2 T_p B_s}} \right)    (7)

where the Q function is Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2} \, dt    (8)

If q_d increases, then q_f will decrease. For a threshold probability of detection q_d^{th} < q_d, the probability of false alarm is

q_f = Q\left( Q^{-1}(q_d^{th})(1 + SNR_{pr}) + SNR_{pr} \sqrt{2 T_p B_s} \right)    (9)

Since spectrum sensing energy efficiency depends on the spectrum sensing variables, the following expressions can be found for the distinct spectrum sensing contexts (Table 1).
Total energy consumption
Table 1 Value of probability, throughput, and energy consumption

Context | Probability | Throughput | Energy consumption | Remarks
Spectrum is busy | p_on q_d^{th} | c_1 = 0 | e_1 = 2 T_p B_s E_s p_on q_d^{th} | Spectrum is occupied by the PU, and it has been correctly detected by the SU
Missed detection | p_on (1 - q_d^{th}) | c_2 = 0 | e_2 = p_on (1 - q_d^{th}) (2 T_p B_s E_s + G_t (B - B_s) T_p) | For data transmission, PU and SU will concurrently occupy the channel
False alarm | q_f p_off | c_3 = 0 | e_3 = 2 T_p B_s E_s p_off q_f | When the channel does not have a PU, the SU does not transmit
Useful detection | p_off (1 - q_f) | c_4 = T_p p_off (1 - q_f)(B - B_s) log_2(1 + |H|^2 G_t (B - B_s) / (G_0 (B - B_s))) | e_4 = p_off (1 - q_f) (2 T_p B_s E_s + G_t (B - B_s) T_p) | The vacant channel is properly found by the SU


E_T = \sum_{k=1}^{4} e_k = 2 T_p B_s E_s + G_t (B - B_s) T_p \left(1 - p_{on} q_d^{th} - p_{off} q_f\right)    (10)

Throughput

C_T = \sum_{r=1}^{4} c_r = T_p \, p_{off} (1 - q_f)(B - B_s) \log_2\left(1 + |H|^2 \frac{G_t (B - B_s)}{G_0 (B - B_s)}\right)    (11)

Energy Efficiency = \frac{C_T}{E_T} = \varepsilon = \frac{p_{off} (1 - q_f)(B - B_s) \log_2\left(1 + |h|^2 \frac{G_t (B - B_s)}{G_0 (B - B_s)}\right)}{2 B_s E_s + G_t (B - B_s)\left(1 - p_{on} q_d^{th} - p_{off} q_f\right)}    (12)
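As a purely numerical illustration of Eqs. (9) and (12), the Python sketch below evaluates q_f from a threshold detection probability and then the energy efficiency ε; all parameter values are arbitrary placeholders, not the simulation settings of this paper:

    # Numerical sketch of Eqs. (9) and (12); parameter values are placeholders.
    import numpy as np
    from scipy.stats import norm

    def q_false_alarm(qd_th, snr, Tp, Bs):
        # Eq. (9): q_f = Q(Q^{-1}(qd_th)(1 + SNR) + SNR * sqrt(2 Tp Bs))
        # norm.sf is the Q function, norm.isf its inverse
        return norm.sf(norm.isf(qd_th) * (1 + snr) + snr * np.sqrt(2 * Tp * Bs))

    def energy_efficiency(qd_th, qf, B, Bs, Es, Gt, G0, p_on, h2):
        # Eq. (12): opportunistic throughput per unit energy; note the
        # (B - Bs) factor inside the log cancels between Gt and G0
        p_off = 1 - p_on
        c = p_off * (1 - qf) * (B - Bs) * np.log2(1 + h2 * Gt / G0)
        e = 2 * Bs * Es + Gt * (B - Bs) * (1 - p_on * qd_th - p_off * qf)
        return c / e

    qf = q_false_alarm(qd_th=0.9, snr=0.1, Tp=1e-3, Bs=1e5)
    print(energy_efficiency(0.9, qf, B=6e6, Bs=1e5, Es=1e-7,
                            Gt=1e-6, G0=1e-7, p_on=0.4, h2=1.0))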

3.2 System Model for Sensing the Spectrum

IEEE 802.22 relates to wireless regional area networks; here we concentrate mostly on a Cognitive Radio Network with one Primary User, one fusion center, and i secondary users [33]. Each secondary user calculates its energy and sends it to the fusion center, which then infers the PU state. Each SU calculates the energy for a time duration τ_0. The sampling rate is f_s, so the energy value of SU_i is given by

y_i = \begin{cases} \frac{2}{\sigma_i^2} \sum_{j=1}^{f_s \tau_0} [n_i(j)]^2, & H_0 \\ \frac{2}{\sigma_i^2} \sum_{j=1}^{f_s \tau_0} [h_i S(j) + n_i(j)]^2, & H_1 \end{cases}    (13)

where h_i is the channel gain from the Primary User to the ith secondary user, S(j) is the PU signal, and n_i(j) is the noise, with noise power \sigma_i^2 = E[|n_i(j)|^2].

4 Description of Conventional Red Deer Algorithm and Support Vector Machine

4.1 Red Deer Algorithm

The red deer algorithm is a population-based metaheuristic algorithm [34]. Optimization techniques can be divided into two groups: (i) mathematical programming and (ii) metaheuristic algorithms. Existing metaheuristic algorithms may be divided into two main categories: (i) evolutionary algorithms and (ii) swarm algorithms.

The RDA, like some other metaheuristics, begins with a random population, the counterpart of which is a set of RDs. The group's strongest RDs are identified and called "male RDs", while the others are called "hinds". First, the male RDs roar; depending on the strength of the roaring process, they are divided into two categories (i.e., stags and commanders). After that, the commanders and stags of each harem compete for control of the harem. Commanders always form harems. The number of hinds in a harem is proportional to the commander's roaring and battle skills. As a result, commanders of harems mate with a significant number of hinds.
Generate an initial red deer
The definition of the array related to a red deer is:

Red Deer = [X_1, X_2, X_3, …, X_{N_{var}}]    (14)

A feasible solution X refers to a red deer. Also, the fitness value of each RD can be measured:

Value = f(Red Deer) = f(X_1, X_2, X_3, …, X_{N_{var}})    (15)

The initial population size is N_pop and N_hind = N_pop - N_male, where N_male is the number of male RDs and N_hind is the number of female RDs (hinds).
Roar male RDs
Male RDs increase their grace by roaring in this phase. As a result, just as in reality, the roaring task will succeed or fail. Considering the best solution, we find the male RD's neighbours and swap them with the prior ones if their objective values are larger than the male RD's. In effect, we allow any male RD to change position. The following equation is suggested to change the position of males:

male_{new} = male_{old} + a_1((UB - LB)a_2 + LB), if a_3 ≥ 0.5
male_{new} = male_{old} - a_1((UB - LB)a_2 + LB), if a_3 < 0.5    (16)

LB and UB are the lower and upper bounds of the search space, male_old is the present position of the male RD, and male_new is the updated position of the male RD. a_1, a_2, and a_3 lie between 0 and 1.
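A compact Python sketch of this update rule, Eq. (16), with the bounds and the uniform draws a_1, a_2, a_3 as defined above, might read:

    import numpy as np

    rng = np.random.default_rng()

    def roar(male_old, lb, ub):
        # Eq. (16): propose a new position for a male RD
        a1, a2, a3 = rng.random(3)            # a1, a2, a3 ~ U(0, 1)
        step = a1 * ((ub - lb) * a2 + lb)
        return male_old + step if a3 >= 0.5 else male_old - step

    male = np.array([0.2, 0.7])               # current male RD position
    candidate = roar(male, lb=0.0, ub=1.0)
    # in the full RDA, the candidate replaces the old position only when
    # its objective value is better than the current one
    print(candidate)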
Select γ% of the best male Red Deer as male commanders
Male RDs are of two types: the first is the commander and the second is the stag. The following formula is used to calculate the number of male commanders:

N_{com} = round{γ · N_{male}}    (17)

where N_com is the number of males who are harem commanders and 0 < γ < 1.
Finally, the number of stags is determined using the following formula:

Nstag = Nmale − Ncom (18)

Fight between stags and male commanders
Stags are assigned to every commander at random. Two mathematical formulas are given for the fighting process:

New_1 = \frac{Com + Stag}{2} + b_1((UB - LB)b_2 + LB)    (19)

New_2 = \frac{Com + Stag}{2} - b_1((UB - LB)b_2 + LB)    (20)

New_1 and New_2 are the two new solutions generated by the fighting process, Com is the commander, and Stag is the stag. b_1 and b_2 lie between 0 and 1.
Form harems
We divide the hinds among commanders to form harems in proportion to:

V_n = v_n - max{v_i}    (21)

where v_n is the power of the nth commander and V_n is the normalized value.


The following equation can be used to measure a commander's normalized strength:

P_n = \left| \frac{V_n}{\sum_{i=1}^{N_{com}} V_i} \right|    (22)

The number of hinds in each harem is

N.harem_n = round{P_n · N_{hind}}    (23)
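A small numerical sketch of Eqs. (21)-(23), with illustrative commander powers, is given below; note that with Eq. (21) as printed, the strongest commander receives V_n = 0:

    import numpy as np

    v = np.array([3.0, 5.0, 2.0])            # v_n: power of each commander
    V = v - v.max()                           # Eq. (21): normalized values
    P = np.abs(V / V.sum())                   # Eq. (22): normalized strength
    harems = np.round(P * 20).astype(int)     # Eq. (23): hinds per harem,
    print(P, harems)                          # here with N_hind = 20 hinds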

Mate commander of a harem with α percent of hinds in his harem
A commander carries out this operation; α percent of the hinds in his harem become parents:

N.harem_n^{male} = round{α · N.harem_n}    (24)

where N.harem_n^{male} is the number of hinds from the nth harem that mate with their commander, and α is a starting parameter value.
In general, the mating process is defined as follows:

Offs = \frac{Com + Hind}{2} + (UB - LB)c    (25)

where Offs is a new solution.

Mate commander of a harem with β percent of hinds in another harem
The percentage of hinds in another harem who mate with the commander is calculated using the following formula:

N.harem_k^{male} = round{β · N.harem_k}    (26)

The number of hinds in the kth harem that mate with the commander is N.harem_k^{male}.
Mate stag with the closest hind
To locate the closest hind, the distance between a stag and all hinds in the J-dimensional space is

d_i = \left( \sum_{j \in J} (stag_j - hind_j^i)^2 \right)^{1/2}    (27)
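Eq. (27) is simply the Euclidean distance in the J-dimensional search space; as a sketch with illustrative positions:

    import numpy as np

    stag = np.array([0.4, 0.9, 0.1])               # a stag in J = 3 dimensions
    hinds = np.array([[0.5, 0.8, 0.2],
                      [0.1, 0.3, 0.7]])
    d = np.sqrt(((hinds - stag) ** 2).sum(axis=1)) # Eq. (27) for each hind
    print(hinds[d.argmin()])                       # the closest hind for mating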

Specify the next generation


We choose the next generation of males based on fitness and use an evolutionary
process to choose a solution.
Convergence
It may be the case that the stopping condition is the number of iterations.

4.2 Support Vector Machine (SVM)

Among classifiers that have a linear decision boundary, the support vector machine is the most powerful one [35]. For non-linear functions, a kernel function is used. Support vector machines can work with a relatively large number of features without requiring too much computation. In the SVM, we need a classifier that maximizes the separation between the decision surface and the points. The distance between the closest points of the negative surface and the decision plane is the margin; the margin is thus denoted by the minimum distance of a training instance from the decision surface. The assumption is that positive and negative points are linearly separable by a linear decision surface. The points nearest to the decision surface are called support vectors.
Functional margin
Consider a point (x_i, y_i) with respect to a decision surface. The decision surface equation is

w · x + b = 0

In two dimensions the equation of the line is w1·x1 + w2·x2 + b = 0. The functional margin is the distance between (x_i, y_i) and the decision boundary:

γ_i = y_i (w^T x_i + b)        (28)

A greater functional margin translates into more certainty about the classification. For a set of points S = {(x_1, y_1), ..., (x_m, y_m)}, the functional margin of the set is the minimum over its points:

γ̂ = min_{i=1,...,m} γ_i        (29)

The best dividing hyperplane is the one with the greatest margin, and all training samples are assumed to satisfy the constraint

yi (w T xi + b) > 1 (30)
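Eqs. (28)-(29) translate directly into a few lines of NumPy; this is illustrative only, with labels y assumed to be in {−1, +1}:

```python
import numpy as np

def functional_margin(w, b, X, y):
    """Eqs. (28)-(29): per-point functional margins and their minimum over the set."""
    gamma = y * (X @ w + b)          # Eq. (28) for every point at once
    return gamma, gamma.min()        # Eq. (29): the margin of the whole set
```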

For data that are not linearly separable, one must look for an optimal classification plane that keeps the misclassification rate as low as possible. Finding the ideal hyperplane is equivalent to a quadratic optimization problem in which the Lagrange function is used to compute the global optimum [36]. The training samples are (PU(k), Z_i(k)), where PU(k) is the class label and Z_i(k) is the feature vector. The classifier is described by the following definition:

wi .Z i (k) + bi > 1 if PU(k) = 1 (31a)

wi .Z i (k) + bi ≤ −1 if PU(k) = 0 (31b)

wi = weighting vector and bi = bias.


Hence the optimization problem is written as

min Φ(w_i, ξ) = (1/2)‖w_i‖² + C Σ_{i=1}^{N} ξ_i

where N is the number of samples, ξ_i are the slack variables, and C is set by the user to limit the misclassification rate.
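This soft-margin objective is what standard SVM solvers minimize. A hedged scikit-learn sketch with synthetic stand-ins for the energy feature vectors Z_i(k) and the PU(k) labels (the data here are invented):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
Z = rng.normal(size=(100, 2))               # stand-in energy feature vectors
pu = (Z[:, 0] + Z[:, 1] > 0).astype(int)    # stand-in PU(k) labels

clf = SVC(kernel="linear", C=1.0)           # C bounds the slack term of the objective
clf.fit(Z, pu)
print(clf.coef_, clf.intercept_)            # the learned weight vector and bias
```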

5 Proposed Methodology: Support Vector Machine-Red Deer Algorithm (SVM-RDA)

The pseudocode for spectrum sensing using the Support Vector Machine (SVM) based Red Deer Optimization (RDO) algorithm is given below, and the corresponding flowchart is shown in Fig. 1.
Pseudocode for spectrum sensing using Support Vector Machine (SVM) based
Red Deer Optimization (RDO) Algorithm
Create system model using Primary Users (PUs), Secondary Users (SUs), and
Fusion center (FC).
Generate full duplex multi-band signal.
Separate multi-band into several subbands.
Initialize energy for SUs, PUs, and subband.
Placement PUs in a subband.
Check the availability of PUs in the spectrums; the FC collects the energy values submitted by the SUs in each round.
Sense the availability of spectrum using SVM algorithm.
Begin
Initialize the parameters of SVM such as bias and weight vector (training).

[Fig. 1 flowchart summary: an SVM model is developed and trained with initial weights on the SUs' energy levels and PU availability; if the spectrum is not sensed correctly, the SVM weights are updated by the RDO loop (initialize RDs, roar male RDs, select the best percent of male RDs as commanders, fight between commanders and stags, form harems, mate commanders and stags with hinds, select new weight parameters) until the end criterion is satisfied; SVM classification with the optimal weight values (testing set) then declares spectrum availability or unavailability, and SUs are assigned to the available spectrums.]
Fig. 1 Algorithm of SVM-RDA



for each SU_i
Take input as the energy level and the availability of PU(k).
Calculate the weight and bias vector by

w_i · Z_i(k) + b_i > 1, if PU(k) = 1
w_i · Z_i(k) + b_i ≤ −1, if PU(k) = 0

Here, w_i is the weight vector, b_i is the bias vector, and Z_i(k) is the energy value submitted by SU_i in each kth round.
end for
A randomly updated weight vector is not able to sense the spectrum availability accurately, so the weights are optimized.

Optimize min Φ(w_i, ξ) = (1/2)‖w_i‖² + C Σ_{i=1}^{N} ξ_i, where N is the number of samples, ξ_i are the slack variables, and C is set by the user to limit the misclassification rate.
Initialize Red Deer population (weight vectors)
Calculate fitness and sort them according to fitness and form hinds and male
Red Deers.
X ∗ is the best solution (weight vector)
While (t < max _iter)
for each male RD
Roar the male based on Eq. (16):

male_new = male_old + a1 · ((UB − LB) · a2 + LB), if a3 ≥ 0.5
male_new = male_old − a1 · ((UB − LB) · a2 + LB), if a3 < 0.5

Update the position if better than previous one.


end for
Sort the males and form the commanders and stags by

NCom = round{γ .Nmale } and


Nstag = Nmale − NCom

for every male commander


Fight among the stags and male commanders by

New1 = (Com + Stag)/2 + b1 · ((UB − LB) · b2 + LB)
New2 = (Com + Stag)/2 − b1 · ((UB − LB) · b2 + LB)

Update the male commanders' and stags' positions
end for
for each male commander: N.harem^male_n = round{α · N.harem_n}

Mate a male commander with randomly selected hinds of his harem:

Offs = (Com + Hind)/2 + (UB − LB) · c

Select a harem randomly and name it k: N.harem^male_k = round{β · N.harem_k}
Mate the male commander with some selected hinds of harem k by

Offs = (Com + Hind)/2 + (UB − LB) · c
end for
for each stag
Calculate the distance between the stag and all hinds and select the nearest hind by

d_i = ( Σ_{j∈J} (stag_j − hind^i_j)^2 )^{1/2}

Mate the stag with the selected hind by Offs = (Com + Hind)/2 + (UB − LB) · c.
end for
Select next generation
Update X ∗ , if there is a better solution.
end while
Return X ∗ (optimal weight value)
With the trained energy vectors, test the energy vectors in the classification module.
Classify the CRN into two classes: spectrum availability and spectrum unavailability.
SUs access available spectrums.
End whole process.

6 Simulation Results

The energy efficiency (EE) of the Cognitive Radio Network is evaluated as a function of sensing bandwidth, transmitted power, and probability of detection using MATLAB simulation data. The optimal solution in the search space of each corresponding parameter was obtained by segmenting each vector's search space (parameter) and then plotting the simulation result. The EE metrics obtained from the SVM-RDA, PSO-GSA, PSO, and ABC algorithms were simulated for 50 runs, with each run having 200 iterations per algorithm. EE is expressed in bits/Joule. The optimal transmission power Q_t and sensing bandwidth B_s are obtained through the optimization process.
Simulation Parameters:

The following conditions are taken into account when calculating the numerical results:
T_p = 0.01 s, B = 1 × 10^6 Hz, q_on = 0.3, E_s = 1 × 10^−6 J, number of primary users = 10, number of secondary users = 90.

6.1 Simulation Results of the Optimization Algorithms

6.1.1 Transmission Power vs. Energy Efficiency

The Energy Efficiency curve for differing transmission power (Q_t) at various Signal to Noise Ratio values is seen in Figs. 2 and 3. It is clear from the Energy Efficiency versus transmission power curves that the EE increases with Q_t until it reaches a peak value, after which it gradually decreases. This is because opportunistic throughput increases as transmission power increases, while Energy Efficiency is inversely proportional to power and directly proportional to throughput. The Energy Efficiency function increases as long as the improvement in throughput exceeds the rise in power level; beyond the peak, the increase in power level dominates the throughput, resulting in a steady decline in the EE function. Since the probability of false alarm reaches its minimum level at high SNR, the EE function peak for the various situations is higher, giving the EE

Fig. 2 Analysis with respect to transmission power at signal to noise ratio = −6 dB



Fig. 3 Analysis with respect to transmission power at signal to noise ratio = 8 dB

function a greater chance of finding spectrum gaps. This leads to higher opportunistic
throughput and, as a result, a higher EE function.

6.1.2 Sensing Bandwidth Versus Energy Efficiency

EE decreases as the sensing bandwidth grows, because a higher sensing bandwidth B_s results in a lower transmitting bandwidth, which lowers the opportunistic throughput and hence decreases the EE value (Figs. 4 and 5).

6.1.3 Probability of Detection Versus Energy Efficiency

In the graphs below, the probability of detection reaches 0.9. Energy Efficiency increases with an increase in detection probability, as seen in Figs. 6 and 7. A higher detection probability ensures higher opportunistic throughput, indicating an improved EE function value. With a high probability of detection, an opportunistic spectrum access scheme may operate with less interference. Because of its better exploitation potential, the SVM-RDA method achieves a better Energy Efficiency peak than ABC, PSO, and PSO-GSA for varying detection probability.

Fig. 4 Analysis with respect to sensing bandwidth at signal to noise ratio = −6 dB

Fig. 5 Analysis with respect to sensing bandwidth at signal to noise ratio = 8 dB



Fig. 6 Analysis with respect to probability of detection at signal to noise ratio = −6 dB

Fig. 7 Analysis with respect to probability of detection at signal to noise ratio = 8 dB



Table 2 Optimum energy efficiency values using Support Vector Machine-Red Deer Algorithm, Particle Swarm Optimization-Gravitational Search Algorithm, Particle Swarm Optimization, and Artificial Bee Colony for different signal to noise ratio values

Energy efficiency function | Optimization methodology | Peak EE (SNR = −6 dB) | Peak EE (SNR = 8 dB)
Varying in terms of transmission power (mW) | SVM-RDA | 7 | 15
 | PSO-GSA | 6 | 14.3
 | ABC | 5.5 | 9.7
 | PSO | 4.8 | 8
Varying in terms of sensing bandwidth (Hz) | SVM-RDA | 5.2 | 12
 | PSO-GSA | 4.9 | 11.2
 | ABC | 4.6 | 8
 | PSO | 4.4 | 7.2
Varying in terms of detection probability | SVM-RDA | 2.6 | 6.5
 | PSO-GSA | 2.5 | 6.1
 | ABC | 2.3 | 3.9
 | PSO | 2 | 2.8

6.1.4 Graphical and Tabular-Based Comparative Analysis

Table 2 demonstrates each optimization algorithm’s effectiveness (SVM-RDA, PSO,


PSO-GSA, and ABC) with respect to peak energy efficiency and parameter values.

7 Conclusion

For a Cognitive Radio Network, the authors propose an SVM-RDA-based optimization algorithm for enhancing energy efficiency in spectrum sensing. The energy efficiency function is calculated with respect to transmission power, sensing bandwidth, and probability of detection. RDA's performance has been significantly improved as a result of the hybridization of SVM and RDA. In comparison to the current PSO-GSA algorithm, the proposed SVM-RDA system's simulation results show its efficacy in terms of EE for spectrum sensing. With an SNR of 6 dB and a population size of 20, the proposed SVM-RDA is 15%, 30%, and 22% more energy efficient than PSO-GSA, PSO, and ABC, respectively, for varying transmission power. For varying probability of detection, the proposed SVM-RDA outperformed PSO-GSA, PSO, and ABC by 4%, 24%, and 16%, respectively. In contrast, for varying sensing bandwidth, the proposed algorithm outperforms PSO-GSA, PSO, and ABC by 7%, 19%, and 16%, respectively. Compared to the current PSO-GSA, PSO, and ABC algorithms, the proposed algorithm is effective in spectrum sensing and achieves optimum energy consumption for spectrum sensing in a Cognitive Radio Network.

References

1. Sun H, Nallanathan A, Wang CX, Chen Y (2013) Wideband spectrum sensing for cognitive
radio networks: a survey. IEEE Wirel Commun 20(2):74–81. https://doi.org/10.1109/MWC.
2013.6507397
2. Youssef M, Ibrahim M, Abdelatif M, Chen L, Vasilakos AV (2014) Routing metrics of cognitive
radio networks: a survey. IEEE Commun Surv Tutorials 16(1):92–109. https://doi.org/10.1109/
SURV.2013.082713.00184
3. El Tanab M, Hamouda W (2017) Resource allocation for underlay cognitive radio networks:
a survey. IEEE Commun Surv Tutorials 19(2):1249–1276. https://doi.org/10.1109/COMST.
2016.2631079 (Institute of Electrical and Electronics Engineers Inc.)
4. Ding H, Fang Y, Huang X, Pan M, Li P, Glisic S (2017) Cognitive capacity harvesting networks:
architectural evolution toward future cognitive radio networks. IEEE Commun Surv Tutorials
19(3):1902–1923. https://doi.org/10.1109/COMST.2017.2677082
5. Pollin S et al (2008) MEERA: cross-layer methodology for energy efficient resource allocation
in wireless networks. IEEE Trans Wirel Commun 7(1):98–109. https://doi.org/10.1109/TWC.
2008.05356
6. Nintanavongsa P, Muncuk U, Lewis DR, Chowdhury KR (2012) Design optimization and
implementation for RF energy harvesting circuits. IEEE J Emerg Sel Top Circuits Syst 2(1):24–
33. https://doi.org/10.1109/JETCAS.2012.2187106
7. Multi-band simultaneous radio frequency energy harvesting. In: IEEE Conference publication.
IEEE Xplore. https://ieeexplore.ieee.org/document/6546869. Accessed 19 Mar 2021
8. Sun H, Guo YX, He M, Zhong Z (2013) A dual-band Rectenna using broadband Yagi antenna
array for ambient RF power harvesting. IEEE Antennas Wirel Propag Lett 12:918–921. https://
doi.org/10.1109/LAWP.2013.2272873
9. Kuhn V, Lahuec C, Seguin F, Person C (2015) A multi-band stacked RF energy harvester with
RF-to-DC efficiency up to 84%. IEEE Trans Microw Theory Tech 63(5):1768–1778. https://
doi.org/10.1109/TMTT.2015.2416233
10. Hei Y, Li W, Fu W, Li X (2015) Efficient parallel artificial bee colony algorithm for cooperative
spectrum sensing optimization. Circuits Syst Signal Process 34(11):3611–3629. https://doi.org/
10.1007/s00034-015-0028-2
11. Hei Y, Wei R, Li W, Zhang C, Li X (2017) Optimization of multi-band cooperative spectrum
sensing with particle swarm optimization. Trans Emerg Telecommun Technol 28(12):e3226.
https://doi.org/10.1002/ett.3226
12. Azmat F, Chen Y, Stocks N (2015) Bio-inspired collaborative spectrum sensing and allocation
for cognitive radios. IET Commun 9(16):1949–1959. https://doi.org/10.1049/iet-com.2014.
0769
13. Karthikeyan A, Srividhya V, Kundu S (2019) Guided joint spectrum sensing and resource
allocation using a novel random walk grey wolf optimization for frequency hopping cognitive
radio networks. Int J Commun Syst 32(13):e4032. https://doi.org/10.1002/dac.4032
14. Eappen G, Shankar T (2020) Hybrid PSO-GSA for energy efficient spectrum sensing in
cognitive radio network. Phys Commun 40:101091. https://doi.org/10.1016/j.phycom.2020.
101091
15. Alsharoa A, Neihart NM, Kim SW, Kamal AE (2018) Multi-band RF energy and spectrum
harvesting in cognitive radio networks. In: IEEE International conference on communications,
July 2018, vol 2018-May. https://doi.org/10.1109/ICC.2018.8422511
16. Liu Z, Zhao X, Liang H (2018) Robust energy efficiency power allocation for relay-assisted
uplink cognitive radio networks. Wirel Networks 24(4):1237–1250. https://doi.org/10.1007/
s11276-016-1385-x
17. Pei Y, Liang YC, Teh KC, Li KH (2009) How much time is needed for wideband spectrum
sensing? IEEE Trans Wirel Commun 8(11):5466–5471. https://doi.org/10.1109/TWC.2009.
090350

18. Paysarvi-Hoseini P, Beaulieu NC (2011) Optimal wideband spectrum sensing framework for
cognitive radio systems. IEEE Trans Signal Process 59(3):1170–1182. https://doi.org/10.1109/
TSP.2010.2096220
19. Noh G, Lee J, Wang H, Kim S, Choi S, Hong D (2010) Throughput analysis and optimization
of sensing-based cognitive radio systems with Markovian traffic. IEEE Trans Veh Technol
59(8):4163–4169. https://doi.org/10.1109/TVT.2010.2051170
20. Tang L, Chen Y, Hines EL, Alouini MS (2011) Effect of primary user traffic on sensing-
throughput trade-off for cognitive radios. IEEE Trans Wirel Commun 10(4):1063–1068. https://
doi.org/10.1109/TWC.2011.020111.101870
21. Sudhamani C, Satya Sai Ram M (2019) Energy efficiency in cognitive radio network using
cooperative spectrum sensing. Wirel Pers Commun 104(3):907–919. https://doi.org/10.1007/
s11277-018-6059-9
22. Ghosh S, De D, Deb P (2019) Energy and spectrum optimization for 5G massive MIMO
cognitive femtocell based mobile network using auction game theory. Wirel Pers Commun
106(2):555–576. https://doi.org/10.1007/s11277-019-06179-3
23. Tang M, Xin Y (2016) Energy efficient power allocation in cognitive radio network using
coevolution chaotic particle swarm optimization. Comput Networks 100:1–11. https://doi.org/
10.1016/j.comnet.2016.02.010
24. Shaghluf N, Gulliver TA (2019) Spectrum and energy efficiency of cooperative spectrum predic-
tion in cognitive radio networks. Wirel Networks 25(6):3265–3274. https://doi.org/10.1007/
s11276-018-1720-5
25. Zhou M, Zhao X, Yin H (2019) A robust energy-efficient power control algorithm for cognitive
radio networks. Wirel Networks 25(4):1805–1814. https://doi.org/10.1007/s11276-017-1631-x
26. Han Z, Niyato D, Saad W, Başar T, Hjørungnes A (2011) Game theory in wireless and communi-
cation networks: Theory, models, and applications, vol 9780521196963. Cambridge University
Press
27. Eappen G, Shankar T (2018) Energy efficient spectrum sensing for cognitive radio network
using artificial bee colony algorithm. Int J Eng Technol 7(4):2319–2324. https://doi.org/10.
14419/ijet.v7i4.10094
28. Yang XS (2014) Swarm intelligence based algorithms: a critical analysis. Evol Intell 7(1):17–
28. https://doi.org/10.1007/s12065-013-0102-2
29. Urkowitz H (1967) Energy detection of unknown deterministic signals. Proc IEEE 55(4):523–
531. https://doi.org/10.1109/PROC.1967.5573
30. Liu X, Min J, Gu X, Tan X (2013) Optimal periodic cooperative spectrum sensing based on
weight fusion in cognitive radio networks. Sensors (Switzerland) 13(4):5251–5272. https://doi.
org/10.3390/s130405251
31. Haykin S, Thomson DJ, Reed JH (2009) Spectrum sensing for cognitive radio. Proc IEEE
97(5):849–877. https://doi.org/10.1109/JPROC.2009.2015711
32. Yin W, Ren P, Du Q, Wang Y (2012) Delay and throughput oriented continuous spectrum
sensing schemes in cognitive radio networks. IEEE Trans Wirel Commun 11(6):2148–2159.
https://doi.org/10.1109/TWC.2012.032812.110594
33. Chen H, Zhou M, Xie L, Wang K, Li J (2016) Joint spectrum sensing and resource allocation
scheme in cognitive radio networks with spectrum sensing data falsification attack. IEEE Trans
Veh Technol 65(11):9181–9191. https://doi.org/10.1109/TVT.2016.2520983
34. Fathollahi-Fard AM, Hajiaghaei-Keshteli M, Tavakkoli-Moghaddam R (2020) Red deer algo-
rithm (RDA): a new nature-inspired meta-heuristic. Soft Comput 24(19):14637–14665. https://
doi.org/10.1007/s00500-020-04812-z
35. Zhu H, Song T, Wu J, Li X, Hu J (2018) Cooperative spectrum sensing algorithm based
on support vector machine against SSDF attack. In: 2018 IEEE International conference on
communications workshops, ICC Workshops 2018—proceedings, pp 1–6. https://doi.org/10.
1109/ICCW.2018.8403653
36. Haykin S et al (2009) Neural networks and learning machines, 3rd edn
Deceptive Product Review Identification
Framework Using Opinion Mining
and Machine Learning

Minu Susan Jacob and P. Selvi Rajendran

Abstract Reviews of products are an integral part of the marketing and branding of
online retailers. They help build trust and loyalty and generally define what separates
the goods from others. The competition is so high that at times the company is
forced to rely on the third party in producing deceptive reviews to influence readers’
opinions and hence to enhance the sales. This misleads the ethics and purpose of
online shopping. This paper proposes a model to identify fake product reviews. The
model uses Naive Bayes classifier and Support Vector Machine to classify the fake
and genuine reviews. The model uses a set of parameters such as length of the review,
usage of personal pronouns, nature of the review, verified purchase status, rating of
the review and the type of the product to extract the features for the classification.
The experimental results show that the model is working efficiently with a high
classification accuracy rate.

Keywords Opinion mining · Fake product review · Naive Bayes classifier ·


Support vector machine · Classification

1 Introduction

There are a variety of criteria that help to determine an e-commerce store’s perfor-
mance and reputation. Product reviews are however a very important factor in raising
the credibility, standard and evaluation of an e-commerce shop. Often after a trans-
action has been made, the organization may provide a URL for printed literature
or e-mail marketing to encourage clients to review their service. Consumer reviews cost the merchant nothing, yet they provide an e-commerce store with one of its most important assets: customer feedback.

M. S. Jacob (B) · P. Selvi Rajendran


Department of Computer Science and Engineering, KCG College of Technology, Chennai, India
e-mail: minu.cse@kcgcollege.com
P. Selvi Rajendran
e-mail: selvi.cse1@kcgcollege.com


Merchants very often underestimate the value of product feedback for an e-commerce store, paying more attention to the many other things they must handle, such as improving the web layout, removing consumer doubts, helping customers decide in a timely manner which item to purchase [1], customer care, and administrative tasks. With the aid of product reviews, a customer knows exactly what he is purchasing, what he is being given, and what he can expect from the product. Consumer reviews have become more critical because of the absence of the right to 'test' the item prior to its online purchase. Some good reviews are posted on review websites by individuals from the product company themselves to generate false-positive product reviews [2], along with negative fake feedback [3] for their competitors. For several different goods produced by their own company, they offer good feedback. It would not be possible for users to find out whether a review is true or false.

2 Related Work

Fake news identification is the need of the hour. The literature covers content-based strategies and approaches based on propagation patterns. Content-based approaches require two steps: pre-processing the content and supervised training of a learning model on the pre-processed content. Generally, the first step includes word tokenization, stemming, and/or weighting [4, 5]. Term Frequency-Inverse Document Frequency (TF-IDF) [6, 7] can be used in the second phase to construct samples for supervised learning. TF-IDF will, however, generate very sparse samples, especially for social media data. Word embedding methods (like word2vec [8] and GloVe [9]) are used to transform words into vectors to overcome this obstacle. Pennebaker [10], using Linguistic Inquiry and Word Count (LIWC) [11], also addresses the variations in word use between deceptive and non-deceptive language. More recently, models based on deep learning have been explored in comparison with other supervised learning models [12, 13]. For example, in Mikolov [14], two LSTM RNN models were used to build the detection model: one to learn simple word embeddings and the other to improve performance by connecting the output of the long short-term memory (LSTM) with the LIWC feature vector. In addition, Doc2vec [13] is used to describe the content of each social interaction, and to achieve better efficiency, attention-based RNNs are often used. Lazer [15] additionally feeds the name of the speaker and the topic of the argument into the attention-based RNN. Because convolutional neural networks have been successful in many text classification tasks, they are also widely used. A multi-source and multi-class fake news detection system (MMFD) was proposed by Gao [16], in which a CNN analyzes the local pattern of each text in the argument and an LSTM analyzes the time dependence of the whole text. In systems based on the propagation pattern, the patterns of propagation are derived from news knowledge in the time series. Lan [17], for example, suggested a dynamic sequence time structure (DSTS) to capture

shifts in social background characteristics (such as Weibo content and users) over time, in order to discover rumours as early as possible and to recognize false news on Weibo. A layered attention RNN (HARNN) was suggested by Hashimoto [18], which uses a Bi-GRU layer with an attention mechanism, built on the Gated Recurrent Unit (GRU), to capture high-level representations of rumour material.
Chapelle [19] models the topic structure over time-series modifications and seeks support from external credible sources to assess the validity of the topic. Most current techniques, in short, are based on supervised learning: to learn the detection process, they need a large amount of labelled data, particularly the deep learning-based methods. However, annotating news on social media is too costly and requires a lot of manpower due to the enormous size of social media data. To enhance the efficiency of identification, it is therefore imperative to combine unlabelled data with labelled data in the detection of fake news. Semi-supervised learning [20] is a paradigm that can use both labelled and unlabelled knowledge.
Therefore, for the timely identification of false news, we suggest a new paradigm
for supervised learning by Naïve Bayes and SVM classifiers. We will present the
proposed structure in the next section and describe in depth how to incorporate a
deep, semi-supervised learning model.
In a recent report, Elmurngi and Gherbi [21] proposed a strategy that utilizes an open-source software tool called 'Weka' to execute machine learning algorithms using sentiment analysis [22] to classify fair and unfair reviews from Amazon reviews, based on three different classes: positive, negative, and neutral words. In that research work, spam reviews are recognized by considering only the helpfulness votes cast by the users along with the rating deviation, which limits the overall performance of the framework. Likewise, as per the analysts' observations and experimental outcomes, the existing framework utilizes a Naive Bayes classifier for spam and non-spam classification, where the accuracy is quite low and may not give precise results for the user.
Initially, O'Brien [3] and Reis et al. [23] proposed solutions that rely only on the features present in the data set, applying various machine learning algorithms to detect fake news on social media. Despite using different machine learning algorithms, the approach still needs to demonstrate how accurate the outcomes are. Wagh et al. [24] worked on Twitter to analyze the tweets posted by users, using sentiment analysis to classify Twitter tweets into positive and negative. They used K-Nearest Neighbor as a technique to assign sentiment labels, preparing and testing the set using feature vectors. However, the applicability of their methodology to other sorts of data has not been validated.

3 Methodology

To overcome the serious problem online websites face due to review spamming, this work proposes detecting such spammed fake reviews by grouping them as fake in the first layer of the model. The system classifies publicly available review datasets drawn from a variety of sources and categories, including service-based and product-based customer feedback, experience reviews, and a crawled Amazon dataset. Higher precision is achieved with Naïve Bayes [25] and SVM classifiers [26]. To improve the accuracy, additional features beyond the review text are used: the sentiment of the review, verified purchase status, ratings, emoji counts [27, 28], and product categories with an overall score. A classifier is constructed based on the features identified, and those features are assigned a likelihood factor or weight depending on the labelled training sets. This involves the usage of different supervised algorithms to find real and fake reviews. The dataset used in this work is the Amazon dataset collected from the Amazon review data location. For assessment, approximately 10,000 reviews are taken. As it is a broad data collection, 70% of the data is used for training and 30% is used for testing. Collected reviews are pre-processed and the processed data is stored in a registry. Based on certain parameters, features are extracted from the pre-processed data. The features extracted are the length of the review, the reviewer ID, the rating, and verified purchase status. Fake reviews tend to be of smaller length, and multiple reviews from the same ID tend to be a sign of fake reviews. The collected reviews are classified using two machine learning algorithms, the Support Vector Machine and the Naive Bayes classification algorithm, and the performance of the two algorithms is compared in terms of accuracy, precision, and recall.
The pre-processed data was transformed by applying those parameters into a set
of features. Extracted were the following features:
• Normalized length of the review: fake reviews tend to be of smaller length.
• Reviewer ID: a reviewer posting multiple reviews with the same Reviewer ID.
• Rating: fake reviews in most scenarios have 5 out of 5 stars to entice the customer, or have the lowest rating for the competing products; it thus plays an important role in fake detection.
• Verified purchase: reviews that are fake have a lesser chance of being verified purchases than genuine reviews.
• Nature of the review: whether the common problem/highlight of a product is discussed or not.
Fake reviews often have stronger positive or negative emotions than other reviews, because they are used to affect people's opinions. Advertisers post fake reviews with more emotional than objective information, conveying how happy the product made them rather than how the product is or what it does. This is why sentiment analysis is used to classify reviews as fake and genuine: the reviews are classified according to their sentiment score or opinion as positive, negative, or neutral.
It involves predicting whether a review is positive or negative based on the words used in the text, the emojis used, the ratings given to the review, and so on. Related research [29] says that fake reviews carry stronger positive or negative emotions than genuine reviews. The reasons are that fake reviews are used to influence people's opinions, and it is more effective to communicate opinions

than to plainly portray the facts. The subjective versus objective proportion matters: promoters post fake reviews with more emotion-laden words than objective data, conveying how cheerful the product made them rather than how the item is or what it does. Positive sentiment versus negative sentiment: the opinion of the review is analyzed, which in turn helps in deciding whether it is a fake or genuine review.
The method classifies the feedback obtained from publicly available datasets from various sources and categories (service-based, product-based, customer criticism, participation-based, and the crawled Amazon dataset) using Naïve Bayes and SVM with greater accuracy. To improve the accuracy, additional parameters or features such as the length of the review, rating of the review, nature of the review, and verified purchase are used, and the scores obtained are utilized in addition to the review text. Based on the known features, a classifier is constructed, and depending on the classified training sets, those features are given a probability estimate or a weight. This is a supervised learning technique that uses various machine learning algorithms to detect a fake or honest review.
At this point the raw information is pre-processed by applying tokenization, cleaning of words, and lemmatization. Feature extraction: the processed data has distinctive features that can help to resolve the classification issue, e.g., the length of reviews (fake reviews tend to be smaller in length, with fewer facts revealed about the item) and fancy words (fake reviews contain fancy words). Apart from the review text itself, there are other features that can contribute towards the classification of reviews as fake. A few of the critical ones that were used as additional features are ratings, verified purchase, nature of the review, and length of the review.
The diagram shown below (Fig. 1) explains the entire flow of the paper:
• Customer Review Collection
• Data Pre-processing
• Sentimental Analysis/Opinion Mining
• Fake Product Review Detection
• Evaluation of Results and Result Analysis.

4 Experimental Results

We need a method to find unusual patterns of text and style of writing [30]. The system should have an internal facility to score the reviews, and any suspicious similarities should be flagged to the authority. The fake review detection algorithm requires time and training to begin working faultlessly. The implementation of this work used supervised learning techniques on the datasets, and the fake and genuine labels helped to cross-validate the classification results of the data. Datasets of such labelled reviews were found from different sources and combined into a customerreviews.csv file. Then, to make it readable, the labels in the dataset were clearly marked as fake or genuine. Since the dataset created from multiple sources of information had many

[Fig. 1 architecture: customer review collection; data pre-processing (blank space removal/cleaning, conversion of text to lowercase, stemming and lemmatization); preparation of the train and test data sets; opinion mining, applying the parameters for classification; fake product review detection with the Naive Bayes classifier and Support Vector Machine; evaluation of results and result analysis.]
Fig. 1 The implementation architecture

forms of redundant and unclean values, it needed to be pre-processed. The data had been cleaned by removing all the null values, white spaces, and punctuation. This raw dataset was loaded in the form of <ID, Review text, Label> tuples, allowing us to focus only on the textual review content. Then the raw data was pre-processed by applying tokenization, removal of stop words, and lemmatization. This processed data was then analyzed for emotion or sentiment, i.e., whether the review was positive or negative. The significant parameters for the sentiment analysis of the reviews were the length of the review, the rating of the review, and the nature of the review. While pre-processing the input, the data was parsed with personal pronouns treated as an exception, so that they were not removed or discarded by accident while cleaning the dataset. The pre-processed dataset was then classified using different classification algorithms to analyze a variety of data.
The two classifiers used in this configuration are:
• Naïve Bayes classifier
• Support Vector Machine
Naïve Bayes [25] and the Support Vector Machine were used for detecting the genuine (T) and fake (F) reviews across a wide range of the data set. The probability for each word was calculated as the ratio of the frequency of the word within a class to the total number of words in that class. The dataset was split into 70% training and 30% testing: 7000 reviews for training and 3000 for testing. Finally, for testing the data, the likelihood of each review was determined for each class using a test collection. The class

with the highest probability value determines the label applied to the review, i.e., genuine (T) or fake (F). F-train.txt and T-train.txt were the datasets used for training; they included both the Review ID (e.g. ID-1100) and the review text (e.g. 'Great product').
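As a toy illustration of this per-class word probability (the two example reviews are invented):

```python
from collections import Counter

def word_probs(class_docs):
    """P(word | class) = frequency of the word in the class / total words in the class."""
    counts = Counter(w for doc in class_docs for w in doc.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

fake_probs = word_probs(["great product must buy", "great great amazing"])
print(fake_probs["great"])   # 3 occurrences / 7 words
```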
The dataset consisting of 10,000 product reviews is read and given for pre-processing. The raw data is transformed into a format understandable by the NLP models, which helps in getting better results when applying the classification algorithms. The input data includes an identification number for each row along with the review text and its label. We initially consider two labels, Label 1 and Label 2, to indicate fake and genuine reviews; later the labels are changed to 'fake' and 'genuine' for a better understanding. This helps in analyzing the parameters for feature extraction. Besides the other easy-to-understand steps, the following two steps are also performed:
• Tokenization
• Word Stemming/Lemmatization
Tokenization: This is a mechanism by which a body of text is broken into words, phrases, symbols, or other essential components called tokens. The resulting list of tokens is passed on for further processing. The NLTK library has word_tokenize and sent_tokenize to easily split a stream of text into a list of words or sentences, respectively. NLP, short for natural language processing, is a branch of artificial intelligence which focuses on allowing computers to understand and interpret human language. The issue with human language comprehension is that it is not a collection of rules or linear data that can be fed into the machine (Figs. 2 and 3).
Word Stemming/Lemmatization: Stemming and lemmatization are related. The
distinction is that a stemmer works on words without taking into account the meaning,
whereas lemmatization is context-based. Consider the following table for better
understanding. Pre-processing of the data input helps in cleaning the data and the
final result of this will be a set of words in a list that is ready for analysis (Table 1).
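A minimal NLTK sketch of these two steps; the sample sentence is invented, and the resource downloads are one-time setup (some NLTK versions also need 'omw-1.4'):

```python
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("punkt")
nltk.download("wordnet")

tokens = word_tokenize("Studying these policies was worth it")
stems = [PorterStemmer().stem(t) for t in tokens]            # 'studying' -> 'studi'
lemmas = [WordNetLemmatizer().lemmatize(t) for t in tokens]  # 'policies' -> 'policy'
print(tokens, stems, lemmas, sep="\n")
```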

5 Feature Extraction

The pre-processed reviews are further analyzed with various parameters for the
feature extraction. These parameters help in improving the accuracy of the training
system and hence the prediction. The following parameters are analyzed.
Normalized Length of the Review: Fake reviews appear to be smaller in length (Figs. 4 and 5).

Fig. 2 The extraction of data

Fig. 3 Analysis of corpus text

Table 1 Stemming versus lemmatization

Word | Stem | Lemma
Studies | Studi | Study
Studying | Study | Study
Policies | Polici | Policy

Nature of the Review: A fake negative review will not discuss the common problem of the product. Similarly, the common highlight of the product will not be discussed in a fake positive review.

Fig. 4 The steps in pre-processing

Fig. 5 The analysis on length of the review

Reviewer ID: A reviewer posting a couple of opinions with the same Reviewer ID
is likely to be fake.
Rating: Fake reviews in most situations have 5 out of 5 stars to attract the client or
have the lowest rating for a negative fake review.
Verified Purchase: A review from a verified purchase has less chance of being fake.
A combination of all these features is selected to enhance the accuracy of the model (Table 2).

Table 2 Impact of nature of review

Nature of review | Label | Count
Common problem discussed | Fake | 2100
Common problem discussed | Genuine | 7412
Common problem not discussed | Fake | 8100
Common problem not discussed | Genuine | 1428

Table 3 Impact of length of the review

Review ID | No. of words | Fake/Genuine
2 | 8 | Genuine
4 | 15 | Fake
9996 | 10 | Genuine
9998 | 15 | Fake

It is analyzed that the length of the review is closely related to the genuineness of the review: if the length of the review is above the threshold (more than 10 words), the chance that the review is fake is high (Table 3).
Reviews for a product posted by the same user multiple times are likely to be fake; this is identified with the Reviewer ID parameter. The rating of the review is considered another important parameter in finding out the genuineness of the review. Fake reviews in most situations have the maximum number of stars to attract the client, or the lowest rating in the case of a negative fake review; a review rating in the midrange (3 or 4) is identified as genuine. A review from a verified purchase has less chance of being fake.
Table 4 indicates the impact of verified purchases on the detection of fake reviews. If the review is from a person who has actually bought the product, the chances of it being genuine are higher; on the other hand, if the review is from a person whose purchase is not verified, then the chances of genuineness are very low (Fig. 6).
The following graph (Fig. 7) indicates the nature of the review. The nature of the review is considered an important parameter in finding out whether the review is genuine or fake. It is analyzed that a review which does not address the common problem of a product discussed in the majority of the reviews is considered a fake negative review. Similarly, a review which does not address the common highlight

Table 4 Impact of verified purchase

Verified purchase | Label | Count
Yes | Fake | 2577
Yes | Genuine | 7234
No | Fake | 7623
No | Genuine | 1679

Fig. 6 The analysis on verified Purchase

Fig. 7 The analysis on nature of review

of a product that is discussed in the majority of the reviews is considered a fake positive review (Fig. 8).
The corpus will be split into two data sets, training and testing. The training data set will be used to fit the model, and predictions will be performed on the test data set. This can be done via the train_test_split function of the Sklearn library. As we have set the test_size = 0.3 parameter, the training data will have 70% of the corpus and the test data will have the remaining 30%.
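A sketch of this split with scikit-learn; the tiny stand-in corpus and column names are assumptions for illustration, in place of the 10,000-review dataset:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

corpus = pd.DataFrame({
    "text": ["great product", "bad fake review", "love it", "do not buy"],
    "label": ["genuine", "fake", "genuine", "fake"],
})

Train_X, Test_X, Train_Y, Test_Y = train_test_split(
    corpus["text"], corpus["label"], test_size=0.3, random_state=42)
```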

Fig. 8 Feature extraction

The target variable encoding is performed to transform string-type categorical data in the data set into numerical values that can be interpreted by the model. Then the word vectorization process is carried out. The transformation of a collection of text documents into vectors of numerical features is a general operation. Many methods for translating text data to vectors are available, but TF-IDF is by far the most common. This is an acronym for Term Frequency-Inverse Document Frequency, whose two components make up the resulting score assigned to each term.
Term Frequency: This summarizes how frequently a given word appears within a text.
Inverse Document Frequency: This downscales terms that appear in many documents.
Without going into the math, TF-IDF scores highlight words that are more interesting, i.e., frequent in a given text but not across documents. The following syntax may be used to first fit the TF-IDF model over the entire corpus. This will help TF-IDF build a vocabulary of the words it has studied.
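Continuing the sketch above, the fit-then-transform step might look as follows (max_features is an assumed setting, not the authors' value):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

tfidf = TfidfVectorizer(max_features=5000)
tfidf.fit(corpus["text"])                   # learn the vocabulary over the whole corpus
Train_X_Tfidf = tfidf.transform(Train_X)    # vectorize the training reviews
Test_X_Tfidf = tfidf.transform(Test_X)      # vectorize the test reviews
print(tfidf.vocabulary_)                    # word -> column index, as displayed below
```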
We have got a vocabulary from the corpus as shown below (Figs. 9, 10 and 11):
{‘even’: 1459, ‘sound’: 4067, ‘track’: 4494, ‘lovely’: 346, ‘paint’: 3045, ‘mind’:
2740, ‘well’: 4864, ‘would’: 4952, ‘recommend’: 3493, ‘people’: 3115,’ hate ‘: 1961,’
video ‘: 4761…}
The vectorized data is available in the variable Train_X_Tfidf. Each review in the dataset is represented with the help of a vector for further analysis. The Naïve Bayes classifier gives an accuracy score of 83.1%. The SVM classifier gives an accuracy score of 84.6%. A comparative analysis is shown in the following graph (Fig. 12). It

Fig. 9 The label genuine/fake

Fig. 10 Analysis of parameters for feature extraction

is identified that the SVM classifier gives a better result on the given dataset.
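Continuing the same sketch, the two classifiers can be trained and scored as follows; hyperparameters are illustrative assumptions, and the 83.1%/84.6% figures come from the authors' full dataset, not from this toy example:

```python
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score

encoder = LabelEncoder()                       # target variable encoding
Train_Y_enc = encoder.fit_transform(Train_Y)
Test_Y_enc = encoder.transform(Test_Y)

for model in (MultinomialNB(), SVC(kernel="linear", C=1.0)):
    model.fit(Train_X_Tfidf, Train_Y_enc)
    pred = model.predict(Test_X_Tfidf)
    print(type(model).__name__, accuracy_score(Test_Y_enc, pred))
```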

Fig. 11 The accuracy score calculation

Fig. 12 The comparison of accuracy in percentage

6 Conclusion

The identification of deceptive reviews is intended to filter out false reviews. In this approach, the SVM classifier of the research work provided enhanced classification accuracy on the given dataset compared to the Naïve Bayes classifier. The model can precisely classify fake feedback and forecast it effectively. It is possible to apply this technique to other sampled dataset examples. The visualization of data helped explore the accuracy of the classification on the dataset and also the characteristics found. The different algorithms used and their

accuracy indicate how each of them performed on the basis of their accuracy factors. The method also provides the consumer with a feature that suggests the most important truthful feedback, enabling the customer to make choices about the merchandise. Various factors, such as the inclusion of new features like review length, scores, nature of the review, and verified purchase, have helped the inference and thereby improved the precision of the data classification.

References

1. Leskovec J (2018) WebData Amazon reviews [Online]. Available: http://snap.stanford.edu/


data/web-Amazon-links.html. Accessed: Oct 2018
2. Li J, Ott M, Cardie C, Hovy E (2014) Towards a general rule for identifying deceptive
opinion spam. In: Proceedings of the 52nd annual meeting of the association for computational
linguistics, Baltimore, MD, USA, vol 1, no 11, pp 1566–1576, Nov 2014
3. O’Brien N (2018) Machine learning for detection of fake news [Online]. Available: https://dsp
ace.mit.edu/bitstream/handle/1721.1/119727/1078649610-MIT.pdf. Accessed: Nov 2018
4. Tong S, Koller D (2001) Support vector machine active learning with applications to text
classification. J Mach Learn Res 2(Nov):45–66
5. Ramos J et al (2003) Using TF-IDF to determine word relevance in document queries. In:
Proceedings of the first instructional conference on machine learning, vol 242, Piscataway, NJ,
pp 133–142
6. Paik JH (2013) A novel TF-IDF weighting scheme for effective ranking. In: Proceedings of
the 36th international ACM SIGIR conference on research and development in information
retrieval. ACM, pp 343–352
7. Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in
vector space. arXiv reprint arXiv:1301.3781
8. Pennington J, Socher R, Manning C (2014) Glove: global vectors for word representation.
In: Proceedings of the 2014 conference on empirical methods in natural language processing
(EMNLP), pp 1532–1543
9. Mihalcea R, Strapparava C (2009) The lie detector: explorations in the automatic recognition
of deceptive language. In: Proceedings of the ACL-IJCNLP 2009 conference short papers.
Association for Computational Linguistics, pp 309–312
10. Pennebaker JW, Francis ME, Booth RJ (2001) Linguistic inquiry and word count: LIWC 2001,
vol 71, no 2001. Lawrence Erlbaum Associates, Mahwah, p 2001
11. Ruchansky N, Seo S, Liu Y (2017) CSI: a hybrid deep model for fake news detection. In:
Proceedings of the 2017 ACM on conference on information and knowledge management.
ACM, pp 797–806
12. Rashkin H, Choi E, Jang JY, Volkova S, Choi Y (2017) Truth of varying shades: analyzing
language in fake news and political fact checking. In: Proceedings of the 2017 conference on
empirical methods in natural language processing, pp 2931–2937
13. Karimi H, Roy P, Saba-Sadiya S, Tang J (2015) Multi-source multiclass fake news detection. In:
Proceedings of the 27th international conference on computational linguistics, pp 1546–1557
14. Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In:
International conference on machine learning, pp 1188–1196
15. Lazer DMJ et al (2018) The science of fake news. Science 359(6380):1094–1096
16. Ma J, Gao W, Wei Z, Lu Y, Wong K-F (2015) Detect rumors using time series of social context
information on microblogging websites. In: Proceedings of the 24th ACM international on
conference on information and knowledge management. ACM, pp 1751–1754
17. Lan T, Li C, Li J (2018) Mining semantic variation in time series for rumor detection via
recurrent neural networks. In: 2018 IEEE 20th International conference on high performance

computing and communications; IEEE 16th International conference on smart city; IEEE 4th
International conference on data science and systems (HPCC/SmartCity/DSS). IEEE, pp 282–
289
18. Hashimoto T, Kuboyama T, Shirota Y (2011) Rumor analysis framework in social media. In:
TENCON 2011–2011 IEEE Region 10 conference. IEEE, pp 133–137
19. Chapelle O, Scholkopf B, Zien A (2009) Semi-supervised learning (Chapelle O et al (eds),
2006) [book reviews]. IEEE Trans Neural Networks 20(3):542–542
20. Zhu XJ (2005) Semi-supervised learning literature survey. University of Wisconsin-Madison
Department of Computer Sciences, Technical report
21. Elmurngi EI, Gherbi A (2018) Unfair reviews detection on Amazon reviews using sentiment
analysis with supervised learning techniques. J Comput Sci 14(5):714–726
22. Conroy NJ, Rubin VL, Chen Y (2015) Automatic deception detection: methods for finding
fake news. In: Proceedings of the annual meeting. Association for Information Science and
Technology
23. Reis JCS, Correia A, Murai F, Veloso A, Benevenuto F (2019) Supervised learning for fake
news detection. IEEE Intell Syst 34(2):76–81
24. Wagh B, Shinde JV, Kale PA (2017) A Twitter sentiment analysis using NLTK and machine
learning techniques. Int J Emerg Res Manag Technol 6(12):37–44
25. McCallum A, Nigam K (1998) A comparison of event models for Naive Bayes text classifi-
cation. In: Proceedings of AAAI-98 workshop on learning for text categorization, Pittsburgh,
PA, USA, vol 752, no 1, pp 41–48, July 1998
26. Wang WY (1999) Liar, liar pants on fire: a new benchmark dataset for fake news detection. In:
Proceedings of the annual meeting. Association for Computational Linguistics, pp 422–426
27. Novak J (2019) List archive Emojis [Online]. Available: https://li.st/jesseno/positivenegative-
and-neutral-emojis-6EGfnd2QhBsa3t6Gp0FRP9. Accessed: June 2019
28. Novak PK, Smailović J, Sluban B, Mozeti I (2015) Sentiment of Emojis. J Comput Lang
10(12):1–4
29. Liu N, Hu M (2019) Opinion mining, sentiment analysis and opinion spam detection [Online].
Available: https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#lexicon. Accessed: Jan
2019
30. Long Y, Lu Q, Xiang R, Li M, Huang C-R (2017) Fake news detection through multi-perspective
speaker profiles. In: Proceedings of the eighth international joint conference on natural language
processing, vol 2: Short papers, pp 252–256
Detecting Six Different Types of Thyroid
Diseases Using Deep Learning

R. Mageshwari, G. Sri Kiruthika, T. Suwathi, and Tina Susan Thomas

Abstract Thyroid is the most common disease worldwide. More than 42 million
people are affected by thyroid in India. Thyroid is a small gland in the neck region
that secretes thyroid hormone which is responsible for directing all our metabolic
functions. When the thyroid malfunctions, it can affect every facet of our health.
Thyroid disease has become more challenging for both medical scientists and doctors.
Detecting thyroid diseases in early stage is the first priority to save many people’s
life. Typically, visual examination and manual techniques are used for these types of
thyroid disease diagnoses. Thus, in this project, we apply deep learning algorithms
to detect six different types of thyroid diseases and its presence without the need
for several consultations from different doctors. This leads to earlier prediction of
the presence of the disease and allows us to take prior actions immediately to avoid
further consequences in an effective and cheap manner avoiding human error rate.

Keywords Deep learning · Prediction · Thyroid

1 Introduction

The thyroid is a part of the endocrine system that secretes thyroid hormone in our neck and is responsible for controlling heart and brain development and muscle and digestive function [1]. The Thyroid Disease Detection project focuses on detecting six types of thyroid conditions: thyroid cancer, thyroid nodules, hyperthyroidism, goiter, thyroiditis, and hypothyroidism [2]. In hypothyroidism the body does not have enough thyroid hormones, which leads to weight gain, dry skin, and heavy periods [3]. Thyroiditis, which releases too much thyroid hormone, causes severe pain. Hyperthyroidism is caused when the thyroid is overactive and releases far more hormones, which leads to increased appetite, diarrhea, and rapid heartbeat [4]. Thyroid nodules are simply unusual growths in which the cells might form either a fluid-filled or a solid lump. Thyroid cancer arises from malignant cells in the thyroid gland. Prediction of thyroid

R. Mageshwari · G. Sri Kiruthika · T. Suwathi · T. S. Thomas (B)


Department of Information Technology, KCG College of Technology, Chennai, India
e-mail: tina.it@kcgcollege.com


disease presence by manually checking the reports by a number of doctors leads to mistakes, with potentially life-threatening consequences. In the existing system, benign and malignant thyroid nodules are detected using OTL on ultrasound images [5]. The accuracy of the prediction was also compared with VGG-16 based on different input images and the transfer learning model. The purpose of thyroid disease detection is to detect the problems related to the thyroid gland without multiple consultations from different doctors. As it is an automated process, people can save time and money. To detect the disease, we use the VGG-16 algorithm in deep learning.

2 Related Work

This literature survey provides detailed documentation about the current state of deep
learning technology in information and research systems [6]. It outlines, in general,
the functioning and the factors required for the success of deep learning [7]. Here the
platform where this technology is used is mentioned and also its uses are described.
This technology enables safe and secure interaction between applications and people
without the help of doctors [8]. Conclusively, it is noted that this technology in its
current state has a long way to go before it can be adopted as a mainstream technology.

3 Objective

The main purpose of this project is to determine the presence of thyroid disease at an early stage. Predicting the presence of thyroid disease by manually examining the results of reports from various doctors is time consuming and may sometimes lead to errors that can even cost human lives. The purpose of thyroid diagnosis is to diagnose thyroid-related problems without consulting a variety of doctors. Deep learning has so far been very helpful in health care systems in understanding the source and progression of the disease. As the process is automated, people can save time and money.

4 Motivation

When it comes to health care, thyroid disease can affect anyone: men, women, infants, teenagers, and the elderly. It can be present at birth (typically hypothyroidism) and it can develop as you age (often after menopause in women). More than 42 million people are affected by thyroid disease in India. Thyroid disease leads to loss of life and increased risk, and the manual diagnostic process is time consuming. Thyroid disease is sometimes hard to examine, as its symptoms are easily confused with those of other conditions. The input is given as a scanned image from local storage, and the output is the name of the disease present along with the patient's details.

5 Existing System

The existing work aimed to put forward a highly generalized model, a new framework of Online Transfer Learning (OTL), to find the difference between benign and malignant thyroid nodules using ultrasound (US) images [9]. The Online Transfer Learning method is the combination of transfer learning and online learning. Two thyroid nodule datasets were collected to develop the model: one of 1750 images containing 1078 benign and 672 malignant nodules, and one of 3852 thyroid nodule images containing 3213 benign and 639 malignant nodules [10]. The OTL method was also compared with VGG-16. One of the main disadvantages is that transfer learning only works if the initial and target problems of both models are similar enough. Only the benign and malignant kinds of thyroid nodules are detected by the existing system, and accuracy decreases with a decrease in the amount of data.

6 Proposed System

An intelligent recognition system for six different thyroid classes can predict the
presence of different types of thyroid disease by integrating an image processing model
with a deep learning technique. The proposed system discusses the detection of thyroid
information to increase accuracy. Both researchers and doctors face challenges in
fighting thyroid diseases, whose diagnosis and prediction remain difficult problems in
medical research. Thus, in this project we develop and apply a novel deep learning
architecture to effectively detect and identify the presence of five different thyroid
diseases, namely Hyperthyroidism, Hypothyroidism, Thyroid cancer, Thyroid nodules,
and Thyroiditis, and to check for a normal thyroid, without the need for several
consultations with different doctors. The VGG-16 algorithm is used to achieve this
purpose, with an optimization technique (SGD, stochastic gradient descent) and
activation functions such as ReLU and ELU to increase the accuracy. This eliminates
the human error rate. A web application for hospitals will be developed which requires
a scanned input image to predict the type of disease.
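
As an illustration, a minimal Keras sketch of the kind of VGG-16-based classifier described above is given below. This is a sketch under stated assumptions, not the authors' exact configuration: the dense-head size, input resolution, and learning rate are illustrative.

import tensorflow as tf
from tensorflow.keras import layers, models

# VGG-16 feature extractor pre-trained on ImageNet, without its top layers.
vgg = tf.keras.applications.VGG16(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))

model = models.Sequential([
    vgg,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),   # ReLU, as named in the text
    layers.Dense(6, activation="softmax"),  # five diseases + normal thyroid
])

# SGD (stochastic gradient descent), the optimizer named in the proposal.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss="categorical_crossentropy",
              metrics=["accuracy"])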

7 Architecture Diagram

In this project, as given in Fig. 1, the presence of thyroid diseases is highlighted so that
it is easy to verify their presence in CT pictures or X-rays. Initially, the primary step is
dataset collection, where we aggregate datasets such as the CT images or X-rays used
by laboratories to analyze the presence of thyroid diseases, from numerous resources
on the internet. After that, we split those datasets into different classes; that is, the
dataset is

Fig. 1 Architecture for detection of thyroid

divided into training and testing datasets. The training dataset is used for training the
model, whereas the testing dataset is employed to evaluate the model once it has been
completely prepared. The training dataset first undergoes a process referred to as
dataset augmentation, where the dataset is multiplied into many more samples; it then
undergoes pre-processing, which brings all images to a single size. We train on those
datasets by extracting features using a novel deep learning design. The model then
undergoes optimization, which decreases the loss and reduces the noise generated
during training. It then undergoes a process termed model serialization, so that the
generated model can be evaluated. Thereby, the presence of thyroid diseases can be
predicted from the testing dataset.
A JavaScript framework, React.js, will be used to develop the interface for scanning
an input image. This will provide as output the kind of thyroid disease, saving plenty
of the time and money invested by patients. Thus, this methodology provides an
effective and low-cost method to determine the presence of thyroid diseases compared
with the methodologies used today.
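
A sketch of the augmentation and resizing stage of this pipeline is given below, assuming Keras' ImageDataGenerator and a hypothetical directory layout ("data/train", "data/test") with one sub-folder per class; the augmentation parameters are illustrative.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation multiplies the training images; rescaling and target_size
# bring every image to a single value range and size, as described above.
train_gen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=15,
    zoom_range=0.1,
    horizontal_flip=True,
).flow_from_directory(
    "data/train",              # hypothetical path, one sub-folder per class
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical")

test_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "data/test", target_size=(224, 224), batch_size=32,
    class_mode="categorical")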

8 Modules

Dataset Collection—Ultrasound images are collected for various types of thyroid
diseases from various hospitals. There are three steps to collecting data: collecting,
manually checking, and downloading X-ray or CT scans, which can take a long time
due to the amount of work involved. As data has become an important asset in deep
learning, more information can be obtained from third-party resources. One can start
with a network pre-trained on a large database and then adapt it to one's own data.

Fig. 2 Dataset collection

Data Augmentation—Using this method, the amount of data collected is increased.
Data Processing—The datasets collected from various hospitals are in a raw format
that is not usable for analysis, so a data processing step is applied.
Algorithm Training—The datasets are trained using a deep learning algorithm.
Optimization—To increase efficiency and reliability, an optimization process is applied
(Figs. 2, 3 and 4).

9 Algorithm

Novel Architecture: In this project, as given in Fig. 5, a novel architecture is used
to identify five different types of thyroid diseases, namely Hyperthyroidism,
Hypothyroidism, Thyroid cancer, Thyroid nodules, and Thyroiditis, and to check whether
the thyroid gland is normal. The network takes an input image with height and width
as multiples of 32 and 3 as the channel width; the input size will be 224 × 224
× 3. The novel design performs the first convolution and max-pooling using kernel
sizes of 7 × 7 and 3 × 3, respectively. After that, Stage 1 of the network starts
and consists of three residual blocks containing three layers each. The numbers of
kernels used to perform the convolution operation in the three layers of block 1 are
64, 64, and 128, respectively. Curved arrows indicate identity connections. The dashed
arrow indicates that the convolution operation in a residual block is performed with
stride 2; hence, the input size is reduced by half in length and width, but the channel
width is doubled. Continuing from one stage

Fig. 3 Efficiency of model obtained

Fig. 4 Graph plot after the training process



Fig. 5 Steps for processing the thyroid image

to another, the channel width is doubled and the input size is reduced by half. In
deep networks, a bottleneck design is used. For each residual function F, three layers
are stacked in sequence: 1 × 1, 3 × 3, and 1 × 1 convolutions. The 1 × 1 convolution
layers are responsible for reducing and then restoring dimensions, leaving the 3 × 3
layer as a bottleneck with smaller input/output dimensions. Finally, the network has
a pooling layer followed by a fully connected layer of 1000 neurons (the output for
the ImageNet classes).

10 Flow Chart

Flow Chart is given below in Fig. 6.

11 Workflow

Step 1: Start.

Fig. 6 Flow chart

Step 2: Get the scanned image.


Step 3: Upload the scanned image from the local storage.
Step 4: Get the digital output.
Step 5: Stop.

12 Conclusion

The thyroid disease detection model leads to earlier prediction of the presence of the
disease and allows us to take preventive actions immediately, avoiding further
consequences in an effective and cheap manner while avoiding human error. We use
PyCharm and Visual Studio Code as Integrated Development Environments, with
ReactJS as the front-end tool and Node.js as the back-end tool. We have concluded that
by giving ultrasound images as input we can get the disease name with patient details
as output. Therefore, instead of checking reports manually, we can implement the same
process digitally.

References

1. Zhou H, Wang K, Tian J (2020) Online transfer learning for differential diagnosis of benign
and malignant thyroid nodules with ultrasound images. IEEE Trans Biomed Eng 67(10)
2. Kumar V, Webb J, Gregory A, Meixner DD, Knudsen JM, Callstrom M, Fatemi M, Alizad
A (2020) Automated segmentation of thyroid nodule, gland, and cystic components from
ultrasound images using deep learning. IEEE
3. Song W, Li S, Liu J, Qin H, Zhang B, Zhang S, Hao A (2018) Multi-task cascade convolution
neural networks for automatic thyroid nodule detection and recognition. IEEE J Biomed Health
Inf
4. Rosen JE, Suh H, Giordano NJ, A'amar OM, Rodriguez-Diaz E, Bigio II, Lee SL (2016) Preop-
erative discrimination of benign from malignant disease in thyroid nodules with indeterminate
cytology using elastic light-scattering spectroscopy. IEEE Trans Biomed Eng 61(8)
5. Anton-Rodriguez JM, Julyan P, Djoukhadar I, Russell D, Gareth Evans D, Jackson A, Matthews
JC (2019) Comparison of a standard resolution PET-CT scanner with an HRRT brain scanner
for imaging small tumors within the head. IEEE Trans Radiat Plasma Med Sci 3(4)
6. Anand R et al (2016) Blockchain-based agriculture assistance. Lecture notes in electrical
engineering. Springer, Berlin, pp 477–483
7. Feigin M, Freedman D, Anthony BW (2016) A deep learning framework for single-sided sound
speed inversion in medical ultrasound 0018-9294(c)
8. Narayan NS, Marziliano P, Kanagalingam J, Hobbs CGL (2016) Speckle patch similarity for
echogenicity based multi-organ segmentation in ultrasound images of the thyroid gland. IEEE
J Biomed Health Inf 2168-2194
9. Azizi S, Bayat S, Yan P, Tahmasebi A, Kwak JT, Xu S, Turkbey B, Choyke P, Pinto P, Wood
B, Mousavi P, Abolmaesumi P (2018) Deep recurrent neural networks for prostate cancer
detection: analysis of temporal enhanced ultrasound. IEEE Trans Med Imaging 0278-0062
10. Peter TJ, Somasundaram K (2012) An empirical study on prediction of heart disease using
classification data mining techniques. In: IEEE-International conference on advances in
engineering, science and management, ICAESM-2012, vol 14, no 8, Aug 2012
Random Forest Classifier Used
for Modelling and Classification
of Herbal Plants Considering Different
Features Using Machine Learning

Priya Pinder Kaur and Sukhdev Singh

Abstract The advancement of technology is not limited to one area; today, demand in
the field of herbal plants has also increased. As changes in the human lifestyle lead to
various living difficulties and health-related issues, there is a need for a system that
helps in the recognition and identification of herbal plants, which are not known to
everyone yet carry useful properties for humankind. Researchers have taken the
opportunity to carry out work in this direction using image recognition and machine
learning, extending further into deep learning. In this paper, images of herbal plants are
taken with a mobile camera and then resized to 20% of the original size for easy and
fast computation of features. These RGB images are converted into grayscale and then
to binary for reading features and calculation purposes. A random forest classifier is
used to classify different herbal plant species considering shape as the main feature;
various features such as length, width, area of the leaf, perimeter of the leaf, leaf area
enclosed in a rectangle, leaf percentage in the rectangle, leaf pixels in four different
quadrants, and rectangle pixels in four different quadrants are extracted during feature
extraction. Training uses 85% of the data and testing the remaining 15%, and the
classification results are discussed.

Keywords Random forest · Feature extraction · Classification · Plant recognition

1 Introduction

The technological world of today necessitates that technology be used to find solu-
tions to complex problems. Herbal plants can be found abundantly across the globe
and in all regions of any country. These herbal plants have been used for millennia
to cure various diseases which have affected humankind. These plants are

P. P. Kaur (B)
Punjabi University, Patiala, Punjab, India
S. Singh
M.M. Modi College, Patiala, Punjab, India


most effective for medicinal purposes and also help in keeping the air clean by disin-
fecting germs and dangerous micro air pollutants. Besides providing food, these
plants have great use in medicine. A physical feature such as shape helps in the
identification as well as classification of plant species and hence is an important feature.
Shape is also used by researchers as its availability is without question. In this paper,
feature extraction is done from the herbal leaves and then machine learning algo-
rithms are employed for training purposes. Afterward, the dataset created is tested
and the plant leaves are classified to obtain optimum results using a random forest
classifier.

2 Literature Review

Today, researchers carry out work using different techniques of machine learning
and deep learning. Some of the methods, techniques, and algorithms related to machine
learning for classifying and recognizing herbal plants through various features of plant
leaves have already been studied and are discussed below.
Kaur et al. [1] discussed different machine learning techniques used for the
recognition of herbal plants, various methods suitable for the identification of herbal
plant species, and which physical features are considered most important for a plant
to be recognized in image processing, machine learning, and deep learning.
Pankaja and Suma [2] introduced two methods, WOA (whale optimization algo-
rithm) and random forest, for the classification and recognition of plant leaves. Images
of plant leaves were taken from the Swedish and Flavia datasets. The WOA method
overcomes the dimensionality problem, and the random forest classifies the plant leaf,
giving 97.58% accuracy using shape, texture, and color features.
Singh and Kaur [3] gave a comparative study of features, discussing which give
better results and are used most widely in the fields of computer vision and image
processing for the recognition and identification of herbal plants. Shape, color, texture,
and vein features were discussed in detail.
Singh and Kaur [4] reviewed plant recognition systems based on the leaf, in which
different methods for classifying leaves on the basis of their shape, color, texture, and
veins were discussed, along with the problems addressed and how they were solved
using the best method/technique for the recognition, classification, and identification
of plants using machine learning.
Ramesh et al. [5] proposed a system for identifying healthy and diseased leaves
through a random forest algorithm. The dataset contains 160 papaya leaves, and
texture, color, and shape features were extracted, with the histogram of oriented
gradients method used for object detection. Accuracy attained through the random
forest was 70%; it was improved by combining local features with global features such
as SURF (speeded up robust features), SIFT (scale invariant feature transform), and
DENSE along with BOVW (bag of visual words).

Vijayshree and Gopal [6] proposed an automatic system that identifies and classi-
fies herbal medicinal plants. From 500 leaves covering 50 different species, 21 features
were extracted using color, texture, and shape. The experimental output shows that,
using a neural network for identification and classification, the texture features alone
resulted in 93.3% accuracy, while using all features gave 99.2% accuracy.
Dahigaonkar and Kalyana [7] identified ayurvedic medicinal plants from leaves on
the basis of their shape, color, and texture features. A total of 32 different plants were
classified using features such as solidity, eccentricity, entropy, extent, standard
deviation, contrast, equivalent diameter, mean, and class. They attained 96.66%
accuracy through a support vector machine classifier.
Kan et al. [8] proposed classification of herbal plants using an automated system
based on leaves. Five shape and ten texture features were extracted and fed to an SVM
classifier for plant classification. The dataset, taken from the medicinal plant specimen
library of Anhui University of Traditional Chinese Medicine, contained 240 leaves.
The experiment resulted in a 93.3% recognition rate.
De Luna et al. [9] developed a system for the identification of Philippine herbal
medicine plants through leaf features using an ANN. The dataset contains 600 images
from 12 different herbal plants, with 50 from each. Basic features (length, width,
circularity, convexity, aspect ratio, area, perimeter of convex hull, solidity, and vein
density) were used to define the venation, shape, and structure of the leaf. The clas-
sification and analysis of data were done through Python and Matlab. Experiments
were conducted for the identification of herbal plants through various algorithms,
namely Naïve Bayes, logistic regression, KNN, classification and regression trees,
linear discriminant analysis, SVM, and neural networks, and resulted in 98.6%
recognition.
Isnanto et al. [10] proposed a system for recognizing herbal plants using their leaf
patterns. Hu's seven moment invariants were used for feature extraction, the Otsu
method for image segmentation, and Euclidean and Canberra distances for recognition.
The experimental results gave 86.67% accuracy in the identification system using
Euclidean distance, while 72% accuracy was achieved using Canberra distance. 100%
accuracy was attained for nine types of leaves using Euclidean distance and for five
types of leaves using Canberra distance.

3 Proposed Methodology

Images of herbal plant species are taken with a mobile camera in daylight and resized
to 20%; preprocessing is then done, features of the leaf are extracted, and training and
testing are conducted to obtain maximum accuracy by classifying different plants on
the basis of their shape features (Fig. 1).

Fig. 1 Proposed methodology

3.1 Image Acquisition

The samples of leaves contained in the dataset were obtained from herbal plants
collected from different herbal and botanical gardens of Punjab. The herbal plant leaf
images were taken using a mobile camera in daylight and used for the experimental
purposes of training and testing to classify plants through their leaves. Table 1 presents
a sample of leaves from the plants, namely (1) Haar Shinghar, (2) Periwinkle, (3)
Currey Patta, (4) Tulsi.

3.2 Preprocessing

The RGB images of herbal plant leaves were taken on a plain white paper background
and resized for fast execution, low memory usage, and easy evaluation. Each RGB
image is converted into grayscale and then to binary for extracting the
features (Table 2).
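
A minimal OpenCV sketch of this preprocessing is shown below; Otsu thresholding is an assumption about how the binary image is obtained against the white-paper background.

import cv2

def preprocess(path):
    # Read the leaf photo and resize it to 20% of the original size.
    rgb = cv2.imread(path)                       # BGR image read from disk
    rgb = cv2.resize(rgb, None, fx=0.2, fy=0.2)  # 20% of the original size
    gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
    # The leaf is darker than the white paper, so invert the threshold to
    # make leaf pixels the foreground (255) in the binary image.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    return rgb, gray, binary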

Table 1 Sample of leaves from four different plants

Table 2 RGB image, grayscale image, and binary image of different leaves of herbal plants

Haar shinghar

Periwinkle

Currey patta

Tulsi

3.3 Feature Extraction

Feature extraction is the part that is used to recognize plants by extracting different
features from their leaves. Various morphological features are calculated, such as leaf
height and width, total area of the rectangle enclosing the leaf, total leaf area, perimeter of

Fig. 2 Features extracted from a herbal leaf: length, width, area of rectangle, area of leaf, leaf perimeter, leaf pixels of quadrants 1–4, and rectangle pixels of quadrants 1–4

Fig. 3 Sample output of leaf feature extraction

the leaf in pixels, the four quadrants of the rectangle in pixels, the four quadrants of the
leaf in pixels, and the leaf percentage in the rectangle. Figure 2 shows leaf samples with
their feature description (Fig. 3).
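
A sketch of how these 14 shape features could be computed from the binary leaf mask is given below; splitting the bounding rectangle at its midpoint into quadrants is an assumption about the quadrant definition.

import numpy as np
import cv2

def leaf_features(binary):
    # Bounding rectangle of the leaf pixels.
    ys, xs = np.nonzero(binary)
    x0, x1, y0, y1 = xs.min(), xs.max(), ys.min(), ys.max()
    width, height = int(x1 - x0 + 1), int(y1 - y0 + 1)
    rect_area = width * height                   # rectangle enclosing the leaf
    leaf_area = int(np.count_nonzero(binary))    # leaf area in pixels
    # Leaf perimeter from the largest external contour.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    perimeter = cv2.arcLength(max(contours, key=cv2.contourArea), True)
    # Split the bounding rectangle at its midpoint into four quadrants.
    crop = binary[y0:y1 + 1, x0:x1 + 1]
    qy, qx = height // 2, width // 2
    quads = [crop[:qy, qx:], crop[:qy, :qx], crop[qy:, :qx], crop[qy:, qx:]]
    leaf_quads = [int(np.count_nonzero(q)) for q in quads]  # leaf px/quadrant
    rect_quads = [q.size for q in quads]         # rectangle px per quadrant
    pct_in_rect = 100.0 * leaf_area / rect_area  # leaf percentage in rectangle
    return [width, height, rect_area, leaf_area, perimeter,
            pct_in_rect] + leaf_quads + rect_quads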

3.4 Classification

For classification, a random forest classifier is used: features extracted from leaves are
stored in the dataset and used for training and testing purposes. Training uses 85% of
the data and testing 15%; the random forest classifier model has been implemented
(taking n_estimators = 10, max_features = auto) for the classification of herbal plants.
The model has been trained to obtain higher accuracy using different features.
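
A minimal scikit-learn sketch of this training setup is given below; the feature matrix X and label vector y are assumed to come from the feature extraction step, and random_state is illustrative.

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# 85% of the data for training and 15% for testing, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=42)

# n_estimators=10 and max_features="auto" match the parameters in the text
# ("auto" was an alias for "sqrt"; newer scikit-learn versions use "sqrt").
rf = RandomForestClassifier(n_estimators=10, max_features="auto")
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)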

3.5 Performance Evaluation

After the classification of herbal plant species, performance is evaluated on the basis of
the results given by the random forest classifier. It reports how many plants belong to a
particular category and also generates a confusion matrix, an accuracy score, and a
classification report consisting of precision, recall, F 1 -score, and support values.
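
This evaluation can be produced with scikit-learn's standard metrics, continuing the sketch above:

from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix)

print(confusion_matrix(y_test, y_pred))
print("Accuracy:", accuracy_score(y_test, y_pred))
# Precision, recall, F1-score, and support for each plant class.
print(classification_report(
    y_test, y_pred,
    target_names=["Currey Patta", "Haar Shinghar", "Periwinkle", "Tulsi"]))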

4 Result and Discussion

4.1 Results using the random forest model on 50 and 75 leaves of the plants (0: Currey
Patta, 1: Haar Shinghar, 2: Periwinkle, 3: Tulsi) have been analyzed using various
features.
Case I: Using 4 features stated below (Fig. 4):
Case II: Using 5 features stated below (Fig. 5):
Case III: Using 6 features stated below (Fig. 6):
Case IV: Using 8 features stated below (Fig. 7):
Case V: Using 10 features stated below (Fig. 8):
Case VI: Using 14 features stated below (Fig. 9 and Table 3):


Fig. 4 a Results of all four plants using random forest model for 50 leaves. b Results of all four
plants using random forest model for 75 leaves


Fig. 5 a Results of all four plants using random forest model for 50 leaves. b Results of all four
plants using random forest model for 75 leaves


Fig. 6 a Results of all four plants using random forest model for 50 leaves. b Results of all four
plants using random forest model for 75 leaves


Fig. 7 a Results of all four plants using random forest model for 50 leaves. b Results of all four
plants using random forest model for 75 leaves

4.2 Discussion (Using Random Forest Model)

Plants leaves of Currey Patta, Haar Shinghar, Periwinkle, and Tulsi are taken into
consideration.


Fig. 8 a Results of all four plants using random forest model for 50 leaves. b Results of all four
plants using random forest model for 75 leaves


Fig. 9 a Results of all four plants using random forest model for 50 leaves. b Results of all four
plants using random forest model for 75 leaves

Table 3 Random forest model accuracy result of all four plants (Currey Patta, Haar Shinghar,
Periwinkle, and Tulsi) with variable features considering 50 and 75 leaves of each plant
Cases No. Number of features 50-Leaves accuracy 75-Leaves accuracy Result output for
(%) (%) references
I 4 86.2 84.4 Figure 4a, b
II 5 86.6 88.8 Figure 5a, b
III 6 86.2 88.3 Figure 6a, b
IV 8 93.1 86.67 Figure 7a, b
V 10 89.6 86.67 Figure 8a, b
VI 14 90.0 91.1 Figure 9a, b

Case I: For four features [width, height, area of rectangle (in pixels), and area of
leaf (in pixels)]: 50 leaves resulted in an accuracy of 86.2%, and 75 leaves in an
accuracy of 84.4%. With an increase in the number of leaves at the same feature level,
accuracy decreases, since the computational model needs to train and test on more
data.

Case II: For five features [width, height, perimeter, area of leaf (in pixels), and area
of rectangle enclosing leaf]: 50 leaves resulted in an accuracy of 86.6%, and 75 leaves
in an accuracy of 88.8%. Even with an increase in the number of leaves, accuracy
increases, since the computational model becomes familiar with previously derived
features of the dataset.
Case III: For six features [width, height, perimeter of leaf (in pixels), area of leaf (in
pixels), area of rectangle enclosing leaf, and the percentage of leaf in the rectangle]:
50 leaves resulted in an accuracy of 86.2%, and 75 leaves in an accuracy of 88.3%.
Even with the increase in the number of leaves, accuracy increases in comparison with
50 leaves at the six-feature level.
Case IV: For eight features [width, height, area of leaf (in pixels), percentage of leaf
in the rectangle, and leaf pixels in Quadrants (I, II, III, IV)]: 50 leaves resulted in an
accuracy of 93.1%, and 75 leaves in an accuracy of 86.67%. With the increase in the
number of leaves, accuracy decreases, since the computational model needs to train
and test a larger dataset. Secondly, the selection of features becomes important here:
only four features are common between the six- and eight-feature sets, so the
variability of these four new features gave the idea to explore more dimensions to
train and test our dataset.
Case V: For 10 features [width, height, perimeter of leaf (in pixels), area of leaf (in
pixels), area of rectangle enclosing leaf, percentage of leaf in the rectangle, and leaf
pixels in Quadrants (I, II, III, IV)]: 50 leaves resulted in an accuracy of 89.6%, and
75 leaves in an accuracy of 86.67%. With the increase in the number of leaves,
accuracy decreases, since the computational model needs to train and test a larger
dataset. Secondly, four additional features are added to the six-feature dataset, making
the model more complex.
Case VI: For 14 features [width, height, perimeter of leaf (in pixels), area of leaf (in
pixels), area of rectangle enclosing leaf, percentage of leaf in the rectangle, leaf pixels
in Quadrants (I, II, III, IV), and rectangle pixels in Quadrants (I, II, III, IV)]: 50 leaves
resulted in an accuracy of 90%, and 75 leaves in an accuracy of 91.1%. Finally, even
with an increase in the number of leaves, accuracy increases, since the computational
model becomes familiar with previously derived features of the dataset. Here, all
previous combinations of features are introduced, making it a complex model, but the
model has been trained enough to provide the desired results (Fig. 10).

5 Conclusion

5.1 Considering all four plants' data with 50 leaves each, and increasing the number
of features from 4 to 14, the model shows better accuracy as the number of features
increases. The best accuracy of the model is obtained at the 8-feature
parameter, i.e., 93.1%. As we increase the features further, a decrease and then again
an increase in accuracy is obtained due to the greater complexity of the model (Fig. 11).

Fig. 10 Analysis of accuracy (in %) of 50 and 75 leaves using the random forest model on an increasing number of features (4, 5, 6, 8, 10, 14): accuracy for 50 leaves 86.2, 86.6, 86.2, 93.1, 89.6, and 90.0%; accuracy for 75 leaves 84.4, 88.8, 88.3, 86.67, 86.67, and 91.1%

Fig. 11 Accuracy in percentage using RF model for variable features a 50 leaves b 75 leaves
5.2 Considering all four plants' data with 75 leaves each, and increasing the number
of features from 4 to 14, the accuracy of the model starts increasing with an increase
in the number of features. After a certain point, however, a decline in the performance
of the model is observed as the complexity of detecting features increases. After
sufficient training of the model, the best accuracy for 75 leaves, 91.1%, is obtained at
the maximum number of features, 14.

References

1. Kaur PP, Singh S, Pathak M (2020) Review of machine learning herbal plant recognition
system. In: 2nd Proceedings of international conference on intelligent communication and
computational research ICICCR, India
2. Pankaja K, Suma V (2020) Plant leaf recognition and classification based on the whale opti-
mization algorithm (WOA) and random forest (RF). J Instit Eng (India) Ser B. https://doi.org/
10.1007/s40031-020-00470-9

3. Singh S, Kaur PP (2019) A study of geometric features extraction from plant leafs. J Inf Comput
Sci 9(7):101–109
4. Singh S, Kaur PP (2019) Review on plant recognition system based on leaf features. J Emerg
Technol Innov Res (JETIR) 6(3):352–361
5. Ramesh S, Hebbar R, Niveditha M, Pooja R, Prasad Bhat N, Shashank N, Vinod PV (2018)
Plant disease detection using machine learning. In: IEEE International conference on design
innovations for 3Cs compute communicate control, pp 41–45
6. Vijayshree T, Gopal A (2018) Identification of herbal plant leaves using image processing
algorithm: review. Res J Pharm Biol Chem Sci 9(4):1221–1228
7. Dahigaonkar TD, Kalyana RT (2018) Identification of ayurvedic medicinal plants by image
processing of leaf samples. Int Res J Eng Technol (IRJET) 5(5):351–355
8. Kan HX, Jin L, Zhou FL (2017) Classification of medicinal plant leaf image based on multi-
feature extraction. Pattern Recognit Image Anal 27(3):581–587
9. De Luna RG, Baldovino RG, Cotoco EA, De Ocampo ALP, Valenzuela IC, Culaba AB,
Gokongwei EPD (2017) Identification of Philippine herbal medicine plant leaf using artificial
neural network. In: IEEE 9th International conference on humanoid, nanotechnology, infor-
mation technology, communication and control, environment and management (HNICEM)
10. Isnanto RR, Zahra AA, Julietta P (2016) Pattern recognition on herbs leaves using region-based
invariants feature extraction. In: 3rd International conference on information technology and
computer, electrical engineering (ICITACEE), (1), pp 455–459
EEG Based Emotion Classification Using
Xception Architecture

Arpan Phukan and Deepak Gupta

Abstract Electroencephalogram (EEG) is widely used in emotion recognition


which is achieved by recording the electrical activity of the brain. It is a complex
signal because of its high temporal resolution and thus requires sophisticated methods
and expertise to be interpreted with a reasonable degree of accuracy. Over the years,
significant strides have been made in the field of supervised and unsupervised feature
learning from data, using deep architectures. The purpose of this study is to build on
some of these improvements for better classification of emotions from EEG signals.
This is a rather challenging task, and more so if the data we are reliant on is noted for
being unsteady, as it changes from person to person. There is a need for an intricate
deep learning algorithm that can achieve high levels of abstraction and still deliver
robust, accurate results. In this paper, we have used the Xception (Chollet in 2017
IEEE Conference on computer vision and pattern recognition (CVPR), pp 1800–
1807, 2017 [1]) model from Keras API, further reinforced by fine tuning, to classify
emotions into three categories namely NEGATIVE, POSITIVE and NEUTRAL. An
open-source EEG dataset from Kaggle (Bird et al. in The international conference on
digital image and signal processing (DISP’19). Springer, Berlin, 2019 [2]) was used
in this study for the purpose of classification. Our experimental results achieved a
precision score of 98.34%, a recall value of 98.33%, and an F 1 -score of 98.336%. This
result outperforms many other popular models based upon support vector machine,
k-nearest neighbor, self-organizing maps, etc., whose accuracy usually ranges from
anywhere between 53 and 92%.

Keywords EEG · Transfer learning · Emotion classification · Xception

A. Phukan · D. Gupta (B)


Department of Computer Science and Engineering, National Institute of Technology, Jote,
Arunachal Pradesh, India
e-mail: deepak@nitap.ac.in


1 Introduction

Brain Computer Interface (BCI) bridges the gap between computer devices and
humans. It differs from neuromodulation as it facilitates bidirectional data flow. The
primary objective of the Brain Computer Interface (BCI) is to translate our intentions
into workable instructions for robots and other devices. Furthermore, it can also trans-
late our emotions into a machine-friendly language. Emotion is an important aspect
of human-to-human interaction, and imitating this relationship in human-to-machine
interactions has been the focus of research for decades. Emotion identification has
been garnering attention in the field of technology as developing empathic robots,
consumer analysis, safe driving [3], and numerous other concepts are being actively
sought after. Generally, emotions are categorized based on valence and arousal as
shown in Fig. 1. The dataset was labeled according to Negative, Neutral, and Positive
groups. Using transfer learning, we chose to classify the data into those three classes
(Table 1).
Researchers have implemented many other practices to classify emotion based
on EEG data as well as facial expressions and there are quite a few success stories
as well. Richhariya and Gupta [6] achieved a staggering 91.42% accuracy for facial
expression recognition with their proposed iterative Universum twin support vector
machine. Kwon et al. [7] have devised a 2D CNN model which takes EEG spec-
trogram and GSR features, pre-processed using Short Time Zero Crossing Rate, to
classify emotion into four classes, and achieved an arousal accuracy of 76.56% and
Valence accuracy of 80.46% with an overall accuracy of 73.43%. Li et al. [8] used
multisource transfer learning to facilitate cross-subject emotion recognition based on
EEG data and achieved the highest accuracy of 91.31%. They achieved this by first
selecting appropriate sources before the transfer learning, calibrating a style transfer
mapping, calculating the nearest prototype as well as the Gaussian model before
finally executing the transfer learning with the multisource style transfer mapping.
A deep belief network was proposed by Zheng et al. [9] in 2014 to classify
emotions, and they managed to achieve an accuracy of 87.6%. Features were
extracted from the EEG data by employing the differential entropy feature extraction

Fig. 1 Negative, positive, and neutral classes in an arousal and valence diagram [4]

Table 1 Emotions with their valence label [5]
Sl. No. Emotion Valence
1 Shame Negative
2 Humiliation Negative
3 Contempt Negative
4 Disgust Negative
5 Fear Negative
6 Terror Negative
7 Enjoyment Positive
8 Joy Positive
9 Distress Negative
10 Anguish Negative
11 Surprise Negative (lack of dopamine)
12 Anger Negative
13 Rage Negative
14 Interest Positive
15 Excitement Positive

method or more specifically a Short Time Fourier Transform which had 512 points
and a nonoverlapping Hanning window of 1 s. They reasoned that since their data
has higher low-frequency values over high-frequency values, differential entropy
feature extraction should be able to accurately identify EEG patterns between low-
as well as high-frequency energy. The extracted features were then fed to their deep
belief network comprising of two hidden layers. Their proposed architecture for the
unsupervised learning stage was 310-100-30 while the supervised learning stage had
a structure of 310-100-30-2. Table 2 shows a list of existing models, the features they
used, and the accuracy they achieved.

2 Related Work

2.1 Xception Model

The very first convolutional neural network design was basically a stack of convolu-
tions meant for feature extraction from the input images and some max pooling oper-
ations for subsampling. This design structure was the LeNet style model [20] which
was revamped into the AlexNet architecture [21] in 2012. The AlexNet architec-
ture sported multiple convolution operations in between the max pooling operations
which facilitated the architecture to learn a diverse and abstract set of features at every
spatial scale. The model proposed by Zeiler and Fergus in 2013 [22] and the VGG
architecture presented by Simonyan and Zisserman [23] in 2014 were improvements

Table 2 List of existing models along with their accuracy


Sl. No. Author Features Model Accuracy (%)
1 Kwon et al. [7] CNN and GSR 2D-convolutional 73.43
features neural network
2 Li et al. [8] Style transfer mapping Style transfer mapping 91.31
3 Zheng et al. [9] Short time Fourier DBN-HMM 87.6
transform
4 Murugappan et al. Discrete wavelet Linear discriminant 83.26
[10] transform analysis and K-nearest
neighbor
5 Jirayucharoensak Fast Fourier transform Deep learning network 53.42
et al. [4] with principal
component based
covariate shift
adaptation
6 Hosseinifard et al. EEG band power, Linear discriminate 90
[11] detrended fluctuation analysis, logistic
analysis, Higuchi, regression, K-nearest
correlation dimension, neighbor
Lyapunov exponent
7 Hatamikia and Approximate entropy, Self-organization map 55.15
Nasrabadi [12] spectral entropy,
Katz’s fractal
dimension and
Petrosian’s fractal
dimension
8 Zubair and Yoon Statistical-based Support vector machine 49.7
[13] features, [14, 15]
wavelet-based features
9 Jadhav et al. [16] Gray-level K-nearest neighbor 79.58
co-occurrence matrix
10 Martínez-Rodrigo Quadratic sample Support vector machine 72.50
et al. [17] entropy
11 Lee et al. [18] Statistical Convolutional neural 82.1
photoplethysmogram network
features
12 Bird et al. [19] Log-covariance Random forest 87.16
features, Shannon
entropy, and
log-energy entropy,
frequency domain,
accumulative features
as energy model,
statistical features,
max, min, and
derivatives

13 Richhariya and Principal Component Iterative universum 91.4
[6] Analysis, Linear twin support vector
discriminant analysis, machine
independent
component analysis,
local binary pattern,
wavelet transform,
phase congruency

Fig. 2 Transfer learning model summary

to this design, achieved by increasing the depth or layers of the network. In the same
year, Szegedy et al. [24] proposed a novel network dubbed GoogLeNet (Inception
V1). It was based on the Inception architecture and was later refined into Inception V2
[25], followed by Inception V3 [26] and Inception-ResNet [27]. The objective of the
Inception module is to make the process of convolution, i.e., the process where a
single convolution kernel maps cross-channel as well as spatial correlations simulta-
neously, easier and more efficient. This was achieved by explicitly separating the
process of mapping correlations into a series of operations that map the spatial
and cross-channel correlations independently.
In his paper, Chollet [1] defined Xception as "Extreme Inception". It is a
convolutional neural network designed on depthwise separable convolutional layers.
It completely decouples the mapping of spatial and cross-channel correlations in the
feature maps. This structure has 36 convolutional layers which extract the features of
the input image for the network. These layers are segregated into 14 modules where
every module, other than the first and last, is surrounded by linear residual
connections.
In this paper, we have made use of the feature extraction base of Xception and
defined our output layer with the three output classes, namely POSITIVE, NEGA-
TIVE and NEUTRAL. The activation function of choice was SoftMax. The total
parameter, trainable parameter, and non-trainable parameter of the transfer learning
model were 20,867,627, 20,813,099, and 54,528 respectively. Refer to Fig. 2.
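
A minimal sketch, assuming TensorFlow/Keras, of how such a transfer-learning head can be defined is shown below; the global-average-pooling layer is an assumption about how the base's feature maps are reduced before the dense output.

import tensorflow as tf
from tensorflow.keras import layers, models

# Xception feature-extraction base pre-trained on ImageNet, without its
# original 1000-class ImageNet classification head.
base = tf.keras.applications.Xception(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))

# New output layer for the three emotion classes.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(3, activation="softmax"),  # POSITIVE, NEGATIVE, NEUTRAL
])
model.summary()  # prints total/trainable/non-trainable parameter counts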

3 Materials and Methods

The subsequent sections explain the implementation or methodology and the dataset
used.

3.1 Proposed Methodology

Figure 3 shows the flow chart of this study’s proposed methodology. The first step
that had to be undertaken for this study was to download an EEG dataset. Our
requirements for one were that it should be easily accessible, free for all, and without
any additional terms be it legal or otherwise. We came across many free datasets
like DEAP [28], SEED [29, 30], Enterface’06 [31], BCI-NER Challenge [32], c-
VEP BCI [33], SSVEP [34, 35], SPIS Resting State Dataset [36], EEG-IO [37] and
TUH EEG Resources [38] among many others. However, even if they were free,
they weren't exactly available to all: they required a signed EULA among many other
preconditions to be satisfied before one could download the dataset. Thus,
our final candidate for the dataset was the open-source dataset [2] by Bird. This is
a multi-class dataset that had discrete values for the readings. We then used it to
generate 2132 python specgram images. A sample of these images can be found in
Fig. 5. These generated specgram images are what we used for classification purposes
by the Xception model. The Xception model, however, only accepted images that met
certain conditions. We proceeded with the necessary prepossessing of the specgram
images, i.e., scaling and resizing to bring them within the bounds of the accepted
format. We then input these pre-processed images into the Xception model. At this
stage, trying to classify 2132 images on a mere 12 GB of RAM was guaranteed
to trigger an out-of-memory error on a free Google colab session. Thus, we had to
reduce the number of inputs, i.e., the number of images to be used for classification
down to 1500.
The next step involved putting the Xception model to work to generate our classi-
fier. Since our objective was to capitalize on an already optimized and robust neural
network, we decided to freeze all the weights of our Xception model and train it
over our trimmed dataset of 1500 specgram images. Although this was not an ideal
solution, this allowed us to circumvent the insufficient memory error of the Google
colab session. After successfully training our model, we fine-tuned it further by re-
training the 120th and successive layers. This implies leaving the weights of the
layers between 1 and 119 untouched while the weights of all the layers after the
119th layer were updated. Finally, we plotted our graphs and the confusion matrix
of our observed result as shown in Figs. 6, 7, and 8 respectively.

3.2 Dataset

There are two methods to acquire EEG data from a person. The first is the subdural
method where the electrodes are placed under the skull, on or within the brain itself.
On the other hand, forgoing this invasive method, there is also the option of a non-
invasive technique where electrodes (either wet or dry) are placed around the cranium
and raw EEG data, which is measured in microvolts (µV) are recorded from time t
to t + n.

Fig. 3 Flow chart of the proposed method

Fig. 4 The sensors AF7, AF8, TP10, and TP9 of the Muse headband. The NZ placement (green) was used to calibrate the sensors as a reference point [19]

The dataset used in this study was recorded by four dry extracranial electrodes via
a MUSE EEG headband which employs a non-invasive technique for acquiring data.
Micro voltage recordings were taken from the AF7, AF8, TP10, and TP9 electrodes
as illustrated in Fig. 4. Six film clips were shown to a male and a female subject,
and the data was recorded for 60 s each, generating 12 min of EEG data, which
translates to 6 min of data for each emotional disposition. Neutral EEG brainwave
data were also recorded for 6 min to add valence to the data. This practice of
using the neutral class for classification was found to have significantly reinforced
the learning rate for a chatbot [5].
The 36 min of brainwave data thus recorded had its variable frequency resampled
to 150 Hz, which generated a dataset comprising 324,000 data points. Emotions from
the list shown in Table 1 were evoked strictly by the stimuli. To avoid
modification of the neutral data by emotions the participants might have felt a priori, it
was collected first and without any stimuli. This ensured that the resting emotional
state data of the subjects were unaltered. To curtail the interference of the resting
state, only 3 min of data was collected each day (Table 3).
In order to keep the electromyographic (EMG) waves, which are more prominent
than brain waves, from modifying the EEG data, the participants were forbidden
from making any conscious movements like drinking or eating while watching the
video clips. Ding and Marchionini [39] proposed that blinking patterns can prove
to be constructive for classifying mental dispositions, which is why there were no
restrictions placed on the subjects over their unconscious movements.

Table 3 Details of the video clips used as stimuli for the EEG data recording [5]
Stimulus Details Studio Year Valence
Marley and Me Death scene Twentieth Century Fox, etc. 2008 Negative
Up Opening death scene Walt Disney Pictures, etc. 2009 Negative
My Girl Funeral scene Imagine Entertainment, etc. 1991 Negative
La La Land Opening musical number Summit Entertainment, etc. 2016 Positive
Slow Life Nature time lapse BioQuest Studios 2014 Positive
Funny Dogs Funny dog clips Mashup Zone 2015 Positive

Fig. 5 Four different samples of the spectrograms generated

A combination of Log-covariance features, Shannon entropy and log-energy


entropy, Frequency domain, Accumulative features as energy model, Statistical
Features, Max, Min, and Derivatives were taken into account to generate a dataset
of 2548 source attributes [2, 19].

3.3 Implementation

Using the specgram function from the matplotlib.pyplot library in python, we


extracted the spectrograms of the 2132 instances of the dataset. Figure 5 shows
a sample of the spectrograms which were generated and then fed to the Xception
architecture for feature extraction and classification. Those images were then resized
to 299 × 299 which is the recommended size for inputs in the Xception model. They
were then scaled to the range of 0.0–1.0. However, due to hardware constraints, we
could not use the whole set of 2132 spectrograms. Instead, we chose the first 1500
spectrograms as our image dataset for this study.
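
A sketch of the spectrogram-generation step is shown below, assuming each dataset row is one windowed EEG signal; the figure size and file naming are illustrative.

import numpy as np
import matplotlib.pyplot as plt

def save_spectrogram(signal, out_path, fs=150):
    # fs=150 matches the 150 Hz resampling rate reported for the dataset.
    fig, ax = plt.subplots(figsize=(3, 3))
    ax.specgram(signal, Fs=fs)   # matplotlib's specgram, as described above
    ax.axis("off")               # keep only the image content
    fig.savefig(out_path, bbox_inches="tight", pad_inches=0)
    plt.close(fig)

# Hypothetical usage: one spectrogram image per dataset row.
# for i, row in enumerate(eeg_rows):
#     save_spectrogram(np.asarray(row, dtype=float), f"spec_{i}.png")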
We compiled our Xception model with the Sparse Categorical Cross entropy being
the loss function of choice, Adam optimizer, and the accuracy metrics. It was then
executed for 100 epochs with the batch size set to 16 because any batch size greater
than 16 crashed our Google colab session by using up all of the available RAM. The
validation split which was set to 0.1, was scrutinized for judging the accuracy of the
model in this study.
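
Continuing the earlier model-definition sketch, the compilation and initial training stage might look as follows; x and y are placeholders for the pre-processed spectrogram images and their integer labels.

# Freeze all base weights so only the new softmax head is trained first.
base.trainable = False

model.compile(
    loss="sparse_categorical_crossentropy",  # integer class labels
    optimizer="adam",
    metrics=["accuracy"])

# x: (1500, 299, 299, 3) images scaled to [0, 1]; y: integer labels 0-2.
history = model.fit(x, y, epochs=100, batch_size=16, validation_split=0.1)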

4 Results

The model, without any further fine tuning, gave a test accuracy of 97.66%, refer to
Fig. 6 for the accuracy versus epochs graph. To fine-tune this model, we trained the
weights of the Xception model layers from index 120 onwards, all the while keeping
the weights of the preceding layers fixed.
We compiled this model without changing any of the other parameters, i.e., the
loss function, optimizer, and metrics were still the Sparse Categorical Cross entropy,
Adam, and accuracy respectively. Furthermore, the epochs for the purpose of fine

Fig. 6 Accuracy versus epochs (before fine tuning)

tuning were set to 100, the validation split was set to 0.1 and the batch size was kept
at 16 due to hardware constraints. This fine-tuned model gave a weighted average
precision score of 98.34%, a weighted average recall value of 98.33%, and a weighted
average F 1 score of 98.336%. Refer to Fig. 7 for the accuracy versus epochs graph
after fine tuning the Xception model while Fig. 8 visualizes the confusion matrix.
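
A sketch of this fine-tuning step, again continuing the earlier sketch: layers up to index 119 of the Xception base stay frozen while layer 120 onwards are re-trained.

# Unfreeze the base, then re-freeze only layers 0..119.
base.trainable = True
for layer in base.layers[:120]:
    layer.trainable = False

# Re-compile with the same settings and continue training.
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"])
model.fit(x, y, epochs=100, batch_size=16, validation_split=0.1)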
One can easily verify from Table 2 that our model outperforms all the models
mentioned in it. However, it still leaves much to be desired. A 98% accuracy, while it
looks good on paper, is not accurate enough for the real world. A 2% error rate in the
real world, where there might be thousands of samples to classify, some of which
might even be for medical purposes, is a glaring problem. It is serious enough that
most might not consider it trustworthy enough to completely switch from human
experts to machine learning alternatives. Thus, further research and optimization are
needed to successfully bridge this gap.

Fig. 7 Accuracy versus epochs (after fine tuning)

Fig. 8 Confusion matrix

5 Conclusion

This study highlighted the application of transfer learning of robust models on EEG
data. These architectures were trained on the ImageNet dataset or other significantly
larger image datasets, one of which is home to 350 million images and 17,000 class
labels thereby guaranteeing higher abstraction and increased accuracy. The Xception
model considered for this study gave impressive results with minimal effort. The
caveat here is that the windowed EEG data were collected from four points on the
scalp, namely AF7, AF8, TP9, and TP10, by a low-resolution, commercially
available EEG headband. The high-precision results prove that architectures like
Xception, VGG, ResNet, DenseNet, etc. have astounding potential in the field of
BCI applications. Further studies may include an ensemble of transfer learning as
well as other classification algorithms like Support Vector Machine, Random Forest,
Deep Belief Network, K-Nearest Neighbor, etc. over more sophisticated datasets like
DEAP, SEED, Enterface’06, etc. to gauge performance improvements with respect
to existing models.

References

1. Chollet F (2017) Xception: deep learning with depthwise separable convolutions. In: 2017
IEEE Conference on computer vision and pattern recognition (CVPR), pp 1800–1807
2. Bird JJ, Ekart A, Buckingham CD, Faria DR (2019) Mental emotional sentiment classification
with an EEG-based brain-machine interface. In: The international conference on digital image
and signal processing (DISP’19). Springer, Berlin
3. Phukan A et al (2020) Information encoding, gap detection and analysis from 2D LiDAR
data on android environment. In: Advanced computing and intelligent engineering. Springer,
Singapore, pp 525–536
4. Jirayucharoensak S, Pan-Ngum S, Israsena P (2014) EEG-Based emotion recognition using
deep learning network with principal component based covariate shift adaptation. Sci World J
2014:10, Article ID 627892. https://doi.org/10.1155/2014/627892
5. Bird JJ, Ekárt A, Faria DR (2018) Learning from interaction: an intelligent networked-based
human-bot and bot-bot chatbot system. In: UK Workshop on computational intelligence.
Springer, Berlin, pp 179–190
6. Richhariya B, Gupta D (2019) Facial expression recognition using iterative universum twin
support vector machine. Appl Soft Comput 76:53–67
7. Kwon YH, Shin SB, Kim SD (2018) Electroencephalography based fusion two-
dimensional (2D)-convolution neural networks (CNN) model for emotion recognition system.
Sensors (Basel) 18(5):1383. https://doi.org/10.3390/s18051383. PMID: 29710869; PMCID:
PMC5982398
8. Li J, Qiu S, Shen Y, Liu C, He H (2020) Multisource transfer learning for cross-subject EEG
emotion recognition. IEEE Trans Cybern 50(7):3281–3293. https://doi.org/10.1109/TCYB.
2019.2904052
9. Zheng W-L, Zhu J-Y, Peng Y, Lu B-L (2014) EEG-Based emotion classification using deep
belief networks. In: Proceedings—IEEE International conference on multimedia and expo.
https://doi.org/10.1109/ICME.2014.6890166
10. Murugappan M, Ramachandran N, Sazali Y (2010) Classification of human emotion from EEG
using discrete wavelet transform. J Biomed Sci Eng 3:390–396. https://doi.org/10.4236/jbise.
2010.34054
11. Hosseinifard B, Moradi MH, Rostami R (2013) Classifying depression patients and normal
subjects using machine learning techniques and nonlinear features from EEG signal. Comput
Methods Programs Biomed 109(3):339–345. https://doi.org/10.1016/j.cmpb.2012.10.008
12. Hatamikia S, Nasrabadi AM (2014) Recognition of emotional states induced by music videos
based on nonlinear feature extraction and some classification. In: Proceedings of the IEEE 21st
Iranian conference on biomedical engineering (ICBME), Tehran, Iran, 26–28 Nov 2014, pp
333–337
13. Zubair M, Yoon C (2018) EEG Based classification of human emotions using discrete wavelet
transform. In: Kim K, Kim H, Baek N (eds) IT Convergence and security 2017. Lecture notes in
electrical engineering, vol 450. Springer, Singapore. https://doi.org/10.1007/978-981-10-6454-8_3
14. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
15. Hazarika BB, Gupta D (2020) Density-weighted support vector machines for binary class
imbalance learning. Neural Comput Appl 1–19
16. Jadhav N, Manthalkar R, Joshi Y (2017) Electroencephalography-based emotion recognition
using gray-level co-occurrence matrix features. In: Raman B, Kumar S, Roy P, Sen D (eds)
Proceedings of international conference on computer vision and image processing. Advances
in intelligent systems and computing, vol 459. Springer, Singapore. https://doi.org/10.1007/
978-981-10-2104-6_30
17. Martínez-Rodrigo A, García-Martínez B, Alcaraz R, Fernández-Caballero A, González P
(2017) Study of electroencephalographic signal regularity for automatic emotion recognition.
In: Ochoa S, Singh P, Bravo J (eds) Ubiquitous computing and ambient intelligence, UCAmI
2017. Lecture notes in computer science, vol 10586. Springer, Cham. https://doi.org/10.1007/
978-3-319-67585-5_74
18. Lee M, Lee YK, Lim M-T, Kang T-K (2020) Emotion recognition using convolutional neural
network with selected statistical photoplethysmogram features. Appl Sci 10:3501
19. Bird JJ, Manso LJ, Ribiero EP, Ekart A, Faria DR (2018) A study on mental state classifica-
tion using EEG-based brain-machine interface. In: 9th International conference on intelligent
systems. IEEE
20. LeCun Y, Jackel L, Bottou L, Cortes C, Denker JS, Drucker H, Guyon I, Muller U, Sackinger
E, Simard S et al (1995) Learning algorithms for classification: a comparison on handwritten
digit recognition. In: Neural networks: the statistical mechanics perspective, pp 261–276
21. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional
neural networks. In: Advances in neural information processing systems, pp 1097–1105
22. Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. In:
Computer vision-ECCV 2014. Springer, Berlin, pp 818–833
23. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image
recognition. arXiv preprint arXiv:1409.1556
24. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich
A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer
vision and pattern recognition, pp 1–9
25. Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing
internal covariate shift. In: Proceedings of the 32nd international conference on machine
learning, pp 448–456
26. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2015) Rethinking the inception
architecture for computer vision. arXiv preprint arXiv:1512.00567
27. Szegedy C, Ioffe S, Vanhoucke V (2016) Inception-v4, inception-ResNet and the impact of
residual connections on learning. arXiv preprint arXiv:1602.07261
28. Koelstra S, Muhl C, Soleymani M, Lee JS, Yazdani A, Ebrahimi T et al (2011). DEAP: a
database for emotion analysis; using physiological signals. IEEE Trans Affect Comput 3(1):18–
31
29. Duan R-N, Zhu J-Y, Lu B-L (2013) Differential entropy feature for EEG-based emotion clas-
sification. In: 2013 6th International IEEE/EMBS conference on neural engineering (NER).
IEEE
30. Zheng W-L, Lu B-L (2015) Investigating critical frequency bands and channels for EEG-based
emotion recognition with deep neural networks. IEEE Trans Auton Mental Dev 7(3):162–175
31. Martin O et al (2006) The eNTERFACE’05 audio-visual emotion database. In: 22nd
International conference on data engineering workshops (ICDEW’06). IEEE
32. Margaux P et al (2012) Objective and subjective evaluation of online error correction during
P300-based spelling. Adv Hum Comput Interact
33. Spüler M, Rosenstiel W, Bogdan M (2012) Online adaptation of a c-VEP brain-computer
interface (BCI) based on error-related potentials and unsupervised learning. PloS One
7(12):e51077
34. Fernandez-Fraga SM et al (2018) Feature extraction of EEG signal upon BCI systems based
on steady-state visual evoked potentials using the ant colony optimization algorithm. Discrete
Dyn Nat Soc
35. Fernandez-Fraga SM et al (2018) Screen task experiments for EEG signals based on SSVEP
brain computer interface. Int J Adv Res 6(2):1718–1732
36. Torkamani-Azar M et al (2020) Prediction of reaction time and vigilance variability from
spatio-spectral features of resting-state EEG in a long sustained attention task. IEEE J Biomed
Health Inf
37. Agarwal M, Sivakumar R (2019) Blink: a fully automated unsupervised algorithm for eye-
blink detection in EEG signals. In: 2019 57th Annual Allerton conference on communication,
control, and computing (Allerton). IEEE
38. Harati A et al (2014) The TUH EEG CORPUS: a big data resource for automated EEG inter-
pretation. In: 2014 IEEE Signal processing in medicine and biology symposium (SPMB).
IEEE
108 A. Phukan and D. Gupta

39. Ding W, Marchionini G (1997) A study on video browsing strategies. Technical report,
University of Maryland at College Park
Automated Discovery and Patient
Monitoring of nCOVID-19:
A Multicentric In Silico Rapid
Prototyping Approach

Sharduli, Amit Batra, and Kulvinder Singh

Abstract Deep learning (DL) has played a vital role in the analysis of several viruses in the area of medical imaging in recent years. The Coronavirus disease stunned the whole world with its quick growth and has left a strong influence on the lives of myriad people across the globe. The novel Coronavirus disease (nCOVID-19) is one of the most ruinous viruses, as conveyed by the World Health Organization; it has affected the lives of millions of people and resulted in casualties of more than 1,224,563 people as of November 04, 2020. The objective of this manuscript is to automatically discover nCOVID-19 pneumonia sufferers by employing X-ray images, while maximizing detection accuracy by utilizing deep learning methods and image pre-processing. Deep learning algorithms have been written to examine the X-rays of individuals to fight this disease. This would be highly beneficial in this pandemic, as manual testing takes a huge amount of time while the count of those infected keeps rising. Several metrics have been used in this paper to measure performance, i.e., accuracy, sensitivity, specificity and precision. The high accuracy of this computer-assisted analysis mechanism can remarkably enhance the accuracy and speed of identifying nCOVID-19 patients in the initial stages. This research on the automated discovery of the nCOVID-19 virus has employed datasets from some of the widely used public repositories like UCI, Kaggle and GitHub. Likewise, successful efforts are made toward the automated discovery and monitoring of nCOVID-19 by using a rapid prototyping approach for timely and authentic decisions by implementing machine learning.

Sharduli
Department of Biotechnology, University Institute of Engineering and Technology, Kurukshetra
University, Kurukshetra, Haryana, India
A. Batra (B)
Department of CSE, University Institute of Engineering and Technology, Kurukshetra University,
Kurukshetra, Haryana, India
K. Singh
Faculty of CSE, Department of CSE, University Institute of Engineering and Technology,
Kurukshetra University, Kurukshetra, Haryana, India
e-mail: ksingh2015@kuk.ac.in


Keywords Artificial Intelligence · Convolutional Neural Networks · Coronavirus · Deep Learning · Dropout · Machine Learning · nCOVID-19 · Pandemic

1 Introduction

The novel Coronavirus results in pneumonia, a contagious disease that has an intensified effect on the lungs of a human. Thoracic CT and X-ray images have been found to be an effective tool in the discovery of nCOVID-19.
Although several clinical courses of action and treatments already exist, Artificial Intelligence (AI) offers an innovative model for healthcare, and various diverse AI techniques based on machine learning (ML) algorithms are utilized for examining datasets and helping make decisions quickly. Deep learning and machine learning approaches have become accepted practice in the application of AI to extract, examine, study and identify graphical patterns from image datasets.
Additionally, deep learning is a technique in which convolutional neural networks are employed for automated extraction of graphical features, obtained by the operation termed convolution; different layers process the data nonlinearly [1]. Advancements in deep learning algorithms have surged in the recent few years. Deep learning comprises machine learning techniques majorly centered on automated extraction of graphical features and categorization of medical images.
on automated extraction of graphical features and categorization of medical images.
It has been seen in March 2020 that there has been a rise in accessible X-rays from fit
and well people, along-with from sufferers who are having nCOVID-19. A dataset
consisting of 180 nCOVID-19, 1400 pneumonia and 1398 normal thoracic images has
been used for in silico investigations. Another image dataset containing 2500 images
has been used as a training set of each class for validating and training using deep
learning and convolutional neural networks (CNNs). Every layer comprises infor-
mation for transformation into a more conceptual level. As we go deeper and deeper
into the network, the more intricate features are obtained. Because of the small size
of the samples pertaining to nCOVID-19 (224 images), transfer learning technique
can be employed to train the deep CNNs. Currently, real-time reverse transcription-polymerase chain reaction (RT-PCR) is the test being employed by medical professionals to detect nCOVID-19. This method of testing has a comparatively low positive rate in the early phase of the disease. As a matter of fact, medical clinicians felt the necessity of an alternate technique for the diagnosis of nCOVID-19. The look and aspect of chest X-ray images in nCOVID-19 differs from any other class of pneumonic virus. As a result, the objective put forward in this paper is to discover nCOVID-19 employing chest X-ray images. The automated and early diagnosis of nCOVID-19 may be very advantageous and helpful for the world: prompt recommendation of the sufferer to isolate, frequent intubation of severe instances in special hospitals, and keeping an eye on the spread of nCOVID-19. In transfer learning, holding the information or features already extracted is the clue to carrying out other jobs.

2 Related Works

Although doctors and radiologists are able to discover nCOVID-19 patients based on chest X-ray inspection, their work is manual and takes a long time, particularly when the number of cases becomes very large. Choi et al. [2] collected the readings of four United States and three Chinese radiologists to recognize and ascertain nCOVID-19, in contrast to other pneumonia, based on chest CT pictures collected from a group of 425 instances, where 219 instances from China were positive for nCOVID-19 while 206 instances from the US had a non-nCOVID-19 virus. A novel deep learning technique, termed the Coronavirus discovery network (COVNet), was recently introduced by Li et al. [3] to discover nCOVID-19 based on thoracic CT pictures. Three types of CT pictures, comprising nCOVID-19, community acquired pneumonia (CAP), and non-pneumonia images, are employed to determine the effectiveness of the suggested technique, which is depicted in Fig. 1.
COVNet is described as a convolutional ResNet-50 [4] technique to which a progression of CT images is fed as input and which, in turn, estimates the category labels of the CT images as the output. One more deep learning technique, based on the concatenation of a 3-dimensional CNN ResNet-18 network and a location attention approach, is suggested by Xu et al. [5] to discover nCOVID-19 cases employing pulmonary CT images. In addition to the aforementioned research, plenty of manuscripts have been observed utilizing DL algorithms for nCOVID-19 examination employing medical pictures. Some of the important work related to rapid detection of nCOVID-19 is depicted in Table 1 for contrast and differentiation.

Fig. 1 Architecture of COVID-19 detection neural network model (COVNet)



Table 1 Outline and review of deep learning techniques for nCOVID-19 identification employing medical images
Dataset | AI techniques | Outcomes
CT medical images of 1136 training cases, with 723 cases found positive for COVID-19, from five hospitals [6] | Amalgamation of three-dimensional UNet++ [7] and ResNet-50 [4] | Specificity value of 0.922 and sensitivity value of 0.974
1,065 CT images comprising 325 Coronavirus and 740 viral pneumonia [8] | Change-initiated transfer learning approach | Specificity value of 0.83 with accuracy 79.3% and sensitivity of 0.67
CT images acquired from 157 cases (United States and China) [9] | ResNet-50 | 0.996 AUC value
618 CT images: 219 from 110 COVID-19 people, 224 CT images from 224 people having influenza-A viral pneumonia, and 175 CT images from well persons [5] | Location attention and ResNet-18 [4] | 86.7% accuracy value

On the other side, a prototype of an Artificial Intelligence technique, called α-Satellite, is suggested by Ye et al. [10] to estimate the contagious chance of a particular geographical region for risk assessment, to help combat nCOVID-19 at community levels. The data obtained from social networks for a given region may be finite and restricted, so they are enhanced and improved by conditional generative adversarial networks [11] to acquire knowledge about the public realization of nCOVID-19. Chang et al. [12] adapt a discrete-time technique, termed ACEMod, earlier employed for influenza pandemic simulation [13], for identifying and recognizing the patterns of the nCOVID-19 pandemic across Australia over a period of time. In other research, Allam and Jones [14] proposed the utilization of AI and
information spreading for good comprehension and maintaining the health of the
people throughout the period of nCOVID-19 pandemic. A composite AI technique
for the early prediction of speed of spread of nCOVID-19 is suggested by Du et al. [15]
which integrates the pandemic susceptible infection (SI) technique, natural language
processing (NLP), and deep learning methods. In another research, Lopez et al. [16]
suggested utilizing the network analysis methods along with NLP and mining of
the text to examine multilingual Twitter data to comprehend and interpret guidelines
and general feedback to the nCOVID-19 disease over a period of time. After going through some of the important work on the early detection of nCOVID-19, the proposed algorithm is introduced for the rapid discovery and monitoring of nCOVID-19.

3 Proposed Algorithm and Methodology

The succeeding terms are used consistently, so that the reader becomes acquainted with the terminology. The term “parameter” is used for a changeable value that is automatically learned during the training procedure. The term “weight” is usually utilized interchangeably with “parameter”; but an attempt has been made to use this term only when referring to parameters outside the convolution layers, i.e., other than kernels, for instance in fully connected layers. The term “kernel” is used for the sets of learnable parameters applied in the convolution process. The term “hyperparameter” is used for a variable that is required to be set well before the training begins. A CNN is a kind of deep learning technique for processing data that has a grid structure, for instance pictures; it is motivated by the organization of the animal visual cortex and developed to automatically learn spatial hierarchies of features. Pertaining to the categorization process of the CNNs, the following are some of the basic counts: (a) correctly recognized diseased cases (True Positives, abbreviated TP), (b) incorrectly categorized viral cases (False Negatives, FN), (c) correctly recognized healthy instances (True Negatives, TN), and (d) incorrectly categorized healthy instances (False Positives, FP). It can be observed that TP pertains to the correctly estimated nCOVID-19 instances, while FP pertains to normal or pneumonia instances that are categorized as nCOVID-19 by the CNNs. TN pertains to normal or pneumonia instances that are categorized as non-nCOVID-19 instances, whereas FN pertains to Coronavirus instances categorized as usual pneumonia instances. Based on these counts, an attempt is made to determine the sensitivity, specificity, and accuracy of the technique employed.

Sensitivity = TP/(TP + FN) (1)

Specificity = TN/(TN + FP) (2)

Accuracy = (TP + TN)/(TP + TN + FP + FN) (3)
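As a small illustration (ours, not the manuscript's), Eqs. (1)–(3) can be evaluated in Python from hypothetical confusion-matrix counts:

# Minimal sketch: evaluating Eqs. (1)-(3); the counts are illustrative only
TP, FN, TN, FP = 70, 2, 68, 2
sensitivity = TP / (TP + FN)                  # Eq. (1)
specificity = TN / (TN + FP)                  # Eq. (2)
accuracy = (TP + TN) / (TP + TN + FP + FN)    # Eq. (3)
print(sensitivity, specificity, accuracy)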

Convolution is a particular category of linear operation employed for feature extraction, in which a tiny matrix of numbers, termed a kernel, is applied across the input, which is an array of numbers termed a tensor. Two major hyperparameters that characterize the convolution operation are the count and the size of the kernels. Recent CNNs commonly utilize zero padding to keep the in-plane dimensions constant so as to put in additional layers; without padding, every consecutive feature map would become compact and smaller after the convolution operation. The span between two consecutive kernel locations is termed a stride, which likewise characterizes the convolution operation. A usual option for the stride is 1; nevertheless, a stride greater than 1 is occasionally employed so as to achieve downsampling of the feature maps. The parameters, such as kernels, are automatically learned during the training task in the convolution layer, while the count of kernels, the size of the kernels, padding, and stride are hyperparameters that need to be fixed prior to the beginning of training, as mentioned in Table 2.

Table 2 Convolutional neural network layers with their hyperparameters and parameters
Layer type | Hyperparameters | Parameters
Convolution layer | Number of kernels, kernel size, padding, stride, activation function | Kernels
Pooling layer | Filter size, stride, pooling technique, padding | None
Entirely connected layer | Activation function, number of weights | Weights
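For concreteness (our addition, not the authors'), the in-plane output size of a convolution follows from the input size n, kernel size k, padding p and stride s as o = ⌊(n + 2p − k)/s⌋ + 1, as the sketch below evaluates:

# Illustrative sketch of the feature-map size rule discussed above
def conv_out_size(n, k, p, s):
    # o = floor((n + 2p - k) / s) + 1
    return (n + 2 * p - k) // s + 1

print(conv_out_size(224, k=3, p=1, s=1))  # 224: zero padding preserves the size
print(conv_out_size(224, k=3, p=0, s=1))  # 222: no padding shrinks the map
print(conv_out_size(224, k=3, p=1, s=2))  # 112: stride 2 downsamples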

3.1 Training of Network

Training a network is the task of locating the kernels in convolution layers and the weights in fully connected layers which minimize the dissimilarity between output estimations and the provided ground truth labels on the training data. The backpropagation technique is the algorithm usually employed for training neural networks, in which the loss function and the gradient descent optimization technique play a significant part. A loss function, also termed a cost function, computes the compatibility between the output estimations of the network (via forward propagation) and the provided ground truth labels. The loss function frequently employed for multiclass classification is cross entropy, while mean squared error is usually applied for regression to continuous values. The type of loss function is among the hyperparameters and needs to be chosen according to the given task. Gradient descent is usually employed as an optimization technique that repeatedly modifies the learnable parameters, i.e., weights and kernels, of the network in order to mitigate the loss. Mathematically, the gradient is a partial derivative of the loss with respect to every learnable parameter, and a single update of a parameter can be written down as follows:

w := w − α · ∂L/∂w (4)

In the above equation, w denotes each learnable parameter, α is the learning rate and L is the loss function. The gradient of the loss function gives the direction along which the function has the steepest rate of growth, and every learnable parameter is updated in the direction opposite to the gradient, with a step size determined by the hyperparameter termed the learning rate. The activation function applied for a multiclass task is the softmax function, which renormalizes the real-valued outputs of the last fully connected layer into target class probabilities, where every value lies between 0 and 1 and all values sum to 1. The following algorithms

are successfully employed using Python software (Anaconda platform) installed on a Windows 10 operating system with 12 GB of RAM and an Intel Core i7 (4.80 GHz) processor.
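Purely as an illustration of the two ingredients just described (our sketch, not the authors' code), the update of Eq. (4) and the softmax renormalization can be written as:

# Illustrative sketch of Eq. (4) and the softmax function described above
import numpy as np

def sgd_step(w, grad, lr=0.001):
    return w - lr * grad          # w := w - alpha * dL/dw

def softmax(z):
    e = np.exp(z - z.max())       # subtract the max for numerical stability
    return e / e.sum()            # values in (0, 1) that sum to 1

print(softmax(np.array([2.0, 1.0, 0.1])))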

3.2 Algorithms Used

Training Algorithm: 1
# The VGG16 network, pre-trained on ImageNet, is loaded without its top classifier
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import AveragePooling2D, Dense, Dropout, Flatten, Input
from tensorflow.keras.models import Model

baselineModel = VGG16(weights="imagenet", include_top=False,
                      input_tensor=Input(shape=(224, 224, 3)))
# Develop the head of the prototype, to be positioned on top of the
# baseline prototype
hModel = baselineModel.output
hModel = AveragePooling2D(pool_size=(4, 4))(hModel)
hModel = Flatten(name="flatten")(hModel)
hModel = Dense(64, activation="relu")(hModel)
hModel = Dropout(0.5)(hModel)
hModel = Dense(2, activation="softmax")(hModel)
# Position the head FC prototype on top of the baseline prototype
# (this constitutes the real prototype that will be trained)
model = Model(inputs=baselineModel.input, outputs=hModel)
# Freeze every layer in the baseline prototype so that it is
# not modified during the first training phase
for layer in baselineModel.layers:
    layer.trainable = False

In the aforesaid algorithm, the VGG16 network is instantiated with weights pre-trained on ImageNet.

Training Algorithm: 2
# The model is compiled with learning-rate decay and the Adam optimizer
from tensorflow.keras.optimizers import Adam
print("The model is being compiled...")
optimizer = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="binary_crossentropy", optimizer=optimizer,
              metrics=["accuracy"])
# The head of the network is trained; the chest X-ray data flows
# through the data augmentation object 'train'
print("The head is being trained...")
Head = model.fit_generator(
    train.flow(trainingX, trainingY, batch_size=S),
    steps_per_epoch=len(trainingX) // S,
    validation_data=(testingX, testingY),
    validation_steps=len(testingX) // S,
    epochs=EPOCHS)

Fig. 2 Training and validation technique utilized in the fivefold cross-validation step

In Algorithm 2, the network is compiled with a decay in the learning rate, and the optimizer employed is the Adam optimizer. The neural network is trained by calling the fit_generator function, and the chest X-ray data is passed through the data augmentation object.
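For context, the data augmentation object referenced above could be constructed as in the following sketch; this is our assumption, and the rotation range shown is illustrative rather than the authors' setting:

# Hypothetical construction of the data augmentation object 'train'
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train = ImageDataGenerator(rotation_range=15, fill_mode="nearest")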

4 Empirical Results

We carried out trials to distinguish and classify nCOVID-19 X-ray pictures in distinct situations. In the first place, a DarkCovidNet deep learning technique has been suggested to classify X-ray images into three categories: COVID-19, No-Findings, and Pneumonia. Moreover, the DarkCovidNet prototype is constituted to recognize two categories: Coronavirus and No-Findings. The performance of the suggested technique is evaluated employing the fivefold cross-validation methodology for both the binary and the triple classification problems. 80% of the X-ray images are employed for training and 20% for validation. The folds are rotated over several iterations, as depicted in Fig. 2, so that the entirety of the k splits is covered in the validation phase. In Fig. 3, the curves for training loss and accuracy are depicted. The testing accuracy, the computational complexity (i.e., count of floating-point operations) and the architectural complexity (i.e., count of parameters) of the proposed approach are shown in Table 3.
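The fivefold split described above can be sketched with scikit-learn as follows (our illustration, not the authors' code):

# Minimal sketch of the fivefold cross-validation split described above
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(100)  # stand-in indices for the X-ray images
for fold, (train_idx, val_idx) in enumerate(KFold(n_splits=5, shuffle=True).split(X)):
    # each fold trains on 80% of the samples and validates on the remaining 20%
    print(f"fold {fold}: train={len(train_idx)}, validation={len(val_idx)}")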

5 Conclusion and Future Research Directions

In this research paper, a more refined technique is presented, describing different deep learning strategies to detect novel nCOVID-19 at a rapid pace as compared to the

Fig. 3 Accuracy and training loss

Table 3 Contrasting accuracy, parameters, and floating-point operations
Model | Accuracy (%) | Parameters (M) | FLOPs (G)
ResNet-50 | 98.6 | 23.54 | 42.70
Proposed approach | 98.7 | 1.60 | 4.98

manually employed RT-PCR technique, utilizing X-rays of the lungs of infected people. The model performed truly well on the validation set, with just two cases each of False Positives and False Negatives. Proceeding onward to the test data, out of 74 pictures, 70 are accurately predicted, which is an external validation accuracy of 94.6%. However, the most astonishing thing came when it anticipated the correct positive cases: the model was never trained on these images, yet it still managed to predict with 99% conviction that the lungs in the pictures are positive for nCOVID-19. This is an extraordinary precision for a deep learning model in classifying nCOVID-19; however, this is just a new beginning given the conspicuous lack of X-ray data, and it has not been validated by outside health organizations or experts.

Conflict of Interest All authors confirm no conflict of interest or funding for this study.

Transparency Declaration The authors certify that the manuscript is reported clearly, truthfully,
and precisely with no omission of any important aspect.

Authors’ Contributions Statement Sharduli: Idea formulation, Conceptualization, Effective literature review, Methodology, Formal analysis, Software testing, Writing-review & final editing.
Amit Batra: Conceptualization, Software, Methodology, Writing- review & final editing.
Kulvinder Singh: Experiments and Simulation, Supervision, Validation, Writing-original draft.

References

1. Deng L, Yu D (2013) Deep learning: methods and applications. Found Trends Signal Proces
7(3–4):197–387
2. Choi W, My T, Tran L, Pan I, Shi L-B, Hu P-F, Li S (2020) Performance of radiologists in
differentiating COVID-19 from viral pneumonia on chest CT. Radiology 1:1–13
3. Li L, Qin L, Xu Z, Yin Y, Wang X, Kong B, Xia J (2020) Artificial intelligence distinguishes COVID-19 from community acquired pneumonia on chest CT. Radiology 284(2):200905
4. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, Dec 2016, pp 770–778
5. Xu X, Jiang X, Ma C, Du P, Li X, Lv S, Yu L, Chen Y, Su J, Lang G, Li Y, Zhao H, Xu K,
Ruan L, Wu W (2020) Deep learning system to screen coronavirus disease 2019 pneumonia.
In: Applied Intelligence
6. Jin S, Wang B, Xu H, Luo C, Wei L, Zhao W, Xu W (2020) AI-assisted CT imaging analysis for
COVID-19 screening: building and deploying a medical AI system in four weeks. MedRxiv,
2020.03.19.20039354
7. Zhou Z, Rahman Siddiquee MM, Tajbakhsh N, Liang J (2018) UNet++: a nested U-Net architecture for medical image segmentation. In: 4th International workshop on deep learning in medical image analysis, lecture notes in computer science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp 3–11
8. Shuai W, Kang B, Ma J, Zeng X, Xiao M, Guo J, Cai M, Yang J, Li Y, Meng X, Xu B (2020)
A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19).
MedRxiv
9. Gozes O, Frid-Adar M, Greenspan H, Browning PD, Bernheim A, Siegel E (2020) Rapid AI development cycle for the coronavirus (COVID-19) pandemic: initial results for automated detection & patient monitoring using deep learning CT image analysis. Retrieved from http://arxiv.org/abs/2003.05037
10. Ye Y, Hou S, Fan Y, Qian Y, Zhang Y, Sun S, Laparo K (2020) Alpha-satellite: an AI-driven system and benchmark datasets for hierarchical community-level risk assessment to help combat COVID-19. Retrieved from http://arxiv.org/abs/2003.12232
11. Mirza M, Osindero S (2014) Conditional generative adversarial nets, pp 1–7. Retrieved from
http://arxiv.org/abs/1411.1784
12. Chang SL, Harding N, Zachreson C, Cliff OM, Prokopenko M (2020) Modelling transmission
and control of the COVID-19 pandemic in Australia, pp 1–31. Retrieved from http://arxiv.org/
abs/2003.10218
13. Zachreson C, Fair KM, Cliff OM, Harding N, Piraveenan M, Prokopenko M (2018) Urban-
ization affects peak timing, prevalence, and bimodality of influenza pandemics in Australia:
results of a census-calibrated model. Sci Adv 4(12):1–9
14. Allam Z, Jones DS (2020) On the coronavirus (COVID-19) outbreak and the smart city network:
universal data sharing standards coupled with artificial intelligence (AI) to benefit urban health
monitoring and management. Healthcare 8(1):46
15. Du S, Wang J, Zhang H, Cui W, Kang Z, Yang T, Zheng N (2020) Predicting COVID-19 using
hybrid AI Model. SSRN Electron J 1–14
16. Lopez CE, Vasu M, Gallemore C (2020) Understanding the perception of COVID-19 policies
by mining a multilanguage Twitter dataset, pp 1–4. Retrieved from http://arxiv.org/abs/2003.
10359
Estimation of Location and Fault Types
Detection Using Sequence Current
Components in Distribution Cable
System Using ANN

Garima Tiwari and Sanju Saini

Abstract This paper introduces a technique to estimate the location and detect the type of faults in underground distribution cable systems. The work follows two main steps: the first step is modeling a 20 km, 11 kV underground cable for the distribution power system, and the second step is collecting data from it for training an artificial neural network. Sequence current components (positive-, negative-, and zero-sequence values of current) are used as input data, and the fault types and their locations are used as output data.

Keywords Underground cable · Artificial neural network · Fault detection · Distribution system

1 Introduction

The main problem in power systems is the difficulty of data acquisition. To retain information at multiple nodes in power systems, different types of traditional measuring instruments, namely current transformers (CT), potential transformers (PT), and remote terminal units (RTU), as well as intelligent electronic devices (IEDs), are being used [1]. For smart online inspection of power systems, self-powered non-intrusive sensors have been developed, forming a smart network [2, 3]. In [4], a brief introduction to the application of phasor measurement units (PMU) is given. The frequency content of the current and voltage signals changes abruptly [5] after the occurrence of a fault, e.g., a 1-line-to-ground fault or a three-phase fault; if these effects are analyzed, they may help the power plant in a great manner. Several methods have been produced for the frequency-characteristic analysis of time-domain (TD) signals; three techniques mainly used to identify the fault type in power systems are described as follows. Based on a scalable, moving, localizing Gaussian window, the S-transform (ST) gives frequency-dependent resolution with a time–frequency representation, as given by Stockwell et al. in [6].

G. Tiwari · S. Saini (B)


Department of Electrical Engineering, DCRUST Murthal, Sonipat, Haryana, India


For the detection and interpretation of a transient event, an ST representation, i.e., a two-dimensional time–frequency (TF) representation, can efficiently provide local spectral characteristics [7]. The S-matrix stores the ST calculation, and a two-dimensional visualization is plotted with the help of the ST contour. ST is better than DWT according to some researchers because it avoids some drawbacks of DWT, i.e., noise sensitivity and particular harmonic-reflection characteristics [8, 9]. In [10], the problem of identifying the faulty section and phase is solved by using the standard deviation and signal energy of the ST contour. The calculation of the autocorrelation and variance of the S-matrix is explained in [11]. An introduction to the ST based on the fast discrete transform is given in [12]. The phase angle, amplitude, and impedance of the fault point, and the location of the fault, are introduced in [13]. Analysis of signals in the frequency domain needs a mathematical tool, and the Fourier transform is mainly used for this purpose. The discrete Fourier transform (DFT) is proposed for discrete frequency-domain and TD coefficients. In paper [14], the authors proposed calculating the phasor elements of the faulty current signal by using the discrete Fourier transform (DFT), the full-cycle harmonic component, and the half-cycle DFT to eliminate direct-current elements. For the classification of fault type, researchers used the half-cycle DFT to find the harmonic and fundamental phasors [15–17]. Fundamental values of voltages and currents are found by the full-cycle FFT [18]. There are mainly two types of faults: symmetrical and unsymmetrical. The three-phase-to-ground fault is considered a symmetrical fault. The types of unsymmetrical faults are described below.
• Single-phase to ground fault (L-G fault)
• Phase-to-phase fault (L-L fault)
• Double phase to ground fault (L-L-G fault)
• Three-phase fault (L-L-L fault)
The benefits of detecting the type of fault are enhanced reliability, increased cable life, protection of the overall system from damage, and a continuous flow of energy; the three-phase fault is more severe compared to the single-phase fault, but the single-phase fault occurs more frequently [1].
This work uses an artificial neural network (ANN) to estimate the location and detect the fault type in a distribution cable system. A simulated model of the underground distribution cable system is given in Sect. 2. An introduction to the artificial neural network-based fault detection scheme is given in Sect. 3, followed by the simulation results and conclusion.

2 MATLAB Simulink Model of Distribution Underground Cable

A model of the underground distribution cable system is simulated as a PI section. The MATLAB Simulink model of the distributed underground cable system is shown in Fig. 1, and the overall structure of the cable in the PI section is shown in Fig. 2. The Simulink model of the underground cable can generate 11 categories of faults, i.e., single-phase-to-ground faults (AtoG, CtoG), phase-to-phase faults (AtoB, BtoC, CtoA), double-phase-to-ground faults (AtoBtoG, BtoCtoG, CtoAtoG), the three-phase fault (AtoBtoC), the three-phase-to-ground fault (AtoBtoCtoG), etc. Here, a mutual inductance block is used for the representation of two- or three-winding inductances utilizing identical mutual coupling, and the capacitance ‘C’ represents the branch capacitance, in farads (F). By using this model, the collection of data has been done in this work to train an artificial neural network. The cable parameter values used are given in Table 1.

Fig. 1 Simulink model of an underground cable

Fig. 2 Internal structure of the cable

3 Collection of Data

The data used for the training set of the ANN are the values of the positive-, negative-, and zero-sequence components (I1, I2, and I0, respectively) of the fault current. To get this data, firstly a phase A-to-ground fault (AtoG fault) is applied in the toolbox for 0.2–0.3 s (at a distance of 4 km). This gives the values of the zero-sequence, positive-sequence, and

Table 1 Configuration parameters of cable
Quality of cable parameters Values
Number of cables 3
Frequency 50 Hz
Ground resistivity 100 Ω m
The geometric mean distance between cables 80.09 cm
Phase conductor
Number of strands 58
Strand diameter of cable 2.7 mm
Resistivity of material 1.78e−8 Ω m
Relative permeability of material 1
External diameter of phase conductor 20.9 mm
Phase-screen insulator
Relative permittivity of material 2.3
Diameter of internal cable 23.3 mm
Diameter of external cable 66.6 mm
Screen conductor
The resistivity of screen conductor 1.78e−8 Ω m
Total section of screen conductor 0.000169 m2
The internal diameter of screen conductor 65.8 mm
External diameter of screen conductor 69.8 mm
Outer screen insulator
Relative permittivity of outer screen insulator 3.25
The internal diameter of outer screen insulator 69.8 mm
External diameter of outer screen insulator 77.8 mm

negative-sequence components of the fault current by using the sequence analyzer block of MATLAB. The same procedure is repeated for the other types of faults, like BtoG (phase B to ground), CtoG (phase C to ground), ABG (lines AB to ground), BCG (lines BC to ground), CAG (lines CA to ground), AB (line A to line B), BC (line B to line C), CA (line C to line A), ABC (three-phase fault), and ABCG (three-phase to ground fault), in addition to a case of NO FAULT. There are seven different fault locations: at the source end, at the mid-point, at the load end, and at 4, 8, 12, and 16 km. So there are a total of nineteen cases for which the training data is collected. The source-end and load-end voltages and currents for the 1-line-to-ground fault are given in Table 2. The training data for the artificial neural network, i.e., the values of the sequence components of the fault current noted for the different types of faults at 4 km of cable length from the source end, is given in Table 3.
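The sequence analyzer step corresponds to the standard symmetrical-component transformation I0 = (Ia + Ib + Ic)/3, I1 = (Ia + a·Ib + a²·Ic)/3, I2 = (Ia + a²·Ib + a·Ic)/3, with the rotation operator a = e^(j2π/3). A minimal sketch of this computation follows (our illustration of the standard definition, not the authors' MATLAB model):

# Symmetrical components of three phase-current phasors (standard definition,
# mirroring what MATLAB's sequence analyzer block computes)
import numpy as np

def sequence_components(Ia, Ib, Ic):
    a = np.exp(2j * np.pi / 3)            # 120-degree rotation operator
    I0 = (Ia + Ib + Ic) / 3               # zero sequence
    I1 = (Ia + a * Ib + a**2 * Ic) / 3    # positive sequence
    I2 = (Ia + a**2 * Ib + a * Ic) / 3    # negative sequence
    return I1, I2, I0

# source-end phasors for the phase-A-to-ground fault, taken from Table 2
I1, I2, I0 = sequence_components(0.4526 - 1.928j, -0.4407 - 0.355j, -0.327 + 0.5721j)
print(abs(I1), abs(I2), abs(I0))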

Table 2 Source-end and load-end voltages and currents for the 1-line-to-ground fault
Quantities | Single-phase to ground fault at phase A | Single-phase to ground fault at phase B | Single-phase to ground fault at phase C
VSA | 0.0025 − 0.0003 J | 0.0247 − 0.00191 J | 0.0210 − 0.0082 J
VSB | −0.0176 − 0.0141 J | −0.00159 − 0.00209 J | −0.014 − 0.020 J
VSC | −0.0107 + 0.0223 J | −0.0034 + 0.0223 J | −0.0010 + 0.0023 J
ISA | 0.4526 − 1.928 J | 0.6594 − 0.00164 J | 0.5280 − 0.204 J
ISB | −0.4407 − 0.355 J | −1.964 + 0.528 J | −0.3310 − 0.5701 J
ISC | −0.327 + 0.5721 J | −0.0875 + 0.559 J | 1.556 + 1.3 J
VRA | 0.00035 + 0.0005 J | 0.026 − 0.0026 J | 0.0201 − 0.0102 J
VRB | −0.018 − 0.123 J | 0.000253 − 0.00056 J | −0.01529 − 0.0212 J
VRC | −0.0107 + 0.0238 J | −0.00124 + 0.0226 J | −0.00061 + 0.00006 J
IRA | 0.0089 + 0.0125 J | 0.6529 − 0.065 J | 0.5051 − 0.2556 J
IRB | −0.4739 − 0.309 J | 0.00629 − 0.0140 J | −0.382 − 0.53 J
IRC | −0.2096 + 0.598 J | −0.03104 + 0.5654 J | −0.0153 + 0.00148 J

Table 3 Zero-sequence, negative-sequence, and positive-sequence current components
Positive sequence I1 | Negative sequence I2 | Zero sequence I0 | Type of fault
1.0340 | 0.7889 | 0.7227 | AG
1 | 0.7569 | 0.6912 | BG
1 | 0.7353 | 0.6700 | CG
1.9364 | 1.261 | 0.5440 | ABG
1.8085 | 1.1140 | 0.5865 | BCG
1.9092 | 1.2127 | 0.5610 | CAG
1.6588 | 1.5251 | 0 | AB
1.5075 | 1.3773 | 0 | BC
1.6275 | 1.4949 | 0 | CA
2.9790 | 0.7521 | 0.7521 | ABCG
2.9790 | 0.7521 | 0.7521 | ABC
1 | 0 | 0 | No fault

4 Simulation Results

In this paper, the fault types and their locations in an underground cable have been found by using an artificial neural network. The cable model is simulated for 1 s. The source-end and load-end currents and voltages are given in Table 2. Fault voltages and currents have been observed for the different types of faults. In Figs. 3, 4, 5 and 6, the wave shapes are shown for two different types of faults.

Fig. 3 Waveform of voltage for 1-line to ground fault

Fig. 4 Waveform of current for 1-line to ground fault

Fig. 5 Waveform of voltage for phase to phase to ground fault



Fig. 6 Waveforms of current for phase to phase to ground fault

The artificial neural network used to detect the fault type and its location is a combination of six layers: one input layer, four hidden layers (having 4, 2, 5, and 6 neurons, respectively), and one output layer. Other performance parameters of the ANN are shown in Fig. 7.
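Such a network can be sketched as follows; this is our illustration only (the authors evidently train their ANN in MATLAB), and the scikit-learn choice and sample values are assumptions:

# Hypothetical sketch of the six-layer ANN described above: four hidden
# layers with 4, 2, 5 and 6 neurons between the input and output layers
import numpy as np
from sklearn.neural_network import MLPClassifier

# inputs are (I1, I2, I0) sequence components; two rows taken from Table 3
X = np.array([[1.0340, 0.7889, 0.7227], [1.6588, 1.5251, 0.0]])
y = ["AG", "AB"]  # fault class labels
clf = MLPClassifier(hidden_layer_sizes=(4, 2, 5, 6), max_iter=2000).fit(X, y)
print(clf.predict(X))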
After training, the neural network has been observed to classify each type of fault at different locations correctly, with different accuracies for the different types of fault (approximately 99%), as given in Table 4. The performance chart of the ANN is shown in Fig. 8.

5 Conclusion

This work concludes that computational technologies such as artificial neural networks can be effectively utilized for the detection of fault types and locations in underground distribution cable systems.

Fig. 7 Parameters of ANN

Table 4 Accuracy per type of fault at 4 km
S. No. Type of faults Accuracy in %
1 AG 98.2
2 BG 97.3
3 CG 99
4 ABG 99.9
5 BCG 96.3
6 CAG 97.8
7 AB 99.9
8 BC 98.5
9 CA 96.5
10 ABC 98
11 ABCG 99
12 NO Fault 100

Fig. 8 Performance chart of ANN: mean squared error (MSE) versus epochs (13 epochs) for the train, validation, test, and best curves

References

1. Kezunovic M (2011) Smart fault location for smart grids. IEEE Trans Smart Grid 11–22
2. Ouyang Y, He JL, Hu J et al (2012) A current sensor based on the giant magnetoresistance
effect: design and potential smart grid applications. Sensors, 15520–15541
3. Han JC, Hu J, Yang Y et al (2015) A nonintrusive power supply design for self-powered sensor
networks in the smart grid by scavenging energy from AC power line. IEEE Trans Ind Electron
4398–4407
4. De La Ree J, Centeno V, Thorp JS et al (2010) Synchronized phasor measurement applications
in power systems. IEEE Trans Smart Grid 20–27
5. Bo Q, Jiang F, Chen Z et al (2000) Transient based protection for power transmission systems.
IEEE Power Engineering Society Winter Meeting, pp 1832–1837
6. Stockwell RG, Mansinha L, Lowe R (1996) Localization of the complex spectrum: the S
transform. IEEE Trans Signal Process 998–1001
7. Dash P, Panigrahi B, Panda G (2003) Power quality analysis using S-transform. IEEE Trans
Power Deliv 406–411
8. Dash PK, Das S, Moirangthem J (2015) Distance protection of shunt compensated transmission
line using a sparse S-transform. IET Gener Transm Distrib 1264–1274
9. Moravej Z, Pazoki M, Khederzadeh M (2015) New pattern-recognition method for fault analysis
in a transmission line with UPFC. IEEE Trans Power Deliv 1231–1242
10. Samantaray SR, Dash PK (2008) Pattern recognition based digital relaying for advanced series compensated line. Int J Electr Power Energy Syst 102–112
11. Samantaray SR (2013) A systematic fuzzy rule-based approach for fault classification in
transmission lines. Appl Soft Comput 928–938
12. Dash P (2013) A new real-time fast discrete S-transform for cross-differential protection of
shunt compensated power systems. IEEE Trans Power Deliv 402–410
13. Samantaray S, Dash P (2008) Transmission line distance relaying using a variable window
short-time Fourier transform. Electr Power Syst Res pp 595–604
14. Yu S-L, Gu J-C (2001) Removal of decaying DC in current and voltage signals using a modified
Fourier filter algorithm. IEEE Trans Power Deliv 372–379
15. Das B, Reddy JV (2005) Fuzzy-logic-based fault classification scheme for digital distance
protection. IEEE Trans Power Deliv 609–616
16. Jamehbozorg A, Shahrtash SM (2010) A decision-tree-based method for fault classification in
single-circuit transmission lines. IEEE Trans Power Deliv 2190–2196

17. Jamehbozorg A, Shahrtash SM (2010) A decision tree-based method for fault classification in
double-circuit transmission lines. IEEE Trans Power Deliv 2184–2189
18. Hagh MT, Razi K, Taghizadeh H (2007) Fault classification and location of power transmission
lines using artificial neural network. In: 2007 Conference Proceedings of IPEC, vols 1–3, pp
1109–1114
LabVIEW Implemented Smart Security
System Using National Instruments
myRIO

S. Krishnaveni, M. Harsha Priya, and P. A. Harsha Vardhini

Abstract In this paper, a smart security system based on LabVIEW is proposed. Here, abnormal movements are detected by an IR sensor, and a camera records the surroundings and sends the information to the user through email. In this system, physical movements are monitored using an efficient image analysis system and communicated using a wireless network data managing system. The smart security system design consists of components such as sensors, alarms, cameras, and the SMTP protocol. Monitoring is continuous, movements are captured on camera, and the frames are further analyzed in the LabVIEW software. myRIO notifies the user if any intrusion is observed; while notifying the user, a small video of the intruder is also sent to the registered email.

Keywords IR sensor · myRIO · USB 3.0 · Security · Smart system

1 Introduction

A smart home security system is significant in a residential community. Security and safety are the key purposes of designing a smart home security system. Advanced security features are possible through essential wireless communication network protocols. Physical movements are recognized and recorded on camera; further, the information is communicated to the user [1–7]. This system monitors physical movements using an efficient image analysis system and communicates them using a wireless network data managing system. When the sensor identifies any movement on the premises, the processor triggers the camera to capture pictures in real time, and the video is streamed and recorded. The code for this application is written by using

S. Krishnaveni (B) · M. Harsha Priya


Department of ECE, CMR College of Engineering and Technology, Hyderabad, Telangana, India
e-mail: krishnaveni@cmrcet.ac.i
P. A. Harsha Vardhini
Department of ECE, Vignan Institute of Technology and Science, Deshmukhi, Telangana, India


LabVIEW software by taking a flat sequence. This flat sequence consists of individual codes, combined together to produce the final result. National Instruments' myRIO is an efficient embedded board used for real-time evaluation to enhance application efficiency. National Instruments' LabVIEW is a system design development environment that provides a graphical approach. Due to the visualization in LabVIEW, it becomes easier to integrate complex hardware, as a diagram, from any vendor. myRIO utilizes an on-board FPGA and microprocessor; it serves as an embedded evaluation board working in real time.
The imaging camera captures image data, which is communicated using USB 2.0/3.0 technologies. Characteristics of the surroundings are sensed by electronic instruments called infrared sensors [8–15]. There are two detection techniques: in the first, the infrared sensor detects the presence of an object that emits infrared radiation; in the second, it detects any interruption of the infrared radiation being emitted by the sensor. By combining all the hardware and software components, the streaming video is sent to the specified person using the SMTP protocol.

2 Technical Approach

The smart security system design consists of components such as sensors, alarms, cameras, and the SMTP protocol. Monitoring is continuous, movements are captured on camera, and the frames are further analyzed in the LabVIEW software. myRIO notifies the user if any intrusion is observed; while notifying the user, a small video of the intruder is also sent to the registered email. The SMTP communication protocol (port 25) is used to send the notification mail to the server. The electronic sensor used here, an infrared sensor, identifies movements in the region covered by the sensor. Heat emitted from living beings can be measured by the IR sensor; this plays a vital role in the detection of intruders. Immediately when an intruder is observed, a video is recorded and communicated to the user in real time using the SMTP protocol. USB cameras are picture-capturing cameras; here, the technology used is USB 3.0, which transfers the captured data.
The components integrated together to develop the application are the IR sensor, buzzer, myRIO, and USB 3.0 camera. If any intrusion is found in the path covered by the IR sensor, it notifies the user with a video recorded using the USB camera. For the authorized user, the notification is sent using the SMTP protocol.

3 Methodology

The smart security system design consists of components such as sensors, alarms, cameras, and the SMTP protocol. As mentioned in Fig. 1, the block diagram describes the implementation of the project and the interfacing of the components. The power supply is given to myRIO and the system. The IR sensor and buzzer are interconnected with myRIO. A USB camera is connected to the system. Whenever the IR sensor detects any object, the buzzer starts alerting. Then, the camera starts capturing the footage, records it, and sends an email notification to the authorized registered email account.

Fig. 1 Taxonomy of processing

3.1 Module 1: Buzzer Code

The first stage of the design (Fig. 2) is to detect any obstacle, and two cases are considered. In case 1, if any intrusion occurs, the buzzer alerts the surroundings; case 2 is when there is no intrusion, and the code returns a string indicating normality. In buzzer mode, the outputs are taken as global variables so that they can be used as inputs for the next modules.

Fig. 2 Buzzer code

Fig. 3 Camera code

3.2 Module 2: Camera Code

If the sensor detects any abnormal situation in the predefined surroundings, the camera immediately starts capturing the premises. The captured video is stored with the ‘.avi’ extension and is further communicated to the user. If there is no abnormal situation, module 2 returns false to the next module, module 3, i.e., the email mode, as shown in Fig. 3.

3.3 Module 3: Email Code

Email mode is the third mode of the design. After an intrusion is observed, the camera captures a video of the premises and saves the file. The email code then returns ‘True’ and sends the email to the authorized user through the registered email id. If no video is captured, the email code returns zero, or ‘False’. The code is as shown in Fig. 4.
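The authors implement this step with LabVIEW's SMTP blocks; purely for illustration, an equivalent notification step in Python might look like the sketch below. The addresses, server name and file path are placeholders, not values from the paper.

# Illustrative Python equivalent of the email module (not the authors'
# LabVIEW code); addresses, server and file path are placeholders
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Intrusion alert"
msg["From"] = "system@example.com"
msg["To"] = "user@example.com"
msg.set_content("Intrusion detected; the recorded video is attached.")
with open("intruder.avi", "rb") as f:
    # attach the '.avi' recording produced by the camera module
    msg.add_attachment(f.read(), maintype="video", subtype="x-msvideo",
                       filename="intruder.avi")
with smtplib.SMTP("smtp.example.com", 25) as server:  # SMTP port 25, as above
    server.send_message(msg)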

Fig. 4 Email code

4 Experimental Results

4.1 Detection of the Movement

When the IR sensor detects any object, the USB camera starts to capture a 30 s video by executing the true case of the code and sends the email to the registered mail-id. Figure 5 illustrates the results of the proposed work.

Fig. 5 Hardware setup during detection

4.2 Streaming Video Being Recorded

When the IR sensor detects an intrusion, the video is recorded and saved to the given file path. The recorded file is sent as an alert message to the authorized person's registered email account.
Figure 6 illustrates the source code front panel, and Fig. 7 depicts the resulting recording saved in the file path mentioned in the code, i.e., the recording of the captured premises.

Fig. 6 Source code–front panel

Fig. 7 Code results in recording saved in the file path mentioned in the code

Fig. 8 Alert message to all

Fig. 9 Source code-block diagram

4.3 Mail Attachment to the Registered Account

Figure 8 depicts the email sent to the user, with a streaming video clip attached to the mail. This mail is sent when the sensor detects some intrusion/movement. The final source code is shown in Fig. 9.

5 Conclusion

With LabVIEW, abnormal movements are detected by the IR sensor while the camera records the surroundings, and the information is sent to the user through email. The system monitors physical movements using an efficient image analysis system and communicates them using a wireless network data managing system. The smart security system design consists of components such as sensors, alarms, cameras, and the SMTP protocol. Monitoring is continuous, movements are captured on camera, and the frames are further analyzed with LabVIEW. myRIO notifies the user if any intrusion is observed; while notifying the user, a small video of the intruder is also sent to the registered email.

References

1. Juhana T, Angraini VG (2016) Design and implementation of smart home surveillance system.
In: 2016 10th International conference on telecommunication systems services and applications
(TSSA)
2. Harsha Vardhini PA, Harsha MS, Sai PN, Srikanth P (2020) IoT based smart medicine
assistive system for memory impairment patient. In: 2020 12th international conference on
computational intelligence and communication networks (CICN), Bhimtal, India, pp 182–186
3. Harsha Vardhini PA, Ravinder M, Srikanth P, Supraja M (2019) IoT based wireless data printing
using Raspberry Pi. J Adv Res Dyn Control Syst 11(4):2141–2145
4. Kumar S, Swetha S, Kiran T, Johri P (2018) IoT based smart home surveillance and automa-
tion. In: 2018 International conference on computing, power and communication technologies
(GUCON), Greater Noida, Uttar Pradesh, India
5. Vardhini PAH (2016) Analysis of integrator for continuous time digital sigma delta ADC
on Xilinx FPGA. In: International conference on electrical, electronics, and optimization
techniques, ICEEOT 2016, pp 2689–2693
6. Harsha Vardhini PA, Madhavi Latha M (2015) Power analysis of high performance FPGA low
voltage differential I/Os for SD ADC architecture. Int J Appl Eng Res 10(55):3287–3292
7. Kumar P (2017) Design and implementation of Smart Home control using LabVIEW. In: Third
international conference on advances in electrical, electronics, information, communication
and bio-informatics (AEEICB), Chennai, pp 10–12. https://doi.org/10.1109/AEEICB.2017.
7972317
8. Vardhini PAH, Koteswaramma N, Babu KMC (2019) IoT based raspberry pi crop vandalism
prevention system. Int J Innov Technol Explor Eng 9(1):3188–3192
9. Darrell T, Demirdjian D, Checka N, Felzenszwalb P (2001) Plan-view trajectory estimation with dense stereo background models. In: ICCV 2001, pp 628–635
10. Harsha Vardhini PA, Ravinder M, Srikanth Reddy P, Supraja M (2019) Power optimized
Arduino baggage tracking system with finger print authentication. J Appl Sci Comput J-ASC
6(4):3655–3660
11. Vardhini PAH, Makkena ML (2021) Design and comparative analysis of on-chip sigma delta
ADC for signal processing applications. Int J Speech Technol 24:401–407
12. Tushara DB, Vardhini PA (2016) Effective implementation of edge detection algorithm on
FPGA and beagle board. In: 2016 International conference on electrical electronics and
optimization techniques (ICEEOT), pp 2807–2811
13. Kogut G, Trivedi M (2002) A wide area tracking system for vision sensor networks. In: 9th World Congress on intelligent transport systems, Chicago, Illinois, Oct 2002
14. Bindu Tushara D, Harsha Vardhini PA (2015) FPGA implementation of image transformation
techniques with Zigbee transmission to PC. Int J Appl Eng Res 10(55):420–425
15. Bindu Tushara D, Harsha Vardhini PA (2017) Performance of efficient image transmission using
Zigbee/I2C/Beagle board through FPGA. In: Saini H, Sayal R, Rawat S (eds) Innovations in
computer science and engineering. Lecture notes in networks and systems, vol 8. Springer,
Singapore. https://doi.org/10.1007/978-981-10-3818-1_27
Design and Analysis of Modified Sense
Amplifier-Based 6/3T SRAM Using
CMOS 45 nm Technology

Chilumula Manisha and Velguri Suresh Kumar

Abstract SRAM and sense amplifiers are important components in memory design. The choice and design of a sense amplifier define the robustness of bit-line sensing, impacting the read speed and power. The primary function of a sense amplifier in 6T SRAMs is to amplify the small analog differential voltage developed on the bit lines by a read-accessed cell to a full-swing digital output signal, thus greatly reducing the time required for a read operation. Since 6T SRAMs do not feature data refresh after sensing, the sensing operation must be non-destructive, as opposed to the destructive sensing of a DRAM cell. A sense amplifier allows the storage cells to be small, since each individual cell need not fully discharge the bit line. A novel high-speed and highly reliable sense amplifier-based 3T SRAM is proposed for low supply voltage (VDD) operation. With the proposed method, the area of the SRAM is reduced by 50%, the speed increases by up to 40%, and the power dissipation is reduced. The proposed design is implemented in CMOS Tanner 45 nm technology.

Keywords Sense amplifier · Conventional SRAM · Power dissipation

1 Literature Survey

The sense amplifier-based master stage reduces the loading to the clock network.
The proposed improvement on the slave stage shortens the data to Q delay. The
simulation results indicate that the proposed design has a shorter data to Q delay than
the flip-flop design used in Alpha 21264 microprocessors. The proposed design [1]
can be fabricated in the mainstream commercial digital complementary metal oxide
semiconductor process. The self-lock problem in the standard sense amplifier-based
master stage is avoided by additional discharging paths controlled by the redundant
nodes. The improved SR latch design reduces the data to Q delay. A NOR-gate bank
is used to prevent a master stage single event transient propagating to two branches

C. Manisha · V. Suresh Kumar (B)


Department of ECE, Maturi Venkata Subba Rao Engineering College, Telangana State,
Hyderabad, India
e-mail: vsureshkumar_ece@mvsrec.edu.in


in the slave stage. A novel soft error tolerant latch and a novel soft error-tolerant flip-
flop are presented for multiple VDD circuit design [2]. By utilizing local redundancy,
the latch and the flip-flop can recover from soft errors caused by cosmic rays and
particle strikes. By using output feedback, implicit pulsed clock, and conditional
discharged techniques, the proposed flip-flop can behave as a level converter and
without the problems of static leakage and redundant switching activity. Since the
setup time of the new flip-flop is negative, it can further mitigate the impact of single
event transient (SET) at the D input of the flip-flop.

2 Introduction

Today's world is craving an ultra-low-power system on chip (SOC), and this demand is continuously increasing as the interest in high-volume, power-constrained mobile SOC applications is growing. In particular, for applications where performance is of secondary importance, one of the simplest and most efficient methods to improve energy consumption is to reduce the supply voltage at the cost of a loss of speed. The sense amplifier is like the mitochondria of a memory chip. As the number of cells (SRAM/DRAM) increases [3], the voltage difference created by them reduces. Sense amplifiers, in consortium with memory cells, are essential elements in defining the performance and environment tolerance of CMOS memories. Because of their high importance in memory design, sense amplifiers became a very large circuit class.
SRAM stands for static random access memory, a volatile memory that can store information as long as power is applied. The sense amplifier operates only when the data stored is read from memory. The sense amplifier is used to detect the very small differential voltage on the bit lines and amplify the signals to the full digital voltage swing before the signals are fully charged or discharged [4]. This shortens the time taken to read the contents of memory, since the circuitry does not need to wait until the signals are fully charged or discharged to determine whether the value is 1 or 0. The small spark or glitch [5, 6] on both bit lines may determine its state, so the memory may take it quickly as either 1 or 0 rather than waiting for the full voltage swing, thus saving time in the memory read operation. The paper is arranged as follows: 1. Introduction, 2. Literature survey, 3. Implementation, 4. Experimental results and comparison, 5. Conclusion.

3 Implementation

3.1 Conventional Sense Amplifier

The design of the conventional SAFF is composed of two stages: a sense amplifier (SA) stage followed by a NAND-based latch. In the design, the NAND gates are designed first. For the design of the sense amplifier stage, take both the PMOS devices M2 & M3 and the NMOS devices M5 & M6. The gates of PMOS M2 and NMOS M6 are shorted together, and similarly those of M3 & M5. The output and input of each series PMOS–NMOS pair are shorted together. Two PMOS devices, M1 & M4, are taken for the clock supply. The M1 source is connected to the M2 source, and similarly the M4 source is connected to the M3 source. The drains of all four PMOS devices are connected to VDC. Now again take the NMOS devices M7, M8 & M9; all their drains are shorted. With the support of an inverter, DBAR is obtained. The inverter is designed with PMOS M10 and NMOS M11, with the VDC supply and ground: if the input is D, then DBAR is obtained through the inverter. Then connect input D to M7. One more NMOS, M12, is clocked (Fig. 1).

Fig. 1 Conventional sense amplifier-based flip-flop



3.1.1 Designing of Latch

Now bring in two NAND gates with both inputs facing upwards and both grounds
shorted. Supply VDC is given to both NAND gates. One output is Q for NAND1, and
QBAR is the output of NAND2. Extract a wire from M2 & M5 connected to NAND1, and
a wire from M3 & M6 connected to NAND2; the two NAND gates are now cross-coupled.
Next, from the spice sources select VDC and GND, select VPulse for the clock and
data D, and from the spice commands select print voltage at Q and QBAR. For this
design, the required number of PMOS devices is 5, NMOS devices 7, and NAND gates 2.
For the desired output data, the test bench plays a crucial role. In this design, the
proper output is obtained for the following specifications: delay 0 ns (a larger delay
degrades the response, so the delay is kept very small for any design), fall time 5 ns,
period 200 ns, pulse width 95 ns, and rise time 5 ns, where rise time means the time
taken to go from 10 to 90% of the final output.
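As a worked illustration of these test-bench numbers, the sketch below generates the corresponding trapezoidal VPulse waveform in numpy; the function and parameter names are assumptions, not part of the Tanner tool flow.

```python
import numpy as np

def vpulse(t, delay=0e-9, rise=5e-9, width=95e-9, fall=5e-9,
           period=200e-9, v_low=0.0, v_high=5.0):
    """Trapezoidal pulse source: delay, rise, width, fall, period."""
    tm = np.mod(np.asarray(t, dtype=float) - delay, period)
    v = np.full_like(tm, v_low)
    v = np.where(tm < rise, v_low + (v_high - v_low) * tm / rise, v)          # rising edge
    v = np.where((tm >= rise) & (tm < rise + width), v_high, v)               # flat top
    v = np.where((tm >= rise + width) & (tm < rise + width + fall),
                 v_high - (v_high - v_low) * (tm - rise - width) / fall, v)   # falling edge
    return v

t = np.linspace(0, 400e-9, 801)
clk = vpulse(t)  # 200 ns period, 95 ns width, 5 ns edges, as specified above
```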

3.1.2 Simulation Waveform of Conventional SAFF

Time is taken on the X-axis, and voltage on the Y-axis. For clock equal to logic 1,
the data voltage increases initially, then saturates and decreases gradually, and this
is then repeated. When the clock is 1 (HIGH), the D value gradually increases up to 10 ns
(rise time); it then stays constant for 50 ns, gradually decreases from 50 to 55 ns
(fall time), and then remains at 0 V up to 100 ns. From the graph, it is clear that
QBAR is the complement of Q: Q is logic 0, and QBAR is logic 1 (Fig. 2).

Fig. 2 Simulation results of SAFF



Fig. 3 Proposed sense amplifier

3.1.3 Proposed Sense Amplifier

The proposed version of the sense amplifier has 7 transistors. It has two
inverters in a back-to-back PMOS/NMOS connection: PMOS transistors M1 & M3 and
NMOS transistors M2 & M4. To form the inverter configuration, M1 & M2 are connected
in series with both gates shorted; similarly M3 & M4. The input and output of the
two inverters are shorted together, and the drains of the inverters are connected
to VDD. Another two access transistors, PMOS M6 & M7, connect BIT and BITBAR;
their inputs are connected to the inverters. To obtain BITBAR, another inverter,
M8 & M9, is used between M6 & M7. The gates of M6 & M7 are shorted and connected
to read enable. Another NMOS transistor, M5, is connected to the memory part; the
source of M5 is connected to ground and its gate to read enable (Fig. 3).
For simulation, take two voltage pulses V1 & V2 from the spice elements for BIT and
read enable, and a DC voltage V3 for VDD referenced to GND. Then, from the spice
commands, select print voltage at BIT, read enable, and out.

3.1.4 Power Analysis

The proposed sense amplifier has been simulated from 0 to 5e−07 s with the V3 VDC
voltage source. The average power consumed is around 1.129630e+15 W, the maximum
power consumption is 1.946189e+15 W at time 5e−09 s, and the minimum power
consumption is 7.196223e−10 W.
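For reference, summary figures of this kind are obtained from the transient power trace in the standard way; a minimal sketch follows, where the exported two-column trace file and its name are assumptions.

```python
import numpy as np

def power_summary(time_s, power_w):
    """Average (time-weighted), maximum and minimum power over a transient trace."""
    time_s, power_w = np.asarray(time_s), np.asarray(power_w)
    avg = np.trapz(power_w, time_s) / (time_s[-1] - time_s[0])
    return avg, power_w.max(), power_w.min()

# e.g. t, p = np.loadtxt("sense_amp_power.txt", unpack=True)  # hypothetical export
# print(power_summary(t, p))
```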

Fig. 4 Simulation results of proposed sense amplifier

3.1.5 Simulation Results of 7-Transistor Sense Amplifier

The output waveforms of the 7T sense amplifier are as follows: when read enable is
high (5 V), the bit line is high (5 V) and the output is also high (5 V). When
read enable is low (0 V), the BIT line remains at the high-voltage state, but the
output is at the low-voltage state (0 V). As the read-enable voltage increases
gradually, the BIT line decreases gradually and remains at low voltage, and the
output remains at the low-voltage state (Fig. 4).

3.1.6 Conventional SRAM Architecture

See Fig. 5.

3.1.7 Schematic Explanation

The inner part of the circuit is called the memory part; it has two inverters in a
back-to-back connection. The inverters are formed using two PMOS (M1 & M2) and two
NMOS (M3 & M4) devices. The gates of M1 & M4 are shorted together, and the inputs and
outputs of the inverters are cross-connected. The drains of the two PMOS devices are
connected to VDD, and the sources of the two NMOS devices to GND. Another two NMOS
transistors (the access transistors), M5 & M6, have their gates shorted together; the
output of M5 is BIT, the output of M6 is BITBAR, and WL is connected to both gates.
The operation of the CMOS inverter is the same as that of a NOT gate. The memory part
has two inverters, whose outputs are taken as Q and Q bar. The memory part is
connected to the access transistors through lines called bit and bit bar; bit bar is
obtained from the inverter connected to bit. These lines are used for reading from or writing to the

Fig. 5 Conventional SRAM schematic

memory part. To access the bit and bit bar lines, the access transistors are used.
If word line (WL) = 1, the access transistors are ON; if WL = 0, they are OFF and
the memory is in the HOLD state. From the above statements, we can conclude that
READ and WRITE operations are possible only when WL = 1.

3.1.8 Read Operation

To read from the memory, bit and bit bar are used as output lines. Precharge
capacitors are used in the READ and WRITE operations. When Q = 1, Q bar = 0 and
WL = 1, the READ operation is possible, with bit and bit bar acting as output lines.
The capacitors act as precharge (VDD) capacitors for the READ operation. The bit bar
voltage will decrease, so its capacitor will discharge. The bit and bit bar values
are sent to the sense amplifier, which acts as a comparator: if the bit bar value
decreases, the output will be 1, and the memory is read successfully. When Q = 0,
Q bar = 1 and WL = 1, bit and bit bar again act as outputs, and the capacitors act
as precharge (VDD) capacitors for the READ operation. The bit voltage will decrease,
so its capacitor will discharge. The bit and bit bar values are sent to the sense
amplifier, which acts as a comparator: if the bit voltage decreases, the output = 0.

3.1.9 SRAM Write Operation

In the write operation, we have to change Q = 0 to Q = 1. Let us say the stored
memory value is zero, that is Q = 0 and Q bar = 1, and the word line WL = 1 because
we have to access the bit lines; the bit and bit bar lines are driven by Vpulse as
input lines, and the bit bar line is made ground for the voltage variations. When
the voltage decreases, transistors M1 and M2 take effect: if the applied DC voltage
is less than the threshold of M2, then M2 = OFF and M1 = ON, the node takes the
value of VDD, and Q = 1. Conversely, let Q = 0 and Q bar = 1: due to VDD and Q = 0,
the capacitor gets discharged because the voltage increases at both gates. If that
voltage is greater than the threshold voltage, then M4 = ON, Q bar = 0 and hence
Q stays 0; so we have to make sure the DC voltage is less than the threshold of M4.
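A compact behavioral sketch of the hold/read/write conditions described in Sects. 3.1.7-3.1.9 follows (WL gating, precharged bit lines, comparator-style sensing); it abstracts away the transistor thresholds, and the class name and the 0.1 V discharge step are illustrative assumptions.

```python
class Sram6TCell:
    """Behavioral 6T cell: WL = 0 holds, WL = 1 exposes the cell to BIT/BITBAR."""

    def __init__(self, q: int = 0):
        self.q = q  # stored bit; QBAR is its complement

    def read(self, wl: int, v_dd: float = 1.8):
        """Precharge both bit lines to VDD; the cell pulls one side down."""
        if wl == 0:
            return None            # HOLD state: access transistors off
        v_bit = v_dd - (0.1 if self.q == 0 else 0.0)     # Q = 0 discharges BIT
        v_bitbar = v_dd - (0.1 if self.q == 1 else 0.0)  # Q = 1 discharges BITBAR
        return 1 if v_bit > v_bitbar else 0              # sense amplifier as comparator

    def write(self, wl: int, bit: int):
        if wl == 1:
            self.q = bit  # bit lines driven hard enough to flip the latch

cell = Sram6TCell(q=0)
cell.write(wl=1, bit=1)
assert cell.read(wl=1) == 1
```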

3.1.10 Power Results

The average power consumed is around 2.685001e−04 W, the maximum power is
1.227270e−02 W, and the minimum power consumption is 5.882367e−09 W.

3.1.11 Output Waveforms

The waveform is plotted with time on the X-axis and voltage on the Y-axis. From the
graph, it is clear that when the word line is high, the bit line is low, and that Q
and Q bar are opposite to each other. From 0 to 47.30 µs, WL is high (4 V) and the
bit line is low (0 V); Q is high (4 V) and QBAR is low (0 V). After 47.30 µs, WL
decreases gradually, reaches 0 V, and remains constant up to 47.40 µs; in that
period, the bit line initially increases and saturates until 47.35 µs, then decreases
and remains in the low state up to 47.40 µs. Q and QBAR remain constant. This is
repeated periodically (Fig. 6).

Fig. 6 Simulation results of conventional SRAM



Fig. 7 Proposed 3T SRAM cell architecture

3.1.12 Proposed 3T SRAM Architecture

A standard SRAM cell has two bit lines for the read and write operations and thus
consumes more power. When the write line is changed, BL also changes, and the change
appears at the output as 0 or 1. Here, 3 transistors, M1, M2 & M3, are used to store
the bits of data; M1 and M3 are NMOS, and M2 is PMOS. The cell has one bit line and
one write line, which always run in the same direction. Here, we used 3 spice
sources, V1, V2 and V3, all referenced to ground: V1 = 1.8 V, V2 drives the BIT line,
and V3 drives the write line. The transistor length (L) is 45 nm and the width (W)
is 90 nm for the NMOS devices; for the PMOS device, the length is 45 nm and the
width is 180 nm (Fig. 7).

3.1.13 Test Bench of 3T SRAM

To simulate the 3T SRAM, the values are given as follows: the DC voltage of the 3T
SRAM cell is 1.8 V, and the inputs are V1 = 5 V and V2 = 0 V. The period of the 3T
SRAM cell is 100 ns, the delay time is 0 ns, the rise time is 5 ns, the fall time is
5 ns, and the pulse width is 45 ns.

3.1.14 Power Analysis

For the 3T cell, the simulation time period is 200 ns, and the average power
consumed is 7.251771e−08 W. The maximum

Fig. 8 Simulation results of proposed 3T SRAM cell

Table 1 Experimental results of conventional and proposed sense amplifier-based SRAM cells

| S. No. | Parameter | Sense amplifier (conventional) | Sense amplifier (proposed) | SRAM (conventional) | SRAM (proposed) |
| 1 | Transistor count | 10T | 7T | 6T | 3T |
| 2 | Power consumption | 10 × e14 | 1.2 × e09 | 2.6 × e−3 | 7 × e−8 |

power consumed is 2.445337e−05 W at time 5e−09 s, and the minimum power consumed
is 7.496473e−11 W at time 1.28418e−07 s, where Bl:v is the bit line, Wl:v the write
line, and out:v the output (Fig. 8).

4 Experimental Results and Comparison

See Table 1.

5 Conclusion and Future Scope

A novel high-speed and highly reliable sense-amplifier-based 3T SRAM is proposed
for low supply voltage (VDD) operation. With the proposed method, the area of the
SRAM is reduced by 50% and the speed increases by up to 40%, with a reduced power
dissipation of 7 × e−8 W. The proposed 3-transistor SRAM is applicable in all
analog and digital circuits for loading large data-processing systems.

References

1. Kaul H, Anders M, Hsu S, Agarwal A, Krishnamurthy R, Borkar S (2012) Near-threshold voltage
(NTV) design—opportunities and challenges. In: Proceedings of 49th ACM/EDAC/IEEE
Design Automation Conference (DAC), June 2012, pp 1149–1154
2. Jeon D, Seok M, Chakrabarti C, Blaauw D, Sylvester D (2012) A super-pipelined energy efficient
subthreshold 240 MS/s FFT core in 65 nm CMOS. IEEE J Solid-State Circuits 47(1):23–34
3. Suzuki Y, Odagawa K, Abe T (1973) Clocked CMOS calculator circuitry. IEEE J Solid-State
Circuits SSC-8(6):462–469
4. Gerosa G et al (1994) A 2.2 W, 80 MHz superscalar RISC microprocessor. IEEE J Solid-State
Circuits 29(12):1440–1454
5. Markovic D, Nikolic B, Brodersen RW (2001) Analysis and design of low-energy flip-flops. In:
International Symposium on Low Power Electronics and Design, pp 52–55
6. Sandeep P, Harsha Vardhini PA, Prakasam V (2020) SRAM utilization and power consumption
analysis for low power applications. In: 2020 International conference on recent trends on
electronics, information, communication & technology (RTEICT), Bangalore, India, 2020, pp
227–231. https://doi.org/10.1109/RTEICT49044.2020.9315558
Design and Implementation of Low
Power GDI-LFSR at Low-Static Power
Consumption

Velguri Suresh Kumar and Tadisetty Adithya Venkatesh

Abstract The gate diffusion input (GDI) technique plays a vital role in very large scale
integration circuits. GDI modeling is a low-power technique which is used to design
all digital engineering blocks. This technique is used to design universal gates
using only two CMOS transistors. It improves the silicon area as well as giving
less power consumption and more power delivered to the load. IDDQ and IDDT
fault-detection testing techniques are also proposed to detect faults so as to
operate at low static current consumption in GDI_CMOS_LFSR circuits. Previous work
was carried out using 130 nm technology; the proposed LFSR has been designed and
simulated using Tanner CMOS 45 nm Mentor Graphics technology. In the proposed
technique, parameters such as the total number of transistors required to design the
LFSR are reduced, and the rise time, fall time, pulse width and noise analysis are
calculated with fault detection. Comparative analysis carried out along with IDDQ
current-testing methods between the conventional LFSR and the GDI_LFSR shows 50%
area and 20% power reduction for the GDI technique, respectively. This paper reports
these innovative low-power techniques and their impact on future VLSI applications.

Keywords Gate diffusion input · Low power · IDDQ · Rise time and fall time ·
LFSR · Signature analyzer

1 Introduction

GDI is a low-power digital circuit design technique used to reduce area, power
dissipation and delay, and to increase speed; it also reduces circuit complexity.
As shown in Fig. 1, the basic GDI cell has three

V. S. Kumar · T. A. Venkatesh (B)


Department of ECE, Maturi Venkata Subba Rao Engineering College, Hyderabad, Telangana,
India
V. S. Kumar
e-mail: vsureshkumar_ece@mvsrec.edu.in


Fig. 1 CMOS GDI D flip flop

inputs: G is the common input to both PMOS and NMOS [1], and P & N are inputs to the
source/drain of the PMOS/NMOS. The advantage of the GDI technique [2] over conventional
CMOS technology is that the N, P & G terminals can be given a supply Vdd, be grounded,
or be supplied with an input signal depending on the circuit to be designed, hence
minimizing the transistor count. Another technique that is used to test semiconductor
devices is the IDDQ test. The IDDQ test [3–8] is able to find soft errors that do not
affect the logical output, such as certain types of coupling and shorts in the system.
These errors often waste the power of the system and can lead to more severe issues
such as data-retention faults. The IDDQ test works by monitoring the quiescent current
of the circuit, which is the current drawn by the system when it is fully powered but
not active. This current is usually extremely small. If there is an error, such as a
break, in the system, the current will rise by a few orders of magnitude. By
monitoring the current for this change, certain defects that are otherwise
undetectable can be found. The IDDQ test is not commonly used in SRAM testing [9–11]
for a number of reasons, including sizing and scaling. Due to decreasing transistor
size, the leakage current and parameter variation make it very difficult to detect
errors accurately. The IDDQ test has only been successfully used for devices that
have a channel length of 0.25 µm and above, which is much larger than current
technology. In addition, changes must be made to either the SRAM cell or the row
decoders in order to access the whole memory cell, which increases the footprint
of the chip.
Despite this fact, several methods have been proposed for adapting the IDDQ test
for current technology, but as sizing continues to decrease, the IDDQ test is becoming
less reliable for accurate fault detection. Although it is not always a useful method
for determining faults, IDDQ testing is still used to evaluate current flow and power
consumption within memory devices.

2 Literature Survey

The main challenging areas in VLSI are performance, cost, testing, area, reliability
and power. The demand for comparatively low-priced and portable computing
devices and communication systems is increasing rapidly. These applications
require low power dissipation in VLSI circuits [12]. There are two main sources of
power dissipation in digital circuits: static (mainly due to leakage current,
whose contribution to total power dissipation is very small) and dynamic (due to
switching, i.e., the power consumed by short-circuit current flow and the charging
of load capacitances). Hence, it is an important aspect to optimize
power during testing; power optimization is one of the main challenges.
Various low-power approaches have been proposed to solve the problem of
power dissipation, i.e., to decrease the power supply voltage, the switching frequency
and the transistor capacitance during testing. This paper presents one such
approach, an LFSR counter [13–16] with a low-power architecture. The
LFSR is used in a variety of applications such as built-in self-test (BIST), cryptography,
error-correction codes and, in the field of communication, the generation of
pseudo-noise sequences. Nowadays, LFSRs are present in nearly every coding scheme, as they
produce sequences with good statistical properties and can be easily analyzed;
moreover, they have a low-cost realization in hardware. Counters like binary and Gray
counters suffer from problems of power consumption, glitches, speed and delay because
they are implemented with techniques which have the above drawbacks. They produce not only
glitches, which increase power consumption, but also design complexity. The
propagation delay of existing techniques is high, which reduces the speed and
performance of the system. LFSR counters implemented using different CMOS
technologies overcome these problems [17–22].

3 Implementation

3.1 CMOS D Flip Flop

The effect is that the D input condition is only copied to the output Q when the clock
input is active. This then forms the basis of another sequential device called the D flip
flop. The "D flip flop" will store and output whatever logic level is applied to its data
terminal so long as the clock input is HIGH.

Fig. 2 CMOS GDI D flip flop power analysis

Fig. 3 CMOS D flip flop simulation results

Figure 1 shows the CMOS GDI D flip flop. The pull-up network is designed with PMOS
and the pull-down network with NMOS logic. It is designed with the Tanner
S-EDIT tool.
Figure 2 shows the power analysis of the CMOS D flip flop, analyzed using the Tanner
tool. The average power consumed is around 3.80e−004 W, the maximum power consumed is
around 2.26e−002 W, and the minimum power consumed is around 6.12e−009 W. The above
simulation is done with 1.8 V VDD (Fig. 3).
Figure 3 shows the simulation results of the CMOS D flip flop. It is simulated using
the Tanner T-SPICE (simulation program with integrated circuit emphasis) tool, and
the waveforms are analyzed using W-EDIT (the waveform editor).

3.2 CMOS GDI Linear Feedback Shift Register (LFSR)

We have designed a three-bit LFSR (linear feedback shift register). The output of the
NAND gate is given as input to the first D flip flop; the output of the first D flip flop
is given to the input of the second D flip flop and to an input of the NAND gate. The
output of the second D flip flop is given as the input to the third D flip flop and as
an input to the NAND gate (Fig. 4).
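A minimal behavioral sketch of this three-bit register follows. The exact NAND tap connections are ambiguous in the schematic description, so they are left as a parameter; with taps (0, 2) the model reproduces the Q0/Q1/Q2 states reported for clock pulses two to four in the waveform discussion below.

```python
def lfsr3(state=(0, 1, 1), taps=(0, 2), n=6):
    """3-bit shift register with NAND feedback.

    state: (Q0, Q1, Q2). Each clock shifts Q0 -> Q1 -> Q2 and loads Q0
    with the NAND of the tapped outputs; NAND (rather than XOR) feedback
    also lets the register escape the all-zeros state.
    """
    q = list(state)
    seq = [tuple(q)]
    for _ in range(n):
        fb = 1 - (q[taps[0]] & q[taps[1]])  # NAND of the two tapped outputs
        q = [fb, q[0], q[1]]                # shift through the D flip flops
        seq.append(tuple(q))
    return seq

print(lfsr3())  # (0, 1, 1) -> (1, 0, 1) -> (0, 1, 0) -> ...
```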
Figure 5 shows the power analysis of the LFSR using the GDI CMOS D flip flop. It

Fig. 4 CMOS GDI LFSR

Fig. 5 CMOS LFSR power analysis

is analyzed using the Tanner tool. The average power consumed is around 4.00e−004 W,
the maximum power consumed is 2.79e−002 W, and the minimum power consumed is around
3.85e−008 W. The above simulation is done with 1.8 V VDD.
When the clock pulse is two, the output Q0 will be zero, Q1 one and Q2 one. When the
clock pulse is three, Q0 will be one, Q1 zero and Q2 one. When the clock pulse is four,
Q0 will be zero, Q1 one and Q2 zero. Figure 6 shows the simulation results of the CMOS
linear feedback shift register (LFSR). It is simulated using the Tanner T-SPICE tool,
and the waveforms are analyzed using W-EDIT (the waveform editor).

3.3 IDDQ Test

IDDQ test schematic: The schematic in Fig. 7 represents the IDDQ test. IDDQ testing
is a method for testing CMOS integrated circuits for the presence of manufacturing
faults. It depends on measuring the supply current (Id) in the quiescent state.

Fig. 6 CMOS LFSR simulation result

IDDQ testing uses the principle that in a quiescent CMOS digital circuit, there
is no current path between the power supply and ground. The most common semiconductor
is silicon. Semiconductor manufacturing faults will cause the current to increase by
orders of magnitude, which can be easily detected. IDDQ testing is somewhat more
complex than just measuring the supply current. The reasons to adopt IDDQ are that
test generation is fast, test application time is short, and the area and design-time
overhead are very low. It catches some defects that other tests, particularly stuck-at
logic tests, do not (Fig. 7).
The schematic contains two inverters, each built from a PMOS and an NMOS transistor.
The PMOS devices are represented as M1 and M3, and the NMOS devices as M2 and M4.
Transistors M1 and M2 are connected to the input, and the output of the first
inverter is connected to the input of the second inverter. The

Fig. 7 Schematic of IDDQ test



output is taken from the second inverter. There are two pulse voltage switch are taken
from the Vpulse in which one is connected at the input, and other is connected at
the output. VDD and VSS are taken from the miscellaneous that are connected to
the circuit (i.e., VSS is connected at PMOS and ground is connected to NMOS).
The above schematic also contains the dc supply and pulse waveform. The function
of IDDQ is same as the inverter. When the input = 1, the output of the inverter 1
is 0. When the input is 1 then PMOS is off and NMOS is on and the output will
be equal to 0. As the output of inverter 1 is connected to input of inverter 2 so that
the input of the inverter 2 is 0. As the input of the inverter 2 is 0 then PMOS is
on, and NMOS is off then output is 1. Due to some faults like open faults (struck
at 0), closed faults (struck at 1), transistor gate-oxide short circuits and unpowered
interconnection opens current will flow from inverter 2 PMOS to inverter 1 NMOS.
Here, measurable amount of current is flow through the transistor. When the input
= 1 the output should be zero then the drain current (I d ) should be zero. There is no
voltage drop across the circuit then the output is 5 V. But in IDSS some current is
flows from drain to drain that current is called as quiescent current. The advantages
in IDDQ are quality of an IC, identification of bridge faults; it is also used for failure
test analysis. The disadvantages of IDDQ are time consuming and more expensive.
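The pass/fail principle reduces to a simple threshold on the measured quiescent current; a toy sketch follows, in which the 1 µA limit and the example currents are assumptions chosen only to illustrate the orders-of-magnitude gap.

```python
def iddq_pass(iddq_amps: float, threshold_amps: float = 1e-6) -> bool:
    """IDDQ screen: a defect-free quiescent CMOS circuit draws only leakage;
    a bridging or gate-oxide defect raises IDDQ by orders of magnitude."""
    return iddq_amps < threshold_amps

print(iddq_pass(5e-9))  # ~5 nA leakage        -> True (pass)
print(iddq_pass(3e-4))  # ~0.3 mA defect path  -> False (fail)
```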
Test bench of IDDQ: The test bench of IDDQ specifies the transient analysis and the
source properties: a fall time of about 5 ns, a delay of 0, a frequency of 5 MHz, a
time period of about 200 ns, a pulse width of 95 ns, a rise time of 5 ns, and VHigh
and VLow of 5 V and 0 V, respectively, taken from spice. The system parameters of
IDDQ are: angle about 0 degrees, MasterCell Vpulse, and Master Library
SPICE_Sources (Fig. 8).
IDDQ test: The simulation in Fig. 8 represents the IDDQ test. The input is In and
the output is Out, both represented in volts. In is 1 from 0 to 105 ns and changes
to In = 0 from 105 to 200 ns. Later,

Fig. 8 Simulation of IDDQ test



Table 1 Comparison of conventional and GDI_LFSR modules

| GDI-LFSR module | Power consumed | Conventional LFSR module | Power consumed |
| D_FF | 2.4 × 10−4 W | D_FF | 3.24 × 10−12 W |
| XOR GATE | 1.4 × 10−4 W | XOR GATE | 9.7 × 10−4 W |
| LFSR | 1.12 × 10−15 W | LFSR | 73.3 × 10−3 W |

In changes to 1 from 200 to 301 ns, to In = 0 from 301 to 400 ns and to In = 1 from
400 to 500 ns. Out = 1 from 0 to 105 ns, then Out = 0 from 105 to 200 ns, Out =
1 from 200 to 301 ns, Out = 0 from 301 to 400 ns and Out = 1 from 400 to 500 ns.
The simulated IDDQ test circuit thus follows the functionality of the inverter
pair (Table 1).

4 Conclusion

In this paper, we have designed and analyzed the GDI LFSR using D flip-flops, and we
have analyzed the power consumed by the LFSR and its different blocks together with
IDDQ analysis. The average power consumed is around 1.12 × 10−15 W, the maximum
power consumed is around 2.79e−002 W, and the minimum power consumed is around
3.85e−008 W; the simulation is done with 1.8 V VDD. An extension of this work would
be to optimize the final LFSR block to reduce the power consumption and the area by
reducing the transistor count at the transistor level. This optimized LFSR can be
used in different applications as well as for test-pattern generation by reducing
the faults.

References

1. Bronzi D, Zou Y, Villa F, Tisa S, Tosi A, Zappa F (2016) Automotive three-dimensional vision
through a single-photon counting SPAD camera. IEEE Trans Intell Transp Syst 17(3):782–795
2. Vornicu I, Carmona-Galán R, Rodríguez-Vázquez A (2014) A CMOS 0.18 µm 64 × 64 single
photon image sensor with in-pixel 11b time to- digital converter. In: Proceedings of International
Semiconductor Conference (CAS), pp 131–134
3. Palagiri H, Makkena M, Chantigari KR (2013) Design development and performance
analysis of high speed comparator for reconfigurable ΣΔ ADC with 180 nm TSMC technology.
In: 15th International conference on advanced computing technologies (ICACT 2013)
4. Perenzoni M, Perenzoni D, Stoppa D (2017) A 64 × 64-pixels digital silicon photomultiplier
direct TOF sensor with 100-MPhotons/s/pixel background rejection and imaging/altimeter
mode with 0.14% precision up to 6 km for spacecraft navigation and landing. IEEE J Solid-State
Circuits 52(1):151–160
5. Pavia JM, Scandini M, Lindner S, Wolf M, Charbon E (2015) A 1 × 400 backside-illuminated
SPAD sensor with 49.7 ps resolution, 30 pJ/sample TDCs fabricated in 3D CMOS technology
for near-infrared optical tomography. IEEE J Solid-State Circuits 50(10):2406–2418

6. Niclass C, Soga M, Matsubara H, Ogawa M, Kagami M (2014) A 0.18-µm CMOS SoC for a
100-m-range 10-frame/s 200 × 96-pixel time-of-flight depth sensor. IEEE J Solid-State Circuits
49(1):315–330
7. Palagiri AHV, Makkena ML, Chantigari KR (2016) An efficient on-chip implementation of
reconfigurable continuous time sigma delta ADC for digital beamforming applications. In:
Advances in intelligent systems and computing, vol 381, pp 291–299
8. Kim J, Park S, Hegazy M, Lee S (2013) Comparison of a photon counting-detector and a
CMOS flat-panel-detector for a micro-CT. In: IEEE Nuclear Science Symposium and Medical
Imaging Conference Proceedings (NSS/MIC), pp 1–4
9. Palagiri HV, Makkena ML, Chantigari KR (2012) Performance analysis of first order digital
sigma delta ADC. In: 2012 Fourth international conference on computational intelligence,
communication systems and networks, Phuket, pp 435–440
10. Mo H, Kennedy MP (2017) Masked dithering of MASH digital delta sigma modulators with
constant inputs using multiple linear feedback shift registers. IEEE Trans Circuits Syst I Reg
Pap 64(6):1390–1399
11. HarshaVardhini PA, Madhavi Latha M, Krishna Reddy CV (2013) Analysis on digital imple-
mentation of Sigma-Delta ADC with passive analog components. Int J Comput Dig Syst
(IJCDS) 2(2):71–77
12. Sham KJ, Bommalingaiahnapallya S, Ahmadi MR, Harjani R (2008) A 3 × 5-Gb/s multilane
low-power 0.18-µm CMOS pseudorandom bit sequence generator. IEEE Trans Circuits Syst
II Exp Briefs 55(5):432–436
13. Ajane A, Furth PM, Johnson EE, Subramanyam RL (2011) Comparison of binary and LFSR
counters and efficient LFSR decoding algorithm. In: Proceedings of IEEE 54th International
Midwest Symposium on Circuits and Systems (MWSCAS), Aug 2011, pp 1–4
14. Vardhini PAH (2016) Analysis of integrator for continuous time digital sigma delta ADC
on Xilinx FPGA. In: International conference on electrical, electronics, and optimization
techniques, ICEEOT 2016, pp 2689–2693
15. Kłosowski M, Jendernalik W, Jakusz J, Blakiewicz G, Szczepański S (2017) A CMOS pixel
with embedded ADC, digital CDS and gain correction capability for massively parallel imaging
array. IEEE Trans Circuits Syst I Reg Pap 64(1):38–49
16. Palagiri HV, Makkena ML, Chantigari KR (2013) Optimum decimation and filtering for
reconfigurable sigma delta adc. Far East J Electron Commun 11(2):101–111
17. Clark DW, Weng L-J (1994) Maximal and near-maximal shift register sequences: efficient
event counters and easy discrete logarithms. IEEE Trans Comput 43(5):560–568
18. Harsha Vardhini PA, Madhavi Latha M (2015) Power analysis of high performance FPGA low
voltage differential I/Os for SD ADC architecture. Int J Appl Eng Res 10(55):3287–3292
19. Abbas TA, Dutton NAW, Almer O, Finlayson N, Rocca FMD, Henderson R (2018) A CMOS
SPAD sensor with a multi event folded flash time-to-digital converter for ultra-fast optical
transient capture. IEEE Sens J 18(8):3163–3173
20. Tetrault M-A, Lamy ED, Boisvert A, Fontaine R, Pratte J-F (2013) Low dead time digital SPAD
readout architecture for realtime small animal PET. In: IEEE nuclear science symposium and
medical imaging conference (NSS/MIC), Oct 2013, pp 1–6
21. Vardhini PAH, Makkena ML (2021) Design and comparative analysis of on-chip sigma delta
ADC for signal processing applications. Int J Speech Technol 24:401–407
22. Krstajić N et al (2014) A 256 × 8 SPAD line sensor for time resolved fluorescence and Raman
sensing. In: Proceedings of 40th European Solid State Circuits Conference (ESSCIRC), pp
143–146
Lion Optimization Algorithm
for Antenna Selection in Cognitive Radio
Networks

Mehak Saini and Surender K. Grewal

Abstract Overlay cognitive radio networks have proven to be efficient in 5G wireless
communication systems owing to their capability of intelligent utilization of the
precious electromagnetic spectrum. In this research paper, the lion optimization
algorithm is used to optimize the transmit antenna selection technique used by the
cognitive radio to send information to its own user as well as the main channel's
user at the same time. Results in the form of bit error rate versus signal-to-noise
ratio graphs are also depicted.

Keywords AS · CR · LOA

1 Introduction

As stated by the Federal Communications Commission (FCC), on average about
15–85% of the licensed spectrum band is utilized [1]. To overcome the limitations
of the already existing wireless networks and meet the needs of the ever-growing
number of mobile devices, newer technologies have to be incorporated. Though
cognitive radio networks have been in use for a long time, when combined with
various multiple-input multiple-output (MIMO) techniques and newer modulation
schemes such as non-orthogonal multiple access (NOMA), they can prove to
be one of the key technologies for future-generation wireless networks (Fig. 1).
As given in Fig. 2a, cognitive radio systems can be classified broadly into three
main categories:
Interweave cognitive radio networks: These devices do not cause any interference
to the primary users, as they utilize the primary user's spectrum only for the time
duration in which the primary users are not using it. These unused bands of the
primary users' spectrum are given the name of spectrum holes.

M. Saini (B) · S. K. Grewal


D. C. R. University of Science and Technology, Murthal, India
S. K. Grewal
e-mail: skgrewal.ece@dcrustm.org


Fig. 1 Cognitive radio networks existing simultaneously with the main network [2]

Fig. 2 a Classification of cognitive radio networks, b various types of cognitive radio networks,
i.e., interweave, underlay and overlay [3]

Underlay cognitive radio networks: These networks involve the operation of both
primary and secondary users simultaneously over the same spectrum. To avoid
interference between the two different types of users, i.e., the main channel's users
and the cognitive radio users, a certain interference threshold has to be maintained.
Overlay cognitive radio networks: These networks also require both the primary and
secondary users to operate simultaneously. However, the difference from the underlay
method is that here the secondary user needs to have information regarding the primary
user's data in order to cancel it and decode its own data [3].
Non-orthogonal multiple access (NOMA) is a multiple-access technique which permits
multiple users to coexist in the same time–frequency resource block by communicating
in the power domain. The user with the best channel condition (highest SNR) is
allocated the lowest power, as it will be able to decode its original signal correctly
even with the least power by performing successive interference cancelation (SIC).
The user with the worst channel is allocated the maximum power so that it can decode
its signal directly by treating the rest as noise. So, each user receives the same
signal, which is a superposition of the signals intended for all the users, weighted
according to the amount of power allocated to each signal [4].
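A minimal two-user power-domain NOMA sketch follows (superposition coding at the transmitter, direct decoding for the weak user, SIC for the strong user), using BPSK with illustrative power splits and noise level; none of these numbers come from the paper's simulation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
b_weak, b_strong = rng.integers(0, 2, n), rng.integers(0, 2, n)
s_weak, s_strong = 2.0 * b_weak - 1, 2.0 * b_strong - 1   # BPSK symbols

a_weak, a_strong = 0.8, 0.2             # more power to the worse channel
x = np.sqrt(a_weak) * s_weak + np.sqrt(a_strong) * s_strong  # superposition

y = x + 0.1 * rng.standard_normal(n)    # strong user's (good) channel

# Weak user's symbol is decoded first, treating the strong user's as noise
b_weak_hat = (y > 0).astype(int)
# SIC: subtract the re-encoded weak-user component, then decode own data
y_sic = y - np.sqrt(a_weak) * (2.0 * b_weak_hat - 1)
b_strong_hat = (y_sic > 0).astype(int)

print("BER strong user:", np.mean(b_strong_hat != b_strong))
```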
Spatial modulation [5] and transmit antenna selection [6] are closed-loop MIMO
techniques. In MIMO wireless communication systems, spatial modulation is the
method of transmitting information not only from the antenna but also via the antenna
index, which increases the efficiency of the overall system. Transmit antenna selection
methods aim at finding the best antenna or antennas out of a set of multiple antennas
at the transmitting side of the communication system.

2 Lion Optimization Algorithm

The lion optimization algorithm, abbreviated as LOA, constitutes an optimization
method based on the behavior of lions and their social organization. The
algorithm starts with a small, randomly generated set of solutions (termed 'lions').
Some lions of the initial population act as nomadic lions, and the remaining form the
resident population, which is partitioned in a random fashion into a number of different
subsets, termed prides. A certain percentage (denoted by S) of each pride's population
is taken as female and the remainder as male; in nomads, however, the percentage of
females is (1 − S)%.
The best solution obtained by every lion in past iterations is called its best
visited position, and it is progressively updated over the entire optimization process.
The territory of a pride consists of the best visited positions of all its members.
In every pride, some randomly selected females are given the task of hunting
prey in a group in order to gather food for their pride. The hunters move toward the
prey to encircle and catch it, while the remaining females move to other positions or
parts of the territory. The male lions of a pride roam within its territory. The
females of the prides mate with one or some of the resident males. Every pride
excludes the young males from their maternal pride once they reach maturity, and they
turn nomadic; the power of these young males is considered to be lower than that
of the pride's resident males.

Pseudo code of LOA

1. Generate the initial population of lions (Npop) randomly.

2. Initiate the prides: randomly choose a certain percentage of the initial population as nomads and randomly partition the
remaining lions into a fixed number of prides, thus forming the territory of each pride. In every pride, take
some percentage (S) of the members as female and the remaining as males. In nomads, take the percentage of
females as (1-S)% and the remaining as males.

3. For every pride, perform the following steps:
   a. Make some female lions (selected randomly) go hunting, and make the remaining female lions of the pride
      move towards the best selected position of the territory.
   b. In prides, mate a fixed percentage (depending on mating probability) of females with resident males (new cubs
      becoming mature).
   c. Drive the weakest male lion out of the pride so that it becomes a nomad.

4. a. Make male as well as female nomad lions move randomly in the search space, and mate a fixed percentage
      (depending on the probability of mating) of the nomadic females to the best nomadic male (new cubs becoming
      mature).
   b. Make male nomads randomly attack the prides.

5. Convert some (depending upon the immigrate rate) female lions from each pride into nomads.

6. a. Sort all nomad lions (males and females) on the basis of fitness value, select the best females among them, and
      distribute them to prides to fill the empty places created by migrated females.
   b. Remove the nomad lions having minimum fitness value, keeping in view the maximum number of permitted lions of
      each gender.

7. Go to the third step if the desired results are yet to be obtained.

Fig. 3 Pseudocode of LOA [7]

A nomadic lion (either male or female) moves randomly in the search space in order
to find a better place (solution). There is a possibility that a stronger nomadic male
invades a resident male, causing the latter to be evicted from the pride while the
nomad becomes the new resident male. For evolution, a few resident females keep
immigrating to new prides after some time, or they may turn nomadic (switching their
lifestyles), and a few nomadic females may join prides. The weakest lion is killed or
dies depending on various factors, such as lack of competition and food. This
entire process is carried out in a loop until the stopping criterion is met.
The overall optimization process of the algorithm is represented in the pseudocode
given in Fig. 3. In the present paper, this algorithm is utilized to optimize the
process of transmit antenna selection.
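As an illustration only, the skeleton below shows how a population-based optimizer of this kind can drive transmit antenna selection: each "lion" encodes a candidate antenna subset, and its fitness would be the (negated) BER from a link simulation. The pride/nomad dynamics of Fig. 3 are collapsed into a simple perturb-and-keep-better move, and the fitness function is a hypothetical stand-in; only the population size (10) and iteration budget (100) are taken from Table 3.

```python
import random

N_T, N_SEL = 8, 2               # transmit antennas available / to select (assumed)

def fitness(subset):
    """Stand-in objective; in practice, return -BER from a link simulation."""
    return sum(1.0 / (i + 1) for i in subset)

def random_lion():
    return tuple(sorted(random.sample(range(N_T), N_SEL)))

population = [random_lion() for _ in range(10)]   # population size as in Table 3
best = max(population, key=fitness)
for _ in range(100):                              # iteration budget as in Table 3
    for k, lion in enumerate(population):
        cand = set(lion)
        cand.discard(random.choice(lion))         # "move": swap one antenna index
        cand.add(random.randrange(N_T))
        cand = tuple(sorted(cand))
        if len(cand) == N_SEL and fitness(cand) > fitness(lion):
            population[k] = cand
    best = max(population + [best], key=fitness)

print("selected antennas:", best)
```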

3 LOA for Transmit Antenna Selection (TAS) in Cognitive Radio Networks

The system model of the proposed work is depicted in Fig. 4. PT and PR stand for the
primary transmitter and the primary receiver; these users form part of the main or
primary channel for which the spectrum is allocated, i.e., they constitute the licensed
band. PT has multiple transmitting antennas (Np), and PR has multiple receiving

Fig. 4 A model for cognitive radio [8]

antennas (Nr1). Similarly, ST is the secondary transmitter, having Nr receiving-end
antennas and Nt transmitting-end antennas. SR is short for secondary receiver, which
uses Nr2 receiving antennas. Together, ST and SR form part of the cognitive
radio network. The dotted lines depict that there can be multiple secondary
receivers; here, only one is chosen for sending data. The secondary transmitter
acts as a relay by first receiving the signal from the primary transmitter and then
sending that signal to the primary receiver together with the data intended for
the secondary receiver simultaneously, using NOMA and SM. Further, antenna
selection is done on the link between the secondary transmitter and the primary
receiver to enhance the performance (Fig. 4).

4 Results

Simulation of the cognitive radio model was performed in MATLAB R2018a. Two
datasets were chosen for simulation: the first used 40,000 bits of randomly generated
binary data, and the second used 60,000 bits of binary data.
In Fig. 5, four cases are shown in which the BER versus SNR graph is drawn by
calculating the BER over a range of SNR values from −10 to +10. It is observed
that as the SNR increases, the BER converges to zero. The statistical values
calculated for this dataset are given in Table 1.
Similarly, for dataset 2, the BER versus SNR graphs for four cases are shown in
Fig. 6a–d, and the statistical results obtained after simulation are given in
Table 2.
We have achieved the objective of error minimization. Table 3 summarizes the
chosen optimization parameters.

Fig. 5 a–d BER versus SNR graph for first dataset (i.e., for 40,000 bits) for four cases

Table 1 Optimization statistical report

| Best | 1.4082 |
| Worst | 224.99 |
| Mean | 53.624 |
| Median | 38.922 |
| STD (standard deviation) | 48.868 |

5 Conclusions and Future Scope

This work shows the potential of using the lion optimization algorithm for the task of
transmit antenna selection in a cognitive radio network. It can be concluded that this
technique results in improved performance, as the BER converges as the SNR is increased
with the usage of the optimization technique. Further optimization techniques can be
applied to compare with the results given in this paper. Also, apart from
optimization-driven techniques, data-driven techniques can be applied to further
enhance the results.

Fig. 6 a–d BER versus SNR graph for second dataset (i.e., for 60,000 bits) for four cases

Table 2 Optimization statistical report

| Best | 2.3605 |
| Worst | 148.19 |
| Mean | 39.556 |
| Median | 30.506 |
| STD (standard deviation) | 34.919 |

Table 3 Optimization parameters

| Population size | 10 |
| Chromosome length | [24 32 48 64] |
| Maximum number of iterations | 100 |
| Mature age | 3 |
| Mutation rate | 0.15 |

References

1. FCC Spectrum Policy Task Force (2002) Report of the spectrum efficiency working group
2. Tellambura C, Kusaladharma S (2017) An overview of cognitive radio networks. Wiley
encyclopaedia of electrical and electronics engineering
3. Akyildiz IF, Lee WY, Vuran MC, Mohanty S (2006) NeXt generation/dynamic spectrum
access/cognitive radio wireless networks: a survey. Comput Netw 50(13):2127–2159
4. Sun Q, Han S, Chin-Lin I, Pan Z (2015) On the ergodic capacity of MIMO NOMA systems.
IEEE Wireless Commun Lett 4(4):405–408
5. Abdzadeh-Ziabari H, Champagne B (2020) Signal detection algorithms for single carrier
generalized spatial modulation in doubly selective channels. Sig Process 172
6. Kai X, Tao L, Chang-chuan Y, Guang-xin Y (2004) Transmit antenna selection for MIMO
systems. In: IEEE 6th Circuits and Systems Symposium on Emerging Technologies: Frontiers
of Mobile and Wireless Communication, vol 2, pp 701–704, China
7. Yazdani M, Jolai F (2016) Lion optimization algorithm (LoA): a nature-inspired metaheuristic
algorithm. J Comput Design Eng 3(1):24–36
8. Emam S, Celebi ME (2018) Non-orthogonal multiple access protocol for overlay cognitive
radio networks using spatial modulation and antenna selection. AEU-Int J Electron Commun
86:171–176
Sub-band Selection-Based
Dimensionality Reduction Approach
for Remote Sensing Hyperspectral
Images

S. Manju and K. Helenprabha

Abstract Hyperspectral image classification and segmentation are unique techniques
for examining diversified land cover images. Land cover classification concerns the
main resource controlling productivity for terrestrial ecosystems, and its challenges
are subject to the curse of dimensionality, termed the Hughes phenomenon. Nowadays, it
is vital to distinguish how land cover has changed over time. Over the globe, land
area is confined to only 29% of the surface, while the majority share is covered by
an enormous spread of water. Humans have to depend upon less than one-third of the
surface, around 148,300,000 km², as the habitable part. It is estimated
that more than two-thirds of the land cover area is not suitable for habitation:
approximately 27% of the land area is characterized by cold environments, dry and
extreme weather conditions account for 21% of the land surface, and nearly 4% of the
total land surface has uneven geographic conditions. Hence, land is an inimitable and
ubiquitous resource. This research is focussed on land use vegetation classification
using AVIRIS hyperspectral images. The contiguous narrow bandwidth of hyperspectral
data facilitates detailed classification of land cover classes. The issues in
utilizing hyperspectral data are that the bands are normally redundant, strongly
correlated and subject to the Hughes phenomenon. This paper proposes sub-band
selection techniques to overcome the Hughes phenomenon in classification.

Keywords Hyperspectral image · Hughes phenomenon · Feature extraction ·


Sub-band selection · GLCM parameters · Linear and nonlinear methods

S. Manju (B)
Department of Medical Electronics, Saveetha Engineering College, Thandalam,
Chennai 602105, India
K. Helenprabha
Department of Electronics and Communication Engineering, R.M.D. Engineering College,
Kavaraipettai 601206, India


1 Introduction

Mapping and identification of vegetation is a most significant issue for biodiversity
management and preservation. Common phenomena like agricultural intensification,
urban development, environmental change and scrub advancement are a significant focus
of research, and their spatiotemporal dynamics are the fundamental concern. In this
situation, satellite images develop into a promising answer for natural vegetation
mapping. Compared to aerial photography, satellite imagery can provide images at a
certain time lag with a greater area of coverage. A few investigations have
demonstrated the capability of very high spatial resolution (VHSR) sensors to map
vegetation communities.
Most of the VHSR satellite sensors are confined to just four spectral bands (blue,
green, red, near infrared) and are thereafter insufficient to discriminate several
characteristic vegetation communities. The WorldView-2 satellite has been giving VHSR
images in eight spectral bands, from blue to near infrared, together with additional
bands like the coastal-blue, yellow and red-edge bands. A few examinations have
already demonstrated the capability of WorldView-2 spectral imagery to appraise
woodland biomass and basic parameters and to evaluate fine-scale plant species
beta diversity in grassland; Dalmayne et al. (2013) considered vegetation mapping
assignments applied to urban tree species.
The phrase "hyperspectral imaging" derives from work in remote sensing first cited
by Goetz et al. (1985) for the direct recognition of surface materials in the form of
images. Hyperspectral images are progressively found to have advantages in the field
of agriculture. The primary difficulties of HSI classification cannot be effectively
overcome by machine learning techniques (Li et al. 2019), and that work furthermore
presents the benefits of deep learning and how to deal with these issues. The authors
developed a structure which partitions the corresponding works into spectral-feature
systems, spatial-feature systems and spectral-spatial-feature systems to
systematically review the ongoing accomplishments in deep learning-based HSI
classification. Also, considering that available training data in the remote sensing
field are generally limited while training deep networks requires an enormous number
of samples, they include a few techniques to improve classification performance,
which can provide some guidelines for future examinations of this topic. A few
representative deep learning-based classification techniques are evaluated on genuine
HSIs in their model. The purpose of hyperspectral imaging is to acquire the spectrum
for each pixel in the image of the scene, with the aim of finding objects, identifying
materials or classifying land cover classes. Therefore, an accurate way of classifying
land cover classes is essential in remote sensing images.
The rest of this proposed research work is organized as follows: Sect. 2 explains in
detail the literature survey accomplished in areas related to land cover classes
in the context of preprocessing, with its advantages and disadvantages; Sect. 3 describes
the high-dimensionality reduction technique employed; Sect. 4 presents the results
in various aspects; lastly, Sect. 5 ends with the comparison of various sub-band
selection methods with the proposed method.

2 Literature Survey

Zhou et al. (2018) proposed a novel strategy for CCPR by integrating a fuzzy support
vector machine (SVM) with a hybrid kernel function and a genetic algorithm (GA).
Initially, two shape features and two statistical features that do not rely upon the
distribution parameters and number of samples are presented to explicitly describe
the characteristics of CCPs. Then, a novel multiclass strategy based on a fuzzy
SVM with a hybrid kernel function is proposed. In this strategy, the
impact of outliers on the classification accuracy of SVM-based classifiers is
weakened by assigning a degree of membership to each training sample. Meanwhile,
a hybrid kernel function combining a Gaussian kernel and a polynomial kernel is adopted
to further improve the generalization capability of the classifiers. To address the
problems of feature selection and parameter optimization, GA is utilized to optimize
the input feature subsets and the parameters of the fuzzy SVM-based classifier. Finally,
several simulation tests and a real example are addressed to validate the practicability
and effectiveness of the proposed procedure.
Borsoi et al. (2019) noted that hyperspectral-multispectral (HS-MS) image
fusion is currently attracting attention in remote sensing, since
it permits the generation of high spatial resolution HS images, circumventing the
primary limitation of this imaging modality. Previous HS-MS fusion techniques,
however, disregard the spectral variability often existing between
images acquired at different time instants. This time difference causes variations in
the spectral signatures of the underlying constituent materials because of the
different acquisition and seasonal conditions. This work presents a novel HS-MS
image-fusion system that combines an unmixing-based formulation with an explicit
parametric model for typical spectral variability between the two images. Simulations
with synthetic and real data show that the proposed methodology leads to a critical
performance improvement under spectral variability and state-of-the-art performance
generally.
Echanobe et al. proposed that extreme learning machines (ELMs) have been effectively
applied for the classification of hyperspectral images (HSIs), despite suffering from
three fundamental drawbacks: (1) inadequate feature extraction (FE) in HSIs because
a single hidden-layer neuron network is used; (2) ill-posed problems brought about by
the random input weights and biases; and (3) absence of spatial information for HSI
classification. To tackle the first issue, they build a multilayer ELM for effective
FE from HSIs [16]. Sparse representation is adopted with the multilayer ELM to handle
the ill-posed problem of ELM, which can be solved by the alternating direction method
of multipliers. This results in the presented multilayer sparse ELM (MSELM) model.
Since neighbouring pixels are most likely from the same class, a local block extension
is introduced for MSELM to extract the local spatial information, leading to the local
block MSELM (LBMSELM). Loopy belief propagation is also applied to the
proposed MSELM and LBMSELM approaches to further use the spectral and
spatial information for improving the classification.
Iturrino et al. (2019) concluded that the multispectral imaging devices currently
being utilized in an assortment of applications ordinarily follow opto-mechanical
designs. These designs regularly present the weakness of a massive and heavy
construction, restricting their applicability in low-altitude remote sensing
applications deployed on acquisition platforms with constrained weight and dimension
requirements. This work introduces the design and construction of a straightforward
optical system based on a transmission grating and a lightweight commercial camera
for snapshot multispectral image acquisition. The system can be mounted on a small
drone for low-altitude aerial remote sensing and is capable of separating the spectral
information from the spatial scene and creating multispectral HD images and 4K videos.
Therefore, they propose a fast technique for recovering multispectral scenes from
their individual acquired spectral projections. Experimental results show similarity
of the compared waveforms, and that the greatest peak of the reproduced wavelength
changes by up to 13 nm from the reference spectroradiometer data.

3 High Dimensional Reduction Technique

Hyperspectral bands generally carry a lot of redundant information, leading to the
Hughes phenomenon and an increase in computing time. As a popular dimensionality
reduction technology, band/feature selection is essential for HSI classification.
The dimensions of a dataset can be reduced by using feature selection and feature
extraction. The curse of dimensionality is categorized as shown in Fig. 1.

[Fig. 1 taxonomy: curse of dimensionality → feature extraction (linear methods: PCA, MCIA, joint NMF, MOFA; nonlinear methods: autoencoders, representation learning) and feature selection (filter method: mRMR, information gain; wrapper method: RFE-SVM; embedded method: elastic net, stability selection)]
Fig. 1 Dimensionality reduction techniques



3.1 Feature Selection

This is the process of selecting the dimensions of the dataset which contribute most
to the machine learning task, which can be achieved by various techniques like
correlation analysis, univariate analysis, etc.
In the filter method, features are selected on the basis of their scores in various
tests, and the best-scoring features are selected for the learning rule; finally,
performance measures are calibrated. Figure 2 depicts the process of the filter method.
Figure 3 shows the wrapper method, where subsets of features are used to train the

Fig. 2 Filter method

Fig. 3 Wrapper method [flow: input features → subset generation → learning algorithm → performance]

Fig. 4 Embedded method [flow: input features → subset generation → learning algorithm → performance]
Performance

model. It measures the usefulness of a subset of features by actually training a model;
hence, the best subset of features is selected.
In the embedded method, feature selection is performed during model training,
and so it is much less prone to overfitting. Figure 4 shows the process of
embedded-method feature selection.

3.2 Feature Extraction

This is the process of finding new features by selecting or combining features to
create a reduced feature space while still accurately and completely describing the
dataset without loss of information. Some of the popular dimensionality reduction
techniques are PCA, ICA, LDA, kernel PCA, etc.
In the proposed sub-band selection method, the subspace decomposition is
achieved by calculating the correlation coefficients between adjacent bands. Figure 5
illustrates the flow graph of the proposed method. It takes into account the
correlation coefficient of adjacent bands and the spectral characteristics
simultaneously to reduce the redundant information among the selected bands.
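A minimal sketch of this subspace-decomposition step is shown below: the correlation coefficient is computed between every pair of adjacent bands of a (rows, cols, bands) cube, and a new sub-band group is started wherever the correlation drops below a threshold. The 0.9 threshold, the function name and the `salinas_cube` variable are assumptions for illustration.

```python
import numpy as np

def subband_groups(cube: np.ndarray, thresh: float = 0.9):
    """Split bands into contiguous groups via adjacent-band correlation.

    cube: hyperspectral image of shape (rows, cols, bands). A group
    boundary is placed between bands b and b+1 whenever their
    pixel-wise correlation coefficient falls below `thresh`.
    """
    flat = cube.reshape(-1, cube.shape[-1]).astype(float)  # pixels x bands
    groups, start = [], 0
    for b in range(flat.shape[1] - 1):
        r = np.corrcoef(flat[:, b], flat[:, b + 1])[0, 1]
        if r < thresh:
            groups.append(list(range(start, b + 1)))
            start = b + 1
    groups.append(list(range(start, flat.shape[1])))
    return groups

# e.g. groups = subband_groups(salinas_cube); pick one representative band per group
```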

4 Results and Discussion

4.1 Salinas Dataset

This scene was gathered by the 224-band AVIRIS sensor over Salinas Valley, California,
and is characterized by high spatial resolution (3.7 m pixels). The area covered
comprises 512 lines by 217 samples. This image was available only as at-sensor radiance
data. It includes vegetables, bare soils and vineyard fields. The Salinas
ground truth consists of 16 classes. The number of classes and the training and
testing samples are shown in Table 1.

Fig. 5 Proposed method [flow: remote sensing image (all N bands) → preprocessing → curse of dimensionality → existing methods (PCA, MCA, autoencoders) or proposed band-selection method → reduced set of bands → classification of crops]

4.2 Sub-band Selection

The GLCM is used to compute a set of scalar quantities that characterize the various aspects
of texture in the image. GLCM features, for example, inertia, sum entropy,
difference entropy, homogeneity, angular second moment (ASM), local homogeneity,
inverse difference moment (IDM), contrast, entropy, correlation, sum of squares, variance
and sum average, can describe image textures well. The grey-level co-occurrence
matrix (GLCM) gives distribution estimates of image grey levels as a matrix. In the
presented framework, the GLCM is computed for the block surrounding the pixel of
interest. From this matrix, several features like contrast, energy, homogeneity, correlation,
etc. can be determined. Twenty-two features are extracted from this GLCM. Four
of these 22 features are chosen using the correlation feature selection technique.
These are homogeneity, sum variance, sum entropy and information measure
of correlation-2 (IMC-2). The features are computed using the formulae shown
in Table 2.
In these formulae, p(i, j) is the (i, j)th entry in a normalized grey-tone spatial-dependence
matrix, px(i) is the ith entry in the marginal probability matrix obtained by summing the rows
of p(i, j), and py(i) is the ith entry in the marginal probability matrix obtained

Table 1 Number of classes and their sample information


No Class Samples Training/Test
1 Brocoli_green_weeds_1 2009 150/1570
2 Brocoli_green_weeds_2 3726 200/2400
3 Fallow 1976 175/1500
4 Fallow_rough_plow 1394 200/1100
5 Fallow_smooth 2678 250/1750
6 Stubble 3959 350/2500
7 Celery 3579 275/3000
8 Grapes_untrained 11,271 150/1000
9 Soil_vinyard_develop 6203 400/2000
10 Corn_seneseed_green_weeds 3278 300/2400
11 Lettuce_romaine_4wk 1068 170/850
12 Lettuce_romaine_5wk 1927 180/1200
13 Lettuce_romaine_6wk 916 150/700
14 Lettuce_romaine_7wk 1070 100/750
15 Vinyard_untrained 7268 500/2800
16 Vinyard_vertical trellis 1807 180/1100

Table 2 Formulae used to compute GLCM features

1. Homogeneity: $F_1 = \sum_i \sum_j \frac{1}{1 + (i - j)^2}\, p(i, j)$
2. Sum variance: $F_2 = \sum_{i=2}^{2N_g} (i - F_3)^2\, p_{x+y}(i)$, where $p_{x+y}(k) = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} p(i, j)$ with $i + j = k$, $k = 2, 3, \ldots, 2N_g$
3. Sum entropy: $F_3 = -\sum_{i=2}^{2N_g} p_{x+y}(i) \log\{ p_{x+y}(i) \}$
4. Information measure of correlation (IMC-2): $F_4 = \left( 1 - \exp\left[ -2.0\, (HXY2 - HXY) \right] \right)^{1/2}$, where $HXY = -\sum_i \sum_j p(i, j) \log(p(i, j))$ and $HXY2 = -\sum_i \sum_j p_x(i)\, p_y(j) \log\{ p_x(i)\, p_y(j) \}$

by summing the columns of p(i, j), and Ng is the number of distinct grey levels in the
quantized image. The GLCM features extracted band-wise for the Salinas and Indian
Pines images are presented in Table 3.
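A possible realization of this computation is sketched below using scikit-image's graycomatrix/graycoprops; homogeneity is available directly, while sum entropy is derived manually from p(i, j) in the spirit of Table 2. The block size, grey-level count and 0-based indexing are illustrative assumptions.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(block, levels=16):
    """Compute a normalized GLCM for one image block and derive
    homogeneity (F1) and sum entropy (F3)."""
    q = (block * (levels - 1)).astype(np.uint8)      # quantize to grey levels
    glcm = graycomatrix(q, distances=[1], angles=[0],
                        levels=levels, symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]                             # p(i, j)
    homogeneity = graycoprops(glcm, 'homogeneity')[0, 0]
    # p_{x+y}(k): sum of p(i, j) over all cells with i + j = k (0-based here)
    k = np.add.outer(np.arange(levels), np.arange(levels))
    p_xy = np.array([p[k == s].sum() for s in range(2 * levels - 1)])
    sum_entropy = -np.sum(p_xy[p_xy > 0] * np.log(p_xy[p_xy > 0]))
    return homogeneity, sum_entropy

block = np.random.rand(32, 32)                       # stand-in image block
print(glcm_features(block))
```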
Figure 6a shows the input image of Salinas data for band 20. The classified result
is shown in Fig. 6b. It is observed that the measure of correlation, which measures

Table 3 GLCM features extracted for Salinas image


BAND Homogeneity Sum variance Sum entropy Information measure of correlation
(IMC-2)
1:20 0.0337 0.9159 0.9831 0.4664
21:40 0.0340 0.9305 0.9830 0.5786
41:60 0.0352 0.9184 0.9824 0.5340
61:80 0.0351 0.9258 0.9824 0.5340
81:100 0.0351 0.9258 0.9824 0.4933
101:120 0.0302 0.9150 0.9849 0.6732
121:140 0.0346 0.9276 0.9827 0.4884
141:160 0.0321 0.9127 0.9839 0.6015
161:180 0.0320 0.9133 0.9840 0.6000
181:200 0.0345 0.9235 0.9828 0.4162
201:220 0.0337 0.9159 0.9831 0.4664

Fig. 6 a Salinas input image. b Land covered class. c Sub-band selection values

the structural similarity between various bands, is similar from band 21 to
band 40; the corresponding graph is depicted in Fig. 6c. For the selected sub-bands,
the mean and standard deviation are determined in the local neighbourhood of every
pixel as

$$\mu_i = \frac{1}{N} \sum_{j=1}^{N} x_{ij} \qquad (1)$$

$$\sigma_i = \sqrt{\frac{1}{N} \sum_{j=1}^{N} \left( x_{ij} - \mu_i \right)^2} \qquad (2)$$

where $\mu_i$ and $\sigma_i$ are the mean and standard deviation of the pixels in a specific block, $x_{ij}$
denotes the pixel values in the block and N is the total number of pixels in each
local block.
In the proposed method, the subspace decomposition is achieved by calculating
the correlation coefficients between adjacent bands. The redundant information among
the selected bands is identified from the spectral characteristics of the large set of spectral
bands. From Fig. 6c, it is observed that bands 21 to 40 have similar distinguishing
capacity for different land covers; thus, the huge number of bands is
reduced while preserving the information of the original bands in hyperspectral image
processing. After all the 1D, 2D and GLCM features are extracted, a standardization
procedure is applied. It brings the features into a comparable range of values, which
helps the subsequent classification; after this procedure, the features have zero mean
with a standard deviation of one. These values are forwarded as input to the IRVM
training procedure. Satellite image classification can be regarded as an integrated
methodology that includes both image processing and machine learning. GLCM indices
were created for the evaluation of heterogeneous surface patterns and roughness
appearing in digital imagery. Each index can capture a specific property of the surface,
for example, smoothness, roughness and irregularity. A small code sketch of the block
statistics and standardization follows.
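The sketch below illustrates the block statistics of Eqs. (1) and (2) and the zero-mean, unit-variance standardization described above; the block size and array shapes are assumptions for illustration.

```python
import numpy as np

def block_stats(band, size=8):
    """Mean and standard deviation of each non-overlapping size x size
    block of a single band, as in Eqs. (1) and (2)."""
    h, w = (band.shape[0] // size) * size, (band.shape[1] // size) * size
    blocks = band[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.mean(axis=(1, 3)), blocks.std(axis=(1, 3))

def standardize(features):
    """Bring each feature column to zero mean and unit standard
    deviation before it is fed to the classifier."""
    return (features - features.mean(axis=0)) / features.std(axis=0)

band = np.random.rand(64, 64)                       # stand-in band image
mu, sigma = block_stats(band)
feats = standardize(np.column_stack([mu.ravel(), sigma.ravel()]))
print(feats.mean(axis=0), feats.std(axis=0))        # ~0 mean, ~1 std
```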

5 Conclusion

One of the most vital challenges in remote sensing applications is the segmentation
and classification of hyperspectral images. In order to segment and classify the
regions in a hyperspectral image, a clear depiction of the edge information is necessary.
In this paper, high dimensionality reduction using correlation
analysis for subspace decomposition is achieved by calculating the correlation coefficients
between adjacent bands. Thus, the huge number of bands, which constitutes
the curse of dimensionality in satellite image datasets, is reduced, and the information
of the original bands is preserved.

References

1. Gao Q, Lim S, Jia X (2018) Hyperspectral image classification using convolution neural
networks and multiple feature learning. Remote Sens 10:1–18
2. Toksoz MA, Ulusoy I (2016) Hyperspectral image classification via basic threshold classifier.
IEEE Trans Geosci Remote Sens 54(7):4039–4051
3. Damodaran BB, Nidamanuri RR, Tarabalka Y (2015) Dynamic ensemble selection approach
for hyperspectral image classification with joint spectral and spatial information. IEEE J Sel
Top Appl Earth Obser Remote Sens 8(6):2405–2417
4. Manju S, Venkateswaran N (2017) An efficient feature extraction based segmentation and
classification of Antarctic Peninsula ICE shelf. Int J Control Theory Appl 10(20):231–241
5. Manju S, Venkateswaran N (2018) An improved relevance vector machine with metaheuristic
optimization based vegetation classification using worldview-2 satellite images. TAGA J Graph
Technol 14:562–574
6. Huo H, Guo J, Li Z-L (2018) Hyperspectral image classification for land cover based on an
improved interval type-II fuzzy C-means approach. Sensors 18(2):1–22
7. Hasane Ahammad SK, Rajesh V, Zia Ur Rahman MD (2019) Fast and accurate feature
extraction-based segmentation framework for spinal cord injury severity classification. IEEE
Access 7:46092–46103
8. Saravanakumar V, Suma KG, Sakthivel M, Kannan KS, Kavitha M (2018) Segmentation
of hyperspectral satellite image based on classical clustering method. Int J Pure Appl Math
118(9):813–820
9. Benediktsson JA, Sveinsson JR (2005) Classification of hyperspectral data from urban areas
based on extended morphological profiles. IEEE Trans Geosci Remote Sens 43(3):480–491
10. Kang X, Li S, Benediktsson JA (2017) Hyperspectral image classification: a benchmark. In:
IEEE International Geoscience and Remote Sensing Symposium, pp 3633–3635
11. Zhang X, Weng P, Feng J, Zhang E, Hou B (2014) Spatial-spectral classification based on
group sparse coding for hyperspectral image. In: IEEE International Geoscience and Remote
Sensing Symposium, pp 1745–1748
12. Braun AC, Weidner U, Hinz S (2012) Classification in high-dimensional feature spaces—
assessment using SVM, IVM, and RVM with focus on simulated EnMAP data. IEEE J Sel Top
Appl Earth Obser Remote Sens 5(2):436–443
13. Mianji FA, Zhang Y (2011) Robust hyperspectral classification using relevance vector machine.
IEEE Trans Geosci Remote Sens 49(6):2100–2112
14. Demir B, Erturk S (2007) Hyperspectral image classification using relevance vector machines.
IEEE Geosci Remote Sens Lett 4(4):586–590
15. Damodaran BB, Nidamanuri RR, Tarabalka Y (2015) Dynamic ensemble selection approach
for hyperspectral image classification with joint spectral and spatial information. IEEE J Sel
Top Appl Earth Obser Remote Sens Soc 8(6):2405–2417
16. Echanobe J, del Campo I, Martinez V, Basterretxea K (2017) Genetic algorithm based opti-
mization of ELM for on-line hyperspectral image classification. In: IEEE International Joint
Conference on Neural Networks, pp 4202–4207
17. Lopez-Fandino J, Quesada-Barriuso P, Heras DB, Arguello F (2015) Efficient ELM-based
techniques for the classification of hyperspectral remote sensing images on commodity GPUs.
IEEE J Sel Top Appl Earth Obser Remote Sens Soc 8(6):2884–2893
18. Lv Q, Niu X, Dou Y, Wang Y, Xu J, Zhou J (2016) Hyperspectral image classification via
kernel extreme learning machine using local receptive fields. In: IEEE International Conference
on Image Processing, pp 256–260
19. Samat A, Du P, Liu S, Li J, Cheng L (2014) Ensemble extreme learning machines
for hyperspectral image classification. IEEE J Sel Top Appl Earth Obser Remote Sens
7(4):1060–1069
20. Shen Y, Xu J, Li H, Xiao L (2016) ELM-based spectral–spatial classification of hyperspectral
images using bilateral filtering information on spectral band-subsets. In: IEEE International
geoscience and remote sensing symposium, pp 497–500, Nov 2016

21. Shen Y, Chen J, Xiao L (2018) Supervised classification of hyperspectral images using local-
receptive-field based kernel extreme learning machine. IEEE International Conference on
Image Processing, pp 3120–3124
22. Manju S, Helenprabha K (2019) A structured support vector machine for hyperspectral satellite
image segmentation and classification based on modified swarm optimization approach. J Amb
Intell Hum Comput. https://doi.org/10.1007/s12652-019-01643-1
An Efficient Network Intrusion Detection
System Based on Feature Selection Using
Evolutionary Algorithm Over Balanced
Dataset

Manisha Rani and Gagandeep

Abstract Intrusion detection system is a substantial tool for securing network
traffic from malicious activities. Although it provides protection to real-time
networks, it suffers from a high false alarm rate and low testing accuracy
in classification tasks. Because of large-sized IDS datasets, high dimensionality
and the class imbalance problem are crucial barriers to achieving high performance
of an IDS model. To reduce the dimensions of the dataset and to address the class imbalance
issue, we have used the artificial bee colony algorithm for feature selection and the SMOTE
data oversampling technique to balance the dataset, respectively. At the outset, data
is preprocessed using the min–max normalization method, and the dataset is balanced by
oversampling the minority instances. After that, features are selected and
evaluated using a random forest classifier. The RF classifier has been chosen
based on empirical analysis. The experiments are performed over the widely used
NSL KDD dataset, and the superiority of the proposed work is evaluated by comparing
it with the existing literature.

Keywords Intrusion detection system · Artificial Bee Colony · Class imbalance ·


SMOTE · NSL KDD · UNSW-NB15

1 Introduction

In the digital world, security has become a critical issue in modern networks over the
past two decades. The growth of computer networks has led to a rapid increase in
security threats. Despite strong security systems against threats,
there still exist numerous intruders over the network that can violate the security
policies of the network, namely confidentiality, integrity and availability (CIA).
An intruder may attempt unauthorized access to information, manipulate the infor-
mation and make the system unreliable by deploying unnecessary services over the
network. Therefore, to secure network from these intruder activities, strong security

M. Rani (B) · Gagandeep


Department of Computer Science, Punjabi University, Patiala, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 179
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_15

mechanisms need to be designed in order to protect system resources from unauthorized
access [1]. There are a number of security mechanisms available such as
cryptographic techniques, firewalls, packet filters, intrusion detection system and
so on. Subsequently, malicious users use different techniques such as password
guessing, unnecessary loading of network with irrelevant data and unloading of
network traffic to exploit system vulnerabilities. Therefore, it is very unrealistic to
protect the network completely from breaches. But it is possible to detect the intru-
sions so that the damage can be repaired by taking an appropriate action. Thus, intru-
sion detection system plays a significant role in network security field by providing
a solid line of defense against malicious users. It detects any suspicious activity
performed by intruder by monitoring network traffic and issues alerts whenever any
abnormal behavior is sensed [2]. It is broadly classified into two types, i.e., NIDS and
HIDS. NIDS stands for network-based IDS which analyzes the network traffic by
reading individual packets through network layer and transport layer, whereas HIDS
stands for host-based IDS which monitors every activity of individual host or device
on the network. IDS can detect known attacks through signature-based detection
and can detect unknown attacks through anomaly-based detection. Both approaches
have their own limitations such that former detection approach is good for finding
known attacks only but not for unknown attacks because it can match the patterns
with the stored patterns only. The system must update the database with new attack
signature whenever novel attack is identified. Whereas anomaly-based detection can
detect both known and unknown attacks, but it suffers from high false alarm rate
because of its nonlinear nature, high-dimensional features, mixed type of features in
datasets. To overcome these challenges, there exist various machine learning algo-
rithms and evolutionary algorithms that are mainly used for classification and feature
reduction process, respectively. Recently, various algorithms such as random forest,
naïve Bays and k-nearest neighbor are used as IDS classifiers. However, the pres-
ence of irrelevant features in the dataset deteriorates the performance of classifier
[3]. Thus, to improve the performance of classifier, it is very important to reduce the
dimensionality of feature space by identifying and selecting relevant features from
original feature set which are needed for classification. But selecting the relevant
features from full set is itself a challenging task.
Currently, bio-inspired algorithms such as genetic algorithm [4], particle swarm
optimization [5] and ant colony optimization [6] are emerging techniques used for
feature selection because of their high convergence power and searching behavior.
They are inspired from biological nature of animals, insects and birds and works
based upon the principle of their intelligent evolutionary behavior. These algorithms
help in solving the complex problems with improved accuracy and are very useful in
finding an optimal solution to a given problem. Although feature selection approach
helps in reducing the false alarm rate of IDS, however, due to imbalanced nature
of IDS datasets, it could not improve the performance of classifiers significantly.
Generally, IDS datasets contain large number of normal data (i.e., majority class
instances) as compared to attack data (i.e., minority class instances), resulting in
class imbalance problem. It can be addressed at either data level or classifier level by
changing class distribution of training set itself or by altering the training algorithm

rather than training data, respectively. At data level, there are various under-sampling
techniques such as one-sided selection and oversampling techniques like SMOTE,
clustering approach, etc., are available to balance the instances by inserting some
random instances to minority samples or by removing some instances from majority
instances, respectively. At classifier or algorithmic level, either threshold approach
can be used by adjusting decision threshold of a classifier or cost-sensitive approach
can be used by modifying the learning rate parameter of algorithm. In this paper,
initially, data is preprocessed using min–max normalization technique. Then, data
is balanced using SMOTE [7]. Then, subset of features is chosen from original set
using ABC algorithm. The optimality of feature subset is evaluated using random
forest classifier. The performance of IDS is evaluated using NSL KDD dataset. After
empirical analysis of classification accuracy of various classifiers, random forest has
been chosen as classifier among them.
The rest of the paper is organized as follows: Sect. 2 deals with the survey related
to different feature selection and class imbalance approaches. Section 3 describes
the dataset used in the experimental analysis. Section 4 deals with the methodology
followed for the proposed work. Experimental results are reported in Sect. 5, where the
performance of the proposed work is shown to be superior to existing work [8–10].
The paper is concluded in Sect. 6.

2 Related Work

Numerous ML-based classifiers and evolutionary algorithms have been proposed for
classification and feature selection process for IDS. To identify attack from normal
traffic, researchers employ various steps such as data preprocessing, feature selection
and reduction techniques and classification step. First of all, existing work related
to IDS needs to be reviewed to identify research gaps in the previous work and then
some novel work would be proposed for further research.
Tavallaee et al. [11] analyzed the publicly available KDDCup’99 dataset in order
to solve its inherent problems. The major problems are the large
number of redundant records and the level of difficulty in both the training and testing
sets, which bias ML algorithms toward majority class instances and cause a large gap
between training and testing accuracies. To solve these problems, a new version of
KDDCup’99, i.e., NSL KDD, was proposed. Although NSL KDD can be used as a
benchmark dataset, it still suffers from a few problems such as data imbalance.
Pervez and Farid [12] proposed SVM-based feature selection algorithm and classi-
fier to select subset of features from original set. It had achieved 82.38% of testing
accuracy using 36 features and compared its results with existing work. Ingre and
Yadav [13] evaluated the performance of NSL KDD using back propagation algo-
rithm of artificial neural network (ANN) architecture. Before learning, data was
preprocessed by converting each feature into numerals and normalizing the data
into range [0, 1] using Z-score approach. Then, 41 features of KDDTrain+ and
KDDTest+ were reduced to 23 using information gain technique, and then, reduced

features were learnt through neural network. Aghdam and Kabiri [14] proposed
IDS model by selecting new feature subset using ant colony optimization (ACO)
approach followed by nearest neighbor classifier to identify attacks from normal
traffic. Although it achieved better results than existing work, ACO algorithm suffers
from computational memory requirements and low speed due to separate memory
required by each ant. Kaur et al. [15] compared the performance of hybridization
of K-means with firefly algorithm and bat algorithm with other clustering tech-
niques, and it had shown that the proposed work outperformed other techniques with
huge margin. However, data preprocessing involving data normalization might also
improve the performance of the proposed work. Hajisalem and Babaie [15] proposed
another hybrid model using artificial bee colony (ABC) and artificial fish swarm
(AFS) evolutionary algorithms over NSL KDD and UNSW-NB15 datasets. Mazini
et al. [15] designed a hybrid model for anomaly-based NIDS using ABC algorithm
and AdaBoost algorithm to select features and to classify attack and normal data
using NSL KDD and ISCXIDS2012 datasets, respectively. Although evolutionary
algorithms outperformed various existing works, it could not address the imbalance
problem in research work.
Recently, Jiang et al. [16] designed hierarchical model based on deep learning
approach by addressing class imbalance problem using feature selection process
to classify attack data. It combined the one-sided selection (OSS) technique and
SMOTE for handling under-sampling and oversampling, respectively. Then, it
used convolutional neural network (CNN) to choose spatial features and used bi-
directional long short-term memory (Bi-LSTM) to extract temporal features from
NSL KDD and UNSW-NB15 dataset. It identified attack data from normal by
achieving 83.58% and 77.16% testing accuracy using NSL KDD and UNSW-NB15
datasets, respectively. Alkafagi and Almuttairi [17] designed proactive model for
swarm optimization (PMSO) for selecting individual optimal feature subset using
PSO and bat algorithms followed by DT classifier. Both researches were successful
in improving the performance of the proposed work as compared to existing work but
did not address a common issue, i.e., the class imbalance problem. Tao et al. [18] achieved
classification accuracy up to 78.47% by combining a K-means clustering approach with
SMOTE algorithm to equalize the minority data instances with majority instances
in NSL KDD dataset and then by using enhanced RF algorithm for classification
process. Although it addressed the class imbalance, it could not achieve effective
results for minority attack data such as R2L and U2R attack types. Liu and Shi [19]
designed hybrid model for anomaly-based NIDS using GA algorithm for feature
selection and RF for classification. It achieved high classification accuracy rate using
NSL KDD and UNSW-NB15 training sets but did not test or validate the data results
using the testing sets of both datasets. Priyadarsini [20] solved the data imbalance problem
using the borderline-SMOTE algorithm with the RF technique and then selected the feature
subset using ABC algorithm. Various classifiers such as SVM, DT and KNN were
applied for classification of attack data from normal data using KDDCup’99 dataset
only.
From the existing work, it is found that evolutionary algorithms are being extensively
used for the feature selection process, followed by ML classifiers. It is also observed

that less research has been done to address the class imbalance problem. In this paper,
a data sampling technique, SMOTE, has been used for balancing the NSL KDD
dataset. Then, the ABC algorithm has been applied for finding an optimal subset of features,
followed by an RF classifier, to build an effective model for IDS.

3 Dataset Used

To empirically analyze the results of the IDS model, the widely used NSL KDD dataset
has been considered in this paper. The experiments are performed using 100% of the
dataset. The model is trained over 70% of the training set, and the remaining 30% of the training
set is used for validating the results. Then, the results are tested using the
testing set.

3.1 NSL KDD Dataset

Due to inherent problems in the KDDCup’99 dataset, the performance of various
researchers’ work was affected. To overcome these issues, an improved version was
created and named the NSL KDD dataset [11]. It is the benchmark dataset used by
researchers to compare the performance of IDS models. It contains a total of 42
attributes, out of which 41 describe the network flow whereas the last, 42nd,
attribute indicates the label assigned to each record. All attributes except the last are
categorized into four types: basic, content-related, time-related and host-based
contents. The last attribute represents the class, which indicates whether the
given instance is a normal or an attack connection. The four attack classes are
distributed denial of service (DDoS), probe, user to root (U2R) and Remote2Local
(R2L). The dataset is publicly available over the Internet and consists of a training set and a testing
set, KDDTrain+ and KDDTest+, respectively. The number of instances present in
each set is described in Table 1.

Table 1 Record distribution of NSL KDD dataset

Dataset      Total instances   Normal   Attack
KDDTrain+    125,973           67,343   58,630
KDDTest+     22,544            9711     12,833

4 Intrusion Detection Process

Intrusion detection system follows some basic steps to detect intrusions from network
traffic data. To monitor network traffic, there must be some genuine data containing
information about network packets. The first step involves collection of data from
available sources over which entire detection process has to be done. In this paper,
a benchmark set, i.e., NSL KDD, has been used for empirical analysis of the data.
These datasets contain metadata of network-related data and different attacks that
can be generated by the intruders. Second step involves data preprocessing step
which consists of two sub-steps, i.e., data conversion and data normalization. In
order to reduce the dimensions of data, redundant and noisy data are removed from
entire dataset and only relevant features are selected through feature reduction and
feature selection process. Then, reduced feature set of data is fed into classifier to
discriminate normal and attack data. The block diagram of whole intrusion detection
process is shown in Fig. 1.

4.1 Data Preprocessing

Because of the vast number of network packets, the unequal distribution of data in datasets
and the instability of data toward changing network behavior, it is very difficult to
classify attack data from normal data with a high accuracy rate. Therefore, the data needs
some sort of preprocessing before being passed to the detection model. Before
classification, data is preprocessed through data conversion and data normalization.

Fig. 1 Block diagram of IDS model



Data Conversion: The NSL KDD contains heterogeneous data such as numerical
and non-numerical or string type of data. Most of the IDS classifiers accept numerical
data type only. So, it is necessary to convert the entire data into homogenous form,
i.e., numeric type. First of all, data conversion is involved in preprocessing step which
converts all non-numerical features into integers. For instance, protocol feature in
NSL KDD contains string type of values such as TCP, ICMP and UDP which are
converted into numbers by assigning values 1, 2 and 3 to them, respectively.
Data Normalization: After converting all features into integer type, they can be of
either discrete or continuous in nature, which makes them incomparable and may degrade
the performance of the model. To improve the detection performance significantly,
normalization technique is applied in order to rescale the dataset values into range
of interval [0, 1]. There are various normalization techniques such as Z-score, Pareto
scaling, sigmoidal and min–max available. Based on empirical analysis, min–max
normalization approach has been used among them in this paper. Every attribute
value is rescaled into [0, 1] by transforming maximum value of that attribute into
1 and minimum value of that attribute into 0, while remaining attribute values are
transformed into decimal values lying between 0 and 1. For every feature x i , it is
calculated mathematically as follows:

$$X_i = \frac{x_i - \min(x_i)}{\max(x_i) - \min(x_i)} \qquad (1)$$

where $x_i$ denotes the ith feature of dataset x, and $\min(x_i)$ and $\max(x_i)$ denote the minimum
and maximum values of the ith feature, respectively. $X_i$ represents the
corresponding normalized or rescaled value of the ith feature.
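A minimal sketch of these two preprocessing steps is shown below; the string-to-integer mapping and the use of scikit-learn's MinMaxScaler are illustrative assumptions, since the paper does not prescribe a particular library.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Data conversion: map string-valued protocol features to integers.
protocol = np.array(['tcp', 'icmp', 'udp', 'tcp'])
codes = {'tcp': 1, 'icmp': 2, 'udp': 3}
x = np.array([[codes[p]] for p in protocol], dtype=float)

# Data normalization per Eq. (1): X_i = (x_i - min) / (max - min).
scaler = MinMaxScaler(feature_range=(0, 1))
print(scaler.fit_transform(x).ravel())    # values rescaled into [0, 1]
```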

4.2 Data Sampling

Generally, large-size datasets suffer from an unequal distribution of records known as
the class imbalance problem. The training set of IDS datasets contains more normal
data, known as majority instances, than attack data, i.e., minority class
instances. For example, the difference between normal and attack instances is
8713 in the KDDTrain+ set of NSL KDD, which makes the results biased towards
normal data instances. It is difficult to detect U2R and R2L attacks in NSL KDD
because it contains only 52 U2R and 995 R2L instances which are lesser than DoS and
probe class instances. Therefore, it fails to detect minority attack types in the presence
of vast amount of normal data effectively. In order to give importance to minority
class instances, we have used synthetic minority oversampling technique (SMOTE)
in which minority instances are oversampled by adding some random instances to
it so that it can be equally represented as majority instances in dataset. The pseudo-
instances are added by joining it with k-nearest neighbors of each minority sample
of feature vector. The number of neighbors used to introduce pseudo-instances in

minority instances is taken as five in the paper. The pseudo-instance value is decided
by taking difference between original feature value and its corresponding nearest
neighbor value. To avoid the zero difference, it is multiplied by some random value
lying in the interval range (0, 1). It is calculated using Eq. (2) as follows:

$$\text{New}_{inst} = (f_v - n_v) * \text{rand}(0, 1) \qquad (2)$$

where $\text{New}_{inst}$ indicates the pseudo-value to be added to the set, $f_v$ denotes the feature
value of the minority class instance, $n_v$ denotes the value of the nearest neighbor of that minority
instance and rand(0, 1) is a random function which generates a value
lying between 0 and 1.
The sampling ratio is determined by dividing the number of minority class samples
available after resampling by the total number of samples present in the
majority class. Mathematically, it is defined using Eq. (3) as follows:

$$S_r = N_{Min} / N_{Maj} \qquad (3)$$

where Sr denotes sampling ratio, NMin indicates the number of minority class
instances after resampling and NMaj indicates the number of majority class instances
present in the dataset. In this way, balanced dataset is generated using SMOTE to
equally represent the minority and majority class instances.
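The balancing step can be sketched as follows using the SMOTE implementation from the imbalanced-learn library with k = 5 neighbours, as described above; the library choice and the toy data are assumptions for illustration.

```python
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

# Toy imbalanced data standing in for normal (0) vs. attack (1) records.
rng = np.random.default_rng(0)
X = rng.normal(size=(1100, 5))
y = np.array([0] * 1000 + [1] * 100)

# Oversample the minority class using k = 5 nearest neighbours.
X_bal, y_bal = SMOTE(k_neighbors=5, random_state=0).fit_resample(X, y)
print(Counter(y), '->', Counter(y_bal))   # 1000/100 -> 1000/1000
```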

4.3 Feature Selection

Apart from data preprocessing, it is very challenging task to monitor large amount of
network traffic having ambiguous and redundant data. It is very crucial to reduce the
dimensions of full data in order to secure network from intrusions in real time. The
dimensionality reduction of large datasets further improves the performance of IDS
classifiers of system. It can be achieved by selecting informative or relevant features
while rolling out redundant and irrelevant features from original set through feature
selection process. On the basis of evaluation criteria, feature selection is of two
kinds, i.e., filter-based and wrapper-based feature selection. Filter-based FS assigns
weights to features and filters out the irrelevant features based upon the order of weights.
Although it saves search time, the classification algorithm plays no role in the selection.
On the contrary, wrapper-based FS takes into account the effect of the classification
algorithm in finding the feature subset and results in higher classification accuracy as
compared to filter-based FS [21]. That is why we have preferred the wrapper-based
FS approach in this paper. The basic procedure of wrapper-based feature selection
involves selecting the subset of features from original set, and then, the generated
set is evaluated for its optimality. If generated set is evaluated to be optimal, it
is chosen as best subset; otherwise, another subset is evaluated. In recent years,
swarm intelligence emerged out as an effective approach for feature selection as

compared to other approaches. It is inspired by the collective behavior of swarms such
as honey bees, birds, flocks and insects, which interact with each other through specific
behavioral patterns. They are capable of solving nonlinear complex problems within
less computational time with low computational complexity [22]. After learning pros
and cons of various evolutionary techniques, we have used ABC for feature selection
in this paper. ABC has been used to find optimal subset of features from NSL KDD
dataset.

4.3.1 Artificial Bee Colony

ABC is a meta-heuristic evolutionary algorithm based on swarm intelligence introduced
by Karaboga [23]. It is biologically inspired by the collective behavior of honeybees.
It mainly consists of three kinds of bees, namely employed bees, onlooker bees
and scout bees that accomplish the task through global convergence. The employed
bees search for food source in the hive on the basis of few properties of food like
concentration of energy (i.e., nectar amount), closeness to the hive, etc. They share
their information about searched food source with the onlooker bees in dancing area
of the hive. Then, onlooker bees further search for most profitable food source on the
basis of information acquired from employed bees in the hive. The location of most
profitable food source is memorized by the bees which start exploiting that location.
On the other hand, the employed bees turn into scout bees when the food source is
abandoned. Then, scout bees start finding for new food source in a random manner.
In this algorithm, the solution is initiated by number of employed bees present in
the hive. The solution is represented by number of features contained in the dataset.
Initially, the solution in the population is denoted by S which is defined as S i = {1,
2, 3, …, N} where N denotes the number of attributes and S i denotes the ith food
source. For NSL KDD, value of N is 41. The values or records contained in the
attribute are taken from balanced set of dataset. In this paper, ABC algorithm has
been divided into three phases as follows:
Initialization Phase: In this phase, the food source S of the population is initialized
using Eq. (4):

$$S_{i,j} = l_j + \left( u_j - l_j \right) \cdot \text{rand}(0, 1) \qquad (4)$$

where $l_j$ and $u_j$ indicate the lower and upper bounds of solution $S_{i,j}$, respectively,
and rand(0, 1) is the random function used to generate a number in the
range (0, 1).
Employed Phase: After initializing the food source, employed bee starts searching
for another food source in the neighborhood of it and updates the position of food
source as shown in Eq. (5):
 
$$NS_{i,j} = S_{i,j} + \phi_{i,j} \cdot \left( S_{k,j} - S_{i,j} \right) \qquad (5)$$

where $NS_{i,j}$ denotes the possible candidate food source, $S_{k,j}$ represents a new food
source searched by employed bees in the neighbourhood of the prior food source $S_{i,j}$, and $\phi_{i,j}$
is a random number that lies in the interval [−1, +1].
Onlooker Phase: After exploring the neighborhood, employed bees share their information
with onlooker bees in the dancing area. Then, onlooker bees exploit that food
source based on the probability function of Eq. (6):

Probfunc(i) = fitnessi  fitnessi (6)

where $\text{fitness}_i$ denotes the fitness value of the ith food source, evaluated as shown in
Eq. (7). The neighborhood is then explored again by employed bees as shown in Eq. (5).
If the new food source is better than the previous food source, the latter is replaced by the
new one; otherwise, the bee looks for another food source in the hive.

$$\text{fitness}_i = \begin{cases} \dfrac{1}{O_i + 1}, & O_i \geq 0 \\ 1 + |O_i|, & O_i < 0 \end{cases} \qquad (7)$$

where Oi represents the objective function of ith food source. When food source
gets abandoned, employed bee becomes scout bee and randomly looks for new food
source as shown in Eq. (4). In this way, this algorithm is iterated again and again
until each and every feature gets explored and optimal feature subset is generated by
ABC algorithm.
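The sketch below condenses the three phases into a wrapper-style feature selector: real-valued food sources are thresholded into feature masks, and fitness is taken as cross-validated accuracy of the classifier. The 0.5 threshold, the abandonment limit and the merging of the employed and onlooker phases are simplifying assumptions over Eqs. (4)-(7).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
n_bees, n_feat, iters, limit = 10, X.shape[1], 20, 5

def fitness(src):
    """Fitness of a food source = CV accuracy of the subset it encodes."""
    mask = src > 0.5                       # real-valued source -> feature mask
    if not mask.any():
        return 0.0
    clf = RandomForestClassifier(n_estimators=30, random_state=0)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

# Initialization phase, Eq. (4): S_ij = l_j + (u_j - l_j) * rand(0, 1)
S = rng.random((n_bees, n_feat))
fit = np.array([fitness(s) for s in S])
trials = np.zeros(n_bees)

for _ in range(iters):
    for i in range(n_bees):                # employed/onlooker phases (merged)
        k = rng.integers(n_bees)           # random partner food source
        phi = rng.uniform(-1, 1, n_feat)   # phi_ij in [-1, +1]
        # Eq. (5): NS_ij = S_ij + phi_ij * (S_kj - S_ij)
        cand = np.clip(S[i] + phi * (S[k] - S[i]), 0, 1)
        f = fitness(cand)
        if f > fit[i]:                     # greedy replacement
            S[i], fit[i], trials[i] = cand, f, 0
        else:
            trials[i] += 1
        if trials[i] > limit:              # scout phase: abandon the source
            S[i] = rng.random(n_feat)
            fit[i], trials[i] = fitness(S[i]), 0

best = S[np.argmax(fit)] > 0.5
print("selected features:", np.flatnonzero(best), "fitness:", fit.max())
```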

4.4 Classification

To classify normal data from attack data and to evaluate the optimal feature
subset chosen by the ABC algorithm in the previous step, the random forest classifier has been
used, based upon an empirical analysis of classification accuracy against various classifiers
such as K-nearest neighbor (KNN), multilayer perceptron (MLP) and classification
and regression tree (CART). Random forest solves classification and regression
problems by building multiple decision trees to reduce the noise
effect so that more accurate results can be achieved. The multiple decision trees are
created randomly using the feature subset generated by the ABC algorithm in the previous
step. Then, the classifier predicts the data as either normal or attack based upon the majority of
aggregated votes from each sub-tree. For binary classification, it gives
output in 0/1 form: if an attack is detected, it gives value 1;
otherwise, it gives value 0. The proposed flowchart for the IDS model is shown
in Fig. 2, and a minimal code sketch follows.
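A minimal sketch of this final stage with scikit-learn is given below; the random data standing in for the 27 selected features and the train/test split are illustrative assumptions, while the 50-tree setting follows the empirical choice reported in Sect. 5.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 27))            # e.g. the 27 ABC-selected features
y = rng.integers(0, 2, size=2000)          # 0 = normal, 1 = attack

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
rf = RandomForestClassifier(n_estimators=50, random_state=1).fit(X_tr, y_tr)
pred = rf.predict(X_te)                    # majority vote over the 50 trees
print("accuracy:", accuracy_score(y_te, pred))
```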

Fig. 2 Flowchart of the proposed IDS model

5 Results and Discussion

The proposed work is implemented in Python using TensorFlow and
sklearn on an Intel® Core™ i5 processor with 8 GB RAM. For experimental
analysis, NSL KDD, a widely used dataset, has been considered in this paper. The
whole training set is used, out of which 70% is used
for training the algorithm whereas the remaining 30% is used for validating the
proposed results. The classification is done using a tenfold cross-validation approach.
The features are selected using ABC algorithm whose initial values of parameters
are defined in Table 2.
For binary classification, the random forest classifier is chosen based on empirical
analysis. The classification accuracy of four classifiers is evaluated over the NSL KDD
dataset. Random forest mainly depends upon the number of decision trees; the
accuracy is calculated using 50 decision trees in this paper. Next, the KNN classifier is
considered, which works on the principle of close proximity of similar types of
data, calculated using the Euclidean distance between different
features. The number of neighbors is represented by k, and the value of k is selected as
10 for finding the accuracy rate. Then, the MLP classifier is selected for
Table 2 Parameter initialization of ABC algorithm

Parameters                   Values
Maximum no. of iterations    100
Population size              10
Population dimension size    N (number of features); for NSL KDD, N = 42

Table 3 Classification accuracy of four classifiers

Classifier       NSL KDD
Random forest    75.41
KNN              74.83
MLP              74.22
CART             75.41

comparison, which depends heavily upon the number of neurons in the input layer.
The number of neurons is set equal to the number of attributes in the dataset, i.e., 42
neurons in the input layer. The CART classifier gives results similar to the random
forest classifier. Among the four classifiers, random forest gave the best accuracy rate.
That is why the random forest classifier is chosen for this work. The
classification accuracies are shown in Table 3.

5.1 Impact of Class Imbalance

Because of the unequal distribution of class instances in NSL KDD, it suffers from
the class imbalance problem. This problem has an adverse effect on the performance of
the IDS model. Therefore, it is very important to solve it using some suitable
approach. It can be solved at either the data level or the classifier level. In this paper, it
is solved at the data level using SMOTE. It balances the dataset by oversampling the
minority class instances in such a way that both majority and minority class instances
are equally represented in the dataset. In this section, the normal and attack data are classified
using the random forest classifier without any feature selection process. The
classification accuracy without using any imbalance technique is found to be 76.94%,
whereas using SMOTE it is evaluated to be 78.87%. From the results, it is evident
that the accuracy has been improved by 1.93% for NSL KDD, which is of great
importance.

5.2 Performance of the Proposed Work

In order to find the optimal feature subset, we have used the ABC algorithm for feature
selection because of its ability to balance exploration and exploitation. Further, to address
the data imbalance problem, oversampling is done using SMOTE. The impact of
data imbalance alone has been discussed in the previous section. The experimental
results are drawn after training the data over 70% of the training set; the model is then
cross-validated tenfold on the remaining 30% validation set after going through
100 iterations. The confusion matrices for the NSL KDD dataset using the original set without any
modification and after applying SMOTE and ABC feature selection
are given in Table 4.

Table 4 Confusion matrix using NSL KDD (a) using original set and (b) using SMOTE and ABC algorithm

(a)
           Normal   Anomaly
Normal     9444     267
Anomaly    4930     7903

(b)
           Normal   Anomaly
Normal     9421     290
Anomaly    3989     8844

The algorithm selected 27 attributes out of 42 from the NSL KDD dataset, which
demonstrates the dimensionality reduction of the full dataset. With this, it is able to minimize
the storage requirements of the program effectively. The list of features selected
using the ABC algorithm over NSL KDD is shown in Table 5.
The proposed work is evaluated in terms of various parameters such as classification
accuracy, precision, recall, F-score and ROC curve. The accuracy is tested
over the KDDTest+ set of NSL KDD and is evaluated to be 81.01%, which shows a
large improvement over the original accuracy. Similarly, the other parameters show
strong results as compared to the unbalanced full dataset. Therefore, handling the data
imbalance problem plays a crucial role, along with the feature selection process, in classifying
normal and attack data in the IDS model. The ROC curve values are also above
95%, which shows the strength of the proposed work. The empirical
results for the NSL KDD dataset are given in Table 6.
The performance of the proposed approach is also compared with the existing
literature in Table 7. From the comparison table, it is evident that the proposed work
has shown great improvement as compared to the previous work.

Table 5 List of selected features using NSL KDD


Dataset Number of features selected List of selected features
NSL KDD 27 {dur, service, flag, src_bytes, wrong_fragment,
urgent, hot, logged_in, num_compromised,
root_shell, su_attempted, num_root_access,
num_shells, num_outbound_cmds, is_hot_login,
is_guest_login, count, srv_count, srv_serror_rate,
rerror_rate, same_srv_rate, diff_srv_rate,
dst_host_count, dst_host_diff_srv_rate,
dst_host_srv_diff_host_rate, dst_host_serror_rate,
dst_host_rerror_rate}

Table 6 Proposed results for binary classification (NSL KDD)

Parameters   Original dataset   SMOTE + ABC
Accuracy     76.94              81.01
Precision    94.73              96.82
Recall       61.58              68.91
F-score      75.25              80.52
ROC          95.20              96.58

Table 7 Comparison of the proposed work with the existing literature

References                        Classification accuracy (%)
Ibrahim et al. [24]               75.49
Li et al. [8] (using GoogLeNet)   77.04
Li et al. [8] (using ResNet 50)   79.14
Al-Yaseen et al. [9]              78.89
Tao et al. [18]                   78.47
Rani and Gagandeep [10]           80.83
Proposed work                     81.01

6 Conclusion

In this work, an effective NIDS is designed using the swarm intelligence-based ABC optimization
algorithm followed by a random forest classifier for binary classification.
Due to their large size, IDS datasets usually suffer from the class imbalance problem,
which becomes a crucial barrier to the performance of the system. To address this issue,
the data is balanced using SMOTE in order to give importance to minority instances
in the dataset. The impact of class imbalance is evaluated based upon an
empirical analysis of classification accuracy in this paper. In order to improve the
performance of the system to a large extent, data is preprocessed using the min–max normalization
technique followed by data balancing using the oversampling approach. Later on,
the dimensions of the large dataset are reduced by extracting a relevant feature set
using the ABC optimization algorithm followed by the random forest classifier. The
classification accuracy is calculated to be 81.01% through empirical analysis using the NSL
KDD dataset. The proposed work has also successfully outperformed the existing literature.
Future work will aim to reduce the computational time of the ABC algorithm, as training
on large datasets is very time consuming.

References

1. Madhavi M (2012) An approach for intrusion detection system in cloud computing. Int J
Comput Sci Inf Technol 3:5219–5222
2. Rani M, Gagandeep (2019) A review of intrusion detection system in cloud computing. In:
Proceedings of international conference on sustainable computing in science, technology and
management (SUSCOM), pp 770–776
3. Montazeri M, Montazeri M, Naji HR, Faraahi A (2013) A novel memetic feature selection
algorithm. In: The 5th Conference on information and knowledge technology, pp 295–300
4. Sampson JR (1976) Adaptation in natural and artificial systems. John H. Holland
5. Poli R, Kennedy J, Blackwell T (2007) Particle swarm optimization. Swarm Intell 1:33–57
6. Dorigo M, Birattari M, Stutzle T (2006) Ant colony optimization. IEEE Comput Intell Mag
1:28–39
7. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-
sampling technique. J Artif Intell Res 16:321–357
8. Li Z, Qin Z, Huang K, Yang X, Ye S (2017) Intrusion detection using convolutional
neural networks for representation learning. International conference on neural information
processing, pp 858–866
9. Al-Yaseen WL (2019) Improving intrusion detection system by developing feature selection
model based on firefly algorithm and support vector machine. IAENG Int J Comput Sci 46
10. Rani M, Gagandeep (2021) Employing Artificial Bee Colony for feature selection in intrusion
detection system. In: Proceedings of 8th International conference on computing for sustainable
global development, pp 1–5
11. Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the KDD CUP
99 data set. In: 2009 IEEE symposium on computational intelligence for security and defense
applications, pp 1–6
12. Pervez MS, Farid DM (2014) Feature selection and intrusion classification in NSL-KDD cup
99 dataset employing SVMs. In: The 8th International conference on software, knowledge,
information management and applications (SKIMA 2014), pp 1–6
13. Ingre B, Yadav A (2015) Performance analysis of NSL-KDD dataset using ANN. In: 2015
International conference on signal processing and communication engineering systems, pp
92–96
14. Aghdam MH, Kabiri P et al (2016) Feature selection for intrusion detection system using ant
colony optimization. Int J Netw Secur 18:420–432
15. Kaur A, Pal SK, Singh AP (2018) Hybridization of K-means and firefly algorithm for intrusion
detection system. Int J Syst Assur Eng Manag 9:901–910
16. Jiang K, Wang W, Wang A, Wu H (2020) Network intrusion detection combined hybrid
sampling with deep hierarchical network. IEEE Access 8:32464–32476
17. Alkafagi SS, Almuttairi RM (2021) A proactive model for optimizing swarm search algorithms
for intrusion detection system. Int J Phys Conf Ser 1–17
18. Tao W, Honghui F, HongJin Z, CongZhe Y, HongYan Z, XianZhen H (2021) Intrusion detection
system combined enhanced random forest with smote algorithm, pp 1–29
19. Liu Z, Shi Y (2022) A hybrid IDS using GA-based feature selection method and random forest.
Int J Mach Learn Comput 12(2):43–50
20. Priyadarsini PI (2021) ABC-BSRF: Artificial Bee colony and borderline-SMOTE RF algorithm
for intrusion detection system on data imbalanced problem. In: Proceedings of international
conference on computational intelligence and data engineering: ICCIDE 2020, pp 15–29
21. Li X, Yi P, Wei W, Jiang Y, Tian L (2021) LNNLS-KH: A feature selection method for network
intrusion detection. Secur Commun Netw 1–22
22. Blum C, Li X (2008) Swarm intelligence in optimization. Int J Swarm Intell 43–85
23. Karaboga D (2005) An idea based on honey bee swarm for numerical optimization
24. Ibrahim LM, Basheer DT, Mahmod MS (2013) A comparison study for intrusion database
(Kdd99, Nsl-Kdd) based on self organization map (SOM) artificial neural network. J Eng Sci
Technol 8:107–119
A Novel Hybrid Imputation Method
to Predict Missing Values in Medical
Datasets

Pooja Rani, Rajneesh Kumar, and Anurag Jain

Abstract The performance of machine learning-based decision support systems


for predicting any disease is adversely affected by missing values in the dataset.
Prediction of these missing values in an accurate manner is a critical issue. In this
paper, a new imputation method called hybrid imputation optimized by classifier
(HIOC) is proposed to predict missing values in the dataset. The proposed HIOC
is developed by using a classifier to combine multivariate imputation by chained
equations (MICE), K-nearest neighbor (KNN), mean and mode imputation methods
in an optimum way. Proposed HIOC has been validated on the datasets of heart
disease and breast cancer. Performance of HIOC is compared to MICE, KNN, mean
and mode using root-mean-square error (RMSE). The results suggest that HIOC is
the most appropriate method for predicting missing data in medical datasets and
provides stable performance even when the rate of missing data is increased from 10
to 30%. It provided a reduction in RMSE up to 18.06% in heart disease dataset and
24.62% in breast cancer dataset.

Keywords Imputation methods · KNN imputation · Mode imputation · Mean


imputation · Hybrid imputation

P. Rani (B)
MMICT&BM (A.P.), MMEC, Maharishi Markandeshwar (Deemed to be University), Mullana,
Ambala, Haryana, India
e-mail: pooja.rani@mmumullana.org
R. Kumar
Department of Computer Engineering, MMEC, Maharishi Markandeshwar (Deemed to be
University), Mullana, Ambala, Haryana, India
e-mail: drrajneeshgujral@mmumullana.org
A. Jain
School of Computer Sciences, University of Petroleum and Energy Studies, Dehradun,
Uttarakhand, India
e-mail: anurag.jain@ddn.upes.ac.in

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 195
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_16

1 Introduction

The impact of machine learning on the healthcare sector is increasing day by day.
Decision support systems developed by training machine learning algorithms on
medical datasets have become useful tools in the prediction of diseases. Machine
learning algorithms can be used to develop a decision support system to predict any
disease using clinical data [1, 2]. Medical datasets used for training these algorithms
can include some clinical data whose values are not available. This missing data
in the dataset reduces the prediction accuracy of the system. Medical datasets with
missing data make data mining very difficult. Before these datasets are mined, these
missing values must be effectively handled [3].
An easy way to handle missing values is to use only those records from the dataset
that have no missing values. This method is known as complete case analysis. But it
can reduce the size of the dataset if there are a large number of records with missing
values. In such a situation, imputation methods are the better alternative [3].
Imputation methods predict missing values in the dataset by using characteristics
that are hidden in the dataset. Efficiency of imputation methods is reflected by their
ability to accurately predict missing values. Several methods of imputation exist to
predict missing values. Imputation methods are classified into two categories: single
and multiple imputation. In single imputation, each missing value is predicted once
and this prediction is considered true. The drawback to this method is that it cannot
handle the uncertainty in missing data efficiently. Uncertainty is handled efficiently
by multiple imputation because it has multiple cycles of imputation [4].
The major motivation of this research work is to develop a classifier-independent
imputation method that can be used efficiently on any dataset.
In this paper, a new hybrid imputation optimized by classifier (HIOC) method
for predicting missing values is proposed. HIOC is developed by combining MICE,
KNN, mean and mode imputation in an optimum manner through a classifier. It
is a classifier-independent method that improves the performance of any classifier.
Authors have validated the proposed HIOC on two medical datasets: datasets of heart
disease and breast cancer. Proposed HIOC has also been compared to already existing
imputation methods: MICE, KNN, mean and mode. MICE is a multiple imputation
method. KNN, mean and mode are single imputation methods. The major research
contributions of the paper are as follows:
(1) A new method for imputation of missing values, called hybrid imputation
optimized by classifier (HIOC), is proposed.
(2) The proposed HIOC has been validated on two medical datasets, namely
Cleveland heart disease dataset and Wisconsin breast cancer dataset.
(3) The performance of the proposed HIOC method has been compared to other
existing imputation methods.
The remaining paper is structured as follows: Imputation methods applied on
different datasets are described in Sect. 2. Description of the proposed method is
given in Sect. 3. Results obtained using proposed method is discussed in Sect. 4.
Conclusion of the paper and future scope are given in Sect. 5.

2 Literature Review

Sim et al. [5] used seven imputation methods i.e., mean, listwise deletion, hot deck,
group mean, predictive mean, k-means clustering and KNN imputation methods on
six datasets. Values were randomly removed from the dataset to introduce missing
values in the datasets. Missing value imputation methods were applied with 5, 10
and 15% missing values in the datasets. Datasets imputed with values were classified
using six classifiers, and the performance of each imputation method was measured.
No single imputation method performed best across all the datasets; in each dataset, a
different imputation method provided the best results. However, the
performance of listwise deletion decreased with increasing missing data. Nahato
et al. [6] applied imputation methods on hepatitis and cancer datasets. In hepatitis
dataset, tuples having more than 24% missing values were removed, and in remaining
tuples, mode imputation was used to fill missing values. In the breast cancer dataset,
the mode imputation method was applied to all the tuples with missing values.
Kumar and Kumar [7] applied four imputation methods KNN, mean imputation,
fuzzy KNN and weighted KNN on hepatitis, breast cancer and lung cancer datasets.
Fuzzy KNN provided the best performance in imputation. Kuppusamy and Para-
masivam [8] introduced missing values randomly in nursery dataset and predicted
these missing values with missForest algorithm. Venkatraman et al. [9] performed
prediction of cardiovascular disease on DiabHealth dataset. Missing data was filled
by using mean method for continuous features and mode method for categorical
features. AlMuhaideb and Menai [10] removed records having missing values in
their proposed preprocessing procedure for the classification of medical data. Authors
validated the proposed preprocessing architecture on 25 medical datasets. Sujatha
et al. [11] proposed the imputing missing values classifier (IMVC) method to impute
missing values. Imputation was done using three classifiers Naive Bayes, neural
network and C4.5. The method was applied on Cleveland heart disease dataset.
Abdar et al. [12] used Wisconsin breast cancer dataset (WBCD) for the diagnosis
of breast cancer. Authors did not use records of patients having missing medical
parameters for training the model. This method of handling missing values is known
as complete case analysis. Nikfalazar et al. [13] proposed a missing value imputation
method DIFC by combining decision tree and fuzzy clustering. Proposed method
was tested on six datasets, adult, housing, pima, autompg, yeast and cmc. Adult
and autompg datasets contain missing values, whereas remaining four datasets do
not contain any missing values. Records of missing values in adult and autompg
datasets were removed. Missing values were introduced artificially in all six datasets
to analyze the performance of imputation method.
Zhang et al. [14] performed breast cancer prediction from Wisconsin breast cancer
dataset. Authors used complete case analysis for handling missing values. Qin et al.
[15] performed a diagnosis of chronic kidney disease using kidney disease dataset.
Missing values were filled in the dataset using KNN imputation method. Mohan
et al. [16] used Cleveland heart disease dataset for developing a system for heart
disease prediction. The authors used listwise deletion for handling missing values,

i.e., removed all the instances having missing values and used remaining records for
training and testing the system.
Almansour et al. [17] applied different classification algorithms on chronic kidney
disease dataset. Authors replaced missing values in each attribute with mean of
available values in the corresponding attribute. Supriya and Deepa [18] performed
breast cancer prediction on Wisconsin breast cancer dataset. Missing values were
filled using mean method of imputation. Different methods of imputation used on
various datasets by several researchers are shown in Table 1.

3 Materials and Methods

3.1 Methodology

In this paper, a new imputation method, hybrid imputation optimized by classifier
(HIOC), is proposed. The proposed method has been validated on two medical
datasets: the Cleveland heart disease dataset and the Wisconsin breast cancer
dataset [19, 20]. Artificial missing values were introduced in the datasets through
random distribution. After introducing missing values, HIOC was applied on the
datasets to predict the missing values.
HIOC combines the MICE, KNN, mean and mode algorithms in an optimum manner
through a classifier. The algorithm iteratively performs column-wise imputation by
selecting the best imputation method for each column. Four temporary datasets are
produced using the MICE, KNN, mean and mode imputation methods. After that,
the algorithm performs n iterations, where n is the number of columns in the dataset.
In each iteration, a column is selected from one of the four temporary datasets so
that the RMSE of the final imputed dataset is reduced to a minimum value. The
methodology of the proposed method is shown in Fig. 1.

Table 1 Imputation methods utilized by researchers in various datasets (each dataset is listed as name (no. of records, no. of features))

1. Sim et al. [5]: Wine (178, 16), Iris (150, 7), Liver disorder (345, 8), Glass (214, 16), Statlog shuttle (57,999, 14), Ionosphere (351, 36). Methods: listwise deletion, mean, KNN, group mean, k-means clustering, predictive mean and hot deck imputation.
2. Nahato et al. [6]: Hepatitis (155, 20), Breast cancer (699, 11). Method: mode imputation.
3. Kumar and Kumar [7]: Breast cancer (699, 11), Hepatitis (155, 20), Lung cancer (32, 56). Methods: mean, KNN, weighted KNN and fuzzy KNN imputation.
4. Kuppusamy and Paramasivam [8]: Nursery (12,960, 8). Method: missForest algorithm.
5. Venkatraman et al. [9]: DiabHealth (2800, 180). Methods: mean and mode imputation.
6. AlMuhaideb and Menai [10]: Liver (345, 6), Hepatitis (155, 20), Cleveland heart disease (294, 13), Pima (768, 9), Hungarian heart disease (303, 13), Statlog heart disease (270, 13). Method: listwise deletion.
7. Sujatha et al. [11]: Cleveland heart disease (303, 13). Method: imputing missing values classifier (IMVC).
8. Abdar et al. [12]: Wisconsin breast cancer (569, 32). Method: complete case analysis.
9. Nikfalazar et al. [13]: Housing (506, 14), Adult (32,561, 15), Autompg (398, 8), Pima (768, 9), CMC (1473, 10), Yeast (1484, 9). Methods: support vector regression, iterative fuzzy clustering, decision tree-based imputation, DIFC.
10. Zhang et al. [14]: Wisconsin breast cancer (569, 32). Method: complete case analysis.
11. Qin et al. [15]: Chronic kidney disease (400, 24). Method: KNN imputation.
12. Mohan et al. [16]: Cleveland heart disease (303, 12). Method: listwise deletion.
13. Almansour et al. [17]: Chronic kidney disease (400, 24). Method: mean imputation.
14. Supriya and Deepa [18]: Wisconsin breast cancer (569, 32). Method: mean imputation.

Fig. 1 Methodology of proposed HIOC method: the dataset with missing values is imputed by MICE, KNN, mean and mode to produce four temporary datasets; each column of the final imputed dataset is then selected from one of the four so as to reduce RMSE



Algorithm: Hybrid Imputation Optimized by Classifier (HIOC)


Algorithm HIOC( )
{
• Dataset DS with missing values is the input.
• Apply MICE imputation on dataset DS to predict missing values creating
dataset DS1.
• Apply KNN imputation on dataset DS to predict missing values creating
dataset DS2.
• Apply MEAN imputation on dataset DS to predict missing values creating
dataset DS3.
• Apply MODE imputation on dataset DS to predict missing values creating
dataset DS4.
• Copy dataset DS1 into DSFinal.
• N=Number of columns in DS
For (i=1 to N)
{
o Replace column i of DSFinal with column i of DS1 and calculate
RMSE using a classifier.
o Replace column i of DSFinal with column i of DS2 and calculate
RMSE using a classifier.
o Replace column i of DSFinal with column i of DS3 and calculate
RMSE using a classifier.
o Replace column i of DSFinal with column i of DS4 and calculate
RMSE using a classifier.
o Choose one of the DS1, DS2, DS3 and DS4 datasets with the
least RMSE.
o Replace the ith column of the dataset DSFinal with the dataset
selected in the previous step.
}
• Dataset DSFinal with imputed missing values is the output.
}
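For readers who wish to experiment with the approach, the following is a minimal Python sketch of the HIOC column-selection loop built on scikit-learn imputers. The RMSE evaluation routine, the choice of a random forest as the guiding classifier, and the use of cross-validated class probabilities are our own illustrative assumptions; the paper does not prescribe a specific implementation.

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, KNNImputer, SimpleImputer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

def rmse_with_classifier(X, y):
    # Illustrative RMSE: deviation between cross-validated class
    # probabilities and the true labels (binary labels 0/1 assumed).
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    p = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
    return np.sqrt(np.mean((p - y) ** 2))

def hioc(X_missing, y):
    # Step 1: build the four temporary imputed datasets (DS1..DS4).
    imputers = [
        IterativeImputer(random_state=0),         # MICE-style chained equations
        KNNImputer(n_neighbors=5),                # KNN imputation
        SimpleImputer(strategy="mean"),           # mean imputation
        SimpleImputer(strategy="most_frequent"),  # mode imputation
    ]
    candidates = [imp.fit_transform(X_missing) for imp in imputers]

    # Step 2: start from the MICE-imputed dataset (DSFinal = DS1) and
    # refine column by column, keeping the candidate with the least RMSE.
    X_final = candidates[0].copy()
    for col in range(X_final.shape[1]):
        scores = []
        for cand in candidates:
            trial = X_final.copy()
            trial[:, col] = cand[:, col]
            scores.append(rmse_with_classifier(trial, y))
        X_final[:, col] = candidates[int(np.argmin(scores))][:, col]
    return X_final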

3.2 State-of-the-Art Imputation Methods

Performance of HIOC has been compared to four existing imputation methods:


MICE, KNN, mean and mode.
Mean Imputation. In this method, missing values of each column are filled with
mean value of available values of the column [6].

Algorithm: Mean Imputation


Algorithm MEAN(DS)
{
Input: Dataset DS with missing values.
Output: Imputed Dataset DS with zero missing values
N=Number of Columns in DS
I=1
While ( I <= N)
{
M = mean of values in column I.
Replace missing values in column I with value M.
I=I+1
}
}

Mode Imputation. In this method, missing values of each column are filled with
values that occur most frequently in the column [21].

Algorithm: Mode Imputation


Algorithm MODE(DS)
{
Input: Dataset DS with missing values.
Output: Imputed Dataset DS with zero missing values
N=Number of Columns in DS
I=1
While ( I <= N)
{
V =Most frequently occurring value in column I.
Replace missing values in column I with value V.
I=I+1
}
}

KNN Imputation. In this method, K-nearest neighbor records of missing data are
found and missing data is filled with mean value of these records [22].

Algorithm: KNN Imputation

Algorithm KNN(DS)
{
Input: Dataset DS with missing values.
Output: Imputed Dataset DS with zero missing values.
N=Number of Columns in DS
I=1
While ( I <= N)
{
Select K nearest neighbors of the record based upon remaining attributes of
record.
M = mean of values of column I of selected k records.
Replace missing values in column I with value M.
I=I+1
}
}

MICE Imputation. MICE is a multiple imputation method. In this method, a regression
model is fitted on the dataset to predict the missing values of any feature from the
remaining features of the dataset, and multiple cycles of imputation are performed.
Firstly, missing values of all attributes are filled using a single imputation method
such as mean imputation. After that, one attribute X is selected from the dataset,
and the mean-imputed entries of X are set back to missing. A regression model is
then developed to predict the missing values of X from the remaining attributes.
Similarly, a regression model is built and used for all the attributes having missing
values [23].

Algorithm: MICE Imputation


Algorithm MICEIMP(DS)
{
Input: Dataset DS with missing values.
Output: Imputed Dataset DS with zero missing values.
N=Number of Columns in DS
I=1
While ( I <= N)
{
Replace missing values in column I with mean of available values of column I.
I=I+1
}
I=1
While ( I <= N)
{
Replace Mean Imputation of column I with missing values.
Develop a regression model on the dataset with column I as predicted feature.
Use the regression model to predict missing values of column I.
Replace missing values of column I with predicted values.
I=I+1
}
}
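As a usage note, the four baseline imputation methods above have common off-the-shelf counterparts. The snippet below is a sketch that assumes scikit-learn (a tool the paper does not name) and fills the same small dataset four ways:

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, KNNImputer, SimpleImputer

X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, np.nan], [4.0, 5.0]])

X_mice = IterativeImputer(random_state=0).fit_transform(X)          # MICE-style
X_knn  = KNNImputer(n_neighbors=2).fit_transform(X)                 # KNN
X_mean = SimpleImputer(strategy="mean").fit_transform(X)            # mean
X_mode = SimpleImputer(strategy="most_frequent").fit_transform(X)   # mode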

3.3 Classifiers

Four classifiers, naive Bayes (NB), support vector machine (SVM), random forest
(RF) and decision tree (DT), have been used to evaluate the performance of HIOC
and compare it to already existing methods.
NB is a type of probabilistic classifier that is based upon Bayes theorem. Bayes
theorem uses the concept of conditional probability, that is, the probability of an
event on another event. This theorem assumes that different prediction features are
independent of each other. Formula for Bayes theorem is given in Eq. 1 that calculates
the probability of occurring event x when event y occurs [24, 25].

P(x|y) = P(y|x)P(x) / P(y) (1)

SVM distinguishes between two classes using a hyperplane that acts as a boundary
between the classes [26, 27]. The hyperplane H₀ is defined as given in Eq. 2:

H₀: wᵀx + b = 0 (2)

Two hyperplanes H₁ and H₂ are constructed parallel to H₀ as given in Eqs. 3 and 4:

H₁: wᵀx + b = −1 (3)

H₂: wᵀx + b = 1 (4)

Optimization of SVM is a constrained optimization problem solvable by Lagrange
multipliers. Equations 5, 6 and 7 are written using the Lagrangian L with multipliers
λᵢ for finding the optimal values of w and b:

∂L/∂w = w − Σᵢ₌₁ˡ λᵢ yᵢ xᵢ = 0 (5)

∂L/∂λᵢ = yᵢ(w·xᵢ + b) − 1 = 0 (6)

∂L/∂b = Σᵢ₌₁ˡ λᵢ yᵢ = 0 (7)

DT creates a model by inferring decision rules from the features of the dataset. A
tree is constructed in a top-down mechanism. Each internal node of the tree represents
a condition on a feature, and each leaf node is associated with a class. A decision
tree can suffer from the problem of overfitting [28].
RF uses multiple decision trees to perform classification. Different subsamples of
the dataset are used to train these trees. Each tree classifies individually, and a vote
is made between the trees to make the final classification. It resolves the problem of
overfitting [28].
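For reference, all four classifiers are available off the shelf; the snippet below is a sketch that assumes scikit-learn, which the paper does not name:

from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# The four classifiers used to evaluate the imputation methods
classifiers = {
    "NB":  GaussianNB(),
    "SVM": SVC(probability=True),
    "RF":  RandomForestClassifier(n_estimators=100),
    "DT":  DecisionTreeClassifier(),
}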

3.4 Performance Parameter

RMSE of the classifiers was analyzed to evaluate the performance of the imputation
methods [29]. RMSE measures the deviation between the actual and predicted values
as shown in Eq. 8:

RMSE = √((1/N) Σᵢ₌₁ᴺ (Pᵢ − Aᵢ)²) (8)

In the above equation, Aᵢ is the actual value, Pᵢ is the predicted value and N is the
total number of samples. The value of the RMSE should be low, as it reflects the
inability of the system to make correct predictions [30].
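As a quick numerical illustration of Eq. 8 (a sketch; the function and variable names are ours):

import numpy as np

def rmse(predicted, actual):
    # Root mean square error between predicted and actual values (Eq. 8)
    predicted, actual = np.asarray(predicted), np.asarray(actual)
    return np.sqrt(np.mean((predicted - actual) ** 2))

print(rmse([0.9, 0.1, 0.8], [1, 0, 1]))  # ≈ 0.1414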

4 Results and Discussion

In this research work, the proposed HIOC method has been validated on two medical
datasets: heart disease and breast cancer. Both datasets are available online in the
University of California, Irvine (UCI) repository. The Cleveland dataset has
303 records, and the Wisconsin dataset has 569 records.
Artificial missing values were randomly introduced in both datasets. Ten to thirty
percent missing values were introduced in the datasets. After introducing missing
values, imputation was done using HIOC, MICE, KNN, mean and mode methods.
Efficiency of imputation methods has been analyzed by observing the impact of
imputation methods on the RMSE of classifiers.
Lower RMSE indicates a better prediction of missing data. NB, SVM, RF and DT
classifiers have been used to analyze the performance of imputation methods. Each
imputation method was used to fill 10%, 15%, 20%, 25% and 30% missing data, and
RMSE of classifiers was calculated. Finally, average RMSE was calculated. Impact
of imputation methods on heart disease and breast cancer datasets is given in Tables 2
and 3, respectively.

Table 2 Impact of imputation methods on RMSE of classifiers on heart disease dataset


Classifier   Imputation method   10%      15%      20%      25%      30%      Average RMSE
SVM HIOC 0.4419 0.4457 0.4144 0.4062 0.4264 0.4269
SVM MODE 0.4714 0.5059 0.4569 0.4750 0.5443 0.4907
SVM MEAN 0.4714 0.4642 0.4569 0.4569 0.5125 0.4724
SVM KNN 0.4855 0.4855 0.4750 0.4495 0.4958 0.4782
SVM MICE 0.4532 0.4532 0.4264 0.4495 0.4495 0.4463
NB HIOC 0.3805 0.4103 0.3978 0.4062 0.4062 0.4002
NB MODE 0.4224 0.4419 0.4678 0.4457 0.4419 0.4440
NB MEAN 0.4062 0.4184 0.4381 0.4419 0.4419 0.4293
NB KNN 0.4103 0.4264 0.4419 0.4419 0.4419 0.4325
NB MICE 0.3849 0.4184 0.4184 0.4184 0.4224 0.4125
RF HIOC 0.4062 0.3978 0.3978 0.3978 0.4144 0.4028
RF MODE 0.4750 0.4924 0.4495 0.5222 0.5190 0.4916
RF MEAN 0.4678 0.4714 0.4264 0.4678 0.4820 0.4631
RF KNN 0.4495 0.4457 0.4642 0.4495 0.4569 0.4531
RF MICE 0.4342 0.4342 0.4342 0.4144 0.4457 0.4326
DT HIOC 0.4569 0.4457 0.4224 0.4303 0.4381 0.4387
DT MODE 0.5190 0.5059 0.5318 0.5412 0.5535 0.5303
DT MEAN 0.5190 0.5350 0.4569 0.5286 0.5443 0.5168
DT KNN 0.5535 0.5059 0.5059 0.4924 0.5381 0.5191
DT MICE 0.5059 0.5286 0.5059 0.4714 0.4785 0.4981

Table 3 Impact of imputation methods on RMSE of classifiers on breast cancer dataset


Classifier   Imputation method   10%      15%      20%      25%      30%      Average RMSE
SVM HIOC 0.2444 0.2258 0.2480 0.2684 0.2749 0.2523
SVM MODE 0.2874 0.2904 0.2812 0.3081 0.3274 0.2989
SVM MEAN 0.2904 0.2717 0.2781 0.2812 0.3193 0.2881
SVM KNN 0.2584 0.2584 0.3052 0.2994 0.3109 0.2865
SVM MICE 0.2444 0.2258 0.2749 0.2717 0.2874 0.2608
NB HIOC 0.2408 0.2371 0.2408 0.2334 0.2408 0.2386
NB MODE 0.2444 0.2618 0.2651 0.2618 0.2781 0.2623
NB MEAN 0.2371 0.2550 0.2651 0.2584 0.2651 0.2562
NB KNN 0.2296 0.2480 0.2371 0.2408 0.2480 0.2407
NB MICE 0.2515 0.2480 0.2480 0.2408 0.2444 0.2466
RF HIOC 0.1921 0.1827 0.2011 0.1779 0.1921 0.1892
RF MODE 0.2296 0.2408 0.2684 0.2258 0.2904 0.2510
RF MEAN 0.2011 0.2371 0.2480 0.2444 0.2371 0.2336
RF KNN 0.2258 0.2218 0.2651 0.2334 0.2684 0.2429
RF MICE 0.2054 0.2096 0.2096 0.2011 0.2138 0.2079
DT HIOC 0.2515 0.2408 0.2584 0.2334 0.2584 0.2485
DT MODE 0.3220 0.2935 0.2812 0.3052 0.3081 0.3020
DT MEAN 0.3193 0.2994 0.3023 0.2904 0.3137 0.3050
DT KNN 0.3023 0.2550 0.2651 0.2812 0.2964 0.2800
DT MICE 0.2651 0.2618 0.2812 0.2550 0.2812 0.2689

Comparison of the different imputation methods on the heart disease and breast
cancer datasets is given in Figs. 2 and 3, respectively. The proposed HIOC has
reduced the RMSE of all the classifiers on both datasets. The HIOC method has
provided the best results, and the mode method the worst.
When HIOC was used with the heart disease dataset, it reduced RMSE by 9.86%
for the NB classifier, 13.00% for the SVM classifier, 18.06% for the RF classifier
and 17.27% for the DT classifier. When HIOC was used with the breast cancer
dataset, it reduced RMSE by 9.03% for NB, 15.59% for SVM, 24.62% for RF and
17.71% for DT. It can be concluded that HIOC is an efficient method for predicting
missing values in medical datasets. The imputation method is classifier independent
and provided the best results with all the classifiers.
When medical parameters of patients are collected, it is common that the values
of some parameters are not available, so a disease dataset can contain missing
values. When a dataset with a large number of missing values is used for training
a classification model, RMSE will be high and accuracy will be low.

Fig. 2 Comparison of imputation methods on heart disease dataset

Fig. 3 Comparison of imputation methods on breast cancer dataset

The proposed method will help in the accurate prediction of missing values. As it
is a classifier-independent method, it can be used with any classifier on any dataset.

5 Conclusion and Future Scope

In this paper, a new hybrid imputation optimized by classifier (HIOC) method is
proposed for predicting missing values in medical datasets. Datasets of heart
disease and breast cancer have been used to validate the proposed method. The
authors randomly removed data from the datasets to introduce missing values. Four
classifiers, NB, SVM, RF and DT, have been used to analyze the prediction efficiency
of the proposed method. The proposed HIOC has been compared to the MICE, KNN, mean and

mode methods. HIOC provides an efficient way to predict missing values in medical
datasets. It provides better results than the existing imputation methods, as reflected
by its lower RMSE: it achieved a reduction in RMSE of up to 18.06% on the heart
disease dataset and 24.62% on the breast cancer dataset. It is a classifier-independent
method that can predict missing values in any medical dataset.
As future work, the authors have plans to evaluate the proposed imputation method
with more classifiers and more datasets. The proposed methodology can be applied
to fill missing values in any medical dataset, and a decision support system can be
developed for the diagnosis of a disease by training classification algorithms on the
imputed dataset.

References

1. Kumar R, Rani P (2020) Comparative analysis of decision support system for heart disease.
Adv Math Sci J 9(6):1–7. https://doi.org/10.37418/amsj.9.6.15
2. Jain A, Tiwari S, Sapra V (2019) Two-phase heart disease diagnosis system using deep learning.
Int J Control Autom 12(5):558–573
3. Bertsimas D, Pawlowski C, Zhuo YD (2017) From predictive methods to missing data
imputation: an optimization approach. J Mach Learn Res 18(1):7133–7171
4. Jakobsen JC, Gluud C, Wetterslev J, Winkel P (2017) When and how should multiple impu-
tation be used for handling missing data in randomised clinical trials—a practical guide with
flowcharts. BMC Med Res Methodol 17(1):1–10. https://doi.org/10.1186/s12874-017-0442-1
5. Sim J, Lee JS, Kwon O (2015) Missing values and optimal selection of an imputation method
and classification algorithm to improve the accuracy of ubiquitous computing applications.
Math Probl Eng 2015:1–15. https://doi.org/10.1155/2015/538613
6. Nahato KB, Harichandran KN, Arputharaj K (2015) Knowledge mining from clinical datasets
using rough sets and backpropagation neural network. Comput Math Methods Med 2015:1–8.
https://doi.org/10.1155/2015/460189
7. Kumar RN, Kumar MA (2016) Enhanced fuzzy K-NN approach for handling missing values
in medical data mining. Indian J Sci Technol 9(S1):1–6. https://doi.org/10.17485/ijst/2016/
v9iS1/94094
8. Kuppusamy V, Paramasivam I (2016) A study of impact on missing categorical data—a
qualitative review. Indian J Sci Technol 9(32):1–4. https://doi.org/10.17485/ijst/2016/v9i32/
83088
9. Venkatraman S, Yatsko A, Stranieri A, Jelinek HF (2016) Missing data imputation for individ-
ualised CVD diagnostic and treatment. In: Computing in cardiology conference. CinC, IEEE,
pp 349–352. https://doi.org/10.22489/CinC.2016.100-179
10. AlMuhaideb S, Menai MEB (2016) An individualized preprocessing for medical data
classification. Proc Comput Sci 82:35–42. https://doi.org/10.1016/j.procs.2016.04.006
11. Sujatha M, Anusha S, Bhavani G (2018) A study on performance of Cleveland heart disease
dataset for imputing missing values. Int J Pure Appl Math 120(6):7271–7280
12. Abdar M, Zomorodi-Moghadam M, Zhou X, Gururajan R, Tao X, Barua PD, Gururajan R
(2020) A new nested ensemble technique for automated diagnosis of breast cancer. Pattern
Recogn Lett 132:123–131. https://doi.org/10.1016/j.patrec.2018.11.004
13. Nikfalazar S, Yeh CH, Bedingfield S, Khorshidi HA (2020) Missing data imputation using
decision trees and fuzzy clustering with iterative learning. Knowl Inf Syst 62(6):2419–2437.
https://doi.org/10.1007/s10115-019-01427-1
14. Zhang J, Chen L, Abid F (2019) Prediction of breast cancer from imbalance respect using
cluster-based undersampling method. J Healthcare Eng 2019:1–11. https://doi.org/10.1155/
2019/7294582

15. Qin J, Chen L, Liu Y, Liu C, Feng C, Chen B (2019) A machine learning methodology for
diagnosing chronic kidney disease. IEEE Access 8:20991–21002. https://doi.org/10.1109/ACC
ESS.2019.2963053
16. Mohan S, Thirumalai C, Srivastava G (2019) Effective heart disease prediction using hybrid
machine learning technique. IEEE Access 7:81542–81554. https://doi.org/10.1109/ACCESS.
2019.2923707
17. Almansour NA, Syed HF, Khayat NR, Altheeb RK, Juri RE, Alhiyafi J, Alrashed S, Olatunji
SO (2019) Neural network and support vector machine for the prediction of chronic kidney
disease: a comparative study. Comput Biol Med 109:101–111. https://doi.org/10.1016/j.com
pbiomed.2019.04.017
18. Supriya M, Deepa AJ (2019) A novel approach for breast cancer prediction using optimized
ANN classifier based on big data environment. Health Care Manage Sci 2019:1–13. https://
doi.org/10.1007/s10729-019-09498-w
19. https://archive.ics.uci.edu/ml/datasets/heart+disease. Accessed on 10-01-2020
20. http://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(diagnostic). Accessed on 12-
01-2020
21. Rani P, Kumar R, Jain A (2021) Multistage model for accurate prediction of missing values
using imputation methods in heart disease dataset. In: Raj JS, Iliyasu AM, Bestak R, Baig
ZA (eds) Innovative data communication technologies and application, lecture notes on data
engineering and communications technologies. Springer, Singapore, pp 637–653. https://doi.
org/10.1007/978-981-15-9651-3_53
22. Thomas RM, Bruin W, Zhutovsky P, Van Wingen G (2020) Dealing with missing data, small
sample sizes, and heterogeneity in machine learning studies of brain disorders. In: Machine
learning. Academic Press, pp 249–266. https://doi.org/10.1016/B978-0-12-815739-8.00014-6
23. Azur MJ, Stuart EA, Frangakis C, Leaf PJ (2011) Multiple imputation by chained equations:
what is it and how does it work? Int J Methods Psychiatr Res 20(1):40–49
24. Rani P, Kumar R, Ahmed NMOS, Jain A (2021) A decision support system for heart disease
prediction based upon machine learning. J Reliable Intell Environ. https://doi.org/10.1007/s40
860-021-00133-6
25. Lamba R, Gulati T, Alharbi HF, Jain A (2021) A hybrid system for Parkinson’s disease diagnosis
using machine learning techniques. Int J Speech Technol. https://doi.org/10.1007/s10772-021-
09837-9
26. Rani P, Kumar R, Jain A, Chawla SK (2021) A hybrid approach for feature selection based on
genetic algorithm and recursive feature elimination. Int J Inf Syst Modeling Des 12(2):17–38.
https://doi.org/10.4018/IJISMD.2021040102
27. Lamba R, Gulati T, Al-Dhlan KA, Jain A (2021) A systematic approach to diagnose Parkinson’s
disease through kinematic features extracted from handwritten drawings. J Reliab Intell
Environ. https://doi.org/10.1007/s40860-021-00130-9
28. Rani P, Kumar R, Jain A, Lamba R (2020) Taxonomy of machine learning algorithms and
its applications. J Comput Theor Nanosci 17(6):2509–2514. https://doi.org/10.1166/jctn.2020.
8922
29. Guo H, Yin J, Zhao J, Yao L, Xia X, Luo H (2015) An ensemble learning for predicting
breakdown field strength of polyimide nanocomposite films. J Nanomater 2015:1–11. https://
doi.org/10.1155/2015/950943
30. Ayilara OF, Zhang L, Sajobi TT, Sawatzky R, Bohm E, Lix LM (2019) Impact of missing data
on bias and precision when estimating change in patient-reported outcomes from a clinical
registry. Health Qual Life Outcomes 17(1):1–9. https://doi.org/10.1186/s12955-019-1181-2
Optimal LQR and Smith Predictor Based
PID Controller Design for NMP System

Divya Pandey, Shekhar Yadav, and Satanand Mishra

Abstract In this paper, an optimal PID (proportional-integral-derivative) controller
is designed for controlling a non-minimum phase (NMP) system with time delay,
and it performs better than conventional control approaches. As the time delay
increases, the complications of the NMP system grow, and the zeros in the right
half plane produce an undershoot in the response. Ziegler-Nichols, Internal Model
Control (IMC) and Chien-Hrones-Reswick tuning approaches are first developed
for the conventional PID controller, but the time-domain characteristics obtained
are unable to satisfy the requirements of the system. To improve the performance
criteria, a Smith predictor is used for controlling the NMP system. The Smith
predictor tunes the parameters better than the conventional PID techniques but is
still unable to achieve all the design requirements. To meet the performance
requirements, an optimal PID controller is therefore designed with the employment
of the Linear Quadratic Regulator (LQR). The tuning parameters of the PID
controller are optimized to minimize the integral time absolute error (ITAE) and
integral time square error (ITSE), giving precise values of rise time, peak time,
% overshoot and settling time. According to the simulation results, LQR gives
better results than the Smith predictor and the other conventional PID controller
techniques.

Keywords Non-minimum phase system · PID controller · Smith predictor ·
Conventional tuning methods · Linear quadratic regulator

D. Pandey (B) · S. Yadav (B)
Madan Mohan Malaviya University of Technology, Gorakhpur, U.P. 273010, India
e-mail: syee@mmmut.ac.in
S. Mishra (B)
AcSIR & CSIR—Advanced Materials and Processes Research Institute (AMPRI),
Bhopal 462026, India


1 Introduction

With its three-term capability treating both steady-state and transient responses,
the PID (proportional-integral-derivative) controller provides the simplest, easiest
to understand and most efficient response to different real-time control problems.
PID controllers are used in many complicated problems due to their simple structure
and ease of understanding. The tuning of a PID controller appears simple and
theoretically intuitive but can be difficult in realistic problems.
Several industrial applications involve feasible systems with time delay and
zeros/poles in the right half plane (RHP). A non-minimum phase system is
characterized by one or more zeros in the right half plane. It is a well-known
fact that the controller gain must be reduced when the time delay increases, and
as a result, the response of the system degrades.
The tuning methods of the PID controller employed in process control problems
go back to the rule of thumb established in 1942 by Ziegler and Nichols.
Furthermore, feedback autotuning has been introduced for SISO controllers. Hang,
Wang and Cao in 1995 suggested the autotuning of a modified Smith predictor with
feedback systems [1]. The analytical outlook on PID control methods in future
research was given in 2001 by Åström and Hägglund [2].
In this paper, the conventional PID tuning approaches are used with a Smith
predictor to enhance the performance of the NMP system. The parameters of the
PID controller are then optimized using the Smith predictor and LQR for better
results. The LQR control design technique is used with the conventional PID
controller for accuracy and improvement of the tuning parameters.

2 Description of Non-minimum Phase System

Systems having one or more zeros in the right half of the s-plane are characterized
as non-minimum phase systems, which arise in many complex applications of
control theory. The presence of zeros in the right half of the plane and the time
delay make NMP systems practically difficult to control because of their non-causal
characteristics. The locations of the zeros are the reason for the undershoot and
overshoot phenomena of a non-minimum phase system. Many complex problems
categorized as non-minimum phase systems require advanced non-linear control
approaches to solve them. There are many practical examples of NMP systems,
such as drum-boiler dynamics, aircraft dynamics and hydraulic pumps.
Therefore, systems having time delay (wanted or not) with feedback in different
specific fields can be analyzed as NMP systems. As the time delay increases in an
NMP system, a more sluggish response is obtained. The proposed method is
examined for controlling unstable plants with long time delay while maintaining
robustness and properness of the system. The time delay is also known as transport
lag, and the magnitude of the transport lag is always equal to unity, which is given as:

|G( jω)| = |cos ωτ − j sin ωτ | = 1 (1)

where τ is the delay time and ω is the frequency of the given system.
The phase angle of the transport lag is given as:

∠G(jω) = −ωτ × (180/π) degrees (2)
These techniques are used to improve the plant to be controlled, providing
robustness in the presence of disturbances and eliminating step load faults.

3 PID Controller

A PID design approach must include three perspectives: plant identification,
controller design methodology and tuning of the parameters. These three features
are important for the analysis and study of the behaviour of the given system. The
PID controller was introduced in 1939 and has been used in different control areas
to solve difficult problems ever since. It calibrates the controlled inputs to minimize
the error between the estimated plant variable and the setpoint. The mathematical
equation of the PID controller is:

G_c(s) = U(s)/E(s) = K_P + K_I/s + K_D s (3)

where G_c(s) is the transfer function of the controller, K_P is the proportional gain,
K_I is the integral gain, K_D is the derivative gain, U(s) is the output of the controller
and E(s) is the error of the system.
Predictive control is therefore used with a PID controller to study the attributes of
the non-minimum phase system and to address its complications (Fig. 1).

Fig. 1 Block diagram of PID controller
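To make the control law of Eq. 3 concrete, the following is a minimal sketch of a discrete-time PID implementation (our own illustration; the paper works in the continuous domain):

class PID:
    """Discrete-time PID: u = Kp*e + Ki*integral(e) + Kd*de/dt (cf. Eq. 3)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt                  # integral term
        derivative = (error - self.prev_error) / self.dt  # derivative term
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative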



Table 1 PID controller parameters obtained from the Z-N step response method

Control function    K        Ti      Td
P                   1/a      –       –
PI                  0.9/a    3L      –
PID                 1.2/a    2L      L/2

Table 2 PID controller parameters obtained from the Z-N frequency response method

Control function    K        Ti       Td
P                   0.5 Ku   –        –
PI                  0.4 Ku   0.8 Tu   –
PID                 0.6 Ku   0.5 Tu   0.125 Tu

3.1 Ziegler-Nichols (Z-N) and Related Methods

The Ziegler-Nichols method has had a powerful effect on many complex control
applications. It introduced two tuning approaches: the step response method and
the frequency response method. The method is used for determining the variables
of PID controller tuning strategies and is very effective in obtaining better results.
The rules for finding the tuning parameter values K_p, K_i and K_d by the Z-N step
response method are described in Table 1. This conventional technique is used in
different fields of control-related applications for tuning the parameters of PID;
the corresponding frequency response rules are given in Table 2.

3.2 The Chien, Hrones and Reswick Method

Certain improvements to the Ziegler-Nichols method were accomplished by Chien,
Hrones and Reswick, whose method is based on an overshoot criterion (0% or 20%).
This method is a modified form of the Z-N method. It recognizes that the set-point
response and the load-disturbance response differ in their characteristics and treats
them separately. The method is considered in many process control techniques for
solving complex problems (Tables 3 and 4).

Table 3 Controller parameters obtained from the CHR load disturbance response method

             0% overshoot               20% overshoot
Controller   Kp       Ti      Td        Kp       Ti      Td
PID          0.95/a   2.4L    0.42L     1.2/a    2L      0.42L
PI           0.6/a    4L      –         0.7/a    2.3L    –
P            0.3/a    –       –         0.7/a    –       –

Table 4 Controller parameters obtained from the CHR set point response method

             0% overshoot               20% overshoot
Controller   Kp       Ti      Td        Kp       Ti      Td
PID          0.6/a    T       0.5L      0.95/a   1.4T    0.47L
PI           0.35/a   1.2T    –         0.6/a    T       –
P            0.3/a    –       –         0.7/a    –       –

G_p(s) = K e^(−Ls) / (Ts + 1) (4)

4 The Smith Predictor General Structure

Time delays are common in industrial applications and are the source of difficulty
in the related systems. Conventional tuning techniques such as PID are generally
unproductive in solving such control problems, because substantial detuning would
be needed to stabilize the closed loop system. The Smith predictor controller gives
better responses for closed loop systems in comparison to standard control
approaches. The controller incorporates a model of the process, permitting
prediction of its parameters, and the controller is then designed for the process as
if it had no time delay.
The Smith predictor is widely utilized in control techniques for delay compensation
in complex systems (Fig. 2).
where e^(−τ_P s) is the time delay of the non-minimum phase system.
The tracking capability is improved by the Smith predictor through exclusion of
the time delay from the feedback path of the system. In this paper, different tuning
approaches of the PID controller have been utilized, with tuning variables K_p, K_i
and K_d, to reduce the settling time of the NMP system.

Fig. 2 Closed loop control system with delay component




Fig. 3 The Smith predictor general structure for delay component

The Smith predictor controller is utilized here to enhance the control situation and
to remove external disturbances in the given system (Fig. 3).
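As a hedged illustration of this structure, the sketch below uses the python-control package (our tool choice, not the paper's) to build a Smith predictor around a hypothetical first-order process with delay; the process constants and PI gains are illustrative assumptions:

import control as ct

# Hypothetical first-order process with an 8 s delay (illustrative values)
K, T, L = 1.0, 10.0, 8.0
G0 = ct.tf([K], [T, 1])                  # delay-free process model
num_d, den_d = ct.pade(L, 4)             # 4th-order Pade approximation of e^(-Ls)
Gd = ct.tf(num_d, den_d)
Gp = G0 * Gd                             # full process with (approximated) delay

# PI controller tuned for the delay-free model G0 (illustrative gains)
C = ct.tf([2.0, 0.2], [1, 0])            # C(s) = 2 + 0.2/s

# Smith predictor: the minor loop feeds back G0*(1 - e^(-Ls)) around C,
# so the equivalent controller seen by the real delayed plant is:
C_eq = C / (1 + C * G0 * (1 - Gd))

T_cl = ct.feedback(C_eq * Gp, 1)         # closed loop with the delayed plant
t, y = ct.step_response(T_cl)            # delay-compensated step response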

5 Linear Quadratic Regulator (LQR)

The Linear Quadratic Regulator (LQR) is a popular approach that gives optimal
feedback gains to stabilize the closed loop system and shows the best performance
in different complex systems. Optimal control methods are employed to design
the system with respect to specified benchmarks.
This method provides a robust control approach. In this technique, the state x of a
linear (rather, linearized) system is driven to the origin by minimizing the following
quadratic performance index (cost function).
A continuous-time LTI system is given below:

ẋ = Ax + Bu (5)

y = Cx (6)

Feedback signal is given by

u = −K x (7)

By putting this feedback signal we get

ẋ = (A − B K )x (8)

The linear quadratic regulation (LQR) problem consists of finding the control
signal u(t) that makes the given criterion as small as possible. The cost function is
given as

J_LQR = ∫₀^∞ [xᵀ(t) Q x(t) + uᵀ(t) R u(t)] dt (9)

where Q and R are the positive-definite weighting matrices.
The LQR gain vector K is obtained from

K = R⁻¹ Bᵀ P (10)

where P is a positive-definite constant matrix obtained from the algebraic Riccati
equation (ARE):

Aᵀ P + P A − P B R⁻¹ Bᵀ P + Q = 0 (11)
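As a hedged, self-contained illustration of Eqs. 9-11 (the double-integrator plant and the weights are our own example, not taken from the paper), the gain K can be computed with SciPy:

import numpy as np
from scipy.linalg import solve_continuous_are

# Example plant: double integrator x'' = u (illustrative assumption)
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)           # state weighting matrix
R = np.array([[1.0]])   # control weighting matrix

P = solve_continuous_are(A, B, Q, R)   # solves the ARE of Eq. 11
K = np.linalg.solve(R, B.T @ P)        # K = R^(-1) B^T P (Eq. 10)

# Closed-loop matrix A - BK should be Hurwitz (all eigenvalues in the LHP)
print(K)
print(np.linalg.eigvals(A - B @ K))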

6 Results and Discussions

The transfer function of the non-minimum phase system to be controlled by the
optimized PID controller is given below:

G(s) = 0.1 e^(−8s) / [s(s + 1)(0.5s + 2)(0.1s + 2)] (12)

In this paper, the transfer function of the system is approximated in order to obtain
the first-order Padé-approximated non-minimum phase system, which is presented as

G_P(s) = (−0.1s + 0.025) / (0.05s⁵ + 0.6625s⁴ + 1.763s³ + 1.4s² + 0.25s) (13)
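For readers reproducing Eqs. 12 and 13, the following sketch uses the python-control package (an assumed tool; the paper does not name its simulation environment) to build the delayed plant and a first-order Padé approximation of the delay; the resulting coefficients may differ from Eq. 13 by the paper's normalization:

import control as ct

# Delay-free part of the plant in Eq. 12
G0 = ct.tf([0.1], [1, 0]) * ct.tf([1], [1, 1]) \
     * ct.tf([1], [0.5, 2]) * ct.tf([1], [0.1, 2])

# First-order Pade approximation of the 8 s delay: e^(-8s) ≈ (1 - 4s)/(1 + 4s)
num_d, den_d = ct.pade(8, 1)
G_P = G0 * ct.tf(num_d, den_d)    # rational NMP approximation (cf. Eq. 13)

print(G_P)                                       # inspect the coefficients
t, y = ct.step_response(ct.feedback(G_P, 1))     # closed loop step response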

The response obtained for the closed loop non-minimum phase system with the
Smith predictor is shown in Fig. 4. From the responses, it can be seen that the Smith
predictor gives a better response than the given transfer function of the closed loop
non-minimum phase system alone. In Fig. 4, the blue line shows the non-minimum
phase (NMP) system and the green line shows the Smith predictor controller in
terms of time-domain characteristics (Fig. 5).
The step response of the non-minimum phase system is then compared with the
Smith predictor and the internal model control technique. Here the Smith predictor
gives a better response than the IMC method for this system (Fig. 6).

Fig. 4 Step response of closed loop NMP with Smith predictor

Fig. 5 Step response of NMP with PID-IMC and smith predictor

From the analysis of Fig. 7, the Smith predictor gives better responses than PID-ZN
and PID-CHR. The complication associated with the given system is the time delay,
which these conventional PID techniques combined with the Smith predictor aim
to remove while improving the responses. The Smith predictor gives a better settling
time, i.e., 40.6120 s, than the other approaches. Furthermore, comparing the
conventional PID techniques, Ziegler-Nichols (PID) gives better time-domain
specifications, with a settling time of 69.6098 s, than Chien-Hrones-Reswick (PID),
but improvement in the time-domain characteristics of the system is still needed.
In the figure, the red line represents the PID-ZN response and the blue line
represents the PID-CHR response obtained for the given system.

Fig. 6 Step response of NMP with PID-ZN, PID-CHR, PID-IMC and Smith predictor

Fig. 7 Step response of NMP with PID-ZN, PID-CHR and Smith predictor

In Fig. 8, the step response of the non-minimum phase system with the Smith
predictor controller and the Linear Quadratic Regulator is shown. LQR gives a
better settling time than the Smith predictor; the optimal control response is
produced by the Linear Quadratic Regulator.
In Fig. 9, the LQR stabilizes the system in about 18 s, which is quite suitable for
handling the complicated methodology. Hence the PID controller attains optimal
values using this approach, represented by the light blue curve in the plot (Table 5).

Fig. 8 Step response of NMP with Smith predictor and LQR

Fig. 9 Comparative analysis of NMP with different controller

Table 5 Parameters of PID controller

Tuning methods          Kp       Ki        Kd
Ziegler-Nichols         5.169    0.2779    24.04
Chien-Hrones-Reswick    4.08     0.1846    16
PID-IMC                 0.2443   0.003108  1.058

Table 6 Comparison of time-domain characteristics

Parameters        Rise time (s)   Settling time (s)   % overshoot   Peak     Peak time (s)
C/L NMP system    64.7793         124.7616            0             0.9998   254.1795
PID (ZN)          3.1811          69.6098             80.1516       1.8015   18.2130
PID (CHR)         5.1974          89.2550             54.9307       1.5493   23.5028
PID (IMC)         111.4263        1.4419e+03          45.3597       1.4536   295.3242
Smith predictor   13.1554         40.6120             0             0.9996   80.9475
LQR               8.9108          18.7756             0             0.9986   30.8181

The analysis of this system is done according to the above tuning variables of the
PID controller using the three conventional tuning approaches. The tables show the
distinct parameters utilized for tuning and how the Smith predictor technique
decreases the delay (Table 6).
From the comparative study of the different conventional PID controller methods,
the Linear Quadratic Regulator (LQR) and the Smith predictor, it can be seen that
LQR gives a better settling time than the other approaches used in this paper. The
NMP system is controlled by these standard PID methods, and the LQR and Smith
predictor provide better performance in terms of time-domain characteristics.

7 Conclusion

In this paper, the non-minimum phase (NMP) system is controlled by different
standard PID tuning methods, namely Ziegler-Nichols (ZN), Chien-Hrones-Reswick
(CHR) and Internal Model Control (IMC). External load perturbations and tracking
errors are eliminated by these methods systematically. The demonstrations show
that the Smith predictor controller gives better responses in comparison to the
other conventional tuning methods but is still not as accurate as desired in the
time-domain specifications. Therefore, the optimal control approach, the linear
quadratic regulator, is utilized to obtain the desired response. The controller designs
are carried out with the Smith predictor controller, the Linear Quadratic Regulator
and the conventional PID tuning methods, i.e., Ziegler-Nichols (PID), Chien-Hrones-
Reswick (PID) and Internal Model Control (PID), and the time-domain
characteristics are improved up to the desired limit. With the LQR design, the errors
incorporated in the system reduce, and better responses are obtained in terms of
time-domain specifications. The simulation results calculated for this method
confirm that the LQR (Linear Quadratic Regulator) provides better results than the
Smith predictor and the other conventional PID controller methods in terms of
time-domain characteristics.

References

1. Hang CC, Wang Q, Cao LS (1995) Self-tuning Smith predictors for processes with long dead
time. Int J Adapt Control Signal Process 9(3):255–270
2. Åström KJ, Hägglund T (2001) The future of PID control. Control Eng Pract 9(11):1163–1175
Detection of Cracks in Surfaces
and Materials Using Convolutional
Neural Network

R. Venkatesh, K. Vignesh Saravanan, V. R. Aswin, S. Balaji, K. Amudhan,
and S. Rajakarunakaran

Abstract A crack is the separation of an object or surface into two or more pieces.
Identification of these cracks is crucial and plays a vital role in preventing the
disasters that cracks can cause. Effective identification of cracks on materials or
parts can help the manufacturer reject those parts and find the root cause of crack
formation. For this, we have developed an image processing model using a
convolutional neural network. The model constantly takes pictures using a web
camera; these images are processed by the model, which predicts whether the part
or material has a crack on its surface. If a crack is detected, the image is saved and
the respective actions are taken. We trained our model with 20,000 positive samples
(images with cracks) and 20,000 negative samples (images without cracks) and then
tested it with random images obtained from the internet. Analysing the results, we
found that our model attained maximum accuracy with an average confidence level
of 93-97%. After this stage, we extended the model to access the device's webcam,
capture a live video feed and predict cracks in the images. If a crack is present in
an image, the model stores the image.

Keywords Computer vision · Crack detection · Deep learning · Convolution ·
Image filtering

R. Venkatesh · V. R. Aswin (B) · S. Balaji · S. Rajakarunakaran
Department of Mechanical Engineering, Ramco Institute of Technology, Rajapalayam, India
K. Vignesh Saravanan
Department of Computer Science and Engineering, Ramco Institute of Technology, Rajapalayam,
India
K. Amudhan
Department of Mechanical Engineering, Mepco Schlenk Engineering College, Sivakasi, India


1 Introduction

A crack is a fracture or discontinuity in a body. The development of a crack
indicates a deviation or decline in some physical property of the body on which it
has developed. The body can be the core element of a structure, or it might be a
protective layer or coating.
Sometimes, variations in the humidity level of the atmosphere and changes in
atmospheric temperature make an element expand and contract. This movement of
expansion and contraction of the outer surface of a physical body is one of the
major causes of cracks.
Sometimes, improper fabrication might also lead to crack development as a result
of failure of the expected behaviour, which in turn reduces the endurance of the
physical body.
Thus, detection of cracks is a crucial task in almost all fields that serve a
considerable consumer population.

1.1 Types of Concrete Cracks

• Plastic shrinkage cracks


• Expansion cracks
• Heaving cracks
• Settling cracks
• Cracks caused by overloading
• Cracks caused by premature drying.

1.2 Types of Cracks on Mechanical Components

• Tensile crack at open flaw tip


• Tensile crack at position close to open flaw tip
• Antitensile crack
• Normal shear crack
• Coplanar shear crack
• Far-field crack.

1.3 Effects of Crack in Structures

Generally, a crack is also a type of failure. Therefore, the presence of a crack on
any surface or structure makes that structure more vulnerable and weaker against
the external environment and external factors. A crack accelerates the ageing of a
structure and induces a decline in the endurance strength of the material. A crack
also reduces the ability of a structure to absorb stress, and it may even cause the
collapse of large structures.
A surface crack accelerates the loss of strength by reducing the pulverization of
the structure under dynamic loading conditions.
Sometimes, cracks can be fatal: powerful enough to collapse a building, a large
entertainment centre in a theme park, or even a bridge, making it crucial to detect
cracks and take action against them.
The development of a crack along a continuous path on a surface is called crack
propagation. There are various modes of crack propagation, driven by differences
in crack tip stress, friction on the crack faces and shear modes of fracture; the
resistance of the material to crack initiation and growth also plays a role.
Cracks can be prevented by forming an impression on the surface, pressing a strong
body at a particular position of the metal component undergoing the crack
prevention process. A plastic deformation occurs at the impression, and a
compressive residual stress is formed which prevents initiation or propagation of
cracks on the corresponding surface.
Scientifically, cracks propagate through a body to reduce the total energy of the
system under load, by dissipating the elastic strain energy of loading into the
creation of a new surface.
These cracks are usually detected using manual methods like magnetic particle
inspection, dye penetrant inspection and ultrasound, but these techniques require
many chemical preparations.
Cracks are present not only on surfaces, elements and structures but also on parts
on the assembly line or conveyor line in production units, where they occur due to
manufacturing defects; to make sure these defective pieces are not sent out, crack
detection is a vital part of the process.
Through crack detection, a large number of problems can be avoided, from
packaging units where defective work pieces might be mixed in, to failures in large
structures which might cause fatal injuries to a considerable population.

2 Literature Review

Concrete crack detection and quantification using deep learning and structured
light (August 2020)
Song Ee Park, Seung-Hyun Eem, Haemin Jeon.
Detecting the deterioration of infrastructure by detecting cracks on its surfaces has
become essential to ensure safety. This paper highlights the use of deep learning
together with two sensors, vision and structured light. The authors use the You Only
Look Once algorithm, and the projected positions of the laser beams give the size
of the crack. As laser beams cannot be projected in parallel due to manufacturing
or installation errors, a correction algorithm for laser alignment, with a jig designed
specifically for this application and a distance sensor, is used to increase accuracy.
The authors verified the performance of this algorithm by running tests and
simulations, which resulted in increased accuracy of crack detection. In short, the
paper concludes that applying deep learning techniques to crack detection increases
its accuracy and reduces possible errors.
Crack Detection and Segmentation using Deep Learning with 3D Reality Mesh
Model for Quantitative assessment and Integrated Visualization (May 2020)
Rony Kalfarisi, Zheng Yi Wu, M. ASCE, Ken Soh.
This paper starts by highlighting past research in the same field. Crack detection
has been an ever-trending research topic among civil infrastructure researchers,
and excellent results have been obtained by applying deep learning techniques to
it. However, there are no accurate crack detection techniques in the context of
engineering structures. In this paper, two deep learning-based techniques for crack
detection are developed: faster region-based CNN (FRCNN) and structured random
forest edge detection (SRFED). The former detects cracks by forming a bounding
box around them, while the latter segments the cracks detected inside the box. The
models were trained with a large amount of data collected from random real-world
places to enhance their applicability. Both approaches were applied in a 3D reality
mesh model that enables assessment with integrated visualization. Implementation
of these techniques increased the accuracy of crack detection to a considerable
level, making it easier and more effective.
Crack Detection of Structures Using Deep Learning Framework (December
2020)
Arunish Kumar, Avinash Kumar Jha, Amit Kumar, Ashutosh Trivedi.
This paper starts by highlighting the fact that cracks in roads and bridges have
drawn the attention of the transportation system. The introduction of deep learning
into crack detection has made it possible to complete the task with high accuracy.
The authors propose a modified LeNet-5 model for detecting cracks in roads and
bridges. They worked on automated bridge crack detection, applied their own
modified version of the LeNet-5 model, collected concrete crack images for image
classification and finally analysed their results by comparing them with the
accuracies of existing techniques. They also highlight the regions with cracks in
red and the regions without cracks in green, taking time and accuracy as the main
factors while producing the model and extracting the output.
Autonomous Concrete Crack Detection Using Deep Fully Convolutional Neural
Network (March 2019)
Cao Vu Dung, Le Duc Anh.
In this paper, the authors used the image classification and bounding box
approaches of existing vision-based crack detection techniques. The study proposed
a crack detection method based on a deep fully convolutional network (FCN) to
increase the efficiency of the crack detection system. The performance of three
different pre-trained network architectures, serving as the FCN encoder's backbone,
was evaluated for image classification on a public concrete crack dataset of 40,000
227 × 227-pixel images. Subsequently, the whole encoder-decoder FCN network
with the VGG16-based encoder was trained end-to-end on a subset of 500 annotated
227 × 227-pixel crack-labelled images for semantic segmentation. The network
achieves about 90% on average. The authors observed that their model detects
cracks in a video of a cyclic loading test on a concrete specimen at a reasonable
rate, thus validating the performance of the FCN module.
Infrared Thermal Imaging-Based Crack Detection Using Deep Learning
(December 2019)
Jun Yang, Wei Wang, Guang Lin, Qing Li, Yeqing Sun, Yixuan Sun.
Deep learning vision-based approaches are widely used in fields where crack
detection is a crucial and essential task. Defects on the surface of components can
be detected after the image is processed, but due to the limitations of photographic
images, internal features of objects cannot be detected. To overcome this drawback,
an infrared thermal imaging technique is used with a CNN. First, a horizontal heat
conduction method is used to excite the component's surface, with a rolling electric
heating device as the heat excitation source. Second, the difference in temperature
between normal regions and cracked regions is studied. Third, 3000 infrared
thermograms labelled for penetrating cracks, non-penetrating cracks and surface
scratches are fabricated into a databank, on which the CNN is trained. After this,
the faster region-based CNN is improved: the feature maps of multiple levels in the
feature extraction network are aggregated, and the anchor selection scheme of the
region proposal network (RPN) is adjusted from 9 to 25. The robustness of the
improved faster R-CNN is demonstrated by evaluating the detection results on 125
images outside the databank; the accuracy and mean average precision (mAP) are
95.54% and 92.41%, respectively, outperforming the original algorithm by 3.18%
in accuracy and 1.88% in mAP.
Automated Road Crack Detection Using Deep Convolutional Neural Networks
(January 2019)
Vishal Mandal, Lan Uong, Yaw Adu-Gyamfi.
To avoid failures in structures, it is essential to detect cracks and take the necessary
actions. Most crack detection methods to date follow a conventional manual
approach, which can be time consuming, dangerous and expensive. In this study,
the authors produced an automated pavement crack detection system using the
YOLO v2 deep learning framework. The system was trained with approximately
7300 images and tested on around 1400 images. The detection and classification
accuracy of the proposed distress analyser is measured using the average F1 score
obtained from the precision and recall values. Successful application of this study
can help identify road anomalies in need of urgent repair, thereby facilitating a
much better civil infrastructure monitoring system.

Robust Concrete Crack Detection Using Deep Learning-Based Semantic
Segmentation (January 2019)
Donghan Lee, Jeongho Kim, Daewoo Lee.
In this paper, the authors proposed a crack detection technique based on an image
segmentation and image classification network for robust crack detection, which
utilizes information obtained from the datasets and performs pixelwise prediction.
They also produced a crack image generation algorithm using a 2D Gaussian kernel
and the Brownian motion process to overcome the lack of datasets. They gathered
around 250 images with cracks to train their module and verify the robustness of
their trained crack detection module. The neural network was trained with MS-COCO
datasets and retrained on each dataset to derive the maximum prediction accuracy.
They conclude that their method gives highly accurate predictions, even for complex
images with complex pixel ratings.
DeepCrack: Learning Hierarchical Convolutional Features for Crack Detection
(October 2018)
Qin Zou, Zheng Zhang, Qingquan Li, Xianbiao Qi, Qian Wang, Song Wang.
The authors in this paper start with the basic character of cracks: typical line
structures that are of interest in many computer vision applications. The major
challenges faced by vision-based image processing techniques for crack detection
are the discontinuity of cracks and low contrast. The authors therefore propose
DeepCrack, an end-to-end trainable deep convolutional neural network for
automatic crack detection that learns high-level features for crack representation.
Multi-scale deep convolutional features learned at hierarchical convolutional stages
are fused together to capture the line structures. The authors collected four datasets,
training their module on one and testing on the other three. The experimental results
demonstrate that DeepCrack achieves an F-measure over 0.87 on the three
challenging datasets on average and outperforms the current state-of-the-art
methods.

3 Objectives

We propose this solution to make crack detection and inspection of abnormalities
easier and more efficient.
The project is made to eliminate human intervention as much as possible in parts
inspection and quality assessment of physical structures.
We framed this solution with the objectives of reducing time consumption,
eliminating labour cost in manual inspection, and overcoming the drawbacks faced
during manual inspection and during visual inspection of the video feed at the
place of implementation.
We propose this solution with an aim to provide an efficient inspection technique
in manufacturing units during the packing stage.

We also aim to provide an enhanced method to check for abnormalities and defects in the workpieces coming out of a manufacturing unit and to implement it as one of the machine vision techniques.

4 Proposed Solution

We have created a model using deep learning techniques. Our model uses an external camera connected to the control system to capture pictures at equal intervals and detect any abnormal features, defects or cracks.

4.1 CNN

A convolutional neural network (CNN) is a type of artificial neural network created specifically to process pixel data and used to detect and process images. CNNs are powerful image processing, artificial intelligence (AI) tools that use deep learning to perform both generative and descriptive tasks, often in machine vision systems that incorporate image and video recognition together with natural language processing (NLP).

4.1.1 Data Preparation

Data collection plays a vital role: insufficient data may lead to overfitting of the model, so sufficient data should be collected. A total of 40,000 images were collected, 20,000 each of positive (images with cracks) and negative (images without cracks) samples. Each image is about 227 × 227 pixels.

4.1.2 Data Processing and CNN Architecture

The collected data are categorized and the model is trained by applying various layers in order to identify the features in each image and classify it (Fig. 1).

4.1.3 Filter Processing

Filtering is the initial processing applied to the input image datasets. Through this process, the edges of the images are extracted using vertical-edge and horizontal-edge filtering techniques. For simple mathematical operations, we consider the input datasets as grayscale images represented as 2D matrices, where the value of each element represents the amount of light in the corresponding pixel. To extract the edge data, we perform the convolutional product, which is basically the sum of the elementwise products in each block.

Fig. 1 CNN architecture used

4.1.4 Padding and Stride

While performing the convolutional product using the vertical-edge and horizontal-edge filters, the pixels at the corners of the image are used far less than the pixels in the middle, so the information at the edges is often thrown away. To overcome this problem, we introduce the padding technique: zeros are added on each of the four edges of the image, i.e., the value '0' is added in the matrix representation.
Striding is performed together with the convolution product; a higher stride shrinks the output size considerably, as the filter skips positions while computing the sum of the elementwise products per block.
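For reference (a standard relation, not stated explicitly above): an n × n input convolved with an f × f filter using padding p and stride s produces an output of size

n_out = ⌊(n + 2p − f)/s⌋ + 1

For example, a 227 × 227 image convolved with a 3 × 3 filter at p = 0 and s = 1 yields a 225 × 225 feature map.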

4.1.5 Convolution

After we have defined the filters, padding and stride, we can define the convolution product, which is the sum of the elementwise products in the matrix. Mathematically, for the image we have:

dim(image) = (n_H, n_W, n_C)

where n_H is the height, n_W is the width and n_C is the number of channels. If the image is an RGB image, it has red, green and blue colours, so the number of channels is n_C = 3. When performing the convolutional product, the filter must have the same number of channels as the image, so the dimension of the filter is:

dim(filter) = (f, f, n_C)

The convolutional product between the image and the filter is a 2D matrix where each element is the sum of the elementwise multiplication. Mathematically, for a given image I with filter K, we have:

conv(I, K)_(x,y) = Σ_(i=1..n_H) Σ_(j=1..n_W) Σ_(k=1..n_C) K_(i,j,k) · I_(x+i−1, y+j−1, k)
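As an illustration of this formula, the following minimal NumPy sketch (ours, not from the original implementation) computes the valid convolutional product of an image with a single filter:

import numpy as np

def conv_product(image, kernel, x, y):
    """Sum of the elementwise product between the kernel and the
    image patch whose top-left corner is at (x, y)."""
    f = kernel.shape[0]                      # kernel has shape (f, f, n_C)
    patch = image[x:x + f, y:y + f, :]       # matching (f, f, n_C) block
    return np.sum(patch * kernel)

def convolve(image, kernel, stride=1):
    """Valid convolution of an (n_H, n_W, n_C) image with an
    (f, f, n_C) kernel, producing a 2D feature map."""
    n_H, n_W, _ = image.shape
    f = kernel.shape[0]
    out_h = (n_H - f) // stride + 1
    out_w = (n_W - f) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = conv_product(image, kernel, i * stride, j * stride)
    return out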

5 Methodology

Initially, we developed our model using a convolutional neural network architecture. The model was developed to create a database, train itself with the datasets provided in the specified directory, test itself with random images that we provide as input and display the result.
The first version of our model was able to classify the fed image and run the prediction model on it. Initially, we trained the model using 2000 datasets obtained from the internet and then tested its performance with random images obtained from the internet. It gave the desired output together with a confidence level.
In this version, we first used two layers to test and classify the image with 40 epochs; the prediction accuracy ranged from 69 to 77%. The model was then trained with three layers, and its accuracy was observed to be around 88–93%. We then increased the number of image classification layers to four; with four layers (improved CNN) and 40 epochs, the accuracy increased up to 95%. Finally, we increased the number of epochs to the largest double-digit number, 99, and the accuracy rose to 93–98%.
We then developed the second version of our model by accessing the AI server in our college. With the expanded computing capacity of this server, we trained our model with 40,000 datasets: 20,000 positive (with cracks) and 20,000 negative (without cracks). After this, we tested our model with a set of random images from the internet and observed a significant improvement in performance and a considerable gain in precision and accuracy (Fig. 2).
With the success of this version, we developed an even better version of our model. We wrote code to access the device's connected visual hardware (camera) and monitor the live video feed. If a crack is detected in any frame, the model reports the result and saves that frame in a separate specified directory. This model will be developed into software and integrated into the machine vision system of any manufacturing or production industry where inspection of the end product's physical features, surface impressions, defects and other attributes is automated (Fig. 3).

Fig. 2 Improved CNN graph

Fig. 3 Methodology

5.1 Stage 1

• Initially, the datasets are classified into two categories, positive and negative, which are merged into an array and assigned to a variable called categories.
• The categories are then loaded into the model by defining a function called training data.
• The data in training data are shuffled, and X_data and y, also known as features and labels respectively, are created.
• These data are then reshaped into the correct format so that they can be accessed inside TensorFlow.
• After reshaping, the data are saved in a specified directory (see the sketch below).
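A minimal Python sketch of this stage is given below (our illustration; the directory name DATADIR and the pickle file names are hypothetical placeholders, assuming the positive/negative folder layout described above):

import os, pickle, random
import cv2
import numpy as np

DATADIR = "datasets"                      # hypothetical root directory
CATEGORIES = ["Positive", "Negative"]     # crack / no-crack classes
IMG_SIZE = 227                            # image size used in this work

training_data = []
for label, category in enumerate(CATEGORIES):
    folder = os.path.join(DATADIR, category)
    for fname in os.listdir(folder):
        img = cv2.imread(os.path.join(folder, fname), cv2.IMREAD_GRAYSCALE)
        if img is None:
            continue                       # skip unreadable files
        img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
        training_data.append((img, label))

random.shuffle(training_data)              # shuffle before splitting

X_data = np.array([img for img, _ in training_data])
y = np.array([label for _, label in training_data])
X_data = X_data.reshape(-1, IMG_SIZE, IMG_SIZE, 1)  # shape expected by TensorFlow

with open("X_data.pickle", "wb") as f:     # save to a specified directory
    pickle.dump(X_data, f)
with open("y.pickle", "wb") as f:
    pickle.dump(y, f)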

5.2 Stage 2

• The training data saved in Stage 1 are fed into the code of Stage 2 and scaled to the correct range.
• A Sequential() model is initialized.
• Three hidden layers with ReLU activation are created.
• A softmax layer is then created.
• The data from these layers are flattened using a Flatten layer, and Dense layers apply the activation functions.
• After compiling the model, it is trained for 99 epochs with a batch size of 32.
• The model is then saved (see the sketch below).
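A minimal Keras sketch of this stage follows (ours; the paper fixes three ReLU hidden layers, a softmax output, 99 epochs and a batch size of 32, while the filter counts and dense sizes here are illustrative assumptions):

import pickle
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

X_data = pickle.load(open("X_data.pickle", "rb")) / 255.0   # scale to [0, 1]
y = pickle.load(open("y.pickle", "rb"))

model = Sequential()
# three hidden layers with ReLU activation (sizes are assumptions)
model.add(Conv2D(32, (3, 3), activation="relu", input_shape=X_data.shape[1:]))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation="relu"))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation="relu"))
model.add(MaxPooling2D((2, 2)))
# flatten, dense and the final softmax layer over the two classes
model.add(Flatten())
model.add(Dense(64, activation="relu"))
model.add(Dense(2, activation="softmax"))

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X_data, y, batch_size=32, epochs=99, validation_split=0.1)
model.save("crack_model.h5")               # saved model used in Stage 3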

5.3 Stage 3

• A function named prepare image is defined, with the frame passed as an argument.
• The while loop is initialized.
• The model saved in Stage 2 is loaded into Stage 3.
• Prediction is then carried out on the live feed data.
• If the model finds a crack in the image, it displays the prediction value and saves the image in a specified directory.
• If it does not find a crack, it proceeds with the loop (see the sketch below).
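A minimal sketch of the live-feed loop follows (ours; the class index for 'crack' and the output directory name are assumptions):

import os
import cv2
import numpy as np
from tensorflow.keras.models import load_model

IMG_SIZE = 227
model = load_model("crack_model.h5")        # model saved in Stage 2
capture = cv2.VideoCapture(0)               # device's connected camera
os.makedirs("detections", exist_ok=True)    # assumed output directory

def prepare_image(frame):
    """Convert a captured frame to the shape expected by the model."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (IMG_SIZE, IMG_SIZE)) / 255.0
    return gray.reshape(1, IMG_SIZE, IMG_SIZE, 1)

count = 0
while True:
    ok, frame = capture.read()
    if not ok:
        break
    prediction = model.predict(prepare_image(frame), verbose=0)[0]
    if np.argmax(prediction) == 0:          # index 0 assumed = crack class
        print("Crack detected, confidence:", prediction[0])
        cv2.imwrite(f"detections/crack_{count}.png", frame)  # save the frame
        count += 1
    if cv2.waitKey(1) & 0xFF == ord("q"):   # press 'q' to stop the loop
        break

capture.release()
cv2.destroyAllWindows()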

6 Experimental Results

We have trained our model with the support vector machine (SVM), gradient boosting and convolutional neural network (CNN) methods to identify the best method for our application (Fig. 4).

Fig. 4 Accuracy comparison graph

By training the dataset with the SVM, gradient boosting and CNN methods, we achieved the maximum accuracy with the CNN method for varying numbers of data samples.
We then measured the time taken by each model to train on the dataset and found that CNN takes less time than the other two methods (Fig. 5).
From these tests, we concluded that CNN is the right choice for our application, since it takes less time to train and its prediction accuracy is also considerably higher than that of the other two methods.

6.1 Images with Crack

Our model is able to train itself with the given datasets from the specified directory. It is then able to load the trained data and use it to predict the output for a given picture. The processed output is shown in Figs. 6, 7, 8, 9 and 10.

Fig. 5 Time comparison graph

Fig. 6 Wall crack



Fig. 7 Izod tested piece crack

Fig. 8 Izod tested piece 2 crack



Fig. 9 Cylinder crack

Fig. 10 Sheet metal crack



Fig. 11 Plain wall

6.2 Images Without Crack

As is clear from the images shown, our model gives highly accurate results with an average confidence level above 95%. The images in the previous subsection belong to the positive category, as they contain cracks, and were predicted correctly; those images are stored in a database at a specified directory for future reference. Figures 11, 12, 13, 14 and 15 show images without cracks, which were likewise classified correctly.

7 Conclusion

The traditional crack detection method is no longer efficient and is highly time-consuming, and the fact that it requires skilled personnel makes it cost-inefficient as well. Visual inspection via video feed falls short when faced with noise in the video feed and low contrast, and crack detection on steel and other metal elements is very hard with any conventional method. All these disadvantages make crack detection difficult, and small undetected cracks can lead to large disasters. This paper has proposed a solution consisting of a model that uses the device's webcam to classify the live video feed and detect cracks. We have built our model on a CNN architecture trained with 40,000 datasets, 20,000 each positive and negative (with and without cracks, respectively). We have fixed a batch size of 32, and the model is trained for 99 epochs every time. We have added

Fig. 12 3D printed component without crack

Fig. 13 3D printed component 2 without crack



Fig. 14 Block without crack

Fig. 15 Component without crack



four image classification layers, of which three use ReLU and one uses softmax. The model is then tested using random images obtained from the internet in both the positive and negative categories. It is observed that our model attains around 95% accuracy with an average confidence level of 90–97%. The confidence level indicates how precisely the model is performing when giving us the output. The images classified as positive are stored in a specified directory so that the necessary actions can be taken.

References

1. Park SE, Eem S-H, Jeon H (2020) Concrete crack detection and quantification using deep
learning and structured light. Constr Build Mater 252:119096. ISSN 0950-0618
2. Kalfarisi R, Wu ZY, Soh K (2020) Crack detection and segmentation using deep learning with
3D reality mesh model for quantitative assessment and integrated visualization. J Comput Civ
Eng 34(3):04020010
3. Kumar A, Kumar A, Jha AK, Trivedi A (2020) Crack detection of structures using deep learning
framework. In: 2020 3rd international conference on intelligent sustainable systems (ICISS),
Thoothukudi, India, pp 526–533
4. Dung CV, Anh LD (2019) Autonomous concrete crack detection using deep fully convolutional
neural network. Autom Constr 99:52–58. ISSN 0926-5805
5. Yang J, Wang W, Lin G, Li Q, Sun Y, Sun Y (2019) Infrared thermal imaging-based crack
detection using deep learning. IEEE Access 7:182060–182077
6. Mandal V, Uong L, Adu-Gyamfi Y (2018) Automated road crack detection using deep convo-
lutional neural networks. In: 2018 IEEE international conference on big data (big data), Seattle,
WA, USA, pp 5212–5215
7. Lee D, Kim J, Lee D (2019) Robust concrete crack detection using deep learning-based semantic
segmentation. Int J Aeronaut Space Sci 20:287–299
8. Zou Q, Zhang Z, Li Q, Qi X, Wang Q, Wang S (2019) DeepCrack: learning hierarchical
convolutional features for crack detection. IEEE Trans Image Process 28(3):1498–1512
9. Chen F, Jahanshahi MR (2018) NB-CNN: deep learning-based crack detection using convolu-
tional neural network and Naïve Bayes data fusion. IEEE Trans Industr Electron 65(5):4392–
4400
10. Zhang L, Yang F, Daniel Zhang Y, Zhu YJ (2016) Road crack detection using deep convolutional
neural network. In: 2016 IEEE international conference on image processing (ICIP), Phoenix,
AZ, USA, pp 3708–3712
11. Torok MM, Golparvar-Fard M, Kochersberger KB (2014) Image-based automated 3d crack
detection for post-disaster building assessment. J Comput Civ Eng 28(5):A4014004
12. Makantasis K, Protopapadakis E, Doulamis A, Doulamis N, Loupos C (2015) Deep convo-
lutional neural networks for efficient vision-based tunnel inspection. In: 2015 IEEE interna-
tional conference on intelligent computer communication and processing (ICCP), Cluj-Napoca,
Romania, pp 335–342
13. Zhang A, Wang KCP, Li B, Yang E, Dai X, Peng Y, Fei Y, Liu Y, Li JQ, Chen C (2017)
Automated pixel-level pavement crack detection on 3D asphalt surfaces using a deep-learning
network. Comput-Aided Civil Infrastr Eng 32(10):805–819
14. Wang X, Hu Z (2017) Grid-based pavement crack analysis using deep learning. In: 2017 4th
international conference on transportation information and safety (ICTIS), Banff, AB, Canada,
pp 917–924
15. Kim B, Cho S (2018) Automated vision-based detection of cracks on concrete surfaces using
a deep learning technique. Sensors 18(10):3452
Performance Analysis of TDM-PON
and WDM-PON

Raju Sharma, Monika Pathak, and Anuj Kumar Gupta

Abstract Nowadays, fiber technology offers users many features, such as voice and video traffic and high data bandwidth, in new applications. Passive optical network (PON) and fiber to the home (FTTH) technologies have changed our lives owing to their reduced cost. PON technology allows a single optical fiber to support multiple users. The most widely used multiplexing techniques are time division multiplexing (TDM) and wavelength division multiplexing (WDM). A TDM-PON provides a large number of channels but moderate bandwidth and uses a single wavelength, while a WDM-PON uses multiple wavelengths in a single fiber to multiply the capacity without increasing the data rate. In this paper, the multiple-access techniques most widely used in optical networks are explained, and the structure and types of passive optical networks are investigated. TDM-PON and WDM-PON are analyzed using OptiSystem 17.

Keywords Optical network · Passive optical network · Active optical network · TDM-PON · WDM-PON

1 Introduction

Optical fiber is a flexible, transparent fiber made of plastic or glass, with a diameter approximately equal to that of a human hair. It is used to carry light from one end to the other. Compared to wire cables, optical cables have higher bandwidth and are used to transmit information in the optical domain. Because optical fiber provides higher bandwidth, the capacity of the system is also larger. Optical fibers have a small diameter; thus, these cables are small in size and light in weight. Optical fiber

R. Sharma
Baba Banda Singh Bahadur Engineering College, Fatehgarh Sahib, Punjab, India
M. Pathak
Multani Mal Modi College, Patiala, Punjab, India
A. K. Gupta (B)
Chandigarh Group of Colleges, Mohali, Punjab, India


Fig. 1 Block diagram of basic optical network

has a lower cost per unit length compared to traditional links, and crosstalk is also lower in optical fibers [1].
Communication is a way of conveying information from one point to another. Due to the increased demand for higher bandwidth, fiber-optic communication systems are used instead of traditional systems. The capacity of a system is directly related to the carrier frequency and bandwidth, and fiber-optic communication is the best choice for transmitting information at rates beyond gigabits per second. Electrical signals are converted into light signals and transmitted through the optical fiber. Larger bandwidth, low attenuation, long-distance transmission, light weight and smaller diameter are some features of optical fiber systems [2, 3].
Optical fiber transmits signals in the form of light from one place to another, whereas copper-wire transmission carries the information in the form of electrical signals. A basic diagram of the communication system is shown in Fig. 1. The transmitter circuitry converts the input data into a light signal using a light source; an LED is the main source, providing stable frequency, phase and amplitude with minimal fluctuations for efficient transmission. The information in the form of light is sent to the receiver through the fiber-optic cable, where a photodetector converts the light signal back into an electrical signal [4].

2 Optical Network

A passive optical network (PON) delivers data from a single transmission point
to multiple user endpoints using point-to-multipoint topology and optical splitters.
Active optical network (AON) and passive optical network (PON) are the two optical
distribution networks (ODNs). The difference between these two is based on how
the signal is split between multiple fibers that are going to each customer [5].

Active Optical Network: In an AON, electrically powered devices are used to direct the signal to the relevant customer. Switching devices and Ethernet are commonly used to route the signal to up to about 500 customers [6].
Passive Optical Network: In a PON, optical splitters that do not require any electrical power are used to send the information to each customer. Around 128 end users can be handled by each switching cabinet [7].
Depending on the data multiplexing scheme, passive optical networks are of three types: time division multiplexing PON (TDM-PON), wavelength division multiplexing PON (WDM-PON) and orthogonal frequency division multiplexing PON (OFDM-PON). Initially, TDM-PON technology was deployed, in which the upstream and downstream traffic to and from multiple optical network units is TDM multiplexed; BPON, EPON and GPON are the main examples of TDM-PON [8]. WDM-PON provisions bandwidth to the optical network units by using multiple wavelengths and provides data rates higher than 40 Gbps, while OFDM-PON transmits traffic to and from the optical network units by using a number of orthogonal subcarriers, providing Tbps data rates [9]. This paper mainly deals with the performance comparison of TDM-PONs and WDM-PONs.

3 System Modeling

3.1 TDM-PON

A unidirectional TDM-PON with only downstream transmission is analyzed in this paper. Figure 3 shows the simulation model of the unidirectional TDM-PON with 2 users. The transmitter block consists of a User Defined Bit Sequence Generator, used to generate the data sequence at a rate of 2.5 Gbps, an NRZ pulse generator, a Continuous Wave Laser with wavelength 1490 nm and power 0.2 dBm, and a Mach–Zehnder modulator that varies the intensity of the light from the CW laser according to the output of the NRZ pulse generator [10].
The output of the transmitter is sent to the receiver through an optical fiber channel with an attenuation of 0.2 dB/km, which reduces the power level of the optical signal. To split the signal between the users, a power splitter is used in the receiver section, and an APD photodetector is used to detect the signals. The output of the detector is fed to a low-pass filter to remove the higher frequency components. To recover the original signal, the filtered output is given to a 3R regenerator, and the output is then analyzed using a BER analyzer [11].

3.2 Unidirectional WDM-PON

A unidirectional WDM-PON with only downstream transmission is analyzed in this paper. Figure 4 shows the simulation model of the unidirectional WDM-PON with

Fig. 2 Active and passive optical network



Fig. 3 Simulation setup of TDM-PON

2 users. Two different wavelengths, 1490 and 1491 nm, are used for the two users, with different fiber lengths, to simulate the WDM-PON. The transmitter block consists of a User Defined Bit Sequence Generator, used to generate the data sequence at a rate of 2.5 Gbps, NRZ pulse generators that encode the signal generated by the bit sequence generator, Continuous Wave Laser sources with wavelengths 1490 and 1491 nm and power 0.2 dBm, and Mach–Zehnder modulators, optical modulators that vary the intensity of the light from the CW lasers according to the output of the NRZ pulse generators [12]. These signals are combined by the WDM multiplexer and sent to the receiver through the optical fiber. A WDM demultiplexer splits the signals according to their wavelengths, and APD photodetectors detect them [13]. The output of each detector is fed to a low-pass filter to remove the higher frequency components. To recover the original signal, the filtered output is given to a 3R regenerator, and the output is then analyzed using a BER analyzer (Figs. 3, 4; Tables 1, 2).

4 Results and Discussion

4.1 Performance Analysis of Unidirectional TDM-PON

The Q factor and bit error rate (BER) of the unidirectional TDM-PON are analyzed using the OptiSystem 17 simulator software. Figure 5 shows the graph of Q factor versus fiber length. From the graph, it is clear that the Q factor varies with distance from 30 to 70 km at a power of 0.2 dBm: as the distance increases, the Q factor decreases. Figure 6 shows the graph of BER versus fiber length for the unidirectional TDM-PON with 2 users, plotted for distances of 30–70 km at an input power of 0.2 dBm. From the graph, it is clear that the log BER increases with increasing fiber length (Tables 3, 4 and 5).
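For readers cross-checking these curves outside OptiSystem, the Q factor and BER are linked by the standard Gaussian-noise approximation BER = (1/2)·erfc(Q/√2). A small Python illustration follows (ours, not part of the reported simulation; the simulator's internal estimator may give somewhat different values):

from math import sqrt, log10
from scipy.special import erfc

def ber_from_q(q):
    """Standard Gaussian-noise relation between Q factor and BER."""
    return 0.5 * erfc(q / sqrt(2.0))

# Q = 6 corresponds to the classic BER of roughly 1e-9
for q in (6.0, 7.0, 8.0):
    print(f"Q = {q:4.1f} -> log10(BER) = {log10(ber_from_q(q)):6.2f}")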

Fig. 4 Simulation setup of WDM-PON

Table 1 TDM-PON parameters and values
Component Parameter Value
Light source Frequency 1490 nm
Light source Power 0.2 dBm
Bit sequence generator Bit rate 2.5 Gbps
Photodetector Responsivity 1 A/W
Photodetector Dark current 10 nA
Modulator Modulation format NRZ pulse generator
Optical fiber Fiber length 70 km

Table 2 WDM-PON parameters and values
Component Parameter Value
Light source Wavelength 1490, 1491 nm
Bit sequence generator Bit rate 2.5 Gbps
Optical fiber Fiber length 30–70 km
Modulator Modulation format NRZ
WDM MUX Insertion loss 0 dB
WDM MUX Filter type Bessel
WDM MUX Filter order 2
Photodetector Responsivity 1 A/W
Photodetector Dark current 10 nA

Fig. 5 Fiber length versus Q factor in TDM-PON


Fig. 6 Fiber length versus log BER in TDM-PON

Table 3 Analysis of TDM-PON with two users
Analysis User 1 User 2
Max. Q factor 11.734 10.8628
Min. BER 3.9998e−032 8.12313e−028
Eye height 3.82466e−005 3.70775e−005
Threshold 1.94108e−005 1.93769e−005
Decision inst. 0.5 0.5

Table 4 Analysis of Q factor at different fiber lengths in TDM-PON
Fiber length (km) Q factor (user 1) Q factor (user 2)
30 25.7450203563969 25.6217112569758
40 18.7445941555153 19.0761016809819
50 14.6084060629338 14.5367789388854
60 12.9602919720475 12.6039408849564
70 10.9826634424671 11.0369132461994

Table 5 Analysis of log BER at different fiber lengths in TDM-PON
Fiber length (km) Log BER (user 1) Log BER (user 2)
30 − 136.696425880475 − 131.871703690725
40 − 65.5338820174283 − 69.4982412221475
50 − 30.7852138496346 − 29.2341041615242
60 − 20.9588806943095 − 19.9396570207902
70 − 18.2563287355932 − 17.8496398578652

4.2 Performance Analysis of Unidirectional WDM-PON

The Q factor and bit error rate (BER) of the unidirectional WDM-PON are also analyzed using the OptiSystem 17 simulator software. Figure 7 shows the graph of Q factor versus fiber length. From the graph, it is clear that the Q factor varies with distance from 30 to 120 km at a power of 0.2 dBm: as the distance increases, the Q factor decreases. It is also seen that the distance covered by the transmitted signal has increased compared with the TDM-PON system. Figure 8 shows the graph of BER versus fiber length for the unidirectional WDM-PON with 2 users.

Fig. 7 Fiber length versus Q factor in WDM-PON



Fig. 8 Fiber length versus log BER in WDM-PON

Table 6 Analysis of WDM-PON with two users
Analysis User 1 User 2
Max. Q factor 8.83901 8.59285
Min. BER 3.6527e−019 3.20764e−018
Eye height 7.75376e−005 7.65473e−005
Threshold 2.30269e−005 2.30761e−005
Decision inst. 0.515625 0.515625

The graph is plotted for distances of 30–120 km at an input power of 0.2 dBm. From the graph, it is clear that the log BER increases with increasing fiber length. When compared with the TDM-PON, it is seen that the distance covered by the transmitted signal has increased and the number of errors in the received signal has decreased (Tables 6, 7 and 8).

5 Conclusion

Nowadays, fiber technology provides users with many features, such as voice and video traffic and high data bandwidth in new applications, and PON and FTTH technologies have changed our lives owing to their reduced cost. A TDM-PON provides a large number of channels but moderate bandwidth on a single wavelength, while a WDM-PON uses multiple wavelengths in a single fiber to multiply the capacity without increasing the data rate. From the analysis, it is found that the Q factor decreases as the fiber length

Table 7 Analysis of Q factor at different fiber lengths in WDM-PON
Fiber length (km) Q factor (user 1) Q factor (user 2)
30 9.91473094169505 10.0184688890305
40 9.68483591303112 9.81006848143759
50 9.50867888729148 9.4396154415423
60 9.03767060448041 9.18950227564523
70 8.60978326226622 8.62164096497156
80 8.08698085456781 7.8978115905578
90 7.2218760271851 7.06188489133579
100 5.12906156669137 4.996113866206
110 4.37594478696312 4.42764967658026
120 3.38140926645459 3.32140926645459

Table 8 Analysis of log BER at different fiber lengths in WDM-PON
Fiber length (km) Log BER (user 1) Log BER (user 2)
30 − 9.16205060780433 7.88017622948094E−010
40 − 8.90580981445384 1.39173182477758E−009
50 − 8.40189333453321 2.86013051305138E−009
60 − 8.1556215607284 1.36540921489852E−008
70 − 7.52028437490084 2.83935175641473E−008
80 − 6.78973343653168 9.57176785906436E−008
90 − 6.0689719236632 3.92406775977567E−007
100 − 4.14736061941337 0.000103811318328781
110 − 3.98368487291056 0.000103811318328781
120 − 3.09307686099912 0.00103193560648707

increases and the BER increases with increasing fiber length. Compared with the TDM-PON, the received signal quality and the distance covered by the transmitted signal are improved, and the number of errors in the received signal is reduced, in the WDM-PON. For next-generation passive optical networks, WDM-PON is therefore the best choice compared with TDM-PON.

References

1. Agrawal GP (2012) Fiber-optic communication systems, vol 222. Wiley


2. Keiser G (2003) Optical fiber communications. Wiley encyclopedia of telecommunications
3. Singh P, Gupta AK, Singh R (2020) Improved priority-based data aggregation congestion
control protocol. Mod Phys Lett B 34(02):2050029. https://doi.org/10.1142/S02179849205
00293

4. Singh R, Ahlawat M, Sharma D (2017) A review on radio over fiber communication system.
Int J Enhanced Res Manage Comput Appl 6(4):23–29
5. Koman E, Ünverdi NÖ (2019) Analysis and applications of the hybrid WDM/TDM passive optical networks. Sigma J Eng Nat Sci Mühendislik ve Fen Bilimleri Dergisi 37(4)
6. Hamadouche H, Merabet B, Bouregaa M (2020) The performance comparison of hybrid
WDM/TDM, TDM and WDM PONs with 128 ONUs. J Opt Commun 1
7. Mahajan G, Kaur H (2017) Comparative analysis of dispersion compensation techniques for
hybrid WDM/TDM passive optical network. Int J Adv Res Comput Sci 8(5)
8. Mandal GC, Patra AS (2017) High capacity hybrid WDM/TDM-PON employing fiber-to-the-
home for triple-play services with 128 ONUs. J Opt 46(3):347–351
9. Conombo S et al (2020) WDM-OIL-TDM hybrid passive optical network. In: International
conference on electrical, communication, and computer engineering (ICECCE). IEEE
10. Singh A et al (2020) TWDM-PON, the enhanced PON for triple play services. In: 2020 5th
IEEE international conference on recent advances and innovations in engineering (ICRAIE).
IEEE
11. Kumar A, Randhawa R (2018) An improved hybrid WDM/TDM PON model with enhanced
performance using different modulation formats of WDM Transmitter. J Opt Commun 1
12. Dhaam HSZ, Wadday AG, Ali FM (2020) Performance analysis of high speed bit-interleaving
time-division multiplexing passive optical networks (TDM-PONs). J Phys Conf Ser 1529(3)
13. Belete M (2020) Performance analysis of TDM-GPON system using different parameters.
Dissertation
Detection of Microcalcifications
in Mammograms Using Edge Detection
Operators and Clustering Algorithms

Navneet Kaur Mavi and Lakhwinder Kaur

Abstract Mammography is a popular modality used by radiologists to detect affected areas in the human breast. Microcalcification detection is not an easy task because of their small size. This paper proposes a methodology for the enhancement of low contrast mammograms, after which three edge detection operators are applied together with K-means clustering and fuzzy C-means clustering to find microcalcification clusters in mammograms taken from the MIAS database. To validate the work, the proposed methodology is tested on different mammographic images, and the results obtained are compared on the basis of different metrics such as accuracy, mean square error (MSE) and peak signal-to-noise ratio (PSNR).

Keywords Breast cancer · Mammogram · BBHE · Fuzzy C-means · K-means clustering

1 Introduction

Breast cancer is a major cause of death among women of all ages, and breast screening programs are held to detect the early signs of breast cancer [1]. Computer-aided detection systems are popular among radiologists for the early detection of cancerous parts in a digital mammogram. Microcalcifications may appear at different parts of a mammogram, or in clusters at one place, unlike a mass, which is a lesion visible more clearly in the breast tissue. Because of their small size, such as 0.3 mm, it is not an easy task to segment the microcalcifications in a mammogram [2].
A mammogram, as shown in Fig. 1, consists of different segments: the foreground breast segment, the dark background area, the pectoral muscle, extra labels, etc. It is important that, prior to cancer detection, the breast part is separated from the unwanted background and all these artifacts are removed. There are two views of a mammogram: one is the CC or craniocaudal view, called the top view, and
N. K. Mavi (B) · L. Kaur


Department of Computer Science and Engineering, Punjabi University, Patiala, Punjab, India


Fig. 1 MLO view of mammogram with its various parts

another is the mediolateral oblique (MLO) view, which is a left or right side view of the breast. Figure 1 shows the MLO view of a mammogram, which is used in our research methodology.

2 Related Work

Many researchers have presented various techniques for microcalcification detection. Loizidou et al. [3] proposed an automated system for detecting breast microcalcifications and classifying them using temporal subtraction of mammograms.
Different features are extracted such as the shape features, intensity-based features,
FOS features and GLCM features. Comparison is done on the basis of accuracy,
sensitivity and specificity using total six classifiers, i.e., LDA, NB, k-NN, SVM,
DT and EDT. This method achieved 99.02% accuracy for classification of MC’s as
benign or malignant. Azlan et al. [4] introduced a novel method for classification of
masses as benign or malignant. Segmentation is carried out using regionprops algo-
rithm, but before that, snake algorithm is applied to obtain the ROI. After feature
extraction, classification is carried out using PCA and SVM.
Hossain et al. [5] introduced an automatic system for microcalcifications detec-
tion. Firstly, preprocessing is carried out to enhance the image using Laplacian filter.
After this, suspicious regions are segmented into positive and negative patches using
C-means clustering algorithm. The positive patches are further utilized for training
a modified U-net segmentation network. This trained network is used for automatic
detection of microcalcifications. Sadad et al. [6] proposed a CAD system in which
during preprocessing, the image is cropped to obtain the required ROI only. Then
segmentation technique, i.e., Fuzzy C-means Region Growing (FCMRG), is applied
to obtain the mass in the image. Further different features are extracted such as

LBP-GLCM and local phase quantization (LPQ). Next feature selection is done on
the basis of minimum-redundancy maximum-relevancy (mRMR) algorithm. At the
end, classification is carried out using different classifiers to differentiate the benign
tumors from the malignant ones.
Taghanaki et al. [7] proposed a very effective method of extracting the pectoral
muscle from the MLO view of the mammogram of all breast types as it increases
the FPR in CADx. Bekker et al. [8] proposed a two-step classification method for
classifying the clusters of microcalcifications as benign or malignant. In this work,
expectation–maximization (EM) algorithm is also combined with linear regression
classifiers to find maximum likelihood parameters for a model. Quellec et al. [9]
described a CADx system for breast cancer diagnosis. In this work, the input image
is first preprocessed, and then, the breast area is segmented. This output image is
then partitioned adaptively into regions with respect to their distance from the breast
edges. Then from each region, features are derived along with the texture features
for classification of normal or abnormal regions.
Chen et al. [10] proposed a novel method for the classification of microcalcifica-
tions in mammograms. In this paper, the shape, morphological and cluster features
are extracted for individual microcalcifications. Morphological and shape features
like roughness, shape and size are extracted, and other than these features, cluster
features are also extracted that provide the global properties of the clusters. This work is distinct from previous approaches in that it focuses on the morphology of microcalcification clusters, such as cluster area, perimeter, eccentricity, circularity and more. Elmoufidi et al. [11] proposed a combined approach of the K-means algorithm and seed-based region growing to segment the mammogram into different parts based on some parameters. Sahli et al. [12] detected microcalcifications on the basis of multifractal analysis; in this approach, the researchers decompose the mammogram into different regions and take the q-structure functions of each part to compare with multifractal spectrums.
Bharadwaj et al. [13] proposed the fuzzy C-means clustering technique for differentiating the breast region. For eliminating the nonessential areas, the image is continuously divided using the top-hat transform.
transform is used. To analyze the pixels pattern to match with the clique patterns
and detection of microcalcification in the mammogram, a method named Gibbs
random fields is employed. Thresholding is carried out on the final image where
the microcalcifications are detected. Various histogram equalization-based prepro-
cessing techniques such as CLAHE, BBHE and MMBEBHE have been explained
by Garg et al. [14].
K9 segmentation is explained by Abdellatif et al. [15] for pectoral muscle removal
and mass detection which is based on entropy thresholding technique. Fuzzy-based
segmentation is another method used by the researchers. Quintanilla et al. [16]
proposed a two-step approach for microcalcifications detection. First is preprocessing
by using morphology, and second is segmentation by using different clustering mech-
anisms such as hard clustering, partitioning clustering, fuzzy clustering and K-means
clustering techniques for the detection of MC clusters. Mohanalin et al. [17] presented
an automated method for microcalcifications detection based on type II fuzzy index.

They used the Tsallis entropy parameter for thresholding and achieved a higher rate of true positives compared to the non-fuzzy approach.
A combination of the region growing method with top-hat transforms is used by Sultana et al. [18] to detect cancerous areas in a mammogram; their algorithm achieved a very low false positive rate and an efficient detection rate. For intensity-based segmen-
tation of microcalcifications, Bhattacharya and Das [19] used morphological trans-
forms and fuzzy C-means clustering techniques. According to the authors, morpho-
logical operators are best for the enhancement of mammograms. Abdel-Dayem et al. [20] proposed a mass detection method based on thresholding, in which the optimal threshold value is obtained by reducing the entropy of the mammogram. Bhangale et al. [21] detected clusters of microcalcifications by employing Gabor filters and a K-means clustering algorithm based on Euclidean distance for the segmentation of the mammogram, correctly detecting the abnormal areas.

3 Preprocessing

Preprocessing is the first step in mammography; it is used to remove unwanted areas from the mammogram and to extract the abnormal tissues without interference from the background tissue. Preprocessing consists of removing unwanted artifacts, extracting the breast border, removing the pectoral region and enhancing the breast region. In the preprocessing step, only the essential information is retained and enhanced for further detection purposes. Preprocessing also decreases the effects of noise in a mammogram, which may otherwise cause many FPs in the segmentation and detection process. In this paper, we use the brightness preserving bi-histogram equalization (BBHE) method for enhancement. The methodology used for detection is represented in Fig. 2.

3.1 Brightness Preserving Bi-Histogram Equalization (BBHE)

The BBHE method is used to preserve the brightness of the mammogram, making it easier to segment the mammogram into its constituent parts. The BBHE technique divides the mammogram into two sub-images, called the lower and upper histograms, based on the mean intensity value (X_m) of all pixels. This technique is represented as (Fig. 3):

X = X_L ∪ X_U

where X_L is the lower sub-image:

X_L = {X(i, j) | X(i, j) ≤ X_m, ∀ X(i, j) ∈ X}    (1)

and X_U is the upper sub-image:

X_U = {X(i, j) | X(i, j) > X_m, ∀ X(i, j) ∈ X}    (2)

Fig. 2 Proposed methodology

Fig. 3 a Original image mdb236 taken from MIAS database, b enhanced image using BBHE
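As a concrete illustration of Eqs. (1) and (2), a minimal NumPy sketch of the BBHE idea is given below (our sketch, assuming an 8-bit grayscale image; the file name refers to the MIAS image shown in Fig. 3):

import numpy as np
import cv2

def bbhe(image):
    """Brightness preserving bi-histogram equalization: split the image
    at its mean intensity and equalize the two halves independently."""
    x_m = int(image.mean())                      # mean intensity X_m
    out = np.zeros_like(image)

    for lo, hi in ((0, x_m), (x_m + 1, 255)):    # lower / upper sub-images
        mask = (image >= lo) & (image <= hi)
        if not mask.any() or hi == lo:
            continue
        hist, _ = np.histogram(image[mask], bins=hi - lo + 1, range=(lo, hi + 1))
        cdf = hist.cumsum() / hist.sum()         # cumulative distribution
        # map each grey level of the sub-image back into [lo, hi]
        out[mask] = lo + np.round(cdf[image[mask] - lo] * (hi - lo)).astype(image.dtype)
    return out

mammogram = cv2.imread("mdb236.pgm", cv2.IMREAD_GRAYSCALE)  # MIAS image
enhanced = bbhe(mammogram)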



After the enhancement of low contrast mammogram so that the breast region
is more visible to the radiologist, the region of interest (ROI) is extracted using a
polynomial mask, and further, microcalcifications detection is carried out on the
breast region only (Fig. 4).
Edge Detection: In the next step, edge detection process is applied on the extracted
region of interest using three operators: Sobel, Prewitt’s and Laplace of Gaussian.
These three operators use 3X 3 masks which convolve on the extracted image.
Morphological Operations: To fill the holes and removing unwanted edges, opening
and closing methods are used in Fig. 5.
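These two steps can be sketched with OpenCV as follows (an illustration under our assumptions; roi.png is a hypothetical placeholder for the extracted ROI, and the structuring element size is a placeholder as well):

import cv2
import numpy as np

roi = cv2.imread("roi.png", cv2.IMREAD_GRAYSCALE)   # extracted breast ROI

# Sobel edge detection with 3 x 3 kernels in both directions
gx = cv2.Sobel(roi, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(roi, cv2.CV_64F, 0, 1, ksize=3)
edges = cv2.convertScaleAbs(np.hypot(gx, gy))       # gradient magnitude

# binarize, then morphological opening (removes spurious edges)
# and closing (fills small holes)
_, binary = cv2.threshold(edges, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
cleaned = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)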

Fig. 4 Extracted ROI using a polynomial mask

Fig. 5 a Sobel edge detection operator and b image obtained by edge detection and morphological
operations

Fig. 6 Fuzzy C-means segmented image

4 Segmentation

Segmentation is a way of dividing an image into multiple sections to simplify its representation by extracting meaningful objects so that it becomes easier to analyse.

4.1 Fuzzy C-Means Clustering

Fuzzy clustering allocates pixels to one class based on their similarity in some predefined way, while allocating dissimilar pixels to other classes. The recognition of clusters is based on likeness, such as distance and intensity value, and clustering is carried out by assigning each pixel a membership with respect to each cluster center. Moreover, clusters are built based on the distance between the chosen pixel and the nearest cluster center. The principle of this algorithm is that a pixel close to a cluster center has a high membership toward that cluster, while a pixel far from the center has a low membership. The membership values and cluster centers are updated after each iteration.
Fuzzy C-Means Clustering Steps:
Let X = {x_1, x_2, x_3, ..., x_n} be the set of pixels and V = {v_1, v_2, v_3, ..., v_c} be the set of cluster centers (Fig. 6).
(1) In the first step, select 'c' cluster centers randomly.
(2) After that, calculate the fuzzy membership function 'μ_ij' using the given formula:

Fig. 7 K-means segmented image


μ_ij = 1 / Σ_(k=1..c) (D_ij / D_ik)^(2/(m−1))    (3)

where D_ij is the distance between pixel x_i and cluster center v_j, and m (> 1) is the fuzziness index.

(3) In the third step, compute the fuzzy cluster centers 'v_j' using the standard FCM update:

v_j = Σ_(i=1..n) (μ_ij)^m x_i / Σ_(i=1..n) (μ_ij)^m    (4)

(4) Steps (2) and (3) are repeated, stopping when the objective function J reaches its minimum or when

‖U^(k+1) − U^(k)‖ < β    (5)

where
k is the repetition count,
β is the 'stoppage criterion', whose value lies between [0, 1],
U = (μ_ij)_(n×c) is the fuzzy membership matrix, and
J is the objective function.
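To make Eqs. (3)–(5) concrete, a compact NumPy sketch of the fuzzy C-means iteration is given below (our illustration on grey-level data, not the exact implementation evaluated in this paper):

import numpy as np

def fuzzy_c_means(pixels, c=3, m=2.0, beta=1e-4, max_iter=100):
    """Minimal FCM on a 1D array of grey levels (a sketch of Eqs. 3-5)."""
    n = pixels.shape[0]
    u = np.random.dirichlet(np.ones(c), size=n)          # memberships (n, c)
    for _ in range(max_iter):
        um = u ** m
        centers = um.T @ pixels / um.sum(axis=0)         # Eq. (4)
        d = np.abs(pixels[:, None] - centers[None, :]) + 1e-9  # distances D_ij
        u_new = 1.0 / (d ** (2.0 / (m - 1.0)))
        u_new /= u_new.sum(axis=1, keepdims=True)        # Eq. (3), normalized
        if np.linalg.norm(u_new - u) < beta:             # Eq. (5)
            u = u_new
            break
        u = u_new
    return centers, u

# segment a grayscale mammogram by clustering its pixel intensities:
# centers, u = fuzzy_c_means(img.reshape(-1).astype(float))
# labels = u.argmax(axis=1).reshape(img.shape)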

4.2 K-Means Clustering

The K-means algorithm is used to segment a mammogram into its constituent segments. The steps behind this technique are as follows:

1. First of all, on the basis of some predefined parameters, select k cluster centers.
2. A pixel is assigned to a cluster center when it has the minimum distance to that center.
3. The average of all pixels in each cluster is computed, and a new cluster center is obtained.
4. Steps (2) and (3) are repeated; the algorithm stops when all the clusters in the mammogram have been formed.
5. In the K-means algorithm, the distance used is the squared difference between the selected pixel and the closest cluster center. If a dataset containing N values is chosen, N = {n_1, n_2, n_3, ..., n_n}, the result of K-means clustering is a set of K clusters, with the distance computed as

d(x, y) = Σ_i |x_i − y_i|^2    (6)

• First of all, calculate the average of the data values within each cluster.
• Update the average value by repeating the above steps until no change happens (Fig. 7).
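The corresponding K-means iteration can be sketched as follows (again our illustration on grey levels; a library implementation such as scikit-learn's KMeans would serve equally well):

import numpy as np

def kmeans(pixels, k=3, max_iter=100):
    """Plain K-means on grey levels using the squared distance of Eq. (6)."""
    centers = np.random.choice(pixels, size=k, replace=False).astype(float)
    for _ in range(max_iter):
        # assign each pixel to the nearest cluster center
        d = (pixels[:, None] - centers[None, :]) ** 2
        labels = d.argmin(axis=1)
        # recompute each center as the average of its assigned pixels
        new_centers = np.array([pixels[labels == j].mean() if (labels == j).any()
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):   # stop when nothing changes
            break
        centers = new_centers
    return centers, labels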

5 Experimental Analysis

The clustering methods are tested using Sobel edge detection on 30 mammograms, which contain fatty (F), fatty glandular (G) and dense glandular (D) images, as well as normal images, taken from the Mammographic Image Analysis Society (mini-MIAS) database; the images may contain benign or malignant calcifications or normal tissue. Mean square error (MSE) and peak signal-to-noise ratio (PSNR) are computed on the mammogram images, and a comparison is performed using the edge detection operators with K-means clustering and fuzzy C-means clustering. Moreover, to measure the accuracy, true positive, false positive, true negative and false negative values are calculated (Tables 1, 2, 3, 4, 5, 6, 7 and 8).
The K-means and fuzzy C-means clustering algorithms used with the Sobel, Prewitt's and LoG edge detection techniques give good results when applied to the MIAS database images.

Table 1 MSE and PSNR values of five fatty glandular mammograms applying K-means clustering algorithm
Image MSE PSNR
Mdb209 6.62e+003 9.91
Mdb211 8.47e+003 8.84
Mdb213 6.18e+003 10.21
Mdb227 7.02e+003 9.66
Mdb233 5.90e+003 10.42

Table 2 MSE and PSNR values of five dense glandular mammograms applying K-means clustering algorithm
Image MSE PSNR
Mdb216 1.20e+004 7.31
Mdb223 4.94e+003 11.19
Mdb226 4.98e+003 11.15
Mdb236 1.01e+004 8.08
Mdb233 1.38e+004 6.71

Table 3 MSE and PSNR values of five fatty tissue mammograms applying K-means clustering algorithm
Image MSE PSNR
Mdb231 6.97e+003 9.69
Mdb238 5.62e+003 10.63
Mdb245 6.46e+003 10.02
Mdb248 9.63e+003 8.29
Mdb256 7.34e+003 9.47

Table 4 MSE and PSNR of five fatty glandular mammograms using fuzzy C-means clustering technique
Image MSE PSNR
Mdb209 6.64e+003 9.90
Mdb211 8.49e+003 8.84
Mdb213 6.19e+003 10.21
Mdb227 7.03e+003 9.66
Mdb233 5.91e+003 10.42

Table 5 MSE and PSNR values of five dense glandular images using fuzzy C-means clustering
Image MSE PSNR
Mdb216 1.20e+004 7.31
Mdb223 4.94e+003 11.18
Mdb226 4.99e+003 11.14
Mdb236 1.01e+004 8.07
Mdb233 1.38e+004 6.70

Table 6 MSE and PSNR values of five fatty tissue mammograms using fuzzy C-means clustering
Image MSE PSNR
Mdb231 6.98e+003 9.69
Mdb238 5.62e+003 10.63
Mdb245 6.46e+003 10.02
Mdb248 7.33e+003 9.46
Mdb256 9.63e+003 8.2

Table 7 Comparison of clustering techniques using edge detection in terms of TP, TN, FP and FN
Algorithm TP FN TN FP
Fuzzy C-means 25 2 2 1
K-means 23 3 3 3

Table 8 Accuracy of FCM and K-means clustering
Algorithm Accuracy (%)
Fuzzy C-means 83.33
K-means 76.66

6 Conclusion and Future Scope

In our research, the proposed methodology uses the K-means and fuzzy C-means clustering methods with three edge detection techniques applied to the mammographic images. To enhance the obtained results, morphological opening and closing functions are used. The fuzzy C-means clustering method gave 83.33% accuracy and K-means gave 76.66% accuracy on a total of 30 mammograms. The results given by the Prewitt's, Sobel and Laplacian of Gaussian operators are almost similar. Moreover, from the results, it is clear that the fuzzy C-means algorithm gives more accurate results and fewer false positives than K-means clustering, and the clusters framed by the fuzzy C-means technique are more compact compared with the technique proposed in [1].
The lower the mean square error and the higher the peak signal-to-noise ratio, the better the results. Therefore, for dense glandular (D) images, more accuracy is obtained using our technique than for fatty (F) and fatty glandular (G) mammograms. In future, more accuracy can be achieved by using a combination of feature extraction techniques.

References

1. Cisneros R, Marin H, Pablos S (2008) Detection and classification of microcalcification clusters in mammograms using evolutionary neural networks. Adv Comput Intell Paradigms Healthcare 107:151–180
2. Sampat MP, Markey MK, Bovik AC (2005) Computer-aided detection and diagnosis in
mammography
3. Loizidou K, Skouroumouni G, Nikolaou C, Pitris C (2020) An automated breast micro-
calcification detection and classification technique using temporal subtraction of mammo-
grams. IEEE Access 8:52785–52795
4. Azlan NAN, Lu C-K, Elamvazuthi I, Tang TB (2020) Automatic detection of masses from
mammographic images via artificial intelligence techniques. IEEE Sens J 20(21):13094–13102
5. Hossain MS (2019) Microcalcification segmentation using modified U-net segmentation network from mammogram images. J King Saud Univ Comput Inf Sci

6. Sadad T, Munir A, Saba T, Hussain A (2018) Fuzzy C-means and region growing based
classification of tumor from mammograms using hybrid texture features. J Comput Sci 29:34–
45
7. Taghanaki SA, Liu Y, Miles B, Hamarneh G (2017) Geometry based pectoral muscle
segmentation from MLO mammogram views. IEEE Trans Biomed Eng 64(11):2662–2671
8. Bekker AJ, Shalhon M, Greenspan H, Goldberger J (2016) Multi-view probabilistic classifi-
cation of breast microcalcifications. IEEE Trans Med Imaging 35(2):645–653
9. Quellec G, Lamard M, Cozic M, Coatrieux G, Cazuguel G (2016) Multiple-instance learning
for anomaly detection in digital mammography. IEEE Trans Med Imaging 35(7):1604–1614
10. Chen Z, Strange H, Oliver A, Denton ERE, Boggis C, Zwiggelaar R (2015) Topological
modeling and classification of mammographic microcalcification clusters. IEEE Trans Biomed
Eng 62(4):1203–1214
11. Elmoufidi A, El Fahssi K, Jai-Andalossai S, Sekkaki A (2015) Automatically density based
breast segmentation for mammograms by using dynamic k-means algorithm and seed based
region growing. IEEE Instr Measur Soc
12. Sahli IS, Bettaieb HA, Abdallah AB, Bhouri I, Bedoui MH (2015) Detection and segmentation
of microcalcifications in digital mammograms using multifractal analysis. In: IEEE conference
of image processing theory, tools and applications, pp 180–183
13. Bharadwaj AS, Celenk M (2015) Detection of microcalcification with top-hat transform and
the gibbs random fields. In: International conference of IEEE engineering in medicine and
biology society, pp 6382–6385
14. Garg G, Sharma P (2014) An analysis of histogram equalization method for brightness
preserving and contrast enhancement. Int J Innov Adv Comput Sci IJIACS 3:1–11
15. Abdellatif H, Taha TE, Zahram OF, Al-Nauimy W, Abd El-Samie FE (2013) K9 auto-
matic segmentation of digital mammograms to detect masses. In: 30th national radio science
conference, Egypt, pp 557–565
16. Quintanilla-Dominguez J, Ojeda-Magana B, Cortina-Januchs MG, Ruelas R, Vega-Corona A,
Andina D (2011) Image segmentation by fuzzy and possibilistic clustering algorithms for the
identification of microcalcifications. Scientia Iranica D:580–589
17. Mohanalin B, Kalra PK, Kumar N (2010) A novel automatic microcalcifications detection
techniques using Tsallis entropy and type II fuzzy index. In: Conference of computer and
mathematics with applications, pp 2426–2432
18. Sultana A, Ciuc M, Florea L, Florea C (2009) Detection of mammographic microcalcifications using a refined region growing approach. Proc IEEE
19. Bhattacharya M, Das A (2007) Fuzzy logic based segmentation of microcalcifications in breast
using digital mammograms considering multiresolution. In: IEEE conference of international
machine vision and image processing, pp 98–105
20. Abdel-Dayem AR, El-Sakka MR (2005) Fuzzy entropy based detection of suspicious masses
in digital mammogram images. In: 27th IEEE conference in medicine and biology, China, pp
4017–4022
21. Bhangale T, Desai UB, Sharma U (2000) An unsupervised scheme for the detection of
microcalcifications in mammograms. In: IEEE conference publications, pp 184–187
Analysis of the Proposed CNN Model
for the Recognition of Gurmukhi
Handwritten City Names of Punjab

Sandhya Sharma, Sheifali Gupta, Neeraj Kumar, and Himani Chugh

Abstract A holistic approach employing different learning rates is proposed for the recognition of handwritten district names of Punjab state written in Gurmukhi script. The learning rate (LR) is a hyperparameter that specifies how much the model can change when the model parameters are updated in response to the error. For the purpose of recognition, a convolutional neural network (CNN) using deep learning is proposed. Initially, a dataset of 4000 images is prepared for ten different city names of Punjab state, and then the CNN is employed. The proposed CNN architecture, having 12 layers, is developed and trained using five different learning rates, 0.00001, 0.0001, 0.001, 0.01 and 0.1, and their performance is analysed for the purpose of text recognition. The best average validation accuracy achieved for the proposed CNN architecture is 93%, and the maximum validation accuracy achieved is 99%, obtained with an LR of 0.0001 on the prepared dataset.

Keywords Word recognition · Convolutional neural network · Gurmukhi words · Holistic approach · Postal automation

1 Introduction

Convolutional neural networks (CNNs) are commonly used in various applications ranging from image detection to voice recognition to text mining to natural language processing (NLP). For deep neural networks (DNNs), optimization is an important step toward achieving high validation accuracy. The main aim of machine

S. Sharma (B) · N. Kumar


Chitkara University Institute of Engineering and Technology, Chitkara University, Solan,
Himachal Pradesh, India
e-mail: sandhya.sharma@chitkarauniversity.edu.in
S. Gupta
Chitkara University Institute of Engineering and Technology, Chitkara University, Chandigarh,
Punjab, India
H. Chugh
Chandigarh Group of Colleges, Ajitgarh, India


learning is to decide how accurately a model can learn from the data using a cost function and then to determine the parameters which reduce this cost function [1]. Learning is thus the process of changing these parameters in order to find the best optimal collection; the cost function is selected based on the training data. To minimize the cost function, a gradient-based method is used during the training process. Gradient descent is an optimization method used to estimate the error gradient and then update the weights of the model using backpropagation [2]. The LR is an important parameter for the optimization of the model. In neural networks, the weights are updated each time an error is generated at the output, and the LR controls how much the model changes in response to the generated error. Choosing an optimal LR is a challenging task: if the LR is too small, the training process will be very long, but if the LR is too high, training will be unstable [3]. In this paper, different LRs are used for the optimization of the proposed CNN model. The CNN model is proposed for the recognition of ten different city names of Punjab state handwritten in Gurmukhi script. Postal automation is one of the potential application areas in the field of text recognition [4–9]; it is helpful in the automatic handling and sorting of mail. Firstly, a dataset of 4000 samples for ten different city names is created, then the proposed CNN using a holistic approach is employed, and the model is analysed with different LRs.

2 Methodology

In the proposed methodology, the first step is to prepare the dataset for ten different
city names of Punjab state followed by digitization and preprocessing, and later, a
CNN model is designed for the purpose of recognition.

2.1 Dataset

For the creation of the Gurmukhi handwritten words of the ten city names, 40 different writers wrote each word ten times, resulting in 4000 samples in total (400 for each city name). The writers were drawn from different age groups and different educational and professional backgrounds [10].

2.2 Digitization and Pre-processing

For digitization, the collected dataset is converted into digital form using a scanner,
and for preprocessing, binarization and normalization are employed; a minimal sketch
of these steps is given after Table 1. Images of the prepared dataset are shown in Table 1.

Table 1 Ten city names used with their corresponding handwritten digitized image in Gurmukhi

S. No.  City name
1       Amritsar
2       Fazilka
3       Hoshiarpur
4       Jalandhar
5       Ludhiana
6       Mansa
7       Mohali
8       Muktsar
9       Pathankot
10      Patiala
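The paper does not list the exact preprocessing calls; the following is a minimal OpenCV-based sketch of the binarization and normalization steps, where the file path and the assumption that 64 × 32 means width × height are hypothetical.

```python
import cv2
import numpy as np

def preprocess(path, size=(64, 32)):
    """Scanned word image -> binarize (Otsu) -> resize -> normalize to [0, 1]."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Otsu's method picks the binarization threshold automatically.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    resized = cv2.resize(binary, size)          # (width, height) in OpenCV
    norm = resized.astype(np.float32) / 255.0   # normalization step
    # Replicate to 3 channels to match the model's 64 x 32 x 3 input.
    return np.stack([norm] * 3, axis=-1)
```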

2.3 Processing

In this step, a CNN having 12 layers is designed. The CNN model has three main layer
types: convolution, pooling and fully connected layers [11], as per the diagram shown in
Fig. 1. The main purpose of the convolution and pooling layers is to extract abstract
features, while the output layer, which is the fully connected layer, delivers
the output in the form of output classes.
The proposed model with 12 layers is shown in Fig. 2. At first, an input image
of size 64 × 32 × 3 is given to the first convolution layer, which has 32 filters of
size 3 × 3 and a stride of 1 × 1; the 32 feature maps generated are passed through
the ReLU activation function. Next, a maxpooling layer with a pool size of 2 × 2 and
a stride of 2 × 2 is introduced, which reduces each feature map by a factor of 2. The
next convolution layer has 64 filters of size 3 × 3 and a stride of 1 × 1, and is followed
by a maxpooling layer of size 2 × 2 and a stride of 2 × 2. Next is again a convolution
layer, with 128 filters of size 3 × 3, followed by a maxpooling layer. The last convolution
layer is added with 256 filters of size 3 × 3 and is again followed by a maxpooling layer.
The combination of convolution and maxpooling layers is followed by a fully connected
block which has 2048 neurons in its input layer, 120 neurons in the middle layer and
10 neurons in the output layer.

Fig. 1 Overview of CNN model

Fig. 2 Proposed model with layers and their filter sizes
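As a concrete reading of this description, here is a minimal Keras sketch of the proposed architecture; the 'same' padding is an assumption, chosen because it makes the flattened feature map work out to exactly 2048 (4 × 2 × 256) neurons, and the 0.5 dropout is taken from the training parameters in Sect. 3.

```python
from tensorflow.keras import layers, models

def build_model(num_classes=10):
    """Sketch of the 12-layer CNN: 4 conv + 4 pool layers and a 2048-120-10 head."""
    return models.Sequential([
        layers.Conv2D(32, (3, 3), strides=(1, 1), padding="same",
                      activation="relu", input_shape=(64, 32, 3)),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
        layers.Conv2D(64, (3, 3), strides=(1, 1), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
        layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
        layers.Conv2D(256, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
        layers.Flatten(),                       # 4 x 2 x 256 = 2048 neurons
        layers.Dropout(0.5),
        layers.Dense(120, activation="relu"),   # middle layer
        layers.Dense(num_classes, activation="softmax"),
    ])
```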

3 Experiments and Results

In this section, the performance of the model is evaluated based upon the parameters
training and validation loss as well as accuracy. For the evaluation, the model is
trained on 3200 images and tested on 800 images from the prepared dataset
of 4000 images, such that 80 samples from each class are reserved for
validation, while 320 samples from each class are used for training the
model. The model's evaluation based upon these parameters is analysed for five different
LRs. The training parameters for the proposed work are given below.
• Optimizer: Adam
• Learning rate: 0.00001, 0.0001, 0.001, 0.01 and 0.1
• Number of epochs: 10
• Dropout: 0.5
• Number of layers: 12.
The proposed model is implemented on the dataset of 4000 images using Python
with the help of Keras and TensorFlow, which are machine learning libraries [12].
TensorFlow is mainly a deep learning library, while Keras is a neural network library.
In the next section, parameters such as loss and accuracy are computed on the training
and validation datasets for five different LRs, namely 0.00001, 0.0001, 0.001, 0.01
and 0.1, while the optimizer used for evaluation of the model is Adam. A minimal
training sketch with these settings follows.
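The following is a minimal sketch of the LR sweep under the listed training parameters; the dummy arrays stand in for the real 3200/800 split, one-hot labels are assumed, and build_model is the hypothetical helper from the earlier sketch.

```python
import numpy as np
from tensorflow.keras.optimizers import Adam

# Dummy stand-ins for the real 3200-image training and 800-image validation split.
x_train = np.random.rand(3200, 64, 32, 3)
y_train = np.eye(10)[np.random.randint(0, 10, 3200)]
x_val = np.random.rand(800, 64, 32, 3)
y_val = np.eye(10)[np.random.randint(0, 10, 800)]

histories = {}
for lr in (0.00001, 0.0001, 0.001, 0.01, 0.1):
    model = build_model()  # fresh, untrained weights for every LR
    model.compile(optimizer=Adam(learning_rate=lr),
                  loss="categorical_crossentropy",  # assumes one-hot labels
                  metrics=["accuracy"])
    # 10 epochs per LR, as listed in the training parameters above.
    histories[lr] = model.fit(x_train, y_train,
                              validation_data=(x_val, y_val),
                              epochs=10).history

for lr, h in histories.items():
    print(lr, "max validation accuracy:", max(h["val_accuracy"]))
```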

3.1 Recognition Results Achieved by Using LR of 0.00001

In this section, the training as well as validation loss (Fig. 3a) and accuracy (Fig. 3b)
are computed using the LR of 0.00001, which is a very small LR.

Fig. 3 Parameters convergence plot obtained using Adam optimizer with a LR of 0.00001: a training
and validation loss, b training and validation accuracy, c confusion matrix

It can be observed from the smoothly changing curves without any spikes in Fig. 3a, b
that learning is slow. Training and validation loss started from values around 3.0 on the 1st
epoch and decreased linearly after that, while training and validation accuracy started
from around 0.2 on the 1st epoch and increased linearly with each epoch. Figure 3c is
the confusion matrix, which shows the predicted values against the true values. It can be
observed that various city names are misclassified with a LR of 0.00001.

3.2 Recognition Results Achieved by Using LR of 0.0001

Now, the LR is changed to 0.0001. Loss and accuracy on the training and validation
datasets can be observed in Fig. 4a, b. It can be seen from the smoothly changing
curves without any spikes that learning is slow. Training and validation loss started
from around the value 1.5 on the 1st epoch and decreased linearly after that, while
training and validation accuracy started from around 0.5 on the 1st epoch and increased
linearly with each epoch.

Fig. 4 Parameters convergence plot obtained using Adam optimizer with a LR of 0.0001: a training
and validation loss, b training and validation accuracy, c confusion matrix

Figure 4c is the confusion matrix, which shows the predicted values against the true
values. It can be observed that samples from only two classes are misclassified: the city
'Fazilka' is predicted as 'Ludhiana', while two samples of 'Muktsar' are predicted as 'Mansa'.

3.3 Recognition Results Achieved by Using LR of 0.001

Now, the value of the LR is increased from 0.0001 to 0.001. Training and validation
loss is shown in Fig. 5a, while accuracy is shown in Fig. 5b. It can be observed from
Fig. 5a that the validation loss is very high on the 1st epoch but later starts decreasing
with each epoch, while the training loss is close to 1 on the 1st epoch. Also, the
validation accuracy was very low on the 1st epoch while the training accuracy is close
to 0.8 on the same epoch, as can be observed in Fig. 5b. From the confusion matrix of
Fig. 5c, it can be observed that with this LR the model has misclassified various samples
from several classes.

Fig. 5 Parameters convergence plot obtained using Adam optimizer with a LR of 0.001: a training
and validation loss, b training and validation accuracy, c confusion matrix

One such example can be seen in Fig. 5c: ten samples out of the 80 validation samples
of class 'Fazilka' are misclassified as 'Amritsar'.

3.4 Recognition Results Achieved by Using LR of 0.01

Now, the LR is further increased to 0.01. It can be observed from Fig. 6a, which shows
the training and validation loss, that the validation loss is around 0.6 on the 1st epoch,
and after that it increases and later decreases again. Similarly, Fig. 6b shows that the
validation accuracy on the 1st epoch is around 0.75, but the curve again decreases and
increases. The confusion matrix for the same is shown in Fig. 6c, which shows that the
model has again misclassified a few samples of the classes.

Fig. 6 Parameters convergence plot obtained using Adam optimizer with a LR of 0.01: a training
and validation loss, b training and validation accuracy, c confusion matrix

3.5 Recognition Results Achieved by Using LR of 0.1

Figure 7a, which shows the training and validation loss, and Fig. 7b, which shows the
training and validation accuracy, indicate that the model with the LR of 0.1 is highly
unstable. Figure 7a shows that the validation loss on the 7th epoch is close to 400. The
confusion matrix for the same is shown in Fig. 7c, which shows that the model has
misclassified many samples across various classes.

Fig. 7 Parameters convergence plot obtained using Adam optimizer with a LR of 0.1: a training
and validation loss, b training and validation accuracy, c confusion matrix

4 Comparative Analysis of Different LRs

In this section, a comparison of all the LRs is presented for the parameters training
loss, validation loss, training accuracy and validation accuracy corresponding to each
epoch, as shown in Table 2. Average values obtained for each parameter are also
shown. Validation accuracy is the main parameter for the purpose of recognition. It
can be observed from Table 2 that the highest validation accuracy of 0.99 (99%) is
obtained with LRs of 0.001 and 0.0001. The average validation accuracy with 0.001
is 0.88 (88%) and with 0.0001 is 0.92 (92%), which shows that the LR of 0.0001 has
performed better for the available dataset. Moreover, the curve is more linear for
LR 0.0001 than for the other LRs. The maximum validation accuracy obtained with a
LR of 0.00001 is 0.86 (86%), with an average value of 0.64 (64%). With the LR of
0.1, the maximum validation accuracy is 0.97 (97%), with an average value of 0.86 (86%).
The validation accuracy curves of all the LRs over the number of epochs can
be observed in Fig. 8. This figure also shows that the maximum validation accuracy
is achieved by using LRs of 0.0001 and 0.001.

Table 2 Parameter values obtained using five different LRs


LR  Epoch  Training loss  Validation loss  Training accuracy  Validation accuracy
0.00001 1 2.51 3.09 0.17 0.15
2 2.03 1.7 0.30 0.40
3 1.74 1.34 0.39 0.50
4 1.42 1.08 0.50 0.60
5 1.30 0.90 0.56 0.69
6 1.14 0.77 0.62 0.75
7 0.99 0.64 0.67 0.80
8 0.88 0.56 0.73 0.83
9 0.77 0.48 0.77 0.85
10 0.74 0.44 0.78 0.86
Average values 1.35 1.10 0.55 0.64
0.0001 1 1.5 1.5 0.50 0.50
2 0.72 0.29 0.79 0.92
3 0.41 0.14 0.89 0.96
4 0.30 0.26 0.92 0.91
5 0.23 0.06 0.93 0.98
6 0.22 0.07 0.94 0.97
7 0.17 0.04 0.96 0.99
8 0.15 0.06 0.96 0.98
9 0.15 0.02 0.96 0.99
10 0.14 0.02 0.96 0.99
Average values 0.40 0.25 0.88 0.92
0.001 1 0.70 6.2 0.79 0.27
2 0.24 0.03 0.93 0.99
3 0.17 0.02 0.95 0.99
4 0.13 0.01 0.96 0.95
5 0.13 0.01 0.96 0.99
6 0.20 0.69 0.94 0.76
7 0.12 0.02 0.97 0.99
8 0.14 0.01 0.95 0.99
9 0.16 0.46 0.95 0.85
10 0.13 0.02 0.96 0.99
Average values 0.21 0.76 0.94 0.88
0.01 1 1.1 0.62 0.59 0.76
2 0.81 0.78 0.72 0.80
3 0.74 0.72 0.75 0.75
4 0.51 0.25 0.82 0.91
5 0.41 0.16 0.86 0.95
6 0.35 0.10 0.88 0.96
7 0.23 0.17 0.92 0.94
8 0.24 0.08 0.91 0.97
9 0.23 0.04 0.92 0.98
10 0.24 0.06 0.92 0.97
Average values 0.49 0.30 0.83 0.90
0.1 1 2.07 0.74 0.40 0.71
2 0.92 2.89 0.69 0.59
3 0.69 0.87 0.76 0.65
4 0.68 0.14 0.77 0.95
5 0.47 0.15 0.84 0.94
6 0.45 0.13 0.85 0.96
7 0.38 461.4 0.87 0.97
8 0.30 0.09 0.90 0.96
9 0.30 2.68 0.91 0.87
10 0.30 8.75 0.89 0.97
Average values 0.66 47.79 0.79 0.86


Fig. 8 Comparison between all the LRs for validation accuracy




Fig. 9 Comparison plot between all the LRs for average validation accuracy

Now, the average validation accuracy obtained with LRs of 0.00001, 0.0001, 0.001,
0.01 and 0.1 is compared. The plot for the same is shown in Fig. 9, from which it can be
observed that the LR of 0.0001 has performed better as compared to the other LRs. An
average validation accuracy of 0.92 (92%) is obtained with a LR of 0.0001, which is high
in comparison to the rest of the LRs. The minimum average validation accuracy is obtained
with a LR of 0.00001, while LRs 0.001, 0.01 and 0.1 have performed almost the same. It
can be concluded that the LR of 0.0001 has outperformed the other LRs in terms of average
validation accuracy.

5 Conclusion

It can be concluded that the LR plays an important role in the performance of
any CNN model. In this work, the proposed model is implemented using five different
LRs to analyse their performance. It can be concluded from Figs. 3, 4, 5, 6 and 7 that
the learning of the model is slow and stable with a small LR, but the model is unstable
when the LR is high. Parameters like training and validation loss as well as accuracy
vary linearly with the small LRs of 0.00001 and 0.0001, and are fairly acceptable with a
LR of 0.001, but sudden fluctuations in the curves are present with LRs of 0.01 and 0.1.
So, from the above discussion, the LR of 0.0001 can be considered the best LR for
the prepared dataset, keeping in view the model stability, maximum validation
accuracy and average validation accuracy obtained.

References

1. Wu Y, Liu L, Bae J, Chow KH, Iyengar A, Pu C, Wei W, Yu L, Zhang Q (2019) Demystifying learning rate policies for high accuracy training of deep neural networks. In: 2019 IEEE international conference on big data (big data), pp 1971–1980
2. Attoh-Okine NO (1999) Analysis of learning rate and momentum term in backpropagation
neural network algorithm trained to predict pavement performance. Adv Eng Softw 30(4):291–
302

3. Wilson DR, Martinez TR (2001) The need for small learning rates on large problems.
In: IJCNN’01. International joint conference on neural networks. Proceedings (Cat. No.
01CH37222), vol 1. IEEE, pp 115–119
4. Roy K, Vajda S, Pal U, Chaudhuri BB (2004) A system towards Indian postal automation.
Ninth international workshop on frontiers in handwriting recognition, pp. 580–585
5. Daniyar N, Kairat B, Maksat K, Anel A (2019) Classification of handwritten names of cities
using various deep learning models. In: 2019 15th international conference on electronics,
computer and computation (ICECCO). IEEE, pp 1–4
6. Pal U, Roy RK, Kimura F (2012) Multi-lingual city name recognition for Indian postal
automation. In: 2012 International conference on frontiers in handwriting recognition, Bari,
pp 169–173
7. Vajda S, Roy K, Pal U, Chaudhuri BB, Belaid A (2009) Automation of Indian postal documents
written in Bangla and English. Int J Pattern Recognit Artif Intell 23(08):1599–1632
8. Pal U, Roy RK, Kimura F (2011) Handwritten street name recognition for indian postal automa-
tion. In: 2011 international conference on document analysis and recognition (ICDAR), Beijing,
pp 483–487
9. Roy K, Vajda S, Pal U, Chaudhuri BB (2004) A system towards Indian postal automation. In:
Ninth international workshop on frontiers in handwriting recognition, pp 580–585
10. Thadchanamoorthy S, Kodikara ND, Premaretne HL, Pal U, Kimura F (2013) Tamil hand-
written city name database development and recognition for postal automation. In: 2013 12th
international conference on document analysis and recognition. IEEE, pp 793–797
11. Ismail A, Ahmad SA, Soh AC, Hassan K, Harith HH (2019) Improving convolutional neural
network (CNN) architecture (miniVGGNet) with batch normalization and learning rate decay
factor for image classification. Int J Integr Eng 11(4)
12. Ramasubramanian K, Singh A (2019) Deep learning using Keras and tensorflow. In: Machine
learning using R. Apress, Berkeley, CA., pp 667–688
Modeling and Analysis of Positive
Feedback Adiabatic Logic CMOS-Based
2:1 Mux and Full Adder and Its Power
Dissipation Consideration

Manvinder Sharma, Pankaj Palta, Digvijay Pandey, Sumeet Goyal, Binay Kumar Pandey, and Vinay Kumar Nassa

Abstract For high-speed performance in computing and other applications which
involve the processing, computing, and analysis of signals, digital CMOS circuits
have mostly been used. Owing to several advantages over other logic families,
including well-defined logic levels, impeccable noise margins, better performance,
and almost negligible static power dissipation, the CMOS logic family is preferred.
As faster processing of signals is needed, the demand for these circuits is high and
is going to increase in the near future. With integration, the number of transistors
per chip area is increasing, but the energy due to gate switching does not decrease
at the same rate. Hence, power dissipation and the reduction of heat are major
concerns in circuits. In this work, a positive feedback adiabatic logic (PFAL)-based
design for digital circuits using CMOS is proposed and compared with the conventional
circuit. The design is simulated, and in comparison with the conventional CMOS 2:1
multiplexer circuit, the designed PFAL CMOS 2:1 multiplexer circuit has less power
dissipation, at 80.871 pW, while the conventional CMOS circuit dissipates 6.9090 nW,
with no change in the behavior of the circuit. For the full adder circuit, conventional
CMOS dissipates 48.0452 pW while the proposed PFAL-based full adder dissipates
3.9089 pW.

M. Sharma (B) · P. Palta


Department of ECE, Chandigarh Group of Colleges, Landran, Punjab, India
D. Pandey
Department of Technical Education, IET, Dr. A. P. J. Abdul Kalam Technical University,
Lucknow, Uttar Pradesh, India
S. Goyal
Department of Applied Science, Chandigarh Group of Colleges, Landran, Punjab, India
B. K. Pandey
Department of IT, College of Technology, Govind Ballabh Pant University of Agriculture and
Technology, Pantnagar, Uttarakhand, India
V. K. Nassa
Department of Computer Science Engineering, South Point Group of Institutions, Sonepat,
Haryana, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_22

Keywords Adiabatic · Low power design · PFAL design · Multiplexer · Full adder · CMOS

1 Introduction

Growing the switching frequency of transistors and increasing the number of transistors
incorporated across a chip region have resulted in orders-of-magnitude faster
computing systems. Computing rate is accomplished by expanding the number of
transistors incorporated on a chip. Nevertheless, such a boost in efficiency is accompanied
by an increase in energy and power dissipation [1]. The downside of high
power and energy dissipation is that the circuitry needs more costly shielding
and extensive cooling technologies, which reduce efficiency and raise expense. When
the clock frequency and the integration density are increased to meet the expectations
of fast computation, the energy and power dissipation of such high-performance circuitry
becomes a serious architecture problem [2]. To obtain Tera Instructions per Second
(TIPS), high-end microprocessors deploy billions of transistors onto the chip at clock
speeds exceeding 30 GHz; at this rate of acceleration, circuit output power is
expected to reach thousands of watts. This power dissipation raises reliability
concerns such as hot carriers, thermal expansion, and electromigration, all of
which worsen circuit efficiency. Furthermore, circuitry with greater power dissipation
draws more electricity from the battery, against the necessity of low-power
processors and limited-battery-consumption chips [3–5]. The primary challenge for
portable media systems that run on batteries, such as notepads, notebooks, and iPads,
is low power, which also improves battery lifespan.
The energy consumed per unit time by any system is referred to as its power
consumption.

En = ∫_0^Top Pow(t) dt    (1)

The energy (En) needed for a given operation is the integral of the power
(Pow) expended over the operating time (Top).
The digital CMOS circuit’s power (Pow) is given by

Pow = C ∗ Vs ∗ Vd ∗ f (2)

where
Vs is the supply voltage,
f is the clock frequency,
C is the capacitance being recharged, and
Vd is the signal voltage swing.
Modeling and Analysis of Positive Feedback Adiabatic … 283

The amount of energy used can be provided by

En = n ∗ C ∗ Vs ∗ Vd    (3)

The energy usage in this equation is independent of the clock frequency. Since
battery life is decided by energy consumption, it is crucial to reduce energy
usage rather than just power. Power, on the other hand, is important for heat
dissipation considerations. As a result, optimization of both energy and power is
needed [6]. Power dissipation in a CMOS circuit is made up of static and dynamic
power dissipation. Power dissipated in a steady state is referred to as static power
dissipation, and it is given by

Pstate = Istate ∗ Vs    (4)

Istate is the current flowing through the circuit when it is in a stable state, i.e., when
there is no switching operation. Ideally, because in the stable state there is no direct
path from Vs to ground (the pMOS network is off when the nMOS network is on, and
vice versa), no static dissipation occurs [7]. However, because the metal–oxide–semiconductor
(MOS) transistor is not a perfect switch, there is a substrate injection
current and a leakage current, causing static power dissipation [8–10]. The substrate
injection current for an nMOS device (W/L = 10/0.5) at a Vs of 5 V is in the range of
1–100 A [11].
Ratioed logic is an additional source of static power dissipation. In a ratioed logic
family, the pMOS network is always on, acting as a load for the nMOS
network to pull down. When the gate output is low, there is a direct path for current
from Vs to ground, and static current flows at this stage. As a result,
such a family consumes a lot of static power. The CMOS inverter is shown
in Fig. 1.
Dynamic energy dissipation is introduced in the circuit during switching operations or
changeovers. During the transient action, both pMOS and nMOS are turned on between
Vtn (the nMOS threshold point) and Vs − Vtp (the pMOS threshold point). Throughout that
time, a short circuit emerges, and current flows from Vs to ground. This problem
can be mitigated to ten to fifteen percent of the overall power by applying suitable
transition edges. Another factor affecting dynamic dissipation is the charging and
discharging of parasitic capacitance inside the circuit [12]. Figure 1 shows the parasitic
capacitance as a capacitor at the output. In one full switching cycle from Vs to GND and
back, the process of charging the capacitor draws an energy equal to C ∗ (Vs)² from the
power supply. Half of this energy is stored in the capacitor and half is dissipated in the
pMOS; during the discharging condition, the capacitor discharges and its energy is
dissipated in the nMOS. An energy equal to C ∗ (Vs)² is thus dissipated each
time the output flips. The dynamic dissipation can be expressed as follows:

Pdyna = α ∗ C ∗ Vs² ∗ f    (5)

Fig. 1 CMOS inverter for power analysis

where
α is the switching activity, assumed to be in the range of 0–1 transitions per data cycle, and
f is the frequency of the clock.
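As a quick numeric illustration of Eq. (5), the values below are arbitrary examples, not taken from the paper:

```python
# Dynamic power, Eq. (5): P_dyna = alpha * C * Vs^2 * f
alpha = 0.5   # switching activity factor, between 0 and 1
C = 10e-15    # switched capacitance: 10 fF
Vs = 3.3      # supply voltage in volts
f = 100e6     # clock frequency: 100 MHz

P_dyna = alpha * C * Vs**2 * f
print(f"Dynamic power: {P_dyna * 1e6:.2f} uW")  # about 5.45 uW
```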

2 PFAL Design and Modeling

Switching dissipation accounts for nearly 90% of the total power dissipation. Designers
can reduce a node's capacitance, its switching activity, or its voltage swing to reduce
this dissipation. However, with all of these methodologies, the energy drawn from the
power supply is still used only once before being dissipated. Nevertheless,
greater efficiency of logic circuits can be accomplished by recycling the energy obtained
from the power supply [13]. A class of circuit design known as adiabatic logic
circuits offers the opportunity to reuse the energy from the supply and to reduce the power
dissipated during switching. In this logic, the energy that is stored in the capacitance is
reused rather than dissipated as heat. This is also recognized as energy-recovery
CMOS. When the transistors are switched under certain conditions and
the switching procedure is slowed down, the power dissipation in the circuit can be very low.
Figures 2, 3 and 4 depict an adiabatic switching procedure in which the capacitance
is charged by a current source rather than a constant voltage source [14]; R denotes
the pMOS resistance. A linear voltage ramp corresponds to a constant charging current.
Initially, let us assume the capacitance voltage VC = 0.
Let I ∗ R be the voltage across the switch:

P(t) in the switch = I² ∗ R    (6)

Fig. 2 Inverter with adiabatic CMOS

Fig. 3 Inverter equivalent resistive circuit

Fig. 4 Adiabatic-based switching

Energy during charge = (I² ∗ R) ∗ T    (7)

Therefore, I = (C ∗ V)/T → T = (C ∗ V)/I    (8)

E = I² ∗ R ∗ T = ((C ∗ V)/T)² ∗ R ∗ T = (C² ∗ V² ∗ R)/T    (9)

E = Ediss = ((R ∗ C)/T) ∗ C ∗ V² = ((2 ∗ R ∗ C)/T) ∗ ((1/2) ∗ C ∗ V²)    (10)

Here, E stands for the energy dissipated, Q for the charge transported to the load, C for the load capacitance, R for the MOS switch resistance, V for the final value of the voltage, and T for the charging time of the load.
As shown in Eq. 10, the energy dissipation can be reduced below the conventional
(1/2) ∗ C ∗ V² loss by making the charging time (T) longer than 2 ∗ R ∗ C. In contrast to
conventional CMOS, the dissipated energy is proportional to R rather than to the capacitance
or the voltage swing alone, so the dissipation can also be reduced by lowering the pMOS resistance.
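A small numeric sketch of Eq. (10) makes the trade-off visible, comparing the adiabatic loss against the conventional (1/2) ∗ C ∗ V² loss; the R, C, and V values are arbitrary illustrations, not taken from the paper.

```python
# Adiabatic dissipation, Eq. (10): E_diss = (2*R*C/T) * (0.5 * C * V**2)
R = 1e3      # MOS switch resistance: 1 kOhm
C = 10e-15   # load capacitance: 10 fF
V = 3.3      # final capacitor voltage in volts

conventional = 0.5 * C * V**2  # energy lost per charging event in standard CMOS
for T in (2 * R * C, 10 * R * C, 100 * R * C):
    adiabatic = (2 * R * C / T) * conventional
    print(f"T = {T / (R * C):3.0f} RC -> adiabatic loss = "
          f"{adiabatic / conventional:.2f} x conventional")
```

Slowing the charging ramp so that T ≫ 2 ∗ R ∗ C is exactly what makes the recovered-energy (trapezoidal or sinusoidal) supply described next worthwhile.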
In adiabatic logic, a distinct form of power supply is used to reuse the energy. A
sinusoidal or trapezoidal source is typically needed for adiabatic operation; such a
supply can also serve as the circuit's clock. The trapezoidal power supply is
shown in Fig. 5. The operation is depicted by four different phases: idle, evaluate, hold,
and recover. The value of idle is 0, and the value of hold is 1. During the evaluation
phase, the load capacitance charges up or down depending on the input, and during
the recovery phase, the charge kept on the capacitance is recovered.
The positive feedback adiabatic logic (PFAL) family belongs to the adiabatic logic
class. PFAL consumes the least amount of energy when compared to other families
such as ECRL, 2N-2N2P, and CAL [15], and it is resistant to changes in technological
parameters. The adiabatic amplifier at the heart of PFAL is a latch made up of two
nMOS and two pMOS transistors. The two n-trees provide the logic functions and can
generate both positive and negative outputs, depending on the situation. The general
schematic of the PFAL gate is shown in Fig. 6 [16, 17]; the latch is formed by the pMOS
pair M1 and M2 together with the nMOS pair M3 and M4.

Fig. 5 Adiabatic power supply in various phases

Fig. 6 General schematic of PFAL gate
Figure 7 shows a traditional CMOS-based 2:1 multiplexer that is constructed and
evaluated for power dissipation in the Mentor Graphics Design Architect IC tool, and
Fig. 8 shows a traditional CMOS-based full adder that is modeled and analyzed.
The PFAL-based 2:1 multiplexer and full adder are configured and modeled in Design
Architect IC using the discussed methodology, as shown in Figs. 9 and 10. ELDONET
is used to simulate the designs.

Fig. 7 A 2:1 multiplexer based on conventional CMOS technology

Fig. 8 Full adder based on CMOS technology

Fig. 9 PFAL CMOS 2:1 multiplexer

3 Results and Discussions

To begin, the layout structure involving PFAL-based CMOS is simulated in Mentor
Graphics' Design Architect tool using standard TSMC 0.35 µm CMOS technology.
A 3.3 V supply is provided at a temperature of 27 °C for the input

Fig. 10 PFAL CMOS-based full adder

voltage. For the traditional CMOS and the PFAL-designed circuits, the ELDONET tool is
used to simulate and check the results for power dissipation.
Vs is taken as a DC supply source in the conventional circuit, with a value of
3.3 V; these circuits are shown in Figs. 7 and 8. The PFAL circuits are reconstructed
from the traditional circuits using the same method used to create the PFAL circuit
design; the modeled circuit designs are shown in Figs. 9 and 10. The circuit's
source is a SIN source, and the circuit's power supply is a 3.3 V sine AC wave.
In the ELDONET environment, the design is simulated. Figures 11 and 12 show
the results of the traditional CMOS 2:1 multiplexer simulation. After simulation of the
conventional 2:1 multiplexer, the measured power dissipation is 6.9090 nW.
Figures 13 and 14 show the results of the PFAL 2:1 multiplexer simulation along with its
energy dissipation. This circuit's measured power dissipation is 80.8716 pW. With the same
circuit functionality, the power can be seen to be reduced at a rapid rate. Figure 15
shows a dissipation comparison for the 2:1 multiplexer.
Figures 16 and 17 show the results of the conventional CMOS full adder simulation.
After simulation, the calculated power dissipation is 48.0452 pW.
Figures 18 and 19 present the results of the simulation of the PFAL CMOS-based full
adder, as well as its power dissipation. This circuit's determined power dissipation is
3.9089 pW. It can be seen that the power dissipation is decreased with the same circuit
operation. Figure 19 shows the power dissipation contrast for both full adder circuit
designs.
Table 1 shows the different circuits modeled and simulated in the ELDONET
environment and their power dissipation.
It is clear that the power dissipation of the circuits designed with PFAL is reduced to a
significantly lower value after simulation of all circuits (conventional CMOS 2:1 multiplexer,
conventional CMOS full adder, PFAL CMOS 2:1 multiplexer, and PFAL full adder).
However, the PFAL circuit requires an increased number of gates to implement any logic,

Fig. 11 Power dissipation of a 2:1 multiplexer based on conventional CMOS

Fig. 12 Power dissipation of a traditional CMOS-based 2:1 multiplexer in a log window

but it also delivers both the positive and the negative outputs. This can eliminate the need
for additional circuits in the many applications where the negated output is also needed.

4 Conclusion

The rate of transistor integration per chip is rising every day. This results
in heat dissipation and the use of extensive thermal cooling methods.

Fig. 13 PFAL CMOS-based 2:1 multiplexer computed power dissipation

Fig. 14 Power dissipation log window for PFAL-based 2:1 multiplexer

Dissipation has a direct effect on the device's reliability and circuit efficiency. The
emphasis here is on lowering leakage power. Methods for lowering as well as reusing
energy and reducing power dissipation are described in this article. The power dissipation
of a 2:1 multiplexer circuit has been decreased to picowatts using the PFAL method,
compared to nanowatts in a conventional CMOS circuit. The power dissipation for the
PFAL circuitry is also decreased for the full adder. This demonstrates the utility of PFAL
and the adiabatic logic family. This method could be used to create circuits that have lower
power dissipation. Handheld devices such as laptops, mobile phones, and notepads

Fig. 15 A schematic illustration of the 2:1 multiplexer's reduced power dissipation (conventional CMOS: 6.909 nW; PFAL CMOS: 0.080871 nW)

Fig. 16 Calculated power dissipation of full adder-based conventional CMOS

will thus have longer battery life with much lower power output and consumption.
Additionally, the cost of expensive thermal cooling would be lowered.

Fig. 17 CMOS-based full adder log window for power dissipation

Fig. 18 Computed PFAL-based full adder power dissipation



Fig. 19 Power dissipation log window for PFAL-based full adder

Table 1 Power dissipation of conventional CMOS v/s PFAL CMOS

Circuit                              Power dissipation (nW)
Conventional CMOS 2:1 multiplexer    0.00709
PFAL CMOS 2:1 multiplexer            0.081371
Conventional CMOS full adder         0.0470452
PFAL CMOS full adder                 0.0041089

References

1. Samaali H, Perrin Y, Galisultanov A, Fanet H, Pillonnet G, Basset P (2019) MEMS four-terminal variable capacitor for low power capacitive adiabatic logic with high logic state differentiation. Nano Energy 55:277–287
2. Liu J-S, Clavel MB, Hudait MK (2019) TBAL: tunnel FET-based adiabatic logic for energy-
efficient, ultra-low voltage IoT applications. IEEE J Electron Dev Soc 7:210–218
3. Raghav HS, Bartlett VA, Kale I (2018) Investigating the effectiveness of without charge-
sharing quasi-adiabatic logic for energy efficient and secure cryptographic implementations.
Microelectron J 76:8–21
4. Chandrakasan AP, Sheng S, Brodersen RW (1999) Low power CMOS digital design. IEEE J
Solid-State Circuits 27(04):473–484
5. Veendrick HJM (1984) Short-circuit dissipation of static CMOS circuitry and its impact on the
design of buffer circuits. IEEE JSSC, pp 468–473
6. Sharma M, Singh H, Singh S, Gupta A, Goyal S, Kakkar R (2020) A novel approach of object
detection using point feature matching technique for colored images. In: Proceedings of ICRIC
2019. Springer, Cham, pp 561–576
7. Blotti A, Saletti R (2004) Ultralow-power adiabatic circuit semi-custom design. IEEE Trans VLSI
Syst 12(11):1248–1253
8. Horowitz M, Indennaur T, Gonzalez R (1994) Low power digital design. In: Technical digest
IEEE symposium low power electronics, San Diego, Oct 1994, pp 08–11
9. Sakurai T, Newton AR (1990) Alpha-power law MOSET model and its applications to CMOS
inverter delay and other formulas. IEEE JSSC 25(02):584–594
10. Chandrakasan AP, Brodersen RW (1995) Low-power CMOS digital design. Kluwer Academic,
Norwell, Ma
11. Denker JS (1994) A review of adiabatic computing. In: Technical digest IEEE symposium low
power electronics, San Diego, Oct 1994, pp 94–97
Modeling and Analysis of Positive Feedback Adiabatic … 295

12. Voss B, Glesner M (2001) A low power sinusoidal clock. In: Proceedings of the international
symposium on circuits and systems, ISCAS 2001
13. Athas WC, Koller JG, Svensson L (2005) An energy-efficient CMOS line driver using adiabatic
switching. In: Fourth great lakes symposium on VLSI, California, Mar 2005
14. Sharma M, Singh H (2018) A review on substrate integrated waveguide for mmW. Circulation in computer science, ICIC 2018, pp 137–138
15. Keote M, Karule PT (1996) Design and implementation of low power multiplier using proposed
two phase clocked adiabatic static CMOS logic circuit. Int J Electr Comput Eng 8(6):4959;
Moon Y, Jeong DK (1996) An efficient charge recovery logic circuit. IEEE JSSC 31(04):514–
522
16. Kramer A, Denker JS, Flower B et al (1995) 2nd-order adiabatic computation with 2N-2P and 2N-2N2P logic circuits. In: Proceedings of the international symposium on low power design, Dana Point, pp 191–196
17. Anitha K, Jayachira R (2016) Design and analysis of CMOS and adiabatic 1: 16 multiplexer
and 16: 1 demultiplexer. Int J Reconfig Embedded Syst 5(1)
A Smart Healthcare System Based
on Classifier DenseNet 121 Model
to Detect Multiple Diseases

Mohit Chhabra and Rajneesh Kumar

Abstract The detection of multiple diseases such as cardiomegaly, emphysema, hernia,
infiltration, mass, nodules, atelectasis, pneumothorax, pleural thickening, pneumonia,
fibrosis, and edema from chest X-ray images is difficult even for a specialist.
Researchers have proposed several methods to detect a single disease using X-ray
images of the chest, but a person suffering from more than one disease is
not easily detected. This motivates the authors to demonstrate the feasibility of classifying
multiple diseases from chest X-ray images by applying deep learning
approaches to a given dataset. In this research work, the authors propose a smart
healthcare system based on DenseNet 121 to detect multiple diseases from chest X-ray
images. The experiments were performed on the ChestX-Ray8 dataset, which has
108,948 front-view X-ray images of 32,717 patients with text-mined labels for eight
diseases and fourteen different pathological conditions. The results of the developed
classifier DenseNet 121 model are compared with the radiologist score.

Keywords DenseNet 121 · Transfer learning · Pathology · Convolution neural network · X-ray images · Class imbalancing

1 Introduction

In the era of medical science, precise and accurate disease prediction and diagnosis
are still the main concern. Especially in countries like India, medical facilities are
not up to the mark, and thousands of people have died due to a lack of available medical
facilities. So deep learning techniques in medical imaging diagnosis can play a vital
role and provide a lot of help to the medical industry. Artificial intelligence has made

M. Chhabra (B) · R. Kumar


CSE Department, MMEC, Maharishi Markandeshwar Deemed to be University, Mullana,
Ambala, Haryana 133207, India
e-mail: Chhabra108@mmumullana.org
R. Kumar
e-mail: drrajneeshgujral@mmumullana.org

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_23

computers more intelligent, and by virtue of it, they can think. Artificial intelligence
can be considered as the superset of deep learning and machine learning [1, 2].
There are many different machine learning techniques whose
purpose is to make the medical diagnosis process easy and precise [3]. These models
are useful in the medical field for the diagnosis of different diseases from X-ray
datasets. Due to the advancement of technologies, we can store millions of images
and use them for research, considering the fact that accuracy and precision are the two
important factors [4].
Researchers have proposed several methods to find a particular disease from
a chest X-ray image dataset, but there is still room for developing smart decision
support systems that can diagnose multiple diseases from a single chest X-ray
image dataset [5]. These types of systems are beneficial for medical professionals in
analyzing and diagnosing disease. Nowadays, a proper diagnosis has become very
tough [6], and the mortality rate has increased due to the limited resources available [7]. In
medical research and diagnosis, an important aspect is pathology [8]. Pathology is
a branch of medical science that refers to the study of different diseases and integrates
an extensive range of biological research fields. It involves inspecting the causes
of illness, how illness affects cells, and what the different consequences of the
illness are [9].
By using deep learning techniques like the Keras library, medical image diagnosis
from chest X-ray images becomes easy. A pathological dataset downloaded from the NIH
Clinical Center, an American research hospital, has been used in this study. In this dataset,
each image carries multiple text-mined labels which can recognize fourteen
different pathological conditions; practitioners then use these different conditions
to diagnose eight different diseases. By considering all these numerical factors, the
authors' aim is to create a model that will provide likelihood estimations in
two different forms, positive and negative [10].
Those thoracic diseases [11] which are considered pathological can be diagnosed,
and their precise location can be spatially localized, through the classification of
multi-segmentation imaging and a disease localization framework [12], which can be
validated on the provided dataset. Though the preliminary quantitative results
are promising, the DenseNet 121 model remains quite far from a fully
automated high-accuracy system.

1.1 Deep Learning

Today, in the medical field, the main problem that arises is diagnosis: as X-ray
image datasets contain millions of images, it is difficult to sort out the
useful information from the given data [13]. As a result, deep learning
is considered a perfect solution for disease diagnosis in the medical field.
This learning procedure is considered a special subsidiary field of machine
learning whose task is to gather tools and algorithms which assist in pattern matching.
Deep learning mainly depends on artificial neural networks, as it mimics the human
brain; there is no need for an explicit program. Hierarchical learning is also a
special kind of learning which uses a multilayered architecture model to categorize
and analyze data [14].
There are different layers in a deep learning model, and through the cascading
of multiple layers, data is filtered, where each succeeding layer uses the output
values from the preceding layers for its results. These deep learning models work
more accurately as their capacity for processing data increases, as they learn from
previous values [15].

1.2 Background

Cardiomegaly is a medical term for an increase in the cardiac size of humans. The
cardiothoracic ratio is used to assess it. It is not a disease but a medical condition in
which the heart becomes enlarged; as a result, the patient can suffer cardiac arrest
[16]. As per a survey, millions of people die within 5 years after a diagnosis of
cardiac arrest. The causes of cardiomegaly vary from patient to patient. The problem
arises from hypertension or heart disease. It may get better over time, but patients
suffering from it still need medications and lifelong treatment. This condition is
strongly related to family history.
Emphysema is a condition related to lung infection by virtue of which the alveoli (air sacs)
get damaged. The inner walls of the air sacs weaken and split, which leads to bigger air
spaces instead of small ones [17].
Effusion is a medical condition in which excess fluid collects around the lung. The
pleura is a very thin membrane that lines the inside of the chest wall and the surface
of the lungs. In healthy people, the pleural fluid is normally around 1 teaspoon, which
allows the lungs to move smoothly in the chest cavity during breathing. Some of the
causes of pleural effusion are leakage from other organs, lung cancer, autoimmune
conditions, pulmonary embolism, and some other types of infection.
Hernia happens when an organ pushes out through a tissue or muscle. Hernias can
occur in different parts of the body, but mostly in the abdomen between the hips and
chest; they can also appear in the thigh and groin. Most of the time hernias are not
life-threatening, but sometimes they require urgent treatment to prevent
life-threatening complications [18].
Infiltration is the dispersal or accumulation of foreign materials in tissues
and cells. The material collected in those cells is known as the infiltrate. Infiltration is
also used to describe the incursion of cancer cells into the blood vessels.
Mass is an abbreviation of mitral valve, aorta, skeleton, and skin. The MASS syndrome
is due to a mutation in FBN1, the gene for fibrillin-1. MASS is a tissue disorder
that can affect patients in different ways. Mitral valve prolapse is a heart condition where
the two valve flaps of the mitral valve do not close smoothly or evenly and bulge upward
into the left atrium. Aortic root dilation occurs at the borderline area where the heart and
aorta meet. Skin striae are stretch marks in the skin without any weight change. Skeletal
features include several possibilities such as joint hypermobility, scoliosis (spine curvature),
and deformities of the chest walls. Some people suffering from this disease also face
myopia (an eye disease) [19].
Nodule is an abnormal tissue growth. It mainly develops just below the skin, but also
in deeper tissues of the skin or in the internal organs. According to dermatologists, any
lump is considered a nodule if it is at least 1 cm in size. The thyroid gland may also
contain nodules, which refers to enlarged nodes. A nodule is visible and feels like a hard
skin lump. A thyroid nodule may affect swallowing and cause a change in voice, and in
the abdomen it causes abdominal discomfort.
Atelectasis is a serious lung condition in which the entire lung or part of the lung
collapses. This mainly happens when the alveoli become deflated or filled with fluid,
and it often occurs after surgery. The main complication is respiratory problems.
Some of the other associated conditions include lung tumors, cystic fibrosis, injuries to
the chest, fluid in the lungs, and respiratory weakness [20].
Pneumothorax (PTX) is a life-threatening cardiorespiratory medical problem that
requires rapid treatment. Its diagnosis and screening mainly depend on chest X-rays,
in which pneumothoraxes are difficult to identify due to their variable size. Some
authors have worked on the diagnosis of this disease from the given dataset and applied
proper procedures to identify pneumothorax accurately and in time.
Pleural thickening develops when scar tissue thickens the delicate membrane lining
the lungs. Pleural thickening can form following an infection or asbestos exposure. It
may be a symptom of a more severe diagnosis such as malignant pleural
mesothelioma [21].
Pneumonia is today considered a life-threatening disease for many reasons. It can
create lung swelling and enlargement; fluid can fill the lungs, damaging one or both
lungs. Common viruses and bacteria are the main causes of pneumonia, so a person
suffering from it may face breathing problems and other issues [22].
Fibrosis is also a lung-based disease that mainly occurs when the cells or tissues of
the lung become damaged, disfigured, and thicker. This increase in thickness causes
the lung to work abnormally, and as a result the patient feels different types of
discomfort. The damage caused by pulmonary fibrosis is not easy to fix, but taking
certain medications can ease the symptoms and help in living a normal life. In extreme
cases, a lung transplant is required. The authors have studied at length how to diagnose
it using the dataset.

Edema is a medical condition that arises when excess fluid is confined in the body's
cells. It can affect any part of the body, such as the arms, feet, legs, ankles, and hands;
there is also a type, diabetic macular edema, that can damage the eyes if not
diagnosed and treated early. Authors have proposed different ways to identify
edema from datasets and have also compared different methods to suggest which
technique provides better results [23].

2 Related Works

Chen et al. [1], in this study, discussed various machine learning algorithms
for the effective prediction of chronic disease. They compared the prediction models
with the help of a database collected from different hospitals in China and used a
latent factor model to overcome the problem of incomplete data. They also proposed
a multi-disease prediction model based on a convolutional neural network.
Saranya et al. [3] proposed a comprehensive study on the dissimilar approaches used
to forecast multiple diseases; the key objectives of this research are improvement
in the medical field and reduction of hospital admissions and healthcare costs. The
authors compare the deployed prototype with existing ones in terms of different
parameters and find that the accuracy of their results is better.
Dahiwade et al. [4] focused on the prediction of diseases at the early stages.
In reality, diagnosis is not so easy, due to numerous reasons like the
repetition of images, unclear data, and unstructured data, so they used an approach
based on data mining. For accurate and precise results, they used convolutional
neural network and K-nearest neighbor models and then compared the results
obtained from these two models.
Wang et al. [10] proposed a dataset consisting of around 1 lakh images of
around thirty thousand persons, where they labeled multiple diseases using natural
language processing. They also trained a convolutional neural network for this work,
by virtue of which the present diseases and their exact positions can be identified.
Shen et al. [14] explored medical-field images using computer-assisted methods.
They shed light on identification, classification, and quantification patterns.
Utilizing the best available learning techniques can increase the performance
by increasing the accuracy.
Raghavendra et al. [7] deployed a CNN-based model for the detection of
brain diseases. The targeted disease is based on a continuing degradation of the brain's
motor function, which is linked to an abnormality of the brain. The authors implemented
a 13-layer model, and the developed model achieved a good performance of around
80% accuracy, 92% specificity, and 85% sensitivity.
Li et al. [9] proposed a deep learning-based model for approximating modality
imaging data. In this method, the input and output are two different modalities.
The model consists of multiple trainable parameters which can capture the connection
between the different modalities. The authors then evaluated this model on the ADNI
database, where magnetic resonance imaging (MRI) and positron emission tomography
(PET) are the input and output modalities, respectively. Finally, the authors
compared the result with existing methods and showed that it is better than that of
other models.
Liu et al. [6] provided an accurate diagnosis method for the early stage of
Alzheimer's disease, as it helps patients to take preventive actions before irreversible
brain damage takes shape. Different authors have worked on such models, and the
results show a bottleneck in diagnosis performance, so the authors designed a deep
learning model to overcome both Alzheimer's disease detection and the bottleneck problem.
Franz et al. [5] built and presented a model for patient diagnosis prediction by
managing certain health records electronically. The authors faced many problems due
to two main factors: the multitude of setups and the lack of coherency in
electronic health records, which interferes with the retrieval of the information
present in them. So the authors deployed a Python package to convert
public health datasets into an easily accessible universal format and proposed
different architectures for the deep observer model and the clinical BERT model. As a
result, these models provide multiple diagnoses with high accuracy.
Bashir et al. [8] proposed a multilayer classifier ensemble framework based on
the optimal grouping of diverse classifiers. The advantage of having this decision
support system is to gather and build a proper interpretation and information system.
The model built here by the authors overcomes the limits of conventional systems.
Qin et al. [15], in this paper, studied computer-aided detection systems
and provided different methods for disease prediction by visualizing chest X-ray
datasets. The authors used different chest X-ray datasets and different
methods for preprocessing. Computer-aided detection (CAD) is used for the detection
of multiple diseases.
Harimoorthy et al. [11] proposed a specific architecture for disease
prediction in the healthcare industry. The method was conducted with a reduced feature
set of the chronic diseases of kidney, heart, and diabetes. The authors also compared the
results with existing methods such as SVM-polynomial, SVM-linear, random forest,
and decision tree. The performance has been evaluated based on parameters
such as accuracy, specificity, sensitivity, and precision, and the proposed model
produces a better rate as compared to other models.
Today it is very tough for hospitals to manually maintain the records of patients
and categorize them by priority. So a text mining method was developed to differentiate
between pathology and radiology reports and arrange the data of hospitals automatically,
which further classifies the prediction in the form of a positive or negative impact
of the disease. Radha et al. [12] suggest such a classification of radiology and pathology
reports using a modified extreme learning machine (MELM), naïve Bayes (NB), and
support vector machine (SVM). These proposed classifiers produce better results.
Su et al. [13] studied different applications of artificial intelligence in the area of
mental health and its providers, such as psychologists and psychiatrists, for decision
making based on the historical data of patients. The main focus of this survey is to
apply deep learning techniques to health issues, especially mental health issues.
They partition the data into different segments: diagnosis based on the given data,
analysis of genetics data for mental health conditions, vocal analysis for the prediction
of diseases, and finally risk estimation using social media data.
Junior et al. [16] developed techniques to assess the cardiothoracic ratio from
chest X-rays. For this, they proposed automated cardiomegaly detection in CXR.
They completed the process in two segments: in the first part, they trained the network
to classify CXR results as positive or negative. Operations were performed
with different parametric settings to reduce the bias factor, improve efficiency,
and reduce the possibility of bias towards the majority class. Out of the given
models, DenseNet generated precise results on the testing and external validation datasets.
Saito et al. [21] applied different techniques to study pleural thickening. From
observing around thirty thousand chest X-ray images, it was found that this disease
occurs very frequently and mainly at the right side of the lungs. It occurs
frequently in males and especially in smokers. After applying proper techniques, the
authors observed that the prevalence increased with age, ranging from 1.8% in
youngsters to 9.8% in persons aged 60 or more.
Negahdar et al. [17] diagnosed emphysema from the given dataset of X-ray
images. There are five different lung cell patterns, which include fibrosis, micro nodule,
emphysema, ground glass, and normal. They proposed a convolutional neural
network-based emphysema diagnosis system.
Chouhan et al. [22] proposed a technique based on deep learning for the detection
of effusion present around the lungs. From the available dataset, it is not easy to
make the diagnosis, so the authors used an ImageNet pre-trained model for the
prediction classifier. The authors developed different approaches, and out of these,
they observed that the results of the ImageNet-pre-trained convolutional neural
network model are better compared to the other methods.
Rajpurkar et al. [18] deployed a technique which is able to detect pathological
diseases at radiologist level. The disease dataset consists of around one lakh images.
The authors applied CheXNet, a convolutional neural network trained on this
dataset of one lakh images, and achieved state-of-the-art results on the
fourteen diseases. The authors then compared the results of the CheXNet model with
the models of Wang and Yao, and the output shows that the AUROC value of the
authors' model provides a good result compared to the other models.
Jin et al. [23] proposed a fundus fluorescein angiography (FFA) detection technique
based on NPA. Vision can be lost due to diabetic macular edema, so with early
detection and proper diagnosis, treatment can be done. The authors used three
different models to recognize microaneurysms and leakages in fundus fluorescein
angiography.
Chhabra et al. [20] studied different edge detection techniques to improve the quality
of medical images. They studied different algorithms such as Canny, LoG,
and many more and then compared them in terms of performance and speed. As a
result, the authors concluded that the performance of the Canny algorithm is better as
compared to the existing algorithms.

3 Materials and Methods

A collection of different data objects can be viewed as a dataset. The selected data
objects can be described by different features.

3.1 Pathology Chest X-Ray Dataset and Specification

The dataset used here is taken from around thirty-two thousand different patients
and contains more than one lakh images. For every image contained in the dataset,
there exist different text-mined labels which can be used to identify fourteen
different pathological conditions. This identification can be used by physicians for
the diagnosis of eight different diseases.
We then deploy a model based on DenseNet 121 which provides binary classification
predictions: the predicted value for each pathology appears in the form of
positive or negative.
As the number of images is huge, we work with a subset: 875, 109, and 420 images
are used for training, validation, and testing purposes, respectively. Figure 1 shows the
different diseases that exist in the database which need to be diagnosed.

Fig. 1 Common pathological diseases in chest X-rays



3.2 Preprocessing of Data

This is the initial stage, in which the data is processed to be fed to the Keras model.
First, we clear the null values from the dataset, and then we convert categorical
variables to numerical variables using encoding methods.
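As a hedged illustration of this stage in Python with pandas (the file name and the 'view' column below are hypothetical stand-ins, not part of the dataset specification):

import pandas as pd

# Hypothetical labels file; 'view' stands in for any categorical column.
df = pd.read_csv("chest_xray_labels.csv")
df = df.dropna()  # clear rows containing null values
# Encode a categorical variable into numerical codes.
df["view"] = df["view"].astype("category").cat.codes
# Alternatively, one-hot encoding: df = pd.get_dummies(df, columns=["view"])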

3.3 Transfer Learning

This technique can be used to retrain the DenseNet 121 model for classification
based on X-ray imaging. The main advantage of transfer learning is that a model can
be reused for many tasks. Pre-trained models can also be considered as a starting
point for computer-aided design. Transfer learning is used for tasks where the
dataset contains too little data to train from scratch, so it can use the services of
existing models.
The following deep learning workflow is used in the context of transfer learning:
take layers from a previously trained model; freeze those layers to avoid destroying
any of the information they contain during future training; add some new trainable
layers on top of the frozen layers, which help convert the old features into predictions
on the new dataset; and, as an optional last step, fine-tune by unfreezing the whole
model.
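A minimal Keras sketch of this workflow is given below; the input shape, the 14-output sigmoid head, and the learning rates are illustrative assumptions rather than the exact configuration used here:

import tensorflow as tf
from tensorflow.keras import layers, models

# Take layers from a previously trained model and freeze them.
base = tf.keras.applications.DenseNet121(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False

# Add new trainable layers on top of the frozen layers.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(14, activation="sigmoid"),  # one binary output per pathology
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Optional final step: unfreeze the whole model and fine-tune at a low rate.
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5), loss="binary_crossentropy")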
There are two ways to approach the model, named the developed model approach
and the pre-trained model approach.

Developed model approach

Select Source Task: Out of the many problems that exist, we select a related
predictive modeling problem that has a set of inputs, outputs, and variables.
Develop Source Model: The second step is the development of the model assigned
to the first task.
Reuse Model: As the name suggests, the developed model can be used as the starting
point for some other task; it may involve some or all parts of the model.
Tune Model: This optional step refines or reworks the model based on the available
input–output data.

Pre-trained model approach

Select Source Model: From the existing models, a pre-trained source model is
selected. Some research hubs release models trained on datasets that are huge and
challenging in nature.
Reuse Model: As discussed earlier, the pre-trained model can be used as the starting
point for another type of task; it may include some or all parts of the model,
depending on the requirements.

Tune Model: This optional step may require the model to be restructured or updated
according to the type of input–output data available and its variables.

Fig. 2 DenseNet 121 model for pathological disease detection

3.4 Model Deployment

Out of the many available neural networks, one of the most important for diagnostic
data from medical images is DenseNet 121. It is a model which emphasizes making
convolutional neural networks deeper while keeping them efficient to train, by
exploiting shorter connections between layers.
In the DenseNet 121 convolutional neural network, each layer connects to all the
layers that lie deeper in the network. Assume there are five layers in the model:
layer one is connected to layers two, three, four, and so on; similarly, layer two is
connected to layers three, four, and so on. The key idea behind this strategy is to
enable maximum information flow among all the layers.
To maintain the feed-forward ability, each layer receives input from all preceding
layers and passes its own feature maps on to all subsequent layers; the kth layer
therefore has k inputs, comprising the convolutional feature maps of all its preceding
blocks.
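This connectivity pattern can be sketched in Keras as follows; the layer count and growth rate are illustrative assumptions, not DenseNet 121's actual configuration:

from tensorflow.keras import layers

def dense_block(x, num_layers=4, growth_rate=32):
    # Each new layer receives the concatenation of all previous feature maps
    # and passes its own feature maps on to every subsequent layer.
    for _ in range(num_layers):
        y = layers.BatchNormalization()(x)
        y = layers.ReLU()(y)
        y = layers.Conv2D(growth_rate, 3, padding="same")(y)
        x = layers.Concatenate()([x, y])
    return x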
DenseNet 121 comprises distinctive blocks, such as dense blocks and transition
blocks, which differ from the usual pooling and convolution layers.
The DenseNet 121 model consists of an input layer followed by a convolution
layer, dense block, convolution layer, pooling layer, again a dense block, convolution
layer, pooling layer, dense block, pooling layer, and a linear layer followed by the
output, as shown in Fig. 2.

4 Results and Discussions

Class Imbalance Handling: This situation arises when the number of samples in the
training segment of the dataset is not balanced across the class labels. Class imbalance
denotes classification predictive modeling problems in which the class distribution is
skewed; to create a good prediction model, a well-balanced dataset is often required.
Most medical datasets are not balanced in their class labeling, and as a result,
classification methods do not perform as required.
Figure 3 shows the class imbalance of the various pathological diseases in the form
of frequency. The prevalence of positive cases varies widely across the multiple
pathological conditions: it is around 0.2% in the case of hernia, whereas infiltration,
the pathology with the least imbalance, has around 17.5% of its training cases
labeled positive.
The ideal scenario is to train the model on a balanced dataset, in which both types
of training cases contribute equally to the loss. With the normal cross-entropy loss
function, however, the majority class contributes more to the loss.

Impact of class imbalance on the loss function


Now consider the normal cross-entropy loss function for the different pathology
diseases. The contribution of the kth training example toward the loss is

L_cross-entropy(x_k) = −(y_k log(f(x_k)) + (1 − y_k) log(1 − f(x_k)))

where x_k and y_k are the input and its label, and f(x_k) is the output of the model
in the form of a probability.

Fig. 3 Class Imbalance of different pathology conditions



The average cross-entropy loss over a dataset D of N examples can be rewritten as

L_cross-entropy(D) = −(1/N) ( Σ over positive examples of log(f(x_i)) + Σ over negative examples of log(1 − f(x_i)) )

From these equations, it follows that when the imbalance is large, with fewer positive
cases, the loss is dominated by the negative class. The contribution of the two classes
is represented by

freq_p = (number of positive examples) / N
freq_n = (number of negative examples) / N
From Fig. 4, it can be seen that the contribution of the positive cases is generally
less than that of the negative cases. Our task is to make them contribute equally,
which can be done by multiplying each class component by a class-specific weight
such that

w_pos × freq_p = w_neg × freq_n

which is satisfied by the choice

w_pos = freq_n
w_neg = freq_p

Figure 5 shows the class and value graph after these weights are applied: the positive
and negative contributions within each class now contribute equally to the loss
function.

L^w_cross-entropy(x) = −(w_p y log(f(x)) + w_n (1 − y) log(1 − f(x)))
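A small NumPy sketch of this weighted loss, under the assumption that the labels and predictions are arrays of shape (N, number of pathologies), is:

import numpy as np

def weighted_cross_entropy(y_true, y_pred, eps=1e-7):
    # Per-class frequencies of positive and negative training examples.
    freq_p = y_true.mean(axis=0)
    freq_n = 1.0 - freq_p
    w_p, w_n = freq_n, freq_p  # the weights are the swapped frequencies
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # guard against log(0)
    loss = -(w_p * y_true * np.log(y_pred)
             + w_n * (1.0 - y_true) * np.log(1.0 - y_pred))
    return loss.mean()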

Visualization and Prediction under the ROC and AUROC

In this next step, after implementing the model, we evaluate it using the test set.
To generate the image predictions, a prediction generator is used. A ROC curve,
or receiver operating characteristic curve, shows the classification model's
performance at all threshold values. This curve plots two variables, the true positive
rate and the false positive rate.
The receiver operating characteristic (ROC) curve is an assessment metric used
mainly for binary classification problems. It is a probability curve that plots the
true positive rate against the false positive rate.

Fig. 4 Different diseases showing class and value in class imbalance function

Fig. 5 Contribution of class and value after class balancing



Table 1 Comparison of DenseNet algorithm with radiologists data

S. No.  Pathology           Radiologist  DenseNet121 model
1       Atelectasis         0.808        0.862
2       Cardiomegaly        0.828        0.831
3       Consolidation       0.841        0.893
4       Edema               0.91         0.924
5       Effusion            0.9          0.901
6       Emphysema           0.681        0.704
7       Fibrosis            0.787        0.806
8       Hernia              0.795        0.851
9       Infiltration        0.714        0.721
10      Mass                0.886        0.909
11      Nodule              0.849        0.894
12      Pleural thickening  0.779        0.798
13      Pneumonia           0.823        0.851
14      Pneumothorax        0.915        0.944

The higher the AUC value, the better the performance of the model; similarly, the
lower the AUC value, the poorer the performance.
Table 1 shows that the DenseNet 121 model's area under the curve (AUC) is higher
than the radiologists' data for all diseases.
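For reference, per-pathology AUC values of the kind reported in Table 1 can be computed with scikit-learn as sketched below, assuming y_true holds the binary test labels and y_score the model's predicted probabilities, both of shape (n_samples, 14):

from sklearn.metrics import roc_auc_score, roc_curve

auroc = [roc_auc_score(y_true[:, k], y_score[:, k]) for k in range(14)]
fpr, tpr, thresholds = roc_curve(y_true[:, 0], y_score[:, 0])  # ROC of one pathology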
Figure 6 plots the graph between the DenseNet 121 model and the radiologist data,
showing that the DenseNet121 data performs better across all the different diseases.

Fig. 6 Comparison of DenseNet 121 model with radiologist data (AUC per pathological disease)

5 Conclusion and Future Work Scope

Due to the limited medical resources available in Asian countries like India, accurate
and reliable diagnosis of disease is still a major concern. The authors here proposed
a DenseNet 121 classifier model for the diagnosis of multiple diseases by applying
techniques such as transfer learning and class imbalance handling. A pathological
dataset downloaded from the NIH Clinical Center, an American research hospital
(https://nihcc.app.box.com/v/ChestXray-NIHCC), has been used. The results were
compared with the radiologist data in terms of area under the curve, and it was found
that the multiple diseases diagnosed by DenseNet121 show better results compared
to the radiological data. In the future, the authors plan to build a deep learning-based
model for the diagnosis of other critical diseases.

References

1. Chen M, Hao Y, Hwang K, Wang L, Wang L (2017) Disease prediction by machine learning
over big data from healthcare communities. IEEE Access 5:8869–8879
2. Chhabra M, Gujral RK (2019) Image pattern recognition for an intelligent healthcare system: an
application area of machine learning and big data. J Comput Theor Nanosci 16(9):3932–3937
3. Saranya G, Pravin A (2021) A comprehensive study on disease risk predictions in machine
learning. Int J Electr Comput Eng 10(4):4217–4225
4. Dahiwade D, Gajanan P, Ektaa M (2019) Designing disease prediction model using machine
learning approach. In: 3rd international conference on computing methodologies and commu-
nication (ICCMC), pp 1211–1215
5. Franz L, Shrestra YR, Paudel B (2020) A deep learning pipeline for patient diagnosis prediction
using electronic health records. In: ACM SIGKDD conference on knowledge discovery and
data mining (KDD), BIOKDD pp 1–10
6. Liu S, Liu S, Cai W, Pujol S, Kikinis R, Feng D (2014) Early diagnosis of Alzheimer’s disease
with deep learning. In: 2014 IEEE 11th international symposium on biomedical imaging, ISBI,
6868045, Institute of Electrical and Electronics Engineers (IEEE), pp 1015–1018
7. Oh SL, Hagiwara Y, Raghavendra U, Yuvaraj R, Arunkumar N, Murugappan M, Acharya
(2020) A deep learning approach for Parkinson’s disease diagnosis from EEG signals. Neural
Comput Appl
8. Bashir S, Qamar U, Khan F, Naseem L (2021) HMV: a medical decision support framework
using multi-layer classifiers for disease prediction
9. Zhang W, Li R, Deng H, Wang L, Lin W, Ji S, Shen D (2015) Deep convolutional neural
networks for multi-modality isointense infant brain image segmentation. Neuroimage 108:214–
224
10. Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM (2017) ChestX-Ray8: hospital-scale
chest X-ray database and benchmarks on weakly-supervised classification and localization of
common thorax diseases. In: IEEE conference on computer vision and pattern recognition
(CVPR), Honolulu, HI, USA, 2017, pp 3462–3471
11. Harimoorthy K, Thangavelu M (2020) Multi-disease prediction model using improved SVM-
radial bias technique in healthcare monitoring system. J Ambient Intell Humaniz Comput
12(3):3715–3723
12. Radha P, Preethi BM (2019) Machine learning approaches for disease prediction from radiology
and pathology. J Green Eng (JGE) 9(2):12–21
13. Su C, Xu Z, Pathak J, Wang F (2020) Deep learning in mental health outcome research: a
scoping review. Transl Psychiatry 10(1)

14. Shen D, Wu G, Suk H (2017) Deep learning in medical image analysis. Annu Rev Biomed Eng
19(1):221–248
15. Qin C, Yao D, Shi Y, Song Z (2018) Computer-aided detection in chest radiography based on
artificial intelligence: a survey. BioMed Eng OnLine 17(1)
16. Ferreira-Junior J, Cardenas D, Moreno R, Rebelo M, Krieger J, Gutierrez M (2021) A general
fully automated deep-learning method to detect cardiomegaly in chest X-rays. Med Imaging:
Comput-Aided Diagnosis
17. Negahdar M, Beymer D (2019) Lung tissue characterization for emphysema differential
diagnosis using deep convolutional neural networks. Med Imaging: Comput-Aided Diagnosis
18. Irvin JA, Rajpurkar P, Ko M, Yu Y, Ciurea-Ilcus S, Chute C, Marklund H, Haghgoo B, Ball
RL, Shpanskaya K, Seekins J, Mong D, Halabi S, Sandberg J, Jones R, Larson D, Langlotz C,
Patel B, Lungren M, Ng A (2019) CheXpert: a large chest radiograph dataset with uncertainty
labels and expert comparison. AAAI
19. Khan MA, Kwon S, Choo J, Hong SM, Kang SH, Park IH, Kim SK, Hong SJ (2020) Automatic
detection of tympanic membrane and middle ear infection from oto-endoscopic images via
convolutional neural networks’. Neural Netw 126:384–394
20. Chhabra M, Kumar R (2020) Comparison of different edge detection techniques to improve
quality of medical images. J Comput Theor Nanosci 17(6):2496–2507
21. Saito A, Hakamata Y, Yamada Y, Sunohara M, Tarui M, Murano Y, Mitani A, Tanaka K, Nagase
T, Yanagimoto S (2019) Pleural thickening on screening chest X-rays: a single institutional
study. Respiratory Res 20(1)
22. Chouhan V, Singh S, Khamparia A, Gupta D, Tiwari P, Moreira C, Damaševičius R, de Albu-
querque V (2020) A novel transfer learning based approach for pneumonia detection in chest
X-ray images. Appl Sci 10(2):559
23. Jin K, Pan X, You K, Wu J, Liu Z, Cao J, Lou L, Xu Y, Su Z, Yao K, Ye J (2020) Automatic detec-
tion of non-perfusion areas in diabetic macular edema from fundus fluorescein angiography
for decision making using deep learning. Sci Rep 10(1)
An Effective Algorithm of Remote
Sensing Image Fusion Based on Discrete
Wavelet Transform

Richa, Karamjit Kaur, and Priti singh

Abstract Image fusion is the combination of two different images into one image.
Image integration has a huge application field, from medical to satellite imaging,
where this approach plays an important role. Various methods are available for
image merging; in particular, they fall into two categories, the spatial domain and the
frequency (transform) domain. A newly created fused image will be more accurate
and informative than the original images. This paper introduces an image integration
process based on the discrete wavelet transform (DWT). We compare various fusion
combinations such as MIN-MIN and MAX-MIN. The final image is evaluated
against multiple criteria, and the best approach is inferred in this article. Panchromatic
satellite images are the data used for this paper. This algorithm is well suited to such
images, as satellite pictures demand high spectral and spatial fidelity. Through image
fusion, the merged image carries both spatial and spectral characteristics in a single
image.

Keywords DWT · Wavelet · MIN · MAX · Hilbert · FIR · Fusion · Discrete ·
Spatial · Spectral

1 Introduction

A wavelet is a wave-like oscillation that begins at zero amplitude, rises, and returns
to zero. It resembles a seismograph or ECG recording and is often referred to as a
'short oscillation' [1]. Wavelets are typically designed for image processing and signal

Richa (B) · K. Kaur · P. singh


Department of Electronics and Communication Engineering, Amity School of Engineering and
Technology, Amity University, Gurugram, India
K. Kaur
e-mail: kkaur@ggn.amity.edu
P. singh
e-mail: psingh@ggn.amity.edu

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_24

processing with specialist capabilities. The most fundamental representation is a
wavelet series expansion of a square-integrable function with respect to either a
complete orthonormal set of basis functions or an overcomplete set (frame) of the
Hilbert space of square-integrable functions. Wavelet theory covers a multitude of
subjects. In wavelet analysis, the harmonic interpretation and transforms are
analogous to the continuous representation of (analogue) time signals. For a discrete
(sampled) signal, the approximation is a discrete-time wavelet transform of the signal
computed by a dyadic (octave) bank of discrete-time filters. These filter banks may
contain both FIR and IIR filters.
By the uncertainty principle of Fourier analysis and sampling theory, which underlies
the continuous wavelet transform (CWT), it is impossible to assign an event
simultaneously an exact time and an exact frequency response scale [2]; the product
of the uncertainties in time and in frequency has a lower bound. Consequently, in the
scalogram of a continuous wavelet transform, such an event marks an entire region
in the timescale plane rather than a single point. Specific wavelet bases can also be
considered under other definitions of uncertainty [3].
Types of wavelet transforms.

1.1 Continuous Wavelet Transforms (CWT)

Continuous wavelet transforms use continuous translation and scaling parameters.
The functions used by the continuous wavelet transform are known as continuous
wavelets (CW). These functions are analytical expressions, defined as functions of
either time or frequency. The vast majority of these continuous wavelets can be
used for both wavelet decomposition and composition transforms; they are the
continuous counterparts of orthogonal wavelets [4].
An example mother wavelet can be expressed mathematically as follows:

ϕ(t) = 2 sinc(2t) − sinc(t) = (sin(2π t) − sin(π t)) / (π t)

where ϕ is the mother wavelet and sinc is the sinc function [5].
Some examples of mother wavelets are given in [6–8] (Figs. 1, 2 and 3).
The following are some important and widely used continuous wavelets, applied in
a variety of applications [9].
1. Poisson wavelet
2. Morlet wavelet
3. Modified Morlet wavelet
4. Mexican hat wavelet
5. Complex Mexican hat wavelet
6. Shannon wavelet
7. Meyer wavelet

8. Difference of Gaussians
9. Hermitian wavelet
10. Hermitian hat wavelet
11. Beta wavelet
12. Causal wavelet
13. µ wavelets
14. Cauchy wavelet
15. Addison wavelet

Fig. 1 Meyer wavelet

Fig. 2 Morlet wavelet



Fig. 3 Mexican hat wavelet

1.2 Discrete Wavelet Transforms (DWT)

Discrete wavelets are continuous in time but have discrete translation and scaling
parameters [10]. Since computing wavelet coefficients at every possible scale is
much more difficult, it is easier to represent the signal by a discrete subset of its
coefficients. The wavelet used is known as a child wavelet, which can be expressed
mathematically as [11–13]

ϕ_{m,n}(t) = (1/√(a^m)) ϕ((t − n b a^m) / a^m)

where ϕ_{m,n}(t) is the child wavelet, and a^m, n b a^m with integers m, n are the
points of the discrete subset.
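As an illustrative sketch of the dyadic case a = 2, b = 1 in Python (the PyWavelets package and the test signal are assumptions for demonstration, not part of this work):

import numpy as np
import pywt

t = np.linspace(0, 1, 256, endpoint=False)
signal = np.sin(2 * np.pi * 8 * t)
cA, cD = pywt.dwt(signal, "haar")  # one dyadic (octave) level
print(len(cA), len(cD))  # 128 128: each branch keeps half the samples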

2 Wavelets for 2D Signals

Since images are typically two-dimensional, research into 2-D wavelets f(x, y) is
critical. To transform images, two-dimensional wavelets ϕ(x, y) are used instead of
one-dimensional wavelets ϕ(t). Two representations of 2D wavelets are shown in
Figs. 4 and 5.
A direct 2D transform necessitates a significant amount of computing power as
well as further mathematical processing; instead, the 1D transform can be applied
sequentially to the rows and columns of the image as a separable 2D transform.
Thus, instead of applying a 2D wavelet to the pixels, a 1D transformation

Fig. 4 2D Haar wavelet

Fig. 5 2D Mexican hat wavelet

is applied to the image's rows and columns. This algorithm is advantageous since
separable transforms require only low computational complexity [14, 15].

Filter Bank Theory

Digital FIR filters can be used to decompose an image according to the multi-
resolution principle. The fast decomposition algorithm uses multirate filter banks
to compute the wavelet coefficients.
Figure 6 demonstrates the working of the filter bank: a wavelet analysis filter bank
is used to find the detail coefficients and the approximation coefficients (cA) of an
image. The detail coefficients show the pixel trends along the horizontal, vertical,
and diagonal directions.

Fig. 6 Filter bank scheme (DWT)

Lo_D is the low-pass wavelet decomposition filter.
Hi_D is the high-pass wavelet decomposition filter.
↓2 denotes downsampling by a factor of 2.
As an image is passed through the high-pass and low-pass filters, the signal is
decomposed into two components:
1. Detail coefficients (high frequency)
2. Approximation coefficients (low frequency)
The MATLAB function used to obtain the desired decomposition is:
[cA, cH, cV, cD] = dwt2(img, ‘haar’, ‘mode’, ‘sym’) (Fig. 7)
With the help of the inverse discrete wavelet transform, we can reconstruct the
approximate image from the decomposed image by upsampling the signal by a factor
of two; the reconstructed signal then has a frequency range equal to that of the
original. To reconstruct the image accurately, only half of the original samples are
required.
The MATLAB function used for IDWT is:
imgr = idwt2(cA, cH, cV, cD, ’haar’, ’mode’, ’sym’)

Fig. 7 Filter bank scheme (wavelet reconstruction or IDWT)

Fig. 8 Level 1 image decomposition

In Fig. 8, after one level of decomposition, the M × M image matrix is decomposed
into four distinct matrices, namely Approximation Coefficients (cA1), Horizontal
Detailed Coefficients (cH1), Vertical Detailed Coefficients (cV1) and Diagonal
Detailed Coefficients (cD1). All these matrices have dimensions M/2 × M/2. For
example, if the image size is 256 × 256, i.e. M × M, then all four matrices will be
128 × 128 after one stage of decomposition [15–17].
Approximation coefficients (cA1) are decomposed further at level 2 into four more
matrices. In Fig. 9, the M/2 × M/2 matrix cA1 is decomposed after the second step
of decomposition into Approximation Coefficients (cA2), Horizontal Detailed
Coefficients (cH2), Vertical Detailed Coefficients (cV2) and Diagonal Detailed
Coefficients (cD2). All these matrices have dimensions M/4 × M/4. For instance, if
cA1 is 128 × 128, then all four matrices will be 64 × 64 after the second step of
decomposition [18].

Fig. 9 Level 2 image decomposition
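These sub-band sizes can be verified with a two-level separable 2D transform; the sketch below uses PyWavelets in Python, and the random array simply stands in for an M × M image:

import numpy as np
import pywt

img = np.random.rand(256, 256)  # M x M image
coeffs = pywt.wavedec2(img, "haar", level=2)
cA2, (cH2, cV2, cD2), (cH1, cV1, cD1) = coeffs
print(cA2.shape, cH2.shape, cH1.shape)  # (64, 64) (64, 64) (128, 128)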

Image Fusion using discrete wavelet transform (DWT)

This method gathers all of the important details from different images and combines
them into a single image. The single output image will be more accurate and
informative than any of the original images, containing all the required and significant
features [3]. Image fusion aims to reduce the volume of data and construct images
that are more suitable and understandable for human and machine perception. The
key advantages of using image fusion are restoring degraded images, blending
images, and achieving face morphing.
Two major domains of image fusion are listed below [14, 19]:
1. Spatial domain
(a) High-pass filtering (HPF)
(b) Intensity–Hue–Saturation (IHS)
(c) Brovey method
(d) Principal component analysis (PCA)

2. Frequency Domain
(a) Pyramid-based (Laplacian, Gaussian)
(b) Discrete Cosine Transform (DCT)-based
(c) Discrete Wavelet Transform (DWT) based
(d) Curvelet Transform (CT)-based

3 Proposed Model

Our suggested model decomposes and recombines two images, used as the dataset,
with the discrete wavelet transform (DWT) [20] and the inverse discrete wavelet
transform (IDWT) (Figs. 10 and 11).

Algorithm 1: Decompose Image Level 1

Step 1: Select an image from the desired folder
Step 2: Read the image pixel values
Step 3: img ← selected image pixel values
Step 4: Calculate the approximation matrix
[cA, cH, cV, cD] = dwt2(img, ‘haar’, ‘mode’, ‘sym’)
Step 5: Plot the horizontal and vertical coefficients
Step 6: Inverse wavelet transform to get the synthesized image
imgr = idwt2(cA, cH, cV, cD, ‘haar’, ‘mode’, ‘sym’);
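An equivalent sketch of Algorithm 1 in Python, using PyWavelets in place of MATLAB's dwt2/idwt2 (the grayscale test array is a stand-in for the selected image), is:

import numpy as np
import pywt

img = np.random.rand(256, 256)  # Steps 1-3: selected image pixel values
# Step 4: approximation and detail matrices
cA, (cH, cV, cD) = pywt.dwt2(img, "haar", mode="symmetric")
# Step 6: inverse wavelet transform to get the synthesized image
imgr = pywt.idwt2((cA, (cH, cV, cD)), "haar", mode="symmetric")
assert np.allclose(img, imgr)  # perfect reconstruction with 'haar'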
Decomposition of Image A (Level 1 Decomposition)
See Fig. 12.
Decomposition of Image B (Level 1 Decomposition)

Fig. 10 Image A: panchromatic satellite image

Fig. 11 Image B: panchromatic satellite image

Fig. 12 Image A: level 1 decomposition



4 Image Fusion Using Different Fusion Arithmetic Operations

Figures 12 and 13 illustrate the decomposition of the images into four sub-bands:
Approximation Coefficients (cA), Horizontal Detailed Coefficients (cH), Vertical
Detailed Coefficients (cV) and Diagonal Detailed Coefficients (cD). The inverse
discrete wavelet transform is an essential element in the integration of the
decomposed images, i.e., image A and image B, where the fusion of the above
coefficients takes place.
The final fused image is created on the following bases:
• cA = fusion1(cA1 , cA2 )
• cH = fusion2(cH1 , cH2 )
• cV = fusion2(cV1 , cV2 )
• cD = fusion2(cD1 , cD2 )
Fusion 1 and Fusion 2 are arithmetic operations on matrices; each can be the Mean,
Max, or Min mathematical operation. We used the three above-mentioned operations
and compared the effects using the signal-to-noise ratio (SNR), peak signal-to-noise
ratio (PSNR), mean square error (MSE), and structural similarity index (SSIM),
though other operations are possible. A total of nine mathematical combinations of
the approximation and detail wavelet coefficients, such as MAX-MAX, MAX-MIN
and so forth, are feasible.

Fig. 13 Image B: level 1 decomposition

Fig. 14 MAX-MAX combination

Algorithm 2: Image Fusion

Step 1: Define fusion type = ‘MEANMAX’
Step 2: Choose wave type = ‘Coif5’
Step 3: Choose the first image
Img1 ← First Image
Step 4: Choose the second image
Img2 ← Second Image
Step 5: [ROW, COL] ← size of first image
Step 6: If size of Img2 ≠ size of Img1 then:
Step 7: Resize the image
Step 8: Fusion operation for the Red, Green and Blue frames
Step 9: Final Image ← Fused Image
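A compact Python sketch of this fusion scheme, assuming two equal-sized grayscale arrays and PyWavelets (the resizing and per-channel RGB handling of Steps 6-8 are omitted), is:

import numpy as np
import pywt

def fuse_images(img1, img2, fusion1=np.maximum, fusion2=np.maximum,
                wavelet="coif5"):
    # Level 1 decomposition of both source images.
    cA1, details1 = pywt.dwt2(img1, wavelet)
    cA2, details2 = pywt.dwt2(img2, wavelet)
    # fusion1 on approximation coefficients, fusion2 on detail coefficients.
    cA = fusion1(cA1, cA2)
    details = tuple(fusion2(d1, d2) for d1, d2 in zip(details1, details2))
    return pywt.idwt2((cA, details), wavelet)

# The MEAN rule used in the MEANMAX variant: lambda a, b: (a + b) / 2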

The following images were obtained after fusing the images of Figs. 10 and 11
(Figs. 15, 16, 17, 18, 19, 20, 21 and 22; Table 1).

5 Conclusion

In this paper, an image fusion algorithm for panchromatic satellite images was
introduced. We experimented with various arithmetic combinations for fusion1 and
fusion2, and the final fused image was analysed with parameters such as SNR,
PSNR, MSE and SSIM. We may infer that image fusion in ‘haar’ mode with the
‘Coif5’ waveform produced the most effective results in MAX-MAX mode. A total
of 9 variations have been tested based on their performance.

Fig. 15 MAX-MEAN combination

Fig. 16 MAX-MIN combination

Fig. 17 MEAN-MAX combination

Fig. 18 MEAN-MEAN combination



Fig. 19 MEAN-MIN combination

Fig. 20 MIN-MAX combination



Fig. 21 MIN-MEAN combination

Fig. 22 MIN-MIN combination

Table 1 Analysis of image fusion combinations

Fusion combination  SNR      PSNR     MSE       SSIM
MAX MAX             18.0668  23.0096  325.1803  0.7760
MAX MEAN            18.0290  23.0203  324.3808  0.7142
MAX MIN             18.0245  23.0175  324.5853  0.6982
MEAN MAX            16.2963  23.0078  325.3106  0.7218
MEAN MEAN           16.2428  23.0073  325.3486  0.6202
MEAN MIN            16.1838  22.9821  327.2403  0.5941
MIN MAX             14.3661  23.0627  321.2286  0.7453
MIN MEAN            14.2154  23.0222  324.2331  0.6561
MIN MIN             14.1943  23.0285  323.7631  0.6275

References

1. Yin Y, Hu Y, Liu P (2011) The research on denoising using wavelet transform. In: 2011
international conference on multimedia technology, Hangzhou, China, pp 5177–5180. https://
doi.org/10.1109/ICMT.2011.6002276
2. Aishwarya N, Abirami S, Amutha R (2016) Multifocus image fusion using discrete wavelet
transform and sparse representation. In: 2016 international conference on wireless communi-
cations, signal processing and networking (WiSPNET), Chennai, India, pp 2377–2382. https://
doi.org/10.1109/WiSPNET.2016.7566567
3. Vishwanath M, Owens RM (1996) A common architecture for the DWT and IDWT. In: Proceed-
ings of international conference on application specific systems, architectures and processors:
ASAP ‘96, Chicago, IL, USA, pp 193–198. https://doi.org/10.1109/ASAP.1996.542814
4. Chipman LJ, Orr TM, Graham LN (1995) Wavelets and image fusion. In: Proceedings of
international conference on image processing, Washington, DC, USA, vol 3, pp 248–251.
https://doi.org/10.1109/ICIP.1995.537627
5. Kulkarni JS (2011) Wavelet transform applications. In: 2011 3rd international conference on
electronics computer technology, Kanyakumari, India, pp 11–17. https://doi.org/10.1109/ICECTECH.2011.5941550
6. Unser M, Blu T (2003) Wavelet theory demystified. IEEE Trans Signal Process 51(2):470–483.
https://doi.org/10.1109/TSP.2002.807000
7. Li H, Manjunath BS, Mitra SK (1994) Multi-sensor image fusion using the wavelet transform.
In: Proceedings of 1st international conference on image processing, Austin, TX, USA, vol 1,
pp 51–55. https://doi.org/10.1109/ICIP.1994.413273
8. Zhang H, Cao X (2013) A way of image fusion based on wavelet transform. In: 2013 IEEE 9th
international conference on mobile ad-hoc and sensor networks, Dalian, China, pp 498–501.
https://doi.org/10.1109/MSN.2013.103
9. Evangelopoulos G, Maragos P (2008) Image decomposition into structure and texture subcom-
ponents with multifrequency modulation constraints. In: 2008 IEEE conference on computer
vision and pattern recognition, Anchorage, AK, USA, pp 1–8. https://doi.org/10.1109/CVPR.
2008.4587649
10. Haribabu M, Bindu CH (2017) Visibility based multi modal medical image fusion with DWT.
In: 2017 IEEE international conference on power, control, signals and instrumentation engi-
neering (ICPCSI), Chennai, India, pp 1561–1566. https://doi.org/10.1109/ICPCSI.2017.8391973
11. Yang Y (2010) Multimodal medical image fusion through a new DWT based technique. In:
2010 4th international conference on bioinformatics and biomedical engineering, Chengdu,
China, pp 1–4. https://doi.org/10.1109/ICBBE.2010.5517037
12. Majumdar J, Chetan KT, Nag SS (2011) A comparative analysis of multilevel Daubechie
wavelet based image fusion. In: Venugopal KR, Patnaik LM (eds) Computer networks and
intelligent computing. ICIP 2011. Communications in computer and information science, vol
157. Springer, Berlin. https://doi.org/10.1007/978-3-642-22786-8_68
13. Bhataria KC, Shah BK (2018) A review of image fusion techniques. In: 2018 Second interna-
tional conference on computing methodologies and communication (ICCMC), Erode, India,
pp 114–123. https://doi.org/10.1109/ICCMC.2018.8487686
14. Rajini KC, Roopa S (2017) A review on recent improved image fusion techniques. In:
2017 International conference on wireless communications, signal processing and networking
(WiSPNET), Chennai, India, pp 149–153. https://doi.org/10.1109/WiSPNET.2017.8299737
15. Priya RMS, Dharun VS (2014) Image fusion techniques for high resolution images—a compar-
ison. In: 2014 International conference on control, instrumentation, communication and compu-
tational technologies (ICCICCT), Kanyakumari, India, pp 1266–1270. https://doi.org/10.1109/
ICCICCT.2014.6993155

16. Jha N, Saxena AK, Shrivastava A, Manoria M (2017) A review on various image fusion
algorithms. In: 2017 International conference on recent innovations in signal processing and
embedded systems (RISE), Bhopal, India, pp 163–167. https://doi.org/10.1109/RISE.2017.8378146
17. Omar Z, Stathaki T (2014) Image fusion: an overview. In: 2014 5th international conference
on intelligent systems, modelling and simulation, Langkawi, Malaysia, pp 306–310. https://
doi.org/10.1109/ISMS.2014.58
18. El-Hoseny HM, El Rabaie EM, Elrahman WA, Abd El-Samie FE (2017) Medical image fusion
techniques based on combined discrete transform domains. In: 2017 34th national radio science
conference (NRSC), Alexandria, Egypt, pp 471–480. https://doi.org/10.1109/NRSC.2017.7893518
19. Xuyun C, Ting Z, Qianlin Z, Hao M (1996) 2-D DWT/IDWT processor design for image
coding. In: 2nd International conference on ASIC, Shanghai, China, pp 111–114. https://doi.
org/10.1109/ICASIC.1996.562764
20. Sung T, Shieh Y, Yu C, Hsin H (2006) Low-power multiplierless 2-D DWT and IDWT archi-
tectures using 4-tap Daubechies filters. In: 2006 Seventh international conference on parallel
and distributed computing, applications and technologies (PDCAT’06), Taipei, Taiwan, pp
185–190. https://doi.org/10.1109/PDCAT.2006.78
QoS-Based Load Balancing in Fog
Computing

Shilpi Harnal, Gaurav Sharma, and Ravi Dutt Mishra

Abstract Fog computing is the most advanced and futuristic technology that is
employed alongside cloud computing to mitigate the gap between cloud computing
services and edge devices in a network. Various constraints present in cloud
computing can be easily resolved by fog computing, which serves as an ameliorating
addition to it. Instead of performing computations at a relatively distant server, fog
computing does all the computations locally in a fog virtual server. Today, fog
computing challenges such as resource utilization, load balancing, and response time
are the main focus of many existing systems, and load balancing is still an issue for
many application domains in a fog environment. Uniform distribution of the
workload among all the edge nodes is desirable to improve the performance of the
whole system. A novel architecture has been proposed in this paper to balance the
load between available fog devices and improve various QoS parameters such as
resource utilization, turnaround time, delay time, and average response time by
implementing the concept of task and resource categorization, so that a requesting
task is submitted to the most suitable resource considering QoS parameters.

Keywords Fog computing · QoS · Load balancing · Reliability · Delay · Makespan

S. Harnal (B)
Chitkara University Institute of Engineering and Technology, Chitkara University, Chandigarh,
Punjab, India
e-mail: shilpi.harnal@chitkara.edu.in
G. Sharma · R. D. Mishra
Seth Jai Parkash Mukand Lal Institute of Engineering and Technology, Radaur, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_25

1 Introduction

Fog computing, fogging, or fog networking are terms for a decentralized computing
architecture in which various computing resources are positioned efficiently and
logically. The fogging technique intends to enhance system performance and reduce
the amount of data transported to the cloud for computation [1]. In fogging,
processing is handled by nodes present on smart devices such as gateways or routers;
therefore, it is always preferable to send a smaller amount of data to the cloud. Cloud
computing renders computational as well as storage capabilities for IoT-based
applications [1], but despite providing so many facilities, it has many challenges to
deal with, such as processing delay, turnaround time, and response time. Fog
computing provides a better solution for these challenges as it operates nearer to end
users [2].
fogging [3] was initially put forward by Cisco, and it served as an advancement in
cloud computing that can convert the edge of a network into a computing system
which can implement various IoT applications. Fog computing operates by bringing
useful resources nearer to end devices which produce and consume the data. In
fog computing, computation workload is handled in parts by devices, known as fog
servers or fog nodes, which are closer to end users instead of performing operations at
cloud layer. Fog servers can be installed at a place which has network connection such
as a factory, any commercial building, railway stations, and moving vehicles. Any
device like switches, controllers, routers or one with network, storage, and compu-
tation capability can work as a fog server. Because of the presence of resources
closer to the end devices, the overall transmission time for processing tasks get opti-
mized as data is received by processing devices in no time. However, due to limited
computing capability of a fog server, only small size task requests are processed in a
fog computing architecture. Large-scale tasks or tasks with longer delays are pushed
to cloud layer for processing. With the advancement in the field of cloud computing,
it has led to a novel computing system known as cloud–fog computing. In spite of
the fact that the new model is more efficient, it still struggles with issues like load
balancing and task scheduling. Though cloud–fog computing provides a suitable
environment for bag-of-tasks (BoT) applications, it is still challenging to schedule
tasks among cloud nodes and fog nodes while distributing load equally. Bag-of-
tasks (BoT) applications have tasks independent to each other and are used in many
IoT applications such as computational biology [4] and computer imaging [3]. Load
balancing in such systems proves beneficial for end users or service providers. That is
why, it is vital to deploy load balancing techniques as it diminishes the transmission
delay while directly affecting the users’ experience.

2 Related Work

Li et al. [5] have presented an SSLB mechanism for centralized and decentralized
fog servers distributed over a large area. They used the k-means algorithm to group
fog nodes into cells, and independent scheduling is performed in each cell. Workload
is migrated across cells with the help of two adaptive threshold policies known as
task distributing and task grasping. The scheme outperforms both centralized and
decentralized schemes with less bandwidth consumption.
Verma et al. [6] have put forward a load balancing algorithm using a replication
technique in a fog–cloud integrated system. Replicas of data are maintained in fog
networks, which reduces the dependency on cloud data centers and hence the latency.
If a fog server is not capable of providing the data on receiving a request from the
nearest client, the request is sent to the next nearest server. If the cloud server fails
to fulfill the request, it transmits the request to all the cloud servers present at the
cloud layer. It has been shown that the algorithm yields a shorter response time than
Round Robin and Throttled but proves costly because of the large amount of data
transmission.
Yu et al. [7] have come up with scalable and dynamic load balancer (SDLB).
In this approach, hashing technique-based data structure has been implemented for
designing. It has also been observed that SDLB is faster and consumes relatively less
amount of memory in comparison to other load balancer designs that makes use of
hash table and hashing technique.
Yousefpour et al. [8] have presented a cooperative offloading policy for load
balancing that would send requests from end users to nearest fog servers (fog to
fog) or forward them to the cloud (fog to cloud) to minimize the delay. A threshold
value has been set for waiting time, and if the calculated waiting time is less than
the threshold value, the task is accepted; otherwise, it gets offloaded. Tasks can be
forwarded to fog servers for a limited number of times only, after which they must
be forwarded to the cloud server. Simulations show that their proposed algorithm is
better in terms of delays and index value.
Radojevic presented an ameliorated algorithm called central load balancing deci-
sion model (CLBDM) over Round Robin [9]. This algorithm has been based upon
Round Robin but differs in the fact that it also accounts for the overall time duration
of connection established between the client and the server by measuring the total
time a task would take while executing on a given cloud resource.
Ren introduced another load balancing mechanism known as ESWLC, based on the
weighted least connection (WLC) algorithm [10]. In this approach, a task is allocated
to the node with the minimum assigned weight while the capabilities of the node are
also considered; depending on weight and node capabilities, a task is allocated to a
node.

Wang et al. [11] proposed advancement in an algorithm that combines both oppor-
tunistic load balancing (OLB) and load balance min-min (LBMM) scheduling algo-
rithms and executes tasks efficiently for maintaining the load of the system. The
proposed algorithm operates in a dynamic environment.
In another work [12], a honeybee foraging solution has been presented for load
balancing in a distributed environment, inspired by nature, where the search for food
by bees forms the basis of distributed load balancing. A queue data structure, self-
organizing in nature, has been used in the implementation of the algorithm.
Ningning et al. [13] conceptualized a load balancing algorithm in which the arrival
and exit of nodes in the fog domain occur dynamically. Initially, a mapping of the
physical resources takes place. Then a graph is developed in which each virtual
resource is represented by a node with a certain capacity; pairs of virtual nodes are
linked by edges, and the bandwidth of each communication link is assigned as its
weight. Iteratively, edges are removed one after the other according to a minimum
weight threshold criterion, which accounts for the division of tasks among virtual
machines. Whenever a new fog node arrives, the load is redistributed in the
neighborhood and balanced while maintaining the degree of task distribution and
links. When nodes are to be removed, the above-mentioned steps are followed in
reverse.
Agarwal et al. [14] designed an algorithm that accounts for the distribution of
load over the fog layer and the cloud layer in an efficient manner. It states that a fog
server manager accepts the task requests and fulfills the end users’ requirements by
allocating requested resources. When a request arrives, the manager verifies it and
looks for the availability of demanded resources. It is then decided either to execute
all tasks, execute a few and postpone others, or to transfer the demand to the cloud
node for task execution. Simulation of the algorithm depicts that the efficiency of it
is much better than other techniques adopted for load balancing and to optimize the
response time.

3 Proposed Architecture

Load balancing is a necessity of any cloud computing system because it alone ensures
efficient resource and bandwidth utilization and improved delay times. Many factors
affect or categorize load balancing in different ways, such as a static or dynamic
scheduling environment, or a software or hardware-based load balancer [15]. In the
presented architecture, a QoS-based load balancing algorithm has been designed for
a fog computing environment. As in fogging, servers are located nearer
to the end users, resources can be accessed easily, with minimum response time and
at less cost. The presented architecture has been particularly designed for real-time
applications, where time and consistency play a major role. The proposed architecture
is represented diagrammatically in Fig. 1.

Fig. 1 Proposed fog architecture
The designed framework tackles the issue of load balancing based upon the concepts
of reliability and of categorization of tasks as well as resources, and has been named
the QoS-based load balancing algorithm (QLBA). Two important parameters, namely
success rate and availability, measure the reliability of the VMs and are computed as
follows:

3.1 Success Rate of Node i (SVi )

For a VM, the success rate depends directly upon the total number of successfully
executed tasks and the total number of task requests it has received. As the success
rate is directly proportional to the VM's reliability, the higher the success rate value,
the more reliable the VM is [16]. Hence, the success rate for a VM 'i' (denoted as
SVi) can be computed using the following equation, and it always takes a value
between 0 and 1.

SVi = TSi / TRi    (1)

where
TSi: total number of tasks successfully executed
TRi: total number of task requests received.

3.2 VM Availability (VAi )

In a cloud environment, the term VM availability [16] for any ith VM can be derived
as the ratio of the total number of times the VM was available for the task execution
(T E ) and the total number of attempts made to submit the tasks (T N ) through a
particular node in that network.

VAi = TEi /TN i (2)

where
T Ei : total number of times VM was available for task execution
T Ni : number of attempts made to submit the tasks.

Using the above-defined terms, the reliability value of the ith VM can be derived
using the equation:

Rvaluei = 0.5 × SVi + 0.5 × VAi    (3)
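A minimal sketch of Eqs. (1)-(3) in Python, with hypothetical tallies for one VM, is:

def reliability(ts_i, tr_i, te_i, tn_i):
    sv_i = ts_i / tr_i              # success rate, Eq. (1)
    va_i = te_i / tn_i              # availability, Eq. (2)
    return 0.5 * sv_i + 0.5 * va_i  # Rvalue, Eq. (3)

# Example: 45 of 50 tasks succeeded; the VM answered 48 of 50 submissions.
print(reliability(45, 50, 48, 50))  # 0.93, so the VM is kept (Rvalue >= 0.5)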



Algorithm 1: Quality of Service-based Load Balancing Algorithm (QLBA)

Input: Fog devices (Vi), number of reliable VMs, number of incoming requests (RM),
clusters (CH, CM, CL), TaskList(), task computation demand (TC)
Output: Optimum allocation of tasks onto the most suitable fog devices
1. Procedure: Task_Scheduler()
2. Initialize total N number of fog devices V1, V2, ..., VN
3. For each Vi ∈ VM, where VM is the list of available virtual machines
4.   Ri = Reliable_Fog(Vi)
5.   If Ri >= 0.5
6.     Add Vi to the TaskList()
7.   Else
8.     Reject Vi
9.   End If
10. End of For loop of step 3
11. While TaskList() != NULL do
12.   If Vi = High_config
13.     Assign Vi to CH
14.   Else if Vi = Medium_config
15.     Assign Vi to CM
16.   Else assign Vi to CL
17.   End If
18. End While loop of step 11
19. For each incoming request Ri = R1, R2, ..., RM do
20.   Identify the task computation demand TC
21.   If TC = High_Computation, assign Ri to clusters C1, C2, ..., CH
22.   Else if TC = Medium_Computation, assign Ri to clusters C1, C2, ..., CM
23.   Else TC = Low_Computation, assign Ri to clusters C1, C2, ..., CL
24.   End If of step 21
25. End of For loop of step 19
26. For each VMi amongst the targeted clusters
27.   Apply the PEFT algorithm to rank the VMs according to VM load and execution time
28. End of For loop of step 26
29. Assign task Ri to the highest-ranked, least-loaded and most efficient VM discovered in step 27
30. Repeat step 19 until all requests/tasks have been completed
31. End of Procedure

Algorithm 2: Fetching the set of reliable fog devices from the set of available devices
Input: Fog devices (Vi), number of reliable fog devices, total number of tasks
successfully executed (TSi), total number of task requests received (TRi), total
number of times the VM was available for task execution (TEi), number of attempts
made to submit the tasks (TNi)
Output: Reliability value of the available fog devices
1. Procedure Reliable_Fog(x)
2. Initialize Rvalue to 0
3. Calculate the reliability values of the available devices
4. SVi = TSi / TRi    # success rate of virtual machine x
5. VAi = TEi / TNi    # availability of virtual machine x
6. Rvaluei = 0.5*(SVi) + 0.5*(VAi)
7. Return Rvaluei
8. End Procedure

In the above-mentioned steps of the QLBA algorithm, all fog devices are numbered
in the very first step. In the second step, the reliability value of those fog devices is
calculated and, depending upon that value, devices with Rvalue < 0.5 are discarded
and not allowed to take further part in the process. Thereafter, a logical division of
the fog devices into three categories is performed, i.e., high capability cluster CH,
medium capability cluster CM, and low capability cluster CL, according to their
processing capability in million instructions per second (MIPS); devices with higher
processing speed are put into the high capability cluster, and so on.
A count of the total number of incoming tasks is made, and these tasks are grouped
into high, medium, and low categories as per their computation demand. The
computation demand of each task is identified, and tasks with high, medium, or low
computation demand are allocated to the high, medium, or low capability cluster,
respectively. The VMs of the appropriate cluster are then ranked using the predict
earliest finish time (PEFT) algorithm according to each VM's current load and
minimum execution time, and the task is assigned to the machine with the highest
rank. Eventually, the task gets assigned to an optimized and most appropriate VM,
as sketched below.
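A simplified Python sketch of this task-to-cluster mapping and ranking follows; the MIPS thresholds and the load/finish-time heuristic standing in for PEFT are illustrative assumptions, not the exact QLBA implementation:

def cluster_of(mips, high=1500, medium=500):
    # Categorize a fog device by processing capability (MIPS).
    return "CH" if mips >= high else "CM" if mips >= medium else "CL"

def schedule(task, vms):
    # Keep only reliable devices whose cluster matches the task's demand.
    candidates = [v for v in vms
                  if v["rvalue"] >= 0.5 and cluster_of(v["mips"]) == task["demand"]]
    # Rank: least current load first, then smallest estimated execution time.
    best = min(candidates, key=lambda v: (v["load"], task["length"] / v["mips"]))
    best["load"] += task["length"]  # account for the newly assigned work
    return best

task = {"demand": "CH", "length": 4000}
vms = [{"mips": 2000, "load": 0, "rvalue": 0.9},
       {"mips": 1600, "load": 2, "rvalue": 0.8}]
print(schedule(task, vms)["mips"])  # picks the less-loaded high-capability VM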

4 Environmental Setup

To check the efficiency of the presented framework [17] and compare the results
with those of other contemporary techniques, various simulation experiments were
performed using iFogSim in a cloud–fog environment. iFogSim is an open-source
platform built as an extension of the CloudSim simulator. It simulates various
computing resources in a fog environment and manages the efficiency of
available resources under different scheduling techniques. Homogeneous
computation characteristics are employed only for performing the simulation.

Table 1 Simulation parameters for fog environment

Parameters                       Fog
Number of nodes                  10
Usage cost of CPU ($)            0.4
Number of instructions (MIPS)    [1–200]
Required memory (MB)             [50–200]
System                           Intel Core i5, 2 GHz
Operating system                 Windows 10
Tasks with varied sizes are submitted to VMs which have various parameters. As
the simulation environment is different for every test conducted, requests received
are also of various types such as data request, memory request, and data transfer
request. All the fog nodes in the framework have their RAM, CPU, processing speed
(MIPS), transmission capacity, and utilization cost. The simulation was performed
with five processing nodes with the specifications as shown in Table 1.
The framework is designed to fully serve all the requests made by end users/IoT
devices. Different tuples of requests are created and, according to the further
processing required, they are classified into various categories.
The experiment has been performed using this framework on 20 to 400 different
tasks. The simulation environment with all the requirements has been shown in
Table 1. The simulation tests have been carried out with iFogSim using Eclipse,
which is a widely accepted tool for performing simulation tests in a fog computing
environment [15].

5 Results and Discussions

Many factors play a vital role in managing load balancing, such as resource utilization,
turnaround time, delay time, and response time. Taking utmost care of these factors
can drastically improve the quality of service (QoS) [18]. Qualitative paradigms of
various load balancing mechanisms can be assessed and compared with the help of
several metrics [19].
The parameters that have been taken into account for analyzing the simulation
results of the presented framework are resource utilization, turnaround time, delay
time, and average response time [20]. Furthermore, the presented framework has been
compared with conventional scheduling algorithms such as first come first served
(FCFS), shortest job first (SJF), and the Max_Min algorithm. In the following
subsection, a turnaround time comparison test is performed.

Fig. 2 Turnaround time test comparison

5.1 Turnaround Time Performance Test

Turnaround time can be defined as the time interval between submission and comple-
tion of the task in a fog computing environment. The presented algorithm has been
compared with the FCFS, SJF, and Max_Min algorithms, and simulation has been
performed with the help of ten different workloads using 40–400 tasks with sizes
ranging from 2000 to 8000. Each task is executed on a constant computation speed
of 2000 MIPS. The comparison between the results is displayed in Fig. 2.
The results have shown that the turnaround time of the presented QLBA algorithm
is less than other traditional scheduling algorithms.

5.2 Fog Resource Utilization

Resource utilization can be computed easily by subtracting the starting execution
time from the finishing execution time of each task, for each resource utilized:

Ruj = Σ (tfi − tsi) over all tasks ti executed on Rj    (4)

where tfi is the finishing time, and tsi is the start time of task ti on resource Rj.
The presented algorithm has been compared with the FCFS, SJF, and Max_Min
algorithms in terms of resource utilization (in percentage). Figure 3 depicts the
nonuniform division of resources in the case of the conventional approaches; in
FCFS, this is because resources are allocated in the order in which requests are
received. In contrast, the presented algorithm uses a task and resource categorization
approach, so resource utilization is effectively divided among all the fog nodes in
the network.

Fig. 3 Utilization of resources in terms of tasks performed by fog devices

5.3 Average Response Time

The response time of a server in a fog environment can be described as the duration
from the submission of a task request to the sending of the first response [22].
The following equation computes the response time:

Rt = Tp + Tc + Td (5)

where Rt is the response time, Tp is the propagation time, Tc is the computation time,
and Td is the delay time.
Figure 4 presents the simulation results with a total of 60 incoming tasks in a
cluster of 10 and 20 fog devices. It has been depicted that the average response time
in the case of QLBA algorithm is much better when compared with other conventional
algorithms.

Fig. 4 Average response time comparison of FCFS, Max_Min, SJF, and QLBA

Fig. 5 Delay comparison of FCFS, SJF, Max_Min and QLBA

5.4 Processing Delay

The fog node processing delay can be defined as the time that a node spends
processing a data packet. As resources in a fog computing environment are located
nearer to the end user devices, processing delay is significantly less [21] than that
in a cloud environment where resources are remotely located [22, 23]. It can be
calculated using the following equation:

dnodal = dproc + dqueue + dtrans + dprop (6)

where d proc is the processing time, d queue is the delay time in queue, d trans is the
transmission delay time and d prop is the propagation delay.
The technique adopted for allocating resources to the tasks greatly affects the
processing delay [24, 25]. The performance of the overall network will decrease if
the processing delay is high. Appropriate allocation of resources to the task requests
will minimize the processing delay and performance will be enhanced. Figure 5 shows
the processing delay in the case of various task scheduling algorithms. Therefore,
allocating resources dynamically and intelligently will reduce the delay at the server
end. The simulation tests performed show that QLBA performs better than other
conventional scheduling policies in terms of the average processing delay.
The simulation test results, obtained using all the important parameters, show that
the presented framework has improved outcomes for turnaround time, resource
utilization, response time, and processing delay, and that it outperforms the other
scheduling algorithms.

6 Conclusion and Future Scope

The fog computing paradigm has become widely popular and is adopted immensely
because of performance improvements that were not achievable in a cloud environ-
ment. As resources in fogging are placed near IoT devices, the latency overheads are

greatly reduced. This paper presents an algorithm for load balancing in a fog
computing environment. An adaptive quality of service-based load balancing
algorithm (QLBA) framework has been designed to balance the load among the
available fog devices and hence improve various QoS parameters such as turnaround
time, resource utilization, response time, and processing delay of fog devices. The
results have been evaluated using various simulation experiments and have been
verified thoroughly. The simulation tests show improved QoS parameters in
comparison to existing scheduling algorithms and also improved load balancing.
The present work can be further enhanced with intelligent learning-based
techniques, i.e., reinforcement learning, to manage resources in a fog environment,
and by including other QoS parameters such as security and reliability.
Further improvements in the presented QLBA algorithm can be made in terms of
performance with the use of artificial intelligence and neural-based models for low-
priority tasks.

References

1. Botta A, De Donato W, Persico V, Pescape A (2015) Integration of cloud computing and internet
of things: a survey. Future Gen Comput Syst 56:684–700
2. Li L, Xiaoguang H, Ke C, Ketai H (2011) The applications of wifi-based wireless sensor
network in internet of things and smart grid. In: Proceedings of the 2011 6th IEEE conference
on industrial electronics and applications. IEEE, pp 789–793
3. Fog Computing and the Internet of Things: Extend the Cloud to Where the Things Are; Cisco
White Paper; 2015. Available online: https://www.cisco.com/c/dam/en_us/solutions/trends/iot/
docs/computing-overview
4. Bar-Magen Numhauser J (2012) Fog computing introduction to a new cloud evolution. In:
Escrituras silenciadas: paisaje como historiografía. University of Alcala, Spain, pp 111–126
5. Li C, Zhuang H, Wang Q, Zhou X (2018) SSLB: self-similarity-based load balancing for
large-scale fog computing. Arabian J Sci Eng 43(12):7487–7498
6. Verma S, Yadav AK, Motwani D (2016) An efficient data replication and load balancing
technique for fog computing environment. In: Proceeding of 3rd international conference on
computing for sustainable global development, IEEE
7. Yu Y, Li X, Qian C (2017) A scalable and dynamic software load balancer for fog and mobile
edge computing. In: Proceedings of the workshop on mobile edge communications. ACM, pp
55–60
8. Yousefpour A, Ishigaki G, Gour R, Jue JP (2018) On reducing IoT service delay via fog
offloading. Internet of Things J 5(2)
9. Radojevic B, Zagar M (2018) Analysis of issues with load balancing algorithms in hosted
environments. In: Proceedings of 34th international convention on MIPRO, IEEE
10. Lee R, Jeng B (2011) Load-balancing tactics in cloud. In: Proceedings of international confer-
ence on cyber-enabled distributed computing and knowledge discovery (CyberC), IEEE, pp
447–454
11. Wang SC, Yan KQ, Liao WP, Wang SS (2010) Towards a load balancing in a threelevel cloud
computing network. In: Proceedings of 3rd international conference on computer science and
information technology (ICCSIT). IEEE, pp 108–113
12. Randles M, Bendiab AT, Lamb D (2008) Cross layer dynamics in self-organising service
oriented architectures. IWSOS, Lecture Notes in Computer Science, vol. 5343. Springer, Berlin,
pp 293–298

13. Ningning S, Chao G, Xingshuo A, Qiang Z (2016) Fog computing dynamic load balancing
mechanism based on graph repartitioning. China Commun 13(3):156–164
14. Agarwal S, Yadav S, Yadav AK (2016) An efficient architecture and algorithm for resource
provisioning in fog computing. Int J Inf Eng Electron Bus 8(1):48–61
15. Xu X, Fu S, Cai Q, Tian W, Liu W, Dou W, Sun X et al (2018) Dynamic resource allocation
for load balancing in fog environment. In: Wireless communications and mobile computing,
vol 2018, Hindawi
16. Sharma G, Harnal S, Miglani N, Khurana S et al (2021) Reliability in underwater wireless
sensor networks. In: Energy-efficient underwater wireless communications and networking,
IGI Global. Advances in intelligent systems and computing, vol 468. Springer, Singapore
17. Fan Q, Ansari N (2018) Towards workload balancing in fog computing empowered IoT. IEEE
Trans Netw Sci Eng 99
18. Sharma G, Harnal S, Miglani N, Khurana S et al (2020) Reliability of cluster head node
based upon parametric constraints in underwater acoustic sensor networks. Underwater Technol
37(2):39–52
19. Verma M, Bhardwaj N, Yadav AK (2016) Real time efficient scheduling algorithm for load
balancing in fog computing environment. Int J Inf Technol Comput Sci 8(4):1–10
20. Miglani N, Sharma G (2019) Modified particle swarm optimization based upon task
categorization in cloud environment. Int J Eng Adv Technol (IJEAT)
21. Miglani N, Sharma G (2018) An adaptive load balancing algorithm using categorization of
tasks on a virtual machine based upon queuing policy in cloud environment. Int J Grid Distrib
Comput
22. Rahimi M, Songhorabadi M, Kashani MH (2020) Fog-based smart homes: a systematic review.
J Netw Comput Appl 153
23. OpenFog Consortium Architecture Working Group (2017) OpenFog reference architecture for
fog computing. OPFRA001, 20817, p 162
24. Sarkar S, Misra S (2016) Theoretical modelling of fog computing: a green computing paradigm
to support IoT applications. Iet Netw 5(2):23–29
25. Sharma G, Miglani N, Kumar A (2021) PLB: a resilient and adaptive task scheduling scheme
based on multi-queues for cloud environment. Cluster Comput
DoMT: An Evaluation Framework
for WLAN Mutual Authentication
Methods

Pawan Kumar and Dinesh Kumar

Abstract Authentication is one of the core components of every network security
model. In a wireless networking environment, knowing the membership of any user
or device becomes more important due to the lack of any physical linkage among
the members of the network. IEEE 802.1X is a port-based authentication frame-
work, which uses the extensible authentication protocol (EAP) to accomplish the
task of authentication and key derivation in both wired as well as wireless networks.
Currently, more than one hundred EAP methods have been adopted as official authen-
tication mechanisms by the wireless security standards. These methods offer different
security features by implementing different authentication and key derivation algo-
rithms. To standardize the EAP authentication methods, security requirements, goals,
and features have been defined in the various RFC documents. In this work, we have
performed the goal-oriented analysis of security requirements specified for the mutual
authentication EAP methods with the motive of developing an evaluation model. The
motivation of developing this evaluation model is to have some mechanism for the
assessment of mutual authentication EAP methods. The evaluation model proposed
in this work computes the degree of mutual trust (DoMT), which has four levels,
namely, very high, high, moderate, and low for indicating the strength of mutual
trust established by an authentication mechanism. This model has been validated
by evaluating the five most intensively studied mutual authentication EAP methods
EAP-LEAP, EAP-TLS, EAP-TTLS, EAP-PEAP, and EAP-FAST and the results of
the proposed model match up with the conclusions drawn by other related studies.

Keywords Wireless local area networks (WLANs) · Authentication protocol ·


Extensible authentication protocol (EAP) · Evaluation model · Trust model ·
Degree of mutual trust (DoMT)

P. Kumar (B)
Department of Computer Science, DAV College, Bathinda, India
D. Kumar
Department of Information Technology, D.A.V. Institute of Engineering and Technology,
Jalandhar, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 345
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_26

1 Introduction

Authentication is the process of verifying the credentials provided by the communi-
cating peers to each other for proving their legitimacy. A well-designed authentication
scheme protects a network from various potential threats. In wireless networks,
authentication is highly important and a prerequisite process, as access control
can be enforced only after determining the user's identity. In a WLAN environ-
ment, controlling network access is a peculiar issue because of the broadcast
nature of the channel. As there is no physical linkage among devices to define
membership in a wireless networking setup, there exists no physical mechanism
to restrict a device in the radio range from accessing a wireless network. Wireless
networks require authentication and access control mechanisms of a different nature than
those deployed in their wired counterparts.
deployed in their wired counterparts. A variety of authentication methods have been
devised for WLANs, ranging from very simple open access to the 802.1X port-
based authentication framework. Open mode allows users to get network
access easily and without any complex configuration. This mode of authentication
is most appropriate when ease of connectivity matters more than the
enforcement of access control. The MAC-based access control method allows or denies
network access based on the device’s MAC address. In this mode of authentication,
when a wireless device attempts to associate with a network, its MAC address is used
for the verification before providing access to the network. The IEEE 802.1X stan-
dard defines a client–server-based framework for enforcing the access control, which
restricts the unauthorized clients from connecting with a WLAN through publicly
accessible access points. 802.1X provides strong authentication mechanisms and has
improved the wireless network security [1]. To achieve the goal of strong and flex-
ible authentication, this framework utilizes a layer two protocol named extensible
authentication protocol (EAP), which authenticates the clients at layer two. EAP is
a generic framework, which provides support for multiple authentication methods.
Each EAP method provides the client access to the protected network in its own way
by using different types of credentials. The mandatory security properties and other
desired features of EAP methods are identified and discussed in IETF documents
[2, 3].
The rest of the paper is organized as follows. In section two, we discuss the
studies related to our work. The 802.1X authentication framework and its
components are described in section three. In the fourth section, the EAP
framework and EAP methods are described. In section five, the evaluation model
for finding the degree of mutual trust (DoMT) is presented. In section six,
the proposed evaluation model is applied to the five mutual authentication
EAP methods EAP-LEAP, EAP-TLS, EAP-TTLS, EAP-PEAP, and EAP-FAST, to
demonstrate its practical applicability. Validation of the evaluation
model is demonstrated in section seven. Section eight comprises the results
and discussion of the work presented in this article.

1.1 Our Contributions

In this work, we have performed the goal-oriented analysis of the security require-
ments specified for mutual authentication EAP methods. The motive of this analysis
is to develop an evaluation model for computing the degree of mutual trust (DoMT)
offered by the mutual authentication EAP methods.
The evaluation model proposed in this work is based on the ten core security
properties of the mutual authentication EAP methods. The developed model has
been applied on the five popular and intensively studied mutual authentication EAP
methods. The DoMT offered by the selected methods has been calculated for
demonstrating the working of the proposed model. The validation of the model has been
performed by comparing the results obtained through the evaluation model with the
results of other similar studies.

2 Related Work

Various researchers have compared and evaluated the strength of EAP methods on
the basis of security claims listed in RFC 3748 [2]. As these security claims
are a mixture of security properties, security objectives, and desired features,
an evaluation of any EAP method based on these claims does
not depict the true strength of authentication offered by these methods. In [3], security
requirements for EAP methods have been categorized into mandatory requirements,
recommended requirements, and optional features. In [4], Liu and Fapojuwo have
performed the formal evaluation of EAP-TLS, EAP-TTLS, and EAP-AKA on the
basis of key agreement properties. Comparison of various authentication and key
exchange (AKE) methods based on all categories of identified requirements, i.e.,
mandatory, recommended, and additional is presented in [5]. Jun Lei et al. in [6]
have presented the comparative study on authentication and key exchange methods
for 802.11 WLANs against the identified requirements.

2.1 Motivation

As 802.11 WLAN technology evolved, a diverse set of authentication methods has


been developed for it. Due to the wide range of WLAN applications with varying
levels of security needs, a single authentication method cannot be suitable for
all the scenarios. When a variety of methods with diverse security features and
implementation requirements are available for achieving a specific goal like mutual
authentication, then the role of a reliable evaluation model becomes very crucial.

After performing an in-depth literature survey, it has been observed that there is
a great need for the development of a true evaluation model based on the real security
properties of mutual authentication EAP methods. The mandatory properties,
recommended properties, and additional desired features of EAP methods need an
in-depth analysis for their inclusion in the evaluation model.

3 802.1X Authentication Framework

IEEE 802.1X is a port-based network access control (NAC) framework, which uses


the EAP to accomplish the process of authentication. It provides both strong authen-
tication and a mechanism for deriving and distributing the strong keys [7]. When
deploying the 802.1X, we use the EAP framework for authentication and an EAP
type for carrying out the actual authentication process which provides the optimal
balance between cost, manageability, and risk mitigation [8].
In this framework, the authentication server validates the credentials of the client,
before allowing it to use services offered by the network. The use of EAP protocol
makes this framework highly flexible, as the specific EAP method is negotiated by
peers at the time of authentication.
There are three main components of the 802.1X framework, which participate in
the process of mutual authentication. In 802.1X, the term supplicant is used for
the wireless client requesting access to the network services, the access point through
which network services are offered to the supplicant is called the authenticator, and the
term authentication server is used for the entity that is responsible for verifying
the supplicant's credentials and taking the final decision on providing the network
access. The architecture of 802.1X framework with its components is shown in Fig. 1.

3.1 Components of 802.1X Framework

1. Supplicant: It is an entity that sits inside the client device and is used by the
client device for accomplishing the authentication process. Once a supplicant
is authenticated successfully by the network, it is granted access to the network
services. In a WLAN environment, a supplicant can reside in a PDA, in a laptop
computer, or in some other electronic gadget requesting access to the
network services.

Fig. 1 Architecture of 802.1X framework

Fig. 2 Port-based authentication framework of 802.1X
2. Authenticator: This component of the framework acts as a gateway of the
network through which network services are provided to the supplicants. In
a wireless networking environment, an access point acts as an authenticator.
802.1X defines two virtual ports corresponding to each logical connection, a
controlled port and an uncontrolled port, as depicted in Fig. 2. Initially, the uncon-
trolled port allows only the authentication traffic to pass through it. Once the
supplicant is authenticated successfully, the controlled port is opened for the other
network traffic to pass through.
3. Authentication Server: This component is responsible for verifying the creden-
tials supplied by the supplicant and deciding whether to grant network
access. For the verification of the supplicant's credentials, it may either use a
native database or may query some external database such as an LDAP database.
Generally, RADIUS servers are used in the enterprise WLAN environments for
implementing the authentication, authorization, and accounting (AAA) services
in the network.

3.2 Carrier Protocols Used in 802.1X Framework

The 802.1X framework uses two carrier protocols for exchanging the EAP messages
between its components, as depicted in Fig. 3. A brief description of the carrier
protocols used in the 802.1X framework is given below.
1. EAPoL: EAP over LANs (EAPoL) is a layer two protocol defined by the IEEE
802.1X for the transportation of EAP messages between the supplicant and the
authenticator. Before authentication, the identity of the supplicant is unknown
and all the traffic is blocked except EAPoL. Only once the authentication process
between the supplicant and the authentication server is completed successfully is
other network traffic allowed to pass through the authenticator. The authenticator
relays all the EAP messages between the supplicant and the authentication server
by using the EAPoL protocol.

Fig. 3 Carrier protocols for carrying the EAP packets
2. RADIUS: Remote authentication dial-in user service (RADIUS) is an authen-
tication, authorization, and accounting (AAA) protocol developed for the trans-
portation of authentication traffic between the authenticator and the authentica-
tion server. It supports the use of various EAP and non-EAP methods for the
authentication of the supplicants requesting network access. At the authenti-
cator, the packets received from the supplicant are unwrapped and then re-wrapped
into the format specified in the RADIUS protocol for further processing by the
authentication server.

4 EAP Framework and EAP Types

802.1X works in conjunction with the EAP for accomplishing the process of strong
authentication and key generation in a highly insecure environment like a WLAN.
EAP is a generic framework, which provides support for the variety of authentication
methods. It provides the transportation for authentication parameters and other keying
material generated by the authentication method.
It is a lock-step protocol which supports only a single packet in flight [9]. The
multiple authentication methods supported by the EAP framework accept a variety
of authentication credentials, which offer varying levels of security.
It is a highly flexible and extensible protocol, which supports a wide range of authen-
tication mechanisms like token cards, smart cards, one-time passwords (OTP), X.509
certificates, as well as basic passwords. The various EAP types are the different
EAP methods, which execute the authentication process by utilizing the security
credentials supplied by communicating peers to each other.

4.1 Features of EAP Framework

The EAP framework has many desirable features. It provides support for multiple
authentication methods by providing only the common functionality required for
carrying out the authentication process, and it specifies the message formats that
need to be exchanged between the supplicant and the authentication server. The
implementation details are EAP method specific and are handled by the developers
of that specific method.
EAP provides the supplicant the flexibility to select the authentication method
dynamically. Being an extensible protocol, it allows new authentication methods
to be developed as plug-ins, which can be easily integrated with the supplicant and
RADIUS modules. The secure EAP methods are based on the concept of secure

mutual authentication, where the supplicant authenticates the network by verifying
the credentials of the authentication server and the network verifies the legitimacy
of the supplicant by verifying the credentials presented by the supplicant.

4.2 Types of EAP Methods

There exists a wide range of EAP authentication methods developed by academia
and industry. EAP authentication methods can be broadly categorized into three types,
namely password based, certificate based, and token based. Password-based EAP
methods are simple to implement but are less secure. Certificate-based methods are
considered the most secure but are difficult to manage and configure. The
third type of EAP method is token based, like EAP-SIM and EAP-AKA; these
methods are generally used in the cellular industry for subscriber
authentication.
EAP mutual authentication methods accomplish the task of authentication in
two stages. In the first stage, the authentication server is authenticated to the suppli-
cant by providing its credentials, which are generally digital certificates. In the case of
tunneled methods, a protective tunnel gets established between the supplicant and
the authentication server during this first stage; generally, the TLS protocol is used
for establishing this secure tunnel [10].
After the establishment of the secure tunnel, any legacy authentication method can
be executed inside this secure tunnel for authenticating the supplicant to the network.
EAP-TTLS, EAP-PEAP, and EAP-FAST are such types of EAP methods, which use
the TLS protocol to provide integrity and privacy protection to the client authenti-
cation process [11]. The other major advantage of certificate-based server authenti-
cation is that clients can be authenticated with existing password-based credentials,
and there is no need to deploy the X.509 certificate at clients for implementing the
secure mutual authentication.
On the other hand, in non-certificate-based EAP methods, no such protective
tunnel is established for the protection of EAP messages flowing between the
supplicant and the authentication server. An example of this type of EAP method
is EAP-LEAP; such methods execute the authentication process without
establishing any protective tunnel and hence are vulnerable to different forms of
man-in-the-middle (MITM) attacks.

4.3 Selection Factors of EAP Methods

The selection of a suitable EAP method for any organization's environment depends
on many factors. There are many dominant factors other than the security require-
ments which influence the choice of EAP method for any project. Some of the
important factors which help in the selection of an EAP method are given below:

• Requirement of authentication server.


• Requirement of certificate at server.
• Requirement of certificate at clients.
• Client platforms supporting EAP method.
• Provider of the EAP method.
• Server support for EAP method.
• Resources available in the organization.
• Deployment complexity.

5 Evaluation Model (DoMT)

In the network security domain, various evaluation models are available for
computing the strength of security systems. These models are used for measuring
the level of trust based on some measurable attributes of the system. These models
play a very vital role in the selection process of security system components.
In this work, we have proposed an evaluation model, referred to here as DoMT, for
the assessment of mutual authentication EAP methods. By using this model, one can
easily compute the degree of mutual trust provided by any mutual authentication EAP
method. The basis of the proposed evaluation model is security properties and secu-
rity requirements specified in [3, 18]. These EAP attributes are the mixture of security
requirements, security goals, and other additional features of the EAP methods. The
detailed analysis of the attributes of EAP methods has been performed to find the
relationship between the security requirements and desired goals. We have catego-
rized these attributes into three classes, namely, mandatory security requirements,
security goals, and desired features. The ten most influential properties of EAP mutual
authentication methods have been selected and categorized as mandatory security
requirements. These ten properties are helpful in achieving the desired security goals
listed in Table 2.
These properties are enumerated as SR1 to SR10 for the further discussion and
inclusion in the evaluation model. We have further divided these ten properties into
two sub-categories: normal and special security requirements. The security proper-
ties P1 and P4 are multi-valued; therefore, these two properties are desig-
nated as special security requirements, and the remaining eight properties are designated as
normal security requirements, as the domain of each of these eight properties is limited to
only two values, i.e., either yes or no. The security property P1 is designated as
special because the process of client authentication can be accomplished in various
ways, having different levels of complexity and security. The basic password-based
methods for authenticating clients are easy to implement but are less secure.
The concept of combining more than one factor, like combining an OTP with a pass-
word, makes the authentication system more secure. The use of a digital certificate for
client authentication provides the strongest form of mutual authentication between
the client and the network. Similarly, in the case of security property P4, the strength
of the dynamic per-session key derived by different authentication methods is also

different in different methods, so same weight cannot be assigned to all the authen-
tication methods for the property P1 and P4. The normal security requirements have
been assigned the binary weights, i.e., 0 or 1, and the special security requirements
have been assigned the three weights, i.e., 0, 1, and 2. The detail of weights assigned
to these special security requirements and normal security requirements is given in
Table 1. The other attributes of EAP authentication methods have been categorized
as security goals and desired features, which are listed in Tables 2 and 3, respectively.
The study of attributes listed in Tables 2 and 3 is beyond the scope of this work.

Table 1 Security requirements of mutual authentication EAP methods

Name of property | Category | Sub-category | Weight (wt.)
P1: Client authentication mechanism | Security requirement-SR1 | Special | 0: client authentication using legacy methods; 1: multi-factor client authentication; 2: certificate-based client authentication
P2: Protected cipher suite negotiation | Security requirement-SR2 | Normal | 1: yes; 0: no
P3: Key derivation | Security requirement-SR3 | Normal | 1: yes; 0: no
P4: Key strength | Security requirement-SR4 | Special | 0: small key size; 1: medium key size; 2: long key size
P5: Client identity privacy | Security requirement-SR5 | Normal | 1: yes; 0: no
P6: Method chaining | Security requirement-SR6 | Normal | 1: yes; 0: no
P7: Session independence | Security requirement-SR7 | Normal | 1: yes; 0: no
P8: Integrity protection of EAP messages | Security requirement-SR8 | Normal | 1: yes; 0: no
P9: Confidentiality of EAP messages | Security requirement-SR9 | Normal | 1: yes; 0: no
P10: Replay protection of EAP messages | Security requirement-SR10 | Normal | 1: yes; 0: no

Table 2 Security goals of EAP mutual authentication methods

Name of property | Category
P11: Mutual authentication | Security goal-SG1
P12: Resistance to dictionary attacks | Security goal-SG2
P13: Protection against man-in-the-middle attack | Security goal-SG3
P14: End-user identity protection | Security goal-SG4
P15: Protection against replay attacks | Security goal-SG5

Table 3 Desired features of EAP mutual authentication methods

Name of property | Category
P16: Fragmentation | Feature-F1
P17: Fast reconnect | Feature-F2

5.1 Description of Security Properties Used in the Evaluation Model

A brief description of the security properties included in the evaluation model, along
with the goals achieved by these properties, is given below:

P1: Client authentication mechanism


In the 802.1X framework, clients can be authenticated with the network by using
various mechanisms, ranging from simple password-based authentication to the most
complex but highly secure X.509 certificate-based authentication. A trade-off between
security and simplicity can be achieved by analyzing the various factors.
Methods that offer strong mutual authentication and provide protection
against various forms of man-in-the-middle attacks are highly desirable in an
insecure environment like a WLAN.

P2: Protected cipher suite negotiation


A cipher suite is a complete set of cryptographic algorithms used for authentication,
data encryption, integrity protection, and key generation. Before starting the commu-
nication over a channel, the client and server need to agree on a common set of algorithms,
i.e., a cipher suite, for securing the connection against various attacks. This negotiation
process needs to be protected against attacks like the downgrade attack, which
forces the communicating parties to select a weaker cipher suite, thereby making
the connection vulnerable to attacks like man-in-the-middle.
Adherence to this property, therefore, provides protection against man-in-the-middle
and downgrade attacks.

P3: Key derivation


This property deals with the ability of an authentication method to derive master
session keys (MSKs) and extended master session keys (EMSKs). These master

keys are not used directly for data encryption but are used for deriving the
transient keys. These transient keys are used for protecting the EAP data and other
data exchanged between the communicating client and the authenticator.
This property provides protection against replay attacks and other types of man-in-the-
middle attacks.

P4: Key strength


The strength of any encryption algorithm is measured in terms of the effort required
to find its keys in order to break it. The two major factors on which the strength of any
encryption algorithm depends are the cipher used by the algorithm and the size of the keys
(measured in number of bits) used to encrypt the data. A general principle is that
longer keys provide stronger encryption.
If an authentication method derives keys, then it is necessary to estimate its effec-
tive key strength. This estimation serves as a reference for potential users of the
method to decide whether the keys produced are strong enough for the application
under consideration.

P5: Client Identity Privacy


The adherence to this property ensures that the real identity of the user is not passed
on to the authentication server in the clear during the authentication process. The basic
solution to this problem is to use the concept of aliases which ensure non-traceability
by hiding the user’s real identity.
P6: Method chaining
This property allows multiple authentication methods to be combined in a chain
for achieving better security. In this technique, it becomes
compulsory for a client to pass through all methods in the chain to be successfully
authenticated. For example, if we create a chain by combining an LDAP password
and an SMS OTP, then a user must first specify the LDAP password. If the LDAP pass-
word is correct, the system sends an SMS with a one-time password (OTP) to the
user. The user must then specify the correct OTP to be authenticated.
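As an illustration of method chaining, the hedged Python sketch below accepts a client only when every method in the chain succeeds; the checker functions (check_ldap_password, check_sms_otp) and the credential values are hypothetical placeholders, not a real EAP or LDAP API.

def check_ldap_password(user, password):
    return password == "secret"    # stand-in for a real LDAP bind

def check_sms_otp(user, otp):
    return otp == "123456"         # stand-in for real OTP verification

def authenticate_chain(user, credentials, chain):
    # Every link in the chain must succeed; failure anywhere rejects the client.
    return all(check(user, credentials[name]) for name, check in chain)

chain = [("password", check_ldap_password), ("otp", check_sms_otp)]
print(authenticate_chain("alice", {"password": "secret", "otp": "123456"}, chain))  # True
print(authenticate_chain("bob", {"password": "secret", "otp": "000000"}, chain))    # False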
P7: Session independence
This property of key establishment protocols implies the independence of the session
key from the session keys used in previously established sessions and other long-
term keys. The main advantage of this property is that if an adversary somehow manages
to get the session keys of an ongoing session, then only the data exchanged during
that session will be at stake. If the adversary has already recorded the data exchanged
during the previously established sessions, then this compromised session key is not
going to be useful in decrypting the data captured from the previously established
sessions.
P8: Integrity protection of EAP messages
This property assures the packet source authentication and protection against the
unauthorized exploitation of EAP packets. EAP packets and sensitive fields in these
packets need to be protected from various attacks.

P9: Confidentiality of EAP messages


The compliance of this property ensures the secrecy of the authentication data
flowing between the client and the authenticator. To achieve this property, encryption
algorithms are used.

P10: Replay protection of EAP messages


This property requires that all the EAP packets be replay protected, including
the final EAP-success or EAP-failure messages received by the client. It is
highly useful in avoiding man-in-the-middle attacks. To achieve this property,
techniques like nonces and time stamps are used.

5.2 Description of Evaluation Model

This evaluation model has been designed exclusively for the assessment of mutual
authentication EAP methods. For finding the degree of mutual trust (DoMT) offered
by any EAP method, it is necessary to sum the weights of its security requirements
as specified in Table 1. If the value of any security requirement is not available
or not applicable for a specific method, a weight of zero is added for that security
requirement.

5.2.1 Criterion Used in the Evaluation Model

In this model, four levels have been defined for specifying the degree of mutual trust.
Therefore, any mutual authentication EAP method evaluated using this model will be
placed in one of these four levels, namely very high, high, moderate, and low, where
the "very high" trust level depicts very strong mutual authentication and the "low"
trust level depicts a weak mutual authentication method. The four levels of
DoMT have been defined corresponding to the scores of the authentication methods,
as shown in Table 4. The score S of any EAP mutual authentication method M can be
computed using the equation given below:

S(M) = Σ Wt.(SRi), for i = 1 to 10 (1)

Table 4 Evaluation criterion used in the evaluation model

Evaluation criteria for the mutual authentication EAP methods | Degree of mutual trust (DoMT)
Score greater than or equal to 10 | Very high
Score from 8 to 9 | High
Score from 6 to 7 | Moderate
Score less than or equal to 5 | Low

where Wt.(SRi) depicts the weight of the ith security requirement and i is the index
of the security requirement.
By adding the weights (as specified in Table 1) of the applicable and available secu-
rity requirements, the score of any EAP mutual authentication method can be obtained
very easily. Once the score of an authentication method is calculated, the degree of
mutual trust can be obtained by referring to Table 4.
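As a worked illustration of Eq. (1) and Table 4, the following Python sketch encodes the per-method weights that Table 5 assigns and reproduces the scores and DoMT levels reported there; the dictionary layout and function names are implementation choices for this sketch, not part of the model itself (SR4 is omitted, as in Table 5).

# Non-zero weights per security requirement SR1..SR10 (SR4 omitted, as in Table 5)
METHODS = {
    "EAP-LEAP": {"SR3": 1, "SR5": 1, "SR7": 1},
    "EAP-TLS":  {"SR1": 2, "SR2": 1, "SR3": 1, "SR7": 1, "SR8": 1, "SR9": 1, "SR10": 1},
    "EAP-PEAP": {"SR2": 1, "SR3": 1, "SR5": 1, "SR6": 1, "SR7": 1, "SR8": 1, "SR9": 1, "SR10": 1},
    "EAP-TTLS": {"SR2": 1, "SR3": 1, "SR5": 1, "SR7": 1, "SR8": 1, "SR9": 1, "SR10": 1},
    "EAP-FAST": {"SR2": 1, "SR3": 1, "SR5": 1, "SR6": 1, "SR7": 1, "SR8": 1, "SR9": 1, "SR10": 1},
}

def domt(score):
    # Thresholds from Table 4: >=10 very high, 8-9 high, 6-7 moderate, <=5 low
    if score >= 10:
        return "Very high"
    if score >= 8:
        return "High"
    if score >= 6:
        return "Moderate"
    return "Low"

for method, weights in METHODS.items():
    s = sum(weights.values())  # S(M) = sum of Wt.(SRi), Eq. (1)
    print(method, "score:", s, "DoMT:", domt(s))
# Reproduces Table 5: LEAP 3 Low; TLS 8 High; PEAP 8 High; TTLS 7 Moderate; FAST 8 High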

5.2.2 Scope of Evaluation Model

As discussed in the previous section, this model is exclusively based on the secu-
rity requirements laid down for the mutual authentication EAP methods, so it
can only be applied for the assessment of EAP methods providing mutual
authentication.

5.3 Features of the Evaluation Model

This evaluation model has a rich set of desirable features, which are described below.
• Simplicity: By using this model, it is very easy to obtain the level of authentication
strength offered by any mutual authentication EAP method.
• Based on objective and quantitative measures: This model logically assigns
quantitative weights to the goal-oriented security properties and produces highly
objective outputs, which are highly comparable.
• Applicability: This model is helpful in the analysis and selection process of
mutual authentication EAP methods available for the WLAN environments.
• High relevance: As this model is based on the most influential and easily obtain-
able security properties of the EAP methods, therefore results obtained from it
are highly realistic and useful.
• Reliable: This model has been validated by using the intensively studied authen-
tication methods, whose authentication strength is already known through the
available literature. So it can be used for evaluating the other available or newly
developed authentication methods.

6 Demonstration of Evaluation Model

For demonstrating the practical applicability of the proposed evaluation model, we


have applied this model on five mutual authentication EAP methods: EAP-LEAP,
EAP-TLS, EAP-TTLS, EAP-PEAP, and EAP-FAST. The evaluation of the five
selected methods, tunneled as well as non-tunneled, is depicted in Table 5.
Table 5 Assessment of five mutual authentication EAP methods by using the evaluation model

Req. No. | Name of the property | EAP-LEAP [12] | EAP-TLS [10] | EAP-PEAP [13] | EAP-TTLS [11] | EAP-FAST [14] | Remarks about the security property
SR1 | Client authentication mechanism | X | √√ | X | X | X | Only EAP-TLS mandates the deployment of certificates at the client for providing strong mutual authentication
SR2 | Protected cipher suite negotiation | X | √ | √ | √ | √ | All the methods provide cipher suite negotiation except LEAP
SR3 | Key derivation | √ | √ | √ | √ | √ | All selected mutual authentication methods support key derivation
*SR4 | Key strength | Not available | Only NIST recommendations available (exact value unknown) | Variable | Up to 384 bits | Only NIST recommendations available (exact value unknown) | All the methods provide variable key strength
SR5 | Client identity privacy | √ | X | √ | √ | √ | All methods provide client identity hiding except EAP-TLS
SR6 | Method chaining | X | X | √ | X | √ | Only PEAP and FAST support method chaining
SR7 | Session independence | √ | √ | √ | √ | √ | All the methods provide session independence
SR8 | Integrity protection of EAP messages | X | √ | √ | √ | √ | All the methods provide integrity protection of EAP messages except LEAP
SR9 | Confidentiality of EAP messages | X | √ | √ | √ | √ | All the methods provide confidentiality of EAP messages except LEAP
SR10 | Replay protection of EAP messages | X | √ | √ | √ | √ | All the methods provide replay protection of EAP messages except LEAP
Score of EAP methods, S(M) = Σ Wt.(SRi) for i = 1 to 10 | 3 | 8 | 8 | 7 | 8 |
Trust level (DoMT), per the evaluation criterion in Table 4 | Low | High | High | Moderate | High |

*: As the value of property SR4 is not available for most of the methods, its score is not included while calculating the total score of the methods
X: depicts a score of zero for the security requirement
√: depicts a score of one for the security requirement
√√: depicts a score of two for the security requirement

7 Validation of the Evaluation Model

For testing the predictive accuracy of the developed evaluation model, the same five
EAP methods, EAP-LEAP, EAP-TLS, EAP-TTLS, EAP-PEAP, and EAP-FAST, have
been used. The basis for selecting these methods for the validation is
that they have been rigorously analyzed and evaluated in different studies
[5, 15–17] using different approaches. For these authentication methods, we
have established knowledge about the security strength they offer.
The already known security capabilities of these methods have proved useful in the
validation process of the newly developed evaluation model. The validation of the
five authentication methods, tunneled as well as non-tunneled, is depicted in Table 6.

8 Results and Discussion

On evaluating the five EAP methods by using the proposed evaluation model DoMT,
as shown in Table 5, the value of DoMT for the EAP-LEAP method falls to the low
level, while EAP-TLS, EAP-PEAP, and EAP-FAST evaluate to the high level. On the
other hand, EAP-TTLS evaluates to the moderate level.
Comparing the results obtained through the model with the conclusions
drawn in the other related studies [5, 15–17]: according to [5, 15, 16], EAP-
LEAP is a weak method and vulnerable to various attacks. The assessment of the LEAP
method using the DoMT confirms the same fact. The conclusions drawn about
EAP-TTLS, EAP-PEAP, EAP-FAST, and EAP-TLS in these studies also validate
the predictive accuracy of the evaluation model, as depicted in Table 6.

9 Conclusions

On examining the accuracy of results shown by the evaluation model for five mutual
authentication EAP methods, it is concluded that this model can be reliably used for
finding the security level offered by other mutual authentication methods.
By considering the outcome of the evaluation model, the other dominant factors
described in Sect. 4.3, and the application's security requirements, system designers can
choose the most suitable EAP method for the project under consideration. In general,
tunnel-based authentication methods are a good choice for securing and protecting
already existing legacy authentication methods.

Table 6 Validation of evaluation model by comparison of model results with results of other studies

Study | EAP-LEAP | EAP-TLS | EAP-PEAP | EAP-TTLS | EAP-FAST
Results of study carried out in [5] | It is vulnerable to attacks due to the use of unencrypted challenge and response messages | It is considered the best available authentication method for WLANs | It offers strong security by employing PKI for server-side authentication and other EAP authentication types for client-side authentication | It offers strong security during the authentication process, while supporting all existing authentication methods, including the legacy authentication methods, by avoiding the use of PKI at the client side | The security provided by it depends on its implementation. If poorly implemented, its security is comparable to LEAP or MD5. By using digital certificates at clients, it provides the maximum security
Results of study carried out in [15] | Low | Highest | High | High | Medium to high
Results of study carried out in [16] | Not sufficiently secure due to vulnerability to dictionary attacks | Provides strong security if the network users are not concerned with identity privacy | It provides the strong security offered by EAP-TLS as well as other additional features | It provides the strong security offered by EAP-TLS as well as other additional features | Not covered in study
Results of study carried out in [17] | Not covered in study | It is more secure as compared to the EAP-TTLS and EAP-PEAP methods | This method is less secure as compared to EAP-TLS | This method is less secure as compared to EAP-TLS, but provides the same security as EAP-PEAP | Not covered in study
Evaluation using DoMT | Low (Score: 3) | High (Score: 8) | High (Score: 8) | Moderate (Score: 7) | High (Score: 8)

Acknowledgements I am highly thankful to my research supervisor Dr. Dinesh Kumar, for his
guidance and consistent support throughout this work. This paper would not have been possible
without the support of my colleagues in the Department of Computer Science. A very important
support came from the Department Head, Dr. Vandana Jindal. Dr. Neetu Purohit also
supported this paper by improving the language of the work presented.

References

1. Kumar P (2018) A review of 802.1X-EAP authentication for enterprise WLANs. Int J Latest
Trends Eng Technol 10(1):207–211
2. Aboba B, Blunk L, Vollbrecht J, Carlson J, Levkowetz H (2004) Extensible authentication
protocol (EAP), RFC3748. Internet Eng
3. Stanley D, Walker J, Aboba B (2005) Extensible authentication protocol (EAP) method
requirements for wireless LANs. Request for Comments, 4017
4. Liu X, Fapojuwo AO (2006) Formal evaluation of major authentication methods for IEEE
802.11i WLAN standard. In: IEEE vehicular technology conference. IEEE, pp 1–5
5. Ali KM, Al-Khlifa A (2011) A comparative study of authentication methods for wi-fi networks.
In: 2011 third international conference on computational intelligence, communication systems
and networks. IEEE, pp 190–194
6. Lei J, Fu X, Hogrefe D, Tan J (2007) Comparative studies on authentication and key exchange
methods for 802.11 wireless LAN. Comput Secur 26(5):401–409
7. Hoeper K, Chen L (2007) Where EAP security claims fail. In: The fourth international confer-
ence on heterogeneous networking for quality, reliability, security and robustness & workshops,
pp 1–7
8. Jindal P, Singh B (2017) Quantitative analysis of the security performance in wireless LANs.
J King Saud Univ-Comput Inf Sci 29(3):246–268
9. Williams N (2007) On the use of channel bindings to secure channels. RFC 5056
10. Simon D, Aboba B, Hurst R (2008) The EAP-TLS authentication protocol. RFC5216
11. Funk P, Blake-Wilson S (2008) Extensible authentication protocol tunneled transport layer
security authenticated protocol version 0 (EAP-TTLSv0). RFC5281
12. Ou G (2007) Ultimate wireless security guide: an introduction to LEAP authentication
13. Microsoft Open Specification, Protected Extensible Authentication Protocol (PEAP) Speci-
fication, https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-peap/5308642b-
90c9-4cc4-beec-fb367325c0f9, [MS-PEAP]-v20180912
14. Cam-Winget N, McGrew D, Salowey J, Zhou H (2007) The flexible authentication via secure
tunneling extensible authentication protocol method (EAP-FAST). Draft-cam-winget-eap-fast-
03
15. Dondurmacioglu O (2012) EAP: The Basics (online), https://community.arubanetworks.com/
t5/Community-Tribal-Knowledge-Base/EAP-The-Basics/ta-p/25380
16. Baek KH, Smith SW, Kotz D (2004) A survey of WPA and 802.11i RSN authentication
protocols. Dartmouth Computer Science Technical Report
17. Kothaluru TR, Mecca MY (2012) Evaluation of EAP authentication methods in wired and
wireless networks
18. Hoeper K, Hanna S, Zhou H, Salowey J (2012) Requirements for a tunnel-based extensible
authentication protocol (EAP) method. RFC6678
Multiword Expression Extraction Using
Supervised ML for Dogri Language

Shubhnandan S. Jamwal, Parul Gupta, and Vijay Singh Sen

Abstract Multiword expression identification in NLP remains an interesting
topic of research, and a number of techniques have been proposed for MWE
extraction in different Indian languages. In this paper, a machine learning-based
approach is applied that can be used to extract multiword expressions from corpora.
The extraction relies on simple heuristics that take the co-occurrence of MWEs into
consideration. A novel technique for multiword expression extraction of bigrams
using supervised machine learning is proposed for the Dogri language. The proposed
technique is applied on a variety of test datasets, and a decent measure of precision,
recall, and F-score is obtained.

Keywords Machine learning · Dogri · Multiword expressions · Morphology ·


Corpora

1 Introduction

Extracting multiword expressions (henceforth MWEs) from text or corpora remains
a challenge in many Indian languages. MWEs play a vital role in real-life applica-
tions such as machine translation (MT) systems, multilingual information retrieval
(IR), data mining, and other natural language processing tasks. In this paper,
supervised machine learning techniques are used to develop MWE extraction for the
Dogri language. Multiword extraction is the extraction of sequences of words (n-grams)
with relevant semantic meaning, which is used for information retrieval, indexing of docu-
ments, document classification or clustering, and term recognition. The failure to
identify MWEs can cause serious problems in many NLP systems.
The development of MWE extraction depends on language-specific knowl-
edge and pre-existing natural language processing (NLP) tools, which are not
available for low-resourced languages like Dogri. Many Indian languages possess
fewer such resources and tools when compared to English. Approaches are also used

S. S. Jamwal · P. Gupta (B) · V. S. Sen


PGDCSIT, University of Jammu, Jammu, Jammu & Kashmir, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 365
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_27

to prove that adverbs are one of the main aspects of grammar in almost all the
languages and play significant role in the formulation of a sentence. Extraction of
MWEs in Hindi is done by various researchers based on the adverbs. In Dogri, issues
in appropriate processing of multiword expressions remain a challenge and pose a
huge problem to a precise language processing. Various research works have been
done in automatic extraction of multiword expressions for the English, Hindi, and
other Indian languages but no extended work has been reported on MWE in Dogri.
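As a rough illustration of the co-occurrence-based bigram candidate extraction this paper builds on, the Python sketch below counts bigrams in a toy English token stream and ranks them by pointwise mutual information (PMI); the sample text, the PMI score, and the ranking step are illustrative assumptions rather than the exact method of this paper, and a supervised step would then label the top candidates as MWE or non-MWE.

import math
from collections import Counter

tokens = "the quick brown fox and the quick blue fox".split()

unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
n = len(tokens)

def pmi(pair):
    # Pointwise mutual information: log2 of P(w1,w2) / (P(w1) * P(w2))
    w1, w2 = pair
    p_pair = bigrams[pair] / (n - 1)
    return math.log2(p_pair / ((unigrams[w1] / n) * (unigrams[w2] / n)))

# Rank candidate bigram MWEs by association strength
for pair in sorted(bigrams, key=pmi, reverse=True)[:3]:
    print(pair, round(pmi(pair), 2))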

2 Literature Review

Stefan Evert [1] introduced Web1T5-Easy, a simple indexing solution that allows
interactive searches of the Web 1T 5-gram database and a derived database of quasi-
collocations, validated against co-occurrence data from the BNC on the auto-
matic identification of non-compositional VPCs. Els Lefever and others [2] presented
a module that performs terminology extraction independent of language-pair. This
module was based on the alignment system at sub-sentential level that connects
phrases in parallel text. Statistical filters have been applied on the candidate
terms from the bilingual list obtained from the alignment output. Performance compar-
ison of the alignment and terminology extraction module was performed for three
language pairs (French-Italian, French-Dutch, and French-English), and language-
pair specific problems were highlighted. Tim Van De Cruys and others [3] described
a fully unsupervised and automated method for large-scale extraction of multiword
expressions (MWEs) from large corpora. Their method aims at capturing the non-
compositionality of MWEs; the intuition is that a noun within a MWE cannot easily
be replaced by a semantically similar noun. To implement this intuition, a noun clus-
tering is automatically extracted (using distributional similarity measures), which
gives us clusters of semantically related nouns. The system is tested on Dutch and
automatically evaluated using Dutch lexical resources. Scott Piao and others [4]
presented the MWE issue using a semantic field annotator. The English semantic
tagger (USAS), developed at Lancaster University, has been used to identify multi-
word units that depict single semantic concepts. In test evaluation, this approach
extracted a total of 4195 MWE candidates, out of which 3792 were accepted as
valid MWEs after being checked manually, generating a precision of 90.39% and
an estimated recall of 39.38%. Annette Hautli and Sebastian Sulger [5] described a
method for automatically extracting and classifying multiword expressions (MWEs)
for Urdu on the basis of a relatively small unannotated corpus which is around 8.12
million tokens. The MWEs are extracted by an unsupervised method and classified
into two distinct classes, namely locations and person names. The resulting classes
are evaluated against a hand-annotated gold standard and achieve an F-score of 0.5
and 0.746 for locations and persons, respectively. Shailaja Venkatsubramanyan and
Jose F Perez-Carballo [6] described an algorithm that can be used to improve the

quality of multiword expressions extracted from documents. They measured multi-


word expression quality by the usefulness of a multiword expression in helping ontol-
ogists build knowledge maps that allow users to search a large document corpus. Ying
Liu and Zheng Tie [7] used five statistical models, including the Dice coefficient (Dice),
the χ2 coefficient (χ2), the log-likelihood ratio (LLR), symmetrical conditional probability
(SCP), and normalized expectation (NE), to extract multiword unit candidates from a
patent corpus. Gupta P. and Jamwal S. S. [8] proposed an unsupervised ML approach
for the development of the stemmer of Dogri language and achieved a significant
amount of accuracy. Jian Xu, Jingsong Yu, and Huilin Wang [9] proposed a method
combining a similarity measure and a statistical tool for automatically
extracting English MWEs from the corpus of Chinese Government white papers and
work reports from 1991 to 2010. A statistical approach is employed to calculate the co-
occurrence affinity between two words. Besides, similarity measure is harnessed to
compute the semantic relations between words for improving MWE coverage, thus
aiming at obtaining higher precision and recall in extracting candidate multiword
expressions. Experimental results showed the proposed technique improved MWE
extraction efficiently. X. Sun [10] tested experimentally by adopting a manually
labeled set of positive and negative multiword expressions from microblog or other
Internet resources, and the experiments have shown very promising results, which
are comparable to the best value ever reported. Priyanka and R. M. K. Sinha [11]
designed rules representing idioms. For idiom identification, a rule-based
system has been used to mark the input text for the probable presence of an idiom.
T. Su and others [12] investigated the discriminative training framework based on
MPE/MWE criteria in the context of Chinese handwriting recognition. For the Dogri
language, Jamwal S. S., Gupta P., and Sen V. S. [13] have proposed a hybrid model
for the generation of the verbs of the language.

3 Classifications of MWEs in Dogri

This section provides a comprehensive portrayal of the MWE classification that will
be used to tag the data, as well as their relevant and defined annotation labels and
examples.

3.1 Duplicated Multiword Expressions (DMWEs)

Duplicate multiword expressions are often used in daily life in almost all
languages, and the same holds for the Dogri language as well. However, duplica-
tion of MWEs is not of a single type; they exist in various forms such as complete
reduplication ( ), collocation ( ), expressive ( ), echo
( ), and question type ( ). Since Dogri and Hindi languages are

closely related languages, many of the duplicate MWEs used in the Dogri language are
similar to those in the Hindi language. The various subcategories of MWEs are
discussed below:
Complete duplication MWEs: In the complete duplication category, only MWEs
with a propensity to reiterate themselves are considered. These repetitions
could include nouns, adjectives, verbs, adverbs, and other grammatical cate-
gories. Such duplication can be used to express appeal, order, direction, command,
and so on. (different), (lukewarm), (many), and
(similar) are some examples of complete duplication MWEs.
Collocation: These are MWEs wherein the second arranged/collocated term is not a
literal repetition of the first but is instead used as a rhyming/phonological match
of the first term. The second term usually does not possess any grammatical or
literal relation with the first term except a phonological one; the second collocated
term may also have no individual significance in the language. The subsequent term is
utilized so that it displays a certain verbal similarity with the first term.
In this kind of MWE, the collocated terms are consolidated so that together
they give a different meaning than their individual terms signify.
Some examples of this type of MWE are (abuse), (to take),
(here and there), (confuse).
Echo: In such MWEs, the succeeding word is a partial regeneration of the first word. The difference here is that the succeeding word is neither exactly the same as nor an exact echo of the first word, as is the case in the first two subcategories of duplicate MWEs, respectively. In echo duplicate MWEs, the second word is a partial duplication of the first word. These MWEs are quite commonly found in Indian regional languages, including Dogri. Examples of this type of MWE are (crying), (again), (smiling), (forcefully).
Expressive: As the name suggests, these are elements that support the expressive dimension of a language. They convey the expressive patterning or sound-imagery qualities of the language; these patterns could be orderings of sounds, impersonations, and so forth. Such MWEs predominantly capture the features of sounds and their various structures and types in a given language. Some of these MWEs are (little by little), (nonsense/garbage), (very difficult), (zigzag).
Question type MWEs: Just as the Hindi language has words that are used to pose queries, there are words that perform the same function in the Dogri language. These are expressions qualified to frame and pose inquiries; the articulations are introduced so that they indicate a sense of posing a question. MWEs of this type are (who), (where), (how), (when).

Abbreviation MWEs: These MWEs are formed from the initial segments of words. They are likewise arrangements of more than a single word or expression. Each of these constituents has a distinct meaning when employed independently, yet when used in combination they carry a distinct implication. Dogri abbreviations are quite similar to those used in Hindi, since the abbreviation-forming properties are essentially the same in both languages. MWEs of this type are (B.A.), (Ph.D.), (C.R.P.F.).
Compound MWEs: These MWEs are bi-word combinatory elements which are drawn from distinct domains and joined to yield various inferences. This subcategory of MWEs is compositional in nature: the constituents have interpretations when used independently that differ from those when they are clubbed together, consequently providing an alternate inference. Some of the compound MWEs are (scolding), (horse rider), (in no time), (Sindu River).
Idioms and phrases MWEs: Dogri and Hindi being very closely related languages, idioms in Dogri possess properties similar to those of idioms in Hindi. The composition of idioms does not follow a particular structure; instead, an idiom is a combination of multiple words that together convey a distinct meaning, leaving aside the literal meaning of the words involved. Idioms do not follow any structural pattern; they have been taken from various popular stories of earlier generations, some age-old sayings, etc. They are commonly used to convey a longer interpretation in fewer words. A few of these MWEs are (when the climate is very hot), (to become crazy), (attain peace of mind), (the occurring of a scarcity).
Named entity MWEs: These MWEs include all possible names of persons, places, things, designations, qualifications, and fundamental terminologies in various domains. As with idioms, no pattern is followed in forming named entity MWEs in the Dogri language. Some named entity MWEs are (ICC committee), (Nehru Museum), (Chamba city), (Brigu Lake).

4 Proposed Architecture

The main idea of our research is to develop a model for the extraction of MWEs in the Dogri language. The capacity to recognize this kind of MWE in text is of principal use in a wide assortment of natural language processing tasks, since it improves the accuracy of sentence parsing. As sentence parsing is often the initial phase in more refined language processing tasks, we adopted an approach of extracting MWEs by decomposing the task into three sub-tasks. Collocation extraction is the first subtask: we employ a standard collocation measure to obtain the candidate bigram MWEs from the corpus. The subsequent subtask is refinement of the extracted MWEs. Refinement ensures that only actual MWEs are recorded and that all captured MWEs that were not actually intended MWEs are removed; in this subtask, we distinguish between MWEs and non-MWEs. Many word sequences display dual behavior, i.e., MWE and normal literal interpretation, depending upon their contextual usage. For example, for , the meaning can be deduced from the literal sense of its constituent words: one meaning is a dropping of temperature, and the other is getting satisfied. Occurrences of literal word sequences that mimic MWEs are referred to as pseudo-MWEs in this paper. The third subtask we tackle is assessment of the grammatical form of MWEs, i.e., the approximation of a POS tag for each MWE. We have considered bigram MWEs (consisting of two words) in our research; this work may be extended so that it applies appropriately to MWEs of arbitrary length.

4.1 Method

We propose a three-phase mechanism for retrieving MWEs from the corpus of the Dogri language. The first phase identifies a list of candidate MWEs depending upon the statistical behavior of the tokens. Bigrams whose likelihood of appearing together is greater than that implied by their individual frequencies of occurrence are considered to comprise a possible MWE and are extracted for use in further phases; this first phase performs collocation extraction. The second phase utilizes the list of extracted candidate MWEs as the basis for extracting instances of candidates, along with their context, from the corpus. These examples then act as training data for a supervised learning model, enabling it to differentiate between genuine MWEs, pseudo-MWEs, and non-MWEs. In the last phase, we utilize supervised machine learning to train a classifier to perform MWE part-of-speech assignment. By looking at the context surrounding an MWE, the classifier can decide the most likely grammatical part of speech for the MWE in that specific context.

Collocation Extraction: Collocation extraction was performed using the statistical measures discussed in [14]. The measures we experimented with were frequency, chi-square [15], log likelihood ratio [16], t-score [15], mutual information (MI) [17], mutual dependency (MD, mutual information minus self-information) [14], and log-frequency biased mutual dependency (LFMD, a combination of the t-score and mutual dependency) [14]. These measures are represented in Figs. 1, 2, 3, 4, 5 and 6.

Fig. 1 T-score

Fig. 2 Chi-square

Fig. 3 Log likelihood ratio

Fig. 4 Point-wise mutual information

Fig. 5 Mutual dependency (MD)

Fig. 6 Log-frequency biased mutual dependency (LFMD)
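As an illustration of how such measures can rank candidate bigrams, the minimal sketch below scores adjacent bigrams with the t-score and point-wise mutual information following their standard definitions [15, 17]; the toy corpus and function names are illustrative assumptions, not the reported system.

```python
import math
from collections import Counter

def collocation_scores(tokens):
    """Score adjacent bigrams with the t-score and point-wise mutual
    information (PMI), following their standard definitions [15, 17]."""
    n = len(tokens) - 1                        # number of bigram positions
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), f12 in bigrams.items():
        p12 = f12 / n                          # observed bigram probability
        expected = (unigrams[w1] / n) * (unigrams[w2] / n)  # independence
        t = (p12 - expected) / math.sqrt(p12 / n)           # t-score
        pmi = math.log2(p12 / expected)                     # point-wise MI
        scores[(w1, w2)] = (t, pmi)
    return scores

# Candidate MWEs are the top-scoring bigrams
tokens = "the old man and the old sea and the old man".split()
ranked = sorted(collocation_scores(tokens).items(),
                key=lambda kv: kv[1][0], reverse=True)
for bigram, (t, pmi) in ranked[:3]:
    print(bigram, round(t, 3), round(pmi, 3))
```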

Verification: In order to extract the list of quality candidate MWEs from the list of
MWEs obtained in the collocation stage, verification was performed.
Features: In order to train the classifier, we used a context window of three tokens on both sides (left and right) of the candidate MWE, together with their part-of-speech tags. We also included the tokens that constitute the candidate MWE and their parts of speech. The three-token window was chosen on the assumption that most lexical dependencies are relatively local, so the window is broad enough to capture the most valuable knowledge available from the context. Contexts could not transcend sentence limits; where the available context around a candidate MWE was less than the three-token window, we added the required number of dummy tokens and dummy parts of speech to compensate: "BOS" and "EOS" indicating beginning- and end-of-sentence tokens in the left and right context, respectively.
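A minimal sketch of this windowing scheme is given below; the function name, the example sentence, and its tags are illustrative assumptions, not the actual implementation.

```python
def context_features(sent_tokens, sent_tags, i, j, window=3):
    """Feature vector for a candidate bigram MWE at positions (i, j):
    the candidate's tokens/tags plus `window` tokens and tags on each
    side, padded with BOS/EOS dummies at sentence boundaries."""
    def grab(seq, lo, hi, pad):
        left = [pad] * max(0, -lo)              # BOS padding before start
        right = [pad] * max(0, hi - len(seq))   # EOS padding past the end
        return left + seq[max(lo, 0):min(hi, len(seq))] + right

    feats = []
    feats += grab(sent_tokens, i - window, i, "BOS")   # left tokens
    feats += grab(sent_tags,   i - window, i, "BOS")   # left tags
    feats += [sent_tokens[i], sent_tokens[j],          # the candidate itself
              sent_tags[i],   sent_tags[j]]
    feats += grab(sent_tokens, j + 1, j + 1 + window, "EOS")  # right tokens
    feats += grab(sent_tags,   j + 1, j + 1 + window, "EOS")  # right tags
    return feats

sent = ["he", "gave", "up", "smoking", "yesterday"]
tags = ["PRP", "VBD", "RP", "VBG", "NN"]
print(context_features(sent, tags, 1, 2))   # candidate bigram "gave up"
```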
Part of Speech Tagging: The corpus was tagged with the TnT part-of-speech tagger to provide part-of-speech details. Although the Dogri corpus (DC) is a POS-tagged corpus, retagging was needed because, while each MWE in the DC is tagged with a part of speech, its constituent tokens are not. As a result, the tokens that comprise each MWE must be labeled with individual parts of speech. It may be argued that comparing the initial DC tagging with the TnT tagging of the words in the MWEs would have resulted in more reliable training data; however, in a real-world application, the part-of-speech information in the classifier's input data would be generated entirely by a tagger such as the TnT tagger. By using the same POS tagger for the training and application stages, any systematic tagging errors should be learned and accounted for by the classifier. The tagger was trained on a subset of DC files containing slightly less than 98.6 k tokens. It was then run on a new collection of files containing approximately 92 k tokens; when checked on this data, the tagger's precision was 74.7%.
Training: The corpus used to train the classifier was a sub-corpus of the DC that included approximately 99,000 tokens representing all domains in the corpus. Occurrences in the training corpus of the top 6000 bigrams found using the t-score and LFMD collocation indicators were used as training examples. The corpus included a majority of negative cases, either non-MWEs or pseudo-MWEs. TinySVM was used to construct a binary classifier, and training was performed on the training corpus. Several training runs were carried out with differing volumes of data in order to explore the association between the volume of training data and the output of the resulting model.
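The sketch below shows the shape of such a training run, using scikit-learn's LinearSVC as a stand-in for TinySVM and toy feature dictionaries; none of the data or names come from the actual system.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC

# Toy training instances: context features -> true MWE (1) vs.
# non-MWE or pseudo-MWE (0).
X_raw = [
    {"L1": "BOS", "R1": "yesterday", "w1": "gave", "w2": "up"},
    {"L1": "he",  "R1": "the",       "w1": "of",   "w2": "the"},
]
y = [1, 0]

vec = DictVectorizer()                 # one-hot encode token features
clf = LinearSVC()                      # linear SVM, TinySVM stand-in
clf.fit(vec.fit_transform(X_raw), y)

test = {"L1": "she", "R1": "today", "w1": "gave", "w2": "up"}
print(clf.predict(vec.transform([test])))   # 1 = MWE, 0 = not an MWE
```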
Testing: The classifier was tested using a sub-corpus of the DC that was not used in training. This test corpus included about 50 k tokens and texts from each domain in the corpus.

Part of Speech Estimation of Multiword Expressions


We addressed calculating the POS of a given MWE as a classification task, using the same method that we used to classify true and false MWEs. For each target part of speech, we trained a distinct classifier: an instance of an MWE with the target part of speech was a positive training example, and any MWE with a non-target part of speech was a negative example (Fig. 7).
The tokens and parts of speech of the three tokens to the left and right of the target MWE, as well as the tokens and parts of speech of the words in the MWE itself, were used as features. The training and testing corpora were identical to those used in the verification stage; since the two tasks are unrelated, this was acceptable. The target parts of speech (verbs, adverbs, prepositions, and conjunctions) were chosen because they are expected to be the most useful in applications such as parsing, owing to their comparatively closed class and high frequency. For contrast, we also experimented with an open-class category (nouns).

5 Results and Discussion

5.1 Collocation Extraction

Standard collocation measures perform poorly in the task of extracting the target MWEs, according to the results for collocation extraction (Fig. 8): when recall is measured using the top 6000 candidate collocations, it is just 46% (using the t-score). It is observed from Fig. 8 that some values are not satisfactory because only bigram MWEs are included in the approach, while the test data is not restricted to bigrams; since the extracted collocations are confined to bigrams, any of them may be part of a larger MWE in our target list.

Fig. 7 Data flow of proposed methodology

Fig. 8 Precision and recall for top 6 k candidate multiword expressions extracted using different collocation measures

Fig. 9 Performance of verifier using models trained on different quantities of data

5.2 Verification

Verification of candidate MWEs produced better results (Fig. 9). For example, a classifier trained using 12 k examples had a precision of 82.56% and a recall of 73.11%, giving an F-measure of 77.55%.
Many false negative results appear to be due to tagging errors. Proper nouns are often marked as "unclassified words"; this is convenient in that it groups together all the unusual words the tagger is unaware of, but it leads to erroneous labeling, which prevents the classifier from recognizing true MWEs. Both false positives and false negatives seem to occur often in the context of punctuation, suggesting that this presents a particular difficulty for the classifier. Many false positives are actually substrings of longer MWEs: because this paper concentrates on bigrams, MWEs longer than two tokens were ignored when determining whether a candidate MWE was a true or false MWE, yet some candidate MWEs were only substrings of a longer MWE. As a consequence, the classifier can recognize that a given substring appears in a context that is typical of MWEs and thus classify the MWE substring as an MWE in its own right. A more complete implementation that extracts MWEs longer than two tokens is likely to remove this cause of error. Despite the occasional error, applying a classifier to the context surrounding a candidate MWE proves to be a successful way to differentiate true MWEs from non-MWEs and pseudo-MWEs.

5.3 Estimating Effective Part of Speech

The effectiveness of the part-of-speech classifiers in estimating an MWE's part of speech based on its context of occurrence is illustrated below. The classifier for nouns, as expected, performed best, with near-perfect recall and high precision. The conjunction classifier did not perform as well, with recall slightly poorer than for other parts of speech. This may be due to the fact that the contexts around conjunctive MWEs are more varied: since conjunctions often play a discursive function in sentences, evidence of whether or not a phrase is a conjunction may only be identifiable at a higher level of linguistic analysis than the immediate lexical context used in our study (Fig. 10).

Fig. 10 Part of speech estimation results

6 Conclusion

The goal of this study was to find a method for automatically extracting MWEs and estimating their parts of speech. Knowledge of MWEs could help increase the accuracy of a variety of natural language processing applications; as a result, generating such a list would be a crucial part of natural language processing. In order to generate a list of candidate bigrams, we use a collocation measure. These candidates are then used to select training data for a classifier. The trained classifier was successfully able to distinguish contexts containing a true MWE from contexts containing a pseudo-MWE or no MWE at all. Candidates found using LFMD had a precision of 82.56% and a recall of 73.11%, giving an F-measure of 77.55% for the classifier trained on 12 k examples. Part-of-speech classifiers were then trained and tested. With accuracy and recall of over 91.04% for nouns, 88.98% for prepositions, 82.32% for conjunctions, 85.56% for adverbs, and 87.02% for verbs, the classifiers were able to assign the correct part of speech to an MWE. These results demonstrate that the local context around MWEs provides enough support to detect the existence of MWEs and to approximate their part of speech.

References

1. Evert S (2010) Google web 1T 5-grams made easy (but not for the computer), WAC-6 '10. In:
Proceedings of the NAACL HLT 2010 sixth web as corpus workshop, June 2010, pp 32–40
2. Lefever E, Macken L, Hoste V (2009) Language independent bilingual terminology
extraction from a multilingual parallel corpus. In: EACL ’09: Proceedings of the 12th confer-
ence of the European chapter of the association for computational linguistics, Mar 2009, pp
496–504
3. Van De Cruys T, Moiron BV (2007) Semantics-based multiword expression extraction. In:
MWE ’07: Proceedings of the workshop on a broader perspective on multiword expressions,
June 2007, pp 25–32
4. Piao S, Rayson P, Archer D, Wilson A, McEnery T (2003) Extracting multiword expressions
with a semantic tagger. In: MWE ’03: Proceedings of the ACL 2003 workshop on multiword
expressions: analysis, acquisition and treatment, vol 18, July 2003, pp 49–56
5. Hautli A, Sulger S (2011) Extracting and classifying Urdu multiword expressions. In: HLT-SS
’11: Proceedings of the ACL 2011, student session, June 2011, pp 24–29

6. Subramanyan SV, Perez-Carballo JF (2004) Multiword expression filtering for building knowl-
edge maps. In: MWE ’04: Proceedings of the workshop on multiword expressions: integrating
processing, July 2004, pp 40–47
7. Liu Y, Tie Z (2011) Automatic extraction and filtration of multiword units. In: Eighth inter-
national conference on fuzzy systems and knowledge discovery (FSKD), 26–28 July 2011,
IEEE, Conference Location: Shanghai, China
8. Gupta P, Jamwal SS (2021) Designing and development of stemmer of Dogri using unsuper-
vised learning. In: Marriwala N, Tripathi CC, Jain S, Mathapathi S (eds) Soft computing for
intelligent systems. Algorithms for intelligent systems. Springer, Singapore, pp 47–156
9. Xu J, Yu J, Wang H (2010) Automatic extraction of multiword expressions combining statistical
and similarity approaches. In: Fourth international conference on genetic and evolutionary
computing, 13–15 Dec 2010. IEEE, Shenzhen, China. ISBN: 978-1-4244-8891-9
10. Sun X, Li C, Tang C, Ren F (2013) Mining semantic orientation of multiword expression from
Chinese microblogging with discriminative latent model. In: 2013 International conference on
Asian language processing, Urumqi, pp 117–120. https://doi.org/10.1109/IALP.2013.41
11. Priyanka, Sinha RMK (2014) A system for identification of idioms in Hindi. In: 2014 Seventh
international conference on contemporary computing (IC3), Noida, pp 467–472. https://doi.
org/10.1109/IC3.2014.6897218
12. Su T, Ma P, Wei T, Liu S, Deng S (2013) Exploring MPE/MWE training for Chinese handwriting
recognition. In: 2013 12th International conference on document analysis and recognition,
Washington, DC, pp 1275–1279. https://doi.org/10.1109/ICDAR.2013.258
13. Jamwal SS, Gupta P, Sen VS (2021) Hybrid model for generation of verbs of Dogri language.
In: Singh TP, Tomar R, Choudhury T, Perumal T, Mahdi HF (eds) Data driven approach towards
disruptive technologies. Studies in autonomic, data-driven and industrial computing. Springer,
Singapore. https://doi.org/10.1007/978-981-15-9873-9_39
14. Thanopoulos A, Fakotakis N, Kokkinakis G (2002) Comparative evaluation of collocation
extraction metrics. In: International conference on language resources and evaluation (LREC-
2002), pp 620–625
15. Manning CD, Schütze H (1999) Foundations of statistical natural language processing. The
MIT Press, Cambridge
16. Dunning T (1993) Accurate methods for the statistics of surprise and coincidence. Comput
Linguist 19:61–74
17. Church K, Hanks P (1990) Word association norms, mutual information, and lexicography.
Comput Linguist 16:22–29
Handwritten Text to Braille
for Deaf-Blinded People Using Deep
Neural Networks and Python

T. Parthiban, D. Reshmika, N. Lakshmi, and A. Ponraj

Abstract Among the people living in the world, nearly 253 million people are
blind, of whom 97.6 million are both deaf and blind. To communicate with other
humans, deaf-blinded people use communication techniques such as Print on Palm,
tactile fingerspelling, and Braille. Of these, Braille is considered a convenient way for
communication. Braille is a code language that has six raised dot patterns embossed
on Braille paper. Braille text is available in the form of Braille books and Braille sheets
that are accessible by deaf-blinded people. For printing Braille sheets and books, a
specialized printer called Braille Embosser is required. For day-to-day interaction,
conversion of normal text to Braille is difficult because it is slower and very expensive
to print Braille sheets. To make Braille accessibility easier, this paper brings forth a
technique that converts visual text to Braille by deep neural networks using Python.

Keywords Deaf-blinded · Braille · Visual text · Deep neural networks · Python

1 Introduction

Braille code is categorized as Grade 1, Grade 2, and Grade 3 [1]. Grade 1 Braille
consists of a single-character representation of alphabets that involves one-to-one
conversion. Grade 2 Braille consists of a shortened form of commonly used words
that saves space. Grade 3 Braille is referred to as the Braille shorthand that is not
commonly used.
Braille code consists of six raised dots (together forming a cell) arranged
in a three-row, two-column fashion. Braille code is available in many languages. In
English, Braille is coded as 63 unique characters in a dotted pattern (64 if non-dot
pattern included). Universally, the diameter of the dots in the Braille cell is 1.6 mm
[2]. The spacing between two dots in the small cell is 2.5 mm, and the spacing
between two adjacent cells is 7.6 mm. The spacing of the Braille cell is shown in
Fig. 1.

T. Parthiban (B) · D. Reshmika · N. Lakshmi · A. Ponraj


Department of Electronics and Communication Engineering, Easwari Engineering College,
Chennai, India


Fig. 1 Spacing of the Braille cell

Braille code is embossed on a thick paper of 140 to 150 g per square meter (GSM), called Braille paper. To print Braille code on sheets, a specialized printer is used. First, the content that has to be communicated to deaf-blinded people is loaded into software on a PC and encoded to the equivalent Braille code. Then, the encoded Braille data is sent to the specialized printer, which embosses the Braille dots as separate cells on the Braille paper.
In day-to-day life, interacting with deaf-blinded people is a big task, as instructions have to be given to them in a specialized way as Braille code. For this, Braille has to be printed frequently using the software to convey conversations to the deaf-blinded, which is tedious because Braille printing is time-consuming and the printer is costly.
To avoid this time-consuming process, a convolutional neural network (CNN), a class of deep neural networks, is employed, in which optical character recognition (OCR) plays a vital role.
This paper deals with the encoding of handwritten characters, taken as images in portable network graphics (PNG) format, into Braille code using a convolutional neural network in Python. The image file is loaded into the program, which uses optical character recognition to identify the individual alphabets/characters in the image and convert them into digital text understandable by the software; this digital text is then encoded into the equivalent Braille.
Handwritten Text to Braille for Deaf-Blinded People Using Deep … 381

2 Literature Survey

In 2010, Manzeet Singh [3] proposed a technique that converts computer key input
into Braille code (Grade 1) using Java programming language. The project converts
both English and Hindi languages to Braille. It has a user interface (UI) that allows
the user to type the content which converts it to Braille code using a Java processing
engine. The author proposes that the output data can be given to a Braille printer to
obtain Braille code in the future.
The system proposed by Adrian Moise [4] in 2017 automatically converts computer-written text to Braille code with the help of a microcontroller, the Cypress PSoC 4 kit. The microcontroller employs a finite state machine (FSM) model that helps convert the text to Braille; it is programmed in the C language using the stack counter method to encode Braille.
In 2006, Xuan Zhang [5] proposed a system that uses field programmable gate arrays (FPGAs) to convert .txt text files to Braille code using the very high-speed hardware description language (VHDL), which speeds up the encoding of Braille considerably. The author proposed that in the future the whole hardware design could be fabricated into a single integrated chip (IC) that incorporates all the necessary processing, and multiple-language integration could be achieved.
Priyank Patel proposed a methodology in 2021 [6] that converts the handwritten
character into computer-readable text using convolutional neural network (CNN)
that employs optical character recognition (OCR). This paper explains the concepts
and theories on how the images in PNG format are converted to individual characters
using optical character recognition.

3 Existing System

From the above literature survey, the techniques and methodologies that are employed
follow a common architecture except for the paper that discusses how to convert
image files to computer-readable text. Figure 2 represents the block diagram of the
existing system.

Fig. 2 Block diagram of existing system



The input to the system is given as computer-readable text that can be obtained from the user via the keyboard. The given text data is then passed to a software- or hardware-managed system which works as a converter from computer-readable text to Braille code. The hardware model uses algorithms such as finite state machines (FSM) and hardware designs such as field programmable gate arrays (FPGAs), with processing languages such as C and the very high-speed hardware description language (VHDL).
The software model uses a Java processing engine and a user interface (UI) for obtaining input from the user as text from the keyboard and converts it to Braille code that can be given as output to the user.

4 Proposed System

This paper aims to simplify communication for deaf-blinded people by taking written text in the form of a PNG image and giving it to a convolutional neural network (CNN) model, which converts the visual text to a computer-readable format; this computer-readable text is then mapped to the corresponding equivalent characters encoded as Braille, which is readable by the visually impaired.
Figure 3 represents the block diagram of the proposed system.

4.1 Preprocessing

Preprocessing is the conventional operation that deals with the improvement and enhancement of the acquired images for further processing. When the transcribed content (input) is given as an image, the geometric distortions of the image are first corrected by re-shaping; then the images are converted into grayscale so that a common base size is obtained for all images fed into our deep neural network model. Gaussian blur is used to smooth the images and remove unwanted noise, after which foreground and background objects in the images are separated by segmentation. The smoothing edge (morphology) technique is used to retain the texture of the image, which is used in the subsequent feature-extraction stage.

Fig. 3 Block diagram of proposed system
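A minimal OpenCV sketch of such a pipeline is given below; the file name, target size, and kernel size are illustrative assumptions, not values from this paper.

```python
import cv2
import numpy as np

img = cv2.imread("handwritten.png")              # hypothetical input file
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)     # grayscale conversion
gray = cv2.resize(gray, (128, 32))               # common base size (illustrative)
blur = cv2.GaussianBlur(gray, (5, 5), 0)         # smooth, remove noise

# Separate foreground text from background by Otsu thresholding
_, seg = cv2.threshold(blur, 0, 255,
                       cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Morphological opening to smooth edges while retaining texture
kernel = np.ones((2, 2), np.uint8)
clean = cv2.morphologyEx(seg, cv2.MORPH_OPEN, kernel)
```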

4.2 Feature Extraction

It is a technique where each character is represented in the form of vectors to extract a set of features. The raw data is divided into manageable groups for easier processing. The computing process for large datasets would otherwise be difficult, since they have a large number of variables. Feature extraction reduces the data by selecting and combining a large number of variables into features without losing any important information. This reduction of data helps to build the model with less machine effort, increasing the speed of image processing and of finding the output.
Features are taken from the content of the images; they can be
• Color
• Texture
• Shape
• Position.

4.3 Deep Neural Network Model

A neural network imitates the learning process of the human brain: it takes input according to the problem and gives output as a solution. The algorithm used here is the convolutional neural network (CNN).
Neural network models are commonly used in image classification; they are employed in computer vision tasks such as image classification, object detection, and image recognition. A neural network consists of three layers (or three actions), which are shown in Fig. 4:
1. Input layer (or) receiving input
2. Hidden layer (or) processing information
3. Output layer (or) generating output.
In the above diagram, each layer consists of circular units called neurons.
The algorithm of a neural network is classified into two steps:
1. forward propagation
2. backward propagation.
The forward and backward propagations are shown in Fig. 5.
Forward Propagation. The transcribed content (input) is given in the form of numbers to the input layer. The numbers represent the pixel values of the image; based on the intensity of each pixel, the values differ (for example, over the range 0–255). The values undergo a mathematical operation in the hidden layer based on the problem, and from the output layer we get the results for further prediction.

Fig. 4 Neural network layers

Fig. 5 Forward and backward propagations
Backward Propagation. The result obtained from the output layer of the forward propagation is compared with the actual value. Comparing the output with the actual value gives an insight into how large the error between them is, and the parameter values are then updated. The forward propagation is then repeated with the updated values to obtain new outputs.

5 Convolutional Neural Network (CNN) Architecture

When trying to tell two or three images apart, we rely on pieces of information such as features, shapes, and edges in the images to classify an image as a dog or a cat and so on. The hidden layers in a CNN do the same. A CNN can be divided into two kinds of layers:
1. convolutional layers
2. fully connected layers (or) dense layers.

Fig. 6 Block diagram of two layers in CNN

The block diagram of the two layers of a CNN is shown in Fig. 6. As discussed, the forward and backward propagation processes are involved in the training of any neural network model.

5.1 CNN—Forward Propagation

Convolutional layer. Figure 7 shown is taken from the “towards data science” [7]
and gives an insight into how the pixel values are determined in an image. If looked
closely at the image, the pixel values may differ (say 0–255) from the edges of the
face on the image.
The mathematical representation (1) of the convolution is often represented by
the asterisk * sign

X = A∗ f (1)

where X represents output, A represents the input image, and f represents the filter
applied.
To gain a better understanding of the convolution, let us discuss the example shown in Fig. 8, taking the image size as 3 × 3 and the filter size as 2 × 2.

Fig. 7 Picture with pixel values

Fig. 8 Convolution example

The filter performs element-wise multiplication, and the results are summed up:

(5 × 1 + 6 × 1 + 4 × 0 + 8 × 1) = 19
(6 × 1 + 1 × 1 + 8 × 0 + 3 × 1) = 10
(4 × 1 + 8 × 1 + 9 × 0 + 5 × 1) = 17
(8 × 1 + 3 × 1 + 5 × 0 + 7 × 1) = 18

In the above example, since the image is a 3 × 3 matrix and the filter a 2 × 2 matrix, it is easy to see that the output is a 2 × 2 matrix.
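The same computation can be reproduced in NumPy; the 3 × 3 image and 2 × 2 filter values below are the ones implied by the four sums above.

```python
import numpy as np

A = np.array([[5, 6, 1],
              [4, 8, 3],
              [9, 5, 7]])          # 3x3 input image (from the sums above)
f = np.array([[1, 1],
              [0, 1]])             # 2x2 filter

# Valid convolution (no padding, stride 1): slide the filter, take the
# element-wise product of each 2x2 window with f, and sum.
out = np.array([[(A[i:i+2, j:j+2] * f).sum() for j in range(2)]
                for i in range(2)])
print(out)   # [[19 10]
             #  [17 18]]
```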
Fully connected layers (or) dense layers. Valuable features have been extracted from the data by the convolutional layer; the final output is generated in the dense layer by passing these features forward in the conventional way.
The fully connected layer works only on 1D data. Since the output obtained from the convolutional layer is a 2D matrix, it is converted into 1D data as shown in Fig. 9; in this way, each row represents a single input image.
Once the 1D conversion is completed, the data is sent to the dense layer. The 1D values represent separate features of the image. Two actions are performed on the incoming data:
• linear transformation
• nonlinear transformation
Equation (2) for the linear transformation is given as

X = M^T · B + c (2)

Fig. 9 2D is converted into 1D

Fig. 10 Block diagram of CNN—forward propagation

Fig. 11 Linear transformation example

Here, B is the input, M is the weight, and c is a constant (called the bias), where M is a matrix of randomly initialized numbers. Figure 11 shows an example of the linear transformation.
The size of the weight matrix is (m, n), where the number of inputs or features equals m and the number of neurons used in the layer equals n. Here, m is four from the convolutional layer, and n is two. We have a (4, 2) weight matrix (M) and a (4, 1) input matrix (B), and we let the bias be a randomly initialized (2, 1) matrix. These values are applied in the given linear equation.
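A minimal NumPy sketch of Eq. (2) with these shapes is given below; random values stand in for the learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 2))    # weights: 4 input features, 2 neurons
B = rng.standard_normal((4, 1))    # input: 4 features from the conv layer
c = rng.standard_normal((2, 1))    # bias, one value per neuron

X = M.T @ B + c                    # Eq. (2): linear transformation
print(X.shape)                     # (2, 1) -> one value per neuron
```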
The nonlinear transformation is a further component of the architecture, also called the activation function. It adds nonlinearity to the data, because complex relations cannot be captured by a linear transformation alone; activation functions are chosen according to the problem and added to the neural network.
There are different types of activation functions used in building neural networks, some of which are explained on GeeksforGeeks [8]. Here, the activation function tanh is used for the binary classification problem; the mathematical expression of tanh is

f(x) = tanh(x) = 2/(1 + e^(−2x)) − 1 (3)

or

tanh(x) = 2 ∗ sigmoid(2x) − 1 (4)

The range of the tanh function is between −1 and 1, so for any given input value, the output lies in the range (−1, 1). It is widely used for binary classification and is applied in both the convolutional and the fully connected layers. The overall block diagram of forward propagation is shown in Fig. 10.
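The equivalence of Eqs. (3) and (4), and the (−1, 1) output range, can be checked numerically; the sketch below is illustrative only.

```python
import numpy as np

x = np.linspace(-3, 3, 7)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

t1 = 2 / (1 + np.exp(-2 * x)) - 1        # Eq. (3)
t2 = 2 * sigmoid(2 * x) - 1              # Eq. (4)
assert np.allclose(t1, np.tanh(x)) and np.allclose(t2, np.tanh(x))
print(t1.round(3))    # all outputs lie in (-1, 1)
```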

Fig. 12 Gradient descent graph

5.2 CNN—Backward Propagation

Values such as weights, biases, and filters are randomly initialized during forward propagation; such values are the parameters of the CNN algorithm. In backward propagation, the overall prediction becomes more accurate as the model updates these parameters.
The gradient descent technique is used for updating the parameters. Figure 12 illustrates gradient descent: the randomly initialized value a2 in the graph is not at the minimum loss, while a1 marks the minimum. Gradient descent tries to update the initialized value a2 of the parameter to attain the minimum loss (i.e., near a1) by moving in steps controlled by the learning rate. The slope or gradient is calculated at the initialized point a2 to determine the direction of movement: when the slope is positive, the movement is to the left, and when the slope is negative, the movement is to the right.
The learning rate must always be selected carefully, because it determines model performance by governing how the model responds to the error and updates its weights. If the value is too large, it may cause an unstable training process; if it is too small, the training process takes more time to complete.
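A toy sketch of this update rule on a one-dimensional loss follows; the loss function and learning rate are illustrative assumptions.

```python
def gradient_descent_step(a, grad, lr=0.1):
    """One update: move against the slope. A positive slope moves a to
    the left, a negative slope moves it to the right (Fig. 12)."""
    return a - lr * grad

# Minimise loss(a) = (a - 1)**2, whose gradient is 2*(a - 1).
a = 4.0                     # randomly initialised point (a2 in Fig. 12)
for _ in range(25):
    a = gradient_descent_step(a, 2 * (a - 1))
print(round(a, 4))          # close to the minimum at a = 1 (a1 in Fig. 12)
```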
From Eq. (1) of the forward propagation, we get the expression

X11 = A11 f11 + A12 f12 + A21 f21 + A22 f22 (5)

X12 = A11 f12 + A12 f13 + A21 f22 + A22 f23 (6)

X21 = A11 f21 + A12 f22 + A21 f31 + A22 f32 (7)

X22 = A11 f22 + A12 f23 + A21 f32 + A22 f33 (8)
Handwritten Text to Braille for Deaf-Blinded People Using Deep … 389

Fig. 13 Block diagram of CNN—backward propagation

To perform backward propagation as shown in Fig. 13, we need the chain rule; Eq. (9) for the chain rule is

dz/dt = ∂z/∂x · dx/dt + ∂z/∂y · dy/dt (9)

By applying the chain rule to the output equations obtained from forward propagation (Eqs. 5–8), ∂f is calculated:

∂f11 = A11 ∂X11 + A12 ∂X12 + A21 ∂X21 + A22 ∂X22 (10)

∂f12 = A12 ∂X11 + A13 ∂X12 + A22 ∂X21 + A23 ∂X22 (11)

∂f21 = A21 ∂X11 + A22 ∂X12 + A31 ∂X21 + A32 ∂X22 (12)

∂f22 = A22 ∂X11 + A23 ∂X12 + A32 ∂X21 + A33 ∂X22 (13)

Similarly, ∂X is calculated, and the parameters are updated accordingly; in this way, we reach the minimum loss for the parameters of the model. The overall block diagram of CNN backward propagation is shown in Fig. 13.
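Eqs. (10)–(13) say that the filter gradient is itself a valid convolution of the input with the output gradient; the sketch below checks this with the same 3 × 3 input as before and a toy output gradient.

```python
import numpy as np

A = np.array([[5, 6, 1],
              [4, 8, 3],
              [9, 5, 7]])            # same 3x3 input as the forward pass
dX = np.ones((2, 2))                 # toy gradient flowing back into X

# Eqs. (10)-(13): each filter gradient is a sum of input values weighted
# by the output gradients -- a valid convolution of A with dX.
df = np.array([[(A[i:i+2, j:j+2] * dX).sum() for j in range(2)]
               for i in range(2)])
print(df)    # gradient with which to update the 2x2 filter
```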

6 Deployment of CNN Model and Encoding of Text to Braille

The model used in this paper is a convolutional neural network (CNN) in Python 3.8.7, with libraries such as TensorFlow, NumPy, and other math libraries imported. A CNN is a convenient and useful model in terms of extracting features from handwritten images. It has a fully connected layer that updates parameters via gradient descent to attain minimum loss, so that the overall model prediction is good.

To compile the model, we use specifications such as the learning rate and the mean square error (MSE). The learning rate is controlled by an optimizer to adjust the training of the model, and the model weights are determined through the learning rate. Usually, smaller learning rates lead to more accurate models, but the time taken to train the model is longer. The mean square error (MSE) is the loss function that takes the squared difference between the actual value and the predicted value. If the loss is close to zero, the model performed well; if not, the model did not perform well. The formula for MSE is

MSE = (1/N) Σ_{i=1}^{N} (E_i − Ê_i)² (14)

where E_i is the actual value, Ê_i the predicted value, and N the number of samples.
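A small numerical example of Eq. (14) follows; the values are illustrative.

```python
import numpy as np

actual = np.array([1.0, 0.0, 1.0, 1.0])
predicted = np.array([0.9, 0.2, 0.8, 1.0])

mse = np.mean((actual - predicted) ** 2)   # Eq. (14)
print(mse)    # 0.0225 -- close to zero, so the model performed well
```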

Handwritten data in the form of images for training the model is taken from IAM
Handwriting database [9]. There are
• 1539 pages of scanned text
• 5685 isolated and labeled sentences
• 13,353 isolated and labeled text lines
• 115,320 isolated and labeled words of data in the IAM Handwriting database.
The number of times the model cycles through the data is determined by the number of epochs. Normally, the learning of the model increases if the number of epochs is set high, but the time taken for the model to improve is long, and after a certain point the model stops improving. The early stopping technique halts training once the model stops improving, before it reaches the specified number of epochs. Here, we set the early stopping patience to five, which means training stops once the model has not improved for five epochs.
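With Keras (available through TensorFlow, which is imported here), this corresponds to an EarlyStopping callback; the sketch below is a minimal illustration, not the exact training script used in this work.

```python
import tensorflow as tf

# Stop training once validation loss has not improved for 5 epochs,
# matching the early-stopping setting described above.
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                              patience=5,
                                              restore_best_weights=True)

# Hypothetical usage with an already-built `model` and training data:
# model.compile(optimizer="adam", loss="mse")
# model.fit(x_train, y_train, validation_split=0.1,
#           epochs=100, callbacks=[early_stop])
```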
Once the model is trained, we predict the handwritten text that has to be conveyed to the visually impaired. The text to be converted into Braille is written on a piece of white paper; a snapshot of the paper is then taken and given to the model as input for prediction of the handwritten text, so that it can be digitally read by the Braille encoder. The input images are preprocessed, with re-shaping, grayscale conversion, and segmentation of the foreground and background taking place so that the input images have the same base size for further feature extraction.
Then, feature extraction takes place; it converts the image characters into vectors to extract features. The features extracted may be color, texture, shape, and position; in this case, they are the shape and position of the text. These features are passed to the model, where weights and biases are assigned to the image to differentiate the characters from one another.
A convolutional neural network (CNN) has several layer functions, such as pooling, convolutional layers, and a classification layer, as shown in Fig. 14. It requires little preprocessing compared with other algorithms, and its architecture is easy to learn. The structure of a CNN is similar to the neurons in the human brain and works in the same way, with neurons responding to stimuli; it is responsible for capturing low-level features.

Fig. 14 Convolutional neural network layers

It breaks the given handwritten images into pixel values, and these values feed the neurons as numbers, as shown in Fig. 7. The breakdown into pixel values may differ from image to image. Here, the convolutional neural network (CNN) reduces the images to make processing easier without losing any feature needed for good prediction; high-level features such as edges are extracted effortlessly.
The pooling layer reduces the size of the representation, decreasing the computing power needed for data processing; the output is then given to the fully connected layer, which finds the error and updates the values to raise the probability of a correct prediction. The text with the highest probability comes out as the output from the hidden layers in computer-readable form.
The outputs obtained from the handwritten images in the form of text are mapped/encoded to the corresponding equivalent Braille characters. The mapping of text to Braille characters is detailed in Fig. 15.

Fig. 15 Mapping of text to Braille code



Table 1 Result of the conversion of handwritten text to Braille code


S. No. | Handwritten text as input | Recognized text (output from model) | Text encoded/mapped to Braille code
1 | (handwritten image) | parthiban | ⠏⠁⠗⠞⠓⠊⠃⠁⠝
2 | (handwritten image) | reshmika | ⠗⠑⠎⠓⠍⠊⠅⠁
3 | (handwritten image) | lakshmi | ⠇⠁⠅⠎⠓⠍⠊

At first, an empty string is initialized; if the computer-readable characters need to be encoded into Braille, the process proceeds, otherwise it stops. The given text is checked for alphabetic characters and numbers; if a character is a letter, the corresponding equivalent Braille code is appended to the string, and if it is a number, its equivalent Braille code is appended likewise (see the sketch below). Once the text is converted to Braille code, the string is printed. This concludes the conversion of handwritten text to Braille code.
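A minimal sketch of this loop using the Unicode Braille patterns for Grade 1 letters is given below (digits, which additionally need the Braille number sign, are omitted for brevity):

```python
# Grade 1 Braille for lowercase letters (Unicode Braille patterns).
BRAILLE = dict(zip("abcdefghijklmnopqrstuvwxyz",
                   "⠁⠃⠉⠙⠑⠋⠛⠓⠊⠚⠅⠇⠍⠝⠕⠏⠟⠗⠎⠞⠥⠧⠺⠭⠽⠵"))

def to_braille(text):
    """Build the Braille string character by character, as described
    above: start from an empty string and append each letter's cell."""
    out = ""
    for ch in text.lower():
        out += BRAILLE.get(ch, ch)   # leave unmapped characters as-is
    return out

print(to_braille("parthiban"))   # ⠏⠁⠗⠞⠓⠊⠃⠁⠝ (cf. Table 1)
```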

7 Results

The proposed system is deployed for the conversion of handwritten text to Braille code. As shown in Table 1, the handwritten text is given as input in PNG format. It is recognized with the pre-trained CNN model, the output is given as computer-recognized text with the highest probability, and the text is then encoded or mapped to the respective Braille code.

8 Conclusion and Future Scope

In this paper, an idea is proposed for deaf-blinded people, whereby communication with them is made easier by converting handwritten text to Braille code, which deaf-blinded people understand. Additionally, the system is helpful for the visually impaired in learning Braille code at their own pace by instantly converting written text to Braille. In the future, the system can be integrated with an actuated system that works as a printer: the six dots in the Braille cell would be connected to six servo motors constituting the cell, which elevate and imprint the Braille dots on the hand of the deaf-blinded person. Previous and forward commands are also provided to review the previously elevated Braille code. The model is cost-efficient and easily portable.

References

1. Tennessee Council of the Blind, http://www.acb.org/tennessee/braille.html, last accessed 2010/08/10
2. Wikipedia The Free Encyclopedia, https://en.wikipedia.org/wiki/Braille
3. Singh M, Bhatia P (2010) Automated conversion of English and Hindi text to Braille
representation. Int J Comput Appl 4(6):25–29
4. Moise A, Bucur G, Popescu C (2017) Automatic system for text to Braille conversion. In: 2017
9th international conference on electronics, computers and artificial intelligence (ECAI). IEEE
5. Zhang X, Ortega-Sanchez C, Murray I (2006) Hardware-based text-to-braille translator.
In: Proceedings of the 8th international ACM SIGACCESS conference on Computers and
accessibility
6. Patel P et al, A brief introduction to handwritten character recognition
7. Towards data science, https://towardsdatascience.com/understanding-images-with-skimage-python-b94d210afd23. Last accessed 2019/10/31
8. GeeksforGeeks, https://www.geeksforgeeks.org/activation-functions-neural-networks/. Last
accessed 2020/10/08
9. Fki Computer Vision and Artificial Intelligence, https://fki.tic.heia-fr.ch/databases/iam-handwriting-database. Last accessed 2019/05/28
Two Element Annular Ring MIMO
Antenna for 2.4/5.8 GHz Frequency Band

Kritika Singh, Shadab Azam Siddique, Sanjay Kumar Soni, Akhilesh Kumar, and Brijesh Mishra

Abstract The work presented in this article embodies a two-port parallel MIMO antenna (80 × 40 × 1.6 mm³) for dual-band operation. The proposed antenna consists of a Sierpinski fractal geometry and an annular ring between the two elements on the ground plane, with a rectangular strip loaded on the radiating patch. The annular ring provides improved isolation of −12.92 dB and −20.22 dB at 2.4 GHz and 5.8 GHz, respectively. The proposed antenna covers 10 dB operating bands of 2.24–2.52 GHz (280 MHz) and 5.32–6.44 GHz (1120 MHz). Peak gains of 2 and 7.51 dB are observed at 2.4 and 5.8 GHz, respectively. The values of the envelope correlation coefficient (ECC) and diversity gain (DG) are also within the accepted range.

Keywords MIMO · Fractal · Annular · ISM · ECC · DG

1 Introduction

In today's fast-paced world, there is a need for a wireless communication system with features such as a higher data rate, high SNR, high reliability, improved network coverage, high quality of transmission, and high quality of service (QoS). These features are necessary for "always-on" accessibility and for forthcoming wireless applications such as video, interactive multimedia, voice, and data [1].
Three basic techniques are reported to improve wireless communication: first, denser deployment; second, larger spectrum; and third, multiple antennas. The first two have their limitations, such as interference, mobility, and millimeter-wave issues [2], so the third is the most suitable of the three. When we compare SISO, MISO, SIMO, and MIMO systems, MIMO antenna systems have more capacity and higher data rates than the other antenna systems [3]. MIMO is generally used in

K. Singh · S. A. Siddique · S. K. Soni · A. Kumar · B. Mishra (B)


Madan Mohan Malaviya University of Technology, Gorakhpur, Uttar Pradesh, India
B. Mishra
Shambhunath Institute of Engineering and Technology, Prayagraj, India


spatial diversity and spatial multiplexing modes. In the spatial diversity mode, the data are distributed over several space-time resources, improving reliability, whereas in spatial multiplexing the data streams are combined to enhance throughput [1]. Owing to its simple installation in movable handheld devices and easy incorporation in smart devices, the microstrip-based MIMO antenna is common across the board in communication systems [4]. For researchers, antenna size is the primary restriction, since the space available in compact devices for incorporating multiple antenna elements is confined. This can be mitigated by creating compact layouts with closely spaced antenna elements [5]; however, the mutual coupling between pairs of antenna elements then increases, and the performance of the MIMO system degrades. In MIMO technology, high isolation among closely positioned antenna elements with a common ground plane is difficult to achieve [5]. The suppression of mutual coupling has therefore attracted broad attention in the antenna community. To improve isolation while keeping the antenna elements close to one another, many approaches have been proposed, such as electromagnetic band gap (EBG) structures, defected ground structures (DGS), metamaterials, resonators, insulators, and slots [6]. Several techniques for suppressing strong mutual coupling between two antenna array elements in a MIMO system have been documented, such as a T-shaped ground stub radiating element [5], meander lines in a slotted ground [6], five discrete mushroom-like loadings in a four-element antenna [7], a metamaterial absorber [8], a meander line resonator [9], F-shaped stubs [10], an ITI-shaped isolation structure [11], a square-ring slot [12], an SRR meta-inspired structure [13], and a tapered line [14].
This paper proposes a closely spaced (2 mm) dual-band 2 × 2 MIMO antenna operating at 2.4 GHz and 5.8 GHz with a shared ground plane. A patch antenna with a novel Sierpinski fractal pattern (equiangular triangle) on the ground plane and a rectangular microstrip feedline is designed. To improve the isolation, an annular ring is used between the two elements. The detailed study of the proposed antenna is discussed in the following sections.

2 Antenna Design and Evolution

The geometry of the 2-port MIMO antenna (antenna-5) is shown in Fig. 1. The overall volume of the proposed design is 80 × 40 × 1.6 mm³. FR4-epoxy is used as the substrate material (dielectric constant = 4.4, loss tangent = 0.02, height = 1.6 mm). A Sierpinski fractal geometry and an annular ring are used on the bottom surface, and a rectangular strip on the top surface. The Sierpinski fractal is a pattern in which an equiangular triangle is divided into four congruent equiangular triangles and the middle triangle is subtracted. Figure 2a–e shows the schematic evolution of the 2-port MIMO antenna designs (antenna-1, antenna-2, antenna-3, antenna-4, and antenna-5); the return loss (dB) of antennas 1–5 is also shown in Fig. 2. Antenna-1 has a triangular slot on the square ground plane and resonates at 2.1 GHz. In antenna-2, a triangular resonator designed to resonate at 5.2 GHz is placed inside the triangular slot on the square ground surface of antenna-1; antenna-2 also resonates only at the single frequency of 5.2 GHz. So, the first iteration of the Sierpinski fractal is carried out to obtain two bands resonating simultaneously at 2.1 GHz and 5.8 GHz. Antenna-4 represents a 2-port MIMO version of antenna-3; the two elements are separated by 2 mm to create a compact and densely packed MIMO configuration, and antenna-4 resonates at the 2.4 GHz and 5.8 GHz frequencies. Finally, in the final antenna geometry, an annular ring is introduced between the two elements to enhance the isolation of the MIMO antenna.

Fig. 1 Antenna-5, a top view, b side view

Fig. 2 Return loss of antenna-1, 2, 3, 4, and 5
398 K. Singh et al.

3 Results and Discussion

The study of the proposed antenna is carried out in terms of return loss, surface current density, radiation pattern, peak gain, ECC, and diversity gain. Figure 3 shows the distribution of surface current at 2.4 GHz and 5.8 GHz, where maximum surface current densities of 22.82 and 18.65 A/m are observed, respectively. Figure 4 shows the reflection coefficient comparison of antenna-4 and antenna-5.
Antenna-4 covers the 2.2–2.4 GHz and 5.2–6.4 GHz bands with S11/S22 < −10 dB, with S21 = −11.53 dB at 2.4 GHz and S12 = −19.95 dB at 5.8 GHz, while antenna-5 covers the 2.24–2.52 GHz and 5.32–6.44 GHz bands with S11/S22 < −10 dB, S21 = −12.92 dB at 2.4 GHz, and S12 = −20.22 dB at 5.8 GHz, as shown in Fig. 5.
The plot of input impedance (Zin) is presented in Fig. 6. For port-1, the real and imaginary values of the input impedance are 48.7727 Ω and −5.5043 Ω at 2.4 GHz, and 67.71 Ω and −1.06 Ω at 5.8 GHz. For port-2, they are 48.72 Ω and −5.577 Ω at 2.4 GHz, and 66.971 Ω and −1.27 Ω at 5.8 GHz.

Fig. 3 Surface current distribution at a 2.45 GHz, b 5.8 GHz

Fig. 4 S—parameter of antenna-4 and antenna-5



Fig. 5 Return loss and realized gain of antenna-5

Fig. 6 Input impedance of the proposed antenna

The 2D and 3D plots of gain are shown in Figs. 5 and 7, respectively. The gain of the proposed antenna varies between 1.3 dB and 7.5 dB: 1.98 dB is observed at the lower frequency and 7.5 dB at the higher frequency. The antenna gains presented in Fig. 7a, b are also in consonance with the results presented in Fig. 3a, b.

Fig. 7 3-D gain pattern for the proposed MIMO antenna: a at 2.4 GHz, b at 5.8 GHz

The radiation patterns of the proposed antenna at 2.4 GHz and 5.8 GHz are presented in Fig. 8a, b, respectively. An omnidirectional pattern is observed at 2.4 GHz, while the pattern slightly changes shape at 5.8 GHz. In Fig. 9, the radiated power in terms of efficiency is shown (90.87% at 2.4 GHz and 87.25% at 5.8 GHz).
Figure 9 also shows the plot of the group delay of the proposed antenna, which is nearly stable at around 0.5 ns; a group delay of 0.5 ns is suitable for efficient transmission, as also reported in [15, 16].
The ECC and diversity gain parameters must be evaluated to assess the efficiency of the MIMO system. The ECC value should ideally be 0, but in practice it should be less than 0.5 for an uncorrelated MIMO antenna [14]. When the MIMO antenna satisfies this ECC limit, the antenna ensures better spatial multiplexing performance and a high data rate. The ECC and DG can be determined from the S-parameters using Eqs. 1 and 2, respectively [7].

Fig. 8 Co and cross polarization level at a 2.4 GHz, b 5.8 GHz

Fig. 9 Radiation efficiency and group delay of the proposed antenna


ECC = |S11* S12 + S21* S22|² / [(1 − |S11|² − |S21|²)(1 − |S22|² − |S12|²)] (1)

DG = 10 √(1 − (ECC)²) (2)

For the proposed antenna, the values of ECC (0.081324 at 2.4 GHz and 0.000813 at 5.8 GHz) and DG (9.999 dB in the resonating bands) are observed in Fig. 10; hence, the antenna offers a high data rate.
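Eqs. (1) and (2) can be evaluated directly from simulated S-parameters; the complex values in the sketch below are illustrative, not the antenna's actual data.

```python
import numpy as np

def ecc(s11, s21, s12, s22):
    """Envelope correlation coefficient from S-parameters, Eq. (1)."""
    num = abs(np.conj(s11) * s12 + np.conj(s21) * s22) ** 2
    den = ((1 - abs(s11) ** 2 - abs(s21) ** 2) *
           (1 - abs(s22) ** 2 - abs(s12) ** 2))
    return num / den

def dg(e):
    """Diversity gain in dB from the ECC, Eq. (2)."""
    return 10 * np.sqrt(1 - e ** 2)

# Illustrative complex S-parameters at a resonant frequency
e = ecc(0.1 + 0.05j, 0.2 - 0.1j, 0.2 - 0.1j, 0.1 + 0.05j)
print(round(e, 5), round(dg(e), 3))   # low ECC -> DG close to 10 dB
```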
A comparative study between the present work and reported works is presented in Table 1. The proposed work is better in terms of gain than all the previously reported designs [11, 17–19] (2.5–5.6 dB [11], 5.3 dB [17], 5.43 dB [18], and 2–4 dB [19], against 2–7.51 dB for the proposed work) and also in terms of area than [18, 19] (4032 mm² [18] and 7654 mm² [19], against 3200 mm² for the proposed work). Due to their larger area, [18, 19] achieve better isolation than the proposed work.

4 Conclusion

The work presented in this article operates in the dual bands 2.24–2.52 GHz and 5.32–6.44 GHz, with bandwidths of 280 and 1120 MHz, respectively. By introducing an annular ring between the two antenna elements, improved isolation is achieved.

Fig. 10 ECC and diversity gain of the proposed antenna

Table 1 Performance comparison of the proposed antenna with previously published works

References | No. of elements | F_H (GHz) | F_L (GHz) | BW (MHz) | Size (mm²) | Isolation (dB) | Gain (dB) | ECC
[11] | 2 | 2.9, 7.55 | 2.24, 3.9 | 660, 3650 | 72 × 56 | 24 | 2.5–5.6 | <0.04
[17] | 4 | 6.02 | 5.49 | 534 | 27.69 × 97 | 33 | 5.3 | 5 × 10⁻⁴
[18] | 4 | 6.7 | 5.3 | 1400 | 38 × 38 | 13 | 5.43 | 4 × 10⁻⁴
[19] | 4 | 2.88, 6.6 | 2.16, 4.85 | 720, 1750 | 89 × 86 | >19 at 2.4 GHz, >21 at 5.8 GHz | 2–4 | <0.05
Proposed | 2 | 2.52, 6.44 | 2.24, 5.32 | 280, 1120 | 40 × 80 | 12.92 at 2.4 GHz, 20.22 at 5.8 GHz | 2 dB at 2.4 GHz, 7.51 dB at 5.8 GHz | 0.08 at 2.4 GHz, 0.0008 at 5.8 GHz

F_H: higher frequency, F_L: lower frequency, BW: bandwidth, ECC: envelope correlation coefficient

The peak gain of 7.51 dB is observed at 5.8 GHz. The improved ECC (<0.05) and DG (>9 dB) enhance the performance of the proposed antenna.

References

1. Palaskas Y, Ravi A, Pellerano S (2008) MIMO techniques for high data rate radio communications. In: 2008 IEEE custom integrated circuits conference, Sept 2008. IEEE, pp 141–148. https://doi.org/10.1109/CICC.2008.4672042
2. Marchetti N (2017) Towards the 5th generation of wireless communication systems. arXiv
3. Sarangi AK, Datta A (2018) Capacity comparison of SISO, SIMO, MISO and MIMO systems. In: Proceedings of 2nd international conference on computing methodologies and communication, ICCMC 2018, pp 798–801. https://doi.org/10.1109/ICCMC.2018.8488147
4. Dwivedi AK, Sharma A, Singh AK, Singh V (2020) Quad-port ring dielectric resonator based MIMO radiator with polarization and space diversity. Microw Opt Technol Lett 62(6):2316–2327. https://doi.org/10.1002/mop.32329
5. Saurabh AK, Meshram MK (2020) Compact sub-6 GHz 5G-multiple-input-multiple-output antenna system with enhanced isolation. Int J RF Microw Comput Eng 30(8):1–11. https://doi.org/10.1002/mmce.22246
6. Li Q et al (2018) Mutual coupling reduction between patch antennas using meander line. Int J Antennas Propag 2018. https://doi.org/10.1155/2018/2586382
7. Zhai G, Chen ZN, Qing X (2016) Mutual coupling reduction of a closely spaced four-element MIMO antenna system using discrete mushrooms. IEEE Trans Antennas Propag 64(10):3060–3067
8. Agarwal M, Meshram MK (2016) Isolation improvement of 5 GHz WLAN antenna array using metamaterial absorber. In: 2016 URSI Asia-Pacific radio science conference, URSI AP-RASC 2016, pp 1050–1053. https://doi.org/10.1109/URSIAP-RASC.2016.7601144
9. Ghosh J, Ghosal S, Mitra D, Chaudhuri SRB (2016) Mutual coupling reduction between closely placed microstrip patch antenna using meander line resonator. Prog Electromagn Res Lett 59:115–122. https://doi.org/10.2528/PIERL16012202
10. Iqbal A, Saraereh OA, Ahmad AW, Bashir S (2017) Mutual coupling reduction using F-shaped stubs in UWB-MIMO antenna. IEEE Access 6:2755–2759. https://doi.org/10.1109/ACCESS.2017.2785232
11. Kumar A, Ansari AQ, Kanaujia BK, Kishor J (2019) A novel ITI-shaped isolation structure placed between two-port CPW-fed dual-band MIMO antenna for high isolation. AEU Int J Electron Commun 104:35–43. https://doi.org/10.1016/j.aeue.2019.03.009
12. Parchin NO et al (2019) Eight-element dual-polarized MIMO slot antenna system for 5G smartphone applications. IEEE Access 7:15612–15622. https://doi.org/10.1109/ACCESS.2019.2893112
13. Roy S, Chakraborty U (2020) Mutual coupling reduction in a multi-band MIMO antenna using meta-inspired decoupling network. Wirel Pers Commun 114(4):3231–3246. https://doi.org/10.1007/s11277-020-07526-5
14. Saxena G, Jain P, Awasthi YK (2020) High diversity gain MIMO-antenna for UWB application with WLAN notch band characteristic including human interface devices. Wirel Pers Commun 112(1):105–121. https://doi.org/10.1007/s11277-019-07018-1
15. Mishra B, Singh V, Singh RK, Singh N, Singh R (2018) A compact UWB patch antenna with defected ground for Ku/K band applications. Microw Opt Technol Lett 60(1):1–6. https://doi.org/10.1002/mop.30911
16. Mishra B (2019) An ultra compact triple band antenna for X/Ku/K band applications. Microw Opt Technol Lett 61(7):1857–1862. https://doi.org/10.1002/mop.31812
17. Malviya L, Panigrahi RK, Kartikeyan MV (2016) Circularly polarized 2 × 2 MIMO antenna for WLAN applications. Prog Electromagn Res C 66:97–107. https://doi.org/10.2528/PIERC16051905
18. Chouhan S, Panda DK, Gupta M, Singhal S (2018) Meander line MIMO antenna for 5.8 GHz WLAN application. Int J RF Microw Comput Eng 28(4). https://doi.org/10.1002/mmce.21222
WLAN Application. Int J RF Microw Comput Eng 28(4). https://doi.org/10.1002/mmce.21222
19. Birwal A, Singh S, Kanaujia BK, Kumar S (2020) Low-profile 2.4/5.8 GHz MIMO/diversity
antenna for WLAN applications. J Electromagn Waves Appl 34(9):1283–1299. https://doi.org/
10.1080/09205071.2020.1757516
A Secure Hybrid and Robust Zone Based
Routing Protocol for Mobile Ad Hoc
Networks

Vishal Gupta, Hemant Sethi, and Surinder Pal Singh

Abstract An ad hoc network consists of a group of mobile nodes that function


together without any established infrastructure or centralized administration. The
nodes are free to move around and place themselves in any random position which
in turn will change the topology of the network. Security during transmission of
packets is a vital factor in wireless networks. Maintaining vital parameters of secu-
rity like confidentiality, authentication, integrity, etc. during transmission across
ad hoc network is a challenging task. Internal malicious nodes are the source of
these security risks, as they drop data packets on purpose and disrupt the routing
protocol’s proper operation. SHRZRP, a Secure Hybrid and Robust Zone Based
Routing Protocol based on the ZRP protocol, is proposed in this paper. Then the
impact of the proposed enhancement is evaluated. The simulation results are then
compared to the performance of the defenceless ZRP which shows that SHRZRP
has better throughput and packet delivery ratio as compared to the conventional ZRP
in the malicious environment.

Keywords Ad hoc network · ZRP · SIARP · SHRZRP

1 Introduction

An ad hoc network is a network with no fixed infrastructure, such as a wireless


network. They are recognized by a dynamic topology that is devoid of static routers.
The routing problem is one of the most enticing and difficult aspects of mobile ad hoc

V. Gupta · S. P. Singh
Department of Computer Science and Engineering, MMEC, MM(DU), Mullana, Ambala, India
e-mail: vishal.gupta@mmumullana.org
S. P. Singh
e-mail: s.p.singh@mmumullana.org
H. Sethi (B)
Department of Computer Science and Engineering, MMU, Sadopur, Ambala, India
e-mail: hemant.sethi@mmumullana.org


networks. Wireless transmission is used to communicate in a mobile ad hoc network


(MANET). Any communication in such networks is without any centralized base
station, as in cellular networks or a wireless LAN access point. Routing is one of the
arduous tasks in MANETs due to the node’s self-deployment feature, node-to-node
communication and limited resources like low power of the node. In comparison
to wired networks, mobile ad hoc networks are more vulnerable to information and
physical security threats due to their known and distinct characteristics. To have
reliable data transmission in the wireless network and to save resources, several
security services are required in providing secure networking [1–3].
1. Authentication: This service verifies a node’s or a user’s identity to avoid
impersonation.
2. Confidentiality: To protect the information sent from unauthorized access, i.e. only the intended users or nodes can access the information.
3. Integrity: Ensure that the data has not been tampered with during transmission.
4. Availability: It ensures that all the above-mentioned network security services
are available to all the intended parties when required.
5. Non-repudiation: Ensure that parties may demonstrate the transmission or
receipt of information by another party, i.e. a party cannot falsely deny receiving
or sending such data.
A hybrid approach can provide a solution for secure routing over the ad hoc
network that can perform efficiently across a wide range of goals. This paper presents the design of the proposed protocol “Secure Hybrid and Robust Zone Based Routing Protocol (SHRZRP)”, which is well suited to various MANET applications, and analyzes its performance.

2 Secure Hybrid and Robust Zone Based Routing Protocol


(SHRZRP)

The Secure Hybrid and Robust Zone Based Routing Protocol (SHRZRP) implements
the hybrid approach followed in Zone Routing Protocol (ZRP) [4–9]. To perform
secure routing, it adds its own security mechanisms. In our protocol, we check the performance when malicious nodes are present in the network. The key reasons for selecting ZRP as our base protocol are as follows:
1. ZRP is built around the concept of routing zones, so security mechanisms can be applied inside a limited area rather than across a larger one.
2. Zones differentiate the communicating nodes as interior (nodes within the zone)
and exterior (nodes outside the zone) nodes and hence information such as
neighbourhood information and its topology, etc. can be hidden from the exterior
nodes.
3. In the event of a failure, routing may be zone constrained, similar to ZRP, i.e.
the proposed protocol performs intra-zone and inter-zone routing [10, 11]. The

proactive approach is followed within the zone and the reactive approach outside it.
SHRZRP is intended to handle all key security considerations, such as authentication, data integrity and information confidentiality, throughout routing, whether intra-zone or inter-zone. For end-to-end authentication and message/packet integrity, a digital
signature mechanism [12] is utilized, whereas confidentiality is ensured by encryp-
tion techniques, i.e. symmetric and asymmetric key encryption. Packets are digitally
signed and/or encrypted depending on whether the packet is a control packet or a
data packet. SHRZRP is a two phase protocol:
• In the first phase, a preliminary certification process takes place, in which the CA issues keys to each communicating node (CN).
• The second phase involves secure routing using keys between intra-zone or inter-
zone and security during transmission is enriched by applying the digital signature
and message encryption cryptographic techniques.
Architecture of SHRZRP protocol is given in Fig. 1. It consists of various
components discussed below.
1. Key Generation: In this step, new key pairs are generated for encryption and for implementing digital signatures in our architecture. Two key pairs, plus a session key, are generated:
Signing key (SigX) and verifying key (VerX)
Encrypting key (EncX) and decrypting key (DecX)
Session key (SKD)
where SigX and DecX are kept private, and VerX and EncX are made public.
2. Certification Process: The presence of trustworthy certification servers known
as certification authorities (CAs) in the network is required by the SHRZRP.

CA → X : certX = [IPX, VerX, EncX]|signCA

3. Key Management: Between the source and discovery nodes, shared keys are
used [13]. Securing communication in ad hoc networks is found to be difficult
due to computing constraints or lack of infrastructure support. Security during
communication is usually based on symmetric-key cryptography or public-key
cryptography systems.
4. Secure Intra-Zone Routing Protocol (SIARP)
Step 1: Route discovery using routing table.

Fig. 1 Architecture of SHRZRP protocol

Step 2: Source S sends a SKREQ packet to D

S → Broadcast : [SKREQ, IPD, NS]|signS, certS

where SKREQ is the session key request packet identifier.


Step 3: Let P be a neighbour that has received the SKREQ packet from S; P subsequently rebroadcasts it further.
P → Broadcast : [SKREQ, IPD, NS]|signS |signP, certS, certP

Step 4: On receiving the request packet, destination D verifies the signature using
VerS, creates the session key, encrypts it using EncS and sends it to S as SKREP
packet along the reverse route D-P-S.
 
D → S : SKREP, IPS, NS, {Sessionkey}EncS |SigD, certD

where SKREP is the session key reply packet identifier.


Step 5: Source S, on receiving the SKREP packet back from D, verifies the packet using VerD, confirms its authenticity, decrypts it using DecS and extracts the session key (a sketch of this exchange is given after the SIERP steps below).
5. Secure Inter-Zone Routing Protocol (SIERP)
Step 1: SIERP at S bordercasts to its peripheral nodes to initiate the secure route
discovery process to D with the help of BRP.

S → bordercast : [RREQID, IPD, NS, β]|signS, certS

Step 2: Peripheral nodes bordercast the request packet further.


 
T → bordercast : [RREQID, IPD, β]|signS |signT, certT, certS

Step 3: Finally, the Request packet from source S arrives at destination D.


Step 4: In response, the destination transmits the reply packet with encrypted session
key along the same reverse path.
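As a rough illustration of the signed session-key exchange described in the steps above, the following Python sketch uses the `cryptography` package, with RSA key pairs standing in for (SigX, VerX) and (EncX, DecX); the packet formats, addresses and key sizes are illustrative assumptions, not the paper's exact certificate format.

```python
# Sketch of SIARP steps 2-5: signed SKREQ, signed+encrypted SKREP, key recovery.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

sig_s = rsa.generate_private_key(public_exponent=65537, key_size=2048)  # SigS (private)
ver_s = sig_s.public_key()                                              # VerS (public)
dec_s = rsa.generate_private_key(public_exponent=65537, key_size=2048)  # DecS (private)
enc_s = dec_s.public_key()                                              # EncS (public)
sig_d = rsa.generate_private_key(public_exponent=65537, key_size=2048)  # SigD (private)
ver_d = sig_d.public_key()                                              # VerD (public)

pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH)
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Step 2: S broadcasts a signed session-key request towards D
skreq = b"SKREQ|IPD=10.0.0.7|NS=42"                  # hypothetical packet fields
sig_by_s = sig_s.sign(skreq, pss, hashes.SHA256())

# Step 4: D verifies S's signature, creates a session key, encrypts it with EncS
ver_s.verify(sig_by_s, skreq, pss, hashes.SHA256())  # raises InvalidSignature if tampered
session_key = os.urandom(16)
skrep = b"SKREP|IPS=10.0.0.1|NS=42|" + enc_s.encrypt(session_key, oaep)
sig_by_d = sig_d.sign(skrep, pss, hashes.SHA256())

# Step 5: S verifies D's signature, then decrypts the session key with DecS
ver_d.verify(sig_by_d, skrep, pss, hashes.SHA256())
assert dec_s.decrypt(skrep.split(b"|", 3)[3], oaep) == session_key
```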

3 Performance Metrics

The simulation parameters used for the experiments are given in Table 1. To determine the performance of the proposed system, we use the following metrics as evaluation parameters (a small sketch of their computation follows the list).
• Total Data packets Send/Received.
• Throughput.
• Percentage of Packets Dropped that passed through Malicious Nodes (PDP).
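A small sketch of how these metrics can be computed from simulator packet counters, under the simplifying assumption that all drops occur at malicious nodes; the counter names and values are illustrative.

```python
# Illustrative metric computation from per-run packet counters.
sent, received, sim_time = 1000, 950, 100.0        # packets, packets, seconds

pdr = 100.0 * received / sent                      # packet delivery percentage
throughput = received / sim_time                   # received packets per second
pdp = 100.0 * (sent - received) / sent             # drop percentage (assumed all
                                                   # drops pass via malicious nodes)
print(pdr, throughput, pdp)
```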

Table 1 Simulation parameters
Number of nodes 40, 50, 60, 70
Number of malicious nodes 5–25%
Dimension of space 120 m * 120 m
Range of node 30 m
Zone radius 2–4
Source destination pair 1
Data rate 10 packets/s
Simulation time 100 s

4 Results and Discussions

A performance comparison of ZRP (existing) and SHRZRP (proposed) is made based on two parameters: throughput and packet delivery percentage (PDR).
The average PDR and average throughput of ZRP and SHRZRP are shown in Tables 2 and 3, respectively, for varying numbers of malicious nodes at different zone radii. Tables 4 and 5 show the average PDR and average throughput of ZRP and SHRZRP, respectively, for varying numbers of nodes at different radii.

Table 2 Comparison results of average PDR for ZRP and SHRZRP with varying numbers of malicious nodes (columns: number of malicious nodes)

| Radius | 3 | 5 | 8 | 10 | 13 |
|---|---|---|---|---|---|
| ZRP with radius = 2 | 65 | 65 | 80 | 85 | 70 |
| SHRZRP with radius = 2 | 95 | 85 | 95 | 95 | 100 |
| ZRP with radius = 3 | 80 | 70 | 75 | 70 | 70 |
| SHRZRP with radius = 3 | 95 | 90 | 90 | 95 | 95 |
| ZRP with radius = 4 | 65 | 60 | 80 | 85 | 70 |
| SHRZRP with radius = 4 | 95 | 90 | 95 | 95 | 100 |

Table 3 Comparison results of average throughput for ZRP and SHRZRP with varying numbers of malicious nodes (columns: number of malicious nodes)

| Radius | 3 | 5 | 8 | 10 | 13 |
|---|---|---|---|---|---|
| ZRP with radius = 2 | 6.5 | 6.5 | 8.0 | 8.5 | 7.0 |
| SHRZRP with radius = 2 | 9.5 | 8.5 | 9.5 | 9.5 | 10.0 |
| ZRP with radius = 3 | 8.0 | 7.0 | 7.5 | 7.0 | 7.0 |
| SHRZRP with radius = 3 | 9.5 | 9.0 | 9.0 | 9.5 | 9.5 |
| ZRP with radius = 4 | 6.5 | 6.0 | 8.0 | 8.5 | 7.0 |
| SHRZRP with radius = 4 | 9.5 | 9.0 | 9.5 | 9.5 | 10.0 |

Table 4 Comparison results of average PDR for ZRP and SHRZRP with varying numbers of nodes (columns: number of nodes)

| Radius | 40 | 50 | 60 | 70 |
|---|---|---|---|---|
| ZRP with radius = 2 | 76 | 66 | 74 | 58 |
| SHRZRP with radius = 2 | 96 | 96 | 94 | 98 |
| ZRP with radius = 3 | 76 | 66 | 74 | 68 |
| SHRZRP with radius = 3 | 92 | 94 | 98 | 94 |
| ZRP with radius = 4 | 66 | 72 | 12 | 12 |
| SHRZRP with radius = 4 | 92 | 98 | 96 | 98 |

Table 5 Comparison results of average throughput for ZRP and SHRZRP with varying numbers of nodes (columns: number of nodes)

| Radius | 40 | 50 | 60 | 70 |
|---|---|---|---|---|
| ZRP with radius = 2 | 7.6 | 6.6 | 7.4 | 5.8 |
| SHRZRP with radius = 2 | 9.6 | 9.6 | 9.4 | 9.8 |
| ZRP with radius = 3 | 7.6 | 6.6 | 7.4 | 6.8 |
| SHRZRP with radius = 3 | 9.2 | 9.4 | 9.8 | 9.4 |
| ZRP with radius = 4 | 6.6 | 7.2 | 1.2 | 1.2 |
| SHRZRP with radius = 4 | 9.2 | 9.8 | 9.6 | 9.8 |

Figures 2, 3, 4 and 5 show that SHRZRP has better throughput and packet delivery percentage than the conventional ZRP in the malicious environment.


Fig. 2 Comparison of average PDR for ZRP and SHRZRP with varying number of malicious nodes


Fig. 3 Comparison of average throughput for ZRP and SHRZRP with varying number of malicious
nodes


Fig. 4 Comparison of average PDR for ZRP and SHRZRP with varying number of nodes


Fig. 5 Comparison of average throughput of ZRP and SHRZRP with varying number of nodes

5 Conclusion and Future Scope

The simulation results for Secure Hybrid and Robust Zone Routing Protocol
(SHRZRP) show that the proposed protocol is efficient in discovering and main-
taining routes when implemented with different mobility patterns and security param-
eters. Simulation results show that SHRZRP has better Throughput and Packet
Delivery Percentage as compared to the conventional ZRP in the malicious environ-
ment. SHRZRP fulfils the desired security goals. In future, more levels of security
can be embedded in the Secure Hybrid and Robust Zone Routing Protocol (SHRZRP)
or any other routing protocol by integrating a location based security mechanism.
With the support of GPS (Global Positioning System) service, the destination node
or receiving node can keep track of the source node's location. A frequent and sudden change in the direction of the source may be flagged as doubtful, since mobile nodes move at a reasonable speed and sudden changes in direction are not very feasible.

References

1. Zhou L, Haas ZJ (1999) Securing ad hoc networks. IEEE Netw Mag 13(6)
2. Bhosle A, Pandey Y (2013) Int J Adv Res Comput Eng Technol (IJARCET) 2(3). ISSN:
2278-1323
3. Yasinsac A et al (2002) A family of protocols for group key generation in ad hoc networks. In:
International conference on communications and computer networks (CCN02)
4. Nekkanti RK, Lee C (2004) Trust based adaptive on demand ad hoc routing protocol. ACM
5. Shiva Murthy G, D’Souza RJ, Varaprasad G (2012) Digital signature-based secure node disjoint
multipath routing protocol for wireless sensor networks. IEEE Sens J 12(10)
6. Haas ZJ, Pearlman MR, Samar P (2001) Intrazone routing protocol (IARP). IETF Internet
Draft, draft-ietf-manet-iarp-01.txt

7. Haas ZJ, Pearlman MR, Samar P (2001) Interzone routing protocol (IERP). IETF Internet
Draft, draft-ietf-manet-ierp-01.txt
8. Haas ZJ, Pearlman MR, Samar P (2001) The bordercast resolution protocol (BRP) for ad hoc
networks. IETF Internet Draft, draft-ietf-manet-brp-01.txt
9. Kottarwar K, Jichkar N (2017) Analysis of SEZRP using secure code technique. In 2017
International conference on innovations in information, embedded and communication systems
(ICIIECS). IEEE, pp 1–4
10. Hu C, Chim TW, Yiu SM, Hui LCK, Li VOK (2012) Efficient HMAC-based secure
communication for VANETs. Comput Netw 56
11. Haas ZJ, Pearlman MR, Samar P (2003) The zone routing protocol (ZRP) for adhoc networks.
Internet Draft, available at: http://tools.ietf.org/id/draft-ietf-MANETs-zone-zrp-04.txt
12. Perkins CE, Bhagwat P (1994) Highly dynamic destination-sequenced distance-vector routing
(DSDV) for mobile computers. Comp Commun Rev 234–244
13. Anitha V, Akilandeswari J (2010) Secured message transmission in mobile ad hoc networks
through identification and removal of Byzantine failures. InterJRI Comput Sci Netw 2(1)
Multimedia Content Mining Based
on Web Categorization (MCMWC)
Using AlexNet and Ensemble Net

Bhavana and Neeraj Raheja

Abstract Nowadays, a large amount of web content available over the web tends to
grow exponentially. From this big data environment, extracting meaningful informa-
tion according to user adaptability and making better decisions is a big challenge. For
this Web Page, categorization approaches help to classify web pages into a number
of predefined categories. Normal web categorization approaches work for either text
or images. But to extract information from multimedia sort of webpages is again a
big challenge. Hence in this paper, a method of categorization has been proposed
which is applicable over the webpages consisting of multimedia type of data (text and
images both). Experimental results based on real-world disaster dataset show that
the proposed MCMWC model provides better performance (3% more accuracy in
comparison to Text Modality and 2% more accuracy in comparison Image modality)
in comparison to models trained using a single modality (either text or image).

Keywords Pre-processing · Feature extraction · LBP · HOG · ALexNet · SVM ·


KNN · Ensemble · Feature-level fusion

1 Introduction

The continuously growing web content available over the internet challenges web search, web recommendation systems, and information management and retrieval systems. Web page categorization improves web search efficiency and helps in information management, information retrieval and text filtering. Web contents
consist of text, images and videos. Web content mining extracts important knowledge
or information from the contents according to the user’s needs [1]. It presents the
knowledge in a more structured form from the web. Categorization is used for classi-
fying the web content into predefined classes called category labels. It is a process to

Bhavana (B) · N. Raheja


CSE Department, Maharishi Markandeshwar (Deemed To Be University), Mullana, India
N. Raheja
e-mail: neeraj_raheja2003@mmumullana.org


label a document with one or more categories (classes). Contents are categorized according to their subject or attributes, such as document type, author and publishing year. Web content categorization consists of several steps, such
as document pre-processing, extraction of features and categorization of contents.
A Pre-processing: The first step of categorization is pre-processing of the data. Most documents contain irrelevant words like stop words, misspelt words, HTML tags and many more. These irrelevant words are called noise, and they have a poor effect on the system's performance. The following techniques are used for document pre-processing [2] (a minimal sketch follows the list):
(a) Tokenization: To break the text into some useful tokens such as symbols,
words and phrases is known as Tokenization.
(b) Stop Words: Documents contain some words that are not essential; these words are called stop words, so removing them from the documents is essential.
(c) Capitalization: Capitalization of words produces inconsistency in the document, so each capitalized word is changed to lower case.
(d) Abbreviation: Abbreviations stand for the short form of words. For example, BOW stands for bag of words.
(e) Noise Removal: Punctuation and some special characters are insignificant in the documents. These special characters and symbols are called noise, and removing them during pre-processing is necessary.
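The following is a minimal sketch of the five pre-processing steps above in plain Python; the stop-word and abbreviation lists are tiny illustrative stand-ins for real resources.

```python
# Minimal pre-processing pipeline: lowercase, de-noise, tokenize, expand, filter.
import re

STOP_WORDS = {"the", "is", "a", "of", "and", "for"}     # illustrative subset
ABBREVIATIONS = {"bow": "bag of words"}                  # illustrative subset

def preprocess(text: str) -> list:
    text = text.lower()                                  # capitalization
    text = re.sub(r"[^a-z0-9\s]", " ", text)             # noise removal (punctuation)
    tokens = text.split()                                # tokenization
    tokens = [ABBREVIATIONS.get(t, t) for t in tokens]   # abbreviation expansion
    return [t for t in tokens if t not in STOP_WORDS]    # stop-word removal

print(preprocess("BOW is a model, e.g. for the classification of text!"))
```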
B Feature Extraction: Feature extraction is a dimensionality reduction process that obtains the relevant information from the raw data and describes it in a lower-dimensional space; transforming input data into a set of relevant features is called feature extraction. In this paper, bag of words (BOW) is used for extracting text features, and LBP, HOG and AlexNet are used for extracting image features (a combined sketch of these extractors follows the list below).
• Bag-of-Words: The BOW technique is used in multiple domains like Natural
Language Processing, document classification and in information retrieval.
BOW extracts the words or features from a document and represents them by their frequency of occurrence. For classification of text, BOW finds the similarity of vectors in vector space, thereby reducing the textual data to word vectors [3, 4].
• Local Binary Pattern: It is a texture-based feature extraction method that
extracts features from histograms, and L2 normalization is used to normalize each LBP cell histogram [5, 6]. To capture the local detail of an image, LBP encodes a binary pattern between the central pixel and its neighbouring pixels, computed over 3 × 3 image patches [7, 8]. A grid-based LBP histogram method is used, in which LBP extracts features from different orientations of an image, and these features are merged into an enhanced feature vector representing the image [9]. LBP extracts features using size and position, which are merged into the vector value [10].
• Histogram of Oriented Gradients: Extracting features from all locations of the image is called dense feature extraction. To improve performance, local contrast is used to normalize the local histograms [11]. The HOG descriptor effectively represents local features of the image. To find the histogram of radians, the gradient image (horizontal or vertical) is used [12]. From every pixel in the image, the magnitude and direction of the gradient are extracted [13] and then used to create an angular histogram of gradients, which provides a feature vector of the image texture.
• AlexNet: The network has learned rich feature representations of images.
In AlexNet, dropout [14], ReLU and pre-processing are the key steps for achieving excellent performance [15–17]. AlexNet is trained on millions of images from the database [18]. It extracts the features in less time [19].
It takes an image as an input and outputs the object label in the image. It
captures finer features that are useful for categorization.
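As a combined sketch of the BOW, LBP and HOG extractors described above, the following assumes scikit-learn and scikit-image as stand-in implementations; the parameter choices are illustrative, not necessarily those used in the paper.

```python
# Illustrative stand-ins for the three hand-crafted extractors.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from skimage import data
from skimage.color import rgb2gray
from skimage.feature import local_binary_pattern, hog

# Bag of words on a toy two-document corpus
docs = ["flood damage in the city", "wildfire damage near the city"]
text_features = CountVectorizer().fit_transform(docs).toarray()  # term-frequency vectors

# LBP histogram of one grayscale image (8 neighbours, radius 1, 'uniform' -> 10 bins)
img = (rgb2gray(data.astronaut()) * 255).astype(np.uint8)
lbp = local_binary_pattern(img, P=8, R=1, method="uniform")
lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)

# HOG descriptor of the same image
hog_vec = hog(img, orientations=9, pixels_per_cell=(16, 16),
              cells_per_block=(2, 2), feature_vector=True)

print(text_features.shape, lbp_hist.shape, hog_vec.shape)
```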
C Classifiers: Classifiers are used to assign class labels to particular data points. A classifier categorizes documents into labelled classes or categories of information. Four classifiers are used in this paper.
• Support Vector Machine (SVM): SVM works on Statistical Learning theory
and is based on supervised learning. In the linearly separable case, SVM finds the hyperplane such that the margin is optimal. In the non-linearly separable case, SVM converts the input space into a kernel-method-based feature space [20, 21].
• K nearest neighbour (KNN) Algorithm: It is a supervised learning method
and uses a feature-vector technique to represent a document. Every document is represented by a feature vector (d_1, …, d_k), where d_i represents the ith feature and k the total number of features in the document; each feature represents a specific word. Different weighting schemes have been introduced to determine the weights, such as term frequency, TF-IDF and binary weighting [22].
• Decision Tree: It is used in domains such as classification and regres-
sion. These models predict class labels or values for the decision-making process [23]. A tree representation is used, in which each leaf node represents a class label and each internal node represents an attribute [24, 25]; class labels take discrete values, and the leaf nodes hold the responses.
• Ensemble: It is a supervised learning method. Ensemble classifiers’ effec-
tiveness is improved over individual classifiers for representational, computational and statistical reasons. Ensemble classifiers also solve the problem of class imbalance [26, 27]. There are different types of ensemble methods, such as boosting, bagging and ECOC ensembles.

2 Related Work

Yu et al. [28] proposed a multimodal approach based on a faculty homepage dataset, presenting a framework for multimodal learning under the problem of imbalanced data. Rungta et al. [29] proposed an approach for classifying applications by utilising information from the APK files of apps; a two-phase multimodal deep learning approach is used to classify applications into categories. Suryawanshi et al. [30] proposed an approach for offensive content classification in memes; the MultiOFF dataset was used to train and evaluate a multimodal classification system for detecting offensive memes. Gülbaş et al. [31] introduced deep feature extraction and an ELM classifier for fashion classification, using three different pre-trained CNN models for deep feature extraction; the ELM machine learning technique was chosen for its robustness and simple structure. Purohit and Ajmera [32] introduced a technique to perform optimal feature-level fusion, in which an optimization technique is used to extract the relevant features and the OGWO technique is proposed for optimal feature selection.

3 System Architecture (MCMWC)

The proposed approach (MCMWC) is applied to multimedia data that consists of both text and images, whereas earlier categorization approaches were applied to either text or images only. For multimodal web content, BOW is used for text feature extraction, while LBP, HOG and AlexNet are used for image feature extraction. For classification, the KNN, SVM, Decision Tree and Ensemble classifiers are used; the combination of AlexNet and the Ensemble net gives better performance than the other combinations. Here, BOW is computed over 2312 words and 743 documents (Fig. 1).
AlexNet requires input images of size 227 × 227 × 3 with zero-centre normalization: the width and height of the input image are both 227, and 3 specifies the colour channels (RGB). To extract features from the training and test images, AlexNet activations are taken on the fc7 layer, a fully connected layer with 4096 units. The ensemble classifier is specified by the ‘Method’ name-value pair argument of fitcensemble; here the value of Method is AdaBoostM1 (Adaptive Boosting for Binary Classification), a boosting method for binary classification that trains learners sequentially and returns a trained ensemble model object consisting of NLearn learners. Another technique used is feature-level fusion, described next.
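A hedged sketch of the fc7-feature-plus-AdaBoost pipeline, using PyTorch/torchvision (0.13+) and scikit-learn as open-source stand-ins for the MATLAB functions named above; note that torchvision's pretrained AlexNet expects 224 × 224 inputs rather than MATLAB's 227 × 227, and the batch and labels below are toy placeholders.

```python
# fc7 feature extraction with a pretrained AlexNet, then an AdaBoost ensemble.
import torch
import torch.nn as nn
from torchvision import models
from sklearn.ensemble import AdaBoostClassifier

alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()
# classifier = [Dropout, fc6, ReLU, Dropout, fc7, ReLU, fc8]; keep up to fc7 (4096-d)
fc7 = nn.Sequential(alexnet.features, alexnet.avgpool, nn.Flatten(),
                    *list(alexnet.classifier.children())[:5])

with torch.no_grad():
    batch = torch.randn(8, 3, 224, 224)        # placeholder for preprocessed images
    feats = fc7(batch).numpy()                 # shape (8, 4096)

labels = [0, 1, 0, 1, 0, 1, 0, 1]              # toy offensive/non-offensive labels
clf = AdaBoostClassifier(n_estimators=50).fit(feats, labels)
print(clf.predict(feats[:2]))
```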
Fig. 1 BOW output

Feature Level Fusion: In this technique, features from different sources are combined into a single feature set. The main advantage of using feature-level fusion is the detection of relevant feature values generated from the different sources [32]. Figure 2 shows the flow chart of the feature-level fusion approach. In this paper, the multimodal approach combines features from the text-level feature matrix and the image-level feature matrix: FV1 is a feature vector of the text content, of size 1 × m, and FV2 is a feature vector of the image content, of size 1 × n. Each vector is normalized (min–max normalization to the 0–1 range), and the features of both vectors are then combined into a fused vector of size 1 × (m + n).

Fig. 2 Flow chart of the feature level fusion method (text → FV1 (1 × m); images → FV2 (1 × n); normalization; feature level fusion → 1 × (m + n))
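A minimal NumPy sketch of the fusion step in Fig. 2, where fv1 and fv2 stand for the text (1 × m) and image (1 × n) feature vectors; the values are illustrative.

```python
# Min-max normalization of each modality, then concatenation into one feature set.
import numpy as np

def min_max_normalize(v):
    """Scale a feature vector to the [0, 1] range (constant vectors map to zeros)."""
    rng = v.max() - v.min()
    return (v - v.min()) / rng if rng > 0 else np.zeros_like(v)

fv1 = np.array([3.0, 7.0, 1.0])           # text features, 1 x m
fv2 = np.array([0.2, 0.9, 0.5, 0.4])      # image features, 1 x n
fused = np.concatenate([min_max_normalize(fv1), min_max_normalize(fv2)])
print(fused.shape)                         # (m + n,), i.e. 1 x (m + n) after fusion
```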
Figure 3 shows the flow chart of the MCMWC model. In the MCMWC model, first, multimedia data (text and images) are read from the database. For text feature extraction, the BOW technique is used; for image feature extraction, LBP, HOG and AlexNet are used. For classification, the KNN, SVM, Decision Tree and Ensemble classifiers are used. Four performance parameters are used to measure the categorization performance.

Fig. 3 MCMWC model (flow: text → bag of words (BOW); image → LBP, HOG, AlexNet; feature level fusion; classifiers: KNN, SVM, Decision Tree, Ensemble; performance metrics: accuracy, recall, precision, F-measure)



4 Results and Discussions

This section covers the dataset used, the experimental setup and results, the evaluation metrics and the discussion.
A Dataset: The dataset used for calculating the results is the Multimodal Damage Identification for Humanitarian Computing Data Set [33], a multimodal dataset. It comprises 5879 captioned images (both text and images) related to natural disaster damage and wars, collected from social media posts on Instagram and Twitter. The dataset content belongs to 6 classes: infrastructural, human, fires, floods, non-damage and natural landscape. The dataset is further categorized into two major categories, offensive and non-offensive; for categorization, adaptive boosting for binary classification is used, which assigns 1 to the offensive category and 0 to the non-offensive category.
B Performance Evaluation: The performance evaluation of classifiers is neces-
sary for correct categorization results. Performance measures such as Accuracy,
recall, precision and F-measure are used to ensure the classifier’s accuracy and
efficiency.
Evaluation Parameters: Evaluation metrics are used to measure the performance of the predictive model. Accuracy is the ratio of truly predicted observations to total observations. Recall and precision measure the completeness and exactness of the classifier, respectively. Recall is the ratio of relevant documents retrieved to all relevant documents; a higher recall means the classifier returns most of the pertinent results, whether or not irrelevant results are also returned. Precision is the ratio of relevant documents to all retrieved documents; a higher precision means the classifier returns mostly pertinent results. F-measure is calculated as the weighted harmonic mean of recall and precision. For better classification performance, the accuracy, precision, recall and F-measure rates should be higher. They are given in Eqs. 1–4 [34–37]:

\text{Accuracy} = \frac{\text{TPR} + \text{TNR}}{\text{TPR} + \text{FNR} + \text{FPR} + \text{TNR}} \quad (1)

\text{Precision} = \frac{\text{TPR}}{\text{TPR} + \text{FPR}} \quad (2)

\text{Recall} = \frac{\text{TPR}}{\text{TPR} + \text{FNR}} \quad (3)

\text{F-Measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \quad (4)

where TPR is the true positive rate, TNR the true negative rate, FPR the false positive rate, and FNR the false negative rate.
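A minimal sketch of Eqs. 1–4 computed from confusion-matrix counts; the counts below are illustrative and merely on the same scale as the reported results.

```python
# Accuracy, precision, recall and F-measure from confusion-matrix counts.
def scores(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + fn + fp + tn)          # Eq. 1
    precision = tp / (tp + fp)                          # Eq. 2
    recall = tp / (tp + fn)                             # Eq. 3
    f_measure = 2 * precision * recall / (precision + recall)  # Eq. 4
    return accuracy, precision, recall, f_measure

print(scores(tp=470, tn=471, fp=29, fn=30))  # ~0.94 accuracy on these assumed counts
```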
C Text Modality: For the text modality, data pre-processing is first performed to remove irrelevant content such as stop words, misspelt words and HTML tags. The pre-processing steps used here are erasing punctuation, tokenizing documents, removing stop words, removing short words, removing long words and normalizing words. The BOW technique is used for feature extraction, and the KNN, SVM, Decision Tree and Ensemble classifiers are used for text classification.
D Image Modality: Pre-processing of images consists of resizing them. Image data is stored as a numeric array in an Image Datastore object, and data is imported in batches from the dataset using the Image Datastore. For image feature extraction, LBP, HOG and AlexNet are used; for image classification, the KNN, SVM, Decision Tree and Ensemble classifiers are used.
E Text and Image (Multimedia)
The MCMWC model uses both the text and image modalities of the social media dataset. In the MCMWC model, AlexNet is used for feature extraction and the Ensemble as the classifier. AlexNet is trained on millions of images from the database and extracts features in less time. The Ensemble classifier improves the final model's predictive performance by integrating several classification models to obtain higher accuracy.
Tables 1, 2 and 3 present the performance results achieved for the different modalities. In the unimodal experiments, the image-only model performs better than the text-only model; specifically, accuracy is more than 2% higher on average for the image modality than for the text modality. The multimodal model performs better than either single-modality model: the MCMWC model gives 3% more accuracy than the text modality and 2% more than the image modality, achieving 94.08% accuracy, 94.16% precision, 93.58% recall and 93.84% F-measure. These results confirm that the MCMWC model provides a further performance improvement over the unimodal learning approach.
Table 4 shows the accuracy for different classifiers and feature types; the MCMWC model provides higher accuracy than the others.
Table 5 shows the F-measure for different classifiers and feature types; again, the MCMWC model provides the highest F-measure. Figures 4 and 5 show the accuracy and F-measure graphs.

Table 1 Text modality results


Classifier Accuracy Precision Recall F-measure
KNN 80.22 79.55 79.98 79.72
SVM 91.92 91.66 91.66 91.66
Decision tree 88.29 88.04 87.68 87.85
Ensemble 91.03 92.17 91.95 92.05

Table 2 Image modality results


Feature type Classifier Accuracy Precision Recall F-measure
LBP KNN 76.45 75.74 75.24 75.45
LBP SVM 58.95 57.73 50 37.09
LBP Decision Tree 90.44 90.11 90.15 90.13
LBP Ensemble 91.67 93.45 93.49 93.47
HOG KNN 74.29 74.08 74.86 74.02
HOG SVM 84.93 85.19 83.48 84.08
HOG Decision Tree 88.83 88.99 87.84 88.3
HOG Ensemble 90.31 91.04 89.04 89.77
ALEXNET KNN 77.52 77.25 75.71 76.19
ALEXNET SVM 93.41 93.24 93.11 93.18
ALEXNET Decision Tree 92.19 91.87 92.03 91.95
ALEXNET Ensemble 92.63 92.93 92.57 92.74

Table 3 Text and images (multimedia) results


Feature type Classifier Accuracy Precision Recall F-measure
LBP KNN 80.89 81.91 78.51 79.35
LBP SVM 92.06 91.97 91.57 91.76
LBP Decision tree 90.58 90.12 90.62 90.33
LBP Ensemble 93.02 94.05 93.68 93.86
HOG KNN 74.97 76.03 71.75 72.34
HOG SVM 89.91 90.56 88.65 89.35
HOG Decision tree 86.81 86.76 85.83 86.21
HOG Ensemble 93.54 93.65 92.98 93.28
ALEXNET KNN 81.56 81.46 80.08 80.57
ALEXNET SVM 93.94 93.8 93.67 93.73
ALEXNET Decision tree 91.92 91.47 92.06 91.72
ALEXNET Ensemble (MNCWC) 94.08 94.16 93.58 93.84

Table 4 Accuracy (%) with different classifiers and feature types

Technique used LBP HOG ALEXNET
KNN 80.89 74.97 81.56
SVM 92.06 89.91 93.94
Decision tree 90.58 86.81 91.92
MCMWC 93.02 93.54 94.08

Table 5 F-measure (%) with different classifiers and feature types

Technique used LBP HOG ALEXNET
KNN 79.35 72.34 80.57
SVM 91.76 89.35 93.73
Decision tree 90.33 86.21 91.72
MCMWC 91.86 93.28 93.84


Fig. 4 Accuracy comparison graph

5 Conclusion

This paper proposed a joint representation approach (MCMWC) using both text and image modalities of social media datasets. It shows that a feature-level-fusion-based multimodal model can outperform unimodal models. MCMWC uses a categorization approach applicable to multimedia data, employing adaptive boosting for binary classification to categorize the data into two categories, i.e. offensive and non-offensive. Experimental results based on disaster datasets show that MCMWC, which uses AlexNet for feature extraction and the Ensemble net as the classifier, provides better performance (3% more accuracy in comparison to text modality and 2% more accuracy in comparison to image modality) than models trained using a single modality (either text or image). Compared to other feature types and classifiers, the MCMWC model gives 94.08% accuracy, 94.16% precision, 93.58% recall and 93.84% F-measure.


Fig. 5 F-Measure comparison graph

References

1. Jain R, Purohit DG (2011) Page ranking algorithms for web mining. Int J Comput Appl
2. Kowsari K, JafariMeimandi K, Heidarysafa M, Mendu S, Barnes L, Brown D (2019) Text
classification algorithms: a survey. Information 10(4):150
3. Qiu D, Jiang H, Chen S (2020) Fuzzy information retrieval based on continuous Bag-of-Words
model. Symmetry 12(2):225
4. Gao J, Yi J, Jia W, Zhao X (2018). Improved deep belief network to feature extraction in
Chinese text classification. In: 2018 IEEE 9th international conference on software engineering
and service science (ICSESS). IEEE, pp 283–287
5. Sujith A, Aji S (2020) An optimal feature set with LBP for leaf image classification. In: 2020
fourth international conference on computing methodologies and communication (ICCMC).
IEEE, pp 220–225
6. Xiao B, Wang K, Bi X, Li W, Han J (2018) 2D-LBP: an enhanced local binary feature for
texture image classification. IEEE Trans Circuits Syst Video Technol 29(9):2796–2808
7. Jung JY, Kim SW, Yoo CH, Park WJ, Ko SJ (2016) LBP-ferns-based feature extraction for
robust facial recognition. IEEE Trans Consum Electron 62(4):446–453
8. Huang D, Shan C, Ardabilian M, Wang Y, Chen L (2011) Local binary patterns and its appli-
cation to facial image analysis: a survey. IEEE Trans Syst Man Cybern Part C (Appl Rev)
41(6):765–781
9. Ahonen T, Hadid A, Pietikainen M (2006) Face description with local binary patterns:
application to face recognition. IEEE Trans Pattern Anal Mach Intell 28(12):2037–2041
10. Ma C, Jung JY, Kim SW, Ko SJ (2015) Random projection-based partial feature extraction for
robust face recognition. Neurocomputing 149:1232–1244
11. Gritti T, Shan C, Jeanne V, Braspenning R (2008). Local features based facial expression recog-
nition with face registration errors. In: 2008 8th IEEE international conference on automatic
face & gesture recognition. IEEE, , pp 1–8
12. Liu Y, Ge Y, Wang F, Liu Q, Lei Y, Zhang D, Lu G (2019). A rotation invariant HOG descriptor
for tire pattern image classification. In: ICASSP 2019–2019 IEEE international conference on
acoustics, speech and signal processing (ICASSP). IEEE, pp 2412–2416

13. Pang Y, Yuan Y, Li X, Pan J (2011) Efficient HOG human detection. Signal Process 91(4):773–
781
14. Lu S, Lu Z, Zhang YD (2019) Pathological brain detection based on AlexNet and transfer
learning. J Comput Sci 30:41–47
15. Zhang Q, Wang Z, Wang B, Ohsawa Y, Hayashi T (2020) Feature extraction of laser machining
data by using deep multi-task learning. Information 11(8):378
16. Huang F, Yu L, Shen T, Jin L (2019). Chinese herbal medicine leaves classification based
on improved AlexNet convolutional neural network. In: 2019 IEEE 4th advanced information
technology, electronic and automation control conference (IAEAC), vol 1. IEEE, pp 1006–1011
17. James A, Manjusha J, Saravanan C (2018) Malayalam handwritten character recognition using
AlexNet based architecture. Ind J Electr Eng Inf (IJEEI) 6(4):393–400
18. Gargeya R, Leng T (2017) Automated identification of diabetic retinopathy using deep learning.
Ophthalmology 124(7):962–969
19. Sapkal AT, Kulkarni UV (2018) Comparative study of leaf disease diagnosis system using
texture features and deep learning features. Int J Appl Eng Res 13(19):14334–14340
20. Joachims T (1998). Text categorization with support vector machines: Learning with many
relevant features. In: European conference on machine learning. Springer, Berlin, pp 137–142
21. Alhiyafi JA, Alnahwi A, Alkhurissi R, Bayoumi M, Altassan M, Alahmadi A, Olatunji SO,
Maarouf AA (2019). Document categorization engine based on machine learning techniques.
In: 2019 international conference on computer and information sciences (ICCIS). IEEE, pp
1–5
22. Awad WA (2012) Machine learning algorithms in web page classification. Int J Comput Sci
Inf Technol 4(5):93
23. Mosa ZM, Ghaeb NH, Ali AH (2019) Detecting Keratoconus by using SVM and decision tree
classifiers with the aid of image processing. Baghdad Sci J 16(4 Supplement):1022–1029
24. Chen J, Lian Y, Li Y (2020) Real-time grain impurity sensing for rice combine harvesters using image processing and decision-tree algorithm. Comput Electron Agric 175:105591
25. Khotimah WN, Arifin AZ, Yuniarti A, Wijaya AY, Navastara DA, Kalbuadi MA (2015). Tuna
fish classification using decision tree algorithm and image processing method. In: 2015 inter-
national conference on computer, control, informatics and its applications (IC3INA). IEEE, pp
126–131
26. Salo F, Nassif AB, Essex A (2019) Dimensionality reduction with IG-PCA and ensemble
classifier for network intrusion detection. Comput Netw 148:164–175
27. Li X, Shi T, Li P, Zhou W (2019) Application of bagging ensemble classifier based on genetic algorithm in the text classification of railway fault hazards. In: 2019 2nd international conference on artificial intelligence and big data (ICAIBD). IEEE, pp 286–290
28. Yu G, Li Q, Wang J, Zhang D, Liu Y (2020) A multimodal generative and fusion framework
for recognizing faculty homepages. Inf Sci 525:205–220
29. Rungta M, Sherki PP, Dhaliwal MP, Tiwari H, Vala V (2020). Two-phase multimodal neural
network for app categorization using APK resources. In: 2020 IEEE 14th international
conference on semantic computing (ICSC). IEEE, pp 162–165
30. Suryawanshi S, Chakravarthi BR, Arcan M, Buitelaar P (2020). Multimodal meme dataset
(multioff) for identifying offensive content in image and text. In: Proceedings of the second
workshop on trolling, aggression and cyberbullying, pp 32–41
31. Gülbaş B, Şengür A, İncel E, Akbulut Y (2019). Deep features and extreme learning machines
based apparel classification. In: 2019 international artificial intelligence and data processing
symposium (IDAP). IEEE, pp 1–4
32. Purohit H, Ajmera PK (2021) Optimal feature level fusion for secured human authentication
in multimodal biometric system. Mach Vis Appl 32(1):1–12
33. https://archive.ics.uci.edu/ml/datasets/Multimodal+Damage+Identification+for+Humanitarian+Computing
34. Zhang Q, Yang LT, Chen Z, Li P (2018) A survey on deep learning for big data. Information
Fusion 42:146–157

35. Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural
networks. Science 313(5786):504–507
36. Hinton GE, Osindero S, Teh YW (2006) A fast learning algorithm for deep belief nets. Neural
Comput 18(7):1527–1554
37. Nagatani K, Hagiwara M (2014) Restricted Boltzmann machine associative memory. Int Joint
Conf Neural Netw (IJCNN) 2014:3745–3750
DDSS: An AI Powered System for Driver
Safety

Neera Batra and Sonali Goyal

Abstract This research describes an AI-powered drowsiness detection safety system (DDSS) that uses many sensors at various places to detect tiredness based on eye blink and head motion. With the use of a crash sensor and GPS, it can also detect the occurrence of an accident and report the accident location through a GSM communication module to predefined emergency numbers stored in the database. DDSS uses an IR sensor and an accelerometer to track the vehicle owner's eye and head movements in order to alert him to a potentially dangerous situation in which he is about to fall asleep.

Keywords Drowsiness · Raspberry pi · GSM module · Driver · Accelerometer ·


IR Sensor

1 Introduction

According to current data, driver fatigue has become one of the leading causes of fatalities, severe bodily injuries and economic losses on the road in recent years. Drowsiness is characterized as a reduced degree of awareness manifested by tiredness and difficulty staying awake, although the person can be awakened by simple stimuli. It may be caused by a lack of sleep, medication, substance abuse or a neurological problem. As a result, the driver may have sluggish response times, decreased vigilance and impaired thinking; in the worst case, he may fall asleep behind the wheel. Researchers have tried to determine driver drowsiness with the help of the following measures: vehicle-based measures, behavioral measures and physiological measures. With the goal of reducing or eliminating the number of accidents, the driver's state of sleepiness should be monitored on a regular basis.

N. Batra · S. Goyal (B)


Department of CSE, Maharishi Markandeshwar Engineering College, Maharishi Markandeshwar
(Deemed To Be University) Mullana, Ambala, Haryana, India


2 Review of Literature

In [1], a lightweight method for driver drowsiness detection is developed and implemented as an Android application. The system captures video and applies a number of image processing techniques to detect the driver's face in every frame; it can detect facial landmarks, which are used to calculate the eye aspect ratio as well as the eye closure ratio. Wijnands et al. [2] showed in their study how separable 3D convolutions can increase prediction accuracy. Ramzan et al. [3] analysed the traditionally used techniques for drowsiness detection and classified them into three categories: vehicle-related, behavior-related and physiological-parameter-related. The study in [4] investigated whether it is possible to predict when a critical level of drowsiness is reached, using 21 participants who each drove a car for one hundred and ten minutes to induce drowsiness. The work in [5] presents a clear description of recent works related to driver drowsiness, with a complete description of alert systems using different ML techniques, such as the PERCLOS algorithm and the OpenCV method, for determining the driver's condition. An alert system for drivers is developed in [6], using video processing with an eye blink criterion based on the eye aspect ratio computed from Euclidean distances. Meireles et al. [7] present a low-cost system for fatigue detection that consists of numerous sensors to monitor physical parameters and the behavior of the vehicle. Sahu et al. [8] present a multitasking system for drowsiness detection using Arduino, which proves effective and inexpensive compared to traditional ones. Chellapa et al. [9] use a camera to capture the face and eyes of the driver and then process these images to calculate fatigue.

3 Methodology

3.1 Problem Definition

DDSS is designed to detect the drowsy state of the eyes and to give an alert signal or warning, in audio or other form, to prevent a potential accident. Diverse measures are available to detect drowsiness in the driver, ranging from vehicle-based to physiological and behavioral measures. In this work, the measure used to calculate drowsiness is the percentage of time the eyelids are closed or open, which is a behavioral measure. As a result, the amount of eye closure, also known as the percentage of closure (PERCLOS), is the primary emphasis of this technique, as it gives the most reliable information on drowsiness. PERCLOS is estimated as the fraction of frames with closed and open eyes out of the aggregate number of frames over a certain period. To achieve this goal, a script is built that captures eye images from a camera and saves them to a local disk to form the dataset. The data was manually

Fig. 1 Architecture diagram of the proposed drowsy driver safety system (DDSS)

cleaned by deleting any photographs that were not required for the DDSS’s construc-
tion. Figure 1 depicts the architectural diagram of the proposed drowsy driver safety
system (DDSS).
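A minimal PERCLOS sketch follows, assuming the detector supplies a per-frame eye-closed flag; the window length and alert threshold are illustrative assumptions, not the paper's calibrated values.

```python
# Sliding-window PERCLOS: fraction of recent frames with closed eyes.
from collections import deque

WINDOW = 150          # e.g. ~5 s of frames at 30 fps (assumed)
THRESHOLD = 0.2       # alert if eyes are closed in >20% of recent frames (assumed)

frames = deque(maxlen=WINDOW)

def update(eye_closed: bool) -> bool:
    """Record one frame's eye state; return True when PERCLOS exceeds the threshold."""
    frames.append(eye_closed)
    perclos = sum(frames) / len(frames)
    return perclos > THRESHOLD

# Per frame: if update(detector_says_closed): sound the buzzer / warn the driver.
```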

3.2 Implementation

In this section, the implementation steps of the drowsiness detection safety system (DDSS) are explained in detail. The hardware of the DDSS incorporates an accelerometer, a GSM module, an IR sensor, a crash sensor, an LCD, a Raspberry Pi, a GPS module and a buzzer. The flow diagram of the proposed drowsiness detection safety system (DDSS) is shown in Fig. 2.
The proposed system determines the true status of the eye: whether it is closed, open or semi-closed. Detection of the eye status is the most significant criterion: if the eyes are detected to remain closed or only semi-open beyond a certain threshold value, a warning message is sent out. Whenever the system notices something unusual, it alerts the driver. The accelerometer ADXL345, mounted on the car owner's forehead, detects his head movements: when he feels sleepy, his head may tilt forward, to the left or to the right, and the accelerometer measures the new head position. He is then alerted by a buzzer, and warning messages are sent to the pre-stored emergency contacts. The location of the car is determined using GPS, and the crash sensor detects a collision, if any.
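As a hedged sketch of the crash-alert path (crash sensor → GPS fix → SMS via the GSM module), the following uses pyserial with standard SIM800-style AT commands; the serial port, baud rate and phone numbers are assumptions, not the paper's configuration.

```python
# Crash alert: text the stored emergency numbers with a map link to the GPS fix.
import time
import serial

gsm = serial.Serial("/dev/serial0", 9600, timeout=1)   # assumed port/baud
EMERGENCY_NUMBERS = ["+911234567890"]                  # pre-stored in the database

def send_sms(number: str, text: str) -> None:
    gsm.write(b"AT+CMGF=1\r")                          # set SMS text mode
    time.sleep(0.5)
    gsm.write(f'AT+CMGS="{number}"\r'.encode())        # start message to recipient
    time.sleep(0.5)
    gsm.write(text.encode() + bytes([26]))             # body + Ctrl-Z to send
    time.sleep(3)

def on_crash(lat: float, lon: float) -> None:
    for number in EMERGENCY_NUMBERS:
        send_sms(number, f"Accident detected at https://maps.google.com/?q={lat},{lon}")
```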
Sequence Diagram:
A Sequence Diagram for Drowsy Driver Safety System (DDSS) is depicted in Fig. 3.
432 N. Batra and S. Goyal

Fig. 2 Flow diagram of the proposed drowsiness detection safety system (DDSS)

4 Result and Analysis

4.1 Experiment

The drowsy driver safety system (DDSS) prototype is created with Raspberry Pi hardware and written in Python. It was put to the test on a variety of people in various situations, such as with a straight or inclined head. System tests were performed for 10 drivers of different age groups and genders, all accompanied by an “aid” who could control events.
Table 1 lists the accelerometer values detected for various head movements of the driver. The following activities are performed to validate DDSS: front and lateral movement of the head, left and right fall, and eye blinking.

Fig. 3 Sequence diagram for proposed drowsy driver safety system (DDSS)

Table 1 Accelerometer values in case of different head movements by the driver

| S.no | Head direction | Value at X axis | Value at Y axis | Value at Z axis |
|---|---|---|---|---|
| 1 | Right | X < 0 | Y < 0 | Any value |
| 2 | Left | X > 0.95 | Any value | Any value |
| 3 | Front | X < 0.4 | Y < 0 | Any value |
| 4 | Normal | X = 0 | Y = 0 | Z = 0 |

To validate DDSS, drivers repeated each action to obtain five instances of each test. Accuracy-related measures such as false positives and false negatives were recorded (Tables 2, 3 and 4).
Figure 4 depicts drowsiness detection values under normal conditions for 50
observations.
Figure 5 depicts drowsiness detection values under normal conditions for 100
observations.
Figure 6 depicts drowsiness detection values under normal conditions for 150
observations.

Table 2 Drowsiness detection values under normal conditions for 50 observations

| S.no. | Test | No. of observations | No. of hits | Percentage of hits (%) |
|---|---|---|---|---|
| 1 | Front nodding | 50 | 46 | 92 |
| 2 | Assent of the head to the right | 50 | 48 | 96 |
| 3 | Assent of the head to the left | 50 | 49 | 98 |
| 4 | Distraction to the right | 50 | 47 | 94 |
| 5 | Distraction to the left | 50 | 48 | 96 |
| 6 | Blink detection | 50 | 48 | 96 |

Table 3 Drowsiness detection values under normal conditions for 100 observations

| S.no. | Test | No. of observations | No. of hits | Percentage of hits (%) |
|---|---|---|---|---|
| 1 | Front nodding | 100 | 95 | 95 |
| 2 | Assent of the head to the right | 100 | 96 | 96 |
| 3 | Assent of the head to the left | 100 | 98 | 98 |
| 4 | Distraction to the right | 100 | 94 | 94 |
| 5 | Distraction to the left | 100 | 97 | 97 |

Table 4 Drowsiness detection values under normal conditions for 150 observations

| S.no. | Test | No. of observations | No. of hits | Percentage of hits (%) |
|---|---|---|---|---|
| 1 | Front nodding | 150 | 147 | 98 |
| 2 | Assent of the head to the right | 150 | 145 | 96 |
| 3 | Assent of the head to the left | 150 | 145 | 96 |
| 4 | Distraction to the right | 150 | 145 | 96 |
| 5 | Distraction to the left | 150 | 146 | 97 |
| 6 | Blink detection | 150 | 148 | 99 |

4.2 System Database

A database is maintained to store the emergency contacts of the driver. Figure 7 depicts the text messages sent to the driver's emergency contacts after an accident is detected by DDSS.

Fig. 4 Drowsiness detection values under normal conditions for 50 observations

Fig. 5 Drowsiness detection values under normal conditions for 100 observations

Fig. 6 Drowsiness detection values under normal conditions for 150 observations

Fig. 7 Snapshot of text messages sent on emergency numbers

5 Conclusion and Future Work

The proposed and implemented drowsiness detection safety system (DDSS) attempts to provide a solution for night-time drivers when they feel sleepy and for drivers who are drunk. It additionally sends caution messages to the driver's loved ones when he is overdrunk or driving rashly, in an effort to save him from accidents. In case of misfortune, DDSS is prepared to send text messages to the emergency numbers. More sophisticated and smaller hardware needs to be designed. Instead of using a Raspberry Pi, a mobile application can be developed through which data can be sent to the cloud for a faster and more effective response.

References

1. Sukrit M et al (2019) Real time driver drowsiness detection system using eye aspect ratio and
eye closure ratio. In: International conference on sustainable computing science, technology and
management
2. Wijnands JS, Thompson J, Nice KA et al (2020) Real-time monitoring of driver drowsiness on
mobile platforms using 3D neural networks. Neural Comput Appl 32:9731–9743
3. Ramzan M, Khan HU, Awan SM, Ismail A, Ilyas M, Mahmood A (2019) A survey on state-of-
the-art drowsiness detection techniques. IEEE Access 7:61904–61919
4. Jacobé de Naurois C, Bourdin C, Stratulat A, Diaz E, Vercher J-L (2019) Detection and prediction
of driver drowsiness using artificial neural network models. Accid Anal Prev 126:95–104
5. Navya Kiran VB, Raksha R, Rahman A, Varsha KN, Dr. Nagamani NP (2020) Driver drowsiness
detection. Int J Eng Res Technol (Ijert) Ncait 8(15)
6. Biswal AK et al (2021) IoT based smart alert system for drowsy driver detection. Wireless
Commun Mobile Comput
7. Meireles T et al (2019) A low cost prototype for driver fatigue detection. Multimodal Technol Interact 3

8. Sahu S et al (2019) Survey on drowsiness detection/smart vehicle and security implementation at low cost. Int J Res Appl Res Eng Technol 7(3)
9. Chellapa A et al (2018) Fatigue detection using Raspberry Pi 3. Int J Eng Technol 7(2):29–32
Improving Performance and Quality
of Service in 5G Technology

Stuti Jain, Ritika Jain, Jyotika Sharma, and Ashish Sharma

Abstract 5G technology and networks have been set in motion to deliver high data rates, low latency, massive network capacity, and an unwavering user experience to more users. This work analyzes some key parameters that may influence the quality of service (QoS) in 5G networks in terms of spectral efficiency. To improve data rates in the near future, we need more spectrum; however, the spectrum is limited, so it is required to improve spectral efficiency. In the present scenario, massive MIMO helps in achieving substantial spectral efficiency by congregating its antennas to focus towards a user, but there is still potential for improvement. To cater to these requirements, cellular architectures of 5G networks need development on a large scale. In this paper, we therefore shed light on how the positioning and count of UEs affect spectral efficiency, what impact the signal-to-noise ratio can have in practical implementation, and how less complex and efficient precoding techniques can achieve near-optimal performance.

Keywords Fifth generation · Spectral efficiency · User equipment · Precoding ·


Noise · QoS

Abbreviations

5G Fifth generation
BS Base station
CSI Channel state information
D2D Device-to-device
EE Energy efficiency
IIoT Industrial IoT

S. Jain · R. Jain (B) · J. Sharma · A. Sharma


Department of Computer Science and Engineering, Maharaja Agrasen Institute of Technology,
New Delhi, India
A. Sharma
e-mail: ashish@mait.ac.in


IoT Internet of things


KPI Key performance indicators
Ma-MIMO Massive MIMO
MIMO Multiple-input multiple-output
mmWave Millimeter wave
MR Maximum ratio
P-ZF Phased zero-forcing
QoS Quality of service
RF Radio frequency
SE Spectral efficiency
UE User equipment
ZF Zero-forcing

1 Introduction

5G is the fifth generation of mobile networks. It supports a multitude of applications such as IIoT, smart homes, self-driving cars, drone operations, health tech, and entertainment. According to users' experience, it is segregated into the following: immersive 5G services, intelligent 5G services, omnipresent 5G services, autonomous 5G services, and public 5G services [1, 2]. 5G uses small cells in large clusters so as to provide ultra-high speed with the use of mmWaves. Also, to cover a wide area and to send and receive signals simultaneously, it uses macrocells, i.e., MIMO. Unprecedented array gains and spatial resolution can be provided by Ma-MIMO, allowing multi-user MIMO communication to hundreds of user equipments (UEs) per cell while maintaining robustness to inter-user interference [3, 4]. The spectrum management is handled much more smartly due to the use of beamforming technology and massive MIMO. This results in considerably more uniform data rates and latency levels across the network [5]. Ma-MIMO is able to receive data from contemporaneous sensor transmissions, owing to its high multiplexing gain and beamforming capabilities. Its sensors have high data rates and low latency, which makes connectivity more reliable [2, 6].
The main categories of technical challenges in 5G implementation are mmWave, D2D communications, backhaul, technology maturity, and security challenges [7]. Areas of development in 5G include [8]:
(a) Spectrum issues [9]
(b) Traffic attributes [10]
(c) Radio interoperability [11]
(d) Access network problems [9]
Even though precoding techniques increase throughput and reduce interference, they increase the computational complexity of the overall system. Therefore,

using a less complex and more efficient precoder for the system would be a practical approach [2, 6]. Ma-MIMO technology increases SE and EE by implementing simple signal techniques such as zero-forcing. In this work, we have considered ZF, P-ZF, and MR and presented their comparative behavior in the presence of noise, which gives a simulated idea of practical conditions. Zero-forcing (ZF) precoding is a virtually optimal linear precoding scheme among full-complexity massive MIMO systems. Phased-ZF (P-ZF), however, is a hybrid precoding scheme that essentially applies phase-only control in the RF domain and then performs low-dimensional baseband ZF precoding based on the effective channel seen from baseband [12]. Being hybrid, it provides a compromise between complexity and performance. Thus P-ZF and MR are less complex while retaining the benefits of both spatial multiplexing and beamforming gain. We may attain over 100% improvement in contrast with a baseline that uses orthogonal frequency-division multiple access and turbo coding similar to LTE [13]. We have used random noise, randn(), to understand how SE varies with different precoding techniques in practical cases at different frequencies.
The organization of the rest of the paper is as follows:
• Section 2 furnishes the model and methodology, including the computation of the optimal SE and UE count using the µ parameters in different tiers of the cellular network.
• Section 3 presents the implementation of the system model through simulation results for different interference cases and examines the variations when noise is added.
• Section 4 presents the inferred results of the model and summarizes the main points of the paper.
• Section 5 presents the insights of the graphs and the future scope of this paper.

2 Methodology

We consider a hexagonal cellular network in which each cell is dynamically assigned a value. The base station in each cell is equipped with M antennas and serves K user equipments (UEs). Each UE position in a cell is an ergodic random variable. Since time-frequency resources are divided into multiple frames, the coherence block length is S = T_c W_c. The hexagonal cellular network is taken to be infinitely large so that all cells have the same properties. The formula stated below is used to find the location of the base station B_j.
$$B_j = \begin{bmatrix} \sqrt{3}\,r \\ 0 \end{bmatrix} \alpha_j^{(1)} + \begin{bmatrix} \sqrt{3}\,r/2 \\ 3r/2 \end{bmatrix} \alpha_j^{(2)} \qquad (1)$$

where α_j^{(1)} and α_j^{(2)} depict the integer coordinates as follows (Fig. 1):

Fig. 1 Represents the coordinate system for a hexagonal system
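To make the lattice construction of Eq. (1) concrete, a minimal NumPy sketch follows. It assumes the two column vectors read from the reconstructed Eq. (1); bs_location is a hypothetical helper, not part of the authors' simulation.

```python
# Sketch of Eq. (1): base station position for integer coordinates
# (alpha1, alpha2) on a hexagonal lattice with cell radius r (assumed here).
import numpy as np

def bs_location(alpha1: int, alpha2: int, r: float = 1.0) -> np.ndarray:
    v1 = np.array([np.sqrt(3) * r, 0.0])            # first lattice vector
    v2 = np.array([np.sqrt(3) * r / 2, 3 * r / 2])  # second lattice vector
    return alpha1 * v1 + alpha2 * v2

print(bs_location(1, 0))  # neighbouring cell along the first axis
print(bs_location(0, 1))  # neighbouring cell along the second axis
```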

After computing the base station of an arbitrary cell, the user equipment locations are generated randomly, since each position is an ergodic random variable. After generation, it is checked whether the generated values lie within the hexagonal radius of the base station. Each cell in the cellular network has six neighboring cells that can create interference in tier 1; similarly, it has twelve interfering cells in tier 2. We computed the reuse factor for the cellular network when the first neighboring cell is the one that reuses the same pilot sequence.
$$\mu_{jl}^{(\omega)} = \mathbb{E}_{z_{lm}}\left\{ \left( \frac{d_j(z_{lm})}{d_l(z_{lm})} \right)^{\omega} \right\} \qquad (2)$$

for ω = 1, 2.
Since the spectral efficiency is a function of the μ parameters μ_j^(1) and μ_j^(2), all the μ parameters are then calculated. μ_j^(1) defines the mean ratio between the channel variance to the arbitrary base station and the channel variance to the neighboring base station whose interference is being considered, and μ_j^(2) is the corresponding second-order moment. These parameters are equal to one only if both cells being considered are the same. The μ parameters are calculated to find the interference between the cells when the pilot reuse factor is:

$$\beta = \alpha_1^2 + \alpha_2^2 + \alpha_1 \alpha_2 \qquad (3)$$

The interference is calculated considering two conditions, i.e., with cells having
the same pilot sequence as that of the origin cell and with the cells having the same
pilot sequence with their neighboring cell. This process is repeated for all the tiers
of the cellular network with respect to the origin cell (Fig. 2).
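Since the expectation in Eq. (2) has no simple closed form over a hexagonal cell, it is naturally estimated by Monte Carlo sampling of UE positions, in the spirit of the simulation described in Sect. 3. The sketch below is ours, not the authors' code: it approximates cell l by a disk of radius r with the 0.14r forbidden region, and estimate_mu is a hypothetical helper.

```python
# Monte Carlo estimate of Eq. (2): mu_{jl}^{(omega)} for base stations at
# b_j and b_l, with UEs drawn uniformly in cell l (disk approximation).
import numpy as np

rng = np.random.default_rng(0)

def estimate_mu(b_j, b_l, r=1.0, omega=1, n=100_000):
    # Uniform samples in the annulus between 0.14 r (forbidden region) and r.
    theta = rng.uniform(0.0, 2 * np.pi, n)
    rad = r * np.sqrt(rng.uniform(0.14 ** 2, 1.0, n))
    z = b_l + np.c_[rad * np.cos(theta), rad * np.sin(theta)]
    d_j = np.linalg.norm(z - b_j, axis=1)   # distance to the interfering BS
    d_l = np.linalg.norm(z - b_l, axis=1)   # distance to the serving BS
    return np.mean((d_j / d_l) ** omega)

print(estimate_mu(np.zeros(2), np.zeros(2)))           # same cell -> 1.0
print(estimate_mu(np.array([3.0, 0.0]), np.zeros(2)))  # a neighbouring cell
```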
Considering a very large number of antennas, i.e., M → ∞, the SINRs for P-ZF, ZF, and MR converge to the same limit. To check the ultimate effect of pilot contamination, cases involving the same and different values of the pilot reuse factor were considered. The pilot reuse factor defines the total count of pilot signals being used in a cellular network: the higher the pilot reuse factor, the higher the count of pilot signals used, and hence the higher the inter-cell interference.

Fig. 2 Tiers of a cellular network

Fig. 3 Representation of various values of pilot reuse factor in hexagonal network

After finding the µ parameters, the spectral efficiency is calculated with the help of Monte Carlo simulation. The origin cell is simulated considering all the non-negligible interferences. The user equipments (UEs) can lie anywhere outside the forbidden range of 0.14r around the base station. Once this is done, the interferences, whether intra-cell or inter-cell, are minimized using the three linear precoding methods, namely: full-pilot zero-forcing (P-ZF), zero-forcing (ZF), and maximum ratio (MR) (Fig. 3).
For each number of antennas M, the spectral efficiency is optimized over the number of user equipments and the pilot reuse factor (which determine B = βK). This is done by searching the range of all acceptable integer values. The following values were also set:
• Coherence block length S = 400 (e.g., a coherence time of 2 ms and a coherence bandwidth of 200 kHz, with respect to S = T_c W_c).
• An inverse SNR (σ²/ρ) of 5 dB.
• A pathloss exponent (κ) of 3.7.
• A forbidden range with a minimum radius of 0.14r. After this, spectral efficiencies are calculated using the existing mathematical model:
   
$$\mathrm{SE}_j = K \left( 1 - \frac{B}{S} \right) \log_2 \left( 1 + \frac{1}{I_{\text{scheme}}} \right) \quad \text{[bits/Hz/cell]} \qquad (4)$$

When M → ∞, the pre-log factor of the above equation is maximized by K = S/(2β).
SNR is a measure of signal strength relative to undesired noise, and maintaining it requires more signal power. Since a higher pathloss exponent reduces inter-cell interference, we considered κ = 3.7. Random noise with a variance of 1.5 is also added to understand the change that would occur in the signals and graphs.
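The optimization just described amounts to a grid search over the admissible integer values of K and β for each M. The sketch below is an illustrative reading of that procedure, not the authors' simulation code: S = 400 and the reuse factors β ∈ {1, 3, 4} follow the stated setup, while I_scheme is a toy stand-in for the interference term that in the full model depends on the µ parameters, the precoder (MR, ZF or P-ZF) and the SNR.

```python
# Sketch of the (K, beta) grid search behind Eq. (4).
import numpy as np

S = 400  # coherence block length (2 ms coherence time x 200 kHz bandwidth)

def I_scheme(K: int, beta: int) -> float:
    # Placeholder interference term, chosen only to keep the sketch runnable;
    # the real value comes from the mu parameters, precoder and SNR.
    return 0.1 / beta + 0.001 * K

def spectral_efficiency(K: int, beta: int) -> float:
    """Eq. (4): per-cell SE for K UEs and pilot reuse factor beta (B = beta K)."""
    B = beta * K
    if B > S:                       # pilots cannot exceed the coherence block
        return 0.0
    return K * (1 - B / S) * np.log2(1 + 1 / I_scheme(K, beta))

# Exhaustive search over admissible integer K and the reuse factors above.
best_se, best_K, best_beta = max(
    (spectral_efficiency(K, b), K, b) for b in (1, 3, 4) for K in range(1, S + 1)
)
print(best_se, best_K, best_beta)
```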

3 Results

To increase area throughput, either the bandwidth or the cell density has usually been increased, but spectral efficiency has rarely been considered for the same purpose. Therefore, to get maximum spectral efficiency, we need to find the optimal values of pilot reuse [14]. We simulated the spectral efficiency for an arbitrary cell in the cellular network and took into consideration all the possible interferences (inter-cell as well as intra-cell). The UEs can be anywhere in the cell; hence, based on their positioning, three cases were considered:
A Best Schema: All UEs in the different cells lie on the furthest edge with respect
to the cell being considered.
B Worst Schema: All UEs in the different cells lie on the closest edge with respect
to the cell being considered.
C Average Schema: All UEs in the different cells lie randomly in the middle with
respect to the cell being considered (Figs. 4 and 5).

Fig. 4 a SE versus total number of base stations in best case. b SE versus total number of base
station in worst case

Fig. 5 a Optimal efficiency versus total number of base stations in best case. b Optimal efficiency
versus number of base stations in worst case

The best case, being the most desirable, is overly positive. Although the UE positions are different in the interfering cells, it still provides an upper bound under coordinated scheduling across cells. The worst case, being the least desirable, follows an overly negative approach, since all the UEs cannot be at the worst positions at the same instant. Hence, the average schema is considered the most probable one for each cell. The spectral efficiencies that can be achieved per cell vary immensely when the best case is compared with the other two. This clearly confirms that for massive MIMO, single-cell analysis cannot be relied upon. Higher spectral efficiency is brought by ZF, and not MR, under the cell interference presented by the best case; hence the potential gain from suppressing intra-cell interference is much higher than from suppressing inter-cell interference. P-ZF is equal to ZF in the best scenario but outdoes it under worst-case cell interference. Under average-case interference, P-ZF, ZF, and MR are almost similar in the most practical range of [10, 100] base station antennas. In all the schemas, maximum diversity is seen when M > 10,000.
From the graphs, it was quite evident that the major difference between P-ZF, ZF, and MR lies not in the spectral efficiencies themselves but in how they are achieved: the optimal number of UEs and the pilot reuse factor. Generally, the larger the value of M, the larger the optimal number of UEs and the smaller the reuse factor, as channels become more orthogonal. MR serves the maximum number of UEs and shifts to a smaller reuse factor at fewer BS antennas than P-ZF and ZF. On the contrary, P-ZF prefers a large reuse factor and the smallest count of UEs, the reason being that inter-cell interference can then be minimized easily. We can also say that MR gives low spectral efficiency per user to many UEs, while ZF and P-ZF provide more spectral efficiency per user with fewer UEs (Fig. 6).

Fig. 6 a Spectral efficiency versus BS antennas in average case. b Optimal efficiency versus BS
antennas in average case

On analyzing the significance of the pilot reuse factor β, it was observed that, across the different values of β, the highest spectral efficiencies are seen with base station antennas in the range [100, 10,000]. The pilot reuse factor affects the ideal number of UEs and their maximum possible performance. We found that P-ZF gives the highest spectral efficiency in comparison to MR. If MR, ZF, and P-ZF are compared for a particular number of antennas, it is not likely that cells will be completely occupied; the value may vary. MR is quite competitive when the number of antennas is quite large, i.e., more than 10,000, although ZF and P-ZF provide better spectral efficiency.
From the graphs generated, we can see that K = S/(2β) is the optimal number of UEs as M → ∞ [4]. This is validated since, in the average case, K tends to 70 (where β = 3); in the best case, K tends to 205 (where β = 1); and in the worst case, K tends to 48 (where β = 4) (Fig. 7).
Massive MIMO can also work at lower SNR values, but power loss would then occur. The noise with variance 1.5 has not shown much deviation from the actual values calculated. As densification increases, noise would also increase and would need to be taken into consideration. Zero-forcing and phased-ZF are more responsive to the signal-to-noise ratio. Taking three different environments helps us analyze how noise would affect each of them (Figs. 8 and 9).
Comparing the three cases, which are based on different inter-cell interference levels, it can be seen that the worst case is affected the most. For the realistic model, i.e., the average case, there is not much deviation from the actual results, and the system works well when the best case is taken into account. Therefore, we can find the achievable SE at different interference levels.

Fig. 7 a Effect of noise on spectral efficiency versus BS antennas in best case. b Effect of noise
on optimal efficiency versus BS antennas in best case

Fig. 8 a Effect of noise on spectral efficiency versus BS antennas in worst case. b Effect of noise
on optimal efficiency versus BS antennas in worst case

4 Conclusion

The 5G technology we use now does not make sufficient use of the full spectrum; since the spectrum is limited, we cannot expand it, but we need to use it effectively. Massive MIMO is a technique that helps us achieve maximum spectral efficiency (SE) and better throughput. With the help of this model, we have tried to find a solution to this problem. Here, we have taken the best and worst cases to find the range of achievable SEs (per cell) at different interference levels. From the graphs, we learned that P-ZF has the highest performance per UE whereas MR gives the lowest. But in terms of the number of UEs, MR gives more value compared to

Fig. 9 a Effect of noise on spectral efficiency versus BS antennas in average case. b Effect of noise
on optimal efficiency versus BS antennas in average case

P-ZF. Therefore, ZF turns out to be a good choice for the per-cell SE. Taking account of the average case, which is the realistic model, P-ZF, ZF and MR show almost similar optimized SE for the practical range of antennas 10 < M < 200.
With the help of the graphs, we can see how the optimized SE is attained; varied values of the UE count K and the reuse factor β were used. We have seen that, generally, a large M implies a smaller β and a higher K, as channels tend to become more orthogonal when M increases. Even after the addition of random noise of variance 1.5, there is not much deviation in the values of our model in the average case, which is the realistic approach. Therefore, we could apply this in daily-life situations.

5 Future Scope

Technology is a never-ending world and will continue to evolve in the future. 5G technology offers a wide range of possibilities to look into, and there are a few areas we were not able to touch upon in this paper. We have taken an approximate forbidden region, but this might change based on different geographical locations, and the values could vary with different infrastructures taken into consideration. Hardware impairments might also affect the values we obtained for the optimized SE in multi-cell massive MIMO systems, as we have assumed ideal hardware at the BS and UE. With more variations in the values of the pilot reuse factor and the number of antennas, we can deduce more results for different scenarios.

References

1. Zikria YB, Sung KW, Muhammad AK, Wang H, Mubashir RH (2018) 5G mobile services and
scenarios: challenges and solutions. Sustainability 10(10)
2. MDPI. [Online]. https://www.mdpi.com/
3. Cornell University arxiv.org. [Online]. https://arxiv.org/
4. Bjornson E, Larsson EG, Debbah M (2015) Massive MIMO for maximal spectral efficiency:
how many users and pilots should be allocated?. arXiv:1412.7102v3
5. Electronic Notes. [Online]. https://www.electronics-notes.com/
6. Chataut R, Akl R (2020) Massive MIMO systems for 5G and beyond networks—overview,
recent trends, challenges, and future research direction. Sensors 20(10)
7. Taheribakhsh M, Jafari AH, Peiro MM (2020) 5G implementation: major issues and challenges
in 2020. In: 25th international computer conference, computer society of Iran (CSICC), Tehran,
Iran
8. Busari SA, Mumtaz S, Al-Rubaye S, Rodriguez J (2018) 5G millimeter-wave mobile
broadband: performance and challenges. IEEE Commun Mag 56(6):137–143
9. Bajracharya R, Shrestha R, Kim SW (2017) Impact of contention based LAA on Wi-Fi network.
Int Inf Inst (Tokyo). Inform 20(2A):827–836
10. Bajracharya R, Shrestha R, Zikria YB, Kim SW (2017) LTE or LAA: choosing network mode
for my mobile phone in 5G network. In: 2017 IEEE 85th vehicular technology conference
(VTC Spring), pp 1–4
11. Nuggehalli P (2016) LTE-WLAN aggregation [industry perspectives]. IEEE Wirel Commun
23(4):4–6
12. Liang L, Wei X, Dong X (2014) Low-complexity hybrid precoding in massive multiuser MIMO
systems. IEEE Wireless Commun Lett 3(6):653–656
13. Wang J et al (2017) Spectral efficiency improvement with 5G technologies: results from field
tests. IEEE J Sel Areas Commun 35(8):1867–1875
14. Chataut R, Akl R (2018) Optimal pilot reuse factor based on user environments in 5G
massive MIMO. In: IEEE 8th annual computing and communication workshop and conference
(CCWC), Las Vegas, NV, USA, pp 845–851
Machine Learning-Based Ensemble
Classifier Using Naïve Bayesian Tree
with Logit Regression for the Prediction
of Parkinson’s Disease

Arunraj Gopalsamy and B. Radha

Abstract Parkinson disease is the most common nerve disorder that affects the central nervous system, with various symptoms including depression, anxiety, muscular weakness, low voice with difficulty in speaking, and difficulty with walking. Though the disease seems harmless at the beginning, its gradual progression worsens a person's health over time. Most analyses report that most Parkinson patients experience voice deficiencies. To treat Parkinson disease, early detection of the disease is imperative. This paper presents a predictive ensemble tree, a machine learning-based ensemble classifier that employs a decision tree, a naïve Bayesian classifier and logit regression for predicting Parkinson disease from voice data samples. Experimental analysis has been made for the proposed model using three publicly available Parkinson datasets. The results show that the model has an accuracy of 98.01%, precision of 98.6% and AUC of 97.6% with ten-fold cross validation. The performance of the proposed model has been compared with various other models using various evaluation metrics, from which it is shown that the proposed model outperforms various existing models. Additionally, the model has minimum values of mean squared error, root mean squared error and mean absolute error in comparison with various other models. The results prove that the proposed model can be utilized for the early prediction of Parkinson disease.

Keywords Machine learning · Predictive ensemble tree · Decision tree · Naïve


Bayesian model · Logit regression · Parkinson’s disease prediction

A. Gopalsamy (B)
Sree Saraswathi Thyagaraja College, Pollachi, Tamil Nadu 642107, India
IBM, Bangalore, Karnataka 560064, India
B. Radha
Sri Krishna Arts and Science College, Coimbatore, Tamil Nadu 641008, India


1 Introduction

Parkinson disease, a progressive chronic syndrome that severely affects the central nervous system [1], is the second most common disease targeting the aged [2]. The disease is increasing rapidly all over the world, especially in developing countries [3]. It has a vast range of motor symptoms such as tremors, vocal impairments, gait or walking difficulties, rigidity or postural instability, dystonia (repetitive muscle movements) and bradykinesia (slowness in movements), and non-motor symptoms including mental or behavioural issues, sense of smell deficiencies, gastrointestinal issues and joint pains [4, 5]. Gradual progression of the disease causes difficulty in walking, talking and sleeping.
The conventional method of diagnosing Parkinson disease takes a lot of time to analyse the patient's behaviour by observing their day-to-day activities, motor skills such as walking, body movements and posture, and non-motor parameters such as sweating and joint pain, along with various neural parameters to evaluate the progression of the disease. However, these observations and analyses may not be helpful in identifying the disease at its earlier stage [6]. It is reported that speech disorder is an initial symptom that appears five years before the disease is diagnosed clinically [7, 8]. Nearly 89% of the people affected with Parkinson disease have voice disorders and impairments [9]. Some of the voice and speech disorders include softened voice, reduced vocal loudness, monotone speech with unchanging pitch, breathy and hoarse voice, tremor in the voice, trouble in pronouncing letters, and impaired, uncertain speech articulation [10].
Many researchers have shown that studying the speech signals of an individual can reveal the presence of Parkinson disease at an earlier stage [11]. For early diagnosis of the disease, the speech signals are recorded and evaluated to find impending abnormalities in the voice and speech which may not be perceptible to a listener [12, 13]. Thus the early diagnosis of Parkinson disease through voice and speech data has gained huge attention in recent days, which has motivated researchers to propose intelligent and expert systems that predict the disease.
Artificial intelligence and machine learning techniques are considered to be the
qualified field for critical classification problems [14]. However, though there exist a
large number of algorithms and models for classifying Parkinson disease at an initial
stage through voice and speech data, the accuracy, as well as the reliability, plays a
significant role in choosing a better algorithm for disease prediction. The model with
high prediction accuracy and reliability along with the low error rate is considered to
be a good algorithm for prediction concerning the medical field. Thus the research
on diagnosing Parkinson at an early stage using voice data has narrowed down their
research on improving the accuracy of disease prediction [15].
In this paper, an attempt has been made for predicting Parkinson’s disease at an
early stage with respect to the speech and voice dataset. To achieve this objective,
a predictive ensemble tree model has been proposed which builds a decision tree
at the first phase by segmenting the dataset into sets and subsets. At each node in
the decision tree, a naïve Bayesian-based conditional probability has been applied.

At a successive stage, logistic regression is applied to the leaf nodes of the decision
tree. For predicting the test data, cumulative voting is applied for the results obtained
from the three classifiers. Extensive experimental analysis has been made for the
proposed model using Parkinson datasets available publically at the UCI repository.
The obtained results are also compared with the existing models specifically proposed
for predicting Parkinson disease over voice data. The results show that the proposed
model provides better accuracy than several existing models under comparison.
The structure of the paper is organized as follows. Section 2 presents the various
existing works related to the field of study. Section 3 explains the proposed predic-
tive ensemble tree classifier model in detail along with the overall architecture of the
proposed model and the algorithm for the proposed predictive ensemble tree clas-
sifier. Section 4 discusses the experiment set up along with the dataset description,
model preparation and evaluation criteria. Section 5 presents the result analysis for
the proposed model using various performance metrics and the comparison of results
between the proposed and the existing models. Section 6 concludes the paper and
lists the future enhancements to be made.

2 Related Works

Several works have been proposed by researchers for effectively classifying and
predicting Parkinson disease by using voice and speech data. Various machine
learning algorithms are used widely such as neural networks [16, 17] and for more
efficacy, ensemble classifiers were also suggested for effective prediction of disease
[18]. A two dimensional expert system was proposed that selects optimal samples
and optimal features [13] for disease prediction. This model has an accuracy rate
of 97.5%. Sakar et al. collected various voice samples from healthy and Parkinson
patients and investigated voice data with various machine learning tools [19]. A
model that employs a fuzzy KNN classifier along with a feature reduction technique
was proposed to diagnose Parkinson disease [20]. This method outperforms the SVM
model with an accuracy of 96.07%. The method was extended with an adaptive fuzzy
k-nearest neighbour that applies PSO for optimizing the parameters and selecting
the features effectively [21]. A multi-edit-nearest-neighbour (MENN) algorithm and
an ensemble learning algorithm was suggested with an improvement of 45.72% of
accuracy than various other methods [22].
An effective model that utilizes optimal support vector machine (SVM) along with
bacterial foraging optimization (BFO) and relief feature selection was also proposed
to predict Parkinson by evaluating vocal datasets [23]. The authors extended the
work by implementing an enhanced fuzzy k-nearest neighbour (FKNN) method
combined with the chaotic bacterial foraging optimization with Gauss mutation
(CBFO) approach to detect Parkinson with voice samples. This evolutionary instance-
based learning approach provides an accuracy of about 97.89% [8]. However, the
methods are tested only with a small number of samples.

A projection-based learning for meta-cognitive radial basis function network (PBL-McRBFN) was proposed for predicting Parkinson at an early stage [24]. The
accuracy achieved for the standard vocal dataset seems to be 99.35%. A hybrid
intelligent system was suggested that includes feature pre-processing using a Gaus-
sian mixture model and a feature selection with various models and classification
models for diagnosing the disease. The model was proved to be effective with 100%
classification accuracy [25]. A similar method was suggested that uses a set of
feature selections and classifiers for improving the classifier accuracy in predicting
the disease [26–28]. A mel frequency cepstral coefficients (MFCCs) model that
extracts voiceprint from each voice sample was proposed to predict the disease which
produced an accuracy of about 82.5% [2]. Similarly, the integration of stacked gener-
alization and complementary neural networks was introduced to predict Parkinson’s
disease [29].
Regression is another powerful concept particularly used for prediction. Several
authors invested their time and effort for their contribution in building new regression
models such as the Logit leaf model [30] for customer churn prediction and logistic
model trees [31]. However, sparseness is the key issue for which householder trans-
formation based sparse least squares support vector regression was proposed and the
model was also tested with Parkinson dataset which has an RMSE value of 0.8081
[32]. An incremental support vector machine was developed for predicting UPDRS
prediction along with SOM and NIPALS for clustering. The model provides good
results with an MAE value of 0.4 [33, 34].
With the knowledge obtained from the above literature survey, several factors
are clearly identified. Machine learning and artificial intelligence algorithms such
as neural networks and support vector machines are highly used in predicting the
disease. However, due to the black box behaviour, understanding the process of
making a decision is more difficult. Multiple classifiers and ensemble classifiers are
employed for producing better results than other methods. In this case, choosing the
classifiers is most significant for producing better results with higher accuracy. Moti-
vated with the above-listed works, an ensemble classifier model has been proposed
that provides better accuracy.

3 Predictive Ensemble Tree Classifier Model

The proposed model is an ensemble classifier in which it uses decision tree induction
as the primary part with naïve Bayesian classifier and logistic regression classifier.
The Logit leaf model proposed by De Caigny et al. [30] is the base for the proposed
model. The model has two stages in which the first stage splits the dataset into
sets and subsets and applies naïve Bayesian probability at various levels based on
the consistent characteristics of the records and in the second stage, the logistic
regression is applied on the final level of the fragments. The overall design of the
proposed predictive ensemble tree model is presented in Fig. 1.

[Figure: pipeline from the input dataset through the decision tree, with conditional probabilities at internal nodes and posterior probabilities at leaf nodes (naïve Bayesian classifier), logistic regression at the leaf nodes, and cumulative voting over the DT, NB and LR predictions]
Fig. 1 Overall context of the proposed predictive ensemble tree model

Decision trees are a powerful machine learning algorithm for predicting class
labels. This supervised learning algorithm uses a tree structure to represent the given
input data. It uses a greedy approach with a divide and conquers based top-down
approach to construct the tree. The model has a single root node that contains the
entire dataset. The model undergoes a recursive splitting process in which optimal
split is carried out with a splitting criterion that specifies the best way to split the
dataset. Each split is based on the computation of attribute significance and thus each
internal node represents the attributes with a certain value that constitutes the subset
of the dataset from its parent node. Thus attribute selection is made at each step for
choosing the attribute for splitting the datasets based on its values. If all the instances
in the subset belong to the single class, then the splitting process is stopped. Finally,
the leaf nodes represent the class labels for each segmented subset.
The splitting process is carried out repeatedly until any of the stopping criteria
is encountered such as (1) all the subset belongs to a single class, (2) no attribute is

left or (3) no instance is left for the subset. Thus with the complete set of instances S
and attributes A, from the given dataset D, the decision tree can be built by splitting
the datasets into a set of internal nodes F and a set of leaf nodes T with a subset of
instances S_t at each leaf node t, and this can be represented as in Eq. (1).

$$S = \bigcup_{t \in T} S_t, \qquad S_t \neq \emptyset \ \text{and} \ S_t \cap S_{t'} = \emptyset \ \ \forall\, t \neq t',\ t, t' \in T \qquad (1)$$

The proposed model uses the C4.5 model for building the decision tree that uses
the information gain ratio for selecting the attributes. Though the decision trees are
considered to be effective among many machine learning algorithms, the recursive splitting increases the complexity of the tree with respect to its height. Also, the tree may end up with errors due to noise and outliers, raising overfitting issues. Thus several parameters exist to govern effective splitting and improve the efficiency of the tree, such as the number of leaf nodes, the number of instances at each leaf node and the estimated error rate of the tree. To avoid overfitting issues, the tree is pruned in such a way as to minimize complexity by removing the least reliable branches. The pruning stage improves the performance of the tree by providing a fast and better classification rate; it can be included either as a prior process during splitting or as a posterior step after tree construction, during which the least reliable branches are removed and a leaf node is put in their place. At each leaf node, the prediction percentage for each class can be computed, which is used for class prediction.
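As an illustration of the C4.5 attribute-selection step just described, the following sketch computes the information gain ratio for a nominal attribute. The helper names are ours, not the authors'; it assumes nominal attribute values and class labels given as plain Python lists.

```python
# Sketch of C4.5's splitting criterion: gain ratio = information gain / split info.
import numpy as np
from collections import Counter

def entropy(labels):
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the nominal attribute `values`."""
    n = len(labels)
    info_after, split_info = 0.0, 0.0
    for v in set(values):
        idx = [i for i, x in enumerate(values) if x == v]
        w = len(idx) / n
        info_after += w * entropy([labels[i] for i in idx])
        split_info -= w * np.log2(w)
    gain = entropy(labels) - info_after
    return gain / split_info if split_info > 0 else 0.0

# A perfectly separating attribute has gain ratio 1.
print(gain_ratio(["a", "a", "b", "b"], [1, 1, 0, 0]))
```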
Additionally, at each internal node, the naïve Bayesian classifier is applied to the
decision tree in the proposed model. It is a statistical classifier that predicts the class
membership probability of the given test instance based on the computed probability.
A naïve Bayesian classifier is a simple statistical model that applies Bayes theorem
with high accuracy and speed with minimum prediction error. The internal node
carries the conditional probability for the particular attribute value which undergoes
split and the leaf node carries the posterior probabilities for the class variables. Thus,
the conditional probability can be computed as in Eq. (2) for each internal node that
contains the subset of instances that specifies each value of the attribute. Here ‘n’
represents the number of classes in the dataset.
 
$$P(A \mid C_j) = \frac{P(A \cap C_j)}{P(C_j)}, \quad \forall j = 1, 2, \ldots, n \qquad (2)$$

Thus, the posterior probability at the leaf node can be computed by multiplying the
conditional probabilities obtained for the internal nodes of the branch corresponding
to the leaf node specific to each class with the posterior probability of that class.
Thus, the probability of the leaf node t particular to the class value C j is computed
by multiplying the probabilities of the class at each internal node corresponding to
the leaf t and is given in Eqs. (3) and (4).

        
$$P(t \mid C_j) \times P(C_j) = P(A_t \mid C_j) \times P(C_j) \qquad (3)$$

$$P(t \mid C_j) \times P(C_j) = P(a_1 \mid C_j) \times P(a_2 \mid C_j) \times \cdots \times P(a_m \mid C_j) \times P(C_j) \qquad (4)$$

The prior probability of the specific class is computed by dividing the number of
instances in the particular class by the number of instances in the dataset D as in
Eq. (5).
   
$$P(C_j) = \frac{|C_j|}{|D|} \qquad (5)$$
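Equations (3)-(5) reduce to a product of the conditional probabilities along the branch multiplied by the class prior, as the following sketch shows. The function name and the example numbers are ours and purely illustrative.

```python
# Sketch of Eqs. (3)-(5): naive Bayesian score of a leaf t for one class C_j.
import numpy as np

def leaf_posterior(branch_conditionals, prior):
    """P(a1|Cj) * P(a2|Cj) * ... * P(am|Cj) * P(Cj)."""
    return float(np.prod(branch_conditionals) * prior)

# Example: two internal nodes on the branch and an illustrative prior |Cj|/|D|.
print(leaf_posterior([0.6, 0.7], prior=0.25))  # -> 0.105
```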

Thus the leaf node contains the probability obtained by applying the naïve
Bayesian classifier. At a second stage, the logistic regression is applied to each leaf
node t for all the instances. Logistic regression is a good statistical method that not
only predicts or classifies the data but is also used to select the significant features
based on the coefficients computed using maximum likelihood estimation. Thus,
logistic regression is fit into the leaf node for classifying the instances. It also selects
the significant attributes of the entire model and for the subset of instances at each
leaf node. Upon training the model, the classification of the test data can be processed
by applying the model in which the class of the test instance can be classified using
the average cumulative votes.


$$C_j = \operatorname{Avgmax}_{c} \sum_{i=1}^{n} P_i \qquad (6)$$

Thus the class having maximum average cumulative value is considered to be the
label of the given test data. Here the Pi represents the prediction probability obtained
from the classifiers such as decision tree, logistic regression and naïve Bayesian
classifier used in the proposed model. The algorithm for the proposed model is
presented in Fig. 2.
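The prediction step of Eq. (6) can be sketched as follows, assuming each of the three trained classifiers returns one probability per class for the test instance; predict_label is a hypothetical wrapper, not part of the authors' implementation.

```python
# Sketch of cumulative voting (Eq. (6)) over the DT, NB and LR probabilities.
import numpy as np

def predict_label(p_dt, p_nb, p_lr, classes=("healthy", "parkinson")):
    votes = np.vstack([p_dt, p_nb, p_lr])  # shape: (3 classifiers, n classes)
    avg = votes.mean(axis=0)               # average cumulative vote per class
    return classes[int(np.argmax(avg))]

print(predict_label([0.30, 0.70], [0.20, 0.80], [0.45, 0.55]))  # -> "parkinson"
```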

4 Experimental Setup

For experimental analysis, various publicly available datasets have been used to evaluate the performance of the proposed model. The various experimental results obtained for the proposed model with these datasets are discussed in this section.

4.1 Dataset Description

One of the datasets used for the proposed study is the Oxford Parkinson's Disease Detection Dataset created by Max Little of Oxford University in collaboration with the

Input: Training data set D, attribute_list, test instance


Create a node N
If all instances of N ∈ same class Ci
Return N as a leaf node t
//Apply naïve Bayesian at each node
Compute the posterior probability NBt for node t and class Ci
If attribute_list = ɸ then
Return N as a leaf node t with majority class as a label
//Apply naïve Bayesian at each node
Compute the posterior probability NBt for node t and class Ci
For all attribute a in the attribute_list
Compute info_gain_ratio(a)
End For
Let abest be the attribute with the highest info_gain_ratio
Create a decision node F that splits the attribute abest
//Apply naïve Bayesian at each node
Compute the conditional probability P(F|Ci)
Split the list of instances based on the splitting of abest
If a list of instances is empty
Add N as a leaf node t
Else
Add N as a child node of F
End If
Compute the error rate of classification
Return N as a leaf node t
//Apply naïve Bayesian at each node
Compute the posterior probability NBt for node t and class Ci
For each leaf node t in T
//Compute prediction percentage for leaf nodes of the decision tree
Compute the percentage of instance at each class Ci as DTt
//Compute regression for each leaf node t
Define logistic regression
Do
Add attributes having minimum prediction error
Until stopping condition occurs
Compute prediction probability LRt for node t and class Ci
End For

//To predict the class label cl for the test instance


For each leaf node t in the decision tree
Compute cumulative voting (CV) for prediction probabilities of three methods Pi
For each class Ci from the dataset D
CVi = ∑k Pk(Ci), for k ∈ {DT, NB, LR}
End For
cl = max (avg(CVi))
End For
Return prediction (cl)

Fig. 2 Algorithm for the proposed predictive ensemble tree model



Table 1 Dataset used in the study


Dataset ID Dataset name No. of attributes No. of instances
DS1 Oxford Parkinson’s disease detection 23 195
DS2 Oxford Parkinson’s disease telemonitoring 26 5875
DS3 Parkinson speech (Istanbul University) 26 1040

National Centre for Voice and Speech [35, 36]. The dataset is based on the speech signals
recorded that contain a set of 23 real attributes related to a range of biomedical voice
measurements such as vocal fundamental frequency and its variations, amplitude and
tonal components, the ratio of voice and other complexity measures. The measure-
ments belong to 31 people among which 23 are having Parkinson’s disease with a
total of 195 voice recording samples. The class variable specifies the discrimination
of PD with status as 1 from healthy individuals having status as 0. The same authors
created another dataset in association with various medical centres in the US and
Intel Corporation by utilizing the telemonitoring device for recording the speech
signals [36, 37]. This Oxford Parkinson’s Disease Telemonitoring Dataset contains
5875 records having 26 attributes that include the patient details, motor UPDRS,
total UPDRS and other 16 biomedical voice measurements obtained from 42 people
with early-stage Parkinson’s disease.
Another dataset used for evaluating the proposed model is the Parkinson speech
dataset with various sound soundtracks recorded from 20 PD and 20 healthy individ-
uals by the Department of Neurology in Cerrahpasa Faculty of Medicine [19]. The
dataset contains 26 integer and real attributes that contains voice samples comprising
vowels, numbers, words and short sentences along with (Unified Parkinson’s Disease
Rating Scale) UPDRS score. All the datasets used for the analysis are obtained from
the UCI Machine Learning Repository, University of California, Irvine [38].
The dataset contains both training and test data with a total of 1040 records. The datasets used in the evaluation study are presented in Table 1.

4.2 Model Preparation

The data, as well as the model, are prepared initially for performing experiments
with the identified Parkinson datasets. Data Pre-processing is the most significant
step that provides various advantages to the model. First, it improves the predictive
performance of the entire model. The selection of features before performing classification or prediction increases the model's effectiveness considerably [30]. A minimum number of attributes in the dataset also reduces the time and cost of executing the model and improves its implementation and interpretability [39, 40].
The data pre-processing is carried out through several processes. First, the features
having numeric values are converted to nominal values. Here the idea is to map the
large set of feature values to a small set of range values [41]. The proposed model

uses a filter that discretizes a range of numeric attributes into nominal attributes using
binning in a given dataset. Second, data is normalized using min–max normalization
that applies a linear transformation as in Eq. (7). This process helps to reduce the
range of values of an attribute to make them fit into a specifically limited boundary.

$$\mathrm{Val}_{\mathrm{new}} = \frac{\mathrm{Val}_{\mathrm{old}} - \mathrm{Min}_{\mathrm{old}}}{\mathrm{Max}_{\mathrm{old}} - \mathrm{Min}_{\mathrm{old}}} \left( \mathrm{Max}_{\mathrm{new}} - \mathrm{Min}_{\mathrm{new}} \right) + \mathrm{Min}_{\mathrm{new}} \qquad (7)$$
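The two pre-processing steps described so far, equal-width binning to nominal values and the min-max rescaling of Eq. (7), can be sketched in a few lines; the function names and the toy column are ours, not the study's code.

```python
# Sketch of the first two pre-processing steps: Eq. (7) and binning.
import numpy as np

def min_max(x, new_min=0.0, new_max=1.0):
    """Eq. (7): linearly map x from [x.min(), x.max()] to [new_min, new_max]."""
    old_min, old_max = x.min(), x.max()
    return (x - old_min) / (old_max - old_min) * (new_max - new_min) + new_min

def discretize(x, bins=5):
    """Equal-width binning of a numeric attribute into `bins` nominal levels."""
    edges = np.linspace(x.min(), x.max(), bins + 1)
    return np.digitize(x, edges[1:-1])   # bin index in 0 .. bins-1

jitter = np.array([0.006, 0.003, 0.009, 0.002])  # toy voice-measurement column
print(min_max(jitter))        # values rescaled to [0, 1]
print(discretize(jitter, 3))  # nominal bin labels
```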

Third, the features that are important for the classification or prediction are chosen
initially through logistic regression that selects the significant features at each subset
of records segmented by the decision tree. However, for selecting significant features
from the given dataset, the model uses the multiple ranks with majority vote based
relative aggregate scoring model for Parkinson Dataset [42].
After performing the various pre-processing techniques on the given dataset, the next stage is to set the parameters for the proposed model. The proposed model utilizes C4.5 for implementing the decision tree [43]; thus, the parameters used for the proposed model are similar to those of the decision tree model. The best splitting attribute at each level and the pruning of the tree are entropy based. The hyperparameters used for implementing the decision tree in the proposed model are the confidence factor for pruning and the minimum leaf size; their values are chosen based on the optimal performance of the entire model.

4.3 Evaluation Metrics

Various evaluation metrics are available in the literature for evaluating the perfor-
mance of any model. This paper utilizes a few of the available measures such as
precision, recall, F1-measure, accuracy, AUC, logloss, mean squared error (MSE),
root mean squared error (RMSE), mean absolute error (MAE), R squared (R2 ). The
following are the formulas used to evaluate each metric.

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{F1 Measure} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Logloss} = \frac{-1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} x_{ij} \log\left(p_{ij}\right)$$

$$\text{Mean Squared Error} = \frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2$$

$$\text{Root Mean Squared Error} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2}$$

$$\text{Mean Absolute Error} = \frac{1}{N} \sum_{i=1}^{N} \left|y_i - \hat{y}_i\right|$$

$$R\text{-Squared} = 1 - \frac{\sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N} \left(y_i - \bar{y}\right)^2}$$

Apart from these formulae, AUC is also used as a measure that evaluates the classifier's ability to distinguish between the classes. It is a summary of the ROC curve, an evaluation metric for binary class problems, which is a probability curve plotting the true positive rate against the false positive rate.
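For completeness, the listed metrics can be computed directly from their definitions. The sketch below is a plain-NumPy illustration (binary labels and hard predictions assumed), not the evaluation code used in the study.

```python
# Sketch of the evaluation metrics defined above.
import numpy as np

def classification_metrics(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, f1, accuracy

def regression_errors(y_true, y_hat):
    mse = np.mean((y_true - y_hat) ** 2)
    mae = np.mean(np.abs(y_true - y_hat))
    r2 = 1 - np.sum((y_true - y_hat) ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return mse, np.sqrt(mse), mae, r2

y, p = np.array([1, 1, 0, 1, 0]), np.array([1, 0, 0, 1, 0])
print(classification_metrics(y, p))
print(regression_errors(np.array([3.0, 2.0, 4.0]), np.array([2.5, 2.0, 4.5])))
```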

5 Result Analysis

This section presents the various results obtained for the model proposed in the
study. It discusses the results for the proposed and the existing models for predicting
Parkinson disease. Initially, the performance of the individual and other base models used in the study is compared with that of the proposed model. The
various metrics such as precision, recall, f-measure, accuracy, AUC and Logloss for
various models under various test options with Oxford Parkinson dataset DS1 are
evaluated. The proposed model is compared with other models such as decision tree
(DL), logistic regression (LR), naïve Bayesian (NB), logistic model trees (LMT),
logit leaf model (LLM), random forest (RF), AdaBoost (AB), multilayer perceptron
(MLP), radial basis function (RBF), predictive ensemble tree (PET). The obtained
results are presented in Table 2.
The proposed predictive ensemble tree has a precision of about 97.3% with tenfold
cross validation, 94.3% with random sampling specifically with 66% of the training
set and 96.6% with leave one out option. Similarly, the accuracy of the proposed
model with tenfold cross validation, random sampling and leave one out are 97.89%,
94.3% and 95.8% respectively. With the cross validation test option, the logit leaf
model has the highest precision of 98.5%, F1 measure of 98% and accuracy of about
97.89% whereas the proposed model has the highest recall, accuracy and AUC as
98.6%, 98.01% and 97.6% respectively.
With the random sampling option having 66% of the training set, the proposed
model has the highest precision, recall, f1 measure and accuracy of 94.3%, 93.25%,
93.77% and 94.3% respectively whereas the random forest has the maximum AUC
value as 95.2%. By experimenting with the leave one out test option, the decision tree

Table 2 Performance evaluation for Oxford Parkinson dataset


Test options Various metrics DT LR NB LMT LLM RF AB MLP RBF PET
Cross validation Precision 88.9 84.6 84.6 89.2 98.5 93.2 90 87.3 87.9 97.3
Recall 88.7 85.1 76.4 87.2 97.5 94.5 89.2 87.7 87.2 98.60
F1 Measure 88.8 84.8 80.3 88.2 98.0 93.8 89.6 87.5 87.5 97.9
Accuracy 88.7 85.1 76.4 88.9 97.89 94.8 89.2 87.7 87.2 98.01
AUC 87.9 88.2 89.9 91.4 95.5 95.8 88 93.3 88.4 97.6
LogLoss 2.44 0.35 1.18 0.53 0.39 0.39 3.72 0.27 0.32 0.42
Random sampling Precision 87.4 83.4 83.4 88.4 92.5 93.2 86.2 86.5 88.8 94.3
Recall 87.2 83.3 75.5 84.5 91.6 92.9 86.1 86.9 88.2 93.25
F1 Measure 87.3 83.3 79.3 86.4 92.0 93.0 86.1 86.7 88.5 93.8
Accuracy 87.2 83.3 75.5 84.7 92.6 92.5 86.1 86.9 88.2 94.3
AUC 83.4 87.7 89.9 89.5 93.2 95.2 81.2 92.8 85.4 95.1
LogLoss 3.53 0.39 1.23 0.43 0.36 0.31 4.79 0.29 0.344 0.28
Leave one out Precision 96.6 84.1 83.8 85.6 95.2 88.4 89.3 88.5 87.9 96.6
Recall 96.5 84.6 75.9 85.9 96.6 88.7 89.2 88.7 87.2 95.6
F1Measure 96.5 84.3 79.7 85.7 95.9 88.5 89.2 88.6 87.5 96.1
Accuracy 96.6 84.6 75.9 88.1 93.6 88.7 89.2 88.7 87.2 95.8
AUC 94.3 89.3 89.2 89.5 97.5 93.8 85.8 93.2 88.6 96.4
LogLoss 2.39 0.35 1.18 0.51 0.31 0.26 3.72 0.269 0.314 0.24

and the proposed model have better results of 96% for precision, recall, f1 measure
and accuracy and the proposed model has the highest AUC value as 95.8%.
Also, the proposed model has an average of about 96.07% of precision, 95.82% of
recall, 95.94% of F1 measure, 96.04% of accuracy, 96.37% of AUC values among all
the test options. With this comparison, it is clear that the proposed model has better
performance when compared with several models under comparison. The values
presented in Table 2 are consolidated and are depicted as a graph in Fig. 3. In Fig. 3,
(a) precision values (b) accuracy values (c) AUC obtained and (d) logloss values
of the proposed model for various test options are compared with the various other
models evaluated with different test options given in Table 2 under comparison.
Other than traditional models, several authors have contributed to the effective classification of Parkinson disease through vocal analysis. Some of these works and the accuracies they achieve on the Oxford Parkinson dataset are presented in Table 3. As the proposed model has an average accuracy of 98.01% with tenfold cross validation, it provides better accuracy than four of the six models in comparison. In

[Figure: four bar-chart panels comparing the models of Table 2 under the CV, RS and LOO test options: a) precision for different test options, b) accuracy for different test options, c) AUC values for different test options, d) logloss values for different test options]

Fig. 3 Performance comparison on various models with Oxford Parkinson dataset



Table 3 Accuracy comparison of various models using Oxford Parkinson dataset


Method Accuracy
PCA-FKNN [20] 96.07 (avg. tenfold CV)
PSO-FKNN [21] 97.47 (tenfold CV)
PBL-McRBFN [24] 99.35% (hold-out)
Integration of feature weighting, selection and classifiers [25] 100% (tenfold CV)
SVM based on BFO [23] 97.42% (tenfold CV)
CBFO approach with FKNN (CBFO-FKNN) [8] 97.89% (tenfold CV)
Proposed predictive ensemble tree (PET) using a decision tree, logistic 98.01% (tenfold CV)
regression and Naïve Bayesian model

addition to the above experimental analysis, another evaluation has been made with
the DS2, Oxford Parkinson’s Disease Telemonitoring Dataset mentioned in Table 1.
The mean absolute error and root mean squared error is computed in which the
proposed model has 0.4568 and 0.6342 respectively. The results of the proposed
model have been compared with various other models including logistic regres-
sion, Gaussian process, neural network, simple linear regression, SMO with SVM
regression, random forest, multiple linear regression, adaptive neuro-fuzzy inference
systems, Incremental support vector machine and proposed predictive ensemble tree.
The obtained MAE and RMSE values for proposed and various models are shown
in Table 4.
The values for the different models using the Oxford Parkinson's Telemonitoring Dataset presented in Table 4 are shown as a graph in Fig. 4. From the analysis, it can be seen that the proposed model shows an improvement through its reduced MAE and RMSE values.

Table 4 Performance evaluation for Oxford Parkinson’s telemonitoring dataset


Various models Mean absolute error Root mean squared error
Logistic regression (LR) 0.9584 1.9260
Gaussian process (GP) 0.8611 1.9490
Neural network (NN) 0.9641 2.3486
Simple linear regression (SLR) 0.7208 1.2498
SMO with SVM regression (SVR) 0.7052 1.4734
Random forest 0.8075 1.0129
Multiple linear regression (MLR) 0.9920 2.4027
Adaptive neuro-fuzzy inference systems (ANFIS) 0.7570 1.6555
HSLSSVR [32] – 0.8081
SOM-NIPALS-ISVR [34] 0.4812 0.6683
Predictive ensemble tree (PET) 0.5468 0.7342


Fig. 4 Performance comparison with Oxford Parkinson’s telemonitoring dataset

Table 5 Performance evaluation for Parkinson speech dataset


Various models    Mean squared error    Root mean squared error    Mean absolute error    R squared
Decision tree 0.796 0.892 0.086 0.995
SVM model 0.757 2.544 1.025 0.128
Random forest 0.866 2.165 1.468 0.752
Adaboost 0.340 0.583 0.025 0.998
Proposed model 0.248 0.527 0.032 0.992

The model has better results than many of the models used for comparison. The Parkinson speech dataset with multiple types of sound recordings from Istanbul University is also used for evaluating the performance of the proposed model. The various metrics such as mean squared error, root mean squared error, mean absolute error and R squared are used for evaluation. The values obtained for the proposed and existing conventional models are presented in Table 5.
The evaluation results obtained for the proposed and existing models are shown as
a graph in Fig. 5. From the analysis, it is clear that the proposed model has low MSE
and RMSE values than the decision tree, support vector machine, random forest and
AdaBoost models whereas it has low MAE and R square values than most of the
models.
Apart from other conventional methods, the accuracy, sensitivity and specificity of the proposed model have been compared with other recent models available in the literature. The values are listed in Table 6. The accuracy of the proposed model for the Parkinson speech dataset is 90.2%, which is above the accuracy of seven of the nine existing models used for the comparison. Also, the model has sensitivity and specificity rates of 89.2% and 91.7%, which are also better than those of many of the algorithms used for evaluation. As the model has better performance, it could be used for identifying Parkinson's disease.

Fig. 5 Performance comparison for the Parkinson speech dataset

Table 6 Performance evaluation for Parkinson speech dataset


Method Accuracy (%) Sensitivity (%) Specificity (%)
KNN with SVM [19] 68.45 60 50
Machine learning system [26] 68.94 70.57 66.92
Multiple classifier framework [27] 87.5 90 85
Multi-edit nearest-neighbour (MENN) with ensemble learning algorithms [22] 81.5 90 85
Multi-decision trees [28] 66.5 – –
PD telediagnosis method [17] 94.17 50 94.92
MFCC with SVM [2] 82.5 80 85
Stacking with complementary neural networks (CMTNN) [29] 75 – –
Two-dimensional data selection method [13] 97.5 100 95
Proposed predictive ensemble tree (PET) 90.2 89.2 91.7

6 Conclusion

In this paper, a predictive ensemble tree, a machine learning-based ensemble classifier that combines a naive Bayesian classifier and logistic regression within a decision tree, is proposed for predicting Parkinson's disease at an early stage. The model has two stages: the first stage builds a decision tree by segmenting the records and applies naive Bayesian probability at each node, and the second stage applies logistic regression at the leaf nodes. To classify the data, the model uses cumulative voting for predicting labels. The model has been evaluated using three publicly available Parkinson voice datasets with various performance metrics. The experimental results show that the proposed model attains good results with 97.3% precision, 98.01% accuracy and 97.6% AUC on the Oxford PD dataset. Also, the model has 90.2%, 89.2% and 91.7% accuracy, sensitivity and specificity, respectively, on the Parkinson speech dataset. The model also achieves minimum MSE, RMSE and MAE values of 0.248, 0.527 and 0.032, respectively. The extensive experimental analysis shows that the model outperforms various conventional and previously reported models on the Parkinson datasets. Future work will focus on detecting Parkinson's disease early using various other attributes related to the disease, including gait analysis. The model can also be extended to other datasets in the healthcare domain.
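To make the two-stage idea above concrete, the sketch below approximates it with off-the-shelf R components (rpart, e1071 and glm as stand-ins) and majority voting over the three predicted labels. It is a minimal illustration on a built-in stand-in dataset, not the authors' implementation; in particular, the logistic regression is fitted globally rather than per leaf for brevity.

library(rpart)    # decision tree (stage 1)
library(e1071)    # naive Bayes, applied alongside the tree

df <- iris[iris$Species != "setosa", ]       # stand-in two-class data
df$Species <- factor(df$Species)

tree  <- rpart(Species ~ ., data = df, method = "class")
bayes <- naiveBayes(Species ~ ., data = df)
logit <- glm(Species ~ ., data = df, family = binomial)  # "leaf" regressor

p1 <- as.character(predict(tree,  df, type = "class"))
p2 <- as.character(predict(bayes, df))
p3 <- ifelse(predict(logit, df, type = "response") > 0.5,
             levels(df$Species)[2], levels(df$Species)[1])

# Cumulative (majority) voting over the three predicted labels
voted <- mapply(function(a, b, c) names(which.max(table(c(a, b, c)))),
                p1, p2, p3)
table(voted, df$Species)                     # confusion table of voted labels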

References

1. De Lau LM, Breteler MM (2006) Epidemiology of Parkinson's disease. Lancet Neurol 5(6):525–535
2. Benba A, Jilbab A, Hammouch A (2016) Analysis of multiple types of voice recordings in
cepstral domain using MFCC for discriminating between patients with Parkinson’s disease
and healthy people. Int J Speech Technol 19(3):449–456
3. Dorsey ER, Constantinescu R, Thompson JP, Biglan KM, Holloway RG, Kieburtz K, Marshall
FJ, Ravina BM, Schifitto G, Siderowf A, Tanner CM (2007) Projected number of people with
Parkinson disease in the most populous nations, 2005 through 2030. Neurology 68(5):384–386
4. Sage JI (2017) Parkinson’s disease: changes in daily life. In: Changes in the brain. Springer,
New York, pp 65–86
5. Mu J, Chaudhuri KR, Bielza C, de Pedro-Cuesta J, Larrañaga P, Martinez-Martin P (2017)
Parkinson’s disease subtypes identified from cluster analysis of motor and non-motor
symptoms. Front Aging Neurosci 9:301
6. Aich S, Kim HC, Hui KL, Al-Absi AA, Sain M (2019) A supervised machine learning approach
using different feature selection techniques on voice datasets for prediction of Parkinson’s
disease. In: 21st international conference on advanced communication technology (ICACT),
pp 1116–1121. IEEE
7. Harel B, Cannizzaro M, Snyder PJ (2004) Variability in fundamental frequency during speech in
prodromal and incipient Parkinson’s disease: a longitudinal case study. Brain Cogn 56(1):24–29
8. Cai Z, Gu J, Wen C, Zhao D, Huang C, Huang H, Tong C, Li J, Chen H (2018) An intelli-
gent Parkinson’s disease diagnostic system based on a chaotic bacterial foraging optimization
enhanced fuzzy KNN approach. Comput Math Methods Med 2018:1–24
9. Ramig L, Halpern A, Spielman J, Fox C, Freeman K (2018) Speech treatment in Parkinson’s
disease: randomized controlled trial (RCT). Mov Disord 33(11):1777–1791
10. Parkinson’s Foundation, Falls prevention. https://www.parkinson.org/pdlibrary/factsheets/Fal
lsPrevention. Retrieved 16 Oct 2020
11. Al-Fatlawi AH, Jabardi MH, Ling SH (2016) Efficient diagnosis system for Parkinson’s disease
using deep belief network. In: Congress on evolutionary computation (CEC). IEEE, pp 1324–
1330
12. Upadhya SS, Cheeran AN, Nirmal JH (2018) Thomson Multitaper MFCC and PLP voice
features for early detection of Parkinson disease. Biomed Signal Process Control 46:293–301
13. Ali L, Zhu C, Zhou M, Liu Y (2019) Early diagnosis of Parkinson’s disease from multiple
voice recordings by simultaneous sample and feature selection. Expert Syst Appl 137:22–28
14. Hassan MR, Kadir SO, Islam MA, Abujar S, Zannat R (2019) A knowledge base data
mining based on Parkinson’s disease. In: 8th international conference system modeling and
advancement in research trends (SMART). IEEE, pp 14–17
15. Froelich W, Wrobel K, Porwik P (2015) Diagnosis of Parkinson’s disease using speech samples
and threshold-based classification. J Med Imag Health Inf 5(6):1358–1363
16. Johri A, Tripathi A (2019) Parkinson disease detection using deep neural networks. In: Twelfth
international conference on contemporary computing (IC3). IEEE, pp 1–4

17. Zhang YN (2017) Can a smartphone diagnose parkinson disease? A deep neural network
method and telediagnosis system implementation. Parkinson’s disease
18. Patra AK, Ray R, Abdullah AA, Dash SR (2019) Prediction of Parkinson’s disease using
ensemble machine learning classification from acoustic analysis. In: Journal of physics:
conference series, vol 1372(1). IOP Publishing, p 012041
19. Sakar BE, Isenkul ME, Sakar CO, Sertbas A, Gurgen F, Delil S, Apaydin H, Kursun O (2013)
Collection and analysis of a Parkinson speech dataset with multiple types of sound recordings.
IEEE J Biomed Health Inform 17(4):828–834
20. Chen HL, Huang CC, Yu XG, Xu X, Sun X, Wang G, Wang SJ (2013) An efficient diagnosis
system for detection of Parkinson’s disease using fuzzy k-nearest neighbor approach. Expert
Syst Appl 40(1):263–271
21. Zuo WL, Wang ZY, Liu T, Chen HL (2013) Effective detection of Parkinson’s disease using
an adaptive fuzzy k-nearest neighbor approach. Biomed Signal Process Control 8(4):364–373
22. Zhang HH, Yang L, Liu Y, Wang P, Yin J, Li Y, Qiu M, Zhu X, Yan F (2016) Classification
of Parkinson’s disease utilizing multi-edit nearest-neighbor and ensemble learning algorithms
with speech samples. Biomed Eng Online 15(1):1–22
23. Cai Z, Gu J, Chen HL (2017) A new hybrid intelligent framework for predicting Parkinson’s
disease. IEEE Access 5:17188–17200
24. Babu GS, Suresh S (2013) Parkinson’s disease prediction using gene expression–a projection
based learning meta-cognitive neural classifier approach. Expert Syst Appl 40(5):1519–1529
25. Hariharan M, Polat K, Sindhu R (2014) A new hybrid intelligent system for accurate detection
of Parkinson’s disease. Comput Methods Programs Biomed 113(3):904–913
26. Cantürk İ, Karabiber F (2016) A machine learning system for the diagnosis of Parkinson’s
disease from speech signals and its application to multiple speech signal types. Arab J Sci Eng
41(12):5049–5059
27. Behroozi M, Sami A (2016) A multiple-classifier framework for Parkinson’s disease detection
based on various vocal tests. Int J Telemed Appl 2016:1–9
28. Vadovský M, Paralič J (2017) Parkinson’s disease patients classification based on the speech
signals. In: 15th international symposium on applied machine intelligence and informatics
(SAMI). IEEE, pp. 000321–000326
29. Kraipeerapun P, Amornsamankul S (2015) Using stacked generalization and complementary
neural networks to predict Parkinson’s disease. In: 11th international conference on natural
computation (ICNC). IEEE, pp 1290–1294
30. De Caigny A, Coussement K, De Bock KW (2018) A new hybrid classification algorithm
for customer churn prediction based on logistic regression and decision trees. Eur J Oper Res
269(2):760–772
31. Landwehr N, Hall M, Frank E (2005) Logistic model trees. Mach Learn 59(1–2):161–205
32. Zhao YP, Li B, Li YB, Wang KK (2015) Householder transformation based sparse least squares
support vector regression. Neurocomputing 161:243–253
33. Nilashi M, Ibrahim O, Ahani A (2016) Accuracy improvement for predicting Parkinson’s
disease progression. Sci Rep 6(1):1–18
34. Nilashi M, Ibrahim O, Ahmadi H, Shahmoradi L, Farahmand M (2018) A hybrid intelligent
system for the prediction of Parkinson’s disease progression using machine learning techniques.
Biocybern Biomed Eng 38(1):1–15
35. Little M, McSharry P, Roberts S, Costello D, Moroz I (2007) Exploiting nonlinear recurrence
and fractal scaling properties for voice disorder detection. Nat Preced 1–35
36. Little M, McSharry P, Hunter E, Spielman J, Ramig L (2008) Suitability of dysphonia
measurements for telemonitoring of Parkinson’s disease. Nat Preced 1–27
37. Tsanas A, Little M, McSharry P, Ramig L (2009) Accurate telemonitoring of Parkinson’s
disease progression by non-invasive speech tests. Nat Preced 1–10
38. Dua D, Graff C (2019) UCI machine learning repository [http://archive.ics.uci.edu/ml]. The
University of California, School of Information and Computer Science, Irvine, CA
39. Maldonado S, Pérez J, Bravo C (2017) Cost-based feature selection for support vector machines:
an application in credit scoring. Eur J Oper Res 261(2):656–665

40. Maldonado S, Flores Á, Verbraken T, Baesens B, Weber R (2015) Profit-based feature selection
using support vector machines–General framework and an application for customer retention.
Appl Soft Comput 35:740–748
41. Yadav N, Badal N (2020) Data preprocessing based on missing value and discretisation. Int J
Forensic Softw Eng 1(2–3):193–214
42. Arunraj G, Radha B (2021) Feature selection using multiple ranks with majority vote based
relative aggregate scoring model for Parkinson dataset. In: International conference on data
science and applications (ICDSA 2021)
43. Quinlan JR (2014) C4.5: programs for machine learning. Elsevier
Analysis of Radiation Parameters
of Kalasam-Shaped Antenna

Samyuktha Saravanan and K. Gunavathi

Abstract In this modern world, our traditional culture has left us with many things to be explored and admired. All Hindu temples have Kalasams at the top of the temple tower, in the form of an inverted pot with a pointed head facing toward the sky. There exist many superstitious misunderstandings regarding the structure of the Kalasam, its placement at the top of the temple tower and the filling materials within the Kalasam. The proposed work aims to show that the Kalasam acts as an antenna and that it is responsible for the positive energy felt inside the temple. The models of the Kalasams have been designed using the Ansys HFSS tool, and the feed is given by the line feeding technique. This feed is kept at the top of the Kalasam structure since the Kalasam absorbs energy from the atmosphere. The results have been obtained by constructing the hollow structure of the Kalasam, by varying the relative permittivity to analyze the significance of grains stored within the Kalasam, and by placing an array of Kalasams. The radiation pattern is observed to check whether it is directed toward the main deity or toward the atmosphere. The work also includes a study of the characteristics of grains placed within Kalasams and their contribution to the positive energy that is felt within the interior sanctum of the temple. The performance of an odd number of Kalasams in an array is observed to be better than that of an even number. This may be the reason for the positivity, goodness, serenity, or divinity one feels when approaching the interior sanctum, and the reason for circumambulation around the idol.

Keywords Kalasam antenna · Circumambulation · HFSS

Samyuktha Saravanan (B) · K. Gunavathi


PSG College of Technology, Coimbatore, India
K. Gunavathi
e-mail: kgy.ece@psgtech.ac.in

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_35

1 Introduction

In the olden days, temples were strategically situated in places where energy is abundantly available from the electric and magnetic field distributions of the north and south pole thrust. The main idol in the temple receives the beamed electromagnetic waves such that it has maximum energy around it. It acts like an antenna and focuses all the power onto the deity or the idol.
Most Hindu temples have Kalasams placed at the top of the Gopuram, also called the temple tower. These Kalasams are in the form of an inverted pot with a pointed head facing toward the sky. They are made of five metals, namely copper, brass, gold, silver, and lead, collectively called Impon. The Kalasam is filled with nine types of grains called Navadhanyam, namely barley, ragi, varagu, thinai, kambu, horsegram, saamai, cholam, and paddy. People believed that it acts as a lightning arrester. In addition, it is believed that Kalasams were used for storing grains so that, in case of damage to cultivation, they could serve as a redevelopment source in times of natural calamities like floods and earthquakes.
The Gopuram, on top of which Kalasams are placed, is believed to act as a lightning arrester engineered to guard the building in the event of a lightning strike. A metallic item mounted on top of the structure is electrically bonded through an electrical conductor to the ground. If lightning targets the temple, it will strike the rod and be conducted through the wire to the ground. If it instead passed through the building, it could affect the strength of the building and could also create fire or cause electrocution.
The science behind these constructions is that the temple architecture focuses the cosmic force inside the Garbhagriha toward the main idol. The Juathaskambam acts like an antenna. It plays a vital role in receiving the cosmic force from space and is linked to the main idol in the Garbhagriha through a subversive channel. The cosmic force continuously flows through the Juathaskambam to the statue of the idol and energizes it. The celestial power fetched through this process makes the idol powerful and gives it metaphysical powers. The pyramid-like construction of the Gopuram helps to intensify and protect the cosmic force inside the Garbhagriha. These could be the reasons why people feel positive energy, serenity, or divinity as they approach the interior sanctum of the temple. The copper plate is believed to absorb part of the ether that penetrates it, resulting in a powerful force said to penetrate the skin and heal humans, and this is one of the main reasons for putting the copper plate on the temple tower.
This paper explains the Kalasam antenna with the microstrip line feed technique and the design parameters for a single Kalasam. It also elaborates the significance of storing Navadhanyam within Kalasams by changing the relative permittivity (εr) values of the Kalasam used in the design. Following this, the construction of an array of Kalasams is analyzed.

2 Proposed Methodology

2.1 Design of Single Kalasam Antenna-Hollow Structure

The proposed work focuses on the design of a single Kalasam using the Ansys high frequency structural simulator (HFSS). Ansys HFSS is 3D electromagnetic (EM) simulation software for designing and simulating high-frequency electronic products. The 3D model of the Kalasam is designed based on the actual dimensions of Kalasams used in temples, and its feed is given by the line feeding technique. This feed is kept at the top of the Kalasam structure since the Kalasam absorbs radiation from the atmosphere. In this paper, since the Kalasams used in temples are always hollow in structure, hollow structures of the Kalasam antenna are designed and analyzed.
The height of the Kalasam antenna shown in Fig. 1 is 3/4 feet. The material used
to construct the Kalasam-shaped antenna in the design is copper, as copper has high
electrical conductivity and is the most common metal used to make Kalasams placed
on top of temple towers in temples.
The dimensions of various components used in the design of single Kalasam
antenna are shown in Table 1. The table lists the different components used to
construct the Kalasam-shaped antenna. The exact dimensions used in temples have
been incorporated in the design of the single Kalasam antenna.
The return loss for the design is shown in Fig. 2. At the valley point, the minimum return loss obtained is −16.2901 dB at a frequency of 46 GHz. Also, from the graph, it is seen that there are multiple valley points, which means that the designed antenna can operate at multiple frequencies in the range 10–50 GHz.
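As a quick sanity check on what such a return-loss figure implies, the short R sketch below converts the −16.29 dB valley into the corresponding reflection coefficient, VSWR and mismatch loss using the standard definitions; it is an illustrative calculation, not part of the HFSS workflow.

s11_db <- -16.2901                         # simulated return loss at 46 GHz
gamma  <- 10^(s11_db / 20)                 # |S11| = 10^(dB/20), about 0.153
vswr   <- (1 + gamma) / (1 - gamma)        # about 1.36, a reasonable match
mismatch_loss <- -10 * log10(1 - gamma^2)  # about 0.10 dB lost to reflection
round(c(gamma = gamma, VSWR = vswr, mismatch_dB = mismatch_loss), 4)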

Fig. 1 Design of single Kalasam antenna



Table 1 Dimensions of single Kalasam antenna


Shape Radius (cm) Height (cm)
Cone 2.5 6
Sphere (cut) 3 1
Cylinder 1 1.5 0.5
Sphere 1 2
Cylinder 2 1.5 0.5
Sphere 2 2.5
Cylinder 3 1.5 0.5
Hemisphere 5.5

Fig. 2 Return loss graph for 3/4 feet Kalasam (hollow)

As shown in Fig. 3, the obtained radiation pattern of the 3/4 feet single Kalasam antenna has two lobes. It can be seen that the lower lobe covers a larger area than the upper lobe, and the maximum radiation is directed toward the idol.

2.2 Design of Single Kalasam Antenna—Hollow Structure with Changes in Relative Permittivity

In practice, Kalasams are filled with Navadhanyam. It is believed that the purpose of filling grains inside the Kalasam is to make the Gopuram withstand lightning. The grains act as a natural lightning conductor, which plays a vital role in reducing the impact on the structures. For this reason, the Kumbam is also designed with a pointed edge facing toward the sky. Another reason is that these stored grains could be used to cultivate crops in case of damage to cultivation in times of floods or other natural calamities. The grains stored in the Kalasams are refilled and changed every 12 years during the festival called Kumbhabishekam.

Fig. 3 Radiation pattern for 3/4 feet Kalasam (hollow)

Relative permittivity describes the ability of a material to polarize when subjected to an electric field. Since the relative permittivity of the grains stored within Kalasams is difficult to determine, it cannot be incorporated directly into the design. Hence, the relative permittivity values used in the design have been varied and the performance of the Kalasam antenna has been analyzed. Thus, four different values of εr, namely 100, 1000, 5000, and 10,000, are applied to the hollow single Kalasam structure of height 3/4 feet.
The results obtained by altering the relative permittivity values of the Kalasam antenna are tabulated in Table 2. It can be observed that as the value of relative permittivity increases, the return loss decreases. When the relative permittivity is increased to 100, the size of the upper lobe decreases drastically and the maximum radiation is observed in the lower lobe; as the relative permittivity increases further, the size of the upper lobe increases slightly. It can also be inferred from the radiation patterns that, as the value of relative permittivity increases, more radiation is directed toward the idol and less radiation is directed toward the atmosphere.

Table 2 Comparison of single Kalasam antenna for different values of relative permittivity
Relative permittivity 100 1000 5000 10,000
Return loss (dB) −13.99 −18.13 −17.26 −23.74
Frequency (GHz) 36 24 43 27

3D polar plot (radiation pattern images for each εr value not reproduced)

Fig. 4 Comparison graph of electric field intensity for different values of relative permittivity

A graph comparing the electric field intensity for different values of relative permittivity is depicted in Fig. 4. It can be observed that there is almost nil radiation at the idol itself, while the maximum radiation is observed around the idol. This may be why only priests enter the Garbhagriha and the devotees worship the idol from outside, where the maximum radiation is observed. This could also be the reason for the circumambulation practiced in temples.
Thus, the incorporation of εr values into the Kalasam structure explains the significance of placing grains within the Kalasams. Rather than the energy being radiated to the atmosphere, the grains help in directing most of the positive energy toward the temple, around the Garbhagriha.

2.3 Design of Array of Kalasams

In practice, temple towers have an array of Kalasams. Thus, arrays of Kalasams, each of 3/4 feet height with a relative permittivity of 100, have been constructed using a separate source for each Kalasam and, alternatively, a common source for the entire array. The number of Kalasams in an array is also varied. Two different cases have been considered: one array with two Kalasams (an even number) and another array with three Kalasams (an odd number). The behavior of each case has been analyzed.
In real time, Kalasams are placed as an array in a linear fashion on top of the Gopuram in temples. The number of Kalasams placed is usually odd. Here, both even-numbered (two Kalasams) and odd-numbered (three Kalasams) arrays of Kalasam structures have been designed to understand the reason for placing an odd number of Kalasams on top of the Gopuram. The structures are 3/4 feet hollow structures with εr of 100, placed 25 cm apart, and are excited with a single common source and, alternatively, with a separate source for each Kalasam. The designs of the two and three Kalasam antenna arrays are shown in Figs. 5 and 6, respectively.
The results obtained from the two and three Kalasam antenna arrays excited by common and separate sources are tabulated in Table 3. On placing a series of Kalasams in a linear fashion, it can be seen that the radiation obtained from the Kalasams is directed towards the temple. There is no radiation present over the temple (none radiated back to the atmosphere), which means that there is no wastage of positive energy; it is completely focused within the temple. On comparing the return loss values of the various cases, it can also be observed that the antenna array with an odd number of Kalasams performs better than that with an even number.
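To give a feel for why element count changes the pattern, the sketch below evaluates the textbook array factor of a uniform linear array for two and three elements. The 25 cm spacing comes from the text and the 32 GHz frequency from Table 3, but the uniform in-phase excitation is an assumption rather than the paper's HFSS setup.

array_factor <- function(N, d, lambda, theta) {
  k   <- 2 * pi / lambda                    # free-space wavenumber
  psi <- k * d * cos(theta)                 # phase shift between elements
  sapply(psi, function(p) Mod(sum(exp(1i * (0:(N - 1)) * p))))
}
theta  <- seq(0, pi, length.out = 361)      # elevation angles
lambda <- 3e8 / 32e9                        # wavelength at 32 GHz
d      <- 0.25                              # 25 cm element spacing
af2 <- array_factor(2, d, lambda, theta)    # two-Kalasam array
af3 <- array_factor(3, d, lambda, theta)    # three-Kalasam array
plot(theta * 180 / pi, af3 / max(af3), type = "l",
     xlab = "theta (degrees)", ylab = "normalised |AF|")
lines(theta * 180 / pi, af2 / max(af2), lty = 2)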

Fig. 5 Design of two Kalasam array

Fig. 6 Design of three Kalasam array



Table 3 Comparison of array of Kalasam antennas excited by common and separate sources
Antenna array 2 3 2 3
Excitation Single Single Separate Separate
Return loss (dB) −24.04 −24.66 −21.55 −35.47
Frequency (GHz) 32 32 23 26

3D polar plot (radiation pattern images for each case not reproduced)

3 Conclusion

The Kalasam-shaped antennas have been found to be radiating, and they are operating
at a particular range of frequencies. Thus, it is proved that the Kalasams placed on
top of the Gopuram can act as antennas. Also, from the 3D polar plots obtained,
it is inferred that the maximum radiation from the Kalasam is directed toward the temple around the Garbhagriha. The incorporation of εr values into the Kalasam structure explains the significance of placing grains within the Kalasams. Rather than the energy being radiated to the atmosphere, the grains help in directing the positive energy toward the temple around the idol.
Array of Kalasams placed in a linear fashion radiates the positive energy
completely toward the temple around the Garbhagriha, rather than radiating it back
to the atmosphere. Since there is maximum radiation found around the idol placed in
Garbhagriha, this may also be a reason why people go around the Garbhagriha. Usually, people do not go inside the Garbhagriha except the priest, because the radiation is concentrated only around the Garbhagriha and not within it. It can also be observed that the antenna array with an odd number of Kalasams performs better than that with an even number, which could be the reason why an odd number of Kalasams is placed on top of the temple tower.
Thus, these interpretations may be the reasons for the positivity, goodness,
serenity, or divinity one feels when one approaches the interior sanctum and the
reason for circumambulation around the idol. The proposed work can further be
extended by studying the exact heights at which Kalasams are placed above the
ground and analyzing whether the entire temple architecture itself contributes to the
radiation field inside the temple.
Political Sentiment Analysis: Case Study
of Haryana Using Machine Learning

Dharminder Yadav, Avinash Sharma, Sharik Ahmad, and Umesh Chandra

Abstract Textual data is growing rapidly in the world as a result of the use of social media platforms including Twitter, Facebook, WhatsApp, Web 2.0, web articles and micro-blogging sites. More than 70% of people spend most of their time on social networks for personal communication and opinion sharing. They share opinions about products, services and general or political events with other users. Text data has been used in opinion mining and sentiment classification to learn what people think about products and political parties. Social networking sites produce a huge amount of unstructured text daily. For processing and extracting information from unstructured text, machine learning, natural language processing (NLP), information retrieval (IR) and data mining are essential. Twitter is the best source of unstructured data and is used for the analysis of emotion and the extraction of views. The main objective of this study is to forecast the result of voting in Haryana from tweets written in the English language. We used the Twitter TAGS Archiver tool and the Twitter API through R to get the tweets. R is a very powerful programming language that is widely utilised in data interpretation and sentiment analysis. The present study uses a corpus-based and dictionary-based method to find the sentiment of tweets. This study attempts to ascertain the mood of Twitter users in Haryana toward the political parties and quantifies the sentiment as positive, negative, or neutral.

Keywords Twitter · Sentiment analysis · Haryana election · Opinion mining · Social media · Natural language processing · R · Twitter API

D. Yadav (B) · S. Ahmad


Computer Science Department, Glocal University, Saharanpur, UP, India
A. Sharma
Computer Science, Maharishi Markandeshwar Deemed To Be University, Ambala, Haryana, India
e-mail: asharma@mmumullana.org
U. Chandra
Computer Science, Banda University of Agriculture and Technology, Banda, UP, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_36

1 Introduction

Due to technological advancement and the use of social networking sites, a huge amount of structured, semi-structured and unstructured data is produced daily and made available on websites and apps. Users share and exchange their opinions and ideas on these sites every day. Facebook, Twitter, WhatsApp and other social networking sites today act as a pool of opinions because people share their views on topics of interest. These opinions and ideas are in the form of text; opinion mining, text mining and sentiment analysis use this text data to classify the sentiment it expresses as positive, negative or neutral. Sentiment analysis and opinion mining are used by different organizations, such as government, healthcare, education, the engineering and technology sector, and the manufacturing and retail sector, to get opinions on their products and services. Text mining and opinion mining use NLP (Natural Language Processing) and NLTK (Natural Language Toolkit) to mine opinions on particular topics, such as products, election predictions, stock prices or services [1]. Sentiment analysis and opinion mining are essential for business management and for making the best decisions based on users' views extracted from text data. Sentiment assessment at the sentence level is a classification that assesses the polarity of short sentences. Aspect/entity-level sentiment analysis, also called augmented analysis, finds the sentiment of an entity or aspect. For example, in "My phone features are good but it hangs some time and picture quality is bad", the clause "phone features are good" shows positive sentiment but "picture quality is bad" shows negative sentiment.
Twitter is the most popular social networking site providing real-time micro-blogging services that let users share opinions, views and information in the form of text; these texts are called tweets, and the maximum size of a tweet is 140 characters. Twitter was launched in 2006, but users' information sharing and daily activity started in 2007 [2]. At present, Twitter has more than 467 million users who share or generate more than 175 million tweets daily. Tweets were collected in Hindi and English using the Twitter API and TAGS, considering the three major parties in Haryana: the ruling party as party 1, the opposition party as party 2 and the third-largest party as party 3. The tweets were labelled as positive, negative or neutral using the free tool R and its libraries.
In sentiment analysis, it is very difficult to predict the result of an election from the opinions of users. This paper tries to answer the following question using sentiment analysis: can the popularity of a political party on Twitter increase its chance of winning the election? To answer this question, we analyse Twitter tweets and study people's sentiment in terms of positive, negative and neutral polarity. The authors collect the tweets, filter them, pre-process them and then apply sentiment mining to predict the political situation in Haryana.
This paper introduces text mining and sentiment analysis on Twitter tweets, reviews the literature on political and general sentiment analysis, and describes the methodology used for tweet extraction, fetching and emotion analysis. The results are discussed and the polarity is represented graphically in the results section. The study wraps up with the various challenges present in sentiment analysis using machine learning methods and freely available tools such as R and Python with Twitter, and discusses the future scope of sentiment analysis.

2 Literature Survey

This part of the paper explains the related studies on opinion mining and sentiment analysis in different languages, the associated techniques and technologies, data collection and processing methods, micro-blogging sites and the algorithms required for these processes. The related work is divided into two classes: studies on general opinion analysis and political sentiment analysis.
Turney [3] and Pang et al. [4] started their studies in the field of emotion analysis with product and movie reviews. Turney [3] used the bag-of-words method and Pointwise Mutual Information (PMI) to evaluate phrase sentiment orientation. Pang et al. [4] used supervised learning with a range of n-gram features for binary sentiment classification at the document level, achieving nearly 83% accuracy with the unigram presence feature. Using Twitter and Blippr micro-reviews, Bermingham and Smeaton examined the hypothesis that sentiment analysis is easier in micro-texts than in long documents [5]. Twitter, blog posts and film reviews were experimented with, and it was found easier to identify feelings in the shorter texts.
Pak and Paroubek [6] proposed a Twitter-based sentiment mining model in which tweets were divided into two categories: positive and negative. They formed a tweet corpus by gathering tweets and annotating them automatically via emoticons using the Twitter API. Using this corpus, a multinomial Naive Bayes classifier was built that uses POS-tag and N-gram features to predict the sentiment. Liang et al. [7] used the Twitter API to gather tweets and to filter out opinion tweets. For the identification of polarity, a Naive Bayes model with unigrams was built, and mutual information and the Chi-square extraction process were used to eliminate undesirable features. However, the positive or negative prediction of posts on Twitter did not yield better results.
Rochmawati and Wibawa [8] used Twitter to learn about people's reactions to the Rohingya, dividing the reactions into three categories: positive, negative and neutral. Naive Bayes was used for grouping the data, and R and its libraries were used to predict people's opinions and plot the results in a diagram representing all possible sentiments of the tweets. Ultimately, there was no clear indication from the tweets because the sentiment of most tweets was unknown, positive and negative tweets were equal in number, and neutral tweets were fewer. Younis [9] used the Twitter API and the TwitteR package to search Twitter messages. They present an open-source approach for collecting, pre-processing, analysing and visualising Twitter microblog data using open-source software for text mining and sentiment analysis. In the Christmas season of 2014, they collected consumer feedback on two major retail stores in the U.K., namely Tesco and Asda. A sentiment study of consumer opinions makes it easier for companies to assess the competitive value of their goods and services in an evolving environment and to understand their customers' views, which can provide insight into potential marketing policies and strategies.
Pennebaker et al. [10] designed programs to retrieve emotion from tweets. LIWC is a text analysis programme that uses a psychometrically evaluated internal vocabulary to analyse the mental, cognitive and structural aspects of text samples. Tumasjan et al. [11] utilised Twitter tweets to forecast election outcomes in the 2009 German federal election, concluding that a party's number of tweets/mentions is directly related to its chances of winning. They gathered over 100,000 tweets mentioning a list of the six parties between August 13 and September 19, 2009.
Choy et al. [12] estimated the polling ratio for each candidate during the 2011 presidential elections in Singapore using online opinion. Wang et al. [13] proposed a real-time sentiment analysis system for political tweets based on the 2012 US presidential election that used a Naive Bayes model with unigram features and Amazon Mechanical Turk to gather sentiment annotations. Mishra et al. [14] employed natural language processing, machine learning and dictionary-based approaches for sentiment analysis of Digital India. They collected tweets from Twitter using the Twitter API and classified each tweet as positive, negative or neutral. The results clearly show that 50% of the tweets have positive, 20% negative and 30% neutral polarity. Some researchers work on sentiment analysis of local Indian languages such as Hindi, Marathi, Urdu, etc.
Das and Bandyopadhyay [15] prepared a Bengali SentiWordNet and defined four procedures to comprehend the meaning of a phrase. Joshi et al. [16] created a Hindi SentiWordNet lexical resource to grasp the meaning of a Hindi sentence, using methodologies such as in-language sentiment analysis, resource-based sentiment analysis and machine translation, with an accuracy of 78.14%. Chesley [17] uses a unique linguistic factor, verb class information, in conjunction with an online resource, the Wikipedia lexicon, to determine adjective polarity in topic- and genre-independent blog postings. Each blog article is graded as objective, positive, or negative, with 90.9 percent accuracy in classifying adjective polarity. Gune et al. [18] built a parser for the Marathi language. Mittal et al. [19] built a Hindi language corpus, improved the existing Hindi SentiWordNet and developed a practical method for extracting sentiment from Hindi text with 80% accuracy. The present paper analyses political sentiment, using English and Hindi tweets to predict the general election of Haryana.

3 System Analysis

Text mining is a Big Data and data mining technology. It employs the principles and techniques of data mining to find patterns and uncover knowledge and new information from huge amounts of unstructured text. To structure this unstructured text, text mining requires some initial steps called data pre-processing. Gaikwad proposed many methods and techniques: TBM (Term-Based Method), PBM (Phrase-Based Method), CBM (Concept-Based Method) and the Pattern Taxonomy Method (PTM).

Fig. 1 Text mining process

Search engines and Information Retrieval (IR) systems provide the facility to search for a specific keyword or query in text documents and efficiently return search results. Text mining uses data mining tools, clustering, classification and associative rules, to find new information and relationships in textual data. Figure 1 shows the text mining process. In the first step, a set of unstructured text (tweets, web data, Facebook posts, etc.) is collected; then pre-processing techniques are performed on the unstructured text to remove noise, spaces, stop words, prefixes and suffixes, and to apply stemming, giving structured text called the term-document matrix. In the final stage, data mining methodologies are applied to investigate term associations and patterns in the text, and tools like word clouds, tag clouds, ggplot and others are used to visualise the findings.

4 Twitter Sentiment Analysis

Sentiment analysis, first introduced by Liu [3], is also referred to as opinion mining and subjectivity analysis. Subjectivity analysis is a method of determining the polarity of human views expressed on Twitter in order to rate products or services. Microblogs are quick text messages, similar to tweets, which are limited to 140 characters and make it easier to find user thoughts.
This paper uses sentence-level sentiment analysis and a dictionary-based approach for analysing the sentiment of Twitter tweets. Emotion assessment can be performed using a variety of tools: NLTK, GATE (General Architecture for Text Engineering), Red Opal and OpinionFinder. NLTK is written in Python and is used for the sentiment analysis process: stop-word removal, tokenization, stemming and tagging. GATE is written in Java and is used for information extraction. Red Opal helps users who choose to purchase a commodity based on a variety of characteristics. OpinionFinder, a platform-independent tool written in Java, classifies sentences on the basis of their polarity. This experiment uses the tidytext and tm libraries of R for sentiment analysis and to find the polarity of tweets.

5 Methodology and Framework

Figure 2 shows the methodology used for mining the sentiment of Twitter tweets.

Fig. 2 Sentiment analysis diagram



The steps involved in the methodology are described in detail below:

5.1 Collecting Data

Public opinion data were collected using the TwitteR API and the TAGS Google Spreadsheet. We created an app account on Twitter (Fig. 3) using a Twitter account to obtain the Twitter key and tokens, which are used by the TwitteR library in R to return the tweets/data. We also used the Twitter Archiver TAGS, which establishes a connection with Twitter and imports all of the search results into a spreadsheet. The searched and retrieved tweets were saved in a database. Figures 4 and 5 show how we search tweets in R using the TwitteR library and the TAGS Archiver Google Spreadsheet.
We used different hashtags related to the different political parties in Haryana, such as #Huda, #INCHaryana, #bjp4haryana, #bjphr, @cmharyana, #Election2019, #MayerElection, @inldhr, INLD4HARYANA, etc., to search the trending tweets on Twitter which represent the political views of people. We collected 56,400 tweets related to the ruling party (political party 1 or Party 1), 9800 tweets for political party 3 (Party 3), 6203 tweets related to political party 2 (Party 2), 11,300 tweets for political party 4 (Party 4) and a few tweets from other political parties.
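A minimal sketch of this collection step is shown below, assuming the twitteR package and placeholder credential strings in place of the key and tokens generated for the app account; the hashtag and tweet count are illustrative.

library(twitteR)
# Placeholder credentials from the Twitter app account (not real values)
setup_twitter_oauth(consumer_key    = "CONSUMER_KEY",
                    consumer_secret = "CONSUMER_SECRET",
                    access_token    = "ACCESS_TOKEN",
                    access_secret   = "ACCESS_SECRET")
# Search one of the study's hashtags and keep English-language tweets
raw    <- searchTwitter("#bjp4haryana", n = 500, lang = "en")
tweets <- twListToDF(raw)                  # one row per tweet
write.csv(tweets, "party1_tweets.csv", row.names = FALSE)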

Fig. 3 App account on Twitter

Fig. 4 Search tweets using TwitteR



Fig. 5 Search tweets using TAGS spreadsheet

5.2 Data Pre-processing

Data pre-processing steps are stemming, tokenization, stop-word removal and normalization. We first remove irrelevant tweets and Twitter features from the Twitter data. Figure 6 shows the Twitter features, special characters and symbols, such as URL, mention, recency, hashtag, emoticons, friends, followers and retweet. The pre-processing steps are discussed briefly below:
Stemming: In this process, we recognise and remove the suffixes and prefixes from each word, like "ly", "ion", "tion", "pre", "re", "semi", etc.
Tokenization: This splits the stream of text into tokens, using white spaces and punctuation as token delimiters. It includes several steps: emoticons are replaced with their original meaning; white spaces, blank tweets, URLs and HTML links are removed; abbreviations like LOL, Plz, OMG, etc. are replaced with their original meaning; and pragmatics are handled, such as helloooooooo as hello, oooooooh as oh, hiiiiiiiiiiiii as hi, etc.
Stop word removal: In this step, we remove stop words such as prepositions (to, by, in, over, under), articles (a, an, the) and conjunctions (for, and, nor, but, or, yet, so).
All steps are performed using R libraries such as tidyverse, tidytext, stringr, RCurl, TwitteR, httr and tm; the libraries can be downloaded from https://cran.r-project.org. Figures 7 and 8 show the tweet pre-processing steps and the tweets after pre-processing, respectively.

Fig. 6 Twitter tweets without pre-processing

Fig. 7 Tweet pre-processing steps
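The sketch below outlines this cleaning pipeline with the tm package; it assumes the tweets data frame from the collection step and compresses the steps above into one pass, so it is an approximation of the workflow rather than the exact script behind Figs. 7 and 8.

library(tm)
library(SnowballC)                           # stemming backend for tm
strip_twitter <- function(x) {
  x <- gsub("http\\S+|www\\S+", " ", x)      # remove URLs and HTML links
  x <- gsub("RT|@\\w+", " ", x)              # retweet markers and mentions
  gsub("[^[:alnum:][:space:]]", " ", x)      # emoticons / special symbols
}
corpus <- VCorpus(VectorSource(strip_twitter(tweets$text)))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removeWords, stopwords("english"))  # stop words
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, stemDocument)       # strip suffixes such as -ly, -tion
corpus <- tm_map(corpus, stripWhitespace)
tdm <- TermDocumentMatrix(corpus)            # term-document matrix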
Sentiment analysis: In this stage, sentiment analysis is performed on the cleaned data to classify the emotion as positive, negative or neutral, using the sentiment and tidytext libraries in R. Figure 9 represents the emotions of each tweet.
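One way to reproduce the ten-way emotion breakdown reported in the result figures is the NRC lexicon as exposed by the syuzhet package; this is a hedged stand-in for the libraries named above, reusing the cleaned corpus from the previous sketch.

library(syuzhet)
text <- sapply(corpus, as.character)       # cleaned tweets from the tm corpus
nrc  <- get_nrc_sentiment(text)            # eight emotions + positive/negative
colSums(nrc)                               # tweet counts per emotion (cf. Fig. 9)
# Collapse to the three-way polarity used in the result figures
polarity <- ifelse(nrc$positive > nrc$negative, "positive",
            ifelse(nrc$negative > nrc$positive, "negative", "neutral"))
table(polarity)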

6 Results and Discussion

In this section, the details of the experiment and its results are discussed. The first aim of this experiment is the accuracy of the sentiment analysis. For this purpose, R and its tools are used for processing tweets and for sentiment analysis with wordclouds. A wordcloud, also called a tag cloud or text cloud, displays text data visually. These graphs and wordclouds display the textual data in an informative and interesting form. The results of the experiment and testing are as follows:

Fig. 8 Tweets after cleaning

Fig. 9 Emotion of tweets
Figure 10 shows all ten sentiments for the ruling party (Party 1): anger, positive, negative, disgust, anticipation, fear, joy, trust, sadness and surprise. The largest group of tweets is positive (2714 tweets), and the second-largest is negative (2602 tweets). This means that the people of Haryana lean positive toward the present government, but there is only a minor difference between positive and negative voters, and a large share of voters are angry and unhappy with the present government. Figure 11 shows the result of the text analysis by plotting the wordcloud.
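A wordcloud such as the one in Fig. 11 can be produced from the term-document matrix built earlier; the sketch below is a generic recipe with the wordcloud package, with the colour palette and word limit as arbitrary choices.

library(wordcloud)
library(RColorBrewer)
freq <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)  # term frequencies
wordcloud(names(freq), freq, max.words = 100,
          random.order = FALSE, colors = brewer.pal(8, "Dark2"))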
Figure 12 shows the sentiment of the #mlkhatter (CM Haryana) tweets. We found 56,400 tweets, of which 15,426 are positive, 10,239 trust, 11,492 negative, 10,292 trust and 7153 anger tweets, etc. Our result shows that most of the people think positively of and trust the Chief Minister of the present government. Figure 13 shows the result of grouping opinions as positive, negative and neutral: the neutral group has the highest number (40,213 tweets), while the positive (9436 tweets) and negative (6751 tweets) groups differ by 2685 tweets. Figure 14 shows the wordcloud of #mlkhatter.
Fig. 10 Sentiment of ruling party tweets

Fig. 11 Wordcloud ruling party

Fig. 12 Sentiment of CM Haryana tweets

Figures 15 and 16 show the sentiment for the main opposition party. For party 2 (the leader of the opposition in the Haryana assembly), we collected 6203 tweets, but after processing very few tweets were left. Figure 15 shows the negative, positive and neutral opinion groups. Most of the tweets are negative or neutral and very few are positive, which means that voters think negatively regarding party 2; one reason behind this is the internal fighting among its leaders, workers and supporters. Figure 16 shows the wordcloud of party 2. Figure 17 shows all ten sentiments of the tweets; the numbers of tweets with positive and trust emotions are the same.
Fig. 13 Sentiment of CM Haryana

Fig. 14 Wordcloud CM Haryana

Fig. 15 Sentiment of party 2

Fig. 16 Wordcloud party 2

Fig. 17 Ten sentiments of party 2

Fig. 18 Sentiment of political party 3

Figures 18 and 19 show the sentiment for the third political party in Haryana. Figure 18 shows the negative, positive and neutral opinion groups. Most of the tweets are neutral or positive and very few are negative, which indicates that the voters of Haryana are neutral regarding the Congress party, with few people negative regarding party 3. Figure 19 shows all the sentiments of party 3; in this figure the largest group is positive, and the second-largest group expresses trust in party 3.
Figures 20 and 21 show the sentiment for political party 4 in Haryana. Figure 20 shows the negative, positive and neutral opinion groups: most of the tweets are neutral (6196) or positive (2428) and very few are negative (1376). Figure 21 shows all the sentiments of the political party 4 tweets; it shows that most of the people trust political party 4.
Fig. 19 Ten sentiments of political party 3

Fig. 20 Sentiment of political party 4

Machine learning: Naive Bayes, based on Bayes' theorem, is a supervised machine learning algorithm, and it is used here to find the accuracy, precision, recall, F1 score, specificity and balanced accuracy of the experiments with different n-grams. Table 1 shows the accuracy, precision, recall, F1 score, specificity and balanced accuracy. "RTextTools" is the R package used for text data classification, and it gives the precision and recall. Five algorithms, SVM, SLDA, LOGITBOOST, MAXENTROPY and GLMNET (support vector machine, scaled linear discriminant analysis, boosting, maximum entropy and regularized generalized linear models, respectively), were used, and the precision and recall of each algorithm were found. Table 2 shows the precision, recall, F score, accuracy, n-ensemble coverage and recall of the five algorithms.
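A hedged sketch of the five-algorithm comparison with RTextTools follows; the train/test split sizes are placeholders, and the tweets data frame with its polarity labels is carried over from the earlier sketches.

library(RTextTools)
doc_matrix <- create_matrix(tweets$text, language = "english",
                            removeStopwords = TRUE, removeNumbers = TRUE,
                            stemWords = TRUE)
container <- create_container(doc_matrix, as.numeric(factor(polarity)),
                              trainSize = 1:400, testSize = 401:500,
                              virgin = FALSE)
models    <- train_models(container, algorithms = c("SVM", "SLDA",
                                                    "BOOSTING", "MAXENT",
                                                    "GLMNET"))
results   <- classify_models(container, models)
analytics <- create_analytics(container, results)  # per-algorithm precision,
summary(analytics)                                 # recall and F-scores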

Fig. 21 Ten sentiments of political party 4

7 Conclusions

This is a technical paper that used Twitter to collect tweets as raw data and perform sentiment analysis. The current study constructed corpora for the ten sentiments of tweets and for positive, negative and neutral tweets, and created the wordclouds. The paper uses the R language and RStudio to implement sentiment analysis; this is an open toolchain for text mining and sentiment analysis of Twitter data, used to analyse user reviews of products and services, and it also helps to improve business value and customer relationships. Based on the results of the above experiments, the current study concludes that most of the tweets are neutral and fewer tweets are positive or negative. The neutral tweets are a combination of the remaining eight emotions: anger, disgust, anticipation, fear, joy, trust, sadness and surprise. This may be due to the incompleteness of the dictionary or to technical reasons. This paper also shows that there is a need to improve Twitter sentiment analysis and text mining. Despite these technical problems, a huge number of industries use social media sites for opinion mining and sentiment analysis to deliver better services, improve sales and obtain user reviews at a very low cost.
Table 1 Accuracy, precision, recall, F1 score, specificity and balanced accuracy for the Naive Bayes classifier

Algorithm used Party N-gram Accuracy Class Precision Recall/sensitivity F1 score Specificity Balanced accuracy
NAIVE BAYES THEOREM CM Haryana 2 91.12 Negative 40.00 97.56 15.68 99.81 54.78
Neutral 94.86 95.19 95.02 58.90 77.04
Positive 55.65 59.25 57.39 94.81 77.03
5 90.57 Negative 50.00 04.87 08.88 99.93 52.40
Neutral 94.88 95.46 95.17 58.90 77.18
Positive 55.39 58.64 56.97 94.81 76.73
Party1 2 91.14 Negative 43.53 55.22 48.68 96.95 76.08
Neutral 99.27 95.12 97.15 95.35 95.23
Positive 60.95 78.04 68.44 95.03 86.53
5 90.02 Negative 74.19 99.00 84.82 86.84 92.92
Neutral 100 84.78 91.76 100.00 92.38
Positive 97.08 92.74 94.86 99.45 96.09
Party2 2 86.95 Negative 30.95 56.52 40.00 92.42 74.47
Neutral 95.92 91.07 93.44 81.43 86.25
Positive 75.56 72.34 72.91 93.94 84.64
5 86.21 Negative 30.00 52.17 38.10 92.69 72.43
Neutral 96.20 90.48 93.25 82.86 86.67
Positive 68.00 72.34 70.10 95.54 83.94
Party3 2 95.59 Negative 37.50 16.15 22.58 99.16 57.65
Neutral 96.94 98.76 97.84 75.37 87.06
Positive 90.06 90.83 90.44 99.10 94.97
5 94.67 Negative 36.00 34.62 35.29 98.06 66.34
Neutral 97.43 99.05 98.23 79.33 89.19
Positive 83.51 69.63 75.94 98.77 84.20
Party 4 2 86.11 Negative 97.92 62.67 76.42 99.65 81.16
Neutral 89.55 94.55 91.98 85.37 89.96
Positive 72.40 86.34 78.75 90.58 88.46
5 82.43 Negative 100.00 55.29 72.21 100 77.65
Neutral 88.35 91.26 89.78 88.35 89.80
Positive 67.38 90.97 77.42 84.16 87.57

Table 2 Precision, recall, F score, accuracy, n-ensemble coverage and recall of the five algorithms
Party name Name of algorithm Precision Recall F score Accuracy n-Ensemble Coverage Recall
Party 1 SVM 69.00 57.00 61.33 93.00 n> =1 100 91
SLDA 62.00 51.67 54.33 92.28 n> =2 100 91
LOGITBOOST 67.67 61.67 63.00 94.32 n> =3 100 91
GLMNET 82.00 51.33 56.33 72.45 n> =4 95 94
MAXENTROPY 58.33 58.00 58.00 92.88 n> =5 86 96
Party 2 SVM 59.33 56.00 56.67 89.82 n> =1 100 85
SLDA 62.33 65.33 63.33 91.18 n> =2 100 85
LOGITBOOST 58.33 53.33 54.33 94.51 n> =3 99 86
GLMNET 63.00 57.33 59.33 59.58 n> =4 88 86
MAXENTROPY 56.00 61.00 57.67 94.84 n> =5 79 95
Party 3 SVM 96.00 80.00 86.33 95.92 n> =1 100 94
SLDA 95.00 69.33 77.00 95.76 n> =2 100 94
LOGITBOOST 94.67 70.67 78.67 96.71 n> =3 100 94
GLMNET 96.33 77.33 84.33 71.01 n> =4 99 94
MAXENTROPY 88.67 81.33 84.33 96.52 n> =5 92 95
Party 4 SVM 84.67 82.33 83.33 80.73 n> =1 100 85
SLDA 80.67 78.67 79.67 77.19 n> =2 100 85
LOGITBOOST 81.67 77.33 79.00 94.30 n> =3 97 87
GLMNET 87.33 81.33 83.67 43.29 n> =4 88 90
MAXENTROPY 77.67 79.33 78.00 79.48 n> =5 72 96
CM Haryana SVM 90.67 67.67 74 95.48 n> =1 100 95
SLDA 88.67 62.67 70 94.73 n> =2 100 95
LOGITBOOST 97.67 67.33 75 95.90 n> =3 100 95
GLMNET 97.33 67.67 75 85.52 n> =4 100 95
MAXENTROPY 66.67 82.33 64.33 74.98 n> =5 82 99

References

1. Berry MW (2004) Automatic discovery of similar words, in survey of text mining clustering,
classification, and retrieval. Springer Verlag, New York, LLC, pp 24–43
2. Boyd D, Golder S, Lotan G (2010) Tweet, tweet, retweet: conversational aspects of retweeting
on twitter, system science (HICSS)
3. Turney PD (2002) Thumbs up or thumbs down? Semantic orientation applied to unsupervised
classification of reviews
4. Pang B, Lee L, Vaithyanathan SK (2002) Thumbs up? Sentiment classification using machine
learning techniques. In: Proceedings of the conference on empirical methods in natural language
processing, vol 10
5. Smeaton AF, Bermingham A (2010) Classifying sentiment in microblogs: is brevity an advan-
tage? In: Proceedings of the 19th ACM international conference on information and knowledge
management, pp 1833–1836

6. Pak A, Paroubek P (2010) Twitter as a corpus for sentiment analysis and opinion mining. In:
Proceedings of the seventh conference on international language resources and evaluation, pp
1320–1326
7. Liang P-W, Dai B-R (2013) Opinion mining on social media data. In: IEEE 14th international
conference on mobile data management, Milan, Italy, June 3–6, pp 91–96
8. Rochmawati N, Wibawa SC (2013) Opinion analysis on Rohingya using twitter data. IOP Conf
Ser: Mater Sci Eng
9. Younis EMG (2015) Sentiment analysis and text mining for social media microblogs using
open source tools: an empirical study. Int J Comput Appl (0975–8887) 112(5)
10. Pennebaker JW, Chung CK, Ireland M, Gonzales A, Booth RJ (2007) The development and
psychometric properties of LIWC
11. Tumasjan A, Sprenger TO, Sandner PG, Welpe IM (2010) Predicting elections with Twitter:
what 140 characters reveal about political sentiment. ICWSM, pp 179–185
12. Choy M, Cheong LFM, Ma NL, Koo PS (2013) A sentiment analysis of Singapore presidential
election 2011 using Twitter data with census correction
13. Wang H, Can D, Kazemzadeh A, Bar F, Narayanan S (2012) A system for real-time twitter
sentiment analysis of 2012 US presidential election cycle. In: Proceedings of the ACL 2012
system demonstrations, Association for Computational Linguistics, pp 115–120
14. Mishra P, Rajnish R, Kumar P (2016) Sentiment analysis of twitter data: case study on digital
India. In: International conference on information technology, pp 148–153
15. Das A, Bandyopadhyay S (2010) SentiWordNet for Bangla. Knowledge sharing event-4: task,
vol 2
16. Joshi A, Balamurali AR, Bhattacharyya P (2010) A fallback strategy for sentiment analysis
in Hindi: a case study. In: Proceedings of ICON 2010: 8th international conference on natural
language processing
17. Chesley P (2006) Using verbs and adjectives to automatically classify blog sentiment. In:
Proceedings of AAAI-CAAW-06, the spring symposia on computational approaches
18. Gune H, Bapat M, Khapra MM, Bhattacharyya P (2010) Verbs are where all the action lies:
experiences of shallow parsing of a morphologically rich language. In: Proceedings of the 23rd
international conference on computational linguistics, pp 347–355
19. Mittal N, Agarwal B, Chouhan G, Bania N, Pareek P (2013) Sentiment analysis of Hindi review
based on negation and discourse relation. In: Proceedings of international joint conference on
natural language processing, pp 45–50
20. Godbole N, Srinivasaiah M, Skiena S (2007) Large-scale sentiment analysis for news and blogs.
In: Proceedings of the international conference on weblogs and social media (ICWSM)
Efficient Key Management for Secure
Communication Within Tree
and Mesh-Based Multicast Routing
Protocols

Bhawna Sharma and Rohit Vaid

Abstract An important component of secure multicast is key management, which is used to provide security. Security is a major challenge in mobile ad-hoc networks because of their inherently open nature, and this vulnerability cannot simply be removed. For this reason, key management for multicast routing protocols on mobile ad-hoc networks (MANETs) must cope with the lack of a centralized authority, limited bandwidth, dynamic topology and the need for strong connectivity. In other words, key distribution may be the most important challenge for this dynamic kind of network. In this paper, we use different parameters (average packet delivery ratio, overhead, throughput, end-to-end delay and packet drop) for measuring the overall performance of tree-based and mesh-based routing protocols with the use of Traffic Encryption Keys (TEKs), i.e. private key infrastructure and public key infrastructure, inside the multicast group.

Keywords Multicast · Protection · Symmetric key · Public key cryptosystem · Key management center · Security · TEK

1 Introduction

Mobile Ad-Hoc Network (MANET) [1, 2] is a decentralized, self-organizing system that does not rely upon fixed communication facilities. An ad-hoc network is a collection of nodes that can move anywhere at will; the network nodes are distributed dynamically, nodes connect to others through the wireless network, and each connected node plays a double role, acting both as a terminal and as a router.
Compared to unicast communication, multicast communication is the best way to deliver data to the participating nodes. A multicast network is dynamic in nature. Secure multicast is the internetwork service that securely delivers data from a source to multiple selected receivers. The inherent features of multicast routing also make it prone to attack unless they

B. Sharma (B) · R. Vaid


Department of Computer Science and Engineering, MMEC, MM (Deemed To Be University),
Mullana, Ambala, India


The aim is to secure these services while maintaining the advantages of multicast delivery.

1.1 Multicasting Routing

There are three different types of routing in networks: unicast, multicast and broadcast. Unicast is one-to-one communication and is suitable for small networks. In multicast, one-to-many routing takes place: data is sent from one node to a selected set of other nodes [3]. In broadcast, one-to-all communication is done [4]: packets are sent to all nodes in the network, and nodes that are not members of the multicast group simply drop the packets. Multicast can thus be viewed as lying between unicast and broadcast. Below are a few requirements for multicasting [5]:
1. Multicast addresses must be identifiable; in IPv4, the class D address range is reserved for this purpose.
2. Each node must translate the IP multicast address used within the multicast group.
3. A router has to map between an IP multicast address and a subnetwork multicast address in order to deliver a multicast IP datagram on the network [5].
4. Multicast membership is dynamic: any host can join or leave the multicast group (a socket-level sketch follows this list).
5. Routers have to exchange information: first, a router must know which subnets contain members of the multicast group, and second, routers must calculate the shortest distances between the different multicast subnetworks.
6. A routing algorithm is used to calculate the shortest paths to all networks.
7. Each router determines the shortest path from the sender node to each destination node.
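To make requirements 1-4 concrete, here is a minimal Python sketch, offered purely as an illustration (the group address 239.1.2.3 and port 5007 are arbitrary example values, not anything prescribed in this paper): a host joins an IPv4 class D group through the standard socket API and can leave it again at will.

import socket
import struct

GROUP = "239.1.2.3"   # example class D (multicast) address
PORT = 5007           # example port

# Create a UDP socket and bind it to the multicast port
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# Join the group: the kernel maps the IP multicast address to a
# link-layer multicast address (requirement 3) and starts delivery
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

# data, addr = sock.recvfrom(1024)   # receive datagrams sent to the group

# Membership is dynamic (requirement 4): the host may leave at any time
sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, mreq)
sock.close()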
Multicast routing is a network-layer function that constructs paths along which data packets from a source are distributed to reach several, but not all, destinations across a network. Multicast routing sends one copy of a data packet at the same time to more than one receiver over connected links that are shared by the paths to the receivers. The sharing of links within the collection of paths to the receivers implicitly defines a tree used to distribute multicast packets (Fig. 1).

1.2 Multicast Tree Classification

Multicast trees are classified into two categories: (i) source-based tree and (ii) group-shared tree.

Fig. 1 Classification of multicast routing protocols [6]

Source-based tree (SBT) This corresponds to the Source-Specific Multicast (SSM) approach. End-to-end communication delay is minimal, but it is not suitable for large networks. Source-based protocols include PIM-DM, DVMRP and MOSPF.

Group-shared tree This is a core-based tree in which data is transmitted through a core router between nodes inside the network. The delay in the tree is greater than in the SBT, but it is scalable (suitable for big networks). CBT and PIM-SM are group-shared protocols.

1.3 Multicast Mesh Classification

Mesh-based schemes establish a mesh of paths that connect the sources and destinations [7]. This redundancy brings high dependability and reliability but may also introduce significant packet overhead. Examples include the On-Demand Multicast Routing Protocol (ODMRP), the Dynamic Core-based Multicast Protocol (DCMP), the Core-Assisted Mesh Protocol (CAMP), the Neighbor-Supporting Ad-hoc Multicast Routing Protocol (NSMP), the location-based multicast protocol and the Forwarding Group Multicast Protocol (FGMP) [8].

2 Security in Multicasting in MANET

The basic security aspects in MANET are as follows:


• Authentication services provide assurance of a host's identity. Authentication mechanisms are regularly applied to many aspects of multicast communications, and strong authentication mechanisms are recommended for secure multicast applications. Digital signature schemes, such as the Digital Signature Standard (DSS), are valid examples of authentication mechanisms based on public key technology [9].
• Confidentiality ensures that network data cannot be disclosed to unauthorized entities. Integrity is essential to protect the data transferred between the nodes.
• Availability ensures that all data and services available in the network can be accessed by every authorized node. The dynamic topology and open-boundary nature of MANETs make availability a challenge; the time needed to access network services or data is an important security parameter.
• Nonrepudiation ensures that a forwarded message cannot be denied by the message originator [10].

3 Key Management

Key management is the process of creating, distributing and updating the keys for a stable and secure communication network [11]. The Traffic Encryption Keys (TEKs) and Key Encryption Keys (KEKs) are used for encryption and decryption. In secure multicast communication, every member holds a key to encrypt and decrypt the multicast data. The process of updating and distributing the keys to the nodes in the network is the rekeying operation [12]. As the group membership changes, key management performs several key exchanges per unit time to provide forward and backward secrecy [13]. Multicast key management is classified into two types of schemes, centralized and distributed. In a centralized scheme, a Group Controller (GC) performs the network key management, and only lightweight operations are carried out at the nodes of the network. In a distributed scheme, key management is done by every user to connect securely to the other users [14, 15].
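The centralized scheme can be made concrete with a small sketch. The fragment below is only an illustration under stated assumptions, not the authors' implementation: a hypothetical GroupController keeps one KEK per member and, on every join or leave, generates a fresh TEK and returns it wrapped in each remaining member's KEK (the Fernet recipe from the Python cryptography package stands in for whichever cipher a real deployment would use).

from cryptography.fernet import Fernet

class GroupController:
    """Minimal sketch of a centralized Group Controller (GC)."""

    def __init__(self):
        self.keks = {}                      # member id -> KEK shared with the GC
        self.tek = Fernet.generate_key()    # current group TEK

    def join(self, member):
        self.keks[member] = Fernet.generate_key()   # KEK agreed out of band
        return self.rekey()   # backward secrecy: newcomer cannot read old traffic

    def leave(self, member):
        del self.keks[member]
        return self.rekey()   # forward secrecy: leaver cannot read new traffic

    def rekey(self):
        # Rekeying operation: fresh TEK, unicast to each member under its KEK
        self.tek = Fernet.generate_key()
        return {m: Fernet(kek).encrypt(self.tek) for m, kek in self.keks.items()}

gc = GroupController()
gc.join("node-A")
rekey_msgs = gc.join("node-B")   # every current member receives the new TEK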

3.1 Traffic Encryption Keys

Many communication systems enable secure exchanges between senders and receivers by using symmetric traffic encryption keys (TEKs), comprising an encryption key and a decryption key that are identical and that are used to keep the data link between the nodes private.
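A minimal sketch of this symmetry follows, again using Fernet from the Python cryptography package purely for illustration (the paper does not prescribe a cipher): one shared TEK both encrypts and decrypts the multicast payload.

from cryptography.fernet import Fernet

# One shared symmetric TEK: the encryption and decryption keys are identical
tek = Fernet.generate_key()

# Sender side: encrypt the multicast payload under the TEK
ciphertext = Fernet(tek).encrypt(b"multicast data packet")

# Receiver side: any member holding the same TEK recovers the plaintext
plaintext = Fernet(tek).decrypt(ciphertext)
assert plaintext == b"multicast data packet"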

4 Related Work

In general, there are three major approaches to key management in WSNs, based on symmetric key cryptography, public key cryptography, or a hybrid of the two [16]. Particularly in a hierarchical architecture, it makes sense to apply a hybrid approach, in which the major processing and heavy operations are performed at the more capable nodes. In this case, authentication and integrity are often obtained by means of digital signatures. However, as there is a large performance difference between symmetric key and public key cryptography, it is still interesting to look at symmetric-key-based solutions, owing to the limited energy, communication bandwidth and processing capacity of the sensor nodes. Many proposals for key management using a symmetric-key-based approach in WSNs have been published. Looking at the assumed topology, these methods split into two categories: a hierarchical and a non-hierarchical (or distributed) layout of the nodes [17]. Most symmetric-key-based key management protocols assume a hierarchical network, where the nodes have a predefined role within the network.

The mesh certification authority (MeCA), given in [18], handles node authentication and key management in wireless mesh networks. It relies on the self-configuring and self-organizing properties of WMNs to distribute the capabilities of a certification authority over many mesh routers [19]. Basic key functions such as secret sharing and key distribution are performed through it. MeCA adopts the multicast tree method given in [20] to reduce the operational overhead.

5 Proposed System

In the proposed system, Traffic Encryption Keys (TEKs) and Key Encryption Keys (KEKs) are generated. The source can be tree- or mesh-based. The keys are generated in both phases for encryption and decryption (Fig. 2).

6 Simulation Results

The simulation parameters used are listed in Table 1.

6.1 Metrics

We use the following parameters in evaluating the performance of the tree and mesh
multicast routing protocols (Figs. 3, 4, 5, 6 and 7).

Fig. 2 Proposed architecture [21]

Table 1 Simulation parameters [12]
Number of nodes (receiver): 5, 10, 15, 20, 25
Area size: 2000 × 2000
Simulation time: 40 s
Rate: 250 Kb
Mobility model: Random waypoint
MAC: 802.11
Speed: 5, 10, 15, 20, 25 and 30

• Average Packet Delivery Ratio = number of packets received / total number of packets sent.
• Overhead = total number of control packets sent / number of data packets delivered successfully.
• Packet Drop: the average number of packets dropped at each receiver end.

Fig. 3 Comparison of delivery ratio with Traffic Encryption Keys (TEKs) for varying receivers in
trees and mesh based multicast routing protocols

Fig. 4 Comparison of overhead with Traffic Encryption Keys (TEKs) for varying receivers in trees
and mesh based multicast routing protocols

• End to End Delay: It can be defined as the time taken for a packet to travel from
source to destination.
• Throughput = Total number of packets received per unit of time.
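These definitions translate directly into code. The helper below is a generic illustration of the five formulas against hypothetical trace counters from a simulator run; it is not the authors' evaluation script, and all names are placeholders.

def evaluate_run(sent, received, dropped, control_pkts, delays, num_receivers, duration_s):
    """Compute the five metrics above from simple per-run trace counters."""
    return {
        # Average packet delivery ratio = packets received / packets sent
        "delivery_ratio": received / sent,
        # Overhead = control packets sent / data packets delivered successfully
        "overhead": control_pkts / received,
        # Packet drop = average number of packets dropped per receiver
        "packet_drop": dropped / num_receivers,
        # End-to-end delay = mean source-to-destination time per packet
        "end_to_end_delay": sum(delays) / len(delays),
        # Throughput = packets received per unit of time
        "throughput": received / duration_s,
    }

print(evaluate_run(sent=1000, received=930, dropped=70, control_pkts=210,
                   delays=[0.05] * 930, num_receivers=10, duration_s=40.0))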

7 Conclusion

Research on various multicast routing protocols and their comparison has been presented. All protocols have their own advantages and drawbacks. This paper clarified the salient features of multicast routing.

Fig. 5 Comparison of packet drop with Traffic Encryption Keys (TEKs) for varying receivers in
trees and mesh based multicast routing protocols

Fig. 6 Comparison of end to end delay with Traffic Encryption Keys (TEKs) for varying receivers
in trees and mesh based multicast routing protocols

Fig. 7 Comparison of throughput with Traffic Encryption Keys (TEKs) for varying receivers in
trees and mesh based multicast routing protocols

The comparison covers the important differences and similarities of mesh- and tree-based multicast routing protocols, which can help in selecting which routing is suitable in which circumstance. Multicast tree-based routing protocols are powerful and satisfy the scalability requirement, but they have some drawbacks, particularly in ad-hoc wireless groups, owing to the mobile nature of the nodes that participate during a multicast session. Mesh-based protocols, in contrast, save the considerable amount of control overhead used for tree maintenance.

References

1. Younis M, Ozer SZ (2006) Wireless Ad hoc networks: technologies and challenges. Wirel
Commun Mob Comput 6(7):889–892
2. Guo S, Yang OWW (2007) Energy-aware multicasting in wireless Ad hoc networks: a survey
and discussion. Comput Commun 30(9):2129–2148
3. Liebeherr J, Zarki M (2003) Mastering networks: an internet lab manual. Addison Wesley
4. Maharjan N, Moten D Secure IP multicasting with encryption key management
5. Stallings W (2004) Data and computer communications, 7th edn. Pearson Prentice Hall, Upper
Saddle River, NJ
6. Jain S, Agrawal K (2014) A comprehensive survey of multicast routing protocols for mobile
Ad Hoc networks. Int J Comput Appl (IJCA)
7. Biradar R, Manvi S, Reddy M (2010) Mesh based multicast routing in MANET: stable link
based approach. Int J Comput Electr Eng 2(2)
8. Lee S-J, Su W, Gerla M (2002) On-demand multicast routing protocol in multihop wireless mobile networks. Mobile Netw Appl 7:441–453
9. Maughan D, Schertler M, Schneider M, Turner J (1997) Internet security association and key
management protocol (ISAKMP), Internet-Draft
10. Rajan C, Shanthi NS (2013) Misbehaving attack mitigation technique for multicast security in
mobile ad hoc networks (MANET). J Theor Appl Inf Technol 48(3):1349–1357
11. Devi DS, Padmavathi G (2009) A reliable secure multicast key distribution scheme for mobile
Ad hoc networks. World Acad Sci Eng Technol 56:321–326
12. Madhusudhanan B, Chitra S, Rajan C (2015) Mobility based key management technique for
multicast security in mobile Ad Hoc networks. Sci World J 2015
13. Devaraju S, Padmavathi G (2010) Dynamic clustering for QoS based secure multicast key
distribution in mobile Ad hoc networks. Int J Comput Sci Issues 7(5):30–37
14. Lin H-Y, Chiang T-C (2011) Efficient key agreements in dynamic multicast height balanced
tree for secure multicast communications in Ad Hoc networks. EURASIP J Wireless Commun
Netw 2011. https://doi.org/10.1155/2011/382701
15. Srinivasan R, Vaidehi V, Rajaraman R, Kanagaraj S, Kalimuthu RC, Dharmaraj R (2010) Secure
group key management scheme for multicast networks. Int J Netw Secur 10(3):205–209
16. Carlier M, Steenhaut K, Braeken A (2019) Symmetric-key-based security for multicast
communication in wireless sensor networks. In: MDPI
17. Bala S, Sharma G, Verma A (2013) Classification of symmetric key management schemes for
wireless sensor networks. Int J Secur Appl 7:117–138
18. Kim J, Bahk S (2009) Design of certification authority using secret redistribution and multicast
routing in wireless mesh networks. Comput Netw 53(1):98–109
19. Matam R, Tripathy S (2016) Secure multicast routing algorithm for wireless mesh networks.
J Comput Netw Commun 2016
20. Ruiz PM, Gomez-Skarmeta AF (2005) Heuristic algorithms for minimum bandwidth consump-
tion multicast routing in wireless mesh networks. In: Syrotiuk VR, Chávez E (eds) Ad-hoc,
mobile, and wireless networks, vol 3738. Lecture notes in computer science, pp 258–270

21. Chaitra KV, Selvin Paul Peter J (2018) A reverse tracing scheme and efficient key management
for secured communication in MANET. Int J Eng Res Technol (IJERT) ISSN: 2278–0181
ICESMART-2015
Detection of Signature-Based Attacks
in Cloud Infrastructure Using Support
Vector Machine

B. Radha and D. Sakthivel

Abstract In recent years, cloud computing has emerged as a widely utilized innovation in the IT area. With the increase in the use of cloud computing, it has become more prone to intrusions. Even a small intrusion attack can compromise the entire framework; consequently, intrusion can be considered a critical issue for cloud-based platforms. The substantial growth in the number of applications using cloud-based infrastructures calls for security mechanisms for their protection. Intrusion detection systems are one of the most suitable security solutions for protecting cloud-based environments. Signature-based intrusion detection and the support vector machine have emerged as a recent interest and research area. With their robust learning models and data-centric approach, SVM-based security solutions for cloud environments have proven effective. Attack features are extracted from the organization's network and application logs, and the presence of an attack is confirmed by the support vector machine. Performance measures such as average detection time are used to evaluate the performance of the detection system.

Keywords Machine learning · SVM · Cloud

1 Introduction

The exponential growth in modern technology has produced a ubiquitous global


networking system of services and communications along with its attendant prob-
lems. Given this state of affairs and the costs of provisioning resources locally, companies around the globe are increasingly leaning toward providing their services and resources to users through cloud computing networks, which, in turn,

B. Radha
Department of Information Technology, Sri Krishna Arts and Science College, Coimbatore, Tamil
Nadu 641105, India
D. Sakthivel (B)
Department of Computer Science, Sree Saraswathi Thyagaraja College, Pollachi, Tamil Nadu
642107, India


pushed the security risks to new heights. Eventually, cyber security became a major
issue for cloud computing [1].
Firewalls and other rule-based security approaches have been used broadly to provide protection against attacks in data centers and contemporary networks. However, firewalls are not capable of detecting insider attacks or advanced attacks, such as distributed denial of service (DDoS) attacks, since these do not occur where the firewalls are set up, or the rules are not enough to detect them. Additionally, large distributed multi-cloud environments would require a significantly large number of complicated rules to be configured, which could be costly, time-consuming, and error-prone. Cloud providers must take the necessary measures to either prevent these attacks before their occurrence or detect them as soon as they occur. Generally, attacks can be identified and prevented using intrusion detection systems (IDS) and intrusion prevention systems (IPS). Traditionally, IDSs use predefined rules or behavioral analysis over the network to detect attacks [2].
In this paper, we tackle the problem of intrusion detection in cloud platforms by leveraging a machine learning (ML) technique, namely support vector machines (SVM). The aim has been to improve signature-based detection accuracy while reducing the model training time and complexity. ML models used in intrusion detection systems need to have a relatively short training time, since they need to be retrained to classify existing types of attacks. To evaluate the proposed model, the UNSW-NB15 dataset was used with some normalization methods that help to increase the overall accuracy. Our contributions in this work are twofold [2, 3]:
(1) We implement SVM using the tool provided by Python to achieve high accuracy
for signature-based detection in cloud environments. We perform parameter
tuning to achieve the best accuracy.
(2) We perform feature engineering to find out optimal set of features to achieve
maximum accuracy in minimal training time and complexity. We aim to reduce
the training time by selecting optimal set of features while accuracy is not
compromised.
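As a sketch of what contribution (2) could look like in scikit-learn, the fragment below normalizes the features and keeps the k most informative ones; the file path and the choice k = 20 are placeholder assumptions, not the exact procedure evaluated in this paper.

import pandas as pd
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.preprocessing import MinMaxScaler

# Placeholder path for the UNSW-NB15 training CSV described in Sect. 3
df = pd.read_csv("UNSW-NB-15-training-set.csv")
X = df.select_dtypes(include="number").drop(columns=["label"])
y = df["label"]

# Normalize, then keep the k features with the highest mutual information
X_scaled = MinMaxScaler().fit_transform(X)
selector = SelectKBest(mutual_info_classif, k=20)   # k is an assumed value
X_reduced = selector.fit_transform(X_scaled, y)
print("selected features:", list(X.columns[selector.get_support()]))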

2 Related Work

Sakthivel and Radha observed that cloud computing holds a remarkable place in conceptual and infrastructural computing. Cloud computing allows users to access and keep their resources in the cloud through a multitenant architecture. Infrastructure as a service (IaaS) is a form of cloud computing that provides virtualized computing resources over the internet [4–6]. In an IaaS model, a cloud provider hosts the infrastructure components traditionally present in an on-premises data center, including servers, storage, and networking hardware, as well as the virtualization or hypervisor layer. The benefits of IaaS are full control of the computing resources through administrative access to VMs, flexible and efficient renting of computer hardware, and portability and interoperability with legacy applications. IaaS has some common issues such as network dependence and browser-based risks. It also has some specific issues such as compatibility with legacy security vulnerabilities, virtual machine sprawl, robustness of VM-level isolation, and data erase practices [7].
Recent cloud-based computer networks have created a number of security challenges associated with intrusions in network systems. With the huge increase in the amount of network traffic data, involving humans in such detection systems is time-consuming and a non-trivial problem. The network traffic data is highly dimensional, consisting of numerous features and attributes, making classification challenging and thus susceptible to the dimensionality problem. The threats to the security of a company's data and information are highly involved and need to be addressed. Data mining techniques and machine learning algorithms have evolved for securing cloud databases and monitoring malicious activity and threats inside the networks. Therefore, the various machine learning algorithms for securing cloud-based environments have been reviewed in [3].
Deep learning approaches are mainly categorized into supervised learning and
unsupervised learning. The difference between these two approaches lies in the use
of labeled training data. Specifically, convolution neural networks (CNN) that use
labeled data fall under supervised learning, which employs a special architecture
suitable for image recognition. Unsupervised learning methods include deep belief
network (DBN), recurrent neural network (RNN), auto-encoder (AE), and their vari-
ants. Next, we describe recent studies related to our work; these studies are mainly
based on KDD Cup 99 or NSL-KDD datasets [1].
Studies on intrusion detection using the KDD Cup 99 dataset have been reported.
Kim et al. specifically targeted advanced persistent threats and proposed a deep
neural network (DNN) [8] using 100 hidden units, combined with the rectified linear
unit activation function and the ADAM optimizer. Their approach was implemented
on a GPU using TensorFlow [9]. Papamartzivanos et al. proposed a novel method that combines the benefits of a sparse AE and the MAPE-K framework to deliver a scalable, self-adaptive, and autonomous misuse IDS. They merged the datasets provided by KDD Cup 99 and NSL-KDD to create a single voluminous dataset [3].

3 Signature-Based Intrusion Detection System

The problem of intrusion detection can be modeled as a classification problem. This


approach consists of first obtaining labeled traffic data and then training a classifier
to discern between the normal traffic and intrusions. Each record belonging to the
training set consists of a certain number of traffic features, such as the protocol
type, service requested, and size of payload [10]. Each of these records has a label
indicating the class of traffic (normal/intrusion) they belong to [11, 12]. The UNSW dataset has been released, and it includes ten different types of traffic packets; it is well suited to signature-based detection models for SVM [8]. It includes normal packets as well as nine types of attacks, which are Analysis, Backdoor, DoS, Exploits, Fuzzers, Generic, Reconnaissance, Shellcode, and Worms. UNSW-NB-15 is composed of two parts: a training set, UNSW-NB-15-training-set.csv, which has been used for model creation, and a testing set, UNSW-NB-15-testing-set.csv, used for the testing step and for modeling the received real-time packets. The number of records in the training and testing sets is 175,341 and 82,332, respectively.
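A minimal version of such a pipeline can be sketched with scikit-learn, which the Python tooling mentioned above plausibly resembles; the column names and preprocessing below are assumptions about the public UNSW-NB15 CSVs, not the exact configuration evaluated in this paper.

import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

train = pd.read_csv("UNSW-NB-15-training-set.csv")   # placeholder paths
test = pd.read_csv("UNSW-NB-15-testing-set.csv")

# Use the numeric features; 'label' is 0 (normal) / 1 (attack)
features = train.select_dtypes("number").columns.drop(["label"])
scaler = MinMaxScaler().fit(train[features])         # normalization step

clf = SVC(kernel="rbf", C=1.0)                       # parameters to be tuned
clf.fit(scaler.transform(train[features]), train["label"])

pred = clf.predict(scaler.transform(test[features]))
print("accuracy:", accuracy_score(test["label"], pred))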

3.1 Support Vector Machines

Support vector machine (SVM) is a classification technique. Based on the concept of


optimal margin classifiers, this classification method gives a very high accuracy rate
for a large number of problem domains and is highly suited for high-dimensional
data.
H1, H2, and H3 are three of the infinite possible decision hyperplanes. The
hyperplane H3 (green) does not separate the two classes and is not suitable for use
in classification. The hyperplane H1 (blue) does separate the two classes but with a
small margin and H2 (red) separates the two classes with the maximum margin [13].

3.2 Maximal Margin Hyperplanes

For the purpose of illustration, let us consider a dataset that is linearly separable.
Given a set of labeled training data, we can find a hyperplane such that it completely
separates points belonging to the two classes. This is called the decision boundary
[9]. An infinite number of such decision boundaries are possible (Fig. 1). The margin of a decision boundary refers to the shortest distance to the closest points on either side of the hyperplane (Fig. 2). It is evident by intuition, and has been mathematically proven, that the decision hyperplane with the maximal margin provides a better generalization error. Support vectors are the training samples lying on the margins of the decision plane, and the process of training the SVM involves finding these support vectors.
Figure 2 shows the maximum margin hyperplane and margins for an SVM trained
with samples from two classes. Samples on the margin are called the support vectors.

3.3 Signature-Based Intrusion Detection Using SVM

The formulation for a standard SVM for a binary classification problem is given as

min_{w,b,ξ} (1/2) ||w||^2 + C Σ_{i=1}^{N} ξ_i

Fig. 1 Infinite decision hyperplanes for a binary classification problem

Fig. 2 Optimal margin classifier for a binary classification problem

under the constraints

y_i (w^T x_i + b) ≥ 1 − ξ_i,   ξ_i ≥ 0,   i = 1, …, N,

where x_i ∈ R^n is the feature vector of the ith training example, y_i ∈ {−1, 1} is the class label of x_i, i = 1, …, N, and C > 0 is a regularization constant. The pseudo-code for the self-training wrapper algorithm is given below:

Algorithm 1 Self-Training-SVM
Input: FI, FT and θ0
  FI: the set of N1 labeled training examples x_i, i = 1, …, N1; the labels of the examples are y0(1), …, y0(N1)
  FT: the set of N2 training examples for which the labels are unknown
  θ0: the threshold for convergence
Output: a trained SVM
1:  Train an SVM using FI and classify FT using the model obtained
2:  k = 2
3:  while TRUE do
4:      FN = FI + FT, where the labels of FT are the ones predicted using the current model
5:      Train a new SVM using FN and again classify FT
6:      Evaluate the objective function
        J(w(k), ξ(k)) = (1/2) ||w(k)||^2 + C Σ_{i=1}^{N1+N2} ξ_i(k)
7:      if J(w(k), ξ(k)) − J(w(k−1), ξ(k−1)) < θ0 then
8:          break
9:      end if
10:     k = k + 1
11: end while
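A runnable rendering of Algorithm 1 is sketched below with scikit-learn under stated assumptions: a linear kernel (so that w is exposed as coef_ and the slack values ξ_i can be recovered from the decision function), labels in {−1, +1}, and illustrative parameter values. It is a sketch of the wrapper, not the authors' exact implementation.

import numpy as np
from sklearn.svm import SVC

def self_training_svm(X_l, y_l, X_u, C=1.0, theta0=1e-3, max_iter=20):
    """Algorithm 1: retrain on FI plus self-labeled FT until the objective
    J(w, xi) = 0.5*||w||^2 + C*sum(xi) changes by less than theta0.
    Assumes labels y in {-1, +1} and a linear kernel."""

    def objective(clf, X, y):
        w = clf.coef_.ravel()
        xi = np.maximum(0.0, 1.0 - y * clf.decision_function(X))  # slack values
        return 0.5 * (w @ w) + C * xi.sum()

    clf = SVC(kernel="linear", C=C).fit(X_l, y_l)      # step 1: train on FI
    y_u = clf.predict(X_u)                             # classify FT
    prev = None
    for _ in range(max_iter):                          # steps 3-11
        X_n = np.vstack([X_l, X_u])                    # FN = FI + FT
        y_n = np.concatenate([y_l, y_u])
        clf = SVC(kernel="linear", C=C).fit(X_n, y_n)  # step 5: retrain on FN
        y_u = clf.predict(X_u)
        J = objective(clf, X_n, y_n)                   # step 6
        if prev is not None and abs(J - prev) < theta0:
            break                                      # step 7: converged
        prev = J
    return clf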

3.4 Dataset

As described above, the UNSW dataset includes ten different types of traffic packets and is suitable for use in contemporary detection models: normal packets as well as nine types of attacks, which are Analysis, Backdoor, DoS, Exploits, Fuzzers, Generic, Reconnaissance, Shellcode, and Worms. Table 1 shows the notation of these classes. UNSW-NB-15 is composed of two parts: a training set, UNSW-NB-15-training-set.csv, which has been used for model creation, and a testing set, UNSW-NB-15-testing-set.csv, used for the testing step and for modeling the received real-time packets. The number of records in the training and testing sets is 175,341 and 82,332, respectively.

Table 1 Classes notation
Number Class
1 Normal
2 Analysis
3 Backdoor
4 DoS
5 Exploits
6 Fuzzers
7 Generic
8 Reconnaissance
9 Shellcode
10 Worms

During this phase, two datasets are extracted from the UNSW-NB-15 training set, which consists of over two lakh records. The first set, FI, is a set of labeled records and is used to train the initial SVM. The second set, FT, is the set of unlabeled records and is used to retrain the SVM model during the iterations of the algorithm [14]. All ten traffic classes of UNSW-NB-15 and its features were used in the simulation. For the purpose of this simulation, the size of FI was taken to be much smaller than that of FT, so that the efficiency of the proposed scheme in reducing the requirement for labeled data could be properly tested [1].

4 Result

The degree of improvement in the detection accuracy over the iterations of the self-training algorithm depends on the sizes of the labeled and unlabeled training sets. This can be inferred from the fact that after 6 iterations, the change in the detection accuracy for the simulation with the 5000-labeled-record set is almost double that of the simulation with the 500-labeled-record set. The results also show that the overall accuracy is most sensitive to the size of the labeled set. In the case of the simulation with 500 labeled records, the final detection accuracy was around 75.5%, whereas for the simulation with 5000 labeled records it was found to be around 86%. Finally, the results validate the hypothesis that self-training can be used to reduce the labeled training set size in the domain of intrusion detection as well. A reduction of up to 90% has been achieved in the number of labeled training records required (Fig. 3).

Fig. 3 Self-training SVM with a labeled training set of size 5000 and unlabeled training set (self-
training set) of size 2 K

5 Conclusion

A new method for intrusion detection using the SVM algorithm has been presented, and its effectiveness for the intrusion detection problem domain has been verified by simulation on the standard UNSW-NB-15 dataset. Further, the given algorithm achieves good results in reducing the requirement for labeled training data. Simulation results showed that this scheme is efficient for attack detection, with SVM increasing the classification accuracy from 66% up to 90%. Future work will address anomaly detection of attacks in cloud infrastructure and more advanced techniques for profiling normal packets, while improving the classification performance using methods such as data mining and data clustering.

References

1. Chkirbene Z, Erbad A, Hamila R (2019) A combined decision for secure cloud computing based on machine learning and past information. In: IEEE wireless communications and networking conference
2. Wang W, Du X, Shan D, Qin R, Wang N (2020) Cloud intrusion detection method based on
stacked contractive auto-encoder and support vector machine. In: IEEE transactions on cloud
computing
3. Sakthivel D, Radha B (2020) Novel study on machine learning algorithms for cloud security.
J Crit Rev 7(10)
4. Kumar R, Sharma D (2018) Signature-anomaly based intrusion detection algorithm. In:
Proceedings of the 2nd international conference on electronics, communication and aerospace
technology

5. Guo C, Ping Y, Liu N, Luo S-S (2016) A two-level hybrid approach for intrusion detection.
Neurocomput 214:391–400, ISSN 0925–2312
6. Sakthivel D, Radha B (2018) SNORT: network and host monitoring intrusion detection system.
Int J Res Appl Sci Eng Technol (IJRASET) 6(X)
7. Sakthivel D, Radha B (2018) A study on security issues and challenges in cloud IaaS. Int J Res
Appl Sci Eng Technol (IJRASET) 6(III)
8. Chkirbene Z, Erbad A, Hamila R (2019) A combined decision for secure cloud computing based
on machine learning and past information. In: IEEE wireless communications and networking
conference
9. Bhamare D, Erbad A, Jain R, Zolanvari M, Samaka M (2018) Efficient virtual network function
placement strategies for cloud radio access network. Comput Commun 127:50–60
10. Cisco Systems (2016) Fog computing and the internet of things: extend the cloud to where the
things are. www.cisco.com
11. Aljamal I (2019) Hybrid intrusion detection system using machine learning techniques in cloud
computing environments. In: IEEE SERA 2019
12. Aboueata N, Alrasbi S, Erbad A (2019) Supervised machine learning techniques for efficient
network intrusion detection. IEEE
13. Ghanshala KK (2018) BNID: a behavior-based network intrusion detection at network-layer
in cloud environment. In: First international conference on secure cyber computing and
communication
14. Din MF, Qazi S (2018) A compressed framework for monitoring and anomaly detection in
cloud networks. In: International conference on computing, mathematics and engineering
technologies (iCoMET), IEEE
A Modified Weighed Histogram
Approach for Image Enhancement Using
Optimized Alpha Parameter

Vishal Gupta, Monish Gupta, and Nikhil Marriwala

Abstract Enhancement of images is a critical part of computer vision for the processing of images. Objects with a low contrast ratio and low intensities are difficult for system models to interpret for further processing. In the present work, we have focused our approach on finding the correct methodology for the enhancement of images collected from different data sources. In the proposed approach, we have considered maritime satellite imagery data and images captured through a camera (especially of ships). We have proposed a novel Modified Weighted Histogram Equalization method in which the alpha parameter is adjusted to the best optimized value, and the presented work has been compared with other existing approaches, such as adaptive gamma correction weighted distribution, in terms of PSNR (peak signal-to-noise ratio), AMBE (absolute mean brightness error) and processing time. The work gives the best accuracy in enhancing the images, which may further be used for different applications of computer vision such as object detection and vessel classification.

Keywords Computer vision · Object detection · PSNR · AMBE · Maritime images

1 Introduction

Ubiquitous and wide-ranging applications like scene understanding, video surveillance, robotics, and self-driving systems have set off tremendous research in the area of computer vision in the last decade [1]. Being at the center of all of these applications, visual recognition systems, which incorporate image classification, localization and detection, have attracted great research energy.

V. Gupta (B)
University Institute of Engineering and Technology, Kurukshetra University, Kurukshetra, India
M. Gupta · N. Marriwala
Electronics and Communication Engineering Department, University Institute of Engineering and
Technology, Kurukshetra University, Kurukshetra, India
e-mail: nmarriwala@kuk.ac.in


Image classification, being the most widely explored territory in the field of computer vision, has achieved remarkable outcomes in worldwide competitions, for example ILSVRC, PASCAL VOC, and Microsoft COCO, with the assistance of deep learning [2]. Inspired by the results of image classification, deep learning models have been developed for object detection, and deep-learning-based object detection has also achieved state-of-the-art results [3]. The authors of [4–6] proposed three novel deep learning designs that can perform joint detection and pose estimation. Pathak et al. [7] demystified the role of deep learning strategies based on convolutional neural networks for object detection; the deep learning frameworks and services available for object detection were also articulated. Hou et al. [8] addressed the issue of fusing depth more effectively with RGB, aiming at boosting the performance of RGB-D object detection. With numerous applications in high-level scene understanding, salient object detection is an important target. The goal of salient object detection is to identify the most salient regions and then segment whole salient objects out. It has recently attracted more attention in the computer vision community, as it can be used as an important pre-processing step before further processing. Typically, computer vision models that are designed to detect salient regions are inspired by the human visual attention and perception systems. Visual saliency detection has demonstrated its value in various applications, for example object detection and recognition [9–14], image quality assessment [15], image thumbnailing [16], video processing and image data compression [17, 18], GPS location estimation and human-robot interaction. Moving object detection serves as a pre-processing step for higher-level processes, for example object classification or tracking. Thus, its performance can have an enormous impact on the performance of these higher-level processes. Background subtraction is the most widely used technique to detect foreground objects in a scene. In the background subtraction method, every video frame is compared with a background model, and the pixels whose intensity values deviate significantly from the background model are considered as foreground. Accurate foreground detection for complex visual scenes is a difficult task, since real video sequences contain several critical situations [19, 20]. Some key difficulties encountered in background subtraction methods are: dynamic background, moving shadows of moving objects, sudden illumination changes, camouflage, foreground aperture, noisy images, camera jitter, bootstrapping, automatic camera adjustments, and stopped or slow-moving objects [21].
Zhang et al. [10] proposed a novel graph-based optimization framework for salient object detection. Experimental results on four benchmark databases, with comparisons to fifteen representative methods, were also illustrated. Liang et al. [22] extended the idea of salient object detection to the material level based on hyperspectral imaging and introduced a material-based salient object detection technique that can effectively recognize objects with similar perceived color but different spectral responses. Naqvi et al. [23] proposed techniques that exploit the role of color spaces in addressing two significant challenges related to salient object detection. This technique autonomously identifies the color space that is locally the most appropriate for saliency detection on an image-by-image basis.

Color uniqueness is perhaps the most widely adopted cue for salient region detection, owing to the importance of color in human visual perception [16, 24–28]. Various distinct color models (color spaces) are found to be appropriate for various applications: image processing, multimedia, graphics, and computer vision. Choosing a color representation that preserves the significant information and gives insight into the visual process is a difficult task in many applications [16, 25–28]. Raj et al. introduced a novel algorithm based on the bandlet transform for object detection in Synthetic Aperture Radar (SAR) images. Here, an initial bandlet-based despeckling scheme was applied to the input SAR image, and then a constant false alarm rate (CFAR) detector was used for object detection. Lua et al. studied a model for an efficient deep network for vision-based object detection in mechanical applications [22, 29].
In our previous work [27, 29], we designed an automated system for the classification and detection of different objects, whose results might be improved if the dataset considered there were first processed through an enhancement technique. In future publications, the two works may be combined to give a hybrid approach for the improvement of results. The paper is organized as follows: Sect. 1 presents the introduction and previous research, Sect. 2 gives the proposed methodology, and Sect. 3 presents the experimental results and analysis.

2 Proposed Approach

In our previously published work [27, 29], we focused our study on ship classification and object detection in coastal areas. As stated there, the detection of such objects depends on various factors; one of them might be the dullness of the captured images, especially in the case of aerial images. The accuracy of that work may be improved if the processed images or datasets are enhanced before being passed through the proposed algorithms. In this work, we have considered different categories of images for enhancement; in future publications, the work will be expanded to the detection of vessels in the enhanced datasets. We first describe the previous approach, AGCWD (adaptive gamma correction with weighted distribution), used for the enhancement of aerial images, and then the proposed method, MWHE (Modified Weighted Histogram Equalization), which gives better results than the previous approach. The AGCWD method utilizes an adjustment parameter, also known as the alpha parameter, for brightness preservation. Its value was set manually, so we decided that this adjustment parameter should instead be calculated through optimization, and the MWHE method is used to optimize it. With the second approach we obtained better results than with the AGCWD method, with both evaluation parameters (PSNR and AMBE) showing good values according to their requirements.

Section 3 presents the qualitative and quantitative results of the AGCWD method and the new proposed method; from these results we can conclude that the proposed method is better than the AGCWD method.
The mathematical definitions of the peak signal-to-noise ratio and the absolute mean brightness error are given below:

PSNR = 10 log10[(L − 1)^2 / MSE]   (1)

where MSE is the mean square error, given by Eq. 2:

MSE = (1/N) Σ_(i,j) |x(i, j) − y(i, j)|^2   (2)

where x(i, j) is the input image, y(i, j) is the output image and N is the number of pixels.
AMBE(X, Y) = |X_M − Y_M|, where X_M is the mean of the input image X = X(i, j) and Y_M is the mean of the output image Y = Y(i, j).
In our work, the main motive is brightness preservation and contrast enhancement, so AMBE is used to assess the degree of brightness preservation, while PSNR is used to assess the degree of contrast enhancement.
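Equations (1) and (2) and the AMBE definition map directly onto a few lines of NumPy. The sketch below is a generic illustration of the metrics, assuming 8-bit images with L = 256 gray levels; it is not the authors' evaluation code.

import numpy as np

def psnr(x, y, L=256):
    """Eqs. (1)-(2): PSNR in dB between input image x and output image y."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mse = np.mean((x - y) ** 2)              # Eq. (2): N = number of pixels
    return 10.0 * np.log10((L - 1) ** 2 / mse)

def ambe(x, y):
    """AMBE(X, Y) = |X_M - Y_M|: absolute difference of mean brightness."""
    return abs(x.astype(np.float64).mean() - y.astype(np.float64).mean())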
In this section, we discuss the enhancement methods with the help of a flow chart. The AGCWD (adaptive gamma correction with weighted distribution) technique is the previous method, in which the gamma value is optimized with the help of the cumulative distribution function (CDF). In the proposed Modified Weighted Histogram Equalization method, we optimize the adjustable alpha parameter. Figure 1 shows the flow chart of the proposed method.
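For reference, the baseline's distribution-weighted gamma correction can be sketched as below. This follows the commonly cited AGCWD formulation with a fixed weighting exponent alpha; it is an assumption-laden illustration of the comparison baseline, not the authors' MWHE code, in which alpha is instead found by optimization.

import numpy as np

def agcwd(img, alpha=0.5):
    """Sketch of adaptive gamma correction with a weighted distribution.
    img: 8-bit grayscale image; alpha: the adjustable weighting parameter
    (set manually here; the proposed MWHE optimizes it instead)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    pdf = hist / hist.sum()
    # Weight the intensity distribution with the alpha parameter
    pdf_w = pdf.max() * ((pdf - pdf.min()) / (pdf.max() - pdf.min())) ** alpha
    cdf_w = np.cumsum(pdf_w) / pdf_w.sum()
    gamma = 1.0 - cdf_w                      # per-intensity adaptive gamma
    levels = np.arange(256) / 255.0
    lut = np.round(255.0 * levels ** gamma).astype(np.uint8)
    return lut[img]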

Fig. 1 Flow chart of the proposed method: Input Image → Binarization of Image → Modified Histogram Equalization → Weighted Correction → Output Image

Fig. 2 (i) Input image; (ii) AGCWD method; (iii) MWHE method


3 Simulation Results and Discussion

We have considered the MARVEL (marine vessels) dataset available at https://github.


com/avaapm/marveldataset2016 and www.shipspotting.com. Some of the dataset
used in the work was gathered from Karnal Lake, captured in the daytime. The analysis is carried out using qualitative as well as quantitative approaches. The qualitative analysis gives results in terms of visualization, allowing human observers to judge the enhanced images for further analysis and processing, while the quantitative approach gives mathematical validation to the model. Here, in this paper, we show samples of 10 images of different categories out of the complete dataset used. In Figs. 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11, we can see: (i) the input image; (ii) the output of the adaptive gamma correction weighted distribution method; (iii) the output of the Modified Weighted Histogram Equalization method.
The mathematical results obtained after the quantitative analysis of the methods are shown below. Table 1 shows the PSNR values of the two methods; the proposed approach shows higher PSNR values, which demonstrates its enhancement over the existing method. In the same manner, Table 2 shows the AMBE (absolute mean brightness error) values, which must be lower for the images to be visualized in a better way.

4 Conclusion

A Modified Weighted Histogram Equalization method has been proposed as a novel technique for the enhancement of images. The present work is implemented to

Fig. 3 (i) Input image; (ii) AGCWD method; (iii) MWHE method

Fig. 4 (i) Input image; (ii) AGCWD method; (iii) MWHE method

Fig. 5 (i) Input image; (ii) AGCWD method; (iii) MWHE method

Fig. 6 (i) Input image; (ii) AGCWD method; (iii) MWHE method

Fig. 7 (i) Input image; (ii) AGCWD method; (iii) MWHE method

Fig. 8 (i) Input image; (ii) AGCWD method; (iii) MWHE method

improve images with a low contrast ratio or dimmed intensity and to reduce the noise error in the captured image, so that the visualization of the dataset is enhanced. The proposed method is compared with the existing algorithm, and it is shown that the proposed work gives better results than the existing approach. The results are

Fig. 9 (i) Input image; (ii) AGCWD method; (iii) MWHE method

Fig. 10 (i) Input image; (ii) AGCWD method; (iii) MWHE method

shown in a qualitative as well as quantitative manner. The algorithm may further be used for many different applications of image processing, such as detection, classification and segmentation of images.

Fig. 11 (i) Input image; (ii) AGCWD method; (iii) MWHE method

Table 1 Comparison between PSNR values
Images   PSNR (previous approach) (dB)   PSNR (proposed approach) (dB)
Figure 2 + 17.63 + 18.53
Figure 3 + 15.41 + 15.99
Figure 4 + 16.55 + 17.17
Figure 5 + 13.95 + 14.89
Figure 6 + 15.55 + 16.27
Figure 7 + 16.57 + 16.58
Figure 8 + 16.28 + 16.56
Figure 9 + 15.38 + 18.88
Figure 10 + 10.14 + 12.92
Figure 11 + 17.80 + 23.39

Table 2 Comparison between AMBE values
Images   AMBE (previous approach)   AMBE (proposed approach)
Figure 2 0.1262 0.1119
Figure 3 0.1535 0.1424
Figure 4 0.1459 0.1390
Figure 5 0.1758 0.1745
Figure 6 0.1477 0.1350
Figure 7 0.1499 0.1401
Figure 8 0.1397 0.1365
Figure 9 0.1883 0.1400
Figure 10 0.2831 0.2776
Figure 11 0.1539 0.1132

References

1. Abdullah-Al-Wadud M, Kabir MH, Dewan MAA, Chae O (2007) A dynamic histogram equalization for image contrast enhancement. IEEE Trans Consum Electron 53(2):593–600
2. Kim Y-T (1997) Contrast enhancement using brightness preserving bi-histogram equalization.
IEEE Trans Consum Electron 43(1):1–8
3. Wan Y, Chen Q, Zhang B (1999) Image enhancement based on equal area dualistic sub-image
histogram equalization method. IEEE Trans Consum Electron 45(1):68–75
4. Gupta V, Gupta M, Singla P (2021) Ship detection from highly cluttered images using convolu-
tional neural network. Wireless Pers Commun 121:287–305. https://doi.org/10.1007/s11277-
021-08635-5
5. Gupta V, Marriwala N, Gupta M (2021) A GUI based application for Low Intensity Object
Classification & Count using SVM Approach. In: 2021 6th International conference on signal
processing, computing and control (ISPCC), pp 299–302. https://doi.org/10.1109/ISPCC5
3510.2021.9609470
6. Gupta V, Gupta M, Zhang Y (2021) IoT-based artificial intelligence system in object detection.
Taylors and Francis Group, CRC Press, ISBN:9781003140443
7. Pathak AR, Pandey M, Rautaray S (2018) Application of deep learning for object detection.
Proc Comput Sci 32:1706–1717
8. Hou S, Wang Z, Wu F (2018) Object detection via deeply exploiting depth information.
Neurocomputing 286:58–66
9. Chen SD, Ramli AR (2003) Minimum mean brightness error bi-histogram equalization in contrast enhancement. IEEE Trans Consum Electron 49(4):1310–1319
10. Zhang J, Ehinger KA, Wei H, Zhang K, Yang J (2017) A novel graph-based optimization
framework for salient object detection. Pattern Recogn 64:39–50
11. Wang C, Ye Z (2005) Brightness preserving histogram equalization with maximum entropy: a
variation perspective. IEEE Trans Consum Electron 51(4):1326–1334
12. Ibrahim (2007) Brightness preserving dynamic histogram equalization for image contrast
enhancement. IEEE Trans Consum Electron 53(4):1752–1758
13. Lamberti F, Montrucchio B, Sanna A (2006) CMBFHE: a novel contrast enhancement tech-
nique based on cascaded multistep binomial filtering histogram equalization. IEEE Trans
Consum Electron 52(3):966–974
14. Kim L-S (2001) Partially overlapped sub-block histogram equalization. IEEE Trans Circuits
Syst Video Technol 11(4):475–484
15. Celik T, Tjahadi T (2011) Contextual and variation contrast enhancement. IEEE Trans Image
Process Appl 20(2):3431–3441
16. Fu K, Gu IY-H, Yang J (2018) Spectral salient object detection. Neurocomputing 275:788–803
17. Rodriguez Sullivan M, Shah M (2008) Visual surveillance in maritime port facilities. In:
Proceedings of SPIE, vol 6978, p 29
18. Liu H, Javed O, Taylor G, Cao X, Haering N (2008) Omni-directional surveillance for unmanned
water vehicles. In: Proceedings of international workshop on visual surveillance
19. Wei H, Nguyen H, Ramu P, Raju C, Liu X, Yadegar J (2009) Automated intelligent video
surveillance system for ships. In: Proceedings of SPIE, vol 7306, p 73061N
20. Fefilatyev S, Goldgof D, Lembke C (2009) Autonomous buoy platform for low-cost visual
maritime surveillance: design and initial deployment. In: Proceedings of SPIE, vol 7317, p
73170A
21. Kruger W, Orlov Z (2010) Robust layer-based boat detection and multi-target-tracking in
maritime environments. In: Proceedings of international waterside
22. Liang J, Zhou J, Tong L, Bai X, Wang B (2018) Material based salient object detection from
hyperspectral images. Pattern Recogn 76:476–490
23. Naqvi SS, Mirza J, Bashir T (2018) A unified framework for exploiting color coefficients for
salient object detection. Neurocomputing 312:187–200
24. Fefilatyev S, Shreve M, Lembke C (2012) Detection and tracking of ships in open sea with
rapidly moving buoy-mounted camera system. Ocean Eng 54:1–12

25. Westall P, Ford J, O’Shea P, Hrabar S (2008) Evaluation of machine vision techniques for
aerial search of humans in maritime environments. In: Digital image computing: techniques
and applications (DICTA) 2008, Canberra, pp 176–183
26. Herselman PL, Baker CJ, Wind HJ (2008) An analysis of X-band calibrated sea clutter and
small boat reflectivity at medium-to-low grazing angles. Int J Navig Obs. https://doi.org/10.
1155/2008/347518
27. Gupta V, Gupta M (2020) Ships classification using neural network based on radar scattering.
Int J Adv Sci Technol 29:1349–1354
28. Onoro-Rubio D, Lopez-Sastre RJ, Redondo-Cabrera C, Gil-Jiménez P (2018) The challenge
of simultaneous object detection and pose estimation: a comparative study. Image Comput
79:109–122
29. Gupta V, Gupta M (2021) Automated object detection system in marine environment. In:
Mobile radio communications and 5G networks, Lecture notes in networks and systems, vol
140. https://doi.org/10.1007/978-981-15-7130-5_17
Spectrum Relay Performance
of Cognitive Radio between Users
with Random Arrivals and Departures

V. Sreelatha, E. Mamatha, C. S. Reddy, and P. S. Rajdurai

Abstract Rapid changes and advancements in wireless communication and palm-top computing devices have led to a technology revolution in the present world. The demand for wireless networks with a limited spectrum drives the search for various ways to enhance the communication sector. Studies show that cognitive radio is one of the promising technologies for better communication service. It improves the service by making the spectrum available to secondary users, allocating spectrum from licensed users during their free time. In this paper, we propose a mathematical model for the system, and results are presented in figures.

Keywords Spectrum · Multiple users · Cognitive radio · Wireless technology

1 Introduction

The demand for wireless communication and its rapid growth in the communication sector call for effective utilization of the spectrum. The spectrum is a limited resource, yet according to research studies the licensed spectrum is not efficiently utilized most of the time [1, 2]. Recent technology has proposed cognitive radio (CR) networks to overcome the shortage and to increase the traffic service. Service has expanded from the early generations, 1G and 2G, of simple voice and short text service, to highly streamed data applications. At present, wireless local area networks (WLANs) are developed based on IEEE 802.11, and wireless personal area networks (WPANs) are developed based on IEEE 802.15, to address the network problems with optimal cost and reliable high-speed data service.

V. Sreelatha · E. Mamatha (B)


School of Science, GITAM University, Bangalore, India
e-mail: mellirik@gitam.edu
C. S. Reddy
Department of Mathematics, Cambridge Institute Technology - NC, Bangalore, India
P. S. Rajdurai
Department of Mathematics, Srinivasa Ramanujan Center, SASTRA University, Kumbakonam,
India


In most places where they are needed, WLAN nodes serve as hotspot connectivity points, especially in places such as shopping complexes, colleges and universities, railway stations, and residential houses. The proposed future technologies, cognitive radio and software-defined radio (SDR), enable multi-node, multiband communications with a diversified interface connectivity environment.
According to a research study from the Federal Communications Commission, approximately 70% of the spectrum in the US is not utilized effectively. Network users are primarily divided into two types of wireless users: primary users (PUs) and secondary users (SUs) [2–4]. To tackle idle spectrum utilization, cognitive radio technology evolved to provide a better service. In a cognitive radio system, the secondary user exploits the spectrum bandwidth during free time, whereas the license is primarily provided to primary users [5–7]. To prevent intolerable interference from the secondary users, fine-grained and frequent spectrum monitoring technology is required, so that service is provided seamlessly to primary users and maximum throughput is achieved [8, 9]. Since maximum utilization of the spectrum does not depend on the primary users, the throughput gain of the network depends on the secondary users admitted by the cognitive radio system. Hence, we take the throughput of the CR network to be the throughput of the secondary users only [10]. In this structure, spectrum utilization is maximized by granting access to the secondary user while the interference level at the primary user is kept below a threshold. Under this condition, sensible data transmission slots are coordinated per unit frame such that network collisions are minimized and data packet transmission is maximized.
In almost all countries, governmental bodies manage the electromagnetic frequency spectrum, which is considered a natural resource belonging to all people. Naturally, most of the spectrum is dedicated to licensed organizations or service sectors; the remaining leftover spectrum is allotted for future wireless networks. Over time, the demand for wireless communications has swiftly increased, which leads to a shortage of spectrum. However, many research studies show that a huge volume of dedicated spectrum is not properly utilized at a given time in any specific locality [8]. Thus, allocating the available spectrum to other users is not a bad idea, provided it does not cause interference to the licensed network. It gives a better solution to the scarcity problem of the wireless spectrum. This can be accomplished by cognitive radio, a smart technology for wireless communication that can handle sensing, learning, and dynamically adapting the physical equipment to the radio frequency environment. With cognitive radio technology, by allowing secondary users, the usage of the frequency spectrum not only increases but additional users are also admitted for the service they require.
In this paper, we primarily focus on developing three distinct performance models for cognitive radio networks to achieve maximum throughput of the spectrum. In the first one, we develop a model in which the secondary user cooperates by relaying the primary user's packets; the cooperation should not affect the data packet transmission of either user. In the second stage, the cognitive radio network is completely dedicated to multiple primary users.
Spectrum Relay Performance of Cognitive Radio ... 535

monitored by secondary receiver and transmitter is not indistinguishable, and hence,


it infringes the functioning of the secondary user. In the final stage, we consider the
networks to allow interference cancelation with multiple antennas so that secondary
user can transmit data packets uninterruptedly. This is processed without destructing
the interference for the primary user. This method is supposed to give better result
for networks sharing between primary users and secondary users by using cognitive
radio technology.
We forecast that coordination between users will enhance the throughput of the primary user by optimizing the channel access time. Subsequently, it will allow a higher level of channel access for the secondary users. In this paper, the authors attempt to verify this hypothesis by modeling the primary and secondary users as an M/G/1 queuing system and deriving a closed form for the packet delay of the secondary user.
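Since both user classes are modeled as M/G/1 queues, it may help to recall the standard Pollaczek-Khinchine mean-delay result on which such closed forms rest. The sketch below is an illustrative aid rather than the authors' derivation; the example arrival rate and service moments are placeholders, not values from this paper.

# Minimal sketch: mean packet sojourn time of an M/G/1 queue via the
# standard Pollaczek-Khinchine formula (illustrative only).

def mg1_mean_delay(lam, es, es2):
    """lam: Poisson arrival rate, es: E[S], es2: E[S^2] of the service time."""
    rho = lam * es                        # server utilisation
    if rho >= 1.0:
        raise ValueError("unstable queue (rho >= 1)")
    wq = lam * es2 / (2.0 * (1.0 - rho))  # mean waiting time in queue
    return wq + es                        # waiting + service = sojourn time

# Example: 2 packets/s with a deterministic 0.1 s service time (E[S^2] = 0.01)
print(mg1_mean_delay(2.0, 0.1, 0.01))     # ~0.1125 s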

2 System Model

In the present work, we consider a cognitive radio network with multiple primary users, as shown in Fig. 1. All primary users use the same channel for data transmission at various geographical locations. It is assumed that each secondary user has its own sensing region bounded by a radius Rg. A primary user active in this area Ag must have its signal detected with some probability, where Ag = πRg^2. Observe that the radius Rg must be chosen so that a primary user beyond the sensing region can tolerate the secondary user's transmission access. If the secondary transmitter of another primary user lies within the transmission region of a primary user, then the channel usage observed at the secondary transmitter and receiver nodes may differ, which drastically shrinks the secondary user's probability of successful transmission. In this paper, we propose to derive the secondary user's mean packet delay.

Fig. 1 Wireless communication network with primary and secondary users

Fig. 2 Transition state architecture between nodes
Let us consider the system presented in Fig. 2 for two primary users. In the system, the first primary user (FPU) and the second primary user (SPU) operate the same network channel at different geographical locations. Within the FPU's and SPU's transmission range, a secondary transmitter (ST) and receiver (SR) are arranged. This system can be modeled as an M/G/1 queuing network for the FPU and SPU. Data packets arrive at the FPU and SPU according to Poisson processes with mean rates σ1 and σ2 packets per unit time, respectively. There is mutual influence between the primary users; hence, the accessibility of the channel at the ST is distinct from that at the SR. As a result, a transmission opportunity for the secondary user occurs only if both the FPU and SPU are idle at the same time.
It is also possible to model an M/G/1 queuing network for the secondary user. Packets arrive at a rate α packets/s following a Poisson distribution, whereas the service rate follows a general distribution. It is assumed that a packet is either transmitted successfully or dropped, at both the primary and the secondary user, owing to channel defects; this follows a Bernoulli process independent of the system. If a packet is lost, its transmission is repeated between the users until it successfully reaches the destination. Further, the service time is taken as the difference between the initial transmission time of the packet and the time of its successful delivery. The system is considered idle if both the FPU and SPU are simultaneously at rest; otherwise, it is considered busy. In the next section, we discuss the idle and busy period times for the primary users. The Laplace transform of the idle period of primary user k (KPU) is:
I_KPU(s) = σ_k / (s + σ_k), where k = 1, 2

Next, the Laplace transform of the primary user packet service time M_KPU(s) is as follows:

M_KPU(s) = Σ_{n=1}^{∞} L_KPU(ns) Pr(N_k = n)

Pr(N_k = n) = (1 − p_KPU)^{n−1} p_KPU,

where N_k follows a geometric distribution.


The primary user busy period is obtained as:

B_KPU(s) = M_KPU[s + σ_k − σ_k B_KPU(s)]

The sum of the primary user's idle and busy periods is considered as the cycle period, defined as:

Q_KPU = B_KPU + I_KPU
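The busy-period transform above is defined only implicitly; numerically it can be obtained by fixed-point iteration on the functional equation. The sketch below assumes, purely for illustration, an exponential service time with Laplace transform M(s) = μ/(μ + s); the geometric-mixture transform M_KPU(s) of this paper would be substituted in its place, and μ and σ are placeholder values.

# Minimal numerical sketch: solve B(s) = M[s + sigma - sigma*B(s)] by
# fixed-point iteration (exponential service assumed for illustration).

def busy_period_transform(s, sigma, M, iters=500):
    b = 0.5                               # starting guess in (0, 1]
    for _ in range(iters):
        b = M(s + sigma - sigma * b)      # one fixed-point step
    return b

mu, sigma = 10.0, 2.0
M_exp = lambda s: mu / (mu + s)           # Laplace transform of Exp(mu) service

# Mean busy period E[B] = -B'(0), estimated by a finite difference;
# for this M/M/1 example it should approach 1/(mu - sigma) = 0.125 s.
h = 1e-6
eb = -(busy_period_transform(h, sigma, M_exp)
       - busy_period_transform(0.0, sigma, M_exp)) / h
print(f"estimated mean busy period: {eb:.4f} s")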

3 Probability Distribution of System Idle Period

As mentioned earlier, the idle periods of primary user k (KPU) are exponentially distributed with rate σ_k. The system idle period is given by the minimum of a complete and a residual idle period of the two primary users. Let Ia and Ib represent the complete and residual idle periods of the two primary users. So, we have

Isys = min(Ia , Ib )

Since the exponential distribution is memoryless, for two independent primary users the probability of the idle time becomes:

Pr(Isys > t) = Pr(Ia > t, Ib > t)
            = [1 − Pr(Ia < t)][1 − Pr(Ib < t)]
            = [1 − (1 − e^{−σ1 t})][1 − (1 − e^{−σ2 t})]
            = e^{−(σ1+σ2)t}

Therefore, the cumulative distribution function of the system idle period is as follows:

Pr(Isys ≤ t) = 1 − e^{−(σ1+σ2)t}



The corresponding probability density function is as follows:

f_Isys(t) = (σ1 + σ2) e^{−(σ1+σ2)t}
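As a quick sanity check of this density, the minimum of two independent exponential idle periods can be simulated directly; the following sketch (with an illustrative sample size) compares the empirical mean with the analytic value 1/(σ1 + σ2).

# Monte Carlo check: the minimum of two independent exponential idle periods
# with rates sigma1 and sigma2 is exponential with rate sigma1 + sigma2.
import random

def mean_system_idle(sigma1, sigma2, n=100_000):
    total = 0.0
    for _ in range(n):
        total += min(random.expovariate(sigma1), random.expovariate(sigma2))
    return total / n

s1, s2 = 2.0, 3.0                      # the arrival rates used in Sect. 4
print("simulated mean:", mean_system_idle(s1, s2))
print("analytic mean :", 1.0 / (s1 + s2))   # = 0.2 s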

4 Results

Numerical results are presented in this section in graphical form in Figs. 3, 4, 5 and 6. The mean packet delay of the primary and secondary users with respect to the packet traffic load in wireless communication is computed and plotted. For the simulation, the mean arrival rate is taken as 2 packets/s for primary user 1 and 3 packets/s for primary user 2, and the secondary user packet transmission time is chosen as 0.1 s.
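As an illustrative reading of these parameters (not a figure reported by the authors), the system idle period has rate σ1 + σ2 = 5/s, so its mean is 0.2 s, and the chance that an idle window is long enough to carry one 0.1 s secondary-user packet is e^{−0.5} ≈ 0.61:

# Worked arithmetic with the stated simulation parameters.
import math

sigma1, sigma2, t_pkt = 2.0, 3.0, 0.1
rate = sigma1 + sigma2
print("mean system idle period:", 1.0 / rate)               # 0.2 s
print("P(idle window > 0.1 s) :", math.exp(-rate * t_pkt))  # ~0.607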

Fig. 3 Graph of secondary users versus traffic load



Fig. 4 Graph of number of packets versus mean packet arrival

5 Conclusion

In this paper, the authors address the performance of the wireless transmission spectrum when cognitive radio technology is adopted for multiple primary users. It is shown that cognitive radio is one of the most promising technologies for effective spectrum utilization. Probabilistic formulae are developed to estimate performance measures among the various primary users.

Fig. 5 Graph of secondary users versus traffic load density



Fig. 6 Plot of number of packets versus mean packet arrival size

References

1. Ma X, Ning S, Liu X, Kuang H, Hong Y (2018) Cooperative spectrum sensing using extreme
learning machine for cognitive radio networks with multiple primary users. In: 2018 IEEE 3rd
advanced information technology, electronic and automation control conference (IAEAC), pp
536–540, IEEE
2. Cavalcanti D, Agrawal D, Cordeiro C, Xie B, Kumar A (2005) Issues in integrating cellular networks, WLANs, and MANETs: a futuristic heterogeneous wireless network. IEEE Wireless Commun 12(3):30–41
3. Wei L, Tirkkonen O (2012) Spectrum sensing in the presence of multiple primary users. IEEE Trans Commun 60(5):1268–1277
4. Mamatha E, Sasritha S, Reddy CS (2017) Expert system and heuristics algorithm for cloud
resource scheduling. Rom Stat Rev 65(1):3–18
5. Pradhan H, Kalamkar SS, Banerjee A (2015) Sensing-throughput tradeoff in cognitive radio
with random arrivals and departures of multiple primary users. IEEE Commun Lett 19(3):415–
418
6. Saritha S, Mamatha E, Reddy CS, Anand K (2019) A model for compound poisson process
queuing system with batch arrivals and services. Journal Europeen des Systemes Automatises
53(1):81–86
7. Saritha S, Mamatha E, Reddy CS (2019) Performance measures of online warehouse service
system with replenishment policy. Journal Europeen Des Systemes Automatises 52(6):631–638
8. Mamatha E, Saritha S, Reddy CS, Rajadurai P (2020) Mathematical modelling and performance
analysis of single server queuing system-eigenspectrum. Int J Mathemat Oper Res 16(4):455–
468

9. Mamatha E, Reddy CS, Krishna A, Saritha S (2021) Multi server queuing system with crashes and alternative repair strategies. Commun Statist Theor Meth. https://doi.org/10.1080/03610926.2021.1889603
10. Bayat S, Louie RH, Vucetic B, Li Y (2013) Dynamic decentralised algorithms for cognitive radio relay networks with multiple primary and secondary users utilising matching theory. Trans Emerg Telecommun Technol 24(5):486–502
Energy Efficient Void Avoidance Routing
for Reduced Latency in Underwater
WSNs

Swati Gupta and N. P. Singh

Abstract Given the unique characteristics of underwater wireless sensor networks (UWSNs), the provision of scalable and energy-efficient routing in UWSNs is very hard to achieve. Because the nodes are randomly deployed and energy harvesting cannot be employed, the void hole problem arises, which is regarded as the most challenging problem in the design of routing protocols. Void holes are areas from which a packet cannot be forwarded further in greedy mode. The creation of void or energy holes degrades the performance of these networks in terms of energy consumption, throughput, and network lifetime. This paper works on the concept of detecting void nodes in advance and following proactive routing, in which paths can be updated at regular intervals. Holding times can be reduced by using two-hop information. The paper deals with these aspects of the routing process. In terms of energy economy, packet delivery ratio, and end-to-end delay, simulation results show that the proposed protocol outperforms Depth Based Routing (DBR) and WDFAD-DBR.

Keywords UWSN · Void hole · Routing · Energy efficiency · Packet delivery


ratio · E2E delay

1 Introduction

Because of their broad and varied applications, undersea sensor networks have triggered interest in the scientific and business communities. Underwater environmental monitoring, mine exploration, disaster warning and prevention, oceanographic data
S. Gupta (B) · N. P. Singh


Department of Electronics and Communication Engineering, National Institute of Technology,
Kurukshetra, India
N. P. Singh
e-mail: npsingh@nitkkr.ac.in
S. Gupta
Panipat Institute of Engineering and Technology, Panipat, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 543
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_41

collecting, and tactical surveillance applications are only a few of the key uses of UASNs [1]. The underwater nodes gather information from the tough maritime environment and transmit it to stations offshore. Sensor nodes deployed underwater have acoustic modems, since they work only in aquatic environments, but the sink nodes have both acoustic and radio modems, because they have to receive messages from the deployed underwater sensors and convey this information through radio links to offshore base stations [2].
UWSNs suffer from several drawbacks. The propagation speed of acoustic signals is 1500 m/s, which is very low compared to the speed of radio signals, 3 × 10^8 m/s. Also, due to the mobility of nodes, no fixed topology can be assumed. Acoustic communication has limited bandwidth, leading to low data rates. Moreover, sensor nodes cannot be recharged and battery replacement is not feasible. Delay and Doppler spread cause intersymbol interference (ISI) in underwater signals [3, 4]. Prominently, two communication architectures have been proposed for underwater networks: two-dimensional and three-dimensional. Static two-dimensional UWSNs can be applied to ocean bottom monitoring. They comprise sensor nodes that are fixed to the ocean floor and are linked by wireless acoustic links to one or more underwater sinks. The underwater sinks transfer information from inside the sea network to a surface station. Typical uses are ecological monitoring or the surveillance of tectonic marine plates.
Since only random deployment is possible in underwater sensor networks, some areas may have a high population of nodes whereas other regions may be sparsely populated. This may lead to coverage gaps, or void holes, in some areas. Void holes may also arise from the sudden death of nodes because of high energy drainage, or from the unpredicted movement of nodes by water currents [5]. In the literature, many void-avoiding algorithms have been discussed, some of which work on depth information, others on residual energy and other forwarding metrics. The proposed protocol works proactively on two-hop information and bypasses void nodes in the routing process. Simulation results demonstrate that the proposed system outperforms the selected existing systems in terms of the chosen output measures. The rest of the paper is organized as follows:
• A summary of current protocols is given in second section,
• Problem statement is given in third section,
• System model is presented in the fourth section,
• Results for simulation and discussion are given in fifth section,
• Conclusion is provided in sixth section.

2 Related Work

Due to water currents, UWSNs are dynamic, and hence terrestrial protocols are not suited to underwater sensor networks. UWSN routing protocols are classified into two main categories: geographic-information-based and geographic-information-free routing. Geographic routing uses sensor node location information to route between source and destination [2]. It is further divided into receiver-based and sender-based routing, and further still into location-information-based routing and Depth Based Routing. Some of the receiver-based protocols are DBR, WDFAD-DBR, VAPR, DSD-DBR and H2-DAB; sender-based ones are VAR, HydroCast and ARP.
DBR makes routing decisions based on sensor node depth [6]. The packet header contains the sender's depth information. On receiving a packet, a node compares its own depth to the depth information found in the packet. If the receiving node's depth is smaller, the packet is accepted for further transmission; if the depth of the receiving node is larger, the packet is discarded. WDFAD-DBR [7] is likewise a depth-dependent protocol. In this protocol, routing decisions are based on the current hop and on forwarding nodes of the next hop. WDFAD-DBR's forwarding region is split into two auxiliary zones and a fixed primary zone that can be enlarged based on node density. The primary downside is that the fixed region limits routing options, and the acknowledgements received in response to each packet sent waste network resources. VAPR [5] also uses depth knowledge for forwarding data packets; by exchanging beacon messages, it predicts all the information about the path to the destination in advance. HydroCast [8, 9] is a pressure-based routing protocol. It functions in two parts: a greedy algorithm based on pressure, and a local lower-depth-first recovery method. Recovery paths are established a priori in HydroCast [10] by utilizing the depth attributes of the deployed nodes, but the exploration and management of the recovery route incur high overhead, particularly when the path is very long. In the Vector-Based Forwarding protocol [11], a virtual pipe from source to destination is created for data transmission, and only the nodes in the pipe are selected for data forwarding. However, repeatedly picking forwarders within the pipe does not balance energy usage, and the network lifespan is shortened due to this imbalanced energy consumption. In [12], a hop-by-hop vector-based forwarding protocol (HH-VBF) is given to address this problem and improve the performance of VBF. In HH-VBF, each node makes its own virtual pipe, because of which the direction of the vector changes according to the position of the node. Compared to VBF, the energy consumption is balanced; however, the hop-by-hop pipe computation generates significant communication overhead, higher than that of VBF [13]. Another protocol, an adaptive hop-by-hop vector-based forwarding scheme, is given in [14]. It is based on a fixed radius of the virtual routing pipeline. Once a packet is received at a node, it is initially tested to see whether its distance from the forwarder is under a predefined range. It uses holding times to eliminate unwanted broadcasts: the node with more neighbors has a higher holding time and vice versa [15]. While packet overlap is minimized, the end-to-end delay does not increase. Table 1 offers a comparison of various routing protocols in terms of latency, energy consumption, and bandwidth. The proposed protocol uses adjacent node location information to decide the next-hop forwarder neighbor set. To avoid unnecessary transmissions, nodes with higher depths are excluded from the forwarding process. This increases the network's residual energy and increases the packet delivery ratio by eliminating unwanted transfers and collisions [8, 16]. Although we presume that the sinks, once deployed, are stationary, the sensor nodes follow the random-walk mobility model: a sensor node selects a direction randomly and, unless otherwise specified, moves to the new position with a random velocity between the minimum and maximum velocities of 1 and 5 m/s, respectively. A performance analysis is carried out between the proposed protocol and the DBR and WDFAD-DBR protocols in a simulation environment, analyzing the metrics PDR, residual energy, end-to-end delay, and forwarding number.

3 Problem Statement

Energy harvesting cannot be employed in underwater sensor nodes, so every node has to work with its built-in battery and hence has a limited lifespan; consequently, energy consumption and lifetime are the main concerns in designing an underwater sensor network. Void recovery in underwater networks consumes a large amount of energy because of the search for alternate paths on encountering voids, leading both to energy drainage and to time delay [11, 17]. So the avoidance of void holes is one of the major concerns while developing a routing algorithm for underwater sensor networks. In routing algorithms such as DBR and WDFAD-DBR, the forwarder selection for the next hop is based on the depth difference between the current forwarder node and the next forwarder node. In DBR, the only criterion for selecting the next forwarding node is the one-hop distance, and no provision for handling voids is provided; in WDFAD-DBR, the selection criteria include two-hop neighbor information, but a local-optimum problem occurs in which the current source node takes a higher-depth node as the next forwarding node rather than a lower-depth node [18, 19]. In this paper, a routing protocol is proposed which removes the problem of local optima and works on the concept of defining the routing paths in advance by maintaining a table of two-hop forwarding neighbors.
Definition of a void node. Take a set of sensor nodes K = {K_n | 0 ≤ n ≤ i}, where d_Kn represents the depth of each node K_n. Further, N_Kn is the set of neighboring nodes of K_n, and S = {S_n | 0 ≤ n ≤ m} is a set of sink nodes located on the surface. A void node K_v is a node that cannot find, as a neighbor towards the surface, any node with a lower depth than itself; that is, it satisfies {K_i ∈ N_Kv | d_Ki < d_Kv} = ∅.
In Fig. 1, nodes f and t qualify as void nodes according to the above definition, since they have no node with a lower depth than theirs within their communication range [20, 21]. In greedy depth-based protocols, a void node may get a higher priority to act as a forwarder node, since it is at a lower depth than its neighboring nodes, and this results in packet drops. Our proposed protocol bypasses these void nodes by using a forwarding algorithm based on the local node density, selecting two-hop forwarders and saving this information in the routing table, and hence works towards finding a faster and shorter path from every sensor node towards the sink node.

Fig. 1 Void hole problem in UWSNs [22]

4 System Model

The proposed UWSN architecture is discussed as follows. The architecture is multi-sink: the sinks are randomly deployed on the water body surface, some nodes are randomly deployed at the bed of the water body, and others float throughout the area of study, attached to buoys, and can adjust their depths by adjusting the tethers. The relay nodes use acoustic signals to communicate with each other and with the sinks, which use radio signals to communicate with the terrestrial base stations. Hence, the sinks carry both radio modems and acoustic modems. Also, it is assumed that:
1. Sinks have more energy.
2. A packet, once received by any of the sinks, is assumed to be received by all the others as well.
3. To balance the load, the sinks are connected to each other.
4. The mobility of the nodes in the vertical direction is presumed to be insignificant; mobility of nodes is considered only in the horizontal direction.
The channel model used to calculate signal attenuation in underwater acoustics is detailed here. At a given frequency f, Thorp's channel model calculates the absorption loss α(f) [5] as follows:

10 log α(f) = 0.11 f^2/(1 + f^2) + 44 f^2/(4100 + f^2) + 2.75 × 10^−4 f^2 + 0.003,   f ≥ 0.4   (1)

10 log α(f) = 0.002 + 0.11 f/(1 + f) + 0.011 f,   f < 0.4   (2)

Here α(f) is in dB/km and f is in kHz. We determine the value of α from the absorption loss as follows:

α = 10^{α(f)/10}   (3)
Likewise, all tunable parameters are given in dB re μPa. The total attenuation A(l, f) can be computed by combining the absorption loss and the spreading loss as follows:

10 log(A(l, f)) = k · 10 log(l) + l · 10 log(α(f))   (4)

where the first term corresponds to the spreading loss and the second term to the absorption loss [11]. The spreading coefficient k defines the geometry of the signal propagation in underwater acoustics (i.e., k = 1 for cylindrical, k = 2 for spherical, and k = 1.5 for practical spreading).
The signal-to-noise ratio for an acoustic signal is given as

SNR(f, d) = T_p(f) − A(d, f) − N(f) + D_i   (5)

where f is the frequency, T_p(f) denotes the transmission power, D_i is the directivity index, and N(f) is given as

N(f) = N_th(f) + N_s(f)   (6)

i.e., only thermal and shipping noise are taken into consideration.
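A minimal sketch of Eqs. (1)–(4) follows, with f in kHz and the spreading term taken over the distance in metres (distance-unit conventions vary across papers, so that choice is an assumption here):

# Thorp absorption (Eqs. (1)-(2)) and total attenuation (Eq. (4)).
import math

def thorp_absorption(f_khz):
    """10 log alpha(f) in dB/km for f in kHz."""
    f2 = f_khz ** 2
    if f_khz >= 0.4:
        return 0.11 * f2 / (1 + f2) + 44 * f2 / (4100 + f2) \
               + 2.75e-4 * f2 + 0.003
    return 0.002 + 0.11 * (f_khz / (1 + f_khz)) + 0.011 * f_khz

def attenuation_db(l_km, f_khz, k=1.5):
    """10 log A(l, f): spreading (k = 1, 1.5 or 2) plus absorption."""
    spreading = k * 10 * math.log10(l_km * 1000)   # distance in metres
    absorption = l_km * thorp_absorption(f_khz)    # dB/km times km
    return spreading + absorption

print(f"{attenuation_db(0.5, 10.0):.2f} dB")       # 10 kHz link over 0.5 km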

4.1 Description of the Algorithm

The initial simulation parameters, discussed in Table 1 in the next section, are initialized for the proposed routing protocol. The sensor nodes that act as source nodes, collecting information from the water body, are deployed randomly throughout the region of interest, and the sinks are placed on the ocean surface. Every node broadcasts a beacon message at regular intervals, whereas the sink nodes broadcast the beacon message only once. The message is broadcast to locate the neighboring nodes within the transmission range; with its help, the nodes are able to locate neighbor nodes and obtain the hop count numbers. As time lapses, the location information of the nodes becomes obsolete and hence ineffective, so regular updating of the information is necessary, which is why beacon messages are generated periodically. At this point, each sensor has its neighbor node information, its distance from the nearest sink, and the hop count number from that sink. A beacon message consists of two fields (source ID, NFN ID), where source ID is the ID of the current node and NFN ID is the ID of the next forwarder node. By generating beacons, the neighbor table is updated and maintained; through this process, the sensor nodes keep their routing tables synchronized. The routing table format is given in Fig. 2: it consists of the neighboring nodes' IDs, these neighbors' next-hop neighbors, the distance from the neighbor, and the number of hops from the sink.

Neighbors' Node ID   Neighbors in Range   Distance from neighbor   No. of hops from the sink

Fig. 2 Routing table format
The routing algorithm consists of an updation stage and a routing stage [23]. All the nodes are initially regular nodes and identify themselves as voids through the collection of information from the neighboring nodes. As time passes, this information is updated to keep track of any new voids generated, or of any void node becoming a regular node because the nodes are in motion. During routing, the void nodes exclude themselves from the list of probable forwarding nodes, increasing the probability that regular nodes participate in the routing and thus increasing the packet delivery ratio. The size of the forwarding area can also be customized according to the density of the nodes.
In the harsh underwater world, the propagation delay is very high. When a packet is generated or received from a previous node, the node is required to forward it to other nodes in its communication range, which results in the generation of a large number of duplicate packets. High energy depletion results if all nodes are involved in the transmission, and limiting the forwarding nodes to an appropriate level is very difficult. Additionally, when a node receives a packet, the forwarder's coordinates are collected for the removal of redundant packets: if the node ID and sequence number match an entry of the packet queue, the packet is immediately dropped, thus forbidding the forwarding of redundant packets in the network. A minimal sketch of this duplicate-suppression step is given below.
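In the sketch, the names and the bound on the remembered history are illustrative assumptions, not values from the paper:

# Each node remembers (source ID, sequence number) pairs already seen and
# drops any packet whose pair matches an entry of this bounded queue.
from collections import deque

class DuplicateFilter:
    def __init__(self, maxlen=1024):
        self.seen = deque(maxlen=maxlen)

    def accept(self, src_id, seq_no):
        key = (src_id, seq_no)
        if key in self.seen:
            return False              # redundant copy: drop immediately
        self.seen.append(key)
        return True                   # first copy: forward

flt = DuplicateFilter()
print(flt.accept("node7", 42))        # True  (forwarded)
print(flt.accept("node7", 42))        # False (duplicate dropped)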
Holding-time calculation is important for suppressing the propagation of duplicate packets in the network and improving packet delivery [24, 25]. The proposed protocol extends the holding time up to the second hop of the source node; the network architecture showing two-hop nodes is given in Fig. 3. The parameters for the calculation include the distance of the forwarder from the edge of the transmission range of the source, the distance of the second-hop forwarder from the source node, the distance from the sink to the next forwarder, and the distance from the virtual pipeline. The algorithm for routing table updation is given in Fig. 4.

5 Simulation Results

5.1 Simulation Setup

For evaluating the efficiency of proposed Energy Efficient Void Avoiding Routing
(EE-VAR), the results will be compared with two baseline protocols, DBR and
WDFAD-DBR. The numbers of nodes in the network are varied from 100 to 300.

Fig. 3 Network model

Fig. 4 Algorithm for routing table updation:

1: Deployment of relay and sink nodes
2: Broadcasting of the beacon message by the nodes
3: for n nodes, perform
4:   Look for two-hop neighbors
5:   if two-hop neighbors exist then
6:     Calculate the depth from the surface
7:     Maintain a neighbor table
8:   end if
9:   Add or update entry in the table by using (NID, Nr, Dr, Counthop)
10:  Δd = pkt.depth − noden.depth
11:  if noden.depth < pkt.depth then
12:    Search for minimum depth in the neighboring table
13:  else
14:    Remove noden from nodei.NeighTable
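The listing in Fig. 4 can be read as follows in runnable form. This sketch makes several simplifying assumptions not stated in the paper: beacons are modelled as direct neighbour queries, neighbourhood is decided by depth difference alone, and a node within range of the surface is treated as able to reach a sink; all field names are illustrative.

# A runnable reading of the Fig. 4 updation stage: rebuild neighbour tables
# and flag as void any node with no shallower one- or two-hop neighbour.
from dataclasses import dataclass, field

@dataclass
class Node:
    nid: int
    depth: float                       # depth from the surface (m)
    neighbours: list = field(default_factory=list)

def update_tables(nodes, rng):
    for n in nodes:                    # beacon exchange, simplified
        n.neighbours = [m for m in nodes
                        if m is not n and abs(m.depth - n.depth) <= rng]
    voids = set()
    for n in nodes:
        if n.depth <= rng:
            continue                   # can reach a surface sink directly
        two_hop = set(n.neighbours)
        for nb in n.neighbours:
            two_hop.update(nb.neighbours)
        if not any(m.depth < n.depth for m in two_hop if m is not n):
            voids.add(n.nid)           # no shallower forwarder reachable
    return voids

net = [Node(0, 5.0), Node(1, 20.0), Node(2, 32.0), Node(3, 60.0)]
print("void nodes:", update_tables(net, rng=15.0))   # {3} in this toy layout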

Table 1 Simulation parameters

Simulation parameter                 Value
Channel                              Acoustic channel
Node deployment                      Random
Number of sensor nodes               100–300
Number of sinks                      10
Area of interest                     500 m × 500 m × 500 m
Transmitting energy                  1.5 W
Receiving energy                     0.1 W
Idle state energy consumption        10 mW
Propagation speed                    1500 m/s
Transmission power threshold         90 dB re μPa
Receiving power threshold            10 dB re μPa
Transmission range                   10 m
Node speed in horizontal direction   2 m/s
Mobility model                       MCM

All the simulations are carried out in NS2. Nodes are set out randomly in a three-dimensional area of 500 m × 500 m × 500 m. Each node consumes 1.5 W and 0.1 W for sending and receiving a packet, respectively. In the idle state, the power consumption is assumed to be 10 mW. The node speed in the horizontal direction is kept at 2 m/s. The transmission range of each node is considered to be 10 m. Ten sink nodes have been placed on the surface for data collection. The Meandering Current Mobility (MCM) model has been used for modeling the mobility of the sensor nodes. It is assumed that a packet is generated every 5 s. A simple Media Access Control (MAC) protocol, UWAN-MAC, is integrated into NS2 for underwater networks.

5.2 Performance Metrics

The various performance metrics used to evaluate the performance of the protocol
are discussed as:
1. PDR: Packet delivery ratio is the ratio of the total number of packets that the sink nodes successfully receive to the number of packets that the source nodes produce. It is calculated as

   Average PDR = Total(Pkt_sink) / Total(Pkt_source)   (7)

2. End-to-end delay: It is the time taken from the point the packet is released from the source to the time it reaches the sink, measured in seconds. The time T_n for a node to transmit a packet to the sink is the sum of the propagation delay, the transmission delay, and any additional delay:

   T_n = L_pac/b + (d_n − d_{n−1})/s + Δ   (8)

   where L_pac is the length of the data packet, b is the bandwidth, s is the speed of sound (1500 m/s), d_n − d_{n−1} is the distance from the next neighboring node, and Δ is the additional delay.
3. Energy tax: This is the average energy required by a node to successfully deliver a packet to the sink node, expressed in millijoules.
4. Forwarding number: This is the total number of transmissions made by a node over the entire simulation time. It greatly influences the energy consumption and packet collisions.
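A minimal sketch of Eqs. (7) and (8) follows; the numbers in the example calls are placeholders, not simulation outputs.

# Average PDR (Eq. (7)) and the per-hop packet time (Eq. (8)).
def average_pdr(pkts_at_sinks, pkts_generated):
    return pkts_at_sinks / pkts_generated

def hop_time(l_pac_bits, bandwidth_bps, hop_dist_m, extra_s=0.0, s_mps=1500.0):
    """Transmission delay + propagation delay + additional delay."""
    return l_pac_bits / bandwidth_bps + hop_dist_m / s_mps + extra_s

print(average_pdr(962, 1000))          # e.g. 0.962
print(hop_time(512, 10_000, 10.0))     # 512-bit packet over a 10 m hop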

6 Performance Analysis

1. Energy efficiency
Energy efficiency is one of the key factors to be considered in any routing protocol that avoids void holes. In the case of DBR and WDFAD-DBR, the static forwarding area leads to a rise in the number of data packet transmissions, receptions, and collisions, but the proposed protocol adjusts the forwarding area and communication strategy based on whether the network is dense or sparse and hence consumes less energy, thereby increasing the energy efficiency of the network, as can be seen in the graph shown in Fig. 5.
The increase in energy efficiency is clear from the graph: it grows with the number of nodes, since the probability of nodes being declared void nodes is reduced. The statistics are given in Table 2.

[Plot: energy efficiency (J) versus number of nodes (100–350) for DBR, WDFAD-DBR, and EE-VAR]

Fig. 5 Energy efficiency comparison



Table 2 Energy efficiency comparison for different number of nodes

Energy efficiency of the network (in Joule)
Number of nodes   DBR       WDFAD-DBR   EE-VAR
100               9.334     11.399      14.343
150               16.45     21.853      25.342
200               22.533    29.448      43.242
250               26.333    35.242      54.242
300               32.8955   48.3144     64.2312

[Plot: end-to-end delay (s) versus number of nodes (100–350) for DBR, WDFAD-DBR, and EE-VAR]

Fig. 6 End-to-end delay analysis

Table 3 E2E delay comparison for different number of nodes

End-to-end delay (in s)
Number of nodes   DBR       WDFAD-DBR   EE-VAR
100               88.244    65.263      51.438
150               143.243   119.342     87.328
200               231.538   187.328     123.983
250               318.422   231.983     149.253
300               398.860   283.890     187.346

2. End-to-end delay
The E2E delay of all three protocols decreases as the number of nodes increases, as can be seen in Fig. 6. The reason is the availability of more qualified forwarding nodes, and also that the holding time is reduced in a dense network. The E2E delay is comparatively higher in WDFAD-DBR, as can be seen in Table 3, because the next-forwarder selection is based on the depth information of the nodes. Our proposed protocol works proactively by adjusting the communication path; moreover, the forwarding nodes selected are those with a higher packet advancement probability, irrespective of their locations. Since the forwarding-path nodes are calculated one hop in advance, the holding time is reduced and so is the end-to-end delay.

[Plot: packet delivery ratio (%) versus number of nodes (100–350) for DBR, WDFAD-DBR, and EE-VAR]

Fig. 7 PDR comparison

Table 4 PDR comparison for different numbers of nodes

Packet delivery ratio (in %)
Number of nodes   DBR       WDFAD-DBR   EE-VAR
100               9.343     12.342      16.353
150               23.342    28.222      32.422
200               45.3422   50.245      54.893
250               69.423    73.2422     76.4533
300               90.247    93.374      96.2558

3. Packet delivery ratio
The proposed protocol clearly has a higher PDR than DBR and WDFAD-DBR, as shown in Fig. 7. In Depth Based Routing, packets are delivered on the basis of depth information, and how to deal with packets that get stuck in void nodes is not specified. Since the proposed protocol is proactive, all paths leading to void nodes can be avoided by using the shortest recovery paths. The updation can also be performed more frequently so as to keep the information current, and the packet holding time can be minimized because of this. The protocol proactively decides the routes, so it excludes all routes leading to void nodes, allowing more regular nodes to participate in forwarding the packet. The PDR of all the protocols also increases with the growth in node density, as is clear from Table 4.
4. Forwarding number
The forwarding number is lower than in DBR, in which the packet is forwarded to every node having a depth lower than the current relay node. In the case of WDFAD-DBR, the forwarding area does not adapt to changes in the density of nodes (Fig. 8), hence the number of transmissions remains the same for dense deployment. This is not the case with the proposed EE-VAR protocol (Table 5).

[Plot: forwarding number versus number of nodes (100–350) for DBR, WDFAD-DBR, and EE-VAR]

Fig. 8 Forwarding number comparison

Table 5 Forwarding number for different number of nodes

Forwarding number
Number of nodes   DBR   WDFAD-DBR   EE-VAR
100               198   123         98
150               319   287         132
200               465   328         283
250               523   437         348
300               649   537         429

7 Conclusion

Minimizing energy consumption, and hence extending network lifetime given the restricted resources underwater, is one of the major factors affecting the design of routing protocols in UWSNs. The random distribution of nodes, void holes, and the inherent characteristics of underwater communication reduce the network's lifespan; void holes are deemed to be one of the most important factors degrading network performance. By selecting the optimal forwarding nodes, unnecessary energy consumption can be reduced by a significant amount. The simulation results confirm the efficacy of the proposed work in terms of energy efficiency, E2E delay, PDR, and forwarding number. It provides a mechanism to achieve better communication over void regions.

References

1. Sozer EM, Stojanovic M, Proakis JG (2000) Underwater acoustic networks. IEEE J Oceanic Eng 25(1):72–83
2. Xie P, Cui J-H, Lao L (2006) VBF: vector-based forwarding protocol for underwater sensor
networks. In: NETWORKING 2006. Networking technologies, services, protocols; perfor-
mance of computer and communication networks; mobile and wireless communications
systems. Lecture notes in computer science, vol 3976. Springer, Hiedelberg

3. Chaima Z, Fatma B, Raouf B (2016) Routing design avoiding energy holes in underwater
acoustic sensor networks. Wireless Commun Mob Comput 16(14):2035–2051
4. Wang H, Wang S, Zhang E, Lu L (2018) An energy balanced and lifetime extended routing
protocol for underwater sensor networks. Sensors 18:1596
5. Noh Y, Lee U, Wang P, Choi BSC, Gerla M (2013) VAPR: void-aware pressure routing for
underwater Sensor networks. IEEE Trans Mob Comput 12(5):895–908
6. Yan H, Shi ZJ, Cui JH (2008) DBR: depth-based routing for underwater sensor networks. In: Das A, Pung HK, Lee FBS, Wong LWC (eds) Networking 2008 ad hoc and sensor networks,
wireless networks, next generation Internet. Lecture notes in computer science, vol 4982.
Springer, Berlin, Germany
7. Yu H, Yao N, Wang T, Li G, Gao Z, Tan G (2016) WDFAD-DBR: Weighting depth and
forwarding area division DBR routing protocol for UASNs. Ad Hoc Netw 37:256–282
8. Hong Z, Pan X, Chen P, Su X, Wang N, Lu W (2018) A topology control with energy balance
in underwater wireless sensor networks for IoT-Based application. Sensors 18:2306
9. Jin Z, Wang N, Su Y, Yang Q (2018) A glider-assisted link disruption restoration mechanism
in underwater acoustic sensor networks. Sensors 18:501
10. Noh Y, Lee U, Lee S, Wang P, Vieira LFM, Cui J, Gerla M, Kim K (2016) Hydrocast: pressure
routing for underwater sensor networks. IEEE Trans Vehicul Technol 65(1):333–347
11. Ahmed F, Wadud Z, Javaid N, Alrajeh N, Alabed MS, Qasim U (2018) Mobile sinks assisted
geo-graphic and opportunistic routing based interference avoidance for underwater wireless
sensor network. Sensors 18(4):1062
12. Chen W, Gang Z, Yang S, Lei Z (2015) Improvement research of underwater sensor network
routing protocol HHVBF. In: 11th International conference on wireless communications,
networking and mobile computing (WiCOM 2015), Shanghai, pp 1–6
13. Jouhari M, Ibrahimi K, Tembine H, Ben-Othman J (2019) Underwater wireless sensor
networks: a survey on enabling technologies, localization protocols, and Internet of underwater
things. IEEE Access 7:96879–96899
14. Khan A, Ali I, Rahman AU, Imran M, Mahmood H (2018) Co-EEORS: cooperative energy
efficient optimal relay selection protocol for underwater wireless sensor networks. IEEE Access
6:28777–28789
15. Javaid N, Majid A, Sher A, Khan W, Aalsalem M (2018) Avoiding void holes and collisions
with reliable and interference-aware routing in underwater WSNs. Sensors 18:3038
16. Khasawneh A, Latiff MSBA, Kaiwartya O, Chizari H (2018) A reliable energy-efficient
pressure-based routing protocol for the underwater wireless sensor network. Wireless Netw
24:2061–2075
17. Javaid N, Jafri MR, Ahmed S, Jamil M, Khan ZA, Qasim U, Al-Saleh SS (2015) Delay-sensitive
routing schemes for underwater acoustic sensor networks. Int J Distrib Sens Netw 11:532676
18. Kanthimathi N (2017) Void handling using geo-opportunistic routing in underwater wireless
sensor networks. J Comput Elect Eng 64:2417–2440
19. Coutinho RW, Boukerche A, Vieira LF, Loureiro AA (2016) Geographic and opportunistic
routing for underwater sensor networks. IEEE Trans Comput 65(2):548–561
20. Ghoreyshi SM, Shahrabi A, Boutaleb T (2017) Void handling techniques for routing protocols in
underwater sensor networks: survey and challenges. IEEE Commun Surv Tutor 19(2):800–827
21. Coutinho RWL, Boukerche A, Vieira LFM, Loureiro AAF (2018) Underwater wireless sensor
networks: a new challenge for topology control–based systems. ACM Comput Surv 51
22. Feng P, Qin D, Ji P. et al. (2019) Improved energy-balanced algorithm for underwater wireless
sensor network based on depth threshold and energy level partition. J Wireless Com Netw
228:110–125
23. Latif K, Javaid N, Ullah I, Kaleem Z, Abbas Malik Z, Nguyen LD (2020) DIEER: delay-
intolerant energy-efficient routing with sink mobility in underwater wireless sensor networks.
Sensors 12:1503–1512

24. Awais M, Javaid N, Rehman A, Qasim U, Alhussein M, Aurangzeb K (2019) Towards void
hole alleviation by exploiting the energy efficient path and by providing the interference-free
proactive routing protocols in IoT enabled underwater WSNs. Sensors
25. Latif K, Javaid N, Ahmad A, Khan ZA, Alrajeh N, Khan MI (2016) On energy hole and
coverage hole avoidance in underwater wireless sensor networks. IEEE Sens J 16:4431–4442
Sentiment Analysis Using Learning
Techniques

A. Kathuria and A. Sharma

Abstract This paper depicts sentiment analysis classification as an efficient process for analysing textual data coming from various internet resources. Sentiment analysis has been one of the most prominent research subjects in recent years. Providing an accurate and useful overview of the network is a critical challenge for both companies and users, and it requires a large amount of data in terms of configuration, usage and content; the sentiment analysis principle is proposed to address this problem. Methods of sentiment analysis seek to reveal feelings, subjectivity and opinions in text. It is virtually impossible to analyse such a vast number of reviews manually; as a result, SA is used for extracting the general polarity or sentiment of opinions from documents. Sentiment analysis is applied to Twitter, movie reviews, blogs and consumer comments, among other sources. Three methods are commonly used for sentiment analysis: machine-learning-based techniques, lexicon-based techniques and hybrid techniques, of which the machine learning approach is the most efficient and accurate. Many variations and extensions of machine learning methods and software have recently become available. This paper presents a brief introduction to sentiment analysis, its classification and the levels of sentiment analysis. Further, machine learning techniques for sentiment analysis are described in detail. The paper aims to provide brief knowledge of the sentiment analysis process, including standard SA methods, from the viewpoint of ML methods, in which machines interpret and identify human sentiments conveyed in speech and text.

Keywords Sentiment analysis · Machine learning · Deep learning · Supervised ·


Unsupervised · Semi-supervised

A. Kathuria (B) · A. Sharma


Department of Computer Science and Engineering, Maharishi Markandeshwar Engineering
College, Maharishi Markandeshwar (Deemed to be University), Haryana, India
A. Sharma
e-mail: asharma@mmumullana.org

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 559
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_42

1 Introduction

SA is an information extraction and natural language processing task that analyses many documents to extract the writer's opinion conveyed in positive or negative statements, queries and demands. The computational study of individuals' feelings, beliefs, assessments and appraisals towards individuals, activities and characteristics is known as opinion mining and sentiment analysis. Its wide variety of applications [1] and diverse scientific issues have received much interest from industry and academia over the last few years. Today, sentiment analytics is motivated by the rapid growth in web usage and public opinion sharing. The Internet is a massive database containing both unstructured and structured information, and the process of analysing this data to obtain sentiment and latent public opinion is complex.
"where tl is the time when the opinion is expressed, oj is a target object, soijkl is the sentiment value of the opinion of the opinion holder hi on feature fjk of object oj at time tl, hi is an opinion holder, fjk is an attribute of the object oj, and soijkl is positive, negative, or neutral, or a more granular ranking," according to Liu et al. [2].

(1) Features of sentiment analysis: sentiment analysis uses polarities and their blends to create a range of features such as bi-grams and tri-grams. As a result, emotions are analysed for both positive and negative aspects using various training algorithms and support vector machines. In sentiment analysis, NN is used to calculate label belongingness. The conditional dependencies of nodes of an acyclic graph and the edges handled by a BN are used to aid data extraction at the context level. Data precision and learning can be achieved on social media platforms by optimizing sentences and terms. For generating positive and negative characteristics of data at the term root stage, data tokenization is used. These techniques are used to reduce sentiment analysis errors and achieve a higher degree of accuracy on social media data [3].
(2) Sentiment analysis techniques: two types of techniques are used in sentiment analysis: (i) ML-based techniques and (ii) lexicon-based techniques [3].
(a) Corpus-based or lexicon-based techniques: here, strategies are based on classifiers similar to sentiment classification methodologies, such as sequential minimal optimization (SMO), k-Nearest Neighbors (KNN), conditional random fields (CRF), Hidden Markov Models (HMM) and single dimensional classification (SDC).
(b) ML-based techniques: extraction at the sentence and feature levels is used in this family of techniques. Bag-of-words, uni-grams, bi-grams, n-grams and parts-of-speech (POS) codes are among the elements. At the aspect and sentence levels, ML operates in three flavours: maximum entropy, Naïve Bayes and support vector machine (SVM).
(3) Sentiment analysis as a multidisciplinary field: since it encompasses areas like computational linguistics, machine learning, knowledge retrieval, semantics, natural language processing and artificial intelligence, sentiment analysis is a multidisciplinary field [4]. The approaches to sentiment analysis can be classified into three levels of extraction: (a) aspect or attribute level; (b) text level; and (c) sentence level [3].

[Figure: input documents (web user comments and reviews) flow through pre-processing and sentiment classification into positive or negative opinions]

Fig. 1 Process of sentiment analysis

The overall method of sentiment analysis is depicted in Fig. 1: it begins with the pre-processing of the review dataset and continues with opinion mining or sentiment classification using various dictionary-based techniques or machine learning algorithms.

1.1 Sentiment Analysis Classification

Lexicon-based methods and ML algorithms are the two main families of SA techniques. In addition, a hybrid approach that combines both machine learning and lexicon methods can be used in certain circumstances. Figure 2 [5] depicts this grouping of methods.
Lexicon-based methods count and weigh sentiment-related terms from a lexicon to conduct sentiment analysis. Two major approaches are used to gather sentiment-related words. The corpus-based approach starts with a seed collection of sentiment words of established polarity; it then searches a large corpus for other sentiment words to classify sentiment words with context-specific orientations. The other approach, the dictionary-based technique, relies on gathering a primary seed collection of opinion terms before searching a dictionary for their synonyms and antonyms to extend the range.

[Figure: taxonomy of SA techniques. Lexicon-based approach: corpus-based (statistical, semantic) and dictionary-based. Machine learning: supervised learning (probabilistic classifiers: Naïve Bayes, Bayesian network, maximum entropy; rule-based classifiers; linear classifiers: SVM, NN; decision tree), semi-supervised learning and unsupervised learning]

Fig. 2 Sentiment analysis techniques
In SA, a machine learning technique applies well-known ML algorithms to text data. As previously stated, this paper aims to look at ML approaches to sentiment analysis; papers on SA approaches using ML methods are discussed in the following sections.
Sentiment classification aims to determine the orientation of sentiment in a given text. It is linked to polarity identification, vagueness resolution in opinionated text, cross-lingual and cross-domain contexts, and sentiment interpretation in multi-lingual settings [6]. Most emotion recognition tasks use a machine learning approach. Saleh et al. used SVM with a variety of feature selection methods to perform sentiment classification studies on polarity determination [7]. Three benchmark datasets were included in the experiments, and strong classification accuracies were recorded for various features on these datasets.

2 ML Approach for SA

The initial phase in machine learning for SA is the compilation of a dataset containing labelled tweets. This dataset is likely to be noisy, so it should be pre-processed using various natural language processing (NLP) techniques. After that, appropriate sentiment analysis features must be extracted, and the classifier must be trained and then tested on unseen data. A minimal sketch of this workflow is given below.
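In the sketch, the four toy reviews stand in for a real labelled tweet dataset, and TF-IDF features with a Naïve Bayes classifier are chosen only because both are common baselines; the tiny split makes the output illustrative at best.

# Labelled texts -> vectorised features -> trained classifier -> prediction.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["great product, loved it", "terrible service, very slow",
         "absolutely fantastic experience", "awful, would not recommend"]
labels = ["pos", "neg", "pos", "neg"]

X_tr, X_te, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.5, random_state=0, stratify=labels)

model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(X_tr, y_tr)
print(model.predict(["slow and awful"]))   # expected: ['neg']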

[Figure: workflow from online comments, tweets and product reviews through distributed data computing, pre-processing, feature extraction and vectorizing, to training an ML model and predicting review sentiments]
Fig. 3 Workflow for sentiment analysis using machine learning

In sentiment classification, the application of well-known ML approaches to text data is central to the ML technique. ML is practical since it is completely automated and can manage huge amounts of data. Sentiment classification based on machine learning may be categorised into three types: supervised, semi-supervised and unsupervised learning approaches. Since this technique is automated and can accommodate large amounts of data, it is well suited to sentiment analysis [4].
A major concern in this area is the data sets used in SA. Product feedback is the primary source of information. These reviews are essential to business owners, because they can make business decisions based on the outcome of analysing consumer feedback about their goods; the majority of this material comes from review pages. SA is applied in a variety of settings, including political debates [8], product reviews [9, 10] and news articles [11]. For example, from political forums we might learn about people's views on specific political parties or election candidates, and political positions may also be used to forecast election outcomes. People openly express and exchange their views about a subject on microblogging platforms and social networks, which are considered excellent sources of knowledge and are often used as data sources in the SA method. The sentiment classification (SC) methods shown in Fig. 2 are explored in greater depth below, with references to related papers and sources.
• Supervised learning
Supervised learning methods require labelled training documents. Supervised learning is a proven approach for classification, and it has been used to classify sentiments with positive outcomes. ANN, DT (decision tree), maximum entropy (ME), NB and SVM classifiers are the supervised classification methods most commonly used in SA. BN (Bayesian network), LR, KNN (k-Nearest Neighbors) and RF are other, less widely used algorithms.
• Unsupervised learning
The aim of text classification is to divide documents into a set of predefined categories. Supervised learning uses labelled training data for classification, and its success depends on a good fit between the training and testing data. It is usually challenging to produce labelled training documents for text classification, as it necessitates human effort, whereas unlabelled documents are easy to collect. These issues can be addressed by unsupervised learning techniques. Unsupervised learning is less reliant on the subject or domain in sentiment classification, and its output is much stronger [12]. There are several research studies in this area.
• Semi-supervised learning
Semi-supervised learning is an ML approach that differs from unsupervised and supervised learning [12, 13]. Both labelled and unlabelled data are utilised in semi-supervised learning. The main theory behind it is that the unlabelled data contain information about the joint distribution over classification features together with information about the classes [14]. Where labelled data are scarce, semi-supervised learning methods can outperform unsupervised and supervised learning methods.
Figure 3 depicts the various steps involved in sentiment analysis using machine learning. The compilation of a dataset containing labelled tweets is the initial phase; since this data is likely to be noisy, it must be pre-processed using various natural language processing (NLP) techniques. After that, appropriate sentiment analysis features must be extracted, and the classifier must be trained and evaluated on unseen data.
Figure 4 shows the share of supervised, semi-supervised and unsupervised machine learning methods in use. Despite the downside of needing labelled data, the survey results show that NB and SVM algorithms are the most commonly used supervised learning approaches in the literature. Because of the sparse nature of text, it is often possible to separate it linearly; as a result, the support vector machine is commonly used in SA research. The other finding of the survey is that Naïve Bayes is a widely used approach for SA due to its simplicity and its lack of dependence on the position of words in documents.

3 Levels of SA

The activities outlined in the preceding section can be completed at various granularity levels, including document level, sentence level, word or phrase level, and feature level. The levels of granularity of sentiment analysis are depicted in Fig. 5. Tasks of SA can be completed at the following granularity levels.

Fig. 4 Distribution of ML techniques [pie chart; legend: SVM, NB, ME, ANN, DT, Other, U-SS; visible shares: 37%, 22%, 16%, 7%, 5%, 4%]

[Figure: levels of sentiment analysis. Document-level, sentence-level, word-level (dictionary based and corpus based) and feature-based sentiment analysis]

Fig. 5 Levels of sentiment analysis



3.1 Document Level Sentiment Analysis

Document-level SA treats the entire document as the basic unit for determining sentiment orientation. To simplify the task, each text's overall opinion is assumed to concern a single object and to be held entirely by a single opinion holder. There are a variety of machine learning approaches for this task. Pang et al. [15] categorised feedback as negative or positive using conventional machine learning approaches. They experimented with features such as term frequency, bigrams, unigrams, parts-of-speech, and term presence and position, as well as three classifiers (Naïve Bayes, support vector machines and maximum entropy). They concluded that the SVM classifier is the most efficient and that unigram presence information is very effective. Pang and Lee [16] have also formulated document-level sentiment analysis as a regression challenge, using supervised learning to predict rating scores. Finding a linear combination of the polarities in the document, as described by Turney [17] and Dave et al. [18], is a basic and straightforward procedure.
The problem that arises from this idea is that a text can include mixed opinions, and individuals may articulate the same opinion in various ways, because of the imaginative nature of natural language, even without using any opinion terms. As previously mentioned, factual and subjective sentences are equally likely to appear in a document. As a result, methods are required to derive meaningful knowledge from subjective rather than objective sentences. This leads to sentiment analysis at the sentence level.

3.2 Sentence Level Sentiment Analysis

Research has been done on detecting subjective sentences in a text containing a mixture of subjective and objective sentences at the sentence level, and then determining their sentiment orientation. Yu and Hatzivassiloglou [19] and Vateekul and Koomsubha [26] attempt to distinguish subjective sentences and assess their opinion orientations. Their work employs supervised learning to identify opinions and subjective sentences. It used a system similar to Turney's [17] to classify sentiment for all detected subjective sentences, but with a log-likelihood ratio as the score parameter and far more seed terms. Liu et al. [20] used a straightforward approach to obtain the overall polarity of an opinion statement by aggregating the orientations of the terms in the sentence.

3.3 Word Level Sentiment Analysis

Determining semantic orientation at the word or phrase level plays a crucial role in
sentiment analysis. For sentiment classification at the document and sentence level,
most works use the prior polarity of terms and phrases. As a result, the manual or
semi-automatic development of a semantic-orientation word lexicon is common.
The majority of word sentiment grouping uses adjectives, but researchers also use
nouns, adverbs and verbs. Corpus-based methods and dictionary-based methods
are the two techniques for automatically annotating sentiment at the word level.

3.3.1 Dictionary-Based Methods

With this approach, a small seed list of terms with known prior polarity is generated.
This seed list is then iteratively expanded by retrieving synonyms and antonyms from
online dictionary databases such as WordNet. Kim and Hovy [21] manually generated
two seed lists, one with positive adjectives and verbs and one with negative adjectives
and verbs. They expanded these lists by retrieving the synonyms and antonyms of the
seed terms from WordNet and adding them to the appropriate list (synonyms were
placed in the same list and antonyms in the opposite one). The sentiment intensity of
new, unseen terms was dictated by how they interacted with the seed lists: the negative
and positive sentiment strengths were calculated for each expression and their relative
magnitudes were compared. Kamps et al. [22] evaluated the semantic orientation of
terms using WordNet lexical relations. They gathered all synonymous terms in
WordNet, i.e. words belonging to the same synset, and built a graph whose edges
connect synonymous words. A word's semantic orientation was determined by its
graph distance from the two seed words bad and good, where the distance between
two terms wi and wj is the length of the shortest path between them. The resulting
values lie in [−1, 1], with the absolute value representing the strength of the orientation.
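
As an illustration of this graph-distance idea, the following is a minimal sketch, assuming a tiny hand-made synonym graph rather than real WordNet synsets; the seed words good and bad and the score formula follow the description above.

```python
# A toy illustration of the Kamps et al. graph-distance measure, using a
# small hand-made synonym graph instead of real WordNet data.
import networkx as nx

graph = nx.Graph()
graph.add_edges_from([
    ("good", "decent"), ("decent", "fine"),
    ("bad", "awful"), ("awful", "poor"),
    ("fine", "poor"),  # a weak link connecting the two regions
])

def orientation(word):
    """(d(word, bad) - d(word, good)) / d(good, bad), a value in [-1, 1]."""
    d_bad = nx.shortest_path_length(graph, word, "bad")
    d_good = nx.shortest_path_length(graph, word, "good")
    return (d_bad - d_good) / nx.shortest_path_length(graph, "good", "bad")

print(orientation("decent"))  # positive: closer to "good" than to "bad"
print(orientation("awful"))   # negative: closer to "bad"
```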

3.3.2 Corpus-Based Methods

Corpus-based methods rely on statistical or syntactic cues such as a word's
co-occurrence with words of known polarity. Hatzivassiloglou and McKeown [23]
assumed that pairs of conjoined adjectives have the same orientation if conjoined
by and and the opposite orientation if conjoined by but. Using a log-linear regression
model over conjunctions such as "brutal or corrupt" or "simplistic yet well-received",
they formed clusters of similarly and oppositely oriented terms. The cluster containing
words with the higher average frequency was intuitively labelled as the positive list.
This unsupervised classification method required an enormous corpus. Turney [17]
used association to assign semantic orientation: terms with favourable associations
(e.g. romantic ambience) are assumed to have a positive orientation. To label an
unfamiliar term as negative or positive, its association with a set of manually picked
seed words (such as poor and excellent) was measured by computing the pointwise
mutual information between them, estimated from the number of results returned by
the AltaVista search engine when the terms were joined with the NEAR operator.
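
A minimal sketch of this association measure is given below. The hits function is a stand-in assumption for the original AltaVista NEAR queries, which are no longer available; in practice one would substitute co-occurrence counts from any large corpus or search index.

```python
# A sketch of Turney-style semantic orientation from association (SO-PMI).
# `hits(query)` is a hypothetical counting function standing in for the
# original AltaVista NEAR-operator searches.
import math

def so(word, hits, pos_seed="excellent", neg_seed="poor", eps=0.01):
    """log2 of the ratio of associations with the positive vs negative seed;
    positive values suggest positive orientation."""
    num = (hits(f"{word} NEAR {pos_seed}") + eps) * (hits(neg_seed) + eps)
    den = (hits(f"{word} NEAR {neg_seed}") + eps) * (hits(pos_seed) + eps)
    return math.log2(num / den)

# Toy usage with a dictionary of pretend counts.
counts = {"romantic NEAR excellent": 80, "romantic NEAR poor": 10,
          "excellent": 1000, "poor": 1200}
print(so("romantic", lambda q: counts.get(q, 0)))  # > 0, i.e. positive
```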

3.3.3 Feature-Based SA

The author of a review discusses the benefits and drawbacks of a product. Even
if the overall opinion of the product is negative or favourable, the reviewer may
dislike certain features and like others. Sentiment classification at the sentence or
document level does not capture this kind of detail. As a result, feature-based senti-
ment analysis is needed. This entails extracting a product feature as well as the
associated opinion about it. While it is natural to believe that nouns and noun phrases
express product features, not all nouns and noun phrases are features of the product.
Esuli and Sebastiani [24] further narrowed the candidate terms by restricting them
to base nouns.

4 Deep Learning

Deep learning, suggested by Hinton in 2006, is a branch of the ML method that
corresponds to deep neural networks (DNN) [25]. The neural network is inspired by
the human brain and consists of many neurons that form an impressive network.
DNNs can be trained on both unsupervised and supervised categories of data [26].
Deep Belief Networks (DBN), Recurrent Neural Networks (RNN), Convolutional
Neural Networks (CNN), Recursive Neural Networks and many other networks are
used for deep learning. Tasks such as feature representation, sentence modelling,
word representation prediction, text generation, vector representation and sentence
classification all benefit from neural networks [25].

4.1 Applications of DL

Deep architecture is made up of a variety of non-linear operations at various stages.
Because of its ability to model hard AI tasks, deep architecture such as the DBN is
expected to do well in semi-supervised learning and to achieve significant perfor-
mance in the natural language processing community [27]. Deep learning builds on
improved learning processes, improved software engineering, and access to training
data and computing resources [28]. It is grounded in neuroscience and has a huge
effect on technologies such as computer vision, natural language processing and
speech recognition. One of the most fundamental problems of deep learning research
is determining how to learn the model's configuration and unknown variables, and
the number of units for each layer [29].
When working with various features, deep learning architecture demonstrates its
maximum ability, and it necessitates a large number of labelled samples. Deep
learning networks and techniques have benefited tasks such as time-series prediction,
visual recognition, pedestrian identification, off-road robot navigation, target recog-
nition and auditory signal processing [30]. According to a very inspiring line of work
in natural language processing, complex multi-tasking such as semantic tagging can
be firmly achieved using deep architectures [30].

4.2 Combining Deep Learning with Sentiment Analysis

Deep learning has a lot of influence in both unsupervised and supervised learning,
and various scholars are using it to handle sentiment analysis. It comprises many
common and efficient models used to solve various issues [31]. The most well-known
example is Socher's use of recursive neural networks to represent movie reviews
from the website rottentomatoes.com.
Many researchers have used neural networks to conduct sentiment recognition,
building on the work of [32]. Kalchbrenner [33], for example, proposed a dynamic
CNN that uses a pooling operation, dynamic k-max pooling, over linear sequences.
Kim [34] employs CNNs to learn sentiment-bearing sentence vectors.
Mikolov et al. [35] also suggest a paragraph vector for learning sentiment-specific
word embeddings (SSWE), which performs better sentiment analysis than ConvNet
and bag-of-words models. ConvNets have recently been extended to operate on
characters rather than word embeddings [36]. LSTM is another composition model
for sentiment classification, in which information flows in one direction through the
sequence [37].
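
As an illustration of such a unidirectional composition model, here is a minimal sketch of an LSTM sentiment classifier in PyTorch; the vocabulary size, dimensions and two-class output are illustrative assumptions, not details from the cited works.

```python
# A minimal unidirectional LSTM sentiment classifier sketch in PyTorch,
# assuming integer-encoded token sequences; all sizes are illustrative.
import torch
import torch.nn as nn

class LSTMSentiment(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=100,
                 hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # A single-direction LSTM: information flows left to right only.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):           # token_ids: (batch, seq_len)
        embedded = self.embedding(token_ids)
        _, (hidden, _) = self.lstm(embedded)
        return self.fc(hidden[-1])           # logits over sentiment classes

# Toy usage: a batch of two 5-token "sentences".
model = LSTMSentiment()
logits = model(torch.randint(0, 10000, (2, 5)))
print(logits.shape)  # torch.Size([2, 2])
```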

5 Literature Review

Piryani et al. [38] provide a scientometric study of OMSA studies from 2000 to 2016.
By computationally and manually reviewing the journal publication results in OMSA,
this paper provides a concise empirical account of OMSA research from 2010 to 2015.
The paper contributes to a better view of the wider landscape of OMSA science by
presenting findings that are valuable to researchers (and others who are considering
starting research) in the field. To the best of our understanding, the scientific find-
ings are the first of their kind. The findings can be useful to academics and experts
working in the field from a variety of backgrounds. Hussein [39] presented a survey
on sentiment analysis, reviewing its various techniques. While several existing
algorithms achieve decent sentiment analysis outcomes, the author concludes that no
methodology can fully overcome all the challenges. Recent approaches to opinion
mining aim to deal with the philosophical rules that govern sentiments. To offer a
broader opinion mining component that is more profoundly influenced by individual
thoughts and psychology, prospective programmes must combine the ideas of
common sense and logic.

Devika et al. [40] compared the various methodologies used for sentiment
analysis. The authors work to improve research methods in this field, including
semantics, by using n-gram evaluation rather than word-by-word analysis. Other
approaches they survey include rule-based and lexicon-based methods. The majority
of people in the Internet world depend on social networking sites to obtain valuable
information; analysing the feedback on these blogs would provide greater under-
standing and aid decision-making.
Kharde and Sonawane [41] conduct a survey and comparative analysis of current
opinion mining methods, covering lexicon-based and machine learning methods
and measurement metrics. According to their findings, machine learning approaches
like naïve Bayes and SVM have the best precision and can be considered baseline
learning methods, while lexicon-based methods are very successful in some situations
and need little effort on human-labelled documents. The authors also investigated
the impact of different features on the classifier and conclude that the cleaner the
records, the more precise the findings.
Rajput et al. [42] attempt to solve this issue by analysing student reviews using
multiple text analytics approaches. The paper introduces a sentiment analysis-based
metric that is seen to be strongly correlated with aggregated Like-scale ratings,
and uses sentiment scores, tag clouds and other frequency-based filters to offer new
insights into a teacher's results. The paper also showed that the sentiment score can
be compared to aggregated scores such as scale-based scores. Moreover, like word
clouds, the sentiment ranking offers more informational insight than the Like-based
score.
Pradhan et al. [43] examine numerous sentiment analysis algorithms and discuss
the problems and applications that arise in this area. The authors conclude that while
a dictionary-based solution requires less processing time than supervised learning,
its precision is not up to par; using a supervised learning technique improves
accuracy. Based on this study's results, it can be inferred that supervised methods
are more accurate than dictionary-based approaches.
A hybrid sentimental entity recognition model (HSERM) has been developed [44].
In today's emotion analysis, most of the literature considers only the differentiation
of positive and negative emotional attitudes, and no one has explored the composi-
tion factors behind emotion analysis. However, emotional makeup analysis of hot
topics and public image polling have a wide range of applications in our everyday
lives. Previous researchers applied deep learning approaches to named entity recog-
nition, but with this approach it is difficult to reach a significant level for data with
a lot of noise. The detection accuracy of person recognition with deep learning, on
the other hand, can reach over 95%.

5.1 Related Work Based on Supervised Learning Approach

There are two sets of documents in this method: a training set and a test set. The
test collection is used for evaluation purposes, while the classifier uses the training
set to learn the characteristics of the text. Many methods may be used to classify
reviews.

5.1.1 Decision Tree Classifier

A decision tree classifier provides a hierarchical decomposition of the training data
space, dividing the data based on an attribute-value condition. The predicate or
condition is the absence or presence of one or more words. The data space is divided
recursively until the leaf nodes hold a certain minimum number of records, which
are then used for classification.
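
A minimal sketch of such a classifier is shown below, assuming scikit-learn and a tiny illustrative training set; binary word-presence features mirror the absence/presence predicate described above.

```python
# A sketch of a decision-tree sentiment classifier over word-presence
# features, trained on a tiny illustrative dataset.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier

train_docs = ["great movie loved it", "terrible plot awful acting",
              "loved the acting", "awful and boring"]
train_labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# binary=True records word presence/absence, matching the predicate
# "absence or presence of one or more words" used to split the data.
vectorizer = CountVectorizer(binary=True)
X = vectorizer.fit_transform(train_docs)

# min_samples_leaf plays the role of the minimum record count per leaf.
clf = DecisionTreeClassifier(min_samples_leaf=1).fit(X, train_labels)
print(clf.predict(vectorizer.transform(["loved it, great plot"])))
```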
Jotheeswaran and Kumaraswamy [45] extracted features from IMDb movie reviews
using inverse document frequency and the importance of the terms found. Feature
selection was based on the relevance of a term to the whole document, using
principal component analysis and CART. LVQ was able to achieve a classification
accuracy of 75%.
Kaur and Vashisht [46] propose using data analysis methods to investigate
emotional variation in teenagers and the causes of these changes. Different
emotional variations are evaluated by classifying feelings with a decision tree;
decision trees can also generate if-then rules. Outlier analysis is used to classify
emotion variation in children with disabilities of some type.

5.1.2 Linear Classifier

(i) Support vector machine


For polarity analysis and aspect classification of product reviews, [47] uses
ML combined with domain-specific lexicons. After being trained to model aspect
classification, the SVM is used for polarity classification per aspect. According to
the findings of the experiments, the proposed methods have a 78% accuracy rate.
A complementary feature selection approach and an emotion trigger extraction
subsystem are applied to web-based data, and the performance of these features is
combined. In the training phase, web posts with uncertain emotions are fed into
SVR and SVM classification models, with the output revealing the emotion
category [48].
The authors of [49] looked at different approaches for classifying a piece of
natural language text based on the views reflected in it, such as whether the
general feeling is positive or negative. They used domain-specific lexicons and
machine learning to introduce a suite of polarity recognition and aspect classification
techniques for product reviews.
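
The following is a minimal sketch of SVM polarity classification over TF-IDF features, assuming scikit-learn and illustrative data; the per-aspect setup of [47] would train one such model for each extracted aspect.

```python
# A sketch of SVM polarity classification with TF-IDF n-gram features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

docs = ["battery life is excellent", "battery drains too fast",
        "screen looks beautiful", "screen is dim and dull"]
labels = ["positive", "negative", "positive", "negative"]

# Linear SVM: sparse text features are often close to linearly separable.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(docs, labels)
print(model.predict(["the battery is excellent"]))
```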

(ii) Neural network


This contribution [50] addresses the issue of successfully assessing consumer senti-
ment towards businesses in the blogosphere. A NN-based approach is developed
that combines the benefits of machine learning approaches and a semantic orienta-
tion index to classify sentiments efficiently and rapidly. Semantic orientation indexes
serve as the neural network's inputs. The backpropagation NN (BPN) is chosen as
the base learner of the proposed system because of its fault-tolerance performance.
Data was gathered from real-world blogs such as "Review Center" and "LiveJournal".
Word segmentation, computation of SO indexes, neural network training and testing,
and performance assessment were all completed. Datasets are divided into training
and evaluation sets. The F1 measure was used in conjunction with overall accuracy
(OA) as the output assessment metrics. In comparison to conventional IR and ML
methods, the results showed that the proposed approach improved classification
accuracy and reduced training time.

5.1.3 Rule-Based Classifier

Gao et al. [51] suggest a rule-based method for detecting emotion trigger compo-
nents in Chinese microblogs. It introduces an emotion model with fine-grained
emotions and extracts the related trigger components. The emotion lexicon can be
constructed from the corpus both manually and automatically. Meanwhile, using
Bayesian probability, the proportions of trigger components can be estimated under
the control of multi-language features. The experimental findings prove that the
method is feasible.

5.1.4 Probabilistic Classifier

(i) Naïve Bayes


The paper [52] introduces a sentiment analysis approach based on consumer
reviews of movies. A naïve Bayes algorithm is used to classify feedback into
negative and positive categories. The authors used a list of sentences taken from
movie reviews, pre-classified as negative or positive, as training data. They excluded
meaningless terms and added grouped classes of words (n-grams) to boost classifi-
cation, observing a significant improvement for the n = 2 classes.
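
A minimal sketch of such a naïve Bayes review classifier, assuming scikit-learn and illustrative data, is given below; the n-gram range includes bigrams (n = 2), echoing the improvement reported above.

```python
# A sketch of a naive Bayes movie-review classifier with n-gram features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

reviews = ["a wonderful film", "not wonderful at all",
           "truly great acting", "a waste of time"]
labels = ["pos", "neg", "pos", "neg"]

# Unigrams + bigrams (n = 2) so that word groups such as "not wonderful"
# become features of their own.
model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
model.fit(reviews, labels)
print(model.predict(["truly wonderful film"]))
```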
Pang and Lee [53] compared the efficiency of support vector machines,
naïve Bayes and maximum entropy for SA on various feature sets, such as
considering only bigrams, only unigrams, a mixture of both, adding parts of speech
and position detail, taking only adjectives and so on. The outcome is summarised
in Table 1.

The results show that:

i. The presence of a feature is more critical than the frequency at which it
appears.
ii. When bigrams are used, the precision actually decreases.
iii. Accuracy increases when all commonly occurring words from all parts of speech,
not just adjectives, are used.
iv. Including position data improves precision.
v. When the feature space is restricted, naïve Bayes outperforms SVM; when the
feature space is expanded, however, SVMs perform better.
Bikel et al. developed a voted perceptron based on a subsequence kernel and
compared its output to that of standard support vector machines. In subsequence-
kernel-based voted perceptrons, the number of false positives rises much more
slowly as the number of true positives increases, relative to bag-of-words-based
SVMs, where false positives grow more linearly with true positives. Despite being
trained only on extreme one- and five-star reviews, their model produced an excellent
spectrum over reviews with intermediate star ratings. "It is unusual that we see such
behaviour correlated with lexical features that are usually known as isolated and
combinatorial," the authors write. Finally, we note that the voted perceptron is capable
of distinguishing between things that humans find difficult to distinguish.
SA is performed on unstructured feedback in [54]. The authors used the naïve Bayes
classifier to classify the most clearly defined features in the study and then calculated
their negative, positive and neutral polarity distribution.
(ii) Bayesian Network
The key assumption of the naïve Bayes classifier is that the features are
independent. The other extreme presumption is that all features are fully
dependent. This results in the Bayesian Network model, a directed acyclic
graph with edges representing conditional dependencies and nodes representing
random variables. For the variables and their interactions, BN is

Table 1 Accuracy comparison of different classifiers for sentiment analysis on the movie review dataset

Features              No. of features  Frequency or presence?  SVM   NB    ME
Unigrams + position   22,430           Presence                81.6  81.0  80.1
Top 2633 unigrams     2,633            Presence                81.4  80.3  81.0
Adjectives            2,633            Presence                75.1  77.0  77.7
Unigrams + POS        16,695           Presence                81.9  81.5  80.4
Bigrams               16,165           Presence                77.1  77.3  77.4
Unigrams + bigrams    32,330           Presence                82.7  80.6  80.8
Unigrams              16,165           Presence                82.9  81.0  80.4
Unigrams              16,165           Frequency               72.8  78.7  N/A

called a complete model. Since the computational complexity of BN is very high,
it is not widely used in text mining.
To increase the efficiency of the polarity classification task and to reduce the
noise sensitivity associated with language ambiguity, [55] introduces a novel
ensemble method for sentiment classification. The proposed Bayesian ensemble
learning method uses multiple classifiers to predict the sentiment orientation of
user-generated content. The research yields an efficient and accurate polarity
identification model that can be used both in social media contexts (Twitter) and
on consumer feedback.
(iii) Maximum Entropy
In [46], a novel approach is used to gather tweets from different learners.
This dataset is pre-processed for sentiment analysis, which entails a number of
intermediary operations to eliminate complexity. Naïve Bayes, SVM and ME
classifiers are then applied to the pre-processed dataset to classify users'
emotional states, and the results are very successful.
Maximum Entropy (MaxEnt) models are feature-based and do not assume
independence, according to [56–58]. We can therefore supply overlapping
features such as unigrams and bigrams to MaxEnt without feature duplication
becoming a problem. The fascinating concept behind Maximum Entropy models,
as described by [59], is that the most uniform model satisfying a given set of
constraints should be preferred. Any probability distribution can be estimated
using these feature-based models; in a two-class situation, this is equivalent to
using logistic regression to find a distribution over the classes [60].
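
Since the two-class MaxEnt model coincides with logistic regression, the following is a minimal sketch using scikit-learn with overlapping unigram and bigram features; the data is illustrative.

```python
# A sketch of a MaxEnt (logistic regression) sentiment classifier with
# overlapping unigram + bigram features, which MaxEnt handles because it
# does not assume feature independence.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["I love this phone", "I hate this phone",
         "what a lovely day", "what a horrible day"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

model = make_pipeline(CountVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(texts, labels)
print(model.predict_proba(["lovely phone"]))  # distribution over classes
```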
People are more social in the twenty-first century, thanks to online shopping,
social media and the internet, among other things. As a result, whether expressed
directly or indirectly, viewpoints and online judgements are attracting a lot of
attention, and opinion mining and review analysis have become the real focus.
Some of the latest sentiment analysis solutions are listed below; Table 2 also includes
a concise summary of these approaches.
Popescu and Etzioni [61] suggested OPINE, an unsupervised, web-based knowl-
edge extraction method that collects product features and opinions from feedback.
It identifies a product feature and a user's opinion about that feature, determines
the polarity of those opinions and then rates the product accordingly. Ratings and
nouns are collected from the dataset for feature recognition; frequencies that exceed
the threshold frequency are kept, otherwise they are discarded. The feature assessor
in OPINE is used to extract explicit features (frequently occurring features) [4]. To
extract information, the researchers used manual extraction rules [61]. The domain
independence of OPINE is a benefit. However, since the OPINE system is not widely
available, it has yet to see practical applications.
Adverbs and adjectives are superior to adjectives alone: Benamara et al. [62]
suggested a linguistic approach to SA at the document level in 2006. The degrees
of adverbs (measured by linguistic classifiers) and adverb–adjective combinations
(handled using scoring methods) were the starting points for this study.

Table 2 Comparison table of existing techniques

Technique | Year of proposal | Method | Text level | Prediction accuracy | Benefits | Drawbacks
Sentiment analysis for product review rating and a joint model of feature mining | 2011 | ML | Document | 71% (three categories); 46.9% (five categories) | Automatic calculation of feature vector | Use of WordNet
Interdependent latent Dirichlet allocation | 2011 | Probabilistic graphical model | Document | 73% | Faster in comparing sentiment and rating | Correlation between identified clusters and feature or rating are not always explicit
Sentiment classification based on the lexical meaning of a sentence | 2011 | Rule-based approach | Sentence | 86% | Said to be domain independent | Depends solely on WordNet
Opinion digger | 2010 | Unsupervised ML methods | Sentence | 51% | Rates product at aspect level | Guidelines to rate work only on known data
Sentiment analysis: adverbs and adjectives are better than adjectives alone | 2006 | Linguistic approach | Document | Pearson correlation of more than 0.47 | Adjectives are given more priority (adjectives express human sentiments better than adverbs alone) | None
OPINE | 2005 | Unsupervised rule-based approach | Word | 87% | Domain independent | Difficulty in availing the OPINE system, thus rarely applied in real life

The scoring methods used herein are Adverb First Scoring, Adjective Priority Scoring
and Variable Scoring [62].
The different algorithms used by researchers are examined, as well as data patterns.
Studies using machine learning methods from the last decade were examined for this
analysis. The papers investigated were classified into various groups based on the
language, data sets and algorithms used. Table 3 summarises and presents the 34
studies examined.

6 Conclusion

Sentiment analysis has grown in popularity as an area of study and has become a
very important method in the business world. As a result, there are a variety of
approaches to analysing consumer habits and choices using sentiment analysers.
The amount of data on social media has increased in recent years and sentiment
classification has become increasingly significant. This paper gives a summary of
recent developments in sentiment analysis algorithms as well as their implementa-
tions. After reviewing this paper, one can easily see that sentiment classification and
sentiment interpretation are still open for research.
In this paper, different machine learning techniques and algorithms are examined
and their success on various data sets is explored based on previous studies in a
literature survey. This paper includes a detailed survey of the datasets used in
numerous machine learning research projects. Supervised learning, unsupervised
learning and semi-supervised learning methods are all used in sentiment analysis. We
analysed and identified several machine learning strategies for sentiment analysis in
this research.

7 Future Scope

As part of future work, more functionality can be introduced to increase the
precision of forecasting subjectivity, opinions and feelings. Research will be
undertaken in the future to determine which architectures can be developed and
applied so that machine learning techniques reach an accuracy comparable to human
accuracy, which can contribute to further advancement in this field.

Table 3 SA techniques and tasks with ML methods

Ref. No. | Year | Task | Algorithms used | Data scope | Dataset/source
[63] | 2013 | Sentiment classification | ME, NB, SVM | Political columns | News sites
[64] | 2013 | Sentiment classification | KNN, NB, DT, RF | Company reviews | Twitter
[12] | 2013 | Sentiment classification | Unsupervised, SVM, NB, DT | Movie reviews | MCE, MC
[65] | 2013 | Sentiment classification | ANN, NB, SVM | Book, movie, GPS, camera reviews | Amazon.com
[66] | 2013 | Sentiment classification | KNN, SVM, DT | Review datasets | Review sites
[67] | 2013 | Sentiment classification | SVM, NB | Tweets, movie reviews | Twitter
[68] | 2013 | Sentiment classification | SVM | Social network data | –
[69] | 2014 | Sentiment classification | NB, SVM | Movie reviews | beyazperde.com
[70] | 2014 | Sentiment classification | Unsupervised, NB, DT, KNN | Social network data | Twitter
[71] | 2014 | Sentiment classification | RF, NB, SVM | Social network data | Twitter
[72] | 2015 | Sentiment classification | SVM, NB | Movie reviews | –
[73] | 2015 | Sentiment classification | SVM, ANN | Product reviews | Amazon.com
[74] | 2015 | Sentiment classification | SVM | Movie reviews | –
[75] | 2015 | Sentiment classification | NB, DT | News articles | –
[76] | 2015 | Sentiment classification | RF, SVM, NB, DT | Product reviews | Jingdong, Dangdang
[77] | 2014 | Review usefulness measurement | ANN | Product reviews | Amazon.com
[78] | 2011 | Review usefulness measurement | SVM | MP3 and digital camera reviews | N/A
[79] | 2011 | Review usefulness measurement | RF, SVM | Product reviews | Amazon.com
[80] | 2014 | Opinion spam detection | LR, SVM | Review data | Amazon.com
[81] | 2013 | Opinion spam detection | NB, SVM | Reviews | TripAdvisor.com
[82] | 2014 | Opinion spam detection | RF, SVM | Review data | Apondator

References

1. Liu B (2010) Sentiment analysis and subjectivity. In: Indurkhya N, Damerau FJ (eds) Handbook
of natural language processing, 2nd edn. Taylor and Francis
2. Liu B (2009) Sentiment analysis and opinion mining. In: 5th Text analytics summit, Boston,
June 1–2, 2009
3. Singh J, Singh G, Singh R (2016) A review of sentiment analysis techniques for opinionated
web text. CSI Trans ICT
4. Aydogan E, Akcayol MA (2016) A comprehensive survey for sentiment analysis tasks using
machine learning techniques. In: International Symposium on INnovations in Intelligent
SysTems and Applications
5. Medhat W, Hassan A, Korashy H (2014) Sentiment analysis algorithms and applications: a
survey. Ain Shams Eng J 5(4):1093–1113
6. Ravi K, Ravi V (2015) A survey on opinion mining and sentiment analysis: tasks, approaches
and applications. Knowl-Based Syst 89:14–46
7. Rushdi Saleh M, Martín-Valdivia MT, Montejo-Ráez A, Ureña-López LA (2011) Experiments
with SVM to classify opinions in different domains. Exp Syst Appl 38(12):14799–14804
8. Xu T, Qinke P, Yinzhao C (2012) Identifying the semantic orientation of terms using S-HAL
for sentiment analysis. Knowl-Based Syst 35:279–289
9. Yu LC, Wu J, Chang PC, Chu HS (2013) Using a contextual entropy model to expand emotion
words and their intensity for the sentiment classification of stock market news. Knowl-Based
Syst 41:89–97
10. Hagenau M, Liebmann M, Neumann D (2013) Automated news reading: stock price prediction
based on financial news using context-capturing features. Decis Supp Syst 55(3):685–697
11. Isa M, Piek V (2012) A lexicon model for deep sentiment analysis and opinion mining
applications. Decis Support Syst 53:680–688
12. Martín-Valdivia MT, Martínez-Cámara E, Perea-Ortega JM, Ureña-López LA (2013) Sentiment
polarity detection in Spanish reviews combining supervised and unsupervised approaches. Exp
Syst Appl 40(10):3934–3942
13. Ortigosa-Hernández J, Rodríguez JD, Alzate L, Lucania M, Inza I, Lozano JA (2012)
Approaching sentiment analysis by using semi-supervised learning of multi-dimensional
classifiers. Neurocomputing 92:98–115
14. Liu B (2012) Sentiment analysis and opinion mining. Synth Lect Hum Lang Tech 5(1):1–167
15. Pang B, Lee L, Vaithyanathan S (2002) Thumbs up? Sentiment classification using machine
learning techniques. In: Proceedings of the conference on empirical methods in natural language
processing (EMNLP), pp 79–86
16. Pang B, Lee L (2005) Seeing stars: exploiting class relationships for sentiment categorization
with respect to rating scales. In: Proceedings of the association for computational linguistics
(ACL), pp 115–124
17. Turney P (2002) Thumbs up or thumbs down? Semantic orientation applied to unsupervised
classification of reviews. In: Proceedings of the association for computational linguistics (ACL),
pp 417–424
18. Dave K, Lawrence S, Pennock DM (2003) Mining the peanut gallery: opinion extraction
and semantic classification of product reviews. In: Proceedings of of the 12th international
conference on World Wide Web (WWW), pp 519–528
19. Yu H, Hatzivassiloglou V (2003) Towards answering opinion questions: separating facts from
opinions and identifying the polarity of opinion sentences. In: Proceedings of the Conference
on Empirical Methods in Natural Language Processing (EMNLP)

20. Liu B, Hu M, Cheng J (2005) Opinion observer: analyzing and comparing opinions on the web.
In: Proceedings of the 14th international World Wide web conference (WWW-2005). ACM
Press, pp 10–14
21. Kim S, Hovy E (2004) Determining the sentiment of opinions. In: Proceedings of the
international conference on computational linguistics (COLING)
22. Kamps J, Marx M, Mokken RJ, de Rijke M (2004) Using WordNet to measure semantic
orientation of adjectives. In: Language resources and evaluation (LREC)
23. Hatzivassiloglou V, McKeown K (1997) Predicting the semantic orientation of adjectives. In:
Proceedings of the joint ACL/EACL conference, pp 174–181
24. Esuli A, Sebastiani, F (2005) Determining the semantic orientation of terms through gloss
classification. In: Proceedings of CIKM-05, the ACM SIGIR conference on information and
knowledge management, Bremen, DE
25. Day M, Lee C (2016) Deep learning for financial sentiment analysis on finance news providers,
pp 1127–1134
26. Vateekul and Koomsubha (2016) A study of sentiment analysis using deep learning techniques
on Thai Twitter data
27. Zhou S, Chen Q, Wang X (2013) Active deep learning method for semi-supervised sentiment
classification. Neurocomputing 120:536–546
28. Deng L, Hinton G, Kingsbury B (2013) New types of deep neural network learning for speech
recognition and related applications: an overview. In: IEEE Int Conf Acoust Speech Signal
Process, pp 8599–8603
29. Bengio S, Deng L, Larochelle H, Lee H, Salakhutdinov R (2013) Guest editors' introduc-
tion: special section on learning deep architectures. IEEE Trans Pattern Anal Mach Intell
35(8):1795–1797
30. Arnold L, Rebecchi S, Chevallier S, Paugam-Moisy H (2011) An introduction to deep learning.
In: ESANN, p 12
31. Ouyang X, Zhou P, Li CH, Liu L (2015) Sentiment analysis using convolutional neural network.
In: 2015 IEEE Int Conf Computer and Information Technology; Ubiquitous Computing and
Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and
Computing (CIT/IUCC/DASC/PICOM), pp 2359–2364
32. Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in
vector space. arXiv, pp 1–12
33. Kalchbrenner N, Grefenstette E, Blunsom P (2014) A convolutional neural network for
modelling sentences. In: ACL, pp 655–665
34. Kim Y (2014) Convolutional neural networks for sentence classification. In: Proceedings of the
2014 conference on empirical methods in natural language processing (EMNLP 2014),
pp 1746–1751
35. Mikolov T, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases
and their compositionality. In: NIPS, pp 1–9
36. Wu Z, Virtanen T, Kinnunen T, Chng ES, Li H (2013) Exemplar-based unit selection for voice
conversion utilizing temporal information. In: Proc Annu Conf Int Speech Commun Assoc
INTERSPEECH, pp 3057–3061
37. Tai KS, Socher R, Manning CD (2015) Improved semantic representations from tree-structured
long short-term memory networks. In: Proc ACL, pp 1556–1566
38. Piryani R, Madhavi D, Singh VK (2017) Analytical mapping of opinion mining and senti-
ment analysis research. Inf Process Manage 53(1):122–150. https://doi.org/10.1016/j.ipm.
2016.07.001
39. Hussein DMEDM (2016) A survey on sentiment analysis challenges. J King Saud Univ Eng
Sci 34(4). https://doi.org/10.1016/j.jksues.2016.04.002
40. Devika MD, Sunitha C, Ganesh A (2016) Sentiment analysis: a comparative study on different
approaches. Procedia Computer Science 87:44–49
41. Kharde VA, Sonawane SS (2016) Sentiment analysis of twitter data: a survey of techniques.
Int J Comput Appl 139(11):975–8887
42. Rajput Q, Haider S, Ghani S (2016) Lexicon-based sentiment analysis of teachers' evaluation.
Appl Comput Intell Soft Comput 2016

43. Pradhan VM, Vala J, Balani P (2016) A survey on sentiment analysis algorithms for opinion
mining. Int J Comp Appl 133(9):7–11
44. Wang Z, Cui X, Gao L, Yin Q, Ke L, Zhang S (2016) A hybrid model of sentimental entity
recognition on mobile social media. EURASIP J Wirel Commun Netw 2016(1):253. https://
doi.org/10.1186/s13638-016-0745-7
45. Jotheeswaran J, Kumaraswamy YS (2013) Opinion mining using decision tree based feature
selection through Manhattan hierarchical cluster measure. J Theor Appl Inform Technol
58(1):72–80
46. Kaur J, Vashisht S (2012) Analysis and identifying variation in human emotion through data
mining. Int J Comp Technol Appl 133(9):121–126
47. Bhadane C, Dalal H, Doshi H (2015) Sentiment analysis: measuring opinions, Science Direct
48. Li W, Xu H (2013) Text-based emotion classification using emotion cause extraction, Elsevier
49. Bhadane C, Dalal H, Doshi H (2015) Sentiment analysis: measuring opinions. International
conference on advanced computing technologies and applications (ICACTA2015). Procedia
Computer Science 45:808–814
50. dos Santos CN, Gatti M (2014) Deep convolutional neural networks for sentiment analysis of
short texts. In: COLING 2014, pp 69–78
51. Gao K, Xu H, Wanga J (2015) A rule-based approach to emotion cause detection for Chinese
micro-blogs, Elsevier
52. Smeureanu I, Bucur C (2012) Applying supervised opinion mining techniques on online user
reviews. Informatica Economică 16(2):81–91
53. Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr
2(1–2):1–135
54. Nithya R, Maheswari D (2014) Sentiment analysis on unstructured review. In: International
Conference in Intelligent Computing Applications (ICICA), pp 367–371
55. Fersini E, Messina E, Pozzi FA (2014) Sentiment analysis: Bayesian ensemble learning. Dec
Supp Syst 68:26–38
56. Cheong M, Lee VCS (2011) A microblogging-based approach to terrorism informatics: explo-
ration and chronicling civilian sentiment and response to terrorism events via Twitter. Inf Syst
Front 13(1):45–59
57. Go A, Bhayani R, Huang L (2009) Twitter sentiment classification using distant supervision.
Processing 150(12):1–6
58. Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr 2(1–
2):1–135
59. Nigam K, Lafferty J, McCallum A (1999) Using maximum entropy for text classification. In:
IJCAI-99 workshop on machine learning for information filtering, vol 1, pp 61–67
60. Vinodhini G, Chandrasekaram RM (2012) Sentiment analysis and opinion mining: a survey.
Int J Adv Res Comp Sci Softw Eng 2(6):28–35
61. Popescu AM, Etzioni O (2005) Extracting product features and opinions from reviews. In:
Proceedings of international conference on human language technology and empirical methods
in natural language processing, pp 339–346
62. Benamara F, Cesarano C, Reforgiato D (2006) Sentiment analysis: Adjectives and Adverbs
are better than Adjectives Alone. In: Proceedings of international conference on Weblogs and
social media, pp 1–7
63. Kaya M (2013) Sentiment analysis of Turkish political columns with transfer learning. Middle
East Technical University, Diss
64. Çetin M, Amasyali MF (2013) Active learning for Turkish sentiment analysis. In: IEEE
international symposium on innovations in intelligent systems and applications (INISTA), pp
1–4
65. Moraes R, Valiati JF, Gavião Neto WP (2013) Document-level sentiment classification: an
empirical comparison between SVM and ANN. Exp Sys Appl 40(2):621–633
66. Seker SE, Al-Naami K (2013) Sentimental analysis on Turkish blogs via ensemble classifier.
In: Proceedings the international conference on data mining

67. Rui H, Liu Y, Whinston A (2013) Whose and what chatter matters? The effect of tweets on
movie sales. Decis Support Syst 55(4):863–870
68. Cârdei C, Manior F, Rebedea T (2013) Opinion mining for social media and news items in
Romanian. In: 2nd international conference on systems and computer science (ICSCS), pp
240–245
69. Akba F, Uçan A, Sezer E, Sever H (2014) Assessment of feature selection metrics for sentiment
analyses: Turkish movie reviews. In: 8th European conference on data mining, vol 191, pp
180–184
70. Nizam H, Akın SS (2014) Sosyal medyada makine öğrenmesi ile duygu analizinde dengeli
ve dengesiz veri setlerinin performanslarının karşılaştırılması. XIX. Türkiye’de İnternet
Konferansı, pp 1–6
71. Meral M. Diri B (2014) Sentiment analysis on Twitter. In: Signal processing and communica-
tions applications conference (SIU), pp 690–693
72. Tripathy A, Agrawal A, Rath SK (2015) Classification of sentimental reviews using machine
learning techniques. Procedia Computer Science 57:821–829
73. Vinodhini G, Chandrasekaran RM (2015) A comparative performance evaluation of neural
network based approach for sentiment classification of online reviews. J King Saud University
- Comp Inform Sci 28(1):2–12
74. Shahana PH, Omman B (2015) Evaluation of features on sentimental analysis. Procedia
Computer Science 46:1585–1592
75. Florian B, Schultze F, Strauch L (2015) Semantic search: sentiment analysis with machine
learning algorithms on German news articles
76. Tian, F, Wu F, Chao KM, Zheng Q, Shah N, Lan T, Yue J (2015) A topic sentence-based
instance transfer method for imbalanced sentiment classification of Chinese product reviews.
Elect Comm Res Appl
77. Lee S, Choeh JY (2014) Predicting the helpfulness of online reviews using multilayer
perceptron neural networks. Expert Syst Appl 41(6):3041–3046
78. Chen CC, De Tseng Y (2011) Quality evaluation of product reviews using an information
quality framework. Decis Support Syst 50(4):755–768
79. Ghose A, Ipeirotis PG (2011) Estimating the helpfulness and economic impact of product
reviews: mining text and reviewer characteristics. IEEE Trans Knowl Data Eng 23(10):1498–
1512
80. Lin Y, Zhu T, Wu H, Zhang J, Wang X, Zhou A (2014) Towards online anti-opinion spam:
spotting fake reviews from the review sequence. In: Proceedings of the 2014 IEEE/ACM
international conference on advances in social networks analysis and mining (ASONAM),
pp 261–264
81. Ott M, Cardie C, Hancock JT (2013) Negative deceptive opinion spam. In: HLT-NAACL, pp
497–501
82. Costa H, Merschmann LHC, Barth F, Benevenuto F (2014) Pollution, badmouthing, and local
marketing: the underground of location-based social networks. Inf Sci 279:123–137
Design of Novel Compact UWB
Monopole Antenna on a Low-Cost
Substrate

P. Dalal and S. K. Dhull

Abstract This paper proposes and analyzes a novel and compact ultra-wideband
(UWB) monopole antenna. The antenna is designed on a 1.6 mm thick, low-cost
flame retardant-4 (FR-4) substrate. The overall footprint of the antenna is 20 ×
14 mm². The Ansys High Frequency Structure Simulator is employed to perform the
simulations. The impedance bandwidth of the proposed antenna with S11 < −10 dB
extends from 3.1 to 11.3 GHz, a fractional bandwidth of 113%. Over the whole UWB
bandwidth, the average gain of the antenna is approximately 3 dB, and the average
radiation efficiency is above 95%. The far-field radiation pattern of the antenna is
dumbbell-shaped in the E-plane and omnidirectional in the H-plane.

Keywords Monopole antenna · Ultra-wideband (UWB) · Low cost · Compact

1 Introduction

The frequency band from 3.1 to 10.6 GHz was authorized by the Federal Communi-
cation Commission (FCC) in the year 2002 for unlicensed ultra-wideband (UWB)
communications [1]. UWB technology aims to provide short-range, high-speed,
peer-to-peer indoor wireless data communication. The antennas placed at the
transmitting and receiving sites play an important role in the successful deployment
of this technology. But the technology demands antennas operating over a very large
bandwidth of 7.5 GHz, which poses a significant challenge to antenna designers [2].
These distinctive challenges have motivated many researchers and antenna designers
to develop antennas for this technology. Other requirements of UWB antennas
include low transmission power, low profile, compact size, and ease of integration
with Monolithic Microwave Integrated Circuits (MMIC). To satisfy such require-
ments, many researchers have proposed printed monopole antennas (PMA) as a
suitable candidate for this technology. This is because a PMA offers a

P. Dalal (B) · S. K. Dhull


Guru Jambheshwar University of Science and Technology, Hisar 125001, India


lot of advantages, such as wide bandwidth, small size, low profile, ease of fabrica-
tion, low cost, omnidirectional radiation characteristics, and easy integration with
MMIC. Over the last few years, different shapes of PMA designs have been proposed
for UWB applications [3–12]. These designs include a circular disc monopole
[3], a triangular monopole [4], a CPW-fed tapered ring slot monopole [5], a rectangular
monopole with tapered feed [6], and many more. These monopole antennas have
wide impedance bandwidth, are easy to fabricate, and have acceptable radiation
performance. But with the futuristic trend of miniaturization in communication
systems, the demand for compact antennas is increasing day by day.
In this paper, a novel UWB monopole antenna on a low-cost FR-4 substrate is
proposed. The antenna is compact, with an overall footprint of 20 × 14 mm². The
three-dimensional electromagnetic simulator Ansys High Frequency Structure
Simulator (HFSS) is employed for the design simulation. Different characteristics
of the proposed antenna are investigated throughout the UWB frequency range, such
as return loss (S11), real and imaginary impedance, gain, and radiation efficiency.
E-plane and H-plane radiation patterns of the proposed antenna are studied at two
dissimilar frequencies, 4 and 8 GHz.

2 Antenna Design and Its Geometrical Parameters

In this section, the design of the proposed monopole antenna is discussed. The geom-
etry of the proposed antenna is illustrated in Fig. 1. A low-cost FR-4 epoxy laminate
with height H = 1.6 mm and relative permittivity εr = 4.4 is chosen as the substrate
for the antenna. From the geometry, it is observed that the antenna is composed of
a rectangular monopole of size L1 × W1. The rectangular monopole is then tapered
off until it connects to the 50 Ω feedline of the antenna. The tapering helps in
achieving good impedance matching of the monopole with the feedline over a wider
range of frequencies, and hence a wider bandwidth of the antenna. The partial ground
plane of the monopole increases the bandwidth further. The geometry parameters of
the antenna are as follows: L1 = 7.5 mm, L2 = 5.5 mm, Lf = 6 mm, Lg = 3 mm,
W1 = 14 mm, Wf = 2 mm, g = 1 mm.

3 Results and Discussion

The different simulated characteristics of the UWB monopole antenna proposed in
the previous section are discussed here. For the study, finite element method (FEM)
based simulations are performed using Ansys HFSS.

Fig. 1 Geometry of proposed UWB monopole antenna a top view, b back view, c side view

3.1 Return Loss

The return loss of an antenna is a measure of the power reflected back to the source
by the antenna. An antenna's return loss should be maintained below −10 dB for
acceptable performance. This limiting value ensures that at least 90% of the input
power is radiated by the antenna and at most 10% of the input power is reflected
back to the source. The simulated return loss of the proposed antenna is presented
in Fig. 2. From the figure,

Fig. 2 The plot of return loss (S11) against frequency for the proposed antenna


Fig. 3 The plot of gain against frequency for the proposed antenna

it is observed that the −10 dB impedance bandwidth of the antenna is from 3.1 to
11.3 GHz, with a fractional bandwidth of about 113%. Thus, the proposed antenna
covers the frequency band allocated by the FCC for UWB applications.
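
These two figures can be checked directly from the standard definitions: the −10 dB limit corresponds to a 10% reflected-power fraction, and the fractional bandwidth follows from the band edges fL = 3.1 GHz and fH = 11.3 GHz (a worked check, not part of the original analysis):

$$|\Gamma|^2 = 10^{S_{11}/10} = 10^{-10/10} = 0.1 \;\Rightarrow\; 10\%\ \text{reflected},\ 90\%\ \text{radiated}$$

$$\mathrm{FBW} = \frac{2\,(f_H - f_L)}{f_H + f_L}\times 100\% = \frac{2\,(11.3 - 3.1)}{11.3 + 3.1}\times 100\% \approx 113.9\%$$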

3.2 Gain

The graph of gain versus frequency for the proposed UWB monopole antenna is
presented in Fig. 3. From the figure, it is observed that the gain of the proposed
antenna is acceptable throughout the whole UWB band. The peak value of the gain
is 3.2 dB around 9.5 GHz.

3.3 Real and Imaginary Impedance

The plot of the real and imaginary impedance of the proposed monopole antenna
against frequency is presented in Fig. 4. For perfect impedance matching of the
antenna with the source at the resonance frequency, the real impedance must be
50 Ω and the imaginary impedance must be 0 Ω. From the figure, it is observed that
the real impedance of the proposed monopole stays around 50 Ω and the imaginary
impedance stays around 0 Ω over the whole bandwidth range of the UWB technology.


Fig. 4 The plot of real and imaginary impedance against frequency for the proposed antenna


Fig. 5 The plot of radiation efficiency against frequency for the proposed antenna

3.4 Radiation Efficiency

The plot of radiation efficiency versus frequency for the proposed monopole antenna
is presented in Fig. 5. From the figure, it is observed that the value of radiation
efficiency is above 90% for the whole UWB band. This value is considered good for
the acceptable performance of the monopole antenna.

3.5 Radiation Pattern

The radiation pattern of the proposed monopole antenna is studied at two different
frequencies, 4 and 8 GHz. This is done to verify the antenna's radiation
performance at both the lower and the upper frequencies of the UWB band. The E-
plane and the H-plane radiation characteristics of the suggested monopole antenna

Fig. 6 The radiation pattern of the suggested antenna at 4 GHz a E-plane, b H-plane

at 4 GHz are presented in Fig. 6a and b, respectively. From the figure, it is noticed
that the E-plane radiation pattern is dumbbell-shaped, whereas the H-plane radiation
pattern is omnidirectional. Figure 7a and b respectively present the radiation pattern
of the suggested monopole antenna in the E-plane and the H-plane at 8 GHz. It is
noticed from the figure that the E-plane radiation pattern is slightly tilted at 8 GHz
compared to the E-plane pattern at 4 GHz, and the H-plane radiation pattern is
slightly directional at 8 GHz compared to the H-plane pattern at 4 GHz. Neverthe-
less, it can be concluded that the proposed antenna maintains its radiation perfor-
mance throughout the UWB band (Fig. 7).

3.6 Comparison Table

Table 1 compares the monopole antenna suggested in this paper with other UWB
PMAs existing in the literature. From the comparison table, it can be concluded
that, compared to the PMAs proposed in the cited references, the suggested
monopole antenna is compact and covers the entire UWB band.

Fig. 7 The radiation pattern of the suggested antenna at 8 GHz a E-plane, b H-plane

Table 1 Comparison of the proposed monopole antenna with antennas existing in the literature

Ref. No. | Substrate, dielectric constant, substrate height (mm) | Frequency band (GHz) | Size (mm²)
[4] | FR-4, 4.7, 1 | 4–10 | 60 × 20
[5] | Rogers RO4003, 3.38, 0.762 | 3.1–12 | 66.1 × 44
[6] | Teflon, 2.65, 0.8 | 2.75–16.2 | 30 × 8
[7] | FR-4, 4.4, 1.6 | 2.6–13.04 | 25 × 25
[8] | FR-4, 4.4, 1.6 | 3.1–10.6 | 27 × 30.5
[9] | FR-4, 4.4, 1.5 | 2.9–19.2 | 40 × 40
[10] | FR-4, 4.4, 1.6 | 2.4–11.2 | 30 × 30
[11] | FR-4, 4.4, 1.6 | 3.1–10.6 | 20 × 33
[12] | Rogers 4350, 3.66, 0.508 | 2.39–13.78 | 36 × 23
Proposed | FR-4, 4.4, 1.6 | 3.1–11.3 | 20 × 14

4 Conclusion

A compact PMA designed on a low-cost FR-4 substrate is proposed in this paper.
The proposed antenna's performance is evaluated by studying different character-
istics such as return loss, gain, radiation efficiency, and radiation pattern. Ansys
HFSS is used to perform all the simulations. From the simulation results, it is
observed that the proposed antenna maintains acceptable performance throughout
the entire UWB band.

References

1. Yang L, Giannakis GB (2004) Ultra-wideband communications: an idea whose time has come.
IEEE Signal Process Mag 21(6):26–54
2. Malik J, Patnaik A, Kartikeyan MV (2018) Compact antennas for high data rate communication,
1st edn. Springer, Cham
3. Liang J, Chiau CC, Chen X, Parini CG (2005) Study of a printed circular disc monopole
antenna for UWB systems. IEEE Trans Antennas Propagn 53(11):3500–3504
4. Lin CC, Kan YC, Kuo LC, Chuang HR (2005) A planar triangular monopole antenna for UWB
communication. IEEE Microw Wireless Compon Lett 15(10):624–626
5. Ma TG, Tseng CH (2006) An ultrawideband coplanar waveguide-fed tapered ring slot antenna.
IEEE Trans Antennas Propagn 54(4):1105–1110
6. Wu Q, Jin R, Geng J, Ding M (2008) Printed omni-directional UWB monopole antenna with
very compact size. IEEE Trans Antennas Propagn 56(3):896–899
7. Gautam AK, Yadav S, Kanaujia BK (2013) A CPW-fed compact UWB microstrip antenna.
IEEE Antennas Wireless Propag Lett 12:151–154
8. Cai Y, Yang H, Cai L (2014) Wideband monopole antenna with three band-notched
characteristics. IEEE Antennas Wireless Propag Lett 13:607–610
9. Boutejdar A, Abd Ellatif W (2016) A novel compact UWB monopole antenna with enhanced
bandwidth using triangular defected microstrip structure and stepped cut technique. Microw
Opt Technol Lett 58:1514–1519
10. Dwivedi RP, Khan Z, Kommuri UK (2020) UWB circular cross slot AMC design for radiation
improvement of UWB antenna. Int J Electron Commun (AEU) 117:1–8
11. Singh C, Kumawat G (2020) A compact rectangular ultra-wideband microstrip patch antenna
with double band notch feature at Wi-Max and WLAN. Wireless Pers Commun 114:2063–2077
12. Ma Z, Jiang Y (2020) L-shaped slot-loaded stepped-impedance microstrip structure UWB
antenna. Micromachines 11:828
Multi-Objective Genetic Algorithm
for Job Planning in Cloud Environment

Neha Dutta and Pardeep Cheema

Abstract In cloud environments, scheduling decisions must be made in the shortest
possible time because many users compete for resources and time. Optimization
measures are used when making scheduling decisions and represent the objectives
of the scheduling process. Better resource utilization, a higher number of effectively
completed jobs and minimum response time are the key constraints for optimization.
The objectives are stated through the cost of an objective function, which measures
the worth of a computed solution and compares it with other solutions; the objective
function is defined with the help of constraints and criteria. The suggested job plan-
ning methodology, centred on multi-objective genetic procedures in cloud computing
settings, uses an enhanced cost-based planning set of rules for building resourceful
mappings of jobs to available resources in cloud computing. This scheduling algo-
rithm achieves the minimization of asset cost together with good working perfor-
mance.

Keywords Standard genetic algorithm · Multi-objective genetic algorithm ·
Evolutionary algorithms · Virtual machine · Artificial Bee Colony

1 Introduction

1.1 Cloud Computing

Cloud computing offers pooled assets, data, programs and other devices to the
consumer’s necessity at particular time is an on-demand facility. It’s usually used in
the example of internet. The entire internet can be seen as a cloud. Cost and func-
tioning expenses can be changed by means of cloud computing. Cloud computing
facilities can be used from dissimilar and extensive assets, rather than faraway servers
or native technologies. Cloud computing, in general, comprises of a group of scattered
servers acknowledged as masters, providing required facilities and assets to dissim-
ilar users acknowledged as clients with scalability and trustworthiness of datacenter.

N. Dutta (B) · P. Cheema


Acet Eternal University Baru Sahib, Baru Sahib, Himachal Pradesh, India


Fig. 1 A cloud is used in the network diagrams to depict the internet [2]

Facilities may be software resources (e.g. Software as a Service, SaaS), platform
resources (e.g. Platform as a Service, PaaS) or hardware/infrastructure (e.g. Hard-
ware as a Service, HaaS, or Infrastructure as a Service, IaaS). Amazon EC2 (Amazon
Elastic Compute Cloud) is an example of a cloud computing service [1] (Fig. 1).

1.2 Task Scheduling in Cloud Computing

Task scheduling is a set of instructions that governs the order in which jobs are
executed by a server. Job planning is a very challenging assignment in cloud
computing because of its parallel and distributed design. Determining job comple-
tion time is difficult in the cloud because a job may be distributed among more than
one virtual machine. Job planning and delivery of resources are two main problems
in grid and cloud computing. Cloud computing is an evolving tool in the IT field.
The planning of cloud facilities for customers by service sources impacts the budget
and profit of these computing prototypes. Many procedures for job planning have
been given by academics. In 2008, a heuristic scheme was offered to plan bags of
tasks (independent jobs with short finishing times) in a cloud so that the number of
virtual machines needed to perform all the jobs within the budget is reduced. In 2009,
Dikaiakos et al. recognized the notion of organizing distributed internet computing
as a public utility and discussed some major difficulties and available prospects
regarding the utilization, effective operation and usage of cloud computing
set-ups [3].
In 2009, Sudha and Jayarani suggested a proficient two-level job planner (a customer-centric meta-task planner for collection of assets and a system-centric VM scheduler for posting jobs) in a cloud computing environment based on QoS [4]. In 2010, Ge and Wei offered a novel job planner that takes the planning judgment by assessing the complete collection of jobs in a job queue. A genetic process is used as the optimization technique for this planner, which delivers improved makespan and better load balancing than FIFO and delay scheduling [5]. In 2010, an optimal scheduling strategy based on linear programming, to outsource deadline-constrained jobs in a hybrid cloud setting, was projected. In 2011, Sandeep Tayal suggested a procedure built on Fuzzy-GA optimization that assesses the whole collection of jobs in a job queue on the basis of the forecast finishing time of jobs allocated to certain workstations and creates the planning judgment [6].
In 2011, Zhao et al. offered a DRR (Deadline, Reliability, Resource-aware) planning procedure, which schedules the jobs such that all the work can be completed before the deadline, guaranteeing reliability and minimization of assets [7]. In 2011, Sindhu and Mukherjee proposed two algorithms for the cloud computing environment and compared them with the default policy of the CloudSim toolkit while considering the computational complexity of jobs [8].
Consider the MapReduce cloud computing framework that Google presented. Each unit consists of an independent master task-scheduling node and slave task-assigned nodes, derived from a cluster of nodes under the jurisdiction of each master node in the cloud environment. The master node is answerable for planning all the jobs, monitoring their implementation, re-running failed tasks, and disposing of errors. The slave node is only accountable for the implementation of the jobs allocated by the master node. Upon receipt of an assignment from the master node, the slave node starts to find a suitable computing node. First, the slave node examines the rest of its own computing resources; if the remaining resources are sufficient to meet the user's computational requirement, the slave node allocates its own computing resources; if resources have been exhausted or are insufficient to meet the minimum computing requirements of the user's task, the master node begins to search for other suitable cloud computing resources.
Cloud computing uses mature virtualization technology to map resources to the virtual machine layer and implement the user's task, so the work planning of the cloud environment operates at the application layer and the virtual resources layer. Scheduling is the mapping of tasks to resources according to a certain principle of an optimal goal. Cloud computing simplifies the matching of tasks and resources: the required resources come from a virtual machine, so the procedure of examining a resource package becomes the method of examining the virtual machine. Cloud computing resource allocation is an important portion of cloud computing technology; its efficiency directly affects the performance of the whole cloud computing environment. The main purpose of planning is to program jobs onto the compliant virtual machines in agreement with adjustable time, which in fact includes discovery of the appropriate order in which jobs can be fulfilled under a contract reasonableness constraint. Here a genetic simulated annealing algorithm is selected for scheduling [9].
Task scheduling is very important in cloud computing in order to achieve the optimal solution. Task scheduling assigns the start and finish times of the dissimilar tasks, subject to certain constraints, which can be resource constraints or time constraints. Task scheduling helps to maximize resource utilization and lessen the completion time by distributing the burden across different processors. It has two types: static scheduling and dynamic scheduling. Several algorithms have been proposed for scheduling mechanisms, and the scheduling mechanism is very important to improve resource utilization.
The basic process of task scheduling is presented in Fig. 2. In cloud computing there is a queue of different tasks, and each task has a different priority. The scheduler checks the precedence of every task and allocates the tasks to different processors according to their priority; a toy sketch of this dispatch loop is given below.
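As a toy illustration of the dispatch loop just described (not the paper's algorithm), the following Python sketch pops tasks from a priority queue and assigns them to processors; the round-robin placement and all names are illustrative assumptions:

import heapq

def dispatch(tasks, processors):
    # tasks: list of (priority, task_id); a lower number means higher priority
    heapq.heapify(tasks)
    assignment, i = {}, 0
    while tasks:
        _, task = heapq.heappop(tasks)                      # highest-priority task first
        assignment[task] = processors[i % len(processors)]  # simple round-robin placement
        i += 1
    return assignment

print(dispatch([(2, "T1"), (1, "T2"), (3, "T3")], ["P1", "P2"]))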
Job planning in cloud computing refers to mapping the computing jobs to the pooled resources among dissimilar resource consumers, according to certain rules of resource use under specified cloud settings. At present, there is no uniform standard for work planning in cloud computing. Resource management and work planning are the key skills of cloud computing that play a vibrant part in effective cloud resource management.
Meta-heuristics customize the operations of supporting heuristics to produce higher-quality results expeditiously, optimizing both performance and price while accounting for the non-uniformity of virtual machines. Various heuristic and meta-heuristic techniques can be used [10]:

Fig. 2 Task scheduling [9]



• Ant Colony Optimization (ACO)
• Genetic Algorithm (GA)
• Particle Swarm Optimization (PSO)
• League Championship Algorithm (LCA)
• Simulated Annealing
• Tabu Search
• Threshold Accepting

2 Task Scheduling Using Genetic Algorithm

The genetic procedure is an adaptable methodology for the task planning problem. A genetic procedure is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful answers for optimization and search problems [11].
Genetic procedures belong to the class of evolutionary procedures, which create resolutions for optimization difficulties using techniques motivated by natural evolution, such as inheritance, mutation, selection and crossover. However, the correct representation of probable resolutions is critical to guarantee that the mutation of any couple (i.e. chromosome) will result in a new usable and expressive individual for the problem. An output plan of jobs is an arranged list of population members (named chromosomes), which encode candidate resolutions to an optimization problem and evolve toward improved results. Time minimization will give profit to the service provider and lower maintenance cost for the resources. It will also offer profit to the cloud's service customers, as their submission will be fulfilled at a reduced price [11]. The genetic procedure is centred on the biological theory of population generation. The common structure of the Standard Genetic Algorithm (SGA) is shown in Fig. 3.
In 2015 Jena suggested job planning using a multi-objective genetic algorithm to improve energy and completion time. The author compared the proposed procedure with existing job planning procedures and reported that it delivers optimally balanced outcomes for multiple goals [12]. In 2016 Nagaraju and Saritha offered a multi-objective genetic procedure. A single-point crossover prototype is employed for the generation of a fresh population, and the mutation procedure is carried out by arbitrarily altering bit locations in the chromosomes [13]. In 2017 Mahmood and Khan projected a greedy and a genetic algorithm with an adaptive choice of appropriate crossover and mutation actions (called AGA) to assign and plan real-time jobs with priority control on heterogeneous virtual machines [14]. In 2018 Srichandan et al. explored the job planning procedure using a hybrid methodology, which pools essential features of two of the most commonly used biologically-inspired heuristic procedures, the genetic process and the bacterial foraging procedure, in the cloud [15].

Fig. 3 Standard genetic algorithm (SGA) [11]: create a population of chromosomes; determine the fitness of each individual; select the next generation; perform reproduction using crossover; perform mutation; repeat for the next generation (>100 generations) and then display the results

3 Proposed Work

The proposed task scheduling approach, centred on a Multi-Objective Genetic Algorithm in a cloud environment, makes use of an enhanced cost-based planning procedure for creating a competent mapping of jobs to the available resources in the cloud. This scheduling algorithm achieves the minimization of both resource cost and computation time. The proposed work is compared with the existing improved ABC algorithm [16].
(a) Input—The required parameters, Cloudlets and Virtual Machines, are taken from the user.
(b) Output—Scheduling of Cloudlets on Virtual Machines with the minimum Cost and Makespan.
Proposed Algorithm: Multi-Objective Genetic Algorithm (MOGA)
• Generate an initial population of randomly generated individuals.
• Evaluate the fitness of all individuals.
• Build an archive with the best schedules (non-dominated sorting).
• While the end state does not occur do
• Select appropriate individuals for reproduction with the smallest completion time.
• Perform crossover between individuals by two-point crossover.
• Mutate individuals by a simple substitution operator.
• Assess the fitness of the improved individuals having appropriate fitness.
• Make a new population.
• End while

Fitness Function
A fitness function is used to measure the quality of the individuals in the population according to the given optimization objective. The fitness function can differ from case to case: in some cases it can be based on a deadline, while in others it can be based on budget constraints. In the proposed work there are two objectives:
• Objective 1: Makespan (total time to complete the schedule)
• Objective 2: Cost (total cost of the schedule to complete jobs on resources according to the makespan)
The fitness function is applied to these two objectives (Fig. 4); a condensed sketch of the resulting loop follows.
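The following Python sketch condenses the MOGA loop and the two-objective fitness above. It is illustrative only: the price table, job lengths and the final scalar pick are assumptions, not the paper's exact CloudSim configuration (the crossover and mutation probabilities follow Table 1):

import random

MIPS  = [120, 131, 153, 296, 126, 210]        # resource capacities (Table 2)
PRICE = [1.0, 1.1, 1.3, 2.5, 1.05, 1.8]       # assumed cost per second per resource
JOBS  = [random.randint(500, 2000) for _ in range(25)]   # job lengths in MI (assumed)

def objectives(sched):
    # sched[j] = index of the resource that runs job j
    busy = [0.0] * len(MIPS)
    for j, r in enumerate(sched):
        busy[r] += JOBS[j] / MIPS[r]          # execution time of job j on resource r
    makespan = max(busy)                      # Objective 1
    cost = sum(busy[r] * PRICE[r] for r in range(len(MIPS)))  # Objective 2
    return makespan, cost

def dominates(a, b):                          # Pareto dominance on (makespan, cost)
    return all(x <= y for x, y in zip(a, b)) and a != b

def moga(pop_size=5, gens=1000, p_cross=0.5, p_mut=0.015):
    pop = [[random.randrange(len(MIPS)) for _ in JOBS] for _ in range(pop_size)]
    for _ in range(gens):
        scored = [(objectives(s), s) for s in pop]
        def pick():                           # binary tournament on dominance
            (fa, a), (fb, b) = random.sample(scored, 2)
            return a if dominates(fa, fb) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            child = p1[:]
            if random.random() < p_cross:     # two-point crossover
                i, j = sorted(random.sample(range(len(JOBS)), 2))
                child = p1[:i] + p2[i:j] + p1[j:]
            if random.random() < p_mut:       # simple swap mutation
                a, b = random.sample(range(len(JOBS)), 2)
                child[a], child[b] = child[b], child[a]
            nxt.append(child)
        pop = nxt
    return min(pop, key=objectives)           # crude scalar pick for demonstration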

Fig. 4 Multi-objective genetic algorithm (MOGA): initialize the population of size N; evaluate the objective functions; rank the population; apply selection, crossover and mutation; evaluate the objective functions of the offspring; combine parent and child populations and rank them; select N individuals; if the stopping criteria are not reached, repeat, otherwise report the final population and stop



4 Results and Discussion

Here the objective is to analyze the performance of the genetic algorithm in minimizing the makespan and cost and to find the best scheduling of jobs on the resources. MOGA (Multi-Objective Genetic Algorithm) is implemented on an Intel Core i3 machine with a 500 GB HDD and 4 GB RAM on Windows 7, using Java version 1.7; the simulation work is performed using the CloudSim 3.0 toolkit in the NetBeans IDE 8.01.
A good job planning procedure is one which leads to enhanced resource use, a small average makespan and better system throughput. Makespan denotes the completion time of all cloudlets in the array. To express the problem we assume jobs (J1, J2, J3, …, Jn) run on resources (R1, R2, R3, …, Rn). Our goal is to reduce the makespan and cost. The speed of a CPU is stated in MIPS (million instructions per second) and the size of a job can be stated as the number of instructions to be performed. Each processor is assigned a varying processing power and a respective cost in Indian rupees. Our objective is to compute the makespan (completion time of tasks) and the corresponding cost of the output schedules from the proposed algorithm and compare the results with the existing improved ABC algorithm [16] (Tables 1 and 2).
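As a worked instance of these quantities (the rupee rate is an assumed figure, since the paper does not list its pricing table), a 1,200 MI job on resource R1 of Table 2 completes in 1200/120 = 10 s:

job_mi, r1_mips, rate = 1200, 120, 5.0   # rate in Rs./s is an illustrative assumption
time_s = job_mi / r1_mips                # 10.0 s on R1 (120 MIPS)
cost   = time_s * rate                   # Rs. 50.0 for this job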

Table 1 GA parameters
Parameter Value
No. of resources 6
No. of jobs 25–100
Population size 5
No. of iterations 1000
Crossover type Two-point crossover
Crossover probability 0.5
Mutation type Simple swap
Mutation probability 0.015
Selection method Tournament
Termination condition No. of iterations

Table 2 List of resources
Resource Id Processor capacity (MIPS)
R1 120
R2 131
R3 153
R4 296
R5 126
R6 210

Table 3 Best scheduling of jobs on resources at population generation = 1000
Jobs Resources
J1 R2
J2 R2
J3 R2
J4 R2
J5 R2
J6 R2
J7 R2
J8 R2
J9 R2
J10 R2
J11 R2
J12 R2
J13 R2
J14 R2
J15 R2
J16 R2
J17 R2
J18 R2
J19 R2
J20 R2
J21 R2
J22 R2
J23 R2
J24 R2
J25 R2

Output:
After obtaining the values of Cost and Makespan at the 1000th generation, we have the best schedule. The best scheduling at population generation = 1000 is shown in Tables 3 and 4.
Comparison of Improved ABC Algorithm with MOGA
The results are compared for processing time and processing cost for various numbers of cloudlets, namely 25, 50, 75 and 100. Table 5 compares the MOGA scheduling algorithm with the improved ABC scheduling algorithm [16] on the basis of the cost spent for processing the tasks. From Table 5 it can be seen that with MOGA scheduling the processing cost spent to complete the tasks is much lower than the processing cost with the ABC algorithm. Table 6 compares the improved ABC scheduling algorithm with MOGA (Multi-Objective Genetic Algorithm) on the basis of the time taken for completion of the tasks. From Table 6 it can be seen that with MOGA scheduling the time taken to complete the tasks is

Table 4 Cloudlets executed on virtual machines
Cloudlet ID STATUS Data center ID VM ID Time Start Time Finish Time
1 SUCCESS 3 1 9.8 0.2 10
3 SUCCESS 3 1 9.8 0.2 10
5 SUCCESS 3 1 9.8 0.2 10
7 SUCCESS 3 1 9.8 0.2 10
9 SUCCESS 3 1 9.8 0.2 10
11 SUCCESS 3 1 9.8 0.2 10
13 SUCCESS 3 1 9.8 0.2 10
15 SUCCESS 3 1 9.8 0.2 10
17 SUCCESS 3 1 9.8 0.2 10
19 SUCCESS 3 1 9.8 0.2 10
21 SUCCESS 3 1 9.8 0.2 10
23 SUCCESS 3 1 9.8 0.2 10
0 SUCCESS 2 0 10.61 0.2 10.81
2 SUCCESS 2 0 10.61 0.2 10.81
4 SUCCESS 2 0 10.61 0.2 10.81
6 SUCCESS 2 0 10.61 0.2 10.81
8 SUCCESS 2 0 10.61 0.2 10.81
10 SUCCESS 2 0 10.61 0.2 10.81
12 SUCCESS 2 0 10.61 0.2 10.81
14 SUCCESS 2 0 10.61 0.2 10.81
16 SUCCESS 2 0 10.61 0.2 10.81
18 SUCCESS 2 0 10.61 0.2 10.81
20 SUCCESS 2 0 10.61 0.2 10.81
22 SUCCESS 2 0 10.61 0.2 10.81
24 SUCCESS 2 0 10.61 0.2 10.81

Table 5 Simulation of processing cost for the improved ABC algorithm [16] and MOGA (multi-objective genetic algorithm)
No. of cloudlets Improved ABC algorithm (Processing cost in Rs.) MOGA (Processing cost in Rs.)
25 72.34 53.12
50 374.01 105.13
75 402.61 141.53
100 543.32 203.17

Table 6 Simulation of processing time for the improved ABC algorithm [16] and MOGA (multi-objective genetic algorithm)
No. of cloudlets Improved ABC algorithm (Processing time in seconds) MOGA (Processing time in seconds)
25 133.1 10.81
50 234.01 21.02
75 378.51 28.3
100 487.5 40.63

Fig. 5 ABC algorithms versus MOGA cost

much lower than the time taken to complete the tasks with the improved ABC algorithm. From the results, we can conclude that the MOGA scheduling algorithm performs better than the improved ABC scheduling algorithm (Figs. 5 and 6).

5 Conclusion and Future Scope

The proposed approach, a multi-objective genetic algorithm for scheduling jobs on processors in a cloud computing environment, leads to better results. The fitness function is designed to promote solutions that attain both cost and makespan minimization. The performance of MOGA (on minimization of the cost and makespan of the resources) is evaluated with variation of its control parameters. Increasing the population generations while keeping the population size constant enables the genetic algorithm to obtain a minimum cost and makespan, which results in better scheduling.

Fig. 6 ABC algorithms versus MOGA processing time

From the comparison of the completion time taken and the processing cost spent for the MOGA scheduling algorithm and the improved ABC scheduling algorithm, it is concluded that the MOGA scheduling algorithm is better than the improved ABC scheduling algorithm.
In the future, a study has to be done to select the correct parameters. It is planned to extend the proposed work to dynamic scheduling of both dependent and independent tasks considering different objectives, and to compare the simulated results with other scheduling algorithms. The work can also be extended by considering objectives such as maximizing processor utilization. The simulation, currently tested for the reliability of the fitness function, will also be tested in real-world scenarios.

References

1. Randles M, Lamb D, Taleb-Bendiab A (2010) A comparative study into distributed load balancing algorithms for cloud computing. In: IEEE 24th international conference on advanced information networking and applications workshops, pp 551–556
2. Velte AT, Velte TJ, Elsenpeter R (2010) Cloud computing a practical approach, TATA McGraw-
HILL edn, pp 3–11
3. Dikaiakos M, Katsaros D, Mehra P, Vakali A (2009) Cloud computing: distributed internet computing for IT and scientific research. IEEE Trans Internet Comput 13(5):10–13
4. Sadhasivam S, Nagaveni N (2009) Design and implementation of an efficient two-level sched-
uler for cloud computing environment. In: International conference on advances in recent
technologies in communication and computing, IEEE, pp 884–886
5. Van den Bossche R, Vanmechelen K, Broeckhove J (2010) Cost optimal scheduling in hybrid
IaaS clouds for deadline constrained workloads. In: IEEE international conference on cloud
computing, Miami, pp 228–235
6. Ge Y, Wei G (2010) GA-based task scheduler for the cloud computing systems. In: IEEE
international conference on web information systems and mining, pp 181–186
7. Zhao L, Ren Y, Sakurai K (2011) A resource minimizing scheduling algorithm with ensuring
the deadline and reliability in heterogeneous systems. In: International conference on advance
information networking and applications, IEEE, pp 275–282

8. Sindhu S, Mukherjee S (2011) Efficient task scheduling algorithms for cloud computing envi-
ronment. In: International conference on high performance architecture and grid computing,
vol 169, pp 79–83
9. Guo G, Ting-Iei H, Shuai G (2010) Genetic simulated annealing algorithm for task scheduling
based on cloud computing environment. In: IEEE international conference on intelligent
computing and integrated systems (ICISS), Guilin, pp 60–63
10. Sandhu SK, Kumar A (2017) A survey on meta-heuristic scheduling optimization techniques
in cloud computing environment. Int J Recent Innov Trends Comput Commun 5(6):486–492
11. Sureshbabu GNK, Srivatsa SK (2014) A review of load balancing algorithms for cloud
computing. Int J Eng Comput Sci 3(9):8297–8302
12. Jena RK (2015) Task scheduling in cloud environment using multi-objective genetic algorithm.
In: 5th international workshop on computer science and engineering, WCSE
13. Nagaraju D, Saritha V (2017) An evolutionary multi-objective approach for resource scheduling
in mobile cloud computing. Int J Intell Eng Syst 10(1):12–21
14. Mahmood A, Khan SA (2017) Hard real-time task scheduling in cloud computing using an
adaptive genetic algorithm. Computers 6(15):1–20
15. Srichandan S, Ashok Kumar T, Bibhudatta S (2018) Task scheduling for cloud computing using
multi-objective hybrid bacteria foraging algorithm. Future Comput Inf J 3:210–230
16. Selvarani S, Sadhasivam GS (2011) Improved cost-based algorithm for task scheduling in cloud
computing, IEEE, pp 1–5
17. Tayal S (2011) Tasks scheduling optimization for the cloud computing systems. Int J Adv Eng
Sci Technol 5(2):111–115
Facial Expression Recognition Using
Convolutional Neural Network

Vandana and Nikhil Marriwala

Abstract Facial expression conveys the emotional state of human beings. Facial expressions are a common form of non-verbal communication that helps to transfer necessary information from one person to another. However, in today's world, with the increasing demand for artificial intelligence, recognition of facial expressions is a challenging task in solving problems related to artificial intelligence, machine learning, and computer vision. In this paper, we present an approach that helps to classify different types of facial expressions using the Convolutional Neural Network (CNN) algorithm. The proposed model is a neural network architecture that is based on sharing of weights and optimizing parameters using the CNN algorithm. Two models are designed using this algorithm, named the Simple CNN and the Improved CNN, with different numbers of convolution layers; the architectural designs of the two models differ from each other. The input to our system is grayscale images of different faces showing expressions. Using the grayscale images as input, both CNN models are trained and their parameters optimized in the neural network. The output of the system is one of seven common facial expressions: happy, anger, sad, surprise, fear, disgust, and neutral. To evaluate the designed models, graphs of loss and accuracy are plotted for both models. The overall simulation results prove that the Improved CNN model improves the accuracy of facial expression recognition as compared to the Simple CNN model. As a result, a confusion matrix analysis is also obtained for the Simple CNN and Improved CNN models.

Keywords Facial expression recognition · Convolution neural network · Simple CNN and improved CNN · Confusion matrix

Vandana (B)
University Institute of Engineering and Technology, Kurukshetra University, Kurukshetra, India
N. Marriwala
Electronics and Communication Engineering Department, University Institute of Engineering and
Technology, Kurukshetra University, Kurukshetra, India
e-mail: nmarriwala@kuk.ac.in


1 Introduction

1.1 Related Work

Human beings communicate verbally as well as through body movement and gestures (non-verbally) to convey necessary information and to interact with each other. A facial expression recognition system is a technology that automatically detects facial expressions in real time. Nowadays, facial expression recognition is a beneficial artificial intelligence technology in human–computer interaction, automobile driving, biometric systems, customer feedback systems, market research, market assistance, video games and video testing, computer vision problems, online distance education, online medicine and healthcare services, safe driving monitoring, lie detection, predictive learning [1] and other fields [2]. There are many literature surveys and researches related to face detection techniques, image pre-processing techniques, feature extraction mechanisms, and other techniques used for expression classification, but the development of an automatic system with improved accuracy for facial expression recognition remains a challenging and difficult task [3]. Some of the most commonly used techniques are Local Binary Patterns (LBP), Sparse Support Vector Machine (SSVM) [4], Support Vector Regression [5], LeNet and ResNet [3], STM-ExpLet [6], the k-means clustering algorithm, the fuzzy C-means clustering algorithm, Support Vector Machines (SVM), Maximum Entropy, and deep learning algorithms including Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN), Convolution Neural Network (CNN), Artificial Neural Network (ANN), Merged Convolution Neural Network [3], Adaboost, Extreme Learning Machine (ELM), Bayesian Networks and the multilevel Hidden Markov Model (HMM) [7]. These approaches are used for facial expression recognition and emotion classification systems.

2 Architecture of Proposed CNN Model

In this paper, the proposed model is a Convolution Neural Network which uses the back-propagation algorithm for learning parameters and optimizing them, using convolution operations (dot products) [7]. The Convolution Neural Network helps to improve the accuracy of the facial expression recognition system. The CNN algorithm builds a system based on shared weights, which uses multiple layers to successively extract higher-level parameters from the input images [2]. A CNN is a model used for extracting desired parameters and is also used for image classification [8]. In this paper, two models, a Simple CNN model and an Improved CNN model [9], are designed using the CNN algorithm to compare the accuracy and performance of the models. The Improved CNN model has a greater number of layers than the Simple CNN model and is therefore called an improved version of the CNN model.

Fig. 1 Seven facial expressions

The proposed method aims to classify emotions and recognize facial expressions. The method is therefore divided into several steps: the FER-2013 database, image pre-processing, feature extraction, and facial expression classification and recognition (Fig. 1).
A facial expression dataset is the basic requirement for facial expression recognition. FER-2013 is an open-source database provided by Kaggle. It has 35,887 grayscale images of (48 × 48)-pixel facial expressions. Each grayscale image is assigned one of seven emotion labels: 0 = Angry, 1 = Disgust, 2 = Fear, 3 = Happy, 4 = Sad, 5 = Surprise, and 6 = Neutral. The original data consists of arrays in .csv format with a grayscale value for each pixel. This data is converted into images and divided into multiple sets: training, validation, and test sets. There are 28,709 images for training, 3589 images for the validation set, and 3589 images for the test set. In the total set, there are 4953, 547, 5121, 8989, 6077, 4002 and 6198 images of the seven kinds of expressions, respectively [10]. From the total set, 3995, 436, 4097, 7215, 4830, 3171 and 4965 images are used for training only [11]. Training of the models is done in the Python language using Keras with the TensorFlow backend; a brief loading sketch is given below. The database is used to conduct training, validation, and testing of the models in an 80:10:10 ratio. Some of the commonly known datasets are: the FER-2013 dataset, the CK+ (Cohn-Kanade) dataset, the JAFFE dataset [4], the RAF database [2, 3], the CMU and NIST datasets [1], the EURECOM Kinect face dataset [12], and other datasets [13]. The image pre-processing step improves the image by removing undesired distortions and unnecessary information from the input. Facial expression image pre-processing mainly consists of face detection and location, rotation rectification [14], normalization, scale normalization and gray-level equalization [7]. Feature extraction is the main step; its purpose is to extract relevant parameters from the input. LBP is one feature extraction technique: it creates new parameters from the original features, and the new reduced set of parameters retains most of the information, which helps to classify the facial expressions into the seven emotion labels in the next step [2]. The structure of facial expression recognition is shown in Fig. 2.
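A minimal Python sketch of reading the FER-2013 .csv just described (the filename and column names follow the public Kaggle release; this is illustrative, not the authors' exact pipeline):

import numpy as np
import pandas as pd

df = pd.read_csv("fer2013.csv")            # columns: emotion, pixels, Usage
X = np.stack([np.asarray(p.split(), dtype=np.uint8).reshape(48, 48, 1)
              for p in df["pixels"]])      # 35,887 grayscale 48x48 images
y = df["emotion"].to_numpy()               # labels 0 (Angry) .. 6 (Neutral)
train = df["Usage"] == "Training"          # 28,709 images
val   = df["Usage"] == "PublicTest"        # 3,589 images
test  = df["Usage"] == "PrivateTest"       # 3,589 images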
Fig. 2 Structure of facial expression recognition

The Convolution Neural Network is a composition of multiple layers; it is a feedforward network [7] that extracts the desired parameters from the input and optimizes the network parameters using the back-propagation algorithm. It works heavily with convolution operations (dot products) [15]. The process consists of up to five convolution layers and three max pooling layers in the neural network. These layers are the input layer, convolution layer, pooling layer, fully connected layer, and output layer; a neural network is formed when these layers are connected with each other. In addition, some other layers, such as the dropout layer, activation function, flatten layer and batch normalization, are also necessary for designing a CNN architecture [6, 16]. Edge-detecting filters are used in the convolution layers [1]. The layers of the network are described as follows. The input layer is the first layer in the CNN architecture and contains the grayscale image dataset in the form of arrays; its size is 46 × 46 × 1. A filter of size 5 × 5, 3 × 3 or 1 × 1 is applied to the input layer and the convolution operation is then performed on each parameter of the input layer. The convolution layer is the feature-extracting layer in the CNN model, which extracts the basic parameters of the image. A convolution operation is performed in one convolution layer, and a filter is applied to the image to perform convolution again for the next feature extraction; the process of convolution repeats until all features are extracted from the whole image. After the convolution operations, normalization between 0 and 1 is done in the layer, and ReLU is used as the activation function after this layer. Next, the max pooling layer performs downsampling [16] with a filter of size 2 × 2, which reduces the size of the layer; the Adam optimizer is used after this layer [17]. The fully connected layer contains a number of neurons and weights; either 256, 512 or 1024 neurons are used in the fully connected layer [16]. This layer is mainly responsible for connecting the neurons of one layer to the neurons of another layer in a Convolution Neural Network architecture [16]. The output layer is the final layer, which consists of the labels resulting from the CNN model architecture [6, 7]. The architecture of the Convolution Neural Network is shown in Fig. 3.

Fig. 3 (a) Architecture of convolution neural network. (b) Convolution operation
Experimental Results
The Simple CNN model architecture is composed of a mixture of convolutional layers, fully connected layers, a max pooling layer, a flatten layer, a batch normalization layer, a dropout layer, an activation function, and a loss function [15]. The design idea of the first model is to connect two convolution layers with two fully connected layers [3] and some other layers, such as batch normalization, activation functions, max pooling, flatten and dropout layers [9, 15]. The Simple CNN model is designed accordingly, and its layers are shown in Table 1; a Keras sketch of this model follows the table. According to the experimental results obtained from the Simple CNN model, the accuracy after training is 54% for 50 epochs and the execution time after training is very short, 2 min 42 s.

Table 1 CNN model layers
Layer Output shape Weights
Convolution layer 46 * 46 * 32 320
Batch normalization 46 * 46 * 32 128
Activation function 46 * 46 * 32 0
Max pooling layer 23 * 23 * 32 0
Dropout layer 23 * 23 * 32 0
Convolution layer 21 * 21 * 64 18,496
Batch normalization 21 * 21 * 64 256
Activation function 21 * 21 * 64 0
Max pooling layer 10 * 10 * 64 0
Dropout layer 10 * 10 * 64 0
Flatten layer 6400 0
Fully connected layer 256 1,638,656
Batch normalization 256 1024
Activation function 256 0
Dropout layer 256 0
Fully connected layer 256 65,792
Batch normalization 256 1024
Activation function 256 0
Dropout layer 256 0
Fully connected layer 7 1799
Accuracy of simple CNN model 54%
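As referenced above, the following Keras (TensorFlow backend) sketch reproduces the layer stack and weight counts of Table 1; the dropout rates, optimizer and loss are assumptions, since the paper does not list them:

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), input_shape=(48, 48, 1)),  # 46 x 46 x 32, 320 weights
    layers.BatchNormalization(),                         # 128 weights
    layers.Activation("relu"),
    layers.MaxPooling2D((2, 2)),                         # 23 x 23 x 32
    layers.Dropout(0.25),                                # rate assumed
    layers.Conv2D(64, (3, 3)),                           # 21 x 21 x 64, 18,496 weights
    layers.BatchNormalization(),                         # 256 weights
    layers.Activation("relu"),
    layers.MaxPooling2D((2, 2)),                         # 10 x 10 x 64
    layers.Dropout(0.25),
    layers.Flatten(),                                    # 6400
    layers.Dense(256),                                   # 1,638,656 weights
    layers.BatchNormalization(),
    layers.Activation("relu"),
    layers.Dropout(0.5),
    layers.Dense(256),                                   # 65,792 weights
    layers.BatchNormalization(),
    layers.Activation("relu"),
    layers.Dropout(0.5),
    layers.Dense(7, activation="softmax"),               # 1799 weights, 7 emotions
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])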

For the second model, the design idea is to increase the number of layers in the CNN network architecture for experimental purposes: eight convolutional layers are connected with eight fully connected layers [3] and other layers such as max pooling, batch normalization, dropout, activation and loss functions. After the addition of the convolution layers and fully connected layers [15], the Improved CNN architecture is designed [9]; its layers are shown in Table 2.
According to the experimental results obtained from the Improved CNN model, the best improved model gives an accuracy after training of 62% for 50 epochs, and the execution time after training, 7 min 30 s, is longer than that of the Simple CNN model. The Improved CNN is an improved version of the Simple CNN model. It can be noticed from the above experimental results that the accuracy of the CNN model increases with an increase in the number of convolutional, fully connected, and other layers in the model.
A confusion matrix is obtained from the analysis of the above experimental results and is used to compare the two models [3]. A confusion matrix is plotted between the true value of the emotion and the predicted value of the emotion [17]. As the Improved CNN model has a greater number of convolution, fully connected and other layers, and more parameters, its accuracy in predicting facial expressions is higher; with an increase in the number of layers, the accuracy increases. It is clearly shown in the confusion matrices of Figs. 4 and 5 that happy is the most highly predicted expression, with a value of 528 in the Simple CNN model and 712 in the Improved CNN model. The high prediction of the happy expression

Table 2 Improved CNN model layers
Layer Output shape Weights
Convolution layer 46 * 46 * 64 640
Batch normalization 46 * 46 * 64 256
Activation function 46 * 46 * 64 0
Max pooling layer 23 * 23 * 64 0
Dropout layer 23 * 23 * 64 0
Convolution layer 19 * 19 * 128 204,928
Batch normalization 19 * 19 * 128 512
Activation function 19 * 19 * 128 0
Max pooling layer 9 * 9 * 128 0
Dropout layer 9 * 9 * 128 0
Convolution layer 7 * 7 * 128 147,584
Batch normalization 7 * 7 * 128 512
Activation function 7 * 7 * 128 0
Max pooling layer 3 * 3 * 128 0
Dropout layer 3 * 3 * 128 0
Convolution layer 1 * 1 * 128 147,584
Batch normalization 1 * 1 * 128 512
Activation function 1 * 1 * 128 0
Max pooling layer 1 * 1 * 128 0
Dropout layer 1 * 1 * 128 0
Convolution layer 1 * 1 * 512 66,048
Batch normalization 1 * 1 * 512 2048
Activation function 1 * 1 * 512 0
Max pooling layer 1 * 1 * 512 0
Dropout layer 1 * 1 * 512 0
Convolution layer 1 * 1 * 512 262,656
Batch normalization 1 * 1 * 512 2048
Activation function 1 * 1 * 512 0
Max pooling layer 1 * 1 * 512 0
Dropout layer 1 * 1 * 512 0
Convolution layer 1 * 1 * 1024 525,312
Batch normalization 1 * 1 * 1024 4096
Activation function 1 * 1 * 1024 0
Max pooling layer 1 * 1 * 1024 0
Dropout layer 1 * 1 * 1024 0
Convolution layer 1 * 1 * 1024 1,049,600
Batch normalization 1 * 1 * 1024 4096
Activation function 1 * 1 * 1024 0
Max pooling layer 1 * 1 * 1024 0
Dropout layer 1 * 1 * 1024 0
Flatten layer 1024 0
Fully connected layer 256 262,400
Batch normalization 256 1024
Activation function 256 0
Dropout layer 256 0
Fully connected layer 256 65,792
Batch normalization 256 1024
Activation function 256 0
Dropout layer 256 0
Fully connected layer 512 131,584
Batch normalization 512 2048
Activation function 512 0
Dropout layer 512 0
Fully connected layer 512 262,656
Batch normalization 512 2048
Activation function 512 0
Dropout layer 512 0
Fully connected layer 512 262,656
Batch normalization 512 2048
Activation function 512 0
Dropout layer 512 0
Fully connected layer 512 262,656
Batch Normalization 512 2048
Activation function 512 0
Dropout layer 512 0
Fully connected layer 1024 525,312
Batch normalization 1024 4096
Activation function 1024 0
Dropout layer 1024 0
Fully connected layer 1024 1,049,600
Batch normalization 1024 4096
Activation function 1024 0
Dropout layer 1024 0
Fully connected layer 7 7175
Accuracy of improved CNN model 62%

Fig. 4 Confusion matrix of simple CNN model

Fig. 5 Confusion matrix of improved CNN model

means that the CNN algorithm can predict the happy expression more easily than any other expression in the FER-2013 database [3]. It is interesting to observe that the predicted value for the disgust expression is 22 for the Simple CNN and 20 for the Improved CNN, which means disgust is the least predicted expression. The expressions Sad, Surprise and Neutral are likely to be confused in both the Simple CNN and Improved CNN models because these expressions have almost equivalent values, as shown in Figs. 4 and 5, which means the CNN algorithm confuses the Sad, Surprise, and Neutral expressions. Also, the prediction of the Neutral and Sad expressions in the Improved CNN model is higher than in the Simple CNN model. In the confusion matrix, it is also observed that the prediction of the Angry and Fear expressions is higher in the Simple CNN than in the Improved CNN, which means the Simple CNN has higher prediction accuracy for some expressions: for some expressions, going deeper in the CNN network architecture does not necessarily provide better features, and no improvement was observed for these features in the experimental results [3]. Also, the Angry and Fear expressions are the most suitable expressions in the confusion matrix for both models, as shown in Figs. 4 and 5 [15, 17].

Fig. 6 Comparison between simple CNN and improved CNN
Figure 6 shows the comparison of expressions between the two models. Both models, Simple CNN and Improved CNN, predict the expressions happy, surprise, neutral, sad, anger, fear, and disgust. It is noticed from Fig. 6 that the Improved CNN model improves the parameters and predicts the expressions accurately, except for some expressions such as Anger and Fear. Happy is the most predicted expression and disgust is the least predicted expression from the neural network (Table 3).
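A small sketch of how such confusion matrices can be produced, assuming scikit-learn and the trained Keras model and test arrays from the earlier sketches:

import numpy as np
from sklearn.metrics import confusion_matrix

y_pred = np.argmax(model.predict(X[test]), axis=1)  # predicted emotion labels
cm = confusion_matrix(y[test], y_pred)              # rows: true, columns: predicted
print(cm)                                           # 7 x 7 matrix as in Figs. 4 and 5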
A graph of loss and accuracy is plotted for both models, as shown in Figs. 7 and 8, to compare the performance of the Simple CNN and Improved CNN models [3]. In Figs. 7a and 8a a graph is plotted between loss and the number of epochs; in Figs. 7b and 8b a graph is plotted between accuracy and the number of epochs. The total number of epochs is 50 in both cases. It is observed from the plotted graphs that the Simple CNN model has a better learning rate than the Improved CNN, meaning the Simple CNN model has a faster learning ability. The loss of the Simple CNN model is much lower than that of the Improved CNN. The accuracy of the Simple CNN model has high learning-feature parameters; therefore, it quickly reaches its peak value and then

Table 3 Comparison between CNN and improved CNN model
Method Accuracy (%) Time taken
Simple CNN model (2 layers) 54 2 min 42 s
Improved CNN model (4 layers) 57 3 min 26 s
Improved CNN model (6 layers) 58 5 min 12 s
Improved CNN model (8 layers) 62 7 min 30 s

Fig. 7 Plotted graph of simple CNN model where a graph plots between loss in training and no. of epochs used during training and b graph plots between accuracy obtained in training and no. of epochs used during training

Fig. 8 Plotted graph of improved CNN model where a graph plots between loss in training and no. of epochs used during training and b graph plots between accuracy obtained in training and no. of epochs used during training

saturates. As a result, the Simple CNN model learns facial features faster than the Improved CNN model [3].

3 Conclusion and Future Work

In this research paper, we designed an efficient CNN algorithm for a facial expression recognition system. Two models, the Simple CNN and the Improved CNN, are designed based on this algorithm to examine the effect of adding layers on the accuracy and performance of a CNN model. The experimental results showed that the Improved CNN model is capable of learning facial features and improving facial expression recognition, but the Simple CNN model learns features faster than the Improved CNN. The time required for training is also much lower for the Simple CNN model than for the Improved
CNN. Both models obtain good detection results, which means they are suitable for facial expression recognition systems. As a result, the happy expression is the most easily detected expression by both CNN models and the disgust expression is the least detected. These results show that the Improved CNN approach is a better model as well as an easy approach to designing neural networks for facial recognition systems. In the future, we want to expand our model to predict expressions in real time, to report the accuracy of each expression as a percentage using the CNN algorithm, and to compare our model with different models for facial expression recognition in real time. Furthermore, our work will also focus on simplifying the algorithm used for the neural network architecture and increasing the speed of the CNN algorithm. In the future, we will also work on applications and design techniques of the Convolution Neural Network.

References

1. Mehendale N (2020) Facial emotion recognition using convolutional neural networks (FERC).
SN Appl Sci 2(3):1–8. https://doi.org/10.1007/s42452-020-2234-1
2. Shi M, Xu L, Chen X (2020) A novel facial expression intelligent recognition method using
improved convolutional neural network. IEEE Access 8:57606–57614. https://doi.org/10.1109/
ACCESS.2020.2982286
3. Ucar A (2017) Deep convolutional neural networks for facial expression recognition. In:
Proceedings—2017 IEEE international conference on innovations in intelligent systems and
applications INISTA 2017, pp 371–375. https://doi.org/10.1109/INISTA.2017.8001188

4. Wu C, Chai L, Yang J, Sheng Y (2019) Facial expression recognition using convolutional neural
network on graphs. In: Chinese control conference CCC, vol 2019, pp 7572–7576. https://doi.
org/10.23919/ChiCC.2019.8866311
5. Yang Y, Key C, Sun Y, Key C (2017) Facial expression recognition based on arousal-valence
emotion model and deep learning method. pp 59–62. https://doi.org/10.1109/ICCTEC.2017.
00022
6. Jung H et al (2015) Development of deep learning-based facial expression recognition system.
In: 2015 Frontiers of computer vision, FCV 2015, pp 2–5. https://doi.org/10.1109/FCV.2015.
7103729
7. Zhang H, Jolfaei A, Alazab M (2019) A face emotion recognition method using convolutional
neural network and image edge computing. IEEE Access 7:159081–159089. https://doi.org/
10.1109/ACCESS.2019.2949741
8. Network CN, Wang Y, Li Y, Song Y, Rong X (2019) Facial expression recognition based on
random (1):1–16
9. Xu Q, Zhao N (2020) A facial expression recognition algorithm based on CNN and LBP
feature. In: Proceedings—2020 IEEE 4th information technology, networking, electronic and
automation control conference (ITNEC), pp 2304–2308. https://doi.org/10.1109/ITNEC48623.
2020.9084763
10. Zahara L, Musa P, Prasetyo Wibowo E, Karim I, Bahri Musa S (2020) The facial emotion
recognition (FER-2013) dataset for prediction system of micro-expressions face using the
convolutional neural network (CNN) algorithm based Raspberry Pi. In: 2020 Fifth international
conference on informatics and computing (ICIC), March 2021. https://doi.org/10.1109/ICIC50
835.2020.9288560
11. Zhou N, Liang R, Shi W (2021) A lightweight convolutional neural network for real-time facial
expression detection. IEEE Access 9:5573–5584. https://doi.org/10.1109/ACCESS.2020.304
6715
12. Ijjina EP, Mohan CK (2014) Facial expression recognition using kinect depth sensor and convo-
lutional neural networks. In: Proceedings—2014 13th international conference on machine
learning and applications ICMLA 2014, pp 392–396. https://doi.org/10.1109/ICMLA.2014.70
13. Kim JH, Kim BG, Roy PP, Jeong DM (2019) Efficient facial expression recognition algorithm
based on hierarchical deep neural network structure. IEEE Access 7:41273–41285. https://doi.
org/10.1109/ACCESS.2019.2907327
14. Yang B, Cao J, Ni R, Zhang Y (2017) Facial expression recognition using weighted mixture
deep neural network based on double-channel facial images. IEEE Access 6(8):4630–4640.
https://doi.org/10.1109/ACCESS.2017.2784096
15. Pranav E, Kamal S, Satheesh Chandran C, Supriya MH (2020) Facial emotion recognition
using deep convolutional neural network. In: 2020 6th international conference on advanced
computing and communication systems (ICACCS), pp 317–320. https://doi.org/10.1109/ICA
CCS48705.2020.9074302
16. Kumar GAR, Kumar RK, Sanyal G (2018) Facial emotion analysis using deep convolution
neural network. In: Proceedings—2017 international conference on signal processing and
communication (ICSPC), vol 2018-Janua, pp 369–374. https://doi.org/10.1109/CSPC.2017.
8305872
17. Videla LS, Kumar PMA (2020) Facial expression classification using vanilla convolution neural
network. In: 2020 7th international conference on smart structures and systems (ICSSS), pp
2–6. https://doi.org/10.1109/ICSSS49621.2020.9202053
FPGA Implementation of Elliptic Curve
Point Multiplication Over Galois Field

S. Hemambujavalli, P. Nirmal Kumar, Deepa Jose, and S. Anthoniraj

Abstract Safeguarding data from unauthorized access has become a major issue in various aspects of daily life, and elliptic curve cryptography plays a vital role in securing it. In this work, an elliptic curve cryptography (ECC) processor is built over a binary field. The Montgomery ladder, as well as the unified point addition algorithm, is considered for EC point multiplication. The conclusive operation of the ECC processor is the ECPM used to generate a public key. The elliptic curve forms a group under point addition, which is exploited to reduce the number of operations and minimize the area of the processor. The proposed designs are simulated and synthesized in the VIVADO 2018 tool.

Keywords Point multiplication · Galois binary field · Elliptic curve cryptography (ECC) · Coordinates

1 Introduction

Cryptography is the technique of hiding our data and securing it from unauthorized persons. It involves mathematical techniques that provide protection services such as confidentiality, integrity, authentication, and non-repudiation. Various data and images from different sources are broadcast through the network, including online data exchanges, medical details, military confidential details, and record storage systems, among other things; highly confidential data is present. Elliptic curve cryptography (ECC), a form of asymmetric cryptography, is used to protect this. The strength of ECC is that its keys are shorter. As compared to
S. Hemambujavalli · P. Nirmal Kumar
Department of ECE, CEG, Anna University, Chennai, Tamil Nadu 600025, India
D. Jose (B)
Department of ECE, KCG College of Technology, Rajiv Gandhi Salai, Chennai, Tamil Nadu 600097, India
e-mail: deepa.ece@kcgcollege.com
S. Anthoniraj
Department of ECE, MVJ College of Engineering, Whitefield Salai, Bangalore 560067, India


other cryptosystems, it has lower storage needs and faster field arithmetic operations, making it a better option for integrating asymmetric cryptography into resource-constrained systems, which are often used in IoT devices such as wireless sensors, while achieving high throughput.
Innumerable public key ciphers, including the ElGamal cryptosystem, Diffie–Hellman (1976), RSA and elliptic curve cryptosystems, depend on algorithms from algebra, computer arithmetic, and number theory. Several public key cryptosystems make extensive use of the modular square, modular arithmetic operations, and modular multiplication over the finite prime field and binary field. Among the techniques proposed in the literature, RSA and ECC have attracted a great deal of practical attention due to the high level of security involved; RSA requires a greater key length than ECC. A restricted environment is a computational organisation that contains many diverse elements and limited capacities of the underlying computational devices; these restrictions are imposed by the device's computing power and connectivity range. Constrained structures refer to deployments in IoT, wireless technologies, traditional applications, smartphone applications, and other newly found modern uses of elliptic curves, all of which use ECC.
Miller [1] and Koblitz [2] were the first to use elliptic curves in cryptography. Several standardization bodies, including ANSI (ANSI 1999), NIST (NIST 1999), and ISO (ISO/IEC 2009), have adopted standards for ECC during the last two decades. It is widely acknowledged that security problems will undoubtedly play a significant role in modern computer and communication systems [1–5]. For elliptic curves, an implementation over either a binary or a prime field can be chosen. This paper implements the ECPM over the binary fields GF(2^233) and GF(2^163). ECPM is the technique used to produce the key by multiplying a private key with the base point. The elliptic curve is used to create a group under point addition, which helps to reduce the number of operations and the processor's area. The modular multiplication algorithm used is the widely adopted interleaved modular multiplication, which determines the number of clock cycles necessary to complete the multiplication process. The ECPM level is implemented and compared using the unified point addition and Montgomery ladder algorithms [6–9].

2 Analytical Background

The Weierstrass form of the equation is applied in the binary case:

y^2 + xy = x^3 + ax^2 + b,  where a, b ∈ GF(2^m)    (1)

Mainly three operations can be done. First, point addition adds one point P of the curve to another point Q to produce a point R (i.e. P + Q = R). Second, point doubling adds an elliptic curve point P to itself to obtain R, the double of P. Third, point multiplication multiplies the point P by an integer k to obtain Q = kP. Conversions between affine coordinates and projective coordinates are used.
Cryptography is usually carried out over the Galois binary field or the Galois prime field. The performance of the multiplication algorithm decides the area utilization of the processor and therefore demands area optimization. Here, simple XOR gates implement the finite field addition, which avoids carry generation and thereby reduces resource utilization.
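As a quick illustration of Eq. (1) and the carry-free arithmetic just mentioned, a point (x, y) lies on the curve exactly when both sides of Eq. (1) agree in GF(2^m); a minimal Python sketch, assuming the gf2m_* field helpers sketched later in Sect. 4.1:

def on_curve(x, y, a, b):
    # y^2 + x*y == x^3 + a*x^2 + b, all arithmetic in GF(2^m)
    lhs = gf2m_add(gf2m_sqr(y), gf2m_mul(x, y))
    rhs = gf2m_add(gf2m_add(gf2m_mul(gf2m_sqr(x), x),
                            gf2m_mul(a, gf2m_sqr(x))), b)
    return lhs == rhs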

2.1 Group Operations

In point addition (Fig. 1), consider both points P and Q of the EC and the straight line drawn through them. The line crosses the EC at a third point, −R; reflecting this point about the x-axis gives the point R, so that P + Q = R. Let P = (x_1, y_1) and Q = (x_2, y_2) with P ≠ ±Q, and let P + Q = (x_3, y_3), where (x_3, y_3) denotes the coordinates of R:

x_3 = λ^2 + λ + x_1 + x_2 + a    (2)

y_3 = λ(x_1 + x_3) + x_3 + y_1    (3)

λ = (y_1 + y_2)/(x_1 + x_2)    (4)

Fig. 1 Illustration of point addition

Fig. 2 Illustration of point doubling

In point doubling (Fig. 2), draw a tangent line at the point P on the EC. The line intersects the elliptic curve at a point −R, and a point R is obtained by reflecting (−R) about the x-axis. This effectively doubles the point, giving R = 2P = (x_3, y_3), where

x_3 = λ^2 + λ + a = x_1^2 + b/x_1^2    (5)

y_3 = x_1^2 + λx_3 + x_3    (6)

λ = x_1 + y_1/x_1    (7)
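A compact Python sketch of Eqs. (2)–(7), again assuming the gf2m_* helpers of Sect. 4.1 (gf2m_inv denotes a field inversion; field addition is the XOR operator ^):

def point_add(P, Q, a):
    # affine addition with P != +/-Q, per Eqs. (2)-(4)
    (x1, y1), (x2, y2) = P, Q
    lam = gf2m_mul(y1 ^ y2, gf2m_inv(x1 ^ x2))
    x3 = gf2m_sqr(lam) ^ lam ^ x1 ^ x2 ^ a
    y3 = gf2m_mul(lam, x1 ^ x3) ^ x3 ^ y1
    return (x3, y3)

def point_double(P, a, b):
    # affine doubling per Eqs. (5)-(7)
    x1, y1 = P
    lam = x1 ^ gf2m_mul(y1, gf2m_inv(x1))
    x3 = gf2m_sqr(lam) ^ lam ^ a
    y3 = gf2m_sqr(x1) ^ gf2m_mul(lam, x3) ^ x3
    return (x3, y3)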

2.2 Coordinates Systems

Affine is the simplest point representation scheme: each point P ∈ E(GF(2^m)) is represented by a coordinate pair, usually (x, y). A point can be represented with only two values, and in some situations a single coordinate is sufficient. The drawback is that computing the group operation with results in affine coordinates needs many field inversions, and inversions in the binary field are resource-intensive operations that should be avoided in restricted environments. This problem is solved using projective coordinates. A triple (X:Y:Z) denotes a projective point P ∈ E(GF(2^m)). The formulas for converting a projective to an affine coordinate are x => X/Z, y => Y/Z. With this method the result is obtained without the use of an inversion operation, at the cost of an extra value to describe the Z-axis.
The affine point (x, y) can be converted to the projective form (X, Y, Z) as follows:

X => x, Y => y, Z => 1

The projective form (X, Y, Z) can be reshaped into an affine point (x, y) as follows:

x => X/Z, y => Y/Z
3 Hierarchy of ECC

The implementation of elliptic curve cryptosystems constitutes a complex cross-sectoral research area involving mathematics, electrical engineering, and computer science.
As shown in Fig. 3, field operations, elliptic curve group operations, and elliptic curve point multiplication are the three levels of ECC:

Fig. 3 Hierarchy of ECC

• At the first stage of ECC is the Galois field arithmetic, which includes modular addition, subtraction, squaring, multiplication, and modular inversion.
• The second stage of ECC involves the group operations, point addition and point doubling, which are built from the modular arithmetic operations.
• The third stage of ECC is the ECPM, which sequentially combines the group and field operations.
• ECPM is the most time-consuming operation in an elliptic curve cryptosystem. It computes Q = kP for generating a public key, where
Q → public key
P → elliptic curve's base point
k → secret or private key
The points on an elliptic curve are naturally affine, (x, y), since it is a two-dimensional curve. However, to simplify the computations, the algorithm used for ECP multiplication uses projective coordinates, which contain an additional coordinate and are thus of the form (X, Y, Z).

4 Proposed Design

4.1 Binary Polynomial Algorithm

This algorithm computes an X/Y mod f in the field of binary. In the scenario of elliptic
over the binary field, where the degree m can be used to minimize an outcome of the
operation with a degree exceeding m−1, polynomial arithmetic cannot be simplified.
To more efficiency in polynomial reducer some implementation of operation on
an elliptic curve it chose the irreducible polynomial as trinomial polynomials. For
polynomial addition, it can be achieved by the XOR gates. Because there is no carry
generation. For example, the addition of two numbers can be represented as A ⊕ B.
Squaring is quite quicker than multiplies the two random polynomials because it is
the linear operation for finite field. The binary form of a(x)2 is achieved by adding
a zero’s between successive bits of a(X) is in binary form. These algorithms use the
shift and add method, so they begin processing the multiplier b(x) with its LSB or
either MSB.
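A minimal Python sketch of the shift-and-add (LSB-first) multiplication just described, assuming the NIST trinomial f(x) = x^233 + x^74 + 1 for GF(2^233); elements are held as integers whose bits are the polynomial coefficients:

M = 233
F = (1 << 233) | (1 << 74) | 1      # irreducible trinomial f(x), an assumed choice

def gf2m_add(a, b):
    return a ^ b                    # carry-free addition: plain XOR

def gf2m_mul(a, b):
    # shift-and-add, processing the multiplier b from its LSB
    r = 0
    while b:
        if b & 1:
            r ^= a                  # accumulate the current (reduced) multiple of a
        b >>= 1
        a <<= 1
        if a >> M:                  # degree reached m: reduce modulo f(x)
            a ^= F
    return r

def gf2m_sqr(a):
    return gf2m_mul(a, a)           # squaring (zero-interleaving would be faster)

def gf2m_inv(a):
    # inversion via Fermat's little theorem: a^(2^m - 2); simple but slow
    r, e = 1, (1 << M) - 2
    while e:
        if e & 1:
            r = gf2m_mul(r, a)
        a = gf2m_sqr(a)
        e >>= 1
    return r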

4.2 ECC Processor

The processor is designed to generate a public key with area-efficient elliptic curve cryptography. It obtains Q as the product of the private key (k) with the reference point (P). Both affine and projective coordinates are considered. Initially, it converts P(x, y) to P(X, Y, Z); this conversion is the affine-to-projective convertor, with X = x, Y = y, Z = 1. It then yields Q(X, Y, Z) by computing the ECPM of P(X, Y, Z) with the confidential key, which is accomplished by adding P to itself k−1 times. Finally, it converts Q(X, Y, Z) to Q(x, y); this convertor performs the projective-to-affine conversion. Figure 4 shows the block diagram of public key generation.

Fig. 4 Public key generation: a pre-process stage converts affine to projective coordinates (X => x, Y => y, Z => 1); the point multiplication stage computes Q = kP using point doubling and point addition over an arithmetic unit (ADD, MUL, SQUARER); a post-process stage converts back to affine coordinates (x = X/Z, y = Y/Z)

4.3 ECPM

The methods for performing point multiplication on an elliptic curve are discussed in this section. These methods use the scalar k to compute the main ECC operation, kP, the elliptic curve point multiplication. The fundamental idea behind the kP method is to add a point P to itself k times, where P is an elliptic curve point and k ∈ [1, n−1] is an integer. The binary algorithm is the easiest way to multiply points on an EC. An ECC processor's definitive action is the ECPM, in which a point on the elliptic curve is multiplied by the scalar. ECPM is a crucial mechanism in ECC systems. P(X, Y, Z) is the point on the curve in projective coordinates, whereas k represents the private or hidden key. By performing ECPM on the private key k and the point P, the public key Q(X, Y, Z) is created. It performs Q = kP, where Q is an elliptic curve point calculated by adding P to itself k−1 times:

Q = P + P + P + ... + P

Fig. 5 General steps to accelerate ECPM: (1) determine a point multiplication algorithm; (2) determine the coordinates used to represent elliptic curve points; (3) choose the finite field operations and select either a prime or a binary field

ECPM is concerned with the number of group operations as well as the arithmetic operations, and these choices make its time complexity more manageable. Figure 5 presents the three general steps to follow to accelerate an elliptic curve point multiplication. Figure 6 shows the ECPM implemented with the Montgomery ladder algorithm over binary fields.

4.3.1 Unified Point Addition

To prevent side-channel attacks, PA and PD are performed as similar modules in unified PA. This architecture has four stages, each of which has arithmetic modules connected in sequential order; through this, it achieves the shortest data path. The input is P in projective coordinates together with the key bits k_i for i from 0 to M−1, and the output is Q in projective coordinates. In ECPM, the design operates on the bit pattern of a 233-bit key over binary fields.

Fig. 6 Montgomery-based ECPM: the Montgomery ladder algorithm; conversion between projective and affine coordinates in both directions; design over a binary field with modular multiplication, squaring, reduction and addition for the field operations

• Feed in: P(X, Y, Z), k = Σ_{i=0}^{M−1} k_i 2^i
• Feed out: Q(X, Y, Z)
Q1 ← P
for i = M−2 downto 0 do
    Q2 ← Q1 + Q1
    if k_i = 1 then
        Q3 ← Q2 + P
        Q1 ← Q3
    else
        Q1 ← Q2
    end if
end for
return Q1
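To make the control flow of this loop concrete, here is a minimal Python sketch at the integer level; point_add and point_double stand in for the unified PA/PD modules, and the toy check at the end uses ordinary integer arithmetic rather than the projective GF(2^233) group law.

def double_and_add(k, P, point_add, point_double):
    # Scan the key bits MSB-first, doubling every iteration and adding P
    # whenever the current key bit is 1 (the kP loop shown above).
    Q1 = P
    for i in range(k.bit_length() - 2, -1, -1):
        Q2 = point_double(Q1)       # Q2 <- 2*Q1
        if (k >> i) & 1:
            Q1 = point_add(Q2, P)   # Q3 <- Q2 + P; Q1 <- Q3
        else:
            Q1 = Q2                 # Q1 <- Q2
    return Q1

# Toy self-check where "point addition" is ordinary addition:
assert double_and_add(29, 5, lambda a, b: a + b, lambda a: 2 * a) == 29 * 5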

4.3.2 Montgomery Ladder for ECPM

To perform ECPM, the proposed Montgomery ladder algorithm has been implemented. It performs the PA and PD group operations simultaneously so that the key pattern cannot be determined: the cumulative power drawn by PD and PA at each iteration is the result of this parallel behaviour. As a result, the latency of PA is longer than that of PD, because the PD module completes before PA. The point addition module's initial data are P and 2P. The PD module's computation proceeds from bit (l−2) of k, where l denotes the bit length of k.
• Feed in: base point P(x, y) on the curve, scalar k with bits k_i, i = M−1 downto 0
• Feed out: Q = kP
Q0 ← P
Q1 ← 2P
for i = M−2 downto 0 do
    if k_i = 0 then
        Q1 ← Q0 + Q1
        Q0 ← 2Q0
    else
        Q0 ← Q0 + Q1
        Q1 ← 2Q1
    end if
end for
return Q0
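The following is an analogous Python sketch of the ladder above; note that one PA and one PD occur in every iteration regardless of the key bit, which is what hides the key pattern from simple power analysis. The group operations are again integer placeholders rather than the binary-field datapath.

def montgomery_ladder(k, P, point_add, point_double):
    Q0, Q1 = P, point_double(P)     # Q0 = P, Q1 = 2P
    for i in range(k.bit_length() - 2, -1, -1):
        if (k >> i) & 1 == 0:
            Q0, Q1 = point_double(Q0), point_add(Q0, Q1)
        else:
            Q0, Q1 = point_add(Q0, Q1), point_double(Q1)
    return Q0

# Toy self-check with integer stand-ins for the group operations:
assert montgomery_ladder(29, 5, lambda a, b: a + b, lambda a: 2 * a) == 29 * 5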

5 Results

The ECC processor is designed for 233-bit and 163-bit fields for public key generation. VIVADO 2018.2 is used for the simulation of the ECPM based on the Montgomery ladder and unified PA algorithms, which utilize low area. Figure 7 depicts ECC's group operations, which are used to accelerate the ECPM that produces the public key.
P and Q are the two inputs in (X, Y) coordinates, with R being the output of the PA and PD stages. The design simulation results obtained using the Vivado simulator are shown in Fig. 8. Here the clock is set to 1, reset to 0, and start to 1. When the operation signal is 1, the point addition and point doubling operations run; otherwise the output remains 0. The clock period is 1000 ps.
Fig. 7 Schematic of PA and PD

Fig. 8 PD and PA simulation

Fig. 9 Schematic of ECPM

Fig. 10 Simulation of Montgomery point multiplication

The Montgomery ladder algorithm has been applied to the ECPM shown in Fig. 9, which executes PA and PD sequentially, following the binary bit sequence of the hidden key (k). The proposed 163-bit design is simulated and synthesized using the Vivado simulator.
Figure 10 shows the simulation results of the ECPM based on the Montgomery ladder algorithm, which produced the public key (Q). A clock period of 1000 ps is used.
The proposed design implementing the unified PA algorithm over an ECPM is shown in Fig. 11. Here the inputs are the private key (k) and the base point (Xp, Yp), with constant values forced for Xp, Yp and k. When the clock is enabled, the signals XQ1, YQ1, XQ2 and YQ2 are generated. With the 233-bit values assigned and a delay of 100 ps, completion of the operation is indicated by the done signal being enabled. This simulation represents the generation of the X, Y coordinates of the public key Q.
The resource utilization of the ECC modules of the ECPM based on the Montgomery ladder and unified PA algorithms is presented in Tables 1 and 2. Figures 12 and 13 show the proposed ECPM designs implemented with both algorithms and compare the synthesis and utilization report performance of the two algorithms.

Fig. 11 Unified PA of simulation

Table 1 Resource allocation of Montgomery ladder

Resources    Estimation    Available    Utilization %
LUT          2293          303,600      0.75
Flip flop    1703          607,200      0.28
IO           820           600          136.67
BUFG         1             32           3.13

Table 2 Utilized PA resource allocation

Resources    Estimation    Available    Utilization %
LUT          2588          303,600      0.85
Flip flop    3066          607,200      0.50
IO           474           600          79.00
BUFG         1             32           3.13

Fig. 12 Power estimation of Montgomery ladder

Fig. 13 Power estimation of unified PA

Acknowledgements This research work is part of the project work carried out under a grant received from the All India Council for Technical Education, New Delhi, India through the Research Promotion Project Scheme Grant File No. 8-79/FDC/RPS (POLICY-I)/2019-20.

References

1. Miller VS (1986) Use of elliptic curves in cryptography. In: Advances in cryptology—CRYPTO '85 (Santa Barbara, Calif., 1985). Springer, Berlin, pp 417–426
2. Koblitz N (1987) Elliptic curve cryptosystems. Math Comp 48(177):203–209
3. Islam MM, Hossain MS, Hasan MK, Shahjalal M, Jang YM (2019) FPGA implementation of
high-speed area-efficient processor for elliptic curve point multiplication over prime field. IEEE
Access 7:178811–178826
4. Javeed K, Wang X (2014) Radix-4 and radix-8 booth encoded interleaved modular multipliers
over general Fp. In: Proceedings of 24th international conference on field programmable logic
and applications (FPL), Munich, Germany, pp 1–6
5. Lara-Nino CA, Diaz-Perez A, Morales-Sandoval M (2018) Elliptic curve lightweight cryptog-
raphy: a survey. IEEE Access 6:72514–72550

6. Asif S, Kong Y (2017) Highly parallel modular multiplier for elliptic curve cryptography in
residue number system. Circuits Syst Signal Process 36(3):1027–1051
7. Yang Y, Wu C, Li Z, Yang J (2016) Efficient FPGA implementation of modular multiplication
based on Montgomery algorithm. Microprocessors Microsyst 47:209–215, Art 2016
8. Joseph GB, Devanathan R (2016) Design and analysis of online arithmetic operators for
streaming data in FPGAs. Int J Appl Eng Res 11(3):1825–1834
9. Sivanantham S, Padmavathy M, Gopakumar G, Mallick PS, Perinbam JRP (2014) Enhancement
of test data compression with multistage encoding. Integr VLSI J 47(4):499–509
Blockchain Usage in Theoretical
Structure for Enlarging the Protection
of Computerised Scholarly Data

Preeti Sharma and V. K. Srivastava

Abstract All scholarly activities generate documents such as statements of results,


testimonials, certificates, and transcripts, which are often held in manual file cabinets
and relational databases. With the value of scholarly data, it has become critical to
safeguard them against falsification and unintentional alteration. Despite the fact that
relational databases have become the de facto norm for storing scholarly documents
in recent years, the digitised contents are still under the control of a single admin-
istrator who has authoritative access to them as well as full rights and privileges.
However, because of the centralised control of the system, authoritative access is
likely to introduce instances of unauthorised record alteration. This paper provides
a blockchain-based theoretical structure for the authentication of digitised scholarly
documents with digital signatures. The storage, alteration, dissemination and verifi-
cation of scholarly records would all be based on blockchain technology. Using this technology will also notify all approved parties about changes, and they will be able to confirm or deny the status of any update to an earlier stored scholarly record by using its hash.

1 Introduction

The consistency of processing and storage is critical to the protection of digitised


scholarly documents. The stored records are used as legitimate contents in relation
to the individual performances of students and processing typically comes before
storage. Scholarly records storage has never been a difficult job. The difficult part of
the process typically occurs when the initially stored contents are altered, corrupted,
lost and/or destroyed as a result of a variety of factors such as intentional theft,
natural disasters and system failure. Because of the sensitivity and importance of

P. Sharma (B)
Department of Computer Science, Baba Mastnath University, Rohtak, Haryana, India
V. K. Srivastava
Faculty of Engineering, Baba Mastnath University, Rohtak, Haryana, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 635
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_47

scholarly records in the overall framework, as well as their importance to students'


professional growth, more sophisticated but low-cost methods of keeping them safe
in all circumstances have become increasingly important.
The retrieval of transcripts and the authentication of certificates are two impor-
tant aspects of the scholarly record-keeping process. Numerous investigations have focused on record handling [1–3]. For example, [2] proposed a web-based tran-
script processing method. However, this method ignored the different levels of autho-
risation that could be used to improve the protection of the stored transcripts, such
as the risk of unauthorised material alteration. Similarly, [3] suggested a method for
optimising the production of transcripts from students’ performance. The majority
of these studies focused on the ease of access, with little or no information provided
about the safe method of maintaining these scholarly records to ensure that unin-
tended modification and distortion are fully avoided. Although there are some costs
associated with the tradeoff between ease of access and protection, a protected archive
of records ensures confidentiality, integrity and availability.
This paper provides a structure for introducing safe and distributed access to schol-
arly records using blockchain technology, with no instances of unauthorised manip-
ulation or distortion of records. The solution proposes the use of a blockchain-based
digital signature scheme and timestamps to create a highly transparent framework
for ensuring the protection of digitised scholarly records with low implementation
and maintenance costs.

2 Review of Literature

The preservation and retrieval of computerised scholarly documents necessitate the


use of solid implementation strategies.
Relational database management systems like Oracle, PostgreSQL, SQL Server, or MySQL are used to store computerised scholarly records (Alwi Mohd Yunus and Bunawan [2], Atzori [3]). Using these databases in web-based or desktop scholarly information systems has become standard practice in most scholarly institutions and some production environments.
Authenticity and reliability are major considerations of the trustworthiness of
digital data, according to Mohanta and Jena [4]. Database protection for scholarly
information systems filled with data from transcripts, certificates and statements was
addressed in Cachin [5].
Table constraints and relationships, as well as role-based access control, were
used in this implementation.
The lack of adequate protection and management of scholarly documents is one of the issues that prompted the works of Ouaddah and Elkalam [1] and Catalini and Gans [6].
Blockchain Usage in Theoretical Structure for Enlarging … 637

However, their proposed solution lacked an explicit security aspect that addresses the issue of record manipulation and distortion due to human error. At the same time, the various levels of authorisation for accessing the stored contents were not specified.
Blockchain can be used to replace centralised control over scholarly documents.
Blockchain has been commonly used for the development of decentralised currencies,
self-extracting digital contracts and intelligent assets over the Internet in Kim and
Choi [7], Casino and Dasaklis [8], Gupta and Kumar [9], Kamel and Kamel [10].
Loebbecke and Lueneborg [11] suggested a blockchain-based framework for
storing a permanent distributed record of intellectual effort and reputational reward.
According to Gupta and Kumar [9], Lauter [12] and Swan [13], blockchain
can be used to create a database of distributed records. These distributed records
are executed and exchanged among participating parties, who use the distributed
consensus method to allow updates to the records. Updates are only included in
distributed consensus if a majority of the involved parties consent and once an update
is made, it cannot be undone by a single party. Casino and Dasaklis [8] and Swan
[13] addressed how blockchain, as a replicated database of transactions (or records),
allows for the creation of smart contracts, from which massively scalable and profitable applications with a high level of security are developed. Kim and Choi [7], Kamel and Kamel [10] and Okikiola and Samuel [14]
include additional examples of blockchain technology applications.

3 Blockchain Technology

Decentralised control over transactions and data is provided by blockchain. This


technology was originally developed for the bitcoin blockchain, but it has recently
been applied to a variety of other fields. According to Raval in [13], blockchain tech-
nology is the future of massively scalable and profitable applications that are more
versatile, open, distributed and resilient. Decentralised applications are a reflection of
a cryptographically stored ledger, scarce-asset model that uses peer-to-peer network
technology. Decentralised authentication of transactions, which are held in a quasi-
anonymous network utilising some kind of storage known as public ledgers, has
found application for blockchain technology. The use of decentralised peer-to-peer
technology to validate transactions has been shown to provide clear confidentiality,
data integrity and non-repudiation services [15]. For the involved parties, transac-
tions over blockchain technology are authenticated using public-key cryptography,
with the public key mapped to the participant’s account number and the private
key displaying the participant’s ownership credentials. Digital signatures are used to
ensure data confidentiality and non-repudiation, as shown in Fig. 1.
Reference [16] describes blockchain as a paradigm of state machine replication. State
machine replication allows a service to keep track of any state that can be changed
by client-initiated operations. Following that, outputs are created, which the clients
can then verify.

Fig. 1 Portrayal of a blockchain

A distributed protocol is used to enforce blockchain across Internet-connected


nodes. This way, blockchain functions as a trusted computing service in both permis-
sioned and permissionless modes. Participation in a permissionless blockchain is
dependent on the ability to expend CPU cycles as well as show proof-of-work—a
prerequisite that ensures that costly computations are performed in order to enable
transactions on a blockchain.
Despite its status as a public ledger, blockchain technology has the potential to
become a globally decentralised record of digital assets, according to [17]. Each
device or node connected to a blockchain architecture with a client that participates
in validating and relaying transactions has a downloaded copy of the shared ledger
or database. The information contained in the blockchain database is still up to date
and contains all transaction records back to the point of origin. This ensures that a
chain of linked records can be created that is open to all validating parties and where
changes or improvements can only be made with the approval of a majority of them.
The ability of blockchain to implement a trustless proof system for all transactions
initiated and checked by all nodes linked to it demonstrates its disruptive existence.
Users can trust the public blockchain database of records stored globally on decen-
tralised nodes without including any trusted intermediaries for the trustless proof
mechanism [12, 17–20].
Figure 2 illustrates that blockchain is a distributed and decentralised technology. The term "decentralisation" refers to the fact that the failure of a single node
does not have an effect on the entire blockchain architecture.
Also, the timestamped public database or ledger is maintained on several Internet-
connected nodes to create a distributed environment that is freely available to all
participants [13].
Blockchain’s peer-to-peer technology enables nodes to communicate directly with
one another. Since any node has access to the blockchain record, this provision uses
smart contracts to achieve decentralised consensus. The smart contract, which is a
piece of code that is activated based on preprogrammed conditions, is stored on the
blockchain, which eliminates the problem of a single point of failure.

Fig. 2 Distributed and decentralised blockchain (distributed vs. centralized topologies)

The smart contract, according to [18], is a piece of code that is implemented on a


blockchain by an Ethereum Virtual Machine (EVM). The code is then made available
to the network participants to use.
Smart contracts are written in Turing-complete programming languages and compiled to EVM bytecode so that they can be successfully executed; Solidity, Serpent, Mutan and LLL (Low-level Lisp-like) are such languages used to create smart contracts on a blockchain. Smart contracts are known by their addresses, and the data they receive is used to execute the code they contain [18].

4 The Projected Approach

The proposed method makes use of blockchain technology’s distributed consensus


to improve the security of stored data.
Figure 2 shows how blockchain technology can be used to create both centralised
and distributed [13] scholarly records. All processed scholarly records are stored
on a public blockchain in this approach. After that, a cryptographic hash function
is applied to the stored records and the result is stored in a blockchain, with the
created transaction being signed by the tertiary institution’s private key. As a result,
the stored scholarly record has a high degree of confidentiality, as the blockchain
cannot be changed without a majority consensus of the participating parties who act
as validators [21].
As shown in Fig. 3, before changes can be made to an earlier version of the
stored scholarly records, each validator on a connected node (represented as Node-1
… Node-n), in this case, a member of the institution with access to the scholarly
records, must agree on an update, where possible. These updates are then distributed
throughout the blockchain database, ensuring that all validators are aware of any
subsequent record reviews. Furthermore, the tertiary institution may use blockchain’s
digital signature scheme to add the signatures of approved validators, allowing
changes to be based on a previously accepted condition.

Fig. 3 Planned framework of project using blockchain

This requirement may be the minimum number of approved validators or


signatures required to complete an update.
The cryptographic hash function, as shown in Fig. 3, plays an important role in the security and validity of the stored scholarly documents. The use of distributed consensus instead of authoritative access improves security in this situation: since no single validator has full control over access to the scholarly records, it is virtually impossible to carry out fraudulent activities. The timestamp associated with the digital signature may be used to check the authenticity of scholarly documents. The timestamp is automatically generated when the cryptographic hash function is applied to the scholarly record to construct a transaction that is stored in a public blockchain. The secure hash algorithm (SHA-256) is the chosen hash function here, as it makes the blockchain database immutable.
The SHA-256 algorithm was chosen to ensure that all blocks in the blockchain
are well developed and untampered with, making the blockchain unbreakable in its
use [18, 22], as shown in Fig. 4.
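To make the hashing and timestamping step concrete, the following is a minimal Python sketch using the standard hashlib module; the block fields, the record contents and the helper names are illustrative assumptions rather than a prescribed on-chain format.

import hashlib
import json
import time

def record_hash(record_bytes):
    # SHA-256 digest that identifies a scholarly record on the chain.
    return hashlib.sha256(record_bytes).hexdigest()

def make_block(prev_block_hash, record_bytes):
    # The timestamp is generated when the transaction is formed, as above.
    block = {
        "previous_hash": prev_block_hash,
        "record_hash": record_hash(record_bytes),
        "timestamp": time.time(),
    }
    payload = json.dumps(block, sort_keys=True).encode()
    block["block_hash"] = hashlib.sha256(payload).hexdigest()
    return block

genesis = make_block("0" * 64, b"transcript record (illustrative payload)")
nxt = make_block(genesis["block_hash"], b"certificate record (illustrative)")
# Altering a stored record changes its hash and breaks the chain of blocks.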
Due to the decentralised existence of a blockchain, the digitally signed transac-
tion includes the date the scholarly record was deposited on the public blockchain
and as such, the record is publicly verifiable by users. It is also possible to use an
encryption algorithm to secure scholarly documents from public access when they
are explicitly signed and stored on the blockchain using a well-structured data format
rather than a cryptographic hash function. The encryption is based on public/private
key cryptography, which allows for the creation of a public/private key pair for each
record stored on the blockchain.
The public key is distributed to those who need access to the public blockchain,
while the validator uses the private key to access the blockchain database as required
[17]. The key pair is generated using the Elliptic Curve Digital Signature Algorithm

Fig. 4 SHA-256 hash function: starting from a 256-bit IV, the padded message is processed in 512-bit blocks (block 1 … block n) through the compression function c to produce the hash

(ECDSA) [23], which is based on elliptic curve mathematics. This asymmetric key can also be used to digitally sign a document to verify that a network member has approved an upgrade. When the
network participant’s private key is created, it can be added to the encrypted scholarly
record and each user who needs access to the public blockchain uses a public key
that is cycled through additional layers of encryption using SHA-256 [17].
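As a concrete illustration of signing and verification, the following is a minimal sketch using the third-party Python "cryptography" package; the curve choice, the placeholder payload and the variable names are assumptions of this example, not the paper's specific implementation.

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# Key pair for one validator on the network (curve choice illustrative).
private_key = ec.generate_private_key(ec.SECP256K1())
public_key = private_key.public_key()

record_digest = b"sha256-of-scholarly-record"  # placeholder payload
signature = private_key.sign(record_digest, ec.ECDSA(hashes.SHA256()))

try:
    # Anyone holding the public key can check the validator's approval.
    public_key.verify(signature, record_digest, ec.ECDSA(hashes.SHA256()))
    print("update approved by validator")
except InvalidSignature:
    print("signature rejected")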
Figure 5 shows how access to the public key is granted based on the agreement of a group of people. To avoid unauthorised access to scholarly documents, the records explicitly stored on the public blockchain are encrypted for both the validator and the users accessing the contents of the blockchain database.
ECDSA, which is primarily based on elliptic curve cryptography, allows for the
generation of a relatively short encryption key, which is faster and thus requires
less computing power, particularly when used on wireless devices with limited
computing power, memory and battery life [24–26]. RIPEMD, meanwhile, is more
secure against pre-image attacks and has the property of being one-way, involving
irreversible identity data transformation [26–28].

5 Conclusion

This paper presents an alternative approach to authoritative access to the scholarly


records database. The approach addresses the use of blockchain technologies, such
as distributed consensus for authorising changes to stored scholarly records, digital
signature schemes and timestamps for signing and checking the validity of the stored

Fig. 5 Cryptography done on records in public blockchain

scholarly record and encryption algorithms for ensuring the security of the stored
contents. If implemented correctly, the proposed system has the ability to remove the
common problems associated with the centralised database approach, which enables
an administrator to have too much control over sensitive data.
The timestamp created by the digital signing of the transaction formed from
the cryptographic hash of the original scholarly record can be used to verify its
authenticity in the future. As a result, unauthorised modification and falsification of
scholarly documents will be eliminated, providing a higher level of protection at all
times. Any intended updating of the digitised scholarly record is communicated to all
stakeholders, who must verify such an update before it is enforced on a blockchain.
Through the implementation of Blockchain, this added layer of authentication
and accountability enhances the security of the computerised documents.

References

1. Ouaddah A, Elkalam AA (2017) FairAccess: a new blockchain-based access control framework


for the internet of things
2. Yunus AM, Bunawan A-A (2016) Explaining the importance: proper academic records
management. In: International conference on information science, technology, management,
humanities and business, ITMAHuB, USA
3. Atzori M (2017) Blockchain technology and decentralized governance: is the state still
necessary? J Governance Regul
4. Mohanta BK, Jena D (2019) Blockchain technology: a survey on applications and security
privacy challenges. Elsevier
5. Cachin C (2016) Architecture of the hyperledger blockchain fabric, IBM Research

6. Catalini C, Gans JS (2019) Some simple economics of the blockchain. National Bureau of
Economic Research, 1050 Massachusetts Avenue, Cambridge, MA 02138
7. Kim DG, Choi J (2018) Performance improvement of distributed consensus algorithms for
blockchain through suggestion and analysis of assessment items. J Soc Korea Ind Syst Eng
8. Casino F, Dasaklis TK (2019) A systematic literature review of blockchain-based applications:
current status, classification and open issues. Elsevier, pp 55–81
9. Gupta EP, Kumar S (2014) A comparative analysis of SHA and MD5 algorithm architecture
10. Kamel I, Kamel K (2011) Toward protecting the integrity of relational databases, internet
security (WorldCIS). World Congress IEEE
11. Loebbecke C, Lueneborg L (2018) Blockchain technology impacting the role of trust in trans-
actions: reflections in the case of trading diamonds. In: Twenty-sixth european conference on
information systems (ECIS 2018), Portsmouth, UK
12. Lauter K (2004) The advantages of elliptic curve cryptography for wireless security. IEEE, pp
62–67
13. Swan M (2015) Blockchain: blueprint for a new economy. O'Reilly Media, Inc.
14. Okikiola MA, Samuel F (2016) Optimizing the processing of results and generation of tran-
scripts in Nigerian universities through the implementation of a friendly and reliable web
platform. IJIR
15. Raikwar M, Gligoroski D (2019) SoK of used cryptography in blockchain. IEEE Acess 7
16. Omogbhemhe M, Akpojaro J (2018) Development of centralized transcript processing system.
Am Sci Res J Eng Technol Sci (ASRJETS)
17. Memon M, Hussain SS (2018) Blockchain beyond bitcoin: blockchain technology challenges
and real-world applications. IEEE
18. Onwudebelu U, Fasola S (2013) Creating pathway for enhancing student collection of academic records—a new direction. Computer science and information technology
19. Qui Q, Xiong Q (2004) Research on elliptic curve cryptography. IEEE, Xiamen, China
20. Ghosh R (2014) The techniques behind the electronic signature based upon cryptographic
algorithms. Int J Adv Res Comput Sci
21. Underwood S (2016) Blockchain beyond bitcoin. Commun ACM 59
22. Sharples M (2016) The blockchain and kudos: a distributed system for educational record,
reputation and reward. In: European conference on technology enhanced learning, Springer,
Cham
23. https://www.maximintegrated.com/en/design/technical-documents/tutorials/5/5767.html
24. Kalra S, Sood SK (2011) Elliptic curve cryptography: current status and research challenges.
Springer, pp 455–460
25. Cai W, Wang Z (2018) Decentralized applications: the blockchain-empowered software system.
IEEE
26. Tang Y-F, Zhang Y-S (2009) Design and implementation of college student information
management system based on web services. In: IEEE international symposium on IT in
medicine and education, IEEE
27. Zeb A (2014) Enhancement in TLS authentication with RIPEMD-160. Int J 401–406
28. Zhao W (2021) Blockchain technology: principles and applications. IEEE
A Framework of a Filtering
and Classification Techniques
for Enhancing the Accuracy of a Hybrid
CBIR System

Bhawna Narwal, Shikha Bhardwaj, and Shefali Dhingra

Abstract In recent times, content-based image retrieval (CBIR) has gained more attention. CBIR is a technology for retrieving, from a large database, the images whose features closely match those of a given query image. This paper proposes a novel hybrid framework for content-based image retrieval to address accuracy-related issues. Here, the color moment is used as a filtering metric to select relevant images from a broad database. Then, the features of the filtered images are extracted using the color moment and Gray Level Co-occurrence Matrix (GLCM) techniques. Various distance measure functions like Euclidean, Manhattan and Cosine are used to calculate similarity. Lastly, for increasing the efficiency of the developed hybrid CBIR system, an intelligent approach, i.e., an ELM classifier, has been used. The average precision obtained on the Wang database is 96.25%, which is higher as compared to many other classification techniques.

Keywords Content-based image retrieval (CBIR) · Color moment · Gray level co-occurrence matrix (GLCM) · Extreme learning machine · Filtering

1 Introduction

Frequently used image capturing devices like smartphones, cameras, closed-circuit television cameras, etc. generate large volumes of visual data [1]. Billions of multimedia information records are being created and shared on the internet, basically on social media websites. This explosive increase in multimedia information, particularly pictures and videos, has contributed to the problem of searching and recovering relevant data from archive collections. In the current era, the complexity of picture information has expanded exponentially. These days, owing to advanced imaging techniques, numerous applications such as clinical imaging, remote sensing, crime prevention, education, multimedia and data mining have been developed. These applications require computerized

B. Narwal (B) · S. Bhardwaj · S. Dhingra


Department of Electronics and Communication Engineering, University Institute of Engineering
and Technology, Kurukshetra University, Kurukshetra, Haryana, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 645
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_48

pictures as a source for different processes like segmentation, object recognition, and others. This huge amount of visual data, uploaded by users from various geographic regions with varied languages, has either metadata in diverse languages or no metadata associated with the images [1]. Recognizing similar types of pictures from such unstructured visual information is a difficult undertaking. A Content-Based Image Retrieval (CBIR) framework can upgrade the similar-picture search ability, particularly for pictures having multilingual labeling and comments.
Typically, an image retrieval process is split into two forms: text-based and content-based. In the text-based approach (indexing of images using keywords), pictures are retrieved on the premise of text annotations. However, it suffers from many disadvantages: human annotation errors, as well as the use of synonyms, homonyms, etc., lead to wrong picture retrieval [2]. The text-based method uses keyword descriptions as input and returns the required output as images of a similar type.
The best solution for overcoming these limits is the CBIR mechanism. Image retrieval using content is considered one of the most successful and proficient ways of accessing visual information. This strategy depends on picture content like shape, color and texture rather than annotated text. The picture retrieval approach utilized in the CBIR framework is very distinct from retrieval systems that utilize the metadata-based approach. The image retrieval system is based on a feature extraction technique: features are extracted from the query image, and then a similarity measure is used to find the most similar images in the database [3]. The basic block diagram of CBIR is given in Fig. 1.
Database Creation: Create or store picture databases, either one's own database prepared for testing or an inbuilt database.
Query Image Submission: The query picture is simply the input picture given to the framework; according to it, the system will discover comparable pictures.

Fig. 1 Block diagram of CBIR [4]: features are extracted from the query image and from the image collection; the query image features are compared with the feature database through similarity calculation, and the most similar images are retrieved



Extraction of Features: Extract the significant features, like color, texture, shape, etc., from the query image and the database images [4].
Feature Matching: The content of the query picture is measured against the database pictures; based on that, similar pictures are recovered by comparing the input image with all the pictures from the database.
Retrieval of Desired Images: Depending on the content of the picture features, the desired images are retrieved [4].
Image retrieval based on Color Features:
Color is the most critical content within a color picture and the most broadly used visual content. It captures the proportion of pixels of a particular color within the picture. For color pictures, different color space models such as RGB, YCbCr, HSV and HSI are normally used, and different color descriptors like the color coherence vector, color histogram, color moments and color correlogram can be used [4]. The color features are the essential qualities of the visual content of a picture, and people can easily identify the visual content in the image through them [5].
Image retrieval based on Texture Features:
Texture is another critical feature by which images can be retrieved based on content. It is a vital feature of the picture that is used in pattern recognition [5]. In general, the term "texture" refers to the surface characteristics and appearance of objects in a picture resulting from the size, shape, density, arrangement and proportion of their elementary parts, which are commonly known as features. These texture features are calculated from the statistical distribution of observed combinations of pixel intensities at specified positions relative to each other within the picture; they represent the spatial layout of the gray levels of pixels in a region. Texture is categorized into two sorts: statistical and structural.
The main contributions of this work are:
• To filter the dataset images using a color extraction technique.
• To develop a hybrid system using a combination of color and texture features from the images remaining after filtration.
• To increase the efficiency of the system by using an advanced and intelligent technique.
The remaining organization of this paper is as follows: Sect. 2 introduces some strategies related to CBIR examined in the literature. In Sect. 3, the proposed methodology is described. Section 4 provides the details of the implementation and results of the proposed approach. Finally, the paper concludes with some guidance on future work in Sect. 5.

2 Related Work

This section explores numerous recent CBIR methods. Pavithra and Sharmila [2] created a hybrid framework for CBIR to address the precision issues related to conventional image retrieval frameworks. This system first chooses related pictures from a huge database using color moment data.
Singh and Batra [1] developed an efficient bi-layer content-based image retrieval system named BiCBIR, in which three primitive features, namely color, texture and shape, are considered. The closeness between query pictures and dataset pictures is computed in two layers. The proposed BiCBIR framework is split into two modules: in the first module, attributes of the dataset pictures are extracted using color, texture and shape; the second module performs the retrieval task, which is further divided into two layers.
A hybrid feature-based proficient CBIR system utilizing different distance measures is proposed by Mistry et al. [6]. Spatial-domain features consisting of the color auto-correlogram, color moments and HSV histogram features, and frequency-domain features like moments using SWT and features using the Gabor wavelet transform, are utilized.
Shao et al. [7] presented supervised two-stage deep learning cross-modal retrieval, which supports text-to-image and image-to-text retrieval. Hashing-based picture retrieval frameworks are faster than traditional CBIR frameworks but are sub-standard in terms of accuracy.
A CBIR system developed over several levels of hierarchy has been proposed by Pradhan et al. [8]. Here, the adaptive tetrolet transform is employed to extract the textural information. An edge joint histogram and a color channel correlation histogram have been used, respectively, to capture the shape and color features associated with an image. This technique is realized in the form of a three-level hierarchical system where the best feature among the three is selected at every level of the hierarchy.
A retrieval speedup scheme was proposed by Fadaei and Rashno [9] that is based on an interval computed from a proficient combination of Zernike and wavelet features. This interval was computed for each query picture to eliminate all irrelevant pictures from the database. The most appropriate features among the Zernike and wavelet features were selected utilizing PSO to obtain the most reduced dataset.
Alsmadi [10] developed a novel similarity evaluation employing a meta-heuristic algorithm called a memetic algorithm (MA, a genetic algorithm with great deluge), applied between the features of the query image (QI) and the features of the database pictures. This work proposed a viable CBIR framework utilizing the MA to retrieve images from databases; using the MA-based similarity measure, pictures pertinent to the QI were retrieved proficiently. The Corel picture database was utilized. The system has the capability to extract color, shape and texture features. The results outperformed other CBIR systems in terms of precision and recall, where the average precision and recall rates were 0.882 and 0.7002, respectively.

3 Proposed Methodology

In this work, the development of an Intelligent and Hybrid CBIR System has been
proposed. Initially, all dataset images are filtered out by the color moment technique.
Then, the features of the selected images are extracted. For color feature extraction,
the color moment technique has been used, and for texture feature extraction, the Gray
Level Co-occurrence Matrix (GLCM) technique has been used. The combination of
color and texture features makes a hybrid system. For increasing the efficiency and
accuracy of the developed CBIR system, an intelligent approach based on an ELM
classifier has been used. The basic architecture of the proposed system is given in
Fig. 2.

Color Feature Extraction

Color characteristics have been extracted using a variety of techniques in feature extraction systems. The color moment has been selected as a good descriptor for capturing color features: it is reliable, quick and flexible, and requires little space and time. The illumination and brightness of the images are characterized by color moments. Mean, standard deviation and skewness are the color moments used in this system to extract features on a color basis. Since color information is more meaningful to the human visual system than raw pixel values, color is the first option for extracting features. The color moment can be computed for any color space model. This color descriptor is also scale and rotation invariant, but it does not include the spatial information of images [11].
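As an illustration, a minimal NumPy sketch of the three color moments per channel follows; the function name and the cube-root treatment of skewness are assumptions of this example.

import numpy as np

def color_moments(image):
    # image: H x W x 3 array (e.g., RGB). Returns a 9-dimensional feature
    # vector: mean, standard deviation and skewness for each channel.
    feats = []
    for ch in range(image.shape[2]):
        pixels = image[:, :, ch].astype(np.float64).ravel()
        mean = pixels.mean()
        std = pixels.std()
        third = np.mean((pixels - mean) ** 3)   # third central moment
        feats.extend([mean, std, np.cbrt(third)])
    return np.array(feats)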

Texture Feature Extraction


GLCM builds a matrix over pixel distances and directions, and then extracts significant statistics from the matrix as texture features. The Gray Level Co-occurrence Matrix (GLCM) is the best second-order statistical texture feature extraction technique and has many dominant attributes: (1) rotation invariance, (2) diverse applications, (3) simple implementation and fast performance, (4) numerous resultant (Haralick) parameters, and (5) a precise inter-pixel relationship [11]. It is an empirical method for extracting texture features from images for retrieval and identification [11]. The GLCM-based feature vectors are built from energy, contrast, entropy, correlation, and other factors.

Fig. 2 The design structure of the proposed framework: dataset images are filtered, features of the selected images are extracted by color moment and by GLCM, the color and texture features are combined into a hybrid system, and an ELM classifier is applied
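A minimal sketch of the GLCM feature computation follows, using scikit-image (an assumption of this example; the paper's experiments were run in MATLAB). The distance, angles and chosen Haralick properties are illustrative.

import numpy as np
from skimage.feature import graycomatrix, graycoprops  # "greyco..." in older versions

def glcm_features(gray_image):
    # gray_image: 2-D uint8 array; distance 1, four directions.
    glcm = graycomatrix(gray_image, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    feats = [graycoprops(glcm, prop).mean()
             for prop in ("contrast", "energy", "homogeneity", "correlation")]
    # Entropy is not provided by graycoprops, so it is computed directly.
    p = glcm[glcm > 0]
    feats.append(float(-np.sum(p * np.log2(p))))
    return np.array(feats)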

Ensemble Classifier
The ensemble methods, also known as committee-based learning or learning with multiple classifier systems, train multiple hypotheses to solve the same problem. Each classifier in the ensemble is a decision tree classifier generated using a random selection of attributes at each node to determine the split. During classification, each tree votes, and the most popular class is returned. Ensemble classifiers are built by merging numerous weak learners into one distinctive ensemble model, and their characteristics depend upon the selection of the algorithm. Several sorts of ensemble classifiers are utilized: (1) ensemble bagged trees, using random forest bagging with decision tree learners; (2) ensemble subspace discriminant, using a feature subspace with discriminant learners; (3) ensemble subspace K-nearest neighbors, using nearest neighbor learners in a feature subspace [12].

ELM Classifier
Many types of classifiers have been used in image retrieval to acquire precise accuracy during the classification of the image database [13]. The extreme learning machine (ELM) is a learning algorithm for single-hidden-layer feed-forward neural networks. ELM is based on empirical risk minimization theory, and its learning process needs only a single iteration, avoiding multiple iterations and local minima. ELM designates learning parameters such as weights, input nodes and biases in an adequate way, with no tuning required [13]. Only the number of hidden nodes needs to be specified, as the output weights are estimated analytically. In comparison with several other learning algorithms, ELM has a speedy learning rate, considerably better efficiency and reduced management interference [13].
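The single-shot training described above can be sketched in a few lines of NumPy; the hidden-layer size, the sigmoid activation and the class handling are illustrative assumptions, not the paper's exact configuration.

import numpy as np

class ELM:
    # Minimal single-hidden-layer ELM: random input weights and biases,
    # output weights solved in one step by a pseudo-inverse (no tuning).
    def __init__(self, n_hidden=500, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))  # sigmoid

    def fit(self, X, y):
        # y: integer class labels starting at 0.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        T = np.eye(int(y.max()) + 1)[y]          # one-hot class targets
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)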

4 Experimental Setup and Results

Matlab R2018a, 4 GB memory and 64-bit Windows are used for the experiments. The experiments are carried out on the Wang dataset, which provides 1000 photos categorized into ten categories: African tribes, sea, buildings, buses, dinosaurs, elephants, flowers, horses, mountains and food. There are 100 photographs in each group, and every image is either 256 × 384 or 384 × 256 pixels in size [14]. Some sample images from the Wang dataset are given in Fig. 3.

Fig. 3 Sample images from Wang’s database

Experimental Results
The efficiency of any CBIR system can be measured using several performance measurements; some common assessment measures are given in Eqs. (1)–(3).

Precision (P) = Number of relevant images retrieved / Total number of images retrieved  (1)

Recall (R) = Number of relevant images retrieved / Number of relevant images in the database  (2)

F-Measure = 2 × (Precision × Recall) / (Precision + Recall)  (3)
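For a single query, these measures can be computed as in the short Python sketch below; the function and variable names are illustrative.

def retrieval_scores(retrieved_labels, query_label, n_relevant_in_db):
    # Precision, recall and F-measure per Eqs. (1)-(3) for one query.
    relevant = sum(1 for lbl in retrieved_labels if lbl == query_label)
    precision = relevant / len(retrieved_labels)
    recall = relevant / n_relevant_in_db
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# E.g., 9 of 10 retrieved images share the query's class (Wang: 100 per class):
print(retrieval_scores(["horse"] * 9 + ["bus"], "horse", 100))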

Different distance metrics are used to measure the similarity between a query image and the database images. The distance metric employs a distance measure to compare the pictures: the smaller the distance between two images, the larger the resemblance between them. The metric is chosen primarily on the basis of the representation used to display the image. For retrieval of items similar to the query item, similarity computation between pairs of feature vectors is necessary, and the comparison uses an appropriate distance-based function [15]. Some of the utilized distance metrics [16] are given in Eqs. (4)–(6), and precision results for the various distance metrics [17] are shown in Table 1.

Table 1 Average precision using three distance metrics

Dataset used    Distance metric    Precision %
Wang            Euclidean          97.00
Wang            Manhattan          96.25
Wang            Cosine             97.50


dis_Euclidean = √( Σ_{i=1}^{n} |I_i − D_i|² )  (4)

dis_Manhattan = Σ_{i=1}^{n} |I_i − D_i|  (5)

dis_Cosine = 1 − cos θ = 1 − (X · Y) / (‖X‖ ‖Y‖)  (6)
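NumPy renditions of Eqs. (4)–(6) are sketched below, where I and D (or X and Y) denote feature vectors of equal length; the function names are illustrative.

import numpy as np

def dis_euclidean(I, D):
    return np.sqrt(np.sum((I - D) ** 2))                  # Eq. (4)

def dis_manhattan(I, D):
    return np.sum(np.abs(I - D))                          # Eq. (5)

def dis_cosine(X, Y):
    # Eq. (6): one minus the cosine of the angle between the vectors.
    return 1.0 - np.dot(X, Y) / (np.linalg.norm(X) * np.linalg.norm(Y))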
Since filtering is used in this paper to sort the database images, the color moment [18] has been employed as the filtering technique to select the desired images: a threshold level is chosen, images whose mean value is above the threshold are selected, and those below it are rejected. Thus, filtering is obtained. The filtering results in terms of precision, recall and F-measure [19], both before and after filtering, are given in Fig. 4.
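The thresholding step can be sketched as follows, assuming the mean intensity serves as the first color moment; the threshold value itself is a free parameter not fixed here.

import numpy as np

def filter_by_color_moment(images, threshold):
    # Keep only images whose mean value exceeds the chosen threshold,
    # as described above; rejected images skip feature extraction.
    return [img for img in images if np.mean(img) > threshold]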
To check the accuracy of the developed hybrid CBIR system, four classifiers, namely support vector machine (SVM), K-nearest neighbor (KNN), Ensemble and ELM, have been used. Among these classifiers, ELM obtains the highest accuracy: multi-class classification [20] is not directly possible with SVM, K-NN suffers from errors related to outliers, and the Ensemble is too complicated, whereas ELM is based on a single-hidden-layer feed-forward neural network and its training is very fast compared with the others. The results are given in Fig. 5.
To check the retrieval accuracy, two separate graphical user interfaces (GUIs) have been designed for the ELM and Ensemble classifiers. The GUI results are given in Figs. 6 and 7.
It can be concluded from Figs. 6 and 7 that images belonging to the same native category, viz. African tribes, are obtained: for the ELM classifier with a precision of 1, and for the Ensemble with a precision of 0.9.

Fig. 4 Precision, recall, and F-measure before and after filtering

Fig. 5 Accuracy comparison of various classifiers

Fig. 6 Retrieved results from ELM classifier

5 Conclusion

A CBIR system has been presented in this paper, which uses color and texture features to represent query and dataset images. In this work, all the dataset images are first filtered by the color moment technique. Color features are extracted using the color moment technique and texture features using the GLCM technique, and both feature sets are used to form a hybrid system. Then, a machine learning classifier is used to create multiple classes for large databases. The search area for retrieving images relevant to the query image is reduced as a result of this classification, and the system becomes more accurate and quick.

Fig. 7 Retrieved results from ensemble classifier

The outputs of four machine learning classifiers, KNN, SVM, Ensemble and ELM, are compared here; from this comparison, ELM has the highest accuracy. The average precision, recall, accuracy and F-score were all used to assess the system's results.

References

1. Singh S, Batra S (2020) An efficient bi-layer content-based image retrieval System. Multimedia
Tools Appl
2. Pavithra LK, Sharmila TS (2017) An efficient framework for image retrieval using color, texture
and edge features. Comput Electr Eng 1–14
3. Nazir A, Ashraf R, Hamdani T, Ali N (2018) Content-based image retrieval system by using
HSV color histogram, discrete wavelet transform, and edge histogram descriptor. In: IEEE
international conference on computing, mathematics and engineering technologies, ISBN 978-1-5386-1370-2
4. Dahake PA, Thakare SS (2018) Content-based image retrieval: a review. Int Res J Eng Technol
05:1095–1061
5. Varma N, Mathur A (2020) A survey on evaluation of similarity measures for content-based
image retrieval using hybrid features. In: IEEE international conference on smart electronics
and communication, ISBN 978-1-7281-5461-9
6. Mistry Y, Ingole DT, Ingole MD (2018) Content-based image retrieval using hybrid features
and various distance metric. J Electr Syst Inf Technol 5:874–888
7. Shao J, Zhao Z, Su F (2019) Two-stage deep learning for supervised cross-modal retrieval.
Multimedia Tools Appl 78:16615–16631
8. Pradhan J, Kumar S, Pal AK, Banka H (2018) A hierarchical CBIR framework using adaptive
tetrolet transform and novel histograms from color and shape features. Digit Signal Process A
Rev J 82:258–281

9. Fadaei S, Rashno A (2020) Content-based image retrieval speedup based on optimized combination of wavelet and Zernike features using particle swarm optimization algorithm. Int J Eng 33(5):1000–1009
10. Alsmadi MK (2019) An efficient similarity measure for content-based image retrieval using
the memetic algorithm. Egyptian J Basic Appl Sci 4:112–122
11. Bhardwaj S, Pandove G, Dahiya PK (2020) An extreme learning machine-relevance feed-
back framework for enhancing the accuracy of a hybrid image retrieval system. Int J Interact
Multimedia Artif Intell 6(2)
12. Kavitha PK, Saraswathi PV (2019) Machine learning paradigm towards content-based image
retrieval on high-resolution satellite images. Int J Innov Technol Explor Eng 9:2278–3075
13. Dhingra S, Bansal P (2019) A competent and novel approach of designing an intelligent image
retrieval system. EAI Endorsed Trans Scalable Inf Syst 7(24)
14. Dhingra S, Bansal P (2020) Experimental analogy of different texture feature extraction
techniques in image retrieval systems. Multimedia Tools Appl 79:27391–27406
15. Bhardwaj S, Pandove G, Dahiya PK (2020) A genesis of a meticulous fusion based color
descriptor to analyze the supremacy between machine learning and deep learning. Int J Intell
Syst Appl (IJISA) 12(2):21–33
16. Rajkumar R, Sudhamani MV (2019) Retrieval of images using combination of features as
color, color moments and Hu moments. Adv Image Video Process 7(5):9–21
17. Dewan JH, Thepade SD (2020) Image retrieval using low level and local features contents: a
comprehensive review. Appl Comput Intell Soft Comput 20
18. Ghosh N, Agrawal S, Motwani M (2019) A survey of feature extraction for content-based
image retrieval system. In: International conference on recent advancement on computer and
communication
19. Bellaa MT, Vasuki A (2019) An efficient image retrieval framework using fused information
feature. Comput Electr Eng 46–60
20. Du A, Wang L, Qin J (2019) Image retrieval based on color and improved NMI texture feature.
J Control Measure Electron Comput Commun 60(4):491–499
Microgrid Architecture with Hybrid
Storage Elements and Its Control
Strategies

Burri Ankaiah, Sujo Oommen, and P. A. Harsha Vardhini

Abstract A microgrid architecture for a portable group is suggested in this paper, combining hybrid storage elements with stability and control strategies. The proposed model's hardware implementation involves a photovoltaic array, a wind turbine, a fuel cell, circuit breakers, power electronics circuits and controllers. With this proposed model, renewable energy sources operate more effectively for portable groups such as domestic, commercial and agricultural loads. The proposed model with its hardware implementation plays an important role in improving the penetration of renewable energy sources and in future grids. In future grids, the storage system is very important for maintaining grid stability and for power quality improvement. For storage system reliability and efficiency improvement over various time responses, the proposed model is the best solution. The proposed hardware implementation of the microgrid with hybrid storage elements and its control strategies is presented, with results obtained from simulation using standard parameter values.

Keywords Renewable energy systems · Hybrid storage systems · Microgrids ·


Proposed hardware model

1 Introduction

Electrical power is essential to technological development; in the present-day world, reliable service and continuity of supply are most vital for the power sector. With the smart grid, the overall drawbacks can be overcome through the utilization of renewable energy and storage systems together with controllers, communication systems and power electronics systems. Sustainable energy sources like photovoltaics, wind turbines and battery storage systems are
B. Ankaiah (B) · S. Oommen


REVA University, Bengaluru, Karnataka, India
P. A. Harsha Vardhini
Vignan Institute of Technology and Science, Deshmukhi, Telangana, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 657
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_49

Fig. 1 Cumulative global solar photovoltaics installed capacity by region (2010–2030)

a part of the smart grid; the control electronic circuits are deployed as in Figs. 1 and 2, and the results of a single-phase grid are illustrated in Fig. 3. Full-scale smart grids are enormously expensive, making investigations like testing impractical, so the design of a small grid can be exhibited as a model; likewise, its energy sources are limited. A microgrid that consists of renewable energy sources and inverters keeps running at switching frequency in real-time applications [1, 2]. Various methods are regarded to accelerate real-time simulation analysis of the model smart grid with switching-frequency operation [3–7]. A communication network is fundamental to effectively use a large number of the elements of the smart grid, like distributed automated systems [8, 9], for presentation of execution and network state, protection of energy resources, islanding and controlling. Along with these sources, additionally stimulating various types of network communication in the electric power system in real time is a tough challenge for the model smart grid [10–13].

Fig. 2 Single phase grid connected solar photovoltaic system

2 Microgrids

The microgrid model, as illustrated in Fig. 4, has very good stability and reliability, with its varying parameters depending on the main power system. The main power system is used for transmission of power from a major source to minor sources with centralized control, and alerting over the transmission line can be done with minimum loss of efficiency of the electrical power supply. The microgrid supply model, as depicted in Fig. 5, is a collection of renewable energy resources serving a portable group such as domestic, commercial and agricultural loads. In the 1990s, many consumers in a few states had power cuts because of damage to power lines in urban areas; by implementation of a microgrid or power grid, these kinds of problems can be rectified. In periods of disturbance, the model microgrid separates from the main distribution system to isolate loads from such incidents, giving a continuous and stable electricity supply without affecting the main transmission grid.

3 Obtained Simulation Results

The simulation results are shown for two cases: grid-connected mode and islanding mode. The simulated circuits are shown in Figs. 6 and 7, and Figs. 8 and 9 show the AC current and AC voltage of the microgrid, which contains photovoltaic, wind turbine, fuel cell, critical load, and non-critical load. The outputs in grid-connected mode and in islanding mode are shown; by comparing these two we can see that the DC and AC bus voltages

Fig. 3 Results of single phase grid connected solar photovoltaic system

and load currents remain constant. Initially, the circuit breakers are open and are closed at their respective transition times: the PV transition occurs from 0 to 0.1 s, at which time its circuit breaker closes and the PV output is connected to the grid; similarly, the transitions of both non-critical loads, the fuel cell, the wind turbine, and the critical load occur between 0.1 and 1 s, respectively. During islanding mode, it is assumed that the photovoltaic and wind turbine sources cannot provide the required power to the loads, so load shedding is necessary to keep the system stable. In islanding mode, the fuel cell can deliver a maximum power of 9.7 kW, the PV 7.7 kW, and the wind turbine 12 kW. On the load side, the power consumption of the non-critical load and the critical load is 17 kW and 13 kW, respectively.

Fig. 4 Systems-level microgrid simulation from simple one-line diagram

4 Proposed Model Hardware Implementation

The proposed hardware model of the microgrid is implemented with a photovoltaic array, a battery storage system, and a wind turbine as energy sources, together with the normal local electricity supply. The hardware model output operates in off-grid mode on the battery storage; when the battery storage voltage discharges below 10 V, the system switches to the normal local electricity supply, as shown in Fig. 10. The battery storage voltage is also inverted into alternating current using the inverter circuit and transformer model.
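The switchover behaviour just described can be summarized in a small supervisory routine. The following is a minimal sketch, assuming hypothetical read_battery_voltage() and switch_to() stand-ins for the actual sensor and relay drivers; only the 10 V cutoff comes from the description above.

# Sketch of the battery-to-mains switchover logic (hypothetical driver calls).
BATTERY_CUTOFF_V = 10.0  # threshold stated for the hardware model

def read_battery_voltage():
    # Hypothetical sensor read; replace with the real ADC/driver call.
    return 11.2

def switch_to(source):
    # Hypothetical relay control: 'battery' (via inverter and transformer)
    # or 'local_supply' (normal local electricity).
    print("supplying load from", source)

def supervise_once():
    if read_battery_voltage() < BATTERY_CUTOFF_V:
        switch_to("local_supply")  # battery discharged below 10 V
    else:
        switch_to("battery")

supervise_once()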

5 Conclusion

This paper proposed a microgrid model for a portable group that includes a photovoltaic array, fuel cell, wind turbine, circuit breakers, power electronic circuits, and controllers. The proposed microgrid model is simulated at switching frequency. With this model, renewable energies operate more effectively in portable groups such as domestic, commercial, and agricultural loads. The proposed model with its hardware implementation plays an important role in improving renewable energy penetration and in future grids. In future grids, the storage system is very important for maintaining grid stability and improving power quality. For storage system reliability and for improving efficiency over various time responses, the proposed model is a good solution.

Fig. 5 Microgrid simulation model

Fig. 6 Microgrid output of grid connected



Fig. 7 Microgrid output of Islanding mode

Fig. 8 Photovoltaic AC output generation

Fig. 9 Wind AC output generation



Fig. 10 Proposed hardware model, grid connected

References

1. Geng Y, Yang K, Lai Z, Zheng P, Liu H, Deng R (2019) A novel low voltage ride through control
method for current source grid-connected photovoltaic inverters. IEEE Access 7:51735–51748
2. Al-Shetwi AQ, Hannan MA, Jern KP, Alkahtani AA, PG Abas AE (2020) Power quality assess-
ment of grid-connected PV system in compliance with the recent integration requirements.
Electronics MDPI
3. Di Silvestre ML, Favuzza S, Sanseverino ER, Zizzo G, Ngoc TN, Pham M-H, Nguyen TG
(2018) Technical rules for connecting PV systems to the distribution grid: a critical comparison
of the Italian and Vietnamese frameworks. In: Proceedings of the 2018 IEEE international
conference on environment and electrical engineering and 2018 IEEE industrial and commercial
power systems Europe (EEEIC/I&CPS Europe), Palermo, Italy, pp 1–5
4. Tang C-Y, Kao L-H, Chen Y-M, Ou S-Y (2018) Dynamic power decoupling strategy for
three-phase PV power systems under unbalanced grid voltages. IEEE Trans Sustain Energy
10:540–548
5. Hu Y, Yurkovich BJ (2011) Electrothermal battery modeling and identification for automotive
applications. J Power Sources 196(1):449–457
6. Hossain E, Tür MR, Padmanaban S, Ay S, Khan I (2018) Analysis and mitigation of power
quality issues in distributed generation systems using custom power devices. IEEE Access
6:16816–16833

7. Sai Krishna A, Alekya P, Satya Anuradha M, Harsha Vardhini PA, Pankaj Kumar SR (2014)
Design and development of embedded application software for interface processing unit (IPU).
Int J Res Eng Technol (IJRET) 3(9):212–216
8. Villalva MG, Gazolli JR (2010) Comprehensive approach to modeling and simulation of
photovoltaic arrays. IEEE Trans Power Electron 25(12):2910–2918
9. Al-Shetwi AQ, Sujod MZ (2018) Grid-connected photovoltaic power plants: a review of the
recent integration requirements in modern grid codes. Int J Energy Res 42:1849–1865
10. Hu Y, Yurkovich BJ (2010) Battery state of charge estimation in automotive applications using
LPV techniques. In: Proceeding of the 2010 ACC, Baltimore, MD, USA, pp 5043–5049
11. Harsha Vardhini PA, Murali Mohan Babu Y, Krishna Veni A (2019) Industry parameters
monitoring and controlling system based on embedded web server 6(2)
12. Hu Y, Yurkovich BJ (2009) A technique for dynamic battery model identification in automotive
applications using linear parameter varying structures. IFAC Control Eng Pract 17(10):1190–
1201
13. Hu Y, Yurkovich BJ (2008) Modeled-based calibration for battery characterization in HEV
applications. In: Proceeding of the ACC, Seattle, WA, USA, pp 318–325
A Robust System for Detection
of Pneumonia Using Transfer Learning

Apoorv Vats, Rashi Singh, Ramneek Kaur Khurana, and Shruti Jain

Abstract Pneumonia has been reported as a severe adversity causing a large number of deaths in children every year. Detecting pneumonia at the early stage is difficult as its symptoms are similar to those of a cold or flu. However, early detection with innovative technologies can help detect pneumonia and reduce the mortality rate. As the number of cases continues to rise, expert diagnosis becomes difficult, and an alternative approach is needed. To counter this limitation, the authors have proposed an expert system for the early detection of pneumonia from X-ray images using a Neural Network and Convolutional Neural Networks (CNN) that help in differentiating normal cases from diseased ones. The CNN takes advantage of local coherence in the input to cut down on the number of weights. In this research, neural network techniques and various pre-trained models like VGG (16 and 19), InceptionV3, and MobileNetV2 have been used. A DenseNet has been designed which overcomes the limitations of the existing pre-trained models. The proposed network consists of a convolution pooling block, a dense block with transition layer series, a global average pool, and a fully connected layer. The designed network achieves 92.6% accuracy over 20 epochs. The work has been compared with the existing state of the art, and the proposed network shows remarkable results.

Keywords Pneumonia · Detection · Expert system · Neural network · Deep learning

A. Vats · R. Singh
Department of Computer Science Engineering, Jaypee University of Information Technology,
Solan, Himachal Pradesh 173234, India
R. K. Khurana
Department of Biotechnology, Jaypee University of Information Technology, Solan, Himachal
Pradesh 173234, India
S. Jain (B)
Department of Electronics and Communication Engineering, Jaypee University of Information
Technology, Solan, Himachal Pradesh 173234, India
e-mail: shruti.jain@juitsolan.in

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_50

1 Introduction

Pneumonia is an infection inflaming the air sacs in one or both lungs. The air sacs
may fill with purulent material or fluid, causing difficulty in breathing, cough, chills,
and fever. Pneumonia differs vastly from a cold virus, which produces the usual cough, stuffiness, and sore throat. The infection starts in the upper respiratory tract and spreads further into the lower region; this process is known as pneumonitis. One can get either viral or bacterial pneumonia, but both can be
life-threatening [1, 2]. A deep productive cough from the lungs is a symptom of
Pneumonia. The lungs react to emotions of grief, fear, unhappiness, feeling unloved,
the inability to handle all that is coming at once, etc. People today just don’t under-
stand how dangerous pneumonia is. Medically it is a disease that causes inflamma-
tion in the parenchyma tissues of the lungs. The germs that cause pneumonia are
communicable and contagious. Bacterial and viral pneumonia can spread through
inhaled airborne droplets from sneezing and coughing. The certain risk factors that
are associated with pneumonia are chronic obstructive pulmonary disease, asthma,
diabetes, cystic fibrosis, heart failure, etc. People suffering from pneumonia expe-
rience a weak immune system, and feel chills, shortness of breath, chest pain, etc.
[3]. About 100 strains have been recorded so far, but only a few cause most infections [4]. According to the data, 45% of infections are mixed infections, i.e., caused by both bacteria and viruses, and it becomes difficult to isolate the causative agent even after attentive testing. According to 2016 data, 1.2 million cases of pneumonia were reported in children under the age of 5 years, out of which 880,000 died. For its treatment, influenza vaccines are preferred, and medications like amantadine and zanamivir are used. Other measures, like ceasing smoking, curbing indoor air pollution, and maintaining proper hygiene, can help prevent the conditions under which the disease arises [5].
For diagnostic purposes, chest radiographs are used; these are processed and a diagnostic system is framed using various machine learning [6, 7] and deep learning techniques. A Neural Network (NN) is an algorithm inspired by the neurons in the brain [8]. Neural networks are built for a specific purpose and have weights, hidden layers, and supervised learning. The input features are fed into these layers, multiplied by the weights contained in the hidden layers, and finally produce an output [9, 10]. The output obtained after the first iteration will differ significantly from the targets; this difference is calculated through a function called the loss function [11, 12], which measures how far the output is from the target. Based on the loss, the weights of the layers are altered to decrease it, so that in the next iteration the output is nearer to the target. This weight adjustment over successive iterations to reduce the loss is known as back propagation [13]. Among the disadvantages of neural networks are that in some cases training is not successful, and overtraining can leave the network with a weak ability to generalize. For many applications, we need to understand why
the solution found works and how robust it is. To overcome the disadvantages of NN,
a Convolutional Neural Network (CNN), a type of deep learning technique, is employed

[14, 15]. The CNN shares the same weights among its filters, so there are fewer computations in comparison to an NN. The increase in the number of ConvLayers can
be afforded in CNN which would not have been possible in the case of NNs due
to the huge computations involved [16, 17]. CNNs are traditionally used for image
processing. There are various existing pre-trained models in the literature.
Hashmi et al. in 2020 worked on developing a model for detection of pneumonia using digital chest X-ray images, thus reducing the workload of radiologists [18]. The approach used supervised learning and transfer learning for hyper-tuning of the deep learning models to obtain higher accuracy. The proposed model outmatched all individual models. Ibrahim et al. in 2021 proposed a work in which an alternative to RT-PCR tests was provided for rapid screening, using chest X-rays [19]. That work uses a pre-trained AlexNet model to identify and classify scans into four categories, namely standard CXR scans, non-COVID-19 viral pneumonia, COVID-19, and bacterial pneumonia. Jain et al. in 2020 worked on creating a CNN model for detecting pneumonia using X-rays and classifying them as pneumonia and non-pneumonia [20]. The first two models used in that work, containing two and three convolutional layers respectively, were CNNs that achieved accuracies of 85.26% and 92.31%. Shin et al. in 2016 explored three important aspects: exploring and evaluating different CNN models, evaluating the impact of the dataset and spatial scale, and examining transfer learning from pre-trained ImageNet models [21]. Garstka and Strzelecki in 2020 worked on creating a self-constructed CNN to detect lung X-ray images and classify them into healthy and viral pneumonia in the area of medical image processing and accurate diagnosis [22]. The obtained accuracy was 85%. Kaushik et al. in 2020 presented a model based on a CNN that can detect pneumonia from chest X-rays [23]. The four models worked on consist of one, two, three, and four layers respectively. Due to overfitting of fully connected layers, dropout regularization was used in the second, third, and fourth models. The accuracies of the four models were 89.74%, 85.26%, 92.31%, and 91.67% respectively. This work has helped in the analysis and understanding of the data. Varshni et al. in 2019 proposed an automatic system to detect pneumonia using CNNs [24]. The paper has helped in understanding the role of pre-trained models in feature extraction for image datasets.
Detection of pneumonia at the early stages is difficult as the symptoms are the same as those of a cold or flu. However, early detection with innovative technologies can reduce the mortality rate. As the number of cases continues to rise, expert diagnosis becomes difficult, and an alternative approach is needed. This paper proposes an expert system for the detection of pneumonia using X-ray images that differentiates diseased (pneumonia) from non-diseased cases. In this work, the authors have used neural networks and different pre-trained CNN models for detection. The novelty of the paper lies in the design of the DenseNet, which shows remarkable results. The training and validation accuracy and loss have been evaluated using mini-batch gradient descent optimization for the designed model. A comparison with other state of the art techniques has also been done.

This paper covers the methodology for the detection of pneumonia in Sect. 2; results using the NN, CNN, and the proposed network are shown in Sect. 3, along with a comparison, followed by concluding remarks and future work.

2 Materials and Methodology

The image dataset has been considered from the online dataset Kaggle [25]. A total of
5840 images were considered which were divided into training and testing datasets.
The training dataset comprises 1341 images of non-disease people and 3875 images
of Pneumonia affected people and the testing dataset contains 234 images of non-
affected people and 390 for people having Pneumonia. All the simulations were done
in a system equipped with Intel i7 @2.6 GHz, 16 GB RAM using Jupyter Notebook.
Most of the research in the literature using X-ray images for the classification of pneumonia achieves low accuracy. In this paper, two-class classification has been done using a neural network, different CNN models, and DenseNet to attain high accuracy. The results are compared on the basis of accuracy and loss. Figure 1 shows
the methodology used in this paper.
In this paper, the authors have considered images from online datasets, which were pre-processed. The images of the dataset are of different sizes, but the input size is fixed at 224 × 224. A diagnostic system has been built using a neural network. A neural network is a combination of nodes assembled tier-wise that is usually trained end-to-end in a supervised manner using the stochastic gradient descent algorithm. Neural networks learn in a hierarchical manner whereby every layer discovers additional attributes; therefore, the depth of the network is determined by the number of such layers. The higher the layer count, the deeper the network, hence the name “deep learning”. Deep learning refers to neural networks that have a very large number of layers stacked one over the other. A CNN is a distinctive type of deep neural network which uses alternating layers of pooling and convolutions; it contains trainable filter banks per layer. Every filter in a filter bank is known as a kernel and has a fixed receptive window scanned over a layer to compute the output [26]. The output map is sub-sampled using a pooling technique to reduce sensitivity to distortions of stimuli in the upper layers [27]. Figure 2 shows the architecture of a CNN. CNNs utilize spatial information so as to reduce the number of parameters and overall complexity while learning similar information. For small problems, this is unnecessary and can make CNNs prohibitively expensive to train. For larger problems, the complexity of other algorithms grows faster, making CNNs more viable.
The first convolutional layer (ConvLayer) is responsible for capturing low-level features such as color, edges, gradient orientation, etc. With additional layers, the architecture adapts to high-level features as well, giving us a network that has a wholesome understanding of the images in the dataset. To reduce overfitting on the dataset, regularization is used. Regularization is a strategy that makes slight adjustments to the learning algorithm so that the model generalizes

Fig. 1 Proposed methodology for the detection of pneumonia

better. This improves the model’s performance on unseen data as well. In other words, as we move towards more complex models, the unpredictability of the model increases, so the regularization coefficient has to be estimated to obtain a well-fitted model. The regularization techniques in deep learning that help decrease overfitting are L1 and L2 regularization and dropout. L1 and L2 regularization are the most common kinds. Equation (1) represents the L2 technique, while Eq. (2) represents the L1 regularization technique.

Cost function = Loss + (λ / 2m) · Σ w²   (1)

Fig. 2 Basic architecture of CNN

Cost function = Loss + (λ / 2m) · Σ |w|   (2)
λ is the hyper-parameter or regularization parameter whose value is tuned for better results. In L2 regularization, weights decay towards 0 (but not precisely zero), while in L1 regularization weights can be decreased to exactly 0. Dropout is another common and frequently used regularization strategy. The convolution operation can produce two forms of result: one in which the convolved feature is reduced in dimensionality compared with the input, and the other in which the dimensionality is increased or remains the same. These are achieved by using valid padding and same padding respectively. Zero-padding (0-padding) is applied so that the filter fits, or the part of the picture where the filter no longer fits is dropped. To introduce non-linearity into the network, the Rectified Linear Unit (ReLU) is used. When the spatial size of the convolved features must be reduced, the pooling layer comes into the picture; the pooling layer and convolutional layer jointly form the ith layer of a CNN. Over a progression of epochs, the model becomes able to recognize dominant and certain low-level features, and classification is done using the Softmax technique.
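As a small illustration of these regularizers, an L2-penalized dense layer combined with dropout and a softmax classifier might be declared as follows in Keras (an assumed framework here; the coefficient 0.01 and dropout rate 0.5 are example values only).

# Illustrative L2 regularization plus dropout (Keras assumed).
import tensorflow as tf
from tensorflow.keras import layers, regularizers

block = tf.keras.Sequential([
    layers.Dense(128, activation="relu",
                 kernel_regularizer=regularizers.l2(0.01)),  # Eq. (1) penalty
    layers.Dropout(0.5),                                     # dropout regularization
    layers.Dense(2, activation="softmax"),                   # softmax classification
])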
There are different types of pre-trained models. In this paper, the authors have used VGG16, VGG19, InceptionV3, and MobileNetV2. Based on the existing models, the authors have designed a DenseNet consisting of a convolution pooling block, a dense block and transition layer series, a global average pool, and a fully connected block. The accuracy and loss have been evaluated as performance parameters. The proposed network performs better than the existing networks, and the work has also been compared with other state of the art methods.
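To make the workflow concrete, the following is a minimal sketch of two-class transfer learning with a frozen pre-trained backbone, assuming TensorFlow/Keras and the Kaggle chest X-ray directory layout; the 224 × 224 input size follows the text, while the paths, optimizer, and training settings are illustrative assumptions rather than the authors' exact configuration.

# Minimal transfer-learning sketch (TensorFlow/Keras assumed).
import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SIZE = (224, 224)  # input size stated in the paper

train_ds = tf.keras.utils.image_dataset_from_directory(
    "chest_xray/train", image_size=IMG_SIZE, batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "chest_xray/test", image_size=IMG_SIZE, batch_size=32)

# Pre-trained VGG16 backbone with ImageNet weights, frozen for feature extraction.
base = tf.keras.applications.VGG16(
    include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,))
base.trainable = False

model = models.Sequential([
    layers.Rescaling(1.0 / 255),           # simple pre-processing
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.5),                   # regularization, as discussed
    layers.Dense(1, activation="sigmoid")  # pneumonia vs. normal
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=20)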

3 Results and Discussion

In this paper, an expert system has been designed for the diagnosis of Pneumonia
using neural networks and CNN. The training/validation accuracy and loss have been
evaluated as performance parameters. Initially, the model is designed using a neural
network. All the simulations were carried out in MATLAB, and 85.74% accuracy was achieved with the NN. Training was not successful using the NN, due to which the authors considered designing the model using a CNN. CNN has seen
a lot of variants over the past few years that have led to significant developments
in the field of Deep Learning [26, 27]. CNN often uses low image resolutions with
255 × 255 pixels because increasing the image size would increase the number of
parameters of the network and hence it would take longer to train, and we require
a larger dataset in order for it to work well. Another reason is that we as humans
don’t necessarily need high-resolution images to do simple tasks such as what object
is present in an image. Some of CNN’s most popular architectures that have shown
significant improvement in error rates have been used in this work.
(a) Residual Network (Resnet 50 and Resnet 50V2): The construction of
ResNet50 can be described in four stages. Every ResNet architecture performs
the initial convolution and max-pooling using 7 × 7 and 3 × 3 kernel sizes
respectively. When the stage changes, the channel width doubles while the size
of the input is reduced by half. Finally, the network has an average pooling
layer followed by a fully connected layer consisting of 1000 neurons. Deep-
ening the ResNet won’t impact the accuracy much, at least won’t worsen it,
but the inference will be much slower and can’t be used for real-time applica-
tions. However, for other networks which have no skip connections, deepening
would mean more parameters and more overfitting on the training data.
(b) MobileNetV2: MobileNetV2 is a CNN architecture based on an inverted
residual structure. These residual connections are located amid the layers of
the bottleneck and the transitional layer uses lightweight depth-wise convolu-
tions to filter features as a source of non-linearity. MobileNetV2 consists of
32 filters in a full convolution layer followed by 19 residual bottleneck layers.
MobileNets are low-latency, small, low-power models to meet the resource
constraints but they have linear bottlenecks and inverted residuals.
(c) Visual Geometry Group (VGG): One of the ways to improve deep neural
networks is to increase their size. VGG16 and VGG19 are two types of VGG. VGG-16 has 13 convolutional and three fully connected layers along with the ReLU activation function, using stacked layers and small-sized filters. VGG19 consists of three additional convolution layers in comparison with the VGG16 model. In VGG networks, the use of 3 × 3 convolutions with stride 1 provides an effective receptive field corresponding to 7 × 7, resulting in fewer parameters to train. Both networks are implemented, and Fig. 3 shows the training/validation loss and accuracy of the VGG16 model.


Fig. 3 Results of VGG16. a Training and validation loss, b training and validation accuracy

Alexnet and VGG are pretty much the same concepts, but VGG is deeper and has
more parameters, as well as using only 3 × 3 filters.
InceptionV3: InceptionV3 is a form of CNN architecture that has made several
improvements including label smoothing. In InceptionV3 factorized convolution
is done to improve computational efficiency because it reduces the number
of parameters involved in the network. Smaller convolutions are used for faster
training. Pooling operations are done so as to reduce the grid size. Lastly, an
Auxiliary classifier has been used that uses small CNN embedded between layers
at the time of training. All the concepts have been incorporated into the final
architecture. The InceptionV3 is implemented, and the training/validation loss
and accuracy are shown in Fig. 4.

The difficulty in InceptionV3 is the use of asymmetric convolutions. Considering all the pre-trained networks with their advantages and limitations, the authors have designed the DenseNet.


Fig. 4 Results of InceptionV3. a Training and validation loss, b training and validation accuracy

Proposed DenseNet: The designed DenseNet uses a convolution pooling layer followed by a dense block and transition layer series, a global average pool, and a fully connected layer. As the architecture of the design is very big, the authors are able to show only its first few blocks. Figure 5 shows the architecture of the DenseNet model. For the proposed model, 942 and 32,767 pixels are used as width and height respectively, and a bit depth of 32 is considered for the dimensions. The network is implemented, and the training/validation loss and accuracy are shown in Fig. 6.
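As a rough illustration of the block pattern named above (convolution pooling block, dense block, transition layer, global average pool, fully connected layer), the sketch below builds one dense block and one transition layer in Keras; the growth rate, layer counts, and filter sizes are illustrative assumptions, not the authors' exact architecture.

# Sketch of a DenseNet-style dense block and transition layer (Keras assumed).
import tensorflow as tf
from tensorflow.keras import layers

def dense_block(x, num_layers=4, growth_rate=32):
    # Each layer's output is concatenated with all previous feature maps.
    for _ in range(num_layers):
        y = layers.BatchNormalization()(x)
        y = layers.ReLU()(y)
        y = layers.Conv2D(growth_rate, 3, padding="same")(y)
        x = layers.Concatenate()([x, y])
    return x

def transition_layer(x, compression=0.5):
    # A 1x1 convolution plus pooling shrinks channels and spatial size.
    channels = int(x.shape[-1] * compression)
    x = layers.BatchNormalization()(x)
    x = layers.Conv2D(channels, 1)(x)
    return layers.AveragePooling2D(2)(x)

inputs = tf.keras.Input(shape=(224, 224, 3))
x = layers.Conv2D(64, 7, strides=2, padding="same")(inputs)  # convolution pooling block
x = layers.MaxPooling2D(3, strides=2, padding="same")(x)
x = dense_block(x)
x = transition_layer(x)
x = layers.GlobalAveragePooling2D()(x)                       # global average pool
outputs = layers.Dense(1, activation="sigmoid")(x)           # fully connected layer
model = tf.keras.Model(inputs, outputs)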
All the models have been run from five epochs to 20 epochs. Table 1 tabulates the accuracy obtained for all the models implemented in this paper. From Table 1, it is observed that the maximum accuracy of 92.6% is attained at 20 epochs using the proposed DenseNet model, while 92.15%, 90.22%, and 92.47% are achieved using VGG16, InceptionV3, and VGG19 respectively.
Comparison with other state of the art techniques: The authors have compared the work with other state of the art methods, as tabulated in Table 2.

Fig. 5 Proposed architecture of DenseNet (first few blocks)


Fig. 6 Results of DenseNet. a Training and validation loss, b training and validation accuracy

Table 1 Evaluation of performance parameters for different networks

Model | Accuracy (5 epochs) (%) | Accuracy (20 epochs) (%)
Proposed DenseNet | 92.15 | 92.60
VGG16 | 91.18 | 92.15
InceptionV3 | 89.74 | 90.22
VGG19 | 88.14 | 92.47
MobileNetV2 | 81.73 | 92.15
Neural network | 85.74 | –

Table 2 Comparison with other state of the art techniques

Authors | Model | Accuracy (%)
Jain et al. [20] | VGG 16 net | 87.28
Garstka and Strzelecki [22] | Self-constructed CNN | 85
Apoorv et al. | Proposed DenseNet | 92.6

The results were compared with existing work, and it is concluded that the proposed network achieves 92.6% accuracy. The comparison based on accuracy shows an 8.2% improvement over Garstka and Strzelecki [22] and a 5.7% improvement over Jain et al. [20].

4 Conclusion and Future Work

The detection of Pneumonia at the early stage is difficult. To overcome this problem,
researchers have proposed an expert system for the detection of Pneumonia using

neural network and deep learning techniques that help doctors diagnose the disease. An online dataset was considered, pre-processed, and used for two-class classification. 85.74% accuracy was obtained using the neural network, while 92.47% was achieved using the VGG19 network. To achieve better accuracy, the authors have proposed a DenseNet that uses a convolution pooling layer followed by a dense block and transition layer series, a global average pool, and a fully connected layer, resulting in 92.6% accuracy. The work has been compared with other state of the art techniques, showing an 8.2% improvement over Garstka and Strzelecki and 5.7% over Jain et al. in terms of accuracy. This work can be
further improved by using a larger dataset. In the future, this model can be deployed
so that it can be used by medical institutions and hospitals for Pneumonia detection
in patients.

References

1. Healthcare, University of Utah (2016) Pneumonia makes list for top 10 causes of death. Available online: https://healthcare.utah.edu/the-scope/shows.php?shows=0_riw4wti7. Accessed 31 December 2019
2. WHO (2011) Pneumonia is the leading cause of death in children. Available online: https://www.who.int/maternal_child_adolescent/news_events/news/2011/pneumonia/en. Accessed 31 December 2019
3. World Health Organization (2001) Standardization of interpretation of chest radiographs for the diagnosis of pneumonia in children. Technical report, World Health Organization, Geneva, Switzerland
4. Cherian T, Mulholland EK, Carlin JB, Ostensen H, Amin R, Campo MD, Greenberg D, Lagos R,
Lucero M, Madhi SA et al (2005) Standardized interpretation of paediatric chest radiographs for
the diagnosis of pneumonia in epidemiological studies. Bull World Health Organ 83:353–359
5. Kallianos K, Mongan J, Antani S, Henry T, Taylor A, Abuya J, Kohli M (2019) How far have
we come? Artificial intelligence for chest radiograph interpretation. Clin Radiol 74:338–345.
https://doi.org/10.1016/j.crad.2018.12.015
6. Salau AO, Jain S (2019) Feature extraction: a survey of the types, techniques and applications.
In: 5th international conference on signal processing and communication (ICSC-2019), Jaypee
Institute of Information Technology, Noida (INDIA).
7. Bhusri S, Jain S, Virmani J (2016) Breast lesions classification using the amalagation of
morphological and texture features. Int J Pharma BioSci (IJPBS) 7(2) B:617–624
8. Dhande G, Shaikh Z (2019) Analysis of epochs in environment based neural networks
speech recognition system. In: 2019 3rd international conference on trends in electronics and
informatics (ICOEI)
9. Kirti, Sohal H, Jain S (2020) Multistage classification of arrhythmia and atrial fibrillation on
long-term heart rate variability. J Eng Sci Technol 15(2):1277–1295
10. Srivastava K, Malhotra G, Chauhan M, Jain S (2020) Design of novel hybrid model for
detection of liver cancer. In: 2020 IEEE international conference on computing, power and
communication technologies (GUCON), Greater Noida, India, pp 623–628
11. Wu X, Liu J (2009) A new early stopping algorithm for improving neural network general-
ization. In: 2009 second international conference on intelligent computation technology and
automation. https://doi.org/10.1109/icicta.2009.11
12. Prashar N, Sood M, Jain S (2020) A novel cardiac arrhythmia processing using machine learning
techniques. Int J Image Graphics 20(3):2050023

13. Dogra J, Jain S, Sood M (2019) Glioma classification of MR brain tumor employing machine
learning. Int J Innov Technol Explor Eng (IJITEE) 8(8):2676–2682
14. Sharma O (2019) Deep challenges associated with deep learning. In: 2019 international
conference on machine learning, big data, cloud and parallel computing (COMITCon)
15. Goodfellow I, Bengio Y, Courville A, Bengio Y (2016) Deep learning, vol 1. MIT press,
Cambridge
16. Kanada Y (2016) Optimizing neural-network learning rate by using a genetic algorithm with
per-epoch mutations. In: 2016 international joint conference on neural networks (IJCNN)
17. Bhardwaj C, Jain S, Sood M (2021) Deep learning based diabetic retinopathy severity grading
system employing quadrant ensemble model. J Digital Imaging
18. Hashmi MF, Katiyar S, Keskar AG, Bokde ND, Geem ZW (2020) Efficient pneumonia detection
in chest X-ray images using deep transfer learning. Diagnostics 10(6):417
19. Ibrahim AU, Ozsoz M, Serte S, Al-Turjman F, Yakoi PS (2021) Pneumonia classification using
deep learning from chest X-ray images during COVID-19. Cogn Comput 1–13
20. Jain R, Nagrath P, Kataria G, Kaushik VS, Hemanth DJ (2020) Pneumonia detection in
chest X-ray images using convolutional neural networks and transfer learning. Measurement
165:108046
21. Shin HC, Roth HR, Gao M, Lu L, Xu Z, Nogues I, Summers RM (2016) Deep convolutional
neural networks for computer-aided detection: CNN architectures, dataset characteristics and
transfer learning. IEEE Trans Med Imaging 35(5):1285–1298
22. Garstka J, Strzelecki M (2020) Pneumonia detection in X-ray chest images based on convolu-
tional neural networks and data augmentation methods. In: 2020 signal processing: algorithms,
architectures, arrangements, and applications (SPA), IEEE, pp 18–23
23. Kaushik VS, Nayyar A, Kataria G, Jain R (2020) Pneumonia detection using convolutional
neural networks (CNNs). In: Proceedings of first international conference on computing,
communications, and cyber-security (IC4S 2019), Springer, Singapore, pp 471–483
24. Varshni D, Thakral K, Agarwal L, Nijhawan R, Mittal A (2019) Pneumonia detection using
CNN based feature extraction. In: 2019 IEEE international conference on electrical, computer
and communication technologies (ICECCT), IEEE, pp 1–7
25. Chest X-Ray Images (Pneumonia). https://www.kaggle.com/paultimothymooney/chest-xray-
pneumonia
26. Yaseen MU, Anjum A, Rana O, Antonopoulos N (2018) Deep learning hyper-parameter
optimization for video analytics in clouds. IEEE Trans Syst Man Cybern: Syst
27. Bhardwaj C, Jain S, Sood M (2021) Transfer learning based robust automatic detection system
for diabetic retinopathy grading. Neural Comput Appl
Reduce Overall Execution Cost (OEC)
of Computation Offloading in Mobile
Cloud Computing for 5G Network Using
Nature Inspired Computing (NIC)
Algorithms

Voore Subba Rao and K. Srinivas

Abstract Cloud computing utilizes mobile devices as computing resources by offloading computational tasks to servers. These mobile devices do not have sufficient energy for uploading heavy tasks because they are battery-constrained and battery life is very short. The ubiquitous features of mobile devices let them act as resources of cloud computing for offloading computational tasks, which affects the communication cost to the cloud servers. These offloading problems should be optimized to reduce the overall execution cost in Mobile Cloud Computing for the 5G network. The objective of this paper is the optimization of offloading computational tasks from smart mobile devices to cloud servers using Nature Inspired Computing (NIC) algorithms: optimizing the service process with NIC to maximize the performance of offloading tasks, minimize energy consumption, and minimize total execution cost.

Keywords 5G network · Nature inspired computing · Offloading tasks · Mobile devices

1 Introduction

Cloud computing is a modern technology that provides web-based resources and services via the internet. It is a highly demanded infrastructure service available to clients all the time [1].
Mobile Cloud Computing (MCC) is an architecture in which data storage and processing happen externally to the mobile device. Mobile cloud applications move data to cloud servers through mobile devices. MCC aims to minimize the burden on mobile devices to store and process such huge applications and also to minimize the overall execution cost of application processing [2].

V. S. Rao (B)
Department of Physics and Computer Science, Dayalbagh Educational Institute (DEI), Agra, India
K. Srinivas
Department of Electrical Engineering, Dayalbagh Educational Institute (DEI), Agra, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
N. Marriwala et al. (eds.), Mobile Radio Communications and 5G Networks,
Lecture Notes in Networks and Systems 339,
https://doi.org/10.1007/978-981-16-7018-3_51

Figure 1 shows the architecture of Mobile Cloud Computing (MCC), which contains three parts: smartphones, wireless communication technology, and cloud infrastructure. The cloud infrastructure contains data centres for storing and processing applications; the data centres provide data storage, data processing, and data security in the cloud.
Mobile Cloud Computing (MCC) integrates cloud and mobile technologies over wireless networks to fulfil the computational work of mobile devices [3]. Handheld devices like smart mobile phones, tablets, and laptops increasingly come with large built-in memory and high processing capability. Even so, these devices cannot support the offloading of highly computational application processing for cloud computing: smart mobile devices are small in size with short battery life, and their network bandwidth and storage capability are far too limited for computationally heavy cloud applications. MCC is a new technology that offloads complex applications from mobile devices to cloud servers to save energy as well as time [4]. Offloading is the process of moving computational processes to a remote server for remote execution, extending the capability of Smart Mobile Devices (SMD) via the internet [5].
Nowadays, the vast improvement of technology provides high processing power and huge storage capability, and devices come equipped with various sensors. Even with these high capabilities, mobile devices are constrained by a limited battery, and compared with PCs their processing capability is far too low to act as a resource for heavy computational offloading tasks to cloud servers in a 5G network. These mobile devices cannot execute heavy applications, which require long processing times and drain the battery.
Cloud computing enables mobile devices to offload computational tasks to cloud servers that are located remotely. This feature of mobile devices acting as resources for offloading tasks to cloud computing is what constitutes Mobile Cloud Computing (MCC). MCC moves computational tasks as well as data

Fig. 1 Architecture of mobile cloud computing (MCC)



storage from mobile phones to (remote) servers in the cloud. Offloading computational tasks from mobile phones to servers in the cloud minimizes execution costs and also saves the energy of the devices.
Offloading heavy computational tasks from mobile devices is NP-hard. These mobile devices do not support executing long-running, processing-heavy applications with complex operations; they have low battery storage capability and frequently require charging. Smartphones cannot execute heavy applications and data in complex network environments because all smartphones are battery operated. The problems in 5G are considered network optimization problems: given a set of inputs, provide an optimal output under predefined constraints. Many traditional algorithmic techniques such as convex optimization, linear programming, and greedy algorithms have been applied. When the complexity of these problems becomes exponential, Nature Inspired Computing (NIC) algorithms, with features like an inherent ability for decentralization, scalability, self-evolution, and collective behaviour on particular tasks, have been successfully applied to model complex, heavy 5G network applications.

2 Literature Survey

In [6] the authors proposed an offloading decision scheme for mobile devices. The objective of that paper is the optimization of the energy consumption of mobile devices as well as minimizing execution time using a Genetic Algorithm (GA). The experimental results show near-optimal solutions within a reasonable amount of time.
In [7] the authors proposed the design of an intelligent computational offloading system that decides whether to upload a program from a mobile phone to a cloud server in 5th generation mobile technology. The proposed method maximizes energy savings and improves time management.
In [8] the authors proposed a framework for offloading computational tasks by applying the Cuckoo framework approach. This method is used to minimize the energy consumption of smartphone devices as well as increase the speed of computational tasks.

3 Modeling, Solving and Optimization of the Offloading Problem Using Nature Inspired Computing

Nature Inspired Computing algorithms draw their inspiration from nature and are models for solving computational problems efficiently. Due to bandwidth problems and the increasing heterogeneity of devices, the growing network is a challenging environment in which finding optimal solutions within a given time is not possible. But it can be

possible to find near-optimal solutions in a given time with Nature Inspired Computing algorithms. These NIC algorithms balance exploration and exploitation to maximize or minimize an objective. Swarm Intelligence (SI) algorithms feature the self-motivation, self-adaptation, and collective behaviour of homogeneous agents in nature and are able to solve complex problems [9]. Examples are Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO).

3.1 Genetic Algorithm (GA)

In the optimization of the offloading problem using the Genetic Algorithm, an offloading plan is represented as a chromosome. Chromosomes contain genes, and each gene corresponds to a mobile service. The length of a chromosome depends upon the number of genes it contains. Every gene takes the value 0 or 1, where 1 means the service is to be offloaded and 0 means it is not. The Genetic Algorithm is used to reduce the execution time of offloading tasks as well as to minimize energy consumption [10].
Pseudo-code for Genetic Algorithm (GA)
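The pseudo-code listing itself did not survive in this copy, so what follows is a minimal illustrative sketch of a binary-encoded GA for the offloading decision, reusing the population size, crossover rate, mutation rate, and elitism count from Table 1; the fitness() placeholder stands in for the overall execution cost of Eq. (1) and is not the authors' cost model.

# Minimal binary GA sketch for the offloading decision (illustrative).
import random

NUM_SERVICES, POP_SIZE, GENERATIONS = 10, 100, 50
CROSSOVER_RATE, MUTATION_RATE = 0.95, 0.0001  # rates from Table 1

def fitness(chromosome):
    # Placeholder cost (lower is better); replace with Eq. (1).
    return abs(sum(chromosome) - NUM_SERVICES / 2)

def tournament(pop):
    a, b = random.sample(pop, 2)
    return a if fitness(a) < fitness(b) else b

pop = [[random.randint(0, 1) for _ in range(NUM_SERVICES)]
       for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    nxt = sorted(pop, key=fitness)[:2]            # elitism count = 2
    while len(nxt) < POP_SIZE:
        p1, p2 = tournament(pop), tournament(pop)
        if random.random() < CROSSOVER_RATE:      # one-point crossover
            cut = random.randrange(1, NUM_SERVICES)
            p1 = p1[:cut] + p2[cut:]
        nxt.append([1 - g if random.random() < MUTATION_RATE else g
                    for g in p1])                 # bit-flip mutation
    pop = nxt
best = min(pop, key=fitness)  # gene = 1: offload the service; 0: run locally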

3.2 Particle Swarm Optimization (PSO)

Particle Swarm Optimization (PSO) is a swarm intelligence meta-heuristic algorithm for searching for optimal solutions within the search space of a fitness function. Swarm Intelligence (SI) algorithms are modelled on a group (swarm) of birds searching for food in a targeted area; these natural models are converted into computational models, the best-known SI algorithm being Particle Swarm Optimization (PSO)

[11–13]. In Particle Swarm Optimization, randomly generated solutions, known as particles, search for the best solutions while coordinating their directions and velocities. The local and global searches for optimal results mainly depend on, and are affected by, the particles' directions and their velocities. The local search uses a particle's Pbest, i.e. its personal best position in the search space; the global search uses Gbest, i.e. the best position among all particles in the search space. Particles update their velocities and positions using the following equations.
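The update equations themselves did not survive in this copy; the standard PSO updates consistent with the symbols defined below (x_i being particle i's position and v_i its velocity) are:

v_i = w · v_i + c1 · rand() · (P_i − x_i) + c2 · rand() · (P_g − x_i)
x_i = x_i + v_i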
Pseudo-code for Particle Swarm Optimization (PSO)

w = inertia weight that balances local search (exploitation) and global search (exploration)
rand() = random number ∈ [0, 1]
c1, c2 = acceleration constants
Pi = Pbest, i.e. the personal best position of a particle in the swarm
Pg = Gbest, i.e. the global best position among all particles in the swarm
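The pseudo-code listing referenced above is likewise missing from this copy; below is a minimal sketch consistent with these update equations and the Table 1 parameters (w = c1 = c2 = 0.5). The sphere objective is an illustrative stand-in for the offloading cost.

# Minimal PSO sketch (illustrative objective and bounds).
import random

DIM, POP_SIZE, ITERATIONS = 10, 100, 50
W, C1, C2 = 0.5, 0.5, 0.5  # parameters from Table 1

def fitness(x):
    return sum(v * v for v in x)  # placeholder objective (minimize)

swarm = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(POP_SIZE)]
vel = [[0.0] * DIM for _ in range(POP_SIZE)]
pbest = [p[:] for p in swarm]
gbest = min(pbest, key=fitness)[:]

for _ in range(ITERATIONS):
    for i, x in enumerate(swarm):
        for d in range(DIM):  # velocity and position updates
            vel[i][d] = (W * vel[i][d]
                         + C1 * random.random() * (pbest[i][d] - x[d])
                         + C2 * random.random() * (gbest[d] - x[d]))
            x[d] += vel[i][d]
        if fitness(x) < fitness(pbest[i]):
            pbest[i] = x[:]  # update personal best
    gbest = min(pbest, key=fitness)[:]  # update global best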

3.3 Biogeography Based Optimization

BBO is a meta-heuristic algorithm modelled on how species move from one specific island to another. In BBO, every habitat has a value called the Habitat Suitability Index (HSI) and contains parameters called Suitability

Index Variables (SIVs). An island that is friendly to life is said to have a high Habitat Suitability Index (HSI). The features that characterize habitability (i.e. fitness to live in) are called SIVs; features that correlate with HSI include rainfall, vegetation diversity, land area, temperature, and others. In terms of habitability, the SIVs can be considered the independent variables of the island and the HSI the dependent variable.
BBO has several operators, i.e. the migration operator, the mutation operator, and the elitism operator. To increase the HSI value, the migration operator uses immigration and emigration. The mutation operator updates solutions randomly, and the elitism operator carries the best solutions forward to future generations.
BBO has specific similarities with the well-known bio-inspired computing algorithms Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). In GA, if a parent is not fit (not capable), its child is unlikely to survive into the next generation because of its low selection probability, whereas in PSO and BBO solutions are capable of surviving into upcoming generations. In PSO, solutions tend to form into similar groups (clusters), whereas in BBO, as in GA, the solutions do not form into clusters. In PSO solutions are updated through velocities, while in BBO solutions are updated directly. Hence BBO can produce better optimum results compared with GA and PSO [14, 15].
Biogeography Based Optimization (BBO) Algorithm
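The listing for this algorithm is also missing from this copy; the sketch below shows the BBO loop described above, with rank-based immigration/emigration rates, mutation, and elitism. The fitness() placeholder and value ranges are illustrative, not the authors' configuration.

# Minimal BBO sketch: habitats migrate SIVs according to rank-based rates.
import random

DIM, POP_SIZE, GENERATIONS, MUT_RATE = 10, 100, 50, 0.1  # per Table 1

def fitness(h):
    return sum(v * v for v in h)  # placeholder objective (minimize)

habitats = [[random.uniform(-1, 1) for _ in range(DIM)]
            for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    habitats.sort(key=fitness)  # best habitats (highest HSI) first
    lam = [(i + 1) / POP_SIZE for i in range(POP_SIZE)]  # immigration rate
    mu = [1 - l for l in lam]                            # emigration rate
    new = [h[:] for h in habitats[:2]]                   # elitism
    for i in range(2, POP_SIZE):
        h = habitats[i][:]
        for d in range(DIM):
            if random.random() < lam[i]:  # immigrate an SIV from an
                src = random.choices(range(POP_SIZE), weights=mu)[0]
                h[d] = habitats[src][d]   # emigrating habitat
            if random.random() < MUT_RATE:
                h[d] = random.uniform(-1, 1)  # mutation
        new.append(h)
    habitats = new
best = min(habitats, key=fitness)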

Proposed BBO Algorithm

The fitness function addresses the optimization problem of minimizing the execution time of mobile applications while also minimizing energy consumption. In the proposed method, the objective function for the offloading technique is the weighted sum of the execution time of the mobile service and the energy use of the mobile device (MD). The objective function for calculating the overall execution cost is:

Overall_Execution_Cost_mobile_device = ω · Execution_Time_mobile_device + (1 − ω) · Energy_cons_mobile_device   (1)

where Execution_Time_mobile_device is the overall execution time for the MD to process an application, Energy_cons_mobile_device is the energy taken by the MD, and ω is an inertia parameter in the range 0–1 that captures the tradeoff between execution time and energy consumption; ω = 0.5 is used when execution time and the device's battery life are weighted equally.

Execution_Time_mobile_device_L = local_execute_service / processor_capacity_mobile_device   (2)

Execution_Time_mobile_device is computed from two components: Execution_Time_mobile_device_L, the time for executing tasks locally on the mobile device, and Execution_Time_mobile_device_C, the time for executing applications on remote machines (servers in the cloud). The locally executed service is denoted by local_execute_service, and the mobile device's processor capacity by processor_capacity_mobile_device.

Execution_Time_mobile_device_C = wait_time_cloud + input_i / data_trans_rate + cloud_server / processor_capacity_cloud + output_o / data_trans_rate   (3)

where Execution_Time_mobile_device_C includes wait_time_cloud, the time spent waiting for the execution of the task; data_trans_rate, the required data transmission rate (in kbps) at which the mobile user sends/receives the input/output data of the offloaded task to/from the cloud; cloud_server, the workload of the offloaded service; input_i/output_o, the input and output data sizes of the task; and processor_capacity_cloud, the processor capacity of the cloud server.

Energy_cons_mobile_device_L = Execution_Time_mobile_device_L × Energy_cons_rate_mobile_device   (4)

The energy consumed for the execution of an application comes from two components: Energy_cons_mobile_device_L, the energy consumption for local execution, and the cloud-side component governed by Energy_cons_rate_mobile_device_C, the rate of energy consumption of a mobile device while executing tasks in the cloud.

Energy_cons_mobile_device_C = (input_i / data_trans_rate_input) × Energy_cons_rate_mobile_device_up + (output_o / data_trans_rate_output) × Energy_cons_rate_mobile_device_down   (5)

The energy consumption for sending and receiving the input and output data of a cloud-offloaded task is captured by Energy_cons_rate_mobile_device_up and Energy_cons_rate_mobile_device_down.
Accordingly, the overall execution cost is:
Overall_Execution_Cost_mobile_device = ω · Σ_m Execution_Time_mobile_device + (1 − ω) · ( Σ_m Energy_cons_mobile_device_L + Σ_c Energy_cons_rate_mobile_device_C )   (6)

where Σ_m Energy_cons_mobile_device_L denotes the energy consumed by mobile devices locally, and Σ_c Energy_cons_rate_mobile_device_C denotes the energy consumed by mobile devices on the cloud.
This fitness function is assigned to the Nature Inspired Computing algorithm, which searches for optimal solutions over multiple iterations, with every iteration of execution generating a new population. These steps are repeated to obtain fitness values for the given population.
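As a concrete reading of Eqs. (1)–(6), the sketch below evaluates the overall execution cost of a candidate offloading plan; the variable names mirror the equations, while every numeric default is an illustrative assumption rather than a value from the paper.

# Sketch of the overall execution cost of Eqs. (1)-(6); defaults are illustrative.
def overall_execution_cost(plan, services, omega=0.5,
                           cap_md=4744.0, cap_cloud=13374.0,
                           rate_kbps=1000.0, e_rate_md=2.0,
                           e_rate_up=1.5, e_rate_down=1.0,
                           wait_cloud=0.05):
    # plan[i] = 1 offloads service i to the cloud, 0 runs it locally.
    # Each service is (workload_mips, input_kb, output_kb).
    time_total, energy_local, energy_cloud = 0.0, 0.0, 0.0
    for offload, (work, inp, out) in zip(plan, services):
        if offload:  # Eq. (3): wait + upload + cloud compute + download
            t = wait_cloud + inp / rate_kbps + work / cap_cloud + out / rate_kbps
            # Eq. (5): radio energy for upload and download
            energy_cloud += (inp / rate_kbps) * e_rate_up \
                          + (out / rate_kbps) * e_rate_down
        else:        # Eq. (2): local execution time
            t = work / cap_md
            energy_local += t * e_rate_md  # Eq. (4)
        time_total += t
    # Eqs. (1)/(6): weighted tradeoff between time and energy
    return omega * time_total + (1 - omega) * (energy_local + energy_cloud)

# Example: three services, offloading only the heaviest one.
services = [(500.0, 40.0, 10.0), (5000.0, 200.0, 50.0), (1200.0, 80.0, 20.0)]
print(overall_execution_cost([0, 1, 0], services))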

4 Results and Discussion

The Nature Inspired Computing algorithms GA, PSO, and BBO have been evaluated for generating optimal offloading. The simulation software MATLAB was used for this proposed work. The algorithms are evaluated considering smartphones and laptops of different configurations. The parameters used in the experiments for optimizing energy consumption and minimizing application execution time with the NIC algorithms are shown in the tables below.
Table 1 provides the detailed parameters of the Nature Inspired Computing algorithms, and Table 2 provides the detailed configuration of the smartphones and laptops used in the offloading experiments.
Figures 2 and 3 show the Overall Execution Cost (OEC) obtained by GA, PSO, and BBO for

Table 1 Parameters of nature inspired computing algorithms

NIC algorithm | Size of population | Generations | Control variables
GA | 100 | 50 | Mutation rate = 0.0001; crossover rate = 0.95; elitism count = 2
PSO | 100 | 50 | c1 = 0.5, c2 = 0.5, ω = 0.5
BBO | 100 | 50 | Appliances = 12; migration = immigration/emigration; mutation = 0.1; max. iterations = 5

Table 2 Specifications of devices for the experiment analysis

Equipment type | Mobile device | Power (W) | Capacity (MIPS)
Smart phones | Raspberry Pi2 | 2 | 4744 at 1.6 GHz
Smart phones | Samsung Exynos 5250 | 4 | 14,000 at 2.0 GHz
Laptops | i7 6950 | 140 | 9900 at 1.5 GHz
Laptops | Intel Core i7 4770 K | 84 | 13,374 at 3.9 GHz

Fig. 2 Raspberry Pi2—shows OEC by GA, PSO, BBO

Fig. 3 Samsung Exynos 5250—shows OEC by GA, PSO, BBO

mobile devices like the Raspberry Pi2 and Samsung Exynos 5250. As per the results, the BBO algorithm provides the best optimal solution for the tasks and the best performance in overall execution cost.
Figures 4 and 5 show the Overall Execution Cost (OEC) obtained by GA, PSO, and BBO for laptops like the i7 6950 and Core i7 4770 K. Here too, the BBO algorithm provides the best optimal solution for the tasks and the best performance in overall execution cost.

5 Conclusion

This paper focuses on the importance of Nature Inspired Computing algorithms for obtaining optimal results in the optimization of mobile computation offloading in Mobile Cloud

Fig. 4 i7 6950 laptop—shows OEC by GA, PSO, BBO

Fig. 5 Intel corei7 4770 K laptop—shows OEC by GA, PSO, BBO

Computing tasks. The objective of this paper is to improve energy efficiency and minimize energy consumption by minimizing the total execution cost of executing offloading tasks from mobile phones to cloud servers. The simulation results show that, compared with the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), the Biogeography Based Optimization (BBO) algorithm gives better performance, getting near-optimal results for minimum overall execution cost.

References

1. Mell P, Grance T (2011) The NIST definition of cloud computing
2. Dinh HT, Lee C, Niyato D, Wang P (2013) A survey of mobile cloud computing: architecture, applications, and approaches. Wireless Commun Mobile Comput 13(18):1587–1611


3. https://en.wikipedia.org/wiki/Mobile_cloud_computing. Accessed 31 Dec 2016
4. Xia F, DingJie F, Xi J, Kong X, Yang LT, Ma J (2014) Phone2Cloud: exploiting computa-
tion offloading for energy saving on smartphones in mobile cloud computing. Inf Syst Front
16(1):95–111
5. Yang K, Ou S, Chen H-H (2008) On effective offloading services for resource-constrained
mobile devices running heavier mobile internet applications. Commun Mag IEEE 46(1):56–63
6. Deng S et al (2014) Computation offloading for service workflow in mobile cloud computing.
IEEE Transactions Parallel Distrib Syst 26(12):3317–3329
7. Khoda ME et al (2016) Efficient computation offloading decision in mobile cloud computing
over 5G network. Mobile Netw Appl 21(5):777–792
8. Kemp R et al (2010) Cuckoo: a computation offloading framework for smartphones. In:
International conference on mobile computing, applications, and services. Springer, Berlin,
Heidelberg
9. Yang F et al (2017) Survey of swarm intelligence optimization algorithms. In: 2017 IEEE
international conference on unmanned systems (ICUS), IEEE
10. Srinivas M, Patnaik LM (1994) Genetic algorithms: a survey. Computer 27(6):17–26
11. Blum C, Li X (2008) Swarm intelligence in optimization. In: Swarm intelligence, Springer,
Berlin, Heidelberg, pp 43–85
12. Kennedy J, Eberhart R (1995) Particle swarm optimization. In: Proceedings of IEEE
international conference on neural networks, vol 4, pp 1942–1948
13. Zhang Y, Wang S, Ji G (2015) A comprehensive survey on particle swarm optimization
algorithm and its applications. Math Prob Eng 2015
14. Simon D (2008) Biogeography-based optimization. IEEE Trans Evol Comput 12:702–713
15. Wei Z et al (2020) An offloading strategy with soft time windows in mobile edge computing.
Comput Commun 164:42–49
Author Index

A
Ahmad, Sharik, 479
Amudhan, K., 223
Ankaiah, Burri, 657
Anthoniraj, S., 619
Aswin, V. R., 223

B
Bala, Indu, 35
Balaji, S., 223
Batra, Amit, 109
Batra, Neera, 429
Bhardwaj, Shikha, 645
Bhavana, 415

C
Chandra, Umesh, 479
Cheema, Pardeep, 591
Chhabra, Mohit, 297
Chugh, Himani, 267

D
Dalal, P., 583
Dhingra, Shefali, 645
Dhull, S. K., 583
Dutta, Neha, 591

G
Gagandeep, 179
Gopalsamy, Arunraj, 451
Goyal, Sonali, 429
Goyal, Sumeet, 281
Grewal, Surender K., 159
Gunavathi, K., 471
Gupta, Ankit, 21
Gupta, Anuj Kumar, 243
Gupta, Deepak, 95
Gupta, Monish, 521
Gupta, Parul, 365
Gupta, Sheifali, 267
Gupta, Swati, 543
Gupta, Vishal, 405, 521

H
Harnal, Shilpi, 331
Harsha Priya, M., 129
Harsha Vardhini, P. A., 129, 657
Helenprabha, K., 167
Hemambujavalli, S., 619

J
Jacob, Minu Susan, 57
Jain, Anurag, 195
Jain, Ritika, 439
Jain, Shruti, 667
Jain, Stuti, 439
Jamwal, Shubhnandan S., 365
Jose, Deepa, 619

K
Karthikeyan, S., 1
Kathuria, A., 559
Kaur, Karamjit, 313
Kaur, Lakhwinder, 255
Kaur, Priya Pinder, 83
Khurana, Ramneek Kaur, 667
Krishnaveni, S., 129
Kumar, Akhilesh, 395
Kumar, Dinesh, 345
Kumar, Neeraj, 267
Kumar, Pawan, 345
Kumar, Rajneesh, 195, 297
Kumar, Velguri Suresh, 149

L
Lakshmi, N., 379

M
Mageshwari, R., 73
Mamatha, E., 533
Manisha, Chilumula, 137
Manju, S., 167
Marriwala, Nikhil, 521, 605
Mavi, Navneet Kaur, 255
Mishra, Brijesh, 395
Mishra, Ravi Dutt, 331
Mishra, Satanand, 211

N
Narwal, Bhawna, 645
Nassa, Vinay Kumar, 281
Nirmal Kumar, P., 619
Nishanth, Adusumalli, 1

O
Oommen, Sujo, 657

P
Palta, Pankaj, 281
Pandey, Binay Kumar, 281
Pandey, Digvijay, 281
Pandey, Divya, 211
Parthiban, T., 379
Pathak, Monika, 243
Phukan, Arpan, 95
Pillai, Anuradha, 21
Ponraj, A., 379
Praveen, Annaa, 1
Punj, Deepika, 21

R
Radha, B., 451, 511
Raheja, Neeraj, 415
Rajakarunakaran, S., 223
Rajdurai, P. S., 533
Rani, Manisha, 179
Rani, Pooja, 195
Rao, Voore Subba, 679
Ravi, T., 1
Reddy, C. S., 533
Reshmika, D., 379
Richa, 313

S
Saini, Mehak, 159
Saini, Sanju, 119
Sakthivel, D., 511
Saravanan, Samyuktha, 471
Selvi Rajendran, P., 57
Sen, Vijay Singh, 365
Sethi, Hemant, 405
Sharduli, 109
Sharma, Ashish, 439
Sharma, Avinash, 479, 559
Sharma, Bhawna, 501
Sharma, Gaurav, 331
Sharma, Jyotika, 439
Sharma, Manvinder, 281
Sharma, Preeti, 635
Sharma, Raju, 243
Sharma, Sandhya, 267
Siddique, Shadab Azam, 395
Singh, Kritika, 395
Singh, Kulvinder, 109
Singh, N. P., 543
Singh, Priti, 313
Singh, Rashi, 667
Singh, Sukhdev, 83
Singh, Surinder Pal, 405
Soni, Sanjay Kumar, 395
Sreelatha, V., 533
Sri Kiruthika, G., 73
Srinivas, K., 679
Srivastava, Vikas, 35
Srivastava, V. K., 635
Suresh Kumar, Velguri, 137
Suwathi, T., 73

T
Thomas, Tina Susan, 73
Tiwari, Garima, 119
V
Vaid, Rohit, 501
Vandana, 605
Vats, Apoorv, 667
Venkatesh, R., 223
Venkatesh, Tadisetty Adithya, 149
Vignesh Saravanan, K., 223
Vijayabaskar, 1

Y
Yadav, Dharminder, 479
Yadav, Shekhar, 211
