
Advances in Intelligent Systems and Computing 1324

Pandian Vasant
Ivan Zelinka
Gerhard-Wilhelm Weber Editors

Intelligent
Computing and
Optimization
Proceedings of the 3rd International
Conference on Intelligent Computing
and Optimization 2020 (ICO2020)
Advances in Intelligent Systems and Computing

Volume 1324

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences,
Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing,
Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering,
University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University,
Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas
at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao
Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology,
University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute
of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro,
Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management,
Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering,
The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications
on theory, applications, and design methods of Intelligent Systems and Intelligent
Computing. Virtually all disciplines such as engineering, natural sciences, computer
and information science, ICT, economics, business, e-commerce, environment,
healthcare, life science are covered. The list of topics spans all the areas of modern
intelligent systems and computing such as: computational intelligence, soft comput-
ing including neural networks, fuzzy systems, evolutionary computing and the fusion
of these paradigms, social intelligence, ambient intelligence, computational neuro-
science, artificial life, virtual worlds and society, cognitive science and systems,
Perception and Vision, DNA and immune based systems, self-organizing and
adaptive systems, e-Learning and teaching, human-centered and human-centric
computing, recommender systems, intelligent control, robotics and mechatronics
including human-machine teaming, knowledge-based paradigms, learning para-
digms, machine ethics, intelligent data analysis, knowledge management, intelligent
agents, intelligent decision making and support, intelligent network security, trust
management, interactive entertainment, Web intelligence and multimedia.
The publications within “Advances in Intelligent Systems and Computing” are
primarily proceedings of important conferences, symposia and congresses. They
cover significant recent developments in the field, both of a foundational and
applicable character. An important characteristic feature of the series is the short
publication time and world-wide distribution. This permits a rapid and broad
dissemination of research results.
Indexed by SCOPUS, DBLP, EI Compendex, INSPEC, WTI Frankfurt eG,
zbMATH, Japanese Science and Technology Agency (JST), SCImago.
All books published in the series are submitted for consideration in Web of
Science.

More information about this series at http://www.springer.com/series/11156


Pandian Vasant • Ivan Zelinka • Gerhard-Wilhelm Weber
Editors

Intelligent Computing
and Optimization
Proceedings of the 3rd International
Conference on Intelligent Computing
and Optimization 2020 (ICO 2020)

Editors

Pandian Vasant
Department of Fundamental and Applied Sciences
Universiti Teknologi Petronas
Tronoh, Perak, Malaysia

Ivan Zelinka
Faculty of Electrical Engineering and Computer Science
VŠB TU Ostrava
Ostrava, Czech Republic

Gerhard-Wilhelm Weber
Faculty of Engineering Management
Poznan University of Technology
Poznan, Poland

Advances in Intelligent Systems and Computing
ISSN 2194-5357 ISSN 2194-5365 (electronic)
ISBN 978-3-030-68153-1 ISBN 978-3-030-68154-8 (eBook)
https://doi.org/10.1007/978-3-030-68154-8
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

The third edition of the International Conference on Intelligent Computing and Optimization, ICO’2020, was held via an online platform due to the COVID-19 pandemic; the physical conference will take place at G Hua Hin Resort & Mall, Hua Hin, Thailand, once the COVID-19 pandemic has subsided. The objective of the international conference is to bring together the global research leaders, experts, and scientists in the research areas of intelligent computing and optimization from all over the globe to share their knowledge and experiences on current research achievements in these fields. The conference provides a golden opportunity for the global research community to interact and share novel research results, findings, and innovative discoveries among colleagues and friends. The proceedings of ICO’2020 are published by SPRINGER (in the series Advances in Intelligent Systems and Computing).
Almost 100 authors submitted their full papers for ICO’2020. They represent
more than 50 countries, such as Bangladesh, Canada, China, Croatia, France,
Greece, Hong Kong, Italy, India, Indonesia, Iraq, Japan, Malaysia, Mauritius,
Mexico, Myanmar, Namibia, Nigeria, Oman, Poland, Russia, Slovenia, South
Africa, Sweden, Taiwan, Thailand, Turkmenistan, Ukraine, USA, UK, Vietnam,
and others. This worldwide representation clearly demonstrates the growing interest
of the research community in our conference.
For this edition, the conference proceedings cover the innovative, original, and creative research areas of sustainability, smart cities, meta-heuristic optimization, cybersecurity, blockchain, big data analytics, IoT, renewable energy, artificial intelligence, Industry 4.0, and modeling and simulation. The organizing committee would like to sincerely thank all the authors and reviewers for their wonderful contributions to this conference. The best, high-quality papers were selected and reviewed by the International Program Committee for publication in Advances in Intelligent Systems and Computing by SPRINGER.
ICO’2020 presents enlightening contributions for research scholars across the planet in the research areas of innovative computing and novel optimization techniques with cutting-edge methodologies and applications. This conference could not have been organized without the strong support and help of the committee members of ICO’2020. We would like to sincerely thank Prof. Igor Litvinchev (Nuevo Leon State University (UANL), Mexico), Prof. Rustem Popa
(Dunarea de Jos University in Galati, Romania), Professor Jose Antonio Marmolejo
(Universidad Panamericana, Mexico), and Dr. J. Joshua Thomas (UOW
Malaysia KDU Penang University College, Malaysia) for their great help and
support in organizing the conference.
We also appreciate the valuable guidance and great contribution from
Dr. J. Joshua Thomas (UOW Malaysia KDU Penang University College,
Malaysia), Prof. Gerhard-Wilhelm Weber (Poznan University of Technology,
Poland; Middle East Technical University, Turkey), Prof. Rustem Popa (“Dunarea
de Jos” University in Galati, Romania), Prof. Valeriy Kharchenko (Federal
Scientific Agro-engineering Center VIM, Russia), Dr. Vladimir Panchenko
(Russian University of Transport, Russia), Prof. Ivan Zelinka (VSB-TU Ostrava,
Czech Republic), Prof. Jose Antonio Marmolejo (Universidad Anahuac Mexico
Norte, Mexico), Prof. Roman Rodriguez-Aguilar (Universidad Panamericana,
Mexico), Prof. Ugo Fiore (Federico II University, Italy), Dr. Mukhdeep Singh
Manshahia (Punjabi University Patiala, India), Mr. K. C. Choo (CO2 Networks,
Malaysia), Prof. Celso C. Ribeiro (Brazilian Academy of Sciences, Brazil),
Prof. Sergei Senkevich (Federal Scientific Agro-engineering Center VIM, Russia),
Prof. Mingcong Deng (Tokyo University of Agriculture and Technology, Japan),
Dr. Kwok Tai Chui (Open University of Hong Kong, Hong Kong), Prof. Hui Ming
Wee (Chung Yuan Christian University, Taiwan), Prof. Elias Munapo (North West
University, South Africa), Prof. M. Moshiul Hoque (Chittagong University of
Engineering & Technology, Bangladesh), and Prof. Mohammad Shamsul Arefin
(Chittagong University of Engineering and Technology, Bangladesh).
Finally, we would like to convey our sincerest thanks to Prof. Dr. Janusz Kacprzyk, Dr. Thomas Ditzinger, and Ms. Jayarani Premkumar of SPRINGER NATURE for their wonderful help and support in publishing the ICO’2020 conference proceedings book in Advances in Intelligent Systems and Computing.

December 2020

Pandian Vasant
Gerhard-Wilhelm Weber
Ivan Zelinka
Contents

Sustainable Clean Energy System


Adaptive Neuro-Fuzzy Inference Based Modeling of Wind Energy
Harvesting System for Remote Areas . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Tigilu Mitiku and Mukhdeep Singh Manshahia
Features of Distributed Energy Integration in Agriculture . . . . . . . . . . . 19
Alexander V. Vinogradov, Dmitry A. Tikhomirov, Vadim E. Bolshev,
Alina V. Vinogradova, Nikolay S. Sorokin, Maksim V. Borodin,
Vadim A. Chernishov, Igor O. Golikov, and Alexey V. Bukreev
Concept of Multi-contact Switching System . . . . . . . . . . . . . . . . . . . . . . 28
Alexander V. Vinogradov, Dmitry A. Tikhomirov, Alina V. Vinogradova,
Alexander A. Lansberg, Nikolay S. Sorokin, Roman P. Belikov,
Vadim E. Bolshev, Igor O. Golikov, and Maksim V. Borodin
The Design of Optimum Modes of Grain Drying
in Microwave–Convective Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Dmitry Budnikov
Isolated Agroecosystems as a Way to Solve the Problems of Feed,
Ecology and Energy Supply of Livestock Farming . . . . . . . . . . . . . . . . . 43
Aleksey N. Vasiliev, Gennady N. Samarin, Aleksey Al. Vasiliev,
and Aleksandr A. Belov
Laboratory-Scale Implementation of Ethereum Based Decentralized
Application for Solar Energy Trading . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Patiphan Thupphae and Weerakorn Ongsakul
Solar Module with Photoreceiver Combined with Concentrator . . . . . . 63
Vladimir Panchenko and Andrey Kovalev

Modeling of Bilateral Photoreceiver of the Concentrator Photovoltaic Thermal Module . . . . . . . . . . . . . . . . . . . . 73
Vladimir Panchenko, Sergey Chirskiy, Andrey Kovalev,
and Anirban Banik
Formation of Surface of the Paraboloid Type Concentrator of Solar
Radiation by the Method of Orthogonal Parquetting . . . . . . . . . . . . . . . 84
Vladimir Panchenko and Sergey Sinitsyn
Determination of the Efficiency of Photovoltaic Converters Adequate
to Solar Radiation by Using Their Spectral Characteristics . . . . . . . . . . 95
Valeriy Kharchenko, Boris Nikitin, Vladimir Panchenko,
Shavkat Klychev, and Baba Babaev
Modeling of the Thermal State of Systems of Solar-Thermal
Regeneration of Adsorbents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Gulom Uzakov, Saydulla Khujakulov, Valeriy Kharchenko,
Zokir Pardayev, and Vladimir Panchenko
Economic Aspects and Factors of Solar Energy Development
in Ukraine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Volodymyr Kozyrsky, Svitlana Makarevych, Semen Voloshyn,
Tetiana Kozyrska, Vitaliy Savchenko, Anton Vorushylo,
and Diana Sobolenko
A Method for Ensuring Technical Feasibility of Distributed
Balancing in Power Systems, Considering Peer-to-Peer Balancing
Energy Trade . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Mariusz Drabecki

Sustainable Optimization, Metaheuristics and Computing for Expert System

The Results of a Compromise Solution, Which Were Obtained
on the Basis of the Method of Uncertain Lagrange Multipliers
to Determine the Influence of Design Factors
of the Elastic-Damping Mechanism
in the Tractor Transmission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Sergey Senkevich, Ekaterina Ilchenko, Aleksandr Prilukov,
and Mikhail Chaplygin
Multiobjective Lévy-Flight Firefly Algorithm
for Multiobjective Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Somchai Sumpunsri, Chaiyo Thammarat, and Deacha Puangdownreong
Cooperative FPA-ATS Algorithm for Global Optimization . . . . . . . . . . 154
Thitipong Niyomsat, Sarot Hlangnamthip, and Deacha Puangdownreong
Bayesian Optimization for Reverse Stress Testing . . . . . . . . . . . . . . . . . 164
Peter Mitic
Modified Flower Pollination Algorithm for Function Optimization . . . . 176
Noppadol Pringsakul and Deacha Puangdownreong
Improved Nature-Inspired Algorithms for Numeric Association
Rule Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
Iztok Fister Jr., Vili Podgorelec, and Iztok Fister
Verification of the Adequacy of the Topological Optimization
Method of the Connecting Rod Shaping by the BESO Method
in ANSYS APDL System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Sergey Chirskiy and Vladimir Panchenko
Method for Optimizing the Maintenance Process of Complex
Technical Systems of the Railway Transport . . . . . . . . . . . . . . . . . . . . . 205
Vladimir Apatsev, Victor Bugreev, Evgeniy Novikov,
Vladimir Panchenko, Anton Chekhov, and Pavel Chekhov
Optimization of Power Supply System of Agricultural Enterprise
with Solar Distributed Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
Yu. V. Daus, I. V. Yudaev, V. V. Kharchenko, and V. A. Panchenko
Crack Detection of Iron and Steel Bar Using Natural Frequencies:
A CFD Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
Rajib Karmaker and Ujjwal Kumar Deb
Belief Rule-Based Expert System to Identify the Crime Zones . . . . . . . . 237
Abhijit Pathak, Abrar Hossain Tasin, Sanjida Nusrat Sania, Md. Adil,
and Ashibur Rahman Munna
Parameter Tuning of Nature-Inspired Meta-Heuristic Algorithms
for PID Control of a Stabilized Gimbal . . . . . . . . . . . . . . . . . . . . . . . . . 250
S. Baartman and L. Cheng
Solving an Integer Program by Using the Nonfeasible Basis Method
Combined with the Cutting Plane Method . . . . . . . . . . . . . . . . . . . . . . . 263
Kasitinart Sangngern and Aua-aree Boonperm
A New Technique for Solving a 2-Dimensional Linear Program
by Considering the Coefficient of Constraints . . . . . . . . . . . . . . . . . . . . 276
Panthira Jamrunroj and Aua-aree Boonperm
A New Integer Programming Model for Solving a School Bus
Routing Problem with the Student Assignment . . . . . . . . . . . . . . . . . . . 287
Anthika Lekburapa, Aua-aree Boonperm, and Wutiphol Sintunavarat
Distributed Optimisation of Perfect Preventive Maintenance
and Component Replacement Schedules Using SPEA2 . . . . . . . . . . . . . 297
Anthony O. Ikechukwu, Shawulu H. Nggada, and José G. Quenum
A Framework for Traffic Sign Detection Based on Fuzzy Image Processing and Hu Features . . . . . . . . . . . . . . . . . . . . 311
Zainal Abedin and Kaushik Deb
Developing a Framework for Vehicle Detection, Tracking
and Classification in Traffic Video Surveillance . . . . . . . . . . . . . . . . . . . 326
Rumi Saha, Tanusree Debi, and Mohammad Shamsul Arefin

Advances in Algorithms, Modeling and Simulation for Intelligent Systems

Modeling and Simulation of Rectangular Sheet Membrane Using
Computational Fluid Dynamics (CFD) . . . . . . . . . . . . . . . . . . . . . . . . . . 345
Anirban Banik, Sushant Kumar Biswal, Tarun Kanti Bandyopadhyay,
Vladimir Panchenko, and J. Joshua Thomas
End-to-End Supply Chain Costs Optimization Based on Material
Touches Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
César Pedrero-Izquierdo, Víctor Manuel López-Sánchez,
and José Antonio Marmolejo-Saucedo
Computer Modeling Selection of Optimal Width of Rod Grip
Header to the Combine Harvester . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
Mikhail Chaplygin, Sergey Senkevich, and Aleksandr Prilukov
An Integrated CNN-LSTM Model for Micro Hand
Gesture Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
Nanziba Basnin, Lutfun Nahar, and Mohammad Shahadat Hossain
Analysis of the Cost of Varying Levels of User Perceived Quality
for Internet Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
Ali Adib Arnab, Sheikh Md. Razibul Hasan Raj, John Schormans,
Sultana Jahan Mukta, and Nafi Ahmad
Application of Customized Term Frequency-Inverse Document
Frequency for Vietnamese Document Classification in Place
of Lemmatization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406
Do Viet Quan and Phan Duy Hung
A New Topological Sorting Algorithm with Reduced
Time Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
Tanzin Ahammad, Mohammad Hasan, and Md. Zahid Hassan
Modeling and Analysis of Framework for the Implementation
of a Virtual Workplace in Nigerian Universities Using Coloured
Petri Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
James Okpor and Simon T. Apeh
Modeling and Experimental Verification of Air-Thermal and Microwave-Convective Presowing Seed Treatment . . . . . . . . . . . . 440
Alexey A. Vasiliev, Alexey N. Vasiliev, Dmitry A. Budnikov,
and Anton A. Sharko
Modeling of Aluminum Profile Extrusion Yield: Pre-cut Billet Sizes . . . 452
Jaramporn Hassamontr and Theera Leephaicharoen
Models for Forming Knowledge Databases for Decision Support
Systems for Recognizing Cyberattacks . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Valery Lakhno, Bakhytzhan Akhmetov, Moldyr Ydyryshbayeva,
Bohdan Bebeshko, Alona Desiatko, and Karyna Khorolska
Developing an Intelligent System for Recommending Products . . . . . . . 476
Md. Shariful Islam, Md. Shafiul Alam Forhad, Md. Ashraf Uddin,
Mohammad Shamsul Arefin, Syed Md. Galib, and Md. Akib Khan
Branch Cut and Free Algorithm for the General Linear
Integer Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
Elias Munapo
Resilience in Healthcare Supply Chains . . . . . . . . . . . . . . . . . . . . . . . . . 506
Jose Antonio Marmolejo-Saucedo
and Mariana Scarlett Hartmann-González
A Comprehensive Evaluation of Environmental Projects Through
a Multiparadigm Modeling Approach . . . . . . . . . . . . . . . . . . . . . . . . . . 520
Roman Rodriguez-Aguilar, Luz María Adriana Reyes Ortega,
and Jose-Antonio Marmolejo-Saucedo
Plant Leaf Disease Recognition Using Histogram Based Gradient
Boosting Classifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 530
Syed Md. Minhaz Hossain and Kaushik Deb
Exploring the Machine Learning Algorithms to Find the Best
Features for Predicting the Breast Cancer and Its Recurrence . . . . . . . 546
Anika Islam Aishwarja, Nusrat Jahan Eva, Shakira Mushtary,
Zarin Tasnim, Nafiz Imtiaz Khan, and Muhammad Nazrul Islam
Exploring the Machine Learning Algorithms to Find the Best
Features for Predicting the Risk of Cardiovascular Diseases . . . . . . . . . 559
Mostafa Mohiuddin Jalal, Zarin Tasnim, and Muhammad Nazrul Islam
Searching Process Using Boyer Moore Algorithm in Digital Library . . . 570
Laet Laet Lin and Myat Thuzar Soe
Application of Machine Learning and Artificial Intelligence Technology

Gender Classification from Inertial Sensor-Based Gait Dataset . . . . . . . 583
Refat Khan Pathan, Mohammad Amaz Uddin, Nazmun Nahar,
Ferdous Ara, Mohammad Shahadat Hossain, and Karl Andersson
Lévy-Flight Intensified Current Search for Multimodal
Function Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 597
Wattanawong Romsai, Prarot Leeart, and Auttarat Nawikavatan
Cancer Cell Segmentation Based on Unsupervised Clustering
and Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607
Juel Sikder, Utpol Kanti Das, and A. M. Shahed Anwar
Automated Student Attendance Monitoring System Using
Face Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 621
Bakul Chandra Roy, Imran Hossen, Md. Golam Rashed, and Dipankar Das
Machine Learning Approach to Predict the Second-Life Capacity
of Discarded EV Batteries for Microgrid Applications . . . . . . . . . . . . . . 633
Ankit Bhatt, Weerakorn Ongsakul, and Nimal Madhu
Classification of Cultural Heritage Mosque of Bangladesh Using
CNN and Keras Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
Mohammad Amir Saadat, Mohammad Shahadat Hossain, Rezaul Karim,
and Rashed Mustafa
Classification of Orthopedic Patients Using Supervised Machine
Learning Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659
Nasrin Jahan, Rashed Mustafa, Rezaul Karim,
and Mohammad Shahadat Hossain
Long Short-Term Memory Networks for Driver Drowsiness
and Stress Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670
Kwok Tai Chui, Mingbo Zhao, and Brij B. Gupta
Optimal Generation Mix of Hybrid Renewable Energy System
Employing Hybrid Optimization Algorithm . . . . . . . . . . . . . . . . . . . . . . 681
Md. Arif Hossain, Saad Mohammad Abdullah, Ashik Ahmed,
Quazi Nafees Ul Islam, and S. R. Tito
Activity Identification from Natural Images Using Deep CNN . . . . . . . . 693
Md. Anwar Hossain and Mirza A. F. M. Rashidul Hasan
Learning Success Prediction Model for Early Age Children Using
Educational Games and Advanced Data Analytics . . . . . . . . . . . . . . . . . 708
Antonio Tolic, Leo Mrsic, and Hrvoje Jerkovic
Advanced Analytics Techniques for Customer Activation and Retention in Online Retail . . . . . . . . . . . . . . . . . . . . . . . . . . 720
Igor Matic, Leo Mrsic, and Joachim Keppler
An Approach for Detecting Pneumonia from Chest X-Ray Image
Using Convolution Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . 735
Susmita Kar, Nasim Akhtar, and Mostafijur Rahman
An Analytical Study of Influencing Factors on Consumers’
Behaviors in Facebook Using ANN and RF . . . . . . . . . . . . . . . . . . . . . . 744
Shahadat Hossain, Md. Manzurul Hasan, and Tanvir Hossain
Autism Spectrum Disorder Prognosis Using Machine Learning
Algorithms: A Comparative Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 754
Oishi Jyoti, Nazmin Islam, Md. Omaer Faruq, Md. Abu Ismail Siddique,
and Md. Habibur Rahaman
Multidimensional Failure Analysis Based on Data Fusion
from Various Sources Using TextMining Techniques . . . . . . . . . . . . . . . 766
Maria Stachowiak, Artur Skoczylas, Paweł Stefaniak, and Paweł Śliwiński
Road Quality Classification Adaptive to Vehicle Speed Based
on Driving Data from Heavy Duty Mining Vehicles . . . . . . . . . . . . . . . . 777
Artur Skoczylas, Paweł Stefaniak, Sergii Anufriiev,
and Bartosz Jachnik
Fabric Defect Detection System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 788
Tanjim Mahmud, Juel Sikder, Rana Jyoti Chakma,
and Jannat Fardoush
Alzheimer’s Disease Detection Using CNN Based on Effective
Dimensionality Reduction Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . 801
Abu Saleh Musa Miah, Md. Mamunur Rashid, Md. Redwanur Rahman,
Md. Tofayel Hossain, Md. Shahidujjaman Sujon, Nafisa Nawal,
Mohammad Hasan, and Jungpil Shin
An Analytical Intelligence Model to Discontinue Products
in a Transnational Company . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812
Gabriel Loy-García, Román Rodríguez-Aguilar,
and Jose-Antonio Marmolejo-Saucedo
Graph Neural Networks in Cheminformatics . . . . . . . . . . . . . . . . . . . . . 823
H. N. Tran Tran, J. Joshua Thomas,
Nurul Hashimah Ahamed Hassain Malim, Abdalla M. Ali,
and Son Bach Huynh
Academic and Uncertainty Attributes in Predicting
Student Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 838
Abdalla M. Ali, J. Joshua Thomas, and Gomesh Nair
Captivating Profitable Applications of Artificial Intelligence in Agriculture Management . . . . . . . . . . . . . . . . . . . . . . . . . . 848
R. Sivarethinamohan, D. Yuvaraj, S. Shanmuga Priya, and S. Sujatha

Holistic IoT, Deep Learning and Information Technology


Mosquito Classification Using Convolutional Neural Network
with Data Augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 865
Mehenika Akter, Mohammad Shahadat Hossain, Tawsin Uddin Ahmed,
and Karl Andersson
Recommendation System for E-commerce Using Alternating Least
Squares (ALS) on Apache Spark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 880
Subasish Gosh, Nazmun Nahar, Mohammad Abdul Wahab,
Munmun Biswas, Mohammad Shahadat Hossain, and Karl Andersson
An Interactive Computer System with Gesture-Based Mouse
and Keyboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 894
Dipankar Gupta, Emam Hossain, Mohammed Sazzad Hossain,
Mohammad Shahadat Hossain, and Karl Andersson
Surface Water Quality Assessment and Determination of Drinking
Water Quality Index by Adopting Multi Criteria Decision
Analysis Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 907
Deepjyoti Deb, Mrinmoy Majumder, Tilottama Chakraborty,
Prachi D. Khobragade, and Khakachang Tripura
An Approach for Multi-human Pose Recognition and Classification
Using Multiclass SVM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 922
Sheikh Md. Razibul Hasan Raj, Sultana Jahan Mukta,
Tapan Kumar Godder, and Md. Zahidul Islam
Privacy Violation Issues in Re-publication of Modification Datasets . . . 938
Noppamas Riyana, Surapon Riyana, Srikul Nanthachumphu,
Suphannika Sittisung, and Dussadee Duangban
Using Non-straight Line Updates in Shuffled Frog Leaping Algorithm . . . 954
Kanchana Daoden and Trasapong Thaiupathump
Efficient Advertisement Slogan Detection and Classification Using
a Hierarchical BERT and BiLSTM-BERT Ensemble Model . . . . . . . . . 964
Md. Akib Zabed Khan, Saif Mahmud Parvez, Md. Mahbubur Rahman,
and Md Musfique Anwar
Chronic Kidney Disease (CKD) Prediction Using Data
Mining Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 976
Abhijit Pathak, Most. Asma Gani, Abrar Hossain Tasin,
Sanjida Nusrat Sania, Md. Adil, and Suraiya Akter
Multi-classification of Brain Tumor Images Based on Hybrid Feature Extraction Method . . . . . . . . . . . . . . . . . . . . . . . . . . 989
Khaleda Akhter Sathi and Md. Saiful Islam
An Evolutionary Population Census Application Through
Mobile Crowdsourcing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1000
Ismail Hossain Mukul, Mohammad Hasan, and Md. Zahid Hassan
IoT-Enabled Lifelogging Architecture Model to Leverage
Healthcare Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1011
Saika Zaman, Ahmed Imteaj, Muhammad Kamal Hossen,
and Mohammad Shamsul Arefin
An Improved Boolean Load Matrix-Based Frequent Pattern Mining . . . 1026
Shaishab Roy, Mohammad Nasim Akhtar, and Mostafijur Rahman
Exploring CTC Based End-To-End Techniques for Myanmar
Speech Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1038
Khin Me Me Chit and Laet Laet Lin
IoT Based Bidirectional Speed Control and Monitoring of Single
Phase Induction Motors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1047
Ataur Rahman, Mohammad Rubaiyat Tanvir Hossain,
and Md. Saifullah Siddiquee
Missing Image Data Reconstruction Based on Least-Squares
Approach with Randomized SVD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1059
Siriwan Intawichai and Saifon Chaturantabut
An Automated Candidate Selection System Using Bangla
Language Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1071
Md. Moinul Islam, Farzana Yasmin, Mohammad Shamsul Arefin,
Zaber Al Hassan Ayon, and Rony Chowdhury Ripan
AutoMove: An End-to-End Deep Learning System
for Self-driving Vehicles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1082
Sriram Ramasamy and J. Joshua Thomas
An Efficient Machine Learning-Based Decision-Level Fusion Model
to Predict Cardiovascular Disease . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1097
Hafsa Binte Kibria and Abdul Matin
Towards POS Tagging Methods for Bengali Language:
A Comparative Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1111
Fatima Jahara, Adrita Barua, MD. Asif Iqbal, Avishek Das, Omar Sharif,
Mohammed Moshiul Hoque, and Iqbal H. Sarker
BEmoD: Development of Bengali Emotion Dataset for Classifying Expressions of Emotion in Texts . . . . . . . . . . . . . . . . . . . . . . 1124
Avishek Das, MD. Asif Iqbal, Omar Sharif,
and Mohammed Moshiul Hoque

Advances in Engineering and Technology


Study of the Distribution Uniformity Coefficient of Microwave Field
of 6 Sources in the Area of Microwave-Convective Impact . . . . . . . . . . 1139
Dmitry Budnikov, Alexey N. Vasilyev, and Alexey A. Vasilyev
Floor-Mounted Heating of Piglets with the Use of Thermoelectricity . . . 1146
Dmitry Tikhomirov, Stanislav Trunov, Alexey Kuzmichev,
Sergey Rastimeshin, and Victoria Ukhanova
The Rationale for Using Improved Flame Cultivator
for Weed Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1156
Mavludin Abdulgalimov, Fakhretdin Magomedov, Izzet Melikov,
Sergey Senkevich, Hasan Dogeev, Shamil Minatullaev, Batyr Dzhaparov,
and Aleksandr Prilukov
The Lighting Plan: From a Sector-Specific Urbanistic Instrument
to an Opportunity of Enhancement of the Urban Space
for Improving Quality of Life . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1168
Cinzia B. Bellone and Riccardo Ottavi
PID Controller Design for BLDC Motor Speed Control System
by Lévy-Flight Intensified Current Search . . . . . . . . . . . . . . . . . . . . . . . 1176
Prarot Leeart, Wattanawong Romsai, and Auttarat Nawikavatan
Intellectualized Control System of Technological Processes of an
Experimental Biogas Plant with Improved System for Preliminary
Preparation of Initial Waste . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1186
Andrey Kovalev, Dmitriy Kovalev, Vladimir Panchenko,
Valeriy Kharchenko, and Pandian Vasant
Way for Intensifying the Process of Anaerobic Bioconversion
by Preliminary Hydrolysis and Increasing Solid Retention Time . . . . . . 1195
Andrey Kovalev, Dmitriy Kovalev, Vladimir Panchenko,
Valeriy Kharchenko, and Pandian Vasant
Evaluation of Technical Damage Caused by Failures
of Electric Motors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1204
Anton Nekrasov, Alexey Nekrasov, and Vladimir Panchenko
Development of a Prototype Dry Heat Sterilizer for
Pharmaceuticals Industry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1213
Md. Raju Ahmed, Md. Niaz Marshed, and Ashish Kumar Karmaker
Optimization of Parameters of Pre-sowing Seed Treatment in Magnetic Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1222
Volodymyr Kozyrsky, Vitaliy Savchenko, Oleksandr Sinyavsky,
Andriy Nesvidomin, and Vasyl Bunko
Development of a Fast Response Combustion Performance
Monitoring, Prediction, and Optimization Tool for Power Plants . . . . . 1232
Mohammad Nurizat Rahman, Noor Akma Watie Binti Mohd Noor,
Ahmad Zulazlan Shah b. Zulkifli, and Mohd Shiraz Aris
Industry 4.0 Approaches for Supply Chains Facing COVID-19:
A Brief Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1242
Samuel Reong, Hui-Ming Wee, Yu-Lin Hsiao, and Chin Yee Whah
Ontological Aspects of Developing Robust Control Systems
for Technological Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1252
Nataliia Lutskaya, Lidiia Vlasenko, Nataliia Zaiets, and Volodimir Shtepa
A New Initial Basis for Solving the Blending Problem Without
Using Artificial Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1262
Chinchet Boonmalert, Aua-aree Boonperm, and Wutiphol Sintunavarat
Review of the Information that is Previously Needed to Include
Traceability in a Global Supply Chain . . . . . . . . . . . . . . . . . . . . . . . . . . 1272
Zayra M. Reyna Guevara, Jania A. Saucedo Martínez,
and José A. Marmolejo
Online Technology: Effective Contributor to Academic Writing . . . . . . 1281
Md. Hafiz Iqbal, Md Masumur Rahaman, Tanusree Debi,
and Mohammad Shamsul Arefin
A Secured Electronic Voting System Using Blockchain . . . . . . . . . . . . . 1295
Md. Rashadur Rahman, Md. Billal Hossain, Mohammad Shamsul Arefin,
and Mohammad Ibrahim Khan
Preconditions for Optimizing Primary Milk Processing . . . . . . . . . . . . . 1310
Gennady N. Samarin, Alexander A. Kudryavtsev,
Alexander G. Khristenko, Dmitry N. Ignatenko, and Egor A. Krishtanov
Optimization of Compost Production Technology . . . . . . . . . . . . . . . . . 1319
Gennady N. Samarin, Irina V. Kokunova, Alexey N. Vasilyev,
Alexander A. Kudryavtsev, and Dmitry A. Normov

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1329


About the Editors

Pandian Vasant is a senior lecturer at Universiti Teknologi Petronas, Malaysia, and editor-in-chief of the International Journal of Energy Optimization and Engineering (IJEOE). He holds a PhD in Computational Intelligence (UNEM, Costa Rica), an MSc in Engineering Mathematics (University Malaysia Sabah, Malaysia), and a BSc (Hons, Second Class Upper) in Mathematics (University of Malaya, Malaysia). His research interests include soft computing, hybrid optimization, innovative computing, and applications. He has co-authored research articles in journals, conference proceedings, and book chapters (257 publications indexed in SCOPUS), served as guest editor of special issues, and was General Chair of the EAI International Conference on Computer Science and Engineering in Penang, Malaysia (2016), and Bangkok, Thailand (2018). In 2009 and 2015, Dr. Pandian Vasant was awarded top reviewer and outstanding reviewer for the journal Applied Soft Computing (Elsevier). He has 30 years of working experience at universities. Currently, he is editor-in-chief of the International Journal of Energy Optimization and Engineering and a member of AMS (USA), the NAVY Research Group (TUO, Czech Republic), and the MERLIN Research Group (TDTU, Vietnam). H-index (Google Scholar) = 33; i10-index = 107.

Ivan Zelinka is currently working at the Technical University of Ostrava (VSB-TU), Faculty of Electrical Engineering and Computer Science. He graduated successively from the Technical University in Brno (1995 – MSc.), UTB in Zlin (2001 – PhD), again from the Technical University in Brno (2004 – assoc. prof.), and VSB-TU (2010 – professor). Before his academic career, he was employed as a TELECOM technician, a computer specialist (HW+SW), and a computer and LAN supervisor at a commercial bank. During his career at UTB, he proposed and opened seven different lecture courses. He has also been invited to lecture at numerous universities in different EU countries and served as keynote speaker at the Global Conference on Power, Control and Optimization in Bali, Indonesia (2009), the Interdisciplinary Symposium on Complex Systems (2011) in Halkidiki, Greece, and IWCFTA 2012 in Dalian, China. His field of expertise is mainly unconventional algorithms and cybersecurity. He has been the responsible supervisor of three grants of fundamental
research of the Czech grant agency GAČR and co-supervisor of the grant FRVŠ – Laboratory of Parallel Computing. He has also worked on numerous grants and two EU projects, as a team member (FP5 – RESTORM) and as supervisor of the Czech team (FP7 – PROMOEVO), and supervised international research (funded by the TACR agency) focused on the security of mobile devices (Czech – Vietnam).
Currently, he is a professor at the Department of Computer Science and has supervised, in total, more than 40 MSc. and 25 Bc. diploma theses. Ivan Zelinka also supervises doctoral students, including students from abroad. He received the Siemens Award for his PhD thesis, as well as an award from the journal Software News for his book on artificial intelligence. Ivan Zelinka is a member of the British Computer Society, editor-in-chief of the Springer book series Emergence, Complexity and Computation (http://www.springer.com/series/10624), a member of the editorial board of Saint Petersburg State University Studies in Mathematics, and a member of several international program committees of various conferences and international journals. He is the author of journal articles as well as books in Czech and English, and one of the three founders of the IEEE Technical Committee on Big Data (http://ieeesmc.org/about-smcs/history/2014-archives/44-about-smcs/history/2014/technical-committees/204-big-data-computing/). He is also head of the research group NAVY (http://navy.cs.vsb.cz).

G.-W. Weber is a professor at Poznan University of Technology, Poznan, Poland, at the Faculty of Engineering Management, Chair of Marketing and Economic Engineering. His research is on OR, financial mathematics, optimization and control, neuro- and bio-sciences, data mining, education, and development; he is involved in the organization of scientific life internationally. He received his Diploma and Doctorate in mathematics and economics/business administration at RWTH Aachen, and his Habilitation at TU Darmstadt. He held professorships by proxy at the University of Cologne and at TU Chemnitz, Germany. At IAM, METU, Ankara, Turkey, he was a professor in the programs of Financial Mathematics and Scientific Computing and Assistant to the Director, and he has been a member of further graduate schools, institutes, and departments of METU. Further, he has affiliations at the universities of Siegen, Ballarat, Aveiro, and North Sumatra, and at Malaysia University of Technology, and he is “Advisor to EURO Conferences”.
Sustainable Clean Energy System
Adaptive Neuro-Fuzzy Inference Based
Modeling of Wind Energy Harvesting System
for Remote Areas

Tigilu Mitiku1 and Mukhdeep Singh Manshahia2

1 Department of Mathematics, Bule Hora University, Bule Hora, Ethiopia
tigilu2004@gmail.com
2 Department of Mathematics, Punjabi University Patiala, Patiala, Punjab, India
mukhdeep@gmail.com

Abstract. The wind speed has a great impact on the overall performance of the wind energy harvesting system. Due to the variable nature of wind speed, the system is controlled to work only in a specified range of wind speeds to protect both the generator and the turbine from damage. This article presents an adaptive neuro-fuzzy inference system based control scheme for operation of the system between the cut-in and rated wind speeds. By controlling the generator speed to its optimum value, the generator power and speed fluctuations can be reduced. The Matlab/Simulink tool is used for the simulation and analysis of the system. The obtained results indicate that the adaptive neuro-fuzzy inference system is an effective method to control the rotor speed.

Keywords: Wind energy harvesting system · Adaptive neuro-fuzzy inference system · Wind speed

1 Introduction

Wind energy technology has shown rapid growth among renewable energy sources in most parts of the world due to the depletion of fossil fuel reserves, rising pollution levels, and worrying changes in the global climate caused by conventional energy sources [1]. According to the Global Wind Energy Council (GWEC) report, around 50 GW of newly installed wind power was added in 2018, slightly less than in 2017, bringing the total global wind power generation capacity to 597 GW [2]. The report indicates that 2018 was the second year in a row with a growing number of new installations for energy generation.
A Wind Energy Harvesting System (WEHS) can operate in either fixed or variable speed mode. In fixed speed wind turbines, the generator rotates at the almost constant speed and frequency for which it is designed, regardless of the variation in wind speed [1]. Since the turbine is forced to operate at constant speed, it should be extremely robust to withstand the mechanical stress created by the fluctuation of wind speeds. On the other hand, in variable speed wind turbines, the rotor of the generator is allowed to rotate freely at any speed over a wide range of wind speeds. The generator is directly
connected to the grid in fixed speed operation, whereas it is connected through power electronic equipment in a variable speed system [3].
Thus, it is possible to control the rotor speed by means of power electronics to maintain the optimum tip speed ratio at all times under fluctuating wind speeds and so produce maximum power. Different types of AC generators can be used with the modern wind turbine; researchers classify them into synchronous and asynchronous (induction) generators. Examples are the Squirrel-Cage Rotor Induction Generator, Wound-Rotor Induction Generator, Doubly-Fed Induction Generator, Synchronous Generator (with external field excitation), and Permanent Magnet Synchronous Generator [3]. The Variable Speed Wind Turbine with Permanent Magnet Synchronous Generator (VSWT-PMSG) is an attractive choice for villages, settlements, and remote areas located very far from the grid [4]. A VSWT-PMSG can operate close to the optimal speed using maximum power point tracking (MPPT) at various wind speeds.
This paper has seven sections: the first section contains a general introduction to WEHS, and the purpose of this work is given in Sect. 2. Section 3 reviews the literature on control methods applied to WEHS. Section 4 describes the details of the WEHS. Section 5 describes the MPPT controller used in our work. The model of the system, simulation results, and discussion are given in Sect. 6, before the conclusion and future scope, which form the final part of this paper.

2 Motivation and Objective of Research

The control system is used in a WEHS to produce maximum power and supply the load with a constant voltage and frequency under changes in wind speed and load. WEHS are controlled to work only in a specified range of wind speeds limited by the cut-in and cut-out speeds. Outside these limits, the turbine should be stopped to protect both the generator and the turbine from damage [5, 6]. The recent advancements in power electronics and control strategies have made it possible to regulate the voltage of the PMSG in many different ways. The main factor that affects the performance of the system is the variation in wind speed, which affects power system stability and power quality. On the machine side, a torque/speed controller is applied to improve the performance of the system at variable wind speed between the cut-in and rated speeds. The grid-side inverter is also controlled to keep the DC-link voltage at a constant value and the current injected into the grid at unity power factor, to achieve maximum power delivery to the grid as the desired operating condition. Conventional control methods such as proportional-integral (PI) control have been proposed by different researchers for the control of wind energy systems [7]. However, these methods need exact mathematical knowledge of the dynamic system, which is often difficult to derive for complex systems. Nowadays, to overcome this problem, researchers tend to use intelligent control methods such as fuzzy logic control (FLC), neural networks (NN), the adaptive neuro-fuzzy inference system (ANFIS), and genetic algorithms (GA) to handle wind speed fluctuations.
In this paper, an ANFIS controller is proposed to control the rotor speed of the WEHS. The objective of the ANFIS controller is to maximize energy production and ensure a continuous supply of energy to the grid by regulating the turbine speed in such a way that the optimal tip speed ratio is maintained [8]. Modelling and simulation of the system are
developed using the Matlab/Simulink tool to enhance the performance of the system
with the proposed speed controller. The VSWT-PMSG system is equipped with AC/DC/AC power electronic converters through which the active and reactive power produced by the PMSG can be controlled. The present research analyzes the model of a variable speed wind turbine equipped with a 1.5 MW PMSG.
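In control terms, the speed regulation objective above reduces to tracking a rotor speed reference proportional to the wind speed, ω_ref = λ_opt V_w / R, which follows from the tip speed ratio definition in Eq. (5) below. A minimal Python sketch of this reference law; λ_opt and the blade radius R are illustrative placeholders, not values from the paper:

```python
LAMBDA_OPT = 8.1   # optimal tip speed ratio (illustrative placeholder)
R = 35.0           # blade radius in m (illustrative placeholder)

def rotor_speed_reference(v_wind):
    """Rotor speed reference (rad/s) that keeps the tip speed ratio at its optimum."""
    return LAMBDA_OPT * v_wind / R

# Example: references for wind speeds between cut-in and rated speed (m/s)
for v in (4.0, 8.0, 12.0):
    print(f"v = {v} m/s -> omega_ref = {rotor_speed_reference(v):.2f} rad/s")
```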

3 Related Work

Studies related to the modelling of WEHS using adaptive MPPT networks are limited due to the complexity of the system and the advancement of the control techniques. Researchers have applied different MPPT techniques for control applications. Boobalan et al.
[9] proposed a Fuzzy-PI control strategy to model a WEHS using a PMSG to provide optimum power to the grid. The proposed vector control provides the desired voltage at the output of the generator-side converter so as to control the generator speed. With the vector control scheme, the magnitude, phase, and frequency of the generator currents are controlled, giving acceptable results in steady state under variation in wind speed. However, if the wind speed changes significantly from time to time, this method may require a long searching time to locate the maximum power point. Sarvi
et al. [10] proposed a maximum power point tracking control algorithm based on particle swarm optimization (PSO) and fuzzy logic control for a variable speed PMSG-based WEHS. The PSO algorithm is used to determine the optimum rotor speed of the turbine generator and its maximum power for different wind speeds, whereas the FLC is used to adjust the duty cycle of the boost converter. Ahmed et al. [11] applied a fuzzy-PID controller based MPPT to track the maximum power available from the WEHS and obtain an AC voltage with constant amplitude and frequency. The error between the actual rotor speed of the PMSG and the estimated rotor speed, which depends on the DC current and voltage at the terminals of the boost converter and the parameters of the PMSG, is used to obtain maximum power. The rotor speed error is given as input to the fuzzy-PID controller, which controls the duty cycle of the pulse width modulation (PWM) generator, whose output is connected to the boost converter. Wafa Torki et al. [12] applied a vector control strategy using a PI controller to model and control a direct-drive WEHS with PMSG. The actual speed of the generator is compared to its reference value, which is obtained by an MPPT method using optimal tip speed ratio control, giving the reference current. Marmouh et al. [13] applied two controllers to produce the max-
imum active power of a WEHS equipped with a PMSG whose stator is connected to the grid through a back-to-back AC-DC-AC converter. The first is an FLC-based MPPT algorithm for stator-side control, using hysteresis control and an optimal generator speed reference estimated from different wind speeds for the generation of maximum active power; the second FLC was applied to the grid-side control to keep the DC voltage between the two converters smooth at its reference value. Aamer Bilal Asghar et al. [5] proposed a hybrid intelligent learning based ANFIS for online estimation of the effective wind speed from instantaneous values of the wind turbine TSR, rotor speed, and mechanical power. The estimated wind speed is used to design the optimal rotor speed estimator for MPPT of a VSWT. Gencer [14] demonstrated a fuzzy logic control system for a variable speed WEHS to investigate the power flow efficiency of the
PMSG. A fuzzy logic controller was structured for MPPT to drive the WEHS at the optimum speed corresponding to maximum power at any wind speed. Kesraoui et al. [15] examined FLC of the aerodynamic power of a variable-speed wind turbine at high wind speeds, to manage the excess power generated during high winds, based on a PMSG connected to the grid through a back-to-back power converter. The pitch angle was additionally controlled using fuzzy logic to limit the aerodynamic power and the power on the DC bus voltage. Nadoushan et al. [16] presented optimal torque control (OTC) of a stand-alone variable-speed small-scale wind turbine equipped with a PMSG and a switch-mode rectifier. In that work, the control method is used for a variable speed WEHS to extract optimal power from the wind, keep the voltage at 400 V with 50 Hz frequency, and produce maximum power output. The simulation model of the system was developed using MATLAB/SIMULINK.

4 Wind Energy Harvesting System

The output voltage of the generator is not directly usable due to variations in amplitude and frequency caused by the fluctuation of wind speeds throughout the day. The system considered in this paper consists of a 1.5 MW wind turbine and a PMSG connected to an AC/DC converter and a DC/AC converter modelled by voltage sources. The AC/DC converter converts the AC voltage with variable amplitude and frequency at the generator side to a DC voltage at the DC link. The DC-link voltage should be constant for direct use, for storage, and for conversion from DC to AC by the inverter. The obtained DC voltage is then converted back to an AC voltage with constant amplitude and frequency at the load side for electrical utilization. The back-to-back PWM converter based power electronic interface is a suitable option for wind-power applications with PMSG.

4.1 Modelling of Wind Turbine


The mechanical power captured by the wind turbine is given by [17–19] (see Fig. 1):

$$P_m = \tfrac{1}{2}\, C_p(\lambda, \beta)\, \rho\, \pi R^2\, V_w^3 \qquad (1)$$
The mechanical torque generated from the wind by the turbine, which is transferred through the shaft to the rotor of the generator, is given by
$$T_m = \frac{C_p(\lambda, \beta)\, \rho\, \pi R^2\, V_w^3}{2\, \omega_m} \qquad (2)$$

where ρ is the density of air, A is the area swept by the blades, V_w is the wind speed, C_p is the power extraction efficiency coefficient of the wind turbine, β is the pitch angle of the blade, and λ is the tip speed ratio.

Fig. 1. PMSG based wind turbine model [21, 24]

The power coefficient C_p is a non-linear function of the tip speed ratio λ, which depends on the wind velocity and the rotation speed of the shaft, ω_m in rad/s, and is given by [20–22]
 
$$C_p(\lambda, \beta) = 0.5716\left(\frac{116}{\lambda_i} - 0.4\beta - 5\right) e^{-21/\lambda_i} + 0.0068 \qquad (3)$$

with

$$\frac{1}{\lambda_i} = \frac{1}{\lambda + 0.008\beta} - \frac{0.035}{\beta^2 + 1} \qquad (4)$$

The tip speed ratio λ is defined by

$$\lambda = \frac{R\,\omega_m}{V_w} = \frac{R\, n\, \pi}{30\, V_w} \qquad (5)$$

where R is the blade radius and n is the wind turbine rotor speed in revolutions per minute (rpm). Since there is no gearbox, the shaft speed of the rotor and the mechanical generator speed are the same; likewise, the mechanical torque transferred to the generator is equal to the aerodynamic torque. The maximum value of C_p is 0.59, which means that the power extracted from the wind is at all times less than 59% (Betz's limit); this is because of the various aerodynamic losses, which depend on the rotor construction [3]. For a VSWT the pitch angle is nearly 0; therefore, at β = 0° the maximum power coefficient is 0.4412.

To maintain the tip speed ratio at its optimum value, ω_m must change with the wind speed. Therefore, to extract the maximum power from the wind, the tip speed ratio should be kept at its optimum value at any wind speed [23]. Hence, the wind turbine produces maximum power when it operates at the optimum value of C_p, denoted C_p-opt, so it is necessary to adjust the rotor speed to the optimum value of the tip speed ratio λ_opt, as shown in Fig. 2 below.

Fig. 2. C_p(λ, β) characteristic for different values of the pitch angle β [10]
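To make Eqs. (1)–(5) concrete, here is a minimal Python sketch of the aerodynamic model (the paper itself works in Matlab/Simulink). The air density, blade radius, and the coefficient set of the reconstructed Eqs. (3)–(4) are illustrative assumptions; the authors' exact parameterization, which yields C_p,max = 0.4412, may differ:

```python
import numpy as np

RHO = 1.225   # air density rho in kg/m^3 (standard value; not stated in the paper)
R = 35.0      # blade radius in m (illustrative placeholder)

def cp(lam, beta):
    """Power coefficient Cp(lambda, beta) per the reconstructed Eqs. (3) and (4)."""
    inv_li = 1.0 / (lam + 0.008 * beta) - 0.035 / (beta ** 2 + 1.0)
    return 0.5716 * (116.0 * inv_li - 0.4 * beta - 5.0) * np.exp(-21.0 * inv_li) + 0.0068

def mech_power(omega_m, v_w, beta=0.0):
    """Mechanical power Pm of Eq. (1) for shaft speed omega_m (rad/s) and wind v_w (m/s)."""
    lam = R * omega_m / v_w                     # tip speed ratio, Eq. (5)
    return 0.5 * cp(lam, beta) * RHO * np.pi * R ** 2 * v_w ** 3

# Grid search for the optimal tip speed ratio at beta = 0.
lam = np.linspace(1.0, 15.0, 2000)
cp_curve = cp(lam, 0.0)
lam_opt = lam[np.argmax(cp_curve)]
print(f"lambda_opt ~ {lam_opt:.2f}, Cp(lambda_opt) ~ {cp_curve.max():.4f}")
print(f"Pm at 12 m/s and optimal shaft speed: {mech_power(lam_opt * 12.0 / R, 12.0):.0f} W")
```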

4.2 Modeling of the PMSG


For the dynamic modeling of the PMSG, the magnetomotive force (mmf) is considered to be sinusoidally distributed, and the hysteresis and eddy current effects are neglected. Transforming the three-phase stator voltages into the d-q reference frame aligned with the electrical rotor position θ, the stator voltage equations of the salient pole PMSG are given by Eqs. (6) and (7) [21, 22].

$$\frac{d}{dt}\, i_d = \frac{1}{L_d} V_d - \frac{R_s}{L_d}\, i_d + \frac{L_q}{L_d}\, P \omega_m\, i_q \qquad (6)$$

$$\frac{d}{dt}\, i_q = \frac{1}{L_q} V_q - \frac{R_s}{L_q}\, i_q - P \omega_m \left( \frac{L_d}{L_q}\, i_d + \frac{1}{L_q}\, \psi_f \right) \qquad (7)$$

where L_q and L_d are the q- and d-axis components of the stator inductance of the generator, respectively, R_s is the resistance of the stator windings, i_q, i_d and v_q, v_d are the q- and d-axis components of the stator current and voltage, respectively, ψ_f is the flux linkage induced by the permanent magnet, and P is the number of pole pairs.

The generator produces an electrical torque, and the difference between the
mechanical torque and the electrical torque determines whether the mechanical system
accelerates, decelerates, or remains at constant speed. The electric torque produced by
the generator is given by [21, 25, 26],

$$T_e = \frac{3}{2}\, P \left[ \psi_f\, i_q + \left( L_d - L_q \right) i_d\, i_q \right] \qquad (8)$$

For a surface-mounted PMSG, L_q = L_d and Eq. (8) becomes

$$T_e = \frac{3}{2}\, P\, \psi_f\, i_q \qquad (9)$$

The active and reactive powers of the PMSG in steady state are given by

$$P_s = V_d\, i_d + V_q\, i_q \qquad (10)$$

$$Q_s = V_q\, i_d - V_d\, i_q \qquad (11)$$

Since the wind turbine and generator shafts are directly coupled without a gearbox, there is only one state variable. The mechanical equation of the PMSG and wind turbine is given by

$$\frac{d\omega_m}{dt} = \frac{1}{J} \left( T_e - T_m - f\, \omega_m \right) \qquad (12)$$

$$\omega_e = P\, \omega_m, \qquad \frac{d\theta}{dt} = \omega_m \qquad (13)$$

where T_e is the electromagnetic torque of the generator in Nm, J is the inertia of the rotor and generator in kg·m², f is the coefficient of viscous friction, which can be neglected in a small-scale wind turbine, ω_e is the electrical rotational speed of the rotor, and θ is the rotor (electrical) angle, which is required for the abc ↔ d-q transformation [21].
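As a companion to Eqs. (6)–(13), the sketch below evaluates the PMSG state derivatives numerically. All machine parameters and the operating point are illustrative placeholders (the data of the paper's 1.5 MW machine are not given in this excerpt), and the sign convention of Eq. (12) is kept exactly as printed:

```python
import numpy as np

# Illustrative PMSG parameters; placeholders only, not values from the paper.
RS, LD, LQ = 0.01, 0.003, 0.003   # stator resistance (ohm), d- and q-axis inductances (H)
PSI_F, P_PAIRS = 1.2, 4           # PM flux linkage (Wb), number of pole pairs
J, F_VISC = 100.0, 0.0            # rotor + generator inertia (kg*m^2), viscous friction

def pmsg_derivatives(x, v_d, v_q, t_m):
    """Time derivatives of x = [i_d, i_q, omega_m, theta] per Eqs. (6), (7), (12), (13)."""
    i_d, i_q, w_m, _theta = x
    di_d = (v_d - RS * i_d + P_PAIRS * w_m * LQ * i_q) / LD            # Eq. (6)
    di_q = (v_q - RS * i_q - P_PAIRS * w_m * (LD * i_d + PSI_F)) / LQ  # Eq. (7)
    t_e = 1.5 * P_PAIRS * (PSI_F * i_q + (LD - LQ) * i_d * i_q)        # Eq. (8)
    dw_m = (t_e - t_m - F_VISC * w_m) / J                              # Eq. (12)
    return np.array([di_d, di_q, dw_m, w_m])                           # d(theta)/dt = omega_m, Eq. (13)

# One evaluation at an arbitrary operating point (toy numbers).
x0 = np.array([0.0, 10.0, 20.0, 0.0])   # i_d = 0 A, i_q = 10 A, omega_m = 20 rad/s, theta = 0
print(pmsg_derivatives(x0, v_d=0.0, v_q=50.0, t_m=100.0))
```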

5 Adaptive Neuro-Fuzzy Inference System

Currently, fuzzy logic control plays a significant role in the development and design of many real-time control applications [4]. FLC consists of three important functional blocks: fuzzification, which assigns fuzzy variables to the crisp data using membership functions; the inference engine, which creates fuzzy rules by mapping from input to output through the use of its knowledge base; and defuzzification, which, as the reverse process of fuzzification, provides the final output to the plant to be controlled [18]. Different adaptive techniques can easily be implemented in it to increase the performance of the network. The ability of the network depends on the quality of the signals used for training and on the performance of the training algorithms and their parameters. ANFIS is one of the neuro-fuzzy networks that gives the best results in the control of wind energy systems. ANFIS is a fuzzy inference system (FIS) whose membership functions
and rule base are appropriately tuned (adjusted) by an ANN. It takes advantage of both
the learning capability of ANNs and the human-knowledge-based decision-making power
of a FIS. It uses a combination of least-squares estimation and backpropagation for
membership function parameter estimation.

Fig. 3. (a) A first-order Sugeno fuzzy model with two inputs and two rules; (b) equivalent ANFIS
architecture [16]
The Sugeno fuzzy logic model and the corresponding ANFIS architecture are
shown in Fig. 3.
Assume the FIS under consideration has two inputs x, y and one output f_i as shown
in Fig. 3 above. A square node indicates an adaptive node, whereas a circle node indicates
a fixed (non-adaptive) node. For a first-order Sugeno fuzzy model, a common rule set
with two fuzzy if-then rules is:

Rule 1: If x is A_1 and y is B_1, then f_1 = p_1 x + q_1 y + r_1;  (14)

Rule 2: If x is A_2 and y is B_2, then f_2 = p_2 x + q_2 y + r_2;  (15)

Here O_i^n denotes the output of the i-th node (neuron) in layer n; p_i, q_i, and r_i are the
consequent parameters of the first-order polynomial, which are updated during the
learning process in the forward pass by the least-squares method [27].

Layer 1. Every node i in this layer is an adaptive node with a node function

$$O_i^1 = \mu_{A_i}(x) \ \text{for} \ i = 1, 2 \quad \text{and} \quad O_i^1 = \mu_{B_{i-2}}(y) \ \text{for} \ i = 3, 4 \qquad (16)$$

where A_i is the linguistic label (such as big, very small, large, etc.) associated with this node
function. μ_{A_i}(x) is the membership function of A_i, and it specifies the degree to which
the given x satisfies the quantifier A_i. It is usually chosen to be a triangular or bell-shaped
membership function. A bell-shaped membership function with maximum equal to 1 and
minimum equal to 0, namely the generalized bell-shaped function, is selected for our
study.

Layer 2. Every node in this layer is a non-adaptive node labeled Π, which multiplies the
incoming signals and sends the product to the next layer. For instance,

$$O_i^2 = w_i = \mu_{A_i}(x) \cdot \mu_{B_i}(y), \quad i = 1, 2 \qquad (17)$$

The obtained result represents the firing strength of a rule. Other T-norm operators
that perform generalized AND can be used as the node function in this layer.

Layer 3. The nodes in this layer are non-adaptive, labeled by a circle node N, and
compute the normalized firing strength, i.e., the ratio of the firing strength of a rule to
the sum of the firing strengths of all rules. They compute the activation level of each
rule as

$$O_i^3 = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \qquad (18)$$

For convenience, outputs of this layer will be called normalized firing strengths.
Layer 4. Every node i in this layer is an adaptive square node, which multiplies the
normalized firing strength of a rule with the corresponding first-order polynomial,
producing a crisp output:

$$O_i^4 = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i) \qquad (19)$$

where \bar{w}_i is the output of layer 3, and p_i, q_i, and r_i are known as the consequent
parameters. This layer is called the defuzzification layer.

Layer 5. The single node in this layer is a fixed node labeled Σ that computes the
overall output as the summation of all incoming signals, transforming the fuzzy
classification results into a single crisp output; it is called the output layer, i.e.,

$$O_i^5 = f = \sum_i \bar{w}_i f_i = \frac{w_1}{w_1 + w_2} f_1 + \frac{w_2}{w_1 + w_2} f_2 \qquad (20)$$

Therefore, when the values of the premise parameters in layer 1 are fixed, the overall
output can be expressed as a linear combination of the consequent parameters in layer 4.
An adaptive network functionally equivalent to a first-order Sugeno fuzzy model is
constructed in this way. The ANFIS learning (training) algorithm adjusts all the
tunable parameters by comparing the ANFIS output with the training data. Each
training epoch of the network is divided into two phases. In the first phase (forward pass),
functional signals go forward up to layer 4 and the consequent parameters are adjusted
with the least-squares method. In the second phase (backward pass), the error rates
propagate backwards and the premise parameters are updated with the gradient descent
(backpropagation) method [28]. If these parameters are fixed, the ANFIS output is
expressed as the summation of all incoming signals producing a single crisp output.
Thus, the combination of gradient descent and least-squares methods can efficiently find
optimal values for the consequent parameters p_i, q_i, and r_i.
The ANFIS-based MPPT controller computes the optimum speed for the maximum power
point using information on the magnitude and direction of the change in power output due
to the change in command speed.
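
To make the layer structure concrete, the following minimal NumPy sketch (an illustration, not the authors' implementation) performs one forward pass of the two-rule, first-order Sugeno ANFIS of Fig. 3 with generalized bell membership functions. In training, the consequent parameters (p_i, q_i, r_i) would be fitted by least squares in the forward pass and the premise parameters (a, b, c) updated by backpropagation, as described above; only inference is shown here, and all parameter values are invented.

```python
import numpy as np

def gbell(x, a, b, c):
    """Generalized bell membership function (layer 1 node function)."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def anfis_forward(x, y, prem, conseq):
    """One forward pass of the two-rule first-order Sugeno ANFIS of Fig. 3.
    prem: (a, b, c) triples for A1, A2, B1, B2; conseq: (p, q, r) per rule."""
    A1, A2, B1, B2 = prem
    muA = np.array([gbell(x, *A1), gbell(x, *A2)])   # Layer 1: memberships
    muB = np.array([gbell(y, *B1), gbell(y, *B2)])
    w = muA * muB                                    # Layer 2: firing strengths, Eq. (17)
    wn = w / w.sum()                                 # Layer 3: normalization, Eq. (18)
    f = np.array([p * x + q * y + r for (p, q, r) in conseq])  # rule outputs
    return np.dot(wn, f)                             # Layers 4-5: Eqs. (19)-(20)

prem = [(1.0, 2.0, -1.0), (1.0, 2.0, 1.0),
        (1.0, 2.0, -1.0), (1.0, 2.0, 1.0)]    # illustrative premise parameters
conseq = [(0.5, 0.3, 0.1), (-0.2, 0.8, 0.0)]  # illustrative consequent parameters
print(anfis_forward(0.4, -0.2, prem, conseq))
```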

6 Simulation Results and Discussion

The generator-side converter is controlled to capture the maximum power available from the
wind. According to Eq. (9), in order to obtain the maximum electromagnetic torque
T_e with minimum current, this study controls only the q-axis current i_qs under the
assumption that the d-axis current i_ds = 0. Moreover, according to [21, 29], to produce
maximum power, the optimum value of the rotation speed is adjusted using a fuzzy logic
control technique. The relation between the blade angular velocity reference ω_ref and the wind
speed V_w for constant R and λ_opt is given by

$$\omega_{mref} = \frac{\lambda_{opt} V_w}{R} \qquad (21)$$
First, the wind speed is estimated by the proposed ANFIS-based MPPT controller
to generate the reference speed ω_mref for the speed control loop of the rotor-side
converter control, which tracks the maximum power points of the system by dynamically
changing the turbine torque so as to operate at λ_opt, as shown in Fig. 4. The PI controller
drives the actual rotor speed to the desired value by varying the switching ratio of the
PWM inverter. In the speed control loop, the actual speed of the generator is compared
to its reference value ω_ref obtained from the ANFIS-based MPPT control with λ_opt as
defined above. The speed controller then outputs the reference q-axis current i_qref.
The control target of the inverter is the output power delivered to the load [30, 31].
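
A compact sketch of this control chain is given below, with the ANFIS wind-speed estimator stubbed out by a known wind speed so that Eq. (21) can be applied directly; λ_opt, R and the PI gains are assumed values for illustration only.

```python
# Minimal sketch of the speed loop of Fig. 4 (assumed gains, not the paper's):
# the ANFIS estimator block is replaced by a known wind speed for brevity.
lam_opt, R = 8.1, 35.0          # assumed optimum tip speed ratio, rotor radius [m]

def w_ref(v_wind):              # Eq. (21): reference rotor speed
    return lam_opt * v_wind / R

class PI:
    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt, self.acc = kp, ki, dt, 0.0
    def step(self, err):        # returns the q-axis current reference iq_ref
        self.acc += self.ki * err * self.dt
        return self.kp * err + self.acc

pi = PI(kp=50.0, ki=400.0, dt=1e-3)
wm = 1.8                                    # measured rotor speed [rad/s]
iq_ref = pi.step(w_ref(9.0) - wm)           # id_ref is held at 0
print(f"w_ref = {w_ref(9.0):.3f} rad/s, iq_ref = {iq_ref:.1f} A")
```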

Fig. 4. Speed control block diagram

The block diagram of the ANFIS-based MPPT controller module is shown in
Fig. 5. The inputs to the ANFIS network are the mechanical power and the speed of the
turbine. The network estimates the effective wind speed, which is used to find the
reference speed of the rotor.
Table 1 shows the parameters of the wind turbine and generator used for the model.
Fig. 5. ANFIS-based MPPT control module of turbine rotor speed.

The simulated generator output voltage at an average wind speed of 12 m/s is
given in Fig. 6 below. The voltage is close to the rated voltage of the generator, and the
speed fluctuations are reduced. The obtained voltage is purely sinusoidal.

Fig. 6. Inverter output voltage

The advantages of ANFIS over the two parts of this hybrid system are as follows. ANFIS uses
the neural network's ability to classify data and find patterns. It then develops a fuzzy
expert system that is more transparent to the user and also less likely to produce
memorization errors than a neural network. ANFIS removes (or at least reduces) the need
for an expert. Furthermore, ANFIS has the ability to divide the data into groups and adapt
these groups to arrange the best membership functions that cluster the data and
deduce the desired output with a minimum number of epochs [32]. The learning mechanism
fine-tunes the underlying fuzzy inference system. Using a given input/output data set,
ANFIS constructs a fuzzy inference system (FIS) whose membership function
parameters are tuned (adjusted) using either the backpropagation algorithm alone or in
combination with a least-squares type of method. This allows the fuzzy system to
learn from the data [33, 34]. However, the restrictions of ANFIS are that only the
Sugeno-type decision method is available, there can be only one output, and the
defuzzification method is the weighted mean value. ANFIS can replace the anemometer
in a small wind energy system and reduce the size of the turbine as well as its cost.
Moreover, it increases the power production of the system. Sensorless estimation with
the help of an ANFIS network is a very good option for small and medium wind energy
generation systems [35, 36].

Table 1. Parameters of wind turbine and PMSG

Wind turbine                          PMSG
Rated power          1.5 MW           Rated power             1.5 MW
Cut-in wind speed    3 m/s            Rated rotational speed  17.3 rpm
Rated wind speed     11 m/s           Number of pole pairs P  44
Cut-out wind speed   22 m/s           Frequency               50 Hz
                                      Rated voltage           200 V

6.1 Limitations of the Research

The results presented in this paper have some limitations. The data set used in this
study was collected from a 1.5 MW variable-speed PMSG-based wind turbine. It would
have been preferable to validate the results in a laboratory. However, the actual system is
very expensive as well as large in size and is not usually available in laboratories, except
those that are highly equipped and solely dedicated to wind energy research. The only
option is to collect data samples from an operational wind turbine system and then use the
collected data to design the estimation and control mechanism.

7 Conclusion and Future Scope

As wind speed changes throughout the day, variable-speed wind power generation
is useful for optimizing the power output of a wind energy harvesting system using
MPPT methods. This paper presented the modeling and simulation of a variable-speed
PMSG-based system using an ANFIS network. The network is applied to control the speed
of the rotor, adjusting it to the value that gives maximum power output. Results have shown
that the voltage is close to the rated voltage of the generator and the speed fluctuations are
reduced. Future work is to control the pitch angle for speeds above the rated speed so as
to produce the rated power.
Acknowledgments. The authors acknowledge Punjabi University Patiala for providing the
internet facilities and the necessary library resources.

References
1. Zhang, J., Xu, S.: Application of fuzzy logic control for grid-connected wind energy
conversion system. In: Dadios, E.P. (ed.) Fuzzy Logic-Tool for Getting Accurate Solutions.
pp. 50–77. IntechOpen (2015) DOI: https://doi.org/10.5772/59923
2. WWEA: Wind power capacity worldwide reaches 597 GW, 50.1 GW added in 2018. World
Wind Energy Association, Brazil (2019)
3. Husain, M.A., Tariq, A.: Modeling and study of a standalone PMSG wind generation system
using MATLAB/SIMULINK. Univ. J. Electr. Electron. Eng. 2(7), 270–277 (2014)
4. Ali, A., Moussa, A., Abdelatif, K., Eissa, M., Wasfy, S., Malik, O.P.: ANFIS based
controller for rectifier of PMSG wind energy conversion system energy conversion system.
In: Proceedings of Electrical Power and Energy Conference (EPEC), pp. 99–103. IEEE,
Calgary (2014)
5. Asghar, A.B., Liu, X.: Adaptive neuro-fuzzy algorithm to estimate effective wind speed and
optimal rotor speed for variable-speed wind turbine. Neurocomputing 272, 495–504 (2017)
6. Pindoriya, R.M., Usman, A., Rajpurohit, B.S., Srivastava, K.N.: PMSG based wind energy
generation system: energy maximization and its control. In: 7th International Conference on
Power Systems (ICPS), pp. 376–381, IEEE, Pune (2017)
7. Khaing, T.Z., Kyin, L.Z.: Control scheme of stand-alone wind power supply system with
battery energy storage system. Int. J. Electr. Electron. Data Commun. 3(2), 19–25 (2015)
8. El-Tamaly, H.H., Nassef, A.Y.: Tip speed ratio and pitch angle control based on ANN for
putting variable speed WTG on MPP. In: 18th International Middle-East Power Systems
Conference (MEPCON), pp. 625–632, IEEE Power & Energy Society, Cairo (2016)
9. Boobalan, M., Vijayalakshmi, S., Brindha, R.: A fuzzy-PI based power control of wind
energy conversion system PMSG. In: International Conference on Energy Efficient
Technologies for Sustainability, pp. 577–583, IEEE, Nagercoil (2013)
10. Sarvi, M., Abdi, S., Ahmadi, S.: A new method for rapid maximum power point tracking of
PMSG wind generator using PSO_fuzzy logic. Tech. J. Eng. Appl. Sci. 3(17), 1984–1995
(2013)
11. Ahmed, O.A., Ahmed, A.A.: Control of Wind Turbine for variable speed based on fuzzy-
PID controller. J. Eng. Comput. Sci. (JECS) 18(1), 40–51 (2017)
12. Torki, W., Grouz, F., Sbita, L.: Vector control of a PMSG direct-drive wind turbine. In:
International Conference on Green Energy Conversion Systems (GECS), pp. 1–6. IEEE,
Hammamet (2017)
13. Marmouh, S., Boutoubat, M., Mokrani, L.: MPPT fuzzy logic controller of a wind energy
conversion system based on a PMSG. In: 8th International Conference on Modelling,
Identification and Control, Algiers, pp. 296–302 (2016)
14. Gencer, A.: Modelling of operation PMSG based on fuzzy logic control under different load
conditions. In: 10th International Symposium on Advanced Topics in Electrical Engineering
(ATEE), pp. 736–739, IEEE, Bucharest (2017)
15. Kesraoui, M., Lagraf, S.A., Chaib, A.: Aerodynamic power control of wind turbine using
fuzzy logic. In: 3rd International Renewable and Sustainable Energy Conference (IRSEC),
Algeria, pp. 1–6 (2015)
16. Nadoushan, M.H.J., Akhbari, M.: Optimal torque control of PMSG-based stand-alone wind
turbine with energy storage system. J. Electr. Power Energy Convers. Syst. 1(2), 52–59
(2016)
17. Aymen, J., Ons, Z., Nejib, M.M.: Performance assessment of a wind turbine with variable
speed wind using artificial neural network and neuro-fuzzy controllers. Int. J. Syst. Appl.
Eng. Dev. 11(3), 167–172 (2017)
18. Sahoo, S., Subudhi, B., Panda, G.: Torque and pitch angle control of a wind turbine using
multiple adaptive neuro-fuzzy control. Wind Eng. 44(2), 125–141 (2019)
19. Mitiku, T., Manshahia, M.S.: Fuzzy logic controller for modeling of wind energy harvesting
system for remote areas. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.) Intelligent
Computing and Optimization, ICO 2019. Advances in Intelligent Systems and Computing,
vol. 1072. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-33585-4_4
20. Heidari, M.: Maximum wind energy extraction by using neural network estimation and
predictive control of boost converter. Int. J. Ind. Electron. Control Optim. 1(2), 115–120
(2018)
21. Slah, H., Mehdi, D., Lassaad, S.: Advanced control of a PMSG wind turbine. Int. J. Mod.
Nonlinear Theory Appl. 5, 1–10 (2016)
22. Medjber, A., Guessoumb, A., Belmili, H., Mellit, A.: New neural network and fuzzy logic
controllers to monitor maximum power for wind energy conversion system. Energy 106,
137–146 (2016)
23. Sahu, S., Panda, G., Yadav, S.P.: Dynamic modelling and control of PMSG based stand-
alone wind energy conversion system. In: Recent Advances on Engineering, Technology and
Computational Sciences (RAETCS), pp. 1–6, IEEE, Allahabad (2018)
24. Jain, A., Shankar, S., Vanitha, V.: Power generation using permanent magnet synchronous
generator (PMSG) based variable speed wind energy conversion system (WECS): an
overview. J. Green Eng. 7(4), 477–504 (2018)
25. Elbeji, O., Mouna, B.H., Lassaad, S.: Modeling and control of a variable speed wind turbine.
In: The Fifth International Renewable Energy Congress, pp. 425–429. IEEE, Hammamet
(2014)
26. Sagiraju, D.K.V., Obulesu, Y.P., Choppavarapu, S.B.: Dynamic performance improvement
of standalone battery integrated PMSG wind energy system using proportional resonant
controller. Int. J. Eng. Sci. Technol. 20(4), 1–3 (2017)
27. Petkovic, D., Shamshirband, S.: Soft methodology selection of wind turbine parameters to
large affect. Electr. Power Energy Syst. 69, 98–103 (2015)
28. Oguz, Y., Guney, I.: Adaptive neuro-fuzzy inference system to improve the power quality of
variable-speed wind power generation system. Turk. J. Electr. Eng. Comput. Sci. 18(4),
625–645 (2010)
29. Farh, H.M., Eltamaly, A.M.: Fuzzy logic control of wind energy conversion system.
J. Renew. Sustain. Energy 5(2), 1–3 (2013)
30. Sekhar, V.: Modified fuzzy logic based control strategy for grid connected wind energy
conversion system. J. Green Eng. 6(4), 369–384 (2016)
31. Gupta, J., Kumar, A.: Fixed pitch wind turbine-based permanent magnet synchronous
machine model for wind energy conversion systems. J. Eng. Technol. 2(1), 52–62 (2012)
32. Thongam, J.S., Bouchard, P., Ezzaidi, H., Ouhrouche, M.: Artificial neural network-based
maximum power point tracking control for variable speed wind energy conversion systems.
In: 18th IEEE International Conference on Control Applications, Saint Petersburg (2009)
33. Vasant, P., Zelinka, I., Weber, G.W.: Intelligent Computing & Optimization. Conference
Proceedings ICO 2018. Springer, Cham (2018). ISBN 978-3-030-00978-6
34. Vasant, P., Zelinka, I., Weber, G.W.: Intelligent computing & optimization. In: Proceedings
of the 2nd International Conference on Intelligent Computing and Optimization. Springer
(2019). ISBN 978-3-030-33585-4
35. Mitiku, T., Manshahia, M.S.: Artificial neural networks based green energy harvesting for
smart world. In: Somani, A., Shekhawat, R., Mundra, A., Srivastava, S., Verma, V. (eds.)
Smart Systems and IoT: Innovations in Computing. Smart Innovation, Systems and
Technologies, vol. 141. Springer, Singapore. https://doi.org/10.1007/978-981-13-8406-6_4
36. Mitiku, T., Manshahia, M.S.: Fuzzy inference based green energy harvesting for smart
world. In: IEEE International Conference on Computational Intelligence and Computing
Research (ICCIC), pp. 1–4. IEEE, Madurai (2018)
Features of Distributed Energy Integration
in Agriculture

Alexander V. Vinogradov¹, Dmitry A. Tikhomirov¹, Vadim E. Bolshev¹(✉),
Alina V. Vinogradova¹, Nikolay S. Sorokin², Maksim V. Borodin²,
Vadim A. Chernishov³, Igor O. Golikov², and Alexey V. Bukreev¹

¹ Federal Scientific Agroengineering Center VIM,
1-st Institutsky proezd, 5, 109428 Moscow, Russia
winaleksandr@rambler.ru, alinawin@rambler.ru, tihda@mail.ru,
skiffdark@mail.ru, vadimbolshev@gmail.com
² Orel State Agrarian University named after N.V. Parahin,
Generala Rodina St., 69, 302019 Orel, Russia
sorokinnc@rambler.ru, montazar@rambler.ru, maksimka-borodin@yandex.ru
³ Orel State University named after I.S. Turgenev,
Komsomolskaya St., 95, 302026 Orel, Russia
blackseam78@mail.ru

Abstract. Agriculture is a guarantor of the sustainable development of the
state, as it supplies the population with basic necessities. Agricultural holdings
and similar agricultural associations are distributed agricultural enterprises, each
part of which requires a reliable supply of the necessary types of energy. The
article gives an understanding of agroholdings and of the problems that exist in
the implementation of distributed energy projects in agriculture. It also considers
options for the structure of small generation facilities using the example of biogas
plants, together with their ownership issues. Various promising options for the use of
individual and local small generation facilities in agroholdings are given, listing
facility types, the scope of their perspective application in the agroholding structure,
and the expected key results.

Keywords: Agricultural enterprises · Agricultural holdings · Distributed
generation · Energy consumption analysis · Energy resources · Waste recycling

1 Introduction

An agroholding is a group of legal entities engaged in agricultural activities and the sale
of agricultural products [1]. Thus, it is in fact a distributed agricultural enterprise that
can be engaged in different types of agricultural production in different territories. In
one holding company there can be units engaged in crop production, livestock raising
and agricultural product processing. The energy needs of an agricultural holding are
formed depending on the specialization of its divisions, production volumes and the
applied production technologies.

Usually, the sources of energy supply for all structural units of holdings are, for
electricity, the electric grids (on the balance of power grid companies) and, for heat,
their own boiler houses and boiler rooms (on the balance of heat supply companies)
or, in some cases, centralized heat networks.
The main problems of energy supply to agricultural holdings, as to all agricultural
enterprises, especially those with facilities in places far from developed infrastructure,
are:
• low reliability of electricity supply, including to heat supply facilities (boiler houses),
which leads to a decrease in the quality of products, underproduction, livestock losses
and interruptions of heat supply to agricultural processes when the electricity supply of
boiler rooms is interrupted [2–4];
• low energy efficiency of the boiler houses used, which increases the cost of the
holding's production;
• the lack of sustainable communication channels for organizing systems for
monitoring the activities of energy supply facilities and energy consumption, which
does not allow efficient monitoring of energy consumption by facilities [5–7];
• the lack of qualified specialists who can effectively organize interaction with power
sales and power grid companies. First of all, the low qualification of electrical
engineering and other engineering personnel makes it difficult to forecast consumption
and to meet the requirements of energy sales companies for using more favorable price
categories, which in turn does not allow agricultural enterprises within agroholdings to
reduce the cost of electricity and hence the cost of products [8].

2 Problems of Integration of Distributed Energy Projects in Agriculture

Features of distributed generation allow, in certain cases, solving a number of these
problems. Advanced options have been developed for installations producing energy
from renewable sources [9–11]. Small generation is an opportunity to increase
independence from centralized energy supply systems and a possibility for utilization
of enterprise waste, which can be used as raw material for small generation sources or
heat generators.
The need for distributed generation in agroholdings exists objectively and logically,
but its justification faces a number of constraining factors given below.
The problem of the lack of effective methods for assessing the potential of using
small generation facilities (SGF) in the structure of agricultural holdings covers a full
range of issues, from the availability of land resources for accommodation to the
availability of human resources, raw materials, etc. The existing methods are focused only
on the effects of replacing electricity and heat as well as on the effect of waste
processing, and do not explore all the necessary issues of the prospects of using small
generation facilities.
There is an insufficient supply of ready-made integrated solutions on the SGF
market, including the ownership variant where the SGF is on the balance of the relevant
companies. Basically, the proposals concern only the SGF itself and do not consider
its effective integration into the infrastructure of the enterprise taking into account its
features. Combined with the absence of a developed system of SGF operational
services, this makes it difficult to decide on the use of distributed generation.
Significant investments are needed in pre-project preparation of SGF implemen-
tation in agroholdings with unguaranteed validity of the effectiveness of SGF use.
Understanding this, enterprises are reluctant to carry out such work. In addition, a very
small number of examples of successful application of SGF in the agricultural sector
also “scares” agroholdings from their use.
There is a problem of consistency of legislation on the use of SGF in terms of
connecting them to general-use electrical networks and using them for power supply to
third-party consumers. The processes of connection and tariff determination are overly
bureaucratic. This hampers the effective use of SGF generating capacities: it becomes
necessary either to unreasonably oversize the generators or to use the SGF to cover only
part of the enterprise's capacity needs. The transmission of electricity to third-party
consumers requires the creation of the enterprise's own network and distribution company,
or involving them from outside, which increases the cost of the electricity sold.
It is necessary to harmonize legislation and regulatory documents regarding the
possibility of using renewable energy sources and installations for processing agricultural
waste.
There is instability in the supply of raw materials for use as fuel for SGF. This is
primarily the case for biogas plants, plants for the incineration of crop waste, the forest
industry, etc. The market dictates the need to adapt production to it; therefore, the
structure and volume of production wastes may change and become insufficient for the
SGF. The use of biomass specifically cultivated for energy needs as a raw material
requires justification, since it occupies farmland areas for these purposes.

3 Discussion and Results

It should be noted that the options for the SGF can vary greatly depending on the type,
quantity and quality of raw materials as well as on the needs of the customer. Consider the
options for the structure of the SGF using the example of biogas plants (BGP), one of
the most promising SGF options [12, 13].
Individual BGP. An individual BGP is designed to partially or fully meet the needs of
a small agricultural enterprise for electricity, fertilizer and cheap heat and, in some cases,
to provide complete energy autonomy. The structural diagram of an individual BGP is
presented in Fig. 1.

Fig. 1. Structural diagram of the use of individual BGP

At the same time, the property issue of the BGP has two solutions:
• the BGP is owned by the company and supplies it with: (a) gas and fertilizers;
(b) electricity, heat and fertilizers; or (c) gas, electricity, heat and fertilizers;
• the BGP is an independent enterprise buying raw materials and selling the same
products in any combination.

Local BGP. A local BGP is a biogas plant designed to fully or partially cover the required
capacities of several enterprises connected to a common energy network. The structural
diagram of a local BGP is presented in Fig. 2.
The end products of such a BGP are electricity, biogas and bio-fertilizers used by
agricultural enterprises for technological purposes.

Fig. 2. Structural diagram of the use of local BGP



At the same time, the property issue of the BGP has two solutions:
• the BGP is jointly owned by the enterprises and supplies them with products of biomass
processing in the combination they need, taking into account the characteristics of the
enterprises;
• the BGP is an independent enterprise that buys raw materials from the enterprises and
sells its products to them.

Network BGP. A network BGP is a biogas plant intended for the sale of energy resources
(gas/electricity/fertilizers/heat) to an external network. One or more agricultural enterprises
supply raw materials for the network BGP; the final product of such a BGP is gas
(electricity). The block diagram of the network BGP is presented in Fig. 3.

Fig. 3. Structural diagram of the use of network BGP

At the same time, the property issue of a network BGP has two solutions:
• the BGP is jointly owned by the enterprises and supplies its products to the network in
the form of electricity, heat, fertilizer or gas;
• the BGP is an independent enterprise purchasing raw materials from the enterprises and
selling its products to the appropriate networks. The BGP can be specialized and sell
only gas to the gas network, which significantly reduces the necessary equipment of the
BGP and, accordingly, the cost of its products. An auxiliary product in this case can be
fertilizer.
In the case of an agricultural holding, the BGP can also become one of its structures
aimed at supplying the units with the products they need.

As for the whole range of SGF, not only BGP, it is first of all necessary to
conduct research aimed at determining the scope of use of different types of SGF. For
example, if there are enterprises engaged in pasture cattle breeding in the structure of an
agricultural holding, it is rational to use mobile technological centers (shepherds'
homes on wheels, milking points, shearing points, etc.) equipped with solar batteries as an
energy source. Pumping installations on pastures, or stationary ones in areas of greenhouses
and livestock farms, can be equipped with both solar and wind power plants. There are
promising SGF options for enterprises located near a natural gas network, using gas to
generate heat and electricity. But all these options require careful scientific study with the
development of practical recommendations on the required SGF capacities and their choice
in different climatic zones and for different farm specializations, among other solutions.
Various promising options for the use of individual and local SGF in agroholdings
are listed in Table 1.

Table 1. Options for the use of individual and local SGF in agroholdings

Wind power plant
  Scope of perspective application in agroholding structure:
  • power supply and/or mechanization of pumping stations;
  • power supply of individual objects remote from the infrastructure.
  Expected key results:
  • reducing the cost of pumping water;
  • reducing the cost of power supply to individual objects (including by reducing the cost of connecting to the network), for example, at stationary points on remote pastures.

Biogas plant
  Scope of perspective application in agroholding structure:
  • gas supply to facilities of different purposes;
  • heating of livestock facilities located near buildings of various purposes;
  • recycling of enterprise waste;
  • in some cases, the power supply of individual objects.
  Expected key results:
  • reducing the cost of gas supply;
  • reducing heating costs;
  • reducing the cost of power supply to individual objects (including by reducing the cost of connecting to the networks);
  • reducing the cost of fertilizers;
  • effective use of fallow lands (when growing energy plants) or use of land for BGP raw materials in crop rotation;
  • reduction of fines for environmental impact.

Solar power plant
  Scope of perspective application in agroholding structure:
  • power supply to pumping stations;
  • power supply (electricity and heat) to individual objects remote from the infrastructure;
  • power supply to mobile technological points (housing for shepherds on wheels, mobile milking stations, etc.);
  • power supply to individual loads in enterprises (emergency lighting, back-up power for critical electrical receivers, etc.);
  • drying of grain, energy supply to storage facilities.
  Expected key results:
  • reducing the cost of pumping water;
  • reducing the cost of electricity supply to individual objects (including by reducing the cost of connecting to the network), for example, at stationary points on remote pastures;
  • improving the comfort of workplaces at mobile points;
  • reducing the cost of creating backup power lines;
  • reducing the cost of thermal energy.

Power plant on natural gas (gas turbine, gas piston, etc.)
  Scope of perspective application in agroholding structure:
  • heating of enterprise facilities located near buildings for various purposes;
  • power supply to the enterprise facilities;
  • the use of heat in technological processes.
  Expected key results:
  • reducing the cost of electricity supply to individual objects (including by reducing the cost of connecting to the network);
  • reducing the cost of thermal energy.

Heat generator on the enterprise waste
  Scope of perspective application in agroholding structure:
  • heating of livestock facilities;
  • the use of heat in technological processes.
  Expected key results:
  • reducing the cost of thermal energy.

In addition to the SGF types indicated in Table 1, there can be others, as well as other
effects of their use. The final assessment in any case should be carried out taking into
account the characteristics of a particular farm. At the country level, it is rational
to formulate a program to investigate the potential of SGF use in the structure of
agriculture. It should provide for the solution of legislative and regulatory issues as well
as technical solutions, solutions for the SGF service infrastructure, the organization of
advisory services, and other aspects of SGF use, for example, applying the principles of
intelligent electrical networks and microgrids described in [14–16] when creating
distributed generation projects.

4 Conclusions

1. Agricultural holdings are objectively interested in the implementation of distributed
generation projects provided that these projects deliver results, which consist in
reducing the cost of production.

2. For the implementation of distributed generation projects in agroholdings, it is
necessary to carry out pre-project preparation aimed at determining the potential use of
certain types of SGF. For this, it is necessary to create methodologies for assessing
this potential, covering all implementation issues from land acquisition to the
obtained effects, and to ensure the availability of an effective SGF service system.
3. It is rational to formulate a program to study the potential of SGF application in
the structure of agriculture, providing for the solution of legislative and regulatory
issues as well as technical solutions, decisions regarding the infrastructure of service
and maintenance of the SGF, the organization of the advisory service, and other
aspects of SGF application.

References
1. Epshtein, D., Hahlbrock, K., Wandel, J.: Why are agroholdings so pervasive in Russia’s
Belgorod oblast’? Evidence from case studies and farm-level data. Post-Communist Econ.
25(1), 59–81 (2013)
2. Vinogradov, A., Vinogradova, A., Bolshev, V., Psarev, A.I.: Sectionalizing and redundancy
of the 0.38 kV ring electrical network: mathematical modeling schematic solutions. Int.
J. Energy Optim. Eng. (IJEOE) 8(4), 15–38 (2019)
3. Vinogradov, A., Vasiliev, A., Bolshev, V., Vinogradova, A., Kudinova, T., Sorokin, N.,
Hruntovich, N.: Methods of reducing the power supply outage time of rural consumers. In:
Kharchenko, V., Vasant, P. (eds.) Renewable Energy and Power Supply Challenges for
Rural Regions, pp. 370–392. IGI Global (2019)
4. Tikhomirov, D.A., Kopylov, S.I.: An energy-efficient electric plant for hot steam and water
supply of agricultural enterprises. Russ. Electr. Eng. 89(7), 437–440 (2018)
5. Bolshev, V.E., Vinogradov, A.V.: Obzor zarubezhnyh istochnikov po infrastrukture
intellektual'nyh schyotchikov [Overview of foreign sources on the infrastructure of smart
meters]. Bull. South Ural State Univ. Ser. Energy 18(3), 5–13 (2018)
6. Sharma, K., Saini, L.M.: Performance analysis of smart metering for smart grid: an
overview. Renew. Sustain. Energy Rev. 49, 720–735 (2015)
7. Kabalci, Y.: A survey on smart metering and smart grid communication. Renew. Sustain.
Energy Rev. 57, 302–318 (2016)
8. Vinogradov, A.V., Anikutina, A.V.: Features in calculations for electric energy of
consumers with a maximum power over 670 kW and a computer program for selecting
the optimal price category [Osobennosti v raschetah za elektroenergiyu potrebitelej s
maksimal’noj moshchnost’yu svyshe 670 kVt i komp’yuternaya programma dlya vybora
optimal’noj cenovoj kategorii]. Innov. Agric. 2, 161–169 (2016)
9. Litti, Y., Kovalev, D., Kovalev, A., Katraeva, I., Russkova, Y., Nozhevnikova, A.:
Increasing the efficiency of organic waste conversion into biogas by mechanical pretreatment
in an electromagnetic mill. In: Journal of Physics: Conference Series, vol. 1111, no. 1 (2018)
10. Panchenko, V., Kharchenko, V., Vasant, P.: Modeling of solar photovoltaic thermal
modules. In: Vasant, P., Zelinka, I., Weber, GW. (eds.) Intelligent Computing &
Optimization. ICO 2018. Advances in Intelligent Systems and Computing, vol. 866.
pp. 108–116. Springer, Cham (2019)
11. Daus, Y., Kharchenko, V.V., Yudaev, I.V.: Managing Spatial Orientation of Photovoltaic
Module to Obtain the Maximum of Electric Power Generation at Preset Point of Time. Appl.
Solar Energy 54(6), 400–405 (2018)

12. Gladchenko, M.A., Kovalev, D.A., Kovalev, A.A., Litti, Y.V., Nozhevnikova, A.N.:
Methane production by anaerobic digestion of organic waste from vegetable processing
facilities. Appl. Biochem. Microbiol. 53(2), 242–249 (2017)
13. Kovalev, A., Kovalev, D., Panchenko, V., Kharchenko, V., Vasant, P.: System of
optimization of the combustion process of biogas for the biogas plant heat supply. In:
Vasant, P., Zelinka, I., Weber, G.W. (eds.) Intelligent Computing and Optimization. ICO
2019. Advances in Intelligent Systems and Computing, vol. 1072. Springer, Cham (2019)
14. Kharchenko, V., Gusarov, V., Bolshev, V.: Reliable electricity generation in RES-based
microgrids. In: Alhelou, H.H., Hayek, G. (eds.) Handbook of Research on Smart Power
System Operation and Control, pp. 162–187. IGI Global (2019)
15. Vinogradov, A., Bolshev, V., Vinogradova, A., Kudinova, T., Borodin, M., Selesneva, A.,
Sorokin, N.: A system for monitoring the number and duration of power outages and power
quality in 0.38 kV electrical networks. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.)
Intelligent Computing & Optimization. ICO 2018. Advances in Intelligent Systems and
Computing, vol. 866, p. 10. Springer, Cham (2019)
16. Vinogradov, A., Vasiliev, A., Bolshev, V., Semenov, A., Borodin, M.: Time factor for
determination of power supply system efficiency of rural consumers. In: Kharchenko, V.,
Vasant, P. (eds.) Handbook of Research on Renewable Energy and Electric Resources for
Sustainable Rural Development, pp. 394–420. IGI Global (2018)
Concept of Multi-contact Switching System

Alexander V. Vinogradov¹, Dmitry A. Tikhomirov¹, Alina V. Vinogradova¹,
Alexander A. Lansberg², Nikolay S. Sorokin², Roman P. Belikov²,
Vadim E. Bolshev¹(✉), Igor O. Golikov², and Maksim V. Borodin²

¹ Federal Scientific Agroengineering Center VIM,
1-st Institutsky proezd, 5, 109428 Moscow, Russia
winaleksandr@rambler.ru, alinawin@rambler.ru, tihda@mail.ru,
vadimbolshev@gmail.com
² Orel State Agrarian University named after N.V. Parahin,
Generala Rodina St., 69, 302019 Orel, Russia
thegreatlansberg@mail.ru, sorokinnc@rambler.ru, montazar@rambler.ru,
el-ogau@yandex.ru, maksimka-borodin@yandex.ru

Abstract. The paper proposes a new approach to the construction of intelligent
electrical networks. It is based on the use of multi-contact switching systems:
switching devices having three or more contact groups, each of which is
controlled independently. The paper also states the main provisions of this
approach. An example is given of a distribution electrical network built on
multi-contact switching systems and containing a transformer substation,
several renewable energy sources and various types of switching equipment.
The basic functions of the network are listed, based on the processing of the
received data and the interaction of network equipment with the systems of
monitoring, control, accounting and management.

Keywords: Agricultural enterprises · Agricultural holdings · Distributed
generation · Energy consumption analysis · Energy resources · Waste
recycling · Smart electric networks · Multi-contact switching systems ·
Sectionalizing · Redundancy

1 Introduction

At present, the development of "smart power grids" is a global trend related to the fact
that existing 0.4 kV distribution power grids are characterized by low reliability,
significant energy losses and insufficiently high power quality. The main reasons for this
are the excessive length of power lines, the insufficient degree of network automation,
and the construction of networks using radial schemes, which does not allow for
backup power supply to consumers. The most effective way to increase the power
supply system reliability is to connect distributed generation, for example, biogas
plants, solar panels, etc. [1–3], enabling these networks to work as microgrids [4]. The
joint work of distributed generation sources within a microgrid requires complex
automation, i.e., equipping the electric network with intelligent devices that analyze
the operating modes and automatically reconfigure the network to
localize the place of damage and restore power to consumers connected to undamaged
network sections.
Distributed automation of 0.4 kV electric networks should take into account that
these networks can have a large number of power lines outgoing from the main
transmission lines and that small generation sources can be connected directly to
0.4 kV networks. In accordance with this, a concept for the development of electric
networks using multi-contact switching systems (MCSS) is proposed in this paper.

2 MCSS

A new approach to the construction of intelligent electrical networks is proposed, whose
feature is the use of multi-contact switching systems (MCSS) in these networks. MCSS
are switching devices having three or more contact groups, each of which is controlled
independently [5, 6]. The use of MCSS in electrical networks makes it possible to change
the network configuration automatically when the situation changes or on the instructions
of the operator. To this end, the MCSS are equipped with monitoring, accounting and
control devices that allow data exchange with a single network information center, which
in turn makes it possible to implement the principles of SMART GRID.
The developed concept of intelligent electrical networks contains the following
main provisions:
• Application of distribution electric networks containing multi-contact switching
systems as defined above, equipped with monitoring, accounting and control devices
that exchange data with a single network information center (a toy software model of
an MCSS is sketched after this list).
• Equipping electrical networks with means of redundancy and partitioning.
• Equipping electrical networks and power line systems [7] with:
– The control system of network equipment, designed to monitor the actual state
of network elements such as switching devices and power elements. The system
is based on SCADA systems and can be combined with the monitoring and
accounting systems.
– The monitoring system performs the functions of: monitoring the technical
condition of all network elements (the state of power lines and their elements, for
example supports, the overgrowing of transmission line corridors, the state of
equipment insulation, etc.); monitoring the network operation modes and those of its
individual elements (loading, energy flow direction, etc.); monitoring the
fulfillment of contractual obligations of consumers and energy supplying
companies (power grid, retail, generating, etc.); monitoring the reliability of power
supply in terms of the number of power outages and damage to network ele-
ments; monitoring the power quality; monitoring the energy efficiency of the
network (energy losses, other energy efficiency indicators); monitoring other
parameters of the network and the overall power supply system depending on
the capabilities. The monitoring system can be combined with control and
accounting systems.
– The accounting system performs the functions of: metering the amount of
electricity; adjusting the electricity cost depending on its quality or on the reliability
indicators of the power supply (for example, if the contractual terms for
reliability are violated, the cost decreases); recording the number and duration
of power supply interruptions; recording the operations of the network
equipment; and accounting for other data. The accounting system can be
combined with the control and monitoring systems.
– The control system manages all network elements depending on the network
operating modes, the specified switching requirements and the data received from
the control, monitoring and accounting systems, in order to improve the energy
efficiency of the network, increase the power supply reliability, etc.
• Creation of database systems and information processing comprising:
– Databases of consumers connected to the network with their characteristics,
parameters of operating modes;
– Database of equipment installed on the network with the characteristics and
parameters of the modes of operation;
– Database of power supply reliability indicating the characteristics of power
supply interruptions, equipment damageability;
– Database of power quality with an indication of the quality parameters at dif-
ferent points of the network;
– Database of technical connections with an indication of the connection char-
acteristics, terms of implementation, etc.
The information collected in the database should serve as the basis for making
forecasts of electricity consumption, accident rates, etc. It is also used by control
systems that change the network configuration in various modes, switching, shutting
down equipment, etc.
• Ensuring the possibility of using various energy sources including renewable ones
(RES) and energy storage devices both in parallel with centralized systems and
without being switched on for parallel operation.
• Providing the ability to automatically change the network scheme in the event of
various situations.
• The possibility of organizing microgrids with different energy sources, operating in
parallel or in isolation from each other for different consumers.
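
To make the idea concrete, below is the toy Python model referred to in the list above (an illustration, not taken from the paper): an MCSS is represented as a device with three or more independently controlled contact groups, and its combined contact state is reported as a compact situation code in the spirit of the coding approach of [5]. All names and the coding format are invented.

```python
from dataclasses import dataclass, field

@dataclass
class MCSS:
    """Toy multi-contact switching system: >= 3 contact groups, each
    controlled independently of the others."""
    name: str
    contacts: dict = field(default_factory=dict)   # group id -> closed?

    def switch(self, group: str, closed: bool):
        """Independent control of a single contact group."""
        self.contacts[group] = closed

    def state_code(self) -> str:
        """Compact situation code of the contact states (cf. [5])."""
        return "".join("1" if v else "0" for v in self.contacts.values())

mcss = MCSS("MCSS-1", {"line_A": True, "line_B": True, "res_tie": False})
# A fault on line_A: isolate it and pick up the load through the RES tie.
mcss.switch("line_A", False)
mcss.switch("res_tie", True)
print(mcss.name, mcss.state_code())   # -> MCSS-1 011
```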
Figure 1 shows an example of a distribution smart grid containing a transformer
substation (TS), several renewable energy sources (RES) and various types of switching
equipment. The action of each device installed in the grid should be supervised by the
monitoring, accounting and control systems. Each unit must be remotely
controlled. Communication channels in the network can be organized with different data
transfer technologies (GPS, GPRS, radio frequencies, PLC modems, etc. [8–10]).
The following monitoring, accounting and control systems are installed at
transformer substations and renewable energy sources:
• systems (sets of equipment) for determining the places of damage in power
transmission lines and transformer substations (DDP) [11];
• systems (sets of equipment) for regulating the power quality indices (including
means of adaptive automatic voltage regulation, for example [12]) (RPQI);
• systems (sets of equipment) of substation automation (automatic load transfer,
automatic reclosing, automatic frequency load shedding, relay protection, etc.)
(SAU, substation automation unit) [13];
• systems (sets of equipment) of automatic reactive power compensation (regulation)
(RPC) [14, 15];
• advanced metering infrastructure (AMI). The AMI of the TS takes into account the
operation of the automation systems of the TS, the operating modes and the transformer
loading, and carries out electricity metering for each outgoing line and at the inputs
of the transformer on the high and low sides [16–18];
• other systems as they are developed.
Information from all specified and prospective systems should be transmitted to
information processing and control units (IPCU) having a communication channel between
themselves as well as to a supervisory control center (SCC; it can be the dispatching office
of the electric grid company). This allows remote monitoring of the TS and RES and, if
necessary, remote control.
The inputs of all consumers are equipped with an AMI system allowing to determine
the values of electric power losses and interruptions in power supply, monitor the operation
of protective equipment and automation of the consumer network, detect unauthorized
electrical equipment (for example, a welding transformer), detect unauthorized
connection of generators without compliance with safety rules, etc. In addition, the consumer
AMI allows adjusting the electricity cost depending on its quality, on the number and
duration of interruptions in power supply and on other factors [19]. It must also support
prepayment for electricity and other promising functions.
Power lines, regardless of their design (cable or overhead), are equipped with remote
monitoring systems for technical condition (insulation condition, overgrowth, inclination
of supports, etc.) transmitting data on the state of power transmission line elements
to the IPCU and, respectively, to the SCC.
In the event of an emergency on one of the transmission line sections, automatic
control of the network switching equipment is based on the data received from the
control, monitoring and accounting systems. That is, the network configuration changes
by switching the contacts of the MCSS, sectionalizing points (SP), universal sectionalizing
points (USP) and points of automatic switching on the reserve (ASR) in such a way
as to isolate the damaged area and provide power to consumers in the backed-up areas.
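
The sketch below illustrates this reconfiguration rule on an invented three-section radial feeder: the switch feeding the faulted section is opened, and the ASR tie that restores the section left downstream is closed. Topology, switch names and the restoration rule are simplified assumptions, not the paper's algorithm.

```python
# Hedged illustration of fault isolation and back-feeding on a radial feeder:
# TS -> S1 -> S2 -> S3, where S3 also has a backup tie point (ASR2).
feeder = ["S1", "S2", "S3"]                               # radial order from TS
section_switch = {"S1": "QF1", "S2": "QF2", "S3": "QF3"}  # feeding switch per section
tie_for = {"S3": "ASR2"}                                  # sections with a backup tie
closed = {"QF1": True, "QF2": True, "QF3": True, "ASR2": False}

def isolate_and_restore(faulted: str):
    idx = feeder.index(faulted)
    closed[section_switch[faulted]] = False   # isolate the damaged section
    for s in feeder[idx + 1:]:                # sections left without normal supply
        if s in tie_for:
            closed[tie_for[s]] = True         # back-feed via the ASR tie

isolate_and_restore("S2")
print(closed)  # {'QF1': True, 'QF2': False, 'QF3': True, 'ASR2': True}
```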
The example shown in Fig. 1 does not exhaust all the possibilities of using the
presented systems depending on the required reliability parameters. The principles of
construction of the network and the types of switching devices presented in the Fig. 1
can also be used in networks of other voltage classes.

Fig. 1. An example of the use of multi-contact switching systems for the development of
microgrids containing renewable energy sources.

3 Discussion

The intelligent electrical network based on the use of multi-contact switching
systems allows performing the following basic functions:
• To implement switching algorithms and automatic commands for switching in
networks (to perform network configuration changes, highlighting damaged areas,
etc.);
• To "see" electricity losses in real time, ranked into technological and commercial
losses, and, for power transformers, to "see" the no-load and short-circuit losses;
• To develop recommendations for reducing losses, optimal network configuration;
• To automatically change the electricity cost depending on its quality, power supply
interruption time and other parameters;
• To develop recommendations for changing the technological connection conditions
(change in capacity, change in the scheme, etc.);
• To obtain the calculated parameters of power lines and equipment, build mathe-
matical models of the network;
• To receive aggregate load schedules (daily, monthly, annual, etc.);
• To develop recommendations on the volumes and parameters of electricity storage,
modes of operation of balancing power plants;
• To obtain reliability indicators of the network and equipment installed in it with an
analysis of the most/least reliable network elements, brands and types of equipment,
issuing recommendations for selection, maintenance and repair;
• To develop recommendations for the selection and configuration of protection and
automation;
• To receive a mapping of the state of switching devices, recommendations for the
development and installation of new devices;
• To receive diagnostic parameters of the equipment and recommendations on the
timing of maintenance, repair, replacement, forecast accidents;
• To get other results based on the system capabilities.
The implementation of such electrical network intellectualization systems will
allow using technical and economic mechanisms to improve the efficiency of power
networks and power supply systems in general. It is possible to automate the imple-
mentation of justifications for the use of microgrids and networks with varying degrees
of automation, form multi-year databases on all parameters of power supply system
functioning, predict the operation modes of electrical networks, generating facilities,
current trends in the development of networks, electricity market. All this allows
integrating power supply systems, intelligent electrical networks into the digital
economy, creating new markets for equipment, communications equipment and soft-
ware for intelligent electrical networks and microgrids, and markets for electricity
services. The creation of new Internet equipment and Internet services based on the
capabilities of managing network equipment, regulating power quality, power supply
reliability, electricity storage, and the use of renewable energy sources are also ele-
ments of the digital economy involving the capabilities of intelligent electrical
networks.
One of the main advantages of such a concept of building autonomous networks
and networks connected to centralized systems is the possibility of flexible changes in
the power supply scheme. This means that if there is a shortage of power from one of
the energy sources or if one of the network sections is damaged, it is possible to
automatically switch to another power source. It is also possible to transfer surpluses to
the centralized network at the peaks of energy production from renewable energy
sources [4]; solutions for coding such situations in the electrical network are described
in [5]. Solutions have been
developed for partitioning and backing up electrical networks [20], for improving the
quality of electricity and regulating the voltage in electrical networks [21].
In addition, the implementation of these systems will significantly improve
electrical safety in the operation of electrical networks. Receiving information about the
state of the power supply system and emergency conditions in it allows quickly and
remotely localizing the accident site or a section of the network where there is a danger of
electric shock, for example, in the event of a wire break.

4 Conclusions

The developed concept of building intelligent electrical networks based on the use of
multi-contact switching systems makes it possible to automatically change the network
configuration when using different energy sources, to monitor the network situation
including the parameters of the equipment's technical condition and of the network
operating modes, highlighting emergency modes, and to manage the switching equipment.
This makes it possible to use several power sources in the network, including renewable
ones, which can work both in parallel and in isolation from each other.

References
1. Litti, Y., Kovalev, D., Kovalev, A., Katraeva, I., Russkova, Y., Nozhevnikova, A.:
Increasing the efficiency of organic waste conversion into biogas by mechanical pretreatment
in an electromagnetic mill. In: Journal of Physics: Conference Series, vol. 1111, no. 1 (2018)
2. Panchenko, V., Kharchenko, V., Vasant, P.: Modeling of solar photovoltaic thermal
modules. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.) Intelligent Computing &
Optimization. ICO 2018. Advances in Intelligent Systems and Computing, vol. 866.
pp. 108–116. Springer, Cham (2019)
3. Daus, Y., Kharchenko, V.V., Yudaev, I.V.: Managing spatial orientation of photovoltaic
module to obtain the maximum of electric power generation at preset point of time. Appl.
Solar Energy 54(6), 400–405 (2018)
4. Kharchenko, V., Gusarov, V., Bolshev, V.: Reliable electricity generation in RES-based
microgrids. In: Alhelou, H.H., Hayek, G. (eds.) Handbook of Research on Smart Power
System Operation and Control, pp. 162–187. IGI Global (2019)
5. Vinogradov, A.V., Vinogradova, A.V., Bolshev, V.Ye., Lansberg, A.A.: Sposob
kodirovaniya situacij v elektricheskoj seti, soderzhashchej mul'tikontaktnye kommuta-
cionnye sistemy i vozobnovlyaemye istochniki energii [A way of coding situations in an
electric network containing multi-contact switching systems and renewable energy sources].
Bull. Agric. Sci. Don 2(46), 68–76 (2019)
6. Vinogradov, A.V., Vinogradova, A.V., Marin, A.A.: Primenenie mul’tikontaktnyh kom-
mutacionnyh sistem s mostovoj skhemoj i chetyr’mya vyvodami v skhemah elek-
trosnabzheniya potrebitelej i kodirovanie voznikayushchih pri etom situacij [Application
of multi-contact switching systems with a bridge circuit and four outputs in consumer power
supply circuits and coding of situations arising from this]. Bull. NIIEI 3(94), 41–50 (2019)

7. Vinogradov, A., Vasiliev, A., Bolshev, V., Vinogradova, A., Kudinova, T., Sorokin, N.,
Hruntovich, N.: Methods of reducing the power supply outage time of rural consumers. In:
Kharchenko, V., Vasant, P. (eds.) Renewable Energy and Power Supply Challenges for
Rural Regions, pp. 370–392. IGI Global (2019)
8. Bolshev, V.E., Vinogradov, A.V.: Perspektivnye kommunikacionnye tekhnologii dlya
avtomatizacii setej elektrosnabzheniya [Promising communication technologies for automa-
tion of power supply networks]. Bull. Kazan State Power Eng. Univ. 11(2), 65–82 (2019)
9. Ancillotti, E., Bruno, R., Conti, M.: The role of communication systems in smart grids:
architectures, technical solutions and research challenges. Comput. Commun. 36, 1665–
1697 (2013)
10. Khan, R.H., Khan, J.Y.: A comprehensive review of the application characteristics and traffic
requirements of a smart grid communications network. Comput. Netw. 57, 825–845 (2013)
11. Vinogradov, A., Bolshev, V., Vinogradova, A., Kudinova, T., Borodin, M., Selesneva, A.,
Sorokin, N.: A system for monitoring the number and duration of power outages and power
quality in 0.38 kV electrical networks. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.)
Intelligent Computing & Optimization. ICO 2018. Advances in Intelligent Systems and
Computing, vol. 866, p. 10. Springer, Cham (2019)
12. Vinogradov, A., Vasiliev, A., Bolshev, V., Semenov, A., Borodin, M.: Time factor for
determination of power supply system efficiency of rural consumers. In: Kharchenko, V.,
Vasant, P. (eds.) Handbook of Research on Renewable Energy and Electric Resources for
Sustainable Rural Development, pp. 394–420. IGI Global (2018)
13. Abiri-Jahromi, A., Fotuhi-Firuzabad, M., Parvania, M., Mosleh, M.: Optimized sectional-
izing switch placement strategy in distribution systems. IEEE Trans. Power Deliv. 27(1),
362–370 (2011)
14. Singh, B., Saha, R., Chandra, A., Al-Haddad, K.: Static synchronous compensators
(STATCOM): a review. IET Power Electron. 2(4), 297–324 (2009)
15. Haque, M.H.: Compensation of distribution system voltage sag by DVR and D-STATCOM.
In: 2001 IEEE Porto Power Tech Proceedings (Cat. No. 01EX502), vol. 1. IEEE (2001)
16. Bolshev, V.E., Vinogradov, A.V.: Obzor zarubezhnyh istochnikov po infrastrukture
intellektual'nyh schyotchikov [Overview of foreign sources on the infrastructure of smart
meters]. Bull. South Ural State Univ. Ser. Energy 18(3), 5–13 (2018)
17. Sharma, K., Saini, L.M.: Performance analysis of smart metering for smart grid: an
overview. Renew. Sustain. Energy Rev. 49, 720–735 (2015)
18. Kabalci, Y.: A survey on smart metering and smart grid communication. Renew. Sustain.
Energy Rev. 57, 302–318 (2016)
19. Vinogradov, A., Borodin, M., Bolshev, V., Makhiyanova, N., Hruntovich, N.: Improving the
power quality of rural consumers by means of electricity cost adjustment. In: Kharchenko,
V., Vasant, P. (eds.) Renewable Energy and Power Supply Challenges for Rural Regions,
pp. 31–341. IGI Global (2019)
20. Vinogradov, A., Vinogradova, A., Bolshev, V., Psarev, A.I.: Sectionalizing and redundancy
of the 0.38 kV ring electrical network: mathematical modeling schematic solutions. Int.
J. Energy Optim. Eng. (IJEOE) 8(4), 15–38 (2019)
21. Vinogradov, A., Vinogradova, A., Golikov, I., Bolshev, V.: Adaptive automatic voltage
regulation in rural 0.38 kV electrical networks. Int. J. Emerg. Electric Power Syst. 20(3)
(2019)
The Design of Optimum Modes of Grain
Drying in Microwave–Convective Effect

Dmitry Budnikov

Federal State Budgetary Scientific Institution
“Federal Scientific Agroengineering Center VIM” (FSAC VIM),
1-st Institutskij 5, Moscow 109428, Russia
dimm13@inbox.ru

Abstract. The development of processing modes using electrical technologies
and electromagnetic fields can reduce the energy intensity and cost of grain heat
treatment processes. During development, it is necessary to consider the
technological requirements of the processed material, the types of equipment
used, and the operating mode of the mixing equipment (continuous, pulsed, etc.).
This paper presents the results of experimental studies on the basis of which
systems for the optimal control of post-harvest grain processing equipment using
electrophysical effects can be built. At the same time, energy consumption can be
reduced by 20–30% compared to classic shaft dryers, and the process can be
intensified by 35–40%.

Keywords: Electrophysical effects · Post-harvest treatment · Microwave field ·
Optimum modes

1 Introduction

The high energy intensity of post-harvest grain processing dictates the need to develop
new equipment that reduces energy costs for post-harvest processing [2, 5, 7]. Many
researchers note the possibility of reducing these energy costs through the use of
electrophysical factors [1, 3, 4, 6, 8, 9]. These factors include ultrasound, ozone,
aeroions, infrared radiation, microwave exposure and others. In order to reduce costs
while ensuring the quality indicators, it is also necessary to develop modes and
equipment for the optimal management of post-harvest processing plants. At the same
time, it is worth considering the possibility of managing according to the criteria of
minimum energy consumption and minimum processing time.

2 Main Part
2.1 Research Method
A number of experimental studies are required to obtain the desired data. In this work,
the research was carried out on installations of two types. In the first case, the
installation contained one microwave power source (a 900 W magnetron) acting on a
stationary grain layer with a volume of 0.015 m3. In the second case, the installation

contained six similar sources located at three height levels, with two opposing
magnetrons at each level. In this case, the grain layer was moved vertically, and the
volume of simultaneously processed material was 0.1 m3.
At this stage, the planning of the screening experiment is aimed at searching for
energy-efficient grain drying modes using electrophysical factors, such as microwave,
for different states of the grain layer, taking into account the form of moisture coupling
in the grain. It is obvious that natural drying modes will have the lowest energy
consumption, but in this case the final material may be damaged due to limited time of
safe storage. Both constant and pulsed microwave exposure modes are used. Table 1
shows the factors of the screening experiment and their variation levels. The response
function is the cost of electrical energy for the drying process.

Table 1. Values of factors for obtaining drying curves.


Pos. Density ρ, kg/m3 Initial moisture W, % Microwave operation mode Air speed v, m/s Air temperature T, °C
1 800 17.2 1 1.0 20
2 600 17.2 1 1.0 20
3 400 16.9 1 1.0 20
4 800 17.1 2/3 1.0 20
5 600 16.7 2/3 1.0 20
6 400 17.0 2/3 1.0 20
7 800 16.9 1/3 1.0 20
8 600 16.8 1/3 1.0 20
9 400 16.8 1/3 1.0 20

Since the screening experiment was carried out on an installation that processes only a
small volume of grain mass, it makes sense to consider only the relative deviations of
energy intensity from the assumed initial value.
According to the results of the screening experiment, the mode with a thin layer of
grain mass and constant microwave exposure has the lowest energy consumption. At
the same time, the highest energy intensity is observed in the mode with pulsed
microwave exposure of the suspended layer; it is 2.4 times higher than the cost in the
least energy-intensive mode. Nevertheless, it is worth considering that a significant
amount of energy (about 50%) was spent on creating the fluidized layer.
Thus, at the subsequent stages, it is worth separating the total energy spent on the
process from the energy spent on the exposure factor. In addition, the volume of the
processed material must be increased while maintaining the uniformity of the field
distribution in the processed material.
Measuring grain moisture with rapid assessment devices during microwave exposure
gives false results associated with changes in the form of the moisture coupling and in
the properties of the material. For this reason, the reduction in moisture content is
measured by weighing the processed volume of the material.

The drying curves were taken to obtain the required data. After the data are
approximated by a standard parametric model or a user-defined model, the
approximation quality can be evaluated both graphically and using various
goodness-of-fit criteria: SSE (sum of squared errors), R-square, Adjusted R-square,
and RMSE (root mean square error). In addition, one can calculate confidence
intervals for the found values of the model parameters at different probability levels,
as well as confidence bands for the approximation and the data. The dependence of
grain moisture, W, %, on the drying time, t, min, can be represented by the following
equation:

W = a · e^(−b·t) + c, (1)

where a, b, and c are proportionality coefficients.


Table 2 shows the data of statistical processing of drying curves.

Table 2. Experimental data processing results.


Pos. a b c R2
1 3.438 0.0911 13.9 0.9819
2 3.769 0.07893 13.48 0.9836
3 3.293 0.09729 13.57 0.9996
4 3.035 0.05973 14.07 0.9911
5 3.013 0.04815 13.79 0.9864
6 3.332 0.07748 13.81 0.9745
7 2.959 0.02602 13.95 0.9879
8 2.99 0.04361 13.96 0.9727
9 2.963 0.05021 13.92 0.9667
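As an added worked check of Eq. (1) against Table 2 (illustrative arithmetic, not part
of the original study): for mode 1 (a = 3.438, b = 0.0911, c = 13.9), at t = 0 the model
gives W = 3.438 + 13.9 ≈ 17.3%, close to the measured initial moisture of 17.2% in
Table 1, while at t = 30 min it gives W = 3.438 · e^(−2.733) + 13.9 ≈ 14.1%.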

The data logged during the experiments were processed directly in the SCADA
system. In addition, data downloaded from log files can be loaded into application
software packages such as Statistica, Matlab, Excel, etc., for further statistical and
regression processing. Table 3 partially presents the results of experimental studies of
the efficiency of a laboratory installation containing a single source of microwave
power according to the selected optimality criteria.

Table 3. The results of experimental studies.


# Culture Moisture W, % Moisture removal ΔW, % q, MJ/kg of evaporated moisture Optimality criterion
1 Wheat 16 1.5 7.3 Maximum performance
2 Wheat 20 2 4.6 Maximum performance
3 Wheat 16 1 6.2 Minimum energy consumption
4 Wheat 20 1 3.7 Minimum energy consumption
(continued)

Table 3. (continued)
# Culture Moisture W, % Moisture removal ΔW, % q, MJ/kg of evaporated moisture Optimality criterion
5 Barley 16 1.5 7.6 Maximum performance
6 Barley 20 2 4.8 Maximum performance
7 Barley 16 1 6.3 Minimum energy consumption
8 Barley 20 1 3.7 Minimum energy consumption
9 Sunflower 9 1.5 5.8 Maximum performance
10 Sunflower 12 2 3.9 Maximum performance
11 Sunflower 9 1 5.2 Minimum energy consumption
12 Sunflower 12 1 3.6 Minimum energy consumption

Analysis of the experimental results shows a decrease in energy costs of up to 30%
relative to drying in shaft grain dryers, although the costs remain higher than the
corresponding indicators for active ventilation bins. The best indicators for reducing
energy intensity and intensifying drying were obtained when processing sunflower;
this is due to the lower values of standard humidity and the associated values of the
dielectric parameters. A new module and equipment structure were designed after
evaluating the equipment structure and the temperature field parameters in the
microwave grain processing chamber. The modified design reduces the uneven heating
of grain during processing, which will increase productivity due to the intensification
of heat and moisture exchange.
In addition, modes with pulsed switching of microwave power sources were
considered. At the next stage, research was conducted on a unit containing six
microwave power sources.

2.2 Experiment
The modes presented in Table 4 were considered in the experimental work on the
microwave-convective module. The operating modes of the microwave power sources
are: 0 - the microwave is not used; ½ - the sources of microwave power work
intermittently (implemented as 10 s with microwave treatment and 10 s without it);
¼ - the sources work intermittently (implemented as 5 s with treatment and 15 s
without it).
In the course of the experimental studies, wheat drying curves were taken from an
initial moisture content of 20% [10, 11]. From these data, the dependence of the rate of
moisture removal on the operating mode of the unit was studied. Figure 1 shows the
drying curves obtained as a result of the experimental studies. The current rate of
moisture removal was also calculated from the current moisture of the wheat for each
tested operating mode. As a result, the dependence of the energy consumption for
drying (evaporation of 1 kg of moisture) of wheat on the implemented heat treatment
mode was obtained.
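Though not derived explicitly in the paper, differentiating the fitted model (1) gives the
instantaneous drying rate dW/dt = −a·b·e^(−b·t), %/min; for example, for mode 1 of
Table 2 the initial rate is −3.438 × 0.0911 ≈ −0.31%/min, decaying exponentially as
drying proceeds.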

Table 4. The modes considered in the experimental studies.


Pos. in Figs. 1–4 1 2 3 4 5 6 7 8 9
Microwave operation mode 0 0 0 1/4 1/4 1/4 1/2 1/2 1/2
Air temperature Tair, °C 20 30 40 20 30 40 20 30 40

Fig. 1. Drying curves (moisture W, %, versus drying time τ, min).

The rate of moisture removal and the energy costs depend significantly on both the
operating mode of the equipment and the current moisture content.

2.3 Results and Discussion


Despite the fact that the current values of energy consumption for removing moisture
in modes using microwave power may exceed those of modes using heated air as a
drying agent, the total cost of these modes over the entire drying process is lower.
Table 5 shows the average energy consumption for drying wheat from 20 to 14%
moisture, obtained as a result of laboratory tests under these conditions. It is also worth
taking into account that the use of pulsed modes of microwave exposure is similar to
heating the drying agent.

Table 5. Average energy consumption for drying wheat from 20 to 14%.


Mode 1 2 3 4 5 6 7 8 9
Energy consumption for evaporation of 1 kg of moisture, MJ/kg 6.17 6.8 8.57 6.1 7.59 4.64 4.36 3.74 3.6

These results make it possible to refine the developed equipment management models
and to implement processing control according to the specified optimality criteria.

3 Conclusions

The following conclusions can be drawn from the results of the experiment:
1. The highest energy intensity is observed in the drying mode with pulsed microwave
exposure of the suspended grain layer; it is 2.4 times higher than the cost in the
least energy-intensive mode.
2. It should be taken into account that drying in a fluidized layer has very limited
application in the processing of cereals, and significant energy (about 50%) is spent
on creating the fluidized layer.
3. When considering the energy intensity of drying from the point of view of
profitability, it is necessary to take into account the cost of energy carriers and of
the sources of generation (conversion) of energy.
4. The use of microwave fields makes it possible to intensify the drying process by
3–4 times in moisture ranges close to the standard one.
5. It is advisable to apply this technology for drying grain at moisture levels close to
the standard one (from 17 to 14% for wheat).

References
1. Agrawal, S., Raigar, R.K., Mishra, H.N.: Effect of combined microwave, hot air, and
vacuum treatments on cooking characteristics of rice. J. Food Process Eng. e13038 (2019).
https://doi.org/10.1111/jfpe.13038
2. Ames, N., Storsley, J., Thandapilly, S.J.: Functionality of beta-glucan from oat and barley
and its relation with human health. In: Beta, T., Camire, M.E. (eds.) Cereal Grain-Based
Functional Foods, pp. 141–166. Royal Society of Chemistry, Cambridge (2019)
3. Basak, T., Bhattacharya, M., Panda, S.: A generalized approach on microwave processing
for the lateral and radial irradiations of various groups of food materials. Innov. Food Sci.
Emerg. Technol. 33, 333–347 (2016)
4. Dueck, C., Cenkowski, S., Izydorczyk, M.S.: Effects of drying methods (hot air, microwave,
and superheated steam) on physicochemical and nutritional properties of bulgur prepared
from high-amylose and waxy hull-less barley. Cereal Chem. 97, 483–495 (2020). https://doi.
org/10.1002/cche.10263
5. Izydorczyk, M.S.: Dietary arabinoxylans in grains and grain products. In: Beta, T., Camire,
M.E. (eds.) Cereal Grain-Based Functional Foods, pp. 167–203. Royal Society of
Chemistry, Cambridge (2019)
6. Pallai-Varsányi, E., Neményi, M., Kovács, A.J., Szijjártó, E.: Selective heating of different
grain parts of wheat by microwave energy. In: Advances in Microwave and Radio Frequency
Processing, pp. 312–320 (2007)
7. Ranjbaran, M., Zare, D.: Simulation of energetic-and exergetic performance of microwave-
assisted fluidized bed drying of soybeans. Energy 59, 484–493 (2013). https://doi.org/10.
1016/j.energy.2013.06.057

8. Smith, D.L., Atungulu, G.G., Sadaka, S., Rogers, S.: Implications of microwave drying
using 915 MHz frequency on rice physicochemical properties. Cereal Chem. 95, 211–225
(2018). https://doi.org/10.1002/cche.10012
9. Intelligent Computing and Optimization. Proceedings of the 2nd International Conference on
Intelligent Computing and Optimization 2019 (ICO 2019). Springer (2019). ISBN 978-3-
030-33585-4
10. Vasilev, A.N., Budnikov, D.A., Ospanov, A.B., Karmanov, D.K., Karmanova, G.K.,
Shalginbayev, D.B., Vasilev, A.A.: Controlling reactions of biological objects of agricultural
production with the use of electrotechnology. Int. J. Pharm. Technol. (IJPT) 8(4), 26855–
26869 (2016)
11. Vasiliev, A.N., Goryachkina, V.P., Budnikov, D.: Research methodology for microwave-
convective processing of grain. Int. J. Energy Optim. Eng. (IJEOE) 9(2), 11 (2020). Article:
1. https://doi.org/10.4018/IJEOE.2020040101
Isolated Agroecosystems as a Way to Solve
the Problems of Feed, Ecology and Energy
Supply of Livestock Farming

Aleksey N. Vasiliev, Gennady N. Samarin,
Aleksey Al. Vasiliev, and Aleksandr A. Belov

Federal Scientific Agroengineering Center VIM,
1-st Institutsky passage, 5, Moscow, Russian Federation
vasilev-viesh@inbox.ru, samaringn@yandex.ru,
lex.of@mail.ru, belalexan85@gmail.com

Abstract. Livestock farming is one of the main energy consumers in agriculture,
accounting for about 20% of all energy consumption, half of which (50%) is
consumed by cattle farms. The energy efficiency of livestock production is about
16%. More than 75% of the energy is consumed to produce feed and is
subsequently concentrated in animal waste. Improving the energy efficiency of
livestock farming is therefore an important scientific and economic issue. The
intensification of food production leads to the development of monocultures, and
this applies to all spheres of human husbandry. In order to resolve the
contradiction between agricultural production and the laws of nature, great
attention has been paid to the organization of agroecosystems. Maintaining the
permanent functioning of agroecosystems requires significant energy
consumption. The study of energy flows in agroecosystems is one of the main
research methods in ecology; therefore, this paper considers the energy and
environmental problems of livestock farming from this point of view. The study,
carried out on the basis of the energy balance of a livestock complex, has shown
that the energy efficiency of livestock production is not more than 16%. This
factor greatly aggravates environmental issues. It is proposed to reduce the
“energy capacity” of animal waste in order to increase the environmental safety
and energy independence of livestock farming.

Keywords: Agroecological system · Ecology · Energy efficiency · Energy
consumption · Livestock farming

1 Introduction

Livestock farming is one of the main energy consumers in agriculture. It accounts for
about 20% of all energy consumption, half of which (50%) is consumed by cattle
farms. According to Mindrin's study [1], in 1928, 48 cal of total energy were consumed
to produce 100 cal of product; in 1950, 57 cal; in 1960, 70 cal; and in 1990, 86 cal.
Over a century, the energy consumed to produce a unit of product has almost doubled.
The current


study considers the reasons for this increase in energy consumption and possible ways
to reduce it; these issues are analyzed and variants of production organization are
considered.

2 Materials and Methods

When assessing the energy efficiency of production and determining the directions for
its improvement, it is expedient to use the methodology of energy evaluation of
products and to apply a universal physical unit whose exchange rate is constant, solid
and clear. Many researchers agree that this is an energy unit. Energy is the only
objective and universal measure of the value of any type of product, manufactured not
only by man but also by nature.
This measure depends neither on supply and demand nor on price. The energy
approach to the study of natural and production processes has a number of significant
advantages and wider opportunities compared to other methods [2]. The approach is
based on the necessity to take into account the objective laws of energy conversion
within the system, and at that, energy acts as a universal measure allowing us to
estimate both the value of material objects and the efficiency of production processes.
Without the use of energy analysis, effective production management is thought
impossible [3, 4]. One of the ways to implement this methodology is the environmental
and energy analysis of farm functioning. The issues of energy analysis of the
functioning of geosystems have been studied since the beginning of the new century.
In particular, an algorithm has been developed for studying the economic activities of
agroecosystems on the basis of their energy assessment [5].
There are other researchers who carried out thorough work in this direction [6],
among them N.P. Mishurov [7].
F.S. Sibagatullin's work [8] presents a methodology for calculating the energy content
of each element included in livestock production.
The energy content (E) is calculated on the basis of resource consumption in
physical units, taking into account its energy equivalent, according to the formula [8]:

E = R_n · I_n, (1)

where R_n is the consumption of the resource in natural units (kg, hwt of feed units,
person·h, kW·h), and I_n is the energy equivalent of the resource [8].
An example of the calculation is given in Table 1. From the presented data, it is
clear that the largest amount of energy is consumed on feed production.

Table 1. The energy consumption to produce a unit of product.


Index; Energy equivalent of a resource, MJ; Total energy consumption per unit of product (per hwt), MJ (milk / cattle live weight / pig live weight)
Feed, hwt of feed units From 717 to 12995 1540.8 19226.0 17160.0
Electroenergy, kWt∙h 8.7 256.6 3241.6 358.4
Fuel, kg 10 49.0 850.0 937.0
Specific quantity of 105 367.5 2520.0 2940.0
equipment, kg
Litter, hwt 17.1 102.6 81.9 –
Consumption of labour, 43.3 73.6 1095.5 982.9
person∙h
Total – 2390.1 27015.0 22378.4
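As a worked illustration of formula (1) (an added check, assuming the three value
columns of Table 1 refer to milk, cattle live weight and pig live weight, consistent with
the shares quoted below): for milk, the feed row gives 1540.8 MJ out of a total of
2390.1 MJ, i.e., 1540.8 / 2390.1 ≈ 64.5%, while the electricity row corresponds to
R_n = 256.6 MJ / 8.7 MJ per kW·h ≈ 29.5 kW·h per hwt.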

Analyzing the share of energy consumed to produce various types of products in the
total energy consumption, it is clear that 64.5% of the total is required to produce milk.
According to [9], the energy consumed in the milking process itself is about 24% of the
total energy consumption. For the production of cattle live weight, the share of energy
consumption for feed is 71.2% of the total amount, and in the production of pig live
weight this share rises to 76.6%. At the same time, direct consumption of electricity and
fuel amounts to 5.8% of the total energy consumption for pig weight gain and 15.1% for
cattle weight gain. Feed production is thus the main consumer of energy.
In the work of T.Sh. Fuzella [10], energy efficiency is defined as the ratio of the energy
flow at the system output to the energy flow at the system input. Using this approach
[10], the energy consumption of the technological processes producing various
products was estimated for the SPK “Nelyubino”. Some of the calculated data are
presented in Table 2. The energy flow at the system output was determined on the
basis of the energy equivalent. As in Table 1, the energy consumption structure is
shifted toward feed production, which accounts for 72.5% of the total energy.

Table 2. The estimation results of the energy efficiency of livestock farming on the example of
SPK “Nelyubino”, TJ (10^10 J).
Type of consumption; Initial data: 1991 / 1999 / 2004; Type of products; Output: 1991 / 1999 / 2004
Feed 60.2 45.5 42.6 Milk 8.1 6.6 6.5
Infrastructure 8.1 2.1 8.8 Beef 4.8 3.2 2.7
Electroenergy 12.3 11.3 12.3
Agromachinery 2.0 1.2 1.4
Fuel 0.5 0.6 0.7
Total 83.1 60.7 65.8 Total 12.9 9.8 9.2

In feed production the main share of energy is consumed on the production of crop
products, which is automatically transferred to livestock products [11].
According to the data presented in the table, in 1991 there were 0.16 units of product
in energy equivalent per unit of consumed energy (12.9 TJ output / 83.1 TJ input
≈ 0.16); in 1999, 0.16 units (9.8/60.7); and in 2004, 0.14 units (9.2/65.8). The
calculation results show that the highest energy efficiency of livestock production was
16%. For beef production, the energy efficiency was not higher than 6%; that is, only
6% of the energy used for beef production was consumed efficiently, and 94% was
lost. The problem of the lost energy is much more acute than a simple calculation of
energy consumption figures suggests. Consuming 16% of energy to produce livestock
products, the farmer uses almost 84% of the energy inefficiently. The energy
conservation principle must be taken into account in all sectors of agricultural activity,
and it is necessary to find out what this energy is wasted on.
At the current stage of technological development, this wasted energy begins to
transform the natural environment into which it is released, and the issue of energy
inefficiency in production automatically becomes an environmental concern [12]. The
issue is becoming acute for global agricultural production management.
Among the requirements of nature for biological objects, maximum stability is of great
importance. The community most resistant to environmental deviations is a biological
community with a maximum variety of individual characteristics. This requirement is
practically inconsistent with the task of maximum productivity: maximum productivity
leads to minimum diversity, and the surest way to maximum productivity is to cultivate
a monoculture.
However, if a monoculture is cultivated, diversity is at a zero level, and a mono-
culture is absolutely unstable. To ensure the stability of a monoculture, people must
make additional efforts, i.e. to increase energy consumption.
The natural development and modification of biological objects results in an increase
in their diversity, which in turn results in energy dissipation. The organized use of a
biological community, the purpose of which is to seize the obtained product, reduces
dissipation, and hence the stability of the biological system. Thus, any human activity
that damages the balance of diversity conflicts with the laws of nature and results in
environmental problems. The development of agroecosystems [13] has become one of
the ways to resolve this global contradiction between humanity and the biological
community.
An agroecosystem is a territory organized by man, where a balance is to be ensured
between obtaining products and returning them to the fields so as to provide an
organized circulation of organic substances. Rationally organized agroecosystems
necessarily include pastures, livestock complexes and arable lands, though even a
perfectly organized agroecosystem cannot provide a complete cycle of substances.
This is because part of the mineral substances is removed with the yield; the imbalance
is eliminated by applying the necessary fertilizers. Because of this, agroecosystems are
extremely unstable and incapable of self-regulation. The additional energy resources
introduced by man must ensure the stability of the agroecosystem. Experience has
shown that agroecosystems in which grain crops prevail cannot exist for more than one
year; in the case of a monoculture or perennial grasses, the agroecosystem can break
down after 3–4 years.

There are several ways to organize livestock agroecosystems. With free-range cattle, an
extensive option is implemented, in which the consumption of anthropogenic energy is
minimal: energy is consumed only to support the cattlemen and for the primary
processing of livestock products. An intensive agroecosystem involves production at
large complexes, while feed is obtained with high energy input in the fields of the
agroecosystem. Part of the manure is brought to the fields, but it must be processed
before application, which requires additional energy. The amount of manure obtained
can be larger than necessary for the soil, so the environmental problem again remains
unsolved.
Results and Discussion
Using livestock farming as an example, the management of such a system and the
effect of the components of the management system on the environmental friendliness
of production have been considered [14, 15]. Figure 1 presents a structural diagram
that allows this issue to be discussed in detail.

Fig. 1. Structural management scheme of agroecosystem ‘livestock farming’.

The main object of management is the livestock complex. The species of animal is not
important, since the control scheme has the same structure. Several parameters are
taken as controlled quantities: volume of output (P), in tons; cost of production (C), in
dollars per ton; volume of production in energy equivalent (Ee), J; and production
waste (W), in units of energy, J.
The use of the energy equivalent Ee as a controlled quantity is logical, since the
study of energy flows in ecosystems is one of the main research methods in ecology.
To implement the technological production processes (Fig. 1), complexes are used
that need a general energy supply (Et) and an energy supply for feed production (Ef).
The sources of energy I provide the common energy supply, which is divided into
indirect (Ei) and direct (Ed) sources. Indirect energy supply is the energy, given in
units of energy, embodied in buildings and structures, equipment and machinery.

Direct energy consumption includes the cost of fuel and electricity used directly in
technological processes to produce finished products. The sources of energy II are
represented by feed (Efe), which is obtained in the second element of the
agroecosystem, i.e., plant growing. Initially, the amount of livestock production fully
conforms to the existing arable land and is provided with a balanced crop rotation. The
system of mechanisms changes the energy flow (Efl) to implement technological
processes in the livestock complex.
Production results are evaluated using a monitoring system. The obtained data are
compared with a given indicator (Pp). Depending on the obtained comparison result
(ΔP) the control system management is to be corrected and updated. Deviations of the
controlled values from the given values can occur due to such disturbing effects as
changes in climate parameters, changes in feeding rations, changes in animal age and
weight, and other factors affecting the physiological state and productivity of animals.
The livestock agroecosystem presented in the form of a structural diagram is, like any
agroecosystem, unstable. From the point of view of automatic control theory, a system
is considered unstable if, due to disturbing effects, it deviates from the given value and
cannot return to its original state by itself. We have considered the consequences of a
loss of stability of the given system for each controlled value.
For such a controlled value as volume of production (P), loss of stability means an
irrevocable loss of animal productivity. This can occur due to a significant influence of
disturbing effects (e.g., an epidemic) combined with a lack of regulatory measures
(Efl), i.e., when appropriate sanitary measures and vaccinations are not provided
during the epidemic.
To ensure the stability of such a controlled value as cost price (C), restrictions on
energy consumption are required. According to the data in Tables 1 and 2, it is
necessary to reduce the energy consumption for feed production (Efe). However, with
present-day technologies, this can result in a decrease in production volume (P).
Therefore, at present it is only possible to introduce restrictions on the amount of
consumed energy [16]. It might seem that the stability of any controlled value is fully
ensured by the energy supply of the technological processes; however, even for such a
controlled value as volume of production, this thesis is not correct.
This study pays particular attention to the analysis of the system through
equivalent energy consumption.
As for equivalent energy consumption (Ee), the energy efficiency of the system should
strive toward 1. The control system must reduce the energy flows Rt and Ee while
simultaneously increasing the volume of output and its energy "value". In this case, the
"energy cycle" occurs without significant, uncontrolled accumulation of energy in
production waste. The innovative plans that are being actively discussed and adopted
take the exact opposite position: indirect energy consumption is to be significantly
increased to produce products (e.g., developing and using satellite systems for
precision farming, using robotic systems, etc.). Moreover, the increase in such
expenses can be justified only by improved product quality, with a reduced content of
chemicals and impurities in the final product. At the same time, direct energy
consumption is expected to decrease.
As a result, another contradiction is created: to solve environmental problems, the
indirect costs of production are increased and the energy efficiency of production is
reduced, which increases the amount of energy in waste and makes the environmental
problems more urgent.
Thus, on the one hand, the problem of the sustainability of the livestock
agroecological system is being solved by increasing energy regulatory effects; on the
other hand, serious prerequisites for an environmental disaster are simultaneously
being created. One way to address this concern is to use waste as an energy carrier.
There is so much energy in agricultural waste that, if it were recovered in the form of
heat, fuel and electricity, it would be more than enough for several cycles of animal
production [17]. Energy recovery from organic and solid domestic waste is currently
being successfully implemented in various types of bioenergy equipment (biogas
machines, generators operating on mixed fuels obtained from organic waste, pyrolysis
equipment, etc.).
Worldwide, this issue has been successfully studied by a wide range of researchers
[18]. Technologies and equipment are being improved, which makes it possible to state
that bioenergy is able to ensure the energy independence of the agro-industrial sector.
Bioenergy is commonly considered an element of renewable energy, but its main task
is to prevent an environmental disaster caused by the energy stored in waste. It should
be noted that simply burning waste to release thermal energy does not improve the
environmental situation: this energy remains unused, only its form changes, and it does
not return to the livestock production process.
The use of energy stored in waste reduces the share of hydrocarbons and gas in the
direct energy consumption for livestock production. This increases the stability of the
livestock agroecosystem with respect to its cost price. However, the energy efficiency
of the system can only increase by 10–15%; still, more than 80% of the energy
consumed to produce livestock products is wasted. The need to increase livestock
production will require new energy costs for feed, which will result in an even greater
decrease in energy efficiency and will lead to instability of the system with respect to
cost price. The issue can be solved by the use of non-standard solutions in the energy
supply of livestock farming.
Another way to increase the energy efficiency of livestock farming is to change the
structure of agroecosystems. As Fig. 1 shows, "the sources of energy I" and "the
sources of energy II" are external to the agroecosystem 'livestock farming'. Livestock
farming largely (more than 75%) depends on the 'external' energy of crop production.
Current trends in agriculture (precision farming) show that the share of indirect energy
consumption for feed production will grow constantly and significantly. It is precisely
here that the energy cycle must be broken: the share of energy consumption for feed
production must be reduced. Therefore, technologies should not be developed that
provide feed for the livestock agroecosystem from the plant-farming agroecosystem.
This is possible by establishing so-called "isolated" agroecosystems, in which the
external energy influx into the agroecosystem is minimal and animal husbandry
provides the maximum feed requirements from its own resources. Most of the
necessary technologies already exist and are undergoing serious production testing.
One such technology is the production and use of microalgae feed [19]. Microalgae
are actively used as feed additives in livestock and poultry farming, as they increase
animal immunity, weight, fertility and the survival of young animals. In poultry, this results

in an increase in egg production and egg size. To implement this technology, the farms
specializing in cattle and poultry farming provide themselves with algal water pools
where livestock waste is utilized. As a result, 40% of the nitrogen from the wastewater
again enters the algae biomass and is eaten by animals.
Another technology that realizes the potential of livestock farming to produce feed is
the growing of fly larvae on organic waste, as flies multiply quickly: the weight of the
larvae increases by 300–500 times in a week. The biomass from a pair of flies and their
progeny, with full realization of the genetic potential, would exceed 87 tons by the end
of a year, i.e., the weight of six elephants [20]. The biomass of housefly larvae is a
complete protein feed for pigs, calves, poultry, fur animals and fish. It contains
48–52% protein, 7–14% fat, 7–10% fiber, 7% nitrogen-free extractives (BEV), and
11–17% ash, as well as biologically active substances such as vitamins, ecdysone, etc.
The processing of organic waste that is environmentally unfriendly and unsuitable
for use in crop cultivation can produce fertile humus, whose application as a soil for
growing feed under protected ground will meet the needs of livestock farming.
Organic livestock waste, however, should not be regarded as an energy raw material:
its use as an energy carrier would reduce the energy intensity of livestock farming and
increase its energy efficiency by no more than 15%. The issue can be radically solved
by using organic materials to produce highly nutritious feed in closed systems
combined into an isolated agroecological system.
In this case, only one restriction must be fulfilled: the volume of the livestock and
its productivity (manufactured products and production waste) should provide the
maximum possible share of the feed required by the livestock. Ideally, the chain
'feed → animals → technological process of production → waste → feed' should have
minimal energy supply from the outside. Renewable energy sources acquire a special
role as sources of direct energy supply (Ed), as shown in Fig. 1. Their use can increase
the share of capital costs but significantly reduces the environmental problems of
energy supply.

3 Conclusions

Analyzing production efficiency through energy supply and consumption allows us to


identify and study the structural and functional correlation between the components of
agricultural systems, as well as to study the dynamics of the effect of various energy
sources on the work of systems.
The energy efficiency of livestock production is about 16%. More than 75% of the
energy is consumed to produce feed and is subsequently concentrated in animal waste.
One of the main ways to increase the energy efficiency of livestock farming is to
change the agroecosystem structure by establishing "isolated" agroecosystems, in
which technologies for processing organic waste into livestock products compensate
for the energy consumed to produce feed.
It is inexpedient to use organic livestock waste as a raw material for energy.
Microalgae production technologies and fly larvae for the processing of
livestock waste can be applied efficiently to produce high-protein animal feed and to
reduce the energy consumed by livestock production.
It is reasonable to use renewable energy sources to compensate for direct energy
costs in the production of livestock products.

References
1. Mindrin, A.S.: Energoekonomicheskaya ocenka sel'skohozyajstvennoj produkcii [Energy-
economic assessment of agricultural products], 187 p (1987)
2. Bulatkin, G.A.: Analiz potokov energii v agroekosistemah [Analysis of energy flows in
agroecosystems]. Vestnik Rossijskoj Akademii Nauk tom 82(8), 1–9 (2012)
3. Perez-Neira, D., Soler-Montiel, M., Gutierrez-Pena, R., et al.: Energy assessment of pastoral
dairy goat husbandry from an agroecological economics perspective. A case study in
Andalusia (Spain). Sustainability 10 (2018). Article number: 2838
4. Guzman, G.I., de Gonzalez, M.G.: Energy efficiency in agrarian systems from an
agroecological perspective. Agroecology Sustain. Food Syst. 39, 924–952 (2015)
5. Pozdnyakov, A.V., Semenova, K.A., Fuzella, T.Sh.: Energeticheskij analiz funkcionirova-
niya agroekosistem v usloviyah estestvennogo nasyshcheniya-pervye rezul'taty [Energy
analysis of the functioning of agroecosystems in conditions of natural saturation - first
results]. Uspekhi sovremennogo estestvoznaniya (2), 124‒128 (2018)
6. da Silva, N.F., da Costa, A.O., Henriques, R.M., Pereira, M.G., Vasconcelos, M.A.F.: Energy
planning: Brazilian potential of generation of electric power from urban solid wastes—under
“waste production liturgy” point of view. Energy Power Eng. 7(5), 193 (2015)
7. Mishurov, N.P.: Metodologicheskie osnovy energeticheskoj ocenki proizvodstva moloka
[Methodological foundations of energy assessment of milk production]. Tekhnika i
oborudovanie dlya sela (5), 16‒19 (2017)
8. Sibagatullin, F.S., Sharafutdinov, G.S., Shajdullin, R.R., Moskvichyova, A.B.: Bioener-
geticheskaya ocenka i osnovnye puti snizheniya energoemkosti proizvodstva produkcii
zhivotnovodstva [Bioenergy assessment and main ways to reduce energy intensity of
livestock production]. Uchenye zapiski Kazanskoj gosudarstvennoj akademii veterinarnoj
mediciny im. N.E. Baumana T. 216, 295–302 (2013)
9. Rajaniemi, M., Jokiniemi, T., Alakukku, L., et al.: Electric energy consumption of milking
process on some Finnish dairy farms. Agric. Food Sci. 26, 160–172 (2017)
10. Fuzella, T.Sh.: Energeticheskaya ocenka funkcionirovaniya agroekosistemy (na primere
SPK «Nelyubino») [Energy assessment of the functioning of the agroecosystem (on the
example of the SPK “Nelyubino”)]. Vestnik Tomskogo gosudarstvennogo universiteta
vypusk (326), 203‒207 (2009)
11. Stavi, I., Lal, R.: Agriculture and greenhouse gases, a common tragedy. A review. Agron.
Sustain. Dev. 33, 275–289 (2013)
12. Ghosh, S., Das, T.K., Sharma, D., Gupta, K., et al.: Potential of conservation agriculture for
ecosystem services. Indian J. Agric. Sci. 89, 1572–1579 (2019)
13. Marks-Bielska, R., Marks, M., Babuchowska, K., et al.: Influence of progress in sciences and
technology on agroecosystems. In: Conference: Geographic Information Systems Confer-
ence and Exhibition (GIS Odyssey), Trento, Italy, 04–08 September 2017, pp. 254‒263
(2017)
14. Lachuga, Yu.F., Vasilyev, A.N.: Napravleniya issledovanij v bioenergetike [Research areas
in bioenergy]. Vestnik Rossijskoj sel'skohozyajstvennoj nauki (2), 4‒7 (2015)
15. Vasilyev, A.N.: Reshenie energo-ekologicheskih problem zhivotnovodcheskoj agroekosistemy
[Solution of energy and environmental problems of the livestock agroecosystem].
Tekhnologii i tekhnicheskie sredstva mekhanizirovannogo proizvodstva produkcii
rastenievodstva i zhivotnovodstva (88), 19‒25 (2016)
16. Lehtonen, H.: Evaluating adaptation and the production development of Finnish agriculture
in climate and global change. Agric. Food Sci. 24, 219–234 (2015)
17. Mancini, F.N., Milano, J., de Araujo, J.G., et al.: Energy potential of waste in the state of
Paraná (Brazil). Braz. Arch. Biol. Technol. 62 (2019)
18. Aberilla, J.M., Gallego-Schmid, A., Adisa, A.: Environmental sustainability of small-scale
biomass power technologies for agricultural communities in developing countries. Renew.
Energy 141, 493–506 (2019)
19. Sui, Y., Vlaeminck, S.E.: Dunaliella microalgae for nutritional protein: an undervalued asset.
Trends Biotechnol. 38, 10–12 (2020)
20. Kavran, M., Cupina, A.I., Zgomba, M., et al.: Edible insects - safe food for humans and
livestock. In: Scientific Meeting on Ecological and Economic Significance of Fauna of
Serbia. Ecological and Economic Significance of Fauna of Serbia Book Series, Belgrade,
Serbia, 17 November 2016, vol. 171, pp. 251‒300. Serbian Academy of Sciences and Arts
Scientific Meetings (2018)
Laboratory-Scale Implementation of Ethereum
Based Decentralized Application for Solar
Energy Trading

Patiphan Thupphae and Weerakorn Ongsakul

Department of Energy, Environment and Climate Change,
Asian Institute of Technology, Khlong Nueng, Thailand
st120104@ait.asia

Abstract. A decentralized application (DApp) is an application whose backend
runs on a distributed network of computing nodes. DApps are built on
decentralized technology such as blockchain. The advantages of DApps are
security, transparency, and reliability. There are several DApp use cases in many
areas, such as prediction of potential trading gains on Augur, a sharing economy
of computing power on Golem, and browsing, chatting, and payment on Status.
However, these DApps are deployed without any details about how to implement
them. This paper addresses this issue by presenting the implementation of solar
energy trading. Ethereum blockchain, an open-source platform for DApps, is
proposed and applied for solar energy trading. The token used for trading is
created following the ERC20 standard. The wallet is deployed with Metamask.
Transactions, assets, and participants are created with Ganache and tested with
Truffle. Moreover, the trading algorithm is shown to check the matching between
seller and buyer with a smart contract in Solidity. Lastly, React, a JavaScript
library for building user interfaces, is deployed as a front end to let users interact
with the solar energy trading.

Keywords: Blockchain · Decentralized application · Solar energy trading

1 Introduction

Blockchain is one of the most progressive distributed technologies today. A blockchain
is a decentralized digital database stored on the nodes of a network, known as
Distributed Ledger Technology (DLT). The most popular example that uses blockchain
as a back-end operating system is Bitcoin, an innovative payment network and a new
kind of money [1]. Moreover, blockchain can be applied in many areas, for example in
hospitals for information records and payment [2]. In the financial system, blockchain
is applied to make cross-border payments. Blockchain also has further applications
such as supply chains, voting, and energy supply [3].
The decentralized application (DApp) plays an important role in applying
blockchain technology. DApps overcome the limitations of locally running programs,
which are limited in performance and cannot respond to the requirements of many
applications [4]. A DApp has four characteristics: open source, internal cryptocurrency
support, decentralized consensus, and no central point of failure [4]. This study
presents the implementation of a DApp using the Ethereum open-source platform in
Sect. 2. Section 3 shows the implementation of the ERC20 token. Section 4 shows the
implementation of energy trading, and Sect. 5 shows the deployment and experimental
results.

2 Implementation of DApp Using the Ethereum Open-Source
Platform

The proposed energy trading was implemented using the Ethereum blockchain
framework [5]. It includes the following three steps: (1) set up the Ethereum
environment; (2) set up React; and (3) set up identities with Ganache and wallets with
Metamask for participants.

2.1 Set Up the Ethereum Environment

In this study, the Ethereum platform was selected for the experiment. Ethereum has
many learning resources such as CryptoZombies [6], Ethernauts [7], Remix [8], etc.
A computer running the macOS 10.14.4 operating system with a 2.3 GHz Intel Core i5
was used. Prerequisites such as Node.js, node-gyp, Python 2.7.x, the Truffle
framework, Ganache from the Truffle suite, Metamask (a crypto wallet), Web3.js, git,
the Chrome web browser, and create-react-app were installed.
Several packages were installed via package.json, including 'apexcharts' (renders
candlestick charts), 'babel-polyfill' (emulates a full ES2015+ environment),
'babel-preset-env' (allows using the latest JavaScript without micromanaging syntax
transforms), 'babel-preset-es2015', 'babel-preset-stage-2', 'babel-preset-stage-3'
(arrays of Babel plugins), 'babel-register' (automatically compiles files on the fly),
'bootstrap' (front-end open-source toolkit), 'chai' (TDD/BDD assertion library for
Node), 'chai-as-promised' (asserting facts about promises), 'chai-bignumber'
(assertions for comparing arbitrary-precision decimals), 'dotenv' (zero-dependency
module that loads environment variables from a .env file into process.env), 'lodash'
(takes the hassle out of working with arrays, numbers, etc.), 'moment' (parses,
validates, manipulates, and displays dates and times in JavaScript),
'openzeppelin-solidity' (library for secure smart contract development), 'react'
(JavaScript library for building user interfaces), 'react-apexcharts' (creates charts in
React), 'react-bootstrap' (front-end framework rebuilt for React), 'react-dom'
(provides DOM-specific methods), 'react-redux' (React bindings for the Redux state
container), 'react-scripts' (includes scripts and configuration), 'redux' (predictable
state container for JS apps), 'redux-logger' (can replay problems in the app),
'reselect' (computes derived data), 'solidity-coverage', 'truffle' (testing framework
and asset pipeline), 'truffle-flattener' (verifies contracts developed with Truffle),
'truffle-hdwallet-provider' (to sign transactions), and
'truffle-hdwallet-provider-privkey' (to sign transactions for addresses derived from a
raw private key string). Finally, 'web3' is a library that allows interacting with local
or remote Ethereum nodes.

2.2 Set Up React

This section shows how to create a React app, how to connect React with the smart
contract, and how to test it in the console of the Chrome web browser. To set up
React, run ($ npm install) to install all the dependencies from Sect. 2.1 listed in the
package.json file. After that, create the React app using $ create-react-app followed
by the name of the project, for example ($ create-react-app dapp-project), and make
sure Ganache is running. To start React, use ($ npm start), which automatically runs
on port 3000 in the Chrome web browser. Then go to index.js and import
'bootstrap/dist/css/bootstrap.css' (a minimal index.js sketch is shown after this
paragraph). Users can modify App.js to build their own DApp. In this study, the DApp
is designed using cards [9] and navs [10] from Bootstrap and the layout module from
CSS flexbox [11]. Inside App.js, Web3.js [12] is used heavily, for example to connect
to the blockchain, load the account, load the smart contract, and call smart contract
functions.
Inside App.js (see Fig. 1), web3.js is used in the form ($ const web3 = new
Web3(Web3.givenProvider || 'http://localhost:7545')) to connect to the specific
provider. In this study, Metamask is used to talk to the local blockchain on port 7545.
Moreover, the function componentWillMount [13] is used to load the blockchain data.
This function comes with the React component: the class App is an extension of
React's Component. componentWillMount is a React lifecycle method, and the React
lifecycle diagram can be used as a cheat sheet.

class App extends Component {


componentWillMount() {
this.loadBlockchainData()
}

async loadBlockchainData() {
const web3 = new Web3(Web3.givenProvider || 'http://localhost:7545')
}
}

Fig. 1. The code used to connect to the local blockchain.

After the step above, check that web3 has loaded by inspecting the console with
($ console.log("web3", web3)). The result is shown in Fig. 2.

Fig. 2. The result of loading web3, shown in the console.



The next step is to detect the connected network using
($ web3.eth.net.getNetworkType()) [14] (see Fig. 3) and to check it in the console
with ($ console.log("network", network)); the result is shown in Fig. 4. In addition,
you can switch to other networks in Metamask. To fetch the accounts of the network
we are connected to, go to web3.eth.getAccounts [15] and use
($ web3.eth.getAccounts()) (see Fig. 3). The result returns an array when checked
with ($ console.log("accounts", accounts)); in this study, the account at index [0] is
shown in the console. The token is loaded by importing the compiled artifact of the
Token.sol file and fetching the token from the blockchain with ($ web3.eth.Contract)
[16]. The network id is obtained using ($ web3.eth.net.getId()) [17] (see Fig. 3). The
result of ($ console.log("token", token)) then returns both the abi and the address
(see Fig. 4). The next step is to use ($ methods.myMethod.call) [18] to get the total
supply (see Fig. 5).

class App extends Component {

  componentWillMount() {
    this.loadBlockchainData()
  }

  async loadBlockchainData() {
    const web3 = new Web3(Web3.givenProvider || 'http://localhost:7545')
    const network = await web3.eth.net.getNetworkType()
    const accounts = await web3.eth.getAccounts() // getAccounts(), plural
    const networkId = await web3.eth.net.getId()
    const token = new web3.eth.Contract(Token.abi, Token.networks[networkId].address)
  }
}

Fig. 3. The code used to detect the network, accounts, and network id, and to load the token.

Fig. 4. The console output from the code in Fig. 3.



class App extends Component {

  componentWillMount() {
    this.loadBlockchainData()
  }

  async loadBlockchainData() {
    const web3 = new Web3(Web3.givenProvider || 'http://localhost:7545')
    const network = await web3.eth.net.getNetworkType()
    const accounts = await web3.eth.getAccounts()
    const networkId = await web3.eth.net.getId()
    const token = new web3.eth.Contract(Token.abi, Token.networks[networkId].address)
    const totalSupply = await token.methods.totalSupply().call() // awaited call
  }
}

Fig. 5. A command line used to fetch the totalSupply.

Then the result should return the total supply, which we defined as 10^18 Wei, as shown in Fig. 6.

Fig. 6. Return of total supply.

2.3 Set Up Identities with Ganache and Wallets with Metamask for Participants

Ganache is a personal blockchain [19] that provides public keys, private keys, and testing Ethereum coins. To set up Ganache, go to [19], then download and install it. At the MNEMONIC tab there is a set of phrases that serves as a password. This MNEMONIC is used to fill in Metamask and log the local blockchain into the Metamask wallet.
Metamask wallet is an HD wallet that holds Ethereum accounts and amounts of cryptocurrency. Metamask [20] is also a wallet interface that allows the user to manage multiple accounts. To set up Metamask, users can go to [20] in the Chrome web browser, then download it and add it to the Chrome extensions. Metamask works as an extension of web browsers such as Chrome and Firefox. When Metamask is running, the user can create a private network or connect to a public network, and can also import accounts into the Metamask wallet.

3 Implementation of ERC20 Token

The ERC20 token implements a standard API for use with smart contracts [21]. The functions of an ERC20 token consist of name, symbol, decimals, totalSupply, balanceOf, transfer, transferFrom, approve, and allowance. This study focuses on implementing these functions from the smart contract into the DApp. The flowchart in Fig. 7 represents the flow of the smart contract that creates the ERC20 token and uses it for transfers. To make the ERC20 token, the name, symbol, decimals, and totalSupply of the token first need to be set. In this study, Name = Energy Token, Symbol = ET, Decimals = 18 (1 Ether = 10^18 Wei) and totalSupply = 1,000,000. Then the balance of the deployer is set equal to totalSupply to test the transfer function. To execute the transfer function, the checks must return true. The function checks that the balance of the deployer is not less than the transfer value, and then checks for an invalid receiver. If these checks return true, the transfer is executed: the balance of the deployer is decreased and the balance of the receiver is increased by the transfer value. Finally, the function emits the information of this transfer.

[Fig. 7 flowchart rendered as text: Start → Set: Name = Energy Token, Symbol = ET, Decimals = 18, TotalSupply = 1,000,000 tokens → BalanceOf: Deployer = 1,000,000 tokens, Receiver = 0 tokens → Transfer checks: balanceOfDeployer >= _value and valid recipient → if true: balanceOfDeployer = balanceOfDeployer − _value, balanceOfReceiver = balanceOfReceiver + _value, emit Transfer(_from, _to, _value) with sender address, receiver address and amount → End; if a check fails, the transfer is not executed]

Fig. 7. Flowchart of transfer function.



4 Implementation of Energy Trading


4.1 The User Uses the ET Token to Place Both Sell Orders and Buy Orders
The amount of energy to be traded is set at the new-order tab in terms of kilowatt-hours (kWh). Then the price of each kWh of energy is set. The DApp calculates the total price and shows it below. Figure 8 represents the functions used to create sell and buy orders. The variable tokenGet is the token to buy in a buy order, while amountGet converts the amount to Wei [22], which is the smallest denomination of ether. The tokenGive is ether for a buy order. The amountGive is the variable that calculates the price in Wei. The function makeBuyOrder then calls the makeOrder function in the smart contract. The function returns an error if one of the variables is mistaken and shows the pop-up 'There was an error!'. The makeSellOrder function works the opposite way: tokenGet is the ether for a sell order, amountGet calculates the price in Wei, tokenGive is the token to sell, and amountGive converts the amount to Wei.

export const makeBuyOrder = (dispatch, exchange, token, web3, order, account) => {
  const tokenGet = token.options.address
  const amountGet = web3.utils.toWei(order.amount, 'ether')
  const tokenGive = ETHER_ADDRESS
  const amountGive = web3.utils.toWei((order.amount * order.price).toString(), 'ether')

  exchange.methods.makeOrder(tokenGet, amountGet, tokenGive, amountGive).send({ from: account })
    .on('transactionHash', (hash) => {
      dispatch(buyOrderMaking())
    })
    .on('error', (error) => {
      console.error(error)
      window.alert('There was an error!')
    })
}

export const makeSellOrder = (dispatch, exchange, token, web3, order, account) => {
  const tokenGet = ETHER_ADDRESS
  const amountGet = web3.utils.toWei((order.amount * order.price).toString(), 'ether')
  const tokenGive = token.options.address
  const amountGive = web3.utils.toWei(order.amount, 'ether')

  exchange.methods.makeOrder(tokenGet, amountGet, tokenGive, amountGive).send({ from: account })
    .on('transactionHash', (hash) => {
      dispatch(sellOrderMaking())
    })
    .on('error', (error) => {
      console.error(error)
      window.alert('There was an error!')
    })
}

Fig. 8. The functions which create sell and buy orders.
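As a hedged usage sketch (not from the paper), the helper from Fig. 8 could be invoked as follows, assuming dispatch, exchange, token, web3 and account were initialised as in Sect. 2; the order values are illustrative only.

// Hypothetical buy order: 5 kWh of energy at a price of 0.2 ether per ET
const order = { amount: '5', price: '0.2' }
makeBuyOrder(dispatch, exchange, token, web3, order, account)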

4.2 Trading
In the smart contract, the trading function, which is an internal function, lives inside the fillOrder function. For _tokenGet, the balance of msg.sender, the person who fills the order, is set equal to his balance minus the amountGet plus the fee, while the _tokenGet balance of the user is set to the user's balance plus the amountGet. The _user is the person who created the order. For the tokenGive, the balance of the user who created the order is decreased by the amountGive, and this amountGive is added to msg.sender, the person who fills the order. Moreover, the fee amount is paid by the person who fills the order and is taken out of the amountGet, in this case from msg.sender. In this work, the fee percent is set to 10% and is added to the amountGet charged to msg.sender. The feeAccount was created to receive the feeAmount (see Fig. 9).

function _trade(uint256 _orderId, address _user, address _tokenGet, uint256 _amountGet, address _tokenGive, uint256 _amountGive) internal {
  // the fee is charged to the order filler (msg.sender) on top of _amountGet
  uint256 _feeAmount = _amountGet.mul(feePercent).div(100);
  tokens[_tokenGet][msg.sender] = tokens[_tokenGet][msg.sender].sub(_amountGet.add(_feeAmount));
  tokens[_tokenGet][_user] = tokens[_tokenGet][_user].add(_amountGet);
  tokens[_tokenGet][feeAccount] = tokens[_tokenGet][feeAccount].add(_feeAmount);
  tokens[_tokenGive][_user] = tokens[_tokenGive][_user].sub(_amountGive);
  tokens[_tokenGive][msg.sender] = tokens[_tokenGive][msg.sender].add(_amountGive);

  emit Trade(_orderId, _user, _tokenGet, _amountGet, _tokenGive, _amountGive, msg.sender, now);
}

Fig. 9. Trade function in smart contract.
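As a numeric illustration of the fee logic in Fig. 9 (values chosen for illustration, not taken from the paper): with feePercent = 10 and _amountGet = 100 ET, the fee is _feeAmount = 100·10/100 = 10 ET, so msg.sender is debited 110 ET, _user is credited 100 ET, and feeAccount receives 10 ET.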

5 Deployment and Experimental Result

The front-end of the solar energy trading has been designed as in Fig. 10. This user interface has been deployed on localhost:3000. In this study, user 1 is supposed to make a buy order with a total payment amount of 1 ether and send it to user 2. Figure 11 represents transferring the total payment amount from user 1 to user 2 in the terminal interface (setting 1 Ether = 1·10^18 tokens). Figure 12 shows the transaction between user 1 and user 2 on the Ganache personal Ethereum blockchain. User 1 and user 2 have the addresses 0xb877dCcB80F27b83E4f863c41f050f18FfAEcb9b and 0x12e622A7c90CEfF482Fc79ADe96a3AcD17C9F282, respectively. The exchange contract address is 0x0FDA0BA4c75c3552A42B9877c9b48fC6fddc022D.

Fig. 10. User interface of Solar energy trading.



Fig. 11. Transferring the total payment amount from user 1 to user 2.

Fig. 12. Transaction from user 1 on Ganache personal Ethereum blockchain.

6 Conclusion

The increase of solar PV and the progress of blockchain technology play an important role in the energy sector. The future market will become a peer-to-peer market in the microgrid, and the excess energy is going to make a profit for the prosumer. This paper presented a decentralized application based on the blockchain technology framework. Technical details were provided on setting up the Ethereum blockchain, making the token using the ERC20 standard, useful dependencies to set up the DApp, and some parts of trading with the smart contract.

References
1. Bitcoin Project: Bitcoin is an innovative payment network and a new kind of money. MIT license (2009–2020). https://bitcoin.org/en/
2. Tasatanattakool, P.: Blockchain: challenges and applications. In: International Conference on Information Networking (ICOIN), Thailand (2018)
3. Xu, X.: Architecture for Blockchain Applications. Springer (2019)
4. Cai, W., Wang, Z., et al.: Decentralized applications: the blockchain-empowered software system. IEEE Access 6, 53019–53033 (2018)
5. Ethereum.org: Stiftung Ethereum. https://ethereum.org/en/
6. Cleverflare: Learn to Code Blockchain DApps By Building Simple Games. Loom. https://cryptozombies.io
7. Santander, A.: Hello Ethernaut. OpenZeppelin. https://ethernaut.openzeppelin.com
8. Remix – Ethereum IDE. https://remix.ethereum.org
9. Bootstrap team: Cards. MIT. https://getbootstrap.com/docs/4.0/components/card
10. Bootstrap team: Navs. MIT. https://getbootstrap.com/docs/4.0/components/navs
11. W3Schools: CSS flexbox. https://www.w3schools.com/Css/css3_flexbox.asp
12. web3.js documentation. Core team (2016). https://web3js.readthedocs.io/en/v1.2.11/
13. Facebook Open Source: React.Component. Facebook Inc. (2020). https://reactjs.org/docs/react-component.html
14. web3.js documentation: getNetworkType. Core team (2016). https://web3js.readthedocs.io/en/v1.2.9/web3-eth.html#net
15. web3.js documentation: getAccounts. Core team (2016). https://web3js.readthedocs.io/en/v1.2.9/web3-eth.html#getaccounts
16. web3.js documentation: eth.Contract. Core team (2016). https://web3js.readthedocs.io/en/v1.2.9/web3-eth-contract.html#eth-contract
17. web3.js documentation: getId. Core team (2016). https://web3js.readthedocs.io/en/v1.2.9/web3-eth-net.html?highlight=getId
18. web3.js documentation: methods.myMethod.call. Core team (2016). https://web3js.readthedocs.io/en/v1.2.9/web3-eth-contract.html?highlight=method.#methods-mymethod-call
19. McVay, W.: Ganache. Truffle Blockchain Group (2020). https://www.trufflesuite.com/ganache
20. Davis, A.: Metamask. ConsenSys Formation (2020). https://metamask.io
21. Vogelsteller, F., Buterin, V.: EIP-20: Token Standard. Ethereum Improvement Proposals (2015). https://eips.ethereum.org/EIPS/eip-20
22. Ethereum Homestead documentation: Ether. Core team (2016). https://ethdocs.org/en/latest/ether.html
Solar Module with Photoreceiver Combined
with Concentrator

Vladimir Panchenko¹,² and Andrey Kovalev²

¹ Russian University of Transport, Obraztsova st. 9, 127994 Moscow, Russia
pancheska@mail.ru
² Federal Scientific Agroengineering Center VIM, 1st Institutsky passage 5, 109428 Moscow, Russia
kovalev_ana@mail.ru

Abstract. The paper considers the design of the solar photovoltaic air cooling
module with a paraboloid type concentrator. Photovoltaic converters are located
on the surface of the concentrator, which ensures their cooling using a metal
radiator, the functions of which are performed by the solar concentrator itself. At
the same time, the profile of the solar radiation concentrator provides uniform
illumination of photovoltaic cells, which favorably affects their efficiency. It is
advisable to use high-voltage matrix silicon photovoltaic solar cells as photo-
voltaic converters, which have high electrical efficiency and maintain it at a high
concentration of solar radiation and heating with concentrated solar radiation.

Keywords: Solar energy · Solar concentrator · Silicon photovoltaic converters · Air heat sink · Uniform illumination · Profile · Efficiency

1 Introduction

The use of solar radiation concentrators can reduce the number of photovoltaic converters, which favorably affects the cost of the installation and of the electricity received with its help [1–3]. However, when photovoltaic cells convert concentrated solar radiation, they heat up significantly, as a result of which their electrical efficiency decreases [4], which indicates the need for their cooling [5–10]. When photovoltaic converters operate in a concentrated solar stream, their current-voltage characteristic acquires a triangular shape, which indicates a significant decrease in their efficiency [11]. All planar solar cells have this property, but matrix high-voltage silicon photovoltaic converters do not lose electrical efficiency in concentrated solar flux due to their structure [12, 13]. The use of a paraboloid type concentrator in the form of an air radiator, in the focal region of which such matrix photovoltaic converters are located directly on its surface, eliminates the need for an active cooling system. However, for their stable operation it is necessary to ensure uniform illumination of the entire surface of the photovoltaic converters, which requires a special geometric approach and design methods [14] for the profile of the solar radiation concentrator.


2 Creation of the Geometry of the Working Profile of the Paraboloid Type Solar Radiation Concentrator

The developed method is proposed to be applied to the calculation of the profile of a paraboloid type concentrator, which provides uniform illumination of photovoltaic converters at a relatively high degree of solar radiation concentration and ensures stable electricity generation. On the surface of the developed solar radiation concentrator there are photovoltaic converters, which have thermal contact with its metal surface, which allows the heat received from the solar radiation to be removed by the entire surface of the metal concentrator (Fig. 1). Thanks to the efficient heat sink, the photovoltaic converters do not overheat and operate at the nominal operating mode without losing their electrical efficiency.

Fig. 1. The design of the solar photovoltaic module, where the concentrator is also an air-cooled
radiator for photovoltaic converters

Thanks to the use of silicon high-voltage matrix photovoltaic converters (Fig. 2), it becomes possible to obtain a high-voltage direct current (1000 V or more) at the output of one solar module, as well as an increase in the electrical conversion efficiency of solar radiation and, accordingly, a reduction in the cost of the module (specific electric power) and of the generated electric energy. Due to the use of the technology of sealing the photovoltaic converters with a two-component polysiloxane compound, the period over which the photovoltaic converters retain their rated power also increases.

Fig. 2. Silicon matrix high-voltage photovoltaic converters with a voltage of more than 1000 V

The developed method allows obtaining the necessary concentration of solar radiation on the surface of the photovoltaic converters through the calculation of the working profile of the solar radiation concentrator. The solar photovoltaic module (Fig. 3) consists of the paraboloid type concentrator 1, which in turn consists of the different zones a–b, b–c and c–d. The zones under consideration provide a cylindrical focal region of concentrated solar radiation with sufficiently uniform illumination in the focal region, on the surface of the photovoltaic receiver 2 (silicon matrix high-voltage photovoltaic converters). The photovoltaic receiver 2 is made in the form of a truncated cone of commutated silicon high-voltage matrix photovoltaic converters of height d₀, which are located on the reflective (front) surface of the concentrator in the zone b–c. The paraboloid type solar radiation concentrator for the silicon high-voltage matrix photovoltaic converters is also the radiator of passive air cooling.

Fig. 3. Principle of operation and the scheme of the concentrator photovoltaic solar module

Solar radiation incident on the reflecting surface of the solar concentrator 1 is reflected from the surface of the concentrator in the zones a–b and c–d in such a way that a sufficiently uniform illumination of the photoelectric receiver 2 by concentrated solar radiation is provided in the focal region, which in turn is located on the profile of the concentrator in the zone b–c.
The profile of the reflecting surface of the solar concentrator X(Y) under consideration is determined by a system of various equations corresponding to the illuminance condition of the surface of the photoelectric receiver (silicon matrix high-voltage photovoltaic converters). The values of the X and Y coordinates of the working profile of the solar radiation concentrator in the zone a–b are determined by the following system of equations:
X = 2·f·(1/cos β + tan β)   (1)

Y = X²/(4·f)   (2)

(2·R − X)² = 4·f·Y   (3)

2·R − X_c = (2·R − X_b) + d₀·sin α₀   (4)

X_b = 2·f·tan β₀·{[1 + Y₀/(f·tan² β₀)]^(1/2) − 1}   (5)

Y_b = X_b²/(4·f)   (6)

X_c = X_b − d₀·sin α₀   (7)

Y_c = Y_b − h   (8)

h = d₀·sin α₀   (9)

d₀/sin α = (Y_c − f)/(sin δ·sin φ₀)   (10)

The focal length of the solar concentrator f is calculated by the formula:

f = R·(1/cos β₀ − tan β₀)   (11)

where β₀ is the angle between the sunbeam reflected from the surface of the concentrator profile in the zone a–b at the point with coordinates X_a and Y_a, coming into the focal region of the concentrator (on the surface of the photovoltaic photoreceiver) at the point with coordinates (2·R − X_b), Y_b, and the parabola focal length f;
the angle β = β₀·n/N varies from 0° to β₀, and the values of the coefficient n are selected from a series of integers n = 0, 1, 2, …, N;
γ₀ is the angle between the sunbeam reflected from the surface of the concentrator profile in the zone a–b at the point with coordinates X_b and Y_b, coming into the focal region of the concentrator (on the surface of the photovoltaic photoreceiver) at the point with coordinates (2·R − X_c), Y_c, and the parabola focal length f;
α₀ is the angle of inclination of the working surface of the profile of the concentrator in the zone b–c;
R is the maximum radius of the solar concentrator;
the angle δ is determined by the expression δ = π/2 + α₀ − β;
the angles α₀ and β₀ are selected in accordance with the boundary conditions.
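A minimal numeric sketch of Eqs. (2) and (11) as reconstructed above (not code from the paper; R and β₀ are illustrative values chosen only to show the order of magnitude):

// Hypothetical check of the focal length and parabola profile
const R = 200                          // maximum radius of the concentrator, mm (assumed)
const beta0 = 30 * Math.PI / 180       // boundary angle beta_0, rad (assumed)
const f = R * (1 / Math.cos(beta0) - Math.tan(beta0))  // Eq. (11), ~115.5 mm
const Y = (X) => X * X / (4 * f)       // Eq. (2): profile ordinate Y(X)
console.log(f.toFixed(1), Y(100).toFixed(1))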
The distribution of the concentration of solar radiation Kab on the surface of the
photovoltaic photoreceiver, reflected from the upper surface of the working profile of
the solar radiation concentrator (zone a – b) is calculated according to the formulas:
K_ab = (R_{n+1}² − R_n²)/(Δd_n·(R_{bc,n+1} + R_{bc,n}))   (12)

Δd_n = d_{n+1} − d_n   (13)

R_n = X_n − R   (14)

R_{bc,n} = X_{bc,n} − R   (15)

The values of the X and Y coordinates in the zone of the working profile of the
concentrator c – d are determined using the system of equations:
Y_m = [Y_c/tan α_{m−1} − Y_b/tan α_m − (X_b − X_c)] / [1/tan α_{m−1} − 1/tan α_m]   (16)

X_m = X_c − (Y_c − Y_m)/tan α′_m   (17)

tan α_{m−1} = (Y_b − Y_{m−1})/(X_{m−1} − (2·R − X_c))   (18)

tan α_m = (Y_b − Y_m)/(X_m − (2·R − X_c))   (19)

X_m = [(2·R − X_c) + (Y_c − Y_m)]/tan α_{m−1}   (20)

α′_m = α₀ + α_m   (21)

X_d = 2·f·tan β₀·{[1 + Y₀/(f·tan² β₀)]^(1/2) − 1}   (22)

Y_h = Y_d − h,   Y_d = X_d²/(4·f)   (23)

where α′_m is the angle of inclination of the working surface of the profile of the concentrator in the zone c–d;
α_m is the angle between the sunbeam reflected from the surface of the profile of the concentrator in the zone c–d at the point with coordinates X_m and Y_m, coming to the focal region of the concentrator (on the surface of the photovoltaic receiver) at the point with coordinates X_b and Y_b, and the level of the coordinate Y_m;
the values of the coefficient m are selected from a series of integers m = 0, …, M.
Distribution of the concentration of solar radiation Kcd on the surface of the pho-
tovoltaic receiver is calculated according to the formulas:

K_cd = Σ_{m=0}^{M} K_m = Σ_{m=0}^{M} (R_{mn}² − R_{m(n+1)}²)/(Δd_n·(R_{bc(n+1)} + R_{bc,n}))   (24)

R_{mn} = R_m − ΔX_{mn}   (25)

R_m = X_m − R   (26)

ΔX_{mn} = ΔX_m·d_n   (27)

ΔX_m = X_m − X_{m−1}   (28)

The presented systems of equations make it possible to calculate the coordinates of the working surface of the profile of the solar radiation concentrator in its various zones.

3 Results of Calculating the Profile of the Paraboloid Type Concentrator

Using the formulas presented above, the coordinates of the working profile of the
paraboloid type solar concentrator are calculated, which ensures uniform illumination
of the focal region (silicon matrix high-voltage photovoltaic converters) with con-
centrated solar radiation. Figure 4 shows the profile of the reflecting surface of the
paraboloid type solar radiation concentrator with a maximum diameter of about
400 mm.
Three different zones can be distinguished on the profile of the solar radiation concentrator, one of which has a flat shape and represents the landing surface of the photovoltaic converters. Solar modules of this kind can be installed together on one frame with a system for constant tracking of the position of the Sun. For a better layout and optimal filling of the frame space, the concentrators of the solar modules can be squared and closely mounted to each other.

Fig. 4. Working surface of the concentrator profile of the paraboloid type
Figure 5 on the left shows a three-dimensional model of the solar module with the
profile of the solar concentrator, calculated according to the developed method and
providing uniform distribution of concentrated solar radiation over the surface of
photoreceiver (silicon matrix high-voltage photovoltaic converters) in the focal region,
which is presented in the Fig. 5 on the right.

Fig. 5. Three-dimensional model of the solar module (on the left) and distribution of
concentrated solar radiation over the surface of photoreceiver (on the right)

The distribution of illumination by concentrated solar radiation over the surface of the photoreceiver is relatively uniform; it varies from 7 to 11 times and averages 9 times. The presented distribution will favorably affect the operation of the photovoltaic converters, providing illumination of the entire surface of the photovoltaic converters with uniform concentrated solar radiation, due to which the output electric power of the solar module will be at the nominal level.
Thanks to the calculated profile of the paraboloid type solar concentrator, it becomes possible to create its three-dimensional solid-state model in a computer-aided design system. Figure 6 shows such a concentrator of solar radiation of paraboloid type and its single component (petal), into which it can be divided for subsequent manufacture from metallic reflective material.

Fig. 6. Three-dimensional model of a paraboloid type solar concentrator and its single
component made of reflective metal

Based on the developed three-dimensional model, the subsequent manufacture of a paraboloid type solar concentrator from reflective metal material with the help of single components is possible. The number of single components may vary depending on the required manufacturing accuracy and the optical efficiency of the solar concentrator itself. The more single elements that make up the solar radiation concentrator, the more accurate its manufacture and the greater its optical efficiency. However, with a large number of components the complexity of manufacturing also increases, in view of which other methods of manufacturing solar concentrators may be more relevant (the centrifugal method, the method of electroforming, the method of glass bending, the manufacture of a reflecting surface from flat curved mirrors) [14].

4 Conclusion

A photovoltaic solar module with a paraboloid type concentrator and a photoreceiver based on silicon high-voltage matrix photovoltaic converters located directly on the surface of the concentrator, which is a passive air-cooled radiator for them, has been developed. The solar radiation concentrator provides a fairly uniform illumination of the surface of the photovoltaic receiver. The use of silicon high-voltage matrix
photovoltaic converters makes it possible to increase the electrical efficiency in the
conversion of concentrated solar radiation even when they are overheated. Excessive
heat is removed from the photovoltaic converters due to the entire area of the solar
radiation concentrator. The use of a two-component polysiloxane compound in the
manufacturing process of the photovoltaic receiver allows increasing the term of the
rated power of the photovoltaic converters in the concentrated solar radiation flux.
Concentrators of solar radiation of the solar modules can be squared to optimally fill the
frame of the system for constant tracking the position of the Sun, and be manufactured
using various methods and manufacturing technologies.

Acknowledgment. The research was carried out on the basis of financing the state assignments
of the All-Russian Institute of the Electrification of Agriculture and the Federal Scientific
Agroengineering Center VIM, on the basis of funding of the grant “Young lecturer of RUT” of
the Russian University of Transport, on the basis of funding of the Scholarship of the President of
the Russian Federation for young scientists and graduate students engaged in advanced research
and development in priority areas of modernization of the Russian economy, direction of
modernization: “Energy efficiency and energy saving, including the development of new fuels”,
subject of scientific research: “Development and research of solar photovoltaic thermal modules
of planar and concentrator structures for stationary and mobile power generation”.

References
1. Rosell, J.I., Vallverdu, X., Lechon, M.A., Ibanez, M.: Design and simulation of a low
concentrating photovoltaic/thermal system. Energy Convers. Manage. 46, 3034–3046 (2005)
2. Nesterenkov, P., Kharchenko, V.: Thermo physical principles of cogeneration technology
with concentration of solar radiation. Adv. Intell. Syst. Comput. 866, 117–128 (2019).
https://doi.org/10.1007/978-3-030-00979-3_12
3. Kemmoku, Y., Araki, K., Oke, S.: Long-term performance estimation of a 500X
concentrator photovoltaic system. In: 30th ISES Biennial Solar World Congress 2011,
pp. 710–716 (2011)
4. Kharchenko, V., Nikitin, B., Tikhonov, P., Panchenko, V., Vasant, P.: Evaluation of the
silicon solar cell modules. In: Intelligent Computing & Optimization. Advances in Intelligent
Systems and Computing, vol. 866, pp. 328–336 (2019). https://doi.org/10.1007/978-3-030-
00979-3_34
5. Sevela, P., Olesen, B.W.: Development and benefits of using PVT compared to PV. Sustain.
Build. Technol. 90–97 (2013)
6. Ibrahim, A., Othman, M.Y., Ruslan, M.H., Mat, S., Sopian, K.: Recent advances in flat plate
photovoltaic/thermal (PV/T) solar collectors. Renew. Sustain. Energy Rev. 15, 352–365
(2011)
7. Kharchenko, V., Nikitin, B., Tikhonov, P., Gusarov, V.: Investigation of experimental flat
PV thermal module parameters in natural conditions. In: Proceedings of 5th International
Conference TAE 2013, pp. 309–313 (2013)
8. Hosseini, R., Hosseini, N., Khorasanizadeh, H.: An Experimental study of combining a
Photovoltaic System with a heating System. In: World Renewable Energy Congress,
pp. 2993–3000 (2011)
9. Panchenko, V., Kharchenko, V., Vasant, P.: Modeling of solar photovoltaic thermal modules.
In: Intelligent Computing & Optimization. Advances in Intelligent Systems and Computing,
vol. 866, pp. 108–116 (2019). https://doi.org/10.1007/978-3-030-00979-3_11
10. Panchenko, V.A.: Solar roof panels for electric and thermal generation. Appl. Solar Energy
54(5), 350–353 (2018). https://doi.org/10.3103/S0003701X18050146
11. Kharchenko, V., Panchenko, V., Tikhonov, P., Vasant, P.: Cogenerative PV thermal
modules of different design for autonomous heat and electricity supply. In: Handbook of
Research on Renewable Energy and Electric Resources for Sustainable Rural Development,
pp. 86–119 (2018). https://doi.org/10.4018/978-1-5225-3867-7.ch004
12. Panchenko, V.: Photovoltaic solar modules for autonomous heat and power supply. IOP
Conf. Ser. Earth Environ. Sci. 317 (2019). 9 p. https://doi.org/10.1088/1755-1315/317/1/
012002
13. Panchenko, V., Izmailov, A., Kharchenko, V., Lobachevskiy, Y.: Photovoltaic solar
modules of different types and designs for energy supply. Int. J. Energy Optim. Eng. 9(2),
74–94 (2020). https://doi.org/10.4018/IJEOE.2020040106
14. Sinitsyn, S., Panchenko, V., Kharchenko, V., Vasant, P.: Optimization of Parquetting of the
concentrator of photovoltaic thermal module. In: Intelligent Computing & Optimization.
Advances in Intelligent Systems and Computing, vol. 1072, pp. 160–169 (2020). https://doi.
org/10.1007/978-3-030-33585-4_16
Modeling of Bilateral Photoreceiver
of the Concentrator Photovoltaic Thermal
Module

Vladimir Panchenko¹,², Sergey Chirskiy³, Andrey Kovalev², and Anirban Banik⁴

¹ Russian University of Transport, Obraztsova st. 9, 127994 Moscow, Russia
pancheska@mail.ru
² Federal Scientific Agroengineering Center VIM, 1st Institutsky passage 5, 109428 Moscow, Russia
kovalev_ana@mail.ru
³ Bauman Moscow State Technical University, 2nd Baumanskaya st. 5, 105005 Moscow, Russia
baragund@yandex.ru
⁴ Department of Civil Engineering, National Institute of Technology Agartala, Jirania 799046, Tripura (W), India
anirbanbanik94@gmail.com

Abstract. The paper describes the modeling of the thermal state and visual-
ization of the operating mode of the bilateral photoreceiver of the concentrator
photovoltaic thermal solar module. Based on the results obtained during the
simulation, the geometric parameters of the components of the photoreceiver of
the solar module are substantiated and selected. The developed design of the
solar module will provide high electrical and thermal efficiency of the solar
module. These types of concentrator photovoltaic thermal solar modules can
operate autonomously or in parallel with the existing power grid to power
consumers.

Keywords: Solar energy · Solar concentrator · High-voltage matrix photovoltaic converters · Photovoltaic thermal solar module · Power supply

1 Introduction

Recently solar energy converters have been developing at a pace that outstrips the
development of converters of other renewable energy sources [1–6]. Along with
increasing the efficiency of photovoltaic converters, when the technology of their
manufacture is being improved [7–9], the cogeneration method is also used to increase
the overall efficiency of the solar module – when, along with electricity, the consumer
receives thermal energy. Such solar modules can be either a planar design [10] or a
concentrator design [11] when various concentrators and reflectors of solar energy are
used. Such devices must be manufactured with high accuracy, and their design is an
important and difficult task [12]. Concentrator thermal photovoltaic solar plants gen-
erate electrical and thermal energy, while the temperature of the coolant can reach
higher temperatures compared to planar photovoltaic thermal modules. The design of


such modules is complicated by the need to simulate the thermal state of the pho-
toreceiver located in the focus of the concentrator, since at its high temperature, the
electrical efficiency of the photovoltaic converters decreases. Along with the design and
manufacture of solar concentrators, an important role is played by the design and simulation of photoreceivers of solar modules in both planar and concentrator designs [13].
In addition to the method for designing such modules, methods of modeling and
visualizing the processes occurring in the photoreceivers of such solar photovoltaic
thermal solar modules are also necessary.

2 Three-Dimensional Modeling of Bilateral Photoreceivers of Photovoltaic Thermal Solar Modules

The developed method for creating three-dimensional models of photoreceivers of solar photovoltaic thermal modules allows creating photoreceivers with unilateral and bilateral photoelectric converters, as well as with facial, rear and two-sided heat removal (Fig. 1).

Fig. 1. Components of solar thermal photovoltaic modules of various designs and the model of
the bilateral photoreceiver “Model 4”

Bilateral photovoltaic converters make it possible to create concentrator photovoltaic thermal solar modules in which the photovoltaic converters are illuminated from two sides; in this case the number of used photovoltaic converters is halved, and given that the solar concentrator is also part of the installation, the savings are even more significant when compared with planar solar modules. Two-sided heat removal allows more efficient cooling of the photovoltaic converters, and the resulting heat increases the overall efficiency of the solar module. When creating a photoreceiver with bilateral photovoltaic converters and two-sided heat removal, the components of the “Model 4” are used (Fig. 1), which consists of 7 different components.
In concentrator solar photovoltaic thermal modules it is advisable to use not
standard silicon planar photovoltaic converters, but silicon high-voltage matrix ones,
which were originally developed for concentrator modules. The efficiency of high-
voltage matrix photovoltaic modules increases when working in concentrated solar
flux, and its high value is maintained even when the concentration of solar radiation is
more than 100 times. Moreover, the high electrical efficiency of the matrix photovoltaic
converters is maintained with increasing temperature, while the efficiency of planar
photovoltaic converters significantly decreases with increasing temperature.
Silicon photovoltaic converters of this kind are used in “Model 4” as a bilateral
electrogenerating component. By scaling the length of the high-voltage matrix pho-
tovoltaic converters, it is possible to achieve higher module voltages, and an increase in
the concentration of solar radiation can proportionally increase the electric current and
electric power of the entire module.

3 Modeling and Visualization of the Thermal State of the Bilateral Photoreceiver of the Concentrator Photovoltaic Thermal Module

After creating a three-dimensional model of the photoreceiver of the solar photovoltaic thermal module in the computer-aided design system, it is advisable to study this model in order to determine its thermal state in the Ansys finite element analysis system. As a result of modeling the thermal state, it is possible to analyze and draw a conclusion regarding the operation mode of the photoreceiver and then perform layer-by-layer optimization of the model components in order to achieve maximum overall module efficiency (maximum thermal efficiency or maximum electric efficiency). As a result of modeling according to the developed method, it becomes possible to determine the temperature fields of the model and visualize the coolant flow [14].
As an example of modeling the thermal state of the bilateral photoreceiver of the
concentrator photovoltaic module, we consider a receiver with matrix photovoltaic
converters located in the focus of the concentrator and cooled by nitrogen (Fig. 2). The
model of the photoreceiver in the Ansys finite element analysis system is presented in
the Fig. 2. Half of the photoreceiver is presented in view of its symmetry to accelerate
the modeling process in the finite element analysis system.

Fig. 2. Model of the bilateral photoreceiver and coolant movement

For calculation in a finite element analysis system, the photoreceiver model was divided into finite elements, as well as broken down into components representing various physical bodies. When modeling the thermal state in the finite element analysis system, the following parameters and module characteristics were adopted: module length 600 mm; thermophysical properties of substances: silicon: thermal conductivity 120 W/(m·K), heat capacity 800 J/(kg·K), density 2300 kg/m³; glass: thermal conductivity 0,46 W/(m·K), heat capacity 800 J/(kg·K), density 2400 kg/m³; rubber: thermal conductivity 0,15 W/(m·K), heat capacity 1800 J/(kg·K), density 1200 kg/m³; nitrogen: atmospheric pressure 101300 Pa, 20 °C, dynamic viscosity 17·10⁻⁶ Pa·s, thermal conductivity 0,026 W/(m·K), heat capacity at constant pressure 1041 J/(kg·K), density 1,182 kg/m³.
At the next stage of modeling, the following parameters were adopted: specific heat
release on the photovoltaic converter: 2 W/cm2, gas temperature at the inlet: 15 ºC; gas
inlet speed: 10 m/s; gas pressure: 10 atm; total gas flow: mass: 0,066 kg/s, volumetric
at operating pressure and temperature: 5,50 l/s, volumetric at atmospheric pressure and
temperature 20 ºC: 56,12 l/s. The thermal state of the photovoltaic converter is shown
in the Fig. 3 on the left, the temperature of the coolant and other structural components
is shown in the Fig. 3 on the right.

Fig. 3. Thermal state of the photovoltaic converter (on the left) and temperature of the coolant
and other structural components (on the right)

The temperature of the silicon photovoltaic converter ranged from 44 °C at the


edges of the cell to 53 °C in the center of the cell. The temperature of the coolant in the
central section of the model ranged from 15 ºC (did not differ from the inlet coolant
temperature) to 34 °C, which indicates the heating of the coolant layer adjacent to the photovoltaic converter and the removal of heat from it. By varying parameters such
as the concentration value (specific heat release on the photovoltaic converter), heat
carrier velocity and temperature of the heat carrier at the inlet, the necessary temper-
ature values of the photovoltaic converter and heat carrier can be obtained.
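A rough energy-balance cross-check of the simulated heating (a sketch, not from the paper): with the stated specific heat release of 2 W/cm², the nitrogen flow of 0,066 kg/s and cp = 1041 J/(kg·K), and a heated strip area assumed here purely for illustration, the mean coolant temperature rise comes out in the same range as the simulated values.

// Hypothetical mean temperature rise of the nitrogen coolant
const q = 2 * 1e4            // specific heat release, W/m^2 (2 W/cm^2)
const area = 0.6 * 0.05      // heated area, m^2: 600 mm length x 50 mm width (assumed)
const mdot = 0.066           // nitrogen mass flow rate, kg/s
const cp = 1041              // nitrogen heat capacity, J/(kg*K)
const dT = q * area / (mdot * cp)
console.log(dT.toFixed(1))   // ~8.7 K mean rise; the layer next to the cell heats more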
Thanks to the developed modeling method [14], a study was made of the bilateral photoreceiver of the solar photovoltaic thermal module (“Model 4”), as a result of which temperature distributions, flows and coolant velocities were obtained, which made it possible to analyze and optimize the layer-by-layer structure of the model. High-voltage bilateral matrix photovoltaic converters are located in the focal region of the concentrator of the module. The bilateral photovoltaic thermal photoreceiver located in the focus of the solar radiation concentrator was subjected to simulation. The bilateral photoreceiver of the solar photovoltaic thermal module consists of 7 different components (Fig. 1), the layered structure of which is presented in Table 1. Modeling the thermal state in a finite element analysis system with obtaining an array of temperatures gives a more detailed three-dimensional picture of the thermal state of the module, in contrast to two-dimensional analytical modeling, and allows comparing the thermal state of all components together and of each separately for different parameters.

Table 1. Parameters of the components of the bilateral photoreceiver of the solar photovoltaic thermal module (“Model 4”)

1. Bilateral electrogenerating component: thickness 0,2 mm; material: silicon; thermal conductivity 148 W/(m·K); density 2330 kg/m³; heat capacity 714 J/(kg·K); coefficient of thermal expansion 2,54·10⁻⁶ 1/K.
2. Transparent sealing component: thickness 0,2 mm; material: polysiloxane; thermal conductivity 0,167 W/(m·K); density 950 kg/m³; heat capacity 1175 J/(kg·K); coefficient of thermal expansion 100·10⁻⁶ 1/K.
3. Transparent insulating component: thickness 0,1 mm; material: polyethylene; thermal conductivity 0,14 W/(m·K); density 1330 kg/m³; heat capacity 1030 J/(kg·K); coefficient of thermal expansion 60·10⁻⁶ 1/K.
4. Transparent heat removal component: thickness 5 mm; material: water; thermal conductivity 0,569 W/(m·K); density 1000 kg/m³; dynamic viscosity 1788·10⁻⁶ Pa·s; kinematic viscosity 1,78·10⁻⁶ m²/s; heat capacity 4182 J/(kg·K).
5. Transparent insulating heat sink component: thickness 0,3 mm; material: polyethylene; thermal conductivity 0,14 W/(m·K); density 1330 kg/m³; heat capacity 1030 J/(kg·K); coefficient of thermal expansion 60·10⁻⁶ 1/K.
6. Transparent heat-insulating component: thickness 3; 5; 7 mm; material: air; thermal conductivity 0,0244 W/(m·K); density 1,293 kg/m³; dynamic viscosity 17,2·10⁻⁶ Pa·s; kinematic viscosity 13,2·10⁻⁶ m²/s; heat capacity 1005 J/(kg·K).
7. Transparent protective component: thickness 4 mm; material: tempered glass optiwhite; thermal conductivity 0,937 W/(m·K); density 2530 kg/m³; heat capacity 750 J/(kg·K); coefficient of thermal expansion 8,9·10⁻⁶ 1/K.

Since the model of the photoreceiver is symmetric, the axis of symmetry will be the
middle of the bilateral electrogenerating component (silicon matrix high-voltage pho-
toelectric converter) (Fig. 4, above) in longitudinal section (Fig. 4, below). The thicknesses of the components in the drawing are preliminary.

Fig. 4. Bilateral electrogenerating component (silicon matrix high-voltage photoelectric


converter) (above), model of bilateral module and module drawing shown in section (below)

When modeling the thermal state of the model of the photoreceiver, the influence of
the concentration of solar radiation (3; 6 and 9 times), transparent heat-insulating
component (the air gap) (3; 5 and 7 mm), as well as the flow rate of the coolant (0,5; 5
and 50 g/s) on the thermal state of the coolant is considered. The temperature of the
bilateral electrogenerating component (silicon matrix high-voltage photoelectric con-
verter) should be below 60 ºC, since when the photovoltaic converters are heated, their
electrical efficiency gradually decreases.
As a preliminary calculation, a part of the bilateral electrogenerating component is
considered: a strip 10 mm wide, conventionally located in the middle part of the
photovoltaic thermal photoreceiver, the length of this part being equal to the length of
the entire module. Since the module is symmetrical with respect to the average hori-
zontal plane, only its upper half is considered.
The flow of the cooling agent is considered uniform throughout the cross section of
the cooling cavity. Three variants of the module design are considered, differing in the
height of the transparent heat-insulating component (air layer thickness) – 3 mm, 5 mm
and 7 mm (Fig. 5).

Fig. 5. Three variants of the module design (height of the transparent heat-insulating component
3 mm, 5 mm and 7 mm)

The influence of the concentration of solar radiation on the thermal state of the
module was studied – three cases were considered: a concentration of 3, 6 and 9 times
for each side of the module. Taking into account the fraction of solar radiation con-
verted into electrical energy, the heat flux is taken to be 2400 W/m2, 4800 W/m2 and
7200 W/m2.

For the case of water cooling with the temperature of 293 K, the thermal state was simulated for mass flow rates of 0,05 kg/s, 0,005 kg/s and 0,0005 kg/s. It was noted that at the mass flow rate of water of 0,05 kg/s the water layer (transparent heat removal component) does not warm up through its entire thickness; therefore, the calculation for this flow rate was performed only for the design variant with a heat insulator thickness of 3 mm.
To assess the effect of non-uniformity in the supply of the cooling agent (trans-
parent heat removal component), a solid-state model of a quarter of the photoreceiver is
constructed, shown in the Fig. 6. A design variant with a thickness of the transparent
heat-insulating component (air layer thickness) equal to 3 mm is considered. The layers
of transparent heat-insulating component and transparent heat removal component are
divided by thickness into 10 layers of elements (Fig. 6). The components of the model
with a small thickness are divided by thickness into 3 layers of elements.

Fig. 6. Solid-state model of a quarter of the photoreceiver

The symmetry planes of the module are the average vertical and horizontal planes.
In the Fig. 7 on the left, the symmetry planes are marked in green. The coolant enters
and exits through the central holes marked in green in the Fig. 7 on the right. The flow
rate of the cooling agent is set at 0,205 kg/s and 0,0205 kg/s, which is equivalent to a
flow rate of 0,05 kg/s and 0,005 kg/s for a part of the module with a width of 10 mm.

Fig. 7. Symmetry planes of the model (on the left) and coolant inlet (on the right)

As a result of the simulation, the temperature distributions of the components of the


solar module model are obtained, as well as the velocity and flow line of the coolant
(Figs. 8 and 9).

Fig. 8. Temperature distribution of the components of the model from the side of the inlet and
outlet of the coolant (above) and the velocity of the coolant from the side of the inlet and outlet of
the coolant (below)

Fig. 9. Temperature distribution of the components of the model along the entire length – top
view (above) and coolant velocity along the entire length – top view (below)

Based on the temperature distributions, the temperature values of the various components of the model were determined and entered in Table 2 for further optimization of the thicknesses of the components. The quality of washing of the transparent radiator (transparent insulating component) and the uniformity of the flow lines of the coolant (transparent heat removal component) at various flow rates were also analyzed, as were the places of overheating of the photovoltaic converters (bilateral electrogenerating component), underheating of the coolant and its stagnation in the module. The presented figures (the result of three-dimensional modeling of the thermal state) carry the necessary information for analysis and optimization of the thermal state of the components of the module and of the quality of its cooling.
As a result of the analysis of the temperature array of the “Model 4” components obtained in the process of modeling the thermal state of the three-dimensional model using the developed method, optimization was carried out according to the criterion of the change of the coolant temperature (heat removal component). With the same change in the main criterion (coolant temperature), the change in the total criterion is calculated. As a result of the design optimization, an air gap (transparent heat-insulating component) of 7 mm, a concentration of 6 times and a coolant flow rate of 0,5 g/s for the selected part of the module were selected. The number of models can be expanded with an increase in the ranges of the geometric characteristics of the other components and of the coolant flow.

Table 2. Temperature characteristics of the photoreceiver components and component optimization criteria

Air gap (transparent heat-insulating component), mm:                                |    3     |    5     |    7
Mass flow rate of coolant (transparent heat removal component), g/s:                | 0,5 |  5 | 0,5 |  5 | 0,5 |  5
Glass temperature at the coolant inlet (transparent protective component), °C:      |  20 | 20 |  20 | 20 |  20 | 20
Glass temperature at the coolant outlet (transparent protective component), °C:     |  29 | 20 |  29 | 20 |  29 | 20
Change in criterion, % (priority of the third category – optimum minimum):          |  45 |  0 |  45 |  0 |  45 |  0
Air temperature in the gap at the coolant inlet (heat-insulating component), °C:    |  20 | 20 |  20 | 20 |  20 | 20
Air temperature in the gap at the coolant outlet (heat-insulating component), °C:   |  35 | 20 |  34 | 20 |  34 | 20
Change in criterion, % (priority of the fourth category – optimum minimum):         |  75 |  0 |  70 |  0 |  70 |  0
Temperature of photovoltaic converters at the coolant inlet (bilateral
electrogenerating component), °C:                                                   |  26 | 26 |  26 | 25 |  26 | 26
Temperature of photovoltaic converters at the coolant outlet (bilateral
electrogenerating component), °C:                                                   |  44 | 32 |  44 | 32 |  44 | 32
Change in criterion, % (priority of the second category – optimum minimum):         |  69 | 23 |  69 | 28 |  69 | 23
Temperature of the coolant at the outlet of the module (heat removal component), °C:|  38 | 22 |  38 | 23 |  38 | 21
Change in criterion, % (priority of the first (main) category – optimum maximum):   |  90 | 10 |  90 | 15 |  90 |  5
Change in the total criterion (maximum priority):                                   |  −60,6   |  −57,1   |  −57,1

Optimization of the air gap and the flow rate of the coolant were carried out
according to the criterion for changing the temperature of the coolant (priority of the
first category). The remaining secondary criteria by which optimization did not occur
were of secondary importance, but also had priority depending on their contribution to
the heating of the coolant (the second category – coefficient 0,9, the third – 0,8, the
fourth category – 0,7 and the fifth – 0,6) (Table 2). Also, for changes in the criteria,
Table 2 shows the achievement of their desired values in terms of finding the optimum
– minimum or maximum, which is also taken into account when summing up the
secondary criteria, that is, the smaller the change, the better the heating of the coolant.
If a change in the value of a secondary criterion reduces the main criterion (the priority of the first category – heating of the coolant, relative to the value of which a decision is made on the appropriateness of a particular module design), then when summing up, the magnitude of the change of this criterion is given with a negative sign. With the same change in the main criterion (coolant temperature), the change in the total criterion is calculated, which includes changes in all secondary criteria, and preference is given to the maximum value of the total criterion:
ΣK = 1·K₁ − 0,9·K₂ − 0,8·K₃ − 0,7·K₄ − 0,6·K₅
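As a worked check of this total criterion against Table 2 (the 3 mm air gap, 0,5 g/s variant): ΣK = 1·90 − 0,9·69 − 0,8·45 − 0,7·75 = −60,6, which matches the last row of the table.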

The total criterion reflects a change not only in the parameter of the main criterion
(change in the temperature of the coolant), but also takes into account changes in other
criteria (change in the temperatures of the remaining components), which plays an
important role in the optimization process in view of the level of their contribution to
the thermal state of the module and heating of the coolant in particular.

4 Conclusion

As a result of the developed method for designing solar photovoltaic thermal modules
the designer can develop solar cogeneration modules of various designs, which can
then be tested in the finite element analysis system using the developed method for
modeling and visualization of thermal processes.
As a result of optimization of the geometric parameters of the bilateral photore-
ceiver of a concentrator solar photovoltaic thermal module, a module design is pro-
posed that will allow obtaining high values of the overall efficiency of the solar module
when the photovoltaic converters do not overheat and the coolant heats up to high
values.
Solar photovoltaic thermal modules of this kind can serve as cogeneration modules for the simultaneous generation of electric and thermal energy for the consumer's own needs, and can operate both offline and in parallel with existing energy networks.

Acknowledgment. The research was carried out on the basis of financing the state assignments
of the All-Russian Institute of the Electrification of Agriculture and the Federal Scientific
Agroengineering Center VIM, on the basis of funding of the grant “Young lecturer of RUT” of
the Russian University of Transport, on the basis of funding of the Scholarship of the President of
the Russian Federation for young scientists and graduate students engaged in advanced research
and development in priority areas of modernization of the Russian economy, direction of
modernization: “Energy efficiency and energy saving, including the development of new fuels”,
subject of scientific research: “Development and research of solar photovoltaic thermal modules
of planar and concentrator structures for stationary and mobile power generation”.

References
1. Buonomano, A., Calise, F., Vicidomini, M.: Design, simulation and experimental investigation of a solar system based on PV panels and PVT collectors. Energies 9, 497 (2016)
2. Ibrahim, A., Othman, M.Y., Ruslan, M.H., Mat, S., Sopian, K.: Recent advances in flat plate photovoltaic/thermal (PV/T) solar collectors. Renew. Sustain. Energy Rev. 15, 352–365 (2011)
3. Kemmoku, Y., Araki, K., Oke, S.: Long-term performance estimation of a 500X concentrator photovoltaic system. In: 30th ISES Biennial Solar World Congress 2011, pp. 710–716 (2011)
4. Nesterenkov, P., Kharchenko, V.: Thermo physical principles of cogeneration technology with concentration of solar radiation. Adv. Intell. Syst. Comput. 866, 117–128 (2019). https://doi.org/10.1007/978-3-030-00979-3_12
5. Rawat, P., Debbarma, M., Mehrotra, S., et al.: Design, development and experimental investigation of solar photovoltaic/thermal (PV/T) water collector system. Int. J. Sci. Environ. Technol. 3(3), 1173–1183 (2014)
6. Sevela, P., Olesen, B.W.: Development and benefits of using PVT compared to PV. Sustain. Build. Technol. 4, 90–97 (2013)
7. Kharchenko, V., Nikitin, B., Tikhonov, P., Panchenko, V., Vasant, P.: Evaluation of the silicon solar cell modules. In: Intelligent Computing & Optimization. Advances in Intelligent Systems and Computing, vol. 866, pp. 328–336 (2019). https://doi.org/10.1007/978-3-030-00979-3_34
8. Panchenko, V.: Photovoltaic solar modules for autonomous heat and power supply. IOP Conf. Ser. Earth Environ. Sci. 317 (2019). 9 p. https://doi.org/10.1088/1755-1315/317/1/012002
9. Panchenko, V., Izmailov, A., Kharchenko, V., Lobachevskiy, Y.: Photovoltaic solar modules of different types and designs for energy supply. Int. J. Energy Optim. Eng. 9(2), 74–94 (2020). https://doi.org/10.4018/IJEOE.2020040106
10. Panchenko, V.A.: Solar roof panels for electric and thermal generation. Appl. Solar Energy 54(5), 350–353 (2018). https://doi.org/10.3103/S0003701X18050146
11. Kharchenko, V., Panchenko, V., Tikhonov, P., Vasant, P.: Cogenerative PV thermal modules of different design for autonomous heat and electricity supply. In: Handbook of Research on Renewable Energy and Electric Resources for Sustainable Rural Development, pp. 86–119 (2018). https://doi.org/10.4018/978-1-5225-3867-7.ch004
12. Sinitsyn, S., Panchenko, V., Kharchenko, V., Vasant, P.: Optimization of parquetting of the concentrator of photovoltaic thermal module. In: Intelligent Computing & Optimization. Advances in Intelligent Systems and Computing, vol. 1072, pp. 160–169 (2020). https://doi.org/10.1007/978-3-030-33585-4_16
13. Panchenko, V., Kharchenko, V., Vasant, P.: Modeling of solar photovoltaic thermal modules. In: Intelligent Computing & Optimization. Advances in Intelligent Systems and Computing, vol. 866, pp. 108–116 (2019). https://doi.org/10.1007/978-3-030-00979-3_11
14. Panchenko, V., Chirskiy, S., Kharchenko, V.V.: Application of the software system of finite element analysis for the simulation and design optimization of solar photovoltaic thermal modules. In: Handbook of Research on Smart Computing for Renewable Energy and Agro-Engineering, pp. 106–131 (2020). https://doi.org/10.4018/978-1-7998-1216-6.ch005
Formation of Surface of the Paraboloid Type
Concentrator of Solar Radiation
by the Method of Orthogonal Parquetting

Vladimir Panchenko¹,² and Sergey Sinitsyn¹

¹ Russian University of Transport, Obraztsova st. 9, 127994 Moscow, Russia
pancheska@mail.ru
² Federal Scientific Agroengineering Center VIM, 1st Institutskiy passage 5, 109428 Moscow, Russia

Abstract. The paper discusses the geometric aspects of the surface design of
solar radiation concentrators by adjusting its surface with single elements. The
proposed method of orthogonal parquetting allows optimizing the accuracy of
manufacturing a solar radiation concentrator and the smoothness of its working
profile. The geometric characteristics of paraboloid type concentrators are
considered. The article also discusses various methods of manufacturing solar
radiation concentrators, including the fan-surface parquetting method, which is
related to the orthogonal method of parquetting. The optimization of the number of
elements used in the developed method occurs either by the least number of
curved components of the concentrator or by the maximum number of flat
elements. As an example of the implementation of the developed method two
solar radiation concentrators designed for photovoltaic and photovoltaic thermal
photoreceivers are presented. The concentrators of solar radiation provide uniform illumination in the focal areas.

Keywords: Solar energy · Concentrator of solar radiation · Paraboloid · Optimization · Orthogonal parquetting · Concentrator solar module

1 Introduction

Solar power plants are the fastest-growing power stations based on renewable energy converters [1–5]. Most solar power stations consist of planar silicon photovoltaic modules that generate exclusively electrical energy. To increase the overall efficiency of solar modules, reduce their nominal power and shorten the payback period, photovoltaic thermal solar modules are relevant: along with electrical energy, they also generate thermal energy in the form of a heated coolant. Such photovoltaic thermal solar modules are divided according to the type of construction used: planar [6–9] and concentrator (with different concentrators of solar radiation) [10–12]. Concentrator photovoltaic thermal modules can work with bilateral photovoltaic converters, including silicon matrix high-voltage ones [4, 5, 12] (which reduces the number of photovoltaic converters used), and deliver a coolant of higher temperature at the output, which increases the overall efficiency of the solar installation. The concentrators of such installations therefore require more complex calculations, related both to the development of a profile for uniform illumination of the focal region where the photovoltaic converters are located and to methods for its manufacture according to geometrically specified requirements. Methods for calculating the thermal state of concentrator solar photovoltaic thermal modules are also necessary [13], since as the temperature of the photovoltaic converters increases, their electrical efficiency decreases [14], which implies the need for their effective cooling. The main type of concentrator for concentrator photovoltaic thermal installations is the paraboloid solar concentrator; methods for calculating its working profile and for its manufacture are therefore relevant and important subjects of research.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 84–94, 2021.
https://doi.org/10.1007/978-3-030-68154-8_9

2 Paraboloid Type Concentrators of the Solar Radiation

An ideal paraboloid concentrator of solar radiation (Fig. 1) focuses parallel sunrays to a point, which corresponds to an infinite degree of concentration [15]. This idealization does not allow evaluating the real capabilities of a concentrator of solar radiation, since the Sun has finite angular dimensions.

Fig. 1. Scheme of the formation of the focal region of the paraboloid concentrator of solar
radiation

Figure 1 shows the diagram of the formation of the focal spot of the paraboloid concentrator of solar radiation: an elementary sunray with an angular size of $2\varphi_0$ ($2 \times 0.004654$ rad) is reflected from the surface of the concentrator and falls on the focal plane, where the trace of this ray is an elementary ellipse with semi-axes $a = p \cdot \varphi_0 / ((1 + \cos\varphi) \cdot \cos\varphi)$ and $b = p \cdot \varphi_0 / (1 + \cos\varphi)$, where $p = 2f$ is the focal parameter of the parabola and $f$ is the focal length [15].

From different radial zones of the concentrator of solar radiation (with different angles $\varphi$), the ellipses have different sizes; overlapping each other, they form the density distribution of the focal radiation. An approximate estimate of the maximum radiation density in the focus is given by the formula [15]:

$$E_F = \rho \cdot \frac{1}{\varphi_0^2} \cdot \sin^2\varphi_m \cdot E_0, \qquad (1)$$

where $\rho$ is the reflection coefficient of the concentrator of solar radiation; $\varphi_0$ is the opening angle of the elementary sunray; $\varphi_m$ is the largest opening angle of the paraboloid; $E_0$ is the density of the solar radiation.
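As a quick numerical illustration of expression (1), the following sketch (in Python) estimates the maximum focal radiation density; the values of $\rho$, $\varphi_m$ and $E_0$ are assumptions chosen for illustration, not data from the paper:

```python
# Sketch: maximum radiation density in the focus, expression (1).
import math

rho = 0.85                  # assumed reflection coefficient of the concentrator
phi_0 = 0.004654            # angular radius of the Sun, rad (value used in the paper)
phi_m = math.radians(45.0)  # assumed largest opening angle of the paraboloid
E_0 = 1000.0                # assumed density of direct solar radiation, W/m^2

E_F = rho * (1.0 / phi_0 ** 2) * math.sin(phi_m) ** 2 * E_0
print(f"E_F = {E_F:.2e} W/m^2")  # ~2e7 W/m^2, i.e. a concentration of ~2e4
```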
The imperfection of the reflecting surface of the concentrator of solar radiation leads to blurring of the focal spot due to the mismatch of the centers of the elementary ellipses according to a random law. The focal spot illumination is best described by the normal Gaussian distribution curve [15]:

$$E_T = E_{max} \cdot e^{-c r^2}, \qquad (2)$$
$$E_{max} = E_0 \cdot \rho \cdot \left(\frac{180}{\pi}\right)^2 \cdot \frac{\sin^2\varphi}{h^2}, \qquad (3)$$

$$c = 3.283 \cdot 10^3 \cdot \frac{(1 + \cos\varphi)^2}{(h \cdot p)^2}, \qquad (4)$$

where $r$ is the radius in the focal plane and $h$ is the measure of the accuracy of the concentrator of solar radiation.
In the calculations, the focal length $f$ and the half-opening angle $\varphi_m$ are considered known; using the representation of the parabola equation in the polar coordinate system, the diameter of the concentrator of solar radiation is obtained [15]:

$$D = \frac{4 \cdot f \cdot \sin\varphi_m}{1 + \cos\varphi_m}. \qquad (5)$$

The optimal value of $\varphi_m$, at which the average concentration ratio of solar radiation is maximal, equals 45° at an achieved concentration $K_{max}$ of 11300 [16], and the concentration of solar radiation at the focus of an ideal concentrator takes the form [17]:

$$E_f^{id} = R_s \cdot \frac{1.2}{\varphi_0^2} \cdot \sin^2\varphi_m \cdot E_0, \qquad (6)$$

where Rs – the integral transmittance of the system; u0 – the angular radius of the Sun
equal to 0,004654 rad.; E0 – the density of direct solar radiation.
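A short sketch of expressions (5) and (6) under assumed inputs may clarify the geometry; the focal length matches the 500 mm concentrators of Fig. 2, while $R_s$ and $E_0$ are assumed values:

```python
# Sketch: aperture diameter (5) and ideal focal illuminance (6).
import math

f = 0.5                     # focal length, m (as for the concentrators of Fig. 2)
phi_m = math.radians(45.0)  # optimal half-opening angle from [16]
phi_0 = 0.004654            # angular radius of the Sun, rad
R_s = 0.85                  # assumed integral transmittance of the system
E_0 = 1000.0                # assumed density of direct solar radiation, W/m^2

D = 4.0 * f * math.sin(phi_m) / (1.0 + math.cos(phi_m))         # expression (5)
E_f_id = R_s * (1.2 / phi_0 ** 2) * math.sin(phi_m) ** 2 * E_0  # expression (6)
print(f"D = {D:.3f} m, E_f_id = {E_f_id:.2e} W/m^2")
```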
The curves of the distribution of the concentration of solar radiation in the focal plane for concentrators of solar radiation of different profiles (spherical, quasiparabolic and parabolotoric) with the same diameters and focal lengths (500 mm each) are shown in Fig. 2 [18].

Fig. 2. Distribution of concentration of solar radiation in the focal plane of spherical, quasiparabolic and parabolotoric concentrators of solar radiation

Solar radiation concentrators whose concentration distributions are shown in Fig. 2 can be successfully used together with photoreceivers of solar radiation; however, for solar photovoltaic converters a uniform distribution of the concentrated solar radiation over the focal region is necessary. Therefore, when designing the working profile of paraboloid type concentrators, considerable attention must be paid to the distribution of illumination in the focal region of the concentrator.
For the manufacture of paraboloid type solar radiation concentrators, the following methods are mainly used [15]: the centrifugal method; electroforming; glass bending; and the assembly of a reflective surface from flat curved mirrors. Along with the methods considered, parabolic concentrators of low manufacturing accuracy for cooking and heating water are produced in southern, low-income countries; they are presented in Fig. 3 [19–21]. Of particular interest is the fan-shaped method of parquetting the surface of a paraboloid type concentrator of solar radiation [22], with the help of which the concentrators shown on the right of Fig. 3 are manufactured.

Fig. 3. Paraboloid type solar radiation concentrators, the base of which is made of reed, guide
ribs, satellite dishes and using the fan-shaped method of parqueting of surface

The term “parquetting” refers to the task of approximating surfaces, which simplifies the technological process of their production while respecting the differential-geometric characteristics of the finished products. The task of selecting the shape and dimensions of the parquet elements of a shell is solved in two ways: by dividing the structure into small flat elements or by dividing the shell into elements of curvilinear outline. Parquet elements are taken to be figures whose plan projections are bounded by straight lines.

3 Method of the Orthogonal Parquetting of Surface of the Paraboloid Type Concentrator of the Solar Radiation

As the initial data for parquetting, the plan projection of the surface of the paraboloid is given (Fig. 4). The method of solution is based on partitioning the projection into elements and subsequently finding on the surface the third coordinates corresponding to the plan partition.

Fig. 4. The scheme of orthogonal parqueting of the surface of the paraboloid of the concentrator

According to the accepted classification of parquet types, the following are adopted:
a) the plan projection of the elements in the form of closed n-gons, possibly including circular arcs;
b) the equidistant arrangement of the internal and external surfaces [22].

The solution takes into account that the mathematical model of the inner surface is given by the equation $x^2 + y^2 = 2 \cdot p \cdot z$.
The surface of the paraboloid S is dissected by one-parameter families of projecting planes (Fig. 5). In this case, the plan projection is divided into a network whose cell shape depends on the relative position of the projecting planes. It is most advisable to split the surface into elements whose plan shapes are rectangles or arbitrary quadrangles.

Fig. 5. The sectional diagram of the surface of the paraboloid by families of projecting planes

The cell parameters of the surface partition network are determined taking into account the accuracy of the surface approximation ($\Delta_n \le \bar{\Delta}_n$), where the numerical value of the limiting error parameter is $\bar{\Delta}_n = 0.15\%$.
If the surface of the paraboloid is represented by an approximating set of elements $S_i$ having the shape of equal squares in plan (Fig. 4), then their corners on the surface S correspond to a set of skeleton points, approximately taken as regular.
To determine the side of the square unit cell of the partition, a square with side b is inscribed in the circle of the trace of the paraboloid on the plane X0Y (Fig. 4).
The parameter b is calculated from the area of the circle of radius r:

$$b = 1.4 \cdot \sqrt{\frac{S_{cir}}{\pi}}. \qquad (7)$$

The constructed square is the plan trace of the corresponding part of the paraboloid surface.
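A minimal sketch of expression (7), with an assumed trace radius r, shows that the coefficient 1.4 ≈ √2 simply recovers the side of the square inscribed in the circle:

```python
# Sketch: side b of the square inscribed in the circular plan trace, expression (7).
import math

r = 1.0                        # assumed radius of the plan-trace circle, m
S_cir = math.pi * r ** 2       # area of the circle of radius r
b = 1.4 * math.sqrt(S_cir / math.pi)
print(f"b = {b:.2f} m")        # ~1.4 m, i.e. r * sqrt(2) for the inscribed square
```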

Based on the relation:

$$\bar{\Delta}_n = 4\sqrt{\exp\left\{2\left[\ln S - \ln\left((1 + N_{sm})^{N_{sm}} \cdot (\sqrt{N} - 1)\right)\right] \cdot C_{N_{sm}}\right\} + 2\Delta_a^2}. \qquad (8)$$

For the regular point skeleton, the dimension parameter N is calculated:

$$N = f(S, N_{sm}, \bar{\Delta}_n, C_{N_{sm}}) \quad \text{at } \Delta_a = 0, \qquad (9)$$

where $N_{sm} = 1$ is the smoothness order of the approximating contour of the surface composed of parquet elements; $\bar{\Delta}_n = 0.15\%$; $C_{N_{sm}}$ is a statistical parameter.
A point frame of dimension N = 680 is placed in the grid nodes of a square with side b, including segments of its sides, so the total number of unit cells of the partition is determined by the formula:

$$N_1 = (\sqrt{N} - 1)^2 = 625. \qquad (10)$$

So, for example, for an area S = 250 m² of the piece of the paraboloid surface having a square trace on the plane X0Y, one cell has:

$$S_i = \frac{S}{N_1} = \frac{250}{625} = 0.4 \; (\text{m}^2);$$

therefore, the square cell of the partition of the plan projection has a side $l_i$ equal to:

$$l_i = \sqrt{S_i} = \sqrt{0.4} \approx 0.63 \; (\text{m}).$$

Taking the adopted approximations into account, we finally obtain:

$$\bar{l}_i = 0.9 \cdot l_i \approx 0.57 \; (\text{m}).$$
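The worked example above can be reproduced with the following sketch; treating the 680-point frame as an approximately regular 26 × 26 node grid is an interpretation of the text, not a statement from it:

```python
# Sketch: unit-cell count (10) and cell sides for S = 250 m^2, N = 680.
import math

S = 250.0                              # area of the square plan trace, m^2
N = 680                                # dimension of the point frame
nodes_per_side = round(math.sqrt(N))   # ~26 nodes per grid side (interpretation)
N_1 = (nodes_per_side - 1) ** 2        # expression (10): 625 unit cells

S_i = S / N_1                          # area of one plan cell: 0.4 m^2
l_i = math.sqrt(S_i)                   # side of the plan cell: ~0.63 m
l_i_bar = 0.9 * l_i                    # corrected side: ~0.57 m
print(N_1, round(S_i, 2), round(l_i, 2), round(l_i_bar, 2))  # 625 0.4 0.63 0.57
```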

So, the surface of the paraboloid is dissected by a family of horizontally projecting planes $k_i$ ($i = 1, 2, \ldots, k+1$) perpendicular to a straight line a selected on the plane X0Y, with the distance between the planes $\bar{l}_i = 0.57$ m. The line a is defined on the plane by two parameters: the angle $\alpha$ and the distance R from the origin of coordinates.
The set of elements located between adjacent planes is conventionally called a row or a belt. Of further interest are the cells that partially or fully cover the surface projection. Technological information about the nodal points of all elements of the parquet, including those located outside the “square b”, is presented in the form of a matrix:

$$\|S\| = \begin{Vmatrix} y_1^{min}, & y_1^{max}, & XN_{1,1}, & XK_{1,1}, & \ldots, & XN_{1,K_1}, & XK_{1,K_1} \\ y_2^{min}, & y_2^{max}, & XN_{2,1}, & XK_{2,1}, & \ldots, & XN_{2,K_2}, & XK_{2,K_2} \\ y_n^{min}, & y_n^{max}, & XN_{n,1}, & XK_{n,1}, & \ldots, & XN_{n,K_n}, & XK_{n,K_n} \end{Vmatrix}, \qquad (11)$$

where $K_j$ is the number of elements in the j-th row (Fig. 4); n is the number of rows; XN and XK denote the $X_1$-coordinates of the boundaries of the plan projection of a parquet element.
So, for example, the selected element of the plan projection $S_i$ (Fig. 4) is located in the i-th row with borders $(y_i^{min}, y_i^{max})$ and in the j-th column with boundary $X_1$-coordinates $XN_{j,i}$ and $XK_{j,i}$.
Given the equal width of all belts, $y_{j+1} = y_j + \bar{l}_i$, and neglecting the gaps between the elements of the parquet, matrix (11) is simplified and takes the form:

$$\|S\| = \begin{Vmatrix} y_1, & XN_{1,1}, & XN_{1,2}, & \ldots, & XK_{1,K_1} \\ y_2, & XN_{2,1}, & XN_{2,2}, & \ldots, & XK_{2,K_2} \\ y_n, & XN_{n,1}, & XN_{n,2}, & \ldots, & XK_{n,K_n} \end{Vmatrix}. \qquad (12)$$

Based on the approximated surface model $x^2 + y^2 = 2 \cdot p \cdot z$, the third coordinates of the corner points of the outer surface of the parquet elements are calculated. On their basis, a complete information model of the parquet is formed, which is used in the preparation of control programs for the technological cycle of manufacturing the parquet elements.
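The following sketch illustrates the data flow just described: a simplified belt matrix in the spirit of (12) (the belt layout itself is purely illustrative) and the restoration of the third coordinate from the surface model x² + y² = 2pz:

```python
# Sketch: corner z-coordinates of parquet elements on the paraboloid x^2 + y^2 = 2*p*z.
p = 2.0     # assumed focal parameter of the paraboloid, m
l_i = 0.57  # belt width from the worked example, m

def z_on_paraboloid(x: float, y: float) -> float:
    """Third coordinate of a plan-projection point on the surface."""
    return (x * x + y * y) / (2.0 * p)

# Simplified matrix in the spirit of (12): each row is the belt ordinate y
# followed by (XN, XK) boundary pairs of the elements along X1.
parquet = [
    (0.00, [(-0.57, 0.00), (0.00, 0.57)]),
    (0.57, [(-0.57, 0.00), (0.00, 0.57)]),
]

# Corner coordinates form the information model a control program would use.
for y, elements in parquet:
    for xn, xk in elements:
        corners = [(xn, y), (xk, y), (xn, y + l_i), (xk, y + l_i)]
        print([(x, yy, round(z_on_paraboloid(x, yy), 4)) for x, yy in corners])
```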

4 Paraboloid Type Concentrators of the Solar Radiation for Various Photoreceivers

The developed method of calculating the surface of a paraboloid profile by orthogonal parquetting is applicable to the manufacture of the working surface of paraboloid type solar radiation concentrators. Such concentrators are used in solar modules where photovoltaic converters (including high-voltage matrix ones), thermal photoreceivers and combined photovoltaic thermal photoreceivers can serve as receivers of the concentrated solar radiation. In concentrator solar photovoltaic thermal modules, along with electrical energy, the consumer receives thermal energy in the form of a heated coolant at the output.
The profiles of such paraboloid type concentrators, providing uniform illumination in the focal region on the surface of the photoreceivers, are presented on the left of Fig. 6. The upper concentrator is intended for photovoltaic converters located on the concentrator itself, which serves as an air-cooled radiator for them. The lower profile in Fig. 6 provides uniform illumination in the focal region of a photovoltaic thermal cylindrical photoreceiver, whose side surface converts the concentrated solar radiation into electrical energy while the upper end surface converts it into thermal energy.

Fig. 6. Profiles of working surfaces of paraboloid type solar radiation concentrators (on the left),
partition of their surfaces into unit elements (in the middle) and individual elements of solar
radiation concentrators themselves (on the right)

Concentrators of solar radiation can be made of reflective metal sheet. The partition of the working surface of the concentrators into individual elements by projecting planes can be performed with cells with a side of 20 mm (Fig. 6 in the middle), or with other sizes depending on the manufacturing requirements, as a result of which the unit cells of the surfaces of the concentrators are formed (Fig. 6 on the right).

5 Conclusion

Thus, it should be noted that the planned development of solar plants will proceed at a steadily increasing pace, and the share of hybrid photovoltaic thermal installations will grow, which will significantly increase the need for the design, manufacture and study of solar radiation concentrators, in particular of the paraboloid type.
The presented method of orthogonal parquetting of the surface of paraboloid type solar concentrators allows optimizing the manufacturing process of concentrators already at the design stage and makes it possible to obtain almost any working profile of paraboloid type concentrators. The orthogonal scheme of parquetting surfaces of rotation, in particular a paraboloid, is simple in terms of data preparation. The presented method also helps to control and set the manufacturing error at the design stage of such parabolic concentrators in order to ensure the expected characteristics of the focal region. The developed method of parquetting surfaces allows designing and manufacturing surfaces of rotation, including solar radiation concentrators, with specified differential-geometric requirements, which ensures the expected distribution of the solar radiation illumination in the focal area. Such concentrators provide uniform illumination in the focal region of various types of photoreceivers (photovoltaic, thermal, photovoltaic thermal), which positively affects the efficiency of solar modules and their specific power.

Acknowledgment. The research was carried out on the basis of financing the state assignments
of the All-Russian Institute of the Electrification of Agriculture and the Federal Scientific
Agroengineering Center VIM, on the basis of funding of the grant “Young lecturer of RUT” of
the Russian University of Transport, on the basis of funding of the Scholarship of the President of
the Russian Federation for young scientists and graduate students engaged in advanced research
and development in priority areas of modernization of the Russian economy, direction of
modernization: “Energy efficiency and energy saving, including the development of new fuels”,
subject of scientific research: “Development and research of solar photovoltaic thermal modules
of planar and concentrator structures for stationary and mobile power generation”.

References
1. Ibrahim, A., Othman, M.Y., Ruslan, M.H., Mat, S., Sopian, K.: Recent advances in flat plate
photovoltaic/thermal (PV/T) solar collectors. Renew. Sustain. Energy Rev. 15, 352–365
(2011)
2. Nesterenkov, P., Kharchenko, V.: Thermo physical principles of cogeneration technology
with concentration of solar radiation. Adv. Intell. Syst. Comput. 866, 117–128 (2019).
https://doi.org/10.1007/978-3-030-00979-3_12
3. Buonomano, A., Calise, F., Vicidimini, M.: Design, simulation and experimental
investigation of a solar system based on PV panels and PVT collectors. Energies 9, 497
(2016)
4. Panchenko, V.: Photovoltaic solar modules for autonomous heat and power supply. IOP Conf. Ser. Earth Environ. Sci. 317, 012002 (2019). https://doi.org/10.1088/1755-1315/317/1/012002
5. Panchenko, V., Izmailov, A., Kharchenko, V., Lobachevskiy, Y.: Photovoltaic solar modules of different types and designs for energy supply. Int. J. Energy Optim. Eng. 9(2), 74–94 (2020). https://doi.org/10.4018/IJEOE.2020040106
6. Rawat, P., Debbarma, M., Mehrotra, S., et al.: Design, development and experimental
investigation of solar photovoltaic/thermal (PV/T) water collector system. Int. J. Sci.
Environ. Technol. 3(3), 1173–1183 (2014)
7. Sevela, P., Olesen, B.W.: Development and benefits of using PVT compared to PV. Sustain.
Build. Technol. 90–97 (2013)
8. Kharchenko, V., Nikitin, B., Tikhonov, P., Gusarov, V.: Investigation of experimental flat
PV thermal module parameters in natural conditions. In: Proceedings of 5th International
Conference TAE 2013, pp. 309–313 (2013)
9. Hosseini, R., Hosseini, N., Khorasanizadeh, H.: An experimental study of combining a photovoltaic system with a heating system. In: World Renewable Energy Congress, pp. 2993–3000 (2011)
10. Kemmoku, Y., Araki, K., Oke, S.: Long-term performance estimation of a 500X
concentrator photovoltaic system. In: 30th ISES Biennial Solar World Congress 2011,
pp. 710–716 (2011)
11. Rosell, J.I., Vallverdu, X., Lechon, M.A., Ibanez, M.: Design and simulation of a low
concentrating photovoltaic/thermal system. Energy Convers. Manage. 46, 3034–3046 (2005)
12. Kharchenko, V., Panchenko, V., Tikhonov, P.V., Vasant, P.: Cogenerative PV thermal
modules of different design for autonomous heat and electricity supply. In: Handbook of
Research on Renewable Energy and Electric Resources for Sustainable Rural Development,
pp. 86–119 (2018). https://doi.org/10.4018/978-1-5225-3867-7.ch004

13. Panchenko, V., Kharchenko, V., Vasant, P.: Modeling of solar photovoltaic thermal
modules. In: Intelligent Computing & Optimization. Advances in Intelligent Systems and
Computing, vol. 866, pp. 108–116 (2019). https://doi.org/10.1007/978-3-030-00979-3_11
14. Kharchenko, V., Nikitin, B., Tikhonov, P., Panchenko V., Vasant, P.: Evaluation of the
silicon solar cell modules. In: Intelligent Computing & Optimization. Advances in Intelligent
Systems and Computing, vol. 866, pp. 328–336 (2019). https://doi.org/10.1007/978-3-030-
00979-3_34
15. Strebkov, D.S., Tveryanovich, E.V.: Koncentratory solnechnogo izlucheniya (Concentrators
of the Solar Radiation), Moskva, GNU VIESKH [Moscow, GNU VIESH], 12–30 (2007). (in
Russian)
16. Andreev, V.M., Grilikhes, V.A., Rumyantsev, V.D.: Fotoehlektricheskoe preobrazovanie
koncentrirovannogo solnechnogo izlucheniya (Photoelectric conversion of concentrated
solar radiation), Leningrad, Nauka (Leningrad, Science) (1989). 310 p. (in Russian)
17. Zahidov, R.A., Umarov, G.Y., Weiner, A.A.: Teoriya i raschyot geliotekhnicheskih
koncentriruyushchih system (Theory and calculation of solar concentrating systems),
Tashkent, FAN (Tashkent, Science) (1977). 144 p. (in Russian)
18. Alimov, A.K., Alavutdinov, J.N., et al.: Opyt sozdaniya koncentratorov dlya modul'nyh
fotoehlektricheskih ustanovok (Experience in creating concentrators for modular photo-
voltaic plants), Koncentratory solnechnogo izlucheniya dlya fotoehlektricheskih ustanovok
(Concentrators of solar radiation for photovoltaic plants) 17–18 (1986). (in Russian)
19. Hassen, A.A., Amibe, D.A.: Design, manufacture and experimental investigation of low cost
parabolic solar cooker. ISES Solar World Congress 28 August–2 September 2011 (2011).
12 p. https://doi.org/10.18086/swc.2011.19.16
20. Chandak, A., Somani, S., Chandak, A.: Development Prince-40 solar concentrator as do it
yourself (DIY) kit. In: ISES Solar World Congress August–2 Sept. 2011 (2011). 8 p. https://
doi.org/10.18086/swc.2011.23.02
21. Diz-Bugarin, J.: Design and construction of a low cost offset parabolic solar concentrator for
solar cooking in rural areas. In: ISES Solar World Congress August–2 Septe 2011 (2011).
8 p. https://doi.org/10.18086/swc.2011.30.05
22. Sinitsyn, S., Panchenko, V., Kharchenko, V., Vasant, P.: Optimization of parquetting of the
concentrator of photovoltaic thermal module. In: Intelligent Computing & Optimization.
Advances in Intelligent Systems and Computing, vol. 1072, pp. 160–169 (2020). https://doi.
org/10.1007/978-3-030-33585-4_16
Determination of the Efficiency of Photovoltaic Converters Adequate to Solar Radiation by Using Their Spectral Characteristics

Valeriy Kharchenko1, Boris Nikitin1, Vladimir Panchenko2,1(✉), Shavkat Klychev3, and Baba Babaev4

1 Federal Scientific Agroengineering Center VIM, 1st Institutskiy Passage 5, 109428 Moscow, Russia
kharval@mail.ru
2 Russian University of Transport, Obraztsova st. 9, 127994 Moscow, Russia
pancheska@mail.ru
3 Scientific and Technical Center with a Design Bureau and Pilot Production, Academy of Sciences of the Republic of Uzbekistan, 100125 Tashkent, Uzbekistan
klichevsh@list.ru
4 Dagestan State University, Gadzhieva str., 43-a, 367000 Makhachkala, Russia
bdbabaev@yandex.ru

Abstract. The paper presents experimental studies of silicon photovoltaic converters of solar radiation. An experimental evaluation of the photoconversion efficiency is considered, the spectral dependences of the current photoresponses of the photoelectric converter are determined, and the spectral densities of the current photoresponses of various photovoltaic converters under an artificial light source are compared. Based on the results obtained, it becomes possible to determine the efficiency of a photovoltaic converter for a given wavelength of monochromatic radiation, as well as to calculate its efficiency for a given radiation spectrum.

Keywords: Solar energy · Efficiency · Photovoltaic converters · Spectral density · Wavelength · Spectrum of solar radiation · Monochromatic radiation

1 Introduction

The experimental assessment of the spectral values of the efficiency of photovoltaic converters is a very urgent task. From the nature of this dependence, it is possible to obtain information on the corresponding efficiency values of the studied converter for particular wavelengths of monochromatic radiation, and also to calculate the real value of the photoconversion efficiency for a predetermined radiation spectrum, including the standard solar radiation AM 1.5 [1–11].
The solution of the posed tasks is possible by comparing the spectral current photoresponses (taken using specially selected light filters) of the reference and studied photoconverters. A reference photoconverter is a semiconductor converter calibrated over the photoactive part of the spectrum for the given standard solar radiation, in the form of the spectral density of the short-circuit currents of the converter.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 95–102, 2021.
https://doi.org/10.1007/978-3-030-68154-8_10

2 Comparison of the Spectral Densities of Current Photoresponses of a Reference Silicon Photovoltaic Converter Under Various Exposure Conditions

A prerequisite for the planned analysis is the adjustment of the equipment used to ensure the equality of the integral (without light filters) short-circuit currents of the reference converter under standard solar radiation and under the laboratory light source. A comparison of the spectral (with light filters) current photoresponses of the reference converter under the two compared light sources gives an idea of the spectrally averaged (over the bandwidth of the selected filter) power of the monochromatic line of the used laboratory light source. Figure 1 shows the spectral dependences of the current photoresponses of the reference silicon photoconverter when illuminated with standard solar radiation AM 1.5 (according to the rating data for calibration of this converter) and when illuminated by the laboratory light source (halogen incandescent lamp).

Fig. 1. Comparison of the spectral densities of current photoresponses of the reference silicon
photovoltaic converter when exposed to standard solar radiation AM 1.5 (sunlight exposure) and
a laboratory light source (halogen lamp with a water filter h = 42 mm.) for equal integral short
circuit currents

From the analysis of these dependences it follows that the spectrum of the laboratory light source is depleted in the intensity of light fluxes in its short-wave part, but has increased intensity in the long-wave part. The areas bounded by these curves are equal, since the integral short-circuit currents are equal.

3 Averaged Spectral Densities of Power of the Photoactive Part of the Spectrum for Silicon

Figure 2 presents the averaged spectral densities of power of the light bands (50 nm wide, in accordance with the calibration of the reference converter) of the photoactive part (for Si) of the standard solar radiation AM 1.5 spectrum, together with the numerical values of the light flux power of each band. Figure 2 thus shows the values $\Delta P_{0\lambda i\,AM1.5}$ of the standard solar radiation AM 1.5 obtained by dividing the photoactive part of its spectrum for silicon into bands 50 nm wide.

Fig. 2. Averaged spectral densities of power of the bandwidths of 50 nm wide of the standard
solar radiation AM 1.5 of the photoactive part of the spectrum for silicon (numerical spectral
densities of power of the light bands are shown above the graph and have a dimension of W/m2)

The data in Fig. 1 and Fig. 2 make it possible to determine the power of the light
bands of the used laboratory light source for each selected wavelength.

4 Comparison of the Spectral Densities of Current Photoresponses of the Silicon Photovoltaic Converters

Figure 3 shows the experimentally recorded spectral dependences of the current pho-
toresponses of two different photoconverters (reference photoconverter and the studied
one) at the same level and illumination spectrum from a laboratory light source. From
the analysis of the figure it follows that, due to the individual design features and the
technology of manufacturing photoconverters, these dependencies are not identical.

Comparison of the photoresponses of the converters makes it possible to determine, for the studied photoconverter sample, the current photoresponse per unit power of the light flux at a given wavelength of the spectrum.

Fig. 3. Comparison of the spectral densities of the current photoresponses of the silicon
reference and investigated photoconverters under exposure from a laboratory light source

The volt-ampere characteristic of the investigated converter, taken by the traditional method under illumination from the used laboratory light source (which naturally differs in spectrum from standard solar radiation), contains important data for further calculations:
1) The integral short-circuit current of the converter $I_0$, which can be represented as the sum of the contributions of all monochromatic lines of the used laboratory light source. The same can be said of the standard solar radiation, since the short-circuit currents are equal in accordance with the above.
2) The maximum power of the photoelectric converter, denoted $P_{max}$.

5 Dependence of the Efficiency of the Silicon Photovoltaic Converter Versus the Wavelength

According to the well-known definition, the efficiency of a photovoltaic converter is equal to the ratio of the useful power to the total power of the incident light flux. In our case, the useful power of the photoconverter should be taken as the maximum power extracted from the converter according to its volt-ampere characteristic.

It should be noted that the character (form) of the volt-ampere characteristic of the converter should not change if the light fluxes of some lines are mentally replaced by others equivalent in current photoresponse; consequently, the position of the operating point on the volt-ampere characteristic will not change either. This is due to the fact that mobile charge carriers (primarily electrons) with kinetic energy obtained by absorption of high-energy photons thermalize almost instantly, according to the thesis described in [3]. This circumstance makes the mobile charge carriers indistinguishable by the history of their origin.
In accordance with the foregoing, the power of monochromatic radiation with wavelength $\lambda_i$, generating the total short-circuit current $I_0$, is determined by the expression:

$$P_{\lambda i} = \Delta P_{0\lambda i\,AM1.5} \cdot S_{PVC} \cdot \frac{i_{0\lambda i\,lab.ref.}}{i_{0\lambda i\,AM1.5cal.}} \cdot \frac{I_0}{i_{0\lambda i\,lab.PVC} \cdot \Delta\lambda_{cal.}}, \qquad (1)$$

where $\Delta P_{0\lambda i\,AM1.5}$ is the density of power of the light fluxes of standard solar radiation AM 1.5 at wavelength $\lambda_i$ in a band 50 nm wide;
$S_{PVC}$ is the area of the investigated photovoltaic converter;
$i_{0\lambda i\,lab.ref.}$ is the spectral density of the short-circuit current of the reference photovoltaic converter under the laboratory light source at wavelength $\lambda_i$;
$i_{0\lambda i\,AM1.5cal.}$ is the spectral density of the short-circuit current of the reference converter under solar radiation AM 1.5 at wavelength $\lambda_i$;
$I_0$ is the short-circuit current of the investigated photovoltaic converter according to its volt-ampere characteristic;
$i_{0\lambda i\,lab.PVC}$ is the spectral density of the short-circuit current of the investigated photovoltaic converter under illumination from the laboratory light source at wavelength $\lambda_i$;
$\Delta\lambda_{cal.}$ is the calibration step of the reference photovoltaic converter (usually 50 nm).
The spectral values of the coefficient of performance of the investigated photovoltaic converter are determined according to the expression:

$$COP_{\lambda i} = \frac{P_{max}^{PVC}}{P_{\lambda i}}, \qquad (2)$$

where $P_{max}^{PVC}$ is the maximum power of the photovoltaic converter, calculated from its volt-ampere characteristic taken under illumination from the laboratory light source tuned to the standard solar radiation by the integral value of the short-circuit current $I_0$ using the reference photovoltaic converter; $P_{\lambda i}$ is the power of monochromatic radiation with wavelength $\lambda_i$ that causes the same short-circuit current $I_0$ of the investigated photovoltaic converter as on the current-voltage characteristic. The value of this power is defined by expression (1).
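A minimal numerical sketch of expressions (1) and (2), assuming the reconstructed form of (1) above, is given below; every number is a placeholder rather than measured data from the paper:

```python
# Sketch: power of an equivalent monochromatic flux (1) and spectral COP (2).
dP0_AM15 = 60.0    # band power density of AM 1.5 at lambda_i, W/m^2 (assumed)
S_PVC = 0.01       # area of the investigated converter, m^2 (assumed)
i_ref_lab = 0.8    # spectral SC-current density, reference PVC, lab source (assumed)
i_ref_AM15 = 1.0   # same reference PVC under AM 1.5 calibration (assumed)
I_0 = 0.35         # integral SC current of the investigated PVC, A (assumed)
i_pvc_lab = 0.009  # spectral SC-current density, investigated PVC, A/nm (assumed)
d_lambda = 50.0    # calibration step, nm
P_max = 0.10       # maximum power from the I-V curve, W (assumed)

# Expression (1) as reconstructed above:
P_li = dP0_AM15 * S_PVC * (i_ref_lab / i_ref_AM15) * I_0 / (i_pvc_lab * d_lambda)
COP_li = P_max / P_li  # expression (2)
print(f"P_li = {P_li:.3f} W, COP_li = {COP_li:.1%}")
```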

Figure 4 shows the dependence of the spectral efficiency of the investigated serial silicon photovoltaic converter [12, 13] on wavelength. From the analysis of the obtained dependence it follows that $COP_{\lambda i}$ gradually increases with wavelength from zero (at $\lambda_i$ = 0.4 μm) to 40% at $\lambda_i$ = 0.95 μm. An almost zero value of $COP_{\lambda i}$ is observed when photons of the corresponding (short) wavelength are completely absorbed by the doped layer of the converter without separation of the electron-hole pairs by the p-n junction.

Fig. 4. The characteristic form of the spectral values of the efficiency of silicon photovoltaic
converters

Based on the experimentally recorded spectral dependence of the efficiency of the photoconverter, it is possible to determine the generalized efficiency of the photovoltaic converter for a predetermined emission spectrum. The generalized efficiency can be represented as the sum of the partial contributions of $COP_{\lambda i}$ for each band of the given spectrum, taking into account the fraction of these bands in the power of the total radiation flux. The generalized efficiency of the photoconverter, adequate to a given radiation spectrum, is described by the expression:

$$COP_{gen} = \sum_i K_{\lambda i} \cdot COP_{\lambda i}, \qquad (3)$$

where $COP_{\lambda i}$ are the spectral efficiency values of the investigated photoconverter obtained according to expression (2); $K_{\lambda i}$ is the fraction of the luminous flux power at wavelength $\lambda_i$ in the total luminous flux power of the given spectrum.
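A brief sketch of expression (3) with placeholder band fractions and spectral efficiencies:

```python
# Sketch: generalized efficiency (3) as a weighted sum over spectrum bands.
K = [0.10, 0.20, 0.30, 0.25, 0.15]    # band fractions K_li of total flux power (assumed)
COP = [0.05, 0.15, 0.25, 0.33, 0.40]  # spectral efficiencies COP_li (assumed)

assert abs(sum(K) - 1.0) < 1e-9       # the fractions must cover the whole spectrum
COP_gen = sum(k * c for k, c in zip(K, COP))
print(f"COP_gen = {COP_gen:.1%}")     # ~25% for these placeholder values
```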

The efficiency of the photoconverter adequate to the standard solar radiation AM 1.5 (1000 W/m²), calculated according to expression (3), is presented in Fig. 4 as a “big dot” with a value of 13%, located on the spectral curve of the efficiency versus wavelength.

6 Conclusion

Thanks to the considered method, it becomes possible to experimentally estimate the spectral values of the efficiency of photovoltaic converters. Using the considered provisions, one can obtain information on the corresponding efficiency values of photovoltaic converters at specific wavelengths of monochromatic radiation. It is also possible to calculate the efficiency of a photovoltaic converter for a given emission spectrum.

Acknowledgment. The research was carried out on the basis of financing the state assignments
of the All-Russian Institute of the Electrification of Agriculture and the Federal Scientific
Agroengineering Center VIM, on the basis of funding of the grant “Young lecturer of RUT” of
the Russian University of Transport, on the basis of funding of the Scholarship of the President of
the Russian Federation for young scientists and graduate students engaged in advanced research
and development in priority areas of modernization of the Russian economy, direction of
modernization: “Energy efficiency and energy saving, including the development of new fuels”,
subject of scientific research: “Development and research of solar photovoltaic thermal modules
of planar and concentrator structures for stationary and mobile power generation”.

References
1. Bird, R.E., Hulstrom, R.L., Lewis, L.J.: Terrestrial solar spectral data sets. Sol. Energy 30(6), 563–573 (1983)
2. Kharchenko, V., Nikitin, B., Tikhonov, P., Panchenko, V., Vasant, P.: Evaluation of the
silicon solar cell modules. In: Intelligent Computing & Optimization. Advances in Intelligent
Systems and Computing, vol. 866, pp. 328–336 (2019). https://doi.org/10.1007/978-3-030-
00979-3_34
3. Arbuzov, Y.D., Yevdokimov, V.M.: Osnovy fotoelektrichestva (Fundamentals of Photo-
electricity), Moscva, GNU VIESKH (Moscow, GNU VIESH) (2007). 292 p. (in Russian)
4. Kharchenko, V.V., Nikitin, B.A., Tikhonov, P.V.: Theoretical method of estimation and
prediction of PV cells parameters. Int. Sci. J. Alt. Energy Ecol. 4(108), 74–78 (2012)
5. Kharchenko, V.V., Nikitin, B.A., Tikhonov, P.V.: Estimation and forecasting of PV cells
and modules parameters on the basis of the analysis of interaction of a sunlight with a solar
cell material. In: Conference Proceeding - 4th International Conference, TAE 2010, pp. 307–
310 (2010)
6. Nikitin, B.A., Gusarov, V.A.: Analiz standartnogo spektra nazemnogo solnechnogo
izlucheniya intensivnost'yu 1000 Vt/m2 i ocenka na ego osnove ozhidaemyh harakteristik
kremnievyh fotoelektricheskih preobrazovatelej (Analysis of the standard spectrum of
ground-based solar radiation with an intensity of 1000 W/m2 and assessment based on it of
the expected characteristics of silicon photovoltaic converters). Avtonomnaya energetika:
tekhnicheskij progress i ekonomika (Auton. Power Eng. Tech. Progress Econ. no. 24–25,
50–60 (2009). [in Russian]

7. Strebkov, D.S., Nikitin, B.A., Gusarov V.A.: K voprosu ocenki effektivnosti raboty
fotopreobrazovatelya pri malyh i povyshennyh urovnyah osveshchennosti (On the issue of
evaluating the efficiency of the photoconverter at low and high levels of illumination).
Vestnik VIESKH (Bulletin VIESH), 119 (2012). (in Russian)
8. Nikitin, B.A., Mayorov, V.A., Kharchenko, V.V.: Issledovanie spektral'nyh harakteristik
solnechnogo izlucheniya dlya razlichnyh velichin atmosfernyh mass (Investigation of the
spectral characteristics of solar radiation for various atmospheric masses). Vestnik VIESKH
[Bulletin of VIESH], 4(21), 95–105 (2015). (in Russian)
9. Nikitin, B.A., Mayorov, V.A., Kharchenko V.V.: Vliyanie velichiny atmosfernoj massy na
spektral'nuyu intensivnost' solnechnogo izlucheniya (The effect of atmospheric mass on the
spectral intensity of solar radiation). Energetika i avtomatika (Energy Autom.) 4(26), pp. 54–
65 (2015). (in Russian)
10. Kharchenko, V., Nikitin, B., Tikhonov, P., Adomavicius, V.: Utmost efficiency coefficient of
solar cells versus forbidden gap of used semiconductor. In; Proceedings of the 5th
International Conference on Electrical and Control Technologies ECT-2010, pp. 289–294
(2010)
11. Strebkov, D.S., Nikitin, B.A., Kharchenko, V.V., Arbuzov, Yu.D., Yevdokimov, V.M.,
Gusarov V.A., Tikhonov, P.V.: Metodika analiza spektra issleduemogo istochnika sveta
posredstvom tarirovannogo fotopreobrazovatelya i komplekta svetofil'trov (Method of
spectrum analysis of the studied light source by means of a calibrated photoconverter and a
set of light filters). Vestnik VIESKH (Bulletin of VIESH), 4(9), 54–57 (2012). (in Russian)
12. Panchenko, V.: Photovoltaic solar modules for autonomous heat and power supply. IOP
Conf. Ser. Earth Environ. Sci. 317, 9 p. (2019). https://doi.org/10.1088/1755-1315/317/1/
012002
13. Panchenko, V., Izmailov, A., Kharchenko, V., Lobachevskiy, Ya.: Photovoltaic solar
modules of different types and designs for energy supply. Int. J. Energy Optim. Eng. 9(2),
74–94 (2020). https://doi.org/10.4018/IJEOE.2020040106
Modeling of the Thermal State of Systems of Solar-Thermal Regeneration of Adsorbents

Gulom Uzakov1, Saydulla Khujakulov1, Valeriy Kharchenko2, Zokir Pardayev1, and Vladimir Panchenko3,2(✉)

1 Karshi Engineering - Economics Institute, Mustakillik str. 225, Karshi, Uzbekistan
uzoqov1966@mail.ru
2 Federal Scientific Agroengineering Center VIM, 1st Institutskiy passage 5, 109428 Moscow, Russia
kharval@mail.ru
3 Russian University of Transport, Obraztsova st. 9, 127994 Moscow, Russia
pancheska@mail.ru

Abstract. The energy supply of fruit and vegetable storages using solar energy is particularly relevant in places remote from centralized energy supply. The authors of the paper propose systems for the thermal regeneration of adsorbents based on the use of solar energy. The paper presents studies of heat transfer and of the thermal regime of the developed system of solar-thermal regeneration of the adsorbent (activated carbon) in the non-stationary mode. The paper also offers a mathematical model of the temperature field of the adsorbent layer during solar heating using a solar air collector. The proposed mathematical model of the thermal regime of the solar adsorption installation makes it possible to control the technological process of thermal regeneration of adsorbents and to significantly reduce the consumption of traditional energy resources.

Keywords: Solar air heater · Air temperature · Thermal efficiency · Solar energy · Adsorbent · Temperature · Air flow

1 Introduction

The leadership of Uzbekistan has set the tasks of reducing the energy and resource intensity of the economy, the widespread introduction of energy-saving technologies in production, and the expansion of the use of renewable energy sources. The implementation of these provisions, including increasing the efficiency of the use of solar energy in the thermotechnological processes of fruit and vegetable storages, is considered one of the most important tasks.
Based on our studies, we developed a solar air-heating system for the thermal regeneration of adsorbents and active ventilation of fruit and vegetable chambers [1–3]. Studies have shown that the thermal regime of thermal regeneration of adsorbents depends on the intensity of convective heat transfer between the surface of the adsorbent and the hot air flowing over it. The main heat-engineering parameter of convective heat transfer is the heat transfer coefficient, which depends on many factors [4–8].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 103–110, 2021.
https://doi.org/10.1007/978-3-030-68154-8_11

2 Research Problem Statement

The aim of this work is to study the heat transfer and thermal regime of the developed system of solar-thermal adsorbent regeneration in the non-stationary mode. Convective heat transfer during forced air movement through a fixed adsorbent layer in the solar regeneration mode has a pronounced unsteady character: the air and adsorbent temperatures change both in time and in space. In order to calculate the cycle of adsorption plants, determine the duration of the solar regeneration regime and select the optimal thermotechnical parameters for its implementation, it is necessary to calculate the temperature field, i.e., the change in the temperature of the adsorbent along the length of the adsorber over the studied time.

3 Research Results

The heat transfer coefficient in the granular layer of activated carbon in the processes of thermal regeneration is determined by the method of modeling convective heat transfer using criterial similarity equations [9, 10]. The experimental results were processed using the similarity equations of V.N. Timofeev:

$$Nu_{liq} = 0.106 \cdot Re_{liq} \quad \text{at } 20 < Re_{liq} < 200, \qquad (1)$$

$$Nu_{liq} = 0.61 \cdot Re_{liq}^{0.67} \quad \text{at } Re_{liq} > 200. \qquad (2)$$

The results of the studies to determine the heat transfer coefficient from the air to the adsorbent are shown in Table 1.

Table 1. The results of studies of heat transfer in a granular adsorbent layer during thermal regeneration by atmospheric air

№ | W, m/s | Re_liq | Nu_liq | α, W/(m²·K)
1 | 0.2 | 141.2 | 14.95 | 35
2 | 0.3 | 211.7 | 22 | 51.15
3 | 0.4 | 282.3 | 26.7 | 62.1
4 | 0.5 | 352.9 | 31 | 72
5 | 0.6 | 423.5 | 35.1 | 81.6
6 | 1.0 | 705.9 | 49.4 | 114.6
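The correlations (1)-(2) reproduce Table 1 to within rounding, as the following sketch shows; the conversion factor from Nu to α (of the order of λ_air/d) is inferred from the table itself, not stated in the paper:

```python
# Sketch: Timofeev correlations (1)-(2) against the Re values of Table 1.
def nu_timofeev(re: float) -> float:
    """Nusselt number of a granular bed after V.N. Timofeev."""
    if 20.0 < re < 200.0:
        return 0.106 * re          # expression (1)
    if re >= 200.0:
        return 0.61 * re ** 0.67   # expression (2)
    raise ValueError("correlations valid for Re > 20")

ALPHA_PER_NU = 2.34  # ~lambda_air/d implied by Table 1, W/(m^2*K) per unit Nu
for re in (141.2, 211.7, 282.3, 352.9, 423.5, 705.9):
    nu = nu_timofeev(re)
    print(f"Re = {re:6.1f}  Nu = {nu:5.2f}  alpha = {ALPHA_PER_NU * nu:6.1f} W/(m^2*K)")
```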

The process of unsteady heat transfer in the main apparatus of the adsorption unit, the adsorber, allows the physical model of the process to be presented as follows. The process is one-dimensional; there is no heat exchange with the environment through the side surface of the adsorber; at any time the temperature of an adsorbent particle can be considered constant throughout its volume; convective heat transfer between the air flow and the adsorbent layer is decisive, while heat conduction through the air and the layer in the axial direction and heat transfer by radiation are small and can be neglected; the air pressure during movement through the layer remains unchanged; the mass flow rate of air is constant; and all thermal properties of the air and the layer are considered constant and independent of temperature [11–13].
When considering non-stationary temperature fields in adsorbers, in addition to the above assumptions it is assumed that the adsorption properties of the layer do not affect the heat transfer process: the specific mass air flow in any section remains constant, and the thermal effects associated with adsorption and desorption are insignificant and can be neglected.
To derive the basic equations for the air and the particles enclosed in an elementary volume of the layer, heat balance equations are compiled. For the air: the change in the enthalpy of the air over the considered period of time plus the heat introduced by the flow is equal to the amount of heat transferred to the layer by convective heat transfer. For the particles of the layer: the change in the enthalpy of the particles over the considered period of time is equal to the amount of heat transferred to the layer by convective heat transfer [14–18].

Fig. 1. Design scheme of unsteady heat transfer during heating of the adsorbent layer

The desired functions are the air temperature $t_a$ and the temperature of the particles of the layer $t_l$; these are functions of two independent variables (Fig. 1).
The temperature of the adsorbent layer is determined by the following data: $m_{ad}$ – the mass of the adsorbent, kg; $c_l$ – the specific heat capacity of the adsorbent, J/(kg·°C); $t_0$ – the initial temperature of the adsorbent layer, °C; $t_l = t(\tau)$ – the temperature of the adsorbent at time $\tau$ ($\tau \ge 0$).

The amount of accumulated heat in the adsorbent grain layer at time $\tau$ is determined by:

$$Q = c_l \cdot m_{ad} \cdot t(\tau) = c_l \cdot m_{ad} \cdot t, \qquad (3)$$

and at time $\tau = 0$:

$$Q_0 = c_l \cdot m_{ad} \cdot t_0. \qquad (4)$$

Within the time $d\tau$, the amount of heat received by the adsorbent layer increases by:

$$dQ = c_l \cdot m_{ad} \cdot dt(\tau). \qquad (5)$$

This amount of heat is transferred to the adsorbent by the air at constant temperature during convective heat transfer between the air and the adsorbent:

$$dQ = \alpha \cdot (t_a - t_l) \cdot d\tau, \qquad (6)$$

where $\alpha$ is the heat transfer coefficient, W/(m²·°C), and $t_a$ is the air temperature, °C.
Equating (5) and (6), the following heat balance equation is obtained:

$$c_l \cdot m_{ad} \cdot dt = \alpha \cdot (t_a - t_l) \cdot d\tau. \qquad (7)$$

Separating the variables:

$$\frac{dt}{t_a - t_l} = \frac{\alpha}{c_l \cdot m_{ad}} \cdot d\tau, \qquad (8)$$

and integrating Eq. (8):

$$\int \frac{dt}{t_a - t_l} = \int \frac{\alpha}{c_l \cdot m_{ad}} \, d\tau, \qquad (9)$$

we obtain:

$$\int \frac{dt}{t_a - t_l} = \frac{\alpha}{c_l \cdot m_{ad}} \cdot \tau + \tilde{C}_1, \qquad (10)$$

where $\tilde{C}_1$ is an arbitrary integration constant.
To calculate the integral on the left side of Eq. (10), the following substitution is introduced:

$$t_a - t_l = x; \qquad (11)$$

it follows that $d(t_a - t_l) = dx$; assuming that $t_a = const$,

$$dt_l = -dx. \qquad (12)$$

Taking into account expressions (11) and (12) and the presence of the integration constant $\tilde{C}_1$ in Eq. (10), after integration we obtain:

$$\int \frac{dt}{t_a - t_l} = -\int \frac{dx}{x} = -\ln x. \qquad (13)$$

Replacing the left side of Eq. (10) according to (13), we obtain:

$$-\ln x = \frac{\alpha}{c_l \cdot m_{ad}} \cdot \tau + \ln \tilde{C}_1. \qquad (14)$$

After transformation, the resulting Eq. (14) takes the following form:

$$t = t_a - \frac{1}{\tilde{C}_1} \cdot e^{-\frac{\alpha}{c_l \cdot m_{ad}} \tau}. \qquad (15)$$
1

Under the initial condition $\tau = 0$, $t(0) = t_0$, from (15) we get:

$$t_0 = t_a - \frac{1}{\tilde{C}_1} \cdot e^0 = t_a - \frac{1}{\tilde{C}_1}, \quad \text{or} \quad \frac{1}{\tilde{C}_1} = t_a - t_0. \qquad (16)$$

Further, for convenience of calculation, we introduce the following notation:

$$\beta = \frac{\alpha}{c_l \cdot m_{ad}}; \qquad (17)$$

from (15), taking into account (16) and (17), we obtain the final equation:

$$t = t_a - (t_a - t_0) \cdot e^{-\beta \tau}. \qquad (18)$$

The obtained Eq. (18) allows us to determine the heating temperature of the adsorbent in the solar air-heating installation, taking into account the environmental parameters and the thermal characteristics of the adsorbent itself ($\alpha$, $c_l$, $m_{ad}$).
From the analysis of the obtained dependence we can conclude that the heating temperature of the adsorbent in the adsorber is determined by its mass and heat capacity, its initial temperature, the heat transfer coefficient, and the duration of heating. The resulting Eq. (18) allows solving the following problems:

1) Given the maximum temperature for heating the adsorbent, it is possible to determine the maximum duration of its heat treatment with air from a solar air heater (the duration of the thermal regeneration of the adsorbent).
2) Given the maximum duration of heat treatment of the adsorbent, the possible temperature of its heating can be determined.
The resulting mathematical Eq. (18) is convenient for practical engineering calculations of systems of thermal regeneration of adsorbents and does not require a large amount of initial and experimental data.
We will calculate based on the following source data: W = 0.2 m/s; α = 35 W/(m²·°C); c_l = 0.84 kJ/(kg·°C); m_ad = 28.8 kg; t_a = 60 °C; t_0 = 18 °C:

$$\beta = \frac{\alpha}{c_l \cdot m_{ad}} = \frac{35}{0.84 \cdot 10^3 \cdot 28.8} = 1.44 \cdot 10^{-3} \; (1/\text{s}).$$
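A short sketch of Eq. (18) with the source data above checks the value of β and tabulates the heating curve of the kind shown in Fig. 2:

```python
# Sketch: heating of the adsorbent layer by Eq. (18) with the data above.
import math

alpha = 35.0           # heat transfer coefficient, W/(m^2*C)
c_l = 0.84e3           # specific heat capacity of the adsorbent, J/(kg*C)
m_ad = 28.8            # mass of the adsorbent, kg
t_a, t_0 = 60.0, 18.0  # air and initial adsorbent temperatures, C

beta = alpha / (c_l * m_ad)  # expression (17): ~1.44e-3 1/s
print(f"beta = {beta:.2e} 1/s")

for tau_min in (0, 10, 30, 60):  # heating time, minutes
    t = t_a - (t_a - t_0) * math.exp(-beta * tau_min * 60.0)  # Eq. (18)
    print(f"tau = {tau_min:3d} min -> t = {t:5.1f} C")
```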

Based on the calculation results, a graph of the temperature change during heating of the adsorbent in the adsorber was constructed (Fig. 2).

Fig. 2. The graph of the temperature of heating the adsorbent (activated carbon) in the adsorber
of a solar air-heating installation

4 Conclusion
1. The proposed mathematical model of the temperature field of the adsorbent layer
during solar heating allows one to determine the change in the temperature of the
adsorbent along the length of the adsorber and in time and also takes into account
the influence of the thermal characteristics of the adsorbent itself.
2. At the maximum heating temperature of the adsorbent, it is possible to determine
the duration of heat treatment with hot air heated in a solar air-heating installation.

3. As can be seen from Fig. 2 with increasing heat transfer coefficient (a) from air to
the adsorbent, the heating intensity of the adsorbent layer increases.
4. Thus, the obtained results of the study of the thermal regime of solar-thermal
adsorbent regeneration and the mathematical model of the temperature field of the
adsorbent layer make it possible to qualitatively control the process of solar-thermal
regeneration of the adsorbent and choose the optimal technological parameters of
the gas medium regulation systems in fruit storage.

References
1. Khuzhakulov, S.M., Uzakov, G.N., Vardiyashvili, A.B.: Modelirovanie i issledovanie
teplomasso- i gazoobmennyh processov v uglublennyh plodoovoshchekhranilishchah
(Modeling and investigation of heat and mass and gas exchange processes in in-depth
fruit and vegetable storages). Problemy informatiki i energetiki (Prob. Inf. Energy), no. 6,
52–57 (2010). (in Russian)
2. Khujakulov, S.M., Uzakov, G.N., Vardiyashvili, A.B.: Effectiveness of solar heating
systems for the regeneration of adsorbents in recessed fruit and vegetable storages. Appl.
Solar Energy 49(4), 257–260 (2013)
3. Khujakulov, S.M., Uzakov, G.N.: Research of thermo moisten mode in underground
vegetable storehouses in the conditions of hot-arid climate. Eur. Sci. Rev. 11–12, 164–166
(2017)
4. Uzakov, G.N., Khuzhakulov, S.M.: Geliovozduhonagrevatel'naya ustanovka s solnechno-
termicheskoj regeneraciej adsorbentov (Helio-air-heating installation with solar-thermal
regeneration of adsorbents). Tekhnika. Tekhnologii. Inzheneriya [Technics. Technol. Eng.],
no. 2, 7–10 (2016). https://moluch.ru/th/8/archive/40/1339/. (in Russian)
5. Uzakov, G.N., Khuzhakulov, S.M.: Issledovanie teploobmennyh processov v sistemah
solnechno-termicheskoj regeneracii adsorbentov (Study of heat exchange processes in
systems of solar-thermal regeneration of adsorbents). Tekhnika. Tekhnologii. Inzheneriya
(Tech. Technol. Eng.) (2), 10–13 (2016). https://moluch.ru/th/8/archive/40/1340/. (in
Russian)
6. Uzakov, G.N., Khuzhakulov, S.M.: Issledovanie temperaturnyh rezhimov geliovozduhona-
grevatel'noj ustanovki dlya sistem termicheskoj regeneracii adsorbentov (Study of the
temperature conditions of a solar air heating installation for thermal regeneration of
adsorbents). Geliotekhnika (Solar Eng.) (1), 40–43 (2017). (in Russian)
7. Abbasov, E.S., Umurzakova, M.A., Boltaboeva, M.P.: Effektivnost’ solnechnyh voz-
duhonagrevatelej (Efficiency of solar air heaters). Geliotekhnika (Solar Eng.) (2), 13–16
(2016). (in Russian)
8. Klychev, Sh.I., Bakhramov, S.A., Ismanzhanov, A.I.: Raspredelennaya nestacionarnaya
teplovaya model’ dvuhkanal’nogo solnechnogo vozduhonagrevatelya (Distributed non-
stationary thermal model of a two-channel solar air heater). Geliotekhnika (Solar Eng.) (3),
77–79 (2011). (in Russian)
9. Akulich, P.V.: Raschety sushil'nyh i teploobmennyh ustanovok (Calculations of drying and
heat exchange plants). Minsk, Belarus. navuka (Minsk, Belarus. Navuka) (2010). 443 p. (in
Russian)
10. Mikheev, M.A., Mikheeva, I.M.: Osnovy teploperedachi (Fundamentals of heat transfer).
Moskva, Energiya (Moscow, Energy) (1977). 320 p. (in Russian)

11. Yanyuk, V.Ya., Bondarev, V.I.: Holodil'nye kamery dlya hraneniya fruktov i ovoshchej v
reguliruemoj gazovoj srede (Refrigerators for storing fruits and vegetables in a controlled gas
environment). Moskva, Legkaya i pishchevaya promyshlennost’ (Moscow, Light and food
industry) (1984). 128 p. (in Russian)
12. Kharitonov, V.P.: Adsorbciya v kondicionirovanii na holodil'nikah dlya plodov i ovoshchej
(Adsorption in conditioning on refrigerators for fruits and vegetables). Moskva, Pishchevaya
promyshlennost’ (Moscow, Food industry) (1978). 192 p. (in Russian)
13. Chabane, F.: Design, developing and testing of a solar air collector experimental and review the system with longitudinal fins. Int. J. Environ. Eng. Res. 2(1), 18–26 (2013)
14. Henden, L., Rekstad, J., Meir, M.: Thermal performance of combined solar systems with
different collector efficiencies. Sol. Energy 72(4), 299–305 (2002)
15. Kolb, A., Winter, E.R.F., Viskanta, R.: Experimental studies on a solar air collector with
metal matrix absorber. Sol. Energy 65(2), 91–98 (1999)
16. Kurtas, I., Turgut, E.: Experimental investigation of solar air heater with free and fixed fins:
efficiency and exergy loss. Int. J. Sci. Technol. 1(1), 75–82 (2006)
17. Garg, H.P., Choundghury, C., Datta, G.: Theoretical analysis of a new finned type solar
collector. Energy 16, 1231–1238 (1991)
18. Kartashov, A.L., Safonov, E.F., Kartashova, M.A.: Issledovanie skhem, konstrukcij,
tekhnicheskih reshenij ploskih solnechnyh termal’nyh kollektorov (Study of circuits,
structures, technical solutions of flat solar thermal collectors). Vestnik YUzhno-Ural’skogo
gosudarstvennogo universiteta (Bull. South Ural State Univ.), (16), 4–10 (2012). (in
Russian)
Economic Aspects and Factors of Solar Energy Development in Ukraine

Volodymyr Kozyrsky, Svitlana Makarevych, Semen Voloshyn, Tetiana Kozyrska, Vitaliy Savchenko(✉), Anton Vorushylo, and Diana Sobolenko

National University of Life and Environmental Sciences of Ukraine, St. Heroiv Oborony, 15, Kyiv 03041, Ukraine
{epafort1,t.kozyrska}@ukr.net, birma0125@gmail.com, semvenergy@gmail.com, anton2320@gmail.com, dinysa2400@gmail.com

Abstract. The paper is devoted to the development of renewable energy in Ukraine and to particular directions of its state stimulation. A functional diagram of the interrelations of the factors influencing the efficiency and mass adoption of renewable sources is given (using solar power plants as an example). The MicroGrid concept is considered as a power supply system for remote territories. The concept of a “dynamic tariff” is proposed as an integral indicator of the current cost of electricity at the consumer's input. It is formed on the basis of the real cost of electricity from sources in the MicroGrid system, the cost of electricity losses during transportation, taxes, planned profit and a number of functional factors that determine the management of the balance of electricity generation and consumption and the influence of consumers and the MicroGrid system on electricity quality.

Keywords: Renewable energy sources · Solar electricity · Solar energy station · Electricity storage · MicroGrid system · Dynamic tariff · Reclosers

Renewable energy sources (RES) are one of the priorities of energy policy and
instruments to reduce carbon emissions. Efforts by countries to address this issue under
the Kyoto Protocol are not yielding the expected effect. The first steps of the world
community to solve this problem began at the 18th Conference of the Parties to the UN
Framework Convention and the 8th meeting of the Parties to the Kyoto Protocol, which
took place from November 26 to December 7, 2012 in Doha (Qatar) [1]. Under the
second period of the Kyoto Protocol (2013–2020), Ukraine committed itself to reducing greenhouse gas emissions by 20% (from 1990 levels) and announced a long-term goal for 2050: to reduce emissions by 50% compared with 1990.
Why is the number of RES use projects growing in Ukraine, despite the fact that
tariffs are falling?
It turns out that at the same time there were several positive factors that determine
the development of alternative energy. The first factor is regulatory. In addition to
reducing green energy tariffs, the state guarantees that support will last long enough to
recoup investment in energy facilities.
As in some European countries, where the tariff decreases as the EU targets are met
(20% of energy is obtained from renewable sources by 2020), Ukraine also has

mechanisms in place to reduce it. The green tariff will be reduced from the base level in
2009 by 20% in 2020 and another 30% in 2025, and abolished in 2030 as a temporary
incentive for the development of RES.
The second factor in stimulating the construction of stations is that the conditions
for the construction of new facilities have been liberalized. In particular, the Verkhovna
Rada changed the rule on the mandatory local component in the equipment. Instead, an
incentive mechanism was introduced: the more domestic components in the station, the
higher the rate of tariff increase.
Another important factor is the rapid worldwide decline in the cost of technology. In recent years, capital expenditures for the construction of SES have decreased significantly as equipment has become cheaper. As of 2017, investments per 1 MW of capacity in Ukraine fluctuate at the level of 0.75–1.05 million euros, and payback periods of projects are 6–7 years. China intends to put 15–20 GW of solar capacity into operation annually; by 2020 its total capacity should triple. This plan envisages investments of $368 billion in China's energy sector, and since China is one of the main manufacturers of solar energy station (SES) components, this volume of investment will help reduce the cost of SES.
Increasing the efficiency of SES elements (photovoltaic panels (FEP), batteries, etc.) also reduces the cost of SES. To date, most commercial solar cell modules are based on crystalline Si (first-generation cells) and amorphous thin-film cells of large area with an efficiency of about 5–8% (second-generation cells). The third-generation concept is the use of nano- and microstructures (microwires). The main characteristic of an FEP is its photoelectric conversion efficiency, which for currently available industrial FEPs lies in the range from 7% to 18% [1–7] and reaches 39–43% in laboratory developments [4, 8].
Thus, the improvement of technologies and efficiency of SES elements has made
solar generation one of the leaders in terms of capacity growth not only in Ukraine but
also in the world.
One of the main elements of an SES is the battery for electricity storage. Electricity storage technologies are also developing rapidly: lithium-ion batteries, hydrogen energy storage technologies, supercapacitors. These innovative developments offer higher performance and lower cost.
Unfortunately, current electricity storage technologies in Ukraine remain expensive: one kW of storage capacity costs from $500 to $3,000. It is expected that within 3–5 years the price will drop by a third or more.
A factor that negatively affects the domestic segment of the RES market is the poor technical condition of electrical networks. For example, due to the high density of buildings in the Kyiv region, private homes have problems due to insufficient capacity of the electrical networks.
Importantly, the installed capacity of household solar panels increased more than sevenfold: from 2.2 MW (2015) to 16.7 MW at the end of 2016. That is, households began to install more powerful panels. All those who installed solar panels in 2017 receive a fairly high green tariff of 19 cents per 1 kWh. Gradually, this tariff will be reduced on the same principle that applies to industrial stations – down to 14 cents by 2030.
An important factor that encourages private individuals to install panels is the falling cost of small SES technologies. Ekotechnik Ukraine Group forecasts a 5–10% drop in prices for solar power plant equipment for private households. In 2017, the average price of a complete set of equipment is about 900–950 euros per 1 kW of SES.
Ukrainian startup SolarGaps has offered a new solution: the world's first solar blinds. The project has already attracted its first investments and is preparing to enter markets around the world. According to the startup, equipping an ordinary apartment window with such blinds will cost $300. SolarGaps blinds in a three-room apartment with windows facing south will be able to produce up to 600 W·h, or about 4 kW·h per day (100 kW·h per month).
Thus, in the long run solar energy is gradually becoming the cheapest source of energy in many countries, and Ukraine is no exception. Solar activity in Ukraine is sufficient to ensure a return on investment within 6–7 years using the green tariff and 13–15 years without it. This term is comparable with the payback of a classic thermal power plant.
Ukraine has all the natural and regulatory prerequisites for the development of RES. The share of green generation, if favorable factors are maintained, may reach 30% by 2035 due to the construction of 15 GW of new SES and wind farms. Investments in RES and Ukraine's economy will create tens of thousands of jobs, which is an important social factor.
In determining the conditions for the development of SES, it is important to establish the degree of influence of the various factors. Figure 1 presents a functional diagram of the relationship of factors for two types of SES, with capacity P > 30 kW and P < 30 kW (private sector).
The analytical relationship between the SES development function and the factors can be written for each of the two capacity classes as a weighted sum of the factors:

F(P > 30 kW) = k1·X1 + k2·X2 + … + k10·X10   (1)

F(P < 30 kW) = k'1·X1 + k'2·X2 + … + k'10·X10   (2)

Determination of the coefficients can be performed by the method of expert assessments.
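A minimal sketch of how such an expert-assessment model could be evaluated, assuming the development function is a weighted sum of factor scores as in (1)–(2); the weights and scores below are illustrative placeholders, not values from the paper:

```python
# Weighted-sum evaluation of the SES development function F = sum(k_i * X_i).
# Hypothetical expert weights (sum to 1) and factor scores on a 0..1 scale.
weights = {'X1': 0.15, 'X2': 0.05, 'X3': 0.10, 'X4': 0.15, 'X5': 0.10,
           'X6': 0.10, 'X7': 0.05, 'X8': 0.05, 'X9': 0.15, 'X10': 0.10}
scores = {f'X{i}': 0.5 for i in range(1, 11)}

def ses_development(weights, scores):
    """Development function as a weighted sum of the factors X1..X10."""
    return sum(weights[x] * scores[x] for x in weights)

print(round(ses_development(weights, scores), 3))  # 0.5 for uniform scores
```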
Smart Grid technologies are considered a promising direction for the development of energy supply systems; for remote areas, such as rural areas, this means the creation of MicroGrid systems [8]. A number of pilot projects of similar systems have already been implemented in the world [9–12].
However, this concept creates a number of scientific, technical and economic problems that need to be solved to ensure the successful operation of MicroGrid systems. The economic ones include, for example, the formation of a tariff policy in a closed MicroGrid system.
A MicroGrid system is a local power system containing two or more homogeneous or heterogeneous power sources, means of energy storage, elements for the transportation, distribution and switching of electricity, power consumers, and systems for receiving, transmitting and analyzing information. The system is considered as a single integrated and controlled unit of the subsystems “generation–accumulation–transportation–distribution–consumption” of electricity with an intelligent control system. The MicroGrid system can operate autonomously or as an integrated subsystem of a centralized power system.

Fig. 1. Functional diagram of the relationship between the factors of development of solar electricity, for SES with P > 30 kW and SES with P < 30 kW. The factors are: state tariff regulation (X1); state non-tariff regulation (X2); state regulation – liberalization of construction (X3); reducing the cost of technology (X4); increasing the efficiency of the elements (X5); the level of reliability and bandwidth of electrical networks (X6); social factor (X7); logistics (X8); climatic factor (X9); environmental factor (X10).
Since electricity is a commodity whose cost of generation and transportation changes over time, it is advisable to consider a new principle: forming the tariff for each consumer dynamically (over each period Δt during which the components of the tariff are constant). This tariff can be called dynamic.
Dynamic tariff (DT) is an integrated indicator of the current cost of electricity at the
input of the consumer, which is formed on the basis of the actual cost of electricity
from sources in the MicroGrid system (see Fig. 1), the cost of electricity losses during
transportation, taxes, planned profits and a number of functional factors, which
determine the management of the balance of generation and consumption of electricity,
the impact of consumers and the MicroGrid system on the quality of electricity.

Fig. 2. Example of electrical network diagram in MicroGrid system

The scheme in Fig. 2 shows the main elements of the MicroGrid system: consumers – various objects of the settlement (utility and industrial spheres); SES – solar power plants (possibly with electricity storage); reclosers – switching devices with automation and remote control; the MicroGrid system control center – a server with a computer center and an intelligent system control algorithm; and the information transmission system from each element of the MicroGrid system to the control center (the elements – SES, electricity storage, consumer meters, reclosers – are equipped with means of transmitting and receiving information).
During daily operation of the MicroGrid system the electrical load of consumers changes; as the loads in the nodes of the circuit in Fig. 2 change, some reclosers are switched on and others off, and the circuit configuration is optimized for the minimum total power loss in the MicroGrid system, ΣΔP → min. Thus, the scheme of electricity transportation from the SES can change, and the consumer can receive electricity from different sources during the day.
The functional for determining the dynamic tariff at the input of the electricity consumer is:

T_D(C_i, K_i) = (T_1(C_1) + T_2(C_2) + T_3(C_3) + T_4(C_4) − T_5(C_5)) · K_1(S) · K_2(I_1) · K_3(I_2)   (3)

where the components of the electricity tariff are: T_1(C_1) – the dependence of the tariff for a particular consumer on the cost of the electricity produced; T_2(C_2) – the dependence of the tariff on the cost of electricity losses during transportation; T_3(C_3) – the dependence of the tariff on the amount of taxes; T_4(C_4) – a component determined by the planned profit of the MicroGrid system; T_5(C_5) – a component determined by the sale of electricity to the centralized power supply system at the “green tariff”; K_1(S) – a coefficient that takes into account the management of the balance of generation and consumption of electricity; K_2(I_1) – a coefficient that takes into account the negative impact of the consumer on the quality of electricity; K_3(I_2) – a coefficient that takes into account the negative impact of the MicroGrid system on the quality of electricity.
The coefficient K_1(S) is determined by the total electrical load S of the MicroGrid system and motivates consumers to use electricity during the hours of minimum electrical load of the system. It can take values within n ≤ K_1(S) ≤ m (for example, n > 0 and m < 2) and can be set on a contractual basis between electricity consumers, if they are co-owners of the MicroGrid system, or between consumers and the owner of the system.
The coefficients K_2(I_1) and K_3(I_2) are determined by the deviation of electricity quality parameters from the norms and motivate the consumer and the MicroGrid system, respectively, not to reduce the quality of electricity.
The component T_1(C_1) determines the dependence of the tariff on the cost of electricity produced by the generating installation from which the consumer receives electricity during the time Δt. In the case of a closed mode of operation of the electrical networks of the MicroGrid system (see Fig. 2), the daily change of the electrical loads of consumers entails switching changes of the network. In this case, to optimize the operating mode of the network by the criterion ΣΔP → min (minimum total power losses during the time period Δt), the configuration of the circuit changes and the consumer can receive electricity from one or more generating units. Since the cost of electricity produced by different generating units differs, this component of the tariff for a particular consumer changes dynamically.
The component T_2(C_2) determines the dependence of the tariff on the cost of electricity losses for its transportation to the consumer. Based on the conditions of formation of the component T_1(C_1) in the MicroGrid system and the periodic changes in the configuration of the electrical network, it can be concluded that the tariff component T_2(C_2) should be calculated for the consumer for each time period Δt.
The component T_3(C_3) determines the dependence of the tariff on the amount of taxes and is constant at constant tax rates.
The component T_4(C_4) is determined by the planned profit of the MicroGrid system and can be set on a contractual basis by the consumers as owners, or by the consumers and the owners of the MicroGrid system.
The component T_5(C_5) is determined by the sale of electricity to the centralized power supply system at the “green tariff” and can be included in the formation of the tariff on a contractual basis by the consumers and owners of the MicroGrid system.
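As a minimal illustration, the dynamic tariff of Eq. (3) could be computed as below; every number here is an assumed placeholder (currency units per kW·h), not data from the paper:

```python
def dynamic_tariff(t1, t2, t3, t4, t5, k1, k2, k3):
    """Dynamic tariff per Eq. (3):
    T_D = (T1 + T2 + T3 + T4 - T5) * K1(S) * K2(I1) * K3(I2)."""
    return (t1 + t2 + t3 + t4 - t5) * k1 * k2 * k3

t_d = dynamic_tariff(
    t1=0.060,   # cost of electricity produced by the supplying unit(s)
    t2=0.010,   # cost of transportation losses for this consumer
    t3=0.010,   # taxes
    t4=0.020,   # planned profit of the MicroGrid system
    t5=0.005,   # sales to the centralized system at the "green tariff"
    k1=1.2,     # balance-management coefficient, n <= K1(S) <= m
    k2=1.0,     # consumer's impact on electricity quality
    k3=1.0,     # MicroGrid system's impact on electricity quality
)
print(round(t_d, 4))  # 0.114
```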

1 Conclusions
1. The most promising concept for the development of energy supply systems is the Smart Grid direction; for remote areas, such as rural areas, it is the creation of MicroGrid systems.
2. Modern technical means make it possible to create a system of online (real-time) accounting of the cost of electricity at the input of the consumer, which requires the solution of a number of economic problems.
3. The introduction of the concept of a dynamic tariff – an integrated indicator of the current cost of electricity at the input of the consumer – will reduce electricity losses during transportation, which will significantly affect the cost of electricity consumed.

References
1. Introduction to Microgrids – What is a Microgrid. ABB. https://new.abb.com/distributed-
energy-microgrids/introduction-to-microgrids
2. International Energy Agency. https://www.iea.org/
3. European Commission. https://ec.europa.eu/
4. Bondarenko, S.A., Zerkina, O.O.: Smart Grid as a basis for innovative transformations in the
electricity market. BUSINESSINFORM, no. 4, pp. 105–111 (2019)
5. Cheremisin, M.M., Cherkashina, V.V., Popadchenko, S.A.: Features of introduction of smart grid technologies in the electric power industry of Ukraine. ScienceRise 4/2(9), 27–31 (2015)
6. Kozyrsky, V.V., Guy, O.V.: SMART GRID technologies in energy supply systems:
monograph. Comprint, Kyiv (2015). 336 p.
7. LAW of Ukraine on Voluntary Association of Territorial Communities (Vidomosti
Verkhovnoi Rady (VVR)), no. 13, p. 9 (2015)
8. Forst, M.: Germany’s module industry poised for growth. SUN Wind Energy 5, 256–263
(2011)
9. Bekirov, E.A., Khimich, A.P.: Computer modeling of complex power systems with solar
energy concentrators. Renew. Energy 1(24), 74–81 (2011)
10. Solar energy: industry review. Based on materials from Nitol Solar Limited. https://nitolsolar.com/rusolarenergy/
11. von Malsen, E.: Opportunities for large-scale projects. SUN Wind Energy 5, 254–255 (2011)
12. Solar energy. In: Wikipedia, the Free Encyclopedia. https://en.wikipedia.org/wiki/Solar_energy
A Method for Ensuring Technical Feasibility
of Distributed Balancing in Power Systems,
Considering Peer-to-Peer Balancing
Energy Trade

Mariusz Drabecki

Institute of Control and Computation Engineering, Warsaw University of Technology, 15/19 Nowowiejska Street, Warsaw, Poland
m.drabecki@onet.eu

Abstract. In this paper a method for ensuring network feasibility of power flow (in terms of all network constraints), when energy in a power system is balanced via peer-to-peer contracts, is proposed and analyzed. The method considers the subjective benefits (utility functions) of market participants. It is based on two optimization problems derived from standard Optimal Power Flow formulations, which can be solved by the system's operator. It can potentially be used in power systems with high penetration of distributed energy resources (DERs), giving market participants an incentive to build and actively control those resources. The method was tested on a 9-bus test system under three different preference scenarios.

Keywords: Balancing energy market · Peer-to-peer energy trade · Power flow · Network constraints

1 Introduction

Currently, electrical energy is traded on specially designed markets. Their basic architecture is similar worldwide. Depending on when the trade happens relative to the actual delivery date, one can distinguish the following markets: the long-term contracts market, where energy is traded bilaterally long before delivery; the day-ahead market, where trade happens for the next day; the intraday market, where participants trade for the same day, at least one hour prior to delivery; and the balancing market, being the real (or nearly real) time market. Through the balancing market, the system's operator ensures that supplied energy exactly equals its consumption (transmission losses included) and that the power system operates safely and securely – i.e. that all technical constraints are met. This is assured by the fact that the operator participates in every buy/sell transaction and, by consequence, is responsible for the final dispatch of generating units [1].
However, it is believed that under such a centralized balancing scheme, operations on the market are not optimal from each market participant's subjective perspective. This is true even though the balancing is performed by the operator
aiming to maximize the social welfare function, which usually means minimizing the
overall generation cost, subject to all power flow technical constraints. Although being
least costly for the aggregated set of energy consumers, this approach completely
neglects individual preferences of market participants and the bilateral, peer-to-peer agreements made between them. I argue that allowing such agreements in the balancing process
would add more freedom to the market itself.
Such a liberated way of energy balancing is especially interesting when considering market mechanisms for power systems integrating many distributed energy resources (DERs). This energy balancing scheme is important for systems with high DER penetration, as it directly gives market participants incentives both to build DERs and to actively control their power output, in order to maximize their subjective profits. Distributed power systems, where energy is traded on a peer-to-peer basis, are gaining recognition both theoretically and practically in multiple integration trials performed worldwide [2].
Yet, as described in the following section, such distributed self-balancing and unit self-commitment may cause problems with the technical feasibility of the power flow resulting from the so-dispatched units. Other problems may also arise while trying to optimally balance demand and supply of energy in the system. Addressing these technical issues lies within the scope of this paper and forms its contribution.
Some interesting research in this field has already been conducted. The authors of [2] outlined that, depending on whether network constraints are taken into account and whether individual actions of peers are controlled centrally, multiple types of peer-to-peer transactions can be identified, with their associated control models. One more approach towards optimizing distributed p2p trade on the market was given in [3]; yet, the authors of the latter neglected all network constraints and the arising technical problems. These were addressed in [4], yet limited to active power only. However, both of these papers, together with [5], addressed the issue of maximizing the subjective benefits of market participants. The authors of [6] have pointed out that p2p trading might be even more beneficial when consumers optimize their power consumption and later trade negawatts. Some other examples of related research on relevant models of trade optimization are given in [7–9]. However, the role of the system's operator in these works remains unclear and possibly suppressed.
According to the authors of [2, 7, 10], the technical feasibility issues (network constraints) might be addressed by special control systems at the generation/load bus level – Energy Management Systems. These would need the capability of limiting the possible generation/demand of a given market participant to ensure network feasibility of the dispatch. However, such systems would require correct integration and control knowing the full picture of the current situation in the grid. Therefore, it is reasonable to assume that, at least in the period of transition from centralized to fully distributed system architectures, the system's operator is the relevant entity to guard the security and stability of power supply to customers, as it is the operator who knows all technical issues of the grid.
This paper addresses the problem of possible infeasibility of power flow (both active and reactive) when energy balancing in the system is accomplished through peer-to-peer trading. In other words, the paper proposes a method for finding a feasible dispatch of generating units when peer-to-peer balancing energy contracts are made.
For this, a multi-step method based on optimization models is proposed. As an assumption, a typical scenario is considered in which the power system under consideration may be a local grid, or a wider-area sub-network, managed by a system operator striving for system self-balancing.

2 Notion of Power Flow and Its Feasibility

The main goal of any electrical power system is to safely and securely provide the demanded amounts of electrical power to its consumers at each time instant. This consists of two main tasks, namely generation of the power at the system's generation buses and its transmission to the load buses.
For the first task it is necessary to produce an amount of apparent power which equals exactly the demanded amount of power plus transmission losses. Transmission of the power through the system is referred to as the power flow. It results from the combination of generating units' setpoints (both for active and reactive power), the demand attached to every load bus, and the current technical parameters of the power grid itself. Normally, the power flow is estimated by numerically solving a set of nonlinear equations, as shown in [11].
The power flow can only be technically attainable (in this paper referred to as feasible) if all the variables, i.e. generating units' setpoints, branch power flows, nodal voltages and nodal voltage angles, fall within their technical limits. As shown in [12], network feasibility of the power flow, considering both active and reactive flow, depends highly on the grid model used for determining the power dispatch of generating units. Thus, depending on the dispatch model, a non-feasible power flow may be obtained even though all generation setpoints lie within the generating units' capabilities.
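In code, feasibility in the sense used here reduces to a box check on all power-flow variables; a minimal sketch (the variable names and limits are illustrative assumptions):

```python
def is_feasible(values, limits):
    """True if every power-flow variable lies within its technical limits.
    values: name -> value; limits: name -> (min, max)."""
    return all(lo <= values[name] <= hi for name, (lo, hi) in limits.items())

# Illustrative snapshot: one generator setpoint, one bus voltage, one line flow.
values = {'P_g1': 85.0, 'U_bus5': 1.02, 'S_line4': 140.0}
limits = {'P_g1': (10.0, 250.0), 'U_bus5': (0.95, 1.05), 'S_line4': (0.0, 150.0)}
print(is_feasible(values, limits))  # True
```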

3 Proposed Method

In this section the proposed method for ensuring technical feasibility of power flow, when the generating units' dispatch is obtained via p2p balancing energy trade, is described in detail. It is assumed that the operator is only responsible for checking whether the generating units' dispatch resulting from the agreed contracts yields a feasible power flow, and not for any a priori dispatch of generating units. This check is performed at the moment of accepting bilateral contracts. If the resulting power flow is not feasible, the operator proposes a direction in which the market participants should adjust their contractual positions, so as to get as close to a power-flow-feasible value as possible. This is obtained by applying an additional constraint on accepted contracts. We can thus say that the operator serves only as a feasibility guard of the power flow, without imposing anything on the market participants for as long as possible. What is more, it is assumed that only active power balance is the subject of trade between peers.
The method is based on specially designed optimization problems. As contracts are signed directly between market players based on their subjective goals, the overall generation cost (social welfare function) is unknown and of little interest to the operator. The optimization goal is thus to minimize the violations of the agreed balancing contractual positions, subject to all power flow technical constraints.
We shall consider two network flow optimization sub-problems formulated further in this paper. The first sub-problem (Formulation (2)) allows for identifying the unit whose total amount of contracted power should be adjusted the most in order to make the power flow feasible. This unit shall be considered the most problematic in terms of taken contracts and should be the first to change its contractual position in the search for a network-feasible power dispatch.
The second formulation (Formulation (3)), however, arbitrarily imposes changes on generating units' contractual positions to obtain a feasible power flow while maximizing the use of bilateral contracts between suppliers and consumers. I propose that the operator use this formulation when market participants could not come to a network-feasible dispatch through negotiations in a considerable amount of time/rounds during which (2) was applied. Formulation (3) was first proposed in [13] and is cited in this paper as part of the now-proposed method.
Both proposed formulations are restrictions of the standard Optimal Power Flow (OPF) problem [14]. Thus, any feasible solution of either of these two sub-problems yields a network-feasible power flow in terms of satisfying all grid technical constraints.
The method can be summarized in the following generic steps (a high-level sketch in code is given after this list):
0. Accept contracts between suppliers and consumers that lie within all technical limits of generating units, i.e. which do not exceed their technical maxima/minima. The architectural design of an appropriate IT platform is beyond the scope of this article and will not be elaborated.
1. Check the network feasibility of the power flow that results from the accepted contracts between suppliers and consumers. In case of infeasibility, apply Formulation (2) to identify the most problematic unit and to estimate transmission losses.
2. Calculate the new constraint to be added.
3. Return information on the most problematic unit and on the new constraint to market participants.
4. Enforce market participants to adjust their market positions through bilateral negotiations following the information issued in step 3, respecting the additional constraint. After they agree on a new contract distribution, go to step 0.
5. Use Formulation (3) to impose changes on the market positions of the participants if no feasible solution is found through negotiations in a considerable amount of time (or rounds of negotiations).
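A high-level sketch of this loop is given below; all helper names are placeholders for the checks and optimization problems described in Sects. 3.1–3.3, not an existing API:

```python
def balance_with_p2p_contracts(contracts, operator, max_rounds=10):
    """Steps 0-5 of the method; `contracts` maps each unit to its contracted
    total, `operator` bundles the feasibility check and problems (2)-(3)."""
    constraints = []
    for _ in range(max_rounds):
        if operator.power_flow_feasible(contracts):               # step 1
            return contracts                                      # feasible dispatch
        unit, bound = operator.solve_formulation_2(contracts)     # steps 1-2
        constraints.append((unit, bound))                         # steps 2-3
        contracts = operator.renegotiate(contracts, constraints)  # step 4
    return operator.solve_formulation_3(contracts)                # step 5 fallback
```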

3.1 Optimal Power Flow Problem (OPF)

OPF problems are well-known and widely used nonlinear, non-convex optimization problems, solved by system operators to determine a feasible active and reactive dispatch of generating units.
Usually, OPF is the problem of minimizing the total generation cost with respect to all system constraints, such as technical maxima/minima of generating units, line flow constraints, voltage level and angle constraints, and power balance constraints. However, other cost functions can also be used, such as minimization of transmission losses or re-dispatch of reactive power for enhancing the level of the system's stability, as in [15].
Below, in (1), I cite a simplified formulation of the OPF problem as given in [14], with the standard cost function, i.e. minimization of the overall generation costs:

min f_P   (1a)

subject to:

P_i^inj − P_i + P_i^D = 0, ∀i ∈ N   (1b)

Q_i^inj − Q_i + Q_i^D = 0, ∀i ∈ N   (1c)

P_i^min ≤ P_i ≤ P_i^max, ∀i ∈ N_G   (1d)

Q_i^min ≤ Q_i ≤ Q_i^max, ∀i ∈ N_G   (1e)

U_i^min ≤ U_i ≤ U_i^max, ∀i ∈ N   (1f)

Θ_i^min ≤ Θ_i ≤ Θ_i^max, ∀i ∈ N   (1g)

0 ≤ S_l ≤ S_l^max, ∀l ∈ N_f   (1h)

where: f_P – the total cost of generation and transmission; N – the set of indices of all buses in the system; N_G – the set of indices of all generating units; N_f – the set of indices of all branches in the system; P_i^inj / Q_i^inj – the active/reactive power injection at bus i, calculated using the standard, highly nonlinear power flow equations [11]; P_i^D / Q_i^D – the active/reactive power demand at bus i; P_i / Q_i – the active/reactive output of unit i; P_i^min/max / Q_i^min/max – the generation limits of unit i; U_i – the voltage magnitude at bus i; U_i^min/max – the limits on the voltage magnitude at bus i; Θ_i – the voltage angle at bus i; Θ_i^min/max – the limits on the voltage angle at bus i; S_l – the apparent power flow through line l; S_l^max – the maximum value of apparent power flow through line l.
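For reference, a standard OPF of this form can be solved with off-the-shelf tools; a minimal sketch assuming the pypower package (a Python port of the MATPOWER suite used later in this paper), run on the nine-bus case:

```python
from pypower.api import case9, runopf

# Solve the standard AC OPF (1) on the 9-bus test case.
results = runopf(case9())
print(results['success'])    # True if the solver converged
print(results['f'])          # objective: total generation cost [USD/h]
print(results['gen'][:, 1])  # optimal active power outputs P_G [MW]
```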
The above standard OPF problem formulation provides the basis for the proposed optimization sub-problems, Formulations (2) and (3), described below.

3.2 Formulation (2) – Identification of Most Problematic Unit

In step 1 of the method presented in this paper, I propose to identify the unit whose total contractual position (total amount of output power) should be adjusted the most to make the power flow network-feasible. To obtain a Pareto-optimal solution, deviations from the other units' contractual positions should also be minimized.
The model is presented in (2) below. It assumes that reactive power is not a subject of trade and that market players contract the supply of active power only. As the feasible set of the presented problem is a restriction of the standard OPF's feasible set shown in (1), once a feasible solution to (2) is found, the resulting power flow is guaranteed to be technically feasible.
min c_1·T + c_2·Σ_{i∈N_G} (s_{P,i}^{G−} + s_{P,i}^{G+}), where c_1 ≫ c_2   (2a)

subject to:

s_{P,i}^{G−} ≤ T, ∀i ∈ N_G   (2b)

s_{P,i}^{G+} ≤ T, ∀i ∈ N_G   (2c)

Σ_{k∈CN,i} P_{C,i}^k − s_{P,i}^{G−} ≤ P_i ≤ Σ_{k∈CN,i} P_{C,i}^k + s_{P,i}^{G+}, ∀i ∈ N_G   (2d)

P_i^min ≤ P_i ≤ P_i^max, ∀i ∈ N_G   (2e)

s_{P,i}^{G−/+} ≥ 0, ∀i ∈ N_G   (2f)

+ constraints (1b)–(1h)   (2g)

where c_1, c_2 – arbitrarily chosen positive costs of contract violation; s_{P,i}^{G+/−} – slack variables making violation of the contracted active power volumes possible; CN,i – the set of contracts signed with generating unit i; P_{C,i}^k – the contracted volume of active power for generating unit i under contract k.
This formulation allows for the identification of the most problematic generating unit. Once it is identified, one or more new constraints shall be formulated to limit the set of contracts acceptable to the operator, as proposed for step 4 of the method. This procedure is explained in Sect. 3.2.1.
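To make the structure of (2) concrete, the sketch below solves a heavily simplified variant as a linear program: the network constraints (1b)–(1h) are collapsed into a single active-power balance equality, losses are neglected, and the slacks are symmetric. All data are illustrative placeholders:

```python
import numpy as np
from scipy.optimize import linprog

# Three units with contracted totals C (sum = 315 MW) and a demand of 330 MW,
# so the contracted positions must be adjusted to balance the system.
C = np.array([10.0, 295.0, 10.0])
Pmin = np.array([10.0, 10.0, 10.0])
Pmax = np.array([250.0, 300.0, 270.0])
demand, c1, c2 = 330.0, 1000.0, 1.0                                  # c1 >> c2, as in (2a)

# Decision vector x = [P1, P2, P3, s1, s2, s3, T]; minimize c1*T + c2*sum(s).
cost = np.concatenate([np.zeros(3), c2 * np.ones(3), [c1]])
A_ub, b_ub = np.zeros((9, 7)), np.zeros(9)
for i in range(3):
    A_ub[i, i], A_ub[i, 3 + i], b_ub[i] = 1.0, -1.0, C[i]                #  P - s <= C
    A_ub[3 + i, i], A_ub[3 + i, 3 + i], b_ub[3 + i] = -1.0, -1.0, -C[i]  # -P - s <= -C
    A_ub[6 + i, 3 + i], A_ub[6 + i, 6] = 1.0, -1.0                       #  s - T <= 0
A_eq = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]])                   # sum(P) = demand
bounds = list(zip(Pmin, Pmax)) + [(0, None)] * 4                         # s, T >= 0

res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[demand], bounds=bounds)
print(res.x[:3], res.x[6])  # expected dispatch [15, 300, 15] MW, T = 5 MW
```

The 15 MW imbalance is spread evenly across the units, since the dominant cost c_1 penalizes the largest single deviation T.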

3.2.1 Addition of New Constraint

In step 4 of the method it is proposed to impose a new constraint (calculated in step 2) after each round of negotiations between market participants. This is to ensure that the power flow obtained in each round is closer to technical network feasibility. The constraint is not added directly to the optimization problem, but is imposed on the market participants' negotiations – they simply need to take it into account while agreeing on new contractual positions. Contracts violating this constraint shall not be accepted by the operator.
In the proposed method the constraint is formulated based on the identified most problematic unit – the result of problem (2). It is thus known by how much the active power output setpoint needs to deviate from the desired contractual position in order to make the power flow network-feasible. The easiest way to get this exact information is to calculate, for the most problematic unit p in round r of negotiations, the difference d(r) between its contractual position and the corresponding calculated optimal solution:

d(r) = Σ_{k∈CN,p} P_{C,p}^k(r) − P_p(r).

Next, if d(r) ≠ 0, a new constraint on accepted contracts in the next round of negotiations (r + 1) shall be added for the most problematic unit. If d(r) > 0, the constraint takes the form Σ_{k∈CN,p} P_{C,p}^k(r+1) ≤ Σ_{k∈CN,p} P_{C,p}^k(r) − d(r) = P_p(r), and the form Σ_{k∈CN,p} P_{C,p}^k(r+1) ≥ Σ_{k∈CN,p} P_{C,p}^k(r) − d(r) = P_p(r) when d(r) < 0. Since it is guaranteed that the deviation for the most problematic unit is minimal, constraining its accepted contracts by d(r) drives the dispatch closer to feasibility. Once the constraint is added, the method returns to the end of step 4 – negotiations between peers.
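A small sketch of this constraint-update rule (the names are placeholders; the bound equals the network-feasible optimum P_p(r) of the most problematic unit):

```python
def new_contract_bound(contracted_total, optimal_output):
    """Return d(r) and the bound on next-round contracts for unit p.
    contracted_total = sum_k P_C,p^k(r); optimal_output = P_p(r) from (2)."""
    d = contracted_total - optimal_output
    if d == 0:
        return d, None                   # contracts already consistent
    sense = '<=' if d > 0 else '>='
    return d, (sense, optimal_output)    # bound next-round total by P_p(r)

print(new_contract_bound(295.0, 249.0))  # (46.0, ('<=', 249.0))
```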

3.3 Formulation (3) – Operator's Corrective Actions

Despite adopting a method for helping market participants agree on a dispatch resulting in a technically feasible power flow (as in Sects. 3.1–3.2), it is still possible that market participants are unable to find one in a considerable amount of time or rounds of negotiations. Thus, in this section the optimization problem from [13], which allows the operator to arbitrarily change generating units' setpoints to achieve network feasibility of the power flow while maximizing the use of bilateral balancing contracts, is cited.
Similarly to Sect. 3.2, it is assumed that market participants trade and contract the supply of active power only. What is more, it is assumed that each generating unit can change its contractual position – either by generating more or by reducing its output within its technical limits. Yet, we assume that such an action can only be accomplished at a certain unit cost. This cost is known to the operator through offers submitted by power suppliers, prior to the balancing energy delivery, for changing their contractual positions.
The discussed formulation is given in (3).
min Σ_{i∈N_G} (c_{P,i}^{G+}·s_{P,i}^{G+} + c_{P,i}^{G−}·s_{P,i}^{G−})   (3a)

subject to:

Σ_{k∈CN,i} P_{C,i}^k − s_{P,i}^{G−} ≤ P_i ≤ Σ_{k∈CN,i} P_{C,i}^k + s_{P,i}^{G+}, ∀i ∈ N_G   (3b)

P_i^min ≤ P_i ≤ P_i^max, ∀i ∈ N_G   (3c)

s_{P,i}^{G−/+} ≥ 0, ∀i ∈ N_G   (3d)

+ constraints (1b)–(1h)   (3e)

where: c_{P,i}^{G+/−} – the positive cost (price) of violation of the upper/lower limit on active power generation of unit i.
4 Assessment of the Method

In this paper I propose to add more freedom to the balancing energy market. As a result, this should add economic efficiency to the market, which should be maximized. Yet, this freedom is achieved through rounds of bilateral negotiations, whose number should be minimal. Thus, given the above, I propose to assess the performance of the method by considering two criteria: a measure of market efficiency, and the number of iterations it takes to come to a consensus.

4.1 Measure of Market Effectiveness

The number of rounds of negotiations is fairly straightforward to calculate. However, things are different when it comes to market efficiency. One should keep in mind that the amount of benefit is very subjective for each participant and differs from the simple social welfare function, as discussed previously. Thus, for the sake of assessing the performance, let me formulate an effectiveness measure f^E that combines the subjective goals of participants with the objective total cost of generation, and which should undergo maximization. The formulation of the measure is given in (4).

f^E = 2·f^{S/L} − c^G − c^A,   (4)

where f^{S/L} is the total amount of subjective benefits from making contracts between suppliers and consumers (sometimes referred to as a utility function), c^G is the total cost of generation, and c^A is the total cost of adjustments of units' operating points when Formulation (3) is used. Assuming that the subjective benefit of consumer j from making a bilateral contract with generating unit i is equal to the subjective benefit of unit i from making a contract with j, f^{S/L} is formulated as
f^{S/L} = Σ_{j∈N_L} Σ_{i∈N_G} a_{ji}·P_{ji}^D = Σ_{j∈N_L} Σ_{i∈N_G} a_{ij}·P_{ji}^D,   (5)

where a_{ji} – the factor of benefit for delivering balancing power from i to j, quantifying how important the benefit is; P_{ji}^D – the amount of balancing power delivered from i to j; N_L – the set of indices of consumers.
For the assessment, it is assumed that the total generation cost of each unit is well known. In general it can take any form, yet most often a quadratic formulation is used, as in (6), where a_i, b_i, c_i are cost coefficients and P_i is the active power output of unit i:

c^G = Σ_{i∈N_G} (a_i·P_i² + b_i·P_i + c_i)   (6)
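A sketch of how the assessment measures (4)–(6) combine; the benefit factors and delivered volumes below are illustrative placeholders (Table 2 expresses its coefficients in units of 100 USD/MW, here they are written directly in USD/MW):

```python
import numpy as np

def market_effectiveness(alpha, PD, abc, P, c_adj=0.0):
    """f^E = 2*f^{S/L} - c^G - c^A, per Eqs. (4)-(6).
    alpha[j][i]: benefit factor of consumer j and unit i [USD/MW];
    PD[j][i]: delivered balancing power [MW]; abc[i] = (a_i, b_i, c_i)."""
    f_sl = float(np.sum(np.asarray(alpha) * np.asarray(PD)))         # Eq. (5)
    c_g = sum(a * p**2 + b * p + c for (a, b, c), p in zip(abc, P))  # Eq. (6)
    return 2.0 * f_sl - c_g - c_adj                                  # Eq. (4)

alpha = [[20.0, 70.0, 10.0]]          # one consumer, three units
PD = [[0.0, 90.0, 0.0]]               # all demand supplied by unit 2
abc = [(0.11, 5.0, 150.0), (0.085, 1.2, 600.0), (0.1225, 1.0, 335.0)]
P = [0.0, 90.0, 0.0]                  # dispatch matching the deliveries
print(market_effectiveness(alpha, PD, abc, P))  # 10718.5
```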

4.2 Simulation Results

The proposed approach is tested on a nine-bus test system (presented in Fig. 1 and proposed in [16]) over three scenarios. The test system is deliberately chosen to be small, so that the results are more evident and interpretable than on a larger test system. In the tests we look at the measures described previously – i.e. market efficiency (f^E) and the number of negotiation rounds (max. round). The resulting value of f^E is benchmarked against the result of the standard OPF problem (1), to check performance as compared with currently used dispatching methods. All simulations were coded in Matlab and solved with the MATPOWER MIPS solver [17].

Fig. 1. Test system topology

The test system comprises nine buses, three of which are generator buses and three load buses. All generating units have technical minima equal to 10 MW; the technical maxima of units 1, 2 and 3 are 250 MW, 300 MW and 270 MW respectively, giving a total generating capacity of 820 MW. The load is assumed constant and attached to buses 5, 7 and 9. The load values in these buses, given as pairs (active demand, reactive demand), are: bus 5 – (90 MW, 30 MVAR); bus 7 – (100 MW, 35 MVAR); bus 9 – (125 MW, 50 MVAR), giving a total demand of (315 MW, 115 MVAR). The system also contains nine transmission lines. Their ratings (maximum flow capacities) can be found in [16].
For the simulations it is assumed that the system topology, the locations and values of the load, and the costs of generation are constant. It is also assumed that the demanded load and the technical limits of generating units are known and guarded in step 0 of the method. What is more, I assume that the subjective benefit of unit i from making a contract with consumer j is symmetrical and equal to the benefit of consumer j, i.e. a_{ij} = a_{ji}. Apart from this, I also assume that, after estimation of the transmission losses, consumers are forced to contract their compensation evenly, i.e. in the discussed test system each consumer contracts exactly one third of the losses.
The generation costs of the units are presented in Table 1; they correspond to the quadratic formulation given in (6). For simplicity it is assumed here that generating units sell energy at its cost, meaning that the selling price equals the generation cost. The benefit coefficients are assumed as given in Table 2. These reflect both the benefit of the consumers and of the suppliers, meaning that the benefit should be summed only once into f^E.
The previous statements yield that the economic surplus of the market is produced only through the benefits. It is assumed here that each consumer wants to contract as much demand with its preferred unit as possible. From the generating unit's perspective, the unit accepts those contracts which bring the most benefit to it, until its maximum capacity is reached. What is more, it is also assumed that the operator's IT system guards that the sum of contracts made for each unit falls within this unit's technical capacities – from minimum to maximum.
For all of the scenarios, with the load as described above, the objective of the standard OPF problem (1) is equal to 5,296.69 USD, which in terms of the benefit function (no bilateral contracts, hence f^{S/L} = 0) equals f^E = −5,296.69 USD. This is the cost value for the optimal solution P_1 = 89.7986 MW, P_2 = 134.3207 MW, P_3 = 94.1874 MW.

Table 1. Generation cost coefficients.

             a_i [USD/MW²]  b_i [USD/MW]  c_i [USD]
Gen. Unit 1  0.1100         5.0000        150.0000
Gen. Unit 2  0.0850         1.2000        600.0000
Gen. Unit 3  0.1225         1.0000        335.0000

Table 2. Benefit coefficients assumed in the test cases [100 USD/MW].

          Consumer 1 (bus 5)      Consumer 2 (bus 7)      Consumer 3 (bus 9)
Scenario  Unit 1  Unit 2  Unit 3  Unit 1  Unit 2  Unit 3  Unit 1  Unit 2  Unit 3
S1        0.2     0.7     0.1     0.3     0.6     0.1     0.1     0.8     0.1
S2        0.2     0.1     0.7     0.3     0.1     0.6     0.1     0.1     0.8
S3        1       0       0       1       0       0       1       0       0

4.2.1 Numerical Results

In this section a brief description of each scenario is given, with the most important results presented in Table 3.
In Scenario 1, each of the consumers finds the most benefit in making a bilateral balancing contract with generating unit 2, yet each consumer to a slightly different extent. Thus, it is possible for the units to maximize their benefits and accept those contracts from which they benefit the most. At first, most of the power was to be supplied by unit 2. Yet, this strongly violated the maximum flow capacity of the line adjacent to it. Therefore, significant changes in the dispatch were required. In this scenario market participants agreed on a network-feasible solution in the third round of negotiations. During the course of the method two different units were identified as most problematic: in round 1, unit 2, and in round 2, unit 1. To guide the search, new constraints were imposed on their acceptable contracts. Results are shown in Table 3.
Scenario 2 is slightly different from the previous one. Here, the preferred unit for all consumers was unit 3. Similarly to the previously considered example, the benefit coefficients differed, making it easy for the generating unit to decide which contracts to accept. However, this time most of the problem with network feasibility resulted from transmission losses, which were not considered by the participants in the first round of balancing. They were estimated by solving (2), and an additional constraint was added on the most problematic unit's acceptable contracts. Market participants managed to find a consensus in two rounds of negotiations. Results are presented in Table 3.
Scenario 3 differs significantly from both previously presented cases. This time all consumers wish to contract all energy supply from unit 1, without any will to compromise. After a considerable number of rounds of negotiations (here assumed to be 10), market participants did not come to a conclusion on a dispatch yielding a network-feasible power flow. Therefore, the operator used Formulation (3) to arbitrarily adjust the operating points of generating units. It was assumed that the prices c_{P,i}^{G+/−} were all equal to 100 for all units. They form the cost c^A, which is considered while calculating the value of f^E.

Table 3. Results of the numerical examples.

Scenario 1, f^E = 12,484.40 USD, max. round = 3
            Gen. Unit 1           Gen. Unit 2           Gen. Unit 3
            Original  Feasible    Original  Feasible    Original  Feasible
            contract  contract    contract  contract    contract  contract
            [MW]      [MW]        [MW]      [MW]        [MW]      [MW]
Customer 1  3.3400    4.8505      83.3200   83.0896     3.3400    4.8505
Customer 2  3.3300    61.5815     93.3400   38.1096     3.3300    3.0996
Customer 3  3.3300    4.8405      118.3400  118.1096    3.3300    4.8405
Total [MW]  10.0000   71.2726     295.0000  239.3087    10.0000   12.7906
Additional constraint added? Unit 1: Σ_{k∈CN,1} P_{C,1}^k ≤ 71.28 (in round 2); Unit 2: Σ_{k∈CN,2} P_{C,2}^k ≤ 249 (in round 1); Unit 3: no.

Scenario 2, f^E = 9,901.26 USD, max. round = 2 (columns as above)
Customer 1  3.3400    4.9968      3.3400    4.9968      83.3200   83.3200
Customer 2  28.3300   29.9868     3.3300    4.9868      68.3400   68.3400
Customer 3  3.3300    4.9868      3.3300    4.9868      118.3400  118.3400
Total [MW]  35.0000   39.9704     10.0000   14.9704     270.0000  270.0000
Additional constraint added? Unit 1: no; Unit 2: Σ_{k∈CN,2} P_{C,2}^k ≥ 14 (in round 2); Unit 3: no.

Scenario 3, f^E = 7,820.06 USD, max. round = 10+ (columns as above)
Customer 1  3.3400    20.5085     83.3200   83.1551     3.3400    3.3333
Customer 2  3.3300    20.5085     93.3400   83.1551     3.3300    3.3333
Customer 3  3.3300    20.5085     118.3400  83.1551     3.3300    3.3333
Total [MW]  10.0000   61.5255     295.0000  249.4652    10.0000   10.0000
Additional constraint added? Unit 1: no; Unit 2: yes, without satisfactory results; Unit 3: no.
The results show that, under the assumptions made, adding more freedom to the balancing energy market might add significant benefit as compared with the standard dispatching method, e.g. solving the regular Optimal Power Flow. As shown in Table 3, this is the case for all scenarios. Not surprisingly, it is more beneficial to the market when participants manage to decide on contracts alone (as in Scenarios 1 and 2) than when these contracts are arbitrarily changed by the operator, as in Scenario 3.
What is more, the results show that using the proposed method it is possible to find a network-feasible units dispatch for every discussed scenario, one which actually maximizes the use of the bilateral balancing energy contracts made between participants.

5 Conclusions and Discussion

Power systems with high penetration of distributed energy resources are nowadays becoming of interest to both academia and industry. Balancing of energy in such a system should ideally be performed via peer-to-peer trading between energy market participants, as the participants should be given direct incentives both to develop DER infrastructure and to actively control it to maximize their subjective profit.
Therefore, in this paper a multi-step method for assuring network feasibility of the power flow resulting from a generating-unit dispatch obtained via bilateral, peer-to-peer balancing contracts is presented and analyzed. The proposed method allows market participants to make bilateral contracts between consumers and suppliers for the provision of the necessary amounts of balancing energy. The method assumes that market participants do not have sufficient knowledge of the technical situation in the power system and thus do not know a priori what dispatch to agree on. Therefore, it is assumed that it is the system operator's responsibility to guide participants in the search for a network-feasible dispatch. The method returns the necessary information on the direction of this search as a result of optimization problem (2), with heuristically added constraints on accepted bilateral contracts. If, despite having this information, market participants are unable to agree on technically feasible contracts, the method foresees the ability of the operator to arbitrarily adjust those contracts using optimization problem (3). Yet, when such a situation arises, suppliers are to be paid for adjusting their contractual positions. The basis of the proposed method is formed by the two optimization problems given in Formulations (2) and (3), which aim to maximize the use of such contracts while always guaranteeing feasibility of the resulting power flow.
The proposed approach was tested in simulations over three scenarios. All tests showed that allowing free trade on the balancing energy market may bring more benefits to the market than the currently used centralized balancing options – namely, those where the system's operator is always a party to any transaction made on the balancing market. What is more, it has been shown in the paper that it is possible to derive methods which can satisfactorily help to achieve technical feasibility of the power flow resulting from the so-obtained dispatch of generating units.
Some perspectives for further research can be identified. First of all, one may consider designing the architecture of an operator's IT system for accepting contracts, such as the one assumed in step 0 of the described method. Another perspective may be the development of new market efficiency measures or new models for assuring technical feasibility of the power flow. They could be developed based on different optimization, modelling and simulation problems, possibly including the development of agent systems.
When all of the above issues are deeply considered and followed up by academic communities worldwide, it will hopefully be possible to establish highly effective bilateral balancing energy markets that may be able to give incentives for the transition towards high penetration of green energy. The specific design of these markets should also be given thorough consideration, to make room for all their participants regardless of their size and power supply capabilities.

References
1. Wang, Q., et al.: Review of real-time electricity markets for integrating distributed energy
resources and demand response. Appl. Energy 138, 695–706 (2015)
2. Guerrero, J., et al.: Towards a transactive energy system for integration of distributed energy
resources: Home energy management, distributed optimal power flow, and peer-to-peer
energy trading. Renew. Sustain. Energy Rev. 132, 110000 (2020)
3. Lee, W., et al.: Optimal operation strategy for community-based prosumers through
cooperative P2P trading. In: 2019 IEEE Milan PowerTech. IEEE (2019)
4. Guerrero, J., Archie, C., Verbič, G.: Decentralized P2P energy trading under network
constraints in a low-voltage network. IEEE Trans. Smart Grid 10(5), 5163–5173 (2018)
5. Pasha, A.M., et al.: A utility maximized demand-side management for autonomous
microgrid. In: 2018 IEEE Electrical Power and Energy Conference (EPEC). IEEE (2018)
6. Okawa, Y., Toru, N.: Distributed optimal power management via negawatt trading in real-
time electricity market. IEEE Trans. Smart Grid 8(6), 3009–3019 (2017)
7. Zhang, Y., Chow, M.Y.: Distributed optimal generation dispatch considering transmission
losses. In: 2015 North American Power Symposium (NAPS), Charlotte (2015)
8. Kar, S., Hug, G.: Distributed robust economic dispatch: a consensus + innovations approach.
In: 2012 IEEE Power and Energy Society General Meeting, San Diego (2012)
9. Lin, C., Lin, S.: Distributed optimal power flow with discrete control variables of large
distributed power systems. IEEE Trans. Power Syst. 23(3) (2008)
10. Garrity T.F.: Innovation and trends for future electric power systems. In: 2009 Power
Systems Conference, Clemson (2009)
11. Machowski, J., Lubosny, Z., Bialek, J.W., Bumby, J.R.: Power System Dynamics: Stability
and Control. Wiley (2020)
12. Drabecki, M., Toczyłowski, E.: Comparison of three approaches to the security constrained
unit commitment problem. Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki
Politechniki Gdańskiej (62) (2019). http://yadda.icm.edu.pl/baztech/element/bwmeta1.ele
ment.baztech-0345a0bb-f574-41c1-be5d-71f50b8e060c?q=5838bcf6-a5ee-44ec-93f8-ea79d
2cd037b$1&qt=IN_PAGE
13. Drabecki, M., Toczyłowski, E.: Obtaining feasibility of power flows in the deregulated
electricity market environment. Przegląd Elektrotechniczny 95 (2019)
14. Zhu, J.: Optimization of Power System Operation. Wiley, Hoboken (2015)
15. Drabecki, M.: A method for enhancing power system’s steady-state voltage stability level by
considering active power optimal dispatch with linear grid models. In: Zeszyty Naukowe
Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej (62) (2019). http://yadda.
icm.edu.pl/baztech/element/bwmeta1.element.baztech-2eb9eb53-193d-4970-827d-4dba236e
0936?q=5838bcf6-a5ee-44ec-93f8-ea79d2cd037b$2&qt=IN_PAGE
16. Chow, J.H. (ed.): Time-Scale Modelling of Dynamic Networks with Applications to Power Systems. Lecture Notes in Control and Information Sciences, vol. 26, pp. 59–93. Springer, Berlin (1982)
17. Zimmerman R.D., Murillo-Sanchez C.E., Thomas R.J.: MATPOWER: steady-state oper-
ations, planning and analysis tools for power systems research and education. IEEE Trans.
Power Syst. 26(1) (2011)
Sustainable Optimization,
Metaheuristics and Computing
for Expert System
The Results of a Compromise Solution, Which
Were Obtained on the Basis of the Method
of Uncertain Lagrange Multipliers
to Determine the Influence of Design Factors
of the Elastic-Damping Mechanism
in the Tractor Transmission

Sergey Senkevich(&), Ekaterina Ilchenko, Aleksandr Prilukov,


and Mikhail Chaplygin

Federal Scientific Agroengineering Center VIM, 1st Institute pas. 5,


Moscow 109428, Russia
sergej_senkevich@mail.ru, kat-sama@mail.ru,
chel.diagnost@gmail.com, misha2728@yandex.ru

Abstract. The article is devoted to the search for a compromise solution for finding the optimal parameters of the Elastic Damping Mechanism (EDM) in the transmission of a 14 kN class tractor. The tractor was part of three different machine-tractor units and performed the main agricultural operations: plowing, cultivation and sowing. The task was to define a single function (a compromise solution) which can be used to describe the processes in the transmission when performing these operations. The Lagrange multiplier method was used to obtain the compromise solution. It was necessary to create one general mathematical model out of three mathematical models, which should correctly reflect the nature of the ongoing processes, and to determine the Lagrange multipliers λ1, λ2, …, λm for this purpose. All calculations were made in the Maple and MATLAB software environments. A solution of the compromise problem was found. The extremum of the «transmission transparency degree» function was found based on the Lagrange multipliers. A compromise model was obtained that expresses the influence of the main EDM parameters on the «transmission transparency degree», and the values of the factors included in the resulting Lagrange function were determined.

Keywords: Method of uncertain Lagrange multipliers · Elastic damping mechanism · Transmission · Optimal parameters

1 Introduction

Optimizing the operation of certain elements and replacing outdated mechanisms with new developments can significantly improve performance. A large number of studies exist in the field of improving mobile machinery.

One of the significant elements requiring optimization is the transmission of mobile power equipment. Improvements here significantly influence the efficiency of resource use and the precision of control. These questions are being investigated all over the world. For example, some researchers have created a dynamic transmission model for improved gear shifting, analyzed gear shifting processes, and developed coordinated strategies for managing these processes with the replacement of multiple clutches while the tractor is operating in the field [1]. A numerical model of the dynamic characteristics of a new transmission with multistage end gears as the main component was developed by other scientists [2]. A simulation model of a synchronous transmission unit was created which differs from other models in its simplicity and quality and can be computed in real time [3]. A large number of analyses of different types of transmissions exist, for example, the analysis of the control of a hybrid automatic manual transmission and a hybrid dual-clutch transmission [4]. The dynamic analysis of a vehicle with three different configurations is of interest [5]: with a spring buffer; with a spring buffer in combination with stabilizer bars; and with a spring damper in combination with hydraulically interconnected suspension resisting roll.
Not long ago, linear stability analyses for groups of vehicles were conducted with the aim of reducing fuel consumption and increasing the fuel reserve [6].
Analysis and modeling of various systems provide the basis for creating new transmissions, for example, the compact 18-speed epicyclic transmission for a small vehicle presented in [7].
Using process modeling, the authors of [8] showed that the presented algorithms for optimal power distribution and for determining the optimal mode are effective. Other authors presented a new roll-resistant hydraulically interconnected suspension with two accumulators for each fluid line [9]. Correct computation of a roll-resistant hydraulically interconnected suspension is a key to a quality management system [10]. Reducing dynamic loads on drive shafts has a positive effect on system reliability [11]. Effective interaction of the chassis with the ground also helps to increase productivity and reduce fuel consumption [12, 13].
This review shows the relevance of research on the optimization of mobile machine transmissions. These questions, which are investigated all over the world, allow the parameters of a mobile tractor to be significantly improved.

2 Purpose of Research

Our previous works [14–17] were aimed at finding the optimal parameters of the
Elastic Damping Mechanism (EDM) in the transmission of a 14 kN class tractor. The
conducted research is described in detail in those papers [14–17]. The indicator «P»
(«transmission transparency degree») is proposed to assess the protective qualities of
the mechanism as the ratio of the current amplitude of engine speed oscillations to its
maximum value [14, 15, 18]. If P = 1, the reduction gear is absolutely «transparent»:
the engine is not protected from fluctuations in the traction load (this happens in serial
transmissions). If P = 0, the reduction gear is absolutely «not transparent» and will
completely absorb the vibrations transmitted to the engine. Research has been conducted
for the following agricultural operations: plowing, cultivation and seeding.
The purpose of this research is to find the conditional extremum of the «transmission
transparency degree» function under the restrictions, as a result of a compromise
solution based on Lagrange multipliers.
The following tasks were solved to achieve this goal:
1. Obtaining a compromise model that expresses the influence of the main EDM
parameters on the «transmission transparency degree»;
2. Finding the values of the factors included in the resulting Lagrange function.

3 Materials and Methods

The main restrictions for this research were chosen based on our previous studies
[14, 15, 17, 18]. The names and designations of the elastic damping mechanism
factors are given in Table 1.

Table 1. Names and designations of the factors of the elastic damping mechanism installed in
the tractor transmission.
N Factor name Factor identification Code mark
1 Throttle cross-sectional area Sth, m² x1
2 Volume of the hydropneumatic accumulator (HPA) Vhpa, m³ x2
3 Air pressure in the HPA Pa, Pa x3
4 Inertia moment of the additional load Jth, kg·m² x4
5 Oscillation frequency of the traction load f, Hz x5

The method of Lagrange multipliers was used to create a compromise solution
based on three previously obtained models [14, 15, 18]. The method consists of the
following.
First. To find a conditional extremum, the Lagrange function must be composed.
For a function of n variables f(x1, x2, …, xn) with m coupling equations (where n > m),
it has the general form (1) [19]:

F(x_1, x_2, \ldots, x_n, \lambda_1, \lambda_2, \ldots, \lambda_m) = f + \lambda_1 \varphi_1 + \lambda_2 \varphi_2 + \cdots + \lambda_m \varphi_m   (1)

where λ1, λ2, …, λm are the Lagrange multipliers;
f is the function selected as the main one;
φ1, φ2, …, φm are the functions selected as constraints.

The restrictions are presented as equalities in our study; however, the Lagrangian
method also allows constraints given by inequalities and other functions.
Second. The necessary conditions for an extremum are set by the system of
Eqs. (2), consisting of the partial derivatives of Eq. (1) and the constraints, all equated
to zero; the stationary points are determined from this system of equations [20]:

\frac{\partial F}{\partial x_i} = 0, \quad i = 1, \ldots, n; \qquad \varphi_j = 0, \quad j = 1, \ldots, m   (2)

The presence of a conditional extremum can be determined on this basis. The sign
of d²F is a sufficient condition that can be used to establish the nature of the extremum:
if d²F > 0 at a stationary point, the function f(x1, x2, …, xn) has a conditional minimum
at this point; if d²F < 0, the function has a conditional maximum.
A compromise solution had to be found in this study: to create one general
mathematical model out of three mathematical models, which should correctly reflect
the nature of the ongoing processes. It was necessary to determine the Lagrange
multipliers λ1, λ2, …, λm for this purpose.
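As an illustration, the following minimal MATLAB sketch (assuming the Symbolic Math Toolbox; the objective and constraint are placeholders, not the paper's models) composes a Lagrange function of the form (1) and solves the stationarity system (2):

% Minimal sketch: compose F = f + lambda1*phi1 for a placeholder problem
% and solve dF/dxi = 0 together with the constraint phi1 = 0, cf. Eqs. (1)-(2).
syms x1 x2 lambda1
f    = x1^2 + x2^2;                 % placeholder objective (not the paper's model)
phi1 = x1 + x2 - 1;                 % placeholder constraint
F    = f + lambda1*phi1;            % Lagrange function, Eq. (1) with m = 1
eqs  = [diff(F, x1) == 0, diff(F, x2) == 0, phi1 == 0];   % system (2)
sol  = solve(eqs, [x1, x2, lambda1]);
disp([sol.x1, sol.x2, sol.lambda1]) % stationary point and multiplier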

4 Discussion Results

All three models are given below. Model (3) was taken as the main function
f(x1, x2, …, x5):

y_1(x_1, x_2, x_3, x_4, x_5) = 0.578 + 2.29 \cdot 10^{-3} x_1 - 3.72 \cdot 10^{-3} x_2 + 0.053 x_3 + 4.51 \cdot 10^{-3} x_4
 + 0.021 x_5 - 1.34 \cdot 10^{-3} x_1 x_2 - 3.14 \cdot 10^{-3} x_1 x_3 - 2.41 \cdot 10^{-3} x_1 x_4 + 3.32 \cdot 10^{-4} x_1 x_5
 - 1.25 \cdot 10^{-3} x_2 x_3 + 1.37 \cdot 10^{-3} x_2 x_4 - 0.044 \cdot 10^{-3} x_2 x_5 - 6.04 \cdot 10^{-5} x_3 x_4
 - 0.035 x_3 x_5 + 8.09 \cdot 10^{-3} x_4 x_5 + 0.03 x_1^2 + 0.03 x_2^2 + 0.031 x_3^2 + 0.03 x_4^2 + 0.095 x_5^2.   (3)

Models (4) and (5) were taken as the conditions and restrictions:

y_2(x_1, x_2, x_3, x_4, x_5) = 0.722 - 6.325 \cdot 10^{-3} x_1 - 6.657 \cdot 10^{-3} x_2 + 0.033 x_3 - 5.526 \cdot 10^{-3} x_4
 - 3.072 \cdot 10^{-3} x_5 + 6.981 \cdot 10^{-3} x_1 x_2 + 6.851 \cdot 10^{-3} x_1 x_3 + 5.709 \cdot 10^{-3} x_1 x_4
 - 7.045 \cdot 10^{-3} x_1 x_5 + 8.462 \cdot 10^{-3} x_2 x_3 + 7.245 \cdot 10^{-3} x_2 x_4 - 5.554 \cdot 10^{-3} x_2 x_5
 + 7.115 \cdot 10^{-3} x_3 x_4 + 2.725 \cdot 10^{-3} x_3 x_5 - 6.826 \cdot 10^{-3} x_4 x_5 + 2.905 \cdot 10^{-3} x_1^2
 + 2.466 \cdot 10^{-3} x_2^2 - 7.031 \cdot 10^{-3} x_3^2 + 3.445 \cdot 10^{-3} x_4^2 + 0.038 x_5^2.   (4)

y_3(x_1, x_2, x_3, x_4, x_5) = 0.6212 + 0.06538 x_3 + 0.06613 x_5 + 0.04968 x_3^2 - 0.06994 x_1 x_2
 + 0.04868 x_2 x_4 - 0.03494 x_2 x_5 - 0.04581 x_3 x_5.   (5)

All further calculations were made in the Maple and MATLAB software environments
for convenience. In this case, the Lagrange function in general form is Eq. (6):

L(x_1, x_2, \ldots, x_5, \lambda_2, \lambda_3) = y_1(x_1 \ldots x_5) + \lambda_2\, y_2(x_1 \ldots x_5) + \lambda_3\, y_3(x_1 \ldots x_5)   (6)

that takes the form (7) when Eqs. (3), (4), and (5) are substituted into it:

L := 0.578 + 2.29 \cdot 10^{-3} x_1 - 3.72 \cdot 10^{-3} x_2 + 0.053 x_3 + 4.51 \cdot 10^{-3} x_4 + 0.021 x_5
 - 1.34 \cdot 10^{-3} x_1 x_2 - 3.14 \cdot 10^{-3} x_1 x_3 - 2.41 \cdot 10^{-3} x_1 x_4 + 3.32 \cdot 10^{-4} x_1 x_5
 - 1.25 \cdot 10^{-3} x_2 x_3 + 1.37 \cdot 10^{-3} x_2 x_4 - 0.044 \cdot 10^{-3} x_2 x_5 - 6.04 \cdot 10^{-5} x_3 x_4
 - 0.035 x_3 x_5 + 8.09 \cdot 10^{-3} x_4 x_5 + 0.03 x_1^2 + 0.03 x_2^2 + 0.031 x_3^2 + 0.03 x_4^2 + 0.095 x_5^2
 + \lambda_2 \,(0.722 - 6.325 \cdot 10^{-3} x_1 - 6.657 \cdot 10^{-3} x_2 + 0.033 x_3 - 5.526 \cdot 10^{-3} x_4 - 3.072 \cdot 10^{-3} x_5
 + 6.981 \cdot 10^{-3} x_1 x_2 + 6.851 \cdot 10^{-3} x_1 x_3 + 5.709 \cdot 10^{-3} x_1 x_4 - 7.045 \cdot 10^{-3} x_1 x_5
 + 8.462 \cdot 10^{-3} x_2 x_3 + 7.245 \cdot 10^{-3} x_2 x_4 - 5.554 \cdot 10^{-3} x_2 x_5 + 7.115 \cdot 10^{-3} x_3 x_4
 + 2.725 \cdot 10^{-3} x_3 x_5 - 6.826 \cdot 10^{-3} x_4 x_5 + 2.905 \cdot 10^{-3} x_1^2 + 2.466 \cdot 10^{-3} x_2^2
 - 7.031 \cdot 10^{-3} x_3^2 + 3.445 \cdot 10^{-3} x_4^2 + 0.038 x_5^2)
 + \lambda_3 \,(0.6212 + 0.06538 x_3 + 0.06613 x_5 + 0.04968 x_3^2 - 0.06994 x_1 x_2 + 0.04868 x_2 x_4
 - 0.03494 x_2 x_5 - 0.04581 x_3 x_5).   (7)

To find the extrema of the Lagrange function, partial derivatives with respect to the
variables x1, x2, x3, x4, x5 were taken in Eq. (7) and equated to zero. The resulting
equations, together with the constraint Eqs. (4) and (5) equated to zero, constitute a
system of equations; in solving it, all the variables x (x1, x2, x3, x4, x5) were expressed
in terms of λ2 and λ3 (denoted L2 and L3 in the program text) [21]. Next, the values
x1…x5 were inserted into the main equation and the constraint equations, and then into
the Lagrange function. This part of the calculations was performed in the Maple 5
software environment (Fig. 1).

Fig. 1. Maple software environment.

Two small programs were written in the MATLAB software environment to find the
Lagrange coefficients (Fig. 2).

Fig. 2. MATLAB software environment.



The results of the calculations of the variables x1…x5, the coefficients λ2 and λ3 and
the resulting value of the Lagrange function are presented in Table 2.

Table 2. Results of calculating the variables included in the general Lagrange function.

No. L x1 x2 x3 x4 x5 λ2 λ3
1 0.35 −0.0872 0.0528 −1.2677 −0.0067 −1.6901 −1 1
2 0.68 −0.1258 0.1496 0.4461 0.0435 0.1184 1 −1

Analysis of Table 2 shows that the best value of the Lagrange function is in the
first row of the table. However, those values are not applicable because the factors
x3 and x5 lie outside the scope of the research. Therefore, the values of row 2 are
assumed to be the optimal values.
The Lagrange compromise model (8) was obtained by substituting the optimal values:

L = 0.032905 x_1^2 + 0.032466 x_2^2 - 0.025711 x_3^2 + 0.033445 x_4^2 + 0.133 x_5^2
 - 0.004035 x_1 - 0.010373 x_2 + 0.02062 x_3 - 0.001013 x_4 - 0.048202 x_5
 + 0.075585 x_1 x_2 + 0.003715 x_1 x_3 + 0.003304 x_1 x_4 - 0.0067131 x_1 x_5
 + 0.007213 x_2 x_3 - 0.040062 x_2 x_4 + 0.073386 x_2 x_5 + 0.00977963 x_3 x_4
 + 0.01081 x_3 x_5 + 0.001259 x_4 x_5 + 0.6788.   (8)

Graphical dependencies of the Lagrange function are shown in Fig. 3; they are
constructed using Eq. (9), with the factors not shown on the graph fixed at zero in
function (8). Figure 3 shows the surface plot of the Lagrange function depending on
the variables x1 and x5, obtained using Eq. (9):

Fig. 3. Graph of the Lagrange function dependence on variables x1 and x5



L(x_1, x_5) = 0.6788 - 0.004035 x_1 - 0.048202 x_5 + 0.032905 x_1^2 + 0.133 x_5^2 - 0.0067131 x_1 x_5.   (9)
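For reference, a minimal MATLAB sketch of how a surface such as Fig. 3 can be produced from Eq. (9); the plotting range of the coded factors is an assumption:

% Sketch: surface of the compromise Lagrange function L(x1, x5), Eq. (9),
% with the remaining factors fixed at zero as described in the text.
[X1, X5] = meshgrid(-1.5:0.05:1.5);          % assumed range of the coded factors
L = 0.6788 - 0.004035*X1 - 0.048202*X5 ...
    + 0.032905*X1.^2 + 0.133*X5.^2 - 0.0067131*X1.*X5;
surf(X1, X5, L)
xlabel('x_1'), ylabel('x_5'), zlabel('L(x_1, x_5)')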

5 Conclusions

A compromise solution to this problem was found in this study. The extremum of the
“transmission transparency degree” function is found using Lagrange multipliers.
A compromise model is obtained that expresses the influence of the main EDM
parameters on the «transmission transparency degree». The values of factors included
in the resulting Lagrange function are found.
As the solution of the compromise problem, we have determined the value of the
desired function, L = 0.68; the factor values included in the Lagrange function are
x1 = −0.1258, x2 = 0.1496, x3 = 0.4461, x4 = 0.0435, x5 = 0.1184. The natural
values are found using the formulas described in detail in [14].
They are: Sth = 2.285 × 10⁻⁴ m², Vhpa = 4.030 × 10⁻³ m³, Pa = 4.446 × 10⁵ Pa,
Jth = 4.740 × 10⁻³ kg·m², f = 0.940 Hz.

Acknowledgments. The team of authors expresses its gratitude to the organizers of the
Conference ICO'2020, Thailand, and personally to Dr. Pandian Vasant. The authors are grateful
to the anonymous referees for their helpful comments.

References
1. Li, B., Sun, D., Hu, M., Zhou, X., Liu, J., Wang, D.: Coordinated control of gear shifting
process with multiple clutches for power-shift transmission. Mech. Mach. Theory 140, 274–
291 (2019). https://doi.org/10.1016/j.mechmachtheory.2019.06.009
2. Chen, X., Hu, Q., Xu, Z., Zhu, C.: Numerical modeling and dynamic characteristics study of
coupling vibration of multistage face gears planetary transmission. Mech. Sci. 10, 475–495
(2019). https://doi.org/10.5194/ms-10-475-2019
3. Kirchner, M., Eberhard, P.: Simulation model of a gear synchronisation unit for application
in a real-time HiL environment. Veh. Syst. Dyn. 55(5), 668–680 (2017). https://doi.org/10.
1080/00423114.2016.1277025
4. Guercioni, G.R., Vigliani, A.: Gearshift control strategies for hybrid electric vehicles: a
comparison of powertrains equipped with automated manual transmissions and dual-clutch
transmissions. Proc. Inst. Mech. Eng. Part D J. Autom. Eng. 233(11), 2761–2779 (2019).
https://doi.org/10.1177/0954407018804120
5. Zhu, S., Xu, G., Tkachev, A., Wang, L., Zhang, N.: Comparison of the road-holding abilities
of a roll-plane hydraulically interconnected suspension system and an anti-roll bar system.
Proc. Inst. Mech. Eng. Part D J. Autom. Eng. 231(11), 1540–1557 (2016). https://doi.org/10.
1177/0954407016675995

6. Sau, J., Monteil, J., Bouroche, M.: State-space linear stability analysis of platoons of
cooperative vehicles. Transportmetrica B Transp. Dyn. 1–26 (2017). https://doi.org/10.1080/
21680566.2017.1308846
7. Kim, J.: Design of a compact 18-speed epicyclic transmission for a personal mobility
vehicle. Int. J. Automot. Technol. 17(6), 977–982 (2016). https://doi.org/10.1007/s12239-
016-0095-9
8. Park, T., Lee, H.: Optimal supervisory control strategy for a transmission-mounted electric
drive hybrid electric vehicle. Int. J. Automot. Technol. 20(4), 663–677 (2019). https://doi.
org/10.1007/s12239-019-0063-2
9. Chen, S., Zhang, B., Li, B., Zhang, N.: Dynamic characteristics analysis of vehicle
incorporating hydraulically interconnected suspension system with dual accumulators.
Shock Vib. 2018, 1–5 (2018). https://doi.org/10.1155/2018/6901423
10. Ding, F., Zhang, N., Liu, J., Han, X.: Dynamics analysis and design methodology of roll-
resistant hydraulically interconnected suspensions for tri-axle straight trucks. J. Franklin Inst.
353(17), 4620–4651 (2016). https://doi.org/10.1016/j.jfranklin.2016.08.016
11. Kuznetsov, N.K., Iov, I.A., Iov, A.A.: Reducing of dynamic loads of excavator actuators. In:
Journal of Physics: Conference Series, vol. 1210, no. 1, p. 012075. IOP Publishing (2019).
https://doi.org/10.1088/1742-6596/1210/1/012075
12. Ziyadi, M., Ozer, H., Kang, S., Al-Qadi, I.L.: Vehicle energy consumption and an
environmental impact calculation model for the transportation infrastructure systems.
J. Clean. Prod. 174, 424–436 (2018). https://doi.org/10.1016/j.jclepro.2017.10.292
13. Melikov, I., Kravchenko, V., Senkevich, S., Hasanova, E., Kravchenko, L.: Traction and
energy efficiency tests of oligomeric tires for category 3 tractors. In: IOP Conference Series:
Earth and Environmental Science, vol. 403, p. 012126 (2019). https://doi.org/10.1088/1755-
1315/403/1/012126
14. Senkevich, S., Kravchenko, V., Duriagina, V., Senkevich, A., Vasilev, E.: Optimization of
the parameters of the elastic damping mechanism in class 1, 4 tractor transmission for work
in the main agricultural operations. In: Advances in Intelligent Systems and Computing,
pp. 168–177 (2018). https://doi.org/10.1007/978-3-030-00979-3_17
15. Senkevich, S.E., Sergeev, N.V., Vasilev, E.K., Godzhaev, Z.A., Babayev, V.: Use of an
elastic-damping mechanism in the tractor transmission of a small class of traction (14 kN):
Theoretical and Experimental Substantiation. In: Handbook of Advanced Agro-Engineering
Technologies for Rural Business Development, pp. 149–179. IGI Global, Hershey (2019).
https://doi.org/10.4018/978-1-5225-7573-3.ch006
16. Senkevich, S., Duriagina, V., Kravchenko, V., Gamolina, I., Pavkin, D.: Improvement of the
numerical simulation of the machine-tractor unit functioning with an elastic-damping
mechanism in the tractor transmission of a small class of traction (14 kN). In: Vasant, P.,
Zelinka, I., Weber, G.W. (eds.) Intelligent Computing and Optimization. ICO 2019.
Advances in Intelligent Systems and Computing, vol. 1072, pp. 204–213. Springer, Cham
(2020). https://doi.org/10.1007/978-3-030-33585-4_20
17. Senkevich, S.E., Lavrukhin, P.V., Senkevich, A.A., Ivanov, P.A., Sergeev, N.V.:
Improvement of traction and coupling properties of the small class tractor for grain crop
sowing by means of the hydropneumatic damping device. In: Kharchenko, V., Vasant,
P. (eds.) Handbook of Research on Energy-Saving Technologies for Environmentally-
Friendly Agricultural Development, pp. 1–27. IGI Global, Hershey (2020). https://doi.org/
10.4018/978-1-5225-9420-8.ch001

18. Senkevich, S., Kravchenko, V., Lavrukhin, P., Ivanov, P., Senkevich, A.: Theoretical study
of the effect of an elastic-damping mechanism in the tractor transmission on a machine-
tractor unit performance while sowing. In: Handbook of Research on Smart Computing for
Renewable Energy and Agro-Engineering, pp. 423–463. IGI Global, Hershey (2020). https://
doi.org/10.4018/978-1-7998-1216-6.ch017
19. Nocedal, J., Wright, S.: Numerical Optimization, p. 664. Springer, Heidelberg (2006)
20. Härdle, W., Simar, L.: Applied Multivariate Statistical Analysis, p. 580. Springer, Berlin
(2015)
21. Malthe-Sorenssen, A.: Elementary Mechanics Using Matlab: A Modern Course Combining
Analytical and Numerical Techniques, p. 590. Springer, Heidelberg (2015)
Multiobjective Lévy-Flight Firefly Algorithm
for Multiobjective Optimization

Somchai Sumpunsri, Chaiyo Thammarat,


and Deacha Puangdownreong(&)

Department of Electrical Engineering, Faculty of Engineering,


Southeast Asia University, 19/1 Petchkasem Road, Nongkhangphlu,
Nonghkaem, Bangkok 10160, Thailand
somchai.sumpunsri@gmail.com,
{chaiyot,deachap}@sau.ac.th

Abstract. The firefly algorithm (FA) was first proposed during 2008–2009 as
one of the powerful population-based metaheuristic optimization techniques for
solving continuous and combinatorial optimization problems. The FA has been
proven and applied to various real-world problems, mostly in a single-objective
optimization manner. However, many real-world problems are typically
formulated as multiobjective optimization problems with complex constraints.
In this paper, the multiobjective Lévy-flight firefly algorithm (mLFFA) is
developed for multiobjective optimization. The proposed mLFFA is validated
against four standard multiobjective test functions to demonstrate its effectiveness.
The simulation results show that the proposed mLFFA algorithm is more
efficient than well-known algorithms from the literature, including the
vector evaluated genetic algorithm (VEGA), non-dominated sorting genetic
algorithm II (NSGA-II), differential evolution for multiobjective optimization
(DEMO) and multiobjective multipath adaptive tabu search (mMATS).

Keywords: Lévy-flight firefly algorithm · Metaheuristic optimization search techniques · Multiobjective optimization

1 Introduction

In the optimization context, many real-world optimization problems usually
consist of many objectives that conflict with each other [1, 2]. This leads to a trade-off phe-
nomenon among the objectives of interest. Also, it makes the multiobjective problems
much more difficult and complex than single-objective problems. The multiobjective
problem often possesses multiple optimal solutions (or non-dominated solutions)
forming the so-called Pareto front [1, 2]. The challenge is how to perform the smooth
Pareto front containing a set of optimal solutions for all objective functions. Following
the literature, the multiobjective optimization problems can be successfully solved by
some powerful metaheuristic optimization search techniques, for examples, vector
evaluated genetic algorithm (VEGA) [3], non-dominated sorting genetic algorithm II
(NSGA-II) [4] and differential evolution for multiobjective optimization (DEMO) [5].


During 2008–2009, the firefly algorithm (FA) was first proposed by Yang [6],
based on the flashing behavior of fireflies, with random numbers drawn from the
uniform distribution used to generate feasible solutions. The FA has been widely
applied to solving real-world problems, such as industrial optimization, image
processing, antenna design, business optimization, civil engineering, robotics, the
semantic web, chemistry, meteorology, wireless sensor networks and so on [7, 8].
Afterwards, in 2010, the Lévy-flight firefly algorithm (LFFA) was proposed by Yang
[9]. The LFFA utilizes random numbers drawn from the Lévy distribution to generate
new solutions. The LFFA was tested, and its results outperformed the genetic algorithm
(GA) and particle swarm optimization (PSO). The state of the art and the applications
of the original FA and LFFA algorithms are reported in [7–9].
In early 2020, the multiobjective Lévy-flight firefly algorithm (mLFFA) was proposed
specifically for solving multiobjective optimization problems [10]. In this paper, the
mLFFA is applied to well-known standard multiobjective optimization problems.
After the introduction given in Sect. 1, the original FA and LFFA algorithms are briefly
described, and the proposed mLFFA algorithm is illustrated in Sect. 2. The four standard
multiobjective test functions used in this paper are detailed in Sect. 3. Results and
discussions of the performance evaluation are provided in Sect. 4. Conclusions
follow in Sect. 5.

2 Multiobjective Lévy-Flight Firefly Algorithm

In this section, the original FA and LFFA algorithms are briefly described. The pro-
posed mLFFA algorithm is then illustrated as follows.

2.1 Original FA Algorithm


Development of the original FA algorithm by Yang [6] is based on the flashing
behavior of fireflies, generated by a process of bioluminescence for attracting partners
and prey. The FA algorithm is built on the three following rules.

Rule-1: All fireflies are assumed to be unisex. A firefly will be attracted to others
regardless of their sex.
Rule-2: Attractiveness is proportional to brightness. Both attractiveness and
brightness decrease as distance increases. The less bright firefly will
move towards the brighter one; if there is no brighter one, it will move
randomly.
Rule-3: The brightness of each firefly is evaluated by the objective function of
interest.

The light intensity I of the FA varies according to the inverse square law of the
distance r, I = Is/r², where Is is the intensity at the source. For a fixed light absorption
coefficient, the light intensity I varies with the distance r as stated in (1), where I0 is the
original light intensity. The attractiveness β varies with the distance r as stated in (2),
where β0 is the attractiveness at r = 0. In (1) and (2), the scaling factor
γ is defined as the light absorption coefficient. The distance rij between two fireflies
i and j at their locations xi and xj can be calculated by (3). In the FA algorithm, a new
solution x^{t+1} is obtained from an old solution x^t as stated in (4), where εi is a vector
of random numbers drawn from a uniform distribution. The randomization parameter αt
can be adjusted as stated in (5), where α0 is the initial randomness scaling factor. The
original FA algorithm can be described by the pseudo code shown in Fig. 1.

I = I_0 e^{-\gamma r}   (1)

\beta = \beta_0 e^{-\gamma r^2}   (2)

r_{ij} = \lVert x_i - x_j \rVert = \sqrt{\textstyle\sum_{k=1}^{d} (x_{i,k} - x_{j,k})^2}   (3)

x_i^{t+1} = x_i^t + \beta_0 e^{-\gamma r_{ij}^2} (x_j^t - x_i^t) + \alpha_t \epsilon_i^t   (4)

\alpha_t = \alpha_0 \delta^t, \quad 0 < \delta < 1   (5)

Initialize:
The objective function f(x), x = (x1,…,xd)T
Generate initial population of fireflies xi = (i = 1, 2,…,n)
Light intensity Ii at xi is determined by f(x)
Define light absorption coefficient γ
while (Gen ≤ Max_Generation)
for i =1 : n all n fireflies
for j =1 : i all n fireflies
if (Ij > Ii)
- Move firefly i towards j in d-dimension via
Lévy-flight distributed random
end if
- Attractiveness varies with distance r via exp[-γr]
- Evaluate new solutions and update light intensity
end for j
end for i
Rank the fireflies and find the current best x*
end while
Report the best solution found x*

Fig. 1. Pseudo code of the original FA algorithm.
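As an illustration of Eqs. (2)–(4), one firefly move can be sketched in MATLAB as follows (the function and variable names are ours, not from [6]):

% Sketch: firefly i moves towards a brighter firefly j, Eq. (4);
% alpha_t scales a uniform random vector for randomization.
function xi_new = fa_move(xi, xj, beta0, gamma0, alpha_t)
    r2     = sum((xi - xj).^2);               % squared distance r_ij^2, cf. Eq. (3)
    beta   = beta0 * exp(-gamma0 * r2);       % attractiveness, Eq. (2)
    eps_i  = rand(size(xi)) - 0.5;            % uniform random vector
    xi_new = xi + beta*(xj - xi) + alpha_t*eps_i;   % Eq. (4)
end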



2.2 LFFA Algorithm


The LFFA is one of the modified versions of the original FA [9]. The movement of a
firefly in (4) is changed to (6) by using random numbers drawn from the Lévy
distribution. The product ⊕ stands for entrywise multiplication. The term sign[rand − 1/2],
where rand ∈ [0, 1], provides a random direction. In (6), the symbol Lévy(λ)
represents the Lévy distribution as stated in (7). The step length s can be calculated by
(8), where u and v follow the normal distributions stated in (9). The standard
deviations of u and v are expressed in (10), where Γ is the standard Gamma function.

x_i^{t+1} = x_i^t + \beta_0 e^{-\gamma r_{ij}^2} (x_j^t - x_i^t) + \alpha \,\mathrm{sign}[\mathrm{rand} - 1/2] \oplus \mathrm{L\'evy}(\lambda)   (6)

\mathrm{L\'evy}(\lambda) \sim u = t^{-\lambda}, \quad 1 < \lambda \le 3   (7)

s = u / |v|^{1/\beta}   (8)

u \sim N(0, \sigma_u^2), \quad v \sim N(0, \sigma_v^2)   (9)

\sigma_u = \left( \frac{\Gamma(1+\beta)\sin(\pi\beta/2)}{\Gamma((1+\beta)/2)\, \beta\, 2^{(\beta-1)/2}} \right)^{1/\beta}, \quad \sigma_v = 1   (10)
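The step length in Eqs. (8)–(10) can be sketched in MATLAB as follows (Mantegna-style sampling; the function name and the dimension argument d are our assumptions):

% Sketch: Levy-flight step lengths via Eqs. (8)-(10); beta is the Levy exponent.
function s = levy_step(beta, d)
    sigma_u = (gamma(1+beta) * sin(pi*beta/2) / ...
               (gamma((1+beta)/2) * beta * 2^((beta-1)/2)))^(1/beta);   % Eq. (10)
    u = randn(1, d) * sigma_u;     % u ~ N(0, sigma_u^2), Eq. (9)
    v = randn(1, d);               % v ~ N(0, 1), i.e. sigma_v = 1
    s = u ./ abs(v).^(1/beta);     % step lengths, Eq. (8)
end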

2.3 mLFFA Algorithm


The mLFFA is represented by the pseudo code shown in Fig. 2 [10]. The multiobjective
function F(x) is formulated in (11), where f1(x), f2(x), …, fn(x) are the objective
functions, gj(x) are the inequality constraints and hk(x) are the equality constraints. F(x) in
(11) is simultaneously minimized by the mLFFA subject to gj(x) and hk(x). The
best solution is updated in every iteration; if it is a non-dominated solution, it is
sorted and stored in the Pareto optimal set P* as stated in (12). After the search
terminates, the solutions stored in P* are used to form the Pareto front PF*
as expressed in (13), where s denotes the solutions stored in the set S and s* the
non-dominated solutions. All non-dominated solutions appearing on the PF* are the optimal solutions.
\min F(x) = \{f_1(x), f_2(x), \ldots, f_n(x)\}
\text{subject to } g_j(x) \le 0, \; j = 1, \ldots, m, \quad h_k(x) = 0, \; k = 1, \ldots, p   (11)

P^* = \{x^* \in F \mid \nexists\, x \in F : F(x) \preceq F(x^*)\}   (12)

PF^* = \{s^* \in S \mid \nexists\, s \in S : s \preceq s^*\}   (13)

Multiobjective function:
F(x) = {f1(x), f2(x),…,fn(x)}, x = (x1,…,xd)T
Initialize LFFA1,…,LFFAk, and Pareto optimal set P*
Generate initial population of fireflies xi = (i = 1, 2,…,n)
Light intensity Ii at xi is determined by F(xi)
Define light absorption coefficient γ
while (Gen ≤ Max_Generation)
for z =1: k all k LFFA
for i =1 : n all n fireflies
for j =1 : i all n fireflies
if (Ij > Ii)
- Move firefly i towards j in d-dimension via
Lévy-flight distributed random
end if
- Attractiveness varies with distance r via exp[-γr]
- Evaluate new solutions and update light intensity
end for j
end for i
end for z
- Rank the fireflies and find the current best x*
- Sort and find the current Pareto optimal solutions
end while
- Pareto optimal solutions are found and
- Pareto front PF* is performed.

Fig. 2. Pseudo code of the mLFFA algorithm.
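The non-dominated filtering that maintains P* in Eqs. (12)–(13) can be sketched in MATLAB as follows (a straightforward pairwise dominance check; the names are ours):

% Sketch: logical index of non-dominated rows of FV, an n-by-k matrix of
% objective values (minimization), cf. Eqs. (12)-(13).
function idx = nondominated(FV)
    n = size(FV, 1);
    idx = true(n, 1);
    for i = 1:n
        for j = 1:n
            % j dominates i: no worse in all objectives, better in at least one
            if j ~= i && all(FV(j,:) <= FV(i,:)) && any(FV(j,:) < FV(i,:))
                idx(i) = false;
                break
            end
        end
    end
end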

3 Multiobjective Test Functions

To demonstrate its effectiveness, the mLFFA is evaluated against several
multiobjective test functions. In this work, four widely used standard multiobjective
functions, ZDT1–ZDT4, providing a wide range of diverse properties in terms of
Pareto front and Pareto optimal set, are conducted [11, 12]. ZDT1 has a convex front,
as stated in (14), where d is the number of dimensions. ZDT2, as stated in (15), has a
non-convex front, while ZDT3, with a discontinuous front, is expressed in (16), where
g and xi in functions ZDT2 and ZDT3 are the same as in function ZDT1. ZDT4 is
stated in (17), with a convex front but more specific. In the evaluation process, the error Ef
between the estimated Pareto front PFe and its corresponding true front PFt is defined
in (18), where N is the number of solution points.
ZDT1:
f_1(x) = x_1, \quad f_2(x) = g\,(1 - \sqrt{f_1/g}\,), \quad g = 1 + 9 \left( \sum_{i=2}^{d} x_i \right) / (d-1), \quad x_i \in [0, 1], \; i = 1, \ldots, 30   (14)

ZDT2:
f_1(x) = x_1, \quad f_2(x) = g\,(1 - (f_1/g)^2)   (15)

ZDT3:
f_1(x) = x_1, \quad f_2(x) = g\,(1 - \sqrt{f_1/g} - (f_1/g)\sin(10\pi f_1))   (16)

ZDT4:
f_1(x) = x_1, \quad f_2(x) = g\,(1 - \sqrt{f_1/g}\,), \quad g = 1 + 10(d-1) + \sum_{i=2}^{d} (x_i^2 - 10\cos(4\pi x_i)), \quad x_i \in [0, 1], \; i = 1, \ldots, 30   (17)

E_f = \lVert PF_e - PF_t \rVert = \sum_{j=1}^{N} (PF_{e,j} - PF_{t,j})^2   (18)
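For concreteness, ZDT1 in Eq. (14) can be coded in MATLAB as follows (the function name is ours; x is a vector of d = 30 variables in [0, 1]):

% Sketch: ZDT1 test function, Eq. (14).
function F = zdt1(x)
    d  = numel(x);
    g  = 1 + 9 * sum(x(2:d)) / (d - 1);
    f1 = x(1);
    f2 = g * (1 - sqrt(f1 / g));
    F  = [f1, f2];                 % the two objectives to be minimized
end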

4 Results and Discussions

The proposed mLFFA algorithm was coded in MATLAB version 2018b and run on an
Intel(R) Core(TM) i5-3470 CPU @ 3.60 GHz with 4.0 GB RAM. The search
parameters of each LFFA in the mLFFA are set according to Yang's recommendations
[6, 9], i.e. the number of fireflies n = 30, α0 = 0.25, β0 = 1, λ = 1.50 and γ = 1. These
parameter values are sufficient for most optimization problems because the LFFA
algorithm is very robust (not very sensitive to parameter adjustment) [6, 9]. In this
work, the termination criterion (TC) is either a given tolerance or a fixed number of
generations. From the simulation results, it was found that a fixed number of
generations is not only easy to implement but also suitable for comparing the closeness
of the Pareto fronts of the test functions. Therefore, for all test functions,
Max_Gen = 2,000 is set as the TC.
For comparison, the results obtained by the proposed mLFFA on all four standard
multiobjective test functions are compared with those obtained by well-known
algorithms from the literature, i.e. VEGA [3], NSGA-II [4], DEMO [5] and the
multiobjective multipath adaptive tabu search (mMATS) [13]. The performance of all
algorithms is measured via the error Ef stated in (18), and for all algorithms a fixed
number of generations/iterations of 2,000 (Max_Gen) is set as the TC. The results
obtained for all test functions are summarized in Tables 1 and 2, and the estimated
Pareto fronts and the true fronts of functions ZDT1–ZDT4 are depicted in Figs. 3, 4, 5
and 6, respectively. It was found from Figs. 3–6 that the mLFFA satisfactorily provides
a Pareto front containing all Pareto optimal solutions, very close to the true front of
each multiobjective test function. Referring to Tables 1 and 2, the mLFFA shows
results superior to the other algorithms, with a lower Ef and less search time consumed.

Fig. 3. Pareto front of ZDT1 (true Pareto front and the mLFFA result; axes f1(x), f2(x)).

Fig. 4. Pareto front of ZDT2 (true Pareto front and the mLFFA result; axes f1(x), f2(x)).

Table 1. Error Ef between PFe and PFt.


Algorithms Error Ef
ZDT1 ZDT2 ZDT3 ZDT4
VEGA 2.79e-02 2.37e-03 3.29e-01 4.87e-01
NSGA-II 3.33e-02 7.24e-02 1.14e-01 3.38e-01
DEMO 2.08e-03 7.55e-04 2.18e-03 2.96e-01
mMATS 1.24e-03 2.52e-04 1.07e-03 1.02e-01
mLFFA 1.20e-03 2.48e-04 1.01e-03 1.01e-01

Fig. 5. Pareto front of ZDT3 (true Pareto front and the mLFFA result; axes f1(x), f2(x)).

Fig. 6. Pareto front of ZDT4 (true Pareto front and the mLFFA result; axes f1(x), f2(x)).

Table 2. Search time consumed.


Algorithms Search time (sec.)
ZDT1 ZDT2 ZDT3 ZDT4
VEGA 125.45 132.18 121.40 122.24
NSGA-II 126.82 145.63 158.27 165.51
DEMO 89.31 98.44 102.32 120.86
mMATS 65.54 72.33 82.47 78.52
mLFFA 52.42 65.18 71.53 64.78

5 Conclusions

The multiobjective Lévy-flight firefly algorithm (mLFFA) has been presented in this
paper for global optimization. Based on the original FA and LFFA, the mLFFA has
been tested against four standard multiobjective test functions to demonstrate its
effectiveness for multiobjective optimization. Details of the performance evaluation of
the mLFFA algorithm have been reported. From the simulation results, it was found
that the mLFFA algorithm performs more efficiently than the well-known algorithms
including VEGA, NSGA-II, DEMO and mMATS. It can be concluded that the mLFFA
algorithm is one of the most effective metaheuristic optimization search techniques for
solving global optimization problems. In future research, the proposed mLFFA
algorithm will be applied to various real-world engineering optimization problems.

References
1. Zakian, V.: Control Systems Design: A New Framework. Springer, London (2005)
2. Talbi, E.G.: Metaheuristics: From Design to Implementation. Wiley, Hoboken (2009)
3. Schaffer, J.D.: Multiple objective optimization with vector evaluated genetic algorithms. In:
The 1st International Conference on Genetic Algorithms, pp. 93–100 (1985)
4. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic
algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6, 182–197 (2002)
5. Robič, T., Filipič, B.: DEMO: Differential Evolution for Multiobjective Optimization.
Lecture Notes in Computer Science, vol. 3410, pp. 520–533 (2005)
6. Yang, X.S.: Firefly algorithms for multimodal optimization. In: Stochastic Algorithms,
Foundations and Applications. Lecture Notes in Computer Sciences, vol. 5792, pp. 169–178
(2009)
7. Fister, I., Fister Jr., I., Yang, X.S., Brest, J.: A comprehensive review of firefly algorithms.
In: Swarm and Evolutionary Computation, vol. 13, pp. 34–46. Springer, Heidelberg (2013)
8. Fister, I., Yang, X.S., Fister, D., Fister Jr., I.: Firefly algorithm: a brief review of the
expanding literature. In: Yang, X.S. (ed.) Cuckoo Search and Firefly Algorithm, pp. 347–
360 (2014)
9. Yang, X.S.: Firefly algorithm, Lévy flights and global optimization. In: Research and
Development in Intelligent Systems, vol. XXVI, pp. 209–218. Springer, Heidelberg (2010)
10. Sumpunsri, S., Puangdownreong, D.: Multiobjective Lévy-flight firefly algorithm for
optimal PIDA controller design. Int. J. Innovative Comput. Inf. Control 16(1), 173–187
(2020)
11. Zitzler, E., Thiele, L.: Multiobjective evolutionary algorithms: a comparative case study and
the strength Pareto approach. IEEE Trans. Evol. Comput. 3, 257–271 (1999)
12. Zitzler, E., Deb, K., Thiele, L.: Comparison of multiobjective evolutionary algorithms:
empirical results. Evol. Comput. 8, 173–195 (2000)
13. Puangdownreong, D.: Multiobjective multipath adaptive tabu search for optimal PID
controller design. Int. J. Intell. Syst. Appl. 7(8), 51–58 (2015)
Cooperative FPA-ATS Algorithm
for Global Optimization

Thitipong Niyomsat, Sarot Hlangnamthip,


and Deacha Puangdownreong(&)

Department of Electrical Engineering, Faculty of Engineering,


Southeast Asia University, 19/1 Petchkasem Road, Nongkhangphlu,
Nonghkaem, Bangkok 10160, Thailand
ton_phuket@hotmail.com, {sarotl,deachap}@sau.ac.th

Abstract. This paper presents a novel cooperative metaheuristic algorithm
combining the flower pollination algorithm (FPA) and the adaptive tabu search
(ATS), named the cooperative FPA-ATS. The proposed cooperative FPA-ATS
possesses two states. First, it searches for feasible solutions over the entire
search space using the FPA's explorative property. Second, it searches for
the global solution using the ATS's exploitative property. The cooperative
FPA-ATS is tested against ten multimodal benchmark functions for global
minimum finding in order to assess its search performance. In comparison
with the FPA and the ATS, it was found that the cooperative FPA-ATS is
significantly more efficient.

Keywords: Cooperative FPA-ATS algorithm · Flower pollination algorithm · Adaptive tabu search · Global optimization

1 Introduction

Metaheuristic algorithms have been continually developed to solve combinatorial
and numeric optimization problems for over five decades [1]. Metaheuristic
optimization search techniques can be classified by two main properties, i.e. exploration (or
diversification) and exploitation (or intensification) [1, 2]. The explorative property is
the ability to generate diverse solutions that explore the overall search space on the global
scale; the exploitative property is the ability to generate intensive
solutions on the local scale of the search space. With these two properties, metaheuristics
are divided into two types, i.e. population-based and single-solution-based
metaheuristic algorithms [2]. Population-based metaheuristic algorithms have a
strong explorative property, whereas single-solution-based metaheuristic algorithms
have a strong exploitative property [1, 2]. From the literature, one of the most
efficient population-based metaheuristic algorithms is the flower pollination algorithm
(FPA), proposed by Yang in 2012 [3]. The FPA has been successfully applied to many
real-world problems such as process control, image processing, antenna design,
wireless networks, robotics and automatic control [3, 4]. Moreover, the FPA
algorithm has been proven to possess global convergence properties [5]. On the other
hand, one of the most powerful single-solution-based metaheuristic algorithms is the


adaptive tabu search (ATS), proposed by Sujitjorn, Kulworawanichpong,
Puangdownreong and Areerak in 2006 [6]. The ATS was developed from the original TS
proposed by Glover in 1989 [7, 8]. The ATS has been widely used to solve several
real-world optimization problems such as model identification, control system design,
power system design and signal processing [6]. In addition, the ATS algorithm has
been proven for global convergence [6, 9, 10].
To combine the two main properties, a novel cooperative metaheuristic algorithm
called the cooperative FPA-ATS is proposed in this paper. The proposed cooperative
FPA-ATS combines the FPA and the ATS in a cooperative search for global optimization.
This paper consists of five sections. After the introduction in Sect. 1, the original FPA
and ATS algorithms are briefly described and the proposed cooperative FPA-ATS
algorithm is illustrated in Sect. 2. The ten standard multimodal benchmark functions used
in this paper are detailed in Sect. 3. Results and discussions of the performance tests of
the FPA, ATS and cooperative FPA-ATS algorithms on the ten selected multimodal
benchmark functions are given in Sect. 4. Finally, conclusions are given in
Sect. 5.

2 Cooperative FPA-ATS Algorithm

In this section, the original FPA and ATS algorithms are briefly described. Then, the
proposed cooperative FPA-ATS algorithm is elaborately illustrated as follows.

2.1 FPA Algorithm


The FPA algorithm mimics the pollination of flowering plants in nature, which can be
divided into cross- (or global) pollination and self- (or local) pollination [3]. In
cross-pollination, pollen is transferred by a biotic pollinator. In this case, the
new position (or solution) x^{t+1} can be calculated by (1), where x^t is the current position,
g* is the current best solution and L is a random number drawn from the Lévy flight
distribution stated in (2), in which Γ(λ) is the standard gamma function. In self-pollination,
pollen is transferred by an abiotic pollinator; in this case, x^{t+1} can be calculated by (3),
where ε is a random number drawn from the uniform distribution expressed in (4). The
proximity probability p is used for switching between cross-pollination and self-pollination.
The FPA algorithm is described by the flow diagram shown in Fig. 1,
where TC stands for the termination criteria.

x_i^{t+1} = x_i^t + L\,(x_i^t - g_*)   (1)

L \sim \frac{\lambda\,\Gamma(\lambda)\,\sin(\pi\lambda/2)}{\pi} \cdot \frac{1}{s^{1+\lambda}}, \quad s \gg s_0 > 0   (2)

x_i^{t+1} = x_i^t + \epsilon\,(x_j^t - x_k^t)   (3)

\epsilon(q) = \begin{cases} 1/(b-a), & a \le q \le b \\ 0, & q < a \;\text{or}\; q > b \end{cases}   (4)
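As an illustration of Eqs. (1)–(4), one FPA update can be sketched in MATLAB as follows (the names are ours; the Lévy draw uses Mantegna-style sampling as an assumed implementation of the distribution in (2)):

% Sketch: one FPA move; X is the n-by-d population, g_best the current best.
function x_new = fpa_step(x, g_best, X, p, lambda)
    d = numel(x);
    if rand > p
        % cross-pollination, Eq. (1): Levy-distributed step towards g*
        sigma = (gamma(1+lambda)*sin(pi*lambda/2) / ...
                 (gamma((1+lambda)/2)*lambda*2^((lambda-1)/2)))^(1/lambda);
        L = randn(1, d)*sigma ./ abs(randn(1, d)).^(1/lambda);
        x_new = x + L .* (x - g_best);
    else
        % self-pollination, Eq. (3): uniform mixing of two other solutions
        jk = randperm(size(X, 1), 2);
        x_new = x + rand * (X(jk(1),:) - X(jk(2),:));
    end
end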

2.2 ATS Algorithm


The ATS method is based on an iterative neighborhood search approach [6]. It consists of
the tabu list (TL), the adaptive-radius (AR) mechanism and the back-tracking (BT)
mechanism. The TL is used to record the list of visited solutions. The AR mechanism is
used to speed up the search process by reducing the search radius. Once a local entrapment
occurs, the BT mechanism is activated, using some solutions collected in the TL to escape
from the entrapment and move to a new search space. The ATS algorithm is represented by
the flow diagram in Fig. 2.

Fig. 1. FPA algorithm (flow diagram: initialize a population of n flowers with random solutions
and the best solution g*; in each generation, switch between cross-pollination via a Lévy-flight
step, Eq. (1), and self-pollination, Eq. (3), according to the proximity probability p; update g*
until the TC is met).

Fig. 2. ATS algorithm (flow diagram: randomly generate N neighborhood solutions around x*
within the radius R; update x* and store visited solutions in the TL; activate the AR mechanism,
R = αR with 0 < α < 1; invoke the BT mechanism when an entrapment occurs; report x* when
the TC is met).

2.3 Cooperative FPA-ATS Algorithm


The proposed cooperative FPA-ATS algorithm possesses two states. In the first state,
the FPA algorithm of Fig. 1 is used to search for feasible solutions over the entire
search space. Afterwards, in the second state, the ATS algorithm of Fig. 2 is activated
to search for the global solution. The two states are separated by the switching
generation (SG), whose setting depends on the problem of interest: the more local
optima the problem has (highly multimodal), the larger the SG should be set; the fewer
local optima it has (close to unimodal), the smaller the SG. In general, the SG can be
set to half of the maximum generation. The cooperative FPA-ATS algorithm is
presented by the flow diagram in Fig. 3.

Start

For FPA:
- Objective function f(x), x = (x1, x2,…, xd)
- Initialize a population of n flowers/pollen gametes with random solutions
- Find the best solution g* in the initial population
- Define a proximity probability p ∈ [0, 1]
For ATS:
- Objective function f(x), x = (x1, x2,…, xd)
- Initialize a search radius R, weighting search radius α, 0<α<1, and TL
For TC:
- Let Max_Gen is the maximum generation and Gen = 1
- Set SG for switching from FPA to ATS

- Activate the FPA algorithm of Fig. 1
- Update Gen = Gen + 1 and repeat while Gen ≤ SG
- Set the initial solution x* = g*
- Activate the ATS algorithm of Fig. 2
- Update Gen = Gen + 1 and repeat while Gen ≤ Max_Gen
- Report the best solution x*

Stop

Fig. 3. Flow diagram of the cooperative FPA-ATS algorithm.
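The two-state switching of Fig. 3 can be sketched in MATLAB as follows; fpa_generation and ats_generation are hypothetical single-generation routines standing in for the loops of Figs. 1 and 2:

% Sketch: FPA explores until the switching generation SG, then ATS exploits
% starting from the FPA's best solution (state 1 -> state 2).
Max_Gen = 1000;                     % maximum generation
SG = Max_Gen / 2;                   % switching generation (problem-dependent)
x_best = [];                        % current best solution
for gen = 1:Max_Gen
    if gen <= SG
        x_best = fpa_generation(x_best);   % hypothetical FPA generation (Fig. 1)
    else
        x_best = ats_generation(x_best);   % hypothetical ATS generation (Fig. 2)
    end
end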



3 Benchmark Functions

The proposed cooperative FPA-ATS algorithm is tested on standard function
optimization problems to demonstrate its effectiveness for global minimum finding. In this
work, ten multimodal benchmark functions are selected [11, 12], as detailed in Table 1,
where x* = (x1*, x2*) is the optimal solution and f(x*) is the optimal function value. All
selected functions are nonlinear, multimodal functions that are very complex for
global minimum finding. They are (1) the Griewank function (GF), (2) the Salomon
function (SF), (3) the Michalewicz function (MF), (4) the Rastrigin function (RF), (5) the
Peaks function (PF), (6) the Beale function (BF), (7) the Giunta function (GiF), (8) the
Lévy function (LF), (9) the Yang function (YF) and (10) the Drop-Wave function (DWF).

Table 1. Details of the ten benchmark functions. [For each function f1(x1, x2)–f10(x1, x2), the
table gives its name, definition, search space and coefficients, together with a surface sketch;
the surface plots are not reproduced here.]

4 Results and Discussions

To assess its search performance, the proposed cooperative FPA-ATS is tested
against the 10 selected benchmark functions summarized in Table 1. For the
performance evaluation and comparison, the ATS, FPA and cooperative FPA-ATS
algorithms were coded in MATLAB version 2017b (License No. #40637337). The search
parameters of the ATS are set according to the recommendations in [6] and a preliminary
study, as shown in Table 2. Those of the FPA are set according to the recommendations in [3]
and a preliminary study for all benchmark functions, i.e. the number of flowers n = 20

Table 2. Search parameters of the ATS.


Functions Search parameters
N R(%) Jmax AR mechanism
Stage-I Stage-II Stage-III
GF 30 20 15 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3
R = 1e-2 R = 1e-3 R = 1e-4
SF 15 20 15 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3
R = 1e-2 R = 1e-3 R = 1e-4
MF 20 20 15 f(x) < f(x*)+1 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2
R = 1e-2 R = 1e-3 R = 1e-4
RF 20 20 15 f(x) < f(x*)+1 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2
R = 1e-2 R = 1e-3 R = 1e-4
PF 25 20 15 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3
R = 1e-2 R = 1e-3 R = 1e-4
BF 20 20 15 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3 f(x) < f(x*)+1e-4
R = 1e-2 R = 1e-3 R = 1e-4
GiF 20 20 15 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3 f(x) < f(x*)+1e-4
R = 1e-2 R = 1e-3 R = 1e-4
LF 20 20 15 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3
R = 1e-2 R = 1e-3 R = 1e-4
YF 20 20 15 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3
R = 1e-2 R = 1e-3 R = 1e-4
DWF 50 20 15 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3
R = 1e-2 R = 1e-3 R = 1e-4
Note: N is the number of neighborhood members,
R is the initial search radius (% = percentage of the search space) and
Jmax is the maximum number of solution cycles before the BT mechanism is activated.

Table 3. Search parameters of the cooperative FPA-ATS.


Func-tions Search parameters of FPA-ATS
FPA ATS SG
n p N R(%) Jmax AR mechanism
Stage-I Stage-II Stage-III
GF 20 0.5 5 20 5 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3 700
R = 1e-2 R = 1e-3 R = 1e-4
SF 20 0.5 5 20 5 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3 650
R = 1e-2 R = 1e-3 R = 1e-4
MF 20 0.5 5 20 5 f(x) < f(x*)+1 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 450
R = 1e-2 R = 1e-3 R = 1e-4
RF 20 0.5 5 20 5 f(x) < f(x*)+1 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 500
R = 1e-2 R = 1e-3 R = 1e-4
PF 20 0.5 5 20 5 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3 400
R = 1e-2 R = 1e-3 R = 1e-4
BF 20 0.5 5 20 5 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3 f(x) < f(x*)+1e-4 350
R = 1e-2 R = 1e-3 R = 1e-4
GiF 20 0.5 5 20 5 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3 f(x) < f(x*)+1e-4 350
R = 1e-2 R = 1e-3 R = 1e-4
LF 20 0.5 5 20 5 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3 750
R = 1e-2 R = 1e-3 R = 1e-4
YF 20 0.5 5 20 5 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3 700
R = 1e-2 R = 1e-3 R = 1e-4
DWF 20 0.5 5 20 5 f(x) < f(x*)+1e-1 f(x) < f(x*)+1e-2 f(x) < f(x*)+1e-3 750
R = 1e-2 R = 1e-3 R = 1e-4
Note: n is the number of flowers, p is the switch probability and SG is the switching generation.

Table 4. Results of performance evaluation of the ATS, FPA and cooperative FPA-ATS.
Functions ATS FPA Cooperative FPA-ATS
GF 2.608e+4 ± 8.786e+3 1.914e+4 ± 2.149e+3 1.809e+4 ± 4.386e+1
(19%) (25%) (100%)
SF 1.937e+4 ± 1.260e+03 8.484e+3 ± 5.945e+3 1.276e+4 ± 5.782e+2
(33%) (64%) (94%)
MF 8.106e+3 ± 8.432e+3 4.406e+3 ± 8.904e+2 2.454e+3 ± 7.009e+2
(69%) (100%) (100%)
RF 6.642e+4 ± 5.678e+3 1.153e+4 ± 1.591e+3 1.010e+4 ± 6.267e+1
(93%) (100%) (100%)
PF 3.583e+3 ± 7.092e+2 2.406e+3 ± 2.012e+2 2.096e+3 ± 2.840e+1
(100%) (100%) (100%)
BF 5.274e+3 ± 4.896e+3 4.232e+3 ± 7.790e+2 2.269e+3 ± 2.560e+2
(97%) (100%) (100%)
GiF 2.218e+3 ± 6.745e+2 2.013e+3 ± 6.116e+2 1.903e+3 ± 6.421e+1
(100%) (100%) (100%)
LF 1.239e+4 ± 7.379e+3 4.861e+3 ± 8.239e+2 4.152e+3 ± 2.280e+2
(64%) (100%) (100%)
YF 8.305e+3 ± 4.204e+3 4.887e+3 ± 4.351e+3 1.807e+4 ± 9.435e+1
(93%) (98%) (99%)
DWF 3.846e+4 ± 1.794+4 1.399e+4 ± 2.842e+3 8.085e+3 ± 4.898e+1
(37%) (96%) (100%)

Fig. 4. Results of minimum finding of GF by the cooperative FPA-ATS: (a) search movement;
(b) convergence rates.

and a switch probability p = 0.5 (50%). The search parameters of the cooperative
FPA-ATS algorithm are set according to a preliminary study, as summarized in Table 3.
100 trial runs are conducted for each algorithm. Each algorithm terminates once one of
two termination criteria (TC) is satisfied: (1) the function value is less than a given
tolerance ε ≤ 10⁻⁵, or (2) the search reaches the maximum generation
(Max_Gen = 1,000). The former criterion implies that the search succeeded, while the
latter means that it did not.
The comparison results for the ATS, FPA and cooperative FPA-ATS are summarized
in Table 4. The numeric data in Table 4 are presented in the format AE ± SD (SR%),
where AE is the average number (mean) of function evaluations, SD is the standard
deviation and SR is the success rate. The AE value reflects the search time consumed:
the smaller the AE, the less the search time. The SD value reflects the robustness of the
algorithm: the smaller the SD, the greater the robustness. From Table 4, the proposed
cooperative FPA-ATS is more efficient in finding the global minima of all selected
benchmark functions than the FPA and the ATS, respectively, with the highest SR and
the smallest AE and SD values.
In addition, the movement and convergence rates of the cooperative FPA-ATS for
the global minimum finding of GF are plotted in Fig. 4 as an example. It was found
that the FPA-ATS reached the global minimum under the first TC in all trial runs.

5 Conclusions

A novel cooperative metaheuristic optimization algorithm, denoted the cooperative
FPA-ATS algorithm, has been proposed in this paper. The proposed cooperative
FPA-ATS algorithm is formed from the ATS, one of the trajectory-based metaheuristic
optimization techniques, possessing a dominant exploitative property, and the FPA,
one of the most efficient population-based metaheuristic optimization techniques,
having a dominant explorative property. The search performance of the cooperative
FPA-ATS, compared with the ATS and FPA, has been thoroughly tested against ten
selected standard benchmark functions for global minimum finding. As a result, it was
found that the proposed cooperative FPA-ATS is more efficient than the FPA and
ATS, with the highest success rate of minimum finding, the smallest number of
function evaluations (least time consumed) and the smallest standard deviation of
solution finding (highest robustness). It can be concluded that the proposed
cooperative FPA-ATS algorithm is one of the most effective metaheuristic optimization
techniques that can alternatively be used for solving global optimization problems.

References
1. Glover, F., Kochenberger, G.A.: Handbook of Metaheuristics. Kluwer Academic Publishers,
Dordrecht (2003)
2. Talbi, E.G.: Metaheuristics: From Design to Implementation. Wiley, Hoboken (2009)
3. Yang, X.S.: Flower Pollination Algorithm for Global Optimization. Unconventional
Computation and Natural Computation. LNCS, vol. 7445, pp. 240–249 (2012)
4. Chiroma, H., Shuib, N.L.M., Muaz, S.A., Abubakar, A.I., Ila, L.B., Maitama, J.Z.: A review
of the applications of bio-inspired flower pollination algorithm. Procedia Comput. Sci. 62,
435–441 (2015)
5. He, X., Yang, X.S., Karamanoglu, M., Zhao, Y.: Global convergence analysis of the flower
pollination algorithm: a discrete-time Markov chain approach. In: International Conference
on Computational Science (ICCS2017), pp. 1354–1363 (2017)
6. Sujitjorn, S., Kulworawanichpong, T., Puangdownreong, D., Areerak, K-N.: Adaptive tabu
search and applications in engineering design. In: Zha, X.F., Howlett, R.J. (eds.) Integrated
Intelligent Systems for Engineering Design, pp. 233–257. IOS Press, Amsterdam (2006)
7. Glover, F.: Tabu Search - Part I. ORSA J. Comput. 1(3), 190–206 (1989)
8. Glover, F.: Tabu Search - Part II. ORSA J. Comput. 2(1), 4–32 (1990)
9. Puangdownreong, D., Sujitjorn, S., Kulworawanichpong, T.: Convergence analysis of
adaptive tabu search. ScienceAsia J. Sci. Soc. Thai. 30(2), 183–190 (2004)
10. Puangdownreong, D., Kulworawanichpong, T., Sujitjorn, S.: Finite Convergence and
Performance Evaluation of Adaptive Tabu Search. LNCS, vol. 3215, pp. 710–717. Springer,
Heidelberg (2004)
11. Ali, M.M., Khompatraporn, C., Zabinsky, Z.B.: A numerical evaluation of several
stochastic algorithms on selected continuous global optimization test problems. J. Global
Optim. 31, 635–672 (2005)
12. Jamil, M., Yang, X.S., Zepernick, H-J.: Test functions for global optimization: a
comprehensive survey. In: Swarm Intelligence and Bio-Inspired Computation Theory and
Applications, pp. 193–222. Elsevier Inc. (2013)
Bayesian Optimization for Reverse Stress
Testing

Peter Mitic1,2(&)
1
Department of Computer Science, UCL, Gower Street,
London WC1E 6BT, UK
p.mitic@ucl.ac.uk
2
Santander UK, 2 Triton Square, Regents Place, London NW1 3AN, UK

Abstract. Bayesian Optimization with an underlying Gaussian Process is used


as an optimization solution to a black-box optimization problem in which the
function to be optimized has particular properties that result in difficulties. It can
only be expressed in terms of a complicated and lengthy stochastic algorithm,
with the added complication that the value returned is only required to be
sufficiently near to a pre-determined ‘target’. We consider the context of
financial stress testing, for which the data used has a significant noise compo-
nent. Commonly-used Bayesian Optimization acquisition functions cannot
analyze the ‘target’ condition in a satisfactory way, but a simple modification of
the ‘Lower Confidence Bound’ acquisition function improves results markedly.
A proof that the modified acquisition function is superior to the unmodified
version is given.

Keywords: Bayesian optimization · Gaussian process · Acquisition function · Loss distribution · Reverse stress testing · Financial risk · Value-at-risk · VaR

1 Introduction

The need for banks to be resilient to adverse change due to economic circumstances has
been a principle, and indeed a regulatory requirement, for many years. As a result,
stress testing has become a routine component in banking operations. The overall
purpose of stress testing is to ensure that national banking systems are resilient to
economic shocks. The testing principles involved originate from the Bank for Inter-
national Settlements [1] and are reflected in central bank publications such as from the
European Central Bank [2]. Most countries seek to implement these principles but
almost nothing is said about how that should be done.
A central bank would typically issue a set of stressed economic factors and expect
regional banks to calculate the reserves they need to cover anticipated losses. The
process is referred to simply as stress testing. The alternative is termed reverse stress
testing. Given a predetermined target, the level of stressed resources is calculated so as
to hit the target. The back calculations involved are often very onerous. In this paper we
present a method to do them efficiently, and illustrate it in the context of operational
risk (losses resulting from failed processes, fraud and operational failures).


1.1 Problem Statement


The reverse stress test process can be cast into an optimization problem in the following
way. Anticipated losses are modelled by a variable Y defined on a closed real interval
I. The majority of those losses are historic data, and a minority are generated by
simulation. Losses are inflated by a scale factor x, the value of which is to be calculated
in the optimization process. The function to be optimized, f(x), is a complex and
lengthy Monte Carlo algorithm for computing a common measure of financial risk:
Value-at-risk (VaR). Function f defines a stochastic process, and the optimal value of x,
x̂, needs only to be sufficiently near to a target value V. Specifically, the absolute
relative estimation error should not exceed a given limit L. The optimization problem to
be solved is therefore given by Eq. (1), in which the objective function f is subject to an
error term ε arising from the stochastic nature of the embedded Monte Carlo process.

\hat{x} = \arg\min_{x \in I} \left( \left| \frac{f((1+x)Y) - V + \epsilon}{V} \right| < L \right)   (1)

To simplify Eq. (1) for later use, the simpler-looking function
g(x) = \left| \frac{f((1+x)Y) - V + \epsilon}{V} \right| is defined in Eq. (2); in g, the
parameters Y, V and ε are implied.

\hat{x} = \arg\min_{x \in I} \left( g(x) < L \right)   (2)

The optimization problem defined by Eq. (2) should be regarded as a search for a
minimum relative error. Alternatively, it could be interpreted as a search for a value, x̂,
that satisfies the equality g(x̂) = 0, but with a very wide error bound specified by the
parameter L.
The objective function g used to calculate VaR in this context is an implementation
of the Loss Distribution Approach (hereinafter LDA), due to Frachot, Georges, and
Roncalli [3]. It uses a non-linear data fit, and simulates losses that represent, typically,
one year's activity. LDA valuations of g involve lengthy Monte Carlo calculations. In
this paper we say g is ‘expensive’. Therefore, we seek to minimize the number of
evaluations of g, and do it using a Gaussian Process (GP) embedded in Bayesian
Optimization (BO). Although BO is a well-established optimization technique,
empirical evidence presented here indicates a lack of success when using standard
forms of BO in the context of financial loss data. We suggest reasons and propose a
solution in Sect. 4.
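To make the setting concrete, the following Python sketch mimics the kind of stochastic, 'expensive' objective g(x) described above. It is illustrative only: the compound Poisson resampling is a crude stand-in for the fitted LDA model of [3], and all parameter names and defaults (lam, n_mc, alpha) are assumptions of this sketch.

```python
import numpy as np

def g(x, losses, V, lam=100, n_mc=10_000, alpha=0.999, seed=None):
    """Sketch of the stochastic objective g in Eq. (2) (illustrative only).
    Historic losses are inflated by (1 + x); one year is simulated as a
    compound Poisson sum of resampled losses (a stand-in for the LDA fit);
    VaR is the alpha-quantile of the simulated annual totals."""
    rng = np.random.default_rng(seed)
    scaled = (1.0 + x) * np.asarray(losses)
    counts = rng.poisson(lam, size=n_mc)                  # loss events per year
    annual = np.array([rng.choice(scaled, size=n).sum() for n in counts])
    var = np.quantile(annual, alpha)                      # Value-at-Risk estimate
    return abs(var - V) / V                               # compare with the limit L
```

Each call repeats the full Monte Carlo loop, which is what makes g both 'expensive' and noisy: two calls with the same x generally return different values.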

2 Supporting Literature

In this section we summarize the main events in BO development, and stress its
relationship to a GP. BO with a GP is a method of using a Gaussian distribution of
functions as a proxy for optimizing a function that is difficult to optimize in any other
way. It is originally due to Mockus [4]. Two ideas were key to the original argument.

The first was to calculate covariances using a kernel function. The original kernel took the form $e^{-(x_i - x_j)^2/2}$, in which the term $(x_i - x_j)$ is a distance measure between two feature vectors $x_i$ and $x_j$. The second was a proof that a sequence of evaluation points for the 'expensive' function proposed using the GP converges to a solution of the original optimization problem. Mockus and co-workers proposed applications of the method shortly after [5].
In [5], the concept of an acquisition function as a means of proposing the next 'expensive' evaluation point was formalised. A GP is characterised by two parameters: a vector of means μ and a covariance matrix K. From K, a vector σ, which can be interpreted as a vector of standard deviations, can be derived. Specifically, σ is the root mean square error vector when a normal model parameterized by μ and K is fitted to data. The vectors μ and σ feature in the Expected Improvement (EI) acquisition function, which was formalised by Jones [13], based on ideas in [5]. A Bayesian predictor is tested at a sequence of possible evaluation points, and the point with the highest expected value of providing an optimal next 'expensive' evaluation point is selected. A similar acquisition function, Probability of Improvement (POI), was also proposed by Mockus [6].
The innovation in this paper is to extend the Confidence Bound (CB) acquisition function, introduced by Cox and John [7]. Given elements of μ and σ indexed by n and with a real-valued constant κ, the metric $\mu_n - \kappa\sigma_n$ defines the Lower Confidence Bound (LCB) acquisition function, which is used for minimisations. For maximisations, the metric $\mu_n + \kappa\sigma_n$ defines the Upper Confidence Bound (UCB) acquisition function. The relationship between the objective function g(x) and the parameters of a GP was established by Srinivas et al. [8]. The absolute difference between $g(x)$ and $\mu_{n-1}(x)$ at stage n in a sequence of evaluations was shown to be bounded, with high probability, by $\sqrt{\beta_n}\,\sigma_{n-1}(x)$, where $\beta_n$ is a real number that depends on $\log(n^2)$. That result is significant in the proof presented in Appendix A of this paper.
General discussion of the BO method, including development steps, may be found
in Rana et al. [9], Rasmussen and Williams [10], or Murphy [11]. Berk et al. [12]
highlight a general problem that GPs are prone to detecting a local rather than a global
optimum. That is the origin of the ‘sticking’ phenomenon, where the acquisition
function persistently proposes a non-optimal next evaluation point. We aim to make
progress in solving this problem in our analysis.

2.1 Other Optimization Methods


Other optimization techniques are available for solving problems with parameter or
data uncertainty. Some recent examples are discussed in this section.
Kara et al. [14] used a Robust Optimization technique with a Parallelepiped
Uncertainty methodology, originally formulated by Özmen et al. [15] for risk man-
agement in the context of a portfolio of stocks with uncertainty in the data. In Paral-
lelepiped Uncertainty, parameters are allowed to vary, according to a specific
probability distribution, within a confidence parallelepiped that is a linear approximation
of a confidence ellipsoid. Savku and Weber [16] used a Markov chain to track investors’
psychological interactions, with dynamic programming to determine movements within

the Markov chain. Kwon and Mertikopoulos [17] used the idea of ‘regret’ (defined in the
Appendix of this paper) in a general method to replace discrete-time optimizations by
continuous-time optimizations. Ascher [18] approximates a discrete process by a con-
tinuous system of differential equations. In our context, such a replacement might take
the form of an implicit or explicit expression of g(x) (Eq. 2), from which a solution to
g(x) = V could be obtained numerically. Samakpong et al. [21] used Particle Swarm
optimization methods in conjunction with random sampling and Monte Carlo simula-
tions. Particle Swarms are unlikely to be useful in our context because the ‘particles’ are
competing candidate solutions. Any competition implies more ‘expensive’ evaluations
of function g (Eq. 2). Chance-constrained optimization was used by Yang and Sutanto
[19] in conjunction with scenarios to model parameter uncertainty. As with Particle
Swarms, too many ‘expensive’ evaluations of g would be needed in our context.

3 Bayesian Optimization: Overview

In this section we give a brief overview of BO, and stress the roles of the GP, and the
acquisition function part of the BO. The intention of the combined BO-GP process is to
minimize the number of slow or difficult evaluations of g. It proceeds as follows.
1. Define an initial set of evaluation points
2. Repeat until converged
   2.1. Define a Bayes prior with the GP
   2.2. Supply data and calculate a Bayes likelihood
   2.3. Calculate a Bayes posterior
   2.4. Define sample points to calculate a Bayes predictor
   2.5. Propose a next evaluation point using the BO acquisition function
Rasmussen and Williams [10] define a GP as a distribution over functions. There
are two key points in GP usage. First, ‘expensive’ function evaluation is only necessary
at a finite, but arbitrary, set of initial evaluation points x = {x1, x2, …, xn}. Thereafter,
the BO executes multiple non-‘expensive’ function evaluations of a multi-variate
normal distribution, and only then proposes a next ‘expensive’ evaluation point xn+1.
Second, general applicability of a GP arises because if a Gaussian prior is defined and
combined with Gaussian data, both posterior and predictive distributions are Gaussian,
and evaluation of them is very fast.
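As an illustration of steps 1–2.5, a minimal Python skeleton follows, using scikit-learn's GaussianProcessRegressor in place of the R implementation used later in this paper. The one-dimensional grid, the RBF kernel and the stopping rule are assumptions of this sketch, not the author's implementation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def bayes_opt(g, lo, hi, n_init=3, n_iter=30, L=0.05, kappa=1.0, seed=0):
    """Skeleton of the BO-GP loop: fit a GP to the points evaluated so far
    and let an acquisition function propose the next 'expensive' call of g."""
    rng = np.random.default_rng(seed)
    X = list(rng.uniform(lo, hi, n_init))          # step 1: initial design
    y = [g(x) for x in X]
    gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True)
    for _ in range(n_iter):                        # step 2: repeat
        if min(y) < L:                             # stop condition of Eq. (2)
            break
        gp.fit(np.array(X).reshape(-1, 1), y)      # steps 2.1-2.3: posterior
        grid = np.linspace(lo, hi, 500).reshape(-1, 1)   # step 2.4: predictor
        mu, sigma = gp.predict(grid, return_std=True)
        x_next = float(grid[np.argmin(mu - kappa * sigma)])  # step 2.5: LCB
        X.append(x_next)
        y.append(g(x_next))                        # one 'expensive' evaluation
    return X[int(np.argmin(y))], min(y)
```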

3.1 The LCB Acquisition Function


In LCB acquisition, components of the GP parameter vectors μ and σ are combined using a tunable parameter κ. The process requires a calculation of the mean of M elements of μ, $\{\mu_i\}_{i=1}^{M}$, to give a vector of means of means, $\bar{\mu} = E(\mu_i)$. Similarly, a vector of means of standard deviations, $\bar{\sigma}$, is calculated using the entries $\sigma_i$ in the i-th column of σ, and estimators of them, $\hat{\sigma}_i$, obtained from the kernel function. The required expression (see Rasmussen and Williams [10]) is a vector of dimension M, $\bar{\sigma} = \left\{ \sqrt{E\big((\sigma_i - \hat{\sigma}_i)^2\big)} \right\}_{i=1}^{M}$. In our

calculations the vectors $\bar{\mu}$ and $\bar{\sigma}$ are generated using the BayesianOptimization function in the R package rBayesianOptimization [22]. Both are inputs to the LCB acquisition function, which is the actual predictor of the next evaluation point, $x_{n+1}$. Using κ, the M components of $\bar{\mu}$ and $\bar{\sigma}$ are combined in the form $\bar{\mu}_i - \kappa\bar{\sigma}_i$, and a minimum can be sought by searching a range of values of those components, as in Eq. (3). Full details can be found in Murphy [11]. Note that $\bar{\mu}$ and $\bar{\sigma}$ are both implied functions of $x \in I$.

$$x_{n+1} = \min_{i=1,\ldots,M} \left( \bar{\mu}_i - \kappa\bar{\sigma}_i \right) \qquad (3)$$

We have found that LCB acquisition, as formulated in Eq. (3), does not produce satisfactory results for any reasonable value of κ for the noisy data and noise-inducing optimizing function g. This result was unexpected, since selecting larger κ values 'ought' to have diversified the search (see [13], for example). The same is true for UCB, EI and POI acquisitions. Therefore, we have introduced a small change to the LCB acquisition, described in Sect. 4.

4 Changes to LCB Acquisition

It is likely that LCB acquisition fails because the optimization rule in Eqs. 1 and 2
contains the additional requirement that the minimum deviation from zero must be
within a pre-determined limit L. The following amendment to LCB acquisition makes a
dramatic difference. We call it the ZERO acquisition function since the optimal solution
should result in an error (relative to the target) of approximately zero.

$$x_{n+1} = \min_{i=1,\ldots,M} \left( \bar{\mu}_i - \kappa\bar{\sigma}_i \right)^2 \qquad (4)$$

The intuition behind the proposal in Eq. (4) is that a quantity has to be as close as
possible to a target, and a simple way to measure closeness is “deviation-squared”
(absolute deviation works just as well). Ultimately the justification for using Eq. (4) is
empirical, and results are given in Sect. 5.
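The difference between Eqs. (3) and (4) amounts to one line of code. A minimal Python sketch, assuming mu and sigma are the GP vectors $\bar{\mu}$ and $\bar{\sigma}$ evaluated on a search grid:

```python
import numpy as np

def propose_lcb(mu, sigma, grid, kappa=1.0):
    """LCB acquisition, Eq. (3): minimise mu - kappa * sigma."""
    return grid[np.argmin(mu - kappa * sigma)]

def propose_zero(mu, sigma, grid, kappa=1.0):
    """ZERO acquisition, Eq. (4): drive the confidence bound towards
    zero rather than downwards, by minimising its square."""
    return grid[np.argmin((mu - kappa * sigma) ** 2)]
```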

4.1 Rationalization of the ZERO Acquisition Function


There appears to be little difference between the LCB and ZERO Acquisitions.
Therefore, we provide an indication of why ZERO out-performs LCB Acquisition in
this section, and a formal proof in Appendix A. The formal proof uses the concept of
regret: the difference between the function g evaluated at the current optimal point $\hat{x}$ and the last evaluation point $x_n$. An informal explanation can be seen with reference to Fig. 1.
Figure 1 shows the case of a run with acquisition function $z(x) = \bar{\mu}(x) - \kappa\bar{\sigma}(x)$, which ZERO acquisition minimises as $z(x)^2$. If $z(x) > 0$ for all $x \in I$, the proposals for LCB and ZERO acquisition are identical. But if $z(x) < 0$ for some $x \in I$, the situation is different. Let $x_1$ and $x_2$ be solutions to $z(x) = 0$. The dotted line marks a negative deviation of $z = -L$ from zero, and $z(x)$ intersects this line at $x_n$ and $x_m$. Therefore, $x_n$ and $x_m$ are candidate optimal solutions to Eq. 2. LCB

Fig. 1. Rationalization: ZERO acquisition converges faster than LCB acquisition

acquisition would propose the minimizer of z, $x_{\min}$ say, as a next evaluation point. ZERO acquisition would propose $x_1$ or $x_2$ as a next evaluation point. The ordinate $x_1$ is nearer to $x_n$ than $x_{\min}$ (just!). Similarly, $x_2$ is nearer to $x_m$ than $x_{\min}$. Therefore, ZERO acquisition would be expected to converge to $x_n$ or $x_m$ faster than LCB acquisition.
Figure 1 should be seen in conjunction with the objective function g. The optimization process summarized in Eq. 1, with its solution using a GP, can be thought of as a mapping, $\mathcal{F}$, from a set of GP-predicted evaluation points $\{x_1, x_2, \ldots\}$ to function f. That mapping is non-deterministic due to the stochastic nature of f, fed with noisy part-simulated data. Therefore, any particular mapping $f_r = \mathcal{F}(x_1, x_2, \ldots, x_r)$ should not be considered as yielding an accurate result. Rather, $f_r$ should be thought of as a region in which that result is expected to lie. Consequently $f_r$ may or may not satisfy the relative error constraint in Eq. 1, even though in theory it should.

4.2 Definition: Run Length


Success in comparing ZERO and LCB acquisitions is measured by the length of the sequence of successive 'expensive' evaluations $\{g(x_1), g(x_2), \ldots\}$ until the stop condition $g(x_R) < L$ occurs for some positive integer R. We call R the Run Length, and we seek to minimise it. Equation 5 gives the formal definition.

$$R = \left\{ r \in \mathbb{Z}^{+} \,\middle|\, \big(g(x_r) < L\big) \wedge \big(g(x_i) \ge L,\ i = 1, 2, \ldots, r-1\big) \right\} \qquad (5)$$
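A literal rendering of Eq. (5), assuming the sequence of 'expensive' evaluations g(x_1), g(x_2), … is available as a Python list:

```python
def run_length(g_values, L):
    """Run Length R of Eq. (5): the 1-based index of the first 'expensive'
    evaluation whose relative error falls below the limit L."""
    for r, gv in enumerate(g_values, start=1):
        if gv < L:
            return r
    return None  # the run did not terminate within the evaluation budget
```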

5 Results

The data used for this study (Y in Eq. 1) are daily operational losses collected from
January 2010 to December 2018. Outlier losses have a significant effect on the BO
process, resulting in unsatisfactory proposals.
Table 1 shows a comparison of the EI, POI, UCB, LCB and ZERO acquisition
functions. In each case, all evaluations were repeated 25 times to obtain a mean and
standard deviation run length. Three ‘expensive’ evaluations of g were done to ini-
tialize the GP. Also shown is a random search for the optimal solution. In general, the
ZERO, LCB and UCB parameters κ were not significant in determining the run length, and were all set to 1. The number of Monte Carlo iterations used in evaluating g was

more important, and Table 1 shows the results for the maximum (5 million) and
minimum (1 million) considered. Normally for this type of calculation we would opt
for the minimum number (and often fewer) to save time.

Table 1. Mean and SD of run lengths for Random search, and the EI, POI, UCB, LCB and
ZERO acquisition functions: 1 and 5 million Monte Carlo iterations of g.
Acquisition function 5m Mean 1m Mean 5m SD 1m SD
EI 11.45 18.08 10.43 12.15
LCB (κ = 1) 9.85 17.56 7.72 15.70
UCB (κ = 1) 12.16 13.44 9.88 10.14
POI 14.60 22.92 14.41 15.87
Random 10.84 12.42 10.88 12.56
ZERO (κ = 1) 5.60 7.28 3.85 6.62

The most notable observation from Table 1 is the poor performance of the
‘established’ acquisition functions relative to a random search, and the superior per-
formance of ZERO acquisition compared to all others. The high standard deviations in
all cases except ZERO acquisition indicate instability (and hence unreliability) of the
methods used. The ZERO acquisition standard deviations are also high (compared to
the corresponding means), but are acceptable given alternatives in the table.
Figure 2 shows an alternative view of the ZERO acquisition results, in the form of surfaces that show the variation of the number of 'expensive' evaluations of g with the number of Monte Carlo cycles used in those evaluations and values of κ. Using $\mu_R$ and $\sigma_R$ for the mean and standard deviation of the Run Length, R, the upper surface shows $\mu_R + \sigma_R$, the middle surface shows $\mu_R$, and the lower surface shows $\mu_R - \sigma_R$ (with a floor of 3, the initial number of evaluations). Overall, the surfaces are "choppy", particularly for fewer than 2 million Monte Carlo cycles. That is due to sampling and method-induced uncertainty. The surfaces show that the 'tightest' confidence limits are also the most stable regions. They are for low values of κ and a high number of Monte Carlo cycles.

5.1 Discussion
There are two surprising elements in the foregoing discussion. The first is the obser-
vation that “traditional” GP acquisition functions underperformed when used with
noisy data and with an optimization function (g in Eq. 2) which is necessarily
stochastic. Comparisons indicated that points (xn) proposed by the GP did not lead to
convergence as fast as with a random selection process. The second is that a minor
modification of the LCB acquisition function (ZERO) performs significantly better than
the “traditional” acquisition functions. We have attempted to explain this improvement
informally by comparing the proposals in the LCB and ZERO cases. The ZERO case
appears to present more opportunity to propose a potential solution because there are
two possibilities for every one that LCB is able to propose. ZERO acquisition is helped
further because the target is expressed as a range rather than a fixed value.

Fig. 2. Confidence Surfaces. Upper: $\mu_R + \sigma_R$, middle: $\mu_R$, lower: $\mu_R - \sigma_R$

More formally, the argument can be made more secure using the concept of “re-
gret”. Ultimately, the reason for using ZERO acquisition is strictly practical: it works.
That is an important consideration in commercial data analysis.

6 Conclusion and Future Scope

The simple transition from LCB to ZERO acquisition has resulted in a very significant
improvement in solving the optimization problem in Eq. 1. Since the objective function
concerned is stochastic, it is not possible to guarantee consistency of results (as
measured by Run Length) other than by using a very large number of Monte Carlo
cycles. Consequently, choosing an optimization method with a low Run Length stan-
dard deviation is an important consideration.
We are therefore considering two distinct approaches. The first is to improve the
performance of ZERO acquisition even further by providing feedback to each GP
evaluation from the previous GP evaluation (apart from the first). The intention is to
vary the j parameter in Eq. 4 so as to optimize the next evaluation, and to stabilize
oscillation around what emerges as the next proposal point. So far only a minimal
improvement has been noted. This feedback approach is similar to that used by Ozer
et al. [20] in the context of forecasting ATM cash withdrawals.
The second approach does not use BO. We have found that using a binary or linear
interpolation search to solve Eq. 1 does not necessarily result in a lower run length, but
does result in a much more consistent run length.

Appendix A

This proof shows that the expected number of ‘expensive’ function evaluations of
g (Eq. 2) is greater for LCB acquisition than for ZERO acquisition.
We first define Local Regret, which is the difference between a proposed function
estimate and the actual function value. That is followed by Cumulative Regret, which is a sum of local regrets. All other notation is defined in the main body of this paper. The
following proof depends on a probability bound on the error estimate for g(xn) at an
evaluation point xn due to Srinivas et al. [8]. With that bound, the general strategy is to
calculate an upper bound for local regret, use that bound to determine the expected
value of local regret, and then to calculate the expected number of ‘expensive’ function
evaluations using the ZERO and LCB acquisition functions.

Definitions: Local regret $r_n$ and Cumulative regret $R_N$:

$$r_n = g(\hat{x}) - g(x_n), \quad 1 \le n \le N \qquad (A1)$$

$$R_N = \sum_{n=1}^{N} r_n \qquad (A2)$$

Definition: $\beta_n$ (Srinivas [8], Appendix A.1, Lemma 5.1), for constants $C_n > 0$ that satisfy $\sum_{n \ge 1} C_n^{-1} = 1$, and small δ:

$$\beta_n = 2 \log\left( \frac{|I|\, C_n}{\delta} \right) \qquad (A3)$$

Using the definition of local regret (Srinivas [8], Appendix A.1, Lemma 5.1), Equation (A4) provides upper and lower confidence bounds for the GP mean term μ.

$$P\left( |g(x) - \mu_{n-1}(x)| \le \sqrt{\beta_n}\,\sigma_{n-1}(x) \right) \ge 1 - \delta, \quad \forall x \in I,\ 1 \le n \le N \qquad (A4)$$

Then, since $\hat{x}$ is an optimal solution (so that $g(\hat{x}) < g(x_n)$), these upper and lower bounds apply respectively:

$$\begin{aligned} \mu_{n-1}(\hat{x}) + \sqrt{\beta_n}\,\sigma_{n-1}(\hat{x}) &\le \mu_{n-1}(x_n) + \sqrt{\beta_n}\,\sigma_{n-1}(x_n) \\ \mu_{n-1}(x_n) - \sqrt{\beta_n}\,\sigma_{n-1}(x_n) &\le \mu_{n-1}(\hat{x}) + \sqrt{\beta_n}\,\sigma_{n-1}(\hat{x}) \end{aligned} \qquad (A5)$$

The general strategy in this proof is to estimate the maximum cumulative regret in
each of these cases for LCB and ZERO acquisitions, and then calculate the expected
difference between the two.

Proposition: ZERO acquisition converges faster than LCB acquisition in the case $P(\cdot) \ge 1 - \delta$ (Eq. A4).

Proof: First, from Eq. A4 with probability greater than $1 - \delta$, the Local Regret calculation proceeds as in Eq. A6. The first line uses Eq. A4 with $x = \hat{x}$, the second line uses Eq. A5 and the third line uses Eq. A4 with $x = x_n$. The fourth line uses an upper bound: $\beta = \max_n(\beta_n)$.

$$\begin{aligned} r_n &\le \mu_{n-1}(\hat{x}) + \sqrt{\beta_n}\,\sigma_{n-1}(\hat{x}) - g(x_n) \\ &\le \mu_{n-1}(x_n) + \sqrt{\beta_n}\,\sigma_{n-1}(x_n) - g(x_n) \\ &\le 2\sqrt{\beta_n}\,\sigma_{n-1}(x_n) \\ &\le 2\sqrt{\beta}\,\sigma_{n-1}(x_n) \end{aligned} \qquad (A6)$$

Now consider the Local Regret in the cases of LCB and ZERO acquisition. ZERO acquisition is always non-negative but LCB acquisition can be negative. So we partition the values of n into those which result in zero or positive LCB acquisition (set S) and those which result in negative LCB acquisition (the complement, set S′). These sets are shown in Eq. A7.

$$\begin{aligned} S &= \{ n : \mu_{n-1}(x_n) - \kappa\sigma_{n-1}(x_n) \ge 0,\ 1 \le n \le N \} \\ S' &= \{ n : \mu_{n-1}(x_n) - \kappa\sigma_{n-1}(x_n) < 0,\ 1 \le n \le N \} \end{aligned} \qquad (A7)$$

For S, the evaluation points proposed are identical for the two acquisition functions,
since they both correspond to the same minimum. Therefore, using superscripts to
denote the regrets for the two acquisition functions, the following equality applies.

$$r_n^{(LCB)} = r_n^{(ZERO)}, \quad n \in S \qquad (A8)$$

For S′, ZERO acquisition returns a proposal that corresponds to a zero of the acquisition function, whereas the equivalent for LCB acquisition is negative, and we introduce a term $\phi_n$ to account for the difference from zero (Eq. A9).

$$\begin{aligned} \mu_{n-1}(x_n) &= \kappa\sigma_{n-1}(x_n) && (ZERO) \\ \mu_{n-1}(x_n) &= \kappa\sigma_{n-1}(x_n) - \phi_n && (LCB) \end{aligned} \qquad (A9)$$

This leads to the following expressions for the two regrets, using Eq. A6.

$$\begin{aligned} \max_n\left( r_n^{(ZERO)} \right) &= 2\sqrt{\beta}\,\sigma_{n-1}(x_n) = \frac{2\sqrt{\beta}}{\kappa}\,\mu_{n-1}(x_n) \\ \max_n\left( r_n^{(LCB)} \right) &= 2\sqrt{\beta}\,\sigma_{n-1}(x_n) = \frac{2\sqrt{\beta}}{\kappa}\left( \mu_{n-1}(x_n) + \phi_n \right) \end{aligned} \qquad (A10)$$

Equation (A11) shows the partitioning of the Cumulative Regret between sets S and S′.

$$R_N = \sum_{n \in S} r_n + \sum_{n \in S'} r_n \qquad (A11)$$

Then, Eq. A12 shows the maximum Cumulative Regret for ZERO and LCB acquisitions.

$$\begin{aligned} \max\left( R_N^{(ZERO)} \right) &= \sum_{n \in S} r_n + \frac{2\sqrt{\beta}}{\kappa} \sum_{n \in S'} \mu_{n-1}(x_n) \\ \max\left( R_N^{(LCB)} \right) &= \sum_{n \in S} r_n + \frac{2\sqrt{\beta}}{\kappa} \sum_{n \in S'} \left( \mu_{n-1}(x_n) + \phi_n \right) \end{aligned} \qquad (A12)$$

Equations A12 then imply the inequality in Eq. A13.

$$\max\left( R_N^{(LCB)} \right) - \max\left( R_N^{(ZERO)} \right) = \frac{2\sqrt{\beta}}{\kappa} \sum_{n \in S'} \phi_n > 0 \qquad (A13)$$

Equation (A13) is a strong indication that ZERO acquisition leads to faster convergence than LCB acquisition, since it applies with high probability $1 - \delta$. This completes the proof.

References
1. Basel Committee on Banking Supervision, Stress testing principles d450. Bank for
International Settlements (BIS) (2018). https://www.bis.org/bcbs/publ/d450.htm
2. European Banking Authority: 2020 EU-wide stress test methodological note (2019). https://
www.eba.europa.eu/sites/default/documents/files/documents/10180/2841396/ba66328f-476f
-4707-9a23-6df5957dc8c1/2020%20EU-wide%20stress%20test%20-%20Draft%20Method
ological%20Note.pdf
3. Frachot, A., Georges, P., Roncalli, T.: Loss Distribution Approach for operational risk,
Working paper, Groupe de Recherche Operationnelle, Credit Lyonnais, France (2001).
https://ssrn.com/abstract=1032523
4. Mockus, J.: On Bayesian methods for seeking the extremum. In: Proceedings of IFIP
Technical Conference, pp. 400–404 (1974). https://dl.acm.org/citation.cfm?id=646296.
687872
5. Mockus, J., Tiesis, V., Zilinskas, A.: The application of Bayesian methods for seeking the extremum. In: Dixon, L., Szego, G.P. (eds.) Towards Global Optimisation, vol. 2 (1978)
6. Mockus, J.: The Bayesian approach to local optimization. In: Bayesian Approach to Global
Optimization. Mathematics and Its Applications, vol. 37. Springer, Heidelberg (1989).
https://doi.org/10.1007/978-94-009-0909-0_7
7. Cox, D.D., John, S.: SDO: a statistical method for global optimization. In: Multidisciplinary
Design Optimization, pp. 315–329. SIAM, Philadelphia (1997)
8. Srinivas, N., Krause, A., Kakade, S., Seeger, M.: Gaussian process optimization in the bandit
setting: no regret and experimental design. In: Proceedings of ICML 2010, pp. 1015–1022
(2010). https://dl.acm.org/citation.cfm?id=3104322.3104451
9. Rana, S., Li, C., Gupta, S.: High dimensional Bayesian optimization with elastic Gaussian
process. In: Proceedings of 34th International Conference on Machine Learning, Sydney,
PMLR 70 (2017)
10. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press,
Cambridge (2016)
11. Murphy, K.P.: Machine Learning: A Probabilistic Perspective, Chapter 15. MIT Press,
Cambridge (2015)
12. Berk, J., Nguyen, V., Gupta, S., et al.: Exploration enhanced expected improvement for
Bayesian optimization. In: Joint European Conference on Machine Learning and Knowledge
Discovery in Databases. LNCS, vol. 11052, pp. 621–637 (2018)

13. Jones, D.R.: A taxonomy of global optimization methods based on response surfaces.
J. Global Optim. 21(4), 345–383 (2001)
14. Kara, G., Özmen, A., Weber, G.: Stability advances in robust portfolio optimization under
parallelepiped uncertainty. Central Eur. J. Oper. Res. 27, 241–261 (2019). https://doi.org/10.
1007/s10100-017-0508-5
15. Özmen, A., Weber, G.W., Batmaz, I., Kropat, E.: RCMARS: robustification of CMARS
with different scenarios under polyhedral uncertainty set. Commun. Nonlinear Sci. Numer.
Simul. 16(12), 478–4787 (2011). https://doi.org/10.1016/j.cnsns.2011.04.001
16. Savku, E., Weber, G.: Stochastic differential games for optimal investment problems in a
Markov regime-switching jump-diffusion market. Ann. Oper. Res. (2020). https://doi.org/10.
1007/s10479-020-03768-5
17. Kwon, J., Mertikopoulos, P.: A continuous-time approach to online optimization. J. Dyn.
Games 4(2), 125–148 (2017). https://doi.org/10.3934/jdg.2017008
18. Ascher, U.M.: Discrete processes and their continuous limits. J. Dyn. Games 7(2), 123–140
(2020). https://doi.org/10.3934/jdg.2020008
19. Yang, Y., Sutanto, C.: Chance-constrained optimization for nonconvex programs using
scenario-based methods. ISA Trans. 90, 157–168 (2019). https://doi.org/10.1016/j.isatra.
2019.01.013
20. Ozer, F., Toroslu, I.H., Karagoz, P., Yucel, F.: Dynamic programming solution to ATM cash
replenishment optimization problem. In: Intelligent Computing & Optimization. ICO 2018,
vol. 866. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-00979-3_45
21. Samakpong, T., Ongsakul, W., Nimal Madhu, M.: Optimal power flow considering cost of
wind and solar power uncertainty using particle swarm optimization. In: Intelligent
Computing and Optimization. ICO 2019, vol. 1072. Springer, Cham (2020). https://doi.org/
10.1007/978-3-030-33585-4_19
22. Yan, Y.: (2016). https://cran.r-project.org/web/packages/rBayesianOptimization/index.html
Modified Flower Pollination Algorithm
for Function Optimization

Noppadol Pringsakul and Deacha Puangdownreong(&)

Department of Electrical Engineering, Faculty of Engineering,


Southeast Asia University, 19/1 Petchkasem Road,
Nongkhangphlu, Nonghkaem, Bangkok 10160, Thailand
dpcgnop@yahoo.com, deachap@sau.ac.th

Abstract. The flower pollination algorithm (FPA) was firstly proposed in 2012
as one of the population-based metaheuristic optimization search techniques. It
is conceptualized by the pollination behavior of flowering plants. In this paper,
the new enhanced version of the original FPA named the modified flower
pollination algorithm (MoFPA) is proposed to improve its search performance
for function optimization. The switching probability of the original FPA used for
selection between local and global pollinations is changed from the fixed
manner to the random manner according to the pollination behavior of flowering
plants in nature. To perform its effectiveness, the proposed MoFPA is tested
against ten standard benchmark functions compared with the original FPA. As
simulation results, it was found that the proposed MoFPA performs superior
search performance for function optimization to the original FPA with higher
success rates and faster search time consumed.

Keywords: Modified flower pollination algorithm · Function optimization · Metaheuristic optimization technique

1 Introduction

Over the last few decades, nature-inspired metaheuristic algorithms have become very popular in a wide range of real-world applications. This is due to their success in finding good solutions for complex problems, especially NP-complete problems [1, 2]. In the literature, most nature-inspired metaheuristic algorithms are developed by mimicking biological or physical phenomena. They can be grouped into four main categories, i.e. (i) evolution-based metaheuristic algorithms, (ii) physics-based metaheuristic algorithms, (iii) bio-swarm-based metaheuristic algorithms and (iv) human-based metaheuristic algorithms [1, 2].
The flower pollination algorithm (FPA) was firstly proposed by Yang in 2012 [3]
for solving both continuous and combinatorial, single-objective and multi-objective
optimization problems [3, 4]. As one of the most popular and efficient metaheuristic
algorithms, the FPA algorithm imitates the pollination behavior of flowering plants.
Optimal plant reproduction strategy involves the survival of the fittest as well as the
optimal reproduction of plants in terms of numbers. These factors represent the fun-
damentals of the FPA and are optimization-oriented. The FPA algorithm was proved


to have the global convergence property [5]. Since 2012, the FPA has shown superiority to
other metaheuristic algorithms in solving various real-world problems, such as
economic/emission dispatch, reactive power dispatch, optimal power flow, solar PV
parameter estimation, load frequency control, wireless sensor networks, linear antenna
array optimization, frames and truss systems, structure engineering design, multilevel
image thresholding, travelling transportation problem and control system design. The
state-of-the-art developments and significant applications of the FPA have been
reviewed and reported [6].
Following the literature, many variants of FPA have been developed by modifi-
cation, hybridization, and parameter-tuning in order to cope with the complex nature of optimization problems. One of the modified flower pollination algorithms is called the MFPA [7]. The MFPA hybridized the original FPA with the clonal selection algorithm (CSA) in order to generate some elite solutions. The binary flower pol-
lination algorithm (BFPA) was developed for solving discrete and combinatorial
optimization problems [8]. Another significant modification was proposed as the
modified global FPA (mgFPA) [9]. The mgFPA was designed to better utilize features
of existing solutions through extracting its characteristics, and direct the exploration
process towards specific search areas [9].
In this paper, the novel enhanced version of the original FPA named the modified
flower pollination algorithm (MoFPA) is proposed for function optimization. The
switching probability of the original FPA used for selection between local and global
pollinations is changed from the fixed manner to the random manner. The proposed
MoFPA is tested against ten benchmark optimization problems and compared with the original FPA in order to assess its search performance. This paper consists of five
sections. After an introduction presented in Sect. 1, the rest of the paper is arranged as
follows. The original FPA and the proposed MoFPA algorithms are described in
Sect. 2. Ten selected standard benchmark functions used in this paper for function
optimization are detailed in Sect. 3. Results and discussions of the performance
evaluation of the FPA and MoFPA algorithms against ten selected benchmark func-
tions are illustrated in Sect. 4, while the conclusions are followed in Sect. 5.

2 FPA and MoFPA Algorithms

In this section, the original FPA algorithm is briefly detailed. Then, the proposed
MoFPA algorithm is elaborately illustrated.

2.1 Original FPA Algorithm


In nature, the objective of the flower pollination is the survival of the fittest and optimal
reproduction of flowering plants. Pollination in flowering plants can take two major
forms, i.e. biotic and abiotic. About 80–90% of flowering plants belong to biotic
pollination. Pollen is transferred by a pollinator such as bees, birds, insects and animals.
About 10–20% remaining of pollination takes abiotic such as wind and diffusion in
water. Pollination can be achieved by self-pollination or cross-pollination [10, 11].
Self-pollination is the fertilization of one flower from pollen of the same flower or

different flowers of the same plant. Self-pollination usually occurs at short distance without pollinators. It is regarded as the local pollination. Cross-pollination occurs when pollen grains are moved to a flower from another plant. The process happens with the help of biotic or abiotic agents as pollinators. Biotic cross-pollination may occur over long distances with biotic pollinators. It is regarded as the global pollination. Biotic pollinators exhibit Lévy flight behavior [12], with jump or flight distance steps obeying a Lévy flight distribution. The original FPA, firstly proposed in 2012 by Yang [3], was developed from the characteristics of the pollination process, flower constancy and pollinators' behavior.
In original FPA algorithms, a solution xi is equivalent to a flower and/or a pollen
gamete. For global pollination, flower pollens are carried by pollinators. With Lévy
flight, pollens can travel over a long distance as expressed in (1), where g* is the
current best solution found among all solutions at the current generation (iteration) t,
and L stands for the Lévy flight that can be approximated by (2), while Γ(λ) is the standard gamma function. The local pollination can be represented by (3), where xj and xk are pollens from the different flowers of the same plant species, while ε stands for a random number drawn from a uniform distribution as stated in (4). A switch probability p is
used to switch between global pollination and local pollination. The algorithm of the
original FPA can be represented by the flow diagram shown in Fig. 1. From Yang’s
recommendations [3, 4], the number of flowers n = 25 – 50 and a switching probability
p = 0.2 – 0.25 work better for most applications.

$$x_i^{t+1} = x_i^t + L\,(x_i^t - g_*) \qquad (1)$$

$$L \sim \frac{\lambda\,\Gamma(\lambda)\,\sin(\pi\lambda/2)}{\pi} \cdot \frac{1}{s^{1+\lambda}}, \quad (s \gg s_0 > 0) \qquad (2)$$

$$x_i^{t+1} = x_i^t + \epsilon\,(x_j^t - x_k^t) \qquad (3)$$

$$\epsilon(q) = \begin{cases} 1/(b - a), & a \le q \le b \\ 0, & q < a \text{ or } q > b \end{cases} \qquad (4)$$
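A minimal Python sketch of the two pollination moves in Eqs. (1)–(4) follows. The Mantegna-style draw is one common way of realising the Lévy steps of Eq. (2) and is an assumption here, as are the helper names.

```python
import numpy as np
from math import gamma, sin, pi

def levy(lam=1.5, size=1, rng=None):
    """Mantegna-style approximation to the Levy step of Eq. (2)."""
    rng = rng or np.random.default_rng()
    s = (gamma(1 + lam) * sin(pi * lam / 2) /
         (gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    return rng.normal(0, s, size) / np.abs(rng.normal(0, 1, size)) ** (1 / lam)

def fpa_step(x, g_best, pop, p=0.2, rng=None):
    """One pollination move per Fig. 1: if rand > p, global pollination
    via Eq. (1); otherwise local pollination via Eq. (3)."""
    rng = rng or np.random.default_rng()
    if rng.random() > p:                                    # cross-pollination
        return x + levy(size=x.size, rng=rng) * (x - g_best)         # Eq. (1)
    j, k = rng.choice(len(pop), size=2, replace=False)      # self-pollination
    return x + rng.random() * (pop[j] - pop[k])                      # Eq. (3)
```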

2.2 Proposed MoFPA Algorithm


In flowering plants in nature, the pollination ability of a flowering plant, between cross-pollination via biotic pollinators and self-pollination via abiotic pollinators, depends on the nature of the flowering plant species [10, 11]. When such behavior is modeled in an algorithm, the switching should be random within a particular interval rather than the fixed value used in the original FPA algorithm. The authors believe that, with this concept, the developed algorithm can reach the optimum faster. The proposed MoFPA algorithm, the new enhanced version of the original FPA, employs a randomly switching probability for selection between local and global pollinations. This improves the opportunity for global finding of the proposed MoFPA algorithm, in accordance with the flower pollination behavior in nature. By this concept and new motivation, the random value of the pollination process is still varied from 0 to 1, i.e. rand ∈ [0, 1], but the fixed switching probability p = 0.2 is changed to a randomly switching probability randp ∈ [pmin, pmax], with pmin, pmax ∈ [0, 1] and pmin < pmax, in keeping with natural flower pollination behavior.

Start
Initialize:
- Objective function f(x), x = (x1, x2, …, xd)
- Initialize a population of n flowers/pollen gametes with random solutions
- Find the best solution g* in the initial population
- Define a switch probability p ∈ [0, 1] (fixed)
Repeat until the termination criterion (TC) is met:
- If rand > p (rand random, p fixed): global pollination — draw a step vector L via Lévy flight in (2) and activate cross-pollination in (1)
- Otherwise: local pollination — draw ε from a uniform distribution ∈ [0, 1] in (4), randomly choose j and k among all the solutions, and invoke self-pollination in (3)
- Evaluate new solutions
- If f(x) < f(g*), update g* = x
Report the current best solution g*
Stop

Fig. 1. Flow diagram of the original FPA.

Referring to the original FPA algorithm in Fig. 1, the condition for selection between local and global pollinations is rand > p, where rand ∈ [0, 1] is a random number drawn from a uniform distribution and p is a fixed value. If rand > p, the global pollination will be activated as shown in Fig. 2(a). Otherwise, the local

pollination will be invoked. The algorithm of the proposed MoFPA can be represented by the flow diagram of the original FPA as shown in Fig. 1, but the condition for selection between local and global pollinations is changed to rand > randp, where rand ∈ [0, 1] and randp ∈ [pmin, pmax] are random numbers drawn from a uniform distribution. With the new selection condition, if rand > randp, the global pollination will be activated as shown in Fig. 2(b). Otherwise, the local pollination will be invoked. With this modification, the mathematical relations in (1)–(4) of the original FPA are retained in the proposed MoFPA. The algorithm of the proposed MoFPA can be represented by the flow diagram shown in Fig. 3.

[Figure: two panels comparing the selection thresholds on the rand ∈ [0, 1] axis: (a) Original FPA, with a fixed threshold p = 0.2 separating local pollination (below) from global pollination (above); (b) Proposed MoFPA, with a random threshold randp ∈ [pmin, pmax].]

Fig. 2. Selecting conditions between local and global pollinations.
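In code, the entire modification reduces to how the switching threshold is produced. A sketch follows, with the interval defaults chosen only for illustration:

```python
import numpy as np

def switch_to_global(rng, p_min=0.25, p_max=0.5, fixed_p=None):
    """Selection between pollinations. Original FPA, Fig. 2(a): compare
    rand with a fixed p (e.g. 0.2). MoFPA, Fig. 2(b): compare rand with
    randp drawn uniformly from [p_min, p_max] at every decision."""
    randp = fixed_p if fixed_p is not None else rng.uniform(p_min, p_max)
    return rng.random() > randp   # True -> global pollination

# Example usage (hypothetical):
rng = np.random.default_rng(0)
use_global_mofpa = switch_to_global(rng)              # MoFPA behaviour
use_global_fpa = switch_to_global(rng, fixed_p=0.2)   # original FPA behaviour
```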

3 Selected Benchmark Functions

Hundreds of benchmark test functions have been proposed for global minimization [13–15]. In this work, ten standard benchmark test functions are selected for testing the proposed MoFPA. They are (1) Sinusoid function (SF), (2) Bohachevsky function (BF), (3) Rastrigin function (RF), (4) Griewank function (GF), (5) Michalewicz function (MF), (6) Shubert function (ShF), (7) Rosenbrock function (RoF), (8) Schwefel function (SchF), (9) Keane function (KF) and (10) Cosine-Mixture function (CMF), respectively. Details of all selected benchmark test functions are summarized in Table 1, where x* = (x*, y*) is the optimal solution and f(x*) is the optimal function value. The ten selected benchmark test functions are nonlinear, multimodal and unsymmetrical, which makes global minimum finding very difficult [13–15].

Start
Initialize:
- Objective function f(x), x = (x1, x2, …, xd)
- Initialize a population of n flowers/pollen gametes with random solutions
- Find the best solution g* in the initial population
- Define a switch probability randp ∈ [pmin, pmax]
Repeat until the termination criterion (TC) is met:
- If rand > randp (both random): global pollination — draw a step vector L via Lévy flight in (2) and activate cross-pollination in (1)
- Otherwise: local pollination — draw ε from a uniform distribution ∈ [0, 1] in (4), randomly choose j and k among all the solutions, and invoke self-pollination in (3)
- Evaluate new solutions
- If f(x) < f(g*), update g* = x
Report the current best solution g*
Stop

Fig. 3. Flow diagram of the proposed MoFPA.

4 Results and Discussions

The proposed MoFPA is tested against the ten selected benchmark test functions to assess its search performance for function optimization. Both the original FPA and MoFPA algorithms were coded in MATLAB version 2018b (License No. #40637337) and run on an Intel(R) Core(TM) i5-3470 CPU @ 3.60 GHz with 4.0 GB RAM for comparison. Searching parameters of the original FPA are set according to the recommendations of Yang [3, 4] and preliminary studies on the selected benchmark test functions, as detailed in Table 2, where n is the number of flowers and p is the switching probability. For

Table 1. Details of ten selected benchmark test functions.

the proposed MoFPA, searching parameters are set for a fair comparison, i.e. n has the same value as in the original FPA for each function. Values of randp ∈ [pmin, pmax] are set as four intervals, i.e. the 1st interval: randp ∈ [0, 0.25], the 2nd interval: randp ∈ [0.25, 0.5], the 3rd interval: randp ∈ [0.5, 0.75] and the 4th interval: randp ∈ [0.75, 1], respectively. 100 trial runs are conducted for each algorithm in order to carry out a meaningful statistical analysis. Both algorithms are terminated once one of two termination criteria (TC) is satisfied: (1) the function value is less than a given tolerance δ ≤ 10⁻⁵, or (2) the search reaches the maximum generation (Max_Gen = 1,000). The former criterion implies that the search is successful, while the latter means that the search is unsuccessful. The simulation results are summarized in Table 3, where the global optima are reached. The numeric data in Table 3 are expressed in the format ANE ± STD (PSR), where ANE is the average number of evaluations, STD is the standard deviation and PSR is the percent success rate.
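For clarity, the ANE ± STD (PSR) entries can be assembled from per-trial evaluation counts as in the sketch below; encoding an unsuccessful run (one that hits Max_Gen) as None is an assumption of this sketch.

```python
import numpy as np

def summarise_trials(evals_per_trial):
    """Aggregate trial runs into the ANE +/- STD (PSR) format of Table 3.
    evals_per_trial holds the number of evaluations used by each successful
    trial, or None when the run reached Max_Gen without success."""
    ok = [e for e in evals_per_trial if e is not None]
    psr = 100.0 * len(ok) / len(evals_per_trial)
    if not ok:
        return f"- (+/- -) ({psr:.0f}%)"
    return f"{np.mean(ok):.1f} +/- {np.std(ok):.1f} ({psr:.0f}%)"
```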

Table 2. Search parameters of the original FPA.

Table 3. Results of performance evaluation of the original FPA and proposed MoFPA.

From Table 3, it was found that the proposed MoFPA with randp ∈ [0, 0.25] is not better than the original FPA because it almost always uses only global pollination. Also, the MoFPA with randp ∈ [0.75, 1] is not better than the original FPA because it almost always uses only local pollination. The MoFPA with randp ∈ [0.25, 0.5] is not better than the original FPA, but it is better than the MoFPA with randp ∈ [0, 0.25] and that with randp ∈ [0.75, 1]. However, the proposed MoFPA with randp ∈ [0.5, 0.75] is the best among those intervals. It is more efficient in finding the global optima, being faster (a smaller average number of evaluations) and achieving higher success rates than the original FPA. This is because it balances local and global pollinations.
Figures 4(a), 4(b) and 4(c) show the results of the global optimum finding by the MoFPA on the SF function at the 1st, 50th and 100th generations, respectively. The convergence rates over 100 trial runs of the SF solution finding by the original FPA and the proposed MoFPA are shown in Figs. 5(a) and 5(b), respectively. The convergence rates of the other functions are omitted because they have a similar form to those of the SF function in Fig. 5. From Fig. 5, it can be observed that the proposed MoFPA is more robust than the original FPA in global optimum finding.

Fig. 4. MoFPA movement for global optimum finding of the Sinusoid function (SF): (a) at the 1st generation, (b) at the 50th generation, (c) at the 100th generation.

Fig. 5. Convergence rates over 100 trial runs of the SF solution finding: (a) original FPA, (b) proposed MoFPA.

5 Conclusions

The modified flower pollination algorithm (MoFPA) has been proposed in this paper. Based on the original FPA, which is formed from the pollination behavior of flowering plants, the proposed MoFPA is more flexible in using a random switching probability for selection between local and global pollinations. The MoFPA has been tested against ten selected standard benchmark functions, and its function optimization results have been compared with those obtained by the original FPA. The simulation results show that the proposed MoFPA performs best when the random switching probability is varied in the interval [0.5, 0.75]. Moreover, the MoFPA yields search performance for global optimization superior to the original FPA, with higher success rates and faster search times.

References
1. Glover, F., Kochenberger, G.A.: Handbook of Metaheuristics. Kluwer Academic Publishers,
Dordrecht (2003)
2. Talbi, E.G.: Metaheuristics: From Design to Implementation. Wiley, Hoboken (2009)
3. Yang, X.S.: Flower pollination algorithm for global optimization. In: Unconventional
Computation and Natural Computation. LNCS, vol. 7445, pp. 240–249 (2012)
4. Yang, X.S., Karamanoglu, M., He, X.S.: Flower pollination algorithm: a novel approach for
multiobjective optimization. Eng. Optim. 46(9), 1222–1237 (2014)
5. He, X., Yang, X.S., Karamanoglu, M., Zhao, Y.: Global convergence analysis of the flower
pollination algorithm: a discrete-time Markov chain approach. In: International Conference
on Computational Science (ICCS 2017), pp. 1354–1363 (2017)
6. Chiroma, H., Shuib, N.L.M., Muaz, S.A., Abubakar, A.I., Ila, L.B., Maitama, J.Z.: A review
of the applications of bio-inspired flower pollination algorithm. Procedia Comput. Sci. 62,
435–441 (2015)
7. Nabil, E.: A modified flower pollination algorithm for global optimization. Expert Syst.
Appl. 57, 192–203 (2016)
8. Rodrigues, D., Yang, X.S., De Souza, A.N., Papa, J.P.: Binary flower pollination algorithm
and its application to feature selection. In: Recent Advances in Swarm Intelligence and
Evolutionary Computation, pp. 85–100. Springer, Cham (2015)
9. Shambour, M.Y., Abusnaina, A.A., Alsalibi, A.I.: Modified global flower pollination
algorithm and its application for optimization problems. Interdisc. Sci. Comput. Life Sci. 11,
1–12 (2018)
10. Willmer, P.: Pollination and Floral Ecology. Princeton University Press, Princeton (2011)
11. Balasubramani, K., Marcus, K.: A study on flower pollination algorithm and its applications.
Int. J. Appl. Innov. Eng. Manag. 3, 320–325 (2014)
12. Pavlyukevich, I.: Lévy flights, non-local search and simulated annealing. J. Comput. Phys.
226, 1830–1844 (2007)
13. Ali, M.M., Khompatraporn, C., Zabinsky, Z.B.: A numerical evaluation of several stochastic
algorithms on selected continuous global optimization test problems. J. Global Optim. 31,
635–672 (2005)
14. Jamil, M., Yang, X.S.: A literature survey of benchmark functions for global optimization
problems. Int. J. Math. Model. Numer. Optim. 4(2), 150–194 (2013)
15. Kashif, H., Mohd, N.M.S., Shi, C., Rashid, N.: Common benchmark functions for
metaheuristic evaluation: a review. Int. J. Inf. Visual. 1(4–2), 218–223 (2017)
Improved Nature-Inspired Algorithms
for Numeric Association Rule Mining

Iztok Fister Jr.(B) , Vili Podgorelec, and Iztok Fister

Faculty of Electrical Engineering and Computer Science, University of Maribor,


Koroška cesta 46, 2000 Maribor, Slovenia
iztok.fister1@um.si

Abstract. Nowadays, only a few papers exist dealing with Association


Rule Mining with numerical attributes. When we are confronted with
solving this problem using nature-inspired algorithms, two issues emerge:
How to shrink the values of the upper and lower bounds of attributes
properly, and How to define the evaluation function properly? This paper
proposes shrinking the interval of attributes using the so-called shrinking
coefficient, while the evaluation function is defined as a weighted sum of
support, confidence, inclusion and the shrink coefficient. Four nature-inspired algorithms were applied to sport datasets generated by a random generator from the web. The results of the experiments revealed that, although there are differences between specific algorithms, they could be applied to the problem in practice.

Keywords: Association rule mining · Numerical attributes ·


Nature-inspired algorithms · Optimization

1 Introduction
Association Rule Mining (ARM) is used for discovering the dependence rules
between features in a transaction database. On the other hand, Numeric Associ-
ation Rule Mining (NARM) extends the idea of ARM, and is intended for mining
association rules where attributes in a transaction database are represented by
numerical values [4]. Usually, traditional algorithms, e.g. Apriori, require numerical attributes to be discretized before use. Discretization is sometimes trivial, and sometimes does not have a positive influence on the results of mining. On the other hand, many methods exist for ARM that do not require the discretization step before applying the process of mining. Most of these methods are
based on population-based nature-inspired metaheuristics, such as, for example,
Differential Evolution or Particle Swarm Optimization. NARM has recently also
been featured in some review papers [3,7] which emphasize its importance in the
data revolution era.
The objective of this short paper is to extend the paper of Fister et al. [5],
where the new algorithm for NARM was proposed, based on Differential Evolu-
tion. Indeed, the practical experiments revealed some problems/bottlenecks that
can be summarized into two issues:

– How to shrink the lower and upper borders of numerical attributes?


– How to evaluate the mined rules better?
Each numerical attribute is determined by an interval of feasible values lim-
ited by its lower and upper bounds. The broader the interval, the more associa-
tion rules mined. The narrower the interval, the more specific relations between
attributes are discovered. Mined association rules can be evaluated according to
several criteria, like support and confidence. However, these cover only one side
of the coin. If we would also like to discover the other side, additional measures
must be included into the evaluation function. The paper is focused on devel-
oping the algorithm for the pure NARM. In line with this, a new evaluation
function needs to be proposed. As a result, the main contribution of this paper
can be extracted into the following indents:
– the new algorithm is proposed for the pure NARM,
– the new evaluation function is identified,
– the algorithm is applied to a sport dataset consisting of pure numeric
attributes.
The remainder of the paper is structured as follows: Sect. 2 highlights background information needed for understanding the subject. In Sect. 3, the improved algorithm is presented. Section 4 outlines the experiments and presents the results. The paper is wrapped up with a conclusion in Sect. 5.

2 Background Information
2.1 Association Rule Mining
This section briefly presents a formal definition of ARM. Let us suppose a set
of objects O = {o1 , . . . , om }, where m is the number of objects, and transaction
set D is given, where each transaction T is a subset of objects T ⊆ O. Then, an
association rule can be defined as the implication:
X ⇒ Y, (1)
where X ⊂ O, Y ⊂ O, and X ∩ Y = ∅. The following two measures are defined
for evaluating the quality of the association rule [2]:
$$conf(X \Rightarrow Y) = \frac{n(X \cup Y)}{n(X)}, \qquad (2)$$

$$supp(X \Rightarrow Y) = \frac{n(X \cup Y)}{N}, \qquad (3)$$
where conf (X ⇒ Y ) ≥ Cmin denotes confidence and supp(X ⇒ Y ) ≥ Smin
support of association rule X ⇒ Y . Thus, N in Eq. (3) represents the number of
transactions in transaction database D, and n(.) is the number of repetitions of
the particular rule X ⇒ Y within D. Here, Cmin denotes minimum confidence
and Smin minimum support determining that only those association rules with
confidence and support higher than Cmin and Smin are taken into consideration,
respectively.
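A direct transcription of Eqs. (2) and (3) into Python, assuming each transaction is represented as a set of objects, might look as follows:

```python
def supp_conf(X, Y, transactions):
    """Confidence, Eq. (2), and support, Eq. (3), of the rule X => Y.
    X and Y are sets of objects; transactions is a list of sets."""
    n_x = sum(1 for t in transactions if X <= t)          # n(X)
    n_xy = sum(1 for t in transactions if (X | Y) <= t)   # n(X u Y)
    conf = n_xy / n_x if n_x else 0.0
    supp = n_xy / len(transactions)                       # divide by N
    return conf, supp
```

Only rules with conf ≥ Cmin and supp ≥ Smin would then be retained.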

2.2 Differential Evolution for ARM Using Mixed Attributes

The basis of our study presents Differential Evolution for ARM using mixed
attributes (ARM-DE) proposed by Fister et al. in [5]. Development of this algo-
rithm is divided into three steps:

– domain analysis,
– representation of solution,
– evaluation function definition.

In that study, domain analysis of the observed sport database identified three
numerical and eleven categorical attributes. Thus, the former are determined
with intervals of feasible values limited by their minimum lower and maximum
upper bound values.
The individuals in the population are represented as real-valued vectors,
where every numerical attribute consists of corresponding lower and upper
bounds. On the other hand, each categorical attribute is identified by the real-
value drawn from the interval [0, 1] that is mapped to the corresponding discrete
attribute according to dividing the interval into equally-sized non-overlapping
sub-intervals. Additionally, the last element in the representation xi,D denotes
the so-called cut point, determining which part of the vector belongs to the
antecedent and which to the consequent of the mined association rule. The fol-
lowing fitness function was defined for evaluation of solutions:
$$f(x_i^{(t)}) = \begin{cases} \dfrac{\alpha \cdot conf(x_i^{(t)}) + \gamma \cdot supp(x_i^{(t)})}{\alpha + \gamma}, & \text{if } feasible(x_i^{(t)}) = \text{true}, \\ -1, & \text{otherwise}, \end{cases} \qquad (4)$$

where conf(·) is confidence, supp(·) is support, α and γ are weights, and the function feasible(x_i) denotes whether the solution is feasible. The task of the optimization is to find the maximum value of the evaluation function.

3 Improved DE for NARM


The proposed DE for NARM (also NARM-DE) operates on the transaction
database using only numerical attributes containing data obtained from a wear-
able device during sport training. Therefore, the new domain analysis needs to
be performed in a first step. The result of this step is illustrated in Table 1,
from which it can be seen that domain analysis of this database identified seven
numerical attributes characterizing a performance of a realized training session.
To each attribute, the corresponding intervals with their minimum lower and
maximum upper bounds are assigned in the Table.
Then, the representation of solutions must be adjusted to the new demands.
Here, the solutions are represented as real-valued vectors, in the following form:
$$x_i^{(t)} = \{ \underbrace{x_{i,4\pi_j}^{(t)},\ x_{i,4\pi_j+1}^{(t)},\ x_{i,4\pi_j+2}^{(t)},\ x_{i,4\pi_j+3}^{(t)}}_{At_j^{(t)}},\ \ldots,\ \underbrace{x_{i,4D}^{(t)}}_{Cp_i^{(t)}} \}, \qquad (5)$$

Table 1. Domain analysis performed on the sport database.

Attribute Minimum lower bound Maximum upper bound


Duration 107.95 142.40
Distance 8.76 85.19
Average HR 63.00 168.00
Average ALT 7.23 1779.04
Calories 273.00 2243.00
Ascent 6.0 1884.40
Descent 2.0 1854.20

where the elements $x_{i,4\pi_j+k}^{(t)}$, for $i = 1, \ldots, Np$, $j = 0, \ldots, D-1$ and $k = 0, \ldots, 3$, denote the attributes of features in association rules, t is an iteration counter, and D is the number of attributes. Indeed, each numerical attribute is expressed as a quadruple:

$$At_{i,j}^{(t)} = \left\langle x_{i,4\pi_j},\ x_{i,4\pi_j+1},\ x_{i,4\pi_j+2},\ x_{i,4\pi_j+3} \right\rangle, \qquad (6)$$

where the first term denotes the lower bound, the second the upper bound, the third the threshold value, and the fourth determines the ordering of the attribute in the permutation.
The threshold value determines the presence or absence of the corresponding numerical feature in the association rule, in other words:

$$At_{\pi_j}^{(t)} = \begin{cases} [\text{NULL}, \text{NULL}], & \text{if } rand(0,1) < x_{i,4\pi_j+2}^{(t)}, \\ [x_{i,4\pi_j}^{(t)},\ x_{i,4\pi_j+1}^{(t)}], & \text{if } x_{i,4\pi_j}^{(t)} > x_{i,4\pi_j+1}^{(t)}, \\ [x_{i,4\pi_j+1}^{(t)},\ x_{i,4\pi_j}^{(t)}], & \text{otherwise}, \end{cases} \qquad (7)$$

where a shrinking coefficient K is expressed as:

$$K = 1 - \left( \frac{\left| x_{i,4\pi_j}^{(t)} - x_{i,4\pi_j+1}^{(t)} \right|}{Ub_{\pi_j} - Lb_{\pi_j}} \right), \qquad (8)$$

and $Lb_{\pi_j}$ and $Ub_{\pi_j}$ denote the corresponding lower and upper bounds. The motivation behind the proposed equation is to shrink the whole interval of feasible values.
A permutation $\Pi = (\pi_1, \ldots, \pi_D)$ is assigned to each solution $x_i^{(t)}$, which orders the attributes $At_j^{(t)}$ according to the following equation:

$$x_{i,4\pi_0+3}^{(t)} \ge x_{i,4\pi_j+3}^{(t)} \ge x_{i,4\pi_{D-1}+3}^{(t)}, \quad \text{for } j = 0, \ldots, D-1. \qquad (9)$$

Thus, the attributes with the higher value of the fourth element $x_{i,4\pi_j+3}^{(t)}$ are ordered at the start of the permutation, while the attributes with the lower

values are at the end of the permutation. In this case, each numerical attribute has an equal chance to be selected as an antecedent or consequent of the mined association rule.

The last element in the vector determines the cut point $Cp_i^{(t)}$, expressed as:

$$Cp_i^{(t)} = \left\lfloor x_{i,4D}^{(t)} \cdot (D - 2) \right\rfloor + 1, \qquad (10)$$

where $Cp_i^{(t)} \in [1, D-1]$. In summary, the length of the solution vector is $4 \cdot D + 1$. The mapping of the solution representation to the corresponding association rule is expressed as follows:

$$\begin{aligned} Ante(X \Rightarrow Y) &= \{ o_{\pi_j} \mid \pi_j < Cp_i^{(t)} \wedge At_{\pi_j}^{(t)} \neq [\text{NULL}, \text{NULL}] \}, \\ Cons(X \Rightarrow Y) &= \{ o_{\pi_j} \mid \pi_j \ge Cp_i^{(t)} \wedge At_{\pi_j}^{(t)} \neq [\text{NULL}, \text{NULL}] \}, \end{aligned} \qquad (11)$$

where Ante(X ⇒ Y) represents the set of objects belonging to the antecedent and Cons(X ⇒ Y) is the set of objects belonging to the consequent of the corresponding association rule. However, the attribute needs to be enabled in order for the object to be a valid member of the particular set.
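Putting Eqs. (5)–(11) together, decoding one solution vector can be sketched as follows; the function and argument names are assumptions of this sketch, and the [NULL, NULL] case is modelled by skipping the attribute:

```python
import numpy as np

def decode(x, D, rng):
    """Decode one NARM-DE solution vector of length 4*D + 1: attribute j
    occupies the quadruple (lower, upper, threshold, order); the last
    element encodes the cut point of Eq. (10)."""
    x = np.asarray(x)
    order = np.argsort(-x[3:4 * D:4])          # Eq. (9): descending 4th element
    cut = int(x[4 * D] * (D - 2)) + 1          # Eq. (10): cut point in [1, D-1]
    ante, cons = [], []
    for pos, j in enumerate(order):
        if rng.random() < x[4 * j + 2]:        # Eq. (7): attribute is absent
            continue
        lo_, up_ = sorted((x[4 * j], x[4 * j + 1]))   # ordered interval
        (ante if pos < cut else cons).append((j, lo_, up_))   # Eq. (11)
    return ante, cons
```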
Finally, an evaluation function must be defined. As found in the experimental work of Fister et al. [5], however, the main weakness of ARM-DE was that the evaluation function consisted of a linear combination of support and confidence measures, which favours expanding the interval of feasible values of numerical variables. Consequently, this expansion caused the number of mined association rules to increase, which raised the value of the evaluation function indirectly. On the other hand, the number of categorical attributes was decreased. As a result, a new evaluation function is proposed in our study, as follows:
$$f(x_i^{(t)}) = \frac{\alpha \cdot supp(x_i^{(t)}) + \beta \cdot conf(x_i^{(t)}) + \gamma \cdot inclusion(x_i^{(t)}) + \delta \cdot K}{\alpha + \beta + \gamma + \delta}, \qquad (12)$$
where $supp(x_i^{(t)})$ and $conf(x_i^{(t)})$ represent the support and confidence of the observed association rule, K is the shrinking coefficient, and $inclusion(x_i^{(t)})$ is defined as follows:

$$inclusion(X \Rightarrow Y) = \frac{|Ante(X \Rightarrow Y)| + |Cons(X \Rightarrow Y)|}{m}, \qquad (13)$$
where |Ante(X ⇒ Y )| returns the number of attributes in the antecedent,
|Cons(X ⇒ Y )| is the number of attributes in consequence of the particular
association rule, and m is the total number of attributes. Weights in Eq. (12)
are set to α = β = γ = δ = 1 in the study.
Obviously, the task of the NARM-DE algorithm is to maximize the value of
the proposed evaluation function.
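A direct transcription of Eqs. (12) and (13) into Python (a sketch; the support and confidence values are assumed to be computed elsewhere over the dataset):

```python
def evaluate(supp, conf, n_ante, n_cons, m, K,
             alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    """Evaluation function of Eq. (12); inclusion follows Eq. (13)."""
    inclusion = (n_ante + n_cons) / m
    return (alpha * supp + beta * conf + gamma * inclusion
            + delta * K) / (alpha + beta + gamma + delta)
```

With the default weights this reduces to the plain average of the four terms, matching the setting α = β = γ = δ = 1 used in the study.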

4 Experiments and Results


The purpose of our experimental work was to show that the improved nature-inspired algorithms for NARM can be applied successfully in practice. In line with this, we focused on the open issues, such as shrinking the lower and upper bounds of the numerical attributes and the operation of the new evaluation function.
During the experimental work, four different nature-inspired algorithms were employed: Differential Evolution (DE) [6], Particle Swarm Optimization (PSO) [6], Cuckoo Search (CS) [9], and Flower Pollination Algorithm (FPA) [8]. All algorithms used the parameter settings proposed in the corresponding literature. In order to make the comparative analysis as fair as possible, the number of evaluation function evaluations was fixed at nFES = 10,000, while the number of independent runs was set to nRUN = 5.
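For reference, a minimal DE/rand/1/bin loop in the spirit of Storn and Price [6] is sketched below; it maximizes a user-supplied fitness over the unit hypercube and is our own generic sketch, not the exact implementation used in the experiments (the population size is an assumption):

```python
import random

def de_maximize(fitness, dim, n_fes=10_000, pop_size=50, F=0.5, CR=0.9):
    """Generic DE/rand/1/bin with greedy selection (maximization)."""
    pop = [[random.random() for _ in range(dim)] for _ in range(pop_size)]
    fit = [fitness(x) for x in pop]
    evals = pop_size
    while evals < n_fes:
        for i in range(pop_size):
            a, b, c = random.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = random.randrange(dim)
            # binomial crossover of the mutant vector, clipped to [0, 1]
            trial = [min(1.0, max(0.0, pop[a][j] + F * (pop[b][j] - pop[c][j])))
                     if (j == j_rand or random.random() < CR) else pop[i][j]
                     for j in range(dim)]
            f_trial = fitness(trial)
            evals += 1
            if f_trial >= fit[i]:       # keep the better of target and trial
                pop[i], fit[i] = trial, f_trial
            if evals >= n_fes:
                break
    best = max(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]
```

Here `dim` would be 4·D + 1, and `fitness` would decode the vector into a rule and apply Eq. (12).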
The algorithms solved problems generated by the random sport dataset generator SportyDataGen [1]. The random generator is capable of generating random instances of numerical attributes. The following measures were used for comparing the algorithms: (1) total number of mined rules, (2) average number of antecedents and consequents¹, (3) average fitness, (4) average support, (5) average confidence, (6) average shrink, and (7) average inclusion.
The results of NARM are illustrated in Table 2, which presents the mentioned statistical measures for each of the algorithms in the experiments. Note that the best results are depicted in bold in the table. As can be seen from Table 2, DE discovered the maximum number of total rules, and these rules have the best average fitness and inclusion. On the other hand, PSO mined rules with the best average support, confidence and shrink, while CS and FPA achieved the best results according to the average number of antecedents/consequents. In summary, the best results for practical use were obtained by DE.

Table 2. Number of rules found using different algorithms.

Measure                                     DE        PSO       CS       FPA
Total rules                                 241,455   212,352   74,069   146,659
Average number of antecedents/consequents   5/2       5/2       4/3      3/4
Average fitness                             0.7506    0.6519    0.1165   0.1858
Average support                             0.8242    0.8729    0.1413   0.1181
Average confidence                          0.9637    0.9812    0.5575   0.4106
Average shrink                              0.2639    0.1703    0.2048   0.2771
Average inclusion                           0.9722    0.8370    0.3260   0.4454

1 The first number denotes the number of antecedents, while the second denotes the number of consequents.

Examples of selected solutions found by the proposed algorithms are illustrated in Table 3.

Table 3. Examples of solutions found by the proposed algorithms.

Antecedent Cut Consequent


CAL[350.83, 1247.60]∧ =⇒ DIST[14.04, 85.19]∧
ALT(NO)∧ ASC[259.34, 1884.40]
DUR[107.95, 142.40]∧
DESC[312.42, 1409.12]
ALT[338.67, 589.31]∧ =⇒ DIST[8.76, 85.19]∧
DUR[131.29, 135.80]∧ DESC[774.99, 1258.80]
AVHR[63.0, 125.48]∧
CAL[273.0, 1498.70]∧
ASC[859.13, 1445.86]
ALT[7.22, 1134.88]∧ =⇒ DESC[2.0, 1598.18]∧
CAL[440.82, 1966.86]∧ DUR[107.95, 142.4]
ASC[6.0, 1503.78]∧
AVHR[86.16, 158.17]∧
DIST[17.43, 69.82]

4.1 Discussion
The results of the mentioned nature-inspired algorithms for NARM showed that the selection of the algorithm has a big influence on the quality of the results. Thus, the advantage of the DE algorithm lies in the total number of discovered rules and the average fitness and inclusion, while PSO was better regarding the average support, confidence and shrink. On the other hand, working with numerical attributes revealed several issues that need to be considered in future work. Let us mention only the more important ones:
– How to treat shrinking as a statistical measure? In our results, we considered the shrinking intervals of all attributes, including those that did not arise in the mined rules.
– How to balance the weights of the four terms in the proposed evaluation function? In our case, all weights were set to the value of 1.0, which means that all the contributions were weighted equally.
– Is the best mined association rule according to the fitness value also the most interesting?
– How to find the balance between shrink and inclusion?
The mentioned issues confirm that the development of the proposed algorithm for pure NARM is far from complete. A lot of research will be necessary in order to find the proper answers to these issues.

5 Conclusion
The development of a nature-inspired algorithm for pure NARM demands answers to new questions, such as: How to shrink the lower and upper bounds of numerical attributes? and How to find the proper evaluation function? The former issue concerns the exploration of the search space, while the latter concerns evaluating the quality of the mined association rules.
This paper proposes the use of a shrinking coefficient, determined as the ratio between the difference of the generated upper and lower bounds and the difference of the maximum upper and minimum lower bounds. As the evaluation function, a weighted sum of support, confidence, inclusion, and the shrinking coefficient is taken into consideration; however, all weights were set to the same value of 1.0 in our preliminary study. The nature-inspired algorithms for pure NARM were applied to a sample sport dataset generated by a random generator available on the web. As many as four nature-inspired algorithms were tested in our comparative study: DE, PSO, CS, and FPA.
The results of the comparative analysis revealed that, although there are differences between the specific nature-inspired algorithms, they can be applied for solving the problem in practice. On the other hand, a lot of work is still necessary to find the proper weights determining the particular contributions of the terms in the evaluation function. All of this is a potential direction for future work.

References
1. Sportydatagen: an online generator of endurance sports activity collections. In: Proceedings of the Central European Conference on Information and Intelligent Systems, Varaždin, Croatia, 19–21 September 2018, pp. 171–178 (2018)
2. Agrawal, R., Srikant, R., et al.: Fast algorithms for mining association rules. In: Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, vol. 1215, pp. 487–499 (1994)
3. Altay, E.V., Alatas, B.: Performance analysis of multi-objective artificial intelligence
optimization algorithms in numerical association rule mining. J. Ambient Intell.
Human. Comput. 1–21 (2019)
4. Fister Jr., I., Fister, I.: A brief overview of swarm intelligence-based algorithms for numerical association rule mining. arXiv preprint arXiv:2010.15524 (2020)
5. Fister Jr., I., Iglesias, A., Galvez, A., Del Ser, J., Osaba, E., Fister, I.: Differential
evolution for association rule mining using categorical and numerical attributes. In:
Yin, H., Camacho, D., Novais, P., Tallón-Ballesteros, A.J. (eds.) Intelligent Data
Engineering and Automated Learning - IDEAL 2018, pp. 79–88. Springer Interna-
tional Publishing, Cham (2018)
6. Storn, R., Price, K.: Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 11(4), 341–359 (1997). https://doi.org/10.1023/A:1008202821328
7. Telikani, A., Gandomi, A.H., Shahbahrami, A.: A survey of evolutionary computa-
tion for association rule mining. Information Sciences (2020)

8. Yang, X.S.: Flower pollination algorithm for global optimization. In: Durand-Lose,
J., Jonoska, N. (eds.) Unconventional Computation and Natural Computation, pp.
240–249. Springer, Heidelberg (2012)
9. Yang, X.S.: Bat algorithm and cuckoo search: a tutorial, pp. 421–434. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-29694-9_17
Verification of the Adequacy of the Topological
Optimization Method of the Connecting Rod
Shaping by the BESO Method in ANSYS
APDL System

Sergey Chirskiy1 and Vladimir Panchenko2,3(&)


1
Bauman Moscow State Technical University,
2nd Baumanskaya st. 5, 105005 Moscow, Russia
baragund@yandex.ru
2
Russian University of Transport, Obraztsova st. 9, 127994 Moscow, Russia
pancheska@mail.ru
3
Federal Scientific Agroengineering Center VIM,
1st Institutskiy passage 5, 109428 Moscow, Russia

Abstract. The paper presents a topological optimization method based on the “birth and death of an element” function in ANSYS Mechanical APDL. To confirm the adequacy of the considered method, test models were optimized. The optimization results obtained with the developed method were compared with the results of the “Topology Optimization” module, which is part of the ANSYS Workbench, under identical boundary conditions. The test showed good agreement between the optimization results of the proposed method and the optimization in the ANSYS Workbench “Topology Optimization” module.

Keywords: Piston engine · Cyclic strength · Connecting rod shape · ANSYS APDL · Topological optimization · Evolutionary structural optimization

1 Introduction

Increasing the specific performance of internal combustion engines while simultaneously decreasing their material consumption requires searching for part shapes that are optimal in mass. At the same time, the parts must remain operable. Most parts are simultaneously affected by several loads varying in value and direction, which complicates the task of finding the optimal shape. Developing automated methods for solving such optimization problems is therefore an urgent task. Various methods for optimizing shape exist, but they often have drawbacks and lack sufficient flexibility.
Interest in optimizing the shape of parts has existed for a long time, but it has recently increased. This is due, on the one hand, to the growing tendency to reduce material consumption. On the other hand, affordable and accurate computational methods have appeared that allow evaluating the performance and reliability of parts. In addition, three-dimensional printing technology has been developed that allows the manufacture of parts of almost any shape.


The main parts of piston internal combustion engines are exposed to several forces variable in magnitude and direction; therefore, their stress-strain state turns out to be complex. Because of this, without calculation or experiment it is difficult to determine which regions of a part are loaded more and which less. In addition, under forces of variable magnitude it is necessary to take into account fatigue phenomena in the metal. The experience of design, testing and operation shows that in most cases the destruction of the main parts of internal combustion engines is caused precisely by fatigue. Often the material of the part is loaded unevenly: in some areas the stress values approach the maximum permissible, while in others the material is practically not loaded. This means that the material is not used optimally. The task of finding the optimal shape of the part is difficult but relevant, since such optimization reduces material consumption. In addition, if the part is movable, then reducing its mass will decrease the inertial forces acting on it.
The performance of the part can be checked by experimental or computational methods. Experimental methods are accurate, but their preparation and implementation take a very long time, even with the use of rapid prototyping technologies, and their cost is also quite high. For this reason, it is preferable to use computational methods for assessing the performance of parts, especially in the initial stages of searching for the optimal design.
There is a large number of ready-made software products for solving optimization problems. However, they most often have disadvantages such as cost, closed source code or lack of flexibility. For example, in the “Topology Optimization” module included in the ANSYS software package it is impossible to adequately describe a cylindrical sliding bearing, since the types of contacts that allow significant relative displacement of the contacting parts are not applicable. For this reason, it was decided to develop our own optimization method. In order to formulate the requirements for the developed optimization method, a review of existing methods was performed.

2 Analysis of Publications

The optimization problem is described by a goal function and restrictions. The goal function in problems of optimizing the shape of parts is the mass of the part. Restrictions may differ, but most often they are related in some way to the performance of the part. Currently, the evaluation of the reliability and performance of parts is almost always carried out by the finite element method. In most cases, commercial software products are used, for example, ANSYS [1–6], ABAQUS [7–9], COMSOL [10], MSC Patran [11], LS-DYNA [12].
To search for the optimal shape, some method of creating various variants of the design of the investigated part is necessary, from which the optimal variant is selected.
A set of design options can be created in advance [3–5, 7, 13]. In this case, the best design option is selected from among the available options, and no additional changes are made to the design. However, only a limited number of options can then be investigated.
Another approach is to gradually change the design of the part depending on the results of testing its performance. This approach can be called iterative [1, 6, 8–12, 14–16]. Design changes can be made either manually [1, 17] or automatically by creating various feedbacks between the results of the performance check and the design. To automate the change in the shape of the part, either a parametric [3–5, 8, 10, 14] or a topological [9, 11, 12, 15, 16] description is used. A topological description is more complicated than a parametric one, but it allows obtaining much more diverse shapes of parts.
Different optimization methods use different ways to find the optimal solution. If the shape of the part changes during the optimization process, the algorithm for finding the optimal variant describes precisely the law of shape change. The approximation method [5] or a method based on sensitivity analysis [4] can be used in tasks of finding the optimal shape of parts. Evolutionary [6] or genetic algorithms [6, 8–10, 12, 14, 15] are often used. An example of a more exotic method is the modified cuckoo algorithm [10].
The selected or developed algorithm for finding the best variant can be implemented using a ready-made software product, for example, TOSCA [15], modeFRONTIER [8], or OPTIMUS [12]. In addition, the program for finding the optimal variant can be written in any programming system, for example, in Python [9].

3 Description of the Developed Optimization Method

The implementation of topological optimization is carried out in the ANSYS system [18, 19]. The developed method belongs to the group of methods of bi-directional evolutionary structural optimization (BESO). The choice of this method is determined by the possibilities provided by the finite element analysis package. In addition, this method does not require as many iterations as, for example, genetic methods.
Changing the shape of the part is carried out using the built-in function of the birth and death of elements, which allows disabling or enabling any elements of the finite element model. Disabling an element is equivalent to removing the appropriate amount of material.
The function can work only with previously created elements. Therefore, to broaden the search for the optimal shape of the part, the original model being optimized contains a supply of material. In fact, this model describes the space within which the required part may be contained. Accurate construction of the original model eliminates collision of the optimized part with other parts. The volume and shape of this space are limited by design considerations and by the methods for interfacing parts. All this provides the ability to install the new part in the existing structure.
The algorithm is designed to optimize parts exposed to variable cyclic loads (Fig. 1) [18, 19]. In this case, the deformation energy of an element is not a suitable criterion for deciding whether to switch it off. Therefore, the safety factor for cyclic strength, calculated for each element of the model, was selected as this criterion.

Fig. 1. The general block diagram of the developed algorithm for finding the optimal shape of the part

As a result, the simulation of the stress-strain state of the model being optimized is
performed for two sets of boundary conditions corresponding to the states of maximum
and minimum loading. The simulation results allow calculating the safety factors by the
criterion of cyclic strength.
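The loop of Fig. 1 can be summarized by the following schematic Python pseudo-driver; this is an illustration only: `run_fea` stands for the two ANSYS load-case simulations, and all names, thresholds and the removal rate are assumptions rather than values from the paper:

```python
def beso_loop(elements, run_fea, safety_factor, sf_min=1.5,
              removal_rate=0.02, max_iter=100):
    """Simplified element birth-and-death loop of a BESO-type method."""
    active = set(elements)
    for _ in range(max_iter):
        results = run_fea(active)          # max and min loading states (Fig. 1)
        sf = {e: safety_factor(e, results) for e in active}
        if min(sf.values()) < sf_min:      # cyclic-strength restriction violated
            break                          # stop before weakening the part
        n_off = max(1, int(removal_rate * len(active)))
        least_loaded = sorted(active, key=sf.get, reverse=True)[:n_off]
        active -= set(least_loaded)        # "death" of the least loaded elements
    return active
```

The full bi-directional method can also re-activate previously disabled elements; that step is omitted here for brevity.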

4 Assessment of the Adequacy of the Developed Method

The review resulted in the following requirements for the optimization method being developed: the method should be iterative and based on a topological description of the part shape. Since parts of piston engines operate under variable periodic forces, the phenomena of metal fatigue must be taken into account. To model the stress-strain state, the ANSYS finite element analysis package was chosen, and the program for finding the optimal variant is implemented in the built-in programming language ANSYS APDL.
To assess the adequacy of the developed algorithm, test models were optimized with its help and with the help of the “Topology Optimization” module included in the ANSYS software package. The piston engine connecting rod was selected as the model to be optimized. The sizes and acting forces were determined according to tables of average statistical values for forced diesel engines. Three cases of loading are considered:
1) Asymmetrical compression-tension: the compressive force is 22122 N, the tensile force is 5482.8 N (Fig. 2).

Fig. 2. Calculation model: boundary conditions for asymmetrical compression-tension; the compressive force is on the left, the tensile force is on the right

2) Symmetric bending by forces of equal magnitude 2213.1 N (Fig. 3).

Fig. 3. Calculation model: boundary conditions for symmetric bending by forces of equal magnitude

3) Asymmetrical complex loading, approximately describing the real load cycle acting
on the engine connecting rod (Fig. 4).

Fig. 4. Calculation model: boundary conditions for asymmetrical complex loading



In all cases, the tightening of the connecting rod bolts is modeled by a pair of opposing forces with a magnitude of 13316 N, compressing the middle part of the connecting rod bolt. For the inner cylindrical surface of the connecting rod, a boundary condition of the “Fixed Support” type is set, describing a rigid fixation. Such boundary conditions do not fully describe the operating conditions of the connecting rod, but they can be set both in the ANSYS Mechanical APDL system and in the “Topology Optimization” module.
Some surfaces of the model are used to set the boundary conditions: the upper head bushing, the lower head liners, the connecting rod bolts and the surfaces with which the bolts come into contact, and the surfaces of the connecting rod and the connecting rod cap. In order for these surfaces to remain unchanged during optimization, they are excluded from the region being optimized. In both cases, the optimization criterion is the mass of the part. Restrictions are set so that the maximum equivalent stress does not exceed 600 MPa.
Optimization results for all three loading variants are presented below. In all cases, the left side of the figures shows the result of optimization using the developed method, and the right side shows the result of optimization in the “Topology Optimization” module included in the ANSYS software package.
The result in Fig. 5 is obtained for the case of asymmetric tension-compression (Fig. 2).

Fig. 5. Comparison of optimization results for the case of asymmetric tension-compression

Figure 6 shows the optimization result for the case of symmetric bending (Fig. 3).

Fig. 6. Comparison of optimization results for the case of symmetric bending

For the case of complex asymmetric loading (Fig. 4), the results shown in Fig. 7 are obtained.

Fig. 7. Comparison of optimization results for the case of a complex asymmetric loading

The shapes obtained in the middle part of the connecting rod are almost identical. The difference in shape in the areas of the upper and lower heads is due to the difference in the algorithms. The “Topology Optimization” module included in the ANSYS software package provides sufficient rigidity for all areas during optimization, and since the cylindrical surface of the lower connecting rod head is fixed in all degrees of freedom, a minimum volume of material is sufficient to ensure its rigidity. In the developed method, however, the criterion for the removal of material is associated with the value of the equivalent stress, which in the fixation zone is significantly greater than zero. All this leads to the conclusion that the developed method is adequate. The duration of solving the considered optimization tasks by the developed method and by the “Topology Optimization” module is almost the same.
The main advantages of the developed method are the possibility of using movable types of contacts and of setting thermal loads on parts. Movable types of contacts are necessary for an adequate description of cylindrical sliding bearings, such as a connecting rod bearing or a piston pin bearing. Heat loads are convenient, for example, for simulating the pre-tightening of bolts or studs.

5 Conclusion

The proposed method allows searching for a new part shape within a certain initial model, as well as optimizing the shape of existing products. The program is written in the integrated programming language APDL. The optimization goal function is the minimum part mass. The limitations are related to the performance of the part, described by its strength or stiffness. The ability to install a new part into an existing structure is provided both at the stage of preparing the initial model and when converting an optimized finite element model to a solid model. Thus, the proposed method is quite universal and can be used for various tasks related to finding the optimal shape of parts.
The optimized shape of a part often turns out to be quite complicated, and its production by traditional methods is likely to be too complicated and expensive. Such parts can be produced using additive technologies, although in this case their cost remains high.
The duration of the optimization process is almost completely determined by the time spent on modeling the stress-strain state of the part. The processing of results and the switching of elements on/off takes several minutes, even if the finite element model contains a large number of elements.
In the three test tasks, the developed algorithm showed the same results as the “Topology Optimization” module of the ANSYS Workbench package. The observed small difference is due to the fundamental difference between the optimization algorithms. At the same time, the developed algorithm has several advantages that are important when optimizing the main parts of piston internal combustion engines.

References
1. Shenoy, P., Fatemi, A.: Connecting Rod Optimization for Weight and Cost Reduction. SAE
Technical Paper 2005-01-0987 (2005). https://doi.org/10.4271/2005-01-0987
2. Charkha, P.G., Jaju, S.B.: Analysis & optimization of connecting rod. In: Second
International Conference on Emerging Trends in Engineering and Technology, ICETET
2009, pp. 86–91 (2009). https://doi.org/10.1109/ICETET.2009.30
3. Bin, Z., Lixia, J., Yongqi, L.: Finite element analysis and structural improvement of diesel
engine connecting rod. In: Second International Conference on Computer Modelling and
Simulation, pp. 175–178 (2010). https://doi.org/10.1109/ICCMS.2010.238

4. Hou, X., Tian, C., Fang, D., Peng, F., Yan, F.: Sensitivity analysis and optimization for
connecting rod of LJ276M electronic gasoline engine. In: Computational Intelligence and
Software Engineering (2009). https://doi.org/10.1109/CISE.2009.5363219
5. Limarenko, A.M., Romanov, A.A., Aleksejenko, M.A.: Optimizaciya shatuna avtomo-
bil'nogo dvigatelya metodom konechnyh ehlementov (Optimization of the connecting rod of
a car motor by the method of final elements). In: Trudy Odesskogo Politekhnicheskogo
Universiteta (Proceedings of the Odessa Polytechnic University), no. 2(39), pp. 98–100
(2012). (in Russian)
6. Roos, D., Nelz, J., Grosche, A., Stoll, P.: Workflow-Konzepte zum benutzerfreundlichen,
robusten und sicheren Einsatz automatischer Optimierungsmethoden. In: 21th CAD-FEM
Users’ Meeting International Congress on FEM Technology (2003)
7. Bhandwale, R.B., Nath, N.K., Pimpale, S.S.: Design and analysis of connecting rod with
abaqus. Int. J. Recent Innovation Trends Comput. Commun. 4(4), 906–912 (2016)
8. Clarich, A., Carriglio, M., Bertulin, G., Pessl, G.: Connecting rod optimization integrating modeFRONTIER with FEMFAT. In: 6th BETA CAE International Conference (2016). https://www.beta-cae.com/events/c6pdf/12A_3_ESTECO.pdf
9. Zuo, Z.H., Xie, Y.M.: A simple and compact Python code for complex 3D topology optimization. Adv. Eng. Softw. 85, 1 (2015). https://doi.org/10.1016/j.advengsoft.2015.02.006
10. Moezi, S.A., Zakeri, E., Bazargan-Lari, Y., Zare, A.: 2&3-dimensional optimization of
connecting rod with Genetic and modified Cuckoo optimization algorithms. IJST Trans.
Mech. Eng. 39(M1), 39–49 (2015)
11. Shaari, M.S., Rahman, M.M., Noor, M.M., Kadirgama, K., Amirruddin, A.K.: Design of
connecting rod of internal combustion engine: a topology optimization approach. In:
National Conference in Mechanical Engineering Research and Postgraduate Studies,
pp. 155–166 (2010)
12. Fonseka, S.: Development of new structural optimization methodology for vehicle
crashworthiness. Honda R&D Tech. Rev. 22(2), 59–65 (2010)
13. Jia, D., Wu, K., Wu, S., Jia, Y., Liang, C.: The structural analysis and optimization of diesel engine connecting rod. In: International Conference on Electronic & Mechanical Engineering and Information Technology, pp. 3289–3292 (2011). https://doi.org/10.1109/EMEIT.2011.6023712
14. García, M.J., Boulanger, P., Henao, M.: Structural optimization of as-built parts using
reverse engineering and evolution strategies. Struct. Multidisc. Optim. 35, 541 (2008).
https://doi.org/10.1007/s00158-007-0122-6
15. Boehm, P., Pinkernell, D.: Topology optimization of main medium-speed diesel engine
parts. In: CIMAC Congress, Bergen (2010)
16. Ogata, Y., Suzuki, S., Iijima, Y.: Optimization method for reduction of transmission housing
weight. Honda R&D Tech. Rev. 16(2), 103–108 (2004)
17. Shedge, V.A., Munde, K.H.: Optimization of connecting rod on the basis of static & fatigue
analysis. IPASJ Int. J. Mech. Eng. (IIJME) 3(5), 7–13 (2015)
18. Myagkov, L.L., Chirskiy, S.P.: The implementation of the BESO method for topology
optimization in ANSYS APDL and its application for optimization of the connecting rod shape
of a locomotive diesel engine. In: Proceedings of Higher Educational Institutions. Machine
Building, no. 11, pp. 38–48 (2018). https://doi.org/10.18698/0536-1044-2018-11-38-48
19. Myagkov, L., Chirskiy, S., Panchenko, V., Kharchenko, V., Vasant, P.: Application of the topological optimization method of a connecting rod forming by the BESO technique in ANSYS APDL. In: Intelligent Computing & Optimization. Advances in Intelligent Systems and Computing, vol. 1072, pp. 239–248 (2020). https://doi.org/10.1007/978-3-030-33585-4_24
Method for Optimizing the Maintenance
Process of Complex Technical Systems
of the Railway Transport

Vladimir Apatsev1, Victor Bugreev1, Evgeniy Novikov1,


Vladimir Panchenko1,2(&), Anton Chekhov1, and Pavel Chekhov1
1
Russian University of Transport, Obraztsova st. 9, 127994 Moscow, Russia
pancheska@mail.ru
2
Federal Scientific Agroengineering Center VIM, 1st Institutskiy passage 5,
109428 Moscow, Russia

Abstract. The article discusses one of the ways to improve the reliability of
technical systems by optimizing the maintenance process. The structure of the
engineering method for finding the optimal frequency of maintenance is pro-
posed. Practical recommendations have been developed to optimize the main-
tenance process of the “PALM” automatic identification system based on the
proposed method.

Keywords: Transport · Maintenance · Method · Optimization · Reliability indicators

1 Introduction

The development of an engineering method for calculating scientifically based periods for the maintenance of complex systems requires an integrated solution of particular tasks. The variety and complexity of the applied mathematical apparatus make it difficult to create a single analytical model. The most acceptable approach to solving this problem is the construction of a model with a modular structure and its computer implementation [1]. The degree of modular decomposition is determined mainly by the nature of the tasks to be solved and the possibility of their separate use.

2 Block Diagram of the Method for Optimizing the Parameters of the Maintenance System

Figure 1 shows the structural diagram of a method for optimizing the parameters of a
maintenance system. It is an algorithm for performing work in solving the problem of
optimizing the maintenance process of complex systems.
In blocks 1, 2, 5 the stages of work on compiling a meaningful description of the
operation of the technical system are displayed. Here, individual operational parameters
are highlighted and the corresponding functional units and elements are set.


Fig. 1. Block diagram of the method for optimizing maintenance periods

In blocks 3, 4, 6, 7 the failure rates of individual functional units are determined, and the type of the distribution functions is identified taking into account the actual operating modes, the presence of a time reserve and environmental factors.
In block 8 the optimization criterion is selected (the maximum availability coefficient or the minimum average unit costs).
In block 9 the fulfillment of the necessary and sufficient conditions for the existence of an optimal maintenance periodicity is checked. When these conditions are met, it is concluded that maintenance after a finite time is needed; otherwise, maintenance is not practical.
In block 10 analytical expressions are obtained for the selected optimization criterion.
In block 11 the optimal maintenance frequency is determined by solving the equations with respect to T (the maintenance period) graphically or using numerical methods.
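As an illustration of block 11, the following sketch finds the period T maximizing an availability coefficient by a simple grid search. The availability model shown (exponential failures detected only at the next scheduled maintenance of duration t_m) is our own textbook-style assumption, not the expression derived within the method:

```python
import math

def availability(T, lam=1e-3, t_m=2.0):
    """Illustrative availability coefficient for maintenance period T (hours):
    expected uptime within a period divided by the full cycle length."""
    mean_up = (1.0 - math.exp(-lam * T)) / lam   # E[min(time-to-failure, T)]
    return mean_up / (T + t_m)

def optimal_period(f, t_lo=1.0, t_hi=2000.0, n=20_000):
    """Grid search for the period T maximizing the availability function f."""
    return max((t_lo + (t_hi - t_lo) * k / n for k in range(n + 1)), key=f)

T_opt = optimal_period(availability)   # interior maximum of the model above
```

Any analytical availability or unit-cost expression produced in block 10 can be substituted for the illustrative `availability` function.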
Thus, the obtained method solves the general problem: determining the period of maintenance of a complex technical system that is optimal from the point of view of the maximum availability factor or the minimum average unit costs. The modularity of the model and, as a result, its transparency make it a convenient tool not only for developers but also for engineering personnel. With this structure, the model becomes invariant with respect to the choice of the criterion for solving the problem (in the sense of the independence of the model blocks from the block for calculating the criterion value). The ability to modify blocks makes this model a fairly universal means of solving the problems of optimizing the maintenance process of complex systems. The convenience of computer implementation allows considering the model as a possible element of an automated control system for the operational reliability of train support systems and railway infrastructure.

3 Monitoring of the Technical Condition of the Technical System

The maintenance of any technical system or facility begins with the control of the
technical condition. The task of determining the technical condition of a technical
system is extremely important for the following reasons:
1. The determination of the technical condition of a technical system is the stage after
which it is concluded that it is advisable to carry out maintenance and select specific
maintenance operations;
2. If the technical system fails during its intended use, then the possible damage
associated with this failure greatly exceeds the costs associated with the
maintenance;
3. By itself, control of a technical system is always associated with time and material
costs, which are always desirable to minimize, but not to the detriment of the
sufficiency and reliability of information about the technical condition of the
facility.
These factors make the task of choosing the initial operations difficult, multifaceted and responsible. The optimization task is to find a number of initial maintenance operations that would take a minimum of time while the information received would completely characterize the technical condition of the technical system under study.
The selection of the required number of initial maintenance operations can be carried out in various ways. In the simplest case, the task is solved on the basis of graph theory [2, 3]. This theory allows, based on the analysis of the functional diagram of a particular device, selecting a set of determining parameters for monitoring the technical condition and determining the sequence of their control. The choice of parameters for monitoring the technical condition based on the analysis of the functional diagram of the technical system as a whole presents considerable difficulty. Therefore, the selection of parameters for monitoring the technical condition is best carried out for individual functional devices.
Denoting the set of outputs of functional devices by X = {x_i}, i = 1…n, and the set of external inputs, which are necessarily controlled, by U = {U_v}, v = 1…m, we seek a set of outputs K ⊆ X that has to be monitored to assess the technical condition of the functional device. We represent each output x_i ∈ X as a vertex of an oriented graph G(X, V), where X is the set of vertices and V is the set of arcs.
The principle of constructing the graph is as follows: if the output x_i is the input of a block whose output is x_j, then the two vertices x_i and x_j are connected by an arc (x_i, x_j). From the constructed oriented graph of outputs, the smallest set of controlled parameters is determined, with the help of which the technical state of the functional device is evaluated. Finding the minimum number of monitored parameters reduces to finding the smallest externally stable set in the output graph. An externally stable set K is a set of vertices of the graph into which arcs enter from all other vertices that do not belong to K. The parameters of the outputs belonging to an externally stable set fully characterize the technical state of the functional device, and their number will be minimal [4].
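A small Python sketch of this step (our own illustration): a greedy approximation of the smallest externally stable set of an output digraph. For the small graphs of individual functional devices an exhaustive search over subsets is equally feasible:

```python
def external_stable_set(vertices, arcs):
    """Greedy externally stable set K: every vertex outside K has an arc
    leading into K, so monitoring the outputs in K covers all outputs."""
    succ = {v: set() for v in vertices}
    for tail, head in arcs:               # arc (x_i, x_j) of the output graph
        succ[tail].add(head)
    uncovered, K = set(vertices), set()
    while uncovered:
        # pick the vertex that newly covers the most outputs
        best = max(vertices,
                   key=lambda v: sum(1 for u in uncovered
                                     if u == v or v in succ[u]))
        K.add(best)
        uncovered -= {u for u in uncovered if u == best or best in succ[u]}
    return K

K = external_stable_set(["x1", "x2", "x3"], [("x1", "x2"), ("x2", "x3")])
# -> {"x2", "x3"}: x1 is covered through the arc (x1, x2)
```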

The specified method allows selecting the minimum number of initial operations for monitoring a functional device during maintenance; performing them makes it possible to judge the technical condition of the device reliably.
In addition to the minimum number of initial maintenance operations, it is necessary to determine the sequence of their implementation. The choice of the optimal sequence is a rather complicated mathematical task, for whose solution dynamic programming methods are usually used [5, 6]. For practical purposes, it is convenient to use a method in which each previous maintenance operation provides information on the technical condition of the largest number of units or functional devices.
To determine the sequence of maintenance operations, a preference function and preference rules are used. As the preference function, the following ratio can be used:

$$F_{PREF}(l_i) = \frac{l_i}{n}, \qquad (1)$$

where $l_i$ ($i = 1, \ldots, n$) is the precedence index of the output $x_i$ and $n$ is the number of blocks in the functional device. Obviously, the closer $F_{PREF}(l_i)$ is to unity, the greater the number of blocks controlled by the operation and, therefore, the greater the weight this control operation carries. The selected maintenance operations should be carried out in accordance with the priority series, arranged in descending order of the values of $F_{PREF}(l_i)$.
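For illustration, ordering the selected operations by Eq. (1) is a one-liner; the precedence indices below are hypothetical:

```python
def maintenance_order(precedence, n):
    """Sort operations by the preference function F_PREF(l_i) = l_i / n."""
    return sorted(precedence, key=lambda op: precedence[op] / n, reverse=True)

order = maintenance_order({"x1": 6, "x2": 8, "x3": 3, "x4": 5}, n=8)
# -> ['x2', 'x1', 'x4', 'x3']: operations covering more blocks come first
```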

4 Automatic Identification System “PALM”

The developed method was applied to optimize the maintenance periods of the automatic identification system “PALM”, which is the information basis of the automated control systems for railway transport and freight traffic.
The automatic identification system is intended for the automatic registration of rolling stock passing pre-selected reference points along the route [5].
For this, the entire rolling stock is equipped with on-board encoders that carry information about each moving object. Coded on-board sensors operate on the principle of reflection of the irradiating signal of a reader, amplitude-modulated by an identification code. Wheel sensors record the moments when a wheelset passes a given point on a path segment and set the pace for the entire data reading process.
At reference points, the primary elements of automatic information collection are installed: reading points with irradiating and reading equipment, which automatically read information from the coded on-board sensors of the rolling stock. The composition of a reading point includes (Fig. 2):
– irradiating and reading equipment, consisting of reader 1 and antenna 2;
– controller for counting axles of wheelsets 3;
– cold-resistant TGSA modem 4 with power supply 6;
– two devices for fixing wheelsets 7, 8;
– two-channel power supply system 9;
– reading point heater 5.

The reading point equipment (Fig. 2) is a two-channel system for polling the parameters of passing rolling stock: a high-frequency channel for polling the on-board encoders installed on locomotives and cars, consisting of antenna 2, reader 1 and its unit for forming the final information from the sensor; and a low-frequency channel for fixing the moments when the wheels of the rolling stock pass over the wheelset fixing devices, which includes the two wheelset fixing devices 7, 8, the axle counting controller 3 and the reader assembly forming the final message about the moments of passage of the wheels.

Fig. 2. Functional diagram of the reading point of the automatic identification system “PALM”

The reading point emits radio signals only in those periods when the rail circuit of
the block section to which the emitter is “tied” is occupied by rolling stock. In the
absence of rolling stock, the reading point is in standby mode.
The received information is decrypted and transmitted with fixed time stamps via a
local communication line as primary information to a line level concentrator, which is
built on the basis of a personal electronic computer.
The line concentrator collects information from all reading points of the railway
junction, processes the received information and transmits the final message about the
rolling stock via the network of the “Russian Railways” Company to the road level
concentrator. It is at this level that the operational management of traffic flows is
ensured.
In the personal electronic computer of the hub of the automatic identification system, a message is generated with a list of the numbers of passed rolling stock, their location, time and direction of movement. By the signals from the electronic pedals, the serial numbers of each mobile unit are also recorded, which makes it possible to take into account rolling stock units with a missing coded on-board sensor. The message is transmitted to the road computer center and to the main computer center.
In the automatic identification system “PALM”, an algorithm using temporal redundancy is implemented to control the operability of the reading point equipment. For this, a passive test identifier is introduced into the reader, located opposite the reader's antenna on the opposite side of the track. The reader periodically turns on the radiation. In the absence of rolling stock in front of the reader, information is read from the test identifier, after which a decision is made on the health of the reader. Since the stationary identifier is part of the reader, if no information is obtained during its repeated reading, the maintenance personnel determine which elements are faulty. After a decision is made on the health of the reader, the radiation is turned off, and it is turned on in the standard mode when rolling stock approaches the reader. Thus, the operability of the radiating and reading equipment is determined.
The automatic identification system also implements an algorithm for verifying the operability of the equipment of a reading point by software polling of the state of the functional blocks as rolling stock approaches. Such a check is started when rolling stock occupies the rail circuit of the block section to which the emitter is “tied”. In this case, a time reserve is also used, whose value is determined by the interval from the moment the block section is occupied by the rolling stock to its passage through the reading point.
Thus, the “PALM” automatic identification system meets the needs of the “Russian Railways” Company for the necessary electronic control devices for rolling stock and transported goods.

5 Automation of Obtaining the Necessary Initial Data to Optimize Maintenance Periods

An important place in solving the task of optimizing the maintenance parameters is given to the preparation of the initial data, since the practical implementation of the results of theoretical studies largely depends on their accuracy and correctness. For the developed model, several groups of data can be distinguished.
The first group includes the information on the basis of which the reliability characteristics of the technical system are obtained. It includes circuits (electrical, schematic, block diagrams, etc.), operating conditions, the presence of various types of reserve, and the correspondence tables of functional blocks of the technical system and operational parameters. The practice of operating complex technical systems, including train support systems and railway infrastructure, shows that the reliability of their operation is largely determined by the correct, rational, high-quality organization of the maintenance process.
Therefore, the second group of data includes the temporal and probabilistic characteristics of the maintenance process. These characteristics can be determined from technical documentation and maintenance instructions, which define the frequency and labour intensity of individual operations, and from expert assessments of the probabilities of detecting malfunctions, failures and the alarms about them.
The third group includes indicators of the quality of functioning of individual functional blocks/nodes/elements and of the entire system as a whole.
In each of these groups, data can be obtained based on the processing of statistical information or an expert survey.

In accordance with Fig. 1, to solve the problem of optimizing the maintenance parameters, it is necessary to have information about the existing system and its maintenance parameters, as well as information about the failures of the technical system.
Information on the maintenance schedule and inspection procedures for the
equipment of the “PALM” automatic identification system is contained in [7]. In [8],
time norms are provided for planning, organizing, and regulating the labor of electrical
engineers and electricians for servicing and repairing alarm, centralization and locking
devices engaged in maintenance of the equipment of a reading point of an automatic
identification system.
Statistical information about the failures of the automatic identification system
“PALM” was obtained from specialists of the “Industry Center for the Implementation
of New Equipment and Technologies” Company of the “Russian Railways” Company,
who were involved in the development of the automatic identification system and
monitoring the operation of the automatic identification system on the railway network.
According to [9], the initial data for calculating the necessary reliability indicators should be generated using automated systems, where the main attention should be paid not only to failure statistics, but in general to monitoring the completeness and correctness of the technological processes of operation and repair of technical equipment, as well as the process of organizing transportation. Currently, in railway transport such a system is the “Integrated automated system for recording, monitoring the elimination of technical equipment failures and analyzing their reliability”.
Within the framework of this system, the entire technological chain has been implemented, starting from fixing the fact of failure to eliminating the cause, assigning responsibility, and forming the materials for the investigation of the failure.
At present, the system under consideration contains data on technical equipment failures from the automated system for maintaining the schedule of executed train traffic “Ural - All-Russian Scientific Research Institute of Railway Transport” and from a number of industrial automated systems. The main source of primary data on technical equipment failures is the above-mentioned train traffic schedule system, which accounts for about 70% of the data recorded in the “Integrated automated system for recording, monitoring the elimination of technical equipment failures and analyzing their reliability”.
The above-mentioned train execution schedule system is designed to control the
progress of the transportation process from automated workstations of the dispatching
and managing apparatus at all levels of operational management. In addition, the
information capabilities of the system are used by employees of other services and
departments. It includes the functions of forecasting, planning, control, regulation,
accounting and analysis.
The most important advantage of the “Integrated automated system for recording,
monitoring the elimination of technical equipment failures and analyzing their relia-
bility” is that the system allows receiving dynamically updated information about
technical equipment and equipment infrastructure failures directly.

6 Set of Operations for Maintenance

Guided by the above approach, on the basis of the analysis of the functional diagrams of the automatic identification system “PALM” and of [10, 11], a set of maintenance operations was selected. At the same time, the results of an expert survey of specialists of the “Industry Center for the Implementation of New Equipment and Technologies” Company on the operation of the automatic identification system were taken into account. The set of operations for the maintenance of the automatic identification system is shown in Table 1.

Table 1. Method for the maintenance of the automatic identification system

1. Reading point: monitoring the connection of the hub with the reading points (2 min)
2. Reading point internal inspection (10 min)
3. Checking the main parameters of the communication line of the hub and the reading point (14 min)
4. Checking the correct operation of the track devices of the automatic identification system (4 min)
5. Floor readers cabinet: voltage measurement in the cabinet of floor readers from the main and backup power sources with the transfer of power to the backup source and back (5 min)
6. Inspection and assessment of the aboveground part of the cabinet of floor readers (8 min)
7. Inspection and assessment of the underground part of the cabinet of floor readers (foundation) in the anode and alternating zones (60 min)
8. Inspection and assessment of the underground part of the cabinet of floor readers (foundation) in areas with alternating current electric traction and on non-electrified lines (60 min)
9. Reader: checking the reader settings with the assembly of the verification scheme of the irradiating and reading equipment (45 min)
10. Sensors: external inspection and verification of sensors (7 min)
11. Wheel axle counting controller: checking the operation of the wheel axle counter (1.5 min)
12. Measurement of voltages at the test leads of sensors (2.5 min)
13. Rail chain: checking rail circuits with automatic identification system equipment at the station (13 min)
14. Measurement of insulation resistance of the rail line (ballast) in the area with the automatic identification system (21 min)
15. Track transformer box: checking the internal condition of the track transformer box (18 min)
16. Cable rack: checking the internal state of the cable rack (9 min)
17. Cable glands: checking the status of cable glands with opening (20 min)
18. Ground loop: checking the ground loop and ground circuits (4 min)
19. Checking the correct connection of the grounding devices of the floor readers cabinet (2 min)
20. Checking and adjusting lightning protection devices (7 min)
21. Checking the status of the visible elements of the grounding devices of the floor readers cabinet (4.5 min)
22. Selective digging and inspection of the elements of grounding devices located in the ground (37 min)
23. Checking the condition and serviceability of spark gaps (4 min)
24. Checking the condition and serviceability of diode earthing switches of floor reading devices (14 min)

Based on the developed method, a study of the maintenance process of the automatic identification system was conducted, and the optimal values of the maintenance frequency of the main functional devices of the automatic identification system were determined; they are summarized in Table 2.

Table 2. Maintenance periods for the main functional devices of the automatic identification system, used in practice before optimization, and the optimal periods

№   Functional device                Period before optimization, days   Optimal period, days
1   Reading point                    30                                 43
2   Floor readers cabinet            120                                120
3   Reader                           120                                120
4   Sensors                          7                                  10
5   Wheel axle counting controller   120                                178
6   Rail chain                       30                                 22
7   Track transformer box            30                                 24
8   Cable rack                       180                                192
9   Cable glands                     180                                278
10  Ground loop                      180                                200
11  System as a whole                360                                450

Analysis of the results allows the following conclusions to be drawn:

1. The maintenance period of functional devices 1, 4, 5, 8, 9, 10 can be significantly increased. This indicates that the maintenance periods established by experts and accepted for operation lie in the region where the availability factor is still growing, which makes it possible to lengthen the maintenance interval.
2. The maintenance period of functional devices 2, 3 is advisable to keep the same.
3. The maintenance period of functional devices 6, 7 is recommended to be reduced, which is associated with a relatively low availability factor: the obtained values correspond to the region “to the right” of the optimal value, where a further increase of the maintenance period decreases the availability factor.
4. Orientation to the availability factor specified in the technical conditions allows increasing the maintenance periods for individual functional devices of the automatic identification system and for the system as a whole.

7 Conclusion

The proposed method for optimizing the parameters of a maintenance system is applicable to systems of various structures, sizes and characteristics. The modularity of the method and the convenience of its computer implementation allow considering the method as a possible element of an automated control system for the operational reliability of train support systems and railway transport infrastructure facilities.
Based on the proposed method, a way of automating the calculation of reliability indicators and of the frequency of technical maintenance of train support systems and railway transport infrastructure facilities is proposed, using statistical information obtained from the “Integrated automated system for recording, monitoring the elimination of technical equipment failures and analyzing their reliability” and the economic automated systems of the “Russian Railways” Company. An example of the application of the method for optimizing the maintenance process of the automatic identification system “PALM” is considered. Measures are proposed to adjust the existing maintenance system in order to achieve the maximum availability factor.

References
1. Novikov, E.V.: Nauchno-metodicheskij podhod k razrabotke inzhenernoj metodiki rascheta
pokazatelej nadezhnosti [Scientific and methodological approach to the development of an
engineering methodology for calculating reliability indicators]. Sbornik nauchnyh trudov
kafedry “Transportnye ustanovki” [The collection of scientific works of the department
“Transport units”], T. 1, 111–116 (2008). (in Russian)
2. Rainshke, K., Ushakov, I.A.: Ocenka nadezhnosti sistem s ispol'zovaniem grafov
[Reliability assessment of systems using graphs]. Moskva: Radio i svyaz' [Moscow: Radio
and communications], 208 p. (1988). (in Russian)

3. Herzbach, I.B., Kordonsky, H.B.: Modeli otkazov [Failure Models]. Moskva, Soviet Radio
[Moscow, Sovetskoe radio], 168 p. (1966). (in Russian)
4. Latinskij, S.M., SHarapov, V.I.: Teoriya i praktika ekspluatacii radiolokacionnyh sistem
[Theory and practice of the operation of radar systems]. Moskva, Sovetskoe radio [Moscow,
Soviet Radio], 432 p. (1970). (in Russian)
5. Beichelt, F., Franken, N.: Nadezhnost' i tekhnicheskoe obsluzhivanie. Matematicheskij
podhod [Reliability and maintenance. The mathematical approach]. Moskva, Radio i svyaz'
[Moscow, Radio and communications], 392 p. (1988). (in Russian)
6. Bezrodniy, B.F., Gorelik, A.V., Nevarov, P.A., Shaliagin, A.V.: Principy upravleniya
nadezhnost'yu sistem zheleznodorozhnoj avtomatiki i telemekhaniki [The principles of
reliability management of railway automation and telemechanics systems]. Avtomatika,
svyaz', informatika [Automation, communications, informatics], № 7, 13–14 (2008). (in
Russian)
7. Legkij, N.M., Kozlov, V.I.: Postroenie punktov schityvaniya SAI “Pal'ma” [Construction of
reading points of the AIS “Palm”]. Opyt proektirovaniya, vnedreniya i ekspluatacii
[Experience in design, implementation and operation], 55–67 (2007). (in Russian)
8. Punkt schityvaniya sistemy avtomaticheskoj identifikacii “Pal'ma” [Reading point of the
automatic identification system “Palm”]. Rukovodstvo po ekspluatacii. Moskva: “Otraslevoj
centr vnedreniya novoj tekhniki i tekhnologij” [Operation manual. Moscow, “Industry
Center for the Implementation of New Equipment and Technologies”] (2002). https://pandia.ru/text/77/357/77520.php (in Russian)
9. Funkcional'naya strategiya obespecheniya garantirovannoj nadyozhnosti i bezopasnosti
perevozochnogo processa [Functional strategy to ensure guaranteed reliability and safety of
the transportation process]. Moskva, “Rossijskie zheleznye dorogi” [Moscow, “Russian
Railways”] (2007). https://annrep.rzd.ru/reports/public/ru?STRUCTURE_ID=4422. (in
Russian)
10. Vremennye otraslevye normy vremeni i normativy chislennosti na tekhnicheskoe
obsluzhivanie punktov schityvaniya sistemy avtomaticheskoj identifikacii podvizhnogo
sostava “Pal'ma” [Temporary industry time standards and personnel standards for
maintenance of reading points of the automatic identification system of rolling stock
“PALM”]. Moskva, “Centr organizacii truda i proektirovaniya ekonomicheskih normativov”
“Rossijskih zheleznyh dorog” [Moscow: “Center for the organization of labor and designing
economic standards” of “Russian Railways”] (2004). https://www.railway.kanaries.ru/index.
php?act=attach&type=post&id=8065. (in Russian)
11. Instrukciya po tekhnicheskomu obsluzhivaniyu i remontu ustrojstv elektrosnabzheniya
signalizacii, centralizacii, blokirovki i svyazi na federal'nom zheleznodorozhnom transporte
[Instructions for the maintenance and repair of power supply devices for signaling,
centralization, blocking and communication on federal railway transport]. Departament
elektrifikacii i elektrosnabzheniya Ministerstva putej soobshcheniya Rossijskoj Federacii,
Moskva, Transizdat [Department of Electrification and Power Supply of the Ministry of
Railways of the Russian Federation, Moscow, Transizdat] (2002). https://files.stroyinf.ru/Data2/1/4293764/4293764042.htm. (in Russian)
Optimization of Power Supply System of Agricultural Enterprise with Solar Distributed Generation

Yu. V. Daus1, I. V. Yudaev1, V. V. Kharchenko2, and V. A. Panchenko3,2

1 FSBEI HE Saint-Petersburg State Agrarian University, Pushkin, Saint-Petersburg 196601, Russian Federation
zirochka2505@gmail.com
2 Federal Scientific Agroengineering Center VIM, Moscow 109428, Russian Federation
kharval@mail.ru
3 Russian University of Transport, Obraztsova st. 9, Moscow 127994, Russian Federation
pancheska@mail.ru

Abstract. Today the activities of most enterprises are partially, and in some organizations fully, automated, which is why power outages lead to the interruption of all technological processes, inflict financial damage on enterprises and cause various problems in the communal sector. One option for ensuring uninterrupted energy supply and improving the quality and reliability of power supply to consumers in rural and remote areas, in cases of accidents in distribution networks, is the use of renewable energy sources. The purpose of the research is to optimize the configuration of the power supply system of an agricultural enterprise by using solar power plants. The object of the research is the power supply system of a production facility of the agro-industrial complex. The number of photovoltaic modules that can be placed on the roofs of the agricultural enterprise to achieve maximum electrical energy generation during the day was calculated. On the basis of experimentally obtained currents and voltages, the generated power of the photovoltaic modules was determined. The ratio of the maximum generated power of a south-oriented module located at a tilt angle of 15° to that of a south-oriented module located at the optimal tilt angle of 39° was determined, and its average value was calculated. The total daily production of electrical energy by all the solar power plants is 885.22 kWh per day for the given date (November 1). The ratio of daily electricity generated by the solar power plants to that consumed by the production facility in the selected city of Russia (Zernograd) is 71.8%.

Keywords: Photovoltaic module · Load graph · Spatial orientation of photovoltaic module · Electric power generation

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 216–223, 2021. https://doi.org/10.1007/978-3-030-68154-8_22

1 Introduction

The main distribution networks in rural areas are overhead power transmission lines with a voltage of 0.38 kV, which today have a high degree of technical and economic wear. Annual growth in electricity consumption is observed in certain sectors of industry, in agriculture and in the household sector, where the overwhelming majority of consumers have loads of a nonlinear and non-sinusoidal nature [1, 2]. In addition, the quality of electricity and the reliability of electricity supply to rural consumers, especially remote ones, remain at a low level. There are still territories in the world, and in Russia, that have no traditional power supply systems at all, even inefficient ones [3]. The reason for this state of affairs is a number of economic and technical difficulties faced by both consumers and utilities.
Today the activities of most enterprises are partially, and in some organizations fully, automated; therefore, power outages lead to the interruption of all technological processes, inflict financial damage on enterprises and cause various problems in the utilities sector.
Besides, the traditional energy sector has a very strong impact on the ecological situation on Earth and is increasingly blamed for global warming and climate change.
One of the options for uninterrupted energy supply and for improving the quality and reliability of power supply to consumers in rural and remote areas is considered to be the use of renewable energy sources [4] united in small microgrids [5]. The use of solar energy for the operation of solar power plants (SPP) and photovoltaic installations (PI) is a profitable, environmentally friendly and relatively inexpensive source of primary energy. Today solar power plants are used not only abroad, but also in Russia. Photovoltaic modules (PM), assembled on the basis of solar cells of various designs with high characteristics [6], are now actively used in industry and in a number of economic sectors [7].
Besides, a lot of modern approaches to computing and optimization are now applied by a growing number of scientists, pedagogues, advisors, decision-makers and practitioners [8]. Very strong and smart (“intelligent”) computational techniques have arisen for handling a very large number of real-world challenges and crises, in particular in relation to the main areas of optimization in theory and practical use [9].
Purpose of Research
The purpose of the research is to optimize the configuration of the power supply system of an agricultural enterprise by means of solar power plants, to provide additional technological capacities.

2 Materials and Methods

To carry out the analysis and substantiate the decisions made, traditional general methods of assessing the solar potential of a particular territory were used [10, 11], together with the methods of design, calculation and selection of the configuration of the internal power supply network of production facilities adopted in the electric power industry, an algorithm for calculating power consumption, and methods for assessing the power generated by photovoltaic installations [12].
To solve the set research problem, it is necessary to choose the location and position and to determine the parameters of a photovoltaic installation at which the consumption of electrical energy from the network will be minimal. The theoretical basis for such a search is the mathematical formulation of the optimization problem using the criterion of maximum generation of electrical energy by the photovoltaic power plants:

$$\sum_{i=1}^{m} W_i = \sum_{i=1}^{m} R_{\beta\gamma i} \cdot \eta_i \cdot S_i \cdot N_i \rightarrow \max$$

where $W_i$ is the electric power generation of the $i$-th generation source; $m$ is the number of distributed generation sources (photovoltaic power plants), pcs.; $R_{\beta\gamma i}$ is the intensity of solar radiation on the receiving surface, oriented at tilt angle $\beta_i$ relative to the horizon and $\gamma_i$ relative to the cardinal points, kWh/m²; $S_i$, $\eta_i$ are the area and efficiency of the photovoltaic modules used for the distributed generation source, m² and relative units respectively; $N_i$ is the number of photovoltaic modules at the power plant, pcs.
Since the purpose of the research is to increase the efficiency of the power supply system of an agricultural facility, a number of restrictions are imposed on the choice of parameters and composition of the photovoltaic power plants, the sources of distributed generation. An agricultural facility is an area with dense buildings or with extensive areas set aside for agricultural activities. The construction of photovoltaic installations on the ground can therefore cause a number of difficulties, namely shading of modules by nearby structures and buildings, the impossibility of removing the land from cultivation, blocking of territories needed for the passage of equipment, etc. It is advisable to place the generating units on the structures of existing buildings and constructions; although this imposes certain restrictions on their parameters, it makes their construction and maintenance more convenient [13].
Therefore, the following restrictions must be taken into account when modeling:
1) the dimensions of the roof where the photovoltaic installation is to be placed: a × b, m;
2) the spatial orientation of the photovoltaic installation: tilt angle β_i, ° and orientation to the cardinal points γ_i, °;
3) the coordinates of the location of the photovoltaic power plant: φ, °N, λ, °E.
The proposed approach is implemented in the “Search for Solutions” (Solver) add-in of Microsoft Excel. The result of solving the optimization problem is the location and parameters of the sources of solar distributed generation at which the consumption of electrical energy from the network is minimal.
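To make the objective concrete, the following minimal Python sketch evaluates the total daily generation for one roof. It is illustrative only: the per-module daily output (0.772 kWh for a south-facing module at 39° on November 1) and the correction factors 0.77 and 0.89 are taken from the results reported below, while the packing rule `modules_on_roof` is a simple geometric bound we assume, not the authors' Excel model.

```python
# Illustrative sketch of the objective sum_i W_i -> max for one roof.
MODULE_L, MODULE_W = 1.960, 0.991   # module footprint, m (JAM72-365/PR)
W_MODULE_39S = 0.772                # kWh/day, south-facing at 39 deg, Nov 1
K_TILT_15 = 0.77                    # measured 15deg/39deg tilt ratio
K_EAST = 0.89                       # measured east/south orientation ratio

def modules_on_roof(a, b):
    # Geometric upper bound for an a x b roof; real layouts may lose modules
    # to inter-row spacing against self-shading (cf. the Elevator in Table 1).
    return int(a // MODULE_L) * int(b // MODULE_W)

def daily_generation(a, b, tilt=1.0, azimuth=1.0):
    n = modules_on_roof(a, b)
    return n, n * W_MODULE_39S * tilt * azimuth   # pcs, kWh/day

# South-facing warehouse roofs No. 3-5 (11 m x 85 m, 15 deg tilt):
n, w = daily_generation(11, 85, tilt=K_TILT_15)
print(n, round(w, 1))  # 425 modules, ~252.6 kWh/day (Table 1: 252.45,
                       # obtained with the rounded per-module value 0.594)
```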

3 Results and Their Discussion

The analyzed production facility consists of an oil mill, weighing tower, drying and purification tower, intake pit, machine building, hydraulic fracturing unit, laboratory, fuel and lubricants station, garage, warehouses and utility rooms. The facility receives power from a 10/0.4 kV transformer substation and has its own 0.38 kV electrical network. Using the generally accepted methods [12], the power consumption of the electrical equipment of this agricultural enterprise was calculated: in the daytime it is 136.1 kW, and the total daily electricity consumption is 1232.5 kWh.
Based on the data obtained from the energy survey, the daily graph of the power consumption of the studied production facility in the summer-autumn period was constructed (Fig. 1).

Fig. 1. The daily schedule of the power consumption of the production facility in the summer-
autumn period: 1 - lighting, 2 - street lighting, 3 - power equipment, 4 - household equipment, 5 -
office equipment

Since the production facility has a seasonal load, which reaches its maximum in July and August, as well as from September 15 to October 1 and from October 15 to November 1, the total daily electricity consumption during these periods of the year is 1232.5 kWh, while during the rest of the year it decreases to 180–230 kWh. To achieve maximum generation of electrical energy during the day, it is necessary to calculate the number of photovoltaic modules that can be located on the roofs of the pre-selected constructions (Table 1).

The number of PMs located on the roofs of the industrial facilities is limited by their geometric dimensions. The configuration of the rooftop solar power plant was designed using JAM72-365/PR monocrystalline photovoltaic modules from HEVEL, with a rated power of 365 W, rated voltage of 39.2 V, efficiency of 18.8%, and dimensions of 1960 × 991 × 40 mm [13].
The optimal inclination angle of the receiving surface relative to the horizon, at which the module's annual generation of electrical energy is maximum, is 39° for the selected city of Zernograd [14, 15].
The amount of total solar radiation arriving at the receiving area inclined at the optimal angle (39°) on November 1 was determined using the program [16]; the insolation is 2.82 kWh/m².
However, not all roofs are optimally oriented in space. Some of them have tilt angles of 15° and 30° with south and east orientation. As a result, to determine the ratio of the solar insolation at a south-oriented receiving surface located at a tilt angle of 15° to that at a receiving surface located at the optimal angle, a series of experiments was carried out, including the recording of the current-voltage characteristics (I–V characteristics) of three PMs during the day, oriented to the south at inclination angles of 15°, 30° and 39°. Based on the experimentally obtained currents and voltages, the generated power of each PV module was calculated. The ratio of the maximum generated power of the south-oriented PV array located at a tilt angle of 15° to that of the south-oriented PV array located at a tilt angle of 39° was then determined; its average value equals 77%.
To determine the ratio of solar insolation for the eastern orientation of the receiving surface located at a tilt angle of 39° to the southern orientation at the same tilt angle, a series of additional experiments was performed, including taking the I–V characteristics of three PMs oriented to the east, south and west at the optimal inclination angle of 39°. Based on the obtained data, the maximum generated power of the PMs oriented to the east, south and west at the optimal tilt angle of 39° was calculated. The average value of this ratio was 89%.
The working day at this production facility starts at 8:00 (Fig. 1). Due to a number of features of the annual technological processes, the maximum generated power of the projected solar power plant is required by consumers up to November 1 of the calendar year. Thus, it is necessary to determine the minimum distance between the lower points of the PMs in adjacent rows for the time t = 8 h on November 1 at the optimal inclination angle of the receiving area for the geographical point 46.8°N and 40.3°E (Zernograd).
Thus, one PM located at a tilt angle of 39° generates 0.772 kWh per day on November 1. The daily amount of electrical energy generated by the south-oriented photovoltaic module on November 1 at a tilt angle of 15° was determined by applying a factor of 0.77, which takes into account the angular orientation of the PM (15°/39°). One south-facing PM located at a tilt angle of 15° generates 0.594 kWh. The calculation results are summarized in Table 1.

The daily amount of electrical energy generated by the east-oriented PM on November 1 at a tilt angle of 15° was determined by applying the factor of 0.77, which takes into account the angular orientation of the PM (15°/39°), and a factor of 0.89, which takes into account the spatial orientation of the PM (east/south). One east-facing PM located at a tilt angle of 15° generates 0.529 kWh. The calculation results are summarized in Table 1.

Table 1. Daily generation of PMs located on the roofs of the agricultural enterprise's facilities

Building | Tilt angle of the roof βi, ° | Orientation γi, ° | Size, a × b, m | Number of PM, pcs. | Daily generation by SPP, kWh/day
Warehouses No. 3–5 | 15 | 0 | 11 × 85 | 425 | 252.45
Administrative building | 15 | −90 | 12 × 22 | 132 | 69.82
Warehouses No. 1–2 | 0 | 0 | 84 × 24 + 57 × 13 | 393 | 303.39
Elevator | 15 | −90 | 11.8 × 33 | 160 | 84.64
Warehouses No. 6–7 | 15 | −90 | 10.5 × 54 | 270 | 142.83
Oil mill | 15 | 0 | 7.9 × 18 | 54 | 32.08

The total daily production of electricity by all the solar power plants is 885.22 kWh/day for November 1. The ratio of the daily electricity generated by the solar power plants to that consumed by the production facility in Zernograd is 71.8%.
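As a quick arithmetic check of these totals, the per-building values from Table 1 can be summed and compared with the daily demand. The Python snippet below is our illustration (building labels abbreviated by us); the small discrepancy in the last digit is due to rounding in Table 1:

```python
# Sanity check of the totals quoted above, using Table 1's values, kWh/day.
spp = {"Warehouses 3-5": 252.45, "Administrative": 69.82,
       "Warehouses 1-2": 303.39, "Elevator": 84.64,
       "Warehouses 6-7": 142.83, "Oil mill": 32.08}
total = sum(spp.values())
demand = 1232.5                        # daily consumption, kWh (from above)
print(round(total, 2))                 # 885.21 ~ 885.22 kWh/day
print(f"{100 * total / demand:.1f}%")  # ~71.8%
```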

4 Conclusion

The use of renewable energy sources can be considered one of the options for uninterrupted energy supply and for improving the quality and reliability of power supply to consumers in rural and remote areas. The use of solar energy for the operation of solar power plants (SPP) and photovoltaic installations (PI) is a profitable, environmentally friendly and relatively inexpensive source of primary energy. Today solar power plants are used not only abroad, but also in Russia. Photovoltaic modules (PMs) are now actively used in industry and in a number of sectors of the economy [17].
The conducted studies have revealed that, on a day with minimal solar radiation, solar power plants located on differently oriented roof slopes with different tilt angles at seasonal industrial agricultural facilities are able to generate 71.8% of the required daily electricity consumption and to significantly reduce the consumption of electricity from the centralized power supply system.

In the future, it would be interesting to consider the placement of solar generation sources at the studied object using photovoltaic thermal modules that generate not only electrical but also thermal energy [18, 19]. For the agricultural sector, where heat energy accounts for more than 2/3 of the energy balance of consumption, such a consideration seems extremely relevant.

References
1. Hirsch, A., Parag, Y., Guerrero, J.: Microgrids: a review of technologies, key drivers, and
outstanding issues. Renew. Sustain. Energy Rev. 90, 402–411 (2018)
2. Molyneaux, L., Wagner, L., Foster, J.: Photovoltaic/battery system sizing for rural
electrification in Bolivia: considering the suppressed demand effect. Appl. Energy 235,
519–528 (2019)
3. Pode, R., Pode, G., Diouf, B.: Solution to sustainable rural electrification in Myanmar.
Renew. Sustain. Energy Rev. 59, 107–118 (2016)
4. Yudaev, I., Daus, Yu., Zharkov, A., Zharkov, V.: Private solar power plants of Ukraine of small capacity: features of exploitation and operating experience. Appl. Solar Energy 56(1), 54–62 (2020)
5. Adomavicius, V., Kharchenko, V., Valickas, J., Gusarov, V.: RES-based microgrids for
environmentally friendly energy supply in agriculture. In: Proceedings of 5th International
Conference TAE 2013, Trends in Agricultural Engineering, 2–3 September 2013, Prague,
Czech Republic, pp. 51–55 (2013)
6. Kharchenko, V.V., Nikitin, B.A., Tikhonov, P.V.: Estimation and forecasting of PV cells
and modules parameters on the basis of the analysis of interaction of a sunlight with a solar
cell material. In: Proceedings of 4th International Conference TAE 2010, Trends in
Agricultural Engineering, Prague, Czech Republic, 7–10 September 2010, pp. 307–310
(2010)
7. Daus, Yu., Yudaev, I., Taranov, M., Voronin, S., Gazalov, V.: Reducing the costs for
consumed electricity through the solar energy utilization. Int. J. Energy Econ. Policy 9(2),
19–23 (2019)
8. Intelligent Computing & Optimization. In: Conference Proceedings, ICO 2018. Springer, Cham. ISBN 978-3-030-00978-6
9. Intelligent Computing and Optimization. In: Proceedings of the 2nd International Conference on Intelligent Computing and Optimization 2019 (ICO 2019). Springer. ISBN 978-3-030-33585-4
10. Daus, Yu., Yudaev, I.: Estimation of solar energy potential under conditions of urban
development. In: Actual Issues of Mechanical Engineering (AIME 2017). Advances in
Engineering Research, vol. 133, pp. 156–161. Atlantis-Press, Paris (2017)
11. Liu, B., Jordan, R.: Daily insolation on surfaces tilted towards the equator. ASHRAE J. 3
(10), 53–60 (1961)
12. Komenda, T., Komenda, N.: Morphometrical analysis of daily load graphs. Int. J. Electr.
Power Energy Syst. 1(42), 721–727 (2012)
13. Belenov, A., Daus, Y., Rakitov, S., Yudaev, I., Kharchenko, V.: The experience of operation
of the solar power plant on the roof of the administrative building in the town of Kamyshin,
Volgograd oblast. Appl. Solar Energy (English translation of Geliotekhnika) 52(2), 105–108
(2016). https://doi.org/10.3103/S0003701X16020092
14. Hevel Homepage. http://www.hevelsolar.com. Accessed 15 Jan 2020

15. Daus, Yu., Pavlov, K., Yudaev, I., Dyachenko, V.: Increasing solar radiation flux on the surface of flat-plate solar power plants in Kamchatka krai conditions. Appl. Solar Energy 55(2), 101–105 (2019)
16. Daus, Yu., Kharchenko, V., Yudaev, I.: Managing spatial orientation of photovoltaic module to obtain the maximum of electric power generation at preset point of time. Appl. Solar Energy 54(6), 400–405 (2018)
17. Daus, Yu., Yudaev, I.: Designing of software for determining optimal tilt angle of
photovoltaic panels. In: 2016 International Conference on Education, Management and
Applied Social Science (EMASS 2016), pp. 306–309. DEStech Publications Inc., Lancaster
(2016)
18. Harder, E., Gibson, J.: The costs and benefits of large-scale solar photovoltaic power production in Abu Dhabi, United Arab Emirates. Renew. Energy 36(2), 789–796 (2011)
19. Kharchenko, V., Panchenko, V., Tikhonov, P.V., Vasant, P.: Cogenerative PV thermal
modules of different design for autonomous heat and electricity supply. In: Handbook of
Research on Renewable Energy and Electric Resources for Sustainable Rural Development,
pp. 86–120 (2018). https://doi.org/10.4018/978-1-5225-3867-7.ch004
Crack Detection of Iron and Steel Bar Using Natural Frequencies: A CFD Approach

Rajib Karmaker1 and Ujjwal Kumar Deb2

1 Department of Mathematics, Premier University, Chattogram, Bangladesh
rajibcumath@gmail.com
2 Department of Mathematics, Chittagong University of Engineering and Technology, Chittagong, Bangladesh
ukdebmath@cuet.ac.bd

Abstract. Solid composite materials such as steel, fiber plastics, etc. are now widely used in industry and civil construction where a high strength-to-weight ratio is required. Their dynamic behavior is influenced by cracks, body deflection, or other defects in a structural element, which change its stiffness. Therefore, crack identification and the observation of load distribution and deflection in solid components are important for health monitoring systems. The present study provides a vibration-based method for detecting the location of open or hidden cracks, as well as a tool for comparing iron and steel as structural materials. In this analysis, a numerical simulation using the Finite Element Method (FEM) has been performed in the COMSOL Multiphysics software to optimize the frequency and strength of iron and steel bars with two semi-circular surface cracks. It is found that the frequencies are proportional to the increase in load, and the maximum frequency (around 2184.2 Hz for iron and 6197.7 Hz for steel) is observed at the cracked location. Finally, we find that the deflection of the steel body is higher than that of iron, and that steel can tolerate a greater load than iron, deforming without actually breaking, for both cracked and un-cracked bodies. With this approach, very small cracks (minimum 0.05 mm) can be detected in any structural body, and the most suitable solid material can be chosen for a particular structure.

Keywords: Vibration analysis · Deflection · Iron bar · Steel bar · Crack detection · Modal frequency · FEM

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 224–236, 2021. https://doi.org/10.1007/978-3-030-68154-8_23

1 Introduction

The monitoring of structural health and the identification of structural damage are
common areas of study. Infrastructural developments in civil structures and consid-
erable building materials have rapidly evolved around the world. Damage can be
incurred at various stages of the service life of the structure. The natural frequencies
and mode shapes of the structure provide details on the location and the size of the
damage [1]. The unexpected occurrence of damage can lead to catastrophic failure and
therefore poses a potential threat to human life [2]. Cracks are an important measure of
the safety status of infrastructure. Cracking is one of the most common damages that
occurs during the service life of engineering structures and, in some cases, causes a
major reduction in their load-bearing capacity and can even lead to failure. Crack
detection and study of cracked structures are therefore critical and fascinating topics for
researchers. The presence of cracks in the material structures can be detected by using
variations in local stiffness, which have a major impact on the natural frequency and
shape of the mode.
To avoid failures caused by cracks, several researchers have performed broad investigations over the past few decades to develop structural behavior monitoring techniques. An effective methodology was proposed by D. P. Patil and S. K. Maiti [3] to find multiple cracks in a beam using frequency measurements. Their results offer a linear relationship between the damage parameters and the natural frequency of vibration of the beam. A. K. Darpe et al. [4] studied the dynamics of a bowed rotor with a transverse surface crack. They concluded that the amplitude and directional nature of the higher harmonic components of the bowed rotor remain unchanged, but the rotating frequency component changes in magnitude. In another analysis, Athanasios C. Chasalevris and Chris A. Papadopoulos [5] studied the identification of multiple cracks in beams under bending. They formulate a compliance matrix of two degrees of freedom as a function of each crack depth and the angle of rotation of the shaft. Their stated methodology provides not only the depth and size of the crack but also the spatial location of the crack. Ashish K. Darpe [6] proposes a novel process to detect a transverse surface crack in a rotating shaft. He studied the behavior of a simply supported shaft with a transverse crack subjected to both bending and torsional vibration. Prabhakar [7] applied vibration analysis to a cantilever beam with two open transverse cracks to check its response characteristics. In an initial part, local compliance matrices of various degrees of freedom are used to model the transverse cracks in the beam, based on the available expressions for the stress intensity factors. Lee et al. [8] studied the Leg Mating Unit (LMU), which is an element that supports the elastomeric bearing and steel plate topside of an offshore plant. The hybrid metamodel-based optimization technique and the satisficing trade-off approach were applied in their research. S. K. Georgantzinos and N. K. Anifantis [9] presented a study of the breathing mechanism of a crack in a rotating shaft. They studied the behavior of a transverse crack in a cantilever shaft for two different cases of a straight and a semicircular crack front, and conclude that the breathing behavior depends on the depth and shape of the crack front. Regenhardt et al. [10] use systematic reliability tools to obtain details about how the failure of multiple constituent members of offshore platforms contributes to overall system failure. In this way, systemic and time-dependent information can be segregated, allowing reliability to be computed flexibly and computationally efficiently.
The main objective of our study is to build a finite element analysis of un-cracked and cracked solid cylindrical iron and steel bars containing two half-circular, open, transverse cracks, and to optimize the relationship between the modal natural frequencies and different crack locations subject to free vibration. The aim of this paper is to show a way to predict crack parameters (crack location) in an iron or steel rod from changes in natural frequencies using simulation, and to find out the more suitable material for construction.

2 Mathematical Modeling

Over the last few decades, most techniques have focused on vibration measurement and analysis, because vibration-based approaches provide an efficient and easy way to detect fatigue cracks in structures [11]. In the present analysis, we used a vibration-based approach to identify cracks in our domain. This approach is mainly focused on shifts in dynamic characteristics, such as the natural frequency, with respect to crack location parameters [12]. The purpose of this analysis is to compare the behavior and frequency of shape curvatures in the presence and absence of cracks in iron and steel structures subject to free vibration. We also observe the effects of cracks on the natural frequency and deflection behavior of the bar. In the current work, the natural frequency of the domain with and without a crack was investigated by FEM.

2.1 Governing Equation


We considered a solid cylindrical iron or steel bar to detect a crack in its body. The governing equation of the bar can be written as

$$EA\,\frac{d^{2}u(x)}{dx^{2}} + q(x) = 0 \qquad (1)$$

where q(x) = cx is the distributed load (c = constant), A is the cross-sectional area, E is Young's modulus of elasticity, and u(x) is the displacement vector, subject to the conditions $u(0) = 0$ and $EA\,\frac{du(x)}{dx}\big|_{x=L}$.
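For illustration, Eq. (1) can be solved numerically. The finite-difference sketch below uses assumed data (a steel-like modulus, the paper's bar radius, an arbitrary load constant c, and a traction-free right end, since the end condition is not fully specified above); it is not the paper's COMSOL model:

```python
import numpy as np

# Assumed bar data (illustrative only).
E, A = 200e9, np.pi * 0.015**2
c, L, n = 1.0e4, 0.36, 200

x = np.linspace(0.0, L, n + 1)
h = x[1] - x[0]

K = np.zeros((n + 1, n + 1))
f = -c * x * h**2 / (E * A)       # (u[i-1] - 2u[i] + u[i+1]) = -q h^2 / (EA)
for i in range(1, n):
    K[i, i - 1], K[i, i], K[i, i + 1] = 1.0, -2.0, 1.0
K[0, 0], f[0] = 1.0, 0.0                      # fixed end: u(0) = 0
K[n, n - 1], K[n, n], f[n] = -1.0, 1.0, 0.0   # assumed free end: u'(L) = 0

u = np.linalg.solve(K, f)
print(f"tip displacement: {u[-1]:.3e} m")  # matches cL^3/(3EA) analytically
```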
The strain energy release rate at the cracked section (Irwin et al. [13]) is

$$E_s = \frac{1}{E'}\,(K_{I1} + K_{I2})^{2} \qquad (2)$$

where $K_{I1}$, $K_{I2}$ are the stress intensity factors of mode I (opening of the crack) under the loads $P_1$ and $P_2$, respectively.

Defining the flexibility influence coefficient $C_{ij}$ per unit depth,

$$C_{ij} = \frac{\partial u_i}{\partial P_j} = \frac{\partial^{2}}{\partial P_i\,\partial P_j}\int_{-W/2}^{W/2}\!\int_{0}^{h_1} J_C(h)\,dh\,dz \qquad (3)$$

where $U_c$ is the strain energy and $J_C = \frac{dU_c}{dh}$ is the strain energy release rate.

Using the value of the strain energy release rate $J_C$ we get

$$C_{ij} = \frac{B}{E'}\,\frac{\partial^{2}}{\partial P_i\,\partial P_j}\int_{0}^{h_1} (K_{I1} + K_{I2})^{2}\,dh \qquad (4)$$

The local stiffness matrix can be obtained by taking the inverse of the compliance matrix:

$$[K] = \begin{bmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{bmatrix} = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}^{-1} \qquad (5)$$

The stiffness matrix of the first crack:

$$[K'] = \begin{bmatrix} C'_{11} & C'_{12} \\ C'_{21} & C'_{22} \end{bmatrix}^{-1} \qquad (6)$$

The stiffness matrix of the second crack:

$$[K''] = \begin{bmatrix} C''_{22} & C''_{23} \\ C''_{32} & C''_{33} \end{bmatrix}^{-1} \qquad (7)$$

All types of changes at the corresponding cracked points can be measured by Eqs. (6)
and (7).
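A small sketch of how Eqs. (5)–(7) are used in practice: the local stiffness of each crack follows from inverting its 2×2 compliance matrix. The numerical values below are placeholders chosen by us for illustration, not data from the paper:

```python
import numpy as np

# Placeholder compliance matrix [C'] for the first crack, m/N (assumed values).
C1 = np.array([[2.0e-9, 0.4e-9],
               [0.4e-9, 3.0e-9]])

K1 = np.linalg.inv(C1)   # local stiffness matrix [K'] of Eq. (6)
print(K1)                # N/m; symmetric, since [C'] is symmetric
```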

2.2 Equation of Free Vibration


The natural frequencies of the cracked beam can be evaluated using the crack model and the boundary conditions. These equations can be written in compact form as

$$[Q]\{A\} = \{0\} \qquad (8)$$

where $[Q]$ is the coefficient matrix defined in terms of the cracked-beam parameters and $\{A\}$ is the vector of constants to be determined by the boundary conditions.
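Since nontrivial constants {A} exist only where det[Q] vanishes, the natural frequencies can be located by scanning the determinant for sign changes. The sketch below assumes a placeholder Q(ω) whose determinant is the classic cos ω · cosh ω − 1 beam characteristic function, purely to illustrate the root scan; the paper's actual [Q] is built from the cracked-beam boundary and crack-compatibility conditions:

```python
import numpy as np

def det_Q(omega):
    # Placeholder characteristic matrix: det = cos(w)*cosh(w) - 1 (illustrative).
    Q = np.array([[np.cos(omega), 1.0],
                  [1.0, np.cosh(omega)]])
    return np.linalg.det(Q)

w = np.linspace(0.1, 20.0, 20000)
d = np.array([det_Q(wi) for wi in w])
roots = w[1:][np.sign(d[:-1]) != np.sign(d[1:])]   # sign changes bracket roots
print(roots)  # ~4.73, 7.85, 11.00, 14.14, 17.28 (nondimensional, illustrative)
```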

2.3 Boundary Conditions


Two boundary faces bound the calculation domain: the thin elastic layer and the outlet boundary.
For the symmetric thin elastic layer,

$$\mathbf{n}\cdot\mathbf{u} = 0 \qquad (9)$$

For the outlet boundary,

$$-\rho\,\omega^{2}\mathbf{u} = \nabla\cdot S + \mathbf{F}_V\,e^{i\phi} \qquad (10)$$



3 Computational Domain and Mesh Generation

A solid cylindrical iron or steel bar with length 0.36 m and radius 0.015 m is considered as the computational domain, shown in Fig. 1. A suitable mesh of the computational domain, for both the cracked and un-cracked bar, was generated using the COMSOL Multiphysics software and is shown in Fig. 2. The mesh properties of the computational domain are given in Tables 1 and 2.

Fig. 1. The computational domain of the bar

Fig. 2. The mesh design of the computational domain



Table 1. Mesh properties of the computational domain


Description Value
Minimum element quality 0.1558
Average element quality 0.6598
Tetrahedron 17162
Triangle 3976
Edge element 485
Vertex element 20
Maximum element size 0.0198
Minimum element size 0.00144
Curvature factor 0.4
Resolution of narrow regions 0.7
Maximum element growth rate 1.4
Predefined size Finer

Table 2. Properties of the simulation of computational domain


Description Value
Number of degrees of freedom solved for 17587
Space dimension 3
Number of domains 1
Number of boundaries 16
Number of edges 34
Number of vertices 20
Number of boundary elements 3976
Number of elements 17162
Number of vertex elements 20
Number of edge elements 485

4 Numerical Results and Discussion

In this study, we have investigated the frequency and load distribution of iron and steel bars containing a half-circular double crack using the finite element method. For our simulation, we constructed a solid cylindrical iron/steel bar and used the parameter values given in Tables 3 and 4.
In Fig. 3 it is seen that different amounts of load have been applied vertically to the crown edge of the bar. The deflection of the body after applying the load is shown in the simulation result of Fig. 4.

Table 3. Description of computational domain


Description Value
Length of the bar (L) 0.36 m
Radius of the bar (r) 0.015 m
Width of First crack (c1l) 0.00027 m
Depth of First crack (c1h) 0.03 m
Width of second crack (c2l) 0.00027 m
Depth of second crack (c2h) 0.03 m

Table 4. Description of material properties


Description Iron Steel Unit
Density 7870 7850 kg/m3
Young’s modulus 200 200 GPa
Poisson’s ratio 0.29 0.30 1
Shear modulus 5.5e9 73.3e9 N/m2
Tensile strength 540 430 MPa
Heat capacity at constant pressure 440 475 J/(kg * K)
Relative permeability 4000 1 1
Electrical conductivity 1.12e7 4.032e6 S/m
Thermal conductivity 76.2 44.5 W/(m * K)

Fig. 3. Applying load on the crown edge of the computational domain

Fig. 4. Deflection of the computational domain after applying load



Fig. 5. The phase of the computational domain for various crack positions: (a) first crack at 0.01 m, second crack at 0.10 m, Iron bar; (b) first crack at 0.01 m, second crack at 0.10 m, Steel bar; (c) first crack at 0.10 m, second crack at 0.20 m, Iron bar; (d) first crack at 0.10 m, second crack at 0.20 m, Steel bar; (e) first crack at 0.20 m, second crack at 0.30 m, Iron bar; (f) first crack at 0.20 m, second crack at 0.30 m, Steel bar; (g) un-cracked Iron bar; (h) un-cracked Steel bar

In Fig. 5, we notice that the applied load is distributed across the computational domain. For both metals, the full load is absorbed at the cracked position, as shown in Fig. 5(a)–(f). Comparing with the legend, the highest value is found at the cracked location; in the un-cracked iron and steel bars, shown in Fig. 5(g)–(h), the absorbed load flows continuously to the end of the body, and the steel deflects more than the iron.

Fig. 6. The distribution of load absorption at the cracked point: (a)–(c) Iron bar; (d)–(f) Steel bar

Figure 6 shows the absorption of load at the cracked domain level. We observe that the steel body transfers the load to the bottom of the cracked point, where it is at a maximum, and spreads it continuously to the domain boundary, which is the main deflection factor. The iron body behaves similarly, but iron does not spread the load over the entire body as much as steel.
Based on Fig. 7(g)–(h), we found that for the un-cracked steel and iron bars the frequency remains almost the same. But from Fig. 7(a)–(f), for both bars we found a striking shift for the different crack positions. For the iron body, the frequency graph reaches its highest point at the cracked point, and we get two peak points at the corresponding cracked spots; there is a remarkable contrast between the highest point of the iron line graph and its nearest points. For the steel bar the graph also peaks at the respective cracked points, but the line graph is almost regular, which implies that the highest frequency spreads to its nearest locations. Moreover, there is a large difference between different points in the iron line graph due to the irregular load distribution. In the previous figures, we found that the load is distributed across the whole body in the steel structure, whereas in the iron body it is not; this figure fits that observation completely.

Fig. 7. The line graph of the relative position of crack vs frequency: (a) first crack at 0.01 m, second crack at 0.10 m, Iron bar; (b) first crack at 0.01 m, second crack at 0.10 m, Steel bar; (c) first crack at 0.10 m, second crack at 0.20 m, Iron bar; (d) first crack at 0.10 m, second crack at 0.20 m, Steel bar; (e) first crack at 0.20 m, second crack at 0.30 m, Iron bar; (f) first crack at 0.20 m, second crack at 0.30 m, Steel bar; (g) un-cracked Iron bar; (h) un-cracked Steel bar



Fig. 8. Deflection of un-cracked Iron and Steel bar (deflection, mm, versus position, m)

Fig. 9. Deflection of cracked Iron and Steel bar (deflection, mm, versus position, m)

According to Fig. 8, the two deflection graphs have almost the same pattern for the un-cracked steel and iron bars. But Fig. 9 reveals that the steel deflection graph is higher than that of the iron body at the cracked points. Thus, significant differences in deflection are found between the steel and iron bodies due to the presence of the crack.

5 Conclusion

This paper deals with the optimization of the frequency sensitivity of solid cylindrical iron and steel bars containing a half-circular double crack, which varies with the position of the crack for all vibration modes. In this analysis, the finite element analysis of the iron and steel bar with two transverse cracks was performed for simulation in the COMSOL Multiphysics software. In our study, we also found that the applied load spreads evenly across the steel bar and deforms the structure without breaking it. It is seen that the structure of steel is capable of absorbing various amounts of load, causing vibration, and eventually deflecting; steel retains its elasticity by remaining un-cracked in the body. On the other hand, iron stores the load at a given section, so that the applied force cannot move through the entire body. A given load therefore creates too much vibration and frequency at that portion or affected region of the body, which creates a risk of damaging the body. It can be said that, due to its durability, good elasticity, and deflection capacity, the steel structure can be considered a more acceptable and suitable metal for both cracked and un-cracked situations than others. Finally, we can conclude that steel is becoming a very popular and reliable structural material in the construction sector day by day.

6 Future Study

In the present study, the domain under consideration has a uniform cross-section, but this method can be extended to components with varying cross-section, different geometry, and any boundary condition. The proposed method can also be extended for defect diagnosis in beams, shafts, or rotating machine elements.

Acknowledgement. The authors gratefully acknowledge the technical support of the Centre of Excellence in Mathematics, Department of Mathematics, Mahidol University, Bangkok, Thailand.

References
1. Khan, I.A., Yadao, A., Parhi, D.R.: Fault diagnosis of cracked cantilever composite beam by
vibration measurement and RBFNN. J. Mech. Des. Vib. 1, 1–4 (2013)
2. Ramanamurthy E.V.V., Chandrasekaran K.: Vibration analysis on a composite beam to
identify damage and damage severity using finite element method. Int. J. Eng. Sci. Technol.
(IJEST) 3, 0975–5462 (2011)
3. Patil, D.P., Maiti, S.K.: Detection of multiple cracks using frequency measurements. Eng.
Fract. Mech. 70, 1553–1572 (2002)
4. Darpe, A.K., Gupta, K., Chawla, A.: Dynamics of a bowed rotor with a transverse surface
crack. J. Sound Vib. 296, 888–907 (2006)
5. Chasalevris, A.C., Papadopoulos, C.A.: Identification of multiple cracks in beams under
bending. Mech. Syst. Signal Process. 20, 1631–1673 (2006)

6. Darpe, A.K.: A novel way to detect transverse surface crack in a rotating shaft. J. Sound Vib.
305, 151–171 (2007)
7. Prabhakar, M.S.: Vibration analysis of cracked beam. Master's thesis, National Institute of Technology (2009)
8. Lee, S.H., Jeong, K.-I., Lee, K.H.: Structural design of an LMU using approximate model and satisficing trade-off method. In: 2nd International Conference on Intelligent Computing and Optimization (ICO 2019), pp. 118–127. Springer (2019)
9. Georgantzinos, S.K., Anifantis, N.K.: An insight into the breathing mechanism of a crack in
a rotating shaft. J. Sound Vib. 318, 279–295 (2008)
10. Regenhardt, T.E., Azad, M.S., Punurai, W., Beer, M.: A novel application of system survival
signature in reliability assessment of offshore structure. In: International Conference on
Intelligent Computing and Optimization (ICO 2018), pp. 11–20. Springer (2018)
11. Kute, D., Hredeya Mishra, H.: Two crack detection in tapered cantilever beam using natural
frequency as basic criterion. Global J. Eng. Sci. Res. (2018). ISSN 2348–8034
12. Kocharla, R.P.B., Bandlam, R.K., Kuchibotla, M.R.: Finite element modelling of a turbine
blade to study the effect of multiple cracks using modal parameters. J. Eng. Sci. Technol. 11,
1758–1770 (2016)
13. Irwin, G.R.: Analysis of stresses and strains near the end of a crack transverse in a plate.
J. Appl. Mech. 24, 361–364 (1956)
Belief Rule-Based Expert System to Identify the Crime Zones

Abhijit Pathak, Abrar Hossain Tasin, Sanjida Nusrat Sania, Md. Adil, and Ashibur Rahman Munna

Department of Computer Science and Engineering, BGC Trust University Bangladesh, Chittagong, Bangladesh
abhijitpathak@bgctub.ac.bd, abrarhossaintasinp@gmail.com, sanjidanusratsania353@gmail.com, mdadilkhan616@gmail.com, jibonrahmanctg@gmail.com

Abstract. Crime is a significant and dominant problem in our society, and its prevention is an important task. A large number of crimes are committed daily, which requires tracking all crimes and maintaining a database that can be used for future reference. The current challenge is analyzing these data to maintain a proper crime dataset and to assess and solve future crimes. With the increasing advent of computerized systems, crime data analysts are helping law enforcement officers speed up the process of solving crimes. About 10% of offenders commit 50% of crimes. Although we cannot predict who will be a victim of crime, we can estimate where it will occur. This paper focuses on crime zone identification. It clarifies how we applied the Belief Rule Base algorithm to produce interesting frequent patterns for crime hotspots, and it also shows how we used an expert system to forecast potential types of crime. To further analyze the crime datasets, the paper introduces an analysis that combines our findings from the Chattogram crime dataset with demographic information to capture factors that could affect neighbourhood safety. The results of this solution could be used to raise awareness of dangerous locations and to help agencies predict future crimes at a specific location in a given time.

Keywords: Expert system · Inference engine · Knowledge base · BRB · Crime zone

1 Introduction

Crimes are a common social issue affecting a society's quality of life and economic growth. Crime is considered an essential factor in deciding how people migrate to a new city and which areas should be avoided while travelling. When crime increases, law enforcement agencies continue to demand innovative expert solutions to enhance crime detection and better protect their communities. While crimes may occur anywhere, offenders typically focus on the crime opportunities they face in the places most familiar to them. By offering a Belief Rule-Based Expert System to identify the crime zones and to evaluate the type, location and time of the crimes committed, we hope to raise awareness of the dangerous locations in specific periods [1].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 237–249, 2021.
https://doi.org/10.1007/978-3-030-68154-8_24

Our proposed solution can therefore potentially save lives by helping people stay away from such locations at particular times, and gaining this kind of awareness helps people make better choices about their places of living. Police forces, on the other hand, can use this approach to improve crime detection and prevention. It would also be useful for allocating police resources, helping to ensure their efficient use by distributing police to the most likely places of crime at any given time. By providing all of this information, we hope to make our community safer for the people who live there and for others who travel there.
The expected results cannot be assured of 100% accuracy. Still, the results show that our program helps to minimize the crime rate by providing security in sensitive areas of crime. To create such a useful crime analytics tool, we have to collect and review crime reports [2] (Fig. 1).

Fig. 1. A typical Belief Rule-Based expert (BRB) system

The expert system is an evolution of the standard rule-based system and can represent more complex causal relationships using various types of information with ambiguity. In a belief rule-based expert system, a degree of belief is associated with every possible consequent of a rule. In the belief rule-based expert system, inference over the knowledge base is carried out using the evidential reasoning approach. The belief rule-based expert system can capture complicated and continuous causal relationships between different factors that traditional IF-THEN rules cannot express [3]. The research includes evaluating, developing, implementing, and reviewing the best system for identifying the crime environment. It concentrates on the processes performed to define crime zones and on determining how to detect crime hotspots before decisions are made about which hotspots can be dangerous and when. It also focuses on how ordinary people and law enforcement can properly get to know the crime hotspots. Our study is aimed at finding spatial and temporal criminal hotspots using a set of real-world crime datasets. We try to locate the most likely places for crime and their frequency of occurrence, and we also predict what type of crime could occur next in a specific location within a given time frame [4]. The overall goal is to develop a computer-based system that best identifies the crime zone and analyzes the results of crime hotspot tests and the underlying conditions to recommend the best safe zone for the people:
• Characterize the hotspot selection process and its significance.
• Develop a user-friendly expert system with specific knowledge to advise a person
on selecting crime zone identifiers.
• Assess the performance of the Expert System developed.
To achieve the expected goals, the following objectives are addressed:
• Defining system requirements to assess the information needs of different users so
that the best crime hotspot identification system can be delivered effectively and
effectively.
• Assess the processes involved in the monitoring of the crime zone, and the
parameters used to assess the crime areas can be unsafe for ordinary people.
• Develop a new concept for the crime zone identification system that analyzes the
hotspot and provides not only specific feedback on crime hotspots but also mea-
sures possible crime zone.

2 Related Works

A considerable amount of work relating to crime has been done. Large datasets have been reviewed, and information such as location and type of crime was extracted to help law enforcement. Existing methods used those databases to identify location-based crime hotspots. Several map applications show the exact location of the crime along with the type of crime for any given city. Although crime locations have been identified, there is no information available that includes the date and time of the crime, nor techniques that predict precisely what crimes will occur in the future [5].
On the other hand, the previous work and its existing methods mainly identify crime hotspots based on the location of high crime density without considering either the crime type or the date and time of the crime. Related research work, for example, used a dataset for the city of Philadelphia with crime information from 1991–1999 and focused on the existence of complex multi-scale relationships between time and space [6]. Another study, titled “The effectiveness of hotspot analysis to forecast spatial crime trends”, looks at the various types of crime to see if they vary in their predictive capabilities. Many studies investigate the relationship between criminal activity and socio-economic variables such as age, race, income, and unemployment [7].
Despite all this existing work, none of it considers the three elements (location, time, and type of crime) together. In our study, we provide a Belief Rule-Based Expert System to identify the crime zones.

3 Methodology

In the proposed system, various components interact with each other, and their proper interaction creates a complete BRB and ER based system. When implementing the BRB and ER methodology, the first important steps are to choose a programming language to implement the ER methodology and the user interface, and to implement the knowledge base using a database. After designing a proper Graphical User Interface (GUI), it is easy for the general user to interact with the developed system and perform the assessment [8]. Thus, the implementation of the proposed expert system is one of the main tasks of this work. The BRB and ER methods for the proposed system can be applied using a knowledge-based expert system, which works with input and output data. The interaction between the different components of the knowledge-based expert system for our proposed system is shown below (Fig. 2):

Fig. 2. System architecture

It’s known that BRB and ER methodology for the proposed system is possible to
implement using a knowledge-based expert system. A knowledge-based expert system
works with input and output data. Interaction among various components of the
knowledge-based expert system for our proposed system is shown below:
• The crime zone identifier system must have a database that can support the storage
and retrieval of user details, crime details, and details that users who log in to the
system will access later.
• The crime zone identification system must have a client interface that allows
privileged users, such as administrators, to perform tasks such as adding/criminal
editing information, adding/editing hotspot details, and creating/editing user details.
• The crime zone identifier system must have a viewing functionality to allow regular
and privileged users to view the details of a particular entity from the system
database.
• The crime zone identification program must also have a software interface for
regular users to sign up for user accounts, enter data for review and measurement.
• The identifying crime zone system must-have functionality that produces summary
reports from analyses and calculations.

3.1 Crime Zone Prediction


Crime zone prediction is a systematic approach for assessing and analyzing patterns and trends in crime. The system can predict places with a high probability of crime occurrence and can also reveal crime-prone areas. With the increasing advent of computerized systems, identifying the crime zone can help law enforcement officers speed up the process of solving crimes [8].

3.2 Crime Zone


The crime zone means an area with high crime intensity. Although crimes can occur everywhere, criminals are expected to act on the crime opportunities they encounter in the areas most familiar to them. We hope to raise people's awareness of the dangerous locations in specific periods.

3.3 Purpose of Identifying Crime Zone


Identification of a crime zone can help law enforcement officers speed up the process of solving crimes. Using the Belief Rule-Based Expert System, we can extract previously unknown, useful information from unstructured data.

3.4 Types of Crime Zone


There are four types of crime zones: a) places, b) victims, c) streets, and d) areas.
The most basic form of a hot spot is a place that has many crimes. A place can be an address, street corner, store, house, or any other small location, most of which can be seen by a person standing at its center.

3.5 Crime Zone Prediction Techniques


There are many techniques for predicting crime zones. The three most popular techniques for identifying a crime zone are given below:
• Maps and Geographic Information Systems
• Statistical Tests
• Reading a Crime Map

3.6 Factors Related to Crime Zone


Factors that are related to crime zones are addressed below:
• Type of offence;
• Date and time of the incident;
• Location of the incident (outside visitor rate, dark alleys, resident density, etc.);
• Type of premises where the incident occurred;
• Whether drugs or alcohol were involved;
• Whether a weapon was used;
• The age and gender of the offender;
• The age and gender of the victim (e.g. women population), and more;
• Traffic in the area;
• Income level in the area (unemployment, population density);
• Education rate in the area;
• Social causes (poverty, inequality, etc.)

3.7 Design of BRB for Identifying Crime Zone


Knowledge-base creation is an integral part of identifying the crime zone. Knowledge is acquired from several sources, including domain experts and written documentation, to create the knowledge base. Knowledge-base creation involves several steps: first, the BRB framework needs to be developed, according to which the work proceeds; after the BRB development, the initial rule base is created; then the assessment moves forward with the data transformation process, which involves input transformation, activation weight calculation, and the belief degree update process [11].
To design the BRB for the expert system, we first analyzed the various factors involved in crime zone prediction and their effects. In this step, we came to learn about factors like land type, water removal situation, drainage system, soil texture, and pH value. We then discussed with domain experts to validate the proposed BRB framework and, after validation, finally obtained the following framework for identifying the crime zone (Fig. 3).

Fig. 3. The antecedent for identifying crime zone

It has been shown how to calculate the input transformation in Sect. 3. The input transformation process enables the system to transform human-readable input data into system-readable data for the crime zone assessment. This is possible due to the RIMER methodology in the proposed system, which is a contribution of this project. For instance, take the consequent X (behavioural impact), whose antecedents are Land Type, Water Removal Situation, Drainage System, Soil Texture, and pH Value. The evaluation grades for each factor are:

0.7 ≤ High ≤ 1.0
0.4 ≤ Mid ≤ 0.6
0.0 ≤ Low ≤ 0.3

Let us take the following portion of the BRB as an example to show the initial rule base construction (Fig. 4):

Fig. 4. Input data

Let H = 1.0, M = 0.5 and L = 0.0. If H_{i3} ≥ R_i ≥ H_{i2}, then

α_{i2} = (H_{i3} − R_i) / (H_{i3} − H_{i2})
α_{i3} = 1 − α_{i2}
α_{i1} = 1 − (α_{i2} + α_{i3})

where H_{i3} = High, H_{i2} = Medium, H_{i1} = Low and R_i is the input value.
Now, for A1 = 0.8, with H_{i3} ≥ 0.8 ≥ H_{i2}:

α_{i2}(M) = (1 − 0.8) / (1 − 0.5) = 0.4
α_{i3}(H) = 1 − 0.4 = 0.6
α_{i1}(L) = 0.0

so A1 = {(H, 0.6); (M, 0.4); (L, 0.0)}.
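For illustration, a minimal Python sketch of this input transformation (the referential values Low = 0.0, Medium = 0.5 and High = 1.0 and the function layout are our assumptions, not the paper's implementation):

def transform_input(x, low=0.0, mid=0.5, high=1.0):
    # distribute a crisp input x over the referential values H, M, L;
    # the returned belief degrees sum to 1
    if x >= high:
        return {"H": 1.0, "M": 0.0, "L": 0.0}
    if x <= low:
        return {"H": 0.0, "M": 0.0, "L": 1.0}
    if x >= mid:                              # between Medium and High
        m = (high - x) / (high - mid)
        return {"H": 1.0 - m, "M": m, "L": 0.0}
    l = (mid - x) / (mid - low)               # between Low and Medium
    return {"H": 0.0, "M": 1.0 - l, "L": l}

print(transform_input(0.8))  # approximately {'H': 0.6, 'M': 0.4, 'L': 0.0}, matching A1 above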

4 Implementation of Expert System

In the proposed system, various components interact with each other. Their proper interaction creates a complete BRB- and ER-based system. During the implementation of the BRB and ER methodology, the first important things are to choose a programming language to implement the ER methodology and the user interface, and to implement the knowledge base using a database. After designing a suitable Graphical User Interface (GUI), it is easy for a general user to interact with the developed system and perform the crime zone assessment. So the implementation of the proposed expert system is one of the primary tasks to deal with. System implementation was achieved using MySQL and the Java language. The database was designed using MySQL because it is a highly efficient and effective solution for data storage and handling, even in a networked environment [12].

4.1 Initial Rule Base for X


X is the behavioural impact; it is the consequent, and it has five antecedents: land type, water removal situation, drainage system, soil texture, and pH value. Land type has three referential values (high, mid, and low). Water removal situation has three referential values (early, average, and late). Drainage system has three referential values (well, fair, and low). Soil texture has three referential values (sandy, silt, and clay). pH value has three referential values (acidic, neutral, and alkaline) (Tables 1 and 2).
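As an illustration of how one such rule might be held in the knowledge base, a minimal sketch follows (the dictionary layout and field names are our assumptions; the paper stores its rule base in MySQL):

rule_r2 = {
    "id": "R-2",
    "rule_weight": 1.0,
    # antecedent referential values: land type, water removal, drainage, soil texture, pH
    "antecedents": (1.0, 1.0, 1.0, 1.0, 0.5),
    # belief degrees over the consequent grades of X
    "beliefs": {"High": 0.8, "Mid": 0.2, "Low": 0.0},
}

assert abs(sum(rule_r2["beliefs"].values()) - 1.0) < 1e-9  # a complete rule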

Table 1. Evaluation grade for behavioural impact


Types of land Water removal situation Drainage system Soil texture pH value
1 = High 1 = Early 1 = Well 1 = Sandy 1 = Acid
0.5 = Mid 0.5 = Average 0.5 = Good 0.5 = Silt 0.5 = Neutral
0 = Low 0 = Late 0 = Poor 0 = Clay 0 = Alkaline

Table 2. Initial rule base for X


Rule ID  Rule Wt.  Types of land  Water removal situation  Drainage system  Soil texture  pH  High  Mid  Low
R-1 1 1 1 1 1 1 1 0 0
R-2 1 1 1 1 1 0.5 0.8 0.2 0
R-3 1 1 1 1 1 0 0.8 0 0.2
R-4 1 1 1 1 0.5 1 0.8 0.2 0
R-5 1 1 1 1 0.5 0.5 0.6 0.4 0
R-6 1 1 1 1 0.5 0 0.6 0.2 0.2
R-7 1 1 1 1 0 1 0.8 0 0.2
R-8 1 1 1 1 0 0.5 0.6 0.2 0.2
R-9 1 1 1 1 0 0 0.6 0 0.4
R-10 1 1 1 0.5 1 1 0.8 0 0.2
R-11 1 1 1 0.5 1 0.5 0.6 0 0
R-12 1 1 1 0.5 1 0 0.6 0.2 0.2
R-13 1 1 1 0.5 0.5 1 0.6 0.4 0
R-14 1 1 1 0.5 0.5 0.5 0.4 0.6 0
R-15 1 1 1 0.5 0.5 0 0.4 0.4 0.2
R-16 1 1 1 0.5 0 1 0.6 0.2 0.2
R-17 1 1 1 0.5 0 0.5 0.4 0.4 0.2
R-18 1 1 1 0.5 0 0 0.4 0.2 0.4
R-19 1 1 1 0 1 1 0.8 0 0.2
R-20 1 1 1 0 1 0.5 0.6 0.2 0.2
R-21 1 1 1 0 1 0 0.6 0 0.4
R-22 1 1 1 0 0.5 1 0.6 0.2 0.2
R-23 1 1 1 0 0.5 0.5 0.4 0.4 0.2
R-24 1 1 1 0 0.5 0 0.4 0.2 0.4
R-25 1 1 1 0 0 1 0.6 0 0.4
R-26 1 1 1 0 0 0.5 0.4 0.2 0.4
R-27 1 1 1 0 0 0 0.4 0 0.6
R-28 1 1 0.5 1 1 1 0.8 0.2 0
R-29 1 1 0.5 1 1 0.5 0.6 0.4 0
R-30 1 1 0.5 1 1 0 0.6 0.2 0.2
R-31 1 1 0.5 1 0.5 1 0.6 0.4 0
R-32 1 1 0.5 1 0.5 0.5 0.4 0.6 0
R-33 1 1 0.5 1 0.5 0 0.4 0.4 0.2
R-34 1 1 0.5 1 0 1 0.6 0.2 0.2
R-35 1 1 0.5 1 0 0.5 0.4 0.4 0.2
R-36 1 1 0.5 1 0 0 0.4 0.2 0.4
R-37 1 1 0.5 0.5 1 1 0.6 0.4 0
R-38 1 1 0.5 0.5 1 0.5 0.4 0.6 0
R-39 1 1 0.5 0.5 1 0 0.4 0.4 0.2
R-40 1 1 0.5 0.5 0.5 1 0.4 0.6 0
R-41 1 1 0.5 0.5 0.5 0.5 0.2 0.8 0
R-42 1 1 0.5 0.5 0.5 0 0.2 0.6 0.2
R-43 1 1 0.5 0.5 0 1 0.4 0.4 0.2
R-44 1 1 0.5 0.5 0 0.5 0.2 0.6 0.2
R-45 1 1 0.5 0.5 0 0 0.2 0.4 0.4
R-46 1 1 0.5 0 1 1 0.6 0.2 0.2
R-47 1 1 0.5 0 1 0.5 0.4 0.4 0.2
R-48 1 1 0.5 0 1 0 0.4 0.2 0.4
R-49 1 1 0.5 0 0.5 1 0.4 0.4 0.2
R-50 1 1 0.5 0 0.5 0.5 0.2 0.6 0.2
……. ……. ……. ……. ……. ……. ……. ……. ……. …….

4.2 User Interface


It has already been said that the proposed system is built using Java and MySQL Server. Eclipse is a complete IDE-driven GUI, so the developer does not need to worry about interface performance or data handling. The Eclipse IDE provides the key components that the GUI needs to improve the system: button, data grid, form, checkbox, combo list, label, etc. Therefore, most system improvements are made in the visual design with small back-end codes [13]. The following is a snapshot of the developed system (Fig. 5).

Fig. 5. User interface

Each node (button) works as a link to a particular form for data input. Without calculating the nodes in subsequent order, the user cannot move further; as can be seen, some nodes are greyed out. The syntax grid shows the syntax for the corresponding nodes. After calculating all nodes, the aggregated values for the corresponding input values are shown in the nodes. Finally, the aggregated fuzzy values are converted into a numerical value to obtain the desired result, which indicates a value on the scale of the crime zone assessment [14].

5 Results and Discussion

Data for identifying the crime zone with the BRB framework described above have been collected from the CMP. Fifty expert data records have been collected and, for simplicity, some of them are shown in Table 3. The expert selected the five most significant factors out of nine factors, and the associated risk factor data were obtained from the expert. The collected expert data have been used as input data in this expert system to assess the crime zone. The BRBES has been compared with a fuzzy rule-based expert system (FRBES), developed in the MATLAB environment [17]. Table 3 illustrates the results of crime zone prediction carried out by the BRBES (Column 2), the expert (Column 3), and the FRBES (Column 4). If the value given by the expert is greater than or equal to 50, the benchmark is considered "1"; otherwise, it is considered "0". This data has been taken as the benchmark data, as shown in column 5 of the table (Table 3 and Fig. 6).

Table 3. Crime zone assessment by BRBES, EXPERT, and RBFL system


SL No. BRBES EXPERT RBFL BENCHMARK
1 79.95 69.44 74.34 1
2 89.39 70.32 78.23 1
3 73.15 60.13 67.15 1
4 58.1 45.15 46.23 1
5 40.45 30.45 22.25 0
6 44.52 37.57 38.54 0
7 55.01 50.65 51.55 1
8 41.57 45.56 42.26 0
9 33.27 30.29 30.25 0
… ……. …….. …….. ……..
… ……. …….. …….. ……..
48 91.85 79.24 82.37 1
49 61.54 53.53 56.24 1
50 24.15 18.73 21.45 0

Fig. 6. Comparison of reliability among BRBES, FRBES, and Expert opinion by using ROC
curves

The Receiver Operating Characteristic (ROC) curve is widely used to analyze the effectiveness of an assessment having ordinal or continuous results. Therefore, it has been adopted in this research to test the accuracy of the BRBES output against expert opinion and the FRBES. The accuracy or performance of the BRBES in crime zone prediction can be measured by calculating the Area Under the Curve (AUC) (Table 4).

Table 4. Area under the curve


Test result variable (s) Area
BRBES .998
EXPERT .990
FRBES .995
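The AUC values can be reproduced with standard tooling; a hedged sketch using scikit-learn, where the score lists below are just the first nine rows of Table 3 rather than the full 50-case dataset:

from sklearn.metrics import roc_auc_score

benchmark = [1, 1, 1, 1, 0, 0, 1, 0, 0]  # column 5 of Table 3 (first nine cases)
brbes = [79.95, 89.39, 73.15, 58.1, 40.45, 44.52, 55.01, 41.57, 33.27]

print(roc_auc_score(benchmark, brbes))   # area under the ROC curve for the BRBES scores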

The reason for the lower AUC of expert opinion in crime zone prediction is that the expert is not aware of the uncertainty associated with the factors and risk factors of crime zone selection. Moreover, during our research, while asking experts about their procedures for measuring a factor, we noticed that their approach is Boolean. The reason for the lower performance of the fuzzy rule-based expert system compared with the belief rule-based expert system is that the belief rule base considers uncertainty and incompleteness. Also, the inference procedure of the BRBES consists of input transformation, rule activation weight calculation, belief update, and rule aggregation using the evidential reasoning approach [15]. Evidential reasoning can process various kinds of uncertainty, which is not the case with the fuzzy-based inference engine.
In this paper, we have demonstrated the design, development, and application of a BRBES to identify the crime zone by taking five significant factors of crime zone prediction. This BRBES employed a novel methodology known as RIMER, which allows the handling of various types of uncertainty. Hence, the results generated by the BRBES are more reliable than those from expert opinion or a fuzzy rule-based expert system [16].

6 Conclusion

The project was successful in implementing the objectives stipulated in the earlier chapters. This system offers several benefits to its users; average persons and researchers can automatically register, update, and view records. The system administrator can manage and configure the various parameters of system functionality. The system can also authenticate users, display assessment results, and generate reports. To improve and increase the use of information technology solutions in day-to-day life, there is a need for further research to:
• Come up with more advanced software that can remotely connect all the other remote major towns in the districts around and globally.
• Enable an average person or researcher to use the internet, log in to this system, remotely analyze test results, obtain recommended crime hotspots, and get all the information they need, say, real danger, and so on.
• Address other demands that may arise with the ongoing evolution and use of the system.
The study demonstrated how a short text specification of a system can be modelled in analysis, expanded and detailed into a design model, and finally implemented and programmed in PHP, HTML, and JavaScript.

References
1. Belief rule-base inference methodology using the evidential reasoning approach (RIMER), pp. 41–44 (2006)
2. Bogomolov, A., Lepri, B., Staiano, J., Oliver, N., Pianesi, F., Pentland, A.: Once upon a crime: towards crime prediction from demographics and mobile data (2014)
3. Buczak, A.L., Gifford, C.M.: Fuzzy association rule mining for community crime pattern discovery, pp. 31–38 (2014)
4. In: ACM SIGKDD Workshop on Intelligence and Security Informatics, Washington, D.C. (2010)
5. Arulanandam, R., Savarimuthu, B., Purvis, M.: Extracting crime information from online newspaper articles, Auckland, New Zealand (2014)
6. Chainey, S., Tompson, L., Uhlig, S.: The utility of hotspot mapping for predicting spatial patterns of crime
7. Pathak, A., Tasin, A.H., Esho, A.A., Munna, A.R., Chowdhury, T.: A smart semi-autonomous fire extinguishing quadcopter: future of Bangladesh. Int. J. Adv. Res. 8, 01–15 (2020). ISSN 2320-5407
8. Nath, S.: Crime pattern detection using data mining. In: Web Intelligence and Intelligent Agent Technology Workshops
9. Tayebi, M.A., Frank, R., Glässer, U.: Understanding the link between social and spatial distance in the crime world. In: Proceedings of the 20th International Conference on Advances in Geographic Information Systems (SIGSPATIAL 2012), Redondo Beach, California, pp. 550–553 (2012)
10. 2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Workshops (WI-IAT 2006 Workshops) (2006)
11. Towards crime prediction from demographics and mobile data. CoRR, vol. abs/1409.2983 (2014)
12. Intelligent Computing & Optimization: Conference Proceedings ICO 2018. Springer, Cham. ISBN 978-3-030-00978-6
13. Intelligent Computing and Optimization: Proceedings of the 2nd International Conference on Intelligent Computing and Optimization 2019 (ICO 2019). Springer. ISBN 978-3-030-33585-4
14. Kiani, R., Mahdavi, S., Keshavarzi, A.: Analysis and prediction of crimes by clustering and classification. Int. J. Adv. Res. Artif. Intell. 4(8) (2015)
15. Agarwal, J., Nagpal, R., Sehgal, R.: Crime analysis using K-means clustering. Int. J. Comput. Appl. (0975-8887) 83(4) (2013)
16. Nath, S.V.: Crime pattern detection using data mining. IEEE Trans. Knowl. Data Eng. 18(09), 41–44 (2010)
17. Kapoor, S., Kalra, A.: Data mining for crime detection. Int. J. Comput. Eng. Appl. VII(III)
Parameter Tuning of Nature-Inspired
Meta-Heuristic Algorithms for PID
Control of a Stabilized Gimbal

S. Baartman1,2 and L. Cheng1

1 Electrical and Information Engineering, University of the Witwatersrand, Johannesburg, South Africa
Ling.Cheng@wits.ac.za
2 Optronic Sensor Systems, Defence and Security, Council for Scientific and Industrial Research (CSIR), Pretoria 0001, South Africa
723263@students.wits.ac.za

Abstract. This work focused on optimizing a Proportional-Integral-Derivative (PID) controller for a single-axis, gimbal-based inertial stabilization system, for a given performance objective. This study compared the suitability and relative performance of two meta-heuristic optimization methods for tuning the PID controller. The methods compared are two nature-inspired meta-heuristic algorithms: the Teaching-Learning-Based Optimization (TLBO) algorithm and the Flower Pollination Algorithm (FPA). This study involved tuning the common and algorithm-specific parameters for the algorithms to be able to optimize the controller. These parameters were optimized for two different problem statements which represent two different target motion objectives. Comparing the algorithm performances using the fitness function value, it was found that the problem instances favoured the TLBO algorithm.

Keywords: Gimbal stabilization · PID controller · Nature-inspired meta-heuristic algorithms · Flower Pollination Algorithm (FPA) · Teaching-Learning-Based Optimization Algorithm (TLBO)

1 Introduction

Nature-inspired meta-heuristic algorithms aid in solving problems which include


non-linear constraints, multiple objectives and dynamic non-linear properties [1].
However, not much work has gone into finding a method for selecting a suitable
algorithm for specific applications. Furthermore, the parameters of the algorithm
itself may have a substantial effect on the performance and thus must be altered accordingly for the required application, such as the gimbal stabilization system in this instance [2]. (This work was supported by the Optronic Sensor Systems department at the Council for Scientific and Industrial Research (CSIR).) Gimbal stabilization is important in applications where optical systems require stabilizing the sensor's line-of-sight (LOS) towards a target of interest [3]. How well the system stabilizes the LOS is highly dependent
on how well the controller is tuned to the system needs. The PID controller is
popular and extensively used in industry because it is simple to tune and has
been extensively implemented and tested [4]. Classical PID tuning methods (e.g.
Ziegler-Nichols) at times do not produce desired results, compared to intelligent
tuning methods [5–7]. This dilemma makes the optimization of controller parameters a prominent research area. There are various control systems where PID controllers have been tuned using meta-heuristic algorithms [8–13]. Rajesh and Kavitha [8] are among the few researchers who have used meta-heuristic algorithms to tune the PID controller particularly for gimbal stabilization, which feeds into
the motivation of this work. The meta-heuristic algorithms used in the current
study are the Teaching-Learning-Based Optimization (TLBO) and the Flower Pollination Algorithm (FPA), of which two variations of the FPA were evaluated. The TLBO algorithm was chosen firstly due to its unique 'parameterless'
quality unlike other algorithms which require careful parameter adjustment for
optimal performance, and secondly because it performed well in past studies [13].
The FPA was chosen because it also performed well in the study conducted in
[10]. This current study made use of two implementations of the FPA: the static
FPA with a switch probability of 80%, and a dynamic FPA with a deterministic
switch probability adapted from [14]. Therefore, the contribution of this paper
lies in investigating different heuristic algorithms, and the parameter settings for
these algorithms, that can best determine the PID controller parameters for an
inertial stabilization platform (ISP).

2 Description of the Control System

2.1 The Plant

The model of the gimbal system (the plant) used in this study is adapted from
[15], with some differences. The first difference is that this current study implemented a rate control loop, as the focus was on stabilizing the gimbal and LOS
rate, rather than tracking the target and focusing on the LOS angle. Secondly,
only gyro III from the original study is considered here, as the purpose of the
study in [15] was to compare the performances of different gyros which differs
from the purpose of this current study. Another difference is that the controller in this current study is a PID controller with a low pass filter in the derivative path, whereas the study in [15] did not use a low pass filter for the derivative controller implemented. The model of this study was chosen because it represents a simple, generic system which can be applied in different applications, and because its results could be reproduced, so model validation was achieved. Other elements and details of the control system, including the DC motor, gimbal and gyro specifications, are shown in study [15] and will not be repeated here.

2.2 Fitness Function Used for Meta-Heuristic Algorithms

The fitness function quantifies the system performance for a set of PID parameters. The fitness function used in this study is a weighted combination of the ISE and the ITAE, because the ISE helps decrease the large initial system error and aids the fast response of the system, while the ITAE reduces the oscillations caused by the ISE whilst ensuring a small steady-state error [16]. The fitness function used in this study is:

F = μ ISE + α ITAE, (1)

where μ = 0.6, α = 0.4. Note that the smaller the value of the fitness function,
the better the solution.
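A minimal sketch of Eq. (1) for a sampled error signal (the discretization, sampling step and names are our assumptions, not the authors' simulation code):

import numpy as np

def fitness(e, dt, mu=0.6, alpha=0.4):
    # F = mu * ISE + alpha * ITAE for a uniformly sampled error signal e(t)
    t = np.arange(len(e)) * dt
    ise = np.sum(e ** 2) * dt            # integral of squared error
    itae = np.sum(t * np.abs(e)) * dt    # integral of time-weighted absolute error
    return mu * ise + alpha * itae

e = np.exp(-np.linspace(0.0, 5.0, 501))  # example: exponentially decaying error
print(fitness(e, dt=0.01))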

3 Nature-Inspired Algorithms
3.1 Flower Pollination Algorithm

The FPA is inspired by flower pollination in nature [17]. Global pollination occurs when pollen grains are transferred from one plant to the next by a pollinator, such as an insect, whereas local pollination occurs when the plant does not use a pollinator and sheds its pollen from its anther to its stigma [18]. The switch probability (p), which lies in the range [0, 1], controls whether local or global pollination occurs at any iteration.
There are many different ways to determine the step size when global pollination occurs. One of them is the Lévy flight distribution:

L(s) = (λ Γ(λ) sin(πλ/2) / π) · 1 / s^{1+λ},   (2)

where Γ(λ) represents the standard gamma function and λ = 1.5 is the scaling factor adopted from [17]. This distribution is valid for steps s > 0.

Static and Dynamic Switch Probability. The static switch probability (p)
is 0.8 because this works best in most applications [17]. The dynamic switch
probability (p_{n+1}) in iteration n + 1 is adapted from [14]:

p_{n+1} = p_n − λ (N_{iter} − n) / N_{iter},   (3)

where p_n is the value of the switch probability in the previous iteration, N_{iter} is the maximum number of iterations, n is the current iteration, and the scaling factor is λ = 1.5. The value of the dynamic switch probability begins from 0.9 and decreases using Eq. (3). Algorithm 1 shows the flow of the FPA.

Algorithm 1 FPA pseudo code

1: Initialize popsize N
2: Initialize iteration number maxIT
3: Define objective function f(x)
4: Find global best solution g*
5: Define dynamic switch probability function
6: while t < maxIT do
7:   Evaluate dynamic switch probability function
8:   Obtain switch probability p
9:   for i = 1 : N do
10:    if rand < p then
11:      Draw a d-dimensional step vector L
12:      Global pollination: x_i(t+1) = x_i(t) + L(x_i(t) − g*)
13:    else
14:      Draw ε from a uniform distribution in [0, 1]
15:      Randomly choose j and k among solutions
16:      Local pollination: x_i(t+1) = x_i(t) + ε(x_j(t) − x_k(t))
17:    Evaluate new solutions
18:    Compare with previous solution
19:    Update new solutions if they are better
20: Find the current best solution g*
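A hedged Python sketch of one FPA generation following Algorithm 1 (the Lévy step uses Mantegna's approximation; the search bounds, minimization sense and all names are our assumptions, not the authors' implementation):

import math
import numpy as np

def levy(dim, lam=1.5):
    # Mantegna-style approximation of a Levy(lam) flight step vector
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2) /
             (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = np.random.normal(0.0, sigma, dim)
    v = np.random.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / lam)

def fpa_step(pop, fit, g_best, p=0.8, upper=100.0):
    # one generation: global (Levy) or local pollination for each flower
    n, dim = pop.shape
    for i in range(n):
        if np.random.rand() < p:                       # global pollination
            cand = pop[i] + levy(dim) * (pop[i] - g_best)
        else:                                          # local pollination
            j, k = np.random.choice(n, 2, replace=False)
            cand = pop[i] + np.random.rand() * (pop[j] - pop[k])
        cand = np.clip(cand, 0.0, upper)               # assumed search bound
        if fit(cand) < fit(pop[i]):                    # keep improvements (minimization)
            pop[i] = cand
    return pop[min(range(n), key=lambda i: fit(pop[i]))]  # current best flower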

3.2 Teaching-Learning-Based Optimization Algorithm

The Teaching-Learning-Based Optimization (TLBO) algorithm is comprised of


two phases: the teacher phase and the learner phase. The teacher phase is where
the learners gain knowledge from the teacher and the learner phase is where
learners interact and gain knowledge from each other. The advantage of the
TLBO is that it only requires non-algorithmic specific parameters (i.e. common
parameters), such as the population size and number of generations [19].
In this particular study, each learner of the population has four subjects: the three PID gains and the filter coefficient N. Algorithm 2 shows how the TLBO was implemented in this study. Note that in Algorithm 2, r_i, T_f and M_i represent a random number between 0 and 1, a teaching factor which can be either 0 or 1, and the mean result of the learners in that particular iteration, respectively.

4 Computational Experimental Design

This study used a basic design of experiments (DOE) [20] methodology, the 2^k factorial design technique, to compare the effect of two values for each parameter. Table 1 shows these two factors, along with the two values of each.

Algorithm 2 TLBO pseudo code

1: Initialize population N
2: Choose iteration number maxIT
3: Define objective function f(X)
4: Evaluate population and find best learner X_k*
5: while t < maxIT do
6:   for i = 1 : N do
7:     procedure Teacher Phase
8:       Find difference mean: Δ_{k,i} = r_i (X_{k*,i} − T_f M_i)
9:       New solution: X'_{k,i} = X_{k,i} + Δ_{k,i}
10:    procedure Learner Phase
11:      Randomly choose learners P and Q such that X'_{P,i} ≠ X'_{Q,i}
12:      if X'_{P,i} < X'_{Q,i} then
13:        Update P: X''_{P,i} = X'_{P,i} + r_i (X'_{P,i} − X'_{Q,i})
14:      if X'_{Q,i} < X'_{P,i} then
15:        Update Q: X''_{P,i} = X'_{P,i} + r_i (X'_{Q,i} − X'_{P,i})
16:    Evaluate new solutions
17:    Update new solutions if they are better
18: Find the current best learner k*
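A hedged sketch of one TLBO generation following Algorithm 2 (a NumPy form for a minimization objective; the teaching factor here is drawn from {1, 2} as in classical TLBO, and all names are our assumptions, not the authors' implementation):

import numpy as np

def tlbo_step(pop, scores, fit):
    n, dim = pop.shape
    # teacher phase: move each learner toward the best learner
    teacher = pop[np.argmin(scores)]
    tf = np.random.randint(1, 3)          # teaching factor in {1, 2}
    mean = pop.mean(axis=0)
    for i in range(n):
        cand = pop[i] + np.random.rand(dim) * (teacher - tf * mean)
        s = fit(cand)
        if s < scores[i]:
            pop[i], scores[i] = cand, s
    # learner phase: each learner interacts with a random peer
    for i in range(n):
        j = np.random.choice([x for x in range(n) if x != i])
        step = pop[i] - pop[j] if scores[i] < scores[j] else pop[j] - pop[i]
        cand = pop[i] + np.random.rand(dim) * step
        s = fit(cand)
        if s < scores[i]:
            pop[i], scores[i] = cand, s
    return pop, scores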

These parameter values were chosen considering various factors. A low population size may result in the algorithm not having enough population members to find the optimal solution, and a high population size may result in wasted computational effort. A low search space bound may result in the optimal solution not being included in the search space, and a high search space bound may reduce the probability of the algorithm obtaining the optimal solution. The detail of each algorithm design is shown in Table 2.

Table 1. 2k factorial design bounds

Factor Level 1 Level 2


Search space upper bound 100 500
Population size 20 60

The algorithm design and operation are evaluated for two problem instances:
a step input which represents a constant target velocity of 1 rad s−1 and a ramp
input which represents a constant acceleration of 1 rad s−2 . The study ran each
algorithm ten times (with random seed) in order to observe their reactive nature,
as also done in [11]. This research used the best PIDN solutions for performance
comparison because in a real-life scenario, the best solution from the number
of runs would be chosen [21]. The maximum iteration counts for the FPA and
TLBO algorithm were kept constant throughout the experiments at 50 and 25
respectively, as the TLBO performs two fitness function evaluations at each
iteration.

Table 2. Factorial design

AD Population size Search space


1 20 100
2 60 100
3 20 500
4 60 500
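As a small illustration (not the authors' code), the four algorithm designs in Table 2 are simply the Cartesian product of the two factor levels in Table 1:

from itertools import product

population_sizes = [20, 60]        # factor: population size
upper_bounds = [100, 500]          # factor: search space upper bound

for ad, (bound, pop) in enumerate(product(upper_bounds, population_sizes), start=1):
    print(f"AD {ad}: population size = {pop}, search space upper bound = {bound}")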

5 Results and Discussions

5.1 Parameter Optimization

Table 3 shows that the TLBO algorithm, implemented with algorithm design 4
(AD = 4), obtained the lowest fitness value. Furthermore, the lowest mean fitness
value was obtained by AD = 4 for all the algorithms, showing that AD = 4 is the
best performing for this problem instance when observing the fitness value.

Table 3. Fitness value statistics

AD Best (Min) Mean Max Std Dev


Dynamic 1 34.7076 35.0040 35.6922 0.343004
FPA 2 34.7114 34.7304 34.7682 0.0139906
3 14.8154 25.4771 34.7096 9.19568
4 14.8167 21.2140 34.4665 6.71812
Static 1 34.7128 35.1717 36.8325 0.6085149
FPA 2 34.7207 34.8006 34.9912 0.0871953
3 15.4595 29.3366 35.2087 8.25672
4 16.0833 24.4707 34.5326 6.35065
TLBO 1 34.6111 34.6611 34.6753 0.017996
2 34.5891 34.6242 34.6699 0.0311257
3 14.2111 23.1669 34.4206 9.247853
4 13.9728 21.0753 34.4200 8.781703

The results in Table 3 also show that the dynamic FPA generally obtained
lower fitness values compared to the static FPA, i.e., the problem benefits from
fine-tuning and exploitation in later iterations rather than a constant higher
probability of global pollination throughout all iterations. This could also be
the reason why the TLBO algorithm obtained the lowest fitness value, as this
algorithm works by fine-tuning the solutions at each iteration.
Both the FPA and TLBO resulted in a PI controller solution, as the FPA
resulted in a low derivative gain, and the TLBO resulted in a low filter coefficient

Table 4. Optimal PID parameters and results

AD KP KI KD N tR (s) tS (s) %OS


Dynamic 1 94.7108 45.0245 0 82.7842 0.0260 0.0389 0.0295
FPA 2 99.5786 46.3494 0.0239934 72.0288 0.0258 0.0389 0.0368
3 25.0692 219.082 0 368.749 0.0445 0.3159 13.3413
4 24.9139 278.149 0 500 0.0415 0.2687 16.2584
Static 1 98.8912 47.1230 0 100 0.0258 0.0380 0.0382
FPA 2 99.6802 46.182 0 84.5793 0.0257 0.0378 0.0339
3 23.0263 175.111 0 500 0.0497 0.3559 12.4789
4 21.7208 143.351 0 409.259 0.0544 0.3959 11.4223
TLBO 1 95.7066 48.0759 97.7518 0.0138523 0.0259 0.0384 0.0425
2 99.2521 53.2571 91.8365 0.287085 0.0251 0.0341 0.0303
3 25.1757 25.7884 0 462.389 0.0668 0.1197 0.0135
4 26.5995 23.8218 0 455.577 0.0634 0.1163 0.0030
Journal results N/A 4.5 3.6 0.05 100 0.4422 0.8526 0

Fig. 1. Step response comparing the algorithm design performance for the TLBO: (a) steady-state response; (b) zoomed-in view of (a). Curves: AD1–AD4, journal result and commanded input; axes: step response versus time (s).

as seen in Table 4. The PI controller solution means that the system is sufficiently damped, as the purpose of the derivative controller is to reduce overshoot and oscillations, i.e. to increase damping.
The best performing algorithm design is AD = 2, as seen in Fig. 1 for the TLBO algorithm and Fig. 2 for the FPA, with the static FPA performing best. In order to find the overall best performing algorithm and algorithm design, the best performing FPA (implemented with the best performing algorithm design) was compared with the TLBO algorithm, to observe whether this problem instance prefers algorithms with algorithm-specific parameters. The algorithm design chosen as best performing for all the algorithms is AD = 2, as it showed a good compromise: the most favourable response, yet with a reasonable fitness value. Since the static FPA (implemented using AD = 2) resulted in the lowest rise and settling times and followed the commanded input the closest, this algorithm was chosen to be compared with the TLBO algorithm.
Figure 3 shows that the TLBO algorithm reaches the commanded input faster.

Fig. 2. Step response comparing the algorithm design performance for the FPA: (a) steady-state response; (b) zoomed-in view of (a). Curves: D-AD1–D-AD4, S-AD1–S-AD4, journal result and commanded input; axes: step response versus time (s).
Fig. 3. The comparison of the step response of the algorithm performance: (a) steady-state response; (b) zoomed-in view of (a). Curves: TLBO, static FPA and commanded input; axes: step response versus time (s).

Thus, the TLBO algorithm using AD = 2 was chosen as the best performing for this problem instance.

5.2 Robustness Analysis

Table 5 shows that the increase in population size from AD = 1 to AD = 2 resulted


in a decrease in the fitness value for all the algorithms. However, increasing the
search space bound, as done from AD = 2 to AD = 3, caused a significant decrease
in the fitness value. This is because of the increase in the integral controller gain
solutions (KI) obtained, as seen in Table 6. The increase in this term shows its importance when the input is changed from step to ramp.
The lowest fitness values were obtained using AD = 4 for all the algorithms,
with the lowest obtained from the TLBO algorithm.
Figure 4 shows that the increase in the integral gain across the algorithm designs reduced the steady-state error, which resulted in a decreased fitness value.

Table 5. Fitness value statistics

AD Best Mean Max Std Dev


Adaptive 1 5.9056 6.0110 6.1584 0.079769
FPA 2 5.8152 5.9067 5.9618 0.04944
3 1.8489 2.4535 4.4354 0.7344
4 1.7788 1.8828 2.077 0.07754
Static 1 5.9162 6.04216 6.1049 0.05763
FPA 2 5.86 5.9424 6.0185 0.05194
3 1.7962 2.6258 4.3594 0.88795
4 1.8326 1.91394 2.1113 0.08125
TLBO 1 5.8078 6.0329 6.2369 0.1189
2 5.7446 5.8962 6.1151 0.09516
3 1.7793 1.7851 1.791 0.003700
4 1.759 1.7761 1.7883 0.007449

Table 6. Algorithm solutions

AD KP KI Kd N
Adaptive 1 68.4011 34.2160 13.2017 100
FPA 2 97.7425 0 13.5612 94.1247
3 22.6570 500 0 36.0330
4 25.2946 500 0 165.944
Static 1 59.0788 55.6679 14.3738 89.4286
FPA 2 37.3178 79.6967 13.6054 99.8358
3 25.2989 500 0 84.1660
4 24.0282 500 0 415.586
TLBO 1 42.9984 66.3752 13.491 99.8793
2 43.2900 83.3955 13.4509 100
3 24.806 500 0 500
4 25.0514 499.932 0 358.493

This problem favoured the dynamic FPA over the static FPA, as the dynamic FPA gave the lower fitness value of the two. This means this problem instance is not multi-modal and requires fine-tuning of a specific region in the search space. This experiment resulted in AD = 4 performing best, with the lowest mean fitness value and the best response behaviour for all the algorithms.
Because the dynamic FPA gave the lowest fitness function value when compared to the static FPA, the dynamic FPA was chosen for comparison, along with the TLBO, using the algorithms' ramp responses shown in Fig. 6. This figure illustrates that though the TLBO algorithm resulted in the lowest fitness value, the best performing algorithm is the dynamic FPA, which follows the ramp input most closely. The low fitness value of the TLBO algorithm, when compared to the dynamic FPA, may be a result of the dynamic FPA taking slightly longer to settle. Thus, there is no best performing algorithm in this instance, as the

Fig. 4. Ramp response comparing the algorithm design performance for the TLBO: (a) steady-state response; (b) zoomed-in view of (a). Curves: AD1–AD4 and commanded input; axes: response versus time (s).
Fig. 5. Ramp response comparing the algorithm design performance for the FPA: (a) steady-state response; (b) zoomed-in view of (a). Curves: D-AD1–D-AD4, S-AD1–S-AD4 and commanded input; axes: response versus time (s).
Fig. 6. The comparison of the ramp response of the algorithm performance: (a) steady-state response; (b) zoomed-in view of (a). Curves: TLBO, dynamic FPA and commanded input; axes: response versus time (s).



TLBO algorithm gave the lowest fitness value, but the dynamic FPA follows the
input the fastest, though it results in a longer time to settle.

5.3 Suggestions for Future Practitioners


– If the target is already moving with a constant velocity (step input), the proportional gain should be high, since the gimbal is starting from zero velocity and needs to reach the target velocity. If the target begins from zero as well and moves with constant acceleration (ramp input), the integral gain is significant, as it captures the error over time: the gimbal moves with the target rather than attempting to reach the target motion. The search space bound of the algorithm must be adjusted accordingly to ensure that the optimal results are within the search space.
– Ensure that the algorithm-specific parameter values chosen give a higher probability of exploitation, rather than constant exploration, in later iterations.
– Since the best solutions for all the algorithms resulted in higher PI controller gains for both problem instances but lower derivative controller gains or filter coefficients, one may reduce the search space upper bound of the derivative controller, which may increase the number of viable solutions (Fig. 5).

6 Conclusion
The tuning of PID controllers is imperative to ensuring good performance, as classical tuning methods have proven to produce only sub-optimal results and the need for better tuning methods remains. Nature-inspired meta-heuristic algorithms have been shown to be a good alternative solution for PID controller tuning. However, these algorithms have their own (meta) parameters which need to be tuned in order to ensure good algorithm performance. This study made use of the 2^k factorial method for tuning non-algorithm-specific parameters, and compared self-tuning methods with static parameter values for tuning algorithm-specific parameters. The performance of the algorithms was then compared on two different simulation conditions which represent two different target motions. It was found that the TLBO algorithm resulted in the lowest (best) fitness value and reached the near-optimal solution with the lowest iteration number.
Most gimbal stabilization systems have two, three or even more axes. Future
work should consider how these algorithms perform when optimising PID sys-
tems for multi-axis gimbal systems that include Coriolis-force cross-coupling
disturbance torque between the axes.

References
1. Jan, J.A., Šulc, B.: Evolutionary computing methods for optimising virtual reality
process models. In: International Carpathian Control Conference, May 2002

2. Memari, A., Ahmad, R., Rahim, A.R.A.: Metaheuristic algorithms: guidelines for
implementation. J. Soft Comput. Decis. Support Syst. 4(7), 1–6 (2017)
3. Hilkert, J.M.: Inertially stabilized platform technology concepts and principles.
IEEE Control Syst. Mag. 28, 26–46 (2008)
4. Bujela, B.W.: Investigation into the robust modelling, control and simulation of a
two-dof gimbal platform for airborne applications. Msc. Thesis, University of the
Witwatersrand, Johannesburg, SA (2013)
5. Ribeiro, J.M.S., Santos, M.F., Carmo, M.J., Silva, M.F.: Comparison of PID
controller tuning methods: analytical/classical techniques versus optimization
algorithms. In: 2017 18th International Carpathian Control Conference (ICCC),
pp. 533–538, May 2017
6. Jalilvand, A., Kimiyaghalam, A., Ashouri, A., Kord, H.: Optimal tuning of pid con-
troller parameters on a DC motor based on advanced particle swarm optimization
algorithm, vol. 3, no. 4, p. 9 (2011)
7. Salem, A., Hassan, M.A.M., Ammar, M.E.: Tuning PID controllers using artifi-
cial intelligence techniques applied to DC-motor and AVR system. Asian J. Eng.
Technol. 02(02), 11 (2014)
8. Rajesh, R.J., Kavitha, P.: Camera gimbal stabilization using conventional PID
controller and evolutionary algorithms. In: 2015 International Conference on
Computer Communication and Control (IC4), (Indore), IEEE, September 2015.
https://doi.org/10.1109/IC4.2015.7375580
9. Dash, P., Saikia, L.C., Sinha, N.: Flower pollination algorithm optimized PI-PD
cascade controller in automatic generation control of a multi-area power system.
Int. J. Electr. Power Energy Syst. 82, 19–28 (2016)
10. Jagatheesan, K., Anand, B., Samanta, S., Dey, N., Santhi, V., Ashour, A.S., Balas,
V.E.: Application of flower pollination algorithm in load frequency control of multi-
area interconnected power system with nonlinearity. Neural Comput. Appl. 28,
475–488 (2017)
11. Fister, D., Fister Jr., I., Fister, I., Šafarič, R.: Parameter tuning of PID controller
with reactive nature-inspired algorithms. Robot. Autonom. Syst. 84, 64–75 (2016)
12. Sabir, M.M., Khan, J.A.: Optimal design of PID controller for the speed control
of DC motor by using metaheuristic techniques. Adv. Artif. Neural Sys. 2014, 1–8
(2014). https://doi.org/10.1155/2014/126317
13. Rajinikanth, V., Satapathy, S.C.: Design of controller for automatic voltage regu-
lator using teaching learning based optimization. Procedia Technol. 21, 295–302
(2015)
14. Li, W., He, Z., Zheng, J., Hu, Z.: Improved flower pollination algorithm and its
application in user identification across social networks. IEEE Access 7, 44359–
44371 (2019)
15. Jia, R., Nandikolla, V., Haggart, G., Volk, C., Tazartes, D.: System performance
of an inertially stabilized gimbal platform with friction, resonance, and vibra-
tion effects. J. Nonlinear Dyn. 2017, 1–20 (2017). https://doi.org/10.1155/2017/
6594861
16. Pan, F., Liu, L.: Research on different integral performance indices applied
on fractional-order systems. In: 2016 Chinese Control and Decision Conference
(CCDC), pp. 324–328, May 2016
17. Yang, X.-S.: Flower pollination algorithm for global optimization. In: Durand-
Lose, J., Jonoska, N. (eds.) Unconventional Computation and Natural Computa-
tion vol. 7445, pp. 240–249. Springer, Heidelberg (2012)

18. Alyasseri, Z.A.A., Khader, A.T., Al-Betar, M.A., Awadallah, M.A., Yang, X.-S.:
Variants of the flower pollination algorithm: a review. In: Yang, X.-S. (ed.) Nature-
Inspired Algorithms and Applied Optimization, vol. 744, pp. 91–118. Springer,
Cham (2018)
19. Rao, R.V.: Teaching-Learning-Based Optimization and Its Engineering Applica-
tions. Springer, Cham (2016)
20. Adenso-Díaz, B., Laguna, M.: Fine-tuning of algorithms using fractional experimental designs and local search. Oper. Res. 54, 99–114 (2005)
21. Eiben, A.E., Jelasity, M.: A critical note on experimental research methodology
in EC. In: Proceedings of the 2002 Congress on Evolutionary Computation, CEC
2002, vol. 1, pp. 582–587. IEEE, Honolulu, 12 May 2002
Solving an Integer Program by Using
the Nonfeasible Basis Method Combined
with the Cutting Plane Method

Kasitinart Sangngern and Aua-aree Boonperm

Department of Mathematics and Statistics, Faculty of Science and Technology, Thammasat University, Bangkok, Thailand
s.kasitinart@gmail.com, aua-aree@mathstat.sci.tu.ac.th

Abstract. For solving an integer linear programming problem, a linear programming (LP) relaxation problem is solved first. If the optimal solution to the LP relaxation problem is not integer, then some technique to find the optimal integer solution is applied. For solving the LP relaxation problem, the simplex method is performed, so artificial variables may be required, which means that the problem size is expanded and more computational time is wasted. In this paper, an artificial-free technique, called the nonfeasible basis method (NFB), is combined with the cutting plane method for solving an integer linear programming problem; the combination is called the nonfeasible-basis cutting plane method (NBC). It performs on the condensed tableau, which is a smaller simplex tableau. From the computational results, we found that the computational time can be reduced.

Keywords: Integer linear programming · Cutting plane method · Nonfeasible basis method · Artificial-free technique

1 Introduction

In many real-world problems, there are a lot of problems which require integer solutions, such as travelling salesman problems [1], packing problems [2], knapsack problems [3], assignment problems [4], hub location problems [5], etc. To solve them, integer linear programming problems are formulated, for which many methods, such as the branch and bound method [6], the cutting plane method [7, 8] and the branch and cut method [9], can be used to find the optimal solution. The steps of these methods involve solving an LP relaxation problem, in which the integer condition is dropped, and then solving linear programming subproblems for finding the integer solution.
One of the methods most widely used to solve the integer programming problem is the branch and bound method [6], which is a divide-and-conquer strategy. This method starts by considering the fractional optimal solution to the LP relaxation problem. Then, it branches on generated sub-problems and bounds using the integer solutions, infeasible solutions and fractional solutions that give worse objective values than the current


best solution. However, for solving a large problem, the branch and bound method has a lot of sub-problems to solve, which leads to expensive computational time.
In 1958, Gomory [7] introduced a technique named the cutting plane method to solve an integer linear programming problem. This technique starts by solving the LP relaxation problem. If an optimal solution to the LP relaxation problem is not integer, then a cut is added to find the integer solution. The construction of an efficient added cut is one of the interesting issues in improving the cutting plane method.
The well-known cut is the Gomory's cut, which is obtained from the optimal simplex tableau after the LP relaxation problem is solved. This cut is constructed from the coefficients of nonbasic variables in a row that has a fractional solution. Later, in 2005, Glover and Sherali [10] proposed an improvement of the Gomory's cut, named the Chvatal-Gomory-tier cut. In 2011, Wesselmann et al. [11] developed the Gomory's cut to enhance the performance of the Gomory mixed integer cut. Additionally, the Gomory's cut has been improved by many researchers [8, 10].
From the above research, each method used to find the optimal integer solution requires an algorithm to solve subproblems which are linear programming problems. Therefore, efficient algorithms for solving linear programming problems have been investigated.
The well-known method for solving a linear programming problem is the simplex method, the traditional iterative method introduced by Dantzig in 1947 [12]. However, the simplex method is not the best method, since Klee and Minty [13] demonstrated that the simplex method has poor worst-case performance.
There are many issues on which researchers have attempted to improve the simplex method, for example, proposing techniques for removing redundant constraints [14], introducing new methods for solving specific linear programs [15, 16] and developing artificial-free techniques [17–24].
Before the simplex method starts, the linear programming problem in the standard
form is required. So, the slack or surplus variables are added. Then, a coefficient matrix
of constraints is separated into two sub matrices called the basic matrix (the basis) and
the nonbasic matrix. The variables corresponding to the basis are the basic variables
while the remaining variables are the nonbasic variables. The simplex method starts by
choosing an initial basis to construct a basic feasible solution, then the basis is updated
until the optimal solution is found. However, it is hard to construct an initial feasible
basis by choosing basic variables among decision variables. In the case that surplus
variables are added, the basic variables should consist of slack variables and artificial
variables. It leads to the enlarged problem size. To solve this issue, some researchers
develop a new method for solving linear programming problems which start by
infeasible initial basis.
In 1969, Zionts [17] introduced the artificial-free technique named criss-cross
technique. This technique focuses on both of primal and dual problems. It alternates
between the (primal) simplex method and the dual simplex method until a feasible
solution of a primal problem or a dual problem is found.
Later in 2000, Pan [18] proposed two novel perturbation simplex variants combined
with the dual pivot rules for solving a linear programming problem without using the
artificial variable.

In 2003, Paparrizos et al. [19] introduced a method for solving a linear programming problem by using an infeasible basis. The chosen basis is generated by two artificial variables associated with an additional constraint with a big-M number. Furthermore, this infeasible basis returns a dual feasible solution.
In 2009, Nabli [20] described an artificial-free technique named the nonfeasible basis method. The nonfeasible basis method starts when the initial basis gives a primal infeasible solution. It performs on the condensed tableau, which is a compact version of the simplex tableau. In the condensed tableau, only the columns of nonbasic variables are considered, because the columns of basic variables do not change between iterations. By the operation of the nonfeasible basis method, its performance is better than that of the two-phase method. Later, in 2015, NFB was improved by Nabli and Chahdoura [21] using a new pivot rule for choosing an entering variable. In 2020, Sangngern and Boonperm [22] proposed a method for constructing an initial basis which is close to the optimal solution by considering the angles between the objective function and the constraints. If the current basis is infeasible, then the nonfeasible basis method is performed.
In 2014, Boonperm and Sinapiromsaran [23, 24] introduced the artificial-free
techniques by solving the relaxation problem which is constructed by considering the
angle between the objective function and constraints.
From the above research, we found that one technique for solving an integer program and one technique for solving a linear programming problem are compatible: the nonfeasible basis method and the cutting plane method with the Gomory's cut. Since the added Gomory's cut is generated using only the coefficients of nonbasic variables, like the nonfeasible basis method, which is performed on the condensed tableau, we use this compatibility to develop an algorithm for solving an integer linear programming problem by combining the nonfeasible basis method and the cutting plane method. It is named the nonfeasible-basis cutting plane method (NBC).
This paper is organized into five sections. The first section is the introduction. Second, all tools that are used for the proposed method are described. Third, the proposed method is detailed. The computational results are shown in the fourth section. Finally, the last section is the conclusion.

2 Preliminaries

In this section, the cutting plane method with the Gomory’s cut for solving an integer
linear programming problem and the artificial-free technique named the Nonfeasible
Basis method are described. We begin with the cutting plane method.

2.1 Cutting Plane Method


Consider a (pure) integer linear programming problem:

max c^T x
(ILP)   s.t.  Ax ≤ b
        x ≥ 0 and integer,

where A ∈ R^{m×n}, b ∈ R^m, c ∈ R^n and x ∈ R^n.


Before an integer linear programming problem is solved, the solution to the LP relaxation is required. The LP relaxation problem (LPR) of ILP is ILP with its integer condition dropped, which can be written as follows:

max c^T x
(LPR)   s.t.  Ax ≤ b
        x ≥ 0.

To solve LPR, if the origin point is a feasible point, then the simplex method can start. Otherwise, the two-phase method is performed.
Let A = [B N], x^T = [x_B^T x_N^T] and c^T = [c_B^T c_N^T]. The standard form of LPR associated with the basis B, the nonbasic matrix N, the basic variables x_B and the nonbasic variables x_N is written as follows:

max  0^T x_B + (c_N^T − c_B^T B^{-1}N) x_N + c_B^T B^{-1}b
s.t. I_m x_B + B^{-1}N x_N = B^{-1}b
     x_B, x_N ≥ 0.

For the cutting plane method, after LPR is solved, if its optimal solution is not integer, then cuts are added to the LP relaxation problem to cut off the non-integer solution until an optimal integer solution is found. In this paper, we are interested in the Gomory's cut, which is defined below.
Let B be the optimal basis of LPR. The associated tableau is as follows:

        x_B    x_N
x_B  |  I_m    B^{-1}N                |  B^{-1}b
z    |  0      c_N^T − c_B^T B^{-1}N  |  c_B^T B^{-1}b

Suppose that there is at least one row of the optimal tableau, called the ith row, for which (B^{-1}b)_i is fractional. This row corresponds to the equation

x_{B_i} + Σ_{j∈I_N} ā_{ij} x_j = b̄_i,   (1)

where ā_{ij} = (B^{-1}N)_{ij}, b̄_i = (B^{-1}b)_i and I_N is the set of indices of nonbasic variables.

The Gomory's cut generated by the ith row is defined by

Σ_{j∈I_N} (⌊ā_{ij}⌋ − ā_{ij}) x_j + x_{n+1} = ⌊b̄_i⌋ − b̄_i,   (2)

where x_{n+1} is the non-negative slack variable.
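As an illustration, a minimal sketch of building the cut row of Eq. (2) from an optimal tableau row (abar_i, standing for the ith row of B^{-1}N, and bbar_i, standing for (B^{-1}b)_i, are assumed NumPy inputs of our own naming, not tied to the paper's implementation):

import numpy as np

def gomory_cut(abar_i, bbar_i):
    # coefficients and right-hand side of
    # sum_j (floor(abar_ij) - abar_ij) x_j + x_{n+1} = floor(bbar_i) - bbar_i
    coeffs = np.floor(abar_i) - abar_i   # non-positive fractional parts
    rhs = np.floor(bbar_i) - bbar_i      # non-positive, so the dual simplex applies
    return coeffs, rhs

print(gomory_cut(np.array([1.5, -0.25]), 2.75))
# -> (array([-0.5 , -0.75]), -0.75)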

Algorithm 1: Cutting Plane method (with Gomory's cut)

Input: optimal tableau of LPR
Let x* be the optimal solution to LPR
While x* is not integer do
  Add the Gomory's cut into the optimal tableau
  Perform the dual simplex method
  If the optimal solution is found then
    The optimal integer solution is found
  Else
    The optimal integer solution does not exist
  End
End

Algorithm 1 summarizes the steps of the cutting plane method. Before the cutting plane method is performed, LPR must be solved by the simplex method or the two-phase method. If the two-phase method is chosen, then artificial variables are required; consequently, the problem size must be expanded and the computational time grows. To avoid the use of artificial variables, an artificial-free technique is introduced. Such a technique dispenses with artificial variables and reduces the computational time.

2.2 Nonfeasible Basis Method


The nonfeasible basis method (NFB) is one of the artificial-free techniques performed on the condensed tableau, which is described below. Consider the following original simplex tableau:

        x_B    x_N
x_B  |  I_m    B^{-1}N                |  B^{-1}b
z    |  0      c_N^T − c_B^T B^{-1}N  |  c_B^T B^{-1}b

For all iterations, the column matrix of basic variables (x_B) is the identity matrix. Thus, the condensed tableau is introduced instead of the original simplex tableau by removing the columns of basic variables. It can be written as follows:

        x_N
x_B  |  B^{-1}N                |  B^{-1}b
z    |  c_N^T − c_B^T B^{-1}N  |  c_B^T B^{-1}b

For using the simplex method with the condensed tableau, the updated condensed tableau for each iteration is defined by the theorem in [9]; a small sketch of this pivot update is given after Algorithm 2. NFB was introduced to construct a feasible basis without using artificial variables, and it deals with the condensed tableau. So, the size of the matrix handled by this method is smaller than in the original simplex method. The NFB method starts when the chosen basis B is an infeasible basis (i.e., there exists an ith row with (B^{-1}b)_i < 0). Finally, this method returns a basic feasible solution. The detail of the algorithm is shown as follows:

Algorithm 2: Nonfeasible Basis method (NFB)

Input: infeasible basis B
While min_i (B^{-1}b)_i < 0 do
  k ← arg min_i (B^{-1}b)_i and K ← …
  If K = ∅ then
    Exit /* the feasible domain is empty */
  Else
    s ← arg min …
    r ← arg max …
    Exchange the chosen basic and nonbasic variables
  End
  Apply pivoting
End
The current basis is feasible.
Apply the simplex method by the desired pivot rule.
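The pivot update on the condensed tableau itself can be sketched as follows (a standard Tucker-style pivot written under our own conventions, not the authors' code; T holds B^{-1}N with B^{-1}b as its last column and the reduced-cost row at the bottom):

import numpy as np

def condensed_pivot(T, r, s):
    # pivot on element T[r, s]; afterwards column s stores the leaving
    # variable's column, so the basic/nonbasic labels are swapped alongside
    T = T.astype(float).copy()
    p = T[r, s]
    col = T[:, s].copy()                 # entering column before the pivot
    T[r, :] = T[r, :] / p                # new pivot row
    for i in range(T.shape[0]):
        if i != r:
            T[i, :] -= col[i] * T[r, :]  # eliminate the entering column
    T[:, s] = -col / p                   # column of the leaving variable
    T[r, s] = 1.0 / p
    return T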

3 Nonfeasible-Basis Cutting Plane Method

Consider the following integer linear programming problem:

max c^T x
(ILP)   s.t.  Ax ≤ b                    (3)
        x ≥ 0 and integer,

where A ∈ R^{m×n}, b ∈ R^m, c ∈ R^n and x ∈ R^n.

Then, the standard form of the LP relaxation problem of ILP is as follows:

max c^T x
s.t.  Ax + I_m s = b                    (4)
      x, s ≥ 0.

If b ≥ 0, then the simplex method can be performed for solving problem (4). Otherwise, the origin is not a feasible point and artificial variables are required.
From the previous section, the nonfeasible basis method is preferable to the two-phase method because it performs on a smaller tableau and does not involve artificial variables. Consequently, NFB is used for solving problem (4). After problem (4) is solved, the optimal integer solution to problem (3) is obtained according to the following cases:
1. If the optimal solution to the LP relaxation problem is found and it is integer, then it is the optimal solution to problem (3).
2. If the optimal solution to the LP relaxation problem is not integer, then the cutting plane method is performed on the condensed tableau.
3. If the LP relaxation problem is unbounded, then problem (3) is unbounded.
4. If the LP relaxation problem is infeasible, then problem (3) is infeasible.
From problem (4), the optimal condensed tableau is written as follows:

        x_N
x_B  |  B^{-1}N                |  B^{-1}b
z    |  c_N^T − c_B^T B^{-1}N  |  c_B^T B^{-1}b

From the above tableau, the reduced cost c_N^T − c_B^T B^{-1}N ≤ 0 is obtained; that is, the dual solution is feasible. Suppose that there exists a fractional component of the solution. Then, the cutting plane method is preferred for finding the integer solution.

If the Gomory's cut is used, then one of the rows in the optimal tableau, say the kth row, which has a fractional solution, is selected to generate the added cut. The added cut is

(⌊a_k⌋ − a_k) x_N + x_{m+n+1} = ⌊b̄_k⌋ − b̄_k,   (5)

where a_k is the kth row of the matrix B^{-1}N, ⌊a_k⌋ = [⌊a_{k1}⌋ … ⌊a_{kn}⌋] and b̄_k = (B^{-1}b)_k.
From the added cut (5), if x_{m+n+1} is chosen to be a basic variable, then this cut contributes only coefficients of nonbasic variables to the optimal condensed tableau, which is suitable for the cutting plane method. Hence, the condensed tableau with the added cut can be written as follows:

            x_N
x_B       |  B^{-1}N                |  B^{-1}b
x_{m+n+1} |  ⌊a_k⌋ − a_k            |  ⌊b̄_k⌋ − b̄_k
z         |  c_N^T − c_B^T B^{-1}N  |  c_B^T B^{-1}b

 
Since ⌊b̄_k⌋ − b̄_k cannot be positive, the dual simplex method is performed on the condensed tableau with the added cut. After the LP relaxation problem is solved, if the obtained solution is not integer, then these steps are repeated, adding a new cut one at a time, until the optimal integer solution is found. The algorithm can be summarized as follows:

Algorithm 3: Nonfeasible-Basis Cutting Plane method (NBC)

Input: A, b and c
Generate the LP relaxation problem
If the origin point is feasible do
  Perform the simplex method on the condensed tableau
Else
  Perform NFB
End
While the solution is not integer do
  Perform the cutting plane method on the optimal condensed tableau
  If the optimal integer solution is found then
    The optimal integer solution is found
  Else
    There is no integer feasible solution
  End
End
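For illustration, the dual simplex pivot choice used after a cut is added can be sketched as follows (the textbook rule under our own naming, not the authors' code):

import numpy as np

def dual_simplex_pivot(abar, bbar, zrow):
    # abar: B^{-1}N, bbar: B^{-1}b, zrow: reduced costs (all <= 0 here);
    # returns a pivot (r, s) or None when no pivot is required/possible
    r = int(np.argmin(bbar))
    if bbar[r] >= 0:
        return None                       # primal feasible: optimal
    neg = np.where(abar[r, :] < 0)[0]
    if neg.size == 0:
        return None                       # no eligible column: LP infeasible
    ratios = np.abs(zrow[neg] / abar[r, neg])
    s = int(neg[np.argmin(ratios)])
    return r, s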

4 Computational Results

In Sect. 3, a combination method for solving an integer linear programming problem without using artificial variables was designed. To show the performance of NBC, the computational times for solving randomly generated integer programs with the NBC method and with the traditional method (the simplex method or the two-phase method combined with the cutting plane method) are compared. Both methods are implemented in Python. Finally, the computational results are shown and discussed.

4.1 Generated Problem


Consider the following integer linear programming problem

max cT x
s:t: Ax  b ð6Þ
x  0 and integer;

where A 2 Rmn ; b 2 Rm ; c 2 Rn and x 2 Rn .


Generated problems are in the form of problem (6) and satisfy the following conditions (a generator sketch is given after the list):

1. Each entry a_{ij} of the coefficient matrix A = [a_{ij}] is randomly generated in the interval [−9, 9].
2. The entries c_j of the vector c^T = [c_j] and x_i of a vector x = [x_i] are randomly generated in the interval [0, 9].
3. After the coefficient matrix A and the vector x are generated, the right-hand-side vector b is computed by b = Ax.
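The sketch below generates such an instance with NumPy; the function name is an illustrative assumption.

import numpy as np

def generate_instance(m, n, seed=None):
    # Random instance of problem (6) following conditions 1-3 above.
    rng = np.random.default_rng(seed)
    A = rng.integers(-9, 10, size=(m, n))   # a_ij in [-9, 9]
    c = rng.integers(0, 10, size=n)         # c_j in [0, 9]
    x = rng.integers(0, 10, size=n)         # a known integer point
    b = A @ x                               # right-hand side b = Ax
    return A, b, c

Because b = Ax for a nonnegative integer x, the generated instance is guaranteed to have a feasible integer point.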
The different sizes of the number of variables (n) and the number of constraints (m) that are tested are summarized in the following table (Table 1):

Table 1. The summarized data of the generated problems

n     21   41   61   81   101  201  | 301  401  501  601
m     10   20   30   40   50   100  | 150  200  250  300
Size              Small             |        Large

All methods are implemented in Python on Google Colaboratory, and the tests were run on an Intel® Core™ i7-5500U, 2.4 GHz, with 8 GB of RAM.
For the traditional method, the two-phase method is used for solving the LP relaxation problem, and then the original cutting plane method is used for finding the optimal integer solution; both are performed on the full simplex tableau.
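The timing comparison reported next can be reproduced with a harness along the following lines; the solver interface is an illustrative assumption, not the authors' code.

import time

def average_time(solver, instances):
    # Mean wall-clock time of `solver` over a list of (A, b, c) triples.
    start = time.perf_counter()
    for A, b, c in instances:
        solver(A, b, c)
    return (time.perf_counter() - start) / len(instances)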

4.2 Computational Results


According to the experimental design, we show the comparison of the average computational time before the cutting plane method is performed, the average total computational time, and bar charts of the average total computational time for each problem size with the standard deviation.

Table 2. The average computational time (sec.)

Size    (row, col.)   TRAD                 NBC
                      BC        TC         BC        TC
Small   21,10         0.14      0.15       0.03      0.04
        41,20         0.56      0.58       0.16      0.19
        61,30         1.83      1.87       0.41      0.49
        81,40         4.15      4.22       0.94      1.19
        101,50        7.75      7.85       1.92      3.08
        201,100       70.12     70.50      32.23     32.33
Large   301,150       274.85    275.75     153.88    154.48
        401,200       687.05    688.82     234.65    258.05
        501,250       1408.01   1410.14    307.72    628.82
        601,300       2972.74   2976.51    845.10    1334.19
Total average         5427.2    5436.39    1577.04   2412.86

Table 2 reports the average computational time before the cutting plane method is performed and the average total computational time. The main columns, TRAD and NBC, give the computational times for solving the problems by the traditional method and the NBC method, respectively, and the sub-columns, BC and TC, give the average computational time before the cutting plane method is performed and the average total computational time, respectively. In every row, the smallest average computational time is attained by NBC.
From Table 2, both the average computational time before the cutting plane method is performed and the average total computational time of NBC are less than those of TRAD. Since NBC does not involve artificial variables and is performed on the condensed tableau, while TRAD requires artificial variables and the original cutting plane method, NBC is faster than TRAD.
From the computational results, we found that the average total computational time of NBC is 2412.86 / 5436.39 ≈ 0.4438, or approximately 44.38%, of the average total computational time of TRAD. This means that the proposed method reduces the computational time by approximately 55.62% compared with the traditional method (Fig. 1).

Fig. 1. The average of the total computational time

4.3 Discussion
From the computational results, we found that the average total computational time of the traditional method is greater than that of the NBC method. There are two main reasons why the NBC method is more effective than the traditional method. First, the size of the tableau handled by each method is different: the traditional method performs on the full simplex tableau, which is bigger than the condensed tableau used in the NBC method.
To compare the sizes of the tableaux between the original simplex tableau and the condensed tableau, note that if one cut is added, then the original simplex tableau is expanded by one row and one column:

              x_B   x_N                    x_{m+n+1}   RHS
  x_B         I_m   B^{-1}N                0           B^{-1}b
  x_{m+n+1}   0     ⌊a_s⌋ − a_s            1           ⌊b̄_s⌋ − b̄_s
  z           0     c_N^T − c_B^T B^{-1}N  0           −c_B^T B^{-1}b

while the condensed tableau is expanded by only one row:

              x_N                      RHS
  x_B         B^{-1}N                  B^{-1}b
  x_{m+n+1}   ⌊a_s⌋ − a_s              ⌊b̄_s⌋ − b̄_s
  z           c_N^T − c_B^T B^{-1}N    −c_B^T B^{-1}b

Moreover, if k cuts are added, then the original simplex tableau must be expanded by k rows and k columns, while the condensed tableau is expanded by only k rows. Therefore, the tableaux handled by NBC are smaller than those of the original simplex method, which is why NBC can reduce the computational time.
The second reason concerns artificial variables. The traditional method involves artificial variables, so the number of variables in its system is larger than in the NBC method, which is an artificial-free technique.

5 Conclusion

There are two types of methods for solving an integer linear programming problem: the exact method and the heuristic method. In this paper, we focus on the exact method. The traditional exact method consists of a method for solving the LP relaxation problem and a technique for finding the integer solution. For solving the LP relaxation problem, the simplex method or the two-phase method is preferred. If the two-phase method is chosen, then artificial variables are added, which implies that the problem must be expanded. To avoid artificial variables, an artificial-free technique is required. The artificial-free technique that we choose for dealing with artificial variables is the nonfeasible basis method (NFB). NFB is not only artificial-free, but it also performs on a small simplex tableau called the condensed tableau.
The nonfeasible-basis cutting plane method (NBC) is a combination of the nonfeasible basis method and the cutting plane method performed on the condensed tableau. NBC starts with NFB for solving the LP relaxation problem; then the cutting plane method is used to find the integer solution. Since NBC does not use artificial variables and performs on the condensed tableau, its computational time must be less than that of the traditional method. From the computational results, we found that the computational time of NBC is approximately 44% of the computational time of the traditional method.

References
1. Fatthi, W., Haris, M., Kahtan H.: Application of travelling salesman problem for minimizing
travel distance of a two-day trip in Kuala Lumpur via Go KL city bus. In: Intelligent
Computing & Optimization, pp. 227–284 (2018)
2. Torres-Escobar, R., Marmolejo-Saucedo, J., Litvinchev, I., Vasant, P.: Monkey algorithm for
packing circles with binary variables. In: Intelligent Computing & Optimization, pp. 547–
559 (2018)
3. Yaskov, G., Romanova, T., Litvinchev, I., Shekhovtsov, S.: Optimal packing problems:
from knapsack problem to open dimension problem. In: Advances in Intelligent Systems and
Computing, pp. 671–678 (2019)
4. Marmolejo-Saucedo, J., Rodriguez-Aguilar, R.: A timetabling application for the assignment
of school classrooms. In: Advances in Intelligent Systems and Computing, pp. 1–10 (2019)

5. Chanta, S., Sangsawang, O.: A single allocation P-Hub maximal covering model for
optimizing railway station location. In: Intelligent Computing & Optimization, pp. 522–530
(2018)
6. Land, A.H., Doig, A.: An automatic method of solving discrete programming problems.
Econometrica 28, 497–520 (1960)
7. Gomory, R.E.: Outline of an algorithm for integer solutions to linear programs. Bull. Am.
Math. Soc. 64, 275–278 (1958)
8. Gomory, R.E.: Solving linear programming problems in integers. Proc. Symposia Appl.
Math. 10, 211–215 (1960)
9. Padberg, M., Rinaldi, G.: A branch-and-cut algorithm for the resolution of large-scale
symmetric traveling salesman problems. SIAM Rev. 33, 60–100 (1991)
10. Glover, F., Sherali, H.D.: Chvatal-gomory-tier cuts for general integer programs. Discrete
Optim. 2, 51–69 (2005)
11. Wesselmann, F., Koberstein, A., Suhl, U.: Pivot-and-reduce cuts: an approach for improving
gomory mixed-integer cuts. Eur. J. Oper. Res. 214, 15–26 (2011)
12. Dantzig, G.B.: Activity Analysis of Production and Allocation. Wiley, New York (1951)
13. Klee, V., Minty, G.: How good is the simplex algorithm? In: Inequalities III. Academic Press, New York (1972)
14. Paulraj, S., Sumathi, P.: A comparative study of redundant constraints identification methods
in linear programming problems. Math. Problems Eng. 2010, 1–16 (2010)
15. Gao, C., Yan, C., Zhang, Z., Hu, Y., Mahadevan, S., Deng, Y.: An amoeboid algorithm for
solving linear transportation problem. Physica A 398, 179–186 (2014)
16. Zhang, X., Zhang, Y., Hu, Y., Deng, Y., Mahadevan, S.: An adaptive amoeba algorithm for constrained shortest paths. Expert Syst. Appl. 40, 7607–7616 (2013)
17. Zionts, S.: The criss-cross method for solving linear programming problems. Manage. Sci.
15, 426–445 (1969)
18. Pan, P.Q.: Primal perturbation simplex algorithms for linear programming. J. Comput. Math.
18, 587–596 (2000)
19. Paparrizos, K., Samaras, N., Stephanides, G.: A new efficient primal dual simplex algorithm.
Comput. Oper. Res. 30, 1383–1399 (2003)
20. Nabli, H.: An overview on the simplex algorithm. Appl. Math. Comput. 210, 479–489
(2009)
21. Nabli, H., Chahdoura, S.: Algebraic simplex initialization combined with the nonfeasible
basis method. Eur. J. Oper. Res. 245, 384–391 (2015)
22. Sangngern, K., Boonperm, A.: A new initial basis for the simplex method combined with the
nonfeasible basis method. J. Phys. Conf. Ser. 1593, 012002 (2020)
23. Boonperm, A., Sinapiromsaran, K.: The artificial-free technique along the objective direction
for the simplex algorithm. J. Phys. Conf. Ser. 490, 012193 (2014)
24. Boonperm, A., Sinapiromsaran, K.: Artificial-free simplex algorithm based on the non-acute
constraint relaxation. Appl. Math. Comput. 234, 385–401 (2014)
A New Technique for Solving a 2-Dimensional
Linear Program by Considering the Coefficient
of Constraints

Panthira Jamrunroj and Aua-aree Boonperm(&)

Department of Mathematics and Statistics, Faculty of Science and Technology,


Thammasat University, Pathum Thani 12120, Thailand
panthira.jamr@dome.tu.ac.th,
aua-aree@mathstat.sci.tu.ac.th

Abstract. Popular methods for solving a 2-dimensional linear program are the graphical method and the simplex method. However, if the problem has many constraints, they take more time to solve it. In this paper, a new method for solving a 2-dimensional linear program by considering only the coefficients of the constraints is proposed. If the problem satisfies the following three conditions — the cost vector of the objective function is nonnegative, the right-hand-side values are nonnegative, and the decision variables are nonnegative — then the optimal solution can be obtained immediately whenever there exists a variable with a positive coefficient vector. Moreover, this technique can be applied to choose the entering variable for the simplex method.

Keywords: 2-dimensional linear program · Simplex method · Double pivot simplex method

1 Introduction

A linear program is an optimization technique with a linear objective function subject to linear equality or inequality constraints. The aim of this technique is to find a solution that satisfies the constraints and gives the minimum or maximum objective value. In real-world settings, linear programs are widely used for solving many industrial problems which require the best outcome, such as production planning problems, traveling salesman problems, assignment problems, transportation problems, etc. For solving a linear program, the graphical method is an easy method for a 2- or 3-dimensional problem. However, for a high-dimensional problem, it is not practical.
The first practical method for solving a linear program, namely the simplex method, was presented in 1947 by George Dantzig [1]. However, in 1972, Klee and Minty [2] presented the worst-case computational time of the simplex method. Since then, many researchers have presented efficient methods to address this problem. Some issues that researchers have studied are proposing new methods [3–10] and improving the simplex method by new pivot rules [11–21].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 276–286, 2021.
https://doi.org/10.1007/978-3-030-68154-8_27

In 1976, Shamos and Hoey [22] proposed an algorithm for solving a 2-dimensional linear program. The algorithm forms the intersection of geometric objects in the plane: it starts by considering n line segments in the plane and finding all intersecting pairs, then uses an O(n log n) algorithm to determine whether any two segments intersect, and uses this to decide whether two simple plane polygons intersect. They proved that a linear program with 2 variables and n constraints can be solved in O(n log n) time, while the simplex method requires O(n^2) time.
In 1983, Megiddo [23] introduced a linear-time algorithm to solve a linear program in R^2. This method searches for the smallest circle enclosing n given points in the plane. Moreover, it disproves a conjecture by Shamos and Hoey that their algorithm requires O(n log n) time. In addition, a problem closely related to the 2-dimensional linear program, called the separability problem in R^2, is constructed and solved in O(n) time. That is, this algorithm improves on the O(n log n) bound of Shamos and Hoey's paper.
In 1984, Dyer [24] also proposed an algorithm which requires O(n) time to solve a 2-dimensional linear program. The concept of this algorithm is the utilization of convexity, a well-known idea for creating linear-time algorithms. The algorithm starts at any feasible point, and the coefficient values of the objective function, c_1 and c_2, are considered. If c_1 = c_2 = 0, then this feasible point is optimal. Otherwise, the problem is transformed to provide a gradient direction of the objective function. After that, the optimal solution is obtained by performing the pairing algorithm. Each step of Dyer's algorithm requires O(n) time.
Recently, Vitor and Easton [25] proposed a new method, called the slope algorithm, to solve a 2-dimensional linear program that satisfies the following conditions: the coefficient values of the objective function are positive, the right-hand-side values of all constraints are positive, and both variables are nonnegative. This algorithm not only solves the 2-dimensional problem but can also be applied to identify the optimal basis for the simplex method. They applied the slope algorithm within the simplex framework for solving a general linear program. The simplex method is improved by exchanging two variables into the basis, identified by the slope algorithm; the resulting method is called the double pivot simplex method. In each iteration, it requires the optimal basis of a 2-dimensional linear program to indicate the leaving variables for the original problem, and the slope algorithm is used to identify such variables. After the slope algorithm is performed, the simplex tableau is updated corresponding to the new basis. These steps are repeated until the optimal solution to the original problem is obtained. From the computational results, this algorithm reduces the number of iterations of the simplex method.
Although the slope algorithm can identify the basis and reduce the number of iterations of the simplex method, it requires more computation in each iteration. So, in this paper, we propose a new method for solving a 2-dimensional linear program. It improves the double pivot simplex method by avoiding the computation of the slope algorithm in the case where there is only one pivot. We specifically consider the 2-dimensional linear program that satisfies the following conditions: the cost coefficient values are positive, the right-hand-side values are nonnegative, and both variables are nonnegative. This approach identifies the optimal solution by considering only the coefficient values of the constraints. Additionally, it can be applied to choose an effective pivot element for the simplex method. The details of the proposed approach are given in the next section.

2 The Proposed Approach

Consider the following special 2-dimensional linear program:

Maximize    z = c_1 x_1 + c_2 x_2
Subject to  a_{11} x_1 + a_{12} x_2 ≤ b_1
            ⋮                                        (1)
            a_{m1} x_1 + a_{m2} x_2 ≤ b_m
            x_1, x_2 ≥ 0,

where c_1, c_2 > 0 and b_i ≥ 0 for all i = 1, ..., m. Since the origin is feasible, problem (1) either has an optimal solution or is unbounded (see Fig. 1).

Fig. 1. The possible cases of a solution to the problem (1)

The motivation of our proposed method is to identify a solution to such a problem by considering only the coefficient values of the constraints. In the case a_{i2} ≤ 0 for all i = 1, ..., m or a_{i1} ≤ 0 for all i = 1, ..., m, we see that x_2 or x_1, respectively, can increase infinitely. This implies that the problem is unbounded (see Fig. 2). For the other cases, the following theorems identify the optimal solution when the coefficients of the constraints satisfy the stated conditions.

Fig. 2. The coefficient values a_{i2} ≤ 0 for all i = 1, ..., m

Theorem 1. Consider the 2-dimensional linear program (1) with a_{i2} > 0 for all i = 1, ..., m. Let

k = arg min { b_i / a_{i1} | a_{i1} > 0, i = 1, ..., m }  and  r = arg min { b_i / a_{i2} | a_{i2} > 0, i = 1, ..., m }.

i) If a_{k1} < a_{k2}(c_1/c_2), then x* = (x_1*, 0) is the optimal solution to problem (1), where x_1* = b_k / a_{k1}.
ii) If a_{r1} > a_{r2}(c_1/c_2), then x* = (0, x_2*) is the optimal solution to problem (1), where x_2* = b_r / a_{r2}.

Proof.
i) Suppose that a_{k1} < a_{k2}(c_1/c_2). Let x_1* = b_k / a_{k1}. We need to show that x* = (x_1*, 0) is the optimal solution by using the optimality conditions.
First, we show that (x_1*, 0) is a feasible solution of the primal problem. Since x_2* = 0, the ith component of Ax* is

a_{i1} x_1* + a_{i2} x_2* = a_{i1} x_1*,   i = 1, ..., m.

Since b_i ≥ 0 for all i = 1, ..., m and x_1* ≥ 0, if a_{i1} ≤ 0, then a_{i1} x_1* ≤ 0 ≤ b_i; and if a_{i1} > 0, then a_{i1} x_1* = a_{i1} (b_k / a_{k1}) ≤ a_{i1} (b_i / a_{i1}) = b_i by the definition of k. Therefore, (x_1*, 0) is a feasible solution.
Next, we will show that there exists a dual feasible solution. Consider the dual problem:

Minimize    b_1 w_1 + b_2 w_2 + ... + b_m w_m = b^T w
Subject to  a_{11} w_1 + a_{21} w_2 + ... + a_{k1} w_k + ... + a_{m1} w_m ≥ c_1
            a_{12} w_1 + a_{22} w_2 + ... + a_{k2} w_k + ... + a_{m2} w_m ≥ c_2
            w_1, ..., w_m ≥ 0.

Choose w* = [ 0 ⋯ c_1/a_{k1} ⋯ 0 ]^T, whose only nonzero entry is in position k. Since a_{k1} > 0 and c_1 > 0, we have c_1/a_{k1} > 0.

Consider the following system:

a_{11} w_1* + a_{21} w_2* + ... + a_{k1} w_k* + ... + a_{m1} w_m* = c_1
a_{12} w_1* + a_{22} w_2* + ... + a_{k2} w_k* + ... + a_{m2} w_m* = a_{k2}(c_1/a_{k1}).

Since a_{k1} < a_{k2}(c_1/c_2), we get c_2 < a_{k2}(c_1/a_{k1}). Hence, w* = [ 0 ⋯ c_1/a_{k1} ⋯ 0 ]^T is a dual feasible solution.
Finally, consider the objective values of the primal and the dual problems. We get c^T x* = c_1(b_k/a_{k1}) and b^T w* = b_k(c_1/a_{k1}).
Since x* and w* are a primal feasible solution and a dual feasible solution, respectively, and c^T x* = b^T w*, x* is the optimal solution to problem (1).
ii) Suppose that a_{r1} > a_{r2}(c_1/c_2). Let x_2* = b_r / a_{r2}. We need to show that x* = (0, x_2*) is the optimal solution by using the optimality conditions. First, we will show that (0, x_2*) is a primal feasible solution. Since x_1* = 0, the ith component of Ax* is

a_{i1} x_1* + a_{i2} x_2* = a_{i2} x_2*,   i = 1, ..., m.

Since b_i ≥ 0 for all i = 1, ..., m and x_2* ≥ 0, if a_{i2} ≤ 0, then a_{i2} x_2* ≤ 0 ≤ b_i; and if a_{i2} > 0, then a_{i2} x_2* = a_{i2} (b_r / a_{r2}) ≤ a_{i2} (b_i / a_{i2}) = b_i by the definition of r. Therefore, the primal solution is feasible.
Next, we will show that there exists a dual feasible solution. Consider the dual problem:

Minimize    b_1 w_1 + b_2 w_2 + ... + b_m w_m = b^T w
Subject to  a_{11} w_1 + a_{21} w_2 + ... + a_{r1} w_r + ... + a_{m1} w_m ≥ c_1
            a_{12} w_1 + a_{22} w_2 + ... + a_{r2} w_r + ... + a_{m2} w_m ≥ c_2
            w_1, ..., w_m ≥ 0.

Choose w* = [ 0 ⋯ c_2/a_{r2} ⋯ 0 ]^T, whose only nonzero entry is in position r. Since a_{r2} > 0 and c_2 > 0, we have c_2/a_{r2} > 0. Consider the following system:

a_{11} w_1* + a_{21} w_2* + ... + a_{r1} w_r* + ... + a_{m1} w_m* = a_{r1}(c_2/a_{r2})
a_{12} w_1* + a_{22} w_2* + ... + a_{r2} w_r* + ... + a_{m2} w_m* = c_2.

Since a_{r1} > a_{r2}(c_1/c_2), we get c_1 < a_{r1}(c_2/a_{r2}). Hence, w* = [ 0 ⋯ c_2/a_{r2} ⋯ 0 ]^T is a dual feasible solution.
Finally, we will prove that the objective values of the primal and dual problems are equal. Consider c^T x* = c_2(b_r/a_{r2}) and b^T w* = b_r(c_2/a_{r2}).

Since x* and w* are a primal feasible solution and a dual feasible solution, respectively, and c^T x* = b^T w*, x* is the optimal solution to problem (1).
Theorem 2. Consider the 2-dimensional linear program (1) with a_{i1} > 0 for all i = 1, ..., m. Let

k = arg min { b_i / a_{i1} | a_{i1} > 0, i = 1, ..., m }  and  r = arg min { b_i / a_{i2} | a_{i2} > 0, i = 1, ..., m }.

i) If a_{r2} < a_{r1}(c_2/c_1), then x* = (0, x_2*) is the optimal solution to problem (1), where x_2* = b_r / a_{r2}.
ii) If a_{k2} > a_{k1}(c_2/c_1), then x* = (x_1*, 0) is the optimal solution to problem (1), where x_1* = b_k / a_{k1}.

Proof. The proof is similar to that of Theorem 1.


From Theorem 1 and Theorem 2, the solution to the problem (1) is obtained by
considering only the coefficient values of constraints. Therefore, the 2-dimensional
linear program can be solved easily when it satisfies the conditions of these theorems.
We can see that this technique can reduce the computation to solve a 2-dimensional
problem. The examples of the proposed approach are shown in the next section.
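The rule of Theorems 1 and 2 can be written down directly. The following NumPy sketch is an illustration under the stated assumptions (c_1, c_2 > 0 and b ≥ 0), not the authors' code; it returns None when neither theorem applies.

import numpy as np

def solve_2d_lp(c, A, b):
    # c: length-2 array, A: (m, 2) array, b: length-m array.
    c1, c2 = c
    a1, a2 = A[:, 0], A[:, 1]
    if np.all(a1 <= 0) or np.all(a2 <= 0):
        return 'unbounded'
    idx1, idx2 = np.where(a1 > 0)[0], np.where(a2 > 0)[0]
    k = idx1[np.argmin(b[idx1] / a1[idx1])]
    r = idx2[np.argmin(b[idx2] / a2[idx2])]
    if np.all(a2 > 0):                        # Theorem 1
        if a1[k] < a2[k] * c1 / c2:
            return (b[k] / a1[k], 0.0)
        if a1[r] > a2[r] * c1 / c2:
            return (0.0, b[r] / a2[r])
    if np.all(a1 > 0):                        # Theorem 2
        if a2[r] < a1[r] * c2 / c1:
            return (0.0, b[r] / a2[r])
        if a2[k] > a1[k] * c2 / c1:
            return (b[k] / a1[k], 0.0)
    return None                               # conditions not met

On the data of Example 1 in the next section, this sketch returns (11, 0), matching Theorem 1.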

3 The Illustrative Examples

In this section, we give two examples of solving 2-dimensional linear problems and one example of applying the technique within the simplex method. The first and second examples show that the 2-dimensional linear program gives the optimal solution. In the third example, we apply the proposed approach to a general linear program for selecting an entering variable.
Example 1. Consider the following 2-dimensional linear program:
Maximize z = 3x1 + 4 x2
Subject to −3 x1 + 5 x2 ≤ 15
−2 x1 + x2 ≤ 22
− x1 + 4 x2 ≤ 16                    (2)
x1 + 7 x2 ≤ 30
x1 + 2 x2 ≤ 11
x1 , x2 ≥ 0 .

Since c_1 = 3, c_2 = 4 > 0 and b_i ≥ 0 for i = 1, ..., m, we compute

k = arg min { b_i / a_{i1} | a_{i1} > 0, i = 1, ..., m } = arg min { b_4/a_{41} = 30, b_5/a_{51} = 11 } = 5.

Since a_{i2} > 0 for i = 1, ..., m and a_{k1} = 1 < a_{k2}(c_1/c_2) = 2(3/4) = 3/2, the optimal solution is (x_1*, x_2*) = (b_k/a_{k1}, 0) = (11, 0) with z* = 33.

Solving this problem by the simplex method requires 4 iterations, while our method identifies the optimal solution directly from the coefficients.
Example 2. Consider the following 2-dimensional linear program:

Maximize z = 3x1 + 2 x2
Subject to 5 x1 + x2 ≤ 5
− x1 + x2 ≤ 5
(3)
2 x1 + x2 ≤ 4
5 x1 + 4 x2 ≤ 25
x1 , x2 ≥ 0.

Since c_1 = 3, c_2 = 2 > 0 and b_i ≥ 0 for i = 1, ..., m, we compute

r = arg min { b_1/a_{12} = 5, b_2/a_{22} = 5, b_3/a_{32} = 4, b_4/a_{42} = 25/4 } = 3.

Since a_{i2} > 0 for i = 1, ..., m and a_{r1} = 2 > a_{r2}(c_1/c_2) = 3/2, the optimal solution is (x_1*, x_2*) = (0, b_r/a_{r2}) = (0, 4) with z* = 8.
In the next example, we apply our algorithm to a general linear program for identifying an entering variable.
Example 3. Consider the following linear program:

Maximize z = 5 x1 + 3x2 + x3 + 2 x4
Subject to 2 x1 + x2 + 2 x3 + x4 ≤ 10
7 x1 + 4 x2 + 2 x4 ≤ 12
6 x1 + 3x2 + 3x3 ≤ 15
(4)
3x1 + x2 + x3 + 3x4 ≤ 8
9 x1 + 2 x2 + 4 x3 + x4 ≤ 9
3x1 + x2 + 2 x4 ≤ 5
x1 , x2 , x3 , x4 ≥ 0.

First, construct the initial tableau by adding the slack variables, which are set to form the initial basis. The initial tableau is given below.

x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 RHS
z −5 −3 −1 −2 0 0 0 0 0 0 0
x5 2 1 2 1 1 0 0 0 0 0 10
x6 7 4 0 2 0 1 0 0 0 0 12
x7 6 3 3 0 0 0 1 0 0 0 15
x8 3 1 1 3 0 0 0 1 0 0 8
x9 9 2 4 1 0 0 0 0 1 0 9
x10 3 1 0 2 0 0 0 0 0 1 5

Next, consider the coefficients of x_1 and x_2. The 2-dimensional linear program associated with x_1 and x_2 can be constructed as follows:
Maximize z = 5 x1 + 3 x2
Subject to 2 x1 + x2 ≤ 10
7 x1 + 4 x2 ≤ 12
6 x1 + 3 x2 ≤ 15                    (5)
3 x1 + x2 ≤ 8
9 x1 + 2 x2 ≤ 9
3 x1 + x2 ≤ 5
x1 , x2 ≥ 0.

After using the proposed method, we get x* = (0, 3) as the optimal solution to (5). So, x_2 is selected to be the entering variable. The variable x_6 is chosen to be the leaving variable because its corresponding constraint gives the minimum value of b_i/a_{i2} over all i = 1, ..., m. The simplex tableau is updated as the following tableau.

x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 RHS
z 0.25 0 −1 −0.5 0 0.75 0 0 0 0 9
x5 0.25 0 2 0.5 1 −0.25 0 0 0 0 7
x2 1.75 1 0 0.5 0 0.25 0 0 0 0 3
x7 0.75 0 3 −1.5 0 −0.75 1 0 0 0 6
x8 2.25 0 1 2.5 0 −0.25 0 1 0 0 5
x9 5.5 0 4 0 0 −0.5 0 0 1 0 3
x10 1.25 0 0 1.5 0 −0.25 0 0 0 1 2

From the above tableau, x_3 and x_4 are considered. Since neither variable has a positive coefficient vector, the proposed method cannot determine the entering and leaving variables. We therefore fall back on the slope algorithm, which selects x_3, x_4 as entering variables and x_9, x_{10} as leaving variables. The updated simplex tableau is as follows:

x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 RHS
z 2.042 0 0 0 0 0.542 0 0 0.25 0.333 10.417
x5 −2.917 0 0 0 1 0.083 0 0 −0.5 −0.333 4.833
x2 1.333 1 0 0 0 0.333 0 0 0 −0.333 2.333
x7 −2.125 0 0 0 0 −0.625 1 0 −0.75 1 5.75
x8 −1.208 0 0 0 0 0.292 0 1 −0.25 −1.667 0.917
x3 1.375 0 1 0 0 −0.125 0 0 0.25 0 0.75
x4 0.833 0 0 1 0 −0.167 0 0 0 0.667 1.333

Since all reduced costs are nonnegative, the optimal solution is obtained at this tableau: (x_1*, x_2*, x_3*, x_4*) = (0, 2.333, 0.75, 1.333) with z* = 10.417.

In Example 3, at the first iteration, the proposed method selects the second most negative reduced cost for the entering variable, while the simplex method chooses the most negative reduced cost. At the second iteration, the proposed method performed a double pivot while the simplex method performed only one pivot. In summary, we reduce the number of iterations in this case: the proposed method performed only 2 iterations to solve the problem in Example 3, while the simplex method performed 4 iterations. In addition, at the first iteration, we reduce computation because the entering variable was identified without using the slope algorithm.

4 The Computational Results

In this section, we report the computational results, comparing the number of iterations between the simplex method with the proposed technique and the original simplex method, by solving the following canonical linear program:

Maximize    z = c^T x
Subject to  Ax ≤ b                    (6)
            x ≥ 0.

The tested problems have six different sizes of the coefficient matrix: m × n = 6 × 4, 6 × 8, 6 × 10, 8 × 6, 8 × 10 and 8 × 12. In addition, most of the tested problems are generated so that, at the first iteration, the second most negative reduced cost must be selected for the entering variable. The computational results are shown in Table 1. We can see that the average number of iterations of the simplex method with the proposed technique is less than that of the original simplex method. So, we can reduce the number of iterations when a problem is solved by the simplex method with the proposed technique.

Table 1. The computational results of the tested problems.

Sizes of    The average number of iterations
problems    The original simplex method    The simplex method with the proposed technique
6 × 4       3.4                            1.8
6 × 8       5.4                            2.2
6 × 10      3.4                            1.9
8 × 6       3.0                            1.6
8 × 10      3.2                            1.5
8 × 12      3.6                            1.7
Average     3.667                          1.783

5 Conclusions

In this paper, a new method for solving a 2-dimensional linear program by considering only the coefficients of the constraints is proposed. If there exists a variable with a positive coefficient vector, then the optimal solution can be identified immediately. Moreover, this technique can fix some cases of the double pivot simplex method, namely the case of having only one entering variable. In this case, the double pivot simplex method performs the slope algorithm to solve the relaxed 2-dimensional linear program for identifying an entering variable, which requires more computation, while the proposed method considers only the constraint coefficients for specifying an entering variable. For this reason, we can reduce computation in this case. But if the relaxed 2-dimensional linear program does not satisfy the conditions, it must still be solved by the slope algorithm. At present, we can fix only the case of one entering variable for the double pivot simplex method. In future work, we aim to find a method that can fix the case of two entering variables.

References
1. Dantzig, G.B.: Linear Programming and Extensions. RAND Corporation, Santa Monica
(1963)
2. Klee, V., Minty, G.: How good is the simplex algorithm? In: Inequalities III. Academic Press, New York (1972)
3. Terlaky, T.: A finite crisscross method for oriented matroids. J. Comb. Theory B 42(3), 319–
327 (1987)
4. Pan, P.: A new perturbation simplex algorithm for linear programming. J. Comput. Math. 17
(3), 233–242 (1999)
5. Pan, P.: Primal perturbation simplex algorithms for linear programming. J. Comput. Math.
18(6), 587–596 (2000)
6. Elhallaoui, I., Villeneuve, D., Soumis, F., Desaulniers, G.: Dynamic aggregation of set-
partitioning constraints in column generation. Oper. Res. 53(4), 632–645 (2005)
7. Elhallaoui, I., Desaulniers, G., Metrane, A., Soumis, F.: Bi-dynamic constraint aggregation
and subproblem reduction. Comput. Oper. Res. 35(5), 1713–1724 (2008)
8. Elhallaoui, I., Metrane, A., Soumis, F., Desaulniers, G.: Multi-phase dynamic constraint
aggregation for set partitioning type problems. Math. Program. 123(2), 345–370 (2010)
9. Elhallaoui, I., Metrane, A., Desaulniers, G., Soumis, F.: An improved primal simplex
algorithm for degenerate linear programs. INFORMS J. Comput. 23(4), 569–577 (2010)
10. Raymond, V., Soumis, F., Orban, D.: A new version of the improved primal simplex for
degenerate linear programs. Comput. Oper. Res. 37(1), 91–98 (2010)
11. Jeroslow, R.G.: The simplex algorithm with the pivot rule of maximizing criterion
improvement. Discr. Math. 4, 367–377 (1973)
12. Bland, R.G.: New finite pivoting rules for the simplex method. Math. Oper. Res. 2(2), 103–
107 (1977)
13. Pan, P.: Practical finite pivoting rules for the simplex method. OR Spectrum 12, 219–225
(1990)
14. Bixby, R.E.: Solving real-world linear programs: a decade and more of progress. Oper. Res.
50(1), 3–15 (2002)

15. Pan, P.: A largest-distance pivot rule for the simplex algorithm. Eur. J. Oper. Res. 187(2),
393–402 (2008)
16. Pan, P.: A fast simplex algorithm for linear programming. J. Comput. Math. 28(6), 837–847
(2010)
17. Csizmadia, Z., Illés, T., Nagy, A.: The s-monotone index selection rules for pivot algorithms
of linear programming. Eur. J. Oper. Res. 221(3), 491–500 (2012)
18. Liao, Y.: The improvement on R. G. Blands method. In: Qi, E., Shen, J., Dou, R. (eds.) The
19th International Conference on Industrial Engineering and Engineering Management,
Changsha, pp. 799–803 (2013)
19. Pan, P.: Linear Programming Computation, 1st edn. Springer, Heidelberg (2014)
20. Etoa, J.: New optimal pivot rule for the simplex algorithm. Adv. Pure Math. 6, 647–658
(2016)
21. Ploskas, N., Samaras, N.: Linear Programming Using MATLAB®, volume 127 of Springer
Optimization and Its Applications, 1st edn. Springer, Cham (2017)
22. Shamos, M., Hoey, D.: Geometric intersection problems. In: 17th Annual Symposium on Foundations of Computer Science (SFCS 1976), pp. 208–215 (1976)
23. Megiddo, N.: Linear-time algorithms for linear programming in R3 and related problems.
SIAM J. Comput. 12(4), 759–776 (1983)
24. Dyer, M.: Linear time algorithms for two- and three-variable linear programs. SIAM J. Com-
put. 13(1), 31–45 (1984)
25. Vitor, F., Easton, T.: The double pivot simplex method. Math. Meth. Oper. Res. 87, 109–137
(2018)
A New Integer Programming Model
for Solving a School Bus Routing Problem
with the Student Assignment

Anthika Lekburapa, Aua-aree Boonperm(&),


and Wutiphol Sintunavarat

Department of Mathematics and Statistics, Faculty of Science and Technology,


Thammasat University, Pathum Thani 12120, Thailand
Anthika.lekburapa@gmail.com,
{aua-aree,wutiphol}@mathstat.sci.tu.ac.th

Abstract. The aim of this research is to present an integer linear programming model for solving the school bus routing problem, concerning the travel distance of all buses and the assignment of students to bus stops, from two points of view: the first model favours students' satisfaction over the total travel distance of the buses, and the second model does the opposite.

Keywords: Capacity constraint · Integer linear programming · School bus routing problem

1 Introduction

One of the four elements of the marketing mix is distribution, which is the process of selling and delivering products and services from a manufacturer to customers. Distribution is an essential part of a manufacturer's operations, since poor distribution leads to a loss of trust of customers, retailers and suppliers. Therefore, it is important to improve product distribution to ensure that customers are pleased. There are many ways to improve the efficiency of distribution. One of them is finding a route for distributing products that maximizes the satisfaction of the manufacturer and the customers. For instance, in 2019, Mpeta et al. [1] introduced a binary programming model for solving a municipal solid waste collection problem; the obtained solution is an optimal route that reduces waste collection time and costs.
The vehicle routing problem (VRP) is the problem of finding routes to distribute products from a central depot to customers at different locations and return to the depot. The VRP is commonly investigated by companies for product distribution. Many factors shape the route in the VRP, such as the capacity of vehicles, the number of vehicles, the driver cost, the travel distance, etc. The VRP was first studied in 1959 by Dantzig et al. [2]. The purpose of their work was to find the shortest route for delivering gasoline between a bulk terminal and many stations, according to the demand at each station.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 287–296, 2021.
https://doi.org/10.1007/978-3-030-68154-8_28

The VRP was extended to a wider problem named the open vehicle routing problem (OVRP). The OVRP relaxes the VRP condition that the vehicles must return to the central depot, with the aim of modelling business practice more precisely than the VRP with closed routes returning to the depot. There are several algorithms for solving the OVRP. For instance, Tarantilis et al. [3] presented a threshold-accepting algorithm called list-based threshold accepting (LBTA) to solve the OVRP on artificial cases with 50–199 depots.
The school bus routing problem (SBRP) is a type of OVRP designed for planning routes to pick up students from bus stops and bring them to a school, or from a school back to bus stops. The SBRP was first considered by Newton and Thomas [4], who solved it by generating routes using a computer; their case study was in Indiana with 1500 students. Afterward, many researchers improved models for solving the SBRP. For instance, in 1997, Braca et al. [5] minimized the number of buses under many constraints involving the capacity of buses, riding time, school time windows, the walking distance of students from home to bus stops, and the number of students. They studied a case in New York with 838 bus stops and 73 schools and, more importantly, solved the problems of every school in one state together. In 2012, Jalel and Rafaa [6] developed a hybrid evolutionary computation based on an artificial ant colony with a variable neighborhood local search algorithm for the Tunisian school bus routing problem. In 2013, Kim et al. [7] formulated a mixed-integer programming problem for the SBRP and developed a heuristic algorithm based on harmony search to solve it. The solution obtained from the developed heuristic algorithm was compared with the exact solution from CPLEX; both methods give the same solution, but the developed heuristic method computes it faster. In 2016, Hashi et al. [8] used Clarke and Wright's algorithm for solving the school bus routing and scheduling problem with time windows, tested on the Scholastica School in Dhaka, Bangladesh. Different methods for solving the SBRP can be found in Park and Kim [9], Ellegood et al. [10], Bodin and Berman [11], Schittekat et al. [12], Eldrandaly and Abdallah [13] and the references therein.
We now discuss the research on the SBRP by Bektas et al. [14], which is the main motivation for this paper. They presented a model to minimize the number of buses under capacity and distance constraints; both constraints also prevent sub-tours. They used this model for solving the SBRP on a real case in Turkey. In their model, the SBRP is solved under the assumption that the number of students at each bus stop is known; the satisfaction of the students with their assigned bus stops is not considered. This led to the inspiration for investigating and improving the process for solving the SBRP in this respect.
In this paper, we are interested in solving the SBRP with capacity and distance constraints in view of the above-mentioned inspiration. The distance and capacity constraints in our model are revised from the model in [14] to handle the satisfaction of all students together with the total distance of all buses. In addition, we provide a comparison of the results of our models in two situations based on two distinct objectives: maximizing the total satisfaction of students and minimizing the total travel distance of all buses.

2 Problem Description and the Proposed Model

In this paper, the SBRP with student assignment depending on the satisfaction score of students at each bus stop is investigated. In this problem, students are assigned to bus stops under two situations: maximizing the total score of all students and minimizing the total travel distance of all buses. Each bus starts from some bus stop and picks up students on the way to school before school starts, and it returns along the same route for dropping off students. The number of students on each bus must not exceed the capacity of the bus. Each bus serves a single path, and the total distance is limited by a specified bound. An example of a school, bus stops and the home of each student is illustrated in Fig. 1.

Fig. 1. An example of the node of school, bus stops and home of students

For the proposed model, the sets, parameters and decision variables are as follows:

Sets:
I    is the set of all bus stops.
V    is the set of all bus stops and the school, that is, V = I ∪ {0}, where 0 is the school node.
V′   = V ∪ {d}, where d is a dummy node which is the first node of each path.
L    is the set of all students.

Parameters:
d_ij is the distance between nodes i and j for all i, j ∈ I; for the dummy node, d_dj = 0 if j ∈ I and d_d0 = M, where M is a large number.
s_il is the satisfaction score of student l waiting for a school bus at bus stop i.
k    is the number of vehicles.
Q    is the capacity of a school bus.
T    is the limit on the total length of each path.

Decision variables:
x_ij = 1 if arc (i, j) is traveled by a bus, and 0 otherwise;
y_il = 1 if student l is picked up at bus stop i, and 0 otherwise.

Before developing the capacity constraints, we recall their formulation by Bektas et al. [14]:

u_i − u_j + Q x_ij + (Q − q_i − q_j) x_ji ≤ Q − q_j    ∀ i, j ∈ I with i ≠ j
u_i ≥ q_i                                              ∀ i ∈ I
u_i − q_i x_0i + Q x_0i ≤ Q                            ∀ i ∈ I,

where u_i is the total number of students on the bus after pick-up at bus stop i, and q_i is the number of students waiting at bus stop i. In our problem, the number of students at each bus stop is unknown, because it results from the assignment chosen by the model. So, the quantity q_i is replaced by Σ_{l∈L} y_il. We also change the depot index to the dummy node d to make the formulation easier to follow. The constraints become:

u_i − u_j + Q x_ij + (Q − Σ_{l∈L} y_il − Σ_{l∈L} y_jl) x_ji ≤ Q − Σ_{l∈L} y_jl    ∀ i, j ∈ I with i ≠ j
u_i ≥ Σ_{l∈L} y_il                                                                ∀ i ∈ I
u_i − (Σ_{l∈L} y_il) x_di + Q x_di ≤ Q                                            ∀ i ∈ I.

We can see that the first and third constraints are nonlinear. The model is therefore linearized to make it easier to solve. To perform the linearization, we introduce new decision variables z_ji = (Q − Σ_{l∈L} y_il − Σ_{l∈L} y_jl) x_ji and p_i = (Σ_{l∈L} y_il) x_di. Then, the capacity constraints can be rewritten as follows (a modelling sketch follows the constraint list):
u_i − u_j + Q x_ij + z_ji ≤ Q − Σ_{l∈L} y_jl               ∀ i, j ∈ I with i ≠ j
z_ji ≤ Q x_ji                                              ∀ i, j ∈ I with i ≠ j
z_ji ≤ Q − Σ_{l∈L} y_il − Σ_{l∈L} y_jl                     ∀ i, j ∈ I with i ≠ j
Q − Σ_{l∈L} y_il − Σ_{l∈L} y_jl − (1 − x_ji) Q ≤ z_ji      ∀ i, j ∈ I with i ≠ j
u_i ≥ Σ_{l∈L} y_il                                         ∀ i ∈ I
u_i − p_i + Q x_di ≤ Q                                     ∀ i ∈ I
p_i ≤ Q x_di                                               ∀ i ∈ I
p_i ≤ Σ_{l∈L} y_il                                         ∀ i ∈ I
Σ_{l∈L} y_il − (1 − x_di) Q ≤ p_i                          ∀ i ∈ I.
Next, we present the model for maximizing the total student satisfaction with the improved constraints as follows:

Maximize  Σ_{i∈I} Σ_{l∈L} s_il y_il                                              (1)
s.t.      Σ_{i∈I} x_i0 ≤ k                                                       (2)
          Σ_{i∈I} x_di ≤ k                                                       (3)
          Σ_{i∈I} y_il = 1                             ∀ l ∈ L                   (4)
          Σ_{j∈I∪{0}} x_ij = 1                         ∀ i ∈ I                   (5)
          Σ_{i∈I∪{d}} x_ij = 1                         ∀ j ∈ I                   (6)
          u_i − u_j + Q x_ij + z_ji ≤ Q − Σ_{l∈L} y_jl ∀ i, j ∈ I with i ≠ j     (7)
          z_ji ≤ Q x_ji                                ∀ i, j ∈ I with i ≠ j     (8)
          z_ji ≤ Q − Σ_{l∈L} y_il − Σ_{l∈L} y_jl       ∀ i, j ∈ I with i ≠ j     (9)
          Q − Σ_{l∈L} y_il − Σ_{l∈L} y_jl − (1 − x_ji) Q ≤ z_ji
                                                       ∀ i, j ∈ I with i ≠ j     (10)
          u_i ≥ Σ_{l∈L} y_il                           ∀ i ∈ I                   (11)
          u_i − p_i + Q x_di ≤ Q                       ∀ i ∈ I                   (12)
          p_i ≤ Q x_di                                 ∀ i ∈ I                   (13)
          p_i ≤ Σ_{l∈L} y_il                           ∀ i ∈ I                   (14)
          Σ_{l∈L} y_il − (1 − x_di) Q ≤ p_i            ∀ i ∈ I                   (15)
          v_i − v_j + (T − d_di − d_j0 + d_ij) x_ij + (T − d_di − d_j0 − d_ji) x_ji
              ≤ T − (d_di + d_j0)(x_ij + x_ji)         ∀ i, j ∈ I with i ≠ j     (16)
          v_i − d_i0 x_i0 ≥ 0                          ∀ i ∈ I                   (17)
          v_i − d_i0 x_i0 + T x_i0 ≤ T                 ∀ i ∈ I                   (18)
          x_ij ∈ {0, 1}                                ∀ i, j ∈ V′               (19)
          y_il ∈ {0, 1}                                ∀ i ∈ I, l ∈ L            (20)
          z_ji ≥ 0                                     ∀ i, j ∈ I                (21)
          p_i ≥ 0                                      ∀ i ∈ I                   (22)

The objective function (1) maximizes the total student satisfaction. Constraints (2) and (3) ensure that the number of vehicles used does not exceed the number of vehicles available. Constraint (4) ensures that each student boards at exactly one bus stop. Constraints (5) and (6) ensure that a vehicle visiting bus stop i must also leave bus stop i. Constraints (7)–(15) prevent sub-tours on each path by means of the number of students and the capacity of the buses. Constraints (16)–(18) ensure that the cumulative distance at a bus stop i, denoted by v_i, does not exceed the limit T. Constraints (19)–(20) state that the decision variables are binary. Constraints (21)–(22) state that the remaining decision variables are nonnegative.
In addition, we consider another model by changing the objective function from (1) to a different view of the problem, namely minimizing the total travel distance of all buses. In this case, we solve the model with the objective function

Minimize  Σ_{i∈V′} Σ_{j∈V′} d_ij x_ij                                            (23)

subject to the same constraints (2)–(22).

3 Experimental Results

In this section, an experimental result is given to show the use of our models. The experiment solves the SBRP for a kindergarten with 70 students using the school bus service. Each school bus has 13 seats, and the kindergarten has 8 school buses. There are 12 bus stops in our experiment. We solve the experiment in the two situations using CPLEX.
Results for the first and second objective functions are presented in Table 1 and Table 2, respectively. In these tables, the first column is the bus number. The second column is the path of each bus for picking up students. The number of students picked up by each bus is shown in the third column, and the satisfaction of the students (%) waiting for the bus at their stops is shown in the fourth column. The final column presents the total travel distance of each bus for picking up students along the path to school.

Table 1. Results from the proposed model with the objective function (1).
Bus Bus route Number of students Total satisfaction score (%) Total distance (km.)
1 d-3-5-0 11 100 20.5
2 d-4-2-0 12 100 37.1
3 d-6-9-0 11 95.46 14
4 d-7-8-0 11 97.23 14.3
5 d-11-1-0 12 99.17 34.3
6 d-12-10-0 13 100 13.7
Total                                98.71                          133.9

Table 2. Results from the proposed model with the objective function (23).
Bus Bus route Number of students Total satisfaction score (%) Total distance (km.)
1 1-2-0 12 73.33 22.3
2 3-8-0 12 84.17 15
3 5-10-0 10 90 13.35
4 7-4-0 12 76.67 8.9
5 9-6-0 11 83.64 11
6 12-11-0 13 78.46 12.5
Total                                80.71                          83.05

In Table 1, the total distance of the buses is 133.9 km and the optimal total satisfaction score is 691, which corresponds to 98.71%. This implies that the students are extremely satisfied with the school bus service. Table 2 shows the solution for the second objective function: the total distance of the buses is 83.05 km, and the total satisfaction score is 565. A comparison of the satisfaction scores and the total distances of the two objective functions is given in Table 3. The two objective functions give different solutions; which one to use depends on the purpose of the user.

Table 3. Comparison of the first and the second models.


Total Satisfaction score (%) Total distance (km.)
The first proposed model 98.71 133.9
The second proposed model 80.71 83.05

Figure 2 shows an example route from the first model. Each student travels to the designated bus stop; the walking distance of each student varies with the satisfaction score the student provides. The path of the school bus is quite complicated because we do not take the shortest route into account; we only require that the length of each route does not exceed the specified limit.

Fig. 2. An example route from the first model.

Figure 3 shows an example route from the second model. The satisfaction of the students with their bus stops is 80.71%. The shortest total distance is found, which reduces travel time and expenses. Consequently, the school saves money, and the school bus price for students can be reduced. As traveling time is reduced, students have more time for homework or family activities.

Fig. 3. An example route from the second model.

4 Conclusion

In this paper, we presented two integer linear programming models formulated to solve the SBRP with capacity and distance constraints. In these models, students are assigned to bus stops based on the choice of objective function: the first model maximizes the total satisfaction score of the students, and the second model minimizes the total travel distance of the buses. The choice of model depends on the needs of the user. That is, if the user has the students' needs in mind, the first objective function should be chosen; if the user is concerned with the distance, time or cost of travel, then the second objective function should be chosen.

Acknowledgement. This work was supported by Thammasat University Research Unit in Fixed
Points and Optimization.

References
1. Mpeta, K.N., Munapo, E., Ngwato, M.: Route optimization for residential solid waste
collection: Mmabatho case study. In: Intelligent Computing and Optimization 2019.
Advances in Intelligent Systems and Computing, vol. 1072, pp. 506–520 (2020)
2. Dantzig, G.B., Ramser, J.H.: The truck dispatching problem. Manage. Sci. 6(1), 80–91 (1959)
3. Tarantilis, C.D., Ioannou, G., Kiranoudis, C.T., Prastacos, G.P.: Solving the open vehicle
routing problem via a single parameter metaheuristic algorithm. J. Oper. Res. Soc. 56(5),
588–596 (2005)
4. Newton, R.M., Thomas, W.H.: Design of school bus routes by computer. Socio-Econ. Plan.
Sci. 3(1), 75–85 (1969)
5. Braca, J., Bramel, J., Posner, B., Simchi-Levi, D.: A computerized approach to the New
York City school bus routing problem. IIE Trans. 29, 693–702 (1997)
6. Jalel, E., Rafaa, M.: The urban bus routing problem in the Tunisian case by the hybrid
artificial ant colony algorithm. Swarm Evol. Comput. 2, 15–24 (2012)
7. Kim, T., Park, B.J.: Model and algorithm for solving school bus problem. J. Emerg. Trends
Comput. Inf. Sci. 4(8), 596–600 (2013)
8. Hashi, E.K., Hasan, M.R., Zaman, M.S.: GIS based heuristic solution of the vehicle routing
problem to optimize the school bus routing and scheduling. In: Computer and Information
Technology (ICCIT), 2016 19th International Conference, pp. 56–60. IEEE (2016)
9. Park, J., Kim, B.I.: The school bus routing problem: a review. Eur. J. Oper. Res. 202, 311–
319 (2010)

10. Ellegood, W.A., Solomon, S., North, J., Campbell, J.F.: School bus routing problem:
contemporary trends and research directions. Omega 95, 102056 (2020)
11. Bodin, L.D., Berman, L.: Routing and scheduling of school buses by computer. Transp. Sci.
13(2), 113–129 (1979)
12. Schittekat, P., Sevaux, M., Sorensen, K.: A mathematical formulation for a school bus
routing problem. In: Proceedings of the IEEE 2006 International Conference on Service
Systems and Service Management, Troyes, France (2006)
13. Eldrandaly, K.A., Abdallah, A.F.: A novel GIS-based decision-making framework for the
school bus routing problem. Geo-spat. Inform. Sci. 15(1), 51–59 (2012)
14. Bektas, T., Elmastas, S.: Solving school bus routing problems through integer programming.
J. Oper. Res. Soc. 58(12), 1599–1604 (2007)
Distributed Optimisation of Perfect Preventive
Maintenance and Component Replacement
Schedules Using SPEA2

Anthony O. Ikechukwu1(&), Shawulu H. Nggada2,


and José G. Quenum1
1
Namibia University of Science and Technology, Windhoek, Namibia
ikechukwu_anthony@yahoo.com, jquenum@nust.na
2
Higher Colleges of Technologies, Ras Al Khaimah Women’s Campus,
Ras Al Khaimah, United Arab Emirates
snggada@hct.ac.ae

Abstract. The upsurge of technological opportunities has brought about the speedy growth of industrial processes and machinery. The increased size and complexity of these systems, together with high dependence on them, has made it necessary to intensify maintenance processes, requiring a more effective maintenance scheduling approach to minimise the number of failure occurrences. This can be realised through well-articulated perfect preventive maintenance with component replacement (PPMR) schedules from the infancy stage through to the completion stage. Using the Strength Pareto Evolutionary Algorithm 2 (SPEA2), we devise a multi-objective optimisation approach that uses dependability and cost as objective functions. Due to the large scale of the problem, a SPEA2 implementation on a single node proves computationally challenging: running an instance of SPEA2 PPMR scheduling optimisation takes up to 1 h 20 min, and the search time, the quality of the solution and its accuracy all suffer. We address this limitation by proposing a distributed architecture based on MapReduce, which we superimpose on SPEA2. The evaluation of our approach in a case study shows: (1) that our approach offers an effective optimisation modelling mechanism for PPMR; and (2) that the distributed implementation tremendously improves performance, reducing the computational time of the optimisation by 79% across 4 nodes (1 master node and 3 worker nodes), and improves the quality and accuracy of the solution without introducing much overhead.

Keywords: Maintenance scheduling · Component replacement · Hierarchical · Population balancing · Parallel · Strength Pareto Evolutionary Algorithm 2 (SPEA2) · MapReduce · Optimisation

1 Introduction

Maintenance and appropriate preventive maintenance schedules have been shown to prolong component useful life [1, 19] and to improve reliability and availability. However, at a certain point the component becomes unmaintainable or its

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 297–310, 2021.
https://doi.org/10.1007/978-3-030-68154-8_29

reliability falls below a predefined threshold. Consequently, the work in (Ikechukwu et al. 2019) considered component replacement as an improvement factor, in the form of an optimisation parameter, to evaluate and determine [2] when a component's reliability has dropped below the defined acceptable level, which informs whether the component will be maintained or replaced. At this stage, the computation of the cost of maintenance either includes the cost of component replacement or not, when only a maintenance activity is carried out. We established mathematical models for reliability, unavailability and cost for perfect preventive maintenance using the Weibull distribution and an age reduction model. Then we developed a variant of SPEA2 and established an optimisation approach for optimising PPMR schedules. The PPMR schedules of the Fuel Oil Service System (FOSS) model case study are optimised with respect to unavailability and cost using the developed optimisation approach.
SPEA2 is an extension of SPEA: an enhanced elitist multi-objective, multi-directional optimisation algorithm that mitigates some of the drawbacks of SPEA. SPEA2 employs a density valuation method to differentiate among individuals possessing similar raw suitability values. It uses an archive update process in which the archive size remains the same throughout, and its truncation method avoids the removal of border solutions. SPEA2 outperforms most of the other evolutionary algorithms (EAs) on almost all test problems [20], at the expense of computational time when processed step by step (sequentially), because of its huge iterative workload.
Evolutionary algorithms like SPEA2 are generally slow, especially when the evaluation function is computationally intensive. It takes long for an evolutionary algorithm to find near-optimal solutions when processed sequentially; therefore, techniques are needed to speed up this process. It is our view that distributing the optimisation process helps achieve the needed speed-up. With data-parallel programming models, data mapping and task assignment to the processors can be achieved. The data-parallel model adopted in this work is MapReduce. The MapReduce programming model [18] is efficient for solving large-scale data problems in a parallel and distributed environment, and introducing parallelism in a distributed architecture for solving complex problems can increase performance [3]; a sketch of MapReduce-style fitness evaluation is given below. This paper also adopts further methods, such as hierarchical decomposition and population balancing, in addition to distributed processing, to further reduce the computational time.
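The following minimal sketch illustrates MapReduce-style fitness evaluation: the map step scores sub-populations in parallel and the reduce step merges the results. Python's multiprocessing pool stands in for a distributed runtime, and fitness() is a placeholder for the SPEA2 objective evaluation, so this is an illustration rather than the system described later in the paper.

from multiprocessing import Pool

def fitness(individual):
    return sum(individual)                 # placeholder objective

def map_evaluate(subpopulation):
    # Map step: evaluate one sub-population on one worker.
    return [(ind, fitness(ind)) for ind in subpopulation]

def reduce_merge(scored_parts):
    # Reduce step: merge the scored sub-populations.
    merged = []
    for part in scored_parts:
        merged.extend(part)
    return merged

if __name__ == '__main__':
    population = [[i, i + 1] for i in range(1000)]
    chunks = [population[i::4] for i in range(4)]   # e.g. 4 worker nodes
    with Pool(4) as pool:
        scored = reduce_merge(pool.map(map_evaluate, chunks))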
The remainder of the paper is organised as follows. Section 2 discusses related work. Section 3 discusses the distributed solution, which includes the MoHPBPSPEA2 candidate solution algorithm, architecture and flowcharts, while Sect. 4 presents the MoHPBPSPEA2 implementation. Section 5 presents and discusses the results. Section 6 draws some conclusions and sheds light on future work. Section 7 contains the references.

1.1 Challenges with Sequential Evolutionary Algorithm


While EA executes processes sequentially step after step, and crossover and mutation
carried out on couple of individuals, computational time usually grows higher for large
populations as the number of steps also increases. Another aspect that further increases
execution time is the number of objective functions that is to be met and the complexity
of the system model under consideration. Another challenge with sequential imple-
mentation is its memory consumption for storing huge population. The execution
memory requirements are as well higher as the entire population must be available in
memory for the operations of EA. Even though EA still leads to optimal solution,
sometimes sequential EAs may experience partial run in the search space thus out-
putting local optimal solutions instead of the global optimal solutions if it had com-
pleted its full run. Consequently, real-world problems that require faster solutions are
not being considered for EA computation. This is so as EAs do not find solutions by
mere using mathematical illustration of the problem, but are stochastic, non-linear,
discrete and multidimensional. The use of evolutionary algorithms can be improved by
reduction in computational time to achieving solutions faster. This paper proposes a
technique to reduce computational time of evolutionary algorithms.

2 Related Work

At the first International Conference on Genetic Algorithms (ICGA), there were no
papers about parallel GAs at all. This situation changed at the second ICGA in 1987,
where six papers were published. From then on there has been a steady flow of papers
published in conferences and journals on GAs and parallel computation. To reduce
the computational load of genetic search, many methods for searching in a parallel
and distributed manner have been proposed [4–6].
D. Abramson and J. Abela [7] presented a paper discussing the application of an EA
to the school timetabling problem and showed that execution time can be reduced by
using a commercial shared-memory multiprocessor. The program code was written in
Pascal and run on an Encore Multimax shared-memory multiprocessor. The times were
taken for a fixed number of generations (100) so that the effects of relative solution
quality could be ignored, and the results proved that speed-up was attained from a
parallel implementation of the program.
Although much work is reported about PEA models (and implementations on
different parallel architectures), involving SOPs (single objective optimization prob-
lems), PEA models could also be applied for multiobjective optimization problems
(MOPs) [8]. MOPs normally have several (usually conflicting) objectives that must be
satisfied at the same time. The first multi-objective GA, called Vector Evaluated
Genetic Algorithms (or VEGA), was proposed by Schaffer [9]. Afterward, several
major multi-objective evolutionary algorithms were developed such as Multi-objective
Genetic Algorithm (MOGA) [10], Niched Pareto Genetic Algorithm [11], Random
Weighted Genetic Algorithm (RWGA) [12], Nondominated Sorting Genetic Algorithm
(NSGA) [13], Strength Pareto Evolutionary Algorithm (SPEA) [14].

Muhammad Ali Ismail [15] presented work using a master-slave paradigm on a
Beowulf Linux cluster with the MPI programming library. He wrote pseudo-code that
first initialises the basic population as well as the number of nodes present in the
cluster, and then assigns the fitness function to the slave nodes. The slaves compute
the fitness objective and perform mutation; afterwards, they send the results back to
the master, which assembles the final outcome. This paper presented a view of the
implementation and realisation of such algorithms on a parallel architecture.
Lim, Y. Ong, Y. Jin, B. Sendhoff, and B. Lee proposed a Hierarchical Parallel
Genetic Algorithm framework (GE-HPGA) [16] based on standard grid technologies.
The framework offers a comprehensive solution for efficient parallel evolutionary
design of problems with computationally expensive fitness functions, by providing
novel features that conceal the complexity of a Grid environment through an extended
GridRPC API and a metascheduler for automatic resource discovery (Fig. 1).

Multiobjective Hierarchical Population Balanced and Parallel SPEA2
using MapReduce (MoHPBPSPEA2)

1. Accept user input objectives (tasks).
2. Generate a population, P, at random (the solution space).
3. Decompose the problem based on population and objectives (by the server).
   3.1 Divide P into subpopulations subp1, subp2, ..., subpN
       (N is the number of nodes used to solve the problem).
4. Parallel execution on each node (initialise parallel processing on the nodes).
5. For subpi, i = 1...N, execute the next steps in parallel on the nodes (N).
6. On each node:
   6.1 Receive the subpopulation subpi and the objectives (tasks).
   6.2 Perform parallel SPEA2 on the workers:
       6.2.2 For subpw, w = 1...W, execute the next steps in parallel on the
             available working threads (W).
       6.2.2.1 Apply the selection mechanism and the genetic operators
               using MapReduce.
       6.2.2.2 If the termination criteria are not satisfied, return to 6.2.2.
   6.3 Combine results from the nodes.
   6.4 Send the results to the server.
7. If the termination criteria are not met, return to 5.
8. Execute a local EA on the server for best results.
9. Display best results (by the server).

Fig. 1. The proposed MoHPBPSPEA2 Algorithm



3 The Distributed Solution

It is evident from the literature that a range of EA variants, particularly hierarchical
and parallel EAs, perform better than sequential EAs [16, 17]. Consequently, we
consider this kind of EA and propose the application of a hierarchical and parallel EA
(MoHPBPSPEA2) to the multiobjective optimisation problem of perfect preventive
maintenance with component replacement schedules, using a MapReduce model as
the construct of the parallel design. MapReduce [18] is a software framework used to
provide distributed computing on large-scale problems over a cluster of nodes. The
subsequent section presents an architecture depicting how a multiobjective real-world
problem can be solved using this framework.

3.1 The Proposed Architecture


Shown in Fig. 2 is the proposed system architecture. The architecture consists of the
server and the nodes, connected together in the form of a cluster. The connection is
such that information flows from the server to the nodes, between the nodes, and from
the nodes back to the server. Each node contains processor(s) containing threads, and
each thread computes a process. A process consists of a task (objective) and a
subpopulation. We propose to apply the objective (task) and population (data)
strategies together to decompose SPEA2. The construct for decomposition is
hierarchical, and all the nodes in a cluster are offered an equal share of the population,
which they work on with the same task. The global decomposition level (the server)
decomposes the several SPEA2 tasks (objectives) and then assigns them to the local
decomposition level (the nodes). Each local level (node) then takes up the
responsibility of decomposing the given population and task, and runs parallel tasks
on the cluster. The MapReduce model offers a parallel design arrangement for
streamlining MapReduce utilisation in a distributed architecture: MapReduce splits a
large task into smaller tasks and parallelises the execution of the smaller tasks. The
proposed system architecture shows the global and local levels of MoHPBPSPEA2
decomposition.

Fig. 2. Proposed system architecture



3.2 The Proposed Distributed Algorithm and Flowcharts


We propose the MoHPBPSPEA2 candidate solution given above (Fig. 1), which we
superimpose on the SPEA2 variant developed in [2] for the above-mentioned system
architecture.
The MoHPBPSPEA2 approach of distributing jobs (tasks and population) across nodes
and threads has two major benefits. First, it gives MoHPBPSPEA2 the construct to
support different computer specifications, that is, the ability to run on a node with many
threads or on a node with only one thread, and thus to run on various hardware
architectures. Second, it creates the opportunity to study various variables that may
affect the performance of MoHPBPSPEA2, such as communication between nodes of
the same or different computer types and communication between threads of the same
or different nodes, so as to offer a better understanding of MoHPBPSPEA2 and enable
its improvement.

4 MoHPBPSPEA2 Implementation

In the proposed approach, the problem is first decomposed based on the number of
objectives (tasks) and the population P across the individual nodes Ni. The second
stage deals with population balancing by parallel decomposition of the given
population on each node. Each node is offered P/N individuals of the population,
where P is the population size and N is the number of nodes considered for solving
the problem. The smaller population, or subpopulation, which forms part of the jobs
or tasks, is then assigned to each node. Each node further distributes its workload
among its working threads, and the jobs are performed in parallel in the cluster. This
dual-phase distribution approach (sketched below) also contributes to the efficiency
and effectiveness of the developed MoHPBPSPEA2.
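As an illustration of the dual-phase split, a minimal Python sketch follows. The striding scheme and the function name decompose are our assumptions, since the paper does not specify how individuals are assigned to nodes and threads.

def decompose(population, n_nodes, n_threads):
    # Phase 1: the server splits P into N node-level subpopulations (~P/N each)
    node_subpops = [population[i::n_nodes] for i in range(n_nodes)]
    # Phase 2: each node splits its subpopulation across its W working threads
    return [[sub[w::n_threads] for w in range(n_threads)] for sub in node_subpops]

Striding (rather than contiguous slicing) keeps the subpopulation sizes balanced to within one individual, which is the point of the population-balancing phase.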
The dataset is first processed and stored in the datastore, and almost all the
time-consuming computations and evaluations (population generation, fitness
assignment, selection and the objective functions) are moved to the mapper for
processing, while the mapper output is sent to the reducer as input for crossover and
mutation; finally, the reducer outputs the results, as sketched below. These strategies
save dataset processing time (input-output processing) that would otherwise have been
spent at different stages of the entire run, thereby saving both computation time and
communication time amongst the nodes. The results from all the participating nodes in
the cluster are then combined and sent to the server for execution of a local EA for the
best results, which are also displayed by the server.
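The following is a minimal Python sketch of this mapper/reducer division of labour, under stated simplifications: evaluate, crossover and mutate are placeholder callables, and simple truncation selection stands in for SPEA2's environmental selection. It is not the authors' Julia MapReduce code.

import random

def map_phase(subpopulation, evaluate):
    # mapper: the costly work - evaluate the objective functions / fitness
    return [(ind, evaluate(ind)) for ind in subpopulation]

def reduce_phase(evaluated, crossover, mutate):
    # reducer: select parents from the mapper output, then apply the genetic
    # operators (crossover and mutation) to produce the next subpopulation
    ranked = sorted(evaluated, key=lambda pair: pair[1])  # lower fitness is better
    parents = [ind for ind, _ in ranked[: max(2, len(ranked) // 2)]]
    offspring = []
    while len(offspring) < len(evaluated):
        p1, p2 = random.sample(parents, 2)
        offspring.append(mutate(crossover(p1, p2)))
    return offspring

Keeping the expensive evaluation inside the map phase is what allows the cluster to amortise it across nodes, while the reducer only performs the comparatively cheap variation step.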

Julia is adopted to implement the developed distributed algorithm. The reliability
threshold tr used is 0.8, the shortest PM interval T is 180, the population and archive
population sizes are 500, and the algorithm is run for 500 generations. The
MoHPBPSPEA2 algorithm is deployed on the cluster of nodes.
The proposed approach is also generic, as it offers flexibility for expansion to more
than one local cluster. In that case, the server is connected to the n local clusters and
has all the nodes on each cluster run in parallel. The algorithms and the flowcharts
presented illustrate the processes that would be executed to implement
MoHPBPSPEA2.
A simple model of a fuel oil service system of a ship (FOSS), a case study described
in [2], is used to demonstrate the developed approach. The constituent components of
the FOSS are marked with failure behaviour data, and their short names are described
as in [1] (Figs. 3 and 4).

Fig. 3. Flowchart showing process flow on the server



Fig. 4. Flowchart showing process flow on the nodes

5 Results

A set of optimal PPMR schedules for the fuel oil system was obtained for (i) serial
processing and (ii) parallel and distributed processing. A set of trade-off PM
schedules, shown in the Pareto frontiers of Table 1 and Fig. 6 (parallel and
distributed) and Table 2 and Fig. 5 (serial), is also produced. Each solution, or PM
schedule, represents an optimal trade-off between unavailability and cost.

5.1 Discussion
A total of 53 optimal solutions were obtained for FOSS PPMR scheduling
optimisation via serial processing, and 64 optimal solutions via parallel and
distributed processing. The first and last 5 of the optimal PPMR schedules for the
serial and the parallel and distributed processes are presented in Table 2 and Table 1
respectively, with their graphical representations shown in Figs. 5 and 6
respectively. A comparison of their statistical results is shown in Table 3.

Table 1. Optimal PPMR schedules – parallel and distributed optimisation via NUST Cluster of
Servers

Table 2. Optimal PPMR schedules – serial optimisation on a NUST Server



Fig. 5. Pareto frontier of PPMR schedules via serial optimisation on a NUST Server

Fig. 6. Pareto frontier of PPMR schedules via parallel and distributed optimisation on NUST
Cluster of Servers

Table 3. Statistical tool comparison for serial on server versus parallel and distributed
optimisation of FOSS PPMR scheduling

From Table 3, the unavailability minimum and the cost minimum values for the serial
and the parallel and distributed processes are the same.
However, this does not hold for the maxima: the parallel and distributed process has
an unavailability maximum 0.012276 greater than that of the serial process, and the
difference in its range is also 0.012276 greater. In the same vein, the parallel and
distributed process has a cost maximum 166000 greater, with a range difference of
166000 greater than serial. This implies that the parallel and distributed process
explored a wider space in the feasible region than the serial process, and also obtained
more optimal PPMR schedules.
The unavailability standard deviation for the parallel and distributed process is 0.00151
smaller than that for serial, but the reverse holds for the cost standard deviation, where
the parallel and distributed process is 12026 greater. Similarly, the unavailability mean
for the parallel and distributed process is 0.014682 smaller than for serial, while the
cost mean is 16347 greater. However, to gain a clearer perception of what happened in
the optimisation of the FOSS PPMR schedules, the distances between the objective
function values obtained via the serial and the parallel and distributed processes need
to be established further.
From Table 2 (serial), the least optimal schedule has an unavailability of 0.094808367
and a cost of 4161191 in generation 500, while the highest optimal schedule, also
found in generation 500, has an unavailability of 0.37103334 and a cost of 2741967.
The least optimal schedule in Table 1 (parallel and distributed) has an unavailability of
0.094808367 and a cost of 4373191 at generation 497, and the highest optimal
schedule, obtained at generation 499, has an unavailability of 0.383308865 and a cost
of 2704967. The unavailability and cost differences (spread) between these schedules
are 0.276225 and 1419224 respectively for serial, and 0.2885 and 1668224
respectively for the parallel and distributed process.
Comparing the two objective-function spread values above, the distances between the
optimal solutions obtained by the parallel and distributed process are seen to have
shrunk more than those for serial. This means that the optimal PPMR schedules
obtained via the parallel and distributed process are closer to one another than those of
serial, which satisfies one of the goals of multiobjective optimisation: the smaller the
distances in objective function values between one non-dominated solution and the
other obtained non-dominated solutions, the better the diversity of the non-dominated
solutions.
Interestingly, from Fig. 5 (serial), the first and last optimal solutions of the Pareto
frontier have objective function values of approximately 0.38 unavailability and 2.810
cost, and 0.08 unavailability and 4.310 cost, respectively, giving objective-function
space differences of 0.30 and 1.500 for unavailability and cost respectively. In the
same vein, from Fig. 6 (parallel and distributed), the first and last optimal solutions of
the Pareto frontier have objective function values of approximately 0.39 unavailability
and 2.810 cost, and 0.07 unavailability and 4.410 cost, respectively, giving
objective-function space differences of 0.32 and 1.600. From this analysis, the
objective-function differences for the parallel and distributed process strike more of a
balance than those for serial; that is, the optimal schedules (solutions) obtained via the
parallel and distributed process achieve a better trade-off between the objective
functions (unavailability and cost). It can be seen from Fig. 5, Fig. 6, Table 1 and
Table 2 that the PPMR schedules obtained via the parallel and distributed process are
superior to those obtained via serial.
The computational time for optimisation via the parallel and distributed process is
3792 s lower than the serial time, as shown in Table 3. In other words, the parallel and
distributed process improved the optimisation run time by 79%.

6 Conclusion

The developed scalable, hierarchical, and distributed SPEA2 algorithm
(MoHPBPSPEA2) for optimising perfect preventive maintenance with component
replacement schedules (PPMR) provides a wider contribution to knowledge in the
field of AI application.
The results obtained from subjecting the FOSS constituent components to the
developed optimisation approach were shown to have met expectations: optimising
PPMR schedules in distributed processes can reduce computational time and enhance
cost effectiveness, inject more diversity into the solution space, and produce better
design solutions.
There are many possibilities for future work. The work in this paper can be
extended to:
(1) Investigate and demonstrate the effect on design solutions of using the island
model with SPEA2 for perfect preventive maintenance with replacement in a
distributed environment. The island model could further help enhance design
solution quality.
(2) Investigate optimising imperfect preventive maintenance schedules under a
replacement policy using SPEA2 in a distributed process.
(3) Investigate a network model to track the fitness of system components between
PM times and offer data that might be useful in recommending possible preventive
measures on the system, should it require attention before the next PM time.

(4) Investigate and demonstrate the effect of executing PPMR schedules on more than
4 nodes or more than one cluster. This would significantly further reduce the
computational time.

References
1. Nggada, S.H.: Reducing component time to dispose through gained life. Int. J. Adv. Sci.
Technol. 35(4), 103–118 (2011)
2. Ikechukwu, O.A., Nggada, S.H., Quenum, J.G., Samara, K.: Optimisation of perfect
preventive maintenance and component replacement schedules (PPMCRS) using SPEA2.
In: Proceedings of ITT 2019, 6th International Conference on Emerging Technologies on
Blockchain and Internet of Things, HCT Women Campus, Ras Al Khaimah, UAE, 20–21
November 2019. IEEE Inspec accession no. 19569157 (2019). https://doi.org/10.1109/
itt48889.2019.9075124. ISBN: 978-1-7281-5061-1
3. Pawar, R.S.: Parallel and distributed evolutionary algorithms. Seminar on Computational
Intelligence in Multi-Agent Systems, 10 July 2018
4. Cantú-Paz, E.: A survey of parallel genetic algorithms. Calculateurs Paralleles, Reseaux et
Systems Repartis 10(2), 141–171 (1998)
5. Cantú-Paz, E., Goldberg, D.E.: Are multiple runs of genetic algorithms better than one? In:
Proceedings of the 2003 International Conference on Genetic and Evolutionary Computation
(GECCO 2003) (2003)
6. Hiroyasu, T., Kaneko, M., Hatanaka, K.: A parallel genetic algorithm with distributed
environment scheme. In: Proceedings of the 1999 IEEE International Conference on
Systems, Man, and Cybernetics (SMC 1999) (1999)
7. Abramson, D., Abela, J.: A parallel genetic algorithm for solving the school timetabling
problem. In: Proceedings of the 15th Australian Computer Science Conference, Hobart,
pp. 1–11, February 1992
8. De Toro, F., Ortega, J., Fernandez, J., Diaz, A.: PSFGA: a parallel genetic algorithm for
multiobjective optimization. In: Proceedings of the 10th Euromicro Workshop on Parallel,
Distributed and Network-based Processing (2002)
9. Schaffer, J.D.: Multiple objective optimization with vector evaluated genetic algorithms. In:
Proceedings of the International Conference on Genetic Algorithm and their applications
(1985)
10. Fonseca, C.M., Fleming, P.J.: Multiobjective genetic algorithms. In: IEEE Colloquium on
Genetic Algorithms for Control Systems Engineering (Digest No. 1993/130) (1993)
11. Horn, J., Nafpliotis, N., Goldberg, D.E.: A Niched Pareto genetic algorithm for
multiobjective optimization. In: Proceedings of the 1st IEEE Conference on Evolutionary
Computation. IEEE World Congress on Computational Intelligence (1994)
12. Murata, T., Ishibuchi, H.: MOGA: multi-objective genetic algorithms. In: Proceedings of the
1995 IEEE International Conference on Evolutionary Computation, Perth (1995)
13. Srinivas, N., Deb, K.: Multiobjective optimization using nondominated sorting in genetic
algorithms. J. Evol. Comput. 2(3), 221–248 (1994)
14. Zitzler, E., Thiele, L.: Multiobjective evolutionary algorithms: a comparative case study and
the strength Pareto approach. IEEE Trans. Evol. Comput. 3(4), 257–271 (1999)
15. Ismail, M.A.: Parallel Genetic Algorithms (PGAS): master slave paradigm approach using
MPI, ETech (2004)

16. Lim, D., Ong, Y., Jin, Y., Sendhoff, B., Lee, B.: Efficient hierarchical parallel genetic
algorithms using grid computing. Proc. Future Gener. Comput. Syst. 23(4), 658–670 (2007)
17. Sefrioui, M., Srinivas, K., Periaux, J.: Aerodynamic shape optimization using a hierarchical
genetic algorithm. In: European Conference on Computational Methods in Applied Science
and Engineering (ECCOMAS 2000) (2000)
18. Verma, A., Llora, X., Goldberg, D.E., Campbell, R.H.: Scaling genetic algorithms using
MapReduce. In: Proceedings of the 9th International Conference on Intelligent Systems
Design and Applications, ISDA 2009 (2009)
19. Nggada, S.H.: Multi-objective system optimisation with respect to availability, maintain-
ability and cost. Ph.D. dissertation, The University of Hull, vol. 3 2012 (2012)
20. Zitzler, E., Deb, K., Thiele, L.: Comparison of multiobjective evolutionary algorithms:
empirical results (revised version). Technical Report 70, Computer Engineering and
Networks Laboratory (TIK), Swiss Federal Institute of Technology (ETH) Zurich,
Gloriastrasse 35, CH 8092 Zurich, Switzerland (1999)
A Framework for Traffic Sign Detection
Based on Fuzzy Image Processing
and Hu Features

Zainal Abedin(B) and Kaushik Deb

Chittagong University of Engineering and Technology (CUET),
Chattogram, Bangladesh
jakcse99@gmail.com, debkaushik99@cuet.ac.bd

Abstract. The Traffic Sign Recognition (TSR) system is an essential
part of traffic management, serving as a support system for drivers and
intelligent vehicles. Traffic Sign Detection (TSD) is the prerequisite step of
an automatic TSR system. This paper proposes a TSD framework that
explores fuzzy image processing and invariant geometric moments. In
this framework, a fuzzy inference system is applied to convert the HSV
image into gray tone. A statistical threshold is then used for the
segmentation. After shape verification of every connected component using
Hu moments and a Quadratic Discriminant Analysis (QDA) model, the
candidate signs are detected by referencing bounding box parameters.
The framework is simulated in different complex scenarios in day and
night mode. Experimental results show that its performance is satisfactory
and comparable with the state of the art. The proposed framework yields a
94.86% F-measure on Bangladesh road signs and 93.01% on German
road signs.

Keywords: Traffic sign detection · Fuzzy inference system · Hu moments ·
Quadratic discriminant analysis

1 Introduction
Traffic signs are indispensable components that assist traffic police and provide
a safety mechanism by controlling the flow of traffic and guiding pedestrians and
drivers in every aspect of the road environment. Automatic traffic sign recognition
using computer vision attracts special interest in real-time applications
such as intelligent vehicles and traffic maintenance. In an automatic TSR
system, an on-board mounted module is used to detect and return the information
of traffic signs for proper navigation of the vehicle. Regarding transportation
maintenance, well-positioned and accurate information from traffic signs
definitely enhances road security and safety. In a word, TSR is an important part
of an Intelligent Transportation System (ITS) [1,2,4].
Traffic sign is designed using distinct colours and shapes, which are really
salient and visible, easily discriminable by human from the environment. Most of
the signs are made using red, blue and yellow colors, where the shapes are triangular,
circular, square and rectangular. A sign is made distinctive by three things: border
color, pictogram and background. The signs are installed alongside the road so that
drivers can easily detect them [11]. The shape and colors also determine the categories
of signs. The properties of Bangladesh road signs are outlined in Table 1, and some
examples of traffic signs are illustrated in Fig. 1.

Table 1. Properties of Bangladesh traffic signs [5].

Color Shape Type


Red border with white background Circular Regulatory
Red border with white background Triangular Warning
Blue background Circular Regulatory
Blue background Square or Rectangle Information

Fig. 1. Examples of Bangladesh traffic signs [5].

Uncertainty is a prevailing matter in image processing: visual patterns feature
inherent ambiguity, digital images are noised while being captured, object
presentations are not crisp in practical scenarios, knowledge for object detection
is vague, and the color properties of a pixel are linguistic matters (such as
dark, bright and high luminance). Fuzzy logic, in the field of soft computing,
is a tool to handle these sorts of uncertainty. Hence, the application of fuzzy
logic can be a great field to explore in developing vision-based applications [3].
As in [9,10], a large amount of research work has been done on the traffic signs of
countries like Germany, Belgium, Sweden, France, the US, the UK and China. Another
important fact from the literature review of [10] is that traffic signs do not
look alike across the globe. In spite of the clear definition of laws by the Vienna
Treaty, dissimilarity of sign structures still prevails among the countries signatory
to the agreements, and in some scenarios noteworthy variation in traffic sign
appearance can remain within a country. These variations can be managed
easily by humans, while in automatic detection and recognition systems they are
challenging issues that need to be addressed. A graphic in Fig. 2 illustrates this
variation for the stop sign of Bangladesh and some
countries. Another important research gap is that most existing research did not
consider night-mode sign detection scenarios [10].

Fig. 2. Variation in sign appearances

In light of the above, the aim of this research is to propose a framework
for the detection of Bangladesh traffic signs in which, along with the common
challenges of detection algorithms, night-mode scenarios are emphasized.
The key contributions of the framework are: intensity transformation using fuzzy
inference, which can handle uncertainties in image processing and provides a
computing tool for incorporating human knowledge; segmentation by a statistical
threshold; and shape verification using geometric invariant Hu moments and a
QDA model.
The rest of the paper is structured such that Sect. 2 presents related
research and Sect. 3 presents the proposed framework, while results and analysis
are demonstrated in Sect. 4. Finally, the conclusion is drawn in Sect. 5.

2 Related Works
Detecting and localizing the RoIs in an image is called Traffic Sign Detection
(TSD), which is the first step of TSR. According to the comprehensive reviews
regarding TSD by the contributors of [9,10], the major categories of
TSD algorithms are color based, shape based and machine learning based.
In color-based methods, the images are segmented using a threshold chosen
on the basis of color information, while in shape-based methods, the
shapes of the signs are detected using popular edge detection algorithms
such as the Hough transformation. Recently, machine learning based sign
detection has become very popular due to its better accuracy. In this approach,
pixels are classified from the background by using different features and
classification models.
In [11], a color-based detection algorithm was proposed for Spanish traffic
signs, where the threshold was determined by observing the hue and saturation
histograms of HSI images of manually cropped traffic signs. This
algorithm also used Distance to Borders (DtBs) vectors and an SVM model to
verify the shapes of the signs. A similar segmentation method was applied by the
authors of [4], where shape verification was implemented with Hu features: the
cross-correlation values of the detected shapes and the ground-truth shapes
were calculated, and shapes within an empirical threshold were considered as
positive shapes. Similarly, in [8], the authors proposed a detection algorithm for
Bangladeshi road signs using the YCbCr color space, with statistical threshold
values for segmentation and DtBs vectors to verify shapes.
In [12], a shape and color based detection algorithm was proposed with an
f-measure of 0.93. The potential regions of interest were searched in the scene
by using Maximally Stable Extremal Regions (MSERs), and an HSV threshold
was then used to detect text-based sign regions. The detected regions
were further filtered using temporal information through consecutive images.
Yang et al. proposed a computationally efficient detection algorithm which can
detect signs at a rate of six frames per second on images of size 1360 × 800 [13].
They developed probability models for the red and blue channels to increase the
visibility of traffic colors using a Gaussian distribution and the Ohta space; then
MSERs were applied to locate potential RoIs. Next, the RoIs were classified as
signs or not using HOG features and an SVM classifier.
The authors of [2] presented a machine learning based approach to segment
the image, where positive colors are converted to foreground and others are
mapped to black. A mean-shift clustering algorithm based on color information
was used to cluster the pixels into regions, and the centroid of each cluster was
then used to classify whether the cluster is a sign or not by a random forest. In
addition, a shape verification step was included in the framework to discard false
alarms; this verification was based on the log-polar transformation of the detected
blob and its correlation values with the ground truth. The authors of [15] proposed
an efficient detection algorithm using AdaBoost and Support Vector Regression
(SVR) for discriminating sign areas from the background, where a saliency model
was designed using specific sign features (shape, color and spatial information).
By employing the saliency information, a feature pyramid was developed
to train an AdaBoost model for the detection of sign candidates in the image,
and an SVR model was trained using Bag of Words (BoW) histograms to classify
the true traffic signs from the candidates. In [1], the chromatic pixels were
clustered using the k-NN algorithm, taking the values of a and b in the L*a*b
space, and these clusters were then classified using a Support Vector Machine.
Alongside this, achromatic segmentation was implemented by thresholding in HSI
space, and shape classification was done using Fourier descriptors of the regions
and an SVM model.
Most machine learning methods are based on handcrafted features to identify
signs. Recently, deep learning has become very popular in computer vision
research for detecting objects, due to its capability of representing features
from the raw pixels of the image. The detection algorithm proposed in [19]
used dark-area-sensitive tone mapping to enhance the illumination of dark
regions of the image and an optimized version of YOLOv3 to detect signs.
The authors of [20] detected signs using a Convolutional Neural Network (CNN),
where a modified version of YOLOv2 was applied.
An image segmentation by fuzzy rule based methods in the YCbCr color space
was presented in [18], where a triangular membership function and the weighted
average defuzzification approach were selected.

3 Detection Framework
In general, TSR comprises two steps: sign detection from an image or
video frame, and classification of the detections. This research concentrates
on the detection of signs from an image as a mandatory step before sign
recognition. The proposed framework for traffic sign detection is illustrated in
Fig. 3. According to the framework, the image is captured by a camera, and
then some pre-processing is applied, such as reshaping the image to 400 × 300
pixels and mapping the color space from RGB to HSV. After pre-processing, the
image is transformed into gray level by fuzzy inference, where the inputs
are the crisp values of the hue and saturation of a pixel. The gray image is then
segmented by applying a statistical threshold. The segmented image is subjected
to further processing, such as morphological closing and filtering using area and
aspect ratio, to discard unwanted regions. Next, the shapes of the remaining
regions are verified by extracting Hu features and using a trained Quadratic
Discriminant Analysis (QDA) model. Regions which are not classified as circles,
triangles or rectangles are discarded at this stage. After shape verification, the
connected components remaining in the binary image are considered as the
Regions of Interest (RoIs). Finally, the RoIs are detected as the potential traffic
signs in the input image by bounding box parameters.

Fig. 3. Steps of Traffic sign detection framework.

3.1 Preprocessing
A nonlinear color space mapping from RGB to Hue Saturation Value (HSV) is
applied to the input image to reduce the illumination sensitivity of the RGB
model. The hue and saturation levels of the HSV image are isolated as the crisp
inputs for the fuzzy inference system.

3.2 Transformation Using Fuzzy Inference

In this work, we use a Mamdani Fuzzy Inference System (FIS) to map the hue and
saturation values of a pixel to a gray value. This inference consists of mainly three
steps: fuzzification, inference using a fuzzy knowledge base, and defuzzification [3,7],
with two inputs and one output. The inputs are the hue and saturation values of a
pixel, and the output is the gray value of the pixel.

Fuzzification: this converts crisp sets into fuzzy sets with the assignment of degrees
of membership. There are many techniques for the fuzzification process; here,
fuzzification is based on the intuition of an expert. A fuzzy set is defined by a
linguistic variable, where the degree of membership is assigned in the range
(0, 1) by a characteristic function. Intuition is used to construct the membership
functions, and the knowledge of a domain expert is utilized to develop that
intuition. For the universe of 'hue', we consider five linguistic labels: reddish-1,
reddish-2, bluish, noise-1 and noise-2; for 'saturation', they are reddish and bluish;
and for the output they are reddish, bluish and black. To assign the membership
values of each linguistic variable, triangular and trapezoidal functions are used.
Table 2 presents the parameter values of each membership function.

Table 2. Membership parameters of the fuzzy sets.

Linguistic variable      Membership function   Parameters

Reddish-1 (hue)          Triangular            a = 0, m = 0, b = 22
Reddish-2 (hue)          Triangular            a = 240, m = 255, b = 255
Bluish (hue)             Trapezoidal           a = 115, m = 140, n = 175, b = 190
Noise-1 (hue)            Trapezoidal           a = 150, m = 210, n = 255, b = 255
Noise-2 (hue)            Triangular            a = 0, m = 0, b = 22
Reddish (saturation)     Trapezoidal           a = 96, m = 102, n = 255, b = 255
Bluish (saturation)      Trapezoidal           a = 150, m = 210, n = 255, b = 255
Reddish (output)         Triangular            a = 220, m = 235, b = 252
Bluish (output)          Triangular            a = 110, m = 150, b = 255
Black (output)           Triangular            a = 0, m = 15, b = 35

Inference: this is the main step of the FIS, where the fuzzy knowledge base is used
to infer a set of rules on the input fuzzy sets. In fuzzy logic systems, the fuzzy
knowledge base is developed from the rules and linguistic variables, based on
fuzzy set theory, to perform inference. This knowledge base is constructed using a
set of If-Then rules. The structure of the rules in the standard Mamdani FIS is
as follows: the k-th rule is R_k: IF x_1 is F_1^k and ... and x_p is F_p^k, THEN y is G^k,
where p is the number of inputs. An example of a rule is R_1 = 'If hue is reddish-1
AND saturation is reddish, THEN output is reddish'. In our knowledge base,
five rules are incorporated.
The inference is done in four steps. Step 1: the inputs are fuzzified; here
the crisp hue and saturation values of a pixel are converted to fuzzy
variables. Step 2: all the antecedents of a rule are combined to yield a single
fuzzy value, using the fuzzy min or max operation depending on whether the parts
are connected by ANDs or by ORs. Step 3: the output of each rule is inferred
by an implication method; we use the min operation for implication. Step 4:
this step aggregates the outputs of all rules to yield a single fuzzy output;
here the fuzzy max operator is used.

Defuzzification: in this step, we defuzzify the aggregated output (fuzzy values) to
create a scalar value by applying the centroid defuzzification method. Figure 4 shows
three examples of gray images after the fuzzy intensity transformation.
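The whole pipeline can be summarised in a short sketch. The following Python code is illustrative only: it implements triangular/trapezoidal memberships with the Table 2 parameters, min for AND and implication, max for rule aggregation, and centroid defuzzification. The complement-based fallback to the black output is our simplification of the paper's noise rules, and the function names are ours (the authors' implementation is in Matlab, Sect. 4.1).

def trimf(x, a, m, b):
    # triangular membership function
    if x < a or x > b:
        return 0.0
    if x == m:
        return 1.0
    return (x - a) / (m - a) if x < m else (b - x) / (b - m)

def trapmf(x, a, m, n, b):
    # trapezoidal membership function
    if x < a or x > b:
        return 0.0
    if x < m:
        return (x - a) / (m - a)
    if x <= n:
        return 1.0
    return (b - x) / (b - n)

def fuzzy_gray(hue, sat):
    # fuzzification with the Table 2 parameters
    red_h  = max(trimf(hue, 0, 0, 22), trimf(hue, 240, 255, 255))  # reddish-1 or reddish-2
    blue_h = trapmf(hue, 115, 140, 175, 190)
    red_s  = trapmf(sat, 96, 102, 255, 255)
    blue_s = trapmf(sat, 150, 210, 255, 255)
    # inference: AND -> min (implication also min), rule aggregation -> max
    fire_red   = min(red_h, red_s)
    fire_blue  = min(blue_h, blue_s)
    fire_black = 1.0 - max(fire_red, fire_blue)  # assumed fallback for the noise rules
    # centroid defuzzification over the sampled output universe [0, 255]
    num = den = 0.0
    for y in range(256):
        mu = max(min(fire_red,   trimf(y, 220, 235, 252)),
                 min(fire_blue,  trimf(y, 110, 150, 255)),
                 min(fire_black, trimf(y, 0, 15, 35)))
        num += y * mu
        den += mu
    return num / den if den else 0.0

Applying fuzzy_gray to every (hue, saturation) pair of the HSV image yields the gray image that the next stage thresholds.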

Fig. 4. Gray image after fuzzy inference.

3.3 Segmentation
The gray image obtained by fuzzy inference is converted to a binary image using
the statistical threshold given by Eq. 1, which is derived from the statistical
parameters of the image:

\text{Binary}(x, y) = \begin{cases} 1, & \text{if } I(x, y) > \mu + \sigma \\ 0, & \text{otherwise} \end{cases}   (1)

where \mu is the mean and \sigma is the standard deviation of the gray image I(x, y).
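In NumPy, Eq. 1 is essentially a one-liner; this sketch (function name ours) assumes gray is a 2-D array of fuzzified intensities:

import numpy as np

def segment(gray):
    # Eq. 1: foreground where the fuzzified intensity exceeds mean + std dev
    mu, sigma = float(gray.mean()), float(gray.std())
    return (gray > mu + sigma).astype(np.uint8)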

3.4 Morphological Closing and Filtering


The goal of this step is threefold: to merge regions which belong to the same
sign but were fragmented during segmentation; to discard regions which are
very small or very big; and to eliminate regions which do not comply with the aspect
ratios of the signs. To get the desired output, morphological closing with a circular
structural element, followed by area and aspect-ratio filtering, is applied in this order
to every connected component of the segmented image. The filtering criteria, which
were derived empirically, are listed in Table 3, and a sketch follows the table.

Table 3. Filtering criteria.

Parameter Candidate region


Aspect ratio 0.66 to 0.71
Area 300 to 850
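A possible realisation of this step with OpenCV follows. The 5 × 5 elliptical kernel size and the width/height definition of the aspect ratio are our assumptions (the paper states neither); the Table 3 bounds are used as given.

import cv2
import numpy as np

def close_and_filter(mask, area_range=(300, 850), aspect_range=(0.66, 0.71)):
    # morphological closing with a circular (elliptical) structuring element
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    closed = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # keep only connected components satisfying the Table 3 criteria
    n, labels, stats, _ = cv2.connectedComponentsWithStats(closed)
    keep = np.zeros_like(closed)
    for i in range(1, n):                      # label 0 is the background
        area = stats[i, cv2.CC_STAT_AREA]
        w, h = stats[i, cv2.CC_STAT_WIDTH], stats[i, cv2.CC_STAT_HEIGHT]
        aspect = w / h                         # assumed width/height convention
        if (area_range[0] <= area <= area_range[1]
                and aspect_range[0] <= aspect <= aspect_range[1]):
            keep[labels == i] = 1
    return keep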

3.5 Shape Verification

Fig. 5. Steps in shape verification.

In this step, the regions remaining after post-processing are verified to filter out false
alarms. To verify the regions, geometric invariant Hu moments are extracted for every
region, and a Quadratic Discriminant Analysis (QDA) classifier is used to verify the
shapes. The steps of shape classification are depicted in Fig. 5. After shape
verification, regions which are not triangular, circular or rectangular are filtered out.
According to [3,17], the 2D moment of order (p + q) of a digital image f(x, y)
of size M x N is defined as

m_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} x^p y^q f(x, y)   (2)

where p = 0, 1, 2, ... and q = 0, 1, 2, ... are integers. The corresponding central
moment of order (p + q) is defined as

\mu_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} (x - \bar{x})^p (y - \bar{y})^q f(x, y)   (3)

for p = 0, 1, 2, ... and q = 0, 1, 2, ..., where \bar{x} = m_{10}/m_{00} and
\bar{y} = m_{01}/m_{00}. The normalized central moments, denoted by \eta_{pq}, are
defined as

\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}}   (4)

where \gamma = (p + q)/2 + 1 for p + q = 2, 3, 4, .... A set of seven invariant
moments is calculated, where h_1 to h_6 are orthogonal (independent of size,
position and orientation) moments and h_7 is an invariant skew moment (good for
processing mirror images) [17]:

h_1 = \eta_{20} + \eta_{02}   (5)

h_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2   (6)

h_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2   (7)

h_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2   (8)

h_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2]
      + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]   (9)

h_6 = (\eta_{20} - \eta_{02})[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]
      + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})   (10)

h_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2]
      + (3\eta_{12} - \eta_{30})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]   (11)

The classification model, discriminant analysis, is defined by Eq. 12:

G(x) = \arg\max_k \delta_k(x)   (12)

where \delta_k(x) is the discriminant function, given by Eq. 13:

\delta_k(x) = -\frac{1}{2} \log |\Sigma_k| - \frac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) + \log \pi_k   (13)

where \Sigma_k is the covariance matrix and \pi_k is the prior probability of class k.
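As an illustration of Eqs. (2)-(6), the following NumPy sketch computes the normalized central moments and the first two Hu moments of a region; in practice, OpenCV's cv2.HuMoments(cv2.moments(region)) returns all seven. The helper names are ours, not from the paper.

import numpy as np

def hu_first_two(f):
    # f: 2-D binary (or gray) region image, as in Eqs. (2)-(6)
    f = f.astype(float)
    M, N = f.shape
    x = np.arange(M).reshape(-1, 1)   # row index grid, as x in Eq. (2)
    y = np.arange(N).reshape(1, -1)   # column index grid, as y in Eq. (2)
    m00 = f.sum()
    xbar = (x * f).sum() / m00        # centroid, Eq. (3) definitions
    ybar = (y * f).sum() / m00
    def eta(p, q):                    # normalized central moment, Eqs. (3)-(4)
        mu = (((x - xbar) ** p) * ((y - ybar) ** q) * f).sum()
        return mu / m00 ** ((p + q) / 2 + 1)
    h1 = eta(2, 0) + eta(0, 2)                               # Eq. (5)
    h2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2   # Eq. (6)
    return h1, h2

Because the \eta_{pq} are normalized by \mu_{00}^{\gamma}, the resulting h values are invariant to the size and position of the shape, which is what makes them suitable inputs for the QDA shape classifier.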

3.6 Detection
After shape verification, the potential sign region candidates are finalized. The
bounding box parameters of the connected components remaining in the binary
image after shape filtering are determined. Using these parameters, the traffic
sign areas are then detected in the input frame and can easily be cropped for
recognition of the sign.

4 Result and Analysis


4.1 Experiment Setup
To simulate the proposed detection framework, Matlab R2018a is used for the
experiment. The machine is configured with Pentium quad-core processors and
8 GB RAM, without any GPU.

4.2 Detection Results


The presented traffic sign detection framework is evaluated on one publicly
available data set, the German Traffic Sign Detection (GTSD) benchmark, and on
a Bangladesh Traffic Sign Detection (BTSD) data set created by the authors. For
BTSD, the images were captured in many scenarios (rural, city and highway)
during daytime, dusk and night, focusing on different conditions such as
illumination variation, shadows, different orientations and complex backgrounds.
The images were down-sampled to a dimension of 400 × 300 pixels. Figure 6,
Fig. 7 and Fig. 8 demonstrate every step of the detection framework. Figure 9
shows some samples of the detection results in different scenarios, and Fig. 10
presents detection results in night mode. The incorporation of night-mode
scenarios is distinctive, as previous research did not test detection frameworks at
night time. Table 4 tabulates the average computational cost of the detection
framework per input image, without any code optimization or dynamic
algorithms.

Fig. 6. Steps of detection framework (image from Bangladeshi road scene): a) input
image b) HSV image c) fuzzified image d) binary image e) after morphological process-
ing f) after filtering g) after shape verification h) detection of sign.

Fig. 7. Steps of detection framework (image from GTSD): a) input image b) HSV
image c) fuzzified image d) binary image e) after morphological processing f) after
filtering g) after shape verification h) detection of sign.

Fig. 8. Steps of detection framework (night mode): a) input image b) HSV image c)
fuzzified image d) binary image e) after morphological processing f) after filtering g)
after shape verification h) detection of sign.

Fig. 9. Some examples of road sign detection.

Fig. 10. Some examples of detection results in night mode scenarios.



4.3 Performance Evaluation

The Pascal measures are used to evaluate detection performance [6,16]. A
detection is regarded as a true positive if the intersection of the bounding box of
the prediction and the ground truth is more than fifty percent. The Intersection
over Union (IoU) is defined by Eq. 14, where BB_dt is the bounding box of the
detection and BB_gt is that of the ground truth. By computing the IoU score
for each detection, the detection is classified into True Positive (TP), False
Negative (FN) or False Positive (FP), and from these values the precision, recall
and f-measure of the detector are calculated. An illustration is presented in
Fig. 11.
The precision and recall of this detection framework are calculated for both
the German data set and the Bangladeshi road scenes. The performance in both
cases is quite remarkable and comparable. For the Bangladesh data, the
detection algorithm shows 95.58% recall and 94.15% precision, and on GTSDB
the recall is 92.8% and the precision is 93.23%. To assess the efficiency of the
framework, a comparative analysis is performed against five existing detection
algorithms, summarized in Table 5. It is evident from Table 5 that the
proposed framework achieves an F-measure of 0.93 for GTSDB and 0.94 for
BTSD.
IoU = \frac{area(BB_{dt} \cap BB_{gt})}{area(BB_{dt} \cup BB_{gt})}   (14)
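A minimal Python implementation of Eq. 14 follows; the (x1, y1, x2, y2) corner box convention and the function name are our assumptions. A detection whose IoU with the ground truth exceeds 0.5 counts as a TP under the Pascal criterion above.

def iou(bb_dt, bb_gt):
    # boxes as (x1, y1, x2, y2); implements Eq. 14
    ix1, iy1 = max(bb_dt[0], bb_gt[0]), max(bb_dt[1], bb_gt[1])
    ix2, iy2 = min(bb_dt[2], bb_gt[2]), min(bb_dt[3], bb_gt[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    a_dt = (bb_dt[2] - bb_dt[0]) * (bb_dt[3] - bb_dt[1])
    a_gt = (bb_gt[2] - bb_gt[0]) * (bb_gt[3] - bb_gt[1])
    union = a_dt + a_gt - inter
    return inter / union if union else 0.0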

Fig. 11. Illustration of IoU (red BB is ground truth and blue BB is the detection, with
TP = 1, FN = 0 and FP = 0).

4.4 Comparisons of Classifiers in Shape Verification

A data set of 900 shapes (circles, triangles and rectangles) was formed, where the
shapes were taken from the segmented road images. Next, the seven Hu features
were extracted for every shape, and a feature vector of size 900 × 8 was created.
The models were then trained with 10-fold cross-validation. Figure 12 illustrates
the accuracy of the different models. From Fig. 12, it is evident that the accuracy
of the QDA model is better than the others, at 96%. Hence, this model
is deployed in our detection framework for shape verification.
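This training protocol can be reproduced with scikit-learn as follows; note the feature matrix here is random placeholder data standing in for the 900 × 7 Hu-feature matrix (plus label column), so the printed accuracy is not the paper's 96%.

import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# placeholder data: in the paper, X holds the seven Hu features of the 900
# shapes (900 x 7) and y the shape labels (circle / triangle / rectangle)
rng = np.random.default_rng(0)
X = rng.normal(size=(900, 7))
y = rng.integers(0, 3, size=900)

scores = cross_val_score(QuadraticDiscriminantAnalysis(), X, y, cv=10)
print("10-fold mean accuracy:", scores.mean())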

Table 4. Average computational cost of the steps of the detection framework.

Process                                          Average simulated computational cost (s)

Image acquisition                                0.016
Preprocessing                                    0.02
Intensity transformation using fuzzy inference   0.22
Segmentation                                     0.002
Morphological processing                         0.035
Filtering                                        0.08
Shape verification                               0.1
Detection using bounding box                     0.015
Total                                            0.49

Table 5. Comparative analysis for the detection framework.

Methods   Recall (%)   Precision (%)   F-measure (%)   Data sets


[4] 91.07 90.13 90.60 GTSDB
[2] 92.98 94.03 93.50 GTSDB
[20] 90.37 95.31 92.77 GTSDB
[21] 87.84 89.65 88.74 GTSDB
[14] 87.84 89.6 88.71 GTSDB
Proposed framework 92.8 93.23 93.01 GTSDB
Proposed framework 95.58 94.15 94.86 Bangladesh Data sets

Fig. 12. Comparison of accuracy in shape verification.



5 Conclusion
In this paper, a detection framework is presented that utilizes fuzzy image
processing, where crisp sets of hue and saturation are converted into fuzzy sets and
a Mamdani fuzzy inference scheme is applied to transform them into gray
values. A statistical threshold is applied to segment the fuzzified image. After some
filtering to reduce false alarms, shape classification is done by a QDA model
trained on Hu features. In addition, an analysis is presented comparing the accuracy
of some popular models, in which QDA outperforms the others. The framework is
evaluated by calculating IoU scores, where the average recall and precision rates are
94.19% and 93.69% respectively. Future work aims at recognizing the traffic signs
based on a deep learning model.

References
1. Lillo, C., Mora, J., Figuera, P., Rojo, Á.: Traffic sign segmentation and classification
using statistical learning methods. Neurocomputing 15(3), 286–299 (2015)
2. Ellahyani, A., Ansari, M.: Mean shift and log-polar transform for road sign detec-
tion. Multimed. Tools Appl. 76, 24495–24513 (2017)
3. Rafael, C.G., Richard, E.W.: Digital Image Processing, 3rd edn. Pearson Educa-
tion, Chennai (2009)
4. Ayoub, E., Mohamed, E.A., Ilyas, E.J.: Traffic sign detection and recognition based
on random forests. Appl. Soft Comput. 46, 805–815 (2015)
5. Road Sign Manual Volume-1.pdf. www.rhd.gov.bd/documents/ConvDocs/.
Accessed 4 Nov 2020
6. Hung, P.D., Kien, N.N.: SSD-Mobilenet implementation for classifying fish species.
In: Vasant, P., Zelinka, I., Weber, G.W. (eds.) Intelligent Computing and Optimization.
ICO 2019, Advances in Intelligent Systems and Computing, vol. 1072, pp.
399–408. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-33585-4_40
7. Borse, K., Agnihotri, P.G.: Prediction of crop yields based on fuzzy rule-based
system (FRBS) using the Takagi-Sugeno-Kang approach. In: Vasant, P., Zelinka, I.,
Weber, G.W. (eds.) Intelligent Computing and Optimization. ICO 2018, Advances
in Intelligent Systems and Computing, vol. 866, pp. 438–447. Springer, Cham
(2018). https://doi.org/10.1007/978-3-030-00979-3_46
8. Soumen, C., Kaushik, D.: Bangladeshi road sign detection based on YCbCr color
model and DtBs vector. In: 2015 International Conference on Computer and Infor-
mation Engineering (ICCIE) on Proceedings, Rajshahi, Bangladesh, pp. 158–161.
IEEE (2015)
9. Safat, B.W., Majid, A.A., Mahammad, A.H., Aini, H., Salina, A.S., Pin, J.K.,
Muhamad, B.M.: Vision Based traffic sign detection and recognition systems: cur-
rent trends and challenges. Sensors 19(9), 2093 (2019)
10. Chunsheng, L., Shuang, L., Faliang, C., Yinhai, W.: Machine vision based traffic
sign detection methods: review, analyses and perspectives. IEEE Access 7, 86578–
86596 (2019)
11. Maldonado, S., Lafuente, S., Gil, P., Gomez, H., Lopez, F.: Road sign detection
and recognition based on support vector machines. IEEE Trans. Intell. Transp.
Syst. 8(2), 264–278 (2007)
12. Greenhalgh, J., Mirmehdi, M.: Recognizing text-based traffic signs. IEEE Trans.
Intell. Transp. Syst. 16(3), 1360–1369 (2015)
13. Yang, Y., Luo, H., Xu, H., Wu, F.: Towards real time traffic sign detection and
classification. IEEE Trans. Intell. Transp. Syst. 17(7), 2022–2031 (2016)
14. Cao, J., Song, C., Peng, S., Xiao, F., Song, S.: Improved Traffic sign detection and
recognition algorithm for intelligent vehicles. Sensors 19(18), 4021 (2019)
15. Tao, C., Shijian, L.: Accurate and efficient traffic sign detection using discriminative
AdaBoost and support vector regression. IEEE Trans. Veh. Technol. 65(6), 4006–
4015 (2016)
16. Everingham, M., Van, G.L., Williams, C.K.I.: The pascal visual object classes
(VOC) challenge. Int. J. Comput. Vis. 88, 303–338 (2010)
17. Hu, M.K.: Visual pattern recognition by moment invariants. IRE Trans. Inf.
Theory 8(2), 179–187 (1962)
18. Alvaro, A.R., Jose, A.M., Felipe, G.C., Sergio, G.G.: Image segmentation using
fuzzy inference system on YCbCr color model. Adv. Sci. Technol. Eng. Syst. J.
2(3), 460–468 (2017)
19. Jameel, A.K., Donghoon, Y., Hyunchul, S.: New dark area sensitive tone mapping
for deep learning based traffic sign recognition. Sensors 18(11), 3776 (2018)
20. Zhang, J., Huang, M., Jin, X., Li, X.: A real time Chinese traffic sign detection
algorithm based on modified YOLOv2. Algorithms 10(4), 127 (2017)
21. Xue, Y., Jiaqi, G., Xiaoli, H., Houjin, C.: Traffic sign detection via graph-based
ranking and segmentation algorithms. IEEE Trans. Syst. Man Cybern. Syst.
45(12), 1509–1521 (2015)
Developing a Framework for Vehicle
Detection, Tracking and Classification
in Traffic Video Surveillance

Rumi Saha, Tanusree Debi, and Mohammad Shamsul Arefin(&)

Department of CSE, CUET, Chittagong, Bangladesh


rumicsecuet97@gmail.com, tanusreedebi11@gmail.com,
sarefin@cuet.ac.bd

Abstract. Intelligent Transportation Systems and safe driver assistance
systems are significant topics of research in the field of transport and traffic
management. Moving vehicle detection, tracking and classification is among
the most challenging problems in the Intelligent Transportation System (ITS)
and smart vehicle sector. This paper provides a method for vision-based
tracking and classification of different classes of vehicles by processing video
from a surveillance system. Several verification techniques based on template
matching and image classification were investigated. The paper focuses on
improving the performance of a single-camera vehicle detection, tracking and
classification system, and proposes a method based on the Histogram of
Oriented Gradients (HOG) descriptor, one of the most discriminative features,
to extract object features that are then trained for classification on a Linear
Support Vector Machine (SVM) classifier. Vehicles are also categorized
through shape or dimension based feature extraction with a cascade-based
AdaBoost classifier, whose high predictive accuracy and low storage cost
affirm its efficacy for real-time vehicle classification. In the final stage, to
minimize the number of missed vehicles, a Kalman filter is used to track the
moving vehicles across video frames. Our proposed system is tested on
different videos and provides good output with appropriate processing time.
The experimental results show the efficiency of the algorithm.

Keywords: Vehicle detection · Histograms of Oriented Gradients (HOG) ·
Support Vector Machine (SVM) · Occlusion · Kalman filter · Vehicle tracking ·
Vehicle classification

1 Introduction

The vehicle detection and classification problem has attracted many researchers from
both institutions and industry in the field of Intelligent Transportation Systems (ITS).
We propose a new method in this paper to effectively and robustly detect, track and
classify vehicles, and we compare our improved system with various existing
detection and tracking algorithms. The suggested approach is evaluated using
self-collected video sequences in motorway driving conditions. Experimental results
show how effective and feasible our approach is.


In recent years, the rise in the number of vehicles has caused significant transport
problems. As a result of the increase in automobile traffic, several issues have
emerged: traffic jams in cities are a major problem, and people's health is at risk.
Thus, road traffic video monitoring is becoming crucial, and work on Intelligent
Transportation Systems (ITS) is attracting considerable attention. The area of
automated video surveillance systems is of immense concern because of its safety
implications.
The objective of this work is vehicle detection, tracking and classification from
high-quality video sequences. The work is carried out to achieve the following goals:
detection and tracking of multiple moving vehicles in traffic video sequences, and
classification of vehicles into their types based on the dimensions of the objects.
Monitoring identified objects frame by frame in video is an important and difficult
job. It is a vital part of a smart surveillance system, because without tracking objects
the system could not obtain coherent temporal information about them, nor take
measures for higher-level behavioral analysis. The first step in video processing is the
identification of moving objects, which has several application areas such as video
surveillance, traffic control, etc.
The remainder of this paper is structured as follows: related work is discussed in
Sect. 2, the system architecture and design are presented in Sect. 3, and Sects. 4 and 5
present the experimental results and conclusions.

2 Related Work

Several different methods have been suggested in the literature for vehicle detection,
tracking and classification. Detection and tracking of vehicles are major computer
vision problems that serve as the basis for further study of target motion. Vehicle
detection takes advantage of technologies such as digital image processing to
distinguish the vehicle from its context [15], multimedia automation [16], statistics,
etc. In [11], an efficient algorithm to track multiple people is presented. PCA was used
for feature extraction, and neural networks for identification, in Matthews et al. [12].
A method for measuring traffic parameters in real time is defined in [13]. It uses a
feature-based approach, along with occlusion logic, to track vehicles in congested
traffic scenes. Instead of monitoring whole vehicles, sub-features of the vehicle are
tracked to handle occlusions; however, this approach is computationally very
expensive [13].
For vehicle identification and spatio-temporal surveillance, Chengcui et al. [1]
suggested a system for adaptive context learning, in which unsupervised vehicle
detection and spatio-temporal tracking are used to analyze the image/video
segmentation process using SPCPE. The difficulty of this work is that it lacks user
interaction and does not know a priori which class a pixel belongs to. Gupte et al. [2]
suggested a method where processing is carried out at three levels: raw images, the
region level and the vehicle level, with objects classified into two types. The limitation
of this work is that it cannot classify a large number of object categories, and errors
occur under occlusion. The experiments show a highway scene, and the method can
fail in the case of narrow-road video sequences.

Peng et al. [3] implemented a system using data mining techniques for vehicle type
classification, where the front of the vehicle was precisely determined using the
position of the license plate and a background-subtraction technique. Vehicle type
probabilities were extracted from eigenvectors rather than explicitly deciding the type
of vehicle, and an SVM was used for classification. The limitation of this method is
that not all vehicle license plates can be found, so errors occur in object detection, and
handling is complex.
Sun et al. [4] proposed a structure for a real-time precrash vehicle detection method
using two forms of detection algorithms, namely multiscale-driven hypothesis
generation and appearance-based hypothesis verification, with an SVM algorithm for
classification. The shortcoming of this work is that the parameters fail under changing
environments, causing errors in moving-object detection.
Jayasudha et al. [5] set out a summary of data mining for road traffic and accident
analysis. They provide an overview of data mining techniques and applications in
accident analysis, working with different road safety databases, and also note the need
for a data discriminator that compares the target class with one or more contrasting sets.
Chouhan et al. [6] suggested a technique for retrieving images using data mining
and image processing techniques based on an image's size, texture and dominant
color factors. If the shape and texture are the same, they use a weighted Euclidean distance.
Chen et al. [7] developed a PCA-based vehicle classification framework in which
individual vehicles are segmented for feature extraction using the SPCPE algorithm,
and two algorithms, Eigenvehicle and PCA-SVM, are used for classification. The
weakness of this method is that vehicle detection has to rely on the unsupervised
SPCPE algorithm. An on-road vehicle detection technique that trains linear and
non-linear SVM classifiers on modified HOG-based features is presented in
Y. Zakaria et al. [9]. For the training dataset, they used the KITTI dataset.
Vehicle detection, tracking and classification are complicated for any traffic
surveillance system due to interference from illumination, blurriness and congestion on
narrow roads in rural areas, yet the system is exposed to such real-world problems.
This demand makes the task very challenging as accuracy requirements increase.
In this paper, we develop a robust vehicle detection, tracking and classification
method by extracting HOG features and Haar-like features, with a Support Vector
Machine (SVM) and a cascade-based classifier for classifying vehicle types. A Kalman
filter technique is then used for tracking. The system increases tracking and classification
accuracy by using a proper algorithm with low computational complexity and reduces
the processing time.

3 System Architecture and Design

As discussed in the previous section, related work on vehicle detection, tracking and
classification has several limitations, which prevent those methods from serving as
very efficient frameworks for Intelligent Transportation Systems. To overcome these
limitations, we propose an efficient framework for this field, shown in Fig. 6, which
reduces the shortcomings of the background work. The system architecture of the
vehicle detection, tracking and classification system comprises some basic modules:
dataset initialization, feature extraction, Haar-like features, classifier, tracking and
length calculation. The function of dataset initialization is to set up the database with
positive and negative samples. The feature extraction module generates important
features from the positive samples. The classifier module trains on samples to separate
vehicles and non-vehicles. The tracking module detects all moving vehicles and also
reduces the number of missed objects. Finally, the most important part of the system is
the classification module, which classifies vehicle types, based on the dimensions of
the object, for better recognition. Algorithm 3 shows the whole evaluation procedure
of vehicle detection and tracking.

3.1 Dataset
Samples were collected to evaluate the presented algorithm in various scenes: city
highways, wide roads, narrow roads, etc. In the first stage, 7325 samples were collected
for training and testing, including 3425 vehicle samples (positive samples) and
3900 non-vehicle samples (negative samples). The vehicle samples include various
vehicle types such as cars, trucks and buses viewed from various directions such as
front, rear, left and right. In addition, the vehicle samples include both vehicles near the
camera-mounted vehicle and those far away; the negative samples include background
regions such as traffic signs. Figures 1 and 2 show examples of vehicle and non-vehicle
training images. The training samples were resized to 64 × 64 RGB images.

Fig. 1. Samples for vehicle image

Fig. 2. Samples for non-vehicle images



3.2 Sample Process


The collected samples are pre-processed prior to feature extraction. Each sample is
resized to 64 × 64 pixels, i.e., a height of 64 pixels and a width of 64 pixels. The height
is selected based on the sample's original size, and the width is then determined so the
sample retains the average aspect ratio. The next step is to calculate the different
variations of the HOG feature for each sample; the cell size changes with the sample
size so that all samples have the same vector length.

3.3 Histogram of Oriented Gradient (HOG) Feature Extraction


Histogram of Oriented Gradients (HOG) feature extraction is the main component of
the vehicle detection process. The original HOG computation is performed in five
steps. First, the image goes through color normalization and gamma correction. The
picture is then divided into a grid of cells. The cells are grouped into larger overlapping
blocks, which allows a cell to belong to more than one block. For example, an image
can be divided into 16 × 16-pixel cells, where each cell has 256 pixels, with blocks of
size 2 × 2, indicating that each block contains 2 cells in each direction. The blocks in
the figure have a 50% overlap ratio, i.e., half of a block's cells are shared with the
neighboring block. Cell size and block size are parameters that the user must determine
according to the size of the image and the amount of detail that needs to be captured
(Fig. 3).

Algorithm 1: HOG calculation

1: input A: a gradient orientation image
2: initialize H ← 0
3: for all points (p, q) inside the image do
4:   i ← A(p, q)
5:   k ← the small region (cell) containing (p, q)
6:   for all offsets (r, s) representing the neighbors do
7:     if (p + r, q + s) is inside the image then
8:       j ← A(p + r, q + s)
9:       H(k, i, j, r, s) ← H(k, i, j, r, s) + 1
10:    end if
11:  end for
12: end for

Fig. 3. Algorithm for HOG implementation

The key elements of HOG feature extraction for object detection are as follows.
The input color images are converted to grayscale. A gamma correction procedure
standardizes (normalizes) the color space of the input image; the aim is to adjust the
contrast of the image and to reduce local shadows and the effect of illumination
changes, while also restraining noise interference. The gradient is then measured,
primarily to collect contour details and reduce light interference, and is projected onto
the orientation bins of each cell. All cells within a block are normalized; normalization
compresses the light, shadows and edges, and the block descriptor after normalization
is called the HOG descriptor. Finally, HOG features are collected from all blocks in the
detection window; this step collects and merges the HOG features of the overlapping
blocks into the final classification feature vector.
Figures 4 and 5 show HOG feature extraction from a vehicle and a non-vehicle
image, respectively.

Fig. 4. HOG feature for vehicle image

Fig. 5. HOG feature for non-vehicle image

The variants of HOG features use the same extraction technique but vary in the
following parameters:
1. The number of orientation angles.
2. The number of histogram bins.
3. The method by which the pixel gradient is determined.
We may also optionally apply a color transformation and append binned color
characteristics as well as color histograms to the HOG feature vector.
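As an illustration of this extraction step, the following minimal sketch uses scikit-image's hog function on a 64 × 64 grayscale sample. The 16 × 16-pixel cells and 2 × 2-cell blocks with 50% overlap follow the description above, while the 9 orientation bins and L2-Hys block normalization are our assumptions, not the paper's reported settings.

import numpy as np
from skimage.feature import hog

def extract_hog(sample_gray):
    # HOG descriptor for one 64x64 grayscale training sample.
    # Parameter values follow Sect. 3.3 where stated; the rest are assumptions.
    assert sample_gray.shape == (64, 64)
    return hog(
        sample_gray,
        orientations=9,            # number of orientation bins (assumed)
        pixels_per_cell=(16, 16),  # 16x16 cells, as in the example above
        cells_per_block=(2, 2),    # 2x2-cell blocks with 50% overlap
        block_norm="L2-Hys",       # per-block normalization (assumed)
        feature_vector=True,
    )

# 4x4 cells give 3x3 overlapping blocks: 3*3*2*2*9 = 324 feature values.
features = extract_hog(np.random.rand(64, 64))
print(features.shape)  # (324,)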

3.4 Support Vector Machine (SVM) Classifier


Support Vector Machine (SVM) is a kind of algorithm based on structural risk theory.
A Support Vector Machine can be used for binary classification if the data contain
exactly two classes. An SVM classifies data by seeking the best hyperplane that
separates all data points of one class from those of the other class; for an SVM, the best
hyperplane is the one with the greatest margin between the two classes. Both linear and
nonlinear SVM classifiers were designed for object detection, with a Gaussian kernel
used as the nonlinear SVM kernel. The classifier performs a dense multi-scale scan at
each location of the test image, disclosing preliminary object decisions. In image
recognition, HOG features combined with an SVM classifier have been widely used
and are very effective for object detection. The goal is to differentiate between the two
groups, vehicles and non-vehicles, using features induced from the available examples.
Comparing accuracy against other classifiers shows that HOG-SVM accuracy is higher.
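To make the training step concrete, the sketch below fits such a classifier with scikit-learn; the random arrays stand in for the HOG descriptors and vehicle/non-vehicle labels of Sect. 3.1, and the hyperparameters are illustrative assumptions.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data: rows are HOG descriptors, labels 1 = vehicle, 0 = non-vehicle.
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 324))   # hypothetical descriptor length
y = rng.integers(0, 2, size=400)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")  # Gaussian (RBF) kernel, as in Sect. 3.4
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))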

3.5 Non-linear Suppression


A non-linear suppression process is performed to reduce overlapping detected
windows to a single window. The suppression is performed per window according to
the confidence level of the classifier: the window with the highest confidence level is
kept, while other windows that overlap it by more than a certain threshold are
suppressed.
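A minimal greedy sketch of this suppression step is shown below; the 0.4 overlap threshold and the [x1, y1, x2, y2] box format are our assumptions, not values reported by the paper.

import numpy as np

def suppress_overlaps(boxes, scores, overlap_thresh=0.4):
    # boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) classifier confidences.
    order = np.argsort(scores)[::-1]   # highest-confidence window first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the kept window with every remaining window.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        overlap = inter / (area_i + area_r - inter)
        order = rest[overlap <= overlap_thresh]   # suppress heavily overlapping windows
    return keep

boxes = np.array([[10, 10, 60, 60], [12, 12, 62, 62], [100, 100, 150, 150]], float)
scores = np.array([0.9, 0.8, 0.7])
print(suppress_overlaps(boxes, scores))  # -> [0, 2]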

3.6 Kalman Filter for Tracking


The next step after detecting vehicles is to track the moving vehicles from frame to
frame. Because of the detector's relatively high cost, we are interested in an algorithm
with lower tracking complexity. That is why we chose the Kalman filter as the tracking
algorithm in our system. It helps the system improve auto-tracking accuracy and reduce
frame losses. The Kalman filter typically has two stages: prediction and update.
Prediction: The Kalman filter is used to predict the position in the subsequent frame.
The vehicle speed is determined from the distance between blobs. The vehicle position
in the current frame is calculated using the vehicle velocity, the location in the previous
frame and the time elapsed since the last frame. It is state dependent and is used to
obtain the prediction of the state vector and the error covariance.

Fig. 6. System Architecture of the proposed framework.

Calculating Vehicle Positions: We use a heuristic approach in which each vehicle
patch is shifted around its current position to cover as many of the vehicle-related
blobs as possible. This is taken as the real vehicle location.
Estimation: A measurement is a position in image coordinates, as determined in the
previous section. It updates the prediction parameters to reduce the error between the
vehicle's predicted and measured locations.
Having predicted the vehicle's location in the next frame, the next step is to rectify the
errors of the prediction stage. We use a feature-based correction step, adding two
constraints, on color and on size, for the corresponding point. The first condition
compares blobs in two consecutive color frames: the color of a blob in frame n must
match the color of the corresponding blob in frame n + 1. The second condition
concerns the size of the blobs in the two frames. Given that the distance between the
vehicle and the camera varies, the apparent size of the vehicle changes; but the size
variance ratio from one frame to the next is minimal, so the size difference between
two matching blobs in consecutive frames must be low. So, if two blobs match in
color, the difference in size between them is lower than the threshold, and the predicted
choice satisfies these two conditions, we consider the prediction true and set the
predicted position to the blob's center in the new frame. However, if two vehicles are
similar to each other, the criteria may be met by either vehicle as the acceptable choice
(Fig. 7).
334 R. Saha et al.

Algorithm 2: Kalman filter tracking

1: if (time == 0) {
2:   Create a new track for every vehicle detected
3: }
4: else {
5:   for (all current tracks) {
6:     Predict the new track position and size with the Kalman filter
7:     Overlap the predicted position with all blobs found
8:     if (Overlap(track(a), blob(b)) != 0)
9:       Mark match_matrix[a][b];
10:    Analyze(match_matrix); } }

Fig. 7. Algorithm for Kalman Filter Tracking
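For illustration, a constant-velocity tracker of the kind described here can be set up with OpenCV's cv2.KalmanFilter; the state layout and the noise covariances below are illustrative assumptions rather than the paper's exact settings.

import cv2
import numpy as np

def make_vehicle_tracker(dt=1.0):
    # State: [x, y, vx, vy]; measurement: blob centre [x, y].
    kf = cv2.KalmanFilter(4, 2)
    kf.transitionMatrix = np.array([[1, 0, dt, 0],
                                    [0, 1, 0, dt],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], dtype=np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], dtype=np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2      # assumed
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1  # assumed
    return kf

tracker = make_vehicle_tracker()
predicted = tracker.predict()                           # prediction stage
measurement = np.array([[120.0], [80.0]], np.float32)   # hypothetical blob centre
corrected = tracker.correct(measurement)                # update (estimation) stage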

3.7 Haar-Like Feature Extraction


Many machine learning approaches have the benefit of being computationally more
effective for object detection. Haar classification is a decision-tree-based technique in
which a statistically boosted rejection cascade is created during the training process.
First, a classifier (namely a cascade of boosted classifiers working with Haar-like
features) is trained with a few hundred positive samples scaled to the same size, and
negative samples, which may be random images, also scaled to the same size as the
positive images. After resizing, these images are listed in a text file and a .vec file is
created (using OpenCV's opencv_createsamples). Then the samples are used to train
the cascade classifier, producing a trained classifier (.xml file). By following the above
steps, we can extract Haar-like features by training the boosted cascade classifier.

3.8 Load Classifier


Training a standard classifier is the most important task of the work, done in the
previous section. Now the trained cascade classifier (.xml file) is loaded, having been
trained on the positive images according to their vehicle types.
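A minimal sketch of loading and applying such a trained cascade with OpenCV follows; the file names and the detection parameters (scaleFactor, minNeighbors) are hypothetical.

import cv2

# Assumes the cascade trained in Sect. 3.7 exists at this (hypothetical) path.
vehicle_cascade = cv2.CascadeClassifier("cascade/vehicle_cascade.xml")

frame = cv2.imread("frame.jpg")                 # hypothetical video frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
vehicles = vehicle_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
for (x, y, w, h) in vehicles:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)  # draw a bounding box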

3.9 Length Calculation


In shape-based feature extraction of an object, its dimensions such as height, width
and length are very necessary. The height and width of each detected vehicle are
calculated using the formula below (Fig. 8):
D = \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}    (1)

Algorithm 3: Vehicle Detection and Tracking

1: Input: The video sequence monitored for the specified target
2: Output: The tracking result
3: for t ← 1 to N do
4:   if 5 seconds have elapsed since the last detection then
5:     Detect the vehicle with HOG and SVM;
6:     if objects of interest (ROI) are found then
7:       Track the object in the bounding box using the Kalman filter tracking algorithm
8:       if 10 seconds have elapsed since the last detection then
9:         Judge with HOG and SVM whether the object in the bounding box is still a vehicle
10:      else
11:        Go back to Kalman filter tracking
12:      end
13:    else
14:      Go back to the last detection step
15:    end
16:  else
17:    Go back to the last detection step
18:  end

Fig. 8. Vehicle detection and tracking algorithm

3.10 Classification
The classification of vehicle types has recently gained a great deal of research interest.
It follows two main directions, based on either the vehicle's shape or its appearance.
Classification is achieved by categorizing the vehicles according to their size into three
groups: large, medium and small. Since the length of the vectors is easy to find, the
length is taken as the parameter for classifying vehicles by size. When a vehicle object
enters our region of interest, the length of its vector is calculated using the length
calculation equation, and classification is then carried out for the vehicle according to
the length requirements it satisfies (Fig. 9).

Vehicle Type | Vehicle Names
Large        | Bus, Truck
Medium       | Car, Van
Small        | Motorbike, Bicycle

Fig. 9. Vehicle type classification
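A sketch of this length-based assignment is given below; the pixel thresholds are hypothetical, since the paper does not report its exact cut-off values.

def classify_by_length(length_px, small_max=60, medium_max=120):
    # Assign a size class from the blob length computed with Eq. 1.
    # small_max and medium_max are assumed thresholds, not the paper's values.
    if length_px <= small_max:
        return "Small"    # motorbike, bicycle
    if length_px <= medium_max:
        return "Medium"   # car, van
    return "Large"        # bus, truck

print(classify_by_length(45), classify_by_length(90), classify_by_length(300))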

4 Implementation and Experiments

This section presents the implementation procedure and performance evaluation of
our developed framework.

4.1 Experimental Setup


The developed vehicle detection, tracking and classification system was implemented
on a machine running Windows 10 with a 2.50 GHz Core i3-4100 processor and
4 GB of RAM. The system was developed in Python 3.6.7.

4.2 Implementation
Figure 10(a) shows detection-only results, where some false positives occur. The
false-positive rate is reduced after applying the tracking algorithm, as shown in
Fig. 10(b), (c) and (d), which also tracks multiple vehicles. The detection boxes are
drawn in red; the tracking boxes are the blue boxes.


Fig. 10. Vehicle detection and tracking example in different scenes such as (b) urban road with
high light, (c) rural congested road, (d) low light.

Vehicle tracking in town areas is much more difficult because objects are close to
each other, while trees or other background objects may cast shadows both on the road
and on the vehicles. Figure 10(b)-(d) shows some tracking results under different
scenarios, where our proposed system works very satisfactorily. Traffic sequences from
different times and scenes were used to check the Kalman-filter-based tracking system.
We can see that the system tracks well in both videos and images, and it can also track
fast-moving objects. When a vehicle enters the scene, a new tracking object is detected,
a new number is assigned and a tracking window is configured for the new vehicle.
We tested the system on highway image sequences. Most vehicles can be
successfully tracked and classified by the system. Different videos and images were
chosen to test different types of vehicles, covering 3 classes: buses, cars and
motorbikes. By plugging in Haar-cascade-based classifier algorithms based on the
extracted vehicle features or dimensions, we loop through the detections and draw a
rectangle around each one in every frame of the videos. After this preprocessing, we
have an integrated framework that can automatically extract and classify vehicles in
different traffic video scenes.


Fig. 11. Vehicle classification example in different types such as (e)-(f) motorbike types of
vehicle, (g) car type of vehicle, (h) Bus type of vehicle

4.3 Performance Evaluation


To evaluate the efficiency of the proposed solution, we split the data into training and
test sets, retaining 20% of the vehicle images and 20% of the non-vehicle sub-images
for testing. We used a fixed set of 231 sub-images of vehicles and non-vehicles
collected as the test data set.

Correct correspondences are the points that the proposed method detects and tracks
correctly, and the number of correspondences is the total number of match points:

\text{Accuracy} = \frac{\text{Number of correct correspondences}}{\text{Number of correspondences}} \times 100\%

(1) Performance of Kalman Filter Tracking


Table 1 shows information about the traffic sequence videos tested, such as frame
height and width, number of frames and total processing time. From these tracking
results, the accuracy and effectiveness of our proposed system can be assessed.

Table 1. Execution time analysis for vehicle detection and tracking


Parameter Video 1 Video 2
Frame height 320 720
Frame width 320 1280
Video length 14 s 30 s
No. of frames 256 918
Processing time 2.5 min 6 min

As shown in Table 2, we used traffic video sequences to check the Kalman-filter-based
tracking system, with the tracking results listed above. We can see that the system has
good tracking results and can also track fast-moving objects such as vehicles. When a
new vehicle enters the scene, a new tracking object is created, a new number is
assigned and a tracking window is initialized for that vehicle.
The method has a strong detection and tracking effect for fast-moving targets such
as vehicles, and it can detect and track vehicles that join the monitored scene at
random. As we assume that good tracking performance depends on good observation,
we put more computational effort into detection. This approach is also a mainstream of
the current research trend called 'Detection-by-Tracking' or 'Tracking-Based-Detection'.

Table 2. Performance analysis for videos


Test video Tracking vehicles Actual vehicles Accuracy
Video 1 62 68 91%
Video 2 35 36 97%
Video 3 5 5 100%
Video 4 17 17 100%
Video 5 22 25 88%
Video 6 12 12 100%
Video 7 2 2 100%
Video 8 13 15 86%
Average accuracy 95.25%

Table 3. Comparison of some existing methods with our proposed method.
Methods | Accuracy | False rate
Gabor + SVM | 94.5% | 5.4%
PCA + NN | 85% | 7.5%
Wavelet + Gabor + NN | 89% | 6.2%
LDA | 88.8% | 5.5%
HOG + SVM + Mean-shift | 94% | 3.7%
HOG + PCA + SVM + EKF | 95% | 3.5%
Our proposed method | 95.25% | 3.7%

(2) Comparison with Other Existing Methods

Table 3 shows the comparison of existing vehicle detection and tracking methods
with our proposed method. We see that wavelet feature extraction [17] combined
with a dimensionality-reduction method achieved higher accuracy than other
methods, but its false rate also increases. The existing HOG [16] method with SVM
and a mean-shift tracker gains less accuracy compared with other tracking methods
and fails to track objects in bad weather and congested traffic scenes; however,
applying a Kalman-filter tracker together with the mean-shift tracker increases the
accuracy while decreasing the false rate. From the accuracy results, we see that the
average detection and tracking accuracy is 95.25%. The false rate of our proposed
method is also low, at 3.7% (Table 3). As the size of the video increases, tracking
becomes more difficult to handle.
(3) Performance of Classification
We managed to achieve a 90% correct classification rate at a 15 fps frame rate.
Classification accuracy is also an important component for evaluating a classifier in
machine learning. It is specified as the number of vehicles correctly categorized,
divided by the total number of vehicles. Since the project's success depends on a
proper position line for the camera view, it was necessary to place the camera on an
overhead bridge directly above the traffic flow to reduce vehicle occlusion. Some of
our system's outcomes are shown in Fig. 11(e)-(h).
Classification errors were mainly due to the slight separation between vehicle
classes. Since we only used scale as our metric, not all vehicles can be properly
categorized; more features will need to be explored to further improve our
classification rates. Our data were collected at different times of day. We intend to
further test the system in more complex scenes and under a wider range of lighting
and weather conditions.
We measured the actual number and correct classification rate for each class; the
final result for all vehicle types is shown in Table 4.

Table 4. Performance analysis for result of vehicle classification


Vehicle types Number of vehicles Number of classified vehicles Success rate
Bus 6 5 83.3%
Car 7 7 100%
Motorbike 8 7 87.5%
Total 18 15 90.26%

5 Conclusion and Future Work

Robust and accurate vehicle detection, tracking and classification in frames collected
from a moving vehicle is a significant issue for autonomous self-guided vehicle
applications. This paper provides a summary study of the proposed techniques, which
were applied to traffic videos and images with very high tracking and classification
accuracy. We have given a comprehensive analysis of the state-of-the-art literature
dealing with computer vision techniques used in video-based traffic surveillance and
monitoring systems. A HOG and Haar-like feature-based integrated vehicle
classification architecture is proposed that brings together several main components
relevant to an Intelligent Transportation Surveillance System.
The idea of using histograms of gradients to extract features at different scales and
orientations is key to our approach. The Kalman filter is used to track multiple objects
in video. Classification is done based on the extracted Haar-like features with a
cascade-based classifier, which categorizes vehicles based on a threshold value. The
goal of the presented approach is to reduce computation time. Experimental findings
show that the system is accurate and efficient and that a high performance rate for
vehicles can be achieved.

References
1. Zhang, C., Chen, S., Shyu, M., Peeta, S.: Adaptive background learning for vehicle detection
and spatio-temporal tracking. In: Information, Communication and Signal Processing 2003
and the Fourth Pacific Rim Conference on Multimedia, pp. 797–801. IEEE (2003). https://doi.org/10.1109/icics.2003.1292566
2. Gupte, S., Masoud, O., Martin, R.F.K.: Detection and classification of vehicles. Trans. Intell.
Transp. Syst. 3(1), 37–47 (2002)
3. Peng, Y., Jin, J.S., Luo, S., Xu, M., Au, S., Zhang, Z., Cui, Y.: Vehicle type classification
using data mining techniques. In: Jin, J.S. et al. (ed.) The Era of Interactive Media, LLC,
pp. 325–335. Springer, Cham (2013). https://doi.org/10.1007/978-1-4614-3501-3_27
4. Sun, Z., Miller, R., Bebis, G., DiMeo, D.: A real-time precrash vehicle detection system. In:
Proceedings on the sixth Workshop on Application Computer Vision (WACV 2002),
pp. 253–259. IEEE (2002)
5. Jayasudha, K., Chandrasekar, C.: An overview of data mining in road traffic and accident
analysis. J. Comput. Appl. 4(4), 32–37 (2009)
6. Chouhan, P., Tiwari, M.: Image retrieval using data mining and image processing
techniques, pp. 53–58 (2015). https://doi.org/10.17148/IJIREEICE.31212

7. Zhang, C., Chen, X., Chen, W.: A PCA-based vehicle classification framework. In:
Proceedings of the 22nd International Conference on Data Engineering Workshops (ICDEW
2006), pp. 451–458. IEEE (2006)
8. Sun, Z., Bebis, G., Miller, R.: On-road vehicle detection using Gabor filters and support vector
machines, pp. 1019–1022. IEEE (2002)
9. Zakaria, Y., Abd El Munim, H. E., Ghoneima, M., Hammad, S.: Modified HOG based on
road vehicle detection method. In: International Journal of Pure and Applied Mathematics,
vol. 118, no. 18, pp. 3277–3285 (2018)
10. Tian, Q., Zhang, L., Wei, Y., Zhao, W., Fei, W.: Vehicle detection and tracking at night. In:
Video Surveillance. iJOE – vol. 9, no. 6 “AIAIP2012”, pp. 60–64 (2013)
11. Tao, H., Sawhney, H., Kumar, R.: A sampling algorithm for tracking multiple objects. In:
Conference Proceedings of the International Workshop on Vision Algorithms (1999). https://
doi.org/10.1007/3-540-44480-7_4
12. Kul, S., Eken, S., Sayar, A.: A concise review on vehicle detection and classification. In:
2017 International Conference on Engineering and Technology (ICET), Antalya, Turkey,
IEEE (2017). https://doi.org/10.1109/IcengTechnol.2017.8308199
13. Cho, H., Rybski, P.E., Zhang, W.: Vision-based bicycle detection and tracking using a
deformable part model and an EKF algorithm. In: 13th International Conference on
Intelligent Transportation Systems, Funchal, Portugal. IEEE (2010). https://doi.org/10.1109/ITSC.2010.5624993
14. Kong, J., Qiu, M., Zhang, K.: Authoring multimedia documents through grammatical
specification. In: IEEE ICME 2003, Baltimore, USA, pp. 629–632 (2003)
15. Qiu, M.K., Song, G.L, Kong, J., Zhang, K.: Spatial graph grammars for web information
transformation. In: IEEE Symposium on Visual/Multimedia Languages (VL), Auckland,
New Zealand, pp. 84–91. (2003)
16. Xu, W., Qiu, M., Chen, Z., Su, H.: Intelligent vehicle detection and tracking for highway
driving. In: IEEE International Conference on Multimedia and ExpoWorkshops, pp. 67–72
(2012). https://doi.org/10.1109/ICMEW.2012.19
17. Chitaliya, N.G., Trivedi, A.I.: Feature-extraction using Wavelet- PCA and neural network
for application of object classification and face recognition. In: Second International
Conference on Computer Engineering and Applications, pp. 510–514, IEEE (2010). https://
doi.org/10.1109/ICCEA.2010.104
Advances in Algorithms, Modeling
and Simulation for Intelligent Systems
Modeling and Simulation of Rectangular Sheet
Membrane Using Computational Fluid
Dynamics (CFD)

Anirban Banik1(&), Sushant Kumar Biswal1, Tarun Kanti Bandyopadhyay2(&), Vladimir Panchenko3, and J. Joshua Thomas4

1 Department of Civil Engineering, National Institute of Technology, Jirania, Agartala 799046, Tripura, India
anirbanbanik94@gmail.com, sushantb69@gmail.com
2 Department of Chemical Engineering, National Institute of Technology, Jirania, Agartala 799046, Tripura, India
tarunkantibanerjee0@gmail.com
3 Russian University of Transport, Obraztsova St., 127994 Moscow, Russia
pancheska@mail.ru
4 Department of Computing, School of Engineering, Computing and Built Environment, UOW Malaysia, KDU Penang University College, George Town, Malaysia
joshopever@yahoo.com

Abstract. The study demonstrates the modeling and simulation of the flow
phenomena inside a rectangular sheet-shaped membrane module using a
Computational Fluid Dynamics (CFD) based solver. The module was implemented
to enhance the quality of effluent generated from the rubber industry.
Commercially available CFD software (ANSYS) was used to mimic the flow
inside the porous membrane. The meshing of the developed model was done
using Gambit software. The grid independency study reports that a grid size of
375,000 was the best grid for the simulation procedure. To mimic the flow
pattern inside the membrane, a second-order laminar model was considered.
The accuracy of the simulation process is evaluated using error analysis, with
methods such as percent bias, Nash-Sutcliffe efficiency, and the ratio of RMSE
to observation standard deviation. The estimated values of PBIAS, NSE, and
RSR are close to the ideal values, justifying the adequacy of the simulation.
Model validation demonstrates that the CFD-predicted values follow the
experimental values with high precision.

Keywords: CFD · Membrane separation and filtration process · Modeling · Simulation · Wastewater treatment


1 Introduction

Over the last decades, membrane separation processes have played a crucial role in
industrial separation. Numerous studies focus on the optimal use of membrane
separation and filtration processes. Computational fluid dynamics (CFD) techniques
provide considerable information on the development of membrane processes.
Numerous advancements in membrane technology have made the selection of a
suitable membrane for different procedures easy and quick. Membrane filtration has
been used in a broad range of applications such as wastewater treatment [1, 2], protein
separation [3], the food processing industry [4], etc. In membrane separation and
filtration techniques, the hydrodynamics of the fluid flow pattern inside the membrane
bed is crucial. A membrane separation and filtration process is a combination of free
flow inside the membrane module and flow through the porous membrane. The fluid
dynamics of the free flow inside the membrane module is easy to simulate using the
Navier-Stokes equations, but the most complex part to reproduce is the flow of the
fluid through the porous membrane bed, which can be modeled by coupling Darcy's
law with the Navier-Stokes equations. The validity of Darcy's law for simulating
incompressible flow through a porous zone with small porosity has been found to be
acceptable. The important part is to ensure that the continuity of the flow field variables
through the interface of the laminar flow region and the porous flow region is
adequately maintained [5]. Initial simulations of flow inside a membrane used laminar
conditions along with porous walls [6]. Many authors have used computational fluid
dynamics (CFD) to optimize membrane processes [7].
Rahimi et al. [8] used a hydrophilic polyvinylidene fluoride (PVDF) membrane for
preliminary work and studied cross-flow velocities of 0.5, 1 and 1.3 m/s. Fouling of the
membrane was analyzed with the help of photographs and scanning electron
microscope micrographs. 3D CFD modeling of the developed porous membrane bed
was carried out using Fluent 6.2 to describe the shear stress distribution on the
membrane and explain the fouling process. The particle deposition pattern on the
membrane surface was also described using a discrete phase model. Ghadiri et al. [9]
developed a model of the mass and momentum transfer equations for the solute in all
phases, including feed, solvent and membrane. The flow behavior inside the model was
simulated using the Navier-Stokes equations by the finite element method under
steady-state conditions, and the CFD-predicted results were compared with
experimental data to evaluate the accuracy of the model. A 2D mathematical model
was developed to illustrate seawater purification using a direct contact membrane
desalination (DCMD) system [10]. The model was developed by coupling the
conservation equations for the water molecule in three domains of the module, and the
governing equations were solved using the finite element method. Rezakazemi [10]
found that increasing the channel length, the temperature of the concentrated stream
and the e⁄(T.d) ratio, or reducing the inlet velocity of the concentrated solution,
enhances salt rejection.
Such a detailed CFD-based study of a rectangular sheet membrane for enhancing
the quality of rubber industry effluent has not been reported earlier.

In the present paper, the numerical finite volume method (FVM) is used to solve the
three-dimensional convective diffusion equations for solute transport under laminar
flow over a porous zone in a stationary rectangular sheet membrane. CFD simulation is
used to predict the pressure drop distribution, velocity profile and wall shear stress over
the static rectangular sheet membrane. The CFD-simulated data are validated against
experimental data.

2 Experiment Descriptions

A cellulose acetate rectangular sheet membrane was utilized in the laboratory with the
sole objective of improving the effluent quality. Figure 1 is a schematic illustration of
the experimental setup, which consists of a neutralizing tank, a feed tank, a centrifugal
pump and a permeate tank. The effluent was passed through the neutralizing tank to
maintain the pH of the feed sample, as any deviation from the operating pH of the
membrane may affect the lifespan of the membrane module. An optimum dose of soda
ash was used to maintain the pH, as raw rubber industry effluents are acidic in nature.
After the neutralizing tank, the feed flowed into the feed tank, from which it was
pumped through the module by the centrifugal pump. The centrifugal pump was used
to maintain the trans-membrane pressure and facilitate the movement of the feed across
the membrane. The permeate tank was used to collect the permeate flux. The rejects of
the membrane module were returned to the

Fig. 1. Schematic diagram of experimental setup.



feed tank. Table 1 shows the characterization of the raw feed stream collected
from the rubber industry of Tripura.

Table 1. Feed Stream Characterization.


Sl. No. Parameters Units Value
1 Total suspended solids mg/L 398
2 Total dissolved solids mg/L 3030
3 pH 5.2
4 Biochemical oxygen demand mg/L 12.3
5 Oil and grease mg/L 1080
6 Sulfide mg/L 23.5

3 Computational Fluid Dynamics (CFD)

With the advancement of computers and growing computational power, computational
fluid dynamics (CFD) has become a widely used tool for predicting solutions to
fluid-related problems [11–14]. The mathematical model was developed as a 3D
hexahedral grid geometry using Gambit software, and the continuum and boundary
conditions of the model were defined for simulation purposes. The mesh was examined
to evaluate its skewness, which was found to be less than 0.5 and considered
acceptable. The developed geometry was exported to the pressure-based solver Fluent
to predict the pressure distribution, wall shear stress and concentration profile over the
membrane while effluent from the rubber industry flows through it. SIMPLE
pressure-velocity coupling with a second-order upwind scheme was implemented
under relaxation for the simulation. The flux through the membrane bed was
considered laminar, axisymmetric and isothermal; the flow was treated as laminar since
the Reynolds number was less than 2300.

3.1 Assumptions
The following assumptions and ideas are considered for building the CFD model
describing effluent flow through the membrane bed [11]:
I. Effluent generated from the rubber industry is considered a Newtonian fluid
because of its dilution in water.
II. Effluent from the rubber industry is considered isothermal and incompressible.
III. The mathematical model is assumed to be a laminar, single-phase,
pressure-based SIMPLE pressure-velocity coupling model.
IV. The membrane is considered an isotropic and homogeneous porous zone
throughout the geometry.
V. According to the needs of the simulation work, the under-relaxation factors are
reduced to 0.7–0.3 or lower.
VI. Hexahedral grids are used to mesh the model for computational simplicity.
VII. Large pressure gradients and swirl velocities are resolved using refined grids.
VIII. The mathematical model developed for the simulation work is confined to the
flow model only.

3.2 Governing Equations


The flow through the membrane is governed by the continuity, momentum, Darcy's
law and solute transfer equations [12]. The governing equations for the porous
membrane bed are defined below:

Continuity equation
The equation of continuity for effluent flow through the membrane bed is defined by Eq. 1:

\left( \hat{i}\,\frac{\partial}{\partial x} + \hat{j}\,\frac{\partial}{\partial y} + \hat{k}\,\frac{\partial}{\partial z} \right) \cdot \vec{V} = 0    (1)

Darcy's law
Darcy's law for the flow of effluent through the membrane bed is defined by Eq. 2:

\nabla P = -\,\frac{\mu}{\alpha}\,\vec{V}    (2)

Momentum equation
The axial momentum equation for effluent flow through the membrane bed is given by Eq. 3:

\nabla \cdot \left( \rho u \vec{V} \right) = -\,\frac{\partial P}{\partial x} + \mu \nabla^{2} u    (3)

The radial momentum equation for rubber industry effluent flow through the membrane is given by Eq. 4:

\nabla \cdot \left( \rho v \vec{V} \right) = -\,\frac{\partial P}{\partial r} + \mu \nabla^{2} v    (4)

Mass transfer or solute transfer equation
The equation of solute transfer through the membrane is given by Eq. 5:

\nabla \cdot \left( \rho \vec{V} C \right) = \rho D \nabla^{2} C    (5)

3.3 Boundary Conditions


The following boundary conditions were considered for modeling and simulation of
the rectangular sheet membrane [11]:
1. The inlet was assumed to be a mass-flow inlet.
2. The outlet was considered a pressure outlet with gauge pressure equal to zero.
3. The membrane was assumed to be a porous zone.
4. A no-slip condition was considered near the wall, where the fluid velocity tends to zero.

3.4 Convergence and Grid Independency Test


The default convergence criterion for all equations was set to 10^-5, except for the
transport equation, for which 10^-3 was used. A computational domain was used to
calibrate the fully developed flow results obtained for the rectangular sheet membrane.
The results of the study show that the final predictions depend on the mesh/grid
geometry. Gradual increments and decrements of the mesh/grid resolution by 50%
were applied to evaluate whether the employed resolution was adequate to obtain
results with a minimum error percentage. With a 50% decrease in mesh/grid resolution,
the pressure profile deviates by 8–15% from that of the currently employed mesh/grid
for the rectangular sheet membrane; with a 50% increase, the deviation is 1–5%. From
these results, it was concluded that the current mesh/grid resolution is sufficient to
obtain a grid-independent solution for the proposed model of the rectangular sheet
membrane.

4 Results and Discussions

4.1 Selection of Optimum Grid Size


A grid independency test was used to select the optimum grid for the simulation
process. Table 2 shows the grid selection procedure, where parameters such as
computational time and deviation are used to select the optimum grid. The grid
selection was conducted assuming a membrane pore size of 0.2 µm and an inlet
velocity of 2.1 m/s, with the aim of obtaining a grid-independent solution. Three grid
sizes (coarse: 81,000; fine: 375,000; finer: 648,000) were compared. The study found
no significant change in the results between the fine and finer grids. The fine grid, of
size 375,000, was selected as the optimum for its lower computational time, cost and
deviation, and was used for the simulation process, since further increasing the grid
size does not show any significant change in the pressure profile inside the rectangular
sheet membrane.
Modeling and Simulation of Rectangular Sheet Membrane 351

Table 2. Optimum grid selection.

Sl. No. | Mesh nature | Mesh size | No. of nodes | No. of faces | Time (min) | Exp. (Pa) | CFD (Pa) | Deviation
1 | Coarse | 81000 | 249300 | 87451 | 43 | 761.7 | 742 | 19.8
2 | Fine | 375000 | 1142500 | 392751 | 167 | 761.7 | 756 | 5.8
3 | Finer | 648000 | 1969200 | 673501 | 240 | 761.7 | 756 | 5.8

4.2 CFD Analysis


Figure 2 shows the concentration profile at the inlet and outlet sections of the
membrane, demonstrating the quality of the permeate flux produced during the
separation process. Red and blue in the plot show the maximum and minimum effluent
concentrations, respectively. The figure demonstrates that the mass concentration is
high at the inlet compared with the outlet of the membrane, because of the high
resistance offered by the membrane bed, which selectively permits particles with
diameters smaller than the membrane pore size. Figure 3 is a graphical illustration of
the wall shear stress (Pa) of the rectangular sheet membrane. Deformation of the fluid
is high in the vicinity of the wall and low at the center of the membrane bed, due to the
dominant adhesive force close to the wall compared with the weak cohesive force at
the center. The plot of mass imbalance (kg/s) with respect to time (min) for the
rectangular sheet membrane is illustrated in Fig. 4. The permeate flux of the membrane
gradually decreases as filtration proceeds, because of the cake layer forming over the
surface of the membrane, which partially or totally blocks the membrane pores and
obstructs the flow, thus reducing the permeate flux. The membrane shows a filtration
time of 160 min when implemented to improve the effluent quality.

Fig. 2. Contour of concentration profile.



Fig. 3. Graphical illustration of wall shear stress (Pa).

Fig. 4. Plot of mass imbalance (kg/s) with respect to time (min).

4.3 Error Analysis


Error analysis is used to assess the accuracy of the CFD model. The percent bias
(PBIAS), Nash-Sutcliffe efficiency (NSE) and ratio of the RMSE to the observation
standard deviation (RSR) methods are used to conduct the error analysis [15–21]. The
PBIAS, NSE and RSR values are estimated using Eqs. 6–8:
\mathrm{PBIAS} = \frac{\sum_{i=1}^{n} \left( Y_i - Y_i^{*} \right) \times 100}{\sum_{i=1}^{n} Y_i}    (6)

\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} \left( Y_i^{*} - Y_i \right)^2}{\sum_{i=1}^{n} \left( Y_i - Y_{\mathrm{mean}} \right)^2}    (7)

\mathrm{RSR} = \frac{\sqrt{\sum_{i=1}^{n} \left( Y_i - Y_i^{*} \right)^2}}{\sqrt{\sum_{i=1}^{n} \left( Y_i - Y_{\mathrm{mean}} \right)^2}}    (8)

In Eqs. 6–8, Y_i, Y_i^*, and Y_mean represent the actual dataset, the predicted dataset
and the mean of the actual dataset, respectively, and n is the total number of
observations in the actual dataset. PBIAS, NSE and RSR values are considered best
when close to 0, 1 and 0, respectively. Table 3 shows the error analysis of the
developed CFD model; the results for PBIAS (0.317), RSR (0.022) and NSE (0.99)
are close to the ideal values. The error analysis demonstrates the accuracy of the
simulation process and justifies the use of CFD to model the flow through the membrane.
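For reference, Eqs. 6–8 can be computed directly from paired experimental and CFD-predicted values, as in the sketch below; the sample numbers are hypothetical, not the paper's measurements.

import numpy as np

def error_metrics(y_obs, y_pred):
    # PBIAS, NSE, and RSR per Eqs. 6-8; y_obs = experimental, y_pred = CFD.
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_obs - y_pred
    pbias = 100.0 * resid.sum() / y_obs.sum()
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    nse = 1.0 - ss_res / ss_tot
    rsr = np.sqrt(ss_res) / np.sqrt(ss_tot)
    return pbias, nse, rsr

print(error_metrics([761.7, 820.4, 905.2], [756.0, 818.9, 901.7]))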

Table 3. Error analysis of the developed CFD model.


Sl. No. Methods Estimated value Best value
1 PBIAS 0.317 0
2 NSE 0.99 1
3 RSR 0.022 0

4.4 Model Validation


Figure 5 shows the plot of pressure drop (Pa) against feed velocity (m/s) for the
rectangular sheet membrane, used for validation. From Fig. 5, it is observed that an
increase in inlet velocity affects the momentum of the fluid. As the momentum of the
liquid is a function of velocity, an increase in speed increases the momentum of the
fluid, which then collides with the membrane wall and pore walls at a higher rate,
causing kinetic energy losses. This kinetic energy loss due to the increased fluid
velocity is converted into a pressure head, which appears as an increased pressure drop.
The CFD-predicted results were validated against the experimental values. From the
validation plot, it is observed that the results predicted by CFD are in good agreement
with the experimental results.

Fig. 5. Validation plot between pressure drop and inlet velocity.

5 Conclusions

The model of the rectangular membrane was developed using Gambit, and the effluent
flow pattern inside the membrane was simulated using Fluent. To produce results with
the lowest error percentage, a hexahedral mesh with the least PC memory was utilized
for meshing. From the grid independence test, a mesh size of 375,000 was selected for
the simulation work, as any further refinement of the grid does not change the pressure
profile of the rectangular sheet membrane. The mass imbalance (kg/s) study shows that
the membrane bed possesses high separation efficiency. The CFD-predicted results
were validated against experimental data; the CFD results follow the experiments with
high precision, and the error percentage typically varies in the range of 1–5%. The
CFD simulation provides insight into hydrodynamic properties such as the wall shear
stress and the concentration profile over the membrane. Error analysis was used to
evaluate the accuracy of the simulation process, using the PBIAS, NSE and RSR
methods. The determined values of NSE, PBIAS and RSR were 0.99, 0.317 and 0.022,
respectively, which are close to the ideal values, justifying the accuracy of the
simulation. The results obtained from this study can be used to develop a cost-effective
membrane separation technique to treat rubber industry effluent.

References
1. Chen, Z., Luo, J., Hang, X., Wan, Y.: Physicochemical characterization of tight
nanofiltration membranes for dairy wastewater treatment. J. Memb. Sci. 547, 51–63
(2018). https://doi.org/10.1016/j.memsci.2017.10.037
2. Noor, S.F.M., Ahmad, N., Khattak, M.A., et al.: Application of Sayong Ball Clay Membrane
Filtration for Ni (II) Removal from Industrial Wastewater. J. Taibah Univ. Sci. 11, 949–954
(2017). https://doi.org/10.1016/j.jtusci.2016.11.005
3. Emin, C., Kurnia, E., Katalia, I., Ulbricht, M.: Polyarylsulfone-based blend ultrafiltration
membranes with combined size and charge selectivity for protein separation. Sep. Purif.
Technol. 193, 127–138 (2018). https://doi.org/10.1016/j.seppur.2017.11.008
4. Nath, K., Dave, H.K., Patel, T.M.: Revisiting the recent applications of nanofiltration in food
processing industries: Progress and prognosis. Trends Food Sci. Technol. 73, 12–24 (2018).
https://doi.org/10.1016/j.tifs.2018.01.001
5. Pak, A., Mohammadi, T., Hosseinalipour, S.M., Allahdini, V.: CFD modeling of porous
membranes. Desalination 222, 482–488 (2008). https://doi.org/10.1016/j.desal.2007.01.152
6. Berman, A.S.: Laminar flow in channels with porous walls. J. Appl. Phys. 24, 1232–1235
(1953). https://doi.org/10.1063/1.1721476
7. Karode, S.K.: Laminar flow in channels with porous walls, revisited. J. Memb. Sci. 191,
237–241 (2001). https://doi.org/10.1016/S0376-7388(01)00546-4
8. Rahimi, M., Madaeni, S.S., Abolhasani, M., Alsairafi, A.A.: CFD and experimental studies
of fouling of a microfiltration membrane. Chem. Eng. Process. Process Intensif. 48, 1405–
1413 (2009). https://doi.org/10.1016/j.cep.2009.07.008
9. Ghadiri, M., Asadollahzadeh, M., Hemmati, A.: CFD simulation for separation of ion from
wastewater in a membrane contactor. J. Water Process. Eng. 6, 144–150 (2015). https://doi.
org/10.1016/j.jwpe.2015.04.002
10. Rezakazemi, M.: CFD simulation of seawater purification using direct contact membrane
desalination (DCMD) system. Desalination 443, 323–332 (2018). https://doi.org/10.1016/j.
desal.2017.12.048
11. Banik, A., Bandyopadhyay, T.K., Biswal, S.K.: Computational fluid dynamics simulation of
disc membrane used for improving the quality of effluent produced by the rubber industry.
Int. J. Fluid Mech. Res. 44, 499–512 (2017). https://doi.org/10.1615/InterJFluidMechRes.
2017018630
12. Banik, A., Biswal, S.K., Bandyopadhyay, T.K.: Predicting the optimum operating
parameters and hydrodynamic behavior of rectangular sheet membrane using response
surface methodology coupled with computational fluid dynamics. Chem. Papers 74(9),
2977–2990 (2020). https://doi.org/10.1007/s11696-020-01136-y
13. Myagkov, L., Chirskiy, S., Panchenko, V., et al.: Application of the topological optimization
method of a connecting rod forming by the BESO technique in ANSYS APDL. In: Vasant,
P., Zelinka, I., Weber, G. (eds) Advances in Intelligent Systems and Computing. Springer,
Cham (2020)
14. Vasant, P., Zelinka, I., Weber, G-W.: Intelligent computing and optimization. In:
Proceedings of the 2nd International Conference on Intelligent Computing and Optimization
2019 (ICO 2019), 1st edn. Springer, Cham (2020)
15. Banik, A., Biswal, S.K., Majumder, M., Bandyopadhyay, T.K.: Development of an adaptive
non-parametric model for estimating maximum efficiency of disc membrane. Int. J. Converg.
Comput. 3, 3–19 (2018)

16. Panchenko, V.I., Kharchenko, A., Valeriy Lobachevskiy, Y.: Photovoltaic solar modules of
different types and designs for energy supply. Int. J. Energy Optim. Eng. 9, 74–94 (2020).
https://doi.org/10.4018/IJEOE.2020040106
17. Panchenko, V.A.: Solar roof panels for electric and thermal generation. Appl. Sol. Energy
(English Transl. Geliotekhnika) 54, 350–353 (2018). https://doi.org/10.3103/
S0003701X18050146
18. Banik, A., Dutta, S., Bandyopadhyay, T.K., Biswal, S.K.: Prediction of maximum permeate
flux (%) of disc membrane using response surface methodology (rsm). Can. J. Civ. Eng. 46,
299–307 (2019). https://doi.org/10.1139/cjce-2018-0007
19. Kalaycı, B., Özmen, A., Weber, G.-W.: Mutual relevance of investor sentiment and finance
by modeling coupled stochastic systems with MARS. Ann. Oper. Res. 295(1), 183–206
(2020). https://doi.org/10.1007/s10479-020-03757-8
20. Kuter, S., Akyurek, Z., Weber, G.W.: Retrieval of fractional snow covered area from
MODIS data by multivariate adaptive regression splines. Remote Sens. Environ. 205, 236–
252 (2018). https://doi.org/10.1016/j.rse.2017.11.021
21. Vasant, P., Zelinka, I., Weber, G-W.: Intelligent computing & optimization. In: Conference
proceedings ICO 2018. Springer, Cham (2019)
End-to-End Supply Chain Costs Optimization
Based on Material Touches Reduction

César Pedrero-Izquierdo1(&), Víctor Manuel López-Sánchez1, and José Antonio Marmolejo-Saucedo2

1 Universidad Anáhuac, Av. Universidad Anáhuac 46, Lomas Anáhuac, 50130 Huixquilucan, Edo. México, Mexico
cesarpedreroiz@gmail.com
2 Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin 498, 03920 Ciudad de México, Mexico

Abstract. The global manufacturing industry requires high standard produc-


tivity, quality, delivery and flexibility. This is especially true when it comes to
the trucking industry, which has gained high efficiency by adopting lean man-
ufacturing tools. Nevertheless, it is crucial to look into the supply chain to reach
higher efficiency and sustainable competitive advantages. Multifold research on
supply chain costs optimization treats it as a collection of interrelated and
indivisible levels whose internal operations can be neglected. This prevents us
from spotting non-value and wasteful activities inside these levels. To fix this
drawback, this research proposes a different cost optimization strategy taking
advantage of those internal operations. This procedure breaks the supply chain
levels down into basic elements to generate new sets of operations that col-
lectively satisfy the original supply chain. The solution to this combinatorial
problem, which is inspired by the IP "crew-pairing problem", provides a new and
optimized supply chain that minimizes costs.

Keywords: Supply chain · Set covering · End to end · Material touches · Trucks

1 Introduction
1.1 Motivation and Objective
The global manufacturing industry requires the highest standards in productivity,
quality, delivery and flexibility. This is especially true for the truck manufacturing
industry, which is a relevant example due to the technical complexity of its assembly
process, as well as its tendency toward mass personalization, creating a very complex
supply chain (Fogliatto 2003). There has been vast research on supply chain
optimization; nevertheless, most of it relates to lean manufacturing concepts (Monden
2011, Krajewski 2010) or to the minimum cost flow problem or MCFP (Hammamia
and Frein 2013). When related to lean concepts, it is normally applied to limited
procedures or steps in the supply chain (or SC); and when related to the MCFP, the SC
is conceptualized as a net of indivisible levels whose internal operations are neglected.
This prevents us from realizing that inside these levels there are redundant activities
that do not add any value for further processes (Closs 2010). In this research, those
internal activities are referred to as "material touches" or just "touches", meaning every
action related to handling, moving or interacting in general with the product flowing
along the SC. Based on the background above, the general objective of this research is
to develop a different cost optimization methodology for a SC of i − 1 stages and
i levels serving a truck assembly company, from suppliers to the final assembly
process. This methodology must be capable of generating cost improvements within
the internal processes of the SC's i levels by fragmenting them into j basic touches and
neglecting the traditional division between them.

1.2 Methodology
This research introduces a four-stage methodology. The first stage analyzes and
segments the original i levels of the SC into individual j touches that are subsequently
cataloged according to specific attributes. The result of this stage, plus the selection of
touches proposed to be outsourced externally, gives rise to the next step, which
consists of generating new sets of touches according to individual characteristics.
These subsets will not satisfy the original SC touches individually; however, a
combination of them will satisfy them collectively. The third stage calculates the costs
of the brand-new touch subsets. The last stage formulates and solves the mathematical
model as a "set covering problem" (sketched below). The case studies used to validate
this methodology come from a collaborating truck assembly company, which is the
largest in the world and the one with the highest offer of customization in Mexico and
Latin America.

1.3 Contribution
The main objective and contribution to the state of the art of this research was to develop a new cost optimization strategy, dedicated to the SC of the truck assembly industry, that can visualize improvement opportunities within the SC's internal steps. Out of the vast research dedicated to this topic, where lean concepts and the MCFP stand out, this research produced the following outcomes that had not been considered within these two branches:
• The concept of the material touch is introduced, as well as its mathematical description, as the basis of a new methodology dedicated to SC cost optimization.
• An iterative process was developed to segment the entire supply chain into touches based on individual attributes.
• A practical application of the “end to end” concept related to the SC was presented as part of the cost optimization methodology.
• A mathematically strict methodology was provided in order to merge both the strengths and weaknesses of different logistics operators to optimize total costs.
• A four-stage methodology, inspired by the aircrew-assigning problem “crew pairing”, was developed as an IP in a “set covering” format.

2 Literature Review

Kumar and Nambirajan (2014) suggested crossing borders between client and provider
to expand capabilities. Arif-Uz-Zaman and Ahsan (2014) suggested that the SC should
be treated as companies linked to each other. Stavrulaki and Davis (2010) suggested
that a SC improves the overall effectiveness when every stage is aligned. Krajewski
et al. (2010) integrated the design of the SC under “lean manufacturing system”.
Bezuidenhout (2015) added the agile concept to the existing lean concepts, describing a more realistic “lean-agile” process. As an additional tool to
optimize a SC, Walters and Lancaster (2000) proposed the third party logistics (or 3PL)
figure as part of this process. Arias-Aranda et al. (2011) verified the correlation of 3PL
services and flexibility levels within a company. Wu et al. (2012) suggested flexibility
and specialization improvements through 3PL services. Dolgui and Proth (2013)
considered that 3PL services can reduce non-essential activities and reduce costs.
Agrawal (2014) mentioned that using 3PL services enhances flexibility and innovation.
Lee (2010) used the concept “end to end” to suggest the elimination of redundant SC
processes. Closs et al. (2010) suggested also to migrate to a collaborative culture to
eliminate waste by eliminating silos along the SC. Trang (2016), Kram et al. (2015) and
Ageron (2013) proposed the elimination of redundant activities through shared SC
design. González-Benito (2013) adopted the term “co-makership” as shared responsi-
bility for SC design. Ashley (2014) affirmed that SC co-responsibility must go from the
birth of the product to the end of its cycle. As for optimization methodologies, Castillo-Villar (2014) added quality costs as an optimization criterion. Hammami and Frein
(2013) added delivery-time as optimization criterion. Sakalli (2017) included stochastic
parameters under uncertainty. Cheng and Ye (2011) adapted classic MCFP model to
handle parallel suppliers. Fahimnia et al. (2013) introduced production planning and
SC parameters together. Ding (2009) proposed an optimization method using simul-
taneous production and distribution parameters. Paksoy and Ozceylan (2011) included
operation-balancing parameters to MCFP. Paksoy and Ozceylan (2012), particularized
their previous work by including U-shaped production balancing. Hamta (2015)
proposed an SC optimization problem by adding production with uncertain demand
parameters. Minoux (2006) solved the crew assignment problem by using heuristic
methods. Souai and Teghem (2009) proposed reassigning routes by using a genetic
algorithm. Deng and Lin (2010) proposed an ant colony algorithm to solve the crew
scheduling problem. The same problem but considering also legal requirements is
solved by Deveci and Demirel (2015) using a genetic algorithm. Tekiner et al. (2008)
proposed a column generation method incorporating disruptive events. Later on, Muter et al. (2010) also proposed a two-step column generation methodology, solving first the crew pairing and then using these data to solve the crew rostering, as well as incorporating possible disruptions from the planning stage, which they called “robust planning”. Özener et al. (2016) solved the aircraft allocation and flight sequence assignment by using exact and metaheuristic methods.

3 Touches-Based Cost Optimization Methodology


3.1 Analysis of a SC Levels Using a Touches-Approach
This methodology starts by segmenting the i levels of a SC into j internal and inde-
pendent basic steps for every k component. The segmentation is carried out through an
empirical process that follows the knowledge dynamics described by Bratianu (2018).
These individual actions, called “material touches”, are generically denoted $T_{ijk}$, which describes the touch $j$ belonging to level $i$ that acts on component $k$; this concept is represented in Fig. 1. Every touch $T_{ijk}$ is particularized depending on the type of activity and the cost it generates. Nine types of touches are proposed to describe every activity in the SC. These are listed below:
• Internal manual touch ($TMN_{ijk}$): parts manipulation by physical labor force.
• Internal machinery touch ($TQN_{ijk}$): parts movements by devices and mechanisms.
• Internal manual touch implying packaging ($TMEN_{ijk}$): packaging by labor force.
• Internal machinery touch implying packaging ($TQEN_{ijk}$): packaging using devices.
• External manual touch ($TMX_{ijk}$), external machinery touch ($TQX_{ijk}$), external manual touch implying packaging ($TMEX_{ijk}$) and external machinery touch implying packaging ($TQEX_{ijk}$): similar concepts, but provided by a third party.
• Transportation touch ($TTR_{ijk}$): parts transportation by motor vehicles of any kind.

Fig. 1. The concept of material touches is represented in this diagram, where level $i = 2$ is fragmented into $j = 8$ material touches for component $k$.

3.2 Incidence Matrix Generation and Numerical Example


The incidence matrix is the registration of every touch subset that will collectively satisfy the activities of the original SC. Let $A$ be an $m \times n$ matrix, where $m$ represents every touch of the original SC and $n$ the number of internal touch subsets plus the subsets proposed to be outsourced. The subsets are formed from consecutive and inseparable operations, determined by the process interrelation they keep with their immediate previous activity. The three proposed interrelations are: initial (IN), used only to tag the first touch; consecutive (C), intended for touches that are technically not separable from the previous process; and non-consecutive (NC), used for touches that can be performed independently from the previous process. The original $m$ touches are recorded in the first columns of $A$, strictly in the order and position of the original SC, as described next. Let $M = \{\text{touches of the original SC}\}$ and let $P = \{\text{touches proposed to be outsourced}\}$, where $P \subseteq M$. Column $n = 1$ corresponds to $P^c$, assigning a value of 1 to the positions where a touch exists and 0 to those positions where it does not. Thereafter, the elements of subset $P$ are recorded in the subsequent columns. $P$ is divided into subsets depending on its touch interrelations as follows: subset $P$ will be registered in column $(n + 1)$, starting with the initial touch IN followed by the subsequent touches classified as consecutive C. Once a non-consecutive NC touch is reached, it will be assigned to the next column, followed by the subsequent C touches, until another NC touch appears again. This iterative process is performed for the remaining elements of $P$. Let $r$ be the quantity of external offers to outsource selected subsets of $P$. Let us consider $P_r$ as the subset outsourced by offer $r$. Each $P_r$ is recorded in an individual column for every $r$.
As a numerical example, consider a SC of $i = 4$ levels with $i - 1 = 3$ stages, where each level contains $j = 2$ touches for $k = 1$ components. Let us propose six touches within the SC: $T_{111}, T_{121}, T_{211}, T_{221}, T_{311}$ and $T_{321}$. Then, set $M$ would be $M = \{T_{111}, T_{121}, T_{211}, T_{221}, T_{311}, T_{321}\}$. Let us suppose that touches $T_{111}, T_{211}, T_{221}$ can be outsourced and that there are three external proposals, therefore $r = 3$. Set $P$ is then formed by the touches $P = \{T_{111}, T_{211}, T_{221}\}$. Let us propose the three external offers as $P_1 = \{T_{211}, T_{221}\}$, $P_2 = \{T_{111}, T_{211}, T_{221}\}$ and $P_3 = \{T_{211}, T_{221}\}$, where every $P_r \subseteq P$; therefore the complement set is $P^c = \{T_{121}, T_{311}, T_{321}\}$. Each element belonging to $P^c$ is recorded in column $n = 1$ in rigorous order and position as of the original SC. Subsequently, the elements of the set $P$ are recorded starting at column $n = 2$, depending on their touch interrelations (see Table 1), as described above. Finally, the $r = 3$ proposals $P_1, P_2, P_3$ to be outsourced are recorded in independent columns. The resulting incidence matrix is shown in Fig. 2.
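A short sketch may make this column-generation rule concrete. The interrelation tags below are illustrative assumptions, since the contents of Table 1 are not reproduced here; the function implements the rule that IN or NC opens a new column while C extends the current one.

```python
# Sketch of the Sect. 3.2 rule for splitting the outsourceable set P into
# subset columns: IN opens the first column, every NC opens a new column,
# and C touches stay attached to the column opened before them.
# The interrelation tags below are illustrative assumptions (Table 1).
P = [("T111", "IN"), ("T211", "NC"), ("T221", "C")]

def split_into_subsets(p):
    subsets = []
    for touch, rel in p:
        if rel in ("IN", "NC") or not subsets:
            subsets.append([touch])   # open a new column of matrix A
        else:                         # rel == "C": inseparable from previous
            subsets[-1].append(touch)
    return subsets

print(split_into_subsets(P))  # [['T111'], ['T211', 'T221']] -> columns n=2, n=3
```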

Table 1. Interrelation of set P from numerical example presented in Sect. 3.2.



Fig. 2. Incidence matrix: internal touches registered in $n = 1$–$3$ and external in $n = 4$–$6$.

3.3 Touches Subsets Cost Calculation


The next step is to obtain the costs of both internal and external touch subsets. The calculation parameters include $Te$, which represents the total amount of touches per day, and $Tt$, the amount of touches corresponding to each component $k$. For both $Te$ and $Tt$, the dimensional units are [touches/day]. Both parameters must be particularized to individually represent the corresponding types of touches. The daily demand is represented by $d$ [units/day]. The quantity of parts that component $k$ needs to form one unit of product is $U_{ijk}$ [pieces/unit], and $Q_{ijk}$ [pieces] is the quantity of $k$ components in a standard package. Let us develop the formulation of $Te$ and $Tt$ for the first type of touch, $TMN_{ijk}$; the equations for the other eight types of touches are calculated the same way. For all of them, the dimensional units are $((\text{units/day} \times \text{pieces/unit}) / \text{pieces}) \times \text{touches} = \text{touches/day}$. The equations for $Te_{TMN_i}$ and $Tt_{TMN_{ijk}}$ are:
$$Te_{TMN_i} = \sum_{j=1}^{J} \sum_{k=1}^{K} \frac{d \cdot U_{ijk}}{Q_{ijk}} \cdot TMN_{ijk} \qquad \forall i = 1, 2, \ldots, I \qquad (1)$$

$$Tt_{TMN_{ijk}} = \frac{d \cdot U_{ijk}}{Q_{ijk}} \cdot TMN_{ijk} \qquad (2)$$

$$\forall i = 1, 2, \ldots, I; \; \forall j = 1, 2, \ldots, J; \; \forall k = 1, 2, \ldots, K$$

where:

$$T_{ijk} = \begin{cases} 1 & \text{if specific touch type } j \text{ exists in level } i \text{ for } k \\ 0 & \text{in any other case} \end{cases} \qquad (3)$$

Note that in (3) $T_{ijk}$ represents each of the nine types of touches $TMN_{ijk}$, $TQN_{ijk}$, $TMEN_{ijk}$, $TQEN_{ijk}$, $TMX_{ijk}$, $TQX_{ijk}$, $TMEX_{ijk}$, $TQEX_{ijk}$ and $TTR_{ijk}$. Once $Te$ and $Tt$ are calculated, the cost for each type of touch can be calculated. These costs are listed below, where the dimensional units for all of them are [monetary unit/day].
• $C_{TMN_i}$: internal manual touch costs.
• $C_{TQN_i}$: internal machinery touch costs.
• $C_{TMEN_i}$: internal manual touch implying packaging costs.
• $C_{TQEN_i}$: internal machinery touch implying packaging costs.
• $C_{TMX_{ijk}}$: external manual touch costs.
• $C_{TQX_{ijk}}$: external machinery touch costs.
• $C_{TMEX_{ijk}}$: external manual touch implying packaging costs.
• $C_{TQEX_{ijk}}$: external machinery touch implying packaging costs.
• $C_{TTR_i}$: transportation costs.
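As a minimal illustration of Eqs. (1)–(3), the daily touch counts for the internal manual touch type could be computed as follows; the demand, usage and packaging figures are illustrative assumptions, not data from the case studies.

```python
# Sketch of Eqs. (1)-(3) for the internal manual touch type TMN.
# All numeric inputs are illustrative assumptions.
d = 40.0              # daily demand, units/day
U = {(1, 1, 1): 2.0}  # U_ijk, pieces of component k per unit of product
Q = {(1, 1, 1): 10.0} # Q_ijk, pieces of component k per standard package
TMN = {(1, 1, 1): 1}  # indicator (3): 1 if the touch exists at (i, j, k)

def tt_tmn(i, j, k):
    """Eq. (2): daily touches Tt of one specific TMN touch."""
    return d * U[(i, j, k)] / Q[(i, j, k)] * TMN[(i, j, k)]

def te_tmn(i):
    """Eq. (1): total daily TMN touches of level i, summed over j and k."""
    return sum(tt_tmn(ii, j, k) for (ii, j, k) in TMN if ii == i)

print(te_tmn(1))  # 40 * 2 / 10 * 1 = 8.0 touches/day
```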
Calculation of costs for internal and external touch subsets follows completely different forms of logic. First, let us develop the internal cost calculation for the first type of touch, $TMN_{ijk}$; the formulation for the other eight types is done the same way. Let $C_{TMN_i}$ be the total cost related to $TMN_{ijk}$ activities in level $i$. Let $Cu_{TMN_i}$ be the unitary cost of touch $TMN_{ijk}$ of level $i$, which is the equitable partition, per touch, of the resources dedicated to the execution of this type of touch. The dimensional units for $Cu_{TMN_i}$ are [monetary unit/touch]. The mathematical expression (4) is stated as follows:

$$Cu_{TMN_i} = C_{TMN_i} / Te_{TMN_i} \qquad \forall i = 1, 2, \ldots, I \qquad (4)$$

Once the unitary costs per touch type are obtained for level $i$, the total cost for each individual touch is calculated by multiplying the unitary cost by the total touches $Tt$. This calculation is done for each touch belonging to each subset in column $n$; the sum represents the total cost of every touch existing in that subset. The internal subset cost $CInt_n$ [monetary unit/day] is the sum over every element belonging to column $n$:

$$CInt_n = \sum_{ijk} \left( Cu_{TMN_i} \cdot Tt_{TMN_{ijk}} + Cu_{TQN_i} \cdot Tt_{TQN_{ijk}} + Cu_{TMEN_i} \cdot Tt_{TMEN_{ijk}} + Cu_{TQEN_i} \cdot Tt_{TQEN_{ijk}} + Cu_{TMX_i} \cdot Tt_{TMX_{ijk}} + Cu_{TQX_i} \cdot Tt_{TQX_{ijk}} + Cu_{TMEX_i} \cdot Tt_{TMEX_{ijk}} + Cu_{TQEX_i} \cdot Tt_{TQEX_{ijk}} + Cu_{TTR_i} \cdot Tt_{TTR_{ijk}} \right) \qquad (5)$$

Calculation of costs for the subsets to be outsourced follows a different logic from the previous one. These costs depend on two factors; the first one, $CextT_r$, refers to the commercial proposal from third parties. The second one, defined as the approaching cost $CTextT_{r_{ijk}}$, is the relocation cost of the components to the point where they will continue to be processed. This cost makes external proposals cost-comparable with the internal ones, since it compensates for the fact that the outsourced processes will take place in another facility and, later on, the components will return to the starting point to continue the original flow. The total cost $CExt_r$ of subset $P_r$ is the sum of both costs, $CextT_r$ and $CTextT_{r_{ijk}}$ [monetary unit/day]; this cost is formulated in Eq. (6).

$$CExt_r = CextT_r + \sum_{ijk} CTextT_{r_{ijk}} \qquad \forall r \qquad (6)$$

The resulting cost matrix is shown in Table 2, where touches have already been illustratively particularized.
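A compact sketch of how Eqs. (4)–(6) combine follows; it collapses the nine touch types into one generic type for brevity, and all cost figures are illustrative assumptions.

```python
# Sketch of Eqs. (4)-(6), with one generic touch type instead of nine.
def unit_cost(level_cost, te):
    """Eq. (4): equitable cost per touch, [monetary unit/touch]."""
    return level_cost / te

def internal_subset_cost(column):
    """Eq. (5): sum of Cu * Tt over every touch registered in column n."""
    return sum(unit_cost(c, te) * tt for (c, te, tt) in column)

def external_offer_cost(commercial, approaching_costs):
    """Eq. (6): third-party proposal plus approaching (relocation) costs."""
    return commercial + sum(approaching_costs)

# Two touches in a column: (level cost/day, Te touches/day, Tt touches/day).
print(internal_subset_cost([(900.0, 120.0, 8.0), (400.0, 50.0, 8.0)]))  # 124.0
print(external_offer_cost(100.0, [12.0, 7.5]))                          # 119.5
```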

Table 2. Cost matrix of internal and external touch subsets.

3.4 Generation of the Mathematical Model in Its “Set Covering” Form


The mathematical model is formulated as a combinatorial optimization problem in the form of “set covering”. The solution to the particular application presented can be obtained by exact methods, due to the limited size of the problems found in the industrial practice under study. Each subset is represented by a decision variable $PA_1, PA_2, PA_3, \ldots, PA_N$. The coefficients of the objective function are $CInt_n$ and $CExt_{n-ni}$, where the quantity of external proposals is $r = n - ni$. The decision variables are binary:

$$PA_n = \begin{cases} 1 & \text{if the subset is selected} \\ 0 & \text{otherwise} \end{cases}$$

Internal and external touch subsets are recorded in the incidence matrix $A$, where:

$$a_{mn} = \begin{cases} 1 & \text{if the element exists in the subset registered in } n \\ 0 & \text{otherwise} \end{cases}$$

The objective function minimizes the total cost by selecting a combination of subsets $PA_1, PA_2, \ldots, PA_N$ that collectively satisfies the original SC touches. The resulting mathematical model is:

$$\text{Minimize } Z = \sum_{n=1}^{ni} CInt_n \cdot PA_n + \sum_{n=ni+1}^{N} CExt_{n-ni} \cdot PA_n \qquad (7)$$

$$\text{subject to } \sum_{n=1}^{N} a_{mn} \cdot PA_n \geq 1 \qquad \forall m = 1, 2, \ldots, M \qquad (8)$$

$$PA_n \in \{0, 1\}, \qquad n = 1, 2, \ldots, N$$
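For problems of the size reported here, model (7)–(8) can be solved exactly with an off-the-shelf IP solver. The sketch below encodes the Sect. 3.2 example with the open-source PuLP package; the contents of the internal columns and all cost coefficients are illustrative assumptions, since the Table 1 interrelations and Table 2 costs are not reproduced in this text.

```python
# Set-covering model (7)-(8) for the Sect. 3.2 example, solved with PuLP.
# Internal column contents and every cost figure are illustrative.
import pulp

touches = ["T111", "T121", "T211", "T221", "T311", "T321"]

# Columns of the incidence matrix A: n=1..3 internal, n=4..6 external offers.
subsets = {
    "PA1": (["T121", "T311", "T321"], 120.0),  # Pc, non-outsourced touches
    "PA2": (["T111"],                  40.0),
    "PA3": (["T211", "T221"],          75.0),
    "PA4": (["T211", "T221"],          60.0),  # offer P1
    "PA5": (["T111", "T211", "T221"],  95.0),  # offer P2
    "PA6": (["T211", "T221"],          70.0),  # offer P3
}

prob = pulp.LpProblem("SC_touch_set_covering", pulp.LpMinimize)
x = pulp.LpVariable.dicts("PA", subsets, cat="Binary")

# Objective (7): total cost of the selected internal and external subsets.
prob += pulp.lpSum(cost * x[n] for n, (_, cost) in subsets.items())

# Constraints (8): every original touch must be covered at least once.
for t in touches:
    prob += pulp.lpSum(x[n] for n, (ts, _) in subsets.items() if t in ts) >= 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([n for n in subsets if x[n].value() == 1], pulp.value(prob.objective))
```

With these illustrative figures the solver keeps the internal complement column and engages offer P2, replacing two internal columns at once, which is exactly the merge-or-substitute behavior the methodology exploits.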

4 Results Analysis

The methodology was evaluated in two different ways. The first validation consisted of applying the proposed methodology to four designed instances under extreme conditions. The purpose was to verify the correct functioning of the methodology even under unusual circumstances. The second validation consisted of empirical tests, where three real documented cases, previously solved by the collaborating company, were solved again with the new methodology. Table 3 shows the seven instances.

Table 3. Description of the instances used to validate the proposed methodology.

Validation IC1 verified the correct segmentation of the SC based on the interrelation of touches. The validations of instances IC2 and IC3 consisted of proposing atypical parameters in combination with a random interrelation assignment. For IC2, very high cost coefficients were selected for the external subsets. For IC3, equal parameters were proposed for all decision variables. As expected, for IC2 the result was the rejection of the external subsets, and for IC3 the problem yielded multiple solutions. The IC4 validation consisted of using the longest and most complex internal supply chain found in the collaborating company.
The empirical validation consisted of solving three real cases found in the collaborating industry and then comparing the results with the real solutions observed in industrial practice. For the empirical instances IR1 and IR2, the results obtained were consistent with the real observed solutions. However, IR3 presented a discrepancy when compared to the real solution observed. The new methodology recommended rejecting the outsourcing offers (in the results table the improvement shows as zero), whereas in practice it was observed that external services were engaged. When comparing the data of the business case in detail, it was confirmed that IR3 would be costlier if performed externally. However, in the business case IR3 was part of a package of services that, on the whole, was financially convenient. The methodology results are shown in Table 4.

Table 4. Improvements achieved by using the new touches-based methodology.

The proposed methodology offers improvements over the common cost reduction methodologies used in industrial practice, since it conveniently allows merging, substituting or eliminating sections of the SC in the most cost-effective combination. This new methodology is capable of finding cost reduction opportunities in the internal processes of the SC levels that other methodologies cannot reach. In addition, it allows combining touches belonging to one or more levels, since the entire SC is separated into its basic elements, following the “end to end” (E2E) lean principle. This methodology allows companies to search for cost opportunities that are hidden from other methodologies.

5 Conclusions and Future Research

This methodology is suitable for use as a mathematically based tool to support the decision-making process of the trucking industry, as well as of other industries whose SC can be divided into individual touches.
The study of the truck manufacturing industry's SC from a touches and end-to-end perspective allows visualizing redundant activities that do not add value or would turn out to be a waste for future processes.
The presented SC segmentation strategy does not apply only to the trucking industry. It is readily applicable to any human activity that can be broken down into individual activities that can be performed independently from one another.
The growing trend for companies to set aside the commercial division between the levels of the SC is a fundamental aspect of the proposed methodology. Companies that share this progressive vision, for both customer and supplier, will have competitive advantages over companies with isolationist policies.
Future research that could expand and strengthen the present investigation may focus not just on costs, but also on other equally important objectives in optimizing a supply chain. For instance, the aforementioned methodology could be applied to optimize quality levels, service satisfaction, labor in manufacturing processes, etc. Additionally, this methodology could be expanded to study two or more parallel supply chains, carrying out the optimization by combining touches from the involved supply chains to take advantage of the strengths of some and to amend the weaknesses of others, therefore obtaining a benefit for the complete cluster.
A third line of future research is the design of a greenfield project using the proposed methodology from its early stages. The intention is to select the best combination of operations from the design of the SC and its processes, instead of modifying them once they are in operation.

References
Dolgui, A., Proth, J.M.: Outsourcing: definitions and analysis. Int. J. Prod. Res. 51(23–24), 6769–6777 (2013)
Agrawal, A., De Meyer, A., Van Wassenhove, L.N.: Managing value in supply chains: case studies on the sourcing hub concept. Calif. Manag. Rev. 56, 22–54 (2014)
David, A.: Differentiating through supply chain innovation. Bus. Econ. Market. Purchas. 1, 1–4 (2014)
Fahimnia, B., Farahani, R.Z., Sarkis, J.: Integrated aggregate supply chain planning using memetic algorithm – a performance analysis case study. Int. J. Prod. Res. 51(18), 5354–5373 (2013). https://doi.org/10.1080/00207543.2013.774492
Bratianu, C., Vătămănescu, E.-M., Anagnoste, S.: The influence of knowledge dynamics on the managerial decision-making process. In: Proceedings of the European Conference on Knowledge Management, vol. 1, pp. 104–111 (2018). http://search.ebscohost.com/login.aspx?direct=true&db=lih&AN=132145882&lang=es&site=ehost-live&custid=s9884035
Kumar, C.G., Nambirajan, T.: Direct and indirect effects: SCM components. SCMS J. Ind. Manage. 1, 51–65 (2014)
Bezuidenhout, C.N.: Quantifying the degree of leanness and agility at any point within a supply chain, vol. 118, no. 1, pp. 60–69 (2015)
Arias-Aranda, D., Bustinza, O.F., Barrales-Molina, V.: Operations flexibility and outsourcing benefits: an empirical study in service firms. Routledge Taylor & Francis, vol. 31, no. 11, pp. 1849–1870 (2011)
Closs, D.J., Speier, C., Meacham, N.: Sustainability to support end-to-end value chains: the role of supply chain management. J. Acad. Mark. Sci. 39, 101–116 (2010)
Walters, D., Lancaster, G.: Implementing value strategy through the value chain. Manag. Decis. 38(3), 160–178 (2000)
Zeghal, F.M., Minoux, M.: Modeling and solving a crew assignment problem. Eur. J. Oper. Res. 175, 187–209 (2006)
Cheng, F., Ye, F.: A two objective optimisation model for order splitting among parallel suppliers. Int. J. Prod. Res. 49, 2759–2769 (2011)
Fogliatto, F.S., Da Silveira, G.J.C., Royer, R.: Int. J. Prod. Res. 41(8), 1811–1829 (2003)
Deng, G.F., Lin, W.T.: Ant colony optimization-based algorithm for airline crew scheduling problem. Expert Syst. Appl. 38, 5787–5793 (2011)
Tekiner, H., Birbil, S.I., Bulbul, K.: Robust crew pairing for managing extra flights. Manuf. Syst. Ind. Eng., Sabanci University 1, 1–30 (2008)
Lee, H.L.: Don't tweak your supply chain – rethink it end to end. Harvard Bus. Rev., 62–69 (2010)
Ding, H., Benyoucef, L., Xie, X.: Stochastic multi-objective production-distribution network design using simulation-based optimization. Int. J. Prod. Res. 47(2), 479–505 (2009)
Muter, I., Birbil, S.I., Bulbul, K., Sahin, G., Yenigun, H.: Solving a robust airline crew pairing problem with column generation. Alg. Optim. 40, 1–26 (2013)
González-Benito, J., Lannelongue, G., Alfaro-Tanco, J.A.: Study of supply-chain management in the automotive industry: a bibliometric analysis. Int. J. Prod. Res. 51(13), 3849–3863 (2013)
Wu, J.-Z., Chien, C.-F., Gen, M.: Coordinating strategic outsourcing decisions for semiconductor assembly using a bi-objective genetic algorithm. Int. J. Prod. Res. 50(1), 235–260 (2012)
Arif-Uz-Zaman, K., Ahsan, A.M.M.N.: Lean supply chain performance measurement. Int. J. Product. Perform. Manage. 63(5), 588–612 (2014)
Krafcik, J.F.: Triumph of the lean production system. Sloan Manag. Rev. 30(1), 41–52 (1988)
Krajewski, L.J., Ritzman, L.P., Malhotra, M.K.: Operations Management – Processes and Supply Chains, 9th edn. Pearson Education, Upper Saddle River (2010)
Castillo-Villar, K.K., Smith, N.R., Herbert-Acero, J.F.: Design and optimization of capacitated supply chain networks including quality measures. Math. Probl. Eng. 2014, 1–17 (2014). Article ID 218913
Kram, M., Tošanović, N., Hegedić, M.: Kaizen approach to supply chain management: first step for transforming supply chain into lean supply chain. Ann. Faculty Eng. Hunedoara – Int. J. Eng. 13(1), 161–164 (2015)
Deveci, M., Demirel, N.C.: Airline crew pairing problem: a literature review. In: 11th International Scientific Conference on Economic and Social Development (2015)
Souai, N., Teghem, J.: Genetic algorithm based approach for the integrated airline crew-pairing and rostering problem. Eur. J. Oper. Res. 199, 674–683 (2009)
Trang, N.T.X.: Design an ideal supply chain strategy. Adv. Manage. 9(4), 20–27 (2016)
Hamta, N., Shirazi, M.A., Fatemi Ghomi, S.M.T., Behdad, S.: Supply chain network optimization considering assembly line balancing and demand uncertainty. Int. J. Prod. Res. 53(10), 2970–2994 (2015)
Özener, O.Ö., Matoğlu, M.Ö., Erdoğan, G.: Solving a large-scale integrated fleet assignment and crew pairing problem. Ann. Oper. Res. 253, 477–500 (2016)
Paksoy, T., Ozceylan, E., Gokcen, H.: Supply chain optimization with assembly line balancing. Int. J. Prod. Res. 50, 3115 (2011). https://doi.org/10.1080/00207543.2011.593052
Hammami, R., Frein, Y.: An optimisation model for the design of global multi-echelon supply chains under lead time constraints. Int. J. Prod. Res. 51(9), 2760–2775 (2013)
Rubin, J.: A technique for the solution of massive set covering problems, with application to airline crew scheduling. Transp. Sci. 7(1), 15–34 (1973)
Rushton, A., Croucher, P., Baker, P.: Logistics and Distribution Management, 4th edn. Kogan Page, London (2010). ISBN 978-0-7494-5714-3
Stavrulaki, E., Davis, M.: Aligning products with supply chain processes and strategy. Int. J. Logist. Manage. 21(1), 127–151 (2010)
Paksoy, T., Özceylan, E.: Supply chain optimisation with U-type assembly line balancing. Int. J. Prod. Res. 50(18), 5085–5105 (2012)
Sakalli, U.S.: Optimization of production-distribution problem in supply chain management under stochastic and fuzzy uncertainties. Math. Probl. Eng. 2017, 1–29 (2017). Article ID 4389064
AMIA: Agenda Automotriz: Diálogo con la Industria Automotriz 2018–2024, versión 2018 (2018). http://www.amia.com.mx/boletin/dlg20182024.pdf. Accessed 20 Jan 2019
Monden, Y.: Toyota Production System: An Integrated Approach to Just-In-Time, 4th edn. CRC Press, New York (2011)
Ageron, B., Lavastre, O., Spalanzani, A.: Innovative supply chain practices: the state of French companies. Supply Chain Manage. Int. J. 18(3), 256–276 (2013)
Computer Modeling Selection of Optimal
Width of Rod Grip Header to the Combine
Harvester

Mikhail Chaplygin, Sergey Senkevich(&), and Aleksandr Prilukov

Federal Scientific Agroengineering Center VIM, 1 St Institute Pas. 5,


Moscow 109428, Russia
misha2728@yandex.ru, sergej_senkevich@mail.ru,
chel.diagnost@gmail.com

Abstract. An economic-mathematical model and a computer program that allow simulating the choice of the optimal header size for a combine harvester are presented in this article. Direct harvesting expenses, taking into account the quality of combine harvester work in farm conditions at different speeds, are accepted as the optimization criterion.

Keywords: Combine harvester · Economic efficiency · Header · Computer optimization model · PC software

1 Introduction

The results of the conducted research [1–5] on the rationale for combine harvester parameters show that the main factors determining the effectiveness of choosing a combine harvester fleet, with a rationale for the optimal header width, are the qualitative and quantitative composition of the combine harvester fleet, the standard header size and the optimal harvesting period duration. The optimal ratio of these parameters can be established only by studying the operation of the combine harvester fleet with mathematical modeling methods, because natural experiments require a lot of time and material resources.
The operating experience and the results of comparative tests of combine harvesters in different areas of the country show that the main factor determining the underuse of their capacity in direct combine harvesting is insufficient header width.
Optimizing the harvester header width will allow it to be used effectively at different crop yields.
Research and testing of domestic and foreign combine harvesters in recent years have allowed the development of a refined economic-mathematical model for optimizing the header size for a combine harvester.
The crop yield in the Russian regions varies widely, from 10 to 80 c/Ha (centners per hectare), so there is the question of which header a farm should purchase in order to realize the nominal combine harvester productivity and ensure minimum cash expenditures for bread harvesting.


Increasing the header width leads to a number of positive aspects:
– the working strokes coefficient increases, so the shift performance increases too;
– the harvester operates at low speeds, which reduces grain loss behind the header and thresher; it should also be noted that many scientists in their works suggest installing control systems and other technological and technical solutions on the header and combine harvester that help reduce grain losses [6–12];
– the cost of power and fuel for moving heavy harvesting units at low speeds decreases significantly, which helps realize the nominal productivity;
– the machine operator's work environment improves.
All this leads to an improvement in economic efficiency indicators.
On the other hand, increasing the header width leads to a number of disadvantages:
– the price of the header, and of the combine harvester as a whole, becomes higher, making harvesting more expensive;
– the header weight increases, which requires additional power for self-movement of the harvesting unit;
– soil compaction by heavy harvesting units increases.

2 Materials and Methods

Theoretical methods (system analysis, methods for optimizing the combine harvester fleet composition, optimal header width calculation), logical methods (comparison, generalization and analysis of scientific literature on the research problem) and statistical methods (processing of experimental data) were used. Mathematical modeling, with the development of a deterministic economic-mathematical model and the calculation of the optimal header width for the combine harvester, was used for choosing the header width.
The research objective is to ensure effective operational and technological indicators of the combine harvester by means of a scientifically proven header width.
Research tasks:
– to develop an economic-mathematical model for choosing the header width for a combine harvester, and PC software for the calculations, in relation to two levels of winter wheat productivity;
– to conduct field research of a combine harvester with 12 kg/s throughput, which can work with headers of different widths, with the determination of conditions and of operational and technological study indicators;
– to perform PC calculations for choosing the optimal header width for a combine harvester with 12 kg/s throughput.

3 Discussion Results

The optimal header width is selected according to the thresher bandwidth and the combine harvester engine power in specific zonal conditions. In this regard, a refined economic-mathematical optimization model for the header size is proposed. The structural scheme of the refined economic-mathematical model for optimizing the header size for the combine harvester is shown in Fig. 1.

Fig. 1. Structural diagram of the algorithm for calculating and selecting the header type size for the combine harvester.

The model consists of interconnected blocks that determine the sequence of calculations for choosing the optimal combine harvester header size according to formulas (1)–(17), given in the text below.
The general variant of the winter wheat harvesting organization, using the direct combining technology with grinding of the non-grain crop part, was adopted in order to make the calculations as close as possible to a real economic task.
372 M. Chaplygin et al.

The investigated variable parameter is the working header width ($B_h$), expressed as a function

$$B_h = f(V, G^{op}_{ch}, K_2, N_{mov}, W_{sh}) \qquad (1)$$

The following parameters change depending on the use of different header widths in the combine unit:
– working movement speed $V$, km/h;
– operating combine harvester weight $G^{op}_{ch}$, kg;
– combine harvester working moves coefficient $K_2$;
– power for self-movement $N_{mov}$, h.p.;
– shift combine unit productivity $W_{sh}$, ton/h.
The combine harvester working movement speed with different header widths, km/h, is determined by the formula

$$V = \frac{W^0_n}{0.1 \cdot B_h \cdot U_g} \qquad (2)$$

where $W^0_n$ is the nominal combine harvester productivity, ton/h; $B_h$ is the header width, m; $U_g$ is the grain crop productivity, ton/Ha.
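For instance, with an illustrative $W^0_n = 18$ ton/h, $B_h = 6$ m and $U_g = 5$ ton/Ha, Eq. (2) gives $V = 18/(0.1 \cdot 6 \cdot 5) = 6$ km/h, which falls inside the 3–7 km/h working speed range discussed in the results below.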
The combine harvester operating weight, kg, is determined by

$$G^{op}_{ch} = G_{ch} + G_h + G_g \qquad (3)$$

where $G_{ch}$ is the combine harvester weight without header, kg; $G_h$ is the header weight, kg; $G_g$ is the grain weight (in the grain tank), kg.
The combine harvester working moves coefficient is determined by the formula

$$K_2 = \left(1 + \frac{10^3 \cdot T_2 \cdot W^0_n}{6 \cdot L_r \cdot B_h \cdot U_g}\right)^{-1} \qquad (4)$$

where $T_2$ is the turn time, h; $W^0_n$ is the nominal combine harvester productivity, ton/h; $L_r$ is the rut length, m; $B_h$ is the header width, m; $U_g$ is the crop productivity, ton/Ha.
The power for self-movement, h.p., is calculated by the formula

$$N_{mov} = \frac{G^{op}_{ch} \cdot f_{st} \cdot V}{3.6 \cdot 75} \qquad (5)$$

where $G^{op}_{ch}$ is the combine harvester operating weight, kg; $f_{st}$ is the rolling-over-stubble coefficient, 0.13; $V$ is the operating speed, km/h; 3.6 is the conversion coefficient from km/h to m/s; 75 is the conversion coefficient from kW to h.p.
The combine harvester shift productivity, ton/h, is determined by the formula

$$W_{sh} = W^0_W \left( \frac{1}{K^0_{sh}} + \frac{1}{K_2} - 1 \right)^{-1} \qquad (6)$$

where $W^0_W$ is the productivity (per main-time hour), ton/h; $N_{en}$ is the engine power, h.p.; $K_r$ is the power reserve coefficient; $N_{mov}$ is the power for self-movement, h.p.; $N^{thr}_{sp}$ is the specific power for threshing, h.p.·h/ton; $K^0_{sh}$ is the normative shift time utilization coefficient.
The grain weight in the combine harvester grain tank, kg, is determined by

$$G_g = V_{gt} \cdot \gamma_g \qquad (7)$$

where $V_{gt}$ is the grain tank volume, m³; $\gamma_g$ is the grain density, kg/m³.
The change in grain losses behind the header depending on the working speed, %, was obtained as a result of research [3] in the form

$$H_h = a \cdot V \qquad (8)$$

where $V$ is the working speed, km/h; $a$ is the coefficient of the linear dependence of grain losses behind the header on working speed, established experimentally.
The power for threshing, h.p., is determined by the formula

$$N_{thr} = N_{en} \cdot K_r - N_{mov} \qquad (9)$$

where $N_{en}$ is the combine harvester engine power, h.p.; $K_r$ is the power reserve coefficient; $N_{mov}$ is the power for self-movement, h.p.
The total harvesting cost, RUB/ton, taking into account the quality of header work at different speeds, is taken as the optimization criterion:

$$C_h = S + F + R + D + L_h \qquad (10)$$

where $S$ is the machine operators' salary, RUB/ton; $F$ is the fuel cost, RUB/ton; $R$ is the repair cost, RUB/ton; $D$ is the depreciation cost, RUB/ton; $L_h$ is the wastage due to the volume of grain losses behind the header, RUB/ton.
The machine operators' salary is determined by the formula

$$S = \frac{T \cdot s}{W_{sh}} \qquad (11)$$

where $s$ is the hourly machine operator wage, RUB/h; $T$ is the number of machine operators; $W_{sh}$ is the combine harvester shift productivity, ton/h.
The specific fuel consumption for movement, kg/ton, is

$$q_{mov} = \frac{N_{mov} \cdot K_2}{W^0_W} \qquad (12)$$

where $N_{mov}$ is the power for movement, h.p.; $W^0_W$ is the productivity per hour, ton/h; $K_2$ is the combine harvester working strokes coefficient.
The fuel cost, RUB/ton, is determined as

$$F = (q_{thr} + q_{mov}) \cdot P_f \qquad (13)$$

where $q_{thr}$ is the specific fuel consumption for threshing the bread mass, kg/ton, determined as

$$q_{thr} = \frac{q_{mov} \cdot N_{mov}}{W_{sh}} \qquad (14)$$

$q_{mov}$ is the specific fuel consumption for self-movement, kg/ton; $P_f$ is the diesel fuel price, RUB/kg.
The repair cost, RUB/ton, is determined by the formula

$$R = \frac{(P_{ch} + P_h) \cdot s_r}{W_{sh} \cdot T_{zon}} \qquad (15)$$

where $P_{ch}$ is the cost of the combine harvester without header, thous. RUB; $P_h$ is the header cost, thous. RUB; $T_{zon}$ is the zonal combine harvester workload at harvesting, h; $s_r$ is the repair cost rate, %.
The depreciation cost, RUB/ton, is determined by:

$$D = \frac{(P_{ch} + P_h) \cdot d}{W_{sh} \cdot T_{zon}} \qquad (16)$$

where $P_{ch}$ is the combine harvester cost, RUB; $P_h$ is the header cost, RUB; $d$ is the depreciation rate, %; $W_{sh}$ is the shift productivity, ton/h; $T_{zon}$ is the zonal combine harvester workload at harvesting, h.
The wastage due to the volume of grain losses behind the header, RUB/ton, is determined by the formula
$$L_h = \frac{U_g \cdot H_h \cdot P_g}{100} \qquad (17)$$

where $H_h$ is the grain losses behind the header, %; $P_g$ is the commercial grain selling price, RUB/ton.
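The whole calculation chain (2)–(17) is easy to script for a quick plausibility check. The Python sketch below uses illustrative input figures rather than the paper's data; the main-time productivity $W^0_W$ and the engine-specific fuel consumption factor q_spec are assumed forms, since Eqs. (6), (12) and (14) leave them partly implicit.

```python
# A condensed sketch of the cost chain (2)-(17) for one header width.
# All numeric inputs are illustrative assumptions, not the paper's data;
# W0 (main-time productivity) and q_spec are assumed forms.
def direct_costs(Bh, Ug, Wn0=18.0, Gch=16000.0, Gh=2000.0, Vgt=9.0,
                 gamma=750.0, T2=0.02, Lr=2000.0, fst=0.13, Nen=400.0,
                 Kr=0.85, Nsp=18.0, Ksh0=0.8, a=0.12, T=1, s=300.0,
                 q_spec=0.18, Pf=50.0, Pch=8.0e6, Ph=1.5e6, sr=0.04,
                 dep=0.07, Tzon=240.0, Pg=12000.0):
    V = Wn0 / (0.1 * Bh * Ug)                          # (2) speed, km/h
    Gop = Gch + Gh + Vgt * gamma                       # (3), (7) weight, kg
    K2 = 1.0 / (1.0 + 1e3 * T2 * Wn0 / (6.0 * Lr * Bh * Ug))   # (4)
    Nmov = Gop * fst * V / (3.6 * 75.0)                # (5) self-movement, h.p.
    W0 = (Nen * Kr - Nmov) / Nsp                       # assumed main-time prod.
    Wsh = W0 / (1.0 / Ksh0 + 1.0 / K2 - 1.0)           # (6) shift prod., ton/h
    Hh = a * V                                         # (8) header losses, %
    S = T * s / Wsh                                    # (11) salary, RUB/ton
    q_mov = q_spec * Nmov * K2 / W0                    # (12), q_spec assumed
    q_thr = q_spec * (Nen * Kr - Nmov) / Wsh           # (14), q_spec assumed
    F = (q_thr + q_mov) * Pf                           # (13) fuel, RUB/ton
    R = (Pch + Ph) * sr / (Wsh * Tzon)                 # (15) repair, RUB/ton
    D = (Pch + Ph) * dep / (Wsh * Tzon)                # (16) depreciation
    Lh = Ug * Hh * Pg / 100.0                          # (17) loss wastage
    return S + F + R + D + Lh                          # (10) total, RUB/ton

for Bh in (6.0, 7.0, 9.0):
    print(f"Bh = {Bh} m -> {direct_costs(Bh, Ug=5.0):.1f} RUB/ton")
```

Running it for 6, 7 and 9 m reproduces the qualitative trend of Table 1 below (wider headers give lower direct costs), though not its exact figures, which depend on the farm data.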
Let us assume that some farm has a combine harvester that operates in a set work mode with specified normative indicators. The manufacturer produces headers of different widths for this combine harvester model. We accept the following condition: the winter wheat productivity has two levels, the achieved 5.0 ton/Ha and the predicted 7.0 ton/Ha.
The indicators are calculated taking into account certain harvesting characteristics and combine harvester productivity; after that, the shift productivity is determined and the harvesting cost is calculated for the different header width variants. This solves the first task. However, answering the question of how many combine harvesters are required for a given farm, with a certain winter wheat area and productivity, requires the calculation sequence of the last block.
The economic-mathematical model and the PC software named «ECO-Adapter» were developed to solve this task [13, 14].
Software includes two guide: header and combine guides. The guides contains
information about technical tools, cost and another combine harvesters and headers
characteristics. Command « Task – New » from main window menu is for making
new task. Next, window « Add new task » appears. There must be additional source
task data:
– fuel cost RUB/kg;
– crop productivity, ton/ha;
– grain density, kg/m3;
– typical rut length, m;
– turn time, min;
– rolling over stubble coefficient;
– using shift time coefficient;
– hourly machine operators wage, RUB/h;
– repair cost, %;
– depreciation cost, %;
– commercial grain selling price, RUB/ton.
Task report is generating and displayed in Excel. It can be formed and printed by
menu command « Report – Task » in main form.
Tasks with equal crop productivity values and different headers (different width),
the same combine harvester is aggregated by those, are selected for comparative
analysis. Selected tasks are named « VARIANT 1 » , « VARIANT 2 » , « VAR-
IANT 3 » or « VARIANT 4 » by pushing right mouse button on the needful task line
and by choosing variant number from context menu. Selected variants are displayed at
the lower part of main window in yellow rectangle. Comparative analysis report is
376 M. Chaplygin et al.

forming by menu command « Report – Comparative analysis » , after that it is dis-


playing in Excel, now it is available for editing, saving and printing.
The header standard-size optimization was performed for the TORUM 740 combine harvester, manufactured by Rostselmash Combine Plant, LLC, for header widths of 6, 7 and 9 m and winter wheat productivities of 5.0 and 7.0 ton/Ha. The optimization results are shown in Table 1.

Table 1. Results of optimization.

No. | Header width, m | Direct costs, RUB/ton (5.0 ton/Ha) | Direct costs, RUB/ton (7.0 ton/Ha)
1   | 6               | 877.3                              | 923.7
2   | 7               | 829.5                              | 870.1
3   | 9               | 740.3                              | 783.8

The calculation results can be presented as a graph, taking into account the entered optimal working speed range limit of 3 to 7 km/h (Fig. 2). The choice of the optimal working speed range limit is based on the research results reviewed in [15].

Fig. 2. Selection of the optimal width of the header, taking into account operating speed limits.

The graph shows that the optimal combine thresher workload, with minimal grain loss and crushing values, is provided by choosing the optimal header width in the crop productivity range 2.5–9.5 ton/Ha, subject to the entered working speed range limit of 3–7 km/h. Within this working speed limit, the thresher workload will be optimal for a 6 m header at a crop productivity of 4.6–9.8 ton/Ha, for 7 m at 4.2–9.0 ton/Ha and for 9 m at 3.0–7.6 ton/Ha. All this indicates that the 7 and 9 m headers cover a bigger crop productivity range.
A forward-looking calculation for a 12 m header shows that its use with a combine harvester of 12 kg/s capacity will be optimal, at the working-speed-limited workload, at a crop productivity of 2.5–5.6 ton/Ha. Under the given conditions, the combine harvester fully realizes its throughput potential. To work with a larger productivity range, the engine power of the unit must be increased.

4 Conclusions

Analysis of the obtained results allows the following conclusion: a high-efficiency combine harvester of the TORUM 740 type should be equipped with a 9 m header for bread harvesting by direct combining at 5.0 ton/Ha and 7.0 ton/Ha crop productivity. This will decrease harvesting costs, in comparison with the 6.0 m and 7.0 m headers, by 11–16% at 5.0 ton/Ha crop productivity and by 15% at 7.0 ton/Ha.
The economic-mathematical model for choosing the optimal header width differs from those previously developed in that:
– operational and economic indicators are taken into account when choosing the optimal header width;
– the power for self-movement, which has a significant effect on bread mass threshing, is taken into account in the calculations;
– grain losses behind the header are taken into account.
Multi-year accounting data on harvesting operations for a number of Southern Federal District farms show that costs are in the range from 748.4 to 890.5 RUB/ton, so the average calculation error does not exceed 15%. This proves the correctness of the optimization model and its suitability for modeling the optimal header width at winter wheat harvesting for two crop productivity levels: 5 and 7 ton/Ha.
Combine harvesting with the optimal header width will decrease grain losses behind the header, the grain injury rate, and the power and fuel costs of the heavy harvesting units used to realize the nominal productivity; this optimal header width will also improve the machine operator's work environment.
The use of the proposed economic-mathematical model for choosing the optimal header width will help to rationally choose a fleet of harvesting machines from the range of headers of different widths offered by manufacturers, taking into account the features and needs of the farm.
Further studies will be aimed at studying the effect of the header width on grain loss and of the spread width of crushed straw on the width of the passage, as well as at the development of new computer programs and test equipment.

References
1. Zhalnin, E.V.: Matematicheskoye modelirovaniye protsessov zemledel’cheskoy mekhaniki
[Mathematical modeling of agricultural mechanics processes]. Tractors Agric. Mach. 1, 20–
23 (2000). (in Russian)
2. Lipkovich, E.I.: Osnovy matematicheskogo modelirovaniya sistemy mashin [Fundamentals
of mathematical modeling of a machine system]. Povysheniye effektivnosti uborochnykh
rabot. VNIPTIMESKH. Zernograd, pp. 13–22 (1984) (in Russian)
3. Tabashnikov, A.T.: Optimizatsiya uborki zernovykh i kormovykh kul’tur [Optimization of
harvesting of grain and forage crops]. Agropromizdat. Moscow, p 159 (1985) (in Russian)
4. Kavka, M., Mimra, M., Kumhala, F.: Sensitivity analysis of key operating parameters of
combine harvesters. Res. Agric. Eng. 62(3), 113–121 (2016). https://doi.org/10.17221/48/
2015-rae
5. Badretdinov, I., Mudarisov, S., Lukmanov, R., Permyakov, V.: Mathematical modeling and
research of the work of the grain combine harvester cleaning system. Comput. Electron.
Agric. 165, 104966 (2019). https://doi.org/10.1016/j.compag.2019.104966
6. Šotnar, M., Pospíšil, J., Mareček, J., Dokukilová, T., Novotný, V.: Influence of the combine harvester parameter settings on harvest losses. Acta Technol. Agric. 21(3), 105–108 (2018). https://doi.org/10.2478/ata-2018-0019
7. Chen, J., Wang, S., Lian, Y.: Design and test of header parameter keys electric control
adjusting device for rice and wheat combined harvester. Trans. Chin. Soc. Agric. Eng. 34
(16), 19–26 (2018). https://doi.org/10.11975/j.issn.1002-6819.2018.16.003
8. Liu, H., Reibman, A.R., Ault, A.C., Krogmeier, J.V.: Video-based prediction for header-
height control of a combine harvester. In: IEEE Conference on Multimedia Information
Processing and Retrieval (MIPR), San Jose, CA, USA, pp. 310–315 (2019). https://doi.org/
10.1109/mipr.2019.00062
9. Zhang, K., Shen, H., Wang, H., et al.: Automatic monitoring system for threshing and
cleaning of combine harvester. IOP Conference Series: Materials Science and Engineering.
452(4), p. 042124 (2018). https://doi.org/10.1088/1757-899X/452/4/042124
10. Shepelev, S.D., Shepelev, V.D., Almetova, Z.V., et al.: Modeling the technological process
for harvesting of agricultural produce. In: IOP Conference Series: Earth and Environmental
Science. 115(1), p. 012053 (2018). https://doi.org/10.1088/1755-1315/115/1/012053
11. Almosawi, A.A.: Combine harvester header losses as affected by reel and cutting indices.
Plant Archives. 19, 203–207 (2019). http://www.plantarchives.org/PDF%20SUPPLEMENT
%202019/33.pdf
12. Zhang, Y., Chen, D., Yin, Y., Wang, X., Wang, S.: Experimental study of feed rate related
factors of combine harvester based on grey correlation. IFAC-PapersOnLine 51(17), 402–
407 (2018). https://doi.org/10.1016/j.ifacol.2018.08.188
13. Reshettseva, I.A., Tabashnikov, A.T., Chaplygin, M.E.: Certificate of state registration of the
computer program “ECO-Adapter” No. 2015613469. Registered in the Program Registry
03/17/2015. (in Russian)
14. Chaplygin, M.Y.: Ekonomiko-matematicheskaya model’ optimizatsii tiporazmera khedera k
zernouborochnomu kombaynu [Economic-mathematical model for optimizing the size of a
header to a combine harvester]. Machinery and Equipment for Rural Area. (2), pp. 23–24
(2012) (in Russian)
15. Chaplygin, M.Y.: Povishenie effektivnosti ispolzovaniya zernouborochnogo kombaina
putem obosnovaniya optimalnoi shirini zahvata jatki dlya uslovii yuga Rossii [Improving the
efficiency of the combine harvester by justification the optimal header width for the
conditions in southern Russia]. (Dissertation Candidate of Technical Sciences). Volgograd
State Agrarian University, Moscow (2015) (in Russian)
An Integrated CNN-LSTM Model for Micro
Hand Gesture Recognition

Nanziba Basnin1(&) , Lutfun Nahar1 ,


and Mohammad Shahadat Hossain2
1
International Islamic University Chittagong, Chittagong, Bangladesh
2
University of Chittagong, Chittagong, Bangladesh
hossain_ms@cu.ac.bd

Abstract. Vision-based micro gesture recognition systems enable the development of HCI (Human-Computer Interaction) interfaces that mirror real-world experiences. It is unlikely that one gesture recognition method will be suitable for every application, as each gesture recognition system relies on the user's cultural background and application domain. This research is an attempt to develop a micro gesture recognition system suitable for Asian culture. However, hands vary in shape and size, while gestures vary in orientation and motion. For accurate feature extraction, deep learning approaches are considered. Here, an integrated CNN-LSTM (Convolutional Neural Network – Long Short-Term Memory) model is proposed for building a micro gesture recognition system. To demonstrate the applicability of the system, two micro hand gesture datasets, namely a standard and a local dataset, each consisting of ten significant classes, are used. Besides, the model is tested against both augmented and unaugmented datasets. The accuracy achieved for the standard data with augmentation is 99.0%, while the accuracy achieved for the local data with augmentation is 96.1% when applying the CNN-LSTM model. For both datasets, the proposed CNN-LSTM model appears to perform better than other pre-trained CNN models, including ResNet, MobileNet, VGG16 and VGG9, as well as CNN excluding LSTM.

Keywords: CNN-LSTM model · Augmented dataset · Unaugmented dataset · Micro hand gesture · Standard dataset · Local dataset

1 Introduction

Gesture symbolizes a posturized instance of the body through which some information is conveyed [3]. Gestures are usually categorized as macro and micro: a macro gesture demonstrates the perpetuating motion of the hand in coordination with the body, while a micro gesture pictorializes the relative position of the fingers of the hand. This research makes use of micro gestures with static images. Hand gesture recognition systems are generally used to narrow the gap between human and computer interactions (HCI). Human interactions through hand gestures sometimes involve one hand or both hands. An interactive machine which successfully mimics the natural way of human hand interaction can be developed [16]. The research presented in this paper is an attempt to

develop a classification tool, enabling the accurate recognition of both single and
double hand gestures. Thus, this paper focuses on building an effective tool for the
recognition of hand gesture, by introducing an integrated [17] CNN-LSTM (Convo-
lutional Neural Network - Long Short Term Memory) model. The CNN model is
considered [13] because it can handle large amount of raw data with comparatively less
pre-processing effort [2], while LSTM can better optimize the back propagation.
Therefore, the integration of CNN with LSTM [4] would play an important role to
achieve better model accuracy [1]. This research is further enhanced by investigating
the impacts of data augmentation on the proposed CNN-LSTM model. To perform
these two different datasets, namely standard and local with augmentation as well as
without augmentation have been used to train the CNN-LSTM model. It is important to
note that the local dataset contains classes of gesture of both hands such as ‘dua’,
‘namaste’, ‘prayer’ and ‘handshake’ which are unique in the context of Indian sub-
continent. The accuracy of the proposed CNN-LSTM model has been compared with
LSTM excluded CNN model as well as with state-of-the-art pre-trained CNN models,
including ResNet, Mobile Net, VGG9 and VGG16 by taking account of both aug-
mented and unaugmented datasets. The real time validation against local dataset has
also been carried out, this will be discussed in result section. To check the overfitting
and underfitting aspects of the CNN-LSTM model four folds cross-validation has also
been carried out. The remaining of the paper is structured as follows. Section 2
analyses the different works related to hand gesture recognition. Section 3 outlines the
methodology undertaken in order to develop the model. Section 4 presents the results,
while Sect. 5 concludes the paper with a discussion on future research.

2 Related Work

There exists a plethora of research works in the area of hand gesture recognition systems. In [14] different classes of hand gestures were classified using CNN on augmented and unaugmented datasets. The augmented dataset produced an accuracy of 97.18%, while the unaugmented dataset produced an accuracy of 92.91%. However, the recognition of gestures made with both hands was not considered in this research. In [19] Artificial Neural Networks (ANN) were used to classify ten different categories of hand gestures with an average accuracy of 98%. Due to the low computational cost of identifying the different categories of hand gestures, this method became a good candidate for real-time execution. However, the use of ANN requires pre-processing of raw data and hence becomes computationally expensive. Another effective method [20] was proposed to recognize American Sign Language hand gestures for 26 different classes. Principal Component Analysis (PCA) was used to extract features from the dataset. These extracted features were later fed into an ANN for training. This method produced an accuracy of 94.3%. In spite of this accuracy, the feature extraction technique applied in that paper was unable to extract features based on depth and hand direction. In [21] twenty-six different classes of an American Sign Language hand gesture dataset were trained using three different CNN models, the number of hidden layers of each model increasing over the previous one by one. It was observed that increasing the number of hidden layers decreases the recognition rate of the CNN model while increasing its run time. Consequently, testing accuracy appears to fall steeply from 91% to 83%. In another approach [24], a self-organizing and self-growing neural gas network, along with a YCbCr color-space method for hand region detection, was proposed. The data used in that paper consist of six different classes of gestures and, after classification, produce an accuracy of 90.45%. Although it is fast to compute, the hand region detection method sometimes produces wrong feature detections. In [27] two different datasets were utilized, namely a self-constructed one and the Cambridge Hand Gesture Dataset. Each dataset consists of 1500 images, which are categorized into five different classes. The self-constructed dataset was acquired from images of both left and right hands separately. The dataset was preprocessed using the Canny edge method, which removed the effect of illumination. In this study, a CNN architecture of five convolution layers was used to train each dataset; the method produced an accuracy of 94% and was found suitable for recognizing complex gestures. It can be observed that none of the studies discussed above considered the classification of gestures of both hands, nor context-based gestures such as 'dua', 'namaste', 'handshake' and 'prayer', resulting in inefficient nonverbal human-machine interaction. In addition, the use of CNN models produces better accuracy than other machine learning models such as ANN [19]. However, layering of CNNs introduces problems with saturated neurons and vanishing gradients, which can correspond to a poor testing accuracy [21]. Such a drawback of CNN models can be overcome by integrating them with a model like LSTM, because it reduces the saturation of neurons as well as overcoming the vanishing gradient problem by optimizing backpropagation. Therefore, in this research an integrated CNN-LSTM model is proposed, which is described in the following section.

3 Methodology

Figure 1 illustrates the main components of the methodology used in this research to identify the different classes of hand gestures mentioned previously. A brief discussion of each of the components is presented below.

Fig. 1. System Framework



3.1 Data Collection


As mentioned, ins Sect. 1 before, this research utilizes two datasets namely standard
and local. The standard dataset is collected from a publicly available domain [10] as
shown in Fig. 2. It is used as a benchmark dataset which requires no preprocessing.
Two datasets are introduced in order to highlight a comparison between standard
dataset and the locally collected dataset. Moreover, this will not only aid to deduce how
the accuracy varies between double handed gestures and single-handed gestures but
also recognize how effective data pre-processing on the local dataset is in terms of
accuracy. So, the size of the standard dataset is kept similar to the size of the local
dataset. As it consists 10,000 images, which are divided into ten different classes of
hand gestures namely fist, thumb, palm, palm moved, okay, index, fist move, down, c
and L. Each class consists of 1000 images. The model uses 7000 images for training
while 3000 images for testing. The local dataset was collected using a webcam. Ten
people, with ethical approval, served as subjects to build the dataset. Likewise, the local
dataset consists of 10,000 images. The sample of the local dataset covers ten gesture
classes. Out of these ten classes five are associated with double hand gestures as
illustrated in Fig. 3. These five classes of gestures namely dua, handshake, namaste,
pray and heart are gestures used in the Indian sub-continent. The remaining five classes
are single-hand gestures, consisting of palm, fist, one, thumb and okay, as can
be seen from Fig. 3. In this way, the local dataset demonstrates a heterogeneous
combination of hand gesture classes, which cannot be found in the standard dataset.
Each class consists of 1000 images. As with the standard dataset, the model uses 7000
images from the local dataset for training and 3000 images for testing.

Fig. 2. Sample of Standard Dataset

Fig. 3. Sample of Local Dataset



3.2 Data Pre-processing


Since the standard dataset is already pre-processed, it can be passed directly into the
CNN-LSTM model. To pre-process the local dataset, the steps shown in Fig. 4 are
followed. Firstly, in order to extract the foreground, the background of the image is
subtracted. Secondly, gray-scale conversion is applied to the image; the single-channel
property of a gray-scale image helps the CNN model attain a faster learning rate [9].
Thirdly, morphological erosion is applied [12]. Fourthly, the median filtering method
reduces the noise in the image [28]. Finally, for the convenience of the CNN-LSTM
model, the images are resized to 60 × 60 pixels.

Fig. 4. Steps of Data Pre-Processing
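
For illustration, a minimal OpenCV sketch of these five steps is given below. It is a hedged reconstruction, not the authors' code: the choice of the MOG2 background subtractor (cf. the adaptive method in [30]), the 3 × 3 erosion kernel and the 5-pixel median filter are assumptions, since the paper does not specify them.

import cv2
import numpy as np

# Assumed background-subtraction model; [30] describes an adaptive per-pixel method.
subtractor = cv2.createBackgroundSubtractorMOG2()

def preprocess(frame):
    fg_mask = subtractor.apply(frame)                    # 1. subtract the background
    fg = cv2.bitwise_and(frame, frame, mask=fg_mask)     #    keep only the foreground
    gray = cv2.cvtColor(fg, cv2.COLOR_BGR2GRAY)          # 2. gray-scale conversion [9]
    eroded = cv2.erode(gray, np.ones((3, 3), np.uint8))  # 3. morphological erosion [12]
    denoised = cv2.medianBlur(eroded, 5)                 # 4. median filtering [28]
    return cv2.resize(denoised, (60, 60))                # 5. resize to 60 × 60 pixels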

3.3 CNN-LSTM Integrated Model


The proposed CNN-LSTM model comprises four convolution layers, two max pooling
layers, a time-distributed flattening layer, one hidden LSTM layer and an output layer.
A dropout layer is added after every two convolution layers, and another after the
LSTM layer. The dropout layers help address the over-fitting issue [23] by deactivating
nodes; for example, a dropout value of 0.25 deactivates 25% of the nodes. The first
input layer consists of 32 filters with a kernel size of 5 × 5. Zero padding is also added
to this input layer in order to attach a
border of zeros to the input volume. The convolution layer generates a feature map
which is fed into the next layer. In each convolution layer of the network, the activation
function Rectified Linear Unit (ReLU) is used. Rectified Linear Unit is chosen because
it not only outperforms the other common activation functions, including sigmoid and
tanh, but also avoids the vanishing gradient problem. ReLU does this by introducing
non-linearity to the linear process of convolution. A max pooling layer is added after
two convolution layers in the model. The layer works by extracting the maximum
features from the feature map and fitting each feature into a reduced window of
2 × 2 dimensions. This not only narrows down the parameters required to train the model
but also retains the most important features in the feature map. Next, a flattening layer
is introduced after the convolution layers. This layer flattens the image into a linear
array so that it can easily be fed into the neural network. The CNN layers are integrated
with an LSTM layer of 200 nodes. While the CNN layers extract features from
an image, the LSTM layer supports them by interpreting those features at each
distributed time stamp. These interpretations give the model exposure to the complex
temporal dynamics of imaging [6]. As a result, the perceptual representation of images in the
convolutions become well defined. All the layers are connected to the output layer
which consists of ten nodes that indicate ten different classes of hand gestures used to
train the model. Afterwards, an activation function SoftMax is included in this layer to
obtain a probabilistic value for each of the classes. SoftMax is used in the very last
layer of the network to provide a nonlinear variant of multinomial logistic regression.
Stochastic Gradient Descent (SGD) with a learning rate of 0.01 is used as the optimizer
to compile the model. SGD works well with large datasets, as it is computationally
faster [7], and hence reduces the model loss. Table 1
summarizes the various components of the CNN-LSTM model.

Table 1. CNN-LSTM integrated configuration

Model content              Details
1st convolution layer 2D   Input size 60 × 60, 32 filters of kernel size 5 × 5, zero padding, ReLU
2nd convolution layer 2D   64 filters of kernel 3 × 3, ReLU
1st max pooling layer      Pool size 2 × 2
Dropout layer              Randomly deactivates 25% of neurons
3rd convolution layer 2D   64 filters of kernel 3 × 3, same padding, ReLU
4th convolution layer 2D   64 filters of kernel 3 × 3, ReLU
2nd max pooling layer      Pool size 2 × 2
Dropout layer              Randomly deactivates 25% of neurons
Flattening layer           Time distributed
LSTM layer                 200 nodes
Dropout layer              Randomly deactivates 25% of neurons
Output layer               10 nodes, SoftMax
Optimization function      Stochastic Gradient Descent (SGD)
Learning rate              0.01
Metrics                    Loss, Accuracy
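
For illustration only, a minimal Keras sketch of the configuration in Table 1 is shown below. The sequence length, the single gray-scale input channel and the categorical cross-entropy loss are assumptions, since the paper specifies the layers but not the full input pipeline.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, Dropout,
                                     TimeDistributed, Flatten, LSTM, Dense)
from tensorflow.keras.optimizers import SGD

TIME_STEPS = 1  # assumption: each 60 x 60 gray-scale frame is one time step

model = Sequential([
    # 1st convolution layer: 32 filters, 5 x 5 kernel, zero ("same") padding, ReLU
    TimeDistributed(Conv2D(32, (5, 5), padding='same', activation='relu'),
                    input_shape=(TIME_STEPS, 60, 60, 1)),
    TimeDistributed(Conv2D(64, (3, 3), activation='relu')),   # 2nd convolution layer
    TimeDistributed(MaxPooling2D(pool_size=(2, 2))),          # 1st max pooling layer
    Dropout(0.25),                                            # deactivates 25% of nodes
    TimeDistributed(Conv2D(64, (3, 3), padding='same', activation='relu')),
    TimeDistributed(Conv2D(64, (3, 3), activation='relu')),   # 4th convolution layer
    TimeDistributed(MaxPooling2D(pool_size=(2, 2))),          # 2nd max pooling layer
    Dropout(0.25),
    TimeDistributed(Flatten()),                               # time-distributed flattening
    LSTM(200),                                                # LSTM layer of 200 nodes
    Dropout(0.25),
    Dense(10, activation='softmax'),                          # ten gesture classes
])
model.compile(optimizer=SGD(learning_rate=0.01),
              loss='categorical_crossentropy',  # assumed loss; paper reports loss/accuracy
              metrics=['accuracy'])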

3.4 System Implementation


The data collection and real-time validation of the model are carried out in the Jupyter
Notebook IDE. The training module is developed in Google Colab. This platform
facilitates the execution of programs at run time and supports deep learning
libraries by providing access to a powerful GPU (Graphics Processing Unit) and TPU
(Tensor Processing Unit); moreover, the TPU comes with faster throughput. Python is
supported in both the Jupyter and Google Colab environments. The libraries required for
implementation include TensorFlow, Keras, OpenCV, sklearn, Matplotlib, Tkinter,
PIL, Pandas and NumPy. Here, TensorFlow acts as the backend of the system while
Keras is used to build the CNN-LSTM classifier, since it provides built-in functions for
layering, optimization and activation. Further, Keras contains advanced tools for
feature extraction. OpenCV is required for image processing [18]. On the other hand,
Sklearn gives access to many supervised and unsupervised algorithms. Matplotlib is
used to generate the confusion matrix. Tkinter supports easier configuration tools for
the user interface, required for real-time data collection and validation. PIL is an
integration tool for image processing, while Numpy is used to carry out the operations
of arrays [8]. Callbacks are used while training the model. Callbacks not only prevent
the overfitting that occurs as a result of too many epochs but also help avoid underfit
models [11]. The callbacks implemented in this model include checkpoints, early
stopping and reducing the learning rate on plateau. Checkpoints allow saving the best
models by monitoring the validation loss. Once the model stops performing well on
the validation dataset, early stopping is applied to stop the training epochs. Reducing
the learning rate on plateau is used when the validation loss shows no further
improvement. Data augmentation is applied directly to the CNN-LSTM model through
the Keras built-in data augmentation API, which generates additional data while
training is carried out on the model [5, 29]. Transformations, namely shifting, rotation,
normalization and zoom, are applied to increase the robustness of the dataset. The
augmented dataset thereby expands the number of data points, reducing the gap
between the training and testing datasets and thus decreasing overfitting on the
training dataset.
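
A minimal sketch of these callbacks and of the augmentation API is shown below; the monitored quantity, patience values and transformation ranges are illustrative assumptions, as the paper does not report them.

from tensorflow.keras.callbacks import (ModelCheckpoint, EarlyStopping,
                                        ReduceLROnPlateau)
from tensorflow.keras.preprocessing.image import ImageDataGenerator

callbacks = [
    ModelCheckpoint('best_model.h5', monitor='val_loss',
                    save_best_only=True),             # checkpoint the best model
    EarlyStopping(monitor='val_loss', patience=10),   # stop when validation stalls
    ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                      patience=5),                    # reduce learning rate on plateau
]

# Shifting, rotation, normalization and zoom, as listed in the text (ranges assumed).
augmenter = ImageDataGenerator(width_shift_range=0.1, height_shift_range=0.1,
                               rotation_range=15, zoom_range=0.1, rescale=1 / 255.0)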

3.5 Cross Validation


Cross-validation is carried out to assess the quality of the CNN-LSTM model when its
accuracy is greater than or equal to 99.0%. Four-fold cross-validation is performed:
the dataset is divided into four equal folds (i.e. k-1, k-2, k-3, k-4), with 2,500 images
used for testing and 7,500 images for training in each experiment. For instance, in
experiment 1 of Fig. 5 the first fold consists of 2,500 images for testing while the
remaining three folds together provide 7,500 images for training.

Fig. 5. Demonstration of Cross Validation
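
As an illustration of the split in Fig. 5, the following hedged scikit-learn sketch performs the same four-fold partition; the placeholder arrays stand in for the 10,000-image dataset.

from sklearn.model_selection import KFold
import numpy as np

images = np.zeros((10000, 60, 60, 1))   # placeholder for the pre-processed images
labels = np.zeros(10000, dtype=int)     # placeholder for the class labels

for fold, (train_idx, test_idx) in enumerate(KFold(n_splits=4).split(images), 1):
    # 7,500 images for training and 2,500 for testing per experiment, as in Fig. 5
    print(f'experiment {fold}: train={len(train_idx)}, test={len(test_idx)}')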



4 Result and Discussion

This section investigates the application of the developed CNN-LSTM model, taking
account of both the standard and local datasets introduced in Sect. 3.

4.1 CNN-LSTM Model on Standard Dataset

Fig. 6. Confusion Matrix
Fig. 7. Model Accuracy
Fig. 8. Model Loss

Figure 6 illustrates the confusion matrix generated by applying the CNN-LSTM model
on the augmented standard dataset. Taking account of the results presented in this
figure, the accuracy of the hand gesture class named ‘Palm Moved’ is calculated as
99.2% by applying the equation mentioned in [26].
Figures 7 and 8 illustrate the model accuracy and model loss of the augmented
standard dataset under the CNN-LSTM model. It can be seen from both curves
that an early stop at 68 epochs has been achieved. The training and testing
accuracies achieved are 99.1% and 99.2% respectively. Hence, it can be concluded
that the model is well fit.

4.2 CNN-LSTM Model on Local Dataset

Fig. 9. Confusion Matrix
Fig. 10. Model Accuracy
Fig. 11. Model Loss

Figure 9 illustrates the confusion matrix generated by applying the CNN-LSTM model
on the augmented local dataset. Taking account of the results presented in this figure,
the accuracies of the double-hand gesture classes named ‘dua’, ‘namaste’, ‘heart’ and
‘handshake’ are calculated as 90.0%, 99.3%, 93.8% and 99.8% respectively, by
applying the equation mentioned in [26]. Figures 10 and 11 illustrate the model
accuracy and model loss of the augmented local dataset under the CNN-LSTM model.
It can be seen from both curves that an early stop at 54 epochs has been achieved.
The training and testing accuracies achieved are 95.0% and 96.1% respectively.
Hence, it can be concluded that the model is well fit.

4.3 Comparison of Result

Table 2. Classification result of CNN-LSTM model on augmented datasets


Performance measure Standard dataset Local dataset
Precision 0.9909 0.8444
Recall 0.9907 0.8110
F1-score 0.9908 0.8224

Table 3. Classification result of CNN-LSTM model on unaugmented datasets


Performance measure Standard dataset Local dataset
Precision 0.7298 0.6894
Recall 0.6559 0.6133
F1-score 0.6909 0.6492

To evaluate each model used in this research, performance metrics such as precision,
recall, F1-score and accuracy are used. These metrics calculate performance using
parameters such as TP (true positives) and FP (false positives). Precision tries to
reduce the false positive values, in order to induce a precise model. Recall mainly
focuses on reducing the false negative values. F1-score balances the false positive and
false negative values. Accuracy focuses solely on the true positive and true negative
values, in order to correctly identify the classes. Table 2 provides the classification
results of the CNN-LSTM model for the augmented
data of both the standard and local datasets. Table 3 provides the classification results
of the CNN-LSTM model for the unaugmented data of both datasets. From both tables
it can be observed that the evaluation metric values for augmented data are much
higher than those for unaugmented data.
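
For illustration, these metrics can be computed with sklearn, one of the libraries listed in Sect. 3.4; y_true and y_pred below are hypothetical placeholders for the test labels and model predictions, and macro averaging is an assumption.

from sklearn.metrics import (precision_score, recall_score,
                             f1_score, accuracy_score)

y_true = [0, 1, 2, 2, 1]   # hypothetical ground-truth class indices
y_pred = [0, 1, 2, 1, 1]   # hypothetical predicted class indices

print('Precision:', precision_score(y_true, y_pred, average='macro'))
print('Recall:   ', recall_score(y_true, y_pred, average='macro'))
print('F1-score: ', f1_score(y_true, y_pred, average='macro'))
print('Accuracy: ', accuracy_score(y_true, y_pred))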
Table 4 illustrates a comparison between the proposed CNN-LSTM model and four
pre-trained CNN models, namely ResNet, VGG9, VGG16 and MobileNet, by taking
account of the augmented local dataset. In addition, it also illustrates the comparison
between the CNN-LSTM model and the CNN model excluding LSTM on the same
augmented local dataset. It can be observed from the values of the evaluation metrics
against ResNet that the model performs poorly. It can also be observed that none of the
pre-trained CNN models are well fit because testing accuracy for VGG9 and VGG16
are much higher than their training accuracies, meaning they are underfitted. On the
other hand, the testing accuracy of MobileNet is much lower than its training accuracy,
meaning it is overfitted. The CNN model excluding LSTM also appears overfitted,
since its testing accuracy is much lower than its training accuracy. Thus, it can be
concluded that none of these models is well fit. In contrast, the proposed CNN-LSTM
model is well fit since its testing and training accuracies are much closer. Table 5
illustrates a
comparison between the proposed CNN-LSTM model and four pre-trained CNN
models, namely ResNet, VGG9, VGG16 and MobileNet, by taking account of the
augmented standard dataset. In addition, it also illustrates the comparison between the
CNN-LSTM model and the CNN model excluding LSTM on the same augmented
standard dataset. ResNet demonstrates poor performance in comparison to the other CNN
models because of its poor training accuracy. Both MobileNet and CNN suffer from
overfitting because their training accuracies are much higher than their testing
accuracies. In contrast, VGG9 and VGG16 are well fit models; however, the training
accuracy of VGG9 is only 64%. Although VGG16 is a well fit model, its accuracy is
lower than that of the proposed CNN-LSTM model. Therefore, it can be concluded that
the proposed CNN-LSTM model is well fit and that its performance in terms of
accuracy is better than that of the other models.

Table 4. Comparison between the CNN-LSTM model and other CNN models on the augmented
local dataset
CNN model Precision Recall F1-score Train Acc. Test Acc.
ResNet 0.03 0.12 0.04 0.10 0.12
VGG9 0.11 0.10 0.10 0.27 0.72
MobileNet 0.44 0.28 0.24 0.40 0.28
CNN 0.80 0.80 0.81 0.89 0.75
VGG16 0.87 0.86 0.86 0.87 0.97
CNN-LSTM 0.83 0.81 0.82 0.96 0.95

Table 5. Comparison between the CNN-LSTM model and other CNN models on the augmented
standard dataset
CNN model Precision Recall F1-score Train Acc. Test Acc.
ResNet 0.02 0.11 0.03 0.10 0.11
VGG9 0.09 0.09 0.08 0.64 0.64
MobileNet 0.57 0.36 0.32 0.43 0.36
CNN 0.98 0.98 0.98 0.98 0.88
VGG16 0.98 0.97 0.98 0.98 0.96
CNN-LSTM 0.99 0.99 0.99 0.99 0.99

4.4 Cross Validation


Cross-validation is carried out on the augmented standard dataset in order to justify
the 99.0% accuracy. The average training accuracy is 96.0%, while the average testing
accuracy is around 98.0%; the testing accuracy being higher than the training accuracy
indicates that our proposed CNN-LSTM model is well fit. The average testing loss is
19.0% while the average training loss is around 26.0%, i.e. the testing loss is lower
than the training loss (Table 6).

Table 6. Cross validation result of testing versus training accuracy and loss
Dataset Test accuracy Train accuracy Test loss Train loss
k-1 0.96 0.97 0.30 0.15
k-2 0.99 0.98 0.10 0.20
k-3 0.98 0.96 0.20 0.40
k-4 0.97 0.92 0.15 0.30
Average 0.98 0.96 0.19 0.26

4.5 Real Time Validation


The CNN-LSTM model has been validated against real-world data [25]. In doing so,
real-time unknown images of hand gestures were fed into the model. From Fig. 12
it can be observed that the model can accurately identify one of the classes of hand
gesture, named ‘dua’.

Fig. 12. Sample Clip of Real Time Validation Program



5 Conclusion and Future Work

In this research, two datasets, namely standard and local, were used for vision-based
micro hand gesture recognition. The local dataset is a heterogeneous dataset
which includes five double hand gesture classes out of the ten classes. These five
different classes of gestures include gestures used in the Indian sub-continent. Thus, a
socio-physical approach is taken into account. The CNN-LSTM model proposed in this
study can accurately classify the micro gestures used in both the datasets. Besides, the
CNN-LSTM model for micro gesture datasets has been compared with four pre-trained
models, namely ResNet, MobileNet, VGG9 and VGG16, and with the LSTM-excluded
CNN model. It has been demonstrated in the results section that the CNN-LSTM model
outperforms the other pre-trained models in terms of accuracy for both the standard and
local datasets. Therefore, the system recognizes not only the gestures of one hand but
also two hands. In future, this research aims to build a dynamic dataset by involving a
greater number of micro gesture classes in the context of Asia as well as collect images
from a greater number of subjects. A few environmental factors, such as the intensity of
light, a variety of image backgrounds and skin tones, can be considered while building
the dataset. The added complexity of such a robust dataset can be overcome if the
ensemble learning methodology proposed in [22] is considered. This will not only
result in better accuracy but also help
in selecting the optimal learning algorithm.

References
1. Ahmed, T.U., Hossain, M.S., Alam, M.J., Andersson, K.: An integrated cnn-rnn framework
to assess road crack. In: 22nd International Conference on Computer and Information
Technology (ICCIT), pp. 1–6. IEEE (2019)
2. Ahmed, T.U., Hossain, S., Hossain, M.S., ul Islam, R., Andersson, K.: Facial expression
recognition using convolutional neural network with data augmentation. In: Joint 8th
International Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd
International Conference on Imaging, Vision & Pattern Recognition (icIVPR), pp. 336–341.
IEEE (2019)
3. Akoum, A., Al Mawla, N., et al.: Hand gesture recognition approach for asl language using
hand extraction algorithm. J. Softw. Eng. Appl. 8(08), 419 (2015)
4. Basnin, N., Hossain, M.S., Nahar, L.: An integrated cnn-lstm model for bangla lexical sign
language recognition. In: Proceedings of the 2nd International Conference on Trends in
Computational and Cognitive Engineering (TCCE-2020). Springer (2020)
5. Chowdhury, R.R., Hossain, M.S., ul Islam, R., Andersson, K., Hossain, S.: Bangla
handwritten character recognition using convolutional neural network with data augmen-
tation. In: Joint 8th International Conference on Informatics, Electronics & Vision (ICIEV)
and 2019 3rd International Conference on Imaging, Vision & Pattern Recognition (icIVPR),
pp. 318–323. IEEE (2019)
6. Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko,
K., Darrell, T.: Long-term recurrent convolutional networks for visual recognition and
description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, pp. 2625–2634 (2015)

7. Goyal, P., Dollar, P., Girshick, R., Noordhuis, P., Wesolowski, L., Kyrola, A., Tulloch, A.,
Jia, Y., He, K.: Accurate, large minibatch sgd: training imagenet in 1 hour. arXiv preprint
arXiv:1706.02677 (2017)
8. Greenfield, P., Miller, J.T., Hsu, J., White, R.L.: Numarray: a new scientific array package
for python. PyCon DC (2003)
9. Grundland, M., Dodgson, N.A.: Decolorize: fast, contrast enhancing, color to grayscale
conversion. Pattern Recogn. 40(11), 2891–2896 (2007)
10. Gti: Hand gesture recognition database (2018). https://www.kaggle.com/gtiupm/leapgestrecog
11. Gulli, A., Pal, S.: Deep learning with Keras. Packt Publishing Ltd (2017)
12. Haralick, R.M., Sternberg, S.R., Zhuang, X.: Image analysis using mathematical morphol-
ogy. IEEE Trans. Pattern Anal. Mach. Intell. 4, 532–550 (1987)
13. Hossain, M.S., Amin, S.U., Alsulaiman, M., Muhammad, G.: Applying deep learning for
epilepsy seizure detection and brain mapping visualization. ACM Trans. Multimedia
Comput. Commun. Appl. (TOMM) 15(1s), 1–17 (2019)
14. Islam, M.Z., Hossain, M.S., ul Islam, R., Andersson, K.: Static hand gesture recognition
using convolutional neural network with data augmentation. In: Joint 8th International
Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd International
Conference on Imaging, Vision & Pattern Recognition (icIVPR), pp. 324–329. IEEE (2019)
15. Islam, R.U., Hossain, M.S., Andersson, K.: A deep learning inspired belief rulebased expert
system. IEEE Access 8, 190637–190651 (2020)
16. Jalab, H.A.: Static hand gesture recognition for human computer interaction. Inf. Technol.
J. 11(9), 1265 (2012)
17. Kabir, S., Islam, R.U., Hossain, M.S., Andersson, K.: An integrated approach of belief rule
base and deep learning to predict air pollution. Sensors 20(7), 1956 (2020)
18. Nandagopalan, S., Kumar, P.K.: Deep convolutional network based saliency prediction for
retrieval of natural images. In: International Conference on Intelligent Computing &
Optimization. pp. 487–496. Springer (2018)
19. Nguyen, T.N., Huynh, H.H., Meunier, J.: Static hand gesture recognition using artificial
neural network. J. Image Graph. 1(1), 34–38 (2013)
20. Nguyen, T.N., Huynh, H.H., Meunier, J.: Static hand gesture recognition using principal
component analysis combined with artificial neural network. J. Autom. Control Eng. 3(1),
40–45 (2015)
21. Oyedotun, O.K., Khashman, A.: Deep learning in vision-based static hand gesture
recognition. Neural Comput. Appl. 28(12), 3941–3951 (2017)
22. Özöğür-Akyüz, S., Otar, B.C., Atas, P.K.: Ensemble cluster pruning via convex-concave
programming. Comput. Intell. 36(1), 297–319 (2020)
23. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a
simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–
1958 (2014)
24. Stergiopoulou, E., Papamarkos, N.: Hand gesture recognition using a neural network shape
fitting technique. Eng. Appl. Artif. Intell. 22(8), 1141–1158 (2009)
25. Uddin Ahmed, T., Jamil, M.N., Hossain, M.S., Andersson, K., Hossain, M.S.: An integrated
real-time deep learning and belief rule base intelligent system to assess facial expression
under uncertainty. In: 9th International Conference on Informatics, Electronics & Vision
(ICIEV). IEEE Computer Society (2020)
26. Wang, W., Yang, J., Xiao, J., Li, S., Zhou, D.: Face recognition based on deep learning. In:
International Conference on Human Centered Computing. pp. 812– 820. Springer (2014)

27. Yingxin, X., Jinghua, L., Lichun, W., Dehui, K.: A robust hand gesture recognition method
via convolutional neural network. In: 6th International Conference on Digital Home (ICDH),
pp. 64–67. IEEE (2016)
28. Zhu, Y., Huang, C.: An improved median filtering algorithm for image noise reduction.
Phys. Procedia 25, 609–616 (2012)
29. Zisad, S.N., Hossain, M.S., Andersson, K.: Speech emotion recognition in neurological
disorders using convolutional neural network. In: International Conference on Brain
Informatics. pp. 287–296. Springer (2020)
30. Zivkovic, Z., Van Der Heijden, F.: Efficient adaptive density estimation per image pixel for
the task of background subtraction. Pattern Recogn. Lett. 27(7), 773–780 (2006)
Analysis of the Cost of Varying Levels
of User Perceived Quality for Internet
Access

Ali Adib Arnab1, Sheikh Md. Razibul Hasan Raj1, John Schormans2,
Sultana Jahan Mukta3, and Nafi Ahmad2
1 University of Global Village, Barisal, Bangladesh
adib9877@yahoo.com
2 Queen Mary University of London, London, UK
3 Islamic University, Kushtia, Bangladesh

Abstract. Quality of Service (QoS) metrics deal with network quantities, e.g. latency and loss, whereas Quality of Experience (QoE) provides
a proxy metric for end-user experience. Many papers in the literature
have proposed mappings between various QoS metrics and QoE. This
paper goes further in providing analysis for QoE versus bandwidth cost.
We measure QoE using the widely accepted Mean Opinion Score (MOS)
rating. Our results naturally show that increasing bandwidth increases
MOS. However, we extend this understanding by providing analysis for
internet access scenarios, using TCP, and varying the number of TCP
sources multiplexed together. For these target scenarios our analysis indi-
cates what MOS increase we get by further expenditure on bandwidth.
We anticipate that this will be of considerable value to commercial orga-
nizations responsible for bandwidth purchase and allocation.

Keywords: Mean Opinion Score (MOS) · Quality of Experience (QoE) · Bandwidth · Bandwidth cost · Quality of Service (QoS)

1 Introduction
Quality of Experience (QoE) has a significant but complex relationship with
Quality of Service (QoS) and its underlying factors [1]. The Mean Opinion Score
(MOS) ranges from 1 to 5 (Bad to Excellent) and represents QoE [2]. Although
considerable work has been carried out in papers like [3–5], in this paper we
consider QoE within a budget, using numerous metrics such as PLP, bandwidth,
round trip time, the TCP sending-rate factor, packet buffer length and packet
bottleneck capacity. From curve fitting, we obtained an analytical expression
relating bandwidth and bandwidth cost. The goodness of fit is assessed via SSE,
R-square, Adjusted R-square and RMSE. Consequently, we found one equation
with variables MOS and bandwidth and another with variables bandwidth and
bandwidth cost. The analysis has been performed multiple times, varying the
number of TCP sources. The major objective of this research is to identify the
mathematical relationship between MOS and bandwidth cost.


2 Related Work
Well-grounded bandwidth anticipated by the Internet Protocol is a pivotal
requirement for advanced services, which also includes the reliability
amplification arranged by TCP [6]. That paper identifies QoS (Quality of
Service) as a widely considered contemporary issue: it offers more reliable
bandwidth, while network traffic based on TCP keeps increasing its sending rate.
Packet loss can result from the above issues, and it is analyzed during
QoS model improvement [7]; the author also notes that it is possible for
bottleneck links to have high utilization without any packet loss. Roshan
et al. [8] evaluate the relationship between the QoS metric packet loss
probability (PLP) and the user-perceived Quality of Experience (QoE) for
video on demand (VoD).
QoS has major implications from a policy standpoint [9]. The authors
describe a present situation where users acquire internet service at a fixed
price; a simple monthly subscription fee, or a price based on connection time,
is common [10]. Dial-up modem, Ethernet, cable Ethernet, or DSL are used to
provide bandwidth. It is clear from the authors' analysis that either no
bandwidth guarantee or only statistical bandwidth guarantees may be offered.
Conventional internet applications mainly use TCP as their transport layer
protocol [11]; packet loss is used as a sign of congestion. As soon as the
sender experiences network congestion, TCP responds quickly by reducing its
transmission rate multiplicatively [12].
The UK is ranked third in terms of the services and metrics included in
this analysis, followed by France and Germany [13]. In that report, the UK
occupies third place in a table of the weighted average and lowest available
basket prices across the services covered, which we use in analyzing bandwidth
cost at varying levels. The USA is the most expensive regarding both average
and lowest available prices across all the services, while France is the least
expensive. The UK's highest ranking is for the mobile phone service price and
the triple-play bundle price, categories in which the UK is ranked second,
followed by France.
There are numerous analyses of QoE, QoS and their relationship with
bandwidth, whereas our paper provides the relationship between QoE and
bandwidth cost. This can support the business considerations of many
organizations. It also gives a proper idea of how much QoE can be achieved
within an allocated budget, which has not yet been discussed in the papers
mentioned above.

2.1 Abbreviations and Acronyms


Mean Opinion Score and Its Indicator. The Mean Opinion Score is broadly
known as MOS and has become a widespread perceived media quality indicator
[12]. MOS (also redundantly called the MOS score) is one way of classifying
the quality of a phone call. The score is set on a five-point scale, shown in
Table 1 below; a MOS of 4.0 or higher is toll-quality.

Table 1. MOS Score vs Performance

MOS Score Performance
5 Excellent
4 Good
3 Fair
2 Poor
1 Bad

Once within the building, enterprise voice-quality patrons generally expect
constant quality while using their telephones [14].

3 Method
Our first requirement is to determine PLP and MOS for different numbers of
TCP sources, round trip time, bottleneck capacity in packets and buffer length.
We take buffer length values from 10 to 1000 [8]. Then we obtain a MOS vs
bandwidth graph for different settings of buffer length. Different countries pay
different amounts of money for internet access; this largely depends on the
internet providers, facilities, availability of internet access, networking and
telecommunication products, infrastructure, etc. For example, someone in North
America may not be paying the same as someone in Africa. We considered values
for the UK, which give us a range of bandwidths and an estimated average cost
for each bandwidth.
From our analysis we can obtain a graph for bandwidth and bandwidth cost.
Our goal is to relate MOS score and bandwidth cost analytically.
The next step was to use existing formulas for PLP and MOS again. Initially
we used bottleneck capacity as one of the parameters to determine PLP. We
obtained a formula for MOS and bandwidth cost, plotted various bandwidth
cost values, and evaluated MOS scores against those values. We obtained a
curve similar to the one previously obtained for MOS vs bandwidth. If we want
to differentiate between big organizations and small family houses, we need to
modify our formula for MOS by changing the number of TCP sources. In big
organizations the number of TCP sources will be large, while for a small
household it can be much smaller. We modify the number of TCP sources in our initial formula for
MOS. We obtain one graph of MOS vs bandwidth cost for a larger number of
TCP sources and one graph of MOS vs bandwidth cost for a smaller number of
TCP sources.

3.1 Determining Parameter Values for Experimental Analysis


An existing formula for PLP in TCP has been used to determine the Mean Opinion
Score (MOS) [15]. MOS was plotted against bandwidth, and bandwidth has been
plotted against cost. The relationship between MOS and cost has been determined
from the MOS vs bandwidth and bandwidth vs cost plots.

To determine the values of MOS we require PLP values. According to the
analytical expression and performance evaluation of TCP [15], the packet loss
probability is given by:

Pi = 32N² / (3b(m + 1)²(C·RTT + Q)²)    (1)

N = number of TCP sources = 50
C = bottleneck capacity in packets per second = 12500
b = number of packets acknowledged by an ACK packet = 1
m = factor by which the TCP sending rate is reduced = 1/2
RTT = round trip time = 0.1 s
Q = packet buffer length (bits)
According to [10] we can obtain MOS from PLP, as shown in Eq. (2) below:

MOS = 1.46·exp(−44·PLP) + 4.14·exp(−2.9·PLP)    (2)
Using different buffer lengths we get values for PLP and MOS (see Table 2); the
Q value is taken from 10 to 1000. With the same packet buffer length and the
bandwidth increased by 15 Mbps each time, from 15 to 120 Mbps, we evaluate PLP
and MOS in Table 3. Bandwidth and cost pricing differs worldwide. We can get an
estimate of bandwidth vs cost for the United Kingdom, which is used as sample
data here as the unit of analysis [13]:
bandwidth = [10 30 50 100 200 400 600 800 1000] Mbps
Cost = [20 37 40 42 43 45 46 46 46] Dollars
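
For illustration, a minimal Python re-implementation of Eqs. (1) and (2) is sketched below (the original analysis used MATLAB, so this translation is an assumption). With the bandwidth-to-capacity conversion of Eq. (5) in Sect. 3.3 and Q = 10, it reproduces the PLP and MOS values of Table 3.

import math

N, b, m, RTT, Q = 50, 1, 0.5, 0.1, 10    # parameter values from this section

def plp(bw_mbps):
    C = bw_mbps * 1_000_000 / 12_000     # bottleneck capacity in packets/s, Eq. (5)
    return 32 * N**2 / (3 * b * (m + 1)**2 * (C * RTT + Q)**2)   # Eq. (1)

def mos(p):
    return 1.46 * math.exp(-44 * p) + 4.14 * math.exp(-2.9 * p)  # Eq. (2)

for bw in range(15, 121, 15):            # 15 to 120 Mbps, as in Table 3
    print(bw, round(plp(bw), 6), round(mos(plp(bw)), 4))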

Table 2. Q, PLP and MOS value with different Q

Q PLP MOS
10 7.465263e−01 4.750997e−01
100 6.503074e−01 6.280123e−01
200 5.637028e−01 8.073142e−01
400 4.353297e−01 1.171447e+00
600 3.462922e−01 1.516566e+00
800 2.820191e−01 1.827308e+00
1000 2.341107e−01 2.099708e+00

3.2 Determining a Formula for Bandwidth and Bandwidth Cost by Curve Fitting

To obtain the bandwidth vs cost curve, we need a specific formula relating bandwidth
and cost, which is obtained by curve fitting:

f(x) = a·x^b    (3)

Table 3. Q, bandwidth, PLP and MOS value with same Q

Q Bandwidth PLP MOS


10 15 6.503074e−01 6.280123e-01
10 30 1.753233e−01 2.490591e+00
10 45 7.995852e−02 3.326486e+00
10 60 4.556652e−02 3.824152e+00
10 75 2.939265e−02 4.202314e+00
10 90 2.051913e−02 4.492741e+00
10 105 1.513212e−02 4.712481e+00
10 120 1.161832e−02 4.878501e+00

Coefficients (with 95% confidence bounds):
a = 27.13 (22.65, 31.61)
b = 0.0986 (0.06944, 0.1279)
The values of a and b were provided by the curve fitting as the best fit for the
bandwidth vs cost graph; the confidence bounds are given above.
Goodness of fit:
SSE: 38.48
R-square: 0.9589
Adjusted R-square: 0.953
RMSE: 2.345
See Table 4 below, where the goodness-of-fit parameters are shown. From Tables
5 and 6, the fitting method is Nonlinear Least Squares, which is selected
automatically based on the characteristics, shape and number of points of the
curve. Among Interpolant, Linear fitting, Polynomial, Rational, Sum of Sine,
Smoothing Spline, Weibull, Exponential, Gaussian and Fourier, we selected
Power1, since it gives better visualization and better goodness-of-fit
prediction. The prediction includes the calculation of SSE, R-square, Adjusted
R-square, RMSE, and the coefficients with confidence bounds. The robust option
is enabled, and Least Absolute Residuals (LAR) shows more accurate results than
Bisquare: LAR focuses on the contrast between residuals rather than squares,
whereas Bisquare reduces the weight of the sum of the squares, which matters in
our curve's case.

Table 4. Goodness of fit parameters for curve fitting

Fit name: Fit 1
Data: Cost vs bandwidth
Fit type: Power1
SSE: 38.4806
R-square: 0.9589
DFE: 7
Adj R-sq: 0.9530
RMSE: 2.3446
Coeff: 2
398 A. A. Arnab et al.

Table 5. Method and Algorithm for curve fitting

Method: Nonlinear Least Squares
Robust: LAR
Algorithm: Trust Region
DiffMinChange: 1.0e-8
DiffMaxChange: 0.1
MaxFunEvals: 600
MaxIter: 400
TolFun: 1.0e-6
TolX: 1.0e-6

Table 6. Coefficient start points and bounds for curve fitting

Coefficient StartPoint Lower Upper
a 9.2708 −Inf Inf
b 0.3019 −Inf Inf

We take bandwidth as ‘x’, which lies on the X axis, and bandwidth cost as ‘f(x)’,
which lies on the Y axis. We acquire the values of ‘a’ and ‘b’ from the curve
fitting tool, as visible in Fig. 1. Then, by applying Eq. (3), we obtain Eq. (4)
below:

Cost = 27.13 · bandwidth^0.0986    (4)

Fig. 1. Setup for applying Curve fitting formula from Cost vs bandwidth to obtain a
formula for MOS vs bandwidth cost
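
As a cross-check, the same fit can be reproduced in Python; scipy is an assumption here, and because the MATLAB fit used LAR robust weighting the recovered coefficients may differ slightly from a = 27.13 and b = 0.0986.

import numpy as np
from scipy.optimize import curve_fit

bandwidth = np.array([10, 30, 50, 100, 200, 400, 600, 800, 1000])  # Mbps
cost = np.array([20, 37, 40, 42, 43, 45, 46, 46, 46])              # dollars

def power_law(x, a, b):
    return a * x**b          # Eq. (3): f(x) = a * x^b

# Start points taken from Table 6; plain (non-robust) least squares is used here.
(a, b), _ = curve_fit(power_law, bandwidth, cost, p0=(9.2708, 0.3019))
print(a, b)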

3.3 Implementing the MOS and Cost Relationship by Eliminating Bandwidth
from the Bandwidth vs Cost Formula

By taking different values of cost we can get different values of bandwidth, and
if we substitute bandwidth out of Eq. (6) we can relate cost and MOS directly.
The MATLAB code for MOS vs bandwidth cost is shown in Fig. 2. The bottleneck
capacity is given by Eq. (5) below:

Fig. 2. MOS vs bandwidth cost relationship

C = bandwidth · 1000000 / 12000    (5)

MOS and Bandwidth. From Eqs. (1), (2) and (5) we obtained the relationship
between MOS and bandwidth, which is Eq. (6):

MOS = 1.46·exp(−44·11851.851 / (0.01(BW·1000/12)² + 2(BW·1000/12) + 100))
    + 4.14·exp(−2.9·11851.851 / (0.01(BW·1000/12)² + 2(BW·1000/12) + 100))    (6)

MOS and Bandwidth Cost. If we put bandwidth cost in the place of bandwidth
with the help of Eq. (4), we obtain the following relationship, which is Eq. (7):

MOS = 1.46·exp(−44·11851.851 / (0.01(83.33·(Cost/27.13)^(1/0.0986))² + 2(83.33·(Cost/27.13)^(1/0.0986)) + 100))
    + 4.14·exp(−2.9·11851.851 / (0.01(83.33·(Cost/27.13)^(1/0.0986))² + 2(83.33·(Cost/27.13)^(1/0.0986)) + 100))    (7)
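
To make the chain of substitutions concrete, a hedged Python sketch combining Eqs. (1), (2), (4) and (5) into a single MOS-versus-cost function is given below; the original implementation was in MATLAB (Fig. 2), so this translation is an assumption.

import math

def mos_from_cost(cost, N=50):
    bw = (cost / 27.13) ** (1 / 0.0986)       # invert Eq. (4): bandwidth in Mbps
    C = bw * 1_000_000 / 12_000               # Eq. (5): capacity in packets/s
    # Eq. (1) with b = 1, m = 1/2, RTT = 0.1 s, Q = 10; 6.75 = 3b(m+1)^2
    p = (32 * N**2) / (6.75 * (0.1 * C + 10)**2)
    return 1.46 * math.exp(-44 * p) + 4.14 * math.exp(-2.9 * p)   # Eq. (2)

for c in (38, 40, 42, 45):                    # sample monthly costs
    print(c, round(mos_from_cost(c), 2), round(mos_from_cost(c, N=500), 2))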

3.4 MOS and Bandwidth Cost


To evaluate MOS with different numbers of TCP sources we changed the value
of N. We took two sample values of N: 80 and 500. 80 TCP sources
mainly represents a small building or bandwidth used for family purposes, while
500 TCP sources represents bigger companies and organizations.
To get the MOS and cost relationship, we took N = 80 and N = 500 instead of
50, which provided a different PLP and a different MOS. The bandwidth and
cost relationship remains the same as before, because it has nothing to do with
the number of TCP sources. We were thus able to obtain different MOS and
bandwidth formulas and different MOS and cost formulas, and to get outputs for
different numbers of TCP sources.
For N = 80, the MOS and cost relationship is obtained in Eq. (8):

MOS = 1.46·exp(−44·30340.74 / (0.01(83.33·(Cost/27.13)^(1/0.0986))² + 2(83.33·(Cost/27.13)^(1/0.0986)) + 100))
    + 4.14·exp(−2.9·30340.74 / (0.01(83.33·(Cost/27.13)^(1/0.0986))² + 2(83.33·(Cost/27.13)^(1/0.0986)) + 100))    (8)

Similarly, for N = 500 we obtain Eq. (9):

MOS = 1.46·exp(−44·1185185.19 / (0.01(83.33·(Cost/27.13)^(1/0.0986))² + 2(83.33·(Cost/27.13)^(1/0.0986)) + 100))
    + 4.14·exp(−2.9·1185185.19 / (0.01(83.33·(Cost/27.13)^(1/0.0986))² + 2(83.33·(Cost/27.13)^(1/0.0986)) + 100))    (9)

4 Results and Testing


4.1 Plotting MOS vs Bandwidth (Mbps) with Different Packet Buffer
Lengths
Our initial formula for PLP provides a relationship between packet loss
probability and various parameters, including the number of TCP sources,
bottleneck capacity, round trip time, number of packets acknowledged by an ACK
packet, the factor by which the TCP sending rate is reduced, and packet buffer
length. We determine the PLP by taking sample data for these parameters. The
MATLAB code and formula for PLP are discussed in Sect. 3.1. Then we calculate
MOS from PLP. Figure 3 shows MOS vs bandwidth (Mbps) with different packet
buffer lengths.

Fig. 3. MOS vs bandwidth (Mbps) with different packet buffer lengths

4.2 Plotting of MOS vs Bandwidth (Mbps) with Constant Packet


Buffer Length Output

If we keep changing the buffer length, it is very difficult to evaluate how MOS
changes with the effect of changing bandwidth. So, by keeping the buffer length
fixed at Q = 10, we can obtain a MOS vs bandwidth curve. It is quite evident
from Fig. 4 that when we increase bandwidth, MOS also increases with it. As a
sample, when the bandwidth is 50 Mbps, MOS is approximately 3.5; when the
bandwidth is 100 Mbps, MOS is approximately 4.5; and when the bandwidth is
150 Mbps, MOS is close to 5.

4.3 Plotting of Bandwidth vs Bandwidth Cost Relationship

We took several different data points for bandwidth cost in the UK. If we look
at Fig. 5, we can see that initially the cost increases quickly within a limited
range of bandwidth: even for 20 Mbps the bandwidth price is close to £35 per
month, and the rate keeps increasing until 100 Mbps, for which, from the graph,
a customer has to pay nearly £43 per month. Beyond 100 Mbps the cost increment
is very slow: from 100 Mbps to 1000 Mbps the cost only increases from £43 to
£46 per month, which is remarkably low. So we can draw conclusions about how
much it is worth spending on bandwidth before we run into diminishing returns.

Fig. 4. MOS vs bandwidth (Mbps) with same packet buffer length

Fig. 5. Bandwidth vs bandwidth cost relationship



Fig. 6. MOS vs bandwidth cost relationship

4.4 Plotting of MOS vs Bandwidth Cost Relationship Output


Initially MOS is very low at low cost, which indicates that there is a certain
amount of money a customer needs to pay initially to get internet access. A
customer paying £38 can obtain a quality of experience of about 2 MOS, whereas
a customer paying £45 receives far better quality of experience, with MOS close
to 5, according to Fig. 6.
The experiment makes sense if we compare broadband prices in the UK. For
example, the broadband provider ‘Now Broadband’ offers 11 Mbps for £18 per
month on its ‘Brilliant Broadband’ plan, and the same provider offers 36 Mbps
for £24 per month on its ‘Fab Fibre’ plan. So the bandwidth increases more than
3 times while the cost only increases by £6 [16].

4.5 Plotting of MOS and Bandwidth Cost Output with Different


TCP Sources
Initially we took the number of TCP sources to be 50. If we change the number
of TCP sources we obtain different outputs and results. There are two output
graphs in Fig. 7: the first is calculated by taking 80 TCP sources and the
second by taking 500 TCP sources. If we take more TCP sources, the quality
increases more rapidly with the price increase compared to taking fewer TCP
sources. Big organizations usually have more TCP sources, so they have the
luxury of getting better quality of experience at the same cost, once past the
initial price barrier. In a small household, fewer TCP sources are most likely
used; these also experience a rise in MOS, providing good quality within a
price range, but less so than in big organizations.

Fig. 7. MOS vs bandwidth cost output with different TCP sources.

5 Discussion and Further Work

Prior work in the field has shown that QoE has a complex relationship with QoS
factors like packet loss probability (PLP), delay and delay jitter. Furthermore,
contemporary analyses have indicated that the relationship between these QoS
factors and QoE can vary significantly from application to application. In this
paper we take prior analyses of the relationship between the key QoS metric of
packet loss probability and QoE and target an internet access scenario. We use
this relationship to show how QoE (measured as MOS) varies as more money is
spent on the bandwidth of the internet access link. Our results target two
different scenarios: a small number of multiplexed TCP sources and a relatively
large number of multiplexed TCP sources. We show that the increase in MOS is
not at all linear in the spend on bandwidth, and by considering these two
different scenarios we are able to characterize the highly non-linear fashion in
which MOS does increase with spend on bandwidth.

References
1. Quality of Experience Paradigm in Multimedia Services. ScienceDirect (2017)
2. Streijl, R.C., Winkler, S., Hands, D.S.: Mean opinion score (MOS) revisited: meth-
ods and applications, limitations and alternatives. SpringerLink (2014)
3. Ahmad, N., Schormans, J., Wahab, A.: Variation in QoE of passive gaming video
streaming for different packet loss ratios. In: QoMEX 2020- The 12th International
Conference on Quality of Multimedia Experience, Athlone (2020)
4. Ahmad, N., Wahab, A., Schormans, J.: Importance of cross-correlation of QoS
metrics in network emulators to evaluate QoE of video streaming applications.
Bordeaux (2020)

5. Vasant, P., Zelinka, I., Weber, G.W.: Intelligent Computing & Optimization. Springer (2018)
6. Camp, L.J., Gideon, C.: Limits to certainty in QoS pricing and bandwidth (2002). https://dspace.mit.edu/handle/1721.1/1512. Accessed 2 May 2020
7. Xiao, X.: Technical, Commercial and Regulatory Challenges of QoS, p. 30. Elsevier/Morgan Kaufmann, Amsterdam (2008); Zarki, M.: QoS and QoE (2019)
8. Roshan, M., Schormans, J., Ogilvie, R.: Video-on-demand QoE evaluation across
different age-groups and its significance for network capacity. EAI Endorsed
Transactions on Mobile Communications and Applications (2018)
9. Gideon, C.: Limits to certainty in QoS pricing and bandwidth (2002). http://hdl.handle.net/1721.1/1512. Accessed 11 Aug 2019
10. Internet Cost Structures and Interconnection Agreements. The journal of electronic
publishing (1995)
11. Joutsensalo, J., Hämäläinen, T.D., Siltanen, J., Luostarinen, K.: Delay guaran-
tee and bandwidth allocation for network services. In: Next Generation Internet
Networks, 2005 (2005)
12. Aragon, J.C.: Analysis of the correlation between packet loss and network delay
and their impact in the performance of surgical training applications. In: Semantic
Scholar (2006)
13. Ofcom (2017). International Communications Market Report 2017. London.
https://www.ofcom.org.uk/research-and-data/multi-sector-research/cmr/cmr-
2017/international. Accessed 7 Aug 2019
14. Expert System (2017). What is Machine Learning? A definition - Expert System.
Expertsystem.com. https://www.expertsystem.com/machine-learning-definition/.
Accessed 6 Aug 2019
15. Bisio, I., Marchese, M.: Analytical expression and performance evaluation of TCP
packet loss probability over geostationary satellite. IEEE Commun. Lett. 8(4),
232–234 (2004)
16. Cable.co.uk (2019). Best Broadband Deals August 2019 — Compare Broadband
Offers - Cable.co.uk. Cable. https://www.cable.co.uk/broadband/. Accessed 16
Aug 2019
Application of Customized Term
Frequency-Inverse Document Frequency
for Vietnamese Document Classification
in Place of Lemmatization

Do Viet Quan and Phan Duy Hung

FPT University, Hanoi, Vietnam


quandvmse0087@fpt.edu.vn, hungpd2@fe.edu.vn

Abstract. Natural language processing (NLP) is a problem which attracts much
attention from researchers. This study analyzes and compares a different
method for classifying text sentences or paragraphs in Vietnamese into different
categories. The work utilizes a sequence of techniques for data pre-processing
and customizes the learning model and methods before using Term Frequency-Inverse
Document Frequency (TF-IDF) for model training. This classification model
could contribute positively to many Vietnamese text-analyzing businesses,
such as social networks, e-commerce, or data mining in general. The
problem's challenge relies on two main aspects: the Vietnamese language itself
and the current state of NLP research for the Vietnamese language. The paper
utilizes the pros of many different classification methods to provide better
accuracy in text classification.

Keywords: NLP · Text classification · Vietnamese · TF-IDF · POS-tag · Lemmatization

1 Introduction

Vietnamese is a special language, whether in how it was formed and developed or in
how it is expressed and used. The language itself is ambiguous, and we can find various
reasons why Vietnamese is not a simple language to be processed by a typical NLP
model such as a deep learning model [1]; thus we need customization.
As a historical result, Vietnamese vocabulary is majorly structured from words
derived from Chinese, notably words in the scientific and political domains, namely
Han-Viet (Vietnamese words taken from Chinese since the Han-Dynasty era), whose
share is roughly 70% [2]. Moreover, because of French colonization, Vietnamese took
in numerous loanwords from the French language in multiple forms. Recently, many
words from English have joined the Vietnamese vocabulary family, either being
translated word-by-word or being used directly as a part of Vietnamese [2].
NLP research in Vietnam has been greatly delayed compared to what it could
have reached, as it has been conducted without a common, academic direction; there
are some notable projects, but they still lack a systematic approach to some important
issues (text summarization, context-based intelligent search, etc.). The lack of

inheritable research also hinders followers wishing to continue this line of research.
Although some methods have been implemented for Vietnamese, NLP in general and
text classification in particular still face several issues and mostly work only in
formal document situations.
Stemming from this vocabulary issue, we state the following reasons why
Vietnamese cannot efficiently utilize typical NLP pre-processing methods [3]; they
are listed in Table 1.

Table 1. Reasons why Vietnamese is relatively easy for humans to learn, but causes difficulties
for machine learning.

Short words. For human: easy to learn. For NLP efficiency: repetitive structure, hence many synonyms and homophones.
No gender. For human: easy to learn. For NLP efficiency: inefficient lemmatization.
No plural, no cases. For human: easy to speak, hard to listen. For NLP efficiency: no lemmatization needed, thus no lemmatization.
No articles. For human: easy to speak, hard to listen. For NLP efficiency: no lemmatization needed.
No verb conjugation. For human: easy to learn. For NLP efficiency: difficult to differentiate between tenses or nouns' genres.
Simple verb tenses. For human: easy to speak, hard to comprehend. For NLP efficiency: shifts the complicating task(s) onto the listening side instead of the speaking side, thus increasing NLP difficulty.
Tenses are optional. For human: easy to speak, hard to comprehend. For NLP efficiency: shifts the complicating task(s) onto the listening side instead of the speaking side, thus increasing NLP difficulty.
No agreement. For human: faster to learn, less knowledge to remember. For NLP efficiency: no explicit relationship in terms of data between words in a sentence.
Single-syllable words. For human: easy to understand compound words if one understands each element. For NLP efficiency: increases the difficulty in distinguishing whether a word is a compound word or multiple separate words.
Homophones (due to the single-syllable characteristic, homophones in Vietnamese are also words that are written identically). For human: hard to distinguish. For NLP efficiency: hard to distinguish.

These characteristics effectively defeat the typical stemming and lemmatization
methods, and this also affects the accuracy of the POS-tagging task. Many
Vietnamese NLP experts have been working to address these steps with different
approaches. This project does not have the ambition to overcome all these challenges
by itself; we implement some works of other experts before adding our own
contribution. Our work concerns the differentiating potential of synonyms and
classifiers in Vietnamese.
Ho Le et al. conducted a systematic study using vectorized TF-IDF. The authors
prepared their dataset for training and achieved an impressive average result of 86.7%
and a peak result of 92.8%. We follow their team's footsteps [4] in conducting this
paper. Using several different libraries, our average result provides comparable
performance with faster processing speed and a minor improvement in accuracy.
The following sections are ordered as: Sect. 2 describes our methodology, Sect. 3
explains and describes the pre-processing step; we provide some case studies in
Sect. 4; and finally, Sect. 5 is reserved for conclusions and what to do next.

2 Methodology

As mentioned in Sect. 1, Vietnamese lemmatization is not effective using the same
lemmatization methods as for Indo-European languages. The Vietnamese language
consists mostly of single-syllable words, which have meaning on their own,
depending on context. Additionally, when the Vietnamese language needs to express a
more complicated meaning, it uses compound words, which are nevertheless still
written as two or more single-syllable words next to each other (e.g. “anh em” is
compounded from “anh”, big brother, and “em”, little brother, and means
“brothers”) [5]. This explains why homophones in Vietnamese are more common than in
Indo-European languages, because Vietnamese syllables consist of 5 components:
initial sounds, rhymes (main sounds, ending sounds and accompaniments) and tones
[2]. However, unlike in Indo-European languages, since Vietnamese words do not
transform, there is only one type of homonym, which makes homophones both written
and spoken identically. This phenomenon gives Vietnamese a high number of
words which have different meanings but are written identically (for single syllables) or
partially identically (for units in a compound word); for instance, “giáp” may mean “12
years”, “armor”, or “beside” depending on context, and “tiền tuyến” (frontline) has one
unit in common with “tiền bạc” (money) [5]. With the rules for combining phonetics,
Vietnamese can theoretically create over 20,000 different syllables, but in practice it
uses only about 6000. Digging deeper into the study of Ho Le et al. [4], we reckon that
more than 20% of Vietnamese vocabulary is occupied by homophones, among which
73.6% are single-syllable words and 25.9% are compound words created from two
single-syllable words, together covering 99.56% of all cases in the Vietnamese
Dictionary, edition of 2006, as stated in Table 2.
Table 2. Number of homophone cases in Vietnamese (2006).

Statistics of homophones in Vietnamese, according to the Vietnamese Dictionary [5]:
Single-syllable words: coincident 1913, same origin 807 (2720 cases)
Compound words from 2 units: coincident 282, same origin 673 (955 cases)
Compound words from 3 units: coincident 04, same origin 03 (07 cases)
Compound words from 4 units: coincident 02, same origin 07 (09 cases)
Total: 3691 cases

Looking further into the characteristics of these cases in [2] Sect. 2.1.2, we gathered
statistics of these cases by Part of Speech (POS), as presented in Table 3.

Table 3. Number and percentage of homophones distributed over POS.

Same POS: 202 cases in total (22.69%), comprising Noun 111, Verb 38, Adjective 53
Different POS, spanning 2 POS: Noun-Verb 279 (31.35%), Noun-Adjective 245 (27.53%), Adjective-Verb 114 (12.81%)
Different POS, spanning 3 POS: 36 (4.04%)
Others: 14 (1.57%)
Total: 890 cases

Moreover, most Vietnamese homophones carry only a small number of different
meanings (01–06 meanings). Among these, 77.31% (rounded) are homophones with
different POS, meaning that if a word is detected to have different POS in
different contexts, it will very likely have a different meaning. This effectively
replaces traditional lemmatizing methods, specifically in text classification, as two
words can be synonyms but be used in two different genres of text (e.g. “sư phụ” as
master in ancient times and “thầy giáo” as teacher in more modern times can have a
similar meaning, but exist and are used in two totally different text genres) [6].
Following these conclusions, we decided to combine a word with its POS to give it a
unique ID, thus effectively distinguishing between homophones while preserving the
difference between synonyms. A minimal illustration follows; the execution of this
idea will be presented in the following sections.
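
As a minimal sketch (the function name and ID format below are illustrative, not the paper's implementation), the idea can be expressed as:

def token_id(word, pos):
    # Key a token by word plus POS tag: homophones with different POS stay
    # distinct, while synonyms keep their own separate IDs.
    return f'{word}/{pos}'

print(token_id('giáp', 'N'), token_id('giáp', 'A'))   # two distinct IDs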

3 Data Collection and Pre-processing


3.1 Data Collection
To perform the experiment, we prepared a dataset (of modest size) using several
books and multiple collected articles, taken from various fields, as the data source,
as follows:
• Lord of the ring story [7]
• A collection of Grimm fairy tales [8]
• Diary of a Cricket [9]
• A collection of 109 Vietnamese fables [10].
• A collection of 80 articles about world politics, written in 2019 and 2020 [11, 12].
• A collection of 80 articles about Vietnam’s showbiz, written in 2019 and 2020 [12].
• A collection of 80 articles about Vietnam’s economics, written in 2019 and 2020
[11, 13].
Each of these categories consists of 18,000 to 22,000 sentences. The selected data are
versatile enough to contain different writing styles, purposes, and vocabularies.
Moreover, the chosen articles and stories all follow the developing stories of one or
more subjects (instead of a more specific matter such as scientific documents), so that
they can demonstrate our method's ability to distinguish between different contexts.

3.2 Data Processing


The dataset is manually prepared and trimmed using the following steps, for clean-up
and consistency's sake:
The dataset is copied and processed semi-manually, going through several pre-
processing tasks. In the first step, we filter out and delete unnecessary special
characters and unprintable characters. Since the data come from various sources
(including websites, PDF files, etc.), we needed to filter out all unprintable characters
such as pilcrows, spaces, non-breaking spaces, tabs, etc. Next, we remove all
formatting and unnecessary line chunking, since in some parts dialog is displayed
as a separate line and this is not consistent. In the next steps, we fix the content where
needed. Throughout Vietnamese language history there have been multiple “popular”
ways of writing, each correct in its own era; for example, there are two ways of writing
words with phonetics: “hòa bình” and “hoà bình” (peace). We needed to unify how
words are written across all documents to ensure accuracy. Along the way, we also
fixed all detected typos. The expectation from this step is to bring all words under the
same set of standards, which reduces the risk of misinterpreting a word's meaning or
its nature.
Next step, we remove redundant spaces, dots, and commas. Afterward, we modify
the dataset to standardize keep-able special characters (such as: “…” for dialogs and [1]
for annotations). It is important to note that all this step is done automatically instead of
manually. These special characters are kept distinguishing between different categories
of document, for example, scientific documents often use “[…]” for references, but this
does not appear as much in others.
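A minimal Python sketch of these clean-up steps is given below; the function name and the exact regular expressions are ours, and a production pipeline would add the rule-based tone-mark unification pass discussed above:

import re
import unicodedata

def clean_text(raw: str) -> str:
    """Rough sketch of the clean-up steps described above."""
    text = unicodedata.normalize("NFC", raw)  # one canonical Unicode form
    # NFC alone does not unify tone-mark placement ("hòa" vs "hoà");
    # that requires an extra rule-based pass, omitted here
    text = text.replace("\u00a0", " ")        # non-breaking spaces -> spaces
    text = re.sub(r"[\x00-\x08\x0b-\x1f]", "", text)  # unprintable control chars
    text = re.sub(r"[ \t]+", " ", text)       # redundant spaces and tabs
    text = re.sub(r"\.{4,}", "...", text)     # collapse runs of redundant dots
    return text.strip()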

Lastly, we reordered these data into files representing the following classes:
• Western fantasy novel (The Lord of the Rings)
• Western fairy tale (Grimm stories)
• Vietnamese fairy tales and stories for kids (Diary of a Cricket, Vietnamese fables)
• World politics
• Vietnam’s showbiz
• Vietnam’s economics
• Others
It would be acceptable if, in our test run, we found that a text could belong to more
than one category; for example, Vietnam’s economic matters may intertwine with world
political events in one way or another, and a prediction in either of these categories
can be considered acceptable.

4 Data Training

To train on the collected data, we first perform sentence tokenization, using the
Naïve Bayes-based sentence boundary detection by Vuong Quoc Binh and Vu Anh [4]. The
result of this step is one text file per document, with each sentence on a separate
line; for example, for the story “Diary of a Cricket” we obtain a chunk of results as
sampled in Table 4.

Table 4. Sample output of Sentence Tokenizing step performed on “Diary of a Cricket”.



tôi là em út, bé nhất nên được mẹ tôi sau khi dắt vào hang, lại bỏ theo một ít ngọn cỏ non trước
cửa, để tôi nếu có bỡ ngỡ, thì đã có ít thức ăn sẵn trong vài ngày
rồi mẹ tôi trở về
tôi cũng không buồn
trái lại, còn thấy làm khoan khoái vì được ở một mình nơi thoáng đãng, mát mẻ
tôi vừa thầm cảm ơn mẹ, vừa sạo sục thăm tất cả các hang mẹ đưa đến ở
khi đã xem xét cẩn thận rồi, tôi ra đứng ở ngoài cửa và ngửng mặt lên trời
qua những ngọn cỏ ấu nhọn và sắc, tôi thấy màu trời trong xanh
tôi dọn giọng, vỗ đôi cánh nhỏ tới nách, rồi cao hứng gáy lên mấy tiếng rõ to
từ đây, tôi bắt đầu vào cuộc đời của tôi
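As a concrete illustration (not necessarily the authors’ exact script), this step can be reproduced with the underthesea toolkit’s sentence tokenizer, assuming it exposes the same sentence-boundary detector [4]; the file names are hypothetical:

from underthesea import sent_tokenize  # Vietnamese NLP toolkit [4]

# hypothetical input/output file names for one document
with open("diary_of_a_cricket.txt", encoding="utf-8") as f:
    sentences = sent_tokenize(f.read())
with open("diary_of_a_cricket.sent.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(sentences))  # one sentence per line, as in Table 4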

Afterward, we perform the word tokenizing step combined with POS-tagging: we loop
through each sentence and execute the POS-tagging method of Vu Anh et al. [4], which
utilizes Conditional Random Fields. The output of POS-tagging is a Python array where
each element contains a pair of a tokenized word (single or compound) with its
respective POS-tag, as shown in Table 5.

Table 5. Sample output of Word Tokenizing step performed on “Diary of a Cricket”.



[(‘tôi’, ‘P’), (‘là’, ‘V’), (‘em út’, ‘N’), (‘,’, ‘CH’), (‘bé’, ‘N’), (‘nhất’, ‘A’), (‘nên’, ‘C’), (‘được’,
‘V’), (‘mẹ’, ‘N’), (‘tôi’, ‘P’), (‘sau’, ‘N’), (‘khi’, ‘N’), (‘dắt’, ‘V’), (‘vào’, ‘E’), (‘hang’, ‘N’), (‘,’,
‘CH’), (‘lại’, ‘R’), (‘bỏ’, ‘V’), (‘theo’, ‘V’), (‘một ít’, ‘L’), (‘ngọn’, ‘Nc’), (‘cỏ’, ‘N’), (‘non’,
‘A’), (‘trước’, ‘E’), (‘cửa’, ‘N’), (‘,’, ‘CH’), (‘để’, ‘E’), (‘tôi’, ‘P’), (‘nếu’, ‘C’), (‘có’, ‘V’), (‘bỡ
ngỡ’, ‘A’), (‘,’, ‘CH’), (‘thì’, ‘C’), (‘đã’, ‘R’), (‘có’, ‘V’), (‘ít’, ‘A’), (‘thức ăn’, ‘N’), (‘sẵn’, ‘A’),
(‘trong’, ‘E’), (‘vài’, ‘L’), (‘ngày’, ‘N’), (‘.’, ‘CH’)]
[(‘rồi’, ‘C’), (‘mẹ’, ‘N’), (‘tôi’, ‘P’), (‘trở về’, ‘V’), (‘.’, ‘CH’)]
[(‘tôi’, ‘P’), (‘cũng’, ‘R’), (‘không’, ‘R’), (‘buồn’, ‘V’), (‘.’, ‘CH’)]
[(‘trái lại’, ‘N’), (‘,’, ‘CH’), (‘còn’, ‘C’), (‘thấy’, ‘V’), (‘làm’, ‘V’), (‘khoan khoái’, ‘N’), (‘vì’,
‘E’), (‘được’, ‘V’), (‘ở’, ‘V’), (‘một mình’, ‘X’), (‘nơi’, ‘N’), (‘thoáng đãng’, ‘V’), (‘,’, ‘CH’),
(‘mát mẻ’, ‘N’), (‘.’, ‘CH’)]
[(‘tôi’, ‘P’), (‘vừa’, ‘R’), (‘thầm’, ‘A’), (‘cảm ơn’, ‘V’), (‘mẹ’, ‘N’), (‘,’, ‘CH’), (‘vừa’, ‘R’),
(‘sạo sục’, ‘V’), (‘thăm’, ‘V’), (‘tất cả’, ‘P’), (‘các’, ‘L’), (‘hang’, ‘N’), (‘mẹ’, ‘N’), (‘đưa’, ‘V’),
(‘đến’, ‘V’), (‘ở’, ‘V’), (‘.’, ‘CH’)]
[(‘khi’, ‘N’), (‘đã’, ‘R’), (‘xem xét’, ‘V’), (‘cẩn thận’, ‘A’), (‘rồi’, ‘T’), (‘,’, ‘CH’), (‘tôi’, ‘P’),
(‘ra’, ‘V’), (‘đứng’, ‘V’), (‘ở’, ‘E’), (‘ngoài’, ‘E’), (‘cửa’, ‘N’), (‘và’, ‘C’), (‘ngửng mặt’, ‘V’),
(‘lên’, ‘V’), (‘trời’, ‘N’), (‘.’, ‘CH’)]
[(‘qua’, ‘V’), (‘những’, ‘L’), (‘ngọn’, ‘Nc’), (‘cỏ’, ‘N’), (‘ấu’, ‘N’), (‘nhọn’, ‘A’), (‘và’, ‘C’),
(‘sắc’, ‘V’), (‘,’, ‘CH’), (‘tôi’, ‘P’), (‘thấy’, ‘V’), (‘màu’, ‘N’), (‘trời’, ‘N’), (‘trong’, ‘E’),
(‘xanh’, ‘A’), (‘.’, ‘CH’)]
[(‘tôi’, ‘P’), (‘dọn giọng’, ‘V’), (‘,’, ‘CH’), (‘vỗ’, ‘V’), (‘đôi’, ‘M’), (‘cánh’, ‘N’), (‘nhỏ’, ‘A’),
(‘tới’, ‘E’), (‘nách’, ‘N’), (‘,’, ‘CH’), (‘rồi’, ‘C’), (‘cao hứng’, ‘V’), (‘gáy’, ‘V’), (‘lên’, ‘V’),
(‘mấy’, ‘L’), (‘tiếng’, ‘N’), (‘rõ’, ‘A’), (‘to’, ‘A’), (‘.’, ‘CH’)]
[(‘từ’, ‘E’), (‘đây’, ‘P’), (‘,’, ‘CH’), (‘tôi’, ‘P’), (‘bắt đầu’, ‘V’), (‘vào’, ‘E’), (‘cuộc đời’, ‘N’),
(‘của’, ‘E’), (‘tôi’, ‘P’), (‘.’, ‘CH’)]

Note that we perform this step for all included words, including proper nouns for
individual persons, places, or organizations. These are valuable data that help with
document classification; for example, multiple presidents’ names will help conclude
that the said document is politics related. Moreover, special characters are also
POS-tagged: dialogs keep their respective quotation marks, and the end of a sentence
can be a period or a quotation mark depending on the nature of the sentence itself.
This preservation helps indicate whether a certain category includes a lot of
dialogue (by counting quotation marks), or that brackets are mostly used in
scientific articles and books and only sometimes in novels.
Next, we transform these data into a combined state, where each item consists of a
word and its POS-tag, which we call a CWP (combined word with POS-tag). Following
the study above, this combined item differentiates itself from its homophones in
77.31% of cases or better. The output of this step should look like the sample
presented in Table 6.

Table 6. Sample output of POS-tag and Word combining step performed on “Diary of a
Cricket”.

tôi__P là__V em_út__N,__CH bé__N nhất__A nên__C được__V mẹ__N tôi__P sau__N
khi__N dắt__V vào__E hang__N,__CH lại__R bỏ__V theo__V một_ít__L ngọn__Nc cỏ__N
non__A trước__E cửa__N,__CH để__E tôi__P nếu__C có__V bỡ_ngỡ__A,__CH thì__C
đã__R có__V ít__A thức_ăn__N sẵn__A trong__E vài__L ngày__N.__CH
rồi__C mẹ__N tôi__P trở_về__V.__CH
tôi__P cũng__R không__R buồn__V.__CH
trái_lại__N,__CH còn__C thấy__V làm__V khoan_khoái__N vì__E được__V ở__V
một_mình__X nơi__N thoáng_đãng__V,__CH mát_mẻ__N.__CH
tôi__P vừa__R thầm__A cảm_ơn__V mẹ__N,__CH vừa__R sạo_sục__V thăm__V tất_cả__P
các__L hang__N mẹ__N đưa__V đến__V ở__V.__CH
khi__N đã__R xem_xét__V cẩn_thận__A rồi__T,__CH tôi__P ra__V đứng__V ở__E ngoài__E
cửa__N và__C ngửng_mặt__V lên__V trời__N.__CH
qua__V những__L ngọn__Nc cỏ__N ấu__N nhọn__A và__C sắc__V,__CH tôi__P thấy__V
màu__N trời__N trong__E xanh__A.__CH
tôi__P dọn_giọng__V,__CH vỗ__V đôi__M cánh__N nhỏ__A tới__E nách__N,__CH rồi__C
cao_hứng__V gáy__V lên__V mấy__L tiếng__N rõ__A to__A.__CH
từ__E đây__P,__CH tôi__P bắt_đầu__V vào__E cuộc_đời__N của__E tôi__P.__CH

Since Vietnamese compound words are multiple single-syllable words written next to
each other, to represent these words we use the underscore character “_” to join
their units, and a double underscore before the POS-tag. Thus, the noun “thức ăn”
(food) is represented as “thức_ăn__N”. The final output of this step is a list of all
tokenized sentences, each with a label stating which category it belongs to. For the
sake of simplicity, we have written these data to a separate file for each book or
article.
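A small sketch of this combining step, using the toolkit’s POS-tagger [4]; the helper name is ours, and the exact tags depend on the tagger version:

from underthesea import pos_tag  # CRF-based POS-tagger [4]

def sentence_to_cwp(sentence: str) -> str:
    """Encode a sentence as space-separated CWP tokens such as 'thức_ăn__N'."""
    cwp_tokens = []
    for word, tag in pos_tag(sentence):
        # compound words come back with internal spaces; join them with "_"
        cwp_tokens.append(word.replace(" ", "_") + "__" + tag)
    return " ".join(cwp_tokens)

print(sentence_to_cwp("rồi mẹ tôi trở về."))
# expected along the lines of: rồi__C mẹ__N tôi__P trở_về__V .__CH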
The fourth step is where we merge the output files of the third step into two large
datasets: one with CWP and one without. This way we can compare bare-bone training
against the CWP method. These datasets are written to separate files with the same
structure: a CSV file with two columns, Encoded Label and Processed Sentence. Encoded
labels are used to differentiate between the classes to be classified and to improve
processing performance. We used a sample encoded label set as in Table 7.

Table 7. Encoded labels.


Category Label
Vietnamese fairy tale and stories for kids 1
Vietnamese History books 2
Western fantasy novel 3
Vietnamese news articles 4
World news articles 5
Others 100
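A minimal sketch of this merge step, assuming each per-book file holds one CWP-encoded sentence per line; the file names and mapping below are hypothetical:

import csv

# hypothetical mapping from per-book CWP files to the encoded labels of Table 7
files_to_label = {"diary_of_a_cricket.cwp.txt": 1, "lotr_book1.cwp.txt": 3}

with open("dataset_cwp.csv", "w", newline="", encoding="utf-8") as out:
    writer = csv.writer(out)
    writer.writerow(["encoded_label", "processed_sentence"])
    for path, label in files_to_label.items():
        with open(path, encoding="utf-8") as f:
            for sentence in f:
                if sentence.strip():
                    writer.writerow([label, sentence.strip()])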

In step five, we build the TF-IDF weight matrix with a vectorizer. This step
evaluates the relevance of each CWP to its sentence and its encoded label. We perform
this step on the training dataset, consisting of 80% of the rows of the combined
dataset from step four. The output of this step is presented in Table 8.

Table 8. Sample of the fitted TF-IDF vocabulary (CWP token to feature index), built from “Diary of a Cricket”.



‘tôi__p’: 4068, ‘sống__v’: 3289, ‘từ__e’: 4223, ‘bé__n’: 218, ‘__ch’: 48, ‘một__m’: 2327,
‘sự__n’: 3331, ‘đáng__v’: 4694, ‘suốt__a’: 3209, ‘đời__n’: 4933, ‘ấy__p’: 4973, ‘là__v’: 1935,
‘lâu_đời__a’: 1966, ‘trong__e’: 3807, ‘họ__p’: 1560, ‘nhà__n’: 2562, ‘chúng_tôi__p’: 559,
‘vả_lại__c’: 4378, ‘mẹ__n’: 2301, ‘thường__r’: 3556, ‘bảo__v’: 281, ‘rằng__c’: 3117,
‘phải__v’: 2877, ‘như__c’: 2603, ‘thế__p’: 3617, ‘để__e’: 4858, ‘các__l’: 715, ‘con__n’: 673,
‘biết__v’: 138, ‘một_mình__x’: 2331, ‘cho__e’: 467, ‘quen__v’: 2946, ‘đi__v’: 4644,
‘con_cái__n’: 678, ‘mà__c’: 2194, ‘cứ__r’: 894, ‘vào__e’: 4307, ‘bố_mẹ__n’: 358, ‘thì__c’:
3510, ‘chỉ__r’: 623, ‘sinh__v’: 3180, ‘ra__v’: 3037, ‘tính__v’: 4056, ‘xấu__a’: 4522, ‘lắm__r’:
2075, ‘rồi__c’: 3128, ‘ra_đời__v’: 3044, ‘không__r’: 1720, ‘gì__p’: 1269, ‘đâu__p’: 4711,
‘bởi__e’: 383, ‘nào__p’: 2685, ‘cũng__r’: 786, ‘vậy__p’: 4393, ‘đẻ__v’: 4841, ‘xong__v’:
4474, ‘là__c’: 1933, ‘có__v’: 746, ‘ba__m’: 88, ‘anh_em__n’: 69, ‘ở__v’: 4993, ‘với__e’: 4426,
‘hôm__n’: 1482, ‘tới__e’: 4203, ‘thứ__n’: 3687, ‘trước__n’: 3885, ‘đứa__nc’: 4942, ‘nửa__n’:
2781, ‘lo__v’: 1896, ‘vui__a’: 4294, ‘theo__v’: 3379, ‘sau__n’: 3164, ‘dẫn__v’: 1033, ‘và__c’:
4301, ‘đem__v’: 4635, ‘đặt__v’: 4836, ‘mỗi__l’: 2322, ‘cái__nc’: 720, ‘hang__n’: 1318,
‘đất__n’: 4807, ‘ở__e’: 4992, ‘bờ__n’: 382, ‘ruộng__n’: 3072, ‘phía__n’: 2844, ‘bên__n’: 223,
‘kia__p’: 1766, ‘chỗ__n’: 641, ‘trông__v’: 3864, ‘đầm__n’: 4815, ‘nước__n’: 2722, ‘đã__r’:
4718, ‘đắp__v’: 4830, ‘thành__v’: 3452, ‘bao_giờ__p’: 103, ‘nhất__a’: 2618, ‘nên__c’: 2697,
‘được__v’: 4775, ‘khi__n’: 1658, ‘dắt__v’: 1041, ‘lại__r’: 2044, ‘bỏ__v’: 352, ‘một_ít__l’:
2336, ‘ngọn__nc’: 2517, ‘cỏ__n’: 862, ‘non__a’: 2679, ‘trước__e’: 3884, ‘cửa__n’: 902,
‘nếu__c’: 2750, ‘ít__a’: 4596, ‘thức_ăn__n’: 3693, ‘sẵn__a’: 3281, ‘vài__l’: 4302, ‘ngày__n’:
2449, ‘trở_về__v’: 3955, ‘buồn__v’: 167, ‘còn__c’: 743, ‘thấy__v’: 3588, ‘làm__v’: 1936,
‘vì__e’: 4319, ‘nơi__n’: 2721, ‘vừa__r’: 4437, ‘thầm__a’: 3589, ‘cảm_ơn__v’: 821, ‘thăm__v’:
3544, ‘tất_cả__p’: 4133, ‘đưa__v’: 4764, ‘đến__v’: 4848, ‘xem_xét__v’: 4462,
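As an illustration of this step with scikit-learn (one plausible realization, not necessarily the authors’ exact setup), the sketch below fits a TfidfVectorizer on CWP sentences; the toy corpus, its labels, and the assumption that “Max feature” maps to max_features and “Random seed” to the split seed are ours:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# toy stand-ins for the merged CSV rows; the labels are arbitrary examples
cwp_sentences = [
    "rồi__C mẹ__N tôi__P trở_về__V .__CH",
    "tôi__P cũng__R không__R buồn__V .__CH",
    "khi__N đã__R xem_xét__V cẩn_thận__A rồi__T ,__CH",
    "từ__E đây__P ,__CH tôi__P bắt_đầu__V vào__E cuộc_đời__N của__E tôi__P .__CH",
]
labels = [1, 1, 3, 3]

X_train, X_test, y_train, y_test = train_test_split(
    cwp_sentences, labels, test_size=0.20, random_state=500)

# token_pattern=r"\S+" keeps each CWP token ("thức_ăn__N", ",__CH") intact;
# the default lowercasing explains the lowercase tags seen in Table 8
vectorizer = TfidfVectorizer(token_pattern=r"\S+", max_features=5000)
X_train_tfidf = vectorizer.fit_transform(X_train)  # fit on the 80% split only
X_test_tfidf = vectorizer.transform(X_test)
print(vectorizer.vocabulary_)  # CWP-to-column mapping, as sampled in Table 8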

The TF-IDF weight matrix can now be accessed, with its top rows sampled in Table 9.
We can now look at some cases where CWP effectively differentiates homophones, as
sampled in Table 10.

Table 9. Sample entries of the TF-IDF weight matrix: (document index, feature index) followed by the TF-IDF weight.


(0, 4644) 0.168109877554555
(0, 4254) 0.4531296121077332
(0, 2872) 0.3682155978026012
(0, 2595) 0.35653421293953347
(0, 2383) 0.36033217381957333
(0, 1496) 0.21122947912465437
(0, 804) 0.34709010848646565
(0, 703) 0.43111976142881975
(0, 48) 0.15139449063208063
(1, 4718) 0.18776376846938878
(1, 3367) 0.39076469627344607
(1, 3102) 0.3703815706378775
(1, 1784) 0.33743975469057436
(1, 1749) 0.3143520631846064
(1, 1720) 0.16931995401726685
(1, 1235) 0.4196143541973651
(1, 745) 0.30462437923547336
(1, 185) 0.40110986375766133

Table 10. Sampling processed data – Homophone differentiation.


# Homophone Words Notes
218 bé bé__n A noun (a baby)
217 bé__a An adjective (small)
1183 giáp giáp__v A verb (being adjacent to)
1182 giáp__n A noun (armor)
1181 giáp__m An article

Afterward, we train and test with a Support Vector Machine (SVM). The reason for
this choice is that SVM is a very universal learner and is independent of the
dimensionality of the feature space, according to Thorsten Joachims [14]. It is
proven effective and generalizes well even with many features involved, which is
ideal in our case study. In practice, with some tuning, we achieved the results shown
in Table 11.

Table 11. SVM accuracy rate.


Random seed                        500     100     100     500     1000    5000    5000    500     200     100     50
Max feature                        5       5       50      50      100     100     500     5000    5000    5000    5000
With CWP                           63.295  63.341  67.940  68.189  72.029  72.282  82.917  90.734  91.075  90.573  90.799
Without CWP, with combined words   63.173  63.296  69.892  69.781  72.631  72.788  81.510  90.144  90.140  90.179  90.106

The highest accuracy rate we obtained is 91.075% with CWP, and 90.179% without CWP.
Based on these data, we conclude that CWP does not always give better results, but
with tuning it can give a significant improvement over training without it.
For comparison purposes, we also trained a Naïve Bayes classifier model (arguably
providing one of the fastest processing speeds out of the box), with a slight
improvement as stated in Table 12.

Table 12. Naïve-Bayes classifier accuracy rate.


Random seed                        500     100     100     500     1000    5000    5000    500     200     100     50
Max feature                        5       5       50      50      100     100     500     5000    5000    5000    5000
With CWP                           63.295  63.341  66.916  66.532  70.805  70.847  80.041  87.980  88.329  88.179  88.064
Without CWP, with combined words   63.173  63.296  67.272  67.107  70.643  70.417  78.990  87.248  87.336  87.551  86.999

The best we were able to achieve without overfitting is 88.329% with CWP, and
87.551% without CWP. The trend held, as proper tuning was needed for this result. It
is also safe to conclude that, in general, even though SVM leads in accuracy in most
cases, the Naïve Bayes classifier still processes faster with a reasonable accuracy
rate.
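Continuing the vectorizer sketch above (and reusing its variables), the two classifiers can be trained and scored as follows; LinearSVC is one common SVM choice for sparse text features, though the paper does not state which SVM implementation was tuned:

from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

svm_clf = LinearSVC().fit(X_train_tfidf, y_train)     # SVM learner, cf. [14]
print("SVM accuracy:", accuracy_score(y_test, svm_clf.predict(X_test_tfidf)))

nb_clf = MultinomialNB().fit(X_train_tfidf, y_train)  # fast out-of-the-box baseline
print("NB accuracy:", accuracy_score(y_test, nb_clf.predict(X_test_tfidf)))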
Comparing with results from the “underthesea group”, who have been developing and
maintaining the popular NLP Python toolkit for Vietnamese language processing,
published on their official website [2] and quoted in Table 13, we conclude that even
after skipping the lemmatizing step, CWP provides a modestly comparable result. This
shows that CWP not only saves processing power but also remains competitive on the
accuracy front.

Table 13. underthesea group’s work result – version 1.1.17 - 2020 [2].
TfidfVectorizer(ngram_range = (1, 2), max_df = 0.5)   92.8
CountVectorizer(ngram_range = (1, 3), max_df = 0.7)   89.3
TfidfVectorizer(max_df = 0.8)                         89.0
CountVectorizer(ngram_range = (1, 3))                 88.9
TfidfVectorizer(ngram_range = (1, 3))                 86.8
CountVectorizer(max_df = 0.7)                         85.5

5 Conclusion and Perspectives

This paper proposes combining a word with its POS to give it a unique ID, thus
effectively distinguishing between homophones while preserving the difference between
synonyms. The dataset was collected and (semi-)manually processed from various fields,
and then a sequence of techniques for customizing and training the learning model was
applied. The classification model could contribute positively to many Vietnamese
text-analysis based businesses, such as social networks, e-commerce, etc. The results
prove that CWP not only saves processing power but also competes in accuracy with the
best-in-class results in the same field.
For future work, we shall apply the studied method to sentiment analysis of social
network posts and comments, and to the analysis of e-commerce reviews for data mining
purposes. The study can also serve as a reference for other fields; in general, for
any text processing field that requires the preservation of synonym differences, such
as social network analysis and sentiment analysis [15, 16].

References
1. Cai, J., Li, J., Li, W., Wang, J.: Deep learning model used in text classification. In:
Proceedings of the 15th International Computer Conference on Wavelet Active Media Technology
and Information Processing (ICCWAMTIP), Chengdu, China, pp. 123–126 (2018)
2. Le, H., Hoang, T., Toan, D.M.: Homophone and polysemous words in Vietnamese (in
comparison with modern Han language) - Đồng âm và đa nghĩa trong tiếng Việt (Đối chiếu
với tiếng Hán hiện đại) (Vietnamese) (2011)
3. Halpern, J.: Is Vietnamese a hard language? http://www.kanji.org/kanji/jack/vietnamese/is_
VN_hard_sum_EN_VN.pdf
4. Le, H. et al.: Vietnamese NLP Toolkit https://underthesea.readthedocs.io
5. Institute of Linguistics of Vietnam: Vietnamese dictionary - Republish 12th ed, (2006)
6. Thuat, D.T.: Vietnamese phonetic (Ngữ âm tiếng Việt (Vietnamese)), Hanoi, p. 89 (1977)
7. Tolkien, J.R.R.: Lord of the Ring - Book 1: The Fellowship of the Ring, Vietnamese
Literature publisher, translated by Yen. N.T.T, Viet, D.T. (2013)
8. A collection of Grimm fairy tales, Dan Tri publisher, translated by various members (2008)
9. Hoai, T.: Diary of a Cricket, Kim Dong publisher (2014)
10. 109 modern Vietnamese fables – Various artists, Hong Duc publisher (2018)
11. Various newspaper published during 2019–2020 at https://vnexpress.net
12. Various newspaper published during 2019–2020 at https://thanhnien.vn
13. Various newspaper published during 2019–2020 at https://dantri.com.vn
14. Joachims, T.: Text Categorization with Support Vector Machines: Learning with Many
Relevant Features (2005)
15. Hung, P.D., Giang, T.M., Nam, L.H., Duong, P.M., Van Thang, H., Diep, V.T.: Smarthome
control unit using Vietnamese speech command. In: Vasant, P., Zelinka, I., Weber, G.W.
(eds.) Intelligent Computing and Optimization. ICO 2019. Advances in Intelligent Systems
and Computing, vol 1072. Springer, Cham (2019)
16. Tram, N.N., Hung, P.D.: Analyzing hot Facebook users posts’ sentiment. In: Emerging
Technologies in Data Mining and Information Security Proceedings of IEMIS (2020)
A New Topological Sorting Algorithm
with Reduced Time Complexity

Tanzin Ahammad1, Mohammad Hasan1,2, and Md. Zahid Hassan1,2
1 Chittagong University of Engineering and Technology, Chattogram, Bangladesh
hasancse.cuet13@gmail.com
2 Department of CSE, Bangladesh Army University of Science and Technology (BAUST),
Saidpur, Bangladesh

Abstract. To find a topological ordering of a directed acyclic graph (DAG), Kahn’s
and Depth First Search (DFS) topological sorting algorithms are used; both have time
complexity O(|V| + |E|). Here a completely new topological sorting algorithm is
proposed that reduces the time complexity of the previous algorithms. By separating
the vertices having outgoing edges from the vertices having no outgoing edges, and
then removing outgoing edges step by step, we can find a topological ordering of any
DAG. The time complexity of the proposed algorithm reduces to
O(∑_{i=1}^{|V|} |(NE)_i|) for both the average case and the worst case, but for the
best case it reduces to O(|V|), where |V| is the number of vertices and |(NE)_i| is
the number of vertices containing at least one outgoing edge at step i. The algorithm
can also detect cycles in a graph, and it can be used for resolving dependencies,
scheduling systems, planning, and in many graph algorithms.

Keywords: Topological sorting · Directed acyclic graph · Complexity · Set · Vertex

1 Introduction

Topological sorting of a directed acyclic graph (DAG) means a linear ordering of the
vertices of the graph such that for every directed edge (u, v) from vertex u to
vertex v, u comes before v. In a graph, a vertex represents a task to be performed,
and an edge indicates the constraint that one task must be performed after another; a
valid sequence for the tasks is then a topological ordering of the tasks. Topological
ordering is possible only for DAGs: if at least one cycle exists in a graph, then no
topological ordering of the graph is possible. Every DAG has at least one topological
ordering.
The first topological sorting algorithm was given by Kahn (1962). In Kahn’s
algorithm, topological sorting starts by taking from the graph the vertices which do
not have any incoming edges; if the graph is acyclic, then this set must contain at
least one such vertex. Graph traversal begins from this set of vertices.


Then the outgoing edges of these vertices have to be found and removed. Each affected
vertex then has to be checked again for other incoming edges; if there are none, the
vertex is inserted into the set of sorted elements. So, Kahn’s algorithm uses a
breadth-first search technique. In this algorithm, all vertices and edges have to be
checked starting from the start vertices, and then every element has to be checked
over again. So, the overall time complexity of Kahn’s algorithm is O(|V| + |E|).
Another popular topological sorting algorithm is the DFS topological sorting
algorithm, which is based on the DFS graph traversal technique. In the DFS
topological sorting algorithm, graph traversal starts from the start vertex and
proceeds downward from it; when a vertex is reached from which traversal cannot
continue, a previous-level vertex is visited through backtracking. The time
complexity of this DFS topological sorting algorithm is also O(|V| + |E|).
In the proposed algorithm, traversal of the full graph is not needed: we just list the
empty vertices (vertices containing no outgoing edge) in one set and the non-empty
vertices (vertices containing at least one outgoing edge) in another set, then order
the vertices by removing empty vertices from the non-empty vertex set. In our
proposed algorithm, the time complexity is reduced to O(∑_{i=1}^{|V|} |(NE)_i|),
because two sets are used and the number of elements decreases at each step. The
space complexity of our proposed algorithm is O(|V| + |E|). So, the proposed
algorithm is completely new and finds the topological ordering of a graph in optimal
time. The proposed algorithm can also detect a cycle in a graph. This is a completely
new algorithm for topological sorting which reduces the time complexity of
topological sorting, avoids using any graph traversal algorithm such as BFS or DFS,
and offers a new approach to a simple implementation of topological sorting. The
algorithm returns false if any cycle is present in the graph, in which case no
topological sorting is possible; so this new topological sorting algorithm can also
be used for detecting cycles in graphs. Its implementation is very simple.
Topological sorting is a popular algorithm for scheduling tasks. For a long period of
time, researchers have continued research on this topic to improve the algorithm, and
they are also working on applications of topological sorting, so there is large space
for research on this algorithm. The applications of the topological sorting algorithm
are numerous: it can be used for complex database tables that have dependencies, for
ordering objects in a machine, for course prerequisite systems, and many more.
Topological ordering is mostly used for planning and scheduling.

2 Related Works

A depth-first discovery algorithm (DFDA) to find a topological sorting is proposed in
paper [1]. To reduce the temporary space needed for a large DAG, the authors used
depth-first search; DFDA is more efficient and simpler than the Discovery Algorithm
(DA). Three algorithms for generating all topological sortings are described in paper
[2], which improve Well’s topological sorting algorithm. A new sorting algorithm that
reduces the time complexity of topological sorting is implemented in paper [3] using
a Matlab script. This algorithm is more efficient than Kahn’s and the DFS topological
sorting algorithms and requires less storage than DFS.

However, in the case of complex and large systems, this algorithm increases the
complexity. A new algorithm for checking adulterant nodes in a food supply chain is
proposed in paper [4]. A novel optimization algorithm is proposed in paper [5] to
find many diverse topological sorts, which imply many cut-sets, for a directed
acyclic graph; the proposed algorithm is very effective for enhancing grid
observation. For large graphs, another topological sorting algorithm is given in
paper [6]: an I/O-efficient algorithm named IterTS, which is, however, inefficient in
the worst case. A new algorithm is given in paper [7] that produces topological
sortings of sparse digraphs with better time complexity than all other previous
algorithms; this topological sorting is dynamically maintained, and the authors give
an experimental comparison of topological sorting for large randomly generated
directed acyclic graphs. In paper [8], P. Woelfel solved topological sorting with
O(log² |V|) OBDD operations; for a fundamental graph problem, it is the first true
runtime analysis of a symbolic OBDD algorithm. In paper [9], algorithms are proposed
for single-source shortest paths, computing a directed ear decomposition, and
topological sorting of a planar directed acyclic graph in O(sort(N)) I/Os, where
sort(N) is the number of I/Os needed to sort N elements. All topological orderings of
a directed acyclic graph are found in paper [10], which proposes an extended
algorithm capable of finding all topological solutions of a DAG, implemented using
backtracking, iteration in place of recursion, and suitable data structures. In paper
[11], a new algorithm using a parallel computation approach is proposed. This
algorithm for finding a topological sorting of a directed acyclic graph is very
simple: nodes are traversed in parallel from a source node and marked as visited;
after all the source nodes have been visited, all other nodes in the graph are marked
as visited, and for an acyclic graph the parallel traversal terminates. The
implementation of this algorithm on an SIMD machine is discussed in that paper, and
the time complexity of the proposed parallel algorithm is of the order of the longest
distance between a source and a sink node of the directed acyclic graph. Topological
sorting of large networks is given in paper [12]. General topological algorithms are
not efficient at finding a topological ordering of a large network, so A. B. Kahn
proposed this procedure to find topological orderings of large networks efficiently.
Using this algorithm, a PERT (Program Evaluation Review Technique) network containing
30,000 activities has been maintained, which can be ordered in less than one hour of
machine time. This is an efficient method because the location of every item is
known, so no searching for an item is needed, and a single iteration suffices for the
correct sequencing of the network. If all events are traversed, or any cycle is found
in the network, the procedure terminates. This method is more efficient than all
other topological algorithms for large networks. In paper [13], T. Ahammad et al.
proposed a new dynamic programming algorithm to solve the job sequencing problem,
reducing its time complexity from O(n²) to O(mn); they used the tabulation method of
dynamic programming and showed a step-by-step simulation of their algorithm’s working
procedure with experimental analysis. In paper [14], J. F. Beetem proposed an
efficient algorithm for hierarchical topological sorting, which is also applicable in
the presence of apparent loops; its application is also described in that paper.

3 Overview of the Proposed Algorithm

In this proposed algorithm, from a graph we put the empty vertices, meaning those
vertices which contain no outgoing edges, in a set E, and the non-empty vertices,
meaning those which contain at least one outgoing edge, in another set NE. In each
step, we take an empty vertex, remove it from all non-empty vertices, and also remove
this particular vertex from set E. If any vertex becomes empty during this procedure,
we put it in set E. The topological ordering is the set T, the sequence in which we
choose the empty vertices from set E. When set E becomes empty, the algorithm
terminates; if T contains all vertices of the graph, then no cycle exists, the graph
is a DAG, and T is a correct topological order. But if T does not contain all
vertices of the graph, then the graph contains a cycle, the graph is not a DAG, and
no topological ordering is possible (Fig. 1).

Fig. 1. Flow Chart of topological sorting algorithm.



4 Algorithm
4.1 Pseudocode
Here G is the set of vertices of the graph, which is divided into two sets: E
(containing the empty vertices) and NE (containing the non-empty vertices). An empty
vertex has no outgoing edges; a non-empty vertex has at least one outgoing edge.
When the set of empty vertices E becomes empty, the ordering is complete. If the size
of T is not equal to the size of G, then the graph has at least one cycle and is not
a DAG, so topological sorting of this graph is not possible. Otherwise, if the size
of T is equal to the size of G, a topological ordering is possible, and the elements
of set T in reverse order are in topological order. From this algorithm, we get a
correct topological ordering of the graph in set T after reversing T. If a
topological ordering is not possible, or the algorithm detects a cycle, the algorithm
returns false; otherwise it returns the set T, a correct topological ordering of
graph G. By this procedure the algorithm also detects cycles in a graph.

1. G: a set of vertices of a graph
2. T: an empty set that will be filled in topological order
3. Procedure TopologicalSort(G, T)
4.   E <- the set of empty vertices of G
5.   NE <- the set of non-empty vertices of G
6.   repeat
7.     v <- any vertex from E
8.     for all w ∈ NE do
9.       if v ∈ w then
10.        remove v from w
11.        if w is empty then
12.          add w to E and remove it from NE
13.        end if
14.      end if
15.    end for
16.    add v to T and remove it from E
17.  until E is empty
18.  if T size ≠ G size then
19.    return false (topological sorting is not possible)
20.  else
21.    reverse T
22.    return T
23.  end if
24. end procedure
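A direct Python transcription of this pseudocode is sketched below (our own rendering, with the graph given as an adjacency structure of sets):

def topological_sort(graph):
    """Order the vertices of a DAG by repeatedly removing empty vertices.

    graph: dict mapping each vertex to the set of vertices it points to.
    Returns a topological order as a list, or False if the graph has a cycle.
    """
    out_edges = {v: set(ws) for v, ws in graph.items()}
    E = {v for v, ws in out_edges.items() if not ws}   # empty vertices
    NE = {v for v, ws in out_edges.items() if ws}      # non-empty vertices
    T = []
    while E:
        v = E.pop()                   # take any empty vertex
        for w in list(NE):            # remove v from every non-empty vertex
            if v in out_edges[w]:
                out_edges[w].discard(v)
                if not out_edges[w]:  # w just became empty
                    NE.discard(w)
                    E.add(w)
        T.append(v)
    if len(T) != len(graph):
        return False                  # a cycle exists: no topological order
    T.reverse()                       # sinks were collected first
    return T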

4.2 Example
Let a directed acyclic graph (DAG) be given (see Fig. 2). We have to find a correct
topological ordering of this graph.

Fig. 2. A directed acyclic graph (DAG) for topological sorting.

The graph has 14 vertices, numbered 0−13, and 16 edges (Fig. 2). The adjacency list
of this graph is:
0->
1->9
2->0
3->2
4->5, 6, 7
5->
6->7
7->8
8->
9->4
10->0, 1, 12
11->13
12->0, 11
13->5
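Using the Python sketch from Sect. 4.1, this adjacency list can be written as a dictionary and sorted directly; since empty vertices may be taken in any order, the printed ordering is one of several valid results:

dag = {0: set(), 1: {9}, 2: {0}, 3: {2}, 4: {5, 6, 7}, 5: set(), 6: {7}, 7: {8},
       8: set(), 9: {4}, 10: {0, 1, 12}, 11: {13}, 12: {0, 11}, 13: {5}}
print(topological_sort(dag))
# one valid result: [10, 1, 9, 12, 4, 11, 6, 3, 13, 7, 2, 8, 5, 0]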

Since the graph is a directed acyclic graph (DAG), at least one topological ordering
exists. Here the empty vertices (vertices with no outgoing edges) are (0, 5, 8). We
can take these empty vertices in any order; each order is valid for a particular
step. After taking these empty vertices, the procedure begins. If the graph is a DAG,
then in the first step at least one empty vertex will exist.
Below we show the adjacency list simulation for finding a topological ordering of
this graph. The first table represents the adjacency list of the graph (Fig. 3); by
removing its empty vertices, its size reduces step by step, as shown below.

Fig. 3. Step by step visualization of proposed algorithm working procedure with an example.

Step 1: Find the empty vertices (0, 5, 8) and insert these vertices in a set T (0, 5, 8).
Then remove these vertices from adjacency list and also remove these from all other
vertices. Now the empty vertices are (2, 7, 13).

Step 2: Find the empty vertices (2, 7, 13) and insert these vertices in the set T(0, 5, 8,
2, 7, 13). Then remove these vertices from adjacency list and also remove these from
all other vertices. Now the empty vertices are (3, 6, 11).

Step 3: Find the empty vertices (3, 6, 11) and insert these vertices in the set T(0, 5, 8,
2, 7, 13, 3, 6, 11). Then remove these vertices from adjacency list and also remove
these from all other vertices. Now the empty vertices are (4, 12).

Step 4: Find the empty vertices (4, 12) and insert these vertices in the set T(0, 5, 8, 2,
7, 13, 3, 6, 11, 4, 12). Then remove these vertices from adjacency list and also remove
these from all other vertices. Now the empty vertex is 9 only.

Step 5: Find the empty vertex 9 and insert this vertex in the set T(0, 5, 8, 2, 7, 13, 3, 6,
11, 4, 12, 9). Then remove this vertex from adjacency list and also remove this from all
other vertices. Now the empty vertex is 1 only.

Step 6: Find the empty vertex 1 and insert this vertex in the set T(0, 5, 8, 2, 7, 13, 3, 6,
11, 4, 12, 9, 1). Then remove this vertex from adjacency list and also remove this from
all other vertices. Now the empty vertex is 10 only.

Step 7: Insert the last empty vertex 10 in the set T(0, 5, 8, 2, 7, 13, 3, 6, 11, 4, 12, 9, 1, 10).
Then remove this vertex from the adjacency list. Now no empty vertex remains.
After reversing T, we get a topological ordering of the graph. The topological
ordering of the graph by the proposed algorithm is:
(10->1->9->12->4->11->6->3->13->7->2->8->5->0)
By Kahn’s algorithm, the topological ordering is:
(3->10->2->1->12->9->0->11->4->13->6->5->7->8)
By the DFS toposort algorithm, the topological ordering is:
(10->12->11->13->3->2->1->9->4->6->7->8->5->0)
The proposed topological sorting algorithm is less complex than Kahn’s and the DFS
topological sorting algorithms.

5 Analysis
The time complexity of this proposed algorithm is O(∑_{i=1}^{|V|} |(NE)_i|) and the
space complexity of this proposed algorithm is O(|V| + |E|). The calculation of the
time and space complexities is given in the following subsections.

5.1 Time Complexity


In the proposed algorithm, the outer loop executes |V| times and the inner loop
executes |NE| times. But in this algorithm, for each vertex i = 1 to |V|, the inner
loop bound |(NE)_i| is different:

i = 1 … |(NE)_1|
i = 2 … |(NE)_2|
i = 3 … |(NE)_3|
i = 4 … |(NE)_4|
i = 5 … |(NE)_5|
i = 6 … |(NE)_6|
...
i = |V| … |(NE)_|V||

The total complexity is counted as the sum of the non-empty vertex set sizes over all
steps. So, for i = 1 to |V|:

Complexity = |(NE)_1| + |(NE)_2| + |(NE)_3| + |(NE)_4| + |(NE)_5| + |(NE)_6| + … + |(NE)_|V||
           = ∑_{i=1}^{|V|} |(NE)_i|,   where i = 1, 2, 3, 4, 5, 6, …, |V|        (1)

So, the general time complexity is ∑_{i=1}^{|V|} |(NE)_i|. The time complexity for
the best case, average case, and worst case depends on the |(NE)_i| term.

Best Case Analysis

From Eq. (1), the general time complexity is ∑_{i=1}^{|V|} |(NE)_i|.
If the graph has no edges, then the empty vertex set E will be full and the non-empty
vertex set NE will be empty, so only the outer loop executes, |V| times. Here,
|(NE)_i| is the total number of elements in the non-empty vertex set at step i, and
the cost of each step reduces to 1:
|(NE)_1| = |(NE)_2| = |(NE)_3| = |(NE)_4| = |(NE)_5| = |(NE)_6| = 1
Equation (1) becomes:
Total complexity = 1 + 1 + 1 + … + 1 = |V|
So, the best-case time complexity of this algorithm is Ω(|V|), where |V| is the total
number of vertices of the graph.

Average Case Analysis

From Eq. (1), the general time complexity is ∑_{i=1}^{|V|} |(NE)_i|.
The average-case time complexity depends on the |(NE)_i| term. The outer loop of the
algorithm executes |V| times and the inner loop executes |NE| times. Equation (1)
becomes:
Complexity = |(NE)_1| + |(NE)_2| + |(NE)_3| + |(NE)_4| + |(NE)_5| + |(NE)_6| + … + |(NE)_|V||
           = ∑_{i=1}^{|V|} |(NE)_i|,   where i = 1, 2, 3, 4, 5, 6, …, |V|
So, the average case of this algorithm is Θ(∑_{i=1}^{|V|} |(NE)_i|), which depends on
the |(NE)_i| term; |(NE)_i| is the total number of elements in the non-empty vertex
set at each step i.

Worst Case Analysis

From Eq. (1), the general time complexity is ∑_{i=1}^{|V|} |(NE)_i|.
Here, the complexity depends on the |(NE)_i| term; in the worst case, |NE| is greater
than in the average case. Here also the outer loop of the algorithm executes |V|
times and the inner loop executes |NE| times. Equation (1) becomes:
Complexity = |(NE)_1| + |(NE)_2| + |(NE)_3| + |(NE)_4| + |(NE)_5| + |(NE)_6| + … + |(NE)_|V||
           = ∑_{i=1}^{|V|} |(NE)_i|,   where i = 1, 2, 3, 4, 5, 6, …, |V|
So, the worst-case time complexity of this algorithm is O(∑_{i=1}^{|V|} |(NE)_i|),
which depends on the |(NE)_i| term; |(NE)_i| is the total number of elements in the
non-empty vertex set at each step i.

5.2 Space Complexity


The space complexity of this algorithm is O(|V| + |E|), where |V| is the number of
vertices and |E| is the number of edges. In this algorithm we use an adjacency list,
so the sum of |NE| in the worst case is O(|V| + |E|). Hence the space complexity of
this proposed algorithm is O(|V| + |E|).

6 Comparison with Related Works

Table 1. Comparison of time and space complexity of popular algorithms.

Algorithm/Paper No.   Best case        Average case                 Worst case                   Space complexity
Kahn’s Algorithm      Ω(|V| + |E|)     Θ(|V| + |E|)                 O(|V| + |E|)                 O(|V|)
DFS Algorithm         Ω(|V| + |E|)     Θ(|V| + |E|)                 O(|V| + |E|)                 O(|V|)
Paper [1]             Ω(|A| k log k)   Θ(|A| k log k)               O(|A| k log k)               O(|A| + ||K||)
Paper [2]             Ω(m + n)         Θ(m + n)                     O(m + n)                     O(n)
Proposed Algorithm    Ω(|V|)           Θ(∑_{i=1}^{|V|} |(NE)_i|)    O(∑_{i=1}^{|V|} |(NE)_i|)    O(|V| + |E|)

It is clear that the proposed algorithm’s time complexity is lower than that of all
the other topological sorting algorithms (Table 1).

7 Conclusions and Future Recommendation

In this paper, we proposed a completely new algorithm, implemented in the simplest
way, where no graph traversal technique is needed and the complexity is reduced. The
time complexity for the best case becomes Ω(|V|), where |V| is the total number of
vertices of the graph. The time complexity for the average case is
Θ(∑_{i=1}^{|V|} |(NE)_i|), where |(NE)_i| is the number of non-empty vertices at step
i, and the time complexity for the worst case is O(∑_{i=1}^{|V|} |(NE)_i|). Normally,
the time complexity of a general topological sorting algorithm is O(|V| + |E|). In
this paper, we have shown that our algorithm is better and gives results faster than
other topological sorting algorithms. In the future, we will try to further reduce
the average-case and worst-case time complexity, and we will also do further research
on applications of this proposed algorithm.

References
1. Zhou, J., Müller, M.: Depth-first discovery algorithm for incremental topological sorting of
directed acyclic graphs. Inf. Process. Lett. 88, 195–200 (2003)
2. Kalvin, A.D., Varol, Y.L.: On the generation of all topological sorting. J. Algorithm 4, 150–
162 (1983)
3. Liu, R.: A low complexity topological sorting algorithm for directed acyclic graph. Int.
J. Mach. Learn. Comput. 4(2) (2014)
A New Topological Sorting Algorithm with Reduced Time Complexity 429

4. Barman, A., Namtirtha, A., Dutta, A., Dutta, B.: Food safety network for detecting
adulteration in unsealed food products using topological ordering. Intell. Inf. Database Syst.
12034, 451–463 (2020)
5. Beiranvand, A., Cuffe, P.: A topological sorting approach to identify coherent cut-sets within
power grids. IEEE Trans. Power Syst. 35(1), 721–730 (2019)
6. Ajwani, D., Lozano, A.C., Zeh, N.: A topological sorting algorithm for large graphs.
ACM J. Exp. Algorithmic 17(3) (2012)
7. Pearce, D.J., Kelly, P.H.J.: A dynamic topological sort algorithm for directed acyclic graphs.
ACM J. Exp. Algorithmic 11(1.7) (2006)
8. Woelfel, P.: Symbolic topological sorting with OBDDs. J. Discrete Algorithms 4(1), 51–71
(2006)
9. Arge, L., Toma, L., Zeh, N.: I/O-efficient topological sorting of planar DAGs. In:
Proceedings of the Fifteenth Annual ACM Symposium on Parallel Algorithms and
Architectures, pp. 85–93 (2003)
10. Knuth, D.E., Szwarcfiter, J.L.: A structured program to generate all topological sorting
arrangements. Inf. Process. Lett. 2(6), 153–157 (1974)
11. Er, M.C.: A parallel computation approach to topological sorting. Comput. J. 26(4) (1983)
12. Kahn, A.B.: Topological sorting of large networks. Commun. ACM 5, 558–562 (1962)
13. Ahammad, T., Hasan, M., Hasan, M., Sabir Hossain, M., Hoque, A., Rashid, M.M.: A new
approach to solve job sequencing problem using dynamic programming with reduced time
complexity. In: Chaubey, N., Parikh, S., Amin, K. (eds.) Computing Science, Communi-
cation and Security. COMS2 2020. Communications in Computer and Information Science,
1235. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-6648-6_25
14. Beetem, J.F.: Hierarchical topological sorting of apparent loops via partitioning. IEEE Trans.
Comput. Aided Des. Integr. Circuits Syst. 11(5), 607–619 (1992)
15. Intelligent Computing & Optimization, Conference proceedings ICO 2018, Springer, Cham,
ISBN 978-3-030-00978-6
16. Intelligent Computing and Optimization, Proceedings of the 2nd International Conference on
Intelligent Computing and Optimization 2019 (ICO 2019), Springer International Publish-
ing, ISBN 978-3-030-33585-4
Modeling and Analysis of Framework
for the Implementation of a Virtual Workplace
in Nigerian Universities Using
Coloured Petri Nets

James Okpor1 and Simon T. Apeh2

1 Department of Computer Science, Federal University Wukari,
Wukari, Taraba State, Nigeria
okpor2004@gmail.com
2 Department of Computer Engineering, University of Benin,
Benin City, Edo State, Nigeria

Abstract. This paper presents a Hierarchical Coloured Petri Nets model specifically
designed to support the implementation process of a virtual workplace in Nigerian
universities. CPN Tools 4.0 was used to capture the various phases and activities of
the framework for the implementation of a virtual workplace in a Hierarchical
Coloured Petri Nets (CPN) model. The developed virtual workplace CPN model has been
analyzed using simulation and state space analysis methods. Being a graphical and
executable representation of the implementation framework, the developed CPN model
will help those charged with the responsibility of implementing a virtual workplace
to better understand the implementation process.

Keywords: Coloured Petri Nets · Framework · Nigerian universities · Virtual workplace

1 Introduction

The sudden emergence of the COVID-19 pandemic has impacted the workplace. Several
nations of the world have introduced lockdown measures and social distancing in a bid
to curtail the spread of the novel coronavirus. This has led to the mass
implementation of virtual workplaces by organizations in an attempt to ensure
business continuity [1]. In Nigeria, only a few organizations had adopted the virtual
workplace before the global outbreak of the COVID-19 pandemic. Unfortunately,
Nigerian universities are among those that are yet to embrace the concept of a
virtual workplace [2], even though recent technological advancements in information
and communication technology have altered the workspace and ushered in a new form of
work known as the virtual workplace [3, 4].


Although the impact of COVID-19 on Nigerian universities is enormous due to the
closure of schools [2, 5], this crisis offers a great opportunity for policymakers in
the educational sector to explore ways to reposition Nigerian universities by
adopting the necessary technologies for e-learning and by creating a virtual
workplace for staff.
In a virtual workplace, employees work remotely away from the central office and
communicate with colleagues and managers through electronic means [6]. Research
suggests that the implementation of a virtual workplace comes with many benefits,
such as enabling employees to work wherever it is convenient for them [4], reduction
in traffic congestion and vehicular emissions [7–10], a greater talent pool, fewer
unnecessary meetings, reduced commuting time, increased employee productivity [3],
reduction in vehicle miles traveled [11], opportunities for people with disabilities
to work from home, savings to the organization in terms of the cost of real estate
[3, 12], decreased energy consumption, and ultimately a significant reduction in air
pollution [10].
Despite the enormous benefits that can be derived from the implementation of a
virtual workplace, there is currently little theoretical and empirical research
regarding the development of a framework for the implementation of a virtual
workplace in developing countries, and more specifically in Nigeria. Although there
are a few publications and blog posts about how best to implement a virtual
workplace, many of these materials are not research-based [13]. According to [14],
organizations still lack much-needed theoretical and empirical support regarding the
planning and implementation of a virtual workplace.
The objective of this study was to transform the phases and activities of the
framework for the implementation of a virtual workplace in Nigerian universities into
a CPN model and to analyze the behaviour of the model. Interestingly, Coloured Petri
Nets have been used by several authors in modeling workflow systems and business
processes, manufacturing systems, computer networks, and embedded systems [15–17,
19]. This motivated us to use Coloured Petri Nets to capture the various phases and
activities of the framework for the implementation of a virtual workplace in Nigerian
universities in a CPN model.
The advantage of using Coloured Petri Nets to model the framework for the
implementation of a virtual workplace is that, being both a graphical model and an
executable representation [16, 18], the CPN model helps provide a better
understanding of the implementation framework than a textual description. Also,
simulation of the CPN model provides a more accurate view of the framework for the
implementation of a virtual workplace and can be used to demonstrate to management
and other stakeholders the best approach to implementing a virtual workplace. The
choice of Coloured Petri Nets was based on the fact that it is a graphical and
executable modeling language [19] and on its analysis methods [15, 19].
The rest of this paper is organized as follows: Sect. 2 presents the proposed
implementation framework for the virtual workplace, Sect. 3 presents the developed
CPN model of the virtual workplace, Sect. 4 describes the state space analysis, and
Sect. 5 concludes the paper.

2 Framework for the Implementation of a Virtual Workplace


in Nigerian Universities

This section gives an overview of the developed framework for the implementation of a
virtual workplace in Nigerian universities. The framework was developed based on a
comprehensive literature review and on data obtained from selected organizations in
Nigeria that have successfully implemented a virtual workplace. Although the
framework was developed for Nigerian universities, it can be adopted by other
organizations in Nigeria that intend to implement the concept of a virtual workplace;
when adopted, it will assist universities and other organizations in the successful
implementation of a virtual workplace in Nigeria. The developed virtual workplace
framework is composed of four phases, namely the conceptual phase,
pre-implementation phase, implementation phase, and post-implementation phase, as
shown in Fig. 1.

Fig. 1. Proposed Virtual Workplace implementation framework for Nigerian Universities

In the conceptual phase, the university sets up a committee empowered with the
responsibility of identifying the benefits, barriers, and risks associated with
virtual workplace implementation.

The implementation of a virtual workplace should only be considered after an adequate
evaluation of the university’s needs for the virtual workplace, identification of job
suitability, cost-benefit analysis, determination of impact (in terms of structure,
people, task, and culture), and the identification of critical success factors have
been carried out successfully, as indicated in the pre-implementation phase.
Immediately after the pre-implementation phase has concluded, the next stage (the
implementation phase) involves four sets of activities, including planning, adoption
of the implementation strategy, training, and the program launch. Finally, the last
phase (post-implementation) involves program evaluation.

3 Top-Level Module of the Virtual Workplace CPN Model

The Hierarchical CPN model of the virtual workplace was developed based on the
framework for the implementation of a virtual workplace in Fig. 1, where the various
phases and activities involved in the implementation framework were used to develop
the CPN model of the virtual workplace shown in Fig. 2. The top-level module of the
virtual workplace CPN model in Fig. 2 consists of 10 places and 4 substitution
transitions (drawn as rectangular boxes with double-line borders) representing the
conceptual phase, pre-implementation phase, implementation phase, and
post-implementation phase. Each substitution transition has a substitution tag
(submodule) beneath it, which models the behaviour of the corresponding phase in more
detail.

Fig. 2. The Top-level module of the virtual workplace CPN model

The relationship between the submodules and their functions is described as follows.
The conceptual phase module in Fig. 3 represents the initial phase of the virtual
workplace implementation process: the benefits, barriers, and risks are identified in
this module, and thereafter a decision is made either to accept or to abandon the
virtual workplace implementation. Once a decision is made to accept a virtual
workplace, the next phase (the pre-implementation phase) involves a series of
activities such as the identification of the university’s needs for the virtual
workplace, job suitability, cost analysis, determination of impact, and the
identification of success factors, as indicated in the pre-implementation phase
module in Fig. 4.

Critical decisions such as the selection of an appropriate implementation method, the
selection of interested staff/participants, training, and the launch of the virtual
workplace program are made in the implementation phase module in Fig. 5. Finally, the
post-implementation phase module in Fig. 7 models two sets of activities: assessing
the program implementation and the subsequent analysis of the assessment results.

3.1 Conceptual Phase Module of the Virtual Workplace Model


Figure 3 shows the conceptual phase module, which is the submodule of the conceptual
phase substitution transition in Fig. 2. The essence of the conceptual phase is to
ensure that the university identifies the benefits, barriers, and risks associated
with the implementation of a virtual workplace. Port place University Management is
given the marking “1`Management” and signifies the first step of the conceptual
phase. If a valid token, as specified by the arc expression mgt, is present in the
input port place University Management, then the transition Setup Committee is
enabled; when it fires, a committee is set up in place VW Committee. Transition
Assess becomes enabled as soon as place VW Committee receives its required token(s).
Place VWorkplace is connected to transition Assess and is declared with colour set
VWrkplace, a record data type with tokens for benefits, barriers, and risk
assessment. When transition Assess occurs, it extracts tokens from place VWorkplace
and places each of these tokens on the corresponding output places VW Benefits
Identified, VW Barriers Identified, and Asset Identified, as specified by the output
arc expressions #Ben(V_Wrk), #Barr(V_Wrk), and #Risk_Ass(V_Wrk).
When place Asset Identified has received its required tokens, transition Identify
Threat Vuln Likehd Imp becomes enabled and ready to fire. This transition has two
input places. The first, place Threat Vuln and Likehd, is defined as a product colour
set Thrt_Vuln_LHd containing tokens for threats, vulnerabilities, and threat
likelihood, while the second, place Impact, is defined as an enumerated colour set
Imp and contains three tokens: High_Impact, Medium_Impact, and Low_Impact. When the
transition Identify Threat Vuln Likehd Imp fires, token(s) are moved to place Risk
Determination, and either transition Low Risk, Medium Risk, or High Risk becomes
eligible for execution. A guard is placed on each of these three transitions to
ensure that only the required threat likelihood and threat impact pass through. Place
Risk Level receives the token(s) immediately after the execution of transition Low
Risk, Medium Risk, or High Risk, and triggers transition Make Decision. When
transition Make Decision is triggered, the functions Acpt(RL1,RL2,VW_Ben,VW_Barr) and
Rej(RL3,VW_Ben,VW_Barr) ensure that places Accept and Abandon meet the requirements
of the output arc expressions.

3.2 Pre-implementation Phase Module of the Virtual Workplace CPN Model

The pre-implementation phase module shown in Fig. 4 consists of five sets of
activities: the identification of needs, job suitability, cost analysis,
determination of impact, and the identification of success factors. Two of these
activities (job suitability and cost analysis) are modeled with substitution
transitions.

Fig. 3. Conceptual Phase module of the virtual workplace CPN Model

The remaining three activities (identification of needs, determination of impact, and
identification of success factors) are modeled with elementary transitions. The port
place Accept connects the conceptual phase module to the pre-implementation phase
module. Transition Identify Needs is enabled when the input port place Accept
receives the desired token. When transition Identify Needs fires, place Identified
Needs receives the tokens specified by the output arc expression Nds,deci. The
substitution transitions Job Suitability and Cost Analysis, together with their
associated submodules, model in more detail the activities involved in each of these
steps.

Fig. 4. Pre-implementation Phase module of the virtual workplace model



Port place Suitable Job serves as an interface between the job suitability module and
the cost analysis module. The cost analysis module estimates the total cost and the
expected savings. The transition Determine Impact models the expected impact of the
implementation of a virtual workplace in the university, while the transition
Identify Success Factor identifies the critical success factors for a successful
virtual workplace implementation, as shown in Fig. 4.

3.3 Implementation Phase Module of the Virtual Workplace Model


Figure 5 shows the submodule associated with the implementation phase substitution
transition. There are four sets of activities in this phase: planning, implementation
strategy, and training, modeled by three substitution transitions, and the program
launch, modeled with an elementary transition.

Fig. 5. Implementation phase module of the virtual workplace model

The submodule of the planning substitution transition is shown in Fig. 6. Place
Planning is defined as a record colour set containing “Policy_Devt”,
“Eva_ICT_infra”, and “Sele_Remote_Access”. When the transition Identify Planning Area
fires, places Policy Development, Evaluate ICT Infra., and Selection of Remote Access
receive the desired token(s) according to the output arc expressions. Once the tokens
are received, the transition Evaluate becomes enabled and ready to fire. When the
transition Evaluate fires, tokens are moved from places Existing ICT Infra., Policy
Guidelines, and Remote Acess Technique, according to the functions defined by the
output expressions DPG(PD,PG), EICT(EVAII,EIIE), and RATECH(SRA,rt), to places
Developed Policy Guide, ICT Evaluation, and Selected Remote Access, respectively. The
transition Finalize Plan, with output arc expression Plan(PG,EIIE,rt), ensures that
every part of the plan is captured. Place VW Plan serves as the connection between
the planning module and the implementation strategy module. The implementation phase
module models all the activities required in the implementation phase, while the
planning module handles all the planning activities.
The CPN model of the virtual workplace cannot be presented in full in this paper due
to page limitations.

Fig. 6. Planning submodule

3.4 Post-implementation Phase Module of the Virtual Workplace


The submodule of the post-implementation phase substitution transition is shown in
Fig. 7. It contains two elementary transitions: Conduct IMP Assmnt and Analyze
Assmnt Result. This module models the program evaluation.

Fig. 7. Post-implementation phase module of virtual workplace model

Transition Conduct IMP Assmnt models all activities relating to the conduct of the
implementation assessment. When transition Conduct IMP Assmnt is fired, places Imp
Assment Pilot, Imp Assment Phased, and Imp Assment full receive tokens concurrently
for the conduct of the implementation assessment, as defined by the output arc
expressions with functions PI_Assmnt(Assmnt,SSP), PH_Assmnt(Assmnt,GS), and
FU_Assmnt(Assmnt,CAES). Transition Analyze Assmnt result immediately becomes
enabled when places Imp Assment Pilot, Imp Assment Phased, and Imp Assment full
receive valid tokens from transition Conduct Imp Assmnt. Transition Analyze Assmnt
result ensures that the strengths and weaknesses of the virtual workplace program are
critically examined.

4 Simulation and State Space Analysis

Simulation and state space analysis are the two methods of analysis provided by CPN
Tools [19]. We conducted several simulations on the modules and submodules of the
developed virtual workplace CPN model, using CPN Tools installed on an Intel(R)
Pentium(R) CPU P6200 @ 2.13 GHz PC, to ascertain whether the modules and
submodules work as expected. The simulations show that token(s) were received and
consumed in the correct order and that the model always terminates in the desired state.
State space analysis was then applied to explore the standard behavioral properties of
the virtual workplace CPN model. The analysis was carried out using the state space
report, which revealed important information about the boundedness, home, liveness,
and fairness properties of the CPN model. The state space of the virtual workplace
CPN model has 12889 nodes and 30400 arcs and was generated in 36 s, while the SCC
graph has 12889 nodes and 30400 arcs and was generated in 2 s. The state space report
shows that the SCC graph and the state space have the same number of nodes and arcs.
It follows that no infinite occurrence sequences exist, and hence we can conclude that
the model terminates appropriately. The liveness properties show that there are 24 dead
markings in the developed CPN model. A query was used to investigate these dead
states and to determine whether they represent terminal states. The query result proves
that these dead markings are end states in the model, that is, the states at which the
CPN model will terminate. The fact that these dead markings are end states shows that
the virtual workplace CPN model will produce the desired result when it terminates.
The state space report also shows that there are no dead transitions in the CPN model.
A transition is dead if there is no reachable marking in which it is enabled [19].
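The two liveness notions used above can be pictured on a toy reachability graph. The sketch below is a conceptual Python illustration of dead markings and dead transitions, not output from CPN Tools; the marking and transition names are invented for the example.

```python
# Conceptual illustration of dead markings and dead transitions on a
# toy reachability graph (not CPN Tools output). Each edge is a
# (marking, transition, next_marking) triple.
edges = [
    ("M0", "t_identify", "M1"),
    ("M1", "t_low_risk", "M2"),
    ("M1", "t_high_risk", "M3"),
    ("M2", "t_accept", "M4"),   # M4 has no outgoing edges -> dead marking
    ("M3", "t_reject", "M5"),   # M5 likewise
]
markings = {m for src, _, dst in edges for m in (src, dst)}
declared = {"t_identify", "t_low_risk", "t_high_risk",
            "t_accept", "t_reject", "t_unused"}

dead_markings = {m for m in markings
                 if not any(src == m for src, _, _ in edges)}
dead_transitions = declared - {t for _, t, _ in edges}  # never enabled anywhere

print(sorted(dead_markings))     # ['M4', 'M5'] -- desirable end states
print(sorted(dead_transitions))  # ['t_unused'] -- would indicate a modeling flaw
```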

5 Conclusions

In this paper, we have presented a framework designed for the implementation of a
virtual workplace in Nigerian universities. A CPN model was developed based on the
framework for the implementation of a virtual workplace using CPN Tools. The CPN
model covers the four phases and the activities involved in the virtual work
implementation framework. The developed virtual workplace CPN model was verified
using model simulation and state space analysis. The results of the analysis of the
developed virtual workplace CPN model show that the CPN model works as expected.
Therefore, the CPN model, being an executable model and also a graphical
representation of the developed framework, will help to provide a better understanding
of the virtual workplace implementation framework.
Future work may be geared towards the development of an animation that will be
interfaced to the virtual workplace CPN model.

References
1. International Labour Organization: Teleworking during the COVID-19 pandemic and
beyond: a practical guide, pp. 1–48 (2020)
2. Pauline, E.O., Abiodun, A., Hindu, J.A.: Telecommuting: a panacea to COVID-19 spread in
Nigerian Universities. Int. J. Innov. Econ. Devt. 6(1), 47–60 (2020)
3. PwC S.A.: HR Quarterly. 1–16 (2015)
4. Richard, E., Carl, J.K., Christopher, G.P., Antony, I.S.: The new workplace: are you ready?
How to capture business value. IBM Global Serv. 1–12 (2011)
5. Ogunode, N.J., Abigeal, I., Lydia, A.E.: Impact of COVID-19 on the higher institutions
development in Nigeria. Elect. Res. J. Soc. Sci. Hum. 2(2), 126–135 (2020)
6. Cascio, W.F.: Managing a virtual workplace. Acad. Mgt. Exec. 14(3), 81–90 (2000)
7. Caldow, J.: Working outside the box: a study of the growing momentum in telework. Inst.
for Elect. Govt., IBM Corp., 1–14 (2009)
8. Obisi, C.: The empirical validity of the adjustment to virtual work arrangement by business
organisations in Anambra State. Nigeria. Int. J. Sci. Res. Edu. 9(3), 173–181 (2016)
9. Koenig, B.E., Henderson, D.K., Mokhtarian, P.L.: The travel and emissions impacts of
telecommuting for the state of California telecommuting pilot project. Transpn. Res-C. 4(1),
13–32 (1996)
10. Choo, S., Mokhtarian, L.P., Ilan, S.: Does telecommuting reduce vehicle-miles traveled? An
aggregate time series analysis for the U.S. Transportation 32(1), 37–64 (2005)
11. Henderson, D.K., Koenig, B.E., Mokhtarian, P.L.: Travel diary-based emissions analysis of
telecommuting for the Puget Sound demonstration project. Res. Report UCD-ITS-RR-94-26,
1–54 (1994)
12. Thompson C., Caputo P.: The Reality of Virtual Work: is your organization Ready?, AON
consulting. 1–12 (2009)
13. Onpoint Consulting: Success factors of top performing virtual teams. Res. report. 1–14
(2015)
14. Madsen, S.R.: The benefits, challenges, and implications of teleworking: a Literature
Review. Cult. Relig. J. 1(1), 148–158 (2011)
15. van der Aalst, W.M.P.: Modelling and analysis of production systems using a Petri Net based
approach. In: Boucher, T.O., Jafari, M.A., Elsayed, E.A. (eds.) Proceedings of the
Conference on Computer Integrated Manufacturing in the Process Industry, pp. 179–193
(1994)
16. Kristensen, L.M., Mitchell, B., Zhang, L., Billington, J.: Modelling and initial analysis of
operational planning processes using coloured Petri nets. In: Proceedings of the Workshop on
Formal Methods Applied to Defence Systems, Conferences in Research and Practice in
Information Technology, vol. 12, pp. 105–114 (2002)
17. Hafilah, D.L., Cakravastia, A., Lafdail, Y., Rakoto, N.: Modeling and simulation of Air
France baggage handling system with colored Petri Nets. IFAC-PapersOnLine, 2443–2448
(2019)
18. Jensen, K., Kristensen, L.M., Wells, L.: Coloured Petri Nets and CPN Tools for modelling
and validation of concurrent systems. Int. J. Softw. Tools Technol. Transf. (STTT) 9(3–4),
213–254 (2007)
19. Jensen, K., Kristensen, L.M.: Coloured Petri Nets: Modelling and Validation of Concurrent
Systems. Springer-Verlag (2009)
Modeling and Experimental Verification of Air-Thermal
and Microwave-Convective Presowing Seed Treatment

Alexey A. Vasiliev, Alexey N. Vasiliev, Dmitry A. Budnikov, and Anton A. Sharko

Federal Agroengineering Scientific Center VIM, Moscow, Russian Federation
vasilev-viesh@inbox.ru

Abstract. The use of electrophysical effects for presowing treatment of seeds is
among the most effective methods of improving their sowing quality. However,
the application of such methods is limited by the fact that specialized new-type
technological equipment is normally required for their implementation in seed
processing lines. This problem can be solved in a simpler way where processing
lines designed for forced ventilation are applied for presowing treatment of grain
crops. This work deals with the issues related to the use of aerated bins and
convection-type microwave grain driers for presowing processing of seeds.
The requirement for homogeneity of external field distribution over a dense
grain layer has to be met in the course of processing. It is necessary to ensure
both the preset values of temperature and the dynamics of its change. Computer
simulations were carried out to check whether the applied processing facilities
are physically capable of maintaining the required operation modes during these
seed treatment procedures. For the purposes of modeling, the entire dense grain
layer was divided into three sections, and the heat-and-moisture exchange
processes developing during treatment of seeds were simulated in each section.
The curves of change of temperature and moisture content in seeds during
processing were calculated. Simulation results for heat exchange in the course of
air-thermal treatment and for convective-microwave processing have proved the
implementability of the required operation modes of presowing treatment.
The results of field experiments on seed presowing processing in aerated bins
and in convective-microwave grain dryers made it possible to evaluate the
advantages and effectiveness of each method. Thus, air-thermal treatment
stimulates the development of the secondary root system and intensive growth of
the green weight of plants. Application of convective-microwave processing
of seeds makes it possible to increase the number of productive culms, as well as
the number of grain heads (ears) per plant. Therefore, specific modes of
electrophysical exposure in the course of seed presowing treatment can be
selected depending on the goals that have to be achieved in each particular case.

Keywords: Air-thermal treatment · Convective-microwave processing · Computer simulations · Presowing treatment · Seeds

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 440–451, 2021.
https://doi.org/10.1007/978-3-030-68154-8_40

1 Introduction

Presowing treatment of seeds is an essential link in the chain of technological
operations within the entire seeding cycle. Prior to chemical treatment with the use of
chemical solutions for disinfection, seeds have to be calibrated and culled. Seeds
are vernalized and hardened off depending on the particular plant species. All those
operations are designed to improve the sowing quality of seeds and, therefore, to
increase yields [1]. Electrophysical methods of seed treatment play an important role
in presowing processing technologies [2–4]. Wide-scale experimental research [5–8]
has proved the effectiveness of various methods of electrophysical treatment. However,
their application at grain-growing farms is limited by the technological performance of
existing processing equipment. It is necessary that a newly designed processing plant
or treatment technology can be optimally integrated into the technological processes
of seed presowing treatment adhered to at a particular agricultural enterprise. This
requirement is, to a major extent, fulfilled for grain dryers, including aerated ones, as
well as plants for drying and disinfection of grain with the use of microwave fields
[9–11]. In the case where presowing treatment lines are used to implement convective-
microwave processing of seeds, one has to be sure that the requirements for processing
modes are complied with and no damage to seeds occurs.
The objective of this research is to study, with the help of computer simulations, the
possibility of applying the combination of aerated bins and convective-microwave
grain dryers for seed presowing processing and to evaluate the effectiveness of such
presowing treatment.

2 Materials and Methods

In order to study the issues related to the implementability of grain presowing treatment
in such processing lines, modeling of the air-thermal and convective-microwave seed
treatment processes was performed. The simulation model was based on the following
system of equations and transfer functions, making it possible to calculate heat-and-
moisture exchange in an elementary grain layer [12, 13]:

T(p) = T_0(p)\,\frac{1}{p}\left(1 + p\,e^{-p\tau_1} - e^{-p\tau_1}\right) - A_1\,\theta(p)\,p - A_2\,W(p)\,p \qquad (1)

D(p) = D_0(p)\,\frac{1}{p}\left(1 + p\,e^{-p\tau_1} - e^{-p\tau_1}\right) - \frac{1}{A_3}\,W(p)\,p \qquad (2)

\theta(p) = T(p)\,\frac{1}{A_4} - T_0(p)\,\frac{1}{p A_4}\left((1 - A_4)\left(1 - e^{-p\tau_1}\right) + p\,e^{-p\tau_1}\right) \qquad (3)

\theta(p) = \left(\theta_0 - T(p)\right)\frac{1}{T_1 p + 1} + \theta_0\,\frac{1}{p} + A_5\,W(p)\,p + A_5 W_0 + A_6 Q_v \qquad (3.1)

W(p) = W_{eq}\,\frac{K}{p + K} \qquad (4)

W_{eq} = \left(\frac{-\ln(1 - F)}{5.47 \times 10^{-6}\,(T + 273)}\right)^{0.435} \qquad (5)

K = 7.4897 - 0.1022\,T - 0.6438\,W + 0.0134\,T V + 0.0148\,T W + 0.0029\,V W - 0.0026\,T^2 + 0.0071\,W^2 \qquad (6)

F = \frac{745\,D}{(622 + D)\,10^{\left(0.622 + \frac{7.5\,T}{238 + T}\right)}} \qquad (7)

where A_1 = \frac{c_g \gamma_g}{\varepsilon c_a \gamma_a};\quad A_2 = \frac{\gamma_g r'}{100\,\varepsilon c_a \gamma_a};\quad A_3 = \frac{\varepsilon \gamma_a}{10^3\,\gamma_g};\quad A_4 = \frac{\varepsilon c_a \gamma_a}{c_g \gamma_g};\quad A_5 = \frac{r' \varepsilon}{c_g};\quad A_6 = \frac{1}{c_g \rho_g}.

Here T is air temperature, °C; D is air moisture content, g/kg; W is grain moisture
content, %; F is air relative humidity, a.u.; W_eq is the equilibrium moisture content of
grain, %; K is the drying coefficient, 1/h; θ is grain temperature, °C; Q_v is the specific
microwave power dissipated in the dielectric medium, W/m³; V is air velocity, m/s;
c_a is the specific heat capacity of air, kJ·kg⁻¹·°C⁻¹; ρ_g is the volumetric density of
grain on a dry basis, kg/m³; ε is the pore volume of the grain layer; c_g is the specific
heat capacity of grain, kJ·kg⁻¹·°C⁻¹; r′ is the specific heat of evaporation of water,
kJ/kg; γ_g is the grain bulk weight, kg/m³; γ_a is the air volumetric density, kg/m³;
τ is time, h.
Equation (3) was used in the computer model of air-thermal treatment, while Eq. (3.1)
was applied in the model of convective-microwave presowing treatment of seeds.
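As a quick numerical check of the algebraic relations of the layer model, the sketch below evaluates Eqs. (5)–(7) in Python for one example air/grain state. The input values are illustrative assumptions only; the model itself, as described next, is implemented in Simulink.

```python
# Illustrative evaluation of Eqs. (5)-(7) for one elementary layer;
# example inputs only (the authors' model itself is built in Simulink).
import math

def rel_humidity(T, D):
    """Eq. (7): air relative humidity F (a.u.) from air temperature T (degC)
    and moisture content D (g/kg)."""
    return 745.0 * D / ((622.0 + D) * 10 ** (0.622 + 7.5 * T / (238.0 + T)))

def w_equilibrium(T, F):
    """Eq. (5): equilibrium grain moisture content W_eq, %."""
    return (-math.log(1.0 - F) / (5.47e-6 * (T + 273.0))) ** 0.435

def drying_coeff(T, W, V):
    """Eq. (6): drying coefficient K, 1/h."""
    return (7.4897 - 0.1022 * T - 0.6438 * W + 0.0134 * T * V
            + 0.0148 * T * W + 0.0029 * V * W
            - 0.0026 * T ** 2 + 0.0071 * W ** 2)

T, D, W, V = 26.0, 10.0, 14.0, 0.7          # example air/grain state
F = rel_humidity(T, D)
print(round(F, 3), round(w_equilibrium(T, F), 1), round(drying_coeff(T, W, V), 2))
# roughly 0.51 a.u., 14.1 %, 1.11 1/h for these inputs
```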
The developed computer model is built from models of the corresponding elementary
grain layers. The thickness of an elementary layer equals the size of one seed. The
sequence of models for all elementary layers therefore makes it possible to describe the
process of heat-and-moisture exchange in dense grain layers of any thickness, owing to
the condition that the output parameters of each preceding layer are equal to the input
parameters of the next one. In the process of air-thermal presowing treatment, it is
important not only to maintain the air parameters within a required range but also to
ensure the required value of grain temperature and exposure time over the entire grain
layer. In air-thermal presowing treatment, seeds are exposed to heated air; heating of
seeds by microwave fields is not provided for, and conventional electric heaters are
used instead. Therefore, electric heater control has to be considered while modeling
the process.
In this work, the applicability of aerated bins with radial air distribution for air-thermal
presowing treatment was studied. A grain layer 1.2 m thick rests in such a bin in a
stable manner and is blown through with air directed from the central air duct towards
the perforated outer wall. For the modeling purposes, the entire grain layer was divided
into three zones of equal thickness. However, the air velocity differs in each zone as a
result of the radial direction of the air flux from the center of the bin: the cross-section
area of the grain layer increases while the aggregate air flow remains constant. While
modeling the process, the average value of air velocity was assumed to

be equal to 0.7 m/s, 0.5 m/s and 0.7 m/s for the first, second and third zones,
respectively. Simulink software was used to develop the computer model. The diagram
of the developed model is shown in Fig. 1.

Fig. 1. Simulink model for air-thermal presowing treatment of seeds in bins with radial air
distribution.

Units ‘Interpreted MATLAB Fcn’ [14] (F = f(tau), Ta = f(tau)) serve to define the
temperature and relative humidity change profiles of ambient air, in correspondence
with statistically processed data from meteorological services. Unit ‘Q0’ is designed
for setting the initial value of grain temperature. The dependence of the relative
humidity of the air fed into the grain layer on the capacity of the electric heater is set
with the help of the unit ‘Interpreted MATLAB Fcn’ (F = f(Pk)). Unit ‘C’ is used to set
the temperature level to which ambient air has to be heated in order to provide the
required seed presowing processing algorithm. Oscillograph units (F, W, Q, T) make it
possible to monitor the change of parameters of seeds and air over the layer thickness
in the course of treatment.
While modeling the process of presowing treatment with heated air, the
requirements of the process control algorithm for aerated bins were considered [15].
In accordance with this algorithm, the following operations have to be performed:
– measure the initial value of grain temperature;
– heat the ambient air at the input to 1.3 times the initial grain temperature with the
help of the electric heater;
– blow heated air through the grain layer for 1.1 h;
– cool the seeds with ambient air for 1.1 h;
– blow air heated to 1.8 times the initial grain temperature through the layer for
1.1 h;
– cool the seeds with ambient air.
It is assumed that the initial grain temperature is 20 °C. The modeling results make it
possible to monitor the change of seed temperature, that of the air blowing through the
grain layer, and the moisture content of seeds. These results are presented in Figs. 2, 3
and 4.
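For illustration, the control algorithm above can be expressed as a simple inlet-air setpoint schedule. The sketch below is a Python rendering of that published schedule with θ0 = 20 °C; it is not the authors' Simulink controller, and the ambient-air temperature argument is an assumed input.

```python
# Python rendering of the aerated-bin control schedule described above;
# a sketch, not the authors' Simulink controller.
THETA0 = 20.0          # initial grain temperature, degC
STAGE = 1.1            # duration of each stage, h

def inlet_air_setpoint(tau, t_ambient):
    """Inlet air temperature setpoint (degC) at time tau (h)."""
    if tau < STAGE:            # stage 1: heat air to 1.3 * theta0
        return 1.3 * THETA0
    if tau < 2 * STAGE:        # stage 2: cool with ambient air
        return t_ambient
    if tau < 3 * STAGE:        # stage 3: heat air to 1.8 * theta0
        return 1.8 * THETA0
    return t_ambient           # stage 4: final cooling

for tau in (0.5, 1.5, 2.5, 3.5):
    print(tau, inlet_air_setpoint(tau, t_ambient=15.0))
```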

3 Results and Discussion


3.1 Air-Thermal Treatment of Seeds
Heated air is the major external factor in this kind of presowing treatment of seeds.
Therefore, it is important to ensure conditions under which the air blowing through the
grain layer can heat the seeds to the specific required temperature value.
The analysis of the diagrams shown in Fig. 2 allows the conclusion that the slight
(0.11 h) temperature delay inside the grain layer, and the change of temperature over
the grain layer compared to that at its input, do not considerably affect air temperature
variations within the grain layer.

Fig. 2. Temperature change of air blowing through the grain layer: a) temperature at the input
of the grain layer; b) temperature at the output of Zone 1; c) temperature at the output of Zone 2;
d) temperature at the output of Zone 3.

It means that the requirements for the temperature mode of seed presowing treatment
are complied with. A short-term air temperature drop within inter-seed spaces can be
explained by water transfer from the areas of the grain layer located close to the air
inlet towards its downstream areas. This can be seen from the time dependences of
grain moisture content presented in Fig. 3.
Variations of the moisture content of seeds do not exceed 3%. Besides, the values of
their moisture content upon completion of the presowing process are identical to those
at the beginning. This is an essential feature, since seeds must not be exposed to
overdrying, which results in loss of their internal water.
Modeling results for temperature change in seeds under presowing treatment are
presented in Fig. 4. The diagrams shown in Fig. 4 give reason to conclude that the
requirements of the control algorithm for seed presowing treatment in aerated bins are
fully complied with, and such processing lines can be applied for the purposes under
this study.

Fig. 3. Results of modeling the change of moisture content in the grain layer during presowing
treatment with heated air: a) moisture content in Zone 1; b) moisture content in Zone 2;
c) moisture content in Zone 3.

Fig. 4. Temperature change of seeds during air-thermal presowing treatment in aerated bins:
a) grain temperature in Zone 1; b) grain temperature in Zone 2; c) grain temperature in Zone 3.

3.2 Dependence of Operation Modes of Seed Presowing Treatment on Parameters of the Convective-Microwave Zone
Microwave fields have the advantage of the combined character of the effects produced
on seeds in the course of presowing processing, including thermal and electromagnetic
effects. Combining the microwave method with thermal treatment enables the
implementation of various seed processing options that may yield specific useful
reactions in plant vegetation processes. A seed presowing treatment method of this kind
is considered a promising one; however, its implementation requires special equipment.
Based on the analysis of electrophysical methods applicable to presowing treatment
technologies, it was found that one of the critical aspects is providing homogeneous
effects over the entire depth of the grain layer under processing. That is why it is
advisable to model the process of air-thermal treatment of seeds with the use of
microwave fields in order to specify the basic requirements for the processing line
design.
The equations described above were applied for modeling heat-and-moisture exchange
in the grain layer. The thickness of the grain layer exposed to processing in a
convective-microwave plant is 15 cm to 20 cm; therefore, the average thickness value
of 18 cm was chosen for the simulations. For descriptive reasons, this conditionally
thick grain layer was divided into three sections of 6 cm each. The computer model for
convective-microwave presowing seed treatment designed in Simulink is presented in Fig. 5.

Fig. 5. Block diagram of computer model for convective-microwave presowing seed treatment

This computer model enables setting the values of the input parameters of air
(temperature and relative humidity); model units Tvh and Fvh serve this purpose.
Magnetron operation control is performed with the use of a two-step action controller
represented by the unit Relay. The control is organized so that the magnetron is
switched off when the grain temperature in the first zone reaches 45 °C. This
temperature value was selected in order to avoid overheating of seeds in the course of
their thermal treatment. The graphs of seed temperature change in each layer (unit ‘Q’),
grain moisture content (unit ‘W’), air relative humidity at the output of each layer
(Fvih), and drying agent temperature at the output of each layer (Tvih) are displayed on
oscillographs. The results of modeling are shown in Figs. 6 and 7.
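The two-step controller can be sketched as a hysteresis rule. In the sketch below, the 45 °C cut-off is taken from the text, while the re-enable threshold of 40 °C is an assumed hysteresis band added for illustration; the actual settings of the Relay unit are not given in the paper.

```python
# Sketch of the two-step (relay) magnetron control; the 45 degC cut-off
# is from the text, the 40 degC re-enable threshold is an assumed
# hysteresis band for illustration.
T_OFF = 45.0   # switch magnetron off at this zone-1 grain temperature, degC
T_ON = 40.0    # assumed re-enable threshold, degC

def relay(theta_zone1, magnetron_on):
    """Return the new magnetron state given zone-1 grain temperature."""
    if magnetron_on and theta_zone1 >= T_OFF:
        return False
    if not magnetron_on and theta_zone1 <= T_ON:
        return True
    return magnetron_on

state = True
for theta in (30.0, 44.0, 45.2, 43.0, 39.5):
    state = relay(theta, state)
    print(theta, state)
```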

Fig. 6. Change of grain temperature under microwave heating, in three sublayers

From the diagrams presented in Fig. 6, it can be seen that different layers are heated to
different temperatures. However, this heterogeneity disappears, and the grain
temperature becomes uniform over the thick layer as the grain passes through the entire
dryer, because while moving inside the dryer the grain is mixed and processed by
microwave fields of different intensities. Magnetron operation control ensures the
required modes of presowing grain treatment without exceeding the set values of seed
temperature.
The diagrams of the moisture content of seeds in each layer (see Fig. 7) show that no
water loss occurs in seeds in the process of presowing treatment. Variations of grain
moisture content during treatment do not exceed 0.2% and will not affect the sowing
quality of seeds.

Fig. 7. Change of grain moisture content in the process of presowing treatment with the use of
microwave fields: a) moisture content in Layer 1; b) moisture content in Layer 2; c) moisture
content in Layer 3.

3.3 Experimental Tests of Seed Presowing Treatment Modes

Experimental tests of seed presowing treatment modes were performed on seeds of
winter barley of grade ‘Elita’. In the first step of the tests, presowing treatment was
carried out with the use of air-thermal processing. Seeds were treated in aerated bins
with radial distribution of air, and the processing modes described above in this article
were adhered to.

Fig. 8. Convective-microwave unit for presowing treatment of seeds: 1 – receiving container,
2 – microwave active chamber, 3 – magnetrons, 4 – airduct of magnetron cooling system,
5 – input airduct.



A convective-microwave processing unit was applied whose layout is shown in Fig. 8.
Grain is fed into the module from the receiving container (1). Waveguides provide the
link between the microwave active chamber (2) and the magnetron unit (3). Air for
cooling the power supplies of the magnetrons is blown through via the airduct (4). Air
for cooling seeds enters the grain layer through the airduct (5).
Two different physical methods were studied in the course of presowing treatment.
The first one, called ‘air-thermal treatment’, heats grain in a dense layer by blowing hot
air through it. The second one, called ‘convective-microwave treatment’, heats grain
with the use of microwave fields with simultaneous cooling by air. Field experiments
were carried out for adequate interpretation of the results of presowing treatment. For
this purpose, experimental working plots were seeded with grain treated after a
two-year lying period [16]. The results of treatment were evaluated from the day of
seedling emergence and during the whole cycle of plant vegetation. In accordance with
the practice adhered to in agricultural science, all necessary measurements were made
during the whole period of plant vegetation, including the formation of stems and ears.
Productivity indicators of plants, such as grain weight and the number of grains per ear,
were evaluated after harvesting [17]. The results of observations are presented in
Table 1.

Table 1. Experimental data on the results of pre-sowing seed treatment

| Indicator | Without processing | Air-thermal treatment | Microwave field treatment |
| Seedling density | 516 | 491 | 486 |
| Number of secondary roots | 2.7 | 5.4 | 3.5 |
| Plant length | 18.9 | 26.0 | 21.2 |
| Number of leaves | 6.8 | 9.9 | 7.9 |
| Weight of 20 green plants | 32.5 | 71.4 | 44.3 |
| Number of plants per m² | 488 | 480 | 476 |
| Number of stems per m² | 1030 | 871 | 984 |
| Number of productive stems per m² | 809 | 788 | 877 |
| Number of ears per 1 plant | 1.58 | 1.60 | 1.79 |
| Plant height, cm | 76.7 | 77.9 | 81.9 |
| Ear length, cm | 6.9 | 7.6 | 7.4 |
| Number of grains per ear, pcs | 15.0 | 16.1 | 15.4 |
| Weight of 1000 grains, g | 45.5 | 48.4 | 46.0 |
| Grain weight per ear, g | 0.72 | 0.82 | 0.77 |
| Productivity, c/ha | 57.5 | 64.0 | 67.0 |

The data presented in Table 1 make it possible to evaluate the effectiveness of each
presowing treatment technology for particular purposes. For instance, air-thermal
treatment is preferable for winter crop cultivation because it ensures better conditions

for secondary root development. In agricultural technologies oriented towards the
production of crops for green fodder, it is advisable to apply air-thermal presowing
treatment as well. The best crop yield was obtained with the application of convective-
microwave treatment of seeds: this treatment technology made it possible to increase
the number of ears per plant, and this particular indicator probably accounts for the
productivity growth.
It has to be taken into account that the results of experiments may differ depending
on microclimate variations. Therefore, additional research has to be carried out to study
possible effects of climatic factors.

4 Conclusion

The results of both computer simulations and experimental research afford strong
grounds to claim that technological equipment designed for grain drying with the
application of the forced ventilation method can be effectively applied in presowing
seed treatment technologies. Aerated bins make it possible to implement the
air-thermal presowing processing technique. In convective-microwave grain driers, the
required conditions of electrophysical processing are implementable as well.
The results of field experiments have shown that the application of various external
factors in the process of presowing treatment of seeds is an effective tool to control the
structure of grain yield. For example, application of the air-thermal method may be
advisable for winter crops, for which development of secondary roots is essentially
important. The convective-microwave technique makes it possible to stimulate the
development of multiple productive stems and ears per plant.

References
1. Isaeva, A.O., Kirilin, G.M., Iskakov, A.Y., Giniyatulina, A.K.: Effects of vernalizing and
seeding method on output yield of carrot. In: Selection of Collected Papers: Scientific
Society of the 21st Century Students. Technical Science. E-print selection of collected
papers based on materials of the 67th International Students Scientific-Practical
Conference, pp. 119–123 (2018). (Russ.)
2. Shabanov, N.I., Ksenz, N.V., Gazalov, V.S., et al.: The substantiation of dose for presowing
treatment of cereal seeds in electromagnetic field of industrial frequency. Ambient Sci. 5(2),
20–24 (2018)
3. Dukic, V., Miladinov, Z., Dozet, G., et al.: Pulsed electro-magnetic field as a cultivation
practice used to increase soybean seed germination and yield. Zemdirbyste Agric. 104(4),
345–352 (2017)
4. Badridze, G., Kacharava, N., Chkhubianishvili, E., et al.: Effect of UV radiation and artificial
acid rain on productivity of wheat. Russ. J. Ecol. 47(2), 158–166 (2016)
5. Gilani, M.M., Irfan, A., Farooq, T.H., et al.: Effects of pre-sowing treatments on seed
germination and morphological growth of acacia nilotica and faidherbia albida. Scientia
Forestalis 47(122), 374–382 (2019)
6. Luna, B., Chamorro, D., Perez, B.: Effect of heat on seed germination and viability in species
of Cistaceae. Plant Ecol. Divers. 12(2), 151–158 (2019)

7. Mildaziene, V., Aleknaviciute, V., Zukiene, R., et al.: Treatment of common sun-flower
(Helianthus annuus L.) seeds with radio-frequency electromagnetic field and cold plasma
induces changes in seed phytohormone balance, seedling development and leaf protein
expression. Sci. Rep. 9(1), 1–12 (2019). Article Number: 6437
8. Nurieva, K.O.: Electrophysical factors for treating seeds. In: Collected Papers ‘The Youth:
Education, Science, Creating – 2019’ Collected Papers Based on Materials of Regional
scientific-practical Conference, pp. 122–124 (2019) (Russ.)
9. Budnikov, D.A., Vasiliev, A.N., Vasilyev, A.A., Morenko, K.S., Mohamed, S.I., Belov, F.:
Application of electrophysical effects in the processing of agricultural materials. In:
Kharchenko, V., Vasant, P. (eds.) Advanced Agro-Engineering Technologies for Rural
Business Development, pp. 1–27 (2019). https://doi.org/10.4018/978-1-5225-7573-3.ch001
10. Vasiliev, A.N., Ospanov, A.B., Budnikov, D.A.: Controlling reactions of biological objects
of agricultural production with the use of electrotechnology. Int. J. Pharm. Technol. 8(4),
26855–26869 (2016)
11. Wang, S., Wang, J., Guo, Y.: Microwave irradiation enhances the germination rate of tartary
buckwheat and content of some compounds in its sprouts. Pol. J. Food Nutr. Sci. 68(3), 195–
205 (2018)
12. Ospanov, F.B., Vasilyev, A.N., Budnikov, D.A., Karmanov, D.K., Vasilyev, A.A., et al.:
Improvement of grain drying and disinfection process in microwave field. Nur-Print,
Almaty, 155 p. (2017)
13. Vasilyev, A.A., Vasilyev, A.N., Dzhanibekov, A.K., Samarin, G.N., Normov, D.A.:
Theoretical and experimental research on pre-sowing seed treatment. IOP Conf. Ser.: Mater.
Sci. Eng. 791(1), 012078 (2020). https://doi.org/10.1088/1757-899X/791/1/012078
14. Dabney, J., Harman, T.L.: Mastering Simulink 4. 412 p. Prentice-Hall, Upper Saddle River
(2001)
15. Udintsova, N.M.: Mathematical substantiation of electric heater capacity for presowing
treatment of seeds. Deposited manuscript All-Russian Institute of Scientific and Technical
Information No. 990-B2004 10.06.2004. (Russ.)
16. Sabirov, D.C.: Improvement of seed presowing treatment effectiveness. In: Collected Papers
‘Electric Equipment and Technologies in Agriculture’. Collected papers based on materials
of the 4th International Scientific-practical Conference, pp. 222–225 (2019) (Russ.)
17. Burlaka, G.A., Pertseva, E.V.: Effect of presowing treatment of seeds on germination
readiness and germination ability of spring wheat. In: Collected Papers ‘Innovative
Activities of Science and Education in Agricultural Production’, Proceedings of the
International Scientific-Practical Conference, pp. 301–305 (2019). (Russ.)
Modeling of Aluminum Profile Extrusion Yield: Pre-cut Billet Sizes

Jaramporn Hassamontr¹ and Theera Leephaicharoen²

¹ King Mongkut’s University of Technology North Bangkok, Bangkok 11000, Thailand
jaramporn@gmail.com
² Thai Metal Aluminum Company Limited, Samutprakarn 10280, Thailand

Abstract. In some manufacturing industries, the starting raw material size is
critical in determining how much material will be scrapped by the end of a
process. Process planners working in the aluminum profile extrusion industry,
for example, need to select appropriate aluminum billet sizes to be extruded to
meet specific customer orders while minimizing the resulting scrap. In this
research, the extrusion process is classified according to how billets are
extruded, namely multiple extrusions per billet and multiple billets per
extrusion. Mass balance equations for each configuration are used to formulate
a yield optimization problem where billets are pre-cut to specific sizes and kept
in stock. Both models are non-linear and discrete. The solution procedure is
developed using an enumeration technique to identify the optimal solution. It is
validated with extrusion data from an industrial company, and its effectiveness
is demonstrated using various case studies.

Keywords: Aluminum profile extrusion · Yield optimization · Mass balance equation · Billet selection problem · Scrap reduction

1 Introduction

Most manufacturing processes or operations can be optimized to save time and cost.
Yet theory and practice are sometimes far apart: most practitioners in industry deem
optimization too idealistic to be carried out in practice. Thus, many manufacturing
processes are not yet optimized in the most effective manner. In this research, an
attempt has been made to apply an optimization technique to the aluminum profile
extrusion process.
The aluminum extrusion process is used to manufacture long aluminum profiles that
are widely used in construction and in mechanical and electrical appliances. Obvious
examples are window frames, heat sinks and steps on SUVs. The global aluminum
extrusion market value was $78.9 billion in 2018 [1] and is still growing. The
aluminum extrusion process typically involves 3 main steps. First, aluminum billets are
cut to specific sizes, heated and then loaded into a container, one billet at a time. Then
each billet is pushed through a die by a hydraulic press, and an aluminum profile is
formed at the other side of the die. Once the profile reaches the prespecified extrusion
length, it is cut by the saw in front of the die. The profiles are straightened by a
stretcher and cut to customer-specified lengths afterward. Figure 1 illustrates an
example of an extrusion plan where one billet is extruded

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 452–462, 2021.
https://doi.org/10.1007/978-3-030-68154-8_41

three times, i.e. three identical extrusion lengths. Due to stretcher capacity, this extrusion
length is constrained to within a certain range, such as from Rmin to Rmax.

Fig. 1. Aluminum extrusion process: 3 extrusions per billet

Given a customer order, consisting of product length L and order quantity n, the
process planner needs to select available pre-cut billet sizes bi and extrusion lengths
LE1 to meet the individual demand whilst producing minimal scrap. Many short billets
may need to be extruded consecutively until the extrusion length exceeds Rmin before
being cut by the saw; this type of configuration will be called multiple billets per
extrusion in this paper. On the other hand, a large billet can be extruded and cut by the
saw more than once, with each extrusion length between Rmin and Rmax. This will be
called multiple extrusions per billet, though a single extrusion per billet is also included
in this category.
So far, in the literature, there is no explicit modeling that considers both extrusion
setups simultaneously. The objective of this research is to formulate a yield
optimization model for the extrusion process using mass balance equations. Available
pre-cut billet sizes, process setup and customer order requirements are taken into
account in the model. It is validated with actual data from an aluminum extrusion
company. The model is then used to illustrate possible yield improvements through
case studies. Specifically, it is shown how two objective functions, namely minimizing
total billet weight used versus maximizing yield, provide different optimal solutions.
The effect of extrusion die design on yield is also studied.

2 Literature Review

Aluminum extrusion yield was first investigated by Tabucanon [2]. A dynamic
programming model was proposed to identify an optimal set of billets to be stocked
that would minimize overall scrap. Masri and Warburton [3, 4] formulated yield
optimization as a billet selection problem. Using past customer order data, the goal was
to identify optimal billet sizes that should be stocked. Their model considered a global
problem: multiple profile cross sections and lengths. Multiple pushes within the same
billet were not considered due to longer cycle time. Wastes from each billet
size–customer order pair were calculated a priori, though not explicitly shown.
Hajeeh [5] formulated a mathematical model to optimize cutting aluminum logs into
billets to fill customer orders. The scraps considered were from log cutting, butt scrap
and profile cutting; the goal was to minimize overall scrap. Again, wastes from each
billet assignment were calculated before optimization. Another approach to scrap
reduction was introduced by researchers in the mechanical engineering discipline.
Reggiani, Segatori, Donati and Tomesani [6] used finite element simulation to predict
charge welds between two consecutive billets that must be scrapped. Oza and
Gotowala [7] used HyperXtrude, a commercial CAD software for extrusion, to study
the effect of ram speed, billet and die temperatures on traverse weld length. Ferras,
Almeida, Silva, Correia and Silva [8] conducted an empirical study on which process
parameters contribute the most to material scrap.
In this research, an alternative approach to extrusion yield optimization is proposed.
Instead of considering a global problem where multiple customer orders are optimized
simultaneously, the problem size can be greatly reduced by considering one customer
order at a time. Secondly, mass balance equations are used explicitly in the model,
allowing a clear indication of how billets are extruded in the optimal solution. Thirdly,
both general extrusion setups, including multiple pushes, are considered; as will be
shown later, there are cases where multiple pushes can lead to lower scrap. Last, it is
possible to identify the impact of improving hardware constraints, for example, how
die design can improve the optimal solution.

3 Research Methodology
3.1 Problem Formulation
There are three types of material scrap in the aluminum extrusion process. The first is
incurred from cutting an aluminum log into specific billet lengths. The second comes
from in-process extrusion, denoted by the billet’s backend e, in kg, and the aluminum
material remaining in the die pot, P, also in kg, after the die is removed from the
extrusion press. These are denoted in Fig. 1. The third type of waste is induced by
profile cutting. It comes in many forms, as follows.
• Aluminum portion between die and cutting saw, Ls, is usually too short for cus-
tomer usage and therefore discarded.

• Part of the profile gripped by the pullers for stretching is usually deformed and later
cut off. These are denoted by head loss h and tail loss t in Fig. 1. Generally, the
portion of the profile that comes out first and is gripped by the puller, h, is less than
Ls. Thus, the scrap h is already included in Ls and can be considered zero in most
cases.
• An allowance, d, shown in Fig. 2, is used to avoid irregularities within the profiles.
These irregularities may come from stop marks made while the profile is being cut,
or where material from two different billets is merged within the same profile. They
may lead to profile failures under large loads; thus, manufacturers cut this portion
out.
• Another allowance, though not shown in both figures, is the material scrap
remaining after the extruded profile is cut to a pre-determined number of pieces,
say q1, of the customer order’s length L.
Other input variables for the planning include the billet weight per length, b, the
number of die openings (how many profiles can be extruded at one time), a, and the
extruded profile’s average weight per length, w.
In the case of multiple extrusions per billet, the process planner must determine the
billet sizes to be used as the first and consecutive billets, B1 and B2, respectively; for
each billet, how many extrusions C1 and C2 are performed; and, in each extrusion, how
many pieces of product with length L are made within each profile, q1 and q2. The
extrusion lengths LE1 and LE2 from each billet should be constant for simplicity in
profile cutting. All C’s are at least 1. Figure 1, for example, demonstrates the case of
three extrusions per billet (C1 = 3), whereas consecutive billets are not shown.

Fig. 2. Multiple billets per extrusion (3 billets per extrusion).

Schematics of multiple billets per extrusion are shown in Fig. 2. Here, at most three
different billet sizes, B1, B2 and B3, can be used to form a set of billets that results in a
single extrusion. The second billet B2 can be used D2 times within each billet set,
while the first billet B1 and the last billet B3 are used only once. Each billet results in
q1, q2, and q3 pieces of product within each profile, respectively. As will be shown in the

solution procedure, the first billet in the next set, B1′, can be smaller than B1. The
extrusion length LE can be computed for each billet set.
Assumptions for the problem formulation are summarized as follows.
• Billets are pre-cut to specific sizes and available in stock with unlimited quantity.
• Extrusion length must be between Rmin and Rmax.
• Saw used for profile cutting on the runout table is at fixed position Ls from the die.
• Material losses due to the cutting saw are negligible.
• The desired product length L and minimum required quantity n are known input
variables for billet selection.
• The profile’s weight per length w is constant along the profile.
• Process parameters such as extrusion speed, temperature, etc. are pre-determined.
No significant adjustment is made during extrusion.
• For multiple pushes per billet case, extrusion length is uniform for one billet size. At
most two billet sizes are used to meet a customer order.
• For multiple billets per extrusion case, extrusion length should be uniform for each
set of billets. At most three billet sizes are used to meet a customer order.

3.2 Solution Procedure


Based on mass balance equations from both extrusion setups, it is possible to devise a
divide-and-conquer approach.

Step 1. Classify available billet sizes into two categories. For each billet size bi, the
maximum extrudable length can be calculated from

L_{max,i} = \frac{b_i - e - P}{aw} - L_s \qquad (1)

Let U be the set of billets bu whose maximum extrudable length Lmax,u ≥ Rmin,
and V the set of billets bv whose maximum extrudable length Lmax,v < Rmin.
Step 2. For billets in U, solve the following optimization model for the multiple
extrusions per billet setup. First, a few more decision variables are needed. Let
X1u = 1 if billet u is used as the first billet and zero otherwise, ∀u;
X2u = 1 if billet u is used as consecutive billets and zero otherwise, ∀u;
nb2 = number of consecutive billets used.
The objective function is to minimize the total billet weight used. Equations (2) and (5)
are mass balance equations that can be derived from Fig. 1. Equations (3), (4), (6) and
(7) ensure that only one billet size is used as the first billet and only one for consecutive
billets. Equations (8)–(11) impose constraints on the extrusion lengths for the first and
consecutive billets, respectively. The number of consecutive billets, nb2, can be
calculated from (12). All decision variables are constrained in (13)–(15).

minimize B_1 + n_{b2} B_2
subject to

B_1 \ge e + P + aw\{C_1(L_s + L q_1 + t) + L_s + h\} \qquad (2)

B_1 = \sum_u X_{1u} b_u \qquad (3)

\sum_u X_{1u} = 1 \qquad (4)

B_2 \ge e + aw\{C_2(t + L_s + L q_2) + h\} \qquad (5)

B_2 = \sum_u X_{2u} b_u \qquad (6)

\sum_u X_{2u} = 1 \qquad (7)

L_{E1} = L_s + t + L q_1 + h \qquad (8)

R_{min} \le L_{E1} \le R_{max} \qquad (9)

L_{E2} = L_s + t + L q_2 + h \qquad (10)

R_{min} \le L_{E2} \le R_{max} \qquad (11)

a C_2 q_2 n_{b2} \ge n - a C_1 q_1 \qquad (12)

C_1, C_2 \ge 1, \text{ integer} \qquad (13)

q_1, q_2, X_{1u}, X_{2u}, n_{b2} \ge 0, \text{ integer} \qquad (14)

B_1, B_2, L_{E1}, L_{E2} \ge 0 \qquad (15)

Step 3. For billets in V, solve the following optimization model for the multiple billets
per extrusion case. A few more variables are needed. Let
X1v = 1 if billet v is used as the first billet and 0 otherwise, ∀v;
X2v = 1 if billet v is used as consecutive billets and 0 otherwise, ∀v;
X3v = 1 if billet v is used as the last billet and 0 otherwise, ∀v;
Y1v = 1 if billet v is used as the first billet of the next set and 0 otherwise, ∀v;
ns = number of consecutive billet sets used.

The objective is to minimize the total billet weight, which consists of the first billet set
and the consecutive billet sets. Equations (16), (19), (22) and (25) represent mass
balance equations that can be derived from Fig. 2. Equations (17), (18), (20), (21),
(23), (24), (26) and (27) are used to make sure that only one billet size is used for each
billet type. The extrusion length for each billet set is computed in (28), while its limits
are imposed by (29). The total number of consecutive billet sets is calculated by (30).
Constraints on the decision variables are summarized in (31) and (32).

minimize (B_1 + D_2 B_2 + B_3) + n_s (B_1' + D_2 B_2 + B_3)
subject to

B_1 \ge e + P + aw\{L q_1 + L_s + h\} \qquad (16)

B_1 = \sum_v X_{1v} b_v \qquad (17)

\sum_v X_{1v} = 1 \qquad (18)

B_2 \ge e + P + aw\left\{L q_2 + d - \frac{P}{aw}\right\} \qquad (19)

B_2 = \sum_v X_{2v} b_v \qquad (20)

\sum_v X_{2v} = 1 \qquad (21)

B_3 \ge e + P + aw\left\{L_s + t + L q_3 + d - \frac{P}{aw}\right\} \qquad (22)

B_3 = \sum_v X_{3v} b_v \qquad (23)

\sum_v X_{3v} = 1 \qquad (24)

B_1' \ge e + P + aw\left\{L q_1 + d - \frac{P}{aw}\right\} \qquad (25)

B_1' = \sum_v Y_{1v} b_v \qquad (26)

\sum_v Y_{1v} = 1 \qquad (27)

L_E = (L_s + L q_1) + D_2(d + L q_2) + (d + L q_3 + t) \qquad (28)

R_{min} \le L_E \le R_{max} \qquad (29)

a(q_1 + D_2 q_2 + q_3)\,n_s \ge n - a(q_1 + D_2 q_2 + q_3) \qquad (30)

B_1, B_1', B_2, B_3 \ge 0 \qquad (31)

D_2, q_1, q_2, q_3, X_{1v}, X_{2v}, X_{3v}, Y_{1v}, n_s \ge 0, \text{ integer} \qquad (32)

Step 4. Show the results from steps 2 and 3 so that the better solution can be selected.
Even though both optimization models are non-linear, all independent variables are
discrete. A computer program can therefore be developed to enumerate all possible
values of these variables and identify the optimal solution, if any, in each model. The
solution procedure was developed using Visual Basic for Applications in Microsoft
Excel.
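To make the enumeration concrete, the following sketch re-implements the Step 2 model (multiple extrusions per billet) in Python on the Table 1 case-study data. It is an illustrative re-implementation, not the authors' Excel/VBA program, and the upper bound of three pushes per billet is an assumed search range.

```python
# Illustrative enumeration for the Step 2 model (multiple extrusions
# per billet) with the Table 1 case-study data; a Python sketch, not
# the authors' Excel/VBA program.
import math
from itertools import product

Ls, h, t, e, P = 1.5, 0.0, 1.0, 1.75, 1.25   # m, m, m, kg, kg
Rmin, Rmax = 25.0, 47.0                      # m
a, w, L, n = 2, 0.545, 3.1, 600              # openings, kg/m, m, pcs
U = [59, 64, 74, 78]                         # billets (cm) with Lmax >= Rmin
weight = {s: 0.668 * s for s in U}           # billet weights, kg

q_max = int((Rmax - Ls - t - h) / L)         # largest q allowed by Eqs. (9)/(11)
best = None
for b1, b2 in product(U, repeat=2):
    for C1, C2, q1, q2 in product(range(1, 4), range(1, 4),
                                  range(1, q_max + 1), range(1, q_max + 1)):
        LE1 = Ls + t + L * q1 + h                       # Eq. (8)
        LE2 = Ls + t + L * q2 + h                       # Eq. (10)
        if not (Rmin <= LE1 <= Rmax and Rmin <= LE2 <= Rmax):
            continue
        # mass balances, Eqs. (2) and (5): the billet must be heavy enough
        if weight[b1] < e + P + a * w * (C1 * (Ls + L * q1 + t) + Ls + h):
            continue
        if weight[b2] < e + a * w * (C2 * (t + Ls + L * q2) + h):
            continue
        nb2 = max(0, math.ceil((n - a * C1 * q1) / (a * C2 * q2)))  # Eq. (12)
        total = weight[b1] + nb2 * weight[b2]
        if best is None or total < best[0]:
            best = (total, b1, C1, q1, b2, C2, q2, nb2)

print(best)
```

With these inputs the sketch reproduces the minimum-weight plan reported in Table 2: a 59-cm first billet followed by 21 consecutive 78-cm billets, about 1133.6 kg of billets in total.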

4 Case Studies
4.1 Case Study 1
Actual operating data from an aluminum extrusion company is used to validate the
model. The input data, shown in Table 1, is used for all cases unless specified
otherwise. In the company the case data is taken from, a human process planner
decides which billet sizes are used. Table 2 illustrates how this conventional extrusion
plan compares with results from the enumeration program developed. To make 600
pieces of 3.1 m-long profile, the first billet used is 64 cm in length, whereas the
consecutive billets are smaller at 59 cm. Each billet is extruded once, a single push per
billet, to make 10 pieces of product in each profile extruded. There are two profiles
extruded at the same time (a = 2); each billet therefore generates 20 pieces of product.
With 29 consecutive billets, exactly 600 pieces of product are made. The total weight
of all billets used is 1185.7 kg. Since material yield is defined as the ratio of total
product weight to total billet weight, the resulting yield is 85.8%.

Table 1. Parameters used in case studies.

| Extrusion press setup | Ls = 1.5 m, h = 0 m, t = 1 m, d = 0 m, e = 1.75 kg |
| Billet | Rmin = 25 m, Rmax = 47 m, b = 0.668 kg/cm |
| Available billet sizes | 35, 40, 59, 64, 74, 78 cm |
| Die | P = 1.25 kg, a = 2 |
| Product | w = 0.545 kg/m, L = 3.1 m, n = 600 pcs |

Table 2. Conventional plan versus results from the enumeration program.

| | Result from industry | Min. billet weight | Min. billet weight | Max. yield |
| Setup | Multiple billets/ext. | Multiple billets/ext. | Multiple ext./billet | Multiple ext./billet |
| First billet size (cm) | 64 | 40/35 | 59 | 78 |
| pcs/profile × pushes/billet q1 × C1 | 10 × 1 | 6 | 9 × 1 | 13 × 1 |
| Extruded length LE1 (m) | 33.5 | – | 30.40 | 42.80 |
| Consecutive billet size (cm) | 59 | – | 78 | 78 |
| pcs/profile × pushes/billet q2 × C2 | 10 × 1 | – | 14 × 1 | 14 × 1 |
| Extruded length LE2 (m) | 33.5 | – | 45.90 | 45.90 |
| Number of consecutive billets nb2 | 29 | – | 21 | 21 |
| Last billet size (cm) | – | 40 | – | – |
| pcs/billet q3 | – | 6 | – | – |
| Extruded length LE3 (m) | – | 39.70 | – | – |
| Number of sets ns | – | 24 | – | – |
| Total no. of workpieces made (pcs) | 600 | 600 | 606 | 614 |
| Total billet weight (kg) | 1185.7 | 1255.84 | 1133.60 | 1146.29 |
| Yield (%) | 85.80 | 81.02 | 90.65 | 90.83 |

With the enumeration program, the first two billet sizes, 35 and 40 cm, are put into
set V, to be considered for multiple billets per extrusion. The program selects a first
billet set consisting of a 40-cm billet as both the first and the last billet, with no billets
extruded in between. Both billets result in 6 pieces per profile, or 24 pieces per billet set.
The second set is similar to the first, except that the first billet size can be reduced to
35 cm while still producing the same number of products per billet set. The extrusion
length for each billet set is 39.7 m. Making 600 pieces of product requires another
24 sets of billets besides the first set. Certainly, this extrusion setup requires more billet
weight than the conventional case; the resulting yield is 81.02%. In this particular case,
multiple billets per extrusion could not perform better than multiple extrusions per
billet.
As the program considers billets in U for multiple (or single) pushes per billet, it
selects 59 cm for the first billet size and 78 cm for the consecutive billet size. The first
billet creates 9 pieces of product per profile, while consecutive billets create 14 pieces.
It should be noted that the extrusion length from the first billet is not the same as that
from consecutive billets. Here, only 21 consecutive billets are needed. The total product
quantity made is 606 pieces, while the total billet weight required is less than that of the
conventional case. The resulting yield, 90.65%, is significantly better than that of the
conventional solution.
The last column in Table 2 demonstrates how the objective function influences the
optimal solution. If the objective function is changed from minimizing total billet
weight to maximizing yield, the optimal solution will look to add more pieces of
product, the numerator of yield, while keeping the total billet weight as low as possible.
Thus, the optimal solution under the maximizing-yield objective uses the largest billet
sizes for the first and consecutive billets. The resulting yield is 90.83%, with more
products made. Selecting which objective function to use will depend on whether
customers allow over-shipment or whether there exists a need to make extra pieces, for
example, to compensate for product defects.

4.2 Case Study 2


As a common practice in industry, extrusion companies try to improve productivity by
increasing the number of die openings, denoted by a in this paper, to as high as
possible. For example, if the original number of die openings is 2 and a third opening
can be added, the number of profiles extruded is increased by 50% immediately. In this
case study, the impact of increasing a is investigated. As a increases, the amount of
material remaining in the die pot, P, also increases. There are some other drawbacks to
increasing a: notably, more material scrap is introduced in each profile, and it is
generally more difficult to control the extrusion process (Table 3).

Table 3. Effects of increasing the number of die openings a within a die.

| | a = 1 | a = 2 | a = 3 | a = 4 | a = 5 |
| First billet size (cm) | 40 | 59 | 78 | 40/35 | 35/35 |
| pcs/profile × pushes/billet q1 × C1 | 13 × 1 | 9 × 1 | 8 × 1 | 3 | 2 |
| Extruded length LE1 (m) | 42.80 | 30.40 | 27.30 | – | – |
| Consecutive billet size (cm) | 78 | 78 | 78 | 40 | 78 |
| pcs/profile × pushes/billet q2 × C2 | 14 × 2 | 14 × 1 | 9 × 1 | 4 | 6 |
| Extruded length LE2 (m) | 45.90 | 45.90 | 30.40 | – | – |
| Number of consecutive billets nb2 | 21 | 21 | 22 | 2 | 1 |
| Last billet size (cm) | – | – | – | 40 | 59 |
| pcs/billet q3 | – | – | – | 3 | 4 |
| Extruded length LE3 (m) | – | – | – | 45.90 | 39.70 |
| Number of sets ns | – | – | – | 10 | 9 |
| Total no. of workpieces made (pcs) | 601 | 606 | 618 | 616 | 600 |
| Total billet weight (kg) | 1120.9 | 1133.6 | 1198.4 | 1142.4 | 1149 |
| Yield (%) | 90.92 | 90.65 | 87.45 | 91.44 | 88.55 |

The enumeration program is used to investigate the extrusion process in case study 1
with various numbers of die openings, from a = 1 to 5, assuming that the extrusion
press capacity is sufficient and the process is under control. Here, only the objective of
minimizing total billet weight is used, and only the better solution from the two
extrusion setups is shown. When a = 1, 2 or 3, the optimal solutions come from
multiple extrusions per billet. As a increases, the optimal yield decreases. The optimal
solutions for a = 1 and a = 2 are very close to each other, as the main difference comes
only from the first billet size used; the consecutive billets, when a = 1 or 2, create 28
pieces of product. When a = 4 or 5, the optimal solution comes from multiple billets
per extrusion, with only slightly lower yield. At a = 5, all billet sizes are in V. Thus,
multiple billets per extrusion can be beneficial when companies look to improve both
yield and productivity at the same time.

5 Conclusion

In this research, an alternative yield optimization model is proposed for the aluminum
profile extrusion process using mass balance equations. The model takes into account
two general ways to extrude aluminum profiles, namely multiple billets per extrusion
and multiple extrusions per billet. The proposed model has three distinct advantages.
First, the problem size is drastically reduced, since customer orders are considered one
at a time. Secondly, all possible ways to extrude billets are explored to find the
optimum solution. And last, it allows process planners to investigate the effects of
process parameters, such as the allowable runout length.

Based on limited case studies from the industry, preliminary conclusions can be drawn.
• The choice of objective function has a profound effect on the optimal solution.
Minimizing total billet weight seems to be a better objective than maximizing yield,
at least from the overproduction quantity standpoint.
• Multiple pushes, i.e. multiple extrusions per billet and multiple billets per
extrusion, can be beneficial for material utilization. Both setups should be
considered unless there are physical constraints on the extrusion press.
More work can be done to extend the model to online billet cutting, where billets
can be cut to length and extruded directly. This should provide even better yield.

Models for Forming Knowledge Databases for Decision Support Systems for Recognizing Cyberattacks

Valery Lakhno1, Bakhytzhan Akhmetov2, Moldyr Ydyryshbayeva3, Bohdan Bebeshko4, Alona Desiatko4, and Karyna Khorolska4(&)

1 National University of Life and Environmental Sciences of Ukraine, Kiev, Ukraine
valss21@ukr.net
2 Abai Kazakh National Pedagogical University, Almaty, Kazakhstan
bakhytzhan.akhmetov.54@mail.ru
3 Al Farabi Kazakh National University, Almaty, Kazakhstan
moldir.ydyryshbaeva@gmail.com
4 Kyiv National University of Trade and Economics, Kioto Street 19, Kyiv, Ukraine
{b.bebeshko,desyatko,k.khorolska}@knute.edu.ua

Abstract. Bayesian network patterns have been developed for the computing core of a decision support system used in predicting threats and stages of intrusion into the information and communication networks of informatization objects. The proposed Bayesian network templates allow one to operate with a variety of random variables and to determine the probability of a cyber threat, or of a specific stage of an invasion, under given conditions. Probabilistic models for detecting network intrusions based on dynamic Bayesian networks are added. The parameters of the Bayesian networks were trained with the EM algorithm. In contrast to existing solutions, the proposed approach makes it possible not only to take into account the main stages of intrusions but also to make more reasonable decisions based on both typical intrusion patterns and newly synthesized ones. All templates and models make up the computing core of the decision support system for intrusion detection. The effectiveness of the developed models was tested on samples that were not previously used in training.

Keywords: Decision support system · Intrusion recognition · Bayesian networks · Models

1 Introduction

The constant growth in the complexity of illegitimate impacts on objects of informatization (OBI) can be countered, in particular, with systems for intelligent recognition of anomalies and cyberattacks [1]. In the face of increasingly complex attack scenarios, many companies developing intrusion detection systems have begun to


integrate intelligent decision support systems (DSS) into their products. Note that the basis of a modern DSS is formed by various models and methods, which together form its computational core [2, 3].
In complex situations, guaranteeing the information security of various objects of informatization requires active interaction between information security tools, cybersecurity systems, and experts. These circumstances determine the relevance of our research in the field of synthesizing models for the computational core of a decision support system that is part of an intrusion detection system.

2 Review and Analysis of Previous Research

As noted in [4], a promising direction of research in this area is the development of methods, models, and software systems for DSS [5] and expert systems (ES) [6] in the field of information security.
In [6, 7, 24], Data Mining technologies for information security problems were considered. The emphasis in these studies was on identifying the pattern of situation evolution associated with providing information security for informatization objects. The considered works had no practical implementation in the form of a software system. The studies [8, 9] analyzed the methodology of intelligent modeling in problems of informatization object information security. The methodology proposed by the authors is intended to support analysis and decision-making under different scenarios of intrusion into informatization objects. However, these studies were not brought to a hardware or software implementation.
The appearance of new classes of threats makes it difficult to analyze and support decision-making in information security tasks, because the information security tasks themselves are poorly amenable to formalization and structuring [10, 11]. In such cases, the parameters of the information security state of informatization objects can be represented by qualitative indicators. The latter is not always advisable.
According to the authors of [12, 13], the analysis of the security level of informatization objects and the development of plans to counter targeted cyberattacks should be preceded by a stage of identifying the main threats. The authors of [14] point out that it is problematic to solve such a problem well without an appropriate DSS. Those researchers did not describe the practical results of their development.
In [15, 16], it was shown that aspects of building information security tools can be taken into account in an approach based on applied Bayesian networks (hereinafter BN). Compared to existing data analysis methods, BNs provide a clear and intuitive explanation of their findings. A BN (often a dynamic BN, or DBN) also allows a logical interpretation and modification of the structure of the relations among the variables of the problem. The representation of a BN as a graph makes it a convenient tool for assessing the probabilities of possible cyber threats, for example, to the information and communication networks of informatization objects.

The computing core of the developed decision support system is based on probabilistic models that make up the knowledge base. Such probabilistic models are capable of describing processes when the data is poorly structured or insufficient for statistical analysis. Problems solved with probabilistic knowledge-base models should also use probabilistic inference procedures. Various processes take place in the information and communication networks of informatization objects, each generating its own data sequences. Accordingly, these data sequences should be reflected in the description of each process. The above analysis shows that research on new models for forming decision support system knowledge bases in the process of recognizing cyberattacks is relevant.

3 Objectives of the Study

The research aims to develop models based on Bayesian networks for the knowledge
base of the computing core of the decision support system in the course of identifying
complex cybernetic attacks.

4 Methods and Models

It should be noted that unauthorized actions by an intruder in an information and communication network, as a rule, generate corresponding anomalies in its operation. In this context, an information and communication network is not completely isolated but is located in a certain external environment (one that includes personnel, competitors, legislative and regulatory bodies, etc.). Such an environment is usually rather weakly formalized, since the connections between its components are not always clearly defined; to detect the cyberattacks that have generated anomalies in this environment, appropriate tools and methods are needed that rely on a variety of characteristic features. The main criterion for selecting intrusion signs and their parameters is the level of statistical significance. Since the KDD99 feature set [18] contains 41 signs of network intrusions (see Table 1 [17, 18]), a stage that reduces them to the most informative ones is necessary. As was shown in [19, 20], the number of informative features can indeed be minimized. Such minimization allows one, as in the previous example, to build a Bayesian network that simulates the intrusion detection logic of an intrusion detection system. Thus, after minimizing the total number of parameters and selecting the most informative characteristics, the number of parameters decreased from 41 to 8. Table 1 shows the 8 rows with the features recognized as the most informative for detecting network intrusions [20]. These 8 features make it possible to recognize the presence of typical network attacks such as Probe, U2R, R2L, and DoS/DDoS with an accuracy of 99%.
Note that some of the selected informative features (in particular, lines 23 and 28 of
Table 1) are dynamic and change their value in the time interval 0–2 s. Therefore, to

design a DSS and fill its knowledge base with appropriate Bayesian network patterns, it is better in this subtask to use a dynamic Bayesian network (DBN). Dynamic Bayesian networks are ordinary Bayesian networks whose variables are related at adjacent time steps [21]. In fact, a dynamic Bayesian network is a Markov process of the 1st order.
The designed dynamic Bayesian network consists of two networks. Let us designate the first BN as (B1) and the second as (B2). The network (B1) is the original network, and (B2) is the transit network. The original network (B1) defines the prior distributions of the model (p(z(1))). The transit model (B2) determines the probabilities of transitions between time slices.

Table 1. Parameters of network traffic for building a BN (according to [18, 19])

№    Parameter                 Description
2    protocol_type             Protocol type (TCP, UDP, etc.)
3    service                   Attacked service
4    src_bytes                 Number of bytes from source to destination
23   count                     The number of connections per host in the current session for the last 2 s
26   same_srv_rate             % of connections with the same service
27   diff_srv_rate             % of connections for various services
28   srv_count                 Number of connections for the same service in the last 2 s
34   dst_host_same_srv_rate    % of connections to localhost established by the remote side and using the same service

Taking the above into account, one can build dynamic Bayesian networks for the variables marked in lines 2–4, 23, 26–28, and 34 of Table 1. One can also add a variable type to the DBN, which will indicate the moment of transition between networks B1 and B2. The GeNIe Modeler editor was used to design the dynamic Bayesian networks.
The probabilistic model of the designed dynamic Bayesian network for transitions between the vertices of the joint distribution graph of the available models is as follows:

p(Z_t \mid Z_{t-1}) = \prod_{i=1}^{N} p\big(Z_t^i \mid Pa(Z_t^i)\big),   (1)

p(Z_{1:T}) = \prod_{t=1}^{T} \prod_{i=1}^{N} p\big(Z_t^i \mid Pa(Z_t^i)\big),   (2)

where Z_t is the BN slice at time t; Z_t^i is the i-th BN node at time t; Pa(Z_t^i) is the set of parent nodes of the node Z_t^i; and N is the number of nodes in a BN slice. Expression (2) describes the probabilities of transitions between BN nodes.

If the model is divided into a set of unobservable variables (V_t) and a set of observed variables (U_t), then expression (1) should be supplemented by expression (2), which accordingly specifies not only the initial state model but also the initial observation model.
The redundancy problem in analyzing the traffic properties of the designed test network was solved by analyzing the informative characteristics of the traffic.
In the course of this analysis, based on the publications [17–20], it was proposed to apply the information gain criterion I(V, U). According to this criterion, the information gain I(V, U) of one attribute relative to another (e.g., line 3 as attribute V and line 23 as attribute U) indicates how much the uncertainty about the values of U decreases when the values of V are known. Hence, I(V, U) = H(U) − H(U|V), where H(U) and H(U|V) are the entropy of the attribute U and the conditional entropy of U given V, respectively.
Since the values of U and V are both discrete and take values in the sets {u_1, ..., u_k} and {v_1, ..., v_l}, respectively, the entropy of the attribute U was determined as follows:

H(U) = -\sum_{i=1}^{k} P(U = u_i) \log_2 P(U = u_i).   (3)

The conditional entropy can be found as:

H(U \mid V) = \sum_{j=1}^{l} P(V = v_j) \, H(U \mid V = v_j).   (4)

Because the calculation of I(V, U) is performed for discrete values, the values from the sets {u_1, ..., u_k} and {v_1, ..., v_l} had to be discretized. In the test experiments, the method of equal-frequency intervals was used for discretization. Following this method, the attribute value space is partitioned into an arbitrary number of intervals, each containing the same number of data points. The application of this method allowed us to achieve a better classification of the data values.
The information gain I(V, U) depends on the number of values of the corresponding attributes; therefore, as the number of values increases, the entropy value for the attribute decreases.
More informative attributes can be selected using the gain ratio [23]:

G(U \mid V) = \frac{I(U, V)}{H(V)} = \frac{H(U) - H(U \mid V)}{H(V)}.   (5)

The gain ratio G(U|V) takes into account not only the amount of information needed to represent the result in the DSS, but also H(V), the amount needed to separate the data by the current attribute (V).
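To make criteria (3)–(5) concrete, here is a minimal Python sketch assuming only NumPy; the equal-frequency binning, the attribute names, and the toy data are illustrative assumptions, not the authors' implementation.

import numpy as np

def equal_freq_bins(x, n_bins=4):
    # equal-frequency intervals: each bin holds roughly the same number of points
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(x, edges)

def entropy(u):                                    # equation (3)
    _, counts = np.unique(u, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def cond_entropy(u, v):                            # equation (4)
    return sum((v == val).mean() * entropy(u[v == val]) for val in np.unique(v))

def gain_ratio(u, v):                              # equation (5)
    return (entropy(u) - cond_entropy(u, v)) / entropy(v)

# toy example: how informative is a discretized 'count'-like feature (V)
# about a 'service'-like attribute (U)?
rng = np.random.default_rng(0)
u = rng.integers(0, 3, 500)
v = equal_freq_bins(u * 10 + rng.normal(0, 3, 500))
print(round(gain_ratio(u, v), 3))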
Consider the following example of designing a dynamic Bayesian network for the knowledge base of the decision support system. We proceed from the following

assumptions. Typically, intrusion into a network follows a three-stage scheme. In the first stage, the attacker performs a network scan (S). In the second stage, the attacker exploits vulnerabilities of the information and communication network (E). At the final, third stage, the attacker seeks to gain access to the information and communication network, in particular through a backdoor (B).
Modern attacks most often occur remotely, and attackers do not have complete information about the attacked network. The attacker needs to collect as much data as possible about the target of the attack; otherwise, they would have to try all known vulnerabilities, which is a rather lengthy procedure. The network scanning process certainly leaves its mark on the network traffic. Once the attacker has gained this information, they are able to focus more specifically on the potential vulnerabilities of network devices, services, operating systems, and application software. During this impact on the network, the traffic changes accordingly. The sequence of an attacker's actions has already been described elsewhere in sufficient detail; therefore, without delving into the details and techniques of the invasion, we focus on the construction of a dynamic Bayesian network that describes a probabilistic model of network states at different stages of the invasion. Note that if the stages (S), (E), (B) were not detected by protection services, they are considered hidden.
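As a structural illustration of this three-stage scheme, the following sketch encodes a two-slice DBN over the hidden stages S, E, B using the open-source pgmpy library. This is only a hypothetical skeleton matching the transition dependencies used in model (7) below, not the authors' GeNIe model, and the conditional probability tables would still need to be specified or learned.

from pgmpy.models import DynamicBayesianNetwork as DBN

# nodes are (variable, time_slice) pairs; edges mirror the dependencies in (7):
# each stage depends on its own previous state, E also depends on the previous
# S, and B on the previous E
dbn = DBN()
dbn.add_edges_from([
    (("S", 0), ("S", 1)),
    (("E", 0), ("E", 1)),
    (("S", 0), ("E", 1)),
    (("B", 0), ("B", 1)),
    (("E", 0), ("B", 1)),
])
print(dbn.nodes())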
Figure 1 shows a general model of the dynamic Bayesian network for the invasion stages (S), (E), (B). The model shows two slices that correspond to the invasion process.

Fig. 1. General DBN model for the invasion stages S, E, B
Fig. 2. An example of the relationship between the observed parameters (lines 3–6, 12, 25, 29, 30, 38, 39 of Table 1)

The invasion stages are hidden. Accordingly, it is necessary to collect statistics on the observed network traffic parameters (Table 1). An example of the relationship between the observed parameters (lines 3–6, 12, 25, 29, 30, 38, 39 of Table 1) is shown in Fig. 2.
Having a state graph of the attacked network, one can describe it with the following model:

P(V(n) \mid Pa(V(n))) = \prod_{i=1}^{3} P(v_i(n) \mid Pa(v_i(n))),   (6)

\prod_{i=1}^{3} P(v_i(n) \mid Pa(v_i(n))) = P(s(n) \mid s(n-1)) \cdot P(e(n) \mid e(n-1), s(n-1)) \cdot P(b(n) \mid b(n-1), e(n-1)).   (7)
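Once the conditional probability tables are known, factorization (7) can be evaluated directly. The following minimal sketch does so with plain Python dictionaries; all probability values are invented toy numbers used only to illustrate the factorization.

# Illustrative evaluation of factorization (7) with invented CPT values.
# States are 0 (stage not active) / 1 (stage active); all numbers are toy data.
P_s = {(1, 0): 0.3, (1, 1): 0.9}          # P(s(n)=1 | s(n-1))
P_e = {(1, 0, 0): 0.05, (1, 0, 1): 0.4,   # P(e(n)=1 | e(n-1), s(n-1))
       (1, 1, 0): 0.7,  (1, 1, 1): 0.95}
P_b = {(1, 0, 0): 0.01, (1, 0, 1): 0.3,   # P(b(n)=1 | b(n-1), e(n-1))
       (1, 1, 0): 0.6,  (1, 1, 1): 0.9}

def p(table, value, *parents):
    """P(X(n)=value | parents), given a table that stores the value-1 entries."""
    p1 = table[(1, *parents)]
    return p1 if value == 1 else 1.0 - p1

def transition(s_n, e_n, b_n, s_p, e_p, b_p):
    """Right-hand side of equation (7) for one time step."""
    return (p(P_s, s_n, s_p)
            * p(P_e, e_n, e_p, s_p)
            * p(P_b, b_n, b_p, e_p))

# probability that after a completed scan (s=1, e=b=0) the exploit stage starts
print(transition(1, 1, 0, 1, 0, 0))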

For example, for the Bayesian network variant shown in Fig. 2, the model that describes the relationship between the observed variables of network traffic and the probabilities of transition between the states of the graph vertices is as follows:

P(U(n) \mid Pa(U(n))) = \prod_{j=1}^{11} P\big(u_j^N(n) \mid Pa(u_j^N(n))\big),   (8)

\prod_{j=1}^{11} P\big(u_j^N(n) \mid Pa(u_j^N(n))\big) = \prod_{j=1}^{4} P\big(u_j^N(n) \mid Pa(s(n))\big) \cdot \prod_{j=5}^{8} P\big(u_j^N(n) \mid Pa(e(n))\big) \cdot \prod_{j=9}^{11} P\big(u_j^N(n) \mid Pa(b(n))\big),   (9)

\prod_{j=1}^{4} P\big(u_j^N(n) \mid Pa(s(n))\big) = P(u_1^3(n) \mid s(n)) \cdot P(u_2^4(n) \mid s(n)) \cdot P(u_3^5(n) \mid s(n)) \cdot P(u_4^6(n) \mid s(n)),   (10)

\prod_{j=5}^{8} P\big(u_j^N(n) \mid Pa(e(n))\big) = P(u_5^{25}(n) \mid e(n)) \cdot P(u_6^{26}(n) \mid e(n)) \cdot P(u_7^{38}(n) \mid e(n)) \cdot P(u_8^{39}(n) \mid e(n)),   (11)

\prod_{j=9}^{11} P\big(u_j^N(n) \mid Pa(b(n))\big) = P(u_9^{12}(n) \mid b(n)) \cdot P(u_{10}^{29}(n) \mid b(n)) \cdot P(u_{11}^{30}(n) \mid b(n)),   (12)

where j = 1, ..., 11 indexes the variables observed in the network state graph (see Fig. 2) and N = 1, ..., 41 indicates the corresponding property (traffic parameter) from Table 1.
Therefore, for the previously selected informative signs of a DDoS attack (lines 2–4, 23, 26–28, 34 of Table 1), the dynamic Bayesian network takes the form shown in Fig. 3.

Fig. 3. An example of the relationship between the observed parameters (lines 2–4, 23, 26–28, 34 of Table 1)

Accordingly, for this attack pattern, the model that describes the relationship between the observed variables of network traffic and the probabilities of transition between the states of the graph vertices is as follows:

P(U(n) \mid Pa(U(n))) = \prod_{j=1}^{8} P\big(u_j^N(n) \mid Pa(u_j^N(n))\big),   (13)

\prod_{j=1}^{8} P\big(u_j^N(n) \mid Pa(u_j^N(n))\big) = \prod_{j=1}^{3} P\big(u_j^N(n) \mid Pa(s(n))\big) \cdot \prod_{j=4}^{7} P\big(u_j^N(n) \mid Pa(e(n))\big) \cdot P\big(u_8^{34}(n) \mid Pa(b(n))\big),   (14)

\prod_{j=1}^{3} P\big(u_j^N(n) \mid Pa(s(n))\big) = P(u_1^2(n) \mid s(n)) \cdot P(u_2^3(n) \mid s(n)) \cdot P(u_3^4(n) \mid s(n)),   (15)

\prod_{j=4}^{7} P\big(u_j^N(n) \mid Pa(e(n))\big) = P(u_4^{23}(n) \mid e(n)) \cdot P(u_5^{26}(n) \mid e(n)) \cdot P(u_6^{27}(n) \mid e(n)) \cdot P(u_7^{28}(n) \mid e(n)),   (16)

P\big(u_8^{34}(n) \mid Pa(b(n))\big) = P(u_8^{34}(n) \mid b(n)).   (17)

Based on the above calculations, it became possible to compose Bayesian network templates and corresponding models that describe the relationships among the observable network traffic variables for the typical attacks Probe, U2R, R2L, and DoS/DDoS, and to simulate the probabilities of transition between the states of the graph vertices. These Bayesian network templates and models form the basis of the decision support system knowledge base.

5 Computational Experiments

Below we consider the results of testing the Bayesian network templates for the decision support system knowledge base (see Figs. 2 and 3). The test samples included 30 entries for each template, and testing was carried out on a test network.
The PC and EM algorithms [15, 16, 21] were used to train the networks.
The experimental results are shown in Figs. 4 and 5. The graphs show the results of modeling the probabilities of correct detection and interpretation of DoS/DDoS attacks (errors of the 1st and 2nd kind, respectively) for networks trained using the PC and EM algorithms.
The analysis of the simulation results for the obtained Bayesian networks (see Figs. 2 and 3) was carried out by estimating errors of the 1st and 2nd kind. Of the 60 records that were used to support decision-making and did not take part in training the BN, the first 30 records are correct data used to check whether a test attack was missed (type 2 error). The remaining 30 entries were used to check for false positives (type 1 error).

Fig. 4. Probability of correct determination of a type 1 error when interpreting a DDoS attack for BNs that were trained using various algorithms (1 - EM algorithm; 2 - PC algorithm)

Fig. 5. Probability of correctly detecting a DDoS-type attack (type 2 errors) for BNs that were trained using various algorithms (1 - EM algorithm; 2 - PC algorithm)

6 Discussion of the Results of the Computational Experiment

Figures 4 and 5 show errors of the 1st and 2nd kind, respectively. The PC and EM algorithms [15, 16, 21] confirmed their effectiveness. BN testing showed that the developed templates correctly interpret an attack with a probability of 95–96%.
Like the previous series of experiments described above, the experiments carried out confirmed that Bayesian reasoning makes it possible to determine the likely representation of each component in the diagram. The laboriousness of drawing up the templates is compensated by the possibility of reusing them without significant modification once all the developed templates become part of the decision support system knowledge base.
Hypothetically, a Bayesian network can be constructed by simple enumeration of all possible acyclic models. However, as shown in [15] and [16], this approach is not optimal: with more than 7 vertices, a full search requires significant computational resources and takes quite a long time. Therefore, for the developed decision support system and its knowledge base, the preferable option is the preliminary creation of attack patterns, the minimization of hidden variables, and the elimination of uninformative variables that do not have a determining effect on the accuracy of intrusion detection.
Bayesian network training consists of adjusting the parameters of individual nodes for a specific task, for example, to identify a certain type of network attack.
A prospect for the development of this research is the software implementation of the decision support system in high-level algorithmic languages and the subsequent testing of this decision support system and its knowledge base on segments of real informatization object networks.

7 Further Research

Further development of the proposed method has two standalone directions. The first direction of research will focus on the development of specific threat simulation tools that work in an adjustable environment with realistic, human-like behavior. Since the proposed model is an artificial emulation, it cannot truly mirror the psychology and unpredictability of real threats. However, the development of these tools will provide an unlimited pool of threats that could be launched against the system, and therefore provide synthetic datasets for models to learn to detect those threats during their lifecycle without the system actually being under real attack. The second direction is to create hardware support for a threat detection system that can work in conjunction with a prediction system and will be able to block suspicious threats. The hardware will be installed right after the main gateway, transferring all the data through itself. As a result of the described directions, we plan to implement this system on a real working network.

8 Conclusions

Bayesian network patterns have been developed for the computational core of the decision support system used in predicting threats and stages of intrusion into the information and communication networks of informatization objects. The constructed BN templates allow one to operate with a variety of random variables and to determine the probability of a cyber threat, or of a specific stage of an invasion, under given conditions. To improve the efficiency of intrusion forecasting, the network parameters were trained using the EM algorithm and the PC algorithm, as well as the available statistics for the test network. Probabilistic models for detecting network intrusions based on BNs are described. In contrast to existing models, the proposed approach makes it possible not only to take into account the main stages of intrusions but also to make more reasonable decisions based on both typical intrusion patterns and newly synthesized ones. All templates and models make up the computing core of the decision support system for intrusion detection.
The effectiveness of the developed models was tested on samples that were not previously used in training. The results obtained indicate the feasibility of using the EM algorithm to obtain high-quality recognition of cyber threats to information and communication networks.

References
1. Elshoush, H.T., Osman, I.M.: Alert correlation in collaborative intelligent intrusion detection
systems–a survey. Appl. Soft Comput. 11(7), 4349–4365 (2011)
2. Shenfield, A., Day, D., Ayesh, A.: Intelligent intrusion detection systems using artificial
neural networks. ICT Express 4(2), 95–99 (2018)

3. Rees, L.P., Deane, J.K., Rakes, T.R., Baker, W.H.: Decision support for Cybersecurity risk
planning. Decis. Support Syst. 51(3), 493–505 (2011)
4. Akhmetov, B., Lakhno, V., Boiko, Y., Mishchenko, A.: Designing a decision support system for the weakly formalized problems in the provision of cybersecurity. Eastern-Eur. J. Enterp. Technol. 1(2), 4–15 (2017)
5. Fielder, A., Panaousis, E., Malacaria, P., Hankin, C., Smeraldi, F.: Decision support
approaches for cybersecurity investment. Decis. Support Syst. 86, 13–23 (2016)
6. Atymtayeva, L., Kozhakhmet, K., Bortsova, G.: Building a knowledge base for expert
system in information security. In: Chapter Soft Computing in Artificial Intelligence of the
series Advances in Intelligent Systems and Computing, vol. 270, pp. 57–76 (2014)
7. Dua S., Du, X.: Data Mining and Machine Learning in Cybersecurity, p. 225. CRC Press
(2016)
8. Buczak, A.L., Guven, E.: A Survey of data mining and machine learning methods for cyber
security intrusion detection. IEEE Commun. Surv. Tutor. 18(2), 1153–1176 (2016)
9. Zhang, L., Yao, Y., Peng, J., Chen, H., Du, Y.: Intelligent information security risk
assessment based on a decision tree algorithm. J. Tsinghua Univ. Sci. Technol. 51(10),
1236–1239 (2011)
10. Ben-Asher, N., Gonzalez, C.: Effects of cybersecurity knowledge on attack detection.
Comput. Hum. Behav. 48, 51–61 (2015)
11. Goztepe, K.: Designing fuzzy rule based expert system for cyber security. Int. J. Inf. Secur.
Sci. 1(1), 13–19 (2012)
12. Gamal, M.M., Hasan, B., Hegazy, A.F.: A Security analysis framework powered by an
expert system. Int. J. Comput. Sci. Secur. (IJCSS) 4(6), 505–527 (2011)
13. Chang, L.-Y., Lee, Z.-J.: Applying fuzzy expert system to information security risk
Assessment – a case study on an attendance system. In: International Conference on Fuzzy
Theory and Its Applications (iFUZZY), pp. 346–351 (2013)
14. Kanatov, M., Atymtayeva, L., Yagaliyeva, B.: Expert systems for information security management and audit: implementation phase issues. In: Joint 7th International Conference on Soft Computing and Intelligent Systems (SCIS) and Advanced Intelligent Systems (ISIS), pp. 896–900 (2014)
15. Lakhno, V.A., Lakhno, M.V., Sauanova, K.T., Sagyndykova, S.N., Adilzhanova, S.A.:
Decision support system on optimization of information protection tools placement. Int.
J. Adv. Trends Comput. Sci. Eng. 9(4), 4457–4464 (2020)
16. Xie, P., Li, J. H., Ou, X., Liu, P., Levy, R.: Using Bayesian networks for cybersecurity
analysis. In: 2010 IEEE/IFIP International Conference on Dependable Systems & Networks
(DSN), pp. 211–220. IEEE, June 2010
17. Shin, J., Son, H., Heo, G.: Development of a cybersecurity risk model using Bayesian
networks. Reliab. Eng. Syst. Saf. 134, 208–217 (2015)
18. Özgür, A., Erdem, H.: A review of KDD99 dataset usage in intrusion detection and machine
learning between 2010 and 2015. PeerJ Preprints, 4, e1954v1 (2016)
19. Elkan, C.: Results of the KDD’99 classifier learning. ACM SIGKDD Explorat. Newsl. 1(2),
63–64 (2000)
20. Lakhno, V.A., Kravchuk, P.U., Malyukov, V.P., Domrachev, V.N., Myrutenko, L.V., Piven,
O.S.: Developing of the cybersecurity system based on clustering and formation of control
deviation signs. J. Theor. Appl. Inf. Technol. 95(21), 5778–5786 (2017)

21. Lakhno, V.A., Hrabariev, A.V., Petrov, O.S., Ivanchenko, Y.V., Beketova, G.S.: Improving of information transport security under the conditions of destructive influence on the information-communication system. J. Theor. Appl. Inf. Technol. 89(2), 352–361 (2016)
22. Heckerman, D.: A tutorial on learning with Bayesian networks. Technical report, Microsoft Research, Redmond (1995). 58 p.
23. Raileanu, L.E., Stoffel, K.: Theoretical comparison between the gini index and information
gain criteria. Ann. Math. Artif. Intell. 41(1), 77–93 (2004)
24. Alhendawi, K.M., Al-Janabi, A.A.: An intelligent expert system for management information
system failure diagnosis. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.) Intelligent Computing
& Optimization. ICO 2018. Advances in Intelligent Systems and Computing, vol. 866.
Springer, Cham (2019). https://doi.org/10.1007/978-3-030-00979-3_26
Developing an Intelligent System for Recommending Products

Md. Shariful Islam1, Md. Shafiul Alam Forhad1(&), Md. Ashraf Uddin2, Mohammad Shamsul Arefin1, Syed Md. Galib3, and Md. Akib Khan4

1 Department of CSE, CUET, Chattogram, Bangladesh
u1604064@student.cuet.ac.bd, {forhad0904063,sarefin}@cuet.ac.bd
2 Information and Communication Technology Division, Dhaka, Bangladesh
ashraf.uddin04066@gmail.com
3 Department of CSE, JUST, Jessore, Bangladesh
galib.cse@just.edu.bd
4 Department of EEE, AUST, Dhaka, Bangladesh
akib.austeee@gmail.com

Abstract. When it comes to deciding which product to buy, knowing the overall opinion of other users is very helpful. Evaluating this from user ratings is simple. Although a machine can evaluate recommendations simply by aggregating user ratings, it is sometimes difficult to obtain accurate and efficient results this way. Evaluating users' comments, on the other hand, usually means assigning humans to read all the comments one by one and then letting them decide how useful the product seems. This is a tedious process that wastes valuable time and resources when there is no way to automate it. Furthermore, selecting the most valuable product from an enormous number of reviews is a hectic task for consumers. Considering all of the above, we have developed a machine learning based intelligent system that not only predicts ratings from users' reviews but also reflects which products are popular simply by analyzing those reviews.

Keywords: Sentiment analysis · Machine learning · Random forest classifier · K-nearest neighbors · Support vector machine · Deep learning

1 Introduction

With the ever-increasing growth of the internet, many websites contain a continuously growing volume of user-generated text from all around the world. Depending on the type of website, a large number of these texts are reviews of various products. If processed properly, these texts can be used to understand public opinion on different products so that manufacturing companies can adjust their products accordingly. This can also help customers make proper purchasing decisions. While it is easy to gauge the usefulness of a product from its average rating, it becomes

difficult when ratings are not provided on the website. Finding the best product then involves reading a large number of reviews of several items before finally making a decision. In either case, performing sentiment analysis on online reviews of various products can help both companies and customers save valuable time. Supervised learning is a technique that requires suitable labels for efficient results. However, the polarity of an individual text review is highly unlikely to be available with the text itself. Even when such labels are available, they are either machine-generated or human-assigned, and in both cases they may not reflect the reviewer's original intention. In contrast, the rating provided with a review can be considered a valid label, since a rating is the reviewer's intended quantitative representation of the review. Hence, in this paper, we propose an intelligent system that predicts ratings for different products and also evaluates which product is best depending on users' reviews. After predicting ratings with different machine learning models, various performance measurements were computed to compare and evaluate their overall efficiency.
The rest of the paper is organized as follows. Section 2 presents the literature review related to our work. A detailed description of the proposed system is given in Sect. 3. Experimental results are shown in Sect. 4. Finally, a conclusion is provided in Sect. 5.

2 Literature Review

Some research has previously been done on similar topics. For example, Ahmed et al. have proposed a deep learning framework [1] to predict ratings from online reviews in the Yelp and Amazon datasets. They managed to achieve very good performance using various models. However, their approach seems to lack linguistic rules in the analysis of review texts.
Due to the ever-growing volume of online reviews, it becomes difficult to draw chronological insights from them. For this reason, in the paper [1] by Murtadha et al., chronological sentiment analysis using dynamic and temporal clustering was presented. In their research, they also considered consistency to assess the performance of their algorithms, and they utilized a unified feature set of window sequential clustering to enhance ensemble learning. Murtadha et al. have also proposed, in another paper [2], an automated unsupervised method of sentiment analysis involving contextual analysis and unsupervised ensemble learning, with SentiWordNet deployed in both phases. The research datasets include Australian Airlines and HomeBuilders reviews for domain performance evaluation. Their algorithm seems to be the most accurate for some datasets, even though Support Vector Machine (SVM) is the most accurate on average. Applying deeper contextual analysis might have resulted in a higher accuracy score.
Long Mai and Bac Le took a slightly different approach to rating prediction of product reviews. In their paper [3], they proposed a framework for automatic collection and processing of YouTube comments on various products, which are then used for sentiment analysis of those products. Various models were applied to these reviews, with accuracy ranging from 83% to 92%. However, the domain of their dataset needs to be expanded to make their models more effective in the real world.
In [4], a novel approach using aspect-level sentiment detection was presented. The model was tested on Amazon customer reviews, and the acquired cross-validation score seems to be somewhere near 87%, which is 10% less than the score claimed in the paper. More challenging areas like spam, sarcasm, etc. have not been touched in that research.
Rishith et al. introduce in their paper [5] a sentiment analysis model that uses Long Short-Term Memory (LSTM) for analyzing movie reviews. Their model seems to have very high accuracy on both the training and the test data, and it might achieve even better performance if applied to more varied datasets.
The paper [6] reviews the most recent studies that have used deep learning to tackle sentiment analysis problems, such as sentiment polarity. When performing sentiment analysis, it is better to combine deep learning techniques with word embeddings than with Term Frequency-Inverse Document Frequency (TF-IDF). Finally, a comparative study was conducted on the results obtained for the different models and input features.
In [7], the authors applied machine learning algorithms to detect the polarity of reviews. Their models seem to achieve very high accuracy with Naïve Bayes. However, their algorithm cannot detect from a review why people like or dislike a product. In [8], Isolation Forest was used for sentiment analysis of data recovered from Amazon on products of different brands. While their project flow diagram shows the sentiment detection process, no prediction measurement score is present in their paper. The use of convolutional neural networks is also common in text-based prediction. In [9], an approach was proposed for predicting the helpfulness of different online reviews, in the form of a helpfulness score, by using a Convolutional Neural Network (CNN). However, their dataset is limited to Amazon and Snapdeal only, and it seems that they never use any model for extracting metadata information either.
Mamtesh et al. analyzed customer reviews for a number of movies using K-Nearest Neighbors (KNN), Logistic Regression, and Naïve Bayes in their paper [10]. They were able to predict the sentiment of the reviews with very high accuracy, and they might achieve even higher accuracy with a hybrid approach.
In [11], the author analyzes user sentiments of Windows Phone applications and classifies their polarity using a Naïve Bayes model. The model seems to achieve high accuracy, but it can classify into only two categories: positive and negative.
Rintyarna et al. proposed a method for extracting features from texts by focusing on word semantics in their paper [12]. To achieve this, they used an extended Word Sense Disambiguation method. From their experiments, it seems that the proposed method boosts the performance of the ML algorithms considerably.
In [13], a data pre-processing method was proposed with three machine learning techniques, Support Vector Machine (SVM), Naïve Bayes Multinomial (NBM) and C4.5, for sentiment analysis of both the English and Turkish languages.

They concluded that although SVM worked well for the English language, NBM performed better for Turkish. In [14], an appropriate and accurate algorithm for classifying movie reviews was sought among several machine learning based algorithms.
In [15], a hybrid sentiment analysis method was proposed for analyzing Amazon Canon camera reviews in order to classify them into two polarity classes, positive and negative. Compared to SVM, their approach seems to achieve very high scores on different performance measurements, including accuracy, precision, and recall.
In [16], Haque et al. used a supervised learning method on large-scale Amazon product reviews. The models used in their work achieve performance of around 86% to 94%, varying with the model and the dataset. Their main limitation is the scarcity of a standard dataset required for 10-fold cross-validation. Sentiment analysis is a demanding task comprising natural language processing, data mining, and machine learning; to tackle this challenge, deep learning models are often merged with these techniques. In [17], the performance of different deep learning (DL) techniques in sentiment analysis was emphasized.
In [18], the assessment was done using 10-fold cross-validation, and the accuracy was estimated with a confusion matrix and the Receiver Operating Characteristic (ROC) curve. The results showed an increase in SVM accuracy from 82.00% to 94.50%; from the testing of the model it can be concluded that the Support Vector Machine based on Particle Swarm Optimization provides more accurate solutions to the classification of smartphone product reviews.

3 Proposed Methodology

The proposed methodology consists of the following steps for polarity prediction through sentiment analysis. Figure 1 shows a block diagram of the entire process.

3.1 Dataset Description


Two different datasets have been used here. The first dataset contains over 400,000 reviews of unlocked phones for sale on the Amazon website; 40,000 of these reviews have been used for this paper. This dataset has 6 columns: Product Name, Brand Name, Price, Rating, Reviews, and Review Votes. It was collected from [19] in CSV format. The second dataset has more than 34,000 reviews provided by Datafiniti on Amazon products such as Kindle e-readers, the Echo smart speaker, and so on. This dataset contains 24 columns: id, dateAdded, dateUpdated, primaryCategories, imageURLs, keys, reviews.id, reviews.rating, reviews.text, reviews.title, etc. However, it does not contain the type of the products, which has been manually added to the dataset. It has reviews mostly on batteries and tablets, which is why only these two product types have been used for this paper. This dataset [20] was also acquired in CSV format.

3.2 Pre-processing
This step involves preparing the data from the datasets for model training. It consists of several sub-steps.

3.2.1 Initial Database Based Pre-processing


Since the first dataset is very large, only its first 40,000 reviews have been selected for training, while 26,467 reviews were selected from the second. Since product pages are written by the sellers, almost all product names in the first dataset include a product description. As a result, the same product can appear under multiple different unique product names. For this reason, the names of products with such issues have been replaced by the common name of the product in the first dataset, for better product suggestion. Moreover, some products have a very small number of user reviews, indicating very low popularity among users. With so few reviews, a handful of high ratings (either because only the satisfied customers reviewed the product or because of fake reviews) would give these products higher average ratings than most of the actually popular products. This is why products with fewer than 25 reviews have been filtered out. For the second dataset, an extra column is added before filtering to identify which product belongs to which product type. Since this dataset is smaller, products with fewer than 15 reviews were filtered out. Next, only the reviews of the most common product types are selected for machine learning. The next part is text cleaning.
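A minimal pandas sketch of this filtering step might look as follows; the file name and column names are assumptions based on the dataset description above, not the authors' actual code.

import pandas as pd

# load the first dataset and keep only the first 40000 reviews
df = pd.read_csv("Amazon_Unlocked_Mobile.csv").head(40000)

# drop products with fewer than 25 reviews (low, unreliable popularity)
review_counts = df.groupby("Product Name")["Reviews"].transform("count")
df = df[review_counts >= 25]
print(df["Product Name"].nunique(), "products kept")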

3.2.2 Cleaning Text


This step transforms the text into a much simpler form of data so that the result can easily be crunched into numbers. It involves the following sub-steps.
3.2.2.1. Punctuation Removal
In this step, all the punctuation symbols are removed from the texts since they usually
do not have any significant impact on the final rating.
3.2.2.2. Normalization
Here, all uppercase letters in all the texts are replaced by their lowercase counterparts
for further simplicity.
3.2.2.3. Stop-Word Removal
Some words in every language are so common that their presence in a sentence does not influence its meaning very much; articles, prepositions, and conjunctions are typical stop-words. Since their influence on the meaning of a sentence is small, removing them results in a more condensed text.

3.2.2.4. Lemmatization
Lemmatization resolves words to their dictionary form; a dictionary search is usually used to properly lemmatize the words into their dictionary-accurate base forms. The purpose of lemmatization is to ensure that the algorithm can recognize a word in its different forms. It also reduces the word index of the tokenizer and hence makes the result more efficient.
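The four cleaning sub-steps can be sketched with NLTK as follows; this is an illustrative pipeline under the assumption that the standard NLTK stopword and WordNet corpora are used, not necessarily the authors' exact code.

import string
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)

STOP = set(stopwords.words("english"))
LEMMATIZER = WordNetLemmatizer()

def clean(text):
    text = text.translate(str.maketrans("", "", string.punctuation))  # 3.2.2.1
    words = text.lower().split()                                      # 3.2.2.2
    words = [w for w in words if w not in STOP]                       # 3.2.2.3
    return " ".join(LEMMATIZER.lemmatize(w) for w in words)           # 3.2.2.4

print(clean("The batteries are working GREAT, arrived quickly!"))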

3.2.3 Final Pre-processing and Training


In this step, the labels are adjusted and the dataset is split in the ratio of 3:1. The training data is then used for tokenization, and all out-of-vocabulary words are represented by the "<OOV>" token. Next, the texts of both the training and test data are converted to sequences according to the tokenizer. Finally, all the reviews are padded to ensure equal length.
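A minimal sketch of this step, assuming the Keras tokenizer and scikit-learn's splitter; the toy reviews and the maximum length of 15 (the value used for Dataset 1) are illustrative.

from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# toy cleaned reviews and their star ratings (illustrative stand-ins)
texts = ["great phone fast delivery", "screen cracked terrible quality",
         "ok phone decent value", "battery died within one week"]
ratings = [5, 1, 3, 2]

# 3:1 split, as described above
X_tr, X_te, y_tr, y_te = train_test_split(texts, ratings, test_size=0.25,
                                          random_state=0)

tok = Tokenizer(oov_token="<OOV>")     # out-of-vocabulary words map to <OOV>
tok.fit_on_texts(X_tr)                 # tokenizer sees the training data only

def to_padded(batch):
    # sequence, then post-pad/post-truncate to a fixed length of 15 words
    return pad_sequences(tok.texts_to_sequences(batch),
                         maxlen=15, padding="post", truncating="post")

X_tr_pad, X_te_pad = to_padded(X_tr), to_padded(X_te)
print(X_tr_pad.shape, X_te_pad.shape)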

3.3 Training
In this step, a K-Nearest Neighbors (KNN) classifier, a Random Forest Classifier (RFC), and a Support Vector Machine (SVM) classifier are trained using the padded training data as features and the ratings as labels. A deep learning classifier is also developed for the same purpose. For the second dataset, rather than being trained separately, the Machine Learning (ML) models are trained on 75% of the entire combined dataset of Battery reviews and Tablet reviews for better generalization.
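The three classical models can be trained with scikit-learn as sketched below; the random stand-in features and all hyperparameters are illustrative assumptions, not the authors' settings.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# stand-in for the padded word-index sequences and 1-5 star labels
rng = np.random.default_rng(0)
X_train = rng.integers(0, 500, size=(200, 15))
y_train = rng.integers(1, 6, size=200)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "RFC": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="rbf"),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "training accuracy:", round(model.score(X_train, y_train), 3))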

3.4 Testing and Rating Prediction


When training is complete, the classifiers are tested against the padded test data for performance measurement. Even though the ML models are trained on 75% of the combined dataset, they are tested individually against the Battery and Tablet test sets. The values predicted by the classifiers are stored as the predicted ratings.

3.5 Product Recommendation


This step is performed once, after the completion of the training session for each classifier. The average predicted rating for each product is calculated separately, and these average ratings are used for recommending products.
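A minimal sketch of this aggregation with pandas, using invented review-level predictions for hypothetical products:

import pandas as pd

# invented review-level predictions for three hypothetical products
results = pd.DataFrame({
    "product":   ["Phone A", "Phone A", "Phone B", "Phone B", "Phone C"],
    "predicted": [5, 4, 3, 5, 4],
})

# average predicted rating per product, highest first -> recommendation list
top = (results.groupby("product")["predicted"]
              .mean()
              .sort_values(ascending=False)
              .head(5))
print(top)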

Fig. 1. An intelligent system for recommending products (block diagram: the review datasets are split into training and test sets; pre-processing covers eliminating unnecessary reviews, sentence splitting, stop-word removal, special character removal, normalization, and lemmatization; tokenization, text sequencing, and padding precede training, testing, rating prediction, and recommendation generation)

4 Experimental Results

The datasets and the number of reviews used for machine learning for the different types of products are shown in Table 1.

Table 1. Review count for different types of products


Product types Dataset no. Number of reviews
Unlocked Phones 1 40000
Batteries 2 12071
Tablets 2 14396

Table 2. Accuracy, precision and recall for several classifiers on different products

Classifier  Metric     Unlocked Phones  Batteries  Tablets
KNN         Accuracy   82.75%           69.18%     84.70%
            Precision  78.67%           47.26%     67.54%
            Recall     76.37%           41.12%     73.30%
RFC         Accuracy   86.63%           77.50%     88.86%
            Precision  90.35%           77.92%     95.81%
            Recall     77.44%           40.12%     73.57%
SVM         Accuracy   81.80%           78.66%     89.52%
            Precision  91.93%           89.48%     97.26%
            Recall     72.27%           36.17%     74.57%
DL Model    Accuracy   86.90%           77.80%     87.21%
            Precision  80.86%           53.28%     78.34%
            Recall     79.54%           47.22%     77.38%

From Table 1, we can see that 40,000 unlocked phone reviews from the first dataset have been used, while the reviews for batteries and tablets both come from the second dataset and number 12,071 and 14,396 respectively. The score types used for measuring the performance of the different classifiers are accuracy, precision, and recall.

Accuracy = (Number of Correct Predictions) / (Total Number of Predictions)

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

Here, TP, FP and FN denote True Positives, False Positives and False Negatives, respectively.
The overall precision for a multi-class classification is the average of the precision scores of the individual classes, and the same holds for the overall recall score. Table 2 shows the different performance scores of the classifiers on the several types of products.
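These scores, with macro-averaging over the rating classes as defined above, can be computed with scikit-learn; the labels below are toy values, not the paper's data.

from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [5, 4, 5, 1, 3, 5, 2, 1]   # toy actual ratings
y_pred = [5, 4, 4, 1, 3, 5, 1, 1]   # toy predicted ratings

print("accuracy :", accuracy_score(y_true, y_pred))
# macro average = mean of the per-class scores, as defined above
print("precision:", precision_score(y_true, y_pred, average="macro",
                                    zero_division=0))
print("recall   :", recall_score(y_true, y_pred, average="macro",
                                 zero_division=0))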
Figure 2 shows a graphical representation of accuracy, precision, and recall for the classifiers on the different product types. It can be seen that the Deep Learning based classifier has both the highest accuracy and the highest recall for unlocked phone reviews, with an accuracy of 86.90% and a recall of 79.54%, while the Support Vector Machine (SVM) has the highest precision of 91.93% for the same reviews.

Fig. 2. Graphical presentation of performance measurement scores of different classifiers

The second product type is batteries. For these reviews, the Support Vector Machine (SVM) seems to have both the highest accuracy and the highest precision, with scores of 78.66% and 89.48% respectively, while the Deep Learning (DL) based classifier has the highest recall of 47.22%. Finally, tablets are the last product type. For the selected tablet reviews, the Support Vector Machine again has the highest accuracy and precision, 89.52% and 97.26% respectively, while the Deep Learning based model again has the highest recall, 77.38%.

Table 3 shows the training time for all four classifiers. The classifiers, sorted in ascending order of training time, are KNN, RFC, SVM, and DL for either dataset, with KNN's training time being less than 1 second in both cases and the DL model's training time being greater than 300 seconds in both cases. It can be noted that the training time is higher on the 2nd dataset for all classifiers, despite the 2nd dataset being smaller. This is because the 1st dataset is very large and, compared to the 2nd dataset, less training was required for better generalization.

Table 3. Training time (in seconds) for different classifiers


Classifier Dataset 1 (Unlocked Phones) Dataset 2 (Batteries and Tablets)
KNN 0.0568621 0.0903794
RFC 0.8510487 1.2277246
SVM 41.0242428 52.5817226
DL Model 372.0272711 519.7536614

Figure 3 shows the top 5 phones according to the averages of their individual ratings. This can be used as a benchmark for comparing the effectiveness of the recommendation systems implemented with the ML models.

Fig. 3. Graphical presentation of the average of the actual ratings of Top 5 phones

From Fig. 4, we can see that the DL (Deep Learning) model managed to predict 4 items from the top 5 phones based on the average ratings (Fig. 3), in mostly the same order.

Fig. 4. Graphical presentation of average of predicted scores for Top 5 phones recommended by
different machine learning models

Similar to Fig. 3, Fig. 5 also shows the averages of the actual ratings, but for the batteries. From Fig. 6, we can see that all four machine learning models predicted the order correctly; however, both the KNN and DL models predicted averages close to the actual values.

Fig. 5. Graphical presentation of the average of the actual ratings of the types of batteries

Fig. 6. Graphical presentation of average of predicted scores for the types of batteries

Figure 7 shows a graphical representation of the top 5 tablets based on the averages of their actual ratings.

Fig. 7. Graphical presentation of the average of the actual ratings of Top 5 tablets

From Fig. 8, it can be seen that the KNN model was able to predict 4 items from Fig. 7, with the order and the values also being mostly accurate.

Fig. 8. Graphical presentation of average of predicted scores for top 5 tablets recommended by
different machine learning models

5 Conclusion

In this paper, we have developed an intelligent system that predicts ratings from product reviews and provides recommendations to customers. From the experimental results, it was found that SVM gives the best results based on the performance scores. However, in terms of training time, Random Forest (RF) is more efficient than SVM, with performance very close to SVM for the three product types. Overall, the performance scores of all four classifiers are quite good. Bag of Words tends to increase the dimensionality of the matrix as the vocabulary size grows; moreover, since the context of the words cannot be recovered from the Bag of Words approach, semantic meaning is often not preserved in this method. To address these issues, we decided to post-truncate and post-pad the review texts with a maximum word length of 15 and 30 for Dataset 1 and Dataset 2 respectively. In the future, we will try to improve the performance of our system by using rule-based pre-processing techniques.

References
1. Al-sharuee, M.T.: Sentiment analysis: dynamic and temporal clustering of product reviews (2020)
2. Al-sharuee, M.T., Liu, F., Pratama, M.: Sentiment analysis: an automatic contextual analysis
and ensemble clustering approach and comparison. Data Knowl. Eng. (2018)
3. Mai, L., Le, B.: Joint sentence and aspect-level sentiment analysis of product comments.
Ann. Oper. Res. (2020)
4. Nandal, N., Tanwar, R., Pruthi, J.: Machine learning based aspect level sentiment analysis
for Amazon products. Spat. Inf. Res. (2020)
5. Rishith, U.T., Rangaraju, G.: Usense: a user sentiment analysis model for movie reviews by
applying LSTM. Int. J. Res. Eng. Appl. Manag. 01, 369–372 (2020)
6. Dang, N.C., Moreno-García, M.N., De la Prieta, F.: Sentiment analysis based on deep
learning: a comparative study. Electron. 9(3) (2020)
7. Jagdale, R.S., Shirsat, V.S., Deshmukh, S.N.: Sentiment analysis on product reviews using machine learning techniques. In: Proceedings of CISC 2017. Springer, Singapore (2019)
8. Salmiah, S., Sudrajat, D., Nasrul, N., Agustin, T., Harani, N.H., Nguyen, P.T.: Sentiment
Analysis for Amazon Products using Isolation Forest (6), 894–897 (2019)
9. Saumya, S., Singh, J.P., Dwivedi, Y.K.: Predicting the helpfulness score of online reviews
using convolutional neural network. Soft Comput., no. BrightLocal 2016 (2019)
10. Mamtesh, M., Mehla, S.: Sentiment analysis of movie reviews using machine learning
classifiers. Int. J. Comput. Appl. 182(50), 25–28 (2019)
11. Normah, N.: Naïve Bayes algorithm for sentiment analysis windows phone store application
reviews. SinkrOn 3(2), 13 (2019)
12. Rintyarna, B.S., Sarno, R., Fatichah, C.: Semantic features for optimizing supervised
approach of sentiment analysis on product reviews. Computers 8(3), 55 (2019)
13. Parlar, T., Özel, S.A., Song, F.: Analysis of data pre-processing methods for sentiment
analysis of reviews. Comput. Sci. 20(1), 123–141 (2019)
14. Dwivedi, S., Patel, H., Sharma, S.: Movie reviews classification using sentiment analysis.
Indian J. Sci. Technol. 12(41), 1–6 (2019)
490 Md. S. Islam et al.

15. Chhabra, I.K., Prajapati, G.L.: Sentiment analysis of Amazon canon camera review using
hybrid method. Int. J. Comput. Appl. 182(5), 25–28 (2018)
16. Haque, T.U., Saber, N.N., Shah, F.M.: Sentiment analysis on large scale amazon product
reviews. In: 2018 IEEE International Conference Innovation Research Deviation, no. May,
pp. 1–6 (2018)
17. Kalaivani, A., Thenmozhi, D.: Sentimental Analysis using Deep Learning Techniques,
pp. 600–606 (2018)
18. Wahyudi, M., Kristiyanti, D.A.: Sentiment analysis of smartphone product review using
support vector machine algorithm-based particle swarm optimization. J. Theor. Appl. Inf.
Technol. 91(1), 189–201 (2016)
19. Amazon Reviews: Unlocked Mobile Phones | Kaggle. https://www.kaggle.com/
PromptCloudHQ/amazon-reviews-unlocked-mobile-phones
20. Consumer Reviews of Amazon Products | Kaggle. https://www.kaggle.com/datafiniti/
consumer-reviews-of-amazon-products
Branch Cut and Free Algorithm
for the General Linear Integer Problem

Elias Munapo

Department of Statistics and Operations Research, School of Economic Sciences,
North West University, Mafikeng, South Africa
Elias.Munapo@nwu.ac.za

Abstract. The paper presents a branch, cut and free algorithm for the general
linear integer problem. Like most exact algorithms for the general linear integer
problem, the proposed algorithm relies on the usual strategy of relaxing the
linear integer problem and solving the relaxation to obtain a continuous optimal
solution. If this solution is integer, the optimal solution has been found;
otherwise, the largest basic variable in the continuous optimal solution is
selected and freed of integral restrictions, and the branch and cut algorithm is
then used to search for the optimal integer solution. The main and obvious
challenge with branch and bound related algorithms is that the number of nodes
generated to verify the optimal solution can sometimes explode to unmanageable
levels. Freeing a selected variable of integral restrictions, as proposed in this
paper, can significantly reduce the complexity of the general linear integer problem.

Keywords: Linear integer problem · Continuous optimal solution · Variable freeing · Branch and cut algorithm · Computational complexity · Optimality verification

1 Introduction

The general linear integer programming (LIP) problem has many important applications.
These include interactive multimedia systems [1], home energy management [15],
cognitive radio networks [7], mining operations [22], relay selection in secure
cooperative wireless communication [11], electrical power allocation management
[18, 20], production planning [4], selection of renovation actions [2], waste
management, formulation and solution methods for tour conducting, and optimization
of content delivery networks [16].
The main challenge with branch and bound related algorithms is that the number of
nodes required to verify the optimal solution can sometimes explode to unmanageable
levels. Freeing a variable of all its integral restrictions, as proposed in this
paper, can significantly reduce the complexity of the general linear integer
problem. The LIP problem is NP-hard, thus very difficult to solve, and heuristics
are still used to approximate optimal solutions in reasonable times; an efficient,
consistently exact solution method for the general LIP remains elusive. The
coefficient matrix of the general linear integer problem is not unimodular [14].
A branch, cut and free algorithm, which combines the branch and cut algorithm with
the freeing of a selected variable, is proposed in this paper. This algorithm is
similar to [17] but differs in that it can be used for any linear integer model with
any number of variables and linear constraints. The method proposed in [17], on the
other hand, requires the calculation of variable sum limits before solving, works
only for a singly constrained linear integer problem, and frees no variables in the
solving process.

2 The General LIP

Maximize $Z = c_1x_1 + c_2x_2 + \dots + c_nx_n$,
such that:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &\le b_1,\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &\le b_2,\\
&\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n &\le b_m.
\end{aligned}
\tag{1a}
\]

Minimize $Z = c_1x_1 + c_2x_2 + \dots + c_nx_n$,
such that:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &\ge b_1,\\
a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &\ge b_2,\\
&\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n &\ge b_m,
\end{aligned}
\tag{1b}
\]

where $a_{ij}$, $b_i$ and $c_j$ are constants and $x_j$ is integer $\forall i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$.

3 Variable Integral Restriction Freeing Theorem

Given $n$ variables in an equality constraint such as (2), one of the $n$ variables
can be freed of integral restrictions. In other words, one of the $n$ variables
need not be restricted to integer values:

\[
a_1x_1 + a_2x_2 + \dots + a_jx_j + \dots + a_nx_n = b,
\tag{2}
\]

where $a_j$ and $b$ are integral constants and the unknown variables are integer
$\forall j = 1, 2, \dots, n$. If all the other variables are integers, then one
variable $x_j$ out of the $n$ variables need not be restricted to integer.

Proof
From (2), the variable $x_j$ can be made the subject as given in (3):

\[
a_jx_j = b - \left(a_1x_1 + \dots + a_{j-1}x_{j-1} + a_{j+1}x_{j+1} + \dots + a_nx_n\right).
\tag{3}
\]

If a sum of integers is subtracted from an integer, the difference is also an
integer, so $a_jx_j$ is forced to be integral by the integrality of the remaining
variables. According to constraint (2), the variable $x_j$ can therefore be treated
as a free variable.
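As a one-line illustration (our own hypothetical numbers, not from the paper), consider the constraint $3x_1 + 5x_2 + 2x_3 = 17$:

```latex
% Hypothetical instance of the freeing theorem: if x1 and x2 are integer,
% the product a_j x_j is forced to be integral by the equality constraint.
\[
  a_j x_j = b - \sum_{i \neq j} a_i x_i
  \quad\Longrightarrow\quad
  2x_3 = 17 - (3x_1 + 5x_2) \in \mathbb{Z}.
\]
```

Hence no explicit integral restriction needs to be carried on $x_3$ during the search.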

4 Continuous Optimal Solution

The general continuous optimal tableau is given in Table 1. The variables are
arranged in this way purely for convenience; any optimal continuous tableau can be
arranged in many ways.

Table 1. Continuous optimal tableau (columns: basic variables $x$, non-basic variables $s$, right-hand side).

\[
\begin{array}{ccccc|ccccc|c}
0 & 0 & 0 & \cdots & 0 & \omega_1 & \omega_2 & \omega_3 & \cdots & \omega_m & \gamma \\
1 & 0 & 0 & \cdots & 0 & \alpha_{11} & \alpha_{12} & \alpha_{13} & \cdots & \alpha_{1m} & \beta_1 \\
0 & 1 & 0 & \cdots & 0 & \alpha_{21} & \alpha_{22} & \alpha_{23} & \cdots & \alpha_{2m} & \beta_2 \\
0 & 0 & 1 & \cdots & 0 & \alpha_{31} & \alpha_{32} & \alpha_{33} & \cdots & \alpha_{3m} & \beta_3 \\
\vdots \\
0 & 0 & 0 & \cdots & 1 & \alpha_{m1} & \alpha_{m2} & \alpha_{m3} & \cdots & \alpha_{mm} & \beta_m
\end{array}
\]

Since this is the continuous optimal tableau, (4) and (5) are valid:

\[
x_1, x_2, x_3, \dots, x_m \ge 0.
\tag{4}
\]

\[
\beta_1, \beta_2, \beta_3, \dots, \beta_m \ge 0.
\tag{5}
\]

In addition, $\gamma$ is the objective value and $\alpha_{ij}$ is a constant; both
$\gamma$ and $\alpha_{ij}$ can be either negative or positive. In this paper
$Z_{copt}$ denotes the continuous optimal solution. Further examples of general
continuous optimal solutions are given in Table 2 and Table 3, and the specific
continuous optimal tableaus for the numerical illustrations are given in Table 4
and Table 5.

5 Surrogate Constraint

The surrogate constraint, or clique constraint, is obtained by adding all the rows
of the original variables that are basic at optimality, as given in (6).

Table 2. Sub-Problem 1.

Table 3. Sub-Problem 2.

\[
\begin{aligned}
x_1 + \alpha_{11}s_1 + \alpha_{12}s_2 + \alpha_{13}s_3 + \dots + \alpha_{1m}s_m &= \beta_1 \\
{}+ \; x_2 + \alpha_{21}s_1 + \alpha_{22}s_2 + \alpha_{23}s_3 + \dots + \alpha_{2m}s_m &= \beta_2 \\
{}+ \; x_3 + \alpha_{31}s_1 + \alpha_{32}s_2 + \alpha_{33}s_3 + \dots + \alpha_{3m}s_m &= \beta_3 \\
&\;\vdots \\
{}+ \; x_m + \alpha_{m1}s_1 + \alpha_{m2}s_2 + \alpha_{m3}s_3 + \dots + \alpha_{mm}s_m &= \beta_m.
\end{aligned}
\tag{6}
\]

This simplifies to (7):

\[
x_1 + x_2 + x_3 + \dots + x_m + k_1s_1 + k_2s_2 + k_3s_3 + \dots + k_ms_m = b_c,
\tag{7}
\]

where

\[
k_j = \alpha_{1j} + \alpha_{2j} + \alpha_{3j} + \dots + \alpha_{mj} \quad \forall j = 1, 2, 3, \dots, m,
\tag{8}
\]

\[
b_c = \beta_1 + \beta_2 + \beta_3 + \dots + \beta_m.
\tag{9}
\]

Since $b_c$ is not necessarily integer, we have (10):

\[
b_c = I + f,
\tag{10}
\]

where $I$ is the integer part and $f$ is the fractional part.
Since at optimality the non-basic variables are zero, as given in (11), i.e.

\[
s_1 = s_2 = s_3 = \dots = s_m = 0,
\tag{11}
\]

then (12) and (13) are valid.

Sub-Problem a:

\[
x_1 + x_2 + x_3 + \dots + x_m \le I.
\tag{12}
\]

Sub-Problem b:

\[
x_1 + x_2 + x_3 + \dots + x_m \ge I + 1.
\tag{13}
\]

In other words, we add (14) to Sub-Problem a instead of (12):

\[
x_1 + x_2 + x_3 + \dots + x_j + \dots + x_m + x_{n+1} = I.
\tag{14}
\]

Similarly, (13) becomes (15):

\[
x_1 + x_2 + x_3 + \dots + x_j + \dots + x_m - x_{n+2} = I + 1.
\tag{15}
\]

As a diagram, the original problem and the two sub-problems are related as shown in
Fig. 1, where Sub-Problem a carries the constraint
$x_1 + x_2 + \dots + x_m + x_{n+1} = I$ and Sub-Problem b carries
$x_1 + x_2 + \dots + x_m - x_{n+2} = I + 1$.

Fig. 1. The two initial sub-problems of an LIP.



Here Node 0 is the original problem; Node a is given by Table 2, which is the
original problem after adding constraint (14); and Node b is given by Table 3,
which is obtained after adding constraint (15) to the original problem.
The variable to be freed is selected as the basic variable satisfying (16):

\[
\beta_\ell = \max\{\beta_1, \beta_2, \beta_3, \dots, \beta_m\}.
\tag{16}
\]

Justification: the larger the variable's range, the more branches we are likely to
obtain, so it makes sense not to subject such a variable to the integral
restriction. The equality constraint (14) is added to the continuous optimal tableau
to obtain Sub-Problem a; similarly, (15) is added to the continuous optimal tableau
to obtain Sub-Problem b. The variable $x_j$ is then unrestricted by the variable
integral restriction freeing theorem. In this paper $Z_{opt}$ denotes the optimal
integer solution.

6 Measuring Complexity Reduction

The percentage complexity reduction $\rho$ achieved by the proposed algorithm is
given by (17):

\[
\rho = \frac{R - r}{R} \times 100\%,
\tag{17}
\]

where $R$ is the number of branch and bound nodes before using the proposed
algorithm and $r$ is the number of nodes after freeing a selected variable.

6.1 Numerical Illustration 1

Maximize $Z = 111x_1 + 211x_2 + 171x_3 + 251x_4 + 151x_5$,
such that:

\[
\begin{aligned}
110x_1 + 210x_2 + 115x_3 + 112x_4 - 31x_5 &\le 7189,\\
50x_1 + 183x_2 + 261x_3 - 79x_4 + 259x_5 &\le 6780,\\
142x_1 + 244x_2 - 140x_3 + 139x_4 + 153x_5 &\le 2695,\\
224x_1 - 87x_2 + 128x_3 + 129x_4 + 133x_5 &\le 12562,\\
155x_1 + 252x_2 + 258x_3 + 156x_4 + 157x_5 &\ge 2533,
\end{aligned}
\tag{18}
\]

where $x_1, x_2, x_3, x_4, x_5 \ge 0$ and integer.

The continuous optimal solution for Numerical Illustration 1 is presented in
Table 4.

Table 4. Continuous optimal tableau (18).

Here $s_1, s_2, s_3, s_4, s_5 \ge 0$ are the slack and surplus variables satisfying (19):

\[
\begin{aligned}
110x_1 + 210x_2 + 115x_3 + 112x_4 - 31x_5 + s_1 &= 7189,\\
50x_1 + 183x_2 + 261x_3 - 79x_4 + 259x_5 + s_2 &= 6780,\\
142x_1 + 244x_2 - 140x_3 + 139x_4 + 153x_5 + s_3 &= 2695,\\
224x_1 - 87x_2 + 128x_3 + 129x_4 + 133x_5 + s_4 &= 12562,\\
155x_1 + 252x_2 + 258x_3 + 156x_4 + 157x_5 - s_5 &= 2533.
\end{aligned}
\tag{19}
\]

Solving (18) directly with the automated branch and bound algorithm takes 459 nodes
to verify the optimal solution given in (20):

\[
x_1 = 31,\; x_2 = 0,\; x_3 = 22,\; x_4 = 9,\; x_5 = 0,\; Z_{opt} = 9462.
\tag{20}
\]

The simplex method gives (21) as the continuous optimal solution:

\[
x_1 = 31.5636,\; x_2 = 0,\; x_3 = 22.7229,\; x_4 = 9.8910,\; x_5 = 0.1264,\; Z_{copt} = 9890.8977.
\tag{21}
\]

6.1.1 Freeing the Selected Variable

The surrogate constraint, or clique equality, becomes (22):

\[
x_1 + x_3 + x_4 + x_5 = 64.3039.
\tag{22}
\]

The variable to be freed of integer restrictions is determined by (23):

\[
\beta_\ell = \max\{31.5636,\; 0,\; 22.7229,\; 9.8910,\; 0.1264\} = 31.5636.
\tag{23}
\]

The largest value, 31.5636, comes from variable $x_1$, which implies that this
variable is to be freed of the integral restriction.
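The relaxation (21), the surrogate right-hand side (22) and the selection rule (16) can be reproduced with a few lines of code. The sketch below uses PuLP/CBC, which is purely our assumption; the paper does not prescribe a solver, and the constraint senses follow the reconstruction of (18) above:

```python
# Sketch of the relaxation-and-selection step for Illustration 1 using PuLP.
import pulp

c = [111, 211, 171, 251, 151]
A = [[110, 210, 115, 112, -31],
     [50, 183, 261, -79, 259],
     [142, 244, -140, 139, 153],
     [224, -87, 128, 129, 133],
     [155, 252, 258, 156, 157]]
b = [7189, 6780, 2695, 12562, 2533]
senses = ["<=", "<=", "<=", "<=", ">="]

prob = pulp.LpProblem("Illustration_1_relaxation", pulp.LpMaximize)
x = [pulp.LpVariable(f"x{j + 1}", lowBound=0) for j in range(5)]  # integrality relaxed
prob += pulp.lpDot(c, x)
for row, rhs, s in zip(A, b, senses):
    prob += pulp.lpDot(row, x) <= rhs if s == "<=" else pulp.lpDot(row, x) >= rhs
prob.solve(pulp.PULP_CBC_CMD(msg=False))

vals = [v.value() for v in x]
print("Z_copt ~", round(pulp.value(prob.objective), 4))      # ~9890.8977, cf. (21)
print("b_c ~", round(sum(v for v in vals if v > 1e-6), 4))   # ~64.3039, cf. (22)
free = max(range(5), key=lambda j: vals[j])                  # selection rule (16)
print("variable to free:", f"x{free + 1}")                   # x1, cf. (23)
```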

Sub-Problem a - additional constraint:

\[
x_1 + x_3 + x_4 + x_5 + x_6 = 64.
\tag{24}
\]

Sub-Problem b - additional constraint:

\[
x_1 + x_3 + x_4 + x_5 - x_7 = 65.
\tag{25}
\]

Note that we consider only the original variables ($x_1, x_3, x_4$ and $x_5$) that
are in the optimal basis. Here $x_6$ and $x_7$ are additional variables that are
also restricted to integers. In diagram form, the original problem and the two
sub-problems (a) and (b) are related as shown in Fig. 2.

Fig. 2. The two initial sub-problems of LIP for Numerical Illustration 1 (Sub-Problem a adds $x_1 + x_3 + x_4 + x_5 + x_6 = 64$; Sub-Problem b adds $x_1 + x_3 + x_4 + x_5 - x_7 = 65$).

Solving Sub-Problem a with the automated branch and bound algorithm takes 21 nodes
to verify the optimal solution given in (26):

\[
x_1 = 31,\; x_2 = 0,\; x_3 = 22,\; x_4 = 9,\; x_5 = 0,\; Z = 9462.
\tag{26}
\]

Solving Sub-Problem b with the automated branch and bound algorithm takes 1 node to
verify infeasibility. The complexity of this problem is thus reduced from 459 to
just (21 + 1) = 22 nodes, i.e. the complexity reduction is 95.2% as given in (27):

\[
\rho = \frac{459 - 22}{459} \times 100\% = 95.2\%.
\tag{27}
\]

Mere addition of the cut given in (28) to Sub-Problem a and the cut given in (29)
to Sub-Problem b does not reduce the total number of nodes required to verify an
optimal solution with the automated branch and bound algorithm:

\[
x_1 + x_3 + x_4 + x_5 \le 64.
\tag{28}
\]

\[
x_1 + x_3 + x_4 + x_5 \ge 65.
\tag{29}
\]

In fact it then takes 461 nodes for Sub-Problem a and 1 node for Sub-Problem b, a
total of 462 nodes to verify optimality.

A set of 100 randomly generated linear integer problems has shown that complexity
decreases significantly if the largest basic variable in the continuous optimal
solution is freed of integral restrictions. As a result, the freeing of variables
is combined with cuts to form what is proposed in this paper as the branch, cut and
free algorithm for the general linear integer problem.

7 Branch and Cut Algorithm

Previously, cuts have been used to enhance the performance of the branch and bound
algorithm [3, 17, 22], forming what is now known as the branch and cut algorithm
[6, 8, 15, 19]. In addition, pricing has been used in a branch and bound setting to
produce the well-known branch and price algorithm [5, 21]. Branch and cut and branch
and price are well-known ideas and have been combined into a hybrid now called
branch, cut and price [9, 10]. Despite all these impressive ideas, the general
linear integer problem is still NP-hard and very difficult to solve, and heuristics
are still used to approximate optimal solutions to this difficult problem [11–13].
Freeing a variable of its integral restriction is now used in the context of branch
and cut to give birth to the branch, cut and free algorithm for the general integer
problem.

7.1 Proposed Branch, Cut and Free Algorithm

The proposed algorithm consists of the following steps.
Step 1: Relax the given LIP and use linear programming techniques to obtain a
continuous optimal solution. If this solution is integer, it is also optimal for
the original problem; otherwise go to Step 2.
Step 2: Use the continuous optimal tableau to construct Sub-Problem a and
Sub-Problem b, and determine the variable $x_j$ to be freed of the integral
restriction.
Step 3: From Sub-Problems a and b, use branch and cut to search for the smallest
$i$ that satisfies (30) and (31):

\[
x_1 + x_2 + x_3 + \dots + x_j + \dots + x_m = I - i, \quad i = 0, 1, 2, \dots
\tag{30}
\]

\[
x_1 + x_2 + x_3 + \dots + x_j + \dots + x_m = I + 1 + i, \quad i = 0, 1, 2, \dots
\tag{31}
\]

where $x_j$ is the freed variable and the rest are integers,
$\forall j = 1, 2, 3, \dots, m$.
Step 4: Call the integer solution from (30) $Z_a$ and that from (31) $Z_b$. The
optimal solution $Z_{opt}$ is given by (32) for a maximization problem and by (33)
for a minimization problem:

\[
Z_{opt} = \max[Z_a, Z_b].
\tag{32}
\]

\[
Z_{opt} = \min[Z_a, Z_b].
\tag{33}
\]

Step 5: Verify optimality to determine the actual optimal integer solution.
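A runnable sketch of Steps 1-4 is given below, with PuLP/CBC standing in for both the LP solver and the branch and cut engine (our assumption; the paper is solver-agnostic). The search is simplified in that it stops at the first level $i$ at which either sub-problem yields an integer-feasible point; the separate verification of Step 5 (Sect. 7.2) is what certifies the point found:

```python
# Schematic, runnable sketch of the branch, cut and free steps using PuLP/CBC.
import math
import pulp

def solve(c, A, b, senses, total=None, free_j=None, relax=False):
    """Maximize c'x s.t. Ax {<=,>=} b, x >= 0 integer except the freed variable;
    optionally fix the sum of the listed basic variables to a given level."""
    prob = pulp.LpProblem("sub_problem", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x{j + 1}", lowBound=0,
                         cat="Continuous" if (relax or j == free_j) else "Integer")
         for j in range(len(c))]
    prob += pulp.lpDot(c, x)
    for row, rhs, s in zip(A, b, senses):
        prob += pulp.lpDot(row, x) <= rhs if s == "<=" else pulp.lpDot(row, x) >= rhs
    if total is not None:
        idx, level = total
        prob += pulp.lpSum(x[j] for j in idx) == level   # equality cuts (30)/(31)
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    if pulp.LpStatus[prob.status] != "Optimal":
        return None, None
    return pulp.value(prob.objective), [v.value() for v in x]

def branch_cut_and_free(c, A, b, senses, max_shift=50):
    z, xs = solve(c, A, b, senses, relax=True)            # Step 1: relaxation
    if all(abs(v - round(v)) < 1e-6 for v in xs):
        return z, xs                                      # already integer
    basics = [j for j, v in enumerate(xs) if v > 1e-6]    # Step 2
    free_j = max(basics, key=lambda j: xs[j])             # selection rule (16)
    level = math.floor(sum(xs[j] for j in basics))        # I in (10)
    for i in range(max_shift):                            # Step 3: smallest i
        za = solve(c, A, b, senses, total=(basics, level - i), free_j=free_j)
        zb = solve(c, A, b, senses, total=(basics, level + 1 + i), free_j=free_j)
        found = [r for r in (za, zb) if r[0] is not None]
        if found:
            return max(found, key=lambda r: r[0])         # Step 4 (maximization)
    return None, None
```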



7.2 Verification of Optimality

We assume the continuous optimal solution and the integer optimal solution are
known, as given in Fig. 3.

Fig. 3. Proof of optimality

Here:
I. $\ell_{copt}$ is where the hyperplane $x_1 + x_2 + x_3 + \dots + x_j + \dots + x_m$ is at $Z_{copt}$;
II. $\ell_{opt}$ is where the hyperplane $x_1 + x_2 + x_3 + \dots + x_j + \dots + x_m$ is at $Z_{opt}$;
III. $\ell_s$ is where the hyperplane $x_1 + x_2 + x_3 + \dots + x_j + \dots + x_m$ has its smallest value;
IV. $\ell_m$ is where the hyperplane $x_1 + x_2 + x_3 + \dots + x_j + \dots + x_m$ has its largest value;
V. $c_1x_1 + c_2x_2 + \dots + c_nx_n$ is the green objective plane, which moves from the continuous optimal point ($Z_{copt}$) to the optimal integer point ($Z_{opt}$).
From Fig. 3, it can be noted that there are no other integer points in the shaded
region, i.e. in the region between the continuous optimal point ($Z_{copt}$) and
the optimal integer point ($Z_{opt}$), besides that optimal integer point.

7.2.1 Searching for the Optimal Integer Point

Armed with these ideas, we can now develop the branch, cut and free algorithm. The
first stage is to assume that $Z_{opt}$ is not known. From the diagram we know that
the optimal integer point ($Z_{opt}$) can be on the left-hand side or the right-hand
side of $\ell_{copt}$. Since $Z_{opt}$ is not known, $\ell_s$ and $\ell_m$ are also
not known. We can avoid needing $\ell_s$ and $\ell_m$ and still obtain $\ell_{opt}$:
this is done by searching from $\ell_{copt}$ outward in both directions until
$\ell_{opt}$ is obtained. After obtaining $\ell_{opt}$, its value must be verified
for optimality.

7.2.2 Optimality Verification

Let $Z_{opt_a}$ be the optimal solution from Sub-Problem a. Optimality can be
verified by adding constraint set (34) to Sub-Problem a:

\[
\begin{aligned}
x_1 + x_2 + x_3 + \dots + x_j + \dots + x_m &\le \ell_{opt_a} - 1,\\
c_1x_1 + c_2x_2 + \dots + c_nx_n &\le Z_{opt_a} - 1.
\end{aligned}
\tag{34}
\]

Let $Z_{opt_b}$ be the optimal solution from Sub-Problem b. Optimality in this case
can be verified by adding constraint set (35) to Sub-Problem b:

\[
\begin{aligned}
x_1 + x_2 + x_3 + \dots + x_j + \dots + x_m &\ge \ell_{opt_b} + 1,\\
c_1x_1 + c_2x_2 + \dots + c_nx_n &\le Z_{opt_b} - 1.
\end{aligned}
\tag{35}
\]

The integer solution $Z_{opt}$ is optimal for the original LIP if both (34) and (35)
are infeasible.
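A sketch of this verification step for a minimization LIP is shown below, following the same PuLP conventions as the previous sketch (again our assumption, not the paper's implementation); $Z_{opt}$ is certified when both augmented sub-problems are infeasible:

```python
# Optimality check (34)-(35) for a minimization LIP, sketched with PuLP/CBC.
import pulp

def is_optimal(c, A, b, senses, basics, free_j, ell_a, ell_b, z_opt):
    """Certify z_opt by showing that both augmented sub-problems are infeasible."""
    def better_point_exists(sum_sense, sum_rhs):
        prob = pulp.LpProblem("verify", pulp.LpMinimize)
        x = [pulp.LpVariable(f"x{j + 1}", lowBound=0,
                             cat="Continuous" if j == free_j else "Integer")
             for j in range(len(c))]
        prob += pulp.lpDot(c, x)
        for row, rhs, s in zip(A, b, senses):
            prob += pulp.lpDot(row, x) <= rhs if s == "<=" else pulp.lpDot(row, x) >= rhs
        basic_sum = pulp.lpSum(x[j] for j in basics)
        prob += basic_sum <= sum_rhs if sum_sense == "<=" else basic_sum >= sum_rhs
        prob += pulp.lpDot(c, x) <= z_opt - 1   # demand a strictly better value
        prob.solve(pulp.PULP_CBC_CMD(msg=False))
        return pulp.LpStatus[prob.status] == "Optimal"
    # (34): search left of ell_a; (35): search right of ell_b
    return (not better_point_exists("<=", ell_a - 1)
            and not better_point_exists(">=", ell_b + 1))
```

For Numerical Illustration 2 below, this check certifies $Z_{opt} = 14793$ with $\ell_{opt_a} = 568$ and $\ell_{opt_b} = 569$.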
Mixed Linear Integer Problem
In the case of a mixed linear integer problem, only those variables that are
required to be integer in the original mixed linear integer problem can be freed
of the integral restriction.

7.3 Numerical Illustration 2 [18]

Minimize $Z = 162x_1 + 38x_2 + 26x_3 + 301x_4 + 87x_5 + 5x_6 + 137x_7$,
such that:

\[
165x_1 + 45x_2 + 33x_3 + 279x_4 + 69x_5 + 6x_6 + 122x_7 \ge 18773,
\tag{36}
\]

where $x_1, x_2, \dots, x_7 \ge 0$ and integer.

The continuous optimal solution for Numerical Illustration 2 is given in Table 5.

Table 5. Continuous optimal tableau (36).



The variable $x_3$ is the only original variable in the optimal basis, so this is
the variable to be freed of integral restrictions. The continuous optimal solution
for Numerical Illustration 2 is given in (37):

\[
x_1 = x_2 = x_4 = x_5 = x_6 = x_7 = 0,\; x_3 = 568.8788,\; Z_{copt} = 14790.8495.
\tag{37}
\]

\[
\beta_\ell = \max\{0, 0, 568.8788, 0, 0, 0, 0\} = 568.8788.
\tag{38}
\]

The largest value, 568.8788, comes from variable $x_3$, which implies that this
variable is to be freed of the integral restriction.

Fig. 4. The two initial sub-problems of the LIP in Numerical Illustration 2 (Sub-Problem a: $x_3 \le 568$, giving $x_3 = 568$, $x_6 = 4.8333$, $Z = 14792.1667$; Sub-Problem b: $x_3 \ge 569$, giving $x_3 = 569$, $Z = 14794$, fathomed).

The branch and cut algorithm is now used to search Sub-Problem a for integer
points, with

\[
x_3 = 568 - i, \quad i = 0, 1, 2, \dots
\]

For $i = 0$: add $x_3 = 568$ to the problem, where $x_3$ is a free variable.
Solving by branch and cut we obtain (39):

\[
x_3 = 568,\; x_6 = 5.0000,\; Z = 14793.
\tag{39}
\]

There is no need to search Sub-Problem b since it is already fathomed, i.e.

\[
x_3 = 569,\; Z = 14794.
\tag{40}
\]

\[
Z_{opt} = \min[14793, 14794] = 14793.
\tag{41}
\]

This solution can be verified by adding (42) to Sub-Problem a:

\[
\begin{aligned}
x_3 &\le 568 - 1 = 567,\\
162x_1 + 38x_2 + 26x_3 + 301x_4 + 87x_5 + 5x_6 + 137x_7 &\le 14792.
\end{aligned}
\tag{42}
\]

Solving Sub-Problem a by the branch and cut algorithm with $x_3$ as a free variable
yields an infeasible problem, which shows that there are no other integer points on
the left-hand side of $\ell_{copt}$ besides the solution in (39).
Similarly, we add (43) to Sub-Problem b:

\[
\begin{aligned}
x_3 &\ge 569 + 1 = 570,\\
162x_1 + 38x_2 + 26x_3 + 301x_4 + 87x_5 + 5x_6 + 137x_7 &\le 14792.
\end{aligned}
\tag{43}
\]

Adding (43) to Sub-Problem b also results in infeasibility, and this verifies that
(39) is optimal.

8 Computational Experience

One hundred randomly generated pure LIPs with between 10 and 110 variables were
used in the computational analysis. The branch and cut algorithm was compared with
the proposed branch, cut and free algorithm, with the same number of cuts used for
each of the two algorithms. What emerged from the computations is that freeing a
basic variable within a branch and cut algorithm is more effective for solving pure
LIPs than the plain branch and cut algorithm.

9 Conclusions

In this paper we presented a way of selecting the variable to be freed, along with
a way of dividing the problem into simpler parts as given in (30) and (31). It is
easier to search the separate divisions ($i = 0, 1, 2, \dots$) of Sub-Problem a or
of Sub-Problem b than the original LIP problem as a whole. In addition, we presented
optimality verification of the optimal integer solution ($Z_{opt}$). This is a new
avenue for research that should attract the attention of many researchers in the
area of linear integer programming. Much has been done for linear integer
programming in terms of exact methods such as branch and cut, branch and price and
the hybrid branch, cut and price; we are not aware of prior work in which variables
are freed of integral restrictions as part of solving the linear integer problem.
Variable freeing may provide an answer to the difficult general linear integer
problem: large numbers of branches are prevented because the variables with large
ranges can be identified and exempted from integral restrictions. In this paper
only one variable was freed; there is a need to explore ways of freeing more than
one variable. Variable freeing is an area in its early stages of development.

Acknowledgments. We are grateful to the anonymous reviewers and conference organizers.

References
1. Abdel-Basset, M., El-Shahat, D., Faris, H., Mirjalili, S.: A binary multi-verse optimizer for 0-1 multidimensional knapsack problems with application in interactive multimedia systems. Comput. Ind. Eng. 132, 187–206 (2019)
2. Alanne, A.: Selection of renovation actions using multi-criteria "knapsack" model. Autom. Constr. 13, 377–391 (2004)
3. Alrabeeah, M., Kumar, S., Al-Hasani, A., Munapo, E., Eberhard, A.: Computational enhancement in the application of the branch and bound method for linear integer programs and related models. Int. J. Math. Eng. Manage. Sci. 4(5), 1140–1153 (2019). https://doi.org/10.33889/IJMEMS.2019.4.5-090
4. Amiri, A.: A Lagrangean based solution algorithm for the knapsack problem with setups. Expert Syst. Appl. 143 (2020)
5. Barnhart, C., Johnson, E.L., Nemhauser, G.L., Savelsbergh, M.W.P., Vance, P.H.: Branch-and-price: column generation for solving huge integer programs. Oper. Res. 46, 316–329 (1998)
6. Brunetta, L., Conforti, M., Rinaldi, G.: A branch-and-cut algorithm for the equicut problem. Math. Program. 78, 243–263 (1997)
7. Dahmani, I., Hifi, M., Saadi, T., Yousef, L.: A swarm optimization-based search algorithm for the quadratic knapsack problem with conflict graphs. Expert Syst. Appl. 148 (2020)
8. Fomeni, F.D., Kaparis, K., Letchford, A.N.: A cut-and-branch algorithm for the quadratic knapsack problem. Discrete Optim. (2020)
9. Fukasawa, R., Longo, H., Lysgaard, J., Poggi de Aragão, M., Uchoa, E., Werneck, R.F.: Robust branch-and-cut-and-price for the capacitated vehicle routing problem. Math. Program. Ser. A 106, 491–511 (2006)
10. Ladányi, L., Ralphs, T.K., Trotter, L.E.: Branch, cut and price: sequential and parallel. In: Jünger, M., Naddef, D. (eds.) Computational Combinatorial Optimization. Springer, Berlin (2001)
11. Lahyani, R., Chebil, K., Khemakhem, M., Coelho, L.C.: Matheuristics for solving the multiple knapsack problem with setup. Comput. Ind. Eng. 129, 76–89 (2019)
12. Lai, X., Hao, J.K., Fu, Z.H., Yue, Y.: Diversity-preserving quantum particle swarm optimization for the multidimensional knapsack problem. Expert Syst. Appl. 149 (2020)
13. Lai, X., Hao, J.K., Fu, Z.H., Yue, D.: Diversity-preserving quantum particle swarm optimization for the multidimensional knapsack problem. Expert Syst. Appl. 149 (2020)
14. Micheli, G., Weger, V.: On rectangular unimodular matrices over the algebraic integers. SIAM J. Discr. Math. 33(1), 425–437 (2019)
15. Mitchell, J.E.: Branch and cut algorithms for integer programming. In: Floudas, C.A., Pardalos, P.M. (eds.) Encyclopedia of Optimization. Kluwer Academic Publishers (2001)
16. Munapo, E.: Network reconstruction – a new approach to the traveling salesman problem and complexity. In: Intelligent Computing and Optimization: Proceedings of the 2nd International Conference on Intelligent Computing and Optimization 2019 (ICO 2019), pp. 260–272 (2020)
17. Munapo, E.: Improvement of the branch and bound algorithm for solving the knapsack linear integer problem. Eastern-Euro. J. Enter. Technol. 2(4), 59–69 (2020)
18. Munapo, E.: Improving the optimality verification and the parallel processing of the general knapsack linear integer problem. In: Research Advancements in Smart Technology, Optimization, and Renewable Energy (2020)
19. Oprea, S.V., Bâra, A., Ifrim, G.A., Coroianu, L.: Day-ahead electricity consumption optimization algorithms for smart homes. Comput. Ind. Eng. 135, 382–401 (2019)
20. Padberg, M., Rinaldi, G.: A branch-and-cut algorithm for the resolution of large-scale symmetric traveling salesman problems. SIAM Rev. 33(1), 60–100 (1991)
21. Savelsbergh, M.W.P.: A branch and price algorithm to solve the generalized assignment problem. Oper. Res. 45, 831–841 (1997)
22. Taha, H.A.: Operations Research: An Introduction, 10th edn. Pearson (2017)
Resilience in Healthcare Supply Chains

Jose Antonio Marmolejo-Saucedo and Mariana Scarlett Hartmann-González

Facultad de Ingeniería, Universidad Panamericana,
Augusto Rodin 498, Ciudad de México 03920, Mexico
{jmarmolejo,0172952}@up.edu.mx

Abstract. The recent COVID-19 pandemic that the world is experiencing should
be a catalyst for companies to reflect on the processes of their supply
chains. Global supply chains, regardless of the type of industry, will need to
adopt changes in their operations strategy. The implementation of mathematical
optimization and simulation models will allow the adoption of proposals for
the design of resilient supply chains that respond to the immediate challenge.
This work proposes the use of optimization-simulation techniques to reduce the
impact of interruptions in the workforce and the closure of facilities and
transportation. A hypothetical case study is presented in which various
disruption scenarios are tested and the best strategies to recover the desired
service levels are analyzed.

Keywords: Epidemic outbreaks · COVID-19 · Supply chain design · Simulation · Resilient · Optimization

1 Introduction
Many solutions have been proposed by different societies to improve the way they
carry out their activities and to adopt measures that allow them to prepare for and
mitigate disasters or catastrophes that could happen at any time. Historical data
and previous events are very relevant objects of study for mitigating disasters
[7]. The interruption of daily activities by fires, earthquakes or floods, for
example, can be considered a disaster or a catastrophe. Governments therefore seek
to be prepared to minimize the damage that could occur through proper disaster
management [1,3]. It should be noted that such disasters affect a country, a
government, or a society in a specific manner, so that they can be controlled and
their effects mitigated in short periods, and other countries can support the
affected country, for example by sending medical supplies, rescue personnel and
basic necessities.
Resilience plays an extremely important role, since interruptions in the health
sector are a reality and very common. Not only a pandemic but also natural
disasters and social, economic and political conflicts, among other things, can
affect the sector, and a rapid response is necessary.

A platform that connects hospitals with suppliers and distributors would therefore
be very valuable, since it would make existing inventories and the time necessary
to place orders visible in real time, taking current demand into account. The
platform would also allow suppliers to share planned and forecast orders for the
optimization of operations, and would let them identify potential problems in
advance and react, either by increasing production or by managing the existing
stock, in order to achieve a balance of supply and demand with accurate, real-time
information.
The healthcare industry currently relies on the just-in-time (JIT) distribution
model and has operated in the same way for several decades. JIT is known to have
helped the industry control costs and reduce waste, but the processes and
technology systems that support it failed to meet the demands of a global pandemic.
With COVID-19, suppliers found it necessary to design interim measures to address
the shortage of personal protective equipment (PPE) and other key supplies, and
these approaches are expected to be applied long after we return to pre-pandemic
activity levels. As a consequence, the virus has challenged the industry to rethink
its definition of supply chain resilience.

2 Literature Review

A new way of managing knowledge is based on Digital Twins, and one of the main
fields to which it is being applied is health, in part because digital twins make
it possible to draw impartial conclusions about patients. Healthcare supply chains
are currently considered different from usual supply chains due to their high level
of complexity, the presence of high-value medical materials and, finally, the fact
that they deal with human lives. International health systems are under constant
and increasing pressure to reduce waste and eliminate unnecessary costs while
improving the quality and consistency of the care provided to the patient, as well
as providing a 100% service level to avoid shortages in hospital wards. Authors
have therefore been developing different ways to manage medical knowledge.
MyHealthAvatar and Avatar Health [5] are two clear examples of how health knowledge
can be created; they aim to collect and track people's lifestyle and health data.
MyHealthAvatar is a project whose main feature is data analysis and presentation
that doctors can visualize: results are presented through a 3D avatar, a health
status dashboard, disease risks, a clock view, and daily events. Avatar Health
collects information through health monitoring devices to check that the patient's
habits and parameters are correct; it thus acts as an equivalent of a human,
considering physical state, living conditions and habits. This information could be
used not only individually but also collectively, which could be very useful for
Digital Twins focused on health issues, because diseases, the drugs the population
needs, emergencies and epidemics could be predicted. With the development of big
data, the cloud and the Internet of Things, the use of digital twins as a precision
simulation technology of reality has been enhanced.

It is extremely important to consider that simulation is essential in the fields of
health and research, in order to plan and allocate medical resources and predict
medical activities, among other things. When digital twins and health care are
combined, a new and efficient way to provide more accurate and faster services
results. Digital twins act as a digital replica of the physical object or service
they represent in the healthcare industry, providing monitoring and evaluation.
They can provide a secure environment to test the impact of changes on a system's
performance, so problems, and how and when they might occur, can be predicted with
time to implement the necessary changes or procedures, allowing for optimal
solutions and risk reduction. Digital twins also play a crucial role in both
hospital design and patient care. As an example, some authors seek to manage the
health of the life cycle of elderly patients in order to have and use their
information, both physically and virtually, aiming to monitor, diagnose and predict
health issues through portable medical devices that send information about the
patients to the Digital Twin. There is also a supply chain concern associated with
pharmaceuticals, which is essential for customer service and the supply of drugs to
patients in pharmacies, considering that supply represents between 25 and 30% of
hospital costs; maintaining cost and service objectives is therefore considered
vital. Today, society is facing a pandemic that is not only causing deaths but also
severely affecting supply chains, because it is characterized by long-term
disruption, disruption propagation (i.e., the domino effect), and high uncertainty.
While governments and agencies seek to stop the spread of COVID-19 and provide
treatment to infected people, manufacturers are constantly fighting to control the
growing impact of the epidemic on their supply chains. Several authors have written
about the topic. As an example, one author considered it extremely important to
carry out simulations of the impact of COVID-19 on supply chains, in order to
investigate and reduce the impact of future epidemic outbreaks. Digital Twins could
help mitigate this global impact by providing important data for business
decision-making. Other authors developed a practical decision support system based
on physician knowledge and a Fuzzy Inference System (FIS), which helps manage
demand in the healthcare supply chain to reduce stress in the community, break the
COVID-19 chain of spread and, in general, mitigate outbreaks due to disruptions in
the healthcare supply chain. They divide the residents of the community into four
groups according to the risk level of their immune system, using two indicators:
age and pre-existing diseases. These individuals are then classified and required
to comply with regulations depending on the group they are in. The efficiency of
the proposed approach was measured in the real world using information from four
users, and the results showed the effectiveness and precision of the approach. It
is important to recognize that some companies are better prepared than others to
mitigate the impact, because they have developed and implemented business
continuity and supply chain risk management strategies. They have also diversified
their supply chains from a geographical perspective to reduce risks. Another
important factor is that they usually have a diversified portfolio of suppliers; in
order not to compromise their key products and to reduce their dependence on any
single supplier, they have also used inventory strategies to avoid interruption of
the supply chain. Logistics teams have sought to better understand risks and drive
specific actions based on priorities, so agility has been developed within
production and distribution networks to quickly reconfigure them and maintain
supply to global demand.

3 Health Care Supply Chain Design

Today the world population is affected by an infectious and highly contagious
disease known as COVID-19, and each country is trying to serve its population and
safeguard its own people. In such a disastrous situation, decision-making by top
management, the interaction of societies and the activities of supply chains become
extremely difficult. So far the full effects of COVID-19 are unknown and research
on it is very expensive; it is considered highly dangerous once contracted because
its contagion capacity is very high. Unfortunately, many countries face the fact
that they do not have the medical and human resources necessary to combat this
virus, even less so considering the outbreak and contagion rate of the disease. The
healthcare supply chain is being severely affected because "everyone" needs it at
the same time and it does not have such responsiveness. It is therefore important
that governments look for the correct way to prioritize health personnel in order
to provide better service to the community, and find the best way to manage the
healthcare supply chain to avoid its interruption and mitigate harm to the
population as far as possible. The main objective of the proposed model is the use
of mitigation activities to reduce the effects of COVID-19, that is, to reduce
interruptions in healthcare supply chains and provide better service to communities
that do not have access to health care. Several solutions have been described by
[6], among them increasing the flexibility of companies and supply chains through:
the postponement of production; the implementation of strategic stocks to supply
various demands; the use of a flexible supplier base to be able to react more
quickly; the use of the make-or-buy approach; the planning of transportation
alternatives; and the active management of income and prices, directing consumption
to products with greater availability. Figure 1 shows the possible combinations for
designing a supply chain considering the most important factors that occur in the
industry.
Figure 2 presents the process diagram for designing resilient supply chains, which
starts from a prior analysis of possible disruptions.

Fig. 1. Resilience in health care supply chains

Fig. 2. Flow diagram for resilience in health care supply chains

4 Mathematical Model

In this section, we apply the supply chain design model for a resilient supply
network, considering the model on a generalized network. The model is a
mixed-integer linear problem.
Let K be the set of manufacturing plants; an element k ∈ K identifies a specific
plant of the company. Let I be the set of potential warehouses; an element i ∈ I is
a specific warehouse. Finally, let J be the set of current distribution centers; a
specific distribution center is any j ∈ J. Let Z denote the binary set {0, 1}.

4.1 Parameters

$Q_k$ = capacity of plant k.
$\beta_i$ = capacity of warehouse i.
$F_i$ = fixed cost of opening a warehouse at location i.
$G_{ki}$ = transportation cost per unit of product from plant k to warehouse i.
$C_{ij}$ = cost of shipping the product from warehouse i to distribution center (CeDis) j.
$d_j$ = demand of distribution center j.
4.2 Decision Variables

We have the following sets of binary variables to make the decisions about the
opening of warehouses and the assignment of distribution centers to cross-docking
warehouses:

\[
Y_i = \begin{cases} 1 & \text{if location } i \text{ is used as a warehouse,} \\ 0 & \text{otherwise,} \end{cases}
\qquad
X_{ij} = \begin{cases} 1 & \text{if warehouse } i \text{ supplies the demand of CeDis } j, \\ 0 & \text{otherwise.} \end{cases}
\]

The amount of product sent from plant k to warehouse i is represented by the
continuous variables $W_{ki}$.
We can now state the mathematical model as problem (P), based on [2]:

\[
\min_{W_{ki}, Y_i, X_{ij}} Z = \sum_{k \in K}\sum_{i \in I} G_{ki}W_{ki} + \sum_{i \in I} F_iY_i + \sum_{i \in I}\sum_{j \in J} C_{ij}d_jX_{ij}
\tag{1}
\]

subject to the constraints:

Capacity of the plant:
\[
\sum_{i \in I} W_{ki} \le Q_k, \quad \forall k \in K
\tag{2}
\]

Balance of product:
\[
\sum_{j \in J} d_jX_{ij} = \sum_{k \in K} W_{ki}, \quad \forall i \in I
\tag{3}
\]

Single warehouse to distribution center:
\[
\sum_{i \in I} X_{ij} = 1, \quad \forall j \in J
\tag{4}
\]

Warehouse capacity:
\[
\sum_{j \in J} d_jX_{ij} \le \beta_iY_i, \quad \forall i \in I
\tag{5}
\]

Demand of items:
\[
pY_i \le \sum_{k \in K} W_{ki}, \quad \forall i \in I
\tag{6}
\]

\[
p = \min_j\{d_j\}
\tag{7}
\]

\[
W_{ki} \ge 0, \quad \forall i \in I, \forall k \in K
\tag{8}
\]

\[
Y_i \in Z, \quad \forall i \in I
\tag{9}
\]

\[
X_{ij} \in Z, \quad \forall i \in I, \forall j \in J
\tag{10}
\]

The objective function (1) considers in its first term the cost of shipping product
from plant k to warehouse i; the second term contains the fixed cost of opening and
operating warehouse i; the last term incorporates the cost of fulfilling the demand
of distribution center j. Constraint (2) implies that the output of plant k does
not exceed its capacity. Balance constraint (3) ensures that the amount of product
leaving warehouse i toward the distribution centers equals the amount it receives
from the plants. The demand of each distribution center j is satisfied by a single
warehouse i, which is achieved by constraint (4). Constraint (5) bounds the amount
of product that can be sent to distribution centers from an opened cross-docking
warehouse i. Constraint (6) guarantees that any opened warehouse i receives at
least the minimum amount of demand requested by a distribution center, and
constraint (7) defines this minimum over the demands of the distribution centers.
Finally, constraints (8), (9) and (10) are the non-negativity and integrality
conditions.
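A compact sketch of model (1)-(10) is given below using PuLP; the sets and data are illustrative placeholders of ours, not the case-study values:

```python
# Sketch of the resilient design model (1)-(10) in PuLP with toy data.
import pulp

K, I, J = range(2), range(3), range(4)                 # plants, warehouses, CeDis
Q = {0: 900, 1: 700}                                   # plant capacities
beta = {0: 600, 1: 600, 2: 500}                        # warehouse capacities
F = {0: 1000, 1: 1200, 2: 800}                         # warehouse opening costs
G = {(k, i): 2 + k + i for k in K for i in I}          # plant->warehouse unit cost
C = {(i, j): 1 + abs(i - j) for i in I for j in J}     # warehouse->CeDis unit cost
d = {0: 150, 1: 200, 2: 120, 3: 180}                   # CeDis demands
p = min(d.values())                                    # constraint (7)

m = pulp.LpProblem("resilient_design", pulp.LpMinimize)
W = pulp.LpVariable.dicts("W", (K, I), lowBound=0)     # continuous flows
Y = pulp.LpVariable.dicts("Y", I, cat="Binary")        # open warehouse i
X = pulp.LpVariable.dicts("X", (I, J), cat="Binary")   # warehouse i serves CeDis j

m += (pulp.lpSum(G[k, i] * W[k][i] for k in K for i in I)       # objective (1)
      + pulp.lpSum(F[i] * Y[i] for i in I)
      + pulp.lpSum(C[i, j] * d[j] * X[i][j] for i in I for j in J))
for k in K:
    m += pulp.lpSum(W[k][i] for i in I) <= Q[k]                 # (2) plant capacity
for i in I:
    m += pulp.lpSum(d[j] * X[i][j] for j in J) == pulp.lpSum(W[k][i] for k in K)  # (3)
    m += pulp.lpSum(d[j] * X[i][j] for j in J) <= beta[i] * Y[i]                  # (5)
    m += p * Y[i] <= pulp.lpSum(W[k][i] for k in K)                               # (6)
for j in J:
    m += pulp.lpSum(X[i][j] for i in I) == 1                    # (4) single sourcing
m.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[m.status], pulp.value(m.objective))
```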

5 Case Study

The case study considers a company that manufactures medical supplies, has its base
of operations in central Mexico, and wants to locate new finished-product
warehouses to reduce delivery times. Likewise, it intends to use a multi-store
strategy that increases the resilience of the supply chain. To simulate the case of
an epidemic or pandemic, a hypothetical situation is presented in which the various
potential warehouse locations are affected depending on the distribution of the
disease; in other words, the government establishes different closing times for
operations depending on the geographic area. Decisions are required to locate new
facilities for the production and distribution of injectable products. Likewise,
various operating scenarios (under disruption conditions) of the supply,
manufacturing, inventory and product distribution processes are modeled and
analyzed, see Fig. 3.

Fig. 3. Current health care supply chain

The customer demand presented in this paper follows a uniform distribution, with
different maximum and minimum order parameters for each of the two modeled
products. For Carsilaza's product there is a maximum order per customer of 24,000
pieces and a minimum of 320 pieces, with an average of 65,000 pieces sold per
month, see Fig. 4. For Nuverasa's product there is a maximum order per customer of
12,000 pieces and a minimum of 110 pieces, with an average of 28,000 pieces sold
per month, see Fig. 5.

Fig. 4. Carsilaza’s demand

This section compares the performance of several inventory policies for the design
of supply chain networks, in order to determine the best option for what is needed.
The first approach considers disruption scenarios during the supply chain design
decisions, among which are location and allocation decisions, whereas the second
approach makes the supply chain design decisions without considering the disruption
scenarios; each one gives its own result. By comparing the total

Fig. 5. Nuverasa’s demand

profits under these two approaches, the benefits of considering disruptions in the
supply chain design model become apparent. It can therefore be concluded that not
only facility location costs affect the location decisions: the facility disruption
rates are also important factors in determining where the warehouses will be
located. For example, despite a low facility location cost, a decision may be made
not to open a warehouse at a candidate site due to high disruption rates.
Additionally, the number of clients to be served and the assignment of clients to
distribution centers may depend on the disruption rates of the warehouses. A
significant increase in total profit can therefore be achieved by considering
facility disruptions in the supply chain design model.
The inventory control policy was created in order to define inventory levels; a
policy (s, S) with safety stock is modeled, better known as a Min-Max policy with
safety stock. It assumes that both the time between orders and the order quantity
are variable, where the latter varies between the order level S and the reorder
point (ROP) s. The parameters to be defined are the safety stock (SS), the minimum
inventory value (s) and the maximum level (S):

\[
SS = z \cdot \sigma \cdot \sqrt{LT}
\tag{11}
\]

\[
s = d \cdot LT + SS
\tag{12}
\]

\[
S = 2 \cdot s
\tag{13}
\]

where z is the z-value obtained from the tables of the normal distribution, σ is
the standard deviation of the demand, LT is the delivery time of the supply, and d
is the demand. Weekly consumption has been considered for the calculation of the
inventory parameters, and a service level equal to 99.9% has been established for
all classes of products; thus z is set to 3.5. The initial stock when the
simulation starts has been set equal to the maximum value (S).
As for the suppliers, it can be assumed that they have sufficient capacity to
always satisfy the demand without any problem; in other words, they have a very
intense production rhythm and can produce for the national and international
markets, so their inventory levels are modeled as infinite. The model is
implemented in the anyLogistix software (ALX) and simulated over a one-year period
(Jan 1, 2020 to Dec 31, 2020) considering different scenarios, see [4]. First, the
starting point is analyzed without any disruption. Next, a disruption is inserted
into the model, and supply chain performance with and without different recovery
policies is then evaluated to mitigate the impact of the disruption. For the
disruption, the model introduces a complete closure of the supplier. Recovery
policies are based on different strategies: the first requires increasing the
inventory levels in the main warehouse for all products with high demand; the
second consists in activating lateral transshipment between facilities (plants,
warehouses and distribution centers), although, to be more effective, in some cases
this action also requires an increase in inventory levels.
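The Min-Max parameters (11)-(13) can be computed directly, as in the sketch below; since the paper reports monthly sales rather than the weekly mean, standard deviation and lead time, the numbers in the example are hypothetical:

```python
# Direct computation of the Min-Max parameters (11)-(13); z = 3.5 corresponds
# to the 99.9% service level stated in the text.
import math

def min_max_parameters(weekly_demand_mean, weekly_demand_std, lead_time_weeks, z=3.5):
    ss = z * weekly_demand_std * math.sqrt(lead_time_weeks)   # (11) safety stock
    s = weekly_demand_mean * lead_time_weeks + ss             # (12) reorder level
    S = 2 * s                                                 # (13) maximum level
    return ss, s, S

# Hypothetical weekly figures (roughly 65,000 monthly pieces / 4 weeks):
ss, s, S = min_max_parameters(weekly_demand_mean=16250,
                              weekly_demand_std=3000,
                              lead_time_weeks=1)
print(f"SS={ss:.0f}, s={s:.0f}, S={S:.0f}")
```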

6 Computational Results
From the analysis of product demand, inventory policies were established in order
to offer the best possible service level to customers, intended to be greater than
90%. Different inventory policies were tested, among them:

• Min-Max policy: products are ordered when the inventory level falls below a fixed
replenishment point (s); the ordered quantity is set to such a value that the
resulting inventory quantity equals S.
• Min-Max policy with safety stock: products are ordered when the inventory level
falls below a fixed replenishment point (s + safety stock); the ordered quantity is
set to such a value that the resulting inventory quantity equals S + safety stock.
• RQ policy: when the inventory level falls below a fixed replenishment point (R),
the fixed replenishment quantity (Q) of products is ordered.

The time horizon considered is one year of customer orders. The model was simulated
several times in order to observe different scenarios and choose the best possible
option. The policy that obtained the best result across the different KPIs analyzed
is the Min-Max policy, with different ranges of minimums and maximums for the
products: for Carsilaza a minimum of 40,000 pieces and a maximum of 60,000 pieces
were stipulated, while for Nuverasa a minimum of 15,000 pieces and a maximum of
25,000 pieces were stipulated, obtaining the following KPIs as a result (Fig. 6):

Fig. 6. The proposed inventory policy

When evaluating another scenario with the Min-Max policy with safety stock, it
showed a total cost of $3,273,810 USD, but the service level offered was 90% with a
profit of $321,112,234 USD. It was therefore decided to choose the previously
presented scenario, whose profit was higher by more than 20M USD and whose service
level was 95%. It is necessary to consider that the higher the service level, the
higher the costs will be, since costs are directly proportional to the service
level together with the inventory level.
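As a rough sanity check on the chosen policy, the sketch below simulates a Min-Max (s = 40,000, S = 60,000) control for Carsilaza with demand drawn uniformly between the reported order bounds; the weekly time step and one-week lead time are our assumptions, not ALX settings:

```python
# Minimal discrete-time sketch of the simulated Min-Max policy for Carsilaza.
import random

def simulate_min_max(s, S, weeks=52, lead_time=1, seed=7):
    random.seed(seed)
    on_hand, pipeline = S, []      # start at the maximum level, no open orders
    served = demanded = 0
    for _ in range(weeks):
        pipeline = [(t - 1, q) for t, q in pipeline]        # advance one week
        on_hand += sum(q for t, q in pipeline if t <= 0)    # receive arrivals
        pipeline = [(t, q) for t, q in pipeline if t > 0]
        demand = random.randint(320, 24000)                 # uniform weekly demand
        served += min(demand, on_hand)
        demanded += demand
        on_hand = max(on_hand - demand, 0)
        position = on_hand + sum(q for _, q in pipeline)
        if position < s:                                    # reorder up to S
            pipeline.append((lead_time, S - position))
    return served / demanded                                # fill-rate style KPI

print(f"service level ~ {simulate_min_max(40000, 60000):.1%}")
```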
The following results show the service level, the average available inventory and
the lead time for a recovery strategy based on Min-Max inventory levels under a
disruptive scenario, see Figs. 7, 8, 9, 10, 11 and 12. Likewise, a comparison of
KPIs is presented, corroborating that the proposed inventory strategy substantially
improves profit and other indicators.

Fig. 7. Service level by product for proposed inventory strategy



Fig. 8. Service level by product for current inventory strategy

Fig. 9. Available inventory for proposed inventory strategy

Fig. 10. Available inventory for current inventory strategy



Fig. 11. Lead time for proposed inventory strategy

Fig. 12. Lead time for current inventory strategy

7 Conclusions
In this paper, the effect of different disruptions on the resilient design of a
healthcare supply chain is studied. The disruptions considered model those that can
occur in cases of global pandemics such as COVID-19.
The design of resilient healthcare chains is proposed through the mathematical
modeling of the problem and the dynamic simulation of various inventory policies.
Likewise, different KPIs are used to evaluate the performance of the proposals in
the different disruption scenarios.
Specialized software for supply chain analysis is used to implement the proposals,
and the alternatives for resilient designs are contrasted for different service
levels.
The presented study offers an alternative to reduce the severe impacts on supply
chains dedicated to health care.

References
1. Acar, M., Kaya, O.: A healthcare network design model with mobile hospitals for
disaster preparedness: a case study for Istanbul earthquake. Transp. Res. Part E
Logist. Transp. Rev. 130, 273–292 (2019). https://doi.org/10.1016/j.tre.2019.09.
007. http://www.sciencedirect.com/science/article/pii/S136655451930314X
2. Marmolejo, J., Rodríguez, R., Cruz-Mejia, O., Saucedo, J.: Design of a distribution
network using primal-dual decomposition. Math. Probl. Eng. 2016, 9 (2016)
3. Rezaei-Malek, M., Tavakkoli-Moghaddam, R., Cheikhrouhou, N., Taheri-
Moghaddam, A.: An approximation approach to a trade-off among efficiency, effi-
cacy, and balance for relief pre-positioning in disaster management. Transp. Res.
Part E Logist. Transp. Rev. 93, 485–509 (2016). https://doi.org/10.1016/j.tre.2016.
07.003. http://www.sciencedirect.com/science/article/pii/S136655451630134X
4. anyLogistix supply chain software: supply chain digital twins, February 2020.
https://www.anylogistix.com/resources/white-papers/supply-chain-digital-twins/
5. Spanakis, E.G., Kafetzopoulos, D., Yang, P., Marias, K., Deng, Z., Tsiknakis, M.,
Sakkalis, V., Dong, F.: MyHealthAvatar: personalized and empowerment health services
through Internet of Things technologies. In: 2014 4th International Confer-
ence on Wireless Mobile Communication and Healthcare - Transforming Healthcare
Through Innovations in Mobile and Wireless Technologies (MOBIHEALTH), pp.
331–334 (2014). https://doi.org/10.1109/MOBIHEALTH.2014.7015978
6. Tang, C.S.: Robust strategies for mitigating supply chain disruptions. Int. J. Logist.
Res. Appl. 9(1), 33–45 (2006). https://doi.org/10.1080/13675560500405584
7. Yan, Y., Hong, L., He, X., Ouyang, M., Peeta, S., Chen, X.: Pre-disaster investment
decisions for strengthening the Chinese railway system under earthquakes. Transp.
Res. Part E Logist. Transp. Rev. 105, 39–59 (2017). https://doi.org/10.1016/j.tre.
2017.07.001. http://www.sciencedirect.com/science/article/pii/S1366554516306913
A Comprehensive Evaluation
of Environmental Projects Through
a Multiparadigm Modeling Approach

Roman Rodriguez-Aguilar¹, Luz María Adriana Reyes Ortega²,
and Jose-Antonio Marmolejo-Saucedo³

¹ Facultad de Ciencias Económicas y Empresariales, Universidad Panamericana,
Ciudad de México, Augusto Rodin 498, 03920 Mexico City, México
rrodrigueza@up.edu.mx
² Facultad de Ingeniería, Universidad Anáhuac, Huixquilucan, México
³ Facultad de Ingeniería, Universidad Panamericana, Ciudad de México,
Augusto Rodin 498, 03920 Mexico City, México

Abstract. The evaluation of environmental projects has in most cases been
structured around financial profitability indicators in order to obtain private and
public financing, while the environmental performance measures of the evaluated
projects have been left aside. The present work proposes an evaluation of
environmental projects using cost-effectiveness criteria that take into account the
environmental results of the project and its implementation costs; additionally,
the uncertainty analysis of the projects is integrated through simulation methods
and real options. The results show that the cost-effectiveness evaluation approach
for environmental projects allows the integration of environmental results measures
into the decision to implement a project, beyond financial indicators alone.

Keywords: Environmental projects · Cost-effectiveness · Discrete simulation · Dynamic simulation · Real options

1 Introduction

The impact of human activities on the environment has become more relevant
worldwide in recent years. This has generated the need to evaluate technical
proposals that reduce the environmental impact derived mainly from production and
consumption: reducing the generation of waste, increasing recycling, and producing
through clean energy. In this transition process, the United Nations determined a
set of sustainable development objectives with a horizon to 2030; among them, the
objectives related to energy and care for the environment stand out. In this
regard, the aim is to have clean energy sources that allow worldwide access for the
entire population, as well as to maintain the systemic balance and ensure the care
of natural resources for future generations.
In this process of transformation towards more sustainable production and
consumption, the collaboration of public and private institutions is necessary for
the design of public policies and proposals that allow achieving the objectives
set. An

essential point in this stage of transition towards more sustainable approaches is
the evaluation of proposals from the public and private points of view, since in
most cases investments are made by private initiative with public support; once the
technical evaluation stage is approved, the financial feasibility of the
environmental projects is evaluated. It is in this area that the need to innovate
through the application of new approaches is observed, in order to define whether
an environmental project is viable taking into account both financial and
environmental results metrics.
The classic methodologies for evaluating projects for access to financing consider
the financial profitability of the project and the payback period of the
investments, leaving the evaluation of environmental results in second place and
considering this stage as part of an ex-post evaluation of the project. However,
through different quantitative tools, such as discrete and dynamic simulation
models, the integration of project uncertainty through real options, and the
definition of cost-effectiveness criteria in ex-ante evaluations, it is possible to
project and evaluate the results of an environmental project in a feasibility
study. The accelerated depletion of renewable and non-renewable natural resources
in recent years has generated the need to implement policies focused on sustainable
development that guarantee access to natural resources for future generations.
As part of the implementation of these policies, the need has arisen to develop financing schemes suited to the specific characteristics of projects focused on environmental conservation. Emphasis has been placed on promoting the generation of clean energy and on projects focused on reducing the environmental impact of productive activity [1]. A fundamental factor in evaluating the financial viability of these projects is determining the degree of effectiveness of the desired environmental results, which is not necessarily compatible with classic investment project evaluation approaches; in many cases, the presence of negative net benefits is justifiable when intangible benefits are taken into account. Intangible results must also be considered in intertemporal problems related to the management and conservation of natural resources [2]. The need therefore arises to establish an evaluation framework suited to the characteristics of environmental projects, one that above all emphasizes expected results that cannot be measured through the profitability of the investment alone. It is necessary to have
comprehensive methods for evaluating environmental projects that allow evaluating
financial viability, operational feasibility, as well as the expected environmental impact.
The integration of environmental aspects as requirements in the operation of contem-
porary companies, as well as the development of sustainable companies, have gener-
ated the need to objectively assess the technical, financial, and especially
environmental viability of the proposals. Evaluation proposals exist, but most focus on a single aspect of the evaluation; likewise, classic project evaluation approaches do not adapt efficiently to environmental projects due to their particular characteristics and objectives [3, 4].
At the national and international level, schemes have been adopted to promote
sustainable development through policies to reduce environmental impact, specialized
regulation, and generation of financing schemes for clean energy projects; as well as a
decrease in the environmental impact of productive activity [5, 6]. Consistent with
these policies, it is necessary to have project evaluation schemes beyond traditional
financial or cost-benefit evaluations, since an environmental project can be beneficial

in environmental terms, despite operating with negative net results. An additional factor
to consider is the treatment of uncertainty since in the case of environmental projects
multiple factors can impact the feasibility of the project. Until now, the approach to
evaluating environmental projects has been analytical, focusing on specific project
segments, which is why there is a need for an eclectic approach based on robust methods that support decision-making.
The work is structured as follows: Section two presents the methodological framework of the proposal, briefly describing the quantitative tools to be used. Section three addresses the application of the proposed methodology in a case study, and finally the conclusions and recommendations are presented.

2 Methodology

The proposal integrates a multiparadigm modeling approach that addresses the evaluation of each key stage in the development of an environmental project; the theoretical framework therefore combines various quantitative approaches, each specified according to the stage of the environmental project.

2.1 Discrete Event Simulation


By simulating discrete events it is possible to evaluate the operational feasibility of the
proposal, in existing operational processes or the design of new products/processes.
One of the advantages of discrete simulation is that it allows simulating the operation of
a system in a controlled environment and identifying its possible failures through the
analysis of feasible scenarios [7]. The discrete simulation approach maps an operational process as a sequence of activities, seeking to represent a standard process and its possible behavior in a controlled environment (Fig. 1).

Fig. 1. Discrete event simulation. Source: AnyLogic®.

In the specific case of environmental projects, discrete simulation makes it possible to evaluate the feasibility of the process to be implemented, as well as the expected results under the different scenarios considered. A further advantage of discrete simulation is that it allows the uncertainty in the behavior of the simulated system to be taken into account.
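To make this concrete, a minimal discrete-event sketch in Python is shown below. The figures in this paper reference AnyLogic®; here the SimPy package is assumed instead as a freely available alternative, and the single-station process with exponential arrival and service times is an illustrative placeholder, not the paper's model.

import random
import simpy

# Jobs arrive at a single processing station; cycle times are collected
# to judge operational feasibility under a given scenario.
MEAN_ARRIVAL = 0.5   # hours between jobs (assumed)
MEAN_SERVICE = 0.4   # hours per job (assumed)
SIM_HOURS = 8 * 6    # one week of 8-hour shifts, six days (assumed)

cycle_times = []

def job(env, station):
    arrival = env.now
    with station.request() as req:   # queue for the station
        yield req
        yield env.timeout(random.expovariate(1.0 / MEAN_SERVICE))
    cycle_times.append(env.now - arrival)

def generator(env, station):
    while True:
        yield env.timeout(random.expovariate(1.0 / MEAN_ARRIVAL))
        env.process(job(env, station))

random.seed(42)
env = simpy.Environment()
station = simpy.Resource(env, capacity=1)
env.process(generator(env, station))
env.run(until=SIM_HOURS)
print(f"jobs completed: {len(cycle_times)}, mean cycle time: "
      f"{sum(cycle_times) / len(cycle_times):.2f} h")

Running the same model under several arrival or service scenarios is what allows the feasibility and performance measures discussed above to be compared.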

2.2 Dynamic Systems Modeling


For environmental projects, it is highly relevant to know the expected long-term behavior of the variables of interest, such as CO2 emissions, energy generation, or pollutant generation. Dynamic systems modeling is a tool that takes into account the behavior of a system and the interaction between its parts in order to robustly project the expected trajectories of the variables of interest, as well as to design intervention policies that achieve the desired objectives in the final trajectory. The foundation of dynamic simulation is the solution of differential equations that represent the behavior of a variable of interest over time, taking into account its interaction with a set of auxiliary variables, causal relationships, and flows. The general form of a dynamic model is a differential equation with given initial conditions:

dy/dx = f(x, y), with y(x_0) = y_0. (1)

The objective is to identify the expected trajectory of the state variable of the differential equation. Since not all differential equations admit an analytical solution, it is often necessary to resort to numerical methods. For systems related to human activity, the approach known as System Dynamics was developed, a methodology for analysis and temporal modeling in complex environments [8] (Fig. 2).
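As a minimal numerical sketch of Eq. 1, the snippet below integrates an assumed first-order stock-adjustment model with SciPy's solve_ivp; the functional form, rate, and levels are illustrative assumptions, not the paper's model.

import numpy as np
from scipy.integrate import solve_ivp

# dy/dt = k * (target - y): a level variable adjusting towards a target.
K_ADJUST = 0.3    # assumed adjustment rate (1/year)
TARGET = 100.0    # assumed long-run level of the variable of interest

def f(t, y):
    return K_ADJUST * (TARGET - y)

sol = solve_ivp(f, t_span=(0.0, 15.0), y0=[10.0],
                t_eval=np.linspace(0.0, 15.0, 16))
for t, y in zip(sol.t, sol.y[0]):
    print(f"year {t:4.1f}: level {y:7.2f}")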

Fig. 2. The Systems Dynamics approach. Source: AnyLogic®.

The Systems Dynamics methodology considers the behavior over time of a level
variable, flows, and the related environment. The application of this methodology will
allow considering the interactions in a system to be modeled in a defined time horizon
allowing to generate environmental variables of result for the financial evaluation and
cost-effectiveness of the project.

2.3 Real Options


Real options is an approach widely used in recent years to address the financial evaluation of projects whose behavior differs from classical standards; it is necessary to capture the uncertainty of the relevant variables, as well as to evaluate decision-making during the development of the project. The project is considered as a financial option that can have the following statuses:
a) Expand
b) Collapse
c) Stop and restart or temporarily close operations
There are several option valuation approaches; among the most widely used are the binomial model and the Black-Scholes model [9]. One of the advantages of applying the real options approach in project evaluation is that it allows managing uncertainty so as to increase the value of the project over time.

2.4 Cost-Effectiveness Evaluation


Cost-effectiveness studies evaluate a project or policy based on expected results that are considered effective according to the objectives set. Unlike cost-benefit studies, the cost-effectiveness approach considers outcome variables directly related to the project objectives, so a project may be financially unprofitable, yet cost-effective. It is an evaluation mechanism that goes beyond financial data and focuses on the fulfillment of the project's expected results. Applications of cost-effectiveness studies have generally focused on the evaluation of health interventions, but their application in other sectors has been explored in recent years with positive results [10, 11]. The cost-effectiveness result is expressed as a relationship between the associated cost and the gains in the expected outcome, using the cost-effectiveness ratio:

Cost-effectiveness ratio = Cost of the intervention / Measure of effectiveness. (2)
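A small sketch of Eq. 2, applied incrementally against a comparator (as done in the case study of Sect. 3), is shown below. The cost and emission figures are those of the case study's Table 5, under one plausible reading of its "cost of avoided emissions" row; the printed ratios match that row up to rounding of the published inputs.

def ce_ratio(delta_cost, delta_effect):
    # Eq. 2 on differences vs. a comparator: $ per tCO2e avoided.
    return delta_cost / delta_effect

status_quo = {"cost": 289834, "emissions": 20.02}
suppliers = {"A": {"cost": 214957, "emissions": 14.29},
             "B": {"cost": 228761, "emissions": 17.44}}

for name, s in suppliers.items():
    savings = status_quo["cost"] - s["cost"]
    avoided = status_quo["emissions"] - s["emissions"]
    print(f"supplier {name}: ${ce_ratio(savings, avoided):,.0f} per tCO2e avoided")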

The results are displayed on what is known as the cost-effectiveness plane, which locates the alternatives according to the trade-off between the costs and the effectiveness of the intervention (Fig. 3).

Fig. 3. Cost-effectiveness plane

The desired results are those located in quadrant II, these being the most effective and least expensive options.

2.5 Multiparadigm Evaluation Proposal


The integration of these three methodologies makes it possible to evaluate the feasibility of an environmental project in terms of cost-effectiveness across several dimensions, prioritizing above all the generation of environmental results. The objective is to have sufficiently robust evidence to accept a project that generates environmental benefits, so that the evaluation does not focus exclusively on financial profitability variables. It is an iterative process: first, operational feasibility is evaluated with discrete simulation models that simulate the operation of the project over a time horizon and evaluate its performance measures. The second stage models the behavior of the environmental outcome variable with a dynamic model that determines its expected trajectory over the life of the project. Finally, based on the information collected in the previous stages, the financial and outcome evaluations are carried out using real options and the estimation of a cost-effectiveness outcome measure (Fig. 4).

Fig. 4. Multiparadigm evaluation

Each evaluation stage focuses on one element of the project, and each result generated is integrated into the final evaluation, which compares the results and costs of the alternatives on the cost-effectiveness plane.

3 Case Study

The information used corresponds to a small service company seeking to implement an energy efficiency project. A set of project-specific parameters was defined to evaluate the project's results in environmental and economic terms (Table 1). The company seeks to evaluate the feasibility of replacing the natural gas heaters in its production process with solar heaters.

Table 1. Parameters considered for the evaluation of the project.


Parameter Units Value
Total investment Thousands of dollars 500.00
GHG emissions tCO2e/MWh 0.527
Cash flow volatility % 40
Discount rate % 10
Reference rate % 7
Evaluation horizon (real options) Years 15
Real options scenarios Probability per scenario [0, 1]

These general parameters were used as inputs to evaluate the project with the
multiparadigm approach. The elaboration of each model is not detailed because the
objective of the study is to present the methodological structure of the multi-paradigm
evaluation.
The discrete simulation model is based on the daily operations of the company; its objective is to evaluate the feasibility of the technology change from gas heaters to solar heaters. Company operations were simulated for eight business hours a day, six days a week. Due to the intermittency of photovoltaic generation, the simultaneous operation of gas and solar heaters is evaluated, seeking to minimize the use of gas heaters as much as possible, which reduces operating costs and the emissions generated by the combustion of this fuel. For its part, the dynamic model simulates the emissions generated by the operation of the company once solar energy is integrated. It is important to note that the emissions generated are compared against the status quo and, in the same way, the cost-effectiveness evaluation compares the new technology with the status quo. Table 2 shows the main results of the discrete and dynamic simulation models over a simulation horizon of one year; in the case of the dynamic model, the horizon is long term.

Table 2. Discrete and dynamic simulation results.


Indicator Value
Average production per hour 100 units
Average layup buffer 15%
Average production cycle time 25 min
Average system utilization 85%
Average CO2 emissions per year 308.75 tCO2e
Dynamic system stabilization time 3 years
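As a back-of-the-envelope consistency check (not stated in the paper), Table 1's emission factor implies the annual energy use behind Table 2's emissions figure:

EMISSION_FACTOR = 0.527    # tCO2e/MWh (Table 1)
ANNUAL_EMISSIONS = 308.75  # tCO2e/year (Table 2)

implied_energy = ANNUAL_EMISSIONS / EMISSION_FACTOR
print(f"implied annual energy use: {implied_energy:.1f} MWh")  # about 585.9 MWh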

With the information generated by the simulation models, the financial evaluation is carried out using real options. Three scenarios are considered within the life of the project: expansion, contraction, and

closure. The evaluation is carried out in year 10 with the following considerations:
a) Increase the investment amount by 30%, with expenses of 30%.
b) Reduce the investment amount by 25%, with 28% savings.
c) Settle the project with a recovery value of 50%.
Once the probabilities that the initial Net Present Value (NPV) of the project will rise or fall are calculated using the binomial up-down model, the formula corresponding to each scenario is applied at each node of year (n); all periods are then discounted, removing the effect of the up and down probabilities and of the interest rate, to bring the value to the present. From the amount obtained, the exercise price (initial investment) is subtracted to obtain the present value of the project with real options (Table 3).

Table 3. Evaluation formulas for each type of real option.

Type of option Value
Option to expand by E%, investing I: FC_t = FC_0 + max(E · FC_0 − I, 0)
Option to contract by C%, reducing investment from I_1 to I_2: FC_t = max(FC_0 − I_1, C · FC_0 − I_2)
Option to defer or wait a period: FC_t = max(FC_n − I, 0)
Option to close or abandon with a liquidation value: FC_t = max(FC_t, L_t)
Option to close or abandon temporarily: FC_t = max(FC_n − cf − cv, E · FC_n − cf)
Option to choose among alternatives: FC_t = max(E · FC_n − I − C, FC_n + A, L)
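As an illustration of the binomial up-down valuation described above, the sketch below prices an expansion option on a recombining tree using Table 1's volatility and reference rate; the starting project value and expansion cost are assumed for the example, since the paper does not publish its cash-flow model.

import math

S0 = 500.0          # assumed present value of project cash flows (thousand $)
I_EXPAND = 150.0    # assumed cost of expanding
E = 0.30            # expansion adds 30% to project value (scenario a)
SIGMA = 0.40        # cash-flow volatility (Table 1)
R = 0.07            # reference rate (Table 1)
T, N = 10, 10       # option evaluated in year 10, yearly steps

dt = T / N
u = math.exp(SIGMA * math.sqrt(dt))     # up factor
d = 1.0 / u                             # down factor
p = (math.exp(R * dt) - d) / (u - d)    # risk-neutral up probability

# Terminal project values, each with the option to expand (Table 3, row 1).
values = []
for j in range(N + 1):
    v = S0 * (u ** j) * (d ** (N - j))
    values.append(v + max(E * v - I_EXPAND, 0.0))

# Roll back through the tree, discounting at the reference rate.
for step in range(N, 0, -1):
    values = [math.exp(-R * dt) * (p * values[j + 1] + (1 - p) * values[j])
              for j in range(step)]

print(f"project value with expansion option: {values[0]:.1f} thousand $")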

The financial evaluation of the project in the three scenarios generated the fol-
lowing results (Table 4).

Table 4. Option value in each scenario.


Option value ($)
Scenario 1 17,919
Scenario 2 −179,250
Scenario 3 −112,961

With these results, it is concluded that, in year 10, it is convenient neither to reduce the size of the project nor to liquidate it, but rather to expand it: not only does expansion present a higher NPV, but the value of the option is positive, indicating that it should be exercised.
Once the outcome measures and the costs incurred to achieve the project's environmental benefits have been determined, it is necessary to evaluate whether these actions are cost-effective, not only against the initial situation but also against other options available in the market, to ensure that the choice made is the best one for the company. To determine the most cost-effective option, we must determine the costs associated with each measure, as well as the elements related to the measurement of results: greenhouse gas (GHG) emissions, the emissions avoided, and the costs of achieving those savings. In the cost-effectiveness study, different proposals are evaluated so as to have a comparator (Table 5).

Table 5. Cost-effectiveness results.


Supplier A vs status quo Supplier B vs status quo
Costs, supplier $214,957 $228,761
Costs, status quo $289,834 $289,834
Emissions (tCO2e) 14.29 17.44
Emissions, status quo (tCO2e) 20.02 20.02
Cost of avoided emissions ($/tCO2e) $13,063 $23,684

Two providers of the technology to be implemented were evaluated; the costs of each option were estimated, and the outcome measures were considered for each alternative, to determine the most cost-effective option. Equipment A is cheaper than equipment B and also generates fewer GHG emissions. What makes it more cost-effective is that its cost per unit of avoided emissions is also lower, so the company can be confident that choosing supplier A is better than choosing supplier B.

4 Conclusions

The world economies are gradually moving towards more environmentally friendly
production models as a result of trying to harmonize the economic, social, and envi-
ronmental axes to achieve sustainable growth. The integration of environmental
objectives as millennium development goals is a great advance towards the protection
of the environment and the pursuit of sustainable development. One of the great challenges for countries is the evaluation and selection of the environmental projects with the greatest possible impact; until now, only financial profitability criteria have been considered, contemplating the recovery of the investment within a defined time, especially in projects with greater participation of private capital. However, this approach limits the environmental impact of many projects that could be profitable in environmental, though not necessarily financial, terms.
The use of a multi-paradigm evaluation approach allows a comprehensive evalu-
ation of various aspects of the project, from an operational, technical, environmental,
and financial perspective. The integration of discrete and dynamic simulation methods, as well as the financial evaluation of project uncertainty through real options, provides information of greater added value for decision-making on environmental projects. The integration of a results-evaluation approach in terms of cost-effectiveness makes it possible to weigh the environmental results against the investment made in the projects.

The case study shows the application of the proposed methodology to a reference environmental project that integrates the methodological approaches addressed. The results show that the integration of simulation methodologies, real options, and the cost-effectiveness approach provides robust and reliable information for decision-making. This is a first methodological proposal that seeks to build comprehensive approaches to the evaluation of environmental projects and, above all, to prioritize environmental criteria over financial ones.

References
1. Panwar, N.L., Kaushik, S.C., Kothari, S.: Role of renewable energy sources in environmental protection: a review. Renew. Sustain. Energy Rev. 15(3), 1513–1524 (2011)
2. Kronbak, L.G., Vestergaard, N.: Environmental cost-effectiveness analysis in intertemporal natural resource policy: evaluation of selective fishing gear. J. Environ. Manage. (2013)
3. Manzini, F., Islas, J., Macías, P.: Model for evaluating the environmental sustainability of
energy projects. Technol. Forecast. Soc. Chang. 78(6), 931–944 (2011)
4. Torres-Machi, C., Chamorro, A., Yepes, V., Pellicer, E.: Current models and practices of economic and environmental evaluation of sustainable network-level pavement management. J. Constr. 13(2), 49–56 (2014)
5. SEMARNAT: Guía de Programas de Fomento a la Generación de Energía con Recursos Renovables (2015). Available at https://www.gob.mx/cms/uploads/attachment/file/47854/Guia_de_programas_de_fomento.pdf
6. SENER: Prospectiva de Energías Renovables 2016–2030 (2016). Available at https://www.gob.mx/cms/uploads/attachment/file/177622/Prospectiva_de_Energ_as_Renovables_2016-2030.pdf
7. Schriber, T.J., Brunner, D.T.: How Discrete‐Event Simulation Software Works. In: Banks,
J. (Ed.) Handbook of Simulation (2007)
8. Rodríguez-Ulloa, R., Paucar-Caceres, A.: Soft system dynamics methodology: combining soft systems methodology and system dynamics. Syst. Pract. Action Res. 18(3) (2005)
9. Calle, A., Tamayo, V.: Decisiones de inversión a través de opciones reales. Estudios Gerenc.
25(111), 7–26 (2009)
10. Finnveden, G., et al.: Recent developments in Life Cycle Assessment. J. Environ. Manage.
91(1), 1–21 (2009)
11. Uchida, E., Rozelle, S.: Grain for Green: cost-effectiveness and sustainability of China's conservation set-aside program. Land Econ. 81(2), 247–264 (2005)
Plant Leaf Disease Recognition Using
Histogram Based Gradient Boosting
Classifier

Syed Md. Minhaz Hossain1,2 and Kaushik Deb1(B)


1 Chittagong University of Engineering and Technology, Chattogram 4349, Bangladesh
minhazpuccse@gmail.com
2 Premier University, Chattogram 4000, Bangladesh
debkaushik99@cuet.ac.bd

Abstract. Current techniques for plant leaf disease (PLD) recognition lack proper segmentation and struggle to locate similar disorders due to overlapping features in different plants. For this reason, we propose a framework to overcome the challenges of tracing Regions of Interest (ROIs) under different
image backgrounds, uneven orientations, and illuminations. Initially, mod-
ified Adaptive Centroid Based Segmentation (ACS) is applied to find K’s
optimal value from PLDs and then detect ROIs accurately, irrespective of
the background. Later, features are extracted using a modified Histogram
Based Local Ternary Pattern (HLTP) that outperforms for PLDs with
uneven illumination and orientation, capitalizing on linear interpolation
and statistical thresholds over neighbors. Finally, histogram-based gradient boosting is utilized to reduce bias towards similar features while detecting disorders. The proposed framework recognizes twelve PLDs with an overall accuracy of 99.34%, while achieving 98.51% accuracy for PLDs with more than one symptom type, for instance, both fungal and bacterial symptoms.

Keywords: Plant leaf disease recognition · Modified adaptive centroid-based segmentation · Histogram-based local ternary pattern · Histogram-based gradient boosting classifier

1 Introduction
Diagnosing and detecting plant diseases from leaf symptoms is complicated for farmers and agronomists. The complexity arises from various symptoms appearing on the same plant and from similar symptoms appearing in different plant diseases, and may lead to wrong conclusions about the status of plants and their proper treatment. Automated plant diagnosis using a mobile application, with images captured in the field, helps agronomists and farmers make better decisions on plant health monitoring. The growth of Graphical Processing Unit (GPU) embedded processors, machine learning, and artificial intelligence makes it possible to incorporate new models and methods to detect the appropriate ROIs and hence identify plant diseases correctly. However, memory space (number of parameters) remains a consideration for mobile-based PLD recognition.

Machine learning techniques mainly investigate localizing the ROIs, feature


extraction, and classification, such as in [9,11]. Limitations of learning-based
techniques are: a. lack of sensitivity to proper segmentation in different image
backgrounds and under different capture conditions and b. failing to trace similar
symptoms in different plant disorders.
The recent trend of Convolutional Neural Networks (CNNs) is to learn complex patterns from large amounts of data. State-of-the-art CNN architectures such as VGG in [5,13], GoogleNet in [8], ResNet50, ResNet101, ResNet152, and Inception V4 in [13], student-teacher CNN in [4], AlexNet in [3,5,8], and DenseNet in [13] have been applied to recognizing PLDs. Though CNNs achieve better results, tuning the parameters depends on the CNN architecture, to an extent. Furthermore, the space (memory) limitation, especially in handheld devices, needed to support such a high volume of network parameters is not considered. Last but not least, when exposed to a new dataset, CNNs fail to generalize, and their accuracy drops drastically [5,8].
Our primary emphasis is to modify K-means clustering to overcome the lack of sensitivity to proper segmentation noted in [10] and to remove noise, including unwanted objects beside the plant leaf or leaves. The modified ACS suggested here finds an optimal K such that it can (a) segment the appropriate disease symptoms in images with different backgrounds and uneven illumination and (b) identify disorders having similar symptoms. This work also employs the modified HLTP to alleviate the limitations of the traditional local ternary pattern (LTP); it outperforms its counterparts under uneven illumination and orientation. Detecting ROIs in complex backgrounds and extracting histogram features under various health states generalizes better when exposed to an unspecified dataset. As memory space is a significant factor for mobile devices, we propose a framework that recognizes PLDs using a histogram-based gradient boosting classifier instead of a CNN. It improves the PLD recognition rate compared with various machine learning algorithms on histogram features and reduces the memory cost compared with CNNs.
The rest of the paper is organized as follows. Section 2 reviews the related work; the proposed framework for recognizing plant leaf diseases is described in Sect. 3; experiments, performance evaluation, and observations are presented in Sect. 4; and, lastly, the conclusion is given in Sect. 5.

2 Related Work

Plant/crop-related machine-learning works are categorized into PLD recognition, prediction of crop production based on weather parameters, and post-harvest monitoring of grains in [1]. A study has been conducted to predict the correlations between weather parameters (temperature, rainfall, evaporation, and humidity) and crop production in [2]; for this, the authors design a fuzzy rule-based system using the Takagi-Sugeno-Kang approach. Besides, machine learning and image processing based PLD recognition frameworks have

several parts: localization of the disease symptoms (region of interest), feature extraction, and classification. Before localization, an image enhancement technique is used in [11]. However, it is not always mandatory to improve the intensity of plant leaf images, whose intensities change under different capture conditions and uneven illumination. In this work, two conditions based on statistical features are used to trace the changing pattern of plant images; this makes PLD detection robust and avoids unnecessary image enhancement.
The GrabCut algorithm in [9], the genetic algorithm in [11], and K-means clustering in [10] have been used to obtain the proper disease region in leaf images. In [10], K-means clustering shows a lack of sensitivity to proper segmentation due to improper initialization of K, and difficulty localizing multiple disorders in a PLD image. In [11], there are some misclassifications between two leaf spot conditions because of similar features. Our modified ACS overcomes the lack of segmentation sensitivity through automatic initialization of K from the plant leaf images; it also makes segmentation effective in different critical environments and backgrounds.
Texture features have been extracted by a histogram-based local binary pattern (LBP) in [9] and by a color co-occurrence matrix (local homogeneity, contrast, cluster shade, energy, and cluster prominence) in [11]. In [9], the histogram-based local binary pattern extracts better features under different orientations and uneven illumination. We use HLTP, a feature extraction method based on linear interpolation and a dynamic threshold: the neighbors found by interpolation let the method handle different orientations, and accounting for the variation of the neighbors' gray levels makes it invariant to illumination in recognizing PLDs.
Moreover, multiple classifiers have been used to recognize the correct PLD in
various works. One Class Support Vector Machine (OCSVM) is used in [9], and
SVM is used in [11] for recognizing PLD. Further, Minimum Distance Criterion
(MDC) is used in [11]. Though better accuracy is achieved in all of these works, there is still a lack of evidence of better recognition in cases of similar symptoms across different disorders.
There are also many works on recognizing various plant diseases using CNNs, but CNN-based PLD recognition frameworks still have some limitations that affect their performance. Some works are restricted to plain backgrounds, e.g., [5,8,13], or are inconsistent with image capturing conditions because no data augmentation is performed [7]. Finally, PLD models sometimes have a generalization problem on an independent dataset [5,8].
Using ensemble learning classifiers, we can reduce classifier bias and improve accuracy compared with single machine learning models, while using far fewer parameters than state-of-the-art CNN PLD recognition models. Though random forests take less time to build trees, gradient boosting classifiers perform better on benchmarks. Especially for histogram features, the histogram-based gradient boosting classifier performs well in terms of both memory cost and recognition rate compared with the plain gradient boosting classifier.

Fig. 1. The proposed framework for recognizing plant leaf disease.

In summary, automatic initialization in this framework's segmentation phase overcomes the lack of sensitivity to proper segmentation in [10] through the modified Adaptive Centroid Based Segmentation (ACS). The automatic initialization of K defined by ACS can effectively detect changes in image characteristics under different orientations and illuminations and improves generalization. This paper also explores the histogram-based local ternary pattern (HLTP) to alleviate the limitations of the traditional local ternary pattern (LTP); it outperforms under uneven illumination and orientation. Finally, a histogram-based gradient boosting classifier is used to classify PLDs because it classifies over histograms of features; this classifier is more suitable than a CNN for memory-restricted devices such as mobile phones. Besides, histogram-based features help the framework recognize the health status of newly added plant images, increasing generalization: accuracy does not fall drastically on newly added, diverse plant images, which overcomes the drastic validation drop of CNNs on new plant leaf images in [5,8].

3 Proposed Framework for Recognizing Plant Leaf


Diseases

In this section, the proposed framework is demonstrated in detail. Initially, the disease recognition framework optionally enhances the plant leaves' RGB images; then, modified adaptive centroid-based segmentation (ACS) is applied to trace the ROIs. After that, feature selection from the grayscale image is performed using the histogram-based local ternary pattern. Finally, the plant leaf disease is classified using a histogram-based gradient boosting classifier. The proposed PLD recognition framework is shown in Fig. 1.

3.1 Dataset

In the experiment, 403 images of size 256 × 256 pixels comprising eight differ-
ent plants, such as rice, corn, potato, pepper, grape, apple, mango, and cherry,
and twelve diseases are used to train the proposed framework. The images are

collected from the PlantVillage dataset1 except rice disease images. Rice dis-
ease images are gathered from the Rice diseases image dataset in Kaggle2 , the
International Rice Research Institute (IRRI)3 and Bangladesh Rice Research
Institute (BRRI)4 .
We vary the image backgrounds among natural, plain, and complex to trace
a disease properly in different backgrounds. Our framework includes six fungal
diseases, two bacterial diseases, two diseases having both fungal and bacterial
symptoms, one viral disease, and another one from a different category. Further,
the framework considers various symptoms, such as small, massive, isolated,
and spread. Twelve samples of eight plants are represented, considering different
symptoms and image backgrounds, as shown in Fig. 2. For generalization, 235
independent (excluding the training dataset) images from twelve different classes
are used during the test phase. Complete information regarding the plant leaf
disease dataset is described in Table 1.

Table 1. Dataset description of recognizing plant leaf disease.

Health condition Plant type Disease samples # of training images # of test images # of training images (health-wise) # of test images (health-wise)
Fungal Rice Blast 54 30 208 134
Potato Early-blight 42 39
Late-blight 21 10
Corn Northern-blight 50 30
Mango Sooty-mould 19 12
Cherry Powdery-mildew 22 13
Bacterial Rice Bacterial leaf-blight 65 30 115 60
Pepper Bacterial-spot 50 30
Fungal/Bacterial Rice Sheath-rot 20 10 35 17
Apple Black-rot 15 7
Virus Rice Tungro 10 5 10 5
Miscellenous Grape Black-measles 35 19 35 19
Total Images 403 235 403 235

3.2 Enhancing Image

If images are not captured precisely due to hostile conditions, image enhance-
ment is needed to increase the PLD image quality. The enhancement is optional
as it depends on the magnitude of degradation. Two enhancement conditions
have been used here, based on statistical features of a plant leaf image: the mean (µ), the median (x̃), and the mode (M0). The first condition for image enhancement is devised as in Eq. 1:

µ < x̃ < M0 (1)
1 https://www.kaggle.com/emmarex/plantdisease
2 https://www.kaggle.com/minhhuy2810/rice-diseases-image-dataset
3 https://www.irri.org/
4 http://www.brri.gov.bd/

Fig. 2. Samples of plant leaf disease images under numerous health conditions, in various backgrounds and with different symptoms: (a) Rice sheath-rot (natural background, spread symptoms), (b) Rice tungro (natural background, spread symptoms), (c) Rice bacterial leaf-blight (complex background, spread symptoms), (d) Rice blast (complex background, isolated small symptoms), (e) Potato early-blight (plain background, isolated small symptoms), (f) Potato late-blight (plain background, isolated small symptoms), (g) Pepper bacterial-spot (plain background, small symptoms), (h) Grape black-measles (plain background, small symptoms), (i) Corn northern leaf-blight (plain background, spread spot symptoms), (j) Apple black-rot (plain background, small symptoms), (k) Mango sooty-mould (natural background, spread symptoms), and (l) Cherry powdery-mildew (natural background, small symptoms).

According to Eq. 1, this image enhancement condition performs effectively in tracing ROIs whose color is identical to the background, as shown in Fig. 3(a1–c2). The second statistical condition for image enhancement is formulated as in Eq. 2; it is effective when there is a shadow of the leaf on the background, as shown in Fig. 3(a2–c4). Otherwise, the leaf image is converted directly to the L*a*b color space without enhancement.

µ < x̃ > M0 (2)
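A hedged sketch of the two conditions is shown below; the enhancement operation itself is not specified in the text, so histogram equalization via OpenCV is used purely as a placeholder.

import numpy as np
import cv2

def needs_enhancement(gray):
    mean = gray.mean()
    median = np.median(gray)
    mode = np.bincount(gray.ravel()).argmax()   # mode of 8-bit intensities
    # Eq. 1 (same-colored background) or Eq. 2 (leaf shadow on background).
    return (mean < median < mode) or (mean < median > mode)

def maybe_enhance(bgr):
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    if needs_enhancement(gray):
        return cv2.equalizeHist(gray)           # placeholder enhancement
    return gray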

3.3 Clustering by Adaptive Centroid Based Segmentation

The modified adaptive centroid-based segmentation (ACS) is applied once the PLD image quality has been enhanced. First, the RGB (PLD) image is converted to the L*a*b color space for better perceptual linearity in differentiating colors. This conversion significantly increases K-means clustering performance, especially when the symptom colors of different plant leaf disorders are only narrowly distinguishable. Differentiating color intensities when the ROI and the background have an identical color is non-trivial; another challenge is distinguishing the basic color of ROIs under the same shade of sunlight and with shadows on the background. To overcome these challenges, we

perform the L*a*b color conversion before segmentation. Figure 3(c2, c4) shows the improvement in segmentation compared with Fig. 3(c1, c3), which has extra noise in the PLD RGB image. Our modified ACS focuses on initializing the optimal K automatically from the leaf image, eliminating the sensitivity limitation of K in [10]. In traditional K-means, the Euclidean distance between each point and each centroid is calculated to decide whether the point belongs to a cluster. In the modified ACS, data points are first checked for eligibility using a statistical threshold; the distance between these eligible points and the centroids is then calculated, comparatively reducing the effort to form clusters and restricting mis-clustering of data points. The statistical threshold (ST) value is calculated by Eq. 3:

ST = sqrt( Σ_{i=1}^{N} (X_i − C)^2 / N ) (3)

where X_i, C, and N stand for the data points, the centroid of the data points, and the total number of data points, respectively.
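A minimal sketch of this screening idea is given below; the paper gives no pseudocode, so this is one plausible reading in which a point is assigned only if it lies within the statistical threshold of its nearest centroid.

import numpy as np

def statistical_threshold(points, centroid):
    # Eq. 3: root of the mean squared distance of the points to the centroid.
    return np.sqrt(np.mean(np.sum((points - centroid) ** 2, axis=1)))

def acs_assign(points, centroids):
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    thresholds = np.array([statistical_threshold(points, c) for c in centroids])
    eligible = dists.min(axis=1) <= thresholds[nearest]
    return np.where(eligible, nearest, -1)      # -1 marks screened-out points

# toy usage
pts = np.random.default_rng(0).normal(size=(200, 3))
labels = acs_assign(pts, pts[:3])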
The automatic initialization of K defined using ACS can effectively detect image characteristics under different orientations and illuminations. ACS also increases the scalability of the proposed segmentation technique, as shown in Fig. 3(c2, c4) versus Fig. 3(c1, c3). A few examples under different circumstances, such as same-colored reflections on ROIs, shadows behind the ROIs, overlapped blurred images, and leaf-image orientation changes such as shrunk and rotated ROIs, are shown in Fig. 4(b1–b5).

3.4 Selecting Features Using HLTP

Once the PLD image's ROIs have been traced, the RGB segments are converted to grayscale images, and HLTP is applied to extract the leaf disease features. We perform two variants of feature extraction, namely HLTP-1 (8 pixels with radius 1) and HLTP-2 (8 pixels with radius 2). First, four neighboring points are determined using Eq. 7–Eq. 10; the other four points are calculated using the linear interpolation coefficients for 45° in both HLTPs, as formulated in Eq. 11–Eq. 14.

a = r − ⌊r⌋ (4)
b = 1 − a (5)
f(n + a) = a · f(n + 1) + b · f(n) (6)
d0 = A(r0, c0 − r) − I (7)
d2 = A(r0, c0 + r) − I (8)
d4 = A(r0 − r, c0) − I (9)
d6 = A(r0 + r, c0) − I (10)
d1 = a · A(r0 + r − 1, c0 − r + 1) + b · A(r0 + r, c0 − r) − I (11)
d3 = a · A(r0 + r − 1, c0 + r − 1) + b · A(r0 + r, c0 + r) − I (12)
d5 = a · A(r0 − r + 1, c0 + r − 1) + b · A(r0 − r, c0 + r) − I (13)
d7 = a · A(r0 − r + 1, c0 − r + 1) + b · A(r0 − r, c0 − r) − I (14)
where a and b are interpolation coefficients and r is the radius. A(r0, c0) denotes the matrix of the PLD gray image I around the neighbor at position (r0, c0). In Eq. 6, f(n + a) is the unknown pixel, while f(n) and f(n + 1) are the two known pixels; the unknown pixels in Eq. 11–Eq. 14 are obtained by applying Eq. 6 to Eq. 7–Eq. 10. In Eq. 7–Eq. 14, d0 through d7 are the derivatives of all neighboring pixels. These derivatives are placed into a 1 × 8 vector d for each pixel Pi, i = 0, 1, 2, 3, ..., giving a total (m × n) × 8 matrix, where m is the width and n is the height of the plant leaf disease image. Then, a mean threshold (MT) for each pixel Pi is determined from its surrounding eight pixels, yielding two values per pixel: one containing the lower pattern values and another the upper pattern values, as formulated in [12]. From these, using histograms, we obtain two 1 × 256 vectors, one from the lower values and another from the upper values.
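A hedged sketch of these derivatives for a single pixel is given below; border handling and the coefficient definition follow one plausible reading of Eq. 4–Eq. 14, and the lower-pattern rule is one common LTP convention rather than the paper's stated formula.

import numpy as np

def hltp_codes(img, r0, c0, r=1):
    img = np.asarray(img, dtype=float)
    a = r - np.floor(r)                     # Eq. 4, as read from the text
    b = 1.0 - a                             # Eq. 5
    i = img[r0, c0]
    d = np.array([
        img[r0, c0 - r] - i,                                            # d0
        a * img[r0 + r - 1, c0 - r + 1] + b * img[r0 + r, c0 - r] - i,  # d1
        img[r0, c0 + r] - i,                                            # d2
        a * img[r0 + r - 1, c0 + r - 1] + b * img[r0 + r, c0 + r] - i,  # d3
        img[r0 - r, c0] - i,                                            # d4
        a * img[r0 - r + 1, c0 + r - 1] + b * img[r0 - r, c0 + r] - i,  # d5
        img[r0 + r, c0] - i,                                            # d6
        a * img[r0 - r + 1, c0 - r + 1] + b * img[r0 - r, c0 - r] - i,  # d7
    ])
    mt = d.mean()                           # dynamic mean threshold (MT)
    upper = (d >= mt).astype(int)           # upper pattern bits
    lower = (d <= -mt).astype(int)          # lower pattern bits (assumed rule)
    code = lambda bits: int(sum(bit << k for k, bit in enumerate(bits)))
    return code(lower), code(upper)         # indices into two 256-bin histograms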
Traditional LTP is limited under uneven illumination and orientation of the leaf image. In our modified HLTP, the mean threshold (MT) from [12] is used instead of a fixed threshold to overcome this drawback: it handles the variation of the neighbors' gray levels and makes the descriptor invariant to illumination, while using linear interpolation to determine the derivatives increases the ability to extract features from differently oriented plant leaf images. It outperforms traditional LTP, as shown in Fig. 3(d2–e2) and Fig. 3(d4–e4) compared with Fig. 3(d1–e1) and Fig. 3(d3–e3). Our modified HLTP also functions effectively with same-colored reflections on ROIs, shadows behind the ROIs, overlapped blurred images, and leaf-image orientation changes such as shrunk and rotated ROIs, as shown in Fig. 4(c1–f5).

3.5 Classifying Using Histogram-Based Gradient Boosting Classifier

Finally, the histogram-based gradient boosting classifier of [6] is used to recognize PLDs: the feature vectors developed by HLTP-1 and HLTP-2 are fed to the classifier. The histogram-based variant is chosen for its benchmark accuracy on histogram features and its lower computational cost compared with the plain gradient boosting classifier. Unlike the latter, the histogram-based classifier finds the optimum splitting points from feature histograms, so the computational complexity is reduced by the histogram data structure; moreover, its memory cost is O(#features × #data × 1 byte).
In the histogram-based gradient boosting classifier, a histogram with 255 bins is built for every feature, and the gradient and Hessian are calculated from the loss. As we classify 12 PLDs, we use categorical cross-entropy. Trees are expanded based on the information gain of every feature, which is evaluated using that feature's gradient and Hessian. The maximum depth of each tree is set to 20; each leaf includes a minimum of 30 PLD image samples, and each tree has at most 30 leaf nodes. As the histogram-based boosting classifier (inspired by LightGBM [6]) adds each best-split tree, a new gradient and Hessian are calculated to fit the next one. The boosting process was examined with maximum iterations from 10 to 1000 and learning rates from 0.1 to 1; our classification method reaches the minimum loss with a learning rate of 0.2 and a maximum of 100 iterations. The best-tuned parameters used to train the histogram-based gradient boosting classifier are given in Table 2.

Table 2. Parameters used in the histogram-based gradient boosting classifier for plant leaf disease recognition.

Parameters Value(s)
Loss function Categorical cross-entropy
Max iterations 100
Minimum samples in leaf node 30
Maximum leaf nodes 30
Max depth 20
Max bins 255
Learning rate 0.2
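The paper reports a Python/sklearn implementation, so Table 2's settings map naturally onto sklearn's HistGradientBoostingClassifier, as in the hedged sketch below; X_train and y_train stand for the 1024 HLTP features and the twelve PLD labels assumed to be prepared elsewhere.

from sklearn.ensemble import HistGradientBoostingClassifier

clf = HistGradientBoostingClassifier(
    max_iter=100,          # Table 2: max iterations
    learning_rate=0.2,     # Table 2: learning rate
    max_depth=20,          # Table 2: max depth
    max_leaf_nodes=30,     # Table 2: maximum leaf nodes
    min_samples_leaf=30,   # Table 2: minimum samples in leaf node
    max_bins=255,          # Table 2: max bins
)
# Multiclass log loss (categorical cross-entropy) is applied automatically
# for multiclass targets.
# clf.fit(X_train, y_train)
# y_pred = clf.predict(X_test)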

4 Results and Observations


In this section, the results of our experiments for recognizing plant leaf diseases
are presented.

Environment. The experiments for recognizing plant leaf disease were executed on an Intel(R) Core i5-7200U at 2.5 GHz with 4 GB RAM. The proposed framework is implemented in Python (with the sklearn package) and MATLAB.
Dataset for Training and Test. In this experiment, 403 images of eight plants, of size 256 × 256 pixels, are used for training, and 235 PLD images from twelve classes and different sources are used for testing. The statistics of the PLD training and test images are shown in Table 3.

Effect of Image Enhancement Conditions. From Fig. 3, it is observed that without image enhancement there is some noise in the segmentation, which further affects feature extraction in critical cases. The two image enhancement conditions perform effectively for ROIs with

Table 3. Dataset description according to the sources.

Source Plant type Disease samples # of training images # of test images # of training images (source-wise) # of test images (source-wise)
PlantVillage Pepper Bacterial-spot 50 30 254 160
Potato Early-blight 42 39
Late-blight 21 10
Corn Northern-blight 50 30
Mango Sooty-mould 19 12
Apple Black-rot 15 7
Cherry Powdery-mildew 22 13
Grape Black-measles 35 19
Kaggle Rice Blast 54 30 119 60
Bacterial leaf-blight 65 30
IRRI/BRRI Rice Sheath-rot 20 10 30 15
Tungro 10 5
Total Images 403 235 403 235

Table 4. Comparison among the experiments using traditional K-means clustering,


Local ternary pattern, modified adaptive centroid-based segmentation and modified
histogram-based local ternary pattern.

Frameworks Accuracy F1-score


Traditional K-means clustering + local ternary pattern 90% 88%
Traditional K-means clustering + HLTP 92.76% 89.4%
Modified ACS + local ternary pattern 94.89% 90.4%
Our PLD framework (modified ACS + HLTP) 99.34% 94.10%

the same color as the background, owing to the higher mode, as shown in Fig. 3(c2), and for a shadow of the leaf on the background, owing to the median being higher than the other two statistics, as shown in Fig. 3(c4).

Effect of Modified Adaptive Centroid Based Clustering. The automatic initialization of K defined using ACS can effectively detect image characteristics under different orientations and illuminations. ACS also increases the scalability of our modified segmentation technique, as shown in Fig. 3(c2, c4) versus Fig. 3(c1, c3). In various critical circumstances, such as same-colored reflections on ROIs, background and ROIs of the same color, and ROIs in natural backgrounds with shrunk, rotated, and overlapped blurred images, the modified ACS outperforms, as shown in Fig. 4(b1–b5).

Effect of Our HLTP on Feature Extraction. One thousand twenty-four (1024) histogram features (512 each from HLTP-1 and HLTP-2) are extracted using HLTP. The dynamic mean threshold handles the variation of

Fig. 3. Effect of image enhancement on recognizing plant leaf disease in critical situations: (a1) rice blast disease image and (a2) apple black-rot disease image. (b1) and (b2) are the leaf image histograms of a1 and a2, respectively. (c1) and (c3) are the color segmentation results of a1 and a2 in traditional K-means clustering, with extra noise, without image enhancement; (c2) and (c4) are the segmentation results of a1 and a2 in our modified color segmentation algorithm with image enhancement. (d1), (d3) and (e1), (e3) are the lower and upper features of traditional LTP, respectively. (d2), (d4) and (e2), (e4) are the lower and upper features of the modified HLTP, respectively.

the neighbors' gray levels and makes the features invariant to illumination. Linear interpolation in determining the derivatives increases the ability to extract features from differently oriented plant leaf images. It outperforms traditional LTP, as shown in Fig. 3(d2–e2) and Fig. 3(d4–e4) compared with Fig. 3(d1–e1) and Fig. 3(d3–e3). HLTP functions effectively with same-colored reflections on ROIs, shadows behind the ROIs, overlapped blurred images, and leaf-image orientations such as shrunk and rotated ROIs, as shown in Fig. 4(c1–f5). From Table 4, it is observed that our proposed PLD recognition using HLTP achieves a comparatively better accuracy of 99.34% and F1-score of 94.10%.

Effect of Histogram-Based Gradient Boosting Classifier. A total of 1024 features are applied to the histogram-based gradient boosting classifier, which reduces computational complexity thanks to its histogram data structure. It also reduces the bias caused by similar features across various PLDs, because classification operates on histograms over features, whose variance differentiates classes comparatively well. It improves accuracy over the other machine learning algorithms and requires less memory

Fig. 4. Processing examples of rice images in the proposed PLD framework under different critical environments: (a1–a5) are the RGB PLD samples; (b) segmented ROIs after applying adaptive centroid-based segmentation; (c) HLTP-1 lower features; (d) HLTP-1 upper features; (e) HLTP-2 lower features; and (f) HLTP-2 upper features.

Fig. 5. ROC curve for each plant leaf disease recognized by our framework.
Fig. 6. Confusion matrix for recognizing plant leaf diseases.

space than a CNN, so it is useful and reliable for recognizing PLDs in a mobile application.
Performance Analysis. Two hundred thirty-five (235) plant leaf disease images of twelve classes are used to evaluate the PLD recognition framework's

performance. The recognition rate of each class is shown in a confusion matrix


in Fig. 6. The summary of performance metrics, including accuracy, precision,
recall, and F1-score, are shown in Table. 5. Our PLD recognition framework
achieves accuracy, precision, recall, and F1 score of 99.34%, 94.66%, 93.54%, and
94.10%, respectively. For measuring the degree of separability among classes, the
ROC curve has been shown in Fig. 5. AUC (The area under the ROC curve) for
our proposed framework is 0.97. Minimum AUC is 0.85 for rice sheath-rot, and
the maximum of AUC is 1 for five classes such as pepper bacterial-spot, grape
black-measles, rice blast, cherry powdery-mildew, and rice tungro.
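The reported metrics can be computed with sklearn as in the hedged sketch below; y_true, y_pred, and the per-class probability matrix y_score stand for the classifier's test outputs, with random toy values used here only so the snippet runs standalone.

import numpy as np
from sklearn.metrics import (accuracy_score, precision_recall_fscore_support,
                             roc_auc_score)

rng = np.random.default_rng(0)
y_true = rng.integers(0, 12, size=235)           # 12 PLD classes, 235 test images
y_score = rng.random((235, 12))
y_score /= y_score.sum(axis=1, keepdims=True)    # per-class probabilities
y_pred = y_score.argmax(axis=1)

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)
auc = roc_auc_score(y_true, y_score, multi_class="ovr", average="macro")
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} "
      f"f1={f1:.4f} auc={auc:.2f}")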

Table 5. Performance evaluation of each classes using our proposed plant leaf disease
recognition framework.

Class TP FP FN Accuracy Precision Recall F1-score
Corn northern-blight 29 0 1 99.57% 100% 96.67% 98.30%
Pepper bacterial-spot 30 0 0 100% 100% 100% 100%
Grape black-measles 19 0 0 100% 100% 100% 100%
Rice blast 30 0 0 100% 100% 100% 100%
Potato early-blight 39 4 0 98.29% 90.69% 100% 95.12%
Apple black-rot 7 1 1 99.15% 87.5% 87.5% 87.5%
Mango sooty-mould 11 0 1 99.57% 100% 91.67% 95.65%
Cherry powdery-mildew 13 0 0 100% 100% 100% 100%
Rice bacterial leaf-blight 29 1 1 99.14% 96.67% 96.67% 96.67%
Potato late-blight 8 0 2 99.15% 100% 80% 88.89%
Rice sheath-rot 7 2 3 97.87% 77.78% 70% 73.69%
Rice tungro 5 1 0 99.57% 83.33% 100% 90.91%
Average 99.34% 94.66% 93.54% 94.10%

For further evaluation, we compare the performance of our PLD recognition with the benchmark methods proposed by Pantazi et al. [9] and Singh et al. [11] on our dataset. The method in [9] is significant for its high generalization using histogram features and its ability to overcome the intrinsic challenges (segmentation, and different disorders with similar symptoms) under uncontrolled capture conditions. To compare with [9], the GrabCut algorithm for segmentation, the histogram-based local binary pattern for feature extraction, and a one-class SVM for classification were executed on our dataset. The method in [11] is significant for its automatic initialization of clustering centers and its generalization; to compare with [11], the genetic algorithm for segmentation, the color co-occurrence method for feature extraction, and an SVM for classification were executed on our dataset. Table 6 shows that our proposed PLD recognition framework performs relatively better than the methods proposed in [9] and [11], achieving an accuracy and F1-score of 99.34% and 94.10%, respectively.

Table 6. Comparison of performance evaluation with other state-of-the-art plant leaf


disease recognition frameworks.

Class Our framework Method in [9] Method in [11]
Accuracy F1-score Accuracy F1-score Accuracy F1-score
Corn northern-blight 99.57% 98.30% 95.74% 84.85% 97.02% 87.96%
Pepper bacterial-spot 100% 100% 100% 100% 99.15% 96.67%
Grape black-measles 100% 100% 97.87% 85.71% 97.45% 85%
Rice blast 100% 100% 99.15% 96.77% 99.58% 96.77%
Potato early-blight 98.29% 95.12% 98.72% 96.30% 97.46% 92.86%
Apple black-rot 99.15% 87.5% 99.15% 83.33% 97.02% 60%
Mango sooty-mould 99.57% 95.65% 97.00% 55.55% 98.30% 76.13%
Cherry powdery-mildew 100% 100% 99.15% 91.72% 98.30% 84.62%
Rice bacterial leaf-blight 99.14% 96.67% 97% 89.66% 96.60% 85.19%
Potato late-blight 99.15% 88.89% 99.15% 84.21% 97.02% 58.83%
Rice sheath-rot 97.87% 73.69% 96.59% 60% 94.46% 31.58%
Rice tungro 99.57% 90.91% 99.57% 83.33% 98.72% 63.31%
Average 99.34% 94.10% 97.59% 76.57% 98.26% 85.02%

These results are superior to the accuracy achieved by the state-of-the-art methods. Moreover, we compare the results of the PLD recognition framework using histogram-based gradient boosting with a CNN-based PLD recognition model. As we have a small number of PLD images, we augment them using rotation, shifting, scaling, flipping, and brightness and contrast changes. Then, considering the number of network parameters, we execute the state-of-the-art convolutional architecture AlexNet (input images of 224 × 224) with ImageNet weights, which achieves 99.25% accuracy, as shown in Table 7.

Table 7. Comparison between PLD recognition using histogram-based gradient boost-


ing classifier and state-of-the-art CNN model.

Method/Network Accuracy #Network/Learning Parameters Storage required


Our proposed framework 99.34% 6 0.62 MB
AlexNet 99.25% 6.4 M 25.6 MB

Critical Analysis. Our framework recognizes diseases well under different illumination and in natural and complex backgrounds. However, there are still some misclassifications, as shown in Fig. 7(a–h). Analysis of these misclassifications shows that PLD images are misclassified due to multiple disease symptoms and changed symptom features, such as shape. These challenges are left for future work, where not only the colors or intensities of ROIs in spatial order but also geometric features will be considered as features.

Fig. 7. Some misclassified images: (a), (b), (c) are false positive rice sheath-rot images; (d) is rice bacterial leaf-blight; (e) and (f) are false positive potato late-blight images; (g) is a false positive apple black-rot image; and (h) is a false positive corn northern leaf-blight image.

5 Conclusion and Future Work


In our PLD recognition framework, ROIs are initially detected by modified ACS
with automatic initialization of K. Then features have been extracted by HLTP.
Finally, classification has been done by a histogram-based gradient boosting
classifier. Our proposed PLD framework overcomes existing PLD recognition
limitations, such as having image backgrounds, similar features in different dis-
orders, and under uneven illumination and orientation for uncontrolled captured
images. ACS eliminates the lack of sensitivity of k in K-means clustering [10] and
performs effectively irrespective of the image backgrounds and similar features
in different disorders. HLTP overcomes other challenges of PLD detection under
uncontrolled capturing. Using linear interpolation and dynamic mean threshold,
HLTP handles the orientation and variation of neighbors’ grey level. In this work,
some diseases having fungal and bacterial symptoms such as rice sheath-rot and
apple black-rot are recognized in a better rate of, on average, 98.51%, as shown in
Table 5. Our PLD recognition framework achieves an average of 99% of accuracy
for PLD with similar symptoms such as potato early-blight, potato late-blight,
and corn northern-blight, as shown in Table 5. However, our proposed framework
performs well and having high generalization ability but still has limitations of
detecting multiple diseases. It can be solved by concatenating ROIs of multiple
diseases.

References
1. Vasilyev, A.A., Vasilyev, G.N.S.: Processing plants for post-harvest disinfection of
grain. In: Proceedings of the 2nd International Conference on Intelligent Comput-
ing and Optimization (ICO 2019) , Advances in Intelligent Systems and Computing
1072, 501–505 (2019)
2. Borse, K., Agnihotri, P.G.: Prediction of crop yields based on fuzzy rule-based
system (FRBS) using the takagi sugeno-kang approach. In: Proceedings of the
International Conference on Intelligent Computing and Optimization (ICO 2018),
Advances in Intelligent Systems and Computing 866, 438–447 (2018)

3. Boulent, J., Foucher, S., Théau, J., St-Charles, P.L.: Convolutional neural networks
for the automatic identification of plant diseases. Frontiers in Plant Science 10
(2019)
4. Brahimi, M., Mahmoudi, S., Boukhalfa, K., Moussaoui, A.: Deep interpretable
architecture for plant diseases classification. In: Signal Processing: Algorithms,
Architectures, Arrangements, and Applications (SPA), pp. 111–116. IEEE (2019)
5. Ferentinos, K.P.: Deep learning models for plant disease detection and diagnosis.
Comput. Electron. Agriculture 145, 311–318 (2018)
6. Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., Liu, T.-Y.: LightGBM: a highly efficient gradient boosting decision tree. In: 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, pp. 1–3 (2017)
7. Liang, W.J., Zhang, H., Zhang, G.F., Cao, H.X.: Rice blast disease recognition
using a deep convolutional neural network. Scientific Reports 9(1), 1–10 (2019)
8. Mohanty, S.P., Hughes, D.P., Salathé, M.: Using deep learning for image-based
plant disease detection. Front. Plant Sci. 7, 1419 (2016)
9. Pantazi, X., Moshou, D., Tamouridou, W.: Automated leaf disease detection in dif-
ferent crop species through image feature analysis and one class classifiers. Comput.
Electron. Agric. 156, 96–104 (2019)
10. Sharma, P., Berwal, Y.P.S., Ghai, W.: Performance analysis of deep learning cnn
models for disease detection in plants using image segmentation. Information Pro-
cessing in Agriculture (2019)
11. Singh, V., Misra, A.: Detection of plant leaf diseases using image segmentation and
soft computing techniques. Inf. Process. Agric. 4, 41–49 (2017)
12. Rassem, T.H., Khoo, B.E.: Completed local ternary pattern for rotation invariant texture classification. The Scientific World Journal (2014)
13. Too, E.C., Yujian, L., Njuki, S., Yingchun, L.: A comparative study of fine-tuning
deep learning models for plant disease identification. Comput. Electron. Agric.
161, 272–279 (2019)
Exploring the Machine Learning
Algorithms to Find the Best Features
for Predicting the Breast Cancer
and Its Recurrence

Anika Islam Aishwarja1(B), Nusrat Jahan Eva1, Shakira Mushtary1, Zarin Tasnim1, Nafiz Imtiaz Khan2, and Muhammad Nazrul Islam2

1 Department of Information and Communication Engineering, Bangladesh University of Professionals, Dhaka, Bangladesh
anika.i.aishwarja@gmail.com
2 Department of Computer Science and Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh
nazrul@cse.mist.ac.bd

Abstract. Every year around one million women are diagnosed with breast cancer. Conventionally it seems like a disease of the developed countries, but the fatality rate in low- and middle-income countries is preeminent. Early detection of breast cancer turns out to be beneficial for clinical and survival outcomes. Machine learning algorithms have been effective in detecting breast cancer. In the first step, four distinct machine learning algorithms (SVM, KNN, Naive Bayes, Random Forest) were implemented to show how their performance varies across datasets having different sets of attributes but the same number of data instances, for predicting breast cancer and its recurrence. In the second step, the different sets of attributes were analyzed in relation to the performance of the classification algorithms, in order to select cost-effective attributes. As outcomes, the most desirable performance was observed by KNN in breast cancer prediction and by SVM in the recurrence of breast cancer. Again, Random Forest predicts better for recurrence of breast cancer and KNN for breast cancer prediction when fewer attributes are considered in both cases.

Keywords: Breast cancer · Prediction · Recurrence · Attribute selection · Data mining · Machine learning

1 Introduction
Breast cancer is the most common cancer among females. Around 10% of females are affected by breast cancer at some stage of their life. Again, among cancer-affected women, 34.3% are affected by breast cancer, which shows a high mortality around the world [4]. Most breast cancers begin in the cells that line the ducts, while fewer breast cancers start in the cells lining the lobules [4]. The causes of breast cancer are multi-factorial, involving family history, obesity, hormones, radiation therapy, and even reproductive factors [1]. The diagnosis

of breast cancer is carried out by classifying the tumor. Tumors can be either
benign or malignant. In a benign tumor, the cells grow abnormally and form
a lump, and do not spread to other parts of the body [5]. Malignant tumors are more harmful than benign ones. Unfortunately, not all physicians are experts in distinguishing between benign and malignant tumors, and the classification of tumor cells may take up to two days [3].
Information and Communication Technologies (ICT) can play potential roles
in cancer care. For example, data mining approaches applied to medical science
topics rise rapidly due to their high performance in predicting outcomes, reduc-
ing costs of medicine, promoting patients’ health, improving healthcare value
and quality, and making real-time decisions to save people’s lives [2]. Again,
Machine Learning (ML) algorithms gain insight from labeled samples and make predictions for unknown samples, and are widely used in health informatics [11,14–16], predicting autism [24], and the like. Again, ML has become very impactful in several fields like disease diagnosis, disease prediction, biomedicine, and other engineering fields [25,26]. It is an application of artificial intelligence
(AI) that utilizes the creation and evaluation of algorithms that facilitate pre-
diction, pattern recognition, and classification [9]. Machine learning algorithms
like Support Vector Machine (SVM), Random Forest (RF), Naive Bayes (NB),
and k Nearest Neighbours (KNN) are frequently used in medical applications
such as the detection of the type of cancerous cells. The performance of each
algorithm is generally varied in terms of accuracy, sensitivity, specificity, preci-
sion, and recall for predicting the possibility of being affected by diseases like
diabetes, breast cancer, etc.
Over the last few years, a number of studies have been conducted on the
prediction of breast cancer. Those studies have focused on either the existence
of the cancer or the recurrence of the cancer. Therefore, this study has the following objectives: first, to explore the performance of different classifier models considering different sets of data having an equal number of data objects for
predicting breast cancer and its recurrence. Second, to analyze the performance
of the predictors, while considering different sets of attributes.
This paper is organized into five separate sections as follows. Section 2 high-
lights the published literature focusing on breast cancer prediction models using
data mining techniques. Section 3 explains the detailed description of the data, the various prediction algorithms, and their performance measures. The prediction
results of all the classification and regression algorithms along with the accu-
racy, sensitivity, and specificity are presented in Sect. 4. Section 5 concludes with
a summary of results and future directions.

2 Literature Review

Machine learning algorithms have been used for a long time in the case of pre-
dicting and diagnosing breast cancer. Several ML related studies were conducted
using the Breast Cancer Wisconsin (Original) dataset. For example, Asri et al. [2] compared four different algorithms, namely SVM, Decision Tree, Naive Bayes,
and KNN, to find the best-performing classifier for breast cancer prediction. As an outcome, SVM showed the best accuracy of 97.13%. Karabatak and Ince [13] proposed an automatic diagnosis system for detecting breast cancer using Association Rules (AR) and Neural Networks (NN), and compared it with the Neural Network model alone. In the test stage, they used the 3-fold cross-validation method and the classification rate was 95.6%. Islam et al. [12] proposed a model based on SVM and KNN and compared it with other existing SVM-based models, namely St-SVM, LPS-SVM, LP-SVM, LSVM, SSVM, and NSVM, and found that their model shows the highest accuracy, sensitivity, and specificity. In addition to this dataset, Huang et al. [10] used another dataset having 102294 data
samples with 117 different attributes to compare the performance of SVM clas-
sifiers and SVM ensembles. They constructed the SVM classifiers using kernel
functions (i.e., linear, polynomial, and RBF) and SVM ensembles using bagging
and boosting algorithms. They found that for both datasets SVM ensembles
performed better than classifiers. Khourdif and Bahaj [18] implemented Ran-
dom Forest, Naive Bayes, SVM, KNN, Multilayer Perceptron (MLP) in WEKA
to select the most effective algorithm with and without the Fast Correlation-
Based Feature selection (FCBF). This method was used to filter irrelevant and
redundant characteristics to improve the quality of cancer classification. The
experimental results showed that SVM provided the highest accuracy of 97.9%
without FCBF. Khourdif and Bahaj [17] used several machine-learning algo-
rithms including Random Forest, Naive Bayes, Support Vector Machines, and
K-Nearest Neighbors for determining the best algorithm for the diagnosis and
prediction of breast cancer using WEKA tools. Among 699 cases they have taken
30 attributes with 569 cases and found the SVM as the most accurate classifier
with accuracy of 97.9%.
Again, to classify recurrent and non-recurrent cases of breast cancer, Ojha and Goel [27] used the Wisconsin Prognostic Breast Cancer dataset (194 cases with 32 attributes). Four clustering algorithms (K-means, Expectation-Maximization (EM), Partitioning Around Medoids (PAM), and Fuzzy c-means) and four classification algorithms (SVM, C5.0, Naive Bayes, and KNN) were used. The study showed that SVM and C5.0 achieved the best accuracy of 81%, and that the classification algorithms are better predictors than the clustering algorithms.
Li et al. [20] detected the risk factors of breast cancer and multiple common risk
factors adopting the Implementing Association Rule (IAR) algorithm and N-IAR
algorithm (for n-item) using a dataset of Chinese women having 83 attributes
and 2966 samples. Experimental results showed that the model based on the
ML algorithm is more suitable than the classic Gail model. Stark et al. [25]
proposed a new model with Logistic Regression, Gaussian Naive Bayes, Deci-
sion Tree, Linear Discriminant Analysis, SVM, and Artificial Neural Network
(ANN), trained using a dataset of 78,215 women (aged 50–78). These models predicted better than the previously used BCRAT (Gail model). Delen et al. [7] applied Artificial Neural Networks (ANN), Decision Trees, and Logistic Regression to predict the survivability rate of breast cancer patients in WEKA using the SEER

Table 1. Summary of related studies

Ref  Objective  No. of features  ML technique  Results (Accuracy, Specificity, Sensitivity, Precision)
[2] Estimating the 11 C4.5 95.13 0.95 (Benign) 0.96 (Benign)
definiteness in 0.94 (Malignant) 0.91 (Malignant)
classifying data SVM 97.13 0.97 (Benign) 0.98 (Benign)
0.96 (Malignant) 0.95 (Malignant)
NB 95.99 0.95 (Benign) 0.98 (Benign)
0.97 (Malignant) 0.91 (Malignant)
k-NN 95.27 0.97 (Benign) 0.95 (Benign)
0.91 (Malignant) 0.94 (Malignant)
[17] Predicting breast 11 K-NN 96.1 0.961 0.961
cancer detection SVM 97.9 0.979 0.979
and the risk of RF 96 0.960 0.960
death analysis.
NB 92.6 0.926 0.926
[13] An automatic 11 NN 95.2
diagnosis system AR1 + NN 97.4
for detecting BC. AR2 + NN 95.6
[12] Developing a 11 LPSVM 97.1429 95.082 98.2456
classification model LSVM 95.4286 93.33 96.5217
for breast cancer
SSVM 96.5714 96.5517 96.5812
prediction
PSVM 96 93.4426 97.3684
NSVM 96.5714 96.5517 96.5812
St-SVM 94.86 93.33 95.65
Proposed 98.57 95.65 100
model using
SVM
Proposed 97.14 92.31 100
model using
K-NN
[20] Finding the best 83 Logistic 0.8507 0.8403 0.8594
classifier Regression
Decision 0.9174 0.9262 0.9124
Tree
Random 0.8624 0.8577 0.8699
Forest
XGBoost 0.9191 0.9275 0.9142
LightGBM 0.9191 0.9248 0.9164
MLP 0.8128 0.8289 0.8059
Gail 0.5090 0.1880 0.5253
[27] Detecting and 10 SVM 70%
predicting breast KNN 68%
cancer RF 72%
Gradient 75%
Boosting
[18] Filtering irrelevant 11 KNN 94.2% 0.942 0.942
and redundant data SVM 96.1% 0.96 0.961
in order to improve RF 95.2% 0.953 0.952
quality of cancer
NB 94% 0.94 0.94
classification
Multilayer 96.3% 0.963 0.963
Perceptron
[21] Building an 17 AUC 0.8805 0.2325 0.9814
integration decision AUC with 0.7422 0.7570 0.7399
tree model for under-
predicting breast sampling
cancer survivability ratio of
15%
Bagging 0.7659 0.7859 0.7496
algorithm
[7] Developing 72 ANN 0.9121 0.8748 0.9437
prediction models Decision 0.9362 0.9066 0.9602
for breast cancer trees
survivability Logistic 0.892 0.8786 0.9017
regression

Cancer Incidence Public-Use Database (433272 cases with 72 attributes), while the Decision Tree showed the best result [21].
The summary of the literature review is shown in Table 1. A few important observations emerge from this literature review. Firstly, machine learning has played a significant role in predicting the possibility of having breast cancer and its recurrence, and in the diagnosis of breast cancer. Secondly, several studies have been conducted focusing on the performance and comparison of algorithms. Thirdly, both the SVM and Naive Bayes algorithms showed comparatively better prediction accuracy than other ML algorithms. Fourthly, though a number of studies used different datasets having various attributes, little attention has been paid to exploring how the attributes of each dataset impact the overall performance of the algorithms. Thus, this research focuses on analyzing machine learning algorithms on different datasets having different sets of attributes.

3 Methodology

This section discusses the implementation of the four machine learning algorithms. The overview of the study methodology is presented in Fig. 1.

Fig. 1. The overview of the study methodology

3.1 Data Acquisition

At the first stage, four datasets were selected from the UCI machine learning repository [8]. The “Breast Cancer Wisconsin (Original) dataset”, as dataset 1, having 11 attributes and 699 instances, and the “Breast Cancer Wisconsin (Diagnostic) dataset”, as dataset 2, having 32 attributes and 569 instances, were used for predicting breast cancer. Both of these datasets include two classes, namely Benign (B) and Malignant (M). As these datasets have different sets of attributes and a different number of instances, the smaller number of 569 instances was also taken from dataset 1. Similarly, the “Breast Cancer Wisconsin (Prognostic) dataset” was used as dataset 3 and the “Breast Cancer Data Set” as dataset 4 for predicting the recurrence of breast cancer. Dataset 3 has 33 attributes and 198 instances, while dataset 4 has 10 attributes and 286 instances. Dataset 3 and dataset 4 include two classes, namely recurrence and non-recurrence. There are 30 common attributes between dataset 2 and dataset 3. However, some instances were deleted from dataset 4 due to noisy data, leaving 180 instances. Similarly, 180 instances were also taken from dataset 3 (see Table 2).

Table 2. Summary of the selected datasets

Dataset    Source                                           Classes                       Attributes  Instances
Dataset 1  Breast Cancer Wisconsin (Original) Data Set      Benign / Malignant            11          569
Dataset 2  Breast Cancer Wisconsin (Diagnostic) Data Set    Benign / Malignant            32          569
Dataset 3  Breast Cancer Wisconsin (Prognostic) Data Set    Recurrence / Non-recurrence   33          180
Dataset 4  Breast Cancer Data Set                           Recurrence / Non-recurrence   10          180

3.2 Data Pre-processing


Data pre-processing has a huge impact on the performance of ML algorithms, as irrelevant and redundant data can lead to erroneous outputs. As part of data pre-processing, the following tasks were performed meticulously: (a) duplicate rows existing in each of the datasets were removed; (b) missing values in the datasets were handled properly: missing numerical attribute values were replaced with the mean value of the particular column, whereas missing categorical attribute values were replaced with the most frequent value in the particular column; (c) the attributes having text/string inputs were encoded to numerical class values, as machine learning algorithms cannot deal with strings; (d) numerical values were normalized to values between zero and one, as it is easier for ML algorithms to deal with small values; and (e) a random 80-20 train-test split was applied, where 80% of the data was considered as the training set and the rest as the test set. Since ML models may be biased towards a particular class if an equal number of class instances does not exist in the training data [19], the Synthetic Minority Over-sampling Technique (SMOTE) [6] was used, which removes class imbalance issues.
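The sketch below makes steps (a)-(e) and the SMOTE step concrete using pandas, scikit-learn, and imbalanced-learn. It is a minimal sketch under stated assumptions: the tiny synthetic frame and its column names are illustrative stand-ins for the UCI data, and SMOTE's k_neighbors is reduced because the toy sample is small.

```python
# A minimal, assumed sketch of the pre-processing pipeline described above.
from collections import Counter
import numpy as np
import pandas as pd
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "clump_thickness": rng.integers(1, 11, 40).astype(float),   # made-up column names
    "menopause": rng.choice(["lt40", "ge40", "premeno"], 40),
    "class": rng.choice([0, 0, 0, 1], 40),                      # deliberately imbalanced
})
df.loc[3, "clump_thickness"] = np.nan                           # inject a missing value

df = df.drop_duplicates()                                       # (a) remove duplicates
num = df.select_dtypes("number").columns
df[num] = df[num].fillna(df[num].mean())                        # (b) numeric -> column mean
for col in df.select_dtypes("object").columns:
    df[col] = df[col].fillna(df[col].mode()[0])                 # (b) categorical -> mode
    df[col] = df[col].astype("category").cat.codes              # (c) encode strings

X, y = df.drop(columns="class"), df["class"]
X[X.columns] = MinMaxScaler().fit_transform(X)                  # (d) scale to [0, 1]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          random_state=42, stratify=y)  # (e) 80-20 split
X_tr, y_tr = SMOTE(k_neighbors=3, random_state=42).fit_resample(X_tr, y_tr)
print(Counter(y_tr))   # the training classes are now balanced
```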

3.3 Model Building and Analysis


3.3.1 Random Forest
Random Forest is a supervised learning algorithm. It builds multiple decision trees and merges them to get a more accurate and stable prediction. Some interesting findings can be observed by using the Random Forest method as the prediction model. Considering the predefined class attributes, the prediction results using Random Forest are shown in Fig. 2. Here, datasets are shown on the X-axis, and performance measures are shown on the Y-axis. Different datasets come with different attributes, and the result of the data analysis is shown separately for each class attribute. Considering the benign-malignant class attribute, the highest accuracy was obtained on dataset 2. Similarly, for the recurrence-non-recurrence class attribute, the highest accuracy was obtained on dataset 3. The overall performance on dataset 2 was the best, as the accuracy, precision, sensitivity, and specificity were 95.9%, 97.2%, 96.3%, and 96.2%, respectively.

Fig. 2. Result of data analysis using Random Forest: (a) predicting using dataset 1 and dataset 2; (b) predicting using dataset 3 and dataset 4.
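For concreteness, a minimal sketch of such a Random Forest experiment follows, using the copy of the Wisconsin Diagnostic data that ships with scikit-learn (dataset 2). The split and hyperparameters are assumptions, so the printed scores will not exactly reproduce Fig. 2.

```python
# A minimal sketch (not the authors' exact experiment): Random Forest on the
# Wisconsin Diagnostic data bundled with scikit-learn, reporting the four
# measures used in this paper. In this dataset, label 1 means benign.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

rf = RandomForestClassifier(n_estimators=100, random_state=42)  # many trees, merged by voting
rf.fit(X_tr, y_tr)

tn, fp, fn, tp = confusion_matrix(y_te, rf.predict(X_te)).ravel()
print("accuracy   :", (tp + tn) / (tp + tn + fp + fn))
print("precision  :", tp / (tp + fp))
print("sensitivity:", tp / (tp + fn))
print("specificity:", tn / (tn + fp))
```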

3.3.2 KNN
Among the supervised machine learning algorithms, K-nearest neighbors (KNN) is one of the most effective techniques. It performs classification on certain data points [23]. The KNN algorithm is a type of supervised ML algorithm that can be used for both classification and regression predictive problems. It uses ‘attribute similarity’ to predict the values of new data points: a new data point is assigned a value based on how closely it matches the points in the training set. This algorithm has been applied based on different sets of attributes of different datasets, and the results are shown in Fig. 3. The highest accuracy was acquired on dataset 1 (for the benign-malignant class attribute) and dataset 3 (for the recurrence-non-recurrence class attribute). Moreover, the best performance was obtained on dataset 1, as the accuracy, precision, sensitivity, and specificity were 95.9%, 96%, 98.5%, and 93.42%, respectively.

Fig. 3. Result of data analysis using KNN: (a) predicting using dataset 1 and dataset 2; (b) predicting using dataset 3 and dataset 4.
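A minimal sketch of the KNN classifier follows, again on scikit-learn's copy of the Wisconsin Diagnostic data. Because KNN relies on attribute similarity (distances), the features are scaled first; the values of k tried here are assumptions, as the paper does not state which k was used.

```python
# A minimal sketch of KNN classification; k values are assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

X, y = load_breast_cancer(return_X_y=True)
X = MinMaxScaler().fit_transform(X)  # KNN is distance-based, so scale features first
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

for k in (3, 5, 7):
    knn = KNeighborsClassifier(n_neighbors=k)  # label = majority vote of k closest points
    print(f"k={k}: accuracy={knn.fit(X_tr, y_tr).score(X_te, y_te):.4f}")
```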

3.3.3 SVM
SVMs are a set of related supervised learning methods that analyze data to recognize patterns and are used for classification and regression analysis [22]. SVM is an algorithm that attempts to find a linear separator (hyperplane) between the data points of two classes in multidimensional space. The result of adopting SVM on the predefined class attributes is shown in Fig. 4. The highest accuracy was found for dataset 2 (for the benign-malignant class attribute) and dataset 4 (for the recurrence-non-recurrence class attribute). However, the overall best performance was observed for dataset 2, as the accuracy, precision, sensitivity, and specificity were 97.2%, 97.3%, 99.07%, and 95.2%, respectively.

3.3.4 Naïve Bayes

Naive Bayes is a quick method for the creation of statistical predictive models based on the Bayesian theorem [27]. This classification technique analyses the relationship between each attribute and the class for each instance, to derive a conditional probability for the relationship between the attribute values and the class. Findings on the performance of this algorithm vary across datasets because of the selection of attributes, as shown in Fig. 5. For the benign-malignant class attribute, the highest accuracy was found for dataset 2, and for the recurrence-non-recurrence class attribute, the highest accuracy was observed for dataset 3. The overall best performance can be observed for dataset 2, as the accuracy, precision, sensitivity, and specificity were 92.4%, 92.6%, 95.5%, and 97.6%, respectively.

Fig. 4. Result of data analysis using SVM: (a) predicting using dataset 1 and dataset 2; (b) predicting using dataset 3 and dataset 4.

Fig. 5. Result of data analysis using Naive Bayes: (a) predicting using dataset 1 and dataset 2; (b) predicting using dataset 3 and dataset 4.
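A minimal sketch of Naive Bayes follows; predict_proba exposes the per-class conditional probabilities that the text describes. Using the Gaussian variant is an assumption, as the paper does not name one.

```python
# A minimal sketch of Gaussian Naive Bayes; the Gaussian variant is an assumption.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

nb = GaussianNB().fit(X_tr, y_tr)
print("accuracy:", nb.score(X_te, y_te))
# Per-class conditional probabilities for the first test sample:
print("P(class | x):", nb.predict_proba(X_te[:1]))
```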

4 Discussion

The analysis and results discussed in Sect. 3 highlight the diverse performance of the attributes used in the different datasets, keeping the same number of data instances and applying different machine learning classification algorithms.
The summary of these findings is shown in Table 3. Here, dataset 1 has 11 attributes and dataset 2 has 32 attributes, but both of them have 569 instances. Different levels of performance were observed, as shown in Fig. 6, although the two datasets differ only in the set of attributes. The highest accuracy for dataset 1 and dataset 2 was obtained by KNN and SVM, respectively. Similarly, for predicting the recurrence or non-recurrence of breast cancer, dataset 3 and dataset 4 showed different performance, as shown in Fig. 6. Dataset 3 consists of 33 attributes and dataset 4 consists of 10 attributes. Again, different levels of performance were observed due to the different sets of attributes, while the highest accuracy for dataset 3 and dataset 4 was acquired by KNN and Random Forest, respectively.

Table 3. Accuracy obtained by different machine learning algorithms on various datasets

Algorithm Dataset 1 Dataset 2 Dataset 3 Dataset 4


Random Forest 95.20% 95.91% 87% 86%
KNN 95.90% 91.77% 88.90% 83.60%
Naı̈ve Bayes 92.40% 92.54% 83.30% 54.50%
Support Vector Machine 94.50% 97.22% 68.50% 80.00%

Fig. 6. Accuracy comparison: (a) malignant and benign; (b) recurrence and non-recurrence.

Again, the study results indicated that dataset 2, having 32 attributes (for benign-malignant cancer), and dataset 3, having 33 attributes (for recurrence-non-recurrence), showed better performance across the different algorithms. Hence, it can be said that the attributes used in dataset 2 are the best for predicting benign-malignant breast cancer, while the attributes used in dataset 3 are the best for predicting the recurrence of breast cancer.
Furthermore, KNN and SVM performed best for dataset 1 and dataset 2, respectively. In this case, the difference in accuracy between KNN and SVM was below 2%, while dataset 1 and dataset 2 have 11 and 32 attributes, respectively. Similarly, KNN and Random Forest performed best for dataset 3 and dataset 4, respectively. Here, the difference in accuracy was below 3%, while dataset 3 and dataset 4 have 33 and 10 attributes, respectively. In both cases, increasing the number of attributes does not make a huge difference, while collecting more attributes increases diagnosis or medical costs. From a cost-effectiveness point of view, a smaller number of attributes can therefore be used in both cases: the 11 attributes of dataset 1 can be used for predicting breast cancer with KNN, and the 10 attributes of dataset 4 can be used for predicting the recurrence of breast cancer with the Random Forest algorithm.

5 Conclusion
In this research, different datasets with various attributes were examined using four algorithms, aiming to find which attributes and algorithms tend to be more effective.
Machine learning approaches have been spreading rapidly in the medical field due to their monumental performance in predicting and classifying disease. Research on which algorithms and attributes perform better in breast cancer prediction has been done before, but the reasons for their better performance were not explored. In this research, considering the performance of the algorithms, the best accuracy was observed by KNN in breast cancer prediction and by SVM in the recurrence of breast cancer. This result indicates that, when detecting breast cancer, a dataset with diverse attributes tends to be more accurate in prediction.
One of the limitations of this work is that it considers datasets having a limited number of instances. In future, larger datasets can be considered for exploring different insights. Besides, more algorithms can be implemented to validate and generalize the outcomes of this research.

References
1. Aaltonen, L.A., Salovaara, R., Kristo, P., Canzian, F., Hemminki, A., Peltomäki,
P., Chadwick, R.B., Kääriäinen, H., Eskelinen, M., Järvinen, H., et al.: Incidence of
hereditary nonpolyposis colorectal cancer and the feasibility of molecular screening
for the disease. N. Engl. J. Med. 338(21), 1481–1487 (1998)
2. Asri, H., Mousannif, H., Al Moatassime, H., Noel, T.: Using machine learning
algorithms for breast cancer risk prediction and diagnosis. Procedia Comput. Sci.
83, 1064–1069 (2016)
3. Bharat, A., Pooja, N., Reddy, R.A.: Using machine learning algorithms for breast
cancer risk prediction and diagnosis. In: 2018 3rd International Conference on
Circuits, Control, Communication and Computing (I4C), pp. 1–4. IEEE (2018)
4. Chaurasia, V., Pal, S.: Data mining techniques: to predict and resolve breast cancer
survivability. Int. J. Comput. Sci. Mob. Comput. IJCSMC 3(1), 10–22 (2014)
5. Chaurasia, V., Pal, S., Tiwari, B.: Prediction of benign and malignant breast cancer
using data mining techniques. J. Algorithms Comput. Technol. 12(2), 119–126
(2018)
6. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic
minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
7. Delen, D., Walker, G., Kadam, A.: Predicting breast cancer survivability: a com-
parison of three data mining methods. Artif. Intell. Med. 34(2), 113–127 (2005)
8. Frank, A., Asuncion, A.: UCI Machine Learning Repository (2010). http://archive.ics.uci.edu/ml
9. Gokhale, S.: Ultrasound characterization of breast masses. Indian J. Radiol. Imag-
ing 19(3), 242 (2009)
10. Huang, M.W., Chen, C.W., Lin, W.C., Ke, S.W., Tsai, C.F.: SVM and SVM
ensembles in breast cancer prediction. PLoS ONE 12(1), e0161501 (2017)
11. Inan, T.T., Samia, M.B.R., Tulin, I.T., Islam, M.N.: A decision support model to
predict ICU readmission through data mining approach. In: Pacific ASIA Confer-
ence on Information Systems (PACIS), p. 218 (2018)

12. Islam, M.M., Iqbal, H., Haque, M.R., Hasan, M.K.: Prediction of breast cancer
using support vector machine and k-nearest neighbors. In: 2017 IEEE Region 10
Humanitarian Technology Conference (R10-HTC), pp. 226–229. IEEE (2017)
13. Karabatak, M., Ince, M.C.: An expert system for detection of breast cancer based
on association rules and neural network. Expert Syst. Appl. 36(2), 3465–3469
(2009)
14. Khan, N.S., Muaz, M.H., Kabir, A., Islam, M.N.: Diabetes predicting mHealth
application using machine learning. In: 2017 IEEE International WIE Conference
on Electrical and Computer Engineering (WIECON-ECE), pp. 237–240. IEEE
(2017)
15. Khan, N.S., Muaz, M.H., Kabir, A., Islam, M.N.: A machine learning-based intel-
ligent system for predicting diabetes. Int. J. Big Data Anal. Healthcare (IJBDAH)
4(2), 1–20 (2019)
16. Khan, N.I., Mahmud, T., Islam, M.N., Mustafina, S.N.: Prediction of cesarean
childbirth using ensemble machine learning methods. In: 22nd International Con-
ference on Information Integration and Web-Based Applications Services (IIWAS
2020) (2020)
17. Khourdifi, Y., Bahaj, M.: Applying best machine learning algorithms for breast
cancer prediction and classification. In: 2018 International Conference on Elec-
tronics, Control, Optimization and Computer Science (ICECOCS), pp. 1–5. IEEE
(2018)
18. Khourdifi, Y., Bahaj, M.: Feature selection with fast correlation-based filter for
breast cancer prediction and classification learning learning algorithms. In: 2018
International Symposium on Advanced Electrical and Communication Technolo-
gies (ISAECT), pp. 1–6. IEEE (2018)
19. Kotsiantis, S., Kanellopoulos, D., Pintelas, P.: Handling imbalanced datasets: a review. GESTS Int. Trans. Comput. Sci. Eng. 30, 25–36 (2006)
20. Li, A., Liu, L., Ullah, A., Wang, R., Ma, J., Huang, R., Yu, Z., Ning, H.: Association
rule-based breast cancer prevention and control system. IEEE Trans. Comput. Soc.
Syst. 6(5), 1106–1114 (2019)
21. Liu, Y.Q., Wang, C., Zhang, L.: Decision tree based predictive models for breast
cancer survivability on imbalanced data. In: 2009 3rd International Conference on
Bioinformatics and Biomedical Engineering, pp. 1–4. IEEE (2009)
22. Mangasarian, O.L., Musicant, D.R.: Lagrangian support vector machines. J. Mach. Learn. Res. 1, 161–177 (2001)
23. Miah, Y., Prima, C.N.E., Seema, S.J., Mahmud, M., Kaiser, M.S.: Performance
comparison of machine learning techniques in identifying dementia from open
access clinical datasets. In: Advances on Smart and Soft Computing, pp. 79–89.
Springer (2020)
24. Omar, K.S., Mondal, P., Khan, N.S., Rizvi, M.R.K., Islam, M.N.: A machine learn-
ing approach to predict autism spectrum disorder. In: 2019 International Confer-
ence on Electrical, Computer and Communication Engineering (ECCE), pp. 1–6.
IEEE (2019)
25. Stark, G.F., Hart, G.R., Nartowt, B.J., Deng, J.: Predicting breast cancer risk using
personal health data and machine learning models. PLoS ONE 14(12), e0226765
(2019)

26. Vasant, P., Zelinka, I., Weber, G.W. (eds.): Intelligent Computing and Optimiza-
tion. Springer International Publishing (2020). https://doi.org/10.1007/978-3-030-
33585-4
27. Yarabarla, M.S., Ravi, L.K., Sivasangari, A.: Breast cancer prediction via machine
learning. In: 2019 3rd International Conference on Trends in Electronics and Infor-
matics (ICOEI), pp. 121–124. IEEE (2019)
Exploring the Machine Learning
Algorithms to Find the Best Features
for Predicting the Risk of Cardiovascular
Diseases

Mostafa Mohiuddin Jalal1(B), Zarin Tasnim1, and Muhammad Nazrul Islam2

1 Department of Information and Communication Engineering, Bangladesh University of Professionals, Dhaka, Bangladesh
protikmostafa.bup.ice@gmail.com
2 Department of Computer Science and Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh
nazrul@cse.mist.ac.bd

Abstract. Nowadays, cardiovascular diseases are considered one of the fatal and main reasons for mortality all around the globe. The mortality or high-risk rate can be reduced if an early detection system for cardiovascular disease is introduced. A massive amount of data is collected by healthcare organizations, and a proper and careful study of these data can extract important and interesting insights that may help professionals. Keeping that in mind, in this paper, at first six distinct machine learning algorithms (Logistic Regression, SVM, KNN, Naïve Bayes, Random Forest, Gradient Boosting) were applied to four different datasets encompassing different sets of features to show how their performance varies across them. Secondly, the prediction accuracy of the ML algorithms was analyzed to find the best set of features and the best algorithm to predict cardiovascular diseases. The results identify the best-suited eleven features and also show that Random Forest performs well in terms of accuracy in predicting cardiovascular diseases.

Keywords: Prediction · Machine learning · Cardiovascular disease · Classification · Healthcare · Feature identification

1 Introduction
Cardiovascular diseases have been one of the noteworthy reasons for mortality all over the world. According to the reports of the WHO, every year more than 18 million people die because of cardiovascular diseases, which covers almost 31% of global deaths [23]. Damage to parts or all of the heart or the coronary arteries, or an inadequate supply of nutrients and oxygen to this organ, results in cardiovascular disease. Several lifestyle factors can increase the risk of heart disease, including, for example, high blood pressure and cholesterol, smoking, overweight and obesity, and diabetes.

Disease detection generally depends on the experience and expertise of doctors [24,26], though a decision support system could be a more feasible choice for diagnosing cardiovascular diseases through prediction [3]. Healthcare organizations and hospitals all around the world collect data from patients regarding various health-related issues. The collected data can be utilized with various machine learning classification techniques to draw effective insights that would be overwhelming for human minds to comprehend. Data mining applications can be used to rectify hospital errors and to support prevention, easy detection of diseases, better health policy-making, and the avoidance of preventable hospital deaths [10–12,19]. In the same vein, the prediction of cardiovascular disease using machine learning can efficiently assist medical professionals [5,20,23].
A number of studies have been conducted focusing on ML and cardiovascular diseases that have considered a specific dataset (with specific features) and applied several ML algorithms to compare their performance. Again, very few studies have considered the performance analysis of several algorithms using different datasets. Thereby, this research aims to identify the most suited data mining classification techniques and the best set of features used in classification. Six ML techniques (Logistic Regression, Support Vector Machine, K-NN, Naïve Bayes, Random Forest, and Gradient Boosting) were applied to generate prediction models on four distinct datasets, and the accuracy of the different ML algorithms was compared on the individual datasets.
There are five distinct sections in this paper: Sect. 2 highlights the previous works focusing on cardiovascular disease prediction using machine learning; Sect. 3 illustrates the details of the data, the several ML algorithms, and the performance evaluation of the models in disease prediction; Sect. 4 explains the results and the overall performance of all the classification algorithms in terms of accuracy, precision, sensitivity, and specificity; and Sect. 5 concludes by highlighting the main outcomes, limitations, and future work.

2 Literature Review

Several studies have been conducted using the UCI machine learning repository
heart disease dataset to classify the presence of cardiovascular disease in the
human body. This multivariate dataset involves 303 instances along with 75 fea-
tures. For example, Rajyalakshmi et al. [22] explored different machine learning
algorithms including Decision Tree (DT), Support Vector Machine (SVM), Ran-
dom Forest (RF), and a hybrid model named HRFLM that combines Random
Forest and Linear method for predicting cardiovascular diseases. The hybrid
model showed the best accuracy among the implemented algorithms. Similarly,
Sen et al. [25] implemented the Naïve Bayes, SVM, Decision Tree, and KNN algorithms and found that SVM shows the best accuracy (83%). Again, Dinesh et al. [6] predicted the possibility of cardiovascular diseases using six different algorithms, including Logistic Regression, Random Forest, SVM, Gradient Boosting, and an ensemble model, while Logistic Regression showed the best accuracy (91%). In another study, Maji et al. [17] proposed the use of hybridization

techniques to reduce the test cases to predict the outcome using ANN, C4.5,
and hybrid Decision Tree algorithms. The result also showed that the hybrid
Decision tree performed better in terms of accuracy.
Prediction of cardiovascular diseases has also been conducted using differ-
ent datasets to bring out the best accuracy like UCI Statlog Dataset having
270 instances along with 13 features. Dwivedi et al. [7] proposed an auto-
matic medicinal system using advanced data mining techniques like the multilayer perceptron model, and applied several classification techniques like Naïve Bayes, SVM, Logistic Regression, and KNN. Another study conducted by Georga et al. [9] explained AI methods (Random Forest, Logistic Regression, FRS, GAM, and GBT algorithms) to find the most effective predictors for detecting Coronary Heart Disease (CAD). In this study, a more coherent and clustered dataset along with hybridization was thought to be incorporated to find the best result.
UCI Cleveland data has also been used in classification to predict the pres-
ence of cardiovascular disease in the human body in many studies. For example,
Pouriyeh et al. [21] investigated and compared the performance and accuracy of different classification algorithms, including Decision Tree, Naïve Bayes, SVM, MLP, KNN, Single Conjunctive Rule Learner, and Radial Basis Function. Here, the hybrid model of SVM and MLP produced the best result. Latha [16] suggested a comparative analytical approach to determine the performance accuracy of ensemble techniques and found that ensembling is a good strategy to improve accuracy. Again, Amin et al. [2] identified significant features and improved the prediction accuracy by using different algorithms, including KNN, Decision Tree, Naïve Bayes, Logistic Regression, SVM, Neural Network, and an ensemble model that combines Naïve Bayes and Logistic Regression. They used the 9 most impactful features to calculate the prediction, and Naïve Bayes gave the best accuracy.
Alaa et al. [1] used UK Biobank data to develop an algorithmic tool using
AutoPrognosis that automatically selects features and tune ensemble models.
Five ML algorithms were used in this study, namely, SVM, Random Forest,
Neural Network, AdaBoost, and Gradient Boosting. The AutoPrognosis showed
a higher AUC-ROC compared to all other standard ML models.
In sum, the literature review showed that most of the studies focusing on cardiovascular diseases and ML were striving to explore the algorithm that showed the best performance in predicting the possibility of cardiovascular diseases. The summary of the literature review is shown in Table 1. Again, though different studies were conducted using different datasets having different numbers of features, no study has been conducted to explore how the performance accuracy of different algorithms varies due to the different sets of features used in the different datasets. Thus, this study focuses on this issue.

3 Methodology
This section describes the overall working procedure used to obtain the research objectives. The overall methodology of this study is shown in Fig. 1.

Table 1. Summary of related studies

Ref  No. of features  Objectives  ML techniques  Results (Accuracy, Specificity, Sensitivity, Precision)
[22] 13 Predicting whether a Logistic 87.00%
person has heart disease Regression
or not by applying several
Random 81.00%
ML algorithms and
Forest
provides diagnosis
Naive Bayes 84.00%
Gradient 84.00%
Boosting
SVM 78.00%
[23] 13 Presenting a survey of Naive Bayes 84.16%
various models based on
SVM 85.77%
algorithm and their
performance KNN 83.16%
Decision Tree 77.55%
Random 91.60%
Forest
[7] 13 Evaluating six potential ANN 84% 79% 87% 84%
ML algorithms based on
SVM 82% 89% 77% 90%
eight performance indices
and finding the best Logistic 85% 81% 89% 85%
algorithm for prediction regression
KNN 80% 76% 84% 81%
Classification 77% 73% 79% 79%
Tree
Naive Bayes 83% 80% 85% 84%
[20] 13 Comparing different J48 with 57%
algorithms of decision Reduced
tree classification in heart Errorpruning
disease diagnosis Algorithm
Logistic model 56%
tree
Random forest
[21] 14 Applying traditional ML Decision Tree 78% 83% 77%
algorithms and ensemble
Naive Bayes 83% 87% 84%
models to find out the
best classifier for disease Knn, k = 1 76% 78% 78%
prediction Knn, k = 3 81% 84% 82%
Knn, k = 9 83% 84% 85%
Knn, k = 15 83% 84% 85%
MLP 83% 82% 82%
Radial basis 84% 86% 85%
function
Single 70% 70% 73%
conjunctive
rule learner
SVM 84% 90% 83%
[2] 14 Identifying significant SVM 85%
features and mining Vote 86%
techniques to improve
Naive Bayes 86%
CVD prediction
Logistic 86%
Regression
NN 85%
KNN 83%
Decision Tree 83%
[6] 13 Finding significant KNN 59%
features and introducing SVM 72%
several combinations with
Logistic 77%
ML techniques
Regression
Naive Bayes 70%
Random 74%
Forest
[25] 14 Comparing performance Naive Bayes 83%
of various ML algorithms SVM 84%
and predicting CVD
Decision Tree 76%
KNN 76%
[17] 13 Proposing hybridization ANN 77%
technique and validating
using several performance C4.5 77%
measures to predict CVD Hybrid-DT 78%

Fig. 1. The overview of research methodology

3.1 Data Acquisition

At first, we acquired data from various sources, namely the UCI machine learning repository [8] and Kaggle. Four different datasets were used in this research, having different numbers of instances; the features of these datasets are not identical. As Dataset 1, we used the “Kaggle Cardiovascular Disease Dataset”, having 13 features and 70000 instances. The “Cleveland dataset” from Kaggle, having 14 attributes and 303 instances, was used as Dataset 2. Likewise, we used the “Kaggle Integrated Heart Disease dataset” as Dataset 3, which contains 76 features and 899 instances. As Dataset 4, we used the “UCI Framingham Dataset”, having 14 features and 4240 instances. However, we considered 11 and 14 features for computational purposes from dataset 3 and dataset 4, respectively. There are 8 common features across the datasets.

3.2 Data Pre-processing

The performance of ML algorithms relies hugely on data pre-processing. Datasets often contain redundant and inappropriate data which lead to inaccurate outcomes. Thus, the noise, ambiguity, and redundancy of the data need to be reduced for better classification accuracy [14,28]. The following operations were performed: (a) duplicate rows in each of the datasets were removed; (b) rows having ambiguous data were removed; (c) missing numerical values in each of the datasets were replaced with the mean value of the particular column; (d) columns having text or string values were converted to numeric data in order to apply ML algorithms; (e) numeric values were normalized to obtain values between zero and one, since ML algorithms show better outcomes if the values of numeric columns are brought to a common scale without distorting differences in the ranges of values; (f) training and testing were performed with a random 70-30 train-test split; (g) since the outcomes of ML algorithms can be biased if an equal number of class instances does not exist in the training data [13], the Synthetic Minority Over-sampling Technique (SMOTE) [4] was used to eradicate the class imbalance problem.
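Step (g) can be made concrete with the short sketch below, which shows SMOTE balancing an artificially imbalanced training set. The synthetic data generated by make_classification is an assumption standing in for the heart-disease records.

```python
# A minimal sketch of step (g): SMOTE balancing an imbalanced class distribution.
# make_classification is an assumed stand-in for the heart-disease records.
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, weights=[0.85, 0.15], random_state=0)
print("before:", Counter(y))        # roughly 850 vs 150
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("after :", Counter(y_res))    # minority class synthesized up to parity
```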

3.3 Analyzing ML Algorithms

The objectives of the analysis are to find the optimum set of features and the best algorithm for the prediction of cardiovascular diseases. For this, the machine learning algorithms most used for predicting cardiovascular disease were chosen: Logistic Regression, Support Vector Machine, K-Nearest Neighbor, Naïve Bayes, Random Forest, and Gradient Boosting. These algorithms were applied to the four selected datasets, splitting each dataset into a 70% training set and a 30% testing set.

3.3.1 Logistic Regression


Logistic Regression (LR) is one of the most widely used ML algorithms. It is used in modeling and prediction because of its low computational complexity [18]. LR is considered the standard statistical approach to modeling binary data. It is an alternative to linear regression that assigns a linear model to each class and predicts unseen instances based on a majority vote of the models [15]. Some interesting findings can be observed by using the Logistic Regression method as the prediction model on the predefined feature classes; the results are shown in Fig. 2. Here, datasets are shown on the X-axis and performance measures on the Y-axis. Different datasets come with different features and outcomes. The highest accuracy of 93.8% was attained on Dataset 3, with 93.7% precision, 100% sensitivity, and 0% specificity.
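The combination of 100% sensitivity and 0% specificity implies that the model predicted the positive class for every record in dataset 3, which is what tends to happen when one class dominates the data. The sketch below reproduces that failure mode on synthetic imbalanced data; the dataset parameters are assumptions chosen only to provoke the effect.

```python
# Synthetic, heavily imbalanced data (assumed parameters); class 1 dominates,
# so logistic regression may predict it for every sample.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

X, y = make_classification(n_samples=600, weights=[0.06, 0.94],
                           class_sep=0.4, flip_y=0.1, random_state=7)
lr = LogisticRegression(max_iter=1000).fit(X, y)

tn, fp, fn, tp = confusion_matrix(y, lr.predict(X)).ravel()
print("sensitivity:", tp / (tp + fn))                        # tends towards 1.0
print("specificity:", tn / (tn + fp) if tn + fp else 0.0)    # tends towards 0.0
```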

3.3.2 Support Vector Machine (SVM)


SVM is a supervised pattern classification model which is used as a training algorithm for learning classification and regression rules from gathered data. The purpose of this method is to separate the data until a hyperplane with the largest minimum distance is found. SVM can be used to classify two or more data types [27]. Results of applying SVM on the predefined feature classes are shown in Table 2. Here, the best performance was observed for dataset 3, with accuracy, precision, sensitivity, and specificity of 96.39%, 93.79%, 100%, and 0%, respectively.

3.3.3 K-Nearest Neighbor


KNN is a supervised learning method which has been used for diagnosing and classifying cancer. In this method, the computer is trained in a specific field and new data is given to it; using similar data, the machine then finds the K nearest neighbors of the unknown data [27]. This algorithm was applied to the different sets of features of the different datasets. Its performance was found to be less effective than that of the previous algorithms. Here, better performance was observed using dataset 3 (see Table 2): the accuracy, precision, sensitivity, and specificity were 88.37%, 92.57%, 99.67%, and 5%, respectively.

3.3.4 Naïve Bayes

Naïve Bayes refers to a probabilistic classifier that applies Bayes' theorem with robust independence assumptions. In this model, all properties are considered separately to detect any existing relationship between them. It assumes that the predictive features are conditionally independent given a class. Moreover, the values of the numeric features are distributed within each class. NB is fast and performs well even with a small dataset [27]. Findings on the performance of this algorithm vary across datasets because of the selection of features; the results are shown in Table 2. The best accuracy was obtained from Dataset 3, with accuracy, precision, sensitivity, and specificity of 92.3%, 93.9%, 100%, and 0%, respectively.

Fig. 2. Performance of the Logistic Regression algorithm on different datasets

3.3.5 Random Forest (RF)


The RF algorithm is used at the regularization point where the model quality is highest. RF builds a large number of Decision Trees using random samples with replacement to overcome the problems of a single Decision Tree. RF can also be used in unsupervised mode for assessing proximities among data points [27]. Python scripts were run on the 4 different datasets with the predefined features; the results are shown in Table 2. The best accuracy of this algorithm was observed for dataset 3: the accuracy, precision, sensitivity, and specificity for this particular dataset are 94.85%, 94.68%, 100%, and 37.5%, respectively.

3.3.6 Gradient Boosting


Gradient boosting is a machine learning technique for regression and classification [6]. Different datasets with different sets of features were considered to compute the performance of this algorithm, and the results are shown in Table 2. The best outcome in terms of accuracy comes from dataset 3. The overall performance measures, namely accuracy, precision, sensitivity, and specificity, were 89.8%, 91.65%, 96.36%, and 13.7%, respectively.
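A minimal sketch of a gradient boosting classifier is shown below; the data is synthetic and the hyperparameters are scikit-learn defaults rather than the settings used in this study.

```python
# Synthetic stand-in data; each new tree fits the residual errors of the
# ensemble built so far (the essence of gradient boosting).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=0)
gb.fit(X_tr, y_tr)
print("accuracy:", gb.score(X_te, y_te))
```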

Table 2. Performance of the selected ML techniques for different datasets

Dataset  Performance measures  Logistic Regression  SVM  K-NN  Naïve Bayes  Random Forest  Gradient Boosting
Dataset 1 Accuracy 71.36% 72.7% 66.9% 58.43% 72.78% 72.79%
Precision 81.09% 70.58% 67.37% 55.18% 71.94% 72.37%
Sensitivity 61.44% 77.09% 75.7% 90.08% 81.66% 80.58%
Specificity 71.36% 67.82% 56.46% 26.76% 61.55% 63.55%
Dataset 2 Accuracy 86.89% 88.52% 86.89% 88.52% 88.52% 80.33%
Precision 85.18% 88.46% 85.18% 88.46% 88.46% 75.86%
Sensitivity 85.18% 85.18% 85.18% 85.18% 85.18% 81.48%
Specificity 88.23% 91.18% 88.23% 91.18% 91.18% 79.41%
Dataset 3 Accuracy 93.8% 96.39% 88.37% 92.3% 94.85% 89.8%
Precision 93.7% 93.79% 92.57% 93.9% 94.68% 91.65%
Sensitivity 100% 100% 99.67% 100% 100% 96.36%
Specificity 0% 0% 5% 0% 37.5% 13.7%
Dataset 4 Accuracy 84.46% 84.01% 80.69% 83.54% 84.54% 85.89%
Precision 84.81% 84.06% 86.38% 88.34% 84.88% 86.5%
Sensitivity 99.15% 99.89% 92.11% 93.3% 99.25% 99.07%
Specificity 7.18% 1.1% 9.62% 20.79% 7.73% 3.85%

4 Discussion
The study results highlight that different datasets, along with different features, show diversity in the results. The datasets used here for computation purposes consist of different numbers of features; as they have different features, this could be one of the reasons for the varying prediction accuracy.
Again, different datasets show differences in accuracy, as shown in Table 2. For datasets 1, 2, 3, and 4, comparatively better accuracy was obtained using Gradient Boosting; SVM, Naïve Bayes, and Random Forest; Random Forest; and Random Forest and Gradient Boosting, respectively. The results thus indicate that Random Forest shows the best accuracy on most of the datasets.
For each of the algorithms applied to each particular dataset, the best performance was observed for dataset 3, having 11 features. Thus, the results indicate that, for predicting cardiovascular disease in the human body, the features used in dataset 3 are most likely the best recommended attributes. The features considered in Dataset 3 are Age, Sex, Chest Pain Type, Resting Blood Pressure, Smoking Year, Fasting Blood Sugar, Diabetes History, Family History Coronary, ECG, Pulse Rate, and Presence of Disease.

5 Conclusion
In this research, six different ML algorithms were implemented on four different datasets having different sets of features to predict cardiovascular disease. The results showed that the 11 features of dataset 3 are the most efficient features, while Random Forest showed the best accuracy for most of the datasets with different sets of features.
While the existing work has primarily focused on the implementation of several algorithms on a particular dataset and then compared their performance, this study demonstrated a performance comparison among multiple datasets having different sets of features, along with evaluating multiple machine learning algorithms on them.
One of the limitations of this work is that it considers only traditional and ensemble ML algorithms. Hybrid and different ensemble models could have been developed for a different insight. In future, an app or tool can be developed using ML algorithms to detect cardiovascular disease. Besides, the algorithms can also be applied to new datasets to validate and generalize the outcomes of this research related to the best features for predicting cardiovascular diseases.

References
1. Alaa, A.M., Bolton, T., Di Angelantonio, E., Rudd, J.H., van der Schaar, M.: Car-
diovascular disease risk prediction using automated machine learning: a prospective
study of 423,604 UK Biobank participants. PLoS ONE 14(5), e0213653 (2019)
2. Amin, M.S., Chiam, Y.K., Varathan, K.D.: Identification of significant features
and data mining techniques in predicting heart disease. Telematics Inform. 36,
82–93 (2019)

3. Bhatt, A., Dubey, S.K., Bhatt, A.K.: Analytical study on cardiovascular health
issues prediction using decision model-based predictive analytic techniques. In:
Soft Computing: Theories and Applications, pp. 289–299. Springer (2018)
4. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic
minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
5. Dangare, C.S., Apte, S.S.: Improved study of heart disease prediction system using
data mining classification techniques. Int. J. Comput. Appl. 47(10), 44–48 (2012)
6. Dinesh, K.G., Arumugaraj, K., Santhosh, K.D., Mareeswari, V.: Prediction of car-
diovascular disease using machine learning algorithms. In: 2018 International Con-
ference on Current Trends towards Converging Technologies (ICCTCT), pp. 1–7.
IEEE (2018)
7. Dwivedi, A.K.: Performance evaluation of different machine learning techniques
for prediction of heart disease. Neural Comput. Appl. 29(10), 685–693 (2018)
8. Frank, A., Asuncion, A.: UCI Machine Learning Repository (2010). http://archive.ics.uci.edu/ml
9. Georga, E.I., Tachos, N.S., Sakellarios, A.I., Kigka, V.I., Exarchos, T.P., Pelosi,
G., Parodi, O., Michalis, L.K., Fotiadis, D.I.: Artificial intelligence and data min-
ing methods for cardiovascular risk prediction. In: Cardiovascular Computing–
Methodologies and Clinical Applications, pp. 279–301. Springer (2019)
10. Inan, T.T., Samia, M.B.R., Tulin, I.T., Islam, M.N.: A decision support model to
predict ICU readmission through data mining approach. In: PACIS, p. 218 (2018)
11. Khan, N.S., Muaz, M.H., Kabir, A., Islam, M.N.: Diabetes predicting mhealth
application using machine learning. In: 2017 IEEE International WIE Conference
on Electrical and Computer Engineering (WIECON-ECE), pp. 237–240. IEEE
(2017)
12. Khan, N.S., Muaz, M.H., Kabir, A., Islam, M.N.: A machine learning-based intel-
ligent system for predicting diabetes. Int. J. Big Data Anal. Healthcare (IJBDAH)
4(2), 1–20 (2019)
13. Kotsiantis, S., Kanellopoulos, D., Pintelas, P.: Handling imbalanced datasets: a review. GESTS Int. Trans. Comput. Sci. Eng. 30, 25–36 (2006)
14. Krak, I., Barmak, O., Manziuk, E., Kulias, A.: Data classification based on the
features reduction and piecewise linear separation. In: International Conference on
Intelligent Computing & Optimization, pp. 282–289. Springer (2019)
15. Kumar, G.R., Ramachandra, G., Nagamani, K.: An efficient prediction of breast
cancer data using data mining techniques. Int. J. Innov. Eng. Technol. (IJIET)
2(4), 139 (2013)
16. Latha, C.B.C., Jeeva, S.C.: Improving the accuracy of prediction of heart dis-
ease risk based on ensemble classification techniques. Informat. Med. Unlocked 16,
100203 (2019)
17. Maji, S., Arora, S.: Decision tree algorithms for prediction of heart disease. In:
Information and Communication Technology for Competitive Strategies, pp. 447–
454. Springer (2019)
18. Miah, Y., Prima, C.N.E., Seema, S.J., Mahmud, M., Kaiser, M.S.: Performance
comparison of machine learning techniques in identifying dementia from open
access clinical datasets. In: Advances on Smart and Soft Computing, pp. 79–89.
Springer (2020)
19. Omar, K.S., Mondal, P., Khan, N.S., Rizvi, M.R.K., Islam, M.N.: A machine learn-
ing approach to predict autism spectrum disorder. In: 2019 International Confer-
ence on Electrical, Computer and Communication Engineering (ECCE), pp. 1–6.
IEEE (2019)

20. Patel, J., TejalUpadhyay, D., Patel, S.: Heart disease prediction using machine
learning and data mining technique. Heart Disease 7(1), 129–137 (2015)
21. Pouriyeh, S., Vahid, S., Sannino, G., De Pietro, G., Arabnia, H., Gutierrez, J.: A
comprehensive investigation and comparison of machine learning techniques in the
domain of heart disease. In: 2017 IEEE Symposium on Computers and Communi-
cations (ISCC), pp. 204–207. IEEE (2017)
22. Rajyalakshmi, P., Reddy, G.S., Priyanka, K.G., Sai, V.L.B.S., Anveshini, D.: Pre-
diction of cardiovascular disease using machine learning. Entropy 23, 24
23. Ramalingam, V., Dandapath, A., Raja, M.K.: Heart disease prediction using
machine learning techniques: a survey. Int. J. Eng. Technol. 7(2.8), 684–687 (2018)
24. Rani, K.U.: Analysis of heart diseases dataset using neural network approach.
arXiv preprint arXiv:1110.2626 (2011)
25. Sen, S.K.: Predicting and diagnosing of heart disease using machine learning algo-
rithms. Int. J. Eng. Comput. Sci 6(6) (2017)
26. Soni, J., Ansari, U., Sharma, D., Soni, S.: Predictive data mining for medical
diagnosis: an overview of heart disease prediction. Int. J. Comput. Appl. 17(8),
43–48 (2011)
27. Tahmooresi, M., Afshar, A., Rad, B.B., Nowshath, K., Bamiah, M.: Early detec-
tion of breast cancer using machine learning techniques. J. Telecommun. Electr.
Comput. Eng. (JTEC) 10(3–2), 21–27 (2018)
28. Vasant, P., Zelinka, I., Weber, G.W. (eds.): Intelligent Computing and Optimiza-
tion. Springer International Publishing (2020). https://doi.org/10.1007/978-3-030-
33585-4, https://doi.org/10.1007%2F978-3-030-33585-4
Searching Process Using Boyer Moore Algorithm in Digital Library

Laet Laet Lin and Myat Thuzar Soe

Faculty of Computer Science, University of Information Technology, Yangon, Myanmar
{laetlaetlin,myatthuzarsoe}@uit.edu.mm

Abstract. Reading helps people gain knowledge and learn new vocabulary and its usage. Nowadays, digital libraries are used to search for various books; if an internet connection is available, they can be accessed easily via computers, laptops, and mobile phones. Students and readers can search for a desired book in a digital library by typing a substring that matches part of the original book title. To make such a searching process effective, string searching algorithms can be applied, and there are many sophisticated and efficient algorithms for string searching. In this paper, the Boyer Moore (BM) string-searching algorithm is applied to search for desired books effectively in the digital library. The BM algorithm is then compared with Horspool's algorithm to demonstrate its performance efficiency for the searching process.

Keywords: Boyer Moore (BM) algorithm · Horspool's algorithm · Bad-symbol shift · Good-suffix shift · String matching · Digital library

1 Introduction

Reading helps students and readers gain knowledge. Users can use digital libraries to get knowledge or information by reading educational or non-educational books, and most frequently they find the desired books that are available in the library via computers or mobile phones. The effectiveness of a library depends on how easily it lets users find books, so a good computerized library system should support easy and effective use. To achieve this, string searching algorithms are required. String searching, sometimes called string matching, is the act of checking the existence of a substring of m characters, called the pattern, in a string of n characters, called the text (where m ≤ n), and finding its location in that text [1–3].
String searching algorithms are the basic components of existing applications for
text processing, intrusion detection, information retrieval, and computational biology
[4]. Better string-matching algorithms can improve these applications' efficiency considerably, so fast string matching is an important area of research. There are several sophisticated and efficient algorithms for string searching, the most widely known of which is the BM algorithm suggested by R. Boyer and J. Moore. The BM algorithm is among the most efficient string-matching algorithms because many character comparisons are skipped during the search process. It performs the
pattern's character comparisons in reverse order, from right to left, and, in the case of a mismatch, it does not need to compare the entire pattern [5, 6]. This paper aims to recommend the efficient BM algorithm for the string processing of digital libraries.
The remainder of this paper is organized as follows. Section 2 presents related
work focusing on using the BM algorithm for various related applications. Section 3
explains the background theory of the BM algorithm and Horspool's algorithm in
detail. Section 4 highlights the searching processes of BM and Horspool algorithms
and comparative analysis between them. Finally, the conclusion of this paper is
described in Sect. 5.

2 Related Work

The authors [7] developed a detection method to detect web application vulnerabilities by using the BM string-matching algorithm. Their proposed method performed well in terms of its ability to accurately detect vulnerabilities, with no false positives and low processing time.
Using the BM algorithm as a web-based search, the authors [8] implemented a
dictionary application for midwifery. The authors [9] applied the BM algorithm for the
application of a baby name dictionary based on Android. The authors [10] proposed a
new algorithm as a regular expression pattern matching algorithm. This new algorithm
was derived from the BM algorithm, and it was an efficient generalized Boyer-Moore-
type pattern matching algorithm for regular languages.
The authors [11] studied different kinds of string matching algorithms and observed their time and space complexities. The performance of these algorithms was tested on biological sequences such as DNA and proteins. Based on their study, the BM algorithm is extremely fast for large sequences, since it avoids many needless comparisons of the pattern relative to the text, and its best-case running complexity is sub-linear.
The researcher [12] analyzed the efficiency of four string-matching algorithms
based on different pattern lengths and pattern placement. The researcher showed that
among the four algorithms, the Horspool algorithm which is a simplified version of the
BM algorithm is the fastest algorithm without considering the pattern length and
pattern placement.

3 Background Theory

In this section, the BM algorithm and Horspool's algorithm are presented in detail.

3.1 Boyer Moore (BM) Algorithm


The BM algorithm is the most well-known string-matching algorithm because of its efficiency, and many string-matching algorithms have been developed based on its concept [13, 14]. The BM algorithm compares the pattern's characters with their partners in the text
by moving from right to left [1, 2, 5, 15]. It determines the size of the shift by using a
bad-symbol shift table and a good-suffix shift table.

3.1.1 Bad-Symbol Shift


The bad-symbol shift is guided by the character c in the text T that does not match its partner in the pattern P. If c does not occur in P at all, P is shifted past this c. The shift size is calculated by the following equation:

txt(c) − r,    (1)

where txt(c) is an entry in the bad-symbol shift table and r is the number of matched characters. This table is a list of the possible characters that can be found in the text; texts may contain punctuation marks, spaces, and other special characters. The entry txt(c) is computed by the formula:
txt(c) = m (the length of the pattern P), if character c in the text is not among P's first (m − 1) characters;
txt(c) = the distance from the rightmost occurrence of c among P's first (m − 1) characters to P's last character, otherwise.    (2)

Let us build a bad-symbol shift table for searching for the pattern PKSPRS in some text. As shown in Table 1, all entries of the table equal 6, except those for K, P, R, and S, which are 4, 2, 1, and 3, respectively.

Table 1. Bad-symbol shift table


c K P R S Other Characters
txt(c) 4 2 1 3 6

An example of shifting the pattern using Table 1 is shown in Fig. 1. In this example, the pattern PKSPRS is searched for in some text. Before the comparison failed on the text's character Z, the last two characters had matched. So, the pattern can be moved 4 positions to the right, because the shift size is txt(Z) − 2 = 6 − 2 = 4.

T0 ... Z R S ... Tn-1

P K S P R S

P K S P R S

Fig. 1. Shifting the pattern using bad-symbol table


Searching Process Using Boyer Moore Algorithm in Digital Library 573

The bad-symbol shift dist1 is calculated as txt(c) − r if this value is greater than zero, and as 1 if it is less than or equal to zero. This is described by the formula:

dist1 = max{txt(c) − r, 1}.    (3)
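For concreteness, the bad-symbol table of Eq. (2) and the shift of Eq. (3) can be sketched in Python as follows (an illustrative sketch only; the paper reports a Java implementation, and the function names here are ours):

def bad_symbol_table(pattern):
    """Bad-symbol shift table per Eq. (2): entries for the first m-1 characters."""
    m = len(pattern)
    table = {}
    for i, ch in enumerate(pattern[:-1]):   # scan the first m-1 characters
        table[ch] = m - 1 - i               # distance to the pattern's last character
    return table                            # any other character defaults to m

pat = "PKSPRS"
tbl = bad_symbol_table(pat)
print(tbl)                                  # {'P': 2, 'K': 4, 'S': 3, 'R': 1}, others 6
r = 2                                       # two characters already matched
dist1 = max(tbl.get("Z", len(pat)) - r, 1)  # Eq. (3): max{txt(Z) - r, 1} = 4
print(dist1)

The printed table reproduces Table 1, and the computed shift of 4 matches the example of Fig. 1.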

3.1.2 Good-Suffix Shift


The good-suffix shift is guided by a match of the pattern's last r > 0 characters. The pattern's ending portion is referred to as its suffix of size r, denoted suf(r). If there are other occurrences of suf(r) in P that are not preceded by the same character as its rightmost occurrence, then P can be shifted by the distance dist2 between the second-rightmost such occurrence of suf(r) and its rightmost occurrence. If P contains no other such occurrence of suf(r), the longest prefix of size k < r that matches the suffix of the same size k must be found; if such a prefix exists, the shift size dist2 is the distance between the corresponding suffix and this prefix, and otherwise dist2 is set to P's length m. Table 2 shows a sample good-suffix shift table for the pattern QLVLQL.

Table 2. Good-suffix shift table


r Pattern dist2
1 QLVLQL 2
2 QLVLQL 4
3 QLVLQL 4
4 QLVLQL 4
5 QLVLQL 4
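Under the definition above, the good-suffix table can be built by brute force, as in the following sketch (illustrative only; it reproduces Table 2 for the pattern QLVLQL):

def good_suffix_table(pattern):
    """Good-suffix shifts dist2(r) for r = 1..m-1, built by direct search."""
    m = len(pattern)
    dist2 = {}
    for r in range(1, m):
        suf = pattern[m - r:]                    # suffix of size r
        prev = pattern[m - r - 1]                # character preceding that suffix
        shift = None
        # Rightmost other occurrence of suf not preceded by the same character.
        for end in range(m - 2, r - 2, -1):      # index where a candidate occurrence ends
            start = end - r + 1
            if pattern[start:end + 1] == suf and (start == 0 or pattern[start - 1] != prev):
                shift = (m - 1) - end
                break
        if shift is None:                        # fall back to the longest matching prefix
            shift = m
            for k in range(r - 1, 0, -1):
                if pattern[:k] == pattern[m - k:]:
                    shift = m - k
                    break
        dist2[r] = shift
    return dist2

print(good_suffix_table("QLVLQL"))               # {1: 2, 2: 4, 3: 4, 4: 4, 5: 4}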

3.1.3 Algorithm Steps


The steps in the BM string-matching algorithm are as follows:
Step 1: Build a bad-symbol shift table as discussed earlier for a specified pattern P
and the alphabet used in both text T and P.
Step 2: Build a good-suffix shift table as discussed earlier by using P.
Step 3: Align P against the beginning of T.
Step 4: Repeat the following step until either a matched substring is found or P reaches past the final character of T. Beginning with P's last character, compare the corresponding characters of P and T until either all m pairs of characters match or a mismatching pair is found after r ≥ 0 pairs of characters have matched. In the mismatched situation, fetch txt(c) from the bad-symbol shift table, where c is T's mismatched character. When r > 0, additionally fetch the
corresponding dist2 from the good-suffix shift table. Move P to the right according
to the number of positions calculated by the formula:

dist = dist1, if r = 0;
dist = max{dist1, dist2}, if r > 0;    where dist1 = max{txt(c) − r, 1}.    (4)
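Putting the two tables together, Steps 1–4 can be sketched as below, reusing bad_symbol_table and good_suffix_table from the earlier sketches; a comparison counter is added because the experiments in Sect. 4 are based on comparison counts:

def bm_search(text, pattern):
    """Boyer-Moore search; returns (index of first match or -1, comparisons)."""
    m, n = len(pattern), len(text)
    bad = bad_symbol_table(pattern)
    good = good_suffix_table(pattern)
    comparisons = 0
    i = m - 1                                    # text index under the pattern's last character
    while i < n:
        r = 0                                    # number of matched pairs so far
        while r < m:
            comparisons += 1
            if pattern[m - 1 - r] != text[i - r]:
                break
            r += 1
        if r == m:
            return i - m + 1, comparisons        # matched substring found
        c = text[i - r]                          # mismatched text character
        dist1 = max(bad.get(c, m) - r, 1)        # Eq. (3)
        i += dist1 if r == 0 else max(dist1, good[r])   # Eq. (4)
    return -1, comparisons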

3.2 Horspool’s Algorithm


Horspool’s algorithm for string matching is a simplified version of the BM algorithm.
Both algorithms use the same bad-symbol shift table; the BM algorithm also uses the
good-suffix shift table [2]. The steps in this algorithm are as follows:
Step 1: Build a bad-symbol shift table as in the BM algorithm.
Step 2: Align the pattern P against the beginning of the text T.
Step 3: Repeat the following step until either a matching substring is found or P reaches beyond the final character of the text. Starting with P's final character, compare the corresponding characters in P and T until either all m characters match or a mismatching pair is found. In the mismatched case, fetch the entry txt(c) of the shift table, where c is the character of T currently aligned against P's final character, and move P by txt(c) characters to the right along the text.
Of these two algorithms, the worst-case time complexity of Horspool's algorithm is in O(nm), but it is not necessarily less efficient than the BM algorithm on random text. The worst-case time complexity of the BM algorithm, on the other hand, is linear if only the very first occurrence of the pattern is searched for [2]. The BM algorithm takes O(m) comparisons when the pattern string is absent from the text string, and the best-case time efficiency of the BM algorithm is in O(n/m) [16, 17].
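A corresponding sketch of Horspool's algorithm reuses the same bad-symbol table; note that the shift is keyed on the text character aligned against the pattern's last position, even when the mismatch occurs earlier:

def horspool_search(text, pattern):
    """Horspool's algorithm; returns (index of first match or -1, comparisons)."""
    m, n = len(pattern), len(text)
    bad = bad_symbol_table(pattern)
    comparisons = 0
    i = m - 1
    while i < n:
        r = 0
        while r < m:
            comparisons += 1
            if pattern[m - 1 - r] != text[i - r]:
                break
            r += 1
        if r == m:
            return i - m + 1, comparisons
        i += bad.get(text[i], m)                 # shift by txt(c) for the aligned character
    return -1, comparisons

Running both sketches on the worked example of Sect. 4,

print(bm_search("VIP_AZWIN_ZAWZAZA", "ZAWZAZ"))        # (10, 11)
print(horspool_search("VIP_AZWIN_ZAWZAZA", "ZAWZAZ"))  # (10, 12)

reproduces the comparison counts of 11 and 12 reported there.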

4 Discussion and Result

In this section, the searching process of the BM algorithm is illustrated by comparing it with Horspool's algorithm, a simplified version of the BM algorithm. Consider the problem of searching for a desired book in a digital library. A book title is represented by a text that comprises English letters and spaces (written here as underscores), and the book title or a segment of the book title is the pattern.
Text: VIP_AZWIN_ZAWZAZA
Pattern: ZAWZAZ
Table 3. Bad-symbol shift table for above sample


c A W Z Other Characters
txt(c) 1 3 2 6

Table 4. Good-suffix shift table for above sample

r Pattern dist2
1 ZAWZAZ 2
2 ZAWZAZ 5
3 ZAWZAZ 5
4 ZAWZAZ 5
5 ZAWZAZ 5

V I P _ A Z W I N _ Z A W Z A Z A

Z A W Z A Z

Fig. 2. First step

4.1 Searching Process with the BM Algorithm


First, construct the bad-symbol shift and good-suffix shift tables. The bad-symbol shift table used to find the dist1 values and the good-suffix shift table with the dist2 values are shown in Table 3 and Table 4, respectively.

V I P _ A Z W I N _ Z A W Z A Z A

Z A W Z A Z

Z A W Z A Z

Fig. 3. Second step

As shown in Fig. 2, first, the pattern string is aligned with the starting characters of
the text string.
As shown in Fig. 3, after matching two pairs of characters, the pattern's character 'Z' fails to match its partner '_' in the text. So, the algorithm retrieves txt(_) = 6 from the bad-symbol table to compute dist1 = txt(_) − 2 = 4 and also retrieves dist2 = 5 from the good-suffix shift table. The pattern is then moved to the right by max{dist1, dist2} = max{4, 5} = 5.
V I P _ A Z W I N _ Z A W Z A Z A

Z A W Z A Z

Z A W Z A Z

Fig. 4. Third step

As shown in Fig. 4, after matching one pair of Z’s and failing the next comparison
on the text’s space character, the algorithm fetches txt(_) = 6 from bad-symbol table
to compute dist1 = 6 − 1 = 5 and also fetches dist2 = 2 from good-suffix shift table.
And then the pattern is moved to the right by max {dist1, dist2} = max {5, 2} = 5.
Lastly, after matching all the pattern’s characters with their partners in the text, a
matching substring is found in the text string. Here, the total number of character
comparisons is 11.

V I P _ A Z W I N _ Z A W Z A Z A

Z A W Z A Z

Z A W Z A Z

Fig. 5. First step

V I P _ A Z W I N _ Z A W Z A Z A

Z A W Z A Z

Z A W Z A Z

Fig. 6. Second step

4.2 Searching Process with the Horspool’s Algorithm


For searching the pattern ZAWZAZ in a given text comprised of English letters and
spaces, the shift table to be constructed is the same as shown in Table 3. As shown in
Fig. 5, first, the pattern is aligned with the beginning of the text and the characters are
compared from right to left. A mismatch occurs after comparing the character ‘Z’ in the
pattern and the character ‘_’ in the text. The algorithm fetches txt(Z) = 2 from the bad-
symbol shift table and shifts the pattern by 2 positions to the right along with the text.
In the next step, as shown in Fig. 6, after the last ‘Z’ of the pattern fails to match its
partner ‘I’ in the text, the algorithm fetches txt(I) = 6 from the bad-symbol shift table
and shifts the pattern by 6 positions to the right along with the text.
V I P _ A Z W I N _ Z A W Z A Z A

Z A W Z A Z

Z A W Z A Z

Fig. 7. Third step

In the next step, as shown in Fig. 7, after failing the second comparison on the
character ‘W’ in the text, the algorithm fetches txt(Z) = 2 from the bad-symbol shift
table and shifts the pattern by 2 positions to the right along with the text.
Finally, after matching all the pattern’s characters with their partners in the text, a
matched substring is found in the text. Here, the total number of character comparisons
is 12.

4.3 Comparison Between BM and Horspool Algorithms


In this section, the BM algorithm is compared with its simplified version, Horspool's algorithm, based on the number of character comparisons. The two algorithms were implemented in the Java language and compared by searching patterns of different sizes (3, 5, 7, 9, 11, 13, and 15) in a text string of 947 characters. The total number of character comparisons of each algorithm for the different pattern lengths is shown in graph form in Fig. 8. The number of comparisons for both algorithms can vary based on the pattern string.
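The experiment can be re-created along the following lines (a hedged sketch: the 947-character text and the exact patterns are not published in the paper, so random stand-ins are used and the absolute counts will differ from Fig. 8):

import random, string

random.seed(1)
alphabet = string.ascii_uppercase[:5] + "_"      # small alphabet standing in for book titles
text = "".join(random.choice(alphabet) for _ in range(947))
for m in (3, 5, 7, 9, 11, 13, 15):
    pattern = text[500:500 + m]                  # taken from the text, so a match exists
    _, bm = bm_search(text, pattern)
    _, hp = horspool_search(text, pattern)
    print(f"m={m:2d}  BM={bm:4d}  Horspool={hp:4d} comparisons")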

[Bar chart: number of character comparisons (y-axis, 0–400) against pattern lengths 3, 5, 7, 9, 11, 13, 15 (x-axis) for the BM and Horspool algorithms.]

Fig. 8. Comparison between BM and Horspool algorithms based on the number of character comparisons
According to the results in Fig. 8, the BM algorithm performs fewer character comparisons than Horspool's algorithm. So, if the BM algorithm is used in the searching process of the digital library system, the performance efficiency of this library system will be improved effectively. A further experiment on the searching time and the accuracy of the searching process in the digital library will be carried out.

5 Conclusion

In text processing, finding a specific string in the least time is the most basic requirement, and string-searching algorithms play a critical role in this regard. Many other string-searching algorithms use the basic concepts of the BM algorithm because of its low time complexity. In this paper, the BM algorithm is used for the process of finding desired books in the digital library. The results, based on the number of comparisons, also show that the BM algorithm is more efficient than Horspool's algorithm. Therefore, if text processing applications such as the digital library system use the BM algorithm, the searching time will be short and their performance will be significantly increased.

Acknowledgments. We are appreciative of the advisors from the University of Information Technology who gave us valuable remarks and suggestions throughout this project.

References
1. Rawan, A.A.: An algorithm for string searching. In: International Journal of Computer
Applications, vol. 177, no: 10, pp. 0975–8887 (2019)
2. Anany, L.: Introduction to the Design and Analysis of Algorithms. Villanova University,
Philadelphia (2012)
3. Robert, S., Kevin, W.: Algorithms. Princeton University, Princeton (2011)
4. Bi, K., Gu, N.J., Tu, K., Liu, X.H., Liu, G.: A practical distributed string-matching algorithm
architecture and implementation. In: Proceedings of World Academy Of Science,
Engineering and Technology, vol. 1, no: 10, pp. 3261–3265 (2007)
5. Robert, S.B., Strother, M.J.: A fast string searching algorithm. Commun. ACM 20(10), 762–772 (1977)
6. Abdulellah, A.A., Abdullah, H.A., Abdulatif, M.A.: Analysis of parallel Boyer-Moore string
search algorithm. Glob. J. Comput. Sci. Technol. Hardware Compu. 13, 43–47 (2013)
7. Ain, Z.M.S., Nur, A.R., Alya, G.B., Kamarularifin, A.J., Fakariah, H.M.A., Teh, F.A.R.: A
method for web application vulnerabilities detection by using Boyer-Moore string matching
algorithm. In: 3rd Information Systems International Conference, vol. 72, pp. 112–121
(2015)
8. Rizky, I.D., Anif, H.S., Arini, A.: Implementasi Algoritma Boyer Moore Pada Aplikasi
Kamus Istilah Kebidanan Berbasis Web. Query: J. Inf. Syst. 2, 53–62 (2018)
9. Ayu, P.S., Mesran, M.: Implementasi algoritma boyer moore pada aplikasi kamus nama bayi
beserta maknanya berbasis android. Pelita informatika: informasi dan informatika 17, 97–
101 (2018)
10. Bruce, W.W., Richard, E.W.: A Boyer-Moore-style algorithm for regular expression pattern
matching. Sci. Comput. Program. 48, 99–117 (2003)
11. Pandiselvam, P., Marimuthu, T., Lawrance, R.: A comparative study on string matching algorithms of biological sequences. In: International Conference on Intelligent Computing (2014)
12. DU, V.: A comparative analysis of various string-matching algorithms. In: 8th International
Research Conference, KDU (2015)
13. Robbi, R., Ansari, S.A., Ayu, P.A., Dicky, N.: Visual approach of searching process using
Boyer-Moore algorithm. In: Journal of Physics, vol. 930 (2017)
14. Mulyati, I.A.: Searching process using Bayer Moore algorithm in medical information
media. In: International Journal of Recent Technology and Engineering (IJRTE), vol.
8 (2019)
15. Michael, T.G., Roberto T.: Algorithm Design and Applications. John Wiley and Sons,
Hoboken (2015)
16. Abd, M.A., Zeki, A., Zamani, M., Chuprat, S., El-Qawasmeh, E. (eds.): Informatics
Engineering and Information Science. New York (2011)
17. Yi, C.L.: A survey of software-based string matching algorithms for forensic analysis. In:
Annual ADFSL Conference on Digital Forensics, Security and Law (2015)
Application of Machine Learning
and Artificial Intelligence Technology
Gender Classification from Inertial Sensor-Based Gait Dataset

Refat Khan Pathan1, Mohammad Amaz Uddin1, Nazmun Nahar1, Ferdous Ara1,
Mohammad Shahadat Hossain2, and Karl Andersson3

1 BGC Trust University Bangladesh, Bidyanagar, Chandanaish, Bangladesh
refatkhan93@gmail.com, amazuddin722@gmail.com, cu.ferdous@gmail.com, nazmun@bgctub.ac.bd
2 University of Chittagong, Chittagong 4331, Bangladesh
hossain_ms@cu.ac.bd
3 Lulea University of Technology, 931 87 Skellefteå, Sweden
karl.andersson@ltu.se

Abstract. The identification of people's gender and activities by means of gait knowledge is becoming important in everyday applications; security, safety, entertainment, and billing are examples of such applications. Many technologies can be used to monitor people's gender and activities, but existing solutions are limited by privacy concerns, implementation costs, and the accuracy they achieve. For instance, CCTV or Kinect sensor technology is a violation of privacy, since most people do not want photos or videos of themselves taken during their daily activities. A new addition to the gait analysis field is the inertial sensor-based gait dataset. Therefore, in this paper, we have classified people's gender from an inertial sensor-based gait dataset collected at Osaka University. Four machine learning algorithms, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Bagging, and Boosting, have been applied to identify people's gender. Further, we have extracted 104 useful features from the raw data. After feature selection, the experimental outcome shows that the accuracy of gender identification via Bagging stands at around 87.854%, while it is about 86.09% via SVM. This will in turn form the basis for supporting human wellbeing by using gait knowledge.

Keywords: Gait · Inertial sensor · Gender classification · Bagging · Boosting

1 Introduction

Gender is one of the most understandable and straightforward pieces of human information, yet it opens the door to collections of facts used in various pragmatic operations. In the process of gender determination, an individual's gender is determined by assessing the diverse particularities of femaleness and maleness [1]. Automatic human gender categorization is an interesting subject in pattern recognition, since gender carries very important and rich knowledge about the social activities of individuals [2].


In particular, information on gender can be employed by professional and intelligent frameworks, which are part of applications for health services, smart spaces, and biometric entrance control.
Throughout recent years, the identification of demographic characteristics of people, including age, sex, and ethnicity, using computer vision has been given growing consideration. It would be beneficial if a computer system or device could properly identify a given person. A great number of potential areas have been identified where gender identity is very important.
Although a person can easily differentiate between male and female, computer vision techniques consider gender identification a challenge. Many psychological and medical studies [3, 4] showed that gait characteristics are able to reveal people's gender. Several biometrics have been developed for human identification and authentication by checking the face, fingerprint, palm print, iris, gait, or a combination of these characteristics [5–7]. Human gait is widely considered in surveillance, because gait characteristics reflect how a person walks and describe his or her physical ability, and it is hard to mimic the gait of others. The development of sensing technologies and sensor signal processing techniques paved the way for the use of sensors to identify gender. The Inertial Measurement Unit (IMU), which contains a 3D accelerometer, a 3D gyroscope, and a magnetometer, has been utilized to assess physical movement through differentiation of human activity [8]. This paper deals with the issue of human gender recognition using a gait dataset: inertial sensor-based gait data is utilized for gender prediction. The Institute of Scientific and Industrial Research of Osaka University (OU-ISIR) created this inertial sensor gait dataset [9].
In the field of gait analysis, the inertial sensor-based gait dataset represents a relatively new addition. Most investigational work that applies machine learning algorithms to gait datasets is founded on images, and the majority of walking-related datasets have been assessed for gait identification. Only a few research works on personal verification with inertial sensor-based gait datasets have been conducted, because personal verification involves a number of different attributes, of which gender is very hard to predict.
The goal of this work is to develop a way to effectively identify gender from inertial sensor-based gait data. For gender classification, we have studied several supervised machine learning models. For the classification, we first extracted statistical features; during the feature extraction process, a primary collection of raw variables is compressed into more useful features that still describe the original dataset thoroughly and accurately. We extracted 104 features from the original dataset, but not all of these features are necessary for the classification process. For that reason, we apply a feature selection technique known as NCA (Neighborhood Component Analysis). In this paper, we also compare the classification performance before and after feature selection.
The remainder of the paper is organized as follows. Related work is discussed in Sect. 2, and the dataset is described in Sect. 3. The methodology of the study is presented in Sect. 4, and Sect. 5 discusses the experimental results. The final section presents the conclusion and future work.

2 Related Work

Many useful techniques or methods are used to find out the gender of a person by using
the gait data.
Kanij Mehtanin Khabir et al. [10] explored twelve types of time-domain features: average, median, maximum, minimum, variance, root mean square, mean absolute deviation, standard error of the mean, standard deviation, skewness, kurtosis, and vector sum from the inertial sensor dataset. They measured 88 features suitable for classification and regression problems, and SVM provided the highest accuracy among the classifiers. Their model has some overfitting problems, because the accuracy on the training set is considerably higher than on the test set.
In [11], statistical features such as the global minimum, global maximum, step duration, step length, mean, root mean square, standard deviation, entropy, energy, and amplitude were extracted from different components of the accelerations and angular velocities, giving a total of 50 features for every single step. The variety of features was too small, so this experiment only works as a proof of concept for now.
Makihara et al. [12] describe gender identification by applying video-based gait feature analysis with the help of a multi-view gait database. In this research, they used a deep machine learning process to predict gender from their own multi-view gait database.
Tim Van hamme et al. [13] explored the best solution for extracting gender information from IMU sensor-based gait traces. They compared distinctive feature engineering and machine learning algorithms, including both conventional and deep machine learning techniques.
Thanh Trung Ngo et al. [14] organized a challenging competition on gender detection using the OU-ISIR inertial sensor dataset. Several processing and feature extraction steps were carried out using deep learning methods, conventional classification methods, and sensor orientation handling methods. The number of features was not enough to generate a model usable in real time.
Ankita Jain et al. [15] used accelerometer and gyroscope sensor readings for gender identification on smartphones. They combined the data collected from the accelerometer and gyroscope sensors to improve the experimental performance; the Bagging classifier gives the best accuracy in this experiment.
Rosa Andrie Asmara et al. [17] used Gait Energy Image (GEI) and Gait Information Image (GII) processes for gender recognition. The GII method performed better than GEI using SVM. The accuracy of those works is low because of the shortage of features.
Jang-Hee Yoo et al. [16] used a sequential set of 2D stick figures to characterize the gait signature. Each gait signature is discriminated by the 2D sticks and the joint angles of the hip, knee, and ankle. A support vector machine (SVM) is used for gender recognition in this method.
Other researchers extracted 2D and 3D gait features based on silhouette and shape descriptors and combined them for gender classification [18]. The combined features give higher accuracy on the DGait dataset, and a kernel SVM was used for this experiment.
In [13, 14], both deep and shallow architectures were used for gender classification; deep learning consists of several representational levels, while shallow learning has few. In [15], behavioral biometric gait information was used for gender classification on smartphones, with data collected from 42 subjects. In [16], motion-based features and joint angle features were extracted from gait data, but the dataset is limited to medical purposes. In [17], the data was collected from only 20 subjects, which is why the accuracy is low. In [18], gait data was collected from 53 subjects walking in different directions. These datasets clearly show distortion of the age and gender ratios.

3 Dataset Description

Osaka University has developed the OU-ISIR inertial gait dataset. The dataset is relatively well developed and is the biggest inertial sensor-based gait database [19]. The dataset was collected using three IMU (inertial measurement unit) sensors, known as IMUZ, to capture gait signals. Each IMUZ comprises a triaxial accelerometer and a triaxial gyroscope. Of the three sensors, one is on the right, one on the left, and one at the center-back of a belt, and they were mounted to the belt with various orientations (90° for the center-left and center-right pairs and 180° for the left-right pair). Gait data was collected over five days from 745 visitors; each visitor entered and departed only once from the designated data capture tool. The dataset includes an equal number of each gender (384 males and 384 females). Each IMUZ sensor captures triaxial accelerometer and triaxial gyroscope signal sequences, so 6D data are collected from each sensor. Data for five activities have been collected: slope-up walk, slope-down walk, level walk, step-up walk, and step-down walk. For each subject, data were extracted only for the level walk, slope-down walk, and slope-up walk. The data has four labels: ID, age, gender, and activity. Figure 1 shows an example of the accelerometer and gyroscope signal sequences.
[Plots of the triaxial gyroscope signals Gx, Gy, Gz (rad/s) and the triaxial acceleration signals Ax, Ay, Az over time.]

Fig. 1. Example of signals for gyroscope and accelerometer data

4 Methodology

This section presents the proposed methodology framework, consisting of data collection, data pre-processing, feature extraction, and machine learning classifiers for gender classification, as illustrated in Fig. 2. The dataset collected at Osaka University using 3-axis accelerometer and gyroscope sensors was considered in this research. The dataset has been preprocessed to extract features, which have been divided into training and testing datasets. The training dataset was used to train the machine learning classifiers. Below is a description of each component of the proposed methodology shown in Fig. 2.

Fig. 2. Graphical illustration of proposed methodology



4.1 Feature Extraction


We have extracted important features which are given as input to the classification; feature extraction is the main part of the classification. The walking patterns of men and women differ biologically, and we take advantage of statistical and energy motion features since we are trying to classify patterns on different surfaces such as level ground and stairs. Therefore, in order to obtain a precise representation of the walking pattern for gender classification, we have computed both time-domain and frequency-domain features for the 6D components. The time-domain features include the maximum, minimum, mean, median, mean absolute deviation, skewness, kurtosis, variance, standard error of the mean, standard deviation, root mean square, entropy, and the vector sums of each of these quantities; the frequency-domain features include the energy, magnitude, vector sum of energy, and vector sum of magnitude. The Fast Fourier Transform (FFT) has been used to compute the frequency-domain features. The total number of features is 104. Table 1 shows the names of the time-domain and frequency-domain features.

Table 1. Features for gender classification

Domain      Sensor type                Axis     Feature names
Time        Accelerometer, Gyroscope   x, y, z  Mean, Median, Minimum, Maximum, Standard deviation, Mean absolute deviation, Standard error of mean, Skewness, Kurtosis, Variance, Root mean square, Entropy, and the vector sum of each of these statistics
Frequency   Accelerometer, Gyroscope   x, y, z  Energy, Magnitude, Vector sum of energy, Vector sum of magnitude
Table 2 shows some of the feature values extracted from the gait data.

Table 2. Some of feature value after feature extraction


ax-mean ax-median ax-max ax-min ax-mad ax-skew ax-kurtosis
−0.041 0.0093 1.514 −1.719 0.296 −0.290 4.483
−0.045 −0.057 1.998 −1.207 0.296 0.568 5.729
−0.021 −0.003 1.264 −0.882 0.234 0.261 4.093
−0.020 0.039 1.381 −1.249 0.292 −0.213 3.342
−0.017 −0.034 2.599 −1.485 0.303 1.425 10.554
−0.020 −0.004 1.997 −0.846 0.264 0.902 7.443
−0.019 −0.024 0.598 −0.567 0.162 0.228 2.844
0.002 0.017 0.587 −0.621 0.179 −0.102 2.945
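As an illustration, a few of the features of Table 1 can be computed for one sensor window as sketched below; Python stands in for the authors' MATLAB code, and the window length of 128 samples is an assumption, since the paper does not specify its windowing:

import numpy as np
from scipy import stats

def window_features(acc):
    """Compute sample time- and frequency-domain features for an (n, 3) accelerometer window."""
    feats = {}
    for i, axis in enumerate("xyz"):
        s = acc[:, i]
        feats[f"a{axis}-mean"] = s.mean()
        feats[f"a{axis}-median"] = np.median(s)
        feats[f"a{axis}-max"] = s.max()
        feats[f"a{axis}-min"] = s.min()
        feats[f"a{axis}-mad"] = np.mean(np.abs(s - s.mean()))         # mean absolute deviation
        feats[f"a{axis}-skew"] = stats.skew(s)
        feats[f"a{axis}-kurtosis"] = stats.kurtosis(s, fisher=False)  # Pearson kurtosis, as in Table 2
        feats[f"a{axis}-rms"] = np.sqrt(np.mean(s ** 2))
        feats[f"a{axis}-energy"] = np.sum(np.abs(np.fft.fft(s)) ** 2) / len(s)  # via FFT
    feats["a-vsum-mean"] = np.linalg.norm(acc.mean(axis=0))           # one vector-sum feature
    return feats

demo = window_features(np.random.randn(128, 3))   # synthetic window of 128 samples
print(demo["ax-mean"], demo["ax-kurtosis"])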

4.2 Feature Selection


Not all of the features are needed to classify gender. Increasing the number of features adds dimensions and can therefore lead to an overfitting problem. A feature selection methodology has been implemented to find the relevant features and eliminate the overfitting caused by unnecessary features. It is fundamental for the learning algorithm to concentrate on the relevant subset of features and ignore the rest; the learning algorithm works on the training set to choose the right subset of the extracted features, which can then be applied to the test set. Neighborhood Component Analysis (NCA) is one of the popular techniques used for the feature selection process.
NCA is a non-parametric approach to identifying features with a view to optimizing the predictability of regression and classification algorithms. NCA learns a quadratic distance metric for the k-nearest neighbor (KNN) supervised classification algorithm by reducing the leave-one-out (LOO) error. Quadratic distance metrics can be represented using symmetric positive semi-definite matrices, and a linear transformation of the input features, denoted by a matrix A, can result in higher KNN classification performance. Let Q = A^T A; the distance between two points x1 and x2 can then be calculated as

d(x1, x2) = (x1 − x2)^T Q (x1 − x2) = (Ax1 − Ax2)^T (Ax1 − Ax2).

To prevent discontinuity of the LOO classification error, a soft cost function over the neighbor assignments in the transformed space is used. In the transformed space, the probability pij that point i selects point j as its nearest neighbor can be described as

pij = exp(−||Axi − Axj||²) / Σ_{k≠i} exp(−||Axi − Axk||²),    with pii = 0.
The transformation matrix A is obtained by maximizing the expected number of correctly classified points, regularized by the Frobenius norm:

A* = argmax_A Σ_i Σ_{j∈Ci} pij − λ||A||_F²,

where Ci denotes the set of points belonging to the same class as point i.

Here the parameter λ balances maximizing the NCA probability against minimizing the Frobenius norm; the conjugate gradient method can be used to solve the optimization problem. If A is confined to a diagonal matrix, the diagonal entries represent the weights of the individual input features, so feature selection can be carried out on the basis of the magnitudes of these weights.
The MATLAB Statistics and Machine Learning Toolbox function fscnca is used for the NCA feature selection process. After applying the NCA feature selection method, the number of features is 84.
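The paper performs this step with MATLAB's fscnca. A rough Python analogue (our assumption, not the authors' code) is to learn an NCA transform and rank the original features by the norms of their columns in the learned matrix:

import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis

def nca_feature_ranking(X, y, n_keep=84):
    """Rank input features by their column norms in a learned NCA transform."""
    nca = NeighborhoodComponentsAnalysis(random_state=0)
    A = nca.fit(X, y).components_               # learned matrix, shape (n_components, n_features)
    weights = np.linalg.norm(A, axis=0)         # influence of each original feature
    keep = np.argsort(weights)[::-1][:n_keep]   # indices of the strongest features
    return keep, weights

# Demo on synthetic data standing in for the 104 extracted gait features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 104))
y = rng.integers(0, 2, size=300)
keep, w = nca_feature_ranking(X, y)
X_selected = X[:, keep]                         # the 84 retained feature columns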

4.3 Classification of Gender


The classification problem consists of two classes: male and female. The main goal of our research is to classify human gender with high accuracy. For this purpose, we have used four classification algorithms, namely KNN (K-Nearest Neighbor), SVM (Support Vector Machine) [21, 22], Bagging [24], and Boosting [25, 26].

4.4 Model Training


We have divided the entire dataset into two sections for training and evaluation. We separated 30% of the data from the dataset before preprocessing, which has been considered the test data; the remaining 70% is considered the training data. There are two contradictory factors when dividing a dataset: with limited training data, the parameter estimates have greater variance, and with less testing data, the performance figures have greater variance. The data should therefore be split so that neither variance is too large, which depends mainly on the volume of data. The dataset we have used is not large, so no single split ratio gives a clearly better result; for that reason we also perform cross-validation. The two sets were then processed in two different threads. After preprocessing, we collected 1,556 training samples and 456 test samples. For the training phase, we trained our models with a training set of 1,100 samples. Finally, 456 samples were used to test the models; this test dataset was never seen by the models.
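A minimal sketch of this 70/30 split and the four classifiers follows; scikit-learn models stand in for the MATLAB implementations actually used, and synthetic data replaces the gait features:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier

rng = np.random.default_rng(0)                  # stand-in for the 84 selected features
X = rng.normal(size=(2012, 84))                 # ~1,556 training + 456 test samples
y = rng.integers(0, 2, size=2012)               # gender labels coded 0/1

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, stratify=y, random_state=0)

models = {
    "SVM": SVC(kernel="rbf"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Bagging": BaggingClassifier(n_estimators=100, random_state=0),
    "Boosting": AdaBoostClassifier(n_estimators=100, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, round(model.score(X_te, y_te), 3))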
5 Result and Discussion

Once model training is completed, the learned models are used to classify the gender of people. We have also examined the k-fold cross-validation results to check that the models did not overfit the training dataset. Different parameters for each model and the maximum accuracy of each model have been observed. First, we constructed the models using the 104 features derived from the sensor data; to minimize the number of features, we then applied the feature selection method. Our primary objective is to select the most discriminative time-domain and frequency-domain features for gender classification and to find the best accuracy among the four classification algorithms: KNN (K-Nearest Neighbor), SVM (Support Vector Machine), Bagging, and Boosting.
For the experiments, we have used accuracy, MAE, and RMSE to measure the performance of the algorithms, and we also compared the classification models with three further metrics [27]: precision, recall, and F1-score. MATLAB 2018 was used for calculating all results, with the Statistics and Machine Learning Toolbox used for both feature selection and classification.
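The reported metrics can be computed as sketched below, continuing from the training sketch above; the labels are assumed to be coded 0/1 so that MAE and RMSE are well defined:

import numpy as np
from sklearn.metrics import (accuracy_score, mean_absolute_error,
                             mean_squared_error, precision_score,
                             recall_score, f1_score)

y_pred = models["Bagging"].predict(X_te)        # models, X_te, y_te from the sketch above
print("Accuracy :", accuracy_score(y_te, y_pred))
print("MAE      :", mean_absolute_error(y_te, y_pred))
print("RMSE     :", np.sqrt(mean_squared_error(y_te, y_pred)))
print("Precision:", precision_score(y_te, y_pred))
print("Recall   :", recall_score(y_te, y_pred))
print("F1-score :", f1_score(y_te, y_pred))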

Table 3. Classification accuracy before and after feature selection


Classifier Name Accuracy Accuracy
Before feature Selection After feature Selection
SVM 83.898% 86.091%
K-Nearest Neighbor 79.661% 81.456%
Bagging 83.615% 87.854%
Boosting 81.425% 84.105%

Table 3 depicts the results comparing the classification accuracy before and after the feature selection process. 1,556 samples have been used for these classifications, where each class contains the same number of samples. It can be seen from the table that the accuracy of all classifiers is higher than 80%. The Bagging algorithm offers the best result, with 87.854%, compared to the other classifiers, because bagging reduces the variance of an estimate by combining several estimates from various models. The support vector machine shows comparatively lower accuracy. The results also show that the accuracy of each algorithm increases by about 3% when the Neighborhood Component Analysis (NCA) dimension reduction method is applied to the original features. This is because the NCA method decreases the dimensionality of the data; reducing the dimensionality makes the classification process simpler and increases the classification accuracy rate.
[Bar chart: accuracy (%) of SVM, KNN, Bagging, and Boosting before and after feature selection; values as in Table 3.]

Fig. 3. Graphical representation of accuracy before and after feature selection

Figure 3 shows the graphical representation of the classification accuracy before and after feature selection. It can be seen from the graph that the accuracy rate of each classifier increases after the feature selection method is applied. From this observation, we can see that the Bagging algorithm gives better accuracy both before and after the selection process. The Bagging algorithm performs better because it combines weak learners and aggregates them in such a way that the output is the average of the weak learners; Bagging therefore has less variance and gives better results than the other algorithms.

Table 4. Performance matrices for gender classification


Classifier Name MAE RMSE Precision Recall F-Score
SVM 0.139 0.373 0.867 0.855 0.860
K-Nearest Neighbor 0.185 0.430 0.827 0.798 0.813
Bagging 0.121 0.348 0.899 0.855 0.876
Boosting 0.159 0.399 0.858 0.820 0.839

From Table 4, it can be seen that the Bagging algorithm shows the best performance among the four algorithms: it gives the lowest MAE and RMSE and has the highest F1-score, precision, and recall.
Fig. 4. ROC curve for model evaluation

Figure 4 shows the ROC curves of the male class for all classifiers; it can be noticed that the Bagging, Boosting, and KNN classifier curves converge quickly, whereas the SVM classifier curve converges slowly.
Figure 5 illustrates the confusion matrix of the Bagging classifier. The input of this classification is the data obtained after the feature selection process (NCA-based features). The confusion matrix is computed between the two classes, Male and Female. The rows represent the real class and the columns represent the class recognized by the classifier, and the values indicate the probability of the actual class being recognized. The matrix shows that the male class is correctly recognized 87.4% of the time and the female class 85.3% of the time.

Fig. 5. Confusion matrix

We have performed a comparative evaluation of our methodology against existing work. The experimental results are presented in Table 5. From the results, we can see that our proposed methodology outperforms the other existing methods.
Table 5. Comparative study of the proposed approach with existing work


Research study               Approach   Accuracy
Ankita Jain et al. [15]      Bagging    76.83%
Khabir et al. [10]           SVM        84.76%
R. A. Asmara et al. [17]     SVM        70%
The proposed algorithm       Bagging    87.85%

6 Conclusion and Future Work

In this research, we have tried to identify the best machine learning algorithm to classify gender from the inertial sensor-based gait dataset. The largest inertial sensor-based gait dataset, OU-ISIR, has been analyzed for this experiment. The use of time-domain and frequency-domain features is an essential part of our paper, and we also select the features that are most important for the classification of gender. These extracted features are used successfully to train our selected classification models. From the results, it has been observed that after selecting from the 104 features, the accuracy of the classifiers increases, which in turn could be used to ensure the safety and security of human beings. In the future, we will use some deep learning methods for gender classification [28–31] and also apply methodologies to remove uncertainty [32–37]. However, in this study, we have used only 1,100 samples for the training set. In the future, we will use more data for the training phase, and we will also use other feature selection methods such as PCA and t-SNE for training our dataset.

References
1. Wu, Y., Zhuang, Y., Long, X., Lin, F., Xu, W.: Human gender classification: a review. arXiv preprint arXiv:1507.05122 (2015)
2. Udry, J.R.: The nature of gender. Demography, 31(4), 561–573 (1994)
3. Murray, M.P., Drought, A.B., Kory, R.C.: Walking patterns of normal men. JBJS 46(2),
335–360 (1964)
4. Murray, M.P.: Gait as a total pattern of movement: including a bibliography on gait. Am.
J. Phys. Med. Rehabilitation 46(1), 290–333 (1967)
5. Xiao, Q.: Technology review-biometrics-technology, application, challenge, and computa-
tional intelligence solutions. IEEE Comput. Intell. Magazine, 2(2), 5–25 (2007)
6. Wong, K.Y.E., Sainarayanan, G., Chekima, A.: Palmprint based biometric system: a comparative study on discrete cosine transform energy, wavelet transform energy, and sobelcode methods. Int. J. Biomed. Soft Comput. Human Sci. 14(1), 11–19 (2009)
7. Hanmandlu, M., Gupta, R.B., Sayeed, F., Ansari, A.Q.: An experimental study of different
features for face recognition. In: 2011 International Conference on Communication Systems
and Network Technologies, pp. 567–571. IEEE, June 2011
8. Sasaki, J.E., Hickey, A., Staudenmayer, J., John, D., Kent, J.A., Freedson, P.S.: Performance
of activity classification algorithms in free-living older adults. Med. Sci. Sports and Exercise,
48(5), 941 (2016)
9. Ngo, T.T., Makihara, Y., Nagahara, H., Mukaigawa, Y., Yagi, Y.: The largest inertial
sensor-based gait database and performance evaluation of gait-based personal authentication.
Pattern Recogn. 47(1), 228–237 (2014)
10. Khabir, K.M., Siraj, M.S., Ahmed, M., Ahmed, M.U.: Prediction of gender and age from
inertial sensor-based gait dataset. In: 2019 Joint 8th International Conference on Informatics,
Electronics & Vision (ICIEV) and 2019 3rd International Conference on Imaging, Vision &
Pattern Recognition (icIVPR), pp. 371–376. IEEE, May 2019
11. Riaz, Q., Vögele, A., Krüger, B., Weber, A.: One small step for a man: estimation of gender,
age and height from recordings of one step by a single inertial sensor. Sensors, 15(12),
31999–32019 (2015)
12. Makihara, Y., Mannami, H., Yagi, Y.: Gait analysis of gender and age using a large-scale
multi-view gait database. In: Asian Conference on Computer Vision, pp. 440–451. Springer,
Heidelberg, November 2010
13. Garofalo, G., ArgonesRúa, E., Preuveneers, D., Joosen, W.: A systematic comparison of age
and gender prediction on imu sensor-based gait traces. Sensors, 19(13), 2945 (2019)
14. Ngo, T.T., Ahad, M.A.R., Antar, A.D., Ahmed, M., Muramatsu, D., Makihara, Y., Hattori,
Y.: OU-ISIR wearable sensor-based gait challenge: age and gender. In: Proceedings of the
12th IAPR International Conference on Biometrics, ICB (2019)
15. Jain, A., Kanhangad, V.: Investigating gender recognition in smartphones using accelerom-
eter and gyroscope sensor readings. In: 2016 International Conference on Computational
Techniques in Information and Communication Technologies (ICCTICT), pp. 597–602.
IEEE, March 2016
16. Yoo, J.H., Hwang, D., Nixon, M.S.: Gender classification in human gait using support vector
machine. In: International Conference on Advanced Concepts for Intelligent Vision Systems,
pp. 138–145. Springer, Heidelberg, September 2005
17. Asmara, R.A., Masruri, I., Rahmad, C., Siradjuddin, I., Rohadi, E., Ronilaya, F., Hasanah,
Q.: Comparative study of gait gender identification using gait energy image (GEI) and gait
information image (GII). In: MATEC Web of Conferences, vol. 197, p. 15006. EDP
Sciences (2018)
18. Borràs, R., Lapedriza, A., Igual, L.: Depth information in human gait analysis: an
experimental study on gender recognition. In: International Conference Image Analysis and
Recognition, pp. 98–105. Springer, Heidelberg, June 2012
19. Lu, J., Tan, Y.-P.: Gait-based human age estimation. IEEE Trans. Inf. Forensics Secur. 5(4), 761–770 (2010)
20. Trung, N.T., Makihara, Y., Nagahara, H., Mukaigawa, Y., Yagi, Y.: Performance evaluation
of gait recognition using the largest inertial sensor-based gait database. In: 2012 5th IAPR
International Conference on Biometrics (ICB). IEEE, pp. 360–366 (2012)
21. Burges, C.J.: A tutorial on support vector machines for pattern recognition. Data Min.
Knowl. Discovery 2(2), 121–167 (1998)
22. Szegedy, V.V.O.: Processing images using deep neural networks. USA Patent 9,715,642, 25
July 2017
23. Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)
24. Johnson, R.W.: An introduction to the bootstrap. Teach. Stat. 23(2), 49–54 (2001)
25. Rahman, A., Verma, B.: Ensemble classifier generation using non-uniform layered clustering
and Genetic Algorithm. Knowledge-Based Syst. 43, 30–42 (2013)
26. Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: International
Conference on Machine Learning, pp. 148–156 (1996)
27. Sokolova, M., Lapalme, G.: A systematic analysis of performance measures for classification
tasks. Inf. Process. Manage. 45(4), 427–437 (2009)

28. Chowdhury, R.R., Hossain, M.S., ul Islam, R., Andersson, K., Hossain, S.: Bangla
handwritten character recognition using convolutional neural network with data augmen-
tation. In: 2019 Joint 8th International Conference on Informatics, Electronics & Vision
(ICIEV) and 2019 3rd International Conference on Imaging, Vision & Pattern Recognition
(icIVPR), pp. 318–323. IEEE, May 2019
29. Ahmed, T.U., Hossain, M.S., Alam, M.J., Andersson, K.: An integrated CNN-RNN
framework to assess road crack. In: 2019 22nd International Conference on Computer and
Information Technology (ICCIT), pp. 1–6. IEEE, December 2019
30. Ahmed, T.U., Hossain, S., Hossain, M.S., ul Islam, R., Andersson, K.: Facial expression
recognition using convolutional neural network with data augmentation. In: 2019 Joint 8th
International Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd
International Conference on Imaging, Vision & Pattern Recognition (icIVPR), pp. 336–341.
IEEE, May 2019
31. Islam, M.Z., Hossain, M.S., ul Islam, R., Andersson, K.: Static hand gesture recognition
using convolutional neural network with data augmentation, May 2019
32. Biswas, M., Chowdhury, S.U., Nahar, N., Hossain, M.S., Andersson, K.: A belief rule base
expert system for staging non-small cell lung cancer under uncertainty. In: 2019 IEEE
International Conference on Biomedical Engineering, Computer and Information Technol-
ogy for Health (BECITHCON), pp. 47–52. IEEE, November 2019
33. Kabir, S., Islam, R.U., Hossain, M.S., Andersson, K.: An integrated approach of belief rule
base and deep learning to predict air pollution. Sensors 20(7), 1956 (2020)
34. Monrat, A.A., Islam, R.U., Hossain, M.S., Andersson, K.: A belief rule based flood risk
assessment expert system using real time sensor data streaming. In: 2018 IEEE 43rd
Conference on Local Computer Networks Workshops (LCN Workshops), pp. 38–45. IEEE,
October 2018
35. Karim, R., Hossain, M.S., Khalid, M.S., Mustafa, R., Bhuiyan, T.A.: A belief rule-based
expert system to assess bronchiolitis suspicion from signs and symptoms under uncertainty.
In: Proceedings of SAI Intelligent Systems Conference, pp. 331–343. Springer, Cham,
September 2016
36. Hossain, M.S., Monrat, A.A., Hasan, M., Karim, R., Bhuiyan, T.A., Khalid, M.S.: A belief
rule-based expert system to assess mental disorder under uncertainty. In: 2016 5th
International Conference on Informatics, Electronics and Vision (ICIEV), pp. 1089–1094.
IEEE, May 2016
37. Hossain, M.S., Habib, I.B., Andersson, K.: A belief rule based expert system to diagnose
dengue fever under uncertainty. In: 2017 Computing Conference, pp. 179–186. IEEE, July
2017
Lévy-Flight Intensified Current Search
for Multimodal Function Minimization

Wattanawong Romsai, Prarot Leeart, and Auttarat Nawikavatan

Department of Electrical Engineering, Faculty of Engineering, Southeast Asia University,
19/1 Petchkasem Road, Nongkhangphlu, Nonghkaem 10160, Bangkok, Thailand
wattanawong.r@gmail.com, poaeng41leeart@gmail.com, auttarat@hotmail.com

Abstract. This paper proposes a novel trajectory-based metaheuristic algorithm named the Lévy-flight intensified current search (LFICuS) for multimodal function minimization. The proposed LFICuS is a new modified version of the intensified current search (ICuS), which was inspired by the electrical current flowing through electric networks. Random numbers drawn from the Lévy-flight distribution and an adjustable search radius mechanism are employed to improve the search performance. To demonstrate its effectiveness, the proposed LFICuS is tested against ten selected standard multimodal benchmark functions for minimization, and the results obtained by the LFICuS are compared with those obtained by the ICuS. The simulation results show that the proposed LFICuS is much more efficient for function minimization than the ICuS.

Keywords: Lévy-flight intensified current search · Intensified current search · Function minimization · Metaheuristic algorithm

1 Introduction

Based on modern optimization, many metaheuristic algorithms have been launched for
solving several real-world optimization problems under complex constraints [1, 2].
From literature reviews, the current search (CuS) is one of the most interesting trajectory-based metaheuristic algorithms [3]. The development of the CuS began in 2012, when it was first proposed as an optimizer to solve optimization problems [3]. The CuS algorithm mimics the behavior of electric current in electric circuits and networks, and it showed search performance superior to the genetic algorithm (GA), tabu search (TS), and particle swarm optimization (PSO) [3]. The CuS
2013-2014, the adaptive current search (ACuS) was launched [6] as a modified version
of the conventional CuS. The ACuS consists of the memory list (ML) used to escape
from local entrapment caused by any local solution and the adaptive radius
(AR) conducted to speed up the search process. The ACuS was successfully applied to
industrial engineering [6] and energy resource management [7]. For some particular
problems, both the CuS and ACuS are trapped by local optima and consume much
search time. In 2014, the intensified current search (ICuS) was proposed to improve its


search performance [8]. The ICuS algorithm consists of the ML, AR and the adaptive
neighborhood (AN) mechanisms. The ML regarded as the exploration strategy is used
to store the ranked initial solutions at the beginning of search process, record the
solution found along each search direction, and contain all local solutions found at the
end of each search direction. The ML is also applied to escape the local entrapments
caused by local optima. The AR and AN mechanisms regarded as the exploitation
strategy are together conducted to speed up the search process. The ICuS was suc-
cessfully applied to many control engineering problems including single-objective and
multi-objective optimization problems [8–11]. For some optimization problems, especially large-space multimodal problems, the ICuS might be trapped by local optima. This is probably because the uniformly distributed random numbers used in the ICuS algorithm are not efficient enough for such problems. Thus, it needs to be modified to enhance its search performance and to speed up the search process.
In this paper, a new trajectory-based metaheuristic algorithm called the Lévy-flight intensified current search (LFICuS) is proposed. The proposed LFICuS is the newest modified version of the ICuS. A random number drawn from the Lévy-flight distribution and an adjustable search radius mechanism are employed to improve the search performance. This paper consists of five sections. An introduction is given in Sect. 1. The ICuS algorithm is briefly described and the proposed LFICuS is illustrated in Sect. 2. The ten selected standard multimodal benchmark functions used in this paper are detailed in Sect. 3. Results and discussions of the performance evaluation of the ICuS and LFICuS are provided in Sect. 4. Finally, conclusions follow in Sect. 5.

2 ICuS and LFICuS Algorithms

The ICuS algorithm is briefly described in this section. Then, the proposed LFICuS
algorithm is elaborately illustrated as follows.

2.1 ICuS Algorithm


The ICuS algorithm is based on an iterative random search using random numbers drawn from the uniform distribution [8]. The ICuS possesses the ML, regarded as the exploration strategy, and the AR and AN mechanisms, regarded as the exploitation strategy. The ML is used to escape from local entrapment caused by any local solution.
The ML consists of three levels: low, medium and high. The low-level ML is used to
store the ranked initial solutions at the beginning of search process, the medium-level
ML is conducted to store the solution found along each search direction, and the high-
level ML is used to store all local solutions found at the end of each search direction.
The AR mechanism conducted to speed up the search process is activated when a
current solution is relatively close to a local minimum by properly reducing the search
radius. The radius is thus decreased in accordance with the best cost function found so
far. The smaller the cost function, the smaller the search radius. The AN mechanism, also applied to speed up the search process, is invoked once a current solution is relatively close to a local minimum. The neighborhood members are decreased in accordance with the best cost function found. The smaller the cost function, the fewer the
neighborhood members. With ML, AR and AN, a sequence of solutions obtained by the ICuS very rapidly converges to the global minimum. The ICuS algorithm can be described by the pseudo code shown in Fig. 1.

Fig. 1. ICuS algorithm.

2.2 Proposed LFICuS Algorithm


Referring to Fig. 1, the ICuS employs random numbers drawn from the uniform distribution for generating the neighborhood members as feasible solutions. The probability density function (PDF) f(x) of the continuous uniform distribution can be expressed as in (1), where a and b are the lower and upper bounds of the random process. The mean μ and variance σ² of the continuous uniform distribution are limited by the bounds of the random process. The random number drawn from the uniform distribution is considered to have non-scale-free characteristics. A random number with scale-free characteristics is one drawn from the Lévy-flight distribution [12], whose PDF is stated in
(2), where c is the scale parameter. The random number with the Lévy-flight distribution has an infinite mean and infinite variance [12]; hence, it is more efficient than the uniformly distributed random number. Many metaheuristics, including the cuckoo search (CS) [13] and the flower pollination algorithm (FPA) [14], utilize random numbers with the Lévy-flight distribution for exploring feasible solutions.

Fig. 2. Proposed LFICuS algorithm.

$$ f(x)\big|_{\mathrm{uniform}} = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & x < a \ \text{or}\ x > b \end{cases} \qquad (1) $$

$$ f(x)\big|_{\mathrm{L\acute{e}vy}} = \sqrt{\frac{c}{2\pi}}\,\frac{e^{-c/[2(x-\mu)]}}{(x-\mu)^{3/2}} \qquad (2) $$

The proposed LFICuS algorithm uses the random number drawn from the Lévy-
flight distribution to generate the neighborhood members as feasible solutions in each
search iteration. When applied in the proposed LFICuS algorithm, the Lévy-flight distribution L can be approximated by (3) [14], where s is the step length, k is an index and Γ(k) is the Gamma function expressed in (4); note that Γ(n) = (n−1)! when k = n is an integer. The Lévy-flight distribution works well for large spaces. Therefore, the adjustable search radius (AR) mechanism is also conducted by setting the initial search radius R = X (the search space). The proposed LFICuS algorithm can be described by the pseudo
code shown in Fig. 2.

$$ L \approx \frac{k\,\Gamma(k)\,\sin(\pi k/2)}{\pi}\,\frac{1}{s^{1+k}} \qquad (3) $$

$$ \Gamma(k) = \int_{0}^{\infty} t^{k-1} e^{-t}\,dt \qquad (4) $$
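To make the sampling concrete, the following MATLAB fragment is a hedged sketch (not the authors' code) of Mantegna's algorithm, a common way to realize steps following the Lévy-flight approximation in (3), as used by CS [13] and FPA [14]; the index value k = 1.5 is an assumption.

% Hedged sketch: Levy-flight step length via Mantegna's algorithm.
k = 1.5;                                       % Levy index (k in (3)); assumed value
sigma = (gamma(1+k)*sin(pi*k/2) / ...
        (gamma((1+k)/2)*k*2^((k-1)/2)))^(1/k);
u = sigma * randn;                             % Gaussian numerator sample
v = randn;                                     % Gaussian denominator sample
s = u / abs(v)^(1/k);                          % Levy-distributed step length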

3 Benchmark Functions

To test the search performance of optimization algorithms, various standard benchmark functions have been designed in the literature [15, 16]. Any new optimization algorithm should also be validated and tested against these standard benchmark functions. In this work, ten standard multimodal benchmark functions are selected from [15, 16] for testing the proposed LFICuS. They can be considered nonlinear and unsymmetrical functions that are very difficult to minimize.

[3D surface plot of f(x1, x2) for x1, x2 ∈ [−500, 500], with function values up to about 1500.]

Fig. 3. 3D surface of SF function.

Details of the ten selected standard multimodal benchmark functions are summarized in Table 1, where x* is the optimal solution, f(x*) is the optimal function value and fmax is the maximum allowance of f(x*). For example, the 3D surface of the Schwefel function (SF), f1(x), is plotted in Fig. 3.
Table 1. Ten selected benchmark functions.

(1) Schwefel function (SF): f1(x) = 418.9829·D − Σ_{i=1..D} x_i·sin(√|x_i|); −500 ≤ x_i ≤ 500, i = 1, 2, …, D; D = 2; x* = (420.9687, …, 420.9687); f1(x*) = 0; f1max = 1×10⁻⁴

(2) Ackley function (AF): f2(x) = −20·exp(−0.02·√((1/D)·Σ_{i=1..D} x_i²)) − exp((1/D)·Σ_{i=1..D} cos(2πx_i)) + 20 + e; −35 ≤ x_i ≤ 35, i = 1, 2, …, D; D = 2; x* = (0, …, 0); f2(x*) = 0; f2max = 1×10⁻⁵

(3) Bohachevsky function (BF): f3(x) = x1² + 2x2² − 0.3·cos(3πx1) − 0.4·cos(4πx2) + 0.7; −100 ≤ x_i ≤ 100, i = 1, 2; x* = (0, 0); f3(x*) = 0; f3max = 1×10⁻⁹

(4) Drop-Wave function (DWF): f4(x) = −[1 + cos(12·√(x1² + x2²))] / [0.5·(x1² + x2²) + 2]; −5.2 ≤ x_i ≤ 5.2, i = 1, 2; x* = (0, 0); f4(x*) = −1; f4max = −0.9999

(5) Egg-Crate function (ECF): f5(x) = x1² + x2² + 25·[sin²(x1) + sin²(x2)]; −5 ≤ x_i ≤ 5, i = 1, 2; x* = (0, 0); f5(x*) = 0; f5max = 1×10⁻⁵

(6) Pen-Holder function (PHF): f6(x) = −exp[−|cos(x1)·cos(x2)·e^{|1−(x1²+x2²)^{0.5}/π|}|⁻¹]; −11 ≤ x_i ≤ 11, i = 1, 2; x* = (±9.6462, ±9.6462); f6(x*) = −0.9635; f6max = −0.9634

(7) Rastrigin function (RF): f7(x) = 10·D + Σ_{i=1..D} [x_i² − 10·cos(2πx_i)]; −5.12 ≤ x_i ≤ 5.12, i = 1, 2, …, D; D = 2; x* = (0, …, 0); f7(x*) = 0; f7max = 1×10⁻⁹

(8) Styblinski–Tang function (STF): f8(x) = (1/2)·Σ_{i=1..D} (x_i⁴ − 16x_i² + 5x_i); −5 ≤ x_i ≤ 5, i = 1, 2, …, D; D = 2; x* = (−2.9035, …, −2.9035); f8(x*) = −39.1660·D; f8max = −78.3319

(9) Yang-2 function (Y2F): f9(x) = (Σ_{i=1..D} |x_i|)·exp(−Σ_{i=1..D} sin(x_i²)); −2π ≤ x_i ≤ 2π, i = 1, 2, …, D; D = 2; x* = (0, …, 0); f9(x*) = 0; f9max = 1×10⁻⁹

(10) Yang-4 function (Y4F): f10(x) = [Σ_{i=1..D} sin²(x_i) − e^{−Σ x_i²}]·e^{−Σ sin²(√|x_i|)}; −10 ≤ x_i ≤ 10, i = 1, 2, …, D; D = 2; x* = (0, …, 0); f10(x*) = −1; f10max = −0.9999
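As a small illustrative sketch (an assumption for illustration, not code from the paper), the SF function of Table 1 can be written as a MATLAB objective handle for the minimization tests:

% Hedged sketch: Schwefel function f1 of Table 1 as a MATLAB objective.
D  = 2;
f1 = @(x) 418.9829*D - sum(x .* sin(sqrt(abs(x))));
f1([420.9687 420.9687])   % close to 0 at the global minimum x*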

4 Results and Discussions

The proposed LFICuS algorithm was coded by MATLAB version 2017b (License No.
#40637337) run on Intel(R) Core(TM) i7-10510U CPU@1.80 GHz, 2.30 GHz,
16.0 GB RAM for function minimization tests against ten selected multimodal
benchmark functions. The searching parameters of the proposed LFICuS algorithm are set from preliminary studies with different ranges of parameters, i.e., the step length s, the index k, the number of states of AR and AN mechanism activation h, the number of initial neighborhood members n, and the number of search directions N. By varying s = 0.0001,
0.001, 0.01 and 0.1, k = 0.0000001, 0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 0.2,
0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9 and 2.0, h =
2, 3, 4 and 5, n = 25, 50, 100, 150, 200 and 300 and N = 10, 25, 50, 75, 100, 125, 150
and 200, it was found that the best parameters for most benchmark functions are: s =
0.001 to 0.01, k = 0.01 to 1.3, h = 2 to 3, n = 50 to 100 and N = 50 to 75. For the performance evaluation tests, s = 0.01, k = 0.3, h = 2, n = 100 and N = 50 are set for all selected functions. Each search direction is terminated at 1000 iterations, and 1000 trial runs are conducted for each algorithm. Each algorithm is terminated once either of two termination criteria (TC) is satisfied: (1) the function value is less than the fmax of the corresponding function in Table 1, or (2) the search reaches the maximum number of search directions N = 50. The former criterion implies that the search is successful, while the latter means that it is not. For a fair comparison, the searching parameters of the ICuS are set to the same values as those of the LFICuS.
The obtained results are summarized in Table 2. The data in Table 2 are given in the form AE ± SD (SR%), where AE is the average number (mean) of function evaluations, SD is the standard deviation and SR is the success rate. The AE value implies the search time consumed, and the SD value implies the robustness of the algorithm. Referring to Table 2, the proposed LFICuS algorithm provides a greater SR and smaller AE and SD values than the ICuS. It can be noticed that the proposed LFICuS algorithm performs much more efficiently in function minimization than the ICuS algorithm.
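As a hedged illustration (the per-trial logs evals and success are assumed variable names, not from the paper), the AE ± SD (SR%) entries of Table 2 could be aggregated in MATLAB as follows.

% Hedged sketch: aggregate 1000-trial logs into the AE +/- SD (SR%) form.
% 'evals' (function evaluations) and 'success' (logical) are assumed logs.
AE = mean(evals(success));          % average function evaluations
SD = std(evals(success));           % standard deviation
SR = 100 * mean(success);           % success rate in percent
fprintf('%.4e +/- %.4e (%.2f%%)\n', AE, SD, SR);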

Table 2. Results of function optimization by the ICuS and proposed LFICuS algorithms.

Functions | ICuS | Proposed LFICuS
(1) SF | 8.8161×10⁵ ± 1.1378×10⁶ (90.80%) | 5.0051×10⁴ ± 0.00 (100%)
(2) AF | 1.0944×10⁶ ± 9.4378×10⁵ (95.40%) | 5.3501×10⁴ ± 1.3630×10⁴ (100%)
(3) BF | 3.6153×10⁵ ± 3.4431×10⁵ (100%) | 5.0351×10⁴ ± 3.8630×10³ (100%)
(4) DWF | 4.9805×10⁵ ± 4.1998×10⁵ (100%) | 5.1151×10⁴ ± 7.6710×10³ (100%)
(5) ECF | 2.1490×10⁵ ± 3.2485×10⁵ (99.70%) | 5.0051×10⁴ ± 0.00 (100%)
(6) PHF | 6.9583×10⁴ ± 1.1281×10⁴ (100%) | 5.0051×10⁴ ± 0.00 (100%)
(7) RF | 1.9771×10⁶ ± 1.2190×10⁶ (71.90%) | 6.1951×10⁴ ± 2.7370×10⁴ (100%)
(8) STF | 7.3347×10⁵ ± 8.7492×10⁵ (96.60%) | 5.0401×10⁴ ± 4.1710×10³ (100%)
(9) Y2F | 2.5562×10⁵ ± 2.3652×10⁵ (100%) | 5.0351×10⁴ ± 3.8630×10³ (100%)
(10) Y4F | 7.8195×10⁵ ± 6.6517×10⁵ (97.60%) | 5.0301×10⁴ ± 3.5284×10³ (100%)
The convergence rates on the SF function obtained by the ICuS and the proposed LFICuS algorithms are depicted in Fig. 4. Those of the other functions are omitted because they have a similar form to that of the SF function shown in Fig. 4.

(a) ICuS algorithm

(b) Proposed LFICuS algorithm

Fig. 4. Convergent rates of the SF proceeded by the ICuS and proposed LFICuS algorithms.

5 Conclusions

The Lévy-flight intensified current search (LFICuS) algorithm has been developed and proposed in this paper for multimodal function minimization. As a novel trajectory-based metaheuristic algorithm, the proposed LFICuS is based on the ICuS algorithm. In the proposed LFICuS, a random number drawn from the Lévy-flight distribution and the adjustable AR mechanism have been employed to improve the search performance. The proposed LFICuS has been tested against ten selected standard multimodal benchmark functions for minimization. As a result, the proposed LFICuS algorithm has provided a higher success rate, fewer function evaluations and smaller standard deviation values than the ICuS. It can be concluded that the proposed LFICuS algorithm is one of the most efficient trajectory-based metaheuristic algorithms for function minimization. For future research, the proposed LFICuS algorithm can be applied to real-world engineering applications and other optimization problems, including finance, forecasting and meteorology.

References
1. Talbi, E.G.: Metaheuristics: From Design to Implementation. John Wiley & Sons, Hoboken
(2009)
2. Yang, X.S.: Engineering Optimization: An Introduction with Metaheuristic Applications,
John Wiley & Sons (2010)
3. Sukulin, A., Puangdownreong, D.: A novel metaheuristic optimization algorithm: current
search. In: The 11th WSEAS International Conference on Artificial Intelligence, Knowledge
Engineering and Data Bases (AIKED 2012), pp. 125–130 (2012)
4. Puangdownreong, D.: Application of current search to optimum PIDA controller design.
Intell. Control Autom. 3(4), 303–312 (2012)
5. Puangdownreong, D., Sukulin, A.: Current search and applications in analog filter design
problems. J. Commun. Comput. 9(9), 1083–1096 (2012)
6. Suwannarongsri, S., Bunnag, T., Klinbun, W.: Energy resource management of assembly
line balancing problem using modified current search method. Int. J. Intell. Syst. Appl. 6(3),
1–11 (2014)
7. Suwannarongsri, S., Bunnag, T., Klinbun, W.: Optimization of energy resource management
for assembly line balancing using adaptive current search. Am. J. Oper. Res. 4(1), 8–21
(2014)
8. Nawikavatan, A., Tunyasrirut, S., Puangdownreong, D.: Application of Intensified Current
Search to Optimum PID Controller Design in AVR System. Lecture Notes in Computer
Science, pp. 255–266 (2014)
9. Nawikavatan, A., Tunyasrirut, S., Puangdownreong, D.: Optimal PID controller design for
three-phase induction motor speed control by intensified current search. In: The 19th
International Annual Symposium on Computational Science and Engineering (ANSCSE19),
pp. 104–109 (2015)
10. Thammarat, C., Puangdownreong, D., Nawikavatan, A., Tunyasrirut, S.: Multiobjective
optimization of PID controller of three-phase induction motor speed control using intensified
current search. In: Global Engineering & Applied Science Conference, pp. 82–90 (2015)
11. Nawikavatan, A., Tunyasrirut, S., Puangdownreong, D.: Application of intensified current
search to multiobjective PID controller optimization. Int. J. Intell. Syst. Appl. 8(11), 51–60
(2016)
12. Pavlyukevich, I.: Lévy flights, non-local search and simulated annealing. J. Comput. Phys.
226(9), 1830–1844 (2007)
13. Yang, X.S., Deb, S.: Engineering optimisation by cuckoo search. Int. J. Math. Model. Num.
Opt. 1(4), 330–343 (2010)
14. Yang, X.S.: Flower pollination algorithm for global optimization. Unconventional Compu-
tation and Natural Computation, Lecture Notes in Computer Science 7445, 240–249 (2012)
15. Ali, M.M., Khompatraporn, C., Zabinsky, Z.B.: A numerical evaluation of several stochastic
algorithms on selected continuous global optimization test problems. J. Global Opt. 31, 635–
672 (2005)
16. Jamil, M., Yang, X.S., Zepernick, H.-J.: Test functions for global optimization: a comprehensive survey. In: Swarm Intelligence and Bio-Inspired Computation: Theory and Applications, pp. 193–222. Elsevier (2013)
Cancer Cell Segmentation Based
on Unsupervised Clustering
and Deep Learning

Juel Sikder1, Utpol Kanti Das1, and A. M. Shahed Anwar2

1 Department of Computer Science and Engineering,
Rangamati Science and Technology University, Chittagong, Bangladesh
juelsikder48@gmail.com, utpoldasrmstu@gmail.com
2 Department of Computer Science and Engineering,
BRAC University, Dhaka, Bangladesh
shahedanwar@rmstu.edu.bd

Abstract. This paper proposes a methodology that can segment, recognize, classify and detect different types of cancer cells in RGB, CT and MRI images. In this study, partial contrast stretching is applied to the preprocessed images to improve the cells' visual appearance. The Otsu method is applied to enhance the stretched image, and the K-means clustering method is used to segment the desired cancer cell. Median filtering is applied to improve the appearance of the segmented cancer cell; relevant features are then extracted from the filtered cell, and a multi-Support Vector Machine (M-SVM) is applied to classify the test image and identify different types of cancer cells. This paper also proposes a Convolutional Neural Network (CNN) classifier to extract features from the filtered cell and to classify the image. The two classification results are then compared to check whether they are equal. If the results match, the extracted cancer region is shown using the region-growing method, and finally the percentage of the detected area is calculated for each type of cancer cell. The experimentation has been done in the MATLAB environment on RGB, CT and MRI images.

Keywords: Multi SVM · Convolutional neural network · Brain · Leukemia · Lung

1 Introduction

Human bodies are rapidly producing trillions of cells that form tissues and organs such
as muscles, bones, the lungs, and the liver. Genes of each cell instruct it when to grow,
work, divide, and die. Sometimes these instructions get mixed up and go out of control;
then, the cells become abnormal as they grow and divide but do not die in time. These
uncontrolled and abnormal cells can form a lump in the body known as a tumor.
Tumors are of two types, such as non-cancerous (benign) and cancerous (malignant).
When cancer starts in the lungs, it is known as lung cancer, and cancer that has spread
to the lungs from another part of the body might be called secondary lung cancer.
Leukemia is a cancer of white blood cells, which cannot fight infection, and it develops in the bone marrow. The purpose of brain tumor classification is to classify the MRI image to detect the tumor category in the patient's
brain. Many test types can be used for brain
tumor detection, such as MRI, Biopsy, and Computed Tomography scan. However, in
comparison to CT scan images, MRI images are safe, and they also provide higher
contrast. Moreover, MRI images do not affect the human body. It is easy to detect and
classify the brain through MRI images as they have a fair resolution for different brain
tissues.
Image segmentation always plays a significant part in cancer diagnosis. The actual meaning of segmentation is splitting an image into several regions and extracting meaningful information from these regions [1]. Several segmentation techniques have been applied to MRI images. For medical image analysis, edge detection, threshold-based, region-growing, model-based, and clustering-based segmentation techniques are the most widely used methods. All segmentation techniques have their advantages and disadvantages, and thus the choice depends on the user. Besides the segmentation and detection of cancer cells, it is necessary to classify the type of cancer after segmentation to help specialists provide the correct direction to the patient early on. Among the various classification methods, the multi-SVM (multi-support vector machine) is applied in our proposed method because it is a linear learning algorithm for classification and a powerful, accurate supervised classifier. Convolutional Neural Network models have also been very successful for a large number of visual tasks [2]. In a word, this paper approaches benign, malignant, leukemia, and lung cancer detection using partial contrast stretching, Otsu, K-means and median filtering, and performs the classification of benign, malignant, leukemia and lung cancer using the M-SVM and CNN classifiers.
This system uses MRI, RGB and CT images: the Brain MRI Images for Brain Tumor Detection dataset for both benign and malignant tumors, the SN-AM dataset for leukemia [3–6], and the LIDC database for lung cancer [7].
This paper is organized as follows. Section 2 summarizes the previous works in the field of cancer cell detection. Section 3 gives a brief description of the methodology that has been proposed for the detection of different types of cancer cells, feature extraction, classification and calculation of the detected cancer cell area, and the conclusions are in Sect. 4.

2 Literature Review

Many researchers have applied different methods to segment, detect, and classify
images of cancer cells. Some of the existing methods are described here.

Wang et al. [1] have presented a novel cell detection method that utilizes both
intensity and shape information of a cell to improve the segmentation. Upon obtaining
the cell shape information with the binarization process, they have used both intensity
and shape information for local maxima generation. Lastly, the nuclei pixels are per-
mitted to move inside the gradient vector field. The pixels will finally join at these local
maxima, and the detected cells are then segmented through the seeded watershed
algorithm.
Purohit and Joshi [8] introduced a new, efficient approach to the K-means clustering algorithm. They proposed a method that produces the cluster centers by reducing the mean squared error of the final clusters without a considerable increase in execution time. Many evaluations have been performed, and it can be concluded that the accuracy is higher for dense datasets than for small ones.
Alan Jose, S. Ravi, and M. Sambath [9] have proposed a methodology to segment brain tumors using the K-means clustering and Fuzzy C-means algorithms and to calculate the tumor area. They split the process into several parts. First of all, preprocessing is
implemented by using the filter to improve the quality of the image. After that, the
advanced K-means algorithm is applied, followed by Fuzzy c-means to cluster the
image. Then the resulted segment image is used for the feature extraction of the region
of interest. They have used MRI images for the analysis and calculated the size of the
extracted image's tumor region.
Selvaraj Damodharan and Dhanasekaran Raghavan [10] have proposed a neural
network-based procedure for brain tumor detection and classification. In their
methodology, the quality rate is computed separately for the segmentation of WM, GM, CSF, and the tumor region, and they claimed an accuracy of 83% using the neural network-based classifier.
Nihad Mesanovic, Mislav Grgic, and Haris Huseinagic [11] have proposed a region-growing algorithm for the segmentation of CT scan images of the lung. This algorithm starts with a seed pixel and checks the other pixels that surround it, determining the most similar one. If a pixel meets certain criteria, it is included in the region. The region is grown by examining all unallocated pixels adjacent to the region.
Kanitkar, Thombare, and Lokhande [12] proposed a methodology that takes a CT scan image of the lung, preprocesses the input image for smoothing using the Gaussian filter, and then enhances it using a Gabor filter. Thresholding and marker-controlled watershed segmentation methods are then used to segment the processed image. They used the marker-based watershed segmentation technique to overcome the drawback of the plain watershed segmentation technique, i.e., over-segmentation. Then feature extraction and classification of cancer stages are done.

3 Proposed Methodology

Figure 1 shows the proposed method, which covers the preprocessing, segmentation, detection, feature extraction and classification of cancer cells using Otsu, K-means, the multi support vector machine and the convolutional neural network. As the initial step, an input image is given to the system.

[Flowchart: input image → preprocessing → segmentation stage → median filtering → feature extraction → M-SVM and CNN classification against training databases (benign / malignant / leukemia / lung) → if the two results are equal, trace the region, show the detected region and calculate its area; otherwise report "Nothing detected correctly! Please try again."]

Fig. 1. Block diagram of proposed methodology.

Fig. 2. (a) Original Brain test image-Benign; (b) Original Brain test image- Malignant;
(c) Original Leukemia test image; (d) Original Lung Cancer test image.

The proposed methodology contains the following sections:

3.1 Preprocessing
3.2 Segmentation
3.3 Median Filtering
3.4 Classification
3.5 Trace and Detection
3.6 Area Calculation

3.1 Preprocessing
RGB2GRAY
In the preprocessing stage, the input image is first checked to determine whether it is RGB or gray. If the test image is an RGB image, it is converted from true color RGB to grayscale by eliminating the hue and saturation information while retaining the luminance.

Fig. 3. Gray image- (a) Benign; (b) Malignant; (c) Leukemia; (d) Lung Cancer.
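A minimal MATLAB sketch of this check (the file name is a hypothetical placeholder, not from the paper):

% Hedged sketch: RGB check and grayscale conversion described above.
I = imread('test_image.png');    % hypothetical file name
if size(I,3) == 3                % a true color RGB image has three channels
    I = rgb2gray(I);             % keep luminance, drop hue and saturation
end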

Remove Noise
The Gaussian filtering method is used to smooth the test images so that they look more meaningful for segmentation. Filtering improves certain properties of the test images, such as the signal-to-noise ratio and the visual appearance, while removing irrelevant noise.

Fig. 4. Removed Noise- (a) Benign; (b) Malignant; (c) Leukemia; (d) Lung Cancer.
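In MATLAB this smoothing is a single call; the kernel standard deviation below is an assumption, as the paper does not state it:

% Hedged sketch: Gaussian smoothing; sigma = 2 is an assumed value.
Ismooth = imgaussfilt(I, 2);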

Partial Contrast Stretching
In medical science, images used for diagnosis may have weaknesses such as blur or low contrast. To solve this problem, a contrast enhancement technique such as partial spatial stretching is used to improve the image quality and contrast. It is implemented by a stretching and compression process; by applying this technique, the pixel range between a lower and an upper threshold value is mapped to a new pixel range [13].

Fig. 5. Partial Contrast Stretched- (a) Benign; (b) Malignant; (c) Leukemia; (d) Lung Cancer.
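A hedged MATLAB sketch of such a stretch (the threshold fractions are assumptions):

% Hedged sketch: map an assumed [5%, 95%] intensity band to the full range.
lowHigh  = stretchlim(Ismooth, [0.05 0.95]);
Istretch = imadjust(Ismooth, lowHigh, [0 1]);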

Enhancement Using Otsu
Otsu's method is one of the most successful methods for threshold-based image enhancement. The algorithm assumes that the image to be thresholded contains two classes of pixels (a bi-modal histogram) and then calculates the optimum threshold separating those two classes so that their combined spread is minimal. The main objective of the Otsu method is to apply a binarization algorithm that converts the gray image into a monochrome (binary) image. Consequently, it plays a vital role in enhancing the image.

Fig. 6. Enhanced using Otsu Method (a) Benign; (b) Malignant; (c) Leukemia; (d) Lung cancer.
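In MATLAB, this step could look like the following sketch:

% Hedged sketch: Otsu's optimum threshold and the resulting binary image.
level = graythresh(Istretch);        % Otsu threshold in [0, 1]
Ibin  = imbinarize(Istretch, level);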

[Flowchart: input image → assign K clusters and centroids → calculate the Euclidean distance between each centroid and each pixel → assign all pixels to the nearest centroid → recalculate the new centroids → if the centroids do not satisfy the tolerance/error value, repeat; otherwise reshape the clustered pixels into an image.]

Fig. 7. K-means Clustering.

3.2 Segmentation
K-means clustering is an unsupervised clustering method that classifies a given set of data into k disjoint clusters. It is an iterative method used to partition an image into K clusters. First, it takes k clusters and k centroids, where the value of k is defined as 2 and the initial 2 centroids are chosen randomly; it then calculates the Euclidean distance between each centroid and each pixel of the input image [14]. The Euclidean distance is used to find the nearest centroid, and each point is assigned to the cluster whose centroid is nearest to the respective data point. After assigning all data points, it recalculates the new centroid of each cluster and, using that centroid, new Euclidean distances. Being an iterative algorithm, K-means repeats this process and calculates new centroids in each iteration until it satisfies the tolerance or error value. In our proposed system, we divide the whole region into two clusters. The best clustering region is taken as the region of interest; e.g., the white cancer cell cluster is taken as the segmented region with a black background [15].

Fig. 8. K-means Segmented-(a) Benign; (b) Malignant; (c) Leukemia; (d) Lung cancer.
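A hedged MATLAB sketch of this two-cluster segmentation on pixel intensities (variable names are assumptions):

% Hedged sketch: k = 2 K-means on intensities; keep the brighter cluster.
X   = double(Istretch(:));                     % one intensity per pixel
idx = kmeans(X, 2);                            % squared Euclidean by default
L   = reshape(idx, size(Istretch));
[~, bright] = max([mean(X(idx==1)), mean(X(idx==2))]);
Iseg = (L == bright);                          % white cells, black background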

3.3 Median Filtering

After segmentation, the segmented image may contain some noise, so to detect the affected region accurately, a median filtering method is applied to the segmented image. The median filter provides excellent noise reduction with considerably less blurring than linear smoothing filters of similar size. After the clustering process, the resulting clustered image is filtered using a 7 × 7 pixel median filter to improve the segmented images [16]. In the median filtering process, a 7 × 7 neighborhood kernel is placed around each pixel in the image; the neighboring pixels are sorted by value, and the median becomes the new value of the central pixel. The kernel then moves to the next pixel, and the procedure repeats until all the pixels in the image have been processed.

Fig. 9. Median Filtering of- (a) Benign; (b) Malignant; (c) Leukemia; (d) Lung cancer.
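In MATLAB, this corresponds to a single call (sketch):

% Hedged sketch: 7-by-7 median filtering of the segmented image [16].
Ifilt = medfilt2(Iseg, [7 7]);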

3.4 Classification
At this stage, this study proposes the multi-SVM and the Convolutional Neural Network (CNN) for classification. The multi-SVM has two steps: first, relevant features are extracted from the filtered image; second, the extracted features are fed to the multi-SVM to classify the test image. The process of the two classifiers is described below.

Feature Extraction
Collecting higher-level information of an image, such as shape, texture, color and contrast, is known as feature extraction. Analysis methods that rely on raw pixel intensities, pixel coordinates and a few other statistics, such as the mean, variance or median, suffer from errors in the determination process, low precision and low classification efficiency [10]. The Gray Level Co-occurrence Matrix (GLCM) is one of the most widely used image analysis tools. This procedure follows two steps to extract features from medical images: first the GLCM is computed, and then the texture features based on the GLCM are calculated. In the proposed system, the chosen GLCM texture features are contrast, correlation, energy and homogeneity, together with features like the mean, standard deviation, RMS (root mean square), entropy, variance, smoothness, kurtosis, skewness and IDM. After extraction from the segmented image, all features are fed to the multi-SVM and CNN classifiers to classify the segmented image. These features are extracted from the datasets to train the multi-SVM and from the test image to classify it. Again, CNN features are extracted using the fc1000 layer from both the datasets and the test image when the test image is classified using the CNN.
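A hedged MATLAB sketch of such a GLCM-plus-statistics feature vector (the masking step, offsets and gray levels are assumptions):

% Hedged sketch: GLCM texture features plus simple statistical features.
cellImg = Istretch;  cellImg(~Ifilt) = 0;      % assumed: gray cell on black
glcm = graycomatrix(cellImg, 'NumLevels', 8);
g    = graycoprops(glcm, {'Contrast','Correlation','Energy','Homogeneity'});
feat = [g.Contrast, g.Correlation, g.Energy, g.Homogeneity, ...
        mean2(cellImg), std2(cellImg), ...
        sqrt(mean(double(cellImg(:)).^2)), entropy(cellImg)];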
Classification of the Test Image
The multi-SVM is used to classify the input image as benign, malignant, leukemia or lung cancer, a systematic technique for four-class problems [17]. It needs two stages, training and testing. The multi-SVM is trained by features given as input to its learning algorithm, and while training, it finds suitable margins between two classes. When the extracted features are fed to the multi-SVM, it performs the classification steps. In this methodology, the multi-SVM works as a two-class SVM applied repeatedly: when the input image features are fed to the multi-SVM, it first classifies the input image between the first two classes; the winning class is then paired with the third class and classification is performed again; finally, that winner is paired with the fourth class and classified once more. From this last classification, we obtain the correct class to which the input image belongs. In the testing phase, the multi-SVM produces 90.75% to 95.55% accuracy.
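The cascaded pairwise scheme above could equivalently be set up with MATLAB's standard multiclass SVM wrapper; this is a swapped-in sketch, not the authors' exact procedure, and trainFeat/trainLabels are assumed variables.

% Hedged sketch: four-class SVM via error-correcting output codes.
svmModel  = fitcecoc(trainFeat, trainLabels);  % pairwise linear SVMs inside
predClass = predict(svmModel, feat);           % label for the test image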
Convolutional neural networks (CNNs) are important tools for deep learning and are especially useful for image classification, object detection and recognition tasks. Deep learning uses convolutional neural networks to learn useful representations of data directly from images [18]. Inspired by biological systems, neural networks combine diverse nonlinear processing layers using simple elements operating in parallel, and deep learning models are trained using a large set of labeled data [19]. Neural network architectures that contain many layers usually include some convolutional layers. The cell images are fed into a CNN model to extract deep-learning features. The CNN-based classification is divided into two phases, training and testing. The images are divided into different categories using label names such as benign, malignant, leukemia and lung. In the training and testing phases, preprocessing is done for image resizing. Further, by passing the input dataset through all the layers mentioned, the classification network undergoes training. The convolutional neural network is used for automatic cancer cell classification and for predicting the class label of a new test image.
First, the cancer cell database is loaded, which creates an image datastore to manage the data. The images and the category labels associated with each image are then counted. The labels are automatically assigned from the folder names of the image files, and the unequal numbers of images per category are adjusted so that the number of images in the training set is balanced. Using the resnet50 function from the Neural Network Toolbox, we load the ResNet-50 pretrained network. To avoid biasing the results, the split into training and test sets processed by the CNN model is randomized. Training and testing sets are then prepared for the CNN. Features can easily be extracted from one of the deeper layers using the activations method; here, training features are extracted using the 'fc1000' layer. The CNN image features are used to train a multiclass SVM classifier. To extract image features from the test set, the earlier procedure is repeated, and the test features can then be passed to the classifier to measure the accuracy of the trained classifier. Finally, the newly trained
classifier is used to classify the desired segmented image. A general block diagram of the convolutional neural network (CNN) process is shown in Fig. 10.

[Flowchart: load database → load image categories → apply convolutional neural network using ResNet-50 → prepare training and validation image sets / load segmented image → extract the fc1000 feature layer / prepare for feature extraction → create a classifier using the extracted features / classify the extracted features → evaluate the classifier / predict the label → evaluate accuracy.]

Fig. 10. Process of Convolutional neural networks.
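A hedged MATLAB sketch of the fc1000 feature extraction and SVM training (the datastore variable imdsTrain is an assumption):

% Hedged sketch: deep features from pretrained ResNet-50, then an SVM.
net = resnet50;                                % pretrained network
augTrain  = augmentedImageDatastore([224 224], imdsTrain);   % resize inputs
trainFeat = activations(net, augTrain, 'fc1000', 'OutputAs', 'rows');
cnnSvm    = fitcecoc(trainFeat, imdsTrain.Labels);           % multiclass SVM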

Fig. 11. (a) Architecture of ResNet-50 (b) Convolutional Layer Weight.

A confusion matrix is a square matrix that describes the performance of a classifier on the test data set by relating the classifier's predictions to the true labels. It summarizes the results of a classification problem by counting the number of correct and incorrect predictions for each class given by the classifier, and it reveals the errors and the types of errors being made. The CNN classifier produces approximately 87.5% to 100% accuracy.

$$ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1) $$

Fig. 12. Confusion matrix-(a) Benign; (b) Malignant; (c) Leukemia; (d) Lung cancer.

3.5 Trace and Detect Region

After classification, if the classified results of the multi-SVM and the CNN (Convolutional Neural Network) are equal, a morphological operation is applied to the segmented image to trace and detect the cancerous region. Several types of image property and area measuring methods are applied to trace the regions. In phase-contrast microscopy images, noncancerous cells appear darker than the background in most exposures, while cancerous cells appear brighter than the background. Therefore, by creating a new averaged image via the mean of all exposure images, pixels at cell regions should have low intensities, and pixels at cancer regions should have high intensities. Applying the MSER (Maximally Stable Extremal Regions) method, the cancer region is traced. Finally, the detected regions are shown with a red boundary. The detected regions of the tested images are given below.

Fig. 13. Detected region of-(a) Benign; (b) Malignant; (c) Leukemia; (d) Lung cancer.
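A hedged MATLAB sketch of the MSER tracing step (the averaged image Iavg is an assumed variable from the description above):

% Hedged sketch: trace bright cancer regions with MSER and outline them.
regions = detectMSERFeatures(Iavg);
imshow(Iavg); hold on;
plot(regions, 'ShowPixelList', false, 'ShowEllipses', true);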

3.6 Area Calculation

The main task in the area calculation of the test image is the global thresholding process, which examines pixels not equal to 0. The regionprops function measures properties of image regions. It returns a set of property measurements for each 8-connected component (object) in the binary image. Regionprops can also be used on contiguous and discontiguous regions. If the properties argument is not specified, regionprops returns the 'Area', 'Centroid' and 'BoundingBox' measurements. The label image is specified as a numeric array of any dimension. Different pixel labels identify different objects: pixels labeled 0 are the background, pixels labeled 1 belong to one object, pixels labeled 2 to another object, and so on. One critical characteristic of regionprops is that it treats negatively valued pixels as background and rounds down input pixels that are not integers. The actual number of pixels in the region is returned as a scalar. The bwarea value differs slightly from this value because it weighs different patterns of pixels differently. The equation of the area represents the result as the total number of pixels.

$$ \mathrm{Area} = \sum_{i=1}^{n}\sum_{j=1}^{m} P(i,j) \qquad (2) $$

The detected area is obtained by counting the number of affected-region pixels. The ratio of the cancer cell area was calculated using the formula:

$$ \mathrm{Ratio} = \frac{\text{Area of cancer cell}}{\text{Area of test image}} \times 100\% \qquad (3) $$

Here, the benign tumor has a 7.4% affected region, where the test image has an area of 49295 pixels and the detected benign tumor region is 3663 pixels. The malignant tumor has a 4.6% affected region, where the test image has a total area of 51790 pixels and the detected malignant region is 2399 pixels. Leukemia has a 31.4% affected region, where the test image has an area of 65536 pixels and the detected leukemia region is 20561 pixels. The lung has a 1.5% affected region, where the test image has an area of 54231 pixels and the detected lung region is 814 pixels.
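A hedged MATLAB sketch of this computation (reproducing, e.g., 3663/49295 ≈ 7.4% for the benign case):

% Hedged sketch: affected area via regionprops and the ratio of (3).
stats = regionprops(Ifilt, 'Area');          % area of each 8-connected object
cancerArea = sum([stats.Area]);              % total affected pixels
ratio = 100 * cancerArea / numel(Ifilt);     % percentage of the test image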

4 Conclusions

This research developed a methodology that can identify and classify different types of cancer cells using an unsupervised clustering method and deep learning, more specifically the multi-SVM and the Convolutional Neural Network, and is also able to segment more accurately compared with the results of current methodologies. In comparing the obtained results with those of current studies, it must be pointed out that this methodology is not limited to a fixed class or dataset, whereas the current methods are designed for a specific type of class or a specific type of cancer dataset. The experimental results indicate that the proposed approach is an improved method that can generate significantly more accurate and automatic detection of cancer cells than existing methods. This research has shown that the overall accuracy of the system is approximately 93%. One of the primary challenges of this research is that the expected cluster centroids require initial knowledge of the image, and more care is needed to train the dataset.
Future research will take several directions, such as developing better segmentation techniques and selecting better feature extraction and classification processes as optimal learning algorithms are applied to different feature descriptors. This research will open a pathway to implementing and analyzing different areas of image segmentation in the future.

References
1. Wang, M., Zhou, X., Li, F., Huckins, J., King, R.W., Wong, S.T.C.: Novel cell segmentation and online learning algorithms for cell phase identification in automated time-lapse microscopy. In: Biomedical Imaging: From Nano to Macro, ISBI 2007, 4th IEEE International Symposium, pp. 65–6 (2007)
2. Nandagopalan, S., Kumar, P.K.: Deep convolutional network based saliency prediction for
retrieval of natural images. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.) Intelligent
Computing & Optimization. ICO 2018. Advances in Intelligent Systems and Computing,
vol. 866. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-00979-3_51
3. Gupta, R., Mallick, P., Duggal, R., Gupta, A., Sharma, O.: Stain color normalization and
segmentation of plasma cells in microscopic images as a prelude to development of
computer assisted automated disease diagnostic tool in multiple Myeloma. In: 16th
International Myeloma Workshop (IMW), India, March 2017
4. Duggal, R., Gupta, A., Gupta, R., Wadhwa, M., Ahuja, C.: Overlapping cell nuclei
segmentation in microscopic images using deep belief networks. In: Indian Conference on
Computer Vision, Graphics and Image Processing (ICVGIP), India, December 2016
5. Duggal, R., Gupta, A., Gupta, R.: Segmentation of overlapping/touching white blood cell
nuclei using artificial neural networks. CME Series on Hemato-Oncopathology, All India
Institute of Medical Sciences (AIIMS), New Delhi, India, July 2016
6. Duggal, R., Gupta, A., Gupta, R., Mallick, P.: SD-Layer: stain deconvolutional layer for
CNNs in medical microscopic imaging. In: Medical Image Computing and Computer-
Assisted Intervention − MICCAI 2017, MICCAI 2017. Lecture Notes in Computer Science,
Part III, LNCS 10435, pp. 435–443. Springer, Cham. https://doi.org/10.1007/978-3-319-66179-7_50
7. Armato, S.G., III, McLennan, G., Bidaut, L., McNitt-Gray, M.F., Meyer, C.R., Reeves, A.P., Clarke, L.P.: Data from LIDC-IDRI. The Cancer Imaging Archive (2015). https://doi.org/10.7937/K9/TCIA.2015.LO9QL9SX

8. Purohit, P., Joshi, R.: A new efficient approach towards k-means clustering algorithm. Int.
J. Comput. Appl. 65(11), 0975–8887 (2013)
9. Jose, A., Ravi, S., Sambath, M.: Brain segmentation using k-means clustering and fuzzy C-
means algorithm and its area calculation. Int. J. Innov. Res. Comput. Commun. Eng. 2(2)
(2014)
10. Damodharan, S., Raghavan, D.: Combining tissue segmentation and neural network for
brain detection. Int. Arab J. Inf. Technol. 12(1) (2015)
11. Mesanovic, N., Grgic, M., Huseinagic, H., Males, M., Skejic, E., Smajlovic, M.:
Automatic CT image segmentation of the lungs with region growing algorithm
12. Kanitkar, S.S., Thombare, N.D., Lokhande, S.S.: Detection of lung cancer using marker-controlled watershed transform. In: Proceedings of the IEEE Conference on Pervasive Computing, 978-1-4799-6272-3/15 (2015)
13. Abdullah, A.S., Rosline, H.: Improving color leukemia images using contrast enhancement
techniques. In: 2010 IEEE EMBS Conference on Biomedical Engineering and
Sciences IECBES 2010, Kuala Lumpur, Malaysia (2010)
14. Abdul Nasir, A.S., Mashor, M.Y., Harun, N.H., Abdullah, A.A., Rosline, H.: Improving
colour image segmentation on acute myelogenous leukaemia images using contrast
enhancement techniques. In: 2010 IEEE EMBS Conference on Biomedical Engineering &
Sciences (IECBES 2010), Kuala Lumpur, Malaysia (2010)
15. Dhanachandra, N., Manglem, K., Chanu, Y.J.: Image segmentation using K-means
clustering algorithm and subtractive clustering algorithm. Article in Procedia Computer
Science, December 2015. https://doi.org/10.1016/j.procs.2015.06.090
16. Abd Halim, N.H., Mashor, M.Y., Abdul Nasir, A.S., Mustafa, N., Hassan, R.: Color image
segmentation using unsupervised clustering technique for acute leukemia images. In:
International Conference on Mathematics, Engineering and Industrial Applications 2014,
AIP Conference Proceedings 1660, 090038-1–090038-10 (2015). https://doi.org/10.1063/1.4915882
17. Rejintal, A., Aswini, N.: Image processing based leukemia cancer cell detection. In: IEEE
International Conference on Recent Trends in Electronics Information Communication
Technology, 978–1–5090–0774–5/16 (2016)
18. Joshua Thomas, J., Pillai, N.: A deep learning framework on generation of image
descriptions with bidirectional recurrent neural networks. In: Vasant, P., Zelinka, I., Weber,
G.W. (eds.) Intelligent Computing & Optimization. ICO 2018. Advances in Intelligent
Systems and Computing, vol. 866. Springer, Cham (2019). https://doi.org/10.1007/978-3-
030-00979-3_22
19. Ghoneim, A., Muhammad, G., Hossain, M.S.: Cervical cancer classification using
convolutional neural networks and extreme learning machines. Future Generation Comput.
Syst. 102, 643–649 (2020)
Automated Student Attendance Monitoring
System Using Face Recognition

Bakul Chandra Roy, Imran Hossen, Md. Golam Rashed, and Dipankar Das

Department of Information and Communication Engineering,
University of Rajshahi, Rajshahi 6205, Bangladesh
bakulice800@gmail.com, imranhsobuj97@gmail.com,
{golamrashed,dipankar}@ru.ac.bd

Abstract. In a conventional attendance monitoring system, the concerned teacher takes attendance manually in a classroom. In general, taking attendance of a huge number of students in a short period of time is time-consuming, very difficult, and prone to proxy attendance. To overcome these issues, we propose a face recognition-based student attendance monitoring system for the classroom environment. The proposed method uses the Histogram of Oriented Gradients (HOG) as the feature extractor, a Convolutional Neural Network (CNN) for face encoding and a Support Vector Machine (SVM) as the classifier. The proposed system recognizes faces in real time using a webcam and generates the attendance report automatically without any human intervention. Our face recognition method accomplished 99.5% accuracy on the Labeled Faces in the Wild (LFW) database and 97.83% accuracy in real time inside the classroom for attendance monitoring. Finally, we tested our system to validate its effectiveness.

Keywords: Face detection · Face recognition · Machine learning · HOG · Face landmark estimation · CNN and SVM

1 Introduction

Attendance management is very crucial for a modern institution. Especially in educational institutions, an automatic student attendance monitoring system is needed due to the increasing number of students. Currently available systems, such as RFID [1], IRIS [2] and FINGERPRINT [3] based methods, are not time- and cost-effective and, sometimes, are not accurate enough. Bluetooth-based attendance monitoring systems [4] take a long time for taking attendance. To overcome these limitations, we propose an automated face recognition-based attendance monitoring system. For face recognition, the eigenface algorithm based on PCA was proposed in [5]. In 2001, Paul Viola and Michael Jones proposed a real-time object detection method [6]. The Haar Cascade-based classifier works well in some situations; however, HOG features produce better results compared with the eigenface and the Haar Cascade [7].
Because the face has some unique characteristics, face recognition is the most studied topic in computer vision among biometric recognition methods [8]. Face recognition
technique has been used for personal identification, access control, and security management [9]. This technique is also suitable for taking the attendance of students in an educational institution. The method is also applicable for taking the attendance of employees in government institutions, the military, banks, the garment industry, private organizations, shops, etc.
Furthermore, real-world examples such as auto-tagging on Facebook and screen lock in mobile applications use face recognition [10, 11]. Computer vision, artificial intelligence and its subfields machine learning and deep learning have achieved great success in this regard.
However, currently available systems are not time-efficient, cost-effective or sufficiently accurate. Due to these problems in the conventional method, and feeling the necessity of face recognition for person identification, we were motivated to propose a face recognition-based attendance monitoring system. The system uses HOG to extract distinctive features and facial landmark estimation to handle pose and projection variation. Then, we use a CNN for face encoding and an SVM for face recognition.
The aim of this work is to contribute to the development of the smart classroom. Thus, an automated student attendance monitoring system has been proposed. The main purpose of this work is to create a system that is accurate as well as time- and cost-efficient and can be implemented in real time. The other key goals of this work are as follows: the first is to detect the face for feature extraction so that we can obtain some important characteristics for further processing. Secondly, we use a CNN for face encoding so that we can recognize the face for taking attendance. Thirdly, we use the LFW database and our own dataset for determining the accuracy. Finally, we make our proposed system work in a real-time application.
The rest of this paper is organized as follows: the proposed methodology is described in Sect. 2, the result and discussion are presented in Sect. 3, and finally the conclusion and future scope are given in Sect. 4.

2 Proposed Methodology

Figure 1 illustrates the abstract view of the proposed system. It comprises HOG feature extraction, facial landmark estimation, face encoding using a CNN, and face recognition using an SVM. At the entrance of a classroom, there should be a low-resolution webcam directly connected to a computer, or a laptop computer with a built-in webcam may be used. This computer/laptop is operated by the concerned class teacher at any particular schedule. Students enter the classroom one by one at any particular time and look at the webcam for a while. Each time, our system processes the captured images of every student in real time to recognize their faces and record their attendance for that class. These techniques are described in detail in the following sub-sections.

Fig. 1. Overall workflow of our proposed system.

2.1 HOG Features Extraction


The first step is face detection, which finds the facial area of the image. Here, we used HOG [12, 13] as the feature extractor. In this case, a black-and-white image is used for the facial features due to simplicity. The goal is to find out how dark the current pixel is compared to the pixels surrounding it. We draw an arrow in the direction in which the image is getting darker. If we repeat this for every single pixel in the image, every pixel is replaced by an arrow. These arrows are known as gradients, and they show the flow from light to dark across the entire image. Finally, we get the HOG version of our image. Figure 2 shows the feature extraction of our image using HOG.

Fig. 2. HOG features extraction.

2.2 Facial Landmark Estimation


Due to pose and projection variation, the facial landmark estimation [14] is needed to
detect the face accurately. In this case, 68 specific points (landmarks) that exist on
every face is detected. The facial landmarks are ordered for an image is as follows: Jaw
Points = 0 to 16; Right Brow Points = 17 to 21; Left Brow Points = 22 to 26; Nose
Points = 27 to 35; Right Eye Points = 36 to 41; Left Eye Points = 42 to 47; Mouth
Points = 48 to 67. These 68 specific points of any face are trained by the Ensemble of
Regression Trees (ERT) [15] algorithm. The result with the locations of the 68 land-
marks on test image is shown in Fig. 3.

Fig. 3. (a) The 68 landmarks on face; (b) the 68 landmarks on test images.

We have performed the basic image transformations, such as rotation and scaling, that keep parallel lines parallel, i.e. affine transformations, using these landmark points. Figure 4 visualizes the results of the transformation for a given test face image.

Fig. 4. Affine transformation process.



2.3 Face Encoding Using CNN


A Convolutional Neural Network [16, 17] works in the same way as the human visual system does. The images obtained in step 2.2 are fed to the neural network, which generates 128 unique numerical points from the complete image. We use a feed-forward neural network. The training process of the neural network is done by looking at 3 face images at a time. This network works on the basis of the following principles:
a) Load an image of a known student, e.g. Bakul.
b) Load another image of the same student.
c) Load an image of a different student, e.g. Imran.
We trained our neural network so that the 128-d measurements of the two images Bakul_1 and Bakul_2 are close to each other and farther from those of Imran. Figure 5 shows a single triplet training step.

Fig. 5. Visualization of the learning algorithm.

By repeating this step many times for different students, the network learns to generate a 128-d output feature vector, i.e. a list of 128 real-valued numbers, which is used to quantify the faces. Once the network has been trained, it can generate measurement points for any face, even one it has never seen before. Hence, this step is needed
only once. We used the OpenFace [18] model and adapted it to our dataset to get 128 measurement points for the face images.

2.4 Face Recognition Using SVM


This section describes face recognition using a linear SVM classifier. A separating hyperplane of the SVM can be given by:

$$ W \cdot X + b = 0 \qquad (1) $$

where W is a weight vector, W = {w1, w2, w3, …, wn}, X is the training image, and b is the bias term.
The decision boundary of the SVM is as follows:

$$ D(X^{T}) = \sum_{i} y_i\, a_i\, X_i \cdot X^{T} + b_0 \qquad (2) $$

where y_i is the class label of the training image X_i, X^T is the test image, and a_i and b_0 are numeric parameters.
If the sign of Eq. (2) is positive, the SVM predicts that the student is registered; otherwise, unregistered. Figure 6 shows the linear SVM separating hyperplane.

Fig. 6. SVM separates the registered student from the unregistered student.

Our aim is to find whether a student is present or not. Hence, we use a web camera to capture the students in the classroom and recognize them using the SVM [19] classifier, which takes the 128-D encoding values from the test image and compares them with the training images. Based on the comparison, the classifier recognizes a registered student by showing their name and an unregistered student by showing "Unknown". To train the model, we use 450 images of 45 students (10 images/student) with multiple illuminations, pose variations and expressions. Then we test the trained model with 45 images (one image/student). Figure 7 shows some sample images of our own dataset.

Fig. 7. Dataset of students.

After training on our own dataset, the system was implemented in our classroom environment. Figure 8 illustrates the recognized output in the real-time situation. Finally, the system automatically updates the attendance status of the students in Excel.

Fig. 8. Face recognition in classroom.



3 Result and Discussion

Our proposed system automatically generates an attendance report. The generated attendance report with the status is shown in Fig. 9. Here, ‘P’ indicates the student was present in a class and ‘A’ indicates he/she was absent. Our proposed system also computes the total number of days each student was present in a month, a year, or an entire semester.
It also provides a measure of security by distinguishing registered from unregistered students. A registered student is recognized by his/her name, and an unregistered student is labeled as unknown, as shown in Fig. 10.

Fig. 9. An attendance report generated in a classroom.

For the real-time experiment in a classroom environment, we used 45 registered students and 1 unregistered student. To assess how well our model works, we calculate accuracy, precision, and recall [20], defined as follows:

Accuracy = (TP + TN) / (P + N)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)

where TP: true positive, TN: true negative, FP: false positive, FN: false negative, P: positive, N: negative. A positive example refers to a registered student, and a negative example refers to an unregistered student. Table 1 shows the accuracy calculation using precision and recall.

Fig. 10. Registered student (left), unregistered student (right).

Table 1. Accuracy calculation using precision and recall.


Predicted \ Actual | Registered | Unregistered
Registered | 44 (TP) | 0 (FP)
Unregistered | 1 (FN) | 1 (TN)
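As a quick check, plugging the counts from Table 1 into the formulas above reproduces the reported metrics (a minimal Python sketch):

TP, TN, FP, FN = 44, 1, 0, 1     # counts from Table 1
P, N = TP + FN, TN + FP          # 45 registered, 1 unregistered

accuracy = (TP + TN) / (P + N)   # 45/46 = 0.9783 -> 97.83%
precision = TP / (TP + FP)       # 44/44 = 1.0    -> 100%
recall = TP / (TP + FN)          # 44/45 = 0.9778 -> 97.78%
print(f"{accuracy:.4f} {precision:.4f} {recall:.4f}")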

A true positive is a registered student who is correctly identified as registered. A true negative is an unregistered student who is correctly recognized. A false positive refers to an unregistered student identified as registered. Finally, a false negative is a registered student who is identified as unregistered. The accuracy is 97.83%, the precision is 100%, and the recall is 97.78%. To confirm the accuracy of the proposed system, we evaluated and compared it on the LFW dataset. Table 2 shows the accuracy comparison with other methods.

Table 2. Accuracy comparison based on other methods.


Database Methods Accuracy (%)
LFW DeepFace [21] 97.35
LFW FaceNet [22] 99.63
LFW DeepID2+ [23] 99.47
LFW Alignment learning [24] 99.08
LFW Proposed method 99.50
Own database Proposed method 97.83

Comparing our result with previous results, we found the recognition accuracy to be reasonable. As we were mainly concerned with real-time application, and we achieved 97.83% accuracy on the classroom dataset, the proposed system is implementable in a real-time environment. The overall performance of the proposed system on the LFW database and our own database is illustrated in Fig. 11.

Fig. 11. Performance analysis on LFW and our own datasets.

4 Conclusion and Future Scopes

Face recognition systems are used to solve many problems in our daily activities. This paper demonstrates that the manual attendance system can be replaced by an automated student attendance system using face recognition. It works efficiently in different lighting conditions, recognizes students’ faces, and updates the attendance report accurately. We preferred HOG over the Haar cascade because the latter produces many false positives. We use a CNN for a more accurate result. For face classification, we use a linear SVM for simplicity instead of using the CNN itself. We have achieved 99.50% and 97.83% accuracy on the LFW and our own student datasets, respectively. In the future, we will use a deep neural network for feature extraction and more accurate results. We also want to introduce a feedback system based on facial expression; with this addition, the system will help teachers improve or modify their teaching methods.

References
1. Lim, T., Sim, S., Mansor, M.: RFID based attendance system. In: IEEE Symposium on
Industrial Electronics & Applications, ISIEA 2009, vol. 2, pp. 778–782. IEEE (2009).
https://doi.org/10.1109/isiea.2009.5356360

2. Kadry, S., Smaili, K.: A design and implementation of a wireless iris recognition attendance
management system. Inf. Technol. Control 36(3), 323–329 (2007)
3. Bhanu, B., Tan, X.: Learned templates for feature extraction in fingerprint images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 591–596 (2001). https://doi.org/10.1109/CVPR.2001.991016
4. Bhalla, V., Singla, T., Gahlot, A., Gupta, V.: Bluetooth based attendance management
system. Int. J. Innov. Eng. Technol. (IJIET) 3(1), 227–233 (2013)
5. Belhumeour, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs Fisherfaces: recognition
using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intel. 19(7), 711–720
(1997). https://doi.org/10.1109/34.598228
6. Viola, P., Jones, M.J.: Robust real-time face detection. Int. J. Comput. Vision 57(2), 137–
154 (2004). https://doi.org/10.1023/B:VISI.0000013087.49260.fb
7. Boyko, N., Basystiuk, O., Shakhovska, N.: Performance evaluation and comparison of
software for face recognition based on Dlib and Opencv library. In: IEEE Second
International Conference on Data Stream Mining & Processing, Lviv, Ukraine, 21–25
August (2018). https://doi.org/10.1109/dsmp.2018.8478556
8. Berini, D.J., Van Beek, G.A., Arnon, I., Shimek, B.J., Fevens, R.B., Bell, R.L.: Multi-
biometric enrolment kiosk including biometric enrolment and verification, face recognition
and fingerprint matching systems. US Patent 9,256,719, 9 February (2016)
9. Priya, T., Sarika, J.: IJournals: Int. J. Softw. Hardware Res. Eng. 5(9) (2017). ISSN 2347-4890
10. Tambi, P., Jain, S., Mishra, D.K.: Person-dependent face recognition using histogram of
oriented gradients (HOG) and convolution neural network (CNN). In: International
Conference on Advanced Computing Networking and Informatics. Advances in Intelligent
Systems and Computing, Singapore (2019)
11. Bong, C.W., Xian, P.Y., Thomas, J.: Face recognition and detection using Haars features
with template matching algorithm. In: ICO 2019, AISC 1072, pp. 457–468 (2020). https://
doi.org/10.1007/978-3-030-33585-4_45
12. Navneet, D., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE
Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005,
vol. 1, pp. 886–893. IEEE (2005). https://doi.org/10.1109/cvpr.2005.177
13. Hanamsheth, S., Rane, M.: Face recognition using histogram of oriented gradients. Int. J. 6
(1) (2018)
14. Rosebrock, A.: Facial landmarks with dlib, OpenCV, and Python [Electronic resource] -
Access mode. https://www.pyimagesearch.com/2017/04/03/facial-landmarks-dlib-opencv-
python/
15. Kazemi, V., Sullivan, J.: One millisecond face alignment with an ensemble of regression
trees. In: CVPR 2014 Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, pp. 1867–1874 (2014). https://doi.org/10.1109/cvpr.2014.241
16. Nam, N.T., Hung, P.D.: Pest detection on traps using deep convolutional neural networks.
In: Proceedings of the 2018 International Conference on Control and Computer Vision
(ICCCV 2018), pp. 33–38. ACM, New York. https://doi.org/10.1145/3232651.3232661
17. Joshua Thomas, J., Pillai, N.: A deep learning framework on generation of image
descriptions with bidirectional recurrent neural networks. In: ICO 2018, AISC 866, pp. 219–
230 (2019). https://doi.org/10.1007/978-3-030-00979-3_22
18. Amos, B., et al.: OpenFace. https://cmusatyalab.github.io/openface/
19. Timotius, I.K., Linasari, T.C., Setyawan, I., Febrianto, A.A.: Face recognition using support
vector machines and generalized discriminant analysis. In: The 6th International Conference
on Telecommunication Systems, Services, and Applications (2011). https://doi.org/10.1109/
tssa.2011.6095397

20. Hung, P.D., Kien, N.N.: SSD-MobileNet implementation for Classifying Fish Species, ICO
2019, AISC 1072, pp. 399–408 (2020). https://doi.org/10.1007/978-3-030-33585-4_40
21. Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: DeepFace: closing the gap to human-level
performance in face verification. In: Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, pp. 1701–1708 (2014). https://doi.org/10.1109/cvpr.2014.
220
22. Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition
and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, pp. 815–823 (2015). https://doi.org/10.1109/cvpr.2015.7298682
23. Sun, Y., Wang, X., Tang, X.: Deeply learned face representations are sparse, selective, and
robust. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition,
pp. 2892–2900 (2015). https://doi.org/10.1109/cvpr.2015.7298907
24. Zhong, Y., Chen, J., Huang, B.: Towards end-to-end face recognition through alignment
learning arXiv:1701.0717 (2017). https://doi.org/10.1109/lsp.2017.2715076
Machine Learning Approach to Predict
the Second-Life Capacity of Discarded EV
Batteries for Microgrid Applications

Ankit Bhatt, Weerakorn Ongsakul, and Nimal Madhu

Department of Energy, Environment and Climate Change,
School of Environment, Resources and Development,
Asian Institute of Technology, Khlong Luang 12120, Pathum Thani, Thailand
ongsakul@ait.ac.th

Abstract. Batteries retired from electric vehicles (EV), known as second-life batteries, have enough remaining potential to be used for storage applications in other domains. By estimating the remaining capacity of these retired batteries, a reliable low-cost alternative can be provided for microgrid applications, and measures can be taken to increase their remaining lifetime. To estimate the storage capacity available in these batteries, a predictive model that can provide the second-life capacity of the battery accurately is required. Since second-life batteries are discarded from fresh (first-life) batteries, whose state of health is in the range 80%–100%, a machine learning model is developed that can predict the operational characteristics of the second life, utilizing those of the first life. This research proposes two different battery models, distinctive in terms of their input parameters, for the above purpose. Based on the obtained results, the prediction model using the charging voltage, charging current, and battery capacity for every cycle, alongside all three input parameters for 5 lag cycles, provided an accurate prediction of the second-life operation in comparison to the other. Mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and R-squared were considered as performance-indicating parameters.

Keywords: Machine learning model · First life stage of battery · Second life stage of battery · Charging capacity prediction

1 Introduction

Nowadays, battery energy storage systems are widely used in electric vehicles (EV) and microgrid applications because of their continuously decreasing price, due to advancements in technology and mass production of these batteries. EV batteries can be utilized for around 8 years because of their lifetime limitation, after which a new battery is required in place of the retired one [1]. These retired batteries from EVs are known as second-life batteries. The lifetime of these retired batteries can be extended by investigating their remaining capacity so that the requirements of other storage applications can be fulfilled. These batteries belong to electrochemical battery technologies, which are classified into three categories (see Fig. 1): flow batteries, primary batteries, and secondary batteries. Among these several battery technologies,


nickel-metal hydride (Ni-MH) and lithium-ion (Li-ion) are the main battery technologies used in EVs [2]. Due to the several advantages of lithium-ion batteries, such as high power density, high energy density, high energy efficiency, high specific energy, long cycle life, and low daily self-discharge [3–5], these batteries are preferred by vehicle manufacturers. Among Li-ion batteries, lithium iron phosphate (LiFePO4) batteries are the most favorable for several applications because they are cheaper and less toxic, and they have a flat charge/discharge voltage, comparatively better cycle life, and more structural stability [6].
The concept of a second-life battery concerns how to utilize batteries that are discarded from their first-life application. After a lifetime of about 6–8 years, the owners of EVs or plug-in hybrid electric vehicles (PHEVs) usually replace their vehicle’s batteries due to low performance, warranty expiry, etc. At this stage, these retired batteries can be reused in different applications, some of which are mentioned below [7, 8]:
I. Storage of wind and solar power at small or large scale, in off-grid or grid-connected mode, for household applications.
II. Peak shaving in industries to reduce power demand.
III. EV charging, to reduce power demand at the time of charging.
IV. Increasing grid capability and stability by reducing the installation of large cables.
V. Electricity trading in the form of a battery farm for electricity companies.

Fig. 1. Classification of electrochemical battery technologies

A second-life battery comes into existence when a new battery being used in an EV can provide only up to 80% of its actual (initial) energy. To determine the characteristics of these batteries, performance tests such as the hybrid pulse power characterization (HPPC) test, capacity test, electrochemical impedance spectroscopy (EIS), and open-circuit voltage (OCV) test need to be performed [9].
The transition of a new battery into the second-life stage happens due to the aging phenomenon, which increases capacity fade and resistance growth. There are two types of battery aging: cycle aging and calendar aging [10]. Calendar aging happens during energy storage in the battery, whereas cycle aging happens due to battery operation during charge and discharge cycles. These aging mechanisms are affected by

temperature variation, charging and discharging current/voltage, state of charge (SOC), different ways of charging, etc. Therefore, for accurate lifetime prediction, safer operation, and better battery performance, it is important to know the actual aging [11].
Due to aging, the energy storing or delivering capability of a battery decreases over its lifetime. So, state of health (SOH) estimation is a key aspect of evaluating the battery aging state for safe and reliable operation. The capacity and internal resistance of a battery are the most important indicators used for SOH estimation [12]. Equations (1) and (2) define SOH in terms of internal resistance and capacity [13]. SOH estimation methods are divided into two categories: experimental methods and model-based estimation methods; Fig. 2 shows the classification of battery SOH estimation methods.
To reflect the dynamic characteristics of a battery accurately, a battery model is needed, which is difficult to establish because of the complexity of the internal principles and working environment. Therefore, a data-driven method can be an effective solution for predicting SOH, because it requires neither an explicit battery model nor the working principles of the battery, but only the collection of aging data. Machine learning methods, as a part of data-driven methods, use past experienced data to solve the problem. Apart from various fields such as image recognition [14, 15], the financial sector [16], the healthcare sector [17], and performance prediction of students [18], many researchers have proposed these methods for estimating battery states.

SOH_R = (R_i / R_0) × 100 [%]    (1)

SOH_C = (C_max,i / C_0) × 100 [%]    (2)

Where,
R_i = battery resistance at any time interval i
R_0 = initial battery resistance
C_max,i = maximum capacity of the battery at any time interval i
C_0 = initial battery capacity (nominal or rated capacity)
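As a minimal sketch, Eqs. (1) and (2) translate directly into Python (the function names are ours):

def soh_resistance(R_i, R_0):
    # Eq. (1): SOH from the internal resistance at time interval i.
    return (R_i / R_0) * 100.0

def soh_capacity(C_max_i, C_0):
    # Eq. (2): SOH from the maximum capacity at time interval i.
    return (C_max_i / C_0) * 100.0

# Example: a 2.0 Ah rated cell whose maximum capacity has faded to 1.6 Ah
# has SOH_C = 80%, the retirement threshold discussed above.
print(soh_capacity(1.6, 2.0))  # 80.0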
To estimate SOH accurately and reliably, a charging-curve-based Gaussian process regression (GPR) model was proposed by [19]. [20] showed that a fuzzy logic-based approach is also a reliable solution for determining battery SOH through offline discharge testing of the batteries. [21] used a support vector machine (SVM) to predict SOH for Li-ion batteries. To handle a large set of data for a complex and non-linear system, a neural network was proposed by [22, 23]; for accurate prediction of SOH in different working conditions, it uses a robust algorithm. A neural network-based SOH estimation method was also proposed by [24], who used a recurrent neural network (RNN) to predict the deterioration in performance of a Li-ion battery. A backpropagation neural network (BPNN) was used by [25] for SOH estimation. Long short-term memory (LSTM) based prediction of the remaining useful life of Li-ion cells under different operating conditions was proposed by [26]. [23] proposed a deep convolutional neural network (DCNN) for the assessment of SOH during a charge

cycle based on voltage, current, and capacity measurements. For calculating capacity, the coulomb counting method is used to integrate the charge current over time for the entire charge cycle. [27] used a feedforward neural network, a convolutional neural network (CNN), and an LSTM for estimating SOH based on capacity measurements from the charging profiles of the battery. Similarly, prediction of a Li-ion battery’s remaining useful life using an LSTM-RNN was proposed by [28].
This paper shows how a multi-layer perceptron (MLP), a machine learning model, can predict the second-life stage capacity of a lithium-ion battery from its first-life stage. It also shows how prediction results can be improved by improving the input parameters. For the proposed machine learning model, the activation functions used are the leaky rectified linear unit (Leaky ReLU) and ReLU.

Fig. 2. Classification of battery SOH estimation methods

2 Battery Data Collection

For the present work, we use the datasets of commercially available lithium-ion 18650 rechargeable batteries from the Prognostics Data Repository under NASA [29], which is a public source. In this study, the B0005 Li-ion battery was considered for obtaining experimental data from three operational tests: charging, discharging, and impedance, at a room temperature of 24 °C. The test profile of the B0005 battery is given in Table 1. During the charging process, the constant current constant voltage (CCCV) method was used, where a constant current of 1.5 A was applied until the voltage reached 4.2 V, and then charging continued at a constant voltage until the charge current reduced to 20 µA. For the discharging process, a constant current of 2 A was applied until the voltage reached 2.7 V and 2.2 V for batteries B0005 and B0007, respectively.

Table 1. Test profile of battery data set B0005

Battery no. | Constant charge current (A) | Charge cut-off voltage (V) | Discharge current (A) | Discharge cut-off voltage (V) | Nominal (rated) capacity (Ah)
B0005 | 1.5 | 4.2 | 2.0 | 2.7 | 2.0

For the present work, out of the three operational test profiles, only the charging profile of the battery was considered for creating data to train, validate, and test the machine learning model. Based on the charging profile over several cycles, the degradation characteristics of the battery can be analyzed. During the charging process of battery B0005, the capacity at cycle 1 is at its maximum, whereas at cycle 115 the capacity is reduced to its minimum value (see Fig. 3). Due to the continuous switching between charging and discharging cycles [30], sudden increases in battery capacity also appear in the degradation characteristics, which is known as the capacity regeneration phenomenon [31].

Fig. 3. Capacity degradation of B0005

3 Input, Output, and Pre-processing of Data

This study focuses on capacity prediction for the second-life stage of the battery, i.e., when the SOH of the battery is below roughly 80% of its initial state. For this purpose, the parameters (input and output) associated with the first-life stage of the battery are used for training the machine learning model, while the parameters of the second-life stage of the battery are used for testing. We propose two different cases of input parameters for the proposed machine learning model and compare the results based on the performance indicators.

3.1 Case 1
Case 1 consists of 3 input parameters: charging voltage (V), charging current (I), and the number of cycles (K). Based on these 3 input parameters, we try to predict the final charging capacity of each cycle during the second-life stage of the battery. For the input parameters, we arrange our datasets into one lag (previous) cycle dataset (K − 1) and one current cycle dataset (K), and based on this arrangement we predict the final charging capacity for the future (next) cycle (K + 1) (see Fig. 4; a code sketch of this arrangement is given after the figure). Arranging the datasets in this manner provides a total of 6 inputs to our model. The initial 60 cycles were used for the training dataset, cycles 61–70 for the validation dataset, and cycles 71–115 for the test dataset. After 70 cycles, the SOH of B0005 is around 81%, so we consider the battery to be in its first-life stage up to 70 cycles, whereas after 70 cycles we consider the battery to be in the second-life stage because the SOH is lower than 80%.

Fig. 4. Input and output parameters for case 1
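A hypothetical helper illustrating this arrangement (our sketch; make_lagged_samples is not from the paper). With n_lags = 1 it emits the 6 inputs of Case 1 per sample:

import numpy as np

def make_lagged_samples(V, I, K, C, n_lags=1):
    # For each cycle k, stack (V, I, K) of the n_lags previous cycles and
    # the current cycle as inputs; the target is the capacity at cycle k + 1.
    X, y = [], []
    for k in range(n_lags, len(C) - 1):
        feats = []
        for j in range(k - n_lags, k + 1):
            feats.extend([V[j], I[j], K[j]])
        X.append(feats)
        y.append(C[k + 1])
    return np.array(X), np.array(y)

# Dummy per-cycle values for 115 cycles.
cycles = np.arange(1, 116)
V = np.full(115, 4.2)
I = np.full(115, 1.5)
C = np.linspace(2.0, 1.6, 115)
X, y = make_lagged_samples(V, I, cycles, C, n_lags=1)
print(X.shape, y.shape)  # (113, 6) (113,)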

3.2 Case 2
Case 2 also consists of 3 input parameters, but here, instead of the number of cycles, we use capacity (C) as an input. We arrange our datasets into a 5-lag (previous) cycle dataset (K − 5 to K − 1) and one current cycle dataset (K), and based on this arrangement we predict the final charging capacity for the future (next) cycle (K + 1) (see Fig. 5). With this arrangement, we provide a total of 18 inputs to our machine learning model. The division of training, validation, and test datasets is the same as in Case 1.

Fig. 5. Input and output parameters for case 2

After preparing the input and output data from the Prognostics Data Repository under NASA [29], preprocessing the data through an appropriate normalization technique is important for making a robust and efficient machine learning algorithm. The input and output data are normalized to the range 0 to 1, as given by Eq. (3).

x_normalized = (x − x_min) / (x_max − x_min)    (3)

Where,
x_max = upper limit of the input or output value x
x_min = lower limit of the input or output value x
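A minimal sketch of this min-max normalization in Python (applying it feature-wise, per column, is our assumption):

import numpy as np

def min_max_normalize(x):
    # Eq. (3): scale each column of x into the range [0, 1].
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return (x - x_min) / (x_max - x_min)

data = np.array([[4.0, 1.5], [4.2, 1.0], [4.1, 0.5]])
print(min_max_normalize(data))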

4 Machine Learning Model and Activation Functions Used

This study uses the multi-layer perceptron (MLP) model, which is a feed-forward artificial neural network (ANN) model. It maps multiple input data to the output data. It consists of three basic layers: an input layer, a hidden layer, and an output layer (see Fig. 6), where each layer consists of several nodes interconnected with each other. By increasing the number of hidden layers, we can turn the machine learning model into a deep neural network. In an MLP, every node is known as an artificial neuron and is linked with a nonlinear activation function such as sigmoid, tanh, linear, rectified linear unit (ReLU), Leaky ReLU, etc.
For training the network, the MLP employs a supervised-learning-based back-propagation algorithm, which performs propagation and weight-update tasks [32]. Back-propagation is an accurate learning algorithm for nonlinear relationships between

Fig. 6. MLP structure with m inputs, 1 hidden layer, and n outputs

input and output. In this algorithm, the output values during each iteration are compared with the actual values, and based on the deviation of the output from the actual value, an error function is generated. By feeding back the error function, the model adjusts the weights so that the error function is reduced to a certain level.
In this study, we use two activation functions: ReLU and Leaky ReLU. Table 2 shows the mathematical and graphical representations of the activation functions used. For deep learning applications, the most widely used activation function is ReLU because of its faster learning ability. In comparison with the sigmoid and tanh activation functions, ReLU performs and generalizes better for deep learning applications [33]. It can be easily optimized with gradient descent methods because of its linear properties. This function rectifies negative input values by setting them to zero and eliminates the vanishing gradient problem, which makes it better than several other activation functions. However, ReLU has a significant limitation: it sometimes becomes fragile during training, which causes some gradients to die. This causes some neurons to be dead, which hinders the learning process. To overcome this issue, we can use the Leaky ReLU activation function. In Leaky ReLU, we use the α (alpha) parameter to overcome the dead-neuron problem. Due to this parameter, gradients never become zero during the entire training time.

Table 2. Activation functions and their representations


Type of activation function | Mathematical expression
ReLU | f(x) = max(0, x) = x, if x ≥ 0; 0, if x < 0
Leaky ReLU | f(x) = max(αx, x) = x, if x > 0; αx, if x ≤ 0

(The original table also plots the waveform of each function.)
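The two functions in Table 2 are one-liners in NumPy (a minimal sketch; the default alpha here is illustrative):

import numpy as np

def relu(x):
    # f(x) = max(0, x)
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # f(x) = max(alpha * x, x): the small slope for x <= 0 keeps
    # gradients nonzero and avoids dead neurons.
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))
print(leaky_relu(x, alpha=0.01))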

5 System Configuration and Software Used

The selected machine learning architecture (MLP) for the simulation was designed and coded using a Python Jupyter Notebook, an open-source web application that allows the creation and sharing of documents containing live code, equations, visualizations, and narrative text. It supports data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and many other tasks. The NumPy, TensorFlow, Matplotlib, and Keras libraries are applied (https://www.python.org, https://jupyter.org). The software version used here is Python 3.7.6. The system configuration used for the present analysis is shown in Table 3.

Table 3. System configuration used


Processor | Installed RAM | System type | GPU
Intel(R) Core(TM) i5-7200U CPU @ 2.50 GHz | 8.00 GB (7.89 GB usable) | 64-bit operating system | NVIDIA GeForce GTX 860M

6 Result Analysis

To evaluate the performance of the MLP model, we use mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and R-squared (R²) as the performance indicators. Based on the input parameters, we compare the performance of the machine learning model for the two cases described above as Case 1 and Case 2.
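A minimal sketch of these four indicators in Python (our own implementation, not the paper's code):

import numpy as np

def regression_metrics(y_true, y_pred):
    err = y_true - y_pred
    mse = np.mean(err ** 2)                           # mean square error
    rmse = np.sqrt(mse)                               # root mean square error
    mae = np.mean(np.abs(err))                        # mean absolute error
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    r2 = 1.0 - ss_res / ss_tot                        # R-squared
    return mse, rmse, mae, r2

# Dummy capacity values (Ah) for illustration.
y_true = np.array([1.80, 1.78, 1.75, 1.73])
y_pred = np.array([1.79, 1.77, 1.76, 1.72])
print(regression_metrics(y_true, y_pred))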

6.1 Result Analysis for Case 1


For Case 1, we consider the charging voltage, charging current, and number of cycles as input parameters. We use the Leaky ReLU and ReLU activation functions for the hidden layer and output layer, respectively. By varying the value of the alpha (α) parameter of the Leaky ReLU function, we try to improve the prediction accuracy. In this case, we initially performed 70 epochs during the training stage with the ‘adam’ optimizer. After training, the learning curve showed that, instead of 70 epochs, an optimal solution in terms of the performance-indicating parameters is reached at epoch 38. At epoch 38, we initially used an alpha value of 0.06; by then varying the alpha parameter, we found that an alpha value of 0.01 provides better prediction results. Table 4 shows the final configuration of the MLP model for Case 1.

Table 4. Configuration of the MLP model for Case 1

Input layer | Dense layer 1 (hidden) | Dense layer 2 (output) | Epochs
6 nodes | 10 nodes, Leaky ReLU activation (alpha = 0.01) | 1 node, ReLU activation | 38
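A hedged Keras sketch of the Table 4 configuration (the MSE loss and the commented fit call are our assumptions; the paper specifies only the layer sizes, activations, alpha, the ‘adam’ optimizer, and 38 epochs):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(6,)),            # V, I, K for cycles K - 1 and K
    layers.Dense(10),
    layers.LeakyReLU(alpha=0.01),        # hidden-layer activation
    layers.Dense(1, activation="relu"),  # predicted capacity at cycle K + 1
])
model.compile(optimizer="adam", loss="mse")
# model.fit(X_train, y_train, epochs=38, validation_data=(X_val, y_val))
model.summary()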

6.2 Result Analysis for Case 2


For Case 2, we consider the charging voltage, charging current, and capacity as input parameters. The activation functions are used in the same manner as in Case 1. By varying the value of the alpha (α) parameter of the Leaky ReLU function, we try to improve the prediction accuracy. Based on the learning curve after the training process, we found that the machine learning model gives better prediction results at epoch 91. At epoch 91, we initially used an alpha value of 0.06; by then varying the alpha parameter, we found that an alpha value of 0.04 provides better prediction results. Table 5 shows the final configuration of the MLP model for Case 2.

Table 5. Configuration of the MLP model for Case 2

Input layer | Dense layer 1 (hidden) | Dense layer 2 (output) | Epochs
18 nodes | 18 nodes, Leaky ReLU activation (alpha = 0.04) | 1 node, ReLU activation | 91

6.3 Comparative Analysis Between Case 1 and Case 2


Comparing the results for Case 1 and Case 2 based on the error indicators shown in Table 6, we find that Case 2, at alpha equal to 0.04, predicts the final charging capacity more accurately than Case 1. From the prediction results (see Fig. 7), we can see that Case 2 captures the trend of the actual values more accurately than Case 1, and it is also able to capture, to some extent, the sudden spikes occurring in the capacity degradation curve.

Table 6. Comparative analysis of the MLP model for Case 1 and Case 2 based on error indicators

Error indicators | Case 1 (epoch 38), alpha = 0.06 | Case 1 (epoch 38), alpha = 0.01 | Case 2 (epoch 91), alpha = 0.06 | Case 2 (epoch 91), alpha = 0.04
MSE | 0.00007758 | 0.00005706 | 0.00003579 | 0.00002959
RMSE | 0.00880823 | 0.00755415 | 0.00598296 | 0.00544033
MAE | 0.00769091 | 0.00634489 | 0.00487101 | 0.00408353
R-squared (R²) | 0.9170 | 0.9390 | 0.9617 | 0.9683

Fig. 7. Prediction results for case 1 and case 2



7 Conclusion

This paper aims to find a suitable machine-learning-based capacity prediction model for the second-life stage of the battery from its first-life stage. To this end, this work proposes an MLP model in which the input parameters are fed in two different ways, categorized as Case 1 and Case 2. The results show that Case 2, consisting of charging voltage, charging current, and capacity as input parameters, along with the arrangement of the datasets into 5 lag-cycle values and 1 current (present) cycle value, can predict the capacity for the second-life stage of the battery more reliably and accurately than Case 1. This analysis shows that charging voltage and charging current are important input parameters for predicting the capacity of the next cycle. We also noticed that if, instead of taking the cycle number as an input parameter, we consider the capacity from some previous (lag) cycles as an input, then the machine learning model can predict the capacity of future cycles more accurately and reliably. Therefore, we conclude that the proposed machine learning model can predict the capacity of the second-life stage of the battery from its first-life stage to a considerable extent.

References
1. Strickland, D., Chittock, L., Stone, D.A., Foster, M.P., Price, B.: Estimation of transportation
battery second life for use in electricity grid systems. IEEE Trans. Sustain. Energy 5(3), 795–
803 (2014)
2. Abdel-Monem, M., Hegazy, O., Omar, N., Trad, K., Breucker, S.D., Bossche, P.V.D.,
Mierlo, J.V.: Design and analysis of generic energy management strategy for controlling
second-life battery systems in stationary applications. Energies 9(11), 889 (2016)
3. Abdel-Monem, M., Trad, K., Omar, N., Hegazy, O., Bossche, P.V.D., Mierlo, J.V.:
Influence analysis of static and dynamic fast-charging current profiles on ageing performance
of commercial lithium-ion batteries. Energy 120, 179–191 (2017)
4. Luo, X., Wang, J., Dooner, M., Clarke, J.: Overview of current development in electrical
energy storage technologies and the application potential in power system operation. Appl.
Energy 137, 511–536 (2015)
5. Bhatt, A., Tiwari, S., Ongsakul, W.: A review on re-utilization of electric vehicle’s retired
batteries. In: Proceedings of Conference on Ind. Commercial Use Energy, ICUE, vol. 2018-
October, pp. 1–5. IEEE, Phuket, Thailand (2019)
6. Tan, Q., Lv, C., Xu, Y., Yang, J.: Mesoporous composite of LiFePO4 and carbon
microspheres as positive-electrode materials for lithium-ion batteries. Particuology 17, 106–
113 (2014)
7. Olsson, L., Fallahi, S., Schnurr, M., Diener, D., Loon, P.V.: Circular business models for
extended EV battery life. Batteries 4(4), 57 (2018)
8. Martinez-Laserna, E., Gandiaga, I., Sarasketa-Zabala, E., Badeda, J., Stroe, D., Swierczyn-
ski, M., Goikoetxea, A.: Battery second life: hype, hope or reality? a critical review of the
state of the art. Renew. Sustain. Energy Rev. 93, 701–718 (2018)
9. Abdel-Monem, M., Hegazy, O., Omar, N., Trad, K., Bossche, P.V.D., Mierlo, J.V.: Lithium-
ion batteries: comprehensive technical analysis of second-life batteries for smart grid
applications. In: 19th European Conference on Power Electronics and Applications, EPE
2017 ECCE Europe, vol. 2017-January, pp. 1–16. IEEE, Poland (2017)

10. Ali, M.C.: Exploring the potential of integration quality assessment system in construction
(qlassic) with ISO 9001 quality management system (QMS). Int. J. Qual. Res. 8(1), 73–86
(2014)
11. Stiaszny, B., Ziegler, J.C., Krauß, E.E., Zhang, M., Schmidt, J.P., Ivers-Tiffée, E.:
Electrochemical characterization and post-mortem analysis of aged LiMn 2O4-
NMC/graphite lithium ion batteries part II: Calendar aging. J. Power Sources 258, 61–75
(2014)
12. Waag, W., Käbitz, S., Sauer, D.U.: Experimental investigation of the lithium-ion battery
impedance characteristic at various conditions and aging states and its influence on the
application. Appl. Energy 102, 885–897 (2013)
13. Giordano, G., Klass, V., Behm, M., Lindbergh, G., Sjoberg, J.: Model-based lithium-ion
battery resistance estimation from electric vehicle operating data. IEEE Trans. Veh. Technol.
67(5), 3720–3728 (2018)
14. Thomas, J.J., Pillai, N.: A deep learning framework on generation of image descriptions with
bidirectional recurrent neural networks. In: International Conference on Intelligent
Computing & Optimization, ICO 2018, vol. 866, pp. 219–230. Springer, Pattaya (2018)
15. Nandagopalan, S., Kumar, P.K.: Deep convolutional network based saliency prediction for
retrieval of natural images. In: International Conference on Intelligent Computing &
Optimization, ICO 2018, vol. 866, pp. 487–496. Springer, Pattaya (2018)
16. Birla, S., Kohli, K., Dutta, A.: Machine learning on imbalanced data in credit risk. In: 7th
IEEE Annual Information Technology, Electronics and Mobile Communication Conference,
IEEE IEMCON 2016, pp. 1–6. IEEE, Canada (2016)
17. Rodriguez-Aguilar, R., Marmolejo-Saucedo, J.A., Vasant, P.: Machine learning applied to
the measurement of quality in health services in Mexico: the case of the social protection in
health system. In: International Conference on Intelligent Computing & Optimization, ICO
2018, vol. 866, pp. 560–572. Springer, Pattaya (2018)
18. Thomas, J.J., Ali, A.M.: Dispositional learning analytics structure integrated with recurrent
neural networks in predicting students performance. In: International Conference on
Intelligent Computing & Optimization, ICO 2019, vol. 1072, pp. 446–456. Springer, Koh
Samui (2019)
19. Yang, J., Xia, B., Huang, W., Fu, Y., Mi, C.: Online state-of-health estimation for lithium-
ion batteries using constant-voltage charging current analysis. Appl. Energy 212, 1589–1600
(2018)
20. Singh, P., Kaneria, S., Broadhead, J., Wang, X., Burdick, J.: Fuzzy logic estimation of SOH
of 125Ah VRLA batteries. In: INTELEC, International Telecommunications Energy
Conference (Proceedings), pp. 524–531. IEEE, USA (2004)
21. Nuhic, K., Terzimehic, A., Soczka-Guth, T., Buchholz, T., Dietmayer, M.: Health diagnosis
and remaining useful life prognostics of lithium-ion batteries using data-driven methods.
J. Power Sources 239, 680–688 (2013)
22. Hannan, M.A., Lipu, M.S.H., Hussain, A., Saad, M.H., Ayob, A.: Neural network approach
for estimating state of charge of lithium-ion battery using backtracking search algorithm.
IEEE Access 6, 10069–10079 (2018)
23. Hannan, M.A., Hoque, M.M., Hussain, A., Yusof, Y., Ker, P.J.: State-of-the-art and energy
management system of lithium-ion batteries in electric vehicle applications: issues and
recommendations. IEEE Access 6, 19362–19378 (2018)
24. Eddahech, A., Briat, O., Bertrand, N., Delétage, J.Y., Vinassa, J.M.: Behavior and state-of-
health monitoring of Li-ion batteries using impedance spectroscopy and recurrent neural
networks. Int. J. Electr. Power Energy Syst. 42(1), 487–494 (2012)

25. Yang, D., Wang, Y., Pan, R., Chen, R., Chen, Z.: A neural network based state-of-health
estimation of lithium-ion battery in electric vehicles. Energy Procedia 105, 2059–2064
(2017)
26. Zhang, Y.Z., Xiong, R., He, H.W., Pecht, M.: Validation and verification of a hybrid method
for remaining useful life prediction of lithium-ion batteries. J. Clean. Prod. 212, 240–249
(2019)
27. Choi, Y., Ryu, S., Park, K., Kim, H.: Machine learning-based lithium-ion battery capacity
estimation exploiting multi-channel charging profiles. IEEE Access 7, 75143–75152 (2019)
28. Lui, Y., Zhao, G., Peng, X., Hu, C.: Lithium-ion battery remaining useful life prediction with
long short-term memory recurrent neural network. In: Annual Conference on Prognostics
and Health Management Society, vol. 1, pp. 1–7 (2017)
29. Bole, B., Kulkarni, C.S., Daigle, M.: Adaptation of an electrochemistry-based Li-ion battery
model to account for deterioration observed under randomized use. In: PHM 2014 -
Proceedings of Annual Confernce Prognostics and Health Management Society, pp. 502–
510 (2014)
30. Liu, Z., Zhao, J., Wang, H., Yang, C.: A new lithium-ion battery SOH estimation method
based on an indirect enhanced health indicator and support vector regression in PHMs.
Energies 13(4), 830 (2020)
31. Qin, T., Zeng, S., Guo, J., Skaf, Z.: A rest time-based prognostic framework for state of
health estimation of lithium-ion batteries with regeneration phenomena. Energies 9(11), 1–
18 (2016)
32. Khumprom, P., Yodo, N.: A data-driven predictive prognostic model for lithium-ion
batteries based on a deep learning algorithm. Energies 12(4), 1–21 (2019)
33. Nwankpa, C.E., Ijomah, W., Gachagan, A., Marshall, S.: Activation Functions: Comparison
of Trends in Practice and Research for Deep Learning, pp. 1–20 (2018)
Classification of Cultural Heritage Mosque
of Bangladesh Using CNN and Keras Model

Mohammad Amir Saadat¹, Mohammad Shahadat Hossain², Rezaul Karim², and Rashed Mustafa²

¹ Bandarban University, Bandarban, Bangladesh
amirsaadi86@gmail.com
² University of Chittagong, Chittagong, Bangladesh
hossain_ms@cu.ac.bd, pinnacle_of_success@yahoo.com, rashed.mutafa78@gmail.com

Abstract. Nowadays, convolutional neural networks are widely used for categorizing cultural heritage and other images. Important image features can be easily extracted with this technique before classification, so we can say that a convolutional neural network acts as a complete package. Keras is a neural network API that works as a package in the R interface; along with the convolutional neural network, Keras supports faster execution. To this end, a new dataset has been formed and passed through several layers of a convolutional network for training and testing purposes. Promising accuracy has been achieved, and the approach is considered one of the best techniques for the restoration and preservation of cultural heritage mosques and other resources.

Keywords: Image classification · Deep learning · Convolutional neural network · Cultural heritage mosque

1 Introduction

We live in an age of technology, and technology has been developed based on much structured and unstructured data that can be collected from many historical sources. This ancient information needs to be stored for future purposes and for a nation’s pride. Research fields cannot be imagined without historical information. Again, we have a responsibility to increase awareness to save cultural heritage resources. Different researchers and organizations are doing their level best to continue research that facilitates cultural heritage.
Bangladesh is a country bestowed with a rich cultural heritage. There are several monuments, ancient mosques, places, and properties that are part of this cultural heritage. These resources need to be documented and preserved for various purposes. Again, the people of the country have an affection for knowing about their heritage. The study of cultural heritage helps to uncover hidden cultural treasures and to increase knowledge.
Our country contains a large number of mosques. Most of them were built recently or after the British period, but some were built during or before the British period. Several organizations have recognized such mosques as cultural heritage; some mosques have not been recognized yet. All of these mosques are our


valuable resources that contain valuable historical information, and they need proper research and analysis for preservation. Our objective is to collect images of these mosques, analyze them, and classify them depending on their patterns.

2 Problem Statement

2.1 Image Classification


Image classification is the process of categorizing images into different classes. It is one of the basic operations performed in image processing. The purpose of image classification is to categorize images into several classes, and it can be divided into supervised and unsupervised classification [1].
The image classification process includes training and test phases. In the training phase, important properties are extracted and distinctive representations are built for a specific class. In the test phase, the test images are classified into the various classes for which the system was trained [2].
Different image classification techniques have been used by different researchers for different purposes. In this paper, we use a convolutional neural network for the classification of training and test images. The aim of our experiment is to differentiate the cultural heritage mosques of Bangladesh from other buildings and mosques.

2.2 Deep Learning


Deep learning is a field of machine learning that permits models to learn representations of data using multiple levels of abstraction. Learning techniques can be semi-supervised, supervised, or unsupervised. The low-level layers support the top-level layers by providing useful features, and data is transformed as it passes through the layers in deep learning [3–6].
There are various deep learning models available for different experiments: the convolutional neural network, deep neural network, autoencoder, restricted Boltzmann machine, recurrent neural network, and hybrid models [7]. This paper uses a convolutional neural network for our experiment.
Machine learning and artificial intelligence have been greatly influenced by deep learning techniques during the last few years [8]. Computer vision is now benefiting from these techniques [6]. Again, robotics, agriculture, finance, biotech, radiology, and many more fields now benefit from deep learning [9]. Deep learning techniques are applicable in object recognition, speech recognition, signal processing, NLP, etc. [3].

2.3 Convolutional Neural Network and Its Characteristics


A convolutional neural network (CNN) is a deep learning model that can be used for classification purposes, specifically for images [6, 7].
Several layers are required to build a CNN. The layers are described below:

Convolutional: Convolution operations are performed by the convolution layers on the input and then passed to the following layer. The convolution mimics the response of an individual neuron to visual stimuli [10]. The convolution operation decreases the number of parameters and performs better with fewer parameters [11].
Pooling: There are many types of pooling available in a CNN, i.e., max pooling, mean or average pooling, etc. A max pooling layer performs down-sampling by dividing the input into rectangular pooling regions and computing the maximum of each region. An average pooling layer computes the mean value of each region and thus also supports down-sampling. Pooling layers reduce overfitting by minimizing the number of parameters [12].
Dropout: Input elements are randomly set to zero by a dropout layer. This layer controls overfitting and does not perform any learning itself [13].
Flat: It supports the process of transforming the outputted two-dimensional arrays into a single linear vector, a process called flattening.
Fully Connected: A fully connected layer multiplies the inputs by weight matrices and adds bias vectors. It performs the classification task in a convolutional neural network [5, 6].

2.4 Hyperparameter Description and Optimization


In this section, we discuss the hyperparameters used in our experiment.
Learning Rate: This is a significant parameter that fixes the size of the step taken by the optimization technique in each iteration. With a low learning rate, it will take a long time to achieve convergence; with a high rate, training can diverge or become unstable around the minimum. Using a small training set can help in choosing a suitable learning rate.
Number of Epochs: We count one epoch when the whole dataset has been passed both forward and backward exactly once. It is necessary to use more than one epoch because one pass is not enough for the neural network to learn the whole dataset, so the entire dataset is passed through the neural network several times [16].
Batch Size: The total number of training samples in a single batch is called the batch size. A single epoch is too large to be processed at once, so it is partitioned into several batches [16]. For batch gradient descent, the batch size equals the training set size; for stochastic gradient descent, the size is 1; and for mini-batch gradient descent, the size is greater than 1 and less than the training set size, with popular mini-batch sizes being 32, 64, and 128 [17]. Mini-batches consume less memory, and the model can be fitted faster with them [18].
Momentum: Momentum is a technique that helps to speed up the optimization of the model parameters. It tracks how the parameters changed in recent iterations and keeps moving in the prevailing direction using that information. It is helpful for dimensions whose gradients keep moving in the same direction; dimensions whose gradients change direction receive smaller updates. It reduces oscillation and speeds up convergence [6].

Nesterov: Nesterov is a variant of momentum that provides stronger convergence for convex functions. It performs better than the typical momentum [19].

2.5 Activation Function


Activation functions are abstractions that decide whether a neuron fires or not. Neurons are activated based on these functions, and they play a vital role in artificial neural networks [20].
An activation function can be linear or nonlinear. A neural network with a nonlinear activation function performs a nonlinear transformation, and image classification is not possible without such an activation function. So, an activation function is a must for image classification [20].
A typical perceptron is shown in the following figure (Fig. 1).

Fig. 1. A typical perceptron

Popular Activation Functions:


ReLU: “ReLU” stands for Rectified Linear Unit. It is a popular activation function used in neural networks. Errors can be back-propagated through this function, and it can activate several layers of neurons; for these reasons, it is called a nonlinear activation function [20].
Softmax: Softmax is a type of activation function that is typically used for classification. It works at the final layers of the network and converts numbers into probabilities [21].
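A minimal NumPy sketch of the softmax transformation (the shift by the maximum is a standard trick for numerical stability):

import numpy as np

def softmax(z):
    # Convert raw scores into probabilities that sum to 1.
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # approx. [0.659 0.242 0.099]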

3 Materials and Methods

3.1 Dataset
Our dataset contains 196 images of buildings, classified into two types: ancient mosques, and recent mosques together with other buildings. These images were mainly collected using Google and Wikipedia.
Among the images, 176 were used for the training phase and the remaining were used for the test phase. 20% of the training images, selected randomly, were used for the validation phase.
All the images used in our experiment are color images and have been resized using appropriate image processing tools.
Our train and test datasets are shown in the following figures (Fig. 2 and Fig. 3).

Fig. 2. Train images

Fig. 3. Test images



3.2 RStudio
Many programming languages are available for performing image classification. Among them, we used the R programming language to implement our experiment, with RStudio as the editor for writing R code [22].

3.3 Methodology
A convolutional neural network is a class of artificial neural network (ANN) in which the connectivity patterns between neurons resemble the structure of the visual cortex of the biological brain [6, 23]. Its typical structure is composed of several layers. The primary layers are the convolutional and pooling layers; the final layers consist of the fully connected layer and the output layer. Feature extraction is performed by the convolutional and pooling layers, and classification is performed on the extracted features by the fully connected layer [5, 6].
The convolutional neural network provides the ideas of sparse interaction, parameter sharing, and equivariant representations [5, 7]. It can automatically extract features from images and classify them based on the extracted features [7].
Several approaches need to be considered when applying a CNN to input images, i.e., the data-driven approach and the model-driven approach. The data-driven approach uses each dimension as a channel and then executes 1D convolutions on them; the outputs of each channel are flattened after convolution and pooling. On the other hand, inputs are resized to virtual 2D images by model-driven approaches to adopt a 2D convolution [7].
A more detailed version of the approach is shown in the following figure (Fig. 4).

Fig. 4. Detailed CNN architecture



3.4 Keras Sequential Model with CNN


A Keras sequential model is a linear stack of layers. A sequential model can be created by calling the keras_model_sequential() function followed by a series of layer functions [22] (Fig. 5); an equivalent sketch in the Python Keras API is given after the figure.

Fig. 5. Summary of model
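For illustration, an equivalent model in the Python Keras API (the paper uses the R interface; the layer sizes, input resolution, and two-class softmax head here are our assumptions, mirroring the layer types described in Sect. 2.3):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(100, 100, 3)),
    layers.MaxPooling2D((2, 2)),            # down-sampling
    layers.Dropout(0.25),                   # regularization against overfitting
    layers.Flatten(),                       # 2-D feature maps -> 1-D vector
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),  # ancient mosque vs. other buildings
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()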

4 Result and Discussion

4.1 Result
The training plot of the fitted model is shown in the following figure (Fig. 6).

Fig. 6. Train data plot



The accuracy of the trained images is given in Table 1:

Table 1. Accuracy-loss evaluation for train images

Accuracy | Loss
1 | 0.006507687

The predicted and actual train data are given in the confusion matrix in Table 2:

Table 2. Actual and predicted train data

Predicted \ Actual | 0 | 1
0 | 35 | 0
1 | 0 | 141

We used 20 images as test images to check our experiment. The predicted and actual test data are given in the confusion matrix in Table 3:

Table 3. Actual and predicted test data

Predicted \ Actual | 0 | 1
0 | 13 | 0
1 | 2 | 5

The predicted classes of the test images are as follows:

0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,1,1,1,1,1

where ‘0’ indicates the class of ancient mosque images and ‘1’ indicates the class of buildings including recent mosques.
Now, we show the result of the test for negative images. Here, also, 20 images were used for the test: 15 images that are actually negative were given to the positive image class, and 5 images were tested for the other class, which actually contains the negative image class.
A summary of the predicted and actual test data is given in the confusion matrix in Table 4:

Table 4. Actual and predicted test data

Predicted \ Actual | 0 | 1
0 | 1 | 0
1 | 14 | 5

The predicted classes of the test images are as follows:

1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1

where ‘0’ indicates the class of ancient mosque images and ‘1’ indicates the class of buildings including recent mosques.

4.2 Discussion
Our objective is to classify the train images and obtain good accuracy to make our experiment meaningful. In our experiment, the classification of the train images was 100% accurate, as shown in Table 1 in the previous result section. Our model can easily classify the train images based on the extracted features. Among the 176 trained images, 35 images were classified into the ancient mosques class and 141 images were categorized into the other buildings class, and the accuracy of the classification between both classes was good and appreciable.
20 images were used to test our experiment: 15 images from the ancient mosque class and 5 images from the other buildings class. Among the 15 ancient mosque images, 13 were classified correctly and 2 were not. The 2 misclassified images are shown in the following Fig. 7:

Fig. 7. Misclassified images

Again, the 5 other buildings images were classified correctly. We obtained 86.67% accuracy for the ancient mosque class images and 100% for the other buildings class images, so the total accuracy is 90%.
Finally, a test was done with negative images to verify our model, using 20 images. The ancient mosque image class was considered the positive image class, and the other class, which contains the buildings images, was considered the negative image class. 15 actually negative images were given to the positive image class and 5 actually negative images were given to the negative image class. Among the 15 images, 14 were predicted as negative class images and 1 was identified as a positive class image, although it is also a negative class image from our perspective. The one image that was predicted as positive and the 14 images that were predicted as negative are shown in the following figures (Fig. 8 and Fig. 9).

Fig. 8. Negative image that has been predicted positive

Fig. 9. Images that have been predicted negative

5 Advantages and Limitations

It is known that a CNN is a good feature extractor as well as a good classifier, so we did not need to use an additional feature extractor or classifier for the experiment. This is one of the reasons for choosing a CNN over other methods. Besides, our proposed methodology provided better accuracy.
Although the quantity of images was small, the accuracy of the result was good enough. We focused on the quality of the research rather than the quantity of data; again, the dataset was not too small to fulfill our requirements. We utilized ancient mosque images of our country, which are an essential part of our heritage, and this adds a new dimension to cultural-heritage-related research.

6 Conclusion

Our proposed methodology has performed as per our expectations. A convolutional neural network implemented with Keras has made the mechanism efficient, and feature extraction has been done by the same network without any additional feature extractor. We can therefore say that a convolutional neural network is a complete package for classification tasks, and we are satisfied with our experimental performance.

7 Future Scope
1. Other researchers may be motivated by this experiment.
2. Documentation and preservation of the ancient mosques may be carried out based on the current research.

References
1. Lillesand, T.M., Kiefer, R.W., Chipman, J.W.: Remote Sensing and Image Interpretation,
Open Access (2004)
2. Jaswal, D., Sowmya, V., Soman, K.P.: Image classification using convolutional neural
networks. Int. J. Sci. Eng. Res. 5(6), June 2014
3. Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new
perspectives. IEEE Trans. Pattern Anal. Mach. Intell. (2014)
4. Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. (2014)
5. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)
6. Llamas, J., Lerones, P.M., Medina, R., Zalama, E., Gómez-García-Bermejo, J.: Classification of architectural heritage images using deep learning techniques. Appl. Sci., 26 September 2017
7. Wang, J., Chen, Y., Hao, S., Peng, X., Hu, L.: Deep Learning for Sensor-based Activity
Recognition: A Survey. Elsevier (2017)
8. Du, T., Shanker, V.K.: Deep Learning for Natural Language Processing
9. Wang, J.: Deep Learning - An Artificial Intelligence Revolution, Ark Invest, June 2017
10. Convolutional Neural Networks (LeNet) – DeepLearning 0.1 documentation, LISA Lab
(2013)
11. Aghdam, H.H., Heravi, E.J.: Guide to Convolutional Neural Networks: A Practical
Application to Traffic-Sign Detection and Classification. Springer, Cham (2017)
12. Nagi, J., Ducatelle, F., Di Caro, G.A., Cireşan, D., Meier, U., Giusti, A., Nagi, F.,
Schmidhuber, J., Gambardella, L.M.: Max-pooling convolutional neural networks for vision-
based hand gesture recognition. In: IEEE International Conference on Signal and Image
Processing Applications (ICSIPA) (2011)
13. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a
simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958
(2014)
14. van den Oord, A., Dieleman, S., Schrauwen, B.: Deep Content-Based Music Recommen-
dation (2013)

15. Collobert, R., Weston, J.: A unified architecture for natural language processing: deep neural
networks with multitask learning. In: Proceedings of the 25th International Conference on
Machine Learning, ICML 2008, New York, NY, USA (2008)
16. Quora. https://www.quora.com/What-is-an-epoch-in-deep-learning
17. Machine Learning Mastery (2018). https://machinelearningmastery.com/difference-between-
a-batch-and-an-epoch/
18. Stack Exchange (2018). https://stats.stackexchange.com/questions/153531/what-is-batch-
size-in-neural-network
19. CS231n Convolutional Neural Networks for Visual Recognition. http://cs231n.github.io/
neural-networks-3/
20. Gupta, D.: Analytics Vidhya, 23 October 2017. https://www.analyticsvidhya.com/blog/2017/10/fundamentals-deep-learning-activation-functions-when-to-use-them/
21. Data Science Bootcamp. https://medium.com/data-science-bootcamp/understand-the-
softmax-function-in-minutes-f3a59641e86d
22. Chollet, F., Allaire, J.J.: Deep Learning with R, Manning Publications Co.
23. Matsugu, M., Mori, K., Mitari, Y., Kaneda, Y.: Subject independent facial expression
recognition with robust face detection using a convolutional neural network. ScienceDirect
16(5–6) (2003)
Classification of Orthopedic Patients Using
Supervised Machine Learning Techniques

Nasrin Jahan, Rashed Mustafa(&), Rezaul Karim,


and Mohammad Shahadat Hossain

Computer Science and Engineering, University of Chittagong,


Chittagong 4331, Bangladesh
rubycu2015@gmail.com, {rashed.m,hossain_ms}@cu.ac.bd,
pinnacle_of_siccess@yahoo.com

Abstract. Nowadays, the number of orthopedic patients is increasing due to road accidents, cycling, exercise, and many similar activities. In this work, machine learning classifiers have been used to classify orthopedic patients according to their biomechanical features. Supervised machine learning classifiers have been applied, and their different evaluation metrics have been analyzed to find a suitable classifier. A preprocessed dataset with 310 instances is used to classify patients. Patients with no spinal vertebral issues are labeled normal, while those suffering from disc dislocation (disc hernia) or from spondylolisthesis are labeled abnormal. The classification is done in two stages: in the first stage, patients are classified as normal or abnormal, and in the second stage, patients are classified according to their physical condition. This work aims to find the best performing algorithm, analyzed with different evaluation metrics, to classify orthopedic patients. The average accuracy is more than 80% for most of the algorithms, with K-Nearest Neighbor giving the highest accuracy and good performance on the other evaluation metrics. Society will benefit from this research through the proper classification of orthopedic patients, so that patients can start treatment immediately, saving many lives.

Keywords: Machine learning · K-Nearest Neighbor · Classification · Biomechanics

1 Introduction

Machine learning algorithms have been successfully used in various medical fields such as disease prediction, classification, and therapy and medicine recommendation. To extend the service of machine learning in the medical field, this paper applies several supervised machine learning models to classify orthopedic patients according to their health condition. It may seem easy to choose a classification model based on accuracy alone, but in most cases accuracy is not the final metric by which to judge a classifier, as accuracy can sometimes hide the predictive capability of a model. Analysis of additional evaluation metrics is needed to validate a classifier. The purpose of this work is to evaluate and compare the performance of popular supervised machine learning algorithms applied to classify orthopedic patients.


2 Background

Biomechanics is the study of the structure and motion of living beings and biological systems [1]. Biomechanics helps to understand the physical condition of a person. According to the American Academy of Orthopedic Surgeons (AAOS), four out of five adults feel lower back pain at some point during their lives, and many of them have common intervertebral disc disorders [1]. Disc hernia and spondylolisthesis are examples of common vertebral diseases which cause intense pain. One of the major causes of LBP is intervertebral disc herniation [2]. The condition in which the gel-like material (nucleus pulposus) gets squeezed out through fractures in the outer wall of the intervertebral disc is known as lumbar disc herniation [3]. Figure 1 shows an example of a disc hernia. A herniated disc puts pressure on the nerves, which leads to severe pain.

Fig. 1. Disk Hernia

Spondylolisthesis is a condition in which one of a person's vertebrae (bones) in the spine slips out of position onto the vertebra below it [4]. This situation may cause severe pain and chronic damage if left untreated. Patients may eventually experience weakness, and a leg may become paralyzed due to nerve damage. Figure 2 shows a condition of spondylolisthesis where a vertebra has slipped forward onto the vertebra below it.

Fig. 2. Spondylolisthesis

Both conditions can squeeze the spinal cord or nerve roots and cause pain [5]. They create different impacts on the human body but show similar symptoms; hence, diagnosis of hernia and spondylolisthesis may lead to misclassification. In this study, different types of machine learning algorithms are applied to classify patients. The goal of this work is to answer the following question:
“From the selected algorithms, which one is best for classifying orthopedic patients from their biomechanical features?”
This study will help specialists to classify patients easily and accurately according to their biomechanical features.

3 Literature Review

The application of machine learning in the medical field has become very valuable to specialists, and different machine learning algorithms have been used in different medical areas. Recently, a study applied several machine learning techniques, namely Naive Bayes, Multilayer Perceptron, Support Vector Machine, Logistic Regression, Quadratic Discriminant Analysis, K-Nearest Neighbor, Adaptive Boosting, Gradient Descent, Decision Tree, and Random Forest, to classify orthopedic patients. The classification was done in two stages, and the best performing algorithm was the decision tree, which stood out with 99% accuracy [6].
Nureedeen et al. [7] presented an approach to classify orthopedic patients in one stage. In that work, different evaluation metrics were evaluated, and the best performing algorithm was the Naive Bayes classifier with 83.4% accuracy, 100% specificity, and 16% error. Rocha Neto et al. [8] conducted an experiment to classify orthopedic patients using support vector machines and a multilayer perceptron on the SIMPATICO platform, and the results showed that ensembles of classifiers are better than standalone classifiers. Alawneh et al. [3] proposed a method to classify herniated and normal disks from MRI images containing suspected regions. The work comprised several tasks: image acquisition, image annotation, enhancement and extraction of regions of interest, feature extraction, and image classification. Liao et al. [4] proposed an automatic system to detect spondylolisthesis using a learning-based detector, trained on synthesized spondylolisthesis images.

4 Methodology

4.1 Learning Technique


Supervised machine learning is a learning technique in which a function mapping input to output is learned from given input-output pairs [9]. As its name indicates, the learning process is supervised, as if by a teacher. Generally, in supervised learning the machine is trained using data whose class labels are known, that is, data already categorized with accurate labels [10]. After training, the machine is fed new examples so that the supervised learning algorithm can analyze the training data and produce correct predictions from the labeled data.

4.2 Machine Learning Algorithms


Naive Bayes is a probabilistic classifier based on Bayes' theorem [11]. It is not a single classifier but rather a family of classifiers that share a common principle: every pair of attributes being classified is independent of each other. This attribute independence is also the key assumption used to make predictions. Bayes' rule combines conditional probability with the product and sum rules. Bayes' theorem can be written as

p(X = x | Y = y) = p(X = x, Y = y) / p(Y = y)    (1)
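As an illustration of this principle, the following sketch (assuming scikit-learn, which this paper's results section also uses, on toy data invented here) trains a Gaussian Naive Bayes classifier and inspects its posterior probabilities:

```python
# Gaussian Naive Bayes: applies Bayes' theorem under the
# attribute-independence assumption described above.
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[63.0, 22.5], [38.5, 16.9], [40.2, 13.9], [44.5, 9.4]])  # toy features
y = np.array([1, 0, 0, 0])                                             # toy labels

model = GaussianNB().fit(X, y)
print(model.predict([[60.0, 20.0]]))        # class with the highest posterior
print(model.predict_proba([[60.0, 20.0]]))  # posterior probabilities p(y | x)
```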

Multilayer perceptron (MLP) is a supervised machine learning model belonging to the class of feedforward artificial neural networks (ANNs) [12]. For supervised learning, MLP uses a training technique called backpropagation. An MLP has multiple nodes arranged in interconnected layers, namely input, hidden, and output layers, and maps a group of inputs onto a set of desired outputs.
The basic concept of the support vector machine classifier is to find a hyperplane in N-dimensional space (N being the number of features) that separates the data points of the classes. There are many possible hyperplanes that could separate two classes of data points. The margin is drawn such that the distance between the hyperplane and the nearest data points of each class is maximized, which reduces the classification error [13]. Maximizing the margin distance also means that future data points can be classified with more confidence.
Logistic regression searches the whole dataset to find the hyperplane which best separates the classes. The outcome of logistic regression is one of two possibilities, expressed as binary values (1/0) or text labels (Yes/No, True/False, normal/abnormal) of the dependent variable [14]. The core of logistic regression is the “logistic function”, an ‘S’-shaped curve which limits the output to the interval [0, 1] and maps any real value into it. A linear function therefore fails to represent logistic regression, because it may produce values greater than 1 or less than 0, which is not possible under the hypothesis of logistic regression. The logistic function is given below:

f(k) = 1 / (1 + e^(−k))    (2)
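A minimal numerical sketch of Eq. (2), showing that the output stays within (0, 1) for any real input (NumPy assumed):

```python
# The logistic (sigmoid) function of Eq. (2); unlike a plain linear
# function, its outputs are always bounded between 0 and 1.
import numpy as np

def sigmoid(k):
    return 1.0 / (1.0 + np.exp(-k))

print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # ~[0.007, 0.5, 0.993]
```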

Most machine learning algorithms are parametric, but K-Nearest Neighbor is non-parametric. It classifies a data point based on the majority vote of its closest neighbors [15], predicting the class using the Euclidean distance. The choice of ‘K’, the number of nearest neighbors considered, is at the core of the algorithm, and the value of ‘K’ can influence the final result. A common rule for selecting ‘K’ is given below:

K = √n    (3)
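The following hedged sketch applies the K = √n heuristic of Eq. (3) with scikit-learn's KNeighborsClassifier on invented toy data (Euclidean distance is the default metric):

```python
# Choosing the number of neighbors as the square root of the sample size.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X = np.random.rand(310, 6)             # 310 toy samples, 6 features
y = np.random.randint(0, 2, size=310)  # toy binary labels

k = int(np.sqrt(len(X)))               # sqrt(310) -> 17 neighbors
knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
print(knn.predict(X[:3]))
```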

Decision Tree is a powerful and widely used classifier. Instances are classified starting from the root node and sorted by their attribute values [16]. The tree partitions data into subsets, and the partitioning continues until no further partition is possible. The partitioning is done using binary splitting, and splitting stops when all samples in the subset at a node have the same value of the target variable, or when no further splitting is possible. Decision Tree uses the entropy function to characterize the impurity of a dataset.
Random forest is a supervised classification model. As its name indicates, it consists of a set of single decision trees created from randomly selected subsets of the training set. Every tree in the random forest makes a class prediction for the data, and the class receiving the most votes becomes the model's prediction [17]. The large number of uncorrelated trees operates as a committee, giving a higher chance of accurate results. RF gives highly accurate results, learns very fast, and is very efficient on large datasets. Random forest uses the Gini index to decide the final class at each tree. If dataset T contains examples from n classes, the Gini index Gini(T) is defined as

Gini(T) = 1 − Σ_j (P_j)²    (4)
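For concreteness, a small sketch of Eq. (4) on invented label sets; a pure node yields a Gini index of 0:

```python
# Gini index of Eq. (4): one minus the sum of squared class
# proportions in a node; 0 indicates a pure node.
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["normal"] * 8 + ["abnormal"] * 2))  # 1 - (0.8^2 + 0.2^2) = 0.32
print(gini(["normal"] * 10))                    # 0.0 (pure node)
```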

4.3 Evaluation Metrics


Evaluating a machine learning model is an important part of any work. The evaluation of binary classifiers compares two methods of assigning a binary attribute, one of which is usually a standard method and the other of which is being investigated [18]. A model may give satisfactory results when evaluated using a metric like accuracy but poor results when evaluated against other metrics such as logarithmic loss. Most of the time, classification accuracy is used to measure the performance of an algorithm, but it is not the final metric by which to truly judge the performance of a model. In this section, different types of evaluation metrics are discussed.
Classification accuracy is the proportion of accurate predictions among all input data points. Classification accuracy works well when an equal number of input samples belongs to each class. Note that a measurement system may be correct but not precise, precise but not accurate, neither, or both [19].

Accuracy = Number of correct predictions / Total number of predictions    (5)

In machine learning and statistical classification problems, the confusion matrix, also known as the error matrix, is a specific table layout that allows visualization of the performance of an algorithm. Each row of the confusion matrix describes the instances in a predicted class, and each column describes the instances in an actual class. The name originates from the idea that the matrix makes it easy to see whether the system is confusing classes, such as mislabeling one class as another. It is a particular kind of contingency table with two dimensions, “actual” and “predicted”, where identical sets of “classes” appear in both dimensions [20].
Specificity, also called the true negative rate, measures the proportion of the actual negative class that is accurately identified, for example the percentage of healthy people who are correctly classified as not having the disease [21]. The complementary false positive rate (FPR) represents the proportion of the negative class that is mistakenly determined to be positive, with respect to all negative class samples.

FPR = False Positives / (False Positives + True Negatives)    (6)

Sensitivity, also called the true positive rate, the recall, or (in some fields) the probability of detection, measures the proportion of the actual positive class that is correctly classified as positive [22]. The true positive rate is the proportion of positive samples correctly determined to be positive, with respect to all positive class samples.

TPR = True Positives / (True Positives + False Negatives)    (7)
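A short sketch (scikit-learn assumed, toy labels invented) deriving accuracy, sensitivity, and specificity from a binary confusion matrix, matching Eqs. (5)–(7):

```python
# Accuracy, sensitivity (TPR), and specificity (TNR) from a binary
# confusion matrix; note FPR = 1 - specificity as in Eq. (6).
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # TPR, Eq. (7)
specificity = tn / (tn + fp)   # TNR
print(accuracy, sensitivity, specificity)
```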

A receiver operating characteristic (ROC) curve is a graphical plot that describes the diagnostic ability of a binary classifier system as its decision threshold is varied. The ROC curve is drawn by plotting the true positive rate (TPR) against the false positive rate (FPR) at different threshold settings [23]. The area under the ROC curve, or “AUC” (Area Under Curve), is the most commonly used metric for evaluating the performance of a classifier. It indicates how well a model can separate one class from another: the higher the AUC, the better the model is at predicting classes. The AUC-ROC is plotted with TPR (true positive rate) on the y-axis against FPR (false positive rate) on the x-axis.
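An illustrative sketch (scikit-learn and matplotlib assumed, toy scores invented) of drawing a ROC curve and computing the AUC:

```python
# ROC curve and AUC from predicted class-1 probabilities.
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3]  # e.g. predict_proba[:, 1]

fpr, tpr, _ = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)

plt.plot(fpr, tpr, label=f"AUC = {auc:.2f}")
plt.plot([0, 1], [0, 1], linestyle="--", label="chance line")  # diagonal baseline
plt.xlabel("False Positive Rate"); plt.ylabel("True Positive Rate")
plt.legend(); plt.show()
```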

4.4 Dataset
In this work, the dataset was downloaded from the website in [24] and contains 310 instances. Each instance contains six features: pelvic incidence, pelvic tilt numeric, lumbar lordosis angle, sacral slope, pelvic radius, and degree of spondylolisthesis [25]. The classification is done in two stages. In the first stage, patients with no spinal vertebral issues are labeled as normal and the rest are labeled as abnormal. In the second stage, patients are classified according to their disease, alongside normal patients. The features of this dataset are used as the learning parameters. Table 1 and Table 2 show the sample dataset for the first analysis.
Table 3 and Table 4 show the sample dataset for the second analysis, where classes are divided into three levels: normal, disc hernia, and spondylolisthesis.

Table 1. First analysis sample dataset (i)


SL Pelvic incidence Pelvic tilt numeric Lumbar lordosis angle
1 63.02782 22.55259 39.60912
2 38.50527 16.9643 35.11281

Table 2. First analysis sample dataset (ii)


Sacral slope Pelvic radius Degree spondylolisthesis Class
40.47523 98.67292 −0.2544 Abnormal
21.54098 127.6329 7.986683 Normal

Table 3. Second analysis sample dataset (i)


SL Pelvic incidence Pelvic tilt numeric Lumbar lordosis angle
1 40.2502 13.9210 25.1245
2 44.52905 9.433234 52.2831
3 38.50527 16.9643 35.1121

Table 4. Second analysis sample dataset (ii)


Sacral slope Pelvic radius Degree spondylolisthesis Class
26.38249 130.3279 2.23061 Hernia
35.09582 134.7118 29.10657 Spondylolisthesis
21.54098 127.6329 7.986683 Normal

5 Results

In this study, seven classifiers are used: Naive Bayes (NB), K-Nearest Neighbor (k-NN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), and Multilayer Perceptron (MLP). Python (3.7) and its related packages are used in this work for classification. The whole dataset is used for two-stage classification. In the first stage, the dataset is classified into two classes, normal and abnormal; in the second stage, it is classified into three classes, normal, hernia, and spondylolisthesis. For each stage, 30% of the total data is used as test data. The six attributes obtained from the dataset are used as the input attributes of the system for classification. After training the system, it is first tested to evaluate whether a person is in a normal state or not; the second test then classifies the patient according to their disease, i.e., hernia or spondylolisthesis.
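The evaluation setup described above can be sketched as follows; the file name vertebral_column.csv and its column layout are assumptions for illustration, not the authors' actual files:

```python
# 70/30 split and the seven scikit-learn classifiers compared
# on the six biomechanical features.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

df = pd.read_csv("vertebral_column.csv")       # hypothetical file
X, y = df.drop(columns="class"), df["class"]   # six features, class label
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

models = {
    "NB": GaussianNB(), "k-NN": KNeighborsClassifier(), "SVM": SVC(),
    "DT": DecisionTreeClassifier(), "RF": RandomForestClassifier(),
    "LR": LogisticRegression(max_iter=1000), "MLP": MLPClassifier(max_iter=1000),
}
for name, model in models.items():
    print(name, model.fit(X_tr, y_tr).score(X_te, y_te))  # test accuracy
```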
As the test data is 30%, 93 samples out of 310 are used for evaluating the model. Different evaluation metrics like accuracy, specificity, sensitivity, AUC, and the number of mislabeled data points are calculated to measure the performance, and the algorithm with the highest metrics is chosen as the final classifier. Figure 3 shows the histogram analysis of the different evaluation metrics of the first-stage analysis.

Fig. 3. Performance chart for first classification

From this analysis, it can be seen that several models performed well, and different models achieved similar accuracy, specificity, and sensitivity. In this case, it is difficult to declare a single best model, so other metrics should be analyzed.
The AUROC (Area Under the Receiver Operating Characteristics) curve offers a good solution in this case. It is one of the most important evaluation metrics for checking any classification model's performance. The AUC-ROC curve is plotted with TPR on the y-axis against FPR on the x-axis. The dotted navy blue line represents the baseline, set at the value 0.5. The more area a model captures, the better it performs. The ROC curves of the different models are drawn in a single plot to help choose a model. The AUC-ROC curve is drawn below (Fig. 4).

Fig. 4. AUC-ROC Curve



Fig. 5. Performance chart for second analysis

By analyzing the above curve, it can be seen that the most area is covered by the K-Nearest Neighbor model. It can also be seen that only the MLP and KNN models show a linear line, while the other curves are nonlinear. As KNN achieves the highest values of accuracy, specificity, sensitivity, and AUC, the chosen classifier is the K-Nearest Neighbor model. Figure 5 shows the performance for the second analysis.
From this analysis, it can be seen that most of the algorithms performed well in terms of accuracy, specificity, and sensitivity. Among them, K-Nearest Neighbor stood out with 83% accuracy, 84% specificity, and 83% sensitivity, the highest among all models. So, for the second analysis, K-Nearest Neighbor is also the right classifier for orthopedic patients.

6 Conclusion

In this work, several machine learning models are applied to classify orthopedic patients according to their biomechanical features, and their different evaluation metrics are analyzed to choose the best classifier, as there are not yet many applications of machine learning to the classification of orthopedic patients. Six biomechanical features from the preprocessed dataset have been used as learning parameters for training the models. In the first analysis, a complication arose because many classifiers produced the same accuracy, specificity, and sensitivity; in this case, an additional evaluation metric, the AUC-ROC curve, is drawn to determine the most suitable model. As the KNN model captures the most area under the ROC curve, it is chosen as the final classifier for the first analysis. In the second analysis, KNN achieves the highest accuracy, specificity, and sensitivity, which is why it is again chosen as the final classifier. This work is more significant in that it is evaluated with different metrics. This analysis will help specialists find the best classifier in an accurate and fast way. In concluding, we would like to say that with the help of this research, potential orthopedic patients can be classified properly, saving many lives.

References
1. Ghosh, S., Alomari, R., Chaudhary, V., Dhillon, G.: Composite features for automatic
diagnosis of intervertebral disc herniation from lumbar MRI. In: Annual International
Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 5068–
5071, August 2011
2. Alomari, R.S., Corso, J.J., Chaudhary, V., Dhillon, G.: Automatic diagnosis of lumbar disc
herniation using shape and appearance features from MRI. In: Proceedings of SPIE
Conference on Medical Imaging (2010). http://web.eecs.umich.edu/jjcorso/pubs/
alomarispie2010.pdf
3. Alawneh, K., Al-dwiekat, M., Alsmirat, M., Al-ayyoub, M.: Computer-aided diagnosis of
lumbar disc herniation. In: 6th International Conference on Information and Communication
System (ICICS 2013) (2013)
4. Liao, S., et al.: Automatic lumbar spondylolisthesis measurement in CT images. IEEE Trans.
Med. Imag. 35(7), 1658–1669 (2016)
5. Herath, H.M.M.G.T., Kumara, J.R.S.S., Fernando, M.A.R.M., Bandara, K.M.K.S., Serina,
I.: Comparison of supervised machine learning techniques for PD classification in generator
insulation. In: IEEE International Conference on Industrial and Information Systems (ICIIS),
Peradeniya, pp. 1–6 (2017)
6. Hasan, K., Islam, S., Samio, Md.M.R.K., Chakrabarty, A.: A machine learning approach on
classifying orthopedic patients based on their biomechanical features. In: Joint 7th
International Conference on Informatics, Electronics & Vision (ICIEV) and 2nd Interna-
tional Conference on Imaging, Vision & Pattern Recognition (icIVPR) (2018)
7. Matoug, N.A.A., AL-Jebury, H., Akyol, K., Alsarrar, M.R.: Comparison supervised learning
algorithms for spinal-column disease. Int. J. Sci. Res. 8(1), January 2019
8. Rocha Neto, A.R., Barreto, G.A.: On the application of ensembles of classifiers to the
diagnosis of pathologies of the vertebral column: a comparative analysis. IEEE Latin Am.
Trans. 7(4), 487–496 (2009)
9. Thomas, J.J., Pillai, N.: A deep learning framework on generation of image descriptions with
bidirectional recurrent neural networks. In: International Conference on Intelligent
Computing & Optimization. Springer, Cham (2018)
10. Akinsola, J.E.T.: Supervised machine learning algorithms: classification and comparison.
Int. J. Comput. Trends Technol. (IJCTT). 48, 128–138 (2017). https://doi.org/10.14445/
22312803/ijctt-v48p126
11. Rish, I.: An empirical study of the Naïve Bayes classifier. Work Empir. Methods Artif. Intell.
IJCAI (2001)
12. Campos, P.G., Oliveira, E.M.J., Ludermir, T.B., Araujo, A.F.R.: MLP networks for
classification and prediction with rule extraction mechanism. In: IEEE International Joint
Conference on Neural Networks (IEEE Cat. No. 04CH37541), Budapest, 2004, vol. 2,
pp. 1387–1392 (2004)
13. Chen, J., Jiao, L.: Classification mechanism of support vector machines. In: 5th International
Conference on Signal Processing Proceedings, WCC 2000 - ICSP 2000. 16th World
Computer Congress, Beijing, China, 2000, vol. 3, pp. 1556–1559 (2000)
14. Liu, L.: Research on logistic regression algorithm of breast cancer diagnose data by machine
learning. In: International Conference on Robots & Intelligent System (ICRIS), Changsha,
2018, pp. 157–160 (2018)
15. Li, Y., Cheng, B.: An improved k-nearest neighbor algorithm and its application to high
resolution remote sensing image classification. In: 2009 17th International Conference on
Geoinformatics, Fairfax, VA, pp. 1–4 (2009)

16. Tu, P., Chung, J.: A new decision-tree classification algorithm for machine learning. In:
Proceedings Fourth International Conference on Tools with Artificial Intelligence TAI 1992,
Arlington, VA, USA, pp. 370–377 (1992)
17. Patel, S.V., Jokhakar, V.N.: A random forest based machine learning approach for mild steel
defect diagnosis. In: 2016 IEEE International Conference on Computational Intelligence and
Computing Research (ICCIC), Chennai, 2016, pp. 1–8 (2016)
18. Bylinskii, Z., Judd, T., Oliva, A., Torralba, A., Durand, F.: What do different evaluation
metrics tell us about saliency models? IEEE Trans. Pattern Anal. Mach. Intell. 41(3), 740–
757 (2019)
19. Cufoglu, A., Lohi, M., Madani, K.: A comparative study of selected classification accuracy
in user profiling. In: 2008 Seventh International Conference on Machine Learning and
Applications, San Diego, CA, pp. 787–791 (2008)
20. Marom, N.D., Rokach, L., Shmilovici, A.: Using the confusion matrix for improving
ensemble classifiers. In: IEEE 26-th Convention of Electrical and Electronics Engineers in
Israel, Eliat, 2010, pp. 000555–000559 (2010)
21. Marín, N., Rivas-Gervilla, G., Sánchez, D., Yager, R.R.: Specificity measures and referential
success. IEEE Trans. Fuzzy Syst. 26(2), 859–868 (2018)
22. Bong, C.W., Xian, P.Y., Thomas, J.: Face Recognition and detection using haars features
with template matching algorithm. In: International Conference on Intelligent Computing &
Optimization. Springer, Cham (2019)
23. Keedwell, E.: An analysis of the area under the ROC curve and its use as a metric for
comparing clinical scorecards. In: 2014 IEEE International Conference on Bioinformatics
and Biomedicine (BIBM), Belfast, pp. 24–29 (2014)
24. UCI Machine Learning Repository: Vertebral Column Data Set, prepared and built by Dr. Henrique da Mota. https://archive.ics.uci.edu/ml/datasets/Vertebral Column
25. Berthonnaud, E., Dimnet, J., Roussouly, P., Labelle, H.: Analysis of the sagittal balance of
the spine and pelvis using shape and orientation parameters. J. Spinal Disorder. Techn.
(2005)
Long Short-Term Memory Networks
for Driver Drowsiness and Stress Prediction

Kwok Tai Chui1(&), Mingbo Zhao2, and Brij B. Gupta3


1
School of Science and Technology, The Open University of Hong Kong,
Ho Man Tin, Kowloon, Hong Kong SAR, China
jktchui@ouhk.edu.hk
2
School of Information Science and Technology, Donghua University,
Changning District, Shanghai, China
mzhao4@dhu.edu.cn
3
Department of Computer Engineering, National Institute of Technology
Kurukshetra, Kurukshetra 136119, Haryana, India
bbgupta@nitkkr.ac.in

Abstract. Road safety is crucial to preventing traffic deaths and injuries of drivers, passengers, and pedestrians. Various regulations and policies have been proposed with the aim of reducing the number of traffic deaths and injuries; however, these figures have remained steady over the recent decade. There has been an increasing number of research works on the prediction of driver status, which gives warning before an undesired status, for instance drowsiness or stress, occurs. In this paper, a long short-term memory (LSTM) network is proposed as a generic design for driver drowsiness prediction and driver stress prediction models using electrocardiogram (ECG) signals. The proposed model achieves sensitivity, specificity, and accuracy of 71.0–81.1%, 72.9–81.9%, and 72.2–81.5%, respectively, for driver drowsiness prediction, and 68.2–79.3%, 71.6–80.2%, and 70.8–79.7%, respectively, for driver stress prediction. The results demonstrate the feasibility of a generic model for both drowsiness and stress prediction. Future research directions are shared to enhance the model performance.

Keywords: Artificial intelligence · At-risk driving · Driver drowsiness · Driver stress · Intelligent transportation · Long short-term memory

1 Introduction

The world faces a major issue: road traffic accidents lead to 1.35 million deaths and 50 million injuries every year [1]. Road traffic injury ranks 8th among the global leading causes of death and, most severely, ranks 1st for young people (5–29 years old). The deaths and injuries result in a 3–5% loss in gross national product. The United Nations has defined 17 sustainable development goals and 169 targets, in which the global number of traffic deaths and injuries should be reduced by at least 50% by 2020 [2]. However, the total number of traffic deaths has increased, and the rate of traffic deaths has remained nearly constant (about 18 deaths per 100,000 population). With the failure to meet the target, it is expected that the United Nations will postpone the completion date, likely before 2030, to meet the overall vision.

Various measures have been implemented for road traffic safety. For instance, governments have introduced mandatory inspections of vehicles [3], education [4], stronger legislation [5], speed limits [6], and highway rest areas [7]. Nevertheless, these measures have had limited benefits in lowering the number of traffic accidents. Monitoring and detecting the physical status of drivers has become an emergent research trend to warn drivers of undesired status. A large-scale survey collected questionnaires from 60 million drivers and concluded that 54% and 28% of respondents had experienced drowsiness and falling asleep at the wheel, respectively [8]. Aggressive driving behaviors, resulting from stress, are common (90%) in traffic [9]. In this paper, attention is drawn to the generic design of a prediction algorithm for driver drowsiness and stress. In contrast, some research studies discussed driver drowsiness [10, 11] and stress detection [12, 13], in which the current status of drivers is examined. In reality, this may not help prevent traffic accidents, as the average driver reaction time may vary from 0.5 s [14] to 1.5 s [15] as reported in different studies. Therefore, to allow sufficient time for drivers' response, it is necessary to reformulate driver drowsiness and stress from detection to prediction.

1.1 Literature Review


In literature, researchers have presented different machine learning approaches for
stand-alone driver drowsiness prediction and stand-alone driver stress prediction. To
the best of our understanding, previous works separate the considerations of driver
drowsiness and stress prediction. Therefore, this paper has proposed a generic model
that servers both driver drowsiness and stress prediction. This section is divided into
two parts which driver drowsiness prediction will firstly be discussed. This is followed
by existing methods for driver stress prediction.
The existing works in driver drowsiness prediction is summarized as follows. In
[16], researchers considered the sub-problem of drowsy driving, yawning detection.
Convolutional neural network (CNN) has been applied to extract spatial information,
which served as key features for bi-directional long short-term memory (LSTM) net-
work. The model achieved accuracy of 75% for 3–5 s advance prediction. Another
work proposed an ensemble learning method by joining k-nearest neighbor (kNN),
support vector machine (SVM), and logistic regression (LR) for 10 s advance pre-
diction [17]. The reported accuracy was 87.9%. Fisher’s linear discriminant analysis
was employed for 5 s advance prediction [18]. The average accuracy was 79.2% (with
up to 40% deviation for different candidates). Artificial neural network (ANN) [19] has
been employed for 1 min advance prediction. Results showed that the model yielded
root-mean-square error of 6 min.
When it comes to driver stress prediction, various approaches have been proposed. Five typical machine learning algorithms, kNN, SVM, ANN, random forest (RF), and decision tree (DT), were evaluated on 2 s advance prediction [20]. The synthetic minority oversampling technique (SMOTE) was adopted to reduce the effect of class imbalance, as only 4% of the samples belonged to stress events. RF achieved the best performance among the five algorithms, with accuracy, specificity, and sensitivity of 98.92%, 98.46%, and 99.36%, respectively. In [21], a Naive Bayes (NB) classifier was applied for 30 s advance prediction, achieving an average accuracy of 78.3% (with up to 37% deviation between candidates). A 6 min advance prediction was achieved by LR [22], with a sensitivity of 60.9% and a specificity of 86.7%. Another study adopted a deep-belief network for 1 min advance prediction [23]; the average sensitivity and specificity were 82.3% and 83.6%, respectively, although the performance deviated by up to 25% and 38%, respectively.
The key limitations of the existing works [16–23] are summarized as follows: (i) dependence on various sensors [17–22]; (ii) use of simulated datasets [17–19, 21, 23]; (iii) adoption of only a single split validation [16, 18]; and (iv) limited analysis of the performance of the prediction model with varying times of advance prediction.

1.2 Motivation and Contributions


To address the aforesaid limitations of existing works, in this paper we (i) adopt only the electrocardiogram (ECG) signal as the data source; (ii) evaluate the prediction model using real-world datasets; (iii) adopt 5-fold cross-validation; and (iv) analyze the performance of the prediction model for different times of advance prediction.
In this paper, an LSTM network is proposed for driver drowsiness and stress prediction. The contributions of the paper are summarized as follows: (i) to emphasize, it is a feasibility study of a generic approach to predicting driver drowsiness and stress, which helps reduce the effort of conducting research in two separate directions; and (ii) it reduces the reliance on multiple data sources, i.e., inputs, while maintaining the performance of the prediction models.

2 Proposed LSTM Network for Driver Drowsiness and Stress Prediction

This section is organized as follows. First, the driver drowsiness and stress databases are presented. This is followed by data pre-processing and ECG beat segmentation. Finally, the proposed EMD based LSTM network for driver drowsiness and stress prediction is discussed.

2.1 Driver Drowsiness and Stress Databases


Two open-access datasets, namely the Cyclic Alternating Pattern Sleep (CAPS) Database [24, 25] and the Stress Recognition in Automobile Drivers (SRAD) Database [25, 26], were retrieved. The CAPS database is composed of 108 records. The sleep patterns are governed by six sleep stages: (i) awake stage; (ii) stage 1; (iii) stage 2; (iv) stage 3; (v) stage 4; and (vi) rapid eye movement (REM) stage. The drowsy stages are linked with sleep stage 1 and sleep stage 2, since these are the immediate stages after the awake stage. The SRAD database contains 18 records of real-world driving in the city and on the highway. The ECG signal is chosen as the data source of the prediction model.

2.2 Data Pre-processing and ECG Beat Segmentation


Each of the records in the CAPS and SRAD databases requires further processing to produce individual ECG samples. The traditional Tompkins algorithm [27, 28] has been employed for ECG beat segmentation. As this is a widely adopted approach and not the major contribution of this paper, only the key steps are summarized: (i) eliminate offset; (ii) apply a low-pass filter; (iii) apply a high-pass filter; (iv) apply a derivative filter; (v) square the signal; (vi) configure a sliding window; (vii) set a threshold and locate the Q, R, and S waves; and (viii) construct individual ECG samples by joining the portion of the signal between two consecutive R waves. A simplified sketch of this pipeline is given below.
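The following simplified sketch (SciPy assumed; not a faithful Tompkins implementation) illustrates the filter-square-integrate-threshold idea on a crude synthetic signal:

```python
# Band-pass filter, differentiate, square, moving-window integrate,
# then locate R peaks; beats are the spans between consecutive peaks.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def r_peaks(ecg, fs):
    b, a = butter(2, [5 / (fs / 2), 15 / (fs / 2)], btype="band")
    x = filtfilt(b, a, ecg)                                  # band-pass
    x = np.square(np.diff(x))                                # derivative + square
    w = int(0.15 * fs)
    x = np.convolve(x, np.ones(w) / w, mode="same")          # moving-window integration
    peaks, _ = find_peaks(x, height=0.4 * x.max(), distance=int(0.3 * fs))
    return peaks

fs = 250                                        # assumed sampling rate
t = np.arange(0, 10, 1 / fs)
ecg = np.abs(np.sin(2 * np.pi * 1.2 * t)) ** 15  # crude ECG-like test signal
beats = np.split(np.arange(len(ecg)), r_peaks(ecg, fs))[1:-1]  # beat segments
print(len(beats))
```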
Table 1 summarizes the classes and sample sizes (numbers of individual ECG samples) in the CAPS and SRAD databases.

Table 1. Classes and sample sizes in CAPS and SRAD databases.


Database Class Sample size
CAPS database Class 2: Drowsy stage 2 20000
Class 1: Drowsy stage 1 35300
Class 0: Awake 76200
SRAD database Class 2: High stress level 11900
Class 1: Medium stress level 45000
Class 0: Low stress level 19300

2.3 LSTM Network Based Driver Drowsiness and Stress Prediction


We aim to carry out a feasibility study of a generic design for the formulations of driver drowsiness prediction and driver stress prediction. Figure 1 shows the workflow of the proposed approach.
After obtaining the individual ECG samples, empirical mode decomposition (EMD) is applied to decompose each ECG sample into the sum of intrinsic mode functions (IMFs) and a residue. All local extrema are identified: the maxima are connected to form the upper envelope E_upper, whereas the minima are connected to form the lower envelope E_lower. The mean of the envelopes Ē is calculated as follows:

Ē(t) = 0.5 (E_upper(t) + E_lower(t))    (1)

The first pseudo-IMF p_1(t) is deduced by subtracting Ē(t) from the individual ECG sample x(t):

p_1(t) = x(t) − Ē(t)    (2)

Fig. 1. Workflow of the proposed LSTM network.

Equations (1) and (2) are repeated until the standard deviation σ is less than the threshold value of 0.05 [29]. The iterations can be expressed as:

p_ij(t) = p_i(j−1)(t) − Ē_ij(t)    (3)

where i is the index of the IMF and j is the iteration index. The real IMFs IMF_i(t) and residues r_i(t) are governed by:

r_i(t) = x(t) − IMF_i(t)    (4)

EMD completes when the final residue r_N(t) (N being the total number of IMFs) satisfies any of the following conditions: (i) it is a constant; (ii) it is a function with one extremum; or (iii) it is a monotonic function. The individual ECG sample is then the sum of all IMFs and r_N(t):

x(t) = Σ_{i=1}^{N} IMF_i(t) + r_N(t)    (5)

Features are extracted as the Pearson correlation coefficients between the original signal and each IMF_i(t) and r_N(t).
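A hedged sketch of this feature-extraction step; the PyEMD package is an assumed implementation choice, as the paper does not name one:

```python
# Decompose a signal with EMD and keep the Pearson correlation of
# each extracted component with the original signal as features.
import numpy as np
from PyEMD import EMD  # assumed implementation; pip package "EMD-signal"

x = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 500)) + 0.3 * np.random.randn(500)

imfs = EMD()(x)  # rows: extracted IMFs (the last row acts as the trend/residue)
features = [np.corrcoef(x, imf)[0, 1] for imf in imfs]  # components of Eq. (5)
print(np.round(features, 3))
```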
It is worth mentioning that an LSTM network has an advantage in addressing time series data, attributable to its ability to map between input and output sequences using contextual information. A common LSTM cell comprises a forget gate, an input gate, and an output gate. The workflow of these gates is summarized as follows.
Forget Gate. The forget gate f_t determines whether information is kept in the cell state. It takes the latest input x_t and the previous output of the memory block h_{t−1}. The activation function σ_a, chosen to be the logistic sigmoid as is common practice, determines the extent of the information retained from the previous cell state. The output ranges from 0 (completely forget) to 1 (completely retain).

f_t = σ_a(W_f · [h_{t−1}, x_t] + b_f)    (6)

where W_f is the weight matrix and b_f is the bias vector.


Input Gate. The information to be stored in the cell is controlled by the input gate, which also determines whether an update is needed.

i_t = σ_a(W_i · [h_{t−1}, x_t] + b_i)    (7)

where W_i is the weight matrix and b_i is the bias vector.


Output Gate. The output of the memory cell is regulated by the output gate. It depends on the filtered cell state; a sigmoid layer and a tanh layer are involved.

o_t = σ_a(W_o · [h_{t−1}, x_t] + b_o)    (8)

where W_o is the weight matrix and b_o is the bias vector. The state of the old memory cell C_{t−1} is updated to the new memory cell C_t:

C̃_t = tanh(W_c · [h_{t−1}, x_t] + b_c)    (9)

C_t = f_t ⊙ C_{t−1} + i_t ⊙ C̃_t    (10)

h_t = o_t ⊙ tanh(C_t)    (11)

where W_c is the weight matrix and b_c is the bias vector.
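Putting the gates together, a minimal Keras sketch (hyperparameters and input shapes are assumptions, not the authors' configuration) of an LSTM classifier with a 3-class softmax output matching the labels of Table 1:

```python
# An LSTM layer implementing the gate equations (6)-(11) internally,
# followed by a 3-class softmax output.
import numpy as np
import tensorflow as tf

X = np.random.rand(1000, 20, 8).astype("float32")  # (samples, timesteps, features)
y = np.random.randint(0, 3, size=1000)             # classes 0/1/2

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(20, 8)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```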

3 Analysis and Results

The effectiveness of the proposed LSTM network is evaluated based on (i) the performance of the prediction models for driver drowsiness and stress prediction under varying times of advance prediction; (ii) a performance comparison between the proposed work and existing works; and (iii) future research directions.

3.1 Performance Evaluation Under Varying Times of Advance Prediction
There is no common setting for the time of advance prediction; existing works [16–23] have considered different scenarios (2 s, 3–5 s, 5 s, 10 s, 30 s, and 1 min). To provide the latest driver status and sufficient time for the driver's response, the analysis of the proposed LSTM network is carried out for 1–5 s advance prediction. 5-fold cross-validation, as is common practice [30, 31], has been adopted for performance evaluation. The sensitivity, specificity, and accuracy of the LSTM network are obtained by averaging the 5 sets of results of the 5-fold cross-validation.
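A minimal sketch of this evaluation protocol (scikit-learn assumed; a logistic regression stands in for the proposed network, and binary labels are used for simplicity):

```python
# Average sensitivity, specificity, and accuracy over 5 folds.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import confusion_matrix
from sklearn.linear_model import LogisticRegression  # stand-in model

X = np.random.rand(500, 10)
y = np.random.randint(0, 2, size=500)

scores = []
for tr, te in StratifiedKFold(n_splits=5, shuffle=True, random_state=1).split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
    tn, fp, fn, tp = confusion_matrix(y[te], clf.predict(X[te])).ravel()
    scores.append((tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(te)))

sen, spe, acc = np.mean(scores, axis=0)
print(f"SEN {sen:.3f}  SPE {spe:.3f}  ACC {acc:.3f}")
```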
Table 2 summarizes the results for 1–5 s advance prediction. For both driver drowsiness prediction and driver stress prediction, the proposed EMD based LSTM network achieves its highest sensitivity, specificity, and accuracy at 1 s, and the performance gradually decreases as the time of advance prediction increases. This phenomenon could be explained by the larger amount of unseen information with earlier advance prediction; there is a trade-off between the time of advance prediction and performance. In this paper, the analysis was not extended to long periods of advance prediction (for example 10 s, 30 s, and 1 min in the literature) because the expected performance of the prediction model is much lower than for short periods. Short periods of advance prediction could still help prevent severe stages of drowsiness and stress levels and thus avoid accidents. The performance drop may also relate to the nature of the design of the features as well as the prediction model; further investigation is sought.

Table 2. Performance evaluation of proposed LSTM network under 1–5 s advance prediction.
Driver prediction Advance prediction Results
Sensitivity Specificity Accuracy
Drowsiness 1s 81.1% 81.9% 81.5%
2s 78.8% 80.3% 79.8%
3s 76.4% 78.1% 77.5%
4s 73.6% 74.4% 74.0%
5s 71.0% 72.9% 72.2%
Stress 1s 79.3% 80.2% 79.7%
2s 75.8% 77.2% 76.6%
3s 73.4% 74.4% 73.9%
4s 70.5% 71.6% 71.2%
5s 68.2% 72.3% 70.8%

3.2 Performance Comparison Between Proposed Work and Existing Works
The proposed work is compared with the existing works [16–23] discussed in Subsect. 1.1. Table 3 summarizes the type of driver status prediction, datasets, input signals, methodology, advance prediction, cross-validation, and results of the proposed work and the existing works. The comparisons are organized by the column headings.
Driver Status. To the best of our knowledge, this paper is the first work to consider a generic approach for driver drowsiness and stress prediction.
Datasets. Existing works [16–19, 21] built and evaluated models with simulated datasets, which may not be practical, as the real-world environment is an at-risk environment. The studies [20, 22] adopted real-world datasets with 14 and 1 candidates, respectively, which are small sample sizes compared with the 126 candidates in our work.
Input Signals. The works [17–23] rely on multiple input signals and are therefore more dependent on signal quality. A research study [32] suggested that the measurement stability of image and EEG signals cannot be maintained at an excellent level in a real-world environment (59% for images and 85% for EEG signals). Also, multiple data sources may introduce difficulty in handling data heterogeneity, especially if the sampling rates do not align with each other.

Table 3. Comparisons between proposed work and existing works for driver status prediction.

Work | Driver status | Dataset | Input signal | Methodology | Advance prediction | Cross-validation | Results
[16] | Drowsiness | Simulation: 18 candidates | Image | CNN, LSTM | 3–5 s | No | ACC: 75%
[17] | Drowsiness | Simulation: 16 candidates | EEG; HRV; Image | kNN, SVM, LR | 10 s | 5-fold | ACC: 87.9%
[18] | Drowsiness | Simulation: 11 candidates | EEG; EOG; HRV; Image | FLDA | 5 s | No | ACC: 79.2% (deviation up to 40%)
[19] | Drowsiness | Simulation: 21 candidates | HRV; Image; lateral and steering information | ANN | 1 min | 5-fold | RMSE: 6 min
[20] | Stress | Real world: 14 candidates | Demographic data; EDA; EMG; RESP; time; weather | kNN, SVM, ANN, RF, DT | 2 s | 10-fold | ACC: 98.92%; SEN: 99.36%; SPE: 98.46%
[21] | Stress | Simulation: 5 candidates | HRV; weather | NB | 30 s | 10-fold | ACC: 78.3% (deviation up to 37%)
[22] | Stress | Real world: 1 candidate | Accelerometer; EDA; PPG | LR | Not specified | 10-fold | SEN: 60.9%; SPE: 86.7%
[23] | Stress | Simulation: 10 candidates | HRV; intensity of turning; speed | DBN | 1 min | 10-fold | SEN: 82.3%; SPE: 83.6% (deviation 25–38%)
Proposed | Drowsiness and stress | Real world: 126 candidates using CAPS and SRAD databases | ECG | EMD; LSTM | 1–5 s | 5-fold | Drowsiness: SEN 71.0–81.1%, SPE 72.9–81.9%, ACC 72.2–81.5%; Stress: SEN 68.2–79.3%, SPE 71.6–80.2%, ACC 70.8–79.7%

Accuracy (ACC); artificial neural network (ANN); convolutional neural network (CNN); deep-belief network (DBN); decision tree (DT); electrocardiogram (ECG); electrodermal activity (EDA); electroencephalogram (EEG); empirical mode decomposition (EMD); electrooculography (EOG); Fisher's linear discriminant analysis (FLDA); heart rate variability (HRV); k-nearest neighbor (kNN); logistic regression (LR); long short-term memory (LSTM); Naive Bayes (NB); photoplethysmogram (PPG); respiration (RESP); random forest (RF); root-mean-square error (RMSE); sensitivity (SEN); specificity (SPE); support vector machine (SVM).

Methodology. Various approaches have been proposed in the literature as feasibility studies. LSTM was adopted in [16], where a CNN was applied to extract features from images. Differently from [16], we extract features as the Pearson correlation coefficients of the IMFs and residue of the ECG signal.
Advance Prediction. There is no common setting of the time of advance prediction. The criteria are sufficient time for the driver's response and the trade-off between time and model performance.
Cross-validation. Cross-validation was not adopted in [16, 18], which may bias the evaluation of the model's practical performance (its ability to manage unseen data). Other research works followed 5-fold [17, 19] or 10-fold [20–23] cross-validation, which are common values of k in k-fold cross-validation.
Results. Regarding the results, as a preliminary analysis, an indirect comparison is made. The performance of the proposed work is comparable to or outperforms the existing works [16, 18, 19, 21–23]. There is room for improvement compared with [17, 20]; however, confirming this requires future work repeating the methodologies of [17, 20] on the CAPS and SRAD databases for a fair comparison.

3.3 Future Research Directions


The proposed work confirms the feasibility of a generic approach for driver drowsiness and stress prediction using a single type of input signal. Further research will be conducted to enhance the performance of the prediction model. The possibilities include (but are not limited to): (i) fine-tuning the algorithm for performance enhancement; (ii) in addition to the parametric Pearson correlation for feature extraction, analyzing the non-parametric Spearman and Kendall correlations; (iii) investigating the possibility of reducing the number of IMFs, and thus the computational cost, while maintaining model performance; and (iv) introducing ensemble learning to combine the prediction algorithms with other approaches, such as a Markov regime-switching jump-diffusion model with delay [33].

4 Conclusion

In this paper, an EMD based LSTM network has been proposed to serve as a generic model for driver drowsiness and stress prediction. As a preliminary study, we choose the Pearson correlation coefficient in the feature extraction process and vary the time of advance prediction from 1 to 5 s. The proposed model achieves sensitivity, specificity, and accuracy of 71.0–81.1%, 72.9–81.9%, and 72.2–81.5%, respectively, for driver drowsiness prediction. For driver stress prediction, the sensitivity, specificity, and accuracy are 68.2–79.3%, 71.6–80.2%, and 70.8–79.7%, respectively. This demonstrates the feasibility of a generic model for dual applications. Four future research directions have been suggested to enhance the performance of the prediction model. In the near future, it is foreseeable that there will be many more studies on intelligent applications using intelligent computing and optimization techniques [34, 35].

Acknowledgments. The work described in this paper was fully supported by the Open University
of Hong Kong Research Grant (No. 2019/1.7).

References
1. Global Status Report on Road Safety 2018, World Health Organization. https://www.who.
int/publications/i/item/global-status-report-on-road-safety-2018
2. Transforming Our World: The 2030 Agenda for Sustainable Development, United Nations.
http://sustainabledevelopment.un.org
3. Das, S., Geedipally, S.R., Dixon, K., Sun, X., Ma, C.: Measuring the effectiveness of vehicle
inspection regulations in different states of the US. Transp. Res. Rec. 2673, 208–219 (2019)
4. Alonso, F., Esteban, C., Useche, S., Colomer, N.: Effect of road safety education on road
risky behaviors of Spanish children and adolescents: findings from a national study. Int.
J. Environ. Res. Public Health 15, 2828 (2018)
5. Castillo-Manzano, J.I., Castro-Nuño, M., López-Valpuesta, L., Pedregal, D.J.: From
legislation to compliance: the power of traffic law enforcement for the case study of Spain.
Transp. Policy 75, 1–9 (2019)
6. Silvano, A.P., Koutsopoulos, H.N., Farah, H.: Free flow speed estimation: a probabilistic,
latent approach. Impact of speed limit changes and road characteristics. Transport. Res. A-
Pol. 138, 283–298 (2020)
7. Choi, J., Lee, K., Kim, H., An, S., Nam, D.: Classification of inter-urban highway drivers’
resting behavior for advanced driver-assistance system technologies using vehicle trajectory
data from car navigation systems. Sustainability 12, 5936 (2020)
8. Royal, D., Street, F., Suite, N.W.: National Survey of Distracted and Drowsy Driving
Attitudes and Behavior. Technical report, National Highway Traffic Safety Administration
(2002)
9. Pfeiffer, J.L., Pueschel, K., Seifert, D.: Interpersonal violence in road rage. Cases from the
medico-legal center for victims of violence in Hamburg. J. Forens. Leg. Med. 39, 42–45
(2016)
10. Dua, M., Singla, R., Raj, S., Jangra, A.: Deep CNN models-based ensemble approach to
driver drowsiness detection. Neural Comput. Appl. 32, 1–14 (2020)
11. Zhang, X., Wang, X., Yang, X., Xu, C., Zhu, X., Wei, J.: Driver drowsiness detection using
mixed-effect ordered logit model considering time cumulative effect. Anal. Meth. Accid.
Res. 26, 100114 (2020)
12. Chung, W.Y., Chong, T.W., Lee, B.G.: Methods to detect and reduce driver stress: a review.
Int. J. Automot. Technol. 20, 1051–1063 (2019)
13. Kyriakou, K., Resch, B., Sagl, G., Petutschnig, A., Werner, C., Niederseer, D., Liedlgruber,
M., Wilhelm, F.H., Osborne, T., Pykett, J.: Detecting moments of stress from measurements
of wearable physiological sensors. Sensors 19, 3805 (2019)
14. Dickerson, A.E., Reistetter, T.A., Burhans, S., Apple, K.: Typical brake reaction times
across the life span. Occup. Ther. Health Care 30, 115–123 (2016)
15. Arbabzadeh, N., Jafari, M., Jalayer, M., Jiang, S., Kharbeche, M.: A hybrid approach for
identifying factors affecting driver reaction time using naturalistic driving data. Transp. Res.
Part C Emerg. Technol. 100, 107–124 (2019)
16. Saurav, S., Mathur, S., Sang, I., Prasad, S.S., Singh, S.: Yawn detection for driver’s
drowsiness prediction using bi-directional LSTM with CNN features. In: International
Conference on Intelligent Human Computer Interaction, pp. 189–200. Springer, Cham
(2019)

17. Gwak, J., Hirao, A., Shino, M.: An investigation of early detection of driver drowsiness
using ensemble machine learning based on hybrid sensing. Appl. Sci. 10, 2890 (2020)
18. Nguyen, T., Ahn, S., Jang, H., Jun, S.C., Kim, J.G.: Utilization of a combined EEG/NIRS
system to predict driver drowsiness. Sci. Rep. 7, 43933 (2017)
19. de Naurois, C.J., Bourdin, C., Bougard, C., Vercher, J.L.: Adapting artificial neural networks
to a specific driver enhances detection and prediction of drowsiness. Accid. Anal. Prev. 121,
118–128 (2018)
20. Hadi, W.E., El-Khalili, N., AlNashashibi, M., Issa, G., AlBanna, A.A.: Application of data
mining algorithms for improving stress prediction of automobile drivers: a case study in
Jordan. Comput. Biol. Med. 114, 103474 (2019)
21. Alharthi, R., Alharthi, R., Guthier, B., El Saddik, A.: CASP: context-aware stress prediction
system. Multimed. Tools Appl. 78, 9011–9031 (2019)
22. Bitkina, O.V., Kim, J., Park, J., Park, J., Kim, H.K.: Identifying traffic context using driving
stress: a longitudinal preliminary case study. Sensors 19, 2152 (2019)
23. Magana, V.C., Munoz-Organero, M.: Toward safer highways: predicting driver stress in
varying conditions on habitual routes. IEEE Veh. Technol. Mag. 12, 69–76 (2017)
24. Terzano, M.G., Parrino, L., Sherieri, A., Chervin, R., Chokroverty, S., Guilleminault, C.,
Hirshkowitz, M., Mahowald, M., Moldofsky, H., Rosa, A., et al.: Atlas, rules, and recording
techniques for the scoring of cyclic alternating pattern (CAP) in human sleep. Sleep Med. 2,
537–553 (2001)
25. Goldberger, A.L., Amaral, L.A.N., Glass, L., Hausdorff, J.M., Ivanov, P.C.H., Mark, R.G.,
Mietus, J.E., Moody, G.B., Peng, C.K., Stanley, H.E.: PhysioBank, PhysioToolkit, and
PhysioNet: components of a new research resource for complex physiologic signals.
Circulation 101, e215–e220 (2000)
26. Healey, J.A., Picard, R.W.: Detecting stress during real-world driving tasks using
physiological sensors. IEEE Trans. Intell. Transp. 6, 156–166 (2005)
27. Tompkins, W.J.: Biomedical Digital Signal Processing C-Language Examples and
Laboratory Experiments for the IBM®PC. pp. 236–264. Prentice Hall, Upper Saddle River
(2000)
28. Kohler, B.U., Hennig, C., Orglmeister, R.: The principles of software QRS detection. IEEE
Eng. Med. Biol. 21, 42–57 (2002)
29. Azbari, P.G., Abdolghaffar, M., Mohaqeqi, S., Pooyan, M., Ahmadian, A., Gashti, N.G.: A
novel approach to the extraction of fetal electrocardiogram based on empirical mode
decomposition and correlation analysis. Aust. Phys. Eng. Sci. Med. 40, 565–574 (2017)
30. Chui, K.T., Fung, D.C.L., Lytras, M.D., Lam, T.M.: Predicting at-risk university students in
a virtual learning environment via a machine learning algorithm. Comput. Hum. Behav. 107,
105584 (2020)
31. Wong, T.T., Yeh, P.Y.: Reliable accuracy estimates from k-fold cross validation. IEEE
Trans. Knowl. Data Eng. 32, 1586–1594 (2020)
32. Sun, Y., Yu, X.: An innovative nonintrusive driver assistance system for vital signal
monitoring. IEEE J. Biomed. Health Inform. 18, 1932–1939 (2014)
33. Savku, E., Weber, G.W.: A stochastic maximum principle for a Markov regime-switching
jump-diffusion model with delay and an application to finance. J. Optim. Theory Appl. 179,
696–721 (2018)
34. Vasant, P., Zelinka, I., Weber, G.W. (eds.): Intelligent Computing & Optimization, vol. 866.
Springer, Cham (2018)
35. Vasant, P., Zelinka, I., Weber, G.W. (eds.): Intelligent computing and optimization. In:
Proceedings of the 2nd International Conference on Intelligent Computing and Optimization
2019. Springer, Cham (2019)
Optimal Generation Mix of Hybrid
Renewable Energy System Employing
Hybrid Optimization Algorithm

Md. Arif Hossain1, Saad Mohammad Abdullah1, Ashik Ahmed1, Quazi Nafees Ul Islam1, and S. R. Tito2

1 Department of Electrical and Electronic Engineering, Islamic University of Technology, Gazipur 1704, Bangladesh
arifhossain@iut-dhaka.edu
2 Department of Electrical Engineering, School of Professional Engineering, Faculty of Engineering and Trades, Manukau Institute of Technology, Auckland, New Zealand

Abstract. Among the renewable sources of energy, wind and photovoltaic based energy conversion processes are capturing recent interest. As the input to these two kinds of energy conversion processes is highly unpredictable, the incorporation of an energy storage device becomes imperative for an uninterruptible power supply. However, considering hybrid renewable power generation for fulfilling load demand, arbitrary mixing among participating generating units could result in non-profitable outcomes for power supplying entities. Hence, in this work, optimal sizing of a Wind-Photovoltaic-Battery system has been suggested using a hybrid optimization method integrating Ant Colony Optimization extended to continuous domains (ACOR) and Genetic Algorithm (GA), forming ACOR-GA. The ACOR-GA is compared against other algorithms like Ant Colony Optimization (ACO), GA, Particle Swarm Optimization (PSO), and Grasshopper Optimization Algorithm (GOA) over 10 independent runs. The analysis shows that the proposed hybrid algorithm performs better in terms of convergence speed, obtaining the global minimum, and rendering a more reliable solution.

Keywords: Hybrid Renewable Energy System (HRES) · Hybrid optimization · Ant colony optimization extended to continuous domains (ACOR) · Genetic algorithm (GA)

1 Introduction
In recent times, the energy policymakers have shown a vivid inclination towards
replenishable sources of energy like Hybrid Renewable Energy Sources (HRES)
to meet escalating energy demand and lessen the impact of consumption of non-
renewable energy sources [1]. HRES allows the hybridization of renewable and
conventional energy sources and facilitates obtaining higher efficiency than a
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 681–692, 2021.
https://doi.org/10.1007/978-3-030-68154-8_59

single power source [2]. Among the numerous functional HRES technologies, the proven, environment-friendly, and economically viable ones are combinations of Photo-Voltaic (PV) cells, Wind Turbines (WT), and Battery Systems (BS) [3].

1.1 Background

The cost-effectiveness of an HRES is mostly dependent upon intelligent mixing and sizing among the different participating sources. Researchers are opting for numerous optimization techniques for sizing an HRES to achieve a cost-effective solution to the problem [4]. The prominent optimization
achieve cost-effective solution to the problem [4]. The prominent optimization
techniques are the swarm intelligence based algorithms which mostly imitate the
biological behavior of the creatures among which Particle Swarm Optimization
(PSO), Ant Colony Optimization (ACO), Grey Wolf Optimization (GWO), etc.
have obtained noteworthy recognition. Apart from the swarm intelligence based
algorithms, evolutionary algorithms like the Genetic Algorithm (GA) are also worth
mentioning. For an HRES, the objective function is designed in a way to obtain a
cost-effective combination of the considered modules. GA has been applied in [5]
where the variation of weather was taken into consideration to maintain a zero
Loss of Power Supply Probability (LPSP). In [2] PSO has been used to minimize
the Levelized Cost of Energy (LCE). ACO is another popular nature inspired
algorithm that has been employed in [6] for sizing and performance analysis of a
standalone hybrid energy system (HES). In recent times, the Grasshopper Optimiza-
tion Algorithm (GOA) has also been used in HRES optimization as such in [7],
GOA has been used to extract maximum power from wind energy system with
variable speed. Apart from these works, recently a lot of works on optimization
have also been carried out in [8] and [9].
All these works in the literature review motivated us to formulate a hybrid
optimization algorithm to be employed in the field of HRES. The literature
suggests that the most promising combination among the components of HRES
is PV module and wind turbine where battery is kept as the energy storage
device. Thus, in this study, we consider the above-mentioned combination for
the HRES. The main objective of this study is to improve the reliability of
the supplied power from the HRES model by ensuring zero LPSP. This is a
challenging task to accomplish as the associated installation and maintenance
cost of different modules is likely to increase while trying to achieve zero LPSP.
Thus, in order to find an optimum combination among the modules of the HRES;
Ant Colony Optimization extended to continuous domains (ACOR) is hybridized
with GA to form ACOR -GA technique. This hybridization is made to enable
ACOR in achieving a quicker convergence in terms of iteration number. The
ACOR -GA is then applied to minimize the overall system cost. A comparison
between the existing ACO, GOA, GA, and PSO techniques and proposed hybrid
ACOR -GA is then carried out to justify the performance of ACOR -GA.

2 Methodology
This section briefly describes the steps that were used to reach the optimum
result. In this study, the optimum result is defined as the minimum attainable
setup, maintenance and operational cost of the chosen components of the HRES
for a period of 20 years while maintaining an LPSP of zero throughout this
duration. All the calculations made in this study are on an hourly basis.

2.1 System Model of HRES

Fig. 1. A PV-WT-BS hybrid renewable energy system.

The HRES considered in this study consists of three main components, two of
which are renewable energy sources (Photovoltaic and wind energy), and the
third one is an energy storage unit (Battery) forming a PV-WT-battery based
HRES. These system components are connected to a 24V DC bus via power
electronic converters, as shown in Fig. 1 [10]. For this work, the typical hourly
summer domestic demand of Auckland, New Zealand has been taken from [11]
as the hourly load demand.

Modeling of Photovoltaic Module. The Photovoltaic (PV) module is one of the renewable energy sources of the HRES considered in this study. Along with solar irradiation, several other factors, specified by the manufacturer, influence the power generation. At a particular time, the output power of a PV module can be obtained from Eq. (1) [12].

$$P_{PV}(t,\beta) = V_{OC}(t,\beta) \cdot I_{SC}(t,\beta) \cdot FF(t)$$
$$V_{OC}(t,\beta) = V_{OC\text{-}STC} - K_V \cdot T_C(t)$$
$$I_{SC}(t,\beta) = \left\{ I_{SC\text{-}STC} + K_I \left[ T_C(t) - 25\,^{\circ}\mathrm{C} \right] \right\} \frac{G(t,\beta)}{100} \tag{1}$$
$$T_C(t) = T_A + \left( NCOT - 20\,^{\circ}\mathrm{C} \right) \frac{G(t,\beta)}{800}$$

where $P_{PV}(t,\beta)$ represents the output of the PV array at the $t$-th hour with a tilt angle of $\beta$. $V_{OC}$ and $I_{SC}$ represent the open circuit voltage and short circuit current of a PV module respectively, $FF$ is the fill factor, $K_V$ and $K_I$ are the open circuit voltage and short circuit current temperature coefficients respectively, $G$ represents the global solar irradiance on a PV module, $T_A$ is the ambient temperature, and $NCOT$ is the nominal cell operating temperature. Finally, the total output power of the PV array is expressed in terms of the following equation.

$$P_{array}(t,\beta) = \eta_{PV} \cdot N_S \cdot N_P \cdot P_{PV}(t,\beta) \tag{2}$$

where $\eta_{PV}$ represents the efficiency of the PV module and its respective converter, and $N_S$ and $N_P$ denote the number of series-connected and parallel-connected PV modules respectively. The specifications of the PV module considered for the HRES of this study are presented in Table 1 [12].
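For concreteness, Eqs. (1)-(2) translate directly into code. The Python sketch below is ours, not the authors' implementation; the fill factor, temperature coefficients K_V and K_I, NCOT, converter efficiency, and module counts are illustrative placeholders rather than values taken from the paper, and G is the irradiance on the tilted plane.

```python
def pv_module_output(G, T_A, Voc_stc=64.8, Isc_stc=6.24, FF=0.75,
                     K_V=0.27, K_I=0.0035, NCOT=45.0):
    """Hourly PV module output per Eq. (1); FF, K_V, K_I, NCOT are placeholders."""
    T_C = T_A + G * (NCOT - 20.0) / 800.0                 # cell temperature
    V_oc = Voc_stc - K_V * T_C                            # open circuit voltage
    I_sc = (Isc_stc + K_I * (T_C - 25.0)) * G / 100.0     # short circuit current
    return V_oc * I_sc * FF

def pv_array_output(P_pv, eta_pv=0.95, N_S=1, N_P=24):
    """Total PV array output per Eq. (2); eta_pv, N_S, N_P are example values."""
    return eta_pv * N_S * N_P * P_pv
```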

Modeling of Wind Turbine Generators. A wind turbine (WT) is used to harness wind energy to generate electrical power. The specific power output $P_w$ (W/m$^2$) of a WT depends on the wind speed $v(t)$ at a particular site, as expressed in the following equation [12].

$$P_w(t) = \begin{cases} 0 & v(t) < v_{ci} \\ a\,v^3(t) - b\,P_r & v_{ci} \le v(t) < v_r \\ P_r & v_r \le v(t) < v_{c0} \\ 0 & v(t) \ge v_{c0} \end{cases} \tag{3}$$

where $P_r$ is the rated power, $a = \frac{P_r}{v_r^3 - v_{ci}^3}$, and $b = \frac{v_{ci}^3}{v_r^3 - v_{ci}^3}$. The cut-in speed, rated speed, and cut-out speed of the WT are denoted by $v_{ci}$, $v_r$, and $v_{c0}$ respectively.
The wind speed $v_h$ at a particular height $h$ can be calculated from Eq. (4) [12].

$$v_h = v \left( \frac{h}{h_r} \right)^{\alpha} \tag{4}$$

where $h_r$ represents the reference height; according to [13], the hourly wind speed at $h_r = 33$ m was considered for this study. The power-law exponent $\alpha$ was chosen to be $1/7$ for open space [14]. The output electrical power generated by a WT can be obtained from the following equation.

$$P_{WG} = P_w A_{WG} \eta_{wG} \tag{5}$$

where $\eta_{wG}$ represents the efficiency of the WT generator and its associated converters, and $A_{WG}$ is the total swept area of the WT. The specifications of the WT generator chosen for this study are presented in Table 2 [12].
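The piecewise curve of Eq. (3), the height correction of Eq. (4), and the output of Eq. (5) can be coded as follows. This is a hedged sketch; the cut-in, rated, and cut-out speeds are assumed values, since the paper does not list them explicitly.

```python
def specific_wind_power(v, v_ci=2.5, v_r=11.0, v_co=24.0, P_r=100.0):
    """Piecewise specific power output of Eq. (3) in W/m^2; speeds assumed."""
    if v < v_ci or v >= v_co:
        return 0.0
    a = P_r / (v_r**3 - v_ci**3)
    b = v_ci**3 / (v_r**3 - v_ci**3)
    return a * v**3 - b * P_r if v < v_r else P_r

def speed_at_height(v_ref, h, h_ref=33.0, alpha=1.0 / 7.0):
    """Power-law extrapolation of Eq. (4) from the 33 m reference height."""
    return v_ref * (h / h_ref) ** alpha

def turbine_output(P_w, A_wg, eta_wg):
    """Electrical output of Eq. (5): specific power times swept area and efficiency."""
    return P_w * A_wg * eta_wg
```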

Battery Model. To deal with the uncertainty of renewable energy resources, energy storage units are incorporated into the integrated system to balance the fluctuations in electrical power generation. The charging and discharging state of the battery is determined with respect to the instantaneous state of charge (SOC), which can be calculated using Eq. (6) [12].

$$SOC(t) = SOC(t-1)\left(1 - \frac{\sigma}{24}\right) + \frac{I_{bat}(t)\,\eta_{bat}}{C_{bat}} \tag{6}$$

where $SOC(t)$ represents the state of charge at the current hour $t$ and $SOC(t-1)$ denotes the same for the previous hour $(t-1)$. According to [5], the battery charging efficiency $\eta_{bat}$ is taken to be 0.8, and the discharging efficiency was set to 1. For this study, the self-discharge rate $\sigma$ of the battery was assumed to be 0.2% per day. Furthermore, in Eq. (6), $C_{bat}$ denotes the capacity of the battery and $I_{bat}(t)$ refers to the current from the battery at the $t$-th hour. For the HRES considered in this work, $I_{bat}(t)$ due to the incorporation of the battery can be determined as follows.
$$I_{bat}(t) = \frac{P_{PV}(t) + P_{WG}(t) - P_{load}(t)}{V_{bat}(t)} \tag{7}$$

where $P_{load}(t)$ refers to the load demand at the $t$-th hour and $V_{bat}(t)$ indicates the battery voltage. The battery specifications considered for this study are presented in Table 3 [12].
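A single hourly SOC update combining Eqs. (6) and (7) might look as follows. The function name and the sign convention (positive current means charging) are our assumptions; the 24 V bus, 0.8 charging efficiency, unity discharging efficiency, and 0.2% daily self-discharge follow the text.

```python
def soc_step(soc_prev, P_pv, P_wg, P_load, V_bat=24.0, C_bat=357.0,
             sigma_daily=0.002, eta_charge=0.8):
    """One hourly SOC update per Eqs. (6)-(7); sign convention is an assumption."""
    I_bat = (P_pv + P_wg - P_load) / V_bat        # Eq. (7): positive when charging
    eta = eta_charge if I_bat > 0 else 1.0        # discharging efficiency set to 1 [5]
    return soc_prev * (1.0 - sigma_daily / 24.0) + I_bat * eta / C_bat
```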

Table 1. Specification of the PV module

VOC (V) ISC (A) Vmax (V) Imax (A) Pmax (W) Capital Cost ($)
64.8 6.24 54.7 5.86 320 640

Table 2. Specification of the WG

Power (W) hlow (m) hhigh (m) WG Capital Cost($) Tower Capital Cost ($/unit length)
100 11 35 2400 55

Table 3. Specification of the battery model

Price ($) Voltage (V) Capacity (Ah)
1239 12 357

2.2 Formulation of the Optimization Problem

LPSP is a very important term that signifies the probability of unsatisfied load
demand due to a shortage of power supply or due to economic constraints [15].
Thus, the goal of this study is to optimize the number of PV modules $N_{PV}$, the number of WT generators $N_{WG}$, and the number of batteries $N_{bat}$, along with the tilt angle $\beta$ of the PV modules and the WT installation height $h$, required to achieve zero LPSP at the minimum total cost for the HRES model. The objective function for minimizing the overall cost associated with the HRES system is highlighted in Eq. (8) [12].
In the objective function of Eq. (8), the total cost is defined in terms of the initial capital costs of the PV modules, WT generators, and batteries, along with the associated maintenance and operational costs over a span of 20 years [10]. In this equation, $C_{PV}$, $C_{WG}$, and $C_{bat}$ are the capital costs associated with a PV module, WT generator, and battery, respectively, whereas the maintenance and operational costs for a PV module, WT generator, and battery are denoted by $M_{PV}$, $M_{WG}$, and $M_{bat}$, respectively. The capital cost per unit height ($C_h$) of a WT and the annual operational and maintenance cost per unit height ($M_h$) of a WT are also incorporated in the objective function. The expected number of batteries to be replaced within the period of 20 years is represented by $y_{bat}$ [10]. Thus, the goal of this study is to determine the optimum values of $N_{PV}$ ($N_S \times N_P$), $N_{WG}$, $N_{bat}$, $\beta$, and $h$ in order to minimize the objective function in (8) with respect to the constraints mentioned in (9).

$$\begin{aligned} \text{Minimize } f(N_{PV}, N_{WG}, N_{bat}, \beta, h) = {} & [N_{PV}(C_{PV} + 20 M_{PV}) + N_{WG}(C_{WG} + 20 M_{WG} \\ & + h\,C_h + 20 h\,M_h) + N_{bat}(C_{bat} + y_{bat}\,C_{bat}) \\ & + (20 - y_{bat} - 1) M_{bat}] \end{aligned} \tag{8}$$

Subject to the constraints

$$N_{WG} \ge 0, \quad N_{PV} \ge 0, \quad N_{bat} \ge 0, \quad 0 \le \beta \le 90^{\circ}, \quad 11 \le h \le 35 \tag{9}$$
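A direct Python transcription of the 20-year cost in Eq. (8) is sketched below. The capital costs follow Tables 1-3, but the maintenance figures M_pv, M_wg, M_h, and M_bat and the replacement count y_bat are illustrative guesses, since the paper does not list them.

```python
def total_cost(N_pv, N_wg, N_bat, h,
               C_pv=640.0, M_pv=6.4, C_wg=2400.0, M_wg=24.0,
               C_h=55.0, M_h=0.55, C_bat=1239.0, M_bat=12.0, y_bat=1):
    """20-year HRES cost of Eq. (8); all M_* maintenance values are made up."""
    cost_pv = N_pv * (C_pv + 20.0 * M_pv)
    cost_wg = N_wg * (C_wg + 20.0 * M_wg + h * C_h + 20.0 * h * M_h)
    cost_bat = N_bat * (C_bat + y_bat * C_bat) + (20 - y_bat - 1) * M_bat
    return cost_pv + cost_wg + cost_bat
```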

3 The Hybrid Algorithm

In this study, the ACOR proposed by the authors in [16] is hybridized with the aid
of conventional crossover and mutation operations which are the core part of the
Genetic Algorithm [17] to formulate the ACOR -GA algorithm. The formulated
hybrid algorithm commences by initializing the required parameters of ACOR ,
crossover, mutation, initial population (pop), initial population size, number of
iterations, and loading the required data for simulation. The initial population
is generated randomly within the adopted parameter bounds. After calculating
the solution weights (ω) and selection probabilities (p) for an individual run, the
algorithm enters the main loop which is run for a definite number of iterations.
In this study, it is assumed that the optimization algorithm will cease once the
maximum number of iterations reaches 100 for each individual run. Besides,
the number of the initial population is set to 50. After entering the main loop,

mean (μ) and standard deviation (σ) of the initial population set are calculated
which, along with previously calculated (ω) and (p) influence the generation of
the new population (newpop) set in that particular iteration. Thereafter, based
on roulette wheel selection, a solution kernel is generated which governs the
generation of a new population termed as newpop. This process takes place
50 times i.e. equal to the number of initial population size and each time it
updates all the optimizing parameters. It is to be noted that all the parameters
of ACOR are calculated following the equations provided by the authors in [18].
Consequently, the algorithm performs crossover and mutation [19] on randomly
selected populations from the initial population set and generates two more new
population sets named popc and popm respectively. All the solutions, i.e. newpop,
popc, popm and initial population (pop) are then merged to generate an updated
pop of larger dimension. After merging, all the solutions are checked if any of
the solutions exceed the adopted bounds. The solutions are pulled back to the
upper boundary provided they go beyond the defined upper and lower limits.
This solution set is then sorted based on the obtained cost from the objective
function and only the best 50 solutions are kept for the next iteration. The
current best solution in a particular iteration is stored provided it supersedes the
best solution achieved in the previous iteration. And thus the above-mentioned
methodology runs for 100 times which is the maximum number of iterations.
The overall work-flow of the proposed ACOR -GA is illustrated in Fig. 2. It is to
be mentioned here that the hybrid algorithm takes a slightly higher computation
time compared with the single algorithms, but because the convergence rate is much
quicker, this difference becomes negligible.
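As a rough illustration of the work-flow just described, the following Python skeleton combines ACOR-style Gaussian kernel sampling with one crossover pair and two mutated solutions per iteration. It is a simplified sketch under our own assumptions (kernel width rule, clipping to both bounds, operator counts), not the authors' code.

```python
import numpy as np

def acor_ga(objective, lb, ub, n_pop=50, n_iter=100, q=0.5, zeta=1.0, seed=0):
    """Simplified ACOR-GA loop: Gaussian kernel sampling plus GA operators."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    dim = lb.size
    pop = rng.uniform(lb, ub, size=(n_pop, dim))
    # Rank-based solution weights and selection probabilities (Socha and Dorigo [16])
    ranks = np.arange(1, n_pop + 1)
    w = np.exp(-((ranks - 1) ** 2) / (2 * (q * n_pop) ** 2)) / (q * n_pop * np.sqrt(2 * np.pi))
    p = w / w.sum()
    best = min(pop, key=objective).copy()
    for _ in range(n_iter):
        pop = pop[np.argsort([objective(x) for x in pop])]
        # ACO_R step: roulette-wheel kernel choice, then Gaussian perturbation
        newpop = np.empty_like(pop)
        for i in range(n_pop):
            k = rng.choice(n_pop, p=p)
            sigma = zeta * np.abs(pop - pop[k]).mean(axis=0) + 1e-12
            newpop[i] = rng.normal(pop[k], sigma)
        # GA step: one arithmetic crossover pair and two mutated solutions
        i, j = rng.integers(n_pop, size=2)
        alpha = rng.random(dim)
        popc = np.array([alpha * pop[i] + (1 - alpha) * pop[j],
                         alpha * pop[j] + (1 - alpha) * pop[i]])
        popm = pop[rng.integers(n_pop, size=2)] + rng.normal(0.0, 0.1, (2, dim))
        # Merge, pull solutions back into bounds, keep the n_pop best
        merged = np.clip(np.vstack([pop, newpop, popc, popm]), lb, ub)
        merged = merged[np.argsort([objective(x) for x in merged])]
        pop = merged[:n_pop]
        if objective(pop[0]) < objective(best):
            best = pop[0].copy()
    return best
```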

Fig. 2. Flowchart of the ACOR -GA algorithm



4 Result and Discussion


An elaborate explanation of the superiority of the hybrid algorithm is presented in
this section by comparing it against four other algorithms. Each algorithm was run
10 independent times, and every time the optimum result was recorded. Every
independent run constituted 100 iterations and the initial population was 50.

4.1 Descriptive Analysis of the Algorithms

Table 4 demonstrates that the mean value of the ACOR -GA is lower than that of all the other algorithms. However, the mean value alone cannot signify the superiority of an algorithm. Variance and standard deviation are two other parameters that
help us further analyze the data. These parameters demonstrate the spreading of
our data around the mean. An algorithm with high variance and standard devi-
ation indicates that the result obtained in each independent run is significantly
different from the other independent runs. And thus, for this algorithm to be
reliable, a larger number of independent runs is needed. On the other
hand, a smaller value of variance and standard deviation signifies that the result
obtained in each independent run is not significantly different from the other
runs. From Table 4, it is seen that both the values of standard deviation and
variance of ACOR -GA are considerably smaller than the rest of the algorithms.
Thus the value of the mean together with the values of standard deviation and
variance indicates that the ACOR -GA is more reliable than the rest of the algo-
rithms. Further, the minimum value reached by the hybrid algorithm is $37943.70, which is again lower than those of the competing algorithms. The number of iterations
signifies the convergence speed of an algorithm. In this aspect, the hybrid algo-
rithm leaves behind all the other algorithms by a very high margin. The average
number of iterations needed by the ACOR -GA is 26.3 and the algorithm closest
to this value is GA with 34 iterations. It is to be noted here that apart from
the hybrid algorithm, GA has the best convergence speed among the four algo-
rithms. Thus hybridizing GA with ACOR significantly improves the convergence
of the latter algorithm.
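The entries of Table 4 are ordinary sample statistics over the 10 per-run best costs; a minimal NumPy helper such as the one below (our own sketch, assuming sample rather than population variance) reproduces them.

```python
import numpy as np

def describe_runs(costs):
    """Summary statistics of Table 4 from the 10 per-run best costs."""
    c = np.asarray(costs, dtype=float)
    return {'mean': c.mean(), 'median': np.median(c),
            'variance': c.var(ddof=1), 'std': c.std(ddof=1),
            'min': c.min(), 'max': c.max()}
```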

Table 4. Descriptive statistics

ACOR -GA ACO GA PSO GOA
Mean ($) 38271.75 38457.15 40400.70 39865.32 47253.03
Median ($) 37997.40 38315.70 40247.70 39470.55 46067.40
Variance 184923.82 283633.62 1330418.00 1973622.34 55249687.80
Std. Deviation 430.02 532.57 1153.43 1404.85 7433.01
Minimum ($) 37943.70 37997.40 39149.70 38435.40 38873.40
Maximum ($) 39257.70 39641.40 42689.70 43421.70 60773.70
Average iteration 26.3 45.4 34 48.8 56.4

Figure 3 portrays a closer look at the improvement of the convergence speed of ACO. In Fig. 3(a) it is seen that the lowest and the highest number of iterations
needed to converge for ACO in 10 independent runs is 22 and 74 respectively.
After hybridizing, as observed in Fig. 3(b), this number drops to 19 and 35 as
the least and the highest number of iterations respectively to converge. This is
one of the main benefits achieved by introducing GA within the ACOR . Figure 4
shows the gradual convergence of all the algorithms in a single frame. It is seen
that the hybrid algorithm not only reaches the lowest cost, it also reaches it
faster than the rest of the algorithms.

4.2 Normality Tests of the Algorithms

In this normality test, the null hypothesis states that the data are not statistically significantly different from the normal distribution. The important aspect of this particular test is that the null hypothesis is expected to be rejected, because the data being analyzed are the optimum costs obtained in each independent run. Thus, since all the costs here are optimum, all the values are expected to cluster around a particular point, increasing the reliability of the algorithm.
Fig. 3. Convergence speed employing (a) ACO and (b) ACOR -GA

Fig. 4. Gradual convergence of the algorithms



Table 5. Normality test

Method Kolmogorov-Smirnov (K-S) Shapiro-Wilk (S-W)
Statistic Df Sig. Statistic Df Sig.
ACOR -GA .338 10 .002 .755 10 .004
ACO .321 10 .004 .793 10 .012
GA .253 10 .070 .883 10 .143
PSO .263 10 .048 .794 10 .012
GOA .196 10 .200 .898 10 .209

However, if the null hypothesis is not rejected, signifying that the data are normally distributed, then many independent runs are required for that particular algorithm to have reasonable credibility.
The value of asymptotic sigma (Sig.) determines whether the null hypothesis is
rejected or not. If its value is less than 0.05 then the null hypothesis is rejected
and vice versa. From Table 5, it is observed that ACOR -GA and ACO are not
normally distributed in both K-S and S-W tests, while PSO is marginally reject-
ing the normal distribution in the case of K-S test and the other algorithms are
accepting the null hypothesis. Thus the reliability of the hybrid algorithm is also
supported by these tests of normality.
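Both tests are available in SciPy; a small sketch follows, using made-up run costs. Note that SPSS-style K-S output typically applies the Lilliefors correction, whereas the plain kstest against a fitted normal shown here only approximates that procedure.

```python
import numpy as np
from scipy import stats

# Best cost of each of the 10 independent runs (illustrative numbers only)
costs = np.array([37943.7, 37965.1, 37991.2, 37997.4, 38003.5,
                  38110.8, 38240.6, 38492.3, 38715.9, 39257.7])

w_stat, p_sw = stats.shapiro(costs)                       # Shapiro-Wilk
d_stat, p_ks = stats.kstest(costs, 'norm',
                            args=(costs.mean(), costs.std(ddof=1)))
print(f"S-W p = {p_sw:.3f}, K-S p = {p_ks:.3f}; p < 0.05 rejects normality")
```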

4.3 Optimum Configuration of the HRES


Finally, Table 6 gives the optimum HRES combination for the different algo-
rithms. In this table, the configuration shown is only for the best cost obtained
in ten independent runs. A closer look at Table 6 reveals that the key factor that
changes the overall cost is the number of batteries. While the other algorithms
focused on minimizing the number of PV modules and WTs, the hybrid algo-
rithm successfully decreased the number of batteries allowing the number of PV
modules to escalate and thus reducing the overall cost. It is to be mentioned
here that the number of batteries connected in parallel is one of the optimizing parameters, not the number of batteries connected in series.

Table 6. Optimum configuration

Parameter Algorithm
ACOR -GA ACO GA PSO GOA
NP V 24 8 10 9 10
NW G 3 5 5 5 5
Nbat 2×1 2×2 2×2 2×2 2×2
β 14 4 33 24 29
h 35 29 29 28 26
Cost ($) 37943.70 37997.4 39149.70 38453.4 38873.4

As can be seen
from the table, the number of batteries in series will always be two owing to the
particular setup that was considered in this study.

5 Conclusion and Future Work


Modern optimization algorithms are based on randomness and thus it is reason-
able for these algorithms to demonstrate different sensitivities when applied to
different fields. In this study, their application in the field of HRES was analyzed. It is to be noted that the choice of the components of an HRES is site dependent and cannot be made arbitrarily. Two algorithms were merged to generate a
hybrid ACOR -GA algorithm and then the hybrid algorithm along with some
other renowned algorithms were applied to find the optimal configuration of
PV, WT, and battery in an HRES. The results exhibited the overall superiority
of the hybrid algorithm with respect to the rest of the algorithms. The hybrid
algorithm not only obtained a lower mean and minimum cost but also obtained
those values in significantly fewer iterations. It is expected that
in the near future the hybrid algorithm will be converted into a multi-objective
optimization and will consider the effects of uncertainty when applied in the field
of HRES.

References
1. Bajpai, P., Dash, V.: Hybrid renewable energy systems for power generation in
stand-alone applications: a review. Renew. Sustain. Energy Rev. 16(5), 2926–2939
(2012)
2. Amer, M., Namaane, A., M’sirdi, N.: Optimization of hybrid renewable energy
systems (HRES) using PSO for cost reduction. Energy Procedia 42, 318–327 (2013)
3. Kamjoo, A., Maheri, A., Dizqah, A.M., Putrus, G.A.: Multi-objective design under
uncertainties of hybrid renewable energy system using NSGA-II and chance con-
strained programming. Int. J. Electr. Power Energ. Syst. 74, 187–194 (2016)
4. Anand, P., Rizwan, M., Bath, S.K.: Sizing of renewable energy based hybrid system
for rural electrification using grey wolf optimisation approach. IET Energ. Syst.
Integr. 1(3), 158–172 (2019)
5. Yang, H., Zhou, W., Lu, L., Fang, Z.: Optimal sizing method for stand-alone hybrid
solar-wind system with LPSP technology by using genetic algorithm. Sol. Energy
82(4), 354–367 (2008)
6. Suhane, P., Rangnekar, S., Mittal, A., Khare, A.: Sizing and performance analy-
sis of standalone wind-photovoltaic based hybrid energy system using ant colony
optimisation. IET Renew. Power Gener. 10(7), 964–972 (2016)
7. Fathy, A., El-baksawi, O.: Grasshopper optimization algorithm for extracting maxi-
mum power from wind turbine installed in al-jouf region. J. Renew. Sustain. Energy
11(3), 033303 (2019)
8. Vasant, P., Zelinka, I., Weber, G.W.: Intelligent Computing & Optimization, vol.
866. Springer, Cham (2018)

9. Mitiku, T., Manshahia, M.S.: Fuzzy logic controller for modeling of wind energy
harvesting system for remote areas. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.)
International Conference on Intelligent Computing & Optimization, pp. 31–44.
Springer, Cham (2019)
10. Tito, M.R., Lie, T.T., Anderson, T.: Sizing optimization of wind-photovoltaic
hybrid energy systems under transient load. Int. J. Power Energy Syst. 33(4),
168–174 (2013)
11. The Electricity Authority (2014). Available at: https://www.ea.govt.nz/dmsdocument/4755
12. Tito, S., Lie, T., Anderson, T.: Optimal sizing of a wind-photovoltaic-battery
hybrid renewable energy system considering socio-demographic factors. Sol. Energy
136, 525–532 (2016)
13. Auckland, T.K.: (2014). Available at: https://bit.ly/2Bu4Y4H
14. Borowy, B.S., Salameh, Z.M.: Methodology for optimally sizing the combination
of a battery bank and PV array in a wind/PV hybrid system. IEEE Trans. Energy
Convers. 11(2), 367–375 (1996)
15. Yang, H., Lu, L.: Study of typical meteorological years and their effect on building
energy and renewable energy simulations. ASHRAE Trans. 110, 424 (2004)
16. Socha, K., Dorigo, M.: Ant colony optimization for continuous domains. Eur. J.
Oper. Res. 185(3), 1155–1173 (2008)
17. Grefenstette, J.J.: Optimization of control parameters for genetic algorithms. IEEE
Trans. Syst. Man Cybern. 16(1), 122–128 (1986)
18. Nguyen, T.K., Lee, I.G., Kwon, O., Kim, Y.J., Hong, I.P.: Metaheuristic optimiza-
tion techniques for an electromagnetic multilayer radome design. J. Electromagn.
Eng. Sci. 19(1), 31–36 (2019)
19. Yoon, Y., Kim, Y.H.: The Roles of Crossover and Mutation in Real-Coded Genetic
Algorithms. Citeseer, London (2012)
Activity Identification from Natural Images
Using Deep CNN

Md. Anwar Hossain1 and Mirza A. F. M. Rashidul Hasan2

1 Pabna University of Science and Technology, Rajapur, Pabna, Bangladesh
manwar.ice@pust.ac.bd
2 University of Rajshahi, Rajshahi, Bangladesh
mirzahasanice@gmail.com

Abstract. With the increasing demand for surveillance, robotics, health care, sports analysis, and human-computer interaction, human activity classification and recognition have become a contemporary research topic in the computer vision era. Classification of human activity from video datasets is a challenging
task in the field of computer vision due to high dimensionality, various activi-
ties, the variability of inter-class and intraclass, background behavior, actor
movement and multiple signs of interaction. The convolutional neural network
(CNN) in deep learning plays an active role in today’s technology for classifi-
cation and recognition patterns from sequence images in the video which is
superior to the conventional machine learning methods. In this paper, we pro-
pose two convolutional neural network-based deep learning models that are
suitable for human activity recognition. Initially, we have applied some opti-
mization techniques in this model that eliminate the major problem of data
overfitting. The results of the proposed models show that the proposed opti-
mization methods are increasing the CNN performance higher than the con-
ventional two CNN models. The models trained and tested were KTH and
UCF11 YouTube action dataset which contains a large amount of different
human activity videos with different conditions. The proposed two models were
modified from the two traditional neural network models and the results show
that the output of the proposed models achieved the classification accuracy
greater than the traditional convolutional neural network models.

Keywords: Machine learning · Deep learning · CNN

1 Introduction

Human activity recognition (HAR) is the procedure of identifying and classifying different human actions performed in a video sequence. HAR in computer vision is
important for numerous real-world applications such as medical data analysis, human
monitoring systems, home-based rehabilitation, and wildlife observation. HAR is
becoming an emerged research topic with the increasing demand for security defense,
anti-terrorism investigation and disaster rescue, human activity classification and
recognition [1]. Due to the widespread availabilities of computer and research data,
HAR has increased gradually. However, human activity recognition is a challenging
task in computer vision due to illumination change, camera movement or viewpoint,
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 693–707, 2021.
https://doi.org/10.1007/978-3-030-68154-8_60

complex dynamic backgrounds, complexity and large variability in human actions, which can impact the performance of the proposed algorithms.
which can impact the performance of these proposed algorithms.
Over the last few years, numerous methods have been proposed for human activity
recognition from the human action video dataset. The existing technique of HAR
depends on three methods, (a) Human body model-based method, (b) Holistic methods,
(c) Local feature methods. In the human body-based methods, human action recog-
nition technology is based on the extraction of 2D or 3D features of human body parts
[2]. On the other hand, the Holistic method is based on the extraction of features by
localizing the people in videos but does not use human body parts. Holistic methods
only depend on body structure, shape and movement of humans [3]. Furthermore, the
Local feature method is based on the extraction of local features without localization
and information of humans [4]. The previous approach of human activity recognition
has limitations such as data limitation, time consumes and they cannot process the raw
images in videos. Traditional machine learning technique removes this limitation and
occupies many areas of computer vision but these models cannot handle the large
amount of data.
Deep learning is one type of machine learning technique that learns directly from
data by using multiple levels of representations [5]. Convolutional neural network
(CNN) is a type of deep feed-forward artificial neural network that has been applied
successfully in image classification and pattern recognition problems [6]. Nowadays,
there have been designed lots of deep learning methods in the application of human
activity classification with the rapid development of machine learning techniques. This
paper presents two traditional CNN models and two proposed models derived from the
traditional CNN model with including some optimization criteria like network ini-
tializer, weight regularization, and weight constraints. The model proposed in this
paper applied to KTH and UCF11 YouTube action dataset individually. These datasets
are widely used in human activity recognition of the field of computer vision. The
contribution of this paper designs two CNN based deep learning model applied to the
challenge of classifying human activity on KTH and UCF11 video dataset. This paper
presents four CNN based models that evaluated on KTH and UCF11 datasets and
compare them to find the effective model. In the preprocessing stage, the frame capture
from the video and applied to every model. The results of proposed model 2 and model
4 demonstrate the effective methods of CNN for human activity recognition than model
1 and model 3.

2 Literature Review

Human activity recognition plays an important role in medical research and security
system. HAR has been studied for years and researchers have introduced different
solutions to challenge the problem. CNN usually produces more accurate and reliable
results than the existing methods. CNN's are deep learning-based algorithm extends the
capability of traditional artificial neural network by adding additional constraints to the
earlier layers and increased the depth of the network. Recent work has focused on
tuning different deep learning-based [7] architecture to achieve maximum performance
on different human action video dataset [8].

Qingchang Zhu [9] proposed a semi-supervised deep learning technique by temporal ensembling of deep long short-term memory that recognizes human activities
with smartphone inertial sensors. They combine the losses of supervised and unsu-
pervised for utilizing the unlabeled data that the supervised learning method cannot
leverage. Their results indicate the effectiveness compared to several modern semi-
supervised learning methods.
Wen-Hui Chen [10] proposed a combined model of LSTM-RNNs with scene
information based Human activity recognition system. Their experiment can identify
human activities and present a location-aware approach to improving the recognition
accuracy using long short-term memory (LSTM) recurrent neural networks by ana-
lyzing sensor data from accelerometers and gyroscopes. Their results revealed a clas-
sification accuracy of the LSTM-RNNs in this reaches 82.57% without location
information and offers a feasible solution to activity recognition for healthcare.
Shuiwang [11] introduced a 3D CNN architecture having three convolution
layers, one hardwired layer, two subsampling layers, and one fully connected layer.
The supervisory depth architecture generates multiple information channels from
adjacent input frames and performs convolution and sub-sampling in each channel. To
represents the final feature, they combined the information from all channels.
Krishanu Sarker [12] proposed an end-to-end framework for HAR from RGB video
that contains human silhouette. They implement a novel structure that couples skeleton
data obtained from RGB video and deep Bidirectional Long Short-Term Memory
(BLSTM) model for activity recognition.
In addition, only a few CNN-based deep learning models have been evaluated on the combination of the KTH and UCF11 YouTube human action datasets, and this line of research has received limited attention in the deep learning era. We emphasize the effectiveness of the traditional CNN model in learning these datasets and increase its efficiency by including the optimization techniques addressed in this paper.

3 Methods and Proposed Methodologies

This section provides the background to the traditional CNN based Model 1, a
description of the proposed modified CNN of Model 1 (Model 2) and a brief
description of the traditional 4 Layer CNN based Model 3, basic overview of the
optimized 4 Layer CNN of Model 3 (Model 4) architectures. Besides, this section
provides the Layer operation and configuration table of all neural network that has
proposed in this paper.

3.1 Dataset
Our proposed method was evaluated on two different human activity video databases: the KTH [13] action database and the UCF11 [14] YouTube action dataset. The KTH dataset, created by the Royal Institute of Technology in 2004, is an important milestone in computer vision. This dataset contains six types of human actions (walking, jogging, running, boxing, hand

waving, and hand clapping) performed several times by 25 subjects in four different
conditions. There is a total of 25  6  4 = 600 video files for each combination of 25
individuals, 6 actions, and 4 conditions. KTH dataset is an RGB color space video
dataset and the sequences from these videos provide all our requirements of proposed
human activity recognition.
This UCF11 [14] dataset was created videos from YouTube by the University of
Central Florida in 2009. This dataset contains 11 varieties of human actions (basketball
shooting, biking/cycling, diving, golf swinging, horseback riding, soccer juggling,
swinging, tennis swinging, trampoline jumping, volleyball spiking, and walking with a
dog). This dataset is a collection of 1600 video samples with large variations under
camera motion, object appearance, and pose, object scale, viewpoint, cluttered back-
ground, illumination conditions, etc. This is a more challenging task due to the above
properties of this dataset.

3.2 Preprocessing
In this preprocessing step, we extracted frames from every video in the database and converted them from RGB to grayscale to reduce memory and computation requirements. Each input clip has a depth of 15 frames at 64 × 64 resolution, i.e. 64 × 64 × 15. We reshaped the input to 5D as required by the 3D convolution. Table 1 indicates the dataset characteristics for the human activity recognition model.

Table 1. Dataset characteristics


Dataset Classes Total videos Input dimension No. of samples (Training, Validation, Testing)
KTH 6 600 (600, 1, 64, 64, 15) 480 120 120
UCF11 11 1600 (1600, 1, 64, 64, 15) 1280 320 320
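A possible OpenCV sketch of this preprocessing step is shown below; the frame-sampling strategy (evenly spaced frames) and the function name are our assumptions, since the paper does not state how the 15 frames per clip were selected.

```python
import cv2
import numpy as np

def video_to_clip(path, depth=15, size=(64, 64)):
    """Grayscale 64x64 clip of `depth` evenly spaced frames from one video."""
    cap = cv2.VideoCapture(path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        frames.append(cv2.resize(gray, size))
    cap.release()
    idx = np.linspace(0, len(frames) - 1, depth).astype(int)  # even temporal sampling
    return np.stack([frames[i] for i in idx], axis=-1)        # shape (64, 64, 15)

# Stacking clips and adding a channel axis yields the 5D shape of Table 1,
# e.g. (600, 1, 64, 64, 15) for KTH:
# X = np.stack([video_to_clip(p) for p in paths])[:, np.newaxis, ...]
```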

3.3 Convolutional Neural Network


A traditional CNN consists of two parts: the feature extraction part, also called the hidden layers, and the classification part. In the feature extraction part, the CNN performs a series of convolution and pooling operations during which features are detected. The convolution is performed by sliding a filter or kernel over the input images with some stride (the step size the filter moves each time), and the sum of the convolution produces a feature map. It is common to add a pooling layer between convolutional layers, whose function is to continuously reduce the dimensionality and hence the number of parameters and computations in the network. Two types of pooling layer are available in CNNs, average and max pooling, which reduce training time and control overfitting. The classification part contains fully connected layers that can only accept one-dimensional data. The output of this part produces the desired class using an activation function and classifies the given video sequences [15].

3.4 Model 1 (One Layer CNN)


The Model 1 architecture of CNN is combined of one convolutional layer with Rec-
tified Linear Unit (ReLU) activation function, one pooling layer and two fully con-
nected layers with ReLU. The structure of Model 1 for Human activity recognition
developed in this paper is depicted in Fig. 1. As shown in Fig. 1, the model takes video
sequences as input to the convolution layer which is followed by pooling and fully
connected layer.

3.5 Model 2 (Proposed One Layer CNN)


The traditional CNN of Model 1 is highly dependent on the size and quality of the training data, so Model 1 must be optimized to obtain better results. In Model 2, well-tuned optimization techniques are applied to the traditional CNN Model 1 discussed previously. The performance of this proposed model was improved by network initialization, regularization, weight constraints, and dropout. The proposed Model 2 architecture is illustrated in Fig. 2.
Network Initialization: The performance of a neural network depends heavily on the initial weights. Proper weight initialization allows the network to learn readily from the training data. Using an initializer in this model means setting the initial random weights of each layer. The proposed Model 2 uses Glorot uniform initialization for all ReLU layers and a normal initializer for the output layer.
Weight Constraints: Weight constraints check the size or magnitude of the weights and force them to remain small; they can be used instead of weight decay and in conjunction with more aggressive network configurations. This proposed model uses a max-norm weight constraint on the layer following flattening, which can improve generalization.
L2 Regularization: Regularization decreases network complexity by penalizing large weights. It simplifies the learning process during optimization by applying penalties to network parameters and layers. In this model, L2 regularization forces the weights to decay towards zero, with the hyperparameter lambda = 0.01.
Dropout: Due to the non-linear hidden layers, overfitting remains one of the model's sensitive weaknesses. Dropout is a simple regularization technique widely used in deep neural network optimization [16]. At every iteration, dropout randomly selects some nodes and removes them along with all of their incoming and outgoing connections. Training a network with dropout leads to significantly lower generalization error on a wide variety of classification problems. Dropout was applied in the pooling and fully connected layers, randomly excluding 50% of neurons to reduce overfitting.
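Putting the four techniques together, a Keras sketch of the proposed Model 2 could look as follows. The filter count, dense width, and pooling size are assumptions on our part; only the optimization settings (Glorot uniform and normal initializers, L2 with lambda = 0.01, a max-norm constraint, 50% dropout, and RMSprop) follow the text and Table 2.

```python
from tensorflow.keras import layers, models, regularizers, constraints, initializers

def build_model2(input_shape=(64, 64, 15, 1), n_classes=6):
    """Hedged sketch of the proposed Model 2; filter/unit counts are assumptions.

    The paper stores clips as (1, 64, 64, 15); np.moveaxis(x, 1, -1) gives the
    channels-last layout assumed here.
    """
    model = models.Sequential([
        layers.Conv3D(32, (3, 3, 3), activation='relu',
                      kernel_initializer=initializers.GlorotUniform(),  # network initialization
                      kernel_regularizer=regularizers.l2(0.01),         # L2, lambda = 0.01
                      input_shape=input_shape),
        layers.BatchNormalization(),
        layers.MaxPooling3D((2, 2, 2)),
        layers.Dropout(0.5),                                            # 50% dropout
        layers.Flatten(),
        layers.Dense(128, activation='relu',
                     kernel_constraint=constraints.MaxNorm(3)),         # max-norm constraint
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        layers.Dense(n_classes, activation='softmax',
                     kernel_initializer=initializers.RandomNormal()),   # normal init for output
    ])
    model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```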

3.6 Model 3 (Traditional 4 Layer CNN)


The Model 3 architecture of CNN is combined of four convolutional layers with ReLU
activation function, two pooling layers and two fully connected layers with ReLU. The
structure of Model 3 for Human activity recognition developed in this paper is depicted

in Fig. 3. As shown in Fig. 3, the model takes video sequences as input to the con-
volution layer which was followed by pooling and fully connected layer. Table 2
shows the parameters used in Model 3.

3.7 Model 4 (Proposed 4 Layer CNN)


The traditional 4-layer CNN of Model 3 is highly dependent on the size, preprocessing, and quality of the training data, so Model 3 must be optimized to obtain better results. In Model 4, well-tuned optimization techniques are applied to the traditional 4-layer CNN Model 3 discussed previously. The layers used in Model 4 are identical to those of Model 3 except for the added optimization techniques. The performance of this proposed model was improved by network initialization, regularization, weight constraints, and dropout. The proposed Model 4 architecture is illustrated in Fig. 4. Table 2 shows the parameters used in Model 4 with respect to the other models.

Fig. 1. Architecture of the traditional CNN based Model 1 for human activity recognition.

Fig. 2. Architecture of the proposed CNN based Model 2 for human activity recognition.

Table 2. Number of Parameters for four different models


Parameters | Model 1 | Model 2 | Model 3 | Model 4
No. of convolution layers | 1 | 1 | 4 | 4
Activation functions | ReLU, Softmax | ReLU, Softmax | ReLU, Softmax | ReLU, Softmax
Iterations | 100 | 100 | 100 | 100
Learning rate | 0.01 | 0.01 | 0.01 | 0.01
Optimization | RMSprop | RMSprop | RMSprop | RMSprop
Network initialization | None | Xavier/Glorot uniform | None | He uniform & Glorot uniform
L2 regularization | None | Yes | None | Yes
Batch normalization | None | Yes | None | Yes

Fig. 3. Architecture of the traditional 4 Layer CNN based Model 3 for human activity recognition.

Fig. 4. Architecture of the proposed 4 Layer CNN based Model 4 for human activity recognition.

4 Experimental Results

This section provides the experimental results obtained when using the traditional CNN, proposed CNN, traditional 4-layer CNN, and proposed 4-layer CNN on the KTH and UCF11 action datasets. The experiments were mainly implemented with the Python Keras and Scikit-learn libraries. The presented models were evaluated on two different human action datasets: we applied the preprocessed KTH and UCF11 data to the four models introduced in this paper. The accuracy of the testing phase is shown in Table 3. The experimental results show that our proposed Model 2 and Model 4 architectures achieved higher performance than the Model 1 and Model 3 architectures.

Table 3. Testing Accuracy for KTH and UCF11 dataset of different model
Model Representation Test Accuracy (KTH Dataset) Test Accuracy (UCF11 Dataset)
1 Traditional CNN 67% 71%
2 Proposed CNN 70% 73%
3 Traditional 4 Layer CNN 74% 75%
4 Proposed 4 Layer CNN 80% 79%

When the KTH dataset was applied to the four models, our proposed Model 2 and Model 4 obtained accuracies of 70% and 80%, which were 3% and 6% higher than the performance obtained by Model 1 (67%) and Model 3 (74%) respectively. When the UCF11 dataset was applied to the four models, our proposed Model 2 and Model 4 obtained accuracies of 73% and 79%, which were 2% and 4% higher than the performance obtained by Model 1 (71%) and Model 3 (75%) respectively.

Table 4. Class wise right(R) and wrong(W) cases for KTH dataset over different model
(Training samples 480 and testing samples 120)
Class No Class name Traditional Proposed Traditional 4 Proposed 4
CNN CNN (Model Layer CNN Layer CNN
(Model 1) 2) (Model 3) (Model 4)
R W R W R W R W
1 Boxing 15 2 15 2 15 2 16 1
2 Hand Clapping 12 3 12 3 14 1 14 1
3 Hand Waiving 15 5 16 4 18 2 17 3
4 Jogging 11 16 14 13 14 13 14 13
5 Running 11 12 11 12 14 9 14 9
6 Walking 16 2 15 3 15 3 16 2
Overall R/W 80 40 83 37 90 30 91 29

Fig. 5. Confusion matrix for KTH dataset: (a) Model 1, (b) Model 2, (c) Model 3, (d) Model 4.

The overall KTH dataset classification performance of Model 2 and Model 4 is high, and Table 4 shows the class-wise right and wrong predictions over the four different models. The confusion matrices in Fig. 5 show that all of the images that were misclassified by Model 1 and Model 3 were correctly classified by Model 2 and Model 4. Fig. 6 and Fig. 7 depict the model accuracy and training loss when evaluating the four different models on the KTH dataset. These figures indicate that training and testing performance were close together across epochs, which also indicates that the data were not overfitting during training.

Fig. 6. Accuracy for training and testing data for KTH dataset: (a) Model 1, (b) Model 2, (c) Model 3, (d) Model 4.

The overall accuracy of Model 2 and Model 4 on the UCF11 dataset is high and significantly better than Model 1 and Model 3, owing to the optimized parameters. Table 5 shows the class-wise right and wrong predictions over the four different models. The confusion matrices in Fig. 8 show that all of the images that were misclassified by Model 1 and Model 3 were correctly classified by Model 2 and Model 4.

Fig. 7. Loss during the training process for the KTH dataset: (a) Model 1, (b) Model 2, (c) Model 3, (d) Model 4.

Table 5. Class wise right(R) and wrong(W) cases for UCF dataset over different model
(Training samples 1280 and testing samples 320)
Class No Class name Traditional Proposed Traditional 4 Proposed 4
CNN CNN Layer CNN Layer CNN
(Model 1) (Model 2) (Model 3) (Model 4)
R W R W R W R W
1 Basketball 24 8 25 7 25 7 26 6
2 Biking 15 13 16 12 22 6 15 13
3 Diving 27 9 31 5 29 7 31 5
4 Golf Swing 18 6 21 3 22 2 20 4
5 Horse Riding 29 7 27 9 30 6 30 6
6 Soccer Juggling 19 14 20 13 21 12 24 9
7 Swing 16 4 12 8 9 11 18 2
8 Tennis Swing 29 5 27 7 33 1 32 2
9 Trampoline Jumping 19 1 19 1 20 0 19 1
10 Volleyball Spiking 19 8 21 6 21 6 24 3
11 Walking 12 18 13 17 9 21 14 16
Overall R/W 227 93 232 88 241 79 253 67

Fig. 8. Confusion matrix for UCF11 dataset: (a) Model 1, (b) Model 2, (c) Model 3, (d) Model 4.

Fig. 9 and Fig. 10 depict the model accuracy and training loss when evaluating the four different models on the UCF11 dataset. These figures indicate that training and testing performance were close together across epochs, which also indicates that the data were not overfitting during training.

Fig. 9. Accuracy for training and testing data for UCF11 dataset: (a) Model 1, (b) Model 2, (c) Model 3, (d) Model 4.

Investigating the results of all the models proposed in this paper, we observed that the improvement in CNN performance is largely attributable to the network initialization, network regularization, and weight constraints applied in the proposed Model 2 and Model 4. These models showed higher performance than the traditional Model 1 and Model 3, which use the same layers except for the optimization parameters.

Fig. 10. Loss during the training process for the UCF11 dataset: (a) Model 1, (b) Model 2, (c) Model 3, (d) Model 4.

5 Conclusion

In this paper, we have successfully implemented two proposed convolutional neural network based deep learning models and compared them with the traditional deep
learning models on KTH and UCF11 video datasets. Based on the obtained experi-
mental result, it explains that the performance of proposed model 2 and model 4
achieved better accuracy compared to the traditional model 1 and model 3. Opti-
mization techniques like network initialization, weight constraints, regularization were
applied to the proposed convolutional deep learning model that increased the perfor-
mance for human activity recognition. However, our proposed methods demonstrate
that the methods were effective for human activity recognition. Our methods were
conducted on the KTH and UCF11 YouTube action video datasets. In the future, we
would like to extend our method and evaluate the proposed models with other complex
datasets.

References
1. Xiaoran, S., Yaxin, L., Feng, Z., Lei, L.: Human activity recognition based on deep learning
method. In: IEEE International Conference on Radar, IEEE (2018)
2. Saad, A., Arslan, B., Mubarak, S.: Chaotic invariants for human action recognition. In: IEEE
11th International Conference, pp. 1–8, IEEE (2007)
3. Moshe, B., Lena, G., Eli, S., Michal, I., Ronen, B.: Actions as space-time shapes. In:
Tenth IEEE International Conference, vol. 2, pp. 1395–1402. IEEE (2005)
4. Heng, W., Alexander, K., Cordelia, S., Cheng-Lin, L.: Dense trajectories and motion
boundary descriptors for action recognition. Int. J. Comput. Vision 103(1), 60–79 (2013)
5. Alani, A.A.: Arabic handwritten digit recognition based on restricted Boltzmann machine and convolutional neural networks. Information 8(4), 142 (2017)
6. Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional
neural networks. In: NIPS (2012)
7. Dhillon, J.K., Chandni., Kushwaha A.K.S.: A recent survey for human activity recognition
based on deep learning approach. In: Fourth International Conference on Image Information
Processing (ICIIP), Shimla (2017)
8. José, M.C., Enrique, J.C., Antonio, F.C.: A survey of video datasets for human action and
activity recognition. Comput. Vis. Image Underst. 117(6), 633–659 (2013)
9. Qingchang, Z., Zhenghua, C., Yeng, C.S.: A novel semisupervised deep learning method for
human activity recognition. IEEE Trans. Ind. Inform. 15(7), (2019)
10. Wen, H.C., Carlos, A.B.B., Chih, H.T.: LSTM-RNNs combined with scene information for
human activity recognition. In: IEEE 19th International Conference on e-Health Networking,
Applications and Services (Healthcom), IEEE (2017)
11. Shuiwang, J., Wei, X., Ming, Y., Kai, Y.: 3D convolutional neural networks for human
action recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 221–231 (2013)
12. Krishanu, S., Mohamed, M., Saeid, B., Shihao, J.: Towards robust human activity
recognition from RGB video stream with limited labeled data. In: 17th IEEE International
Conference on Machine Learning and Applications (ICMLA) (2018)
13. Laptev, I., Caputo, B.: Recognition of human actions. https://www.nada.kth.se/cvap/actions/
(2011)
14. University of Central Florida: UCF YouTube action dataset. https://www.cs.ucf.edu/~liujg/YouTubeAction_dataset.html (2011)
15. Thomas, J.J., Pillai, N.: A deep learning framework on generation of image descriptions with
bidirectional recurrent neural networks. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.)
Intelligent Computing & Optimization. ICO 2018, Advances in Intelligent Systems and
Computing, vol. 866, Springer, Cham (2019). https://doi.org/10.1007/978-3-030-00979-3_
22
16. Zeng, M., Nguyen, L.T., Yu, B., Mengshoel, O.J., Zhu, J., Wu, P., Zhang, J.: Convolutional
neural networks for human activity recognition using mobile sensors. In: 6th international
conference on mobile computing, applications and services, ICST (2014).
Learning Success Prediction Model for Early
Age Children Using Educational Games
and Advanced Data Analytics

Antonio Tolic, Leo Mrsic, and Hrvoje Jerkovic

Algebra University College, Ilica 242, 10000 Zagreb, Croatia
antonio.tolic@racunarstvo.hr,
{leo.mrsic,hrvoje.jerkovic}@algebra.hr

Abstract. The early years of a child's life greatly affect education potential and, furthermore, the potential for educational achievement in adulthood. The brain develops fastest at an early age, and missed cognitive opportunities in that
period are difficult to make up for. In this research, we have developed a
machine learning model based on the data points obtained from the educational
game, with aim to predict how many attempts are necessary for an individual
child to complete the task or assessment as part of educational game. In-game
assessments are based on the skills that the child already possess and those
developed while playing the game. Training of the machine learning model is
based on collected and processed data points (features), while model intercon-
nections are related to the factors of the child’s cognitive growing up process.
Model performance benchmarks are elaborated in results and conclusion section
of the paper as quality measures of the forecast indicators.

Keywords: Artificial intelligence · Machine learning · XGBoost · CatBoost ·
QWK · Data preparation and processing · Educational game · Learning success
prediction model · Quadratic Weighted Kappa · Extreme gradient boosting

1 Introduction

Major improvements in computing power, the availability of large data sets, and new
algorithms have paved the way for the development of artificial intelligence. Artificial
intelligence is an interdisciplinary field of research that deals with the development of
computers capable of intelligent activity. The approach that enables this intelligent
activity is called machine learning, and it represents the skeleton of artificial
intelligence; it is also the basis for the practical part of this paper. The initial idea of the
work comes from PBS KIDS Measure Up!1, an educational computer game for children
between the ages of three and five; PBS KIDS is the best-known educational
medium in the United States. The app helps children learn early STEM concepts by
concentrating on length, width, capacity and weight. Anonymized data collected
from the game, available on the Kaggle platform, is processed in order to train the
model for learning success prediction. In this paper, the collected game data is
used to predict the number of attempts a child needs to pass the offered task, and in

order to discover the key relationship between the child’s experience, engagement and
learning itself, which can be crucial for new educational concepts for modern
generations.
The topic of this paper stems from the world's largest data science competition with
a focus on social good: the Data Science Bowl2. The competition was hosted on
the Kaggle online platform. The prediction model aims at four different outcomes, as
follows: (i) the task is solved in one attempt; (ii) the task is solved in the second
attempt; (iii) the task is solved in three or more attempts; (iv) the task has never been
solved. A successful solution mostly depends on defining and extracting the
relevant features (data points) of the examples used to train the machine learning models. The
emphasis is on decision-tree models developed using advanced gradient boosting libraries,
predominantly XGBoost and CatBoost. The reason for their use is interpretability; they also
work very well without great requirements for hyperparameter optimization. In the
evaluation of the results, the Quadratic Weighted Kappa (QWK) is used as the official
measure, given by the organizers.
The motivation and importance of the competition lie in the fact that the first five
years of a child's life are crucial to his or her development. The child begins to learn
and nurture skills that anticipate later actions in different domains and
situations. Access to high-quality and effective early learning resources is crucial
during the early childhood development period, and support for early childhood edu-
cation is also key to other factors of long-term educational success. Machine learning
is used here on information about how children aged three to five use the app. This
determines what children currently know and learn based on experience, in order to
discover the important relationships between their engagement with the educational
medium and learning itself, as well as relevant aspects of development.
Machine Learning Methods and Concepts
According to the definition of machine learning, computers are programmed in such a
way as to optimize some performance criterion based on data examples or experience.
For this purpose, a model defined with carefully selected parameters is used. In order
for a given model to learn progressively, optimization of model parameters based on
data is practiced. It is important to point out that model training is performed on seen
data, while the key predictions are made on unseen data. The main goal of machine
learning is to build a model that generalizes well under the given circumstances.
The evaluation of the model is performed using certain metrics and should be per-
formed on a separate set, called a test set. To calculate the agreement coefficient of the
evaluators' "decisions", Cohen's kappa statistic is used [1]. If two evaluators are present,
Cohen's kappa is used, while in the case of more than two evaluators, Fleiss' kappa is used.
Since kappa statistics come in multiple forms, the application depends on the type of data. In
the practical part of the paper, the evaluation requires the use of a weighted kappa, i.e. the
Quadratic Weighted Kappa (QWK). Thus, the QWK metric in this paper can be
interpreted as a measure of the similarity between actual values and predicted values.
The measure takes values from the interval [−1.0, 1.0], where 1.0 is a perfect estimate
(predictions and actual values correspond completely), while the lowest possible score
is −1.0 (predictions are furthest from actual values).
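As an illustration of how this metric behaves, here is a minimal sketch, assuming
scikit-learn is available (the label vectors below are hypothetical):

# Minimal QWK sketch; cohen_kappa_score with weights="quadratic"
# computes the Quadratic Weighted Kappa described above.
from sklearn.metrics import cohen_kappa_score

y_true = [3, 0, 2, 3, 1, 3, 0]  # hypothetical actual accuracy groups
y_pred = [3, 0, 1, 3, 1, 2, 0]  # hypothetical model predictions

print(cohen_kappa_score(y_true, y_pred, weights="quadratic"))
# 1.0 would mean perfect agreement, -1.0 the worst possible score.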

2 Research Preparation

The data used in this competition is anonymous. Each piece of data was generated
during the child's interaction with the PBS KIDS Measure Up! app. PBS KIDS is
committed to creating a safe environment that can be enjoyed by family members of all
ages. It is important to note that the PBS KIDS Measure Up! app does not collect any
personal information such as name or location. Moreover, the data was reviewed by
PRIVO (Privacy Vaults Online, Inc.), the world's leading expert in the children's
Internet privacy industry, to ensure that all children's privacy requirements are met.
Data Set
By launching the application, the child can choose one of the three default worlds
through which to move. Children move around the map and perform different tasks at
different levels. Tasks can be grouped into activities, videos, games, and assessments.
The child's learning path is represented by the sequential execution of interrelated
tasks. It is designed so that the child first watches a video on the current topic and
approach to problem solving, and then explores an activity that is similar to the game,
but without a defined goal, in which the child can train their own skills. The child then
trains those skills in play, but with the goal of solving a specific problem. Finally, the
child demonstrates previously acquired skills in an assessment that is interactive and
specifically designed to measure and test a child's knowledge, skills, and understanding
of the current topic. However, it is not necessary for the child to follow the stated order,
and some of the previously mentioned tasks may have several levels. Based on the
given data, an attempt will be made to predict in how many attempts the child will pass
the assessment task. Any wrong answer counts as an attempt [9].
The target variable has the following states: (3) the assessment was solved in the
first attempt; (2) the assessment was solved in the second attempt; (1) the assessment
was solved in three or more attempts; (0) the assessment was never solved.
Data Preparation
In this step, among other things, the main goal is to extract data that is in JSON format
and represents the values of the event_data variable. Since a large amount of nested
data is present, the level down to which nested values are parsed is explicitly specified,
after which a subset of the obtained values is selected for further processing.
An example of such data can be seen in the following figure (Fig. 1).

Fig. 1. Data sample for event_data

In this paper, parsing is done to level one. At the end of the parsing, a unify
operation is performed with the rest of the data set. We have to point out that, as part of
the process, the outcome of the last assessment is the goal of prediction. This means that
there may be a history of outcomes related to the installation identification number of a
particular assessment. In other words, a child can go through the same assessment
multiple times, and only the outcome of the last game is relevant to prediction.
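A minimal sketch of this level-one extraction step, assuming a pandas DataFrame with
an event_data column (the column names here are illustrative):

# Parse the JSON strings in event_data, flatten one level of nesting,
# then unify the extracted columns with the rest of the data set.
import json
import pandas as pd

df = pd.DataFrame({
    'event_id': ['a1', 'b2'],
    'event_data': ['{"correct": true, "round": 2}', '{"duration": 1500}'],
})
parsed = pd.json_normalize(df['event_data'].map(json.loads).tolist(),
                           max_level=1)
df = pd.concat([df.drop(columns=['event_data']), parsed], axis=1)
print(df)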
Visualization and Exploratory Analysis
The figure below shows a histogram of experimental groups according to individual
assessment titles. It is clear that Mushroom Sorter and Cart Balancer are easier for
children because the category marked with 3 predominates, while Chest Sorter is
somewhat harder because the category marked with 0 predominates
(Fig. 2).

Fig. 2. Groups for accuracy_group attribute

The figure below shows a histogram that provides insight into the frequency of inter-
actions. The figure shows all the titles (Fig. 3).

Fig. 3. Attendance of interactions

The graph above shows that the highest activity is present at the end of September,
while the end of July has the lowest recorded game play activity. The time range of the
given data can also be seen on the graph. Thus, the time interval covers four months
(Fig. 4).

Fig. 4. Interactions over time



3 Model Development

New features are defined based on the resulting data set. Feature extraction was based on
free assessment, taking into account computing power/hardware capacity. The history
contained in the given data is observed. Since the prediction takes place at the level of
installation identification numbers, the operations listed below are performed on groups
sharing the same number. The goal is to have a unique installation identification number per
instance. Thus, after grouping, counting, summing, and arithmetic mean computation
operations are performed on the following features [10]:
for_aggregation = {
    'event_count': 'sum',
    'game_time': ['sum', 'mean'],
    'event_id': 'count',
    'total_duration': 'mean',
    'duration': 'mean',
    'size': 'mean',
    'level': ['sum', 'mean'],
    'round': ['sum', 'mean'],
    'correct': ['sum', 'mean'],
    'source': 'sum',
    'weight': ['sum', 'mean'],
    'coordinates_x': 'mean',
    'coordinates_y': 'mean',
    'coordinates_stage_width': 'mean',
    'coordinates_stage_height': 'mean'
}
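Applied to a pandas DataFrame of game events, this dictionary would be used roughly as
follows (a sketch assuming the grouping key is named installation_id, as described
above):

# Group events by installation id and apply the aggregations above,
# producing one instance per installation identification number.
aggregated = df.groupby('installation_id').agg(for_aggregation)

# Flatten the resulting multi-level columns, e.g. ('game_time', 'mean')
# becomes 'game_time_mean'.
aggregated.columns = ['_'.join(col).rstrip('_') for col in aggregated.columns]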

The motivation for constructing the previous features is as follows: it was
estimated that the coordinate features with the corresponding suffixes could be useful,
as they may indicate distractions and deviations of the child's attention while looking
at the screen. Other extracted features could also be directly related to the child's
skills [6–8].
Model Selection and Optimization of Hyperparameters
Since this is a supervised learning problem, appropriate output categories are
available along with the input data. The histogram of the target variable is
shown in the following figure (Fig. 5).

Fig. 5. Histogram of the target variable



The used data set is divided into a training set that holds 80% of the total data
and a test set that holds the remaining 20%. XGBoost and
CatBoost models are trained. In addition to the data sets for training and testing, a third set,
the so-called validation set, is often used. In this paper, the validation set is 20% of the
training set. The hyperparameter grids searched for each of the models are listed below.
grid_parameters_xg = {
    'n_estimators': [50, 100, 150],
    'min_child_weight': [3, 4, 5],
    'max_depth': [2, 3],
    'learning_rate': [0.01, 0.03, 0.05],
    'reg_lambda': [0.01, 0.1, 1],
    'reg_alpha': [0.01, 0.1, 1]
}

grid_parameters_cb = {
    'depth': [1, 2],
    'learning_rate': [0.01, 0.03, 0.06, 0.08],
    'iterations': [100, 150, 200],
    'l2_leaf_reg': [0.01, 0.1, 1, 20, 50]
}
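These grids could be searched, for example, with scikit-learn's GridSearchCV, scoring
each candidate with the QWK metric described earlier (a sketch assuming the xgboost
package; X_train and y_train stand for the prepared training split and are names used
here only for illustration):

# Grid search over grid_parameters_xg, scored by Quadratic Weighted Kappa.
from sklearn.metrics import cohen_kappa_score, make_scorer
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

qwk_scorer = make_scorer(cohen_kappa_score, weights='quadratic')
search = GridSearchCV(XGBClassifier(), grid_parameters_xg,
                      scoring=qwk_scorer, cv=3)
# search.fit(X_train, y_train)   # fit on the 80% training split
# print(search.best_params_)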

The following figures visualize the learning curves for XGBoost and CatBoost
(Figs. 6 and 7).

Fig. 6. Learning curve – XGBoost Fig. 7. Learning curve – CatBoost



The best hyperparameters found by the grid search are:


XGBoost
- n_estimators = 150
- min_child_weight = 4
- max_depth = 3
- learning_rate = 0.05
- reg_alpha = 0.01
- reg_lambda = 0.01

CatBoost
- depth = 2
- learning_rate = 0.08
- iterations = 200

Model Evaluation
In this part we describe concrete results that indicate how reliable and applicable the
models are, and how well they generalize. The choice of metrics generally
depends on the machine learning task. Taking into account the imbalance of the target
variable's values, the Quadratic Weighted Kappa metric and some additional metrics
visible in the classification report were used to assess the success of the
classifiers implemented in the paper.
Additionally, two simpler models were built, with the goal of outperforming them with
the more complex models. The first is a simple model
called DummyClassifier that does random guessing. In the second case, a mode-rule
solution was created. The initial data set was divided into a training set and a test set,
where the training set was 75% of the total data and the test set was 25%.
Target values were constructed by grouping assessment titles and finding the
mode value of each group using only the training set (Tables 1, 2, 3, 4, 5 and 6).
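A minimal sketch of the random-guessing baseline, assuming scikit-learn; the feature
matrix below is synthetic and purely illustrative:

# Random-guessing baseline on a synthetic stand-in for the real data,
# using the 75/25 split described above.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))        # placeholder features
y = rng.integers(0, 4, size=1000)     # accuracy groups 0-3

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

baseline = DummyClassifier(strategy='uniform', random_state=42)
baseline.fit(X_train, y_train)
print(baseline.score(X_test, y_test))  # roughly 0.25 for four classes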

Table 1. QWK – results comparison


Model Train Test
XGBoost 0.7913472660568854 0.7683384433770963
CatBoost 0.7625145704846481 0.7406780824054147
Baseline_mod 0.39489 0.41548
Baseline_cls −0.03057 −0.01316

The obtained mode values per assessment title are as follows:


Bird Measurer (Assessment) : 1
Cart Balancer (Assessment) : 3
Cauldron Filler (Assessment) : 3
Chest Sorter (Assessment) : 0

Table 2. XGBoost – report (testing sample/training sample)


XGBoost – report (testing sample) XGBoost – report (training sample)
Test Precision Recall F1-score Support Precision Recall F1-score Support
0 0.82 0.81 0.82 171 0.87 0.84 0.86 681
1 0.55 0.39 0.46 89 0.80 0.55 0.65 358
2 0.87 0.24 0.37 84 0.95 0.36 0.52 358
3 0.76 0.95 0.85 350 0.78 0.97 0.86 1398
Accuracy – – 0.76 694 – – 0.81 2775
macro avg 0.75 0.60 0.62 694 0.85 0.68 0.72 2775
Weighted avg 0.76 0.76 0.73 694 0.82 0.81 0.79 2775

Table 3. CatBoost – report (testing sample/training sample)


CatBoost – report (testing sample) CatBoost – report (training sample)
Test Precision Recall F1-score Support Precision Recall F1-score Support
0 0.80 0.80 0.80 171 0.85 0.82 0.83 681
1 0.60 0.43 0.50 89 0.69 0.52 0.59 358
2 0.78 0.21 0.34 84 0.92 0.29 0.44 358
3 0.75 0.95 0.84 350 0.76 0.95 0.85 1398
Accuracy – – 0.75 694 – – 0.78 2775
Macro avg 0.74 0.60 0.62 694 0.80 0.64 0.68 2775
Weighted avg 0.75 0.75 0.73 694 0.79 0.78 0.78 2775

Table 4. F1 (macro) – grade


Model Train Test
XGBoost 0.72 0.62
CatBoost 0.68 0.62
Baseline_mod 0.37 0.36
Baseline_cls 0.25 0.24

The following figures show the associated confusion matrices comparing the actual and
predicted classes of the XGBoost and CatBoost models (Figs. 8, 9, 10 and 11).
The following figures show the importance of individual features for the XGBoost
and CatBoost models. It is important to note that the selection of features works on the
built-in principle, that is, the algorithm itself finds the most important features (Figs. 12
and 13).

Fig. 8. XGBoost - confusion matrix (test set) Fig. 9. XGBoost - confusion matrix (train-
ing set)

Fig. 10. CatBoost - confusion matrix (test set) Fig. 11. CatBoost - confusion matrix (train-
ing set)

Fig. 12. XGBoost - importance of features Fig. 13. CatBoost - importance of features

4 Conclusion

The simplicity of the game certainly does not imply the simplicity of the data.
Moreover, the data are extremely complex, so understanding them takes some time.
The dimension of the data directly affected the execution time of the algorithms. In
particular, the execution time of individual algorithms was measured in hours. In order
to prepare the data in a correct way, individual phases required the implementation of
additional algorithms. The implementation of the extraction algorithm due to its
complexity required special attention and time.
This research indicates that the implemented models, with the previously described
parameters, can predict the assessment outcome with a quadratic
weighted kappa value of 0.76 for XGBoost and 0.74 for CatBoost on the test set. The
quadratic weighted kappa values on the training set are 0.79 for XGBoost and
0.76 for CatBoost. Thus, the results related to the quadratic weighted kappa favour the
XGBoost model. The F1 (macro) scores are identical for the implemented XGBoost
and CatBoost models, with a value of 0.62 on the test data set. The same score on the
training set is 0.72 for XGBoost and 0.68 for CatBoost. Additionally, two simpler
models were implemented with the goal of outperforming them with
more complex models, and the experiment proved successful: the more complex
XGBoost and CatBoost models significantly outperformed the simpler ones.
In conclusion, improvement of the existing results could be achieved by
searching a larger hyperparameter space, extracting values from deeper levels of
the event_data feature, and more advanced feature construction, taking hardware
capacities into account. Furthermore, PBS KIDS Measure Up! provides free access to
content that has been proven to help children learn. In the data generated by this
type of game, it was crucial to identify the relevant prediction factors, and this
research, among other things, provides a basic insight into them. Such results are
extremely important in building future education systems, and more advanced analysis
of relevant features can be of great help when it comes to a child's development. Based
on the experiments performed, it is evident that even in a limited work environment a
respectable prediction can be achieved, which can be crucial for understanding how a
child's cognitive growth can be measured and, hopefully, empowered.

Notes
1. https://pbskids.org/apps/pbs-kids-measure-up.html.
2. https://datasciencebowl.com/.

References
1. Ben-David, A.: Comparison of classification accuracy using Cohen's Weighted Kappa.
Expert Syst. Appl. 34(2), 825–832 (2008). https://www.sciencedirect.com/science/article/
abs/pii/S0957417406003435
2. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the
22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
ACM (2016). https://dl.acm.org/citation.cfm?doid=2939672.2939785
3. Domingos, P.: A few useful things to know about machine learning. Commun. ACM 55(10),
78–87 (2012). https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf
4. James, G., Witten, D., Hastie, T., Tibshirani, R.: An Introduction to Statistical Learning.
Springer, Berlin (2013). http://faculty.marshall.usc.edu/gareth-james/ISL
5. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer,
Berlin (2009). https://web.stanford.edu/~hastie/Papers/ESLII.pdf
6. Nielsen, D.: Tree Boosting with XGBoost - Why Does XGBoost Win "Every" Machine
Learning Competition? MS thesis, NTNU, Trondheim (2016). https://ntnuopen.ntnu.no/
ntnu-xmlui/bitstream/handle/11250/2433761/16128_FULLTEXT.pdf, accessed December 2019
7. Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A.V., Gulin, A.: CatBoost: unbiased
boosting with categorical features. In: Advances in Neural Information Processing Systems
(2018). https://papers.nips.cc/paper/7898-catboost-unbiased-boosting-
with-categorical-features.pdf
8. Skansi, S.: Introduction to Deep Learning. Springer, London (2018)
9. Intelligent Computing & Optimization, Conference Proceedings ICO 2018. Springer, Cham
(2018). ISBN 978-3-030-00978-6
10. Intelligent Computing and Optimization, Proceedings of the 2nd International Conference on
Intelligent Computing and Optimization 2019 (ICO 2019). Springer International Publishing
(2019). ISBN 978-3-030-33585-4
Advanced Analytics Techniques for Customer
Activation and Retention in Online Retail

Igor Matic, Leo Mrsic, and Joachim Keppler

Algebra University College, Ilica 242, 10000 Zagreb, Croatia
igor.matic@bio-info.hr, leo.mrsic@algebra.hr,
Joachim.Keppler@defacto.de

Abstract. In an age of ubiquitous, super-fast internet, online orders have been
increasing exponentially. This, in turn, significantly increases the customer's
options in terms of product range and price, and thus intensifies the
competition between companies. Customers often switch between offers, and thus
between companies, or simply stay dormant, with an associated decrease in average
order frequency; managing customer churn therefore has huge profit potential for
every online retailer. For online retailers, customer loyalty and regular purchase
behaviour are an important part of achieving sales and margin targets, as is
maintaining and preserving the customer base. This paper uses the key performance
indicators of one big online retail company to examine the current situation in
detail and provide methods to reduce churn. For this purpose, several aspects are
used, ranging from the use of tracking software to record customer activities and
interests in the online shop itself, to the resulting segmentation into various
customer types and the precise calculation of customer lifetime value. These
aspects, converted to numerical values, are used to train a machine learning model
with the goal of calculating a probable churn score. Additionally, the probability
calculation for reordering is used as an input for further marketing activities,
together with an estimation of financial uplift and profit potential.

Keywords: Online · Retail · Web store · E-commerce · Big data analytics ·
Machine learning · Churn prediction · Prevention and retention

1 Introduction

The online retail industry has a whole new business potential that has not yet been deeply
unlocked, both from a market and a customer perspective. The clear difference from the
offline world is that Internet companies can monitor consumers' decisions in the buying
process. But like any traditional offline marketplace, the online industry has its own
set of business rules, stakeholders, business problems and solutions. Some of the
typical Internet retail challenges are churn Prediction, Retention, and Prevention
(PRP). A possible approach towards a solution for PRP is to use machine learning
techniques. The main prerequisite is the unique identification of the customer, required
for anticipation and retention. The second prerequisite is a reliable relationship between
the provider and the user through the identity and consent of the customer. Forecasting

and retaining churn is a well-known process in some transaction industries like
telecommunications, but not so easy to manage for the online retail industry. In the e-
commerce industry, we typically have a huge number of products and services packed
into online orders, a clear definition of roles and responsibilities, and a largely digitally
transformed process that gives us a thorough overview of the often-changing e-
commerce revenue stream. Most classifications of these business models are based on some
value proposition, customer segments, partner network, delivery channel, revenue
streams, etc. This definition of the e-commerce business model is a huge advantage for
the churn prediction and retention process because it gives valuable input for the churn
machine learning model.

2 Customer Segmentation and Analytics

As a starting point for the research, we need to analyse the customer: determine what
relevant customer data we have, create and define the customer journey, analyse the
behaviour of our customers, and classify and segment the customers into similar groups.
Differentiating between customer groups has several advantages. In order to be able to
adequately differentiate the customers, we must ensure that the data used for this
purpose meet certain requirements. The most important factors are the following: (f1)
Buying behaviour: suitable indicators for future buying behaviour; (f2) Measurability:
measurable and recordable with existing market research methods; (f3) Accessibility:
ensure targeted addressing of the formed segments; (f4) Ability to act: ensure the
targeted use of the marketing tool; (f5) Profitability: the benefit of the analytics should
be greater than the costs incurred; (f6) Temporal stability: longer-term validity of the
information collected. A distinction is made between geographical, socio-demographic,
psychographic and behaviour-oriented segmentation criteria. In our case we concentrate
on the non-purchase behaviour because this is the information we need to understand
for our churn case. The so-called "lifestyle segmentation" divides the consumers
according to their lifestyle. It contains criteria that address individual and collective
value orientations and goals. A personality is identified that, in conjunction with the
segmentation criteria just mentioned (demographic, socio-economic as well as cultural
origin), allows a distinction according to lifestyle. We can distinguish and segment
online retail customers based on their lifestyle and put them in several groups. Most
retail customers can be put into six mainly self-explanatory lifestyle groups: strugglers,
action-oriented, believers, experiencers, achievers, and fulfilleds [23].

2.1 Customer Lifetime Value (Scoring)


Another way to differentiate the customers puts the focus on the monetary value. This
provides information about the amount that a customer spends while shopping at a
company. It gives us the possibility of splitting customers into different classes and also
of calculating profitability with regard to recruiting, maintaining or incentivizing a
customer. For the churn case this is a very important input parameter, because the
additional expenses described can exceed the margins on the products sold, making the
final result a loss-making business. In order to avoid this, the customer is evaluated with
regard to his purchases and the associated earnings/costs using a Customer Lifetime
Value (CLV) calculation. CLV is defined as the sum of the income that a customer
generates for the company during the business relationship, minus the costs of customer
acquisition, the sales process and customer service. Putting this in the perspective of the
online retailer, we used order costs, value of goods, delivery costs and other
administrative costs compared with the order-related earnings values per calendar year.
The customer should be included in this calculation from the first purchase, and
seasonal influences should be excluded. The result of the calculation can be either
positive or negative, where customers with negative earnings are undesirable. Customers
with a positive value can be divided into categories from A to C, in which category A
contains the top customers. Other customer categories are CR (reactivated customers)
and CN (new customers). A further distinction is made by calculating the total number
of orders within the last 12 months, i.e. whether the customer had only one or more than
three orders in this period. To be able to fine-tune this customer segmentation, a quotient
that describes the effort that was necessary to generate the earnings is used. The quotient
differentiates less profitable from profitable customers and indicates how to cluster them.
The yield quotient (earnings ratio) used for this is calculated from the order-related
earnings value divided by the delivery value (sum of the sales prices of successfully
delivered products): yield quotient (YQ) = EPO / delivery value. If you set this quotient
in relation to the respective customer category, further segments are created. This makes
differences in efficiency within a category more clearly visible. The quotient alone does
not make any statements about the coverage of additional costs that are not yet included
in the earnings per order (EPO) calculation, which can be either high or low. The order
profit amount is preferred over the order efficiency (the yield quotient). The earnings
ratio ran opposite to the EPO: in our online retail company, A1 customers brought an
average profitability of 38.0% and A2 customers a value of 44.1%. For churn
prevention, the scoring of customers is very important, since we need to calculate
break-even points for each churn case individually to know which customers we need to
address and reactivate or retain. The customer categories should be visually
differentiated from each other so we can better see where our target is. If you plot the
two characteristics (EPO and yield quotient) on the axes of a diagram, you get a cluster
view that shows the customer categories according to the cost-income ratio. The clear
subdivision by the earnings quotient becomes apparent, but above all the areas of the
customer segments are shown with their respective reach. Categories B and C are
limited to 300 EUR each in their EPO area, which raises the question of the distribution
of the customers. Customers with a high EPO are forced to show regular activity in
order to achieve this earnings value at all. It is therefore unrealistic that large parts of
customer category A suddenly become inactive, which excludes them from the churn
target. Based on previous company experience, it is primarily new customers who miss
another purchase that are recognized as good candidates for the customer churn
prevention program. Another group are already reactivated customers. They form a
large part of the customer base and are mostly only active again due to an incentive.
With the churn prevention process, they too need to be converted to regular buying
behaviour [24] (Fig. 1).

Fig. 1. EPO categorization in the EPO yield quotient diagram
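A minimal sketch of the yield quotient calculation described above (the numbers are
illustrative, not the company's actual figures):

# Yield quotient: order-related earnings (EPO) divided by delivery value.
def yield_quotient(epo: float, delivery_value: float) -> float:
    return epo / delivery_value

# Example: 120 EUR earnings on 400 EUR of successfully delivered goods.
print(yield_quotient(epo=120.0, delivery_value=400.0))  # 0.30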

3 Research

The churn prevention and reactivation processes are geared to the individual purchasing
histories and rhythms of the customers, in contrast to universal definitions that apply to
all customers. The criteria for entering the processes are based on the activities
(purchasing frequencies) of the customers, in contrast to purely value-based indicators.
The churn prevention process aims at those customers who are threatened with inactivity,
but do not yet count as inactive1. Churn prevention measures pursue the goal of
initiating the next purchase, thus prompting the customers to return to their normal
buying cycles. The basic idea of the prevention process can therefore be seen in the
individual approach according to the personal buying rhythm and the associated
inactivity. The reactivation process addresses those customers who have not bought for a
long time, so that they count as inactive. Reactivation measures pursue the goal of
prompting the customers to resume their purchasing activities and re-establish loyalty.
The focus of churn prevention and reactivation is primarily on addressing customers
who already show a certain amount of inactivity with regard to their regular buying
process, but who are not yet considered to be completely inactive [1]. They are sup-
posed to be led to a purchase again through special (marketing) communication, so
that the normal purchase cycle is restored. To determine the potential target group for
the churn case, indicators must be used that make the behaviour of the customer and
his frequent shopping habits visible. For this purpose, we developed activity criteria
that may also show current behaviour [2, 3]. This is where the advantages of
clickstream tracking come in, which can reveal historical as well as current perspectives
on a customer and thus offer a much better insight into the current interests of the
customer in addition to the purchases actually made. This behaviour can be recorded at
various customer touchpoints, i.e. all contact options where a customer comes into
contact, but primarily through direct touchpoints such as newsletters, websites or
advertisements in search engines [18]. In the course of preventing customer inactivity,
possible absolute inactivity should also be addressed, and here we try to reactivate the
customer [4].

1
Activity and inactivity of customers refers in the present use cases to orders; other activity criteria
(newsletter, online shop, …) are treated as indicators.

3.1 Parameters Selection


The first question is at which point in time a customer is considered to be at risk of
churn or completely inactive. To do this, we first need to define which influencing
criteria determine a churn risk. The deviation from the normal buying rhythm, which a
customer had originally shown, is the first pick. As a target group, we pick customers
who are about to deviate from their "normal" purchasing patterns, i.e. customers who
show anomalous purchasing behaviour and are therefore threatened with inactivity. The
main requirement on the customer history database is at least 2 purchases, which is a
prerequisite for the calculation of a customer-specific purchasing interval [20]. The
trigger for the selection is that the duration since the last purchase exceeds the average
purchasing interval of the customer by a significant value. In order to be able to
determine this selection individually, we require a certain minimum of purchases
already made. From a mathematical point of view, two numbers are required to define
an interval: an upper limit and a lower limit. This can also be applied to the distance
between the individual orders, where the first order determines the start of the purchase
rhythm and the interval is limited by the subsequent order. It can therefore be deduced
that new customers who have only made one purchase are excluded from the horizon of
the use case. For customers with two or more orders, however, the average of the
individual purchase intervals can be used: average purchasing interval
IAP = (Tlast purchase − Tfirst purchase) / (number of purchases − 1). The total period that
has passed from the first to the last order, divided by the purchases made therein minus
the start purchase, gives the average purchase interval (IAP). In order to finally
determine when a customer is at risk of churn, a simple definition of "exceeding the
average order interval" would be possible. If you consider that this definition is already
met with a delay of 24 h, certain doubts arise. To minimize this doubt, we must
distinguish when there is a churn danger and when there is only a delay in the buying
cycle. Setting an absolute time makes little sense due to the different purchase intervals
of different customers [5–7]. An approach is therefore required that can individually
determine a significant relative deviation and thus a realistic risk for each buyer.
Empirical values from analyses carried out by analytical experts in similar contexts
speak for a factor of 2.0. Applied to the problem of the use case, this means that the
purchase interval is doubled in time [19]. To make this usable for a definition, it must
be stated that the current interval, i.e. the elapsed time from the last order placed (ILP)
to the present day, is above twice the average order period. With this in mind, we can
say that the elapsed time since the last purchase (ILP) is significantly greater than IAP,
i.e. ILP > fC * IAP (first value proposal: fC = 2.0). But the definition still needs to be
expanded if we want to cover highly active customers. For high-frequency customers
who sometimes order several times a month and thus generate an ILP of a few days, it
is important not to start a churn prevention process too early. This can be justified by
unforeseeable circumstances (e.g. hospital stays) that trigger an unwanted interruption
of the order frequency, although the customer ultimately has no loss of interest; this
would otherwise result in an unnecessary churn prevention process. It is therefore
recommended to set a minimum interval that must be exceeded in any case before a
churn action is taken [8]. For this churn use case, an empirical value for the minimum
interval is set to ILP,min = 2 months (this value can be adapted for each individual
case). On the other hand, for low-frequency customers, this idea can also be applied,
but with inverted logic. Exceeding a certain period for which no churn prevention has
been started in the medium term can potentially lead to a complete loss of the customer
if the approach is made too late. Formerly regularly active customers who avoid their
direct qualification for churn due to ever-increasing purchase distances, but who
nevertheless carry a certain risk, would otherwise not be included. A maximum period
should therefore be set by which the process must be started in order to address the
customers just mentioned. This period can be defined with the typical value
ILP,max = 6 months (this value can be adapted for each individual case). If we
summarize this in a final definition of churn risk in the sense of the churn case, the
following applies: churn risk is present if the period since the last purchase of a
customer is more than twice as long as the average order interval, bounded from below
by two months and from above by six months [9–11].
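A sketch of this trigger logic under the definitions above, assuming purchase dates per
customer are available (function and variable names are illustrative):

# Churn-risk trigger: I_LP > f_C * I_AP, bounded by the minimum and
# maximum intervals proposed above (2 and 6 months, here in days).
from datetime import date

def average_purchasing_interval(purchases: list) -> float:
    # I_AP = (T_last - T_first) / (number of purchases - 1), in days
    return (purchases[-1] - purchases[0]).days / (len(purchases) - 1)

def churn_risk(purchases: list, today: date, factor: float = 2.0,
               min_days: int = 60, max_days: int = 180) -> bool:
    if len(purchases) < 2:        # single-purchase customers are excluded
        return False
    i_lp = (today - purchases[-1]).days
    if i_lp < min_days:           # high-frequency guard
        return False
    if i_lp >= max_days:          # low-frequency cap
        return True
    return i_lp > factor * average_purchasing_interval(purchases)

orders = [date(2020, 1, 1), date(2020, 2, 1), date(2020, 3, 2)]
print(churn_risk(orders, today=date(2020, 5, 11)))  # True (70 > 2 * 30.5)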

3.2 Churn Prevention Process and Customer Reactivation Process


The average purchasing interval (IAP) of a customer characterizes the typical duration
between two consecutive purchases. Customers enter the churn prevention process if
the elapsed time since the last purchase (ILP) is twice as large as the average purchasing
interval (IAP). As mentioned above, there are two limitations: (1) high-frequency
buyers cannot enter the process until after 2 months; (2) low-frequency buyers enter
the process after 6 months at the latest. With this principle, which should be acceptable
from the business side of the retailers, we can segment the customers based on their
(non-)purchasing behaviour, which is the input for our churn case.
The reactivation target group is very similar to that of churn prevention, with some
differences in the high- and low-frequency intervals. The target group for the
reactivation case are customers who show substantial deviations from their normal
purchasing patterns, i.e. customers who have to be regarded as inactive with respect to
their previous activity levels. The requirement on the customer history database is at
least 2 purchases, which is a prerequisite for the calculation of a customer-specific
purchasing interval. The trigger for the selection is that the duration since the last
purchase is several times as high as the average purchasing interval of the customer.
The average purchasing interval formula used is IAP = (Tlast purchase − Tfirst purchase) /
(number of purchases − 1). The elapsed time since the last purchase (ILP) must be
substantially greater than IAP, i.e. ILP > fR * IAP (proposed value: fR = 4.0). High-
frequency buyers have to be prevented from acquiring the status "inactive" too quickly,
i.e. a lower threshold should be specified for ILP (typical value: ILP,min = 3 months).
On the other hand, an upper threshold should be specified for ILP so as not to unduly
defer reactivation measures for occasional buyers showing a very low frequency
(typical value: ILP,max = 12 months). Customers enter the reactivation process if the
elapsed time since the last purchase (ILP) is four times as large as the average
purchasing interval (IAP). There are two limitations: (1) high-frequency buyers cannot
enter the process until after 3 months; (2) low-frequency buyers enter the process after
12 months at the latest. This principle is quite similar to that of churn prevention, with
the difference of the time period observed and taken into account for the customer
segmentation (Fig. 2).

Fig. 2. Churn process overview (IAP = average purchasing interval; ILP = elapsed time since
last purchase)
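Under the same assumptions as the churn-risk sketch in Sect. 3.1, the reactivation
trigger differs only in its parameters:

# Reactivation trigger: f_R = 4.0, bounded by 3 and 12 months.
print(churn_risk(orders, today=date(2020, 5, 11),
                 factor=4.0, min_days=90, max_days=360))  # False here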

In this comparative view we see that the main difference is in the elapsed time since
the last purchase, whose value is much greater for the reactivation case. Using this
principle, we segment different customer groups and use a different process for each
churn case. The modelling part also differs, since slightly different input parameters
are used (Fig. 3).

Fig. 3. Example of IAP/ ILP customer matrix, status view

If we put the complete customer base into this IAP/ILP matrix and visually present the
result, we get a status view of the potential target groups for the churn prevention and
the reactivation processes. It is to be expected that the reactivation group would be
bigger, but the churn prevention group could show more financial potential.

3.3 Forecasting Scenarios


Forecasting methods based on machine learning, regression models and decision
matrices are necessary in order to be able to make a monetary assessment of the
individual customer potential. An individually calculated marketing proposal offers
much more precise results than an average over customers who seem similar. The
calculation methods used are based on various data and influencing variables of the
respective customer. At this point, reference can be made to two of the technical
aspects mentioned previously in the development phase of the churn use case, because
the required input variables are made up of existing customer data generated by pre-
vious orders [21].
Prediction of the Purchase Probability (Propensity Model). This model predicts the
probability of purchase of the customers within the projection period (typically
4 weeks). We use purchase probability values where 1 means the highest
probability of purchase and 0 no purchase probability. With this method we
differentiate the customers with respect to their individual purchase probability. As a
result, we get a target group of customers who are likely to purchase in the near
future, as opposed to those customers who are very likely to continue their
inactivity. From the marketing perspective, we need to answer the question of how to
define the churn prevention/reactivation strategy: should we allocate the marketing
budget to those customers who are going to buy anyway, or to those who will most
probably not buy at all?
Prediction of the Effect of Churn Prevention/Reactivation Measures (Uplift
Model). This model predicts the effect of a marketing measure on the purchasing
behaviour of the customers within the projection period (typically 4 weeks). By uplift
we understand here the difference between the probability of purchase with stimulus
and the probability of purchase without stimulus. We differentiate the customers with
respect to their individual uplift potential. As a result, we get a segmented
customer group which can be positively influenced by a marketing campaign. If we
compare these two models side by side (juxtaposition) to show the differences, we see
that the main difference is how the results are treated from the marketing campaign
perspective.
Purchase Probability (Propensity Model). With the propensity model we
analyse our customers to find out what churn probability they have. Using machine
learning methods, we create the potential target group for the marketing campaign
selection. Using a control group, we evaluate the results and repeat the process
with the new findings. Here are some potential measures which can be used in the
propensity model for the dispatching of vouchers: x% discount on the next purchase (x
between 15 and 30); x% discount on one item of the next purchase (x between 30 and 50),
provided the purchase consists of more than 1 item; x€ discount on purchase value above
a critical threshold (depending on average net sales per order); and remission of shipping
fees. Before applying the propensity model to the whole customer base, some test
campaigns should be conducted. These tests could be applied using a minor customer
segment: testing of different voucher values, testing of different content variants (product
groups), testing of different tonalities (considering the customer typologies), or selection
of channel: email is much more inexpensive and flexible than print, but with print the
achievable effect size is significantly higher compared to email.
Differential Response Analysis (Uplift Model). One of the best-known models which
can make changes in customers and their respective buying behaviour visible through
stimuli triggered by companies is the so-called uplift or incremental model. In order to
optimally pursue its goals (lowering the churn rate and the associated increase in sales
while ignoring the unprofitable customers), the uplift model can be used to optimally
differentiate customers. Before the customer scoring is continued later in order to
split up the customer groups, which are then addressed by the churn process, the first
step should be the response to a stimulus (in our churn case, this is synonymous with
advertising), i.e. addressing the customer at different communication levels (e.g.
newsletter, letter advertising, etc.). The background is that potentially unnecessary
investments should be excluded in advance. The conceivable situations in which this
would be the case are visible in a four-field stimulus panel. Addressing customers who
buy regardless of a stimulus would be just as unnecessary as addressing customers who
would never order (bottom left). Most important is the exclusion of the small group of
customers for whom a stimulus would actually prevent an order (bottom right). It is
important to filter out the customers who fall into the grid at the top left and seem to
only become active again due to a stimulus. This filtering can be achieved relatively
simply by carrying out a campaign-dependent Ps/Po test. As in the propensity model,
forecasting methods based on machine learning, regression models and decision
matrices are necessary, but in the uplift model we create a monetary assessment of the
individual customer potential [13–16]. For the final evaluation, the customer groups are
divided into target and control groups, with the control group mostly accounting for
almost 7% of the total group. Previous experience has shown that this amount has so
far been optimal, on the one hand to make deviations visible and on the other hand to
determine the maximum behaviour of the target group. Potential reactivation measures
for the dispatching of vouchers are similar to the propensity model, with some additions
like an incentive which is typically more "aggressive" compared to churn prevention,
and rewards for being a loyal customer. The same test campaign proposals as for the
propensity model can be used for the reactivation model, but with some further
marketing campaign optimisations. Differential response analysis usually maximizes
the marketing campaign effects [17]. For the churn case, optimization of the marketing
campaigns is needed. The following table shows the potential effect of using the
different campaign approaches (Table 1).

Table 1. Campaigns effect potential


Campaigns   Number of contacts   Purchase rate,               Purchase rate,                 Campaign effect
                                 target group (with mailing)  control group (without mailing)  (in + points)
Campaign A  70                   6.7%                         6.1%                           0.6%
Campaign B  80                   8.3%                         8.2%                           0.1%
Campaign C  45                   7.5%                         4.8%                           2.7%

The purchase rate for conventional campaign selections (like Campaign B) maximizes
the value for the target group (8.3%) but can be high even without a stimulus (8.2%),
which gives only a 0.1% campaign effect. Differential response analysis maximizes the
effects of campaigns and will propose only the best campaigns, those with the maximal
+ points effect value; in this example, Campaign C with 2.7% + points.
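A sketch of the two-model variant of differential response analysis, assuming
scikit-learn; the data below is synthetic and only illustrates the mechanics:

# Two-model uplift: fit one purchase-probability model on the treated
# group and one on the control group, then take the difference.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))            # customer features
treated = rng.integers(0, 2, size=2000)   # 1 = received the mailing
p = 1 / (1 + np.exp(-(X[:, 0] + 0.5 * treated)))
bought = (rng.random(2000) < p).astype(int)

m_t = LogisticRegression().fit(X[treated == 1], bought[treated == 1])
m_c = LogisticRegression().fit(X[treated == 0], bought[treated == 0])

# Uplift = P(purchase | stimulus) - P(purchase | no stimulus)
uplift = m_t.predict_proba(X)[:, 1] - m_c.predict_proba(X)[:, 1]
print(round(float(uplift.mean()), 3))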
Relevant Forecasting Methods. Typical forecasting methods used in the machine
learning environment for churn prediction and prevention are regression models,
decision trees and neural networks. The regression algorithm was chosen because it
performed best in this churn case: the neural network was too complex and did not add
any gains, and the decision tree did not deliver good scores. The forecasting process,
from the individual attributes to a final scoring, takes place in several steps and at some
points allows an independent and consequent improvement of the applied decision
criteria by the procedure itself. Based on the data sources from the customer database
and behavioural (clickstream) data, a first customer profile is developed, which brings
additional input variables with it and flows into the forecasting model. There, various
decision trees are used to identify individual customers through existing correlations
(correlation filter), such as the payment method/basket size key figure (filtered out).
Logistic regression filters develop an individual forecast for the customer as to whether
the resulting uplift would turn out positive, and the decision-making processes of the
customer (Decision Tree Learner/Predictor), which are shown using clickstream
(behavioural) data, also influence the evaluation. Evaluation of the prediction models is
based on key figures used for the assessment of the forecasting quality and selectivity
[12] (Figs. 4 and 5).

Fig. 4. Lift chart for web tracking Fig. 5. Gains chart for web tracking

The Gini index/ratio (coefficient) is a measure of the statistical dispersion between
two areas, where 0 is perfect equality and 100 is maximal inequality. Overall
accuracy = 0.735 (estimated). Given the nature of the e-commerce business and this
churn case, the inclusion of additional attributes, e.g. from web tracking (clickstream),
is expected to improve the model selectivity by approximately 20%.

4 Results

After the initial analysis and the necessary agreement with the key business users, the
final implementation scenario represents the potential customer base for the online
retail churn processing scenario. As mentioned in the status view, we segment the
customers based on purchasing behaviour and focus on the customer groups with the
most potential. The complete customer base is around 2 million customers. These end
results tell us which customer segment is best suited for churn prevention and which
group is better approached with reactivation measures. Here are the key characteristics
of this customer status scenario: (i) customers enter the churn prevention process as
soon as ILP is more than twice as large as the average purchasing interval (IAP);
(ii) customers enter the reactivation process as soon as ILP is more than four times as
large as the average purchasing interval (IAP); (iii) 18 months after the last purchase,
customers are assigned the status "inactive" (independent of the average purchasing
interval). We can split the customer segment into 3 vertical models based on the last
purchase period. The main focus for our model development, training and scoring is on
the most prospective churn customers (orange-coded customer groups), which is a
combination of the horizontal and vertical attributes. Following this logic, the best
churn prospective groups are the middle groups of the status view matrix (M06-M08 x
M13-M29 columns). If we address the customers too soon (left side) or too late (right
side), our churn activities would have less chance of success (Fig. 6).

Fig. 6. Customer base calculation

We have prepared a 3-step filtering of the customers to obtain the customer segment
with the best potential for our churn case. In the first step we select the customers who
ordered in the last 2-year period. Then we filter only customers with more than two
orders in this period. In the third step, we exclude internal employees (and connected
persons) and customers with profile blocks/locks such as a marketing block, order
block or dunning/financial lock. The final number on the left represents the customer
segment base which is eligible (good potential candidates) for churn prevention or
reactivation activities. In the churn case definition chapters, we defined two customer
selection principles for the proof of success: manual selection, i.e. just picking some
potential churn customers by the "thumbs" principle, and model-based selection, i.e. an
ML algorithm (regression) segment. For the customer segment selection, we selected
the best-score interval: not the customers with the highest potential score, but just
below this. This was chosen because a first short test with the highest scores did not
deliver the expected results. This could be explained by the fact that the highest-profile
churn customers have firm reasons and cannot so easily be persuaded to switch back or
to (re)activate their purchases. For the main proof of success, we calculate the uplift
and the additional gross sales potential. The uplift stands for the effect caused by
implementing the respective marketing measure compared to the manual selection or
control group without these measures. The basis for comparison is the average
customer gross sales per respective group. We also check the additional gross sales
impact, which gives the additional gross sales purchase value per targeted customer. It
represents the additional sales of the model-based selection compared to the manual
selection and expresses the additional benefit that results from the model-based
selection. The additional gross sales (mainly upsell and cross-sell orders) are the result
of the uplift values of the respective campaigns. Future subsequent orders are not
considered here.
Here is a comparison to the baseline or manual selection for the churn prevention
case: (i) the uplift in gross sales per customer compared to a manual selection was
between 5.40% and 7.50%; (ii) the uplift in gross sales per customer compared to a
model-based selection (best-score interval) was around 18%; (iii) additional gross sales
per customer compared to a manual selection (model-assisted) was 10.5%; (iv) addi-
tional gross sales per customer compared to 'no activity' (baseline) was 18%. For the
quantification of the uplift/improvement we can use the following formula:
Pot_Gross_Sales = additional gross sales per customer * number of customers per
wave * number of waves, where waves are the marketing waves conducted during the
specific marketing campaigns. The usual number of waves was between 2 and 12.
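A minimal sketch of this quantification (the input values are illustrative, not the
campaign's actual figures):

# Pot_Gross_Sales = additional gross sales per customer
#                   * customers per wave * number of waves
def potential_gross_sales(add_sales_per_customer, customers_per_wave, waves):
    return add_sales_per_customer * customers_per_wave * waves

print(potential_gross_sales(5.0, 60_000, 12))  # 3,600,000 (EUR)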
Here is a comparison to the baseline or manual selection for churn reactivation
case: (i) The uplift in purchase probability per customer compared to a manual
selection was between 2.0% and 2.5%; (ii) The uplift in purchase probability per
customer compared to a model-based selection (best score interval) was 4.5%;
(iii) Additional purchase probability per customer compared to a manual selection
(model-assisted) was 2.0%; (iv) Additional purchase probability per customer com-
pared to ‘no activity’ (baseline) was 4.5%. For the quantification of the churn reacti-
vation improvement we can use this simple formula: Pot_Active_Cust = additional
purchase probability per customer * number of recipients per wave * number of waves.
If we plug in the values from the Print test campaign and calculate how many additional active customers (compared to the baseline) result from the regular monthly reactivation process we have run so far, using the size of the targeted customer segment (60,000) and the number of marketing waves performed during the campaign (12), the final result is as follows: Num_Active_Cust = additional purchase probability per customer * number of recipients per wave * number of waves = 0.045 * 60,000 * 12 = 32,400. The additional gross sales (compared to the baseline) resulting from the regular monthly reactivation process give the monetary value of this potential. This can change based on the company's gross sales figures and market potential, but with the numbers we have so far, the expected range would be 2–6% on a yearly basis (Table 2).
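Scripted with the Print test campaign values from the text (a sketch; the function name is ours):

```python
# Reproduces the reactivation calculation above.
def num_active_cust(additional_purchase_probability, recipients_per_wave, waves):
    return additional_purchase_probability * recipients_per_wave * waves

print(num_active_cust(0.045, 60_000, 12))  # -> 32400.0 additional active customers
```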

Table 2. Results comparison

Use case                KPI for the estimation of the potential      Potential (YoY/annual)
Churn Prevention and    Reduction of the churn rate                  10–15%
Reactivation            Baseline churn rate (based on the target
                        customers for churn prevention)              19.2%
                        Relevant customer base                       1,400,000
                        Estimation of additional active customers    32,000
                        Whole gross sales uplift                     3%

In order to calculate the complete potential, the first step is to calculate the number of customers that could be retained by applying the churn prevention process. Applying the 12% churn reduction rate to the 19.2% baseline value, a new customer churn rate of 16.9% is obtained. If we apply this to the target groups of active customers in the AC, CN, and CR customer categories, which correspond to 95.3% of the total number of customers, we get 32,200 additional active customers. Multiplying this by the average order value of all customers, and excluding related costs such as vouchers or email costs, we achieve a final uplift value of about 3% on annual gross sales: quite a considerable amount, which certainly exceeds the cost of the overall implementation within a very short period of time [22].
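A short sketch of this calculation, assuming the churn-rate delta (rounded to 2.3 percentage points) is applied to the full customer base; applying it to the 95.3% share instead gives a slightly lower figure:

```python
baseline = 0.192                      # baseline churn rate
new_rate = baseline * (1 - 0.12)      # 12% churn reduction -> ~0.169
delta = baseline - round(new_rate, 3) # ~2.3 percentage points
print(round(1_400_000 * delta))       # -> 32200 additional active customers
```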

5 Conclusion

An online retail company already has data analysis potential as an essential asset and must learn how to use it in the best manner. With the ever-increasing competition among online retailers, customer churn is accelerating for several reasons. This is due, among other things, to the freedom of choice and the ability to switch quickly between websites. Customer loyalty therefore drops just as quickly as the competition grows, so there is a need for action to avoid further customer churn through (artificially) intelligent customer segmentation and carefully planned marketing prevention campaigns. Successful implementation requires intensive consideration of various influencing factors as well as the development of solution approaches, taking into account the risks and problems that arise. If we look at the maximum potential that can be raised by addressing the most likely customers, in the case of our online retailer, which has a couple of million customers, we are talking about
an amount of around five million euros per year and, in connection with this, around 32,000 customers who become active again. The use of data-driven programs to identify and prevent migration risks offers advantages that should not be missed in the current context of strong competition in the online retail industry, because maintaining a company's customer base is a vital task. If a large number of customers drop out, only the much more complex and cost-intensive method of acquiring new customers remains in the mid-term perspective. If this does not succeed, there is a long-term risk of falling below the average customer cost limit, with the consequence of a potential complete cash-flow insolvency.

References
1. Mitchell, T.: Machine Learning. McGraw Hill, New York (1997)
2. Mrsic, L., Klepac, G., Kopal, R.: Developing Churn Models Using Data Mining Techniques
and Social Network Analysis. IGI Global, Pennsylvania (2015). https://doi.org/10.4018/978-
1-4666-6288-9
3. Provost, F., Fawcett, T.: Data Science for Business. O'Reilly Media, Inc., Sebastopol (2013). ISBN 978-1-449-36132-7
4. Burkov, A.: The Hundred-Page Machine Learning Book. Leanpub, Victoria (2019). www.themlbook.com
5. Ng, A.: Machine Learning Yearning (2018). www.deeplearning.ai/machine-learning-yearning
6. Kraljević, G., Gotovac, S.: Modeling data mining applications for prediction of prepaid
churn in telecommunication services. Automatika 51(3), 275–283 (2010)
7. Hemann, C., Burbary, K.: Digital Marketing Analytics. Que Publishing, Indianapolis (2013). ISBN-13: 978-0-7897-5030-3
8. Farris, P., Bendle, N., Pfeifer, P., Reibstein, D.: Marketing Metrics. Pearson Education,
Upper Saddle River (2010)
9. Harvard Business Review. https://hbr.org/2014/10/the-value-of-keeping-the-right-customers
10. Ahlemeyer-Stubbe, A., Coleman, S.: Monetising Data: How to Uplift your Business. Wiley,
Newark (2018)
11. Franks, B., Bucnis, R.: Taking Your Analytics Up A Notch By Integrating Clickstream Data.
Teradata, Atlanta (2011)
12. Schellong, D., Kemper, J., Brettel, M.: Clickstream data as a source to uncover consumer shopping types in a large-scale online setting. Association for Information Systems, ECIS (2016)
13. Bishop, C.: Pattern Recognition and Machine Learning. Springer, New York (2006)
14. Strickland, J.: What are Uplift Models (2015). https://www.analyticbridge.datasciencecentral.com/profiles/blogs/what-are-uplift-models
15. Cottineau, C., Chapron, P., Reuillon, R.: Growing models from the bottom up. An evaluation-based incremental modelling method (EBIMM) applied to the simulation of systems of cities. J. Artif. Societies Soc. Simul. 18(4), 9 (2015)
16. Bucklin, R., et al.: Choice and the internet: from clickstream to research stream. Mark. Lett.
13, 245–258 (2002)
17. Han, J., Kamber, M., Pei, J.: Data Mining. Concepts and Techniques. Morgan Kaufmann,
Waltham (2011)
18. Davenport, T.: Big Data @ Work. Harvard Business Review Press, Boston (2014)
19. Freiling, J., Kollmann, T.: Entrepreneurial Marketing. Springer, Wiesbaden (2015)
20. Few, S.: Show Me the Numbers. Analytics Press, Burlingame (2012)
21. Aaker, D.A.: Strategic Market Management, 10th edn. Wiley, Hoboken (2014)
22. Olson, D.L., Delen, D.: Advanced Data Mining Techniques. Springer, Berlin (2008)
23. Intelligent Computing and Optimization. In: Proceedings of the International Conference on Intelligent Computing and Optimization 2018, ICO 2018. Springer, Cham. ISBN 978-3-030-00978-6
24. Intelligent Computing and Optimization. In: Proceedings of the 2nd International Conference on Intelligent Computing and Optimization 2019, ICO 2019. Springer International Publishing. ISBN 978-3-030-33585-4
An Approach for Detecting Pneumonia
from Chest X-Ray Image Using Convolution
Neural Network

Susmita Kar1, Nasim Akhtar1, and Mostafijur Rahman2

1 Department of Computer Science and Engineering, Dhaka University of Engineering and Technology, Gazipur, Dhaka, Bangladesh
susmitak275@gmail.com, drnasim@duet.ac.bd
2 Department of Software Engineering, Daffodil International University, Dhaka, Bangladesh

Abstract. Pneumonia is one of the most critical problems for human beings all over the world, and detecting the presence of pneumonia at an early stage is very necessary to avoid premature death. According to the World Health Organization, above 4 million premature deaths happen each year from household air pollution related diseases, including pneumonia. Generally, pneumonia can be identified using chest X-ray images examined by an expert radiologist. But relying only on the radiologist sometimes delays the treatment, because detecting diseases from chest X-ray images requires human effort, experience, and time. In this case, a computer-aided diagnosis (CAD) system is required to identify pneumonia from chest X-ray images automatically. In this research, a modified Convolution Neural Network (CNN) model is proposed and trained on sample data to classify and diagnose the presence of pneumonia from a collection of chest X-ray images. From the experimental results, it is found that the proposed model performs better (89% accuracy) compared to other related existing algorithms.

Keywords: Pneumonia · Chest X-ray image · Radiologist · Deep learning · Convolution neural networks · Feature extraction

1 Introduction

The threat of pneumonia is a great problem for developing nations, where billions of people face hunger, poverty, and unemployment-related problems. Pneumonia is one of the most severe respiratory infections, affecting the lungs of individuals. It is a disastrous disease that inflames the alveoli of the lungs, and breathing naturally becomes very painful when the alveoli are full of fluid or pus. According to the report of the World Health Organization (WHO), above 4 million premature deaths occur each year from household air pollution related diseases, including pneumonia [1]. Above 150 million individuals are affected with pneumonia every year, particularly children below the age of 5 years [2]. In these conditions, the complications can be further aggravated by the scarcity of medical supplies and workforce: in Africa's 57 nations, there is a gap of 2.3 million doctors and nurses [3, 4]. For this immense population, a very accurate and fast diagnosis means a lot for saving lives.
The manual process of identifying pneumonia from a chest X-ray image is performed by an expert radiologist. This procedure requires considerable time, expertise, and personnel. To overcome this difficulty, a straightforward model is required that can automatically perform the optimal classification task on a collection of sample data with a deep neural network model. Here, in this study, a Convolution Neural Network (CNN) model is proposed and trained on sample data to classify and diagnose the presence of pneumonia from a collection of sample datasets. Rather than relying only on transfer learning techniques or traditional handcrafted features to attain accuracy, the aim is to develop a convolution neural network model that is trained from scratch on a sample dataset in order to identify the presence of pneumonia in the human body. The proposed CNN model could help alleviate the reliability and explainability challenges of working with medical imagery. Compared with other deep learning classification tasks that have ample image repositories, it is arduous to obtain a substantial quantity of data for this pneumonia classification task.

It is found that CNN-based deep learning frameworks have become one of the usual choices of recent researchers for image classification. Some of the most commonly used architectures for medical image classification are U-Net [5], SegNet [6], and CardiacNet [7].

2 Related Work

Many researchers have worked on deep learning for image classification in the past. Kermany, Daniel S., et al. [12] used a CAD system based on a deep-learning framework to detect blinding retinal diseases using a dataset of optical coherence tomography images and pediatric pneumonia using chest X-ray images. Antin, Benjamin, Joshua Kravitz, and Emil Martayan [13] developed an algorithm that uses a 121-layer Convolutional Neural Network similar to CheXNet to classify chest X-ray images collected from the NIH dataset and identify the presence of pneumonia; their deep learning model achieved an overall accuracy of 0.684. Rajpurkar, Pranav, et al. [14] developed an algorithm able to identify 14 diseases using the CheXNet algorithm. Park et al. [15] developed a Gabor filter-based algorithm for detecting abnormal texture in chest X-rays after rib suppression. Dina A. Ragab [16] developed a CAD system for classifying malignant and benign tumors from mammography. Livieris, Ioannis E., et al. [18] developed a new algorithm based on ensemble semi-supervised learning in order to classify lung abnormalities. Choudhari, Sarika, and Seema Biday [17] developed an algorithm that uses neural networks to detect skin cancer. Omar, H. S., and A. Babalık [20] developed a convolution neural network model to diagnose the presence of pneumonia from sample datasets; compared to other machine learning algorithms, their proposed model achieved a higher accuracy of 87.5%. Abiyev et al. [21] also developed a deep convolutional neural network for the detection of chest diseases; this work uses 380 images of size 32 × 32 pixels, and the experiments showed that backpropagation neural networks attained the highest classification accuracy rate of 99.19%. Naranjo-Torres, José, et al. [22] constructed a CNN-based model for processing collections of fruit images. Han, Fenglei, et al. [23] used a modified CNN model for processing underwater images to assist robots. Alazab, Moutaz, et al. [24] developed a technique derived from the CNN model for the diagnosis of COVID-19 from original datasets. Chakraborty, S., Aich, S., Sim, J.S., Kim, H.C. [25] presented computer-aided Convolutional Neural Network (CNN) models to identify pneumonia from a chest X-ray image dataset, which can be used in real-world applications such as medical departments in order to handle pneumonia; to reduce overfitting, a dropout regularization technique is applied in the model. Anthimopoulos, M., Christodoulidis, et al. [26] designed a convolution neural network model for the classification of Interstitial Lung Diseases (ILDs); to conduct the experiments on the CNN model, 14,696 images taken from 120 CT scans from several devices and health centers were used, achieving a remarkable classification accuracy of 85.5%.

3 Proposed Methods

For assessing the performance of the proposed model, the experiments and evaluation steps are presented in detail. For experimental purposes, the chest X-ray image dataset was taken from [8]. As the deep learning framework, Keras is used: an open-source library with a TensorFlow backend in which the proposed convolutional neural network model is built and trained. For the experiments, an HP laptop is used with the following configuration: Intel Core i5-8265 processor, 8 GB RAM, the cuDNN v7.0 library, and CUDA Toolkit 9.0. Python 3.7 is used to run the proposed model.

3.1 Dataset
For experimental purposes, the dataset was collected from [8]; it consists of a total of 5,840 anterior-posterior chest X-ray images carefully selected from patients aged between 1 and 5 years. The dataset was assembled by the National Institutes of Health (NIH) and was carefully curated from Guangzhou Women's Medical Center in Guangzhou. All chest X-ray images were taken during the patients' scheduled check-ups. Table 1 shows the details of the dataset, and Fig. 1 and Fig. 2 show sample images from the dataset, one normal and the other affected by pneumonia.

Table 1. The dataset in details

Sample data          Training data   Testing data
Pneumonia affected   3875            390
Normal               1341            231
Total                5216            621
Fig. 1. Sample chest X-ray image without pneumonia, taken from the dataset

Fig. 2. Sample chest X-ray image with pneumonia, taken from the dataset

3.2 Proposed Model


Fig. 3 shows the details of the proposed model for identifying pneumonia from the sample dataset. Images of different sizes are collected from the dataset and resized to 165 × 165. The batch size used here is 35. Every layer in the CNN model takes the output of its immediately preceding layer as input and forwards its own output as input to the next layer.
The feature extractor and the classifier (with softmax as the activation function) are the two parts of the proposed CNN model, merged together. The feature extractor layers comprise conv 3 × 3, 16; conv 3 × 3, 32; conv 3 × 3, 64; conv 3 × 3, 128; conv 3 × 3, 128; a max-pooling layer of size 2 × 2; and a Rectified Linear Unit as the activation function between them.

Fig. 3. Proposed CNN model architecture
The results from the convolution and max-pooling layers are aggregated into 2D planes called feature maps. To achieve more accurate output, the algorithm needs a sizably large amount of data. CNN models have been applied in distinct and diverse fields, including handwriting recognition, face detection, and image classification and segmentation. Here, softmax is used as the classifier, positioned at the end of the proposed CNN model.
The parameters of the proposed model are shown in Table 2. In order to increase the performance of the proposed model, batch normalization and dropout layer parameters are considered. Further, changes were made to two parameter values, the epoch number and the dropout rate, based on validation, which increases the accuracy of the proposed model. A minimal code sketch of this architecture is given below.
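A minimal Keras sketch of the architecture described above; the padding, optimizer, and output-layer details are assumptions, since the text does not specify them:

```python
from tensorflow.keras import layers, models

def build_cnn(input_shape=(165, 165, 3), num_classes=2):
    model = models.Sequential([layers.Input(shape=input_shape)])
    # Feature extractor: 3 x 3 convolutions with 16, 32, 64, 128, 128 filters,
    # ReLU activations, and 2 x 2 max-pooling between them, as in Sect. 3.2.
    for filters in (16, 32, 64, 128, 128):
        model.add(layers.Conv2D(filters, (3, 3), padding="same", activation="relu"))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dropout(0.2))  # dropout rate 0.2, per Sect. 4
    model.add(layers.Dense(num_classes, activation="softmax"))  # softmax classifier
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Training would then use the stated settings, e.g.:
# model.fit(train_images, train_labels, batch_size=35, epochs=3)
```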

4 Experimental Result

To assess the performance and effectiveness of the proposed model, the same experiment was run several times, and the parameters were tuned to get better performance. During the training process, the epoch number and dropout rate are set to 3 and 0.2, respectively. The accuracy and loss curves during the training process are shown in Fig. 4(a) and (b).
Table 2. Parameters used in the proposed model for training.

From the experiments, it is noticed that the proposed algorithm performs better compared to other machine learning and deep learning algorithms. The accuracy of the proposed method is 89%. Table 3 shows the experimental results of the proposed CNN model against the state-of-the-art supervised algorithms SMO [9], C4.5 [10], 3NN [11], WvEnSL3 [18], and CNN [23] on the pneumonia dataset. The overall results show that the proposed model exhibits better performance compared to the other supervised algorithms listed in the table; SMO, C4.5, and 3NN are base learners of self-labeled machine learning algorithms for classification tasks.
Fig. 4. (a) The accuracy diagram during the training process. (b) The loss diagram during the training process.

Table 3. Accuracy comparison of the proposed model


Algorithm            Accuracy
SMO                  76.76%
C4.5                 74.83%
3NN                  74.51%
WvEnSL3              83.49%
CNN model            87.65%
Proposed CNN model   89.00%
5 Conclusion and Future Work

Convolution Neural Networks are among the most favored and established deep learning algorithms for image processing and image classification tasks [19]. In this research, an algorithm is developed in order to classify and diagnose the presence of pneumonia from a collection of chest X-ray images. The developed CNN model is trained and evaluated on chest X-ray images to detect the presence of pneumonia. The proposed CNN model is compared with other machine learning and CNN methods. The simulation results show that the proposed CNN model is capable of achieving more accurate results than the existing algorithms. This research work can be extended to other chest-related diseases such as effusion, pleurisy, infiltration, bronchitis, nodule, atelectasis, pericarditis, cardiomegaly, pneumothorax, and fractures.

References
1. World Health Organization: Household Air Pollution and Health [Fact Sheet]. WHO, Geneva (2018). https://www.who.int/news-room/fact-sheets/detail/household-air-pollution-and-health
2. Rudan, I., Tomaskovic, L., Boschi-Pinto, C., Campbell, H.: Global estimate of the incidence of clinical pneumonia among children under five years of age. Bull. World Health Organ. 82, 895–903 (2004)
3. Narasimhan, V., Brown, H., Pablos-Mendez, A., et al.: Responding to the global human
resources crisis. Lancet 363(9419), 1469–1472 (2004). https://doi.org/10.1016/s0140-6736
(04)16108-4
4. Naicker, S., Plange-Rhule, J., Tutt, R.C., Eastwood, J.B.: Shortage of healthcare workers in
developing countries. Africa Ethn. Dis. 19, 60 (2009)
5. Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image
segmentation. In: Navab, N., Hornegger, J., Wells, W., Frangi, A. (eds.) International
Conference on Medical image computing and computer-assisted intervention. Springer,
Cham (2015)
6. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation (2015). https://arxiv.org/abs/1511.00561
7. Mortazi, A., Karim, R., Rhode, K., Burt, J., Bagci, U.: CardiacNet: segmentation of left atrium and proximal pulmonary veins from MRI using multi-view CNN. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D., Duchesne, S. (eds.) Medical Image Computing and Computer-Assisted Intervention, MICCAI 2017. Springer, New York (2017)
8. National Institutes of Health Chest X-Ray Dataset. https://www.kaggle.com/nih-chest-xrays/datasets. Accessed 30 Aug 2020
9. Platt, J.: Advances in Kernel Methods—Support Vector Learning. MIT Press, Cambridge
(1998)
10. Quinlan, J.: C4.5: Programs for Machine Learning, Morgan Kaufmann, San Francisco
(1993)
11. Aha, D.: Lazy Learning. Kluwer Academic Publishers, Dordrecht (1997)
12. Kermany, D.S., Goldbaum, M., Cai, W. et al.: Identifying medical diagnoses and treatable
diseases by image-based deep learning. Cell 172(5), 1122–1131 (2018)
13. Antin, B., Joshua, K., Martayan, E.: Detecting pneumonia in chest X-Rays with supervised
learning. Semanticscholar.Org (2017)
14. Rajpurkar, P., Irvin, J., Zhu, K. et al.: Chexnet: Radiologist-level pneumonia detection on
chest x-rays with deep learning. arXiv preprint arXiv: 1711.05225 (2017)
15. Park, M., Jin, J.S., Wilson, L.S.: Detection of abnormal texture in chest X-rays with
reduction of ribs. In: Proceedings of the Pan-Sydney area workshop on Visual information
processing (2004)
16. Ragab, D.A., Sharkas, M., Marshall, S., Ren, J.: Breast cancer detection using deep convolutional neural networks and support vector machines. PeerJ 7, e6201 (2019)
17. Choudhari, S., Seema, B.: Artificial neural network for skin cancer detection. Int. J. Emerg.
Trends Technol. Comput. Sci. (IJETTCS) 3(5), 147–153 (2014)
18. Livieris, I., Kanavos, A., Tampakas, V., et al.: A weighted voting ensemble self-labeled
algorithm for the detection of lung abnormalities from X-rays. Algorithms 12(3), 64 (2019)
19. Yamashita, R., Nishio, M., Togashi, K., et al.: Convolutional neural networks: an overview and application in radiology. Insights Imaging 9(4), 611–629 (2018)
20. Omar, H.S., Babalık, A.: Detection of Pneumonia from X-Ray Images using Convolutional
Neural Network. Proceedings Book, p. 183 (2019)
21. Abiyev, R.H., Ma’aitah, M.K.S.: Deep convolutional neural networks for chest diseases
detection. J. Healthc. Eng. 2018 (2018)
22. Naranjo-Torres, J., Mora, M., Hernández-García, R., Barrientos, R.J., Fredes, C.,
Valenzuela, A.: A review of convolutional neural network applied to fruit image processing.
Appl. Sci. 10(10), 3443 (2020)
23. Han, F., Yao, J., Zhu, H., Wang, C.: Underwater Image Processing and Object Detection
Based on Deep CNN Method. J. Sensors 2020 (2020)
24. Alazab, M., Shalaginov, A., Mesleh, A., et al.: COVID-19 prediction and detection using
deep learning. Int. J. Comput. Inf. Syst. Ind. Manage. Appl. 12, 168–181 (2020)
25. Chakraborty, S., Aich, S., Sim, J.S., Kim, H.C.: Detection of pneumonia from chest x-rays
using a convolutional neural network architecture. Int. Conf. Future Inf. Commun. Eng. 11
(1), 98–102 (2019)
26. Anthimopoulos, M., Christodoulidis, S., Ebner, L., Christe, A., Mougiakakou, S.: Lung
pattern classification for interstitial lung diseases using a deep convolutional neural network.
IEEE Trans. Med. Imaging 35(5), 1207–1216 (2016)
An Analytical Study of Influencing
Factors on Consumers’ Behaviors
in Facebook Using ANN and RF

Shahadat Hossain1,2, Md. Manzurul Hasan1, and Tanvir Hossain2

1 American International University-Bangladesh (AIUB), Dhaka, Bangladesh
manzurul@aiub.edu
2 City University, Dhaka, Bangladesh
hossain shahadat92@outlook.com, tanvir.cse@cityuniversity.edu.bd

Abstract. This study looks at factors that affect consumers' intentions to buy online, especially from Facebook. We examine the impact of these factors and analyze how they influence consumers to purchase products from Facebook. Specifically, we observe consumer behavior from different viewpoints: some are related to psychology, and some are relevant to the experiences of consumers. We emphasize the analysis of the intentions that work behind the consumption of any product from a Facebook page or group, in an analytical study in which the contributions of all assumptions are investigated and reported. We gather the perceptions of 505 people regarding buying products from Facebook pages or groups. In terms of relative contributions, we present two models and evaluation metrics that indicate the accuracy of those models in predicting consumers' purchases from Facebook pages or groups.

Keywords: Data mining · Machine learning (ML) · Artificial Neural Network (ANN) · Random Forest (RF) · Consumer behavior

1 Introduction

Electronic Commerce (e-commerce) mostly focuses on web-based product buying and selling [4]. Nowadays, however, the main thought of e-commerce is divided into two parts: one is e-commerce, and the other is Facebook commerce (F-commerce) [20]. Hence, several small businesses operate only through F-commerce, even when they have an e-commerce website. Our primary focus is to identify these small market aspects and how consumer behavior can play a vital role in F-commerce. This study unfolds the direction of consumer interests in buying goods from F-commerce groups or pages.

Consequently, the challenge faced by all marketers today is how to influence consumers' purchasing behavior in favor of their products or services [3]. Knowledge of purchasing behavior, therefore, sheds light on how consumers think, feel, negotiate, and choose between current alternatives
(e.g., brands, goods, and retailers), as well as on the consumer environment


(e.g., society, friends, media). As a large number of people use FB and most
of the small businesses are operating in FB, this study focuses on FB among
different social media. Hawkins et al. [11] stated “All marketing decisions are
based on assumptions and knowledge of consumer behavior”. For acquiring the
expectations of consumers towards a small market place like F-commerce, we
need to know about consumers’ psychology in the context of buying any product.
We illustrate the effect and apply the ANN and RF decision tree algorithm in
experiments on real-world dataset from the Facebook groups, pages, and poten-
tial consumers. The visualizations of how expectations are influenced by partic-
ular acts or sequence of actions carried out by consumers have been shown in
this study. Main objectives of our total study are as follows.

– To explain the reasons behind buying something from Facebook pages and
groups.
– To construct ANN and RF decision tree models from the dataset for predic-
tion.
– Lastly, to show what factors are driving consumers to purchase more.

The rest of this paper is organized as follows. In Sect. 2, we include relevant works on the study of consumer behavior analysis, in particular using machine learning. Section 3 illustrates the methodology. In Sect. 4, we describe the experiment with the models. We show the results and observations in Sect. 5 and discuss future work in Sect. 6 before we conclude.

2 Related Works

Heijden et al. [21] studied 228 online consumers and tried to determine consumers' intentions to purchase online products. This research presented an empirical study of a conceptual model, adapted from Fishbein et al. [8], from which we can understand the types of factors that affect the actions of consumers. The foundation of this model is a relationship between online buying behavior and online buying intention; in our study, this model is the base for defining the purchase intentions of online consumers. There are some issues in the sense of a competitive market and some critical factors that attract consumers to purchase, of which three main factors are usability, interactivity, and psychological matters in online shops [6]. Kumar [16] gave a summary of the effects of advertising actions on consumers. Besides that, the elements of consumer decision-making were identified by Khaniwale [14], who discussed the internal and external factors of purchasing products.

Artificial Intelligence (AI)'s advancement helps business enterprises analyze consumers' behaviors to another degree. Business organizations strongly emphasize scrutinizing consumers' feedback to make the best possible decisions for their businesses. As a result, in recent years, many research works have been carried out on digital signage [18], shopping atmosphere [7], and consumers'
reactions [1,9,15,17]. Alhendawi et al. [1] applied ANN to develop an expert system for predicting a Management Information System (MIS)'s characteristics by reviewing end users' comments. Besides developing AI-based tools, much research is being conducted on framing consumers' emotions; for example, Kumar et al. [15] analyzed the emotions in consumers' reviews on Amazon web resources with a combination of Natural Language Processing (NLP) and Machine Learning (ML) models.

Recently, deep learning has been applied to plenty of consumer-reaction investigation tasks. Gloor et al. [9] developed a system tool named Tribefinder by analyzing consumer behavior on Twitter. The crucial contribution of that research is that the tool can categorize consumers by applying word embedding and LSTM1 RNN2 on their tweets into three macro-categories: alternative realities, lifestyles, and recreations. Lang et al. [17] applied RNNs on consumers' action history data collected from Europe's renowned fashion platform Zalando. In this collaboration, they predicted the probability of consumers' orders without explicit feature engineering and achieved a better result than other machine learning algorithms.

3 Methodology
In this section, the most significant steps of this research are described. First, the data have been collected, then engineered and divided into training and testing data. After that, the training dataset has been fitted with the proposed models. Finally, the testing dataset has been predicted through the RF and ANN models. The full work-flow, from the start to the prediction on testing data using the models, is shown in Fig. 1.

Fig. 1. Work-flow of the analysis.

1 LSTM: Long Short-Term Memory.
2 RNN: Recurrent Neural Network.
3.1 Sample Collection and Dataset

Our samples have been collected from Facebook buying and selling groups of authentic Facebook e-commerce consumers. The responses were collected through a Google form, and the participation of 505 people was finally recorded from the survey. Attributes were chosen based on the parameters required for extracting consumer behavior: we chose the survey questions based on the conceptual model of Fishbein et al. [8], and the attributes are derived from those questions and from the responses of the participants. The total number of dataset attributes is 22, with the number of instances being 505.

Table 1. Participants' profile sample (N = 505)

Question                     Count   Percentage
Age
  11-30                      477     95%
  31-60                      25      4%
  ≤90                        03      1%
Gender
  Male                       419     83%
  Female                     83      16%
  Other                      03      1%
Time spent in FB
  ≤ 2 h                      69      14%
  ≥ 2 h                      436     86%
Years of activation in FB
  1 year                     05      1%
  2 years                    08      2%
  3 years                    18      3%
  4 years & more             474     94%
Buy anything from FB
  Buy from FB                294     59%
  Not buy from FB            211     41%

Table 2. Influencing factors (IF) of participants from FB

Factors                      Positive   Negative
Not sure                     36         469
Having less time             47         458
Influenced by others         82         423
Special discount and offer   154        351
Page Ratings                 139        366
Page Advertisements          92         413
The profiles of the participants are described in Table 1, and the influencing factors (IF) that work behind the participants' purchases from FB pages or groups are described in Table 2.

3.2 Random Forest Algorithm


A decision tree [19] is a classifier expressed as a recursive partition of the instance space. From the concept of the decision tree, T. K. Ho developed a powerful ML algorithm named Random Forest [13]. A single decision tree has low bias with high variance, whereas RF has low variance with low bias: rather than having one decision tree, we have multiple decision trees in RF.

In the Random Forest algorithm, feature and row samples d (d1, d2, d3, ..., dn) are selected from the training dataset D, where d < D, to construct the decision trees; the individual feature and row samples need not be identical. For each d there is a decision tree model M (M1, M2, M3, ..., Mn). After applying bagging on M, the final prediction is made. In the case of regression, RF chooses the mean or median of the individual predictions as the final result. In comparison with single-tree classifiers, RF shows better performance than C4.5 or J48 [5]. In terms of generalization error and resistance to noise, RF does better, though it makes computation slower.

3.3 Artificial Neural Network


Artificial Neural Network (ANN) [10] is an information-processing paradigm, inspired by the human brain, that learns from the experiences of its environment. In an ANN, multiple layers are connected to create a network, where each layer consists of different units. The first and last layers are the input and output layers, and the other layers are called hidden layers. A unit of a particular layer is forwardly connected to the units of the next layer; by following this process, the network is formed from the input layer to the output layer.

The connection between two units creates an edge, and a randomly initialized weighted value passes from one layer to another through the units. Each unit's value is computed from an inner calculation of the connected units' values and the weight matrix between the two layers; the value then passes through the unit's activation function to yield its final output. After the final layer's value is calculated, a backpropagation algorithm is applied to update the weights so that the network can learn as correctly as possible [12].
The general equation for an input layer ($x_{i-1}$) and the corresponding output layer ($x_i$) is given below:

$$x_i = \sigma\left(\sum_{k=1}^{n} W_{i-1} \cdot x_{i-1} + b\right), \quad k = 1, 2, 3, \dots, n \ \text{(layer index)}$$

Here the symbol $\sigma$ denotes the sigmoid activation function, $W$ represents the weights between two units of the connected layers $(k-1, k)$, and $b$ represents the bias unit.
4 Experiment with Models


In data preprocessing, the data have been converted from text to numerical values. The categorical features age, gender, and profession have then been removed for the experiment. Next, the dataset is shuffled 150 times before splitting. The data have been divided into training and testing datasets, where 67% is the training dataset and 33% is the testing dataset.

Using sklearn (scikit-learn), a Python library for predictive data analysis, we apply RF to our dataset. Random Forest constructs 20 individual decision trees from the given dataset. Each node split is based on the answer to a question, and the question asked is based on the value of a feature; from the question's answer, a data point moves down along the tree, and no question is asked at a leaf node. Of the 20 trees, we show one tree as an example (Fig. 2). In the tree, each node contains a value 'gini', the Gini impurity of the randomly chosen samples in that node [2]. As we move down the tree (Fig. 2), the average weighted Gini impurity decreases [2]. 'Samples' contains the number of observations in the node, and 'class' is the majority classification of the points in the node, which is the prediction for all samples in that node; the leaf nodes give the final prediction. A minimal code sketch of this setup is given after Fig. 2.

Fig. 2. A proposed random forest decision tree model.
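A minimal scikit-learn sketch of this RF setup; the placeholder data stand in for the survey features, and parameters not stated in the text are library defaults:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(505, 18))   # placeholder for the survey features
y = rng.integers(0, 2, size=505)         # placeholder bought/not-bought labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, shuffle=True)  # 67%/33% split, shuffled as in the text

rf = RandomForestClassifier(n_estimators=20,   # 20 decision trees
                            criterion="gini")  # node splits scored by Gini impurity
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print("accuracy:", accuracy_score(y_test, y_pred))
```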

In our proposed ANN model (Fig. 3), after the data pre-processing and cross-validation, we feed the 18 feature values and a bias as the input to the ANN; the dimension of the input layer is R^19. The hidden layer consists of 4 units that are forwarded, with another bias unit, to the output unit, so the dimensions of the hidden layer and output layer are R^5 and R^1. We experiment using Keras, a Python deep learning library with TensorFlow working in the back-end. For training, we run the model for 300 epochs with a learning rate of 0.01. To minimize the error, we use the binary cross-entropy loss function and the stochastic gradient descent optimizer. Finally, the testing data are predicted through the trained model; a corresponding code sketch is given after Fig. 3.
Fig. 3. Proposed Artificial Neural Network model.
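A minimal Keras sketch of the network in Fig. 3, assuming sigmoid activations throughout; Keras adds bias terms automatically, which stands in for the explicit bias units described above:

```python
from tensorflow.keras import layers, models, optimizers

model = models.Sequential([
    layers.Input(shape=(18,)),              # 18 engineered feature values
    layers.Dense(4, activation="sigmoid"),  # hidden layer with 4 units
    layers.Dense(1, activation="sigmoid"),  # bought / not-bought output
])
model.compile(optimizer=optimizers.SGD(learning_rate=0.01),  # lr = 0.01
              loss="binary_crossentropy",   # binary cross-entropy loss
              metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=300)  # 300 training runs, as in the text
```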

5 Results and Observation


After fitting the training data to the RF and ANN models, the test data have been predicted through the models, and the results are represented in the confusion matrices.

Table 3. Confusion matrix of RF and ANN (Bought = B, Not Bought = NB)

                         (actual)      RF (actual)    ANN (actual)
                         B      NB     B      NB      B      NB
Predicted value    B     TP     FP     118    2       119    1
                   NB    FN     TN     1      81      2      80

Table 3 illustrates a confusion matrix consisting of the True Positive (TP), False Positive (FP), False Negative (FN), and True Negative (TN) values on the test data.

For measuring the RF model's evaluation metrics, the scikit-learn library has been used, while TensorFlow provides them for the ANN model. Both models have achieved robust results in terms of accuracy, precision, sensitivity, and specificity.
Table 4. Result table

              Random Forest Decision Tree   Artificial Neural Network
Accuracy      98%                           98%
Precision     98%                           99%
Sensitivity   99%                           98%
Specificity   98%                           99%

(a) Bar chart for bought from FB pages under following buy and sell FB groups: real vs. prediction
(b) Bar chart for bought from FB pages under special discount and offer: real vs. prediction
(c) Bar chart for bought from FB pages under using of FB during study or work: real vs. prediction
(d) Bar chart for bought from FB pages under the factor (not buying due to lack of trust): real vs. prediction

Fig. 4. Different bar chart analysis of predicted data.

According to Table 4, both models have acquired the same 98% accuracy. The ANN model has performed best in the precision and specificity metrics with 99%, whereas the RF model has attained 98% in both metrics. On the other hand, the RF model has achieved the best score in the sensitivity metric, which is 99%; for the ANN model, it is 98%.

After applying the ANN model, the predicted values have been plotted using matplotlib, a Python plotting library, as illustrated in Fig. 4. On the x-axis, attributes are taken to compare and analyze the result; the y-axis shows the count of the values of that attribute. Figure 4a shows that some participants follow FB buying and selling groups and have a positive intention to buy from FB pages. Another observation is that some participants do not follow FB groups or pages but still buy from them. So, if a consumer follows any Facebook group or page, the probability of buying from F-commerce is high. According to Fig. 4b, some participants intend to buy from Facebook pages if there is a special discount or offer, while some consumers buy from Facebook pages even without a special discount or offer. Figure 4c reveals how much the participants are involved with technology. Figure 4d shows that if a consumer has trust issues with an FB page, they buy nothing from Facebook; some consumers have no issues with the reliability of Facebook pages but still have no intention to purchase. The same type of outcome has been observed for the questions on reliability, bad experience, and slow response from the Facebook pages.

6 Future Research Directions and Conclusion

Although our proposed systems achieved robust results, they have some limitations, which suggest future research directions. Our research focuses only on Facebook users' activities on F-commerce, which leaves out the influence of other social media. In future work, we would like to specify how and why some classes of consumers are persuaded to buy products from social media frequently.

In this research work, we have proposed two machine learning models to predict consumers' buying intentions from Facebook groups or pages. We have applied RF and ANN for the prediction, which has given significant results and improved the understanding of purchase intentions on Facebook. We aim to expand this research and analyze individual tastes more accurately. We hope this research will help open up a more sophisticated study area for other researchers on consumers' behaviors to achieve significant outcomes.

References
1. Alhendawi, K.M., Al-Janabi, A.A., Badwan, J.: Predicting the quality of mis char-
acteristics and end-users’ perceptions using artificial intelligence tools: expert sys-
tems and neural network. In: Proceedings of the International Conference on Intel-
ligent Computing and Optimization, pp. 18–30. Springer (2019)
2. Ali, J., Khan, R., Ahmad, N., Maqsood, I.: Random forests and decision trees. Int.
J. Comput. Sci. Issues (IJCSI) 9(5), 272 (2012)
3. Ballantyne, D., Varey, R.J.: The service-dominant logic and the future of market-
ing. J. Acad. Mark. Sci. 36(1), 11–14 (2008)
4. Bhat, S.A., Kansana, K., Khan, J.M.: A review paper on e-commerce. Asian J.
Technol. Manage. Res. [ISSN: 2249–0892] 6(1), 16–21 (2016)
5. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
6. Constantinides, E.: Influencing the online consumer’s behavior: the Web experi-
ence. Internet Res. 14(2), 111–126 (2004)
7. Eroglu, S., Machleit, K., Davis, L.: Atmospheric qualities of online retailing: a
conceptual model and implications. J. Bus. Res. 34, 177–184 (2001)
8. Fishbein, M., Ajzen, I.: Predicting and changing behavior the reasoned action
approach (2015). OCLC: 1190691560
9. Gloor, P., Colladon, A.F., de Oliveira, J.M., Rovelli, P.: Put your money where your mouth is: using deep learning to identify consumer tribes from word usage. Int. J. Inf. Manage. 51, 101924 (2020)
10. Hassoun, M.H.: Fundamentals of Artificial Neural Networks. MIT Press, Cambridge (1995)
11. Hawkins, D.I., Mothersbaugh, D.L.: Consumer Behavior: Building Marketing
Strategy, 13th edn. McGraw-Hill Education, New York (2016)
12. Hecht-Nielsen, R.: Theory of the backpropagation neural network. In: Neural Networks for Perception, pp. 65–93. Elsevier (1992)
13. Ho, T.K.: The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 20(8), 832–844 (1998)
14. Khaniwale, M.: Consumer buying behavior. Int. J. Innov. Sci. Res. 14(2), 278–286
(2015)
15. Kumar, P.K., Nandagopalan, S., Swamy, L.N.: Investigation of emotions on pur-
chased item reviews using machine learning techniques. In: Proceedings of the
International Conference on Intelligent Computing & Optimization, pp. 409–417.
Springer (2018)
16. Kumar, R.: Consumer behaviour and role of consumer research in marketing. J.
Commer. Trade 12(1), 65–76 (2017)
17. Lang, T., Rettenmeier, M.: Understanding consumer behavior with recurrent neu-
ral networks. In: Proceedings of the Workshop on Machine Learning Methods for
Recommender Systems (2017)
18. Ravnik, R., Solina, F., Zabkar, V.: Modelling in-store consumer behaviour using
machine learning and digital signage audience measurement data. In: Proceedings
of the International Workshop on Video Analytics for Audience Measurement in
Retail and Digital Signage, pp. 123–133. Springer (2014)
19. Rokach, L., Maimon, O.: Top-down induction of decision trees classifiers–a survey.
IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 35(4), 476–487 (2005)
20. Siregar, E.: Young attitudes towards F-commerce advertising. In: Proceedings of the International Conference on Industrial Technology and Management (ICITM), pp. 218–223 (2018)
21. van der Heijden, H., Verhagen, T., Creemers, M.: Understanding online purchase
intentions: contributions from technology and trust perspectives. Eur. J. Inf. Syst.
12(1), 41–48 (2003)
Autism Spectrum Disorder Prognosis
Using Machine Learning Algorithms: A
Comparative Study

Oishi Jyoti1, Nazmin Islam1, Md. Omaer Faruq1, Md. Abu Ismail Siddique1, and Md. Habibur Rahaman2

1 Department of Electrical and Computer Engineering, Rajshahi University of Engineering and Technology (RUET), Rajshahi, Bangladesh
oj.ruet@gmail.com, nazmin.soroney07@gmail.com, omaerfaruq0@gmail.com, saif101303@gmail.com
2 Department of Electrical and Computer Engineering, Memorial University of Newfoundland, St. John's, Canada
mhrahaman@mun.ca

Abstract. Autism Spectrum Disorder (ASD) is a condition of impaired brain development that interferes with the normal way of life, including communication, behavior, and sensory ability. When treated as a behavioral disease, autism can be detected at an early stage with proper advanced methods. The screening test is one of the approved processes for detecting Autism Spectrum Disorder (ASD), but it is time-consuming as well as expensive. Using intelligent retrieval and neural-based algorithms, autism can be identified with great efficiency and precision. Different models have been developed using this advanced technology in that context, but there is still scope for improvement. In this paper, a set of machine learning methods based on Deep Neural Network (DNN), Support Vector Machine (SVM), and K-Nearest Neighbor (KNN) is introduced for predicting autism at any age with higher accuracy and speed. The methods were trained on the 10 autism-spectrum quotient (AQ) questions and several features that can reveal the state of function of mind and behavior. The proposed model shows better accuracy than previous work, and the paper also illustrates a comparison between the outcomes of the used models.

Keywords: Autism Spectrum Disorder (ASD) · Autism Spectrum Quotient (AQ) · Support Vector Machine (SVM) · K-Nearest Neighbor (KNN) · Deep Neural Network (DNN)

1 Introduction
Autism Spectrum Disorder (ASD) refers to a disturbance in neurological as well as genetic development leading to difficulties in interaction and rudimentary tasks. Autism patients struggle with verbal communication, sensing, learning, and other chores that an average child can do easily.
Furthermore, they suffer from different diseases and syndromes. Nowadays, ASD is rising at a high rate, driven by different aspects: apart from genetic disorder, environmental factors also play an important role in forming autism or exacerbating its features.

The escalating rate of autism can be noted from statistical analysis. According to the World Health Organization (WHO), one in every 160 children has ASD [1]. In 2020, Hong Kong had the highest rate of ASD-affected children, 372 per 10,000 children, while in the United States this rate is 222 per 10,000 children [2].

Autism Spectrum Disorder (ASD) affects people of different ages. Autism can be detected at an early age, which provides an advantage for treatment. No permanent cure has been introduced so far, but intensive care can improve the situation and help patients learn fundamental skills so that, in the future, they can have a decent life.

However, improving the condition depends on early detection, and early detection relies on advanced models. With machine learning, such advanced models with accurate, effective prediction can be built for individuals. Our work focuses on building an autism prediction model using machine learning and bringing out the best result with moderate speed. Apart from this, a comparison between three models generates brief knowledge about their functioning as well as guidance for selecting the most effective one.

The remaining portion of the paper is arranged as given below. Section 2 reviews previous work, and the proposed framework is illustrated in Sect. 3. Section 4 demonstrates the model progression, and Sect. 5 includes the evaluation of performance and results. The conclusion of the proposed work is given in Sect. 6.

2 Literature Review
In recent times, many algorithms and models using machine learning have been introduced to predict Autism Spectrum Disorder (ASD), achieving different accuracies.

In [3], Raj et al. introduce six models to predict the autism disorder using three datasets containing child, adolescent, and adult autism traits; accuracies of 99.53%, 98.30%, and 96.88% are the highest for the respective stages. Erkan et al. use three models in [4], including KNN, SVM, and Random Forest (RF), where an accuracy of 100% is achieved for the RF-based model. Deep embedding representation is proposed as a validated model with 99% sensitivity and specificity in [5].

Lerthattasilp et al. in [6] illustrate a multivariate logistic regression analysis to develop the prediction model, with datasets acquired from a standardized diagnostic tool. In [7], Hyde et al. review previous work on ASD prediction using supervised machine learning and investigate the clinical and statistical approaches to data mining.

According to the authors of [8], machine learning is applied for the first time to investigate kinematic movement traits with hand gestures in detecting autism.
Baranwal et al. propose in [9] machine learning, including several models, to develop the desired model with the best accuracy on an ASD screening dataset. In [10], Ghosh et al. use the Random Forest model as a predictive tool trained on a behavioral and individual dataset, which yields 99.46% accuracy. Several models, along with a linear discriminant function, are used in [11] by Geng et al., where sensitivity and specificity vary from 20%–100% and 48%–100%, respectively. Omar et al. illustrate a comparative study of several models, including Random Forest-CART and Random Forest-ID3, and develop a diagnostic model for autism in [12].

As the previous work shows, there are various predictive models with diverse accuracies. Though many researchers have used the same models to build predictive tools, their achieved accuracies and approaches differ. In this paper, each of the three models used succeeds in bringing out better accuracy than previous work.

3 Proposed Framework

The proposed models using neural-based algorithms, e.g., Deep Neural Network, Support Vector Machine (SVM), and K-Nearest Neighbor (KNN), are developed in several phases: data overview, data pre-processing, data analysis, prediction model development, and evaluation of the prediction model. The flowchart of the framework is given in Fig. 1.

Fig. 1. Framework of the proposed method, containing the steps: ASD dataset, data preprocessing, splitting of the data, building three models using the DNN, SVM, and KNN algorithms, and evaluation of performance.
3.1 Dataset Overview


For training the models, an Autism Spectrum Disorder dataset is used in which 704 adults aged 18 and above performed the screening test, which includes some basic questions to be answered by the participants. The dataset is a public dataset derived from Kaggle. The number of attributes is 21, among which ten attributes are the questionnaire items of the AQ-10 screening, and the others include age, gender, ethnicity, whether the person was born with jaundice, etc. The questions are mainly yes/no questions for ease of understanding. Apart from the AQ-10 questions, age, and result, which are Boolean or numeric, the other features are string type. Class/ASD, the dependent variable, indicates whether an adult has ASD or not. A short description of the dataset is given in Table 1.

3.2 Data Pre-processing


In the general ASD dataset, there were some imbalances in different traits. Several null values were found; these have been handled by the listwise deletion method, and in some places the null values have been replaced with the mean value. The Boolean attributes have been transformed to 0 and 1 for no and yes, respectively. The ID, country of residence, and age_desc attributes appeared to have little impact on the classification, as analyzed by the Decision Tree algorithm; dropping those columns has therefore given a better result. Alternating numeric values have been given to string attributes in descending order. A minimal sketch of these steps is given below.
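A minimal pandas sketch of the preprocessing; the file name and column names are hypothetical stand-ins for the Kaggle adult screening data:

```python
import pandas as pd

df = pd.read_csv("autism_screening.csv")  # assumed file name
# Drop the low-impact columns identified by the Decision Tree analysis.
df = df.drop(columns=["ID", "country_of_res", "age_desc"], errors="ignore")
df = df.dropna(subset=["ethnicity"])            # listwise deletion of null rows
df["age"] = df["age"].fillna(df["age"].mean())  # mean imputation for age
for col in df.select_dtypes("object"):
    values = set(df[col].dropna().unique())
    if values <= {"yes", "no"}:
        df[col] = df[col].map({"no": 0, "yes": 1})      # Boolean -> 0/1
    else:
        df[col] = df[col].astype("category").cat.codes  # string -> numeric codes
```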

Table 1. Overview of the dataset with the attribute names and attribute descriptions.

Attribute name    Attribute description
AQ1-AQ10          10 questions to investigate autism in adults, given as binary
age               Age of adults tested by the autism screening tool
gender            Gender of volunteers, male or female
ethnicity         Nation or identity, stated as string
jaundice          Whether the volunteer was born with jaundice
autism            Whether any family member has autism traits
used_app_before   Whether the volunteer has used the application before
result            Score of the screening test, given as integer
Class/ASD         Whether the volunteer has autism traits or not

3.3 Data Analysis


Analyzing the dataset reveals that there is no specific relationship between being autistic and being born with jaundice. Furthermore, having a relative diagnosed with autism does not affect the probability of being detected with autism, and being identified as autistic does not depend on previous use of the screening application. A score of 7 or more than 7 certainly results in an ASD-affected adult. These traits have been noted from the ASD dataset.
3.4 Prediction Model Development

Following the pre-processing, three machine learning algorithms, which are effectual and fast, have been applied to generate the desired models. The DNN-based model has been analyzed with different activation functions to evaluate the most efficient one, where the rectified linear unit shows the best outcome. For decision making based on SVM, the radial basis function kernel has been used to draw the boundary. While working with the distances between test data and training data in KNN, the Euclidean method has been designated as the metric. Throughout, the models have been tuned to gain better accuracy and precision.

3.5 Evaluating the Prediction Model

The main focus of the research is to determine the preferable prediction model for Autism Spectrum Disorder. For this, three efficient machine learning models have been developed, trained on the same dataset. Furthermore, a comparative study of the performance of each method regarding prediction and precision has been carried out, which upholds their effectiveness so that choosing a model for early diagnosis becomes easier. Among these models, the Deep Neural Network has come up with the highest accuracy of 100%, which is, to our knowledge, the best outcome for a DNN so far compared with previous work.

4 Model Progression

To solve the diagnosis problem of Autism Spectrum Disorder, three machine learning architectures have been introduced in this paper. First, the Deep Neural Network model has been developed with different activation functions and various learning rates. Then the Support Vector Machine has been employed in the prediction process with the necessary modifications. Eventually, the K-Nearest Neighbor model has produced the predictive model with the least accuracy.

4.1 Prediction Model Based on DNN

In solving complex issues, the Deep Neural Network is gaining momentum day by day [13]. In the prediction of ASD, after pre-processing, the data has been split into 70% train and 30% test data, which means that among 704 instances, 493 samples have been used as the train dataset and 211 have been delegated as the test dataset. The network has been developed with two hidden layers for better function, and both layers consist of 12 nodes each. For the initialization of weights, the Glorot Uniform initializer has been introduced. The loss function evaluates the similarity between the predicted scores and the ground truth labels in the training data [14]. In the backpropagation stage, the loss has been reduced by the Batch Gradient Descent algorithm, and Mean Squared Error (MSE) has been delegated as the loss function. The epochs and learning rate for training the model are 10,000 and 0.01 respectively, and four activation functions, Hyperbolic tangent (tanh), Rectified linear unit (ReLU), Softplus and Swish, have been evaluated. The highest accuracy gained from this model is 100%. The comparison of accuracy according to activation functions is shown in Table 2.
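As an illustration, the network described above could be sketched in Python with Keras roughly as follows. This is a minimal sketch under the stated settings (two hidden layers of 12 nodes, Glorot uniform initialization, MSE loss, learning rate 0.01, 70/30 split), not the authors' code; the input files and exact API choices are assumptions.

import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow import keras

# Hypothetical pre-processed feature matrix X and binary labels y.
X, y = np.load("features.npy"), np.load("labels.npy")

# 70/30 split as described in the text.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30)

model = keras.Sequential([
    # Two hidden layers of 12 nodes each with Glorot uniform initialization.
    keras.layers.Dense(12, activation="relu", kernel_initializer="glorot_uniform"),
    keras.layers.Dense(12, activation="relu", kernel_initializer="glorot_uniform"),
    keras.layers.Dense(1, activation="sigmoid"),  # binary output: ASD or not
])

# MSE as the loss function, optimized with gradient descent at learning rate 0.01.
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01),
              loss="mse", metrics=["accuracy"])

# Batch gradient descent: the batch size equals the full training set.
model.fit(X_tr, y_tr, epochs=10000, batch_size=len(X_tr), verbose=0)
print("test accuracy:", model.evaluate(X_te, y_te, verbose=0)[1])

Swapping the activation argument between "tanh", "softplus", "relu", and the Swish function would reproduce the comparison reported in Table 2.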

Table 2. Comparison between gained accuracy and F1-score of activation functions of DNN.

Activation function           Accuracy  F1-score
Hyperbolic tangent (tanh)     92.9%     93%
Softplus                      99.5%     99%
Rectified linear unit (ReLU)  100%      100%
Swish                         98.3%     99%

4.2 Prediction Model Based on SVM

The Support Vector Machine algorithm has, based on analytical studies, become popular in the machine learning research area because of its good generalization and high accuracy [15]. For linear or non-linear problems, the Support Vector Machine, a simple and fast algorithm, is very effective. Regarding ASD prediction, following the pre-processing of the dataset using a simple imputer to impute the missing values with the mean value, the dataset has been split into 70% train and 30% test data, as for the DNN.
A random-state parameter of 42 has been used to seed the generation of random numbers drawn from the probability distribution. The radial basis function (RBF) kernel has been used for its easy calibration and practice. The RBF kernel can be defined as:

$$K(x, x') = \exp\left(-\frac{\|x - x'\|^2}{2\sigma^2}\right) \qquad (1)$$
The accuracy achieved from this network is 98.11% which is the second-best
accuracy of the research.
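A corresponding scikit-learn sketch under the same assumptions (mean imputation, RBF kernel, 70/30 split, random_state=42); the input arrays are hypothetical:

import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = np.load("features.npy"), np.load("labels.npy")  # hypothetical inputs

# Impute missing values with the column mean using a simple imputer.
X_imp = SimpleImputer(strategy="mean").fit_transform(X)

# 70/30 split; random_state=42 seeds the random number generation.
X_tr, X_te, y_tr, y_te = train_test_split(X_imp, y, test_size=0.30,
                                          random_state=42)

clf = SVC(kernel="rbf")  # radial basis function kernel draws the boundary
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))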

4.3 Prediction Model Based on KNN

The K-Nearest Neighbor algorithm is a simple but effective machine learning classification method [16]. It has gained significance by exploiting the power of modern processors [17]. The main operation of KNN is to find the distances between a query and the examples, select a specific number of examples closest to the query, and choose the most common label among them. It completes its function in 4 steps. The ASD dataset has been split into 80% train and 20% test data at the pre-processing level. For finding the distance, the Euclidean metric, which is the most popular method, has been used with 6 neighbors. The Euclidean equation can be written as:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2} \qquad (2)$$

where $d$ denotes the distance and $p$, $q$ are the instances.


To weight all neighbors equally for better calculation, uniform weights have been initialized. Several algorithms can be used to compute the nearest neighbors; for compatible performance, the 'auto' algorithm has been used here. Following the fitting of the classifier, the model has achieved 97.16% accuracy, which is less than the other two described models.
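Analogously, the KNN setup described above (80/20 split, 6 neighbors, uniform weights, Euclidean metric, 'auto' neighbor search) can be sketched with scikit-learn; the input arrays are again hypothetical:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = np.load("features.npy"), np.load("labels.npy")  # hypothetical inputs

# 80/20 train/test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.20)

# Euclidean metric, 6 neighbors, uniform weights, automatic algorithm choice.
knn = KNeighborsClassifier(n_neighbors=6, weights="uniform",
                           algorithm="auto", metric="euclidean")
knn.fit(X_tr, y_tr)
print("test accuracy:", knn.score(X_te, y_te))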
The overall environment for building the models is based on Python, along with the sklearn, numpy, and pandas libraries. The programming was performed in Google Colab, an online platform for Jupyter notebooks.

5 Evaluating the Performance and Result

Autism Spectrum Disorder (ASD) can be stated as a developmental disorder because it limits social communication [18]. For this reason, an advanced detection tool is required.
The correlation illustrating the relationship between the features and variables of the ASD dataset is shown in Fig. 2.
According to the accuracy gained from the models, the DNN has become the most effective model, with 100% accuracy when working with the Rectified Linear Unit (ReLU) activation function as well as a 100% F1-score. Figure 3 shows the accuracy curve with a comparison of activation functions.
With 98.11% accuracy, the SVM-based model can also be regarded as an effective tool for predicting autism, since it improves on the 86.7% reported in previous work [7]. On the other hand, KNN, with 97.16% accuracy, may be less effective compared to the previous models, but in general it can also be fruitful. Figure 4 illustrates the accuracy curve of the model.
Table 3 represents the overall outline of accuracy obtained by the prediction
models.

Table 3. Value of accuracy attained from the used models.

Prediction model               Accuracy
Deep Neural Network (DNN)      100%
Support Vector Machine (SVM)   98.11%
K-Nearest Neighbors (KNN)      97.16%

Fig. 2. Relationships between the traits and variables of the dataset, as well as the dependency of features, are shown.

Fig. 3. Comparison graph of accuracy gained for the activation functions of the Deep Neural Network (DNN) model.

The estimation metrics used in this research are error rate, sensitivity, and specificity, which can be determined through the confusion matrix [19]. Each row of the confusion matrix represents the actual class, and each column the predicted class. True positives, true negatives, false positives, and false negatives are the instances of the matrix (Table 4).

Fig. 4. Testing and training accuracy gained from the KNN model, illustrated with the number of neighbors on the x-axis and accuracy on the y-axis.

Table 4. Confusion matrix

Predicted 0 Predicted 1
Actual 0 TN FP
Actual 1 FN TP

Using the instances of this matrix, different characteristics can be derived, and by analyzing them, aspects of the models can be established. Furthermore, it assists in assessing the performance and behavior of the program or model.
The gained values of TP, TN, FP, and FN from the confusion matrix of the
three machine learning models are given in Table 5.
The performance of the three models differs because of their different algorithms and processes. Figure 5 demonstrates the comparison between the accuracy, F1-score, sensitivity, and specificity of the Deep Neural Network, Support Vector Machine, and K-Nearest Neighbor methods, which are determined using the values of the traits of the confusion matrix and the corresponding equations.
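These metrics follow from the confusion-matrix counts by the standard definitions; a small illustrative sketch (the equations are standard, the function is not the authors' code):

def confusion_metrics(tp, tn, fp, fn):
    """Derive the evaluation metrics used above from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "error rate": 1 - accuracy,
            "sensitivity": sensitivity, "specificity": specificity, "f1": f1}

# Example with the SVM counts from Table 5: TP=54, TN=154, FP=1, FN=3.
print(confusion_metrics(54, 154, 1, 3))

With the SVM counts this gives an accuracy of (54 + 154)/212 ≈ 98.11%, matching Table 3.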

Table 5. Evaluation of instances of confusion matrix based on model

Confusion matrix  DNN  SVM  KNN
TP                54   54   40
TN                158  154  97
FP                0    1    1
FN                0    3    3

Fig. 5. Accuracy, f1-score, specificity and sensitivity analysis of DNN, SVM and KNN
models is shown in this figure.

As per the above discussion, the proposed system has developed a prediction model with supreme accuracy using DNN and achieved the principal goal. Furthermore, the fine precision and moderate run time make the model acceptable in the field of detecting autism among adults.

6 Conclusion

Autism is considered to be one of the growing disorders around the globe. ASD cannot be cured entirely, but an appropriate method can reduce the suffering and hardship of the affected person as well as his/her family and relatives. Keeping pace with the progression of the world, a constructive method must be introduced to predict autism accurately. The vast area of research in machine learning has opened up the opportunity to contribute in this field. In this paper, an accuracy of 100% using DNN has managed to attain the contemplated result. The three delegated models, i.e., Deep Neural Network, Support Vector Machine, and K-Nearest Neighbor, have delivered outstanding results throughout the research, by which detecting autism in an adult would be efficacious. However, the Deep Neural Network has secured the best outcome with preferable precision and, as a tool for predicting ASD successfully, it has become more effective than the others. In most respects, the work done in this paper has improved on previous results. For future work, along with DNN, other machine learning models will be analyzed and modified so that accuracy can be refined.

References
1. Autism spectrum disorders (2019). https://www.who.int/news-room/fact-sheets/
detail/autism-spectrum-disorders. Accessed 21 July 2020
2. Prevalence of autism spectrum disorder among children in select countries world-
wide as of 2020 (2020). https://www.statista.com/statistics/676354/autism-rate-
among-children-select-countries-worldwide/. Accessed 15 Aug 2020
3. Raj, S., Masood, S.: Analysis and detection of autism spectrum disorder using
machine learning techniques. Procedia Comput. Sci. 167, 994–1004 (2020)
4. Erkan, U., Thanh, D.N.: Autism spectrum disorder detection with machine learning
methods. Curr. Psychiatry Res. Rev. Formerly Curr. Psychiatry Rev. 15(4), 297–
308 (2019)
5. Wang, H., Li, L., Chi, L., Zhao, Z.: Autism screening using deep embedding rep-
resentation. In: International Conference on Computational Science, pp. 160–173.
Springer (2019)
6. Lerthattasilp, T., Tanprasertkul, C., Chunsuwan, I.: Development of clinical pre-
diction rule for diagnosis of autistic spectrum disorder in children. Mental Illness
(2020)
7. Hyde, K.K., Novack, M.N., LaHaye, N., Parlett-Pelleriti, C., Anden, R., Dixon,
D.R., Linstead, E.: Applications of supervised machine learning in autism spectrum
disorder research: a review. Rev. J. Autism Dev. Disord. 6(2), 128–146 (2019)
8. Li, B., Sharma, A., Meng, J., Purushwalkam, S., Gowen, E.: Applying machine
learning to identify autistic adults using imitation: an exploratory study. PloS one
12(8), e0182652 (2017)
9. Baranwal, A., Vanitha, M.: Autistic spectrum disorder screening: prediction with
machine learning models. In: 2020 International Conference on Emerging Trends
in Information Technology and Engineering (ic-ETITE), pp. 1–7. IEEE (2020)
10. Shuvo, S.B., Ghosh, J., Oyshi, A.: A data mining based approach to predict autism
spectrum disorder considering behavioral attributes. In: 2019 10th International
Conference on Computing, Communication and Networking Technologies (ICC-
CNT), pp. 1–5. IEEE (2019)
11. Geng, X., Kang, X., Wong, P.C.: Autism spectrum disorder risk prediction: a sys-
tematic review of behavioral and neural investigations (2020)
12. Omar, K.S., Mondal, P., Khan, N.S., Rizvi, M.R.K., Islam, M.N.: A machine learn-
ing approach to predict autism spectrum disorder. In: 2019 International Confer-
ence on Electrical, Computer and Communication Engineering (ECCE), pp. 1–6.
IEEE (2019)
13. Yang, X., Li, F., Liu, H.: A comparative study of dnn-based models for blind image
quality prediction. In: 2019 IEEE International Conference on Image Processing
(ICIP), pp. 1019–1023. IEEE (2019)
14. Vasant, P., Zelinka, I., Weber, G.-W.: Intelligent Computing & Optimization,
vol. 866. Springer, Cham (2018)
15. Zhang, Y.: Support vector machine classification algorithm and its application. In:
International Conference on Information Computing and Applications, pp. 179–
186. Springer (2012)
16. Guo, G., Wang, H., Bell, D., Bi, Y., Greer, K.: KNN model-based approach in classification. In: OTM Confederated International Conferences "On the Move to Meaningful Internet Systems", pp. 986–996. Springer (2003)

17. Altay, O., Ulas, M.: Prediction of the autism spectrum disorder diagnosis with linear discriminant analysis classifier and k-nearest neighbor in children. In: 2018 6th International Symposium on Digital Forensic and Security (ISDFS), pp. 1–4. IEEE (2018)
18. Niu, K., Guo, J., Pan, Y., Gao, X., Peng, X., Li, N., Li, H.: Multichannel deep
attention neural networks for the classification of autism spectrum disorder using
neuroimaging and personal characteristic data. Complexity 2020 (2020)
19. Al Diabat, M., Al-Shanableh, N.: Ensemble learning model for screening autism
in children (2019)
Multidimensional Failure Analysis Based
on Data Fusion from Various Sources Using
TextMining Techniques

Maria Stachowiak1, Artur Skoczylas1(&), Paweł Stefaniak1, and Paweł Śliwiński2

1 KGHM Cuprum Ltd, R&D, gen. W. Sikorskiego Street 2-8, 53-659 Wrocław, Poland
askoczylas@cuprum.wroc.pl
2 KGHM Polska Miedź SA, M. Skłodowskiej-Curie str. 48, 59-301 Lubin, Poland

Abstract. Enterprise Asset Management (EAM) software is commonly used in large industrial enterprises. The main task of EAM is to optimize performance, extend the life cycles of equipment, and minimize downtime as well as operational costs. In a simple approach, EAM can be used only to support service and repair actions and their registration, without using a form to enter data. In this case, maintenance logs contain useful pieces of information; however, access to them is difficult due to the method of storage. There is a lack of standardization of records, several activities in one record that relate to different machine components, the presence of spelling errors, jargon, mental shortcuts, etc. In such situations, further analysis of maintenance data is very onerous and comes down to many time-consuming manual operations, especially if the analyses involve combining these records with other data sources. Their automation requires the use of advanced analytical tools that must be customized to individual business needs and database capabilities. In this paper we propose the use of a text mining tool to automate the analytical process in the failure analysis area based on data from many various sources.

Keywords: Text mining · EAM · Failure analysis · Root-Cause analysis · LHD machines · BigData analytics · Predictive maintenance

1 Introduction

Failure and reliability analysis is a well-recognized subject whose results have a direct
impact on technical diagnostics. Maintenance of machinery directly affects efficiency
issues, making them an important variable for optimizing production as well as the
company’s financial result. In practice, failure analysis is often based on advanced mathematical methods. U. Kumar and B. Klefsjö considered the issue of
applying statistical distributions (with particular emphasis on the Weibull distribution)
for the purposes of modeling the residual lifetime of underground mining machinery.
The paper presents the methodology for estimating distribution parameters based on
empirical data from the machine operation process [1]. This subject was also taken up


in paper [2] with particular emphasis on the use of this method for modeling the
lifetime of haul trucks. In paper [3] the authors conducted a study of the reliability and
availability of underground loaders along with the analysis of the failure of individual
subsystems using modeling based on Markov chain theory. The method based on a
support vector machine for estimating the residual lifetime of machines, taking into account varying operational conditions, has been proposed in [4]. In [5] we can find a method of applying a non-linear statistical model for estimating the lifetime of mining machines. In [6] the authors described ways of using the Kaplan-Meier estimator for the purposes of statistical analysis of reliability, availability, and maintenance of underground haul trucks. In [7], a comparative analysis of failures occurring among four drill rigs operating in an underground mine in Sweden was conducted; the paper also proposes a method of estimating the time remaining until the next failure. Continuing this topic, [8] presented methods of using Monte Carlo simulation in the analysis of the reliability of an underground drilling rig.
Despite many works dealing with the reliability of LHD machines and good
recognition of this issue, the main problem remains the collection of data on the failure
of machines, especially in the case of huge machinery parks (e.g. over 1200 machines).
In practice, very often the machine’s operational history is still kept manually in
notebooks. Some mine departments run individual spreadsheets. A step forward turned out to be the launch of a dedicated IT system ensuring data entry via a form. Unfortunately, despite a large number of computer stations in the underground mine with EAM software (e.g. CMMS - Computerised Maintenance Management Systems), the data is entered manually by service technicians due to time limitations and the amount of maintenance and repair works carried out. As a consequence, there are huge amounts of unstructured data in digital storage, but the automation of this data analysis or its integration with other data sources is very limited. This is an especially important issue for BIG DATA analytics. For this purpose, several solutions have been developed, but only a fraction of them can be applied to the Polish language. In [9] a 3-step text mining tool for CMMS databases was presented. This tool focuses mainly on clustering the terms describing failures and later mining of association rules. Paper [10] describes 3 methods for dataset analysis using text mining techniques in order to classify the failure mode for each of the records. The use of text mining tools in different support systems is described in the review [11]; the authors surveyed the published evidence on the use of text mining, tried to understand the issues from an organizational perspective, and created a core list of text mining tools.
In this paper, the authors propose a methodology for the creation of a text mining tool for data acquisition from CMMS-like databases containing records of every work performed on the machines. In addition, methods of analysis of the gathered data are proposed.

2 Description of Problems

Examples of problems encountered in the CMMS database relate to: (a) one entry may refer to several unrelated elements of the machine, (b) individual entries containing several independent operations very often are not separated by punctuation marks, (c) the appearance of stylistic and spelling mistakes, (d) shortened forms and mine jargon are used, (e) very often the same corrective action is called differently; this also applies to machine components and foreign names.
Text mining is the proposed tool for extracting data from the text as well as for its subsequent processing. Text mining can rely on finding key phrases and sentences, which are then encoded in the form of numeric variables. Later, statistics and data mining methods are used to discover relationships between variables. After that, it is also possible to combine data from other sources (SCADA, SAP, etc.) and perform further data fusion and multidimensional analyses using machine learning techniques. Because the resulting variables are usually nominal, basket analysis may be particularly useful and allows one to identify unknown regularities accompanying the failure phenomenon (operator's driving style, diagnostic symptoms, workplace, mine, maintenance staff, machine type).
In this paper, we propose a tool for extensive machine failure analysis based on data
fusion from various sources. The paper is organized as follows: a short review on the
subject and current practice in the mining will be described; some remark and
assumption will be formulated and a novel procedure will be presented and discussed;
finally, application of the method will be provided.

3 Methodology

The algorithm scheme proposed in this paper is shown in Fig. 1. The algorithm consists of: Data preparation - a set of actions to clean data and prepare it for further processing; Lemmatization - a process of finding the core of a word using an additional dictionary; Feedback Loop - actions aimed at error correction and improving the algorithm's overall effectiveness; Categorization and Grouping - processes for sorting and labeling the data collection. The described algorithm allows for actions like developing a specialized dictionary of entries, catching stylistic and spelling mistakes, unification of CMMS database records, sorting or searching entries by specified rules, etc. In addition, the results of the algorithm can be used to develop further analyses such as affinity analysis.
The tool takes as input raw records from a CMMS-like Polish-language database containing information about self-propelled machines used in underground copper mines. Each of the records consists of separate variables like the machine or department number, and cells containing a description of the failure and repair. Most of the variables are copied raw to the output; only text variables containing event information are processed. Data contained in those cells did not pass any form of grammatical, punctuation, or semantic control, which means that many of them contain multiple errors. Many of them also contain information like dates or numbers that hinder analysis. The most useful information contained in those cells is basic information like

Fig. 1. Algorithm of text data mining tool for extraction and categorization of information from
maintenance logs.

the name of the damaged component and a description of its repair (like part repaired or part replaced). For lemmatization, the algorithm uses "PoliMorf", an open-source morphological dictionary for the Polish language. PoliMorf is a result of the Morfologik project. The dictionary contains 3 main categories: word, the core of a word, and grammatical category. Each of these variables is used in the process of lemmatization and later categorization. It is worth adding that the given dictionary is a general and not a specialized one, which may cause conflicts.

3.1 Data Preparation


Every record of the database is broken down into separate variables. The variable containing the text information is selected and subjected to the data cleaning process. The first operation performed is the conversion of all uppercase letters to lowercase. This is done because two identical words differing by at least one letter (uppercase or lowercase) are treated as two different words. The next set of actions consists of removing specified parts of the record. Punctuation marks and special characters like colons, double spaces, or newline characters are removed because they don't carry any useful information. Next to be removed are Polish diacritical marks. In order to remove them, they are changed into generally accepted substitutes from the Latin alphabet (for example: ó → o, ą → a, etc.). This operation results in improved processing speed. In addition, some of the spelling and stylistic errors are dealt with. The last step of data preprocessing is the correction of the most common grammatical errors. This is done by simply matching against a list of the most common errors and searching for equal cases.
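A minimal sketch of this cleaning step in Python; the transliteration map is abbreviated and the list of common errors is a hypothetical placeholder:

import re

# Transliteration of Polish diacritics to Latin substitutes (abbreviated map).
DIACRITICS = str.maketrans("ąćęłńóśźż", "acelnoszz")

# Hypothetical hand-maintained mapping of common errors to corrections.
COMMON_ERRORS = {"silnk": "silnik"}

def clean_record(text):
    text = text.lower()                        # unify letter case
    text = re.sub(r"[^\w\s]", " ", text)       # drop punctuation and special characters
    text = re.sub(r"\s+", " ", text).strip()   # collapse double spaces and newlines
    text = text.translate(DIACRITICS)          # replace Polish diacritical marks
    # Correct the most common errors by exact matching against the list.
    return " ".join(COMMON_ERRORS.get(w, w) for w in text.split())

print(clean_record("Wymiana silnk,  spalinowego\n"))  # -> "wymiana silnik spalinowego"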

3.2 Lemmatization
Lemmatization is an operation of finding the core of a word from one of its inflected forms. This process is shown in Fig. 2. This task, depending on the language, can be done with either stemming or lemmatization. Stemming is a crude process of chopping parts off a word with a set of rules to reduce it to a form that cannot be influenced by factors such as tense. Lemmatization refers to doing the same operation (reducing words to basic forms) but with different tools like vocabulary and morphological analysis. In the case of morphologically complex languages like Polish, stemming simply doesn't work; the error rate of this operation is too high. This is the reason for using the lemmatization process with a Polish dictionary.

Fig. 2. Lemmatization process based on exemplary record from maintenance log.

In the process of lemmatization, each of the words from the record is searched for in the PoliMorf dictionary. If there is a match in the searching process, the basic form of the word is saved from the core column of the dictionary. The grammatical category of the word (from the dictionary) is also stored. An exception to these operations are words consisting of one letter, and numbers. If a word hits no match in the dictionary search, it is also stored for further processing. If a searched word has more than one match, it is likewise stored for a further decision.
There are many reasons why a word may not find its equivalent in the dictionary. The most common are spelling errors and typos. They occur because of manual data entry and the lack of a unified form. Another reason for a lacking match is that the word is a shortcut, proper name, or comes from jargon or colloquial speech. Many such unique words are already introduced in the dictionary, but due to the wide range of colloquial words and the continuous evolution of speech, it is impossible to predict all of them; therefore their collection should be updated repeatedly. Many cases of repetition of words not found in the dictionary were recorded, so it was decided to leave them in their most common form. A different situation for missing words is the lack of a space between two words. In cases like these, two words are connected and processed as one.
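Conceptually, the dictionary lookup can be sketched as below, assuming PoliMorf has been loaded into a mapping from word form to (core, grammatical category) pairs; the loading itself and all names are assumptions:

# Hypothetical mapping built from PoliMorf: form -> list of (core, category).
polimorf = {
    "wymieniono": [("wymienic", "verb")],
    "silnika": [("silnik", "noun")],
}

def lemmatize(word):
    """Return (core, category) candidates, or None to flag the word for the feedback loop."""
    if len(word) == 1 or word.isdigit():
        return [(word, "other")]   # one-letter words and numbers pass through
    matches = polimorf.get(word)
    if matches is None:
        return None                # no match: typos, jargon, connected words, ...
    return matches                 # one or more candidates (homonyms resolved later)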

3.3 Feedback Loop


The feedback loop is responsible for dealing with any cases that didn't get a core matched in the lemmatization process. Some cases are dealt with automatically, but for some of them there is still a need for human intervention. This part of the algorithm deals with cases like typos, errors, words with multiple meanings, shortcuts, or connected words. For each of these cases, a separate algorithm was created. As mentioned earlier, some cases require human verification; however, as the tool is used, their number will gradually decrease.
Words with Errors. Words containing errors like typos are similar to their correct form (usually they do not differ by more than a few characters). Several measures allow for quantifying the similarity between words. These metrics measure the "distance" between words; the smaller the distance, the more similar the words. In the algorithm, two metrics were used: the Jaro-Winkler algorithm and the Levenshtein distance. The Jaro-Winkler distance algorithm pays a lot of attention to the order of letters. In addition, the speed of processing words is relatively fast. Typos and words with only a few errors will be characterized by a small distance value. On the other hand, the Jaro-Winkler distance does not seem to care about interlaced, randomly placed, or missing characters. As long as the target word's characters are present at the correct distance, the words are similar.
The Jaro-Winkler distance for two strings of characters ($s_1$, $s_2$) is expressed by the formula:

$$d_j = \begin{cases} 0, & m = 0 \\ \dfrac{1}{3}\left(\dfrac{m}{|s_1|} + \dfrac{m}{|s_2|} + \dfrac{m - t}{m}\right), & m \neq 0 \end{cases} \qquad (1)$$

where $m$ is the number of matching characters and $t$ is half the number of transpositions. Two characters are recognized as matched only when they are the same and no farther apart than:

$$\frac{\max(|s_1|, |s_2|)}{2} \qquad (2)$$

Every character of $s_1$ is compared to every matched character from $s_2$. The number of matched characters divided by 2 describes the number of transpositions. The outcome is normalized: 0 means no similarity between the words, while 1 means an exact match.
The Levenshtein distance defines the minimum number of single-character edits between the measured words. The Levenshtein distance between two strings $a$, $b$ is given by $\mathrm{lev}_{a,b}(|a|, |b|)$, where:

$$\mathrm{lev}_{a,b}(i, j) = \begin{cases} \max(i, j), & \text{if } \min(i, j) = 0 \\ \min \begin{cases} \mathrm{lev}_{a,b}(i-1, j) + 1 \\ \mathrm{lev}_{a,b}(i, j-1) + 1 \\ \mathrm{lev}_{a,b}(i-1, j-1) + 1_{(a_i \neq b_j)} \end{cases}, & \text{otherwise} \end{cases} \qquad (3)$$

where $1_{(a_i \neq b_j)}$ is the indicator function equal to 0 when $a_i = b_j$ and equal to 1 otherwise, and $\mathrm{lev}_{a,b}(i, j)$ is the distance between the first $i$ characters of $a$ and the first $j$ characters of $b$; $i$ and $j$ are 1-based indices. Thanks to these two metrics, the algorithm searches the dictionary for the word with the highest match [12].
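The Levenshtein recurrence above can be implemented compactly with dynamic programming; an illustrative sketch, not the authors' code:

def levenshtein(a, b):
    """Minimum number of single-character edits turning a into b (Eq. 3)."""
    prev = list(range(len(b) + 1))           # lev(0, j) = j
    for i, ca in enumerate(a, start=1):
        cur = [i]                            # lev(i, 0) = i
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,      # deletion
                           cur[j - 1] + 1,   # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

# A typo lies close to its correct form, so it can be matched to the dictionary:
print(levenshtein("silnik", "silnk"))  # -> 1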

Connected Words. Every word that is longer than 5 characters is treated as a possible connection of 2 words. The algorithm recursively divides the word into two words, where both must have at least 2 characters (in the Polish language, the shortest word that carries useful information has 2 characters), and searches for matches in the dictionary with the use of the distance functions mentioned before. If the similarity between the two strings and words in the dictionary is large enough, the string is replaced by the two words. In every other case, the string is saved as an incorrect word for later human processing.
Shortcuts and Colloquial Speech. In the mining facility, the use of colloquial speech, abbreviations, and jargon is very common. This includes foreign-language names which can be translated into Polish in a variety of ways (in both formal and common speech). This problem can be solved by looking at the whole dataset: if such a word repeats multiple times, it is recognized as a proper name or shortcut. Because of low fault tolerance, this action needs to be later accepted by a human.
Words with Multiple Meanings. In Polish and many other languages there is a significant number of homonyms and homographs, that is, words that have multiple meanings but the same spelling (for example: crane - as a bird or a construction machine). To such words, multiple cores can be assigned. To solve the problem, a simple method was developed. If at the stage of lemmatization it turns out that more than one core can be assigned to a word, those cores are saved. Later, the numbers of occurrences of the given cores are compared over the entire data sample. The core that has the greater number of repetitions is assigned to the word. If the number is the same for both cores, the case is recorded and referred to human intervention.
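This disambiguation rule can be sketched as follows, where corpus_counts is an assumed mapping of core frequencies over the whole data sample:

from collections import Counter

def resolve_core(candidates, corpus_counts):
    """Pick the candidate core occurring most often in the whole data sample."""
    ranked = sorted(candidates, key=lambda c: corpus_counts[c], reverse=True)
    if len(ranked) > 1 and corpus_counts[ranked[0]] == corpus_counts[ranked[1]]:
        return None   # tie: record the case and refer it to human intervention
    return ranked[0]

print(resolve_core(["crane_bird", "crane_machine"],
                   Counter({"crane_machine": 17, "crane_bird": 2})))  # -> "crane_machine"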

3.4 Categorization
The cleaned text is subjected to exploratory analysis. The main action for proper categorization is the extraction of keywords. To study them, we used tag clouds. This tool presents the frequency of a given word in the form of a cloud: the more often a word is repeated, the larger its representation in the cloud compared to others. This tool works well for finding nuances and relations in the data. To find proper keywords, we take the most frequently repeated words and form further tag clouds from two- or three-word expressions that contain words from the first cloud. This can be seen in Fig. 3. For clarity and understanding of the results, all of them have been translated from Polish into English. We discovered that many of the keywords mean the same as others (e.g. propulsion engine means the same as combustion engine). At this stage, specialized knowledge in the field of the construction and operation of the tested machines is required.
For proper categorization, the data first needs to be cleaned again. Many records contain words that still don't bring useful information, like pronouns, or sentences like "manufacturer's materials were used for repair". The next step was to assign each word its grammatical form. In this way we get:
Multidimensional Failure Analysis Based on Data Fusion from Various Sources 773

Fig. 3. Data exploration using tag clouds (results are translated).

• Maintenance actions - in the Polish language most of the analyzed actions are verbs. Exceptions like "replacement" or "repair" (in Polish they are nouns) were added manually. In that simple way, the algorithm can isolate the action that was performed and described in the record.
• Machine components - in Polish every component is a noun. In addition, most of them occur with an adjective. By this, the algorithm can isolate not only the component involved in the record but can also specify which kind of component (for example "silnik spalinowy" (Eng. combustion engine), where the word "silnik" (noun) means engine and "spalinowy" (adjective) describes what kind of engine - combustion).
• Number of actions performed - this information was isolated from conjunctions. Words like "and" bring the information that more than one element was involved in the event. There were also less frequent words like "in" or "from" which contain information about the system to which the elements belong.
After assigning grammatical forms, the actions are isolated first. Often there was more than one maintenance action. Next, to each action, the machine component involved was matched. The last step was to assign to each group an adjective to specify the elements. Adjectives were selected on the basis of the frequency with which they appeared. In this way, crude categories were isolated. The main problem at this stage is that many categories are repeated. Some of them carry the same information but are written differently. This is caused by adding new information or changing the order of the sentence. The action performed was always matching, but the nouns were the problem. To solve this problem, categories were grouped, as sketched below. If the phrase describing the machine component has more than 3 words, the algorithm checks how much it has in common with other phrases. If there was a category that differed by only one word, the phrases were grouped as matching. In the case of shorter clusters (fewer than 3 words), the algorithm searches for categories with at least one word the same and matching letters in other words. In this way, proper categories were formed (a few of them connecting 3 or 4 elements).
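The grouping rule can be sketched as below; this simplifies the short-phrase case (the letter-matching criterion is reduced to word overlap) and all names are illustrative:

def same_category(phrase_a, phrase_b):
    """Group two category phrases according to the word-overlap rules above."""
    if len(phrase_a) > 3 and len(phrase_a) == len(phrase_b):
        # Long phrases: group when they differ by only one word.
        return sum(a != b for a, b in zip(phrase_a, phrase_b)) <= 1
    # Short phrases: require at least one identical word.
    return len(set(phrase_a) & set(phrase_b)) >= 1

print(same_category(["wymiana", "silnika", "spalinowego", "lewego"],
                    ["wymiana", "silnika", "spalinowego", "prawego"]))  # -> True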

3.5 Grouping
For better data presentation and further analysis, the categories were matched to one of 5 main groups. This was performed manually based on the action performed. The main groups are: repair, replacement, maintenance (preservation), inspection, and correction. Grouping the data allows one to analyze in a simple way what actions (without specified details) were performed on the machine.
An example of such an analysis is shown in Fig. 4. On the machine's lifeline, groups of individual works have been marked. The experimental data comes from underground mines that belong to KGHM Polska Miedź SA. The input data are entries from the CMMS database that contain information about the work carried out on the machines (information entered by the mechanics performing a specific job). The data comes from 3 mines and a total of 24 different machines, which gives a total of 13332 unique entries. Each entry was generated on the basis of a form that contained a dozen or so fields where information was collected by means of text, option selection, Yes/No fields, etc. For this type of analysis, the main column used was the one containing the scope of the work carried out, in which the mechanic entered the activities performed on the machine using the keyboard. This cell has no form restrictions, and any text can be entered there.

Fig. 4. Categories of actions performed on the machine (results have been translated).

The course of maintenance and repair actions can be used in analyses with data from the machine monitoring system for fault analysis, better tracking of diagnostic symptoms, verification of the effectiveness of repair work, and so on. An example of such an analysis is shown in Fig. 5. The nominal exceedances of quantiles of the gearbox oil temperature parameter (red horizontal line) have been compared with the periods of its repair and maintenance. After the quantiles began to show overshoots, two machine inspections and repairs were made and the overshoots stopped. Longer periods of missing data are associated with machine downtimes caused by failure events. Some advanced algorithms to detect anomalies in temperature data have been described in [13–15].

Fig. 5. Quantiles analysis with maintenance and repair actions performed (results have been
translated). A red horizontal line indicates the threshold value for the gearbox oil temperature.

4 Conclusion

The general advantage of a text mining tool is its versatility in further implementation. Once created, it can be applied to similar types of data with only little effort. Using a text mining tool with CMMS databases allows for fast keyword extraction and categorization by almost any factor. That kind of action can also be used for automatic abstract or report generation, or for connecting data from incompatible sources. In this paper we presented a text mining tool for CMMS-like databases that consists of several lesser algorithms. The main goal of that tool was to process raw text data from facilities for later analysis. That target was achieved with a satisfactory level of efficiency. As a result, each of the records is categorized and grouped, allowing for actions like quick search, statistical analysis, or more complex multivariate failure analysis. The main drawback of the presented analysis is the distance-based operations on short words, which can bring unexpected results. For the data processed by the text mining tool, we presented two kinds of analyses that we then applied to real industrial data. One of them is the quantitative analysis that allows us to look at the data numerically. It allows the creation of various types of statistics or rankings, thanks to which we can optimize work and look for hidden regularities. The results of the algorithm's work can also be used for the creation of a more complex analysis tool that incorporates methods like basket analysis.

References
1. Kumar, U., Klefsjö, B.: Reliability analysis of hydraulic systems of LHD machines using the
power law process model. Reliab. Eng. Syst. Saf. 35(3), 217–224 (1992)
2. Camelia, D.S.M., Silviu, N. M., Emil, D.: Study of reliability modeling and performance
analysis of haul trucks in quarries. In: Advances in Computer and Information Sciences and
Engineering (2015)
3. Samanta, B., Sakar, B., Mukherjee, S.K.: Reliability modelling and performance analyses of
an LHD system in mining. J. South Afr. Inst. Min. Metall. 104(1), 1–8 (2004)

4. Galar, D., Kumar, U., Lee, J., Zhao, W.: Remaining useful life estimation using time trajectory tracking and support vector machines. In: Journal of Physics: Conference Series, vol. 364, no. 1, p. 012063. IOP Publishing (2012)
5. Si, X.S., Wang, W., Hu, C.H., Zhou, D.H., Pecht, M.G.: Remaining useful life estimation
based on a nonlinear diffusion degradation process. IEEE Trans. Reliab. 61(1), 50–67 (2012)
6. Mouli, C., Chamarthi, S., Gȧ, R.C., Vȧ, A.K.: Reliability modeling and performance analysis
of dumper systems in mining by KME method. IJCET 2, 255–258 (2014)
7. Al-Chalabi, H.S., Lundberg, J., Wijaya, A., Ghodrati, B.: Downtime analysis of drilling
machines and suggestions for improvements. J. Qual. Maintenance Eng. 20, 306–332 (2014)
8. Al-Chalabi, H., Hoseinie, H., Lundberg, J.: Monte Carlo reliability simulation of
underground mining drilling rig. In: Current Trends in Reliability, Availability, Maintain-
ability and Safety, pp. 633–643. Springer, Cham (2016)
9. Gunay, H.B., Shen, W., Yang, C.: Text-mining building maintenance work orders for
component fault frequency. Build. Res. Inf. 47(5), 518–533 (2019)
10. Chen, L., Nayak, R.: A case study of failure mode analysis with text mining methods. In:
Ong, K.-L., Li, W., Gao, J. (eds.) Proceedings 2nd International Workshop on Integrating
Artificial Intelligence and DataMining (AIDM 2007) CRPIT, 84, pp. 49–60. Gold Coast,
QLD (2007)
11. Paynter, R.A., Bañez, L.L., Berliner, E., Erinoff, E., Lege-Matsuura, J.M., Potter, S.: Use of
text-mining tools for systematic reviews. Value in Health 19(3), A108 (2016)
12. Cohen, W.W., Ravikumar, P., Fienberg, S.E.: A comparison of string distance metrics for
name-matching tasks. In: IIWeb, vol. 2003, pp. 73–78, August 2003
13. Wodecki, J., Stefaniak, P., Michalak, A., Wyłomańska, A., Zimroz, R.: Technical condition
change detection using Anderson-Darling statistic approach for LHD machines–engine
overheating problem. Int. J. Min. Reclam. Environ. 32(6), 392–400 (2018)
14. Stefaniak, P., Śliwiński, P., Poczynek, P., Wyłomańska, A., Zimroz, R.: The automatic method of technical condition change detection for LHD machines - engine coolant temperature analysis. In: International Conference on Condition Monitoring of Machinery in Non-Stationary Operation, pp. 54–63. Springer, Cham (2018)
15. Wodecki, J., Stefaniak, P., Polak, M., Zimroz, R.: Unsupervised anomaly detection for
conveyor temperature SCADA data. In: Advances in Condition Monitoring of Machinery in
Non-Stationary Operations, pp. 361–369. Springer, Cham (2018)
Road Quality Classification Adaptive
to Vehicle Speed Based on Driving Data
from Heavy Duty Mining Vehicles

Artur Skoczylas(&), Paweł Stefaniak, Sergii Anufriiev, and Bartosz Jachnik

KGHM Cuprum Research and Development Centre Ltd, gen. W. Sikorskiego 2-8, 53-659 Wroclaw, Poland
{askoczylas,pkstefaniak,sanufriiev,bjachnik}@cuprum.wroc.pl

Abstract. Maintaining a pavement in good condition is one of the key challenges faced by services responsible for road infrastructure. In particular, the poor quality of the surface may pose a serious threat to life and health and lead to serious damage to vehicles. The problem becomes much more serious in difficult environmental and road conditions. Moreover, carrying out an inappropriate policy of maintaining the surface of road infrastructure results in high repair costs and an increase in traffic jams. A similar problem is observed on the access routes and the haulage roads in the mine, where the quality of the roads determines optimal and safe production. One of the components of the mine's efficiency is the reliability of the wheeled transport fleet. The poor quality of roads leads to high dynamic overloads on the machine and serious damage to its structural nodes. In this article, the authors propose a method for assessing the quality of a road, dedicated to underground mining. The algorithm is based on data from the inertial sensors (IMU) installed on mining haul trucks and an on-board monitoring system. The deployed procedure consists of a three-state classification of vibration acceleration readings adapting to the driving speed using machine learning techniques. This is crucial from the viewpoint of an automatic evaluation of the pavement quality. Using the inertial navigation algorithm enables plotting the road quality on the estimated motion path against the background of mining excavations. In this way, it is possible to obtain a holistic insight into the technical condition of the road infrastructure, which is key for further optimizing production or scheduling road repair works.

Keywords: Fleet monitoring · Pavement management · Machine learning · Road surface anomaly

1 Introduction

Road condition monitoring is a commonly known challenge related to different aspects of the haulage process, like driving comfort, road infrastructure maintenance, logistics, reliability of vehicles, as well as safety [1, 2]. Maintenance of road conditions is necessary to avoid potential hazards related to fleet failures or traffic


accidents. It should be highlighted that each year thousands of people in the world are involved in serious road accidents (very often lethal) resulting from a damaged road surface [3, 4]. In a wealthy country like the U.S., planned yearly expenditures for the maintenance of federal highways are estimated at $46 billion, and the poor condition of roads is yearly responsible for nearly 10,000 traffic fatalities. Considering the above, major emphasis should be placed on safety and maintenance costs. The two critical factors influencing driving quality are the surface roughness of the pavement and the vehicle's response to the road. A commonly used road roughness metric is the IRI (International Roughness Index), which defines, based on simulation, a scale representing the relation between the vehicle response and the road quality obtained at a speed of 80 km/h [5]. In practice, conducting continuous repairs is difficult to achieve because of expensive manpower, various weather conditions, and heavy traffic.
Thus, what is really important here is the development of a robust monitoring system for early detection of road failures, tracking their evolution, and supporting the planning of maintenance tasks in advance. So far, inspection of the road has been performed via a manual process which is based on a visual approach, strongly depends on the subjective impressions of service personnel, is very time consuming, and is not objective. In general, such an approach has many limitations and requires frequent inspections. An example of standard procedures has been described in [6].
In the literature, the subject of automated road surface inspection is very popular. We can highlight: vision methods [7], LIDAR scanning [8], ground penetrating radar (GPR) [9], and techniques based on inertial sensors [10, 11]. In [12] the authors used a quarter-car model with a Kalman filter to estimate the road roughness index (in an underground mine) based on data from an IMU, wheel speed sensors, and RSSI from Wi-Fi. The most common are accelerometer-based methods. The optimal solution is to collect real-time driving data from smartphones in services like Google Maps. As a result, it would be possible to immediately obtain information about the condition of all the most frequently used roads practically for free. Paper [13] presents a method of surface monitoring based on accelerometer data, where a Gaussian model was used for abnormal event detection, and X-Z axis ratio filtering for event classification (like pothole or hump). The issue of road quality assessment is well described in the International Road Roughness Experiment (IRRE) [14], in which the authors compared many methods of measuring this value and proved that they are compatible with each other.
The problem of road quality evaluation has also become very popular in mining. The development of methods was initiated by the analysis of the factors behind dynamic overloads observed among heavy-duty mining machines, which resulted in damage to their construction nodes. An inertial sensor installed on the haul truck and data from the on-board monitoring system [15] were used for the tests. Paper [16] clarifies that the main factors influencing the vibration level are road conditions (shape of road, failures, bumps, slope), driving style (i.e. adapting speed to road conditions), and the technical condition of the machine (structure stiffness) [17]. As part of the preliminary work, the authors proposed a simple method of road-quality assessment which comes down to: (1) signal segmentation in order to identify haulage cycles and their particular operations [18, 19] (loading of the cargo box, driving with a full cargo box, unloading at the dumping point, and returning to the mining area with an empty box), (2) estimation of the trajectory for driving with full and empty cargo box, (3) calculation of statistics for the inertial data, (4) its tristate classification and representation in the route visualization.
This paper is an extension of the original method [16]. The key is to fill the gap related to the impact of speed on the values of recorded vibration accelerations. This is critical for several reasons. First of all, machine operators work in a piecework accounting mode and take only moderate care of the machine, not avoiding high speeds where those should be avoided. Secondly, one machine is run by dozens of different operators with completely different driving styles. Thirdly, each cycle may have a different capacity of the route resulting from a traffic jam or a busy dumping point. Fourthly, and most importantly, the driving speed has a real impact on the magnitude of the recorded vibration values. For these reasons, the authors present a novel method fitted to mining machines. This adaptive approach is based on inertial data and machine learning techniques. The classifier model used in the procedure is based on the k-means algorithm. The developed model is adapted to the case of an underground mine, but after minor modifications it can be applied to diagnose the state of the pavement in other types of road infrastructure. The article presents the application of the model on real industrial data. The resulting data from the classification model were integrated with the machine motion trajectory estimation results obtained from inertial navigation. In this way, a spatial distribution of the surface condition on a GIS map was obtained. The description of the navigation algorithm is beyond the scope of this article.

2 Input Data

Classification of road quality is based on three signals: the accelerometer in the Z-axis, a signal describing the current speed of the machine (SPEED), and a signal describing the currently selected gear (SELGEAR). In addition, a signal describing the engine load (ENGRPM) is used in the learning process of the algorithm. The signal from the accelerometer was recorded using an external device called NGIMU. This signal is sampled at a constant frequency of 50 Hz, which can be increased to 400 Hz. The other signals were recorded using the "SYNAPSA" on-board monitoring system embedded on self-propelled machines. This device reads sensor data at a frequency of 100 Hz; however, the densely sampled data are available only for a specified length of time. After this time window, the data are averaged to 1 Hz and saved to the database. The 1 Hz sampled data were used in the process of algorithm creation. Besides, it should be mentioned that the instantaneous machine speed signal (SPEED) does not take into account the vehicle's direction of movement (forward, backward) and is also averaged to 1 km/h. Information about the heading direction is extracted from the currently active gear signal (SELGEAR). The ENGRPM signal contains information about the current engine load, described by the angular velocity expressed in rpm (rotations per minute). Raw signals are shown in Fig. 1.

Fig. 1. 2 h long fragment of input variables.

3 Algorithm Assumptions

The main assumption of the algorithm is the effect of speed on machine vibrations caused by the road. This means that when driving on an identical road, only at different speeds, the vibrations of the machine will be different (they will transform because of the change in speed). Therefore, in order to correctly estimate the state of road quality, the algorithm should be separated from and "immunized" against the influence of speed.
The next assumption of the algorithm concerns the frequency band in which vibrations from the road manifest themselves. The algorithm assumes that the quality of the road manifests itself in vibrations at higher frequencies, which is true for e.g. passenger cars. Mining machines, however, are quite different in many respects. These machines are much larger and heavier than ordinary passenger cars. The speeds with which they travel are also quite different: the maximum is up to 18 km/h, while typical speeds are about 5–10 km/h.
The last assumption concerns idling and the machine's natural frequencies. At times when the machine is found to be idling, a spectrum is extracted from the vibration signal. The algorithm assumes that this spectrum largely represents the machine's natural frequencies. This assumption, however, could only be confirmed by methods like modal analysis, which were unavailable for the duration of the experiments.

4 Methodology

The adaptive road quality classifier can be divided into two similar algorithms: the algorithm for creating the classifier and its further learning, and the algorithm for the classification process. There is great convergence between these two algorithms, mainly in the functions/ideas used, but they were separated for easier implementation (and faster operation). The main blocks of both algorithms are shown in Fig. 2.

Fig. 2. Schematics of algorithms for the creation of the classifier and the classification.

4.1 Data Pre-processing


There are five main steps in the data preparation. The first is to change the types of all variables to floating-point values. Next, the instantaneous speed of the machine is further processed: it is inverted at the moments when a negative (reverse) gear has been recorded. The next step involves filtering the accelerometer Z signal with a 5 Hz high-pass filter. The main purpose of this action is to eliminate gravity, which is a low-frequency component; in addition, as mentioned before, the algorithm assumes that the vibrations determining the quality of the path lie in the higher frequency bands.
The fourth step is to remove the differences in signal sampling and time. Due to the two different recording devices, the data was sampled differently (NGIMU - 50 Hz, SYNAPSA - 1 Hz). This problem was solved by repeating each of the SYNAPSA measurements 50 times. This solution is of course less accurate than more advanced methods of approximation; however, the speed is averaged to 1 km/h anyway, so only a small amount of information is lost. The signals must also be synchronized in time. This is simply done by matching the timestamps that are present in both signals. Even though both devices have a real-time clock, differences in time are still detected due to the lack of frequent synchronization of the clock (with a more accurate time source like an atomic clock available online) on one of the devices. That is why additional synchronization is performed by detecting the first value jump. In the case of the variable SPEED, it is enough to detect the first non-zero value to get the first moment of the machine's motion. In the case of the NGIMU sensor, the ideal signal for detecting the first movement is the accelerometer in the Z-axis. This detection is performed with a simple decision threshold based on the signal averaged in a window (a window with a length of 10 samples - 1/5 s).
The last step is signal segmentation. The whole signal is divided into parts with a length of 2 s. Then the algorithm checks the average speed per fragment, and if it is close to zero, such a fragment is considered an idle proposal and passed to the function responsible for its extraction. The rest of the fragments remain unchanged.
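A sketch of the filtering, resampling, and segmentation steps with NumPy/SciPy; the cutoff, window length, and repetition factor follow the description above, while the data loading and the exact idle-speed threshold are assumptions:

import numpy as np
from scipy.signal import butter, sosfilt

FS_IMU = 50            # NGIMU sampling frequency [Hz]
WINDOW = 2 * FS_IMU    # 2-second segments

acc_z = np.load("acc_z.npy")   # hypothetical accelerometer Z signal, 50 Hz
speed = np.load("speed.npy")   # hypothetical SYNAPSA speed signal, 1 Hz

# 5 Hz high-pass filter removes gravity and other low-frequency components.
sos = butter(4, 5.0, btype="highpass", fs=FS_IMU, output="sos")
acc_z = sosfilt(sos, acc_z)

# Match samplings: repeat each 1 Hz SYNAPSA sample 50 times.
speed_50hz = np.repeat(speed, FS_IMU)[:len(acc_z)]

# Segment into 2 s fragments; a near-zero mean speed marks an idle proposal.
n = len(acc_z) // WINDOW
fragments = acc_z[:n * WINDOW].reshape(n, WINDOW)
mean_speed = speed_50hz[:n * WINDOW].reshape(n, WINDOW).mean(axis=1)
idle_proposals = fragments[mean_speed < 0.5]    # assumed threshold, km/h
moving_fragments = fragments[mean_speed >= 0.5]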

4.2 Idle Detection


Idle detection is performed on fragments during which the machine should not be moving (speed close to 0). However, the system is not perfect, and due to the inertia of the measuring system and other factors, some measurements are incorrect. This manifests itself in zero speed measurements while most of the other measured parameters indicate that the machine has moved. Therefore, only the proposals are extracted based on speed, while the main detection takes place using the signal from the engine (ENGRPM).
The engine speed during normal operation is spread between 1000 and 2000 rpm. When the machine switches to idle, these rotations decrease to a level of around 800 rpm. This relationship is the main factor based on which idling is detected, while the detection itself takes place using simple decision thresholds.
From the fragments that were considered idling, an amplitude spectrum is created
using FFT (Fast Fourier Transform). These spectra are then averaged to create one
spectrum representing the machine. This can be described by the formula (1):

1 XN
Xj ¼ Y ;
n ¼ 1 n;j
ð1Þ
N
where: Xj is new spectrum value for j frequency, N describe the number of calculated
spectra and Y n;j is value of N spectrum for j frequency . Such a spectrum should mainly
contain only native frequencies. An example spectrum obtained from fragments with a
total length of about 2 h is shown in Fig. 3. It can be seen despite the low resolution
that clear peaks are observed for some frequencies.

Fig. 3. Averaged amplitude spectrum of idle level vibrations signal.
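A minimal sketch of how the averaged idle spectrum of formula (1) can be computed, assuming the preprocessing above; the 900 rpm decision threshold on ENGRPM is an assumed value placed between the idle level (around 800 rpm) and normal operation (1000-2000 rpm).

import numpy as np

def idle_spectrum(idle_fragments, engrpm_means, rpm_threshold=900.0):
    # Keep only proposals confirmed as idle by the engine-speed decision threshold
    spectra = [np.abs(np.fft.rfft(frag))
               for frag, rpm in zip(idle_fragments, engrpm_means)
               if rpm < rpm_threshold]
    # Formula (1): X_j = (1/N) * sum over the N spectra, element-wise per frequency j
    return np.mean(spectra, axis=0)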

4.3 Data Preparation for Classification


For proper classification, the data must first be properly prepared (the previously selected
sections during which the machine moved are used for classification). The preparation
of data begins with segmentation relative to the vehicle speed. This is done by finding
unique speed values and dividing the entire signal relative to them. Under normal
conditions, it would be best to average the signal depending on the expected accuracy
(e.g. 0.5 km/h for medium and 0.1 km/h for high accuracy) before searching for such
values. However, as described earlier, our data has already been averaged to 1 km/h, so
we could receive a maximum of about 30 unique values. At this stage, the data is
transformed from one signal into X signals, where X is the number of unique

velocity values found in the data. At a later stage, each of these X signals is processed
separately (all further described activities are repeated for each of the signals).
Further segmentation consists of dividing the given speed signal into fragments of
the window length (2 s). The amplitude spectrum is calculated for each such fragment
using the Fourier transform. Then, for each such spectrum, the difference with the idle
spectrum is calculated (spectrum - idle spectrum). If negative values appear as a result
of this operation, they are filled with zeros.
The last step involves creating one parameter to describe the given spectrum. For
this, we used a weighted average with an additional weight system that would favor
higher frequency spectra components. Due to the sampling frequency of 50 Hz, the
useful spectrum was in the range of 0–25 Hz, the weight system for this range is shown
in Table 1.

Table 1. The weighting system for frequency components.


Frequency range [Hz] 1–5 6–10 11–15 16–20 21–25
Weight 1 2 3 4 5

Calculation of the weighted average from the spectrum completes the entire process
of data pretreatment. Only two values are used for the classification itself: the average
and the speed of the fragment from which it was calculated. All the activities described
up to this point can be represented by the pseudo-code visible in Fig. 4.

Signal = Accelerometer Data, where Speed > 0

UniqueSpeeds = list of unique values of Speed in Signal
for speed in UniqueSpeeds:
    secondarySignal = Signal where Speed == speed
    fragments = secondarySignal divided by window
    for fragment in fragments:
        spectrum = amplitude spectrum of fragment
        spectrum = max(spectrum - idle spectrum, 0)
        av = weighted average of the spectrum

Fig. 4. A pseudo-code that describes the data manipulation process.
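A runnable Python equivalent of the Fig. 4 pseudo-code might look as follows; the Table 1 weights are generated from the 0-25 Hz bin frequencies, and all names are illustrative assumptions.

import numpy as np

def per_speed_features(acc, speed, idle_spec, win=100):
    # Table 1 weights: 1-5 Hz -> 1, 6-10 Hz -> 2, ..., 21-25 Hz -> 5
    freqs = np.fft.rfftfreq(win, d=1.0 / 50)
    weights = np.clip(np.ceil(freqs / 5.0), 1, 5)
    features = {}                       # speed value -> list of weighted averages
    for v in np.unique(speed):
        sig = acc[speed == v]           # one secondary signal per unique speed
        for i in range(0, len(sig) - win + 1, win):
            spectrum = np.abs(np.fft.rfft(sig[i:i + win]))
            spectrum = np.maximum(spectrum - idle_spec, 0.0)  # zero-fill negatives
            features.setdefault(v, []).append(np.average(spectrum, weights=weights))
    return features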

4.4 Road Quality Classifier Creation


The classifier used to assess road quality is actually a set of instances of KMeans
models. A separate model has been created for each of the speed values, which
makes the classifier robust to the way vibrations change with machine speed. This
structure also allows for the easy development of the classifier: new models can be
created when the speed range of the vehicle extends, while those already created can be
developed further.
Each of the models in the classifier is a separate KMeans instance working on one-dimensional
data. These data are the weighted spectrum averages calculated from signal
segments for the currently processed speed. Each of the KMeans models is designed to
divide the spectrum data into 3 groups (corresponding to bad, medium, and good road
quality). This is also partly the reason for choosing this grouping algorithm: it is one of
the few that allows grouping data into a specific number of groups, and among the
algorithms that allow it, KMeans seemed to work best.
With this classifier design, there is also the problem that cluster labels are assigned
differently each time the model is trained. The solution is to sort the assigned labels by
the group centers: the group with the lowest center always gets the label 0, the medium
one 1, and the highest one 2.
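A sketch of this construction using scikit-learn (an assumption; the paper does not name its KMeans implementation): one 3-cluster model per speed value, with labels re-ordered by cluster center so that 0 is always good and 2 always bad.

import numpy as np
from sklearn.cluster import KMeans

def fit_speed_models(features):
    models = {}
    for v, averages in features.items():
        km = KMeans(n_clusters=3, n_init=10)
        km.fit(np.asarray(averages).reshape(-1, 1))     # one-dimensional data
        order = np.argsort(km.cluster_centers_.ravel()) # clusters from low to high
        relabel = np.empty(3, dtype=int)
        relabel[order] = np.arange(3)                   # lowest center -> label 0
        models[v] = (km, relabel)
    return models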
Figure 5 presents the classifier which we were able to obtain based on the collected
data. The colors mark the detected road quality classes (and their boundaries): white
(lowest) - good, gray - medium, black (highest) - bad. The division between groups in the
classifier is not flat (as it would be with simple decision thresholds): as the speed
increases, the acceptable vibrations of good and medium quality roads increase. After
exceeding a speed of 15 km/h, the vibrations seem to drop; however, there are too
few measurements to draw a definite conclusion. It can also be stated that the
increase in thresholds is similar in both directions (forward/backward), at least up to a
speed of 5 km/h, because beyond that there are not enough measurements on the negative
side. Maximum vibrations are recorded for speeds of 7-9 km/h.

Fig. 5. Road Classifier model with established values, white (lowest) - good, gray (medium) –
medium, black (highest) – bad.

4.5 Road Quality Classification


The classification is carried out using a methodology similar to that used for building the
road quality classifier, with the difference that it is simplified. There is also an additional
validation (similar to that used during classifier construction), which results in
assigning labels related not to the quality of the road itself, but to the quality of
the elements needed to estimate it.
At the beginning, the classifier checks whether it is dealing with one of two special
variants: idling (speed = 0) or empty measurements (speed = NaN). Accelerometer
readings in these two cases do not bring any information about the quality of the path,
therefore they are not processed in any way. The appropriate labels are then assigned to
these fragments.
If the fragment does not belong to these variants, its average speed value is extracted.
This estimated fragment speed is then checked against the available models. If there is
no model for the given speed, a label informing about that is assigned to the fragment.
When the speed is supported, the amplitude spectrum is created from the fragment, the
idling spectrum is subtracted from it, and the weighted average is calculated. Based on
the average and using the model for the given speed, a group membership (road quality)
label is obtained.
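Putting the pieces together, the classification of a single fragment could be sketched as below; the special label values are assumptions, since the paper does not state how the non-road labels are encoded.

import numpy as np

IDLE, NO_DATA, NO_MODEL = -1, -2, -3    # assumed encodings of the special labels

def classify_fragment(frag, mean_speed, idle_spec, models, weights):
    if np.isnan(mean_speed):
        return NO_DATA                   # empty measurement
    if mean_speed == 0:
        return IDLE                      # idling: no information about the road
    if mean_speed not in models:
        return NO_MODEL                  # speed not supported by any trained model
    spectrum = np.maximum(np.abs(np.fft.rfft(frag)) - idle_spec, 0.0)
    avg = np.average(spectrum, weights=weights)
    km, relabel = models[mean_speed]
    return relabel[km.predict([[avg]])[0]]   # 0 = good, 1 = medium, 2 = bad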

5 Experimental Results

The algorithm was trained on data from vehicles hauling spoil in one of the mines
owned by KGHM Polska Miedź SA. A total of 10 work shifts were used to train the
algorithm, each lasting about 6 h (4 days of experiments, each consisting of 2 shifts).
The speed of the machine was recorded using the internal measurement unit SYNAPSA,
while the vibrations were recorded by the accelerometer of the IMU (NGIMU)
mounted on the machine. Then the algorithm was tested on one additional work shift,
from which it was decided to show one fragment. The designated section of road
against the background of the mine map with color-coded quality is shown in Fig. 6.
The correctness of the method was confirmed by the passengers of the vehicle driving
on this route.

Fig. 6. Designated road quality shown in color on the map.



At the moment, the described method lacks unequivocal confirmation of its correctness.
Single fragments of routes and the results of the algorithm's operation have been
confirmed by machine operators; however, this assessment is inaccurate and subjective,
so further work in this direction is planned.

6 Summary

The article deals with the issue of assessing the condition of mining road infrastructure,
which is crucial from the point of view of efficient, sustainable and safe exploitation.
The authors proposed an algorithm based mainly on data from an inertial measuring
unit and the speed signal. A classification model based on spectral analysis and the
KMeans algorithm was developed, which makes it possible to perform a three-state
assessment of the road surface condition adaptively to the driving speed of a mining
vehicle. The paper presents the integration of the result data from the classification
model with the GIS map and the vehicle movement path estimated from inertial
navigation. This approach provides a holistic view of the condition of road infrastructure
in the underground mine. In the future, it is suggested to integrate the proposed solution
with an IoT platform based on thousands of low-cost sensors installed on each of the
vehicles in the mine. In this way, on-line measurements can cover the entire road network
in a mine. Besides that, the more data the machines gather, the more accurate the
classifier can become. At this moment, a relatively small sample of data allowed for the
creation of a classifier which performed the detection with satisfactory accuracy for a
machine running at the most popular (average) speeds. Since machines rarely run at
higher speeds, more data is needed to obtain similar results for those speeds. There is
also a need to compare the classification results with a more accurate and proven method
of road quality assessment, because the validation method presented here is highly
subjective.

Acknowledgements. This work is a part of the project which has received funding from the
European Union’s Horizon 2020 research and innovation programme under grant agreement No
780883.

Fabric Defect Detection System

Tanjim Mahmud1(✉), Juel Sikder1, Rana Jyoti Chakma1, and Jannat Fardoush2

1 Department of Computer Science and Engineering, Rangamati Science and Technology University, Rangamati, Bangladesh
{tanjim.cse,rchakma}@rmstu.edu.bd, sikder_juel@yahoo.com
2 Department of Computer Science and Engineering, University of Chittagong, Chittagong, Bangladesh
jannat.soma@gmail.com

Abstract. Fabric inspection is very significant in textile manufacturing. The quality
of fabric depends on the vital activity of fabric inspection to detect fabric defects.
Industrialists' profits decrease due to fabric defects, which cause disagreeable losses.
Traditional defect detection is conducted in many industries by professional human
inspectors who manually mark defect patterns. However, such detection methods have
shortcomings such as exhaustion, tediousness, negligence, inaccuracy, complication,
and time consumption, which reduce the number of faults found. To solve these
issues, a framework based on image processing has been implemented to
automatically and efficiently detect and identify fabric defects. The proposed system
works in three steps. In the first step, image segmentation is employed on more than a
few fabric images in order to enhance them, find the valuable information, and
eliminate the unusable information of the image by using edge detection techniques.
After the first step, morphological operations are employed on the fabric image. In the
third step, feature extraction is done with the FAST (Features from Accelerated
Segment Test) extractor. After feature extraction, PCA (Principal Component
Analysis) is applied, as it reduces the dimensions while preserving the useful
information, and the various fabric defects are classified through a neural network,
which is also used to find the classification accuracy. The proposed system provides
high accuracy as compared to other systems. The investigation has been done in a
MATLAB environment on real images of the TILDA database.

Keywords: Defect detection · FAST (Features from Accelerated Segment Test) · Neural network · PCA (Principal Component Analysis)

1 Introduction

The textile industry is a rising sector. Development and advancement of the sector
normally require huge investment. Be that as it may, the textile industry, like any other
sector, experiences various issues. These include budgetary losses, client disappointment,
time wasting, and so on, against whose effects some protection is needed. Fabric defects
are probably the greatest challenge confronting the textile business. Fabric is a commonly
used material in daily life, produced from fibers.
Most fabrics are delivered after passing through a series of production stages, during
which various machines and methods are utilized. Along the way, fabrics are subjected
to pressures and stresses that cause defects. Defects take various names according to
their structures and directions. The textile business has distinguished more than 70 types
of defects [1], such as laddering, end-out, hole, and oil spot, as shown in Fig. 1.
Unexpected events during the manufacturing of fabric may be the reason for various
defects on the fabric surface [2]. The presence of defects can diminish the price of fabric
by 50-60% [1]. Reducing such defects in the production process is therefore a priority
for the industrialist.

Fig. 1. Different defects in a fabric

Thus, fabric manufacturing is one of the largest traditional businesses where fabric
inspection systems can play a vital role in raising the manufacturing rate. These days,
the significance of the inspection process nearly equals that of the manufacturing
process in the current industrialist viewpoint. The idea of the inspection process is to
recognize errors or defects, if any exist, and then to adjust parameters or alert the
inspector to check the manufacturing procedure [3]. For the most part, fabric defect
recognition utilizes two kinds of inspection models [4]. The first is the human-based
inspection system, as shown in Fig. 2. The second is the automated inspection system,
as shown in Fig. 3.
Accordingly, human-based defect detection performed by specialists quickly turns out
to be a complex and tedious task [5, 6]. Therefore, having proficient automated
frameworks in place is a significant necessity for improving reliability and accelerating
quality control, which may increase profitability [7-10]. The subject of automated
defect detection has been examined in several works over recent decades. Although
there is no universal methodology for handling this issue, several strategies based on
image processing procedures have been proposed in recent years [11-13]. These
strategies were utilized to recognize defects at the image level, so the precision rate is
low and it is hard to locate the defects precisely. Hence, they cannot be extended to
various fabrics. Recently, some other techniques based on the local image level have
been proposed, which use a base unit as the fundamental object of operation to extract
image features. These methodologies can be ordered into four principal groups:
Statistical, Signal processing-based, Structural, and Model-based.

Fig. 2. Human-based inspection system
Fig. 3. Machine automated inspection system

In the statistical approach, gray-level properties are utilized to describe the textural
property of the texture image or a measure of gray-level dependence, called 1st-order
statistics and higher-order statistics, respectively [14]. The 1st-order statistics, for
example mean and standard deviation [15, 16], rank function [17], and local integration,
can gauge the differences in gray-level intensity between defective areas and
background. The higher-order statistics depend on the joint probability distribution of
pixel sets, for example the gray-level co-occurrence matrix [18], the gray-level
difference strategy [15], and the autocorrelation method. The drawback of this strategy
is that the defect size must be sufficiently large to enable an effective estimation of the
texture property, so this methodology is weak in handling small local defects.
Additionally, the calculation of higher-order statistics is time-consuming [17].
In the second class, the model-based methodology, the generally utilized strategies
are Markov random field and Gaussian Markov random field models [16]. These
models capture the texture features of the examined texture and can represent more
exactly the spatial interrelationships between the gray-levels in the texture. However,
like the methodologies based on second-order statistics, it is also tough for the
model-based methodology to identify small-sized defects, because these methodologies
as a rule require an adequately large region of the texture to estimate the parameters of
the models.
The structural approach generally relies on the properties of the primitives of the
defect-free fabric texture, and their related placement rules, to detect the presence of a
flawed region. Apparently, the practicability of this methodology is limited to textures
with a regular macro texture.

Unlike the above methodologies, which separate the defects in terms of the visual
properties of the fabric texture, the signal processing based methodology extracts
features by applying different signal processing procedures to the fabric image. It is
expected that the distinguishability between the defect and the non-defect can be
improved in the processed fabric image. This methodology further comprises the
following techniques: Spatial filtering, Karhunen-Loeve transform, Fourier transform,
Gabor transform, and Wavelet transform.
A weakness of this methodology is that its performance is easily influenced by noise
in the fabric image. The computed coefficients optimally represent the defect-free
fabric image, but not the optimal separation between the defect and the non-defect.
Multi-scale techniques are more proficient in the separation of fabric defects than
techniques that depend on texture investigation at a single scale [19]. Compared with
the Gabor transform, the wavelet transform has the benefit of greater adaptability in
the decomposition of the fabric image [20]. Consequently, the wavelet transform is
seen as the most suitable approach to feature extraction for fabric defect detection.

Table 1. Taxonomy of some most recent related works


Article Classifier Machine learning technique Accuracy rate
[21] Artificial Neural Network Counterpropagation 82.97%
[22] Artificial Neural Network Backpropagation 78.4%
[23] Artificial Neural Network Resilient backpropagation 85.57%
[24] Support Vector Machine NA 77%
[25] Artificial Neural Network Backpropagation 84%
[26] Artificial Neural Network Backpropagation 81%
Artificial Neural Network Least mean square error (LMS) 87.21%
[27] Artificial Neural Network Backpropagation 85.9%
[28] Artificial Neural Network Backpropagation 76.5%
[29] Artificial Neural Network Learning vector quantization (LVQ) 67.11%
[30] Model-based clustering NA 65.2%
[31] Artificial Neural Network Backpropagation 71.34%
[32] Artificial Neural Network Resilient backpropagation 69.1%

Table 1 illustrates a taxonomy of the most recent fabric defect detection methods, in
terms of their classifier, machine learning technique, and accuracy rate.
In this paper, we propose an innovative defect detection algorithm which has the
capability to cope with different types of defects. Our algorithm is based on four
phases. In the initial phase, image segmentation is utilized on more than a few fabric
images so as to enhance the fabric image, locate the important data, and remove the
unusable data of the image by utilizing different edge detection strategies. After the
initial phase, morphological operations are applied to the fabric image. In the third
step, feature extraction is done with the FAST (Features from Accelerated Segment
Test) extractor. After feature extraction, PCA is applied, as it lessens the dimensions
while preserving the helpful data, and the different fabric defects are characterized
through a neural network; the classifier is also used to find the accuracy rate. The
proposed framework gives high precision when contrasted with other frameworks.
The investigation has been done in a MATLAB environment on real images of the
TILDA database [33].
The remainder of the paper is arranged as follows. Section 2 presents the various types
of fabric defects. Section 3 explains our proposed approach for defect detection.
Section 4 presents the application of our system and its analysis. Finally, Sect. 5
concludes the paper and presents our future research plans.

2 Defects in Fabric

Fabric materials are used to prepare various categories and forms of fabric items in the
industry. Consequently, yarn quality and/or loom defects affect the fabric quality. It
has been estimated [34] that the price of fabrics is reduced by 45-65% due to the
presence of defects such as dye mark/dye spot, slack warp, faulty pattern card, holes,
spirality, grease oil/dirty stains, mispick, slub, wrong end, slack end, and so on [1].
In a fabric, defects can occur due to machine faults, color bleeding, yarn problems,
excessive stretching, holes, dirt spots, scratches, poor finishing, crack points, material
defects, processing defects, and so on [35, 36].

3 Proposed Methodology for Defect Detection

Fig. 4. Block diagram of the developed system

Figure 4 summarizes the steps of the methodology: image segmentation, feature
extraction, PCA (Principal Component Analysis), and image classification.

3.1 Image Segmentation


Image segmentation is a fundamental step in image analysis. Segmentation isolates
an image into its objects or component parts. Edge detection is a mechanism in image
processing that makes the image segmentation procedure and pattern recognition more
precise [37, 38]. It fundamentally diminishes the amount of information and filters out
pointless data, while preserving the helpful properties of an image. The effectiveness of
many image processing tasks relies on how well significant edges are identified. Edge
detection is one of the procedures for detecting intensity discontinuities in a digital
image; essentially, the process of locating sharp discontinuities in an image is known as
edge detection. There are many edge detection strategies available, each designed to be
sensitive to particular types of edges. Factors that affect the choice of an operator for
edge detection include edge direction, edge structure, and noise conditions. The paper
applied the histogram equalization strategy on the fabric image as shown in Fig. 6, after
which the edge detection strategy was applied, also shown in Fig. 6. There are
numerous operators in edge detection methods, for example Roberts, Sobel, and
Prewitt [39, 40], and the results show that Canny's edge detection method performs
better than every other strategy.
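The paper's experiments were run in MATLAB; as an illustration only, an equivalent of this step in Python/OpenCV could look as follows (the Canny thresholds are assumed values, not taken from the paper).

import cv2

def segment_fabric(image_path):
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    equalized = cv2.equalizeHist(gray)      # histogram equalization enhances contrast
    edges = cv2.Canny(equalized, 50, 150)   # Canny performed best among the operators
    return equalized, edges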

3.2 Feature Extraction


The feature extractor is applied to the dataset of images as shown in Fig. 6 and relies
on local feature extraction. The point of local feature description is to represent the
image based on some salient regions. The image is described by its local structures
through a set of local feature descriptors obtained from a set of image regions called
interest regions [41]. FAST (Features from Accelerated Segment Test) has been applied
to the fabric picture to extract the features, as shown in Fig. 6. FAST was originally
proposed by Rosten and Drummond as a strategy for recognizing interest regions in an
image [42]. An interest region in an image is a pixel that has a well-defined position
and can be robustly detected. An interest region has high local information content and
should ideally be repeatable between different images [43].
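As a hedged illustration of this step (again in Python/OpenCV rather than the paper's MATLAB environment), FAST interest regions can be detected as follows; the threshold value is an assumption.

import cv2

def fast_interest_regions(gray, threshold=25):
    fast = cv2.FastFeatureDetector_create(threshold=threshold)
    keypoints = fast.detect(gray, None)
    # Each keypoint has a well-defined position and a response (local information content)
    return [(kp.pt, kp.response) for kp in keypoints]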

3.3 PCA (Principal Component Analysis)


PCA is a straightforward dimensionality reduction method used to discard features that
are not helpful. It preserves the valuable features of the information or image [44, 45].
Applied after feature extraction with the FAST extractor, it yields great performance
and accuracy via the classifier.

3.4 Image Classification


A neural network [26, 45] is a machine learning technique that has been applied within
a pattern recognition framework to classify the images of fabric defects shown in
Fig. 6, producing good outcomes after the framework is trained on the dataset [46, 47].
The dataset is partitioned into a training stage and a testing stage to determine the
hidden neurons in the pattern recognition framework, as shown in Fig. 5. The classifier
computes the classification accuracy from the features produced by the extractor. The
classifier has been applied to real images of fabric [33].
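For illustration, the PCA-plus-neural-network stage of Sects. 3.3-3.4 could be sketched in Python with scikit-learn as below; the paper itself used MATLAB's pattern recognition framework, and the number of components, hidden layer size, and feature-vector construction are assumptions.

from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

def train_defect_classifier(X, y, n_components=20):
    # X: fixed-length feature vectors derived from FAST interest regions; y: defect classes
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
    pca = PCA(n_components=n_components).fit(X_train)   # keep the informative directions
    clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000)
    clf.fit(pca.transform(X_train), y_train)
    return accuracy_score(y_test, clf.predict(pca.transform(X_test)))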

Fig. 5. Neural network

4 Application of the System and Analysis


4.1 Proposed Methodology
Phase 1: Enhance the fabric images and apply the edge detection method
Phase 2: Apply FAST (Features from Accelerated Segment Test) to extract the features and discover the interest regions
Phase 3: Apply PCA (Principal Component Analysis) to decrease the dimensions and preserve the beneficial data
Phase 4: Apply the machine learning algorithm (neural network) to get better accuracy, as it provides better outcomes

Fig. 6. Application of the system: original image, gray image, histogram, binary image, detected region.

Fig. 7. MATLAB environment for defect detection



Table 2. Comparison of neural network-based classification models


Article Accuracy Comment
[21] 82.97%
[22] 78.4%
[23] 85.57%
[24] 77%
[25] 84%
[26] 85.9%
[27] 76.5%
[28] 67.11%
[29] 65.2%
[30] 71.34%
[31] 69.1%
Developed system 97.21% Greatest accuracy among all developed systems

As compared to the other techniques, our methodology provides better accuracy, as
shown in Table 2. Experiments were performed on the TILDA database [33] and give
better results: a 100% detection rate for the training set and 97.21% accuracy for the
test set.

Fig. 8. System snapshot



Fig. 9. Performance comparison of different studies (accuracy [%] per article).

The feature extraction technique of the proposed methodology gives better accuracy
than the feature extraction techniques of the other discussed methodologies using
machine learning algorithms, as shown in Fig. 9.

5 Conclusion and Future Work

The detection of faulty fabrics plays an important role in the success of any fabric
industry. The fabric industry needs real-time quality control to find defects quickly
and efficiently. Manual control is inefficient and time-consuming, which leads to heavy
losses. On the other hand, automatic quality control is considerably more proficient,
because it is real-time and autonomous compared to manual inspection. In the fabric
detection systems suggested by researchers so far, the accuracy rate for detecting
defective fabric is very low. This paper analyzed the shortcomings of the traditional
approach to fabric defect detection and proposed an innovative fabric defect detection
technique based on the FAST (Features from Accelerated Segment Test) extractor and
PCA (Principal Component Analysis) combined with neural network classification, to
enhance the recognition accuracy for textured fabrics that cannot be effectively detected
by existing techniques. The paper
concludes that the proposed fabric defect detection technique gave better accuracy after
applying the machine learning algorithm and PCA, in comparison to the other
referenced approaches. Additionally, our method notably showed its efficiency in
separating defect-free from defective areas of the fabric. Moreover, after a series of
improvements, our method exhibited better recognition performance for the fabric
images. Having successfully trained the neural network, 30 samples of each type of
defect were used to assess the accuracy of the network classification. The defective
images were then graded with a 97.21% overall accuracy score. The dye spot defect
was identified with a 100% accuracy score. The experimentation has been applied to
the real images of the TILDA database. The implementation has been done in MATLAB
software, as shown in Fig. 7 and Fig. 8; it automatically detects the fabric defect.
In the future, we will focus on sensor data-oriented systems and on developing an
optimal system that matches a real fabric defect detection setting more closely, as well
as applying our machine learning algorithms to other feature extractors.

References
1. Stojanovic, R., Mitropulos, P., Koulamas, C., Karayiannis, Y., Koubias, S., Papadopoulos,
G.: Real-time vision-based system for textile fabric inspection. Real-Time Imaging 7, 507–
518 (2001)
2. Aasim A.: A catalogue of visual textile defects, ministry of textiles (2004)
3. Newman, T.S., Jain, A.K.: A survey of automated visual inspection. Comput. Vis. Image
Underst. 61(2), 231–262 (1995)
4. Kumar, A.: Computer-vision-based fabric defect detection: a survey. IEEE Trans. Ind.
Electron. 55(1), 348–363 (2008)
5. Huart, J., Postaire, J.G.: Integration of computer vision on to weavers for quality control in
the textile industry. In: Proceeding SPIE 2183, pp. 155–163, February 1994
6. Dorrity, J.L., Vachtsevanos, G.: On-line defect detection for weaving systems. In:
Proceeding IEEE Annual Technical Conference Textile, Fiber, and Film Industry, pp. 1–
6, May 1996
7. Rosandich, R.G.: Intelligent Visual Inspection. Chapman & Hall, London, UK (1997)
8. Batchelor, B.G.: Lighting and viewing techniques. In: Batchelor, B.G., Hill, D.A., Hodgson,
D.C. (eds) Automated Visual Inspection. IFS and North Holland (1985)
9. Roberts, J.W., Rose, S.D., Jullian, G., Nicholas, L., Jenkins, P.T., Chamberlin, S.G.,
Maroscher, G., Mantha, R., Litwiller, D.J.: A PC-based real time defect imaging system for
high speed web inspection. In: Proceeding SPIE 1907, pp. 164–176 (1993)
10. Bayer, H.A.: Performance analysis of CCD-cameras for industrial inspection. In: Proceed-
ing SPIE 1989, pp. 40–49 (1993)
11. Cho, C., Chung, B., Park, M.: Development of real-time vision-based fabric inspection
system. IEEE Trans. Ind. Electron. 52(4), 1073–1079 (2005)
12. Kumar, A.: Computer-vision-based fabric defect detection: a survey. IEEE Trans. Ind.
Electron. 55(1), 348–363 (2008)
13. Ngan, H., Pang, G., Yung, N.: Automated fabric defect detection – a review. Image Vis.
Comput. 29(7), 442–458 (2011)
14. Smith, B.: Making war on defects. IEEE Spectr. 30(9), 43–47 (1993)
15. Fernandez, C., Fernandez, S., Campoy P., Aracil R.: On-line texture analysis for flat
products inspection. neural nets implementation. In: Proceedings of 20th IEEE International
Conference on Industrial Electronics, Control and Instrumentation, vol. 2, pp. 867–872
(1994)
16. Ozdemir S., Ercil A.: Markov random fields and Karhunen-Loeve transforms for defect
inspection of textile products. In: IEEE Conference on Emerging Technologies and Factory
Automation, vol. 2, pp. 697–703 (1996)
Fabric Defect Detection System 799

17. Bodnarova, A., Williams, J.A., Bennamoun, M., Kubik, K.: Optimal textural features for flaw
detection in textile materials. In: Proceedings of the IEEE TENCON 1997 Conference,
Brisbane, Australia, pp. 307–310 (1997)
18. Gong, Y.N.: Study on image analysis of fabric defects. Ph.D. dissertation, China Textile
University, Shanghai China (1999)
19. Zhang, Y.F., Bresee, R.R.: Fabric defect detection and classification using image analysis.
Text. Res. J. 65(1), 1–9 (1995)
20. Nickolay, B., Schicktanz, K., Schmalfub, H.: Automatic fabric inspection– utopia or reality.
Trans. Melliand Textilberichte 1, 33–37 (1993)
21. Habib, M.T., Rokonuzzaman, M.: A set of geometric features for neural network-based
textile defect classification, ISRN Artif. Intell. 2012, Article ID 643473, p. 16 (2012)
22. Saeidi, R.D., Latifi, M., Najar, S.S., Ghazi Saeidi, A.: Computer Vision-Aided Fabric
Inspection System For On-Circular Knitting Machine, Text. Res. J. 75(6), 492–497 (2005)
23. Islam, M.A., Akhter, S., Mursalin, T.E.: Automated textile defect recognition system using
computer vision and artificial neural networks. In: Proceedings World Academy of Science,
Engineering and Technology, vol. 13, pp. 1–7, May 2006
24. Murino, V., Bicego, M., Rossi, I.A.: Statistical classification of raw textile defects. In: 17th
International Conference on Pattern Recognition (ICPR 2004), ICPR, vol. 4, pp. 311–314
(2004)
25. Karayiannis, Y.A., Stojanovic, R., Mitropoulos, P., Koulamas, C., Stouraitis, T., Koubias,
S., Papadopoulos, G.: Defect detection and classification on web textile fabric using multi
resolution decomposition and neural networks. In: Proceedings on the 6th IEEE International
Conference on Electronics, Circuits and Systems, Pafos, Cyprus, pp. 765–768, September
1999
26. Kumar, A.: Neural network based detection of local textile defects. Pattern Recogn. 36,
1645–1659 (2003)
27. Kuo, C.F.J., Lee, C.-J.: A back-propagation neural network for recognizing fabric defects.
Text. Res. J. 73(2), 147–151 (2003)
28. Mitropoulos, P., Koulamas, C., Stojanovic, R., Koubias, S., Papadopoulos, G., Karayiannis,
G.: Real-time vision system for defect detection and neural classification of web textile
fabric. In: Proceedings SPIE, vol. 3652, San Jose, California, pp. 59–69, January 1999
29. Shady, E., Gowayed, Y., Abouiiana, M., Youssef, S., Pastore, C.: Detection and
classification of defects in knitted fabric structures. Text. Res. J. 76(4), 295–300 (2006)
30. Campbell, J.G., Fraley, C., Stanford, D., Murtagh, F., Raftery, A.E.: Model-based methods
for textile fault detection, Int. J. Imaging Syst. Technol. 10(4), 339–346, July 1999
31. Islam, M.A., Akhter, S., Mursalin, T.E., Amin, M.A.: A suitable neural network to detect
textile defects. Neural Inf. Process. 4233, 430–438. Springer, October 2006
32. Habib, M.T., Rokonuzzaman, M.: Distinguishing feature selection for fabric defect
classification using neural network. J. Multimedia 6 (5), 416–424, October 2011
33. TILDA Textile texture database, texture analysis working group of DFG. http://lmb.informatik.unifreiburg.de
34. Srinivasan, K., Dastor, P.H., Radhakrishnaihan, P., Jayaraman, S.: FDAS: a knowledge-based framework for analysis of defects in woven textile structures. J. Text. Inst.
83(3), 431–447 (1992)
35. Rao Ananthavaram, R.K., Srinivasa Rao, O., Krishna, P.M.H.M.: Automatic defect detection
of patterned fabric by using RB method and independent component analysis. Int.
J. Comput. Appl. 39(18), 52–56 (2012)
36. Sengottuvelan, P., Wahi, A., Shanmugam, A.: Automatic fault analysis of textile fabric using
imaging systems. Res. J. Appl. Sci. 3(1), 26–31 (2008)
800 T. Mahmud et al.

37. Abdi, H., Williams, L.J.: Principal component analysis. Wiley Interdisciplinary Rev.
Comput. Stat. 2(4), 433–459 (2010). https://doi.org/10.1002/wics.101
38. Kumar, T., Sahoo, G.: Novel method of edge detection using cellular automata. Int.
J. Comput. Appl. 9(4), 38–44 (2010)
39. Zhu, Q.: Efficient evaluations of edge connectivity and width uniformity. Image Vis.
Comput. 14, 21–34 (1996)
40. Senthilkumaran, N., Rajesh, R.: Edge detection techniques for image segmentation – a
survey of soft computing approaches. Int. J. Recent Trends Eng. 1(2), 250–254 (2009)
41. Rizon, M., Hashim, M.F., Saad, P., Yaacob, S.: Face recognition using eigen faces and
neural networks. Am. J. Appl. Sci. 2(6), 1872–1875 (2006)
42. Rosten, E., Porter, R., Drummond, T.: FASTER and better: a machine learning approach to
corner detection. IEEE Trans. Pattern Anal. Mach. Intell. 32, 105–119 (2010)
43. Wikipedia, Corner Detection. http://en.wikipedia.org/wiki/Corner_detection. Accessed 16
March 2011
44. Chang, J.Y., Chen, J.L.: Automated facial expression recognition system using neural
networks. J. Chin. Inst. Eng. 24(3), 345–356 (2001)
45. Jianli, L., Baoqi, Z.: Identification of fabric defects based on discrete wavelet transform and
back-propagation neural network. J. Text. Inst. 98(4), 355–362 (2007)
46. Tamnun, M.E., Fajrana, Z.E., Ahmed, R.I.: Fabric defect inspection system using neural
network and microcontroller. J. Theor. Appl. Inf. Technol. 4(7) (2008)
47. Bhanumati, P., Nasira, G.M.: Fabric inspection system using artificial neural network. Int.
J. Comput. Eng. 2(5), 20–27 May 2012
Alzheimer’s Disease Detection Using CNN
Based on Effective Dimensionality
Reduction Approach

Abu Saleh Musa Miah1(✉), Md. Mamunur Rashid1, Md. Redwanur Rahman1, Md. Tofayel Hossain1, Md. Shahidujjaman Sujon1, Nafisa Nawal1, Mohammad Hasan1(✉), and Jungpil Shin2(✉)

1 Department of CSE, Bangladesh Army University of Science and Technology (BAUST), Saidpur, Bangladesh
abusalehcse.ru@gmail.com, hasancse.cuet13@gmail.com
2 School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu, Fukushima 965-8580, Japan
jpshin@u-aizu.ac.jp

Abstract. In developed countries, Alzheimer's disease (AD) is one of the major causes
of death. Until now, no clinical diagnostic method is available, but from a research
point of view, computational algorithms can detect this disease with good accuracy.
Many researchers are working to understand Alzheimer's disease properties, its stages,
and ways of classifying it. This research plays a vital role in clinical tests for medical
researchers and in the overall medical sector. One of the major problems found by
researchers in the field is the large data dimension. In this study, we propose an
efficient dimensionality reduction method to improve Alzheimer's disease (AD)
detection accuracy. To implement the method, we first cleaned the dataset, removing
null values and other unacceptable data through some preprocessing tasks. We then
split the preprocessed data into training and test datasets, employed a dimension
reduction method, and applied a machine learning algorithm on the reduced dataset to
produce the accuracy of detecting Alzheimer's disease. To observe and calculate the
accuracy, we computed the confusion matrix, precision, recall, and f1-score values,
and finally the accuracy of the method as well. To reduce the dimension of the data,
we applied in turn Principal Component Analysis (PCA), Random Projection (RP),
and Feature Agglomeration (FA). On the reduced features, we applied the Random
Forest (RF) and Convolution Neural Network (CNN) machine-learning algorithms on
top of the dimensionality reduction methods. To evaluate our proposed methodology,
we used the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. We
experimented with (i) Random forest with Principal component analysis (RFPCA),
(ii) Convolution neural network with PCA (CNNPCA), (iii) Random forest with
Random projection (RFRP), and (iv) Random forest with Feature agglomeration
(RFFA) to differentiate patients with AD from healthy patients. Our model Random
forest with Random projection (RFRP) produced 93% accuracy. We believe that our
work will be recognized as a groundbreaking discovery in this domain.


Keywords: Alzheimer's disease (AD) · Diagnosis · Dementia · Convolution neural network (CNN) · Principal component analysis (PCA)

1 Introduction

Alzheimer's disease is a neurodegenerative disease that is the most common form of
dementia [1]. In our modern society, it is among the most costly disorders, and it is
characterized by cognitive, mental, and behavioral disruption. In other words, the most
prominent cause of dementia is Alzheimer's disease, a general term for memory loss
and decline of other cognitive abilities that interfere with everyday life [2].
As of today, approximately 60 to 80% of instances of dementia in Bangladesh are due
to Alzheimer's. It most often starts in individuals over the age of 65, although
early-onset Alzheimer's accounts for 4-5% of instances. There are about 460,000
people struggling with dementia in our country according to the Alzheimer's Disease
International Asia Pacific Report 2015, a number expected to double by 2030 and
triple by 2050 [3, 4]. So Bangladesh is not unfamiliar with this disease, and the
influence of AD is not negligible. Bangladesh has a noticeably young population:
among one hundred sixty million people, 8% are older people, i.e., approximately
12 million [5]. Among 12 million older people, one can expect at least a few thousand
struggling with dementia. Older people who suffer from memory disturbances are
often stigmatized or branded as "foolish". Few facts on the number of AD patients in
Bangladesh are accessible; in this nation, there are no correct epidemiological records
of AD [6]. Affected patients and their family members constantly face a number of
issues, and there is restricted funding for AD research projects. It is high time,
therefore, to think proactively about the disease and its management and to take the
needed action in this respect. Policy makers, health professionals, and allied
organizations need to come forward to make AD a national priority in Bangladesh [7].
In this research field there are many open source databases [8, 9]; ADNI is the most
widely used (adni.loni.usc.edu) [10]. Moreover, OASIS (www.oasis-brains.org) and
AIBL (aibl.csiro.au) are also usable Alzheimer open source databases. Another
clinical open source database much used in recent years is the J-ADNI database
[11, 12], which contains longitudinal study data from Japan. In the last decade,
machine-learning approaches have been applied to detecting Alzheimer's disease with
great success [13-16]. Alonso et al. and Ni et al. employed not only machine learning
but also data mining tools to detect Alzheimer's disease, working to enhance the
productivity and quality of health centers and medical research [17, 18]. Esmaeilzadeh
et al. applied a 3D convolution neural network to detect Alzheimer's disease on a
magnetic resonance imaging (MRI) dataset collected from 841 people [14]. Long et al.
proposed a methodology based on MRI data of 427 patients with a support vector
machine [19], together with mathematical and statistical techniques that treat the
network as a black box. David et al. employed an ensemble method to predict the
conversion from mild cognitive impairment (MCI) to Alzheimer's disease (AD). They
used data from 51 cognitively normal controls and 100 MCI patients and combined
5 types of scores calculated using natural language processing and machine learning
[21]. Liu et al. also employed an ensemble method to detect Alzheimer's disease [22].
One of the limitations of those works is the high dimensionality of the feature vector.
In this study, we applied machine learning and data mining methodology to the
Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset for classifying the
various stages of Alzheimer's disease. In particular, we have classified Alzheimer's
disease (AD) and normal control (NC) subjects from the ADNI dataset using Random
forest and Convolutional neural network (CNN) models. To reduce the dimensionality
of the feature vector in this field, Davatzikos et al. and Yoon et al. employed the
principal component analysis (PCA) algorithm [23, 24]; a key aspect of that technique
is to select its parameters carefully. To overcome the high dimensionality problem, we
applied three dimensionality reduction methods to reduce the dimensionality of the
feature vector, namely Principal component analysis (PCA), Random projection (RP),
and Feature agglomeration (FA). We experimented with Random Forest and
Convolutional Neural Network (CNN) classifiers on top of the dimensionality
reduction methods. Finally, we found that the combination of Random projection (RP)
and Random forest (RF) (RFRP) produced the best result.

2 Dataset

Our dataset contains data from the Alzheimer's Disease Neuroimaging Initiative
(ADNI) (http://adni.loni.usc.edu/) 1.5 T database, with both Healthy Controls
(HC) and Alzheimer's disease (AD) patients. The dataset contains data from a total of
627 people. 363 of them were male, with an average age of 75.49 years and a range
between 56.4 and 89 years, while 264 were female, with an average age of 74.72 years,
ranging from 55.1 to 89.6 years. Information about the subjects of the ADNI dataset is
shown in Table 1.

Table 1. Dataset
Age/Gender Male Female
Min 56.4 55.1
Max 89 89.6
Average Age 75.49 74.72

3 Proposed Methodology

We have proposed a combination of four models, which we applied separately to
observe their accuracies. The flow chart in Fig. 1 shows the implemented
methodologies in general.

Fig. 1. Flowchart of the proposed method.

3.1 Preprocessing
We took the ADNI dataset and loaded it, using Python as the language and PyCharm
as our IDE. Then the dataset proceeded to the next step. We checked for null values in
the dataset using a Pandas DataFrame: null entries are flagged as Boolean values,
.sum() counts them per column, and applying .sum() again gives the total count. Then
we removed the irrelevant data from our dataset. All the data were sorted according to
their features and classes. All the columns containing true/false values, as well as the
APOES column, were computed and normalized using one-hot encoding.
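A minimal Pandas sketch of this preprocessing, under the assumption of a CSV export of the ADNI table and illustrative column handling:

import pandas as pd

def preprocess_adni(csv_path):
    df = pd.read_csv(csv_path)
    # isnull() flags null entries as Booleans; sum() counts them per column,
    # and a second sum() gives the total number of nulls in the frame
    total_nulls = df.isnull().sum().sum()
    df = df.dropna()              # remove rows with null or unacceptable values
    df = pd.get_dummies(df)       # one-hot encode true/false and categorical columns
    return df, total_nulls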

3.2 Feature Reduction


We used the following methods for the feature reduction task:
Principal Component Analysis (PCA)
PCA is a dimensionality reduction strategy that is used to map features onto a
lower-dimensional space. The transformation of the data can be linear or nonlinear.
The algorithm is based on a transformation $T = XW$ which maps data vectors from
one space to a new space. In the new space it selects a specific number of components
from all components based on the eigenvectors, using Eq. (1), which gives the
truncated transformation:

$$T_L = X W_L \qquad (1)$$

Here $T_L$ is the matrix of the reduced components. In other words, PCA acts as a
linear transformation, as in Eq. (2):

$$\mathbf{t} = W^{\top} \mathbf{x} \qquad (2)$$

where $\mathbf{x} \in \mathbb{R}^p$, $\mathbf{t} \in \mathbb{R}^L$, and the columns of
the $p \times L$ matrix $W$ form an orthogonal basis for the $L$ feature components
that are selected by construction, meaning that only $L$ columns are selected from all
columns [25]. The output matrix maximizes the preserved variance of the original data
while minimizing the total squared reconstruction error,

$$\left\| T W^{\top} - T_L W_L^{\top} \right\|_2^2 \qquad (3)$$

or

$$\left\| X - X_L \right\|_2^2 \qquad (4)$$

Random Projection
Random projection is a method used to reduce the dimensionality of a set of points in
Euclidean space. In contrast to other methods, random projection methods are known
for their robustness, simplicity, and low error rate. In random projection, the original
$d$-dimensional data is projected onto a $k$-dimensional ($k \ll d$) subspace using a
random $k \times d$ matrix $R$ with unit-length columns [26]. If $X_{d \times N}$ is
the original set of $N$ $d$-dimensional points, then the projection of the data onto the
lower $k$-dimensional space is

$$X^{RP}_{k \times N} = R_{k \times d} X_{d \times N} \qquad (5)$$

Feature Agglomeration
Feature Agglomeration is a dimensionality reduction method similar to agglomerative
clustering, but it recursively merges features instead of samples. Let $M_{m \times n}$
represent $n$ samples of dimension $m$ that you want to cluster. In feature
agglomeration, the algorithm clusters the transpose of the matrix, i.e., $M^{\top}$, so
it clusters $m$ samples of dimension $n$; these samples represent the features [27].
For example, given 3 samples of dimension 3 (a $3 \times 3$ matrix with three
features), the algorithm can reduce the dimension of the dataset to 2 by clustering
together feature 1 and feature 2 while leaving feature 3 unchanged.
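All three reducers are available in scikit-learn, so this subsection can be summarized in a short sketch (the target dimension k is an assumed value):

from sklearn.decomposition import PCA
from sklearn.random_projection import GaussianRandomProjection
from sklearn.cluster import FeatureAgglomeration

def reduce_features(X, k=10):
    reducers = {
        "PCA": PCA(n_components=k),                      # Eqs. (1)-(4)
        "RP": GaussianRandomProjection(n_components=k),  # Eq. (5)
        "FA": FeatureAgglomeration(n_clusters=k),        # merges similar features
    }
    return {name: r.fit_transform(X) for name, r in reducers.items()}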

3.3 Classification
We split the reduced dataset into train and test data, with 70% training data and 30%
test data. We then applied machine learning algorithms, namely the Random forest and
Convolution neural network algorithms.
We experimented with the input data using different combinations of models, namely
(i) PCA & Random Forest, (ii) PCA & CNN, (iii) Random Projection & Random
Forest, and (iv) Feature Agglomeration & Random Forest, to differentiate patients with
AD from healthy patients. We have used two classifiers in our study:
Random Forest
The random forest algorithm builds decision trees on sampled data, gets a prediction
from each of them, and eventually selects the best solution by voting. It is an ensemble
approach that is stronger than a single decision tree, and by integrating the results it
reduces overfitting [28]. The key thing to consider when using Random Forests on
classification data is that the Gini index is often used as the criterion that determines
how nodes are chosen on a branch of a decision tree:

$$\mathrm{Gini} = 1 - \sum_{i=1}^{C} (p_i)^2 \qquad (6)$$

This criterion uses the class probabilities to calculate the Gini index of each branch of
a node, indicating which branch is more likely to occur. Here $p_i$ represents the
relative frequency of class $i$ observed in the dataset and $C$ is the number of
classes. Another way to decide how nodes split in a decision tree is by entropy:

$$\mathrm{Entropy} = -\sum_{i=1}^{C} p_i \log_2 (p_i) \qquad (7)$$

Convolutional Neural Network (CNN)
A convolutional neural network (CNN) is a class of deep neural networks most
notably used for visual imagery research [29]. A CNN has convolutional layers, ReLU
layers, pooling layers, and a fully connected layer. In the convolution process, the
feature map values are calculated according to the following formula, where $f$
denotes the input image and $h$ our kernel. The row and column indexes of the result
matrix are marked with $m$ and $n$, respectively:

$$G[m,n] = (f * h)[m,n] = \sum_{j} \sum_{k} h[j,k] \, f[m-j,\, n-k] \qquad (8)$$

3.4 Performance Evaluation


To evaluate the performance of the proposed models, we calculated the precision,
recall, f1-score, and confusion matrix. All performance metrics were calculated, and
the classification reports were made using the confusion matrix from scikit-learn.
Finally, we observed the precision and recall for the four proposed combinations of
models to see which patients are Cognitively Normal (CN), have Alzheimer's disease
(AD), or have Mild Cognitive Impairment (MCI).
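A sketch of the evaluation pipeline for one combination (e.g. RFRP), with the 70/30 split described in Sect. 3.3; the hyperparameters are assumed values:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report

def evaluate_model(X_reduced, y):
    X_tr, X_te, y_tr, y_te = train_test_split(X_reduced, y, test_size=0.3)
    clf = RandomForestClassifier(n_estimators=100).fit(X_tr, y_tr)
    y_pred = clf.predict(X_te)
    print(confusion_matrix(y_te, y_pred, labels=["AD", "CN", "MCI"]))
    print(classification_report(y_te, y_pred))   # precision, recall, f1-score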

4 Experimental Result

After splitting the dataset (ADNI) into training and test sets, we obtained 157 patients
as test data. Table 2 shows the details of the test data.

Table 2. Dataset label for test data


Stage Subject
Alzheimer’s disease (AD) 32
Healthy Controls (HC) 49
Mild cognitive impairment (MCI) 76

Evaluating this test data with our models, we found different performance accuracies.
Here we evaluated four models; some of them performed quite well and gave good
accuracy. To understand the accuracy of each model, we first computed a confusion
matrix, then the precision, recall, and f1-score, and finally the accuracy of the proposed
model as well.

Table 3. Confusion matrix: PCA + RF

Stage AD CN MCI
AD    24  2  6
CN     0 46  3
MCI    8 11 57

Table 4. Confusion matrix: PCA + CNN

Stage AD CN MCI
AD    27  0  5
CN     0 47  2
MCI    5  6 65

Tables 3, 4, 5, and 6 show the confusion matrix for every model. The confusion matrix
is one of the best ways to evaluate classifier performance. Table 3 shows the confusion
matrix for the Random forest with PCA (RFPCA) model: 24 subjects were correctly
classified and 8 subjects misclassified at the AD stage; the best accuracy was acquired
at the CN stage, where only 3 subjects were misclassified and 46 correctly classified;
finally, at the MCI stage 57 subjects were correctly classified and 19 misclassified.
Table 4 shows the confusion matrix of the CNN with PCA (CNNPCA) model. This
model also achieved the best accuracy at the CN stage, where CNNPCA correctly
classified 47 subjects and misclassified only 2.

Table 5. Confusion matrix: Random projection + Random forest

Stage AD CN MCI
AD    30  0  2
CN     2 43  4
MCI    1  1 74

Table 6. Confusion matrix: Feature agglomeration + Random forest

Stage AD CN MCI
AD    23  0  9
CN     1 45  3
MCI    8  2 66

Table 5 shows the confusion matrix of the Random forest with Random projection
(RFRP) model. This model correctly classified 30 subjects and misclassified 2 at the
AD stage. At the CN stage, the model correctly classified 43 subjects and misclassified
six. RFRP achieved its largest number of correct classifications at the MCI stage. In
the same way, Table 6 shows the confusion matrix for the Random forest with Feature
agglomeration (RFFA) model.

Table 7. Comparative precision, recall, f1-score, and accuracy of implemented methods

Method                                          Label  Precision  Recall  f1-score  Accuracy
Random forest with PCA (RFPCA)                  AD     0.84       0.84    0.84      0.885 ± 0.02
                                                CN     0.89       0.86    0.92
                                                MCI    0.90       0.86    0.88
Convolutional neural network with PCA (CNNPCA)  AD     0.75       0.76    0.75      0.808 ± 0.01
                                                CN     0.78       0.94    0.85
                                                MCI    0.86       0.75    0.80
Random forest with Random projection (RFRP)     AD     0.91       0.94    0.92      0.936 ± 0.02
                                                CN     0.98       0.88    0.92
                                                MCI    0.93       0.97    0.95
Random forest with Feature agglomeration (RFFA) AD     0.71       0.75    0.72      0.853 ± 0.02
                                                CN     0.96       0.92    0.94
                                                MCI    0.85       0.87    0.86

This model correctly classified the highest proportion of subjects at the CN stage and
the lowest proportion at the AD stage.
From the confusion matrices in Tables 3, 4, 5, and 6 we calculated the performance
metrics precision, recall, f1-score, and accuracy, which are shown in Table 7.
Table 7 shows that for the RFPCA model the maximum precision of 0.90 is achieved
for the MCI stage and the minimum precision of 0.84 at the AD stage, so the RFPCA
model is better at classifying the MCI stage than the AD stage; its maximum recall and
f1-score are produced at the CN stage. The CNNPCA model likewise achieved its
maximum precision at the MCI stage, while its maximum recall and f1-score were
produced at the CN stage. Additionally, the RFRP model produced its maximum
precision at CN, and its maximum recall and f1-score at the MCI stage. Finally, the
RFFA model achieved its maximum precision, recall, and f1-score at the CN stage.

Table 8. Accuracy table

Method name                                      Accuracy [%]
Random forest with PCA (RFPCA)                   88.00
Convolutional neural network with PCA (CNNPCA)   81.00
Random forest with Random projection (RFRP)      93.00
Random forest with Feature agglomeration (RFFA)  85.00

Table 8 summarizes the accuracy of all four models. The accuracy table shows that
the Random forest with PCA (RFPCA) model achieved 88.00% accuracy, while the
Convolutional neural network with PCA (CNNPCA), Random forest with Random
projection (RFRP), and Random forest with Feature agglomeration (RFFA) models
produced 81.00%, 93.00%, and 85.00% accuracy, respectively. We observed that the
RFRP model, which uses Random projection as its dimensionality reduction method,
worked best and produced 93.00% accuracy. Figure 2 shows a bar graph of the
varying accuracies of the implemented strategies.

Fig. 2. Comparative accuracies of proposed methodologies

5 Conclusions and Future Scope

To address the high-dimensionality problem in Alzheimer's disease diagnosis, we
propose a novel dimensionality reduction approach. To improve performance, we
design four models based on three dimensionality reduction methods, namely PCA,
Random Projection, and Feature Agglomeration, and integrate these three methods
with the Random forest and CNN classifiers. PCA is well known, but Random
Projection and Feature Agglomeration had not yet been applied in this field. The
confusion matrices, accuracy table, and bar graph show that Random projection used
as a dimensionality reduction method produced good accuracy. The success of the
proposed models is experimentally verified on the given dataset. In the future, we
would like to enhance our results and investigate how the algorithms compare with
other algorithms.

References
1. Moser, A., Pike, C.J.: Obesity and sex interact in the regulation of Alzheimer’s disease.
Neurosci. Biobehavioral Rev. 67, 102–118 (2016). ISSN 0149-7634. https://doi.org/10.
1016/j.neubiorev.2015.08.021
2. Ligthart, S.A.: Cardiovascular prevention in older people: The pre DIVA trial Thesis,
Academic Medical Center – University of Amsterdam (2015). ISBN: 978-94-6169-623-6
3. Farjana, S.: World Alzheimer’s day: Let’s not forget the forgetful 11:17 AM, 21 September
2018
4. Aggarwal, N.T., Tripathi, M., Alladi, H.H., Anstey, K.S.: Trends in Alzheimer's disease and
dementia in the Asian-Pacific region. Int. J. Alzheimer's Dis. 2012 (2012). Hindawi
Publishing Corporation. https://doi.org/10.1155/2012/171327
5. Dr. Taha, S.: World Alzheimer’s Day: Forgetting dementia in Bangladesh. Paragraph: The
Impact of Dementia in Bangladesh, 21 September 2014
6. Sneddon, R., Shankle, W.R., Hara, J., Rodriquez, A., Hoffman, D., Saha, U.: EEG detection
of early Alzheimer’s disease using psychophysical tasks. Clin. EEG Neurosci. 3, 141–150
(2005)
7. Rahman, Md., et al.: Overview and Current Status of Alzheimer’s Disease in Bangladesh,
pp. 27–42, 1 January 2017
8. Jack Jr, C.R., et al.: Magnetic resonance imaging in alzheimer’s disease neuroimaging
initiative 2. Alzheimer’s Dementia 11, 7 (2015)
9. Jongin, K., Lee, B.: Identification of Alzheimer’s disease and mild cognitive impairment
using multimodal sparse hierarchical extreme learning machine. Hum. Brain Mapp. 39(9),
3728–3741 (2018)
10. Veitch, D.P., et al.: Understanding disease progression and improving Alzheimer’s disease
clinical trials: Recent highlights from the Alzheimer’s disease neuroimaging initiative.
Alzheimer’s Dementia (2018)
11. Fujishima, M., Kawaguchi, A., Maikusa, N., Kuwano, R., Iwatsubo, T., Matsuda, H.:
Sample size estimation for Alzheimer‘s disease trials from Japanese ADNI serial magnetic
resonance imaging. J. Alzheimer‘s Dis. 56(1), 75–88 (2017)
12. Gallego-Jutglà, E., Solé-Casals, J., Vialatte, F.-B., Elgendi, M., Cichocki, A., Dauwels, J.: A
hybrid feature selection approach for the early diagnosis of Alzheimer‘s disease. J. Neural
Eng. 12(1), 016018 (2015)
13. Pellegrini, E., et al.: Machine learning of neuroimaging for assisted diagnosis of cognitive
impairment and dementia: a systematic review. Alzheimer’s Dement. Diagn. Assess. Dis.
Monit. 10(2018), 519–535 (2018)
14. Moradi, E., Pepe, A., Gaser, C., Huttunen, H., Tohka, J., Initiative, A.D.N.: Machine
learning framework for early MRI-based Alzheimer‘s conversion prediction in MCI subjects.
NeuroImage 104(2015), 398–412 (2015)
15. Albright, J.: Forecasting the progression of Alzheimer’s disease using neural networks and a
novel preprocessing algorithm. Alzheimer’s Dement. Transl. Res. Clin. Interv. 5, 483–491
(2019). ISSN 2352-8737
16. Tanveer, M., Richhariya, B., Khan, R.U., Rashid, A.H.: Machine learning techniques for the
diagnosis of alzheimer’s disease: a review, article. In: ACM Transactions on Multimedia
Computing, Communications and Applications, April 2020
17. Alonso, S.G., De La Torre-Díez, I., Hamrioui, S., López-Coronado, M., Barreno, D.C.,
Nozaleda, L.M., Franco, M.: Data mining algorithms and techniques in mental health: a
systematic review. J. Med. Syst. 42(9), 161 (2018)

18. Ni, H., Yang, X., Fang, C., Guo, Y., Xu, M., He, Y.: Data mining-based study on sub-
mentally healthy state among residents in eight provinces and cities in china. J. Tradit. Chin.
Med. 34(4), 511–517 (2014)
19. Esmaeilzadeh, S., Belivanis, D.I., Pohl, K.M., Adeli, E.: End-to-end Alzheimer’s Disease
Diagnosis and Biomarker Identification. arXiv: 1810.00523 (2018)
20. Long, X., Chen, L., Jiang, C., Zhang, L.: Prediction and classification of Alzheimer disease
based on quantification of MRI deformation. PLoS One 12, 1–19 (2017)
21. Clark, D.G., McLaughlin, P.M., Woo, E., Hwang, K., Hurtz, S., Ramirez, L., Eastman, J.,
Dukes, R.M., Kapur, P., DeRamus, T.P., Apostolova, L.G.: Novel verbal fluency scores and
structural brain imaging for prediction of cognitive outcome in mild cognitive impairment.
Alzheimer’s Dement. (Amsterdam, The Netherlands) 2, 113–122 (2016)
22. Liu, M., Zhang, D., Shen, D.: Alzheimer’s disease neuroimaging initiative. Ensemble sparse
classification of Alzheimer’s disease. NeuroImage 60(2), 1106–1116 (2012). https://doi.org/
10.1016/j.neuroimage.2012.01.055
23. Davatzikos, C., Resnick, S.M., Wu, X., Parmpi, P., Clark, C.M.: Individual patient diagnosis
of AD and FTD via high-dimensional pattern classification of MRI. NeuroImage 41, 1220–
1227 (2008). [PubMed: 18474436]
24. Yoon, U., Lee, J.M., Im, K., Shin, Y.W., Cho, B.H., Kim, I.Y., Kwon, J.S., Kim, S.I.:
Pattern classification using principal components of cortical thickness and its discriminative
pattern in schizophrenia. NeuroImage 34, 1405–1415 (2007). [PubMed: 17188902]
25. Grimm, M.O., Rothhaar, T.L., Grösgen, S., Burg, V.K., Hundsdörfer, B., Haupenthal, V.J.,
Friess, P., Kins, S., Grimm, H.S., Hartmann, T.: Trans fatty acids enhance amyloidogenic
processing of the Alzheimer amyloid precursor protein (APP). J. Nutr. Biochem. 23, 1214–
1223 (2012)
26. Bengio, Y., et al.: Representation learning: a review and new perspectives. IEEE Trans.
Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013). arXiv:1206.5538. https://doi.org/10.
1109/tpami.2013.50. PMID 23787338. S2CID 393948
27. Bingham, E., Mannila, H.: Random projection in dimensionality reduction: Applications to
image and text data, 6 May 2014
28. Zhang, J., Wu, X., Hoi, S.C.H., Zhu, J.: Feature agglomeration networks for single stage face
detection. Neurocomputing 380, 180–189 (2020). ISSN 0925-2312. https://doi.org/10.1016/
j.neucom.2019.10.087
29. Joy, M.H., Hasan, M., Miah, A.S.M., Ahmed, A., Tohfa, S.A., Bhuaiyan, M.F.I., Rashid, M.
M., Zannat, A.: Multiclass MI-Task Classification using Logistic regression and Filter Bank
Common Spatial Patterns. Springer-Nature Singapore Pte Ltd., COMS2 2020, CCIS 1235,
pp. 160–170 (2020)
30. Teuwen, J., Moriakov, N.: Chapter 20 - Convolutional neural networks, Handbook of
Medical Image Computing and Computer Assisted Intervention, pp. 481–501. Academic
Press (2020). ISBN 9780128161760
An Analytical Intelligence Model
to Discontinue Products in a Transnational
Company

Gabriel Loy-García¹, Román Rodríguez-Aguilar²(✉),
and Jose-Antonio Marmolejo-Saucedo³

¹ Facultad de Ingeniería, Universidad Anáhuac,
Av Universidad Anáhuac 46, Mexico, Mexico
² Facultad de Ciencias Económicas y Empresariales, Universidad Panamericana,
Augusto Rodin 498, 03920 Mexico City, Mexico
rrodrigueza@up.edu.mx
³ Facultad de Ingeniería, Universidad Panamericana,
Augusto Rodin 498, 03920 Mexico City, Mexico

Abstract. This work is a proposal of an analytical intelligence model for the


discontinuation of products in a transnational soft drink company. The objective
is to identify products that due to their volume and sales value should leave the
company’s catalog. For this, the integration of an analytical intelligence model
that considers unsupervised classification algorithms integrating key information
about the products to be evaluated is proposed. The results show that the product
classification makes it possible to identify a set of products that are candidates for
discontinuation due to their volumes and sales value; likewise, the detailed
information on these products allows evaluating the characteristics of the cluster
to be discontinued and thus planning production and distribution in the medium
and long term. The proposed model allows timely, automatic monitoring of the
discontinuation process, as well as the review of executive reports through the cloud.

1 Introduction

A fundamental activity in the sales and operations planning of large companies with
value chains for the production, distribution, and marketing of consumer goods is the
product discontinuation process. This product portfolio management process allows
optimizing the value chain. This process is generally driven by the commercial area of
the companies and is accompanied by the financial and supply chain area. These areas
make decisions about the portfolio empirically and based on the experience of the staff.
In the literature, there is evidence focused on solving the problem of discontinuing
products using techniques for classifying inventory types with a classic supply chain
approach. Other works are more focused on the application of machine learning
methods to categorize products based on consumer behavior. Transactional databases
are generally used, and general rules are derived for decision making. [1] show an application
to the retail sector on the type and quantity of products that they keep in their catalogs,
through the application of a hybrid model based on the k-means algorithm and

association rule mining. [2] shows a case of innovation and discontinuation but using
the firm as the unit of analysis.
[3] shows the application of text mining methodologies to assert the voice of the
customer by analyzing the case of a household and consumer products company. [4]
analyze the classification of products in an e-commerce framework using a deep multi-
modal architecture using text and images as inputs, showing the application of a deep
learning model. [5] develops an intelligent platform for the classification of products in
e-commerce based on natural language processing algorithms, vector space models, k-
nearest neighbor, Naive-Bayes classifier, and hierarchical classification. [6] address a
classification problem with big data, where the goal is to properly classify millions of
products by type at WalmartLabs, applying large-scale crowdsourcing, learning rules,
and in-house analysts.
Other studies such as the work of [7] use the Eigen-color feature with ensemble
machine learning combining artificial neural networks and support vector machines for
the classification of product images. The little identified on the application of product
classification does not address discontinuation and the vast majority focus on super-
vised models. However, the problem of this study lacks a dependent variable since
what is sought is to identify underlying information that allows grouping products for
later discontinuation based on a group of variables that address various areas of the
business beyond the physical characteristics of the product. For the above, this work
seeks to develop an analytical intelligence model to improve decision-making for the
discontinuation of products. The information used corresponds to commercial and
financial information for the entire product line in Mexico’s operation. The company
has 52 bottling plants and more than 300 distribution centers, its corporate headquarters
are located in Mexico City. The portfolio of products includes around 250 and up to
536 different product codes (SKUs). With the constant changes in consumer habits, the
proliferation of new product codes is a constant that different areas must deal with
when carrying out the quarterly sales and operations plan.
The application of an analytical intelligence model that identifies groups of SKUs
susceptible to discontinuation will provide a standard methodology, so that this process
is carried out in the most objective way possible, based on technical evidence. The
work is organized as follows: the first section presents the theoretical framework and
the sources of information used; the second section presents the proposed model; the
third section presents the main results identified; finally, the conclusions and future
lines of research are given.

2 Materials and Methods

Description of the data


The data correspond to the operation in Mexico from December 2019 to February
2020. The database contains 50,000 records with 16 mixed variables. Table 1
describes the variables:

Table 1. Data description

Variable             Description
State                Geographic location (state of Mexico)
Distributors (U.O.)  The distributor that belongs to a particular state
SKUs                 Unique product code
Category             Indicates whether the product is a soft drink or a non-carbonated drink, and what its nature is
Type of consumption  Indicates whether the size of the product is for personal or family consumption
Brand                Brand of the product
Product size         Number of milliliters of the product presentation
Presentation         Packaging type
Returnability        Returnable or non-returnable packaging
CU                   Sales volume of a product in packages (unit cases) of 5.678 L
TR                   Number of bottles sold
VN                   Sales in Mexican pesos
Desc                 Discounts in Mexican pesos
CV                   Variable contribution in Mexican pesos generated by a product

The main variable historically taken into account to discontinue products was the
variable contribution (CV), which is the result of subtracting from sales the variable
costs and expenses, as well as the marketing expense, of each product. It is the amount
that remains per product to cover fixed costs and expenses. A negative CV represents
a loss of economic value for the company. Presentations that correspond to specific
consumption occasions and that are important for the brand image were eliminated, as
well as products that, according to their life cycle, are in the introduction or growth
phase in the market.
It is important to mention that transforming the categorical variables into binary
variables increases the dimensionality considerably, generating a total of 761 dummy
variables; the 16 original variables thus become 768 variables to be analyzed.
It is assumed that the discontinuation of any SKU will be for total channels and the
Customer Business Model variable will not be taken into account. As we can see, the
complexity of making a model with this database lies in the number of levels or factors
of each categorical variable. An exploratory analysis of the resulting database was
carried out and the levels of each categorical variable were converted into factors. The
extreme values of the base were eliminated using the criterion of the interquartile range
where the observations corresponding to the lower 5% and the upper 1% of the base
were discarded. 15 records were deleted. Subsequently, the analysis of Mixed Principal
Components was carried out, to be able to confirm if the data are grouped according to
the factors of the categorical variables and if any natural structure is identified in the
data.
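A minimal sketch of this preprocessing (ours; the file name, column names, and quantile bounds mirror the description above and are assumptions), using pandas:

```python
import pandas as pd

df = pd.read_csv("skus.csv")  # hypothetical export of the 50,000 records

# Discard extreme values of the variable contribution: the lower 5%
# and the upper 1% of observations, as described above
lo, hi = df["CV"].quantile([0.05, 0.99])
df = df[df["CV"].between(lo, hi)]

# Expand categorical variables into binary (dummy) variables
cat_cols = df.select_dtypes(include="object").columns
df_encoded = pd.get_dummies(df, columns=list(cat_cols))
print(df_encoded.shape)  # roughly (records, 768) per the text
```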

Figure 1 shows the eigenvectors of the first two principal components; the grouping of
the set of numerical variables with the first principal component and of the categorical
variables with the second component is observed. The first two mixed principal
components explain only 9% of the variance, so we cannot take the variables most
related to each component as a starting point for unsupervised modeling.

Fig. 1. Mixed principal component map.

The graph of the observations with respect to the first two mixed principal components
(Fig. 2) shows a natural grouping: a segmentation into 4 groups can be identified
according to the values of the first two principal components. This exploratory analysis
supports the construction of an unsupervised model for the classification of products
for their discontinuation.

Fig. 2. Grouping of projected observations on the principal components map.

The integration of mixed variables in the model implies the use of a specific distance
metric for the treatment of mixed data; in this case, the Gower distance matrix will be
used as the base input for the construction of the model.

Analytical intelligence model


The proposed model considers a set of unsupervised cluster methodologies as options
for classifying the available information; the first module considers data integration and
cleaning. Subsequently, the validation of the grouping trends, selection of the number
of partitions, selection of the best cluster methodology to use, and finally the validation
and implementation of the algorithm (Fig. 3). It is important to note that given the size
of the problem, it was decided to use a specialized platform for handling Big data,
Microsoft Azure, integrating open-source code in the statistical software R.

Fig. 3. Structure of the proposed analytical intelligence model.

A set of unsupervised clustering methodologies was applied to compare their
performance. A methodological structure for the elaboration of an unsupervised model
was followed based on [8], establishing a series of non-parametric tests before applying
the methodologies, as well as internal validation of the generated clusters to select the
best one. To validate whether the data are subject to clustering, i.e., not randomly
distributed, the Hopkins statistic is used [9, 10].
H = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i}    (1)

where x_i = dist(p_i, p_j) is the distance of each observation from its nearest neighbor,
and y_i = dist(q_i, q_j) is the distance of each point from its nearest neighbor in a data
set simulated from a uniform distribution. Since the database contains mixed data, it is
necessary to establish an ad hoc metric; the Gower distance was used [11, 12].
GOW_{jk} = \frac{\sum_{i=1}^{n} W_{ijk} S_{ijk}}{\sum_{i=1}^{n} W_{ijk}}    (2)

where W_ijk = 0 if objects j and k cannot be compared on variable i, because either
x_ij or x_ik is unknown. Additionally, for nominal variables: W_ijk = 1 if x_ij and
x_ik are known; then S_ijk = 0 if x_ij ≠ x_ik and S_ijk = 1 if x_ij = x_ik.
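A compact sketch of the Gower distance for a mixed-type table (our illustration; it assumes no missing values, so all weights W_ijk = 1, with range-normalized differences for numeric variables and simple matching for categorical ones):

```python
import numpy as np
import pandas as pd

def gower_matrix(df):
    """Pairwise Gower distances, Eq. (2), assuming all values are known."""
    n = len(df)
    D = np.zeros((n, n))
    for col in df.columns:
        x = df[col].to_numpy()
        if pd.api.types.is_numeric_dtype(df[col]):
            x = x.astype(float)
            rng = x.max() - x.min()
            d = np.abs(x[:, None] - x[None, :]) / rng if rng > 0 else np.zeros((n, n))
        else:
            d = (x[:, None] != x[None, :]).astype(float)  # simple matching
        D += d
    return D / df.shape[1]

toy = pd.DataFrame({"size_ml": [355, 500, 600], "pack": ["glass", "pet", "glass"]})
print(gower_matrix(toy))
```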
To determine the number of partitions to be made, the Silhouette Coefficient was
used, which allows an optimization criterion to be used to determine the number of
partitions in a cluster using a measure of the quality of the cluster classification [13].

1X N
di  s i
Silhouette ¼ ð3Þ
N i¼1 maxfdi  si g

Where si is the mean distance to objects in the same cluster and di is the mean
distance to objects in the next nearest cluster. Observations with a large Silhouette
(1) are well grouped. A small Silhouette (0) means that the observation is between two
groups. Observations with a negative Silhouette are probably located in the wrong
group. A set of hard and soft clustering methodologies was compared to select the best
one for the analyzed database. The selected methodologies were k-means [14],
k-medoids [13, 15], hierarchical agglomerative [16], and fuzzy clustering [17, 18]. To
select the best algorithm for the classification and profiling of the groups, each model
is validated through internal validation and the calculation of several stability
measures. To perform the internal validation of the clusters, the Average Silhouette
statistic is calculated; the grouping quality is better when the value of this statistic is
close to 1. This statistic is calculated for each selected algorithm, varying the number
of clusters. In this way, the algorithm and number of groups for which the Silhouette
statistic is closest to 1 should be chosen as the best model [13].
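A sketch of this internal validation loop (ours), assuming a numeric feature matrix X; with the mixed SKU data one would instead pass the Gower distance matrix and metric="precomputed" to silhouette_score:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))  # stand-in for the encoded SKU features

for k in range(2, 6):  # K = 2, ..., 5
    for name, model in [
        ("k-means", KMeans(n_clusters=k, n_init=10, random_state=0)),
        ("hierarchical", AgglomerativeClustering(n_clusters=k)),
    ]:
        labels = model.fit_predict(X)
        print(name, k, round(silhouette_score(X, labels), 3))
```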
Taking into account the disaggregation of the categorical variables used in each
iteration, the dimension of the problem grows exponentially due to all the possible
combinations of levels between the categorical variables.
model was coded on a free software platform (R) and later it was implemented on the
Microsoft Azure platform (Fig. 4).

Fig. 4. Integration of the model in the Microsoft Azure architecture.



The data processing power in the Data-Bricks module comes from a dedicated
processing cluster within the module with the following features: 140 GB of memory
and 20 cores. Additionally, it was configured with a minimum of 6 “Workers” or
parallel computing modules.

3 Results

Cluster method selection


To verify that the data are subject to clustering, a graphic exploration was carried out
and the Hopkins statistic was computed. The representation of the data on a
two-dimensional map using the first two mixed principal components shows that the
distribution of the data is not random [19] (Fig. 4). Based on the Silhouette method, we
can consider the optimal number of clusters from K = 2 up to K = 5. To select the
appropriate cluster algorithm, a comparison was made between the different proposed
methodologies (Fig. 6), using the Silhouette index for each method to obtain the
optimal number of clusters (Table 2). Three hard algorithms and one soft algorithm
were run, varying the number of clusters for K = 2, ..., 5.

Table 2. Cluster method selection


Silhouette Index/Method K-means K-medoids Hierarchical Fuzzy
k=2 0.18 0.18 0.53 0.30
k=3 0.43 −0.03 0.40 NA
k=4 0.27 0.15 0.31 NA
k=5 0.19 0.16 0.18 NA

In this case, the fuzzy cluster algorithm could no longer be calculated for K = 3, K = 4,
and K = 5. We observe that the highest Silhouette indexes are obtained for the K-Means
and Hierarchical Agglomerative algorithms with K = 3 and K = 4 respectively, and it is
verified that the optimal number of groups is 4. Given the complexity of the problem,
we sought partitions greater than two due to the desired segmentation for
discontinuation. Therefore, we selected the K-Means algorithm with 4 groups for the
classification of the base; graphically, the clusters are represented on a principal
component map and a dendrogram to observe the initial classification generated by the
selected model (Fig. 5). This is consistent with the segmentation identified in the
exploratory data analysis by applying mixed principal components.

Fig. 5. Cluster plot using the two principal components and dendrogram graph.

The selection of the algorithm with four partitions allows integrating the groups of
SKUs; now it is necessary to build the profiles of each group to evaluate their
characteristics and define which group would be a candidate for discontinuation.
Definition of profiles within each group
The groups were summarized and characterized by the variable contribution that each
group generates per bottle and the percentage it contributes to the total volume in unit
cases of the business in Mexico. A total of 20,218 records were excluded because they
had a negative variable contribution or a special reason, which is equivalent to 16% of
the company's total volume in unit cases in Mexico; the remaining 82% of the volume
is what was classified with this model. In this case, the outliers correspond to the best
products of the company.

Fig. 6. (a) Characterization and (b) dimension of the clusters generated in the first stage.

Group 1 is mainly made up of Non-Returnable Pet products. Group 2 corresponds


mostly to 355–500 mL Returnable Glass products. Group 3 corresponds mostly to non-
returnable Can and Glass products. Group 4 corresponds to products of the most
important brand of the company, mostly 500 mL Returnable Pet. Deciding to
discontinue products based only on this first classification is impossible; more
iterations of classification on specific groups need to be done. In this case, group 1 still
contains a considerable number of records and concentrates a large share of the
volume, as does group 3, which has the lowest CV per bottle and still has a significant
number of records. Looking at these groups, we can conclude that the company makes
a great effort in the value chain to guarantee the supply of these products; leaving them
in the portfolio would be inefficient.
Applying the same approach iteratively, a disaggregation of groups 1 and 3 was
performed to select the candidate SKUs to be discontinued. The proposed list of
discontinued SKUs corresponds to 12% of the records in the original base. This
significant set of records does not generate any profit for the company. With the
results obtained, and making use of technology-oriented Business Intelligence, a
dashboard was built in Tableau where the profiles by group and the detail of the SKUs
to be discontinued by state can be reviewed interactively (Fig. 7).

Fig. 7. The dashboard of SKUs to be discontinued by state.

These are mostly products that do not make commercial sense due to the type of size
and presentation with which they are sold, or because their selling price is too onerous
for states with a low socioeconomic status. The most representative sizes belong to
personal presentations, and most correspond to Tin and Non-Returnable Glass. These
records are equivalent to discontinuing 195 SKUs, which do not represent any profit
for the company and only increase the complexity of the value chain, as we have
previously pointed out.

4 Conclusions

The implementation of a standard clustering methodology, comparing various
algorithms, made it possible to generate coherent and robust results for decision
making. The development of an analytical intelligence tool such as the one proposed
generates added value for companies, especially by supporting the discontinuation of
products with a model that considers a comprehensive set of variables from the entire
supply chain of the company, rather than placing the emphasis only on financial
aspects.
The proposed model automates the classification of SKUs to establish possible
candidates for discontinuation. Although the proposal allowed segmenting a
proportion of products to be discontinued, it is necessary to deepen the analysis of the
information in line with the feedback from decision-makers, which will make it
possible to improve the proposed model and the results generated. With the obtained
list of SKUs to discontinue, the supply chain planning area will be able, through the
origin-destination matrix, to know the primary origin of the plants that supply the
distributors, and to determine whether the complexity of the supply chain will drop as
a result of the departure of those SKUs, which will no longer be produced in certain
plants nor shipped to the distributors. The designed model and the tailored reports will
allow analyzing in detail the decision-making on product discontinuation. Making use
of technology-oriented Business Intelligence, a dashboard with the profiles by group
and the detail of the SKUs to be discontinued by distributor and state can be reviewed
interactively.

Some proposals from the commercial area concern the disaggregation of the clusters
by distributor, by commercial channel, and by use of the product in the rest of the
company's operations. Likewise, the integration of ensemble methods (Bagging and
Boosting) is considered as a future development to improve classification
performance. It would also be relevant to integrate information related to the fiscal
policies established in Mexico into discontinuation decisions, as well as the effects
observed on sales and consumer preferences after the COVID-19 pandemic. As future
work, it is necessary to evaluate the value hypothesis after having discontinued the
SKUs, based on the sales and operations plan for the new quarter, excluding new
launches and before incorporating the new pricing plan.

References
1. Karki, D.: A hybrid approach for managing retail assortment by categorizing products based
on consumer behavior. Dublin, National College of Ireland. Ph.D. Thesis (2018)
2. Crowley, F.: Product and service innovation and discontinuation in manufacturing and
service firms in Europe. Eur. J. Innov. Manage. 20(2), 250–268 (2017)
3. Gonzalez, R.A., Rodriguez-Aguilar, R., Marmolejo-Saucedo, J.A.: Text mining and
statistical learning for the analysis of the voice of the customer. In: Hemanth, D., Kose,
U. (eds.) Artificial Intelligence and Applied Mathematics in Engineering Problems.
ICAIAME 2019. Lecture Notes on Data Engineering and Communications Technologies,
vol 43. Springer, Cham (2020)
4. Zahavy, T., Magnani, A., Krishnan, A., Mannor, S.: Is a picture worth a thousand words? A
deep multi-modal fusion architecture for product classification in e-commerce. In: The
Thirtieth Conference on Innovative Applications of Artificial Intelligence (IAAI) (2018)
5. Ding, Y., Korotkiy, M., Omelayenko, B., Kartseva, V., Zykov, V., Klein, M., Schulten, E.,
Fensel, D.: Golden bullet: automated classification of product data in e-commerce. In:
Proceedings of Business Information Systems Conference (BIS 2002), Poznan, Poland
(2002)
6. Sun, C., Rampalli, N., Yang, F., Doan, A.: Chimera: Large-scale classification using
machine learning, rules, and crowdsourcing. PVLDB 7(13), 1529–1540 (2014)
7. Oyewole, S.A., Olugbara, O.O.: Product image classification using Eigen Colour feature
with ensemble machine learning. Egypt. Inf. J. 19(2), 83–100 (2018)
8. Kassambara, A.: A practical guide to cluster analysis in R: unsupervised machine learning.
CreateSpace Independent Publishing Platform (2017)
9. Hopkins, B., Skellam, J.G.: A new method for determining the type of distribution of plant
individuals. Ann. Bot. Co. 18(2), 213–227 (1954)
10. Banerjee, A.: Validating clusters using the Hopkins statistic. In: IEEE International
Conference on Fuzzy Systems, pp. 149–153 (2004)
11. Gower, J.: A general coefficient of similarity and some of its properties. Biometrics 27, 857–
872 (1971)
12. Tuerhong, G., Kim, S.B.: Gower Distance-Based Multivariate Control Charts for a Mixture
of Continuous and Categorical Values. Elsevier, South Korea (2013)
13. Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis.
John Wiley & Sons Inc, Hoboken, NJ (1990)

14. Lloyd, S.P.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(1982), 129–137
(1982)
15. Park, H.-S., Jun, C.-H.: A simple and fast algorithm for K-medoids clustering. Expert Syst.
Appl. 36, 3336–3341 (2009)
16. Ward, J.H.: Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 58
(301), 236–244 (1963)
17. Dunn, J.C.: A fuzzy relative of the ISODATA process and its use in detecting compact well-
separated clusters. J. Cybern. 3(3), 32–57 (1973)
18. Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms (1981)
19. Rodriguez-Aguilar, R.: Proposal for a comprehensive environmental key performance index
of the green supply chain. Mobile Netw. Appl. (2020)
Graph Neural Networks in Cheminformatics

H. N. Tran Tran¹, J. Joshua Thomas¹(✉),
Nurul Hashimah Ahamed Hassain Malim², Abdalla M. Ali¹,
and Son Bach Huynh¹

¹ UOW Malaysia KDU Penang University College, Penang, Malaysia
tranhuungoctran@gmail.com, joshopever@yahoo.com,
abdalla.muf@gmail.com, huynhbach197@gmail.com
² Universiti Sains Malaysia, Penang, Malaysia
nurulhashimah@usm.my

Abstract. Graph neural networks represent nowadays the most effective


machine learning technology in the biochemistry domain. Learning on the huge
amount of available chemical data can play an important part in finding new
molecules or new drugs, which is crucial research work in cheminformatics. This
work becomes far less time-consuming and labor-intensive with the assistance of
machine learning techniques, which are capable of both handling massive datasets
and learning hidden information from the structure of graphs. In terms of
applying machine learning of graphs in chemistry, this paper discusses the
explorations on the following matters. Firstly, we introduce the up-to-date study
of the machine learning approaches being applied in solving cheminformatics
research problems. Secondly, we present concise overviews on the original
mathematical model and variants of graph neural networks and the utilization in
drug discovery evaluating the performance with machine learning. We end our
analysis with a critical discussion of potential research based on current litera-
ture reviews and suggestions for relevant approaches and challenges.

Keywords: Graph neural networks · Machine learning · Cheminformatics ·
Deep learning · Drug discovery · Drug design · Drug-target interactions ·
Molecule generation

1 Introduction

Past decades have witnessed the surge of data in every aspect of life and the
development of data science tools created to study them [1], especially in the field of
chemistry. In this revolution, machine learning remains the most important approach for solving the
domain in artificial intelligence that focuses on learning from data with computational
models. In procedures which are dealing with a huge amount of data like drug discovery,
machine learning, particularly deep learning, is contemporary of great importance [3].
On the other side, the graph is important structural information of chemistry data,
hence it requires appropriate algorithms to learn and analyze them taking advantage of
their structures [4]. In the context of utilizing deep learning techniques for both han-
dling massive datasets and learning from graphs, novel algorithms specialized in graph
data are developed, which are currently known as graph neural networks (GNN).

In this article, we present a review of research on applying GNN in cheminformatics,
a domain in which studying graphs is inevitable. The rest of this review is
organized as follows. We explain in Sect. 2 the basic concepts in cheminformatics
studies and summarize machine learning methods that are utilized in drug discovery. In
Sect. 3, we introduce fundamental notations that are used in graph learning, the original
model of graph neural network, and its variants. In Sect. 4, we provide a review of
existing graph neural network models applied in cheminformatics. In Sect. 5, we
discuss the performance of and suggest the approaches for future research. Finally, we
conclude the review in Sect. 6.

2 Cheminformatics in Drug Discovery

2.1 Fundamental Concepts in Cheminformatics


Cheminformatics is the field of applying the techniques of computer science in
chemistry. It importantly focuses on solving problems such as chemical information
retrieval and extraction, molecular data searching, and is playing an essential role in
drug discovery and drug development [5].
Drug discovery is known as the process of finding the functions of bioactive
compounds for novel drug development, and it is generally one of the early phases in a
drug development pipeline [6].
Explanations of terminologies used in the domain are provided below:
• A ligand is a molecule that forms a binding affinity with another molecular structure
physically and modulates its activity.
• A compound is a chemical structure which is composed of two or more chemical
bond-linked atoms.
• A target protein (or target) is a biomolecule of an organism in which a ligand can
bind, perform acts on and modulate its function, resulting in a physiological change
in the organism’s body.
• Bioactivity is the activity elicited by the target protein of interest.
• A drug is a biological entity (antibodies) or chemical entity (small molecules) that
can cure or decelerate the course of a disease state by interacting with its target.
• Molecular properties are characteristics of a compound that are chemical, physical,
and structural. The term property in this research refers to drug property. Investi-
gated properties, which will be used to evaluate whether a generated compound is
drug-like, include molecular weight (MW), partition coefficient (Log P), Hydrogen
bond acceptor (HBA), Hydrogen bond donor (HBD), and Topological polar surface
area (TPSA).
The key function of drugs is modulating the cellular actions participating in disease
conditions for treatment purposes. Hence, traditional research on drugs begins with the
identification of biomolecular targets for a predetermined treatment and proceeds with
the high-throughput screening experiments to search for bioactive compounds for the
defined targets, aiming at finding suitable drug candidates [7].
On the other hand, detecting drug-target interactions (DTI) is a significant solution
to reduce the scope of searching for candidate drugs [8]. Drug-target interaction refers

to the binding of a drug to a target location that results in a change in its behavior or
function [9]. This process is currently boosted up with the assistance of various
machine learning models.

2.2 Machine Learning Approaches in Cheminformatics


Machine learning techniques can be roughly categorized into supervised learning and
unsupervised learning. In the supervised learning models, training data are labeled and
the model is trained to predict the labels for given data inputs. Similarity-based methods
[10] and feature vector-based methods [11] are the main approaches in supervised
learning. Typical models in this approach of learning consist of k-nearest neighbor (k-
NN), random forest (RF), logistic regression (LR), and support vector machine
(SVM) [12]. Some supervised learning models have proved their ability in predicting
drug-target interactions, yet supervised learning has not addressed the problem of the
imbalanced dataset when the model treats unlabeled drug-target pairs as negative
samples and consequently provides inaccurate predictive results [8]. On the contrary,
unsupervised learning methods can learn to identify hidden molecular patterns from
unlabeled inputs. Artificial neural networks (referred to as neural networks in this
review) and deep learning have been suggested to be the primary models in this
category. Being a branch of machine learning, artificial neural networks are built on a
set of algorithms resembling the network of human brain neurons. Moreover, deep
learning is an extension of artificial neural networks: the more layers of neural
networks a model has, the "deeper" it becomes. The purpose of building such deep
models is to learn the data by transforming them into more complex representations
through each layer.
Convolutional neural networks (CNN), recurrent neural networks (RNN), Long-short
term memory networks (LSTM), and autoencoders are well-known successful deep
learning techniques that have proved their productivity in drug discovery tasks.
The work of Dahl et al. is one of the early studies applying neural networks in a
QSAR model for drug-target interaction prediction, and it was reported to improve
prediction accuracy [13]. The deep learning models proposed by Tian et al. likewise
aimed at predicting compound-protein interactions; a four-hidden-layer network taking
extracted molecular fingerprints and protein features as inputs was shown to
outperform SVM, random forest, and logistic regression models [14].
Generative models built with RNN based on LSTM, suggested by Segler and Gupta,
have successfully generated novel compound libraries that play a fundamental part in
drug discovery [15, 16]. Besides, CNN is another option for constructing a prediction
model. [17, 18] and [19] have introduced deep learning models using CNN also for
predicting interactions between generated compounds and targets; their results have
shown that these models outperformed previously learning models.
Inspired by potential outcomes from RNN and CNN, many hybrid models are
developed, typically the models that combine RNN and CNN like the works of [20]
and [21]. Moreover, to generate novel molecules and produce molecules with desirable
properties, auto-encoder began to be used, where a neural network learns to compress
molecular data into a form of representation, impose them in a hidden layer (latent
space), and then learns to reconstruct it back. NeVAE model by [22] and scaffold-based
variational auto-encoder by [23] are recent auto-encoder models on molecular graphs

proving that the auto-encoder architecture can generate diverse and novel molecules.
To improve the work of auto-encoders in molecule generation, [24, 25], and [26]
adopted RNN in their encoder, decoder, or both, and achieved a high reconstruction
rate. [27] even took advantage of generative topographic mapping (GTM) in their
RNN/auto-encoder model to visualize the auto-encoder latent space.

3 Graph Neural Networks


3.1 Basic Concepts of Graphs

Definition. A graph is denoted as G = (V, E), in which V is the set of vertices (nodes)
and E is the set of edges in that graph. Let v_i ∈ V represent a node and
e_ij = (v_i, v_j) ∈ E an edge connecting nodes v_i and v_j.
N(v) = {u ∈ V | (v, u) ∈ E} denotes the neighborhood of a node v, and d(v) denotes
the degree of vertex v, which is the number of edges connected to v.
Algebra Representations of Graphs. Considering a graph with n vertices, its common
algebra representations in graph learning tasks are the following.
• Adjacency matrix, denoted as A ∈ R^{n×n}, where

A_{ij} = \begin{cases} 1 & \text{if } e_{ij} \in E \\ 0 & \text{if } e_{ij} \notin E \end{cases}    (1)

• Degree matrix, denoted as D ∈ R^{n×n}, where

D_{ii} = d(v_i)    (2)

• Laplacian matrix, defined as

L = D - A    (3)

In other words, its elements are

L_{ij} = \begin{cases} d(v_i) & \text{if } i = j \\ -1 & \text{if } e_{ij} \in E \text{ and } i \neq j \\ 0 & \text{otherwise} \end{cases}    (4)

• Incidence matrix, denoted as M ∈ R^{n×m} (m is the number of edges), where

M_{ij} = \begin{cases} 1 & \text{if } \exists k \text{ s.t. } e_j = (v_i, v_k) \\ -1 & \text{if } \exists k \text{ s.t. } e_j = (v_k, v_i) \\ 0 & \text{otherwise} \end{cases}    (5)

In the case of an undirected graph, the incidence matrix satisfies

M_{ij} = \begin{cases} 1 & \text{if } \exists k \text{ s.t. } e_j = (v_i, v_k) \\ 0 & \text{otherwise} \end{cases}    (6)
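As a concrete illustration (ours), the matrices of Eqs. (1)–(3) for a tiny undirected graph, e.g., the heavy-atom skeleton C–C–O of ethanol:

```python
import numpy as np

edges = [(0, 1), (1, 2)]  # nodes 0 (C), 1 (C), 2 (O); undirected bonds
n = 3

A = np.zeros((n, n), dtype=int)   # adjacency matrix, Eq. (1)
for i, j in edges:
    A[i, j] = A[j, i] = 1

D = np.diag(A.sum(axis=1))        # degree matrix, Eq. (2)
L = D - A                         # Laplacian matrix, Eq. (3)
print(L)
```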

Classifications of Graphs. A graph can be directed or undirected. A directed graph is


a graph whose all edges are directed edges and an undirected graph is a graph whose all
edges are undirected. A graph is undirected if and only if its adjacency matrix is
symmetric.
A graph can also be classified as homogeneous or heterogeneous. All nodes in a
homogeneous graph have the same function, but in a heterogeneous graph there are
two or more classes of nodes that behave differently from one another. The elementary
graphs discussed in this research are homogeneous, undirected graphs.
Taking inputs in the forms of graph structure and node representations, GNN might
focus on analyzing tasks for one of these levels of output:
• Node-level tasks: the problems concern node regression and node classification.
• Edge-level tasks: the problems concern edge classification and link prediction.
• Graph-level tasks: the problems concern graph classification and graph regression.

3.2 Vanilla Graph Neural Networks


One of the most basic GNN was proposed by Scarselli et al. [28], who aimed at
extending current neural networks for graph-structured data processing. A node in the
graph is characterized by its features and related nodes. The purpose of GNN is to
learn a state embedding h_v of a node v, which is a b-dimensional vector containing
information about the neighborhood of the node. The state embedding h_v is used to
determine an output embedding o_v, which can be, for instance, a label of the node. In
the graph, each node has its input features x_v. The set of edges connected to node v is
denoted as co[v] and the set of its neighbors as N(v). h_v and o_v are obtained by the
following formulas:

h_v = f(x_v, x_{co[v]}, h_{N(v)}, x_{N(v)})    (7)

o_v = g(h_v, x_v)    (8)

where f is a parametric function, called local transition function, which is shared among
all nodes and g is a parametric function, called local output function, used to produce
the output.

Let H, O, X, and X_N be the matrices constructed by stacking all the states, outputs,
features, and node features respectively. We then have the compact form:

H = F(H, X)    (9)

O = G(H, X_N)    (10)

where F is the global transition function and G is the global output function.
Let H^t be the t-th iteration of H; the general form of the state update is:

H^{t+1} = F(H^t, X)    (11)

With the target information for a specific node denoted as t_v and the number of
supervised nodes denoted as p, the loss of the model is:

loss = \sum_{i=1}^{p} (t_i - o_i)    (12)

Briefly, this vanilla model presents an effective method of modeling graph data and
constitutes the first step towards integrating neural networks into the graph domain.

3.3 Graph Convolutional Neural Networks


The main idea of the graph convolutional neural network (GCN) is to generate a node
v's representation by combining its own features x_v with its neighbors' features x_u,
where u ∈ N(v). This architecture is motivated by the function of CNN and is the most
fundamental building block for constructing more complex GNN [29].
A framework of GCN was early introduced in 2016 by Kipf & Welling [30]. They
consider a multi-layer graph convolutional network with the following layer-wise
propagation rule:

H ðl þ 1Þ ¼ rðD ~D
~ 12 A ~ 12 H ðlÞ W ðlÞ Þ ð13Þ

in which A ~ is the adjacency matrix of the undirected graph G with added self-
P
connections, computed as à = A + IN with IN is the identity matrix; D ~ ii ¼ ~
j Aij and
W is a layer-specific trainable weight matrix. r(.) represents an activation function.
(l)

H ðlÞ 2 Rnd is the matrix of activation in the lth layer (n is the number of nodes, d is the
dimension of the node feature vector), H(0) = X.
The definition of the GCN is then given as:

Z = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} X H    (14)

where H ∈ R^{C×F} is a matrix of filter parameters, with C input channels and F filters
for the feature.
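A NumPy sketch of one propagation step of Eq. (13) (our illustration, with ReLU chosen as the activation σ):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer, Eq. (13): sigma(D~^{-1/2} A~ D~^{-1/2} H W), sigma = ReLU."""
    A_tilde = A + np.eye(A.shape[0])           # add self-connections
    d_inv_sqrt = np.diag(A_tilde.sum(axis=1) ** -0.5)
    A_hat = d_inv_sqrt @ A_tilde @ d_inv_sqrt  # symmetric normalization
    return np.maximum(0.0, A_hat @ H @ W)

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3-node path graph
X = np.random.randn(3, 2)   # node features (n=3, d=2)
W0 = np.random.randn(2, 4)  # trainable weights of the first layer
print(gcn_layer(A, X, W0).shape)  # (3, 4)
```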

GraphSAGE proposed by Hamilton et al. is another GCN framework that learns to


aggregate feature information from a node’s neighborhood and generates embeddings
[31]. A GraphSAGE layer is computed by:

h_v^t = \sigma(W^t \cdot [h_v^{t-1} \, \| \, \text{AGG}_t(\{h_u^{t-1}, \forall u \in N(v)\})])    (15)

where W^t is the parameter at layer t.


The aggregator functions must operate over an unordered set of vectors because of
a node’s neighbors’ unordinary orders. Thus, an aggregator function can be expressed
in various forms. Hamilton et al. suggested three cases of this function:
• Mean aggregator:

\text{AGG}_{\text{MEAN}} = \sum_{u \in N(v)} \frac{h_u^{t-1}}{|N(v)|}    (16)

• LSTM aggregator:

\text{AGG}_{\text{LSTM}} = \text{LSTM}([h_u^{t-1}, \forall u \in \pi(N(v))])    (17)

where the LSTM is applied to a random permutation \pi of the neighbors.


• Pooling aggregator:

\text{AGG}_{\text{pool}} = \max(\{\sigma(W_{\text{pool}} h_u^{t-1} + b), \forall u \in N(v)\})    (18)
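A sketch of a GraphSAGE layer with the mean aggregator, Eqs. (15)–(16) (ours; note the concatenation [· || ·] doubles the input dimension of the weight matrix):

```python
import numpy as np

def sage_mean_layer(A, H, W):
    """GraphSAGE step: sigma(W . [h_v || mean of neighbor states])."""
    deg = A.sum(axis=1, keepdims=True)
    agg = (A @ H) / np.maximum(deg, 1)         # mean aggregator, Eq. (16)
    concat = np.concatenate([H, agg], axis=1)  # [h_v || AGG], Eq. (15)
    return np.maximum(0.0, concat @ W)         # sigma = ReLU

A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
H = np.random.randn(3, 4)
W = np.random.randn(8, 4)  # input is 2 * 4 after concatenation
print(sage_mean_layer(A, H, W).shape)  # (3, 4)
```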

3.4 Graph Recurrent Neural Networks


Graph recurrent neural networks (GRNN) apply the same set of parameters recurrently
over nodes in a graph to extract high-level node representations. Node representations
are learned with recurrent neural architecture, assuming that a node in the graph
continually exchanges data with its neighbors until reaching a state of stability.
In a gated graph neural network introduced by Li et al. [32], a gated recurrent unit
(GRU) was employed as a recurrent function to limit the recurrence to a fixed number
of steps. A hidden state of a node is updated by its previous hidden states and its
neighboring hidden states as in formula:
h_v^{(t)} = \text{GRU}(h_v^{(t-1)}, \sum_{u \in N(v)} W h_u^{(t-1)})    (19)

where h_v^{(0)} = x_v.
Based on GRNN, You et al. proposed a graph generative model called GraphRNN,
whose aim is to consider graphs under different node orderings as sequence repre-
sentations, and then to construct an auto-regressive generative model on these
sequences [33]. The state transition function and the output function of the neural
network are presented as:

h_i = f_{\text{trans}}(h_{i-1}, S_{i-1}^{\pi})    (20)

\theta_i = f_{\text{out}}(h_i)    (21)

where h_i ∈ R^d is a vector representing the state of the generated graph and \pi is the
node ordering. An element S_i^{\pi} ∈ {0,1}^{i-1}, i ∈ {1,...,n}, denotes an adjacency
vector representing the edges between node \pi(v_i) and the previous nodes \pi(v_j),
j ∈ {1,...,i-1}. \theta_i is the distribution of the next node's adjacency vector.
f_{\text{trans}} and f_{\text{out}} can in general be arbitrary neural networks.
Furthermore, the research work of [34] has extended GraphRNN of You et al.
(2018) to the model MolecularRNN in which they apply RNN for both state transition
function (NodeRNN) and output function (EdgeRNN).

3.5 Graph Autoencoders


Graph autoencoders form an unsupervised learning framework that encodes graph data
into a latent space and reconstructs it from the encoded representation. This framework
is applied in learning network embeddings and graph generative distributions.
In the graph autoencoder by Kipf and Welling [35], GCN are used to encode nodes
in the graph:

Z = \text{GCN}(X, A)    (22)

where Z is the embedding matrix of the graph.


The output of this autoencoder is the reconstructed adjacency matrix created from
graph embeddings in latent space:

\tilde{A}_{vu} = \sigma(z_v^T z_u)    (23)

where zv is the representation of node v.
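The decoder of Eq. (23) is just an inner product of embeddings pushed through a sigmoid; a minimal sketch (ours):

```python
import numpy as np

def decode(Z):
    """Inner-product decoder, Eq. (23): A~[v,u] = sigmoid(z_v^T z_u)."""
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))

Z = np.random.randn(5, 2)  # embeddings of 5 nodes in a 2-d latent space
A_hat = decode(Z)          # (5, 5) matrix of reconstructed edge probabilities
print(A_hat.shape)
```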

4 Graph Neural Networks Applications in Cheminformatics

In the field of cheminformatics, GCN was early proposed by Duvenaud et al. who
studied a neural graph fingerprints architecture that employs CNN, and the model
training had matched or beaten the predictive performance of standard (fixed) finger-
prints on drug efficacy [36].
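To make the molecular-graph inputs of such models concrete, a small sketch (ours) converting a SMILES string into the representations of Sect. 3 with the open-source RDKit toolkit:

```python
import numpy as np
from rdkit import Chem

mol = Chem.MolFromSmiles("CC(=O)OC1=CC=CC=C1C(=O)O")  # aspirin

A = Chem.GetAdjacencyMatrix(mol)  # adjacency matrix of the molecular graph
# Simple per-atom features: atomic number and degree
X = np.array([[a.GetAtomicNum(), a.GetDegree()] for a in mol.GetAtoms()])
print(A.shape, X.shape)  # (13, 13) (13, 2) for aspirin's heavy atoms
```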
Differently, Kearns et al. employed ‘Weave modules’, which merge information
from all atoms and atom pairs to learn molecular features for molecular graph con-
volutional deep neural network models [37]. Compared to the random forest, logistic
regression, and a multitask neural network with two underlying layers, this graph
neural network framework demonstrated higher performance in all cases.

Another type of GCN, the Weisfeiler-Lehman Network (WLN), was used on graph
data by Coley et al. to analyze and predict the likelihood of each node-node pair
(molecule atoms) changing to each new bond (edge) order [38]. Ryu et al. performed
molecular property prediction using GCN and augmented the model with attention and
gate mechanisms, resulting in more accurate identification of the graph features
corresponding to the expected molecular properties [39].
From another perspective, Tsubaki et al. combined GNN and CNN to learn molecular
fingerprints of ligands and protein sequences, respectively, in one end-to-end
representation model for predicting interactions between chemical compounds and
proteins; their model was demonstrated to achieve higher performance than many
existing prediction methods, such as k-nearest neighbor, random forest, logistic
regression, and SVM [40]. Importantly, the authors found that the model correctly
predicted important amino acid residues at the binding sites responsible for drug-target
interactions. A GCN model to learn drug-target interactions proposed by Nguyen et al.
predicted binding affinity better than non-deep learning models and, moreover,
pointed out the practical advantages of graph-based molecule representation in
providing accurate drug-target interaction prediction [41]. Similarly, a proposed
technique adopting variational auto-encoders on graphs proved to improve the
predictive performance and outperform recent state-of-the-art computational methods
by a large margin [42].
Table 1 summarizes the predictive performance results of typical models utilizing
GNN and of those using other machine learning-based methods. The research works
employed different datasets, were evaluated on different metrics, and compared their
results to different methods, so it is not possible to make a cross-comparison of all the
methods mentioned.
Apart from drug-target interaction prediction, generative chemistry, which aims at
creating novel compounds, is also a crucial approach in applying machine learning to
drug design. The work of Niepert et al. on a CNN framework for learning arbitrary
graphs showed that the framework could learn node representations of graphs and
generate normalized sequences of nodes, from which unseen graphs could be
computed [43]. The GrammarVAE model by Kusner et al., in which the auto-encoder took
molecular graphs as input had successfully generated new valid and discrete molecules
[44]. Also, for molecule generation, Simonovsky and Komodakis employed GCN in
their variational auto-encoder models for graphs as the encoder and built a multi-layer
perceptron as the decoder to optimize the reconstruction loss [45]. Similarly, De Cao
and Kipf [46] and Bresson and Laurent [47] have also constructed an auto-encoder in
which a graph neural network is used as the encoder. In general, graph auto-encoders
have been able to address the problem of generating graphs from continuous
embedding.

Table 1. Performance results of GNN and other machine learning-based methods in drug-target
interactions prediction. In each value row, the first figure is for the proposed method, followed
by the compared methods in the order listed.

[40] proposed: GNN-CNN; compared: k-NN, RF, SVM
  AUC(1):      human      0.970 | 0.860 | 0.940 | 0.910
               C.elegans  0.978 | 0.858 | 0.902 | 0.894
  Precision:   human      0.923 | 0.798 | 0.861 | 0.966
               C.elegans  0.938 | 0.801 | 0.821 | 0.785

[37] proposed: GCN; compared: MaxSim, LR, RF
  AUC:         PubChem    0.908 | 0.754 | 0.838 | 0.804
               MUV        0.858 | 0.638 | 0.736 | 0.655
               Tox21      0.867 | 0.728 | 0.789 | 0.802

[21] proposed: GAT-GCN; compared: CNN, SimBoost, kernel-based [10]
  CI(2):       Davis      0.881 | 0.878 | 0.872 | 0.871
               Kiba       0.891 | 0.863 | 0.836 | 0.782
  MSE(3):      Davis      0.245 | 0.261 | 0.282 | 0.379
               Kiba       0.139 | 0.194 | 0.222 | 0.411

[48] proposed: GAT-GNN; compared: docking, CNN
  AUC:         DUD-E      0.968 | 0.689 | 0.868
               ChEMBL     0.633 | 0.572 | –
               MUV        0.536 | 0.533 | 0.518
  PRAUC(4):    DUD-E      0.697 | 0.016 | –

[49] proposed: GCN; compared: Smina, neural fingerprints, AutoDock Vina
  AUC:         DUD-E      0.567 | 0.642 | 0.704 | 0.633
               MUV        0.474 | 0.593 | 0.575 | 0.503

(1) Area under the receiver operating characteristic curve
(2) Concordance index
(3) Mean square error
(4) Area under the precision-recall curve

5 Discussion

According to the investigation of the performance of various methods in drug
discovery in Sect. 4, GNN has demonstrated significantly better performance than
traditional machine learning methods, most of which are supervised-learning
approaches. Referring to Table 1, it can be seen that the proposed GNN-based works
outperform other machine learning-based methods on standard evaluation metrics
such as AUC, PRAUC, Precision, MSE, and CI.

Most current models for drug-target interaction prediction use GCN, whereas graph
autoencoders are preferably adopted in generative models, which are meant to generate
new compounds. There have not been many works on GRNN; however, it is gaining
more attention, as in the recent research of [34].
Various GNN models are being developed and extended in terms of the methods
chosen for encoding and decoding graph data, the depth of the networks, the activation
functions used in the layers, and the combination of different learning methods in one
model. From the literature, GNN has proved to be a promising tool for studying
cheminformatics, with the ability to handle graph structures, predict from unlabeled
data, and generate novel entities.
GNN is becoming state-of-the-art among the machine learning techniques created
and practiced to solve cheminformatics problems. Although recent studies have shown
positive results, there is room for great improvement in this learning method. The
future of applying GNN in cheminformatics will be broad yet challenging, because
research is expected to turn to discovering and designing specific types of drugs, to
repositioning current drugs (finding new clinical uses for existing drugs), and to
enhancing the performance of predictive models. Also importantly, there is still
demand for various toolkits, programming libraries, and frameworks for data mining
and model building.

6 Conclusion

In this review, we have focused on recent applications of graph neural networks in drug
discovery. We first defined technical terms relevant to cheminformatics and indicated
that drug discovery is a key problem in cheminformatics. We presented the mathematical
model of the initial graph neural network and its main variants: graph convolutional
neural networks, graph recurrent neural networks, and graph autoencoders. Subsequently,
we investigated recent studies employing graph neural networks for drug-target
interaction prediction and molecule generation. Along with this, we discussed the
performance of graph neural network models and suggested prospective topics for
studying graph neural networks in cheminformatics.

Acknowledgement. This work is supported by the Fundamental Research Grant Scheme (FRGS)
of the Ministry of Higher Education Malaysia under the grant project number FRGS/1/2019/ICT02/
KDUPG/02/1.

References
1. Joshua Thomas, J., Pillai, N.: A deep learning framework on generation of image
descriptions with bidirectional recurrent neural networks. In: Advances in Intelligent
Systems and Computing, vol. 866. Springer International Publishing (2019). https://doi.org/
10.1007/978-3-030-00979-3_22

2. Lipinski, C.F., Maltarollo, V.G., Oliveira, P.R., da Silva, A.B.F., Honorio, K.M.: Advances
and perspectives in applying deep learning for drug design and discovery. Front. Robot. AI 6
(November), 1–6 (2019). https://doi.org/10.3389/frobt.2019.00108
3. Chen, H., Engkvist, O., Wang, Y., Olivecrona, M., Blaschke, T.: The rise of deep learning in
drug discovery. Drug Discov. Today 23(6), 1241–1250 (2018). https://doi.org/10.1016/j.
drudis.2018.01.039
4. Li, J., Cai, D., He, X.: Learning Graph-Level Representation for Drug Discovery (2017).
http://arxiv.org/abs/1709.03741
5. Lo, Y.C., Rensi, S.E., Torng, W., Altman, R.B.: Machine learning in chemoinformatics and
drug discovery. Drug Discov. Today, 23(8), 1538–1546. Elsevier Ltd. (2018). https://doi.
org/10.1016/j.drudis.2018.05.010
6. Chan, H.C.S., Shan, H., Dahoun, T., Vogel, H., Yuan, S.: Advancing drug discovery via
artificial intelligence. Trends Pharmacol. Sci. 40(8), 592–604 (2019). https://doi.org/10.
1016/j.tips.2019.06.004
7. Rifaioglu, A.S., Atas, H., Martin, M.J., Cetin-Atalay, R., Atalay, V., Doǧan, T.: Recent
applications of deep learning and machine intelligence on in silico drug discovery: Methods,
tools and databases. Brief. Bioinform. 20(5), 1878–1912 (2019). https://doi.org/10.1093/bib/
bby061
8. Chen, R., Liu, X., Jin, S., Lin, J., Liu, J.: Machine learning for drug-target interaction
prediction. Molecules 23(9), 1–15 (2018). https://doi.org/10.3390/molecules23092208
9. Wang, H., Wang, J., Dong, C., Lian, Y., Liu, D., Yan, Z.: A novel approach for drug-target
interactions prediction based on multimodal deep autoencoder. Front. Pharmacol. 10, 1–19
(2020). https://doi.org/10.3389/fphar.2019.01592
10. Ding, H., Takigawa, I., Mamitsuka, H., Zhu, S.: Similarity-based machine learning methods
for predicting drug-target interactions: A brief review. Brief. Bioinform. 15(5), 734–747
(2013). https://doi.org/10.1093/bib/bbt056
11. Sachdev, K., Gupta, M.K.: A comprehensive review of feature-based methods for drug target
interaction prediction. J. Biomed. Inf. Elsevier (2019). https://doi.org/10.1016/j.jbi.2019.
103159
12. Thomas, J.J., Ali, A.M.: Dispositional learning analytics structure integrated with recurrent
neural networks in predicting students performance. In: Vasant, P., Zelinka, I., Weber, G.W.
(eds.) Intelligent Computing and Optimization. ICO 2019. Advances in Intelligent Systems
and Computing, vol 1072. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-
33585-4_44
13. Dahl, G.E., Jaitly, N., Salakhutdinov, R.: Multi-task Neural Networks for QSAR Predictions,
pp. 1–21 (2014). http://arxiv.org/abs/1406.1231
14. Tian, K., Shao, M., Wang, Y., Guan, J., Zhou, S.: Boosting compound-protein interaction
prediction by deep learning. Methods 110, 64–72 (2016). https://doi.org/10.1016/j.ymeth.
2016.06.024
15. Segler, M.H.S., Kogej, T., Tyrchan, C., Waller, M.P.: Generating focused molecule libraries
for drug discovery with recurrent neural networks. ACS Central Sci. 4(1), 120–131 (2018).
https://doi.org/10.1021/acscentsci.7b00512
16. Gupta, A., Müller, A.T., Huisman, B.J.H., Fuchs, J.A., Schneider, P., Schneider, G.:
Generative recurrent networks for De Novo drug design. Mol. Inf. 37(1) (2018). https://doi.
org/10.1002/minf.201700111

17. Hirohara, M., Saito, Y., Koda, Y., Sato, K., Sakakibara, Y.: Convolutional neural network
based on SMILES representation of compounds for detecting chemical motif. BMC Bioinf.
19(Suppl 19), 83–94 (2018). https://doi.org/10.1186/s12859-018-2523-5
18. Lee, I., Keum, J., Nam, H.: DeepConv-DTI: prediction of drug-target interactions via deep
learning with convolution on protein sequences. PLoS Comput. Biol. 15(6), 1–21 (2019).
https://doi.org/10.1371/journal.pcbi.1007129
19. Öztürk, H., Özgür, A., Ozkirimli, E.: DeepDTA: Deep drug-target binding affinity
prediction. Bioinformatics 34(17), i821–i829 (2018). https://doi.org/10.1093/bioinformatics/bty593
20. Trabelsi, A., Chaabane, M., Ben-Hur, A.: Comprehensive evaluation of deep learning
architectures for prediction of DNA/RNA sequence binding specificities. Bioinformatics 35
(14), i269–i277 (2019). https://doi.org/10.1093/bioinformatics/btz339
21. Karimi, M., Wu, D., Wang, Z., Shen, Y.: DeepAffinity: Interpretable deep learning of
compound-protein affinity through unified recurrent and convolutional neural networks.
Bioinformatics 35(18), 3329–3338 (2019). https://doi.org/10.1093/bioinformatics/btz111
22. Samanta, B., De, A., Jana, G., Chattaraj, P.K., Ganguly, N., Rodriguez, M.G.: NeVAE: a
deep generative model for molecular graphs. In: Proceedings of the AAAI Conference on
Artificial Intelligence, vol. 33, pp. 1110–1117 (2019). https://doi.org/10.1609/aaai.v33i01.
33011110
23. Lim, J., Hwang, S.Y., Moon, S., Kim, S., Kim, W.Y.: Scaffold-based molecular design with
a graph generative model. Chem. Sci. 11(4), 1153–1164 (2019). https://doi.org/10.1039/
c9sc04503a
24. Gómez-Bombarelli, R., Wei, J.N., Duvenaud, D., Hernández-Lobato, J.M., Sánchez-Lengeling,
B., Sheberla, D., Aguilera-Iparraguirre, J., Hirzel, T.D., Adams, R.P., Aspuru-Guzik,
A.: Automatic chemical design using a data-driven continuous representation of molecules.
ACS Central Sci. 4(2), 268–276 (2018). https://doi.org/10.1021/acscentsci.7b00572
25. Bjerrum, E.J., Sattarov, B.: Improving chemical autoencoder latent space and molecular de
novo generation diversity with heteroencoders. Biomolecules 8(4), 1–13 (2018). https://doi.
org/10.3390/biom8040131
26. Lim, J., Ryu, S., Kim, J.W., Kim, W.Y.: Molecular generative model based on conditional
variational autoencoder for de novo molecular design. J. Cheminform. 10(1), 1–9 (2018).
https://doi.org/10.1186/s13321-018-0286-7
27. Sattarov, B., Baskin, I.I., Horvath, D., Marcou, G., Bjerrum, E.J., Varnek, A.: De Novo
molecular design by combining deep autoencoder recurrent neural networks with generative
topographic mapping. J. Chem. Inf. Model. 59(3), 1182–1196. Research-Article (2019).
https://doi.org/10.1021/acs.jcim.8b00751
28. Scarselli, F., Gori, M., Tsoi, A.C., Hagenbuchner, M., Monfardini, G.: The graph neural
network model. IEEE Trans. Neural Networks 20(1), 61–80 (2009)
29. Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Yu, P.S.: A Comprehensive Survey on
Graph Neural Networks, pp. 1–22 (2019). http://arxiv.org/abs/1901.00596
30. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks.
In: 5th International Conference on Learning Representations, ICLR 2017 - Conference
Track Proceedings, pp. 1–14 (2017). https://arxiv.org/abs/1609.02907
31. Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. In:
Advances in Neural Information Processing Systems, pp. 1024–1034 (2017)
32. Li, Y., Zemel, R., Brockschmidt, M., Tarlow, D.: Gated graph sequence neural networks. In:
4th International Conference on Learning Representations, ICLR 2016 - Conference Track
Proceedings, pp. 1–20 (2016)

33. You, J., Ying, R., Ren, X., Hamilton, W.L., Leskovec, J.: GraphRNN: generating realistic
graphs with deep auto-regressive models. In: 35th International Conference on Machine
Learning, ICML 2018, vol. 13, pp. 9072–9081 (2018)
34. Popova, M., Shvets, M., Oliva, J., Isayev, O.: MolecularRNN: Generating realistic molecular
graphs with optimized properties (2019). https://arxiv.org/abs/1905.13372
35. Hajiramezanali, E., Hasanzadeh, A., Duffield, N., Narayanan, K.R., Zhou, M., Qian, X.:
Variational Graph Recurrent Neural Networks, pp. 1–12 (2019). http://arxiv.org/abs/1908.
09710
36. Duvenaud, D., Maclaurin, D., Aguilera-Iparraguirre, J., Gómez-Bombarelli, R., Hirzel, T.,
Aspuru-Guzik, A., Adams, R.P.: Convolutional networks on graphs for learning molecular
fingerprints. In: Advances in Neural Information Processing Systems, pp. 2224–2232 (2015)
37. Kearnes, S., McCloskey, K., Berndl, M., Pande, V., Riley, P.: Molecular graph
convolutions: moving beyond fingerprints. J. Comp. Aided Mol. Des. 30, 595–608
(2016). https://doi.org/10.1007/s10822-016-9938-8
38. Coley, C.W., Jin, W., Rogers, L., Jamison, T.F., Jaakkola, T.S., Green, W.H., Jensen, K.F.:
A graph-convolutional neural network model for the prediction of chemical reactivity.
Chem. Sci. 10(2), 370–377 (2019). https://doi.org/10.1039/c8sc04228d
39. Ryu, S., Lim, J., Hong, S.H., Kim, W.Y.: Deeply learning molecular structure-property
relationships using attention- and gate-augmented graph convolutional network (2018).
https://doi.org/10.1039/b000000x/been
40. Tsubaki, M., Tomii, K., Sese, J.: Compound-protein interaction prediction with end-to-end
learning of neural networks for graphs and sequences. Bioinformatics 35(2), 309–318
(2019). https://doi.org/10.1093/bioinformatics/bty535
41. Nguyen, T., Le, H., Quinn, T.P., Le, T., Venkatesh, S.: Predicting drug–target binding
affinity with graph neural networks. BioRxiv 12, 1–18 (2019). https://doi.org/10.1101/
684662
42. Thomas, J.J., Tran, H.N.T., Lechuga, G.P., Belaton, B.: Convolutional graph neural
networks: a review and applications of graph autoencoder in chemoinformatics. In: Thomas,
J.J., Karagoz, P., Ahamed, B.B., Vasant, P. (eds.) Deep Learning Techniques and
Optimization Strategies in Big Data Analytics, pp. 107–123. IGI Global (2020). http://doi.
org/10.4018/978-1-7998-1192-3.ch007
43. Niepert, M., Ahmad, M., Kutzkov, K.: Learning convolutional neural networks for graphs.
In: 33rd International Conference on Machine Learning, ICML 2016, vol. 4, pp. 2958–2967
(2016)
44. Kusner, M.J., Paige, B., Hernández-Lobato, J.M.: Grammar variational autoencoder. In: 34th
International Conference on Machine Learning, ICML 2017, vol. 4, pp. 3072–3084 (2017)
45. Simonovsky, M., Komodakis, N.: GraphVAE: towards generation of small graphs using
variational autoencoders. In: Lecture Notes in Computer Science (including subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). LNCS, vol.
11139, pp. 412–422 (2018). https://doi.org/10.1007/978-3-030-01418-6_41
46. De Cao, N., Kipf, T.: MolGAN: An implicit generative model for small molecular graphs
(2018). http://arxiv.org/abs/1805.11973
47. Bresson, X., Laurent, T.: A Two-Step Graph Convolutional Decoder for Molecule
Generation (2019). http://arxiv.org/abs/1906.03412

48. Lim, J., Ryu, S., Park, K., Choe, Y.J., Ham, J., Kim, W.Y.: Predicting drug-target interaction
using a novel graph neural network with 3D structure-embedded graph representation.
J. Chem. Inf. Model. 59(9), 3981–3988. Research-Article (2019). https://doi.org/10.1021/
acs.jcim.9b00387
49. Gonczarek, A., et al.: Interaction prediction in structure-based virtual screening using deep
learning. Comput. Biol. Med. (2017). https://doi.org/10.1016/j.compbiomed.2017.09.007
Academic and Uncertainty Attributes
in Predicting Student Performance

Abdalla M. Ali, J. Joshua Thomas, and Gomesh Nair

UOW Malaysia KDU Penang University College, Penang, Malaysia


abdalla.muf@gmail.com, joshopever@yahoo.com,
gomesh.nair@kdupg.edu.my

Abstract. This article investigates the use of academic and uncertainty attributes
in predicting student performance. The dataset used in this article consists
of 60 students who studied these subjects: Cloud Computing, Computing
Mathematics, Fundamentals of OOP, Object-Oriented SAD, and User Interface
Design. A deep learning model was developed that uses a Long Short-Term
Memory network (LSTM) and a Bidirectional Long Short-Term Memory network
(BLSTM) to predict the student grades. The results show that combining different
types of attributes can be used to predict student results. The results are
discussed further in the article.

Keywords: Machine learning · LSTM · BLSTM · Deep learning · Recurrent
neural network · Student grade prediction

1 Introduction

The use of ML technology in the education field has become increasingly popular in
recent years, with much research being conducted on analyzing student data, making
predictions, and helping prevent students from dropping out of university.
Universities now offer a broader range of specialized degrees than ever before
(minors, double degrees, interdisciplinary and inter-university masters) [1]. This
results in a large amount of data being generated from the LMS used by students and
lecturers in the university.
The data stored in educational institutions' repositories play an essential role in
extracting hidden and interesting patterns to assist every stakeholder in the educational
process.
Students' performance is the primary concern of various stakeholders, including
educators, administrators, and corporations. When recruiting fresh graduates, academic
achievement is the main factor considered by recruiting agencies. Therefore, graduates
have to work hard for excellent grades to rise to the expectations of recruiting
agencies [1].
Universities use LMS tools to make it convenient for students to interact with
instructors, but what universities lack is the knowledge behind this student data. They
are often unaware of the benefits they could gain by analyzing the student data
generated from the LMS tool. Analyzing this student data will help the university
understand why specific students may not perform well and inform the instructor to
provide guidance and support to those students.
This paper proposes a DL model that can be applied to student information to
analyze it and predict student performance. Being able to analyze student data and
make informed decisions can have an impact on the university. Universities that put
effort into analyzing this data can better understand student performance and guide
students when needed. The paper contributes to universities trying to improve their
students' performance and to individual students who want to know how they are
likely to progress throughout their studies. This paper shows how DL, together with
specific types of attributes, can be applied to student data to analyze it better and
predict student performance. The proposed model will assist the university in
predicting student performance.
The rest of this paper is organized as follows: Sect. 2 describes work done by other
researchers in terms of techniques, methods, and achieved results. Section 3 presents
the methodology, describing in detail the proposed model, the types of attributes, and
the dataset used for the experiment. Section 4 covers the implementation and the
results produced by the model. Section 5 concludes and highlights the paper's
contribution and some limitations.

2 Related Works

Data prediction using machine learning is applied in many different fields. Predicting
student grades has gained attention from universities seeking ways to improve the
student experience.
The research in [2] aimed to predict student performance by developing a
BLSTM-based algorithm called GritNet that achieves this objective. The algorithm
does not depend on feature selection because it learns from raw input data, and it can
also incorporate student events based on their time-stamps. The algorithm could be
further improved by incorporating indirect data such as student board activity and the
interaction between the student and lecturer. This kind of information would help
GritNet produce better performance predictions.
Utilizing the data gathered from an LMS and applying it to an ML algorithm may
not be enough to predict student performance. Another part requiring attention is
identifying the features that will be used together with the algorithm to perform the
prediction. Selecting the right set of features plays an essential role in producing a
quality prediction.
The study in [3] presents an analysis of the performance of feature selection
algorithms to help other researchers identify the best one. During the experiment,
various feature selection algorithms were used: CfsSubsetEval, ChiSquaredAttributeEval,
FilteredAttributeEval, GainRatioAttributeEval, Principal Components, and
ReliefAttributeEval. The results show that Principal Components produced the best
results. This could be further improved by using different dataset sizes and performing
parameter tuning of these algorithms.

The research in [4] aimed to predict student difficulties during a learning session
using a tool called the Digital Electronics Education and Design Suite (DEEDS).
During the experiment, the authors trained the following models on a dataset
consisting of 361 students: Artificial Neural Network (ANN), Support Vector Machine
(SVM), Logistic Regression (LR), Naïve Bayes (NB), and Decision Tree (DT). The
results show that ANN and SVM achieved a high accuracy of 75% in predicting a
student's difficulty in the next digital session. This also indicates that ANN and SVM
can be integrated into DEEDS and Massive Open Online Courses (MOOCs) to help
students choose the appropriate session to study and work effectively. A future
improvement would be to use the K-means algorithm to study student learning
behavior during interaction with DEEDS so that the lecturer can provide further
assistance to students who need help.
Many research works have been conducted in the education field with the goal of
predicting student performance. Researchers have used different ML algorithms and
various kinds of datasets, achieving varying prediction accuracies.

Table 1. List of attributes used by other researchers [7]

Demographic attributes: Gender; Marital status; Ethnicity; Hispanic or non-Hispanic;
Residence county
Financial attributes: Student's cash amount; Student's parents' cash amount; Student's
income; Father's income; Mother's income
Family background attributes: Father's education level; Mother's education level;
Number of family members; Number of family members in college
Enrollment attributes: Age of admission; First admission semester; Did student transfer
credit?; Student's college; Student's major
Pre-enrollment attributes: High school GPA; Composite ACT score; Math ACT score;
English ACT score; Reading ACT score; Science ACT score; High school graduation age
Semester-wise attributes: Credit hours attempted; Percentage of passed credits;
Percentage of dropped credits; Percentage of failed credits; Semester GPA

This paper proposes further work to address weaknesses identified by other
researchers, such as:
• The set of features used in other research was not effective, as it focused entirely on
student academic data.
• Lack of students' previous grades in the dataset.
• The algorithms do not work with indirect data, such as student board activity and
behavior.

This section has presented other research works in which researchers implemented
various ML techniques to predict student performance. This paper proposes using
academic, behavior, and uncertainty data to predict student performance. The next
section covers the methodology.

3 Methodology

This section covers the research implementation, starting with the research
framework and its components, and explains the overall structure in detail.
The framework used in this research, shown in Fig. 1, consists of three layers:

Fig. 1. Framework diagram

3.1 Student Attributes


These attributes are used to build the model that predicts student performance. They
were extracted from other research papers: a literature review was conducted on
different papers to understand their objectives and the types of attributes used in their
experiments. The attributes used in this experiment consist of two types:
• Academic attributes
  – Behavior attributes
  – Demographic attributes
• Uncertainty attributes

3.2 Academic Attributes


This refers to the student's grades and performance during studies, such as quizzes,
assignments, attendance, the mid-term exam, and the final exam. This type of data can
be collected from the LMS tools used by the university. These attributes are essential
in building the model and making the prediction because they describe how the
student performs. The final grade will indicate whether or not the student needs
assistance to perform better in the future. This type of attribute is divided into two
sub-types: behavior and demographic attributes.
Behavior refers to how active the student is during the course of study, measured
by the time the student spends in the library and using the Canvas LMS. Demographic
attributes refer to the student's gender and background information, which can be used
to estimate how the student will perform during the course of study.

3.3 Uncertainty Attributes


This refers to aspects of student behavior that play an important role in determining
how the student performs, such as the student's background and motivation to continue
studying and perform better. These attributes may not be captured through LMS tools,
but they can help determine whether or not the student requires assistance from the
lecturer.

3.4 Dataset
The dataset used for this experiment consists of 60 students studying five subjects:
Cloud Computing, Computing Mathematics, Fundamentals of OOP, Object-Oriented
SAD, and User Interface Design.
These data were extracted from the Canvas LMS and the Oracle LMS. Table 2
shows the types of attributes collected from the Canvas and Oracle LMS and used in
this experiment, and Table 3 explains these attributes in detail.

Table 2. Type of attributes and their data

Academic attributes: Student ID; Subject Name; Enrollment Term; Assignment 1;
Assignment 2; Mid Term; Final Exam
Behavior attributes: Library Activity; Canvas Active Usage
Demographic attributes: Gender; Age; Marital Status; Country Origin; Country Residence
Uncertainty attributes: Student ID; Enrollment term; Number of semesters taken; Withdraw

Table 3. Explanation of the type of attributes

Student ID: the identification number given to each student
Subject name: the name of the subject
Enrollment term: the semester in which the student took the subject
Assignment 1 & assignment 2: the assignments given to the student while taking the subject
Midterm: the student's assessment in the middle of the semester, assessing performance
based on previous lectures
Final exam: the final assessment given to the student at the end of the semester, assessing
the student based on lectures taken during the whole semester
Library activity: the time the student spends in the library
Canvas active usage: the time the student spends on a particular subject while using the
Canvas LMS
Gender: the student's gender
Age: the student's age
Marital status: the student's marital status
Country origin: the country the student is originally from
Country residence: the country the student currently lives in
Number of semesters taken: the number of semesters the student has taken
Withdraw: the reason why the student decided to drop out

Library activity refers to the total time the student spent in the library, calculated
from the times the student enters and leaves the library. LMS activity refers to the total
time the student spent on a subject, ranging from viewing subject materials and taking
assessments to attending online lectures.
The LMS data are collected from the LMS itself, while the library activity is
collected from the library system, which tracks student entry/exit using the student
card.
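As a purely hypothetical illustration of how such mixed attributes could be turned into model-ready features (the paper does not specify its encoding), categorical fields can be one-hot encoded and numeric fields scaled before being fed to the network:

```python
import pandas as pd

# Hypothetical records mixing academic, behavior, and demographic attributes.
df = pd.DataFrame({
    "assignment1": [78, 65], "midterm": [70, 55], "final_exam": [82, 60],
    "library_hours": [12.5, 3.0], "canvas_hours": [30.0, 8.5],
    "gender": ["F", "M"], "marital_status": ["single", "married"],
})

# One-hot encode the categorical columns; numeric columns pass through.
X = pd.get_dummies(df, columns=["gender", "marital_status"])

# Min-max scale the numeric columns to [0, 1] so the network trains stably.
num_cols = ["assignment1", "midterm", "final_exam", "library_hours", "canvas_hours"]
X[num_cols] = (X[num_cols] - X[num_cols].min()) / (X[num_cols].max() - X[num_cols].min())
```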

3.5 Prediction Model


The prediction model used to predict student performance consists of three parts:
Dispositional Learning Analytics, a Long Short-Term Memory network (LSTM), and
a Bidirectional Long Short-Term Memory network (BLSTM).

3.6 Dispositional Learning Analytics


This is a combination of student data collected from the LMS and data collected from
survey reports, used to better understand student behavior during the course of study.
These meaningful data can be used as part of the prediction to identify students who
require special care from the lecturer [8].

3.7 Long Short-Term Memory Networks (LSTM)

LSTM is part of the Recurrent Neural Network (RNN) family and can process an
entire sequence of data at one time. LSTM has the advantage of capturing temporal
information but cannot remember the full input context [6] (Fig. 2).

Fig. 2. LSTM architecture [6]

3.8 Bidirectional Long Short-Term Memory Networks (BLSTM)

A BLSTM processes the input sequence in both directions with two sub-layers to
account for the full input context. These two sub-layers compute forward and
backward hidden sequences, \overrightarrow{h} and \overleftarrow{h}, respectively,
which are then combined to compute the output sequence y [5] (Fig. 3).
The difference between BLSTM and LSTM is that BLSTM can process the
sequence of data both forward and backward to achieve a better prediction accuracy.
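In equation form, this combination step can be written as follows; this is the generic bidirectional formulation (internal LSTM gate equations omitted) rather than the exact notation of [5]:

```latex
\overrightarrow{h}_t = \mathrm{LSTM}\big(x_t,\ \overrightarrow{h}_{t-1}\big), \qquad
\overleftarrow{h}_t  = \mathrm{LSTM}\big(x_t,\ \overleftarrow{h}_{t+1}\big)

y_t = W_{\overrightarrow{h}y}\,\overrightarrow{h}_t
    + W_{\overleftarrow{h}y}\,\overleftarrow{h}_t + b_y
```

where the weight matrices and the bias b_y are learned output parameters.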

3.9 Proposed Application


Once the prediction model has predicted student performance, the result is displayed
inside the proposed application.

Fig. 3. BLSTM architecture [5]

4 Implementation

This section covers the research implementation in detail and shows how the model
was developed, explaining the algorithms and techniques used to produce the result.
LSTM and BLSTM were used to build the model and make the prediction. These two
networks were used because the dataset consists of sequential student data and the
model needs to predict student progress throughout the course. Using LSTM alone is
not sufficient for an accurate prediction given the context of the data, so BLSTM was
used as well, because it can move forward and backward through the data to achieve a
higher prediction accuracy. The data was split in an 80/20 ratio: 80% of the total data
was used for training, while the remaining 20% was used for testing.
The ReLU activation function was used in the model, together with a mean squared
error loss function and the Adam optimizer. The Dense layer was set to 1 because the
final value the model is expected to produce is a single output. Executing the program
produced the result shown in Fig. 4.
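As a minimal sketch of the configuration described above (a BLSTM with ReLU activation, mean squared error loss, the Adam optimizer, a final Dense layer of size 1, and an 80/20 split), the Keras code below builds such a regressor. The number of units, the sequence length, and the feature count are illustrative assumptions, since the paper does not report them, and the random arrays stand in for the real student records:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Bidirectional, Dense
from sklearn.model_selection import train_test_split

# Hypothetical shapes: 60 students, sequences of 5 subject records,
# each with 8 academic/behavior/uncertainty features.
X = np.random.rand(60, 5, 8).astype("float32")  # placeholder feature data
y = np.random.rand(60, 1).astype("float32")     # final grades scaled to [0, 1]

# 80/20 train/test split, as in the paper.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = Sequential([
    Bidirectional(LSTM(32, activation="relu"), input_shape=(5, 8)),
    Dense(1),  # a single predicted grade per student
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=50, batch_size=8, verbose=0)
pred = model.predict(X_test)  # compare against y_test, as in Fig. 4
```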
The blue line indicates the actual student grades, while the orange line indicates the
predicted grades. This shows that the model is learning from the data and producing
grade predictions.

Fig. 4. Student predicted performance

5 Conclusion

This paper has presented different types of attributes that can be applied to a model to
predict student results, demonstrated on a group of students who took the selected
subjects. A future improvement will be to apply this model to a larger amount of data,
monitor the results it produces, and fine-tune it to make it better.

Acknowledgment. This research work is supported by UOW Malaysia KDU Penang University
College.

References
1. Rovira, S., Puertas, E., Igual, L.: Data-driven system to predict academic grades and dropout.
PLoS One 12(2), e0171207 (2017)
2. Kim, B.H., Vizitei, E., Ganapathi, V.: GritNet: Student performance prediction with deep
learning. arXiv preprint arXiv:1804.07405 (2018)
3. Zaffar, M., Hashmani, M.A., Savita, K.S.: Performance analysis of feature selection algorithm
for educational data mining. In: 2017 IEEE Conference on Big Data and Analytics (ICBDA),
pp. 7–12. IEEE, November 2017
4. Hussain, M., Zhu, W., Zhang, W., Abidi, S.M.R., Ali, S.: Using machine learning to predict
student difficulties from learning session data. Artif. Intell. Rev. 52(1), 381–407 (2019)
5. Mousa, A.E.D., Schuller, B.W.: Deep bidirectional long short-term memory recurrent neural
networks for grapheme-to-phoneme conversion utilizing complex many-to-many alignments.
In: Interspeech, pp. 2836–2840, September 2016
6. Xia, J., Pan, S., Zhu, M., Cai, G., Yan, M., Su, Q., Ning, G.: A long short-term memory
ensemble approach for improving the outcome prediction in intensive care unit. Comput.
Math. Methods Med. 2019, 1–11 (2019)

7. Ameri, S., Fard, M.J., Chinnam, R.B., Reddy, C.K.: Survival analysis based framework for
early prediction of student dropouts. In: Proceedings of the 25th ACM International on
Conference on Information and Knowledge Management, pp. 903–912, October 2016
8. Thomas, J.J., Ali, A.M.: Dispositional learning analytics structure integrated with recurrent
neural networks in predicting students performance. In: International Conference on
Intelligent Computing and Optimization, pp. 446–456. Springer, Cham, October 2019
Captivating Profitable Applications
of Artificial Intelligence in Agriculture
Management

R. Sivarethinamohan1, D. Yuvaraj2, S. Shanmuga Priya3,


and S. Sujatha4
1
Department of Professional Studies, CHRIST (Deemed to be University),
Bangalore, India
mohan.dimat@gmail.com
2
Department of Computer Science, Cihan University,
Duhok, Kurdistan Region, Iraq
contactyuvaraj199@gmail.com
3
Department of Computer Science and Engineering, MIET Engineering
College, Trichy, Tamilnadu, India
priya501@gmail.com
4
Department of Civil Engineering, R. Ramakrishnan College of Technology,
Trichy, Tamilnadu, India
sujalalit@gmail.com

Abstract. Today’s agriculture routinely uses sophisticated technologies such as
robots, temperature and moisture sensors, aerial images, and GPS technology.
These advanced devices and precision agriculture and robotic systems allow
businesses to be more profitable, efficient, safe, and environment friendly.
Precision agriculture uses AI technology to aid in detecting diseases in plants,
pests, and poor plant nutrition in farms. The first milking robot was launched in
1995 and is now a fixture on farms everywhere. These AI powered technologies
ensure crop yields despite climate changes, population growth, employment
issues, and food security problems. Further AI helps prevent the use of surplus
water, pesticides, and herbicides, preserve soil fertility, enable competent
manpower use, and increase productivity and quality. With current employment
levels, future food demand would strain the global food system, thereby lending
credence to the need to make it highly efficient. This review surveys the work of
numerous researchers to get a brief outline of the potentials of AI applications in
agriculture and weeding systems with robots and drones. Various AI applica-
tions are discussed along with automated cropping and weeding techniques.
Various methods used by AI for spraying and crop-monitoring are also dis-
cussed here.

Key terms: Soil monitoring · Drones · Smart irrigation · Remote sensing ·
Driverless tractors


1 Introduction

There are about 14 crore (140 million) farmers in India whose main means of
livelihood is agriculture. With changes in ecological requirements [13], the question
arises as to how we can use technology to replace some human activities and
guarantee efficiency in farming. Farmers need to reach the next level of profitability
and efficiency, and the situation is slowly changing right under our noses. Artificial
intelligence is emerging as part of the solution to improve agricultural productivity. AI
adoption in agriculture provides better farming systems and better ways to grow crops.
It also provides accurate and timely information on crop growth and conditions to
detect and prevent crop disease, weed, and insect damage based on weather forecasts.
It helps in precision farming, crop management, crop yield estimation, early warning
and mitigation of crop disasters, agricultural production planning, crop commodity
trading, and food security decision support [9]. While some benefits of AI-based
agriculture technology include saving farmers time and money and boosting average
crop yields, others stem from more complex global concerns like weather
uncertainties, labor shortages, and rising demand for healthier, sustainably grown
food [2].

2 How AI Works in Agriculture


1. Sense: a group of sensors captures farm data and uploads it to the Fasal cloud platform.
2. Analyse: the data is analysed and presented to the farmer through an app for
decision making.
3. Predict: the captured data is used by prediction models to predict ideal growth
conditions, resource requirements, and disease.
4. Action: farmers are notified, and the resulting actions are taken directly from the
application (Fig. 1); a toy sketch of this loop follows the figure.

Fig. 1. Schematic representation of Artificial intelligence in agriculture
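Purely as an illustration of this sense-analyse-predict-act loop (not of the Fasal platform itself), the sketch below wires the four steps together; every function, field name, and threshold here is a hypothetical placeholder:

```python
import random

def sense():
    # Step 1: a sensor node samples field conditions (simulated values).
    return {"soil_moisture": random.uniform(0, 1),
            "temperature": random.uniform(15, 40)}

def analyse(reading):
    # Step 2: summarise the reading for the farmer-facing app.
    return {"dry": reading["soil_moisture"] < 0.3, **reading}

def predict(state):
    # Step 3: a trivial stand-in for a trained growth/disease model.
    return "irrigate" if state["dry"] else "no_action"

def act(decision):
    # Step 4: notify the farmer or trigger an actuator.
    print(f"Recommended action: {decision}")

act(predict(analyse(sense())))
```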



3 Artificial Intelligence in Agriculture


1. Crop yield prediction and price forecasts: AI identifies the output yield of crops
and forecasts prices for the next few weeks, helping farmers get maximum profit.
2. Intelligent spraying: sensors detect weed-affected areas and precisely spray
herbicides in the right region, thereby reducing herbicide use.
3. Predictive insights: AI provides insights on the right time to sow seeds for
maximum productivity, as well as insights into the impact of weather conditions.
4. Agricultural robots: using autonomous robots, agriculturists can harvest huge
amounts of crop at a faster pace.
5. Crop and soil monitoring: with ML/AI, crop health can be monitored to
diagnose pests, soil defects, nutrient deficiency in the soil, etc. (for example,
Cropin's SmartFarm serves as a robust and flexible system for this purpose).
6. Disease diagnosis: prior information and classification of plant diseases help
farmers control crop disease through proper strategies [6].
7. Drones: AI-enabled cameras can capture images of the entire farm and analyse
them in near real-time to identify problem areas and potential improvements.
8. Pest control: image recognition technology can identify and treat various types of
bugs and vermin.
9. Crop yield boosting: AI algorithms can determine the breeds and conditions that
will produce the highest yield.
10. Seasonal forecasting: AI systems create probabilistic models for seasonal
forecasting.

4 Applications of Artificial Intelligence in Agriculture

4.1 Automated Irrigation System (Smart Irrigation)


An automated irrigation system handles irrigation along with surveillance and
prevention of natural resource depletion, with minimal or no manual intervention [14].
Drip, sprinkler, surface, and other irrigation systems are programmed and mechanized
with the help of electronic appliances and detectors such as computers, timers, sensors,
and other mechanical devices. The system works with soil sensors: it has a distributed
wireless network of soil-moisture and temperature sensors placed at the plants' root
zones. The soil moisture sensor module contains a comparator [10]: the voltage from
the prongs is compared with a predefined voltage, and the comparator's output is high
only when the soil is dry. The output of the soil moisture sensor is provided to the
analogue input pin (Pin 2 – RA0) of the microcontroller. A gateway unit handles
sensor information, triggers actuators, and transmits data to a web application. As a
recent addition, smart irrigation controllers monitor weather, soil conditions,
evaporation, and plant water use to automatically adjust the watering schedule to
actual site conditions. Smart irrigation thus minimizes water wastage and controls the
quantity of water delivered to plants [1]. At present, irrigation and weeding are
increasingly handled by automated robotic systems [15] (Fig. 2); a software sketch of
the comparator logic follows the figure.

Fig. 2. Smart irrigation
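The comparator behaviour described above (irrigate only when the sensed voltage indicates dry soil) can be mimicked in software. The loop below is a hypothetical controller sketch: the sensor read and the pump-relay call are placeholders for real hardware drivers, and the threshold value is assumed:

```python
import random
import time

DRY_THRESHOLD = 0.55  # assumed fraction of the reference voltage

def read_soil_moisture():
    # Placeholder for sampling the sensor's analogue pin (e.g. RA0) via an ADC;
    # here we simulate a normalized voltage reading in [0, 1].
    return random.uniform(0.0, 1.0)

def set_pump(on):
    # Placeholder for driving the pump relay through the gateway unit.
    print("pump", "ON" if on else "OFF")

for _ in range(3):  # a real controller would loop indefinitely
    level = read_soil_moisture()
    # Mirror the hardware comparator: output high (irrigate) only when dry.
    set_pump(on=(level < DRY_THRESHOLD))
    time.sleep(1)  # real systems re-check the root-zone sensors periodically
```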

4.2 Drones for Spraying


Almost 84% of farmers use drones daily or more than once a week. At 73%, crop
monitoring is the most popular use of drones by farmers, followed by soil and field
analysis at 46% and health assessment of crops and livestock at 43%. The DJI MG-1S
agricultural drone is the one mostly used in agriculture (Fig. 3).

Fig. 3. A drone being used to spray crops

AI has been paired with drones to analyse aerial images and give agriculturists
data that can help them reduce planting costs, cut down water and fertilizer use, and
monitor crop health. Drones map out pre-planting patterns to help farmers maximize
the planting area and implement the best planting pattern for individual crops. This
minimizes the time needed to map out the pattern on foot.

4.3 Remote Sensing-Based Crop Health Monitoring


Remote sensing-based crop monitoring tracks the condition of cereal crop seedlings,
along with the status and trend of their growth. In concert with the development of
remote sensing applications, satellite data is the primary data source for investigating
large-scale crop conditions. Hyperspectral imaging, 3-D laser scanning, and
multi-sensor collections of phenotype data are examples of remote sensing-based crop
health monitoring systems. Recent advances in imaging and non-imaging sensor
technologies, remote sensing platforms, and satellite data availability have provided
new opportunities and challenges. Recent very-high-resolution satellite imagery,
acquired typically at sub-meter to 5-m resolution from satellites such as WorldView-2,
Pleiades-1, IKONOS, and RapidEye, has brought us into a new phase of remote
sensing for precision crop management over large farm tracts. Freely available satellite
data from sensors such as MODIS, NPP VIIRS, and Landsat have greatly facilitated
large-scale (i.e., regional or even global level) crop growth monitoring (Fig. 4).

Fig. 4. Remote sensing-based crop health monitoring

4.4 Autonomous Early Warning System for Fruit-Fly


An autonomous early warning system, built on wireless sensor networks and the GSM
network, captures up-to-the-minute natural environment fluctuations in fruit farms.
Self-organizing maps and support vector machines have been unified in this system to
implement adaptive learning and to spontaneously issue warnings to farmers through
GSM networks (Fig. 5).

Fig. 5. Fruit-fly

4.5 Driverless Tractor


A driverless tractor consists of various machine-tractor units and carries out
ploughing, cultivation, and cropping [13]. It was invented as early as 1940 but came
into full-time use only recently. The tractor uses GPS and other wireless technologies
to farm land without a driver. In a driverless tractor, the software is coupled to sensors,
radar, and GPS systems. These GPS-enabled tractors can plant, spray, and harvest
fields. They provide higher efficiency for precision agriculture and help overcome
labour shortages on farms. With advanced GPS, a tractor operator can tell which rows
have already been planted, to avoid overlapping (Fig. 6).

Fig. 6. Driverless Tractor

4.6 Face Recognition System for Domestic Cattle


Face recognition of cattle in dairy units can individually monitor all aspects of group
behaviour as well as body condition and feeding. The machine learning (ML)
algorithm is fed 20–30 pictures of each cow taken from different angles, against
different backgrounds, and under different lighting. The pictures of two cows can be
distinguished by the physical features across their faces, including the muzzles and
eyes, which the ML model captures. This helps in classifying the cattle in terms of
age, breed, and other categories. For instance, the Mooofarm app is used for digitizing
the breeding cycle of cattle. The farmer uploads the basic details of his cattle to get
real-time alerts such as the right time to inseminate them and the right time to get a
pregnancy diagnosis. The app also has an inbuilt e-commerce platform that allows
farmers to connect with input suppliers (food, fodder seeds) and other service
providers such as insurance and veterinarians. It also allows making an entry for the
medicines fed to the cattle, the kind of insemination done, the amount of fodder given
to the cattle, etc. This app helps the farmer reduce the feed amount (Fig. 7).

Fig. 7. Working of Mooofarm app

4.7 AI Paired IoT Device


The Internet of Things (IoT) is highly helpful to today’s farmers. AI enhances IoT
devices, thereby transforming farm management systems. The machines farmers use to
traverse their fields are stuffed with sensors and software that gather data, process it
with machine learning, and beam it into mobile apps. The sensors are the eyes of the
machine, while the software and mobile apps bring the data to life [4].

4.8 Chatbots
A chatbot provides assistance to farmers using NLP. This bot answers their questions
related to agricultural practice and technology and provides advice and
recommendations on specific farm problems [7].

4.9 Stethoscope
An excellent stethoscope provides farmers with data and observations for making
informed decisions about their crops, thus taking the guesswork out of farming; that is
exactly what Fasal does. Shoot and root borer insects in fruit and plantation crops are
detected through a stethoscope.

4.10 Intello Labs


Intello Labs performs quality assessment of food commodities using computer vision
and AI. It helps food businesses (growers, traders, retailers, food service companies,
exporters, etc.) improve customer satisfaction and cut down losses. Using deep
learning, Intello Labs helps farmers with image analyses that facilitate agricultural
product grading and alerts on crop infections [3].

4.11 AI-Based Precision Agriculture


Precision agriculture mostly uses geographic information and communication
technology (Geo-ICT) principles. It helps a farmer provide correct and timely
treatment to crops at the right farm location. Drone-based sensing and image
interpretation help accumulate timely high-resolution data on soil and crop conditions
[12]. The leaf area index, normalized difference vegetation index, photochemical
reflectance index, crop water stress index, and other such vegetation indices offer crop
health-related information; a toy computation of one such index is sketched after
Fig. 8. Temporal changes in these indices are related to biophysical and biochemical
stresses, which helps track changes in the health and canopy structure of a crop over
time. These stresses occur due to insufficient soil nutrients, inadequate soil moisture,
or pest attack. Through UAV-based PA, stressed areas are identified in real-time, and
corrective measures such as fertilizer and pesticide spraying are carried out (Fig. 8).

Fig. 8. Working of AI-based precision agriculture
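As a concrete example of one such vegetation index, the snippet below computes the normalized difference vegetation index, NDVI = (NIR - Red) / (NIR + Red), over toy red and near-infrared bands; in practice the inputs would be drone or satellite raster bands, and the 0.3 stress threshold is an assumed illustration:

```python
import numpy as np

def ndvi(nir, red, eps=1e-6):
    """Per-pixel normalized difference vegetation index, in [-1, 1]."""
    nir = nir.astype("float64")
    red = red.astype("float64")
    return (nir - red) / (nir + red + eps)  # eps avoids division by zero

# Toy 2x2 "images": higher NDVI indicates healthier, denser vegetation.
nir_band = np.array([[0.60, 0.55], [0.20, 0.70]])
red_band = np.array([[0.10, 0.30], [0.18, 0.05]])
index = ndvi(nir_band, red_band)
stressed = index < 0.3  # flag pixels for closer inspection or treatment
```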

4.12 Crop Phenotyping and Analysis


Crop phenotyping targets quantification of quality, photosynthesis, development,
architecture, growth, or biomass productivity of single plants or plant stands using a
wide range of sensors and analysis procedures.

4.13 Remote Sensing-Based Water Resources Mapping


This maps the water resources on a farm. It provides soil moisture data, assesses the
moisture content in the soil, and guides the farmer on the type of crop that can be
grown. There are two types of remote sensing technology: active and passive. Active
sensors emit energy to scan objects and areas, after which a sensor detects and
measures the radiation reflected from the target. Passive remote sensors include film
photography, infrared, charge-coupled devices, and radiometers.

4.14 AI Technology in Aquaculture


Through AI, farmers can remotely switch pumps, motors, aerators, or diffusers on or
off. Production and demand can be forecast by altering program parameters, further
improving farm efficiency and monitoring ability. Fish farms are turning to AI and
recirculating systems to scale up sustainable aquaculture. Recirculating aquaculture
systems (RAS) operate by filtering water from the fish (or shellfish) tanks so that it can
be reused within the tank. This dramatically reduces the amount of water and space
required for intensive seafood production. Recirculating systems enable fish to be
reared in consistent conditions, reducing the chances of them contracting illnesses and
the need to feed them antibiotics [5].

4.15 AI in Food Engineering and Cold Chain Logistics


Ensuring a stable food supply can be accomplished by using AI and machine learning.
The technology enables companies to test and monitor food safety products at every
step of the supply chain. Algorithms based on artificial neural networks monitor AI
food delivery and goods tracking at every step. Recently, the emergence of IoT has
helped regulation to improve food safety and increased the adoption of data-guided
decision making. In reality, the whole agricultural product chain needs to be kept at
low temperatures. In the cold chain, temperature-controlled storage and transportation
facilities prevent the rotting of easily perishable food products. These products are
stored in a way that keeps them fresh long enough to be exported.

4.16 Big Data and Cloud Computing


Big data provides farmers with granular data on rainfall patterns, water cycles,
fertilizer requirements, and more, enabling them to make smart decisions as to what
crops to plant for better profitability and when to harvest them. The right decisions
ultimately improve farm yields. Furthermore, data from state, district, and government
agriculture directories are stored in the Agri-Cloud. Farmers use the cloud to access
daily weather reports, soil reports, fertilizer reports, crop reports, market information,
storage information, agricultural machinery, and production data using mobile-based
applications, browsers, and graphical interfaces with the help of smartphones and
computers. They obtain accurate predictions of the products that are in demand in
different markets and adjust production accordingly (Fig. 9).

Fig. 9. Working of the Agricloud

4.17 Automatic Navigation and Self-driving Technology


When ploughing, self-driving technology takes the human out of the driver’s seat. AI
and location tracking are added to determine the path. A converted tractor and
harvester are combined into autonomous vehicles equipped with cameras, lasers, and
GPS systems. The two vehicles prepare the ground, sow seeds, and maintain crops,
while drones swoop in to scoop soil and crop samples. They even monitor the farm for
weeds and disease. AI technology developers are now working to make them smarter.
A prototype self-driving tractor and autonomous sprayer may handle more of the grunt
work required to harvest crops. See and Spray devices use cameras, machine learning,
and AI to detect weeds in fields and spray pesticides, fertilizer, and fungicides. These
applications are stored in the cloud, and the farmer needs only a modem on the
machine to undertake heavier computing in the cloud. Using a combination of GPS,
sensors, and imaging, farmers can presumably get a better handle on how to deploy
robotic vehicles to till the soil [8].

4.18 Other Applications of AI in Agriculture


Artificial intelligence technology can be used to implement technological innovation
in the construction of intelligent agriculture [17]. A few other applications are:
(i) AI-based decision support systems
(ii) Agricultural robotics and automation equipment
(iii) Computational intelligence in agriculture, food, and bio-systems
(iv) AI in agricultural optimization management
(v) Intelligent interfaces and human-machine interaction
(vi) Machine learning and pattern recognition
(vii) Systems modelling and analysis
(viii) Intelligent systems for animal feeding [11]
(ix) Expert systems in agriculture

5 Future of AI in Agriculture

The global artificial intelligence (AI) in agriculture market is witnessing a CAGR of
26.2% during the period 2019–2024 [16]. Digital farming and connected farm services
can impact 70 million Indian farmers in 2020, thereby adding $9 billion to farmers'
incomes. In 2017, the global AI in agriculture market was US$240 million, and it is
expected to reach US$1,100 million by the end of 2025, a CAGR of 20.8% during
2018–2025. Thus, initiatives to increase digital literacy in the rural landscape can be
seen as a weapon to double farmers' income in the near future. The number of
agricultural IoT devices is estimated at 75 million by 2020, and farms are estimated to
generate 4.1 million data points daily by 2050. The CAGR of AI in the agriculture
industry will be 22.68% during 2017–2021.

6 Conclusion

Artificial intelligence is a program that can adapt itself to execute tasks in real-time
situations using cognitive processing similar to the human mind. AI can be appropriate
and efficacious in the agriculture sector, as it optimises resource use and efficiency. It
solves the problems of resource and labour scarcity to a great extent. Adoption of AI is
thus quite useful in agriculture. Artificial intelligence can prove to be a technological
revolution and a boon to agriculture, helping feed an increasing human population
globally. It will both complement and challenge farmers to make the right decisions.

References
1. Harishankar, S., Kumar, R.S., Sudharsan, K., Vignesh, U., Viveknath, T.: Solar powered
smart irrigation system. Adv. Electr. Comput. Eng. 4, 341–346 (2014)
2. Ling, Y.: Application of artificial intelligence technology in agriculture. Comput. Knowl.
Technol. 202(29), 181–183 (2017)
3. Nagaraju, M., Chawla, P.: Systematic review of deep learning techniques in plant disease
detection. Int. J. Syst. Assur. Eng. Manag. 11, 547–560 (2020)

4. Mekala, M.S., Viswanathan, P.: CLAY-MIST: IoT-cloud enabled CMM index for smart
agriculture monitoring system. Measurement 134, 236–244 (2019)
5. Murugesan, R., Sudarsanam, S.K., Malathi, G., Vijayakumar, V., Neelanarayanan, V.,
Venugopal, R., Rekha, D., Sumit, S., Rahul, B., Atishi, M., Malolan, V.: Artificial
Intelligence and Agriculture 5.0 8, 1870–1877 (2019)
6. Mishra, P., Polder, G., Vilfan, N.: Close range spectral imaging for disease detection in
plants using autonomous platforms: a review on recent studies. Curr. Robot. Rep. 1, 43–48
(2020)
7. Liu, S.Y.: Artificial intelligence (AI) in agriculture. In: IT Professional, vol. 22(3), pp. 14–
15, 1 May–June 2020. https://doi.org/10.1109/mitp.2020.2986121
8. Shi, Y.L.: Application of artificial intelligence technology in modern agricultural production.
South. Agric. Mach. 50(14), 73 (2019)
9. Sun, G., et al.: Application of artificial intelligence in intelligent agriculture. J. Jilin Normal
Univ. Eng. Technol. 35, 93–95 (2019)
10. Wang, N., Zhang, N., Wang, M.: Wireless sensors in agriculture and food industry- Recent
development and future perspective. Comput. Electron. Agric. 50, 1–14 (2006)
11. Hashimoto, Y., Murase, H., Morimoto, T., Torii, T.: Intelligent systems for agriculture in
Japan. IEEE Control Syst. Mag. 21(5), 71–85 (2001). https://doi.org/10.1109/37.954520
12. Yang, N., et al.: Tea Diseases Detection Based on Fast Infrared Thermal Image Processing
Technology, (wileyonlinelibrary.com) (2019). https://doi.org/10.1002/jsfa.9564
13. Senkevich, S., et al.: Optimization of the parameters of the elastic damping mechanism in
class 1,4 tractor transmission for work in the main agricultural operations. In: Intelligent
Computing and Optimization, Conference Proceedings ICO 2018. Springer, Cham (2018).
ISBN 978-3-030-00978-6
14. Kovalev, A., et al.: Optimization of the process of anaerobic bioconversion of liquid organic
wastes, intelligent computing and optimization. In: Proceedings of the 2nd International
Conference on Intelligent Computing and Optimization (ICO 2019), Springer International
Publishing (2019). ISBN 978-3-030-33585-4
15. Talaviya, T., Shah, D., Patel, N., Yagnik, H., Shah, M.: Implementation of artificial
intelligence in agriculture for optimisation of irrigation and application of pesticides and
herbicides. Artif. Intell. Agric. (2020). https://doi.org/10.1016/j.aiia.2020.04.002
16. https://www.businesswire.com/news/home/20200817005511/en/Artificial-Intelligence-in-
Agriculture-An-Industry-Overview-2020-2025-Featuring-Microsoft-IBM-and-Agribotix-
Among-Other-Major-Players—ResearchAndMarkets.com
17. Jia, L., Wang, J., Liu, Q., Yan, Q.: Application research of artificial intelligence technology
in intelligent agriculture. In: Liu, Q., Liu, X., Shen, T., Qiu, X. (eds) The 10th International
Conference on Computer Engineering and Networks. CENet 2020 Advances in Intelligent
Systems and Computing, vol 1274. Springer, Singapore (2021). https://doi.org/10.1007/978-
981-15-8462-6_25
18. Ahamed, B.B., Yuvaraj, D.: Framework for faction of data in social network using link
based mining process. In: International Conference on Intelligent Computing and
Optimization, pp. 300–309. Springer, Cham, October 2018
19. Ahamed, B.B., Yuvaraj, D.: Dynamic secure power management system in mobile wireless
sensor network. In: International Conference on Intelligent Computing and Optimization,
pp. 549–558. Springer, Cham, October 2019
20. Yuvaraj, D., Sivaram, M., Ahamed, A.M.U., Nageswari, S.: An efficient lion optimization
based cluster formation and energy management in WSN based IoT. In: International
Conference on Intelligent Computing and Optimization, pp. 591–607. Springer, Cham,
October 2019
Holistic IoT, Deep Learning and Information Technology
Mosquito Classification Using Convolutional Neural Network with Data Augmentation

Mehenika Akter¹, Mohammad Shahadat Hossain¹(✉), Tawsin Uddin Ahmed², and Karl Andersson¹,²

¹ Department of Computer Science and Engineering, University of Chittagong, Chittagong, Bangladesh
mhnk.a.mitu@gmail.com, hossain_ms@cu.ac.bd
² Department of Computer Science, Electrical and Space Engineering, Luleå University of Technology, Skellefteå, Sweden
tawsin.uddin@gmail.com, karl.andersson@ltu.se

Abstract. Mosquitoes are responsible for the highest number of deaths every year
throughout the world. Bangladesh is also a big sufferer of this problem. Dengue,
malaria, chikungunya, zika, yellow fever etc. are caused by dangerous mosquito
bites. The three main types of mosquitoes found in Bangladesh are aedes,
anopheles and culex. Their identification is crucial to take the necessary steps to
kill them in an area. Hence, a convolutional neural network (CNN) model is
developed so that the mosquitoes can be classified from their images. We prepared
a local dataset consisting of 442 images, collected from various sources. An
accuracy of 70% has been achieved by running the proposed CNN model on the
collected dataset. However, after augmentation of this dataset, which grows to
3,600 images, the accuracy increases to 93%. We also compared the CNN method
with VGG-16, Random Forest, XGBoost and SVM. Our proposed CNN method
outperforms these methods in terms of the classification accuracy of the
mosquitoes. Thus, this research forms an example of humanitarian technology,
where data science can be used to support mosquito classification, enabling the
treatment of various mosquito-borne diseases.

Keywords: Mosquito · Classification · Dengue · Malaria · Convolutional neural network · Data augmentation

1 Introduction

Mosquitoes may seem to be tiny little creatures but they are one of the deadliest
animals of the world. They bring significant harm to humans since they are the main
reason behind various transmissible diseases like dengue, malaria, chikungunya, zika,
yellow fever etc. Their ability to carry and spread diseases to humans and animals
causes millions of deaths each year. As stated by the World Health Organization,
millions of deaths every year are caused by mosquito bites [26]. With an almost
2.5% case fatality rate, almost 500,000 people with severe dengue need to be hospitalized
each year [27]. Malaria is responsible for more than three hundred million
acute illnesses and kills at least one million people every year. There are more than
3,000 species of mosquitoes, but the most dangerous ones are aedes, anopheles and
culex: aedes causes dengue, yellow fever, chikungunya etc., anopheles causes
malaria, whereas culex causes zika, west nile virus etc.
Bangladesh, a densely populated country with a very low average income, is one
of the unhealthiest places to live in. Every year hundreds of people lose their lives
to mosquito bites and thousands get sick. They mainly get affected by dengue,
malaria and chikungunya, and in recent years the problem has become acute. In 2019,
at least 18 people died of dengue and 16,223 were infected in Bangladesh by
August [3]. The number of malaria cases is also shocking: approximately 150,000–
250,000 malaria cases are found in this country every year [20].
The goal of this research is to develop a model capable of detecting three different
classes of mosquitoes (aedes, anopheles and culex) from a given input image. The goal
is accomplished by training a model using machine learning techniques on our dataset.
We then try to improve the accuracy compared to other systems and maintain roughly
equal recognition rates for the classes.
The background and some related works on mosquito classification are presented in
the upcoming section. The methodology of this research is demonstrated in Sect. 3.
Then Sect. 4 gives an overview of the constructed dataset and the data augmentation
used in this research. Section 5 describes the implementation of the presented system.
After that, Sect. 6 presents the inspection of the results and finally, Sect. 7 concludes
the paper with a brief description of future works.

2 Related Work

There has been some work done on vision-based mosquito classification. The most
recent work on mosquito classification was done by Okayasu et al. [25]. They con-
structed their own dataset which consisted of 14,400 images of 3 types of mosquitoes.
They used three types of deep classification methods and showed a comparison of the
accuracy.
Motta et al. [23] presented a classification of adult mosquitoes. They trained CNNs
to implement morphological classification of the mosquitoes, using a dataset
consisting of 4,056 images. Among three neural network models (LeNet, AlexNet
and GoogleNet), they found the best result (76%) with GoogleNet.
In 2018, Li-Pang Huang et al. [12] classified mosquitoes using edge computing
along with deep learning. They managed to have validation accuracy of 98%. On the
other hand, they achieved a test accuracy of 90.5%.
Fuchida et al. [8] presented the design and experimental validation of an automated
agent for vision-based classification of mosquitoes. The agent could distinguish
mosquitoes from other insects by extracting distinguishing visual features and
classifying them with a support vector machine (SVM). With a maximum recall of
98% across different classification methods, they demonstrated the efficiency and
validity of their proposed method. However, the classification of mosquito species was
not considered there.
Ortiz et al. [28] presented a work based on a convolutional neural network to classify
mosquito larvae. They used 1,118 images of the larvae's 8th segment. Using a pre-trained
model (VGG16), they achieved an accuracy of 97%.
MAM Fuad et al. [7] classified aedes aegypti larvae and a float valve using transfer
learning with Inception-V3. They performed the experiment on 534 images and used
three different learning rates, achieving an average accuracy of about 99%.
Minakshi et al. [21] classified seven species from sixty mosquito images using
classifiers over pixel features. They implemented the random forest algorithm and
achieved a validation accuracy of 83.3%. But as their dataset was very small, having
only sixty images, the classification was not very reliable.
Jeffrey Glick and Katarina Miller [9] classified insects using Hierarchical Deep
CNNs (HD-CNNs). They worked on 217,657 images of different insects. They
implemented their method on 277 distinctive classes.
Several studies on mosquito classification using mosquito wingbeats have been
carried out too. Fanioudakis et al. [6] classified six species of mosquitoes using
recordings of their wingbeats. They achieved a classification accuracy of 96%. Their
dataset consisted of 279,566 wingbeat recordings. They used top-tier deep learning
approaches for the classification.
Kiskin et al. [18] worked on wavelet transformations of audio recordings of
mosquitoes in 2017. They built a CNN model to detect the presence of mosquitoes by
their wingbeat recordings.
Kyukwang Kim et al. [16] built an automatic mosquito sensing and control system
using deep learning. They found an accuracy of 84% for Fully Convolutional Network
(FCN) and regression which was built on neural networks.
The key difference between the existing systems and ours is that they did not use a
custom CNN model for classifying the mosquitoes, whereas our system does. CNN
models are known to perform better than other approaches in image classification as
well as image retrieval [24]. Therefore, a custom CNN model has been used, along
with pre-trained CNN models such as VGG-16, to compare their performances. As far
as limitations are concerned, even though many of the existing systems used a dataset
consisting of a large number of images, their results are not that strong relative to the
dataset size. In contrast, we tried to maintain a fair accuracy rate even with a dataset
containing a small number of images.
As there are not many mosquito images available online, research on their
classification is restricted. There is also no standard dataset for this purpose, so the
dataset had to be built on our own to determine the species of the mosquito. We made
our own dataset by gathering images of these mosquitoes and developed the
classification method in this research.

3 Methodology

The system was developed using a Convolutional Neural Network (CNN) with data
augmentation. Figure 1 illustrates the system flow chart. First of all, the model takes
the images from the dataset. Then it starts preprocessing.
After that, the images are augmented by using some augmentation functions. Finally,
the augmented dataset is run into the CNN model so that it can predict the class.

Fig. 1. System flow chart

The preprocessing of the images has been done in a simple way. First of all, the
mosquitoes have been detected in the images using OpenCV. Then, the body portions
of the mosquitoes have been cropped so that unnecessary parts of the background are
not present in the images. After that, the images are converted into grayscale so that
the model can learn from them more easily. Next, the images have been normalized to
a certain range to be recognized in a better way. The feature extraction of the images
has been conducted by the CNN itself. Images could also be detected using Haar
features [4].
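
As a rough illustration of this pipeline, the following Python sketch crops, grayscales and normalizes one image with OpenCV; the bounding box, target size and function names are illustrative assumptions, not details taken from the paper.

import cv2
import numpy as np

def preprocess(path, bbox, size=(64, 64)):
    # bbox = (x, y, w, h) around the mosquito body, e.g. from an OpenCV detector
    image = cv2.imread(path)
    x, y, w, h = bbox
    body = image[y:y + h, x:x + w]                 # crop away the background
    gray = cv2.cvtColor(body, cv2.COLOR_BGR2GRAY)  # grayscale simplifies learning
    gray = cv2.resize(gray, size)
    return gray.astype(np.float32) / 255.0         # normalize pixels to [0, 1]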
Using a filter w(m, n), convolution over an image f(m, n) is defined in Eq. (1):

w(m, n) \ast f(m, n) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(m - s, n - t)    (1)

The ReLU activation function has been applied in the convolution layers. To ensure the
non-linearity of the model [10], ReLU is applied as shown in Eq. (2):

f(m) = \max(0, m)    (2)

The model has been provided with 442 images. It includes 4 convolutional layers.
The 1st convolution layer has 16 filters of size 2 × 2, the 2nd has 32 filters, the 3rd has
64 filters of size 2 × 2 and the 4th has 128 filters. The kernel size is 2. The model
contains a 2 × 2 max pooling layer after each convolution layer. The ReLU activation
function has been applied in the hidden layers, such as the convolution layers. After
every hidden layer there is a dropout layer with a dropout value of 0.5. The dropout
layer deactivates 50% of the nodes of every hidden layer so that overfitting [31] can be
avoided. One Global Average Pooling layer has been added after the last hidden layer,
which takes the average of the feature maps, making them suitable for feeding into the
dense output layer. The hyperparameters have been tuned by adding layers until the
error no longer improved. Also, the value of the dropout layer has been chosen by
experimenting with multiple values. As the dropout value of 0.5 helped to avoid
overfitting more than
other values, it has been selected as the final value of the dropout layer. The parameters
have been optimized to improve the result.

Table 1. System architecture

Model contents | Details
1st Convolution Layer 2D | 16 filters of size 2 × 2, ReLU
1st Max Pooling Layer | Pooling size 2 × 2
Dropout Layer | Excludes 50% of neurons at random
2nd Convolution Layer 2D | 32 filters of size 2 × 2, ReLU
2nd Max Pooling Layer | Pooling size 2 × 2
Dropout Layer | Excludes 50% of neurons at random
3rd Convolution Layer 2D | 64 filters of size 2 × 2, ReLU
3rd Max Pooling Layer | Pooling size 2 × 2
Dropout Layer | Excludes 50% of neurons at random
4th Convolution Layer 2D | 128 filters of size 2 × 2, ReLU
4th Max Pooling Layer | Pooling size 2 × 2
Dropout Layer | Excludes 50% of neurons at random
Global Average Pooling Layer | N/A
Output Layer | Three nodes for three classes, SoftMax
Optimization Function | Adam
Callback | ModelCheckpoint

Finally, the output layer of the model includes three nodes because it has three
classes. As the activation function, SoftMax [33] has been used in the model.

\mathrm{Softmax}(m_j) = \frac{e^{m_j}}{\sum_i e^{m_i}}    (3)

As the model optimizer, we have used Adam [17], and for the loss function we have
used categorical crossentropy. ModelCheckpoint is also added as a callback function.
The CNN model composed for this study is illustrated in Table 1, and a minimal code
sketch follows.
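
A minimal Keras sketch of the architecture summarized in Table 1 is given below; the layer sizes follow the text, while the input resolution (64 × 64 grayscale) is an assumption made for illustration.

from tensorflow.keras import layers, models

def build_model(input_shape=(64, 64, 1), num_classes=3):
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    for filters in (16, 32, 64, 128):              # the four convolution blocks
        model.add(layers.Conv2D(filters, kernel_size=2, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(pool_size=2))
        model.add(layers.Dropout(0.5))             # excludes 50% of neurons at random
    model.add(layers.GlobalAveragePooling2D())     # average features for the dense output
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model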

4 Dataset and Data Augmentation


4.1 Data Acquisition
As there is no standard dataset on mosquitoes available online, we had to construct the
dataset from different online sources. We collected mosquito images from websites like
Pixabay [40], Getty Images [38], Shutterstock Images [41], iStock [39] etc. We col-
lected approximately 40 images from Pixabay, 120 images from Getty Images, 90
images from Shutterstock Images, 60 images from iStock and the rest from other
sources. We had a total of 442 images; 188 of aedes species, 130 of anopheles species
and 124 of culex species. Figure 2 displays some sample images of our dataset. We
took help from some sources like [19] to label the data correctly.

Fig. 2. Dataset samples

4.2 Characteristics of Aedes, Anopheles and Culex


There are some properties associated with the mosquitoes by which one can recognize
the mosquitoes and differentiate between them. Figure 3 shows example images of the
three mosquito species: aedes, anopheles and culex. The characteristics of aedes,
anopheles and culex are given below:
Aedes. Aedes mosquitoes can be identified easily as they possess black and white
markings all over the body. These mosquitoes are active in the daytime and rest in dark
corners. Primarily the female aedes bites humans and sucks blood so that it can lay eggs.
This mosquito is the carrier of infectious diseases like dengue, chikungunya etc. These
diseases are transmitted to humans by the bites of an infected female aedes.
Anopheles. The body color of an anopheles mosquito is brown or black. It consists of
3 body parts: the head, abdomen and thorax. The lower body of the vector points upward
while resting. It lays eggs after sucking blood. Even though it lives only some weeks to
a month, it is able to produce eggs in that time span. The anopheles mosquito is known
throughout the world for bearing one of the most infectious diseases, malaria. It is also
responsible for heartworm.
Culex. Culex appears to be a black mosquito with some white stripes on some body
parts. Both male and female culex live on honey and herb liquids. When a female culex
is ready to produce eggs, it feeds on the blood of humans, other beasts and also birds.
It is compulsory for the female culex to have blood so that it can reproduce. Though
the female mosquitoes bite only birds at some point, they also attack mammals
sometimes. The culex mosquito is responsible for spreading the Zika virus. It is also
responsible for spreading west nile virus, encephalitis and filariasis.
Fig. 3. Example images of aedes, anopheles and culex

4.3 Data Augmentation


It is known that a large amount of data is needed in the dataset to get finer performance
with a convolutional neural network [5, 13, 34]. If the dataset is big, more features can
be extracted from the data and matched against unknown data. But when there is not
sufficient data, data augmentation can be used to improve the performance of the
model [2]. By applying augmentation functions such as random rotation, shift, zoom,
noise and flips to the existing dataset, image augmentation can generate more images.
We used four types of augmentation functions on the dataset, vertical flip, horizontal
flip, random rotation and noise, as shown in the sketch after Fig. 4. Scikit-image [37]
and ImageDataGenerator were used for the augmentation. After augmentation, the
augmented dataset consisted of 3,600 images, 1,200 for each class. Figure 4 shows an
original image along with the augmented images made from it.

Fig. 4. Data augmentation
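
The sketch below illustrates the four augmentation functions with scikit-image; the rotation range and noise variance are assumed example values rather than settings reported in the paper.

import numpy as np
from skimage.transform import rotate
from skimage.util import random_noise

def augment(image, rng=None):
    # Return the four augmented variants of one grayscale image in [0, 1].
    rng = rng or np.random.default_rng()
    return [
        image[:, ::-1],                                          # horizontal flip
        image[::-1, :],                                          # vertical flip
        rotate(image, angle=rng.uniform(-25, 25), mode="edge"),  # random rotation
        random_noise(image, mode="gaussian", var=0.01),          # additive noise
    ]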

5 Implementation Process

The system's code has been written in the Python programming language using the
Spyder IDE [30]. The libraries used in this research are Keras [11], TensorFlow [1],
NumPy [36] and Matplotlib [32]. TensorFlow has been selected as the backend of the
system, and Keras is used to provide built-in functions like activation functions,
optimizers, layers etc. Data augmentation has been performed with the Keras API.
NumPy is used for numerical analysis. Sklearn is used for generating the confusion
matrix, splitting train and test data, ModelCheckpoint, the callback function etc., while
the Matplotlib library has been used to make the graphical representations, such as the
confusion matrix, loss versus epochs graph, accuracy versus epochs graph, etc. When
an image is provided as an input to our system, it preprocesses the image in exactly the
same way as when the model is trained. Then it predicts the class, as sketched below.
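
A hedged sketch of this prediction step is shown below; the checkpoint file name, input resolution and class ordering are assumptions made for illustration.

import cv2
import numpy as np
from tensorflow.keras.models import load_model

CLASSES = ["aedes", "anopheles", "culex"]       # assumed label order
model = load_model("mosquito_cnn.h5")           # assumed checkpoint saved via ModelCheckpoint

def predict_species(path):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)                 # same steps as training
    gray = cv2.resize(gray, (64, 64)).astype(np.float32) / 255.0
    probs = model.predict(gray[np.newaxis, :, :, np.newaxis])[0]
    return CLASSES[int(np.argmax(probs))]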

6 Result and Discussion

The intended model produces strong classification results in spite of having a dataset
that does not contain many images. Although there is a small variation in the
classification rate across the three classes, it is quite decent. Our proposed
convolutional neural network (CNN) model was able to reach a validation accuracy of
about 70% without data augmentation.
Figure 5 shows the confusion matrix of the system, with and without data
augmentation.

(a) Without Augmentation (b) With Augmentation

Fig. 5. Confusion matrix

Here, the x-axis represents the predicted labels and the y-axis represents the true
labels. The validation accuracy of each class of the model (before and after
augmentation) is shown in Table 2. Before augmentation, the individual accuracy for
anopheles is 70%, for aedes 72% and for culex 68%. After augmentation, the accuracy
for anopheles becomes 94% and the accuracy for both aedes and culex becomes 92%.
With the help of data augmentation, our proposed model achieves an overall accuracy
of 93%, which is far better than the accuracy without data augmentation.
Table 2. Accuracy before and after augmentation

Class | Before augmentation | After augmentation
Anopheles | 70% | 94%
Aedes | 72% | 92%
Culex | 68% | 92%

Figure 6 displays the training accuracy versus validation accuracy of the model before
augmentation. The training accuracy keeps increasing, but the validation accuracy
increases only up to 300 epochs and keeps decreasing over the last 200 epochs.

Fig. 6. Training accuracy versus validation accuracy

Figure 7 displays the training loss versus validation loss of the model before
augmentation. The training loss keeps decreasing until the end, but the validation loss
decreases only up to 100 epochs and keeps increasing after that.
Fig. 7. Training loss versus validation loss

The training accuracy versus validation accuracy of the model after data aug-
mentation is shown in Fig. 8. We can see that both training accuracy and validation
accuracy keep increasing gradually until the end.

Fig. 8. Training accuracy versus validation accuracy after data augmentation

Figure 9 displays the training loss versus validation loss of the model after data
augmentation. Here both training and validation loss keep decreasing until the end,
unlike the validation loss without augmentation.
Fig. 9. Training loss versus validation loss after data augmentation

Table 3 presents the model evaluation metrics before and after data augmentation.
Without augmentation, the validation accuracy was 70%, the training accuracy 86%,
the precision 71%, the recall 70% and the F1-score 69%. After data augmentation, the
validation accuracy becomes 93% and the training accuracy 97%; precision, recall and
F1-score are 93%, 93% and 92% respectively.

Table 3. Model evaluation metrics before and after data augmentation

Setting | Acc | Train. Acc | Precision | Recall | F1-score
Before AUG | 70% | 86% | 71% | 70% | 69%
After AUG | 93% | 97% | 93% | 93% | 92%

A commendable improvement can be observed in model learning. The model could
not perform well in recognising unseen images before augmentation due to data
limitation, but after augmentation the overfitting problem is resolved. As our previous
dataset consisted of inadequate images, data augmentation increased the system's
efficiency by adding augmented data to the dataset.
Overall performance comparison among several models is demonstrated in Table 4
to validate the effectiveness of our proposed CNN architecture. Support Vector
Machine (SVM), Extreme Gradient Boosting (XGBoost), Random Forest and the deep
learning pre-trained VGG net (VGG16) are assigned to this mosquito classification
task. These models are compared in terms of validation accuracy, recall, precision and
F1-score. SVM achieves the lowest validation accuracy (66%), which is beaten by the
validation accuracy of the XGBoost model by a margin of 3%. However, Random
Forest outperforms the other machine learning models and is able to gain 83%
validation accuracy. On the other side, when it comes to the transfer learning approach,
VGG16 surpasses the machine learning models in all performance evaluation metrics,
giving an accuracy of 91% (a hedged sketch of such a VGG16 baseline follows this
paragraph). But our proposed CNN method surpasses all of them and gives the highest
accuracy. In addition, an integration of data-driven (CNN) and knowledge-driven
(BRBES) approaches could be proposed to portray the risk assessment of a mosquito
bite on the human body [14, 15, 22, 35].
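
For context, a minimal transfer-learning sketch for a VGG16 baseline of this kind is given below; since VGG16 expects three-channel input, grayscale images are assumed to be stacked to RGB, and the input size is illustrative.

from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(64, 64, 3))
base.trainable = False                          # freeze the pre-trained features
clf = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(3, activation="softmax"),      # three mosquito classes
])
clf.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])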

Table 4. Models' performance comparison with augmentation

Model name | Accuracy | Precision | Recall | F1-score
Proposed CNN | 0.93 | 0.93 | 0.93 | 0.923
VGG16 | 0.91 | 0.909 | 0.901 | 0.905
Random Forest | 0.83 | 0.833 | 0.833 | 0.830
XGboost | 0.69 | 0.690 | 0.687 | 0.687
SVM | 0.66 | 0.677 | 0.660 | 0.663

7 Conclusion and Future Works

This research focuses on determining the opportunities to improve and build a
mosquito classifier, which could bring benefits to human beings. The goal of the
research was to classify mosquitoes when not many mosquito images are available.
The proposed Convolutional Neural Network model can be more effective if it is run
on a good amount of data, and the Convolutional Neural Network with data
augmentation proved more efficient and robust compared to other machine learning
methods in image processing. Since CNN is a data-driven procedure, we will increase
the data for each class to get close to equal accuracy for every class. We can apply
more augmentation functions to make the augmented dataset bigger, and we can gather
a larger amount of data to build a standard dataset for mosquitoes. We will also try to
make the system capable of classifying the mosquitoes in real time so that the system
can be more efficient. In the future, the model can be refined further to make a better
mosquito classification system using the convolutional neural network (CNN).

References
1. Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S.,
Irving, G., Isard, M., et al.: Tensorflow: a system for large-scale machine learning. In: 12th
{USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 16),
pp. 265–283 (2016)
2. Ahmed, T.U., Hossain, S., Hossain, M.S., Ul Islam, R., Andersson, K.: Facial expression
recognition using convolutional neural network with data augmentation. In: 2019 Joint 8th
International Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd
International Conference on Imaging, Vision & Pattern Recognition (icIVPR), pp. 336–341.
IEEE (2019)
3. Akram, A.: Alarming turn of dengue fever in Dhaka city in 2019. Bangladesh J. Infect. Dis.
6(1), 1–2 (2019)
4. Bong, C.W., Xian, P.Y., Thomas, J.: Face recognition and detection using haars features
with template matching algorithm. In: International Conference on Intelligent Computing &
Optimization, pp. 457–468. Springer (2019)
5. Chowdhury, R.R., Hossain, M.S., ul Islam, R., Andersson, K., Hossain, S.: Bangla
handwritten character recognition using convolutional neural network with data augmen-
tation. In: 2019 Joint 8th International Conference on Informatics, Electronics & Vision
(ICIEV) and 2019 3rd International Conference on Imaging, Vision & Pattern Recognition
(icIVPR), pp. 318–323. IEEE (2019)
6. Fanioudakis, E., Geismar, M., Potamitis, I.: Mosquito wingbeat analysis and classification
using deep learning. In: 2018 26th European Signal Processing Conference (EUSIPCO),
pp. 2410–2414. IEEE (2018)
7. Fuad, M.A.M., Ghani, M.R.A., Ghazali, R., Izzuddin, T.A., Sulaima, M.F., Jano, Z.,
Sutikno, T.: Training of convolutional neural network using transfer learning for aedes
aegypti larvae. Telkomnika 16(4) (2018)
8. Fuchida, M., Pathmakumar, T., Mohan, R.E., Tan, N., Nakamura, A.: Vision-based
perception and classification of mosquitoes using support vector machine. Appl. Sci. 7(1), 51
(2017)
9. Glick, J., Miller, K.: Insect classification with heirarchical deep convolutional neural
networks. Convolutional Neural Netw. Vis. Recogn. (CS231N), Stanford Univ. Final
Rep. Team ID 283, 13 (2016)
10. Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of
the Fourteenth International Conference on Artificial Intelligence and Statistics, pp. 315–323
(2011)
11. Gulli, A., Pal, S.: Deep Learning with Keras. Packt Publishing Ltd, Birmingham (2017)
12. Huang, L.P., Hong, M.H., Luo, C.H., Mahajan, S., Chen, L.J.: A vector mosquitoes
classification system based on edge computing and deep learning. In: 2018 Conference on
Technologies and Applications of Artificial Intelligence (TAAI), pp. 24–27. IEEE (2018)
13. Islam, M.Z., Hossain, M.S., ul Islam, R., Andersson, K.: Static hand gesture recognition
using convolutional neural network with data augmentation. In: 2019 Joint 8th International
Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd International
Conference on Imaging, Vision & Pattern Recognition (icIVPR), pp. 324–329. IEEE (2019)
14. Jamil, M.N., Hossain, M.S., Ul Islam, R., Andersson, K.: A belief rule based expert system
for evaluating technological innovation capability of high-tech firms under uncertainty. In:
2019 Joint 8th International Conference on Informatics, Electronics & Vision (ICIEV) and
2019 3rd International Conference on Imaging, Vision & Pattern Recognition (icIVPR),
pp. 330–335. IEEE (2019)
15. Kabir, S., Islam, R.U., Hossain, M.S., Andersson, K.: An integrated approach of belief rule
base and deep learning to predict air pollution. Sensors 20(7), 1956 (2020)
16. Kim, K., Hyun, J., Kim, H., Lim, H., Myung, H.: A deep learning-based automatic mosquito
sensing and control system for urban mosquito habitats. Sensors 19(12), 2785 (2019)
17. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:
1412.6980 (2014)
18. Kiskin, I., Orozco, B.P., Windebank, T., Zilli, D., Sinka, M., Willis, K., Roberts, S.:
Mosquito detection with neural networks: the buzz of deep learning. arXiv preprint arXiv:
1705.05180 (2017)
19. Littig, K., Stojanovich, C.: Mosquitoes: Characteristics of anophelines and culicines (2005).
http://www.cdc.gov/nceh/ehs/docs/pictorial_key/mosquitoes.pdf. Accessed 06 Jan 2017
20. Maude, R.J., Hasan, M.U., Hossain, M.A., Sayeed, A.A., Paul, S.K., Rahman, W., Maude,
R.R., Vaid, N., Ghose, A., Amin, R., et al.: Temporal trends in severe malaria in chittagong,
Bangladesh. Malaria J. 11(1), 323 (2012)
21. Minakshi, M., Bharti, P., Chellappan, S.: Identifying mosquito species using smart-phone
cameras. In: 2017 European Conference on Networks and Communications (EuCNC),
pp. 1–6. IEEE (2017)
22. Monrat, A.A., Islam, R.U., Hossain, M.S., Andersson, K.: A belief rule based flood risk
assessment expert system using real time sensor data streaming. In: 2018 IEEE 43rd
Conference on Local Computer Networks Workshops (LCN Workshops), pp. 38–45. IEEE
(2018)
23. Motta, D., Santos, A.Á.B., Winkler, I., Machado, B.A.S., Pereira, D.A.D.I., Cavalcanti, A.
M., Fonseca, E.O.L., Kirchner, F., Badaro, R.: Application of convolutional neural networks
for classification of adult mosquitoes in the field. PLoS ONE 14(1), e0210829 (2019)
24. Nandagopalan, S., Kumar, P.K.: Deep convolutional network based saliency prediction for
retrieval of natural images. In: International Conference on Intelligent Computing &
Optimization, pp. 487–496. Springer (2018)
25. Okayasu, K., Yoshida, K., Fuchida, M., Nakamura, A.: Vision-based classification of
mosquito species: comparison of conventional and deep learning methods. Appl. Sci. 9(18),
3935 (2019)
26. Omodior, O., Luetke, M.C., Nelson, E.J.: Mosquito-borne infectious disease, risk-
perceptions, and personal protective behavior among us international travelers. Prevent.
Med. Rep. 12, 336–342 (2018)
27. Organization, W.H., et al.: Dengue and severe dengue. Technical rep., World Health
Organization. Regional Office for the Eastern Mediterranean (2014)
28. Ortiz, A.S., Miyatake, M.N., Tünnermann, H., Teramoto, T., Shouno, H.: Mosquito larva
classification based on a convolution neural network. In: Proceedings of the International
Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA),
pp. 320–325. The Steering Committee of The World Congress in Computer Science,
Computer … (2018)
29. Perez, L., Wang, J.: The effectiveness of data augmentation in image classification using
deep learning. arXiv preprint arXiv:1712.04621 (2017)
30. Raybaut, P.: Spyder documentation. Available online at: pythonhosted.org (2009)
31. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a
simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–
1958 (2014)
32. Tosi, S.: Matplotlib for Python Developers. Packt Publishing Ltd, Birmingham (2009)
33. Tüske, Z., Tahir, M.A., Schlüter, R., Ney, H.: Integrating gaussian mixtures into deep neural
networks: softmax layer with hidden variables. In: 2015 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP), pp. 4285–4289. IEEE (2015)
34. Uddin Ahmed, T., Hossain, M.S., Alam, M., Andersson, K., et al.: An integrated cnn-rnn
framework to assess road crack. In: 2019 22nd International Conference on Computer and
Information Technology (ICCIT) (2019)
35. Ul Islam, R., Andersson, K., Hossain, M.S.: A web based belief rule based expert system to
predict flood. In: Proceedings of the 17th International conference on information integration
and web-based applications & services, pp. 1–8 (2015)
36. Walt, S.V.D., Colbert, S.C., Varoquaux, G.: The numpy array: a structure for efficient
numerical computation. Comput. Sci. Eng. 13(2), 22–30 (2011)
37. Van der Walt, S., Schönberger, J.L., Nunez-Iglesias, J., Boulogne, F., Warner, J.D., Yager,
N., Gouillart, E., Yu, T.: scikit-image: image processing in python. PeerJ 2, e453 (2014)
38. Royalty free stock photos, illustrations, vector art, and video clips. In: Getty Images. https://www.gettyimages.com/. Accessed 17 Sept 2019
39. Stock images, royalty-free pictures, illustrations, videos. In: iStock. https://www.istockphoto.com/. Accessed 20 Sept 2019
40. 1 million stunning free images to use anywhere. In: Pixabay. https://pixabay.com/. Accessed 24 Sept 2019
41. Stock images, photos, vectors, video, and music. In: Shutterstock. https://www.shutterstock.com/. Accessed 24 Sept 2020
Recommendation System for E-commerce Using Alternating Least Squares (ALS) on Apache Spark

Subasish Gosh¹, Nazmun Nahar¹, Mohammad Abdul Wahab¹, Munmun Biswas¹, Mohammad Shahadat Hossain²(✉), and Karl Andersson³

¹ BGC Trust University Bangladesh, Bidyanagar, Chandanaish, Bangladesh
{subhasish,nazmun,wahab,munmun}@bgctub.ac.bd
² University of Chittagong, University-4331, Chittagong, Bangladesh
hossain_ms@cu.ac.bd
³ Luleå University of Technology, 931 87 Skellefteå, Sweden
Karl.andersson@ltu.se

Abstract. A recommendation system can predict the ratings of users for items by
leveraging machine learning algorithms. The use of recommendation systems is
common in e-commerce websites nowadays. Since enormous amounts of data,
including users' click streams, purchase history, demographics, social networking
comments and user-item ratings, are stored in e-commerce system databases, the
volume of the data is growing at high speed, and the data is sparse. However, the
recommendations and predictions must be made in real time, enabling enormous
benefits to human beings. Apache Spark is well suited for applications which
require high-speed querying of data, transformation and analytics results.
Therefore, the recommendation system developed in this research is implemented
on Apache Spark. Also, matrix factorization using the Alternating Least Squares
(ALS) algorithm, a type of collaborative filtering, is used to solve overfitting
issues in sparse data and to increase prediction accuracy. The overfitting problem
arises because the user-item rating matrix is sparse. In this research a
recommendation system for e-commerce using the ALS matrix factorization
method on Apache Spark MLlib is developed. The research shows that the RMSE
value is significantly reduced using the ALS matrix factorization method, with an
RMSE of 0.870. Consequently, it is shown that the ALS algorithm is suitable for
training explicit feedback data sets where users provide ratings for items.

Keywords: Recommendation system · Alternating Least Square (ALS) · Matrix factorization · Spark MLlib · E-commerce

1 Introduction

In a homogeneous market, similar types of products are usually found, while in
heterogeneous markets different types of products are found. Therefore a migration
from homogeneous markets to heterogeneous markets is necessary, enabling us to find all
types of product in one place [1]. Pine claims that the design of one product is simply
no longer sufficient. Organizations must at least be able to develop multiple goods that
satisfy different customer needs. The e-commerce revolution helps companies to offer
more products to consumers. E-commerce is a process whereby goods, services and
information are purchased and sold or exchanged through computer networks,
including the Internet [2].
Nevertheless, e-commerce generates vast amounts of information that customers
need to process before selecting which goods to purchase. The use of a recommendation
system is the solution to this problem of information overload. The recommendation
system is used to propose products to customers through e-commerce sites. The
products may be recommended based on the online retailer, on the demographics of
the buyer, or on an analysis of the customer's past buying behavior in order to anticipate
future behavior. Such strategies are usually part of a website's personalization, since
they allow each customer to be followed across a platform. A recommendation system
automates web personalization; enabling individual customization to this extent is one
way of realizing Pine's idea on the web. So Pine is presumably in agreement with
Jeff Bezos, CEO of Amazon.com: "If I have two million online customers, I should
have two million web stores".
The recommendation system employs a range of technologies to filter the best
results and to provide users with the information they need. Recommendation systems
are divided into three broad categories: the collaborative filtering system, the
content-based system and the hybrid recommendation system [3]. Content-based
systems attempt to recommend items similar to those the user has acted on. They
operate by learning a new user's behavior based on the information needs the user
expresses over items. This is a keyword-specific recommendation approach which uses
keywords to describe the items. Therefore, models in a content-based recommendation
system can suggest products comparable to those the user has enjoyed or is currently
browsing.
The collaborative filtering system is focused on similarities between customers'
needs and the items. The items suggested for a new user are the ones that other similar
people liked in their past browsing history. In collaborative filtering, items are
combined, similarities are recognized based on user ratings, and new recommendations
are created based on multi-user comparisons. It faces, however, many difficulties and
limitations, such as data sparsity, which hampers the evaluation of large item sets; the
difficulty of predicting on the basis of the nearest-neighbor algorithm; scalability, as the
number of users and the number of items increase; and cold starts, where relationships
between like-minded people are not yet established.
The hybrid recommendation system carries out its task by combining content-based
and collaborative filtering approaches so that a suitable item is found. The hybrid
system is considered by many organizations to be the most widely used
recommendation system because it can remove defects that could arise in the
implementation of a single-approach recommendation system.
The main objective of this paper is to develop a recommendation system. For this
purpose, a computationally effective matrix factorization algorithm, ALS (Alternating
Least Squares), is proposed. ALS is a collaborative filtering method used to solve the
overfitting issue in sparse data [4].
Apache Spark is used to implement the ALS algorithm. In this paper the ALS
algorithm is also compared with other collaborative filtering methods, namely SGD
(Stochastic Gradient Descent) and SVD (Singular Value Decomposition).
The rest of this paper is organized as follows. Related work is reviewed in Sect. 2.
The ALS matrix factorization method is explained in Sect. 3. The predictive accuracy
metrics used to evaluate the system are defined in Sect. 4. In the following section, the
system prototype along with the system workflow is described with a diagram. In
Sect. 6, the system architecture of the recommendation system is illustrated. The
experimental results are described in Sect. 7, and finally the conclusion and future
work of the system are discussed.

2 Related Work

Many researchers have worked in the field of e-commerce utilizing big data
techniques together with machine learning and data mining methods, and a number of
applications have been developed using those techniques. Recommendation systems
are used in large areas such as e-commerce, social networks, education, government
etc. The development of recommendation systems using machine learning methods in
previous research is described in this section.
A hybrid recommendation model was designed by Wei et al. [5]. This study
addressed the Complete Cold Start (CCS) and Incomplete Cold Start (ICS) problems.
These problems were overcome by combining collaborative filtering (CF) with a deep
learning neural network. A specific deep learning neural network (SADE) was used to
extract content features of items, and a CF model with timeSVD++ was used to predict
ratings by taking the content features of cold start items into account. The large Netflix
movie dataset was used for their analysis. This research found that CCS allows better
recommendation than ICS. But their experiment investigated few parameters and
evaluated performance using RMSE rating prediction, which is not sufficient for a real
recommendation system.
Using an item-based collaborative filtering algorithm, a recommendation system was
developed by Kupisz and Unold [6]. They used the Hadoop and Apache Spark platforms
to implement the system. The MapReduce paradigm was used in this study to process
large datasets, reducing the problems of parallel programming. Due to limitations of
access time and memory, this prototype does not give better outcomes for large
amounts of data.
Chen et al. [7] used CCAM (co-clustering with augmented matrices) to develop
multiple methods like heuristic scoring, conventional classification and machine
learning to construct a recommendation system, as well as an integration of
content-based and collaborative filtering models into a hybrid recommendation system.
Zhou et al. [8] developed a collaborative filtering based ALS algorithm for the
Netflix Prize. ALS works to solve the scalability issue of extensive datasets. This study
used the ALS algorithm to develop a movie recommendation system for predicting
users' ratings. Without refining the Restricted Boltzmann Machine (RBM), this system
could not show a moderately better result.
Item-based collaborative filtering strategies were used by Dianping et al. [9]. In
this system, the user-item rating matrix is first examined, the associations between
different items are categorized, and these relationships are used to evaluate the user's
recommendations.
Dev et al. [10] have proposed a scalable method for developing recommendation
systems based on similarity joins. To work with big data applications, the MapReduce
framework was used to design the system. Using a method called extended prefix
filtering (REF), the process can significantly reduce unnecessary overhead
computation, such as redundant comparisons, in the similarity computing step.
Zeng et al. [11] proposed PLGM to boost the accuracy, scalability and processing
power for large-scale data. Two matrix factorization algorithms, ALS and SGD, were
considered in their work. SGD-based parallel matrix factorization was implemented on
Spark and its efficiency was compared with ALS and MLlib. Based on the test results,
the advantages and disadvantages of each model were analyzed. A number of
approaches to profile aggregation were studied, and the model which gave the best
result in terms of efficiency and precision was adopted; models such as PLGM and
LGM were studied.
Jooa et al. [12] implemented a recommendation system using both collaborative
filtering and association rules. Distance data from the Global Positioning System
(GPS) was used to recommend items to customers for purchase on the basis of
customer preference. The problem of information overload was addressed using Near
Field Communication (NFC), which was used to select high-preference items of a user
by analyzing time slots with association rules.
A recommendation system based on a combination of collaborative filtering and
clustering techniques was proposed by Panigrahi et al. [13]. The Apache Spark
platform was used to accelerate the running time of in-memory computation for
performing recommendation, with Spark's native language Scala used for speeding up
computation. This study focused on reducing the cold start problem, a limitation of
traditional collaborative filtering, by correlating customers to items on the basis of
features. But due to limitations of programming features and the verbose code of the
Scala programming language, this model did not achieve better prediction in the
recommendation system.
An algorithm was proposed by Li et al. [14] for parallelizing frequent itemset
mining on large datasets. This proposed algorithm discovered hidden patterns to help
query recommendation. By distributing computation between cluster nodes, this
algorithm was able to minimize computation cost. They achieved better performance
for distributed processing by utilizing the MapReduce framework on cluster
computers. This study did not analyze the query logs of the Google search engine.
From the analysis of previous work, a drawback is shown regarding the overfitting
issue of sparse data. Our proposed system overcomes this issue by applying matrix
factorization with the ALS algorithm. The use of matrix factorization in our system
also increases prediction accuracy over previous research.

3 Alternating Least Squares (ALS)

E-commerce companies like Amazon and Netflix encounter challenges in personalized
product recommendation. In these personalized settings users rate items, and the
ratings data are used to predict ratings for other items.

The rating data can be represented as an n \times m matrix R with n users and m items.
The (u, i)-th entry of R is r_{ui}, the rating of the i-th item by user u. The matrix R is
sparse, as most items do not receive ratings from many users. Therefore, R has mostly
missing values.
Matrix factorization is the solution to this sparse matrix problem. There are two
k-dimensional vectors, referred to as "factors":
1. x_u \in \mathbb{R}^k summarizes every user u.
2. y_i \in \mathbb{R}^k summarizes every item i.
Hence, the rating of the i-th item by user u can be predicted by

r_{ui} \approx x_u^{T} y_i    (1)

Equation (1) can be formulated as an optimization problem:

\min_{x_u, y_i} \sum_{r_{ui}} \left( r_{ui} - x_u^{T} y_i \right)^2 + \lambda \left( \sum_{u} \| x_u \|^2 + \sum_{i} \| y_i \|^2 \right)    (2)

Here, \lambda is the regularization factor used to solve the overfitting problem; this
formulation is referred to as weighted-\lambda-regularization. The value of \lambda can be tuned
to address overfitting, and the default value is 1 [15].
If the set of variables x_u is held constant, the objective function is convex in y_i;
likewise, if y_i is held constant, the objective function is convex in x_u. Hence the
optimal values of x_u and y_i can be found by repeating this alternating approach until
convergence. This is called ALS (Alternating Least Squares).

Algorithm: Alternating Least Squares (ALS)

Procedure ALS(x_u, y_i)
    Initialize x_u <- 0
    Initialize matrix y_i with random values
    Repeat
        Fix y_i; solve x_u by minimizing the objective function (the sum of squared errors)
        Fix x_u; solve y_i by minimizing the objective function similarly
    Until reaching the maximum iteration
    Return x_u, y_i
End procedure
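
As a concrete illustration of this procedure, the NumPy sketch below alternates the two closed-form least-squares solves on a small dense ratings matrix R (missing entries stored as zeros); a production system would instead use Spark MLlib, and the rank and lambda values are assumed examples.

import numpy as np

def als(R, k=10, lam=0.05, iterations=5):
    n, m = R.shape
    X = np.zeros((n, k))                  # user factors x_u, initialized to 0
    Y = np.random.rand(m, k)              # item factors y_i, random initialization
    observed = R > 0                      # only observed ratings enter the objective
    for _ in range(iterations):
        for u in range(n):                # fix Y, solve each x_u in closed form
            Yu = Y[observed[u]]
            X[u] = np.linalg.solve(Yu.T @ Yu + lam * np.eye(k),
                                   Yu.T @ R[u, observed[u]])
        for i in range(m):                # fix X, solve each y_i similarly
            Xi = X[observed[:, i]]
            Y[i] = np.linalg.solve(Xi.T @ Xi + lam * np.eye(k),
                                   Xi.T @ R[observed[:, i], i])
    return X, Y                           # predicted rating: X[u] @ Y[i]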
4 Predictive Accuracy Metrics

The recommendation system is evaluated using offline analysis. During offline analysis
there are no actual users; rather, a large dataset is split into a training and a test set. The
recommendation system is trained on the training dataset to predict the ratings given
by users in the test dataset, and evaluation is carried out by comparing the resulting
predictions with the actual ratings in the test dataset. Popular predictive accuracy
metrics such as mean absolute error (MAE), root mean squared error (RMSE) and
mean user gain (MUG) are used to measure the accuracy of the predictions made by the
recommendation system.

\mathrm{MAE} = \frac{1}{|r_{ui}|} \sum_{r_k \in r_{ui}} \left| r_i(r_k) - p_i(r_k) \right|    (3)

\mathrm{RMSE} = \sqrt{ \frac{1}{|r_{ui}|} \sum_{r_k \in r_{ui}} \left( r_i(r_k) - p_i(r_k) \right)^2 }    (4)

Recommendation systems predict an item as interesting or not interesting to a user, so
the accuracy of the predictions should also be measured by their impact on users. This
user-specific accuracy metric, often referred to as mean user gain (MUG), measures the
quality of a recommendation system in helping users buy the right item.

UG(p_i(r_k)) = \begin{cases} r_i(r_k) - h_i & \text{if } p_i(r_k) \ge h_i \\ h_i - r_i(r_k) & \text{otherwise} \end{cases}    (5)

where h_i is the threshold value to calculate the quality of a recommendation system.

\mathrm{MUG} = \frac{1}{|r_{ui}|} \sum_{r_k \in r_{ui}} UG(p_i(r_k))    (6)
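
A short sketch of Eqs. (3)-(6) over aligned arrays of test-set ratings follows; the threshold value used for MUG is an assumed example.

import numpy as np

def mae(actual, predicted):
    return np.mean(np.abs(actual - predicted))            # Eq. (3)

def rmse(actual, predicted):
    return np.sqrt(np.mean((actual - predicted) ** 2))    # Eq. (4)

def mean_user_gain(actual, predicted, threshold=3.5):
    # Eqs. (5)-(6): gain is (actual - threshold) when the prediction clears
    # the threshold, otherwise (threshold - actual); threshold 3.5 is assumed.
    gain = np.where(predicted >= threshold, actual - threshold, threshold - actual)
    return np.mean(gain)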

5 System Prototype

The proposed system helps online customers with swift selection of their desired
products among big data stored in a Hadoop cluster. Customers, software bots/agents
and robotic shopping assistants perform data transactions in online shopping carts and
e-commerce sites frequently. Therefore, the size of the information is growing toward
zettabytes, and processing it requires a significant amount of computing resources
and time.
Apache Spark runs on Hadoop clusters. Apache Spark can perform batch processing,
stream processing, graph processing, interactive processing and in-memory processing
of data. Applications built on Apache Spark can process data at high speed, as Spark
stores the Resilient Distributed Dataset (RDD) in the RAM of cluster nodes. An RDD is
an immutable, logically partitioned, fault-tolerant collection of data records as objects
which are computed on different cluster nodes [16–19]. Datasets are created from
different types of files, such as JSON files, CSV files, text files etc., saved in the
Hadoop file system. Text file RDDs are created by SparkContext's textFile method.
In the system, the CSV file is loaded externally from HDFS in the Hadoop cluster.
The file in the RDD is logically partitioned and stored in multiple nodes of the cluster.
The file is partitioned into smaller blocks so that the RDD data can be distributed
equally among threads. The full dataset is transformed, coarse-grained, taking an input
RDD and producing many output RDDs. The transformations are pipelined to improve
performance. Transformations apply computations like map(), filter(), reduceByKey(),
groupByKey() etc. Spark delivers the final result of all intermediate transformations
and writes it out to HDFS on the Hadoop cluster; this is referred to as an action.
Examples of actions in Spark are first(), take(), reduce(), collect(), count() etc. A
hedged sketch of this flow follows.
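
The following PySpark sketch mirrors this flow (create an RDD from a CSV file on HDFS, pipeline transformations, trigger an action); the file path and field layout are assumptions.

from pyspark import SparkContext

sc = SparkContext(appName="RatingsETL")
lines = sc.textFile("hdfs:///data/ratings.csv")              # RDD created from HDFS
ratings = (lines.map(lambda l: l.split(","))                 # transformation: map
                .filter(lambda f: len(f) >= 3)               # transformation: filter
                .map(lambda f: (int(f[1]), float(f[2]))))    # (itemId, rating) pairs
counts = ratings.map(lambda kv: (kv[0], 1)).reduceByKey(lambda a, b: a + b)
print(counts.take(5))                                        # action materializes results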
The recommendation system uses the RDD-based Spark MLlib package. Spark
MLlib implements the alternating least squares (ALS) algorithm to predict users'
ratings of different items. The algorithm uses the matrix factorization technique to
predict the explicit preferences of users for items, as sketched below.
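
A hedged sketch of training the RDD-based MLlib ALS model on explicit ratings is shown below; the file path, the "::" field separator and the rank/iteration/lambda values are illustrative assumptions.

from pyspark import SparkContext
from pyspark.mllib.recommendation import ALS, Rating

sc = SparkContext(appName="ALSRecommender")
raw = sc.textFile("hdfs:///data/ratings.dat")
ratings = raw.map(lambda l: l.split("::")) \
             .map(lambda f: Rating(int(f[0]), int(f[1]), float(f[2])))
model = ALS.train(ratings, rank=10, iterations=5, lambda_=0.05)
print(model.predict(5192, 557))                 # predicted rating for one user-item pair
print(model.recommendProducts(5192, 5))         # top-5 recommendations for user 5192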

Fig. 1. Alternating least squares matrix factorization method on spark recommendation system
workflow

6 System Architecture

The recommendation system is designed for interactive e-commerce sites. In this case,
real-time recommendation and retraining of the recommendation system are the two
most important requirements. Therefore, dealing with the increasing volume of data
and the change of data are major challenges. The number of transactions on
e-commerce sites varies with time as new users arrive frequently. Predicting
recommendations for new users is important to keep users attracted and to obtain good
feedback.
The Movielens 1M data set has 71,567 users' ratings for 10,681 movies [20]. Each
user in this data set has rated a minimum of 20 movies, and there are 10,000,054
ratings available in the data set. The Movielens data set is an explicit feedback data set
where users give ratings for items.
The recommendation system architecture is depicted in Fig. 2. The system runs on
Apache Spark 2.4.5, Apache Hadoop 3.2.1 and Apache HBase. The training dataset
storage and retrieval operations, along with the Apache HBase data files, are managed
by HDFS. Apache Spark MLlib produces predictions and stores the recommendations
in the HBase NoSQL database. The Spark system consists of master nodes, a cluster
manager and worker nodes. The RDDs are created and processed in the worker nodes,
and the cluster manager keeps records of RDD locations in the worker nodes. The
users visualize the recommendations on the e-commerce websites browsed by
smartphone clients and personal computer clients connected to the HTTP servers in
the cloud.

Fig. 2. System architecture of the proposed recommendation system

The system stores and manages the master dataset in the Hadoop distributed file
system (HDFS) so that data can be accessed faster. The recommendations are
pre-computed by processing the master dataset in batch using Apache Spark and are
stored in HBase. Real-time views are computed on streaming user data using Apache
Spark Streaming and are also stored in HBase. The real-time recommendations are
calculated by merging the batch view and the real-time view. Therefore, following the
lambda architecture is a sustainable option for developing the system.
An experiment was conducted to measure the difference in total execution time
required for calculating recommendations using the Singular Value Decomposition
training algorithm on a non-distributed system and on the Spark system. The
experiment was conducted on a desktop with an AMD Sempron(tm) processor
(2.8 GHz) and 4 GB RAM. It shows that the total execution time is reduced
significantly on the Spark system, as shown in Table 1.

Table 1. Total execution time comparison on the dataset

System | Dataset | Total execution time
Non-distributed | Movielens 1M | 6 min 13 s
Apache Spark | Movielens 1M | 4 min 43 s

7 Experimental Result and Discussion

In this research, the recommender system is trained using the Alternating Least
Squares (ALS) algorithm. The performance metric, the RMSE value on Movielens,
depends on the training parameters. The objective of the training is to decrease the
RMSE value, and the ALS model producing the minimum RMSE value is saved for
future recommendation. The system is implemented on an Apache Spark and Apache
Hadoop cluster. Therefore, the hardware configuration of the cluster's master node,
data nodes, node manager, resource manager, name node, secondary name node and
history servers has an impact on the performance.

Fig. 3. Spark jobs execution gantt chart for active jobs

Fig. 3 shows the process of job submission in Spark. It is noticed that the maximum time is required for building the top recommendations for each user and each movie. This is because, during job execution, the datasets are divided into 20 RDDs (Resilient Distributed Datasets) for the user factors and item factors, 10 each. The results for these two factors are aggregated to produce recommendations. The initial parameters for the system are shown in Table 2.
Table 2. Parameter setting


Parameter   Value
Lambda      0.01
Iterations  5

Table 3 shows the top 5 item recommendations for a particular user along with the average rating and prediction values. The RMSE value obtained in this experiment is 0.9.

Table 3. Recommendations using ALS Algorithm


UserID ItemID Avg rating Prediction
5192 557 5 5.402
5192 864 4 5.284
5192 2512 3.9 5.189
5192 1851 4 5.125
5192 2905 4.6 5.098

As shown in Table 4, the training process includes an RMSE threshold check step in which the lambda value can be changed to optimize the recommendation model. As shown in Fig. 4, keeping the number of iterations fixed at 5 and varying the lambda value in the range 0.005 to 0.5, the minimum RMSE value of 0.870 is found at a lambda value of 0.05. Table 4 shows that at lambda values of 0.005 and 0.006, the RMSE values are 0.907 and 0.905, respectively. As the lambda value increases, the RMSE generally decreases until it reaches its minimum of 0.870 at lambda = 0.05 (with 5 iterations), after which it increases again.
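The lambda sweep behind Table 4 can be sketched as follows (reusing the ratings RDD from the earlier sketch; the train/test split and the rank value are illustrative assumptions, not the paper's stated setup):

from math import sqrt
from pyspark.mllib.recommendation import ALS

train, test = ratings.randomSplit([0.8, 0.2], seed=42)
test_pairs = test.map(lambda r: (r.user, r.product))

best_lambda, best_rmse = None, float("inf")
for lam in [0.005, 0.006, 0.008, 0.009, 0.01, 0.05, 0.1, 0.15, 0.2, 0.25, 0.5]:
    model = ALS.train(train, rank=10, iterations=5, lambda_=lam)
    # Join predictions with the held-out ratings and compute the RMSE
    preds = model.predictAll(test_pairs).map(lambda r: ((r.user, r.product), r.rating))
    truth = test.map(lambda r: ((r.user, r.product), r.rating))
    mse = truth.join(preds).map(lambda kv: (kv[1][0] - kv[1][1]) ** 2).mean()
    if sqrt(mse) < best_rmse:
        best_lambda, best_rmse = lam, sqrt(mse)

print(best_lambda, best_rmse)  # Table 4 reports the minimum at lambda = 0.05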

Table 4. RMSE values for different lambda values at a fixed number of iterations

Lambda Iterations RMSE
0.005 5 0.907
0.006 5 0.905
0.008 5 0.898
0.009 5 0.893
0.01 5 0.896
0.05 5 0.870
0.1 5 0.892
0.15 5 0.916
0.2 5 0.931
0.25 5 0.941
0.5 5 1.022

Fig. 4. Graphical representation of RMSE value

In this research, a comparative analysis is performed to evaluate the performance of the proposed recommendation system's training algorithm, Alternating Least Squares (ALS). A comparison is carried out among the ALS, SVD++, and parallelized SGD matrix factorization methods. The results show that the RMSE value is significantly reduced using the ALS matrix factorization method, with an RMSE of 0.870. Consequently, the experiment shows that the ALS algorithm is suitable for training on explicit feedback data sets where users provide ratings for items (Table 5 and Fig. 5).

Table 5. RMSE comparison among ALS, SVD++ and SGD


Algorithm RMSE
ALS 0.870
SVD++ 0.977
SGD 0.919


Fig. 5. Graphical representation of RMSE comparison among ALS, SVD++ and SGD
8 Conclusion and Future Work

Apache Spark is well suited for applications that require high-speed querying, transformation, and analytics of data. Therefore, the recommendation system developed in this research is implemented on Apache Spark. Matrix factorization using the Alternating Least Squares (ALS) algorithm, a type of collaborative filtering, is used to address overfitting on sparse data and to increase prediction accuracy; the overfitting problem arises because the user-item rating matrix is sparse. In this research, a recommendation system for e-commerce using the alternating least squares (ALS) matrix factorization method on Apache Spark MLlib is developed. The research shows that the RMSE value is significantly reduced using the ALS matrix factorization method, with an RMSE of 0.870. Consequently, it is shown that the ALS algorithm is suitable for training on explicit feedback data sets where users provide ratings for items. The proposed recommendation system architecture follows the lambda architecture so that users can see the recommendations in real time. This will in turn bring enormous benefit to customers when purchasing goods.
In the future, the recommendation system will be integrated with content-based filtering, users' click-stream predictive analytics, deep collaborative filtering, and convolutional neural network based [21–24] recommendation. The integration might use the results produced by these different systems and correlate them to show recommendations to the online users. Thus, the future research direction will focus on hybrid models of recommendation systems [25–32].

References
1. Pine, B.J.: Mass Customization, vol. 17. Harvard Business School Press, Boston (1993)
2. Shahjee, R.: The impact of electronic commerce on business organization. Sch. Res.
J. Interdisc. Stud. 4(27), 3130–3140 (2016)
3. Verma, J.P., Patel, B., Patel, A.: Big data analysis: recommendation system with Hadoop
framework. In: 2015 IEEE International Conference on Computational Intelligence &
Communication Technology, pp. 92–97. IEEE, February 2015
4. Alzogbi, A., Koleva, P., Lausen, G.: Towards distributed multi-model learning on apache spark for model-based recommender. In: 2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW), pp. 193–200. IEEE, April 2019
5. Wei, J., He, J., Chen, K., Zhou, Y., Tang, Z.: Collaborative filtering and deep learning based
hybrid recommendation for cold start problem. In: 2016 IEEE 14th International Conference
on Dependable, Autonomic and Secure Computing, 14th International Conference on
Pervasive Intelligence and Computing, 2nd International Conference on Big Data
Intelligence and Computing and Cyber Science and Technology Congress
(DASC/PiCom/DataCom/CyberSciTech), pp. 874–877. IEEE, August 2016
6. Kupisz, B., Unold, O.: Collaborative filtering recommendation algorithm based on Hadoop
and spark. In: 2015 IEEE International Conference on Industrial Technology (ICIT). IEEE
(2015)
7. Chen, Y.C.: User behavior analysis and commodity recommendation for point-earning apps.
In: 2016 Conference on Technologies and Applications of Artificial Intelligence (TAAI).
IEEE (2016)
8. Zhou, Y.H., Wilkinson, D., Schreiber, R.: Large scale parallel collaborative filtering for the
Netflix prize. In: Proceedings of 4th International Conference on Algorithmic Aspects in
Information and Management, Shanghai, pp. 337–348. Springer (2008)
9. Ponnam, L.T., et al.: Movie recommender system using item based collaborative filtering
technique. In: International Conference on Emerging Trends in Engineering, Technology,
and Science (ICETETS). IEEE (2016)
10. Dev, A.V., Mohan A.: Recommendation system for big data applications based on set
similarity of user preferences. In: International Conference on Next Generation Intelligent
Systems (ICNGIS). IEEE (2016)
11. Zeng, X., et al.: Parallelization of latent group model for group recommendation algorithm.
In: IEEE International Conference on Data Science in Cyberspace (DSC). IEEE (2016)
12. Jooa, J., Bangb, S., Parka, G.: Implementation of a recommendation system using
association rules and collaborative filtering. Proc. Comput. Sci. 91, 944–952 (2016)
13. Panigrahi, S., Lenka, R.K., Stitipragyan, A.: A hybrid distributed collaborative filtering
recommender engine using apache spark. In: ANT/SEIT, pp. 1000–1006, January 2016
14. Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems.
Computer 42(8), 30–37 (2009)
15. Meng, X., Bradley, J., Yavuz, B., Sparks, E., Venkataraman, S., Liu, D., Freeman, J., Tsai,
D.B., Amde, M., Owen, S., Xin, D.: Mllib: machine learning in apache spark. J. Mach.
Learn. Res. 17(1), 1235–1241 (2016)
16. Li, H., Wang, Y., Zhang, D., Zhang, M., Chang, E.Y.: Pfp: parallel fp-growth for query
recommendation. In: Proceedings of the 2008 ACM Conference on Recommender Systems,
pp. 107–114, October 2008
17. Shanahan, J.G., Dai, L.: Large scale distributed data science using apache spark. In:
Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, pp. 2323–2324, August 2015
18. Sharma, S.: Dynamic hashtag interactions and recommendations: an implementation using
apache spark streaming and graphX. In: Data Management, Analytics and Innovation,
pp. 723–738. Springer, Singapore (2020)
19. Armbrust, M., Xin, R.S., Lian, C., Huai, Y., Liu, D., Bradley, J.K., Meng, X., Kaftan, T.,
Franklin, M.J., Ghodsi, A., Zaharia, M.M.: Spark sql: relational data processing in spark. In:
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data,
pp. 1383–1394, May 2015
20. Harper, F.M., Konstan, J.A.: The MovieLens datasets: history and context. ACM Trans. Interact. Intell. Syst. (TiiS) 5(4), 1–19 (2015)
21. Chowdhury, R.R., Hossain, M.S., ul Islam, R., Andersson, K., Hossain, S.: Bangla
handwritten character recognition using convolutional neural network with data augmen-
tation. In: 2019 Joint 8th International Conference on Informatics, Electronics & Vision
(ICIEV) and 2019 3rd International Conference on Imaging, Vision & Pattern Recognition
(icIVPR), pp. 318–323. IEEE, May 2019
22. Ahmed, T.U., Hossain, M.S., Alam, M.J., Andersson, K.: An integrated CNN-RNN
framework to assess road crack. In: 2019 22nd International Conference on Computer and
Information Technology (ICCIT), pp. 1–6. IEEE, December 2019
23. Ahmed, T.U., Hossain, S., Hossain, M.S., ul Islam, R., Andersson, K.: Facial expression
recognition using convolutional neural network with data augmentation. In: 2019 Joint 8th
International Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd
International Conference on Imaging, Vision & Pattern Recognition (icIVPR), pp. 336–341.
IEEE, May 2019
24. Islam, M.Z., Hossain, M.S., ul Islam, R., Andersson, K.: Static hand gesture recognition
using convolutional neural network with data augmentation, May 2019
25. Sinitsyn, S., et al.: The concept of information modeling in interactive intelligence systems.
In: International Conference on Intelligent Computing & Optimization. Springer, Cham
(2019)
26. Alhendawi, K.M., Al-Janabi, A.A.: An intelligent expert system for management
information system failure diagnosis. In: International Conference on Intelligent Computing
& Optimization. Springer, Cham (2018)
27. Biswas, M., Chowdhury, S.U., Nahar, N., Hossain, M.S., Andersson, K.: A belief rule base
expert system for staging non-small cell lung cancer under uncertainty. In: 2019 IEEE
International Conference on Biomedical Engineering, Computer and Information Technol-
ogy for Health (BECITHCON), pp. 47–52. IEEE, November 2019
28. Kabir, S., Islam, R.U., Hossain, M.S., Andersson, K.: An integrated approach of belief rule
base and deep learning to predict air pollution. Sensors 20(7), 1956 (2020)
29. Monrat, A.A., Islam, R.U., Hossain, M.S., Andersson, K.: A belief rule based flood risk
assessment expert system using real time sensor data streaming. In: 2018 IEEE 43rd
Conference on Local Computer Networks Workshops (LCN Workshops), pp. 38–45. IEEE,
October 2018
30. Karim, R., Hossain, M.S., Khalid, M.S., Mustafa, R., Bhuiyan, T.A.: A belief rule-based
expert system to assess bronchiolitis suspicion from signs and symptoms under uncertainty.
In: Proceedings of SAI Intelligent Systems Conference, pp. 331–343. Springer, Cham,
September 2016
31. Hossain, M.S., Monrat, A.A., Hasan, M., Karim, R., Bhuiyan, T. A., Khalid, M.S.: A belief
rule-based expert system to assess mental disorder under uncertainty. In: 2016 5th
International Conference on Informatics, Electronics and Vision (ICIEV), pp. 1089–1094.
IEEE, May 2016
32. Hossain, M.S., Habib, I.B., Andersson, K.: A belief rule based expert system to diagnose
dengue fever under uncertainty. In: 2017 Computing Conference, pp. 179–186. IEEE, July 2017
An Interactive Computer System with Gesture-
Based Mouse and Keyboard

Dipankar Gupta1, Emam Hossain2, Mohammed Sazzad Hossain3(&), Mohammad Shahadat Hossain2, and Karl Andersson4
1 Department of Computer Science and Engineering, Port City International University, Chattogram, Bangladesh
dipucpi50@gmail.com
2 Department of Computer Science and Engineering, University of Chittagong, Chattogram, Bangladesh
ehfahad01@gmail.com, sazzad.hossain@ulab.edu.bd
3 Department of Computer Science and Engineering, University of Liberal Arts Bangladesh, Dhaka, Bangladesh
hossain_ms@cu.ac.bd
4 Department of Computer Science, Electrical and Space Engineering, Lulea University of Technology, Skelleftea, Sweden
karl.andersson@ltu.se

Abstract. Researchers around the world are now focused on making our devices more interactive and operational with minimal physical contact. In this research, we propose an interactive computer system which can operate without any physical keyboard and mouse. This system can be beneficial to everyone, especially to paralyzed people who face difficulties operating a physical keyboard and mouse. We used computer vision so that the user can type on a virtual keyboard using a yellow-colored cap on his fingertip and can also navigate to the mouse controlling system. Once in mouse controlling mode, the user can perform all mouse operations only by showing different numbers of fingers. We validated both modules of our system with a 52-year-old paralyzed person and achieved around 80% accuracy on average.

Keywords: Human computer interaction · Color detection · Hand gestures · Virtual keyboard · Virtual mouse

1 Introduction

The use of computers has become an integral part of our daily life, and human computer interactions are becoming more convenient every day. While the majority of people take these facilities for granted, people with physical impairments face many difficulties in properly using these devices. In particular, people with severe movement disabilities may have physical impairments which significantly limit their fine motor control. Therefore, they may not be able to type and communicate with a normal keyboard and mouse. In this situation, it is important to use effective assistive technologies to ensure accessibility for such people.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 894–906, 2021.
https://doi.org/10.1007/978-3-030-68154-8_76
A wide range of eye-tracking devices are currently available commercially, offering many functionalities, accuracy levels, and price ranges. Many research studies require eye-tracking devices of high precision to test a range of eye characteristics, but such devices, for example infrared trackers, are expensive [14].
In this work, we propose a novel multi-modal interactive keyboard and mouse system in which we detect and track a color (yellow in this research) to replace the use of a traditional keyboard and mouse, using the device's camera. This is achieved by taking inputs from a camera and applying vision-based color recognition and hand gesture recognition techniques, without any additional hardware requirements.
Our system allows the user to operate the computer's keyboard and mouse using only one hand bearing a yellow color cap on the fingertip. The main objective of this research is to build an interactive keyboard and mouse system so that motion-impaired people can communicate with the computer through its webcam using one hand only. To support this aim, the secondary objectives are:

– to detect a yellow-colored cap,
– to recognize the key on which the cap is placed,
– to track the movement of the colored cap for mouse movement, and
– to detect the number of fingers shown to determine a left-button or right-button click of the mouse.
The rest of the article is organized as follows: Sect. 2 presents a few related works on virtual keyboard and virtual mouse systems, Sect. 3 illustrates the methodology, Sect. 4 discusses the results of our study, and finally, Sect. 5 concludes our work and discusses future work.

2 Literature Review

There are traditional approaches to virtual keyboard and mouse systems which are usually based on eye gestures. Our literature review focuses on research works on virtual keyboards and virtual mice published in Elsevier, Springer, the ACM Digital Library, the IEEE Digital Library, etc. We discuss a few related works on the virtual keyboard and virtual mouse in the following two subsections.

2.1 Virtual Keyboard


In 2010, Y. Adajania et al. developed a virtual keyboard using shadow analysis [2]. This system detects the keyboard, hand shadows, and fingertips using colour segmentation and the Sobel technique. Ambient lighting conditions are required for this system, which can analyze 3 frames per second.
In 2011, S. Hernanto et al. built a webcam-based virtual keyboard [10]. In this approach, two functions are used for finger detection and location detection, using two different webcams to detect skin and location separately. The average time per character of this virtual keyboard is 2.92 ms, and the average accuracy of this system is 88.61%.
In 2013, M. H. Yousaf et al. introduced a keystroke detection and recognition model using fingertip tracking [25]. They captured real-time movements of the finger joints and successfully recognised 28 keys.
In 2015, I. Patil et al. constructed a virtual keyboard interaction system using eye gaze and eye blinking [16]. Their system first detects the face and then detects the eye and nose regions to recognize an eye blink. The OpenCV Java framework is used in this approach. With a 160 × 120 frame size, this approach achieves 48% accuracy, and with a 1280 × 960 frame size, 98% accuracy is achieved.
In 2016, Hubert Cecotti developed a system for disabled people, a multimodal gaze-controlled virtual keyboard [6]. The virtual keyboard has 8 main commands for menu selection to spell 30 different characters and a delete button to recover from errors. They evaluated the performance of the system using the speed and information transfer rate at both the command and application levels.
V. Saraswati et al. introduced a system for disabled people entitled Eye Gaze System to Operate Virtual Keyboard [18]. It first captures the user's face and gets the position of the eye gaze, which is used as a reference point in the later stages. The Haar Cascade method was used to extract facial features, and the Integral Projection method was used to get the position of the eye movement. Based on their experiment, the ratio between the duration of normal writing and the duration of typing two words using their system is 1:13.
In 2017, S. Bhuvana et al. constructed a virtual keyboard interaction system using a webcam [5]. This system can detect the hand position over the virtual keyboard; it provides a white-paper virtual keyboard image and detects which character is pointed at. This approach used built-in functions of the Image Processing Toolbox in MATLAB.
In 2018, M. Jagannathan et al. presented a finger recognition and gesture based augmented keyboard system [13]. The system was developed using OpenCV libraries and Python. Palm detection is used for typing on the augmented keyboard, and the virtual keyboard operates based on the movement of the finger.

2.2 Virtual Mouse


In 2016, S. Shetty et al. constructed a virtual mouse system using color detection [19]. They used a webcam for detecting mouse cursor movement and click events using OpenCV built-in functions; a mouse driver, written in Java, is required as well. This system fails to perform well against rough backgrounds.
P. C. Shindhe et al. developed a method for mouse-free cursor control where mouse cursor operations are controlled using hand fingers [21]. They collected hand gestures via webcam using color detection principles. Built-in functions of the Image Processing Toolbox in MATLAB and a mouse driver, written in Java, were used in this approach. The pointer was not very efficient in the air, as the cursor was very sensitive to motion.
G. Sahu et al. built a system for controlling the mouse pointer using a webcam [17], which can control the volume of a media player and PowerPoint slides and can make or end a call. They used RGB color tapes to recognize the user's finger.
In 2019, K. H. Shibly et al. presented the design and development of a hand gesture based virtual mouse [20]. They captured different gestures via webcam and performed mouse functions according to the gestures. This system achieved 78%–90% accuracy. The system does not work efficiently in complex or rough backgrounds.
As we can see from the reviewed literature, previous systems include either a virtual keyboard or a virtual mouse. Those systems cannot fully eliminate the need for a physical mouse and keyboard. This work aims to build an interactive computer system which can be operated without any physical mouse and keyboard.

3 Methodology

3.1 Problem Description


The aim of this paper is to implement a computer application which uses alternative methods to control the keyboard and mouse cursor for the rehabilitation of people who have suffered a stroke, so that they can recover from its side effects. Therefore, we propose a new keyboard and mouse cursor control system based on vision and color recognition techniques, utilizing hand gestures recorded by a webcam.

Fig. 1. Overview of proposed interactive computer system

Figure 1 shows the overview of the interactive keyboard and mouse controlling process. This work aims at creating a system that recognizes colors and hand gestures and controls the computer's keyboard and mouse according to those gestures using a color detection technique.
Our system uses the computer's webcam and displays an on-screen keyboard layout. Users are able to type on the keyboard using a yellow color cap on the fingertip. The user can also turn on the mouse controlling system by pressing the Mouse Control Module button using that yellow color cap. After that, another live video frame is shown for tracking the hand movements to recognize mouse functions. Figure 2 represents the system architecture of the virtual communication system.
Fig. 2. Procedure of gesture-based mouse and keyboard

3.2 Virtual Keyboard


We used the following procedure to type on the virtual keyboard using a fingertip (a minimal sketch follows the steps):

Step 1: Capture real-time video using the computer's webcam.
Step 2: Process each individual image frame from the captured video.
Step 3: Convert the image frames into HSV format.
Step 4: Create a filter which can create the mask for the yellow color.
Step 5: Draw contours from the mask. We loop through all the contours and put a rectangle over each one for object tracking.
Step 6: Find the position of the yellow object over the virtual keyboard.
Step 7: Print the character pointed to by the yellow-colored cap.
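The sketch below illustrates Steps 1–5 in Python with OpenCV (assuming OpenCV 4; the HSV bounds for yellow and the key-lookup step are illustrative assumptions rather than the authors' exact values):

import cv2
import numpy as np

cap = cv2.VideoCapture(0)  # Step 1: real-time video from the webcam

while True:
    ok, frame = cap.read()  # Step 2: process one frame at a time
    if not ok:
        break

    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)  # Step 3: HSV conversion

    # Step 4: mask for yellow; these bounds are illustrative and need tuning
    lower, upper = np.array([20, 100, 100]), np.array([35, 255, 255])
    mask = cv2.inRange(hsv, lower, upper)

    # Step 5: contours of the cap; box the largest one for object tracking
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        c = max(contours, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(c)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        # Steps 6-7 would map (x, y) onto a key of the on-screen layout

    cv2.imshow("virtual keyboard", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()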

Fig. 3. Virtual keyboard: typing using virtual keyboard

Figure 3 displays a live demonstration of typing 'j' using a fingertip, and Fig. 4 shows how to navigate into the mouse controlling system.

3.3 Virtual Mouse


We used an infinite loop to capture the frames at each instance from the web camera, which is available throughout the program. We capture the live feed frame by frame and then convert the RGB images to grayscale images. We create a mask which recognizes the hand's shape and then count the number of fingers in the shape. We have used the law of cosines, as expressed in Eq. (1), to find the angles in the shape of the hand.

c^2 = a^2 + b^2 - 2ab cos(C)   (1)

The mask selects specific regions of the image according to certain rules, and we draw contours from the mask. For object tracking, we loop through all the contours. The convex hull of a set X of points in any space is defined as the smallest convex set that contains X; equivalently, the convex hull of a finite point set S is the set of all convex combinations of its points. Any deviation of the object from its convex hull can be considered a convexity defect.
Fig. 4. Virtual keyboard: press on mouse control module

To find the contours in the image, we have used the cvFindContours() function of OpenCV, which uses an order-finding method to detect edges. We are interested in extracting the hand contour in the contour extraction process so that shape analysis can be done to determine hand gestures. The hand contour convexity defects were measured using OpenCV's cvConvexityDefects() function; convexity defects are identified when there is any deviation of the object from its convex hull [9]. After the convexity defects are acquired, two major tasks are considered to determine the mouse control functions:

– identifying the fingertip, and
– counting the number of fingers from the number of convexity defects.

We convert the detected coordinates from the camera resolution to the actual resolution of the screen. The mouse controlling module performs in the following manner (a minimal sketch follows the list):

– if it detects two fingers, it moves the mouse cursor in the four directions (left, right, up, and down) according to the movement of the fingers, and
– if it detects four fingers or five fingers, then right-button click and left-button click actions are performed, respectively.
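A condensed sketch of the convexity-defect finger counting described above (assuming OpenCV 4 and a plain binary threshold in place of the actual hand mask; the threshold value and the printed action mapping are illustrative):

import math
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)  # crude hand mask

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        hand = max(contours, key=cv2.contourArea)
        hull = cv2.convexHull(hand, returnPoints=False)
        defects = cv2.convexityDefects(hand, hull)

        fingers = 1  # n valleys between fingers imply n + 1 fingers
        if defects is not None:
            for i in range(defects.shape[0]):
                s, e, f, _ = defects[i, 0]
                start, end, far = hand[s][0], hand[e][0], hand[f][0]
                a = np.linalg.norm(end - start)
                b = np.linalg.norm(far - start)
                c = np.linalg.norm(far - end)
                if b * c == 0:
                    continue
                # Law of cosines (Eq. 1): angle at the defect point
                angle = math.acos(np.clip((b**2 + c**2 - a**2) / (2 * b * c), -1, 1))
                if angle < math.pi / 2:  # a valley between two fingers
                    fingers += 1
        # 2 fingers -> move cursor; 4 -> right click; 5 -> left click
        print("fingers:", fingers)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()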

Fig. 5. Virtual mouse: mouse cursor movement


Fig. 6. Virtual mouse: left button click

Fig. 7. Virtual mouse: right button click

Figures 5, 6 and 7 demonstrate mouse cursor movement, left button click and right
button click operations, respectively.

4 Results and Discussions


4.1 Virtual Keyboard
We considered a stroke patient for our testing who has lost control of his left side. After doing some exercises, he was able to use our system and performed the keyboard and mouse operations five times. We performed our experiment in a normally lighted room.
The summary of our experiment's parameters is given below:

– Considered text: A Brown Fox Jumps Over The Lazy Crocodile 1 2 3 4 5 6 7 8 9 0.
– Number of characters (without spaces): 44.
– Number of tests: 5.
– Tested by: a 52-year-old stroke patient who has very little control of his left side.
Figure 8 shows the number of times each word and digit is correctly recognized by
the system.

Fig. 8. Experimental result of virtual keyboard

4.2 Virtual Mouse


The Virtual Mouse module in our system performs mouse functions by recognizing the hand and counting the number of fingers. It can perform six different functions: left click, right click, left movement, right movement, up movement, and down movement. We considered the same lighting and room conditions used in the virtual keyboard experiment. The distance between the camera and the object is at most 10 m, and the objects are set in a fixed environment [24].
The summary of the virtual mouse experiment is given below:

– Mouse functions: 6.
– Number of tests for each function: 5.
– Total number of tests: 30.
– Tested by: a 52-year-old stroke patient who has very little control of his left side.
Figure 9 shows the number of times each of the six mouse functions works
accurately.
Fig. 9. Experimental result of virtual mouse

Figure 10 shows the confusion matrix for the operations of the virtual keyboard and mouse. We performed each of our 24 tasks (eight words in the sentence, ten digits, and six mouse functions) five times. Our system successfully recognized 95 out of 120 operations.
In order to evaluate system performance, the accuracy of the system was measured using Eq. (2) [23]:

Accuracy = (DF / TF) × 100%   (2)

where DF is the number of successfully recognized operations and TF is the total number of operations. The accuracy of our system using Eq. (2) is 79.17%.

Fig. 10. Confusion matrix of keyboard and mouse operations
Since the system uses webcam-captured videos, its performance may depend on illumination. Additionally, if other colored objects are present in the background, the system may produce an incorrect response. Although this issue can be minimized by configuring the threshold values and other device parameters, it is still advisable that the operating background be light and free of bright-colored artifacts.
Additionally, on some low-powered computers, the system could run slowly because it performs a large number of complex calculations in a very short time. However, for optimal system performance, a regular computer or laptop has the computational power needed. Another aspect is that the system will run slowly if the camera's resolution is too high; this problem can be solved by reducing the image resolution.

5 Conclusion and Future Work

The keyboard and mouse form an integral part of the computer system. Our system architecture can facilitate the use of computers by paralyzed people. We have developed a virtual system through which people can communicate with the computer without using any physical keyboard and mouse. This could lead to a new age of human computer interaction in which physical contact with the computer would not be necessary at all. The use of object detection and image processing in OpenCV for the implementation of our work has proved practically successful, and the task of the keyboard and mouse is achieved with good precision. This system can be beneficial to people who have no control over their limbs.
Most comparable applications require additional hardware which is often very expensive. The motive of this work is to create this technology as cheaply as possible and under a standardized operating system. Though our system can be used as an alternative to a physical keyboard and mouse, it may still perform less accurately in low-light conditions; this is a concern for further research. Moreover, the work can be extended to a wide variety of environments and can be tested using sophisticated existing models [1, 3, 4, 7, 8, 11, 12, 15, 22].

References
1. Abedin, M.Z., Nath, A.C., Dhar, P., Deb, K., Hossain, M.S.: License plate recognition
system based on contour properties and deep learning model. In: 2017 IEEE Region 10
Humanitarian Technology Conference (R10-HTC), pp. 590–593. IEEE (2017)
2. Adajania, Y., Gosalia, J., Kanade, A., Mehta, H., Shekokar, N.: Virtual keyboard using
shadow analysis. In: 2010 3rd International Conference on Emerging Trends in Engineering
and Technology, pp. 163–165. IEEE (2010)
3. Ahmed, T.U., Hossain, S., Hossain, M.S., ul Islam, R., Andersson, K.: Facial expression
recognition using convolutional neural network with data augmentation. In: 2019 Joint 8th
International Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd
International Conference on Imaging, Vision & Pattern Recognition (icIVPR), pp. 336–341.
IEEE (2019)
4. Asad, M.U., Mustafa, R., Hossain, M.S.: An efficient strategy for face clustering use in video
surveillance system. In: 2019 Joint 8th International Conference on Informatics, Electronics
& Vision (ICIEV) and 2019 3rd International Conference on Imaging, Vision & Pattern
Recognition (icIVPR), pp. 12–17. IEEE (2019)
5. Bhuvana, S., Ashwin, E., Boopathi, R., Victor, A.D.: Virtual keyboard interaction with
system based on webcam (2017)
6. Cecotti, H.: A multimodal gaze-controlled virtual keyboard. IEEE Trans. Hum.-Mach. Syst.
46(4), 601–606 (2016)
7. Chowdhury, R.R., Hossain, M.S., ul Islam, R., Andersson, K., Hossain, S.: Bangla
handwritten character recognition using convolutional neural network with data augmen-
tation. In: 2019 Joint 8th International Conference on Informatics, Electronics & Vision
(ICIEV) and 2019 3rd International Conference on Imaging, Vision & Pattern Recognition
(icIVPR), pp. 318–323. IEEE (2019)
8. Gupta, D., Hossain, E., Hossain, M.S., Andersson, K., Hossain, S.: A digital personal
assistant using bangla voice command recognition and face detection. In: 2019 IEEE
International Conference on Robotics, Automation, Artificial-intelligence and Internet-of-
Things (RAAICON), pp. 116–121. IEEE (2019)
9. Haria, A., Subramanian, A., Asokkumar, N., Poddar, S., Nayak, J.S.: Hand gesture
recognition for human computer interaction. Procedia Comput. Sci. 115, 367–374 (2017)
10. Hernanto, S., Suwardi, I.S.: Webcam virtual keyboard. In: Proceedings of the 2011
International Conference on Electrical Engineering and Informatics, pp. 1–5. IEEE (2011)
11. Islam, M.Z., Hossain, M.S., ul Islam, R., Andersson, K.: Static hand gesture recognition
using Convolutional neural network with data augmentation. In: 2019 Joint 8th International
Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd International
Conference on Imaging, Vision & Pattern Recognition (icIVPR), pp. 324–329. IEEE (2019)
12. Islam, R.U., Hossain, M.S., Andersson, K.: A novel anomaly detection algorithm for sensor
data under uncertainty. Soft. Comput. 22(5), 1623–1639 (2018)
13. Jagannathan, M., Surya, M., BT, A.M., Poovaraghavan, R.: Finger recognition and gesture
based augmented keyboard (2018)
14. Keil, A., Albuquerque, G., Berger, K., Magnor, M.A.: Real-time gaze tracking with a
consumer-grade video camera (2010)
15. Noor, K., et al.: Performance analysis of a surveillance system to detect and track vehicles
using haar cascaded classifiers and optical flow method. In: 2017 12th IEEE Conference on
Industrial Electronics and Applications (ICIEA), pp. 258–263. IEEE (2017)
16. Patil, I.D., Lambhate, P.: Virtual keyboard interaction using eye gaze and eye blink. Int.
J. Recent Innov. Trends Comput. Commun. (IJRITCC) 3(7), 4849–4852 (2015)
17. Sahu, G., Mittal, S.: Controlling mouse pointer using web cam (2016)
18. Saraswati, V.I., Sigit, R., Harsono, T.: Eye gaze system to operate virtual keyboard. In: 2016
International Electronics Symposium (IES), pp. 175–179. IEEE (2016)
19. Shetty, S., Yadav, S., Upadhyay, R., Bodade, V.: Virtual mouse using colour detection
(2016)
20. Shibly, K.H., Dey, S.K., Islam, M.A., Showrav, S.I.: Design and development of hand
gesture based virtual mouse. In: 2019 1st International Conference on Advances in Science,
Engineering and Robotics Technology (ICASERT), pp. 1–5. IEEE (2019)
21. Shindhe, P.C., Goud, S.: Mouse free cursor control. Bonfring Int. J. Res. Commun. Eng.
(Spec. Issue Recent Advancements Electron. Commun. Eng. Eds. Dr. G.A. Bidkar, Dr.
C. Vijaya and Dr. S.B. Kulkarni) 6, 92–98 (2016)
22. Uddin Ahmed, T., Jamil, M.N., Hossain, M.S., Andersson, K., Hossain, M.S.: An integrated
real-time deep learning and belief rule base intelligent system to assess facial expression
under uncertainty. In: 9th International Conference on Informatics, Electronics & Vision
(ICIEV). IEEE Computer Society (2020)
23. Vasant, P., Zelinka, I., Weber, G.W.: Intelligent computing and optimization. In: Conference
Proceedings on Intelligent Computing and Optimization, vol. 866. Springer, Heidelberg
(2018). https://doi.org/10.1007/978-3-030-00979-3
24. Vasant, P., Zelinka, I., Weber, G.W.: Intelligent computing and optimization. In:
Proceedings of the 2nd International Conference on Intelligent Computing and Optimization,
vol. 1072. Springer, Heidelberg (2019). https://doi.org/10.1007/978-3-030-33585-4
25. Yousaf, M.H., Habib, H.A.: Virtual keyboard: real-time finger joints tracking for keystroke
detection and recognition. Arab. J. Sci. Eng. 39(2), 923–934 (2014)
Surface Water Quality Assessment
and Determination of Drinking Water Quality
Index by Adopting Multi Criteria Decision
Analysis Techniques

Deepjyoti Deb(&), Mrinmoy Majumder, Tilottama Chakraborty, Prachi D. Khobragade, and Khakachang Tripura

Hydro Informatics Engineering, Civil Engineering Department, National Institute of Technology, Agartala, Tripura, India
ddeb210@gmail.com, mmajumder15@gmail.com

Abstract. Drinking water quality needs to be maintained for a healthy human body. This study focuses on delineating and expressing the water quality level in terms of the Water Quality Index, which involves implementing Multi-Criteria Decision Analysis techniques such as the Analytical Hierarchy Process and the Analytic Network Process. For this purpose, surface water samples are collected from Jirania, Tripura, and the quality of the intake tap water and the treated purified water is compared for each of the sampling stations. The study's objective is to determine a new process for evaluating the water quality index with the Multi-Criteria Decision Analysis approach. The present investigation estimates the Water Quality Index with the Analytical Hierarchy Process and validates it by comparison with the Analytic Network Process, incorporating novel criteria such as Cost, Potability, and Taste. Dissolved Oxygen has emerged as the most influential parameter in determining water quality. Sensitivity analysis is also performed to validate the Multi-Criteria Decision Analysis approach in index evaluation. The present study uses a comparative approach in assigning weighted values to the water quality parameters; the priorities given to the various criteria in the Multi-Criteria Decision Analysis processes might not be consistent for all sections of users, and as a result, the weighted values might show slight changes during computation of the Water Quality Index for different water samples across the world. The accuracy of the Water Quality Index evaluation shall improve if a fixed set of criteria and their preferences are maintained and justified, based on which the water quality parameters are ranked and their relative weights assessed accordingly.

Keywords: Water quality · Water quality index · Analytical hierarchy process · Analytic network process

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 907–921, 2021.
https://doi.org/10.1007/978-3-030-68154-8_77
1 Introduction

Environmental surveillance chiefly depends on the evaluation of water quality. This is because low-quality water affects aquatic life and the surrounding ecological community and animals associated with the water bodies. Quality drinking water is crucial to our fitness and well-being. The World Health Organization (WHO) has recommended permissible limits for various physiochemical and biological water quality parameters such as Total Dissolved Solids (TDS), Total Suspended Solids (TSS), Hardness, Dissolved Oxygen (DO), Biological Oxygen Demand (BOD), Total Plate Count (TPC), pH, etc., based on which the grade of water is tested and analyzed. The water is said to be fit for consumption only when the values of the different parameters for the water samples stay within the water quality standards.
Water quality is considered a measure of water's aptness for a specific purpose with respect to selected physicochemical and biological characteristics. Vijaya Kumar et al. (2020), in their study, evaluate the groundwater quality of villages affected by industrial development in Ranebennur taluk, Karnataka, India, in which it was found that most of the groundwater samples (collected from bore wells) are appropriate for consumption and irrigation purposes, although there is a threat of decline in the groundwater level.
The Water Quality Index (WQI) is a numerical term that provides a clear idea of the concerned water quality by consolidating the composite influence of several parameters on the overall quality of water (Chaurasia et al. 2018). Horton (1965) developed a water quality index by allocating proper arithmetic weights to the water quality parameters according to their interpretation. Lkr et al. (2020) evaluate the water quality status of the Doyang River, Nagaland, India, using the Weighted Arithmetic Water Quality Index (WAWQI), for which water from eight sampling stations was collected. It was observed that sampling stations situated upstream experience a low WQI rating owing to the presence of the hydroelectric dam, extensive deforestation in the catchment, changeable land-use practices, and advancing establishments. Table 1 depicts the range of the Water Quality Index for various uses of water, as described in the weighted arithmetic WQI method. Moreover, the rating of water quality corresponding to the WQI range is also given for a clear understanding of the purity of water (Tokatli 2019).

Table 1. Water quality index (WQI) range, status, and possible usage of water sample (Brown et al. 1972)

WQI value  Rating of water quality          Probable use
0–25       Excellent water quality          Drinking, irrigation, and industrial purpose
25–50      Good water quality               Drinking, irrigation, and industrial purpose
50–75      Poor water quality               Irrigation and industrial purpose
75–100     Very poor water quality          For irrigation purpose
Above 100  Unsuitable for drinking purpose  Proper treatment required for any kind of usage
The Analytical Hierarchy Process (AHP) is a multi-criteria decision-making method in which a stratified framework is developed. It is an approach to rank choices or alternatives according to the decision maker's reasoning and perception concerning the importance of the criteria and the extent to which they are met by each alternative (Vaidya et al. 2006). Zhang et al. (2020) study the application of a combined fuzzy comprehensive evaluation and analytic hierarchy process in the risk assessment of large-scale seawater desalination projects, in which two levels of risk indicators are recognized. The analyzed reports suggest that the overall risks of all the concerned projects are at the "Very low" level.
The Analytic Network Process (ANP) is a decision-making technique that generalizes AHP. This technique forces detailed definitions of nodes and interconnections, which require thorough thinking about the problem. As a result, the ANP technique is widely used in decision-making problems comprising multiple criteria (Tuzkaya et al. 2008). Mokarram et al. (2019) evaluate the groundwater quality in northern Fars Province, Iran, with the use of Geographic Information System (GIS) based ANP and AHP, in which fuzzy charts are created for each layer using a trapezoidal membership function; the outcomes indicate that ANP generates higher precision than fuzzy-AHP. Moreover, the findings suggest that the Calcium, Chlorine, and Sodium content, along with high electrical conductivity, have adversely affected the groundwater quality conditions of northern Fars Province.

1.1 Motivation
The Water Quality Index is a very effective way to describe the quality of water. Several standard methods have been prescribed for computing the WQI value, but every technique has certain advantages and disadvantages. Moreover, most of the methods use an absolute approach in assigning weightage to the water quality parameters. One such popular process is the WAWQI approach, which is widely used across the world.
Our present investigation has tried to enhance the accuracy of the method mentioned above by bringing some crucial modifications to its formula, which are broadly discussed in Sect. 3. Moreover, the study has emphasized a relative approach in assigning weightage to the various parameters with the help of Multi-Criteria Decision Analysis (MCDA) techniques. This approach also identifies the most significant parameter in determining water quality based on the different criteria set by users across the world.

1.2 Objectives of the Study


The objective of the present study is to determine a new procedure for the estimation of the water quality index with the help of the AHP multi-criteria decision-making method and to compare it with an index based on the ANP multi-criteria decision-making method. The study also assesses the quality of drinking water collected from thirteen different sampling sites of Jirania, Agartala, India, incorporating the efficiency of the installed water purifiers in the concerned areas. The MCDA based WQI score is compared with the conventional WAWQI method to check the new procedure's effectiveness. Finally, the study validates the efficiency of the MCDA based WQI approach in assigning priority weightage to the water quality parameters through a sensitivity analysis of the WQI.

2 Study Area

Jirania is a small town in India’s Tripura state situated on the banks of river Saidra with
longitude and latitude of 23.8132°N, 91.4362°E. Thirteen sampling stations (namely,
S1, S2… S13) are selected within Jirania, where two different water samples (one
sample consisting of tap water and the other comprising of treated drinking water) are
collected and tested from each of the locations. The tap water sample is termed as
‘INLET,’ and the purified water sample collected from the installed water cooler is
termed ‘OUTLET’. Figure 1 shows the Study area of Jirania which is obtained from
Google Earth Pro software.

Fig. 1. Location of the study area


3 Methodology

1. Collection of two water samples from each of the sampling stations of Jirania, Tripura: one sample is tap water (untreated) and the other is purified drinking water.
2. Performing different water quality tests on these water samples based on seven parameters.
3. Comparing the results of the tap water and purified water for each sampling station to notice the change in the water quality.
4. Determining the relative weighted significance values of the seven chosen water quality parameters with the help of AHP-ANP MCDA techniques.
5. Index calculation using the MCDA based WQI method.
6. Comparison of the MCDA based WQI with the absolute conventional WAWQI method.
7. Performing sensitivity analysis of the WQI to validate the most influential water quality parameter in the index calculation.

Fig. 2. Detailed methodology of the study

The first task in this study was to carry out tests on the collected water samples to assess the water quality. Different instruments and procedures are available to analyze water samples. To test various water quality parameters, multi-parameter water quality checker instruments like HORIBA are of great use, as the values of different parameters can be obtained easily and quickly. HORIBA has been used to determine parameters like TDS, TSS, and pH for our study purpose. The hardness of the various water samples was determined by the EDTA titration process. The total bacterial count was done using the spread plate technique, and the DO of all the collected samples was measured using a DO meter.
The next aim of this study is to estimate the water quality index of the collected water samples. The Weighted Arithmetic WQI method was selected for this purpose. In this study, two modifications to this standard formula are made:
(i) The relative weights of the water quality parameters are determined using MCDA techniques such as the Analytical Hierarchy Process or the Analytic Network Process, which prioritize the impact of each parameter on the water quality.
(ii) The standard formula for Qi does not hold good for a few parameters such as pH and DO. Generally, for consumption purposes, pH should be in the range 6.5–8.5, and the DO concentration typically stays between 5–18 mg/L. If the pH and DO values go below these ranges, the water is not acceptable for intake in either case. If we go by the standard expression, even for lower values of pH and DO, the quality rating scale shows a negative value, which reduces the WQI value. Thus, we should consider only the absolute value of the quality rating to compute the correct value of the WQI, as shown in Eq. 2. The WQI of all thirteen water samples is found using the traditional WAWQI approach, and the weighted values are compared with our MCDA based method. Validation of the efficiency of the new process is done using sensitivity analysis.
The estimation of the WQI is done by applying the AHP and ANP methods. The MCDA techniques are used to assign weightage to and prioritize each of the seven water quality parameters to determine the Drinking Water Quality Index. The AHP based index is compared with the ANP based index to check the aptness of the unit weights. Table 2 depicts the list of all the criteria and alternatives that are used in the process.

Table 2. List of criteria and alternatives used in MCDA techniques


Alternatives TDS TSS Hardness DO BOD TPC pH
Criteria COST POTABILITY TASTE

The first phase in the AHP process is to rank the alternatives in the context of the specific criteria as per the study's goal and the decision maker's judgment. The second step is to determine the equivalent number of ranks (R) using the following equation:

R = 1 − [1 / (m + 1)]   (1)

where m = number of alternatives/criteria (e.g., for m = 7 alternatives, R = 1 − 1/8 = 0.875).


The next step is to select the order of preference for the criteria and compute the equivalent rank. After that, each criterion is compared with every other criterion corresponding to the objective in a pairwise comparison matrix. The most crucial part of AHP is to compare each alternative with every other alternative with respect to each of the selected criteria in the form of a pairwise comparison matrix. Finally, each criterion's weightage is multiplied with the relative weighted significance of the alternatives for that criterion, and the row-wise average is calculated. Normalizing the average gives the weights of importance of the alternatives.
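A small numeric sketch of the pairwise comparison and weight-extraction step (the 3 × 3 judgment matrix below is hypothetical, not the study's actual preferences):

import numpy as np

# Hypothetical pairwise comparison matrix for the criteria
# (Cost, Potability, Taste); entry [i, j] states how strongly
# criterion i is preferred over criterion j on the usual 1-9 scale.
A = np.array([
    [1.0, 1/2, 3.0],
    [2.0, 1.0, 4.0],
    [1/3, 1/4, 1.0],
])

# Classic AHP weight extraction: normalize each column, then average the rows
weights = (A / A.sum(axis=0)).mean(axis=1)
print(weights)  # relative weights of Cost, Potability, Taste; they sum to 1

# Consistency check (random index RI = 0.58 for a 3 x 3 matrix)
lam_max = (A @ weights / weights).mean()
CI = (lam_max - 3) / (3 - 1)
print("CR =", CI / 0.58)  # a CR below 0.1 indicates acceptable consistency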
In ANP, the six steps used in AHP for the computation of the weighted significance of the criteria are carried forward. Next, the criteria are compared with each other with respect to each of the alternatives; as a result, each criterion receives another weightage of significance. In the final step, the weighted relevance of the criteria is multiplied with the weighted significance of the alternatives (computed in the sixth step of AHP), and the row-wise average is calculated. Normalizing the average gives the updated relative weights of the alternatives.
The relative weighted significance of all the alternatives is computed using AHP and ANP separately, and the results are compared and analyzed accordingly. If the values obtained from both MCDA techniques are almost identical, that value is considered the alternative's final relative weight; if they show slightly different values, the ensemble (average) value is regarded as the ultimate relative weighted significance of the alternatives.
After computing the updated weighted values, the Water Quality Index can be found using the modified Weighted Arithmetic Water Quality Index method formula. The following steps are worked out to determine the WQI value.
First step - The relative weightage values of all the selected water quality parameters are found with the help of the MCDA techniques. The summation of all the weights is 1.
Second step - A quality rating scale is assigned for each parameter, as below:

Qi = |(Va − Vi) / (Vs − Vi)| × 100   (2)

where Va is the observed value of the parameter; Vi is the ideal value of the water quality parameter (in most cases Vi = 0, e.g., TDS, TSS, hardness, etc., but for pH, Vi = 7 and for Dissolved Oxygen, Vi = 14.6 mg/L); and Vs is the standard value as per WHO.
Third step - Computation of WQI:

WQI = Σ(i=1 to n) (Wi × Qi) / Σ Wi   (3)

where Wi is the relative unit weight obtained from the MCDA methods for the i-th parameter.
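A minimal sketch of Eqs. (2)–(3) in Python, using the ensemble weights of Table 4 and the WHO limits of Table 5 (the sample values are those of station S1's outlet, and the result of roughly 35.3 is consistent with Table 6):

# Ensemble weights from Table 4 and (Vs, Vi) pairs per parameter
weights = {"TDS": 0.100, "TSS": 0.164, "DO": 0.210, "BOD": 0.080,
           "Hardness": 0.152, "TPC": 0.162, "pH": 0.134}
standards = {"TDS": (500, 0), "TSS": (20, 0), "DO": (5, 14.6), "BOD": (5, 0),
             "Hardness": (300, 0), "TPC": (300, 0), "pH": (8.5, 7)}

def wqi(observed):
    total = 0.0
    for p, va in observed.items():
        vs, vi = standards[p]
        qi = abs((va - vi) / (vs - vi)) * 100   # Eq. (2), with the absolute value
        total += weights[p] * qi
    return total / sum(weights.values())        # Eq. (3); the weights sum to ~1

sample = {"TDS": 30, "TSS": 8, "DO": 5.6, "BOD": 0.2,
          "Hardness": 4, "TPC": 95, "pH": 6.68}  # S1 outlet, Table 5
print(round(wqi(sample), 2))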

4 Result and Discussion

The first and most crucial stage in computing AHP and ANP is to rank the water quality parameters (alternatives) with respect to the selected criteria. These rankings play a pivotal role in determining the most significant parameter and in taking further decisions. For example, considering the criterion 'Cost,' the alternative that involves the least cost to perform the water sample test is ranked 1, and the experimental test of the parameter that requires the maximum cost is given rank 7. Performing the hardness test requires significantly less cost than the other parameters, and consequently it is ranked first under the 'Cost' criterion. Similarly, dissolved oxygen contributes to the enhanced taste of drinking water, so this parameter is rated 1 with respect to the criterion 'Taste.' The presence of bacteria influences the potability of drinking water to a great extent; therefore, TPC is ranked 1 under the criterion 'Potability' (Table 3).

Table 3. Ranking the alternatives as per the criteria corresponding to the objective of the study.
Parameter Cost Potability Taste
TDS 4 7 4
TSS 2 3 6
DO 3 2 1
BOD 6 6 5
Hardness 1 4 7
TPC 7 1 3
pH 5 5 2

Table 4 compares the updated relative weights assigned to the alternatives derived from the Analytic Hierarchy Process and the Analytic Network Process, respectively. Furthermore, the table shows the ensemble weights for each of the water quality parameters, which are computed by taking the average of the weighted values from the applied MCDA techniques. Since there is not much difference between the weighted significance values produced by the two MCDA techniques, the ensemble weights are considered the final updated values for the seven water quality parameters.

Table 4. Relative weighing factor of the alternatives as per AHP and ANP
Water Weighted significance Weighted significance Ensemble
quality values as per the analytical values as per the analytic weighted
parameters hierarchy process network process significance
values
TDS 0.093 0.106 0.1
TSS 0.171 0.156 0.164
DO 0.207 0.213 0.21
BOD 0.078 0.082 0.08
Hardness 0.159 0.145 0.152
TPC 0.168 0.156 0.162
pH 0.126 0.142 0.134

From the ANP and AHP approaches, Dissolved Oxygen emerged as the most influential water quality parameter in deciding the water quality. It can be seen that TSS and TPC are also crucial water quality parameters, carrying higher weightages.
Table 5 shows the test results of the water samples collected from all the sampling
stations. Two water samples are collected from each location. The first one is tap water,
which is represented as ‘INLET.’ The second one is the purified water collected from
the water cooler/water purifier, designated as ‘OUTLET.’ The drinking water quality
standards for the selected seven water quality parameters prescribed by the World
Health Organization (Third Edition) are also shown at the end of the table.

Table 5. Water quality analysis of the samples taken from thirteen sampling stations
Station  TDS (mg/L)      TSS (mg/L)      Hardness (mg/L)  DO (mg/L)       BOD (mg/L)      TPC (colonies)  pH
         Inlet  Outlet   Inlet  Outlet   Inlet  Outlet    Inlet  Outlet   Inlet  Outlet   Inlet  Outlet   Inlet  Outlet
S1 32 30 41 8 10 4 5.4 5.6 0.8 0.2 248 95 5.22 6.68
S2 32 29 10 5 6 10 3.9 3.3 1.7 1.3 254 190 5.95 6.36
S3 53 53 39 5 17 22 4.1 4.6 1.1 0.9 320 270 6.53 6.55
S4 58 56 28 14 12 12 3.3 3.8 1.2 0.3 165 50 6.37 6.83
S5 34 45 28 26 7 17 4 6.3 0.4 2.4 152 450 5.43 5.57
S6 94 93 32 24 19 11 5.7 6.9 1.6 1 260 20 5.7 6.02
S7 94 27 32 10 19 10 5.7 6.6 1.6 1.1 260 195 5.7 6.95
S8 94 80 32 3 19 12 5.7 6.5 1.6 1.2 260 65 5.7 6.89
S9 38 37 27 14 12 11 6.4 7 1.2 0.9 224 185 5.81 6.44
S10 38 36 24 9 10 8 7.4 7.4 1.9 0.8 224 77 5.25 6.02
S11 67 31 21 14 20 6 5.2 6.5 1.4 1.2 48 30 5.85 6.91
S12 67 51 21 2 20 16 5.2 6.3 1.4 0.8 48 21 5.85 6.42
S13 67 50 21 8 20 16 5.2 5.9 1.4 0.7 48 38 5.85 7.41
WHO limits  500 (mg/L)  20 (mg/L)  300 (mg/L)  5 (mg/L)  5 (mg/L)  300 (colonies)  8.5

Table 6 depicts the water quality index values of all the tested water samples, obtained with the modified weighted arithmetic WQI method in which the MCDA techniques have been used. The level of water quality corresponding to each WQI value is also listed, to give a clear picture of the quality of the drinking water.
Figure 3 compares the inlet and outlet water samples of each sampling station through a line graph. It can be observed that, apart from the outlet/purified water sample of S5 (WQI value 82.152), all other outlet samples are found to be of good quality. The treated water of S8 and S12 is of excellent quality as per the WQI method. Except for S5, the WQI values of all the outlet samples are lower than those of their corresponding inlet samples.
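For readers who want to reproduce the index, a minimal sketch of the modified weighted arithmetic WQI is given below (Python), assuming the absolute-value quality rating described in the conclusion (Q_i = 100·|V_i − V_ideal| / |S_i − V_ideal|), the ensemble weights of Table 4, and the WHO standards of Table 5; the ideal values are common textbook choices, not taken from this paper.

# Minimal sketch of the modified weighted arithmetic WQI, assuming the
# absolute-value quality rating Q_i = 100 * |V_i - V_ideal| / |S_i - V_ideal|.
# Ideal values (0 for most parameters, 7 for pH, 14.6 mg/L for saturated DO)
# are common choices assumed here, not taken from this paper.
weights = {"TDS": 0.1, "TSS": 0.164, "DO": 0.21, "BOD": 0.08,
           "Hardness": 0.152, "TPC": 0.162, "pH": 0.134}   # ensemble, Table 4
standards = {"TDS": 500, "TSS": 20, "DO": 5, "BOD": 5,
             "Hardness": 300, "TPC": 300, "pH": 8.5}        # WHO, Table 5
ideal = {"TDS": 0, "TSS": 0, "DO": 14.6, "BOD": 0,
         "Hardness": 0, "TPC": 0, "pH": 7.0}

def wqi(measured, weights):
    """Weighted sum of quality ratings; the ensemble weights already sum to ~1."""
    total = 0.0
    for p, v in measured.items():
        q = 100.0 * abs(v - ideal[p]) / abs(standards[p] - ideal[p])
        total += weights[p] * q
    return total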

Table 6. MCDA-based water quality index value of all the water samples and its rating

Sampling stations   WQI                Rating of water quality
                    Inlet    Outlet    Inlet    Outlet
S1 85.465 35.359 Poor Good
S2 58.366 47.693 Poor Good
S3 80.109 48.19 Poor Good
S4 65.905 41.532 Poor Good
S5 70.056 82.152 Poor Poor
S6 76.765 49.376 Poor Good
S7 76.765 39.483 Poor Good
S8 76.765 28.799 Poor Good
S9 66.092 42.835 Poor Good
S10 67.466 38.448 Poor Good
S11 55.241 34.467 Poor Good
S12 55.241 29.222 Poor Good
S13 55.241 34.237 Poor Good

[Figure 3: line graph of inlet and outlet WQI values for sampling stations S1–S13 (y-axis: Water Quality Index (WQI), 0–90); the plotted values are those of Table 6, rounded to two decimals.]

Fig. 3. Line graph representing WQI values of the water samples

Table 7 shows the comparison between the conventional WAWQI method (Olayiwola et al. 2016) and the modified MCDA-based WQI method. The unit weights for every parameter are derived based on three criteria: Cost, Potability, and Taste. The table shows no significant difference in the WQI scores of the thirteen collected water samples at most of the sites. The small fluctuations in the values are due to the MCDA-based relative unit weights, which prioritize the parameters according to the factors mentioned above. Moreover, the modification of the Qi formula has also improved the accuracy of the MCDA-based WQI score, as the conventional approach may not always work for water quality parameters such as pH and DO.

Table 7. Comparison between WQI values derived from the standard expression and the MCDA-based approach

Sampling    MCDA-based WQI value of      WQI value of the treated water samples computed
stations    the treated water samples    with the conventional Weighted Arithmetic Water
                                         Quality Index approach (Olayiwola et al. 2016)
S1 35.359 47.729
S2 47.693 47.579
S3 48.19 41.762
S4 41.532 46.154
S5 82.152 47.825
S6 49.376 37.719
S7 39.483 41.029
S8 28.799 38.362
S9 42.835 35.918
S10 38.448 28.089
S11 34.467 43.19
S12 29.222 32.085
S13 34.237 43.243

5 Sensitivity Analysis of MCDA Based WAWQI

The study of the response of an output variable to alterations of the input variables is known as sensitivity analysis. This analysis is performed in the present study to identify the most influential water quality parameter in the WQI score. Each parameter is eliminated in turn from the WQI calculation, and the resulting WQI values are compared with the original MCDA-based WQI value, which comprises the seven selected water quality variables. The parameter whose removal from the WQI calculation produces the largest fluctuation in the WQI score, compared to the original score, is considered the critical parameter impacting the water quality at all the sampling locations (Namugize et al. 2018).
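A minimal sketch of this leave-one-parameter-out procedure is shown below (Python); dropping a parameter's term from the weighted sum without renormalizing matches the uniformly lower scores of Table 8, but this detail, like the helper names, is our assumption.

# Minimal sketch of the leave-one-parameter-out sensitivity analysis.
# wqi_fn is assumed to be a callable like the wqi() sketched earlier,
# taking measured values and a weight dictionary.
def sensitivity(measured, weights, wqi_fn):
    base = wqi_fn(measured, weights)
    deviations = {}
    for p in measured:
        reduced = {k: v for k, v in measured.items() if k != p}
        kept_w = {k: weights[k] for k in reduced}   # weights simply dropped
        deviations[p] = base - wqi_fn(reduced, kept_w)
    # The parameter whose removal changes the score most is the critical one.
    critical = max(deviations, key=lambda q: abs(deviations[q]))
    return critical, deviations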
In this study, Dissolved Oxygen came out to be the most influential parameter, as its removal from the WQI calculation produces a drastic reduction in the WQI value at all the sampling locations. For example, at S1 and S2, eliminating DO from the MCDA-based index calculation decreases the overall WQI score by more than 50%. In 11 out of 13 sampling sites, DO is the most influential variable among the seven parameters. The efficiency of the MCDA-based WQI in the present study is supported by the fact that the relative weights of the water quality variables obtained from the ensemble of AHP and ANP also highlight DO as the critical parameter in determining water quality. TPC and TSS also greatly influence the WAWQI at sites S5 and S6, respectively. This is in accordance with our MCDA approach, in which TSS and TPC obtained the highest weightage after DO.

Table 8 depicts the sensitivity analysis of the MCDA-based WQI of the water samples collected from the thirteen sampling sites, used to determine the critical parameter in evaluating the water quality index. All seven variables are removed one at a time, and the corresponding WQI score is computed. These WQI values are compared with the original WQI value for that sampling site, and the changes in the scores are observed. The greater the difference in the index value, the more influential the parameter. From the table, it can be observed that in 11 out of 13 sites DO is the most influential parameter, as a significant decrease in the WQI value occurs after eliminating DO. Thus, this analysis validates the efficiency of the MCDA-based WQI, which also gives maximum weightage to the DO parameter through its procedure.

Table 8. Sensitivity analysis of WQI at the sampling sites

Sampling   MCDA-based   WQI after eliminating:
sites      WQI value    TDS      TSS      Hardness   DO       BOD      TPC      pH
S1 35.359 34.76 28.8 35.16 15.67 35.03 30.23 32.5
S2 47.693 47.11 43.59 47.18 22.98 45.62 37.43 41.97
S3 48.19 47.13 44.09 47.07 26.32 46.75 33.61 44.17
S4 41.532 40.41 30.05 40.92 17.91 41.05 38.83 40.01
S5 82.152 81.25 60.83 81.29 63.99 78.32 57.85 69.37
S6 49.376 48.52 29.69 48.82 32.53 47.77 48.29 40.62
S7 39.483 38.94 31.28 38.98 21.98 37.72 28.95 39.03
S8 28.799 27.19 26.4 28.19 11.08 26.87 25.29 27.82
S9 42.835 42.10 31.36 42.28 26.21 41.4 32.85 37.83
S10 38.448 37.73 31.07 38.04 22.7 37.17 34.29 29.69
S11 34.467 33.85 23 34.16 16.75 32.55 32.85 33.66
S12 29.222 28.2 27.58 28.41 11.07 27.94 28.09 24.04
S13 34.237 33.24 27.68 33.43 15.21 33.12 32.19 30.57

Figure 4 shows the sensitivity analysis of the MCDA-based WQI at sampling sites S1, S5, and S9. The bar graphs indicate the dominance of the DO parameter in the index calculation at the three locations, as the WQI score shows a significant reduction in the absence of the DO variable compared to the original WQI value. However, apart from DO, TPC is also one of the critical parameters in the water quality rating at site S5. Parameters such as TDS and Hardness hardly show any difference in the WQI score, making them the least significant among the seven variables.
The ability to reduce composite and varied information into a single value, expressing the data in a simplified and consistent form, is what makes a water quality index valuable to its users. In the literature, various methods have been used to calculate WQI, and each of these methods has some merits and limitations; much research is ongoing to improve this iterative process of evaluating WQI. Studies on MCDA-based WQI have been done earlier, but they did not consider the impact of Cost, Potability, and Taste on the water quality parameters; these three criteria were not considered in any of the previous studies.

[Figure 4: for each of the sampling sites S1, S5, and S9, a horizontal bar graph of the original WQI value and of the WQI obtained after eliminating each of TDS, TSS, Hardness, DO, BOD, TPC, and pH in turn; the plotted values are those of Table 8.]

Fig. 4. Sensitivity analysis of MCDA based WQI at sampling sites S1, S5, and S9

That is why the current investigation not only estimates the WQI with the AHP MCDA technique but also validates it against the ANP MCDA technique, incorporating the novel criteria of Cost, Potability, and Taste. This MCDA approach helps improve the accuracy of the estimated WQI and reveals the most influential parameters in maintaining the water quality, so that users can focus on keeping the magnitude of those parameters within the standard values.

6 Conclusion

The WAWQI method is applied to determine the WQI value. Two modifications of the standard expression have been made: the first uses MCDA techniques to assign relative weightage to the parameters, quantifying the influence of each parameter on the quality of drinking water; the second uses an absolute-value term in the quality rating formula to obtain a more accurate and justified WQI value for the concerned water. The results obtained from the MCDA-based water quality index show that the status of the tap water collected from the various sampling locations of Jirania, Agartala varies from good to low quality, whereas most of the treated or purified water samples collected from the thirteen sampling stations vary from excellent to good quality. Sampling station 12 has better water quality (both inlet and outlet) than the other buildings, whereas sampling station 5 (outlet purified water) shows unsatisfactory results. The treated water samples of sampling stations 2 and 3 also show unsatisfactory results for some of the water quality parameters. The application of the two MCDA techniques in the present study identifies Dissolved Oxygen as the most influential parameter in determining drinking water quality. It is observed that the unit weights

obtained from these two methods are almost equal for each of the seven parameters. Hence, ensemble weights from both techniques have been considered, and the calculation of the WQI has been worked out accordingly. A sensitivity analysis of the MCDA-based WQI is carried out to validate the new method, and its results also show DO as the key parameter in the index calculation. When calculating the relative weighted significance with the MCDA processes, the order of preference for the criteria is not standard or fixed; it can differ for different sections of users. Furthermore, the selection of criteria may also vary depending on the requirements of the consumers. These factors might result in minor alterations in the values of the relative weights.

References
Bouslah, S., Djemili, L., Houichi, L.: Water quality index assessment of Koudiat Medouar
Reservoir, northeast Algeria using weighted arithmetic index method. J. Water Land Dev. 35,
221–228 (2017)
Brown, R.M., McClellan, N.I., Deininger, R.A., Tozer, R.G.: A water quality index—do we
dare? Water Sew Works 117, 339–343 (1972)
Chapelle, F.H., Bradley, P.M., McMahon, P.B., et al.: What does “water quality” mean? Ground
Water 47(6), 752–754 (2009). https://doi.org/10.1111/j.1745-6584.2009.00569.x
Chaurasia, A.K., Pandey, H.K., Tiwari, S.K., et al.: Groundwater quality assessment using water
quality index (WQI) in parts of Varanasi district, Uttar Pradesh, India. J. Geol. Soc. India 92
(1), 76–82 (2018)
Fondriest Environmental, Inc. Dissolved Oxygen Fundamentals of Environmental Measurements
[online] (2013). https://www.fondriest.com/environmental-measurements/parameters/water-
quality/dissolved-oxygen
Horton, R.K.: An index number system for rating water quality. J. Water Pollut. Control Fed. 37
(3), 300–306 (1965)
Intelligent Computing & Optimization, Conference proceedings ICO. Springer, Cham (2018).
ISBN 978-3-030-00978-6
Intelligent Computing and Optimization. In: Proceedings of the 2nd International Conference on
Intelligent Computing and Optimization 2019 (ICO 2019). Springer International Publishing.
ISBN 978-3-030-33585-4
Karakuş, C.B.: Evaluation of groundwater quality in Sivas province (Turkey) using water quality
index and GIS-based analytic hierarchy process. Int. J. Environ. Health Res. 1–20 (2018). https://doi.org/10.1080/09603123.2018.1551521
Lkr, A., Singh, M.R., Puro, N.: Assessment of water quality status of Doyang river, Nagaland,
India, using water quality index. Appl Water Sci. 10, 46 (2020)
Mokarram, M., Pourghasemi, H.R., Tiefenbacher, J.P.: Comparison analytic network and
analytical hierarchical process approaches with feature selection algorithm to predict
groundwater quality. Environ. Earth Sci. 78, 625 (2019)
Namugize, J.N., Jewitt, G.P.W.: Sensitivity analysis for water quality monitoring frequency in
the application of a water quality index for the uMngeni River and its tributaries, KwaZulu-
Natal, South Africa. Water SA 44(4), 516 (2018)
Nandeesha, Khandagale, V.R., Mahesh, V.B., et al.: Assessment of groundwater quality by
integration of water quality index and GIS techniques. In: Proceedings of the 35th
International Conference of the Polymer Processing, PPS-35 (2020)

Olayiwola, O., Fasakin, O.: The Use of water quality index method to determine the potability of
surface water and groundwater in the vicinity of a municipal solid waste dumpsite in Nigeria.
Am. J. Eng. Res. (AJER) 5(10), 96–101 (2016)
Sanders, E.R.: Aseptic laboratory techniques: plating methods. J. Vis. Exp. 63, e3064 (2012)
Sandle, T.: Microbiology laboratory techniques. Pharmaceutical Microbiology, Woodhead
Publishing, pp. 63–80 (2016). ISBN 9780081000229
Shah, K.A., Joshi, G.S.: Evaluation of water quality index for River Sabarmati, Gujarat, India.
Appl. Water Sci. 7(3), 1349–1358 (2015). https://doi.org/10.1007/s13201-015-0318-7
Tokatli, C.: Water quality assessment of yazir pond (Tekirdağ, Turkey): an application of water
quality index. Res. J. Biol. Sci. (Biyoloji Bilimleri Araştırma Dergisi) 12(1), 26–29 (2019).
ISSN: 1308-0261
Tuzkaya, G., Önüt, S., Tuzkaya, U.R., et al.: An analytic network process approach for locating
undesirable facilities: an example from Istanbul, Turkey. J. Environ. Manag. 88(4), 970–983
(2008)
Vaidya, O.S., Kumar, S.: Analytic hierarchy process: an overview of applications. Euro. J. Oper.
Res. 169(1), 1–29 (2006)
Vijaya Kumar, H., Patil, N.S., Prabhu, N.: Analysis of water quality parameters of groundwater
near Ranebennur industrial area, Haveri district, Karnataka, India. In: Proceedings of the 35th
International Conference of the Polymer Processing Society, PPS-35 (2020)
World Health Organization Guidelines for Drinking-water Quality. Third Edition, vol. 1 (2008).
ISBN 9789241547611
Zhang, Y., Wang, R., Huang, P., et al.: Risk evaluation of large-scale seawater desalination
projects based on an integrated fuzzy comprehensive evaluation and analytic hierarchy
process method. Desalination 478, 114286 (2020)
An Approach for Multi-human Pose
Recognition and Classification Using
Multiclass SVM

Sheikh Md. Razibul Hasan Raj(&), Sultana Jahan Mukta,


Tapan Kumar Godder, and Md. Zahidul Islam

Islamic University, Kushtia, Bangladesh


rajice62@gmail.com

Abstract. The aim of this paper is to recognize human activities and classify them using a multiclass support vector machine (SVM). There are many ways to predict human activity, either vision based or wearable-sensor based. In vision-based approaches, different types of sensors can be used to address this task, such as a Kinect sensor. In this paper, a vision-based algorithm is proposed to identify multi-human activity so that gestures can be predicted based on actions. The model first segments and filters the environment to identify humans; then, to identify frames based on semantic structure, a dynamic distance separability algorithm divides a shot into sub-shots, and appropriate key-frames are selected in each sub-shot by SVD decomposition. An adaptive filter is used for the filtering process. Then, to compose a feature vector, key poses are extracted, and a multiclass support vector machine performs classification and activity recognition. The application of this model is to recognize multiple humans and estimate their poses for classification. A future research direction of this work is predicting abnormalities in people's health for medical purposes.

Keywords: Activity recognition · PSAR method · Depth camera · Multiclass SVM

1 Introduction

In the area of human activity recognition from unknown video sets, there are few intelligent systems which can robustly and accurately recognize each class of human activities. The difficulties in recognizing human activities come from several aspects: firstly, the feature vector data of human activities are high dimensional, and secondly, similar activities performed by different subjects show large variations. Using depth sensors, like Microsoft GenICam, Kinect, or other similar devices, it is possible to design activity recognition models exploiting depth maps, which are a better source of information because they are not affected by environmental conditions like temperature and light variations, can provide body shape, and simplify the problem of human activity detection and segmentation. The work in [1] presents a smartphone inertial-sensor-based approach for human activity recognition, in which efficient features, including the mean, median, and auto-regressive coefficients, are first extracted from the raw data.

The skeleton joints extracted from the depth frames allow a representation of the human body that can be used in many applications [2].
Compared to RGB images, depth images have several advantages for activity recognition. For example, depth images provide appropriate shape information, which can be preferable to RGB images in many problems such as classification, gesture detection, and activity recognition. Depth images also provide additional body shape and structure information, which has been successfully applied to recover skeleton joints from a single depth image [3]. Feature extraction methods based on light intensity and optical flow work well for RGB images; it is therefore important to design feature extraction methods based on the specific characteristics of depth sequences, such as skeleton joints [4], feature selection vectors [5], and 3D depth structure [4–6]. The extracted information involves spatial properties, such as differences in leg, hand, and body movements when comparing activities, and temporal relationships between consecutive frames. Also, depth sensors are cost effective and feasible to use [7].

2 Related Work

In the last few years, many models have been proposed to detect human activities. Devanne et al. [8] proposed representing human actions by spatio-temporal motion trajectories in a 60-dimensional space; they considered 20 joints, each with 3 coordinates. Then, an elastic metric (a metric invariant to the time and speed of the action sequence) within a Riemannian shape space is employed to represent the distance between two curves. Action recognition can then be seen as classification in the Riemannian space, using a k-Nearest-Neighbor algorithm. Taha et al. [9] exploit joint spherical coordinates to represent the skeleton, together with a framework composed of a multiclass SVM and a discrete HMM, to recognize activities constituted by many actions, for suspicious activity detection.
Ding et al. [6] develop a Spatio-Temporal Feature Chain (STFC) to represent human actions by the trajectories of joint motion. Before using the STFC model, a graph is used to remove periodic sequences, making the solution more robust to noise and periodic sequence misalignment. Seidenari et al. [10] present a method to recognize human actions based on the positions of joints. First, the body skeleton is decomposed into a set of kinematic chains, and the position of each joint is expressed in a locally defined reference system, which makes the coordinates invariant to body translations and rotations. Wondering is a wearable sensor developed to detect and synthesize users' activities. The sensor is used by old people to monitor their activities on a daily basis and send them to their doctor or a family member, providing information about their health status to help them live safely. The system aims to solve the issue of recognizing the activity of the user, such as running, eating, walking, etc., as well as gaining useful information from monitoring user activities and delivering these data to the healthcare body. Users' activities are stored in databases; after that, the system filters these activities to obtain useful data [11].

3 Model

The proposed algorithm for human activity prediction starts from a frame sequence extracted from a live data set. These frames are filtered through a linear filter, and then object selection is performed to determine the particular person whose activity we want to recognize; after that, the skeleton joints are computed and a vector of features is produced for each activity. A multiclass machine learning (ML) algorithm, where each class represents a different activity, is used for classification purposes.

3.1 Steps
Seven main steps constitute the whole algorithm, discussed in the following:
1. Frame segmentation. A live video is taken as input, considered frame by frame, and the particular object (a human) is determined for activity recognition.
2. Filtering. At the tracking stage, a correspondence is found between hypothesized frame regions at the current and previous frames, and the true object region is determined among all the hypotheses for the current frame.
3. Posture feature extraction. The coordinates of the skeletal joints are used to evaluate the feature vectors which represent human postures.
4. Posture selection. The important postures for each activity are selected.
5. Activity feature computation. A feature vector which represents the whole activity is created and used for classification.
6. Classification. The classification is realized using a multiclass SVM.
7. Activity recognition.

3.2 Proposed Method


The frame extraction approach used in this paper is applicable to both video summaries and live video. Based on semantic structure, a dynamic distance separability algorithm divides a shot into sub-shots, and appropriate key-frames are selected in each sub-shot by SVD decomposition. An adaptive filter is used for the filtering process. The adaptive filter is more selective than a comparable linear filter or a median filter, preserving edges and other high-frequency parts of an image. The wiener2 function performs all primary computations and implements the filter for an input image; wiener2, however, requires more computation time than linear filtering. Wiener filtering works best when the noise is constant-power, additive noise, such as Gaussian noise. To represent the input to the algorithm, features need to be extracted from the dataset. We present a novel method for human action recognition from a video sequence. The Kinect libraries include the joint extraction algorithm proposed in [2], ready to use. Starting from skeleton data, many features to represent a human pose, and therefore a human action, have been proposed in the literature. In [8], the authors found that many features can be computed from skeleton joints for feature maps. The simplest features can be extracted from the joint locations and their distances, taking spatial information as the basis. Other features may involve joint motion or orientation, i.e., temporal data. More complex features may be based on the estimation of a plane through some of the joints, with features extracted by measuring the distances between the planes and the other joints.
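As an illustration of the adaptive filtering step, the sketch below (Python) uses SciPy's wiener() as a stand-in for MATLAB's wiener2 mentioned above; the frame source and window size are placeholders, not the paper's settings.

# Minimal sketch of adaptive (Wiener) denoising of one grayscale frame.
import numpy as np
from scipy.signal import wiener

frame = np.random.rand(480, 640)         # placeholder for a captured frame
denoised = wiener(frame, mysize=(5, 5))  # locally adaptive filtering, 5x5 window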

The proposed model exploits spatial features computed from skeleton coordinates, without including time information in the computation, to make the system independent of the speed of movement. The feature extraction method was introduced in [11] and is adopted here with small differences. For each skeleton frame, a posture feature vector is computed. Each joint $i$ is represented by $O_i$, a three-dimensional vector in the coordinate space of the Kinect. Since the person may be at any place within the coverage of the Kinect, the coordinates of the same joint may assume different values; it is necessary to compensate for this effect in the feature computation. A straightforward method is to compensate for the position of the skeleton by centering the coordinate system on one skeleton joint. Considering a skeleton composed of $J$ joints, with $O_0$ the coordinates of the torso joint and $O_2$ the coordinates of the neck joint, the $i$th joint feature $d_i$ is the distance vector between $O_i$ and $O_0$, normalized by the distance between $O_2$ and $O_0$:

$$d_i = \frac{O_i - O_0}{\| O_2 - O_0 \|}, \qquad i = 1, 2, \ldots, J-1 \qquad (1)$$

This feature vector is invariant to the position of the skeleton within the range of the Kinect; invariance to the build of the person is obtained by normalizing by the distance between the neck and torso joints. These features may be seen as a set of distance vectors connecting each joint to the torso joint. A posture feature vector $g$ is created for each skeleton frame:

$$g = [\, d_1, d_2, d_3, \ldots, d_{J-1} \,] \qquad (2)$$
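A minimal sketch of Eqs. (1)–(2) is given below (Python/NumPy); the joint indexing (0 = torso, 2 = neck) follows the text, while the array layout is our assumption.

# Posture feature vector of Eqs. (1)-(2): joints are centered on the torso
# joint O_0 and scaled by the neck-torso distance ||O_2 - O_0||.
import numpy as np

def posture_features(joints):
    """joints: (J, 3) array of 3D joint positions from one skeleton frame."""
    torso, neck = joints[0], joints[2]
    scale = np.linalg.norm(neck - torso)
    d = (joints[1:] - torso) / scale      # d_i for i = 1, ..., J-1  (Eq. 1)
    return d.reshape(-1)                  # g = [d_1, ..., d_{J-1}]  (Eq. 2)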

A set of F feature vectors is computed for an activity constituted by F frames. The second phase concerns the selection of human postures, representing the activity by means of only a subset of poses, with the aim of reducing complexity and increasing generality without using all the frames. A clustering algorithm processes the F feature vectors constituting the activity, grouping them into N clusters. The k-means clustering algorithm can be used to group together the frames representing similar postures, based on the squared Euclidean distance as the metric. Considering an activity composed of F feature vectors $[g_1, g_2, \ldots, g_F]$, the k-means algorithm outputs N clusters and N vectors $[b_1, b_2, \ldots, b_N]$ that represent the centroids of the clusters. The feature vectors are partitioned into clusters $A_1, A_2, \ldots, A_N$ so as to satisfy the condition

$$\arg\min_{A} \sum_{j=1}^{N} \sum_{g_i \in A_j} \| g_i - b_j \|^2 \qquad (3)$$
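The clustering and ordering steps can be sketched as follows (Python with scikit-learn's KMeans, which is our choice of implementation; the authors do not name one). The centroids are concatenated in order of the first occurrence of their cluster along the sequence, as in the S = [B2, B3, B1, B4, B5] example discussed below.

# Minimal sketch of posture selection (Eq. 3) and activity-vector construction.
import numpy as np
from sklearn.cluster import KMeans

def activity_vector(postures, n_clusters):
    """postures: (F, 3*(J-1)) array of posture feature vectors g_1, ..., g_F."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(postures)
    order = []                            # cluster IDs by first temporal occurrence
    for cid in km.labels_:
        if cid not in order:
            order.append(cid)
    return np.concatenate([km.cluster_centers_[cid] for cid in order])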

The N centroids serve as the main postures, i.e., the most important feature vectors. Whereas classical approaches based on key poses select the most informative postures by considering all the sequences of each activity, in the proposed algorithm the clustering is executed for each sequence. This avoids applying a learning algorithm to associate each frame with the closest key pose and allows a more accurate representation of the activity. The computation of a feature vector which models the whole activity is performed in the third phase, starting from the N centroid vectors computed by the clustering algorithm. The $[b_1, b_2, \ldots, b_N]$ vectors are sorted on the basis of the order in which the clusters' elements occur during the activity, and the activity features vector is defined as the concatenation of the sorted centroid vectors. For example, considering an activity featured by F = 12 and N = 5, after running the k-means algorithm one of the possible outputs could be the following sequence of cluster IDs: [2, 2, 2, 3, 3, 1, 1, 4, 4, 4, 5, 5], which means that the first three posture vectors belong to cluster 2, the fourth and fifth are associated with cluster 3, the sixth and seventh are in cluster 1, and so on. In this case, the activity features vector is S = [B2, B3, B1, B4, B5]. An activity feature vector has a dimension of 3N(J − 1), which can be handled without using dimensionality reduction algorithms if N is small. The classification step aims to associate each activity feature vector with the correct activity. A number of machine learning algorithms may be applied to fulfil this task, among them the SVM. Considering $d$ training vectors $x_i \in \mathbb{R}^n$ and a vector of labels $y \in \mathbb{R}^d$, where $y_i \in \{-1, 1\}$, a binary SVM can be formulated as follows (Fig. 1):
$$\min_{w,\,b,\,\xi}\ \frac{1}{2} w^T w + C \sum_{i=1}^{d} \xi_i$$
$$\text{subject to}\quad y_i\left(w^T \phi(x_i) + b\right) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\quad i = 1, \ldots, d \qquad (4)$$

where

$$w^T \phi(x) + b = 0 \qquad (5)$$

and the distance between $w^T x + b = 1$ and $w^T x + b = -1$ is

$$2/\|w\| = 2/\sqrt{w^T w} \qquad (6)$$

Equation (5) is the optimal hyperplane that allows separation between the groups in the feature space; $C$ is a constant, and the $\xi_i$ are nonnegative slack variables which account for the training error. The function $\phi$ transforms the feature vectors into a suitable higher-dimensional space where the data are separable. Now, considering two training vectors $x_i$ and $x_j$, the kernel function can be defined as

$$k(x_i, x_j) = \phi(x_i)^T \phi(x_j) \qquad (7)$$

In this work a Gaussian kernel has been used, where

$$k(x_1, x_2) \triangleq \exp\!\left(-\frac{\|x_1 - x_2\|^2}{\sigma^2}\right), \qquad \sigma > 0 \qquad (8)$$
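Stated directly in code, Eq. (8) is simply the following (a minimal sketch; sigma is the free kernel width):

# Gaussian (RBF) kernel of Eq. (8).
import numpy as np

def gaussian_kernel(x1, x2, sigma=1.0):
    return np.exp(-np.linalg.norm(x1 - x2) ** 2 / sigma ** 2)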

Fig. 1. Activity segmentation and recognition (ASAR) model. The general scheme of the activity recognition algorithm is composed of 7 steps: first segmentation, then filtering, then object specification; next, the posture feature vectors are computed for each skeleton frame, the postures are selected, and an activity features vector is created. Finally, a multiclass SVM is exploited for activity classification.

It follows that $C$ and $\sigma$ are the parameters that have to be estimated prior to using the SVM.
The idea exploited here is to use a multiclass SVM, where each group or class represents an activity of the dataset. Several approaches have been proposed in the literature to extend SVM from a binary to a multiclass classifier. In [2], the authors compared many methods and found that the most suitable for practical use is the one-against-one method. It is implemented in LIBSVM [7] and is the approach used in this work for classification. The one-against-one approach is based on the construction of several binary SVM classifiers: in total, N(N − 1)/2 binary SVMs are necessary for an N-class dataset. This is because each SVM is trained to distinguish between 2 classes, and the final decision is taken by a voting method among all the binary classifiers. During the training phase, the activity feature vectors are given as inputs to the multiclass Support Vector Machine (SVM), together

Fig. 2. Subsets of joints considered in the execution of the proposed algorithm. The whole skeleton representing a person consists of 20 joints; the selected joints are depicted as green circles, while the unselected joints are represented by red squares. Subsets of (a) 7, (b) 11, and (c) 15 joints, and the whole set of (d) 20 joints.

with the label M of the action. In the test phase, the activity label is obtained from the
SVM classifier.
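A minimal sketch of this training/testing step is shown below, using scikit-learn's SVC (which implements the one-against-one scheme on top of LIBSVM); the data shapes and hyperparameter values are placeholders, not the paper's settings.

# Minimal sketch: RBF-kernel multiclass SVM with one-against-one voting.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 90))    # 60 activity vectors of 3N(J-1) = 90 dims
y_train = rng.integers(0, 8, size=60)  # 8 action classes, as in the KARD Actions

clf = SVC(kernel="rbf", C=10.0, gamma="scale",
          decision_function_shape="ovo")   # one-against-one
clf.fit(X_train, y_train)
print(clf.predict(X_train[:5]))            # predicted activity labels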
Experimental Results
The algorithm's performance is evaluated on three publicly available datasets. The reference test procedures have been considered for each dataset, to perform an objective comparison with previous works. The performance indicators are evaluated using four different subsets of joints, shown in Fig. 2, starting from a minimum of 7 up to a maximum of 20 joints. Finally, to evaluate performance in activity recognition scenarios, suitable activities related to HA are selected from the datasets, and the recognition accuracies are evaluated. After training, the test procedure can recognize multi-human activity in real time.
Performance Analysis
The proposed model has been tested over the datasets detailed above; to ensure a fair comparison with previous works which used the same datasets, the recommendations provided in each reference paper are followed. After this comparison, a more activity-recognition-oriented evaluation is conducted, which consists of considering only the suitable actions of each dataset, excluding the gestures, and a subset of the skeleton joints to be included in the feature computation.
KARD Dataset
Gaglio et al. [11] collected the KARD dataset and proposed a few evaluation experiments on it. They considered two modalities of dataset splitting and three different experiments:
(i) Experiment A: one-third of the dataset is considered for training, and the rest for testing.
(ii) Experiment B: two-thirds of the dataset are considered for training, and the rest for testing.
(iii) Experiment C: half of the dataset is considered for training, and the rest for testing.
The activities constituting the dataset are split into the following classes: (i) Gestures and Actions; (ii) Activity Set 1, Activity Set 2, and Activity Set 3, as listed in Table 1. Activity Set 1 is the simplest one, since it is composed of quite different activities, while the other two sets include more similar gestures and actions.

Fig. 3. (a) Single-person activity recognition; (b) multi-person activity recognition.

Table 1. Activity sets grouping different and similar activities from KARD dataset.
Activity set 1 Activity set 2 Activity set 3
Horizontal arm wave High arm wave Draw tick
Bend Catch cap Sit down
Two-hand wave Side kick Drink
Forward kick Forward kick Toss paper
Phone call Draw tick Phone call
Stand up Hand clap Take umbrella
Walk Sit down Horizontal arm wave
Draw X Bend High throw

Each experiment has been repeated at least ten times, randomly splitting the training and testing data. Finally, the "test-person" test is also performed, that is, a leave-one-actor-out setting consisting of training the model on nine of the ten people of the dataset and testing on the tenth. In the "test-person" test, no hints are provided about how to split the dataset, so we assumed that the whole dataset of 18 activities is considered for activity classification. The only parameter to set in the proposed algorithm is the number of clusters N, and different subsets of skeleton joints are considered. Considering J = 7, the maximum accuracy (98.2%) is obtained with N = 40, and the minimum accuracy is 91.7%, obtained with N = 5. The difference is quite high (6.6%), but this gap is reduced by considering more training data: Experiment B shows a difference of 0.4% in Activity Set 1, 1.0% in Activity Set 2, and 2.5% in Activity Set 3. Considering the number of selected joints, Tables 2 and 3 indicate that not all the joints are necessary to achieve good recognition accuracy. From Table 2, it can be noticed that Activity Set 1 and Activity Set 2, which are the simplest ones, have good recognition accuracy using a subset composed of J = 7 joints. Activity Set 3, composed of more similar activities, is better recognized with J = 11 joints. In this case, it is not necessary to consider all the skeleton joints provided by the KARD dataset for greater accuracy. The results obtained in the "test-person" scenario are shown in Fig. 4; the best result is obtained with N = 35.

Table 2. Accuracy (%) of the proposed algorithm compared to the other algorithms, using the KARD dataset with different Activity Sets.
Activity Set 1 Activity Set 2 Activity Set 3
A B C A B C A B C
Gaglio et al. [27] 95.1 99.1 93.0 89.9 94.9 90.1 84.2 89.5 81.7
Enea Cippitelli [29] 98.2 98.4 98.1 99.8 100 99.7 91.6 95.8 93.3
Proposed (J = 7) 98.3 98.4 98.2 99.7 100 99.8 90.2 95.0 91.3
Proposed (J = 11) 98.0 99.0 97.7 99.9 100 99.6 91.8 96.8 95.3
Proposed (J = 15) 97.5 98.8 97.6 99.5 100 99.6 91.7 95.1 94.2

Fig. 4. Confusion matrix of the “test-person” test on the whole KARD dataset.

Using J = 11 joints, the overall precision and recall are about 5% higher than the previous work which uses the KARD dataset. The confusion matrix obtained in this configuration is shown in Fig. 4. It can be observed that the actions are distinguished very well; only toss paper, draw tick, and drink show a recognition accuracy equal to or lower than 90%. The most critical activities are the draw X and draw tick gestures, which are quite similar to each other (Table 4).
From the activity recognition point of view, only some activities are pertinent. Table 3 shows that the eight actions which make up the Actions subset of the KARD dataset are recognized with an accuracy greater than 99%, even if there are some similar actions such as stand up and sit down, or drink and phone call. Considering only the Actions subset, the lowest recognition accuracy is 95.1%, obtained with N = 5 and J = 7. This means that the algorithm is able to reach a high recognition rate even if the feature vector is limited to 90 or fewer elements.

Table 3. Accuracy (%) of the proposed algorithm compared to the others using the KARD dataset, where the dataset is split into Gestures and Actions, for the different experiments.
Gestures Actions
A B C A B C
Gaglio et al. [29] 86.5 93.0 86.7 92.5 95.0 90.1
Enea Cippitelli [29] 89.9 95.9 93.7 99.1 99.6 99.4
Proposed (J = 7) 91.9 93.5 92.5 99.2 99.6 99.5
Proposed (J = 11) 92.9 96.9 94.7 99.0 99.9 99.1
Proposed (J = 15) 87.4 93.6 92.8 98.7 99.5 99.3

Table 4. Precision (%) and recall (%) of the proposed algorithm compared to the other algorithms, using the whole KARD dataset and the "test-person" setting.
Algorithm Precision Recall
Gaglio et al. [29] 84.8 84.5
Enea Cippitelli [29] 95.1 95.0
Proposed (J = 7) 95.3 95.7
Proposed (J = 11) 97.1 97.0
Proposed (J = 15) 96.0 95.8

CAD-60 Dataset. The CAD-60 dataset is an exceptional dataset consisting of 12 activities performed by four people in five different environments and conditions. The dataset is usually evaluated by splitting the activities according to the environment and conditions; the global performance of the algorithm is given by the average precision and recall among all the environments. Two different settings were experimented for CAD-60 in [33]: the former is defined as "test-person" and the latter is the so-called "have-seen." The test-person setting has been considered in all the works using CAD-60, so it is the one selected also in this work.
The most challenging element of the dataset is the presence of a left-handed participant. As suggested in [33], mirrored copies of each action are created in order to increase the performance, which is particularly affected by this imbalance in the "test-person" test. For each participant, a left-handed and a right-handed version of each action are made available; the mirrored version of each test-person activity is obtained by mirroring the skeleton with respect to the virtual sagittal plane that cuts the person in half. The proposed algorithm is evaluated using three different sets of joints, from J = 7 to J = 15, and the sequence of clusters N = [5, 10, 15, 20, 25, 30, 35, 40], as for the KARD dataset. The best results are obtained with the configuration J = 11 and N = 25, and the performance in terms of precision and recall for each activity is shown in Table 5. Very good results are obtained in the office environment, where the average precision and recall are 97.4% and 96.8%, respectively. In fact, the activities of this environment are quite different; only drinking water and talking on phone are similar. On the other hand, the living room environment, which includes relaxing on couch and talking on couch in addition to drinking water and talking on phone, is the most challenging case, since the average precision and recall are 92% and 91.6%.
The proposed algorithm is compared to other works using the same "test-person" setting, and the results are shown in Table 6, which shows that the J = 11 configuration outperforms the state-of-the-art results in terms of precision and is only 0.5% lower in terms of recall. Shan and Akella [36] achieve outstanding results using a multiclass SVM scheme with a linear kernel; however, they train and test mirrored actions

separately and then merge the results when computing the average precision and recall. Our approach simply considers two copies of the same action given as input to the multiclass SVM and retrieves the classification results. The reduced number of joints does not affect the average performance of the algorithm too much, which reaches a precision of 93.7% and a recall of 92.5% with J = 7. Using all the available joints, on the other hand, leads to a more substantial reduction of the performance, showing 88.9% precision and 87.7% recall with J = 15. The best results for the proposed algorithm were always obtained with a high number of clusters (30 or 35), and reducing this number affects the performance; for example, considering the J = 11 subset, the worst performance is obtained with N = 5, with a precision of 87.6% and a recall of 87.0%. From the ASAR point of view, this dataset is composed only of actions, so it does not have to be separated to evaluate the performance in a scenario close to ASAR.
UTKinect Dataset. The UTKinect dataset is composed of 10 activities, performed twice by 10 participants, and the evaluation method proposed in [16] is leave-one-out cross-validation (LOOCV), which means that the system is trained on all the action sequences except one, which is used for testing. To reduce the random effect of k-means, each training/testing procedure is repeated 20 times. Since the skeleton is captured using the Microsoft SDK, which provides 20 joints, all the different subsets of joints shown in Fig. 2 are considered for this dataset. The considered sequence of clusters is only N = [3, 4, 5], because the minimum number of frames constituting an action sequence is 5. The results, compared with previous works, are shown in Table 7.
The best results for the proposed algorithm are obtained with J = 7, the smallest set of joints considered. The result corresponds to N = 4 clusters, but the difference from the other cluster numbers is very low: only 0.6% with respect to N = 5, which provided the worst result. The selection of different sets of joints from J = 7 to J = 15 changes the accuracy only by 2%, from 93.1% to 95.1%. In this dataset, the main drawback for the performance is the reduced number of frames constituting some action sequences, which limits the number of clusters representing the action sequences (Table 8).
Vemulapalli et al. [32] reach the highest accuracy, but their approach is much more complex: the processing scheme includes Dynamic Time Warping to perform temporal alignment after modeling the skeleton joints in a Lie group, and a specific modeling called Fourier Temporal Pyramid before classification with a one-versus-all multiclass SVM. Restricting the UTKinect dataset to ASAR involves considering a subset of activities which includes only actions and discards gestures. In further detail, the following 5 activities have been selected: walk, sit down, pick up, carry, and stand up. Under this condition, the highest accuracy (97.7%) is still given by J = 7, with 3 clusters, and the lowest (95.1%) by J = 15, again with 3 clusters. The confusion matrices for these two configurations are shown in Fig. 5, where the main difference is the reduced misclassification between the carry and walk activities, which are very similar to each other.

Table 5. Precision (%) and recall (%) of the proposed algorithm, in the different environments
of CAD-60, with J = 11 and N = 20.
Location Activity Test-person
Precision Recall
Living room Talking on phone 89.5 88.5
Talking on couch 93.9 100
Drinking water 89.5 88.5
Relaxing on couch 100 89.5
Average 93.23 91.6
Bathroom Brushing teeth 90.9 100
Wearing contact lens 100 85.2
Rinsing mouth 94.3 100
Average 95.06 95.01
Kitchen Cooking-chopping 89.7 100
Opening pill container 100 100
Cooking-stirring 100 82.1
Drinking water 97.0 100
Average 96.7 95.2
Bedroom Talking on phone 95.7 93.7
Opening pill container 98.0 100
Average 96.0 95.1
Office Talking on phone 100 89.5
Working on computer 100 100
Writing on whiteboard 100 96.8
Drinking water 89.7 100
Average 97.4 96.6
Global average 96.29 95.5

Table 6. Global precision (%) and recall (%) of the proposed algorithm for the CAD-60 dataset and "test-person" setting, with different subsets of joints, compared to other works.
Algorithm Precision Recall
Sung et al. [43] 67.9 55.5
Gaglio et al. [29] 77.3 76.7
Enea Cippitelli [29] 92.2 89.4
Proposed (J = 15) 91.9 89.7
Faria et al. [47] 91.1 91.9
Parisi et al. [48] 91.9 90.2
Proposed (J = 7) 96.7 93.5
Shan and Akella [46] 93.8 94.5
Proposed (J = 11) 95.9 94.5

Fig. 5. Confusion matrices of the UTKinect dataset with only ASAR related activities. (a) Best
accuracy confusion matrix, obtained with J = 7. (b) Worst accuracy confusion matrix, obtained
with J = 15.

Table 7. Global accuracy (%) of the proposed algorithm for UTKinect dataset and LOOCV
setting, with different subsets of joints, compared to other works.
Algorithm Accuracy
Jiang et al. [35] 91.9
Xia et al. [26] 90.9
Theodorakopoulos et al. [30] 90.95
Gan and Chen [24] 92.0
Ding et al. [31] 91.5
Zhu et al. [17] 91.9
Liu et al. [22] 96.0
Proposed (J = 15) 95.1
Proposed (J = 11) 97.2
Proposed (J = 20) 97.3
Proposed (J = 7) 97.1
Vemulapalli et al. [33] 97.1

Table 8. Global accuracy (%) of the proposed algorithm for the Florence3D dataset and "test-person" setting, with different subsets of joints, compared to other methods.
Algorithm Accuracy
Vemulapalli et al. [33] 90.9
Seidenari et al. [44] 82.0
Proposed (J = 7) 93.1
Proposed (J = 15) 92.7
Proposed (J = 11) 89.1
Anirudh et al. [34] 89.7
Taha et al. [28] 96.2

4 Discussion

Limiting the analysis to PSAR-related activities only, the algorithm achieves remarkable results on all the datasets. In the considered method, the group of 8 activities defined as Actions in the KARD dataset is recognized with an accuracy greater than 98.7%. The CAD-60 dataset contains only actions, so all the activities, considered within the proper location, have been included in the evaluation, which provides precision and recall of 94.9% and 95.5%, respectively. In the UTKinect dataset, the performance improves when considering only the PSAR activities. To improve recognition accuracy, not only the joints' relative positions but also their velocities should be considered, forming more complex features. The proposed algorithm exploits a multiclass SVM, which is a good classifier, but it cannot easily indicate whether or not an activity belongs to any of the training classes. In a real scenario, it is possible to have a sequence of frames for a particular action that does not represent any activity of the training set; in this case, the SVM outputs the most likely class. Another problem is the segmentation of actions. In many datasets, the actions are represented as segmented sequences of frames, but in real cases the algorithm has to handle a continuous stream of frames and to segment the actions by itself. Though some solutions for segmentation have been proposed, most of them are based on thresholds on movements, which can be highly data-dependent [11]. This aspect has to be further investigated, to have a system which is effectively applicable in a real PSAR scenario.

5 Conclusion and Future Work

The main contribution of this work is classifying different activities more accurately through the multiclass SVM process and building a model to identify and predict the activities of multiple people. For better accuracy, a Gaussian kernel is applied. In this work, the activity recognition algorithm is based on skeleton data extracted from an RGB-D image; after segmentation, it creates a feature vector representing the whole activity. It is able to surpass state-of-the-art results on two publicly available datasets, the KARD and CAD-60, outperforming more complex algorithms in many conditions and cases. The algorithm has also been tested over another, more challenging dataset, the UTKinect, where it is outperformed only by algorithms exploiting temporal alignment techniques or a combination of several machine learning methods. Future work will concern the application of the activity recognition algorithm to human health abnormality prediction, considering the estimation of pose-based abnormality classification.

References
1. Uddin, M.Md.Z., Almogren, M.A.: A robust human activity recognition system using
smartphone sensors and deep learning (2018)
2. Faria, D.R., Premebida, C., Nunes, U.: A probabilistic approach for human everyday
activities recognition using body motion from RGBD images. In: Proceedings of the 23rd
IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN
2014), Edinburgh, UK, pp. 732–737 (2014)
3. Parisi, G.I., Weber, C., Wermter, S.: Self-organizing neural integration of posemotion
features for human action recognition. Front. Neurorob. 9, 3 (2015)
4. Gasparrini, S., Cippitelli, E., Gambi, E., Spinsante, S., Flórez-Revuelta, F.: Performance
analysis of self-organising neural networks tracking algorithms for intake monitoring using
Kinect. In: Proceedings of the 1st IET International Conference on Technologies for Active
and Assisted Living (TechAAL 2015), Kingston, UK (2015)
5. Sung, J., Ponce, C., Selman, B., Saxena, A.: Unstructured human activity detection from
RGBD images. In: Proceedings of the IEEE International Conference on Robotics and
Automation (ICRA 2012), Saint Paul, Minn, USA, pp. 842–849 (2012)
6. Ding, W., Liu, K., Cheng, F., Zhang, J.: STFC: spatio-temporal feature chain for skeleton-
based human action recognition. J. Vis. Commun. Image Representation 26, 329–337 (2015)
7. Sung, J., Ponce, C., Selman, B.: Unstructured human activity detection from RGBD images.
In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA
2012), Saint Paul, Minn, USA, pp. 842–849 (2012)
8. Devanne, M., Wannous, H., Berretti, S., Pala, P., Daoudi, M., Del Bimbo, A.: Spacetime
pose representation for 3D human action recognition. In: New Trends in Image Analysis and
Processing—ICIAP 2013. Lecture Notes in Computer Science, vol. 8158, pp. 456–464.
Springer, Berlin (2013)
9. Taha, A., Zayed, H.H., Khalifa, M.E., El-Horbaty, E.-S.M.: Human activity recognition for
surveillance applications. In: Proceedings of the 7th International Conference on Information
Technology, pp. 577–586 (2015)
10. Seidenari, L., Varano, V., Berretti, S., Del Bimbo, A., Pala, P.: Recognizing actions from
depth cameras as weakly aligned multi-part Bag-of-Poses. In: Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2013),
Portland, Ore, USA, pp. 479–485 (2013)
11. Gaglio, S., Lo Re, G., Morana, M.: Human activity recognition process using 3-D posture
data. IEEE Trans. Hum.-Mach. Syst. 45(5), 586–597 (2015)
Privacy Violation Issues in Re-publication
of Modification Datasets

Noppamas Riyana, Surapon Riyana(&), Srikul Nanthachumphu,


Suphannika Sittisung, and Dussadee Duangban

Maejo University, Chiangmai-Phrao Road, Maejo, Sansai,


Chiang Mai 50290, Thailand
{noppamas,surapon_r,srikul,
suphannika,dussadee}@mju.ac.th

Abstract. Privacy preservation models are often proposed to address privacy violation issues in datasets that are released only once. Thus, if datasets are allowed to update (modify) their data when new data become available, and are released multiple times, such privacy preservation models can be insufficient. For this reason, the aims of this work are to identify the vulnerabilities of privacy preservation models in dynamic datasets that are based on data updating (data modification), and to propose a new algorithm that can address privacy violation issues in the re-publication of modified datasets.

Keywords: Privacy preservation · Data modification · Dynamic datasets · Re-publication datasets

1 Introduction

Data utility and data privacy are both serious issues that must be considered when datasets are released for public use. To address these issues in released datasets, various anonymization models have been proposed, such as k-Anonymity [1], l-Diversity [2], t-Closeness [3], and k-Likeness [4]. For privacy preservation with these anonymization models, before datasets are released for public use, all explicit identifier values of users are removed; moreover, the unique quasi-identifier values of users are suppressed, or generalized by using less specific values, to become indistinguishable. Unfortunately, to the best of our knowledge, these anonymization models are only adequate to address privacy violation issues in static datasets. If the dataset released for public use is dynamic and released multiple times, these anonymization models can be inadequate: when the private data of users is available across released datasets, it can be violated by an appropriate data comparison technique. To remove this vulnerability of anonymization models, various privacy preservation models have been proposed. A well-known privacy preservation model proposed to address privacy violation issues in dynamic datasets is m-Invariance [5]. m-Invariance addresses privacy violation issues in dynamic datasets that are only allowed to change their data by insertion and deletion. The privacy preservation idea of m-Invariance is


that, before datasets are released for public use, all unique tuples are distorted by adding counterfeit tuples until they satisfy the given privacy preservation constraint. Aside from [5], the authors of [6] show that incremental datasets also raise privacy violation issues that must be addressed. Beyond dynamic datasets based on insertion and deletion methods, the authors of [7] and [8] show that decremental datasets raise privacy violation concerns as well. To address these issues, the authors suggest that, aside from released datasets satisfying the given privacy preservation constraint, all possible comparison results between released datasets and their related released dataset versions must also satisfy the given constraint. Although these privacy preservation models can address privacy violation issues in incremental and decremental datasets, they can be inadequate for dynamic datasets that allow their data to be changed by update methods. For this reason, a new appropriate privacy preservation model for updated (modified) datasets is proposed in this work.
The organization of this work is as follows. In Sect. 2, the difference between computer security and privacy preservation is explained. In Sect. 3, an anonymization model, l-Diversity, is presented. In Sect. 4, we present privacy violation scenarios in the re-publication of modified datasets. A privacy preservation model that can address privacy violation issues in the re-publication of modified datasets is proposed in Sect. 5. Subsequently, the experimental results on the effectiveness and efficiency of the proposed model are discussed in Sect. 6. Finally, the conclusion of this work is presented in Sect. 7.

2 Computer Security Versus Privacy Preservation

Computer security is not privacy preservation. Generally, computer security concerns data access control and authentication [1, 9]; its aim is to check and ensure that the recipient has the authority to receive the information. The characteristic of computer security is shown in Fig. 1.

Fig. 1. Computer security

Privacy preservation, in contrast, aims to limit the ability of data re-identification when
datasets are released for public use [1–8, 10–14]. The characteristic of privacy preservation
is presented in Fig. 2, and it is explained further in Sect. 3. In addition, this work is
devoted only to addressing privacy violation issues in released datasets.

Fig. 2. Privacy preservation

3 Anonymization Model

One of the most well-known privacy preservation ideas proposed to address privacy
violation issues in released datasets is data anonymization. Examples of privacy preservation
models based on data anonymization are k-Anonymity [1], l-Diversity [2], t-Closeness [3],
and k-Likeness [4]. For privacy preservation with these anonymization models, before
datasets are released for public use, all explicit identifier values of users are removed, and
the unique quasi-identifier values of users are distorted by data generalization or data
suppression [12, 13] to be indistinguishable. To keep the presentation of anonymization
ideas readable, only l-Diversity is focused on in this work.

3.1 l-Diversity [2]


l-Diversity is a well-known anonymization model that extends k-Anonymity [1]. Let $U$ be the set of all possible users, and let $D^j$ be the raw dataset at timestamp $j$. Each tuple $d_i^j \in D^j$ represents the profile of $u_i \in U$ at timestamp $j$ and is given by the attribute sequence $(QI_1^j, QI_2^j, \ldots, QI_q^j, S^j)$, where $QI_1^j, QI_2^j, \ldots, QI_q^j$ are the quasi-identifier attributes and $S^j$ is the sensitive attribute. Let $D^j[QI_1^j, QI_2^j, \ldots, QI_q^j]$ be the projection of $D^j$ onto the quasi-identifier attributes, and let $D^j[S^j]$ be the projection of $D^j$ onto the sensitive attribute $S^j$. Let $l$ be a positive integer, $l \ge 2$, serving as the privacy preservation constraint of l-Diversity. Let

$$f^l(D^j, l) : D^j \rightarrow_l D^{j*}$$

be the generalization function that transforms the raw dataset $D^j$ into the released dataset $D^{j*}$. That is, the unique values in $QI_1^j, QI_2^j, \ldots, QI_q^j$ of $D^j$ are generalized to their less specific values so as to be indistinguishable, such that each group of indistinguishable quasi-identifier tuples relates to at least $l$ different values of $S^j$; such a group is called an equivalence class of $D^{j*}$. For this reason, $D^{j*}$ can also be denoted by the set of its equivalence classes, i.e., $D^{j*} = \{EC_1^j, EC_2^j, \ldots, EC_{e-1}^j, EC_e^j\}$. Without loss of generality, $EC_1^j \cup EC_2^j \cup \ldots \cup EC_{e-1}^j \cup EC_e^j = D^{j*}$ and $EC_1^j \cap EC_2^j \cap \ldots \cap EC_{e-1}^j \cap EC_e^j = \emptyset$. Also, let $D^{j*}[QI_1^j, QI_2^j, \ldots, QI_q^j]$ be the projection of $D^{j*}$ onto the quasi-identifier attributes, and $D^{j*}[S^j]$ its projection onto the sensitive attribute $S^j$. Moreover, $EC_g^j[QI_1^j, QI_2^j, \ldots, QI_q^j]$ is the projection of $EC_g^j$ onto $QI_1^j, QI_2^j, \ldots, QI_q^j$, and $EC_g^j[S^j]$ is the projection of $EC_g^j$ onto the sensitive attribute $S^j$.

Example 1. Privacy Preservation Based on l-Diversity: Let Table 1 be a raw dataset
that the data holder must release outside the data-collecting organization for an
appropriate business reason. Let Name be the explicit identifier attribute, let Age and
Area be the quasi-identifier attributes, and let Salary be the sensitive attribute. Let the
value of l be set to 2. For privacy preservation, all values in the Name attribute are
removed. Moreover, the unique values in the Age and Area attributes are distorted to
their less specific values so as to be indistinguishable, such that every group of
indistinguishable tuples in the Age and Area attributes relates to at least 2 different
values of the Salary attribute. A released dataset version of Table 1 that satisfies
2-Diversity is shown in Table 2.

Table 1. An example of raw datasets


Name Age Area Salary
Bob 51 London $20000
David 52 Berkshire $23000
John 52 London $25000
Jennifer 51 Leicestershire $22000
Alice 51 Derbyshire $25000

Table 2. A released dataset version of Table 1 that satisfies 2-Diversity


Age Area Salary
51–52 Greater London $20000 EC1
51–52 Greater London $23000
51–52 Greater London $25000
51 East Midlands $22000 EC2
51 East Midlands $25000

From Table 2, we can conclude that, once released datasets satisfy l-Diversity
constraints, every possible query condition on them is matched by at least l different
sensitive values. In this situation, released datasets of l-Diversity seem to have no
privacy violation concerns. Unfortunately, in this work, we show that although released
datasets satisfy l-Diversity constraints, they still have privacy violation issues
that must be addressed.
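To make the l-Diversity condition concrete, the following minimal sketch (our illustration, not the authors' implementation; the data layout and function name are assumptions) checks whether every equivalence class of a released dataset relates to at least l distinct sensitive values, using the contents of Table 2.

```python
# Minimal sketch: verify the l-Diversity condition of Sect. 3.1.
# Each equivalence class is a list of (quasi_identifier_tuple, sensitive_value)
# pairs; the function returns True only if every class holds >= l distinct
# sensitive values. Illustrative code, not the paper's implementation.

def satisfies_l_diversity(equivalence_classes, l=2):
    for ec in equivalence_classes:
        if len({sensitive for _, sensitive in ec}) < l:
            return False
    return True

# Table 2: quasi-identifiers already generalized into two equivalence classes.
ec1 = [(("51-52", "Greater London"), 20000),
       (("51-52", "Greater London"), 23000),
       (("51-52", "Greater London"), 25000)]
ec2 = [(("51", "East Midlands"), 22000),
       (("51", "East Midlands"), 25000)]

print(satisfies_l_diversity([ec1, ec2], l=2))  # True: Table 2 is 2-diverse
```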

4 Privacy Violation Scenarios in Re-publication of Modification Datasets

In this section, the vulnerabilities of l-Diversity in modification datasets are identified,
based on the following assumptions. The adversary has received $D^{(j-1)*}$ and $D^{j*}$.
Moreover, the adversary strongly believes that the profile tuple of the target user is
available in $D^{(j-1)*}$ and $D^{j*}$, and he/she further knows that the quasi-identifier
values or the sensitive values of the target user in $D^{(j-1)*}$ and $D^{j*}$ differ.
Furthermore, the adversary has sufficient background knowledge about the target user.

4.1 Privacy Violation Issues in Released Datasets Based on the Difference of Quasi-Identifier Values

Let $D^{(j-1)*}$ and $D^{j*}$ be two released dataset versions of $D$, released at timestamps $j-1$ and $j$ respectively. Let $u_i$ be the target user of the adversary, and let $BGK_{u_i}^{j-1}$ and $BGK_{u_i}^{j}$ be the adversary's background knowledge about $u_i$ at timestamps $j-1$ and $j$ respectively. Let $EC_g^{j-1}[QI_1^{j-1}, QI_2^{j-1}, \ldots, QI_q^{j-1}]$ of $D^{(j-1)*}$ match $BGK_{u_i}^{j-1}$, i.e., $BGK_{u_i}^{j-1} \subseteq (QI_1^{j-1} \cup QI_2^{j-1} \cup \ldots \cup QI_q^{j-1})$. Moreover, let $EC_g^{j}[QI_1^{j}, QI_2^{j}, \ldots, QI_q^{j}]$ be the related equivalence class of $EC_g^{j-1}[QI_1^{j-1}, QI_2^{j-1}, \ldots, QI_q^{j-1}]$, i.e., $EC_g^{j}[QI_1^{j}, QI_2^{j}, \ldots, QI_q^{j}]$ is available in $D^{j*}$ and is fully covered by $EC_g^{j-1}[QI_1^{j-1}, QI_2^{j-1}, \ldots, QI_q^{j-1}]$. Likewise, let $EC_{gg}^{j}[QI_1^{j}, QI_2^{j}, \ldots, QI_q^{j}]$ of $D^{j*}$ match $BGK_{u_i}^{j}$, i.e., $BGK_{u_i}^{j} \subseteq (QI_1^{j} \cup QI_2^{j} \cup \ldots \cup QI_q^{j})$, and let $EC_{gg}^{j-1}[QI_1^{j-1}, QI_2^{j-1}, \ldots, QI_q^{j-1}]$ be the related equivalence class of $EC_{gg}^{j}[QI_1^{j}, QI_2^{j}, \ldots, QI_q^{j}]$, i.e., $EC_{gg}^{j-1}[QI_1^{j-1}, QI_2^{j-1}, \ldots, QI_q^{j-1}]$ is available in $D^{(j-1)*}$ and is fully covered by $EC_{gg}^{j}[QI_1^{j}, QI_2^{j}, \ldots, QI_q^{j}]$. In this situation, if $EC_g^{j-1}[S^{j-1}] \setminus EC_g^{j}[S^{j}]$ or $EC_{gg}^{j}[S^{j}] \setminus EC_{gg}^{j-1}[S^{j-1}]$ does not satisfy the given value of $l$, the sensitive value of user $u_i$ collected in $D^{(j-1)*}$ and $D^{j*}$ can be violated by data inference.
Example 2. Privacy Violation Issues in Released Datasets Based on the Difference
of Quasi-Identifier Values: Let Table 1 be the raw dataset at timestamp $j-1$, whose
released version is shown in Table 2. Suppose that after Table 2 is released for public
use, Bob's area, London, in Table 1 is updated to Nottinghamshire. The raw dataset
version of Table 1 at timestamp $j$ is then shown in Table 3. Moreover, suppose that
the released version of Table 3 that satisfies 2-Diversity is shown in Table 4. From
Tables 2 and 4, we can see that every possible re-identification condition on them is
matched by at least 2 different salaries, so Tables 2 and 4 seem to have no privacy
violation concerns. Unfortunately, in this work we show that Tables 2 and 4 still have
privacy violation issues that must be addressed. Suppose that Bob is the target user of
the adversary and Bob's salary is the target data. Furthermore, assume the adversary
knows that Bob is a male who is 51 years old, and that after Table 2 is released, Bob's
area, London, is updated to Nottinghamshire. The adversary can thus be highly
confident that the tuple collected in EC1 of Table 2 is Bob's profile tuple at timestamp
$j-1$. Moreover, the adversary can observe that EC1 of Table 2 fully covers EC1 of
Table 4, so EC1 of Table 4 is the related equivalence class of EC1 of Table 2.
Therefore, the adversary can be highly confident that the salary $20000 collected in
EC1 of Table 2 must be Bob's. To confirm that $20000 is Bob's salary, Bob's salary
collected in Table 4 must also be revealed. In Table 4, the adversary can see that EC2
matches his/her background knowledge about Bob at timestamp $j$, and that EC2 of
this table fully covers EC2 of Table 2. Therefore, the adversary can conclude that
Bob's salary collected in Tables 2 and 4 is $20000.

Table 3. The raw dataset version of Table 1 after Bob's area information is updated
Name Age Area Salary
Bob 51 Nottinghamshire $20000
David 52 Berkshire $23000
John 52 London $25000
Jennifer 51 Leicestershire $22000
Alice 51 Derbyshire $25000

Table 4. A released dataset version of Table 3 that satisfies 2-Diversity


Age Area Salary
52 Greater London $23000 EC1
52 Greater London $25000
51 East Midlands $20000 EC2
51 East Midlands $22000
51 East Midlands $25000

From Example 2, it is clear that although released datasets can guarantee that every
possible re-identification condition on them is matched by at least l different sensitive
values, they still have privacy violation issues that must be addressed.
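Reading the attack of Sect. 4.1 as a set difference over sensitive values, a minimal sketch of the data comparison step is given below (our illustration; the matching of related equivalence classes by quasi-identifier coverage is assumed to have been done beforehand).

```python
# Sketch of the comparison attack in Sect. 4.1 (illustrative only): when
# the sensitive values that differ between two related equivalence classes
# number fewer than l (but more than zero), the changed value is exposed.

def difference_violation(sensitive_old, sensitive_new, l=2):
    leaked = set(sensitive_old) - set(sensitive_new)
    return 0 < len(leaked) < l, leaked

# EC1 of Table 2 versus its related class EC1 of Table 4:
violated, leaked = difference_violation({20000, 23000, 25000},
                                        {23000, 25000}, l=2)
print(violated, leaked)  # True {20000}: Bob's salary is inferred
```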

4.2 Privacy Violation Issues in Released Datasets Based on the Difference of Sensitive Values

Let $D^{(j-1)*}$ and $D^{j*}$ be two released dataset versions of $D$, released at timestamps $j-1$ and $j$ respectively. Let $u_i$ be the target user of the adversary, and let $BGK_{u_i}$ be the adversary's background knowledge about $u_i$. Let $EC_g^{j-1}[QI_1^{j-1}, QI_2^{j-1}, \ldots, QI_q^{j-1}]$ and $EC_g^{j}[QI_1^{j}, QI_2^{j}, \ldots, QI_q^{j}]$ match $BGK_{u_i}$, with $EC_g^{j-1}[QI_1^{j-1}, QI_2^{j-1}, \ldots, QI_q^{j-1}] = EC_g^{j}[QI_1^{j}, QI_2^{j}, \ldots, QI_q^{j}]$. In this situation, if $EC_g^{j-1}[S^{j-1}] \setminus EC_g^{j}[S^{j}]$ and $EC_g^{j}[S^{j}] \setminus EC_g^{j-1}[S^{j-1}]$ do not satisfy the given value of $l$, the sensitive values of $u_i$ collected in $D^{(j-1)*}$ and $D^{j*}$ can be inferred to have changed from $EC_g^{j-1}[S^{j-1}] \setminus EC_g^{j}[S^{j}]$ to $EC_g^{j}[S^{j}] \setminus EC_g^{j-1}[S^{j-1}]$.
Example 3. Privacy Violation Issues in Released Datasets Based on the Difference
of Sensitive Values: Let Table 1 be the raw dataset at timestamp $j-1$, whose
released version is shown in Table 2. Suppose that after Table 2 is released, Alice's
salary is updated from $25000 to $30000. The raw dataset version of Table 1 at
timestamp $j$ is then shown in Table 5, and suppose that Table 6 is the released
version of Table 5 that satisfies the l-Diversity constraints. From Tables 2 and 6, we
can again see that every possible re-identification condition on them is matched by at
least two different salaries, so Tables 2 and 6 seem to have no privacy violation
concerns. Unfortunately, in this work we show that Tables 2 and 6 still have privacy
violation issues that must be addressed. Let Alice be the target user of the adversary,
with Alice's salary as the target data. Suppose the adversary strongly believes that
Tables 2 and 6 collect Alice's profile tuple. Moreover, the adversary knows that after
Table 2 is released, Alice's salary information is updated, and he/she further knows
that Alice is a female who is 51 years old and lives in Derbyshire. The adversary is
therefore highly confident that EC2 of Tables 2 and 6 must collect Alice's profile
tuple. After comparing the salaries available in both equivalence classes, the adversary
can be highly confident that Alice's salary changed from $25000 to $30000.

Table 5. The raw dataset version of Table 1 after Alice's salary information is updated
Name Age Area Salary
Bob 51 London $20000
David 52 Berkshire $23000
John 52 London $25000
Jennifer 51 Leicestershire $22000
Alice 51 Derbyshire $30000

Table 6. A released dataset version of Table 5 that satisfies 2-Diversity


Age Area Salary
51–52 Greater London $20000 EC1
51–52 Greater London $23000
51–52 Greater London $25000
51 East Midlands $22000 EC2
51 East Midlands $30000

From Examples 2 and 3, it is clear that although released datasets satisfy l-Diversity
constraints, they still have privacy violation issues that must be addressed. To remove
this vulnerability of l-Diversity, a new privacy preservation model for re-publication
of modification datasets is proposed in this work; it is presented in Sect. 5.

5 Proposed Privacy Preservation Model

In this section, a new privacy preservation model for modification datasets, called
b-Diversity, is proposed. For privacy preservation with b-Diversity, in addition to
released datasets satisfying the given privacy preservation constraint, all comparison
results between related released dataset versions must also satisfy the constraint. The
model assumes that all ordered pairs of released dataset versions constructed from the
raw dataset $D$ and released at timestamps between 1 and $j-1$ already satisfy the
b-Diversity constraints. For this reason, only the released dataset version at timestamp
$j-1$, $D^{(j-1)*}$, is considered when constructing the released dataset version $D^{j*}$.

5.1 b-Diversity

Let $D^j$ be the raw dataset at timestamp $j$, and let $D^{(j-1)*}$ be the released dataset of $D^{j-1}$. Let a positive integer $b$, where $b \ge 2$, be the privacy preservation constraint. Let

$$f^b(D^{(j-1)*}, D^j, b) : D^j \rightarrow_{D^{(j-1)*},\, b} D^{j*}$$

be the privacy preservation function that transforms $D^j$ into $D^{j*}$. That is, the unique quasi-identifier values in $QI_1^j, QI_2^j, \ldots, QI_q^j$ of $D^j$ are distorted by data holding, data suppression, or data generalization so as to be indistinguishable, such that every group of indistinguishable quasi-identifier values relates to at least $b$ different sensitive values. Moreover, the comparison of the different sensitive values collected in every $EC_g^j[S^j]$ of $D^{j*}$ with each related $EC_g^{j-1}[S^{j-1}]$ available in $D^{(j-1)*}$ must also satisfy the given value of $b$. Every group of tuples in $D^{j*}$ that satisfies the b-Diversity constraints is called an equivalence class of $D^{j*}$. Therefore, $D^{j*}$ can also be denoted by the set of its equivalence classes, i.e., $D^{j*} = \{EC_1^j, EC_2^j, \ldots, EC_{e-1}^j, EC_e^j\}$.

Table 7. A released dataset version of Table 3 that satisfies b-Diversity, where $b = 2$.


Age Area Salary
51–52 Greater London $20000 EC1
51–52 Greater London $23000
51–52 Greater London $25000
51 East Midlands $22000 EC2
51 East Midlands $25000

Example 4. Privacy Preservation Based on b-Diversity: Suppose that Table 1 is the
raw dataset at timestamp $j-1$ and Table 3 is the raw dataset version of Table 1 at
timestamp $j$. Let Table 2 be the released dataset version of Table 1, and let the value
of $b$ be set to 2. With these given instances, Table 7 is a released dataset version of
Table 3 that satisfies the b-Diversity constraints with $b = 2$. That is, the unique
quasi-identifier values of Table 3 are distorted by data holding, data suppression, or
data generalization so as to be indistinguishable, such that every group of
indistinguishable quasi-identifier values relates to at least $b$ different values of the
sensitive attribute. Furthermore, the comparison results between each equivalence
class of Table 7 and its related equivalence classes available in Table 2 also satisfy the
given value of $b$. Table 7 can therefore guarantee that every possible query condition
on it is matched by at least $b$ different sensitive values, and that all possible data
comparison results between Tables 2 and 7 are also matched by at least $b$ different
sensitive values. For this reason, released datasets that satisfy b-Diversity constraints
are more secure in terms of privacy preservation than released datasets that satisfy
only l-Diversity constraints.
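One way to operationalize the two b-Diversity conditions above is sketched below (ours, not the authors' code; the pairing of each class with its related class in the previous release is assumed to be given, and an empty set difference is treated as safe).

```python
# Sketch: a release satisfies b-Diversity if (i) every equivalence class
# holds at least b distinct sensitive values, and (ii) the set differences
# between each class and its related class in the previous release are
# either empty or contain at least b values. One reading of Sect. 5.1.

def satisfies_b_diversity(related_pairs, b=2):
    """related_pairs: list of (prev_sensitive_set, new_sensitive_set)."""
    for prev_s, new_s in related_pairs:
        if len(new_s) < b:
            return False                        # condition (i)
        for diff in (prev_s - new_s, new_s - prev_s):
            if 0 < len(diff) < b:
                return False                    # condition (ii)
    return True

# Tables 2 and 7: both equivalence classes are unchanged, so both pass.
print(satisfies_b_diversity([({20000, 23000, 25000}, {20000, 23000, 25000}),
                             ({22000, 25000}, {22000, 25000})], b=2))  # True
```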

5.2 Data Utility


The proposed privacy preservation model, b-Diversity, is a data anonymization model
whose released datasets can satisfy the given privacy preservation constraint through
data holding, data suppression, and data generalization. For this reason, released
datasets that satisfy b-Diversity constraints are generally more secure in terms of
privacy preservation than their original datasets; however, they often lose some data
utility. Thus, a data utility metric, or data penalty metric, is necessary for b-Diversity.
A data penalty metric that can measure the penalty cost of released datasets satisfying
b-Diversity constraints is shown in Eq. (2). With this equation, the penalty cost of
released datasets lies in the range between 0.0 and 1.0, and a released dataset whose
penalty cost is close to 0.0 is desired.

  
$$P_{LOSS}(EC_g^j, W) = \sum_{\forall d_i^j \in (EC_g^j - W)} \; \sum_{z=1}^{q} \frac{h\left(d_i^j[QI_z^j]\right)}{H(QI_z^j)} + (|W| \cdot q) \qquad (1)$$

$$D_{LOSS}(D^{j*}, W) = \frac{\sum_{g=1}^{e} P_{LOSS}(EC_g^j, W)}{|D^{j*}| \cdot q} \qquad (2)$$

where
– $W$ is the set of tuples that are held or suppressed,
– $h(d_i^j[QI_z^j])$ is the generalization level of the value in the quasi-identifier attribute $QI_z^j$ of the tuple $d_i^j$, and
– $H(QI_z^j)$ is the height of the generalization structure of $QI_z^j$.
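A direct transcription of Eqs. (1) and (2) into code is sketched below (our illustration; the per-attribute generalization level h and hierarchy height H are assumed to be supplied by the anonymizer, and W is taken here as the held or suppressed tuples of the class being scored).

```python
# Sketch of the penalty metrics in Eqs. (1) and (2). h_level(d, z) stands
# for h(d_i^j[QI_z^j]) and H_height(z) for H(QI_z^j); both are assumed to
# come from the generalization hierarchies. Not the authors' code.

def p_loss(ec, W, h_level, H_height, q):
    kept = [d for d in ec if d not in W]
    cost = sum(h_level(d, z) / H_height(z) for d in kept for z in range(q))
    return cost + len(W) * q                     # Eq. (1)

def d_loss(equivalence_classes, W, h_level, H_height, q, dataset_size):
    total = sum(p_loss(ec, W, h_level, H_height, q)
                for ec in equivalence_classes)
    return total / (dataset_size * q)            # Eq. (2), in [0.0, 1.0]
```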

5.3 Proposed Algorithm


In this section, a greedy privacy preservation algorithm based on b-Diversity
constraints is proposed. To achieve b-Diversity constraints in released datasets, the
algorithm first investigates the updated sensitive values. If an updated sensitive value
does not satisfy the given value of $b$, it is deleted from $D^j$ and the related tuple
available in the dataset at timestamp $j-1$ is inserted into $D^j$ instead. The algorithm
then constructs all possible equivalence classes of $D^j$: the unique quasi-identifier
values of each equivalence class are generalized to their less specific values so as to be
indistinguishable, and every candidate equivalence class must collect at least $b$
different sensitive values. All candidate equivalence classes are then iterated over. In
each iteration, the algorithm selects an appropriate equivalence class $d^j$ for $D^{j*}$
such that $d^j$ has a minimized penalty cost $P_{LOSS}(d^j, W)$ and the comparison
result between $d^j$ and its related equivalence classes in $D^{(j-1)*}$ satisfies the
given value of $b$. In addition, the selected equivalence classes of $D^{j*}$ must not
overlap. Finally, $D^{j*}$ is returned.
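A runnable sketch of the greedy selection step is given below, with simplified stand-ins for the parts the paper describes only in prose: each candidate equivalence class is represented by its penalty cost, its tuple identifiers, its sensitive-value set, and the sensitive-value set of its related class in the previous release (the names and data layout are our assumptions).

```python
# Runnable sketch of the greedy selection in Sect. 5.3 (illustrative only):
# candidates are scanned in order of increasing penalty; a class is accepted
# only if it does not overlap an accepted class, is itself b-diverse, and
# its comparison with the related previous-release class satisfies b.

def greedy_select(candidates, b=2):
    released, covered = [], set()
    for penalty, ids, s_new, s_prev in sorted(candidates):
        if not covered.isdisjoint(ids):
            continue                              # classes must not overlap
        if len(s_new) < b:
            continue                              # class itself must be b-diverse
        if any(0 < len(d) < b for d in (s_prev - s_new, s_new - s_prev)):
            continue                              # comparison must satisfy b
        released.append(ids)
        covered.update(ids)
    return released

candidates = [
    (0.10, frozenset({1, 2, 3}), {20000, 23000, 25000}, {20000, 23000, 25000}),
    (0.05, frozenset({4, 5}),    {22000, 25000},        {22000, 25000}),
    (0.02, frozenset({4, 5}),    {22000},               {22000, 25000}),  # rejected
]
print(greedy_select(candidates))  # [frozenset({4, 5}), frozenset({1, 2, 3})]
```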

6 Experiment

In this section, experimental results on the effectiveness and efficiency of the proposed
model are discussed and compared with the traditional model, l-Diversity [2].

6.1 Experiment Setup


All experiments are proposed to evaluate the effectiveness and efficiency of the pro-
posed privacy preservation model, they are conducted on both Intel(R) Xeon(R) Gold
5218 @2.30 GHz CPUs with 64 GB memory and six 900 GB HDDs with RAID-5.
Furthermore, all implementations are built and executed on Microsoft Visual Studio
2019 Community Edition in conjunction with MSSQL Server 2019.

Moreover, all experiments are based on the Adult dataset [15], which contains about
48,843 tuples with six continuous attributes (age, fnlwgt, education-num, capital-gain,
capital-loss, and hours-per-week) and eight nominal attributes (workclass, education,
marital-status, occupation, relationship, race, sex, and native-country). To conduct the
experiments effectively, only the education, age, sex, native-country, race, and
capital-loss attributes are used in the experimental dataset; the capital-loss attribute is
set as the sensitive attribute, and the other attributes are set as the quasi-identifier
attributes. Furthermore, all tuples containing "0" or "?" values are removed. The
experimental dataset thus contains only 2283 tuples, and its characteristic is shown in
Table 8. All effectiveness results are evaluated using the $D_{LOSS}$ metric presented
in Sect. 5.2.

Table 8. The characteristic of the experimental dataset


Quasi-identifier Sensitive
Education Age Sex Native-country Race Capital-loss
… … … … … …
… … … … … …

6.2 Experimental Results and Discussion


6.2.1 Effectiveness
The experiments in this section are devoted to evaluating the effectiveness of the
proposed model. The first experiment evaluates the effect of the number of updated
tuples on the privacy violation concerns of released datasets under the data comparison
attack. In these experiments, the value of l is fixed at 2, and the number of tuples of the
experimental dataset randomly chosen for update is varied from 100 to 1600. The
experimental results shown in Fig. 3 indicate that released datasets satisfying the
proposed model have no privacy violation concerns under the data comparison attack,
whereas released datasets of l-Diversity still have privacy violation issues that must be
addressed. The cause of these privacy violation issues in released datasets of
l-Diversity is that the comparison results between released datasets and their related
released datasets are not considered by the l-Diversity constraints.
The second experiment evaluates the effect of the number of quasi-identifier attributes
on the data utility of released datasets that satisfy the proposed model and l-Diversity.
In these experiments, the value of l is fixed at 2, and the number of quasi-identifier
attributes is varied from 1 to 5. Moreover, in all experiments, 1600 tuples of the
experimental dataset are chosen at random to have their sensitive value (capital-loss)
or quasi-identifier values (education, age, sex, native-country, and race) updated.

Fig. 3. The effect of the number of updated tuples

Fig. 4. The effect of the number of quasi-identifier attributes

The experimental results shown in Fig. 4 indicate that the number of quasi-identifier
attributes strongly influences the data utility of released datasets: as the number of
quasi-identifier attributes increases, the data utility of released datasets decreases. The
cause is the size of the equivalence classes, i.e., more quasi-identifier attributes lead to
larger equivalence classes in released datasets. We also see that the number of
quasi-identifier attributes influences released datasets of the proposed model more than
released datasets of l-Diversity. In general, data privacy and data utility in released
datasets are traded off against each other.
The third experiment evaluates the effect of the value of l on the data utility of released
datasets that satisfy the proposed model and l-Diversity. In these experiments, the
value of l is varied in the range between 1 and 5. As before, 1600 tuples of the
experimental dataset are chosen at random to have their sensitive value or
quasi-identifier values updated.

Fig. 5. The effect of the value of l

The experimental results shown in Fig. 5 indicate that the value of l also influences the
data utility of released datasets: as the value of l increases, the data utility of released
datasets decreases. Moreover, released datasets satisfying l-Diversity constraints often
have higher data utility than released datasets satisfying the proposed model. The
cause of the lower data utility in released datasets of the proposed model is the size of
their equivalence classes, which are often larger than those of released datasets of
l-Diversity.

6.2.2 Efficiency
In this section, two experiments for evaluating the efficiency of the proposed model
are presented; they are based on the number of quasi-identifier attributes and on the
value of l. For the effect of the number of quasi-identifier attributes, the value of l is
fixed at 2 and the number of quasi-identifier attributes is varied from 1 to 5. For the
effect of the value of l, all quasi-identifier attributes are available in the experimental
dataset and the value of l is varied in the range between 1 and 5. Moreover, in all
experiments, 1600 tuples of the experimental dataset are chosen at random to have
their sensitive value or quasi-identifier values updated. From the experimental results
shown in Figs. 6 and 7, we observe that a greater number of quasi-identifier attributes
and a higher value of l require more execution time for transforming the experimental
dataset to satisfy the given privacy preservation constraints of the proposed model and
of l-Diversity. Moreover, the proposed model uses more execution time than
l-Diversity. The cause is that, in addition to the data of released datasets satisfying the
given privacy preservation constraint, all comparison results between released datasets
and their related released datasets must also be checked against the privacy
preservation constraints of the proposed model.

Fig. 6. The execution time based on the number of quasi-identifier attributes

Fig. 7. The execution time based on the value of l

7 Conclusion

In this work, the privacy violation issues arising in re-publication of modification
datasets are identified. Moreover, a privacy preservation algorithm that can address
privacy violation issues in re-publication of modification datasets is proposed.

References
1. Sweeney, L.: K-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzz. Knowl.
Based Syst. 10(5), 557–570 (2002)
2. Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: L-diversity: Privacy
beyond k-anonymity. ACM Trans. Knowl. Discov. Data 1, 1 (2007)
3. Li, N., Li, T., Venkatasubramanian, S.: t-Closeness: privacy beyond k-anonymity and l-
diversity. In: 2007 IEEE 23rd International Conference on Data Engineering, Istanbul,
pp. 106–115 (2007)
4. Riyana, S., Natwichai, J.: Privacy preservation for recommendation databases. Serv.
Oriented Comput. Appl. 12(3–4), 259–273 (2018)

5. Xiao, X., Tao, Y.: M-invariance: towards privacy preserving re-publication of dynamic
datasets. In: Proceedings of the 2007 ACM SIGMOD International Conference on
Management of Data (SIGMOD 2007), pp. 689–700. Association for Computing
Machinery, New York (2007)
6. Byun, J.W., Sohn, Y., Bertino, E., Li, N.: Secure anonymization for incremental datasets. In:
Jonker, W., Petković, M. (eds.) Secure Data Management. SDM 2006. Lecture Notes in
Computer Science, vol. 4165. Springer, Heidelberg (2006)
7. Riyana, S., Harnsamut, N., Sadjapong, U., Nanthachumphu, S., Riyana, N.: Privacy
preservation for continuous decremental data publishing. In: Chen, J.Z., Tavares, J., Shakya,
S., Iliyasu, A. (eds.) Image Processing and Capsule Networks. ICIPCN 2020. Advances in
Intelligent Systems and Computing, vol. 1200. Springer, Cham (2021)
8. Riyana, S., Riyana, N., Nanthachumphu, S.: An effective and efficient heuristic privacy
preservation algorithm for decremental anonymization datasets. In: Chen, J.Z., Tavares, J.,
Shakya, S., Iliyasu, A. (eds.) Image Processing and Capsule Networks. ICIPCN 2020.
Advances in Intelligent Systems and Computing, vol. 1200. Springer, Cham (2021)
9. Russell, D., Gangemi, G.T.: Computer Security Basics. O’Reilly & Associates Inc, USA
(1991)
10. Riyana, S., Riyana, N., Nanthachumphu, S.: Enhanced (k,e)-anonymous for categorical data.
In: Proceedings of the 6th International Conference on Software and Computer Applications
(ICSCA 2017). Association for Computing Machinery, New York, pp. 62–67 (2017)
11. Riyana, S., Harnsamut, N., Soontornphand, T., Natwichai, J.: (k, e)-Anonymous for ordinal
data. In: Proceedings of the 2015 18th International Conference on Network-Based
Information Systems (NBIS 2015), pp. 489–493. IEEE Computer Society, USA (2015)
12. Riyana, S., Nanthachumphu, S., Riyana, N.: Achieving privacy preservation constraints in
missing-value datasets. SN Comput. Sci. Appl. 12(3–4), 259–273 (2020)
13. Sweeney, L.: Achieving k-anonymity privacy protection using generalization and suppres-
sion. Int. J. Uncertain. Fuzz. Knowl. Based Syst. 10(5), 571–588 (2002)
14. Wang, G., Zhu, Z., Du, W., Teng, Z.: Inference analysis in privacy-preserving data re-
publishing. In: 2008 Eighth IEEE International Conference on Data Mining, Pisa, pp. 1079–
1084 (2008). https://doi.org/10.1109/ICDM.2008.118.
15. Kohavi, R.: Scaling up the accuracy of naive-bayes classifiers: a decision-tree hybrid. In:
Proceedings of the Second International Conference on Knowledge Discovery and Data
Mining (1996)
Using Non-straight Line Updates in Shuffled
Frog Leaping Algorithm

Kanchana Daoden1,2 and Trasapong Thaiupathump1,2

1 Smart Electronics Engineering, Faculty of Industrial Technology, Uttaradit Rajabhat University, Tha It, Thailand
kanchana.dao@uru.ac.th, trasapong@eng.cmu.ac.th
2 Department of Computer Engineering, Faculty of Engineering, Chiang Mai University, Chiang Mai, Thailand

Abstract. This paper presents an improved version of the shuffled frog leaping
algorithm (SFLA), called the drunken SFLA. The updating process in the
original SFLA moves a solution toward a hopefully better solution in a
straight-line manner while the algorithm searches for the optimal solution.
However, for some types of problems, moving in a straight line might not
converge effectively, or might not converge at all, since the optimal solution
might not lie on the straight line between solutions while the algorithm is
progressing. The idea is to add an angle to the updating process in order to
expand the search space in each step, so that the updating process of the
drunken SFLA is no longer a straight line. This paper explores the possibility
of modifying the updating process of SFLA in a non-straight-line manner. The
convergence performance results of the proposed drunken SFLA and the
original SFLA are presented.

Keywords: Optimization · Meta-heuristic · Shuffled frog leaping algorithm (SFLA) · Truncated Gaussian distribution

1 Introduction

The evolutionary algorithm (EA) is a heuristic-based approach developed to solve
complicated problems, classically NP-hard problems, in polynomial time over huge
search spaces; examples include genetic algorithms, particle swarm optimization, ant
colony optimization, SFLA, and other methods applied where exact search would take
too much time. The shuffled frog leaping algorithm (SFLA) is a meta-heuristic that
repeatedly updates frog positions and partitions the frogs while they search for food.
The research in [1] compared the principles and results of five evolutionary-based
algorithms, including SFLA.
SFLA was presented by Eusuff and Lansey in 2003 [2] to solve discrete
optimization problems. In that work, SFLA was applied to determine suitable discrete
pipe sizes for new pipe network installation and expansion, and it is used to search for
and approach a good global solution. SFLA consists of a global search step and a local
exploration step. It is a discrete search algorithm that imitates the search behavior of frogs
by grouping them to determine which group has the highest fitness [3]. It has a local
exploration step and a global exploration step; the local exploration step divides all
analyzed frogs into subgroups, each called a memeplex. Several researchers have
modified SFLA to improve its solving efficiency and have implemented it successfully
[5, 6]. Accordingly, SFLA is broadly used in various fields of science and industry.
In 2013, the fast shuffled frog leaping algorithm (FSFLA) was proposed with two
changes: first, each subgroup learns from the other groups using an update strategy;
second, the algorithm speed is improved by sorting and grouping the individuals at
regular intervals [7]. The results show that FSFLA has high convergence accuracy and
fast speed. Later, in 2014, several researchers applied SFLA to optimization problems
[8, 9]; the improved SFLA variants studied the algorithm parameters. The experimental
results showed that the improved SFLA solves optimization problems with better
convergence accuracy than the original SFLA.
This paper is structured as follows: Sect. 1 is the introduction. In Sect. 2, the
materials and the proposed method are explained: a modified drunken SFLA using a
truncated Gaussian random number with a drift angle in the updating process is
described, and the impactful parameters used are presented. Section 3 explains the
overview of the original SFLA and the drunken SFLA. Section 4 shows the
convergence graphs of the original SFLA and three types of drunken SFLA, with
discussion. Section 5 gives the conclusion.

2 Materials and Methods


2.1 Shuffled Frog Leaping Algorithm (SFLA)
The SFLA technique simulates the evolution of a frog population toward locations
with the most food. The algorithm starts with a randomly generated initial population
F. The fitness f(i) of each frog is evaluated from its position in the search space.
Afterward, all frogs are sorted by fitness in descending order and divided into m
memeplexes, each composed of n frogs, so that $F = m \times n$. In the frog position
updating process of traditional SFLA, the worst frog position ($P_w$) moves in a
straight line toward the best frog position ($P_b$). In this article, the position update
vector is denoted by the vector D, calculated in Eq. (1). The D vector uses a uniformly
distributed random number, called Rd(), in the range [0, 1].
$$D = Rd() \cdot (P_b - P_w) \qquad (1)$$

$$P_w(\text{new}) = P_w + D \qquad (2)$$

In Eq. (1), the vector D is the position update, where Rd() is a random number with a
uniform distribution between 0 and 1, and $P_b$ and $P_w$ denote the best and worst
frog positions, respectively. In Eq. (2), the new worst frog position is denoted
$P_w(\text{new})$. If $P_w(\text{new})$ has better fitness than the old $P_w$, then it
replaces $P_w$; if not, the global best frog ($P_g$) is used instead of $P_b$ in the
position updating process. If the fitness value still does not improve, the process
randomly generates a new $P_w$; this step is called 'censorship'. For each memeplex,
this step is performed for a predetermined number of iterations, and then all the frogs
in the entire population are shuffled for global exploration. The local exploration and
global search processes continue alternately until the predetermined convergence
criteria are satisfied.
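The update of Eqs. (1)–(2) together with the censorship fallback can be sketched as below (a minimal NumPy illustration under our own naming; memeplex bookkeeping and the maximum step size Smax are omitted for brevity).

```python
import numpy as np

# Minimal sketch of the original SFLA worst-frog update (Eqs. 1-2) with
# censorship. f is the fitness function to minimize; illustrative only.

def sfla_update(P_w, P_b, P_g, f, low, high, rng):
    for best in (P_b, P_g):                        # memeplex best, then global best
        D = rng.uniform(0.0, 1.0) * (best - P_w)   # Eq. (1): straight-line step
        P_new = P_w + D                            # Eq. (2)
        if f(P_new) < f(P_w):
            return P_new
    # Censorship: no improvement, so replace the worst frog with a random one.
    return rng.uniform(low, high, size=P_w.shape)

rng = np.random.default_rng(0)
f = lambda x: np.sum(x ** 2)
print(sfla_update(np.array([4.0, -3.0]), np.array([1.0, 0.5]),
                  np.array([0.2, 0.1]), f, -66.0, 66.0, rng))
```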

3 The Drunken SFLA

3.1 The Improved Drunken SFLA


In the exploration step, fitness values near the best frog's position are close to that of
the best frog, while fitness values near the worst frog's position are similar to that of
the worst frog. In the original SFLA, the frog position update moves the worst frog's
position toward the best frog's position along the direct line, using a uniformly
distributed random number. When updating the position, the improved drunken SFLA
instead tends to move the worst frog's position to the area near the center of the direct
line between the worst and the best frog positions.

Fig. 1. The Truncated Gaussian distribution with an angle.

In addition, since the position updating of the modified drunken SFLA tends to move
the worst frog's position to an area close to the middle of the direct line between the
worst and best frog positions, the drunken SFLA replaces the uniformly distributed
random number with a truncated Gaussian distribution, so the step is likely to fall near
the middle of the vector D line, as shown in Fig. 1. This experiment also adds a
direction to the frog position updating process by considering an angle value. The drift
angle (θ) is an important parameter of the study; we investigate the use of the drift
angle and compare results for values ranging from 0° to 90°. Further, this study uses a
uniformly distributed random number combined with the drift angle. Finally, the
experiment combines the two important parameters, using a truncated Gaussian
random number combined with the drift angle parameter.

3.2 The Drift Angle (θ) as the First Parameter of the Proposed Method
The experiment defines a truncated Gaussian distributed random number ($R_{TG}$) in
the range 0 to 1, with mean (μ) equal to 0.5, used to substitute the uniform random
number in D, as shown in Eq. (3).

$$D = R_{TG}() \cdot \|P_b - P_w\| \cdot \beta \qquad (3)$$

$$\beta = [\cos\theta_1 \;\; \cos\theta_2]^T \qquad (4)$$

$$\theta = \cos^{-1}\!\left(\frac{R_{TG}() \cdot (P_b - P_w)}{\|P_b - P_w\|}\right) + \Delta\theta \qquad (5)$$

The drift angle (θ) lies in the range from 0° to 90°. β is calculated in Eq. (4), where
$\theta_i$ (i = 1, 2) is the drift angle of the movement of the worst frog toward the best
frog in the 2-dimensional search area, calculated in Eq. (5). $\Delta\theta$ is the
parameter that adjusts the deviation from the original direction; it is calculated
following Eqs. (6) and (7).

$$\gamma_{max} = \gamma_{max}^0 - \frac{\gamma_{max}^0}{iter_{max}} \cdot iter \qquad (6)$$

$$\Delta\theta \sim U(-\gamma_{max},\, \gamma_{max}) \qquad (7)$$

The parameter $\gamma_{max}$ in Eq. (6) modifies how the worst frogs randomly search
around the best frogs during the local exploration process. Here, iter is the current
number of algorithm iterations, while $\gamma_{max}^0$ is the maximum initial
deviation angle from the original direction; $\gamma_{max}$ is a value between 0 and
π/2. Therefore, $\gamma_{max}$ is highest at the initialization of the algorithm and
decreases as better solutions are found. Since the $\Delta\theta$ parameter in Eq. (7)
is an angle between $-\gamma_{max}$ and $\gamma_{max}$, the deviation angle varies
in the interval $[-\gamma_{max}, \gamma_{max}]$, as described in this experiment.
This paper investigates and compares values of the drift angle (θ) to find appropriate
ones. The experiment starts from angles equal to 0°, 10°, 20°, 30°, 45°, 60°, and 90°.
When θ is too small or too large, the convergence performance is poor; when θ is
around 30° to 60°, the convergence is better. In the simulation results, the best angle is
30°, which is the reason for choosing this value in the study. Four cases of these
studies of the original SFLA and the drunken SFLA (DSFLA) are shown in Fig. 2.

Fig. 2. Four cases of the studies consist of the original SFLA and the Drunken SFLA (DSFLA).

Case 1 (a): the original SFLA, in which frog positions are updated with a uniformly
distributed random number Rd() between 0 and 1. Case 2 (b): DSFLA with a truncated
Gaussian distributed random number $R_{TG}$() between 0 and 1 and σ equal to 0.5.
Case 3 (c): DSFLA with a uniformly distributed random number between 0 and 1 and
the drift angle θ equal to 30°. Case 4 (d): DSFLA with a truncated Gaussian distributed
random number $R_{TG}$() between 0 and 1, σ equal to 0.5, and θ equal to 30°.
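Combining a truncated Gaussian step length (Eq. 3) with a direction deviated by the drift angle (Eqs. 5–7), the Case 4 update can be sketched in 2-D as follows; the rejection sampler and the rotation by Δθ are our reading of Sect. 3.2, not the authors' code.

```python
import numpy as np

# Sketch of the 2-D "drunken" update (Case 4): a truncated Gaussian random
# step length along a direction deviated by a random drift angle.

def truncated_gaussian(rng, mu=0.5, sigma=0.5, low=0.0, high=1.0):
    while True:                                   # rejection sampling on [low, high]
        x = rng.normal(mu, sigma)
        if low <= x <= high:
            return x

def drunken_update(P_w, P_b, rng, gamma_max=np.pi / 6):
    diff = P_b - P_w
    theta = np.arctan2(diff[1], diff[0])          # direction toward the best frog
    theta += rng.uniform(-gamma_max, gamma_max)   # Eq. (7): deviation in [-γmax, γmax]
    beta = np.array([np.cos(theta), np.sin(theta)])   # unit direction, cf. Eq. (4)
    return P_w + truncated_gaussian(rng) * np.linalg.norm(diff) * beta  # Eq. (3)

rng = np.random.default_rng(1)
print(drunken_update(np.array([4.0, -3.0]), np.array([1.0, 0.5]), rng))
```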

3.3 The Standard Deviation (σ) as the Second Parameter

This research also studies the impact of the standard deviation parameter, or sigma (σ),
of the truncated Gaussian distribution on convergence efficiency. When the σ
parameter is large, the distribution is nearly uniform; when σ is small or approaches
zero, the random numbers are close to the constant mean. In this study, a suitable
value of this parameter is considered. Figure 3 shows the truncated Gaussian random
numbers [10] used in the drunken SFLA (DSFLA) with σ equal to 0.2, 0.5, and 0.7,
together with the uniform random number of the original SFLA, shown as the shaded
area.

Fig. 3. The distributions of random numbers used in frogs’ position updating step in SFLA and
Drunken SFLA (DSFLA) with different r values.
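The effect of σ can be checked numerically with a small rejection-sampling experiment (our illustration): a large σ yields nearly uniform samples on [0, 1], while a small σ concentrates the samples near the mean of 0.5.

```python
import numpy as np

# Quick numerical look at how sigma shapes the truncated Gaussian on [0, 1]
# with mean 0.5, mirroring the sigma values of Fig. 3. Illustrative only.

def truncated_gaussian_samples(sigma, n, rng, mu=0.5):
    samples = []
    while len(samples) < n:
        x = rng.normal(mu, sigma)
        if 0.0 <= x <= 1.0:
            samples.append(x)
    return np.array(samples)

rng = np.random.default_rng(2)
for sigma in (0.2, 0.5, 0.7):
    s = truncated_gaussian_samples(sigma, 10_000, rng)
    print(f"sigma={sigma}: sample std = {s.std():.3f}")
# A uniform distribution on [0, 1] would have std ~= 0.289.
```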

4 Results and Discussion

4.1 The Drift Angle (θ) from 0° to 90°

This experiment measures the number of iterations required to reach the global
minimum; the average fitness values are shown in Table 1.

Table 1. Experiment results for the drift angle (θ).

Drift angle (θ)  Average fitness  Iterations
θ = 0°   1.082  200
θ = 10°  1.006  200
θ = 20°  1.049  200
θ = 30°  0.998  88
θ = 45°  1.000  200
θ = 60°  0.999  45
θ = 90°  0.999  66

Runtime (s): the success rate was measured at the global optimum (−32, −32),
averaged over 10 runs.
Figure 4 shows the convergence rate for different values of the drift angle (θ),
comparing θ equal to 10°, 30°, 45°, 60°, and 90°. The experiments used the same
dataset of initial random populations in the solution space [−66, 66]. When θ equals
10°, the convergence rate is particularly slow. The convergence rate for θ equal to 20°
is better than for θ equal to 10°, but it still converges to the global solution much more
slowly. When θ equals 30°, the graph shows better convergence performance than the
others, as shown by the solid line at the bottom of Fig. 4; for θ equal to 45°, 60°, and
90°, the convergence rates are slower than for 30°. For this reason, θ equal to 30° is
chosen in this experiment.

Fig. 4. Convergence for different drift angles θ from 10° to 90°.

4.2 The Standard Deviation (σ) from 0.1 to 0.9

In the truncated Gaussian distribution, the standard deviation, called sigma (σ), is the
significant parameter investigated in this paper. This experiment also studies the effect
of the standard deviation on convergence efficiency. Notice that when σ is large, the
distribution is near that of the traditional SFLA, meaning the random number is close
to uniformly distributed; when σ is small, the random number is close to the constant
mean value. Table 2 shows the convergence rate for different values of σ from 0.1 to
0.9, together with the corresponding runtimes.

Table 2. Experiment results for the standard deviation (σ).

Sigma (σ)  Iterations  Runtime (s)
σ = 0.1  Not found  > 55.1707
σ = 0.2  145  49.8457
σ = 0.3  117  42.3587
σ = 0.4  110  39.7841
σ = 0.5  59   22.7354
σ = 0.6  31   26.6572
σ = 0.7  39   28.2036

Runtime (s): the success rate was measured at the global optimum (−32, −32) over
10 runs.
When σ is greater than 1, the convergence efficiency can be expected to approach
the performance of the traditional SFLA.

4.3 The Setting of Experimental Parameters

This experiment uses an initial population of 400 frogs (F = 400) separated into 20
memeplexes (m = 20); the number of frogs in each memeplex is 20 (n = 20), the
number of frogs in each sub-memeplex is 15 (q = 15), and the number of evolutions in
local exploration is 15 (N = 15). The maximum step size allowed is 100%
(Smax = 100%), the solution space ranges from −66 to 66, the mean value of
$R_{TG}$ is 0.5, and the random numbers lie in the range 0 to 1 (rand = [0, 1]). The
drift angle is 30° (θ = 30°) and σ is 0.5. The maximum number of iterations in all tests
is 200.

4.4 The Testing Function

The test function used as the evaluation function in this experiment is the De Jong F5
function [4]. It has 25 holes with minimum values, each close to the others, which
makes it suitable for testing whether a solution gets stuck at a local minimum. The
global minimum of this function is f(x) = 0.998 at (−32, −32); the function is
calculated by Eq. (8).

$$\frac{1}{f(x)} = \frac{1}{500} + \sum_{j=1}^{25} \frac{1}{j + \sum_{i=1}^{2} (x_i - a_{ij})^6} \qquad (8)$$
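Equation (8) is De Jong's F5 (Shekel's foxholes) and can be evaluated as below; the 25 foxhole centers $a_{ij}$ are the standard 5 × 5 grid over {−32, −16, 0, 16, 32}, which we assume here since the paper does not list them.

```python
import numpy as np

# De Jong F5 (Shekel's foxholes), Eq. (8). Global minimum f(x) ~= 0.998
# at (-32, -32). The foxhole centers are the conventional 5x5 grid.

coords = np.array([-32.0, -16.0, 0.0, 16.0, 32.0])
A = np.array([[x, y] for y in coords for x in coords])   # a_ij, shape (25, 2)

def dejong_f5(x):
    x = np.asarray(x, dtype=float)
    j = np.arange(1, 26)
    inner = j + np.sum((x - A) ** 6, axis=1)
    return 1.0 / (1.0 / 500.0 + np.sum(1.0 / inner))

print(dejong_f5([-32.0, -32.0]))   # ~0.998
```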

5 Experiment Method and Results

This section shows the convergence performance of the four types: a comparison
among the traditional SFLA, the modified DSFLA using the truncated Gaussian
distribution with sigma (σ) equal to 0.5, the DSFLA using the truncated Gaussian
distribution with θ equal to 30°, and the DSFLA using the truncated Gaussian
distribution with sigma (σ) equal to 0.5 combined with the drift angle (θ) equal to 30°.
The mean (μ) value is equal to 0.5 in the range [0, 1].

Fig. 5. Comparison of the original SFLA with the proposed DSFLA variants.

Figure 5 shows the convergence rates of the original SFLA and the modified DSFLA.
The original SFLA is represented by a thin black line. The DSFLA with the truncated
Gaussian distribution and σ equal to 0.5 uses the dashed line. The DSFLA with the
truncated Gaussian distribution and θ equal to 30° is presented by the dash-dot line,
while the DSFLA with the drift angle θ equal to 30° is a solid black line. The x-axis of
the graph represents the number of iterations, up to 150, while the y-axis shows the
average population fitness on a log scale.

6 Conclusion

This paper presents an improvement of SFLA that uses a truncated Gaussian
distribution in non-straight-line updates of the frog position update process instead of
the uniform distribution of conventional SFLA. The simulation results show that the
convergence efficiency toward the optimum point in the truncated Gaussian case
differs from the original SFLA. The drift angle is another parameter that is considered
and studied; the results illustrate that when the drift angle is too small or too large, it
degrades the convergence speed. Therefore, this research investigates the value of the
drift angle and how it affects convergence. The simulation results for the improved
SFLA use the truncated Gaussian distribution with sigma of 0.5, a mean of 0.5, and a
drift angle of 30°. The outcomes demonstrate that the proposed algorithm achieves
better convergence efficiency than the original SFLA, the sigma-modified SFLA with
the truncated Gaussian distribution of 0.5, and the SFLA with a drift angle of 30°,
respectively. Future work will apply this proposed method to discrete optimization.

References
1. Emad, E., Tarek, H., Donald, G.: Comparison among five evolutionary-based optimization
algorithms. Adv. Eng. Inform. 19, 43–53 (2005)
2. Muzaffar, E., Kevin, L.: Optimization of water distribution network design using the shuffled
frog leaping algorithm. J. Water Resour. Plan. Manag. 129, 210–225 (2003)
3. Muzaffar, E., Kevin, L., Fayzul, P.: Shuffled frog leaping algorithm: a memetic meta-
heuristic for discrete optimization. Eng. Optim. 38, 129–154 (2006)
4. Nagham, A., Ahamad, K.: De Jong’s sphere model test for a social-based genetic algorithm
(SBGA). Int. J. Comput. Sci. Netw. Secur. 8, 179–187 (2008)
5. Ziyang, Z., Daobo, W., Yuanyuan, L.: Improved shuffled frog leaping algorithm for
continuous optimization problem. In: IEEE Congress on Evolutionary Computation (2009)
6. Mohammad, A., Mohammd, A., Amin, S.: An efficient modified shuffled frog leaping
optimization algorithm. Int. J. Comput. Appl. 32, 0975–8887 (2011)
7. Lianguo, W., Yaxing, G.: A fast shuffled frog leaping algorithm. In: IEEE International
Conference on Natural Computation (2013)
8. Guang-Yu, Z., Wei-Bo, Z.: An improved shuffled frog-leaping algorithm to optimize
component pick-and-place sequencing optimization problem. Expert Syst. Appl. 41, 6818–
6829 (2014)
9. Ehsan, B., Malihe, F.: An improved adaptive shuffled frog leaping algorithm to solve various
non-smooth economic dispatch problems in power systems. In: IEEE Intelligent Systems
(2014)
10. Kanchana, D., Trasapong, T.: A modified shuffled frog leaping algorithm using truncated
Gaussian distribution in Frog’s position updating process. In: Conference: Information
Science and Applications (ICISA), Ho Chi Minh City, Vietnam, vol. 376s (2016)
11. Zhen, W., Danhong, Z., Biao, W., Wenwen, C.: Research on improved strategy of shuffled
frog leaping algorithm. In: 34th Youth Academic Annual Conference of Chinese
Association of Automation (YAC), Jinzhou, China (2019)
Efficient Advertisement Slogan Detection
and Classification Using a Hierarchical
BERT and BiLSTM-BERT
Ensemble Model

Md. Akib Zabed Khan1, Saif Mahmud Parvez2(B), Md. Mahbubur Rahman3, and Md Musfique Anwar4

1 Bangladesh University of Business and Technology, Dhaka, Bangladesh
akibcseju21@gmail.com
2 Daffodil International University, Dhaka, Bangladesh
saif.mahmud.parvez@gmail.com
3 Crowd Realty, Tokyo, Japan
mahbuburrahman2111@gmail.com
4 Jahangirnagar University, Dhaka, Bangladesh
manwar@juniv.edu

Abstract. The advertising slogan is a short sentence that is considered one of
the fundamental means to promote business brands and attract customers'
attention toward consuming products. Most existing works paid little attention
to identifying a perfect slogan, which is not a regular sentence, and these
methods did not detect the related context of slogans in an automated way. To
resolve this problem, we have proposed a hierarchical method using the BERT
model to effectively detect slogans, as well as to classify those slogans into
related contexts using an ensemble BiLSTM-BERT model. We obtained
97.43% accuracy in slogan detection and 82.5% accuracy in context-wise
slogan classification. Our model can effectively identify the inner meaning of
advertisement slogans to align them with a deserving brand's promotional
context.

Keywords: Slogan · Context · BiLSTM-BERT


1 Introduction
Nowadays, companies use different types of advertising slogans to introduce and
promote the positive characteristics of their products. Basically, a slogan is an
intelligent phrase mostly used for promotional purposes in business and trade. Brand
owners are willing to pay huge amounts of money to advertising agencies to create an
innovative, unique advertising slogan. The main purpose is to point out the
effectiveness of a product, keeping an eye on the audience's needs in order to offer
more advantages to probable consumers [1]. Understanding the influential factors of
different brands is very important because the corporate sector largely depends on the
values of these brands. Two essential factors, brand image and brand awareness, can
influence brand knowledge. A perfect slogan or motto can increase a company's brand
image, and thereby help
people to identify their products among existing competitive brands. The slogan must
meet the criteria of a context (e.g., food, automotive, business, etc.) [2] and can change
over time based on market demand.
A catchy phrase is usually used to grab the attention of customers, helping to
distinguish a particular product from other similar products available in the market.
Many scholars have analyzed a number of linguistic and rhetorical devices that are
common in advertising slogans. At the orthographic level, trademarks illustrate the use
of full or partial capitalization as well as unconventional spelling, whereas at the
phonological level, broad use of rhyme, alliteration, assonance, and onomatopoeia can
be seen in advertising slogans. The use of pronouns, numerals, adjectives, and verbs is
a common lexical feature, whereas the use of idioms, phrases, and imperative
sentences is usual at the syntactic level. Again, simile, puns, metaphor, etc. are usual
traits from the semantic point of view of a slogan [3].
Finding the perfect slogan for branding a particular product is not a simple task, as it
usually takes a lot of brainstorming and time to generate good branding rhymes.
Natural language processing (NLP) is a technique that has already enabled a new
dimension in modeling human language, detecting the inner meaning of language, and
identifying the sentiment of a text. Sentence analysis consolidates a number of tasks,
such as language structure recognition, lemmatization, morphological segmentation,
part-of-speech tagging, parsing, and stemming. Different logical and statistical
computational strategies are utilized for language structure and semantic
investigations [4]. Incorporating deep neural networks gives a further boost in finding
the inner meaning in a more accurate way [5].
In this work, we have used 10,023 renowned advertisement slogans and created a
model using a BERT classifier to detect slogans and classify them into related
contexts. We have also evaluated whether a slogan is relevant to its context by
incorporating text classification and deep learning approaches. Our main objective is
to create a model that can detect whether a sentence is a slogan or not, and, if it is,
classify the respective context in which it could represent a brand in advertising a
product.

2 Literature Review
In the English language, slogans have stylistic features which differentiate a slogan
from normal English sentences, such as the presence of similes, the use of metaphors
and puns, and the use of different sound techniques. Tatjana et al. proposed a method
that focused on rhetorical devices such as figurative language and sound techniques to
classify 100 sampled slogans into three groups [3]. However, the drawback of their
method is that no automation technique is used to generate or categorize slogans.
Recent research works have paid more attention to the automation of slogan
generation. For example, nominal metaphors are generated on the basis of the
metaphor interpretation models proposed in [6]. Using these metaphors and typical
slogan templates, the method is able to generate new slogans in a given context with
its
associated adjectives by applying multi-objective selection based genetic algo-


rithm. A word-vector and case frame based automated slogan generation method
utilizing gated recurrent units in recurrent neural networks (RNN-GRU) is pro-
posed in [7] that focused only on Japanese slogans for advertisement. Another
genetic algorithm based model is introduced in [8] that applied a beam search
approach based framework BRAINSUP along with some linguistic features for
quality slogan generation.
Yamane et al. proposed a probabilistic method using a bag-of-words model that also
considers ratings from social media users to find efficient Japanese advertisement
slogans [9]. The major limitation of their work is the lack of proper automation
in the proposed framework.
Many existing works have applied different text pattern extraction methodologies
to analyze and generate slogans. Al-Makhadmeh et al. [10] applied a killer natural
language processing optimizing ensemble deep learning approach to classify hate
speech in Twitter data. Jin et al. introduced the LMTF model, which utilizes both
LSTM and topic modeling techniques to capture the inner meaning of users' reviews
[11]. Jeong et al. [12] proposed a method that applies chi-square statistics to
text features such as TF-IDF in order to train a Support Vector Machine (SVM)
classifier that categorizes users' opinions.
Machine learning algorithms are also being used to classify Bangla text data.
Sharif et al. [13] introduced an NLP approach to detect suspicious Bangla texts,
using a logistic regression algorithm to build a model that classifies a text as
suspicious or non-suspicious. They reported state-of-the-art results with 92%
accuracy in classifying Bangla texts, although they considered a small amount
of data.
From the above studies, we find that existing research works did not consider
separating slogans from regular sentences in a way that yields context-dependent,
quality slogans automatically. These methods also ignored NLP and deep learning
techniques that are used efficiently in opinion mining. Therefore, we have applied
NLP and deep learning approaches in this work to classify slogans from regular
sentences as well as to classify the slogans into their respective contexts.

3 Slogan Detection and Context Classification Method


3.1 Data Collection and Preprocessing
We have considered two broad categories of data in order to create a dataset of
around 20,000 sentences: (i) slogan-type data and (ii) non-slogan-type data. Eight
broad slogan categories are present in the slogan-type data. The non-slogan-type
data contains six categories of sentences: simple sentences, complex sentences,
compound sentences, proverbs, quotes from philosophers, and normal day-to-day
communication sentences. The main similarity among the sentences in the dataset is
that all of them are short. To collect our data, we scraped different websites1,2
and extracted the sentences. This dataset is used for slogan detection. Later, we
also scraped the context (such as food & beverage, business, etc.) along with the
slogan sentences for context classification. Next, we pre-processed the dataset to
improve its quality and the performance of the subsequent steps. We lower-cased
the sentences and removed stopwords, numbers, unnecessary characters, and
single-word sentences to clean our data. We also applied stemming and
lemmatization to convert the words into their corresponding root forms. Then, we
shuffled our dataset so that slogan-type and non-slogan-type sentences are
distributed evenly over the whole dataset.

3.2 Slogan Detection


A slogan contains special linguistic features that can effectively separate it
from normal sentences. Each sentence is represented as stated in Eq. (1).

$sen = \{word_1, word_2, word_3, \dots, word_i, \dots, word_n\} \quad (1)$

Here n is the maximum number of words in a sentence. The sub-sequence of the
sentence (sen) is the target, denoted as tar in Eq. (2).

$tar = \{word_i, word_{i+1}, \dots, word_{i+m-1}\} \quad (2)$


Each target contains m words. The main goal of our approach is to differentiate
between slogan and non-slogan sentences, which is denoted by y in Eq. (3).

$y = \begin{cases} 1, & \text{if } sen \text{ is a slogan} \\ 0, & \text{if } sen \text{ is a non-slogan} \end{cases} \quad (3)$
For example, consider two sentences:

sen1 = "Where is the beef?"  and  sen2 = "Do not break anyone's heart"

Here, sen1 is clearly a slogan, indicating the intention of a product related to
food. sen2 looks like a slogan, but it has no context and gives no notion of any
target product, so it is not a slogan. As a result, the output is y1 = slogan and
y2 = non-slogan. In our model, a slogan is indicated by 1 and a non-slogan by 0.
The classification outcome then gives the probability of a sentence being a slogan
within this 0-to-1 range.
For detecting a slogan, we have used a pre-trained BERT (Bidirectional Encoder
Representations from Transformers) model, which can separate a slogan from random
text, as depicted in Fig. 1. First, we tokenized all 20,296 sentences using the
BERT tokenizer. Next, we passed these tokens to a pre-trained BERT model, which
implements embedding, transformer encoding, and
pooling to detect the latent features of the sentences. The output of this layer
is first passed to a dropout layer, whose output feeds a fully connected layer.
Finally, the output of the fully connected layer is passed to a
classification/output layer, which labels each sentence as slogan or non-slogan.
In the pre-trained BERT model, a bidirectional transformer encoder is used to
understand the latent syntactic and semantic features of the sentences. By
implementing the attention mechanism, the transformer model balances the
relationship between input and output. In the decoder portion, it computes the
relationship between the input sentence and the output in parallel, which improves
computational efficiency and removes the long-distance information decay problem.

1 https://github.com/heckenna/SloganGenerationProject/blob/master/slogans.csv.
2 https://www.thinkslogans.com/slogans/advertising-slogans.
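To make the architecture concrete, the following is a minimal sketch of such a BERT-based detector using the Hugging Face transformers library; the checkpoint name, dropout rate, and layer sizes are our assumptions, not details taken from the paper.

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class SloganDetector(nn.Module):
    def __init__(self, dropout=0.3):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.dropout = nn.Dropout(dropout)                     # dropout layer
        self.fc = nn.Linear(self.bert.config.hidden_size, 1)  # fully connected layer

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        pooled = self.dropout(out.pooler_output)   # pooled [CLS] representation
        return torch.sigmoid(self.fc(pooled))      # probability of being a slogan

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["Where is the beef?"], padding=True, truncation=True,
                  max_length=128, return_tensors="pt")
prob = SloganDetector()(batch["input_ids"], batch["attention_mask"])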

Fig. 1. Slogan and non-slogan classification using BERT.

3.3 Slogan Feature Analysis for Context Classification

Slogan Text Features: Many features exist in advertisement slogan text that
differentiate it from traditional text such as news or articles. Some of the main
features are:
Context: The main feature of a slogan is that it indicates a particular context,
as each slogan tries to get the attention of customers of a specific product
category. If a slogan fails to express the exact category for its context, it is
not recognized as a good slogan. In reality, there are millions of brands that
have slogans, but a majority of them are not good enough or not expressed in the
right format. As a result, those slogans sometimes represent an ambiguous context.
Therefore, it is a challenging task to identify good slogans with a clear context
among all these unpopular advertising slogans.
Data Sparsity: Slogans usually consist of only four to seven words. This very
small number of words creates a massive problem of sparse data. Context analysis
becomes very difficult with such sparse data, and if it is not processed properly
it causes serious problems in the sparse matrix representation.
Irregularity and Interactivity: As advertising slogans are very concise and brand
focus is the main concern, many uncommon garbage words and unknown nouns appear
frequently. Also, slogans are bound to be unique. As a result, there is hardly any
proper relationship among slogans of the same type beyond their specific contexts.
Feature Extraction: As context is the core feature of slogans, slogans of the same
context tend to use the same type of words. For example, a food-category slogan
tends to focus on eating behaviour and food-related words, whereas an
automotive-category slogan tends to focus on speed, energy, and power. Our dataset
contains eight broad categories of contexts; Table 1 shows a few of them with an
example each. We have performed exploratory data analysis (EDA) on the overall
dataset as well as on the context-wise slogan data. Common words over the dataset
are those words which appear across all contexts. Because these words make it
harder to distinguish the contexts, they are discarded. Context-wise common words,
in contrast, indicate a context's inner nature. Examples of both types of common
words are shown in Table 2. To extract these common words, we first performed EDA
over the whole dataset; we found 30 such dataset-wide common words and discarded
them, which helped the dataset to be clustered among the contexts. Later, we
performed topic modeling using LDA [14] to find the latent topics with their
relevant words, and we performed context-wise EDA to cross-check with the LDA
results and obtain the top common words specific to each context.
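As an illustration, a minimal scikit-learn sketch of this LDA step follows; the toy slogan texts and vectorizer settings are placeholders, and only the eight-topic count mirrors the paper.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

slogans = ["finger licking good", "making travel simpler",
           "personalized tutoring for success"]   # placeholder slogan texts

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(slogans)

lda = LatentDirichletAllocation(n_components=8, random_state=0)  # 8 contexts
lda.fit(X)

# Top words per latent topic, used to cross-check the context-wise EDA
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    print(k, [terms[i] for i in topic.argsort()[-5:][::-1]])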

Table 1. Different contexts of slogans

Context      Example Slogan                      Common Words in Context
Food & Bev   Finger-licking good                 taste, drink, energy, pure, etc.
Education    Personalized tutoring for success   learn, law, education, students, etc.
Luxury       The hair color experts              shop, hair, beauty, style, look, etc.
Transport    Making travel simpler               cruise, holiday, travel, relax, etc.
Healthcare   Healthy life is priceless gift      health, fitness, mind, medicine, etc.

Table 2. Repetitive words over all the contexts

Common Words Over the Dataset
make, life, better, good, care, get, best, world, live, people, love, way, great, time,
every, feel, take, real, one, quality, place, clean, think, like, come, new, work, help,
experience, power, need, always, etc.

3.4 Context Classification

After extracting and analyzing the features, the next step is to find the
correlation among the words. First, we classify the slogans from a pool of
sentences using the BERT classifier. Next, these slogans are processed based on
the common words. For context classification, we use an ensemble of BiLSTM and
BERT to get the best result.

BiLSTM Architecture: LSTM is a renowned neural network architecture used for
modeling sequential data. With a BiLSTM, it is possible to gather more information
about each context and find correlations of the words for a particular context.
For context analysis, we use a one-hot encoding to represent the individual
context of each slogan, which is then passed to the BiLSTM. When a training
sequence is passed to a BiLSTM, it first runs through a forward LSTM network and
then a backward LSTM network, and these two networks are connected. The three-gate
structure of the LSTM in both directions captures a better understanding of word
correlations [15]. The input, output, and forget states are calculated using
Eqs. (4) to (9).

$ig_t = \sigma(\omega_{ig}[h_{t-1}, x_t] + \beta_{ig}) \quad (4)$
$og_t = \sigma(\omega_{og}[h_{t-1}, x_t] + \beta_{og}) \quad (5)$
$fg_t = \sigma(\omega_{fg}[h_{t-1}, x_t] + \beta_{fg}) \quad (6)$
$C_t = \tanh(\omega_c[h_{t-1}, x_t] + \beta_c) \quad (7)$
$c_t = fg_t * c_{t-1} + ig_t * C_t \quad (8)$
$h_t = og_t * \tanh(c_t) \quad (9)$
Here, the input is denoted by $x_t$ and the hidden state by $h_t$. The three-gate
structure is represented by the gates $ig_t$, $og_t$, and $fg_t$, which indicate
the input, output, and forget gate, respectively. The symbols $\omega_{ig},
\omega_{og}, \omega_{fg}$ and $\beta_{ig}, \beta_{og}, \beta_{fg}$ denote the
weights and biases of the input, output, and forget gates, respectively. In our
proposed model, we use two BiLSTM layers with dropout between them. At the end, we
use a softmax classifier to classify the eight contexts.
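A minimal PyTorch sketch of this two-layer BiLSTM classifier is given below; the embedding and hidden sizes and the dropout rate are illustrative assumptions (only the GloVe embeddings, vocabulary size, and sequence length come from Sect. 4.2).

import torch
import torch.nn as nn

class ContextBiLSTM(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=100, hidden=128,
                 num_classes=8, dropout=0.3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # could load GloVe weights
        self.bilstm = nn.LSTM(embed_dim, hidden, num_layers=2, batch_first=True,
                              bidirectional=True, dropout=dropout)
        self.fc = nn.Linear(2 * hidden, num_classes)       # softmax over 8 contexts

    def forward(self, token_ids):
        out, _ = self.bilstm(self.embed(token_ids))  # forward + backward states
        return torch.softmax(self.fc(out[:, -1, :]), dim=-1)

probs = ContextBiLSTM()(torch.randint(0, 20000, (4, 60)))  # 4 slogans, length 60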

BERT Architecture: Unlike for the BiLSTM, we use an integer label encoding that
represents each class with a number from 0–7 (Table 3) when classifying the
contexts with a pre-trained BERT model. We tokenize all the processed sentences
using the BERT tokenizer, and BERT then classifies the contexts according to
Eqs. (1), (2), and (3).
We use a BERT layer, a dropout layer (p = 0.3), and an output layer for the
8-context classification.
Table 3. Encoded output of each class.

y_class  Class        y_class  Class      y_class  Class      y_class  Class
0        Food & Bev   2        Education  4        Business   6        Healthcare
1        Automotive   3        Luxury     5        Transport  7        Toiletries

Ensemble Model: Both BiLSTM and BERT perform well in classifying the contexts
effectively. The three-gate architecture of the LSTM captures temporal information
and generates candidate contexts by traversing the sequence in both directions. On
the other hand, BERT uses transformer blocks with self-attention masks, which
bring out the inner structure of a slogan and predict its context. In our method,
we use an ensemble of BiLSTM and BERT with majority voting across these models to
fetch the best probability score. The final architecture of our model is shown in
Fig. 2.
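One simple way to realize this combination rule is soft voting: averaging the per-context probability scores of the two models and taking the argmax. The sketch below is our interpretation; the paper's exact rule may differ.

import numpy as np

def ensemble_predict(p_bilstm: np.ndarray, p_bert: np.ndarray) -> np.ndarray:
    """Each input is an (n_samples, 8) array of per-context probabilities."""
    return ((p_bilstm + p_bert) / 2.0).argmax(axis=1)  # predicted context id 0-7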

Fig. 2. Block diagram of the proposed Ensemble model.

4 Experiment and Result Analysis


4.1 Dataset and Evaluation Metrics

Our experiment has been conducted in two steps. In the first step, we experimented
on a dataset that is a collection of 10,073 advertisement slogans and 10,223
regular single-line short sentences. These 20,296 sentences were then split into a
training set and a test set. Table 4 shows this distribution, and Fig. 3 shows the
word cloud distribution of words in the two classes: the slogan word cloud at the
left and the non-slogan word cloud at the right.

Table 4. Slogan and non-slogan distribution over the total dataset

        Slogan          Non-slogan
Train   9062 (44.65%)   9204 (45.35%)
Test    1011 (4.98%)    1019 (5.02%)

Fig. 3. Slogan and non-slogan data word cloud.

Table 5. Slogan context distribution over the slogan dataset

Context      Train    Test     Context      Train    Test
Food & Bev   21.14%   20.26%   Business     19.22%   19.78%
Automotive   6.36%    7.15%    Transport    14.46%   13.58%
Education    4.29%    4.53%    Healthcare   16.91%   17.40%
Luxury       13.6%    13.23%   Toiletries   4.02%    4.05%

Next, we classify the slogan sentences in the dataset and split them into train
and test sets. We then pre-process the identified slogans again for context
classification, discarding sentences that contain garbage words or consist of only
one word. This processed dataset contains eight different classes/categories of
context, as shown in Table 5.
We use Accuracy, Precision, Recall, and F1-score, as stated in Eqs. (10), (11),
(12), and (13) respectively, for performance evaluation.

Accuracy: A = ((TP + TN)/(TP + TN + FN + FP)) × 100%   (10)
Precision: P = TP/(TP + FP)   (11)
Recall: R = TP/(TP + FN)   (12)
F1-score: F1 = (2 × P × R)/(P + R)   (13)

where
TP: number of entities which are recognized and matched
TN: number of entities which are not recognized but matched
FP: number of entities which are recognized and not matched
FN: number of entities which are not recognized and not matched
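For reference, these metrics can be computed directly with scikit-learn; the labels below are dummies, not the paper's predictions.

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [1, 0, 1, 1, 0]   # 1 = slogan, 0 = non-slogan (dummy labels)
y_pred = [1, 0, 0, 1, 0]

acc = accuracy_score(y_true, y_pred) * 100                       # Eq. (10)
p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred,
                                              average="binary")  # Eqs. (11)-(13)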

4.2 Hyper Parameters Setting

In our experimental approach, we adopted a BERT-Base pre-trained model. This model
contains 12 layers with a hidden dimension of 768 and 12 self-attention heads. We
encoded each sentence with a maximum length of 128 for slogan detection and 70 for
slogan context classification, with a batch size of 16 in either case. For our
BiLSTM model, we used a vocabulary size of 20,000, and each sentence was encoded
with a maximum length of 60. We also utilized GloVe vectors for our word
embeddings. First, we conducted our slogan detection experiment with the BERT
pre-trained model only. For context classification, the BiLSTM and BERT ensemble
model was adopted.

4.3 Result Analysis

In the first part of our experiment, we classified slogans from non-slogans using
the BERT pre-trained model. Table 6 shows the output of the model.

Table 6. Accuracy and evaluation metrics for slogan detection using BERT

Category     Accuracy   Precision   Recall   F1-Score
Slogan       97.43%     0.96        0.96     0.96
Non-slogan              0.97        0.97     0.97

After classifying slogans from sentences, we used the BiLSTM-BERT ensemble model
to classify contexts from slogans. With BiLSTM alone, 91% train accuracy and
81.60% test accuracy were acquired for the 2-layer BiLSTM. With BERT, we achieved
93% train and 82.21% test accuracy. With the BiLSTM-BERT ensemble model, we
attained 82.5% test accuracy. Table 7 shows the overall accuracy of our models,
and Table 8 shows the model metrics for context classification.

Table 7. Accuracy of different models for context-wise slogan classification

Model      BiLSTM   BERT     Ensemble Model
Accuracy   81.60%   82.21%   82.5%

Table 8. Model metrics for context-wise slogan classification

Context      Precision        Recall           F1-Score
             BiLSTM   BERT    BiLSTM   BERT    BiLSTM   BERT
Food & Bev   0.86     0.83    0.93     0.98    0.89     0.90
Automotive   0.79     0.87    0.56     0.51    0.66     0.65
Education    0.79     0.92    0.67     0.62    0.72     0.74
Luxury       0.80     0.80    0.85     0.72    0.83     0.76
Business     0.77     0.69    0.86     0.93    0.81     0.79
Transport    0.70     0.80    0.88     0.97    0.78     0.88
Healthcare   0.76     0.72    0.81     0.84    0.79     0.77
Toiletries   0.92     0.92    0.79     0.81    0.85     0.86

5 Conclusion

In this work, we have presented a hierarchical BERT and BiLSTM-BERT ensemble model
for identifying advertising slogans and perceiving the contexts of these slogans.
The BERT model exhibited notable performance in identifying slogans, and the
BiLSTM-BERT ensemble model successfully categorized the detected slogans into
related contexts with an accuracy of 82.5%. There are some limitations in our
work: the scarcity of advertising slogan data, the uneven distribution of
context-based slogans in our second dataset, and the complex structure of
slogan-type short text, which created many problems in classifying context-based
slogans. For future work, we would like to predict the efficiency or rating of a
slogan given its context, so that we can analyze whether a slogan is a good,
average, or bad one. We could also automatically generate new, effective slogans
for the respective contexts to advertise business brands without heavy
brainstorming. Moreover, we would like to work with advertisement slogans used in
the Bengali language.

References
1. Abdi, S., Irandoust, A.: The importance of advertising slogans and their proper
designing in brand equity. Int. J. Organ. Leadership 2(2), 62–69 (2013)
2. Kohli, C., Leuthesser, L., Suri, R.: Got slogan? Guidelines for creating effective
slogans. Bus. Horiz. 50(5), 415–422 (2007)
3. Dubovičienė, T., Skorupa, P.: The analysis of some stylistic features of English
advertising slogans. Žmogus ir žodis, t. 16(3), 61–75 (2014)
4. Rahim, A., Qiu, T., Ning, Z., Wang, J., Ullah, N., Tolba, A., Xia, F.: Social acquain-
tance based routing in vehicular social networks. Future Gener. Comput. Syst.
1(93), 751–60 (2019)
5. Dou, Z.Y., Wang, X., Shi, S., Tu, Z.: Exploiting deep representations for natural
language processing. Neurocomputing 21(386), 1–7 (2020)

6. Alnajjar, K., Kundi, H., Toivonen, H.: “Talent, Skill and Support.”: a method for
automatic creation of slogans. In: Proceedings of the Ninth International Confer-
ence on Computational Creativity, Salamanca, Spain, Association for Computa-
tional Creativity, June 25–29, (2018)
7. Iwama, K., Kano, Y.: Japanese advertising slogan generator using case frame and
word vector. In: 11th International Conference on Natural Language Generation,
pp. 197–198 (2018)
8. Tomašič, P., Papa, G., Žnidaršič, M.: Using a genetic algorithm to produce slogans.
Informatica, 39(2), (2019)
9. Yamane, H., Hagiwara, M.: Advertising slogan generation system reflecting user
preference on the web. IEEE ICSC 2015, 358–364 (2015)
10. Al-Makhadmeh, Z., Tolba, A.: Automatic hate speech detection using killer natu-
ral language processing optimizing ensemble deep learning approach. Computing
102(2), 501–22 (2020)
11. Jin, M., Luo, X., Zhu, H., Zhuo, H.H.: Combining deep learning and topic modeling
for review understanding in context-aware recommendation. In: NAACL: Human
Language Technologies, vol. 1, pp. 1605–1614 (2018)
12. Jeong, C.Y., Han, S.W., Nam, T.Y.: A hierarchical text rating system for objec-
tionable documents. JIPS 1(1), 22–26 (2005)
13. Sharif, O., Hoque, M.M.: Automatic detection of suspicious bangla text using logis-
tic regression. In: International Conference on Intelligent Computing & Optimiza-
tion, pp. 581-590. Springer, Cham (2019)
14. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. J. Mach. Learn.
Res. 3, 993–1022 (2003)
15. Jiang, S., Zhao, S., Hou, K., Liu, Y., Zhang, L.: A BERT-BiLSTM-CRF model
for chinese electronic medical records named entity recognition. In: ICICTA, pp.
166–169 (2019)
Chronic Kidney Disease (CKD) Prediction
Using Data Mining Techniques

Abhijit Pathak1(✉), Most. Asma Gani1, Abrar Hossain Tasin1,
Sanjida Nusrat Sania1, Md. Adil1, and Suraiya Akter2
1 Department of Computer Science and Engineering, BGC Trust University Bangladesh,
Chittagong, Bangladesh
abhijitpathak@bgctub.ac.bd, asmagani42@gmail.com, abrarhossaintasinp@gmail.com,
sanjidanusratsania353@gmail.com, mdadilkhan616@gmail.com
2 Department of Pharmacy, BGC Trust University Bangladesh, Chittagong, Bangladesh
sifransuraiya@gmail.com

Abstract. In the past decade, the rapid growth of digital data and global
accessibility through the modern internet have driven a massive rise in machine
learning research. In proportion to it, medical data has also seen a massive
surge of expansion. With the availability of structured clinical data,
researchers have been attracted in scores to study the automation of clinical
disease detection with machine learning and data mining. Chronic Kidney Disease
(CKD), also known as renal disorder, has been such a field of study for quite
some time now. Therefore, our research aims to study the automated detection of
chronic kidney disease from clinical data using several machine learning
classifiers. The purpose of this research work is to diagnose kidney disease
using machine learning algorithms such as the Support Vector Machine (SVM) and
the Bayesian Network (BN) and to select the most effective one to assess the
extent of CKD in patients. The amount of expertise in the medical field in
relation to CKD is limited, and many patients have to wait a long time to get
their test results. The experience of medical staff also declines in value as,
upon retirement, experienced staff are replaced by new employees. An automated
system can help professional doctors and medical staff in their diagnosis of
CKD. This paper's primary purpose is to present a clear view of Chronic Kidney
Disease (CKD), its symptoms, and the process of early detection that may help
humanity stay safe from this life-threatening disease.

Keywords: Data mining · Chronic Kidney Disease · Glomerular filtration rate ·
SVM · Naïve Bayes classifier

1 Introduction

Although the medical technologies of our time are making groundbreaking
contributions to reducing human deaths from many fatal chronic diseases,
eradicating these diseases and transcending their grasp is still far from a
reality.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 976–988, 2021.
https://doi.org/10.1007/978-3-030-68154-8_82

There have been many fatal diseases over the past centuries; some of them formed
into epidemics and affected millions. Deadly diseases like smallpox, SARS,
measles, and polio have been cured with vaccines in the past century. We are still
fighting to cure the likes of cancer and HIV, but the rVSV-ZEBOV vaccine is a
single-dose vaccine protocol that has been found to be safe and effective only
against the Zaire ebolavirus strain of Ebolavirus. It is the first FDA-approved
vaccine against Ebola.
One of the deadliest health hazards of our time is chronic kidney disease, CKD in
abbreviation. CKD is defined as the gradual degradation of normal kidney function,
and the consequences can be catastrophic. The kidney is essential for the
filtering and purification of our blood; without at least one functioning kidney,
death is imminent and inevitable within a few days. As CKD is a chronic disease
and the symptoms are mild and gradual, it often goes unnoticed for years until a
very late stage.
So, the scope of this research is to build a model using data mining techniques to
predict whether a patient does indeed have CKD by reviewing and analyzing symptoms
and various health parameters, using data mining tools to classify those data, and
comparing the results acquired through the different techniques.
In Bangladesh, a considerable portion of the population lives below the poverty
line and does not have sufficient access to the required medical attention. The
public medical sector, which is almost entirely run by federal funds, has neither
the financial capability nor the medical resources to incorporate this massive
number of financially underprivileged people. As a result, preeminent health
concerns such as chronic kidney disease and renal failure go undiagnosed in this
population, as chronic kidney disorder often does not show any symptoms.
In most cases, patients develop ESRD (End-Stage Renal Disease) without knowing
they ever had a chronic renal disorder in the first place. Chronic kidney disease
is a gradual loss of kidney function; when the end stage is reached, the kidneys
can no longer function well enough to meet the body's needs. The lack of federal
funding, lengthy processes in healthcare, persistent lack of quality, and
unsatisfactory service in public health institutes have forced the general public
to seek private healthcare. This has consequently made the private sector
flourish, but at the price of high medical costs. So a massive number of people
affected by ESRD face dreadful consequences from the lack of proper treatment and
dialysis, which often leaves the patient and the family in unfathomable suffering
and humanitarian crisis.
The first step towards treatment in any medical condition is getting diagnosed.
The advancement of medical technology and the capacity to store medical data in
digital form have rejuvenated the idea of medical automation, and the data
revolution has all but made the possibility of artificially automated doctors more
than just an ambitious dream. An automated virtual system to classify CKD is still
not entirely convincing or decisive to the vast majority of doctors and medical
personnel, and rightly so. But with more data, efficiency, and accuracy, a future
of automated artificial medical assistants can become a reality. The following are
the things we tried to accomplish in this project:
• Can CKD be classified with clinical data?
• Applying a probabilistic classifier to the dataset.

By analyzing the dataset, the CKD status of a patient should be classified. We
tried to find out the correlation of different attributes of the dataset in the
development of CKD. The scope of this problem is to classify our dataset using
machine learning algorithms, attempting to explore the relationships among the
symptoms in the dataset to determine their interdependence in the development of
chronic kidney disease.
In Bangladesh, an automated diagnosis system would shorten the lengthy process in
health care. With improved symptom-analyzing algorithms, the system can suggest
diagnostic tests to users, reducing the time and cost of big hospitals.

2 Literature Review

Drall et al. predicted chronic kidney disease and determined the accuracy of data
mining algorithms such as the naïve Bayes classifier and k-nearest neighbour [18].
In this research, they worked with more than 5 attributes. The performance of the
classification algorithms was compared based on accuracy, precision, and total
execution time for the prediction of chronic kidney disease.
Arasu and Thirumalaiselvi reviewed several algorithms, such as clustering and
classification algorithms [17]. In this research, they examined the algorithms
used in different research papers and tried to find the best classification
algorithm for identifying chronic kidney disease and its stages.
Vijayarani and Dhayanand predicted chronic kidney disease and sought better
performance accuracy in less time using the Naïve Bayes and Support Vector Machine
algorithms [2]. In this research, they worked with 5 attributes. The execution
performance of the Naïve Bayes classifier was 1.29, better than the Support Vector
Machine's 3.22, in predicting CKD.
Tabassum et al. developed a system for predicting chronic kidney disease and
evaluated the performance accuracy of the data mining algorithms used. In this
research, they used Expectation-Maximization, an Artificial Neural Network, and
the C4.5 algorithm, where EM achieved 70%, ANN 75%, and C4.5 96.75% accuracy [19].
Patil reviewed the prediction of chronic kidney disease with data mining
algorithms [15]. This research examined the algorithms used in different research
papers and tried to determine the best one.

3 Methodology

Data mining is a technique in which a large volume of preexisting raw data in a
database is processed, altered to needs, and analyzed to reveal useful patterns
and new relations among attributes for achieving various goals. Data mining is
also called knowledge discovery in databases (KDD) (Fig. 1).

Fig. 1. Steps associated with KDD [29]

Data mining is a powerful and relatively new field with various techniques for
analyzing real-world problems. It converts raw data into useful information in
various research fields and finds patterns to support decisions, for example in
the medical field [9]. Robust data mining techniques have recently been developed
and used in data mining projects for knowledge discovery from databases.

Fig. 2. Proposed system architecture

In this project, we use the SVM and NB algorithms to perform classification and
prediction on the database, extracting information and classifying patients into
two categories: chronic kidney disease (ckd) and not chronic kidney disease
(notckd). In this section, we propose a method of extracting action rules for CKD
datasets, as shown in Fig. 2.

Classification in data mining assigns the objects in a group to target categories;
classification predicts the final result based on the given input dataset [12].
The set of rules attempts to discover relationships among the attributes that make
it feasible to predict the outcome. The classification technique aims to correctly
predict the target class for every case in the records, for example whether the
patient is affected by chronic kidney disease or not.
A. Prediction of CKD
A Naive Bayes classifier is a general classifier with probabilities based on
Bayes' theorem; it makes strong (naive) independence assumptions about the
features. SMO is an algorithm that solves the quadratic programming problem that
occurs when training support vector machines. IBk is an instance-based learner
that uses the classes of the k training examples closest to the test example; the
parameter k is the number of nearest neighbours used for the estimate [26].
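A minimal scikit-learn sketch of these classifiers is shown below; the feature matrix X and label vector y are assumed to come from the preprocessed CKD dataset, and the kernel and k values are illustrative.

from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

classifiers = {
    "SVM": SVC(kernel="rbf"),                       # trained via an SMO-style solver
    "Naive Bayes": GaussianNB(),
    "IBk (k-NN)": KNeighborsClassifier(n_neighbors=5),
}

def evaluate(X, y):
    # 10-fold cross-validation, matching the protocol used in Sect. 4
    for name, clf in classifiers.items():
        print(name, cross_val_score(clf, X, y, cv=10).mean())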
B. Data Preprocessing
Before implementing any classification algorithm on the data, the data must be
cleaned; this is called the pre-processing step. During this pre-processing stage,
many processes occur, such as estimating missing values, deleting noisy data such
as outliers, normalizing, and balancing unbalanced data. Real-world data usually
contain missing values. One way to deal with missing values is to remove the
entire record containing the missing value, called case deletion.
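For example, with pandas the two strategies look as follows; the file name and the mean imputation are illustrative choices, since the paper only names case deletion explicitly.

import pandas as pd

df = pd.read_csv("ckd.csv")                        # hypothetical file name
dropped = df.dropna()                              # case deletion
imputed = df.fillna(df.mean(numeric_only=True))    # mean-value imputation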
C. Estimating the Stages by Calculating the GFR
The different stages are determined by calculating the GFR from the given
attributes using the MDRD formula. GFR is traditionally measured as the renal
clearance of an ideal filtration marker, such as inulin, from plasma. Measured GFR
is considered the gold standard but is not practical for everyday clinical use due
to the complexity of the measurement process. Estimating GFR based on a filtration
marker (usually serum creatinine) is now widely accepted as a basic test.
MDRD formula:
GFR (mL/min/1.73 m²) = 175 × (Scr)^(−1.154) × (Age)^(−0.203) × (0.742 if female) × (1.212 if African American)
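A direct Python rendering of this formula is sketched below; the function and argument names are ours.

def estimate_gfr_mdrd(scr: float, age: float, female: bool,
                      african_american: bool) -> float:
    """Estimated GFR in mL/min/1.73 m^2 via the MDRD formula above."""
    gfr = 175.0 * scr ** -1.154 * age ** -0.203
    if female:
        gfr *= 0.742
    if african_american:
        gfr *= 1.212
    return gfr

print(estimate_gfr_mdrd(scr=1.2, age=55, female=True, african_american=False))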

3.1 Chronic Kidney Disease (CKD)


Chronic kidney disease refers to kidney damage or a gradual decrease in the
glomerular filtration rate (GFR) of the kidney for three months or more [4]. The
measurement of GFR is the most common way to determine the state of the kidney.
GFR is measured via the clearance of a substance excreted by the kidney, often
called a filtration marker; the clearance is then used in a formula to determine
GFR.
Very few substances completely fulfill the criteria mentioned above, but some come
very close. Inulin is such a substance that can be
used to measure GFR. But inulin is not an internal constituent of the body, so an
external infusion of inulin is required to determine inulin clearance. In most
practical cases, creatinine clearance is used for this process. Creatinine (a
direct product of protein metabolism) is an endogenous substance that can be used
for GFR measurement fairly accurately. The estimated GFR is used to classify
kidney impairment [11]. A low GFR is indicative of disrupted kidney function.
Table 1 illustrates the various stages of kidney disease and their severity.

Table 1. GFR stages to classify CKD [24]


Sl. No.  Stage of CKD                                 Glomerular Filtration Rate (GFR)  Action Plan
1        Kidney damage with normal or increased GFR  90 or above                       Diagnosis and treatment of comorbid conditions, disease progression, reduction of risk factors for cardiovascular disease
2        Kidney damage with a mild decrease in GFR   60 to 89                          Estimation of disease progression
3        Moderate decrease in GFR                    30 to 59                          Evaluation and treatment of disease complications
4        Severe reduction in GFR                     15 to 29                          Preparation for kidney replacement treatment (dialysis, transplantation)
5        Kidney failure                              Less than 15                      Kidney replacement therapy (if uremia present)
Source: "Review of Chronic Kidney Disease based on Data Mining Techniques", International Journal of Applied Engineering Research, © Research India Publications.

3.2 The Source of the CKD Dataset


The dataset used in the proposed system was prepared at Apollo Hospital in Tamil
Nadu, India [27]. The dataset owner graciously made the dataset available on the
machine learning data site Kaggle.com, from which we gained access to it.
The dataset has 25 different attributes and 400 instances; the attribute
information is listed in Table 2. The class value is either CKD, which refers to
the prevalence of kidney disorder in a patient, or Non-CKD, which indicates the
opposite. The dataset has 225 instances of the class value "CKD," and the other
175 are classified as "NON-CKD" (Tables 2 and 3).

Table 2. Information attributes [31]


Sl. No.  Attribute                 Representation  Information  Description
1        Age                       Age             Numerical    Years
2        Blood pressure            Bp              Numerical    mm/Hg
3        Specific gravity          Sg              Nominal      1.005, 1.010, 1.015, 1.020, 1.025
4        Albumin                   Al              Nominal      0, 1, 2, 3, 4, 5
5        Sugar                     Su              Nominal      0, 1, 2, 3, 4, 5
6        Red blood cells           Rbc             Nominal      Normal, abnormal
7        Pus cell                  Pc              Nominal      Normal, abnormal
8        Pus cell clumps           Pcc             Nominal      Present, not present
9        Bacteria                  Ba              Nominal      Present, not present
10       Blood glucose random      Bgr             Numerical    mgs/dl
11       Blood urea                Bu              Numerical    mgs/dl
12       Serum creatinine          Sc              Numerical    mgs/dl
13       Sodium                    Sod             Numerical    mEq/L
14       Potassium                 Pot             Numerical    mEq/L
15       Haemoglobin               Hemo            Numerical    gms
16       Packed cell volume        Pcv             Numerical    mL in volume
17       White blood cell count    Wc              Numerical    cells/cumm
18       Red blood cell count      Rc              Numerical    millions/cmm
19       Hypertension              Htn             Nominal      Yes, no
20       Diabetes mellitus         Dm              Nominal      Yes, no
21       Coronary artery disease   Cad             Nominal      Yes, no
22       Appetite                  Appet           Nominal      Good, poor
23       Pedal edema               Pe              Nominal      Yes, no
24       Anemia                    Ane             Nominal      Yes, no
25       Class                     Class           Nominal      Ckd, notckd

Table 3. Class distribution


Sl. No.  Class    Distribution
1        Ckd      225 (63.6%)
2        Notckd   175 (36.4%)

3.3 Matrix and Research Hypothesis


To understand the behavior of the classifiers, we use the following definitions:
• True Positive (TP): the number of positive samples correctly predicted as positive.
• True Negative (TN): the number of negative samples correctly predicted as negative.
• False Negative (FN): the number of positive samples incorrectly predicted as negative.
• False Positive (FP): the number of negative samples incorrectly predicted as positive (Table 4).

Table 4. Metric and research hypotheses [31]


Metric                      Description                                             Formula
Accuracy                    Number of correct predictions from all predictions     (TP + TN)/(TP + FP + TN + FN)   (1)
Sensitivity                 Proportion of positive cases correctly identified      TP/(TP + FN)                    (2)
Specificity                 Proportion of negative cases correctly identified      TN/(FP + TN)                    (3)
Precision                   Positive predictive value                              TP/(TP + FP)                    (4)
Mean Absolute Error (MAE)   Comparison between predictions and actual outcomes     (FP + FN)/(TP + FP + TN + FN)   (5)
F-measure                   Combination of precision and recall                    2 × Precision × Sensitivity / (Precision + Sensitivity)   (6)

Another important tool to consider is the confusion matrix. It is a commonly used
visualization tool to demonstrate the accuracy of a classifier. The columns
represent the predicted classes and the rows represent the actual classes, as
shown in Table 5.

Table 5. Confusion matrix description


                    Predicted
                    Positive   Negative
Actual   Positive   TP         FN
         Negative   FP         TN

3.4 Screenshots of the Dataset

Fig. 3. (a) Pre-processed dataset (b) Processed dataset

The two interfaces above represent the non-processed and processed data used in
this research [7]. The first is the non-processed data, and the other is the
processed data with the probability of CKD or non-CKD. Data cannot be used in a
data mining technique without processing; that is why the data must be processed
before being used in the data mining technique [8] (Fig. 3).

4 Results and Discussion

To test our classifiers and evaluate their performance, we apply a 10-fold
cross-validation test, a technique that divides the original set into a training
set to train the model and a test set to evaluate it. After applying the
pre-processing methods, we analyze the data and examine the distribution of values
in terms of model performance and accuracy (Table 7).

Table 6. Classifiers’ performance criteria


Evaluation criteria               SVM     NB
Time to build model (s)           0.43    0.04
Correctly classified instances    392     382
Incorrectly classified instances  8       21
Accuracy (%)                      61.25   56.5
Error                             0.38    0.43

Fig. 4. A comparative graph of classifiers performance

Table 7. Accuracy measures by class


      TP     FP     Precision   Recall   F-measure   Class
SVM   0.97   0      1           0.97     0.99        CKD
      1      0.04   0.95        0.96     0.98        NotCKD
NB    0.93   0      1           0.93     0.96        CKD
      1      0.09   0.89        1        0.94        NotCKD

Table 8. Confusion matrix


      Predicted: Ckd   Predicted: NotCkd   Actual
SVM   210              15                  Ckd
      0                175                 NotCkd
NB    200              25                  Ckd
      0                175                 NotCkd

In this study, we applied machine-learning algorithms to a chronic kidney disease
dataset to distinguish patients with chronic kidney disease from those without,
based on the attribute data of each patient. Our goal was to compare different
classification models and identify the most effective one. We compared algorithms
drawn from the top 10 in data mining [30], SVM and NB (see Fig. 4). After
executing the classifiers, the results show the following.
We observed accuracies between 61% and 57%, indicating the percentage of cases
correctly classified. This is not inherent to the classifiers themselves but
depends on the application domain and data type. In our study, SVM achieved the
best accuracy (61.25%), followed by NB (56.5%). Since accuracy alone is not enough
to characterize classifier performance, we used several other criteria.
Another critical measure is the F-measure, which combines two performance
measurements: precision and recall. For patients with CKD, SVM achieves the best
rate (0.97), and for non-disease (NotCKD) cases it also achieves the best rate
(0.96) (Table 6).
The confusion matrix (Table 8) shows that SVM classifies correctly (385 cases)
with 15 miscalculated cases, followed by NB (375 cases) with 25 miscalculated.
SVM ranked ahead of NB, performing well in both classification and model-building
time. SVM has proven its performance as a powerful classifier with good accuracy
and minimal execution time, making it a good classifier to use in the medical
field for classification and evaluation (Fig. 5).

Fig. 5. Output interface of GFR



5 Conclusion and Future Work

In conclusion, the use of data mining methods for disease risk analysis is very
important in the health sector because it gives the power to fight diseases early
and therefore saves people's lives through timely treatment. In this work, we used
a number of learning algorithms to assess patients with chronic renal disease
(ckd) and patients without this disorder (notckd). Simulation results have shown
that the SVM classifier proved its performance by predicting the best results in
terms of accuracy and minimum execution time. The result of the testing does not
reach 100%, which may be due to the calculation and weighting. To get a more
accurate result, the similarity calculation has to be modified with weights.
Expert opinion will determine the most critical features (weights), which will be
used in the similarity computation through a weighted average; by doing that, the
most accurate calculation can be determined. In the future, data will be collected
using the accelerometer sensor for older people with non-neurodegenerative
diseases. A deep artificial neural network can then be applied to obtain better
performance, and many more details can be used to train the learning model.
Finally, the features that are most helpful in training the learning model will be
brought out. In the future, the information-driven approach may also be used to
remove uncertainty, as in a rule system based on expertise.

References
1. Andrew, K., Bradley, D., Shital, S.: Predicting survival time for kidney dialysis patients: a
data mining approach. Comput. Biol. Med. 35, 311–327 (2005)
2. Vijayarani1, S., Dhayanand, S.: Data mining classification algorithms for kidney disease
prediction. In: International Journal on Cybernetics & Informatics (IJCI), vol. 4 (2015)
3. Dulhare, U.N., Ayesha, M.: Extraction of action rules for chronic kidney disease using Naïve
bayes classifier. In: 2016 IEEE International Conference on Computational Intelligence and
Computing Research (ICCIC) pp. 1–5, IEEE (2016)
4. Firman, G.: Definition and Stages of Chronic Kidney Disease (2009) http://www.
medicalcriteria.com/site/index.php?option=com_content&view=article&id=142%
3Anefckd&catid=63%3Anephrology&Itemid=80&lang=en
5. Jiawei, H., Micheline, K.: Data Mining, Concepts and Techniques, Second (Eds.), Elsevier
Publication (2003)
6. Jinn-Yi, Y., Tai-Hsi, W., et al.: Using data mining techniques to predict hospitalization of
hemodialysis patients. Elsevier 50(2), 439–448 (2011)
7. Kirubha, V., Manju Priya, S.: Survey on data mining algorithms in disease prediction. Int.
J. Comput. Trends Technol. 38(3), 24–128 (2016)
8. Jena, L., Narendra, K.K.: Distributed data mining classification algorithms for prediction of
chronic- kidney-disease. Int. J. Emerg. Res. Manage. Technol. 4(11), 110–118 (2015)
9. Kumar, M.: Prediction of chronic kidney disease using random forest machine learning
algorithm. Int. J. Comput. Sci. Mobile Comput. 5(2), 24–33 (2016)
10. Dunham, M.H., Sridhar, S.: Data Mining: Introductory and Advanced Topics. Pearson
Education, Dorling Kindersley (India) Pvt Ltd (2006)
11. Naganna, C., Kunwar, S.V., Sithu, D.S.: Role of attributes selection in the classification of
chronic kidney disease patients. In: International Conference on Computing, Communication
and Security (ICCCS), pp. 1–6, 4–5 Dec 2015

12. National Kidney Foundation (NKF): Clinical practice guidelines for chronic kidney disease:
evaluation, classification and stratification. Am. J. Kidney Disease 39, 1–266 (2002)
13. Pavithra, N., Shanmugavadivu, R.: Survey on data mining techniques used in kidney related
diseases. Int. J. Modern Comput. Sci. 4(4), 178–182 (2016)
14. Pushpa, M.P.: Review on prediction of chronic kidney disease using data mining techniques.
Int. J. Comput. Sci. Mobile Comput. 5(5), 135–141 (2016)
15. Pushpa, M.P.: Review on prediction of chronic kidney disease using data mining techniques.
Int. J. Comput. Sci. Mobile Comput. 5(5), 135–141 (2016)
16. Ruey Kei, C., Yu-Jing, R.: Constructing models for chronic kidney disease detection and
risk estimation. In: Proceedings of 22nd IEEE International Symposium on Intelligent
Control, Singapore, pp. 166–171.International Conference on Internet of Things and
Applications (IOTA), p. 5. Pune, IEEE (2011)
17. Dilli, A., Thirumalaiselvi, R.: Review of chronic kidney disease based on data mining
techniques. Int. J. Appl. Eng. Res. vol. 12(23), 13498–13505 (2017)
18. Sujata, D., Gurdeep, S.D., Sugandha, S., Bharat, B.: Chronic kidney disease prediction using
machine learning: a new approach. Int. J. Manage. Technol. Eng
19. Tabassum, S., Mamatha Bai, B.G., Jharna, M.: Analysis and prediction of chronic kidney
disease using data mining techniques. Int. J. Eng. Res. Comput. Sci. Eng. (IJERCSE) 4(9),
25–32 (2017)
20. Uma, N.D., Ayesha, M.: A review on prediction of chronic kidney disease using
classification techniques. In: 4th International Conference on Innovations in Computer
Science & Engineering (2016)
21. Wu, X., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, GJ., Ng,
A., Liu, B., Philip, S.Y., Zhou, Z.H.: Top 10 algorithms in data mining. Knowl. Inf. Syst. 14
(1):1–37 (2008)
22. Vasant, P., Zelinka, I., Weber, G.W.: Intelligent Computing & Optimization. Springer,
Berlin (2018)
23. Intelligent Computing and Optimization. In: Proceedings of the 2nd International
Conference on Intelligent Computing and Optimization 2019 (ICO 2019), Springer
International Publishing, ISBN 978–3-030-33585 -4 (2019)
24. Lominadze, D., Schuschke, D.A., Joshua, I.G., Dean, W.L.: Increased ability of erythrocytes
to aggregate in spontaneously hypertensive rats. Clin. Exp. Hypertens. 24(5), 397–406
(2002). https://doi.org/10.1081/ceh-120005376
25. Dulhare, U.N., Ayesha, M.: Extraction of action rules for chronic kidney disease using Naïve
bayes classifier. In: 2016 IEEE International Conference on Computational Intelligence and
Computing Research (ICCIC), pp. 1–5, (2016) 10.1109/ICCIC.2016.7919649
26. Naganna, C., Kunwar, S.V., Sithu, D.S.: Role of attributes selection in classification of
chronic kidney disease patients. In: International Conference on Computing, Communication
and Security (ICCCS), pp. 1–6, 4–5 Dec 2015
27. Dua, D., Graff, C.: UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine,
CA: University of California, School of Information and Computer Science (2019)
28. UCI Machine Learning Repository: Kidney failure Data Set. https://archive.ics.uci.edu/ml/
datasets/Chronic_Kidney_Disease
29. Overview of the KDD Process. http://www2.cs.uregina.ca/*dbd/cs831/notes/kdd/1_kdd.
html
30. http://www.datasciencecentral.com/profiles/blogs/python-resources-for-top-data-mining-
algorithms
31. Basma, B., Hajar, M., Abdelkrim, H.: Performance of data mining techniques to predict in
healthcare case study: chronic kidney failure disease. In: International Journal of Database
Management Systems (IJDMS), vol. 8 (2016)
Multi-classification of Brain Tumor Images
Based on Hybrid Feature Extraction Method

Khaleda Akhter Sathi and Md. Saiful Islam(✉)
Department of Electronics and Telecommunication Engineering, Chittagong University of
Engineering and Technology, Chittagong-4349, Bangladesh
{sathi.ete,saiful05eee}@cuet.ac.bd

Abstract. The development of brain cancer treatment is fully dependent on


physician’s and radiologist’s knowledge. An automated classification method
can improve the physician’s knowledge to accelerate the treatment process. This
paper aims at developing an artificial neural network (ANN) based automated
brain tumor classifier with a hybrid feature extraction scheme. The classification
approach is started with the preprocessing of the tumor images using the min-
max normalization rule. Then the statistical features of the preprocessed images
are extracted utilizing the hybrid feature extraction method comprised of sta-
tionary wavelet transform (SWT) and Gray level co-occurrence matrix (GLCM)
techniques to enhance the classification performance. Finally, the ANN is
employed for classifying the brain tumors into the most frequent brain tumor
types: glioma, meningioma, and pituitary. The proposed approach provides an
improved classification accuracy of 96.2%, which is much better than modern
multi-classification methods, owing to the SWT-based hybrid feature extraction
method rather than the conventional discrete wavelet transform (DWT) technique.

Keywords: Hybrid feature extraction · Gray level co-occurrence matrix ·
Stationary wavelet transform · Brain tumor classification · Artificial neural
network

1 Introduction

Recently, machine learning (ML) approaches have shown remarkable performance in
the area of medical image classification. Learning approaches are partitioned into
two classes, supervised and unsupervised. Commonly used supervised learning
approaches, such as the support vector machine (SVM), random forests (RF),
artificial neural network (ANN), naive Bayes, etc., train the network with
predefined target values [1–5]. On the contrary, unsupervised learning approaches
such as the convolutional neural network (CNN), k-nearest neighbor (KNN), fuzzy
c-means, etc. train the network with unknown target values [6, 7]. In terms of
brain tumor classification, the supervised learning approach shows excellent
performance compared with the unsupervised approach. In addition, the supervised
learning approach has another great advantage, low computational complexity, which
makes it preferable for effective classification. For instance, Sultan et al. [7]
developed a classifier based on an

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 989–999, 2021.
https://doi.org/10.1007/978-3-030-68154-8_83

unsupervised learning method using the CNN algorithm for classifying brain tumor
images into different classes. The accuracy of the classifier was found to be
approximately 98.7% in categorizing the brain tumors into different classes.
Moreover, Gumaei et al. [8] employed a supervised learning method named
regularized extreme learning to classify MRI images as glioma, meningioma, and
pituitary, respectively. The classification method provides an accuracy of
approximately 94.233% because of using PCA-based normalized GIST as a hybrid
feature extraction technique.
In another study, Kaplan et al. [9] designed a modified feature extraction
technique named local binary pattern (LBP), based on angle and distance, to
classify brain tumors into three different classes. Different classification
methods were then applied using KNN, ANN, RF, and linear discriminant analysis
(LDA). Among these, KNN showed the highest classification accuracy of 95.56% with
the distance-based LBP feature extraction method. Ismael et al. [5] developed a
classification method using a multilayer perceptron for MRI brain tumor
classification. A hybrid feature extraction process combining DWT and a Gabor
filter was employed to produce input statistical data for the classification
network, which was trained with the back-propagation algorithm and provided a
highest accuracy of 91.9% for classifying brain disease into multiple classes.
The major contributions of this work are summarized as follows: (i) we develop an
automatic classification method for classifying brain tumors into three classes;
(ii) we implement a simple and effective hybrid feature extraction technique that
uses SWT and GLCM to produce a total of 13 statistical features containing the
most significant information of the tumor images; (iii) we enhance the
classification accuracy with the aid of these extracted statistical features; and
(iv) we analyze the efficacy of the designed method by comparing it with existing
classification methods that use the same dataset.
The rest of the paper is arranged as follows: Sect. 2 explains the methodology of
the proposed system with an extensive explanation of each step. Section 3 presents
the results with analysis, and at the end of the section a comparative study is
also conducted. Finally, Sect. 4 gives concluding remarks.

2 Proposed Approach for Multi-class Tumor Classification

Figure 1 illustrates the flow diagram of the proposed multi-class classification
method, in which the system starts with the preprocessing of the tumor images.
Then, without any segmentation process, the hybrid feature extraction method
composed of SWT and GLCM is applied. The hybrid technique produces a total of 13
statistical features of the brain tumor images. Finally, the ANN classifies the
brain tumor images according to these statistical features.

[Figure: Brain Tumor Images → Image Preprocessing (Min-Max Normalization) → Image
Feature Extraction (SWT and GLCM → Statistical Features) → Image Classification
(ANN) → Glioma / Meningioma / Pituitary]
Fig. 1. Flow diagram of the proposed multi-classification method.

2.1 Brain Tumor Dataset


The tumor dataset used in this work was obtained from 233 patients, with three
types of brain tumor images at different slices: (a) 994 axial images, (b) 1045
coronal images, and (c) 1025 sagittal images. This T1-weighted contrast-enhanced
image dataset is provided by Cheng [10] and comprises 3064 brain tumor MRI images.
The dataset is composed of 708 meningioma images, 1426 glioma images, and 930
pituitary images. Each image has a size of 512 × 512 pixels. The dataset is
organized in MATLAB data format, with each image accompanied by its tumor label,
mask, border, and patient ID.

2.2 Image Preprocessing Based on Min-Max Normalization


Before the preprocessing starts, the original images of size 512 × 512 are
downsized to 256 × 256 pixels to enhance the performance of the network at lower
computational cost. The preprocessing then applies the min-max normalization rule,
which has a positive impact on the quality of feature extraction as well as on
tumor classification. The following min-max normalization rule [8] is utilized for
improving the contrast of the brain edges and regions:

f ðx; yÞ  Vmin
f ðx; yÞ ¼ ð1Þ
Vmax  Vmin

where $f(x, y)$ represents each pixel value of the brain image, and $V_{min}$ and
$V_{max}$ represent the minimum and maximum values of the image. This
normalization rule excludes all the negative and large values of the input image,
whose values range from 0 to 255 as displayed in Fig. 2(a). After preprocessing,
the images are converted into intensity-based contrast-enhanced images with values
from 0 to 1, as illustrated in Fig. 2(b).
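A minimal NumPy sketch of this step (Eq. 1) follows; the placeholder array stands in for a downsized MRI slice.

import numpy as np

def minmax_normalize(img: np.ndarray) -> np.ndarray:
    """Rescale pixel values to [0, 1] as in Eq. (1)."""
    vmin, vmax = img.min(), img.max()
    return (img - vmin) / (vmax - vmin)

slice_256 = np.random.rand(256, 256)     # placeholder for a downsized MRI slice
normalized = minmax_normalize(slice_256)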

Fig. 2. Preprocessing steps; (a) The input image, and (b) The contrast enhanced output image.

2.3 SWT and GLCM Based Feature Extraction


Feature extraction is the main part of the proposed classifier, generating a
precise and unique feature vector from the preprocessed images. For this purpose,
a total of thirteen statistical features of the brain tumor image are extracted
based on the SWT and GLCM methods.

2.3.1 Stationary Wavelet Transform


In general, the conventional DWT decomposes the input image into different levels by convolving the image with a filter and downsampling the result. As a consequence, a limitation called translation variance occurs in the output decomposed image, whose size is halved at each decomposition level l [11]. For this reason, the SWT is employed instead: it produces an equal number of pixels at each level of the output decomposed image, at the cost of inherent redundancy. In the first step of hybrid feature extraction, a 3-level SWT is employed with the coif1 wavelet. The 1-level decomposition yields four sub-band images, denoted approximation (LL), diagonal (HH), vertical (HL), and horizontal (LH). The LL sub-band is then used to compute the 2-level decomposition. Finally, at the 3-level decomposition, the tumor image produces eight sub-band images, denoted LLL, LLH, LHL, LHH, HLL, HLH, HHL, and HHH. From this 3-level SWT decomposition, the approximation sub-band LLL is used for further processing and for computing nine statistical features (mean, variance, standard deviation, entropy, root mean square (RMS), inverse difference moment (IDM), smoothness, kurtosis, and skewness), shown in Table 1.
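A rough illustration of this step in Python, assuming the PyWavelets, NumPy, and SciPy packages (our own choice of tools, not ones named by the paper); it iterates a 1-level SWT with the coif1 wavelet to reach the LLL sub-band and computes most of the nine sub-band statistics of Table 1 (IDM, being co-occurrence based, is computed in the GLCM sketch that follows):

import numpy as np
import pywt
from scipy.stats import kurtosis, skew

def swt_lll(image, wavelet="coif1", levels=3):
    # Keep only the approximation (LL) band at each of the 3 levels,
    # yielding LLL; image sides must be divisible by 2**levels
    ll = image.astype(np.float64)
    for _ in range(levels):
        (ll, (lh, hl, hh)), = pywt.swt2(ll, wavelet, level=1)
    return ll

def swt_statistics(ll):
    x = ll.ravel()
    hist, _ = np.histogram(x, bins=256)
    p = hist / hist.sum()
    p = p[p > 0]                          # drop empty bins before the log
    return {
        "mean": x.mean(),
        "variance": x.var(),
        "std": x.std(ddof=1),
        "entropy": float(-(p * np.log2(p)).sum()),
        "rms": float(np.sqrt((x ** 2).mean())),
        "smoothness": 1 - 1 / (1 + x.var()),
        "kurtosis": float(kurtosis(x)),
        "skewness": float(skew(x)),
    }

lll = swt_lll(np.random.rand(256, 256))
print(swt_statistics(lll))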

Table 1. List of statistical features.

Features             Equation
Mean                 x̄ = (1/N) Σ_{i=1..N} x_i
Entropy              e = −Σ_{i,j} p(i,j) log(p(i,j))
Kurtosis             K = (1/σ⁴) Σ_{x,y} (f(x,y) − x̄)⁴
Standard deviation   s = sqrt[ (1/(N−1)) Σ_{i=1..N} (x_i − x̄)² ]
RMS                  Y = sqrt[ Σ_{i,j} |u_{ij}|² / M ]
IDM                  I = Σ_{i,j} p(i,j) / (1 + (i − j)²)
Variance             σ² = (1/N) Σ_i (x_i − x̄)²
Smoothness           R = 1 − 1/(1 + σ²)
Skewness             S = (1/σ³) Σ_{x,y} (f(x,y) − x̄)³
Energy               E = Σ_{i,j} p(i,j)²
Contrast             C = Σ_{i,j} |i − j|² p(i,j)
Homogeneity          H = Σ_{i,j} p(i,j) / (1 + |i − j|)
Correlation          C1 = Σ_{i,j} p(i,j) (i − μ_i)(j − μ_j) / (σ_i σ_j)

2.3.2 Gray Level Co-occurrence Matrix


The GLCM used in this work operates on image pixels to compute second-order statistics that yield textural features [12]. In this process, an 8 × 8 matrix is generated from the decomposed image by measuring the horizontal adjacency of gray-level intensity value i with neighboring pixels of value j. Normalization is then performed on each element (i, j) of the matrix to calculate the four remaining statistical features (energy, contrast, correlation, homogeneity) described in Table 1.
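A hedged sketch of this step with scikit-image (our choice of library, not one named by the paper; note that skimage's 'homogeneity' uses (i − j)² in the denominator, a slight variation on Table 1):

import numpy as np
from skimage.feature import graycomatrix, graycoprops  # skimage >= 0.19 naming

def glcm_features(sub_band, levels=8):
    # Quantize the LLL sub-band to 8 gray levels, matching the 8 x 8 GLCM
    q = np.interp(sub_band, (sub_band.min(), sub_band.max()),
                  (0, levels - 1)).astype(np.uint8)
    # distance 1, angle 0 -> horizontal adjacency of value i with neighbor j
    glcm = graycomatrix(q, distances=[1], angles=[0],
                        levels=levels, normed=True)
    return {
        "energy": float(graycoprops(glcm, "ASM")[0, 0]),        # sum p(i,j)^2
        "contrast": float(graycoprops(glcm, "contrast")[0, 0]),
        "correlation": float(graycoprops(glcm, "correlation")[0, 0]),
        "homogeneity": float(graycoprops(glcm, "homogeneity")[0, 0]),
    }

print(glcm_features(np.random.rand(256, 256)))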

2.4 Image Classification Using ANN


The ANN is a powerful tool for solving non-linear composite problems and approximating universal mappings between inputs and outputs [13]. Generally, the network is a fully connected feed-forward multilayer perceptron consisting of an input layer, a succession of hidden layers, and an output layer [14]. The layers are indexed l = 0, …, L − 1, where L is the number of layers and each layer l contains n_l neurons. The extracted features serve as the input-layer neurons for classifying the images.

Fig. 3. The proposed network model of ANN.

Table 2. Accuracy of the proposed network at different hidden neurons.

Number of hidden neurons   Root Mean Square Error (RMSE)
150                        0.04982
200                        0.06987
250                        0.04172
300                        0.05022
350                        0.05168

The implementation of the classification network for accurately identifying brain tumor types from MRI brain images is shown in Fig. 3. The network is designed with three layers: (a) an input layer with 13 neurons, (b) a hidden layer with 250 neurons, and (c) an output layer with 3 neurons. Moreover, the network was trained with several hidden-layer sizes (150, 200, 250, 300, and 350) to find the optimum model with the minimum root mean square error (RMSE). Table 2 displays the RMSE values for these sizes; based on them, the superior ANN model is the third one, with an RMSE of 0.04172 for 250 neurons. The target data are one-hot encoded: all elements of each target vector are set to '0' except the element for the represented class, which is set to '1'. In this work, the total image dataset is split into 70% training, 15% testing, and 15% validation sets. Training is performed with the feed-forward back-propagation algorithm, updating the network weights and biases according to Levenberg-Marquardt optimization.
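For illustration only, a minimal training sketch with scikit-learn; since Levenberg-Marquardt (MATLAB's trainlm) is not available in common Python libraries, this stand-in uses the Adam solver, and the feature matrix below is a random placeholder for the 13 extracted features:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Placeholders: X would be the (3064, 13) feature matrix, y the class labels
X = np.random.rand(3064, 13)
y = np.random.randint(0, 3, 3064)   # 0: meningioma, 1: glioma, 2: pituitary

# 70% train / 15% validation / 15% test, as in the paper
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.30, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=0)

# 13-250-3 feed-forward network; Adam replaces Levenberg-Marquardt here
clf = MLPClassifier(hidden_layer_sizes=(250,), max_iter=500, random_state=0)
clf.fit(X_train, y_train)
print("validation accuracy:", clf.score(X_val, y_val))
print("test accuracy:", clf.score(X_test, y_test))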

3 Results Analysis

The performance of the classifier in terms of training, testing, and validation is shown in Fig. 4. The mean squared error (MSE) is minimized at epoch 33, where the classifier performs best with a minimum error of approximately 0.04.

Fig. 4. The mean squared error of the proposed classifier algorithm.

Figure 5 presents the testing performance of the modeled classifier based on the
confusion matrix (CM) for classifying the brain tumor images into three classes. The
predicted classes (system output) are represented along the x-axis while the actual
classes (ground truth) are represented along the y-axis. From this matrix, the Accuracy,
Precision, and Sensitivity have been calculated using the following equations.

Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%    (2)

Precision = TP / (TP + FP) × 100%    (3)

Sensitivity = TP / (TP + FN) × 100%    (4)

where:
• TP (true positive) = positive samples correctly identified as positive.
• TN (true negative) = negative samples correctly identified as negative.
• FP (false positive) = negative samples incorrectly identified as positive.
• FN (false negative) = positive samples incorrectly identified as negative.
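A small sketch of Eqs. (2)–(4) applied per class to a 3 × 3 confusion matrix; the diagonal and the marginals below match Table 3, while the split of the off-diagonal errors is our own illustrative assumption:

import numpy as np

def per_class_metrics(cm):
    # rows = actual class, columns = predicted class
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    out = {}
    for k in range(cm.shape[0]):
        tp = cm[k, k]
        fn = cm[k, :].sum() - tp       # actual k, predicted otherwise
        fp = cm[:, k].sum() - tp       # predicted k, actually otherwise
        tn = total - tp - fn - fp
        out[k] = (100 * (tp + tn) / total,   # accuracy, Eq. (2)
                  100 * tp / (tp + fp),      # precision, Eq. (3)
                  100 * tp / (tp + fn))      # sensitivity, Eq. (4)
    return out

# Order: glioma, meningioma, pituitary; off-diagonals are hypothetical
cm = [[1375, 47, 4],
      [30, 666, 12],
      [12, 12, 906]]
print(per_class_metrics(cm))   # glioma accuracy ~96.96%, as in Table 3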

Fig. 5. The confusion matrix of the proposed multi-class classifier.

The accuracy metrics derived from the confusion matrix are shown in Table 3. The highest performance in accuracy, precision, and sensitivity is found for the pituitary class. The achieved accuracy is 96.96% for glioma, 96.70% for meningioma, and 98.69% for pituitary tumors.

Table 3. Accuracy metrics for three classes of tumors.


Tumor Type TP TN FP FN Accuracy Precision Sensitivity
Glioma 1375 1596 42 51 96.96% 0.9703 0.9642
Meningioma 666 2297 59 42 96.70% 0.9186 0.9406
Pituitary 906 2118 16 24 98.69% 0.9934 0.9741

Figure 6 presents the receiver operating characteristic (ROC) curve of the proposed classifier over different threshold values, showing the classifier's ability to identify the brain tumor types correctly. The area under the ROC curve is 0.97, 0.96, and 0.98 for glioma, meningioma, and pituitary tumors, respectively.

Fig. 6. ROC curve of modeled classifier.

The performance of the proposed approach is assessed by comparing it with existing methods. Classification results from existing works that use the same brain tumor types with different feature extraction methods and classifiers are summarized in Table 4. From this table, it is evident that the proposed classifier substantially improves accuracy by utilizing the SWT and GLCM based hybrid feature extraction technique. In addition, the accuracy of the modeled classifier is evaluated with different feature extraction methods (DWT, SWT, DWT + GLCM, and SWT + GLCM), as displayed in Fig. 7. The graph shows that the classifier significantly improves multi-classification accuracy when both SWT and GLCM techniques are combined, compared with the other feature extraction techniques.

Table 4. Proposed model comparison with the existing model.

Reference   Features Extraction Technique       Classifier               Accuracy (%)
[5]         DWT + Gabor filter                  BPNN                     95.7
[7]         –                                   CNN                      96.1
[8]         Hybrid PCA-NGIST                    RELM                     94.2
[9]         Local binary pattern (LBP)          KNN                      93.3
            aLBP                                                         90.6
            nLBP                                                         95.6
[15]        DWT + Gabor wavelet + GLCM + PCA    Support Vector Machine   72
This work   SWT + GLCM                          ANN                      96.2

Fig. 7. Variation of modeled classifier accuracy with different feature extraction techniques.

4 Conclusion

This paper introduced a hybrid feature extraction technique combined with an ANN for classifying brain tumor MRI images into three classes: glioma, meningioma, and pituitary. The accuracy of the ANN classifier improves considerably over contemporary classifiers because of the SWT and GLCM based hybrid feature extraction method, which extracts the statistical information of the preprocessed tumor images. Moreover, the proposed ANN classifier achieves a satisfactory classification accuracy of 96.2%, improving the discrimination capability among the three types of brain tumor images. Therefore, the modeled classifier can have an enormous impact on the clinical diagnosis and treatment of brain cancer. In the future, the proposed approach can be applied to other biomedical classification problems.

References
1. Rashid, M.H., Mamun, M.A., Hossain, M.A., Uddin, M.P.: Brain tumor detection using
anisotropic filtering, SVM classifier and morphological operation from MR images. In: 2018
International Conference on Computer, Communication, Chemical, Material and Electronic
Engineering (IC4ME2), pp. 1–4. IEEE (2018)
2. Raju, A.R., Pabboju, S., Rao, R.R.: Brain image classification using dual-tree m-band
wavelet transform and naïve bayes classifier. In: Intelligent Computing in Engineering 2020,
pp. 635–642. Springer, Singapore (2020)
3. Zhang, L., Han, Z., Islem, R., Yaozong, G., Qian, W., Dinggang, S.: Malignant brain tumor
classification using the random forest method. In: Joint IAPR International Workshops on
Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern
Recognition (SSPR), pp. 14–21. Springer, Cham (2018)

4. Siddiqui, M.F., Mujtaba, G., Reza, A.W., Shuib, L.: Multi-class disease classification in
brain MRIs using a computer-aided diagnostic system. Symmetry 9(3), 37 (2017)
5. Ismael, M.R., Abdel-Qader, I.: Brain tumor classification via statistical features and back-
propagation neural network. In: 2018 IEEE International Conference on Electro/Information
Technology (EIT), p. 0252. IEEE (2018)
6. Ramdlon, R.H., Entin, M.K., Tita, K.: Brain tumor classification using MRI images with K-nearest neighbor method. In: 2019 International Electronics Symposium (IES), pp. 660–667.
IEEE (2019)
7. Sultan, H.H., Salem, N.M., Al-Atabany, W.: Multi-classification of brain tumor images
using deep neural network. IEEE Access 27(7), 69215–69225 (2019)
8. Gumaei, A., Hassan, M.M., Hassan, M.R., Alelaiwi, A., Fortino, G.: A hybrid feature
extraction method with regularized extreme learning machine for brain tumor classification.
IEEE Access 11(7), 36266–36273 (2019)
9. Kaplan, K., Kaya, Y., Kuncan, M., Ertunç, H.M.: Brain tumor classification using modified
local binary patterns (LBP) feature extraction methods. Med. Hypo. 25, 109696 (2020)
10. Cheng, J.: Brain tumor dataset (version 5) (2017). https://doi.org/10.6084/m9.figshare.1512427.v5
11. Kumar, R., Gupta, A., Arora, H.S., Pandian, G.N., Raman, B.: CGHF: a computational
decision support system for glioma classification using hybrid radiomics-and stationary
wavelet-based features. IEEE Access 21(8), 79440–79458 (2020)
12. Vasant, P., Ivan, Z., Gerhard-Wilhelm, W.: Intelligent computing & optimization (ICO
2018). Vol. 866. Springer (2018)
13. Daoud, M., Mayo, M.: A survey of neural network-based cancer prediction models from
microarray data. Artif. Intell. Med. 1(97), 204–214 (2019)
14. Badwan, J.: Predicting the quality of mis characteristics and end-users’ perceptions using
artificial intelligence tools: expert systems and neural network. In: Intelligent Computing and
Optimization: Proceedings of the 2nd International Conference on Intelligent Computing and
Optimization 2019 (ICO 2019). Vol. 1072. Springer Nature (2019)
15. Mathew, A., Reema, P., Babu, A., Thara, N.K.: Brain tumor segmentation and classification
using DWT, Gabour wavelet and GLCM. In: 2017 International Conference on Intelligent
Computing, Instrumentation and Control Technologies (ICICICT), pp. 1744–1750. IEEE
(2017)
An Evolutionary Population Census
Application Through Mobile Crowdsourcing

Ismail Hossain Mukul1,3, Mohammad Hasan1,2(✉), and Md. Zahid Hassan1,2

1 Chittagong University of Engineering and Technology (CUET), Chattagram, Bangladesh
hasancse.cuet13@gmail.com
2 Department of CSE, Bangladesh Army University of Science and Technology (BAUST), Saidpur, Bangladesh
3 Samsung R & D Institute, Dhaka, Bangladesh

Abstract. Mobile Crowdsourcing System (MCS) has emerged as an effective method for data collection and processing. This paper gives a brief discussion of the Mobile Crowdsourcing concept, whose main criteria are followed throughout. The government or the census bureau plays the role of end user; the internet provider and monitoring supervisors play the role of service provider; and smartphone users play the role of workers. For counting the population, the whole country is divided into regions, and each region into several sub-regions. Each sub-region has a supervisor and a number of selected workers, whose reliability and authenticity are checked through verification of their personal information. An authenticated worker can collect information from a sub-region, and the supervisor is able to determine the worker's location. After information is collected, redundancy is checked using the National Identity or birth registration number; the data is stored after verification and used to produce statistics including total population, population density, birth rate, death rate, literacy rate, etc. The population and household census is an important process for a country. Therefore, a census system has been proposed and designed that performs the whole census process with stronger authentication, reduces cost and time, and makes calculation faster.

Keywords: Census system · Statistics · Census bureau · Mobile crowdsourcing · Cloud

1 Introduction

The census is a serious and complex process which requires mapping the entire country, mobilizing and training a large number of enumerators, conducting a massive public campaign, canvassing all households, collecting individual information, compiling vast amounts of data in paper or electronic form, and analyzing and disseminating the data. In most developing countries, censuses are the primary source of data, and they support the decisions required for a country's development [1, 2].

Mobile Crowdsourcing (MCS) is a crowdsourcing system in which mobile phones are used to collect various types of data such as images/voice/video, location, and ambient information, taking advantage of the sensing, communication, and computing capabilities of the extensively available mobile devices [4, 5]. MCS can perform instant data collection in a more flexible and cheaper way than traditional methods. Thus, MCS provides a way to involve and utilize both human and machine intelligence [2, 3].
The traditional census process collects personal information by visiting each household, which is difficult to manage and takes too much time [14].
With the development of modern technology, people have come to depend on it. More than 140 million people, out of Bangladesh's population of 160 million, use mobile phones, while around 80 million people access the internet, according to Bangladesh Telecommunication Regulatory Commission (BTRC) statistics [10]. The number of people with internet access has also grown by 18% in one year. An MCS-based digital census process is a more time- and cost-effective census process for a country [6, 7]. As a result, decisions can be made within a short time, which benefits the country [15].
One of the biggest challenges of conducting a census in poor countries is the enormous financial cost of the exercise. As a result, many poor countries cannot conduct censuses as regularly as richer countries do. Information collection requires many census forms and extensive planning for every household all over the country. Since it is not feasible to provide a census form to every household, workers and proper training are needed to perform this task [16, 17].
An MCS-based census application can be used for many purposes, mainly population and household censuses. Using this app, a user can retrieve the full information of a region such as a division, district, or thana, and can look up his/her own information given during the census period [11–13].
The main objective of this work is to design and implement a system that can help
to collect, measure and monitor the whole census process of a country.
The key objective and possible outcomes of this work may mention in the
following:
• To design a system that can perform the whole census process of a country.
• To implement the system so that it can produce statistics from the required information.
• To evaluate the system's performance in terms of accuracy, precision, and recall.
Section 1 describes introductory concepts and the motivation and contribution of the work. The following sections go through different aspects of the project "Population Census process through MCS" in detail. Section 2 covers the terminologies related to the system which are important for understanding the Mobile Crowdsourcing process. Section 3 contains a brief discussion of the project, the necessary tools, and its correspondence with the Mobile Crowdsourcing process. Section 4 discusses the full system architecture and the internal workings of the whole system. Section 5 describes the experimental results and evaluation of the system. Sections 6 and 7 state the summary and discuss future recommendations for this project.

2 Terminologies Related to MCS

In this section we present the terminologies related to the system which are important for understanding the Mobile Crowdsourcing process. This section also contains a brief discussion of related previous work.
There are many applications implemented using the mobile crowdsourcing concept. Some examples of related work are described below:
Quora is a popular website and an example of crowdsourcing: the crowd can answer posted questions simultaneously [18].
Uber provides a private car while on the go. This on-demand service is already available in over forty countries and saves waiting time in the taxi line. The app allows riders to pay via PayPal or credit card, compare fare rates, and be picked up from their current location within a few minutes [17].
A Mobile Crowdsourcing System (MCS) consists of three main parts, namely the MCS Service Provider (SP), the end user, and the MCS worker. The SP accepts service requests from MCS end users and handles them: it selects proper MCS workers and assigns relevant tasks to them. MCS end users are the requesters of MCS services, who can make a request to the workers through the SP. A mobile user may act as a worker of the MCS system by performing the assigned tasks.
The tools required for designing the census application are described below:
Java is an Object-Oriented Programming (OOP) language like PHP and C++. To develop an Android application using Java, a development environment is needed; Android Studio is a very popular and powerful IDE for Java.
XML stands for Extensible Mark-up Language. XML is a design tool for Android: it defines a set of rules for encoding a document in a format that is both human-readable and machine-readable [8, 9].
To develop an application for the Android platform, the Android SDK (software development kit) is used, which provides a set of development tools.
Firebase is a mobile and web app development platform which provides many services, such as phone and email authentication, storage, a real-time database, and information security.
The Firebase Realtime Database is a cloud-hosted NoSQL database that stores and synchronizes data between users in real time. Firebase Authentication provides backend services, easy-to-use SDKs, and ready-made UI libraries to authenticate app users.
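As a hedged server-side illustration (the app itself uses the Android SDK; the service-account file, database URL, and field values below are placeholders), a record can be pushed to the Firebase Realtime Database with the Python Admin SDK:

import firebase_admin
from firebase_admin import credentials, db

# Placeholder credentials and database URL
cred = credentials.Certificate("serviceAccount.json")
firebase_admin.initialize_app(cred, {"databaseURL": "https://census-app-example.firebaseio.com"})

# Push one census record; Firebase assigns a unique key and syncs it
# to connected clients in real time
db.reference("final_personal_information").push({
    "nid": "1990123456789",     # hypothetical National Identity No
    "name": "Example Person",
    "address_code": "150426",   # hypothetical six-digit address code
})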

3 Methodology

In this work we focus on developing a crowdsourcing-based census system that is able to perform the census process. This section focuses on the overall architecture of the proposed system and the procedure to achieve it in detail.

3.1 Proposed System


Our main focus is to design and implement a system that can perform the census process using the concept of mobile crowdsourcing. The traditional, manual census system is replaced by a digital census process using smartphones to reduce cost and time. Figure 1 shows the general structure of the census process as implemented by the mobile crowdsourcing system and machine learning.

Fig. 1. General form of MCS based Census process

3.2 Census Procedure


See (Fig. 2).

Fig. 2. Flow model of census process

3.1.1.1 Download and Installing


One can download the census application from the Census Bureau website. The census application is designed for collecting the required information and is owned by the census bureau of a country. It is an Android application that runs on the Android platform.

3.1.1.2 Phone Verification


When a user of an application or website attempts to sign in after a long period of inactivity, a mobile phone verification process can help to ensure once again that the user is genuine. Phone verification is required whenever a new user tries to access the application (Fig. 3).

Fig. 3. Phone verification process

3.1.1.3 Registration Process


Registration requires a valid National Identity No; a person without a National Identity No cannot take part in the census process. Registration succeeds by filling in the required information fields (address, phone number, educational qualification, age, institution name and address), and a successful registration depends on these data (Fig. 4).

Fig. 4. Worker registration process

3.1.1.4 Worker Activities


A worker plays an important role during the census process. His/her main task is to act as a reliable worker and to enter valid information, because a worker's reward from the supervisor is based on the reliability of the uploaded information. A worker can be removed for dishonesty. A worker performs three types of activities (Fig. 5):
• Collect people's information
• Show the uploaded information
• Show different regions' information

Fig. 5. Information collection

3.1.1.5 Handwriting vs Mobile Viewed Census form


In 2011, the Bangladesh Bureau of Statistics (BBS) designed a census form. It was a printed copy carried by the worker to the households, and a worker did not know in advance how many forms would be needed. The MCS-based census system provides an electronic census form that avoids this situation (Fig. 6).

Fig. 6. Handwritten form replaced with mobile-viewed form

3.3 Database Design


The database has entities such as 'User verification', 'User details', 'Final Personal information', 'Personal information', 'Birth Reg No', and 'Passport No'. The 'User verification' entity contains the details of a user according to his National Identity Card. When a user requests registration, his/her National Identity No is checked first. Finally, information is uploaded into the 'Final Personal information' entity after the person's identity is verified. The 'Birth Reg No' and 'Passport No' entities are both used for checking the user's identity.

4 Implementation Details
4.1 Implementation of Application Front-End
4.1.1 Phone Verification Module
A worker inputs his phone number in the required field, and a verification code is sent to this number within a few seconds. If the phone number is already active on this phone, the app automatically reads the verification code.

4.1.2 Registration Module


Only a user or worker with a valid National Identity (NID) No can perform such tasks. After the data is uploaded, it is stored in the relevant database table.

4.1.3 Login Module


A successful registration assigns a user ID to a user according to his/her address. Using this user ID and password, a user is permitted to perform the next activities. The user ID is generated by the system from the address code and the user's serial number. There are two login options, one for users and another for supervisors. A user cannot log in without registration, so before signing in a user must obtain his/her user ID from the system by completing the registration process.

4.1.4 Dashboard and User Location Module


After a successful registration, a user can perform the following tasks:
• Add a person's information
• Check a person's information
• Find a region's details
Here, Google Maps is integrated for detecting the worker's current location.

4.1.5 Information Collection and Uploading Module


People's information is collected by a worker and entered into the required fields. The form is designed according to the 2011 census form designed by the Bangladesh Bureau of Statistics (BBS).

4.1.6 User Information Checking Module


Users can check a person's information using the National Identity (NID) No, birth registration No, or passport No under which the data is stored in the database.

4.1.7 Region Information Checking Module


Region information is organized according to the address code of the stored data. The address code consists of six digits: the first two digits refer to the Division code, the next two to the District code, and the last two to the Upazilla code.
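A small sketch of decoding this six-digit address code (the sample code value is hypothetical):

def parse_address_code(code):
    # First two digits: Division; next two: District; last two: Upazilla
    assert len(code) == 6 and code.isdigit()
    return {"division": code[:2], "district": code[2:4], "upazilla": code[4:]}

print(parse_address_code("150426"))
# {'division': '15', 'district': '04', 'upazilla': '26'}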

Fig. 7. All modules captured from App front end in a single image

In Fig. 7 above, the worker first completes the registration process with his mobile number and confirms the number by submitting a verification code received from the server. Someone without an account must first create one and then log in with that username and password. After logging in, the worker can add a voter and see the voter list by region. He can add a new voter's information by entering the necessary data in the required fields, including the voter's name, age, district, thana, sex, educational qualification, etc., and then submit it. At any time, a worker can see the voter list as well as detailed information about the voters by region.

5 Experimental Result and Analysis

In this section we discuss what information is measured in this system and how, as well as the performance of our proposed application based on a survey of user feedback. The survey was performed among 50 people. The experimental and survey results are given in the tables and charts of the subsequent sections.

5.1 Information and Calculation


The information to be calculated after the census process includes: total population, population density, birth rate, death rate, number of men, number of women, numbers of Muslims, Hindus, Buddhists, Christians, and tribal people, literacy rate, and income per head.

5.1.1 Calculation Process


Let,
N_v = number of validated information uploads
P_t = total population
P_prv = population from the previous census result
P_prs = population from the present census result
P_inc = increase in population over ten years
A_t = area of a region
P_wl = population capable of writing a letter
I_t = total income

Equations:
Total population: P_t = N_v
Population density = P_t / A_t
Birth rate = P_inc / 10
Death rate = (P_prv + P_inc − P_prs) / 10
Rate of literacy = (P_wl / P_t) × 100
Income per head = I_t / P_t
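The same calculations in a short Python sketch (the sample inputs are purely illustrative):

def census_statistics(n_v, area, p_prv, p_inc, p_wl, total_income):
    # Total population equals the number of validated uploads
    p_t = n_v
    p_prs = p_t                       # present census result
    return {
        "total_population": p_t,
        "population_density": p_t / area,
        "birth_rate": p_inc / 10,
        "death_rate": (p_prv + p_inc - p_prs) / 10,
        "literacy_rate": 100 * p_wl / p_t,
        "income_per_head": total_income / p_t,
    }

# Illustrative numbers only
print(census_statistics(n_v=160_000_000, area=147_570,
                        p_prv=140_000_000, p_inc=25_000_000,
                        p_wl=110_000_000, total_income=3.2e11))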

5.2 Survey Chart


From our experiment we can assess the performance of the app; it is now clear which features are suitable for the app and which are not (Fig. 8).

Fig. 8. Pie chart of survey results for 'Yes' (left) and 'Need to develop' (right)

6 Conclusion

Nowadays the Mobile Crowdsourcing System (MCS) is an important and very popular technology, and many applications are designed based on it. It is increasingly used in communication systems, census processes, and business. MCS makes life easier and more comfortable for a person or an organization. The population census is an important process for a country, since census results are used to measure changes in the country's condition, and a census normally takes a long time and a huge cost. An MCS-based census application provides an easy way to complete a country's census process. The main platform used to design this application is Android, on which thousands of developers worldwide are dedicated to building world-class applications. Alongside Android, Google provides Firebase, a mobile and web application platform that supports developing such applications and offers a high-quality, secure database for the developer. The application logic is implemented using the Java library; as Android development is Java-based, integration is quite flexible. Using the Firebase platform in the background also keeps the application secure.

6.1 Future Recommendations


To further reduce information redundancy, a unique ID should be used for each person. The 'Birth Registration No' of each person is unique; when the 'Birth Reg No' of every person in a family is available, it can be used to reduce data redundancy, since no one can input the same information twice if it already exists. If a worker visits another region and collects information there, this is quickly recognized by the supervisor, and a warning is automatically sent to the user who broke the rules.
If the Government or Census Bureau wants to calculate more information, the system and the calculation process can be modified so that the information is calculated efficiently.

References
1. Feng, W., Yan, Z., Zhang, H., Zeng, K., Xiao, Y., Hou, Y.T.: A survey on security, privacy,
and trust in mobile crowdsourcing. IEEE Internet Things J. (2018). https://doi.org/10.1109/
JIOT.2017.2765699
2. Feng, W., Yan, Z.: MCS-chain: decentralized and trustworthy mobile crowdsourcing based
on blockchain. Future Gen. Comput. Syst. (2019). https://doi.org/10.1016/j.future.2019.01.
036
3. Ma, Y., Sun, Y., Lei, Y., Qin, N., Lu, J.: A survey of blockchain technology on security,
privacy, and trust in crowdsourcing services. World Wide Web (2020). https://doi.org/10.
1007/s11280-019-00735-4
4. Wang, Y., Huang, Y., Louis, C.: Respecting user privacy in mobile crowdsourcing. Science
2(2), 50 (2013)

5. Yuen, M.C., King, I., Leung, K.S.: A survey of crowdsourcing systems. In: Proceedings -
2011 IEEE International Conference on Privacy, Security, Risk and Trust and IEEE
International Conference on Social Computing, PASSAT/SocialCom 2011 (2011). https://
doi.org/10.1109/PASSAT/SocialCom.2011.36
6. Wang, Y., Huang, Y., Louis, C.: Towards a framework for privacy-aware mobile
crowdsourcing. In: Proceedings - SocialCom/PASSAT/BigData/EconCom/BioMedCom
2013 (2013). https://doi.org/10.1109/SocialCom.2013.71
7. Yuen, M.C., King, I., Leung, K.S.: A survey of crowdsourcing systems. In: Proceedings -
2011 IEEE International Conference on Privacy, Security, Risk and Trust and IEEE
International Conference on Social Computing, PASSAT/SocialCom 2011 (2011). https://
doi.org/10.1109/PASSAT/SocialCom.2011.36
8. Fang, C., Yao, H., Wang, Z., Wu, W., Jin, X., Yu, F.R.: A survey of mobile information-
centric networking: research issues and challenges. IEEE Commun. Surv. Tutorials (2018).
https://doi.org/10.1109/COMST.2018.2809670
9. Phuttharak, J., Loke, S.W.: Logiccrowd: A declarative programming platform for mobile
crowdsourcing. In: Proceedings - 12th IEEE International Conference on Trust, Security and
Privacy in Computing and Communications, TrustCom 2013 (2013). https://doi.org/10.
1109/TrustCom.2013.158
10. Liza, F.Y.: Factors influencing the adoption of mobile banking: perspective Bangladesh.
Glob. Discl. Econ. Bus (2014). https://doi.org/10.18034/gdeb.v3i2.164
11. Jonathon Rendina, H., Mustanski, B.: Privacy, trust, and data sharing in web-based and
mobile research: participant perspectives in a large nationwide sample of men who have sex
with men in the United States. J. Med. Internet Res. (2018). https://doi.org/10.2196/jmir.
9019
12. Pan, Y., de la Puente, M.: Census Bureau guideline for the translation of data collection
instruments and supporting materials: documentation on how the guideline was developed.
Surv. Meth. (2005)
13. U.S. Census Bureau. Population and Housing Unit Estimates. Annual Estimates of the
Resident Population for the United States, Regions, States, and Puerto Rico (2018)
14. Australian Bureau of Statistics: 2071.0 - Census of Population and Housing: Reflecting Australia - Stories from the Census, 2016 (2017). https://doi.org/10.4018/978-1-59904-298-5
15. United Nations Secretariat Dept. of Economic and social Affairs Statistics Division Census
Data Capture Methodology, New York, September 2009
16. https://digital.gov/2013/12/03/census-mobile-app-showcases-localstatistics/
17. https://www.maketecheasier.com/crowdsourcing-mobile-apps/
18. https://www.quora.com
IoT-Enabled Lifelogging Architecture
Model to Leverage Healthcare Systems

Saika Zaman1(✉), Ahmed Imteaj1,2, Muhammad Kamal Hossen1, and Mohammad Shamsul Arefin1

1 Chittagong University of Engineering and Technology (CUET), Chittagong 4349, Bangladesh
saika_24@yahoo.com
2 School of Computing and Information Sciences, Florida International University, Miami, FL 33199, USA

Abstract. As the world’s population is exploding day by day, the num-


ber of patients and hospital capacity is also increasing due to high-
demands. This situation leads to engaging more people to monitor the
overall situation of a hospital. However, it is quite difficult to observe the
cabin room, and the patient thoroughly 24 h. To tackle such a situation,
we have propounded a scalable IoT-based system, where a large number
of hospital cabin and the patient can be monitored without any hassle.
We leverage a mechanism that can handle many clients and their related
data and undertake immediate actions based on the situation. For this
purpose, we use Raspberry Pi as our main server that is capable of ana-
lyzing a large number of hospital cabins’ and patients’ data. Particularly,
Raspberry Pi performs analysis based on receiving data that are related
to environmental conditions, the patient’s body movement, and pulse
rate. The environment can be monitored by observing the amount of
CO2 and the temperature of a cabin room that helps us to track a fire
situation and also allows us to realize if a cabin has an overwhelming
number of people. Moreover, if a patient faces any issue, we can track
that based on the patients’ body movement and pulse rate. If the system
discovers any unexpected situation, it immediately raises a buzzer and
notifies the administrator.

Keywords: IoT · Raspberry Pi · Wireless communication · Sensors · Hospital cabin · Patient

1 Introduction
The number of hospitals in urban areas is increasing day by day, and many hospitals contain several cabins. It is difficult to monitor each cabin individually to avoid unwanted environmental conditions such as fire, excessive noise, improper room temperature, rising CO2, over-attendance of guests, smoking inside the cabin, etc. Over-attendance of guests creates noise and raises the room temperature and CO2 concentration, which consequently disturbs the patients


and increases the possibility of spreading infections or diseases. Besides, sometimes the patient or a relative secretly smokes inside the hospital cabin, which may cause serious harm to the rest of the patients. Furthermore, it is even cumbersome to observe a patient's pulse rate and movement for 24 hours.
Hence, it is a toilsome task to monitor the environmental conditions inside the cabin and the patient's state for the whole day. The proposed system makes an effort to tackle such unexpected situations. It aims to make it convenient for the hospital authority to observe each individual cabin and monitor the patients effortlessly. For successful deployment of the propounded system, we use a set of sensors (smoke/gas, temperature, CO2, sound, pulse, and tilt sensors), a microcontroller (WeMos D1), a router, and the whole setup is controlled through a Raspberry Pi 3 (a credit-card sized micro-computer). The Raspberry Pi 3 has a 1.2 GHz 64-bit quad-core ARMv8 CPU, 802.11n wireless LAN, Bluetooth 4.1, 1 GB RAM, 4 USB ports, an Ethernet port, etc., which is why it is capable of multitasking efficiently and can store a massive amount of data. We also use the wireless microcontroller WeMos D1, as we want to constitute a comprehensive wireless system. The sensors' analog data is passed to the microcontroller (WeMos D1) of each individual cabin, and the WeMos D1 converts the analog data into digital form. After that, the
data will be passed wirelessly via a router to the Raspberry Pi 3. The Raspberry
Pi 3 will then observe each microcontroller (WeMos D1) by going through each
IP listed in it, requesting data, receiving the data, and processing them. After
that, it will compare the data with the standard values set by the hospital authority to keep the situation under control. If the observed result sent from any WeMos D1 becomes unusual, the Raspberry Pi 3 will immediately trace the IP
and inform the admin about any unanticipated occurrence in the specific cabin.
The hospital authority will then take the necessary steps to solve it by sending a
nurse or other human resources. Our system’s main motivation is to monitor the
hospital cabins environment and patient health conditions such as pulse rate and
movement rate for critical patients. Nowadays, as the number of patients within
a hospital proliferates, the number of cabins inside the hospital also increases.
It is very challenging to monitor cabins for 24 h. Besides, it is costly to keep any
system manual. In our proposed approach, a wireless communication medium is
used to do the work efficiently and centrally. It is necessary to keep the envi-
ronmental condition inside the cabin under control. As the cabins are isolated
in a hospital, what is happening inside the cabin remains unknown. Sometimes,
the room temperature or the amount of CO2 gas increases for over-attendance
of the guest and creates excessive noise, which could be harmful for the patient.
Besides, any germ/virus can easily infect the patients carried by the visitors. On
the other hand, it is impossible to observe all the patients inside the cabins for
24 h. The patient’s monitoring pulse rate is essential so that proper steps can
be taken if any unexpected situation occurs. Sometimes it is indispensable to
observe some critical patients’ degree of movement. To do the above work, we
design a system to monitor the hospital cabins centrally and efficiently, so if any

unwanted situation occurs, the hospital authority can take proper steps. Here
we implement a wireless communication system, which is also cost-effective.

2 Background and Present State of the System

The evolution of IoT technology has increased the effectiveness of real-time decision making by collecting data from various environments and situations. Numerous research works have already been conducted explaining the importance [1,2], paradigms [3,4], architectural design [5,6], applications [7–12], and challenges [13–17] of designing scalable IoT-based systems. The author in [18] propounded a system for patient monitoring in hospital management using Li-Fi technology instead of Wi-Fi. Using Li-Fi, they transmit a patient's temperature, heartbeat, glucose, and respiration data to hospital management. However, no direct signal was displayed for instantly taking action when the patient's physical condition was critical, and no unwanted environmental conditions of the hospital were measured so that steps could be taken to handle unexpected situations. The authors in [19] presented the design and implementation of a framework for monitoring patients in a hospital using wireless sensors in an ad hoc configuration. However, they did not specify which sensors would be used to collect the patients' data, and although they proposed using the doctors' and nurses' PDA devices to observe the patient data, no central monitoring system was specified. The authors in [20] proposed an IoT-based system for monitoring hospital-ward patients who were undergoing a surgical procedure and being monitored in an ICU. They built the IoT-based method using three types of sensors and suggested using the Raspberry Pi B+ model as a gateway; however, that Raspberry Pi model does not contain a built-in Wi-Fi feature. A system by which a doctor can observe multiple patients both
at home or in hospital by monitoring patient temperature, heart rate, and ECG, using the eHealth concept, is proposed in [21]. They wanted to implement a wireless, cheap, and easy-to-operate system for sending the patient data to the doctor, using sensors and Bluetooth technology to transmit it, and they mentioned that the method could be improved by increasing efficiency and capability. The authors in [22] developed a system model for indoor air monitoring and health reporting, implemented through a mobile app using air pollutant sensors together with self-reports from people affected by polluted air. However, the self-reporting component is not well suited to developing countries, because many people are not concerned about their health and cannot operate the technology required for self-reporting. The authors in [23]
using an electronic micro nose (EN). They want to develop a system that can
obtain random information about air pollution and detect an existing gaseous
precursor that can cause fires by using EN as a household device. Their tech-
nique could detect gaseous precursors that can cause a fire but cannot identify
the fire if it takes place accidentally. The authors in [24] propounded a system

for monitoring air quality using various gas sensors. The sensor data are stored on an SD card and sent through a wireless medium to the internet (database) and the web server, where the received data are processed to produce a final air quality result. They stated that the data would be transmitted to the database only when a proper internet connection is available, but keeping a decent internet connection available 24 hours a day is not possible in developing countries.
The authors in [25] proposed a wireless system for air quality monitoring, designed with two gas sensors and one temperature sensor. A microcontroller receives the sensors' data and sends it to a PC over a wireless medium to calculate the values; if the calculated result is not satisfactory, the microcontroller generates an alarm. They said they would use wireless media, but which wireless medium would be used was not defined, and the medium they used could only send 64 bytes every five seconds, for which reason they got a lower bit rate than expected.
Sneha et al. [26] designed an embedded system for monitoring air quality using air quality sensors and temperature sensors. A microcontroller receives the sensor data and sends it to a processor, which compares it with the values a patient can sustain; if the result is not satisfactory, a message is sent over the phone to the authority so that a proper air purifier or humidifier/dehumidifier can be applied. They developed the system with a wired electrical circuit; however, nowadays wireless media are cheaper, easier to install, and more intuitive to use. Several works have addressed monitoring environmental issues and patient health conditions, but most systems are designed either for air quality monitoring or for patient monitoring, and most previous work is based on wired systems that are not as intuitive and efficient as our system. Making the system completely wireless may make it simpler for the relevant authority. Sometimes the environmental condition of a hospital cabin worsens and the authority cannot take proper steps to control the unwanted situation because the information does not arrive at the right time; likewise, a patient's health condition may deteriorate while the patient is alone in the cabin. In our proposed system, we monitor both the cabin's environmental conditions and, to some extent, the patient's health condition. We implement the system using the wireless microcontroller WeMos D1 and the microcomputer Raspberry Pi 3 to process the whole workflow wirelessly. Using a microcomputer like the Raspberry Pi 3 ensures efficacy, low cost, scalability, low power usage, and overall proficiency.
scalability, low power usage, and overall proficiency.

3 System Description
3.1 System Architecture
Our system’s main purpose is to monitor the hospital cabins’ environment and
patient physical conditions centrally so that the hospital authority can take
proper steps to control the situation. In this proposed system, we used WeMos

Fig. 1. Lifelogging architecture of our proposed system.

D1 to take analog data from the sensors and convert it into digital data. Each hospital cabin has its own WeMos D1 to read that cabin's sensor data. The WeMos D1 then continuously broadcasts the received data over its IP address to the router, while a Raspberry Pi 3 performs the central work. An IP address list exists in both the router and the Raspberry Pi
3. The Raspberry Pi 3 takes an IP address from the IP list and establishes a
connection with the WeMos D1, which holds that IP address. The Raspberry Pi
3 asks for data from the Wemos D1 through a router. The router then matches
the given IP address by the Raspberry Pi 3 with the IP address of the related
WeMos D1 from the IP address list, which it contains. Then the WeMos D1
sends the digital data to the Raspberry Pi 3 through the router. Raspberry Pi
3 checks the received data and compares it to the standard values given by the hospital authority. If the Raspberry Pi 3 finds any unusual data, it will inform the
admin. Based on that, the admin will take proper steps to control the unwanted
situation. Then, the Raspberry Pi 3 will move to the next WeMos D1 from the IP
address list. The overall system architecture and block diagram of our proposed
method are presented in Fig. 1 and Fig. 2, respectively.

3.2 Logic Diagram

The logic diagram of our system is depicted in Fig. 3. At first, each WeMos D1 retrieves the sensor values from five individual sensors. After that, the Raspberry Pi scans the IP of each WeMos D1 from a list of IP addresses and uses it to request data. If the read succeeds, the Raspberry Pi receives the sensor data, where A1 is the temperature sensor, A2 the CO2 sensor, A3 the movement sensor, A4 the pulse sensor, and A5 the noise/sound sensor. For each sensor, a threshold

Fig. 2. Block diagram of our proposed system.

Fig. 3. Logic diagram of our proposed system.

value is given to the Raspberry Pi. If the received value of A1 is greater than 33 °C or A5 is greater than 10 dB, the buzzer alarm is set and the admin computer is informed; otherwise, the next sensor's data is examined. If the received value of A2 is greater than 4000 kPa, the alarm is set and the admin is informed; otherwise, the following sensor data is analyzed. If the value of A3 is greater than 4.5°, the admin computer is informed and the alarm is set; otherwise, the next sensor is read. If the received value of A4 is greater than 16000 bp, the admin computer is informed and the buzzer alarm is set. If every sensor's received value is equal to or below its threshold, the data received from that WeMos D1 is accepted. The Raspberry Pi then switches to the IP of the next WeMos D1 in the IP list and reads its data, and the whole system proceeds in the same way.
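A compact sketch of this threshold logic as it might run on the Raspberry Pi (the dictionary layout is our own assumption; the threshold values are taken from Fig. 3):

THRESHOLDS = {
    "temperature": 33.0,   # deg C (A1)
    "co2": 4000.0,         # kPa   (A2)
    "movement": 4.5,       # deg   (A3)
    "pulse": 16000.0,      # bp    (A4)
    "noise": 10.0,         # dB    (A5)
}

def check_cabin(readings):
    # Return sensors whose reading exceeds the threshold;
    # an empty list means the cabin data is accepted
    return [name for name, value in readings.items()
            if value > THRESHOLDS[name]]

alarms = check_cabin({"temperature": 35.2, "co2": 900.0,
                      "movement": 1.0, "pulse": 12000.0, "noise": 8.0})
if alarms:
    print("set buzzer and notify admin:", alarms)   # ['temperature']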

3.3 Communication Between Server and Client


Here the microcontrollers (WeMos D1) are the clients, and the Raspberry Pi 3 is the server. The interaction between the WeMos D1 and the Raspberry Pi 3 proceeds in the following steps:
Step 1: WeMos D1 receives an analog signal from the sensor and converts
it into digital data.
Step 2: WeMos D1 then continuously broadcast it over IP address to the
router.
Step 3: Raspberry Pi 3 takes an IP address of WeMos D1 from the IP list
and establishes a connection with it.
Step 4: Raspberry Pi 3 asks for data from the clients and gets sensor data
via a router.
Step 5: Raspberry Pi 3 checks and analyzes the data. If the data is unusual,
it prompts the admin.
Step 6: Raspberry Pi 3 moves to the next WeMos D1 in the IP list.
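A hedged Python sketch of this polling loop on the Raspberry Pi side, using the requests package; the /sensors endpoint and the JSON field names are assumptions, and check_cabin() is the threshold sketch shown earlier:

import time
import requests   # pip install requests

CABIN_IPS = ["192.168.0.101", "192.168.0.102"]   # IP list of WeMos D1 clients

while True:
    for ip in CABIN_IPS:                          # Step 3: next IP from the list
        try:
            # Step 4: request the sensor data via the router (endpoint assumed)
            readings = requests.get(f"http://{ip}/sensors", timeout=2).json()
        except requests.RequestException:
            continue                              # skip unreachable cabins
        alarms = check_cabin(readings)            # Step 5: analyze the data
        if alarms:
            print(f"cabin {ip}: unusual data {alarms} -> notify admin")
    time.sleep(5)                                 # Step 6: loop over all cabins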

3.4 Set up the Value of the Sensors


Temperature Sensor: For measuring the temperature value, we use the following reasoning. If the received analog reference voltage is 1.1 V and we use a 10-bit ADC, then 1.1 V is divided by 2^10 steps when converting the analog data into digital, which gives 1.0742 mV per step. We know that the LM35 output changes by 10 mV per 1 °C, so dividing 10 mV by 1.0742 mV gives about 9.31 ADC steps per 1 °C of temperature change. At a 5 V reference, the equation used for calculating the temperature value T (in °C) is:

ADC = T × 1024 / (5 × 100)

Noise Sensor: For measuring the noise value, we use two equations. For converting the analog data into digital, the ADC resolution is 1024 (10-bit) and the system voltage is 5 V, so:

Voltage = ADC reading × 5 / 1024

After measuring the converted voltage, we convert it into decibels, since sound is measured in decibels, using:

Converted data = 20 × log10(Received data / Source voltage)

Gas and Pulse Sensors: For the gas and pulse sensors, the raw analog-to-digital values are used directly.
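The conversions above, collected into a small Python sketch (the sample ADC counts are illustrative):

from math import log10

def lm35_temperature(adc_value, bits=10):
    # Invert ADC = T * 1024 / (5 * 100): 10 mV per deg C at a 5 V reference
    return adc_value * (5.0 * 100) / (2 ** bits)

def sound_level_db(adc_value, vref=5.0, bits=10, source_voltage=5.0):
    # ADC count -> volts, then relative level in dB (negative below full scale)
    volts = adc_value * vref / (2 ** bits)
    return 20 * log10(volts / source_voltage)

print(lm35_temperature(67))    # ~32.7 deg C
print(sound_level_db(512))     # ~ -6 dB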

4 System Implementation

4.1 Hardware Implementation

In our proposed system, we used two controller boards: a Raspberry Pi, which can be regarded as a nano computer, and a WeMos D1 R2, a wireless-communication-enabled microcontroller board. The WeMos D1 R2 code is imported into every WeMos D1 R2 with its specific IP address and IP source. The analog readings of the sensors are passed to their respective WeMos D1 R2 through the ADS1115 analog-to-digital converter module, which handles communication with the analog sensors. The converted digital data is then transmitted onward to the Raspberry Pi. The PHP programming language has been used to receive the data at the server, and the Python programming language to process the data and execute commands on the Raspberry Pi based on the sensor readings received through the WeMos D1 R2 and ADS1115 module.
R2 is the micro-controller that will be acting as a client in the proposed system.
The Raspberry Pi is the central controller of our system that acts as a server. In
the proposed system, the Raspberry Pi fits perfectly because it has 1 Ghz 64-bit
single-core BCM2835 with 512 MB ram, one micro USB port, a mini HDMI
port, and HAT-compatible 40 GPIO pins which can read and write digital data
which is more than enough for the system. Unlike Raspberry Pi 3, Raspberry Pi
also has built-in a wireless module with 802.11 b/g/n wireless LAN and Blue-
tooth 4.1 with BLE feature by which we can transmit the data faster with less
energy consumption. For implementing the system, we have used WeMos D1
R2, a board compatible with both Arduino and NodeMCU, with some extra features. Unlike the Arduino Uno, it has 11 digital input/output pins, and all pins support interrupt/PWM/I2C/one-wire (except D0). It has one analog input (3.2 V max input) and a micro USB port used both for communication and as a 5 V power source. Its power jack can handle 9–24 V input. Most usefully, it uses an ESP-8266EX microcontroller, enabling wireless communication and providing 4 MB of flash memory, and it can operate at both 80 MHz and 160 MHz clock speeds. Unfortunately, the WeMos D1 R2 has only one built-in analog pin, in which respect it falls short of the Arduino Uno. However, the WeMos D1 R2 has a built-in I2C port, which remedies the problem. I2C is a multi-master-slave serial computer bus used for attaching lower-speed peripheral ICs to the microcontroller. Using the I2C port, the other sensors have been connected through the ADS1115, a 16-bit, four-channel programmable analog-to-digital converter. The ADS1115 has ten connectors, including SDA, SCL, an address pin, an alert pin, and four analog pins

denoted A0–A3. The ADS1115 uses the SDA and SCL pins of the WeMos D1 R2 to communicate with it. The address pin is for configuring the bit rate; we will be using 16 bits to make the threshold more precise. The four analog pins are used for taking data from the analog sensors.
The system uses a total of five sensors for collecting the data of the hospital
cabin. They are LM35 temperature sensor, KY−038 microphone sound sensor,
MG811 CO2 gas sensor module, ADXL345 Tilt/movement sensor, and pulse
sensor SEN-11574. The LM35 temperature sensor reacts to changes in the surrounding temperature: its output drops 10 mV per 1 °C change in temperature, which is what allows it to measure the environment's temperature. It has three output pins; from the flat side of the sensor, the first is VCC, which can handle 4 V to 30 V, the second is the signal pin, which is connected to the A0 port of the WeMos D1 R2 board, and the last is the ground pin.
The KY-038 microphone sound sensor module has been used to detect the sound level in a cabin. It consists of four pins: VCC, GND, A0, and D0. The VCC and GND pins take a power input of a constant 5 V. The module can provide both analog and digital output, but as we measure the sound level, we use the A0 pin, which is connected to the A0 pin of the ADS1115 module. The value returned from the ADS1115 is thresholded by the WeMos D1 R2. The MG811 gas sensor module has an MG811 gas sensor embedded
with the L293 interface module. The MG811 sensor is sensitive to CO2 gas and operates at a voltage of 5–6 V. The MG811 gas sensor has six pins, but the L293 interface reduces them to four, the same four pins as on the KY-038
sensor module. The ADXL345 Tilt sensor is a small, thin, low powered 3-axis
MEMS sensor, also known as an accelerometer sensor with a high resolution
of 13bit measurement at up to (+−) 16g. This sensor can be communicated
via both SPI and I2C digital interface. We will be using the I2C interface for
communicating with the sensor. The sensor can be operated from 2.0−3.6 V with
a 40 uA current supply. The sensor can sense the static acceleration of gravity
and dynamic acceleration resulting from motion or shock. The sensor is set under
the cabin bed of the patient to sense the patient’s inappropriate behavior. The
pulse sensor SEN-11574 will be the only sensor that will be physically connected
with the patient to measure the patient’s pulse rate. The sensor is very straight
forward and has three output pins: VCC, GND, and signal pin. As the pulse
sensor is analog, a graph can be directly generated from its sensor value via
Arduino IDE. The sensor will be connected through the ADS1115 module, and
the data will be processed for providing precise human-readable output.

4.2 Steps of Implementation


In this section, we discuss our overall implementation in detail through a step-by-step process.
Step 1 (WeMos D1 R2 setup): After configuring the code for the WeMos D1 R2, the sketch should be burnt to the ESP8266 with the correct port, baud rate, and operating frequency. Once the code is successfully uploaded, other peripherals can be connected to the device.

Fig. 4. Circuit diagram of the WeMos D1 integrated with the sensors.

Fig. 5. Circuit diagram of the wireless Raspberry Pi and buzzer.

Step 2 (Temperature Sensor Connecting): The LM35 sensor operates at 5 V, which can be supplied from the WeMos. The signal pin is connected to the A0 pin of the WeMos.
Step 3 (Connecting ADS1115 module): The ADS1115 communicates via the I2C bus interface. The SDA and SCL pins are connected to the SDA and SCL ports of the WeMos, respectively. The address pin is grounded for 16-bit operation. The analog pins are then ready for further use.
Step 4 (KY-038 Module Connecting): The A0 pin of the KY-038 is connected to the A0 pin of the ADS1115 module. The module operates at 5 V, so the VCC of the KY-038 should be connected to a 5 V port of the WeMos.
Step 5 (MG811 CO2 Gas Sensor Connecting): The A0 pin of the MG811 module should be connected to the A1 pin of the ADS1115 module. This module's operating voltage is 5–6 V, so an op-amp should be used for a proper power supply.
Step 6 (ADXL345 Sensor Connecting): As we use the I2C bus interface to communicate with the module, the software declaration of the SDA and SCL ports should be digital ports 1 and 2 of the WeMos D1 R2, and the sensor's SDA and SCL pins are connected to those pins. The other three pins, INT1, INT2, and SDO, should be connected to three other digital pins according to the software code declaration.
Step 7 (Connecting Pulse Sensor): The signal pin of the pulse sensor is connected to the A2 pin of the ADS1115 module. The sensor operates at 2.0–3.6 V, so its VCC should be connected to the 3.3 V port of the WeMos.
Step 8 (Connecting Buzzer with Raspberry Pi): The buzzer operates at 5 V. The buzzer's positive pin should be connected to pin 23 of the Pi Zero, and the other pin should be grounded (a server-side sketch of this buzzer logic is given after Table 1).
Step 9 (Powering up the devices): Both the WeMos D1 R2 and the Raspberry Pi operate on a 5 V power supply, but the two devices should be powered from separate supplies as the system is wireless. For proper operation of the system, the router should also be powered up.
Figures 4 and 5 present the connections of the WeMos with the sensors and the circuit diagram of the Raspberry Pi and buzzer.

5 Experimental Results and Evaluation


After integrating the sensors with the WeMos D1, we obtained each sensor's values and checked them by serial monitoring. We then implemented a mechanism to write those data into a file and read the sensor data back from that file. All the sensor data for each cabin are uploaded to a server that is queried by the program. After converting the analog data into digital data, the WeMos D1 transfers the data to the Raspberry Pi, which compares each received value with the given threshold value. If the temperature or noise sensor value is greater than 33 °C or 10 dB, the buzzer alarm turns on. Likewise, if the amount of CO2 gas is greater than 4000 ppm, the patient's pulse is greater than 16,000 bpm, or the patient's movement is greater than 4.5°, the buzzer alarm turns on. This paper uses threshold values taken from prior works, which are given in Table 1. Figure 6 shows the serial monitoring of our experimental data after uploading to the server. Figure 7 shows the data collected from each cabin, uploaded to our server and displayed on a webpage. Figures 8 and 9 display the buzzer ON/OFF status based on the data retrieved from each cabin.

Table 1. Threshold values of the different sensors.

Parameter    Threshold value    Reference
Temperature  30 °C              [27]
CO2 gas      2500 ppm           [28]
Noise        10 dB              [29]
Pulse        13000 bpm          [30]
Movement     5°                 [31]
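To make this server-side logic concrete, the following is a minimal Python sketch of the Raspberry Pi's threshold check and buzzer control, using the Table 1 thresholds and the buzzer on GPIO pin 23 described in Step 8. This is an illustrative sketch, not our deployed code: the read_cabin_data helper is a hypothetical stand-in for however the WeMos-supplied readings reach the server.

import time
import RPi.GPIO as GPIO

BUZZER_PIN = 23  # buzzer positive pin on the Pi Zero (Step 8)

# Threshold values from Table 1 (units as listed there)
THRESHOLDS = {
    "temperature": 30.0,   # deg C
    "co2": 2500.0,         # ppm
    "noise": 10.0,         # dB
    "pulse": 13000.0,      # bpm
    "movement": 5.0,       # degrees of tilt
}

def read_cabin_data(cabin_id):
    # Hypothetical stand-in: in the real system this would read the latest
    # values the WeMos uploaded to the server for the given cabin.
    return {"temperature": 28.4, "co2": 1200.0, "noise": 6.0,
            "pulse": 72.0, "movement": 1.2}

def exceeded_parameters(readings):
    # Return the parameters whose reading is greater than the threshold
    return [name for name, value in readings.items()
            if name in THRESHOLDS and value > THRESHOLDS[name]]

GPIO.setmode(GPIO.BCM)
GPIO.setup(BUZZER_PIN, GPIO.OUT)
try:
    while True:
        readings = read_cabin_data(cabin_id=1)
        # Buzzer ON while any sensor exceeds its threshold, OFF otherwise
        GPIO.output(BUZZER_PIN,
                    GPIO.HIGH if exceeded_parameters(readings) else GPIO.LOW)
        time.sleep(1)
finally:
    GPIO.cleanup()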

Fig. 6. Experimental data after serial monitoring.

Fig. 7. Data retrieved from each cabin uploaded on a web server.



Fig. 8. Buzzer turns ON when the movement sensor value (= 5.26) exceeds the threshold (= 4.5).

Fig. 9. Buzzer turns ON when a sensor value is greater than its threshold and turns OFF while the sensor value is less than the threshold.

6 Conclusion
This paper implements an autonomous wireless communication system to observe the surrounding environment of hospital cabins and the patients' health condition. In this system, each cabin's sensor data can be reported to the admin if a received value falls below or rises above the pre-set threshold values. The advantage of our system is that a hospital's overall situation can be monitored centrally at little cost with an effective setup. This approach would reduce hospitals' manual activity, and the hospital authority would be notified immediately if any unusual condition occurs. The system is easy to operate, so any user can handle it, and it helps the hospital authority monitor all the cabins centrally 24 hours a day and take proper steps at the right time if any unwanted situation occurs. Our proposed implementation scheme can be adapted to other sectors of IoT applications, e.g., smart transportation, smart cities, smart living, and so on. In the future, we plan to discover patterns and identify patient behavior and status using the extracted data. We also plan to optimize the energy consumption of the IoT devices and perform local training without sharing sensitive information with any external entity.

References
1. Lee, I., Lee, K.: The Internet of Things (IoT): applications, investments, and chal-
lenges for enterprises. Bus. Horiz. 58(4), 431–440 (2015)
2. Madakam, S., Lake, V., Lake, V., Lake, V.: Internet of Things (IoT): a literature
review. J. Comput. Commun. 3(05), 164 (2015)
3. Farooq, M.U., Waseem, M., Mazhar, S., Khairi, A., Kamal, T.: A review on internet
of things (IoT). Int. J. Comput. Appl. 113(1), 1–7 (2015)
4. Singh, A., Payal, A., Bharti, S.: A walkthrough of the emerging IoT paradigm:
visualizing inside functionalities, key features, and open issues. J. Netw. Comput.
Appl. 143, 111–151 (2019)
5. Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.: Internet of Things (IoT): a
vision, architectural elements, and future directions. Future Generation Comput.
Syst. 29(7), 1645–1660 (2013)
6. Marjani, M., Nasaruddin, F., Gani, A., Karim, A., Hashem, I.A.T., Siddiqa, A.,
Yaqoob, I.: Big IoT data analytics: architecture, opportunities, and open research
challenges. IEEE Access 5, 5247–5261 (2017)
7. Imteaj, A., Rahman, T., Hossain, M.K., Alam, M.S., Rahat, S.A.: An IoT based fire
alarming and authentication system for workhouse using Raspberry Pi 3. In: 2017
International Conference on Electrical, Computer and Communication Engineering
(ECCE), pp. 899–904. IEEE (2017)
8. Imteaj, A., Rahman, T., Hossain, M.K., Zaman, S.: IoT based autonomous percip-
ient irrigation system using raspberry Pi. In: 2016 19th International Conference
on Computer and Information Technology (ICCIT), pp. 563–568. IEEE (2016)
9. Zhu, Z.-T., Ming-Hua, Yu., Riezebos, P.: A research framework of smart education.
Smart Learn. Environ. 3(1), 4 (2016)
10. Imteaj, A., Amini, M.H.: Distributed sensing using smart end-user devices: path-
way to federated learning for autonomous IoT. In: 2019 International Conference
on Computational Science and Computational Intelligence (CSCI), pp. 1156–1161.
IEEE (2019)
11. Kalaycı, B., Özmen, A., Weber, G.W.: Mutual relevance of investor sentiment and
finance by modeling coupled stochastic systems with MARS. Ann. Oper. Res. 1–24
(2020)
12. Imteaj, A., Rahman, T., Begum, H.A., Alam., M.S.: IoT based energy and gas
economic home automation system using Raspberry Pi 3. In: 2017 4th International
Conference on Advances in Electrical Engineering (ICAEE), pp. 647–652. IEEE
(2017)
13. Reyna, A., Martín, C., Chen, J., Soler, E., Díaz, M.: On blockchain and its integration with IoT: challenges and opportunities. Future Generation Comput. Syst. 88, 173–190 (2018)
14. Chen, S., Hui, X., Liu, D., Bo, H., Wang, H.: A vision of IoT: applications, challenges, and opportunities with China perspective. IEEE Internet of Things J. 1(4), 349–359 (2014)
15. Intelligent Computing and Optimization, Proceedings of the 2nd International Conference on Intelligent Computing and Optimization 2019 (ICO 2019). Springer International Publishing, ISBN 978-3-030-33585-4

16. Bhaumik, A., Roy, S.K., Weber, G.W.: Multi-objective linguistic-neutrosophic matrix game and its applications to tourism management. J. Dyn. Games (2019)
17. Vasant, P., Zelinka, I., Weber, G.W. (eds.): Intelligent Computing & Optimization, vol. 866. Springer (2018)
18. Sudha, S., Indumathy, D., Lavanya, A., Nishanthi, M., Merline Sheeba, D., Anand,
V.: Patient monitoring in the hospital management using Li-Fi. In: Proceedings of
Technological Innovations in ICT for Agriculture and Rural Development (TIAR),
pp. 93–96 (2016)
19. Habash, Z.A., Hussain, W., Ishak, W., Omar, M.H.: Android-based application to
assist doctor with Alzheimer’s patient. In: Proceedings of International Conference
on Computing and Informatics (ICOCI), 28–30 August (2013)
20. Archip, A., Botezatu, N., Şerban, E., Herghelegiu, P.C., Zală, A.: An IoT based
system for remote patient monitoring. In: Proceedings of 17th International Con-
ference in Carpathian Control (ICCC), pp. 1–6 (2016)
21. Ahmed, S., Millat, S., Rahman, M.A., Alam, S.N., Zishan, M.S.R.: Wireless health
monitoring system for patients. In: Proceedings of IEEE International WIE Con-
ference on Electrical and Computer Engineering (WIECON-ECE), pp. 164–167
(2015)
22. Ho, K.F., Hirai, H.W., Kuo, Y.H., Meng, H.M., Tsoi, K.K.: Indoor air monitoring
platform and personal health reporting system: big data analytics for public health
research. In: Proceedings of International Congress on Big Data, pp. 309–312 (2015)
23. Arnold, C., Harms, M., Goschnick, J.: Air quality monitoring and fire detection
with the Karlsruhe electronic micronose KAMINA. Proc. IEEE Sens. J. 2(3), 179–
188 (2002)
24. Marinov, M.B., Topalov, I., Gieva, E., Nikolov, G.: Air quality monitoring in urban
environments. In: Proceedings of 39th International Spring Seminar on Electronics
Technology (ISSE), pp. 443–448 (2016)
25. du Plessis, R., Kumar, A., Hancke, G.P., Silva, B.J.: A wireless system for indoor
air quality monitoring. In: Proceedings of 42nd Annual Conference of the IEEE
Industrial Electronics Society, pp. 5409–5414 (2016)
26. Jangid, S., Sharma, S.: An embedded system model for air quality monitoring.
In: Proceedings of Computing for Sustainable Global Development, pp. 3003–3008
(2016)
27. Thermal comfort of patients in hospital wards. https://www.ncbi.nlm.nih.gov/pubmed/264497
28. Carbon Dioxide Concentration - Comfort Levels. https://www.engineeringtoolbox.com/co2-comfort-level-d_1024.html
29. Decibel level of common sounds. https://www.hearingaidknow.com/too-loud-decibel-levels-of-common-sounds
30. Heart rate. https://en.wikipedia.org/wiki/Heart_rate
31. ADXL345 Digital Accelerometer. https://learn.adafruit.com/adxl345-digital-accelerometer?view=all
An Improved Boolean Load Matrix-Based
Frequent Pattern Mining

Shaishab Roy1, Mohammad Nasim Akhtar1, and Mostafijur Rahman2

1 Department of Computer Science and Engineering, Dhaka University of Engineering and Technology, Gazipur, Dhaka, Bangladesh
shaishab.cse@gmail.com
2 Department of Software Engineering, Daffodil International University, Dhaka, Bangladesh

Abstract. Frequent Pattern Mining (FPM) plays an essential role in data mining research. In the literature, many algorithms have been proposed to discover interesting association patterns. However, frequent pattern mining in a large-scale dataset is a complex task because a prohibitively large number of smaller patterns must be generated first in order to identify the extended patterns, and the complexity in terms of computational time is a big concern. In this paper, we propose an updated novel FPM algorithm that uses a boolean matrix and then decomposes the matrix vertically. We reduce the computational time by eliminating a few steps and by adding an itemset to the candidate set if and only if it satisfies the condition, instead of pruning it after it has been added. Our proposed algorithm reduces the computational time at least by a factor of two on the generation of the boolean matrix (i.e., Q), the frequent-1 matrix (i.e., F1), and the frequent-1 itemset (i.e., L1), which improves the efficiency of the algorithm.

Keywords: Data mining · Frequent pattern · Candidate set · Itemset · Boolean matrix

1 Introduction

Frequent pattern mining (FPM) is used to identify association rules among the items present in a dataset. This technique is also known as Association Rules Mining (ARM). By applying statistical methods, FPM can generate compelling and previously unrecognized patterns, itemsets, and substructures from the dataset, and it enables the discovery of important features from a given itemset. In many data mining techniques, FPM is used as a base technique for analyses such as association, relationship, and correlation.
Many algorithms have been proposed by different researchers to strengthen FPM techniques. However, improvements are still required to increase performance by reducing the computational time of the existing FPM algorithms. Well-reputed FPM algorithms include Apriori [1–3], FP-Growth [1–4], Partitioning [1], and Dynamic Itemset Counting (DIC) [6]. Apriori is one of the base algorithms for discovering frequent patterns.

The significant challenges of this algorithm are its lengthy computational time and massive memory consumption due to multiple scans of the transactional dataset. To reduce computational time and memory consumption, a boolean load-matrix based frequent pattern mining algorithm has been developed [15]. However, improvements are still required to increase the performance of the algorithm in terms of time complexity.
In this research paper, we propose an improved boolean load-matrix based novel algorithm to reduce computational time.
This paper is structured as follows. In Sect. 2, we review previous research reported in the literature. In Sect. 3, we propose an improved boolean load-matrix based novel algorithm. In Sect. 4, we discuss the obtained results with an example. In Sect. 5, we present a performance evaluation and complexity analysis of the existing and proposed algorithms. Finally, in Sect. 6, we conclude our research paper.

2 Related Work

Frequent pattern mining is a significant area of data science. Using FPM, huge amounts of data are mined to explore hidden patterns or subsequences for generating association rules. Several algorithms have been proposed by different researchers to strengthen FPM techniques. Still, there is scope to improve algorithm efficiency by reducing the execution time of finding frequent patterns. The work mentioned in [1, 3] proposed efficient algorithms to mine association rules from a transaction dataset. The authors of [1] proposed an improved algorithm to generate new frequent itemsets, but it still needed to scan the previous frequent itemset to compute support, which is also a costly process. Our proposed algorithm avoids the pruning technique; instead of pruning, an itemset is included in the candidate itemset only when it satisfies the criteria. Many hybrid Apriori algorithms have been developed and proposed by different researchers [3, 7–9]. In [2], the authors proposed eight performance metrics, such as execution time, memory consumption, completeness, and interestingness, for association rules mining. In [3], AprioriTid was used to present a hybrid algorithm that generates association rules for discovering large frequent itemsets; this paper influenced us to start working on creating large frequent itemsets. In [4], the authors proposed a frequent pattern tree (FP-tree) structure, an extended prefix-tree structure that stores compressed, crucial information about frequent patterns. In [5], the authors reviewed and presented a comparison of different Frequent Pattern Mining (FPM) algorithms with a view to developing a more efficient FPM algorithm. In [6], the authors proposed an algorithm for finding large itemsets that uses fewer passes over the data than classic algorithms and fewer candidate itemsets than sampling-based methods. In [7], another Apriori algorithm based on confidence and classification was suggested. In [8], a new FPM algorithm was proposed that improves time consumption, CPU utilization, and memory utilization during knowledge discovery from datasets by combining the Weighted Apriori and Apriori Hash T algorithms. In Weighted Apriori, itemset combinations are generated frequently, which enlarges the candidate itemset, and the weight computation for each transaction takes more time to execute, with less accuracy; Apriori Hash T has high computational requirements and high memory utilization. In [9], an Apriori-Growth algorithm based on the Apriori algorithm and the FP-tree structure was proposed to mine frequent patterns.
In the work reported in [10], by analyzing the Apriori algorithm, researchers proposed a DC_Apriori algorithm, which increased efficiency by avoiding the generation of repeated candidate itemsets. They also compared their proposed algorithm with the MC_Apriori algorithm [11], which is based on a compressed matrix.
In [12], researchers proposed an improved algorithm with the following improvements: it generates the candidate items while decreasing the number of dataset scans, prunes the frequent and candidate itemsets to improve joining efficiency, and uses an overlap strategy to count support, all of which increase the efficiency of the algorithm. In [13], an improved Apriori algorithm was proposed in which the dataset scan is performed only once and all the transactions are transformed into the components of a two-dimensional array with a weight value. In the improved algorithm proposed in [14], a count-based method is used to prune the candidate itemset and thereby reduce dataset scans.
In [15], researchers proposed a boolean matrix-based algorithm to discover frequent patterns that reduces the dataset scan time. They transformed the transactional dataset into a boolean matrix, which is used for further processing. The researchers also used a pruning process that deletes an itemset from the candidate itemset if any subset of the itemset is not present in the previous frequent itemset, and finally combined all frequent itemsets to discover the final frequent itemset.
In the work reported in [16], researchers compared the Apriori and Eclat algorithms for association rules mining by applying them to a real-world dataset. This paper highlighted how mining frequent patterns at very low support threshold levels becomes more computationally expensive for Apriori than for Eclat. In [17], a new algorithm was proposed that combines the ECLAT algorithm with a method that uses special candidates to find frequent patterns in a database.
By studying the above research papers, we found that a tremendous amount of research on frequent pattern mining (FPM) has been completed by many researchers. Day by day, the data volume is increasing, and finding frequent patterns in that massive amount of data with minimal computational time is still challenging. For that reason, in this paper, we propose an improved boolean load-matrix based novel algorithm to reduce computational time, which is also usable in a distributed system to share the data load among the available nodes.

3 Proposed Algorithm

In this research, we have developed an improved algorithm. The detailed process for generating frequent itemsets from a transactional dataset with the proposed algorithm is presented in Algorithm 1. Assume D is the transactional dataset containing (I, T), where I is the set of items {I0, I1, I2, …, In} and T is the set of transactions of an item {IT0, IT1, IT2, …, ITm}.

Algorithm 1 Improved Boolean Load-Matrix based FPM Algorithm

Input: D, min_sup_count
Output: L
1: Scan D
2: Construct Qm×n and set sup_count for each itemset
   for each i, i ≤ m set sup_counti = 0;
     for each j, j ≤ n
       if ITj ∈ Ii then Q[i][j] = 1 and sup_counti += 1;
       else Q[i][j] = 0; end if
     end for
     if (sup_counti ≥ min_sup_count) then set
       sup_count[i] = sup_counti; add the tuple i to the F1
       matrix; add the item to L1; end if
   end for
3: Vertically split Qm×n into multiple matrices such as
   Load1m×x, Load2m×x, ..., Loadtm×x, where x > 0.
4: for each i, i ≤ m
     for each j, (i + 1) ≤ j ≤ m
       if q1[i] ≠ q2[j], q1, q2 ∈ Lk−1 then
         q = q1[i] ∪ q2[j]; include_in_ck = true;
         for each (k−1)-subset p, p ⊂ q
           if p ∉ Lk−1 then include_in_ck = false;
             break; end if
         end for
         if include_in_ck = true then Ck = Ck ∪ q; end if
       end if
     end for
   end for
5: for each r, r ∈ Ck
     compute sup_itemset = Load1.r + Load2.r + ... + Loadt.r.
     if sup_itemset ≥ min_sup_count then add it to Lk; end if
   end for
6: Repeat steps 4 to 5 while Lk−1 ≠ ∅.
7: L = ∪i Li.

In this algorithm, m and n represent the total number of items and transactions in the dataset D, respectively. The min_sup_count is the predefined minimum support count threshold, and Q is a boolean matrix of size m × n. The items are stored in the rows, and the transactions of an item are stored in the columns. Lk is the frequent-k itemset generated from Ck, where Ck is the candidate itemset developed from the Lk−1 itemset, with k ≥ 2. After scanning dataset D, the Q matrix is constructed: if an item is present in a specific transaction, the decimal value one (i.e., 1) is placed in the matrix, otherwise zero (i.e., 0), while at the same time the support count is calculated and stored in the sup_count array.
Here, sup_count is a two-dimensional array where the rows contain itemsets and the columns contain the support count value of the respective itemset. After scanning each tuple, its support count is compared with the min_sup_count value to determine whether it goes into the frequent-1 itemset or not. Further, based on the number of nodes, the frequent-1 itemset is decomposed into multiple loads (i.e., Load1, Load2, …, Loadt). To calculate the support count of the frequent-k itemset, we apply the bitwise AND operator (i.e., &) on the decomposed matrices.
Algorithm 1 shows the pseudo-code of the improved boolean load-matrix based FPM algorithm; the improved part of the algorithm is step 2. Initially, dataset D is scanned, a boolean matrix (i.e., Qm×n) is constructed, and the support count for each row i is calculated, stored in sup_count[i], and compared with min_sup_count. If sup_count[i] ≥ min_sup_count, the row is added to the frequent-1 matrix (i.e., F1) and listed in the frequent-1 itemset (i.e., L1). After that, the Q matrix is split vertically into multiple matrices (i.e., Load1, Load2, Load3, …, Loadt), where the size of each matrix is m × x with x a positive number. The matrix is decomposed based on the available nodes so that it can be used in a distributed environment, with each load assigned to an individual node for execution.
Next, a join operation is performed between two sets of items from the Lk−1 itemsets (i.e., q), joined in such a way that q1[i] ∩ q2[i] = ∅. Then the (k−1)-subsets (i.e., p) are generated from q, and if each p is present in Lk−1, q is stored in Ck. The support count of each itemset r ∈ Ck (i.e., sup_itemset) is then calculated from the summation of the Load.r values, where each Load.r is computed by applying the bitwise AND operation (i.e., &) on the Load matrices, which are stored as binary digits. Finally, combining all the frequent itemsets (i.e., {L1 ∪ L2 ∪ L3 ∪ … ∪ Lk−1 ∪ Lk}) generates the final frequent itemset L.
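To make the algorithm concrete, the following is a minimal Python sketch of the proposed procedure. It is our own illustrative code, not the authors' released implementation: the function and variable names are our choices, and the dataset is assumed to be an in-memory mapping from each item to its set of transaction ids (as in Table 1). It builds the boolean matrix and support counts in a single pass (step 2), splits the matrix into loads, generates candidates with the inclusion condition instead of post-insertion pruning, and counts support with bitwise AND over the loads.

from itertools import combinations

def mine_frequent_itemsets(dataset, min_sup_count, num_loads=2):
    # dataset: dict mapping item -> set of transaction ids
    items = sorted(dataset)
    transactions = sorted({t for ts in dataset.values() for t in ts})
    n = len(transactions)

    # Step 2: build boolean rows and support counts in one pass;
    # a row enters F1/L1 only if it meets the minimum support.
    rows = {}
    for item in items:
        row = [1 if t in dataset[item] else 0 for t in transactions]
        if sum(row) >= min_sup_count:
            rows[item] = row
    L = [frozenset([item]) for item in rows]      # frequent-1 itemset L1
    all_frequent = list(L)

    # Step 3: vertical split points for the load matrices
    width = -(-n // num_loads)                    # ceiling division
    bounds = [(s, min(s + width, n)) for s in range(0, n, width)]

    k = 2
    while L:
        # Step 4: join Lk-1 with itself; include a candidate only if
        # every (k-1)-subset is already frequent (no later pruning).
        prev = set(L)
        candidates = set()
        for a, b in combinations(L, 2):
            q = a | b
            if len(q) == k and all(frozenset(p) in prev
                                   for p in combinations(q, k - 1)):
                candidates.add(q)
        # Step 5: support = sum over the loads of the bitwise AND counts
        L = []
        for q in candidates:
            sup = 0
            for start, end in bounds:
                sup += sum(all(rows[i][j] for i in q)
                           for j in range(start, end))
            if sup >= min_sup_count:
                L.append(q)
        all_frequent.extend(L)
        k += 1
    return all_frequent

# Usage on the Table 1 example with min_sup_count = 4:
# data = {"I0": {0, 1, 3, 4, 6}, "I1": {1, 2, 3, 4, 6, 8}, ...}
# print(mine_frequent_itemsets(data, 4))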

4 Discussion of the Improved Algorithm Operation Process

In this segment of the paper, we discuss the proposed algorithm with an example. Assume the transactional dataset D presented in Table 1, which contains the set of items I = {I0, I1, I2, I3, I4, I5, I6, I7, I8} and the set of transactions T = {IT0, IT1, IT2, IT3, IT4, IT5, IT6, IT7, IT8, IT9}. The min_sup_count is a predefined threshold for the minimum support count. Ck is a candidate-k itemset generated from the (k−1) itemset, where k ≥ 2. The main intention of this research is to find the large frequent itemset with less time consumption in order to draw the association rules. In this discussion, following our proposed algorithm, the transactional dataset in Table 1 is converted into the boolean matrix, with the items shown as rows and the transactions as columns.

Table 1. Transaction dataset


Item Transactions
I0 IT0, IT1, IT3, IT4, IT6
I1 IT1, IT2, IT3, IT4, IT6, IT8
I2 IT5, IT6, IT9
I3 IT0, IT3, IT5, IT6, IT7, IT9
I4 IT1, IT2, IT5, IT6, IT8
I5 IT0, IT1, IT2, IT4, IT6, IT8, IT9
I6 IT4, IT5, IT7
I7 IT0, IT7, IT8, IT9
I8 IT1, IT4, IT6, IT7, IT9

We construct the boolean matrix Q by scanning each itemset Ii. If a specific transaction ITi is present, we insert 1 at position Q[Ii][ITi]; if not, we insert 0. For example, since IT1 ∈ I1, we insert the value one (i.e., 1) at position [I1][IT1] of the boolean matrix Q (i.e., Q[I1][IT1] = 1). Again, IT2 ∈ I1, so we insert one (i.e., 1) at position [I1][IT2] (i.e., Q[I1][IT2] = 1). But IT5 ∉ I1, so we insert zero (i.e., 0) at position [I1][IT5] (i.e., Q[I1][IT5] = 0). In the same way, we prepare the entire boolean matrix, represented in Table 2. At the same time, we calculate the support count (i.e., sup_count) of each row and store it in the sup_count array under the same row index. In this analysis, we consider the value four (i.e., 4) as the minimum support count (i.e., min_sup_count = 4). After calculating the support count of each row, we compare it with the minimum support count (i.e., min_sup_count) to determine whether the row goes into the frequent-1 itemset or not.

Table 2. Boolean matrix Q with all items; some rows are not included in the frequent-1 matrix but are shown for explanation.
IT0 IT1 IT2 IT3 IT4 IT5 IT6 IT7 IT8 IT9 sup_count
I0 1 1 0 1 1 0 1 0 0 0 5
I1 0 1 1 1 1 0 1 0 1 0 6
I2 0 0 0 0 0 1 1 0 0 1 3
I3 1 0 0 1 0 1 1 1 0 1 6
I4 0 1 1 0 0 1 1 0 1 0 5
I5 1 1 1 0 1 0 1 0 1 1 7
I6 0 0 0 0 1 1 0 1 0 0 3
I7 1 0 0 0 0 0 0 1 1 1 4
I8 0 1 0 0 1 0 1 1 0 1 5

Table 2 represents all items with their boolean values and the respective row sup_count. After comparing each row's support count with the minimum support count value, we find that rows 3 and 7 (items I2 and I6) are not included in the frequent-1 matrix, F1. All items of the frequent-1 matrix are listed in the frequent-1 itemset, L1.

Table 3. Frequent-1 Matrix


IT0 IT1 IT2 IT3 IT4 IT5 IT6 IT7 IT8 IT9 sup_count
I0 1 1 0 1 1 0 1 0 0 0 5
I1 0 1 1 1 1 0 1 0 1 0 6
I3 1 0 0 1 0 1 1 1 0 1 6
I4 0 1 1 0 0 1 1 0 1 0 5
I5 1 1 1 0 1 0 1 0 1 1 7
I7 1 0 0 0 0 0 0 1 1 1 4
I8 0 1 0 0 1 0 1 1 0 1 5

So, L1 will be:

L1 = {{I0}, {I1}, {I3}, {I4}, {I5}, {I7}, {I8}}

After that, we split the dataset into multiple loads based on the number of available nodes. In this example, we consider two nodes to be available, so we divide the dataset into two loads, Load1 and Load2, as represented in Table 4.

Table 4. Load-wise divided frequent-1 matrix


Load Load1 Load2
Transaction IT0 IT1 IT2 IT3 IT4 IT5 IT6 IT7 IT8 IT9
I0 1 1 0 1 1 0 1 0 0 0
I1 0 1 1 1 1 0 1 0 1 0
I3 1 0 0 1 0 1 1 1 0 1
I4 0 1 1 0 0 1 1 0 1 0
I5 1 1 1 0 1 0 1 0 1 1
I7 1 0 0 0 0 0 0 1 1 1
I8 0 1 0 0 1 0 1 1 0 1

Then we perform the join operation on the frequent-1 itemset (i.e., L1) to generate the candidate-2 itemset, C2. After completing the join operation, we check whether each subset of a joined itemset is present in the previous itemset (i.e., Lk−1). If each subset of a joined itemset is present in the Lk−1 itemset, the joined itemset is included in the candidate-2 itemset; otherwise, it is not. After performing these steps, we get C2, which is shown below.

 
C2 = {{I0, I1}, {I0, I3}, {I0, I4}, {I0, I5}, {I0, I7}, {I0, I8}, {I1, I3}, {I1, I4}, {I1, I5}, {I1, I7}, {I1, I8}, {I3, I4}, {I3, I5}, {I3, I7}, {I3, I8}, {I4, I5}, {I4, I7}, {I4, I8}, {I5, I7}, {I5, I8}, {I7, I8}}

Every subset of the itemsets in C2 is present in L1. Now we calculate the support count of each itemset to obtain the frequent-2 itemset (i.e., L2). The support count is calculated from the candidate-2 itemsets by applying the bitwise AND operator (i.e., &) to the corresponding row values of the individual load matrices. In this example, to calculate the support count of {I0, I1}, we perform the bitwise AND (i.e., &) operation between the row values of I0 and I1 in the Load1 and Load2 matrices. For Load1 the calculated value is 3 (i.e., {1, 1, 0, 1, 1} & {0, 1, 1, 1, 1}) and for Load2 the value is 1 (i.e., {0, 1, 0, 0, 0} & {0, 1, 0, 1, 0}). To obtain the support count of {I0, I1}, we add the results of all load matrices, which gives 4 (i.e., 3 + 1 = 4). In the same way, we calculate the support count of the remaining itemsets in the candidate-2 itemset (i.e., C2). The support count calculation for each itemset of C2 is exhibited below.

{I0, I1} = {1, 1, 0, 1, 1} & {0, 1, 1, 1, 1} + {0, 1, 0, 0, 0} & {0, 1, 0, 1, 0} = 4
{I0, I3} = {1, 1, 0, 1, 1} & {1, 0, 0, 1, 0} + {0, 1, 0, 0, 0} & {1, 1, 1, 0, 1} = 3
{I0, I4} = {1, 1, 0, 1, 1} & {0, 1, 1, 0, 0} + {0, 1, 0, 0, 0} & {1, 1, 0, 1, 0} = 2
{I0, I5} = {1, 1, 0, 1, 1} & {1, 1, 1, 0, 1} + {0, 1, 0, 0, 0} & {0, 1, 0, 1, 1} = 4
{I0, I7} = {1, 1, 0, 1, 1} & {1, 0, 0, 0, 0} + {0, 1, 0, 0, 0} & {0, 0, 1, 1, 1} = 1
{I0, I8} = {1, 1, 0, 1, 1} & {0, 1, 0, 0, 1} + {0, 1, 0, 0, 0} & {0, 1, 1, 0, 1} = 3
{I1, I3} = {0, 1, 1, 1, 1} & {1, 0, 0, 1, 0} + {0, 1, 0, 1, 0} & {1, 1, 1, 0, 1} = 2
{I1, I4} = {0, 1, 1, 1, 1} & {0, 1, 1, 0, 0} + {0, 1, 0, 1, 0} & {1, 1, 0, 1, 0} = 4
{I1, I5} = {0, 1, 1, 1, 1} & {1, 1, 1, 0, 1} + {0, 1, 0, 1, 0} & {0, 1, 0, 1, 1} = 5
{I1, I7} = {0, 1, 1, 1, 1} & {1, 0, 0, 0, 0} + {0, 1, 0, 1, 0} & {0, 0, 1, 1, 1} = 1
{I1, I8} = {0, 1, 1, 1, 1} & {0, 1, 0, 0, 1} + {0, 1, 0, 1, 0} & {0, 1, 1, 0, 1} = 3
{I3, I4} = {1, 0, 0, 1, 0} & {0, 1, 1, 0, 0} + {1, 1, 1, 0, 1} & {1, 1, 0, 1, 0} = 2
{I3, I5} = {1, 0, 0, 1, 0} & {1, 1, 1, 0, 1} + {1, 1, 1, 0, 1} & {0, 1, 0, 1, 1} = 3
{I3, I7} = {1, 0, 0, 1, 0} & {1, 0, 0, 0, 0} + {1, 1, 1, 0, 1} & {0, 0, 1, 1, 1} = 3
{I3, I8} = {1, 0, 0, 1, 0} & {0, 1, 0, 0, 1} + {1, 1, 1, 0, 1} & {0, 1, 1, 0, 1} = 3
{I4, I5} = {0, 1, 1, 0, 0} & {1, 1, 1, 0, 1} + {1, 1, 0, 1, 0} & {0, 1, 0, 1, 1} = 4
{I4, I7} = {0, 1, 1, 0, 0} & {1, 0, 0, 0, 0} + {1, 1, 0, 1, 0} & {0, 0, 1, 1, 1} = 1
{I4, I8} = {0, 1, 1, 0, 0} & {0, 1, 0, 0, 1} + {1, 1, 0, 1, 0} & {0, 1, 1, 0, 1} = 2
{I5, I7} = {1, 1, 1, 0, 1} & {1, 0, 0, 0, 0} + {0, 1, 0, 1, 1} & {0, 0, 1, 1, 1} = 3
{I5, I8} = {1, 1, 1, 0, 1} & {0, 1, 0, 0, 1} + {0, 1, 0, 1, 1} & {0, 1, 1, 0, 1} = 4
{I7, I8} = {1, 0, 0, 0, 0} & {0, 1, 0, 0, 1} + {0, 0, 1, 1, 1} & {0, 1, 1, 0, 1} = 2

Now we compare each itemset's support count with the minimum support count value to determine whether it goes into the frequent-2 itemset or not. If the support count of an itemset is greater than or equal to the minimum support count value (i.e., min_sup_count), it is added to the frequent-2 itemset (i.e., L2), shown below.

L2 = {{I0, I1}, {I0, I5}, {I1, I4}, {I1, I5}, {I4, I5}, {I5, I8}}

Similarly, we generate the candidate-3 itemset (i.e., C3) with all possible itemsets, produced from L2 by performing the join operation on the L2 itemset with itself. We did not include the itemsets whose subsets are not all present in L2; an itemset is included only if each of its subsets is present in L2. For example, one possible itemset was {I0, I1, I4}. Its possible subsets with at most k−1 items (i.e., 2) are {I0, I1}, {I0, I4}, and {I1, I4}; as {I0, I4} ∉ L2, we did not include the {I0, I1, I4} itemset in C3. The candidate-3 itemset (i.e., C3) is demonstrated below.

C3 = {{I0, I1, I5}, {I1, I4, I5}}

After getting the final candidate-3 itemset, we calculate the support count of each itemset, as demonstrated below.

{I0, I1, I5} = {1, 1, 0, 1, 1} & {0, 1, 1, 1, 1} & {1, 1, 1, 0, 1} + {0, 1, 0, 0, 0} & {0, 1, 0, 1, 0} & {0, 1, 0, 1, 1} = 3

{I1, I4, I5} = {0, 1, 1, 1, 1} & {0, 1, 1, 0, 0} & {1, 1, 1, 0, 1} + {0, 1, 0, 1, 0} & {1, 1, 0, 1, 0} & {0, 1, 0, 1, 1} = 4

Since only {I1, I4, I5} has a support count greater than or equal to the minimum support count (i.e., 4), it alone is added to the frequent-3 itemset (i.e., L3), demonstrated below.

L3 = {{I1, I4, I5}}

Next, we would find the candidate-4 itemset by performing the join operation on the L3 itemset with itself. However, L3 contains only one itemset, so the join operation cannot be performed and no further candidate set is obtained. Finally, we generate the final frequent itemset by combining all the frequent itemsets, i.e., L = {L1 ∪ L2 ∪ … ∪ Lk−1 ∪ Lk}. The final large frequent itemset (i.e., L) obtained from the given dataset is presented below.

L = {{I0}, {I1}, {I3}, {I4}, {I5}, {I7}, {I8}, {I0, I1}, {I0, I5}, {I1, I4}, {I1, I5}, {I4, I5}, {I5, I8}, {I1, I4, I5}}

5 Performance Evaluation and Complexity Analysis

The performance evaluation is made by comparing the proposed method with the existing method [15]. The experimental data were generated on a machine with a 2.90 GHz Intel(R) Core(TM) i7-7820HQ CPU and 16 GB of main memory. The Python language is used to implement the proposed design. Both the existing method [15] and the proposed method are coded and executed on the same platform. In this performance evaluation, a single load is considered. The Apriori and FP-Growth algorithm dataset from Kaggle [18] is used for the evaluation. The dataset contains 12 items and 12526 transactions. In this experiment the minimum support count (min_sup_count) is set to 20 for mining the dataset. Note that the minimum support count value can be varied depending on needs.
In this experiment, we found that our proposed algorithm performs at least two times better than the existing boolean load-matrix based algorithm in terms of computational time on the generation of the boolean matrix (i.e., Q), the frequent-1 matrix (i.e., F1), and the frequent-1 itemset (i.e., L1); Fig. 1 compares the corresponding execution times for the existing and proposed algorithms.
Equation (1) shows the time complexity of constructing the boolean matrix (i.e., Q), the frequent-1 matrix (i.e., F1), and the frequent-1 itemset (i.e., L1) with the existing algorithm, and Eq. (2) shows the time complexity of generating the same structures with the proposed algorithm; Eq. (2) corresponds to step 2 of the proposed algorithm. Here T is the time complexity.

Time complexity (T) for the existing boolean matrix:

T = (m × n) + c1 + (m × n) + c2 + m + c3
  = 2(m × n) + m + c1 + c2 + c3
  = 2(m × n) + m + c4                                  (1)

Time complexity (T) for the proposed boolean matrix:

T = (m × n) + c1 + c2
  = (m × n) + c3                                       (2)

To generate the boolean matrix (i.e., Q), the existing algorithm needs (m × n) + c1 time; to generate the support counts it needs (m × n) + c2 time; and for the frequent-1 matrix (i.e., F1) and the frequent-1 itemset (i.e., L1) it needs m + c3 time. The proposed algorithm generates the boolean matrix (i.e., Q), support_count, frequent-1 matrix (i.e., F1), and frequent-1 itemset (i.e., L1) together in (m × n) + c3 time. So, the proposed algorithm is faster than the existing algorithm. Although the Big O notation of both algorithms is the same, O(m × n), the second one (Eq. 2) is faster than the first one (Eq. 1), because Eq. 1 requires 2(m × n) + m operations while Eq. 2 requires only (m × n).
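As a sketch of where this saving comes from (our illustrative Python code, not from the paper): the existing approach fills Q, then re-scans it to compute the support counts, then scans the counts to build F1 and L1, whereas step 2 of the proposed algorithm does all of this in a single (m × n) pass.

def build_q_single_pass(dataset, items, transactions, min_sup_count):
    # Step 2 of the proposed algorithm: build the boolean matrix Q,
    # the support counts, F1, and L1 in one pass over the m x n cells.
    Q, sup_count, F1, L1 = [], [], [], []
    for item in items:                       # m rows
        row, count = [], 0
        for t in transactions:               # n columns
            bit = 1 if t in dataset[item] else 0
            row.append(bit)
            count += bit                     # support counted while filling Q
        Q.append(row)
        if count >= min_sup_count:           # F1/L1 decided in the same pass
            sup_count.append((item, count))
            F1.append(row)
            L1.append(item)
    return Q, sup_count, F1, L1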
In step 4 of the proposed algorithm, we also reduce the space needed for the candidate set, because an itemset is added only if it satisfies the condition, instead of being added to the candidate set and then pruned.
The comparison of the execution results of the existing and proposed programs is demonstrated in Fig. 1.

Fig. 1. Comparison result between existing and proposed boolean-load matrix FPM algorithms.

6 Conclusion

In this research, we have presented an updated algorithm to generate large frequent itemsets from a massive dataset. We reduced three steps compared with the existing algorithm [15], and we add an itemset only if it satisfies the condition, instead of counting and then pruning it from the created itemset. Our proposed algorithm performs at least two times faster than the existing algorithm on the generation of the boolean matrix (i.e., Q), the frequent-1 matrix (i.e., F1), and the frequent-1 itemset (i.e., L1) in terms of computational time; this comparison is shown in Eqs. 1 and 2. It also reduces the space needed for candidate set generation. The enhanced proposed algorithm can be used for Market Basket Analysis (MBA), Clickstream Analysis (CA), etc. In the future, this research can be extended to a distributed environment for the parallel computation of the load matrices.

References
1. Fang, X.: An improved apriori algorithm on the frequent itemset. In: International
Conference on Education Technology and Information System (ICETIS 2013). IEE (2013)
2. Ghafari, S.M., Tjortjis, C.: A survey on association rules mining using heuristics. In:
Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 9, no. 4, p. e1307.
Wiley (2019)

3. Nath, B., Bhattacharyya, D.K., Ghosh, A.: Incremental association rule mining: a survey. In:
Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 3, no. 3, pp. 157–
169. Wiley (2013)
4. Han, J., Pei, J., Yin, Y., Mao, R.: Mining frequent patterns without candidate generation: a
frequent-pattern tree approach. In: Data Mining and Knowledge Discovery, vol. 8, no. 1,
pp. 53–87. Springer, Netherlands (2004)
5. Chee, C.-H., Jaafar, J., Aziz, I.A., Hasan, M.H., Yeoh, W.: Algorithms for frequent itemset
mining: a literature review. In: Springer's Artificial Intelligence Review, pp. 1–19 (2018)
6. Brin, S., Motwani, R., Ullman, J.D., Tsur, S.: Dynamic itemset counting and implication
rules for market basket data. In: ACM SIGMOID Record, pp. 255–264 (1997)
7. Chun-sheng, Z., Yan, L.: Extension of local association rules mining algorithm based on
apriori algorithm. In: International Conference on Software Engineering and Service
Science, pp. 340–343. IEEE, China (2014)
8. Thakre, K.R., Shende, R.: Implementation on an approach for Mining of Datasets using
APRIORI Hybrid Algorithm. In: International Conference on Trends in Electronics and
Informatics, pp. 939–943. IEEE, India (2017)
9. Wu, B., Zhang, D., Lan, Q., Zheng, J.: An efficient frequent patterns mining algorithm based
on apriori algorithm and the FP-tree structure. In: International Conference on Convergence
and Hybrid Information Technology, vol. 1, pp. 1099–1102. IEEE, South Korea (2008)
10. Du, J., Zhang, X., Zhang, H., Chen, L.: Research and improvement of apriori algorithm. In:
International Conference on Information Science and Technology, pp. 117–121. IEEE,
China (2016)
11. Yu, N., Yu, X., Shen, L., Yao, C.: Using the improved apriori algorithm based on
compressed matrix to analyze the characteristics of suspects. Int. Express Lett. Part B Appl.
Int. J. Res. Surv. 6(9), 2469–2475 (2015)
12. Yuan, X.: An improved apriori algorithm for mining association rules. In: Conference
Proceedings, vol. 1820, no. 1. AIP (2017)
13. Liao, B.: An improved algorithm of apriori. In: International Symposium on Intelligence
Computation and Applications (ISICA), pp. 427–432. Springer, Berlin (2009)
14. Wu, H., Lu, Z., Pan, L., Xu, R., Jiang, W.: An improved apriori-based algorithm for
association rules mining. In: International Conference on Fuzzy Systems and Knowledge
Discovery, vol. 2, pp. 51–55, IEEE, China (2009).
15. Sahoo, A., Senapati, R.: A Boolean load-matrix based frequent pattern mining algorithm. In:
International Conference on Artificial Intelligence and Signal Processing (AISP), IEEE,
India (2020). ISSN:2572-1259
16. Robu, V., Santos, V.D.: Mining frequent patterns in data using apriori and eclat: a
comparison of the algorithm performance and association rule generation. In: 6th
International Conference on Systems and Informatics (ICSAI). IEEE, China (2019).
ISBN:978-1-7281-5257-8
17. Jin, K.: A new algorithm for discovering association rules. In: International Conference on
Logistics Systems and Intelligent Management (ICLSIM), ISBN 978-1-4244-7331-1, IEEE,
China (2010)
18. (Dataset for Apriori and FP growth Algorithm). https://www.kaggle.com/newshuntkannada/
dataset-for-apriori-and-fp-growth-algorithm
Exploring CTC Based End-To-End
Techniques for Myanmar Speech Recognition

Khin Me Me Chit and Laet Laet Lin

University of Information Technology, Yangon, Myanmar


{khinmemechit,laetlaetlin}@uit.edu.mm

Abstract. In this work, we explore a Connectionist Temporal Classification


(CTC) based end-to-end Automatic Speech Recognition (ASR) model for the
Myanmar language. A series of experiments is presented on the topology of the
model in which the convolutional layers are added and dropped, different depths
of bidirectional long short-term memory (BLSTM) layers are used and different
label encoding methods are investigated. The experiments are carried out in low-
resource scenarios using our recorded Myanmar speech corpus of nearly 26 h.
The best model achieves character error rate (CER) of 4.72% and syllable error
rate (SER) of 12.38% on the test set.

Keywords: End-to-end automatic speech recognition · Connectionist temporal classification · Low-resource scenarios · Myanmar speech corpus

1 Introduction

ASR plays a vital role in human-computer interaction and information processing. It is


the task of converting spoken language into text. Over the past few years, automatic
speech recognition approached or exceeded human-level performance in languages like
Mandarin and English in which large labeled training datasets are available [1].
However, the majority of languages in the world do not have a sufficient amount of
training data and it is still challenging to build systems for those under-resourced
languages.
A traditional ASR system is composed of several components such as acoustic
models, lexicons, and language models. Each component is trained separately with a
different objective. Building and tuning these individual components make developing
a new ASR system very hard, especially for a new language. By taking advantage of
Deep Neural Network’s ability to solve complex problems, end-to-end approaches
have gained popularity in the speech recognition community. End-to-end models
replaced the sophisticated pipelines with a single neural network architecture.
The most popular approaches to train an end-to-end ASR include Connectionist
Temporal Classification [2], attention-based sequence-to-sequence models [3], and
Recurrent Neural Network (RNN) Transducers [4].
CTC defines a distribution over all alignments with all output sequences. It uses Markov assumptions to obtain the label sequence probabilities and computes them efficiently via dynamic programming. It has simple training and decoding schemes and has shown great results in many tasks [1, 5, 6].

The sequence-to-sequence model contains an encoder and a decoder, and it usually uses an attention mechanism to align input features with output symbols. Such models have shown great results compared to CTC models and in some cases surpassed them [7]. However, their computational complexity is high and they are hard to parallelize.
The RNN-Transducer is an extension of CTC. It has an encoder, a prediction network, and a joint network, and it uses the outputs of the encoder and prediction networks to predict the labels. It is popular due to its capability to perform online speech recognition, which is the main challenge for attention encoder-decoder models. Due to its high memory requirement in training and its implementation complexity, there is less research on the RNN-Transducer, although it has obtained several impressive results [8].
The CTC-based approach is significantly easier to implement, computationally less expensive to train, and produces results that are close to the state of the art. Therefore it is a good starting point for exploring end-to-end ASR models. In this paper, several experiments are carried out with CTC-based speech models in low-resource scenarios using our Myanmar language dataset (≈26 h). We compare different label encoding methods on our ASR model: character-level, syllable-level, and sub-word-level encoding. We also vary the number of BLSTM layers and explore the effect of using a deep Convolutional Neural Network (CNN) encoder on top of the BLSTM layers.

2 Related Work

Most of the early works in Myanmar speech recognition are Hidden Markov Model (HMM) based systems. The Hidden Markov Model is used together with the Gaussian Mixture Model (GMM) and the Subspace Gaussian Mixture Model (SGMM), and the performance of the model increases with the amount of training data [9].
HMM-GMM models are also used in spontaneous ASR systems, where tuning the acoustic features and the numbers of senones and Gaussian densities tends to affect the performance of the model [10].
Since Deep Neural Networks (DNNs) have gained success in many areas, they are also used in automatic speech recognition systems, usually in combination with HMMs [11–14]. Research has shown that DNN, CNN, and Time Delay Network models outperform HMM-GMM models [14].
However, many of these works needed several components, each of which is difficult to tune individually. End-to-end ASR models simplify these components into a single pipeline and have achieved state-of-the-art results in many scenarios. To the best of our knowledge, no research has been conducted on end-to-end training for Myanmar ASR systems. In this paper, we introduce an end-to-end Myanmar ASR model with a CTC-based approach.

3 End-To-End CTC Model

This section introduces the CTC-based end-to-end model architecture. We explore architectures with the initial layers of VGGNet [15] and up to 6 layers of BLSTM. Batch Normalization is applied after every CNN and BLSTM layer to stabilize the learning process of the model. A fully connected layer with the softmax activation function follows the BLSTM layers. The CTC loss function is used to train the model. Figure 1 shows the model architecture, which includes a deep CNN encoder and a deep bidirectional LSTM network.

[Figure 1 block diagram: Spectrogram → Deep CNN (VGGNet) → BLSTM → Fully Connected → CTC Loss → Label]

Fig. 1. The architecture of the CTC-based end-to-end ASR model. Different architectures are explored by varying the number of BLSTM layers from 3 to 6 and by adding and removing the deep convolutional layers.

3.1 Deep Convolutional LSTM Network


Since the input audio features are continuous, the time dimension is usually downsampled to minimize the running time and complexity of the model. One way to perform the time reduction is to skip or concatenate consecutive time frames. In this work, the time reduction is achieved by passing the audio features through CNN blocks containing two max-pooling layers. We observed that using convolutional layers speeds up the training and also helps with the convergence of the model. As input to the CNN layers, we use the two dimensions of the spectrogram features with a channel dimension of length one. After passing through the two max-pooling layers, the input features are downsampled to (1/4 × 1/4) along the time-frequency axes. The architecture of the CNN layers is shown in Fig. 2.

Fig. 2. The architecture of convolutional layers (VGGNet). Each convolutional block consists of
two convolutional layers with Batch Normalization and a max pooling layer. Each convolutional
layer uses the ReLU activation function.

A stack of BLSTM layers follows the CNN layers. Models with 3 to 6 BLSTM layers are experimented with, using 512 hidden units in each layer and direction. Since Batch Normalization has been shown to improve the generalization error and accelerate training [16], it is added after every BLSTM layer.
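As an illustration, the following is a minimal Keras sketch of this architecture. It is our own code under stated assumptions, not the authors' released implementation: the framework (TensorFlow/Keras), the filter counts 64 and 128 (taken from VGGNet's initial layers), and the spectrogram feature size n_freq are illustrative; n_classes would be the number of output tokens plus one CTC blank (e.g., 57 characters + 1 for the character-level model).

import tensorflow as tf
from tensorflow.keras import layers

def build_model(n_freq=161, n_classes=58, n_blstm=5):
    # Input: (time, frequency, channel) log-spectrogram slices
    spec = layers.Input(shape=(None, n_freq, 1))
    x = spec
    for filters in (64, 128):                 # two VGG-style conv blocks
        for _ in range(2):                    # two conv layers per block
            x = layers.Conv2D(filters, 3, padding="same",
                              activation="relu")(x)
            x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D(pool_size=2)(x)  # 2x reduction per block
    # After two poolings, time and frequency are each reduced by 4;
    # collapse frequency and channel axes for the recurrent stack.
    x = layers.Reshape((-1, (n_freq // 4) * 128))(x)
    for _ in range(n_blstm):
        x = layers.Bidirectional(layers.LSTM(512, return_sequences=True))(x)
        x = layers.BatchNormalization()(x)
    out = layers.Dense(n_classes, activation="softmax")(x)
    return tf.keras.Model(spec, out)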

3.2 Connectionist Temporal Classification (CTC)


In speech recognition, the alignment between the input audio and the output text tokens must be considered. CTC solves this problem by mapping the input sequence to an output sequence of shorter length. Given the input sequence of audio features X = [x1, x2, ..., xT] and the output sequence of labels Y = [y1, y2, ..., yU], the model tries to maximize P(Y|X), the probability of all possible alignments that map to the correct label sequence. An additional blank symbol is introduced in CTC to handle repeated output tokens and silent audio. The CTC objective [17] for an (X, Y) pair is described as follows:

P(Y|X) = \sum_{A \in \mathcal{A}_{X,Y}} \prod_{t=1}^{T} P_t(a_t \mid X).   (1)

where \mathcal{A}_{X,Y} is the set of valid alignments, t denotes a time step, and a_t indicates a single alignment step. Equation 1 involves the expensive computation of a sum over all possible alignment probabilities; however, it can be computed efficiently with the help of dynamic programming. During model training, as we minimize the loss over the training dataset D, the CTC loss is formulated as the negative sum of log-likelihoods:

\sum_{(X,Y) \in D} -\log P(Y|X).   (2)
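In a Keras setup like the sketch in Sect. 3.1, the loss of Eq. (2) can be computed with the built-in CTC helper. The wrapper below is a usage sketch; the batch tensors and length arrays are illustrative, not from the paper.

import tensorflow as tf

# y_pred: (batch, T', n_classes) softmax outputs from the model
# y_true: (batch, U) integer-encoded, padded label sequences
# input_length: (batch, 1) valid output time steps per utterance
# label_length: (batch, 1) valid label lengths per utterance
def ctc_loss(y_true, y_pred, input_length, label_length):
    # Per-utterance negative log-likelihood of Eq. (2), computed by
    # Keras' dynamic-programming CTC implementation
    return tf.keras.backend.ctc_batch_cost(
        y_true, y_pred, input_length, label_length)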

4 Experimental Setup
4.1 Description for Myanmar Speech Corpus
Due to the scarcity of public speech corpora for the Myanmar language, we built a corpus of read Myanmar speech. The dataset contains 9908 short audio clips, each with a transcription. The content of the corpus is derived from the weather news of the Department of Meteorology and Hydrology (Myanmar) [18] and DVB TV news [19].
The dataset contains 3 female speakers and 1 male speaker. The audio clips were recorded in a quiet place with a minimum of background noise. The lengths of the audio clips vary from 1 to 20 s, and the total length of all audio clips is nearly 26 h. Audio clips are in single-channel 16-bit WAV format and are sampled at 22.05 kHz.
Most of the transcriptions were collected from the above-mentioned data sources and some were hand-transcribed. Since the transcriptions mixed both Zawgyi and Unicode font encodings, all the texts were first normalized into Unicode encoding. Myanmar Tools [20] was used to detect the Zawgyi strings and the ICU Transliterator [21] was used to convert Zawgyi to Unicode. Numbers were expanded into full words, and punctuation and texts written in other languages were dropped. The audio clips and transcriptions were segmented and aligned manually.
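A sketch of this normalization step in Python, assuming the myanmartools and PyICU packages expose the APIs documented for Myanmar Tools [20] and ICU [21]; the 0.95 probability cutoff is our illustrative choice.

from myanmartools import ZawgyiDetector
from icu import Transliterator

detector = ZawgyiDetector()
converter = Transliterator.createInstance('Zawgyi-my')

def normalize(text):
    # Convert a transcription to Unicode if it is likely Zawgyi-encoded
    if detector.get_zawgyi_probability(text) > 0.95:
        return converter.transliterate(text)
    return text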
For the experimental purpose, we randomly split 70% (≈18 h) of the total data into the training set, 10% (≈2 h) into the development set, and 20% (≈5 h) into the test set. We made sure that each split contains audio clips from all four speakers (Table 1).

Table 1. Statistics of the Myanmar speech corpus.


Total clips 9908
Total duration 25 h 18 min 37 s
Mean clip duration 9.19 s
Min clip duration 0.72 s
Max clip duration 19.92 s
Mean character per clip 121
Mean syllable per clip 38
Distinct characters 57
Distinct syllables 866

4.2 Training Setup


As input features, we use log spectrograms computed every 10 ms with a 20 ms window. For the CNN network, the initial layers of VGGNet with Batch Normalization layers are used as described in Sect. 3.1. The time resolution of the input features is reduced by 4 times after passing through the convolutional layers. We use 3 to 6 BLSTM layers, each with 512 hidden units. Batch Normalization with momentum 0.997 and epsilon 1e-5 follows each CNN and BLSTM layer. For the fully connected layer, we use the softmax activation function. All the models are trained with the CTC loss function. For optimization, the Adam optimizer is used with an initial learning rate λ of 1e-4. The learning rate is reduced by a factor of 0.2 if the validation loss stops improving for a certain number of epochs. A batch size of 8 is used in all experiments. An early stopping callback is also used to prevent overfitting. Different output label encoding methods are compared: character-level, syllable-level, and sub-word-level encodings. We use a set of regular expression rules [22] to segment the syllable-level tokens and use Byte Pair Encoding (BPE) to create the sub-word tokens. Since the experiments are performed with different label encoding methods, the results are evaluated using both CER and SER metrics. All the experiments are carried out on an NVIDIA Tesla P100 GPU with 16 GB of GPU memory and 25 GB of RAM (Google Colab). We do not use an external language model in this work; we will further investigate the effect of combining the decoding process with an external language model in the future.
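This optimization setup maps directly onto standard Keras components (a sketch; the patience values are illustrative, since the paper states only the factor and initial learning rate).

import tensorflow as tf

# Adam with the stated initial learning rate of 1e-4
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)

callbacks = [
    # reduce the learning rate by a factor of 0.2 on a validation-loss plateau
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.2,
                                         patience=3),
    # early stopping to prevent overfitting
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=8,
                                     restore_best_weights=True),
]
# The CTC loss of Sect. 3.2 also needs the per-utterance input/label
# lengths, so in practice it is wired in with a custom train step or a
# loss-endpoint layer before calling
# model.fit(..., batch_size=8, callbacks=callbacks).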

5 Experimental Results

5.1 Effect of Using Different Label Encoding Methods


Only a small amount of research has been done on character-level Myanmar ASR systems, so it is quite interesting to explore Myanmar ASR models using character-level output tokens. Since Myanmar is a syllable-timed language, syllable-level tokens are also considered. The dataset contains 57 unique characters and 866 unique syllables. We also conduct a number of experiments on BPE sub-word tokenization, for which a specific vocabulary size must be defined. Because the dataset is relatively small, only small vocabulary sizes of 100, 300, and 500 BPE sub-word units are used in the experiments.
Table 2 shows the results of varying the label encoding method. It is observed that using large label encoding units is not very beneficial for a small dataset: the syllable-level model (866 tokens) and the BPE model (500 tokens) show high error rates on both the development and test sets. The best results are obtained by the character-level encoder, with 4.69% CER and 11.86% SER on the development set and 4.72% CER and 12.38% SER on the test set.

5.2 Effect of Using Convolutional Layers


The use of many convolutional blocks may over-compress the input features and make the length of the downsampled audio features smaller than the length of the output label sequence. This can cause problems in the CTC calculation. This usually occurs when training the character-level ASR model, whose output requires at least one time step per output character. For this reason, we limit the amount of downsampling with
the maximum use of 2 max pooling layers.
As can be seen in Table 2, the use of convolutional layers tends to increase the
performance of the model. All the models with convolutional layers outperform the
models without convolutional layers. Moreover, the downsampling effect of the con-
volutional blocks significantly reduces the training time and speeds up the convergence
of the model.

Table 2. Comparison of the effect of using convolutional layers and different label encoding
units on both development and test sets. Each model contains the same number of convolutional
layers, 5 layers of LSTM layers and a fully connected layer. The input features are not
downsampled for the models without convolutional layers. The experiments are evaluated with
both character error rate (CER) and syllable error rate (SER) metrics.
Unit     | With CNN                  | Without CNN
         | Dev         | Test        | Dev         | Test
         | CER   SER   | CER   SER   | CER   SER   | CER   SER
Char     | 4.69  11.86 | 4.72  12.38 | 5.29  12.12 | 5.67  12.61
Syllable | 21.95 20.67 | 22.68 22.34 | 22.09 23.46 | 23.41 24.83
BPE 100  | 16.44 18.80 | 13.72 16.58 | 19.51 26.69 | 18.72 25.27
BPE 300  | 9.61  19.12 | 10.38 20.35 | 9.65  20.88 | 9.98  21.83
BPE 500  | 10.77 24.73 | 11.58 27.34 | 22.08 34.61 | 22.67 36.03

5.3 Effect of Using Different Numbers of BLSTM Layers


Since the Myanmar speech corpus is relatively small, the effect of varying the model size is explored, starting from a small model depth of 3. Only the depth of the BLSTM layers is varied; the depth of the CNN layers and the other parameters are kept constant.
Table 3 shows that the performance on the test set improves with depth up to 5 BLSTM layers and degrades at 6. The model with 5 BLSTM layers is the best fit for the small dataset, achieving the overall best results of 4.72% CER and 12.38% SER on the test set.

Table 3. Comparison of character-level ASR models with different numbers of BLSTM layers.
Convolutional layers are used in every experiment. A fully connected layer is followed after the
BLSTM layers. The hidden unit size of BLSTM layers is set to 512 units.
Number of BLSTM layers | Dev         | Test
                       | CER   SER   | CER   SER
3                      | 4.65  11.85 | 5.03  13.24
4                      | 4.58  12.54 | 4.93  13.75
5                      | 4.69  11.86 | 4.72  12.38
6                      | 5.26  14.45 | 5.45  15.46

6 Conclusion

In this paper, we explore the CTC based end-to-end architectures on the Myanmar
language. We empirically compare the various label encoding methods, different
depths of BLSTM layers and the use of convolutional layers on the low-resource
Myanmar speech corpus. This work shows that a well-tuned end-to-end system can
achieve state-of-the-art results in closed-domain ASR even for low-resource languages. As future work, we will further investigate the integration of a language model into our end-to-end ASR system and will explore other end-to-end multi-task techniques.

Acknowledgments. The authors are grateful to the advisors from the University of Information
Technology who gave us helpful comments and suggestions throughout this project. The authors
also thank Ye Yint Htoon and May Sabal Myo for helping us with the dataset preparation and for
technical assistance.

References
1. Amodei, D., Anubhai, R., Battenberg, E., Case, C., Casper, J., Catanzaro, B., Chen, J.,
Chrzanowski, M., Coates, A., Diamos, G., Elsen, E., Engel, J., Fan, L., Fougner, C., Han, T.,
Hannun, A., Jun, B., LeGresley, P., Lin, L., Narang, S., Ng, A., Ozair, S., Prenger, R.,
Raiman, J., Satheesh, S., Seetapun, D., Sengupta, S., Wang, Y., Wang, Z., Wang, C., Xiao,
B., Yogatama, D., Zhan, J., Zhu, Z.: Deep Speech 2: end-to-end speech recognition in
English and Mandarin. arXiv:1512.02595 [cs.CL] (2015)
2. Graves, A., Fernández, S., Gomez, F., Schmidhuber, J.: Connectionist temporal classifica-
tion: labelling unsegmented sequence data with recurrent neural networks. In: ICML 2006:
Proceedings of the 23rd International Conference on Machine Learning. ACM Press, New
York (2006)
3. Chan, W., Jaitly, N., Le, Q., Vinyals, O.: Listen, attend and spell: a neural network for large
vocabulary conversational speech recognition. In: 2016 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP), pp. 4960–4964. IEEE (2016)
4. Graves, A.: Sequence transduction with recurrent neural networks, arXiv:1211.3711 [cs.NE]
(2012)
5. Zweig, G., Yu, C., Droppo, J., Stolcke, A.: Advances in all-neural speech recognition. In:
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
pp. 4805–4809. IEEE (2017)
6. Hannun, A., Case, C., Casper, J., Catanzaro, B., Diamos, G., Elsen, E., Prenger, R.,
Satheesh, S., Sengupta, S., Coates, A., Ng, A.Y.: Deep speech: scaling up end-to-end speech
recognition, arXiv:1412.5567 [cs.CL] (2014)
7. Shan, C., Zhang, J., Wang, Y., Xie, L.: Attention-based end-to-end speech recognition on
voice search. In: 2018 IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP), pp. 4764–4768. IEEE (2018)
8. Li, J., Zhao, R., Hu, H., Gong, Y.: Improving RNN transducer modeling for end-to-end
speech recognition. In: 2019 IEEE Automatic Speech Recognition and Understanding
Workshop (ASRU), pp. 114–121. IEEE (2019)
9. Mon, A.N., Pa, W.P., Thu, Y.K.: Building HMM-SGMM continuous automatic speech
recognition on Myanmar Web news. In: International Conference on Computer Applications
(ICCA2017), pp. 446–453 (2017)
10. Naing, H.M.S., Pa, W.P.: Automatic speech recognition on spontaneous interview speech.
In: Sixteenth International Conferences on Computer Applications (ICCA 2018), Yangon,
Myanmar, pp. 203–208 (2018)
11. Nwe, T., Myint, T.: Myanmar language speech recognition with hybrid artificial neural
network and hidden Markov model. In: Proceedings of 2015 International Conference on
Future Computational Technologies (ICFCT 2015), pp. 116–122 (2015)

12. Naing, H.M.S., Hlaing, A.M., Pa, W.P., Hu, X., Thu, Y.K., Hori, C., Kawai, H.: A Myanmar
large vocabulary continuous speech recognition system. In: 2015 Asia-Pacific Signal and
Information Processing Association Annual Summit and Conference (APSIPA), pp. 320–
327. IEEE (2015)
13. Mon, A.N., Pa Pa, W., Thu, Y.K.: Improving Myanmar automatic speech recognition with
optimization of convolutional neural network parameters. Int. J. Nat. Lang. Comput.
(IJNLC) 7, 1–10 (2018)
14. Aung, M.A.A., Pa, W.P.: Time delay neural network for Myanmar automatic speech
recognition. In: 2020 IEEE Conference on Computer Applications (ICCA), pp. 1–4. IEEE
(2020)
15. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations (ICLR) (2015)
16. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing
internal covariate shift, arXiv:1502.03167 [cs.LG] (2015)
17. Hannun, A.: Sequence modeling with CTC. Distill 2 (2017)
18. Department of Meteorology and Hydrology. https://www.moezala.gov.mm/
19. DVB TVnews. https://www.youtube.com/channel/UCuaRmKJLYaVMDHrnjhWUcHw
20. Google: Myanmar Tools. https://github.com/google/myanmar-tools
21. ICU - International Components for Unicode. https://site.icu-project.org/
22. Thu, Y.K.: sylbreak. https://github.com/ye-kyaw-thu/sylbreak
IoT Based Bidirectional Speed Control
and Monitoring of Single Phase Induction
Motors

Ataur Rahman, Mohammad Rubaiyat Tanvir Hossain, and Md. Saifullah Siddiquee

Department of Electrical & Electronic Engineering, Chittagong University


of Engineering and Technology, Chattogram 4349, Bangladesh
mrthossain@cuet.ac.bd

Abstract. This paper presents the construction and laboratory investigation of


an IoT (Internet of Things) based smart system to control, measure, and monitor
the bidirectional speed control of single-phase induction motors (SPIM) remo-
tely. The design prototype consisting of two single-phase induction motors
demonstrates multi-motor control. The motors are turned ON and OFF by a
specific relay operation. To achieve the desired motor speed, the stator voltage
control method has been applied by using Pulse Width Modulation
(PWM) technique. For reversing the motor direction of rotation, the stator
magnetic field is reversed by swapping the contacts of auxiliary winding by
relay operation. Whenever the desired value is submitted for a specific operation
from a specially designed website, the desired control signal is generated from a
programmed microcontroller according to the user's command via a webserver
using GSM communication. Motor status data is measured using an IR sensor
and observed remotely on the monitoring panel integrated with a web appli-
cation. The result shows little deviation compared to direct field measurement.
The IoT-based smart motor control system can be used in this modern age to
continuously track, control, and monitor machines, goods, plants, etc. for ver-
satility in multi-purpose applications.

Keywords: Internet of Things · Bidirectional speed control · Single-phase induction motor

1 Introduction

In this modern age, induction motors (IM) are widely used in industry, automotive,
aerospace, military, medical, and domestic equipment and appliances because of their
low cost, robustness, high power, and low maintenance [1]. The bulk of the electric power consumed globally is drawn by single-phase induction motors (SPIMs) driving fan, pump, or compressor types of load, including their applications in
heating, ventilating, and air conditioning [2]. Traditionally, motor-driven systems run
at a nearly constant speed and are usually designed to provide a load margin of 20–30%
over the full load value for a long duration. A significant portion of the input energy is
wasted for controlling their outputs to meet the variable load demands where the power


drawn from the utility remains essentially the same as at full load. This energy loss can
be avoided by driving the motors at a speed that results in the desired regulated outputs.
The input power decreases as the speed is decreased. This decrease can be estimated by
recognizing that in an induction motor,

$\mathrm{Torque} \approx k_1\,(\mathrm{speed})^2$   (1)

and therefore, the power required by the motor-driven system is,

$\mathrm{Power} \approx k_2\,(\mathrm{speed})^3$   (2)

where $k_1$ and $k_2$ are constants of proportionality.


If the motor efficiencies can be assumed to be constant as their speed and loadings
change, then the input power required by the induction motor would also vary as the
speed cubed. Thus instead of constant speed operation of induction motors the variable
speed driven system can result in significant energy conservation. Several methods
exist for the variable speed operation of a single-phase induction motor. Considering
simplicity and low cost, the most common type is the control of applied voltage to the
motor. Also, bidirectional rotation control is important in various applications, like a
conveyor belt, exhaust fan, etc. By reversing the direction of rotation, the same motor
can be used as a water feed pump and smoke exhauster in a boiler.
Present-day industries are increasingly being shifted towards automation. As the
internet has become widespread today, remote accessing, monitoring, and controlling
of systems are possible [3]. With rapidly increasing IoT (Internet of Things) technology
the network of physical objects or things that are interconnected with electronics can be
communicating with each other and be managed by computers [4–7]. IoT has provided
a promising way to build powerful industrial automation by using wireless devices,
sensors, and web-based intelligent operations [7–9].
For web-based solutions for controlling and monitoring single-phase induction
motors (SPIM), diverse research, design, and implementations have been performed
[10, 11]. In [12, 13], the authors proposed schemes to observe IM parameters using the
ZigBee protocol. However, because of the low data speed and high cost of ZigBee, the proposed systems are not suitable for covering longer distances. P.S. Joshi et al. [14] have proposed wireless speed control of IMs using GSM, Bluetooth, and Wi-Fi technologies, which have a lower communication range compared to IoT. In [15], an IoT-
based IM monitoring system is developed based on analyzing sensor data collected
from local and cloud servers, Wi-Fi enabled Raspberry Pi and a web application. For
predictive maintenance of motors to avoid delayed interruptions in the production, an
IoT based Induction Motor monitoring scheme is proposed using ARM-Cortex in [16].
The authors in [17] have proposed an IoT based Induction Motor controlling and
monitoring using PLC and SCADA. But the system is costly, complex, and requires
trained manpower. Authors in [18] implemented wireless sensor networks based on IoT
and Bluetooth Low Energy to monitor and control the speed, torque, and safety of
SPIM. In [19], three separate SPIM or one three-phase induction motor is controlled
(ON/OFF) simultaneously according to voltage and current data collection through the
internet embedded with an Ethernet board. The majority of web-based implementations

have been for monitoring and speed control of induction motors, while bidirectional rotation of motors has not been widely investigated.
This paper addresses this gap by providing a simple web-based communication
solution for bidirectional speed control and monitoring of multiple single-phase
induction motors. The speed control of the induction motors has been performed by the
stator voltage control method using the pulse width modulation (PWM) technique [20].
The direction of rotation of the motor has been changed by reversing the stator mag-
netic field by swapping the connection of auxiliary stator winding with simple, low-
cost relay switching. The parameters of the prototype design are observed in LCD by a
microcontroller and shared to remote computer/mobile using IoT through GPRS which
makes the system flexible and user friendly.

2 Proposed System Design

Figure 1 shows the proposed system's architecture in the block diagram form. It mainly
consists of a web application developed on a remote computer or smartphone, a
GSM/GPRS module for interfacing with the server system, a Microcontroller, an LCD,
IGBTs with gate drive circuits, Bridge rectifiers, IR sensors, a relay module, and two
single-phase induction motors (SPIM).

Fig. 1. Block diagram of IoT based bidirectional speed control and monitoring of SPIM.

The IoT based bidirectional speed and status observation of the motors is accom-
plished by the web application. The GSM/GPRS module acts as an interface between
the web server and the microcontroller unit to send and receive command and output
signals. Following the specific command input from the field operator, the programmed

microcontroller generates PWM signals. These signals are sent through the isolation
circuit to the IGBT gate driver circuits to operate the IGBT switches. By the PWM AC
chopping technique, input AC voltages to the stator windings of the induction motors
are varied. As a result, the motor speeds are controlled. The IR sensors are used for
measuring the RPM of the motors and the results are compared with fine-tuned digital
tachometer readings. The relay module with four sets of relays is employed for the
motor on/off and direction control. The overall motor status is observed in the LCD and
the web application developed in the remote computer or smartphone.

3 Implementation of the Proposed System

The circuit diagram of the proposed scheme for the controlling and monitoring of
induction motors with IoT is depicted in Fig. 2. Functionally, the system can be
divided into four major functional units described as follows:

Fig. 2. Circuit diagram of the proposed control scheme

3.1 Speed Control Scheme for Induction Motor


The speed of a single-phase induction motor can be controlled by varying the RMS
value of the stator voltage at a constant frequency. This technique is usually used in
fan, pump, or compressor type motor-driven systems with high slip. The stator voltage
can be controlled by three methods; integral cycle control, phase control, and PWM
control. The PWM control method is simple, cost-effective, and has low input

harmonic content compared to other methods. Here the supply voltage is chopped and
modulated with a high switching frequency IGBT gate drive signal generated from the
microcontroller unit. The change in the duty cycle of the IGBT switch changes the
effective value of the load voltage and current. The chopped voltage can be expressed
by multiplying the sinusoidal line voltage with the switching signal. The detailed
analysis is given in [20]. The switching function can be written by the Fourier series
expansion of the pulse for one period as:
$d(t) = a_0 + \sum_{n=1}^{\infty} \left( a_n \cos n\omega_s t + b_n \sin n\omega_s t \right)$   (3)

where $a_0$ is the DC component, $a_n$ and $b_n$ are the Fourier coefficients, and $\omega_s$ is the switching frequency. $a_0$, $a_n$ and $b_n$ have the following values:

$a_0 = \dfrac{t_{on}}{T} = \dfrac{t_{on}}{t_{on} + t_{off}} = D$   (4)

$a_n = \dfrac{1}{n\pi} \sin(2\pi n D)$   (5)

$b_n = -\dfrac{1}{n\pi} \left[ 1 + \cos(2\pi n D) \right]$   (6)

The load voltage can be calculated by multiplying the supply voltage, $v_s(t) = V_m \sin\omega t$, with the switching function $d(t)$:

$v_L(t) = v_s(t)\,d(t) = V_m \sin\omega t \cdot d(t)$   (7)

$v_L(t) = a_0 V_m \sin\omega t + \sum_{n=1}^{\infty} \left( a_n V_m \cos n\omega_s t \sin\omega t + b_n V_m \sin n\omega_s t \sin\omega t \right)$   (8)

The high-frequency terms in Eq. (8) are filtered out by the motor's internal inductance, so that the load voltage can be expressed as:

$v_L(t) = a_0 V_m \sin\omega t = D\,V_m \sin\omega t$   (9)

The RMS value of the stator load voltage can be calculated as

$V_{L,rms} = \dfrac{D\,V_m}{\sqrt{2}}$   (10)

As the input power required by the induction motor varies as the cube of the speed, decreasing the motor speed to half of the rated speed by the voltage control method to meet the variable load demand causes the input power drawn from the utility to drop to one-eighth of the rated input power. There is thus a significant reduction in the power consumed.
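As a quick numerical illustration of Eq. (10) and the cubic power relation of Eq. (2) (the 220 V supply value below is an assumed example, not taken from the paper):

import math

V_m = 311.0                       # peak of an assumed 220 V RMS supply

def v_rms(duty):                  # Eq. (10): V_Lrms = D * V_m / sqrt(2)
    return duty * V_m / math.sqrt(2)

def relative_power(speed_ratio):  # Eq. (2): power scales with speed cubed
    return speed_ratio ** 3

print(v_rms(0.5))                 # stator RMS voltage at 50% duty: about 110 V
print(relative_power(0.5))        # 0.125, i.e. one-eighth of rated input power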

3.2 ON/OFF and Bidirectional Rotation Control


A simple and low-cost relay module consisting of four sets of relays is used for the
motor on/off switching and bidirectional rotation control as shown in Fig. 2. Two
relays are connected in series with the power line and each of the two motors. To turn
on a motor, the microcontroller sends a low signal to the specific relay for a specific
motor. The relay coil gets energized and builds up a connection between common and
‘NO’ relay contacts to connect the power line with the motor which remains discon-
nected by default. For bidirectional rotation control swapping of the contacts of motor
auxiliary winding occurs by relay operation. Hence the motor auxiliary winding gets
electrically connected in reverse fashion. As the direction of current flow reverses in the
auxiliary winding the stator magnetic field is reversed which reverses the rotational
direction of the rotor. The on/off control relay is energized only after the winding-swapping relays have switched, to protect the motor from the possibility of a short circuit.

3.3 Speed Measurement and Motor Status Monitoring


The measurement of the speed of the induction motors is done with an IR sensor, a disc, and the microcontroller, as shown in Fig. 3. The IR sensor emits infrared rays towards the disc or blades of the motor, and the receiver of the sensor registers high voltage peaks when an obstacle or blade cuts the rays. By counting the number of peaks, the speed of
rotation in rpm is determined using the programmed microcontroller. The rpm values
are sent to the LCD unit and the web application in a remote computer via the
GSM/GPRS module. The rpm values obtained from the IR sensor are verified with a
fine-tuned digital tachometer as shown in Fig. 4.
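A minimal sketch of this rpm computation from pulse counts follows; the pulse count, counting window, and blade count are illustrative assumptions, since the paper does not give these values.

def rpm_from_pulses(pulses, window_s, blades_per_rev):
    # Each revolution of the disc produces blades_per_rev pulses.
    revolutions = pulses / blades_per_rev
    return revolutions * (60.0 / window_s)

print(rpm_from_pulses(pulses=80, window_s=1.0, blades_per_rev=2))  # 2400 rpm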

Fig. 3. Measurement of the speed of SPIM using IR sensor



Fig. 4. Measurement of the speed of SPIM using a digital tachometer

3.4 IoT Based Communication Platform


As the user interface for the framework, a dynamic web application connected to the internet with the help of a web hosting service (i.e., 000webhost) is built using HTML, CSS, and PHP. This is essentially a two-page website. One page is for receiving
input commands from the user as shown in Fig. 5 and another is for monitoring
machine control parameters. Users can send motor ON/OFF command signals, desired
motor speed as a percentage of rated speed, and also the preferred direction of rotation
via the input tab.
Over the HTTP protocol, communication is made with the server and the GPRS module by applying the POST method of PHP to send device data. The data parameters are passed to the GPRS module, which is already enabled to communicate with the specific HTTP server and the microcontroller using the AT command set, such as the connection setup commands (i.e., AT+CGATT, AT+SAPBR, AT+HTTPINIT, etc.) and the GET and POST method commands (i.e., AT+HTTPPARA, AT+HTTPDATA, AT+HTTPACTION, AT+HTTPREAD and so on). The microcontroller sends motor status data such as
ON/OFF, direction, and speed data (from measuring unit) to the webserver via GPRS
through reverse communication, and finally the designed website receives the status
data using the GET method of PHP. On the monitoring unit, such as, in remote
computer/ mobile devices using the web application, and in LCD, the status of motors
can be observed as shown in Figs. 6 and 7 respectively.

Fig. 5. Web application control panel for the user interface.

Fig. 6. Web application based panel for monitoring the motor status

Fig. 7. Observation of motor status in LCD

4 Result and Discussion

The laboratory implementation of multiple SPIM speed and direction control and
monitor using IoT technology is shown in Fig. 8. As described earlier, an operator from
a remote location can observe the system and provide command inputs for the desired
operation of motors.

Fig. 8. A laboratory prototype of IoT based bidirectional speed of single-phase induction motor

This proposed system offers ON/OFF control, clockwise (CW) and counter-
clockwise (CCW) rotation control, speed control as a percentage of rated speed through
the specially designed website over the internet. The motor parameters, measured by
using the IR sensors, are processed by the microcontroller and shared to remote
computer/mobile by using GPRS and also displayed in LCD for field operators.
Concerning the variation of the PWM duty cycle, the SPIM speed variation has been
observed in the web server and LCD and also compared with field measurement data
obtained from a fine-tuned digital tachometer. It has been found that the deviation of
field-measured tachometer-data and IR sensor-based remote data is negligible. This
comparison is presented in Fig. 9 by using a comparative bar diagram.

Fig. 9. Digital Tachometer and IR sensor-based speed measurement comparative bar diagram.

Concerning the PWM duty cycle, the practically measured RPM values are plotted
against percent duty cycle variation as shown in Fig. 10 and a second-order polynomial
Eq. (11) has been derived.

$N = -0.641 D^2 + 113.5619 D - 2557.6$   (11)

Here, N is the speed in RPM and D is the duty cycle in percent (%). This equation can be used to estimate the motor speed empirically from the corresponding duty ratio. For various desired speed values of the single-phase induction motor, the required PWM duty cycles obtained from the above equation are summarized in Table 1.

Fig. 10. Polynomial curve fitting using measured data.

Table 1. Percentage of the duty cycle for different motor speed values
Speed, N (RPM) | Duty cycle, D (%) | Speed, N (RPM) | Duty cycle, D (%)
1700           | 55.87             | 2100           | 65.75
1800           | 58.03             | 2200           | 69.01
1900           | 60.36             | 2300           | 72.96
2000           | 62.91             | 2400           | 78.40
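Equation (11) can also be inverted to obtain the duty cycle required for a desired speed, as the following sketch shows; taking the smaller quadratic root matches the rising branch of the fitted curve, and the computed values agree with Table 1 to within roughly 1-2%.

import math

def duty_for_speed(n_rpm):
    # Solve -0.641 D^2 + 113.5619 D - 2557.6 = N for D (Eq. (11)).
    a, b, c = -0.641, 113.5619, -2557.6 - n_rpm
    disc = b * b - 4 * a * c
    return (-b + math.sqrt(disc)) / (2 * a)   # smaller root since a < 0

for n in (1700, 2000, 2400):
    print(n, round(duty_for_speed(n), 2))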

5 Conclusion

Multiple SPIM speed and direction control and monitoring using IoT technology have
been presented in this paper. The speed control of the induction motors has been
performed by the stator voltage control method using the PWM technique. The
ON/OFF and the bidirectional rotation (CW/CCW) of the motors have been controlled
by the cheap and simple operation of relays. The motor operational parameters have
been monitored remotely in a computer/smartphone using IoT through GPRS. For a
variation of the PWM duty cycle, the SPIM speed variation has been observed and was
compared with field measurement data obtained from a fine-tuned digital tachometer. It
has been found that the deviation between direct field measurement using the tachometer and IR sensor-based remote measurement is small. The analysis has also shown that the rpm of the motor can be calculated beforehand from the percentage of the duty cycle using the derived second-order polynomial equation. In industrial,
manufacturing, agricultural, and household applications, the proposed IoT based smart
control system can be introduced with more sensors to continuously track multiple
control variables of machines, plants, etc. It is also possible to develop an android
based application to make the system more user-friendly.

References
1. Sarb, D., Bogdan, R.: Wireless motor control in automotive industry. In: 2016 24th
Telecommunications Forum (TELFOR), pp. 1–4. Belgrade (2016). https://doi.org/10.1109/
TELFOR.2016.7818790
2. Mohan, N.: Power Electronics: A First Course, 1st edn. Wiley, USA (2012)
3. Internet World Stats. https://www.internetworldstats.com/stats.htm. Accessed 03 June 2019
4. Doshi, N.: Analysis of attribute-based secure data sharing with hidden policies in smart grid
of IoT. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.) Intelligent Computing & Optimization.
ICO 2018. Advances in Intelligent Systems and Computing, vol. 866. Springer, Cham
(2019). https://doi.org/10.1007/978-3-030-00979-3_54
5. Khan, A.A., Mouftah, H.T.: Energy optimization and energy management of home via web
services in smart grid. In: IEEE Electrical Power and Energy Conference, pp. 14–19, London
(2012). https://doi.org/10.1109/EPEC.2012.6474940
6. Cheah, P.H., Zhang, R., Gooi, H.B., Yu, H., Foo, M.K.: Consumer energy portal and home
energy management system for smart grid applications. In: 10th International Power &
Energy Conference (IPEC), pp. 407–411. Ho Chi Minh City (2012). https://doi.org/10.1109/
ASSCC.2012.6523302
7. Kuzlu, M., Pipattanasomporn, M., Rahman, S.: Communication network requirements for
major smart grid applications in HAN, NAN, and WAN. Comput. Netw. 67, 74–88 (2014).
https://doi.org/10.1016/j.comnet.2014.03.029
8. Tsai, W., Shih, Y., Tsai, T.: IoT-type electric fan: remote-controlled by smart-phone. In:
Third International Conference on Computing Measurement Control and Sensor Network
(CMCSN), pp. 12–15. Matsue (2016). https://doi.org/10.1109/CMCSN.2016.17
9. Kuzlu, M., Rahman, M.M., Pipattanasomporn, M., Rahman, S.: Internet-based communi-
cation platform for residential DR programmes. IET Networks 6(2), 25–31 (2017). https://
doi.org/10.1049/iet-net.2016.0040

10. Kamble, I., Patil, Y.M.: A review of parameters monitoring and controlling system for
industrial motors using wireless communication. Int. J. Res. Appl. Sci. Eng. Technol. 7(1),
47–49 (2019). https://doi.org/10.22214/ijraset.2019.1010
11. Potturi, S., Mandi, R.P.: Critical survey on IoT based monitoring and control of induction
motor. In: IEEE Student Conference on Research and Development (SCOReD), pp. 1–6.
Selangor, Malaysia (2018). https://doi.org/10.1109/SCORED.2018.8711222.
12. Patil, R.R., Date, T.N., Kushare, B.E.: ZigBee based parameters monitoring system for
induction motor. In: IEEE Students' Conference on Electrical, Electronics and Computer
Science, Bhopal, pp. 1–6 (2014). https://doi.org/10.1109/SCEECS.2014.6804469
13. Khairnar, V.C., Sandeep, K.: Induction motor parameter monitoring system using zig bee
protocol & MATLAB GUI: Automated Monitoring System. In: Fourth International
Conference on Advances in Electrical, Electronics, Information, Communication, and Bio-
Informatics (AEEICB), Chennai, pp. 1–6 (2018). https://doi.org/10.1109/AEEICB.2018.
8480992
14. Joshi, P.S., Jain, A.M.: Wireless speed control of an induction motor using PWM technique
with GSM. IOSR J. Electr. Electron. Eng. 6(2), 1–5 (2013). https://doi.org/10.9790/1676-
0620105
15. Rekha, V.S.D., Ravi, K.S.: Induction motor condition monitoring and controlling based on
IoT. Int. J. Electron. Electr. Comput. Syst. 6(9), 74–89 (2015)
16. Şen, M., Kul, B.: IoT-based wireless induction motor monitoring. In: XXVI International
Scientific Conference Electronics (ET), pp. 1–5. Sozopol (2017). https://doi.org/10.1109/ET.
2017.8124386.
17. Venkatesan, L., Kanagavalli, S., Aarthi, P.R., Yamuna, K.S.: PLC SCADA based fault
identification and protection for three-phase induction motor. TELKOMNIKA Indone-
sian J. Electr. Eng. 12(8), 5766–5773 (2014)
18. Kathiresan, S., Janarthanan, M.: Design and implementation of industrial automation using
IOT and Bluetooth LE. Int. J. Adv. Res. Trends Eng. Technol. (IJARTET) 3(19), 335–338
(2016). https://doi.org/10.20247/IJARTET.2016.S19040062
19. Çakır, A., Çalış, H., Turan, G.: Remote controlling and monitoring of induction motors
through internet. TELKOMNIKA Indonesian J. Electr. Eng. 12(12), 8051–8059 (2014).
https://doi.org/10.11591/telkomnika.v12i12.6719
20. Yildirim, D., Bilgic, M.: PWM AC chopper control of single-phase induction motor for
variable-speed fan application. In: 2008 34th Annual Conference of IEEE Industrial
Electronics, Orlando, FL, pp. 1337–1342 (2008). https://doi.org/10.1109/IECON.2008.
4758148.
Missing Image Data Reconstruction Based
on Least-Squares Approach
with Randomized SVD

Siriwan Intawichai and Saifon Chaturantabut

Department of Mathematics and Statistics, Faculty of Science and Technology,


Thammasat University, Pathumthani 12120, Thailand
Siriwan_amth50@hotmail.com,
saifon@mathstat.sci.tu.ac.th

Abstract. In this paper, we introduce an efficient algorithm for reconstructing


incomplete images based on optimal least-squares (LS) approximation. Gener-
ally, LS method requires a low-rank basis set that can represent the overall
characteristic of an image, which can be obtained optimally via the singular
value decomposition (SVD). This basis is called proper orthogonal decompo-
sition (POD) basis. To significantly decrease the computational cost of SVD,
this work employs a randomized singular value decomposition (rSVD) to
compute the basis from the available image pixels. In this work, to preserve the
2-dimensional structure of the image, the test image is first subdivided into many
2-dimensional small patches. The complete patches are used to compute the
POD basis for reconstructing corrupted patches. For each incomplete patch, the
known pixels in the neighborhood around the missing components are used in
the LS approximation together with the POD basis in the reconstruction process.
The numerical tests compare the execution time used in computing this optimal
low-rank basis by using rSVD and SVD, as well as demonstrate the accuracy of
the resulting image reconstructions.

Keywords: Missing data reconstruction · Singular value decomposition · Randomized SVD · Least-squares approximation

1 Introduction

Missing data problems have been important issues in many research fields such as
biology, medical, engineering, image processing, remote sensing etc. In image pro-
cessing, pixels in some images may be corrupted or simply missing. In some cases,
pixels may be randomly deleted in order to save network bandwidth during data
transmission. The missing data reconstruction methods are extensively developed to
restore those corrupted or missing pixels.
In 2008, Emmanuel J. C. et al. [1] considered matrix completion in the recovery of
a data matrix from a sampling of its entries which led to the notions of missing data
reconstruction algorithms. For example, Jin Z. et al. [2] and Feilong C. et al. [3]
demonstrated the reconstruction algorithm by working on sparse matrix completion.


Singular value decomposition (SVD) is one of the fundamental techniques in


matrix completion, which can be used for reconstructing missing data. SVD is the most
widely used for matrix decomposition (see [4–8]). It is a stable and effective method to
split the system into a set of linearly independent components. Also, SVD is the
optimal matrix decomposition in a least square sense that it packs the maximum signal
energy as few coefficients as possible (see more detail [7]).
SVD is adapted for dimensionality reduction techniques which are applied for
solving many problems, such as low-rank matrix approximations [6, 7], image pro-
cessing (see [9–12]), image compression ([10, 11]), face recognition [11] and missing
data image reconstruction ([12–15]). Because computing the full SVD of a high-dimensional matrix in the traditional way is often expensive as well as memory intensive, improved methods to decrease the computational work have been proposed.
Sarlos T. [16], Liberty E. et al. [17] and Halko N. et al. [18] introduced randomized
singular value decomposition (rSVD) which is a more robust approach based on ran-
dom projections. These methods can decrease the computational work of extracting low-rank approximations while preserving accuracy.
In [19] and [20], rSVD methods were applied to reconstruct missing data components of images with the notion of least-squares approximation. The least-squares method is used for estimating missing data with an optimal low-rank basis in the Euclidean
norm, which is also called proper orthogonal decomposition (POD) basis. In general,
POD basis can be computed by using SVD. However, computing SVD can be time
consuming and can dominate the execution time during the reconstruction process. The
rSVD method is therefore used for improving the accuracy and decreasing the com-
putation times. The image reconstruction method in [19] and [20] computed a POD
basis by using all complete columns in the image. Then, each incomplete column with
missing pixels was approximated by using this POD basis with the coefficients com-
puted from the remaining pixels in that column. This approach was shown to decrease
the computation times with equivalent reconstruction errors when compared to the
traditional approach. However, this approach may not be applicable or efficient when
almost all columns of the image are incomplete or the incomplete columns contain
missing pixels in different rows. Moreover, this approach may not preserve the 2-
dimensional structure of the original image.
This work resolves the above limitations by separating the test image into many 2-
dimensional small patches and then using the complete patches to build the POD basis
for reconstructing corrupted patches. For each incomplete patch, the known pixels in
the neighborhood around the missing components are used in the LS approximation
together with the POD basis to reconstruct the missing pixels. This approach provides
the flexibility of controlling the size of the patches, and hence the number of complete patches used for the POD basis can be chosen arbitrarily. This reconstruction
approach also applies the rSVD method to compute POD basis for decreasing the
computation times.
The remainder of this paper is organized as follows. In Sect. 2, we consider the
background knowledge of SVD and rSVD. The approach for reconstructing missing
data based on rSVD via least-squares method is discussed in Sect. 3. The numerical
experiments shown in Sect. 4 use these approaches for constructing the projection basis
and compare the CPU times together with the reconstruction errors. The rSVD
Missing Image Data Reconstruction 1061

approach is shown to use least execution time, for sufficiently small dimensions of the
POD basis, while providing the same level of accuracy when compared to the other
approach. Finally, some concluding remarks are given in Sect. 5.

2 Background Knowledge

2.1 Singular Value Decomposition


The singular value decomposition (SVD) of a matrix $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{m \times n}$ can be expressed as

$X = U \Sigma V^T$   (1)

where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are matrices with orthonormal columns. The column vectors of $U$ and $V$ are the left and right singular vectors, respectively, denoted as $u_i$ and $v_i$; $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r) \in \mathbb{R}^{m \times n}$, $r = \min\{m, n\}$, where $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_r > 0$ are known as the singular values. The superscript $T$ denotes the transpose of a matrix. The left singular vectors of $X$ are eigenvectors of $XX^T$ and the right singular vectors of $X$ are eigenvectors of $X^T X$.
The truncated SVD is obtained by computing the full SVD and then truncating it by selecting the top $k$ dominant singular values and their corresponding singular vectors, such that

$X_k = U_k \Sigma_k V_k^T = \sum_{i=1}^{k} \sigma_i u_i v_i^T$   (2)

where $k < r$ is the numerical rank, and $U_k \in \mathbb{R}^{m \times k}$ and $V_k \in \mathbb{R}^{n \times k}$ are matrices with orthonormal columns, $\Sigma_k = \mathrm{diag}(\sigma_1, \ldots, \sigma_k) \in \mathbb{R}^{k \times k}$ where $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_k > 0$. Here, $X_k = [\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n]$ is the best rank-$k$ approximation to the matrix $X$, with the low-rank approximation error measured in the 2-norm or Frobenius norm:

$\|X - X_k\|_2^2 = \sigma_{k+1}^2, \quad \text{or} \quad \|X - X_k\|_F^2 = \sum_{i=1}^{n} \|x_i - \tilde{x}_i\|_2^2 = \sum_{l=k+1}^{r} \sigma_l^2$   (3)

The SVD method is summarized in Algorithm 1.

Algorithm 1: The SVD algorithm
INPUT: A data matrix $X \in \mathbb{R}^{m \times n}$ with target rank $k$.
OUTPUT: The SVD of $X$: $U, \Sigma, V$
Step 1. Set $B = X^T X$
Step 2. Compute the eigendecomposition $B = V D V^T$
Step 3. Calculate $\Sigma = \sqrt{D}$
Step 4. Compute $U = X V \Sigma^{-1}$
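A direct NumPy transcription of Algorithm 1 is sketched below; it assumes the retained singular values are nonzero so that Sigma is invertible (in practice numpy.linalg.svd would be used instead).

import numpy as np

def svd_via_eig(X, k):
    B = X.T @ X                                  # Step 1
    evals, V = np.linalg.eigh(B)                 # Step 2 (ascending eigenvalues)
    idx = np.argsort(evals)[::-1][:k]            # keep the k largest
    V = V[:, idx]
    sigma = np.sqrt(np.maximum(evals[idx], 0))   # Step 3: Sigma = sqrt(D)
    U = (X @ V) / sigma                          # Step 4: U = X V Sigma^{-1}
    return U, sigma, V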

2.2 Randomized Singular Value Decomposition


We define the randomized singular value decomposition (rSVD) of $X$ as

$\hat{X}_k = \hat{U}_k \hat{\Sigma}_k \hat{V}_k^T,$   (4)

where $k < r$ is the numerical rank, $\hat{U}_k \in \mathbb{R}^{m \times k}$ and $\hat{V}_k \in \mathbb{R}^{n \times k}$ are matrices with orthonormal columns, and $\hat{\Sigma}_k \in \mathbb{R}^{k \times k}$ is a diagonal matrix with the singular values $\hat{\sigma}_1 \geq \hat{\sigma}_2 \geq \ldots \geq \hat{\sigma}_k > 0$. Details are given as follows.
Define the random projection of a matrix as

$Y = X \Omega$   (5)

where $\Omega$ is a random matrix. The rSVD algorithm as considered in [18] explores approximate matrix factorizations using random projections, separating the process into two stages. In the first stage, random sampling is used to obtain a reduced matrix whose range approximates the range of the data matrix. For a given $\varepsilon > 0$, we wish to find a matrix $Q$ with orthonormal columns such that

$\|X - Q Q^T X\|_2 \leq \varepsilon.$   (6)

Without loss of generality, we assume $Q \in \mathbb{R}^{m \times l}$, $l \leq n$. The columns of $Q$ form an orthogonal basis for the range of $X \Omega$, which is an approximation to the range of $X$, where $\Omega$ is a matrix composed of the random vectors.
The second stage of the rSVD method is to compute the SVD of $B := Q^T X$. Suppose $B = \tilde{U} \hat{S} \hat{V}^T$ is the SVD of $B$, which can be obtained from the orthogonal projection of $X$ onto the low-dimensional subspace spanned by the columns of $Q$. We finally obtain the approximated POD basis $\hat{U}$ of $X$ from the product $Q\tilde{U}$. We would like the basis matrix $Q$ to contain as few columns as possible, but it is even more important to have an accurate approximation of the input matrix. There are many methods for constructing the matrix $Q$, such as QR factorization, eigenvalue decomposition, and SVD. In this work, we compute this matrix by using the QR decomposition, as summarized in Algorithm 2.
Algorithm 2: The rSVD algorithm
INPUT: A data matrix $X \in \mathbb{R}^{m \times n}$ with target rank $k$ and an oversampling parameter $p$
OUTPUT: The rSVD of $X$: $U, \Sigma, V$
Step 1. Draw a random matrix $\Omega$ with dimension $n \times (k + p)$
Step 2. Form the matrix product $Y = X \Omega$
Step 3. Construct $Q$ from the QR decomposition of $Y$
Step 4. Set $B = Q^T X$
Step 5. Compute an SVD of the small matrix: $B = \tilde{U} S V^T$
Step 6. Set $U = Q \tilde{U}$
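A compact NumPy sketch of Algorithm 2, using a Gaussian test matrix:

import numpy as np

def rsvd(X, k, p=10, seed=0):
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((X.shape[1], k + p))       # Step 1
    Y = X @ Omega                                          # Step 2
    Q, _ = np.linalg.qr(Y)                                 # Step 3
    B = Q.T @ X                                            # Step 4
    U_tilde, S, Vt = np.linalg.svd(B, full_matrices=False) # Step 5
    U = Q @ U_tilde                                        # Step 6
    return U[:, :k], S[:k], Vt[:k].T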

3 Missing Image Data Reconstruction Approach


Let $X$ be a grayscale image. We begin by extracting all the $\sqrt{m} \times \sqrt{m}$ patches from the image $X$ to form a matrix $S = [s_1, s_2, \ldots, s_n] \in \mathbb{R}^{m \times n}$, where $s_i$ $(i = 1, 2, \ldots, n)$ are the vectorized image patches ordered as columns of $S$, $m$ is the number of pixels in each patch and $n$ is the number of patches. The patches can be split into known patches and corrupted patches, which form complete and incomplete data vectors, respectively. We need to reconstruct the input image $X$ through the matrix $S$, as illustrated in Fig. 1, which gives a general framework for the image reconstruction.
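A short NumPy sketch of this patch-matrix construction for non-overlapping patches (for example, 32 × 32 patches of a 512 × 512 image give S of size 1024 × 256, as in Case I of Sect. 4):

import numpy as np

def image_to_patch_matrix(img, patch):
    h, w = img.shape
    cols = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            cols.append(img[r:r + patch, c:c + patch].reshape(-1))
    return np.stack(cols, axis=1)    # shape: (patch*patch, number of patches)

S = image_to_patch_matrix(np.zeros((512, 512)), 32)
print(S.shape)                       # (1024, 256)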
We apply the approach in [19, 20] to approximate an incomplete data vector, which uses a projection onto a subspace spanned by a basis that represents the related complete data vectors. First, let $\{s_1, s_2, \ldots, s_{n_s}\} \subset \mathbb{R}^n$ be a complete data set, and form a matrix of complete data $S_c = [s_1, s_2, \ldots, s_{n_s}] \in \mathbb{R}^{n \times n_s}$. This matrix $S_c$ will be used to compute the projection basis matrix for approximating incomplete patches. Let $\hat{s} \in \mathbb{R}^n$ be an incomplete vectorized patch and $n = n_c + n_g$, where $n_c, n_g$ are the numbers of known and unknown components, respectively. Suppose that $C = [e_{c_1}, \ldots, e_{c_{n_c}}] \in \mathbb{R}^{n \times n_c}$ and $G = [e_{g_1}, \ldots, e_{g_{n_g}}] \in \mathbb{R}^{n \times n_g}$, where $e_{c_i}, e_{g_i} \in \mathbb{R}^n$ are the $c_i$-th and $g_i$-th columns of the identity matrix $I_n$, and $\{c_1, c_2, \ldots, c_{n_c}\}, \{g_1, g_2, \ldots, g_{n_g}\} \subset \{1, 2, \ldots, n\}$ are the indices of the known and unknown components, respectively, of $\hat{s}$.

Fig. 1. The overview of missing image data reconstruction approach.

Let $\hat{s}_c := C^T \hat{s} \in \mathbb{R}^{n_c}$ and $\hat{s}_g := G^T \hat{s} \in \mathbb{R}^{n_g}$. Then the known components and the unknown components are given in the vectors $\hat{s}_c$ and $\hat{s}_g$, respectively. An overview of the steps described in this section for approximating an incomplete image is shown in Fig. 2.

Fig. 2. Stage-diagram of the proposed missing image data reconstruction approach

Note that pre-multiplying by $C^T$ is equivalent to extracting the $n_c$ rows corresponding to the indices $\{c_1, c_2, \ldots, c_{n_c}\}$. Similarly, $G^T$ is equivalent to extracting the $n_g$ rows corresponding to the indices $\{g_1, g_2, \ldots, g_{n_g}\}$. The missing components contained in $\hat{s}_g$ will be approximated by first projecting $\hat{s}$ onto the column span of a basis matrix $U$ with rank $k$:

$\hat{s} \approx U a, \quad \text{or} \quad \hat{s}_c \approx U_c a, \quad \text{and} \quad \hat{s}_g \approx U_g a,$

for some coefficient vector $a \in \mathbb{R}^k$, where $U_c := C^T U \in \mathbb{R}^{n_c \times k}$ and $U_g := G^T U \in \mathbb{R}^{n_g \times k}$. The known components contained in $\hat{s}_c$ are then used to determine the coefficient vector $a$ through the approximation $\hat{s}_c \approx U_c a$ from the following least-squares problem:

$\min_{a \in \mathbb{R}^k} \|\hat{s}_c - U_c a\|_2^2$   (7)

The solution of the above problem is given by $a = U_c^{\dagger} \hat{s}_c$, where $U_c^{\dagger} = (U_c^T U_c)^{-1} U_c^T$ is the Moore-Penrose inverse. That is,

$\hat{s}_g \approx U_g a = U_g U_c^{\dagger} \hat{s}_c$   (8)



The details of these steps are provided in Algorithm 3 below.

Algorithm 3: Standard POD least-squares approach
INPUT: Complete data set $\{s_j\}_{j=1}^{n_s} \subset \mathbb{R}^n$ and $k \leq \mathrm{rank}\{s_j\}_{j=1}^{n_s}$; incomplete data $\hat{s} \in \mathbb{R}^n$ with known entries in $\hat{s}_c \in \mathbb{R}^{n_c}$ and unknown entries in $\hat{s}_g \in \mathbb{R}^{n_g}$, where $n = n_c + n_g$
OUTPUT: Approximation of $\hat{s}_g$
Step 1. Create the snapshot matrix $S = [s_1, s_2, \ldots, s_{n_s}] \in \mathbb{R}^{n \times n_s}$ and let $r = \mathrm{rank}(S)$
Step 2. Construct a basis $U$ of rank $k \leq r$ for $S$
Step 3. Find the coefficient vector $a$ from $\hat{s}_c$ by solving the least-squares problem $\min_a \|\hat{s}_c - U_c a\|_2^2$
Step 4. Compute the approximation $\hat{s}_g \approx U_g a$
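A NumPy sketch of Algorithm 3, reconstructing the missing entries of one incomplete patch from a rank-k basis U and the index sets of known and missing pixels:

import numpy as np

def reconstruct_missing(U, s_hat, known_idx, missing_idx):
    U_c, U_g = U[known_idx, :], U[missing_idx, :]    # C^T U and G^T U
    s_c = s_hat[known_idx]
    a, *_ = np.linalg.lstsq(U_c, s_c, rcond=None)    # least-squares of Eq. (7)
    return U_g @ a                                   # Eq. (8)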

Next, we consider the optimal basis obtained from the singular value decomposition (SVD), since it is optimal in the least-squares sense. The basis defined above can be obtained from the left singular vectors of the matrix $S$. Recall that $S = [s_1, s_2, \ldots, s_{n_s}] \in \mathbb{R}^{n \times n_s}$ and $k < r = \mathrm{rank}(S)$. The SVD of $S$ is $S = U \Sigma V^T$, where $U = [u_1, \ldots, u_r] \in \mathbb{R}^{n \times r}$ and $V = [v_1, \ldots, v_r] \in \mathbb{R}^{n_s \times r}$ are matrices with orthonormal columns and $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r) \in \mathbb{R}^{r \times r}$ with $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_r > 0$.

The optimal solution of the least-squares problem $\min_{S_k} \|S - S_k\|_F^2$, $\mathrm{rank}(S_k) = k$, is $S_k = U_k \Sigma_k V_k^T$ with minimum error $\|S - S_k\|_2^2 = \sigma_{k+1}^2$ or $\|S - S_k\|_F^2 = \sum_{l=k+1}^{r} \sigma_l^2$. Then the optimal orthonormal basis of rank $k$ (the POD basis) is the matrix formed by the first $k$ left singular vectors, i.e. $U_k = [u_1, \ldots, u_k] \in \mathbb{R}^{n \times k}$, $k \leq r$. However, it can be computationally intensive to obtain the standard SVD. We therefore use the rSVD method of Sect. 2.2 to find the optimal orthonormal basis, which reduces the computational complexity of the standard approach for computing the SVD.

4 Numerical Experiments

In this section, we show the numerical experiments of the rSVD and SVD methods for constructing the projection basis and compare the CPU times together with the reconstruction errors. Here, we use the relative error in the 2-norm. We use a standard test image, called the Lena picture. This image is considered in grayscale with size 512 × 512, as shown in Fig. 3. We consider this image with 2.75% missing pixels, in blocks of missing pixels of size 15 × 20 pixels spread over the test image. We consider two cases with different patch sizes for the image reconstruction:
Case I. Clipping the image into 256 patches of size 32 × 32 pixels, so that the dimension of the matrix representing this image is 1024 × 256.
Case II. Clipping the image into 1024 patches of size 16 × 16 pixels, so that the dimension of the matrix representing this image is 256 × 1024.

Fig. 3. The standard test image, called Lena picture (a) The original gray-scale image
(b) Incomplete image with missing pixels.

Note that the number of known patches must be sufficient for computing the basis, and the available pixels in a corrupted patch must be sufficient to reconstruct the missing pixels. The performance for both cases is shown in Fig. 4 and Fig. 5.

Fig. 4. The reconstructed images (when clipped to 256 patches with size 32  32 pixels) by
using the rSVD method. The images in (a)-(e) show the reconstruction using basis of rank
k = 10, 15, 20, 25, 30, respectively.

By using the reconstruction approach explained in this work, we investigate the


results when ranks k = 10, 15, 20, 25 and 30 are used. For the rSVD method, the
reconstructed results are accurate and seem to be indistinguishable from the original
image, as shown in Fig. 4. The comparisons between the rSVD and SVD methods for these two cases are shown in Fig. 5, for ranks k = 10, 20, 30.

The relative error of the reconstruction results and the computational time for
constructing basis set are shown in Fig. 6 and Fig. 7, respectively. In addition, the
efficacy of the proposed reconstruction strategy is measured by the peak signal-to-noise
ratio (PSNR), as shown in Table 1 and Fig. 8.
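For 8-bit images, the PSNR used here is computed as follows (MAX = 255):

import numpy as np

def psnr(original, reconstructed, max_val=255.0):
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)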

Fig. 5. The reconstructed images of case I, (1a)–(3a) and (1b)–(3b) show the images when using
the rSVD method and SVD method, respectively with rank k = 10, 20, 30. Similarly, the
reconstructed images of case II, (1c)–(3c) and (1d)–(3d) show the images when using the rSVD
method and SVD method, respectively with rank k = 10, 20, 30.

Fig. 6. Relative errors of the reconstruction results of the Lena picture when 2.75% of pixels are missing: (a) Case I and (b) Case II.

Fig. 7. Computational time for computing the basis used in the reconstruction of the Lena picture when 2.75% of pixels are missing: (a) Case I and (b) Case II.

Table 1. Reconstruction performances of images under using rSVD and SVD method with
different rank k.
PSNR (dB) | Case I: rSVD | Case I: SVD | Case II: rSVD | Case II: SVD
k = 10    | 32.3814      | 32.3476     | 32.2426       | 31.3460
k = 15    | 32.4651      | 32.3421     | 32.2796       | 31.0623
k = 20    | 32.0866      | 32.0097     | 31.7313       | 30.9716
k = 25    | 32.2250      | 32.0922     | 31.9056       | 30.6867
k = 30    | 32.1546      | 31.8980     | 31.7345       | 30.5484

From Table 1, both cases show that the rSVD method is slightly more accurate than the SVD method. Although the rSVD method gives only slightly higher PSNR values in some test cases, it uses significantly less computation time.

Fig. 8. Reconstruction performances in term of PSNR: (a) Case I and (b) Case II.

5 Conclusions

This work presented an image reconstruction approach. A given incomplete image was first divided into many patches; only the corrupted patches were reconstructed, while the known patches were used to compute the POD basis. The available image pixels in
patches were used to form a low-dimensional subspace and approximate the missing
pixels by applying the least-squares method. By using the available pixels in the
corrupted patches around the missing pixels, the incomplete image is shown to be
approximated efficiently and accurately. Instead of using a traditional approach based
on the standard SVD for computing this low-dimension basis, this work efficiently used
rSVD, which was shown in the numerical tests to give substantially less reconstruction
time with the same order of accuracy.

Acknowledgments. The authors gratefully acknowledge the financial support provided by the Thammasat University Research Fund, Contract No. TUGR 2/12/2562, and by the Royal Thai Government Scholarship in the Area of Science and Technology (Ministry of Science and Technology).

References
1. Candès, E.J., Recht, B.: Exact matrix completion via convex optimization. Found. Comput. Math. 9, 717–772 (2008)
2. Jin, Z., Chiman, K., Bulent, A.: A high performance missing pixel reconstruction algorithm
for hyperspectral images. In: Proceedings of 2nd International Conference on Applied and
Theoretical Information Systems Research (2012)
3. Cao, F., Cai, M., Tan, Y.: Image interpolation via low-rank matrix completion and recovery. IEEE Trans. Circuits Syst. Video Technol. 25(8), 1261–1270 (2015)
4. Stewart, G.W.: On the early history of the singular value decomposition. SIAM Rev. 35(4), 551–566 (1993). http://www.jstor.org/stable/2132388
5. Kalman, D.: A singularly valuable decomposition: the SVD of a matrix. College Math. J. 27(1), 2–23 (1996)

6. Achlioptas, D., McSherry, F.: Fast computation of low-rank matrix approximations. J. ACM 54(2) (2007)
7. Strang, G.: Linear Algebra and its Applications, 3rd edn. Harcourt Brace Jovanovich, San Diego (1988)
8. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996)
9. Andrews, H.C., Patterson, C.L.: Singular value decompositions and digital image processing. IEEE Trans. Acoust. Speech Signal Process. 24(1), 26–53 (1976)
10. Samruddh, K., Reena, R.: Image compression using singular value decomposition. Int.
J. Adv. Res. Technol. 2(8), 244–248 (2013)
11. Lijie, C.: Singular Value Decomposition Applied To Digital Image Processing, Division of
Computing Studies. Arizona State University Polytechnic Campus, Mesa, Arizona, 1–15
12. Sven, O.A., John, H.H., Patrick, W.: A critique of SVD-based image coding systems. In:
IEEE International Symposium on Circuits and Systems (ISCAS), vol. 4, pp. 13–16 (1999)
13. Vitali, V.S., Roger, L.: Fast PET image reconstruction based on svd decomposition of the
system matrix. IEEE Trans. Nuclear Sci. 48(3), 761–767 (2001)
14. Rowayda, A.S.: SVD based image processing applications: state of the art, contributions and
research challenges. Int. J. Adv. Comput. Sci. and Appl. 3(7), 26–34 (2012)
15. Davi, M.L., João, P.C.L.C., João, L.A.C.: Improved MRI reconstruction and denoising using
SVD-based low-rank approximation. In: Workshop on Engineering Applications, pp. 1–6
(2012)
16. Sarlos, T.: Improved approximation algorithms for large matrices via random projections. In:
47th Annual IEEE Symposium Foundations of Computer Science, pp. 143–152(2006)
17. Liberty, E., Woolfe, F., Martinsson, P.G., Rokhlin, V., Tygert, M.: Randomized algorithms
for the low-rank approximation of matrices. Proc. Natl. Acad. Sci. 104(51), 20167–20172
(2007)
18. Halko, N., Martinsson, P.G., Tropp, J.A.: Finding structure with randomness: probabilistic
algorithms for constructing approximate matrix decompositions. Siam Rev. 53(2), 217–288
(2011)
19. Intawichai, S., Chaturantabut, S.: An application of randomized singular value decomposition on image reconstruction using a least-squares approach. Preprint, Thai Journal of Mathematics (2018)
20. Intawichai, S., Chaturantabut, S.: A numerical study of efficient sampling strategies for randomized singular value decomposition. Thai Journal of Mathematics, pp. 351–370 (2019)
An Automated Candidate Selection
System Using Bangla Language
Processing

Md. Moinul Islam, Farzana Yasmin, Mohammad Shamsul Arefin, Zaber Al Hassan Ayon, and Rony Chowdhury Ripan

Computer Science and Engineering, Chittagong University of Engineering and


Technology, Chattogram 4349, Bangladesh
moinulislam7002@gmail.com, farzanaefu@gmail.com, sarefin@cuet.ac.bd,
zaberayon@gmail.com, ronyripon@gmail.com

Abstract. Recruiting or selecting the right candidates from a vast pool


of candidates has always been a fundamental issue in Bangladesh as far
as employers are concerned. In the case of candidate recruitment, differ-
ent government organizations, nowadays, ask the applicants to submit
their applications or resumes written in Bengali in the form of elec-
tronic documents. Matching the skills with the requirements and choos-
ing the best candidates manually from all the resumes written in Bengali
is very difficult and time-consuming. To make the recruitment process
more comfortable, we have developed an automated candidate selection
system. First, it takes the CVs (written in Bengali) of candidates and
the employer’s requirements as input. It extracts information from the
candidate’s CV using Bangla Language Processing (BLP) and Word2Vec
embedding. Then, it generates an average cosine similarity score for each
CV. Finally, it ranks the candidates according to the average cosine sim-
ilarity scores and returns the dominant candidate’s list.

Keywords: Automation · Bangla language processing · Candidate


selection · Word2vec · Cosine similarity · Gensim

1 Introduction
Data mining is a logical process used to find relevant data from a large data set.
It is the process of analyzing data from different aspects and summarizes them
into useful information [11]. Data mining helps us extract this information from
an extensive dataset by finding patterns in the given data set. Patterns that are
certain enough according to the user’s measures are called knowledge [18]. Data
mining is a sub-process of knowledge discovery in which the various available data
sources are analyzed using different data mining algorithms. Like data mining,
text mining is used in extracting essential pieces of information from the text
[3]. Since all data on the web and social media are available in a fuzzy and

random manner, it is sometimes hard for humans to understand and process the
data effectively. In that type of case, text mining tools help to create a bridge
and relationship among the texts. This process is also known as Knowledge
Discovery from Texts (KDT) [15]. There are several text mining applications,
such as speech recognition, social media data analysis, content enrichment, etc.
[1,2].
Recruitment, or selecting the right candidates from a vast pool of candidates,
has always been a fundamental issue as far as employers are concerned. Generally, the recruitment process of a company works as follows. Candidates send CVs to the company. The recruiter selects some candidates from those CVs, followed by
conducting personality and other technical eligibility evaluation tests, interviews,
and group discussions. The Human Resource (HR) department is one of the most critical departments of a company. The HR department plays a vital role in selecting an expert candidate for a particular post. The first task of the HR department is to shortlist the CVs of the various candidates who applied for the specific post. There is a high level of uncertainty when the HR department checks all the CVs, ranks them, and selects a candidate for a particular job position, because of the large number of applicants. This uncertainty arises from the different opinions and preferences of the various occupation-domain experts in the decision-making process. This
evaluation process involves excessive time consumption and monotonous work
procedure. To reduce the load of the HR department in the recruiting process,
an automation tool like CV ranker is very helpful.
Recently, some studies have focused on automated candidate selection systems [9,13,16,17,20]. An elaborate discussion of these methodologies is given in Sect. 2. A common limitation of the existing techniques is that none of them handles a CV written in Bengali. In Bangladesh, Bengali is our native language, and many government posts require a CV written in Bengali. However, no automated candidate selection system for Bengali CVs has been developed yet.
In this study, we have developed an automated candidate selection system
that takes the CVs (written in Bengali) of candidates and the employer’s require-
ments as input. After that, our system extracts information about the candidates from the CVs using Bangla Language Processing (BLP) and generates an average Cosine Similarity Score for each CV. Finally, our system ranks the candidates
according to these average Cosine Similarity Scores and returns the dominant
candidates’ list.
The rest of this paper is organized as follows: We present the related works
on the automated candidate selection system in Sect. 2. Section 3 provides the
details of the proposed methodology of the automated candidate selection sys-
tem. Our experimental results are shown in Sect. 4. Finally, we conclude the
paper in Sect. 5 by providing the future directions of this research.

2 Related Work
There are only a few contributions regarding automated candidate selection systems. For example, Evanthia et al. [7] implement automated candidate ranking based on objec-

tive criteria that can be extracted from the applicant’s LinkedIn profile. The
e-recruitment system was deployed in a real-world recruitment scenario, and
expert recruiters validated its output. Menon et al. [14] present an approach
to evaluate and rank candidates in a recruitment process by estimating emo-
tional intelligence through social media data. The candidates have to apply for a
job opening by filling an online resume and accessing their Twitter handle. This
system estimates the candidate’s emotion by analyzing their tweets, and the
professional eligibility is verified through the entries given in the online resume.
Faliagka et al. [6] present an approach for evaluating job applicants in online
recruitment systems, using machine learning algorithms to solve the candidate
ranking problem, and performing semantic matching techniques. The system
needs access to the candidate’s full profile and the recruiters’ selection criteria,
including assigning weights to a set of candidate selection criteria. The pro-
posed scheme can output consistent candidate rankings compared to the ones
assigned by expert recruiters. De Meo et al. [5] propose an Extensible Markup
Language (XML)-based multi-agent recommender system for supporting online
recruitment services. XML is a standard language for representing and exchang-
ing information. It embodies both representation capabilities, which are typical
of Hypertext Markup Language, and data management features, which are char-
acteristic of the Database Management System. Kessler et al. [12] present the
E-Gen system: Automatic Job Offer Processing system for Human Resources
implementing two tasks - analysis and categorization of job postings. Fazel
et al. [8] present an approach to matchmaking job seekers and job advertise-
ments. When searching for jobs or applicants, a job seeker or recruiter can ask
for all the job advertisements or applications that match his/her application in
addition to expressing his/her desired job-related descriptions. Getoor et al. [10] present a survey on link mining using the top-k query processing technique.
Link mining refers to data mining techniques that explicitly consider these links
when building predictive or descriptive models of the linked data.
Most of the CV ranking research has been done on CVs written in English. In this study, we have developed an automated candidate selection system that takes the CVs (written in Bengali) of candidates and the employer's requirements as input. Then, it extracts information about the candidates from the CVs using Bangla Language Processing (BLP) and generates an average Cosine
Similarity Score for each CV. Finally, it ranks the candidates according to these
average Cosine Similarity Scores and returns the dominant candidates’ list.

3 Methodology

In this section, we describe the overall methodology of the candidate selection system.

Fig. 1. Data preparation module

3.1 Data Preparation Module


The data preprocessing module's function is to extract data, tokenize it, and extract keywords from it. Since the resumes are written in Bengali, Bangla Language Processing (BLP), a Bengali-specific form of Natural Language Processing (NLP), is used to
extract information from the resumes. For a computer to store text and numbers
that are written in Bengali, there needs to be a code that transforms characters
into numbers. The Unicode standard defines such a coding system by using char-
acter encoding. The Unicode Standard consists of a set of code charts for visual
reference, an encoding method and standard character encodings, a collection
of reference data files, and many related items. Different character encodings
can implement Unicode. The most commonly used encodings are UTF-8, UTF-
16, etc. UTF-8 is a variable width character encoding capable of encoding all
1,112,064 valid code points in Unicode using one to four 8-bit bytes [19]. Bengali
Unicode block contains characters for the regional languages such as Bengali,
Assamese, Bishnupriya Manipuri, etc. The range for Bangla Unicode is U+0980
to U+09FF.
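As a small illustration of this encoding range, the following sketch checks whether the characters of a UTF-8 decoded string fall in the Bengali Unicode block; it is our illustrative example, not part of the authors' implementation, and the file name is hypothetical.

def is_bengali_char(ch):
    # Bengali Unicode block: U+0980 to U+09FF
    return 0x0980 <= ord(ch) <= 0x09FF

def bengali_ratio(text):
    # fraction of non-space characters that are Bengali
    chars = [c for c in text if not c.isspace()]
    return sum(is_bengali_char(c) for c in chars) / max(len(chars), 1)

with open('cv.txt', encoding='utf-8') as f:  # hypothetical file name
    print(bengali_ratio(f.read()))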
Before going into the score generation module, the data needs to be in list form because of the word2vec model discussed at length in Sect. 3.2. Converting a sequence of characters into a sequence of tokens is known as tokenization or lexical analysis, and the program that performs lexical analysis is termed a tokenizer. Tokens are stored in lists for further access. Finally, these tokens, as several lists, go into the score generation module.
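A minimal sketch of such a tokenizer is given below; it assumes whitespace- and danda-delimited Bengali text and is only an illustration, not the authors' exact tokenizer.

import re

def tokenize(text):
    # split on whitespace and the Bengali danda (U+0964) full stop
    return [tok for tok in re.split(r'[\s\u0964]+', text) if tok]

cv_lines = ['...']  # placeholder: lines extracted from a CV
token_lists = [tokenize(line) for line in cv_lines]  # lists fed to word2vec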

3.2 Score Generation Module


Word Embedding: Word embedding is the collective name for language mod-
eling and feature learning techniques in natural language processing (NLP), where words or phrases from the vocabulary map to vectors of real numbers. Conceptually, it involves a mathematical embedding from a space with many dimensions per word to a continuous vector space with a much lower dimension.
Word Embedding [21] is all about improving networks’ ability to learn from text
data by representing that data as lower-dimensional vectors (which is also known
as Embedding). This technique reduces the dimensionality of text data and can

Fig. 2. Score generation module

also learn some new traits about words in a vocabulary. Besides, it can also
capture the context of a word in a document, semantic and syntactic similarity,
relation with other words, etc. Word2Vec is a popular method to construct such
word embedding. This method takes input as a large corpus of text and produces
a vector space as output. Each vector consists of many hundred dimensions or
more than a hundred dimensions, with each unique word from the large corpus
of text being assigned a corresponding vector in the space. Word vectors are
positioned in the vector space so that words that share common contexts in
the corpus are located close to one another in the space. Considering the fol-
lowing sentence: “Have a good day” with exhaustive vocabulary V = Have, a,
good, day. One-hot encoding of this vocabulary list is done by putting one for
the element at the index representing the corresponding word in the vocabulary
and zero for other words. One-hot encoded vector representation of these words
are: Have = [1,0,0,0]; a=[0,1,0,0]; good=[0,0,1,0]; day=[0,0,0,1]. After creating
a one-hot encoded vector for each of these words in V, the length of our one-
hot encoded vector would be equal to the size of V (=4). Visualization of these
encodings represented by 4-dimensional space, where each word occupies one of
the dimensions and has nothing to do with the rest (no projection along the
other dimensions).
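This one-hot construction can be reproduced in a few lines of numpy, as in the illustrative sketch below.

import numpy as np

# one-hot encode the example vocabulary V = {Have, a, good, day}
vocab = ['Have', 'a', 'good', 'day']
one_hot = {w: np.eye(len(vocab), dtype=int)[i] for i, w in enumerate(vocab)}
print(one_hot['good'])  # [0 0 1 0]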
In this study, the Common Bag of Words (CBOW) method (based on a neural network) is used to obtain Word2Vec. The CBOW method takes each word's context as the input and tries to predict another word similar to the context
[20]. For the sentence “Have a great day,” if the input to the Neural Network is
the word (“great”), the model will try to predict a target word (“day”) using
a single context input word (“great”). More specifically, the one-hot encoding
of the input word is used to measure the output error compared to the one-hot
encoding of the target word (“day”).
In Fig. 3, a one-hot encoded vector of size V goes as input into the hidden layer of N neurons, and the output is again a V-length vector whose elements are softmax values. W_VN (a V×N matrix) is the weight matrix that maps the input x to the hidden layer, and W′_NV (an N×V matrix) is the weight matrix that maps the hidden layer outputs to the final output layer.

Fig. 3. A simple CBOW model with only one word in the context
The hidden layer neurons copy the weighted sum of inputs to the next layer. The
only non-linearity is the softmax calculation in the output layer. However, the CBOW model in Fig. 3 uses a single context word to predict the target; the general CBOW model takes C context words. When W_VN is used to calculate the hidden layer inputs, an average is taken over all C context word inputs.
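Since the paper lists gensim among its tools, a CBOW Word2Vec model of the kind described here might be trained as in the following sketch; the corpus and parameter values are assumptions, not the authors' settings.

from gensim.models import Word2Vec

# placeholder corpus: token lists from the data preparation module
sentences = [['Have', 'a', 'good', 'day'], ['Have', 'a', 'great', 'day']]

# sg=0 selects CBOW; vector_size (named size in older gensim) is assumed
model = Word2Vec(sentences, vector_size=100, window=2, min_count=1, sg=0)
vec = model.wv['day']  # the learned embedding for "day"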

Cosine Similarity: Cosine Similarity is used for determining the similarity


between two non-zero vectors of an inner product space by measuring the cosine
of the angle between them [4]. The cosine of zero degrees (Cos(0)) is 1, and it is
less than 1 for any angle in the interval (0,π] radians. The cosine of two non-zero
vectors can be calculated by using the Euclidean dot product formula. Given two
vectors of attributes, A and B, the cosine similarity, cos(Θ), is represented using
the dot product and magnitudes, as shown in Eqs. (1) and (2), where $A_i$ and $B_i$ are components of vectors A and B, respectively.

$$A \cdot B = \|A\|\,\|B\|\cos\Theta \qquad (1)$$

$$\mathrm{similarity}(A, B) = \cos\Theta = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\;\sqrt{\sum_{i=1}^{n} B_i^{2}}} \qquad (2)$$

Cosine similarity returns a list of scores for each skill. After that, the final score for a CV is calculated by taking the mean of all the scores. Thus, every CV goes through the model and generates a score, which is then stored in a database for ranking. Finally, recruiters can see all the CVs in sorted order.
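A minimal sketch of this scoring step is shown below, assuming the keyword and requirement vectors have already been produced by the Word2Vec model; the per-requirement best-match strategy is our assumption.

import numpy as np

def cosine(a, b):
    # Eq. (2): dot product divided by the product of the magnitudes
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def average_score(cv_vectors, requirement_vectors):
    # mean of the best cosine match for each requirement (illustrative)
    scores = [max(cosine(r, c) for c in cv_vectors) for r in requirement_vectors]
    return sum(scores) / len(scores)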

4 Implementation and Experimental Results


4.1 Experimental Setup
Using Bangla Language Processing (BLP), an automated candidate selection
system has been developed on a machine running Windows 10 with a 2.50 GHz Core i5-3210 processor and 8 GB RAM. The system has been developed in Python 3.7.3, and gensim, tensorflow, and keras are used to complete this project.

4.2 Implementation and Performance Evaluation

For implementing our study, we have collected 50 CVs (written in Bengali), and
all the experiments are done on these CVs. All the data are extracted from these
CV documents using UTF-8 encoding, and the output of the extracted data is
shown in Fig. 4.

Fig. 4. Console output of extracted data from CV

Not all of the data in Fig. 4 is needed for calculating cosine similarity. So, important keywords such as “Skills,” “CGPA,” etc. are extracted to match the company requirements and to calculate the average cosine similarity. All the important keywords extracted from a CV are shown in Fig. 5a.

Fig. 5. Snapshot of the system which indicates important keywords and company
requirements required

Fig. 6. Bar plot of all the CV’s average cosine similarity scores

Table 1. Average cosine similarity scores for all the CVs

CV No Avg. Cosine Similarity Score


1 0.54
2 0.67
3 0.65
4 0.60
5 0.58
6 0.62
7 0.61
8 0.52
9 0.56
10 0.58

For measuring cosine similarity, these keywords need to be in vector form, so all the keywords are transformed into vectors using Word2Vec embedding. The company requirements, shown in Fig. 5b, are transformed into vectors in the same way. Finally, cosine similarity is calculated using Eq. (2). Cosine similarity gives a list of scores for all the matched keywords, and by averaging these scores, an average cosine similarity score is calculated for each CV, as shown in Table 1.
A bar chart of the similarity scores for 10 of the CVs is also shown in Fig. 6, from which we can see that CV No. 2 has the highest score. Using the average Cosine Similarity Score, all the CVs are sorted; the top 5 CVs are shown in Fig. 7.

Fig. 7. Top 5 best CV among all the candidates

5 Conclusion
In this paper, we have presented an automated candidate selection system that ranks all the CVs (written in Bengali) by extracting information and calculating an average Cosine Similarity Score. Automating the complete candidate selection task may help HR agencies save the time, cost, and effort of searching for and screening the leading applicants from a vast pool of applications. There are many automated candidate ranking systems available online, but all of them assume CVs written in English. We have developed, with the help of Bangla Language Processing (BLP), an automated candidate selection system suitable for CVs written in Bengali. In the performance evaluation, we used 50 CVs of candidates from technical backgrounds to test the system and found that the system works efficiently, returning the best candidates by matching the given requirements with the candidates' qualifications. Altogether, the system performs well in filtering CV documents written in Bengali and ranking the candidates based on the CV documents' information. However, our current method cannot capture temporal features, so it will fail to find sequential similarity between sentences. Suppose the recruiter wants to prioritize a candidate's qualities by giving quality tags sequentially in a sentence; in that case, the current method will fail to rank CVs according to those priorities.
Our future work includes a system that can directly take a job description
from the recruiters and then evaluate CVs more dynamically. We are planning to
start with a bi-directional LSTM encoder with the cosine similarity computing
layer to give our model the capability of understanding both the semantic and
syntactic meaning of a sentence.

References
1. Vasant, P., Zelinka, I., Weber, G.W.: Intelligent computing & optimization.
Springer, Berlin (2018)

2. Intelligent Computing and Optimization: Proceedings of the 2nd International Conference on Intelligent Computing and Optimization 2019 (ICO 2019). Springer International Publishing (2019). ISBN 978-3-030-33585-4
3. Aggarwal, C.C., Zhai, C.: Mining text data. Springer Science & Business Media,
Berlin (2012)
4. Croft, D., Coupland, S., Shell, J., Brown, S.: A fast and efficient semantic short
text similarity metric. In: 2013 13th UK Workshop on Computational Intelligence
(UKCI), pp. 221–227. IEEE (2013)
5. De Meo, P., Quattrone, G., Terracina, G., Ursino, D.: An xml-based multiagent sys-
tem for supporting online recruitment services. IEEE Trans. Syst. Man, Cybernet-
Part A: Syst. Humans 37(4), 464–480 (2007)
6. Faliagka, E., Ramantas, K., Tsakalidis, A.K., Viennas, M., Kafeza, E., Tzimas, G.:
An integrated e-recruitment system for cv ranking based on ahp. In: WEBIST, pp.
147–150 (2011)
7. Faliagka, E., Tsakalidis, A., Tzimas, G.: An integrated e-recruitment system for
automated personality mining and applicant ranking. Internet research (2012)
8. Fazel-Zarandi, M., Fox, M.S.: Semantic matchmaking for job recruitment: an
ontology-based hybrid approach. In: Proceedings of the 8th International Semantic
Web Conference, vol. 525 (2009)
9. Gedikli, F., Bagdat, F., Ge, M., Jannach, D.: Rf-rec: Fast and accurate computation
of recommendations based on rating frequencies. In: 2011 IEEE 13th Conference
on Commerce and Enterprise Computing, pp. 50–57. IEEE (2011)
10. Getoor, L., Diehl, C.P.: Link mining: a survey. Acm Sigkdd Explorations Newsletter
7(2), 3–12 (2005)
11. Hand, D.J., Adams, N.M.: Data mining. Wiley StatsRef: Statistics Reference
Online pp. 1–7 (2014)
12. Kessler, R., Torres-Moreno, J.M., El-Bèze, M.: E-gen: automatic job offer process-
ing system for human resources. In: Mexican International Conference on Artificial
Intelligence, pp. 985–995. Springer (2007)
13. Kumari, S., Giri, P., Choudhury, S., Patil, S.: Automated resume extraction and
candidate selection system. Int. J. Res. Eng. Technol. [IJRET] 3, 206–208 (2014)
14. Menon, V.M., Rahulnath, H.: A novel approach to evaluate and rank candidates
in a recruitment process by estimating emotional intelligence through social media
data. In: 2016 International Conference on Next Generation Intelligent Systems
(ICNGIS), pp. 1–6. IEEE (2016)
15. Mining, K.D.T.D.: What is knowledge discovery. Tandem Computers Inc 253
(1996)
16. More, S., Priyanka, B., Puja, M., Kalyani, K.: Automated cv classification using
clustering technique (2019)
17. Shabnam, A., Tabassum, T., Islam, M.S.: A faster approach to sort unicode rep-
resented bengali words. Int. J. Comput. Appl. 975, 8887 (2015)
18. Suresh, R., Harshni, S.: Data mining and text mining—a survey. In: 2017 Interna-
tional Conference on Computation of Power, Energy Information and Communi-
cation (ICCPEIC), pp. 412–420. IEEE (2017)
19. Wikipedia contributors: Utf-8 — Wikipedia, the free encyclopedia. https://en.
wikipedia.org/w/index.php?title=UTF-8&oldid=974048414 Accessed 22 August
2020

20. Yasmin, F., Nur, M.I., Arefin, M.S.: Potential candidate selection using informa-
tion extraction and skyline queries. In: International Conference on Computer Net-
works, Big data and IoT, pp. 511–522. Springer (2019)
21. Zhang, Y., Jatowt, A., Tanaka, K.: Towards understanding word embeddings:
Automatically explaining similarity of terms. In: 2016 IEEE International Con-
ference on Big Data (Big Data), pp. 823–832. IEEE (2016)
AutoMove: An End-to-End Deep Learning
System for Self-driving Vehicles

Sriram Ramasamy(&) and J. Joshua Thomas

Department of Computing, UOW Malaysia, KDU Penang University College, 10400 Penang, Malaysia
sriram6897@gmail.com, joshopever@yahoo.com

Abstract. End to End learning is a deep learning approach that has been used
to solve complex problems that would usually be carried out by humans with
great effect. A deep structure was designed within this study to simulate
humans' steering patterns in highway driving situations. The architecture of the network was based on an image processing algorithm integrated with a deep-learning convolutional neural network (CNN). There are five aspects in this work, which enable the vehicle to detect the lanes, detect the speed of the
vehicle, detect the angle of the road, recognize the objects on the road and
predict the steering angle of the vehicle. A self-derived mathematical model is
used to calculate the road angles for the prediction of vehicle’s steering angles.
The model is trained on 2937 video frame samples and validated on 1259
samples with 30 epochs. The video of the local road was set as the output which
will show the difference between actual steering angles and predicted steering
angle. The experiments have been carried out in a newly built industrial park
with suitable industry 4.0 standard design of urban smart development.

Keywords: Autonomous driving · Steering angle · Deep neural network · Self-driving vehicles

1 Introduction

1.1 Background of the Work


AutoMove, an end-to-end deep learning system for self-driving vehicles, has five main features: detecting the lanes of the road, detecting the speed of the vehicle from the white dashed lines, detecting the angle of the road, detecting objects, and predicting the steering angle of the vehicle. This work involves five phases. First, a dataset is recorded using a smartphone on a Malaysian local road. Then, the dataset is analyzed and used as input to detect the lanes for the vehicle to navigate inside a lane (Thomas et al. 2019). After detecting the lanes, the white dashed lines in the middle of the road are used to calculate the speed of the vehicle. The speed is determined from the distance between the starting and ending points of the white dashed lines and the time taken to reach the end line. Next, the angle of the road is calculated. Then, the objects on the road are detected. Finally, the steering angle of the vehicle is predicted based on the curvature of the road.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 1082–1096, 2021.
https://doi.org/10.1007/978-3-030-68154-8_91

If the angle is around zero to five degrees, the steering will be kept straight; if the angle goes below zero degrees, the steering will turn left; and if the angle is more than five degrees, the steering will turn right. The rest of the article is organized as follows: Sect. 2 presents the literature review; Sect. 3 discusses the methodology and navigation of the self-driving vehicle; Sect. 4 describes the implementation of the five stages; Sect. 5 covers the integration of the image processing algorithm with the convolutional neural network (CNN) for automatic navigation; and Sect. 6 concludes the work.

2 Literature Review

2.1 Image Recognition


2.1.1 Traffic Sign Recognition
Rubén Laguna and colleagues developed a traffic sign recognition system using image processing techniques (Laguna et al. 2014). There are four steps in this application. First, an image pre-processing step is performed. Next is the detection of regions of interest (ROIs), which involves transforming the image to grayscale and implementing edge detection with the Laplacian of Gaussian (LOG) filter. Then, potential traffic signs are identified by comparing ROIs against each shape pattern (Laguna et al. 2014). The third step is the recognition stage, using a cross-correlation algorithm: each validated potential traffic sign is classified against the traffic sign database. Finally, the previous stages are managed and controlled through a graphical user interface. The results showed good accuracy and performance of the developed application when acceptable conditions of size and contrast of the input image were taken into consideration (Laguna et al. 2014).

2.2 Deep Learning Neural Network


2.2.1 Convolutional Neural Network (CNN)
In 2016, NVIDIA Corporation implemented End to End Learning for Self-Driving Cars
using a convolutional neural network (CNN). Raw pixels from a single front-facing
camera were mapped directly to steering commands (Bojarski et al. 2016, 3-6). According to the authors, this end-to-end approach proved powerful. The system learns to drive in traffic with less training data on local roads with or without lane markings. It also worked in places with unclear visual guidance, such as on unpaved roads and in parking lots. The system automatically learns an internal model of the needed processing steps, like detecting suitable road signs, with only the human steering angle as the training signal (Bojarski et al. 2016, 3-6). Images are fed into the CNN, which then computes a proposed steering command. After the training is done, the network generates steering commands from the video images. For data collection, training data were taken by driving on different types of roads in a diverse set of weather and lighting conditions. Mostly, the road data was collected in central New Jersey. Data was collected in cloudy, clear, snowy, foggy, and rainy weather, both day and night. 72 h of driving data had been collected as of March 28, 2016 (Bojarski et al. 2016, 3-6). The weights of the network were trained to reduce the mean squared error between the adjusted steering command and the steering command output by the network. There are 9 layers in the network, including 5 convolutional layers, a normalization layer, and 3 fully connected layers. The input image is separated into YUV planes and sent to the network (Bojarski et al. 2016, 3-6).
Next is the training. Selecting the frames to use is the first step in training a neural network. The collected data is labelled with weather condition, road type, and driver's activity (staying in lane, turning, and switching lanes). Data where the driver was staying in a lane was selected, and the rest was discarded. Then the video was sampled at 10 FPS (Bojarski et al. 2016, 3-6). After that comes data augmentation: after selecting the final set of frames, the data are augmented by adding artificial rotations and shifts to teach the network how to recover from a poor orientation or position. The final stage is simulation. The simulator takes pre-recorded videos from a forward-facing on-board camera and generates images approximating what would appear if the CNN were, instead, steering the vehicle. In conclusion, during training, the system learns to detect the outline of a road without explicit labels. Thomas, J. J. et al. have published multiple works in deep learning which we have used for the literature review.

2.3 Object Detection


2.3.1 Faster R-CNN
The R-CNN technique trains a CNN end-to-end to classify proposal regions into object categories or background. R-CNN mostly acts as a classifier and does not predict object bounds. The performance of the region proposal module
defines the accuracy. Pierre Sermanet and his team proposed a paper in 2013 under the
title of “OverFeat: Integrated Recognition, Localization and Detection using Convo-
lutional Networks” which shows the ways of using deep networks for predicting object
bounding boxes (Sermanet et al. 2013). In the OverFeat method, a fully connected layer
is trained to predict the coordinates of the box for the localisation task, assuming a
single object. Then, the fully connected layer is turned into a convolutional layer for
multiple class object detection (Sermanet et al. 2013). Figure 1 shows the architecture
of Faster R-CNN, which is a single, unified network for object detection.

Fig. 1. Architecture of Faster R-CNN (Sermanet et al. 2013)



Fast R-CNN implements end-to-end detector training on shared convolutional features and displays decent accuracy and speed. On the other hand, R-CNN fails to do real-time detection because of its two-step architecture (Joshua Thomas & Pillai 2019).

2.3.2 YOLO
YOLO stands for “You only look once”. According to Redmon (2016), it is an object
detection algorithm that runs quicker than R-CNN because of its simpler architecture.
Classification and bounding box regression will be done at the same time. Figure 2
shows how YOLO detection system works.

Fig. 2. YOLO Detection

A single convolutional network predicts several bounding boxes and class probabilities for those boxes simultaneously. YOLO trains on full images and directly optimizes detection performance. Thomas, J. J. and Tran, H. N. have worked on graph neural network applications, which we have referred to in this literature review.

3 Methodology

3.1 System Architecture


Figure 3 shows the overview of the AutoMove system architecture. The entire process begins after collecting the road dataset using a smartphone: a smartphone camera was mounted in front of the steering wheel of a car, and a video of approximately 1.5 km of road was taken while driving. Then, the dataset was fed into the code. The first step to produce a self-driving vehicle is to detect the lanes. Then, the speed of the vehicle is identified using the white line markers on the road. Next, objects on and around the road are detected using object detection. In addition, road angles are calculated in order to perform autonomous navigation.

4 Implementation
4.1 Lane Detection
Lane detection is implemented by colour thresholding. The first step is to convert the image to grayscale and identify the region of interest. Pixels with a higher brightness value are highlighted in this step. Figure 4 shows the code snippet to convert the image to grayscale, and Fig. 5 shows the resulting grayscale image.

Fig. 3. Overall stages in AutoMove (flow diagram: collect road dataset with straight and curved roads → lane detection, speed detection, object detection, and road angle estimation → predict the steering angle → train the actual steering angle with a neural network → navigation for the self-driving vehicle)

import cv2

# read the road image from disk
img = cv2.imread('road png.PNG')
# convert the image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Fig. 4. Code snippet to convert image to gray scale

After applying the threshold, it is necessary to remove as much noise as possible from each frame (Lane detection for self driving vehicles, 2018). A mask is applied to the region of interest to remove the unwanted noise near the lanes so that the lanes can be detected clearly. Figure 6 shows the image after removing the unwanted lines.

Fig. 5. Image that are converted to grayscale

Fig. 6. Image after removing noises

After removing all the noise, a higher-order polynomial needs to be incorporated to detect curved roads. For that, the algorithm must be aware of previous detections and a confidence value. In order to fit a higher-order polynomial, the image is sliced into many horizontal strips. On each strip, a straight-line algorithm is applied, and the pixels corresponding to lane detections are identified. Then, the pixels from all the strips are combined, and a new polynomial function is created that fits them best (Bong et al. 2019); a sketch of this idea is shown below. Finally, a blue mask is overlaid on the lane to distinguish the lanes from everything else on the road. Figures 7 and 8 show the lane detection on two different sets of roads.
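The strip-wise fitting just described might look like the following sketch, assuming the lane-pixel coordinates per strip have already been extracted; this is an illustration, not the authors' exact code.

import numpy as np

# xs_per_strip[i], ys_per_strip[i]: lane-pixel coordinates found in strip i
def fit_lane(xs_per_strip, ys_per_strip, degree=2):
    xs = np.concatenate(xs_per_strip)
    ys = np.concatenate(ys_per_strip)
    # fit x as a polynomial in y, the usual convention for lane fitting
    return np.polyfit(ys, xs, degree)

# coeffs = fit_lane(xs_per_strip, ys_per_strip)  # higher-order lane curve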

4.2 Speed Detection


Speed detection in autonomous vehicles is crucial to reducing the number of accidents on the road. According to Salvatore Trubia (2017), one of the countermeasures is the use of in-vehicle technology to help drivers maintain the speed limits and prevent the vehicle from exceeding them (Thomas et al. 2020).

Fig. 7. Lane detection on curved road Fig. 8. Lane Detection on straight road

In this work, white line markers are used to measure the speed of the vehicle. When a line marker is present, the number of frames is tracked until the next lane marker appears in the same window. The frame rate of the video is also extracted to determine the speed. Figure 9 shows the architecture used to calculate the linear speed of the vehicle.

Fig. 9. Speed calculation

Based on Fig. 9, on a straight road, the far white lines are ignored because they are too far from the dash cam in the video. Therefore, the nearest white line that can be seen fully from the dash cam is taken, and the starting and ending points of the white dashed line are used to measure the speed. From one frame to the next, the position of the white dashed line is measured (McCormick 2019). Then, when the white dashed line reaches its end point, the number of frames taken to reach that point is counted. Equation (1) gives the speed:

$$\text{Speed} = \frac{\text{distance}}{\text{time}}, \quad \text{where time is the number of frames} \qquad (1)$$
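A small sketch of Eq. (1) in code is given below; the dash length is an assumed calibration constant, while the 30 fps frame rate comes from the dataset description in Sect. 4.4.

FPS = 30.0            # frame rate of the recorded video (Sect. 4.4)
DASH_LENGTH_M = 1.39  # assumed real-world distance covered per dash (m)

def speed_kmh(frame_count, dash_length_m=DASH_LENGTH_M, fps=FPS):
    # Eq. (1): speed = distance / time, with time = frame_count / fps
    return dash_length_m / (frame_count / fps) * 3.6

print(round(speed_kmh(14), 1))  # ~10.7 km/h for a 14-frame transit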

To get the position of the white dashed lines, cv2.rectangle is used: a rectangle is drawn between two lines. Then, using matplotlib, the output is plotted to see the position of the rectangle, placed at the nearest white line. Figure 10 shows the output of the drawn rectangle.

Fig. 10. Output of the drawn rectangle

Based on the output, if the frame count is 14, then the speed of the vehicle is 10.7 km/h, calculated using Eq. (1). Figure 11 shows the output using matplotlib and cv2, and Fig. 12 shows the speed of the vehicle on a straight road.

Fig. 11. Matplotlib speed detection Fig. 12. Speed detection actual road

To measure the speed on a curve, the same method is used; however, the position of the rectangle is changed according to the position of the white lines. Figure 13 shows the speed of the vehicle on a curved road.

Fig. 13. Speed Detection output for curved road

4.3 Road Angle Detection


According to Cuenca (2019), vehicle speed and steering angle are required to develop autonomous driving. Supervised learning algorithms such as linear regression, polynomial regression, and deep learning are used to develop the predictive models. In this work, due to a lack of hardware support, the road angle is used to predict the steering angle of the vehicle. The video dataset is used to measure the real-time road angle.
The equation of a polynomial regression (Curve Fitting and Interpolation, 2019) is:

$$y = a_0 + a_1 x + a_2 x^2 + e \qquad (2)$$

Then, the Rosenbrock function (Croucher, 2019) is used to optimize the algorithms. The mathematical equation is stated in Eq. (3):

$$f(x) = \sum_{i=1}^{n-1} \left[ b\,(x_{i+1} - x_i^2)^2 + (a - x_i)^2 \right] \qquad (3)$$

In this formula, the parameters a and b represent constants, usually set to a = 1 and b = 100. Figure 14 shows the code snippet for the curve model.

def curve(x, t, y):
    return x[0] * t * t + x[1] * t + x[2]

Fig. 14. Code snippet for the quadratic curve model of Eq. (2)
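Given the (x, t, y) signature of curve, the coefficients were presumably obtained with a least-squares routine; the following sketch shows one way to do that with SciPy, using example data (the data values and the use of scipy.optimize.least_squares are our assumptions).

import numpy as np
from scipy.optimize import least_squares

def residual(x, t, y):
    # difference between the quadratic model of Eq. (2) and observations
    return x[0] * t * t + x[1] * t + x[2] - y

t = np.array([0.0, 1.0, 2.0, 3.0])   # frame index (example data)
y = np.array([0.1, 0.4, 1.1, 2.2])   # measured angles (example data)
fit = least_squares(residual, x0=np.zeros(3), args=(t, y))
a2, a1, a0 = fit.x                   # fitted coefficients of Eq. (2)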



Fig. 15. Road angle calculation Fig. 16. Curved road angle average

Based on Fig. 15, the difference angle for each frame is obtained by subtracting the bottom angle from the top angle of the rectangle placed on the white dashed lines. The average angle is then calculated using a formula called the exponential moving average (EMA) (Fig. 16):

$$EMA = \text{DifferenceAngle}(t) \times k + EMA(y) \times (1 - k)$$

where t is the difference angle, y is the average angle, and k is alpha.
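A small sketch of this smoothing step follows; the value of k (alpha) is an assumption, as the paper does not state it.

def ema_update(difference_angle, prev_ema, k=0.2):
    # EMA = DifferenceAngle(t) * k + EMA(y) * (1 - k); k is assumed
    return difference_angle * k + prev_ema * (1 - k)

ema = 0.0
for diff in [2.1, 1.8, 2.4, -0.5]:   # example per-frame difference angles
    ema = ema_update(diff, ema)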


Figure 17 and 18 shows the top and bottom angle of straight road which are used to
determine the difference angle.

Fig. 17. Top angle of straight road Fig. 18. Bottom angle of straight road

Figures 19 and 20 show the top and bottom angles of a curved road, which are used to determine the difference angle (Fig. 21).
In this work, the YOLO algorithm is used for object detection, as it is faster than other classification algorithms. In real time, the YOLO algorithm processes 45 frames per second (Redmon and Farhadi 2016). Furthermore, even though it makes localization errors, fewer false positives are predicted in the background. To make use of the YOLO algorithm, an image is divided into a grid of 3 × 3 cells.

Fig. 19. Top angle of curved road Fig. 20. Bottom angle of curved road

Fig. 21. Bounding boxes

4.4 Dataset
To develop this work, a new dataset was created. A OnePlus 6 smartphone was placed on a car's dashboard, and a video of the driving was taken at Jalan Barat Cassia 2, Batu Kawan, Penang, which consists of straight and curved roads. The video resolution is 1920 × 1080 and the frame rate is 30 frames per second.

4.5 Autonomous Navigation


In this section, we discuss the methods used to navigate the autonomous vehicle using the trained model. First, the trained model (Autopilot.h5) is loaded as a Keras model. According to the official TensorFlow documentation (tf.keras.Model | TensorFlow Core r2.0 2019), a model groups layers into an object with training and inference features. The model then predicts on the processed image, which matches the form of the images fed into the training model. After that, a steering-wheel image is read from the local disk to visualize the steering angle difference in the output window. Then, the video of the local road was set as the output on which to predict the steering angle. Figure 22 shows the overall end-to-end learning.
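A sketch of this inference loop is shown below, assuming the preprocessing matches what the model was trained on; the frame size, scaling, and video file name here are illustrative assumptions.

import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model('Autopilot.h5')       # the trained model named above

cap = cv2.VideoCapture('local_road.mp4') # hypothetical video file name
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # illustrative preprocessing: grayscale, resize, scale to [0, 1]
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    inp = cv2.resize(gray, (100, 100)).astype('float32') / 255.0
    angle = float(model.predict(inp.reshape(1, 100, 100, 1))[0][0])
cap.release()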

Fig. 22. Autonomous navigation of the novel end-to-end learning

To determine the accuracy of the steering angle, the difference between the actual and predicted steering angles was measured. At the last frame of the video, the accuracy for the whole model is 91.3784%. Table 1 shows the actual and predicted angles for a straight road.

Fig. 23. Straight road: AutoMove Fig. 24. Curved road: AutoMove

Figures 23 and 24 show the output of autonomous navigation on a straight road and a curved road. Based on the output, the steering of the vehicle stays quite straight, as the steering wheel does not turn on a straight road. This is because the steering angle is between zero and five degrees, which was set to drive straight. However, when the road is curved, the accuracy drops to 85%, as can be seen in Table 2. Furthermore, the steering wheel is set according to a few parameters: if the steering angle is less than zero degrees, the steering wheel turns left, whereas if the steering angle is more than five degrees, the steering wheel turns right (a small code sketch of this rule follows Table 2). Table 2 shows the actual and predicted angles for a curved road.

Table 1. Actual and predicted angle for straight road


Actual steering angle (°) Predicted steering angle (°) Accuracy (%)
0.4912 3.1006 91.3743
0.5635 2.7528 91.3754
0.3902 −0.2838 91.3771
−0.3001 −1.2578 91.3784

Table 2. Actual and predicted steering angle for curved road


Actual steering angle (°) Predicted steering angle (°) Accuracy (%)
−10.8001 −9.5169 85.8128
−10.8005 −9.8750 85.8192
−10.3392 −7.8328 85.8265
−10.7338 −10.0125 85.8334
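The threshold rule referenced above can be written in a few lines; a sketch (the thresholds come from the text, the function name is ours):

def steering_direction(angle_deg):
    # map a predicted steering angle to a wheel action (Sect. 4.5)
    if angle_deg < 0:
        return 'left'
    if angle_deg > 5:
        return 'right'
    return 'straight'   # angles between zero and five degrees

print(steering_direction(-10.8))  # 'left', as in the curved-road rows above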

5 Conclusion

This work was developed mainly to reduce accidents in Malaysia. According to Othman O. Khalifa (2008), the United Nations has ranked Malaysia 30th for the highest number of fatal accidents, with an average of 4.5 casualties per 10,000 vehicles. Furthermore, this system was also developed to help elderly people
who are too old to drive safely. This is because as driver age increases, changes in
flexibility, visual acuity, reaction time, memory and strength will affect the ability to
drive (Lutin et al. 2013).

Acknowledgement. The authors would like to thank the UOW Malaysia KDU Penang
University College on selecting this work for the launch of the new Batu Kawan campus,
mainland Penang, Malaysia. There is an urgent need for low-cost but high throughput novel road
analysis operations for self-driving vehicle in Batu Kawan UOW Malaysia Campus, hence this
simulation might provide to be one of the ways forward.

References
Thomas, J.J., Karagoz, P., Ahamed, B.B., Vasant, P.: Deep learning techniques and optimization
strategies in big data analytics. IGI Global 10, 978 (2020). https://doi.org/10.4018/978-1-
7998-1192-3
Thomas, J.J., Tran, H.N.T., Lechuga, G.P., Belaton, B.: Convolutional graph neural networks: a
review and applications of graph autoencoder in chemoinformatics. In: Thomas, J.J., Karagoz,
P., Ahamed, B. B., Vasant, P., (Ed.), Deep Learning Techniques and Optimization Strategies
in Big Data Analytics (pp. 107–123). IGI Global (2020) https://doi.org/10.4018/978-1-7998-
1192-3.ch007
Assidiq, A., Khalifa, O., Islam, M., Khan, S.: Real time lane detection for autonomous vehicles.
In: 2008 International Conference on Computer and Communication Engineering (2008)

Bojarski, M., Del Testa, D., Dworakowski, D.: End to End Learning for Self-Driving Cars
(2016). https://arxiv.org/abs/1604.07316 Accessed 10 March 2019
Croucher, M.: Minimizing the Rosenbrock Function (2011). https://demonstrations.wolfram.
com/MinimizingTheRosenbrockFunction/ Accessed 11 October 2019
Curve Fitting and Interpolation [ebook] (2019). http://www.engineering.uco.edu/~aaitmoussa/Courses/ENGR3703/Chapter5/ch5.pdf Accessed 6 October 2019
García Cuenca, L., Sanchez-Soriano, J., Puertas, E., Fernandez Andrés, J., Aliane, N.: Machine
learning techniques for undertaking roundabouts in autonomous driving. Sensors 19(10),
2386 (2019)
Geethapriya, S., Duraimurugan, N., Chokkalingam, S.: Real-time object detection with YOLO. Int. J. Eng. Adv. Technol. (IJEAT) 8(3S), 578–581 (2019). https://www.ijeat.org/wp-content/uploads/papers/v8i3S/C11240283S19.pdf Accessed 13 Nov. 2019
Huang, R., Pedoeem, J., Chen, C.: YOLO-LITE: A Real-Time Object Detection Algorithm
Optimized for Non-GPU Computers (2018). https://arxiv.org/pdf/1811.05588.pdf Accessed
13 Nov. 2019
Jalled, F.: Face Recognition Machine Vision System Using Eigenfaces (2017) https://arxiv.org/
abs/1705.02782 Accessed 3 March 2019
Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural
networks. Commun. ACM 60(6), 84–90 (2017)
Laguna, R., Barrientos, R., Blázquez, L., Miguel, L.: Traffic sign recognition application based
on image processing techniques. IFAC Proc. Vol. 47(3), 104–109 (2014)
Lane detection for self driving vehicles (2018). https://mc.ai/lane-detection-for-self-driving-
vehicles/ Accessed 8 Oct. 2019
Lutin, J., Kornhauser, L.A., Lerner-Lam, E.: The Revolutionary Development of Self-Driving
Vehicles and Implications for the Transportation Engineering Profession. Ite Journal (2013).
https://www.researchgate.net/publication/292622907_The_Revolutionary_Development_of_
SelfDriving_Vehicles_and_Implications_for_the_Transportation_Engineering_Profession
Accessed 23 Nov. 2019
Majaski, C.: Comparing Simple Moving Average and Exponential Moving Average (2019).
https://www.investopedia.com/ask/answers/difference-between-simple-exponential-moving-
average/ Accessed 9 Nov. 2019
McCormick, C.: CarND Advanced Lane Lines (2017). https://github.com/colinmccormick/
CarND-Advanced-Lane-Lines Accessed 12 Oct. 2019
Pant, A.: Introduction to Linear Regression and Polynomial Regression (2019). https://
towardsdatascience.com/introduction-to-linear-regression-and-polynomial-regression-
f8adc96f31cb Accessed 1 Nov. 2019
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You Only Look Once: Unified, Real-Time
Object Detection pp. 1–4 (2016)
Redmon, J., Farhadi, A.: YOLO9000: Better, Faster, Stronger (2016). https://pjreddie.com/media/
files/papers/YOLO9000.pdf Accessed 11 Nov. 2019
Sermanet, P., Eigen, D., Zhang, X., Mathieu, M.: OverFeat: Integrated Recognition, Localization
and Detection using Convolutional Networks (2013). https://arxiv.org/abs/1312.6229
Accessed 7 Nov. 2019
tf.keras.Model| TensorFlow Core r2.0 (2019) https://www.tensorflow.org/api_docs/python/tf/
keras/Model Accessed 23 Nov. 2019
Trubia, S., Canale, A., Giuffrè, T., Severino, A.: Automated Vehicle: a Review of Road Safety
Implications as Driver of Change (2017)
Vasant, P., Zelinka, I., Weber, G.: Intelligent Computing & Optimization. Springer (2018)
Vasant, P., Zelinka, I., Weber, G.: Intelligent Computing and Optimization. Springer
International Publishing (2019)

Venketas, W.: Exponential Moving Average (EMA) Defined and Explained (2019). https://www.
dailyfx.com/forex/education/trading_tips/daily_trading_lesson/2019/07/29/exponential-
moving-average.html Accessed 12 Nov. 2019
Vitelli, M., Nayebi, A.: CARMA: A Deep Reinforcement Learning Approach to Autonomous
Driving (2016). https://www.semanticscholar.org/paper/CARMA-%3A-A-DeepReinforce
ment-Learning-Approach-to-Vitelli-Nayebi/b694e83a07535a21c1ee0920d47950b4800b08bc
Accessed 16 Nov. 2019
Wagh, P., Thakare, R., Chaudhari, J., Patil, S.: Attendance system based on face recognition
using eigen face and PCA algorithms. In: 2015 International Conference on Green
Computing and Internet of Things (ICGCIoT) (2015)
Joshua Thomas, J., Pillai, N.: A deep learning framework on generation of image descriptions
with bidirectional recurrent neural networks. In: Vasant, P., Zelinka, I., Weber, G.W.
(eds) Intelligent Computing & Optimization. ICO 2018. Advances in Intelligent Systems and
Computing, vol. 866. Springer, Cham (2019) https://doi.org/10.1007/978-3-030-00979-3_22
Joshua Thomas, J., Belaton, B., Khader, A.T.: Visual analytics solution for scheduling
processing phases. In: Vasant, P., Zelinka, I., Weber, G.W., (eds) Intelligent Computing &
Optimization. ICO 2018. Advances in Intelligent Systems and Computing, vol. 866. Springer,
Cham (2019) https://doi.org/10.1007/978-3-030-00979-3_42
Thomas, J. J., Ali, A. M.: Dispositional learning analytics structure integrated with recurrent
neural networks in predicting students performance. In: International Conference on
Intelligent Computing & Optimization. pp. 446–456. Springer, Cham (2019)
Bong, C.W., Xian, P.Y., Thomas, J.: Face recognition and detection using haars features with
template matching algorithm. In: International Conference on Intelligent Computing &
Optimization. pp. 457–468. Springer, Cham (2019)
Thomas, J.J., Fiore, U., Lechuga, G. P., Kharchenko, V., Vasant, P.: Handbook of Research on
Smart Technology Models for Business and Industry. IGI Global (2020) https://doi.org/10.
4018/978-1-7998-3645-2
Thomas, J.J., Wei, L.T., Jinila, Y.B., Subhashini, R.: Smart computerized essay scoring using
deep neural networks for universities and institutions. In: Thomas, J.J., Fiore, U., Lechuga, G.
P., Kharchenko, V., Vasant, P. (Ed.), Handbook of Research on Smart Technology Models
for Business and Industry. pp. 125–152. IGI Global (2020) https://doi.org/10.4018/978-1-
7998-3645-2.ch006
An Efficient Machine Learning-Based
Decision-Level Fusion Model to Predict
Cardiovascular Disease

Hafsa Binte Kibria(B) and Abdul Matin

Department of Electrical and Computer Engineering, Rajshahi University of Engineering and Technology, Rajshahi 6204, Bangladesh
hafsabintekibria@gmail.com, ammuaj.cseruet@gmail.com

Abstract. The world's primary cause of mortality at present is cardiovascular disease. Identifying the risk early could reduce the rate of death.
Sometimes, it is difficult for a person to undergo an expensive test reg-
ularly. So, there should be a system that can predict the presence of
cardiovascular disease by analyzing the basic symptoms. Researchers
have focused on building machine learning-based prediction systems to
make the process simpler and more efficient and reduce both doctors' and
patients’ burdens. In this paper, a decision level fusion model is designed
to predict cardiovascular disease with the help of machine learning algo-
rithms that are multilayer neural network and the K Nearest Neighbor
(KNN). The decisions of the two models were merged into a final decision to improve the accuracy. Here, the Cleveland dataset was used for ANN and KNN, which contains the information of 303 patients with eight
attributes. In this two-class classification, ANN gave 92.10% accuracy,
and KNN gave 88.16%. After fusing their decisions, we got an accuracy of 93.42%, which performed much better than either of them. The
result was obtained by using 75% data in training.

Keywords: Cardiovascular disease · Machine learning · Artificial neural network · KNN · Decision level fusion

1 Introduction
Coronary heart failure is one of the primary causes of mortality globally. It is commonly regarded as a disease of middle and old age. Worldwide, coronary artery disease (CAD) in particular has the highest rate of mortality. In practice, the disease is treated by physicians, but there are very few cardiac experts in contrast to the number of patients. Moreover, the traditional method of diagnosing the disease takes time and is expensive. Furthermore, at the initial stage the symptoms are mild, so people usually ignore the disease until it gets serious. False diagnoses and expensive tests are the main reasons people cannot depend entirely on doctors. Money also plays a crucial role in this issue. Researchers are
trying their best to develop a more effective intelligent system for the diagnosis
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Vasant et al. (Eds.): ICO 2020, AISC 1324, pp. 1097–1110, 2021.
https://doi.org/10.1007/978-3-030-68154-8_92

of heart disease, and also a number of smart models have been developed over
time. The main reason behind our motivation for designing a combined model
is that it would improve the health care sector and reduce patients’ struggle. It
will also save a lot of time for both physicians and the patients, and again, the
patients will save from having an extra expense.
A medical diagnosis system based on machine learning to predict cardiovascular disease gives more accurate results than traditional methods and reduces the treatment cost [5]. Medical diagnosis is a test where a physician tries to
identify the disease by experimenting with the symptoms and the lab values
obtained from the test. Various algorithms are used in that task [7].
In this paper, an automated medical diagnosis system is used to classify
coronary disease based on a machine-learning algorithm. The system is based
on a backpropagation algorithm that uses an Artificial Neural Network learning
method. Also, the KNN algorithm is used for disease classification, and the per-
formance of the two algorithms is observed. Finally, the two algorithms' decisions are fused by summation to make a single decision, which gave a better result than either individual model [8]. So, the problem we are solving in this paper is to
detect patients with or without heart disease by analyzing the medical data. If
the patients do not have heart disease, then there is nothing to worry about, but
if the result is positive (the patient has heart disease), he can undergo further
treatment. As machine learning can process huge datasets, it can give results at
very low cost and within a short time, improving the quality of health care and
saving the physicians from a lot of pressure.
This paper is presented as follows. Cardiovascular disease and the importance
of early identification of disease are introduced in the first section. The second
section discusses previous studies of medical diagnosis based on machine learning.
The third section describes the method that is used to develop the model. In
section four, the paper analyzes the experimental findings of our proposed decision-level fusion model. Then, in section five, our fusion model's performance is compared with related work done so far. Finally, in section six, we conclude with possible future developments of the work.

2 Related Study

Much research has been done on heart disease classification, so researchers are now trying to extend the current models with something new. Modified and fusion algorithms have been introduced recently, which give better performance than the others. Different techniques have also been tried in preprocessing to make the data more suitable for the algorithms. Here we discuss some works related to the classification of heart disease along with their potential improvements.
In [7], cardiovascular disease was classified using mean values: in the preprocessing step, missing values were replaced by mean values. They used the SVM classifier and the Naive Bayes algorithm to compare performance with and without using mean values; both algorithms improved their accuracy by using them in preprocessing. With an SVM linear kernel, they got an accuracy of 86.8% using 80% of the data in training. They experimented with different train-test ratios in their work, but the accuracy was not satisfactory; 86.66% accuracy was achieved using 75% of the data in training. The best result among the algorithms was from the SVM linear kernel. The main weakness of this work is its poor accuracy.
In [5], researchers proposed a system designed with a multilayer perceptron neural network, with backpropagation used for training. They calculated different accuracies by adjusting the size of the hidden layer. With eight nodes in the hidden layer, a maximum accuracy of 95% was reached with PCA. Other performance parameters were also observed with respect to different sizes of the hidden layer.
Another study was done in [12], where two supervised data mining algorithms were applied to classify a patient's heart disease using two classification models, a Naïve Bayes classifier and a decision tree classifier. The decision tree classifier predicted better, with an accuracy of 91%, while Naïve Bayes showed 87% accuracy. In [1], researchers analyzed machine learning methods for different types of disease prediction. Logistic regression (LR), XGBoost (XGB), random forest (RF), and LSTM, a special kind of recurrent neural network, were used for prediction. Here, XGBoost performed better than LSTM.
In a recent study [13], researchers used four different algorithms for heart disease prediction. The highest accuracy was achieved with KNN, which was 90.789% using a train-test split. Naïve Bayes and random forest gave 88.15% and 86.84% accuracy, respectively, using the Cleveland dataset. No complex or combinational model for higher accuracy was introduced in this study, which is a drawback.
There is scope for improving the accuracy of these works by implementing a fuzzy method or by combining separate algorithms into a new one. We have therefore preferred to use a fusion model to classify the disease, which can provide greater accuracy.

3 Proposed Solution
This research aims to design a decision-level fusion model that can improve accuracy and identify heart disease efficiently. This fusion model classifies patients
with and without heart disease by merging two machine learning-based algo-
rithms. The proposed solution is displayed in Fig. 1:

Fig. 1. The proposed architecture



In this proposed architecture, there are several steps. First of all, the data were preprocessed. Using the chi-square test, only the critical features that contribute most to the prediction were selected. For the training with KNN, the top 10 important features were chosen; for the artificial neural network, all 13 features were taken. After training, we fused the individual models' decisions by summation to get a more accurate outcome (a minimal sketch of this fusion step follows). The phases are described in detail below.
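A minimal sketch of the decision-level fusion by summation, assuming each model outputs a probability of heart disease; the 1.0 threshold on the sum (equivalent to an average of 0.5) is an assumption.

import numpy as np

def fuse_predictions(p_ann, p_knn, threshold=1.0):
    # sum the two models' positive-class probabilities and threshold
    fused = np.asarray(p_ann) + np.asarray(p_knn)
    return (fused >= threshold).astype(int)

print(fuse_predictions([0.8, 0.3], [0.6, 0.2]))  # [1 0]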

3.1 UCI Heart Diseases Dataset

The UCI heart disease dataset was used for heart disease classification. The cardiovascular disease dataset was taken from the UCI machine learning repository [2]. It has 303 instances with 13 features and some missing values. This repository holds a huge and varied range of datasets that are of interest to various sectors, and the machine learning community uses these data to contribute to developments in different domains. The repository was developed by David Aha [2]. The dataset is labeled into two classes: the target value contains two categories, 0 and 1, where 0 means no heart disease and 1 represents having heart disease, making this a binary classification. The explanation of the dataset is shown in Table 1 [2].

Table 1. Descriptions of features

Features Description
Age Lifetime in years (29–77)
Gendr Instance of gender (0 = Female, 1 = Male)
ChstPainTyp The type of chest pain (1 = typical angina,
2 = atypical angina, 3 = non-anginal pain,
4: asymptomatic)
RestBlodPresure Resting blood pressure in mm Hg [94, 200]
SermCholstrl Serum cholesterol in mg/dl [126–564]
FstingBlodSugr Fasting blood sugar >120 mg/dl (0 = False, 1 = True)
ResElctrcardigrphic Results of Resting ECG (0: normal, 1: ST-T wave
abnormality, 2: LV hypertrophy)
MaxHartRte Maximum heart rate achieved [71, 202]
ExrcseIndcd Exercise induced angina (0: No, 1: Yes)
Oldpek ST depression induced by exercise relativeto rest
[0.0, 62.0]
Slp Slope of the peak exercise ST segment
(1 = up-sloping, 2 = flat, 3 = down-sloping)
MajrVesels Number of major vessels coloured by fluoroscopy
(values 0–3)
Thl Defect types: value 3 = normal, 6 = fixed defect,
7 = irreversible defect

Fig. 2. Data distributions of attributes in terms of target class. Panels: (a) target class, (b) gender, (c) chest pain type, (d) slope, (e) fasting blood sugar, (f) major vessel number, (g) exercise-induced angina, (h) thalach, (i) age, (j) resting blood pressure, (k) cholesterol, (l) maximum heart rate, (m) ST by exercise, (n) ECG.

Figure 2 shows the data distributions of all 13 attributes in terms of the
target class, along with the distribution of the target class itself.

3.2 Data Pre-processing

Pre-processing represents an important step in the classification of data [6].
There are some missing values in the dataset; for example, the major vessels
and thalassemia attributes are missing in some patient records. Such missing
values are replaced with the most common value of the attribute. Of the 14
attributes, eight are symbolic and six are numeric. The categorical values are
converted into numeric data using a label encoder.
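As a rough illustration of this step (our sketch, not the authors' code), the
following uses pandas and scikit-learn; the file name and the '?' marker for
missing values follow the UCI repository's conventions, and the remaining
choices are assumptions.

    import pandas as pd
    from sklearn.preprocessing import LabelEncoder

    # The UCI Cleveland file encodes missing values as '?' (e.g., in 'ca' and 'thal')
    df = pd.read_csv('processed.cleveland.data', header=None, na_values='?')

    # Replace each missing value with the most common value of its column
    for col in df.columns:
        df[col] = df[col].fillna(df[col].mode()[0])

    # Convert any symbolic (categorical) columns into numeric codes
    for col in df.select_dtypes(include='object').columns:
        df[col] = LabelEncoder().fit_transform(df[col])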

3.3 Feature Scaling

Feature scaling was applied to limit the range of variables. We used min-max
normalization for feature scaling:

Q'_v = \frac{Q_v - \min(Q_v)}{\max(Q_v) - \min(Q_v)}    (1)

where Q_v is an original value and Q'_v is the normalized value. The dataset
was then split into two parts, training and testing: 75% of the data were used
for training and 25% for testing. A further 10% of the training data was taken
as a validation set, which helps prevent over-fitting to the training data.
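A minimal sketch of this scaling and splitting, assuming scikit-learn and the
df from the previous snippet (the last column is taken as the target, as in the
UCI file); the 10% validation split is applied later during ANN training.

    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import MinMaxScaler

    X = df.iloc[:, :13].values   # the 13 features
    y = df.iloc[:, 13].values    # target (0 = no disease, 1 = disease)

    # 75% training, 25% testing
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=42)

    # Min-max normalization, Eq. (1), fitted on the training data only
    scaler = MinMaxScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)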

3.4 Feature Selection

There are 13 features in our dataset, and only ten of them were used for
training the KNN. The most significant features were selected using the
chi-square test, which selects the features that have the strongest connection
with the output. Sex, fasting blood sugar, and resting ECG were excluded with
the help of the chi-square test, so ten features out of 13 were used in KNN.
Figure 3 shows the ten most essential features ranked by their score.

Fig. 3. The ten most important features ranked by chi-square score
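The chi-square selection can be reproduced with scikit-learn's SelectKBest, as
in the following sketch (our illustration, not the authors' code); chi2
requires non-negative inputs, which the min-max-scaled features satisfy.

    from sklearn.feature_selection import SelectKBest, chi2

    selector = SelectKBest(score_func=chi2, k=10)
    X_train10 = selector.fit_transform(X_train, y_train)  # top 10 features for KNN
    X_test10 = selector.transform(X_test)

    print(selector.scores_)  # per-feature chi-square scores, as visualized in Fig. 3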

3.5 Training Phase


One portion of the data was used as the training set, which was trained with
both ANN and KNN. For KNN, the most dependent features were selected from the
13 attributes using the chi-square test.

Artificial Neural Network. The human brain, with its webs of interconnected
neurons and remarkable processing ability, inspired scientists to build the
Artificial Neural Network (ANN). The basic processing unit of an ANN is the
perceptron. The input, hidden, and output layers are the three main layers:
the input is fed to the input layer, and the result is obtained at the output
layer. Backpropagation is used to find the error and adjust the weights between
the layers; after each backpropagation pass, the forward pass begins again, and
the process continues until the error is minimized [15]. The input layer has
thirteen nodes, the hidden layer has eight neurons, and the output layer has
one neuron, which produces the output value. Figure 4 displays the architecture
of our artificial neural network. We used stochastic gradient descent (SGD) as
the optimizer.
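A sketch of the 13-8-1 network in Keras, matching the layer sizes, SGD
optimizer, 50 epochs, and 10% validation split stated in this paper; the
activation functions and loss are our assumptions, since the paper does not
specify them.

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    model = Sequential([
        Dense(8, activation='relu', input_shape=(13,)),   # hidden layer: 8 neurons
        Dense(1, activation='sigmoid'),                   # output: decision probability
    ])
    model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
    model.fit(X_train, y_train, validation_split=0.1, epochs=50)

    proba_ann = model.predict(X_test).ravel()  # decision scores in [0, 1]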

K-Nearest Neighbor. K-Nearest Neighbor (KNN) is a supervised machine learning
algorithm that finds similarities among nearby instances; Euclidean distance is
usually used to compute similarity. KNN starts by deciding how many neighbors k
to compare with; k is usually set to an odd number for the best outcome. After
determining k, the distance of the object to every available object in the
dataset is calculated. The k objects with the smallest distances are selected,
and the classification result is the category that appears most among them
[14]. Among KNN's benefits are that it is simple and easy to use. Proper
optimization is needed to get the best result from any algorithm [16].
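A corresponding sketch for the KNN side, assuming scikit-learn; k = 12 and the
ten chi-square-selected features follow Sect. 4, and the Euclidean metric is
scikit-learn's default.

    from sklearn.neighbors import KNeighborsClassifier

    knn = KNeighborsClassifier(n_neighbors=12)  # k chosen from Fig. 6
    knn.fit(X_train10, y_train)

    # probability of the positive (heart disease) class, used as the decision score
    proba_knn = knn.predict_proba(X_test10)[:, 1]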

Fig. 4. Structure of proposed artificial neural network

3.6 Decision Level Fusion


After preprocessing and dividing the data into test and training sets, the
dataset was trained with ANN and KNN, and the output for the test data was
predicted using the trained models. Each trained model provides a decision
probability for every test sample, and based on the decision score, the final
output for the test data is predicted. The decision probability ranges between
0 and 1; if the value exceeds 0.5, the model predicts that the patient with
that particular data has heart disease. In decision-level merging, we simply
added the decision scores obtained from the two algorithms, so the decision
scores from the two different algorithms form one decision for every test
sample. Thus, we get a single decision from two classifiers. The equation for
decision fusion is as follows:

D_f = \frac{1}{n} \sum_{s=1}^{n} D_s    (2)

where n is the number of algorithms used in the fusion; since only ANN and KNN
were used, n = 2 in our approach. D_f represents the final decision of the
fusion model, and D_s is the decision probability given by an individual
algorithm, ranging between 0 and 1. The steps of our fusion are listed in
Algorithm 1.
The fusion occurs only at the decision stage; that is why the model is defined
as decision-level fusion. As the decision scores from both models have the same
range, no extra scaling is needed before fusion. Suppose one of the individual
models produces a false negative (a decision probability of 0.45, say) for a
test sample that actually belongs to a patient with heart disease, while the
other algorithm gives the correct result for that sample (0.7, a true
positive). After fusion, the score (0.45 + 0.7)/2 = 0.575 is obtained for the
fusion model's prediction. As the score is greater than 0.5, the result is a
true positive for that specific

Algorithm 1: Algorithm for decision-level fusion

Input: Float values D_s and one int n. D_s is an individual algorithm's
decision score; n is the number of individual algorithms in the fusion
Output: D_f, the new decision score of the fusion model
1  D_sum = 0
2  for s ← 1 to n do
3      D_sum = D_s + D_sum
4  end
5  D_f = D_sum / n

test sample; thus, the fusion model's accuracy increases. Usually, both
individual algorithms give the correct decision for the same test sample; they
disagree on only one or two test samples, and it is in these cases that fusion
makes the model more reliable and efficient.
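Given the two decision-score vectors produced above, the fusion of Eq. (2) and
Algorithm 1 reduces to an element-wise average followed by thresholding at 0.5;
a minimal sketch:

    import numpy as np

    # Eq. (2) with n = 2: average the decision scores of ANN and KNN
    fused = (np.asarray(proba_ann) + np.asarray(proba_knn)) / 2

    # threshold at 0.5, as described above
    y_pred = (fused > 0.5).astype(int)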

3.7 Evaluation Metrics


Various performance parameters were measured along with accuracy to assess the
performance of our model; the output values of these parameters give a clear
picture of the model's behavior. Accuracy was calculated to observe the system
performance:

Accuracy = \frac{TR_{pos} + TR_{neg}}{TR_{pos} + TR_{neg} + FA_{pos} + FA_{neg}} \times 100    (3)
The following terms are used in the evaluation equations; the abbreviations are
introduced here.
– True positive (TR_pos): the output is positive, and the actual value is positive.
– True negative (TR_neg): the output is negative, and the actual value is negative.
– False positive (FA_pos): the output is positive, but the actual value is negative.
– False negative (FA_neg): the output is negative, but the actual value is positive.
Precision means: of the values our model predicts as positive, how many are
actually positive.

Precision = \frac{TR_{pos}}{TR_{pos} + FA_{pos}}    (4)

Recall indicates how much of the actually positive data is identified; both
parameters are significant for a model's evaluation.

Recall = \frac{TR_{pos}}{TR_{pos} + FA_{neg}}    (5)

The F1-score combines recall and precision in a single measure.

F1\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall}    (6)
We also computed the confusion matrix (Table 2) to see the exact numbers of
correctly and incorrectly classified positive and negative samples.

Table 2. Confusion matrix

                      Really positive           Really negative
Positive prediction   True positive (TR_pos)    False positive (FA_pos)
Negative prediction   False negative (FA_neg)   True negative (TR_neg)
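All of the metrics in Eqs. (3)-(6), the confusion matrix of Table 2, and the
ROC-AUC score reported later are available in scikit-learn; a sketch using the
names of the previous snippets:

    from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                                 f1_score, confusion_matrix, roc_auc_score)

    print(accuracy_score(y_test, y_pred) * 100)   # Eq. (3)
    print(precision_score(y_test, y_pred))        # Eq. (4)
    print(recall_score(y_test, y_pred))           # Eq. (5)
    print(f1_score(y_test, y_pred))               # Eq. (6)
    print(confusion_matrix(y_test, y_pred))       # counts as in Table 2
    print(roc_auc_score(y_test, fused))           # uses the raw fused scores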

4 Performance Evaluation
A decision-level fusion model was constructed using an artificial neural
network and a k-nearest neighbor algorithm. The two algorithms were applied
individually on the same dataset, and then the decision scores of the two
algorithms were combined to improve the accuracy; a better result was obtained
after fusion.

Fig. 5. Train and validation (a) accuracy and (b) loss of ANN in terms of training epochs

Fig. 6. Variation of the accuracy of train and test regarding the KNN neighbor number

Fig. 7. (a) Confusion matrix of the fusion model; (b) ROC curve of ANN, KNN, and the fusion model

Figure 5 displays the training and validation accuracy and loss with respect to
the epoch; fifty epochs were used. In the graph, the validation set is labeled
as "test". The relationship between test and training accuracy as the number of
KNN neighbors varies is shown in Fig. 6. From this graph, we selected the
neighbor number k = 12 and achieved an accuracy of 88.16% on the test data.
Table 3 shows the various performance parameters, which increase when the two
algorithms' decisions are combined; thus, decision-score-level fusion became
more effective for predicting disease. The fusion model's accuracy is reported
only on the test data, as the fusion model itself was not trained.

Table 3. Experimental results

Approach   TP  FP  FN  TN  Accuracy (train)  Accuracy (test)  Precision  Recall  F1-score  ROC-AUC score
ANN        40   2   4  30  82.35             92.10            92         92      92        91.73
KNN        41   1   8  26  81.15             88.16            89         88      88        87.04
ANN+KNN    41   5   1  29  –                 93.4             94         93      93        93.94

In Table 3, the accuracy on test and training data is reported; for the fusion
model, only the test accuracy is shown, as decision-level fusion is applied
after the two algorithms are trained. The fusion model's accuracy rose by about
1% compared with ANN and 5% compared with KNN, and all the other parameters
also improved. The confusion matrix is generated from the test dataset.
From the values in Table 3, it can be said that ANN performed well compared to
KNN, and the performance increased after merging; the decision-level fusion
model performed much better than the two individual algorithms. Each model is
used independently for prediction, and then the decision scores from the models
are combined to improve accuracy.

Fig. 8. Comparison of the performance parameters of ANN, KNN and the fusion model

Figure 7(a) shows the confusion matrix of our fusion model; the
misclassification rate is 6.57%. Figure 7(b) displays the ROC curve. The
comparison of the performance of the three approaches (ANN, KNN, and ANN+KNN)
is displayed in Fig. 8 to show the improvement of the fusion model.

5 Performance Comparison with Previous Work


In our work, we used the Cleveland dataset to classify heart disease. Previous
works that used the same dataset are reviewed in this part; to illustrate the
progress of our model, we compared our model with them. Table 4 lists the
previous works, together with the methods the researchers used, the year, and
the accuracy.
Researchers have used different algorithms in these works to diagnose heart
disease. Some of the works in Table 4 measured their accuracy with an 80:20
train-test split, so we also calculated the accuracy of our fusion model with
80% of the data in training to compare with these works. It gave an outstanding
accuracy of 96.72%, higher than any other model, and the fusion model also did
well for the other split ratios. In comparison, the efficiency of our
decision-level fusion model is much higher than theirs. We obtained good
accuracy with ANN and KNN individually, but the accuracy improved most after
the fusion of the two algorithms. The comparison with the existing works in
Table 4 indicates the predominance of the fusion model.

Table 4. Previous work with same dataset

Author                      Year  Approach                  Train-test ratio  Accuracy          Fusion model's accuracy
D. Shah et al. [13]         2020  KNN, Naïve Bayes          Not mentioned     90.78%, 88.157%   –
P. Ramya et al. [11]        2020  SVM, Logistic regression  80:20             85%, 87%          96.72%
H. Karthikeyan et al. [3]   2020  SVM, Random forest        80:20             82%, 89.9%        96.72%
N. Louridi et al. [7]       2019  SVM linear-kernel         80:20             86.8%             96.72%
N.K. Jalil et al. [10]      2019  KNN                       80:20             91.00%            96.72%
M. Senthilkumar et al. [9]  2019  HRFLM (proposed)          83:17             88.40%            97%
K. Tülay et al. [5]         2017  ANN with PCA              85:15             95%               97%
T.T. Hasan et al. [4]       2017  MLP                       70:30             87%               91.20%

6 Conclusion

This work aims to develop a decision-score-level fusion model that can provide
a better result than a single individual algorithm. The model creates a final
decision using the decision scores from two other models. The artificial neural
network and k-nearest neighbor have both given good individual results for the
prediction of cardiovascular disease, and a significant improvement is noticed
after merging the decision scores of the two algorithms. If one algorithm gives
the wrong result for a particular sample and the other predicts it correctly,
there is a possibility that the correct result will be obtained by the fusion
model; that is why fused models are becoming dominant in medical diagnosis. Our
model gave an accuracy of 93.42%, while the separate models' accuracies were
92.10% and 88.16% for ANN and KNN, respectively. In this paper, only the
cardiovascular disease dataset has been used, but this decision-level fusion
model can also be applied to other medical datasets. In the future, this fused
model can be enhanced by using different algorithms; furthermore, more than two
classification algorithms can be used to build a new model to obtain more
accurate results.

References
1. Christensen, T., Frandsen, A., Glazier, S., Humpherys, J., Kartchner, D.: Machine
learning methods for disease prediction with claims data. In: 2018 IEEE Interna-
tional Conference on Healthcare Informatics (ICHI), pp. 467–4674. IEEE (2018)
2. Dua, D., Graff, C.: UCI machine learning repository (2017). http://archive.ics.uci.
edu/ml
3. Harimoorthy, K., Thangavelu, M.: Multi-disease prediction model using improved
svm-radial bias technique in healthcare monitoring system. J. Ambient Intell. Hum.
Comput. 1–9 (2020)

4. Hasan, T.T., Jasim, M.H., Hashim, I.A.: Heart disease diagnosis system based on
multi-layer perceptron neural network and support vector machine. Int. J. Curr.
Eng. Technol. 77(55), 2277–4106 (2017)
5. Karayılan, T., Kılıç, Ö.: Prediction of heart disease using neural network. In: 2017
International Conference on Computer Science and Engineering (UBMK), pp. 719–
723. IEEE (2017)
6. Kotsiantis, S., Kanellopoulos, D., Pintelas, P.: Data preprocessing for supervised
leaning. Int. J. Comput. Sci. 1(2), 111–117 (2006)
7. Louridi, N., Amar, M., El Ouahidi, B.: Identification of cardiovascular diseases
using machine learning. In: 2019 7th Mediterranean Congress of Telecommunica-
tions (CMT), pp. 1–6. IEEE (2019)
8. Matin, A., Mahmud, F., Ahmed, T., Ejaz, M.S.: Weighted score level fusion of iris
and face to identify an individual. In: 2017 International Conference on Electrical,
Computer and Communication Engineering (ECCE), pp. 1–4. IEEE (2017)
9. Mohan, S., Thirumalai, C., Srivastava, G.: Effective heart disease prediction using
hybrid machine learning techniques. IEEE Access 7, 81542–81554 (2019)
10. Nourmohammadi-Khiarak, J., Feizi-Derakhshi, M.R., Behrouzi, K., Mazaheri, S.,
Zamani-Harghalani, Y., Tayebi, R.M.: New hybrid method for heart disease diag-
nosis utilizing optimization algorithm in feature selection. Health Tech. 1–12 (2019)
11. Ramya Perumal, K.A.: Early prediction of coronary heart disease from cleveland
dataset using machine learning techniques. Int. J. Adv. Sci. Technol. 29(06), 4225 –
4234, May 2020. http://sersc.org/journals/index.php/IJAST/article/view/16428
12. Krishnan, S., Geetha, J.S.: Prediction of heart disease using machine learning algo-
rithms. In: 1st International Conference on Innovations in Information and Com-
munication Technology (ICIICT). IEEE (2019)
13. Shah, D., Patel, S., Bharti, S.K.: Heart disease prediction using machine learning
techniques. SN Comput. Sci. 1(6), 1–6 (2020)
14. Telnoni, P.A., Budiawan, R., Qana’a, M.: Comparison of machine learning classifi-
cation method on text-based case in twitter. In: 2019 International Conference on
ICT for Smart Society (ICISS). vol. 7, pp. 1–5. IEEE (2019)
15. Thomas, J., Princy, R.T.: Human heart disease prediction system using data min-
ing techniques. In: 2016 International Conference on Circuit, Power and Computing
Technologies (ICCPCT), pp. 1–5. IEEE (2016)
16. Vasant, P., Zelinka, I., Weber, G.W.: Intelligent Computing & Optimization, vol.
866. Springer, Berlin (2018)
Towards POS Tagging Methods
for Bengali Language: A Comparative
Analysis

Fatima Jahara, Adrita Barua, MD. Asif Iqbal, Avishek Das, Omar Sharif ,
Mohammed Moshiul Hoque(B) , and Iqbal H. Sarker

Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong 4349, Bangladesh
fatimajahara@ieee.org, adrita766@gmail.com, asifiqbalsagor123@gmail.com,
avishek.das.ayan@gmail.com, {omar.sharif,moshiul 240,iqbal}@cuet.ac.bd

Abstract. Part of Speech (POS) tagging is recognized as a significant research
problem in the field of Natural Language Processing (NLP). It has considerable
importance in several NLP technologies. However, developing an efficient POS
tagger is a challenging task for resource-scarce languages like Bengali. This
paper presents an empirical investigation of various POS tagging techniques for
the Bengali language. An extensively annotated corpus of around 7390 sentences
has been used for 16 POS tagging techniques, comprising eight stochastic-based
methods and eight transformation-based methods. The stochastic methods are
uni-gram, bi-gram, tri-gram, unigram+bigram, unigram+bigram+trigram, Hidden
Markov Model (HMM), Conditional Random Field (CRF), and Trigrams 'n' Tags
(TnT), whereas the transformation methods are Brill in combination with the
previously mentioned stochastic techniques. A comparative analysis of the
tagging methods is performed using two tagsets (30-tag and 11-tag) with
accuracy measures. Brill combined with CRF shows the highest accuracy of 91.83%
(for the 11-tagset) and 84.5% (for the 30-tagset) among all the tagging
techniques.

Keywords: Natural language processing · Part-of-speech tagging · POS tagset · Training · Evaluation

1 Introduction
POS tagging has significant importance in many NLP applications such as
parsing, information retrieval, speech analysis, and corpus development.
Moreover, it is used as a pivotal component to build a knowledge base for a
natural language analyzer. It makes a syntactic parser effective, as it
resolves the problem of input sentence ambiguity. Tagging of words is
significantly useful since the tags are used as input in various applications,
where they provide the linguistic signal on how a word is being used within the
scope of a phrase, sentence, or document. POS tagging directly affects the
performance of any subsequent text processing steps, as it makes the processing
easier when the grammatical information

about the word is known. Usually, supervised and unsupervised approaches are
employed in POS tagging, which are further divided into rule-based,
stochastic-based, and transformation-based methods. Rule-based POS tagging uses
a dictionary or lexicon to obtain the possible tags of a word. The stochastic
method considers the highest frequency or probability value to assign a POS
tag; a few stochastic tagging methods such as N-grams, CRFs, and HMMs have been
implemented for Bengali, English, and other languages [1–3]. The
transformation-based method combines rule-based and stochastic techniques, as
in the Brill tagger.
Designing a POS tagger is a very challenging task for a resource-poor language
like Bengali. POS tagging of Bengali sentences is complicated due to the
language's complex morphological structure, the dependency of the subject on
verb inflections, person-verb-tense-aspect agreement, and the scarcity of
pre-tagged resources [4,5]. Moreover, the ambiguity of words with multiple POS
tags and the lack of availability of language experts in Bengali pose other
obstacles that need to be overcome. Most of the previous works on POS tagging
in Bengali neither highlighted the tagging effectiveness nor investigated their
appropriateness. Thus, to address this issue, this paper empirically
investigates the performance of 16 POS tagging methods using a supervised
approach on a corpus containing 7390 Bengali sentences. A comparative analysis
in terms of execution time and accuracy is reported, which helps to decide the
suitable POS tagging technique for various language processing tasks in
Bengali.

2 Related Work

Different approaches have been explored for POS tagging in Bengali and other
languages. Stochastic and transformation-based methods are the most widely used
techniques, where a large dataset is a prerequisite to achieve good
performance. Hasan et al. [6] presented a comparative analysis of n-gram, HMM,
and Brill transformation-based POS tagging for South Asian languages; a tagset
of 26 tags was used for Bengali, Hindi, and Telugu, gaining 70% accuracy for
Bengali with the Brill tagger. Another work implemented the trigram and HMM
tagging methods [7] for the Marathi language. A comparison between stochastic
(HMM, unigram) and transformation-based (Brill) methods was presented by Hasan
et al. [8]; this work used a small training set of 4048 tokens in Bengali and
experimented with two different tagsets (12-tag and 41-tag), and the results
revealed that the Brill tagger performed better than the other stochastic
methods. A stochastic approach proposed by Ekbal et al. [9] concluded that a
maximum entropy-based method outperforms the HMM-based POS tagging method for
Bengali. Ekbal et al. [10] developed a POS tagger for Bengali sentences using
CRF in a named entity recognition task. PVS et al. [11] showed that CRF, along
with transformation-based learning, achieved 76.08% accuracy for Bengali POS
tagging.
Supervised tagging methods demand a large amount of tagged data to achieve high
accuracy. Dandapat et al. [12] used a semi-supervised method of POS tagging
with HMM and Maximum Entropy (ME). Hossain et al. [13] developed a method that
checks whether the construction of a Bengali sentence is valid syntactically,
semantically, or pragmatically; they designed a rule-based algorithm using
context-free grammars to identify all POS meticulously. Roy et al. [14]
developed a POS tagger that identifies 8 POS tags in Bengali using grammar- and
suffix-based rules; however, they only considered word-level tag accuracy,
which fails to identify the correct tag sequence in sentences. Sakiba et al.
[15] discussed a POS tagging tool in which a predefined list of POS tags and
hand-crafted rules are used to detect POS tags in Bengali texts; their method
used a very small dataset containing 2000 sentences and faced difficulties due
to the limited rules. Chakrabarti et al. [16] proposed a POS tagging method
using a layered approach. Rathod et al. [17] surveyed different POS tagging
techniques such as rule-based, stochastic, and hybrid for Indian regional
languages, where the hybrid method performed better. Most of the previous
approaches to POS tagging in Bengali experimented on smaller datasets, which
limits the investigation of their effectiveness on diverse tagsets. In this
work, a somewhat larger dataset consisting of 7390 sentences is used.

3 Methodology
The key objective of our work is to investigate the performance of different
types of POS tagging techniques under a supervised approach. To serve our
purpose, we used a tagged corpus in Bengali developed by the Linguistic Data
Consortium [18]. The overall process of the work consists of five significant
steps: tokenization, training/testing corpus creation, POS tagger model
generation, tagged sentence generation, and evaluation. Figure 1 illustrates an
abstract representation of the overall process to evaluate POS taggers.

3.1 Tagged Corpus


The corpus consists of 7390 tagged sentences with 22,330 unique tokens and
115 K tokens overall. Two different levels of tagsets have been used for the
annotation: 30 tags and 11 tags [19]. The corpus was originally tagged with the
30-tagset, which denotes the lexical categories (i.e., POS) and sub-categories.
This 30-tagset is mapped into the 11-tagset using a mapping dictionary, which
considers the lexical categories alone. An extra tag ('Unk') is used for
handling unknown words: if a tagger encounters a word that is not available in
the training set, the tagger labels it with the 'Unk' tag. Table 1 lists the
tagsets with their tag names and numbers of tags.
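Such a mapping is a plain dictionary lookup; the fragment below is built from
Table 1 (truncated for brevity) and is our illustration, not the authors' code.

    # partial 30-tag -> 11-tag mapping taken from Table 1
    TAG_MAP = {
        'NC': 'N', 'NP': 'N', 'NV': 'N', 'NST': 'N',   # noun sub-categories
        'VM': 'V', 'VAUX': 'V',                        # verb sub-categories
        'JJ': 'J', 'JQ': 'J',                          # nominal modifiers
        'PU': 'PU',                                    # punctuation maps to itself
    }

    def to_11_tagset(tagged_sentence):
        # fall back to 'Unk' only for tags missing from this truncated map
        return [(word, TAG_MAP.get(tag, 'Unk')) for word, tag in tagged_sentence]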

3.2 Tokenization
Tokenization is the task of slicing a sequence of characters into pieces,
called tokens [20]. A token is a string of contiguous characters grouped as a
semantic unit and delimited by spaces and punctuation marks.

Fig. 1. Abstract view of POS tagging evaluation process.

These tokens are often loosely referred to as terms or words. In our corpus,
7390 tagged sentences were tokenized into a total of 115 K tokens with 22,330
unique tokens. A sample tagged sentence and its corresponding tokens are shown
in the following.
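A rough sketch of such a tokenizer (the paper does not give its implementation;
treating the Bengali danda '।' (U+0964) as a delimiter is our assumption):

    import re

    def tokenize(sentence):
        # keep runs of non-delimiter characters as tokens, and emit punctuation
        # marks (including the danda) as separate tokens
        return re.findall(r"[^\s\u0964\u0965,;:!?]+|[\u0964\u0965,;:!?]", sentence)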

3.3 Train/Test Corpus Creation

The tokenized dataset is divided into two sets: a train corpus and a test
corpus. The training corpus consists of 98 K tagged tokens, while the test
corpus contains 17 K tokens. The data in the test corpus are untagged for use
in the testing phase for evaluation. Data samples from the training corpus
(with the 11-tagset and 30-tagset) and the testing corpus are illustrated in
the following.

Table 1. Summary of POS tagsets

11 Tagset Tagset count 30 Tagset Tagset count


Noun (N) 44425 Common Noun (NC) 30819
Proper Noun (NP) 7994
Verbal Noun (NV) 2985
Spatio-Temporal Noun (NST) 2627
Verb (V) 14292 Main Verb (VM) 12062
Auxiliary Verb (VAUX) 2230
Pronoun (P) 6409 Pronominals (PPR) 5137
Reflexive (PRF) 362
Reciprocal (PRC) 15
Relative (PRL) 448
WH Pronoun (PWH) 447
Nominal Modifier (J) 14332 Adjective (JJ) 9377
Quantifier (JQ) 4955
Demonstrative (D) 2876 Absolute Demonstrative (DAB) 2421
Relative Demonstrative (DRL) 400
WH Demonstrative (DWH) 55
Adverb (A) 3965 Adverb of Manner (AMN) 1995
Adverb of Location (ALC) 1970
Participle (L) 573 Verbal Participle (LV) 72
Conditional Participle (LC) 501
Post position (PP) 3989 Post Position (PP) 3989
Particle (C) 6704 Coordinating Particle (CCD) 2899
Subordinating Particle (CSB) 2051
Classifier Particle (CCL) 324
Interjection (CIN) 59
Others (CX) 1371
Punctuation (PU) 13519 Punctuation (PU) 13519
Residual (R) 4348 Foreign Word (RDF) 1873
Symbol (RDS) 1968
Others (RDX) 507

3.4 POS Tagging Model Generation

The training set is used to train different POS tagging methods. Each tagging
method is trained on training corpus tagged with 11 and 30 tagsets–each of
16 POS tagging methods used in a unique way to generate their corresponding
training models. N-gram, HMM, TnT, and CRF tagger models generate feature
matrices which is used in calculating the probability of the tags. The Brill tagger
generates rules used to estimate tags, and the HMM model makes a transition
probability matrix called Markov Matrix. An N-gram tagger follows the Bag of
Words approach while the CRF tagger uses a statistical approach.

N-Gram Tagger. An n-gram is a sequence of N words in a sentence. An n-gram
tagger considers the tags of the previous (n − 1) words (one word in a bigram,
two words in a trigram) in the sentence to predict the POS tag for a given word
[7]. The best tag is ensured by the probability that the tag occurs with the
(n − 1) previous tags. Thus, if τ1, τ2, ..., τn is a tag sequence and
ω1, ω2, ..., ωn is the corresponding word sequence, then the probability can be
computed using Eq. 1:

P(\tau_i \mid \omega_i) = P(\omega_i \mid \tau_i) \cdot P(\tau_i \mid \tau_{i-(n-1)}, \ldots, \tau_{i-1})    (1)

where P(ω_i | τ_i) denotes the probability of the word ω_i given the current
tag τ_i, and P(τ_i | τ_{i−(n−1)}, ..., τ_{i−1}) represents the probability of
the current tag τ_i given the (n − 1) previous tags. This provides the
transition between the tags and helps to capture the context of the sentence.
The probability of a tag τ_i given the previous (n − 1) tags
τ_{i−(n−1)}, ..., τ_{i−1} can be determined using Eq. 2:

P(\tau_i \mid \tau_{i-(n-1)}, \ldots, \tau_{i-1}) = \frac{C(\tau_{i-(n-1)}, \ldots, \tau_i)}{C(\tau_{i-(n-1)}, \ldots, \tau_{i-1})}    (2)

Each tag transition probability is computed by dividing the count of
occurrences of the n tags by the count of occurrences of the previous (n − 1)
tags. Different n-gram models can be combined to work as a combined tagger.
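The combined n-gram taggers map directly onto NLTK's backoff chain; a sketch
assuming train_sents/test_sents are lists of (word, tag) sentence lists from
the corpus, with 'Unk' as the default tag for unseen words.

    from nltk.tag import DefaultTagger, UnigramTagger, BigramTagger, TrigramTagger

    t0 = DefaultTagger('Unk')                     # unknown words get 'Unk'
    t1 = UnigramTagger(train_sents, backoff=t0)   # uni-gram
    t2 = BigramTagger(train_sents, backoff=t1)    # uni-gram + bi-gram
    t3 = TrigramTagger(train_sents, backoff=t2)   # uni-gram + bi-gram + tri-gram

    print(t3.accuracy(test_sents))  # NLTK >= 3.6; older versions use evaluate()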

HMM Tagger. In an HMM, the hidden states are the POS tags (τ1, τ2, ..., τn) and
the observations are the words themselves (ω1, ω2, ..., ωn). Both transition
and emission probabilities are calculated to determine the most appropriate tag
for a given word in a sequence. The overall probability of a tag τ_i given a
word ω_i is

P(\tau_i \mid \omega_i) = P(\omega_i \mid \tau_i) \cdot P(\tau_i \mid \tau_{i-1}) \cdot P(\tau_{i+1} \mid \tau_i)    (3)

Here, P(ω_i | τ_i) is the probability of the word ω_i given the current tag
τ_i, P(τ_i | τ_{i−1}) is the probability of the current tag τ_i given the
previous tag τ_{i−1}, and P(τ_{i+1} | τ_i) is the probability of the future tag
τ_{i+1} given the current tag τ_i.

TnT Tagger. In TnT, the most appropriate sequence is selected based on the
probabilities of each possible tag. For a given sequence of words of length n,
a sequence of tags is calculated by Eq. 4:

\arg\max_{\tau_1 \ldots \tau_n} \left[ \prod_{i=1}^{n} P(\tau_i \mid \tau_{i-1}, \tau_{i-2}) \cdot P(\omega_i \mid \tau_i) \right] P(\tau_n \mid \tau_{n-1})    (4)

where P(τ_i | τ_{i−1}, τ_{i−2}) denotes the probability of the current tag τ_i
given the two previous tags τ_{i−1} and τ_{i−2}, P(ω_i | τ_i) indicates the
probability of the word ω_i given the current tag τ_i, and P(τ_n | τ_{n−1})
denotes the probability of the tag τ_n given the previous tag τ_{n−1}.

CRF Tagger. CRF is a discriminative probabilistic classifier that calculates
the conditional probability of the tags given an observable sequence of words.
The conditional probability of a sequence of tags T = (τ1, τ2, ..., τn) given a
word sequence W = (ω1, ω2, ..., ωn) of length n can be calculated by Eq. 5:

P(T \mid W) = \frac{1}{Z(W)} \exp\left\{ \sum_{i=1}^{n} \sum_{k} \theta_k f_k(\tau_i, \tau_{i-1}, \omega_i) \right\}    (5)

Here, f_k(τ_i, τ_{i−1}, ω_i) represents a feature function whose weight is
θ_k, and Z(W) is a normalization factor, the sum over all possible tag
sequences.
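All three probabilistic taggers above have off-the-shelf NLTK implementations
(the CRF one additionally requires python-crfsuite); a sketch with the same
train_sents as before:

    from nltk.tag.hmm import HiddenMarkovModelTrainer
    from nltk.tag.tnt import TnT
    from nltk.tag import CRFTagger

    hmm_tagger = HiddenMarkovModelTrainer().train_supervised(train_sents)

    tnt_tagger = TnT()
    tnt_tagger.train(train_sents)

    crf_tagger = CRFTagger()                        # wraps python-crfsuite
    crf_tagger.train(train_sents, 'bengali.crf.model')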

Brill Tagger. The Brill tagger is a transformation-based tagger, in which a tag
is assigned to each word using a set of transformation rules. For a sequence of
tags τ1, τ2, ..., τn, a Brill rule can be represented as Eq. 6, where a
condition tests the preceding words or their tags and the rule is executed if
the condition is fulfilled:

\tau_1 \rightarrow \tau_2    (6)

Stochastic taggers can be used as back-off taggers with the Brill tagger. In
our work, to investigate the effect of the back-off tagger on tagging
performance, we used the uni-gram, bi-gram, tri-gram, uni-gram+bi-gram,
uni-gram+bi-gram+tri-gram, HMM, TnT, and CRF tagging methods with the Brill
tagger.
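With NLTK, any of the trained taggers above can serve as the Brill tagger's
initial (back-off) tagger; the sketch below pairs Brill with the CRF tagger,
the best-performing combination in Sect. 4. The template set and rule count are
our assumptions.

    from nltk.tag.brill import brill24
    from nltk.tag.brill_trainer import BrillTaggerTrainer

    trainer = BrillTaggerTrainer(initial_tagger=crf_tagger,
                                 templates=brill24(), trace=0)
    brill_crf = trainer.train(train_sents, max_rules=100)

    print(brill_crf.accuracy(test_sents))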

3.5 Predicted Tagged Sentence Generation

The generated POS tagging model predicts the highest-probability tag for each
token and labels the token with the appropriate POS tag. This process reads the
untagged tokens and calculates the probabilities of the different tags based on
the trained tagger model. Stochastic tagger models (N-gram, HMM, TnT, and CRF)
use the feature matrices to calculate the probabilities of the tags, while the
transformation-based (Brill) model uses the generated rules to estimate the
probabilities of the tags. After POS tagging of each token, individual

lists of tagged tokens are created for each sentence. Algorithm 1 describes the
process of converting the tagged-token lists into tagged sentences.
Algorithm 1: Tagged tokens to tagged sentence generation
T ← list of tagged tokens
tagged_sentence ← []                  ▷ list initialization
for t ∈ T do
    S ← ""                            ▷ tagged sentence initialization
    for token ∈ t do
        S ← S + token[word] + "\" + token[tag] + " "
    end
    tagged_sentence.append(S)
end

Here T denotes the list of tagged tokens of the corresponding sentences, and S
represents a tagged sentence. Every token is a tuple of 'word' and the
corresponding 'tag', as token{word, tag}. The list of tokens is stacked as a
sequence of 'word' and 'tag' pairs to generate a tagged sentence.
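In Python, Algorithm 1 amounts to joining word\tag pairs; a direct
transcription:

    def tagged_tokens_to_sentences(T):
        """T: list of sentences, each a list of (word, tag) tuples."""
        tagged_sentences = []
        for t in T:
            # stack the tokens as a sequence of word\tag pairs
            s = " ".join(f"{word}\\{tag}" for word, tag in t)
            tagged_sentences.append(s)
        return tagged_sentences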
As an example, for the untagged testing tokens (illustrated in Sect. 3.3),
the prediction model generates the tagged tokens and tagged sentence (for 11-
tagset), as shown in the following.

4 Evaluation Results and Analysis


To evaluate the performance of POS tagging technique, we use two parameters:
accuracy (A) and execution time (E). The accuracy can be defined as the ratio
between the number of correctly tagged tokens and the total number of tokens.
The execution time (E) of a tagger can be computed by the addition of time
required during training and testing. Experiments were run on a general-purpose
computer with an IntelR
CoreTM i5-5200H processor running at 2.20 GHz, 8 GB
of RAM, and Windows 10. NVIDIA GeForce GTX 950 M GPU is used with 4 GB
RAM. Sixteen POS tagging methods are implemented, and their performance is
investigated. Two tagsets (11-tagset and 30 tagset) are used with 115 K tokens for
evaluating the performance of each POS tagging method in terms of accuracy and
execution time. Table 2 summarizes the accuracy of the POS tagging techniques.
The analysis revealed that the Brill+CRF model achieved the highest accuracy:
84.5% (for the 30-tagset) and 91.83% (for the 11-tagset). The tri-gram method
performed poorly on both tagsets. Additionally, it is observed that in all
cases the accuracy of the taggers increases as the number of tags in the tagset
is reduced.

Table 2. Accuracy of 16 POS tagging methods.

POS tagger                          Accuracy (%) for 30-tagset   Accuracy (%) for 11-tagset
Uni-gram 71.46 75.88
Bi-gram 7.79 9.67
Tri-gram 5.33 6.21
Uni-gram+bi-gram 72.59 76.35
Uni-gram+bi-gram+tri-gram 72.42 76.12
HMM 75.12 79.22
TnT 72.35 76.39
CRF 84.27 90.84
Brill+uni-gram 72.99 76.49
Brill+bi-gram 60.58 70.37
Brill+tri-gram 59.3 69.57
Brill+uni-gram+bi-gram 72.75 76.55
Brill+uni-gram+bi-gram +tri-gram 72.54 76.23
Brill+HMM 76.04 79.98
Brill+TnT 72.83 76.45
Brill+CRF 84.50 91.83

To examine the effect of corpus size on the performance of the POS taggers, the
taggers were trained with different numbers of tokens from the same corpus. The
tagged corpus was partitioned into training sets of different sizes: 10 K,
20 K, 30 K, 40 K, 50 K, 60 K, 70 K, 80 K, 90 K, 100 K, and 115 K tokens. Each
model was trained on each partitioned dataset individually and tested on the
17 K-token untagged testing dataset. Figure 2 shows the performance of the
different POS tagging methods for various training corpus sizes using the
30-tagset. From the figure, it is observed that the Brill+CRF tagger has the
highest accuracy even when the corpus size is small. Both the CRF and Brill+CRF
taggers reached almost 75% (for the 30-tagset) and 85% (for the 11-tagset)
accuracy with a 10 K tagged set. The accuracy of each method increased sharply
as the dataset grew and became almost steady at 100 K.
The performance of a tagger also depends on its execution time: the shorter the
execution time, the better the performance. We computed the execution time of
the taggers for the 11-tagset and the 30-tagset; Table 3 shows the comparison
of execution time among the 16 tagging techniques.
The amount of time required to train a model on the training set determines the
training time of the tagger. From Table 3, it is observed that the HMM tagger
requires the least training time (0.25 s), whereas Brill+TnT requires the
highest training time (333.38 s) for the 30-tagset. For the 11-tagset, HMM
consumed 0.25 s and Brill+TnT required 164.56 s of training time. In the case
of testing time, the

Fig. 2. The effect of data size on accuracy for 30-tagset

Table 3. Comparison of execution time among 16 POS tagging methods.

                                   30-tagset                           11-tagset
POS tagger                         Training  Testing  Execution        Training  Testing  Execution
                                   time (s)  time (s) time (s)         time (s)  time (s) time (s)
Unigram                            0.43      0.02     0.45             0.34      0.02     0.37
Bigram                             0.6       0.03     0.63             0.59      0.03     0.62
Trigram                            0.76      0.04     0.8              0.67      0.03     0.7
Uni-gram+bi-gram                   1.0       0.04     1.03             1.04      0.04     1.07
Uni-gram+bi-gram+tri-gram          1.79      0.05     1.84             1.66      0.05     1.71
HMM                                0.25      6.88     7.13             0.25      3.8      4.05
TnT                                0.53      44.87    45.39            0.5       19.01    19.51
CRF                                49.94     0.14     50.08            15.97     0.11     16.09
Brill+uni-gram                     21.54     0.16     21.7             16.3      0.2      16.51
Brill+bi-gram                      67.08     0.79     67.87            42.97     0.77     43.74
Brill+tri-gram                     74.24     0.94     75.17            55.75     0.85     56.6
Brill+uni-gram+bi-gram             12.36     0.2      12.56            12.79     0.21     13.0
Brill+uni-gram+bi-gram+tri-gram    9.64      71.67    81.31            10.62     75.45    86.07
Brill+HMM                          39.84     5.76     45.6             25.24     3.5      28.73
Brill+TnT                          333.38    44.81    378.19           164.56    22.56    187.12
Brill+CRF                          88.56     0.48     89.04            36.83     0.5      37.33

uni-gram tagger had the lowest tagging time (0.02 s) for both tagsets, whereas
Brill+uni-gram+bi-gram+tri-gram required the highest tagging time, about
71.67 s (for the 30-tagset) and 75.45 s (for the 11-tagset), respectively.
The execution time determines the tagging speed of the POS tagging techniques.
Figure 3 shows the execution time required by each tagging method on our
dataset. The results indicate that Brill+TnT demands more execution time than
the other POS tagging methods.

Fig. 3. Execution time of different POS Taggers

From the results, it is observed that the Brill taggers, combined with the
various back-off taggers, achieved higher accuracy but lag in execution time.
Brill+CRF obtained the highest accuracy of 91.83% (for the 11-tagset), but it
requires a longer execution time (37.33 s). On the other hand, the CRF method
achieved 90.84% accuracy and consumes 16.09 s for execution. Thus, there is a
trade-off between accuracy and execution time. Taking both accuracy and
execution time into consideration, the CRF method provides better POS tagging
performance compared to the other techniques.

5 Conclusion
In this work, we have illustrated and investigated different POS tagging
techniques for the Bengali language. A comparative analysis of 16 POS tagging
techniques (eight stochastic-based and eight transformation-based) on a tagged
corpus consisting of 115,000 tokens has been reported. The comparative analysis
revealed that Brill with CRF achieved the highest accuracy among

other POS tagging techniques, but it requires more execution time. CRF can
be maintained as a good trade-off between accuracy and execution time, and
this method can be used as a better POS tagging technique. Tagging methods
that include both statistical and linguistic knowledge may produce a better per-
formance. The performance of the other tagging techniques such as TAGGIT,
CLAWS, Xerox, and Hybrid can be investigated further on the larger tagged
corpus with more unique words. These issues will be addressed in the future.

References
1. Dandapat, S., Sarkar, S.: Part of speech tagging for Bengali with Hidden Markov
Model. In: Proceedings of NLPAI Machine Learning Competition (2006)
2. Diesner, J.: Part of speech tagging for English text data. School of Computer
Science, Carnegie Mellon University, Pittsburgh, PA 15213
3. Manju, K., Soumya, S., Idicula, S.M.: Development of a POS tagger for Malayalam-
an experience. In: International Conference on ARTCom, pp. 709–713. IEEE (2009)
4. Haque, M., Hasan, M.: Preprocessing the Bengali text: an analysis of appropriate
verbs (2018)
5. Bhattacharya, S., Choudhury, M., Sarkar, S., Basu, A.: Inflectional morphology
synthesis for Bengali noun, pronoun and verb systems. Proc. NCCPB 8, 34–43
(2005)
6. Hasan, M.F., UzZaman, N., Khan, M.: Comparison of Unigram, Bigram, HMM and
Brill's POS tagging approaches for some South Asian languages (2007)
7. Kumawat, D., Jain, V.: POS tagging approaches: a comparison. Int. J. Comput.
Appl. 118(6) (2015)
8. Hasan, F.M., UzZaman, N., Khan, M.: Comparison of different POS tagging tech-
niques (N-gram, HMM and Brill's tagger) for Bangla. In: Advances and Innovations
in Systems, Computing Sciences and Software Engineering, pp. 121–126. Springer,
Berlin (2007)
9. Ekbal, A., Haque, R., Bandyopadhyay, S.: Maximum entropy based Bengali part
of speech tagging. J. Res. Comput. Sci. 33, 67–78 (2008)
10. Ekbal, A., Haque, R., Bandyopadhyay, S.: Bengali part of speech tagging using
conditional random field. In: International Conference on SNLP2007, pp. 131–136
(2007)
11. PVS, A., Karthik, G.: Part-of-speech tagging and chunking using conditional ran-
dom fields and transformation based learning. Shallow Parsing South Asian Lang.
21, 21–24 (2007)
12. Dandapat, S., Sarkar, S., Basu, A.: Automatic part-of-speech tagging for Bengali:
an approach for morphologically rich languages in a poor resource scenario. In:
Proceedings of 45th Annual Meeting of ACL Companion, pp. 221–224 (2007)
13. Hossain, N., Huda, M.N.: A comprehensive parts of speech tagger for automatically
checked valid Bengali sentences. In: International Conference ICCIT, pp. 1–5. IEEE
(2018)
14. Roy, M.K., Paull, P.K., Noori, S.R.H., Mahmud, H.: Suffix based automated parts
of speech tagging for Bangla language. In: International Conference on ECCE, pp.
1–5. IEEE (2019)
15. Sakiba, S.N., Shuvo, M.U.: A memory efficient tool for Bengali parts of speech
tagging. In: Artificial Intelligence Techniques for Advanced Computer Application,
pp. 67–78. Springer (2020)

16. Chakrabarti, D., CDAC, P.: Layered parts of speech tagging for Bangla. Language
in India: Problems of Parsing in Indian Languages (2011)
17. Rathod, S., Govilkar, S.: Survey of various POS tagging techniques for Indian
regional languages. Int. J. Comput. Sci. Inf. Technol. 6(3), 2525–2529 (2015)
18. Bali, K., Choudhury, M., Biswas, P.: Indian language part-of-speech tagset:
Bengali. Linguistic Data Consortium, Philadelphia (2010)
19. Sankaran, B., Bali, K., Choudhury, M.: A common parts-of-speech tagset
framework for Indian languages (2008)
20. Rai, A., Borah, S.: Study of various methods for tokenization. In: Application of
IoT, pp. 193–200. Springer (2021)
BEmoD: Development of Bengali
Emotion Dataset for Classifying
Expressions of Emotion in Texts

Avishek Das, MD. Asif Iqbal, Omar Sharif, and Mohammed Moshiul Hoque(B)

Chittagong University of Engineering and Technology, Chittagong-4349, Bangladesh


avishek.das.ayan@gmail.com, asifiqbalsagor123@gmail.com,
{omar.sharif,moshiul 240}@cuet.ac.bd

Abstract. Recently, emotion detection in language has attracted increased
attention from NLP researchers due to the massive availability of people's
expressions, opinions, and emotions through comments on Web 2.0 platforms.
Developing an automatic sentiment analysis system in Bengali is a very
challenging task due to the scarcity of resources and the unavailability of
standard corpora; therefore, the development of a standard dataset is a
prerequisite for analyzing emotional expressions in Bengali texts. This paper
presents an emotion dataset (hereafter called 'BEmoD') for the analysis of
emotion in Bengali texts and describes its development process, including data
crawling, pre-processing, labeling, and verification. BEmoD contains 5200
texts, which are labeled into six basic emotional categories: anger, fear,
surprise, sadness, joy, and disgust. Dataset evaluation with a Cohen's κ score
of 0.920 shows the agreement among annotators. The evaluation analysis also
shows that the distribution of emotion words follows Zipf's law.

Keywords: Natural language processing · Bengali emotion corpus · Emotion classification · Emotion expressions · Evaluation · Corpus development

1 Introduction
Recognizing emotion in an expression involves attributing to the expression an
emotion label preferred from a set of pre-specified emotion labels. There are
several application areas where there is a need to understand and interpret
emotions in text expressions, such as business, politics, education, sports,
and entertainment. Recognizing emotion in text expressions is one of the
critical tasks in NLP, which demands an understanding of natural language. The
hindrance commences at the sentence level, in which an emotion is stated
through the semantics of words and their connections; as the level rises, the
problem's difficulty grows. Moreover, not all opinions are stated explicitly;
there are metaphors, mockery, and irony.

Sentiment classification from texts can be divided into two categories:
opinion-based and emotion-based. Opinion classification is based on text
polarity, which classifies texts/sentences into positive, negative, or neutral
sentiments [1]. Emotion classification deals with classifying sentences
according to their emotions [2].
Bengali is the fifth most-spoken native language in the world. Approximately
228 million people all over the world speak Bengali as their first language,
and around 37 million people speak it as a second language. In recent years,
data storage on the web has increased exponentially due to the emergence of
Web 2.0 applications and related services in the Bengali language. Most of
these data are available in textual forms such as reviews, opinions,
recommendations, ratings, and comments, which are mostly unstructured.
Analyzing these enormous amounts of data to extract the underlying sentiments
or emotions is a challenging research problem for a resource-constrained
language like Bengali. The complexity arises from various limitations, such as
the lack of tools, the scarcity of benchmark corpora, and learning techniques.
Because Bengali is a resource-scarce language, emotion-based classification
over six emotion classes has, to the best of our knowledge, not yet been
performed. Therefore, in this work, we are motivated to develop a corpus (we
call it BEmoD: Bengali Emotion Dataset) for classifying emotions in Bengali
texts. We consider six types of textual emotion, namely joy, sadness, anger,
fear, surprise, and disgust, based on Ekman's basic emotion classification [3].
The critical contribution of our work is the development of an emotion dataset
for classifying Bengali texts into one of the six emotions. Several expressions
were collected from online/offline sources over three months and classified
into emotion classes. The findings help researchers explore several issues,
such as the characteristics of emotion words in each class. Another
contribution is that this work identifies the high-frequency emotion words in
Bengali literature and analyzes the dataset in terms of several metrics such as
Cohen's kappa and Zipf's law.

2 Related Work
Different corpora have been developed to detect emotions in text for various
languages. Alm et al. [4] developed a dataset consisting of 185 tales, in which
labeling is done at the sentence level with sad, happy, neutral, anger-disgust,
fear, positive-surprise, and negative-surprise classes. A dataset of blog posts
was developed with eight emotion labels, including Ekman's six basic emotions
[5]. ISEAR2 [6] is a corpus with joy, fear, anger, sadness, disgust, shame, and
guilt classes. Semantic Evaluation (SemEval) introduced a complete dataset in
English for emotion analysis [7]. To perform sentiment analysis, corpora for
several languages, such as Arabic [8], Czech [9], and French [10], were
created. Another corpus proposed by SemEval [11] consists of Arabic, English,
and Spanish tweets labeled with eleven emotions. A recently developed corpus
consists of textual dialogues in which each discussion is labeled as anger,
happy, sad, or other [12]. A Twitter corpus containing Hindi-English code-mixed

text was annotated with Ekman's six basic emotions by two language-proficient
annotators, and the quality was validated using Cohen's kappa coefficient [13].
Although Bengali is a resource-poor language, a few works have been conducted
on sentiment analysis by classifying Bengali words or texts into positive,
negative, and neutral classes. Das et al. [14] approached the task with
semi-automated word-level annotation using a Bangla emotion word list
translated from the English WordNet Affect lists [15]; the authors focused
heavily on words rather than the context of the sentences and built an emotion
dataset of 1300 sentences with six classes. Prasad et al. [16] tagged the
sentiments of tweet data based on emoticons and hashtags, building a sentiment
dataset of 999 Bengali tweets. Nafis et al. [17] developed a Bangla emotion
corpus containing 2000 YouTube comments, annotated with Ekman's six basic
emotions; they used majority voting for final labeling and ended up tagging
with four emotion labels, as some of the comments were ambiguous. Rahman et al.
[18] used a Bengali cricket dataset for sentiment analysis of Bengali text, in
which 2900 different online comments were labeled with positive, negative, and
neutral sentiments. A couple of works, [19] and [20], developed datasets for
sentiment analysis of Bengali text; the first considered 1000 restaurant
reviews, whereas the second considered 1200. Most of the previous Bengali
datasets were developed to analyze sentiment as positive, negative, or neutral;
BEmoD, in contrast, is intended for classifying emotion in Bengali text into
six basic emotion classes.

3 Properties of Emotion Expressions in Bengali


Several text expressions in the Bengali language has investigated and found
out the distinctive properties of each category of Ekman’s six emotion such as
happiness, sadness, anger, disgust, surprise, and fear. In order to identify the
distinct properties of each category, the sentences are investigated based on the
following characteristics.
• Emotion Seed Words: We identify words commonly used in the context of a
particular emotion. For example, the words "happy", "enjoy", and "pleased" are
considered seed words for the happiness category. Thus, specific seed words are
stored for each specific emotion in Bengali. For example, (angry) or (anger) is
usually used for expressing the "anger" emotion; likewise, (happy) or (good
mood) is usually used for expressing the "joy" emotion.
• Intensity of Emotion Word: In Bengali, different seed words express different
emotions in a particular context. In such cases, the seed words are compared in
terms of intensity, and the emotion class of the highest-intensity seed word is
assigned as the emotion of that context. Consider the following example,

(English translation: When the news of Alexander's death reached Athens,
someone was surprised and asked, "Alexander is dead! Impossible! If he's dead,
the smell of his dead body would waft from every corner of the earth.")
In this text, several seed words are found, such as (death news), (surprised),
and (dead! Impossible!). Here the words (surprised) and (dead! Impossible!)
have more weight than (death news). Thus this type of text can be considered
"surprise", because the intensity of that emotion is higher than the intensity
of the sadness emotion.
• Sentence Semantics: Observing the semantic meaning of the text is one of the
prominent ways of ascertaining the emotion class. In the previous example,
though the sentence starts with the death news of Alexander, it turns into the
astonishment of an ordinary person in Athens. So, sentence semantics is an
important parameter in designating the emotion of an expression.
• Emotion Engagement: It is imperative that the annotator engage actively while
reading the text, to understand the semantics and context of the emotion
expression explicitly. For example, (English translation: Every moment spent in
St. Martin was awesome. Here are some of those countless moments captured on
camera). In this particular expression, annotators can feel some happiness, as
it describes a real moment of someone's experience. This feeling engages the
annotators with happiness, and the expression is designated as "joy".
• Think Like The Person (TLTP): An emotion expression usually conveys someone's
emotion in a particular context. With TLTP, an annotator imagines
himself/herself in the same context in which the emotion expression was
produced. By repeatedly uttering the expression, the annotator tries to imagine
the situation and annotates the emotion class accordingly.
By considering the above characteristics, each emotion expression is labeled
with one of the six emotion classes: joy, sadness, anger, disgust, surprise,
and fear.

4 Dataset Development Processes


The prime objective of our work is to develop an emotion dataset that can be
used to classify emotion expressions, usually written in Bengali, into one of
six basic emotion categories. Developing a dataset in Bengali is a critical
challenge for any language processing task. One of the notable challenges is
the scarcity of appropriate emotion text expressions: some links, misspelled
sentences, and "Benglish" sentences were obtained during data crawling.
Moreover, detecting emotion from plain text is more challenging than detecting
emotion from facial expressions, because people sometimes pretend to be fine in
text messages despite having many emotional problems in their day-to-day lives.
Figure 1 shows

an overview of the development process of BEmoD, which consists of four major
phases: data crawling, preprocessing, data annotation, and label verification.
We adopted the method described by Dash et al. [21] to develop the dataset.

Fig. 1. Development processes of BEmoD

4.1 Data Crawling


Bengali text data were accumulated from several sources such as Facebook com-
ments/posts, YouTube comments, online blog posts, storybooks, Bengali novels,
daily life conversations, and newspapers. Five participants were assigned to col-
lect data. They manually collected 5700 text expressions over three months.
Although most of the data collected from online sources, data can be created by
observing people’s conversations. In social media, many Bengali native talkers
wrote their comments or posts in the form of transliterated Bengali. For exam-
ple, a transliterated sentence, “muvita dekhe amar khub valo legeche. ei rokom
movi socharacor dekha hoy na.” (Bangla:
[English translation: I really enjoyed watching
this movi.e. Such movies are not commonly seen]. This type of texts demands
to be converted phonetically by the phonetic conversion. However, errors may
take place during phonetic conversion. For instance, in the above texts, the
word “socharacor (English: usually) could be translated in Bengali as,
after phonetic conversion whereas the accurate word should be
Therefore, correction should handle because there is no such word like
in Bengali Dictionary [22].

4.2 Preprocessing
Pre-processing was performed in two phases: manual and automatic. In the manual
phase, "typo" errors were eliminated from the collected data. We used the
Bangla Academy
BEmoD: Development of Bengali Emotion Dataset 1129

supported accessible dictionary (AD) database [22] to find the appropriate form
of a word. If a word existed in input texts but not in AD, then this word was
considered to be a typo word. The appropriate word searched in AD and the
typo word was replaced with this corrected word. For example, the text,

. In this example, the bold words indicate the


typo errors that need to be corrected by using AD. After replacing, the sentence
turned into
.
It has been observed that emojis and punctuation marks sometimes create
perplexity about the emotional level of the data. That is why, in the automatic
phase, these were eliminated from the manually processed data. We built an
emoji-to-hex (E2H) dictionary from [23]. Further, all the elements of E2H
were converted to Unicode to cross-check them against our corpus text elements.
A dictionary was also introduced which contains punctuation marks and special
symbols (PSD). Any text element that matched an element in E2H or
PSD was substituted with blank space. All the automatic preprocessing was done
with a Python script. After automatic preprocessing, the above example
becomes
.
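The automatic phase can be illustrated with a short script. The following is a
minimal sketch, not the authors' actual code: the emoji set (E2H) and the
punctuation/special-symbol dictionary (PSD) are reduced to a few illustrative
entries, whereas the real E2H dictionary built from the Unicode emoji list [23]
is far larger.

# Minimal sketch of the automatic preprocessing phase (illustrative entries only).
E2H = {"\U0001F600", "\U0001F622", "\U0001F621"}   # emojis, already as Unicode
PSD = {"!", "?", ",", ".", ";", ":", "#", "@"}     # punctuation and special symbols

def clean_expression(text: str) -> str:
    """Substitute every emoji or punctuation/special symbol with a blank space."""
    return "".join(" " if ch in E2H or ch in PSD else ch for ch in text)

print(clean_expression("darun laglo!! \U0001F600"))   # punctuation and emoji become spaces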

4.3 Data Annotation

The whole corpus was labeled manually, followed by majority voting to assign a
suitable label. The labeling or annotation tasks were performed by two separate
groups (G1 and G2). G1 consists of five postgraduate students with a Computer
Engineering background who work on NLP. The expert group (G2) consists of
three academicians who have been working on NLP for several years. They performed
label verification by selecting an appropriate label. A majority voting mechanism
was used to decide the final emotion class of an emotion expression. The unique
final label of an expression was chosen by following the process described in Algorithm 1.

Algorithm 1: Majority Voting & Final Label

1  T ← text corpus;
2  label ← [0,1,2,3,4,5];
3  AL ← Annotator Label Matrix;
4  FL ← Final Label Matrix;
5  for t_i ∈ T do
6      count_label = [0,0,0,0,0,0];
7      for a_ij ∈ AL do
8          count_label[a_ij]++;
9      end
10     FL_i = indexof[max(count_label)];
11 end
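A direct Python rendering of Algorithm 1 may help clarify the voting step. This is
a sketch of the published pseudocode, not the authors' implementation; the annotator
label matrix AL below is hypothetical.

# Sketch of Algorithm 1: majority voting over annotator labels.
# Classes are encoded 0..5 (joy, sadness, anger, disgust, surprise, fear).
def majority_vote(AL):
    """AL[i] holds the labels that all annotators gave to expression i."""
    final_labels = []
    for annotator_labels in AL:
        count_label = [0] * 6
        for a in annotator_labels:
            count_label[a] += 1
        # index of the maximum count; ties resolve to the lowest class index,
        # matching indexof[max(count_label)] in the pseudocode
        final_labels.append(count_label.index(max(count_label)))
    return final_labels

AL = [[0, 0, 1, 0, 4], [2, 2, 3, 2, 2], [5, 5, 4, 5, 5]]   # five G1 annotators
print(majority_vote(AL))   # -> [0, 2, 5]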

4.4 Label Verification

The majority voting by the G1 annotators decided the original label of the data.
The original label was considered final if it matched the expert label (G2). When
the labels of G1 and G2 mismatched, the data were sent to the groups for further
discussion. Both groups accepted a total of 4950 data labels among the 5700 data.
The remaining 750 data were sent for discussion. Both groups agreed on the labels
of about 250 data after discussion, and these were added to BEmoD. About
500 data were precluded due to disagreement between the groups. This
exclusion may be caused by texts with neutral emotion, implicit emotion,
or ill formatting. After verification, 5200 data, including their labels, were saved in
*.xlsx format.

5 Analysis of BEmoD

Dataset analysis was performed by determining the data distributions with respect
to sources and emotion classes. Emotion expression data were collected from online
sources, books, people's conversations, and artificial generation.

Table 1. Statistics of BEmoD

Corpus attributes Attributes value


Size on disk 685 KB
Total number of expressions 5200
Total number of sentences 9114
Total number of words 130476
Unique words 26080
Maximum words in a single expression 114
Minimum words in a single expression 7
Average words in a single expression 25 (approximately)

– Statistics of BEmoD: Table 1 illustrates the discriminating characteristics
of the developed BEmoD, which consists of 130476 words in total and 26080
unique words under 9114 sentences.
– Categorical Distribution in BEmoD: A total of 5200 expressions are
labeled into one of the six basic emotion categories after the verification pro-
cess. Table 2 shows the categorical summary of each class in BEmoD. It is
observed that the highest number of data belongs to the sadness category,
whereas the lowest number of data belongs to the surprise category.

6 Evaluation of BEmoD
We investigate how strongly the annotators agreed in assigning emotion classes
by using Cohen's kappa (κ) [24]. We also measure the density of emotion words,
high-frequency emotion words, and the distribution of emotion words with Zipf's
law, respectively [25].

Table 2. Data statistics by categories

Category Total emotion data Total sentences Total words Unique words
Anger 723 1126 18977 7160
Fear 859 1494 19704 6659
Surprise 649 1213 16806 6963
Sadness 1049 1864 26986 8833
Joy 993 1766 25011 8798
Disgust 927 1651 22992 8090

The κ score was computed from the majority voting of G1 and G2 by investigating the
standard inter-annotator agreement. Table 3 shows the results of the kappa statistics.
According to Landis et al. [26], a κ score of 0.920 is an almost perfect agreement.

Table 3. Kappa statistics

Kappa metrics G1 vs. G2


Number of observed agreements (po ) 93.33% of the observations
Number of agreements expected by chance (pe ) 16.69% of the observations
Kappa (κ) 0.920
SE of kappa 0.021
95% confidence interval 0.879 – 0.960
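For reference, the Table 3 statistic can be computed directly from two parallel label
sequences. The sketch below uses scikit-learn's cohen_kappa_score on hypothetical
labels; it only shows how such a score is obtained and does not reproduce the BEmoD
annotations.

# Cohen's kappa between two annotator groups (hypothetical labels).
from sklearn.metrics import cohen_kappa_score

g1 = [0, 1, 2, 2, 3, 4, 5, 0, 1, 1]   # final labels from group G1
g2 = [0, 1, 2, 2, 3, 4, 5, 0, 1, 2]   # verified labels from group G2

po = sum(a == b for a, b in zip(g1, g2)) / len(g1)   # observed agreement
print(f"p_o = {po:.2%}, kappa = {cohen_kappa_score(g1, g2):.3f}")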

To measure the influence of emotion words in the various classes, we compute
the density of emotion words. Density can be computed as the ratio of the
number of emotion words to the total number of words in each expression. The
density of emotion words per class is illustrated in Table 4. The overall density in
the whole corpus is 0.2433. If the density for one class is higher than 0.2433, it
signifies that each writer communicates heightened emotion concerning this class;
it also indicates that emotions are concentrated within this class.
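As a sanity check, the Table 4 densities can be recomputed from the published
per-class word counts. A minimal sketch:

# Recomputing emotion-word density per class from the Table 4 counts.
counts = {  # class: (unique words T, emotion words N)
    "anger": (7160, 1140), "fear": (6659, 1174), "surprise": (6963, 1689),
    "sadness": (8833, 2607), "joy": (8798, 2394), "disgust": (8090, 2312),
}
AVG = 0.2433   # overall corpus density
for cls, (T, N) in counts.items():
    D = N / T
    print(f"{cls:8s} D = {D:.4f}, deviation from average = {D - AVG:+.4f}")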
Figure 2 shows the deviation of each emotion class density from the average.
The figure indicates that the densities of the sadness, joy, and disgust classes
are higher than the average density of 0.2433, revealing that people are more
expressive in these classes and use more emotional words.

Table 4. Density of emotion words in each class

Emotion class Unique words (T) Emotion words (N) Density (D) D − 0.2433

Anger 7160 1140 0.1592 −0.0841
Fear 6659 1174 0.1763 −0.0670
Surprise 6963 1689 0.2425 −0.0008
Sadness 8833 2607 0.2951 0.0518
Joy 8798 2394 0.2721 0.0288
Disgust 8090 2312 0.2857 0.0424
Total/average 46503 11316 0.2433 0.0000

Fig. 2. Emotion words density vs. average density

The frequency of emotion words was counted over the whole of BEmoD. This
frequency of emotion words leads to the conclusion that some specific words are
consistently used to express specific human emotions. Table 5 lists the 10 most
frequent emotion words in BEmoD.
Zipf's law is an empirical observation which states that the frequency of
a given word is inversely proportional to its rank in the corpus. Accordingly,
if the Zipf curve is plotted on a log-log scale, a straight line
with a slope of −1 should be obtained. Figure 3 shows the resulting graph for
each class. It is observed that the curve obeys Zipf's law, as it follows a
slope of −1. Considering all the evaluation measures, namely the 93.33% agreement
score, the 0.920 κ score, and adherence to Zipf's law, the developed corpus (i.e.,
BEmoD) can be used to classify basic emotions from Bengali text expressions.
This work provides a primary foundation for detecting humans' six basic emotions
from Bengali text expressions.
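The Zipf check can be reproduced in a few lines. A minimal sketch, assuming the
corpus is available as a list of tokens (the toy token list here is hypothetical):

# Checking Zipf's law: slope of log(frequency) vs. log(rank) should be near -1.
from collections import Counter
import numpy as np

tokens = ["bhalo", "bhalo", "khub", "bhoy", "bhoy", "bhoy", "hothat"]   # toy corpus
freqs = np.array(sorted(Counter(tokens).values(), reverse=True), dtype=float)
ranks = np.arange(1, len(freqs) + 1)

slope, _ = np.polyfit(np.log(ranks), np.log(freqs), 1)
print(f"Zipf slope = {slope:.2f} (a value close to -1 indicates Zipf-like behavior)")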

Table 5. Top 10 highest frequency emotion words in BEmoD.

Count Bengali Word English Equivalent Frequency


1 Fear 259
2 Surprised 168
3 Trouble 148
4 Beautiful 93
5 Bad 59
6 Anger 54
7 Suddenly 50
8 Love 48
9 Bad 40
10 Damn it 34

Fig. 3. Distribution of word frequencies: (a) Zipf curve; (b) Zipf curve on a log-log scale

6.1 Comparison with Existing Bengali Emotion Datasets

We investigated the available datasets on Bengali text emotions and identified
their characteristics in terms of the number of data, the number of emotion cat-
egories, and the types of emotions. Table 6 summarizes the properties of several
available datasets along with the developed BEmoD. The summary indicates
that the developed dataset is larger than the available datasets.
Classification of implicit emotion is the most critical problem because of
its unapparent nature within the expression; thus, its solution demands
interpretation of the context. Emotions are complicated; humans often face problems
in expressing and understanding emotions. Classifying or detecting emotions in text
increases the complexity of interpreting emotion because of the lack of apparent
facial expressions, non-verbal gestures, and voice [25]. Recognizing emotion
automatically is a complicated task: a machine has to handle the difficulty of
linguistic phenomena and the context of the written expression.

Table 6. Comparative illustrations of Bengali emotion datasets.

Dataset No. of data No. of class Types of emotions/sentiment


Rahman et al. [18] 2900 3 Positive, negative, neutral
Maria et al. [7] 2800 3 Positive, negative, neutral
Ruposh et al. [20] 1200 6 Happy, sad, anger, fear, surprise,
disgust
BEmoD 5200 6 Joy, fear, anger, sadness, surprise,
disgust

7 Conclusion
Emotion recognition and classification are still developing areas of research, and
the challenges for low-resource languages are daunting. The scarcity of bench-
mark datasets is one of the vital challenges in performing the emotion classification
task for the Bengali language. Thus, in this work, we presented a new corpus
(called BEmoD) for emotion classification in Bengali texts and explained its
development process in detail. Although a few datasets are available for emo-
tion classification in Bengali (mostly considering positive, negative, and neutral
classes), this work adopted the six basic emotions: joy, fear, anger, sad-
ness, surprise, and disgust. This work revealed several features of emotion texts,
especially concerning each different class, exploring what kinds of emotion words
humans use to express a particular emotion. The evaluation of BEmoD shows
that the developed dataset follows the distribution of Zipf's law and main-
tains agreement among annotators with an excellent κ score. However, the
current version of BEmoD has 5200 emotion texts. This amount is not sufficient
for deep learning algorithms. Therefore, more data samples should be
collected, together with implicit and neutral emotion categories. BEmoD can also
be extended to annotate emotion expressions in various domains. These are left
for future research.

References
1. Liu, B.: Sentiment analysis and subjectivity, 1–38 (2010)
2. Garg, K., Lobiyal, D.K.: Hindi emotionnet: a scalable emotion lexicon for sentiment
classification of hindi text. ACM Trans. Asian Low-Resour. Lang. Inf. Process.
19(4), 1–35 (2020)
3. Eckman, P.: Universal and cultural differences in facial expression of emotion. In:
Nebraska Symposium on Motivation, vol. 19, pp. 207–284 (1972)
4. Alm, O.C., Roth, D., Richard, S.: Emotions from text: machine learning for text-
based emotion prediction. In: Proceeding in HLT-EMNLP, pp. 579–586. ACL, Van-
couver, British Columbia, Canada (2005)
5. Aman, S., Szpakowicz, S.: Identifying expressions of emotion in text. In: Inter-
national Conference on Text, Speech and Dialogue, pp. 196–205. Springer, Berlin
(2007)

6. Scherer, K.R., Wallbott, H.G.: Evidence for universality and cultural variation of
differential emotion response patterning. J Per. Soc. Psy. 66(2), 310–328 (1994)
7. Pontiki, M., Galanis, D., Pavlopoulos, J., Papageorgiou, H., Androutsopoulos,
I., Manandharet, S.: Semeval-2014 task 4: aspect based sentiment analysis. In:
International Workshop on Semantic Evaluation, pp. 27–35. ACL, Dublin, Ireland
(2014)
8. Al-Smadi, M., Qawasmeh, O., Talafha, B., Quwaider, M.: Human annotated Ara-
bic dataset of book reviews for aspect based sentiment analysis. In: International
Conference on Future Internet of Things and Cloud, pp. 726–730. IEEE, Rome,
Italy (2015)
9. Ales, T., Ondrej, F., Katerina, V.: Czech aspect-based sentiment analysis: a new
dataset and preliminary results. In: ITAT, pp. 95–99 (2015)
10. Apidianaki, M., Tannier, X., Richart, C.: Datasets for aspect-based sentiment anal-
ysis in French. In: International Conference on Lan. Res. & Evaluation, pp. 1122–
1126. ELRA, Portorož, Slovenia (2016)
11. Mohammad, S., Bravo-Marquez, F., Salameh, M., Kiritchenko, S.: Semeval-2018
task 1: affect in tweets. In: International Workshop on Semantic Evaluation, pp.
1–17. ACL, New Orleans, Louisiana (2018)
12. Chatterjee, A., Narahari, K.N., Joshi, M., Agrawal, P.: Semeval-2019 task 3:
emocontext: contextual emotion detection in text. In: International Workshop on
Semantic Evaluation, pp. 39–48. ACL, Minneapolis, Minnesota, USA (2019)
13. Vijay, D., Bohra, A., Singh, V., Akhtar, S.S., Shrivastava, M.: Corpus creation and
emotion prediction for hindi-english code-mixed social media text. In: Proceedings
of the 2018 Conference of the North American Chapter of the Association for
Computational Linguistics: Student Research Workshop, pp. 128–135 (2018)
14. Das, D., Bandyopadhyay, S.: Word to sentence level emotion tagging for Bengali
blogs. In: Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pp.
149–152 (2009)
15. Strapparava, C., Valitutti, A., et al.: Wordnet affect: an affective extension of
WordNet. In: LREC, vol. 4, p. 40. Citeseer (2004)
16. Prasad, S.S., Kumar, J., Prabhakar, D.K., Tripathi, S.: Sentiment mining: an app-
roach for Bengali and Tamil tweets. In: 2016 Ninth International Conference on
Contemporary Computing (IC3), pp. 1–4. IEEE (2016)
17. Tripto, N.I., Ali, M.E.: Detecting multilabel sentiment and emotions from Bangla
youtube comments. In: 2018 International Conference on Bangla Speech and Lan-
guage Processing (ICBSLP), pp. 1–6. IEEE (2018)
18. Rahman, A., Dey, E.K.: Datasets for aspect-based sentiment analysis in Bangla
and its baseline evaluation. Data 3(2), 15 (2018)
19. Sharif, O., Hoque, M.M., Hossain, E.: Sentiment analysis of Bengali texts on online
restaurant reviews using multinomial naïve Bayes. In: International Conference on
Advance in Science, Engineering & Robotics Technology, pp. 1–6. IEEE, Dhaka,
Bangladesh (2019)
20. Ruposh, H.A., Hoque, M.M.: A computational approach of recognizing emotion
from Bengali texts. In: International Conference on Advances in Electrical Engi-
neering (ICAEE), pp. 570–574. IEEE, Dhaka, Bangladesh (2019)
21. Dash, N.S., Ramamoorthy, L.: Utility and Application of Language Corpora.
Springer (2019)
22. Accessible dictionary. https://accessibledictionary.gov.bd/. Accessed 2 Jan 2020
23. Full emoji list. https://unicode.org/emoji/charts/full-emoji-list.html. Accessed 7
Feb 2020

24. Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Meas.
20(1), 37–46 (1960)
25. Alswaidan, N., Menai, M.B.: A survey of state-of-the-art approaches for emotion
recognition in text. Knowl. Inf. Syst. 62, 2937–2987 (2020)
26. Landis, J.R., Koch, G.G.: The measurement of observer agreement for categorical
data. Biometrics, 159–174 (1977)
Advances in Engineering
and Technology
Study of the Distribution Uniformity
Coefficient of Microwave Field of 6 Sources
in the Area of Microwave-Convective Impact

Dmitry Budnikov(&), Alexey N. Vasilyev, and Alexey A. Vasilyev

Federal State Budgetary Scientific Institution “Federal Scientific


Agroengineering Center VIM” (FSAC VIM),
1-st Institutskij 5, 109428 Moscow, Russia
{dimm13,vasilev-viesh}@inbox.ru, lexoff@mail.ru

Abstract. The development of processing modes using electrical technologies


and electromagnetic fields can reduce the energy intensity and cost of grain heat
treatment processes. During development, it is necessary to consider the tech-
nological requirements of the processed material, types of used technology, the
mode of operation of the mixing equipment (continuous, pulse etc.). In addition,
it is necessary to ensure uniform processing of the grain layer. Thus, the purpose
of the work is to experimentally evaluate the uniformity of the microwave field
distribution in the zone of microwave-convective grain processing of a labora-
tory installation containing 6 microwave energy sources. The article presents the
scheme of the microwave convective processing zone in which experimental
studies were conducted, the factors considered in the experiment and the levels
of their variation. An experiment was performed to determine the uniformity
coefficient of propagation of the microwave field in a layer of grain material. It
was found that the results of the study of the microwave field strength in the
grain mass can be used in the study of the dielectric properties of the processed
grain, as well as the construction of a control system for the microwave con-
vective processing plants.

Keywords: Electrophysical effects · Post-harvest treatment · Microwave field · Uniform distribution

1 Introduction

Many agricultural processing operations, such as drying, are characterized by high
energy intensity. In this connection, the development of energy-efficient equipment
remains relevant. Nowadays the development of energy-efficient equipment is con-
nected with accumulating and handling large amounts of information. Data obtained
in this way can be used in optimized control algorithms for the equipment and also
in deep learning for building smart control systems. Improving the post-harvest pro-
cessing of grain does not lose its relevance. At the same time, the improvement of these
processes using electrophysical effects is becoming increasingly popular [3, 4, 8, 10,
11, 15]. Many researchers consider exposure to ultrasound, ozone-air mixtures, aero-
ions, infrared radiation, the microwave field, etc. At the same time, almost all of these


factors are associated with a high unevenness of impact on the volume of the processed
material. Thus, there is a need to assess the uniformity of the studied factor over the
layer of the processed material [1, 2, 6, 7, 12]. This article considers an experi-
mental study of the propagation of the microwave field in a dense layer of wheat
subjected to microwave-convective drying.

2 Main Part
2.1 Research Method
In order to take into account the influence of the uneven distribution of the field over
the volume, we introduce the coefficient of uniform distribution over the layer, K_un.
This coefficient is the ratio of the average field strength in the chamber volume to the
maximum one. Equivalently, it can be calculated as the square root of the ratio of the
average power absorbed by the grain material in the volume of the product pipeline to
the maximum one:

$K_{un} = \frac{E_{mid}}{E_{max}} = \sqrt{\frac{Q_{mid}}{Q_{max}}} = \sqrt{\frac{5.56 \cdot 10^{-11}\, E_{mid}^2\, f\, \varepsilon'\, \tan\delta}{5.56 \cdot 10^{-11}\, E_{max}^2\, f\, \varepsilon'\, \tan\delta}}$,   (1)

where E_mid is the average value of the amplitude of the electric field in the volume of
the processed material, V/m; E_max is the maximum value of the amplitude of the electric
field in the volume of the microwave-convective processing zone, V/m; Q_mid and Q_max are
the average and maximum specific power dissipated in the dielectric material, W/m³;
f is the field frequency, Hz; ε′ is the relative permittivity of the material; tan δ is
the dielectric loss tangent.
If we take into account the fact that the depth of penetration into the material is the
distance at which the amplitude of the incident wave decreases by a factor of e, then the
minimum uniformity coefficient K_un min that should be taken into account can be
defined as K_un min = 1/e = 0.368.
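Given measured or simulated field amplitudes at control points, the coefficient from
Eq. (1) reduces to a one-line computation. A minimal sketch with hypothetical sample
values:

# Uniformity coefficient K_un = E_mid / E_max from sampled field amplitudes.
import numpy as np

E = np.array([410.0, 530.0, 620.0, 480.0, 700.0, 390.0])   # V/m, hypothetical
K_un = E.mean() / E.max()
print(f"K_un = {K_un:.3f} (minimum acceptable K_un_min = 1/e = 0.368)")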
Further, the uniformity coefficient of propagation of the electromagnetic field can
be used to assess the applicability of the design of the microwave-convective treatment
zone together with the applied waveguides.
To solve the problem of propagation of electromagnetic waves in a waveguide with
a dielectric, it is necessary to solve two-dimensional Helmholtz equations in various
sections of the waveguide [4, 5, 9, 12], in particular in an empty waveguide (product
pipeline) and in one filled with dielectric, and also to write the boundary conditions
connecting these equations.

2.2 Modelling
This problem can be solved numerically. Currently, a large number of spe-
cialized software packages using various numerical methods are available [5].
CST Microwave Studio was chosen as the framework for conducting the numerical
experiment. The most flexible calculation method, implemented in Microwave Studio
as a transient solver, can characterize the device over a wide frequency range from a
single transient calculation (as opposed to the frequency method, which

requires analysis at many frequency points). The results obtained by this product are
based on the Finite Integration Technique (FIT), which is a sequential scheme for
discretizing Maxwell's equations in integral form.
The desired coefficient can be calculated not only from the presented simulation
but also from experimental measurements at control points. Table 1 presents
the calculated coefficient for the given dimensions of the product pipeline and three
types of applied waveguides, for the propagation of the microwave field in a dense layer
of wheat with a humidity of 16%.
It was found that the results of modeling the distribution of the electromagnetic
field in the zone of microwave convective influence of the installation containing
sources of microwave power for processing the grain layer indicate a high level of its
unevenness in the volume of the product pipeline.

Table 1. Calculated uniformity coefficient K_un of propagation of the electromagnetic field in wheat.

Product section, mm × mm   W, %
                           14       16       20       24       26
200 × 200                  0.4946   0.4332   0.3827   0.3502   0.3278
200 × 300                  0.3128   0.3022   0.2702   0.2546   0.2519

The obtained data on the dependence of the uniformity coefficient of the electro-
magnetic field propagation in the zone of microwave-convective exposure on the
moisture content of wheat can be approximated by a third-degree polynomial of the form:

$K_{un} = b_0 + b_1 W + b_2 W^2 + b_3 W^3$,   (2)

where b_0, b_1, b_2, b_3 are proportionality coefficients.

Table 1 shows the results of field propagation modeling in wheat, and Table 2
shows the corresponding values of the b_0, b_1, b_2, b_3 coefficients.

Table 2. Calculated values of the proportionality coefficients for calculating the uniformity
coefficient.

Product section, mm × mm   b_0      b_1       b_2       b_3       R²
200 × 200                  2.2753   −0.2525   0.0114    −0.0002   0.9992
200 × 300                  0.1887   0.0336    −0.0023   4·10⁻⁵    0.9972
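The cubic approximation of Eq. (2) can be reproduced from the Table 1 data with a
least-squares fit. A minimal sketch for the 200 × 200 mm section; the fitted
coefficients differ slightly from Table 2 because the published values are rounded:

# Fitting K_un(W) = b0 + b1*W + b2*W^2 + b3*W^3 to the Table 1 data.
import numpy as np

W = np.array([14.0, 16.0, 20.0, 24.0, 26.0])              # wheat moisture, %
K = np.array([0.4946, 0.4332, 0.3827, 0.3502, 0.3278])    # 200 x 200 mm section

b3, b2, b1, b0 = np.polyfit(W, K, 3)   # polyfit returns the highest degree first
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.5f}, b3 = {b3:.6f}")
print("K_un(18%) ~", round(np.polyval([b3, b2, b1, b0], 18.0), 4))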

The next step is conducting an experimental check of the distribution uniformity
coefficient in the laboratory installation.

2.3 Experimental Research


Measurement of the field strength at a given point in the product pipeline is carried out
experimentally. The results of the study of the microwave field strength in the grain
mass can be used in the study of the dielectric properties of the processed grain, as well
as in the construction of a control system for the design of microwave-convective pro-
cessing plants.
The experimental studies presented further were conducted for the installation
containing six sources of microwave power. In this case, the response function was the
intensity of the acting electric field. The research was carried out over the volume of the
product pipeline with dimensions of 200 × 200 × 600 mm relative to the central point
(Fig. 1). The step along the coordinate axes was 50 mm. Thus, on the 0X and 0Y axes
the values ranged from −100 to +100 mm; on the 0Z axis, from −300 to +300 mm.

Fig. 1. Product line model 200 × 200 × 600 mm in cross section: (a) three-dimensional model;
(b) simulation results.

The experiment was conducted for wheat at three moisture levels: 14, 20, and 26%.
Taking into account the fact that even one experiment in this case includes
325 measurement points, and given the symmetry of the design, the number of control
points can be reduced by a factor of 4. Moreover, in order to exclude the mutual
influence of the sensors on each other, they can be distributed throughout the volume.

2.4 Results and Discussion


The regression models obtained in this case are of low quality, and it is better to use the
Regression Learner tool of the Matlab application software package to describe the data.
Consideration of 19 models shows that the Rational Quadratic GPR model has the best

quality indicators, and it can be improved by accumulating additional statistical data.
This model can be used in developing control algorithms as a data source on the
behavior of the grain layer during processing. For a more detailed understanding of the
distribution of the driving forces of the heat-moisture transfer process in the grain layer
under exposure to the microwave field, a nonlinear regression model presented
earlier [13, 14] can be used.
Based on the obtained data, the uniformity coefficient of the electromagnetic field
propagation can be calculated.
The Table 3 shows values of the uniformity coefficient of the electromagnetic field
propagation calculated from experimental data.

Table 3. The uniformity coefficient of the electromagnetic field propagation calculated from
experimental data.
W, % Kun1 Kun2 Kun3 Kun4 Kun5 Kun mid
16 0.38128 0.390358 0.402989 0.426276 0.357204 0.391621
20 0.28125 0.340313 0.330938 0.300313 0.33 0.316563
24 0.252511 0.250549 0.271037 0.247713 0.256229 0.257601

Figure 2 shows the experimental dependence of the uniformity coefficient of the


electromagnetic field propagation on the humidity of processed wheat. In this case, line
1 is the dependence obtained based on experimental studies; line 2 is a straight line
corresponding to 0.368.

[Figure 2 plots the uniformity coefficient against wheat moisture W = 15–25%; line 1 is the
experimental curve, approximated by K_un = 0.0005·W² − 0.0369·W + 0.8528; line 2 is the
constant 0.368 level.]

Fig. 2. Experimental values of the uniformity coefficient of the electromagnetic field
propagation.

The differences between the experimental data and the calculated data obtained on the
basis of electrodynamic modeling are 15–20% and may be due to inaccurate information
about the dielectric properties of the material, its contamination, measurement errors, etc.

3 Conclusions

Based on the foregoing, we can conclude the following:


1. The results of the study of the microwave field strength in the grain mass can be
used in the study of the dielectric properties of the processed grain.
2. The results of the study of the microwave field strength in the grain mass can be
used in the construction of a control system for the microwave convective pro-
cessing plants.
3. The values and regularities in the uniformity coefficient of the electromagnetic field
distribution in the grain layer indicate a difference between reference data on the
dielectric properties of grain materials and their actual values for the layer of the
processed material.
4. The values of the coefficient of uniformity in the considered options for a dense
layer of wheat are in the range 0.247713–0.426276.

References
1. Agrawal, S., Raigar, R.K., Mishra, H.N.: Effect of combined microwave, hot air, and
vacuum treatments on cooking characteristics of rice. J. Food Process Eng. e13038 (2019).
https://doi.org/10.1111/jfpe.13038
2. Ames, N., Storsley, J., Thandapilly, S.J.: Functionality of beta-glucan from oat and barley
and its relation with human health. In: Beta, T., Camire, M.E. (eds.) Cereal Grain-Based
Functional Foods, pp. 141–166. Royal Society of Chemistry, Cambridge (2019)
3. Bansal, N., Dhaliwal, A.S., Mann, K.S.: Dielectric characterization of rapeseed (Brassica
napus L.) from 10 to 3000 MHz. Biosyst. Eng. 143, 1–8 (2016). https://doi.org/10.1016/j.
biosystemseng.2015.12.014
4. Basak, T., Bhattacharya, M., Panda, S.: A generalized approach on microwave processing
for the lateral and radial irradiations of various Groups of food materials. Innov. Food Sci.
Emerg. Technol. 33, 333–347 (2016)
5. Budnikov, D.A., Vasilev, A.N., Ospanov, A.B., Karmanov, D.K., Dautkanova, D.R.:
Changing parameters of the microwave field in the grain layer. J. Eng. Appl. Sci. N11
(Special Issue 1), 2915–2919 (2016)
6. Dueck, C., Cenkowski, S., Izydorczyk, M.S.: Effects of drying methods (hot air, microwave,
and superheated steam) on physicochemical and nutritional properties of bulgur prepared
from high-amylose and waxy hull-less barley. Cereal Chem. 00, 1–3 (2020). https://doi.org/
10.1002/cche.10263
7. Izydorczyk, M.S.: Dietary arabinoxylans in grains and grain products. In: Beta, T., Camire,
M.E. (eds.) Cereal Grain-Based Functional Foods, pp. 167–203. Royal Society of
Chemistry, Cambridge (2019)
8. Nelson, S.O.: Dielectric Properties of Agricultural Materials and Their Applications.
Academic Press, Cambridge (2015). 229 p.

9. Pallai-Varsányi, E., Neményi, M., Kovács, A.J., Szijjártó, E.: Selective heating of different
grain parts of wheat by microwave energy. In: Advances in Microwave and Radio Frequency
Processing, pp. 312–320 (2007)
10. Ranjbaran, M., Zare, D.: Simulation of energetic- and exergetic performance of microwave-
assisted fluidized bed drying of soybeans. Energy 59, 484–493 (2013). https://doi.org/10.
1016/j.energy.2013.06.057
11. Smith, D.L., Atungulu, G.G., Sadaka, S., Rogers, S.: Implications of microwave drying
using 915 MHz frequency on rice physicochemical properties. Cereal Chem. 95, 211–225
(2018). https://doi.org/10.1002/cche.10012
12. Vasilev, A.N., Budnikov, D.A., Ospanov, A.B., Karmanov, D.K., Karmanova, G.K.,
Shalginbayev, D.B., Vasilev, A.A.: Controlling reactions of biological objects of agricultural
production with the use of electrotechnology. Int. J. Pharm. Technol. (IJPT) 8(N4), 26855–
26869 (2016)
13. Vasiliev, A.N., Ospanov, A.B., Budnikov, D.A., et al.: Improvement of grain drying and
disinfection process in the microwave field. Monograph. Nur-Print, Almaty (2017). 155 p.
ISBN 978-601-7869-72-4
14. Vasiliev, A.N., Goryachkina, V.P., Budnikov, D.: Research methodology for microwave-
convective processing of grain. Int. J. Energy Optim. Eng. (IJEOE) 9(2), 11 (2020). Article:
1. https://doi.org/10.4018/IJEOE.2020040101.
15. Yang, L., Zhou, Y., Wu, Y., Meng, X., Jiang, Y., Zhang, H., Wang, H.: Preparation and
physicochemical properties of three types of modified glutinous rice starches. Carbohyd.
Polym. 137, 305–313 (2016). https://doi.org/10.1016/j.carbpol.2015.10.065
Floor-Mounted Heating of Piglets with the Use
of Thermoelectricity

Dmitry Tikhomirov1(&) , Stanislav Trunov1, Alexey Kuzmichev1,


Sergey Rastimeshin2, and Victoria Ukhanova1
1 Federal Scientific Agroengineering Center VIM, 1-st Institutskij 5,
109456 Moscow, Russia
tihda@mail.ru, alla-rika@yandex.ru, alkumkuzm@mail.ru, v.ukhanova@owen.ru
2 Russian State Agrarian University - Moscow Timiryazev Agricultural Academy,
Timiryazevskaya st., 49, 127550 Moscow, Russia
resurs00@mail.ru

Abstract. The article discusses the problem of energy saving when providing
comfortable conditions for keeping young animals. The question of the use of
thermoelectric modules as a source of thermal energy in installations for local
heating of piglets is considered. A functional technological scheme of floor-
mounted heating of suckling piglets is proposed. In the scheme the energy of the
hot circuit of thermoelectric modules is used to heat the panel. The energy of the
cold circuit of the thermoelectric module is used to assimilate the heat from the
removed ventilation air. The heat exchange of the heated panel with the envi-
ronment is considered. Much attention is paid to the study of the prototype of
thermoelectric installation and its laboratory tests. The energy efficiency of using
thermoelectric modules as energy converters in thermal technological processes
of local heating of young animals has been substantiated.

Keywords: Thermoelectricity · Pigsty · Local heating · Thermoelectric module · Energy saving

1 Introduction

The development of a comfortable temperature regime in a hog house is an important


and urgent task in animal husbandry. In the cold season it is necessary to create two
different temperature regimes for sows and suckling piglets. The air temperature should
be maintained at the level of 18…20 °C for sows while in the zone of piglets’ location
it should be about 30 °C and should be decreased for weaned piglets (in 26 days) to
22 °C [1].
The cold floor causes intense cooling of animal bodies and, in addition, the air above
the surface is saturated with moisture and ammonia. This air is poorly removed by
ventilation, and animals are forced to breathe it. Such conditions cause colds and animal
deaths. A means of combating such phenomena is the use of heated floors and heat-
insulating mats [2]. It is necessary to heat separate areas on which animals are resting.


Various types of equipment have been developed for heating young animals [3, 4].
However, the most widespread in practice are infrared electric heaters and heating
panels [5]. The unevenness of the heat flow in the area where animals are located and
the incomplete correspondence of the radiation spectrum to the absorption spectrum of
heat waves by animals are the main disadvantages of IR heaters. In addition, a common
disadvantage of local heaters is rather high energy consumption.
Heat and cold supply by means of heat pumps, including thermoelectric ones,
belongs to the field of energy-saving, environmentally friendly technologies and
is becoming more widespread in the world [6].
The analysis of the use of Peltier thermoelectric modules (thermoelectric assem-
blies) shows that thermoelectric assemblies are compact heat pumps allowing for the
creation of energy-efficient plants for use in various technological processes of agri-
cultural production [7–10].
The purpose of the research is to substantiate the parameters and develop an energy
saving installation for local heating of piglets using Peltier thermoelectric elements.

2 Materials and Methods

A sow is located in the area of the stall, the piglets are mainly located in the area fenced
off from the sow. In this area there is a heated panel where the piglets rest. Suckling
piglets freely pass through the passage in the separation grate to the sow for feeding.
Based on our analysis of various schemes of thermoelectric assemblies [11, 12], it
was found that, in terms of energy and structural parameters, the "liquid-air" heat
exchange scheme should be adopted for creating a local heating system for piglets
using a thermoelectric heat pump. The technological scheme of the installation of
local floor-mounted heating of piglets is shown in Fig. 1.
The installation consists of a thermoelectric assembly, which in turn includes the
estimated number of Peltier elements 3, the air cooler of cold circuit 1 with an exhaust
fan 2 installed in the bypass of a pigsty ventilation system, the water cooler of hot
junction 4 connected by direct and return pipes 8 through a circulation pump 9 with a
heated panel 6. Thermal power of the thermoelectric assembly is regulated by changing
the current strength flowing through the thermocouples. Power, control and manage-
ment of the circuit is carried out by the block 5.
To assess the performance of a thermoelectric water heater circulating through a
heating panel, consider the heat balance equations for hot and cold junctions of a
thermoelectric element [13], which have the form:

Qph þ 0; 5QR ¼ QH þ QT ; ð1Þ

QC þ QT þ 0; 5QR ¼ Qpc ; ð2Þ

Qpc ¼ eTC I; ð3Þ


1148 D. Tikhomirov et al.

Qph ¼ eTH I; ð4Þ


p p
where Qh , Qc are Peltier heat of hot and cold junctions, J; QH is heat transferred by
hot junction to an heated object, J; QT is heat transferred by the thermal conductivity
from hot junction to cold one, J; QC is heat taken from the environment, J; QR is Joule-
Lenz heat, J; e is Seebeck coefficient, lV/K; TC and TH are temperatures of hot and
cold junctions, K; I is current strength in a thermocouple circuit, A.

Fig. 1. Functional-technological scheme of the installation of local floor-mounted heating of
piglets using thermoelectricity. 1 is air cooler of cold circuit; 2 is fan; 3 are Peltier elements; 4 is
water cooler of hot junction; 5 is power and control unit; 6 is heated panel; 7 are temperature
sensors; 8 is pipe; 9 is pump.

Since Q_H and Q_C represent the amount of heat per unit time, the work of electric
forces (power consumption) can be determined by Eq. (5):

$W = Q_H - Q_C$.   (5)

Taking into account Eqs. (1) and (2), as well as relations (3) and (4), Eq. (5) can
be rewritten in the following form:

$W = e I (T_H - T_C) + I^2 R$,   (6)

where $R$ is the resistance of the thermocouple branch, Ohm.


From the analysis of the equation it is seen that the power consumed by the
thermocouple W is spent on overcoming thermo-EMF and active resistance. In this
Floor-Mounted Heating of Piglets with the Use of Thermoelectricity 1149

case, the thermocouple works like a heat pump, transferring heat from the environment
to the heated object.
To analyze the energy efficiency of heat pumps, turn to Eq. (5), which can be
rewritten in the following form:

1 ¼ QH =W  QC =W ¼ kh  kc ; ð7Þ

where kh is heating factor, kc is cooling factor.


Moreover, the heating coefficient kh > 1.
Given the Eqs. (1), (6) and (7) the heating coefficient will be determined by Eq. (8):

Qpc þ 0; 5QR  QT
kh ¼ : ð8Þ
eIðTH  TC Þ þ I 2 R

The smaller the temperature difference between the junctions TH – TC is, the higher
the efficiency of the thermocouple in heating mode is.
From the point of view of electric energy consumption, the most economical mode
of operation is the mode of a heat pump, in which the heating coefficient tends to the
maximum. When operating in heating mode, there is no extreme dependence of the
heating coefficient on current.
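Equations (3)–(8) can be combined into a small numerical check. The sketch below
uses hypothetical couple-level parameters (Seebeck coefficient e, current I, resistance R,
junction temperatures, and a thermal conductance K modeling Q_T) purely for illustration:

# Heat-pump balance of a thermoelectric couple, Eqs. (3)-(8); illustrative values.
e = 0.04                   # Seebeck coefficient, V/K (hypothetical)
I, R = 3.0, 2.0            # current, A; branch resistance, Ohm
T_H, T_C = 310.0, 290.0    # hot and cold junction temperatures, K
K = 0.5                    # thermal conductance hot -> cold, W/K (hypothetical)

Q_T = K * (T_H - T_C)                        # conduction loss between junctions
Q_H = e * T_H * I + 0.5 * I**2 * R - Q_T     # heat to the object, from Eq. (1)
W = e * I * (T_H - T_C) + I**2 * R           # consumed electric power, Eq. (6)
k_h = Q_H / W                                # heating coefficient, Eq. (8)
print(f"Q_H = {Q_H:.1f} W, W = {W:.1f} W, k_h = {k_h:.2f} (k_h > 1: heat pump)")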
The thermoelectric assembly is a compact heat pump that absorbs thermal energy
from the cold side of the module and dissipates it on the hot side [14]. The use of
thermoelectric assemblies makes it possible to simultaneously heat and cool objects in
different technological processes whose temperature is not adequate to the ambient
temperature.
The power of a heating device for a heated panel should compensate for the heat loss of
the site heated to the required surface temperature [15]. Under the conditions of a
quasi-stationary regime, the energy balance of a heated area free of animals has the
form:

$P = P_1 + P_2 + P_3$,   (9)

where $P_1$ is the heat loss through the surface of the site to the room, W; $P_2$ is the heat
transfer from the side surfaces of the site, W; $P_3$ is the heat loss through the base to the
floor mass, W.
The magnitude of the convective component of heat transfer is determined by the
well-known Eq. (10):

$P_c = \alpha_c F (t_b - t_a)$,   (10)

where $t_a$ is the air temperature in the room, °C; $t_b$ is the required temperature on the
surface of the heated panel, °C; $\alpha_c$ is the heat transfer coefficient from the panel to the
environment, W/(m²·K); $F$ is the surface area of the heated panel, m².
These air parameters should be taken in accordance with the norms of technological
design of agricultural livestock facilities. The sizes of sites F vary depending on the
purpose of a farm. For piglets it is from 0.5 to 1.5 m2.

According to technological requirements, the value of t_a should be maintained in the
range of 18–22 °C (for rooms with weaned piglets and sows). The value of t_b should be in
the range of 32–35 °C [1]. The air mobility in the areas where animals are kept should not
exceed 0.2 m/s (Fig. 2).

Fig. 2. Calculation of the power of a heating panel. 1 is base; 2 is heat carrier; 3 is thermal
insulation.

The Reynolds number is determined by Eq. (11):

$Re = v_a l / \nu$,   (11)

where $v_a$ is the relative air mobility in the room, m/s; $l$ is the length of the heated panel
(the characteristic size), m; $\nu$ is the kinematic viscosity of air, m²/s.
The heat transfer coefficient in a laminar boundary layer is determined by Eq. (12):

$\alpha_c = 0.33 (\lambda / l)\, Re^{0.5}\, Pr^{1/3}$,   (12)

where $\lambda$ is the thermal conductivity coefficient of air, W/(m·K).


The radiant component of heat transfer by the heating panel is determined by
Eq. (13):

$P_r = C_0\, \varepsilon_b\, F \left[ (T_b/100)^4 - (T_{be}/100)^4 \right]$,   (13)

where $C_0 = 5.76$ W/(m²·K⁴) is the radiation coefficient of a completely black body; $\varepsilon_b$ is
the integral degree of blackness of the surface of the heated panel; $F$ is the surface area of
the heated panel, m²; $T_b$ and $T_{be}$ are the surface temperatures of the heated panel and of
the building envelope indoors, K.
The general losses from the surface of the heated panel as a result of convective and
radiant heat transfer are:

$P_1 = P_c + P_r$.   (14)

The coolant temperature is:

$t_t = t_b + P_1 \delta_b / (F \lambda_b)$,   (15)

where $t_b$ is the surface temperature of the heated panel (base), °C; $\delta_b$ is the thickness of
the base, m; $\lambda_b$ is the coefficient of thermal conductivity of the base material, W/(m·K).
Heat losses from the side surface P2 of the heated panel can be neglected due to its
small area.
A preliminary calculation of the heat loss of the heated panel through the floor, P_3, is
carried out taking into account a number of assumptions:
• the floor temperature in winter stays above zero and above the dew point;
• the heated panel does not affect the thermal regime of the floor at its location;
• the floor temperature is taken 5–10 °C below the calculated values to ensure
system stability.

$P_3 = (\lambda_{ins} / \delta_{ins})\, F\, (t_t - t_f)$,   (16)

where $\lambda_{ins}$ is the coefficient of thermal conductivity of the insulating material, W/(m·K);
$\delta_{ins}$ is the thickness of the insulation layer, m; $t_f$ is the floor temperature, °C.
The total thermal power of the energy carrier for heating the site is:

$P = P_1 + P_2 + P_3$.   (17)

Table 1 presents the calculation of the power of the heated panel, made for the
maximum allowable temperature and air velocity in the room for suckling piglets.

Table 1. Calculation of the power of the heated panel.


Parameter Designation Unit measure Value
Room temperature ta °C 18
Temperature on heated panel surface (max) tb °C 35
Relative room air mobility va m/s 0.2
Panel length l m 1.5
The area of the upper surface of a heated panel F m2 1.05
Kinematic viscosity of air ν m²/s 14.7·10⁻⁶
Thermal conductivity coefficient of air λ W/(m·K) 2.49·10⁻²
Reynolds number Re 20408
Convection heat transfer coefficient α_c W/(m²·K) 0.694
Convective component of heat loss Pc W 12.5
Degree of blackness of a heated panel surface eb 0.92
Surface temperature of a heated panel Tb K 308
Wall temperature Tbe K 283
Radiant component of heat loss P_r W 150.2
General losses from a heated panel surface P_1 W 162.7
Base thickness δ_b m 0.003
Thermal conductivity of the base material λ_b W/(m·K) 0.4
Coolant temperature t_t °C 36.2
Conductivity coefficient of thermal insulation λ_ins W/(m·K) 0.03
Thermal insulation layer thickness δ_ins m 0.04
Floor temperature t_f °C 5
Heat loss through the floor P_3 W 44.3
Power of the heated panel (max) P W 207.0

The estimated maximum power of the panel was about 207 W (Table 1). The
minimum rated power of the panel for its surface temperature of 32 °C will be 174 W
in accordance with the presented method.
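The Table 1 calculation chain (Eqs. (10)–(17)) can be scripted. The sketch below is a
rough re-derivation under stated assumptions: the Prandtl number (taken as 0.7) and the
intermediate rounding are not given in the paper, so the printed Table 1 values are
reproduced only approximately.

# Approximate heated-panel power budget, Eqs. (10)-(17); Pr = 0.7 is assumed.
t_a, t_b = 18.0, 35.0            # room and panel surface temperatures, deg C
v_a, l, F = 0.2, 1.5, 1.05       # air mobility m/s, panel length m, area m^2
nu, lam, Pr = 14.7e-6, 2.49e-2, 0.7

Re = v_a * l / nu                                    # ~20408
alpha_c = 0.33 * (lam / l) * Re**0.5 * Pr**(1/3)     # Eq. (12), ~0.69 W/(m^2*K)
P_c = alpha_c * F * (t_b - t_a)                      # Eq. (10), ~12.5 W

C0, eps_b, T_b, T_be = 5.76, 0.92, 308.0, 283.0
P_r = C0 * eps_b * F * ((T_b / 100)**4 - (T_be / 100)**4)   # Eq. (13)
P1 = P_c + P_r                                              # Eq. (14)

delta_b, lam_b = 0.003, 0.4
t_t = t_b + P1 * delta_b / (F * lam_b)               # Eq. (15), ~36 deg C

lam_ins, delta_ins, t_f = 0.03, 0.04, 5.0
P3 = (lam_ins / delta_ins) * F * (t_t - t_f)         # Eq. (16)
print(f"P_c = {P_c:.1f} W, P_r = {P_r:.1f} W, total P = {P1 + P3:.0f} W")  # Eq. (17)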

3 Results and Discussion

Based on the performed calculations, a prototype of the installation for local heating of
piglets was made (Fig. 3) using a heated floor panel with thermoelectric modules
(TEC1–12706). Water is used as a heat carrier that circulates in a closed loop.

Fig. 3. A prototype of a heated panel and thermoelectric assembly. 1 is heated panel; 2 is


circulation pump; 3 is thermoelectric assembly; 4 are water heating tubes; 5 are thermoelectric
elements; 6 is cold side radiator; 7 is fan; 8 is hot side radiator

The thermal energy and structural calculation of the thermoelectric assembly, cold
and hot circuit radiators was performed according to the methodology [16] and taking
Floor-Mounted Heating of Piglets with the Use of Thermoelectricity 1153

into account the results of [17]. The main parameters of the local floor-mounted
installation for heating piglets are given in Table 2.

Table 2. The main parameters of the installation of floor-mounted heating piglets.


Parameter Unit Value
measure
Mains voltage V 220
Power consumption W 120
Supply voltage of one Peltier element V 12
The number of thermoelectric elements in the assembly pcs 2
Panel heat power W 170
Temperature on the working surface of the heated panel at an ambient °C 32
temperature of 18 °C
Circulating fluid volume l 0.9
Panel size m 0.7 × 1.5

Figure 4 shows the experimental dependences of the power consumed from the
electrical network, P, and the power removed by the hot circuit of the thermoelectric
module, Q_H, on the mains voltage. After analyzing the averaged experimental depen-
dences, it is possible to conclude that the power removed by the hot circuit of the
thermoelectric module is 30% higher than the power consumed from the network
(U_nominal = 12 V). This occurs due to the intake of energy absorbed by the cold circuit
of the thermoelectric assembly from the warm air removed from the room. Using the
waste heat of removed ventilation air to supply the cold junction, it is possible to reduce
the power consumed from the electrical network to heat the panel itself. In the set mode,
this heat pump generates 1.3 kWh of thermal energy per 1 kWh of consumed electrical
energy.

Fig. 4. Energy characteristic of the thermoelectric module
Infrared heaters located above the piglets can be used if combined heating of the
piglets is necessary [18]. These might be irradiators [19] with uniform heat flux and
high reliability.

4 Conclusions

Analysis of the experimental results of the prototype and of the theoretical studies
suggests the possibility of using Peltier thermoelectric modules in local heating systems
for young animals as an energy-saving source of thermal energy.
The use of local heating systems based on Peltier elements as heat pumps can
significantly reduce the energy consumption (by up to 30%) for heating young animals
compared to existing direct heating plants (electric irradiators, heated floor mats,
brooders, etc.). The heat output of a panel with an area of 1.0–1.4 m² should be 170–
200 W, and the total electric power of the thermoelements 120–140 W.
The estimated payback period for local heaters of young animals based on Peltier
thermoelectric elements is about 3.5 years.
Further research will be aimed at determining the conditions for the optimal
operation of thermoelectric systems to obtain the maximum increase in power taken by
the hot circuit compared to the power consumed from an electrical network. The
justification of heat, power and structural parameters of local installations for heating
piglets will be also carried out.
The heat power of the panel is determined for static modes of heat exchange with
the environment. The study of heat exchange dynamics, taking into account the degree
of filling of the panel with piglets, will allow further development of an adaptive
control system to maintain a set temperature on its surface.

References
1. RD-APK 1.10.02.01–13. Metodicheskie rekomendacii po tekhnologicheskomu proek-
tirovaniyu svinovodcheskih ferm i kompleksov [Guidelines for the technological design of
pig farms and complexes]. Ministry of Agriculture of the Russian Federation, Moscow,
Russia (2012)
2. Trunov, S.S., Rastimeshin, S.A.: Trebovaniya k teplovomu rezhimu zhivotnovodcheskih
pomeshchenij s molodnyakom i predposylki primeneniya lokal’nogo obogreva [Require-
ments for the thermal regime of livestock buildings with young animals and the prerequisites
for the application of local heating]. Vestnik VIESKH 2(27), 76–82 (2017)
3. Valtorta, S.: Development of microclimate modification patterns in animal husbandry. In:
Stigter, K. (eds.) Applied Agrometeorology. Springer, Heidelberg (2010). https://doi.org/10.
1007/978-3-540-74698-0_92

4. Tikhomirov, D., Vasilyev, A.N., Budnikov, D., Vasilyev, A.A.: Energy-saving automated
system for microclimate in agricultural premises with utilization of ventilation air. Wireless
Netw. 26(7), 4921–4928 (2020). Special Issue: SI. https://doi.org/10.1007/s11276-019-
01946-3
5. Samarin, G.N., Vasilyev, A.N., Zhukov, A.A., Soloviev, S.V. Optimization of microclimate
parameters inside livestock buildings. In: Vasant, P., Zelinka, I., Weber, G.W. (eds.)
Intelligent Computing & Optimization. ICO 2018. Advances in Intelligent Systems and
Computing, vol 866. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-00979-3_35
6. SHostakovskij, P.G.: Teplovoj kontrol’ ob”ektov na baze termoelektricheskih sborok
[Thermal control of objects based on thermoelectric assemblies]. Komponenty i tekhnologii
9, 142–150 (2011)
7. Tihomirov, D.A., Trunov, S.S., Ershova, I.G., Kosolapova, E.V.: Vozdushno-teplovaya
zavesa s ispol’zovaniem termoelektricheskih modulej v doil’nom bloke fermy KRS [Air-
thermal curtain using thermoelectric modules in the milking unit of the cattle farm].
Vestnik NGIEI 1, 47–57 (2020)
8. Amirgaliyev, Y., Wojcik, W., Kunelbayev, M.: Theoretical prerequisites of electric water
heating in solar collector-accumulator. News Nat. Acad. Sci. Repub. Kaz. Ser. Geol. Tech.
Sci. 6, 54–63 (2019)
9. Trunov, S.S., Tikhomirov, D.A.: Termoelektricheskoe osushenie vozduha v sel’skohozya-
jstvennyh pomeshcheniyah [Thermoelectric air drainage in agricultural premises]. Nauka v
central’noj Rossii 2(32), 51–59 (2018)
10. Kirsanov, V.V., Kravchenko, V.N., Filonov, R.F.: Primenenie termoelektricheskih modulej
v pasterizacionno-ohladitel’nyh ustanovkah dlya obrabotki zhidkih pishchevyh produktov
[The use of thermoelectric modules in pasteurization-cooling plants for processing liquid
food products]. FGOU VPO MGAU, Moscow, Russia (2011)
11. Chen, J., Zhang, X.: Investigations of electrical and thermal properties in semiconductor
device based on a thermoelectrical model. J. Mater. Sci. 54(3), 2392–2405 (2019)
12. Reddy, B.V.K., Barry, M., Li, J.: Mathematical modeling and numerical characterization of
composite thermoelectric devices. Int. J. Therm. Sci. 67, 53–63 (2013)
13. Stary, Z.: Temperature thermal conditions and the geometry of Peltier elements. Energy
Convers. Manage. 33(4), 251–256 (1992)
14. Ismailov, T.A., Mirzemagomedova, M.M.: Issledovanie stacionarnyh rezhimov rabo-ty
termoelektricheskih teploobmennyh ustrojstv [Study of stationary modes of operation of
thermoelectric heat exchange devices]. Vestnik Dagestanskogo gosudarstvennogo tekhnich-
eskogo universiteta. Tekhnicheskie nauki 40(1), 23–30 (2016)
15. Wheeler, E.F., Vasdal, G., Flo, A., et al.: Static space requirements for piglet creep area as
influenced by radiant temperature. Trans. ASABE 51(1), 271–278 (2008)
16. Pokornyj, E.G., SHCHerbina, A.G.: Raschet poluprovodnikovyh ohlazhdayushchih ustrojstv
[Calculation of semiconductor cooling devices]. Nauka, Lipetsk, Russia (1969)
17. Trunov, S.S., Tihomirov, D.A., Lamonov, N.G.: Metodika rascheta termoelektricheskoj
ustanovki dlya osusheniya vozduha [Calculation Method for Thermoelectric Installation for
Air Drying]. Innovacii v sel’skom hozyajstve 3(32), 261–271 (2019)
18. Ziemelis, I, Iljins, U, Skele, A.: Combined local heating for piglets. In: International
Conference on Trends in Agricultural Engineerin, Prague, Czech Republic, pp. 441–445
(1999)
19. Kuz’michyov, A.V., Lyamcov, A.K., Tihomirov, D.A.: Teploenergeticheskie pokazateli IK
obluchatelej dlya molodnyaka zhivotnyh [Thermal energy indicators of IR irradiators for
young animals]. Svetotekhnika 3, 57–58 (2015)
The Rationale for Using Improved Flame
Cultivator for Weed Control

Mavludin Abdulgalimov1, Fakhretdin Magomedov2, Izzet Melikov2,


Sergey Senkevich3(&), Hasan Dogeev1, Shamil Minatullaev2,
Batyr Dzhaparov2, and Aleksandr Prilukov3
1
Federal Agrarian Scientific Center of the Republic of Dagestan,
St. A.Shakhbanova, 18a, Makhachkala 367014, Russia
mavludin.62@mail.ru, niva1956@mail.ru
2
Dagestan State Agrarian University named after M.M. Dzhambulatov,
St. M. Gadzhiev, 180, Makhachkala 367032, Russia
fahr-59@yandex.ru, izmelikov@yandex.ru,
djapa-rov.batir87@yandex.ru, interpol1199@mail.ru
3
Federal Scientific Agroengineering Center VIM, 1 st Institute pas. 5,
Moscow 109428, Russia
sergej_senkevich@mail.ru, chel.diagnost@gmail.com

Abstract. The work substantiates the problem of weed control and the need to
improve control methods in the production of agricultural products; the applied
control methods are considered. The design of an improved flame cultivator for
burning weeds in the near-stem strips of orchards and vineyards (rows and inter-rows
of perennial plantings) by the thermal method is given, as well as the technological
process of its operation and conclusions.

Keywords: Technological process · Control methods · Burning · Weeds · Flame cultivator · Stability · Flame correctability

1 Introduction

Environmental management approaches in the agricultural sector are taking on a global
scale at the present stage. Consumers focus their greatest interest on the quality of the
purchased product and the presence of harmful elements in it; this is due to the great
attention paid to the environmental friendliness of products produced by the agro-
industrial complex.
Weeds are always present in crops regardless of the level of development of agriculture
and the agrotechnical methods and technical tools used [1].
Weed control is an obligatory and costly agrotechnical operation. The choice among
different methods depends on the level of contamination, climate and soil conditions,
and also on the growth and development conditions of the cultivated crops [2, 3]. The
weed infestation of farmland makes the implementation of agricultural activities very
difficult, and weed vegetation contributes to a decrease in the quality of agricultural
products. The cost of weed destruction is about 30% of the total cost of preparing and
caring for crops.


It is necessary to understand the possible soil infestation with weed seeds in order
to analyze their composition and monitor their growth; the difference in weed
infestation depends on the concentration of weed seeds in the upper layer of the soil [4].
It is established that the decrease in productivity of cultivated crops reaches 35% or
more when they are infested. Weed vegetation not only reduces soil fertility, it also helps
to generate pests and diseases. Weed vegetation rapidly reduces the harvest, and its
existence complicates the harvesting process; besides, the quality of the grown product
is reduced.
The temperature of soils full of weeds decreases by 2 to 4 °C, which reduces
the vital activity of bacteria in them, thereby restraining the decay of organic
matter and reducing the amount of useful components [5]. This explains the
interest of agricultural producers in the development and integration of modern technical
means for weed control that reduce the cost of growing crops.
Under present conditions, agricultural production aims not at the absolute elimination
of weeds but at keeping them at a level that does not negatively affect the yield of
cultivated crops. The main factors of significant infestation are the high productivity,
viability, and resistance of weed vegetation to the protection measures used.
Classic measures of protection from weeds are chemical and preventive. Used
together, these measures make it possible to reduce the number of weeds to a cost-
effective threshold of harmfulness. They also have a number of other well-known
disadvantages. A certain proportion of herbicide solution particles carried away by
uncontrolled wind movement has a negative effect on the surrounding environment.
Weeds also mutate, and the effect of most chemical weed-destruction methods on their
varieties is reduced [6]. For this reason, researchers from many countries are
searching for other ways to destroy weed vegetation. Thus, fire methods of
weed control have general scientific and practical significance, as do classical
methods [7].

2 Main Part

The purpose of this research is the development of an advanced flame cultivator to increase the productivity of burning weeds.
The tasks set to achieve this goal are:
– to analyze the existing methods, ways, technologies and technical tools for weed control;
– to develop the design of an advanced flame cultivator for destroying weeds by burning.

Materials and Methods

The combination of agricultural operations with the use of advanced machines and technical tools is one of the fundamental conditions for the destruction of weed vegetation. The ways and methods of destroying the various types of weed vegetation are as multiform as the weeds themselves [8]. The agrotechnical method of weed destruction makes it possible to provoke weed germination with further

elimination of the formed sprouts. The practical experience of agricultural production demonstrates that the smallest amount of weed vegetation can be achieved only with the combined use of preventive and destructive (agrotechnical, mechanical, chemical, biological) measures [9, 10].
The study of various physical methods of weed control (electrothermal soil treatment, high-voltage electric current, ultrasound, laser beam, microwave energy, fire cultivation, and others) is actively carried out alongside the mentioned methods both abroad and in our country. Electrophysical methods of weed destruction are not widely used in the agricultural sector because of the need for contact with the weeds; despite their environmental cleanliness, such methods also do not affect weed seedlings located in the shallow soil cover [11, 12]. Fire cultivation is used to destroy weeds and their seeds on cultivated land, some arable land, and irrigation channels at the end of harvesting. These works are carried out by flame cultivators equipped with burners that operate on oil fuel or natural gas. Control of early weeds from 3.0 to 5.0 cm in height consists in heating them to a temperature of more than 50 °C. This leads to dehydration, protoplasm clotting, drying of stems and leaves and, as a result, to their death. The lethal temperature for formed weeds is about 300 °C for annuals and from 1000 to 1200 °C for perennials. The effectiveness of destroying seeds with a flame cultivator is in the range of 90 to 95% when a temperature of 250 to 300 °C at the outlet of its burners affects the treated surface.
The energy effect on the weed seeds germination is shown in Fig. 1.

Fig. 1. Weed seed germination dependence on the impact energy

The use of the flame cultivator in vegetable growing is also effective. Good results can be seen when using this method on areas sown with onions, beets, carrots and other crops characterized by delayed germination. Flame cultivators do not have a negative impact on the soil structure due to their lightweight design [13].
Besides that, destruction of weed vegetation on the ground surface by burning, during which a short-term high-temperature effect on the weed seedlings is produced, is becoming more relevant. Heat exposure makes it possible to control weeds while having almost no impact on the microbiological and physicochemical properties of the soil [14]. Using this method of weed destruction reduces the number of treatments and the negative impact of the machine's propellers on the soil [15]. This is due to a reduction in the number of passes over the cultivated area, decelerating the formation of soil erosion [16]. It should be noted that this contributes to the preservation of soil fertility [17].
The search for ways to reduce the number of passes over the treated area by using modern technical tools, including machines for burning weeds, is one of the conditions for solving the considered problem.
The dependence of the weed vegetation damage level on the impact energy is shown in Fig. 2.

Fig. 2. Weed vegetation damage level dependence on the impact energy

An increase in the amount of weed vegetation in areas where crops are cultivated increases its harmful impact and is accompanied by a decrease in crop productivity. This circumstance motivated a mathematical interpretation of the quantitative relationship between the excess weed vegetation and the yield of a particular crop, which is described quite reliably by an exponential regression equation:

$Y = a e^{-bx} + c$,   (1)

where Y is the crop productivity of cultivated crops in a particular area with a given weed infestation, %, g/m², ton/Ha;
x is the excess weed vegetation, %, g/m², ton/Ha;
e ≈ 2.718 is the base of natural logarithms;
a is the value that determines the decrease in crop productivity at the maximum weed infestation of the cultivation area;
b is the value that determines the rate of decrease in crop productivity with weed infestation of the cultivation area;
c is the value that determines the crop productivity preserved at the maximum weed infestation of the cultivation area.
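As an illustration of how Eq. (1) can be applied, the short Python sketch below evaluates the regression and estimates its coefficients from observations; all numerical values in it are hypothetical placeholders, not data from this study.

```python
# Minimal sketch of Eq. (1): evaluating and fitting Y = a*exp(-b*x) + c.
# All observation values and starting coefficients are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

def yield_model(x, a, b, c):
    """Crop yield Y versus excess weed vegetation x (Eq. (1))."""
    return a * np.exp(-b * x) + c

# Hypothetical observations: excess weed vegetation (%) -> yield (t/ha)
x_obs = np.array([0.0, 10.0, 20.0, 40.0, 80.0])
y_obs = np.array([5.0, 4.1, 3.5, 2.9, 2.6])

# Least-squares estimates of a, b and c
(a, b, c), _ = curve_fit(yield_model, x_obs, y_obs, p0=(2.5, 0.05, 2.5))
print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}")
print("Predicted yield at 30% excess weed vegetation:",
      yield_model(30.0, a, b, c))
```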
The following levels of weed infestation harmfulness have been established with respect to the impact of weed vegetation on cultivated crops: phytocenotic (the excess weed vegetation does not have a significant impact on cultivated crops), critical (the excess weed vegetation reduces the productivity of cultivated crops by 3.0 to 6.0%) and economic (an increase in crop productivity of 5.0 to 7.0% can be obtained by reducing the amount of weed vegetation to a minimum).
The economic limit of harmfulness is determined by the formula:

$C_a = Y_a \cdot P$,   (2)

where C_a is the additional expense justified for the destruction of weeds, RUB/Ha;
Y_a is the additional harvest of cultivated crops, ton/Ha;
P is the price per unit of harvest of cultivated crops, RUB/ton.
Numerical values of the economic limit of harmfulness of weed vegetation can be defined not only for any cultivated crop and its individual producer, but also for a particular cultivation area.
The movement speed of the proposed flame cultivator is determined by the
formula:

$v = 3.6\,(l/t)$, km/h   (3)

where l is the length of the trip, m;


t is trip time, s.

Gas consumption is determined by the formula:

$V_g = V_{ge} - V_{og}$, L   (4)

where Vge is initial gas volume in the gas tank, liter;


Vog is remaining gas at the end of the trip, liter.
The processing area is determined by the formula:

$S = B \cdot l / 10000$, Ha   (5)

where B is machine’s working width, m;


The working gas flow rate is determined by the formula:

$Q = V_g / S$, L/Ha   (6)

The working gas flow rate (minute) is determined by the formula:

$Q_{min} = 60000 \cdot V / t$, mL/min   (7)

The destructive flame temperature (K) as a function of exposure time is determined by the formula:

$T = 1700 - 950 \cdot \ln(\tau)$,   (8)

where τ is the exposure time, s.


The heat utilization value is determined by the formula:

$K = \sum (t \cdot \tau) / G$,   (9)

where t is the temperature, °C;
τ is the exposure time, s;
G is the gas mixture flow rate, kg/Ha.
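For clarity, the sketch below computes the trial metrics of Eqs. (3)–(8) for a single test pass in Python; every input value is a hypothetical example, and V in Eq. (7) is interpreted here as the gas volume consumed during the trip (an assumption of this sketch).

```python
# Sketch: field-trial metrics of Eqs. (3)-(8) for one pass.
# All inputs are hypothetical examples, not measured data.
import math

l = 250.0                  # trip length, m
t = 180.0                  # trip time, s
B = 1.4                    # working width of the machine, m
V_ge, V_og = 40.0, 33.5    # gas in the tank before/after the trip, L

v = 3.6 * (l / t)                    # Eq. (3): speed, km/h
V_g = V_ge - V_og                    # Eq. (4): gas consumed, L
S = B * l / 10000.0                  # Eq. (5): processed area, ha
Q = V_g / S                          # Eq. (6): gas flow rate, L/ha
Q_min = 60000.0 * V_g / t            # Eq. (7): minute gas flow rate, mL/min

tau = 2.0                            # exposure time, s
T = 1700.0 - 950.0 * math.log(tau)   # Eq. (8): destructive temperature, K

print(f"v = {v:.2f} km/h, V_g = {V_g:.1f} L, S = {S:.3f} ha")
print(f"Q = {Q:.1f} L/ha, Q_min = {Q_min:.0f} mL/min, T = {T:.0f} K")
```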
The weed vegetation destruction technology using open-flame burners provides:
– destruction of weed seeds by applying heat to them;
– destruction of weed sprouts;
– destruction of tall weeds.
The proposed scheme of weed vegetation technological decontamination is illustrated in Fig. 3.

Fig. 3. Technological scheme of weed vegetation decontamination

Results
Based on the technological scheme of weed neutralization, the flame cultivator must comply with the following main conditions:
– provide a significant level of weed control;
– not harm cultivated plants within the limits of the safe lane width;
– follow the terrain in the transverse, longitudinal and vertical planes;
– have devices protecting the working elements from damage on contact with obstacles.
Destroying weeds between the rows does not pose significant problems with currently available agricultural tools. However, the issue of weed control within the rows and in the protection zones of cultivated crops is still not fully resolved, despite the large number of technical means and methods available in the current agricultural sector [18].
Agricultural machinery and equipment manufacturers are mainly focused on creating agricultural units for destroying weed vegetation without the use of toxic chemicals; this is relevant due to the demand for ecological agricultural production at the present stage. Gas-powered units that use the energy of a gas-burner flame to control weeds are the result of these efforts [19, 20]. They combine high productivity with low construction and maintenance costs and the ability to effectively perform joint work on crop care and weed control.
The control of weeds and their seeds should be effective and carried out by high-performance technical means at minimal cost [21, 22]. The weed neutralization process on the soil surface can be implemented using a flame cultivator [23].
The design of the proposed advanced flame cultivator (Figs. 4, 5 and 6) includes gas burner sections 1 consisting of removable paired pipe sections (burners) 5 with

blind flanges 20 at the ends and holes at the bottom for screwing in gas-air nozzles 6, connected by mixer pipes 19 having holes on top for connecting the air pipe 3 and the gas pipe 2 with the nozzle 4; reflector shields 7; bearing rails-skates 8 with vertical plate-knives 9 to maintain the straight movement of the unit; terminal gas-air nozzles 10; dampers 11 welded to the probes 12 and the sleeves 13; brackets 14 with stoppers 15; rods 16 connected to each other by return springs 17 [23].

Fig. 4. Front view of the flame cultivator

Fig. 5. Paired sections-burners construction



Fig. 6. Mechanisms for protecting trunk from flames

The flame cultivator works as follows.
Gas is fed to the mixer pipe 19 through pipeline 2 via the nozzle 4, while air enveloping the gas jet is fed through a «jacket pipe» 3, which can be connected to the tractor compressor. The gas-air mixture flows from the mixer pipe 19 to the paired pipe burners 5, and the nozzles 6 ignite it at the exit.
The flame of the burners and the hot air concentrated under the reflector shield 7 burn out weeds and their seeds. The flame of the terminal nozzles 10 enters the planting rows directed downward at a slight angle. During the unit's movement the probes 12 are sequentially deflected to the side opposite to the direction of movement on contacting a trunk 18; at the same time, the arc-shaped dampers 11 welded to them rotate alternately (one after the other) around their axes and block the flame from the terminal nozzles 10, preventing direct contact with the planting trunk 18. Upon completion of the contact of the probes 12 with the trunk 18, the dampers 11 alternately return to their original position under the action of the return springs 17, unfolding to the stoppers 15 placed on the brackets 14, which together with the springs 17 hold the dampers 11 with the probes 12 stationary until contact with the next obstacle. The knives 9, in the form of plates welded to the rails 8, form «rails-skates», provide straightness of movement by wedging into the soil, and prevent the unit from sliding sideways on sloping land.
Simplicity of production, low metal consumption and the ability to simultaneously process the soil between the rows and along the rows of perennial plantings are the advantages of the proposed advanced flame cultivator; they will ensure high efficiency and resource saving of the technological process.

Discussion
The use of the proposed flame cultivator for weed control reduces the number of soil treatments and the cost of caring for cultivated crops, and the fire method of destroying weeds is environmentally safe.
It has been established that in a significant portion of the available technical means and units for weed control, weed destruction occurs by cutting, at considerable cost, and only at certain stages of weed formation.
The novelty of the device is the improved flame cultivator design developed for the technological process of burning weeds [23].

3 Conclusion

The shortcomings of the weed control methods in use indicate the need for their improvement.
The improved technology and design of the proposed flame cultivator ensure the effectiveness of the weed destruction process.
Economic feasibility is substantiated by the reduced number of soil treatments during weed destruction relative to other technologies.
It is necessary to continue the modernization of both existing and newly developed flame cultivator designs for weed destruction.

References
1. Perederiyeva, V.M., Vlasova, O.I., SHutko, A.P.: Allelopaticheskiye svoystva sornykh
rasteniy i ikh rastitel’nykh ostatkov v protsesse mineralizatsii [Allelopathic properties of
weeds and their plant residues in the process of mineralization]. KubGAU, Krasnodar 09
(73), 111–121 (2011). https://ej.kubagro.ru/2011/09/pdf/11. (in Russian)
2. Hatcher, P.E., Melander, B.: Combining physical, cultural and biological methods: prospects
for integrated non-chemical weed management strategies. Weed Res. 43(5), 303–322 (2003).
https://doi.org/10.1046/j.1365-3180.2003.00352.x
3. Abouziena, H.F., Hagaag, W.M.: Weed control in clean agriculture: a review. Planta
Daninha. 34(2), 377–392 (2016). https://doi.org/10.1590/S0100-83582016340200019
4. Dorozhko, G.R., Vlasova, O.I., Perederiyeva, V.M.: Sposob obrabotki - faktor reg-
ulirovaniya fitosanitarnogo sostoyaniya pochvy i posevov ozimoy pshenitsy na chernoze-
makh vyshchelo-chennykhzony umerennogo uvlazhneniya Stavropol’skogo kraya [The
tillage method is a factor in regulating the phytosanitary state of the soil and winter wheat
crops on leached humus of the moderate humidification zone of the Stavropol Territory].
KubGAU, Krasnodar 04(68), 69–77 (2011). https://ej.kubagro.ru/2011/04/pdf/08. (in
Russian)
5. Tseplyaev V.A., Shaprov M.N., Tseplyaev A.N.: Optimizatsiya parametrov tekhnologich-
eskogo protsessa poverkhnostnoy obrabotki pochvy rotornym avtoprivodnym agregatom
[Optimization of the technological process parameters of the surface tillage by the rotary
automatic drive unit]. Izvestiya Nizhnevolzhskogo agrouniversitetskogo kompleksa: nauka i
vyssheye professional’noye obrazovaniye 1(25), 160–164 (2012). (in Russian)

6. Schütte, G.: Herbicide resistance: promises and prospects of biodiversity for European
agriculture. Agric. Hum. Values 20(3), 217–230 (2003). https://doi.org/10.1023/A:
1026108900945
7. Bond, W., Grundy, A.C.: Non-chemical weed management in organic farming systems.
Weed Res. 41(5), 383–405 (2001). https://doi.org/10.1046/j.1365-3180.2001.00246.x
8. Izmaylov, A.YU., Khort, D.O., Smirnov, I.G., Filippov, R.A., Kutyrëv, A.I.: Analiz
parametrov raboty ustroystva dlya gidravlicheskogo udaleniya sornoy rastitel’nosti [Analysis
of Work Parameters of the Device for Hydraulic Removal of Weed Vegetation]
Inzhenernyye tekhnologii i sistemy. FGBOU VO «MGU im. N. P. Ogarëva». Saransk 29
(4), 614–634 (2019). https://doi.org/10.15507/2658-4123.029.201904.614-634. (in Russian)
9. Agricultural and Forestry Machinery. Catalogue of Exporters Czech Republic. Copyright: A.
ZeT, Brno (2005)
10. Bezhin, A.I.: Obosnovaniye parametrov i rezhimov raboty kul’tivatornogo agregata dlya
sploshnoy obrabotki pochvy [Rationale for parameters and operating modes of the cultivator
for complete tillage]. Ph.D. dissertation. Orenburg State Agrarian University, Orenburg
(2004),183 p. (in Russian)
11. Popay, I., Field, R.: Grazing animals as weed control agents. Weed Technol. 10(1), 217–231
(1996). https://doi.org/10.1017/S0890037X00045942
12. Astatkie, T., Rifai, M.N., Havard, P., Adsett, J., Lacko-Bartosova, M., Otepka, P.:
Effectiveness of hot water, infrared and open flame thermal units for controlling weeds. Biol.
Agric. Hortic. 25(1), 1–12 (2007). https://doi.org/10.1080/01448765.2007.10823205
13. Abdulgalimov, M.M.: Nestandartnyye resheniya i tekhnicheskiye sredstva dlya bor’by s
sornyakami v sadakh i vinogradnikakh [Non-standard solutions and technical means for
weed control in orchards and vineyards]. DagGAU im. M.M. Dzhambulatova. Makhachkala
4–6 (2015). (in Russian)
14. Blackshaw, R.E., Anderson, R.L., Lemerle, D.: Cultural weed management. In: Upadhyaya,
M.K., Blackshaw, R.E. (eds.) Non-Chemical Weed Management: Principles, Concepts and
Technology,1 edn.,CABI 2007, Wallingford, England, pp. 35–47. https://researchoutput.csu.
edu.au/en/publications/cultural-weed-management
15. Melikov, I., Kravchenko, V., Senkevich, S., Hasanova, E., Kravchenko, L.: Traction and
energy efficiency tests of oligomeric tires for category 3 tractors. In: IOP Conference Series:
Earth and Environmental Science, vol. 403, p. 012126 (2019). https://doi.org/10.1088/1755-
1315/403/1/012126
16. Senkevich, S., Ivanov, P.A., Lavrukhin, P.V., Yuldashev, Z.: Theoretical prerequisites for
subsurface broadcast seeding of grain crops in the conditions of pneumatic seed
transportation to the coulters. In: Handbook of Advanced Agro-Engineering Technologies
for Rural Business Development, pp. 28–64. IGI Global, Hershey (2019). https://doi.org/10.
4018/978-1-5225-7573-3.ch002
17. Lavrukhin, P., Senkevich, S., Ivanov, P.: Placement plants on the field area by seeding
machines: methodical aspects assessment rationality. In: Handbook of Research on Smart
Computing for Renewable Energy and Agro-Engineering, pp. 240–261. IGI Global, Hershey
(2020). https://doi.org/10.4018/978-1-7998-1216-6.ch010
18. Tokarev, N.A., Gar’yanova, E.D., Tokareva, N.D., Gulyayeva, G.V.: Sposob bor’by s
sornyakami [Weed Control Method]. Zemledeliye 8, 37–38 (2012). (in Russian)
19. Baerveldt, S., Ascard, J.: Effect of soil cover on weeds. Biol. Agric. Hortic. 17(2), 101–111
(1999). https://doi.org/10.1080/01448765.1999.9754830
20. Latsch, R., Anken, T., Herzog, C., Sauter, J.: Controlling Rumex obtusifolius by means of
hot water. Weed Res. 57(1), 16–24 (2017). https://doi.org/10.1111/wre.12233
21. Abdulgalimov, M.M.: Ognevoy kul’tivator [Flame Cultivator]. Byul. 14, 2016. RU Patent
2584481 C2. A01M 15/00

22. Abdulgalimov, M.M., Magomedov, F.M., Senkevich, S.E., Umarov, R.D., Melikov, I.M.:
Sovershenstvovaniye tekhnologii i sredstv mekhanizatsii dlya bor’by s sornoy rastitel’nos-
t’yu [Improvement of technology and means of mechanization for weed control].
Sel’skokhozyaystvennyye mashiny i tekhnologii 5, 38–42 (2017). https://doi.org/10.
22314/2073-7599-2017-5-38-42. (in Russian)
23. Abdulgalimov, M.M.: Ognevoy kul’tivator [Flame Cultivator]. Byul. 7, 2019. RU Patent
187387 U1. A01M 15/00
The Lighting Plan: From a Sector-Specific
Urbanistic Instrument to an Opportunity
of Enhancement of the Urban Space
for Improving Quality of Life

Cinzia B. Bellone1(&) and Riccardo Ottavi2


1 Urban Planning, DIS, Università degli Studi Guglielmo Marconi, Rome, Italy
c.bellone@unimarconi.it
2 Industrial Engineer, Electrical and Lighting Design, Perugia, Italy
riccardo.ottavi@gmail.com

Abstract. The urban space, its lighting, environmental sustainability, energy efficiency, attention to public spending, innovation and optimization of public utility services, light pollution, the energy and lighting upgrading of public lighting systems and – most importantly – the growth potential of urban living thanks to a new strategy for the cities of the future, the creation of efficient and effective smart cities: these are the themes that innervate the research whose results are presented in this article. These topics are very different from each other, but they share the urban space, with its features of complexity and modernity. And it is on the city, on the urban space, that this study focuses since, as De Seta writes, “nowadays over half the global population lives in a city, and it is estimated that this percentage will rise to three quarters in 2050” [1].
The present research represents the evolution of what has been submitted by
the authors for the AIDI conference “XIX Congresso Nazionale AIDI, La luce
tra cultura e innovazione nell’era digitale” (The research work was submitted
and accepted at the XIX National Conference of AIDI, 21−22 May 2020,
Naples, Italy.).

Keywords: Sustainable urban development · Innovation · Social wellbeing · Smart city

1 Introduction

The complexity of the contemporary city has led to the creation of new analysis, design and planning tools in the discipline of urban technology. These tools relate both to widespread issues – including energy saving, water protection, waste management and localization of broadcast media – and to local ones, such as traffic congestion, pollution in its different forms, subsoil monitoring, etc.
These new tools fill a gap in the general planning method in its traditional meaning:


concrete answer, with the main aim of raising performance, functional and quality city
standards [2, 3].
These topics are fully covered by multiple types of plan in the sector-specific city
planning: the urban traffic and the acoustic renewal plan, the timetable/schedule and the
municipal energy plan, the color plan and the electro-smog renewal one and, last but
not least, the lighting plan.

2 The Lighting Plan

To better understand the meaning of the latter tool, here’s a definition from an ENEA1
guideline:
«The lighting master plan is an urbanistic instrument that regulates all types of lighting for the city; it is a veritable blueprint of how the city is to be planned in terms of lighting technique. It has key advantages, as it allows urban fabrics to be respected in general, correlating them with a proper type of lighting. The final result is the optimization of the municipal lighting network, depending on the main requirements» [4].
The lighting of a city, and therefore public lighting, is a very peculiar component in the structure of the urban system, as it assumes two different identities depending on daytime and night-time. By day, its components are visible objects which have to be mitigated as much as possible in the environmental context, while by night those same elements play an essential role by lighting the urban space, ensuring that the whole city works as during the day and that everything proceeds in terms of quality, efficiency and reliability [5]. Not less important, considering the technologically advanced and innovative role that this engineerable and widespread system can play, this may be considered the first real step towards the formation of the smart city [6].

Fig. 1. Infographic of the report by the McKinsey Global Institute

According to a report by the McKinsey Global Institute, smart cities might reduce crime by 10% and healthcare costs by 15%, save four days stuck in traffic and 80 L of water per day (Fig. 1) [7].
The real usefulness and importance of a dedicated sector-specific urbanistic instrument – as the lighting plan is – becomes evident. Such an instrument, in fact, can act as a guide for people's lives within night urban spaces [8]. In fact, almost all Italian regional legislation – for about two decades now – has imposed on Municipalities the obligation to adopt this sector-specific urbanistic instrument (unfortunately, it is still widely disregarded by the municipalities themselves). According to an analysis conducted by ENEA on municipalities involved in the Lumière Project Network, less than 20% of them had adopted a lighting plan at the time of the surveys [9].

1 (IT) National agency for new technologies, energy and sustainable economic development.

While, on one hand, there is no consistency of adoption nor a unique model of how to cover the issues revolving around urban space lighting, on the other hand it is also true that sensitivity to urban lighting issues is increasing day by day; we are witnessing a real process of change, which is felt not only in our country but all over Europe, in response to citizens' increasing demand to see the quality of life in urban public spaces improved [10, 11].

3 Proposal for an Experimental Methodological Analysis

In order to scientifically analyze the different urban contexts and the extent to which the above principles, within the lighting plans, are implemented in public lighting, a scale of values is proposed as a valid instrument for objective research.

Fig. 2. Matrix chart no.1 “Light quality”

The two levers of analysis will be quality and innovation. We’ll start investigating
what it means to generate quality in the urban space through light, and then understand
how it can represent a real innovation for the cities of the future.

Returning to the above explained methodological approach, we report the matrix chart “light quality” (Fig. 2), highlighting the key points and the practical implementation connections which can be planned, in order to determine a planning of light useful to men and their space, so that light truly becomes an instrument for city planning, an instrument to build the city.

Fig. 3. Matrix chart no.2 “Innovating with light”

Here is the matrix chart “innovating with light” (Fig. 3), showing the objective key points which will act as a guide in the planning research of the next paragraph.

4 Two Case Studies in Comparison: Florence and Brussels

The answers provided in Europe are quite varied2 and, among them, the French-Belgian experience deserves special attention: in fact, the connection created there between light, urban space and aesthetics has made this reality an example in terms of contemporary urban quality.
The plan lumière of the Région de Bruxelles-Capitale3 is particularly significant. It is also referred to as a «second generation plan» for its choice to put users, the people living the night urban space and the way they live it, at the center of the study, rather than making the city spectacular [12].
The merit of this plan is not confined to providing a global vision of lighting at the regional level: it also accompanies the actions carried out in the areas defined as priorities by the Brussels government in the 2014−2019 declaration4 with specific proposals for these areas of prime importance for regional development.
In order to analyze the key points of this lighting plan, during the research project
some technical sheets have been drawn up to synthesize and frame the peculiarities of
this urbanistic instrument, in which the general data, the technical and regulatory
features and the purposes of the lighting plan itself have been reported.

2 In the European context, Brussels, Lyon, Copenhagen, Eindhoven, Rotterdam and Barcelona stand out.
3 Adopted by the municipality of Brussels in March 2016.
4 Accord de Majorité 2014−2019.

Even in Italy there are some virtuous examples of light planning5, among which the P.C.I.P. (Public Lighting Municipal Plan) of Florence stands out, giving priority to the need for and the concept of light quality6. What emerges from the analysis of Florence's plan is indeed the pragmatization of the concept of quality, which can be found in the issues covered, such as the technological renewal of the municipal lighting system, modification of the road network and urbanization (pedestrian and cycling urban areas), energy saving, environmental protection and, not least, green procurement [13].
Similarly to the previous case study, a specific technical sheet of the urbanistic instrument has been drawn up also for the city of Florence.
Both plans, both planning modes, innovate the subject by mutating the very concept of the lighting plan; despite implementing different approaches to the topic, they both move in the direction of giving special attention to the most functional themes of the cities of the future, with the aim of improving the living standards of the urban space user.
In this regard, there are particularly significant elements that effectively show these different approaches.
Regarding the P.C.I.P. of Florence, its project-based structure stands out immediately, involving lighting mimic diagrams, reports, risk analyses and graphic drawings: an important documentary corpus that goes on to determine the structure of the next project phases of technological upgrading and modernization. In the case of the plan lumière, even if a broader approach at the level of context analysis can be perceived, Brussels chose a more compact form and a very manual-like representation of its urban space. The plan lumière of the Brussels-Capital Region is a real guideline for the lighting of the urban space; everything is based on a very innovative and efficient method, ready for use by lighting designers for direct implementation on the territory.
Again, Florence chooses the theme of the user's visual perception of the night urban space [14] as the heart of its plan; all the planning follows this fil rouge, with particular attention to protecting the artistic and architectural context of the space to be illuminated: the city of Florence.
While the lighting plan of Florence shows innovation in planning the quality of urban light in favor of the users, even imagining them surrounded by a real open-air night museum itinerary, Brussels adopts innovative layered planning techniques, directly involving the citizens and all their sensations and desires for comfort and wellbeing in the plan itself7.

5 Among the most relevant Italian experiences, which in some cases have even tried to overcome the limitations imposed by the concept of a general development urbanistic instrument, are Florence, Udine, Turin, Bergamo, Rome and Milan.
6 Adopted by the municipality of Florence on 12 September 2016.
7 Organization's program of the marche exploratoire nocturne (14/04/2016) – a participatory study tool involving a sort of night walk through the night urban space. (Source: Lighting Plan of the Brussels-Capital Region, p. 125 – Partie Technique).

In the Plan Lumière of Brussels there is evidently a structure based on the discipline of Light Urbanism and the methodology of the Light Master Plan8. It is also noteworthy that the municipality of Brussels aims for dynamic lighting through a chronobiological approach (even creating moments of dark therapy) and a sensitivity focused on people's circadian rhythms, on how citizens can perceive the lighting, discovering and understanding the functionality of the urban space around them. Finally, this special attention towards man also extends to the surrounding environment, with the implementation of lighting control measures in harmony with flora and fauna [15].
What clearly emerges from the analysis and comparison of these two plans is that both use consolidated yet innovative techniques in planning urban lighting and that, although they are two de facto capitals (one of the Italian Renaissance and the other of a united Europe), they still lack the magic word, smart city, as a point of convergence for innovation, for the opportunities of lighting as an “enabling technology” [16] capable of shaping the future of the smart city.

5 Results of the Research and Future Visions

Both planning experiences are examples of planning that is at once organic, careful and innovative.
On that basis, it is desirable that the culture of good lighting be increasingly extended and implemented by local authorities, so that lighting can actually be considered an element of characterization capable of deeply affecting the urban space, no longer a mere lighting plant [17] (an important signal already comes from the world of standardization, with the birth of the new CIE9 234:2019 – A Guide to Urban Lighting Masterplanning). In other words, it is proposed to imagine light as an instrument capable of improving society in its everyday life, by redesigning the urban space and making it more responsive to the needs of the community.
The results of the research have led to the definition of 15 guidelines standing as
good practices for the conception and implementation of innovative and high-quality
lighting plans focused on men and their relationship with the urban space in all its
forms, functions and conditions.
Here is an overview of the above guidelines, indicating the urgency of intervening in this field with a renewed approach of innovation and high quality:
(a) The choice of the model of plan and the implementation methodology (guidelines 1-2-3-15).
A model of plan shall be identified that combines functionality, aesthetics and sociality, and that is organic and procedurally structured. The model shall begin from the pre-completion status and present a layered structure, in order to make the elements overlappable and, at the same time, analyzable separately.

8 The methodology of the Lighting Master Plan created by Roger Narboni, the lighting designer who outlined the first plan of Brussels in 1997.
9 CIE: Commission Internationale de l'Éclairage.

(b) Urban spaces to be preserved (guidelines 4-5-6).
The plan, attentive not only to the substantial aspects but also to the aesthetic-formal ones, shall aim at preserving the urban spaces, enhancing their peculiarities and improving them, by qualifying and redesigning the night landscape.
(c) Ecological and social aspect (guidelines 7-8-10).
The plan shall be focused on men and their natural biorhythm, guiding them in their space, but it shall also meet the needs of the night social life of the users, helping to guarantee their physical and psychological safety. Renewed attention shall also be given to respect for the local flora and fauna, by regulating the lighting, even providing for its absence or reduced intensity.
(d) The team and the user’s involvement (guidelines 9-11).
Different forms of collaboration shall be planned for the plan implementation. On
one hand, in view of its multidisciplinary nature, the collaboration with a team of
professionals specialized in the respective areas of intervention; on the other, the one
with the users of the night urban space, whose demands must be considered.
(e) A responsible light (guidelines 12-13).
The plan, which shall be economically and financially sustainable, shall hopefully
be extended to the entire visible urban lighting, to involve also the private outdoor one
and/or that for not strictly public use.
(f) An innovative light (guideline 14).
The urban lighting shall evolve into “enabling technology” [16], putting the elec-
trical infrastructure at the service of the smart city.
And if it is true, as it is, that «the city is the most typical expression of civilization on the territory; the city as a group of building facilities and infrastructures, but also as an aggregation of inhabitants in their roles as citizens taking part in the urban phenomenon» [18], then men are the starting point for such an operation: men within their spaces, men in their daily multiple forms of expression, aggregation, sharing.
The involvement and the collaborative commitment requested to the sector-specific
professionals (engineering, architectural, urbanistic, lighting, etc.), together with the
public administrations, aim at promoting a real implementation of this new concept of
urban space through the planning of light, reimagined in terms of innovation and
quality [19, 20].

References
1. De Seta, C. (ed.): La città: da Babilonia alla smart city. Rizzoli, Milano (2017)
2. Talia, M. (ed.): La pianificazione del territorio. Il sole 24 ore, Milano (2003)
3. Rallo, D. (ed.): Divulgare l’urbanistica. Alinea, Firenze (2002)

4. Cellucci, L., Monti, L., Gugliermetti, F., Bisegna, F.: Proposta di una procedura
schematizzata per semplificare la redazione dei Piani Regolatori di Illuminazione Comunale
(PRIC) (2012). http://www.enea.it
5. Enel-Federelettrica: Guida per l’esecuzione degli impianti di illuminazione pubblica, pp. 25–
26. Roma (1990)
6. Spaini, M.: Inquadramento normo-giuridico della pubblica illuminazione. Atto di convegno
AIDI-ENEA, Milano (2014)
7. McKinsey Global Institute: Smart cities: Digital solutions for a more livable future, rapporto
(2018). http://www.mckinsey.com
8. Süss, M.: I piani della luce – Obiettivi, finalità e opportunità per le Pubbliche
Amministrazioni. Atto di convegno AIDI-ENEA (2014)
9. Progetto Lumière ENEA - Report RdS/2010/250
10. Terzi, C. (ed.): I piani della luce. Domus «iGuzzini», Milano (2001)
11. Ratti, C., con Claudel, M. (eds.): La città di domani. Come le reti stanno cambiando il futuro
urbano. Einaudi, Torino (2017)
12. Plan Lumière de la Région de Bruxelles-Capitale. www.mobilite-mobiliteit.brussels / www.radiance35.it (Bruxelles Mobilité)
13. Piano Comunale di Illuminazione Pubblica della Città di Firenze. www.silfi.it (P.O. Manufatti e Impianti Stradali, comune di Firenze)
14. Gehl, J. (ed.): Vita in città. Maggioli, Rimini (2012)
15. Narboni, R.: Luce e Paesaggio (Italian edition edited by Palladino, P.). Tecniche Nuove, Milano (2006)
16. Gianfrate, V., Longo, D. (eds.): Urban Micro-Design. FrancoAngeli, Milano (2017)
17. Frascarolo, M. (ed.): Manuale di progettazione illuminotecnica (section by E. Bordonaro). Mancosu Editore, Architectural Book and Review «TecnoTipo», Roma (2011)
18. Dioguardi G. (ed.): Ripensare la città, p. 45. Donzelli, Roma (2001)
19. Bellone, C., Ranucci, P., Geropanta, V.: The ‘Governance’ for smart city strategies and territorial planning. In: Vasant, P., Zelinka, I., Weber, G.-W. (eds.) Intelligent Computing & Optimization. AISC, pp. 76–86. Springer, Berlin (2018)
20. Bellone, C., Geropanta, V.: The ‘Smart’ as a project for the city: smart technologies for territorial management planning strategies. In: Vasant, P., Zelinka, I., Weber, G.-W. (eds.) Intelligent Computing & Optimization. AISC, vol. 866, pp. 66–75. Springer, Berlin (2019)
PID Controller Design for BLDC Motor Speed
Control System by Lévy-Flight Intensified
Current Search

Prarot Leeart, Wattanawong Romsai, and Auttarat Nawikavatan(&)

Department of Electrical Engineering, Faculty of Engineering, Southeast Asia University, 19/1 Petchkasem Road, Nongkhangphlu, Nonghkaem, Bangkok 10160, Thailand
poaeng41leeart@gmail.com, wattanawong.r@gmail.com, auttarat@hotmail.com

Abstract. This paper presents the optimal proportional-integral-derivative (PID) controller design for the brushless direct current (BLDC) motor speed control system by using the Lévy-flight intensified current search (LFICuS). Based on modern optimization, the LFICuS is one of the most powerful metaheuristic optimization techniques, developed from the behavior of electric current flowing through electric networks. The error between the reference speed and the actual speed is set as the objective function to be minimized by the LFICuS according to the constraint functions formed from the design specification. As a result, it was found that the PID controller can be optimally achieved by the LFICuS. The speed responses of the BLDC motor speed controlled system are very satisfactory according to the given design specification.

Keywords: PID controller · Lévy-flight intensified current search · BLDC motor speed control system · Modern optimization

1 Introduction

Since the 1970s, the brushless direct current (BLDC) motor has been widely used in different applications, for example, industrial automation, automotive, aerospace, instrumentation and appliances [1]. This is because the BLDC motor retains the characteristics of a brushed DC motor but eliminates the commutator and brushes. It can be driven by a DC voltage source, while the current commutation is done electronically by solid-state switches [2]. From literature reviews, the BLDC motor has many advantages over brushed DC motors. Among those, a higher speed range, higher efficiency, better speed-versus-torque characteristics, longer operating life, noiseless operation and higher dynamic response can be found in BLDC usage [1–3].
The BLDC motor speed control can be efficiently operated under the feedback control loop with the proportional-integral-derivative (PID) controller [4–6]. Moving toward the new era of control synthesis, optimal controller design has changed from the conventional paradigm to a new framework based on modern optimization using powerful metaheuristics as optimizers [7, 8]. For example, for the


BLDC motor speed control, optimal PID controller was successfully design by genetic
algorithm (GA) [6], tabu search [6], cuckoo search (CS) [9] and intensified current
search (ICuS) [10].
The Lévy-flight intensified current search (LFICuS), one of the most efficient metaheuristics, is the latest modified version of the ICuS. It utilizes random numbers drawn from the Lévy-flight distribution and an adjustable search radius mechanism to enhance the search performance and speed up the search process. In this paper, the LFICuS is applied to design an optimal PID controller for the BLDC motor speed control system. This paper consists of five sections. After the introduction given in Sect. 1, the BLDC motor control loop with the PID controller is described in Sect. 2. The problem formulation of the LFICuS-based PID controller design is illustrated in Sect. 3. Results and discussions are given in Sect. 4. Finally, conclusions are given in Sect. 5.

2 BLDC Motor Control Loop

The BLDC motor control loop with the PID controller can be represented by the block diagram shown in Fig. 1, where R(s) is the reference input signal, C(s) is the output signal, E(s) is the error signal between R(s) and C(s), U(s) is the control signal and D(s) is the disturbance signal.

Fig. 1. BLDC motor control loop.

2.1 BLDC Motor Model


Referring to Fig. 1, the BLDC motor can be represented by the schematic diagram shown in Fig. 2. The modeling of the BLDC motor is then formulated as follows. The phase voltages van, vbn and vcn are expressed in (1), where ia, ib and ic are the phase currents, Ra, Rb and Rc are the resistances, La, Lb and Lc are the inductances and ean, ebn and ecn are the back emfs of phases a, b and c, respectively [1, 3].
The back emfs in (1) are stated in (2) as a function of rotor position, where θe is the electrical rotor angle, Ke is the back emf constant of each phase, ω is the motor angular velocity in rad/s and fa, fb and fc represent the function of rotor position of phases a, b and c, respectively.

Fig. 2. Schematic diagram of BLDC motor.

$v_{an} = R_a i_a + L_a \dfrac{di_a}{dt} + e_{an}$
$v_{bn} = R_b i_b + L_b \dfrac{di_b}{dt} + e_{bn}$   (1)
$v_{cn} = R_c i_c + L_c \dfrac{di_c}{dt} + e_{cn}$

$e_{an} = K_e \omega f_a(\theta_e)$
$e_{bn} = K_e \omega f_b(\theta_e - 2\pi/3)$   (2)
$e_{cn} = K_e \omega f_c(\theta_e - 4\pi/3)$

The electromagnetic torque Te, depending on the currents and back emf voltages, is expressed in (3), where Tea, Teb and Tec represent the phase electric torques. The relation between the speed and the torque is then stated in (4), where Tl is the load torque, J is the moment of inertia and B is the viscous friction.
In general, the simple arrangement of a symmetrical (balanced) 3-phase wye (Y) connection as shown in Fig. 2 allows a per-phase treatment. With a symmetrical arrangement, the mechanical time constant τm and the electrical time constant τe of the BLDC motor can be formulated as expressed in (5) [5], where Kt is the torque constant.

½ean ia þ ebn ib þ ecn ic 


Te ¼ Tea þ Teb þ Tec ¼ ð3Þ
x
dx
Te ¼ J þ Bx þ Tl ð4Þ
dt
X Ra J 9
JRRa Jð3Ra Þ >>
sm ¼ ¼ ¼ =
Kt Ke Kt Ke Kt Ke
X La ð5Þ
La La >
>
se ¼ ¼ ¼ ;
Ra RRa 3Ra

Commonly, the BLDC motor is driven by a DC input voltage via a particular inverter (or power amplifier). Such an amplifier can be approximated by a first-order model. Therefore, the mathematical model of the BLDC motor can be formulated as the transfer function stated in (6), where KA is the amplifier constant and τA is the amplifier time constant. From (6), the BLDC motor model with the power amplifier can be considered a third-order system.

$G_p(s) = \dfrac{\Omega(s)}{V(s)} = \left(\dfrac{K_A}{\tau_A s + 1}\right)\left(\dfrac{1/K_e}{\tau_m \tau_e s^2 + \tau_m s + 1}\right) = \dfrac{b_0}{a_3 s^3 + a_2 s^2 + a_1 s + a_0}$   (6)

In this work, the BLDC motor model is obtained with the MATLAB system identification toolbox [11]. The testing rig of the BLDC motor system, including a brushless DC motor of 350 W, 36 VDC, 450 rpm and a power amplifier (driver), is set up as shown in Fig. 3. The experimental speed data at 280, 320 and 360 rpm are tested and recorded by a digital storage oscilloscope and a PC. The speed data at 320 rpm, the operating point, are used for model identification, while those at 280 and 360 rpm are used for model validation.
Based on the third-order model in (6), the MATLAB system identification toolbox provides the BLDC motor model stated in (7). The results of model identification and validation are depicted in Fig. 4. It was found that the model in (7) shows very good agreement with the actual dynamics (sensory data) of the BLDC motor. The BLDC motor model in (7) is used as the plant model Gp(s) shown in Fig. 1.

[Fig. 3 shows the testing rig: a 36 VDC/10 A power supply, a power amplifier (driver), the 350 W, 36 VDC, 450 rpm BLDC motor, a tachogenerator (output 0–10 V), and an NI USB-6008 DAQ connected via USB to a PC (Intel(R) Core(TM) i7-10750H CPU @ 2.60 GHz).]
Fig. 3. BLDC motor testing rig.

$G_p(s) = \dfrac{0.5725}{s^3 + 3.795 s^2 + 5.732 s + 0.5719}$   (7)
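To make the identified model concrete, the following minimal sketch builds the plant of Eq. (7) and computes its open-loop step response; Python/scipy is used here as an assumed substitute for the paper's MATLAB toolchain, and the 10–90% rise-time estimate is a rough calculation.

```python
# Minimal sketch: open-loop step response of the identified plant of Eq. (7),
# using scipy as an assumed stand-in for the paper's MATLAB workflow.
import numpy as np
from scipy import signal

# Gp(s) = 0.5725 / (s^3 + 3.795 s^2 + 5.732 s + 0.5719)
plant = signal.TransferFunction([0.5725], [1.0, 3.795, 5.732, 0.5719])

t = np.linspace(0.0, 60.0, 3000)
t, y = signal.step(plant, T=t)

# Rough 10-90% rise time, for comparison with the values reported in Sect. 4
y_final = y[-1]
t10 = t[np.argmax(y >= 0.1 * y_final)]
t90 = t[np.argmax(y >= 0.9 * y_final)]
print(f"open-loop rise time ~ {t90 - t10:.2f} s")  # about 20 s, as reported
```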

Fig. 4. Results of BLDC motor model identification.

2.2 PID Controller


Referring to Fig. 1, the PID controller models Gc(s) in the time domain and the s-domain are stated in (8) and (9), where e(t) is the error signal, u(t) is the control signal, Kp is the proportional gain, Ki is the integral gain and Kd is the derivative gain. In the control loop, the PID controller Gc(s) receives E(s) and generates U(s) in order to control the plant (BLDC motor) Gp(s), producing C(s) that follows R(s) while simultaneously regulating D(s).
$u(t) = K_p e(t) + K_i \int e(t)\,dt + K_d \dfrac{de(t)}{dt}$   (8)

$G_c(s) = K_p + \dfrac{K_i}{s} + K_d s$   (9)
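To make the time-domain law of Eq. (8) concrete, here is a minimal discrete-time PID implementation; the fixed sampling period and the absence of anti-windup are simplifying assumptions of this sketch, not details stated in the paper.

```python
# Minimal discrete-time PID controller implementing Eq. (8) with a fixed
# sampling period dt (a simplifying assumption of this sketch).
class PID:
    def __init__(self, Kp: float, Ki: float, Kd: float, dt: float):
        self.Kp, self.Ki, self.Kd, self.dt = Kp, Ki, Kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error: float) -> float:
        """Return the control signal u for the current error e."""
        self.integral += error * self.dt                  # integral of e(t)
        derivative = (error - self.prev_error) / self.dt  # de(t)/dt
        self.prev_error = error
        return (self.Kp * error + self.Ki * self.integral
                + self.Kd * derivative)
```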

3 Problem Formulation

The LFICuS-based PID controller design for the BLDC motor speed control system can be formulated as represented in Fig. 5. Based on modern optimization, the objective function f(Kp, Ki, Kd) is set as the sum-squared error (SSE) between the reference input r(t) and the speed output c(t), as stated in (10). The objective function f(·) = SSE is fed to the LFICuS block to be minimized by searching for appropriate values of the PID controller's parameters, i.e., Kp, Ki and Kd, within the corresponding search spaces and according to the constraint functions set from the design specification as expressed in (11), where [Kp_min, Kp_max] are the lower and upper bounds of Kp, [Ki_min, Ki_max] are the lower and upper bounds of Ki, [Kd_min, Kd_max] are the lower and upper bounds of Kd, tr is the rise time, tr_max is the maximum allowance of tr, Mp is the maximum percent overshoot, Mp_max is the maximum allowance of Mp, ts is the settling time, ts_max is the maximum allowance of ts, and ess is the steady-state error with ess_max its maximum allowance.

Fig. 5. LFICuS-based PID controller design for BLDC motor speed control system.

$\min f(K_p, K_i, K_d) = \sum_{i=1}^{N} [r_i(t) - c_i(t)]^2$   (10)

Subject to $t_r \le t_{r\_max}$, $M_p \le M_{p\_max}$, $t_s \le t_{s\_max}$, $e_{ss} \le e_{ss\_max}$,
$K_{p\_min} \le K_p \le K_{p\_max}$, $K_{i\_min} \le K_i \le K_{i\_max}$, $K_{d\_min} \le K_d \le K_{d\_max}$   (11)
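One plausible way to code the objective (10) under the constraints (11) is a penalized cost function, sketched below; the unity-feedback closed-loop construction, the 10–90% rise-time definition and the penalty value are assumptions of this example rather than details given by the authors.

```python
# Sketch of a penalized objective f(Kp, Ki, Kd) for Eqs. (10)-(11): SSE of
# the closed-loop unit-step response plus a large penalty when a response
# specification is violated (illustrative assumptions throughout).
import numpy as np
from scipy import signal

NUM_P = [0.5725]                      # plant numerator, Eq. (7)
DEN_P = [1.0, 3.795, 5.732, 0.5719]   # plant denominator, Eq. (7)

def objective(Kp, Ki, Kd, t_r_max=2.5, M_p_max=5.0, penalty=1e6):
    # Gc(s) = (Kd s^2 + Kp s + Ki)/s, so Gc*Gp under unity feedback gives:
    num_ol = np.polymul([Kd, Kp, Ki], NUM_P)
    den_ol = np.polymul([1.0, 0.0], DEN_P)
    sys = signal.TransferFunction(num_ol, np.polyadd(den_ol, num_ol))

    t = np.linspace(0.0, 10.0, 2000)
    t, y = signal.step(sys, T=t)

    sse = float(np.sum((1.0 - y) ** 2))        # Eq. (10) for a unit step
    Mp = max(0.0, (np.max(y) - 1.0) * 100.0)   # percent overshoot
    tr = t[np.argmax(y >= 0.9)] - t[np.argmax(y >= 0.1)]  # 10-90% rise time
    if tr > t_r_max or Mp > M_p_max:           # constraints of Eq. (11)
        sse += penalty
    return sse
```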

In this work, the LFICuS algorithm, the latest modified version of the ICuS algorithm [10], is applied to optimize the PID controller for the BLDC motor speed control system. The LFICuS utilizes random numbers drawn from the Lévy-flight distribution to generate the neighborhood members as feasible solutions in each search iteration. Such a Lévy-flight random distribution L can be approximated by (12) [12], where s is the step length, λ is an index and Γ(λ) is the Gamma function expressed in (13). In the LFICuS algorithm, the adjustable search radius (AR) and adjustable neighborhood member (AN) mechanisms are also conducted by setting the initial search radius R = Ω (the search space). The LFICuS algorithm for optimizing the PID controller for the BLDC motor speed control system can be described step-by-step as follows.

$L \approx \dfrac{\lambda \Gamma(\lambda) \sin(\pi \lambda / 2)}{\pi} \cdot \dfrac{1}{s^{1+\lambda}}$   (12)

$\Gamma(\lambda) = \int_0^{\infty} t^{\lambda - 1} e^{-t}\, dt$   (13)

Step-0 Initialize the objective function f(·) = SSE in (10) and the constraint functions in (11), the search space Ω = [Kp_min, Kp_max] × [Ki_min, Ki_max] × [Kd_min, Kd_max], the memory lists (ML) Ψ, Γk and Ξ = ∅, the maximum allowance of solution cycling jmax, the number of initial solutions N, the number of neighborhood members n, the search radius R = Ω, and k = j = 1.
Step-1 Uniformly randomize initial solutions Xi = {Kp, Ki, Kd} within Ω. Evaluate f(Xi) via (10) and (11), then rank and store the Xi in Ψ.
Step-2 Let x0 = Xk be the selected initial solution. Set Xglobal = Xlocal = x0.
Step-3 Generate new solutions xi = {Kp, Ki, Kd} by Lévy-flight random moves per (12) and (13) around x0 within R. Evaluate f(xi) via (10) and (11), and set the best one as x*.
Step-4 If f(x*) < f(x0), keep x0 in Γk, update x0 = x* and set j = 1. Otherwise, keep x* in Γk and update j = j + 1.
Step-5 Activate the AR mechanism by R = ρR, 0 < ρ < 1, and invoke the AN mechanism by n = αn, 0 < α < 1.
Step-6 If j ≤ jmax, go back to Step-3.
Step-7 Update Xlocal = x0 and keep Xglobal in Ξ.
Step-8 If f(Xlocal) < f(Xglobal), update Xglobal = Xlocal.
Step-9 Update k = k + 1 and set j = 1. Let x0 = Xk be the selected initial solution.
Step-10 If k ≤ N, go back to Step-2. Otherwise, stop the search process and report the best solution Xglobal = {Kp, Ki, Kd} found.
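The Lévy-flight sampling of (12)–(13) can be realized, for instance, with Mantegna's algorithm; the sketch below (an assumed sampling method, since the paper does not specify one) draws such steps and generates neighborhood members around a current solution x0, reusing the paper's settings λ = 0.3 and step length 0.01.

```python
# Sketch: Lévy-flight steps via Mantegna's algorithm (an assumed sampling
# method for Eqs. (12)-(13)) and neighborhood generation around x0 (Step-3).
import numpy as np
from math import gamma, sin, pi

def levy_steps(lam: float, size) -> np.ndarray:
    """Heavy-tailed steps with tail exponent 1 + lam, cf. Eq. (12)."""
    sigma_u = (gamma(1.0 + lam) * sin(pi * lam / 2.0)
               / (gamma((1.0 + lam) / 2.0) * lam * 2.0 ** ((lam - 1.0) / 2.0))
               ) ** (1.0 / lam)
    u = np.random.normal(0.0, sigma_u, size)
    v = np.random.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1.0 / lam)

def neighborhood(x0, R, n, lam=0.3, step=0.01, bounds=None):
    """Generate n neighborhood members around x0 within search radius R."""
    x0, R = np.asarray(x0, float), np.asarray(R, float)
    members = x0 + step * levy_steps(lam, (n, x0.size)) * R
    if bounds is not None:                      # clip to the search space
        lo, hi = bounds
        members = np.clip(members, lo, hi)
    return members

# Example: 100 candidate {Kp, Ki, Kd} triplets around a current solution
cands = neighborhood([5.0, 2.5, 2.5], R=[10.0, 5.0, 5.0], n=100,
                     bounds=([0.0, 0.0, 0.0], [10.0, 5.0, 5.0]))
```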

4 Results and Discussions

To optimize the PID controller of the BLDC motor speed control system, the LFICuS algorithm was coded in MATLAB version 2017b (License No. #40637337) run on an Intel(R) Core(TM) i7-10510U CPU @ 1.80 GHz, 2.30 GHz, with 16.0 GB RAM. The search parameters of the LFICuS were set from a preliminary study, i.e., initial search radius R = Ω = [Kp_min, Kp_max], [Ki_min, Ki_max] and [Kd_min, Kd_max], step length s = 0.01, index λ = 0.3, number of initial neighborhood members n = 100 and number of search directions N = 50. Each search direction is terminated at the maximum iteration (Max_Iter) of 100. The number of states of the AR and AN mechanism activation is h = 2: in state (i), at the 50th iteration, R = 50% of Ω and n = 50; in state (ii), at the 75th iteration, R = 25% of Ω and n = 25. The constraint functions (11) are set as expressed in (14). 50 trial runs are performed to search for the optimal parameters of the PID controller.
Subject to $t_r \le 2.50$ s, $M_p \le 5.00\%$, $t_s \le 4.00$ s, $e_{ss} \le 0.01\%$,
$0 \le K_p \le 10.00$, $0 \le K_i \le 5.00$, $0 \le K_d \le 5.00$   (14)

After the search process stopped, the LFICuS successfully provided the PID controller for the BLDC motor speed control system as stated in (15). The convergence curves of the LFICuS for the PID design optimization over the 50 trial runs are depicted in Fig. 6. The step-input and step-disturbance responses of the BLDC motor speed control system without and with the PID controller designed by the LFICuS are depicted in Fig. 7 and Fig. 8, respectively.

Fig. 6. Convergent rates of the LFICuS for the PID design optimization.

$G_c(s) = 8.5875 + \dfrac{0.8164}{s} + 1.8164 s$   (15)
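Readers wishing to check the reported closed-loop behavior approximately can close the loop around the plant of Eq. (7) with the tuned gains of Eq. (15) under unity feedback, as in the sketch below; again, scipy stands in for the paper's MATLAB environment.

```python
# Sketch: closed-loop step response with the LFICuS-tuned PID of Eq. (15)
# around the plant of Eq. (7), unity feedback assumed.
import numpy as np
from scipy import signal

Kp, Ki, Kd = 8.5875, 0.8164, 1.8164
num_ol = np.polymul([Kd, Kp, Ki], [0.5725])                 # Gc*Gp numerator
den_ol = np.polymul([1.0, 0.0], [1.0, 3.795, 5.732, 0.5719])
closed = signal.TransferFunction(num_ol, np.polyadd(den_ol, num_ol))

t = np.linspace(0.0, 10.0, 4000)
t, y = signal.step(closed, T=t)
print(f"overshoot ~ {max(0.0, (np.max(y) - 1.0) * 100.0):.2f} %")
```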

Referring to Fig. 7, it was found that the BLDC motor without the PID controller gives tr = 20.39 s, Mp = 0.00%, ts = 36.74 s and ess = 0.00%. The BLDC motor speed control system with the PID controller optimized by the LFICuS in (15) yields tr = 2.44 s, Mp = 2.47%, ts = 3.28 s and ess = 0.00%. From Fig. 8, it was found that the BLDC motor without the PID controller cannot regulate the output response against the external disturbance. However, the BLDC motor speed control system with the PID controller designed by the LFICuS in (15) can successfully regulate the output response against the external disturbance, with a maximum percent overshoot from regulation Mp_reg = 10.43% and a regulating time treg = 19.78 s.

Fig. 7. Step-input responses of the BLDC motor speed control system.

Fig. 8. Step-disturbance responses of the BLDC motor speed control system.



5 Conclusions

The application of the LFICuS to design an optimal PID controller for the BLDC motor speed control system has been presented in this paper. The LFICuS algorithm, developed from the ICuS, is one of the most efficient metaheuristic optimization techniques. Based on modern optimization, the sum-squared error between the reference input and the speed output of the BLDC motor speed control system has been set as the objective function for minimization. In addition, the desired specification and search bounds have been correspondingly set as the constraint functions. As a result, the optimal PID controller for the BLDC motor speed control system has been successfully obtained by the LFICuS algorithm. The step-input and step-disturbance responses of the BLDC motor speed controlled system are very satisfactory according to the design specification. The advantage of the proposed design method is that users can utilize any controller to be optimally designed for any plant of interest. However, the disadvantage of this method is the boundary setting, which is problem-dependent. For future research, the LFICuS algorithm will be applied to design PIDA, FOPID and FOPIDA controllers for many real-world systems.

References
1. Yedamale, P.: Brushless DC (BLDC) motor fundamentals. AN885, Microchip Technology Inc. 20, 3–15 (2003)
2. Vas, P.: Parameter Estimation, Condition Monitoring and Diagnosis of Electrical Machines.
Oxford University Press, Oxford (1993)
3. Tashakori, A., Ektesabi, M., Hosseinzadeh, N.: Modeling of BLDC motor with ideal back-
EMF for automotive applications. In: The World Congress on Engineering (2011)
4. Othman, A.S., Mashakbeh, A.: Proportional integral and derivative control of brushless DC
motor. Eur. J. Sci. Res. 35(4), 198–203 (2009)
5. Patel, V.K.R.S., Pandey, A.K.: Modeling and performance analysis of PID controlled BLDC
motor and different schemes of PWM controlled BLDC motor. Int. J. Sci. Res. Publ. 3, 1–14
(2013)
6. Boonpramuk, M., Tunyasirut, S., Puangdownreong, D.: Artificial intelligence-based optimal
PID controller design for BLDC motor with phase advance. Indonesian J. Electr. Eng.
Inform. 7(4), 720–733 (2019)
7. Zakian, V.: Control Systems Design: A New Framework, Springer-Verlag (2005)
8. Zakian, V., Al-Naib, U.: Design of dynamical and control systems by the method of
inequalities. IEEE Int. Conf. 120, 1421–1427 (1973)
9. Puangdownreong, D., Kiree, C., Kumpanya, D., Tunyasrirut, S.: Application of cuckoo
search to design optimal PID/PI controllers of BLDC motor speed control system. In: Global
Engineering & Applied Science Conference, pp. 99–106 (2015)
10. Puangdownreong, D., Kumpanya, D., Kiree, C., Tunyasrirut, S.: Optimal tuning of 2DOF–
PID controllers of BLDC motor speed control system by intensified current search. In:
Global Engineering & Applied Science Conference, pp. 107–115 (2015)
11. Ljung, L.: System Identification Toolbox for use with MATLAB. The MathWorks (2007)
12. Yang, X.S.: Flower pollination algorithm for global optimization. Unconventional Comput.
Nat. Comput. Lect. Notes Comput. Sci. 7445, 240–249 (2012)
Intellectualized Control System
of Technological Processes of an Experimental
Biogas Plant with Improved System
for Preliminary Preparation of Initial Waste

Andrey Kovalev1(&), Dmitriy Kovalev1, Vladimir Panchenko1,2,
Valeriy Kharchenko1, and Pandian Vasant3

1 Federal Scientific Agroengineering Center VIM,
1st Institutskij proezd 5, 109428 Moscow, Russia
kovalev_ana@mail.ru, pancheska@mail.ru
2 Russian University of Transport, Obraztsova st. 9, 127994 Moscow, Russia
3 Universiti Teknologi PETRONAS, Tronoh, 31750 Ipoh, Perak, Malaysia
pvasant@gmail.com

Abstract. Obtaining biogas is economically justified and is preferable in the
processing of a constant stream of waste. One of the most promising and energy-
efficient methods of preparing a substrate for fermentation is processing it in a
vortex layer apparatus. The aim of the work is to develop an intellectualized
control system for the technological processes of an experimental biogas plant with
an improved system for the preliminary preparation of initial waste, created to study
the effectiveness of using a VLA for the pretreatment of return biomass during
anaerobic processing of organic waste. The article describes both the experimental
biogas plant itself and a schematic diagram of the intellectualized process control
system under development. The use of the developed intellectualized process
control system makes it possible to determine the main parameters of experimental
research and to maintain the specified operating modes of the equipment of the
experimental biogas plant, which in turn will reduce the error in the subsequent
mathematical processing of experimental data.

Keywords: Anaerobic treatment · Preliminary preparation of initial waste · Vortex layer apparatus · Bioconversion of organic waste · Recirculation of biomass

1 Introduction

Obtaining biogas is economically justified and is preferable in the processing of a
constant stream of waste, and is especially effective in agricultural complexes, where
there is the possibility of a complete ecological cycle [1].
Despite the long-term use of biogas plants and an even longer period of studies of
the processes occurring in them, our understanding of their basic laws and the
mechanisms of individual stages is insufficient, which in some cases determines the low
efficiency of biogas plants, does not allow them to be controlled to the necessary
extent, and leads to unjustified overstatement of construction volumes, increased
operating costs and, accordingly, a higher cost of 1 m³ of biogas produced. This puts
forward the tasks of developing the most effective technological schemes for biogas
plants and the composition of their equipment, creating new designs and calculating
their parameters, improving the reliability of their operation, and reducing their cost and
construction time, which is one of the urgent problems in solving the issue of energy
supply of agricultural production facilities [1].
One of the most promising and energy-efficient methods of preparing a substrate for
fermentation is processing it in a vortex layer of ferromagnetic particles (vortex layer
apparatus (VLA)), which is created by the action of a rotating magnetic field [2].
Previously, the positive effect of processing various organic substrates in VLA on the
characteristics of methanogenic fermentation, in particular on the kinetics of
methanogenesis, the completeness of decomposition of organic matter, the content of
methane in biogas, and waste disinfection has been shown [3–6].
The purpose of this work is to develop an intellectualized control system for the
technological processes of an experimental biogas plant with an improved system for
the preliminary preparation of initial waste, created to study the effectiveness of using a
VLA for the pretreatment of return biomass during anaerobic processing of organic
waste.

2 Background

The experimental biogas plant consists of the following blocks:


– block for preliminary preparation of the initial substrate (pre-treatment block);
– block for anaerobic bioconversion of organic matter of the prepared substrate;
– block for dividing the anaerobically treated substrate into fractions;
– block for recirculation of the thickened fraction of the anaerobically treated
substrate;
– process control block.
The pre-treatment block includes a pre-heating vessel, a vortex layer apparatus
(VLA) and a peristaltic pump for circulating the substrate through the VLA.
The preheating tank is a steel cylindrical tank equipped with devices for loading
and unloading the substrate, a device for mixing the substrate, branch pipes for the outlet
and supply of the substrate, as well as a substrate heating device equipped with a
temperature sensor.
The diameter of the preheating tank is 200 mm, the height of the preheating tank is
300 mm, the filling factor is 0.9 (the volume of the apparatus is 9.4 L; the volume of
the fermented substrate is 8.5 L).
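As a quick arithmetic check of these figures (illustrative only, not part of the original study), the total and working volumes of such a cylindrical vessel follow directly from its dimensions; the sketch below reproduces the stated 9.4 L and 8.5 L.

```python
import math

def vessel_volumes(diameter_m: float, height_m: float, filling_factor: float):
    """Total and working volume of a cylindrical vessel, in litres."""
    total_l = math.pi * (diameter_m / 2.0) ** 2 * height_m * 1000.0
    return total_l, total_l * filling_factor

# Preheating tank: diameter 200 mm, height 300 mm, filling factor 0.9
total, working = vessel_volumes(0.200, 0.300, 0.9)
print(f"total = {total:.1f} L, working = {working:.1f} L")
# -> total = 9.4 L, working = 8.5 L
```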
Temperature and mass transfer are maintained in the preheating tank by means of a
heating system and mechanical stirring.

The temperature in the preheating tank is maintained by a heating system consisting
of an electric clamp heater located in the lower part of the outer wall of the preheating
tank and a temperature sensor.
To intensify the preheating process, stirring is carried out in the preheating tank by a
mechanical stirrer controlled by a time switch. The diameter of the stirrer blades is
100 mm, the blade height is 25 mm, and the rotation speed is 300 rpm.
The unloading device is connected by means of a pipeline with the block for
anaerobic bioconversion of organic matter of the prepared substrate.
The VLA is a tube with a diameter of 50 mm, made of stainless steel and placed instead
of the rotor in the stator of an induction motor. In the pipe, the initial mixture of
components is affected by the electromagnetic field created by the stator windings and
by intensively and chaotically moving ferromagnetic bodies, which change their
direction of movement with a frequency equal to the frequency of the current. In those
areas of the pipe where electromagnetic fields arise, a vortex layer is literally created,
which is why the devices under consideration got their name. In this layer, all possible
mechanical effects on the processed material are realized [3].
The suction branch pipe of the peristaltic pump for circulation of the substrate
through the VLA is connected to the branch pipe for the outlet of the substrate from the
preheating tank. The supply branch pipe of this pump is connected to the hydraulic
inlet of the VLA, and the hydraulic outlet of the VLA is connected to the substrate
supply pipe of the preheating tank.
Thus, during the preheating process, the substrate circulates through the VLA,
where it is subjected to additional pretreatment.
The block for anaerobic bioconversion of organic matter of the prepared substrate
includes a laboratory anaerobic bioreactor, as well as equipment required for the
functioning of the anaerobic bioreactor.
The laboratory anaerobic bioreactor is a steel cylindrical tank equipped with a gas
dome, devices for loading and unloading a substrate, a biogas outlet pipe, a substrate
stirring device, and a substrate heating device equipped with a temperature sensor.
The diameter of the laboratory anaerobic bioreactor is 400 mm, the height of the
bioreactor is 500 mm, the filling factor is 0.9 (the volume of the apparatus is 56 L, the
volume of the fermented substrate is 50 L).
The pretreated substrate is loaded into the laboratory anaerobic bioreactor through a
substrate loading device that is connected to the preheating tank unloading device.
In the laboratory anaerobic bioreactor, the optimal conditions for the vital activity
of anaerobic microorganisms are maintained: temperature and mass transfer using a
heating system and through mechanical stirring.
The temperature in the laboratory anaerobic bioreactor is maintained using a
heating system, which consists of: an electric clamp heater located in the lower part of
the outer wall of the anaerobic bioreactor, and a temperature sensor.
To intensify the fermentation process in a laboratory anaerobic bioreactor, stirring
is carried out. Stirring is carried out using a mechanical stirrer controlled by a time
switch. The diameter of the stirrer blades is 300 mm, the height of the main blade is
25 mm, the height of the blade for breaking the crust is 25 mm, the rotational speed is
60 rpm.

The unloading of the anaerobically treated substrate occurs automatically by
gravity (overflow) through the substrate unloading device when the next portion of
prepared substrate is added. The substrate unloading device is connected to a block for
separating the anaerobically treated substrate into fractions.
The biogas formed in the laboratory anaerobic bioreactor, the main constituent of
which is methane, is collected in the gas dome and removed through the biogas outlet
pipe and the hydraulic seal to the unit for metering the quality and quantity of biogas,
which is included in the process control block.
The clamp electric heater ensures the maintenance of the temperature regime of
anaerobic processing of the substrate in the laboratory anaerobic bioreactor.
The block for separating the anaerobically treated substrate into fractions includes
an effluent sump equipped with a supernatant discharge collector, a thickened fraction
outlet pipe and an anaerobically treated substrate inlet pipe.
The effluent sump is a rectangular steel vessel with a total volume of 45 L.
The supernatant drain collector makes it possible to control the volume of the
resulting thickened fraction, depending on the rheological properties of the initial and
fermented substrates and on the sedimentation time (hydraulic retention time in the
effluent sump).
The block for recirculation of the thickened fraction of the anaerobically treated
substrate includes a peristaltic pump, the suction branch pipe of which is connected to
the thickened fraction outlet pipe of the effluent settler, and the supply branch pipe is
connected to the preheating tank loading device.
The process control block includes temperature relays, time relays, a biogas
quantity and quality metering unit, as well as actuators (starters, intermediate relays,
etc.), sensors and light-signaling fittings.
The described experimental plant is designed to study the effectiveness of the
introduction of the VLA for the treatment of recycled biomass in the technology of
methane digestion of organic waste.
The described experimental plant allows experimental research in the following
modes:
– anaerobic treatment of liquid organic waste without biomass recirculation and
without treatment in VLA in mesophilic or thermophilic conditions with different
hydraulic retention times;
– anaerobic treatment of liquid organic waste with recirculation of biomass and
without treatment in VLA in mesophilic or thermophilic conditions with different
hydraulic retention times;
– anaerobic treatment of liquid organic waste with preliminary treatment in VLA
without recirculation of biomass in mesophilic or thermophilic conditions with
different hydraulic retention times;
– anaerobic treatment of liquid organic waste with recirculation of biomass and pre-
treatment in VLA under mesophilic or thermophilic conditions with different
hydraulic retention times.
At the same time, in modes with biomass recirculation, operation is provided with
different recirculation coefficients (up to 2).
However, in order to carry out experimental studies in a continuous mode and
obtain experimental data of increased accuracy, it is necessary to develop an
intellectualized system for monitoring the technological processes and measuring the
operating parameters of each of the blocks of the biogas plant with an improved system
for the preliminary preparation of initial waste.

3 Results of Investigations

Figure 1 shows a block diagram of the proposed method for increasing the efficiency of
anaerobic bioconversion of organic waste for the study of which an experimental setup
was created and an intellectualized process control system was developed.

Fig. 1. A block diagram of a method for increasing the efficiency of anaerobic bioconversion of
organic waste to produce gaseous energy and organic fertilizers based on biomass recycling using
a vortex layer apparatus

When developing an intellectualized control system of technological processes of
an experimental biogas plant with an improved system for preliminary preparation of
initial waste, both previously developed schemes [7, 8] and the results of studies of
other authors [9, 10] were taken into account.
The main control element of the intellectualized process control system is a
programmable logic controller (PLC) ("UDIRCABiBo" in the diagram), which
emulates the functions of regulators for various purposes. A functional diagram of the
integration of the PLC into the intellectualized process control system is shown in Fig. 2.
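For illustration only, a minimal sketch of the kind of regulator logic such a PLC emulates is given below: a two-position (on/off) temperature controller with hysteresis for the clamp heater and a cyclic time relay for the stirrer. The class names, setpoint and timing values are assumptions chosen to mirror the description above; the actual PLC program is not reproduced in the paper.

```python
class OnOffTemperatureRegulator:
    """Two-position temperature controller with hysteresis (illustrative)."""
    def __init__(self, setpoint_c: float, hysteresis_c: float = 0.5):
        self.setpoint = setpoint_c
        self.hysteresis = hysteresis_c
        self.heater_on = False

    def update(self, temperature_c: float) -> bool:
        # Switch on below (setpoint - hysteresis), off above (setpoint + hysteresis)
        if temperature_c < self.setpoint - self.hysteresis:
            self.heater_on = True
        elif temperature_c > self.setpoint + self.hysteresis:
            self.heater_on = False
        return self.heater_on


class CyclicTimeRelay:
    """Time relay driving the stirrer: on for on_s seconds, off for off_s, repeating."""
    def __init__(self, on_s: float, off_s: float):
        self.on_s, self.off_s = on_s, off_s

    def output(self, t_s: float) -> bool:
        return (t_s % (self.on_s + self.off_s)) < self.on_s


# Assumed example values: 38 C setpoint, stirrer 5 min on / 25 min off
regulator = OnOffTemperatureRegulator(setpoint_c=38.0)
stirrer = CyclicTimeRelay(on_s=300.0, off_s=1500.0)
print(regulator.update(37.2), stirrer.output(t_s=120.0))  # -> True True
```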


Fig. 2. Functional diagram of PLC integration into an intellectualized process control system

In Fig. 2, arrows show the directions of electrical signals between the simulated
controllers (temperature and level controllers, time relays) and the contacts of sensors
and actuators (the numbers of contacts in Fig. 2 correspond to the numbers of contacts
in Fig. 3). The designations of regulators and sensors comply with GOST 21.208-2013.
The locations of the sensors and actuators required for the implementation of the
processes of pretreatment of the mixture of substrates, anaerobic bioconversion of
organic matter, as well as for the recirculation of the thickened fraction of the anaer-
obically treated substrate, are shown in the schematic diagram of the intellectualized
process control system shown in Fig. 3.
The main parameters to be determined in the course of the experimental research are listed below; a possible logging structure for them is sketched after the list:
– hydraulic retention time in the anaerobic bioreactor;
– hydraulic retention time of the vortex layer apparatus;
– hydraulic retention time in the preparation reactor;
– the quantity and quality of the produced biogas;
– the rate of recirculation of the thickened fraction of the anaerobically treated
substrate;
– frequency of the electromagnetic field in the vortex layer apparatus;
– substrate temperature during anaerobic processing;
– initial substrate temperature.
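A possible shape for the experimental records that such a system would log, covering the parameters listed above, is sketched below; all field names and example values are assumptions for illustration, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class ExperimentRecord:
    """One logged observation of the experimental biogas plant (illustrative)."""
    hrt_bioreactor_d: float     # hydraulic retention time in the anaerobic bioreactor, days
    hrt_vla_s: float            # hydraulic retention time in the vortex layer apparatus, s
    hrt_preparation_d: float    # hydraulic retention time in the preparation reactor, days
    biogas_volume_l: float      # quantity of produced biogas, L
    methane_fraction: float     # biogas quality as the CH4 volume fraction, 0..1
    recirculation_rate: float   # recirculation rate of the thickened fraction (up to 2)
    field_frequency_hz: float   # frequency of the electromagnetic field in the VLA, Hz
    t_fermentation_c: float     # substrate temperature during anaerobic processing, deg C
    t_initial_c: float          # initial substrate temperature, deg C

record = ExperimentRecord(12.0, 90.0, 1.0, 48.5, 0.62, 1.5, 50.0, 38.1, 37.6)
print(record)
```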

Fig. 3. Schematic diagram of the intellectualized process control system: 1 - laboratory RP
heater; 2 - laboratory reactor for preparation (RP) of the substrate for anaerobic digestion; 3 -
mixing device of the laboratory RP; 4 - pump for circulation of the substrate being prepared; 5 -
valve for loading the prepared substrate; 6 - LAB heater; 7 - laboratory anaerobic bioreactor
(LAB); 8 - LAB mixing device; 9 - device for separating the fermented substrate into fractions;
10 - valve for removing the supernatant; 11 - pump for recirculation of the thickened fraction of
the anaerobically treated substrate; 12 - vortex layer apparatus (VLA).

4 Conclusion

The application of the proposed method of intensification makes it possible to obtain
the following effects:
1. The return of the thickened effluent fraction increases the residence time of the
hardly decomposable components of organic waste in the reactor, which in turn
leads to an increase in the degree of decomposition of organic matter and,
accordingly, an increase in the biogas yield.
2. Biomass recycling leads to an increase in the concentration of methanogenic
microorganisms in the bioreactor, which will increase the productivity of the
bioreactor and ensure the stability of its operation when the characteristics of the
incoming raw materials change.
3. Fine grinding improves the rheological properties of the substrate, enables partial
hydrolysis of complex organic compounds, improves the availability of nutrients
for microorganisms, and ensures heating of the substrate.
4. The introduction of ferromagnetic particles (steel needles) as an abrasive working
body into the substrate shortens the start-up time of the bioreactor, increases the
rate of formation and the final yield of methane, provides more complete
decomposition of the substrate, reduces the required volume of the bioreactor, and
increases the adaptive ability of the microbial community to adverse conditions
(for example, excessive accumulation of volatile fatty acids (VFA) or H2, or
lowered pH).
Thus, the use of the vortex layer apparatus for additional processing of the recirculated
biomass of the thickened fraction of fermented organic waste, mixed with the prepared
initial organic waste, in a system for increasing the efficiency of anaerobic
bioconversion of organic waste to obtain a gaseous energy carrier based on biomass
recirculation will provide a synergistic effect of a set of methods for intensifying the
process of anaerobic bioconversion of organic waste of the agricultural industry in
anaerobic bioreactors, namely:
– mechanical impact on the initial organic waste (grinding of the initial mass in VLA
before loading into the bioreactor);
– biochemical methods of influencing the fermentable substrate (introducing ferro-
magnetic particles into the substrate in VLA);
– electromagnetic processing of fermentable substrate;
– microbiological methods (biomass retention due to its recirculation).
The use of the developed intellectualized process control system makes it possible
to determine the main parameters of experimental research and maintain the specified
operating modes of the equipment of the experimental biogas plant, which in turn will
reduce the error in the subsequent mathematical processing of experimental data.

5 Acknowledgment

This work was supported by the Federal State Budgetary Institution “Russian Foun-
dation for Basic Research” as part of scientific project No. 18-29-25042.

References
1. Kovalev, A.A.: Tekhnologii i tekhniko-energeticheskoye obosnovaniye proizvodstva biogaza
v sistemakh utilizatsii navoza zhivotnovodcheskikh ferm (Technologies and feasibility study
of biogas production in manure utilization systems of livestock farms). Dissertatsiya …
doktora tekhnicheskikh nauk (Thesis … doctor of technical sciences), p. 242, All-Russian
Research Institute of Electrification of Agriculture, Moscow (1998)

2. Kovalev, D., Kovalev, A., Litti, Y., Nozhevnikova, A., Katrayeva, I.: Vliyaniye nagruzki po
organicheskomu veshchestvu na protsess biokonversii predvaritel’no obrabotannykh
substratov anaerobnykh bioreaktorov (The Effect of the Load on Organic Matter on
Methanogenesis in the Continuous Process of Bioconversion of Anaerobic Bioreactor
Substrates Pretreated in the Vortex Layer Apparatus). Ekologiya i promyshlennost’ Rossii
(Ecology and Industry of Russia) 23(12), 9–13 (2019). https://doi.org/10.18412/1816-0395-
2019-12-9-13
3. Litti, Yu., Kovalev, D., Kovalev, A., Katraeva, I., Russkova, J., Nozhevnikova, A.:
Increasing the efficiency of organic waste conversion into biogas by mechanical pretreatment
in an electromagnetic mill. In: Journal of Physics: Conference Series, vol. 1111, no. 1,
p. 012013 (2018). https://doi.org/10.1088/1742-6596/1111/1/012013
4. Kovalev, D.A., Kovalev, A. A., Katraeva, I.V., Litti, Y.V., Nozhevnikova A.N.: Effekt
obezzarazhivaniya substratov anaerobnykh bioreaktorov v apparate vikhrevogo sloya (The
effect of disinfection of substrates of anaerobic bioreactors in the vortex layer apparatus).
Khimicheskaya bezopasnost’ (chemsafety) 3(1), 56–64 (2019) https://doi.org/10.25514/
CHS.2019.1.15004
5. Kovalev, A.A., Kovalev, D.A., Grigor’yev, V.S.: Energeticheskaya effektivnost’ pred-
varitel’noy obrabotki sinteticheskogo substrata metantenka v apparate vikhrevogo sloya
(Energy Efficiency of Pretreatment of Digester Synthetic Substrate in a Vortex Layer
Apparatus). 2020; 30(1):92–110. Inzhenernyye tekhnologii i sistemy (Engineering Tech-
nologies and Systems) 30(1), 92–110 (2020). https://doi.org/10.15507/2658-4123.030.
202001.092-110
6. Litti, Y.V., Kovalev, D.A., Kovalev, A.A., Katrayeva, I.V., Mikheyeva, E.R., Nozhevni-
kova, A.N.: Ispol’zovaniye apparata vikhrevogo sloya dlya povysheniya effektivnosti
metanovogo sbrazhivaniya osadkov stochnykh vod (Use of a vortex layer apparatus for
improving the efficiency of methane digestion of wastewater sludge) Vodosnabzheniye i
sanitarnaya tekhnika (Water supply and sanitary technique) 11, 32–40 (2019). https://doi.
org/10.35776/MNP.2019.11.05
7. Kovalev, A., Kovalev, D., Panchenko, V., Kharchenko, V., Vasant, P.: Optimization of the
process of anaerobic bioconversion of liquid organic wastes. In: Intelligent Computing &
Optimization. Advances in Intelligent Systems and Computing, vol. 1072, pp. 170–176
(2020). https://doi.org/10.1007/978-3-030-33585-4_17
8. Kovalev, A., Kovalev, D., Panchenko, V., Kharchenko, V., Vasant, P.: System of
optimization of the combustion process of biogas for the biogas plant heat supply. In:
Intelligent Computing & Optimization. Advances in Intelligent Systems and Computing,
vol. 1072, pp. 361–368 (2020). https://doi.org/10.1007/978-3-030-33585-4_36
9. Kalyuzhnyi, S., Sklyar, V., Fedorovich, V., Kovalev, A., Nozhevnikova, A.: The
development of biotechnological methods for utilisation and treatment of diluted manure
streams. In: Proceeding IV International Conference IBMER, Warshawa. (1998)
10. Mulata, D.G., Jacobi, H.F., Feilberg, A., Adamsen, A.P.S., Richnow, H.-H., Nikolausz, M.:
Changing feeding regimes to demonstrate flexible biogas production: effects on process
performance, microbial community structure and methanogenesis pathways. Appl. Environ.
Microbiol. 82(2), 438–449 (2016). https://doi.org/10.1128/AEM.02320-15
Way for Intensifying the Process of Anaerobic
Bioconversion by Preliminary Hydrolysis
and Increasing Solid Retention Time

Andrey Kovalev1(&), Dmitriy Kovalev1, Vladimir Panchenko1,2,
Valeriy Kharchenko1, and Pandian Vasant3

1 Federal Scientific Agroengineering Center VIM,
1st Institutskij proezd 5, 109428 Moscow, Russia
kovalev_ana@mail.ru
2 Russian University of Transport, Obraztsova street 9, 127994 Moscow, Russia
pancheska@mail.ru
3 Universiti Teknologi PETRONAS, 31750 Tronoh, Ipoh, Perak, Malaysia
pvasant@gmail.com

Abstract. The negative impact of agricultural activities on the environment is
associated with the formation of liquid and solid waste from agricultural and
processing industries. Obtaining biogas is economically justified and is prefer-
able in the processing of a constant stream of waste. The purpose of the work is
to develop the foundations of a method for intensifying the process of anaerobic
bioconversion of organic matter to obtain a gaseous energy carrier through the
combined use of recirculation of a thickened fraction of fermented waste (to
increase solid retention time and retain biomass) and preliminary hydrolysis of a
mixture of an initial substrate and a thickened fraction of fermented waste. The
article describes the foundations of the proposed way for intensifying the process
of anaerobic bioconversion by preliminary hydrolysis and increasing solid
retention time, including both its technological scheme and the material balance
of a biogas plant implementing it. Application of the proposed method for
intensifying the process of anaerobic bioconversion of organic matter will provide
a synergistic effect of a combination of methods for intensifying the process of
anaerobic bioconversion.

Keywords: Anaerobic treatment · Preliminary preparation of initial waste · Hydrolysis · Bioconversion of organic waste · Recirculation of biomass

1 Introduction

The negative impact of agricultural activities on the environment is associated not only
with the increasing consumption of natural resources, but also, to a greater extent, with
the formation of liquid and solid waste from agricultural and processing industries. In
particular, raising animals, processing meat and dairy products, producing beer, sugar,
starch, etc. are accompanied by the formation of a large amount of wastewater [1, 2].
Obtaining biogas is economically justified and is preferable in the processing of a
constant stream of waste, and is especially effective in agricultural complexes, where
there is the possibility of a complete ecological cycle [3].


The transition of animal husbandry to an industrial basis and the associated con-
centration of animals on large farms and complexes lead to a sharp increase in the
volume of organic waste that must be disposed of without polluting the environment.
One of the ways of processing organic waste is its anaerobic digestion in biogas
plants due to the vital activity of microorganisms (methanogenesis), when biogas is
obtained with the help of microbiological processing of biomass and used, in turn, as a
raw material for obtaining thermal and electrical energy [3].
The development, design and construction of new reactors for biogas production from
various agricultural wastes, and the modification of existing ones, are intended to solve
a number of significant energy and environmental problems. They make it possible to
reduce the anthropogenic load on ecosystems by reducing harmful greenhouse gas
emissions and by fully utilizing and recycling organic waste. In addition, the use of
biogas plants can provide facilities with uninterrupted electricity and heat supply for
their own needs, as well as obtain, through the use of various technological solutions,
high-quality products from methanogenesis waste, which can be used as fertilizer in
greenhouses, feed additives or bedding for livestock [3].
Despite the long-term use of biogas plants and an even longer period of studies of
the processes occurring in them, our understanding of their basic laws and the
mechanisms of individual stages is insufficient, which in some cases determines the low
efficiency of biogas plants. It does not allow them to be controlled to the necessary
extent, and leads to unjustified overstatement of construction volumes, increased
operating costs and, accordingly, a higher cost of 1 m³ of biogas produced. This puts forward the tasks of
developing the most effective technological schemes for biogas plants, the composition
of their equipment, creating new designs and calculating their parameters, improving
the reliability of their work, reducing the cost and construction time, which is one of the
urgent problems in solving the issue of energy supply of agricultural production
facilities [3].
In this regard, the main goal of the research was to modernize the technology using
one of the methods for intensifying processes in biogas plants - preliminary hydrolysis
of the initial organic waste together with recirculation of the thickened fraction of
fermented waste.
Thus, the purpose of the work is to develop the foundations of a way for
intensifying the process of anaerobic bioconversion of organic matter to obtain a
gaseous energy carrier. It is planned to intensify the process through the combined use
of recirculation of a thickened fraction of fermented waste and preliminary hydrolysis.
Recirculation of a thickened fraction of fermented waste makes it possible to increase
solid retention time and to retain biomass in the bioreactor. At the same time, it is
planned to subject a mixture of the initial substrate and the thickened fraction of
fermented waste to preliminary hydrolysis.

2 Background

Biomethanogenesis is a complex multistage decomposition of various organic
substances under anaerobic conditions under the influence of bacterial flora, the end result
of which is the formation of methane and carbon dioxide [4].

According to modern views, the anaerobic conversion of almost any complex
organic matter into biogas goes through four successive stages:

– stage of hydrolysis;
– fermentation stage;
– acetogenic stage;
– methanogenic stage [4].
The limiting stage of methane digestion of urban wastewater sludge (WWS) is the
hydrolysis of the solid phase, in particular, activated sludge, which consists mainly of
cells of microorganisms and a small amount of dissolved organic matter (OM). The
sediment of the primary settling tanks contains more dissolved OM and fewer active
microorganisms as compared to activated sludge. Soluble organic compounds, which
can be further converted into biogas, are formed in WWS in the course of hydrolysis.
Therefore, the biogas yield during WWS fermentation is in direct proportion to the
biodegradability of the sludge and, accordingly, the rate of hydrolysis. One of the
technological methods of increasing the bioavailability of sediments is their pretreat-
ment before fermentation in digesters. Pretreatment of sludge allows:

• lyse/disintegrate microbial cells of activated sludge;


• solubilize sediment solids;
• partially decompose the formed organic polymers to monomers and dimers [5].
Hydrolysis of macromolecules (polysaccharides, proteins, lipids) included in the
organic matter is carried out by exogenous enzymes excreted into the intercellular
medium by various hydrolytic microorganisms. The action of these enzymes leads to
the production of relatively simple products, which are efficiently utilized by the
hydrolytics themselves and by other groups of bacteria at the subsequent stages of
methanogenesis. The hydrolysis phase in methane fermentation is closely related to the
fermentation (acidogenic) phase, with hydrolytic bacteria performing both phases, and
they are sometimes combined with enzymatic bacteria [4].
Another way to increase the efficiency of the process of anaerobic bioconversion of
organic waste to obtain a gaseous energy carrier and organic fertilizers is to increase
solid retention time in the reactor [6].
For the effective operation of the reactor, it is important to correctly correlate two
main parameters - the retention time of raw materials in the reactor and the degree of
digestion of the raw materials. Retention time is the time it takes to completely replace
raw materials in the reactor. The degree of digestion of raw materials is the percentage
of decomposed organic matter converted into biogas over a certain period of time. In
the process of anaerobic fermentation, the organic matter of the waste passes into
biogas, i.e. the amount of total solids in the reactor is constantly decreasing. The
formation of biogas is usually maximum at the beginning of the process, then the
biogas yield decreases. Often, the longer the raw material is held in the reactor, the
more methane is recovered, due to the increased contact time between the microor-
ganisms and the substrate. Typically, in periodic systems, the degree of digestion of
raw materials is higher than in continuous. Theoretically, in periodic systems, the
degree of digestion of raw materials can reach 100%. In practice, however, complete
(100%) decomposition of raw materials and complete extraction of biogas is impossible.
The degree of digestion of the raw material also depends on its type. Rapidly
decaying waste, such as squeezed sugar beets, can have a degree of decomposition of
more than 90%, whereas forage crops with a high fiber content decompose by 60%
over the same period (see Table 1) [7].

Table 1. Degree of digestion of various types of raw materials [7]

Raw material                | Degree of digestion (% of ashless substance)
Cattle manure               | 35
Pig manure                  | 46
Forage crops                | 64
Squeezed sugar beets (cake) | 93
Fruit and vegetable waste   | 91

Thus, based on the degree of digestion of raw materials, such a retention time of
raw materials in the reactor is experimentally selected that ensures the most efficient
operation of the reactor (i.e., the maximum biogas yield at a relatively high degree of
decomposition of the substrate). A constant volume of raw materials is maintained in
the reactor by supplying new raw material to the reactor and removing the fermented
mass at regular intervals. Often the volume of added raw materials is more than the
volume of the removed fermented mass, because part of the total solids of the raw
material passes into biogas during the fermentation process. The volume of raw
materials in the reactor can be adjusted by adding a certain amount of liquid. The
fermented mass (substrate removed from the reactor after fermentation) consists of
water, including dissolved salts, inert materials and undecomposed OM. The fermented
mass also contains the biomass of microorganisms accumulated during the retention of
raw materials in the reactor.
Hydraulic retention time (HRT) is the average length of time that liquids and
soluble compounds remain in the reactor. An increase in HRT promotes a longer
contact between microorganisms and the substrate, but requires a slower supply of raw
materials (reactor loading) and/or a larger reactor volume. If the retention time is too
short, there is a great risk that the growth rate of the microorganisms will be lower than
the rate of removal of the raw materials from the reactor. Often, the doubling time of
methanogens in reactors is more than 12 days. Therefore, HRT should be longer than
this time, otherwise microorganisms will be removed out of the reactor during the
unloading of the fermented mass and the population will not be dense enough to
efficiently decompose the raw materials. For single-stage reactors, HRT ranges from 9
to 30 days; HRT for thermophilic reactors averages 66% of that for mesophilic reactors
(10–16 versus 15–25 days).
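A worked example of this sizing constraint, with all numbers assumed purely for illustration, is sketched below.

```python
def hydraulic_retention_time(working_volume_l: float, feed_rate_l_per_d: float) -> float:
    """HRT = working volume / volumetric feed rate, in days."""
    return working_volume_l / feed_rate_l_per_d

# The doubling time of methanogens often exceeds 12 days (see text),
# so HRT should stay above this bound to avoid washing biomass out.
METHANOGEN_DOUBLING_TIME_D = 12.0

for feed_l_per_d in (4.0, 2.0):  # assumed feed rates for a 50 L working volume
    hrt = hydraulic_retention_time(50.0, feed_l_per_d)
    print(f"feed {feed_l_per_d} L/d -> HRT {hrt:.1f} d, "
          f"exceeds doubling time: {hrt > METHANOGEN_DOUBLING_TIME_D}")
```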
One of the ways to increase the efficiency of the process of anaerobic bioconversion
of organic waste to obtain a gaseous energy carrier and organic fertilizers is to increase
the time that solid matter spends in the reactor (the solid retention time, SRT).

By separating the liquid fraction (supernatant) from the solid one using one of the
known methods (sedimentation, centrifugation, etc.), it is possible to increase the solid
retention time of the substrate without increasing its hydraulic retention time.
The return flow consists not only of the biomass of methane-forming bacteria, but
also of the not completely decomposed organic matter of the sediment. With this
approach, the dissolved organic matter decomposes first, and the particles of organic
matter remaining in the solid phase, which require a longer decomposition time, are
returned to re-fermentation. An important advantage is the ability to increase the SRT
of sediment without increasing HRT, as well as the retention of biomass. As a result,
the required degree of digestion is achieved in smaller reactors [6].

3 Results of Investigations

The foundations of a way for intensifying the process of anaerobic bioconversion of
organic matter with the production of a gaseous energy carrier include, among other
things, the technological scheme and the material balance. Figure 1 shows the
technological scheme for using the proposed way for intensifying the process of
anaerobic bioconversion by preliminary hydrolysis and increasing solid retention time.

Fig. 1. Technological scheme for using the proposed way for intensifying the process of
anaerobic bioconversion by preliminary hydrolysis and increasing solid retention time: 1 -
hydrolyser; 2 - anaerobic bioreactor; 3 - effluent settler; 4 - pump for recirculation of the
thickened fraction; 5 - water seal.

The anaerobic bioreactor (2) is equipped with both a stirring system and a heating
system with temperature control of the fermentation process. Also, the anaerobic
bioreactor (2) is equipped with systems for loading and unloading the substrate, as well
as a branch pipe for removing biogas. The design of the hydrolyser (1) repeats the
design of the anaerobic bioreactor (2) in a reduced form. As a consequence, the
hydrolyser (1) has the same systems as the anaerobic bioreactor (2). However, the heat
supply system of the hydrolyser (1) controls the temperature of the hydrolysis process,
and acid biogas is discharged through the biogas removal pipe and sent to the
anaerobic bioreactor. The effluent settler (3) is equipped with an overflow pipe, as well
as systems for loading the fermented substrate, removing the supernatant liquid and
draining the thickened fraction. The pump for recirculation of the thickened fraction
(4) delivers the thickened fraction of the fermented substrate to the hydrolyser (1). The
water seal (5) is used to prevent air from entering the anaerobic bioreactor when
biogas is removed, as well as to maintain the biogas pressure.
When developing a technological scheme for using the proposed way for intensi-
fying the process of anaerobic bioconversion by preliminary hydrolysis and increasing
solid retention time, both schemes of continuous supply of the initial substrate [8] and
schemes with discrete supply were taken into account [9]. In addition, it is proposed to
perform the heat supply scheme for the biogas plant in accordance with the scheme
described in the article [10].
Material Balance of Biogas Plant
The general view of the material balance of the hydrolyser is as follows:

$G_{init} + k \cdot G_{thd} = G_{inf} + G_{abg}$   (1)

where G_init is the specific feed of the initial substrate to the hydrolyser for
pretreatment before anaerobic digestion, kg/kg_init, where kg_init denotes a kilogram of
organic matter in the initial substrate;
k is the proportion of the recirculated thickened fraction of fermented waste;
G_thd is the specific yield of the thickened fraction of fermented waste from the
effluent settler, kg/kg_init;
G_inf is the specific yield of the initial substrate prepared for anaerobic treatment,
kg/kg_init;
G_abg is the specific loss of organic matter with acid biogas during the preparation of
the initial substrate for anaerobic treatment, kg/kg_init.
Specific losses of organic matter with acid biogas during the preparation of the
initial substrate for anaerobic treatment will be:

$G_{abg} = (G_{init} + k \cdot G_{thd}) \cdot \varphi_h$   (2)

where φ_h is the degree of digestion of organic matter in the hydrolyser during the
preparation of the initial substrate for anaerobic treatment.
The general view of the material balance of the anaerobic bioreactor is as follows:

$G_{inf} + G_{abg} = G_{eff} + G_{bg}$   (3)

where G_eff is the specific yield of fermented waste from the anaerobic bioreactor,
kg/kg_init, and G_bg is the specific biogas yield from the anaerobic bioreactor, kg/kg_init.

Specific biogas yield from anaerobic bioreactor will be:

$G_{bg} = (G_{inf} + G_{abg}) \cdot \varphi_m$   (4)

where φ_m is the degree of digestion of organic matter during anaerobic treatment of
the prepared substrate in the anaerobic bioreactor.
The general view of the material balance of the effluent settler is as follows:

$G_{eff} = G_{thd} + G_{sup}$   (5)

where G_sup is the specific yield of the supernatant liquid from the effluent settler,
kg/kg_init.
The general view of the material balance of a biogas plant is as follows:

$G_{init} = G_{sup} + G_{bg} + (1 - k) \cdot G_{thd}$   (6)

Figure 2 shows a block diagram of the material balance of a biogas plant with the
proposed way for intensifying the process of anaerobic bioconversion by preliminary
hydrolysis and increasing solid retention time.


Fig. 2. Block diagram of the material balance of a biogas plant with the proposed way for
intensifying the process of anaerobic bioconversion by preliminary hydrolysis and increasing
solid retention time
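Under the additional assumption that a fixed share s of the fermented waste leaves the settler as the thickened fraction (s is not specified in the paper), Eqs. (1)-(6) close and can be solved per kilogram of initial organic matter. The sketch below illustrates this and verifies the overall balance (6); all numeric inputs are arbitrary examples.

```python
def biogas_plant_balance(k: float, phi_h: float, phi_m: float, s: float,
                         g_init: float = 1.0) -> dict:
    """Specific material balance of the plant per kg of initial organic matter.

    k: proportion of recirculated thickened fraction; phi_h, phi_m: degrees of
    digestion in the hydrolyser and the bioreactor; s: assumed share of the
    fermented waste leaving the settler as the thickened fraction.
    """
    # From Eqs. (3)-(5) with G_thd = s * G_eff:
    g_thd = s * (1.0 - phi_m) * g_init / (1.0 - k * s * (1.0 - phi_m))
    hyd_in = g_init + k * g_thd            # hydrolyser input, left side of Eq. (1)
    g_abg = hyd_in * phi_h                 # Eq. (2): losses as acid biogas
    g_inf = hyd_in - g_abg                 # Eq. (1): prepared substrate
    g_bg = (g_inf + g_abg) * phi_m         # Eq. (4); acid biogas is fed to the bioreactor
    g_eff = (g_inf + g_abg) - g_bg         # Eq. (3): fermented waste
    g_sup = g_eff - g_thd                  # Eq. (5): supernatant
    # Overall balance, Eq. (6), must hold:
    assert abs(g_init - (g_sup + g_bg + (1.0 - k) * g_thd)) < 1e-9
    return {"G_inf": g_inf, "G_abg": g_abg, "G_bg": g_bg,
            "G_eff": g_eff, "G_thd": g_thd, "G_sup": g_sup}

print(biogas_plant_balance(k=0.8, phi_h=0.05, phi_m=0.5, s=0.3))
```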

4 Conclusion

Application of the proposed method for intensifying the process of anaerobic
bioconversion of organic matter will provide a synergistic effect of a combination of
methods for intensifying the process of anaerobic bioconversion.
The use of recirculation of the thickened fraction of fermented waste makes it possible:

– to increase the retention time in the reactor of hardly decomposable components of
organic waste, which in turn leads to an increase in the degree of digestion of
organic matter and, accordingly, to an increase in the yield of biogas;
– to increase the concentration of microorganisms in the hydrolysis reactor, which
will make it possible to intensify the hydrolysis process, thereby reducing the time
for pretreatment of the substrate before anaerobic fermentation;
– to increase the concentration of methanogenic microorganisms in the anaerobic
bioreactor, which, in turn, will increase the productivity of the bioreactor and ensure
the stability of its operation when the characteristics of the incoming raw material
change.
Pre-treatment of the substrate in a hydrolyser allows:

– to transfer a significant part of the organic matter of the substrate into a dissolved
state, thus improving the availability of nutrients to methanogenic microorganisms;
– to heat the initial substrate to the fermentation temperature in the anaerobic
bioreactor, which, in turn, makes it possible to maintain the optimal temperature
mode in the anaerobic bioreactor, avoiding temperature fluctuations that are harmful
to methanogenic microorganisms.
The supply of the acid biogas formed during hydrolysis to the anaerobic bioreactor
allows:
– to avoid the loss of organic matter of the original substrate with the acid biogas
formed during hydrolysis;
– to intensify mixing in the anaerobic bioreactor, which, in turn, improves mass
transfer and, as a consequence, the availability of nutrients of the fermented mass to
methanogenic microorganisms;
– to provide methanogenic microorganisms with additional nutrients in the gas phase.

Acknowledgment. This work was supported by the Federal State Budgetary Institution “Rus-
sian Foundation for Basic Research” as part of scientific project No. 18-29-25042.

References
1. Izmaylov, A.Y., Lobachevskiy, Y.P., Fedotov, A.V., Grigoryev, V.S., Tsench, Y.S.:
Adsorption-oxidation technology of wastewater recycling in agroindustrial complex
enterprises. Vestnik mordovskogo universiteta Mordovia Univ. Bull. 28(2), 207–221
(2018). https://doi.org/10.15507/0236-2910.028.201802.207-221
2. Artamonov, A.V., Izmailov, A.Y., Kozhevnikov, Y.A., Kostyakova, Y.Y., Lobachevsky, Y.
P., Pashkin, S.V., Marchenko, O.S.: Effective purification of concentrated organic
wastewater from agro-industrial enterprises, problems and methods of solution. AMA
Agric. Mech. Asia Afr. Latin Am. 49, 49–53 (2018)
3. Kovalev, A.A.: Tekhnologii i tekhniko-energeticheskoye obosnovaniye proizvodstva
biogaza v sistemakh utilizatsii navoza zhivotnovodcheskikh ferm (Technologies and
feasibility study of biogas production in manure utilization systems of livestock farms).
Dissertatsiya ... doktora tekhnicheskikh nauk (Thesis ... doctor of technical sciences), p. 242.
All-Russian Research Institute of Electrification of Agriculture, Moscow (1998)

4. Kalyuzhny, S.V., Danilovich, D.A., Nozhevnikova, A.N.: Results of Science and
Technology, ser. Biotechnology, vol. 29. VINITI, Moscow (1991)
5. Nozhevnikova, A.N., Kallistova, A., Litty, Y., Kevbrina, M.V.: Biotechnology and
Microbiology of Anaerobic Processing of Organic Municipal Waste: A Collective
Monograph. University Book, Moscow (2016)
6. Kevbrina, M.V., Nikolaev, Y.A., Dorofeev, A.G., Vanyushina, A.Y., Agarev, A.M.:
Vysokoeffektivnaya tekhnologiya metanovogo sbrazhivaniya osadka stochnykh vod s
retsiklom biomassy (Highly efficient technology for methane digestion of sewage sludge
with biomass recycling). Vodosnabzheniye i sanitarnaya tekhnika (Water Supply Sanitary
Tech.) 10, 61 (2012)
7. Schnurer, A., Jarvis, A.: Microbiological handbook for biogas plants. Swedish Gas Centre
Rep. 207, 13–8 (2010)
8. Kovalev, A., Kovalev, D., Panchenko, V., Kharchenko, V., Vasant, P.: Optimization of the
process of anaerobic bioconversion of liquid organic wastes. In: Intelligent Computing &
Optimization. Advances in Intelligent Systems and Computing, vol. 1072, pp. 170–176
(2020). https://doi.org/10.1007/978-3-030-33585-4_17
9. Kovalev, D., Kovalev, A., Litti, Y., Nozhevnikova, A., Katraeva, I.: Vliyaniye nagruzki po
organicheskomu veshchestvu na protsess biokonversii predvaritel’no obrabotannykh
substratov anaerobnykh bioreaktorov (The effect of the load on organic matter on
methanogenesis in the continuous process of bioconversion of anaerobic bioreactor
substrates pretreated in the vortex layer apparatus). Ekologiya i promyshlennost’ Rossii
(Ecol. Ind. Russ.) 23(12), 9–13 (2019). https://doi.org/10.18412/1816-0395-2019-12-9-13
10. Kovalev, A., Kovalev, D., Panchenko, V., Kharchenko, V., Vasant, P.: System of
optimization of the combustion process of biogas for the biogas plant heat supply. In:
Intelligent Computing & Optimization. Advances in Intelligent Systems and Computing,
vol. 1072, pp. 361–368 (2020). https://doi.org/10.1007/978-3-030-33585-4_36
Evaluation of Technical Damage Caused
by Failures of Electric Motors

Anton Nekrasov1, Alexey Nekrasov1, and Vladimir Panchenko2,1(&)

1 Federal State Budgetary Scientific Institution "Federal Scientific
Agroengineering Center VIM" (FSAC VIM),
1-st Institutskij 5, 109428 Moscow, Russia
nalios@mail.ru
2 Russian University of Transport, Obraztsova st. 9, 127994 Moscow,
Russian Federation
pancheska@mail.ru

Abstract. Sudden failures of electric motors entail economic damage comprising
technological and technical components. The first is associated with the
underproduction of livestock goods, while the second includes expenditures for the
replacement of the electric motor. An evaluation has been made of the technical
component of the economic damage caused by failures of electric motors of
various capacities installed in the driving assemblies of technological equipment in
livestock production. Dependences of the technical damage on the price and
lifespan of the failed electric motor have been obtained from the results of the
estimations. Failures of electric motors in the equipment of livestock farms cause
interruptions in technological processes leading to material damage. The extent of
the damage is particularly high at remote agricultural enterprises characterized by
substantial dispersion of dairy and fattening farms, in which case the prompt
elimination of an electric motor failure is not an easy task; downtime may
therefore turn out to be unacceptably long. The major factors affecting the
effectiveness indicators of electric motor use in agricultural production that ensure
the reduction of the failure rate and of the time to recover have been specified. The
results of this work will make it possible to estimate the technical component of
the economic damage associated with failures of electric motors, to define the
extent of the economic responsibility of the applied electric equipment, to make
more specific schedules of preventive maintenance and repair and to correct them,
as well as to find room for improvement of the operational effectiveness of energy
services at agricultural enterprises.

Keywords: Technical maintenance of electric equipment · Failures of electric motors · Technical damage

1 Introduction

Improving the system of operation and maintenance of electric equipment will make it
possible to extend its lifespan, to save material expenditures for purchasing new electric
components and for repair of failed ones thus reducing associated technological
damage to agricultural enterprises [1–4]. Failure of an electric motor, or motor shutdown
due to a power outage, leads to economic damage comprising technological and
technical components, arising from, respectively, the underproduction of livestock goods
and the need to replace failed electric motors [5]. The organizational concept, structure
and economic base of agricultural production have changed during recent years. Modern
peasant households, farms and agricultural enterprises are provided with electrified
machinery and equipment, and new types of electric equipment are produced today,
including electric motors and automatic tripping contactors [6, 7]. The evaluation of
damages caused by failures of electric equipment in livestock raising is based on earlier
research [8–10] dedicated to methods of calculating the damage to agricultural
production in case of power outages and electric equipment failures.

2 Technical Damages

Estimating damages shall be done on the basis of a comprehensive approach to the
problem of improving the operational reliability of rural electric installations, with the
use of relevant data on damages caused by failures of electric drives in technological
equipment. A convenient method for calculating damages, one that does not involve
insignificant components, is needed for the practice of the electrical-technical services
of agricultural enterprises. It has to be supported with up-to-date reference materials on
electric equipment and agricultural products that are easy for the service personnel of
agricultural enterprises to use.
In case of failure of an electric motor, the plant or machine in which this motor is
installed stops. The extent of the damage is mainly defined by the duration of the
downtime of the working mechanism [11, 12]. In the majority of cases, failed electric
motors are not subject to repair directly on site. They are transported to a specialized
service center for complete repair, in which case a spare electric motor has to be
installed on the working mechanism. Normally, repair shops are located at a great
distance from agricultural production units. That is why failed motors are stored on site
and sent to the repair shops in batches of 10 to 15 pieces [12]. Many repair shops
maintain dedicated exchange fleets of electric motors in order to reduce the time of
returning repaired motors to the production sites. It means that repair shops receive
failed electric motors in exchange for repaired ones, in the same quantities, so that a
bank of spare components is maintained at the production sites [13].
Damage from replacement of failed electric motor comprises expenditures for
dismantling-assembling and the price of either a new electric motor or one that has
passed complete repair. Besides, it depends on capacity of electric motor, its size and
weight, design and methods of assembly for different types of machines characterized
by different operating conditions including convenience of dismantling-assembling.
That is why the volume of work and the working conditions of electrical installers in
case of failure cannot be reliably defined by a uniform method for a particular type and
capacity of electric motor. It is assumed that, in today's operation and maintenance
conditions of electric motors, the approximate average share of the technical component
of economic damage associated with either ahead-of-time repair or decommissioning of
an electric motor may amount to half of its price.

More specific values of technical damage due to electric motor failure can be
calculated in various ways, depending on whether the failed electric motor is to be
replaced by either a new one or one that has passed complete repair with the use of
fault-free components, or whether it is intended to be repaired either on site or in a
repair shop of the agricultural enterprise. The unrecovered cost of the electric motor
and all other expenditures and charges have to be taken into account, which makes
such calculations relatively complicated.
As evidenced by the available information on the nomenclature of electric motors
supplied to agricultural consumers, electric equipment factories produce mainly motors
in general-purpose industrial versions. It means that they are operated in conditions
they are not intended for. Owing to the low quality of complete repair, the lifespan of
repaired motors does not normally exceed 1 to 1.5 years [8].
In Tables 1, 2, 3 and 4, the input data required for estimating the technical
component of economic damage are given, including price, weight and complete repair
expenditures, for various types of asynchronous electric motors of unified series (in
Russia) most commonly applied in agricultural production [11, 12].

Table 1. Price of electric motors of the 5A series, for various capacities

Rotation frequency (min⁻¹) | Price of electric motor (euros), by capacity (kW)
                           | 5.5    | 7.5    | 11     | 15     | 18.5   | 22     | 30     | 37     | 45     | 55
3000                       | –      | –      | 231.76 | 346.98 | 366.73 | 414.89 | 497.6  | 759.39 | 858.18 | 1038.81
1500                       | –      | 205.78 | 247.92 | 330.92 | 372.90 | 428.47 | 558.1  | 784.08 | 891.51 | 970.54
1000                       | 229.83 | 251.52 | 329.69 | 372.90 | 472.93 | 698.88 | 807.6  | 985.35 | 1248.1 | 1360.64
750                        | 281.53 | 334.63 | 390.19 | 497.61 | 772.98 | 838.41 | 1075.5 | 1296.5 | 1478.7 | 1704.84

Table 2. Weight of electric motors of the 5A series, for various capacities

Rotation frequency (min⁻¹) | Weight of electric motor (kg), by capacity (kW)
                           | 5.5 | 7.5 | 11  | 15  | 18.5 | 22  | 30  | 37  | 45  | 55
3000                       | –   | –   | 70  | 106 | 112  | 140 | 155 | 235 | 255 | 340
1500                       | –   | 64  | 76  | 111 | 120  | 145 | 165 | 245 | 270 | 345
1000                       | 63  | 74  | 108 | 129 | 160  | 245 | 280 | 330 | 430 | 450
750                        | 74  | 108 | 124 | 160 | 240  | 260 | 340 | 430 | 460 | 705

These data on electric motors are intended for use in estimating economic
damages from failures of electric equipment and in planning activities on technical
maintenance and improvement of operational reliability, in order to ensure high
effectiveness of operation at livestock-breeding farms.

Table 3. Average price of electric motors of the AIR series

Rotation frequency (min⁻¹) | Price of electric motor (euros), by capacity (kW)
                           | 0.25 | 0.55 | 1.1  | 2.2  | 3.0   | 4.0    | 5.5   | 7.5    | 11     | 18.5   | 22    | 30
3000                       | 25.9 | 32.2 | 43.8 | 59.8 | 74.95 | 92.01  | 106.2 | 134.46 | 194.34 | 302.59 | 398.6 | 462.2
1500                       | 30.7 | 41   | 52.9 | 72.4 | 92.56 | 103.55 | 139.3 | 175.48 | 211.16 | 328.50 | 406.2 | 495.3
1000                       | 36.2 | 45.4 | 59.0 | 91.1 | 130.1 | 141.44 | 183.6 | 200.97 | 297.86 | 435.42 | 538   | 616.2
750                        | 47.1 | 70.5 | 87.9 | 133  | 146.6 | 192.50 | 205.2 | 297.55 | 333.96 | 559.52 | 596.7 | 775.9

Table 4. Weight of electric motors of the AIR series

Rotation frequency (min⁻¹) | Weight of electric motor (kg), by capacity (kW)
                           | 0.25 | 0.55 | 1.1 | 2.2  | 3.0 | 4.0 | 5.5 | 7.5 | 11  | 18.5 | 22  | 30
3000                       | 3.8  | 6.1  | 9.2 | 15   | 30  | 26  | 32  | 48  | 78  | 130  | 150 | 170
1500                       | 4.2  | 8.1  | 9.4 | 18.1 | 35  | 29  | 45  | 70  | 84  | 140  | 160 | 180
1000                       | 5.6  | 9.9  | 16  | 27   | 43  | 48  | 69  | 82  | 125 | 160  | 195 | 255
750                        | –    | 15.9 | 22  | 43   | 48  | 68  | 82  | 125 | 150 | 210  | 225 | 360

3 Results and Discussion

Technical damage from failures of electric motors depends on multiple factors [14].
That is why its correct values are difficult to define in the practice of agricultural
enterprises. The technical damage D_T associated with replacement of a failed electric
motor comprises the expenditures for dismantling and assembling and the price of
either a new electric motor (P_0) or one that has passed complete repair (P_R), taking
its amortization into account.
In the general case, the technical component of damage from an ahead-of-time
failure of an electric motor (one that has not been operated for its full rated lifespan)
can be calculated as follows [5]:

$D_T = P_R + K_M \left( 1 - \frac{t_F}{T_N} \right) - P_D$   (1)

where P_R are the expenditures associated with replacement of the failed electric motor
with a new one (dismantling and assembling), K_M is the price of the failed electric
motor, P_D is the price of the failed electric motor that has been decommissioned
(metal scrap), and T_N and t_F are, respectively, the rated and actual lifespan of the
electric motor.
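A minimal sketch of this calculation is given below. The inputs follow the assumptions stated in this section (scrap price of 0.35 $ per kg and dismantling/assembling at 10% of a new motor's price; note that the source mixes euros and dollars); with the AIR 0.55 kW data from Tables 3 and 4 it approximately reproduces the first row of Table 5.

```python
def technical_damage(price_new: float, weight_kg: float, t_f: float,
                     t_n: float = 7.0, scrap_per_kg: float = 0.35,
                     replacement_share: float = 0.10) -> float:
    """Technical damage D_T by Eq. (1) for an ahead-of-time motor failure."""
    p_r = replacement_share * price_new  # dismantling/assembling expenditures
    k_m = price_new                      # price of the failed electric motor
    p_d = scrap_per_kg * weight_kg       # scrap value of the decommissioned motor
    return p_r + k_m * (1.0 - t_f / t_n) - p_d

# AIR 0.55 kW, 1500 min^-1: price 41 euros (Table 3), weight 8.1 kg (Table 4)
for t_f in range(1, 7):
    print(t_f, round(technical_damage(41.0, 8.1, t_f), 1))
# t_f = 1 gives ~36.4, close to the 37 euros in the first row of Table 5
```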
In the calculations, factual lifespan data shall be used that come from agricultural
enterprises operating the corresponding types of electric equipment in accordance with
the requirements of the System of Scheduled-Preventive Repair and Technical
Maintenance of Electric Equipment in Agriculture. For electric motors operating in the
hard conditions of livestock breeding, a rated lifespan of 7 years is established [9].
The scrap price of electric motors is assumed to be 20 rubles, or 0.35 $ per kg (Russian
market, 21.05.2020), while the expenditures for dismantling and assembling are
estimated as 10% of the new motor's price. The results of calculating the technical
damage caused by failure of an electric motor operating in animal-breeding premises,
obtained with expression (1), are presented in Table 5.
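As an illustration, the following minimal Python sketch evaluates expression (1) for a 0.55 kW AIR motor, using the price and weight from Tables 3 and 4 and the assumptions stated above (dismantling and assembling at 10% of the new-motor price, scrap at 0.35 $ per kg); the chosen motor model and lifespan are illustrative, not prescribed by the paper.

def technical_damage(p_r, k_m, p_d, t_f, t_n=7.0):
    # Expression (1): D_T = P_R + K_M * (1 - t_F / T_N) - P_D
    return p_r + k_m * (1.0 - t_f / t_n) - p_d

new_price = 41.0   # euros, 0.55 kW at 1500 min-1 (Table 3)
weight = 8.1       # kg (Table 4)
d_t = technical_damage(p_r=0.10 * new_price,  # dismantling-assembling expenditures
                       k_m=new_price,         # price of the failed motor
                       p_d=0.35 * weight,     # scrap value of the decommissioned motor
                       t_f=1.0)               # failure after 1 year of service
print(round(d_t, 1))  # ~36.4 euros, close to the 37 euros given in Table 5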

Table 5. Calculation results for technical damage caused by failures of electric motors, in
livestock production
Electric motor model   Technical damage (euros), by lifespan tF (years):
                       1    2    3    4    5    6    7
AIR 0.55 kW 37 31 25 20 14 8 –
AIR 3.0 kW 80 66 53 40 27 14 –
AIR 5.5 kW 122 102 82 62 42 22 –

The dependence of technical damage on lifespan in case of electric motor failure in
livestock production is shown in Fig. 1 for 0.55 kW (1), 3.0 kW (2) and 5.5 kW (3) motors.
In works [13, 14] it was found that electric motors applied in agricultural production have
an average lifespan (before the first complete repair) of 3 to 3.5 years, 4 years and
5 years for livestock breeding, plant cultivation and private subsidiary farming,
respectively. The repair cycle period between complete repairs of electric motors is
1.5 years, 2 years and 2.5 years, respectively, in livestock farming, plant cultivation
and subsidiary farming. Practice shows that electric motors should not be repaired
more than once within their whole lifespan. The level of maintenance and the quality
of repairs have to be improved so that each motor can retain its operability

Fig. 1. Dependence of technical damage caused by failure of electric motors on their lifespan,
for 0.55 kW, 3.0 kW and 5.5 kW.

within a period not shorter than 5 to 6 years before the first complete repair and at
least 3 to 4 years after it. Electric motors shall be decommissioned upon termination
of their rated lifespan, as determined by their depreciation rates. The values of these
rates are used when estimating the actual number of complete repairs and the average
values of technical damage.
In order to reduce material damage caused by failures of electrified equipment at
agricultural enterprises by decreasing the failure rate of electric motors, it is necessary
to perform preventive technical maintenance and repair accompanied by diagnostic
checks of the motors' technical status. Besides, the required sets of spare components
and materials have to be kept for repair and maintenance purposes, and appropriate
protection devices have to be applied to protect electric motors in alarm conditions.
It is also important to improve the operational efficiency of repair teams and their
provision with technical means for trouble-shooting and fault handling.
A more reliable evaluation of the technical component of damage requires collecting
large volumes of data that can only be obtained in the course of practical operation
and maintenance of electric equipment at agricultural enterprises. The latter have to
employ a complete energy service team capable of following schedules of preventive
activities and keeping accurate records of the operation and maintenance of electrified
equipment. It is also important to know the actual lifespan of spare electric motors
from the date of their manufacture until the moment of installation on the working
machine, as well as that of the failed electric motors from the moment of manufacture
until the date of failure on the corresponding working machine. Their rated lifespans
have to be specified as well. During the operation of an electric motor, amortization
charges related to the motor's initial price have to be made on an annual basis. These
charges can later be used either for repairing the motor or for its replacement with a
new one. If a failed electric motor has not served its whole rated lifespan before
complete repair or decommissioning, the enterprise will incur losses associated with
repair or replacement that form part of the technical damage DT.
To make practical evaluations of technical damage more convenient, it is advisable to
apply a summarizing coefficient of damage kD that depends on the actual lifespan of
the electric motor. In the case of failure of an electric motor followed by its decommissioning
or replacement with its analog, the damage DT that arises is defined by the following
expression [12]:

D_T = P_R \cdot k_D(t_F) \quad (2)

where PR is the price of a new electric motor including installation expenditures (euros),
kD is the coefficient of damage (a.u.), and tF is the actual lifespan of the electric motor
(years).
The obtained dependence of coefficient of damage on actual lifespan of electric
motor for livestock production is shown in Fig. 2.
In case of decommissioning of electric motors installed in the technological equipment
of a livestock farm, the value of technical damage DT as a function of lifespan is
calculated by multiplying the coefficient of damage by the price of the electric motor
including installation expenditures, for a rated lifespan of 7 years. As is clear

Fig. 2. Dependence of coefficient of damage on actual lifespan of electric motor, in livestock
production.

from Fig. 2, the coefficient of damage equals 0.5 to 0.6 for the average motor lifespan
of 3 to 3.5 years in livestock production.
Calculation results for the average value of technical damage DT caused by failure
of electric motors of the AIR type, depending on their capacity and calculated with
expression (2) for kD = 0.6, are presented in Table 6.

Table 6. Calculation results for technical damage caused by failure of electric motor of type
AIR, in livestock production, for kD = 0.6.
Capacity of electric motor (P) kW 1.1 2.2 3.0 4.0 5.5 7.5
Price of new electric motor (P0), euros 52.92 72.43 92.57 103.55 139.30 177.79
Price of installation (PI), euros 5.30 7.25 9.26 10.35 13.93 17.78
PR = P0 + PI, euros 58.21 79.61 101.83 113.90 153.23 195.56
Damage (DT), euros 34.92 47.76 61.10 68.33 91.86 117.34
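Under the same assumptions, the Table 6 figures follow directly from expression (2); a minimal Python check (the roughly 10% installation share is inferred from the table rows and is an assumption):

K_D = 0.6  # averaged coefficient of damage for livestock production
new_prices = {1.1: 52.92, 2.2: 72.43, 3.0: 92.57, 4.0: 103.55, 5.5: 139.30, 7.5: 177.79}
for capacity, p0 in new_prices.items():
    p_r = p0 * 1.10                 # new-motor price plus ~10% installation
    print(capacity, round(p_r * K_D, 2))
# e.g. for 1.1 kW: P_R = 58.21 euros and D_T = 34.93 euros,
# matching Table 6 up to rounding (34.92 euros)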

When necessary, this evaluation method enables prompt calculation of the economic
damage due to early replacements or complete repairs of electric motors and other
electric equipment. It is also applicable to estimates made when solving economic
problems in the absence of clearly specified initial information on electric motor
failures.

4 Conclusions

An evaluation has been made of the technical component of economic damage from
failures of electric motors installed in the technological equipment of livestock-breeding
farms. It has been found that the technical damage caused by failures of electric motors
depends on multiple factors. Taking all these factors into account makes the calculations
rather complicated, while their results may prove insignificant for practical application
in the conditions of an agricultural enterprise. The main components of technical

damage arise from the reduction of actual lifespan and from the unexpended price of
the electric motor. For practical application on a livestock farm, the average calculated
value of technical damage due to an emergency failure of a single electric motor is
defined using an averaged coefficient of technical damage related to the price of a new
electric motor including installation expenditures. Calculations have been performed,
and dependences of technical damage on actual lifespan have been defined, for failures
of electric motors of various capacities applied in the livestock sector of agriculture.
The results of this work will make it possible to estimate the technical damage
associated with failures of electric motors, to define the extent of economic responsibility
for particular electric equipment, to draw up and correct more precise schedules of
preventive maintenance and repair, and to find room for improving the operational
effectiveness of energy services at agricultural enterprises.

References
1. Gill, P.: Electrical Power Equipment Maintenance and Testing. CRC Press, Boca Raton
(2008)
2. Gracheva, E., Fedorov, V.: Forecasting reliability electrotechnical complexes of in-plant
electric power supply taking into account low-voltage electrical apparatuses. In: 2019
International Conference on Industrial Engineering, Applications and Manufacturing
(ICIEAM). IEEE, pp. 1–5 (2019)
3. Vasant, P., Zelinka, I., Weber, G.-W. (eds.): Intelligent Computing and Optimization. In:
Proceedings of the 2nd International Conference on Intelligent Computing and Optimization
2019 (ICO 2019). Springer (2019). ISBN 978-3-030-33585-4
4. Vasant, P., Zelinka, I., Weber, G.-W. (eds.) Intelligent computing & optimization. In:
Conference proceedings ICO 2018. Springer, Cham (2018). ISBN 978-3-030-00978-6
5. Johannes, R., Schmidthaler, M., Schneider, F.: The value of supply security: the costs of
power outages to Austrian households, firms and the public sector. Energy Econ. 36, 256–
261 (2013)
6. Kharchenko, V., Vasant, P. (eds.): Handbook of Research on Energy-Saving Technologies
for Environmentally Friendly Agricultural Development. IGI Global. https://doi.org/10.4018/
978-1-5225-9420-8
7. Kharchenko, V., Vasant, P.: Renewable Energy and Power Supply Challenges for Rural
Regions. IGI Global (2019). https://doi.org/10.4018/978-1-5225-9179-5
8. Syrykh, N.N., Kabdin, N.Ye.: Theoretical Foundations of Operation of Electric Equipment.
Agrobiznestsentr Publ., Moscow (2007). 514 p. (in Russian)
9. Strebkov, D.S., Nekrasov, A.I., Nekrasov, A.A.: Maintenance of power equipment system
based on the methods of diagnosis and control of technical conditions. In: Handbook of
Research on Renewable Energy and Electric Resources for Sustainable Rural Development,
Chapter 18, pp. 421–448 (2018). https://doi.org/10.4018/978-1-5225-3867-7
10. Nekrasov, A.I., Nekrasov, A.A.: Method for definition of required spare components set to
maintain electric equipment for rural applications. In: 2018 International Russian Automa-
tion Conference (RusAutoCon). IEEE (2018)

11. Borisov, Yu.S., Nekrasov, A.I., Nekrasov, A.A.: Methodical recommendations on
forecasting and monitoring technical status of electric motors in agricultural production.
GNU VIESH, Moscow (2011). 108 p. (in Russian)
12. Yeroshenko, G.P., Bakirov, S.M.: Adaptation of operation of electric equipment to the
specifics of agricultural production. Saratov IC ‘Nauka’ (2011). p. 132. (in Russian)
13. Price of electric motors. https://electronpo.ru/price. Accessed 24 Jan 2020
14. LaVerne, S., Bodman, G., Schinstock, J.: Electrical wiring methods and materials for
agricultural buildings. Appl. Eng. Agric. 5(1), 2–9 (1989)
Development of a Prototype Dry Heat Sterilizer
for Pharmaceuticals Industry

Md. Raju Ahmed(&), Md. Niaz Marshed,


and Ashish Kumar Karmaker

Abstract. The use of sterilizers in pharmaceutical industries is very important
owing to their capability of reducing microbial contamination of medicinal
preparations. A sterilizer eliminates all forms of microorganisms and other
biological agents, which helps to maintain a sterile environment. Although
several researchers have studied the design and implementation of sterilizers,
a lack of accuracy, safety issues, operational limitations and high cost make
those sterilizers unsuitable for pharmaceutical industries in Bangladesh. In this
project, a low-cost and user-friendly Dry Heat Sterilizer (DHS) is designed and
implemented for pharmaceutical industries. To realize a fully automated
control scheme, a Programmable Logic Controller (PLC) and a Human Machine
Interface (HMI) are used in this experimental work, which also ensures the
necessary safety measures. The performance of the implemented DHS is analyzed
for two different experimental setups. It is found that the DHS, made from
locally available materials, performs satisfactorily and can be used in
pharmaceutical industries as well as for oven applications.

Keywords: Sterilizer · Heating · Cooling · Microorganism · PLC · HMI

1 Introduction

The use of sterilizers in the healthcare sector is gaining popularity due to their
capability of killing microorganisms. A DHS is a necessary piece of equipment for
pharmaceuticals as well as hospitals to facilitate the sterilization process. Different
types of cycles are used in the sterilization process, such as drying, exhaust, pre-
sterilization (heating), stabilization, sterilization and cooling. Heat destroys bacterial
endotoxins (pyrogens), which are difficult to eliminate by other means, and this
property makes it applicable for sterilizing glass bottles and pharmaceutical equip-
ment. In this paper, the design and implementation of a DHS using locally available
materials is presented. The performance analysis of the implemented DHS is conducted
and also presented.
Many researchers have suggested various sterilizer designs and sterilization processes.
Pradeep and Rani [1] described the function of the sterilizer and the sterilization
process; their process is used in the implementation of different types of sterilizers.
They also carried out a detailed study on machine performance to show how the
efficiency of microorganism control is evaluated. Kalkotwar et al. [2] performed
research on the sterilization process for killing microorganisms by heat, using a
combination of time and temperature, and published papers
on the methods and process of validation of the sterilization process. Purohit and
Gupta [3] have analyzed the temperature mapping of a DHS. Sultana [4] discussed
sterilization methods and principles and explained different types of sterilizers and
sterilization methods, employing high temperatures in the range of 160–180 °C.
Obayes [5] studied the construction and working principle of a DHS and applied
different combinations of time and temperature to perform the sterilization, such as
170 °C for 30 min, 160 °C for 60 min, and 150 °C for 150 min. He used a conventional
control system based on switches and relays to select the electric heater for temperature
rise and a thermostat for measuring temperature. Satarkar and Mankar [6] fabricated
an autoclave sterilization machine for reducing the microbial contamination of
packaged products, describing moist heat with a low temperature range. Oyawale and
Olaoye [7] designed and fabricated a low-cost conventional steam sterilizer and tested
their design at a temperature of 120 °C for 12 min.
As discussed above, many researchers have designed and implemented DHS units
and conducted performance tests. In Bangladesh, some switch-relay controlled
conventional machines are available that can be used only for oven purposes; they
cannot be used for pharmaceutical applications due to lack of accuracy, safety and
operational limitations. DHS units for industrial purposes are imported; they are large
and costly, and all control functions remain in the grip of the foreign suppliers. Local
engineers can only operate the machines, not service their control systems, so if any
problem arises the whole system becomes useless.
In this project, a prototype DHS is designed and implemented using locally
available materials. A PLC is used to automatically control the operation of the DHS,
and an HMI is used for giving instructions and displaying all operation parameters.
The specific objectives of this work can be summarized as follows:
a. To design and implement a PLC-controlled DHS using locally available materials
for pharmaceutical and experimental use.
b. To control the various parameters of the operation cycle and display them on the
HMI display.
c. To analyze the performance of the implemented DHS, compare it with the set
values, and discuss the suitability of the implemented DHS for industrial use.

2 Proposed DHS System

Figure 1 shows the operational block diagram of the proposed DHS comprising the
PLC, HMI and other accessories. The PLC is interfaced with the HMI for bidirectional
control and display.

Fig. 1. Operational block diagram of the proposed DHS.

The PLC continuously monitors the inside temperature of the DHS. A door sensor and
an emergency stop switch are used for safety purposes. Before starting operation, the
door of the sterilizer should be closed; the door sensor ensures this. If the door is not
closed at the beginning of the operation cycle, or if any unwanted situation arises, the
buzzer will sound an alarm. The desired cycle parameters, such as temperature, heating
time, sterilizing time and cooling time, are loaded into the PLC through the HMI. All
parameters are continuously monitored by the PLC and displayed on the HMI. Heaters
and blowers are used for heating and cooling the chamber of the DHS. A magnetic
contactor is used to turn the blower on/off, and a solid-state relay is used to turn the
heater on/off.
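The cycle logic just described can be sketched as follows; the real controller is the Fatek FBs PLC programmed in ladder logic, so this Python rendering is purely illustrative, and all function names and thresholds are assumptions:

import time

def run_cycle(read_temp, set_heater, set_blower, door_closed, set_buzzer,
              steril_temp=40.0, steril_time=300, cool_temp=32.0):
    # Safety interlock: the door sensor must report closed before the cycle starts.
    if not door_closed():
        set_buzzer(True)
        return 'aborted: door open'
    set_heater(True)                      # heating phase
    while read_temp() < steril_temp:
        time.sleep(1)
    t_end = time.time() + steril_time     # sterilization hold
    while time.time() < t_end:
        set_heater(read_temp() < steril_temp)   # on/off hold via the solid-state relay
        time.sleep(1)
    set_heater(False)
    set_blower(True)                      # cooling phase
    while read_temp() > cool_temp:
        time.sleep(1)
    set_blower(False)
    return 'cycle complete'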

3 Implementation of DHS

The proposed DHS is implemented practically in the laboratory. Figure 2 shows the
control panel of the implemented DHS. Table 1 shows the list of materials used for
implementing the prototype DHS. The operational flowchart of the sterilization process
is shown in Fig. 3.

Fig. 2. Control panel of implemented DHS.

Table 1. List of materials used for implementing the DHS.


Sl no. Name of material Specification Quantity
01 Machine Frame 1 package 01 unit
02 PLC FBs-20MAR2-AC, 240 VAC 01 Pcs
03 HMI P5043S, 24 VDC 01 Pcs
04 Expansion Module FBs-6RTD 01 Pcs
05 Communication cable RS232 01 Pcs
06 Power supply unit IP 100/220 VAC, OP 24 VDC 01 Pcs
07 Blower-1 24VDC 01 Pcs
08 Blower-2 24 VDC 01 Pcs
09 Heater element Cartridge, 220 VAC, 500 W 02 Pcs
10 Temperature sensor Pt100 02 Pcs
11 SSR 250 V, 25 A 01 Pcs
12 Fuse Holder 220 V 01 Pcs
13 Fuse 220 V, 6 A 01 Pcs
14 Electromechanical relay 24 V, 14 pin 04 Pcs
15 Indicator lamp Red color, 220 VAC 02 Pcs
16 Indicator lamp Green color, 220 VAC 02 Pcs
17 Limit Switch 220 V, 5 A 02 Pcs

Fig. 3. Flowchart of operation of the proposed sterilization process.



4 Performance Analysis of Implemented DHS

The recommended operating conditions for the DHS are a cycle of 40 °C to 50 °C for
5 min and a cycle of 55 °C to 65 °C for 3 min. The tests were carried out with an empty
chamber; the set and measured values for the two cases are given in Table 2 and
Table 3, respectively. The variation of the chamber's inside temperature with time for
the two cases is shown in Fig. 4 and Fig. 5. The ambient temperature was 28.0 °C for
case 1 and 30.5 °C for case 2. It is seen that the implemented DHS performs the
sterilization cycle properly, as instructed.

Table 2. Performance test of the implemented DHS for case-1.


Sl. no   Parameter                     Set       Actual                        Remarks
1        Air circulation for balance   45 s      45 s                          Internal setting
2        Exhaust                       60 s      60 s                          Internal setting
3        Heating                       –         950 s @ 40.0 °C               At actual
4        Stabilization                 120 s     120 s @ 40.3 °C               Internal setting
5        Sterilization temperature     40.0 °C   Max: 40.3 °C, Min: 40.0 °C    Operational data control
6        Sterilization time            300 s     300 s                         Operational data control
7        Cooling temperature           32.0 °C   31.9 °C                       Operational data control
8        Cooling time                  –         1055 s                        At actual
9        Cooling extinction time       300 s     300 s                         Internal setting
10       Total cycle time              –         2770 s                        –

Fig. 4. Variation of chamber temperature with time for case-1.



Table 3. Performance test of the implemented DHS for case-2.


Sl. no   Parameter                     Set       Actual                        Remarks
1        Air circulation for balance   45 s      45 s                          Internal setting
2        Exhaust                       60 s      60 s                          Internal setting
3        Heating                       –         1637 s @ 55.0 °C              At actual
4        Stabilization                 120 s     120 s @ 55.3 °C               Internal setting
5        Sterilization temperature     55.0 °C   Max: 55.3 °C, Min: 54.8 °C    Operational data control
6        Sterilization time            180 s     180 s                         Operational data control
7        Cooling temperature           33.0 °C   32.9 °C                       Operational data control
8        Cooling time                  –         2147 s                        At actual
9        Cooling extinction time       300 s     300 s                         Internal setting
10       Total cycle time              –         4427 s                        –

Fig. 5. Variation of chamber temperature with time for case-2.

4.1 Alarm Check of the DHS


To ensure that the operation cycle proceeds as instructed and that the DHS operates in
a safe mode, a proper protection system is incorporated into the implemented DHS. If
any violation of a setup value or any dangerous situation occurs, the system raises an
alarm. If the violation is severe, the protection system automatically shuts off the
supply. All violations and danger situations are shown on the HMI display. The alarm
messages and the corresponding remedial actions are given in Table 4. A breakdown
list and remedial actions are also given in Table 5.

Table 4. Alarm list and remedy of the implemented DHS.


Sl. no   Message                             Cause                              Remedial action
1        Emergency stop                      Emergency switch is active         Reset emergency switch
2        Blower 1 overload                   Blower 1 tripped                   Check safety device
3        Blower 2 overload                   Blower 2 tripped                   Check safety device
4        Sterile door open                   Door is open or sensor missing     Close the door or check the sensor function
5        Non-sterile door open               Door is open or sensor missing     Close the door or check the sensor function
6        Low air pressure                    Compressed air not available       Make sure the air is available
                                             Air pressure too low               Increase the air pressure
7        High temperature                    Temperature limit exceeded         Reset
8        Product temperature sensor broken   Sensor broken or wire cut off      Check sensor and PLC input
9        Safety temperature sensor broken    Sensor broken or wire cut off      Check sensor and PLC input
10       Sensor function off                 Sensor broken or wire cut off      Replace the sensor or reconnect the wire

Table 5. Breakdown list and remedy of the implemented DHS.


Sl. no   Observation              Cause                                   Remedial action
1        Machine does not start   Door is open                            Close the doors and check door sensors
                                  Any component is tripped or damaged     Check all components
                                  Inputs not available                    Check alarms and take steps
2        Heater not ON            Temperature sensor may be damaged       Replace the sensor or reconnect
3        High temperature         Safety temperature sensor failure       Replace the sensor or reconnect
4        Compressed air low       Compressed air not available or         Check air availability,
                                  air pressure too low                    increase the air pressure
5        Long heating time        Heater not ON                           Check SSR and heater
                                  Blower not ON                           Check blower
                                  Sensor may be damaged                   Check temperature sensor

5 Conclusion

A high temperature maintained for a certain time is required to control or kill
microorganisms. A DHS was implemented in this project to perform the sterilization.
Locally available materials were used to implement the DHS, and the cost of the
machine is very low, around six hundred fifty US dollars.
Two setups were chosen to test the performance of the implemented DHS. It is seen
from the experimental results (Tables 2 and 3) that the implemented DHS performed
well according to the set operation cycles. The maximum deviation of temperature was
found to be around 0.75 percent, and of time around 0.31 percent. The results of the
completed cycles show that the implemented DHS works properly. In this project,
wooden material was used to construct the body of the DHS due to constructional
limitations; therefore, the test cycles were run at comparatively low temperatures. All
parameters of the operation cycle of the DHS can be set through the HMI; as a result,
the implemented DHS is user friendly. It is expected that the DHS implemented in this
project can be used for oven purposes in pharmaceutical and other industries and also
for experimental purposes. This project will be helpful for producing low-cost,
custom-designed and user-friendly DHS units locally, which can be used in industries.

6 Recommendation for Future Work

Further research can be done by considering a high-temperature-resistant material,
i.e. stainless steel (SS), for performing high-temperature sterilization. The air velocity
of the blowers was not calculated due to the use of small blowers and the limitations of
the measurement instruments. The rates of rise and fall of temperature were not
measured in this project; further research can be done in this direction.

References
1. Pradeep, D., Rani, L.: Sterilization protocols in dentistry – a review. J. Pharm. Sci. Res. 8(6),
558 (2016)
2. Kalkotwar, R.S., Ahire, T.K., Jadhav, P.B., Salve, M.B.: Path finder process validation of dry
heat sterilizer in parenteral manufacturing unit. Int. J. Pharm. Qual. Assur. 6(4), 100–108
(2015)
3. Purohit, I.K., Gupta, N.V.: Temperature mapping of hot air oven (dry heat sterilizer). J. Pharm.
Res. 11(2), 120–123 (2017)
4. Sultana, D.Y.: Sterilization Methods and Principles. Faculty of Pharmacy, Jamia Hamdard,
Hamdard Nagar, New Delhi-110062, pp. 1–4, 11 July 2007
5. Obayes, S.A.S.: Hot air oven for sterilization, definition and working principle. SSRN
Electron. J. (2018)
6. Satarkar, S.S., Mankar, A.R.: Fabrication and analysis of autoclave sterilization machine.
IOSR J. Eng. 05–08. ISSN (e): 2250-3021, ISSN (p): 2278-8719
7. Oyawale, F.A., Olaoye, A.E.: Design and construction of an autoclave. Pac. J. Sci. Technol.
8(2), 224–230 (2007)
Optimization of Parameters of Pre-sowing Seed
Treatment in Magnetic Field

Volodymyr Kozyrsky(&), Vitaliy Savchenko, Oleksandr Sinyavsky,


Andriy Nesvidomin, and Vasyl Bunko

National University of Life and Environmental Sciences of Ukraine,


Street Heroiv Oborony, 15, Kiev 03041, Ukraine
{epafort1,sinyavsky2008}@ukr.net, vit1986@ua.fm,
a.nesvidomin@gmail.com, vbunko@gmail.com

Abstract. The results of theoretical and experimental research on the change of
seed biopotential during pre-sowing treatment in a magnetic field are presented.
It is established that under the action of a magnetic field the rate of chemical
and biochemical reactions in a plant cell increases, which causes a change in the
biopotential. A method for determining the efficiency of pre-sowing seed
treatment by the change in biopotential is substantiated. It is established that the
main acting factors in magnetic seed treatment are the magnetic induction, its
gradient, and the speed of seed movement in the magnetic field. The effect of
magnetic treatment takes place at low energy doses (2.0–2.5 J·s/kg). The optimal
mode of magnetic seed treatment is determined: magnetic induction of 0.065 T
with four-fold re-magnetization, magnetic field gradient of 0.57 T/m, and seed
velocity in the magnetic field of 0.4 m/s.

Keywords: Magnetic field · Biopotential · Pre-sowing seed treatment · Magnetic
induction · Speed of seed movement · Activation energy · Energy dose of treatment

1 Introduction

It is possible to increase production and improve the quality of crop products by
stimulating seeds, using the biological potential of the seed material and reducing crop
losses from diseases and various types of pests.
In the practice of agricultural production, the main method of increasing crop yields
is the application of mineral fertilizers, and of protecting plants from diseases and
pests, the use of chemical pesticides. But long-term use of mineral fertilizers and plant
protection products leads to irreparable environmental damage.
Therefore, there is a need to increase yields without the use of chemicals. Of
greatest interest in terms of obtaining environmentally friendly products are electro-
physical factors affecting plants [1], among which a promising method is the use of a
magnetic field for pre-sowing seed treatment.
Unlike traditional pre-sowing seed treatment with chemicals and other electro-
physical methods, it is a technological, energy-efficient method that causes no negative
side effects on plants or staff and is an environmentally friendly type of treatment [2].

Seed treatment in a magnetic field affects the physicochemical processes directly in
the seed, which leads to biological stimulation, activation of metabolic processes,
enhanced enzymatic activity and accelerated plant growth, which further increases
productivity. The magnetic field destroys fungi and microorganisms on the surface of
the seeds, which reduces the incidence of plant disease.

2 Background

Many researchers have found a positive effect of the magnetic field on crop seeds,
which is manifested in improved seed sowing qualities [3], plant biometrics and yields
[4, 5], crop storage [6], reduced plant morbidity [7], and improved biochemical
indicators [8] and quality of plant products [9].
However, these studies were performed at different values of magnetic induction
and treatment time (treatment dose), although it was found that crop yields and
biometrics depend on the dose of magnetic treatment. As a result, many different
processing modes have been proposed, which sometimes differ significantly from each
other. Since the mechanisms of action of the magnetic field on seeds had no clear
explanation, not all the active factors in the magnetic treatment of seeds and their
optimal values had been established.
Numerous studies suggest that seed treatment in a magnetic field may be an
alternative to chemical methods of pre-sowing treatment [10]. For successful intro-
duction of pre-sowing treatment of seeds in production it is necessary to establish mode
parameters of processing and their optimum values.
A common disadvantage of all existing methods of electromagnetic stimulation is
the lack of instrumental determination of the treatment dose. Its optimal value is usually
determined by yield, which largely depends on agro-climatic factors, soil fertility,
cultivation technology used and so on. Therefore, to determine the optimal modes of
pre-sowing seed treatment in a magnetic field, it is necessary to develop a method of
indicating its effect.
The purpose of the study is to establish the influence of the magnetic field on the
change of the biopotential of seeds of agricultural crops and to determine the optimal
parameters of magnetic seed treatment.

3 Results of the Research

Various chemical and biochemical reactions, mainly redox ones, take place in plant
seeds. Stimulation of seeds is associated with an increase in their rate, resulting in an
increase in the concentration of reaction products:

dC_i = \omega \, dt \quad (1)

where C_i is the concentration of the substance, mol/l; \omega is the rate of the chemical
reaction, mol/(l·s); t is time, s.

The action of a magnetic field changes the rate of chemical reactions [11]:

\omega_m = \omega \exp\left[ m \left( K^2 B^2 + 2 K B v \right) N_a / (2RT) \right] \quad (2)

where \omega is the rate of the chemical reaction without the action of a magnetic field,
mol/(l·s); m is the mass of the ions, kg; B is the magnetic induction, T; v is the velocity
of the ions, m/s; K is a coefficient that depends on the concentration and type of ions,
as well as on the number of re-magnetizations, m/(s·T); N_a is Avogadro's number,
molecules/mol; R is the universal gas constant, J/(mol·K); T is the temperature, K.
To study biological objects, Albert Szent-Györgyi introduced the concept of
biopotential, which is associated with the redox potential (ORP) by a relation [8]:

BP = 820 - ORP \quad (3)

where 820 mV is the energy potential of water.


The change in the redox potential of seeds during their treatment in a magnetic field
can be determined by the Nernst equation [8]:

\Delta ORP = 2.3 \, \frac{RT}{zF} \left( \lg C_2 - \lg C_1 \right) \quad (4)

where z is the valence of the ion; F is the Faraday constant, C/mol; C_1 is the
concentration of ions before magnetic treatment, mol/l; C_2 is the concentration of ions
after magnetic treatment, mol/l.
Taking into account (1)

\Delta ORP = 2.3 \, \frac{RT}{zF} \left( \lg \omega_2 - \lg \omega_1 \right) \quad (5)

Substituting the expression (2) for the reaction rate into Eq. (5), we obtain:
 
\Delta ORP = \frac{m N_a K}{zF} \left( \frac{K B^2}{2} + vB \right) \quad (6)

whence
 
\Delta BP = -\frac{m N_a K}{zF} \left( \frac{K B^2}{2} + vB \right) \quad (7)

Expression (7) can be written as

\Delta BP = A_1 B^2 + A_2 B v \quad (8)

where A_1 and A_2 are coefficients.
The coefficients in Eq. (8) cannot be determined analytically; they were determined
on the basis of experimental data.

Experimental studies of the effect of the magnetic field on the seed biopotential
were performed with pea seed “Adagumsky”, beans seed “Hrybovsky”, rye seed
“Kharkivsky 98”, oats seed “Desnyansky”, barley seed “Solntsedar”, cucumber seed
“Skvyrsky”, sunflower seed “Luxe”.
The seeds were moved on a conveyor through a magnetic field created by a
multipolar magnetic system based on an induction linear machine (Fig. 1) [12].


Fig. 1. Installation for pre-sowing treatment of seeds in a magnetic field: a - general view; b -
functional diagram: 1 – load device; 2 – conveyor; 3 – textolite inserts; 4 – permanent magnets; 5
– plate made of electrical steel; 6 – object of processing; 7 – container

The magnetic induction was adjusted by changing the distance between the magnets
and measured with a 43205/1 teslameter. The velocity of seed movement through the
magnetic field was regulated by changing the rotation speed of the conveyor drive
motor by means of a frequency converter.
Seeds treated in the magnetic field were germinated, and the ORP value was
measured.
A measuring electrode in the form of a platinum plate with a pointed end was
developed to measure the ORP. The platinum electrode was inserted into the germinated
seed. A standard silver chloride electrode was used as the auxiliary electrode. The ORP
of the germinated seeds was determined using an И-160 M ionomer.
The studies were performed using the experimental planning method, with an
orthogonal central-compositional plan [13]. The values of the upper, main and lower
levels were taken as 0, 0.065 and 0.13 T for magnetic induction and as 0.4, 0.6 and
0.8 m/s for seed velocity; the response was the biopotential of the germinated seeds.
As a result of the conducted research, it is established that in the range of magnetic
induction from 0 to 0.065 T the seed biopotential increases, while at higher values of
magnetic induction the biopotential decreases (Fig. 2). At magnetic inductions
exceeding 0.13 T, the seed biopotential does not change, but it still exceeds the value
for seed untreated in a magnetic field.

The seed biopotential during pre-sowing treatment in a magnetic field is also
affected by the velocity of seed movement, but in the velocity range of 0.4–0.8 m/s it
is a less significant factor than magnetic induction.

Fig. 2. Dependence of change of biopotential of oat seeds on magnetic induction and speed of
movement of seeds

According to the results of the multifactorial experiment, a regression equation was
obtained that connects the seed biopotential with the regime parameters of seed
treatment in a magnetic field:

\Delta BP = a_0 + a_1 B + a_2 v + a_{12} B v + a_{11} B^2 \quad (9)

where a0, a1, a2, a12, a11 – coefficients, the values of which for different crops are
shown in Table 1.

Table 1. Values of coefficients in the regression equation for seed biopotential.


Agricultural culture a0 a1 a2 a12 a11
Pea        0.47   906.9   −0.42   −121.8    −5404
Bean       0.84   1006    −1.53   −147.44   −5812
Rye        −0.9   1621    −0.83   −282.1    −9007
Oat        0.9    841.5   −1.81   −96.15    −5536
Barley     5.5    1218    −8.47   −185.9    −6746
Sunflower  4.14   878.2   −7.08   −134.62   −5641
Cucumber   4.12   1428    −6.31   −147.44   −8218
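Since expression (9) is quadratic in B at a fixed velocity, the induction that maximizes the biopotential gain can be read off analytically as B* = −(a₁ + a₁₂v)/(2a₁₁). A minimal Python sketch using the Table 1 coefficients follows; this closed-form check is our illustration, not a step taken in the paper:

coeffs = {  # crop: (a1, a12, a11) from Table 1
    'pea': (906.9, -121.8, -5404),
    'oat': (841.5, -96.15, -5536),
    'rye': (1621.0, -282.1, -9007),
}
v = 0.4  # m/s, the experimentally optimal seed velocity
for crop, (a1, a12, a11) in coeffs.items():
    b_opt = -(a1 + a12 * v) / (2 * a11)   # vertex of the quadratic in B
    print(crop, round(b_opt, 3))
# Gives roughly 0.07-0.08 T, of the same order as the reported optimum of 0.065 T.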

The conducted research made it possible to determine the optimal treatment
parameters by the method of steepest ascent. It is established that the optimal value of
magnetic induction for seeds of agricultural crops is 0.065 T, and the optimal velocity
is 0.4 m/s.

The same optimal treatment regimes were established for magnetic treatment of
water and salt solutions [11]. This confirms the hypothesis of a significant role of water
in the mechanism of influence of the magnetic field on seeds.
The seed biopotential can determine not only the efficiency of pre-sowing treatment
in a magnetic field, but also the change in activation energy. The expression for the rate
of a chemical reaction in a magnetic field can be written as [14]:

\omega_m = \omega \exp\left[ -\left( E + \Delta E^* \right) / RT \right] \quad (10)

where \Delta E^* is the change in activation energy, J/mol.
Using dependence (5), we obtain:

\Delta ORP = -\frac{\Delta E^*}{zF} \quad (11)

The change in biopotential is determined by the equation:



\Delta BP = \frac{\Delta E^*}{zF} \quad (12)

Then

\Delta E^* = zF \, \Delta BP \quad (13)

Thus, by formula (13) it is possible to determine the change in the activation energy
during seed treatment in a magnetic field according to experimentally determined
values of the change in biopotential.
The experimental dependences of the change in the activation energy on the
magnetic induction during seed treatment in a magnetic field are similar to the
dependences for the change in the seed biopotential during treatment in a magnetic
field. The activation energy changes the most at a magnetic induction of 0.065 T and a
velocity of 0.4 m/s. In this treatment mode, the activation energy changes by 2.4–
5.7 kJ/g-eq (Table 2).

Table 2. Change in activation energy during pre-sowing seed treatment in a magnetic field.
Agricultural culture Change in biopotential, mV Change in activation energy, kJ/g-eq
Rye         59   5.69
Barley      50   4.82
Sunflower   34   3.28
Oat         30   2.89
Pea         33   3.18
Bean        38   3.67
Cucumber    58   5.6
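As a consistency check of formula (13): for rye, taking z = 1, F ≈ 96485 C/mol and ΔBP = 59 mV gives ΔE* = zF·ΔBP = 96485 × 0.059 ≈ 5.69 kJ/g-eq, which reproduces the corresponding entry of Table 2.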

As follows from Table 2, the change in activation energy during seed treatment in a
magnetic field is less than the Van der Waals forces (10–20 kJ/mol), which characterize
the intermolecular interaction and the interaction between dipoles. Thus, the magnetic
field acts on the ions that are in the aqueous solution of the cell.
It is now established that the yield and biometric features of crops depend on the
dose of magnetic treatment, regardless of the method of creating a magnetic field [15].
Experimental studies of the change in biopotential during seed treatment in a
magnetic field made it possible to determine the energy dose of treatment.
The energy dose of treatment is determined by the formula [15]:
D = \int \frac{W}{m} \, dt \quad (14)

where W is the energy of the magnetic field, J; m – seed weight, kg; t – treatment time,
s;
or
D = \int \frac{B^2}{2 \mu \mu_0 \rho} \, dt \quad (15)

where \mu is the relative magnetic permeability; \mu_0 is the magnetic constant, H/m;
\rho is the seed density, kg/m³.
Substituting dt = dl/v, we get:
D = \int \frac{B^2 \, dl}{2 \mu \mu_0 \rho v} \quad (16)

where l is the path length, m.
When the seeds move in a gradient magnetic field, the magnetic induction changes
along the conveyor belt (Fig. 3).

Fig. 3. Change of magnetic induction in the air gap along the conveyor belt

Using the dependence shown in Fig. 3, the integral in (16) is determined by the method
of trapezoids:

\int_0^L B^2 \, dl
  = \int_0^{L/8} \left( \frac{8B_m}{L} l \right)^2 dl
  + \int_{L/8}^{3L/8} \left( 2B_m - \frac{8B_m}{L} l \right)^2 dl
  + \int_{3L/8}^{5L/8} \left( \frac{8B_m}{L} l - 4B_m \right)^2 dl
  + \int_{5L/8}^{7L/8} \left( 6B_m - \frac{8B_m}{L} l \right)^2 dl
  + \int_{7L/8}^{L} \left( \frac{8B_m}{L} l - 8B_m \right)^2 dl
  = \frac{B_m^2 L}{24} + \frac{B_m^2 L}{12} + \frac{B_m^2 L}{12} + \frac{B_m^2 L}{12} + \frac{B_m^2 L}{24}
  = \frac{B_m^2 L}{3} \quad (17)

where B_m is the maximum value of the magnetic induction, which occurs in the plane
of the magnets, T, and L is the path the seed travels in the magnetic field, m.
Then the energy dose of treatment is

D = \frac{B_m^2 L}{6 \mu \mu_0 \rho v} \quad (18)

or

D = \frac{B_m^2 n \tau}{6 \mu \mu_0 \rho v} \quad (19)

where n is the number of re-magnetizations and \tau is the pole division, m.
The formula for the energy dose of treatment (19) contains all the mode parameters
of seed treatment in a magnetic field (magnetic induction, seed velocity, number of
re-magnetizations, pole division).
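A minimal numerical check of formula (19) in Python for the reported optimal mode; the seed bulk density used here is an assumed illustrative value, not one given in the paper:

from math import pi

MU0 = 4 * pi * 1e-7            # magnetic constant, H/m
B_m, n, tau = 0.065, 4, 0.23   # T, re-magnetizations, pole division (m)
mu, rho, v = 1.0, 600.0, 0.4   # relative permeability, assumed bulk density (kg/m3), m/s

D = B_m**2 * n * tau / (6 * mu * MU0 * rho * v)
print(round(D, 2))             # ~2.15 J*s/kg, within the optimal 1.7-3.8 J*s/kg range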
Studies of changes in the biopotential of germinated seeds during magnetic treatment
made it possible to determine the energy dose of treatment from the corresponding
values of magnetic induction and seed velocity according to formula (18).
The interrelation between the energy dose of treatment and the seed biopotential
has been established. The dependence of the change in the biopotential of germinated
seeds on the energy dose of treatment is shown in Fig. 4. As follows from this
dependence, the optimal value of the energy dose of treatment is 3.8 J·s/kg for
sunflower seeds, 1.86 J·s/kg for rye, 2.8 J·s/kg for oats, 1.9 J·s/kg for peas, 2.22 J·s/kg
for beans, 2.02 J·s/kg for cucumber, and 2.22 J·s/kg for barley.
From the condition of providing the optimal energy dose of treatment, a pole
division of 0.23 m is determined, at which the magnetic field gradient is 0.57 T/m.

Fig. 4. The dependence of the change in the biopotential of cucumber (1) and barley (2) seeds
on the energy dose of treatment in a magnetic field

4 Conclusion

The change in the seed biopotential during pre-sowing treatment in a magnetic field
depends on the square of the magnetic induction and the velocity of the seeds moving
in the magnetic field. By measuring the biopotential, it is possible to determine the
optimal treatment mode, which takes place at a magnetic induction of 0.065 T, fourfold
re-magnetization, pole division of 0.23 m and seed velocity in a magnetic field of
0.4 m/s.
The change in seed biopotential depends on the energy dose of treatment in a
magnetic field. The greatest seed biopotential during pre-sowing seed treatment in a
magnetic field occurred at an energy dose of 1.7–3.8 J·s/kg.
The change in activation energy during seed treatment in a magnetic field is directly
proportional to the change in biopotential and at the optimal treatment mode is 2.4–
5.7 kJ/g-eq.

References
1. Vasilyev, A., Vasilyev, A., Dzhanibekov, A., Samarin, G., Normov, D.: Theoretical and
experimental research on pre-sowing seed treatment. In: IOP Conference Series: Materials
Science and Engineering, p. 791. 012078 (2020). https://doi.org/10.1088/1757-899x/791/1/
012078
2. Kutis, S.D., Kutis, T.L.: Elektromagnitnyye tekhnologii v rasteniyevodstve. 1. Elektromag-
nitnaya obrabotka semyan i posadochnogo materiala. [Electromagnetic technologies in crop
production. Part 1. Electromagnetic treatment of seeds and planting material], Moscow:
Ridero, p. 49 (2017)
3. Ülgen, C., Birinci Yildirim, A., Uçar Türker, A.: Effect of magnetic field treatments on seed
germination of Melissa officinalis L. Int. J. Sec. Metabolite 4(3), 43–49 (2017)

4. Kataria, S., Baghel, L., Guruprasad, K.N.: Pre-treatment of seeds with static magnetic field
improves germination and early growth characteristics under salt stress in maize and
soybean. Biocatal. Agr. Biotechnol. 10, 83–90 (2017)
5. Maffei, M.E.: Magnetic field effects on plant growth, development, and evolution. Front.
Plant Sci. 5, 445 (2014)
6. Lysakov, A.A., Ivanov, R.V.: Vliyaniye magnitnogo polya na sokhrannost’ kartofelya
[Influence of the magnetic field on the preservation of potatoes]. Adv. Modern Natural Sci.
8, 103–106 (2014)
7. De Souza, A., Sueiro, L., Garcia, D., Porras, E.: Extremely low frequency non-uniform
magnetic fields improve tomato seed germination and early seedling growth. Seed Sci.
Technol. 38, 61–72 (2010)
8. Iqbal, M., ul Haq, Z., Jamil, Y., Nisar, J.: Pre-sowing seed magnetic field treatment influence
on germination, seedling growth and enzymatic activities of melon (Cucumis melo L.).
Biocatal. Agr. Biotechnol. 2016(6), 176–183 (2016)
9. Ramalingam, R.: Seed pretreatment with magnetic field alters the storage proteins and lipid
profiles in harvested soybean seeds. Physiol. Mol. Biol. Plants 24(2), 343–347 (2018)
10. Stange, B.C., Rowlans, R.E., Rapley, B.I., Podd, J.V.: ELF magnetic fields increase
aminoacid uptake into Vicia faba L. Roots and alter ion movement across the plasma
membrane. Bioelectromagnetics 23, 347–354 (2002)
11. Zablodskiy, M., Kozyrskyi, V., Zhyltsov, A., Savchenko, V., Sinyavsky, O., Spodoba, M.,
Klendiy, P., Klendiy, G.: Electrochemical characteristics of the substrate based on animal
excrement during methanogenesis with the influence of a magnetic field. In: Proceedings of
the 40th International Conference on Electronics and Nanotechnology, ELNANO, pp. 530–
535 (2020)
12. Sinyavsky, O., Savchenko, V., Dudnyk, A.: Development and analysis methods of
transporter electric drive for electrotechnological complex of crop seed presowing by
electromagnetic field. In: 2019 IEEE 20th International Conference on Computational
Problems of Electrical Engineering (CPEE), pp. 1–6 (2019)
13. Adler, Yu.P., Markova, E.V., Granovskiy, Yu.V.: Planirovaniye eksperimenta pri poiske
optimal'nykh usloviy [Planning an experiment when searching for optimal conditions].
Moscow: Science, p. 278 (1976)
14. Kozyrskyi, V., Savchenko, V., Sinyavsky, O.: Presowing processing of seeds in magnetic
field. In: Handbook of Research on Renewable Energy and Electric Resources for
Sustainable Rural Development, IGI Global, USA, pp. 576–620 (2018)
15. Pietruszewski, S., Martínez, E.: Magnetic field as a method of improving the quality of
sowing material: a review. Int. Agrophys 29, 377–389 (2015)
Development of a Fast Response Combustion
Performance Monitoring, Prediction,
and Optimization Tool for Power Plants

Mohammad Nurizat Rahman(&),


Noor Akma Watie Binti Mohd Noor(&),
Ahmad Zulazlan Shah b. Zulkifli, and Mohd Shiraz Aris

TNB Research Sdn Bhd, 43000 Kajang, Malaysia


{nurizat.rahman,akma.noor}@tnb.com.my

Abstract. Combustion performance monitoring is a challenging task due to
insufficient post-combustion data and insights at critical furnace areas. Post-
combustion insights are valuable for reflecting the plant's efficiency and
reliability and are used in boiler tuning programmes. Boiler tuning, which is
scheduled after all the preventive maintenance suggested by the boiler
manufacturer, cannot address the operational issues faced by plant operators.
A system-level digital twin incorporating both computational fluid dynamics
(CFD) and machine learning modules is proposed in the current study. The
proposed tool can act as a combustion monitoring system to diagnose and
pinpoint boiler problems and to troubleshoot them, reducing maintenance time
and optimizing operations. The system recommends operating parameters for
different coal types and furnace conditions. The tool can be used as a guideline
in daily operation monitoring and optimization and in risk assessments of new
coals. The current paper discusses the general architecture of the proposed tool
and some preliminary results based on the plant's historical data.

Keywords: Coal-fired boiler · Combustion tuning · Optimization

1 Introduction

A coal-fired furnace is designed with the aim that a sufficient amount of heat is
transferred from the flames to the heat exchanger tubes [1]. This must be done to
ensure that the steam entering the turbine is at the specified temperature. At the same
time, the plant operators need to ensure that the gas temperature does not exceed the
design limit at the furnace exit. A higher gas temperature after the furnace zone will not
only damage the heat exchanger tubes but also lead to ash deposition problems and
environmental issues [2].
At the moment, it is a challenging task for plant operators to maintain plant
availability and efficiency due to the wide variety of coal properties [3]. The properties
of the coal received today will not be the same as the properties of the coal received
next month, and this will certainly result in different combustion behavior. Furthermore,
operators do not have sufficient information on the combustion


inside the boiler, where they can only evaluate the combustion behavior by looking at
certain parameters, e.g. carbon monoxide (CO) emission and gas temperature [4].
Insufficient post-combustion data and insights from critical furnace areas pose a
challenging task for combustion performance monitoring [4]. The inability to predict
boiler performance will increase the operating cost due to higher coal consumption and
poor reliability that may jeopardize plant operation [5]. There has been an increasing
trend in power plant operational disruption across the globe, contributed by issues such
as high furnace exit gas temperature (FEGT), increased emission levels, and ash
deposition [5].
In recent years, technologies for monitoring the quality of combustion in utility
boilers have been developed based on online spatial measurements of oxygen and fuel
[4]. The measurements are used as indications for balancing the flows of air and fuel
and, subsequently, the combustion performance. While experienced performance
engineers can relate these measurements to the combustion characteristics and adjust
the boiler to improve combustion, combustion tuning is still primarily a trial-and-error
procedure, and even a highly experienced performance engineer may need a number of
steps to optimize the boiler performance.
Computational Fluid Dynamics (CFD) modelling has often been used for com-
bustion and flow field characterizations [6, 7]. One cannot deny the capability of CFD
modelling to facilitate the combustion monitoring process by providing a visualization
of the combustion characteristics. Rousseau et al. (2020) [8] developed a CFD model
to predict the combustion behaviour of coals in coal-fired utility boilers and demonstrated
the viability of the computational approach as an effective tool for coal burning
optimization in full-scale utility boilers. General Electric (GE) Power Services have
also suggested efficiency improvements for coal-fired boilers guided by CFD, via flow
characterization and combustion optimization [4]. Via CFD, the spatial visualization of
combustibles in the flue gas and of temperature, as well as the potential areas for
slagging and fouling, can be obtained, which can assist the combustion tuning process
in achieving the required combustion quality.
Nonetheless, coal power plants, which are designed to operate within a certain
range of coal properties, would require CFD modelling to be performed multiple times
depending on the properties of the coal. To avoid excessive calculation times, a CFD
feature known as Reduced Order Modelling (ROM) [9] can be used to enable
continuous monitoring through dynamic data simulation, allowing multiple simulations
to be conducted at one time. Moreover, a system-level digital twin [4] can be generated
by integrating both CFD ROM and machine learning modules to assist not only in
combustion monitoring but also in predicting upcoming combustion anomalies,
providing guidance that allows proactive troubleshooting to be carried out to eliminate
any potential issues. With the recent spike in the popularity of machine learning,
several publications have tested its capability for optimization purposes, and the
outcomes were found to be beneficial for digital twin purposes [10, 11].
The current paper discusses the general workflow of the proposed tool, which
provides a guideline for optimizing combustion performance in thermal power plants,
specifically the coal-fired power plant where the preliminary study was carried out
based on the plant's historical data and boiler geometry.
1234 M. N. Rahman et al.

2 Methodology
2.1 Overview of the Proposed Application Architecture
The tool is based on a system-level digital twin incorporating both CFD and
machine learning modules to provide a virtual model of combustion performance in a
boiler, where it allows analysis of data and monitoring of combustion behavior to head
off anomalies (out-of-range temperatures and pollutants) before they occur. Subse-
quently, operators can check and apply corrective actions on the twin model platform
and monitor the predicted outputs prior to applying the corrective actions on the actual
boiler. Refer to Fig. 1 below for the general application architecture of the tool.

Fig. 1. General application architecture.

Referring to Fig. 1, smart components in the actual boiler system, which use sensors
to gather the real-time status of the boiler operation, will be integrated with a commercial
industrial connectivity platform in order to bridge the communication gap between the

sensors in the boiler system and the twin model platform. Once the real-time data has
passed through the platform, it is sent to the twin model, which acts as the monitoring
platform to oversee the real-time parameter statuses and predicted output statuses and
to provide anomaly alert notifications. The twin model also acts as a bridge between
the CFD and machine learning modules. The inputs are the parameters that the
operators have the flexibility to control in order to optimize the combustion performance
in the boiler. The outputs are the outcomes that the operators need to keep within an
acceptable range. There was a prior engagement with the power plant where the study
was held in order to get insights from the plant personnel on the important combustion
parameters that could massively affect the plant's reliability. Afterwards, the list of
output parameters was determined, which includes the FEGT, the flue gas concentration,
and the rear pass temperature. These three parameters are common criteria for
measuring combustion performance, as negligence in monitoring and tuning them
increases the potential for unplanned outages and additional maintenance cost due to
the formation of slagging and fouling, along with emissions surpassing the acceptable
limit.
The machine learning module acts as a "brain" where the learning happens based
on the incoming real-time data. Prior to the prediction stage, the raw historical data
(from January 2020 to August 2020) from the plant's sensors was cleaned and
underwent dimensionality reduction along with parameter extraction. The data was
cleaned to remove outliers, since there were periods when the plant experienced several
outages. The raw historical data has a large number of parameters, which may not be
effective since some parameters may be redundant and will cause longer processing
times. Moreover, several parameters do not contribute much to the specified outputs.
The raw historical data contains a total of 80 parameters. To determine the optimum
number of parameters, the data was trained using several parameter counts, including
70, 30, 25, 20, and 10. Based on the values of the Root Mean Square Error (RMSE)
and the Pearson correlation, the number of chosen parameters is 30, as the former and
the latter variables have the lowest and the highest values, respectively, compared with
the other tested parameter counts.
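A minimal sketch of this screening step is given below; the regressor, splitter and scoring calls are illustrative assumptions, as the paper does not state which model or library was used:

import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

def screen_feature_counts(X, y, counts=(70, 30, 25, 20, 10)):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    for k in counts:
        sel = SelectKBest(f_regression, k=k).fit(X_tr, y_tr)  # rank and keep k features
        model = RandomForestRegressor(random_state=0)
        model.fit(sel.transform(X_tr), y_tr)
        pred = model.predict(sel.transform(X_te))
        rmse = np.sqrt(mean_squared_error(y_te, pred))
        r, _ = pearsonr(y_te, pred)
        print(k, round(rmse, 3), round(r, 3))  # choose k with low RMSE and high r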
Once the validation of the analytics model is done, the output prediction is
executed, where the twin model platform, as mentioned before, acts as a bridge
between the predicted output from the analytics model and the CFD ROM database.
The CFD ROM database receives the real-time data from the plant's sensors along
with the predicted outputs from the analytics model to visualize the real-time
combustion behavior (5-min interval) and the predicted combustion behavior ahead of
time. From the prediction of combustion visualization in the CFD ROM interface, the
operators have the flexibility to check and apply corrective actions on the inputs of the
boiler twin model prior to applying the corrective actions on the actual boiler.

2.2 Computational Fluid Dynamics (CFD) Modelling Setup


CFD simulations for a coal-fired boiler need a complete dataset to determine the input
conditions. However, since the current work is still at the preliminary stage, the number
of controlled inputs was reduced to four: the flowrates of primary air (PA), secondary
air (SA), over-fire air (OFA), and fuel. The boiler system under study is a 1000-MW
boiler with an opposed wall-firing configuration, see Fig. 2(a).
The baseline case used the flowrates given in the operator pocket book. The
resulting FEGT from the CFD simulation was compared with the design value of FEGT
from the pocket book; see Fig. 2(b) for the visualization of FEGT.


Fig. 2. A 1000 MW opposed-firing boiler; (a) meshing model, with the SA and fuel + PA inlet locations indicated, and (b) predicted FEGT (colour scale 600–1200 °C).

Several assumptions on the operating conditions were applied to the CFD model to
balance the efficiency and the accuracy of the model. The air and fuel flow rates for the
individual burners and OFA ports are assumed to be identical. The swirl effect from the
burners is also neglected. On the numerical side, several sub-models were implemented
to characterize the coal combustion behavior, radiation, gaseous reaction, and
turbulence, all of which dictate the source terms for the mass, momentum, energy, and
species governing equations. The incoming coal particles were tracked with a
Lagrangian scheme, considering the turbulent dispersion of particles [12]. The
distribution of particle sizes followed the Rosin-Rammler distribution function, fitted to
the fineness test provided by the plant's operators. For the devolatilisation of the coal,
FG-DVC, an advanced coal network model, was used to obtain the volatiles
composition along with the respective rate constants [13]. The rate of devolatilisation
was computed with a single Arrhenius equation model with an activation energy of
33.1 MJ/kmol and a pre-exponential factor of 5799 s−1 [12].
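For orientation, the single-rate expression can be sketched as below; the rate constants are those quoted above, while the ultimate volatile yield and the particle temperature history are illustrative placeholders, not values from the study.

```python
# Single Arrhenius-rate devolatilization: dV/dt = k(T) * (V* - V),
# with k(T) = A * exp(-E / (R_u * T)). A and E follow [12]; V* and the
# temperature history below are illustrative assumptions.
import math

A = 5799.0          # pre-exponential factor, 1/s
E = 33.1e6          # activation energy, J/kmol
R_U = 8314.0        # universal gas constant, J/(kmol K)
V_STAR = 0.45       # ultimate volatile yield (mass fraction), assumed

def rate_constant(T):
    return A * math.exp(-E / (R_U * T))

def devolatilize(T_of_t, dt=1e-4, t_end=0.5):
    """Integrate the released volatile fraction V for a temperature
    history T_of_t(t) using simple explicit Euler steps."""
    V, t = 0.0, 0.0
    while t < t_end:
        V += rate_constant(T_of_t(t)) * (V_STAR - V) * dt
        t += dt
    return V

# Example: a particle heating linearly from 400 K to 1400 K over 0.5 s.
print(devolatilize(lambda t: 400.0 + 2000.0 * t))
```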
The volatiles reactions were solved by applying the kinetic rate/eddy-dissipation
model. The reaction steps are from the Jones and Lindstedt mechanism for hydrocarbon
gases, and the rate constants for tar are based on Smoot and Smith [12]. The
turbulence was solved using the SST k–ω model, which was found to give better
convergence owing to its ability to effectively blend the robust and accurate
formulation of the k–ω model in the near-wall region [14]. The radiative heat transfer
from the coal combustion was resolved via the discrete ordinates (DO) model. The DO
model is widely used for reacting flow modelling due to its compatibility with CFD
approaches, both being based on a finite volume approach [15].
For the simulations, ANSYS Fluent (version R19.1) was used, with user-defined
functions (UDFs) built for the devolatilization. The mesh for the boiler's domain was
constructed from 1.2 million hexahedral cells. To reduce the mesh count, the reheater
and superheater panels were simplified as a number of thin walls. The boiler water
wall membrane, which experiences both convective and conductive heat transfer, was
assigned an overall heat transfer coefficient of 500 W/m² K and an emissivity of 0.7
[12]. The superheater and reheater panels were assumed to have an overall heat
transfer coefficient of 3000 W/m² K.

2.3 Reduced Order Modelling (ROM)


While CFD is a powerful platform that can generate a huge amount of information
about the combustion behavior in coal-fired boilers, the simulation running time can be
quite expensive due to the vast computational resources required. Hence,
implementing a full CFD model in the digital twin platform is highly impractical, as it
could not visualize and predict the real-time data from the plant's operation. As a
countermeasure, a ROM approach can supplement the CFD simulations by quickly
estimating and visualizing the aforementioned outputs based on the inputs from the
power plant sensors and the machine learning model [9].
The ROM for the current study was created by advanced mathematical methods
that combine the three-dimensional solver result snapshots from a set of design
inputs [9]. The result snapshot is located at the furnace exit, as shown in Fig. 2(b). The
ROM was produced from several CFD simulations within the specified range of design
inputs. Even though producing the ROM was computationally expensive, the final
ROM database can be used at negligible computational cost, with the capability for
near real-time analysis [9].
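While the commercial ROM builder used here is not described in detail, the general snapshot-based idea can be illustrated with a proper orthogonal decomposition (POD) sketch: the CFD snapshots are compressed into a few spatial modes, and a new design point is evaluated by interpolating the modal coefficients. This is a generic approximation of the technique under assumed data, not the vendor's algorithm.

```python
# POD-style ROM sketch: SVD of a snapshot matrix plus interpolation of
# modal coefficients over a single design input (e.g. OFA flowrate).
# Snapshots and design values are placeholders for CFD results.
import numpy as np

def build_rom(snapshots, n_modes=3):
    """snapshots: (n_cells, n_cases) field values at the furnace exit."""
    mean = snapshots.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(snapshots - mean, full_matrices=False)
    modes = U[:, :n_modes]                       # spatial basis
    coeffs = modes.T @ (snapshots - mean)        # (n_modes, n_cases)
    return mean, modes, coeffs

def evaluate_rom(x_new, x_cases, mean, modes, coeffs):
    # Interpolate each modal coefficient over the design input, then
    # reconstruct the field: near-instant compared with a full CFD run.
    a = np.array([np.interp(x_new, x_cases, c) for c in coeffs])
    return mean.ravel() + modes @ a

x_cases = np.array([0.6, 0.8, 1.0, 1.2])         # normalized OFA flow
snaps = np.random.rand(5000, 4)                  # stand-in CFD fields
rom = build_rom(snaps)
field = evaluate_rom(0.9, x_cases, *rom)
print(field.shape)                               # (5000,)
```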

2.4 Machine Learning


For the data pre-processing, feature selection was performed to reduce the
dimensionality of the data by selecting only a subset of measured features (predictor
variables) to create a model. Feature selection algorithms search for a subset of
predictors that optimally models the measured responses, subject to constraints such as
required or excluded features and the size of the subset. The major benefits of feature
selection include improved prediction performance, faster and more cost-effective
predictors, and a better understanding of the data generation process [16]. Using too
many features can degrade prediction performance even when all features are relevant
and contain information about the response variable.
In the current scenario, a total of 80 parameters were collected and analyzed by
observing the correlation matrix, and the parameters were reduced to 30 relevant
parameters. The MATLAB Regression Learner app was used to train the machine
learning model. The software trains several regression models, including linear
regression models, regression trees, Gaussian process regression models, support
vector machines, and ensembles of regression trees. It can automatically train one or
more regression models, compare validation results, and choose the best model for the
regression problem.
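As an open-source analogue of this workflow (illustrative, not the actual MATLAB Regression Learner session), the sketch below trains several regressor families and keeps the one with the lowest cross-validated RMSE:

```python
# Sketch of the model-comparison step: try several regressor families
# and keep the one with the lowest 5-fold cross-validated RMSE.
from sklearn.tree import DecisionTreeRegressor
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import BaggingRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import cross_val_score

def pick_best_model(X, y):
    candidates = {
        "linear regression": LinearRegression(),
        "fine tree": DecisionTreeRegressor(min_samples_leaf=4),
        "linear SVM": SVR(kernel="linear"),
        "bagged trees": BaggingRegressor(random_state=0),
        "gaussian process": GaussianProcessRegressor(),
    }
    results = {}
    for name, model in candidates.items():
        # Negated MSE scorer; flip the sign and take the square root.
        mse = -cross_val_score(model, X, y, cv=5,
                               scoring="neg_mean_squared_error").mean()
        results[name] = mse ** 0.5
    best = min(results, key=results.get)
    return best, results
```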

3 Results and Discussion

3.1 CFD ROM Capabilities


A number of simulation cases were executed using the ROM, and the FEGT result
from one of the simulations was then compared with the design value of FEGT from
the operator pocket book for a similar combustion scenario. An error of less than 7
percent was found, demonstrating the capability of the current CFD ROM model to
reasonably predict the combustion behavior in a boiler.
The real-time capability of the CFD ROM model was also tested, and the model
managed to display the output almost instantaneously. Figure 3 below shows two
examples of FEGT results from different boiler operating conditions.


Fig. 3. FEGT results from (a) the design and (b) the reduced over-fire air (OFA) conditions.
The reduced OFA scenario in Fig. 3 predicts an imbalanced temperature
distribution along with excessive temperature at the furnace exit area. A higher gas
temperature after the furnace zone would not only affect the tubes of the heat
exchangers but also cause ash deposition issues and environmental problems [2].

3.2 Machine Learning Capabilities


Table 1 below shows the RMSE for each algorithm. The Fine Tree algorithm shows the
lowest RMSE value, i.e., the highest prediction accuracy. The linear regression model
shows the highest RMSE, as its simple functional form cannot capture the moderately
complex data. The Support Vector Machine (SVM) algorithms fall in the middle of the
error range; SVMs are more often used for classification.

Table 1. RMSE values for each algorithm.

Machine learning algorithm      RMSE
Linear regression               103.99
Stepwise linear regression       80.15
Fine tree                         4.53
Coarse tree                       8.65
Linear SVM                       40.23
Fine Gaussian SVM                36.78
Ensemble bagged trees            12.53
Gaussian process regression      25.74

Figure 4 below shows the predicted and the actual FEGT data from the historical test
set. With the prediction available ahead of the provided time frame, plant operators
will have the capability to manage their work plan so as to avoid unwanted conditions.

Fig. 4. Prediction vs actual outputs of FEGT (temperature in °C over January–August 2020; the test period includes an unplanned outage).

4 Conclusion

A system-level digital twin of combustion performance in a coal-fired boiler,
integrating both CFD and machine learning models, is proposed in the current study.
Both the CFD ROM and the machine learning models were validated against operating
data from the coal power plant under study, and acceptable errors were found for both.
As the current study focused mainly on the feasibility of the proposed tool, a well-
integrated digital twin system tested in a power plant is the next step. The architecture
of the proposed tool shows major potential for a learning-based model to be integrated
into boiler operation, assisting not only in the boiler tuning process but also in
maintaining the reliability of the boiler system in the long run.

References
1. Speight, J.G.: Coal-Fired Power Generation Handbook, 1st edn. Scrivener Publishing LLC,
Massachusetts (2013)
2. Beckmann, A.M., Mancini, M., Weber, R., Seebold, S., Muller, M.: Measurements and CFD
modelling of a pulverized coal flame with emphasis on ash deposition. Fuel 167, 168–179
(2016)
3. Mat Zaid, M.Z.S., Wahid, M.A., Mailah, M., Mazlan, M.A.: Coal combustion analysis tool
in coal fired power plant for slagging and fouling guidelines. In: The 10th International
Meeting of Advances in Thermofluids 2018, AIP Conference Proceedings, vol. 2062 (2019)
4. Zhou, W.: Coal fired boiler flow characterization, combustion optimization and efficiency
improvement guided by computational fluid dynamics (CFD) modeling. ResearchGate
(2017)
5. Achieving Better Coal Plant Efficiency and Emissions Control with Digital, GE (2017)
6. Laubscher, R., Rousseau, P.: Coupled simulation and validation of a utility-scale pulverized
coal-fired boiler radiant final-stage superheater. Thermal Sci. Eng. Progress 18, 100512
(2020)
7. Belosevic, S., Tomanovic, I., Crnomarkovic, N., Milicevic, A.: Full-scale CFD investigation
of gas-particle flow, interactions and combustion in tangentially fired pulverized coal
furnace. Energy 179, 1036–1053 (2019)
8. Rousseau, P., Laubscher, R.: Analysis of the impact of coal quality on the heat transfer
distribution in a high-ash pulverized coal boiler using co-simulation. Energy 198, 117343
(2020)
9. Rowan, S.L., Celik, I., Gutierrez, A.D., Vargas, J.E.: A reduced order model for the design
of oxy-coal combustion systems. J. Combustion 2015, 1–9 (2015)
10. Zhao, Y.: Optimization of thermal efficiency and unburned carbon in fly ash of coal-fired
utility boiler via grey wolf optimizer algorithm. IEEE Access 7, 114414–114425 (2019)
11. Sangram, B.S., Jagannath, L.M.: Modeling and optimizing boiler design using neural
network and firefly algorithm. J. Intell. Syst. 27, 393–412 (2018)
12. Yang, J.-H., Kim, J.-E.A., Hong, J., Kim, M., Ryu, C., Kim, Y.J., Park, H.Y., Baek, S.H.:
Effects of detailed operating parameters on combustion in two 500-MWe coal-fired boilers of
an identical design. Fuel 144, 145–156 (2015)
13. Czajka, K.M., Modlinski, N., Kisiela-Czajka, A.M., Naidoo, R., Peta, S., Nyangwa, B.:
Volatile matter release from coal at different heating rates – experimental study and kinetic
modelling. J. Anal. Appl. Pyrol. 139, 282–290 (2019)
14. Yeoh, G.H., Yuen, K.K.: Computational Fluid Dynamics in Fire Engineering: Theory,
Modelling and Practice, 1st edn. Butterworth-Heinemann, USA (2009)
15. Joseph, D., Benedicte, C.: Discrete Ordinates and Monte Carlo Methods for Radiative
Transfer Simulation applied to CFD combustion modelling. Research Gate (2009)
16. Liu, H.: Encyclopedia of Machine Learning. Springer, Boston (2010)
Industry 4.0 Approaches for Supply Chains
Facing COVID-19: A Brief Literature Review

Samuel Reong1(✉), Hui-Ming Wee1, Yu-Lin Hsiao1, and Chin Yee Whah2

1 Industrial and Systems Engineering Department, Chung Yuan Christian University,
Taoyuan City 320, Taiwan
samuel.reongjareonsook@gmail.com
2 School of Social Sciences, Universiti Sains Malaysia,
11800 Gelugor, Penang, Malaysia

Abstract. The widespread disruptions caused by the COVID-19 pandemic to the
performance and planning of global supply chains have become a matter of
international concern. While some key supply chains are tasked with the prevention
and eventual treatment of the virus, other commercial supply chains must also adapt
to issues of shortages, uncertain demand, and supplier reselection. This paper
provides a brief literature survey of current Industry 4.0 solutions pertinent to
COVID-19 and identifies the characteristics of successful supply chain solutions to
the pandemic. In this investigation, it is found that differing technology-enabled
supply chain strategies are required for the pre-disruption, disruption, and
post-disruption phases. Furthermore, a comparison of supply chain success in
several nations suggests a need for data transparency, public-private partnerships,
and AI tools for effective manufacturing implementation.

Keywords: SCM · Manufacturing · Industry 4.0 · COVID-19

1 Introduction

The onset of COVID-19 in 2020 has detrimentally impacted countless lives and
industries across the globe. In terms of manufacturing and supply chain management,
global supply chains (networks of suppliers, manufacturers, retailers, and distributors)
that had adopted lean manufacturing practices with streamlined inventories suddenly
found themselves crippled by shortages generated by lockdowns in China and in other
Southeast Asian supplier regions. Shocks generated by the pandemic, such as
overcompensated ordering, lack of information sharing, and lack of collaboration
between the private and public sectors, have all forced various industries to seek
alternative solutions in order to survive.
Many researchers, however, have also identified these forced changes as
opportunities for growth. While each industry experiences various difficulties across
the board due to inadequate visibility from upstream to downstream channels, many of
the available solution methods are shared. The adaptation of prediction models used in
conjunction with automation and modelling techniques allows many manufacturing


factories to continue operations, and flexible supplier selection paired with
collaborative geographical tracking systems would have allowed manufacturers to
predict disruptions and adjust their planning around impacted areas. In a few cases,
some of these digital solutions were preemptively prepared, which enabled supply
chains in those countries to continue operations. Such scenarios have led to a future
outlook of “resilient” supply chains, in which sourcing and manufacturing capacities
are automatically prepared ahead of time across a variety of international channels,
proceeding regardless of the situation at hand. It has also been proposed that most
AI-driven technologies and digital platforms used to solve issues created by the
pandemic would also strengthen the resilience of manufacturing networks against
similar future disruptions.

2 Current Enabling Technologies

Javaid et al. [1] provide some insight into technologies that present opportunities
for supply chains seeking to adopt Industry 4.0 practices: cloud computing services
such as Amazon AWS, Azure, and Google Cloud reduce operating costs and increase
efficiency. This is apparent through close examination of the supply chain failures that
gave rise to a new class of manufacturing technologies using self-learning AI, which
researchers hope will eventually enable smart supply chains to operate autonomously.
The authors maintain that Big Data methods can forecast the extent of COVID-19,
while AI can be used to predict and manage equipment manufacturing. Ivanov [2]
demonstrated that simulations assessing the supply chain effects of COVID-19 could
be used to show the effects of the pandemic across multiple supply chain levels.
Furthermore, while tracing diseases across geographic locales was impossible before
2010, researchers such as Dong et al. [3] at Johns Hopkins University have harnessed
geographic information systems (GIS) alongside data mining and machine learning
techniques to geographically trace the spread of the COVID-19 pandemic, allowing
for preemptive preparation. As detection techniques in other systems were used to
isolate the spread of the disease in localized areas, the spread of the pandemic was
tracked using live, real-time methods, with immediate and automated alerts sent to key
professionals. Wuest et al. [4] further suggest that the pandemic provides strong
justification for industries to adopt an “AI-Inspired Digital Transformation,”
spearheaded by unmanned smart factories, automated supply chains and logistics
systems, AI-based forecasts for demand variation, shortages and bottlenecks, and
predictive maintenance routines to lessen the severity of internal disruptions.

3 Supply Chain Characteristics

Ivanov [2] further observed that supply chains are characterized by different challenges
corresponding to their respective objectives, noting that supply chains affected by
COVID-19 can be further categorized as humanitarian supply chains and commercial
supply chains. Altay et al. [5] found that post-disaster humanitarian logistics differed
greatly from pre-disaster logistics, or even commercial logistics.
3.1 Humanitarian Supply Chains


According to [5], humanitarian supply chains are often hastily formed in order to
respond to disaster relief needs, and as a result possess much higher levels of
complexity than corporate supply chains; they are also much more sensitive to
disruptions. In order to mitigate these detrimental effects, the authors suggested that
prior to a disaster, a humanitarian supply chain should be designed for one of two
orientations: (1) flexibility, which can quickly adapt to change, or (2) stability, which
follows a traditional hierarchical design and maximizes efficiency. In relation to
pandemic response teams, the necessity of rapidly identifying sites for mass
vaccination requires the key success factors of communication and efficient vertical
integration. Thus, response planners must decide between a static, efficient setup and a
dynamic organization able to absorb potential aftershocks subsequent to the first
disaster. It was then suggested that the post-disaster phase be characterized by
effective collaboration, transparency, and accountability. Local governments were
identified as playing a strong role in both disaster preparedness and response,
providing support through advanced planning and stronger coordination. Through
close and open working relationships, disaster relief supply chains would be further
ensured to perform effectively. Furthermore, Altay et al. [9] noted that humanitarian
supply chains are often hastily formed and vulnerable to disruptions; these disruptions
intensify after the effects of a disaster and can often prevent aid from reaching the
intended recipients of relief efforts.

3.2 Commercial Supply Chains


While commercial supply chains have a more straightforward structure and lack the ad
hoc, impromptu complexity of humanitarian supply chains, they nevertheless carry
their own challenges. Unlike their more flexible counterparts, commercial supply
chains must adhere to uniform standards while maintaining an effective level of
financial performance. Their primary goal is to ensure the continued profitability of
both suppliers and retailers, in addition to maintaining brand loyalty with consumers.
Rigid scheduling issues and lack of communication across the multiple levels of each
supply chain thus result in disruptions at multiple levels and underutilized potential.
Wuest et al. [4] found that commercial industries were impacted at different product
life stages by the onset of COVID-19, as shown in Fig. 1.
It was further noted that the most affected industries were the service and hospitality
industries. In addition, the challenges faced by commercial supply chains remain
widely varied in nature: automotive and aircraft manufacturing plants have closed due
to safety issues and the lack of remote management capacity. Since private industry
has proven incapable of matching demand, many countries such as the US have made
use of state interventionist policies to ensure supply-side capacity. One particular
example includes legal enforcement on GM and General Electric to switch from the
production of non-essential consumer goods to that of medical supplies.
Fig. 1. Manufacturing and supply networks affected by COVID-19 at different stages (beginning, middle, and end of life) for automotive, pharmaceuticals, aircraft, and defense manufacturing; adapted from Wuest et al. [4].

4 Solution Characteristics

Changes to manufacturing and supply chains in 2020 have been compared to a “global
reset,” according to the most recent meeting of the World Economic Forum’s Global
Lighthouse Network [6]. Out of the 54 leading advanced manufacturers in the world, a
performance survey found the relevant manufacturing changes to be (1) Agility and
customer centricity, (2) Supply chain resilience, (3) Speed and productivity, and
(4) Eco-efficiency. In terms of relevance to the COVID-19 pandemic, the former three
shifts are shown below (Table 1):

Table 1. Three necessary industry shifts: Adapted from the WEF 2020 Agenda.

Global changes → Necessary industry shifts
• Demand uncertainty and disruptions → Agility; customer centricity
• National security interests, trade barriers and logistics disruption; disruption of global manufacturing → Supply chain resilience
• Forced transition to remote management and digital collaboration; physical distancing regulations; workforce displacement and unbalanced growth; economic recession (costs must be reduced) → Speed and productivity
By virtue of their success, the implementation of effective technology-enabled
solutions should thus be tailored to address these issues. The points below discuss the
possible implementation methods explored by researchers and examine examples from
real industries during the COVID-19 pandemic.

4.1 Supply Chain Agility and Resilience


Swafford et al. [7] defined supply chain agility quantitatively as an enterprise's ability
to reduce its own lead time and respond to shifts in supply and demand, namely, how
quickly its rates of procurement and manufacturing activities can be changed. Agility
is often used in conjunction with resilience, which is defined by Sheffi et al. [8] as the
ability of an enterprise to bounce back from a disruption, characterized by redundancy
(reserve resources such as a safety stock) and flexibility (the ability to redirect material
flows through alternative channels).
In line with the above definitions, Altay et al. [9] established that humanitarian
supply chain activities are categorized into pre-disaster and post-disaster activities.
Using survey data, they found that while agility and flexibility were effective for
pre-disaster preparations, only flexibility proved significant for post-disaster activities.
These findings suggest that the appropriate objectives must be established for
implemented supply chain technologies before, during, and after major disruptions.
Furthermore, these results imply that shifting between phases necessitates adequate
prediction and detection systems.
One such solution was proposed by Ivanov and Dolgui [10], who prioritized
supply chain resilience and recovery planning and introduced the concept of a
“digital supply chain twin” alongside a data-driven supply chain risk modelling
framework. Using this framework, the digital supply chain twin provides
decision-making support through machine learning analysis and modelling software.
Marmolejo-Saucedo et al. [15, 16], however, noted that many papers present the term
“digital twin” incorrectly; rather, this area of research calls for valid statistical analysis
and information collection throughout the use of agent-based simulation.
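To make the simulation idea concrete, the sketch below (entirely illustrative parameters, not Ivanov's model) propagates a temporary supplier shutdown through a three-echelon chain and counts downstream stockout days; a digital supply chain twin would run such what-if scenarios against live data.

```python
# Toy multi-echelon simulation of disruption propagation: the supplier
# closes for a window of days and shortages ripple downstream.
def simulate(days=60, demand=10, start_stock=50, closure=range(10, 30)):
    factory, retailer = start_stock, start_stock
    stockout_days = 0
    for day in range(days):
        # The supplier ships to the factory unless closed by the disruption.
        factory += 0 if day in closure else demand
        # The factory forwards what it can to the retailer.
        shipped = min(demand, factory)
        factory -= shipped
        retailer += shipped
        # The retailer serves end-customer demand.
        served = min(demand, retailer)
        retailer -= served
        if served < demand:
            stockout_days += 1
    return stockout_days

print(simulate())   # days on which the retailer could not meet demand
```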
In particular, the benefits of appropriate decision-making supported by early
warning systems cannot be overlooked. As established above, the most effective
supply chain preparation is preliminary, pre-disruption preparation. For early warning
systems to be effective, a degree of cooperation must take place on both a domestic and
a cross-border level. A CDC report by Lin et al. [11] documents the use of
cross-departmental surveillance in a real-time database in Taiwan. The authors describe
how the Central Epidemic Command Center (CECC) partnered with the National
Health Insurance administration's confidential, 24-h interval cloud-based patient
database and with the Customs and Immigration database in order to identify and track
persons with high infection risk.
Anticipating shortages, the Taiwanese government suspended mask exports and
funded Mask Finder, a mobile phone application that identifies local mask distribution
points and their current stocks. Next, in order to prevent overbuying, the
government implemented a mask rationing system tied to each resident's identification
card. Thus, the case study demonstrates how public and private sector cooperation,
augmented by sensor technologies and database systems, assists essential resource
allocation.

4.2 Supply Chain Productivity


Speed and productivity, per the definition set forth by [6], are associated with
maintaining production and distribution goals despite new challenges and constraints.
An essential direction for analysis is identifying the causes of supply chain success in
certain nations following the COVID-19 crisis, and of failure in others. Dai et al. [12]
remarked that the lack of data transparency greatly limited supply chain mobility in the
United States PPE supply chain, where supply chain members were left in the dark
concerning the capacities, locations, and reliability of other members. In fact, such
information was intentionally kept private as trade secrets. Such an approach can be
counterproductive to the reliability of a supply chain, much like how a single point of
failure can cripple even the most resilient network. To address this problem, the
authors also supported public-private partnerships, much like what was observed by
[11] in Taiwan, and suggested the use of digital management and cyber-physical
systems in the production line.
One novel data transparency and validation system for supply chains is the
MiPasa project by IBM, which uses the IBM blockchain platform and cloud to verify
information sources for analysts seeking to map the appropriate COVID-19 response
[13]. HACERA, the startup company that owns MiPasa, has collaborated with
healthcare agencies such as the CDC, the WHO, and the Israeli Public Health Ministry
to make peer-to-peer data ledgers possible. Accordingly, researchers worldwide have
proposed similar uses of blockchain technology to verify and share information
between supply chain members. Unlike present methods, which require costly and
time-consuming verification procedures, blockchain technology enables data sharing
between members with relative ease, increased trustworthiness, and tracking with
almost no downtime.
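The core property blockchain brings here, tamper-evident shared records, can be illustrated with a minimal hash-chained ledger. This is a pedagogical sketch only, not the MiPasa or IBM Blockchain implementation.

```python
# Minimal hash-chained ledger: each record commits to its predecessor,
# so any retroactive edit by a supply chain member is detectable.
import hashlib
import json

def block_hash(record, prev_hash):
    payload = json.dumps({"record": record, "prev": prev_hash},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def append(ledger, record):
    prev = ledger[-1]["hash"] if ledger else "genesis"
    ledger.append({"record": record, "prev": prev,
                   "hash": block_hash(record, prev)})

def verify(ledger):
    prev = "genesis"
    for entry in ledger:
        if entry["prev"] != prev or \
           entry["hash"] != block_hash(entry["record"], prev):
            return False
        prev = entry["hash"]
    return True

ledger = []
append(ledger, {"member": "supplier-A", "masks_shipped": 10000})
append(ledger, {"member": "factory-B", "masks_received": 10000})
print(verify(ledger))                       # True
ledger[0]["record"]["masks_shipped"] = 500  # tampering attempt
print(verify(ledger))                       # False
```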

5 Direct Supply Chain Impacts & Research Strategies

5.1 Vaccine Development and Distribution


As of September 2020, 10 vaccine candidates for COVID-19 prevention were reported
to have entered their clinical trial phases [19]. The main commercial players in the
development of these vaccines include CanSino Biologics, Sinovac Biotech, Novavax,
BioNTech/Pfizer, Inovio Pharmaceuticals, and Symvivo. Global attention is focused on
several specific qualities of the vaccine candidates that will inevitably take the lead, as
these characteristics will heavily influence the issues faced by the manufacturing,
supply chain, and end-user parties.
Specifically, factors of the leading candidates such as storage temperature, shelf
life, and the number of required doses per patient were stated to heavily impact
implementation in global supply chains. In a conference addressing the supply chain
implications of a COVID-19 vaccine, CanSino CFO Jing Wang stated that the most
critical logistical challenge is the cold chain [17]. Namely, whether the vaccine can be
maintained in cold storage between 2–8 °C, or at temperatures lower than −60 °C, will
affect availability and cold chain solution methodologies. The latter option, in
particular, has historically faced strict limitations in available technology and supply
constraints, as during distribution of the Ebola vaccine. Alex de Jonquières, chief of
staff at Gavi, the Vaccine Alliance, also noted that the shelf life will ultimately
determine whether the vaccines can be stored in central facilities or regional
warehouses. Lastly, while a single dose is ideal, multiple-dose requirements further
compound the quantity problem. One recent leading candidate for which this is
anticipated to carry over into distribution is that of BioNTech/Pfizer, for which a
2-dose regimen has been confirmed [20].
Thus, several implications exist for researchers seeking to model distribution
solutions based on the most promising COVID-19 vaccine candidates. The nature of
cold chain and ultra-cold chain capacity and their available variants will be a major
factor, for which Lin et al. [21] have recently formed a cold chain decision-making
model, finding that certain cold chain transport combinations can be calculated as more
viable than others under specific constraints. Nevertheless, the authors noted, little
research has yet been published on the subject.
The shelf life of the leading candidates will determine whether traditional central
warehouse network models deserve consideration, or whether more innovative
solutions such as cross-docking and last-mile delivery logistics will grow in popularity.
At the same time, the practicality of these latter methods must also be addressed, as
limits on how far healthcare workers can safely travel to collect or apply treatment will
be readily apparent in each country. More than ever, proper planning and organization
will be necessary to prevent wastage and deterioration. Currently, the owners of the
other vaccine candidates have yet to release their distribution requirements.
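A toy version of such a decision model (hypothetical transport modes, costs, and requirements; not the formulation of Lin et al. [21]) simply filters the options by the candidate's temperature and shelf-life needs and picks the cheapest feasible one:

```python
# Toy cold chain transport selection: keep only the modes that can hold
# the required temperature and deliver within the shelf life, then
# choose the cheapest. All numbers are illustrative placeholders.
MODES = [
    # (name, lowest temp reachable [deg C], transit days, cost per dose)
    ("refrigerated truck",          2,  4, 0.10),
    ("refrigerated air freight",    2,  1, 0.60),
    ("dry-ice air freight",       -70,  2, 1.20),
    ("ultra-cold sea container",  -70, 20, 0.40),
]

def choose_mode(required_temp, shelf_life_days):
    feasible = [m for m in MODES
                if m[1] <= required_temp and m[2] <= shelf_life_days]
    if not feasible:
        return None  # no viable cold chain under these constraints
    return min(feasible, key=lambda m: m[3])

# A -60 deg C candidate with a short shelf life:
print(choose_mode(required_temp=-60, shelf_life_days=5))
# A 2-8 deg C candidate:
print(choose_mode(required_temp=2, shelf_life_days=10))
```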

5.2 Personal Protective Equipment (PPE)


According to Park et al. [18], the ongoing shortage of PPE materials stemmed from the
offshoring of PPE production to the People's Republic of China, where factory bans
caused global shortages. Under the just-in-time strategy, national stockpiles of
materials used in PPE products were continuously reduced in order to make efficiency
gains. While this is common practice in many sectors, it proved problematic in the
event of a disease outbreak. As a result of the COVID-19 pandemic, global supply
chains are experiencing temporary shortages until the PPE supply can be renewed.
An adapted summary table of strategies selected by the US Centers for Disease
Control and Prevention for PPE supply optimization [22], as pertaining to systemic
objectives, can be seen below (Table 2):
Table 2. “Summary Strategies to Optimize the Supply of PPE during Shortages,” selected for
possible model objectives; adapted from the CDC guide, July 2020.

PPE type: All PPE
Conventional capacity:
• Use telemedicine whenever possible
• Limit the number of patients going to hospital/outpatient settings
• Limit face-to-face health care professional encounters with patients
Contingency capacity:
• Selectively cancel elective and non-urgent procedures and appointments where PPE is typically used

PPE type: N95 respirators and facemasks
Conventional capacity:
• Implement just-in-time fit testing
• Extend the use of N95 respirators by wearing the same N95 for repeated close-contact encounters with several patients (within reasonable limits)
• Restrict face mask usage to health care professionals, rather than asymptomatic patients (who might use cloth coverings) for source control
Contingency capacity:
• Facilities communicate with local healthcare coalitions and public health partners to identify additional supplies
• Track facemasks in a secure and monitored site and provide facemasks to symptomatic patients upon check-in at entry points
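One way to operationalize the switch between the conventional and contingency columns is a simple burn-rate projection, in the spirit of shortage planning tools such as the CDC's PPE burn rate calculator; the quantities and threshold below are invented for illustration.

```python
# PPE burn-rate sketch: project the days of supply remaining and flag
# when a facility should escalate from conventional to contingency
# capacity. All figures are hypothetical.
def days_remaining(stock, daily_use):
    return stock / daily_use if daily_use else float("inf")

def capacity_level(stock, daily_use, contingency_days=14):
    d = days_remaining(stock, daily_use)
    return "contingency" if d < contingency_days else "conventional"

# Illustrative: 4200 N95s on hand, 350 used per day -> 12 days left.
print(days_remaining(4200, 350))        # 12.0
print(capacity_level(4200, 350))        # contingency
```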

Furthermore, Park et al. maintain that the main bottlenecks in the PPE supply chain
include raw material shortages such as polypropylene, lack of production
infrastructure, export bans, and transport constraints caused by quarantine measures or
limited workforce capacity. Thus, research on alternate sourcing solutions is a matter of
concern for all related industries. A segment of research in 2020, such as that of
Campos et al. [23], suggests growing interest in the reuse of PPE masks and respirators.
In such a scenario, the local scale and collection methodologies of reverse logistics
activities can also be explored.

6 Conclusion

Due to its point of origin, scale, and difficulty of treatment, COVID-19 has created
some of the most significant and widespread global supply chain disruptions in modern
times. Supply chain managers and IT developers seeking to weather the effects of the
pandemic must first distinguish the different challenges faced by their humanitarian or
commercial enterprise, and then assess whether they are in a pre-disruption or
disruption state.
If the supply chain is in a pre-disruption state, the use of modeling systems and
prediction technologies will allow preparation of alternative material flow routes,
which increases supply chain resilience. Whether the supply chain will be able to adapt
to such a disruption, however, will depend not only on the level of technological
expertise available, but also on information-sharing attitudes in the industry and on the
domestic level of public-private cooperation.
Lastly, researchers seeking to model the COVID-19 vaccine supply chain are
advised to investigate the cold chain, shelf life, and dosage requirements of the leading
candidates, which will determine the relevance of traditional warehousing network
models versus more innovative last-mile delivery solutions. PPE supply chain modelers
may find direction in alternate procurement, or in reverse logistics models that facilitate
the sterilization and reuse of protective equipment.

References
1. Javaid, M., Haleem, A., Vaishya, R., Bahl, S., Suman, R., Vaish, A.: Industry 4.0
technologies and their applications in fighting COVID-19 pandemic. Diabetes & Metabolic
Syndrome: Clinical Research & Reviews (2020)
2. Ivanov, D.: Predicting the impacts of epidemic outbreaks on global supply chains: a
simulation-based analysis on the coronavirus outbreak (COVID-19/SARS-CoV-2) case.
Transp. Res. Part E Logist. Transp. Rev. 136, 101922 (2020)
3. Dong, E., Du, H., Gardner, L.: An interactive web-based dashboard to track COVID-19 in
real time. Lancet. Infect. Dis. 20(5), 533–534 (2020)
4. Wuest, T., Kusiak, A., Dai, T., Tayur, S.R.: Impact of COVID-19 on Manufacturing and
Supply Networks—The Case for AI-Inspired Digital Transformation. Available at SSRN
3593540 (2020)
5. Altay, N., Gunasekaran, A., Dubey, R., Childe, S.J.: Agility and resilience as antecedents of
supply chain performance under moderating effects of organizational culture within the
humanitarian setting: a dynamic capability view. Prod. Plann. Control 29(14), 1158–1174
(2018)
6. Betti, F., De Boer, E.: Global Lighthouse Network: Four Durable Shifts for a Great Reset in
Manufacturing [Pdf]. World Economic Forum, Cologny (2020)
7. Swafford, P.M., Ghosh, S., Murthy, N.: The antecedents of supply chain agility of a firm:
scale development and model testing. J. Oper. Manage. 24(2), 170–188 (2006)
8. Sheffi, Y., Rice, J.B., Jr.: A supply chain view of the resilient enterprise. MIT Sloan Manage.
Rev. 47(1), 41 (2005)
9. Altay, N., et al.: Agility and resilience as antecedents of supply chain performance under
moderating effects of organizational culture within the humanitarian setting: a dynamic
capability view. Prod. Plann. Control 29(14), 1158–1174 (2018)
10. Ivanov, D., Dolgui, A.: A digital supply chain twin for managing the disruption risks and
resilience in the era of Industry 4.0. Production Planning & Control, pp. 1–14 (2020)
11. Lin, C., Braund, W.E., Auerbach, J., Chou, J.H., Teng, J.H., Tu, P., Mullen, J.: Policy
decisions and use of information technology to fight 2019 novel coronavirus disease,
Taiwan. Emerging Infectious Diseases (2020)
12. Dai, T., Zaman, M.H., Padula, W.V., Davidson, P.M.: Supply chain failures amid Covid-19
signal a new pillar for global health preparedness (2020)
13. Singh, G., Levi, J.: MiPasa project and IBM Blockchain team on open data platform to
support Covid-19 response, March 2020. https://www.ibm.com/blogs/blockchain/2020/03/
mipasa-project-and-ibm-blockchain-team-on-open-data-platform-to-support-covid-19-
response/. Accessed Sept 2020
14. Intelligent Computing & Optimization, Conference Proceedings ICO 2018. Springer,
Cham. ISBN 978-3-030-00978-6
15. Marmolejo-Saucedo, J.A., Hurtado-Hernandez, M., Suarez-Valdes, R.: Digital twins in


supply chain management: a brief literature review. In International Conference on
Intelligent Computing & Optimization, pp. 653–661. Springer, Cham, October 2019
16. Intelligent Computing and Optimization, Proceedings of the 2nd International Conference
on Intelligent Computing and Optimization 2019 (ICO 2019). Springer International
Publishing. ISBN 978-3-030-33585-4
17. de Jonquières, A.: Designing the Supply Chain for a COVID-19 Vaccine (Doctoral
dissertation, London Business School) (2020)
18. Park, C.Y., Kim, K., Roth, S.: Global shortage of personal protective equipment amid
COVID-19: supply chains, bottlenecks, and policy implications (2020)
19. Koirala, A., Joo, Y.J., Khatami, A., Chiu, C., Britton, P.N.: Vaccines for COVID-19: the
current state of play. Paediatr. Respir. Rev. 35, 43–49 (2020)
20. Walsh, E.E., Frenck, R., Falsey, A.R., Kitchin, N., Absalon, J., Gurtman, A., Swanson, K.
A.: RNA-based COVID-19 vaccine BNT162b2 selected for a pivotal efficacy study.
Medrxiv (2020)
21. Lin, Q., Zhao, Q., Lev, B.: Cold chain transportation decision in the vaccine supply chain.
Eur. J. Oper. Res. 283(1), 182–195 (2020)
22. Centers for Disease Control and Prevention: Summary Strategies to Optimize the Supply of
PPE During Shortages. In Centers for Disease Control and Prevention (US). Centers for
Disease Control and Prevention (US), July 2020
23. Campos, R.K., Jin, J., Rafael, G. H., Zhao, M., Liao, L., Simmons, G., Weaver, S.C., Cui,
Y.: Decontamination of SARS-CoV-2 and other RNA viruses from N95 level meltblown
polypropylene fabric using heat under different humidities. ACS Nano 14(10), 14017–14025
(2020)
Ontological Aspects of Developing Robust
Control Systems for Technological Objects

Nataliia Lutskaya1(✉), Lidiia Vlasenko1, Nataliia Zaiets1, and Volodimir Shtepa2

1 Department of Automation and Computer Technologies of Management Systems,
National University of Food Technologies, Kiev, Ukraine
lutskanm2017@gmail.com
2 Department of Higher Mathematics and Information Technology,
Polessky State University, Pinsk, Belarus

Abstract. The ontological aspects of designing efficient control systems for
technological objects operating in an uncertain environment are demonstrated in
this research work. Design and monitoring of the control system have been
outlined as the two basic tasks on the basis of the covered subject and problem
domains of the research as well as the life cycle of the system. The subject
domain, which consists of the ontology of objects and the ontology of processes,
has been described using the system-ontological approach. The peculiarity of the
developed ontological system lies in its knowledge of the uncertainty of
technological objects and the conditions of their operation. The ontological
system, which underlies the further development of an intelligent decision
support system, has been formed alongside the ontology of objectives. The
advantage of ontology-based design lies in the scientific novelty of the
knowledge presentation model and its practical relevance for designers,
developers, and researchers of control systems for technological objects
operating in uncertain environments.

Keywords: Ontological system · Technological object · Subject domain ·
Control system

1 Introduction

Nowadays, the synthesis of an efficient control system (CS) for a technological object
(TO) is still a creative process which depends completely on the personal preferences
of the CS designer. This is primarily explained by the decisive role of the designer's
initial subject domain knowledge and the empirical knowledge obtained on its basis.
Although the stages of developing an efficient control system for a technological
object were formalized a long time ago [1, 2], they need to be rethought, given the
current diversity of methods and approaches. In addition, technological objects
operating in an uncertain environment require a generalized methodology based on the
life cycle (LC) of the control system for the technological

object. A robust controller whose structure and/or parameters are calculated in
accordance with the H2/H∞ criterion [3, 4] becomes the control device of such a CS.
However, the changing operating conditions and the evolution of the TO lead to a
change in the uncertainty environment within which the robust controller was
engineered. Consequently, the efficiency of the system as a whole decreases, and
reconfiguration of the control system for the technological object becomes a necessity.
Thus, using the data and the subject domain knowledge in the final system requires
their formalization, which can be implemented by means of a decision support
subsystem (DSS).

2 Design of the Ontological System

2.1 Concept of an Ontological System


This research work uses an approach based on the following crucial system-ontological
techniques: abstraction and instantiation, composition and decomposition, structuring
and classification [5].
The ontological approach to the design of the CS, including its software part, is
understood as a multidisciplinary task of forming, presenting, processing, and analyzing
the knowledge and data whose models describe the structure and interrelation of the
objects of the subject domain (SD) [6]. Unlike the empirical approach, this approach
implies a clear systematization of the SD knowledge, including interdisciplinary
knowledge [7–9].
Figure 1 shows the components of the subject domain of the research, namely the
control system for a technological object operating in an uncertain environment. The
problem domain presented in Fig. 2 forms the objectives which are described in the
objective ontology.

Fig. 1. Components of the ontology of the subject domain of design and monitoring of the
efficient control systems operating in uncertain environment. The subject domain (the CS for a
TO in an uncertain environment) contributes its object set and process set, from which the
ontology of objects and the ontology of processes are built; the problem domain (Fig. 2) defines
the objective set (design and monitoring of the CS and compatible objectives), described by the
ontology of objectives. Together, the three ontologies form the ontological system.
Ontology forms the framework of the knowledge base by describing the basic
concepts of the subject domain and serves as the basis for the development of
intelligent decision support systems. Today, ontologies can use different models of
knowledge presentation, for instance semantic or frame networks. Recently, a
descriptive logic subset and the OWL2 language dialects [10–12] have become a
popular formal description for developing subject domain ontologies. The means of
formal description and ontology development allow us to store, edit, verify, transmit,
and integrate the developed ontology in different formats.

Fig. 2. The problem domain of design and monitoring of the efficient control systems operating
in uncertain environment. The control system (a technological object exhibiting nonlinearity,
uncertainty, and evolution, together with its control device) is subject to the technological
regulations (the minimum and maximum values within which the point of extremum lies), the
uncertainty of the external environment, and the criteria for operation (diffused by production
levels and time). The consequences are a decrease in the quality of the product, loss of rigidity,
an increase in energy consumption, and readjustment of the control system.

2.2 Engineering Features


Two generalized objectives have been identified while distinguishing the problem
domain of the research (Figs. 1, 2): designing the control system for a technological
object in an uncertain environment, and monitoring the control system for a
technological object. These objectives form the basis of the life cycle of the CS for a TO.
The next stage includes the development of an ontological system for better
understanding of the subject domain, stating and capturing the general knowledge and
its structural relationships, as well as clearly conceptualizing the DSS software which
describes the semantics of the data.
The ontological approach is based on the concept of the ontological system (OnS),
an ontological tool for supporting the applied problems, which is described by means
of the tuple:

OnS ¼ \ OSD O; OP ; OT [ ð1Þ

The subject domain SD consists of two parts: the ontology of the objects O and the
ontology of the processes O_P. The former defines the static terms, definitions, and
relationships of the subject domain, while the latter defines the dynamics (events and
duration) of the SD. The ontology of the processes O_P can be constructed in
accordance with the operation of the objects of the subject domain, or in accordance
with the objectives of the problem domain. This research work proposes to develop
the ontology of processes in accordance with the life cycle (LC) of the control system
(CS) for the TO and the objectives of the problem domain O_T.
Let us consider the CS for a TO in terms of its LC. Like any complex system, the
CS for a TO passes through at least three stages: origination, operation, and expiry.
The origination is connected with the CS design process and is included in the design
and operation LC of the industrial control system (ICS) of the technological process
(TP) and its parts, including the automation equipment (AE). On the other hand, the
LC of the CS is associated with the operation of the TO, which also develops on the
principle of evolution. The incompleteness of the LC prevents an optimal decision on
designing the efficient CS for a TO, so it must be taken into account when designing
efficient CSs for TOs.
Let us describe the LC of the CS for a TO with the following tuple:

C_CS = ⟨P(LC_CS), {S}, R, T⟩ (2)

where P(LC_CS) stands for the aim, requirement, or assignment of the CS; {S} is the
set of stages of the life cycle of the CS; R is the result of the operation of the CS; and
T is the life cycle time. This dependence reflects the orientation of the CS for a TO
both towards the aim (assignment) of the system and towards the end result of its
operation. Thus, if the system loses its assignment or stops meeting the requirements,
it goes to the final stage of the life cycle, its expiry.
The LC of the efficient CS for a TO is divided into the following stages (Fig. 3).
The aim and criteria of control are selected and the limitations of the system are
determined at the stage of defining the requirements for the system. The input
documents are the technological regulations and the technical design specifications.
The idea of creating the system is substantiated, and the input and output variables
of the CS are selected, at the stage of formulating the concept of the CS. The end result
of this stage is a set of recommendations for creating the CS for the separate parts of
the TO, with specified characteristics of each system and the sources and resource
limitations for its development and operation.
At the third and subsequent stages of the LC, the task of designing the CS for a TO
is solved using a number of traditional activities: the study of the TO; identification of
a mathematical model of the TO; design of the CS; and CS modeling. However, this
research work proposes to choose the CS from a variety of alternative solutions
comprising different CS structures, the final choice being the decision of the designer.
In addition, the design process should be model-oriented,
where the system model is used to design, simulate, verify, and validate the
programmed code (similar to the DO-178B standard for embedded systems).

Fig. 3. Life cycle of the CS for a TO. The stages proceed from defining the requirements for the
CS (driven by requirements from the upper levels and restrictions of the AE), through
formulating the concept of the CS, developing the MM of the TO, developing the structure and
parameters of the CS, simulating the CS for the TO, and implementing and validating the CS, to
monitoring the CS and, finally, eliminating the CS. Feedback loops connect simulation back to
the MM (verification), implementation back to the concept (reformulation), and monitoring back
to the requirements (redesign).

Unlike the previous stage, where verification was only performed on the readily
built process models, the assumptions, calculations, and conclusions made at the
previous stages are verified at the implementation stage. That is, the reliability of the
actual costs of the selected alternative solution is assessed.
At the stage of operation and monitoring, the implemented CS for a TO is subjected
to a final evaluation of theoretical and manufacturing research. The peculiarity of this
stage lies in monitoring the system and detecting its “aging”. The “aging” of the CS
manifests as lower efficiency of the CS for a TO and can eventually lead to a system
failure, the divergence of the time response being one manifestation of this
phenomenon for control systems. The evolution of the TO or its environment might be
the reason for the decrease in efficiency of such systems. A separate issue here is the
failure of individual elements of the AE [13].
Elimination of the CS is directly related to the termination of the TO operation,
when the technological process is physically or morally outdated and its restoration is
futile for technical and economic reasons. The return to the previous stages of the LC is
predicated on increasing the flexibility and adaptability of the created system.
Thus, for the efficient design of the CS for a TO, the ontological system is to be
considered in terms of ontological objectives which reflect the LC of the CS.
The proposed ontological system involves conducting ontological studies in order
to identify all the factors which affect the structure of the control system for the TO
operating in an uncertain environment. The model also takes into account the LC of the
CS, as well as the peculiarities of developing robust CSs for TOs, the alternative
solutions for which will form the basis of the CS structures.
2.3 Model of the Ontological System


In accordance with the previous section and Fig. 1, an ontological system for the
effective functioning of the CS for a TO operating in an uncertain environment has
been developed. The ontological system consists of three ontologies, interlinked by
the relevant objectives arising from the problem domain of the research.
Figure 4 shows a fragment of the ontology of the subject domain of the research,
which is described by the following tuple:

O = ⟨X, R, F⟩ (3)

where X = {X_i}, i = 1..n, is a finite non-empty set of concepts (subject domain
concepts); R = {r_i}, i = 1..m, is a finite set of semantically significant relationships
between the concepts; and F: X × R is a finite set of interpretation functions
preassigned on the concepts and relationships.
The basic concepts of the subject domain X have been determined as follows: a
technological object, standards, a life cycle, an individual, and the controlling part of
the ICS, which consists of software and hardware facilities. These subject domain
concepts are justified by their direct impact on the subject domain and the research
objectives.
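As a rough computational reading of tuple (3), the sketch below represents concepts, typed relationships, and relationship instances with plain Python structures; the concept and relation names are taken from the figure, while the data model itself is an illustrative assumption (a real implementation would use an OWL2 toolchain, as noted above).

```python
# Sketch of the O = <X, R, F> model from (3): X is the concept set,
# R the set of relationship types, and F the instances linking them.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Relation:
    code: int          # numbered type, e.g. 2 = kind-of relation
    name: str

@dataclass
class Ontology:
    concepts: set = field(default_factory=set)    # X
    relations: set = field(default_factory=set)   # R
    links: list = field(default_factory=list)     # F: X x R

    def relate(self, src, relation, dst):
        self.concepts.update({src, dst})
        self.relations.add(relation)
        self.links.append((src, relation, dst))

KIND_OF = Relation(2, "kind-of relation")
ATTRIBUTE = Relation(5, "attribute relationship")

onto = Ontology()
onto.relate("Industrial control system", KIND_OF, "Automation")
onto.relate("Technological object", ATTRIBUTE, "Type of process")
print(len(onto.concepts), len(onto.links))   # 4 2
```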
The semantically significant relationships between the concepts in the set R are
shown in the ontology O (Fig. 4). The interpretation functions on the ontologies are
shown by means of the corresponding arrows. In addition, the model is divided into
reference levels for better structuring of the categories and for linking the ontologies.
The relationships to other schemes in the single ontological system are indicated with
the help of ovals (links) and rectangles (acceptance) carrying numbers that follow the
pattern:

Ontology(SchemeSheet).Level.RelationshipNumber

For example, P1.L1.00 corresponds to the reference from the ontology of processes
(O stands for the ontology of objects, T for the ontology of objectives (tasks), and P
for the ontology of processes), sheet 1, to the concept located at the first level with
relationship number 00. The last number is unique across the entire ontological
system, so it can also be used to trace the relationships between the ontological
subsystems.
Fig. 4. Ontology of objects of the subject domain. The top level (L0) is Automation, refined
through levels L1–L6 covering the industrial control system; standards (metrology, national, IT,
systems engineering, and food quality standards, including GOST 34.601-90, GOST 34.201-89,
ISO 9001, CRISP-DM, ISO/IEC 15288:2005, and DSTU 2503-94); the individual (designer,
operator); the technological object (type of process, nature of the functioning); the life cycle of
the ICS; and the control device (software, hardware, information support, mathematical support,
field and control AE, custom software, terms of reference, and the structure and parameters of
the regulating part). The numbered relationship types are: 0 categorical relationship; 1 integer
part; 2 kind-of relation; 3 class–subclass (set–subset); 4 set–element; 5 attribute relationship;
6 equality relationship; 7 initialization relationship; 8 relationship of the process behavior;
9 end of the process relationship.

The ontology of objectives consists of a general objective which is divided into
tasks, sub-tasks, and methods of their solution, as well as the problem solver, which is
represented in the ontology of processes. Two tasks have been identified according to
the LC of the CS: designing the CS, and monitoring the AE and the CS. These, in turn,
are divided into smaller parts. The objective of monitoring the efficiency of the CS is
part of the objective of monitoring and fault identification of the automation
equipment (AE). A fragment of the ontology of objectives is shown in Fig. 5. The main
methods for solving the task of designing the CS for a TO operating in an uncertain
environment are: methods for identifying the mathematical model (MM) of the TO,
including identification of the uncertainties of the TO; methods of optimal and robust
synthesis; and robustness testing methods.
and robust synthesis; robustness testing methods.
The ontology of processes (a fragment is presented in Fig. 6) has been built on a tuple similar to (3) in accordance with the selected objectives of the problem domain. The processes at the lower L5 level correspond to the sequence of activities and operations which contribute to the solution of the corresponding sub-tasks of the ontological research.

(Figure 5 is a diagram and is not reproduced here. It decomposes the generalized objective into Task 1, designing an effective CS, and Task 2, monitoring of the AE and CS, together with the methods for solving them and the tasks solver; Task 1.2.2 is further divided into Task 1.2.2.1, preliminary analysis of the TO, Task 1.2.2.2, identification of the MM of the TO, and Task 1.2.2.3, synthesis of the CS of the TO.)

Fig. 5. Ontology of objectives.

3 Results and Discussion

The task of synthesizing the CS is divided into three subtasks (Fig. 5, Tasks 1.2.2.1–1.2.2.3): preliminary analysis of the TO, identification of the mathematical model of the TO, and synthesis of the control system. Each component of the task has corresponding processes (Fig. 6) that must be performed for its achievement. For example, when using a basic TO model, it is necessary to perform six actions that lead to obtaining the MM of the TO. In contrast to obtaining the MM of the TO without considering conditions of uncertainty, Action 6, determining the region of uncertainty, was introduced; it can be performed by the procedures described in [14]. The key feature of the proposed approach to the synthesis of the regulatory part of the CS is the testing of alternative structures of the CS and the choice of an effective one based on the theory of decision making.
Using the developed ontological model, it is possible to build a decision support system based on the acquired knowledge of the problem area. This approach to the development of the CS reduces the design time owing to the structured and automated procedures that are embedded in the ontological model. When synthesizing the CS, modern methods of robust control of the TO are taken into account; they are formalized in the ontology of tasks, and their main interrelationships with traditional methods are indicated.

(Figure 6 is a diagram and is not reproduced here. It shows the process of designing the continuous part: Stage 1, preliminary analysis; Stage 2, identification of the MM of the TO; Stage 3, development of the CS, including the definition of control criteria and restrictions, the definition of the set of CS structures, the evaluation and sorting of alternative CSs by criteria, the determination of CS parameters and modeling, and the selection, implementation and verification of the CS.)

Fig. 6. Ontology of processes (fragment).

4 Conclusion

The analysis of the subject and problem domains of the synthesis of the CS for a TO in an uncertain environment has been carried out. On its basis, a system-ontological approach to the efficient design and monitoring of the CS for a TO operating in an uncertain environment has been proposed. The ontological system consists of three ontologies: objects, objectives, and processes. The conducted research revealed the relationships between the subject domain objects, the objectives, the methods for solving them, and the ontology of processes. The advantage of the given approach is the scientific novelty of the knowledge presentation model, as well as its practical importance both for researchers and developers of the CS for a TO in an uncertain environment and for designers of the CS for a TO.
The ontological system forms the basis for the further development of an intelligent decision support system for the efficient operation of the CS for a TO operating in an uncertain environment.
The authors declare no conflict of interest. All authors contributed to the design and implementation of the research, to the analysis of the results and to the writing of the manuscript.

References
1. McMillan, G.K., Considine, D.M.: Process/Industrial Instruments and Controls Handbook,
5th edn. McGraw-Hill Professional, New York (1999)
2. Levine, W.S. (ed.): The Control Handbook: Control System Applications, 2nd edn. CRC Press (2011)
3. Lutskaya, N., Zaiets, N., Vlasenko, L., Shtepa, V.: Effective robust optimal control system
for a lamellar pasteurization-cooling unit under the conditions of intense external
perturbations. Ukrainian Food J. 7(3), 511–521 (2018)
4. Korobiichuk, I., Lutskaya, N., Ladanyuk, A., et al.: Synthesis of optimal robust regulator for food processing facilities. In: Automation 2017: Innovations in Automation, Robotics and Measurement Techniques, Advances in Intelligent Systems and Computing, vol. 550, pp. 58–66. Springer International Publishing (2017)
5. Takahara, Y., Mesarovic, M.: Organization Structure: Cybernetic Systems Foundation. Springer Science & Business Media (2012)
6. Fernandez-Lopez, M., Gomez-Perez, A.: Overview and analysis of methodologies for
building ontologies. Knowl. Eng. Rev. 17(02), 129–156 (2003)
7. Baader, F., Calvanese, D., McGuinness, D.L., et al.: The Description Logic Handbook: Theory, Implementation, Applications. Cambridge University Press (2003)
8. Palagin, A., Petrenko, N.: System-ontological analysis of the subject area. Control Syst.
Mach. 4, 3–14 (2009)
9. Smith, B.: Blackwell guide to the philosophy of computing and information: Chapter
ontology. Blackwell 39, 61–64 (2003)
10. OWL 2 Web Ontology Language Document Overview, 2nd edn., W3C. 11 December 2012.
11. OWL Web Ontology Language Guide. W3C Recommendation, 10 February 2004. https://
www.w3.org/TR/owl-guide/
12. Protege Homepage. https://protege.stanford.edu/
13. Zaiets, N., Vlasenko, L., Lutska, N., Usenko, S.: System modeling for construction of the diagnostic subsystem of the integrated automated control system for the technological complex of food industries. In: ICMRE 2019, Rome, Italy, pp. 93–98 (2019)
14. Lutska, N.M., Ladanyuk, A.P., Savchenko, T.V.: Identification of the mathematical models
of the technological objects for robust control systems. Radio Electron. Comput. Sci. Control
3, 163–172 (2019)
15. Voropai, N.I.: Multi-criteria decision making problems in hierarchical technology of electric
power system expansion planning. In: Intelligent Computing & Optimization. ICO 2018.
Advances in Intelligent Systems and Computing, vol. 866, pp. 362–368. Springer (2019)
16. Alhendawi, K.M., Al-Janabi, A.A., Badwan, J.: Predicting the quality of MIS characteristics
and end-users’ perceptions using artificial intelligence tools: expert systems and neural
network. In: Intelligent Computing and Optimization. ICO 2019. Advances in Intelligent
Systems and Computing, vol. 1072. pp. 18–30. Springer (2020)
A New Initial Basis for Solving the Blending
Problem Without Using Artificial Variables

Chinchet Boonmalert, Aua-aree Boonperm(&),


and Wutiphol Sintunavarat

Department of Mathematics and Statistics, Faculty of Science and Technology,


Thammasat University, Pathum Thani 12120, Thailand
chinchet.boon@dome.tu.ac.th, {aua-aree,wutiphol}
@mathstat.sci.tu.ac.th

Abstract. The blending problem is one of the production problems that can be formulated as a linear programming model and solved by the simplex method, which begins with choosing an initial set of basic variables. In the blending problem, it is not easy in practice to choose basic variables since the origin point is not a feasible point. Therefore, artificial variables are added in order to obtain the origin point as the initial basic feasible solution. This addition increases the size of the problem. In this paper, we present a new initial basis without adding artificial variables. The first step of the proposed technique is to rewrite the blending problem. Then, it is divided into sub-problems depending on the number of products. The variable associated with the maximum profit, together with all slack variables of each sub-problem, is selected to form the set of basic variables. This selection guarantees that a dual feasible solution is obtained. Therefore, artificial variables are not required.

Keywords: Linear programming model · Dual simplex method · Blending problem · Artificial-free technique

1 Introduction

Over the past decade, the blending problem has been one of the well-known optimization problems related to the production process of a large number of raw materials to obtain many types of products. It was first mentioned in 1952 by Charnes et al. [1], who proposed a linear programming problem to find a mix of fuels and chemicals in the airline business. There is a lot of research involving the blending problem, such as the blending of tea, milk, and coal (see more details in [2–4]).
The blending problem can be formulated as a linear programming model, and it can be solved by the simplex method. To use this method, the transformation from the canonical form to the standard form is performed. For converting it to the standard form, slack and surplus variables are added. Then, a basic variable set must be chosen, considering both the invertibility of the associated basis matrix and the feasibility of the solution. For a large problem, it is hard to choose a basic matrix that gives a feasible solution. Thus, a main topic of research for improving the simplex method is how to choose the initial basis or an initial basic feasible solution.


The well-known methods that are used to find a basic feasible solution are the two-phase and big-M methods. These methods start by adding artificial variables in order to choose the initial basis. However, adding artificial variables to the problem not only makes the problem size larger but also increases the number of iterations. Consequently, finding a basic feasible solution without adding artificial variables, called an artificial-free technique, has been widely investigated.
In 1997, Arsham [5] proposed a method with two phases, in which the first phase starts with an empty set of basic variables. Then, basic variables are chosen one by one to enter the basic variable set until the set is full. The second phase is the original simplex method, which finds the optimal solution starting from the basic feasible solution of the first phase. In 2015, Gao [6] gave a counterexample in which Arsham's algorithm reports a feasible solution for an infeasible problem. Nowadays, various artificial-free techniques have been proposed [7–10].
In this research, we focus on solving the blending problem by the simplex method. However, since the origin point is not a feasible solution, artificial variables would normally be required, which expands the size of the problem. Therefore, to avoid the step of finding a basic feasible solution by adding artificial variables, we propose a technique for choosing an initial basis without using artificial variables. First, the original form of the blending problem is rewritten and divided into a master problem and sub-problems. Then, all sub-problems are considered to find a basic feasible solution. The main contribution of this research is not only the division of the original problem but also an algorithm for choosing the initial basis of each sub-problem that avoids adding artificial variables.
This paper is organized as follows: Sect. 2 is a brief review of the blending problem. Section 3 describes the details of the proposed method. Section 4 gives a numerical example showing the usage of the proposed method. The final section ends with the conclusion.

2 Blending Problem

The blending problem is a production problem that blends m types of raw materials into n types of products under the limits of each raw material, the demand for each product, and the ratios for mixing the materials. Let M and N be the set of raw materials and the set of products, respectively. Parameters a̲_ij and ā_ij denote the smallest and the largest proportion of raw material i that is allowed in product j, while p_j and c_i are the selling price of product j and the cost of raw material i, respectively. Also, s_i represents the available amount of raw material i, and d_j represents the demand for product j. In this problem, x_ij is the decision variable that represents the amount of raw material i used to produce product j. The objective of the blending problem is to maximize the profit. Thus, the linear programming model of the blending problem can be written as follows:
max  Σ_{j∈N} Σ_{i∈M} p_j x_ij − Σ_{i∈M} Σ_{j∈N} c_i x_ij    (1)
s.t. Σ_{j∈N} x_ij ≤ s_i    ∀i ∈ M    (2)
     x_ij ≥ a̲_ij Σ_{k∈M} x_kj    ∀i ∈ M, ∀j ∈ N    (3)
     x_ij ≤ ā_ij Σ_{k∈M} x_kj    ∀i ∈ M, ∀j ∈ N    (4)
     Σ_{i∈M} x_ij = d_j    ∀j ∈ N    (5)
     x_ij ≥ 0    ∀i ∈ M, ∀j ∈ N    (6)

In the above model, the objective function (1) maximizes the profit. Constraint (2) states that the amount of each raw material used must not exceed the available supply. Constraints (3)–(4) force the proportion of raw material i in product j to lie in the interval [a̲_ij, ā_ij]. Constraint (5) forces the total amount of each product to meet its demand. Finally, constraint (6) specifies the domain of the decision variables.
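To illustrate the model, the sketch below assembles (1)–(6) for a small instance (the data of Example 1 in Sect. 4) and solves it with scipy's linprog; the use of scipy and the flattening of x_ij into a vector are our own assumptions for checking the formulation, not the method proposed in this paper.

import numpy as np
from scipy.optimize import linprog

# Blending model (1)-(6) with the data of Example 1 (m = 3, n = 2).
# linprog minimizes, so the profit objective is negated.
p = np.array([71.0, 83.0])               # selling prices p_j
c = np.array([8.0, 3.0, 6.0])            # raw material costs c_i
s = [1684.0, 1793.0, 1348.0]             # supplies s_i
d = [325.0, 410.0]                       # demands d_j
a_lo = np.array([[0.1315, 0.1153], [0.3710, 0.2425], [0.3024, 0.3064]])
a_up = np.array([[0.9231, 0.5343], [0.9510, 0.6090], [0.7979, 0.9347]])
m, n = 3, 2
idx = lambda i, j: i * n + j             # position of x_ij in the vector

cost = np.array([-(p[j] - c[i]) for i in range(m) for j in range(n)])
A_ub, b_ub = [], []
for i in range(m):                       # supply constraints (2)
    row = np.zeros(m * n)
    row[[idx(i, j) for j in range(n)]] = 1.0
    A_ub.append(row); b_ub.append(s[i])
for i in range(m):
    for j in range(n):                   # ratio constraints (3)-(4)
        lo, up = np.zeros(m * n), np.zeros(m * n)
        for k in range(m):
            lo[idx(k, j)] += a_lo[i, j]
            up[idx(k, j)] -= a_up[i, j]
        lo[idx(i, j)] -= 1.0; up[idx(i, j)] += 1.0
        A_ub += [lo, up]; b_ub += [0.0, 0.0]
A_eq = np.zeros((n, m * n))              # demand constraints (5)
for j in range(n):
    A_eq[j, [idx(i, j) for i in range(m)]] = 1.0

res = linprog(cost, A_ub=np.array(A_ub), b_ub=b_ub, A_eq=A_eq, b_eq=d)
print(res.status, -res.fun)              # 0 and the maximal profit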

3 The Proposed Method

Before presenting the process of the proposed method, we rewrite the general form of the blending problem (1)–(6) in the following form:
max  Σ_{j∈N} Σ_{i∈M} p_j x_ij − Σ_{i∈M} Σ_{j∈N} c_i x_ij    (7)
s.t. Σ_{j∈N} x_ij ≤ s_i    ∀i ∈ M    (8)
     a̲_ij Σ_{k∈M} x_kj − x_ij ≤ 0    ∀i ∈ M, ∀j ∈ N    (9)
     x_ij − ā_ij Σ_{k∈M} x_kj ≤ 0    ∀i ∈ M, ∀j ∈ N    (10)
     Σ_{i∈M} x_ij = d_j    ∀j ∈ N    (11)
     x_ij ≥ 0    ∀i ∈ M, ∀j ∈ N    (12)

In this paper, we reconstitute the above general model into the following model:
max  Σ_{j=1}^{n} c_j^T x̂^j    (13)
s.t. A_S x ≤ s    (14)
     (A̲_j − I_m) x̂^j ≤ 0    ∀j    (15)
     (I_m − Ā_j) x̂^j ≤ 0    ∀j    (16)
     1_m^T x̂^j = d_j    ∀j    (17)
     x̂^j ≥ 0    ∀j    (18)

where x̂^j = [x_1j x_2j … x_mj]^T for all j ∈ N,
A_S is the coefficient matrix of inequality (8),
s = [s_1 s_2 … s_m]^T,
I_m is the identity matrix of dimension m × m,
1_m = [1]_{m×1},
A̲_j = [a̲_1j a̲_2j … a̲_mj]^T 1_m^T for all j ∈ N,
Ā_j = [ā_1j ā_2j … ā_mj]^T 1_m^T for all j ∈ N,
and c_j = [c_ij]_{m×1} such that c_ij = p_j − c_i for all i ∈ M and j ∈ N.
Since the set of all decision variables can be partitioned depending on the number
of product types, the above problem can be divided into the master problem

max  c^T x
s.t. A_S x ≤ s    (19)
     x ≥ 0

and n sub-problems, such that each Sub-problem j ∈ N is as follows:

max  c_j^T x̂^j    (20)
s.t. (A̲_j − I_m) x̂^j ≤ 0    (21)
     (I_m − Ā_j) x̂^j ≤ 0    (22)
     1_m^T x̂^j = d_j    (23)
     x̂^j ≥ 0    (24)
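Under the reconstruction A̲_j = [a̲_1j … a̲_mj]^T 1_m^T and Ā_j = [ā_1j … ā_mj]^T 1_m^T used above, the data of Sub-problem j can be assembled as in the following sketch; the function name and the NumPy representation are our own illustrative choices.

import numpy as np

# Assemble the data of Sub-problem j, Eqs. (20)-(24), from column j of
# the ratio bounds; the rank-one matrices follow A_j = a_j 1_m^T.
def subproblem_data(a_lo_j, a_up_j, p_j, c):
    m = len(c)
    A_lower = np.outer(a_lo_j, np.ones(m))        # lower-ratio matrix
    A_upper = np.outer(a_up_j, np.ones(m))        # upper-ratio matrix
    c_j = p_j - np.asarray(c, dtype=float)        # c_ij = p_j - c_i
    A_ub = np.vstack([A_lower - np.eye(m),        # constraints (21)
                      np.eye(m) - A_upper])       # constraints (22)
    return c_j, A_ub

# usage with the data of Example 1 (Sect. 4), product j = 1:
c_j, A_ub = subproblem_data([0.1315, 0.3710, 0.3024],
                            [0.9231, 0.9510, 0.7979], 71.0, [8.0, 3.0, 6.0])
print(c_j)             # [63. 68. 65.]
print(A_ub.round(4))   # the rows of constraints (21)-(22)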

To choose the initial basis for the blending problem, we consider all sub-problems. First, we transform Sub-problem j to the standard form, so 2m slack variables are added to this sub-problem. Let A^j be the coefficient matrix of constraints (21)–(22), and let Â^j and b^j be the coefficient matrix and the column vector of the right-hand side of constraints (21)–(23), respectively.

For choosing basic variables, all slack variables are chosen first. However, they are not enough to construct the set of basic variables. Hence, to avoid adding an artificial variable, one of the decision variables is selected. We now present the new technique for choosing this remaining basic variable. First, for each j ∈ N, we let

l_j = arg max_i {c_ij}.    (25)

Then, the initial basic variables of Sub-problem j are constructed associated with x_B = [x_{l_j j} s_1 s_2 … s_{2m}]^T. Let Â^j_{·,l_j} be the l_j-th column of Â^j; the basis B_j^l is formed by this column together with the 2m slack columns, and its inverse and the non-basic matrix N^l follow from it. Since (B_j^l)^{-1} b^j contains negative components, the primal problem is not feasible. However, all reduced costs are nonnegative, so the dual problem of Sub-problem j is feasible. The initial tableau can be constructed as in Table 1.
Since (B_j^l)^{-1} b^j ≱ 0 and c_{B_j^l}^T (B_j^l)^{-1} N^l − c_{N^l}^T ≥ 0, the dual simplex method can start without using artificial variables for solving Sub-problem j. After the optimal solutions to all sub-problems are found, if they satisfy the master problem, then together they form the optimal solution to the original blending problem.
Table 1. Initial tableau of the simplex method for Sub-problem j of the blending problem.

              z    x_{B_j^l}     x_{N^l}                                     RHS
z             1    0^T           c_{B_j^l}^T (B_j^l)^{-1} N^l − c_{N^l}^T    c_{B_j^l}^T (B_j^l)^{-1} b^j
x_{B_j^l}     0    I_{2m+1}      (B_j^l)^{-1} N^l                            (B_j^l)^{-1} b^j
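The basis choice of Eq. (25) and the quantities of Table 1 can be checked numerically as in the sketch below; the standard-form assembly and the variable ordering are our own assumptions for illustration.

import numpy as np

# Build the standard form of Sub-problem j (m decision variables plus
# 2m slacks), take x_{l j} and all slacks as the initial basis, and
# report the quantities of Table 1.
def initial_basis_check(c_j, A_ub, d_j):
    m = len(c_j)
    A = np.vstack([np.hstack([A_ub, np.eye(2 * m)]),           # (21)-(22)
                   np.hstack([np.ones(m), np.zeros(2 * m)])])  # (23)
    b = np.append(np.zeros(2 * m), d_j)
    cost = np.append(c_j, np.zeros(2 * m))
    l = int(np.argmax(c_j))                        # Eq. (25)
    basic = [l] + list(range(m, 3 * m))            # x_{l j} and all slacks
    nonbasic = [i for i in range(m) if i != l]
    B, N = A[:, basic], A[:, nonbasic]
    xB = np.linalg.solve(B, b)                     # B^{-1} b, may be < 0
    y = np.linalg.solve(B.T, cost[basic])          # simplex multipliers
    reduced = y @ N - cost[nonbasic]               # c_B^T B^{-1} N - c_N^T
    return xB, reduced

# usage with Sub-problem 1 of Example 1 (Sect. 4):
A1 = np.array([[-0.8685, 0.1315, 0.1315], [0.3710, -0.6290, 0.3710],
               [0.3024, 0.3024, -0.6976], [0.0769, -0.9231, -0.9231],
               [-0.9510, 0.0490, -0.9510], [-0.7979, -0.7979, 0.2021]])
xB, reduced = initial_basis_check(np.array([63.0, 68.0, 65.0]), A1, 325.0)
print(xB)        # some entries are negative: the primal is infeasible
print(reduced)   # all entries nonnegative: the dual simplex can start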

4 An Illustrative Example

In this section, we give an illustrative example showing the numerical results obtained
from our algorithm.
Example 1. Consider the following blending problem:

max  71(x11 + x21 + x31) + 83(x12 + x22 + x32) − 8(x11 + x12) − 3(x21 + x22) − 6(x31 + x32)

s.t. x11 + x12 ≤ 1684
     x21 + x22 ≤ 1793
     x31 + x32 ≤ 1348

     x11 ≥ 0.1315(x11 + x21 + x31),   x11 ≤ 0.9231(x11 + x21 + x31)
     x21 ≥ 0.3710(x11 + x21 + x31),   x21 ≤ 0.9510(x11 + x21 + x31)
     x31 ≥ 0.3024(x11 + x21 + x31),   x31 ≤ 0.7979(x11 + x21 + x31)
     x12 ≥ 0.1153(x12 + x22 + x32),   x12 ≤ 0.5343(x12 + x22 + x32)
     x22 ≥ 0.2425(x12 + x22 + x32),   x22 ≤ 0.6090(x12 + x22 + x32)
     x32 ≥ 0.3064(x12 + x22 + x32),   x32 ≤ 0.9347(x12 + x22 + x32)

     x11 + x21 + x31 = 325
     x12 + x22 + x32 = 410
     x11, x12, x21, x22, x31, x32 ≥ 0

Thus, the above model can be written as follows:

max  63x11 + 75x12 + 68x21 + 80x22 + 65x31 + 77x32

s.t. x11 + x12 ≤ 1684
     x21 + x22 ≤ 1793
     x31 + x32 ≤ 1348
     x11 + x21 + x31 = 325
     x12 + x22 + x32 = 410

     −0.8685x11 + 0.1315x21 + 0.1315x31 ≤ 0
     0.3710x11 − 0.6290x21 + 0.3710x31 ≤ 0
     0.3024x11 + 0.3024x21 − 0.6976x31 ≤ 0
     −0.8847x12 + 0.1153x22 + 0.1153x32 ≤ 0
     0.2425x12 − 0.7575x22 + 0.2425x32 ≤ 0
     0.3064x12 + 0.3064x22 − 0.6936x32 ≤ 0

     0.0769x11 − 0.9231x21 − 0.9231x31 ≤ 0
     −0.9510x11 + 0.0490x21 − 0.9510x31 ≤ 0
     −0.7979x11 − 0.7979x21 + 0.2021x31 ≤ 0
     0.4657x12 − 0.5343x22 − 0.5343x32 ≤ 0
     −0.6090x12 + 0.3910x22 − 0.6090x32 ≤ 0
     −0.9347x12 − 0.9347x22 + 0.0653x32 ≤ 0
     x11, x12, x21, x22, x31, x32 ≥ 0

Then, the model can be divided into one master problem

max  63x11 + 75x12 + 68x21 + 80x22 + 65x31 + 77x32

s.t. x11 + x12 ≤ 1684
     x21 + x22 ≤ 1793
     x31 + x32 ≤ 1348
     x11, x12, x21, x22, x31, x32 ≥ 0

and two sub-problems are as follows:

Sub-problem 1:
max  63x11 + 68x21 + 65x31
s.t. −0.8685x11 + 0.1315x21 + 0.1315x31 ≤ 0
     0.3710x11 − 0.6290x21 + 0.3710x31 ≤ 0
     0.3024x11 + 0.3024x21 − 0.6976x31 ≤ 0
     0.0769x11 − 0.9231x21 − 0.9231x31 ≤ 0
     −0.9510x11 + 0.0490x21 − 0.9510x31 ≤ 0
     −0.7979x11 − 0.7979x21 + 0.2021x31 ≤ 0
     x11 + x21 + x31 = 325
     x11, x21, x31 ≥ 0

Sub-problem 2:
max  75x12 + 80x22 + 77x32
s.t. −0.8847x12 + 0.1153x22 + 0.1153x32 ≤ 0
     0.2425x12 − 0.7575x22 + 0.2425x32 ≤ 0
     0.3064x12 + 0.3064x22 − 0.6936x32 ≤ 0
     0.4657x12 − 0.5343x22 − 0.5343x32 ≤ 0
     −0.6090x12 + 0.3910x22 − 0.6090x32 ≤ 0
     −0.9347x12 − 0.9347x22 + 0.0653x32 ≤ 0
     x12 + x22 + x32 = 410
     x12, x22, x32 ≥ 0

In Sub-problem 1, since l = arg max_i {c_i1} = 2 with c_1 = [63 68 65]^T, the basis is constructed associated with x_B = [x21 s1 s2 s3 s4 s5 s6]^T. The initial tableaux of Sub-problems 1 and 2 are constructed in Tables 2 and 3.

Table 2. The initial tableau of Sub-problem 1.

          z   x21  s1  s2  s3  s4  s5  s6    x11  x31        RHS
z         1    0    0   0   0   0   0   0     -5   -3      22100
x21       0    1    0   0   0   0   0   0      1    1        325
s1        0    0    1   0   0   0   0   0     -1    0    -42.722
s2        0    0    0   1   0   0   0   0      1    1   204.4269
s3        0    0    0   0   1   0   0   0      0   -1   -98.2894
s4        0    0    0   0   0   1   0   0      1    0   299.9998
s5        0    0    0   0   0   0   1   0     -1   -1   -15.9281
s6        0    0    0   0   0   0   0   1      0    1   259.3024

After the initial tableaux are constructed, the dual simplex method is used for solving each sub-problem. The optimal solutions to Sub-problem 1 and Sub-problem 2 are found at x̂^1 = (42.7375, 183.9825, 98.28) and x̂^2 = (47.273, 237.103, 125.624). After checking these solutions against the master problem, they satisfy it; therefore, they form the optimal solution to the original problem. The number of iterations for each sub-problem is only two, while the simplex method uses 12 iterations. Moreover, 15 slack variables and 2 artificial variables have to be added before the simplex method starts. The comparison of the simplex method and the proposed method on this example is shown in Table 4.

Table 3. The initial tableau of Sub-problem 2.

          z   x22  s1  s2  s3  s4  s5  s6    x12  x32        RHS
z         1    0    0   0   0   0   0   0     -5   -3      32800
x22       0    1    0   0   0   0   0   0      1    1        410
s1        0    0    1   0   0   0   0   0     -1    0   -47.2711
s2        0    0    0   1   0   0   0   0      1    1   310.5865
s3        0    0    0   0   1   0   0   0      0   -1  -125.6288
s4        0    0    0   0   0   1   0   0      1    0   219.0499
s5        0    0    0   0   0   0   1   0     -1   -1  -160.3011
s6        0    0    0   0   0   0   0   1      0    1   383.2287

Table 4. Comparison between the simplex method and the proposed method.

                        Simplex method        Proposed method
                        Phase I   Phase II    Sub-problem 1   Sub-problem 2
Size of problem         17 × 23   17 × 21     7 × 10          7 × 10
Number of iterations    9         3           2               2
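As a quick cross-check of these figures, each sub-problem can also be fed to an off-the-shelf dual simplex solver; the sketch below does this for Sub-problem 1 with scipy's "highs-ds" backend, which is our own choice of tool and not part of the proposed method.

import numpy as np
from scipy.optimize import linprog

# Solve Sub-problem 1 of Example 1 with a dual simplex backend; the
# optimum reported above is approximately x^1 = (42.7375, 183.9825, 98.28).
A_ub = np.array([[-0.8685, 0.1315, 0.1315], [0.3710, -0.6290, 0.3710],
                 [0.3024, 0.3024, -0.6976], [0.0769, -0.9231, -0.9231],
                 [-0.9510, 0.0490, -0.9510], [-0.7979, -0.7979, 0.2021]])
res = linprog(c=[-63.0, -68.0, -65.0],      # negate to maximize
              A_ub=A_ub, b_ub=np.zeros(6),
              A_eq=[[1.0, 1.0, 1.0]], b_eq=[325.0],
              method="highs-ds")
print(res.x, -res.fun)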

5 Conclusion

In this paper, the original blending problem is rewritten and divided into a master problem and sub-problems, where the size of each sub-problem is smaller than that of the original problem. After the problem is divided, the sub-problems, whose number depends on the number of products, are considered. To find the optimal solution of each sub-problem, an initial basis is chosen to construct the initial simplex tableau. The proposed algorithm chooses the initial basic variables of each sub-problem by selecting the variable with the maximum coefficient in the objective function together with all slack variables. With this selection, a dual feasible solution is obtained, so the dual simplex method can be performed without using artificial variables. Since our algorithm does not use artificial variables, the process of finding an initial feasible solution is omitted, which is the main advantage of our algorithm.

Acknowledgment. This work was supported by Thammasat University Research Unit in Fixed
Points and Optimization.

References
1. Charnes, A., Cooper, W.W., Mellon, B.: Blending aviation gasolines - a study in programming independent activities in an integrated oil company. Econometrica 20(2), 135–259 (1952)
2. Fomeni, F.D.: A multi-objective optimization approach for the blending problem in the tea industry. Int. J. Prod. Econ. 205, 179–192 (2018)
3. Marianov, V., Bronfman, V., Obreque, C., Luer-Villagra, A.: A milk collection problem
with blending. Transp. Res. Part E Logist. Transp. Rev. 94, 26–43 (2016)
4. Aurey, S., Wolf, D., Smeers, Y.: Using Column Generation to Solve a Coal Blending
Problem. RAIRO – University Press, New Jersey (1968)
5. Arsham, H.: An artificial-free simplex-type algorithm for general LP models. Math. Comput.
Model. 25(1), 107–123 (1997)
6. Gao, P.: Improvement and its computer implementation of an artificial-free simplex-type algorithm by Arsham. Appl. Math. Comput. 263, 410–415 (2015)
7. Huhn, P.: A counterexample to H. Arsham: Initialization of the Simplex Algorithm: An
Artificial Free Approach (1998)
8. Boonperm, A., Sinapiromsaran, K.: The artificial-free technique along the objective direction
for the simplex algorithm. J. Phys. Conf. Ser. 490, 012193 (2014)
9. Nabli, H., Chahdoura, S.: Algebraic simplex initialization combined with the nonfeasible
basis method. Eur. J. Oper. Res. 245(2), 384–391 (2015)
10. Phumrachat, M.T.: On the use of sum of unoccupied rows for the simplex algorithm
initialization. Doctoral dissertation, Thammasat University (2017)
Review of the Information that is Previously
Needed to Include Traceability in a Global
Supply Chain

Zayra M. Reyna Guevara1, Jania A. Saucedo Martínez1,


and José A. Marmolejo2(&)
1
Universidad Autónoma de Nuevo León,
66451 San Nicolás de los Garza, México
{zayra.reynagvr,jania.saucedomrt}@uanl.edu.mx
2
Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin 498,
03920 Ciudad de México, México
jmarmolejo@up.edu.mx

Abstract. Caring for and preserving the quality of products throughout the
supply chain has become a challenge for global trade. So is the backup of
information so that it is available to the clients or authorities that require it.
Global chains are made up of different stages and processes that include production, transformation, handling and transportation from origin to destination. These stages are executed by different actors, and therefore effective, continuous communication between them is required.
For this reason, the initial purpose of this research is to identify what prior
information is required to include traceability processes that serve as support for
an integrated supply chain and thus reduce potential risks to the integrity of the
marketed product. An analysis of available traceability technologies that have
been used in global supply chains is included.
The information was collected through a review of literature related to traceability and quality, with which it was possible to identify in a general way the main stakeholders in supply chains. This provides us with a frame of reference on the possible limitations of the research and a starting point for the proposal of an efficient traceability process according to the client's requirements.

Keywords: Traceability · Global supply chain · Technology · Quality

1 Introduction

In a market context with a tendency to expand, logistics plays a key role in constantly seeking to satisfy demand at the best possible cost without neglecting customer care and service, which requires delivering products with the best possible quality and safety while giving customers access to the information related to the products they purchase. Currently, all this is possible thanks to the integration of technological tools in supply chains in conjunction with the application of traceability processes, which have become a requirement in international trade for better

control and risk management, especially in agro-industrial chains. The food and
pharmaceutical industries must have traceability systems that allow the identification of
products and processes with a proper backup of such information.
It is necessary that our production chain complies with both local and international
regulations as well as the expected quality standards; also risks must be reduced as
much as possible by providing each participant the possibility of access to information
on the monitoring and management of their products, which is achieved when trace-
ability systems are integrated and they work properly. Tracking tools allow the product
to be tracked at each stage of the transformation process and during its distribution
throughout the supply chain by having information support, which includes the
implementation of new technologies.
Since our study is in its initial phase, we can use simulation tools to start running some tests. Evolutionary prototyping is a popular system development technique and
can enable users to develop a concrete sense about software systems that have not yet
been implemented [1].
Traceability in the supply chain allows the identification of information related to
the monitoring and tracing of traceable units during the production and marketing
process [10]. Since a supply chain is made up of various actors according to each stage, including producers, suppliers, packers, transporters, traders and consumers, the traceability system serves to support the integration of the chain so that demand can be satisfied and, if necessary, corrective actions can be taken.
All this is done in an effort to seek better marketing opportunities, ensuring compliance with health and quality standards regardless of the type of transport used to reach the destination.
All stages of the chain should be documented and managed so that it is possible to
trace them back to the origin of the product, step by step, and ensure that there has been
no contamination… This traceability minimizes the risk of fraud at all stages and is a
very important part of the inspection process of certifying organizations [5]. An analysis of available methodologies and technologies is included that may be useful to identify the risks involved in the chain and to show how the traceability processes make it possible to ensure the quality of products, having determined which stages already have safe handling and which stages are vulnerable. Finally, the use of tools or technologies that serve as support in backing up information from the process documentation will be proposed.

2 Background

Logistics is defined as the set of activities and processes necessary to ensure the
delivery of merchandise from the place of its production to the point where the product
is marketed or delivered to the final consumer. This involves inspection and quality
control of products and in terms of geographical scope it can be: urban, domestic or
foreign trade [2].
In this way, it can be seen that there are customers who are willing to pay more if the product they are purchasing is of higher quality and they can stay informed about it.

Traceability is the set of pre-established and self-sufficient procedures that allow


knowing the history, location and trajectory of a product or batch of products
throughout the supply chain, at a given time and through specific tools [6].
According to ISO 9000: 2000, quality is the degree to which a set of characteristics
meets the requirements.
As a precedent, proposals were found that involve the use of technologies in the management of traceability processes. They conclude that a traceability system must be highly automated for better management and control of all movements involved in the supply chain, and that the distribution of a product must be supported by systems that allow its continuous monitoring and that function globally and effectively interconnected throughout the food chain, thus achieving a high degree of safety for consumers [9].
Due to the complexity of operations in international trade and the demand for high quality standards, there is a need for adequate risk management, information exchange, and traceability processes that help to verify the safe handling of the product throughout its entire path in the supply chain, as well as to reduce risks that may affect quality; traceability provides the information necessary for compliance with international standards.
Depending on the chain in which we are working, a specific strategy must be designed that adapts to its needs; therefore, each stage must be analyzed from origin to destination, identifying vulnerable stages by pointing out the variables and critical points. For this, a review of available technological tools has been made (see Table 3).
In terms of risks, a study conducted by Dr. J. Paul Dittman for UPS Capital
Corporation in 2014 identified the following vulnerabilities in the supply chain, ranking
quality risks at the top of the list. Long global supply chains can make it extremely
difficult to recover from quality problems [7].
1. Quality.
2. Inventory.
3. Natural disasters.
4. Economy.
5. Loss in transit.
6. Delays.
7. Computer security.
8. Intellectual property.
9. Political instability.
10. Customs.
11. Terrorism.
According to the level of implementation, some advantages of using traceability
processes in our supply chain are:
– Documentation and support of the product history against possible claims.
– Assist in the search for the cause of the nonconformity.
– Help in the control and management of processes.
– Provide information to regulators and customers.
– Reduction of manual controls in ports.
Review of the Information that is Previously Needed to Include Traceability 1275

– Ease of identifying defective products and the withdrawal of products in the event
of possible incidents.
– Support for product safety and quality control system.
– Helps reduce risks related to food safety.

3 Methodology

3.1 Identification of Variables and Critical Points


There are several methodologies that are addressed by the authors listed in Table 1 that
help us to identify variables and critical points that affect a supply chain.

Table 1. Methodologies for the identification of variables and critical points.

Author | Title | Methodology
Heizer, J. and Render, B., 2009 | Principios de administración de operaciones, 7th edn. | Factor rating method; Quality Function Deployment (QFD); Total Quality Management (TQM)
Stanley, R., Knight, C., Bodnar, F., 2011 | Experiences and challenges in the development of an organic HACCP system | Hazard Analysis and Critical Control Points (HACCP)
Rodríguez, E., 2018 | Identificación de prácticas en la gestión de la cadena de suministro sostenible para la industria alimenticia | Network or lattice analysis

The factor rating method is an excellent tool for dealing with evaluation problems
such as country risk or for the selection of suppliers [11].
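As a toy illustration of the factor rating method (with hypothetical factors, weights, and scores that are not taken from the cited works), each alternative's weighted score can be computed as follows:

# Factor rating for supplier selection; all numbers are hypothetical.
factors = {"quality": 0.40, "cost": 0.30, "delivery reliability": 0.20,
           "traceability support": 0.10}
scores = {"Supplier A": {"quality": 80, "cost": 70,
                         "delivery reliability": 90, "traceability support": 60},
          "Supplier B": {"quality": 70, "cost": 90,
                         "delivery reliability": 75, "traceability support": 85}}
# weighted score per supplier, then pick the best
rated = {s: sum(w * scores[s][f] for f, w in factors.items()) for s in scores}
print(rated, "->", max(rated, key=rated.get))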
The QFD (quality function deployment) methodology refers, in the first place, to determining what will satisfy the customer and, second, to translating the customer's wishes into a target design. The idea is to gain a good understanding of the customer's wishes and to identify alternative process solutions. QFD is used early in the design process to help determine what will satisfy the customer and where to deploy quality efforts [11].
TQM (Total Quality Management) refers to the emphasis that an entire organization
places on quality, from supplier to customer. TQM emphasizes management’s com-
mitment to continually direct the entire company toward excellence in all aspects of
products and services that are important to the customer [11].
The use of Hazard Analysis and Critical Control Points (HACCP) based quality assurance has a well-established place in controlling safety hazards in food supply chains. It is an assurance system based on the prevention of food safety problems and is accepted by international authorities as the most effective means of controlling food-borne diseases [12].
From the results of a qualitative analysis, it is possible to implement the technique
of network analysis or reticular analysis that allows identifying the relationships
between the different practices, dimensions and categories of analysis [13].
Therefore, for this research, we must first start by looking at the type of supply
chain that we will be working with. Once the previous step has been carried out, we
will be able to select an appropriate methodology to identify the existing variables.

3.2 Supply Chain Modeling


To continue to the next stage, we will select a tool that allows us to model in detail the
value chain with which we are going to work and thus verify the traceability aspects
involved, Table 2 lists some of the tools available for this purpose.

Table 2. Characterization of the supply chain.

Author | Title | Tool
Martínez, K. et al., 2018 | Caracterización de la cadena de suministro de la Asociación Ruta de la Carne en el departamento de Boyacá | Analysis of basic functions; SCOR; strategic supply chain management (SCM) framework
García-Cáceres, R. et al., 2014 | Characterization of the supply and value chains of Colombian cocoa | Analysis of basic functions
Pizzuti, T. et al., 2014 | Food Track & Trace ontology for helping the food traceability control | Business Process Model and Notation (BPMN)
Giraldo, J. et al., 2019 | Simulación discreta y por agentes de una cadena de suministro simple incluyendo un Sistema de Información Geográfica (SIG) | Simulation
Sarli, J. et al., 2018 | Un modelo de interoperabilidad semántica para simulación distribuida de cadenas de suministro | Simulation

The analysis of basic functions helps us to identify the characteristics of the supply chain starting from qualitative and quantitative evaluations that include interviews and surveys applied to the members of the chain [23].
We can represent the current state of the supply chain through the geographic map, thread diagram and process diagram used in the SCOR model, which describes the existing process by identifying sources, manufacturing sites and distribution centers, using the process categories [23].

Object Management Group mentioned in 2010 that the BPMN tool is a notation
that has been specifically designed to coordinate the sequence of processes and the
messages that flow between the different participants of the process in a set of related
activities.
The analysis of a generic supply chain underscores that as a product passes from the
primary producer to the end customer, that product undergoes a series of transforma-
tions. Each transformation will involve different agents. Each agent must collect,
maintain and share information to enable food traceability. The registration and
management of information is facilitated through the use of an information system
generated from the BPMN model of the supply chain [25].
A well-known technique of system modeling is simulation, which allows one to imitate the operation of various kinds of facilities and processes in the real world. There is no unique way to model by simulation, as this depends on the characteristics of the system, the objectives of the study and the information available [26].
In the Sarli study published in 2018, distributed simulation appears as a more appropriate configuration for running a simulation of a supply chain, since it allows the reuse of previously existing simulators of the members of the chain, thus maintaining independence and avoiding the need to build a single simulator that understands the behavior of all its participants. In this way, each member preserves its business logic, uses its own simulation model and shares the minimum amount of information necessary, which allows it to modify its internal logic without affecting the rest [27].
Having mentioned the above, we can start the characterization of the supply chain using the analysis of basic functions, at first with qualitative or quantitative evaluations; once we have that information, we can proceed to make a geographic map, a thread diagram and the process diagram from the SCOR model.
Our next step can be a simulated supply chain. We consider it an appropriate option for our study, since it would allow us to make changes or new proposals in a flexible way in the tests with the supply chain as appropriate.

3.3 Technology for Traceability


For the next stage, a technology for traceability will be selected. Table 3 shows some of the technological tools used in traceability processes. The most common is RFID technology, but the most recent, which is still in an exploratory phase in supply chains, is blockchain technology. Therefore, we will select the one or ones that allow us to propose improvements to the traceability processes as applicable to our future case study.

Table 3. Technological tools for traceability.

Author | Title | Technology
Catarinucci, L. et al., 2011 | RFID and WSNs for traceability of agricultural goods from Farm to Fork: Electromagnetic and deployment aspects on wine test-cases | RFID
Zou, Z. et al., 2014 | Radio frequency identification enabled wireless sensing for intelligent food logistics | RFID
Mainetti, L. et al., 2013 | An innovative and low-cost gapless traceability system of fresh vegetable products using RF technologies and EPCglobal standard | RFID, NFC, DataMatrix
Mack, M. et al., 2014 | Quality tracing in meat supply chains | Unique quality identification
Zhao, Y. et al., 2014 | Recent developments in application of stable isotope analysis on agro-product authenticity and traceability | Isotope analysis
Consonni, R., Cagliani, L., 2010 | Nuclear magnetic resonance and chemometrics to assess geographical origin and quality of traditional food products | Chemometry and NIRS
De Mattia, F. et al., 2011 | A comparative study of different DNA barcoding markers for the identification of some members of Lamiacaea | DNA barcode
Casino, F. et al., 2019 | Modeling food supply chain traceability based on blockchain technology | Blockchain
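To make the blockchain option more concrete, the sketch below chains traceability events by hashing, in the spirit of Casino et al. (2019); the record fields and the simplified single-writer design are our own illustrative assumptions and not the cited model.

import hashlib, json, time

# Each traceability event stores the hash of the previous one, so any
# tampering with an earlier record breaks the chain. Fields are illustrative.
def add_event(chain, batch_id, stage, location):
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = {"batch": batch_id, "stage": stage, "location": location,
            "time": time.time(), "prev": prev}
    body["hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append(body)

chain = []
add_event(chain, "LOT-001", "harvest", "farm")
add_event(chain, "LOT-001", "packing", "packing plant")
add_event(chain, "LOT-001", "shipping", "port of origin")
# verify the chain end to end
print(all(chain[k]["prev"] == chain[k - 1]["hash"] for k in range(1, len(chain))))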

A comparative evaluation of these available methodologies and tools will be carried


out to define which would be the most appropriate for this research.
For the final stage, it is intended to analyze the data, as well as to measure whether the objectives were achieved and whether it was possible to identify the variables and critical points, and thereby generate the discussion of the results and the dissemination of the research.

4 Conclusions

We also need to find specific information about regulatory requirements. In addition, the interest of those responsible for each phase throughout the supply chain is required to achieve full traceability.
A good understanding of the business problem is required, together with a clear definition of the objectives to be achieved. Since logistics is not the same for all organizations, traceability processes must be flexible and adapted to each case as necessary, taking into account the requirements of the client, the demand, the implementation capacity and the design of the supply chain.
In this order of ideas, we understand that the level of implementation of traceability will depend on the type of supply chain and on the ability to adapt to the expectations and demands of the client with the best possible technology, since global markets demand it and there are also clients who will be willing to pay more for a higher quality product and the possibility of staying informed about it.

References
1. Zhang, X., Lv, S., Xu, M., Mu, W.: Applying evolutionary prototyping model for eliciting
system requirement of meat traceability at agribusiness level. Food Control 21, 1556–1562
(2010)
2. Montanez, L., Granada, I., Rodríguez, R., Veverka, J.: Guía logística. Aspectos conceptuales
y prácticos de la logística de cargas. Banco Interamericano de Desarrollo, Estados Unidos
(2015)
3. Barbero, J.: La logística de carga en América Latina y el Caribe, una agenda para mejorar su
desempeño. Banco Interamericano de Desarrollo, Estados Unidos (2010)
4. Besterfield, D.: Control de calidad, 8th edn. Pearson Educación, México (2009)
5. ITC: Guía del Exportador de Café, 3ra. Ed. Centro de Comercio Internacional, agencia
conjunta de la Organización Mundial del Comercio y las Naciones Unidas, Suiza (2011)
6. AECOC: La Asociación de Fabricantes y Distribuidores. Recuperado de https://www.aecoc.
es/servicios/implantacion/trazabilidad/. Accessed 10 Aug 2020
7. Dittman, J.: Gestión de riesgos en la cadena de suministro global, un informe de la Facultad
de Administración de la Cadena de Suministro en la Universidad de Tennessee. Patrocinado
por UPS Capital Corporation, Estados Unidos (2014)
8. Forero, N., González, J., Sánchez, J., Valencia, Y.: Sistema de trazado para la cadena de
suministro del café colombiano, Colombia (2019)
9. Sosa, C.: Propuesta de un sistema de trazabilidad de productos para la cadena de suministro
agroalimentaria, España (2017)
10. Rincón, D., Fonseca, J., Castro, J.: Hacia un marco conceptual común sobre trazabilidad en
la cadena de suministro de alimentos, Colombia (2017)
11. Heizer, J., Render, B.: Principios de administración de operaciones, 7th edn. Pearson
Educación, México (2009)
12. Stanley, R., Knight, C., Bodnar, F.: Experiences and challenges in the development of an
organic HACCP system, United Kingdom (2011)
13. Rodríguez, E.: Identificación de prácticas en la gestión de la cadena de suministro sostenible
para la industria alimenticia, Colombia (2018)
14. Badia-Melis, R., Mishra, P., Ruiz-García, L.: Food traceability: new trends and recent
advances. A review. Food Control 57, 393–401 (2015)
15. Catarinucci, L., Cuiñas, I., Expósito, I., Colella, R., Gay-Fernández, J., Tarricone, L.: RFID
and WSNs for traceability of agricultural goods from farm to fork: electromagnetic and
deployment aspects on wine test-cases. In: SoftCOM 2011, 19th International Conference on
Software, Telecommunications and Computer Networks, pp. 1–4. Institute of Electrical and
Electronics Engineers (2011)
16. Zou, Z., Chen, Q., Uysal, I., Zheng, L.: Radio frequency identification enabled wireless
sensing for intelligent food logistics. Philos. Trans. R. Soc. A 372, 20130302 (2014)
17. Mainetti, L., Patrono, L., Stefanizzi, M., Vergallo, R.: An innovative and low-cost gapless
traceability system of fresh vegetable products using RF technologies and EPCglobal
standard. Comput. Electron. Agric. 98, 146–157 (2013)
18. Mack, M., Dittmer, P., Veigt, M., Kus, M., Nehmiz, U., Kreyenschmidt, J.: Quality tracing
in meat supply chains. Philos. Trans. R. Soc. A 372, 20130302 (2014)

19. Zhao, Y., Zhang, B., Chen, G., Chen, A., Yang, S., Ye, Z.: Recent developments in
application of stable isotope analysis on agro-product authenticity and traceability. Food
Chem. 145, 300–305 (2014)
20. Consonni, R., Cagliani, L.: Nuclear magnetic resonance and chemometrics to assess
geographical origin and quality of traditional food products. Adv. Food Nutr. Res. 59, 87–
165 (2010)
21. De Mattia, F., Bruni, I., Galimberti, A., Cattaneo, F., Casiraghi, M., Labra, M.: A
comparative study of different DNA barcoding markers for the identification of some
members of Lamiacaea. Food Res. Int. 44, 693–702 (2011)
22. Casino, F., Kanakaris, V., Dasaklis, T., Moschuris, S., Rachaniotis, N.: Modeling food
supply chain traceability based on blockchain technology. IFAC PapersOnLine 52, 2728–
2733 (2019)
23. Martínez, K., Rivera, L., García, R.: Caracterización de la cadena de suministro de la
Asociación Ruta de la Carne en el departamento de Boyacá. Universidad Pedagógica y
Tecnológica de Colombia, Colombia (2018)
24. García-Cáceres, R., Perdomo, A., Ortiz, O., Beltrán, P., López, K.: Characterization of the
supply and value chains of Colombian cocoa. DYNA 81, 30–40 (2014). Universidad
Nacional de Colombia, Colombia
25. Pizzuti, T., Mirabelli, G., Sanz-Bobi, M., Goméz-Gonzaléz, F.: Food track & trace ontology
for helping the food traceability control. J. Food Eng. 120, 17–30 (2014)
26. Giraldo-García, J., Castrillón-Gómez, O., Ruiz-Herrera, S.: Simulación discreta y por
agentes de una cadena de suministro simple incluyendo un Sistema de Información
Geográfica (SIG). Información tecnológica 30, 123–136 (2019)
27. Sarli, J., Leone, H., Gutierrez, M.: SCFHLA: Un modelo de interoperabilidad semántica para simulación distribuida de cadenas de suministro. RISTI, Revista Ibérica de Sistemas y Tecnologías de Información 30, 34–50 (2018)
Online Technology: Effective Contributor
to Academic Writing

Md. Hafiz Iqbal1(&) , Md Masumur Rahaman2, Tanusree Debi3,


and Mohammad Shamsul Arefin3
1
Government Edward College, Pabna, Bangladesh
vaskoriqbal@gmail.com
2
Bangladesh Embassy, Bangkok, Thailand
3
Chittagong University of Engineering and Technology,
Chittagong, Bangladesh

Abstract. This study explores the potential contributors to online technology


in academic writing and designs a policy for using online technology. Focus
group discussion and the survey (n = 151) were used for variable selection,
questionnaire development, and data collection. A mini-experiment was con-
ducted at the Department of Economics, Government Edward College, Pabna,
Bangladesh and the survey was conducted at seven different colleges also in
Pabna district. Students’ socio-economic-demographic characteristics like age,
gender, religion, students’ parent income, students’ daily expenses, household
composition, and level of education are major influential determinants of
effective usage of online technology. Tutorial sessions for using online tech-
nology in writing, cost-effective internet package for students, institutional and
infrastructural supports, motivation, students’ residential location, and electronic
gadgets provided to the students at a subsidized price are important contributors
to use online technology. Online technology-aided academic writing is a joint
effort of students, teachers, college administration, University authority, and the
Government.

Keywords: Online technology · Academic writing · Knowledge management · Massive open online course

1 Introduction

Academic writing free of cut, copy, and paste practices has great implications for every research and educational institution, where plagiarism-induced writing is intolerable [1, 2]. Online technology plays a significant role in producing quality writing. It facilitates the provision of e-mentoring, e-libraries, and e-discussion forums [3, 4]. The most common and popular forms of online technology include Dropbox, Google Drive, YouTube, Facebook, online databases, reflective journals, blogs, chat rooms for academic discussion, wikis, Skype, WhatsApp, Zoom, and social media groups [5].
Students’ academic and research activities at the tertiary level are mostly influenced by
online technology because of its significant role in generating diversified learning
strategies [6]. It develops student presentation, writing, searching, concept


development, critical thinking, citation, and referencing skills [7]. Besides, mentees or students can send assignments to their respective mentors, academic supervisors, or teachers for the assessment of their academic tasks through Dropbox, Google Drive, and WhatsApp. They can share their necessary files, documents, relevant articles, information, and data with other researchers, academicians, and policymakers by using online technology. Proper utilization of online technology also helps students to select, design, and develop the questionnaire, survey technique, study area, theory, concept, research questions and objectives, research approach and philosophy, methodology, and data management technique, as well as interview questions, and to obtain proofreading support from teachers, researchers, experts, and mentors. Thus, online technology has resulted in the
emergence of new forms of learning genres and literature that empowers students’ level
of cognition through collaboration among students, teachers, and researchers [8].
Considering the importance of online technology, a British proverb reveals that “if
students or learners do not effectively make a bridge between their ideas and online
technology, they might as well live in a cave or the ground shell of a space shuttle” [9,
p. 1]. Modern education and research always give emphasize on online technology for
significant academic writing [10].
At present, students are habituated to handling computers, the internet, smartphones, and tablets. Besides, they have proficiency in intelligent computing, which motivates them to use online and digital technologies. Gaining the optimum benefit from online technology in academic writing requires its proper utilization. Having no prior applied knowledge of online technology may reduce students' and researchers' interest in and satisfaction with it. This situation may narrow the scope of collaborative learning in a broader sense and restrict the production of quality academic writing. Besides, motivation towards greater usage of online technology in academic writing gets stuck when it faces challenges. Potential challenges to using this technology include learners' limited ability to purchase essential electronic gadgets and internet packages, insufficient support for designing online technology-mediated curricula and assessment techniques, a lack of motivation and encouragement from teachers for greater usage of online technology in writing, students' inability to handle online technology, and interrupted internet connections [11, 12]. Authenticity verification, security, and fraud prevention are another form of challenge to continuing with online technology [13, 14].
Graduate college students of the National University (NU) of Bangladesh are not habituated to online technology for their academic writing. They prefer verbal instruction from their respective teachers, printed textbooks, newspapers, periodicals, and predatory journals. Having no sound applied knowledge of online technology in academic writing, they resort to cut, copy, and paste practices to complete their term papers and other academic assignments and fail to show their potential and capability in assignments and term papers [15]. Moreover, they fail to develop critical thinking and constructive, innovative, and creative ideas, and thus their academic assignments lose scientific quality and originality. At the same time, they fail to present novel ideas and to merge different research paradigms, such as epistemological stances and ontological beliefs. To promote online technology in academic writing, this study tries to fulfill two objectives: to detect potential contributors to greater usage of online technology at the graduate level under the NU of Bangladesh, and to develop policy options for using online technology in academic writing.
Online technology has gained popularity in academic writing in many universities and research institutions due to its significant role in better learning, research, and education. This study mainly focuses on the benefits of online learning in academic writing and also provides an overview of online technology and academic writing. The study is significant in several ways. It contributes to the current research regarding the effect of online technology on academic writing in two ways. First, we provide quantitative evidence on the effect of online technology on academic writing, which has received minor attention in the previous literature; our findings highlight the effect of online technology on writing. Second, we conduct a mini-experiment, which allows us to consider two groups, a treated (experimental) group and a control group, for proper empirical investigation. Hence, we are able to measure the causal effects of online technology on academic writing. In particular, the study gives a fuller understanding of the use of online technology in academic writing and provides an effective guideline for future researchers and policymakers in this field.
The remainder of this paper is organized as follows: Sect. 2 covers a brief literature
review. Section 3 presents the theoretical motivation, Sect. 4 highlights the method-
ology and research plan. The results and discussion are provided in Sect. 5. Section 6
concludes by offering recommendations and policy implications.

2 Literature Review

Online technology enhances students’ writing skills. It further helps to make strong
academic collaboration among students, teachers, researchers, mentors, and others.
Research work, communication, and social and cultural competencies get more force
when these are dealt with online technology [16, 17]. It works as a catalyst to reduce
plagiarism practice in academic writing and helps to seek research opportunities around
the world [18]. It is treated as a more effective, significant, and powerful tool that
facilitates concept building. Students get a good grade when they are involved in online
technology-mediated academic writing [19]. It supports pedagogical tasks and serves as a means of inclusive and lifelong learning. The principal objective
of this technology is to generate quality writing. On the other hand, students get easy
access to the learning platform through online technology that enhances higher-order
learning, thinking, and writing skills [20]. Based on the numerous benefits of online
technology, [21] gives more stress on online technology-induced academic writing
rather than the traditional modes of writing because it is a suitable platform for distance
mode learning. Students can get easy access to learning from anywhere and anytime
through online technology [22].
The use of online technology in academic writing through e-discussion forum,
social media group, wiki, skype, blog, and zoom are well recognized [23]. For instance,
Blog-mediated concepts and feedback play an important role in academic writing [24].
It is an important contributor to students’ learning and academic writing because it
provides critical comments and constructive feedback [25, 26]. It develops learners’
cognitive level, enhances capacity building in academic writing, and foster literacy
skills [27]. It also develops the general concept of any particular issue, motivates students toward copy-and-paste-free writing, and enhances students' confidence along with awareness of their perceived strengths and weaknesses in academic writing. Like blogs, Skype
generates a new dimension of the academic writing approach. Generally, it is seen that
traditional writing (such as library-oriented and class notes-based academic writing)
requires a longer time and a large number of academic resources (such as class note,
textbook, newspaper, and article) to finalize a manuscript [28]. Skype-mediated stu-
dents perform better in academic writing than traditional academic writing practices
[29]. Wikis have greater impacts on academic writing: they make students pay closer attention to the formal aspects of academic writing through collectively contributing or mutually supportive approaches [30]. The wiki is an emergent technology for increasing writing capacity; this web-based authoring instrument is used in academic writing to work in collaboration with others [31]. Wiki-assisted writing can contribute to raising awareness of the audience and increase the use of interpersonal
metadiscourse [32]. On the other hand, the reflective journal plays a vital role in eliminating typos and producing error-free writing. Through the reflective journal, students get relevant and constructive feedback from researchers, scholars, key persons, and teachers that meets their objectives [33]. Like the reflective journal, the chat room also has positive impacts on concept building for academic writing. Online chat rooms function well for regular interactions between students and librarians regarding essential, valid, and reliable bibliographic databases and desirable e-journals and books [34]. Students who are habituated to the chat room for concept development perform better in academic writing than students who have not used a chat reference service [35].
Assessment of the effectiveness of online technology in learning is highly required by teachers, college administrators, the authority of the NU of Bangladesh, and the government to implement, plan, and modify online technology-based learning for better academic writing. The existing literature highlights the effectiveness of online technology in learning from different countries' perspectives, but very few studies have focused on its effectiveness in academic writing in Bangladeshi educational institutions such as the NU of Bangladesh. This study tries to cover this issue.

3 Theoretical Motivation

A society can utilize an alternative option when its traditional or existing option fails to satisfy a desirable condition. From this point of view, Kelvin Lancaster and Richard G. Lipsey first formalized the theory of the second best in their 1956 article entitled "The General Theory of Second Best", following earlier work by James E. Meade [36]. From this viewpoint, online technology works as an effective contributor to
academic writing instead of traditional writing practice. The theory of the second-best
explains why students use online technology in their academic writing practice. This
theory is based on Pareto optimality (academic writing by online technology cannot
increase the writing efficiency without reducing traditional writing practice). A certain
social dividend can be gained by a movement from a Pareto non-optimal allocation to a
Pareto optimal allocation [37]. Therefore, the optimality of the Pareto condition is often
considered an efficient usage of online technology in academic writing.
For better understanding, the fundamental feature of this theory can be presented for a single-student case. The first-order condition for Pareto optimality is obtained by maximizing the quality of the student's academic writing subject to the associated writing production function. From the Lagrangian function, it is possible to write the following mathematical form:

V = W(x_1, x_2, \ldots, x_n) - s F(x_1, x_2, \ldots, x_n; q^0) \quad (1)

Equating the partial derivatives to zero,

\frac{\partial V}{\partial x_i} = W_i - s F_i = 0, \quad i = 1, \ldots, n \quad (2)

where W_i = \partial W / \partial x_i and F_i = \partial F / \partial x_i. It follows that

\frac{W_i}{W_j} = \frac{F_i}{F_j}, \quad i, j = 1, \ldots, n \quad (3)

If Eq. (2) is satisfied, the rate of substitution between the traditional writing approach
and the online technology-mediated writing approach will equal the corresponding rate
of writing transformation.
Assume that an educational institution fails to promote online technology-mediated academic writing practice on its campus, so that one of the conditions of Eq. (2) is violated. Under this practice, we can introduce another assumption to capture the implementation barrier of the online technology-induced writing condition, which can be expressed as follows:

W_i - k F_i = 0 \quad (4)

where k is a positive but arbitrary constant that in general differs from the optimal value of s calculated from Eq. (2) and the writing production function.
The conditions for a second-best solution are obtained by maximizing online technology-mediated academic writing practice subject to Eqs. (1) and (4). From the Lagrangian function, we can write the following mathematical form:

V = W(x_1, x_2, \ldots, x_n) - s F(x_1, x_2, \ldots, x_n; q^0) - c (W_i - k F_i) \quad (5)

where s and c denote the undetermined and unknown multipliers. Taking the partial derivatives of V and setting them equal to zero,

\frac{\partial V}{\partial x_i} = W_i - s F_i - c (W_{1i} - k F_{1i}) = 0, \quad i = 1, \ldots, n \quad (6)
\frac{\partial V}{\partial s} = -F(x_1, x_2, \ldots, x_n; q^0) = 0 \quad (7)

\frac{\partial V}{\partial c} = -(W_i - k F_i) = 0 \quad (8)

If c = 0, it is not possible to obtain a solution from Eq. (6), because the assumption presented in Eq. (4) would be violated. Moving the terms of Eqs. (7) and (8) to the right-hand side and dividing the ith equation by the jth gives

\frac{W_i}{W_j} = \frac{s F_i + c (W_{1i} - k F_{1i})}{s F_j + c (W_{1j} - k F_{1j})}, \quad i, j = 1, \ldots, n

It is not possible to know a priori the signs of the cross partial derivatives W_{1i}, W_{1j}, F_{1i}, and F_{1j}. Hence, one may not expect the usual Pareto conditions to be required for the attainment of the second-best optimum [38]. To attain the second-best optimum, people's perceptions of certain actions and their effectiveness must be elicited through experiments, surveys, observations, interviews, and judgments. Thus, the following section discusses students' perceptions of online technology and its effectiveness in academic writing through a few research instruments: an experiment, focus group discussions (FGDs), and a questionnaire survey.

4 Methodology and Research Plan

4.1 Present Study


Pabna district is selected as the study area. It is the southernmost district of Rajshahi Division, bounded by the Natore and Sirajganj districts on the north, the Rajbari and Kushtia districts on the south, the Manikganj and Sirajganj districts on the east, and the Padma River and Kushtia district on the west. A large number of NU-affiliated colleges are located in this district. The graduate programs of these colleges run under traditional academic practice instead of a semester system. Generally, teachers and students of these colleges do not use online technology in academic writing, and as a consequence the writing process is disrupted. Very often, teachers engage their students in academic writing, but the students' performance is not satisfactory owing to their largely traditional writing practice. Cut, copy, and paste practices are commonly seen in their academic writing, and they often fail to submit their writing tasks within the designated timeframe.

4.2 Mini Experiment


For a better empirical assessment of the effectiveness of online technology in writing,
this study has conducted a mini-experiment that occurred from 7 January to 22
February 2017. It was concerned with writing term papers and other academic
assignments in the graduate class at the Economics Department under the Pabna
Government Edward College. To set up the experiment, we asked the graduate students of this department two questions in order to separate them into a treated (experimental or study) group and a usual (control) group. The first question asked whether they knew about online technology, and the second concerned the effectiveness of online technology in academic writing. A total of 32 students who answered "yes" to both questions were classified into the experimental group, and the remaining 47 students, who answered "no" to both questions, were treated as the control group. The first group was instructed to develop concept building and plagiarism-free academic writing by using different tools of online technology (see Table 1 for more details), and the second group was requested to follow traditional writing practice. Both groups were assigned the same writing tasks, such as group work in the classroom, homework (writing a summary, problem statement, literature review, results and discussion, and recommendations), and report writing on a particular topic. The grading scheme of the writing assignment allotted 10% of the marks to participation, 20% to homework assignments, and 50% to short report writing. Our results showed that the experimental group performed well in academic writing and obtained a better grade in their writing tasks (average score 82%) than the usual group (average score 71%). This result motivated us to run the questionnaire survey.

Table 1. Online technology used in the mini-experiment

Kind of technology   Objective
Web City             Designated for uploading and downloading essential reading materials, students' assignments, PowerPoint presentations (PPT), and feedback
DropBox              Designated for backing up students' important files and class notes and for sharing concept notes and assignments with teachers
Skype                Designated for holding meetings for academic discussions with teachers, classmates, scholars, and researchers
Google Drive         Designated for storing relevant reading materials, journal articles, and data sets
YouTube              Designated for running tutorial sessions on useful research and academic issues
Blogs                Designated for concept building
E-discussion Forum   Designated for interacting with others to develop ideas, concepts, and critical thinking
Kahoot               Designated for evaluating students' performance in academic issues through quizzes on verb forms, correction, punctuation marks, typos, and grammar
Social Media Group   Designated for identifying assignment topics through e-discussion forums
Reflective Journal   Designated for assessment of content and academic writing by experts and proofreaders
Chat Room            Designated for sharing experiences and ideas
Wiki                 Designated for backing up and modifying academic writing
Zoom                 Designated for holding academic meetings with teachers and classmates
WhatsApp             Designated for sharing relevant academic documents and writing materials
Podcast              Designated for recording and listening to the teacher's lectures
4.3 Techniques and Tools of Variables Selection, and Data Collection


In Bangladesh, no prior attempt had been made to compare the effect of online technology with traditional academic writing, and no database existed; therefore, both quantitative and qualitative instruments were applied for a better empirical assessment. Variables of the study were selected from the existing literature and focus group discussions (FGDs), and data were collected through a survey with a semi-structured questionnaire in a few NU-affiliated colleges in the Pabna district.
A systematic search of electronic databases was performed for concept development of online technology and its role in developing academic writing, covering the period 1999 to 2020. All studies related to the effectiveness of online technology were selected at random to develop its concepts, strategies, and drivers. Three attributes, namely institutional and infrastructural support (ins_inf_sup), motivation (mtv), and students' residential location (stu_resi_loc), were selected from [39–41].
This study arranged two FGDs, each consisting of 7–8 participants (undergraduate and graduate students, teachers, parents, and college administrators), held on 24–25 March 2017 at Government Edward College and Government Shaheed Bulbul College of Pabna district. The objectives of the FGDs were to select variables and design a relevant questionnaire for data collection. Three further attributes, namely tutorial sessions for using online technology in academic writing (tut_se_on_te), a cost-effective internet package for students (co_ef_in_pac), and the provision of electronic gadgets to students at a subsidized price (gad_sub_pri), were selected from our two FGDs.
The surveys were conducted through interviews based on a semi-structured questionnaire in seven NU-affiliated colleges from 7 May to 19 August 2017. A purposive random sampling method was applied to the students of the seven targeted colleges. Three colleges (Government Edward College, Pabna Government Mohila [women] College, and Ishwardi Government College) were selected from two urban areas, and 82 respondents from these colleges participated in the survey. The other 69 respondents were selected from Dr. Zahurul Kamal College under the Sujanagar sub-district, Haji Jamal Uddin College under the Bhangura sub-district, Chatmohar College under the Chatmohar sub-district, and Bera College under the Bera sub-district. The selection of respondents was kept as random as possible; however, sampling errors remained possible. The following procedures were taken to reduce survey bias. All survey interviews were conducted by trained data collectors. All respondents were briefed about the importance of online technology in academic writing. Respondents were given ample time during their interviews. The data collectors did not engage in personal or irrelevant conversation, to avoid anchoring or influencing the respondents' answers. The questionnaire consisted of a few sections: the first covered the socio-economic-demographic (SED) characteristics of the respondents, and the second highlighted the determinants of online technology in academic writing.
The collected data contain many dummy responses, to which the random parameter logit (RPL), or basic, model was applied [42]. In the first step of estimation, the RPL was run for the participation decision in online technology, where 1 indicates a positive perception of online technology for writing excellence and 0 indicates otherwise; these responses are
regressed on the explanatory variables or attributes (ins_inf_sup, mtv, stu_resi_loc, tut_se_on_te, co_ef_in_pac, and gad_sub_pri). All of our proposed attributes and variables are included in the following regression model:

Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \quad (9)

Equation (9) is known as the RPL model when we use only the attributes; it is known as the MNL model when we use both the attributes and the SED characteristics [43]. The use of online technology in academic writing (us_olt_ac_wri) is treated as the outcome variable in both models.

5 Results and Discussion

5.1 Descriptive Statistics of the Variables in the Model


Data collected from the survey in seven NU-affiliated colleges of Pabna district yield the basic descriptive statistics of the major SED characteristics (see Table 2 for more details).

Table 2. Brief descriptive statistics of major SED variables


Variable Minimum Maximum Mean Std. Deviation
Age of respondents (years) 20 25 22.33 1.373
Parents’ monthly income (Tk.) 20000 90000 43791.67 19023.965
Daily expense (Tk.) 20 150 70.83 45.076
Household composition (family member) 4 12 6.88 2.028
Level of education (years of schooling) 16 17 16.50 0.511
(Source: Field Survey)

Out of 151 students from the seven NU-affiliated colleges, 119 (78.8%) respondents were male and 32 (21.2%) were female. Among them, 92% of respondents belonged to Muslim households and the remaining 8% to Hindu households. One hundred (66%) students argued that online technology could play a significant role in academic writing, but 51 (34%) respondents were not interested in online technology due to low family income, scarcity of internet service, inability to purchase a smartphone (Android or iOS), the high price of internet packages, low bandwidth speed, and a feeling of discomfort with online technology. All students in the survey strongly pointed out that they did not get any support from their respective institutions for better utilization of online technology in academic writing. Table 2 outlines the summary statistics of the study. Respondents' average age and their parents' monthly income were recorded as 22.33 years and Taka 43,792 (Bangladeshi currency), respectively. All respondents were studying at the undergraduate or graduate level, and their daily average expense was estimated at Taka 71 per person. The majority of the respondents have more than five family members.
5.2 Results of Models


Coefficients of the proposed attributes of online technology express the probability of students choosing online technology-mediated academic writing. The sign, level of significance, and magnitude of the coefficients attest to the effectiveness of online technology in academic writing. All coefficients are statistically significant at the 1%, 5%, or 10% level. The estimated results indicate that online technology-mediated writing produces more plagiarism-free and innovative writing than the traditional writing approach. Thus, it can be said that online technology has a strong causal effect on academic writing.
Results for all 151 respondents from the RPL and MNL models are presented in Table 3. The RPL model shows the result when only the proposed attributes of online technology are included. The MNL model shows that the SED characteristics, along with the proposed attributes, are significant determinants of online technology use. However, it is not possible to predict the relationship between gender and online technology or between religion and online technology: relatively few female and Hindu students participated in the survey, which may cause the insignificant relationship between gender and the use of online technology, as well as that between religion and the use of online technology. Generally, students' age, their parents' income, their daily expenses, and their household composition are more sensitive to online technology [44]. Similar results are obtained in the literature for the level of education (years of schooling) of the respondents. For instance, Woodrich and Fan [45] found that a 1% increase in the level of education leads to an increase in the use of online technology in academic writing. Higher-level education requires citation, referencing, summarizing, concept note development, and paraphrasing for innovative academic writing.
The log-likelihood test was used to determine the acceptance or rejection of each variable [46]. The goodness of fit (R2) also improves when the covariates are added [47]. A value of R2 in the range 0.20–0.30 is comparable to an adjusted R2 in the range 0.70–0.90 for an ordinary least squares (OLS) regression [48]. Thus, the RPL and MNL models with the covariates are deemed good regression models.

5.3 Result of the Correlation Matrix


The estimated correlations also support the results generated from the RPL and MNL models. There are various relationships between the outcome variable and the proposed attributes, but the intensity or degree of the relationships is not the same, as reflected in the different values of the correlation coefficients: very strong (r >= 0.8), strong (r >= 0.6), moderate (r >= 0.4), and weak (r >= 0.2) (see Table 4 for more details).

There are positive relationships between all the proposed attributes and online technology in academic writing at conventional significance levels, except for students' residential location, which is negatively associated with online technology. Among all the attributes, institutional and infrastructural support, tutorial sessions for online technology, and a cost-effective internet package have very strong associations with online technology in academic writing.
Table 3. Regression results of the survey

Attributes/Variables    RPL model:                   RPL model:               RPL model:         MNL model:
                        sub-district level colleges  district level colleges  all colleges       all colleges
Intercept               21.613* (0.309)              13.093** (0.019)         24.499** (0.010)   32.071* (0.546)
ins_inf_sup             0.421 (0.431)                0.271* (0.150)           0.178** (0.547)    0.119** (0.031)
Motivation              −0.221*** (0.680)            0.375** (0.003)          0.110* (0.326)     0.210* (0.001)
stu_resi_loc            −0.347 (0.037)               −0.175* (0.172)          −0.109* (0.342)    −0.113* (0.090)
tut_se_on_te            0.848*** (0.731)             0.182*** (0.511)         0.020** (0.437)    0.120** (0.022)
co_ef_in_pac            0.131** (0.130)              0.296*** (0.221)         0.216* (0.738)     0.129*** (0.067)
gad_sub_pri             0.196** (0.608)              0.132* (0.362)           0.093** (0.586)    0.093*** (0.245)
age                                                                                              0.009* (0.002)
gender                                                                                           0.221 (0.023)
religion                                                                                         0.452 (0.165)
parents' income                                                                                  0.130* (0.459)
daily expense                                                                                    0.062** (0.683)
household composition                                                                            0.072*** (0.453)
level of education                                                                               −0.172*** (0.002)
Log-likelihood          −479.9231                    −371.5701                −360.0361          −506.4572
Goodness of fit         0.339                        0.394                    0.402              0.273
Observations (n)        69                           82                       151                151

Note. Standard errors are reported in parentheses. * Significant at the 1% level. ** Significant at the 5% level. *** Significant at the 10% level.

Table 4. Correlation matrix of relevant attributes for online technology

               us_olt_ac_wri  ins_inf_sup  mtv        stu_resi_loc  tut_se_on_te  co_ef_in_pac  gad_sub_pri
us_olt_ac_wri  1
ins_inf_sup    0.831**        1
mtv            0.518*         0.327        1
stu_resi_loc   −0.623***      0.442**      −0.678     1
tut_se_on_te   0.891**        0.339*       0.589*     0.348*        1
co_ef_in_pac   0.883*         0.125**      −0.423**   0.449         0.889*        1
gad_sub_pri    0.567**        0.254***     −0.798***  0.673**       0.805*        0.810***      1

Note. * Significant at the 1% level. ** Significant at the 5% level. *** Significant at the 10% level.

6 Conclusion and Policy Implications

This study explores the potential contributors to online technology in academic writing in NU-affiliated colleges of Bangladesh and develops policy options for using online technology. We used cross-sectional data collected from a few NU-affiliated colleges in the Pabna district. FGDs helped us select a few attributes and design the questionnaire. We also relied on a mini-experiment and two types of regression models for a proper empirical assessment. We found that students were well aware of plagiarism-free academic writing and wanted to produce online technology-mediated writing. Our empirical assessment also supports the effectiveness of online technology, based on the significance of institutional and infrastructural status, students' residential location, motivation towards greater usage of online technology in academic writing, tutorial sessions for using online technology, a cost-effective internet package for students, and the provision of electronic gadgets to students at a subsidized price. Sometimes a few SED characteristics work as accelerators that move online technology in writing forward [49].
The paradigm shift from the traditional writing approach to the online-mediated writing approach is essential for developing an environment of online technology in any educational institution. It requires appropriate guidelines, strategies, policies, and the joint efforts of stakeholders such as class teachers, college administrators, the authority of the NU of Bangladesh, and government intervention. For instance, the government can formulate an online technology-mediated educational policy for college education. The government can support training facilities for college teachers to habituate them to online technology in education. It can also provide free electronic gadgets such as tablets, smartphones, and laptops to students, or provide a subsidy so that they can purchase electronic gadgets at a lower price. Likewise, college authorities can provide uninterrupted Wi-Fi facilities, multimedia projector-supported classrooms, and a continuous power supply for better access to online technology. The NU of Bangladesh can redesign its syllabus and curriculum with special attention to online technology-mediated academic writing. Lastly, class teachers can encourage and motivate their students to make more use of online technology in academic writing and can arrange tutorial sessions so that students can effectively use online technology in their academic writing tasks.
The study is not free from certain limitations. As the topic is a new concept, it is essential to undertake an in-depth study and a wider questionnaire survey; for a better assessment, further research in this field is needed. A survey of limited scope in a few colleges may narrow the scope, concept, and effectiveness of online technology in academic writing. However, we successfully applied different approaches that guarantee the validity, reliability, and consistency of our empirical findings.

Acknowledgments. We gratefully acknowledge Assistant Professor Dr. Renee Chew Shiun Yee, School of Education, University of Nottingham, Malaysia Campus (UNMC), for her significant comments and suggestions in developing the concept and methodology of this study. We are also thankful to the editor(s) and the anonymous referees for their valuable and constructive suggestions for improving the draft.

References
1. Fahmida, B.: Bangladesh tertiary level students’ common errors in academic writing. BRAC
University, Dhaka (2020). https://dspace.bracu.ac.bd/xmlui/bitstream/handle/10361/252/
08163004.PDF?sequence=4
2. Iqbal, M.H.: E-mentoring: an effective platform for distance learning. E-mentor 2(84), 54–61
(2020)
3. Macznik, A.K., Ribeiro, D.C., Baxter, G.D.: Online technology use in physiotherapy
teaching and learning: a systematic review of effectiveness and users’ perceptions. BMC
Med. Educ. 15(160), 1–2 (2015)
4. Iqbal, M.H., Ahmed, F.: Paperless campus: the real contribution towards a sustainable low
carbon society. J. Environ. Sci. Toxicol. Food Technol. 9(8), 10–17 (2015)
Online Technology: Effective Contributor to Academic Writing 1293

5. Iqbal, M.H., Sarker, S., Mazid, A.M.: Introducing technology in Bangla written research
work: application of econometric analysis. SAMS J. 12(2), 46–54 (2018)
6. Hanson, C., West, J., Neiger, B., Thackeray, R., Barnes, M., Mclntyre, E.: Use and
acceptance of social media among health education. Am. J. Health Educ. 42(4), 197–204
(2011)
7. Kuzme, J.: Using online technology to enhance student presentation skills. Worcester J. Learn. Teach. 5, 27–39 (2011)
8. Vlachopoulos, D., Makri, A.: The effect of games and simulations on higher education: a
systematic literature review. Int. J. High. Educ. 14(1), 22–34 (2017)
9. Adams, P.C.: Placing the anthropocene: a day in the life of an enviro-organism. Trans. Inst.
Br. Geogr. 41(1), 54–65 (2016)
10. Goldie, J.G.S.: Connectivism: a knowledge learning theory for the digital age? Med. Teach.
38(10), 1064–1069 (2016)
11. Shelburne, W.A.: E-book usage in an academic library: user attitudes and behaviors. Libr.
Collect. Acquisitions Tech. Serv. 33(2–3), 59–72 (2009)
12. Thornton, P., Houser, C.: Using mobile phones in English education in Japan. J. Comput.
Assisted Learn. 21(3), 217–228 (2005)
13. Bandyopadhyay, D., Sen, J.: Internet of things: applications and challenges in technology
and standardization. Wireless Pers. Commun. 58(1), 49–69 (2011)
14. Aebersold, R., Agar, J.N., Amster, I.J., Baker, M.S., Bertozzi, C.R., Boja, E., Ge, Y., et al.:
How many human proteoforms are there? Nat. Chem. Biol. 14(3), 206–217 (2018)
15. Click, A.B.: International graduate students in the United States: research process and
challenges. Libr. Inf. Sci. Res. 40(2), 153–162 (2018)
16. Aesaert, K., Vanderlinde, R., Tondeur, J., van Braak, J.: The content of educational
technology curricula: a cross-curricular state of the art. Educ. Tech. Res. Dev. 61(1), 131–
151 (2013)
17. Voogt, J., Roblin, N.P.: 21st century skills. Discussion Paper. University of Twente, Faculty
of Behavioral Science, University of Twente, Department of Curriculum Design and
Educational Innovation, Enschede (2010)
18. Amiel, T., Reeves, T.C.: Design-based research and educational technology: rethinking
technology and the research agenda. Educ. Technol. Soc. 11(4), 29–40 (2008)
19. Hendler, J., Shadbolt, N., Hall, W., Berners-Lee, T., Weitzner, D.: Web science: an
interdisciplinary approach to understanding the web. Commun. ACM 51(7), 60–69 (2008)
20. Levine, M., DiScenza, D.J.: Sweet, sweet science: addressing the gender gap in STEM
disciplines through a one-day high school program in sugar chemistry. J. Chem. Educ. 95(8),
1316–1322 (2018)
21. Schulman, A.H., Sims, R.L.: Learning in an online format versus an in-class format: an
experimental study. J. Technol. Horiz. Educ. 26(11), 54–64 (1999)
22. Murthy, D.: Digital ethnography: an examination of the use of new technologies for social
research. Sociology 42(5), 837–855 (2008)
23. Miyazoe, T., Anderson, T.: Learning outcomes and students’ perceptions of online writing:
simultaneous implementation of a forum, blog, and wiki in an EFL blended learning setting.
System. 38(2), 185–199 (2010)
24. Jimoyiannis, A., Schiza, E.I., Tsiotakis, P.: Students’ self-regulated learning through online
academic writing in a course Blog. In: Digital Technologies: Sustainable Innovations for
Improving Teaching and Learning, pp. 111–129. Springer, Cham (2018)
25. Hansen, H.E.: The impact of blog-style writing on student learning outcomes: a pilot study.
J. Polit. Sci. Educ. 12(1), 85–101 (2016)
26. Chong, E.K.: Using blogging to enhance the initiation of students into academic research.
Comput. Educ. 55(2), 798–807 (2010)
1294 Md. H. Iqbal et al.

27. Novakovich, J.: Fostering critical thinking and reflection through blog-mediated peer
feedback. J. Comput. Assisted Learn. 32(1), 16–30 (2016)
28. Hussein, N.O., Elttayef, A.I.: The impact of utilizing Skype as a social tool network
community on developing English major students’ discourse competence in the English
language syllables. J. Educ. Pract. 7(11), 29–33 (2016)
29. Lo Iacono, V., Symonds, P., Brown, D.H.: Skype as a tool for qualitative research
interviews. Sociol. Res. Online 21(2), 103–117 (2016)
30. Li, M., Zhu, W.: Patterns of computer-mediated interaction in small writing groups using
wikis. Comput. Assisted Lang. Learn. 26(1), 61–82 (2013)
31. Yusoff, Z.S., Alwi, N.A.N., Ibrahim, A.H.: Investigating students’ perception of using wikis
in academic writing. 3L Lang. Linguist. Lit. 18(3), 17–32 (2012)
32. Zhang, W., Cheung, Y.L.: Researching innovation in English language writing instruction: a
state-of-the art-review. J. Lang. Teach. Res. 9(1), 80–89 (2018)
33. Estaji, M., Salimi, H.: The application of wiki-mediated collaborative writing as a
pedagogical tool to promote ESP learners’ writing performance. Asian ESP J. 14(1), 112–
141 (2018)
34. Henter, R., Indreica, E.S.: Reflective journal writing as a metacognitive tool. Int. Sci. Commit. 533, 133–141 (2014)
35. MacDonald, H.: Undergraduate students can provide satisfactory chat reference service in an
academic library. Evid. Based Libr. Inf. Pract. 13(2), 112–114 (2018)
36. Tudini, V.: Using native speakers in chat. Lang. Learn. Technol. 7(3), 141–159 (2003)
37. Lipsey, R.G., Lancaster, K.: The general theory of second best. Rev. Econ. Stud. 24(1), 11–
32 (1956)
38. Vaez-Glasemi, M., Moghadds, Z., Askari, H., Valizadeh, F.: A review of mathematic
algorithms and data envelopment analysis. Iran. J. Optim. 12(1), 115–137 (2020)
39. Dasgupta, P., Stiglitz, J.: On optimal taxation and public production. Rev. Econ. Stud. 39(1),
87–103 (1972)
40. Porter, W.W., Graham, C.R., Spring, K.A., Welch, K.R.: Blended learning in higher
education: institutional adoption and implementation. Comput. Educ. 75, 185–195 (2014)
41. Hsu, H., Wang, S.: The impact of using blogs on college students’ reading comprehension
and learning motivation. Literacy Res. Instr. 50(1), 68–88 (2010)
42. Peterson, P.W.: The debate about online learning: key issues for writing teachers. Comput.
Compos. 18(4), 359–370 (2001)
43. Iqbal, M.H.: Valuing ecosystem services of sundarbans approach of conjoint experiment.
J. Global Ecol. Conserv. 24, e01273 (2020)
44. Iqbal, M.H.: Telemedicine: an innovative twist to primary health care in rural Bangladesh.
J. Prim. Care Community Health 11, 2150132720950519 (2020)
45. Dillenburger, K., Jordan, J.A., McKerr, L., Keenan, M.: The Millennium child with autism: early childhood trajectories for health, education and economic wellbeing. Dev. Neurorehabilitation 18(1), 37–46 (2015)
46. Woodrich, M., Fan, Y.: Google Docs as a tool for collaborative writing in the middle school
classroom. J. Inf. Technol. Educ. Res. 16(16), 391–410 (2017)
47. Hemmert, G.A., Schons, L.M., Wieseke, J., Schimmelpfennig, H.: Log-likelihood-based pseudo-R2 in logistic regression: deriving sample-sensitive benchmarks. Sociol. Methods Res. 47(3), 507–531 (2018)
48. Miaou, S.P., Lu, A., Lum, H.S.: Pitfalls of using R2 to evaluate goodness of fit of accident
prediction models. Transp. Res. Rec. 1542(1), 6–13 (1996)
49. Louviere, J.J., Hensher, D., Swait, J.: Stated Choice Methods: Analysis and Application.
University Press, Cambridge (2020)
A Secured Electronic Voting System
Using Blockchain

Md. Rashadur Rahman, Md. Billal Hossain, Mohammad Shamsul Arefin(B), and Mohammad Ibrahim Khan

Department of CSE, CUET, Chattogram 4349, Bangladesh
rsdrcse14@gmail.com, mhbillal160@gmail.com, sarefin@cuet.ac.bd, muhammad ikhancuet@yahoo.com

Abstract. The foundation of sustainable democracy and good governance is the transparency and credibility of elections. For the last several years, electronic voting systems have gained much popularity and have been of growing interest. E-voting has been considered a promising solution to many challenges of traditional paper-ballot voting. Conventional electronic voting systems are vulnerable due to the centralization of the information system. Blockchain is one of the most secure public ledgers for preserving transaction information and also allows transparent transaction verification. It is a continuously growing list of data blocks which are linked and secured by cryptography. Blockchain is emerging as a very promising technology for e-voting, as it satisfies the essential requirements for conducting a fair, verifiable, and authentic election. In this work, we propose a blockchain-based voting mechanism based on a predetermined turn for each node to mine a new block in the blockchain, rather than performing excessive computation to gain the chance to mine a block. We analyze two possible conflict situations and propose a resolving mechanism. Our proposed voting system ensures voter authentication, anonymity of the voter, data integrity, and verifiability of the election result.

Keywords: Blockchain · E-voting · Bitcoin · Consensus mechanism

1 Introduction
Our modern representative democracy has operated on the voting mechanism since the 17th century [1]. The security of an election is the most fundamental issue and a matter of national security in every democracy. Many voting systems have been developed to conduct fair elections that are acceptable to all the people involved. With the rapid expansion of information technology, electronic voting (e-voting), or online voting, has become a new phenomenon. For the last several years, e-voting systems have been of growing interest in many sectors, as they provide many conveniences over traditional paper voting [2].


Neumann [3] proposed several fundamental criteria for a reliable e-voting system, including integrity of the system, data reliability and integrity, anonymity of the voter, and operator authentication. Deployment of e-voting systems may encounter several issues, such as transparency of the voting, data integrity, ballot secrecy, fraud handling, and reliability [4]. The centralization of voting data makes traditional e-voting systems quite vulnerable to many kinds of attacks, as it provides a single point of failure. It is also very difficult to ensure the anonymity of the voter in e-voting by using encryption alone. Therefore, designing a more secure and feasible e-voting system has become an emerging research area.
Blockchain technology can be an auspicious solution in resolving the secu-
rity concerns of voting systems. Blockchain is a decentralized, distributed,
immutable, incontrovertible public ledger technology. It was first introduced by
Satoshi Nakamoto in 2008. The first application of Blockchain technology was
Bitcoin. Bitcoin is a currency system that could be exchanged securely based on
only cryptography. The fundamental characteristics of this emerging technology
are:

1. This is a decentralized ledger technology. The chain is stored in many different


locations. Since the ledger is replicated and distributed over many nodes, there
is no single point of failure. This provides verifiability and ensures availability.
2. A block in the blockchain is chained to the immediately previous block through a reference that is the hash value of the previous block, called the parent block. This makes the chain immutable and tamper-proof.
3. A consensus must be reached by most of the network nodes before a proposed new block becomes a permanent entry in the chain.

Blockchain is an ordered data structure composed of data chunks called blocks. Each block contains transactions as its block data, and blocks are cryptographically linked with each other. Blockchain technology does not store the data on a single centralized server. In this paper, we propose a blockchain-based voting system for storing votes securely. The key contributions of our work are summarized as follows:

(i) We propose a voting system based on blockchain technology which satisfies the essential properties of e-voting. For verification, we implemented our proposed system on the Windows platform.
(ii) We propose a consensus mechanism based on a predefined turn for each node to mine blocks in the network. We analyze two conflicting situations and propose a mechanism for resolving them.

2 Related Works

Electronic voting is gaining popularity day by day. It offers convenience to the voters and the authority over the conventional paper-ballot process. Security is the prime challenge in electronic voting [2]. Some research has been done in the domain of e-voting [5,6]. Ofori et al. presented an OVIS-based online voting system in [8]. Most of the earlier works were usually based on a signature mechanism, and the votes were stored in a centralized server. These e-voting systems have proven vulnerable to many threats, such as DDoS and Sybil attacks [7]. Blockchain is the most suitable technology for storing the cast votes securely. There exist very few research works on e-voting on blockchain; some projects of e-voting on blockchain are shown in [9].
In [10], Ayed proposed a method of storing votes in blocks in different chains. In that work, the first transaction added to a block is a special transaction that represents the candidate, and each chain is allocated to one candidate: if there are n candidates, there are n blockchains. Assigning a dedicated blockchain to every candidate incurs considerable processing and storage overhead.
Smart contracts are irreversible and traceable applications which execute in a decentralized environment such as a blockchain. No one can change or manipulate the code or its execution behavior once it is deployed. Hjalmarsson et al. in [11] represented an election as a smart contract: the election process is represented by a set of smart contracts instantiated by the election administrators on the blockchain. Their implementation was based on the Ethereum platform. Ethereum-based structures such as Quorum and Geth do not support concurrent execution of transactions, which limits scalability and speed. Exonum is cryptocurrency-based and paid, which makes it very expensive for large-scale implementation.
Lee et al. proposed a method in [12] whereby votes can be stored in the blockchain while preserving the anonymity of the voter. In this work they assumed that there is a trusted authentication organization which authenticates the validity of the voters. The organization provides a unique key for each voter's identity, and the voter has to vote using the given key, so the identity of the voter and his or her information are not stored in the blockchain. The main limitation of the work is that the system completely assumes that all the voters are eligible: a voter can cast his or her vote multiple times, and the system validates the cast votes only at the end of the voting operation.
Yi proposed a blockchain-based e-voting scheme built on distributed ledger technology (DLT) [13]. For providing authentication and non-repudiation, the scheme uses a user credential model based on elliptic curve cryptography (ECC). This system allows a voter to change the cast vote before the deadline, which can be exploited for vote alteration and violates the basic security of an e-voting system. Bellini et al. proposed an e-voting service based on a blockchain infrastructure [14]. In their system, the service configuration defined by the end user is automatically translated into a cloud-based deployable bundle.
Most of the existing works have directly adopted Bitcoin's consensus mechanism, where nodes compete with each other to solve a cryptographic puzzle to mine a block and get an incentive. But in the case of a public election, no incentive is necessary, so the competition among the nodes to mine a new block is quite redundant and wastes a lot of computing power on every node. Our proposed blockchain-based voting system is based on a predetermined turn for each node to mine a new block. Our system stores all the voting information in a single blockchain; it does not require any cryptocurrency to operate, and an eligible voter can vote once in the system.

3 Methodology
The overall architecture of our proposed voting system is shown in Fig. 1. The primary focus of this work is to store voters' votes securely in the blockchain and make sure that the voting information is immutable. Each center in the election has a node in the network. Each time a vote is cast, the validity of the vote is checked first; the system ensures that all the voters participating in the voting are validated. Each center or node has its own copy of the blockchain, in which all the cast votes are stored.

Fig. 1. The system architecture overview of our proposed blockchain-based voting system (voting centers, each hosting a node, communicate via REST APIs; voters are registered and verified against an authentic voter database; valid votes are signed with the center key and mined into new blocks; the blockchain is stored in every valid node; election monitoring covers the election result, vote verification, and vote queries)

3.1 Blockchain Structure


The structure of the blockchain in our proposed system is shown in Fig. 2. Each block contains transactions as its block data, and blocks are cryptographically linked with each other: each block stores the hash of the previous block, and almost all the data of the previous block, including that hash value, are used to generate the hash value of the current block, and so on.
Fig. 2. Basic structure of the connections of blocks in the blockchain (each block carries a block index, the previous block's hash, a timestamp, its transactions, and a proof/nonce)
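The following minimal Python sketch (our illustration; the field names are assumptions matching Fig. 2, not the paper's released code) shows such a block structure:

import time

class Block:
    # Fields follow Fig. 2: index, previous hash, timestamp,
    # transactions (the cast votes), and the proof/nonce found while mining.
    def __init__(self, index, previous_hash, transactions, proof):
        self.index = index
        self.previous_hash = previous_hash
        self.timestamp = time.time()
        self.transactions = transactions
        self.proof = proof

# The genesis block has no parent, so a placeholder previous hash is used.
genesis = Block(0, "0" * 64, [], 0)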

3.2 Voter Verification and Authentication

Our proposed voting system assumes that there is an authentication organization to authenticate the eligible voters. The authentication organization handles the registration of the voters. When a voter makes a vote request to the system by entering the vote key given by the authentication organization, the system first checks the vote key against the registered voter database provided by the organization. If the voter is valid and has not yet voted in the election, the vote is considered valid and is added to the open transactions. The organization has all the information of the registered voters. When voting, the voter provides none of his or her private information, only the vote key, which is known only to that particular voter. Thus the anonymity of the voter is well preserved, and the voter is able to verify with the vote key whether his or her vote was cast.
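A minimal sketch of this validity check (our illustration; the names are assumptions) could look like:

def is_valid_vote(vote_key, registered_keys, used_keys):
    # Accept the vote only if the key belongs to a registered voter
    # and has not already been used: one vote per vote key.
    return vote_key in registered_keys and vote_key not in used_keys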

3.3 Blockchain Transactions

The only data field of each block in the blockchain is its transactions field; the data to be stored in the blockchain go into this field. In our case, the votes are stored as the block data, in other words, votes are stored as transactions in our blockchain. The data of the votes are very sensitive and should not be entrusted to third parties [15]. After a vote is cast, it is stored as a transaction in a block. The size of the transactions field is variable: a block can store a variable number of transactions (Fig. 3).

Open Transactions. Open transactions are the transactions in the blockchain system which have not yet been mined, i.e., added to the blockchain. In our system, votes are first stored in the open transactions before mining (Fig. 4). Once a voter casts his or her vote, if the voter information is valid, the vote is saved as a transaction in the open transactions list. Once a block is mined, all the open transactions are stored in the block, and the open transactions list becomes empty. The process of storing votes in the open transactions is shown in Fig. 5. At first, the key of the particular center or node must be loaded or created.
At first the key of the particular center or node must be loaded or created.
Fig. 3. Transactions inside block

Fig. 4. Flow of a cast vote through the open transactions into the blockchain

Each center must have a public key and a private key, and the keys must be loaded before any vote is cast from that particular center. After creating or loading the center key, the voter casts his or her vote with the voting key provided by the authentication organization: the voter inputs the key and selects a candidate to vote for.

Signing the Transactions. Each vote is treated as a transaction in the blockchain, and each transaction has a field named signature. The signature is created by the center and is used for the further validation of the transaction. It is practically impossible for an attacker to alter any block information in the blockchain: any single alteration of the block information results in a completely changed hash, so the chain is completely broken. Each transaction gets a signature, which is created from the private key of the node together with the candidate and the vote key of the cast vote. The signature can only be verified with the public key of the center. If any information of any vote is altered, the verification process fails because the signature is no longer valid for the changed transaction. All the transactions, or votes, in the open transactions are verified before mining, so this signature ensures the integrity of all the open transactions (Fig. 6).
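The paper does not fix a concrete signature scheme; as one hedged illustration, RSA-PSS from the Python cryptography package can play the role described above, with one key pair per center and the signature computed over the candidate and vote key:

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Center key pair: created once per center and loaded before votes are cast.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

# Assumed payload: the vote key together with the chosen candidate.
payload = b"VOTEKEY123:candidate_A"

pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)
signature = private_key.sign(payload, pss, hashes.SHA256())

# Verification with the center's public key; if the payload were altered,
# verify() would raise InvalidSignature and the vote would be rejected.
public_key.verify(signature, payload, pss, hashes.SHA256())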
Fig. 5. Storing a vote in the open transactions (flow: load the center key; cast the vote; check validity against the registered voter database and the blockchain; reject invalid votes; sign valid votes with the center key; add the vote to the open transactions; broadcast the transaction to the other connected nodes)

Fig. 6. Creation and validation of the signature of a transaction (the transaction and its signature are created together with the private key; only the matching public key can verify a transaction whose signature was created with that private key)

3.4 Hashing the Blocks

Each block has a field named previous hash; through this field, one block is chained to its prior block, and thus the integrity of the overall chain is preserved. The hashing algorithm used is the Secure Hashing Algorithm (SHA-256). Messages of up to 2^{64} bits are transformed into digests, or hash values, of 256 bits (32 bytes); it is practically impossible to forge the hash value.

\mathrm{SHA\text{-}256}: B^1 \cup B^2 \cup \cdots \cup B^{2^{64}} \to B^{256}, \quad M \mapsto H
All the fields of the previous block, including all of its transactions, are used as the input message string for the hashing function. The function maps all this information into a fixed message digest, or hash, of 256 bits (32 bytes). Before any particular block is mined, this hash is generated and used as the previous hash field of the block being mined (Fig. 7).

Fig. 7. Generation of the hash value of a block using SHA-256 (the block index, timestamp, proof, and transactions form the input message; the SHA-256 hash function outputs a digest such as c775ae7455f086e2fc68520d31bfebfdb18ffeaceb933085c510d5f8d2177813)
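A minimal sketch of this step (our illustration, reusing the Block sketch from Sect. 3.1; the exact serialization is an assumption):

import hashlib
import json

def hash_block(block):
    # Serialize all fields of the block, including its transactions,
    # and map them to a fixed 256-bit (64 hex character) digest.
    block_data = json.dumps(block.__dict__, sort_keys=True, default=str)
    return hashlib.sha256(block_data.encode()).hexdigest()

The previous hash field of a newly mined block is then set to hash_block(parent_block).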

3.5 Proposed Mechanism for Consensus

Blockchain has many use cases beyond Bitcoin. In our work, a methodology is designed to store the votes securely inside blocks. The main distinction of our work is that in most other blockchain-based voting systems a reward is given to the mining node: each node in the network tries to mine a new block onto the chain, and the winning node has to solve a cryptographic puzzle to find the nonce, or proof, which produces the required hash for the block to be mined. In our case, there need not be any mining competition among the nodes. The procedure of our proposed consensus is shown in Algorithm 1.

Each node gets its turn T_n to mine a block and add it to the blockchain. The mining node generates a valid proof number, which is used to generate a hash satisfying a specific condition and is later used to verify the block by all the peer nodes. Each transaction tx in the open transactions is checked for verification against the signature of that particular transaction; if all the transactions in the open transactions are valid, the process proceeds to the next step, otherwise the block is rejected. The hash is generated with the Secure Hashing Algorithm (SHA-256), using the information of the previous block. For each block in the blockchain stored in this particular node (self.chain), validity is checked; if all the blocks in the chain are valid, the process proceeds, otherwise the block is rejected. The new block is then broadcast to the connected peer nodes (self.peer_nodes). Finally, if no conflict situation exists, the new block is appended to the blockchain and all the open transactions are cleared.
Algorithm 1. Proposed Consensus Mechanism

1: for turn T_n of a Node_i in NODE_TURNS do
2:   previous_hash ← Hash_SHA256(self.chain[chain_length − 1])
3:   proof ← 0
4:   guess_hash ← Hash_SHA256(self.open_transactions, previous_hash, proof)
5:   while guess_hash[0:3] != "000" do
6:     proof ← proof + 1
7:     guess_hash ← Hash_SHA256(self.open_transactions, previous_hash, proof)
8:   end while
9:   for each transaction tx in self.open_transactions do
10:    if Verify_Transaction(tx) = false then
11:      return
12:    end if
13:  end for
14:  block ← Create_Block(self.chain, previous_hash, self.open_transactions, proof)
15:  for each block_i in self.chain do
16:    if block_i.previous_hash != Hash_SHA256(self.chain[block_i.index − 1]) then
17:      return
18:    end if
19:    if block_i.proof is not valid then
20:      return
21:    end if
22:  end for
23:  for each node_i in self.peer_nodes do
24:    if Broadcast_Block_To(node_i) = Null then
25:      raise conflict and return
26:    end if
27:  end for
28:  if conflict = Null then
29:    append block to self.chain
30:    self.open_transactions ← [ ]
31:  end if
32: end for
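Lines 3 to 8 of Algorithm 1 amount to a simple proof search; a minimal Python sketch (our illustration, with an assumed serialization of the inputs) is:

import hashlib
import json

def find_proof(open_transactions, previous_hash):
    # Increment the proof until the guess hash starts with "000",
    # mirroring lines 3-8 of Algorithm 1.
    proof = 0
    while True:
        guess = json.dumps([open_transactions, previous_hash, proof],
                           sort_keys=True, default=str).encode()
        if hashlib.sha256(guess).hexdigest()[:3] == "000":
            return proof
        proof += 1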

3.6 Conflicts in the Network


In a practical blockchain with many nodes situated in different locations, there is a probability that some node fails to receive a broadcast block. This situation can occur for many reasons, such as network congestion or power failure. If such a situation occurs, the affected node will have a shorter chain than the rest of the nodes; this situation is termed a conflict. There are two possible types of conflicts:

(i) Any of the nodes except the broadcasting node has the shorter chain.
(ii) The broadcasting node has the shorter chain.

Figure 8a shows the conflict situation where a node other than the broadcasting node has the shorter chain; in Fig. 8b, the broadcasting node has a shorter chain than the network.
Fig. 8. (a) Conflict when a node other than the broadcasting node has a shorter chain than the network; (b) conflict when the broadcasting node has a shorter chain than the network

Resolving Conflicts. We have mentioned two different types of conflicts in the network above. Any new block is broadcast to all the connected nodes or centers; during the broadcasting of the block, any node which has the shorter chain is informed that a conflict exists. The steps for detecting conflicts are:

Step 1: If the index of the broadcast block > the index of the last block in the chain + 1, then the receiving node has the shorter chain.
Step 2: If the index of the broadcast block <= the index of the last block in the chain, then the broadcasting node has the shorter chain.

In both cases, there must be a mechanism for resolving the conflict. The procedure for resolving conflicts at any node is shown in Algorithm 2. The
main goal of the resolution is that "the longest chain wins". The node performing the resolving task loops over all the chains in the connected network and repeatedly compares them, keeping the longest; ultimately, the local chain of the node is replaced by the longest chain in the network.

Algorithm 2. Resolving Conflicts

1: winner_chain ← self.chain
2: for each node_i in self.peer_nodes do
3:   node_chain ← chain of node_i
4:   node_chain_len ← len(node_chain)
5:   winner_chain_len ← len(winner_chain)
6:   if node_chain_len > winner_chain_len and Verification(node_chain) = true then
7:     winner_chain ← node_chain
8:   end if
9: end for
10: self.chain ← winner_chain
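A compact Python sketch of Algorithm 2 follows, with the length comparison performed inside the loop over peers, as the prose above ("repeatedly compare") implies. fetch_chain and verify_chain stand in for the node query and chain-verification steps; they are assumptions, not the authors' API.

def resolve_conflicts(self_chain, peer_nodes, fetch_chain, verify_chain):
    winner_chain = self_chain
    for node in peer_nodes:
        node_chain = fetch_chain(node)  # chain of node_i
        # Keep the longest chain that also passes verification.
        if len(node_chain) > len(winner_chain) and verify_chain(node_chain):
            winner_chain = node_chain
    return winner_chain  # replaces the node's local chain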

4 Implementation and Experiment


We implemented the system in Python, with the user interface built in Vue.js.
We simulated our proposed system on 100 virtual nodes (servers), each acting as
a voting center. Information transfer among the nodes is done through a REST
API. The system interface of a node is shown in Fig. 9: a voter has to enter his or
her valid voter key, provided by the authentication organization, to cast a vote.
Once a block is mined, it is broadcast to all the nodes in the system, and each
node adds the newly mined block to its own copy of the chain.
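A rough sketch of this REST-based block broadcast is shown below, assuming a Flask endpoint named /broadcast-block on every peer; the route and payload format are illustrative, not the system's documented interface.

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
blockchain = []  # each node keeps its own copy of the chain

@app.route("/broadcast-block", methods=["POST"])
def receive_block():
    block = request.get_json()
    # A real node would run the validity checks of Algorithm 1 here
    # (hash linkage, proof validity) before accepting the block.
    blockchain.append(block)
    return jsonify({"message": "block added"}), 201

def broadcast_block(block, peer_nodes):
    # Send the freshly mined block to every connected peer node.
    for node_url in peer_nodes:
        requests.post(f"{node_url}/broadcast-block", json=block, timeout=5)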

Fig. 9. Interface of our proposed system



Fig. 10. Blocks mined in the blockchain

The chain stored in one of the nodes is shown in Fig. 10. There are many
challenges and security requirements in developing a secure e-voting system;
some comprehensive security requirements of contemporary e-voting systems are
summarized in [16].
Our implemented blockchain based voting system satisfies several essential
criteria for transparent and trusted electronic voting. These are listed below:
Authentication: The system only allows authenticated voters who are already
registered to vote; our system does not include the registration process itself.
To simulate an election, we used about 1,000 voters in the voting process. We
tried to input the same vote key from multiple nodes, but the system
automatically detected this and did not permit multiple casts. The system is thus
able to verify voters and makes sure that no two votes are cast from a single
vote key.
Anonymity of voter: Voters in the system cast their vote using their vote key.
No identity of the voter is revealed, and the voter remains anonymous during
and after the election. The stored chain of a node can be seen in Fig. 10, where a
vote in block no. 2 is shown: no information about the voter is revealed except
the key (QFYPLMBTLTDN). This ensures the anonymity of voters in our system.
Data integrity: The system makes sure that once a vote is cast, it can never be
changed: each block is linked by hash to the prior block. To verify this, we
performed manual manipulation of the local chain of several nodes. An example
is shown in Fig. 11: we intentionally changed the proof number of a block from
17 to 23 (Fig. 11b). When we then tried to cast a vote and mine the block from
the manipulated node, the system detected the manipulation, immediately marked
the node as a violated node, and rejected the mining (Fig. 12).
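The check that catches such manipulation can be sketched as follows, reusing the hash_sha256 helper assumed earlier; changing a stored proof (e.g. from 17 to 23) breaks either the hash linkage or the difficulty condition.

def chain_is_valid(chain):
    for i in range(1, len(chain)):
        block = chain[i]
        # Hash linkage: the stored previous_hash must match the actual
        # hash of the preceding block (Algorithm 1, line 16).
        if block["previous_hash"] != hash_sha256(chain[i - 1]):
            return False
        # Difficulty: the recorded proof must still yield a valid guess hash.
        guess = hash_sha256(block["transactions"], block["previous_hash"], block["proof"])
        if not guess.startswith("000"):
            return False
    return True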
Verifiability: There are 100 nodes in our implemented system, and each node
stores a valid copy of the chain. Thus each node knows how many votes have
been cast in the system. When the final result of the election is counted, the
system makes sure that all the valid nodes agree on the final count of the votes.
So the system is verifiable and ensures that all votes are counted correctly.

(a) Before manipulation

(b) After manipulation (Changing proof from 17 to 23)

Fig. 11. Manipulation of data in the blockchain stored in a node

Fig. 12. The system rejects the mining request from the manipulated node

The system is secure against DDoS (Distributed Denial-of-Service) and Sybil
attacks, as blockchain technology is by its nature secured against DDoS attacks
[2]. In a Sybil attack [17], an individual creates a large number of fake identities
to disrupt the network; in the proposed e-voting system, participants are not
allowed to generate their own identities to cast a vote, so no individual has
access to create one.

4.1 Comparative Analysis of the Proposed Mechanism

There are several approaches to consensus in blockchain-based systems, and
different consensus mechanisms suit different blockchain use cases [18]. In our
proposed system, the consensus mechanism is not exactly Bitcoin's proof of work
(PoW). Most of the existing works in this domain directly adopted Bitcoin's
consensus mechanism, in which all the mining nodes compete with one another
to solve a cryptographic puzzle and generate a new valid hash for the block to be
mined. This mechanism consumes a lot of computational resources [19], as all
the nodes attempt the calculation (Table 1).

For the existing works, the consumption of computational resources is
proportional to O(n), where n is the number of nodes in the system. In our
system, only one node gets its turn to mine a new block, rather than all nodes
competing for the chance to mine; so the consumption of computational
resources is proportional to O(1), as only one node tries to mine a new block
in the system.

Table 1. Comparison with the existing consensus mechanism.

Candidates | Computational resource consumption
Existing works which directly adopted Bitcoin's consensus | O(n)
Proposed methodology | O(1)

In this analysis, we only considered the complexity of obtaining the turn to mine
a block; we assumed all other processing work in the blockchain to be constant.
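The paper does not specify the scheduling rule behind the predetermined turn; one simple way to realize it (our assumption, shown only to illustrate why a single node mines each block) is round-robin rotation.

def node_with_turn(node_ids, next_block_index):
    # Exactly one node mines each block, so the per-block mining cost
    # does not grow with the number of nodes: O(1) rather than O(n).
    return node_ids[next_block_index % len(node_ids)]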

5 Conclusion
In this work we proposed and implemented an electronic voting system based on
blockchain. The votes are stored as transactions inside the blocks, and the blocks
are chained together using hashes. The system is decentralized, so there is no
single point of failure. Each node mines a new block based on a predetermined
turn, instead of all nodes competing to solve a puzzle for the turn. We have also
described two potential conflict situations, adopted a conflict-resolution
methodology for our system, and shown how conflicts can be resolved. There is
scope for future research on this work: the system could be employed to conduct
real political or organizational elections to measure its scalability and execution
performance.
Our proposed voting system can address the security concerns of electronic
voting and can be useful in conducting a transparent election trusted by all. The
consensus mechanism used is not exactly Bitcoin's Proof of Work but a modified
version of it, based on a predefined turn for each node or voting center in the
blockchain, which saves a great deal of computational resources.

References
1. Eijffinger, S., Mahieu, R., Raes, L.: Inferring hawks and doves from voting records.
Eur. J. Polit. Econ. 51, 107–120 (2018)
2. Taş, R., Tanrıöver, Ö.: A systematic review of challenges and opportunities of
blockchain for e-voting. Symmetry 12, 1328 (2020)
3. Neumann, P.G.: Security criteria for electronic voting. In: Proceedings of 16th
National Computer Security Conference, pp. 478-481. Maryland (1993)
4. Esteve, J.B., Goldsmith, B., Turner, J.: International Experience with E-
Voting. Available online: https://www.parliament.uk/documents/speaker/digital-
democracy/IFESIVreport.pdf (accessed on 20 August 2020)
5. Evans, D., Paul, N.: Election security: perception and reality. IEEE Secur. Privacy
Mag. 2(1), 24–31 (2004)
6. Chaum, D.: Secret-ballot receipts: true voter-verifiable elections. IEEE Secur. Pri-
vacy Mag. 2(1), 38–47 (2004). https://doi.org/10.1109/msecp.2004.1264852
7. Daramola, O., Thebus, D.: Architecture-centric evaluation of blockchain-based
smart contract e-voting for national elections. Informatics 7, 16 (2020)
8. Ofori-Dwumfuo, G.O., Paatey, E.: The design of an electronic voting system.
Research J. Inf. Tech. 3, 91–98 (2011)
9. Curran, K.: E-voting on the blockchain. J. Br. Blockchain Assoc. 1, 1–6 (2018)
10. Ayed, A.B.: A conceptual secure blockchain based electronic voting system. Int. J.
Net. Secur. Appl. 9, 1–9 (2017)
11. Hjalmarsson, F., Hreioarsson, G., Hamdaqa, M., Hjalmtysson, G.: Blockchain-
based e-Voting system. In: 2018 IEEE 11th International Conference on Cloud
Computing (CLOUD), pp. 983–986. IEEE (2018)
12. Lee, K., James, J.I., Ejeta, T.G., Kim, H.J.: Electronic voting service using block-
chain. J. Dig. Forens., Secur. Law. 11, 123 (2016)
13. Yi, H.: Securing e-voting based on blockchain in P2P network. J Wireless Com
Netw. 2019, 137 (2019)
14. Bellini, B., Ceravolo, P., Damiani, E.: Blockchain-based e-vote-as-a-service. In:
2019 IEEE 12th International Conference on Cloud Computing (CLOUD), pp.
484–486. IEEE (2019)
15. Zyskind, G., Nathan, O., Pentland, A.: Decentralizing privacy: using Blockchain
to protect personal data. In: 2015 IEEE Security and Privacy Workshops, pp.
180–184. IEEE (2015)
16. Wang, K.H., Mondal, S.K., Chan, K., Xie, X.: A review of contemporary e-voting:
requirements technology systems and usability. Data Sci. Pattern Recogn. 1, 31–47
(2017)
17. Douceur, J.R.: The Sybil Attack. In: International Workshop on Peer-to-Peer Sys-
tems, WA, USA (2002)
18. Zheng, Z., Xie, S., Dai, H.N., Chen, X., Wang, H.: Blockchain challenges and
opportunities: a survey. Int. J. Web Grid Serv. 14(4), 352–375 (2018)
19. Bach, L.M., Mihaljevic, B., Zagar, M.: Comparative analysis of blockchain consen-
sus algorithms. In: 2018 41st International Convention on Information and Commu-
nication Technology, Electronics and Microelectronics (MIPRO), pp. 1545–1550.
IEEE (2018)
Preconditions for Optimizing Primary Milk Processing

Gennady N. Samarin1,2(&), Alexander A. Kudryavtsev1, Alexander G. Khristenko3,
Dmitry N. Ignatenko4, and Egor A. Krishtanov5

1 Federal State Budgetary Scientific Institution “Federal Scientific
Agroengineering Center VIM” (FSAC VIM), Moscow, Russia
samaringn@yandex.ru, kudralex94@yandex.ru
2 Northern Trans-Ural State Agricultural University, Tyumen, Russia
3 Novosibirsk State Agrarian University, Novosibirsk, Russia
a-hristenko@mail.ru
4 Prokhorov General Physics Institute of the Russian Academy of Sciences,
Moscow, Russia
DmitriyEK13104@yandex.ru
5 St. Petersburg State Agrarian University, Saint-Petersburg, Pushkin, Russia
dekanazam@mail.ru

Abstract. 99% of freshly drawn milk produced on farms in Russia is cooled
with artificial cold produced by refrigeration machines, which consume large
amounts of electricity (up to 0.00944 kW·h per 1 kg of milk when cooled from
30…35 to 3…5 °C). Therefore, the goal of this work is to optimize the process
of primary milk processing using energy efficiency as a target function, and the
objective is to study the layered souring of milk in an open container without a
mixing system. From experimental studies of layered milk souring in an open
container without a mixing system, the dependences of the changes in titratable
acidity and in the number of bacteria in the lower and upper layers of milk in the
container at a certain temperature over time have been obtained. As a result of
the research conducted, adequate regression equations for the titratable acidity
and the number of colony-forming units in unit volumes of the lower milk layer
as functions of storage time have been obtained. The practical significance of the
results is that in milk storage tanks, decontamination devices should be installed
at the top.

Keywords: Milk · Primary processing · Alternative methods · Milk quality
indicators · Specific energy efficiency · Optimization

1 Introduction

Currently, the Russian Federation (RF) is one of the world's largest producers of milk
and dairy products, but the share of commercial milk in its total production is low, at
57%. The RF achieves about half the milk productivity per cow of developed
countries.


The two main challenges the dairy industry in the Russian Federation faces are
reducing its dependence on imported products and meeting the increased demand for
commercial milk.
The solution to the first problem stems from the country's food security: the share
of domestic products should be 90% of the total volume of commodity resources, and
this can be achieved by replacing imported products with domestic ones, i.e., by
increasing the production of commercial milk.
In solving the first problem, a natural solution to the second one appears, that is,
increasing the production of raw commercial milk while not losing, but increasing,
the demand for it [1, 2].
In milk production, cooling is one of the most important factors. When cooled, the
bacteria in the milk do not die but fall into suspended animation. When the bacteria get
into comfortable temperature conditions, which can occur at any time, including
during storage and transportation of milk (sometimes even 10 °C is enough), they
begin to multiply intensively, thereby affecting the quality of the product and,
accordingly, its cost [3–6].
Combining these two tasks, we can conclude that alternative methods and technical
means to kill bacteria are needed to solve them.
Analyzing the regulations of different economic zones and countries, we can see the
following. For the Russian Federation, Federal Laws (FL) FL-88 and FL-163 “Technical
Regulations for Milk and Dairy Products” contain general and specific safety
requirements; State Standard GOST R 52054–2003 “Cow's milk raw. Specifications”
contains general requirements for the production of all types of dairy
products [7].
For the European Economic Community (EEC) there is a Council Directive
92/46/EEC of 16 June 1992 laying down the health rules for the production and
placing on the market of raw milk, heat-treated milk and milk-based products; Council
Directive 2002/99/EC laying down the animal health rules governing the production,
processing, distribution and introduction of products of animal origin for human
consumption; Regulation (EC) No 852/2004 of the European Parliament and of the
Council of 29 April 2004 on the hygiene of foodstuffs; Regulation (EC) No 853/2004
of the European Parliament and of the Council of 29 April 2004 laying down specific
hygiene rules for food of animal origin; Regulation (EC) No 854/2004 of the European
Parliament and of the Council of 29 April 2004 laying down specific rules for the
organisation of official controls on products of animal origin intended for human
consumption; Regulation (EC) No 882/2004 of the European Parliament and of the
Council of 29 April 2004 on official controls performed to ensure the verification of
compliance with feed and food law, animal health and animal welfare rules [8, 9].
For the Eurasian Customs Union (EACU), there are technical regulations (TR):
“TR CU 033/2013. Technical Regulations of the Customs Union. On the Safety of
Milk and Dairy Products” [7].
The definition of the “quantity of mesophilic aerobic and facultative anaerobic
microorganisms” (QMAFAnM) refers to the estimate of the size of the group of sanitary
indicative microorganisms. Different groups of microorganisms are part of the
QMAFAnM: bacteria, fungi, yeast and others. Their total number indicates the sanitary
condition of the product and the degree of its insemination with microflora. The best
temperature for QMAFAnM growth is 35…37 °C (aerobic); the temperature boundary of
their growth is within 20…45 °C [7].
The general safety criterion in different economic zones and countries in terms of the
key quality indicator of raw milk, QMAFAnM, according to literary sources [10–14], is
as follows.
In the EU, the quality of milk is considered satisfactory if the value of QMAFAnM
is less than 100×10³ CFU/g. (The abbreviation CFU stands for Colony Forming Unit
and denotes the number of bacteria that are capable of forming a full-fledged microbial
colony [7, 15–17].)
For the Customs Union (Republic of Armenia; Republic of Belarus; Republic of
Kazakhstan; Kyrgyz Republic; Russian Federation), the QMAFAnM value has to be
less than 4000×10³ CFU/g.
In the Russian Federation and Ukraine, the QMAFAnM value has to be less than
4000×10³ CFU/g.
In order to reduce the number of microorganisms in milk, it is necessary to observe
the sanitary rules of automatic milking. However, as we know from literary sources
[18–21], even a small number of microorganisms in milk can affect its storage, so it is
necessary to take measures to destroy or suspend the development of microorganisms
that got into it.
The most effective and accessible method on livestock farms is the one that
involves cooling and heating the milk.
The vital activity of many microorganisms found in milk slows down sharply when
it is cooled to 8…10 °C and is completely suspended when cooled to 2…3 °C. When
the temperature drops to minus 20 °C and fluctuates from 20 °C to minus 20 °C, a
small fraction of the microorganisms dies [20, 23].
When milk is heated to 50 °C, 68% of microorganisms die, and if the milk is kept
for 20 min at this temperature, more than 99% of all microorganisms contained in it are
eliminated. To completely destroy the microorganisms and their spores it is necessary
to bring the milk to a temperature of 120 °C and sustain the temperature level for 15…
20 min.
The most significant factor influencing the growth and development of bacteria is
positive temperature, while freezing leads to the slow destruction of the product, as
ice crystals break cell membranes [22–24].
Light also affects the development of bacteria: in the presence of ultraviolet
light (sunlight), bacteria are destroyed [16, 25].
Depending on their need for oxygen, microorganisms can be classified as aerobic,
anaerobic, facultative aerobic/anaerobic (a typical example being common lactic acid
bacteria), and microaerophilic.
The process of cooling milk requires high energy costs, special equipment and its
centralized maintenance.
Alternative methods of primary milk processing include processes such as exposure
to ultraviolet, microwave radiation, infrared electric heating, ultrasound, ultra-high
pressure, electroprocessing (electrochemical processing), bactofugation, sterilization,
and others [1, 7, 21].

Therefore, the goal of this work is to develop and justify the constructive and
technological parameters and modes of operation of the container-type milk decon-
tamination unit by the energy efficiency criteria.
Based on this goal, we have outlined the following objectives: to develop a tech-
nological scheme for milk decontamination with alternative variants allowing for
repeated exposure; to justify constructive and technological parameters and optimal
operation modes that ensure the required milk quality indicators at minimal energy
costs.

2 Materials and Methods

The real processes of heat and mass exchange in the unit can be described via
mathematical modeling; the mathematical model is therefore defined by a system of
three equations (1, 2, 3),

where:
W – energy consumption of the unit (pump and radiator), kW·h;
N_P – power of the drive of the pump supplying milk to the container (may not be needed for small volumes of milk), kW;
τ₁ – operation time of the milk pump, s;
ΣN_U – total power consumed by the radiator of the unit, kW;
τ₂ – total time of milk irradiation, s;
f₁, f₂, f₃ – functionals;
ΣG_Mi – total fractional mass of milk, kg;
ΔP – pressure drop during pumping between the entrance and exit of the unit, Pa;
F_S – cross-sectional area of the unit, m²;
Δn_MB – difference in the number of bacteria at the entrance and exit of the unit, units;
η_U – efficiency of the radiator;
N_G – power of the radiator generator, kW;
G_A – mass of air that comes into contact with the unit, kg;
t_IM – initial milk temperature, K;
t_IA – initial ambient air temperature, K;
F_V – surface area of the container in the unit, m²;
k_P – heat transfer coefficient, W/(m²·K);
η_t – thermal efficiency of the radiator;
l_Hi – fractional factor of freshly drawn milk;
t_Ki – souring speed ratio of each milk fraction during storage, 1/s;
τ₃ – duration of milk storage, s;
G_Mi – mass of each fraction of milk, kg;
h_yI – depth of the radiator immersion;
η_M – bactericidal efficiency of the radiator.
As a target function of optimization of these parameters, we have set minimal
energy consumption while getting the longest duration of milk storage.
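Since the closed forms of f1, f2 and f3 are not reproduced in the text, the following Python sketch only illustrates the bookkeeping of the energy term W from the nomenclature above (pump energy plus radiator energy); it is our reading, not the authors' full model.

def unit_energy_kwh(pump_power_kw, pump_time_h, radiator_powers_kw, irradiation_time_h):
    # W = N_P * tau_1 + sum(N_U) * tau_2, with powers in kW and times
    # expressed in hours so that the result is in kW·h.
    return pump_power_kw * pump_time_h + sum(radiator_powers_kw) * irradiation_time_h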

All studies to determine the physical and chemical characteristics of milk were
conducted according to conventional methods and the relevant state standards:
GOST R 52054–2003 “Cow's milk raw. Specifications”; GOST 3624–92 “Milk and
milk products. Titrimetric methods of acidity determination”; GOST 32901–2014
“Milk and milk products. Methods of microbiological analysis”. The latter method is
based on the reduction of resazurin by redox enzymes secreted into milk by
microorganisms.

3 Results

The purpose of this experiment was to determine the nature of milk souring in different
layers of the liquid over time τ. The study of layered souring of freshly drawn milk in an
open container without mixing was conducted at an ambient air temperature of about
20 °C. During the experiment, we measured three main parameters of the milk:
temperature t, titratable acidity TA, and the number of CFU N in a unit volume of the
liquid.
The results of the studies of the milk's temperature and titratable acidity are
presented in Table 1. The titratable acidity of milk is measured in degrees Thörner, °Th.

Table 1. Results of the experimental research no. 1.

Measurement time τ, h | t of upper layer (UL), °C | t of lower layer (LL), °C | TA of UL milk, °Th | TA of LL milk, °Th
10 | 16.05 | 16.32 | 15.84 | 15.97
14 | 18.30 | 17.89 | 18.38 | 16.73
18 | 18.74 | 19.38 | 30.16 | 16.66
22 | 19.25 | 19.69 | 48.84 | 17.62
6  | 18.27 | 19.09 | 54.71 | 19.23
10 | 18.74 | 19.09 | 42.11 | 20.28
14 | 18.39 | 19.44 | 43.28 | 25.74
18 | 18.61 | 19.05 | 40.67 | 32.32

Figure 1 shows a graphical representation of the results of experiment no. 1.

Mathematical processing of the data of experiment no. 1 [26, 27] allowed us to
obtain an adequate regression Eq. (4), which describes the dependence of the titratable
acidity in the lower milk layer (TA_LL, °Th) on its storage time τ, h, with a coefficient
of determination R² = 0.944:

TA_LL = 0.5335·τ² − 2.0043·τ + 17.103.   (4)



Fig. 1. The dependence of the change in titratable acidity in the lower and upper milk layers in
the container at a certain temperature over time.

The results of temperature studies and CFU numbers in the unit volumes of liquid
are presented in Table 2.

Table 2. Results of the experimental research no. 2.

Measurement time τ, h | t of upper layer (UL), °C | t of lower layer (LL), °C | N_UL, million bacteria/ml | N_LL, million bacteria/ml
10 | 16.05 | 16.32 | 0.3  | 0.3
14 | 18.30 | 17.89 | 1.75 | 0.4
18 | 18.74 | 19.38 | 8.0  | 0.4
22 | 19.25 | 19.69 | 20.0 | 0.4
6  | 18.27 | 19.09 | 20.0 | 1.75
10 | 18.74 | 19.09 | 20.0 | 1.75
14 | 18.39 | 19.44 | 20.0 | 1.75
18 | 18.61 | 19.05 | 20.0 | 1.75
22 | –     | –     | 20.0 | 20.0

Figure 2 shows a graphical representation of the results of the experiment no. 2.



Fig. 2. The dependence of changes in the number of bacteria in the lower and upper layers of
milk in the container at a certain temperature over time.

Mathematical processing of the experimental data [26, 27] allowed us to obtain an
adequate regression Eq. (5), which describes the dependence of the number of CFUs in
unit volumes of liquid in the lower milk layer (N_LL, million bacteria/ml) on its
storage time τ, h, with a coefficient of determination R² = 0.9933:

N_LL = 0.0261·τ⁵ − 0.5755·τ⁴ + 4.641·τ³ − 16.688·τ² + 26.077·τ − 13.292.   (5)
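For readers who wish to reproduce the curves, regressions (4) and (5) can be evaluated directly; the coefficients are copied from the equations above, and numpy.polyval is used purely for convenience.

import numpy as np

def titratable_acidity_ll(tau_h):
    # Eq. (4): TA_LL = 0.5335*tau^2 - 2.0043*tau + 17.103, in °Th
    return np.polyval([0.5335, -2.0043, 17.103], tau_h)

def bacteria_ll(tau_h):
    # Eq. (5): N_LL, million bacteria/ml
    return np.polyval([0.0261, -0.5755, 4.641, -16.688, 26.077, -13.292], tau_h)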

4 Discussion

Equations (4) and (5) and their graphical interpretation (Figs. 1 and 2) suggest that
milk souring in the container begins with the upper layer. Consequently, when treating
milk with ultrasound and microwave or ultraviolet radiation, the milk should be treated
from the top down.
The information obtained can be used in the following stages: the development of
new technological schemes for milk decontamination using alternative methods, and of
new machines containing an optimized number of radiators, allowing milk to be
decontaminated repeatedly during processing.

5 Conclusions

In the course of the research conducted by the authors of this article, it was found that
the JSC “Krasnoe Znamya” farm, located in the Novosokolniki district of the Pskov
region of the Russian Federation, consumes 132,500 kWh per year during the storage
of 3,900 tons of milk.
Milk in an open container begins to sour from its upper layer; therefore, the
required location of devices for preserving milk using alternative methods (ultrasound,
microwave radiation, ultraviolet and other options) was experimentally confirmed.
The milk souring process is described by regression Eqs. (4) and (5).

References
1. Samarin, G., Vasiliev, A.N., Vasiliev, A.A., Zhukov, A., Krishtopa, N., Kudryavtsev, A.:
Optimization of technological processes in animal husbandry. In: International Conference
on Efficient Production and Processing (ICEPP-2020), E3S Web Conferences, vol. 161,
p. 1094 (2020)
2. Samarin, G.N.: Energosberegayushchaya tekhnologiya formirovaniya sredy obitaniya
sel'skohozyajstvennyh zhivotnyh i pticy: monografiya [Energy-saving technology for
creating the habitat of farm animals and poultry: monograph]/G.N. Samarin. V. P. Goryachkin
Moscow state agrarian University, Moscow, 215 p. (2008) . (in Russian)
3. Boor, K.J.: Pathogenic microorganisms of concern to the dairy industry. Dairy Food
Environ. Sanitation 17, 714–717 (1997)
4. Chambers, J.V.: The microbiology of raw milk. In: Robinson, R.K. (ed.) Dairy Microbiology
Handbook, 3rd edn, pp. 39–90 (2002). Wiley, New York
5. Ye, A., Cui, J., Dalgleish, D., et al.: Effect of homogenization and heat treatment on the
behavior of protein and fat globules during gastric digestion of milk. J. Dairy Sci. 100(1),
36–47 (2017)
6. Liang, L., Qi, C., Wang, X., et al.: Influence of homogenization and thermal processing on
the gastrointestinal fate of bovine milk fat: in vitro digestion study. J. Agric. Food Chem.
65(50), 11109–11117 (2017)
7. Coles, K.: Information Regarding US Requirements for the Importation of Milk and Dairy
Products/Washington Department of Agriculture. Food Safety Program. Milknews - News of
the dairy market [Electronic resource] – Electron. text data Moscow (2016). https://
milknews.ru/index/novosti-moloko_6294.html.
8. Bisig, W., Jordan, K., Smithers, G., Narvhus, J., Farhang, B., Heggum, C., Farrok, C.,
Sayler, A., Tong, P., Dornom, H., Bourdichon, F., Robertson, R.: The technology of
pasteurisation and its effects on the microbiological and nutritional aspects of milk,
p. 36. International Dairy Federation, IDF, Brussels (2019)
9. Mack, G., Kohler, A.: Short and long-rung policy evaluation: support for grassland-based
milk production in Switzerland. J. Agric. Econ. 2018, 1–36 (2018)
10. Aiassa, E., Higgins, J.P.T., Frampton, G.K., et al.: Applicability and feasibility of systematic
review for performing evidence-based risk assessment in food and feed safety. Crit. Rev.
Food Sci. Nutr. 55(7), 1026–1034 (2015)
11. Finnegan, W., Yan, M., Holden, N.M., et al.: A review of environmental life cycle
assessment studies examining cheese production. Int. J. Life Cycle Assess. 23(9), 1773–
1787 (2018)

12. Djekic, I., Miocinovic, J., Tomasevic, I., et al.: Environmental life-cycle assessment of
various dairy products. J. Cleaner Prod. 68, 64–72 (2014)
13. Depping, V., Grunow, M., van Middelaar, C., et al.: Integrating environmental impact
assessment into new product development and processing-technology selection: milk
concentrates as substitutes for milk powders. J. Cleaner Prod. 149, 1 (2017)
14. Bacenetti, J., Bava, L., Schievano, A., et al.: Whey protein concentrate (WPC) production:
environmental impact assessment. J. Food Eng. 224, 139–147 (2018)
15. Carlin, F.: Origin of bacterial spores contaminating foods. Food Microbiol. 28(2), 177–182
(2011). Special Issue SI
16. Coorevits, A., De Jonghe, V., Vandroemme, J., et al.: Comparative analysis of the diversity
of aerobic spore-forming bacteria in raw milk from organic and conventional dairy farms.
Syst. Appl. Microbiol. 31(2), 126–140 (2008)
17. Cusato, S., Gameiro, A.H., Corassin, C.H., et al.: Food safety systems in a small dairy
factory: implementation, major challenges, and assessment of systems’ performances.
Foodborne Pathog. Dis. 10(1), 6–12 (2013)
18. Doll, E.V., Scherer, S., Wenning, M.: Spoilage of Microfiltered and Pasteurized Extended
Shelf Life Milk Is Mainly Induced by Psychrotolerant Spore-Forming Bacteria that often
Originate from Recontamination. Front. Microbiol. 8, 135 (2017)
19. Boor, K.J., Murphy, S.C.: The microbiology of market milks. In: Robinson, R.K. (ed.) Dairy
Microbiology Handbook, 3rd edn, pp. 91–122. Wiley, New York, (2002)
20. Doyle, C.J., Gleeson, D., Jordan, K., et al.: Anaerobic sporeformers and their significance
with respect to milk and dairy products. Int. J. Food Microbiol. 197, 77–87 (2015)
21. O’Riordan, N., Kane, M., Joshi, L., et al.: Structural and functional characteristics of bovine
milk protein glycosylation. Glycobiology 24(3), 220–236 (2014)
22. Huck, J.R., Sonnen, M., Boor, K.J.: Tracking heat-resistant, cold-thriving fluid milk spoilage
bacteria from farm to packaged product. J. Dairy Sci. 91(3), 1218–1228 (2008)
23. Chavan, R.S., Chavan, S.R., Khedkar, C.D., et al.: UHT milk processing and effect of
plasmin activity on shelf life: a review. Compr. Rev. Food Sci. Food Saf. 10(5), 251–268
(2011)
24. Mafart, P., Leguerinel, I., Couvert, O., et al.: Quantification of spore resistance for
assessment and optimization of heating processes: a never-ending story. Food Microbiol.
27(5), 568–572 (2010)
25. Luecking, G., Stoeckel, M., Atamer, Z., et al.: Characterization of aerobic spore-forming
bacteria associated with industrial dairy processing environments and product spoilage. Int.
J. Food Microbiol. 166(2), 270–279 (2013)
26. Samarin, G.N., Vasilyev, A.N., Zhukov, A.A., Soloviev, S.V.: Optimization of microclimate
parameters inside livestock buildings. In: Vasant, P., Zelinka, I., Weber, G.W. (eds) Intel-
ligent Computing & Optimization. ICO 2018. Advances in Intelligent Systems and
Computing, vol 866. Springer, Cham (2018).
27. Özmen, A., Weber, G., Batmaz, İ.: The new robust CMARS (RCMARS) method. In:
International Conference 24th Mini EURO Conference “Continuous Optimization and
Information-Based Technologies in the Financial Sector” (MEC EurOPT 2010), Izmir,
Turkey, 23–26 June 2010 (2010).
Optimization of Compost Production Technology

Gennady N. Samarin1,2(&), Irina V. Kokunova3, Alexey N. Vasilyev2,
Alexander A. Kudryavtsev2, and Dmitry A. Normov4

1 Northern Trans-Ural State Agricultural University, Tyumen, Russia
samaringn@yandex.ru
2 Federal State Budgetary Scientific Institution “Federal Scientific
Agroengineering Center VIM” (FSAC VIM), Moscow, Russia
vasilev-viesh@inbox.ru, kudralex94@yandex.ru
3 Velikie Luki State Agricultural Academy, Velikie Luki, Pskov region, Russia
i.kokunova@yandex.ru
4 Kuban State Agrarian University Named After I.T. Trubilin, Krasnodar, Russia
danormov@mail.ru

Abstract. Due to the intensive development of livestock production in the
world, there is an increase in the amount of waste generated by the production
activities of agricultural enterprises. This leads us to formulate new goals:
minimizing the cost of livestock products while taking into account the
reduction of the negative impact on the environment, and developing promising
methods and technical means for the disposal of livestock waste. One of the
promising methods of manure utilization, including its liquid and semi-liquid
fractions, is composting. To speed up this process and produce high-quality
organic composts, special technical means are needed, capable of mixing and
grinding the components of the mixtures. Considering this, the creation of
compost mixers and the development of new technologies for the disposal of
livestock waste on their basis is an urgent task.

Keywords: Environmental safety · Livestock waste · Manure disposal ·
Organic composts · Composting phases · Compost mixers

1 Introduction

One effective way to dispose of livestock waste is to produce organic composts based
on it. Peat, straw, sapropels, logging waste and organic waste from industrial plants, as
well as other ingredients, can be used as components of compost mixtures. The
additional costs of compost production are paid off by increasing crop yields and
improving soil fertility [1–5].
Modern composting is an aerobic process and consists of several stages: lag stage,
mesophilic, thermophilic and maturation. The timing of the compost maturation and the
quality of the fertilizers obtained depend on the nature of these processes. At the first
stage of composting, microorganisms “adapt” to the type of components of the mixture


and living conditions in the compost heap. At the same stage, organic components
begin to decompose, but the total population of microorganisms and temperature of the
mass are still small. In the second phase, the decomposition of substrates increases
and the number of microorganisms grows constantly. At the beginning, simple sugars
and carbohydrates are decomposed; from the moment of their depletion, bacteria
start processing cellulose and proteins, while secreting a complex of organic acids
necessary to feed other microorganisms [6–11].
The third phase of composting is accompanied by a significant increase in tem-
perature caused by an increase in the number of thermophilic bacteria and their
metabolic products. The temperature of 55 °C is harmful for most pathogenic and
conditionally pathogenic microorganisms of humans and plants. However, it does not
affect aerobic bacteria, which continue the composting process: accelerated breakdown
of proteins, fats and complex carbohydrates. When food resources run out, the tem-
perature of compost mass begins to gradually decrease [7, 12–14].
In the fourth stage, mesophilic microorganisms begin to dominate the compost
heap, and the temperature indicates the onset of the maturation stage. In this phase, the
remaining organic matter forms complexes that are resistant to further decomposition.
They are called humic acids or humus. The resulting compost is easily loosened and
becomes free-flowing. Truck-mounted organic fertilizer spreaders can be used to dis-
perse it onto the fields [8, 14–17].
In order to produce quality organic composts and shorten their maturation, it is
necessary to thoroughly mix and grind the components of the mixture both during
laying down the heaps and at their ripening. In this case, the oxygen supply to different
areas of compost heaps is improved, the formation of “dead zones” is excluded,
microbiological processes are intensified. To perform these operations, special equip-
ment is needed: compost aerator-mixers [15, 18–20].
As promising ways of disposing of livestock waste, we should note composting with
the use of special technical means: aerator-mixers of compost heaps [21–23].
This leads us to formulate new goals: minimizing the cost of livestock products
while taking into account the reduction of negative environmental impact, as well as
developing promising methods and technical means for the utilization of livestock
waste.

2 Materials and Methods

In this article, the authors specify the prerequisites for optimizing compost production
technology in different ways [24–26]. The task of optimizing the economic parameters
of compost production technology can, in mathematical terms, be reduced to finding
the minimum value of the accepted target function, the specific costs of compost
production SC′ [27–29]:

SC′ = DC + E_S · CI → min,   (1)

where DC are the specific direct costs of compost production, rub/kg;
E_S is the standard return rate on capital investments, E_S = 0.15;
CI are the total specific capital investments in the technological process, rub/kg.
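As an illustration of how target function (1) ranks technology variants, consider the following sketch; the variant figures are hypothetical placeholders, not data from the study.

E_S = 0.15  # standard return rate on capital investments

def specific_costs(dc_rub_per_kg, ci_rub_per_kg):
    # SC' = DC + E_S * CI  (Eq. 1)
    return dc_rub_per_kg + E_S * ci_rub_per_kg

# Hypothetical variants, (DC, CI) in rub/kg.
variants = {"variant A": (1.8, 4.0), "variant B": (1.5, 6.5)}
best = min(variants, key=lambda v: specific_costs(*variants[v]))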
At present, quite a variety of models of equipment for aeration of compostable
organic mass are being produced. These can be machines with their own wheeled or
tracked running (self-propelled or attached to tractors), as well as tunnel (bridge)
agitators, propelled with various consoles and driven by electric or gasoline engines.
They differ from each other, among other things, by the arrangement of the main
working body (agitator drum), productivity, and price [30].
In particular, to improve the quality of operation of the machine, we propose to
change the design of the agitator drum. It is known that the helical-feed drum devices
are broadly used for the mixing of various loose and viscous components, as well as for
their grinding and transporting. The drum usually consists of a hollow pipe and a
trunnion with blades, and a screw or helical feeder. The axis of the drum rotation is
mostly horizontal [31].
Analysis of studies of the helical-feed drums of different design showed that the
drums of the 1st variant of the experiment formed a fertilizer distribution diagram with
a depression in the central part and two peaks. The presence of cavities and peaks on
the diagrams of the distribution of organic fertilizers indicates a highly uneven dis-
tribution of fertilizers in the middle zone of the feeder working body.
Removing the loops of the helical feed in the middle part of the drum (variant 2), as
well as installing radial blades instead (variant 3) or anti-loops (variant 4) eliminates the
depression in the central part of the distribution diagram. In addition, the density of
fertilizer distribution in this part is significantly increased, which allows to produce the
required form of compost heaps. The best variant in our case is the 3rd one, as the
installation of additional blades will not only contribute to the formation of a conical
heap profile, but also provide better aeration of compostable mass.
Studies of the process of agitating organic composts with an upgraded heap aerator
were carried out using an experimental device developed by the authors; the TG 301
compost-heap aerator (Fig. 1) from an Italian manufacturer, which has an acceptable
price and sells well on the Russian market, was taken as its analogue. It is a
semi-mounted machine, supported by two pneumatic wheels during operation. The
machine has a rounded arc-type frame and a main working body in the shape of a
rotor, with agitator blades with notches on their edges installed in a circular
formation, which crash well into the compostable mass and grind it. However, the
agitator does not mix the components of compost mixtures well enough and does not
ensure the formation of a heap of a given shape. The machine is equipped with an
apron reflector involved in the formation of a heap. It operates with tractors of
traction class 1.4, and the working body is driven from the tractor power take-off shaft.

3 Results

The results of the study of the parameters of the compost heaps agitator and the
experimental results are presented in Table 1.
Figures 2, 3 and 4 contain graphical representations of the experimental results.

Fig. 1. TG 301 compost heap aerator-mixer.

Table 1. The results of the study of the parameters of the aerator (agitator-mixer) of compost
heaps.

No | x1 | x2 | x3 | ω, min⁻¹ | n, units | Vtr, km/h | N1, W·h | N2, W·h | N3, W·h | Navg, W·h
1  |  1 |  1 |  0 | 230 | 6 | 0.20 | 62.98 | 64.29 | 69.54 | 65.60
2  | −1 | −1 |  0 | 170 | 2 | 0.20 | 48.60 | 49.60 | 52.10 | 50.10
3  |  1 | −1 |  0 | 230 | 2 | 0.20 | 60.19 | 60.19 | 67.72 | 62.70
4  | −1 |  1 |  0 | 170 | 6 | 0.20 | 52.28 | 51.74 | 57.67 | 53.90
5  |  1 |  0 |  1 | 230 | 4 | 0.25 | 77.72 | 74.58 | 83.21 | 78.50
6  | −1 |  0 | −1 | 170 | 4 | 0.15 | 45.70 | 46.65 | 50.46 | 47.60
7  |  1 |  0 | −1 | 230 | 4 | 0.15 | 54.14 | 55.27 | 59.78 | 56.40
8  | −1 |  0 |  1 | 170 | 4 | 0.25 | 61.75 | 63.05 | 70.20 | 65.00
9  |  0 |  1 |  1 | 200 | 6 | 0.25 | 71.38 | 68.50 | 76.43 | 72.10
10 |  0 | −1 | −1 | 200 | 2 | 0.15 | 45.01 | 44.08 | 50.11 | 46.40
11 |  0 |  1 | −1 | 200 | 6 | 0.15 | 13.68 | 13.82 | 15.70 | 14.40
12 |  0 | −1 |  1 | 200 | 2 | 0.25 | 3.84  | 3.72  | 4.19  | 3.92
13 |  0 |  0 |  0 | 200 | 4 | 0.20 | 5.92  | 6.05  | 6.54  | 6.17
14 |  0 |  0 |  0 | 200 | 4 | 0.20 | 6.30  | 6.23  | 6.55  | 6.36
15 |  0 |  0 |  0 | 200 | 4 | 0.20 | 5.93  | 6.05  | 6.35  | 6.11

(x1, x2, x3 are the coded variation levels of ω, n and Vtr; the output parameter is the
full power N, W·h.)

Fig. 2. The dependence of the full power consumption of the compost agitator on the frequency
of drum rotation and the speed of mass supply.


Fig. 3. The dependence of the full power consumption of the compost agitator on the frequency
of blades rotation and the number of tossing blades.


Fig. 4. The dependence of the full power consumption of the compost agitator on the number of
tossing blades and the speed of mass supply

As a result of multifactor regression analysis based on the results of our research,
we obtained the dependence of the full power required for mixing the components of
compost mixtures on the rotation frequency of the agitator drum ω, the number of
tossing blades n, and the speed of supply of the organic mass onto the agitator drum,
that is, the forward speed of the machine Vtr.
After the subsequent multifactor regression analysis, excluding insignificant
effects, we obtained the regression equation

N = 59.6 + 5.825·b₁ + 10.9375·b₃ + 2.25·b₃²,   (2)

where N is the full power of the aerator (agitator-mixer) of heaps, W·h, and b₁, b₃ are
the coded values of ω and Vtr (see Table 1).


From the presented data, we can conclude that model (2) fits well: the coefficient of
determination is quite high (R² = 99.0818%), and the resulting model explains 97.43%
of the variation in N.
The model is significant, as there is a statistically significant relationship between
the variables. There is no noticeable autocorrelation among the experimental values in
the matrix, as the Durbin-Watson (DW) statistic is higher than 1.4.
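Regression (2) can be evaluated over the coded factor grid of Table 1 to locate the minimum-energy setting; the mapping of coded levels to natural values follows the table, and the sketch is our illustration, not part of the original analysis.

from itertools import product

def full_power(b1, b3):
    # Eq. (2): N = 59.6 + 5.825*b1 + 10.9375*b3 + 2.25*b3**2
    return 59.6 + 5.825 * b1 + 10.9375 * b3 + 2.25 * b3 ** 2

levels = (-1, 0, 1)  # coded levels for omega = 170/200/230 min^-1 and Vtr = 0.15/0.20/0.25 km/h
b1_min, b3_min = min(product(levels, levels), key=lambda b: full_power(*b))
# b1_min = b3_min = -1: minimum power at the lowest drum speed and feed
# rate, in line with the Discussion section below.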

4 Discussion

Taking into account the values of the coefficients of the obtained mathematical model
and analyzing the response surface (Fig. 2), we note that the greatest influence on the
power consumed by the machine is exerted by the rotation frequency of the agitator
drum and the rate of supply of compostable mass to the working body. As these
parameters increase, energy costs grow.
From Fig. 3 we can see that the highest values of power consumed, required to
mix the compostable organic mass, are observed at the maximum rotation rate of the
agitator drum (230 min⁻¹) and at the maximum rate of feeding the mass to the working
body (0.25 km/h).
From the dependence of energy costs (Fig. 4) on the frequency of drum rotation
and the speed of feeding the compostable mass, the minimum energy intensity is
achieved at the minimum forward speed of the unit and the minimum rotation speed of
the agitator drum.

5 Conclusions

Based on the research conducted by the authors, the regression Eq. (2) has been
obtained; it describes the dependence of the full power required for mixing the
components of compost mixtures on the rotation rate ω of the drum, the number of
tossing blades n, and the rate of feeding the organic mass onto the agitator drum.
The research will be used in the design of a three-dimensional model of the
upgraded aerator-mixer.

References
1. Abdalla, M.A., Endo, T., Maegawa, T., Mamedov, A., Yamanaka, N.: Effectiveness of
organic amendment and application thickness on properties of a sandy soil and sand
stabilization. J. Arid Environ. 183, 104273 (2020). https://doi.org/10.1016/j.jaridenv.2020.
104273
2. El-Haddad, M.E., Zayed, M.S., El-Sayed, G.A.M., Abd EL-Safer, A.M.: Efficiency of
compost and vermicompost in supporting the growth and chemical constituents of salvia
officinalis L. cultivated in sand soil. Int. J. Recycl. Org. Waste Agric. 9(1), 49–59 (2020).
https://doi.org/10.30486/IJROWA.2020.671209
3. Chen, T.M., Zhang, S.W., Yuan, Z.W.: Adoption of solid organic waste composting
products: a critical review. J. Cleaner Prod. 272, 122712 (2020). https://doi.org/10.1016/j.
jclepro.2020.122712
4. Alavi, N., Daneshpajou, M., Shirmardi, M., et al.: Investigating the efficiency of co-
composting and vermicomposting of vinasse with the mixture of cow manure wastes,
bagasse, and natural zeolite. Waste Manage. 69, 117–126 (2017)
5. Guo, H.N., Wu, S.B., Tian, Y.J., Zhang, J., Liu, H.T.: Application of machine
learning methods for the prediction of organic solid waste treatment and recycling
processes: a review. Bioresour. Technol. 319, 124114 (2020). https://doi.org/10.1016/j.
biortech.2020.124114

6. Kokunova, I.V., Kotov, E.G., Ruzhev, V.A.: Razrabotka klassifikacii tekhnicheskih sredstv
dlya proizvodstva organicheskih kom-postov (Development of classification of technical
means for the production of organic compost). In: The Role of Young Scientists in Solving
Urgent Problems of the Agro-Industrial Complex: Collection of Scientific Papers
International Scientific and Practical Conference. Saint Petersburg-Pushkin, pp. 179–182
(2018). (in Russian)
7. Akhtar, N., Gupta, K., Goyal, D. et al.: Recent advances in pretreatment technologies for
efficient hydrolysis of lignocellulosic biomass. Environ. Prog. Sustain. Energy 35(2),
pp. 489–511 (2016)
8. El-Sherbeny, S.E., Khalil, M.Y., Hussein, M.S. et al.: Effect of sowing date and application
of foliar fertilizers on the yield and chemical composition of rue (Ruta graveolens L.) herb.
Herba Polonica 54(1), 47–56 (2008)
9. Joshi, R., Singh, J., Vig, A.P.:Vermicompost as an effective organic fertilizer and biocontrol
agent: effect on growth, yield and quality of plants. Rev. Environ. Sci. Bio-Tech. 14(1), 137–
159 (2015)
10. Marinari, S., Masciandaro, G., Ceccanti, B. et al.: Influence of organic and mineral fertilizers
on soil biological and physical properties. Bioresour. Tech. 72(1), 9–17 (2000).
11. Hargreaves, J.C., Adl, M.S., Warman, P.R.: A review of the use of composted municipal
solid waste in agriculture. Agric. Ecosyst. Environ. 123(1–3), 14 (2008)
12. Ievinsh, G.: Vermicompost treatment differentially affects seed germination, seedling
growth and physiological status of vegetable crop species. Plant Growth Regul. 65(1),
169–181 (2011)
13. Bernstad, A.K., Canovas, A., Valle, R.: Consideration of food wastage along the supply
chain in lifecycle assessments: a mini-review based on the case of tomatoes. Waste Manag.
Res. 35(1), 29–39 (2017).
14. Buchmann, C., Schaumann, G.E.: The contribution of various organic matter fractions to
soil-water interactions and structural stability of an agriculturally cultivated soil. J. Plant
Nutr. Soil Sci. 181(4), 586–599 (2018)
15. Petrova, I.I., Kokunova, I.V.: Povyshenie effektivnosti vneseniya tverdyh organicheskih
udobrenij putem razrabotki novogo raspredelyayushchego ustrojstva dlya navozorazbrasy-
vatelya (Improving the efficiency of solid organic fertilizer application by developing a new
distribution device for the manure spreader of agricultural state), pp. 19–23 (2013). (in
Russian)
16. Hu, C., Xia, X., Chen, Y. et al.: Soil carbon and nitrogen sequestration and crop growth as
influenced by long-term application of effective microorganism compost. Chil. J. Agric. Res.
78(1), 13–22 (2018).
17. Case, S.D.C., Oelofse, M., Hou, Y., et al.: Farmer perceptions and use of organic waste
products as fertilizers - a survey study of potential benefits and barriers. Agric. Syst. 151,
84–95 (2017)
18. Kazimierczak, R., Hallmann, E., Rembialkowska, E.: Effects of organic and conventional
production systems on the content of bioactive substances in four species of medicinal
plants. Biol. Agric. Hortic. 31(2), 118–127 (2015)
19. Cerda, A., Artola, A., Font, X., et al.: Composting of food wastes: status and challenges.
Biores. Technol. 248, 57–67 (2018)
20. Dannehl, D., Becker, C., Suhl, J. et al.: Reuse of organomineral substrate waste from
hydroponic systems as fertilizer in open-field production increases yields, flavonoid
glycosides, and caffeic acid derivatives of red oak leaf lettuce (Lactuca sativa L.) much more
than synthetic fertilizer. J. Of Agric. Food Chem. 64(38), 7068–7075 (2016)

21. Hwang, H.Y., Kim, S.H., Kim, M.S., Park, S.J., Lee, C.H.: Co-composting of chicken
manure with organic wastes: characterization of gases emissions and compost quality. Appl.
Biol. Chem. 63(1), 3 (2020). https://doi.org/10.1186/s13765-019-0483-8
22. Awasthi, M.K., Pandey, A.K., Bundela, P.S. et al.: Co-composting of gelatin industry sludge
combined with organic fraction of municipal solid waste and poultry waste employing
zeolite mixed with enriched nitrifying bacterial consortium. Bioresour. Tech. 213, 181–189
(2016).
23. Cesaro, A., Belgiorno, V., Guida, M.: Compost from organic solid waste: quality assessment
and european regulations for its sustainable use. Resour. Conserv. Recycl. 94, 72–79 (2015)
24. Ben-Tal, A., Nemirovski, A.: Lectures on modern convex optimization: analysis, algorithms,
and engineering applications. MPR-SIAM Series on optimization. SIAM, Philadelphia
(2001)
25. Ben-Tal, A., Nemirovski, A.: Robust optimization - methodology and applications. Math.
Program. 92(3), 453–480 (2002)
26. Özmen, A., Weber, G.-W., Batmaz, İ: The new robust CMARS (RCMARS) method. In:
International Conference 24th Mini EURO Conference “Continuous Optimization and
Information-Based Technologies in the Financial Sector” (MEC EurOPT 2010), Izmir,
Turkey, 23–26 June 2010 (2010)
27. Samarin, G.N., Vasilyev, A.N., Zhukov, A.A., Soloviev, S.V.: Optimization of microclimate
parameters inside livestock buildings. In: Vasant, P., Zelinka, I., Weber, G.W. (eds) Intel-
ligent Computing & Optimization. ICO 2018. Advances in Intelligent Systems and
Computing, vol. 866. Springer, Cham (2018).
28. Samarin, G., Vasiliev, A.N., Vasiliev, A.A., Zhukov, A., Krishtopa, N., Kudryavtsev, A.:
Optimization of technological processes in animal husbandry. In: International Conference
on Efficient Production and Processing (ICEPP-2020), E3S Web Conferences, vol. 161,
p. 1094 (2020).
29. Carr, L., Grover, R., Smith, B., et al.: Commercial and on-farm production and marketing of
animal waste compost products. In: Animal Waste and the Land-Water Interface,
pp. 485–492. Lewis Publishers, Boca Raton (1995)
30. Kranz, C.N., McLaughlin, R.A., Johnson, A., Miller, G., Heitman, J.L.: The effects of
compost incorporation on soil physical properties in urban soils - a concise review.
J. Environ. Manage. 261, 110209 (2020). https://doi.org/10.1016/j.jenvman.2020.110209
31. Beck-Broichsitter, S., Fleige, H., Horn, R.: Compost quality and its function as a soil
conditioner of recultivation layers - a critical review. Int. Agrophys. 32(1), 11–18 (2018).
Author Index

A Bandyopadhyay, Tarun Kanti, 345


Abdulgalimov, Mavludin, 1156 Banik, Anirban, 73, 345
Abdullah, Saad Mohammad, 681 Barua, Adrita, 1111
Abedin, Zainal, 311 Basnin, Nanziba, 379
Adil, Md., 237, 976 Bebeshko, Bohdan, 463
Ahammad, Tanzin, 418 Belikov, Roman P., 28
Ahmad, Nafi, 393 Bellone, Cinzia B., 1168
Ahmed, Ashik, 681 Belov, Aleksandr A., 43
Ahmed, Md. Raju, 1213 Bhatt, Ankit, 633
Ahmed, Tawsin Uddin, 865 Biswal, Sushant Kumar, 345
Aishwarja, Anika Islam, 546 Biswas, Munmun, 880
Akhmetov, Bakhytzhan, 463 Bolshev, Vadim E., 19, 28
Akhtar, Mohammad Nasim, 1026 Boonmalert, Chinchet, 1262
Akhtar, Nasim, 735 Boonperm, Aua-aree, 263, 276, 287, 1262
Akter, Mehenika, 865 Borodin, Maksim V., 19, 28
Akter, Suraiya, 976 Budnikov, Dmitry, 36, 1139
Ali, Abdalla M., 823, 838 Budnikov, Dmitry A., 440
Andersson, Karl, 583, 865, 880, 894 Bugreev, Victor, 205
Anufriiev, Sergii, 777 Bukreev, Alexey V., 19
Anwar, A. M. Shahed, 607 Bunko, Vasyl, 1222
Anwar, Md Musfique, 964
Apatsev, Vladimir, 205
Apeh, Simon T., 430 C
Ara, Ferdous, 583 Chakma, Rana Jyoti, 788
Arefin, Mohammad Shamsul, 326, 476, 1011, Chakraborty, Tilottama, 907
1071, 1281, 1295 Chaplygin, Mikhail, 135, 369
Aris, Mohd Shiraz, 1232 Chaturantabut, Saifon, 1059
Arnab, Ali Adib, 393 Chekhov, Anton, 205
Asma Gani, Most., 976 Chekhov, Pavel, 205
Ayon, Zaber Al Hassan, 1071 Cheng, L., 250
Chernishov, Vadim A., 19
B Chirskiy, Sergey, 73, 196
Baartman, S., 250 Chit, Khin Me Me, 1038
Babaev, Baba, 95 Chui, Kwok Tai, 670


D Hossain, Tanvir, 744


Daoden, Kanchana, 954 Hossen, Imran, 621
Das, Avishek, 1111, 1124 Hossen, Muhammad Kamal, 1011
Das, Dipankar, 621 Hsiao, Yu-Lin, 1242
Das, Utpol Kanti, 607 Hung, Phan Duy, 406
Daus, Yu. V., 216 Huynh, Son Bach, 823
Deb, Deepjyoti, 907
Deb, Kaushik, 311, 530 I
Deb, Ujjwal Kumar, 224 Ignatenko, Dmitry N., 1310
Debi, Tanusree, 326, 1281 Ikechukwu, Anthony O., 297
Desiatko, Alona, 463 Ilchenko, Ekaterina, 135
Dogeev, Hasan, 1156 Imteaj, Ahmed, 1011
Drabecki, Mariusz, 118 Intawichai, Siriwan, 1059
Duangban, Dussadee, 938 Iqbal, MD. Asif, 1111, 1124
Dzhaparov, Batyr, 1156 Iqbal, Md. Hafiz, 1281
Islam, Md. Moinul, 1071
E Islam, Md. Shariful, 476
Eva, Nusrat Jahan, 546 Islam, Md. Zahidul, 922
Islam, Muhammad Nazrul, 546, 559
F Islam, Nazmin, 754
Fardoush, Jannat, 788 Islam, Quazi Nafees Ul, 681
Faruq, Md. Omaer, 754
Fister Jr., Iztok, 187 J
Fister, Iztok, 187 Jachnik, Bartosz, 777
Forhad, Md. Shafiul Alam, 476 Jahan, Nasrin, 659
Jahara, Fatima, 1111
G Jalal, Mostafa Mohiuddin, 559
Galib, Syed Md., 476 Jamrunroj, Panthira, 276
Godder, Tapan Kumar, 922 Jerkovic, Hrvoje, 708
Golam Rashed, Md., 621 Joshua Thomas, J., 823, 838, 1082
Golikov, Igor O., 19, 28 Jyoti, Oishi, 754
Gosh, Subasish, 880
Gupta, Brij B., 670 K
Gupta, Dipankar, 894 Kar, Susmita, 735
Karim, Rezaul, 647, 659
H Karmaker, Ashish Kumar, 1213
Hartmann-González, Mariana Scarlett, 506 Karmaker, Rajib, 224
Hasan, Md. Manzurul, 744 Keppler, Joachim, 720
Hasan, Mirza A. F. M. Rashidul, 693 Khan, Md. Akib, 476
Hasan, Mohammad, 418, 801, 1000 Khan, Md. Akib Zabed, 964
Hassamontr, Jaramporn, 452 Khan, Mohammad Ibrahim, 1295
Hlangnamthip, Sarot, 154 Khan, Nafiz Imtiaz, 546
Hoque, Mohammed Moshiul, 1111, 1124 Kharchenko, V. V., 216
Hossain, Emam, 894 Kharchenko, Valeriy, 95, 103, 1186, 1195
Hossain, Md. Anwar, 693 Khobragade, Prachi D., 907
Hossain, Md. Arif, 681 Khorolska, Karyna, 463
Hossain, Md. Billal, 1295 Khristenko, Alexander G., 1310
Hossain, Mohammad Rubaiyat Tanvir, 1047 Khujakulov, Saydulla, 103
Hossain, Mohammad Shahada, 379 Kibria, Hafsa Binte, 1097
Hossain, Mohammad Shahadat, 583, 647, 659, Klychev, Shavkat, 95
865, 880, 894 Kokunova, Irina V., 1319
Hossain, Mohammed Sazzad, 894 Kovalev, Andrey, 63, 73, 1186, 1195
Hossain, Shahadat, 744 Kovalev, Dmitriy, 1186, 1195
Hossain, Syed Md. Minhaz, 530 Kozyrska, Tetiana, 111

Kozyrsky, Volodymyr, 111, 1222
Krishtanov, Egor A., 1310
Kudryavtsev, Alexander A., 1310, 1319
Kuzmichev, Alexey, 1146

L
Lakhno, Valery, 463
Lansberg, Alexander A., 28
Leeart, Prarot, 597, 1176
Leephaicharoen, Theera, 452
Lekburapa, Anthika, 287
Lin, Laet Laet, 570, 1038
López-Sánchez, Víctor Manuel, 357
Loy-García, Gabriel, 812
Lutskaya, Nataliia, 1252

M
Madhu, Nimal, 633
Magomedov, Fakhretdin, 1156
Mahmud, Tanjim, 788
Majumder, Mrinmoy, 907
Makarevych, Svitlana, 111
Malim, Nurul Hashimah Ahamed Hassain, 823
Mamunur Rashid, Md., 801
Manshahia, Mukhdeep Singh, 3
Marmolejo, José A., 1272
Marmolejo-Saucedo, Jose-Antonio, 520, 812
Marmolejo-Saucedo, José Antonio, 357, 506
Marshed, Md. Niaz, 1213
Matic, Igor, 720
Matin, Abdul, 1097
Melikov, Izzet, 1156
Miah, Abu Saleh Musa, 801
Minatullaev, Shamil, 1156
Mitic, Peter, 164
Mitiku, Tigilu, 3
Mrsic, Leo, 708, 720
Mukta, Sultana Jahan, 393, 922
Mukul, Ismail Hossain, 1000
Munapo, Elias, 491
Munna, Ashibur Rahman, 237
Mushtary, Shakira, 546
Mustafa, Rashed, 647, 659

N
Nahar, Lutfun, 379
Nahar, Nazmun, 583, 880
Nair, Gomesh, 838
Nanthachumphu, Srikul, 938
Nawal, Nafisa, 801
Nawikavatan, Auttarat, 597, 1176
Nekrasov, Alexey, 1204
Nekrasov, Anton, 1204
Nesvidomin, Andriy, 1222
Nggada, Shawulu H., 297
Nikitin, Boris, 95
Niyomsat, Thitipong, 154
Noor, Noor Akma Watie Binti Mohd, 1232
Normov, Dmitry A., 1319
Novikov, Evgeniy, 205

O
Okpor, James, 430
Ongsakul, Weerakorn, 53, 633
Ortega, Luz María Adriana Reyes, 520
Ottavi, Riccardo, 1168

P
Panchenko, V. A., 216
Panchenko, Vladimir, 63, 73, 84, 95, 103, 196, 205, 345, 1186, 1195, 1204
Pardayev, Zokir, 103
Parvez, Saif Mahmud, 964
Pathak, Abhijit, 237, 976
Pathan, Refat Khan, 583
Pedrero-Izquierdo, César, 357
Podgorelec, Vili, 187
Prilukov, Aleksandr, 135, 369, 1156
Pringsakul, Noppadol, 176
Puangdownreong, Deacha, 145, 154, 176

Q
Quan, Do Viet, 406
Quenum, José G., 297

R
Rahaman, Md Masumur, 1281
Rahaman, Md. Habibur, 754
Rahman, Ataur, 1047
Rahman, Md. Mahbubur, 964
Rahman, Md. Rashadur, 1295
Rahman, Mohammad Nurizat, 1232
Rahman, Mostafijur, 735, 1026
Raj, Sheikh Md. Razibul Hasan, 393, 922
Ramasamy, Sriram, 1082
Rastimeshin, Sergey, 1146
Redwanur Rahman, Md., 801
Reong, Samuel, 1242
Reyna Guevara, Zayra M., 1272
Ripan, Rony Chowdhury, 1071
Riyana, Noppamas, 938
Riyana, Surapon, 938
Rodríguez-Aguilar, Román, 520, 812
Romsai, Wattanawong, 597, 1176
Roy, Bakul Chandra, 621
Roy, Shaishab, 1026

S
Saadat, Mohammad Amir, 647
Saha, Rumi, 326
Saiful Islam, Md., 989
Samarin, Gennady N., 43, 1310, 1319
Sangngern, Kasitinart, 263
Sania, Sanjida Nusrat, 237, 976
Sarker, Iqbal H., 1111
Sathi, Khaleda Akhter, 989
Saucedo Martínez, Jania A., 1272
Savchenko, Vitaliy, 111, 1222
Schormans, John, 393
Senkevich, Sergey, 135, 369, 1156
Shahidujjaman Sujon, Md., 801
Shanmuga Priya, S., 848
Sharif, Omar, 1111, 1124
Sharko, Anton A., 440
Shin, Jungpil, 801
Shtepa, Volodimir, 1252
Siddique, Md. Abu Ismail, 754
Siddiquee, Md. Saifullah, 1047
Sikder, Juel, 607, 788
Sinitsyn, Sergey, 84
Sintunavarat, Wutiphol, 287, 1262
Sinyavsky, Oleksandr, 1222
Sittisung, Suphannika, 938
Sivarethinamohan, R., 848
Skoczylas, Artur, 766, 777
Śliwiński, Paweł, 766
Sobolenko, Diana, 111
Soe, Myat Thuzar, 570
Sorokin, Nikolay S., 19, 28
Stachowiak, Maria, 766
Stefaniak, Paweł, 766, 777
Sujatha, S., 848
Sumpunsri, Somchai, 145

T
Tasin, Abrar Hossain, 237, 976
Tasnim, Zarin, 546, 559
Thaiupathump, Trasapong, 954
Thammarat, Chaiyo, 145
Thomas, J. Joshua, 345
Thupphae, Patiphan, 53
Tikhomirov, Dmitry A., 19, 28
Tikhomirov, Dmitry, 1146
Tito, S. R., 681
Tofayel Hossain, Md., 801
Tolic, Antonio, 708
Tran, H. N. Tran, 823
Tripura, Khakachang, 907
Trunov, Stanislav, 1146

U
Uddin, Md. Ashraf, 476
Uddin, Mohammad Amaz, 583
Ukhanova, Victoria, 1146
Uzakov, Gulom, 103

V
Vasant, Pandian, 1186, 1195
Vasiliev, Aleksey Al., 43
Vasiliev, Aleksey N., 43
Vasiliev, Alexey A., 440
Vasiliev, Alexey N., 440
Vasilyev, Alexey A., 1139
Vasilyev, Alexey N., 1139, 1319
Vinogradov, Alexander V., 19, 28
Vinogradova, Alina V., 19, 28
Vlasenko, Lidiia, 1252
Voloshyn, Semen, 111
Vorushylo, Anton, 111

W
Wahab, Mohammad Abdul, 880
Wee, Hui-Ming, 1242
Whah, Chin Yee, 1242

Y
Yasmin, Farzana, 1071
Ydyryshbayeva, Moldyr, 463
Yudaev, I. V., 216
Yuvaraj, D., 848

Z
Zahid Hassan, Md., 418, 1000
Zaiets, Nataliia, 1252
Zaman, Saika, 1011
Zhao, Mingbo, 670
Zulkifli, Ahmad Zulazlan Shah b., 1232
