
Lecture Notes in Electrical Engineering 738

Sergio Saponara
Alessandro De Gloria   Editors

Applications
in Electronics
Pervading Industry,
Environment and
Society
APPLEPIES 2020
Lecture Notes in Electrical Engineering

Volume 738

Series Editors
Leopoldo Angrisani, Department of Electrical and Information Technologies Engineering, University of Napoli
Federico II, Naples, Italy
Marco Arteaga, Departament de Control y Robótica, Universidad Nacional Autónoma de México, Coyoacán,
Mexico
Bijaya Ketan Panigrahi, Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India
Samarjit Chakraborty, Fakultät für Elektrotechnik und Informationstechnik, TU München, Munich, Germany
Jiming Chen, Zhejiang University, Hangzhou, Zhejiang, China
Shanben Chen, Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Tan Kay Chen, Department of Electrical and Computer Engineering, National University of Singapore,
Singapore, Singapore
Rüdiger Dillmann, Humanoids and Intelligent Systems Laboratory, Karlsruhe Institute for Technology,
Karlsruhe, Germany
Haibin Duan, Beijing University of Aeronautics and Astronautics, Beijing, China
Gianluigi Ferrari, Università di Parma, Parma, Italy
Manuel Ferre, Centre for Automation and Robotics CAR (UPM-CSIC), Universidad Politécnica de Madrid,
Madrid, Spain
Sandra Hirche, Department of Electrical Engineering and Information Science, Technische Universität
München, Munich, Germany
Faryar Jabbari, Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA,
USA
Limin Jia, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Alaa Khamis, German University in Egypt El Tagamoa El Khames, New Cairo City, Egypt
Torsten Kroeger, Stanford University, Stanford, CA, USA
Qilian Liang, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA
Ferran Martín, Departament d’Enginyeria Electrònica, Universitat Autònoma de Barcelona, Bellaterra,
Barcelona, Spain
Tan Cher Ming, College of Engineering, Nanyang Technological University, Singapore, Singapore
Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany
Pradeep Misra, Department of Electrical Engineering, Wright State University, Dayton, OH, USA
Sebastian Möller, Quality and Usability Laboratory, TU Berlin, Berlin, Germany
Subhas Mukhopadhyay, School of Engineering & Advanced Technology, Massey University,
Palmerston North, Manawatu-Wanganui, New Zealand
Cun-Zheng Ning, Electrical Engineering, Arizona State University, Tempe, AZ, USA
Toyoaki Nishida, Graduate School of Informatics, Kyoto University, Kyoto, Japan
Federica Pascucci, Dipartimento di Ingegneria, Università degli Studi “Roma Tre”, Rome, Italy
Yong Qin, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Gan Woon Seng, School of Electrical & Electronic Engineering, Nanyang Technological University,
Singapore, Singapore
Joachim Speidel, Institute of Telecommunications, Universität Stuttgart, Stuttgart, Germany
Germano Veiga, Campus da FEUP, INESC Porto, Porto, Portugal
Haitao Wu, Academy of Opto-electronics, Chinese Academy of Sciences, Beijing, China
Junjie James Zhang, Charlotte, NC, USA
The book series Lecture Notes in Electrical Engineering (LNEE) publishes the latest developments
in Electrical Engineering - quickly, informally and in high quality. While original research
reported in proceedings and monographs has traditionally formed the core of LNEE, we also
encourage authors to submit books devoted to supporting student education and professional
training in the various fields and application areas of electrical engineering. The series covers
classical and emerging topics concerning:
• Communication Engineering, Information Theory and Networks
• Electronics Engineering and Microelectronics
• Signal, Image and Speech Processing
• Wireless and Mobile Communication
• Circuits and Systems
• Energy Systems, Power Electronics and Electrical Machines
• Electro-optical Engineering
• Instrumentation Engineering
• Avionics Engineering
• Control Systems
• Internet-of-Things and Cybersecurity
• Biomedical Devices, MEMS and NEMS

For general information about this book series, comments or suggestions, please contact
leontina.dicecco@springer.com.
To submit a proposal or request further information, please contact the Publishing Editor in
your country:
China
Jasmine Dou, Editor (jasmine.dou@springer.com)
India, Japan, Rest of Asia
Swati Meherishi, Editorial Director (Swati.Meherishi@springer.com)
Southeast Asia, Australia, New Zealand
Ramesh Nath Premnath, Editor (ramesh.premnath@springernature.com)
USA, Canada:
Michael Luby, Senior Editor (michael.luby@springer.com)
All other Countries:
Leontina Di Cecco, Senior Editor (leontina.dicecco@springer.com)
** This series is indexed by EI Compendex and Scopus databases. **

More information about this series at http://www.springer.com/series/7818


Editors
Sergio Saponara, DII, University of Pisa, Pisa, Italy
Alessandro De Gloria, DITEN, University of Genoa, Genoa, Italy

ISSN 1876-1100 ISSN 1876-1119 (electronic)


Lecture Notes in Electrical Engineering
ISBN 978-3-030-66728-3 ISBN 978-3-030-66729-0 (eBook)
https://doi.org/10.1007/978-3-030-66729-0
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

The 2020 edition of the Conference on “Applications in Electronics Pervading
Industry, Environment and Society” was exceptionally held fully online on
November 19 and 20, 2020.
During the 2 days, 87 registered participants, from 27 different entities (20
Universities and seven industries), discussed electronic applications in several
domains, demonstrating how electronics has become pervasive and ever more
embedded in everyday objects and processes.
The conference had the technical and/or financial support of University of Pisa,
University of Genoa, SIE (Italian Association for Electronics), Giakova and of the
H2020 European Processor Initiative.
After a strict blind-review selection process, 12 short presentations and 24
lectures were accepted (with co-authors from 14 different nations) in 11 sessions
focused on circuits and electronic systems and their relevant applications in the
following fields: wireless and IoT, health care, vehicles and robots (electrified and
autonomous), power electronics and energy storage, cybersecurity, AI and data
engineering.
More in detail, the short presentation sessions involved contributions on SS1
mechatronics, energies and Industry 4.0 and SS2 IoT, AI and ICT applications,
while the full oral sessions involved contributions on S1 AI and ML techniques, S2
environmental monitoring and E-health, S3 electronics for health and assisted
living, S4 digital techniques for mechatronics, energy and critical systems and S5
photonic circuits and IoT for communications.
There were also two scientific keynotes, given by Cecilia Metra (IEEE Computer
Society Past President) and by John David Davies (Barcelona Super Computing)
and three industrial keynotes, by Carlo Cavazzoni (Leonardo Spa), Paolo Gai
(Huawei) and Luca Poli (Giakova Spa).
The articles featured in this book, together with the talks and round tables of the
special events, prove that the capabilities of today's electronic systems, in terms
of computing, storage and networking, are able to support a plethora of application
domains, such as mobility, health care, connectivity, energy management, smart
production, ambient intelligence, smart living, safety and security, education,
entertainment, tourism and cultural heritage.
In order to exploit such capabilities, multidisciplinary knowledge and expertise
are needed to support a virtuous iterative cycle from user needs to the design,
prototyping and testing of new products and services that are more and more
characterized by a digital core.
The design and testing cycles go through the whole system engineering process,
which includes analysis of user requirements, specification definition, verification
plan definition, software and hardware co-design, laboratory and user testing and
verification, maintenance management and life cycle management of electronics
applications. The design of electronics-enabled systems should be characterized by
innovation, high performance, real-time operations and budget compliance (in
terms of time, cost, device size, weight, power consumption, etc.). Design
methodologies and tools have emerged in order to support teams dealing with such
a complexity.
All these challenging aspects highlight the importance of the role of academia as a
place where new generations of designers can learn and practice with
cutting-edge technological tools, and where new solutions are studied, starting from
challenges coming from a variety of application domains. This approach is
sustained by industries that understand the role of a high-level educational system, able
to nurture new generations of designers and developers.
The APPLEPIES conference reached its eighth edition in 2020, confirming
its role as a reference point for a growing research community in the field of
electronic systems design, with a particular focus on applications.

Sergio Saponara
General Chair
Alessandro De Gloria
Honorary Chair
Contents

AI and ML Techniques
Implementation of Particle Image Velocimetry for Silo Discharge and Food Industry Seeds
Romina Molina, Valeria Gonzalez, Jesica Benito, Stefano Marsi, Giovanni Ramponi, and Ricardo Petrino
Analysis and Design of a Yolo like DNN for Smoke/Fire Detection for Low-cost Embedded Systems
Alessio Gagliardi, Marco Villella, Luca Picciolini, and Sergio Saponara
Video Grasping Classification Enhanced with Automatic Annotations
Edoardo Ragusa, Christian Gianoglio, Filippo Dalmonte, and Paolo Gastaldo
Enabling YOLOv2 Models to Monitor Fire and Smoke Detection Remotely in Smart Infrastructures
Sergio Saponara, Abdussalam Elhanashi, and Alessio Gagliardi
Exploring Unsupervised Learning on STM32 F4 Microcontroller
Francesco Bellotti, Riccardo Berta, Alessandro De Gloria, Joseph Doyle, and Fouad Sakr

Environmental Monitoring and E-health


Unobtrusive Accelerometer-Based Heart Rate Detection
Yurii Shkilniuk, Maksym Gaiduk, and Ralf Seepold
A Lightweight SiPM-Based Gamma-Ray Spectrometer for Environmental Monitoring with Drones
Marco Carminati, Davide Di Vita, Luca Buonanno, Giovanni L. Montagnani, and Carlo Fiorini
Winter: A Novel Low Power Modular Platform for Wearable and IoT Applications
Patrick Locatelli, Asad Hussain, Andrea Pedrana, Matteo Pezzoli, Gianluca Traversi, and Valerio Re
Hardware–Oriented Data Recovery Algorithms for Compressed Sensing–Based Vibration Diagnostics
Federica Zonzini, Matteo Zauli, Antonio Carbone, Francesca Romano, Nicola Testoni, and Luca De Marchi

Electronics for Health and Assisted Living


Automatic Generation of 3D Printable Tactile Paintings for the Visually Impaired
Francesco de Gioia, Massimiliano Donati, and Luca Fanucci
Validation of Soft Real-Time in Remote ECG Analysis
Miltos D. Grammatikakis, Anastasios Koumarelis, and Efstratios Ntallaris
Software Architecture of a User-Level GNU/Linux Driver for a Complex E-Health Biosensor
Miltos D. Grammatikakis, Anastasios Koumarelis, and Angelos Mouzakitis
Enabling Smart Home Voice Control for Italian People with Dysarthria: Preliminary Analysis of Frame Rate Effect on Speech Recognition
Marco Marini, Gabriele Meoni, Davide Mulfari, Nicola Vanello, and Luca Fanucci
Brain-Actuated Pick-Up and Delivery Service for Personal Care Robots: Implementation and Case Study
Giovanni Mezzina and Daniela De Venuto

Digital Techniques for Mechatronics, Energy and Critical Systems


Creation of a Digital Twin Model, Redesign of Plant Structure and New Fuzzy Logic Controller for the Cooling System of a Railway Locomotive
Marica Poce, Giovanni Casiello, Lorenzo Ferrari, Lorenzo Flaccomio Nardi Dei, and Sergio Saponara
HDL Code Generation from SIMULINK Environment for Li-Ion Cells State of Charge and Parameter Estimation
Mattia Stighezza, Valentina Bianchi, and Ilaria De Munari
Performance Comparison of Imputation Methods in Building Energy Data Sets
Hariom Dhungana, Francesco Bellotti, Riccardo Berta, and Alessandro De Gloria
Design and Validation of a FPGA-Based HIL Simulator for Minimum Losses Control of a PMSM
Giuseppe Galioto, Antonino Sferlazza, and Giuseppe Costantino Giaconia
x86 System Management Mode (SMM) Evaluation for Mixed Critical Systems
Nikos Mouzakitis, Michele Paolino, Miltos D. Grammatikakis, and Daniel Raho

Photonic Circuits and IoT for Communications


A Novel Pulse Compression Scheme in Coherent OTDR Using Direct Digital Synthesis and Nonlinear Frequency Modulation
Yonas Muanenda, Stefano Faralli, Philippe Velha, Claudio Oton, and Fabrizio Di Pasquale
Design and Analysis of RF/High-Speed SERDES in 28 nm CMOS Technology for Aerospace Applications
Francesco Cosimi, Gabriele Ciarpi, and Sergio Saponara
Enabling Transiently-Powered Communication via Backscattering Energy State Information
Alessandro Torrisi, Kasım Sinan Yıldırım, and Davide Brunelli
Analysis and Design of Integrated VCO in 28 nm CMOS Technology for Aerospace Applications
Paolo Prosperi, Gabriele Ciarpi, and Sergio Saponara
vrLab: A Virtual and Remote Low Cost Electronics Lab Platform
Massimo Ruo Roch and Maurizio Martina

Mechatronics, Energies and Industry 4.0


Mechatronic Design Optimization of an Electrical Drilling Machine for Trenchless Operations in Urban Environment
Valerio Vita, Luca Pugi, Lorenzo Berzi, Francesco Grasso, Raffaele Savi, Massimo Delogu, and Enrico Boni
Analysis and Design of a Non-linear MPC Algorithm for Vehicle Trajectory Tracking and Obstacle Avoidance
Francesco Cosimi, Pierpaolo Dini, Sandro Giannetti, Matteo Petrelli, and Sergio Saponara
Impact of Combined Roto-Linear Drives on the Design of Packaging Systems: Some Applications
Marco Ducci, Alessandro Peruzzi, and Luca Pugi
Preliminary Study of a Novel Lithium-Ion Low-Cost Battery Maintenance system
Andrea Carloni, Federico Baronti, Roberto Di Rienzo, Roberto Roncella, and Roberto Saletti
Low Cost and Flexible Battery Framework for Micro-grid Applications
Roberto Di Rienzo, Federico Baronti, Daniele Bellucci, Andrea Carloni, Roberto Roncella, Marco Zeni, and Roberto Saletti
Survey of Positioning Technologies for In-Tunnel Railway Maintenance
Luca Fronda, Francesco Bellotti, Riccardo Berta, Alessandro De Gloria, and Paolo Cesario

IoT, AI and ICT Applications


Edgine, A Runtime System for IoT Edge Applications
Riccardo Berta, Andrea Mazzara, Francesco Bellotti, Alessandro De Gloria, and Luca Lazzaroni
An Action-Selection Policy Generator for Reinforcement Learning Hardware Accelerators
Gian Carlo Cardarilli, Luca Di Nunzio, Rocco Fazzolari, Daniele Giardino, Marco Matta, Marco Re, and Sergio Spanò
Porting Rulex Machine Learning Software to the Raspberry Pi as an Edge Computing Device
Ali Walid Daher, Ali Rizik, Marco Muselli, Hussein Chible, and Daniele D. Caviglia
High Voltage Isolated Bidirectional Network Interface for SoC-FPGA Based Devices
Luis Guillermo García, Maria Liz Crespo, Sergio Carrato, Andres Cicuttin, Werner Florian, Romina Molina, Bruno Valinoti, and Stefano Levorato
A Comparison of Objective and Subjective Sleep Quality Measurement in a Group of Elderly Persons in a Home Environment
Maksym Gaiduk, Ralf Seepold, Natividad Martínez Madrid, Juan Antonio Ortega, Massimo Conti, Simone Orcioni, Thomas Penzel, Wilhelm Daniel Scherz, Juan José Perea, Ángel Serrano Alarcón, and Gerald Weiss
A Preliminary Study on Aerosol Jet-Printed Stretchable Dry Electrode for Electromyography
M. Borghetti, Tiziano Fapanni, N. F. Lopomo, E. Sardini, and M. Serpelloni

Author Index


AI and ML Techniques
Implementation of Particle Image
Velocimetry for Silo Discharge and Food
Industry Seeds

Romina Molina1,2,4(B), Valeria Gonzalez2, Jesica Benito3, Stefano Marsi1,
Giovanni Ramponi1, and Ricardo Petrino2

1 Università Degli Studi di Trieste - IPL, Dipartimento di Ingegneria e Architettura, Piazzale Europa, 1, 34127 Trieste, TS, Italy
rominasoledad.molina@phd.units.it
2 National University of San Luis - LEIS, Department of Electronic, Av. Ejército de los Andes 950, D5700 BPB San Luis, Argentina
3 National University of San Luis - INFAP, CONICET, Department of Physics, Av. Ejército de los Andes 950, D5700 BPB San Luis, Argentina
4 The Abdus Salam International Centre for Theoretical Physics - MLAB, Strada Costiera, 11, 34151 Trieste, TS, Italy

Abstract. This work focuses on determining the velocity profile of a
granular flow at the outlet of a silo, using artificial vision techniques.
The developed algorithm enhances each frame through neural networks,
and particle image velocimetry then detects seed motion in the
hopper. We process 50, 100, 150 and 200 frames of a discharge video
for three different grains using CPU and PYNQ-Z1 implementations
with simple image processing at the pre-processing level, and a CPU
implementation using a neural network. Execution times are measured and the
differences between the involved technologies are discussed.

Keywords: PIV · Image processing · SoC

1 Introduction
The growth of artificial vision techniques for image processing, recognition and
classification raises expectations that systems can solve problems
that would otherwise be much more difficult or impossible in fields such as security,
industry and autonomous driving, among others [1–3].
In this work, we present the design of an artificial vision application for the
calculation of the velocity field in a granular medium at the outlet of a silo. The
study of granular flows within silos is a topic of great interest due to its
influence on different industrial processes (emptying, mixing, grinding and transfer of
material) in industries such as cement, pharmaceutical, food and mining, among
others. In the food industry, there are countless examples where silos intervene in
production processes, presenting problems associated with the geometry and
characteristics of the silo and grains [4–6]. When the silos are not designed properly,
serious difficulties can arise in the discharge flow, leading to non-homogeneity of
mixing or blockages in hoppers.
Therefore, it is highly relevant to know the characteristics of the granular
flow. This work focuses on determining the velocity profile at the outlet of a silo
using artificial vision techniques: (a) video enhancement and (b) image processing in
the frequency domain using the Fast Fourier Transform.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 3–11, 2021.
https://doi.org/10.1007/978-3-030-66729-0_1

2 Experimental Setting Under Study


The device used in the experiments is shown in Fig. 1 [6]. It consists of a quasi-2D
silo with acrylic walls and mobile rods that allow the silo outlet opening
and the angle of inclination of the hopper to be varied. The transparent walls make the
flow visible during discharge. The granular materials used are seeds typically employed in
the food industry: black sesame, millet and canary seeds (Fig. 1). The shape and color
of the seeds pose a hard challenge because the seeds are difficult to identify during
the rapid discharge close to the outlet. In these experiments, the hopper
angle is 90° (measured from the vertical), a configuration known as a flat silo. Furthermore, the
outlet opening is large enough to avoid blockages. This configuration presents
a rapid discharge zone in the center of the silo and stagnation zones near the
walls.

Fig. 1. (a) Experimental device. (b) Grains (from top to bottom): millet, canary seeds
and sesame.

Particle image velocimetry (PIV) determines the displacement of particles
in a certain flow using two images captured a known time interval apart. Thus, a
video or image sequence of the silo discharge is required for the analysis. In
our experiments, video is captured with an IDS UI-3160CP-C-HQ Rev.2.1 digital
camera, with a resolution of 900 × 400 pixels at 100 fps for millet seed and 940
× 328 pixels at 142 fps for black sesame. Examples of input frames can be seen in
Fig. 2.

Fig. 2. Input frame - Millet (top) and black sesame (bottom)

Through PIV processing, the displacement (magnitude and direction) of the
particles and, therefore, their velocity can be determined [7]. To implement
this technique, each frame of the video is divided into a number of
areas distributed over the image (interrogation windows).

3 Particle Image Velocimetry Algorithm: Design and Implementation

The proposed algorithm employs artificial vision techniques and has three main
stages: (1) Frame enhancement, (2) Particle image velocimetry (PIV) algorithm
to detect seed motion and (3) Motion vectors debugging.

3.1 Stage 1: Frame Enhancement Through Neural Networks


This stage improves the quality and appearance of each frame affected by
experimental conditions (such as non-homogeneous lighting, low brightness, noise
and color distortion), to facilitate the subsequent tasks. We employ two techniques:
(a) simple image processing, which involves conversion of the input
frame to the HSV color space followed by a per-element bit-wise conjunction
with a predefined mask, and (b) a neural network using the WESPE [8]
architecture, an image-to-image architecture based on Generative Adversarial Networks
with openly available models and code. Each technique was included in the
video-processing pipeline in a separate experiment.
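
As an illustration of technique (a), the following minimal sketch converts an RGB frame to HSV and applies a per-element bit-wise AND with a mask. It is not the authors' code: the function name `enhance_simple`, the 0/255 mask convention and the scaling of all three HSV channels to 0-255 are assumptions made for this example (OpenCV, for instance, stores 8-bit hue in the range 0-179). With OpenCV the same steps would normally be written with `cv2.cvtColor` and `cv2.bitwise_and`.

```python
import colorsys

import numpy as np

# Element-wise wrapper around the standard-library RGB -> HSV conversion.
_rgb_to_hsv = np.vectorize(colorsys.rgb_to_hsv)


def enhance_simple(frame_rgb, mask):
    """Convert an RGB uint8 frame to HSV and AND it with a 0/255 mask.

    frame_rgb: array of shape (H, W, 3), dtype uint8 (assumed RGB order).
    mask:      array of shape (H, W), dtype uint8, 255 = keep, 0 = discard
               (hypothetical convention; the paper only says "predefined mask").
    """
    rgb = frame_rgb.astype(np.float64) / 255.0
    h, s, v = _rgb_to_hsv(rgb[..., 0], rgb[..., 1], rgb[..., 2])
    # Assumption: all three channels are rescaled to the full 0-255 range.
    hsv = np.round(np.stack([h, s, v], axis=-1) * 255.0).astype(np.uint8)
    # Per-element bit-wise conjunction; the mask is broadcast over channels.
    return np.bitwise_and(hsv, mask[..., np.newaxis])
```

Pixels where the mask is 0 are zeroed in every channel, while pixels where it is 255 keep their HSV values unchanged.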
For the neural network implementation, training was performed
using strong supervision with the DPED dataset introduced in [9], with some
modifications to the original architecture: (i) the weight of each loss was
modified: w_content (reconstruction): 0.2, w_color (GAN color): 25, w_texture (GAN
texture): 3, w_tv (total variation): 1/600; (ii) for the content loss, the relu_2_2 layer of
VGG19 was used. The training parameters were configured as follows: learning
rate: 0.0001, batch size: 32, train iterations: 20000.
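
The settings above can be collected in a small configuration sketch. The weights and training parameters are the ones listed in the text; the dictionary names, the `total_loss` helper and the assumption that the individual losses are combined as a plain weighted sum are illustrative and should be checked against the WESPE code actually used.

```python
# Loss weights as listed in the text (keys are illustrative names).
LOSS_WEIGHTS = {
    "content": 0.2,        # reconstruction loss
    "color": 25.0,         # GAN color loss
    "texture": 3.0,        # GAN texture loss
    "tv": 1.0 / 600.0,     # total variation loss
}

# Training parameters as listed in the text.
TRAINING = {
    "learning_rate": 1e-4,
    "batch_size": 32,
    "train_iterations": 20000,
}


def total_loss(losses, weights=LOSS_WEIGHTS):
    """Combine per-term loss values into one scalar.

    Assumption: the overall objective is a weighted sum of the four terms.
    `losses` maps the same keys as `weights` to scalar loss values.
    """
    return sum(weights[name] * losses[name] for name in weights)
```

For example, `total_loss({"content": 1.0, "color": 1.0, "texture": 1.0, "tv": 600.0})` evaluates to approximately 29.2, which shows how strongly the color term dominates at equal raw loss values.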
It should be noted that using a single pre-processing technique
makes the deployment of Stage 1 independent of the input video: different
types of seeds can be processed with the same algorithm, which avoids seed-specific
techniques and yields a robust long-term processing technique.
In contrast, our experiments show that traditional enhancement
methods without neural networks had to be modified and tuned differently
for each case.
Once the enhancement is performed, each frame is converted to gray scale to
be used as input to the next stage. Figure 3 shows the input frame in gray scale
(top) and the output frame of this stage using the neural network (bottom).

3.2 Stage 2: Particle Image Velocimetry Algorithm


In this stage we determine the displacement of the particles within each
interrogation window. An optimal window size has to be used in order to obtain
high accuracy without generating invalid vectors. Here, the image is divided into
18 × 8 (millet) and 18 × 6 (sesame) windows.
Decreasing the window size increases the number of resulting velocity vectors,
so the direction and speed of the seeds can be estimated
with better accuracy. However, this is not always optimal: if the window is smaller
than the seed itself, the real movement of the
seeds may go undetected, generating invalid displacement vectors. Conversely, if the
windows are very large, information is lost.
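
The window grid can be sketched as follows. This is a minimal example, assuming that "18 × 8" means 18 columns by 8 rows and that pixels left over when the frame size is not an exact multiple of the grid are cropped; the function name is hypothetical. Under these assumptions a 900 × 400 millet frame yields 50 × 50 pixel windows.

```python
import numpy as np


def split_into_windows(gray, n_cols, n_rows):
    """Split a grayscale frame into a grid of interrogation windows.

    gray: 2-D array of shape (height, width).
    Returns an array of shape (n_rows, n_cols, win_h, win_w).
    Assumption: remainder pixels at the right and bottom edges are cropped.
    """
    height, width = gray.shape
    win_h, win_w = height // n_rows, width // n_cols
    cropped = gray[: n_rows * win_h, : n_cols * win_w]
    # (n_rows, win_h, n_cols, win_w) -> (n_rows, n_cols, win_h, win_w)
    return cropped.reshape(n_rows, win_h, n_cols, win_w).swapaxes(1, 2)
```

For the 940 × 328 sesame frames with an 18 × 6 grid, the same function gives windows of 54 × 52 pixels (height × width) after cropping.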
Then, we calculate the Fast Fourier Transform (FFT) of each interrogation
window, and the spectra of corresponding interrogation windows in consecutive
frames are multiplied. Finally, we calculate the inverse Fourier transform to obtain the
position of the maximum correlation value. This position gives the
displacement vector of the grains within that window.
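
The FFT-based correlation step can be sketched with NumPy as below. This is an illustrative version, not the authors' implementation: it uses the standard cross-correlation form, in which the spectrum of the first window is complex-conjugated before the product, and it subtracts the window mean first (an assumption that reduces the influence of uniform brightness). The function name is hypothetical.

```python
import numpy as np


def window_displacement(win_a, win_b):
    """Estimate the integer-pixel shift from win_a to win_b.

    Both windows are 2-D arrays of the same shape, taken at the same
    position in two consecutive frames. Returns (dx, dy) in pixels.
    """
    a = win_a - win_a.mean()
    b = win_b - win_b.mean()
    # Circular cross-correlation via the FFT: IFFT(conj(FFT(a)) * FFT(b)).
    cross_spectrum = np.conj(np.fft.fft2(a)) * np.fft.fft2(b)
    correlation = np.fft.ifft2(cross_spectrum).real
    peak_row, peak_col = np.unravel_index(np.argmax(correlation),
                                          correlation.shape)
    n_rows, n_cols = correlation.shape
    # Peak indices past the midpoint correspond to negative shifts.
    dy = peak_row if peak_row <= n_rows // 2 else peak_row - n_rows
    dx = peak_col if peak_col <= n_cols // 2 else peak_col - n_cols
    return int(dx), int(dy)
```

Because the correlation is circular, displacements larger than half the window size cannot be recovered, which is one way the window-size trade-off described earlier shows up in practice.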

Fig. 3. Input frame in gray scale (top). Output of the pre-processing stage using neural
network (bottom)

3.3 Stage 3: Debugging the Motion Vectors

Incorrect vectors are inevitable in the processing due to, among other causes,
the size of the windows, stagnant or almost immobile seeds at the sides of the
silo, and very fast motion of the seeds between two subsequent frames. In the debugging stage,
vectors are subjected to reduction, validation and replacement. This is done
by comparing the resulting vectors with those obtained in neighboring windows.
If there are inconsistencies, the vector is eliminated and replaced by an average
obtained from all neighboring windows.
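
A minimal sketch of the validation and replacement step is given below. The text does not state the inconsistency criterion, so the test used here (distance from the median of the neighboring vectors above a tolerance `tol`) is an assumption, as are the function name and the default tolerance. The replacement by the average of all neighboring windows follows the description above.

```python
import numpy as np


def clean_vectors(field, tol=2.0):
    """Replace inconsistent displacement vectors by the neighbor average.

    field: array of shape (rows, cols, 2) with one (dx, dy) per window.
    tol:   hypothetical tolerance, in pixels, on the distance between a
           vector and the median of its neighbors.
    """
    rows, cols, _ = field.shape
    cleaned = field.copy()
    for r in range(rows):
        for c in range(cols):
            neighbours = np.array([
                field[i, j]
                for i in range(max(r - 1, 0), min(r + 2, rows))
                for j in range(max(c - 1, 0), min(c + 2, cols))
                if (i, j) != (r, c)
            ])
            # Assumed criterion: compare with the (robust) neighbor median.
            reference = np.median(neighbours, axis=0)
            if np.linalg.norm(field[r, c] - reference) > tol:
                # Replacement as described: average of all neighbors.
                cleaned[r, c] = neighbours.mean(axis=0)
    return cleaned
```

The median is used only for detection, so that a single outlier does not make its valid neighbors look inconsistent as well.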
Finally, with the calculated displacement and the time interval between
frames, we determine the velocity field along the hopper.
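
The conversion from displacement to velocity is a division by the inter-frame interval, which is 1/fps (10 ms at 100 fps and about 7 ms at 142 fps). The sketch below illustrates this; the spatial calibration `metres_per_pixel` is a hypothetical parameter, since the text does not give the pixel scale.

```python
def displacement_to_velocity(dx_px, dy_px, fps, metres_per_pixel):
    """Convert a per-frame pixel displacement into a velocity in m/s.

    fps:              camera frame rate (100 or 142 in the experiments).
    metres_per_pixel: hypothetical calibration factor, not given in the text.
    """
    dt = 1.0 / fps  # time between two consecutive frames, in seconds
    vx = dx_px * metres_per_pixel / dt
    vy = dy_px * metres_per_pixel / dt
    return vx, vy
```

For instance, a vertical displacement of 3 pixels per frame at 100 fps with an assumed scale of 1 mm per pixel corresponds to about 0.3 m/s.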

4 Embedded Implementation Using System on Chip


The implementation of increasingly complex systems is possible thanks to the
development of modern technologies that allow on-chip systems to include the
processing system (PS) and programmable logic (PL) in a single integrated system.
These technologies permit hardware/software (H/S) co-design to reduce
processing time. In this context, with the PYNQ (Python + Zynq) framework [10]
we can create applications for SoC and MPSoC devices, using Python through
Jupyter Notebook at the PS level and the available hardware configurations
of the PL through the so-called overlays.
The main algorithm implementing the PIV technique was developed and tested on a
CPU and, after verifying its correct functionality, it was ported to the System
on Chip using Jupyter Notebook in the processing system, accessed through an Ethernet
connection.
Measuring the execution times is the first step towards H/S
co-design in future work, with the aim of obtaining a final embedded implementation of the
complete system that achieves real-time processing in a portable
device.
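
Execution times of this kind can be collected with a small wrapper such as the one sketched below, which runs unchanged on a desktop CPU and in a notebook on the board. The function name and the per-frame callable `process_frame` are hypothetical placeholders for the actual pipeline, and the paper does not state which timer was used.

```python
import time


def time_frames(process_frame, frames):
    """Run `process_frame` on every frame and return (seconds, results).

    process_frame: callable applied to each frame (placeholder for the
                   enhancement + PIV pipeline).
    frames:        iterable of frames, e.g. the first 50, 100, 150 or 200.
    """
    start = time.perf_counter()
    results = [process_frame(frame) for frame in frames]
    elapsed = time.perf_counter() - start
    return elapsed, results
```

Timing the whole batch rather than single frames includes any per-frame Python overhead in the total.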

5 Results
Experimental setup: The algorithm was implemented and executed on a Core i7
CPU at 3.4 GHz with 64 GB of RAM and a GeForce GTX 1070, using Python 3.6.7,
TensorFlow 1.12 and OpenCV 3.4.1. For the embedded implementation, a
PYNQ-Z1 board from Xilinx was used. The input videos were captured with an
IDS UI-3160CP-C-HQ Rev.2.1 digital camera, with a resolution of 900 × 400
pixels at 100 fps for millet seed and 940 × 328 pixels at 142 fps for black sesame.
Figure 4 (a) and (b) show the results (velocity field) for the millet and sesame
seeds. It can be noted that this technique predicts the motion of the
grains quite well in the different zones of the hopper. Possible errors can be related to
stagnation zones (grains move very slowly, or undergo a short displacement only every many
frames) or to very fast motion (central region with fluctuations of the flow). It is
also important to note that the results for sesame seeds are very good despite the
fact that the grains have a very different geometry and color compared to millet.

Fig. 4. Velocity field of the grains: (a) Millet (b) Black sesame.

Regarding the execution times, the experiments were carried out by
processing 50, 100, 150 and 200 frames of the discharge with three types of grain
as input and three implementations: CPU and PYNQ-Z1,
both with simple image processing at the pre-processing level, and CPU using the neural
network.
The results are presented in Table 1. As can be observed, the execution
time increases approximately linearly with the number of processed frames. The
differences between the technologies are related to the processing system:
an i7 for the CPU and a dual-core Cortex-A9 for the PYNQ-Z1. Despite the run-time
differences, the algorithm was fully implemented in the embedded system, enabling
the next stage of H/S co-design.

Table 1. Execution times in seconds: (A) CPU implementation with simple image
processing, (B) CPU implementation using neural network, (C) PYNQ-Z1
implementation with simple image processing.

          (A) CPU [s]                (B) CPU NN [s]             (C) PYNQ-Z1 [s]
Frames    Millet    Black   Canary   Millet    Black   Canary   Millet    Black   Canary
                   sesame     seed            sesame     seed            sesame     seed
50          7.69     4.69     9.76    26.95    21.58    28.29   102.88    79.84   154.15
100        10.80     6.93    13.13     50.5     38.8    50.17   221.01   136.35   234.57
150        13.96     9.29    16.56    74.24    55.19    71.88   300.84   192.67   318.81
200        20.04    11.57    20.07    97.08     72.8    93.75   415.08   254.49   405.06
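
As a rough check on how the execution time scales with the number of frames, the millet columns of Table 1 can be fitted with a straight line per configuration. The sketch below is only illustrative: the helper name `fit_line` is ours, and summarizing the trend with an ordinary least-squares fit is an assumption, not a step of the original experiments.

```python
# Millet execution times from Table 1, in seconds, for 50/100/150/200 frames.
FRAMES = [50, 100, 150, 200]
TIMES = {
    "CPU": [7.69, 10.80, 13.96, 20.04],
    "CPU NN": [26.95, 50.5, 74.24, 97.08],
    "PYNQ-Z1": [102.88, 221.01, 300.84, 415.08],
}


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


for name, times in TIMES.items():
    slope, intercept = fit_line(FRAMES, times)
    # Ratio to the CPU run with simple image processing at 200 frames.
    ratio = times[-1] / TIMES["CPU"][-1]
    print(f"{name}: {slope:.3f} s/frame, offset {intercept:.2f} s, "
          f"{ratio:.1f}x the CPU time at 200 frames")
```

The fitted slope is an estimate of the cost per additional frame, while the offset absorbs fixed start-up costs; with only four points per configuration, both should be read as indicative values.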

The speed of the grains at the final line of the outlet vs. position is shown in
Fig. 5. With 200 processed frames, better results are obtained (fewer
fluctuations), even though the execution time is longer than the one obtained
with 50 frames. The behavior of the speed observed at the exit of the silo
(Fig. 5) is the expected one for a granular discharge [6]; that is, the speed
is highest in the center of the hopper and tends to zero as the grains approach
the edges of the outlet opening. Besides, the curves have the shape of an arc
(the typical arc formed by the particles in the outlet, where they describe a
free fall). It can also be observed that black sesame grains present higher
velocities than the millet ones. This may be due to the fact that the seeds
have different characteristic sizes. Nevertheless, in the analyzed cases the
width of the outlet opening is also different (to avoid blockages); thus, a
more in-depth study should be carried out to analyze the dependence of the
speed at the outlet on the size of the seeds and the outlet width.
When comparing the displacements of the seeds found in the different
interrogation windows through PIV with the ones determined visually, the
differences are of the order of 10%. Also, inside the hopper, these differences
become more noticeable closer to the walls of the silo. This may be due to the
transition of the behavior from fast discharge zone to stagnant zone. In this
transition, the displacements of the grains are quite small, and some of them
do not even move.
On the other hand, the enhancement stage improves the subsequent processing,
and a trade-off between the pre-processing stage and its impact on the final
velocity field must be taken into account. With the neural network, run times
are longer than with the simple processing for the enhancement task, but using
WESPE we can obtain more vectors in the final velocity field along the hopper.

Fig. 5. Speed at the final line in pix/sec: Millet (top) and black sesame (bottom)

6 Conclusions and Future Work


In this work we developed an algorithm for the analysis of the motion of
different granular materials within a silo hopper. Different stages were
carried out in order to improve the motion detection. This goal was
successfully achieved for all the tested cases and, in particular, the number
of frames used in the process proves relevant to reducing fluctuations in the
obtained velocity. The algorithm was executed using different technologies (CPU
and PYNQ-Z1) and different pre-processing methods in stage 1 (image processing
and neural network). As expected, the execution times were lower for the CPU
implementation. Nevertheless, with these results and the recent innovations in
neural networks and FPGAs, in future work we can improve the execution times to
reach a good compromise between processing time and a solid pre-processing
stage, in order to generate a robust and precise velocity field. Also, future
efforts will be dedicated to testing more types of seeds, processing the
velocity field in the entire silo (not only the area of the outlet) and
incorporating the video camera in an embedded system.

References
1. Wang, Y., Zhang, J., Cao, Y., Wang, Z.: A deep CNN method for underwater image
enhancement. In: IEEE International Conference on Image Processing (ICIP),
Beijing, pp. 1382–1386 (2017)
2. Kadar, M., Onita, D.: A deep CNN for image analytics in automated manufacturing
process control. In: 11th International Conference on Electronics, Computers and
Artificial Intelligence (ECAI), Pitesti, Romania, pp. 1–5 (2019)
3. Ko, S., Yu, S., Kang, W., Park, C., Lee, S., Paik, J.: Artifact-free low-light video
enhancement using temporal similarity and guide map. IEEE Trans. Ind. Electron.
64(8), 6392–6401 (2017)
4. Job, N., Dardenne, A., Pirard, J.P.: Silo flow-pattern diagnosis using the tracer
method. J. Food Eng. 91(1), 118–125 (2009)
5. Eurocode 1, Basis of design and actions on structures - 4: Actions in silos and
tanks (1998)
6. Villagrán, C.: Efecto de los parámetros de forma de los granos y del ángulo de
inclinación de la tolva en el flujo de semillas en silos. Trabajo Final de Licenciatura
en Física - FCFMyN - UNSL, and references within (2018)
7. Westerweel, J.: Fundamentals of digital particle image velocimetry. Meas. Sci.
Technol. 8(12), 1379–1392 (1997)
8. Ignatov, A., Kobyshev, N., Timofte, R., Vanhoey, K., Van Gool, L.: WESPE:
weakly supervised photo enhancer for digital cameras. In: IEEE International
Conference on Computer Vision and Pattern Recognition Workshop (CVPRW) (2018)
9. Ignatov, A., Kobyshev, N., Timofte, R., Vanhoey, K., Van Gool, L.: DSLR-quality
photos on mobile devices with deep convolutional networks (2017)
10. PYNQ - http://www.pynq.io/. Accessed 08/2020
Analysis and Design of a Yolo like DNN
for Smoke/Fire Detection for Low-cost
Embedded Systems

Alessio Gagliardi(✉), Marco Villella, Luca Picciolini, and Sergio Saponara

Department of Information Engineering, University of Pisa, Via G. Caruso 16, 56122 Pisa, Italy
alessio.gagliardi@phd.unipi.it

Abstract. This paper proposes a video-based fire and smoke detection technique
to be implemented as an antifire surveillance system on low-cost and low-power
single board computers (SBCs). The algorithm is inspired by YOLO (You Only Look
Once), a real-time, state-of-the-art object detection system able to classify
and localize several objects in a single camera frame. Our architecture is
based on three main segments: Bounding Box Generator, Support Classifier and
Alarm Generator. The custom Yolo network was trained using datasets already
available in the literature and tested against classical and DL (Deep Learning)
algorithms, achieving the best performance in terms of accuracy, F1, recall and
precision. The proposed technique has been implemented on four low-cost
embedded platforms, which are compared with respect to the frames per second
that they can achieve in real time.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 12–22, 2021.
https://doi.org/10.1007/978-3-030-66729-0_2

1 Introduction
Video surveillance involves the action of observing a scene and looking for specific
incorrect behavior, emergency situations or dangerous conditions. For that purpose,
we develop a video surveillance system for fire and smoke detection by designing a
video camera-based algorithm. Similar methods have already been proposed in the
literature, e.g. those that use classic techniques such as background subtraction
using Visual Background Estimation (ViBe) [1], fuzzy logic [2] and Gaussian Mixture
Model (GMM) [3] for recognition. Other approaches exploit the Kalman filter [4] for
the estimation of the moving smoke blobs, while other authors propose modern
neural-network object detection, retraining existing models or creating new ones
from scratch [5–8]. While traditional video-based smoke detectors use hand-designed
features such as colour and shape characteristics of smoke, recent Deep Learning
algorithms allow for automatic data-driven feature extraction and classification
from raw image streams [6, 16, 17]. In [6] the authors propose an image
classification method to detect fire and smoke instances in various videos using a
simple Deep Neural Network (DNN) classifier. In [16] and [17] the authors fine-tuned
two similar You Only Look Once (YOLO) [9] networks to detect fires in indoor
scenarios. However, the main issues of these studies involve the implementation on
an embedded system or the dataset used for the training phase. In fact, both [16]
and [17] lack an implementation on embedded systems and use a limited dataset: 1720
images were used in [16], while only 60 were used in [17]. Given their datasets,
these two networks cover only indoor scenarios and are therefore difficult to apply
in real-world scenarios. The work in [6] instead proposes a lightweight
Convolutional Neural Network (CNN) and an implementation on a Raspberry Pi 3.
However, such a CNN is not able to localize the objects inside each camera frame, as
YOLO does, but works as an image classifier. This means that the frame must be
completely covered by flames to have good performance, and therefore such a method
is not applicable to fire prevention systems.
The goal of the project is to build a solution that combines the robustness and
accuracy provided by a Neural Network (NN) model with the portability required
for an embedded implementation. Hence, a Neural Network inspired by Yolo [9] was
designed from scratch. Subsequently, we added an alarm generator algorithm, based
on the persistence of anomalies in the image, to increase robustness. This solution
has finally been deployed on four different embedded systems with different prices
and capabilities.
The main contributions of this work are as follows:

• We introduce a YOLO-like neural network for fire and smoke detection, able to
consistently detect smoke and fire without generating false alarms, compared to
state-of-the-art algorithms.
• We also introduce a new heterogeneous fire and smoke training dataset combining
images from different indoor and outdoor scenarios. The entire dataset is composed
of 90 danger videos, 1200 danger images, 28 neutral videos, and 600 neutral
images.
• We also present implementations on four low-cost embedded systems, comparing them
with respect to their real-time processing performance and power consumption. This
comparison is necessary to verify which specification best fits an IoT application.

Hereafter, the paper is organized as follows: Sect. 2 describes the overall
Neural Network. Section 3 describes the dataset and the configuration parameters
used in the training phase. Section 4 reports a short comparison with
state-of-the-art algorithms and a comparison of real-time processing on different
boards. Conclusions are reported in Sect. 5.

2 The Deep Neural Network Architecture

YOLO [9] is a fast object detection method in which a single CNN simultaneously
predicts multiple bounding boxes and class probabilities for those boxes. The method
has become popular because it achieves high accuracy while also being able to run in
real time. The algorithm “only looks once” at the image in the sense that it requires
only one forward propagation pass through the neural network to make predictions and
then output multiple bounding boxes.
The YOLO method divides the input image into an S × S grid. If the center of an object
falls into a grid cell, that grid cell is responsible for detecting that object. Each
grid cell predicts bounding boxes and confidence scores for those boxes. These
confidence scores reflect how confident the model is that the box contains an object
and how accurate it thinks the predicted box is. The authors of YOLO define the
confidence as Pr(Object) ∗ IOU, where IOU is the Intersection over Union. The
confidence is equal to zero if no object exists in that cell. Otherwise, the
confidence score should equal the IOU between the predicted box and the ground truth.
The class-specific confidence for each grid cell is then defined as
Pr(Class_i | Object) ∗ Pr(Object) ∗ IOU = Pr(Class_i) ∗ IOU. The bounding box with
the highest probability value is selected to separate one object from another, as
shown in Fig. 1.
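
The confidence definition above can be illustrated with a short sketch. The box format `(x_min, y_min, x_max, y_max)`, the function names and the numeric values are assumptions chosen for illustration; they are not taken from the YOLO implementation.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # The overlap is empty when the boxes do not intersect.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def class_confidence(p_object, p_class_given_object, box_iou):
    """Class-specific confidence: Pr(Class_i | Object) * Pr(Object) * IOU."""
    return p_class_given_object * p_object * box_iou


predicted = (10, 10, 50, 50)       # illustrative pixel coordinates
ground_truth = (30, 30, 70, 70)
overlap = iou(predicted, ground_truth)
print(overlap)
print(class_confidence(0.9, 0.8, overlap))
```

Two identical boxes give an IOU of 1, while disjoint boxes give 0, so the confidence of a cell drops both when the object is unlikely and when the box is poorly placed.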

Fig. 1. Detection of multiple objects using the YOLO method.

The original YOLO method uses a 7 × 7 grid and 24 convolutional layers followed by
two fully connected layers, for a total model size of almost 200 MB. The idea is
therefore to design a custom neural network that keeps the efficiency and speed of
the YOLO method while saving memory space, obtaining a model smaller than 200 MB and
with fewer than 24 convolutional layers so that it fits in an embedded system.

Fig. 2. Picture of the full deep network architecture. Blue layers represent
convolutional blocks, while red layers represent fully connected blocks. On top, the
bounding box generator subnet; on the bottom, the support classifier.

Inspired by YOLO, we developed a Deep Neural Network architecture composed of
two different CNNs. The whole architecture receives as input the RGB frames of a
video source, performs fire and smoke detection within each input frame and, when
needed, triggers a pre-alarm signal. The first CNN is a Bounding Box Generator, which
has the task of collecting each fire and smoke instance within a frame into a single
bounding box. The second CNN is a Support Classifier, which has the purpose of
rejecting false positives coming from the first sub-network. Furthermore, this
“danger classifier” decides whether to trigger the pre-alarm state by inspecting the
detections of the last few frames and determining their consistency in terms of
position, frequency and label. Finally, the Alarm Generator Algorithm checks the
persistence of the anomaly and, if it persists, triggers a fire alarm. Figure 2
shows a representation of the full model’s architecture: the connection between the
first subnet’s output and the second subnet’s input is handled by cropping the region
of the image corresponding to each predicted bounding box. Once cropped, these
regions are processed and resized before being fed to the Support Classifier, which
re-evaluates their validity. Filtered outputs are then stored in a list and checked
by the alarm generator algorithm, which is the final judge of the need to trigger
the pre-alarm state.

2.1 The Bounding Box Generator Subnetwork

The Bounding Box Generator subnet can be further split into two logical portions: the
convolutional portion, or Feature Extractor, and the fully connected portion, or
Feature Regressor. The first consists of a fully convolutional network whose role is
to extract the desired features from the 96 × 96 × 3 RGB input image; the second is
a fully connected segment that stores most of the first subnet’s capacity and uses
the extracted features to assign bounding boxes. A Global Max-Pooling (GMP) layer
interconnects the two portions. This layer passes far fewer features, and hence
requires far fewer parameters in the following layer, than the Flatten layer that is
usually used at this point. Each convolutional and fully connected block groups
several simpler layers, as depicted in Figs. 2 and 3.
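
To give a feeling for why Global Max-Pooling saves parameters with respect to Flatten, the following sketch counts the weights of the first dense layer in both cases. The 96 × 96 input and the 512 kernels come from the text; the number of pooling stages and the dense layer width are hypothetical values chosen only for this illustration.

```python
def flatten_features(height, width, channels):
    """Flatten keeps every activation of the feature map."""
    return height * width * channels


def global_max_pool_features(channels):
    """Global max-pooling keeps one value (the maximum) per channel."""
    return channels


def dense_weights(n_inputs, n_units):
    """Weights of a dense layer, biases excluded."""
    return n_inputs * n_units


INPUT_SIDE = 96      # from the text: 96 x 96 x 3 input image
CHANNELS = 512       # from the text: up to 512 kernels
POOL_STAGES = 4      # assumption: four 2 x 2 poolings with stride 2
DENSE_UNITS = 256    # assumption: width of the first dense layer

side = INPUT_SIDE // 2 ** POOL_STAGES   # side of the last feature map
with_flatten = dense_weights(flatten_features(side, side, CHANNELS), DENSE_UNITS)
with_gmp = dense_weights(global_max_pool_features(CHANNELS), DENSE_UNITS)
print(with_flatten, with_gmp, with_flatten // with_gmp)
```

Under these assumptions the saving factor equals the number of spatial positions in the last feature map, since GMP collapses each channel to a single value.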

Fig. 3. Convolutional block composition (left) and fully connected block composition (right).

Each Convolutional block consists of a Convolutional layer, ReLU and Batch
Normalization. The Convolutional portion handles the heavy image processing
computations; its layers have been designed with a filter size of 3 × 3, a unitary
stride, dimension-preserving zero-padding and a depth that ranges from 16 to 512
kernels throughout the whole architecture.
The Fully Connected block consists instead of a Dense layer, Leaky ReLU and Batch
Normalization. Such layers represent the majority of the model capacity, being the
main source of the network’s trainable weights. We decided to adopt a deeper stack of
smaller layers so that we could keep the network memory efficient by reducing the
parameter count, while enhancing the layers’ expressivity through a higher number of
activation non-linearities. The Leaky ReLU is selected as activation function because
it achieves the best inference computation time while being more robust than the
traditional ReLU unit, which can cause many dead connections. Max-Pooling layers are
implemented with a kernel size of 2 × 2 and a stride of 2. Moreover, an L2-norm
regularization with a 0.01 regularization strength has been applied to all trainable
layers to discourage skewed, large-weight distributions from forming during training.
Batch Normalization layers and Dropout layers were used to further reduce
over-fitting issues and enforce the maximum possible regularity in the weight
distribution.
The Bounding Box Generator subnet returns as output a tensor of 3 × 3 × 6 (Fig. 4).
The tensor represents a 3 × 3 grid on the image, and each of the nine cells holds
six values: Detection Probability, X coordinate of the bounding box’s center,
Y coordinate of the bounding box’s center, Width of the bounding box, Height of the
bounding box and Class Score (0 if smoke, 1 if fire).
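
A minimal sketch of how such a 3 × 3 × 6 output could be decoded into detections is given below. The text does not state how the coordinates are normalized or which thresholds are used, so the conventions here (values expressed as fractions of the image size, a detection threshold of 0.5, a class score rounded at 0.5) and the name `decode_grid` are assumptions.

```python
GRID = 3
LABELS = ("smoke", "fire")   # class score: 0 if smoke, 1 if fire


def decode_grid(output, image_size=96, threshold=0.5):
    """Turn a 3 x 3 x 6 nested list into a list of detections.

    Assumptions (not stated in the text): x, y, w, h are fractions of the
    image size, and a cell counts as a detection at or above `threshold`.
    """
    detections = []
    for row in range(GRID):
        for col in range(GRID):
            prob, x, y, w, h, class_score = output[row][col]
            if prob < threshold:
                continue
            detections.append({
                "cell": (row, col),
                "label": LABELS[1 if class_score >= 0.5 else 0],
                "probability": prob,
                # Corner coordinates in pixels: (x_min, y_min, x_max, y_max).
                "box": (
                    (x - w / 2) * image_size,
                    (y - h / 2) * image_size,
                    (x + w / 2) * image_size,
                    (y + h / 2) * image_size,
                ),
            })
    return detections


# Build an empty grid and place one illustrative fire in the central cell.
output = [[[0.0] * 6 for _ in range(GRID)] for _ in range(GRID)]
output[1][1] = [0.9, 0.5, 0.5, 0.25, 0.5, 1.0]
print(decode_grid(output))
```

Cells whose detection probability stays below the threshold are skipped, so an all-zero output produces an empty list.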

Fig. 4. The 3 × 3 grid for detection encoding applied to a frame.

2.2 The Support Classifier Subnetwork


The input of the support classifier is a 24 × 24 × 3 RGB image, while the output is
the danger score, a 2-dimensional array where the first number is the probability
that the input is classified as danger and the second is the probability that the
input is classified as neutral. We selected a typical Convolutional block
architecture with a 2 × 2 pooling between each block to halve the image size. At the
end of the series of Convolutional blocks there is a 2D Global max-pooling layer,
which is an ordinary max-pooling layer with pool size equal to the size of its
input; it is useful because it reduces the tensor dimensions and the overall number
of trainable weights. Then come the Fully Connected blocks, each followed by a
dropout layer to prevent overfitting. At the end of the architecture there is the
Softmax layer, which is a generalization of the binary Logistic Regression
classifier to multiple classes.
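
The relation between the Softmax layer and binary logistic regression can be checked numerically. In the sketch below the two scores are illustrative values, not outputs of the trained network; with two classes, the softmax probability of the first class equals the logistic function of the difference between the two scores.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(value - peak) for value in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function used by binary logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


danger_logit, neutral_logit = 2.0, -1.0   # illustrative values
danger_score = softmax([danger_logit, neutral_logit])
print(danger_score)
# With two classes, softmax reduces to a sigmoid of the score difference.
print(sigmoid(danger_logit - neutral_logit))
```

The two entries of `danger_score` always sum to one, which matches the description of the output as a pair of danger and neutral probabilities.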
There are three Convolutional blocks, each composed of a Convolutional layer, an
Activation layer and a Batch Normalization layer in series with another
Convolutional layer, Activation layer and Batch Normalization layer of the same
dimensions. The convolutional and fully connected blocks of this subnetwork share
the architecture already shown in Fig. 3.

2.3 The Alarm Generator Algorithm

The final goal of this work is to develop an efficient video surveillance system. We
must therefore avoid any false positive, that is, an alarm generated without real
danger. To do so, the algorithm checks every prediction of the two neural networks
within an observation window of one second. If at least 60% of the frames in the
window are marked as “danger” by both the bounding box generator and the support
classifier, then an alarm is triggered.
Figure 5 shows the real-time detection on some test videos applied as input to our
algorithm. Fire is enclosed in a red bounding box, while smoke is enclosed in a blue
bounding box. A red circle at the top left of the image indicates that the alarm has
been generated; otherwise the circle is green.
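
The alarm rule described above can be sketched with a sliding window over the most recent frames. The class name `AlarmGenerator`, the derivation of the window length from a frame rate, and the choice to wait for a full window before raising an alarm are assumptions made for this illustration; only the one-second window and the 60% threshold come from the text.

```python
from collections import deque


class AlarmGenerator:
    """Raise an alarm when enough recent frames are marked as danger."""

    def __init__(self, fps, ratio=0.6):
        # One-second observation window, i.e. `fps` frames (assumption:
        # the window length is derived from the processing frame rate).
        self.window = deque(maxlen=max(1, int(fps)))
        self.ratio = ratio

    def update(self, detector_danger, classifier_danger):
        """Record one frame; both networks must agree for it to count."""
        self.window.append(bool(detector_danger and classifier_danger))
        if len(self.window) < self.window.maxlen:
            return False   # assumption: wait until a full second is observed
        return sum(self.window) / len(self.window) >= self.ratio


alarm = AlarmGenerator(fps=10)
frames = [True] * 5 + [False] * 5 + [True] * 6
states = [alarm.update(danger, danger) for danger in frames]
print(states)
```

In this example the alarm is raised only on the last frame, when six of the ten most recent frames are marked as danger; isolated detections never reach the threshold.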

Fig. 5. Example of fire and smoke detection in test videos.

3 The Dataset and the Training Procedure

To train these CNNs we use a dataset composed of both images and videos of fire,
smoke, and neutral scenarios. Some of these videos and images are taken from the
“Firesense” dataset available online [10], others from Foggia’s [11] and Sharma’s
[12] datasets, and most of the images from Yuan’s dataset [13]. Other fire videos
were provided by Trenitalia, which showed interest in equipping the cameras already
installed inside the train wagons with intelligent fire detection algorithms. The
entire dataset is composed of 90 danger videos, 1200 danger images, 28 neutral
videos, and 600 neutral images. A total of 60 danger videos and 22 neutral videos
were selected as the test dataset, while 30 danger videos, 6 neutral videos, and all
1800 images were used for training and validation. We used data augmentation
techniques such as shift, rotation and flip to improve the training by increasing
the number of images from 2400 to 4800. We labeled the input data of the two neural
subnets in different ways. The labeling for the Bounding Box Generator consists in
drawing a bounding box by hand around any target (smoke or fire) present in the
images or video frames. The six values discussed previously are then recorded. The
labeling for the Support Classifier consists in sorting the cropped images into two
different folders, one containing the danger images and the other containing the
neutral images. For both CNNs, the chosen optimizer is Adam, a stochastic gradient
descent method based on adaptive estimation of first-order and second-order moments.
A batch size of 64 was configured for the Bounding Box Generator, while a batch size
of 32 was chosen for the Support Classifier. While the Bounding Box Generator network
was trained for 10000 epochs, 2500 epochs were enough for the Support Classifier to
reach convergence. For training the first neural network we used a custom loss
function designed specifically to correctly predict the presence/absence of a target
within a cell and, in case of presence of smoke/fire, to learn its bounding box
parameters. This was done to reduce training time and to avoid wasting the network’s
capacity on learning useless parameters, such as the zeros in the depth vectors of
cells that contain no danger. The Support Classifier is, as already reported, a
Softmax classifier, and so it uses a binary cross-entropy loss of the form:

$$
L(y, \hat{y}) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \hat{y}_i + (1 - y_i) \log\left(1 - \hat{y}_i\right) \right] \qquad (1)
$$

where y is the ground truth value, ŷ is the prediction and N is the number of samples
per batch. The Softmax classifier hence minimizes the cross-entropy between the
estimated class probabilities and the “true” distribution. For the training phase we
considered three initial values of the learning rate from 10^-2 to 10^-5. The value
was finally set to 10^-4, achieving the best model after 7363 epochs for the first
network and around 2100 epochs for the second, as shown by the red marks in Fig. 6.
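
The loss of Eq. (1) can be evaluated directly on a small batch. The sketch below is only illustrative: the labels and predictions are made-up values, and the clipping constant `eps`, which avoids taking the logarithm of zero, is our choice.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over a batch, following Eq. (1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


labels = [1, 0, 1, 1]                 # 1 = danger, 0 = neutral (illustrative)
confident = [0.9, 0.1, 0.8, 0.95]     # predictions close to the labels
uncertain = [0.6, 0.4, 0.5, 0.55]     # predictions close to 0.5
print(binary_cross_entropy(labels, confident))
print(binary_cross_entropy(labels, uncertain))
```

Predictions that are closer to the ground truth give a lower loss, which is the quantity the training procedure drives down.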

Fig. 6. Training loss and validation loss at a learning rate of 10^-4 for the
bounding box generator network (left) and for the support classifier network (right).

4 Results of the Proposed DNN Technique


To assess the performance of the proposed technique, the first comparison was
performed with respect to non-AI-based algorithms, in terms of accuracy, F1,
precision and recall. These tests were conducted on the set of videos presented in
[4], and the results are shown in Fig. 7. We can clearly see that our solution,
referred to as Custom Yolo, outperforms both [4] and [14, 15] in terms of all four
metrics. We performed another comparison against other AI-based solutions available
in the state of the art, using the same dataset. In this case only accuracy and
recall were compared, since those were the only metrics available in most of the
research literature. The results in Table 1 show that our solution achieves the
highest accuracy and the second highest recall, being outperformed only by an R-CNN
based solution. However, the R-CNN has been shown to run about 25x slower, in terms
of frames per second (fps), than our solution on the tested embedded systems. Hence,
we tested the algorithm on four different embedded systems with different prices and
capabilities: a Raspberry Pi 3, a Raspberry Pi 4 with 4 GB of RAM, an Nvidia Jetson
Nano and an Nvidia Jetson AGX Xavier. We measured the maximum frames per second that
each can achieve at different power consumption settings, where available. Results
are reported in Table 2.
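
For reference, the four metrics used in this comparison can be computed from the counts of a confusion matrix as in the sketch below. The function name and the counts are hypothetical and are not the paper's results; they only illustrate the standard definitions.

```python
def detection_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to exercise the formulas.
print(detection_metrics(tp=45, fp=2, tn=50, fn=3))
```

Recall penalizes missed dangers (false negatives) while precision penalizes false alarms (false positives), which is why both are relevant for an antifire system.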

Fig. 7. Performance Comparison with Non-AI-Based Methods [4, 14, 15].

The Nvidia Jetson Xavier obtains the best results in MAXN configuration, reaching
about 80 frames per second. In this configuration, the Xavier board is about 4 times
faster than the Nvidia Jetson Nano and about 16 times faster than the Raspberry Pi 3,
which reaches only 5 fps. The Nvidia Jetson Nano seems to be the ideal compromise
between performance and power consumption. In fact, the Raspberry Pi 4 does not get
more than 10 fps even though it consumes 5 W on average. The Jetson Nano board
reaches about 20 fps in its 10 W configuration and can therefore be considered a
feasible final platform for an IoT antifire system.

Table 1. Accuracy and Recall of our Custom CNN vs state-of-the-art AI-based methods.

Reference                      Accuracy   Recall
Custom Yolo                    0.973      0.978
Saponara et al. [18] R-CNN     0.936      1
Jadon et al. [6]               0.939      0.94
Filonenko et al. [7]           0.85       0.96
Yuan et al. [8]                0.86       0.53
Celik et al. [2]               0.83       0.6

Table 2. Average fps values achieved across all the four platforms.

Platform                     Fps
Raspberry Pi 3 Model b       5
Raspberry Pi 4 4 GB RAM      10
Nvidia Jetson Nano 5 W       12
Nvidia Jetson Nano MAXN      20
Nvidia Jetson AGX 10 W       26
Nvidia Jetson AGX 15 W       33
Nvidia Jetson AGX MAXN       81
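
The relative speeds quoted in the text can be recomputed from Table 2, together with a rough frames-per-watt figure for the platforms whose power is stated. In this sketch the power values are the nominal figures mentioned in the text or in the table; treating them as the actual consumption is an assumption, and platforms without a stated power are left out of the efficiency column.

```python
# (platform, average fps from Table 2, nominal power in watts or None)
PLATFORMS = [
    ("Raspberry Pi 3 Model b", 5, None),     # power not stated
    ("Raspberry Pi 4 4 GB RAM", 10, 5),      # "5 W on average" in the text
    ("Nvidia Jetson Nano 5 W", 12, 5),
    ("Nvidia Jetson Nano MAXN", 20, 10),     # 10 W configuration in the text
    ("Nvidia Jetson AGX 10 W", 26, 10),
    ("Nvidia Jetson AGX 15 W", 33, 15),
    ("Nvidia Jetson AGX MAXN", 81, None),    # power not stated
]

baseline_fps = PLATFORMS[0][1]   # Raspberry Pi 3 as the reference
for name, fps, watts in PLATFORMS:
    speedup = fps / baseline_fps
    efficiency = f"{fps / watts:.1f} fps/W" if watts else "n/a"
    print(f"{name:<26}{speedup:>5.1f}x  {efficiency}")
```

The ratios 81/20 and 81/5 reproduce the factors of about 4 and about 16 reported for the Xavier board in MAXN configuration.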

5 Conclusion

In this paper we presented the design, implementation and a short comparison of a
DNN algorithm for smoke and fire detection inspired by the YOLO technique. We have
shown that through a dedicated design procedure it is possible to build DNN-based
object detection models that preserve and improve the flexibility and performance of
traditional state-of-the-art solutions. We have also achieved better performance than
similar algorithms using both classic image processing methods and modern AI
techniques, implementing the technique on different embedded systems. As future work,
the authors intend to continue the comparison with state-of-the-art algorithms by
measuring the detection delay and the CPU, GPU, memory usage and power consumption on
the boards presented here. The authors are also investigating a possible
implementation of the described methodology on FPGA.

Acknowledgments. Work partially supported by Dipartimenti di Eccellenza Crosslab
Project by MIUR.

References
1. Vijayalakshmi, S.R., Muruganand, S.: Smoke detection in video images using background
subtraction method for early fire alarm system. In: 2017 2nd International Conference on
Communication and Electronics Systems (ICCES), pp. 167–171 (2017)
2. Çelik, T., Özkaramanlı, H., Demirel, H.: Fire and smoke detection without sensors: image
processing-based approach. In: IEEE 15th European Signal Processing Conference,
pp. 1794–1798 (2007)
3. Zhang, Q., et al.: Dissipation function and ViBe based smoke detection in video. In: 2017 2nd
International Conference on Multimedia and Image Processing (ICMIP), pp. 325–329. IEEE
(2017)
4. Gagliardi, A., Saponara, S.: AdViSED: advanced video smoke detection for real-time
measurements in antifire indoor and outdoor systems. Energies 13(8) (2020)
5. Elhanashi, A., Gagliardi, A., Saponara, S.: Exploiting R-CNN for video smoke/fire sensing in
antifire surveillance indoor and outdoor systems for smart cities. In: IEEE 6th International
Conference on Smart Computing (SSC), Bologna, Italy, September 2020
6. Jadon, A., et al.: Firenet: a specialized lightweight fire & smoke detection model for real-time
IoT applications. arXiv preprint arXiv:1905.11922 (2019)
7. Filonenko, A., Hernández, D.C., Jo, K.-H.: Fast smoke detection for video surveillance using
CUDA. IEEE Trans. Ind. Inform. 14(2), 725–733 (2017)
8. Yuan, F., Fang, Z., Wu, S., Yang, Y., Fang, Y.: Real-time image smoke detection using staircase
searching-based dual threshold AdaBoost and dynamic analysis. IET Image Process. 9(10),
849–856 (2015)
9. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time
object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, pp. 779–788 (2016)
10. Dimitropoulos, K., Barmpoutis, P., Grammalidis, N.: Spatio-temporal flame modeling and
dynamic texture analysis for automatic video-based fire detection. IEEE Trans. Circ. Syst.
Video Technol. 25(2), 339–351 (2014)
11. Foggia, P., Saggese, A., Vento, M.: Real-time fire detection for video-surveillance applications
using a combination of experts based on color, shape, and motion. IEEE Trans. Circ. Syst.
Video Technol. 25(9), 1545–1556 (2015)
12. Sharma, J., Granmo, O.C., Goodwin, M., Fidje, J.T.: Deep convolutional neural networks for
fire detection in images. In: International Conference on Engineering Applications of Neural
Networks, pp. 183–193. Springer, Cham, August 2017
13. Yuan, F., Shi, J., Xia, X., Fang, Y., Fang, Z., Mei, T.: High-order local ternary patterns with
locality preserving projection for smoke detection and image classification. Inf. Sci. 372,
225–240 (2016)
14. Saponara, S., Pilato, L., Fanucci, L.: Early video smoke detection system to improve fire
protection in rolling stocks. In: Real-Time Image and Video Processing 2014, vol. 9139,
p. 913903. International Society for Optics and Photonics, May 2014
15. Saponara, S., Pilato, L., Fanucci, L.: Exploiting CCTV camera system for advanced passenger
services on-board trains. In: 2016 IEEE International Smart Cities Conference (ISC2), pp. 1–6.
IEEE, September 2016
16. Shen, D., Chen, X., Nguyen, M., Yan, W.Q.: Flame detection using deep learning. In: 2018
4th International Conference on Control, Automation and Robotics (ICCAR), pp. 416–420.
IEEE, April 2018
17. Lestari, D.P., Kosasih, R., Handhika, T., Sari, I., Fahrurozi, A.: Fire hotspots detection system
on CCTV videos using you only look once (YOLO) method and tiny YOLO model for high
buildings evacuation. In: 2019 2nd International Conference of Computer and Informatics
Engineering (IC2IE), pp. 87–92. IEEE, September 2019
18. Saponara, S., Elhanashi, A., Gagliardi, A.: Real-time video fire/smoke detection based on
CNN in antifire surveillance systems. J. Real-Time Image Proc., 1–12 (2020)
Video Grasping Classification Enhanced
with Automatic Annotations

Edoardo Ragusa(✉), Christian Gianoglio, Filippo Dalmonte, and Paolo Gastaldo

Department of Electrical, Electronic, Telecommunication Engineering and Naval
Architecture DITEN, University of Genoa, Genova, Italy
{edoardo.ragusa,christian.gianoglio}@edu.unige.it,
4103871@studenti.unige.it, paolo.gastaldo@unige.it

Abstract. Video-based grasp classification can enhance robotics and
prosthetics. However, its accuracy is low when compared to e-skin based
systems. This paper improves video-based grasp classification systems by
including an automatic annotation of the frames that highlights the joints
of the hand. Experiments on real-world data prove that the proposed
system obtains higher accuracy with respect to the previous solutions.
In addition, the framework is implemented on a NVIDIA Jetson TX2,
achieving real-time performance.

Keywords: Grasping classification · CNNs · Prosthetics · Embedded systems

1 Introduction
Automatic inference of grasping actions can boost fields like prosthetics and
rehabilitation. Artificial intelligence techniques in combination with electronic
skin proved effective for this task [11,21]. However, tactile sensors are costly and
limit grasping and manipulation actions [5,7]. Besides, the sensing system might
be annoying for patients.
Video-based models for hand grasp analysis can represent a non-invasive
solution [6,15]. These approaches are possible thanks to deep learning. In fact,
computer vision can address complex tasks like medical image analysis [1],
sentiment analysis [16], and sports applications [10] thanks to automatic feature
learning capabilities. However, the deployment of computer vision on embedded
devices is tricky and requires ad-hoc solutions [18]. In practice, the most accurate
solutions can rarely be deployed in real-time on embedded systems.
Video grasp detection is a challenging task composed of two sub-tasks: the
system first locates the hand in the image, then classifies the grasp action. The
literature provides three works that addressed both tasks. In [24] the authors
proposed a solution based on convolutional neural networks (CNNs) that
identifies hand grasping by filtering patches from an image. The CNN
architecture was composed of 5 convolutional layers. Patch identification used an
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 23–29, 2021.
https://doi.org/10.1007/978-3-030-66729-0_3

ensemble of three classifiers proposed in [14]. Similarly, [4] approached the two
tasks by employing a heterogeneous set of computer vision techniques for both
hand detection and feature extraction. [17] introduced a novel framework for
video-based grasping classification. Deep Learning (DL) supported the two-level
scheme thanks to an automatic refinement of the existing databases.
In addition to the closely related solutions, many works offer interesting
insights: [15] estimated contact force using visual tracking between users’ hands
and objects. In [12] the video analysis models the manipulation of deformable
objects. In [8] deep learning (DL) methods discriminate right and left hands
inside an image. [9,13] enriched the processing system of a prosthetic hand with
a video system. Finally, hand gesture recognition techniques [22] offer valuable
insight for grasp classification.
Despite the interesting results shown in [4,14,17], accuracy is still a primary
concern in video-based solutions. In practice, inference systems detect hand
positions with high accuracy, but they struggle to recognize the grasp action. This
is mostly due to intrinsic problems like self-occlusion and dataset limitations. In
fact, labeling operations are time-consuming and prevent the collection of large
datasets. In addition, during grasp actions, parts of the hand are occluded.
This paper explicitly tackles these issues using an automatic annotation of
the images. The proposed solution uses the approach proposed in [20], where the
authors trained a CNN to recognize the joints of the hand using a dataset
composed of multiple views of the hand. In practice, thanks to a projection strategy,
the network was trained to also detect the joints that were self-occluded. This
network was included in the prediction pipeline proposed in [17]. The information
extracted about joint positions is superimposed on RGB images using colored
segments that connect the joints of the hand. Accordingly, these annotations
add valuable information for grasping action recognition. Notably, using the
superimposed annotations avoids the development of a custom dataset
containing multiple views of hands with the label of the grasping action. This annotation
is expected to simplify the classification task, reducing the need for large
datasets and alleviating the self-occlusion problem, leading to better
generalization performance. Experiments on real-life videos show that the framework
outperforms recent solutions in terms of accuracy. In addition, experiments on an
NVIDIA Jetson TX2 confirm that the proposed solution retains real-time
performance on high-performance edge devices suitable for embedded
implementations.

2 Proposal

This paper extends the solution proposed in [17] by including a novel block in
the processing flow. In the following, in accordance with previous works, the
right hand is the target of the analysis. Figure 1 shows the novel solution: the
Hand detection network locates all the hands inside an input image (the red
boxes in the figure “Detections”). The Right hand heuristic block selects the
right hand using a well-known heuristic. The selected patch feeds the new

Fig. 1. Outline of the grasping classification pipeline

Annotation network, marked in green. This network superimposes a skeleton
over the hand under analysis, as shown in the “Annotated Patch” sub-image.
Finally, the Grasping Classification Network outputs a grasping label based on
the “Annotated Patch”. This work considers a bi-class output: grasp versus
pinch.
The new block simplifies the classification task by highlighting the parts of the
hand. Accordingly, the training procedure is expected to converge more easily
and the inference process should become more accurate.
In the following subsections, the paper describes the Annotation Network
and the training procedure.

2.1 Annotation Network

The annotation network superimposes a skeleton on the image of the hand. In
this work, the model proposed in [20] is the core of the annotation network.
This solution relies on a deep CNN called Convolutional Pose Machine [23].
The outputs of the network are 21 heatmaps, one per joint, whose values correspond
to the probability of each pixel being part of that joint of the hand. Figure 2
exemplifies the working scheme: sub-figure (a) shows the 21 joints of the hand,
sub-figure (b) depicts the role of the probability distribution, i.e. the outputs of
the network.
The Convolutional Pose Machine proves effective in this hard task thanks to
the custom training procedure proposed in [20]. In practice, the dataset always
contains multiple versions of a picture collected simultaneously from different,
known angles: in other words, multiple projections of the hands are available.
Using an iterative procedure, a detector produces labels for all the images.
The predictions from multiple views of an object are triangulated. When
triangulation confirms that the predictions were consistent, the images are added
to the training set. Finally, a simple software routine superimposes the skeleton on
the image given the coordinates of the joints, drawing segments that connect the
predictions, as per Fig. 2(a).
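The overlay step can be illustrated with a minimal sketch. It is not the implementation of [20]: the bone list HAND_EDGES assumes a common 21-keypoint ordering (wrist first, then four joints per finger), the helper names joints_from_heatmaps and draw_skeleton are hypothetical, and each joint is simply taken as the peak of its heatmap.

```python
import numpy as np

# Assumed keypoint ordering: 0 = wrist, then 4 joints per finger (thumb to little finger).
HAND_EDGES = [(0 if k == 0 else f * 4 + k, f * 4 + k + 1)
              for f in range(5) for k in range(4)]


def joints_from_heatmaps(heatmaps):
    """Return an (N, 2) integer array of (row, col) peaks, one per heatmap."""
    n, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(n, -1).argmax(axis=1)
    return np.stack(np.unravel_index(flat_idx, (h, w)), axis=1)


def draw_skeleton(image, joints, value=128):
    """Superimpose segments between connected joints on a copy of an H x W x 3 image."""
    out = image.copy()
    for a, b in HAND_EDGES:
        (r0, c0), (r1, c1) = joints[a], joints[b]
        # Sample enough points along the segment to leave no gaps.
        steps = int(max(abs(r1 - r0), abs(c1 - c0))) + 1
        rows = np.linspace(r0, r1, steps).round().astype(int)
        cols = np.linspace(c0, c1, steps).round().astype(int)
        out[rows, cols, :] = value  # same value on all three channels (grey)
    return out
```

A practical version would probably also discard joints whose peak probability is low; that threshold is not discussed in the text and is left out here.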

Fig. 2. Representation of the hand’s annotation outputs (Color figure online)

2.2 Training Procedure


Grasping classification is a challenging task. Accordingly, the training process
needs large datasets [3,19]. The existing datasets are noisy, lack the hands’
position inside the frame, and contain a modest number of images.
To overcome these issues, this paper improves the multi-step learning
approach proposed in [17]. A deep network pre-trained on the task of object
detection is fine-tuned on the hand detection problem using a small dataset of labeled
images, i.e. images containing annotation of the hands’ position. In practice,
only hundreds of images are sufficient to reach training convergence.
Thus, the small annotated training set lays the basis for the development
of the whole system for hand grasping classification. Next, the trained Hand
detection network infers the position of the hands in all the frames of an existing
grasping dataset, producing a large labeled dataset, i.e., a dataset in which
the position of the grasping hand inside a frame is known. Accordingly, the Right
hand heuristic can be used to identify the right hand in all the frames and to
extract the corresponding patches.
Then, the Annotation Network annotates all the patches containing the
hands that perform the grasping action. Finally, the annotated patches form the
training set for the Grasping classification network. Accordingly, the grasping
classifier is expected to converge to better solutions.
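The dataset-building steps of this procedure can be summarized in a short sketch. The three callables stand in for the Hand detection network, the Right hand heuristic, and the Annotation Network; their names and signatures are assumptions made for illustration, not the interfaces used by the authors.

```python
def build_grasp_training_set(frames, labels, detect_hands, pick_right_hand, annotate):
    """Turn labeled frames into (annotated right-hand patch, grasp label) pairs.

    detect_hands(frame) -> list of boxes          (hypothetical signature)
    pick_right_hand(frame, boxes) -> patch        (hypothetical signature)
    annotate(patch) -> patch with skeleton drawn  (hypothetical signature)
    """
    training_set = []
    for frame, label in zip(frames, labels):
        boxes = detect_hands(frame)            # infer hand positions in the frame
        if not boxes:
            continue                           # no hand found: skip this frame
        patch = pick_right_hand(frame, boxes)  # keep only the right-hand patch
        training_set.append((annotate(patch), label))  # superimpose the skeleton
    return training_set
```

The returned pairs would then feed the training of the classification network; skipping frames without detections is a design choice of this sketch.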

3 Experimental Results
The experiments assessed both the accuracy and the computational cost of the
proposal. SSD-MobileNetV1, Convolutional Pose Machine, and MobileNetV1
architectures support blocks 1, 3, and 4 of Fig. 1, respectively.
The hand detection network was trained using Oxford Hands Dataset [14]
and Egohands Dataset [2]. The Annotation Network exploits the implementation
of the original paper [20]. The classification network was trained on a subset
of the Yale Human Grasping Dataset [3]. The remaining part of the dataset
was employed as a test set. 500 frames were excluded from the training phase
and classified by the complete system. The classification problem was pinch vs

grasp problem following the setup proposed in [17]. This setup considers only
the biclass classification task because EMG control of prostheses allows only a
limited set of actions, for example, grasp, pinch, and wrist rotation. Anyway, the
proposed approach can be extended to fine-grained classification without changes
in the setup.
The experiments consider only genuine RGB solutions as baseline comparison
because other methods would be out of the scope of this paper. In fact, electronic
skin leads to better performance in terms of accuracy, but it is more annoying
for a patient.

3.1 Generalization Capability


The experiment compared the proposed solution with the previous version of
the system. In addition, the experiments assess the impact of annotation colors:
multicolor assigns a color to each finger, as per Fig. 2. Blue, Red, and Green use
the base colors of RGB. Accordingly, the annotation affects only one channel of
the original image. Finally, grey uses a constant value for the annotation over
all three RGB channels.
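The difference between the single-channel and grey variants can be made concrete with a small sketch. The function name paint_annotation, the boolean skeleton mask, the RGB channel order, and the painted value of 255 are all assumptions for illustration; the multicolor variant (one color per finger) is omitted.

```python
import numpy as np

CHANNEL = {"red": 0, "green": 1, "blue": 2}  # RGB channel order is assumed


def paint_annotation(image, mask, mode, value=255):
    """Paint skeleton pixels (boolean H x W mask) on a copy of an H x W x 3 image."""
    out = image.copy()
    if mode == "grey":
        out[mask] = value                 # constant value on all three channels
    else:
        out[mask, CHANNEL[mode]] = value  # only one channel is modified
    return out
```

With a single-channel mode the other two channels keep the original pixel values under the skeleton, whereas the grey mode overwrites them.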
Table 1 shows the results: the first row describes the previous solution, the
following rows report the different versions of the proposal. Four standard
metrics measure the performance of the model, one for each column. The best results
are marked in bold.
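Of the four metrics, F1 is derived from precision and recall as their harmonic mean. The snippet below is only a consistency check on the reported values, using the precision and recall of the "Original [17]" and "Multicolor" rows of Table 1; the function name f1_score is chosen for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from Table 1.
print(round(f1_score(0.84, 0.84), 2))  # Original [17]: 0.84
print(round(f1_score(0.77, 0.86), 2))  # Multicolor: 0.81
```

Because the table entries are themselves rounded to two decimals, values recomputed this way can differ from the reported F1 in the last digit for some rows.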

Table 1. Performance of grasping predictors

Model          Accuracy  Precision  Recall  F1
Original [17]  0.81      0.84       0.84    0.84
Multicolor     0.78      0.77       0.86    0.81
Blue           0.78      0.82       0.80    0.81
Red            0.79      0.84       0.81    0.82
Green          0.80      0.81       0.84    0.83
Grey           0.85      0.86       0.87    0.87

Results confirm that the color of the annotation is critical. In fact, only
the configuration with grey annotations improved over the original solution
on all four metrics. It
is reasonable to assume that the grey annotations yielded the best performance
because they affect all three RGB channels equally. Accordingly, the annotations
affect the convolution kernels more consistently.

3.2 Implementation
The final phase concerned the deployment of the predictor on an NVIDIA Jetson
TX2. The test used floating-point arithmetic and a Python implementation.
Accordingly, generalization performance is identical to the offline measurements.

The proposed solution deployed on the embedded device detects the possible
presence of a hand and the actual grasping action with a maximal latency of
200 ms. This performance meets the real-time constraint of 400 ms [11].
Compared to the previous version of the system, the latency increases by 70 ms in
the worst case; however, this additional cost is justified by the enhanced
generalization performance. When considering continuous acquisitions, the overall
power consumption ranges from 5 to 11 W, while 7 W are consumed by the GPU.
As expected, power consumption on the Jetson TX2 is high. Considering 2 Li-ion
batteries with 3.6 V output and 2900 mAh capacity, used in series, the continuous
prediction time ranges from 2 to 4 h. However, a real system should trigger
prediction only when needed, greatly increasing battery life.
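The battery estimate follows from simple arithmetic, sketched below under idealized assumptions: no conversion losses, the full nominal capacity is usable, and the power draw is constant. The function name runtime_hours is illustrative.

```python
def runtime_hours(cells, cell_voltage_v, capacity_ah, power_w):
    """Ideal runtime of a series pack: voltages add, capacity (Ah) stays the same."""
    energy_wh = cells * cell_voltage_v * capacity_ah
    return energy_wh / power_w


# Two 3.6 V, 2900 mAh cells in series store 2 * 3.6 * 2.9 = 20.88 Wh.
shortest = runtime_hours(2, 3.6, 2.9, 11.0)  # about 1.9 h at the 11 W maximum
longest = runtime_hours(2, 3.6, 2.9, 5.0)    # about 4.2 h at the 5 W minimum
```

These ideal figures are consistent with the 2 to 4 h range reported above.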

4 Conclusion
The paper presented an enhanced video-based solution for grasping
classification. An annotation network, included in the classification pipeline, simplifies the
classification problem. Overall, the experiments confirmed the improved
performance of the new solution. In addition, experiments on a Jetson TX2 module
revealed that the proposed setup can process a frame in 200 ms, confirming the
feasibility of the proposed method for embedded devices.

References
1. Anwar, S.M., Majid, M., Qayyum, A., Awais, M., Alnowami, M., Khan, M.K.:
Medical image analysis using convolutional neural networks: a review. J. Med.
Syst. 42(11), 226 (2018)
2. Bambach, S., Lee, S., Crandall, D.J., Yu, C.: Lending a hand: detecting hands and
recognizing activities in complex egocentric interactions. In: Proceedings of the
IEEE International Conference on Computer Vision, pp. 1949–1957 (2015)
3. Bullock, I.M., Feix, T., Dollar, A.M.: The Yale human grasping dataset: grasp,
object, and task data in household and machine shop environments. Int. J. Robot.
Res. 34(3), 251–255 (2015)
4. Cai, M., Kitani, K.M., Sato, Y.: An ego-vision system for hand grasp analysis.
IEEE Trans. Hum.-Mach. Syst. 47(4), 524–535 (2017)
5. Chortos, A., Liu, J., Bao, Z.: Pursuing prosthetic electronic skin. Nat. Mater.
15(9), 937 (2016)
6. Fan, Q., Shen, X., Hu, Y., Yu, C.: Simple very deep convolutional network for
robust hand pose regression from a single depth image. Pattern Recogn. Lett.
119, 205–213 (2017)
7. Feix, T., Romero, J., Schmiedmayer, H.B., Dollar, A.M., Kragic, D.: The grasp
taxonomy of human grasp types. IEEE Trans. Hum.-Mach. Syst. 46(1), 66–77
(2015)
8. Gao, Q., Liu, J., Ju, Z., Zhang, X.: Dual-hand detection for human-robot inter-
action by a parallel network based on hand detection and body pose estimation.
IEEE Trans. Ind. Electron. 66, 9663–9672 (2019)
9. Ghazaei, G., Alameer, A., Degenaar, P., Morgan, G., Nazarpour, K.: Deep learning-
based artificial vision for grasp classification in myoelectric hands. J. Neural Eng.
14(3), 036025 (2017)

10. Huang, Y.C., Liao, I.N., Chen, C.H., İk, T.U., Peng, W.C.: TrackNet: a deep
learning network for tracking high-speed and tiny objects in sports applications.
In: 2019 16th IEEE International Conference on Advanced Video and Signal Based
Surveillance (AVSS), pp. 1–8. IEEE (2019)
11. Ibrahim, A., Valle, M.: Real-time embedded machine learning for tensorial tactile
data processing. IEEE Trans. Circuits Syst. I Regul. Pap. 99, 1–10 (2018)
12. Li, Y., Wang, Y., Yue, Y., Xu, D., Case, M., Chang, S.F., Grinspun, E., Allen,
P.K.: Model-driven feedforward prediction for manipulation of deformable objects.
IEEE Trans. Autom. Sci. Eng. 99, 1–18 (2018)
13. Markovic, M., Dosen, S., Popovic, D., Graimann, B., Farina, D.: Sensor fusion and
computer vision for context-aware control of a multi degree-of-freedom prosthesis.
J. Neural Eng. 12(6), 066022 (2015)
14. Mittal, A., Zisserman, A., Torr, P.H.: Hand detection using multiple proposals. In:
BMVC, pp. 1–11. Citeseer (2011)
15. Pham, T.H., Kyriazis, N., Argyros, A.A., Kheddar, A.: Hand-object contact force
estimation from markerless visual tracking. IEEE Trans. Pattern Anal. Mach.
Intell. 40(12), 2883–2896 (2017)
16. Ragusa, E., Cambria, E., Zunino, R., Gastaldo, P.: A survey on deep learning
in image polarity detection: balancing generalization performances and computa-
tional costs. Electronics 8(7), 783 (2019)
17. Ragusa, E., Gianoglio, C., Zunino, R., Gastaldo, P.: Data-driven video grasping
classification for low-power embedded system. In: 2019 26th IEEE International
Conference on Electronics, Circuits and Systems (ICECS), pp. 871–874. IEEE
(2019)
18. Ragusa, E., Gianoglio, C., Zunino, R., Gastaldo, P.: Image polarity detection on
resource-constrained devices. IEEE Intell. Syst. 35, 50–57 (2020)
19. Saudabayev, A., Rysbek, Z., Khassenova, R., Varol, H.A.: Human grasping
database for activities of daily living with depth, color and kinematic data streams.
Sci. Data 5, 180101 (2018)
20. Simon, T., Joo, H., Matthews, I., Sheikh, Y.: Hand keypoint detection in single
images using multiview bootstrapping. In: Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, pp. 1145–1153 (2017)
21. Sundaram, S., Kellnhofer, P., Li, Y., Zhu, J.Y., Torralba, A., Matusik, W.: Learning
the signatures of the human grasp using a scalable tactile glove. Nature 569(7758),
698 (2019)
22. Wang, T., Li, Y., Hu, J., Khan, A., Liu, L., Li, C., Hashmi, A., Ran, M.: A survey
on vision-based hand gesture recognition. In: International Conference on Smart
Multimedia, pp. 219–231. Springer (2018)
23. Wei, S.E., Ramakrishna, V., Kanade, T., Sheikh, Y.: Convolutional pose machines.
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recog-
nition, pp. 4724–4732 (2016)
24. Yang, Y., Fermuller, C., Li, Y., Aloimonos, Y.: Grasp type revisited: a modern
perspective on a classical feature for vision. In: Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, pp. 400–408 (2015)
Enabling YOLOv2 Models to Monitor Fire
and Smoke Detection Remotely in Smart
Infrastructures

Sergio Saponara, Abdussalam Elhanashi, and Alessio Gagliardi(B)

Department of Information Engineering, University of Pisa, Via G. Caruso 16, 56122 Pisa, Italy
alessio.gagliardi@phd.unipi.it

Abstract. This paper presents the implementation of a centralized antifire
surveillance management system based on video cameras. The system provides
visualization information and an optimal guide for quick response to fire and smoke
detection. We use a deep learning model (YOLOv2) and Jetson Nano boards with
Raspberry Pi cameras as Internet of Things (IoT) sensors. The smart cameras will
be mounted in indoor and outdoor environments and connected to the
centralized computer via Ethernet cables and communication protocols according to an
IoT scheme. Specific software will be used on the centralized computer to show
the video stream from each camera in real time, while the cameras are responsible
for detecting fire and smoke objects and generating alarms accordingly. The
proposed approach is able to monitor and supervise fire and smoke detection from
different cameras remotely. It is suitable for targeted applications such as smart
cities, smart transport, or smart infrastructures.

Keywords: YOLOv2 · Internet of Things (IoT) · Fire-smoke detection · Jetson nano · Smart cities

1 Introduction
Fire and smoke accidents have become a major concern as they cause severe destruction,
including loss of human lives and damage to property [1]. Warning and alerting
citizens is of vital importance for emergency management and preparedness
in large cities. One of the targets of this research is to deploy integrated mass warning
systems which can provide an emergency alert to the population. Several methods and
techniques have been used to detect fire and smoke accidents. Most traditional
methodologies use sensor-based techniques. The drawback of these technologies
is that they can detect fire and smoke only in the vicinity where they are installed.
Smoke and fire detection sensors are not adequate to provide the location, magnitude,
and direction of the fire. Traditional sensors are limited in their ability to cover a
large area for fire and smoke detection. Other similar works have already been proposed
in the literature. The authors of [2] developed a distributed video antifire surveillance
system based on IoT Embedded Computing nodes. The system takes advantage of an existing
Video Smoke Detection algorithm, providing a Web Application able to detect smoke in
real-time from

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 30–38, 2021.
https://doi.org/10.1007/978-3-030-66729-0_4

several cameras distributed in different areas. Although this system can give access to
several users via a handy login page and a centralized dashboard, the technique was
only implemented for smoke detection. In our approach, we designed a deep learning
model which is able to detect both fire and smoke objects and can be monitored remotely
from a centralized management system. In this research, we propose a CNN deep learning
detector (YOLOv2) for smoke and fire detection based on a video camera. YOLOv2
is a real-time deep learning model for object detection [3]. By exploiting YOLOv2, it
is possible to report fire and smoke early and in a timely manner. Fire and
smoke detection in an IoT environment is a promising component of early accident-related
event detection in smart cities. The target of this paper is to reduce the utilization of
sensor-based techniques, data processing, and communication resources to minimize
energy consumption in favor of increased battery life with regulations.
This paper is organized as follows: Sect. 1 introduces the system
architecture. Section 2 presents the YOLOv2 implementation and hardware setup for remote
control. Section 3 presents the software implementation. Section 4 discusses the
experimental results and is followed by Sect. 5, which presents the Smart IoT real-time
model for fire and smoke detection. Conclusions and further work are presented in
Sect. 6.

2 YOLOv2 Implementation and Hardware Setup for Remote Control
This section introduces the neural network algorithm and the hardware
setup components of the proposed system. It covers the design of the YOLOv2 deep neural
network; the training, validation, and evaluation of the model; the hardware
components; and the system diagram for using YOLOv2 to monitor fire and smoke accidents
remotely.

2.1 YOLOv2 Design


The YOLOv2 model was developed in MATLAB using the deep neural designer toolbox.
The deep neural network model was built with 25 layers, as shown in Fig. 1. We
established a light-weight architecture to fit into low-cost IoT nodes, permitting a
standalone solution for a Smart Antifire System. This light-weight model is suitable for
real-time applications and can be deployed on low-cost embedded devices. The YOLOv2
architecture includes the input layer, a set of middle layers, and YOLOv2-specific layers.
Our network accepts 128 × 128 pixel RGB images as input and produces as output the
object class probabilities (fire or smoke) and the coordinates of the bounding box. The
middle layers used in this architecture consist of convolutional, batch normalization,
ReLU, and max pooling layers. We use a 3 × 3 kernel for each convolution layer.
All layers after the input layer up to ‘ReLU4’ are considered feature extraction
layers. The YOLOv2 subnetwork that follows those layers performs the object
localization typical of the YOLOv2 algorithm. Finally,
the output layer was constructed to predict the location and the class of the detected
objects, i.e. fire or smoke.

Fig. 1. The architecture of the proposed YOLOv2 neural network

2.2 Training and Validation


The model was trained with 400 images of fire and smoke. The Ground Truth Labeler
application was used to label the regions of interest (ROIs) of our dataset [4]. The
designed YOLOv2 was trained with stochastic gradient descent with momentum (SGDM), an
optimization method that helps accelerate gradient vectors in the right directions, thus
leading to faster convergence [5].

Table 1. Summary for validation results by ROC tool analysis

Metrics           Validation values
Number of images  200
Accuracy          93%
Specificity       80%
Sensitivity       94%

A momentum of 0.9 was used to accelerate the training process for the
architecture. We set the learning rate to $10^{-3}$ to control how much the model changes
in response to the error. YOLOv2 was validated with an independent dataset of 200 images
(100 images without fire/smoke and 100 images with fire/smoke). According to the
results of the Receiver Operating Characteristic (ROC) analysis, the accuracy for this
validation reached 93%, see Table 1 and Fig. 2.
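For readers unfamiliar with the optimizer, the following is a minimal sketch of one SGDM update in its usual textbook form, with the momentum (0.9) and learning rate (10^-3) quoted above as defaults. It is a plain Python illustration, not the MATLAB training code used by the authors, and the function name sgdm_step is hypothetical.

```python
def sgdm_step(weights, grads, velocity, lr=1e-3, momentum=0.9):
    """One SGD-with-momentum update on flat lists of parameters.

    The velocity keeps a decaying sum of past gradients, so consecutive
    gradients pointing in the same direction produce larger steps.
    """
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# Toy usage: minimize f(w) = w^2, whose gradient is 2w.
w, v = [1.0], [0.0]
for _ in range(2000):
    w, v = sgdm_step(w, [2 * w[0]], v)
```

On this toy problem the weight decays toward the minimum at zero.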

Fig. 2. ROC curve for validation dataset.

2.3 YOLOv2 Evaluation

The model was tested with a dataset of videos which includes 170 fire/smoke videos
and 117 videos without fire/smoke. These videos are challenging for color-based and
motion-based detection. This dataset was collected from various realistic situations of
smoke/fire and normal conditions. Following confusion matrix criteria, different metrics
(false-negative rate, false-positive rate, and accuracy) were analyzed to evaluate the
performance of the YOLOv2 model for fire and smoke detection, see Eqs. (1)–(3). The
proposed approach achieved promising classification results with an accuracy of 96.82%
and outperforms the other methodologies [6–8], see Table 2.
$$\text{False negative rate} = \frac{FN}{FN + TP} \tag{1}$$

$$\text{False positive rate} = \frac{FP}{FP + TN} \tag{2}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + FN + TN + FP} \tag{3}$$
where:

• True positive (TP): fire/smoke is detected in a positive video.
• True negative (TN): no fire/smoke is detected in a negative video.
• False positive (FP): fire/smoke is detected in a negative video.
• False negative (FN): no fire/smoke is detected in a positive video.
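Equations (1)–(3) translate directly into code. The sketch below uses made-up confusion counts purely for illustration; they are not the counts behind Table 2, and the function name detection_rates is hypothetical.

```python
def detection_rates(tp, tn, fp, fn):
    """Compute Eqs. (1)-(3) from per-video confusion counts."""
    return {
        "false_negative_rate": fn / (fn + tp),            # Eq. (1)
        "false_positive_rate": fp / (fp + tn),            # Eq. (2)
        "accuracy": (tp + tn) / (tp + fn + tn + fp),      # Eq. (3)
    }


# Hypothetical counts: 100 positive videos (95 detected), 100 negative (10 false alarms).
rates = detection_rates(tp=95, tn=90, fp=10, fn=5)
```

With these example counts the false negative rate is 5%, the false positive rate is 10%, and the accuracy is 92.5%.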

Table 2. Performance of the proposed approach vs. state of the art

Method                False positive (%)  False negative (%)  Accuracy (%)
YOLOv2                3.4                 2.9                 96.82
Di Lascio et al. [6]  13.33               0                   92.86
Fu et al. [7]         14                  8                   91
YOLO [8]              5                   5                   90

2.4 Hardware Setup for Remote Control


Jetson Nano is a powerful computer tailored for running machine learning and neural
network models for object detection. It is a suitable board for applications which are
based on distributed networks [9]. YOLOv2 models have been deployed on Jetson Nano
boards. We used MATLAB and third-party support packages to generate the C code for
the Nvidia devices to run the algorithm as a standalone application. The hardware setup
consists of three Jetson Nano devices, three Raspberry Pi V2 cameras, Ethernet cables,
a LAN switch, and a personal computer. We connected the Raspberry Pi cameras to the CSI
(Camera Serial Interface) port of each board using a CSI flat cable. Such
a system permits the connection of multiple cameras with Jetson Nano devices at
different locations, enabling real-time fire and smoke
detection that is monitored from a centralized computer, see Fig. 3.

Fig. 3. Hardware setup for fire and smoke smart surveillance detection system

3 Software Implementations
We used MobaXterm software on the Windows 10 Operating System (OS) to establish
the communication between the main computer (centralized fire and smoke
management system) and the Nvidia Jetson Nano boards [10]. The communication is handled
by an OpenSSH server at the defined IP address of each Jetson Nano
node. OpenSSH is an implementation of the Secure Shell (SSH) protocol, which allows users to control

and transfer data between computers. The system is built with multiple access points,
each with its own IP address, through OpenSSH sessions in the MobaXterm software. Each
OpenSSH session communicates with a Jetson Nano board at its designated IP address. We can
visualize the status of fire and smoke from several Jetson Nano devices in one centralized
fire and smoke management system. We implemented specific code in the Linux OS
of the Jetson Nano for remote access, see Table 3. This code enables the Remote Frame
Buffer (RFB) protocol for remote access to the Graphical User Interface (GUI) of the Jetson
Nano boards.

Table 3. Implemented code in XML file for remote access.

<key name='enabled' type='b'>
  <summary>Enable remote access to the desktop</summary>
  <description>
    If true, allows remote access to the desktop via the RFB
    protocol. Users on remote machines may then connect to
    the desktop using a VNC viewer.
  </description>
  <default>true</default>
</key>

4 Experiment Results

We started the communication between the centralized fire and smoke management
computer and the Jetson Nano boards through the MobaXterm software that resides on the
main computer. Each node is identified by a static IP address. The neural network
is executed on each board through specific commands in the OpenSSH session terminal
of the MobaXterm software. We displayed a set of videos of real fire and smoke on a PC
screen and exposed them to the Raspberry Pi cameras connected to each Jetson Nano
device. When fire and smoke were captured by the Raspberry Pi cameras,
bounding boxes were drawn around the detected objects (fire and smoke), see Fig. 4.
We measured the communication latency between the main computer and the Jetson
boards, obtaining 0.3 ms.
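The per-node launch over SSH can be scripted instead of typed into each session. The sketch below only assumes the standard `ssh user@host command` form; the node addresses (taken from the 192.0.2.0/24 documentation range), the user name, and the `./fire_smoke_detector` executable are placeholders, not values from this work.

```python
import subprocess

# Hypothetical static IP addresses of the three Jetson Nano nodes.
NODES = {"camera-1": "192.0.2.11", "camera-2": "192.0.2.12", "camera-3": "192.0.2.13"}


def ssh_command(host, remote_command, user="jetson"):
    """Build the argument list for running one command on a node over SSH."""
    return ["ssh", f"{user}@{host}", remote_command]


def start_detectors(remote_command="./fire_smoke_detector", dry_run=True):
    """Launch the detector on every node; with dry_run, only return the commands."""
    commands = {name: ssh_command(ip, remote_command) for name, ip in NODES.items()}
    if not dry_run:
        for cmd in commands.values():
            subprocess.Popen(cmd)  # one SSH session per node, left running
    return commands
```

Calling `start_detectors()` with the default `dry_run=True` only returns the command lists, which is convenient for checking the configuration before contacting any board.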
The execution time for the MobaXterm software was recorded at 0.008 s and the
transmission bandwidth was measured at 7.91 Gbit/s. We recorded the average frames
per second at the centralized computer for streams processed by the Jetson Nano devices,
using different sizes of video display (width and height). According to the results of
this experiment, see Table 4, the frame rate at the centralized computer reached up to 27
fps.

Fig. 4. MobaXterm software in the main computer for visualizing fire and smoke from different
camera nodes.

Table 4. The real-time measurement (fps) in the centralized computer

Frame size (Width × Height)  Real-time in centralized computer (fps)
128 × 128                    27
224 × 224                    18.4
416 × 416                    11.2
640 × 480                    9.49
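The frame rates in Table 4 can be read as an average time budget per frame, which is simply the reciprocal of the frame rate. The snippet below performs this conversion on the table values; the names frame_period_ms and TABLE_4_FPS are illustrative.

```python
def frame_period_ms(fps):
    """Average time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Frame rates from Table 4, keyed by display size (width x height).
TABLE_4_FPS = {"128x128": 27, "224x224": 18.4, "416x416": 11.2, "640x480": 9.49}
periods = {size: round(frame_period_ms(fps), 1) for size, fps in TABLE_4_FPS.items()}
```

The smallest display size corresponds to roughly 37 ms per frame and the largest to roughly 105 ms per frame.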

5 Smart IoT Real-Time Model for Fire and Smoke Detection

Early fire and smoke detection in smart cities can minimize large-scale damage and
significantly improve public safety. The proposed approach detects fire and
smoke by processing video signals from closed-circuit television (CCTV)
cameras. This system can detect intrusion and explosive accidents in indoor and
outdoor environments. We measured the decision time YOLOv2 needs to detect
smoke or fire and to trigger an alarm. When the camera is in video mode, the
delay between the start of smoke/fire in videos and YOLOv2 detection is 1 to 2 s. This
means that the presented architecture requires 1–2 s to trigger a fire alarm. YOLOv2
uses a single-stage object detection network, which is faster than two-stage deep
learning detectors such as regions with convolutional neural networks (R-CNN) models.
R-CNN algorithms are slow and hard to optimize because
each stage needs to be processed separately. We compared our approach with
other methodologies, see Table 5. Note that our design achieves a shorter decision time
for fire and smoke detection than the method in [13], which uses a Faster

R-CNN detector. This is an advantage of using an IoT deep learning model (YOLOv2)
to detect fire and smoke disasters and issue an early warning.

Table 5. The proposed approach vs other methodologies for time decision.

Methodologies         Time decision
Proposed approach     1–2 s
Shin-Juh et al. [11]  10 s
AdViSED [12]          3 s
Faster R-CNN [13]     10 s

6 Conclusion and Further Work


The objective of this research was to design a low-cost supervised management system
for antifire surveillance from different video cameras simultaneously. The
proposed approach takes advantage of several Nvidia Jetson Nano nodes, which
communicate with the main computer via Ethernet cables and the OpenSSH and
RFB protocols. We designed a lightweight neural network model to meet the requirements
of an embedded system. The YOLOv2 technique showed promising results, with real-time
measurements of up to 27 frames per second at the centralized computer. Moreover, the
decision time (1–2 s) was the best among the compared state-of-the-art
methodologies. In the future, we intend to connect the proposed system to iCloud facilities
via wireless communication such as Wi-Fi and 4G LTE technologies.

Acknowledgments. Work partially supported by H2020 European Processor Initiative project
n. 826647 and by Dipartimenti di Eccellenza Crosslab Project by MIUR. We thank the Islamic
Development Bank for their support of the Ph.D. work of A. Elhanashi.

References
1. Hall, J.R.: The Total Cost of Fire in the United States. National Fire Protection Association,
Quincy, MA (2014)
2. Gagliardi, A., Saponara, S.: Distributed video antifire surveillance system based on IoT
embedded computing nodes. In: International Conference on Applications in Electronics
Pervading Industry, Environment and Society, pp. 405–411. Springer, Cham, September 2019
3. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time
object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, pp. 779–788 (2016)
4. MathWorks Student Competitions Team. Using Ground Truth for Object Detection (2020).
https://www.mathworks.com/matlabcentral/fileexchange/69180-using-ground-truth-for-object-detection,
MATLAB Central File Exchange. Accessed 27 July 2020
5. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural
networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence
and CCD Camera, IEEE Trans. on Instrumentation and Measurement, vol. 54, no. (4) (2005)
6. Di Lascio, R., Greco, A., Saggese, A., Vento, M.: Improving fire detection reliability by a
combination of video analytics. In: International Conference Image Analysis and Recognition,
Vilamoura, Portugal, Springer, Cham, CH (2014)
7. Fu, T.J., Zheng, C.E., Tian, Y., Qiu, Q.M., Lin, S.J.: Forest fire recognition based on deep
convolutional neural network under complex background. Comput. Modernization 3, 52–57
(2016)
8. Lestari, D., et al.: Fire hotspots detection system on CCTV videos using you only look once
(YOLO) method and tiny YOLO model for high buildings evacuation. In: 2nd International
Conference of Computer and Informatics Engineering (IC2IE2019), Banyuwangi, Indonesia,
pp. 87–92 (2019)
9. Jetson Nano Developer Kit. https://developer.nvidia.com/embedded/jetson-nano-developer-kit.
Accessed 25 Feb 2020
10. Mobatek (n.d.). MobaXterm free Xserver and tabbed SSH client for Windows.
mobaxterm.mobatek.net. https://mobaxterm.mobatek.net. Accessed 21 Jul 2020
11. Chen, S.J., Hovde, D.C., Peterson, K.A., Marshall, A.W.: Fire detection using smoke and gas
sensors. Fire Saf. J. 42(8), 507–515 (2007)
12. Gagliardi, A., Saponara, S.: AdViSED: advanced video smoke detection for real-time
measurements in antifire indoor and outdoor systems. Energies 13(8), 2098 (2020)
13. Kim, B., Lee, J.: Video-based fire detection using deep learning models. Appl. Sci. 9(14),
2862 (2019)
Exploring Unsupervised Learning on STM32 F4
Microcontroller

Francesco Bellotti1(B), Riccardo Berta1, Alessandro De Gloria1, Joseph Doyle2,
and Fouad Sakr1
1 Department of Electrical, Electronic and Telecommunication Engineering (DITEN),
University of Genoa, Via Opera Pia 11a, 16145 Genoa, Italy
{francesco.bellotti,riccardo.berta,
alessandro.degloria}@unige.it, fouad.sakr@elios.unige.it
2 School of Electronic Engineering and Computer Science, Queen Mary University of London,
London E14NS, UK
j.doyle@qmul.ac.uk

Abstract. This paper investigated the application of unsupervised learning on a
mainstream microcontroller, like the STM32 F4. We focused on the simple K-means
technique, which achieved good accuracy levels on the four test datasets.
These results are similar to those obtained by training a k-nearest neighbor (K-
NN) classifier with the actual labels, apart from one case, in which K-NN performs
consistently better.
We propose an autonomous edge learning and inferencing pipeline, with a
K-NN classifier which is periodically (i.e., when a given number of new samples
have arrived) trained with the labels obtained from clustering the dataset via K-
means. This system performs only slightly worse than pure K-means in terms of
accuracy (particularly with small data subsets), while it achieves a reduction of
about two orders of magnitude in latency times. To the best of our knowledge, this
is the first proposal of this kind in literature for resource-limited edge devices.

Keywords: IoT · Development tools · Edge computing · Arduino · Platform-independence

1 Introduction

Machine learning (ML) is currently applied on the edge, in a variety of applications (e.g.,
[1]) and platforms (e.g., [2]), also including resource-constrained mainstream
microcontrollers (e.g., [3]). This approach, compared to the cloud computing paradigm, provides
better latency, bandwidth occupation, data security, and energy consumption [4].
At present, most ML applications on the edge process sensor samples to
make inferences (classification or regression) exploiting a model trained in the cloud.
The processes of data cleaning and preparation, and of model training, are very time
consuming [2]. This is a major reason for the growing interest in unsupervised
learning, which aims at discovering (previously) unknown patterns in a data set without

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 39–46, 2021.
https://doi.org/10.1007/978-3-030-66729-0_5
the help of pre-defined labels. This could be of further relevance for embedded devices,
as they could be left in the field to learn autonomously from the data they collect.
A well-known unsupervised machine learning (ML) technique is clustering, which
enables the detection of hidden patterns in data based on a cluster strategy and a distance
function [5].
In this paper we are interested in exploring the performance (accuracy and latency) of
clustering data on a mainstream microcontroller with no a priori knowledge. Since
clustering works on the whole (available) dataset, we are also interested in understanding
whether a combination of clustering and classification (which works on each single record)
may be beneficial.
The remainder of this paper is organized as follows. Section 2 presents an outlook
of related works. Section 3 proposes the pipeline methodology. Experimental results are
shown in Sect. 4, while conclusions are drawn in Sect. 5.

2 Related Works

Few works in literature have proposed models combining supervised and unsupervised
learning. Agrawal et al. [6] have developed a novel classification framework for the
identification of breast cancer, featuring a pipeline with an ensemble classification stage
after the ensemble clustering stage, in order to target the unclustered patients. Oliveira
et al. [7] proposed an iterative methodology combining automatic clustering and expert
analysis for labeling tweets to be used in k-nearest neighbours (K-NN) and Centroid-Based
Classifier (CBC) classification. Chakraborty et al. [8] presented EC2 (Ensemble
Clustering and Classification), a novel algorithm for discovering Android malware
families of varying sizes, ranging from very large to very small families (even if previously
unseen). Thanks to the proposed merging of classification and clustering to support
binary and multi-class classification, EC2 constitutes an early warning system for new
malware families, as well as a robust predictor of the family (when it is not new) to which
a new malware sample belongs. Papas et al. [9] presented a data mining technique for
software quality evaluation. They used K-means clustering to establish clusters of Java
classes based on static metrics, and then built decision trees for identifying the metrics which
determine cluster membership. Our approach falls into the same group but, unlike other
techniques, we are targeting edge devices in the Internet of Things (IoT) field, with the
goal of supporting completely autonomous systems.

3 The Autonomous Edge Pipeline

3.1 Background

In this paper we deal with two very simple ML algorithms: K-means and K-NN. K-means
is a centroid-based, iterative clustering algorithm that aims to partition data
into K distinct, non-overlapping clusters where points with similar features belong
to the same group. It randomly chooses K representative points as the initial centroids,
and then each data point is assigned to the closest centroid. At the end of each iteration,
the centroids of each cluster are updated using the mean of all data points belonging to
the same cluster until there is no further change in their values [10].
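The K-means loop described above can be sketched as follows. This is a minimal pure-Python illustration, not the authors' platform-independent C implementation; the function name `kmeans`, the `seed` and `max_iter` parameters, and the handling of empty clusters are assumptions made for the example (`math.dist` requires Python 3.8 or later).

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Step 1: randomly pick k samples as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step 2: assign each point to its closest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 3: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Stop when the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids
```

Note that the cluster indices returned in `labels` are arbitrary: which group is called 0 and which is called 1 depends on the random initialization.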
K-NN is a simple classification algorithm based on feature similarity. It assigns
to an input data point the most common class among its k nearest previously labeled
points. The performance of this method depends on k, the number of neighbors considered
at each decision, which is the only hyper-parameter to be set for a model [11].
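As an illustration of the K-NN decision rule, the sketch below classifies one query point by majority vote among its k nearest labeled points. The function name `knn_predict` and the use of Euclidean distance are assumptions for the example; the paper does not state which distance its implementation uses.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to `query`."""
    # Rank the training samples by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    # Vote among the labels of the k closest samples.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

There is no training phase in the usual sense: the "model" is the stored labeled dataset itself, so every inference costs one pass over the stored samples.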
This paper presents an experimental analysis on a mainstream microcontroller first
of the K-means clustering algorithm, then of a pipeline combining clustering and
classification, as described in the following.

3.2 Methodology for the Autonomous Edge Pipeline (AEP)

Unsupervised classification of samples is typically done through clustering. This is
very appealing for field-deployed devices, which would not need any prior knowledge, but
it requires, for each new sample to classify, processing the whole dataset collected so
far. We propose a different approach, with an iterative pipeline alternating clustering and
classification. In particular, clustering is executed periodically (e.g., after the reception
of 100 samples) and provides the labels for the classifier, which performs the much
faster classification of each single sample. This is also expected to reduce
energy consumption, because classification needs a lower execution time than
clustering, but we need to investigate the possible performance drop. The proposed
Autonomous Edge Pipeline (AEP) implements a two-stage workflow, shown in Fig. 1.
The initial step consists in filling the dataset until a certain level L (e.g., 50 records)
is reached. Then, K-means clustering is run on the dataset, providing the labels that
are attached to the original records. Then, the continuous operation loop starts. Using
the above labels, the K-NN classifier (which does not need training, apart from the
definition of the k hyper-parameter) is used to classify the subsequent samples, which are
also stored in the dataset. In order to avoid memory overflow, the dataset is implemented
as a fixed maximum length queue. After another L records, another clustering session is
run and the K-NN classifier is updated.
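The two-stage workflow can be sketched in Python as below. This is only an illustration under stated assumptions, not the authors' C implementation: the class name `AutonomousEdgePipeline`, its parameters (`level`, `max_len`, `n_clusters`, `k_neighbors`) and the helper `kmeans_labels` are hypothetical, and the fixed maximum length queue is modeled with `collections.deque`.

```python
import math
import random
from collections import Counter, deque


def kmeans_labels(points, k=2, max_iter=50, seed=0):
    """Plain K-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels


class AutonomousEdgePipeline:
    def __init__(self, level=50, max_len=200, n_clusters=2, k_neighbors=5):
        self.level = level                   # L: new samples between clustering runs
        self.queue = deque(maxlen=max_len)   # fixed maximum length dataset
        self.n_clusters = n_clusters
        self.k_neighbors = k_neighbors
        self.train_x, self.train_y = [], []  # labeled snapshot used by K-NN
        self.since_clustering = 0

    def _recluster(self):
        # Clustering stage: label the whole stored dataset with K-means.
        self.train_x = list(self.queue)
        self.train_y = kmeans_labels(self.train_x, self.n_clusters)
        self.since_clustering = 0

    def _classify(self, sample):
        # Classification stage: K-NN vote over the last labeled snapshot.
        order = sorted(range(len(self.train_x)),
                       key=lambda i: math.dist(self.train_x[i], sample))
        votes = Counter(self.train_y[i] for i in order[:self.k_neighbors])
        return votes.most_common(1)[0][0]

    def process(self, sample):
        """Store the sample; return its label, or None while the dataset is filling."""
        label = self._classify(sample) if self.train_x else None
        self.queue.append(tuple(sample))
        self.since_clustering += 1
        if self.since_clustering >= self.level:
            self._recluster()
        return label
```

One design point this sketch leaves open: K-means cluster indices are arbitrary, so the meaning of label 0 and label 1 may swap between two clustering sessions. A deployed system would need some way to keep labels consistent across sessions, which the text above does not specify.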

Fig. 1. The AEP workflow.


4 Experimental Results
We conducted the experimental analysis on an STM NUCLEO-F401RE board, with
84 MHz processing speed, 512 kB flash memory and 96 kB SRAM. The F series is widely
used in industry, as it offers a compact, high-performing and cost-effective solution
[12]. As a simple baseline for desktop/cloud computation, we use a PC hosting a
2.7 GHz Core i7 processor with 16 GB of RAM and 8 MB cache.
For data clustering at the edge, we implemented the K-means algorithm in platform-
independent C (i.e., not using native OS libraries). On the desktop, with consistent results,
we used the K-means implementation offered by the scikit-learn Python library.
For our tests we use four binary classification datasets representing the IoT field.
The first dataset is Seismic Mine (2584 samples × 18 features) [13], used for seismic
hazard prediction. This dataset deals with the problem of high energy seismic bumps (higher
than 10^4 J), comprises data from two longwalls located in a coal mine, and is
quite unbalanced (93% zeros). We randomly reduced the dataset size to 30%, to fit the
MCU memory size, and considered only 4 features, which is closer to a field device
environment. The second one is Daphnet Freezing of Gait (28801 × 10) [14], used to recognize
gait freeze from wearable acceleration sensors placed on the legs and hip of Parkinson's patients.
Similarly, this dataset has been reduced to 5% and 3 features. The third one is IoT_Failure
(951 × 9) [15], which is used to predict failures in the IoT field. The last one is Heart (303
× 12) [16], a popular medical dataset used to predict heart diseases. For simplicity,
we chose binary label datasets only. In the following experiments, we removed the
actual labels from the training, and used them only as a ground truth for comparing the
clustering/classification results.

4.1 Assessing K-means on IoT Datasets


As a first step, we assessed the performance of K-means clustering using two common
metrics. The silhouette
score is a measure of how close a point is to its own cluster compared to other clusters;
it ranges from −1 to 1, where a higher value indicates that the point is well matched
to its own cluster. The Davies-Bouldin score is the ratio of within-cluster to between-
cluster distances; lower values indicate better clustering. We also considered two scaling
cases: no scaling, and Standard Scaler. Table 1 shows the obtained empirical results. The
Silhouette value is the average over all the samples. In general, results show a certain
discrepancy between the K-means clustering and the values obtained with the actual
labels, which are generally worse. This seems to indicate the challenge of the classification
task, which will have to guess the actual labels based on the dataset features. The scaling
effect (standardization) improves the metrics in two cases but not in general.
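To make the two metrics concrete, the sketch below computes them from their usual textbook definitions in pure Python. It is not the code used for Table 1; the function names mirror common library naming but are written here for illustration, Euclidean distance is assumed, and at least two clusters are required.

```python
import math


def _group(points, labels):
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def silhouette_score(points, labels):
    """Mean silhouette over all samples (range -1 to 1, higher is better)."""
    clusters = _group(points, labels)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # convention: a singleton cluster contributes 0
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def davies_bouldin_score(points, labels):
    """Mean over clusters of the worst (S_i + S_j) / M_ij ratio (lower is better)."""
    clusters = _group(points, labels)
    cents = {lab: [sum(col) / len(pts) for col in zip(*pts)]
             for lab, pts in clusters.items()}
    # S_i: mean distance of the cluster members to their centroid.
    scatter = {lab: sum(math.dist(p, cents[lab]) for p in pts) / len(pts)
               for lab, pts in clusters.items()}
    # M_ij: distance between the centroids of clusters i and j.
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)
```

For two compact, well-separated groups the silhouette approaches 1 and the Davies-Bouldin score approaches 0, which matches the direction of the unscaled K-means columns in Table 1.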
Table 2 shows the clustering time performance on both PC and F4. Results highlight
the long latency on microcontrollers (measured with HAL_GetTick()), also compared
with the classification latency, which is typically in the order of tens of milliseconds (see
also Table 3).
Table 1. K-means clustering performance.

Dataset      K  Scaling   K-means labels        Actual labels
                          Silhouette  Davies    Silhouette  Davies
Mine         2  None      0.9         0.45      0.4         2.9
             2  Standard  0.67        1.01      0.17        3.92
Daphnet      2  None      0.72        0.77      −0.17       9.62
             2  Standard  0.61        0.84      −0.15       9.85
IoT_Failure  2  None      0.94        0.28      −0.03       6.83
             2  Standard  0.17        2         0.14        2.28
Heart        2  None      0.38        0.97      0.04        4.51
             2  Standard  0.16        2.2       0.1         2.9

Table 2. K-means timing performance.

Dataset      K-means clustering time
             PC       F4
Mine         6 ms     1.9 s
Daphnet      8 ms     3.3 s
IoT_Failure  11 ms    3.9 s
Heart        4 ms     1.6 s

4.2 Autonomous Edge Pipeline


Figure 2 depicts the learning curves for the AEP on the four test datasets. Scaling is
performed through a standard scaler. The learning curves represent the performance of
a K-NN classifier as a function of training set size, which varies from 5% to 75% of the
whole dataset. Performance is measured on the same 20% testing set. Table 3 reports the
numerical values. The best k is obtained by a 3-fold cross validation over a range of values
from 1 to an upper limit of 3, 5 or 10, depending on the number of samples in
the dataset. The inference time using the K-NN classifier is reported only for the dataset
size providing the best accuracy (e.g., 65% training set size for Mine). Accuracy does
not seem to be affected by class imbalance (Mine dataset case).
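The selection of k by 3-fold cross validation can be sketched as follows. The fold assignment (interleaved by sample index), the tie-breaking rule (smallest k wins) and the function names `cv_accuracy` and `best_k` are assumptions for this example; the paper does not describe these details.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def cv_accuracy(xs, ys, k, folds=3):
    """Mean K-NN accuracy over `folds` interleaved folds."""
    scores = []
    for f in range(folds):
        # Every folds-th sample (offset f) is held out for testing.
        test_idx = [i for i in range(len(xs)) if i % folds == f]
        train_idx = [i for i in range(len(xs)) if i % folds != f]
        tx = [xs[i] for i in train_idx]
        ty = [ys[i] for i in train_idx]
        hits = sum(knn_predict(tx, ty, xs[i], k) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / folds


def best_k(xs, ys, k_max, folds=3):
    """Return the k in 1..k_max with the best cross-validated accuracy."""
    # The -k term breaks ties in favor of the smallest k.
    return max(range(1, k_max + 1),
               key=lambda k: (cv_accuracy(xs, ys, k, folds), -k))
```

Each candidate k needs one full cross-validation run, which is the cost that a fixed k avoids.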
For comparison, Fig. 2 also shows the performance obtained by a K-NN classifier
trained on the actual labels, which almost always outperforms the AEP, and by the
simple K-means clustering algorithm. Not surprisingly, K-means always performs
better than AEP. This is because the AEP classifier is trained on labels created by a
run of K-means on the training set. Moreover, K-means computes the labels on the basis
of the knowledge of the testing set, differently from AEP. However, the computational
burden (and consequent energy consumption) of K-means is significantly higher (Table 2
vs Table 3). In AEP, the clustering algorithm is run only every Tl samples. Computation of the
best value of the k hyper-parameter requires one run for each candidate value. However,
we observed that a fixed value of 5 gives very similar results, with a (slight)
decrease only for Heart (2%) and IoT failure (1%).
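The latency gap between clustering and classification can be checked with a short calculation. The numbers below are copied from the F4 column of Table 2 and the inference time column of Table 3; treating the Table 3 times as measured on the same F4 board is an assumption based on the surrounding text.

```python
# F4 K-means clustering time (Table 2) and AEP K-NN inference time (Table 3), in ms.
clustering_ms = {"Mine": 1900, "Daphnet": 3300, "IoT_Failure": 3900, "Heart": 1600}
inference_ms = {"Mine": 12, "Daphnet": 23, "IoT_Failure": 16, "Heart": 9}

# Ratio of clustering time to single-sample inference time per dataset.
speedup = {name: clustering_ms[name] / inference_ms[name] for name in clustering_ms}
for name, ratio in speedup.items():
    print(f"{name}: {ratio:.0f}x")
```

The ratios fall between roughly 140x and 240x, that is, about two orders of magnitude for each of the four datasets.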

Fig. 2. Learning curves for AEP on various datasets.

Table 3. AEP performance.

AEP
Dataset       Best training set size   Accuracy   K   Inference time
Mine          65%+                     90%        1   12 ms (65% case)
Daphnet       55%+                     88%        6   23 ms (55%)
IoT_Failure   45%–65%                  91%        7   16 ms (45%)
Heart         55%                      82%        8   9 ms (55%)

The lines in Fig. 2 suggest some other interesting considerations. The performance
on all datasets quickly saturates with low percentages of the training set (i.e., the learning
rate is high). AEP performance tends to be less stable than that of the other techniques as the
training size increases.
As a final experiment, we computed on the desktop (due to the lack of training tools
on the STM32 F4 board) the performance of other classifiers besides K-NN in the AEP
implementation. In the Heart dataset, Decision Tree starts (5% size) with 74% accuracy
and reaches 80% at 65% size. SVM reaches 84% at 75% size. In Mine, SVM starts with
90% and ends with 86% (which is similar to the K-means performance), but has a drop
at 15% (69%). This point also confirms a certain instability of the AEP results, which we
attribute to the sub-optimality of the training labels, but which should be better investigated. In
Daphnet, both classifiers perform similarly to K-NN, while in IoT failures SVM always
achieves at least 90% accuracy (93% at 75%).

5 Conclusions and Future Work


This paper investigated the application of unsupervised learning on a mainstream
microcontroller, like the STM32 F4. We focused on the simple K-means technique, which achieves
about 90% accuracy on two datasets (Mine and IoT failure), and slightly worse results (83–
86%) on the other two (Heart and Daphnet). These results are similar to those obtained by
training a K-NN classifier with the actual labels, apart from the Daphnet case, in which
K-NN performs about 10% better.
We propose AEP, an autonomous edge learning and inferencing pipeline, with a K-
NN classifier which is periodically trained with the labels obtained from clustering the
dataset via K-means. This system performs only slightly worse than pure K-means in
terms of accuracy (and particularly with small data subsets), while it achieves a reduction
of about two orders of magnitude in latency times. To the best of our knowledge this is
the first proposal of this kind in literature for resource-limited edge devices.
In the future, we intend to integrate the AEP in the Edge Learning Machine (ELM)
framework [17] and expand its implementation to support other combinations of
clustering and classification algorithms (e.g., SVM seems to have promising results), as well
as fusion. This could also help understand a certain instability of the AEP results when
increasing training set sizes. Other possible research directions concern the management
of the dataset on the edge (e.g., filtering of K-NN training samples), also exploiting
hierarchical architectures, so as to prevent memory overflow.

References
1. Albanese, A., d’Acunto, D., Brunelli, D.: Pest detection for precision agriculture based on
IoT machine learning. In: Applepies 2019, Lecture Notes in Electrical Engineering, vol. 627,
pp. 65–72 (2020). https://doi.org/10.1007/978-3-030-37277-4_8
2. Lipnicki, P., Lewandowski, D., Syfert, M., Sztyber, A., Wnuk, P.: Inteligent IoTSP - imple-
mentation of embedded ML AI tensorflow algorithms on the NVIDIA jetson Tx Chip. In:
Proceedings-2019 International Conference on Future Internet of Things and Cloud, FiCloud
2019, pp. 296–302 (2019). https://doi.org/10.1109/ficloud.2019.00049
3. Sakr, F., Bellotti, F., Berta, R., De Gloria, A.: Machine learning on mainstream microcon-
trollers. Sensors 20(9), 2638 (2020). https://doi.org/10.3390/s20092638
4. Lin, L., Liao, X., Jin, H., Li, P.: Computation offloading towards edge computing. Proc. IEEE
107, 1584–1607 (2019)
5. Jain, A.K.: Data clustering: 50 years beyond K-means. Pattern Recogn. Lett. 31(8), 651–666
(2010). https://doi.org/10.1016/j.patrec.2009.09.011
6. Agrawal, U., et al.: Combining clustering and classification ensembles: a novel pipeline to
identify breast cancer profiles. Artif. Intell. Med. 97, 27–37 (2019). https://doi.org/10.1016/
j.artmed.2019.05.002
7. De Oliveira, E., Gomes Basoni, H., Saúde, M.R., Ciarelli, P.M.: Combining Clustering and
Classification Approaches for Reducing the Effort of Automatic Tweets Classification. https://
doi.org/10.5220/0005159304650472
8. Chakraborty, T., Pierazzi, F., Subrahmanian, V.S.: EC2: ensemble clustering and classification
for predicting android malware families. In: IEEE Transactions on Dependable and Secure
Computing, vol. 17, no. 2, pp. 262–277, 1 March-April 2020. https://doi.org/10.1109/tdsc.
2017.2739145
9. Papas, D., Tjortjis, C.: Combining clustering and classification for software quality evalua-
tion. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial
Intelligence and Lecture Notes in Bioinformatics), vol. 8445. LNCS, pp. 273–286 (2014).
https://doi.org/10.1007/978-3-319-07064-3_22
10. Marsland, S.: Machine Learning An Algorithmic Perspective, 2nd edn. CRC Press, Boca
Raton (2015)
11. Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning: From Theory to
Algorithms. Cambridge University, New York (2014)
12. STM32 High Performance Microcontrollers (MCUs)—STMicroelectronics. http://www.st.
com/en/microcontrollers-microprocessors/stm32-highperformance-mcus.html. Accessed 23
Jul 2020
13. Sikora, M., Wrobel, U.: Application of rule induction algorithms for analysis of data collected
by seismic hazard monitoring systems in coal mines. Arch. Min. Sci. 55(1), 91–114 (2010)
14. Bächlin, M., Plotnik, M., Roggen, D., Giladi, N., Hausdorff, J.M., Tröster, G.: A wearable
system to assist walking of Parkinson's disease patients benefits and challenges of context-
triggered acoustic cueing. Methods Inf. Med. 49(1), 88–95 (2010). https://doi.org/10.3414/
ME09-02-0003
15. IoT_failure_prediction | Kaggle. https://www.kaggle.com/mukundhbhushan/iot-failure-prediction.
Accessed 23 Jul 2020
16. Heart Disease UCI | Kaggle. https://www.kaggle.com/ronitf/heart-disease-uci/kernels.
Accessed 23 Jul 2020
17. Edge-Learning-Machine GitHub. https://github.com/Edge-Learning-Machine. Accessed 31
Jul 2020
Environmental Monitoring and E-health
Unobtrusive Accelerometer-Based Heart Rate
Detection

Yurii Shkilniuk1(B), Maksym Gaiduk1,2, and Ralf Seepold1,3
1 HTWG Konstanz, Alfred-wachtel-Str. 8, 78462 Konstanz, Germany
{yshkilni,maksym.gaiduk,ralf.seepold}@htwg-konstanz.de
2 University of Seville, Av. Reina Mercedes s/n, 41012 Seville, Spain
3 I.M. Sechenov First Moscow State Medical University, Bolshaya Pirogovskaya St. 2-4,
119435 Moscow, Russian Federation

Abstract. Ballistocardiography (BCG) can be used to monitor heart rate activity.
For this purpose, the accelerometer should have high sensitivity and minimal internal noise;
a low-cost approach was also taken into consideration. Several measurements have been
executed to determine the optimal positioning of a sensor under the mattress to
obtain a signal strong enough for further analysis. A prototype for an unobtrusive
accelerometer-based measurement system has been developed and tested in a
conventional bed without any specific extras. The influence of the human sleep
position for the output accelerometer data was tested. The obtained results indicate
the potential to capture BCG signals using accelerometers. The measurement
system can detect heart rate in an unobtrusive form in the home environment.

1 Introduction
Ballistocardiography (BCG) is a technique that measures the heart rate (HR) from the mechanical
vibrations of the human body at each cardiac cycle. Ballistocardiography can perform
non-contact measurements of such quantities by studying the vibration patterns that
propagate through an object mechanically coupled to the subject. For example, a bed
can be used to track the HR of subjects lying overnight [1], resulting in a contactless,
non-intrusive measurement. Some non-invasive techniques apply ballistocardiography by
placing sensors in a chair or in the bed where the patient is placed [2]. BCG can be
realized in an unobtrusive sensing form and be embedded in different configurations
in the patient's environment. Owing to its physical nature, it can convey critical medical
information about the cardiovascular system, which might be otherwise unattainable,
e.g., the force of the heart's contraction, which is a crucial indicator of the heart's
physiologic age and its decline [3]. BCG can offer useful perspectives for application in
preventive medicine, e.g., in determining the quality of sleep, in the detection of physical
or mental stress, or the early detection of coronary heart disease. The quality of sleep,
sleep phases, and duration can be determined using data on the heart rate and respiratory
rate of the patient [4, 5].
The main objective of this work is to develop a prototype of an unobtrusive
accelerometer-based measurement system and to investigate the accelerometer positioning
for heart rate detection in a home environment, to be used for sleep stage classification.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 49–54, 2021.
https://doi.org/10.1007/978-3-030-66729-0_6

2 Status and Experiment


Researchers have proposed several ways to measure BCG signals during sleep, which differ
in the type, number, and placement of sensors. Some of them used a hydraulic
sensor system filled with water [6], load sensors installed on the four legs of the bed
[7], piezoelectric load and acceleration sensors [8], pressure sensors installed under
a mattress [4, 9], and others. With the development of highly sensitive accelerometers
based on microelectromechanical systems (MEMS) technologies, their applications in various
fields of measurement are growing rapidly. Unfortunately, the use of accelerometers for
measuring BCG signals is still a little-studied area of measurement science.
The publications found indicate that the sensor should have high sensitivity and
minimal internal noise [8, 9]. Based on these requirements, a search for available
accelerometers was done. The accelerometers are compared in Table 1. For further tests, the
LIS3DSHTR accelerometer (see footnote 1) was chosen since it has such advantages as high
sensitivity, low noise density, a built-in 16-bit analog-to-digital converter (ADC), and low
cost.
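The sensitivity figures quoted for the digital sensors in Table 1 can be cross-checked against their range and ADC width. The sketch below assumes ideal uniform quantization over the full ± range, which is a simplification; datasheet values are nominal and may differ slightly. The function name `lsb_sensitivity_mg` is hypothetical.

```python
def lsb_sensitivity_mg(full_scale_g, bits):
    """Ideal resolution in mg per LSB for a symmetric ±full_scale_g range."""
    span_mg = 2 * full_scale_g * 1000.0  # e.g. ±2 g spans 4000 mg
    return span_mg / (2 ** bits)


print(round(lsb_sensitivity_mg(2, 16), 3))  # 16 bits over ±2 g: 0.061
print(round(lsb_sensitivity_mg(2, 12), 3))  # 12 bits over ±2 g: 0.977
```

These values agree with the 0.06 mg/LSB and 1 mg/LSB entries listed for the 16-bit and 12-bit devices at the ±2g range, and show that the 16-bit converter gives a 16 times finer step at the same range.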

Table 1. Comparison of accelerometers.

Parameters      Measurement range     Output data type   Sensitivity                 Noise density     Axes  Power supply  Price
ADXL362         ±2g, ±4g, ±8g         Digital (12 bits)  1 mg/LSB for ±2g range      175 µg/sqrt(Hz)   3     3.3 V         8 €
SCA820-D04      ±2g                   Digital (12 bits)  1.2 mg/LSB                  2000 µg RMS       1     3.3 V         18 €
LIS3DSHTR       ±2g, ±4g, ±8g, ±16g   Digital (16 bits)  0.06 mg/LSB for ±2g range   150 µg/sqrt(Hz)   3     3.3 V         2 €
SCA620-EF1V1B   2g                    Analog             2 V/g                       2000 µg RMS       1     5 V           45 €
MXA2500E        ±1g                   Analog             0.5 V/g                     200 µg/sqrt(Hz)   2     3.3 V; 5 V    7 €
ADXL103         ±1.7g                 Analog             1 V/g                       110 µg/sqrt(Hz)   2     3.3 V; 5 V    22 €

The bed frame and mattress are common low-cost products (Fig. 1a). A foam
mattress 120 mm thick was used. Studies on measuring heart rate using
pressure sensors under the mattress show that the accuracy of the heart rate measurements
depends on the human body position [1]. In this study, the influence of the human sleep
position on the output accelerometer data was tested. Four basic human sleep positions

1 https://www.st.com/resource/en/datasheet/lis3dsh.pdf.
were investigated: lying on the chest, lying on the left side (with the arm folded back),
lying on the right side, and lying on the back.

Fig. 1. Three methods of attaching the accelerometer (a - attached to a slat, b - attached to the
mattress, c - attached to a cantilever between the slats).

Three methods of attaching the accelerometer to the bed and mattress were tested:
on a slat (Fig. 1a), between slats and attached directly to the mattress (Fig. 1b), and
between slats on a cantilever (Fig. 1c). The test cantilever was made of a rough
polypropylene plastic plate 2 mm thick. The influence of this cantilever's geometric and mechanical
characteristics on the measurement results has not been investigated. The dimensions of
the flexible part of the plate were 200 × 25 mm.
A Raspberry Pi and a program written in Python were used to
collect data from the accelerometer. The polling rate was 600 values per second. Data
recording was carried out, taking into account time ranges. Processing and visualization
of data were carried out on a personal computer using another Python program.
Data were filtered using digital low-pass and high-pass filters to reduce low-
frequency bias and high-frequency noise that do not carry useful information. The
frequency bandwidth of the heart rate component varies among publications, where the
cutoff frequencies range between 0.1–1 Hz for high-pass filters and 10–25 Hz for low-
pass filters [8, 9]. Based on some publications [3, 10] as well as empirical results, it was found
that the bandwidth 1–15 Hz is overall more suitable for heart rate monitoring using the
accelerometer.
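A minimal sketch of such a 1–15 Hz band-pass stage at the 600 samples per second polling rate is given below. The paper does not state the filter type or order, so the cascade of two first-order (one-pole) filters used here is purely an assumption for illustration, as are the function name `band_pass` and its parameter names; a real system would more likely use a higher-order design.

```python
import math


def band_pass(samples, fs=600.0, f_high_pass=1.0, f_low_pass=15.0):
    """Cascade a one-pole high-pass and a one-pole low-pass filter."""
    dt = 1.0 / fs
    rc_hp = 1.0 / (2 * math.pi * f_high_pass)
    rc_lp = 1.0 / (2 * math.pi * f_low_pass)
    a_hp = rc_hp / (rc_hp + dt)   # high-pass coefficient
    a_lp = dt / (rc_lp + dt)      # low-pass smoothing factor
    out = []
    prev_x = samples[0]
    hp = 0.0
    lp = 0.0
    for x in samples:
        hp = a_hp * (hp + x - prev_x)  # removes slow drift and DC offset
        prev_x = x
        lp += a_lp * (hp - lp)         # attenuates high-frequency noise
        out.append(lp)
    return out
```

With these settings a constant offset (such as the gravity component on one axis) is removed, a 5 Hz component passes almost unchanged, and a 100 Hz component is strongly attenuated.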

3 Discussion of the Results


The developed system prototype measures the heart rate with minimal
influence of human body movement, including movement caused by breathing. The
measurement system was designed for use in experiments with different attachment
methods to the mattress and human sleep positions.
All measurements (except the one presented in Fig. 6) were done with the sensor placed
directly opposite the heart. No BCG signals could be recognized from the accelerometer
when it was attached to the slat or directly to the mattress (Fig. 2). The
only recognizable results were given by the cantilever method (Fig. 3).
Fig. 2. An output signal from the accelerometer attached to the slat.

Fig. 3. An output signal from the accelerometer located on the cantilever in lying chest position.

The experiments showed that the human sleep position significantly affects the
precision of measuring the BCG signals by the accelerometer. The most precise results were
obtained in the position lying on the chest (Fig. 3) and on the left side. Here it is possible to
recognize a typical periodic BCG heartbeat signal described in the scientific papers [1,
11].
Lying on the right side, it is much more challenging to recognize the heartbeat
(Fig. 4). Lying on the back, it was impossible to recognize the BCG signals (Fig. 5).

Fig. 4. An output signal from the accelerometer located on the cantilever in lying right side
position

Fig. 5. An output signal from the accelerometer located on the cantilever in lying back position

The most informative signals of the biggest amplitude were obtained when the
accelerometer was located under the mattress directly opposite the human heart (Table 2).
As the accelerometer was moved away from the human heart, the BCG signal level
dropped sharply. At a distance of 10 cm to the side from the conditional vertical of the
human heart, the BCG signal from the chest position looks as shown in Fig. 6 (Table 2).
Table 2. Signal recognition results for different sleep positions and sensor placement.

Sleep position                         Chest              Left side          Right side               Back
Directly opposite the heart            Good recognition   Good recognition   Poor recognition         Impossible to recognize
10 cm from the vertical of the heart   Poor recognition   Poor recognition   Impossible to recognize  Impossible to recognize

Fig. 6. BCG signal from the chest position at a distance of 10 cm to the side from the conditional
vertical of the human heart.

The accelerometers with high sensitivity and low noise density can be used for
measuring the heart rate from the mechanical vibrations of the body due to the heart
movement. For successful measurements, a special cantilever design is needed. As the
accelerometer is moving away from the human heart, the BCG signal level is dropping
sharply. Not all human sleep positions are suitable for clearly recognizing BCG signals
using the system with one accelerometer.

4 Conclusions and Outlook

A prototype of an accelerometer-based measurement system was developed. Using
the system does not cause inconvenience during sleep, as the sensor is placed under the
mattress. The influence of sleep position and accelerometer placement on the quality of
heart rate recognition in a home environment was investigated. In
this research, the most precise heart rate recognition results were achieved when subjects
were lying on the chest (prone position) and on the left side (left lateral position), whereas
it was almost impossible to identify the heart rate in other sleep positions. In the left lateral
and prone positions, most heartbeats were detected when the sensor was placed no further
than 10 cm from the heart axis under the mattress.
Further research into the use of accelerometers for measuring BCG can be focused on
the aspect of mechanical processes that occur when BCG signals propagate through the
mattress. It can also help determine the influence of the cantilever’s mechanical and geo-
metric characteristics on the measurement results. Another possible aim of further work
could be to investigate the use of several accelerometers to eliminate the disadvantage
of signal dropping when moving away from the accelerometer.

Acknowledgments. This research was partially funded by the Ministry of Economics, Labour
and Housing Baden-Württemberg (Germany) under the contract ‘Errichtung und Betrieb eines
(virtuellen) Kompetenzzentrums Markt- und Geschäftsprozesse Smart Home & Living
Baden-Württemberg’. The author is responsible for the content of this publication. This research was
partially funded by the EU Interreg V-Program “Alpenrhein-Bodensee-Hochrhein”: Project “IBH
Living Lab Active and Assisted Living”, grants ABH040, ABH04, ABH066 and ABH068.

References
1. Brüser, C., Stadlthanner, K., Waele, S., Leonhardt, S.: Adaptive beat-to-beat heart rate
estimation in ballistocardiograms. IEEE Trans. Inf. Technol. Biomed. 15, 778–786 (2011)
2. Sadek, I., Biswas, J.: Non-intrusive heart rate measurement using ballistocardiogram signals:
a comparative study. Signal Image Video Process. 13, 475–482 (2019)
3. Albukhari, A., Lima, F., Mescheder, U.: Bed-embedded heart and respiration rates detection
by longitudinal ballistocardiography and pattern recognition. Sensors 19, 1451 (2019)
4. Gaiduk, M., Seepold, R., Martínez Madrid, N., Orcioni, S., Conti, M.: Recognizing breathing
rate and movement while sleeping in home environment. Appl. Electron. Pervading Ind.
Environ. Soc. 627, 333–339 (2019)
5. Rodríguez, E.T., Seepold, R., Gaiduk, M., Martínez Madrid, N., Orcioni, S., Conti, M.:
Embedded system to recognize movement and breathing in assisted living environments.
In: Applications in Electronics Pervading Industry, Environment and Society. LNEE,
pp. 391–397. Springer, Cham (2019)
6. Jiao, C., Su, B., Lyons, P., Zare, A., Ho, L.C., Skubic, M.: Multiple instance dictionary learning
for beat-to-beat heart rate monitoring from ballistocardiograms. IEEE Trans. Biomed. Eng.
65, 2634–2648 (2018)
7. Sivanantham, A.: Measurement of heartbeat, respiration and movements detection using smart
Bed. In: IEEE Recent Advances in Intelligent Computational Systems (2015)
8. Feng, X., Dong, M., Levy, P., Xu, Y.: Non-contact home health monitoring based on low-cost
high-performance accelerometers. In: IEEE/ACM International Conference on Connected
Health: Applications, Systems and Engineering Technologies, pp. 356–364 (2017)
9. Gomez-Clapers, J., Serra-Rocamora, A., Casanella, R., Pallas-Areny, R.: Towards the stan-
dardization of ballistocardiography systems for J-peak timing measurement. Measurement
58, 310–316 (2014)
10. Lima, F., Albukhari, A., Zhu, R., Mescheder, U.: Contactless sleep monitoring measurement
setup. In: Proceedings, vol. 2. (2018)
11. Inan, O.T., Migeotte, P.F., Park, K.S., Etemadi, M., Tavakolian, K., Casanella, R., Zanetti,
J., Tank, J., Funtova, I., Prisk, G.K., Rienzo, H.K.: Ballistocardiography and seismocar-
diography: a review of recent advances. IEEE J. Biomed. Health Inform. 19, 1414–1427
(2014)
A Lightweight SiPM-Based Gamma-Ray
Spectrometer for Environmental Monitoring
with Drones

Marco Carminati1,2(B), Davide Di Vita1,2, Luca Buonanno1,2,
Giovanni L. Montagnani1, and Carlo Fiorini1,2
1 Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano,
Piazza Leonardo da Vinci 32, 20133 Milan, Italy
marco1.carminati@polimi.it
2 Istituto Nazionale di Fisica Nucleare, Sezione di Milano, via Celoria 16, 20133 Milan, Italy

Abstract. A wireless, compact (8 × 8 × 11 cm³) and lightweight (<1 kg)
gamma-ray spectrometer featuring a 2″ CsI scintillator read out by silicon
photomultipliers and microcontroller data acquisition is operated on board a prosumer
drone with 25 min of flight time. It provides a 50 kcps count rate, 8% energy
resolution (at 662 keV) and a full-scale range up to 1.7 MeV. Thanks to the affordability
of the solution, a swarm of drones could be deployed for versatile environmental
monitoring of radioactivity and identification of radionuclides.

1 Introduction

Environmental monitoring represents one of the application domains where the
pervasiveness of wireless sensor networks (WSN) can provide the most significant
improvements (to human health and planet preservation), as well as in terms of safety, efficiency
and economy. Relevant examples include monitoring of water [1] and air [2] quality
with miniaturized and affordable devices. In addition to traditional fixed networks, an
emerging paradigm is the combination of static data with moving sensors, at different
scales spanning from satellites to unmanned aerial vehicles (UAV), commonly known
as drones [3]. The key features that make a sensing technology scalable to very dense
networks are low cost, miniaturization, low power, ease of networking, and potential for
data compression and distributed/edge processing.
The field of radiation monitoring is following the same trend towards pervasiveness.
However, this evolution is characterized by a lag due to peculiar aspects, both at technical
level (detection principle and requirements) and non-technical ones, such as sanitary and
geo-political implications of radioactivity mapping.
In this work we focus on the development of a compact γ-ray spectrometer operating
on board a medium-grade drone. The main goal of the instrument is to measure
the energy spectrum in order to identify the presence of radionuclides in the
environment, such as ¹³⁷Cs (photopeak at 662 keV), ⁶⁰Co (1.33 MeV), ¹³¹I (364 keV), etc.
Although several studies of drone-based environmental radiation monitoring have been

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 55–61, 2021.
https://doi.org/10.1007/978-3-030-66729-0_7

already reported [4], they are mostly based on bulky instrumentation. Commercial
systems (Fig. 1) have a minimum weight of about 3 kg [5], requiring expensive
professional drones. A similar situation characterizes recently proposed research prototypes:
for example, the Lusi drone [6], equipped with several sensors and a sampler for the
exploration of extreme environments, has a take-off weight of 7.5 kg (and a flight time of 6 min
with a weight of 6 kg). Also in the case of γ-ray spectrometers based on lightweight
solid-state detectors, such as Cadmium Zinc Telluride (CdZnTe), an octocopter with
4 kg payload is employed [7]. Even when compact detectors are realized, such as a
Compton-camera, power dissipation is often not compatible with a medium-size drone
(e.g. a current consumption of 1 A at +5 V [8]). Thus, this novel development is moti-
vated by the need for a low-power and lightweight gamma spectrometer for versatile
and parallelizable field deployment.

2 System Design
The design specifications for the instrument are: spectrometric capability (1024 chan-
nels) with energy resolution better than 10% at 662 keV, energy range from 80 keV to
1.4 MeV, count rate above 30 kcps, compactness, robustness and total weight below
1 kg.

Fig. 1. Examples of reference commercial γ-ray spectrometers equipped with scintillators of
comparable volume (~2″) read by photomultiplier tubes (PMT): detector weight and power
consumption require professional, heavy-lift and expensive UAVs and prevent operation of swarms
of drones.

In order to contain the cost of this solution, a traditional indirect-detection
architecture was chosen: a scintillating material converts gamma photons into secondary
photons, which are detected by photodetectors coupled to the crystal. After some initial tests with
a cubic scintillator [9], a cylindrical 2″ NaI crystal (Scionix) was selected. Being a
standard shape, it allows cost reduction, availability from different manufacturers and
ease of mechanical design.
The main novelty lies in the replacement of bulky and fragile photomultiplier
tubes (PMT), which require hundreds of volts of bias and are not compatible with magnetic fields,
with solid-state silicon photomultipliers (SiPM). Optical simulations were performed
to optimize the number and position of SiPMs (geometry details cannot be disclosed
due to commercial confidentiality) looking for an optimal balance between cost of the
photodetectors and energy resolution, related to the fraction of collected light.

Fig. 2. Wireless spectrometer: (a) scheme of the electronics and (b) prototype encapsulated in a
light-proof 3D-printed case with external antenna and power supply connector.

Fig. 3. Portable ground receiver: Raspberry Pi 3 touchscreen (a), software main page (b) and
spectrum display page (c).

The architecture of the electronics is shown in Fig. 2a. The currents of all SiPM pixels
are merged into a single readout channel composed of a transimpedance amplifier, a
CR-RC shaper and a peak stretcher. The 32-bit ARM microcontroller (STM32) handles
the acquisition of the events (with a 12-bit 2.4 Msps ADC). Furthermore, it controls
the DC-DC converter setting the SiPM bias (~35 V), that is dynamically adjusted to
avoid thermal dependence of the gain (and to avoid the weight of a thermal stabilization
unit). The radio transceiver (HumPRO900) operates in the 915 MHz band with a serial
data rate of 115200 bit/s and 9.5 dBm output power. Auxiliary sensors are integrated
in the compact boards (5 × 5 cm² area): a tri-axial accelerometer (±8 g, to provide
inertial information), a magnetometer and a thermometer. The thermometer is used for
compensating the thermal drift of the SiPM gain and for checking thermal gradients that
can be dangerous for the crystal.
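The dynamic bias adjustment described above can be illustrated with a simple linear tracking rule. This is only a minimal sketch: it assumes that the SiPM breakdown voltage shifts linearly with temperature, so that the bias must follow it to keep the overvoltage (and hence the gain) constant. Only the nominal ~35 V value comes from the text; the reference temperature, the 30 mV/°C coefficient and the function name are hypothetical placeholders, not values reported by the authors.

```python
def compensated_bias(temp_c, v_ref=35.0, t_ref=25.0, coeff_v_per_c=0.030):
    """Return a SiPM bias (V) that keeps the overvoltage roughly constant.

    Assumes the breakdown voltage shifts linearly with temperature, so the
    bias is moved by the same amount. v_ref is the nominal ~35 V bias quoted
    in the text; t_ref and coeff_v_per_c are illustrative assumptions.
    """
    return v_ref + coeff_v_per_c * (temp_c - t_ref)


# Example: bias requested from the DC-DC converter at three temperatures.
for temperature in (5.0, 25.0, 35.0):
    print(f"{temperature:5.1f} degC -> {compensated_bias(temperature):.3f} V")
```

In a real instrument the coefficient would be taken from the SiPM datasheet or from a calibration run, and the microcontroller would apply the correction each time the thermometer is read.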
Two versions of the acquisition software were realized (Fig. 3): initially, a Matlab
GUI was developed for operation of the instrument with a wired USB connection. Once
the system had been fully characterized, an optimized Python version was created to
run on the battery-powered Raspberry Pi (3B) single-board computer used as a portable
ground receiving station in the field.

3 Experimental Results

The spectrometer was initially characterized in the laboratory: wireless data
communication (exceeding 300 m in free air) and the maximum count rate (50 kcps, limited by
the ADC conversion time) were validated by electrically pulsing the input. Calibration
sources (¹³³Ba, ¹³⁷Cs and ⁶⁰Co) were employed to test the spectroscopic energy range
(from 60 keV to 1.7 MeV, Fig. 4) and to assess the FWHM energy resolution, which
at 662 keV is 11% with a single SiPM and 8.35% with 3 tiles.
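For readers unfamiliar with the figure of merit, the relative FWHM resolution is the full width at half maximum of the photopeak divided by its energy. The sketch below shows one possible way to estimate it from a calibrated histogram by linear interpolation of the half-maximum crossings. It is not the authors' analysis procedure: background subtraction and peak fitting are omitted, and the synthetic Gaussian peak is used only to exercise the function.

```python
import math


def fwhm_resolution(counts, energies):
    """Estimate the relative FWHM resolution of the tallest peak.

    counts and energies are equal-length sequences (one entry per channel).
    The half-maximum crossings on either side of the peak are located by
    linear interpolation; background subtraction is deliberately omitted.
    """
    peak = max(range(len(counts)), key=counts.__getitem__)
    half = counts[peak] / 2.0

    def crossing(step):
        i = peak
        # Walk away from the peak while the next channel is still above half.
        while 0 <= i + step < len(counts) and counts[i + step] > half:
            i += step
        j = i + step
        if not 0 <= j < len(counts):
            raise ValueError("peak is truncated by the spectrum edge")
        # Interpolate between channel i (above half) and j (at or below half).
        frac = (counts[i] - half) / (counts[i] - counts[j])
        return energies[i] + frac * (energies[j] - energies[i])

    fwhm = crossing(+1) - crossing(-1)
    return fwhm / energies[peak]


# Synthetic Gaussian photopeak at 662 keV with an 8.35 % FWHM (illustrative).
sigma = 0.0835 * 662.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))
energies = [float(e) for e in range(400, 925)]
counts = [1000.0 * math.exp(-0.5 * ((e - 662.0) / sigma) ** 2) for e in energies]
resolution = fwhm_resolution(counts, energies)
print(f"FWHM resolution: {100.0 * resolution:.2f} %")
```

At 662 keV, a resolution of 8.35% corresponds to a peak roughly 55 keV wide at half maximum.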

Fig. 4. Gamma spectroscopy with calibration sources: (a) low energy with ¹³³Ba and (b) high
energy with ⁶⁰Co. Energy resolution spans from 11% to 8% depending on the number of SiPMs.

Then, the system was mounted on the UAV (by means of a 3D-printed holder, Fig. 5)
and tested in the field. We selected the Tarot 1000 Octocopter, a standard medium-grade
drone (cost < 1000 $) with 8 propellers, a suitable payload (~1 kg), GPS receiver and a
carbon fiber structure. It is equipped with a 6-cell 22 V, 14000 mAh battery from which
the spectrometer is powered (+5 V are generated by means of a DC-DC converter). The

Fig. 5. Tarot 1000 Octocopter mounting the gamma spectrometer.

average spectrometer consumption is 130 mA during acquisition (peaking at 180 mA
during radio transmission, which lasts a few ms). Spectra are acquired and transmitted every
second. Due to the addition of the spectrometer, the resulting flight time is reduced by
only 5 min (from 30 min to 25 min), an acceptable value.
As reported in Fig. 6, by hovering for 30 s at each point of a grid with 1 m spacing, a
three-dimensional map of the counts (integrated over 30 spectra and after subtraction of the

Fig. 6. Validation of the spectrometer in the field: aerial mapping of the ¹³⁷Cs gamma photons.

local background) of a calibration 137 Cs source can be acquired. Far from the source
(points and ) the measured count rate is ~2 cps. This value can be compared with the
safety threshold activity for a ¹³⁷Cs source of 10 kBq (in the Italian regulations), which,
considering the solid angle of the detector at 1 m distance, turns into a count rate of
~1.5 cps, demonstrating the feasibility for this instrument to spot dangerous sources
hidden in the environment at a speed of ~0.5 km/h. This spatial resolution is compatible
with what has been reported for drone-based surveys in the Fukushima area: 2–5 m resolution at
10 m height and 10–20 min flight duration [10].
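The order of magnitude of the ~1.5 cps figure can be checked with a far-field solid-angle estimate. The sketch below is a back-of-the-envelope calculation under stated assumptions, not the authors' derivation. It assumes a point source on the detector axis, a circular detector face of 5.08 cm diameter (a hypothetical 2-inch crystal), an emission probability of about 0.85 for the 662 keV line of ¹³⁷Cs, and an ideal detection efficiency of 1.

```python
import math


def expected_count_rate(activity_bq, distance_m, detector_diameter_m,
                        emission_probability=0.85, efficiency=1.0):
    """Count rate (cps) from a point source seen by a circular detector face.

    Uses the far-field solid-angle fraction A / (4*pi*d^2). The emission
    probability (~0.85 for the 662 keV line of 137Cs), the ideal efficiency
    of 1.0 and the detector diameter are all assumptions of this sketch.
    """
    face_area = math.pi * (detector_diameter_m / 2.0) ** 2
    geometric_fraction = face_area / (4.0 * math.pi * distance_m ** 2)
    return activity_bq * emission_probability * efficiency * geometric_fraction


# 10 kBq source at 1 m, hypothetical 2-inch (5.08 cm) diameter crystal.
rate_cps = expected_count_rate(10e3, 1.0, 0.0508)
print(f"Expected rate: {rate_cps:.2f} cps")
```

With these assumptions the estimate is about 1.4 cps, the same order as the value quoted in the text. The inverse-square dependence on distance also shows why the mapping grid has to be fine to spot sources near the threshold activity.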

4 Conclusions
A compact SiPM-based spectrometer has been validated on a consumer-grade drone. It
enables achieving performance comparable with bulkier detectors (1.7 MeV full scale
range, 8% FWHM energy resolution at 662 keV, 50 kcps max. count rate) with a weight
below 1 kg (dominated by the crystal weight). This weight is less than one third of
that offered by the commercial state of the art and is thus suitable for deployment with
affordable drones, following GPS waypoints or flying in dynamic swarms and able to map
the territory and identify incorrectly disposed radioactive sources (with an activity above
safety thresholds). Furthermore, machine learning algorithms for real-time collimator-
free direction finding of gamma sources have been embedded in the same microcontroller
adopted in this spectrometer [11], paving the way to future strategies for automatic source
localization, potentially leveraging a swarm of affordable drones.

Acknowledgments. The authors would like to acknowledge the following people: Emanuele
Lavelli and Luca Lorusso, former master's students at Politecnico di Milano; TNE Nuclear
Electronics (Italy), which partially supported the development of the instrument; and
Gianluca Passarella, who professionally tuned and piloted the drone.

References
1. Carminati, M., Turolla, A., Mezzera, L., Di Mauro, M., Tizzoni, M., Pani, G., Zanetto, F.,
Foschi, J., Antonelli, M.: A self-powered wireless water quality sensing network enabling
smart monitoring of biological and chemical stability in supply systems. Sensors 20(4), 1125
(2020)
2. Carminati, M., Ferrari, G., Sampietro, M.: Emerging miniaturized technologies for airborne
particulate matter pervasive monitoring. Measurement 101, 250–256 (2017)
3. Carminati, M., Kanoun, O., Ullo, S.L., Marcuccio, S.: Prospects of distributed wireless sensor
networks for urban environmental monitoring. IEEE Aerosp. Electron. Syst. Mag. 34(6),
44–52 (2019)
4. Connor, D., Martin, P.G., Scott, T.B.: Airborne radiation mapping: overview and application
of current and future aerial systems. Int. J. Remote Sens. 37(24), 5953–5987 (2016)
5. Corbo, M., Morichi, M., Fanchini, E., Mini, G., Pepperosa, A., Mangiagalli, G.: Modular
and integrated sensor network of intelligent radiation monitor systems for radiological and
nuclear threat response. EPJ Web Conf. 225, 07005 (2020)
6. Di Stefano, G., Romeo, G., Mazzini, A., Iarocci, A., Hadi, S., Pelphrey, S.: The Lusi drone:
a multidisciplinary tool to access extreme environments. Mar. Pet. Geol. 90, 26–37 (2018)

7. Aleotti, J., Micconi, G., Caselli, S., Benassi, G., Zambelli, N., Bettelli, M., Zappettini, A.:
Detection of nuclear sources by UAV teleoperation using a visuo-haptic augmented reality
interface. Sensors 17(10), 2234 (2017)
8. Sato, Y., Ozawa, S., Terasaka, Y., Kaburagi, M., Tanifuji, Y., Kawabata, K., Miyamura, H.N.,
Izumi, R., Suzuki, T., Torii, T.: Remote radiation imaging system using a compact gamma-ray
imager mounted on a multicopter drone. J. Nucl. Sci. Technol. 55(1), 90–96 (2018)
9. Montagnani, G.L., Carminati, M., Lavelli, E., Morandi, G., Rizzacasa, P., Fiorini, C.: SiPM-
based scrap metal radioactivity detector embeddable in lifting electromagnets. In: 2018 IEEE
Nuclear Science Symposium and Medical Imaging Conference Proceedings (NSS/MIC), 1–3
(2018)
10. Mochizuki, S., Kataoka, J., Tagawa, L., Iwamoto, Y., Okochi, H., Katsumi, N., Kinno, S.,
Arimoto, M., Maruhashi, T., Fujieda, K., Kurihara, T., Ohsuka, S.: First demonstration of
aerial gamma-ray imaging using drone for prompt radiation survey in Fukushima. J. Inst.
12(11), P11014–P11014 (2017)
11. Buonanno, L., Di Vita, D., Carminati, M., Fiorini, C.: A Directional Gamma-Ray Spectrometer
with Microcontroller-Embedded Machine Learning. IEEE J. Emerg. Sel. Topics Circuits Syst.
10(4), 433–443 (2020)
Winter: A Novel Low Power Modular Platform
for Wearable and IoT Applications

Patrick Locatelli1(B), Asad Hussain1, Andrea Pedrana1, Matteo Pezzoli2,
Gianluca Traversi1, and Valerio Re1
1 Department of Engineering and Applied Sciences, University of Bergamo, Viale Marconi 5,
24044 Dalmine, BG, Italy
patrick.locatelli@unibg.it
2 Department of Electrical, Computer, and Biomedical Engineering, University of Pavia,
Via Ferrata 5, 27100 Pavia, PV, Italy

Abstract. This paper presents a new multi-purpose, ultra-low power device
designed to cope with the typical limitations of commercial IoT platforms. The
device can be used both as (1) a stand-alone system, with embedded sensing, data
processing, storage and communication capabilities, and as (2) a motherboard for
miniaturized expansion boards, which can be plugged on its top to enhance the
basic features. The paper also reports the performance of the system, as well as
a use case in an IoT context.

1 Introduction
Thanks to the synergy of wireless technologies, Micro Electro-Mechanical Systems
(MEMS) and the Internet, smart and wearable devices are becoming the driving force
in the Internet-of-Things (IoT) era. The capability to monitor different parameters by
means of several sensing units opens such systems to a wide range of applications:
from the recording of human movements for fitness and rehabilitation purposes, to the
monitoring of environmental parameters for quality of life assessment [1].
In this context, the massive spread of machine learning and big data applications
has greatly increased not only the amount, but also the variety of data needed. This is
especially true for medical research projects, in which the selection of the optimal devices
has taken on a key role [2]. It is the authors’ opinion that among the requirements such
devices should address, the availability of raw data which can be processed by in-house
algorithms, and the opportunity to collect different data types for long periods are the
most valuable for research purposes. Nevertheless, most of the commercial devices do not
grant access to the raw information, providing only the output of proprietary algorithms;
on the other hand, many research-grade products are designed for specific applications,
with either low memory or few sensors.
Driven by the limitations of the wearable devices used in previous studies, the authors
developed a new wearable platform, specifically designed to be both (1) an efficient
monitoring system suitable for edge computing and real-time data analysis, and (2) a
long-term data logger, providing up to a few days of continuous data log thanks to low
power consumption and a high-capacity memory.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 62–68, 2021.
https://doi.org/10.1007/978-3-030-66729-0_8

2 System Architecture
Winter, an acronym of Wearable Inertial TrackER, is the result of a design aimed to
provide a system-on-board with sensing and processing capabilities, low power con-
sumption, (relatively) high storage capacity thanks to an on-board SD connector and
wireless transmission capabilities provided by a BLE module, all in a form factor of
32 × 20 mm2 (see Fig. 1). The overall block architecture of the system, as well as the
communication interfaces used, are depicted in Fig. 2.

Fig. 1. The Winter device.

Fig. 2. Block diagram of the Winter platform. Processing, sensing, storage, connectivity and I/O
blocks are grouped by color.

Processing. The core processing unit of Winter is the ultra-low power STM32L475RG
microcontroller unit (MCU), manufactured by STMicroelectronics. Based on the ARM
Cortex-M4 32-bit architecture, it embeds high-speed memories (1 MB of Flash mem-
ory, 128 kB of SRAM), a low-power RTC (Real-Time Clock) and an extensive range of
enhanced I/Os and peripherals. In addition to this, the ABS06-1-T SMD crystal (Abra-
con) was connected to the microcontroller: by generating a clock signal of 32.768 kHz
with a frequency tolerance of ±10 ppm (parts per million), a more accurate and precise
timebase can be obtained.
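To give a sense of scale for the ±10 ppm tolerance, the worst-case timestamp drift accumulated over a logging session can be estimated as below. This is a simple illustration that considers the frequency tolerance only; temperature dependence and aging of the crystal are ignored, and the function name is hypothetical.

```python
def worst_case_drift_s(tolerance_ppm, elapsed_s):
    """Worst-case clock error (s) accumulated over elapsed_s seconds.

    Only the frequency tolerance is considered; temperature dependence and
    aging of the crystal are ignored in this sketch.
    """
    return tolerance_ppm * 1e-6 * elapsed_s


# One day of logging with a +/-10 ppm crystal.
drift_per_day = worst_case_drift_s(10, 24 * 3600)
print(f"Worst-case drift: {drift_per_day:.3f} s/day")
```

Under these assumptions the timebase stays within about 0.9 s per day, which matters when data from several nodes have to be aligned after multi-day recordings.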
Sensing. The sensing capabilities of Winter range from environmental monitoring to
inertial measurements. These are achieved by the presence of three on-board modules.

The first module is the LSM6DSL (STMicroelectronics), a system-in-package
featuring a 3D digital accelerometer and a 3D digital gyroscope with full-scale ranges of
±2/±4/±8/±16 g and ±125/±245/±500/±1000/±2000 dps, respectively. The module
has a current draw of 0.65 mA when both sensors are set to operate in high-performance
mode, and it enables always-on low-power features. An SPI connection to the MCU was
preferred over the I2C bus to work at higher baud rates.
The second and third sensing modules are the HTS221 and the LPS22HH, from
STMicroelectronics. The former is an ultra-compact sensor for relative humidity (rH)
and temperature, with an accuracy of ±3.5% in the 20–80% rH interval and ±0.5 °C
in the temperature range from 15 to 40 °C. The latter is an ultra-compact piezoresistive
absolute pressure sensor functioning as a digital output barometer, with a 260 to 1260 hPa
absolute pressure range (accuracy: 0.5 hPa). Both modules are connected to the MCU
via an I2C interface and are suitable for ultra-low power applications, having a current
draw below 3 µA at 1 Hz ODR (Output Data Rate).
Storage. The MicroSD card socket mounted on the Winter platform makes it possible
to use an external memory card to store raw data and processed results. The data transfer
is performed by using the 4-line SDMMC communication interface provided by the
MCU, at a rate up to 48 MHz (8-bit mode). The stored data can be accessed either by
removing the SD card or through the USB connection.
Connectivity. The communication with external devices allows Winter (1) to transfer
the output of the embedded data analysis algorithms, and (2) to be remotely configured
to operate in one of the available working modes. To enable such communication, two
different solutions were adopted. The first solution is represented by the SPBTLE-RF, an
easy-to-use Bluetooth Low Energy master/slave network processor module provided by
STMicroelectronics, compliant with Bluetooth v4.1. The second solution is a
Micro-USB 2.0 high-speed port that allows data transfer with an external device in
both directions.
Power Management. The system is powered by a 3.7 V, 210 mAh lithium polymer
battery, scaled down to the 3 V operating voltage by means of a high efficiency step-
down converter (TPS62740 manufactured by Texas Instruments). The battery can be
recharged through the dedicated micro-USB connector and the process is supervised by
the MCP73831 charging controller by Microchip. To monitor the battery charge level,
the MAX17048X+ was integrated and connected to the MCU via an I2C interface.
I/O. Two 10-pin connectors are mounted on the bottom side of the Winter platform. The
first one is a simple debug connector meant to be removed at release stage. The second
one enables the Winter platform to extend its sensing and communication capabilities by
providing a socket for potential expansion boards. In fact, the system’s power supply, as
well as all of the most common interfaces available (SPI, I2C, UART) can be exploited
through this connector by any expansion board, which only needs to be plugged on the top
of Winter. In such a context, while the Winter platform would act as a motherboard, i.e.
supplying the required power and providing basic storage and connectivity capabilities,
an expansion board could be designed with the strictly necessary components, thus
minimizing its form factor.

3 Firmware and Performance


3.1 Finite-State Machine
Figure 3 depicts the finite-state machine (FSM) embedded on the Winter’s MCU, whereas
the state details are described in Table 1.

Fig. 3. Finite-state machine of the Winter platform.

Table 1. List of Winter’s FSM states (Inrt: inertial; Axl: accelerometer; Env: environmental).

Name      MCU mode    Inrt sensors  Env sensors  Data storage  Bluetooth status
INIT      Run         –             –            –             –
IDLE      LP Run      OFF           ON           Inactive      Advertising
SLEEP     Stop 2      Axl ON        OFF          OFF           OFF
READY     LP Run      OFF           ON           Inactive      Connected
LOG_INRT  LP Run/Run  ON            ON           Active        Connected/Adv
LOG_ENV   Run/Stop 2  OFF           ON           Active        Connected/Adv

After the success of the startup routine (INIT state), the platform enters the IDLE state
characterized by the inertial sensor turned off, the environmental sensing components
powered on and the Bluetooth (BT) module being ready to accept a connection. If a
connection is not established within a defined time window (1 min), the system enters
the SLEEP state: the BT module is switched off and the microcontroller is set to operate
in Stop 2 mode. A double-click event detected by the inertial module awakens the system
in the INIT state.
From the IDLE state, an active BT connection makes the FSM move to the READY
state, in which the platform is ready to communicate with the master device. By means
of the BT command set, Winter can then enter two different log states. The first one is
called LOG_INRT, and involves the log of inertial data from the selected sensors (only
accelerometer, only gyroscope or both) on the SD card at a selected frequency; the MCU
continuously alternates the Low-Power Run mode (collection phase) and the Run mode
(SD writing/analysis phase) to reduce the mean power consumption. The second one
is called LOG_ENV: while the inertial sensors are turned off, the MCU is kept in Stop

2 mode and is awakened (Run mode) every 1 min only for the time required to read the
environmental parameters and store them on the SD card. Both these states can operate
either with an active BT connection, or with the BT module in Advertising mode.
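The state transitions described in this section can be summarized in a small table-driven sketch. It is not the firmware itself (which runs on the STM32 MCU): the event names are hypothetical labels for the triggers mentioned in the text, and transitions that the text does not describe, such as leaving a log state or losing the connection, are deliberately left out.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()
    IDLE = auto()
    SLEEP = auto()
    READY = auto()
    LOG_INRT = auto()
    LOG_ENV = auto()


# Event names are hypothetical labels for the triggers described in the text.
TRANSITIONS = {
    (State.INIT, "startup_ok"): State.IDLE,
    (State.IDLE, "connect_timeout"): State.SLEEP,   # no connection within 1 min
    (State.SLEEP, "double_click"): State.INIT,      # detected by the inertial module
    (State.IDLE, "bt_connected"): State.READY,
    (State.READY, "cmd_log_inertial"): State.LOG_INRT,
    (State.READY, "cmd_log_environment"): State.LOG_ENV,
}


def next_state(state, event):
    """Return the next state, or stay in place for an unlisted event."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.INIT):
    """Feed a sequence of events to the machine and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


print(run(["startup_ok", "bt_connected", "cmd_log_environment"]).name)
```

Keeping the transitions in a single dictionary makes it easy to compare the sketch against Fig. 3 and Table 1, and to add the missing transitions once they are specified.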
Data collected by Winter can be accessed in two ways only. The first one requires
the user to remove the SD card: all the logged data (either inertial or environmental,
depending on the mode previously used) can be downloaded to any device embedding
an SD support (e.g. computer, smartphone), and the SD card can then be cleared and
used again for new acquisition sessions. Alternatively, data are periodically provided
to the outer world through the BLE characteristics system: this holds true only for
environmental data, and it is performed in all those states in which such sensors are
active. No real-time data streaming is currently provided by the system, although it
could be implemented in future releases of the firmware.
During each of the different states of the FSM, the scheduling of tasks to perform
is based on the internal interrupts handling mechanism natively embedded in the MCU
management. For each task, the timebase represented by the system clock is used to raise
an internal interrupt after a specific time period has elapsed; then, the related callback
function is invoked depending on the interrupt priority (i.e. the priority of the task);
finally, the code related to the interrupt handling is executed. Specifically, the main tasks
performed by the system are the following (the order defines the priority from highest
to lowest): (1) reading the data collected by the on-board sensors from the respective
registers; (2) logging the data onto the SD card (if present); (3) handling the Bluetooth
connection, by interpreting the incoming commands and by updating the characteristics
to be read by the user.

3.2 Power Consumption

By means of the developed testing firmware, a preliminary evaluation of the overall power
consumption was carried out. The results of the measurements are reported in Table 2.
The 210 mAh battery allows up to 3 days of continuous inertial data log (with both
accelerometer and gyroscope’s ODRs set to 416 Hz), or up to 2 months of environmental
monitoring (with one-shot measurements per minute).

Table 2. Average power consumption of the states (power supply: 3.7 V, 210 mAh battery).

State      Current drawn  Power consumption  Autonomy
IDLE       0.966 mA       3.574 mW           9 days
SLEEP      0.045 mA       0.167 mW           6 months
READY      1.046 mA       3.870 mW           8 days
LOG_INRT   2.881 mA       10.660 mW          3 days
LOG_ENV    0.171 mA       0.633 mW           ~2 months
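The power and autonomy columns of Table 2 follow from the measured currents, the 3.7 V supply and the 210 mAh capacity. The short sketch below reproduces that arithmetic under idealized assumptions: the full nominal capacity is usable, and converter losses, self-discharge and battery aging are ignored. The function names are illustrative only.

```python
BATTERY_MAH = 210.0   # nominal capacity from the text
BATTERY_V = 3.7       # nominal supply voltage from the text

# Average current drawn in each state (mA), as reported in Table 2.
CURRENT_MA = {
    "IDLE": 0.966,
    "SLEEP": 0.045,
    "READY": 1.046,
    "LOG_INRT": 2.881,
    "LOG_ENV": 0.171,
}


def power_mw(current_ma, voltage_v=BATTERY_V):
    """Average power in mW for a given average current in mA."""
    return current_ma * voltage_v


def autonomy_days(current_ma, capacity_mah=BATTERY_MAH):
    """Ideal runtime in days; ignores converter losses, self-discharge, aging."""
    return capacity_mah / current_ma / 24.0


for state, current in CURRENT_MA.items():
    print(f"{state:9s} {power_mw(current):7.3f} mW {autonomy_days(current):7.1f} days")
```

For example, the LOG_INRT state gives 210 mAh / 2.881 mA ≈ 73 h, i.e. about 3 days, and the LOG_ENV state gives roughly 51 days, consistent with the "~2 months" entry.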

3.3 Data Comparison


Accelerations collected by Winter were compared to those collected by MuSe, a
research-grade inertial platform used by the authors in other works [3, 4], both in static and
dynamic conditions (Fig. 4). Acceleration signals collected by the two platforms are almost
identical on each axis, thus confirming the possibility of using Winter in place of MuSe
when the collection of inertial data is needed. Some negligible differences can still
be observed in static conditions: these can be ascribed to the different 0 g-offset values
of the devices’ inertial modules (±90 mg for MuSe’s LSM9DS1, ±40 mg for Winter’s
LSM6DSL). However, such differences will be reduced once the self-zeroing procedure
is implemented.

Fig. 4. Comparison of acceleration magnitude collected by MuSe and Winter devices in quasi-
static (left) and dynamic (right) conditions.

4 IoT Application: Indoor Environmental Monitoring


To simultaneously test the system and to provide a use case of the Winter platform,
an IoT-oriented application was developed. Specifically, the presented device was used
to continuously collect information about the environmental conditions of a room, i.e.
temperature, humidity level and atmospheric pressure. The system is shown on the
left-hand side of Fig. 5 and was composed of: (1) a Winter device, used to collect the
environmental data; (2) a Raspberry Pi 3, connected to the Internet via an Ethernet cable
and acting as a gateway; (3) an open-source IoT application named ThingSpeak, which
provides both a cloud server and a customizable graphical front-end.
The availability of a BLE module on the Raspberry platform allowed a Python
script to be developed ad-hoc for a periodical readout of the environmental data: these
were made available by Winter as characteristics of the standard Environmental Sensing
Service (ESS) of the BLE protocol. Data were then streamed to ThingSpeak cloud to
aggregate and visualize the live values. Towards this goal, a dedicated channel was
created on the web platform at first; then, the provided API keys as well as the function
calls were embedded in the aforementioned Python script, to allow the batch job to send
data packets via the HTTP protocol: Fig. 5 (right) depicts ThingSpeak’s temperature
chart for multiple monitoring sessions. Starting from this simple example, a network
composed of several Winter nodes connected to the same gateway could be easily set up
to monitor different rooms within the same building.
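
The gateway logic described above can be sketched as follows. This is not the authors' script: it only shows how raw ESS characteristic values could be decoded and packed into a ThingSpeak update request. It assumes the standard GATT encodings (Temperature 0x2A6E as sint16 in 0.01 °C, Humidity 0x2A6F as uint16 in 0.01 %, Pressure 0x2A6D as uint32 in 0.1 Pa); the mapping of quantities to `field1`..`field3` and the API key are hypothetical, and the BLE read itself is omitted because it depends on the chosen library.

```python
import struct
from urllib.parse import urlencode

THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"


def decode_temperature(raw: bytes) -> float:
    """ESS Temperature (0x2A6E): little-endian sint16, unit 0.01 degC."""
    return struct.unpack("<h", raw)[0] / 100.0


def decode_humidity(raw: bytes) -> float:
    """ESS Humidity (0x2A6F): little-endian uint16, unit 0.01 %."""
    return struct.unpack("<H", raw)[0] / 100.0


def decode_pressure_hpa(raw: bytes) -> float:
    """ESS Pressure (0x2A6D): little-endian uint32, unit 0.1 Pa; returned in hPa."""
    return struct.unpack("<I", raw)[0] / 1000.0


def build_update_url(api_key, temperature_c, humidity_pct, pressure_hpa):
    """Build the HTTP GET URL that writes one sample to a ThingSpeak channel."""
    query = urlencode({
        "api_key": api_key,          # write API key of the channel (placeholder)
        "field1": temperature_c,     # field assignment is an assumption
        "field2": humidity_pct,
        "field3": pressure_hpa,
    })
    return f"{THINGSPEAK_UPDATE_URL}?{query}"


# Example with hand-made payloads standing in for the BLE readout
url = build_update_url(
    "YOUR_WRITE_KEY",
    decode_temperature(struct.pack("<h", 2346)),
    decode_humidity(struct.pack("<H", 4550)),
    decode_pressure_hpa(struct.pack("<I", 1013250)),
)
print(url)
```

In a periodic job, the resulting URL would be requested once per readout (for instance with `urllib.request.urlopen`), respecting the update rate allowed by the channel.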

Fig. 5. On the left, the environmental monitoring system based on Winter device. On the right, a
web chart representing the output of the embedded temperature sensor.

5 Conclusions

This work presented a new multi-purpose, smart, wearable device specifically designed
to overcome the limitations of the typical devices used in IoT contexts. Performance tests
confirmed the capability of the platform to work in ultra-low power regimes. Moreover,
the similarities between inertial data from Winter and MuSe support the replacement of
the latter in the authors’ future studies. An application example was also
described for Winter: the device was used as an IoT node for environmental monitoring.
However, the variety of integrated sensors and the enhancement capabilities supported by
the expansion connector allow Winter to be used for many purposes: activity tracking,
home rehabilitation assistance, physiological monitoring and quality of life assessment
are just some of the potential applications.

Acknowledgments. Authors want to thank Francesco Galizzi for his contribution to the design
of the platform.

References
1. Mamun, M.A.A., Yuce, M.R.: Sensors and systems for wearable environmental monitoring
toward IoT-enabled applications: a review. IEEE Sensors J. 19(18), 7771–7788 (2019). https://
doi.org/10.1109/jsen.2019.2919352
2. Polhemus, A.M., et al.: Human-centered design strategies for device selection in mHealth
programs: development of a novel framework and case study. JMIR Mhealth Uhealth 8(5),
e16043 (2020). https://doi.org/10.2196/16043. PMID: 32379055, PMCID: 7243134
3. Locatelli, P., Alimonti, D.: Differentiating essential tremor and Parkinson’s disease using a
wearable sensor—a pilot study, In: 2017 7th IEEE International Workshop on Advances in
Sensors and Interfaces (IWASI), Vieste, pp. 213–218 (2017). https://doi.org/10.1109/iwasi.
2017.7974254
4. Pedrana, A., Comotti, D., Locatelli, P., Traversi, G.: Development of a telemedicine-oriented
gait analysis system based on inertial sensors. In: 2018 7th International Conference on Modern
Circuits and Systems Technologies (MOCAST), Thessaloniki, pp. 1–4 (2018). https://doi.org/
10.1109/mocast.2018.8376592
Hardware–Oriented Data Recovery
Algorithms for Compressed
Sensing–Based Vibration Diagnostics

Federica Zonzini1(B), Matteo Zauli1, Antonio Carbone2, Francesca Romano2,
Nicola Testoni1, and Luca De Marchi2

1 ARCES - Advanced Research Center of Electronic Systems, 40136 Bologna, Italy
{federica.zonzini,matteo.zauli7,nicola.testoni}@unibo.it
2 DEI - University of Bologna, 40136 Bologna, Italy
{antonio.carbone5,francesca.romano15}@studio.unibo.it,
l.demarchi@unibo.it

Abstract. The huge amount of data to be processed is still a challenge
for the current Structural Health Monitoring (SHM) sensor networks and
a primary reason for the development of hardware–oriented signal processing
techniques. In the specific case of vibration–based inspections, the
sparse spectral distribution of structures in dynamic regime has made
the Compressed Sensing (CS) paradigm a compelling solution. In this
work, the on–sensor deployment of data recovery procedures is tackled
from an edge–computing perspective, aiming at selecting the best–performing
strategy which allows for the joint optimization of the implied memory
storage, latency and consistency of the retrieved structural information.
An experimental campaign conducted on a steel beam undergoing
ground motion excitation revealed that the Orthogonal Matching Pursuit
(OMP) strategy might be a promising candidate for sensor deployment,
since it attains the highest reconstruction levels while minimizing the
associated memory/time cost.

Keywords: Hardware–oriented compressed sensing · Structural health
monitoring · Vibration analysis

1 Introduction
The reduction in the amount of data to be transferred in a sensor network
is a fundamental issue in the context of Structural Health Monitoring (SHM),
where the requirements in terms of real–time functionalities, memory storage
and network congestion are increasingly more restrictive due to the size and/or
harshness of current monitoring scenarios.
Focusing on vibration–based SHM, structures in the dynamic regime present
a well–distinguishable vibration pattern characterized by few and highly local-
ized frequency peaks. From a signal processing perspective, this means that

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 69–75, 2021.
https://doi.org/10.1007/978-3-030-66729-0_9

vibration signals are sparse in the Fourier domain, thus a small number of coef-
ficients can capture most of the total mechanical energy [1]. Such a property
matches with the sparsity premise at the basis of the Compressed Sensing (CS)
theory [2], a signal processing paradigm which aims at reducing data size by
resorting to ad–hoc sparse representations.
Several examples of compression/decompression stages performed on remote
stations or dedicated servers are present in literature; conversely, effective on–
board implementations of CS techniques are rarely described. Among the most
significant works, a customized version of the Imote2 sensor platform was pre-
sented in [3], showing good performance for the on–line structural assessment
of long–span structures. Similarly, authors in [4] employed the Narada wireless
sensor as a prototyping board for acceleration compression in the framework of
bridge assessment.
In this paper, the practical aspects related to the definition of the most
suitable data recovery algorithm are dealt with in the framework of a miniatur-
ized monitoring network developed within the Intelligent Sensor System Lab of
the University of Bologna. Since the reconstruction stage constitutes the most
burdensome step of the CS processing flow, its optimization enables the imple-
mentation of effective energy aware monitoring solutions while tackling data
reduction at the same time. More specifically, three different strategies are com-
pared in terms of computation time, memory requirements, and accuracy of the
reconstructed vibration signals.

2 CS for SHM: From Theory to Implementation


2.1 Basics of CS
Two main ingredients are required to make the CS paradigm applicable: on one
hand, the existence of a basis $\Psi \in \mathbb{R}^{N \times N}$ in which the class of processed signals
is supposed to be sparse, i.e. $x = \Psi c$ where only $M \ll N$ significant coefficients
can be found after a signal projection stage is applied. On the other hand, the
existence of a sensing matrix $A \in \mathbb{R}^{M \times N}$ which performs the actual dimension
reduction operation.
Once Ψ and A have been determined, the CS processing flow encompasses
the three following steps:

1. Compression: given a generic signal instance $x \in \mathbb{R}^{N \times 1}$, its compressed form
$y \in \mathbb{R}^{M \times 1}$ can be computed according to the matrix–vector multiplication

$$y = A x \tag{1}$$
2. Sparse coefficients recovery: assuming that the estimation of the original sig-
nal samples can be treated either as an iterative or optimization–based prob-
lem, a wide number of algorithms were proposed, taking advantage of the
underlying sparsity condition [7] to estimate the sparse coefficients ĉ. Indepen-
dently of the fitness function of the specific algorithm, the common objective
of each method is to recover ĉ by satisfying some prescribed criteria.

3. Decompression: a good approximation x̂ of the original data can be straightforwardly
computed by projecting back the sought sparse coefficients in the
time domain, i.e.

$$\hat{x} = \Psi \hat{c} \tag{2}$$

As far as the behavior of structures in dynamic regime is concerned, Fourier
bases are conventionally exploited since the structural vibrations are sparse in
this domain [5]. At the same time, even if a wide range of strategies exists for
designing the optimal sensing matrix, Gaussian matrices, i.e. matrices where
each entry is taken from a normal distribution, are effective in the vibration
analysis field, paving the way to fully unsupervised CS approaches [6]. Accord-
ingly, the Discrete Cosine Transform (DCT) matrix and the classical Gaussian
matrix sampled from a standard normal distribution are employed in this work
as sparsity and sensing matrix, respectively.
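
The three steps above can be illustrated with a short NumPy sketch using a DCT sparsity basis and a Gaussian sensing matrix. It is not the embedded implementation: the frame size, compression ratio, active coefficients and random seed are illustrative, and the recovery step uses a known support (an "oracle") purely to close the loop between Eq. (1) and Eq. (2).

```python
import numpy as np


def dct_basis(n):
    """Return Psi (n x n) whose columns are orthonormal DCT-II atoms, so x = Psi @ c."""
    k = np.arange(n)[:, None]  # frequency index
    t = np.arange(n)[None, :]  # time index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)    # scale the DC row so that the transform is orthonormal
    return d.T                 # d is the analysis matrix, its transpose the synthesis basis


N, CR = 512, 4                 # frame size and compression ratio (illustrative)
M = N // CR
rng = np.random.default_rng(0)

psi = dct_basis(N)                       # sparsity basis
A = rng.standard_normal((M, N))          # Gaussian sensing matrix

support = [5, 20, 47]                    # hypothetical active DCT coefficients
c = np.zeros(N)
c[support] = [1.0, -0.5, 0.25]
x = psi @ c                              # test signal, sparse in the DCT domain

y = A @ x                                # Eq. (1): compression

# Oracle recovery: least squares restricted to the (assumed known) support
theta = A @ psi
c_hat = np.zeros(N)
c_hat[support] = np.linalg.lstsq(theta[:, support], y, rcond=None)[0]

x_hat = psi @ c_hat                      # Eq. (2): decompression
print(y.shape, np.max(np.abs(x - x_hat)))
```

In practice the support is unknown, which is exactly the task of the recovery algorithms discussed in the next section.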

2.2 Data Recovery Algorithms


The implementation of the data recovery algorithm must meet the computational
and storage resources of the monitoring network. The best trade–off between
algorithmic complexity, memory storage, and retrieved signal accuracy is to be
pursued. In this work, we concentrate on iterative solutions due to their faster
convergence: this property is essential in real–time inspection scenarios where the
latency due to mere processing should be kept to the minimum. In particular,
three main approaches were considered worthy of investigation [7]:

(i) Orthogonal Matching Pursuit (OMP): considered as one of the most effective
serial greedy strategies, the rationale behind this procedure is to update
the values and positions of the non–zero signal coefficients step–by–step by
exploiting a least–square method.
(ii) Compressive Sampling Matching Pursuit (CoSaMP): overcoming the main
limitations given by the sequential approach at the basis of OMP, CoSaMP
jointly refreshes all the non–null entries by refining at each iteration their
value in the direction of the minimum residual error.
(iii) Iterative Hard Thresholding (IHT): in its essence, IHT is similar to CoSaMP,
the main difference being related to the exploitation of a thresholding oper-
ator for the simultaneous update of the estimated set of signal coefficients.
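
As an illustration of the greedy strategy in (i), the following is a minimal textbook-style OMP sketch in NumPy. It is not the C++ code deployed on the sensor nodes: the function name `omp`, the test sizes and the random seed are assumptions, and `theta` stands for the product of the sensing matrix and the sparsity basis.

```python
import numpy as np


def omp(theta, y, k):
    """Recover a k-sparse vector c_hat such that y ~= theta @ c_hat."""
    n = theta.shape[1]
    col_norms = np.linalg.norm(theta, axis=0)
    support = []
    coeffs = np.zeros(0)
    residual = y.astype(float).copy()
    for _ in range(k):
        # Select the atom most correlated with the current residual
        scores = np.abs(theta.T @ residual) / col_norms
        if support:
            scores[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(scores)))
        # Least-squares update of all coefficients on the current support
        coeffs = np.linalg.lstsq(theta[:, support], y, rcond=None)[0]
        residual = y - theta[:, support] @ coeffs
    c_hat = np.zeros(n)
    c_hat[support] = coeffs
    return c_hat


rng = np.random.default_rng(1)
M, N, K = 128, 256, 5                      # illustrative sizes
theta = rng.standard_normal((M, N))        # stands in for A @ Psi
c_true = np.zeros(N)
c_true[[3, 17, 42, 100, 200]] = [1.5, -2.0, 1.0, 0.8, -1.2]
y = theta @ c_true
c_hat = omp(theta, y, K)
print(np.max(np.abs(c_hat - c_true)))
```

One atom is added per iteration, so the least-squares problem never exceeds k columns; this is one reason why the approach maps well onto memory-constrained devices.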

3 Experimental Validation
3.1 Materials and Methods
A simply supported steel beam depicted in Fig. 1 was instrumented with a
Smart Sensor Network (SSN) composed of six accelerometers connected in a
daisy–chain fashion and developed by the Intelligent Sensor System Labs of the
University of Bologna [8]. Among its most distinguishing features, each sensor

Fig. 1. Experimental setup

node integrates an STMicroelectronics STM32F303 microcontroller unit (MCU)
embedding a single–precision floating point unit. The sensing element consists of
an LSM6DSL inertial measurement unit enabling the simultaneous acquisition
of tri-axial accelerations and angular velocities. As proven in previous
work [9], the maximum synchronization time between the sensors is below
4.7 ms, a value compliant with the standard requirements for proper
modal shape reconstruction. Besides, the sensor positions were chosen so as
not to coincide with the nodal points, i.e. structural points where modal
shapes present a zero crossing, a condition which should guarantee the
correct reconstruction of all the sought modal parameters independently of
the specific sensing positions.
The structure is roughly L = 200 cm long, with a cross–section of 6 cm ×
1 cm. Considering the material properties discussed in [8], the three most ener-
getic frequencies of the structure were predicted to remain below 50 Hz; thus, a
sample rate of 200 Hz was deemed to be sufficient to assess with enough spectral
resolution the dynamics of the beam. The structure was subjected to ground
motion excitation and measurements were repeated ten times in order to obtain
sufficient insight over time. It is worth mentioning that the nature of the considered
excitation, which still represents a worst-case stimulus condition, is not expected
to influence the signal processing part itself. A frame size of N = 512 samples was
chosen, which in turn dictated the computational complexity at the recovery
side, together with the desired compression ratio CR = N/M, spanning the
interval [2, 10]. After a preliminary analysis, the sparsity level k, which is an
important input parameter for the recovery algorithms, was fixed at 10.
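
To give a feeling for the sizes involved, the sketch below computes the compressed length M for each CR and the storage needed by dense operators. It assumes single-precision (4-byte) entries and rounding of N/CR to the nearest integer; both are assumptions made here for illustration, not details stated for the deployed firmware.

```python
FRAME_SIZE_N = 512
BYTES_PER_FLOAT32 = 4  # assumption: operators stored in single precision


def compressed_length(n, cr):
    """Number of compressed samples M for a frame of n samples (rounded, by assumption)."""
    return round(n / cr)


def matrix_kib(rows, cols):
    """Storage of a dense rows x cols float32 matrix, in KiB."""
    return rows * cols * BYTES_PER_FLOAT32 / 1024


for cr in range(2, 11):
    m = compressed_length(FRAME_SIZE_N, cr)
    print(f"CR={cr:2d}  M={m:3d}  sensing matrix: {matrix_kib(m, FRAME_SIZE_N):6.1f} KiB")

print(f"sparsity basis: {matrix_kib(FRAME_SIZE_N, FRAME_SIZE_N):.1f} KiB")
```

Under these assumptions the N × N sparsity basis alone occupies 1 MiB, while the sensing matrix shrinks proportionally to 1/CR.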
The code implementing the decompression stage described in Sect. 2 was
programmed in the C++ language to be compatible with the digital signal pro-
cessing functionalities embedded in the sensor nodes’ MCU. To this end, all the
mathematical procedures and functions were purposely written without exploit-
ing the built–in operations in order to actually customise the signal processing
framework to the embedded resources of the network.
The three main metrics chosen to quantify the performance of each data
recovery method are: (i ) the memory footprint M , namely the total number of
initialised data and temporary variables required for the complete reconstruction
of a single set of coefficients ĉ, (ii ) the running time T , i.e. the time needed to
restore one single frame (which has been estimated by employing code functions

[Fig. 2 panels: curves for IHT, CoSaMP and OMP. Left, top to bottom: Memory size [KB], Exec time [ms] and ARSNR [dB] versus CR. Right: MTA [KB × ms] versus CR.]

Fig. 2. Cost analysis for the considered recovery algorithms. On the left-hand side, memory
occupancy, execution time and ARSNR are displayed from top to bottom. The
MTA factor combining the three quantities for each CR is displayed in the right
panel.

measuring the elapsed time in between the start and stop of the required processing),
and (iii ) the Average Reconstruction Signal–to–Noise Ratio (ARSNR),
which is computed off–line in a post–processing phase. The latter is conventionally
used to quantify the noise levels introduced by the CS processing operations
according to

$$\mathrm{ARSNR} = 20 \log \left( \frac{\|x\|_2}{\|x - \hat{x}\|_2} \right) \tag{3}$$

in which $\|\cdot\|_2$ stands for the $\ell_2$ norm of a vector. Finally, the
Memory–per–Time–over–Accuracy (MTA) factor

$$\mathrm{MTA} = \frac{M \cdot T}{e^{\mathrm{ARSNR}/20}} \tag{4}$$

was introduced with the primary goal of providing an overall evaluation: the
lower the MTA, the higher the recovery performance of the algorithm under test.
For the sake of clarity, ARSNR values were converted back to the linear
scale to account for the singular values implied by the logarithmic operator, i.e.
ARSNR = 0 or ARSNR < 0.
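
Equations (3) and (4) translate directly into code. The sketch below uses plain Python lists; the base-10 logarithm is an assumption (the usual decibel convention, since Eq. (3) does not state the base), as is the choice of averaging the per-frame values to obtain the ARSNR. The exponential in `mta` follows Eq. (4) as written.

```python
import math


def rsnr_db(x, x_hat):
    """Reconstruction SNR of one frame, Eq. (3), assuming a base-10 logarithm."""
    signal_norm = math.sqrt(sum(v * v for v in x))
    error_norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))
    # Note: a perfect reconstruction (error_norm == 0) is not handled here
    return 20.0 * math.log10(signal_norm / error_norm)


def arsnr_db(frames):
    """Average of the per-frame values over (x, x_hat) pairs (averaging rule assumed)."""
    values = [rsnr_db(x, x_hat) for x, x_hat in frames]
    return sum(values) / len(values)


def mta(memory_kb, time_ms, arsnr):
    """Memory-per-Time-over-Accuracy factor, Eq. (4): lower is better."""
    return memory_kb * time_ms / math.exp(arsnr / 20.0)


# Toy numbers, not measurements from the paper
frames = [([3.0, 4.0], [3.0, 3.0]), ([1.0, 0.0], [0.9, 0.0])]
score = arsnr_db(frames)
print(score, mta(20.0, 50.0, score))
```

Because the accuracy term sits in the denominator, an algorithm with a higher ARSNR is rewarded with a lower MTA for the same memory and time budget.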

3.2 Results

The obtained results are depicted in Fig. 2, where the panels on the left-hand
side refer, from top to bottom, to the memory occupancy, the mean execution
time and the ARSNR computed by averaging among the six accelerometers,
respectively. For the sake of clarity, the memory occupancy reported here only
accounts for the variables involved in the data recovery algorithms themselves.
Assuming that the CS operators are pre–loaded during the network start–up
configuration, it has been estimated that, in the worst cases associated with
limited compression scenarios (e.g. CR = 3), the IHT, CoSaMP and OMP solutions
may require a buffer size up to 1 MB due to the huge dimensions of the
sensing matrix and the sparsity basis. An example of a signal reconstructed
with the IHT algorithm is displayed in Fig. 3, which has been
processed with a fixed CR value equal to 4. As can be observed, the global
shape of the waveform is preserved, even if the magnitude of the retrieved
high–energy components is lowered.

Fig. 3. Example of IHT–reconstructed signal over 29 different signal frames.

The MTA product is displayed in the right chart to perform an overall cost
analysis. As can be observed, the OMP implementation largely outperforms
the other alternatives at all the levels of analysis; its MTA is at most half of
that associated with the IHT and CoSaMP implementations for all

the considered CRs. It is also worth mentioning that, although IHT appears
competitive in terms of memory size, its reconstruction accuracy is lower
and it requires a longer execution time. Furthermore,
the higher memory occupancy characterizing the CoSaMP algorithm
is consistent with the entailed algebraic procedures, given
the double dimension of the buffer this technique works on.

4 Conclusions
This work compares the effectiveness of three iterative data recovery algorithms
in the context of CS–based vibration diagnostics in view of hardware–oriented
implementations. The OMP, CoSaMP and IHT strategies were specifically investigated,
and their performance was quantified on the basis of memory occupancy,
processing time and reconstruction accuracy. Results deriving from in–field
data show that the OMP algorithm is a suitable candidate to compress
vibration signals by a considerable amount while preserving meaningful
information.

References
1. Géradin, M., Rixen, D.J.: Mechanical Vibrations: Theory and Application to Struc-
tural Dynamics. Wiley, Hoboken (2014)
2. Donoho, D.L.: Compressed sensing. IEEE Trans. Inf. Theory 52(4), 1289–1306
(2006)
3. Zou, Z., Bao, Y., Li, H., Spencer, B.F., Ou, J.: Embedding compressive sensing-
based data loss recovery algorithm into wireless smart sensors for structural health
monitoring. IEEE Sens. J. 15(2), 797–808 (2014)
4. O’Connor, S.M., Lynch, J.P., Gilbert, A.C.: Compressed sensing embedded in an
operational wireless sensor network to achieve energy efficiency in long-term moni-
toring applications. Smart Mater. Struct. 23(8), 085014 (2014)
5. Thadikemalla, V.S.G., Gandhi, A.S.: A data loss recovery technique using com-
pressive sensing for structural health monitoring applications. KSCE J. Civ. Eng.
22(12), 5084–5093 (2018)
6. Sun, H., Wang, Z., Xu, Y.: Research on sampling of vibration signals based on
compressed sensing. Vibroeng. Procedia 10, 459–463 (2016)
7. Rani, M., Dhok, S.B., Deshmukh, R.B.: A systematic review of compressive sensing:
concepts, implementations and applications. IEEE Access 6, 4875–4894 (2018)
8. Testoni, N., Aguzzi, C., Arditi, V., Zonzini, F., De Marchi, L., Marzani, A., Cinotti,
T.S.: A sensor network with embedded data processing and data-to-cloud capabili-
ties for vibration-based real-time SHM. J. Sens. (2018)
9. Zonzini, F., Malatesta, M.M., Bogomolov, D., Testoni, N., Marzani, A., De Marchi,
L.: Vibration-based SHM with up-scalable and low-cost Sensor Networks. IEEE
Trans. Instrum. Measur. (2020)
Electronics for Health and Assisted
Living
Automatic Generation of 3D Printable
Tactile Paintings for the Visually
Impaired

Francesco de Gioia(B) , Massimiliano Donati(B) , and Luca Fanucci(B)

Department of Information Engineering (DII), University of Pisa, Via Girolamo
Caruso, 16, 56122 Pisa, PI, Italy
francesco.degioia@phd.unipi.it
{massimiliano.donati,luca.fanucci}@unipi.it

Abstract. Traditional 3D scanning techniques can be used effectively
to replicate solid objects such as statues, to produce low-cost replicas
that visitors are allowed to touch. However, such techniques require
expensive equipment and precise 3D scanning, and they are unsuitable
for rendering 2D paintings as 3D models. In order to address these limitations,
we developed a scalable, low-cost solution to produce braille-like
readable 3D models of 2D paintings. We directly addressed the problem
of rendering the shades of color and brushstroke styles present in the
original painting as an embossed composable surface. We repurposed a
technique first used in 1634 by the Italian heraldist Padre Silvestro da
Pietrasanta to reproduce the colors of coats of arms with only lines and
information and its feasibility for modern 3D printing. As an extension
to the basic color conversion, we included smooth color transition and
textural content. With this work we aim at including more people with
visual disabilities in experiencing our vast cultural heritage.

Keywords: Tactile surfaces · Tactile exhibitions · 3D printing · Visual
disabilities

1 Introduction
People with visual disabilities have significant difficulties experiencing visual
art exhibitions. Recently, museums have been improving the accessibility of exhibits
for visually impaired visitors by placing painting descriptions in Braille, including
more complete commentary in their audio tour devices, organising events and
training sessions, preparing dedicated ‘tactile exhibitions’, etc.
In recent years, low-cost 3D printing has rapidly evolved from a tool for hobbyists
to a valid support for effective prototyping [1–3]. Nowadays, mature
solutions exist as off-the-shelf products that are relatively easy to use. Modern 3D printers
have been frequently used to produce safe-to-handle replicas of fragile artworks
and, more recently, such replicas have been included in tactile exhibitions
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 79–89, 2021.
https://doi.org/10.1007/978-3-030-66729-0_10

[4,5] specifically addressed to visually impaired people. In contrast to traditional
museum exhibitions, tactile exhibitions allow people with visual disabilities to
interact with the artworks and enjoy our shared artistic heritage [6].
Museums are constantly improving the accessibility of exhibits for visually
impaired visitors. For instance, the New York Smithsonian Museum, the Victoria
and Albert Museum in London, the Omero Tactile Museum of Ancona [7] and the
Art Institute of Chicago provide services and prepare events and exhibitions
specifically dedicated to blind people. Often, to realise tactile exhibitions,
3D scanners are used to create 3D models that can then be printed to obtain
safe-to-handle replicas of artworks.
Although 3D scanning and 3D printing can be used effectively to replicate
statues and other solid artworks intended to be touched by blind people, the
same techniques cannot be used to produce 3D models of the scenes represented
in paintings. Specifically, traditional 3D scanning has strong limitations
when used for 3D artwork replication: for example, colour and texture cannot be
directly reproduced, markers need to be applied for mobile scanners, and stationary
scanners have limitations on the size of the object they can scan. Due to
these limitations, 3D replicas are often realized by human artists through laborious
manual work. For most museums, the high cost of these 3D models
imposes a stringent limitation on their larger-scale application. However, emerging
computer-aided software capable of automating the replication process and
the diffusion of 3D printing enable the production of low-cost 3D models.
In this paper we present a novel technique to perform the conversion from
a 2D painting to a 3D model. We designed our solution to produce easily composable,
3D printable embossed tiles that provide tactile feedback for visually
impaired people. Although we developed this procedure mainly targeting 3D
printing, we kept the approach general enough to be applicable to other types
of materials (e.g. metal, paper, textile, etc.).
The remainder of the paper is structured as follows: Sect. 2 reviews previous
works, Sect. 3 describes the proposed method and Sect. 4 presents the results.
Finally, conclusions and future works are discussed in Sect. 5.

2 Previous Works

Recent development of new emerging technologies, cheaper and feature-richer
platforms for rapid prototyping and increased system interconnectivity accelerated
the exploration of new ideas to help people with disabilities. Compelling
solutions assist visually impaired people in navigating unfamiliar environments,
detecting obstacles and recognizing people [8].
In [9] a smart guidance system for blind people is presented. The system
consists of a white cane equipped with a device able to read RFID tags and
perform simple color detection. Different color lines are assigned to each route
and RFID tags are used to provide further information for navigating the envi-
ronment. Similarly, in [10] the proposed low-cost object detection system is able
to warn the user of the presence of obstacles and can provide support for indoor
navigation.

More specifically, the problem of developing systems that allow visually
impaired users to better experience art exhibitions has been addressed in [11],
where the authors propose the use of 3D scanning and 3D printing to provide
duplicates that are allowed to be touched. In their paper, the authors describe the
technical difficulties and costs incurred in preparing such “tactile exhibitions”;
nevertheless, they report positive feedback from users.
Traditional 3D scanning is feasible for solid artworks, but it cannot be
used to scan scenes represented in paintings. Thus, in order to reproduce 2D
images as 3D objects, new techniques must be adopted. One such technique,
called “Relief Generation”, aims at rendering paintings as high reliefs or bas-reliefs
by performing 3D geometry reconstruction of the scene through the analysis of its
illumination.
Although various relief generation techniques can be used to render 2D images
as 3D relief models [12], passive tactile information may not be sufficient for the
user to compose a mental representation of the painting. To specifically address
this issue, in [13] and [14] the authors propose a computer-vision haptic system
that includes complementary audio descriptions.
In [14] authors present a manual approach to tactile bas-relief generation.
Although the final 3D output has detailed and smooth results, the overall gen-
eration process is mostly manual and time consuming, making it a high-cost
solution for large-scale exhibitions.
The development of 3D printing, as an enabling technology for low-cost
touchable 3D model replication, can effectively assist art educators explaining
artworks to visually impaired students [15]. Existing solutions for generating 3D
models of paintings allow students to experience the theme and geometric composition
of artworks, but are unable to deliver important details such as color shades,
textures and brushstroke styles.
In this paper we further develop the idea of converting 2D images into 3D
models, but instead of generating bas-reliefs that render the geometric structure
of the image, we generate 3D readable braille-like surfaces that render the color and
texture information of the image. The advantage of our approach is twofold: first, it does not
assume correct lighting and perspective for geometric scene reconstruction,
an assumption that may not hold for many abstract art styles; second, it allows the user
to elaborate a personal mental representation of the painting from the tactile
feedback obtained as a direct translation of its visual content.

3 Methods

In this paper, we propose a multi-step procedure to generate a texture-based
heightmap from a digital picture of a painting. The heightmap produced can
be converted to a 3D planar mesh and used as a model for a 3D printer. Once
printed, the 3D model is an interpretable representation of the original picture
of the painting. The visual content of the original image is rendered as a combi-
nation of textures and patterns that can be read from the embossed support in
a way similar to braille text.

Our procedure expects as input an RGB color image in 24-bit format. We start
the processing by subdividing the image into fixed-size blocks, then for each
block in the image we compute the mean of each color channel. We introduced
this step as a means to remove spurious pixels arising from the subsequent color
segmentation step. By subdividing the image into blocks of uniform color, we
obtain more uniform texture patches that are easier to read.
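
A minimal NumPy sketch of this block-averaging step is given below. The function name `block_mean` is hypothetical, and cropping the image to a multiple of the block size is an assumption, since the handling of partial blocks at the borders is not specified.

```python
import numpy as np


def block_mean(image, block):
    """Replace every block x block tile of an H x W x 3 image by its per-channel mean."""
    h, w, channels = image.shape
    # Assumption: rows/columns that do not fill a whole block are cropped
    h_crop, w_crop = h - h % block, w - w % block
    tiles = image[:h_crop, :w_crop].astype(float).reshape(
        h_crop // block, block, w_crop // block, block, channels
    )
    means = tiles.mean(axis=(1, 3))  # one mean color per tile
    # Expand each mean back to a full tile so the output keeps the image size
    return np.repeat(np.repeat(means, block, axis=0), block, axis=1)


# Tiny example: a 4 x 4 image with a non-uniform top-left 2 x 2 block
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[0, 1] = 4
img[1, 0] = 8
img[1, 1] = 12
flat = block_mean(img, 2)
print(flat[0, 0], flat[2, 2])
```

The output has the same size as the cropped input, so later pixel-wise stages can run on it unchanged.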
The next step in our processing pipeline is to convert the input color image
into a texture image. We perform this conversion in two phases: in the first
phase we remap the original colorspace to a limited palette of seven colors; in
the second phase we convert the palette into the corresponding textural patterns.
We select the palette and the textural patterns based on the findings reported
in [16]. The seven colors, named Or (gold/yellow), Argent (silver/white), Gules
(red), Sable (black), Azure (cyan), Vert (green) and Purpure (purple), each have a
corresponding texture that is easy to identify. In heraldry, the present conventional
hatching system defines Or as represented by hatched points, Argent as plain,
Gules by vertical lines, Azure by horizontal lines, Vert by diagonal lines from right to
left, Purpure by diagonal lines from left to right, and Sable by horizontal and
vertical lines intersecting at 90°. This technique was first used in 1634 by Padre
Silvestro da Pietrasanta, an Italian heraldist [17], to render the colors of coats of
arms with lines and dots. We repurpose this technique in our application, mainly
leveraging its simplicity and effectiveness to encode color information.
To generate the texture patterns used for color coding, we define a dictionary
of seven functions reported in Table 1, and displayed graphically in Fig. 1. The
texture resolution, i.e. the spacing between lines and the spacing between dots,
can be modified by the tunable function parameter τ .

Table 1. Table of texture pattern generator functions.

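Argent    $f_\tau(x, y) = 1$
Or        $f_\tau(x, y) = \cos\left(x \frac{2\pi}{\tau}\right) \cos\left(y \frac{2\pi}{\tau}\right)$
Gules     $f_\tau(x, y) = \cos\left(x \frac{2\pi}{\tau}\right)$
Azure     $f_\tau(x, y) = \cos\left(y \frac{2\pi}{\tau}\right)$
Vert      $f_\tau(x, y) = \cos\left((x + y) \frac{2\pi}{\tau}\right)$
Purpure   $f_\tau(x, y) = \cos\left((1 - x + y) \frac{2\pi}{\tau}\right)$
Sable     $f_\tau(x, y) = \max\left(\cos\left(x \frac{2\pi}{\tau} + \frac{\pi}{2}\right), \cos\left(y \frac{2\pi}{\tau} + \frac{\pi}{2}\right)\right)$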
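
The generator functions of Table 1 can be written as a small dictionary of NumPy callables. The sketch below evaluates them on a pixel grid; the use of integer pixel coordinates for x and y, the `render` helper and the chosen tile size are assumptions, and the mapping from function values to emboss heights is left out.

```python
import numpy as np


def _phase(v, tau):
    """Common argument v * 2*pi / tau used by all patterns."""
    return 2.0 * np.pi * v / tau


# One generator per heraldic color, following Table 1
TEXTURES = {
    "Argent": lambda x, y, tau: np.ones_like(x, dtype=float),
    "Or": lambda x, y, tau: np.cos(_phase(x, tau)) * np.cos(_phase(y, tau)),
    "Gules": lambda x, y, tau: np.cos(_phase(x, tau)),
    "Azure": lambda x, y, tau: np.cos(_phase(y, tau)),
    "Vert": lambda x, y, tau: np.cos(_phase(x + y, tau)),
    "Purpure": lambda x, y, tau: np.cos(_phase(1 - x + y, tau)),
    "Sable": lambda x, y, tau: np.maximum(
        np.cos(_phase(x, tau) + np.pi / 2), np.cos(_phase(y, tau) + np.pi / 2)
    ),
}


def render(name, size, tau):
    """Evaluate one pattern on a size x size pixel grid (x along columns, y along rows)."""
    y, x = np.mgrid[0:size, 0:size].astype(float)
    return TEXTURES[name](x, y, tau)


tile = render("Or", 16, 4.0)
print(tile.shape, float(tile.min()), float(tile.max()))
```

A larger τ yields coarser lines and dots, which is the knob used to adapt the texture resolution to the printer and to tactile readability.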

Simple color segmentation, as previously described, can be improved by introducing
smooth transitions between colors. In this paper we used alpha blending
between two texture patterns to create more complex patterns from the limited
dictionary of seven texture patterns (see Fig. 2).
In order to improve the readability of the output texture, we perform alpha
blending as a convex combination between only two base colors. Then, the best
combination of base colors and alpha factor is found to give the closest approx-
imation of the original color. Finally, the same alpha factor is used to blend the

Fig. 1. Dictionary of seven texture patterns.

Fig. 2. Texture pattern alpha blending by increasing value of α.

two texture patterns corresponding to the base colors. Formally, we define the
error between the original color and its approximation as the l2-norm of their
distance vector,
errij = x − αij xi − (1 − αij )xj 2 (1)
where errij is the error value, x is the original color, alphaij is the blending
factor and xi and xj are the base colors in some colorspace. For each pair (xi ,
xj ) of base colors in the dictionary, we find the αij that minimizes the error
value by solving:
84 F. de Gioia et al.

2
min 1
2 x − αij xi − (1 − αij )xj 2
αij (2)
s.t. 0 ≤ αij ≤ 1
Finally, we find the (i, j) pair of base colors that gives the minimum error value
and use the corresponding α_ij value to mix the two base texture patterns. With
this approach, the blend of two base texture patterns may coincide with a third
base texture pattern. To avoid this ambiguity, we combine the texture of the
primary (higher alpha) color with a modified texture of the secondary (lower
alpha) color, obtained by doubling its τ parameter.
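
Because problem (2) has a single scalar unknown, it admits a closed-form solution: project x - x_j onto x_i - x_j and clip the result to [0, 1]. The sketch below illustrates this under stated assumptions: the function names and the example palette are hypothetical, and the text does not say which solver the authors actually used.

```python
from itertools import combinations

def best_alpha(x, xi, xj):
    """Minimize ||x - a*xi - (1 - a)*xj||_2 over a in [0, 1] in closed form."""
    d = [p - q for p, q in zip(xi, xj)]   # xi - xj
    r = [p - q for p, q in zip(x, xj)]    # x - xj
    dd = sum(v * v for v in d)
    if dd == 0.0:                         # identical base colors: any a works
        a = 1.0
    else:
        a = sum(u * v for u, v in zip(r, d)) / dd
        a = min(1.0, max(0.0, a))         # enforce the constraint 0 <= a <= 1
    err = sum((rv - a * dv) ** 2 for rv, dv in zip(r, d)) ** 0.5
    return a, err

def best_pair(x, palette):
    """Try every pair of base colors; return (err, name_i, name_j, alpha)."""
    best = None
    for (ni, xi), (nj, xj) in combinations(palette.items(), 2):
        a, err = best_alpha(x, xi, xj)
        if best is None or err < best[0]:
            best = (err, ni, nj, a)
    return best
```

For example, with an illustrative palette of red (1, 0, 0), blue (0, 0, 1) and white (1, 1, 1), the color (0.5, 0, 0.5) is matched exactly by the red/blue pair with a blending factor of 0.5.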
To complete the image representation, we include the gradient of the image as
a final layer. This layer directly addresses the problem of rendering the
artist's brushstrokes and linework as a 3D embossing that preserves the
original content. The image gradient layer is computed with a Sobel edge
detection filter and rescaled by a constant factor so that the output lies in
[0, 1]. By using a Sobel filter, we preserve the relative intensity of edges
as a means to reproduce softer or heavier brushstrokes.
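
A minimal sketch of such an edge layer follows, assuming a grayscale input with values in [0, 1] stored as a list of rows. The function name, the border handling and the specific constant 4*sqrt(2) are assumptions made for illustration; the text only states that a constant rescaling factor is used.

```python
import math

def sobel_layer(img):
    """Sobel gradient magnitude of a 2D list with values in [0, 1].

    Border pixels are left at 0 (an assumption). Each kernel response is
    bounded by 4 in absolute value for inputs in [0, 1], so dividing the
    magnitude by 4*sqrt(2) keeps the output inside [0, 1].
    """
    h, w = len(img), len(img[0])
    scale = 1.0 / (4.0 * math.sqrt(2.0))
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal derivative: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical derivative: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = math.hypot(gx, gy) * scale
    return out
```

Because the scaling factor is constant rather than image-dependent, a faint edge stays faint relative to a strong one, which is the property the text relies on to distinguish softer from heavier brushstrokes.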
The seven texture layers and the edge layer can be merged into a single
grayscale image, adjusted to a printable 3D mesh and exported as a 3D model.
We note that, for some manufacturing processes, it may be more convenient
to keep the layers separate.
The overall processing pipeline with output examples for each processing
stage is displayed in Fig. 3.

Fig. 3. Processing pipeline.



4 Results
We tested our method on various paintings with different artistic styles and
obtained well-defined 3D meshes. A complete painting rendering is displayed in
Fig. 4.

Fig. 4. Texture Heightmap for Garden at Sainte-Adresse by Claude Monet, 1867.



In Fig. 5 and Fig. 6 two examples of 3D meshes generated with Blender 2.79b
are displayed. The generated 3D meshes can be 3D printed in tiles of fixed size
and used to compose a larger surface. Dividing the original painting into
composable tiles is sometimes necessary, since most 3D printers can only print
objects with limited dimensions. Tiling also brings a benefit, since better 3D
surfaces for high-resolution images can be obtained by composing multiple tiles.

Fig. 5. From left to right: original input image, texture heightmap and 3D tile of
The Healing of the Cripple and the Raising of Tabitha (detail), 1426–1427, Cappella
Brancacci, Santa Maria del Carmine, Florence.

Fig. 6. Detail of Garden at Sainte-Adresse by Claude Monet with texture
heightmap and 3D tile.

In all the images the colors are rendered well as 3D texture patterns, and
the overlaid edge layer allows image contours and brushstrokes to be rendered
with good accuracy. We are able to provide color transitions as mixtures of base
texture patterns without limiting the readability of the 3D surface. Optionally,
to further improve the readability of the surface, the continuous alpha blending
step can be converted into a quantized alpha blending with a finite set of
transition levels.
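
One possible way to quantize the blending factor is to snap it to the nearest of a fixed number of evenly spaced levels. The sketch below assumes evenly spaced levels and a hypothetical function name; the text does not specify how the levels are chosen.

```python
def quantize_alpha(alpha, levels):
    """Snap alpha in [0, 1] to the nearest of `levels` evenly spaced values."""
    if levels < 2:
        raise ValueError("need at least two levels")
    step = 1.0 / (levels - 1)           # distance between adjacent levels
    return round(alpha / step) * step   # nearest level, still inside [0, 1]
```

With five levels, for instance, the allowed factors are 0, 0.25, 0.5, 0.75 and 1.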
Since we developed our method mainly targeting 3D printing, we designed
our processing pipeline to take into account the 3D printer resolution in two
ways: through the τ parameter (higher values of τ give a larger spacing between
pattern elements, i.e. dots and lines), and by dividing the original input image
into tiles of arbitrary size. Thus, by changing these two parameters users can
adapt the final 3D mesh to the appropriate resolution of the 3D printer.
We designed our processing pipeline to be fully automated, with a default
color palette. However, we note that, for some images, it may be necessary for
the user to define a custom color palette as a means to better highlight specific
colors. Our approach supports such a change of the color palette in the color
segmentation step.

5 Conclusions

In this paper we presented a novel image processing pipeline to convert color
pictures of paintings into 3D printable embossed surfaces for tactile feedback.
This approach has been specifically developed to address the limitations faced
by visually impaired people in fully experiencing art exhibitions. We considered
an automated, scalable, cost-effective process to support museums in offering
new services for people with disabilities. In our image processing pipeline we
repurposed the hatching schemes used in heraldry for generating interpretable
textures, and introduced a dithering-like technique to encode smooth color
transitions. In future work, we will report the feedback from final and
intermediate users. We also expect to compare methods for creating uniform color
patches for easier composition, and to develop similar approaches for other media
such as paper, metal and textile.

References
1. Dumond, D., Glassner, S., Holmes, A., Petty, D.C., Awiszus, T., Bicks, W.,
Monagle, R.: Pay it forward: getting 3D printers into schools. In: 2014 IEEE Inte-
grated STEM Education Conference, pp. 1–5 (2014)
2. Kim, S., Kim, H.: The development of adjustable 3D printer module. In: 2018 Inter-
national Conference on Electronics, Information, and Communication (ICEIC),
pp. 1–2 (2018)
3. Rajamanickam, P., Mulla, R.Y.: Cloud based 3D printer. In: 2017 International
Conference on Information, Communication, Instrumentation and Control (ICI-
CIC), pp. 1–4 (2017)
Automatic Generation of 3D Printable Tactile Paintings 89

4. Comes, R.: Haptic devices and tactile experiences in museum exhibitions. J.


Ancient Hist. Archaeol. 3, 12 (2016)
5. The tactile tour of the modern art gallery. https://www.uffizi.it/en/special-visits/
the-tactile-tour-of-the-modern-art-gallery. Accessed 01 Oct 2020
6. Best practice in making museums more accessible to visually impaired visitors.
https://www.museumnext.com/article/making-museums-accessible-to-visually-
impaired-visitors/. Accessed 01 Oct 2020
7. Museo Omero, Ancona. http://www.museoomero.it/. Accessed 26 Sept 2020
8. Joe Louis Paul, I., Sasirekha, S., Mohanavalli, S., Jayashree, C., Moohana Priya,
P., Monika, K.: Smart eye for visually impaired-an aid to help the blind people.
In: 2019 International Conference on Computational Intelligence in Data Science
(ICCIDS), pp. 1–5 (2019)
9. Fukasawa, A., Magatani, K.: A navigation system for the visually impaired an
intelligent white cane. In: Conference Proceedings: Annual International Confer-
ence of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering
in Medicine and Biology Society. Conference, vol. 2012, pp. 4760–4763, August
2012
10. Froneman, T., van den Heever, D., Dellimore, K.: Development of a wearable sup-
port system to aid the visually impaired in independent mobilization and naviga-
tion. In: 2017 39th Annual International Conference of the IEEE Engineering in
Medicine and Biology Society (EMBC), pp. 783–786 (2017)
11. Montusiewicz, J.: Technical aspects of museum exposition for visually impaired
preparation using modern 3D technologies. In: 2018 IEEE Global Engineering Edu-
cation Conference (EDUCON), pp. 768–773 (2018)
12. Wang, M., Chang, J., Zhang, J.J.: A review of digital relief generation techniques.
In: 2010 2nd International Conference on Computer Engineering and Technology,
vol. 4, pp. V4–198–V4–202 (2010)
13. Buonamici, F., Furferi, R., Governi, L., Volpe, Y.: Making blind people autonomous
in the exploration of tactile models: a feasibility study. In: Universal Access in
Human-Computer Interaction. Access to Interaction, pp. 82–93, August 2015
14. Governi, L., Furferi, R., Volpe, Y., Puggelli, L., Vanni, N.: Tactile exploration of
paintings: an interactive procedure for the reconstruction of 2.5D models. In: 22nd
Mediterranean Conference on Control and Automation, pp. 14–19 (2014)
15. Chen, Y., Chang, P.: 3D printing assisted in art education: study on the effective-
ness of visually impaired students in space learning. In: 2018 IEEE International
Conference on Applied System Invention (ICASI), pp. 803–806 (2018)
16. Brunot, W., Brunot, G.: Colors without sight: a method for differentiating colors
in braille. In: ETC: A Review of General Semantics, vol. 43, no. 4, pp. 332–335
(1986)
17. Pietrasanta Silvestro, P.P.R., Galle, C., Moretus, B.: De symbolis heroicis libri IX.
Antuerpiae: Ex officina Plantiniana Balthasaris Moreti (1634)
Validation of Soft Real-Time in Remote
ECG Analysis

Miltos D. Grammatikakis, Anastasios Koumarelis, and Efstratios Ntallaris

Hellenic Mediterranean University, 71410 Heraklion, Greece
{mdgramma,tkoumarelis,entallaris}@cs.hmu.gr

Abstract. We present a distributed embedded e-Health platform targeting
real-time remote monitoring, analysis and visualization of electrocardiogram signals
obtained from a wearable pulse sensor. Experimental results using the STMicro
BodyGateway biosensor operating at 128 or 256 pulses/s show that we can support
signal acquisition, analysis and visualization in soft real-time, using a low-cost
Odroid XU3/4 board as server. Validation of real-time analysis is based on a novel
shared memory timing infrastructure that allows multiple collaborating processes
to share relevant performance metrics.

1 Introduction
Heart-related disorders refer to sporadic changes of a patient’s ECG signal characteristics
and account for 1 in 3 deaths in the US. They include atrial fibrillation, related to stroke
risk, and ventricular arrhythmia, associated with cardiac arrest [1]. According to the
ANSI/AAMI EC13 standard, these disorders involve unusual perturbations of distances
between R-R peaks in consecutive ECG pulses and are usually assessed with offline ECG
analysis based on a 72-h ambulatory Holter recording, which has a low diagnosis rate.
In this context, we consider a pervasive in-hospital use case, where a typical hospital
server logs and analyzes patient data from a biosensor in real-time. In our experimental
framework, we use a typical pulse sensor with an ARM microcontroller supporting
an RTOS (ST Micro BodyGateway, or BGW) connected to a low-cost embedded single
board computer acting as a Server.
Our framework running on the Server extends open source medical decision support
software (Harvard Physionet WFDB, OSEA and WAVE packages) towards soft real-time
ECG monitoring, analysis of non-fatal arrhythmias, and visualization. In contrast,
most industrial products, such as AliveCor [2], BodyGuardian [3], LifeMonitor [4],
NowCardio [5] and PhysioMem [6], target real-time monitoring on mobile pulse devices,
while ECG analysis is performed offline by a physician, i.e., after the ECG signal from a
patient-worn biometric device is transferred to a data center. In relation to real-time
analysis, Apple Watch Series 5 supports analysis of a very short ECG waveform, detecting
signs of arrhythmia, specifically atrial fibrillation [7]. However, unlike our solution, this
product does not support a medical-grade pulse sensor device, is not server-based, and
is not able to detect ventricular fibrillation.
Moreover, in order to validate the real-time behavior of our application, we have
developed a novel POSIX shared memory infrastructure for sharing timing metrics among
processes, such as one-way delay (OWD), processing latency, bandwidth rate, and packet
loss. Timings can be accurate to the nanosecond in high-speed networks. Using this
framework, we have been able to validate soft real-time in our remote ECG arrhythmia
analysis application when the BGW pulse sensor device operates at 128 or 256 pulses/s.
In Sect. 2, we explain our soft real-time application, including the BGW device driver,
and the ECG_Server, ECG_Consumer and ECG_Animator processes. Section 3
focuses on the experimental framework and validation of real-time operation. Finally,
Sect. 4 provides a short summary and discusses future work.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 90–96, 2021.
https://doi.org/10.1007/978-3-030-66729-0_11

2 Soft Real-Time ECG Analysis and Timing Infrastructure at Server

We have written a multithreaded Linux driver to operate the BGW device. The first
thread requests a biosignal, e.g., ECG at 128 or 256 pulses/s, the second one receives
raw data from the BGW device via a BT 3.0 interface and extracts data to a shared list,
while the third one transmits data from the list to the Server over Ethernet.
Then, our distributed embedded soft real-time application running on the Server invokes
two open source software libraries (~35K lines of code): a) WFDB (WaveForm
DataBase from Harvard Physionet [8]) to standardize the ECG signal transmitted by
the BGW driver to 200 samples/s according to EC-13, and b) OSEA (Open Source
ECG Analysis) to perform low- and high-pass QRS filtering (via Easytest) for heartbeat
detection and classification into normal or abnormal beats [9–11]. To manage ECG
annotation in soft real-time, we avoid excessive re-computation by applying a training
signal on the latest data, extending Easytest functionality without affecting
predictivity; our framework theoretically achieves a positive predictivity close to 99.8%
when using the MIT/BIH and AHA arrhythmia databases. Alternative ECG analysis methods
offer lower predictivity rates [12]; however, deep learning techniques are promising, see an
extensive study by Preventice using the BodyGuardian sensor (BGW successor) [13].
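
To make the rate standardization step concrete, the sketch below converts a 128 or 256 samples/s signal to 200 samples/s by linear interpolation. This is only an illustration of the idea: it is not the WFDB resampling code, the function name is hypothetical, and integer sampling rates are assumed.

```python
def resample_linear(samples, src_rate, dst_rate=200):
    """Resample a signal from src_rate to dst_rate by linear interpolation.

    Assumes integer rates so that positions can be computed exactly.
    """
    if len(samples) < 2:
        return [float(s) for s in samples]
    last = len(samples) - 1
    n_out = (last * dst_rate) // src_rate + 1   # outputs covering the same time span
    out = []
    for k in range(n_out):
        # Output sample k falls at source index k * src_rate / dst_rate.
        i, rem = divmod(k * src_rate, dst_rate)
        if i >= last:
            out.append(float(samples[last]))
        else:
            frac = rem / dst_rate
            out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
    return out
```

One second of data at 128 samples/s (129 points including both endpoints) becomes 201 points at 200 samples/s.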
Finally, annotated ECG waveforms are displayed using WAVE, which supports asyn-
chronous display of annotations in soft real-time (i.e., without noticeable delays). WAVE
is a fast, easy-to-use graphics library based on a 32-bit XView open source toolkit (a
low-level XWindows client).
For managing real-time, interactive applications, such as low-bandwidth ECG signal
transmission or high-bandwidth video streaming, we must share timing metrics across
different processes using a dynamic shared memory allocator. Hence, we have prototyped
a novel, non-intrusive, thread-safe timing infrastructure based on POSIX shared memory
that allows multiple collaborating processes to share latency, throughput, and packet loss
metrics. Our infrastructure allows composition of common shared memory names (in
shm_open calls) on a case-by-case basis, e.g. using macid, pid (process id), and/or
IP address info. Thus, as illustrated in Fig. 1, it can be adapted to enable automated
monitoring in our soft real-time ECG application using simple atomic shared memory
read/write operations, e.g. for transferring sensitive timing info, such as server receive
process timestamps and OWD.
As shown in Fig. 1, the ECG_Server process connected to the BGW_driver
of Device_N writes ECG data to file F (/tmp/ECG_macid_pid), where macid
identifies the MAC address of the BGW, and pid is the ECG_Server process id. If
the ECG_Server process receives new ECG data to be written to file F and this
file is empty (meaning that the ECG_Animator has just processed the samples),
a triplet is written to a POSIX shared memory component (called M1, controlled by
lock server_lock_pid). This triplet consists of a) the Server’s current time (timestamp
in ns), b) the macid of the BGW device (since we support multiple
pulse sensor devices from the same IP address), and c) the current OWD between
Device_N and Server (in µs). The OWD value, representing the time interval from
departure of the first bit from Device_N to arrival of the last bit at the Server, is
written by OWD_Server to a shared memory component (called M2, controlled by lock
owd_lock_ip), and can thus be obtained by the ECG_Server. During animation,
ECG_Consumer and ECG_Animator analyze new samples in file F to obtain
annotations, and validate real-time behavior using timing info in M1. Finally, another file
W (/tmp/.WAVE_macid_wpid) is used by ECG_Animator (via the wave-remote
script) to map ECG data and annotations (identified by macid) to a WAVE process
identified by wpid. (This file is omitted from Fig. 1.)

Fig. 1. Shared memory timing infrastructure for monitoring latency.
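
The M1 triplet can be pictured as a small fixed-size record in a named shared memory segment. The sketch below uses Python's multiprocessing.shared_memory module, which is backed by POSIX shared memory on Linux. The record layout, the segment naming scheme and all function names are assumptions made for illustration, and the lock that guards M1 in the text (server_lock_pid) is omitted for brevity, so this is not safe for concurrent writers as written.

```python
import struct
import time
from multiprocessing import shared_memory

# Hypothetical M1 record: 64-bit timestamp in ns, 6-byte MAC address,
# 32-bit OWD in microseconds (little-endian, no padding).
TRIPLET = struct.Struct("<Q6sI")

def shm_name(macid, pid):
    """Compose a segment name from macid and pid (naming scheme is assumed)."""
    return "ecg_m1_{}_{}".format(macid.hex(), pid)

def write_triplet(buf, macid, owd_us, now_ns=None):
    """Store (timestamp, macid, OWD) at the start of a writable buffer."""
    if now_ns is None:
        now_ns = time.time_ns()
    TRIPLET.pack_into(buf, 0, now_ns, macid, owd_us)

def read_triplet(buf):
    """Return the (timestamp_ns, macid, owd_us) tuple stored in the buffer."""
    return TRIPLET.unpack_from(buf, 0)

def demo(pid=1234):
    """Create a segment, write one triplet, read it back, then clean up."""
    mac = bytes.fromhex("001122334455")
    shm = shared_memory.SharedMemory(name=shm_name(mac, pid),
                                     create=True, size=TRIPLET.size)
    try:
        write_triplet(shm.buf, mac, 366)
        return read_triplet(shm.buf)
    finally:
        shm.close()
        shm.unlink()
```

A second process would attach to the same segment by name and call read_triplet on its buffer. Keeping the record small and fixed in size is what makes a single read or write under a lock cheap.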
Due to clock drifts, OWD is a key metric for evaluating our real-time application.
Our computation extends Choi and Yoo’s ping-pong algorithm [14] to compute OWD
during runtime (in parallel with RTT) by simplifying network calculus equations to avoid
re-computation; this equation rewriting is omitted due to space restrictions.

3 Experimental Framework, Testbenches and Results


Our use case involves validating soft real-time ECG analysis using a single BGW device
operating either at 128 or 256 pulses/sec. The BGW device transmits ECG data via
Bluetooth to an Odroid XU4. The XU4 device uses a) the Bluetooth-to-WiFi BGW Driver
to transfer ECG signal data, and b) an OWD_Client to transmit one-way delay
computations to the Server (Odroid XU3), via a TP-Link Archer C5400 router (2.1 Gbit/sec).
Using our timing infrastructure, we collect measurements each time new ECG data is
uploaded to the WAVE visualization tool (via wave-remote). These include the current
time, the number of ECG samples processed, the total time elapsed since the last server
file write (when the ECG file was previously empty), the one-way delay, as well as the
total analysis time and its distribution among the different subprocesses.

Fig. 2. Average pulse rate for real-time analysis/visualization from one BGW device operating
at a) 128 and b) 256 pulses/sec. The rate is obtained from the timestamp and the number of samples.

Figure 2 examines the average rate. Based on the number of samples and the
time interval in which the data is processed and loaded via wave-remote to WAVE, we
observe that we can sustain soft real-time with the BGW device operating at 128 and
256 pulses/sec. Focusing on instant variation instead of the average, Fig. 3 shows a large
fluctuation around the average value. This is due to kernel interrupts, which reduce the
transmission rate from BGW_driver and/or the receive rate at the ECG_Server, or the
processing rate of the ECG_Animator; the delayed data is received/processed in a
subsequent time interval. The distribution for 256 pulses/sec is omitted, but its
fluctuation is larger.
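
The difference between the average and the instant rate can be illustrated with a few lines of code. The sketch assumes that each measurement is a (timestamp in seconds, number of samples) pair and that the samples of a record were processed since the previous record; the function name and the data are hypothetical.

```python
def rates(records):
    """Compute (average_rate, instant_rates) from (timestamp_s, n_samples) records.

    Assumption: the samples counted in a record were processed in the
    interval between the previous record's timestamp and its own.
    """
    instant = []
    for (t0, _), (t1, n1) in zip(records, records[1:]):
        instant.append(n1 / (t1 - t0))
    total = sum(n for _, n in records[1:])
    average = total / (records[-1][0] - records[0][0])
    return average, instant
```

If a batch of 64 samples arrives after 0.25 s and the next one only 0.75 s later, the instant rate swings between 256 and about 85 samples/s, while the average over a longer window can still match the nominal 128 samples/s. This is the kind of behavior a delayed batch produces.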

Fig. 3. Instant pulse rate for real-time analysis/visualization from one BGW device operating at
128 pulses/sec. The rate is obtained from the timestamp and the number of samples.

Figure 4 shows a) the SRV_to_ANIM delay (time interval between consecutive
writes to the ECG file from ECG_Server until they are all processed by ECG_Animator),
and b) the distribution of the ECG_Animation delay. The average total processing
delay (SRV_to_ANIM) of 0.298 s breaks down as follows: 22.86% from ECG_Server
alone (receiving and writing data to file), 23.49% from wrsamp, used mainly for
conversion to the EC-13 standard, 20.27% from easytest, used for heartbeat detection
and classification via filtering, and 31.58% from wrann/rdann, used for
writing/reading to/from annotation files related to the latest data. Visualization via
wave-remote takes only 1.67%, while our shared memory constructs are non-intrusive,
with an overhead of ~0.1%. The graph for a BGW at 256 pulses/sec is omitted; in that
case ECG_Server, wrsamp, easytest, wrann/rdann, wave-remote, and shared memory
account for 66.77%, 10.38%, 8.44%, 13.62%, 0.72%, and less than 0.1%, respectively.
Note that ECG_Server processes twice as many samples at this rate.
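
The percentages reported for 128 pulses/sec can be turned into absolute times by multiplying them with the 0.298 s average delay. The short sketch below does this arithmetic; the variable and function names are illustrative only.

```python
TOTAL_DELAY_S = 0.298  # average SRV_to_ANIM delay at 128 pulses/sec (from the text)

SHARES_128 = {         # percentage contributions reported in the text
    "ECG_Server": 22.86,
    "wrsamp": 23.49,
    "easytest": 20.27,
    "wrann/rdann": 31.58,
    "wave-remote": 1.67,
}

def breakdown_ms(total_s, shares_pct):
    """Convert percentage shares of a total delay into milliseconds."""
    return {name: total_s * pct / 100.0 * 1000.0
            for name, pct in shares_pct.items()}
```

Under these figures, ECG_Server accounts for roughly 68 ms and wrann/rdann for roughly 94 ms of the 298 ms total, while visualization stays at about 5 ms.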

Fig. 4. Server to animation delay, and distribution of animation delays (BGW at 128 pulses/sec)

Figure 5 shows the distribution of one-way delay (OWD) at the time our ECG_Server
writes data to the file F for processing by the ECG_Animator. We observe that the
OWD varies significantly, from 8.67% below to 46.03% above the average of
366.3 µs, thus forming the most critical component after ECG_Server. The OWD
for a BGW operating at 256 pulses/sec ranges from 25.3% below to 39.77% above
the average of 416.8 µs. This increase relates to extra ECG traffic.
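
The relative ranges above translate into absolute bounds as follows. The helper below is only a worked calculation on the reported numbers; its name is hypothetical.

```python
def owd_range(avg_us, below_pct, above_pct):
    """Return (lowest, highest) OWD in microseconds from relative deviations."""
    low = avg_us * (1.0 - below_pct / 100.0)
    high = avg_us * (1.0 + above_pct / 100.0)
    return low, high
```

For 128 pulses/sec this gives approximately 334.5 µs to 534.9 µs, and for 256 pulses/sec approximately 311.3 µs to 582.6 µs.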

Fig. 5. OWD distribution (BGW_Driver to ECG_Server) with BGW at 128 pulses/sec.

4 Future Work

We have prototyped a distributed embedded platform for soft real-time ECG monitor-
ing, analysis, and visualization. The platform extends open source WFDB and OSEA
software packages and uses a novel shared memory timing infrastructure for sharing
performance metrics among multiple processes. Validation of real-time ECG analysis
uses the STMicro BodyGateway operating at 128 or 256 pulses/s.
Our future research will focus on parallelization techniques, data compression, and
socket options/flags to increase scalability, i.e., support more sensor devices at higher
rates. In addition, we will examine mixed criticality scenarios using memory/network
bandwidth management schemes, and evaluate overheads due to network/system security
mechanisms in the presence of immediate notifications or alerts.

References
1. American Heart Association: Heart Disease and Stroke Statistics, Report (2020).
https://www.heart.org/-/media/files/about-us/statistics/2020-heart-disease-and-stroke-ucm_505473.pdf
2. Alivecor. https://www.alivecor.com/
3. Preventice BG. https://www.preventicesolutions.com/patients/body-guardian-heart
4. Lifemonitor. http://www.equivital.co.uk/products/tnr/sense-and-transmit
5. NowCardio. https://contex-tech.com/medical/nowcardio
6. Physiomem. http://www.getemed.net/en/telemonitoring/physiomemr-pm-1000
7. Apple Watch, Series 5. https://www.apple.com/apple-watch-series-5/health/
8. WFDB. https://archive.physionet.org/physiotools/wfdb.shtml
9. Hamilton, P.S., Tompkins, W.J.: Quantitative investigation of QRS detection rules
using the MIT/BIH arrhythmia database. IEEE Trans. Biomed. Eng. 12, 1157–1165 (1986)
10. Pan, J., Tompkins, W.J.: A real-time QRS detection algorithm. IEEE Trans. Biomed. Eng. 3,
230–236 (1985)
11. EP Limited, OSEA. https://www.eplimited.com/confirmation.htm
12. Pinto, J.R., Cardoso, J.S., Lourenço, A.: Evolution, current challenges, and future possibilities
in ECG biometrics. IEEE Access 6, 4746–4776 (2018)

13. Teplitzky, B.A., McRoberts, M., Ghanbari, H.: Deep learning for comprehensive ECG
annotation. Heart Rhythm J. 17(5), 881–888 (2020)
14. Choi, J.-H., Yoo, C.: One-way delay estimation and its application. Comput. Commun. 28,
819–828 (2005)
Software Architecture of a User-Level
GNU/Linux Driver for a Complex E-Health
Biosensor

Miltos D. Grammatikakis, Anastasios Koumarelis, and Angelos Mouzakitis

Hellenic Mediterranean University, 71410 Heraklion, Greece
{mdgramma,tkoumarelis,angelos.mouzakitis}@cs.hmu.gr

Abstract. Biosensor devices transform healthcare services from a physician-,
hospital- or clinic-centric system to one that directly involves patients. In this
work, we develop the software architecture of a GNU/Linux driver for a com-
plex biosensor (called STMicro Bodygateway). Our multithreaded driver supports
data acquisition from multiple on-board sensors, capturing raw Bluetooth packets
via rfcomm, processing them to retrieve associated biosignals, and transmitting
extracted biometric data over Ethernet to a server in soft real-time. In our exper-
imental framework, our driver commands the BGW sensors to transmit either
an ECG signal at 256 Hz, or accelerometer (x, y, z) data at 50 Hz. For both
cases, performance results indicate that our Linux driver can sustain soft real-time
behavior.

1 Introduction
E-Health refers to deployment of information and communication technologies (includ-
ing Internet) in the health sector, an essential step influencing lifestyle, maintaining well-
ness through early detection of disease, and lowering the escalating costs of healthcare.
E-Health includes scientific and R&D activities related to the deployment of medical
computer systems or services in a wide range of areas, such as

• Electronic Health Records (EHR) providing the ability to store and exchange patient
data among health professionals.
• Computerized physician applications for diagnostic tests.
• Telemedicine involving diagnosis and treatment of the patient’s physical and psycho-
logical condition from a distance.
• mHealth for monitoring real-time vital signs of a patient and transmitting patient data
to/from specialists using mobile wireless devices.
• Remote surgery by manipulating robotic systems.

In particular, wearable medical sensors and implantable devices are contributing to a
proliferation of healthcare services. These miniaturized and robust sensors continually
monitor biological signals that correspond to a number of relevant physiological
parameters. Nowadays, most of these sensors send data via Bluetooth to a server, which
processes, analyzes and/or displays data. In a thorough literature search, we have found
that approximately 40% of existing commercial pulse recorders, such as BodyGuardian
[1, 2], CardiBeat [3], Bittium Faros 360 [4], EC-12RM [5], and D-Heart [6], support
Bluetooth communication, which consumes less energy than WiFi. In addition,
Bluetooth can be compatible with IEEE 11073, a new family of standards that promote a
homogeneous e-health ecosystem and provide uniform data representation and exchange
between sensor and gateway [7].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 97–103, 2021.
https://doi.org/10.1007/978-3-030-66729-0_12
In this work, we utilize the STMicroelectronics BodyGateway (BGW), a medical-
grade wearable electronic patch widely used for remote monitoring of cardiac and res-
piratory functions. The device attaches similarly to a bandage to one of three places
on the patient’s chest and allows physicians and care providers to monitor patients
continuously from anywhere and at any time. The BGW device architecture integrates
several micro-electromechanical (MEMS) body sensors with an ARM STM32 F-series
microcontroller supporting a real-time operating system (RTOS). The system includes a
standard oscillator (32768 Hz) and transmits packets based on custom medical protocols
by connecting to an integrated Bluetooth (BT 3.0) interface (STLC2584) via SPI. An
extension of BGW is the BodyGuardian device used by Preventice to monitor hundreds
of thousands of patients. The complete system consists of mobile components that
lect and transmit biosignals to a remote hospital, and visualize data for evaluation by
dedicated healthcare professionals.
BGW can be programmed to acquire, digitize, and either stream in real-time, or store
and periodically transmit vital physiological data via a BT radio link to a receiver. Using
the BGW, both in-hospital and out-of-hospital use cases can be supported, such as home
monitoring of elderly people, chronic cardiac disease monitoring, and single lead Holter
applications. In these cases, physicians can constantly access patients’ data at any time via
a diagnostic decision support system that runs on a hospital media gateway. They can
usually review the biosignals graphically, enable additional sensors, or set automated
notification thresholds that correspond to events that relate to personalized care plans.
Our contribution focuses on the software architecture of a GNU/Linux driver that
manages a complex BT-to-Wireless mixed-protocol stack. In particular, it uses custom
packet and cluster libraries to communicate over Bluetooth with the BGW device: it sends
BT packets to program the BGW device, reliably captures raw BT packets
from the BGW device (via the rfcomm protocol), retrieves biosignal information, and
finally transmits vital signals to a server via Ethernet. Experiments with accelerometer
signals and high-rate ECG demonstrate that our Linux driver can sustain soft real-time
behavior.
Our driver emulates a single-producer single-consumer bridge between Bluetooth
and Wireless. Its implementation uses a lock-based list or a concurrent queue to store
received BT packets. These data structures, although important performance-wise, have
not been examined in prior work. The existing state of the art mostly relates to efficient
processing of ECG signals, i.e., filtering, smoothing, and arrhythmia analysis [8, 9].
Although our work is similar in its foundation to [10], which deals with receiving,
converting and transmitting data packets between BT and Wireless, our focus is on
efficient real-time processing rather than interoperability.

Next, in Sect. 2 we discuss the software architecture of our BGW driver. Section 3
focuses on the experimental framework and validates real-time performance. Finally,
Sect. 4 provides a short summary and discusses future work.

2 Software Architecture of Our BGW Driver


Based on STMicro specifications (covered by NDA in EU project DREAMS), we have
written a multithreaded user-level GNU/Linux driver that manages the BGW device,
supporting transfer of biometric data to a server. This study concentrates on the
development of the Linux driver, omitting detailed specifications. The Linux driver runs
on an embedded single board computer, e.g. an ARM v7 Odroid XU4. The main program
configures the connection to the BGW device via BT (bluez tools), handling pairing,
opening an rfcomm socket via a specific channel number, and starting a connection [11].
Then, as seen in Fig. 1, three POSIX threads are initiated.

Fig. 1. Block architecture of the BGW driver.

• The BT writer thread sends packets to the BGW device via the rfcomm channel
to request a specific type of periodic signal from its suite of biosensors. Packets
contain multiple commands and optional sets of parameters (called clusters). Clusters
are structures embedded in packets that help define command functionality. Due to
the large number of supported sensors and direct notifications from the BGW, packet
and cluster structures are more complex than similar ones proposed in healthcare
devices [12–15] and are supported by two custom libraries (Packet & Cluster Lib).
Example packets include requests for a) ECG data at 128 or 256 Hz from standard
1-lead patch-like electrodes for heart rate (including Holter) variability and reliability, b)
bioimpedance sensor data for respiratory rate operating at 30 Hz, and c) linear/angular
(3-axis) accelerometer data at 50 Hz for body position and physical activity estimation.
Data rates are competitive with current medical-grade biosensors.
• The BT reader thread receives the expected raw data (254 bytes) from the BGW
device through the rfcomm protocol via a BT 4.0 interface (USB dongle on Odroid XU4),
and retrieves the vital signal from the appropriate packet fields (performing low-level
bit/byte operations considering ARM’s big endian architecture), e.g. 12 bits for ECG,
or 16 bits for each of the x, y, z accelerometer values. Biometric data is subsequently
inserted in a shared list (or a concurrent queue), i.e. the BT reader thread acts as
a single producer.
• Finally, the BT-WiFi sharing thread periodically pops all data from the shared
list (or, respectively, dequeues data from the concurrent queue) and transmits them to the
hospital server (a cost-efficient Odroid XU4) over a TCP connection. The
server can perform further analysis using signal filtering. The thread optionally saves
raw data to a file for validation, or deploys a low-overhead X Window System (X11)
application for monitoring biometric data in soft real-time (i.e., without noticeable
delays). The X11 application performs hundreds of times faster than high-level
visualization tools (such as Java libraries) or even Grace, a common GNU/Linux tool for
dynamic visualization.
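
The single-producer single-consumer bridge formed by the reader and sharing threads can be sketched as follows. This is an illustration under several assumptions: the real BGW packet layout is covered by NDA, so the 6-byte (x, y, z) payload used here is hypothetical; queue.Queue is a lock-based structure, so the sketch corresponds to the lock-based variant rather than a lock-free queue; the consumer blocks on the queue instead of polling periodically; and send stands in for the TCP transmission.

```python
import queue
import struct
import threading

def decode_xyz(payload):
    """Decode one (x, y, z) sample as three signed 16-bit big-endian values.

    Hypothetical layout: the real BGW packet format is not public.
    """
    return struct.unpack(">hhh", payload)

def reader(packets, shared):
    """Single producer: decode raw payloads and enqueue the samples."""
    for payload in packets:
        shared.put(decode_xyz(payload))
    shared.put(None)  # sentinel: no more data

def sharer(shared, send):
    """Single consumer: drain queued samples and hand each batch to `send`."""
    done = False
    while not done:
        batch = [shared.get()]        # block until at least one item arrives
        while True:                   # then take everything already queued
            try:
                batch.append(shared.get_nowait())
            except queue.Empty:
                break
        if None in batch:             # sentinel seen: finish after this batch
            done = True
            batch = [s for s in batch if s is not None]
        if batch:
            send(batch)

def run_bridge(packets, send):
    """Run the producer and consumer threads until all packets are forwarded."""
    shared = queue.Queue()
    t_reader = threading.Thread(target=reader, args=(packets, shared))
    t_sharer = threading.Thread(target=sharer, args=(shared, send))
    t_sharer.start()
    t_reader.start()
    t_reader.join()
    t_sharer.join()
```

Draining everything that is queued before sending mirrors the "pop all data" behavior described above and keeps the number of send calls low when data arrives in bursts.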

3 Experimental Framework, Testbenches and Results


In our experimental framework, we set the BGW in streaming mode for real-time trans-
mission of vital data, either ECG, or accelerometer. More specifically, we evaluate per-
formance of the BGW driver when the BGW device is set to operate either the ECG
sensor sending cardiac pulses at 256 Hz, or the accelerometer sending consecutive (x, y,
z) values at 50 Hz. We focus on the average rate during TCP transmission, i.e. the BGW
driver must capture the raw Bluetooth packets via rfcomm, process them to retrieve the
biosignals, and transmit extracted biometric data over Ethernet in real-time.
In Fig. 2, we consider the performance of our Linux driver when the BGW device is set
to operate the ECG sensor at a posted rate of 256 pulses/s. The average transmission
rate over a local TCP connection is obtained by counting the number of samples sent at
the TCP send timepoint (T2) of the BGW driver (see Fig. 1). Figure 2 indicates that both
implementations are able to sustain the posted rate.
Similarly, in Fig. 3, we consider the performance of our Linux driver when the BGW
device is set to operate the accelerometer at 50 (x, y, z) pulses/s. The average
transmission rate of BGW accelerometer values (x, y, z) over a local TCP connection is
obtained by counting the number of values sent at the TCP send timepoint (T2) of the BGW
driver (see Fig. 1). Figure 3 indicates that both implementations (lock-free and
lock-based) perform well with respect to real-time behavior. Even though our concurrent
queue implementation is extremely fast, the relatively small rate of data transmission
allows a lock-based implementation to perform well. In fact, at these rates the
lock-based implementation has surprisingly predictable performance.
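
As a rough illustration of how an average rate at the send timepoint can be computed, the sketch below accumulates sample counts and timestamps per send. The class name and the convention that the first send only opens the measurement interval are assumptions made for this example, not details of the driver.

```python
class RateMeter:
    """Average sample rate between the first and the last recorded send."""

    def __init__(self) -> None:
        self.first_t = None
        self.last_t = None
        self.samples = 0

    def record(self, n_samples: int, timestamp_s: float) -> None:
        if self.first_t is None:
            # The first send only marks the start of the measurement interval.
            self.first_t = timestamp_s
        else:
            self.samples += n_samples
        self.last_t = timestamp_s

    def average_rate(self) -> float:
        if self.first_t is None or self.last_t <= self.first_t:
            raise ValueError("need at least two sends at distinct times")
        return self.samples / (self.last_t - self.first_t)


if __name__ == "__main__":
    meter = RateMeter()
    for second in range(3):
        meter.record(256, float(second))  # one send of 256 samples per second
    print(meter.average_rate())
```

A sustained posted rate of 256 samples/s would show up as an average close to 256 over a long enough interval.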

Fig. 2. Average ECG data rate when transmitting 12-bit ECG data over TCP connection via our
Linux driver. We consider two different implementations: a) lock-free concurrent queue-based,
and b) POSIX lock-based.

Fig. 3. Average data rate sustained by our Linux driver when transmitting 16-bit BGW accelerom-
eter values (x, y, z) over TCP connection. We consider: a) lock-free (concurrent queue-based) and
b) lock-based implementations.

Notice that validation of real-time signal transmission on the BGW depends on the
quality of the real-time clock (RTC). The internal RTC has ±1 ppm oscillator accuracy,
which can create issues if used for an extended period of time. Our measurements at
timepoint T1 confirm the small variability of the average rate, which is at least four
times smaller than that reported in Figs. 2 and 3 (graph omitted due to space
restrictions). However, disturbances can still occur, especially since the BGW device
may often respond with partial (short) packets, e.g. when it misses an RTOS deadline
because high-priority alert notifications take place. As shown in Fig. 4, non-full
packets carry ~12–18% of the ECG data compared to complete ones.
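
To put the ±1 ppm figure mentioned above in perspective, the following small calculation converts an oscillator accuracy into a worst-case time drift and into sample periods. The 24 h interval is an arbitrary choice for illustration; the 256 Hz rate is the ECG rate used in the experiments.

```python
def worst_case_drift_s(accuracy_ppm: float, elapsed_s: float) -> float:
    """Worst-case clock error after elapsed_s seconds for a given ppm accuracy."""
    return accuracy_ppm * 1e-6 * elapsed_s


def drift_in_samples(accuracy_ppm: float, elapsed_s: float, rate_hz: float) -> float:
    """The same drift expressed in sample periods at rate_hz."""
    return worst_case_drift_s(accuracy_ppm, elapsed_s) * rate_hz


if __name__ == "__main__":
    day = 24 * 3600.0
    print(worst_case_drift_s(1.0, day))        # seconds of drift per day
    print(drift_in_samples(1.0, day, 256.0))   # ECG sample periods per day
```

Under these assumptions the drift is about 86 ms per day, i.e. roughly 22 sample periods at 256 Hz, which is why long recordings are more sensitive to RTC quality than short ones.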

Fig. 4. ECG data (in Kbytes) carried in partial and complete packets. Measurement is made from
BT Reader Thread at timepoint T1, i.e. while reading BT raw data.

4 Summary and Future Work


We have developed a Linux driver for a complex biometric device involving the STMicro
BodyGateway sensor and validated real-time behavior when capturing and transmitting
ECG or accelerometer signals to an external server. Future work concentrates on exam-
ining driver scalability when connecting to multiple sensor devices operating at higher
rates. We are also interested in pursuing the design of low-cost secure IoT-enabled
medical devices and services.

References
1. Preventice BG. https://www.preventicesolutions.com/patients/body-guardian-heart
2. Preventice BG Mini. https://www.preventicesolutions.com/hcp/body-guardian-mini
3. CardiBeat. https://www.theheartcheck.com/cardibeat/index.html
4. Bittium Faros 360. https://www.bittium.com/medical/bittium-faros
5. EC-12RM. http://www.labtech.hu/products/netecg/ec-12rm.html
6. D-Heart. https://www.d-heartcare.com
7. http://www.pchalliance.org/personal-health-gateway-bluetooth-low-energy-manager
8. Pinto, J.R., Cardoso, J.S., Lourenço, A.: Evolution, current challenges, and future possibilities
in ECG biometrics. IEEE Access 6, 4746–4776 (2018)
9. Teplitzky, B.A., McRoberts, M., Ghanbari, H.: Deep learning for comprehensive ECG
annotation. Heart Rhythm J. 17(5), 881–888 (2020)
10. Shin, S-H., Suwon, S.: Apparatus and method for linking Bluetooth to Wireless LAN. US
Patent 0071123A1, 15 April 2004
11. Huang, A.S., Rudolph, L.: Bluetooth Essentials for Programmers. Cambridge University
Press, Cambridge (2007)
12. Simunic, D., Tomac, S., Vrdoljak, I.: Wireless ECG monitoring. In: Proceedings
Conference Wireless Communication, Vehicular Info Theory, Aerospace & Electrical Systems
Technology, pp. 73–76 (2009)
13. Elaarag, H., Bauschlicher, D., Bauschlicher, S.: System architecture of HatterHealthConnect.
Int. J. Comp. Networks Commun. 5(2), 1–22 (2013)
14. Wang, X.: Design of ECG acquisition system based on Bluetooth wireless communication.
In: International Conference on Software Engineering and Service Science, pp. 1019–1022
(2014)
15. Biagetti, G., Crippa, P., Falaschetti, L., et al.: Recognition of daily human activities
using accelerometer and sEMG signals. In: Czarnowski, R.J., Howlett, R., Jain, L.C. (eds.)
Intelligent Decision Technologies, Springer, Singapore, pp. 37–47 (2019)
Enabling Smart Home Voice Control
for Italian People with Dysarthria:
Preliminary Analysis of Frame Rate
Effect on Speech Recognition

Marco Marini(B), Gabriele Meoni, Davide Mulfari, Nicola Vanello, and Luca Fanucci

Department of Information Engineering, University of Pisa, via G. Caruso 16,
56122 Pisa, Italy
marco.marini@phd.unipi.it

Abstract. Within the field of automatic speech recognition, the processing
of dysarthric speech is a challenge because standard approaches are
ineffective in the presence of dysarthria. This paper presents preliminary
evidence that the performance of speaker-dependent speech recognition
systems trained for speakers with dysarthria may be substantially
improved by tuning the size and shift of the spectral analysis window used
to compute the initial short-time Fourier transform used in many speech
front ends. Evidence for this comes from a set of experiments performed
on a small collection of Italian speech (isolated words) from five different
speakers suffering from different degrees of dysarthria. The experimental
framework used in the paper constructs speaker-dependent GMM-HMM
speech recognition models using the triphone Kaldi recipe and varying
choices of the spectral analysis window size and shift. Results show a
relative WER improvement ranging from 31% to 81%, depending on the
speaker.

Keywords: Dysarthria · Automatic Speech Recognition · Speech analysis ·
Genetic algorithm · Kaldi

1 Introduction
Dysarthria is a motor speech disorder caused by a neurological injury [1] affecting
the brain areas responsible for speech. This damage manifests itself in different
ways in different subjects, leading to several types of dysarthria. Most forms of
dysarthria reduce speech intelligibility, which often implies a reduction in social
interaction.
A possible solution to improve the life of people affected by speech
disabilities could be a system capable of recognizing the intended speech,
thus enabling text-to-speech conversion or real-time voice synthesis. For this
purpose, the most important challenge is to build an Automatic Speech
Recognition (ASR) system able to understand a dysarthric utterance. Such
systems are mainly based on Hidden Markov Models (HMMs) combined with
Gaussian Mixture Models (GMMs) [2] or on Deep Neural Networks (DNNs) [3].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 104–110, 2021.
https://doi.org/10.1007/978-3-030-66729-0_13

Nowadays, ASR systems and interfaces are not designed to process commands
from a person with dysarthria, because they are mainly trained on unimpaired
speech, and the pronunciation of dysarthric speakers deviates from that of
non-disabled speakers in many aspects. Thus, it is very important to make
dysarthric speech more recognizable for an ASR system.
An earlier study [4] investigated acoustic and lexical model adaptation and
reported an average relative Word Error Rate (WER) reduction of 36.99% for an
ASR system trained on a large-vocabulary dysarthric speech database. Another
interesting approach comes from the idea of tuning GMM-HMM parameters. The
authors of [5] addressed a dysarthric speech recognition task on the TORGO
database [6]. They trained acoustic models using GMM-HMMs and DNN-HMMs with
careful tuning of speaker-specific parameters, and reported a relative WER
reduction of 17.62% with respect to the baseline system trained on a more
complex model. A comparative study among different architectures was performed
in [7]. The results show that hybrid DNN-HMM models outperform classical
GMM-HMM ones in terms of WER. The database used was again TORGO [6], and a 13%
improvement in WER was achieved with respect to the classical architectures.
A common feature of [4, 5] and [7] is the use of a 15 ms time step in the
moving-window procedure for speech feature extraction, instead of the standard
10 ms. The window size remains 25 ms. All those papers show that such a change
improves the results. The reason behind the improvement was thought to be the
slower articulatory rate in dysarthric speech. However, this hypothesis has
not been verified in those works, nor are we aware of any study aiming at
optimizing this parameter.
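
To make the effect of the time step concrete, the sketch below counts how many analysis frames a recording yields for a given window and shift. It assumes no padding at the edges, so only windows that fit entirely inside the recording are counted; the function name and the 1 s duration are illustrative choices.

```python
def num_frames(duration_ms: int, window_ms: int, shift_ms: int) -> int:
    """Number of full analysis windows that fit in a recording (no edge padding)."""
    if duration_ms < window_ms:
        return 0
    return 1 + (duration_ms - window_ms) // shift_ms


if __name__ == "__main__":
    for shift in (10, 15):
        print(shift, num_frames(1000, 25, shift))
```

Under this assumption, a 1 s recording gives 98 frames with the standard 25 ms window and 10 ms shift, and 66 frames with a 15 ms shift, so a longer shift gives the acoustic model fewer observations per unit of speech.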
In this work, we propose to optimize the performance of an ASR system
for dysarthric speech by tuning both the time step and the moving window
size parameters. The ASR system is developed with a speaker-dependent (SD)
approach using the Kaldi [8] toolkit. A genetic algorithm (GA) is used to
search for the window size and shift that optimize the word error rate for
each speaker. We describe the materials and methods used in Sect. 2 before
discussing the results in Sect. 3. Finally, Sect. 4 concludes the paper.

2 Data and Methods

2.1 Materials

The data used for this research come from Centro Ausili di Bologna, an aid
centre for people with disabilities. The database contains recordings of
five volunteers, three males (M01, M02, M03) and two females (F01, F02), with
different degrees of dysarthria. Unfortunately, no clinical information,
including the level of dysarthria, is available for the speakers. The speech
recording system consists of two microphones: a high-quality one [9] and a
low-quality one [10]. Data were acquired at a sampling rate of 44.1 kHz. The
latter microphone was included in order to be closer to the expected
application (a mobile application running on a smartphone).
The vocabulary contains 189 words and 43 phones in total. Each speaker
uttered each word at most three times. The duration of the recordings ranges
from 1 s to 2 s, and each word is composed of at least 2 and at most 11
phones. The phones were extracted from the Italian words, in SAMPA format, by
means of the “g2p” program [11].
Each speaker recorded about 40 min of speech, although the amount varies.

2.2 Automatic Speech Recognition

We decided to use a GMM-HMM architecture for our ASR system, as suggested
by the state of the art when a limited amount of data is available [12]. The
Kaldi toolkit was chosen to create the ASR system. Kaldi [8] is a toolkit for
speech recognition licensed under the Apache License v2.0. Kaldi has the same
goal as HTK [13], the classic ASR toolkit usually found in the literature and
in textbooks. Kaldi was chosen because it offers flexible and extensible
code written in C++. Using Kaldi recipes, the user can build monophone
models, triphone models, and models based on neural networks. It is also
possible to extract MFCC and PLP features from the speech signal and to set
all the feature extraction parameters.
For our purpose, we chose a triphone acoustic model trained with the
Speaker Adaptive Training algorithm. The feature vector is projected by the
Linear Discriminant Analysis criterion [14] and transformed by the Maximum
Likelihood Linear Transformation [15] (LDA + MLLT + SAT), and the model is
trained with the recordings of a single user (speaker-dependent approach).
As features, Perceptual Linear Prediction (PLP) and Mel Frequency
Cepstral Coefficient (MFCC) vectors were computed by means of the Short-Time
Fourier Transform (STFT) and then tested and compared.

2.3 Training and Testing

The words were divided into a training set and a test set. The former is
composed of 126 words and the latter of 63 words. In this way, the acoustic
model is tested with words never seen in the training phase.
In each experiment, WER values were estimated as a function of window and
shift size. We ran an experiment for each type of feature (MFCC and PLP) and
for each subject. We can therefore define the function WER_ASR(w, s) as the
WER of an ASR system as a function of its window size w and shift s.
The results have been compared with the state-of-the-art choice for window
size and shift time, which are 25 and 10 ms respectively.
A Genetic Algorithm (GA) [16] was used to find the optimal values of the
window and time shift used for the STFT [17] within a given range. This
algorithm was chosen because we do not know the shape of the function that we
would like to optimize. Specifically, the GA generates a pair (w, s) within a
given range. An ASR system is then created, trained with that pair of window
and shift values, and evaluated (WER value).
The GA was implemented in Java. Each individual represents a point in the
Window-Shift space, and we use the WER of the ASR system as the fitness
function. Thus, in the fitness function, the Java program first writes the
individual’s window and shift values into the Kaldi configuration file and
then executes the Kaldi recipe, which returns a WER value.
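
The search loop described above can be sketched as follows. This is a minimal illustration in Python rather than the Java implementation of the paper, and the genetic operators (truncation selection, gene-wise crossover, Gaussian mutation), population size, ranges and the toy fitness function are all assumptions. `--frame-length` and `--frame-shift` are the Kaldi feature-extraction options that set the window and shift in milliseconds; running the recipe itself is replaced here by a synthetic stand-in.

```python
import random


def feature_conf(window_ms: float, shift_ms: float) -> str:
    """Render the two framing options as lines of a Kaldi feature config file."""
    return f"--frame-length={window_ms:g}\n--frame-shift={shift_ms:g}\n"


def genetic_search(fitness, w_range, s_range, pop_size=20, generations=30, seed=0):
    """Minimise fitness((window, shift)) with a small elitist GA."""
    rng = random.Random(seed)

    def clip(value, bounds):
        return min(max(value, bounds[0]), bounds[1])

    population = [(rng.uniform(*w_range), rng.uniform(*s_range))
                  for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(population, key=fitness)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Gene-wise crossover followed by Gaussian mutation, kept in range.
            w = clip(rng.choice((a[0], b[0])) + rng.gauss(0, 2.0), w_range)
            s = clip(rng.choice((a[1], b[1])) + rng.gauss(0, 1.0), s_range)
            children.append((w, s))
        population = parents + children
    return min(population, key=fitness)


def toy_wer(individual):
    """Synthetic stand-in for 'write config, run recipe, read WER'."""
    w, s = individual
    return (w - 50.0) ** 2 + (s - 20.0) ** 2


if __name__ == "__main__":
    best = genetic_search(toy_wer, (10.0, 100.0), (5.0, 50.0))
    print(feature_conf(*best), toy_wer(best))
```

In a real setting every fitness call trains and scores a complete recognizer, so evaluated (window, shift) pairs would normally be cached instead of being recomputed at each sort.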

3 Experimental Setup and Results

The aim of our experiment is to find out how changing the window and shift
size could improve the performance of an ASR system for people with
dysarthria, and whether an optimum exists at the single-user level. For this
reason, we decided to use a speaker-dependent approach. We also compare our
results with the standard approach, which uses 25 and 10 ms for window size
and shift time.
The accuracy of a speech recognizer is typically measured by WER, which is
computed from the ASR output and the corresponding human transcription. For
continuous speech, the WER is computed as an edit distance on words between
the ASR output and the reference transcription. In this paper, however,
since we are using isolated words, there is no need for an alignment
procedure, so the WER is simply the proportion of words that were incorrectly
recognized. Note that WER is an error function, so the ideal value is zero.
A dysarthric speaker may pause in the middle of a word because of breathing
issues, so the ASR system can interpret one word as two separate words.
This behaviour can lead to a WER greater than 100%.
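
The following sketch shows why a split word can push the WER above 100%. It uses the usual definition of WER as the word-level edit distance divided by the number of reference words; the Italian words in the example are hypothetical and not taken from the database.

```python
def word_error_rate(reference: list, hypothesis: list) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i] + [0] * len(hypothesis)
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current[j] = min(previous[j] + 1,      # deletion
                             current[j - 1] + 1,   # insertion
                             substitution)
        previous = current
    return previous[-1] / len(reference)


if __name__ == "__main__":
    print(word_error_rate(["accendi"], ["accendi"]))    # correct recognition
    print(word_error_rate(["accendi"], ["a", "cena"]))  # one word heard as two
```

In the second call the single reference word is matched by one substitution plus one insertion, so the distance is 2 for a reference of length 1, i.e. a WER of 200%.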
The results consist of a set of points in the Window-Shift space, which
compose a curve for each user. Figures 1 and 2 show the curve for user F01 as
an example. The curves of the other users are very similar.

Fig. 1. User F01: side view of WER curve using MFCC features (Window side).

Fig. 2. User F01: side view of WER curve using MFCC features (Shift side).

As a common finding across all users, we were able to identify a region where
WER is very low. We call it the Optimal Region of an ASR system for a specific
user. Tables 1 and 2 show the borders of the Optimal Region for each user,
along with the average WER and its standard deviation for ASR systems trained
with Window and Shift values taken inside that region. The average Window and
Shift values indicate the centre of gravity of the region. Furthermore, each
average WER value is compared with the WER of the ASR system that uses the
baseline values of Window and Shift, and the relative improvements are shown
in the last column.

Table 1. Optimal region of Window-Shift space for each user, and relative performance
of ASR trained on MFCC features extracted from this region, compared with baseline
performance.

User   Window [ms]            Shift [ms]             WER [%]                     Improvement [%]
       min    max    avg      min    max    avg      avg    std dev  baseline
M01    22.68  93.68  43.85    15.70  28.74  22.64    36.33  1.47     60.41       39.86
M02    21.57  65.69  39.97    17.66  25.29  19.91     8.35  0.57     35          76.14
M03    21.32  86.93  52.14    15.68  25.96  17.88    30.69  1.56     44.74       31.40
F01    45.04  95.33  68.82    33.09  48.02  45.13    15.17  0.90     83.38       81.81
F02    25.75  78.74  50.67    17.19  28.92  22.84    31.26  1.30     55.10       43.27

Table 2. Optimal region of Window-Shift space for each user, and relative performance
of ASR trained on PLP features extracted from this region, compared with baseline
performance.

User   Window [ms]            Shift [ms]             WER [%]                     Improvement [%]
       min    max    avg      min    max    avg      avg    std dev  baseline
M01    20.36  98.40  48.94    18.18  27.36  19.94    35.82  1.86     62.44       42.63
M02    35.09  68.03  47.80    18.16  31.51  22.81     8.68  0.56     37.5        76.85
M03    24.56  64.18  43.65    14.86  24.09  16.86    31.02  1.85     53.16       41.64
F01    43.96  97.37  72.53    29.79  46.46  39.06    14.93  1.20     66.67       77.61
F02    20.69  93.32  49.74    12.99  29.21  20.93    29.99  2.01     54.59       44.70

To determine the Optimal Region, we have taken into account all the points
of the Window-Shift space ($WS_{space}$) whose WER does not exceed a threshold:

$$\mathrm{Threshold} = 1.20 \cdot \min_{x \in WS_{space}} WER(x) \qquad (1)$$

$$x_{eval} = \{\, x \in WS_{space} : WER(x) \le \mathrm{Threshold} \,\} \qquad (2)$$

and we have taken as lower and upper bounds of the Optimal Region the
minimum and maximum of the Window and Shift values of $x_{eval}$. The average
Window value is computed as:

$$W_{avg} = \frac{1}{N} \sum_{x \in x_{eval}} Window_x, \qquad N = |x_{eval}| \qquad (3)$$

where $Window_x$ is the Window size of the element $x$. The same procedure
was performed for the Shift.
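
Equations (1) to (3) can be sketched in a few lines. The function name, the tuple layout of the evaluated points and the sample data are assumptions made for illustration; the selection uses the non-strict comparison of Eq. (2).

```python
def optimal_region(points, factor=1.20):
    """points: iterable of (window_ms, shift_ms, wer) tuples from the search."""
    points = list(points)
    threshold = factor * min(wer for _, _, wer in points)       # Eq. (1)
    selected = [p for p in points if p[2] <= threshold]         # Eq. (2)
    n = len(selected)
    windows = [p[0] for p in selected]
    shifts = [p[1] for p in selected]
    return {
        "window": (min(windows), max(windows), sum(windows) / n),  # Eq. (3)
        "shift": (min(shifts), max(shifts), sum(shifts) / n),
        "wer_avg": sum(p[2] for p in selected) / n,
    }


if __name__ == "__main__":
    evaluated = [  # hypothetical (window, shift, WER) evaluations
        (25, 10, 60.0), (40, 20, 10.0), (60, 30, 11.0),
        (80, 40, 11.5), (95, 48, 13.0),
    ]
    print(optimal_region(evaluated))
```

Each tuple in the result holds the minimum, maximum and average of the corresponding parameter, which is the layout of the Window and Shift columns in Tables 1 and 2.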
From the results of our experiments it can be inferred that there are no
significant differences between ASR systems employing MFCC or PLP features.
Furthermore, we have noticed that the standard values of Window and Shift
lie outside the Optimal Region. In particular, the standard setting always
lies below the Optimal Region, and for some speakers it is quite far from it.
Results in Tables 1 and 2 show that a user-dependent improvement ranging
from 31% to 81% can be obtained.
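
The improvement column appears to be the relative WER reduction with respect to the baseline. The formula below is an assumption inferred from the table values rather than one stated in the text; it reproduces the M01 and M02 rows of Table 1 to within rounding.

```python
def relative_improvement(baseline_wer: float, tuned_wer: float) -> float:
    """Relative WER reduction in percent: 100 * (baseline - tuned) / baseline."""
    return 100.0 * (baseline_wer - tuned_wer) / baseline_wer


if __name__ == "__main__":
    print(round(relative_improvement(60.41, 36.33), 2))  # M01, MFCC
    print(round(relative_improvement(35.0, 8.35), 2))    # M02, MFCC
```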
Another interesting result is that the Optimal Region differs from speaker
to speaker, which could be due to some speech feature that characterizes
each speaker.
The WER improvement seems to be mainly related to the choice of shift
size, while Window size is not as relevant as we expected. Indeed, the shape
of the function in the Optimal Region seems quite flat, with local
oscillations, for each user.

4 Conclusion
This paper presented some preliminary experiments on the effects of window
and shift size on speech recognition of Italian dysarthric speech. The database
used for these experiments contains a fairly limited number of samples from
just 5 speakers, but it is still a starting point, since no Italian
dysarthric speech database is available online.
First of all, we saw that there is no big difference, in terms of WER,
between PLP and MFCC features.
As a result, we found that an optimal region exists in the Window-Shift
space in which the WER can be minimized. The performance of the optimized
models is compared to that of models trained using the standard
window parameters. The results show that tuning the window size and shift can
substantially reduce word error rates compared to using the default parameters.
It would be interesting to replicate these experiments on unimpaired speech
to evaluate whether the effect seen in dysarthric speech also holds for
typical speech. In future work we therefore want to deepen this research by
also taking unimpaired speech into account.
Furthermore, it would be interesting to investigate whether the level of
dysarthria can be estimated from spectral analysis parameters such as Window
size and Shift time.

References
1. Robin, D.A., et al.: Clinical Management of Sensorimotor Speech Disorders. In:
Malcolm, M.R. (ed.) Thieme, New York (1997)
2. Gales, M., et al.: The Application of Hidden Markov Models in Speech recognition.
Now Publishers Inc, Hanover (2008)
3. Jinyu, L., et al.: Improving wideband speech recognition using mixed-bandwidth
training data in CD-DNN-HMM. In: 2012 IEEE Spoken Language Technology
Workshop (SLT) (2012)
4. Mengistu, K.T., et al.: Adapting acoustic and lexical models to dysarthric speech. In:
2011 IEEE International Conference on Acoustics, Speech and Signal Processing
(2011)
5. Joy, M.M., et al.: Improving acoustic models in TORGO dysarthric speech
database. IEEE Trans. Neural Syst. Rehabil. Eng. 26(3), 637–645 (2018)
6. Rudzicz, F., et al.: The TORGO database of acoustic and articulatory speech from
speakers with dysarthria. Lang. Resour. Eval. 46(4), 523–541 (2012)
7. Espana-Bonet, C. et al.: Automatic speech recognition with deep neural net-
works for impaired speech. In: Abad, A. et al. (eds.) International Conference on
Advances in Speech and Language Technologies for Iberian Languages, Springer,
Cham (2016)
8. Povey, D., et al.: The Kaldi speech recognition toolkit. In: IEEE 2011 Workshop on
Automatic Speech Recognition and Understanding (2011)
9. Zoom-na Hn4. https://www.zoom-na.com/products/field-video-recording/field-
recording/zoom-h4nsp-handy-recorder
10. HP H2300. https://support.hp.com/us-en/product/Headsets/5382553/model/
5407009
11. Bisani, M., et al.: Joint-sequence models for grapheme-to-phoneme conversion.
Speech Commun. 50(5), 434–451 (2008)
12. Shahin, M., et al.: A comparison of GMM-HMM and DNN-HMM based pronun-
ciation verification techniques for use in the assessment of childhood apraxia of
speech. In: 15th Annual Conference of the International Speech Communication
Association (2014)
13. Young, S.J. et al.: The HTK hidden Markov model toolkit: Design and philosophy
(1993)
14. Fukunaga, et al.: Introduction to Statistical Pattern Recognition. Elsevier, New
York (2013)
15. Gopinath, R.A.: Maximum likelihood modeling with Gaussian distributions for
classification. In: Proceedings of the 1998 IEEE International Conference on Acous-
tics, Speech and Signal Processing, (Cat. No. 98CH36181), vol. 2. IEEE (1998)
16. Sivanandam, S.N. et al.: Genetic algorithm optimization problems. In: Introduction
to genetic algorithms, pp. 165–209. Springer, Berlin (2008)
17. Allen, J.B., Rabiner, L.R.: A unified approach to short-time Fourier analysis and
synthesis. Proc. IEEE 65(11), 1558–1564 (1977)
Brain-Actuated Pick-Up and Delivery Service
for Personal Care Robots: Implementation
and Case Study

Giovanni Mezzina(B) and Daniela De Venuto

Department of Electrical and Information Engineering, Politecnico Di Bari, 70126 Bari, Italy
{giovanni.mezzina,daniela.devenuto}@poliba.it

Abstract. In this paper, we propose a smart architecture able to provide an automated
pick-up and delivery service for personal care assistance. The presented
architecture consists of a human-robot interface that connects the user’s intentions, at
the cortical level, with the functionalities of a personal care robot (PCR). This
interface must first acquire and interpret the user’s electroencephalographic (EEG)
signals. Then, it must uniquely formalize these EEG-driven requests and continuously
communicate with the environment to provide an online-updated list of available
services. The recognition of the user’s intentions is entrusted to a nested 2-choice
asynchronous Brain-Computer Interface (BCI). It bases the feature extraction and
discrimination steps on an end-to-end binary technique: local binary patterning (LBP).
The experimental results demonstrated that the proposed LBP-based BCI can decode EEG
and drive the actuator in ~883 ms with an accuracy of 84.22%. The tests also showed
that 79.2% of the requests were successfully satisfied by the PCR.

1 Introduction

The report “European disability statistics” by Eurostat shows that 26.3% of people
aged 65+, and up to 10 in 100 people of working age, experience severe motor
disabilities due to accidental injuries and neurological disease [1]. These impairments
limit their ability to move independently and take care of themselves, leading to the
need for continuous and qualified assistance that in most cases cannot be provided
[1, 2]. In this context, recent advances in robotic technologies have led to the
introduction of service robots for personal and domestic use, the personal care robots
(PCRs), which are typically emotional companions [3]. Nevertheless, sociable interaction
alone is not enough to cover the spectrum of services required for bedridden patient
care [4]. PCRs are more and more frequently asked to take care of the needs and the
physical limitations of the target patients, and different research groups are
independently improving them to address this issue [4–7]. A tangible example of this
trend concerns the improvement of human-robot interactions (HRIs), which have moved from
vocal commands to brain commands. Indeed, since vocal commands can lead to
misunderstanding due to language semantics and speech defects, HRIs now largely exploit
Brain-Computer Interfaces (BCIs), minimizing any kind of physical interaction by the
user [4–8]. A few assistive solutions jointly implementing BCI and PCRs have been
proposed in the state of the art [4–6].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 111–121, 2021.
https://doi.org/10.1007/978-3-030-66729-0_14

Some noteworthy architecture examples in this context are the ones proposed in [5]
and [6]. Specifically, the authors of [5] implemented a combination of P300,
steady-state visually evoked potential (SSVEP), and event-related de-synchronization
(ERD) based BCI techniques on a humanoid robot named NAO, to solve multi-task problems
such as humanoid robot navigation and control along with object recognition.
The experimental results in [5] showed that the system required up to 750 s to
move the robot about 5 m along a mixed-direction path, which resulted in an unavoidable
increase in user fatigue. Moreover, the most used BCI technique for navigation purposes,
among the available ones, is the SSVEP-based one. Nevertheless, to elicit visual
potentials in a subject, the video terminal must continuously flicker at different
frequencies, resulting in eye muscle fatigue.
The authors of [6] propose a BCI based on both electrooculography (EOG) and
electroencephalography (EEG). The EOG signals are analyzed to recognize eye movements
such as blinks, winks, and frowns, while the EEGs are used to detect event-related
potentials (ERPs) like P300. Whereas eye movements and ERPs had previously been used
separately for implementing assistive interfaces, which help patients with motor
disabilities in performing daily tasks, the proposed hybrid interface integrates them.
For evaluation purposes, the interface was tested on the NAO robot and on an automatic
cleaner, i.e. the Kobuki mobile robot. The navigation system based on the classification
of eye movements was tested by means of an ask-pick-deliver task, which required
2 min of continuous eye movements, leading to somatic muscle effort [6].
Among the proposed studies, no BCI-based HRI solution includes, together, all
the functions required for the proper working of the PCR: (1) comprehension of the
user’s needs; (2) generation of unambiguous requests; (3) interaction with the
environment to satisfy those needs; (4) real-time updating of the user’s choices and
changes of mind.
In this paper, we report the design and the implementation of a smart architecture for
bedridden patients’ care. More in detail, although the presented architecture can cover
a broad range of applications, this paper is framed as a case study in which the
system is exploited for a single purpose: to implement an automatic pick-up and delivery
service driven by brain signals. The proposed architecture implements a human-robot
interface based on the acquisition of EEG signals wirelessly streamed to the PCR. The
latter implements a BCI approach that interprets the EEGs, allowing the user to
formalize a request (with no language/semantic limitations), and, driven by the BCI
outcomes, actuates an autonomous navigation and exploration routine to reach the needed
object. In this architecture, the PCR also ensures that the human side always has an
updated interaction with the environment, supporting the user in the choice.
The paper is structured as follows. Section 2 discusses the working principle of the
architecture. Section 3 provides the experimental results and Sect. 4 concludes the paper.

2 The Smart Architecture


The proposed architecture relates three main actors to each other: human user,
assistive robot, and smart environment.
In the proposed case study, the first performer (i.e., the human user) is a bedridden
patient who wears a wireless EEG headset in front of a video terminal. The EEG headset
oversees the acquisition and transmission of the user’s cortical signals, while the
video terminal is used to guide the patient in the choice and to provide feedback
stimulation from the PCR. Figure 1.a) shows a labeled snapshot of the experimental
setup, where the EEG headset is identified by label 4 and the video terminal by label 3.
Data from the EEG device are then wirelessly sent to the PCR. This second architecture
performer (i.e., the PCR) embeds the BCI algorithm that oversees the cortical signals
interpretation and the unambiguous formulation of the request. The BCI algorithm is also
responsible for updating the video protocol whenever a brain-guided choice is made.
The PCR is always connected to the third performer of the architecture: the smart
environment. The latter consists of a set of markers and sensors distributed in the
environment to facilitate the navigation of the robot and to provide – if queried – an
always updated list of available services.

Working Principle. As a first step, the user wears the EEG headset as depicted in
Fig. 1.a. The video terminal screen is initially blank. To start interacting with the PCR,
the user should blink 3 times in less than 4 s. This functional sequence of eye
movements brings the PCR and the related BCI into the wake-up status. The video terminal
turns on and the BCI starts periodically providing a list of binary choices as per Fig. 1
(e.g., Left: “Bring me…”, Right: “Call…”). The user can activate a further nested binary
choice by slightly moving the index finger of the right or left hand according to the
choice.
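
The wake-up rule (3 blinks in less than 4 s) can be sketched as a sliding-window check over blink timestamps. The class and method names are hypothetical, and blink detection itself is assumed to be available as a stream of timestamps in seconds.

```python
from collections import deque


class WakeUpDetector:
    """Reports a wake-up when enough blinks fall within a sliding time window."""

    def __init__(self, blinks_required: int = 3, window_s: float = 4.0) -> None:
        self.blinks_required = blinks_required
        self.window_s = window_s
        self._times = deque()

    def on_blink(self, t_s: float) -> bool:
        """Register a blink at time t_s and return True if the rule is met."""
        self._times.append(t_s)
        # Drop blinks that are not strictly less than window_s before this one.
        while t_s - self._times[0] >= self.window_s:
            self._times.popleft()
        return len(self._times) >= self.blinks_required


if __name__ == "__main__":
    detector = WakeUpDetector()
    print([detector.on_blink(t) for t in (0.0, 1.5, 3.0)])
```

With blinks at 0.0, 1.5 and 3.0 s the third blink completes the sequence, whereas blinks at 0.0, 3.0 and 5.0 s would not, because the first one is already more than 4 s old.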
It must be specified that the BCI is based on a cortical activity named
movement-related potentials (MRPs), which is typically linked to movement planning. In
this respect, the video protocol is not intended to elicit any specific brain potential;
its only role is to inform the user of the currently available choices.
By moving the fingers, it is possible to unlock a path made up of nested binary
choices that leads to a final decision. This decision tree can embed a theoretically
unlimited number of final requests. The cost of adding new choices is the increased time
needed to reach the final decision.
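
The trade-off between the number of final requests and the time to reach one can be illustrated with a small nested structure. The tree below is a hypothetical fragment built from the choices named in the text ("Bring me…" leading to "Food" or "Drink", and "Call…"); the function names are invented, and a balanced tree is assumed for the depth estimate.

```python
def selections_needed(n_requests: int) -> int:
    """Binary selections needed to reach one of n leaves in a balanced tree."""
    if n_requests < 1:
        raise ValueError("at least one request is required")
    return (n_requests - 1).bit_length()  # equals ceil(log2(n_requests))


# Hypothetical menu: "L"/"R" stand for left-hand and right-hand finger movements.
MENU = {
    "L": {"L": "Food", "R": "Drink"},  # "Bring me..."
    "R": "Call",                       # "Call..."
}


def decide(tree, moves):
    """Follow a sequence of left/right selections down the nested choices."""
    node = tree
    for move in moves:
        node = node[move]
    return node


if __name__ == "__main__":
    print(decide(MENU, "LL"))
    print([selections_needed(n) for n in (2, 8, 9, 64)])
```

Doubling the number of final requests adds only one selection, but every selection costs at least one BCI decoding cycle, so the time to a decision grows with the depth of the tree.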

Fig. 1. Experimental setup of the BCI system. (a) Snapshot at time t = 0. Labels: #1 PCR (Pepper);
#2 User; #3 Video terminal; #4 EEG headset; #5 Finger movement by elicitation protocol (b)
Snapshot at time t = 420 ms. Label: #5 consequence of the BCI recognition of finger movement.

Figure 2 shows an example of HRI for a food request, based on a nested 2-choice
decision tree. More in detail, when the BCI is turned ON (BCI ON), it provides a binary
choice: Bring me… or Call… The red arrows represent the selected choices composing
the path that will lead to the request formalization. Once the left-hand movement (i.e.,
Bring me…) is recognized, the BCI proposes another choice: “Food” or “Drink”. In
this step, the user selection leads to the declaration of a variable named Mark ID,
which is used to navigate autonomously to the repository containing food (according
to the demo in Fig. 2). At this point, the user can choose a specific food (Choice –
Fig. 2) or ask for a list of the available food (List of Choices – Fig. 2). In the
latter case, the PCR queries the environment gateway, which collects all the sensor
outcomes. The environment transmits a list to the user’s video terminal and updates (if
needed) the nested decision tree of the PCR. Following the selected branch as per
Fig. 2, it is possible to request a Chocolate. The final leaf (decision) of the tree
leads to the definition of the image recognition ID (IR ID – Fig. 2). This ID is useful
to speed up the image recognition process when the object is manipulated and scanned to
find a match.
The request is uniquely formalized as a code that drives the PCR in its operations
(see Fig. 2). The example in Fig. 2 shows how every selection contributes to defining a
part of the unique code. The second part of the code (i.e. 112) identifies a specific
marker that creates a path to the selected repository (food or drink in the example). A
marker is a texturized picture; markers are distributed all over the environment to
create pre-set paths for the PCR autonomous navigation. More details will be provided in
Sect. 2.3.
Following these markers, the PCR autonomously navigates to the repository that
contains the selected object (Chocolate as per Fig. 2). Here, the PCR moves at the
repository position provided by the code (i.e., 3rd code field) and manipulate the related
object to scan it. The scanning embeds an image recognition process in which an IR
ID is generated for the picked object. Then, it is provided as feedback to the user. The
label should correspond to the last code field (IR ID- Fig. 2). Finally, the PCR goes back
to the patient’s proximity to deliver it. To asynchronously deactivate the BCI, the same
procedure that activated it can be used.
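
As an illustration of how the nested binary choices can be turned into a request code, the following Python sketch walks a small decision tree with a sequence of left/right finger movements and merges one or more code fields at every selection. The field names (`action`, `mark_id`, `position`, `ir_id`), the tree contents and all numeric values except the marker ID 112 are hypothetical placeholders, not the encoding actually used on the PCR.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One entry of the nested binary menu."""
    label: str
    # Code fields fixed when this entry is selected (hypothetical names).
    fields: Dict[str, int] = field(default_factory=dict)
    left: Optional["Node"] = None    # selected by a left-hand movement
    right: Optional["Node"] = None   # selected by a right-hand movement


def formalize_request(root: Node, movements: List[str]) -> Dict[str, int]:
    """Follow 'L'/'R' movements from the root and merge the code fields."""
    node = root
    code: Dict[str, int] = dict(root.fields)
    for move in movements:
        if move not in ("L", "R"):
            raise ValueError(f"unknown movement {move!r}")
        child = node.left if move == "L" else node.right
        if child is None:
            raise ValueError(f"no {move!r} branch below {node.label!r}")
        node = child
        code.update(node.fields)
    if node.left is not None or node.right is not None:
        raise ValueError("path ends before a final decision")
    return code


# Toy tree loosely following Fig. 2; all values but mark_id=112 are made up.
chocolate = Node("Chocolate", {"position": 3, "ir_id": 7})
crackers = Node("Crackers", {"position": 1, "ir_id": 4})
food = Node("Food", {"mark_id": 112}, left=chocolate, right=crackers)
drink = Node("Drink", {"mark_id": 113})
bring = Node("Bring me...", {"action": 1}, left=food, right=drink)
call = Node("Call...", {"action": 2})
root = Node("BCI ON", left=bring, right=call)

request = formalize_request(root, ["L", "L", "L"])
```

In this sketch every additional tree level costs one more finger movement, which is the timing increment discussed above.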

Fig. 2. HRI example of a food request to PCR based on a nested 2-choice decision tree.
Brain-Actuated Pick-Up and Delivery Service for Personal Care Robots 115

2.1 Devices and Sensors


EEG Acquisition Device. It has been proven [7, 9] that any voluntary movement is
physiologically preceded by a motor planning procedure that exploits specific brain activity
patterns (BAPs), mainly detectable in the central lobe of the brain, also called the
pre-motor area. Among these BAPs, the movement-related potentials (MRPs) play a main
role in planning the muscular sequencing. For this purpose, EEG data have been acquired
from 8 channels of a g.Nautilus Research headset by g.Tec (Fig. 1.a): T7, C3,
Cp1, Cp5, C4, T8, Cp2, Cp6, with Afz as GND and A2 as REF. In this application, EEGs
were recorded with 24-bit resolution at a 512 Hz sampling rate. Next, the signals are
re-referenced [10, 11] to the global average and band-pass filtered between 1 Hz and
35 Hz (8th order Butterworth). Once transmitted [12], the EEG data are subjected to an
online artifact rejection procedure by means of an approach named Riemannian Artifact
Subspace Reconstruction (rASR) [13]. It filters most of the physiological and
non-physiological artifact sources out of the useful EEG signal. The procedure exploits an
Independent Component (IC) Analysis approach, which allows the system to extract the
ICs related to the eye movements that are used for the BCI activation/deactivation.
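
The re-referencing step mentioned above can be sketched in a few lines of plain Python. The function below subtracts, at every time instant, the mean across channels from each channel (common average reference). It is only a minimal illustration on made-up values; the band-pass filter and the rASR stage are not covered, and the function name and data layout are assumptions.

```python
from typing import List


def common_average_reference(eeg: List[List[float]]) -> List[List[float]]:
    """Re-reference eeg[channel][sample] to the instantaneous channel mean."""
    n_channels = len(eeg)
    n_samples = len(eeg[0])
    if any(len(channel) != n_samples for channel in eeg):
        raise ValueError("all channels must have the same length")
    out = [[0.0] * n_samples for _ in range(n_channels)]
    for s in range(n_samples):
        mean = sum(eeg[c][s] for c in range(n_channels)) / n_channels
        for c in range(n_channels):
            out[c][s] = eeg[c][s] - mean
    return out


# Three channels, two samples of made-up values (microvolts).
raw = [[10.0, 4.0], [12.0, 6.0], [14.0, 2.0]]
car = common_average_reference(raw)
```

After re-referencing, the channel values at each instant sum to zero, which is a quick sanity check for the operation.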

Personal Care Robot. In this case study, the PCR is Pepper by SoftBank Robotics
[3]. For navigation purposes, the PCR has been equipped with an OV5640 RGB
camera placed on top of the head and an ASUS Xtion 3D sensor located in the forehead to
generate a real-time depth map.
Two sonars (front, rear) and 2 IR sensors (±45° with respect to the robot front
direction) are used for real-time obstacle detection.

Environment. Each repository containing foods or drinks has been equipped with a force
sensing resistor (FSR) load cell under each object. These sensors operate as switches providing
a food/drink presence flag to the PCR. The flags update a string in the PCR memory
containing the IDs of all the available goods.

2.2 The Brain Computer Interface


The BCI implemented on the PCR follows the main guidelines
introduced in our previous work [14]. Its working principle can be divided into two
main branches: an offline calibration and a real-time (online) classification. Both
computations are preceded by a trial extraction procedure.

Trial Extraction. At each finger movement (left or right), it is possible
to record a positive deflection of the cortical activity spread over the whole pre-motor area. This
deflection, known as Bereitschaftspotential (BP) [7, 9], reaches its peak at the movement
onset and then starts to be suppressed. For this reason, the BP peak has been used as
the BCI trigger.
Starting from the trigger onset, the EEG data, cyclically stored in a circular buffer,
are extracted, creating a trial. Each trial consists of an EEG time window 1.25 s long
(Ns = 648 samples), comprising 700 ms before the trigger onset and 550 ms after it.
The trial is extracted for all the channels in parallel [15].
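
The trial extraction described above can be sketched with a fixed-length circular buffer. The class below is a single-channel illustration, not the authors' implementation: `pre` and `post` are sample counts chosen by the caller (corresponding to the pre-trigger and post-trigger windows), and the trigger is passed in as a flag rather than detected from the BP peak. All names are hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional


class TrialExtractor:
    """Single-channel trial extraction from a circular buffer."""

    def __init__(self, pre: int, post: int) -> None:
        if pre < 0 or post < 1:
            raise ValueError("pre must be >= 0 and post >= 1")
        self.post = post
        # The deque drops the oldest sample automatically once it is full.
        self.buffer: Deque[float] = deque(maxlen=pre + post)
        self.remaining: Optional[int] = None  # samples still awaited after a trigger

    def push(self, sample: float, trigger: bool = False) -> Optional[List[float]]:
        """Store one sample; return a complete trial when one is ready."""
        self.buffer.append(sample)
        if trigger and self.remaining is None:
            self.remaining = self.post  # counts the trigger sample itself
        if self.remaining is not None:
            self.remaining -= 1
            if self.remaining == 0:
                self.remaining = None
                if len(self.buffer) == self.buffer.maxlen:
                    return list(self.buffer)
        return None


# Toy run: samples 0..9, trigger on sample 5, 3 samples before and 2 from the onset.
extractor = TrialExtractor(pre=3, post=2)
trials = []
for k in range(10):
    trial = extractor.push(float(k), trigger=(k == 5))
    if trial is not None:
        trials.append(trial)
```

The trial only becomes available once the post-trigger samples have arrived, which is why the buffer filling time appears as a fixed delay in the timing budget of Sect. 3.1. A multi-channel version would keep one such buffer per channel and share the trigger.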

Offline Calibration. During this calibration stage, the user moves the index fingers according
to a known pattern. This pattern is driven by a video protocol that indicates which
hand is to be moved and then gives the user a buzzer sound to trigger the movement.
At the end of the data collection, the system has two matrices: one for the left-hand
movements (LHM) and one for the right-hand ones (RHM), with
$\mathrm{L(R)HM} \in \mathbb{R}^{N_{ch} \times N_s \times N_t}$,
where $N_{ch}$ is the number of channels ($N_{ch} = 8$), $N_s$ is the number of samples per
trial and $N_t$ is the number of collected trials contributing to the calibration stage.
The trials composing each matrix are averaged, channel by channel, across the
trials, resulting in a matrix containing the 8 averaged waveforms,
$M \in \mathbb{R}^{N_{ch} \times N_s}$.
Each waveform is individually analyzed by a symbolization routine known as Local Binary
Patterning (LBP) [14, 15]. The LBP transforms a physical time series, such as a 1D EEG
data vector, into a corresponding binary matrix. The LBP exploits the amplitude differences
between two contiguous samples, assigning “1” for a growing trend and “0” otherwise. More
details about the application of the LBP method to 1D EEG data are provided in our previous
work [14]. The 2D matrices related to each single channel are appended, producing 2
binary matrices, one for each available movement. The resulting matrices are LHM
Mask and RHM Mask, with
$\mathrm{L(R)HM\ Mask} \in \mathbb{R}^{(b \cdot N_{ch}) \times N_s}$,
where b is the number of bits used
for the LBP routine time-series translation [14, 16]. In this application, b = 6 according
to [14].
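
The calibration steps above (trial averaging, LBP symbolization and stacking of the per-channel matrices) can be sketched as follows. The exact bit layout of the LBP routine is defined in [14] and is not reproduced here; this sketch assumes that column i of the binary matrix holds the b up/down bits of the b consecutive sample pairs starting at sample i, with flat padding at the end of the waveform. That layout and all function names are assumptions made for illustration.

```python
from typing import List

Matrix = List[List[int]]


def average_trials(trials: List[List[List[float]]]) -> List[List[float]]:
    """Average trials[trial][channel][sample] over the trial axis."""
    n_t = len(trials)
    n_ch = len(trials[0])
    n_s = len(trials[0][0])
    return [[sum(trials[t][c][s] for t in range(n_t)) / n_t for s in range(n_s)]
            for c in range(n_ch)]


def lbp_matrix(x: List[float], b: int = 6) -> Matrix:
    """Return a b x len(x) binary matrix for one waveform (assumed layout)."""
    n = len(x)
    padded = list(x) + [x[-1]] * b  # flat padding: trailing bits become 0
    rows = [[0] * n for _ in range(b)]
    for i in range(n):
        for k in range(b):
            # 1 for a growing trend between contiguous samples, 0 otherwise
            rows[k][i] = 1 if padded[i + k + 1] > padded[i + k] else 0
    return rows


def build_mask(avg_waveforms: List[List[float]], b: int = 6) -> Matrix:
    """Stack the per-channel LBP matrices into a (b * Nch) x Ns mask."""
    mask: Matrix = []
    for waveform in avg_waveforms:
        mask.extend(lbp_matrix(waveform, b))
    return mask


demo = lbp_matrix([0.0, 1.0, 0.0, 2.0, 3.0], b=2)
```

With 8 channels and b = 6, `build_mask` returns 48 rows of Ns bits each, matching the mask size stated in the text.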

Real Time Classification. Once the two masks (LHM Mask and RHM Mask) are
extracted, the system is ready for real-time classification.
In this operating mode, every index finger movement enables the trial extraction stage.
The extracted trial is, in this case, unlabeled. As per the above-mentioned procedure,
the Nch waveforms composing the trial are sent to the LBP routine, which extracts a binary
matrix compatible in size with both masks. The LBP-treated unlabeled observation
(UO) is compared element by element with both masks (LHM Mask and RHM
Mask). If the corresponding elements (e.g., UO (2,3) and LHM Mask (2,3)) are equal,
the corresponding element of a likelihood matrix related to the comparison is set to “1”,
otherwise to “0”. By counting the number of “1” in each comparison matrix it is possible
to find the degree of similarity between the UO and each of the two masks. This
leads to the final binary classification of the trial.
As shown in Fig. 1, the finger movement determines the specific user choice. Once
the last UO is labeled, basing the decision on the similarity degree, the command is sent
to the PCR, which starts the navigation routine.
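
The element-by-element comparison described above amounts to counting matching bits between the unlabeled observation and each mask. A minimal Python sketch follows; the tie-breaking rule (ties assigned to LHM) and the function names are assumptions, since the text does not specify them, and the matrices are toy values.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def similarity(uo: Matrix, mask: Matrix) -> int:
    """Count the positions where the observation and the mask agree."""
    if len(uo) != len(mask) or any(len(a) != len(b) for a, b in zip(uo, mask)):
        raise ValueError("observation and mask must have the same size")
    return sum(1 for row_u, row_m in zip(uo, mask)
               for u, m in zip(row_u, row_m) if u == m)


def classify(uo: Matrix, lhm_mask: Matrix, rhm_mask: Matrix) -> Tuple[str, int, int]:
    """Label the trial with the movement whose mask is more similar."""
    s_left = similarity(uo, lhm_mask)
    s_right = similarity(uo, rhm_mask)
    label = "LHM" if s_left >= s_right else "RHM"  # ties go to LHM (assumption)
    return label, s_left, s_right


lhm = [[1, 1, 0], [0, 1, 0]]
rhm = [[0, 0, 1], [1, 0, 1]]
uo = [[1, 1, 0], [0, 0, 0]]
result = classify(uo, lhm, rhm)
```

Because the comparison only involves equality tests and a count, its cost grows linearly with the mask size, which is consistent with the millisecond-level timings reported in Sect. 3.1.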

2.3 The Navigation and Object Manipulation Routines

PCR Navigation Routine. The PCR receives a code from the BCI system. The 2nd field
of the code determines the mark ID that should be recognized to proceed toward the
repository. Exploiting the RGB camera, the PCR scans the environment to find a marker
identifiable via the received ID. Figure 3.a shows, as an example, the recognition process
for the marker with ID 112. If no markers are detected, the PCR rotates by a preset
angle (i.e., 5°), then repeats the routine. If a marker is detected, the PCR adjusts
its alignment to place the marker inside a tolerance band (TOL red band - Fig. 3.a). Next,
it evaluates the distance between the camera position and the center of the marker
by means of the depth map (see Fig. 3.b). The PCR moves toward the marker safely
(turning on the collision avoidance systems). It then starts the routine again, excluding
the last marker from the computation. If no new markers are detected, the
final position has been reached and the robot is ready for the object manipulation.
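
The navigation routine can be read as a simple search-align-move loop. The sketch below makes that loop explicit against a stand-in robot class, so it can be run without hardware. The robot interface (`find_marker`, `rotate`, `align_to`, `distance_to`, `move_forward`), the marker IDs other than 112, the 1.5 m distance and the stopping rule (one full 360° turn without any new marker) are all assumptions introduced for illustration; the text does not state how many rotations are attempted before the final position is declared.

```python
from typing import List, Optional, Set


class FakeRobot:
    """Stand-in for the PCR: markers become visible one after another."""

    def __init__(self, path: List[int]) -> None:
        self.path = path          # marker IDs laid out along the route
        self.rotations = 0        # number of search rotations performed
        self.travelled = 0.0      # metres driven toward markers

    def find_marker(self, excluded: Set[int], wanted: Optional[int] = None) -> Optional[int]:
        for marker in self.path:
            if marker in excluded:
                continue
            # Only the next unvisited marker on the route is in view.
            return marker if wanted in (None, marker) else None
        return None

    def rotate(self, degrees: float) -> None:
        self.rotations += 1

    def align_to(self, marker: int) -> None:
        pass  # the real robot would centre the marker in the TOL band

    def distance_to(self, marker: int) -> float:
        return 1.5  # made-up depth-map reading in metres

    def move_forward(self, metres: float) -> None:
        self.travelled += metres


def navigate(robot, mark_id: int, step_deg: float = 5.0) -> List[int]:
    """Follow markers until a full turn reveals no new one."""
    visited: List[int] = []
    while True:
        wanted = mark_id if not visited else None  # first marker comes from the code
        marker, turned = None, 0.0
        while turned < 360.0:
            marker = robot.find_marker(set(visited), wanted)
            if marker is not None:
                break
            robot.rotate(step_deg)
            turned += step_deg
        if marker is None:
            return visited  # final position reached
        robot.align_to(marker)
        robot.move_forward(robot.distance_to(marker))
        visited.append(marker)


robot = FakeRobot([112, 201, 202])
route = navigate(robot, mark_id=112)
```

On the real platform, `move_forward` would run with the collision avoidance systems enabled, and `distance_to` would read the depth map at the marker center.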

Fig. 3. PCR navigation routine. (a) Marker identification. Left: marker outside the tolerance band
(TOL); Right: marker within the tolerance band (b) Depth map extraction and distance measure

Fig. 4. PCR object manipulation routines. Main steps of procedure and object scan

PCR Object Manipulation Routine. The first step of this routine, labeled as STAND in
Fig. 4, consists of a precise alignment of the center of the robot chest with the last marker.
Once the STAND-related alignment is completed, the manipulation procedure consists
of four main steps (Fig. 4 labels 2–5). Position 2 is used to avoid collisions while approaching
the repository. Position 3 consists of putting the hand on the object, slightly pressing
on it. Position 4 consists of closing the hand to grasp the object. Finally, the last frame
represents a procedure named SCAN. It consists of rotating the wrist to bring the object
in front of the RGB camera for the image recognition process. The PCR compares the
scanned image with the ones previously stored in its memory.

3 Experimental Results
3.1 BCI Performance: Accuracy and Timing

Four healthy subjects were enrolled to test the proposed architecture in vivo
[17, 18]. The subjects’ ages ranged from 24 to 27 years.
Real-time Validation. To validate the system in a real-life scenario, the subjects were
asked to formalize requests by completing the whole decision trees, stepping through
the nested binary choices.
On average, 74 ± 2 trials were extracted for each subject. The request patterns
were programmed to be equally distributed between the available choices (RHM
and LHM).
Defining the accuracy as the ratio between the number of correctly classified hand
movements and the total number of movements, the BCI system showed an overall
accuracy of 84.22 ± 0.73%. In the same context, Fig. 5 shows the confusion matrix
of the real-time validation for all the evaluated subjects. The worst accuracy was
recorded on Sub. 2 and 4 (i.e., 83.33%), while the best one was recorded
on Sub. 3 (i.e., 85.13%).

BCI Timing. Starting from the finger movement, the EEG transmission introduces a
delay of 15 ms. Next, 550 ms are dedicated to filling the buffer according to the trial
extraction phase. The application of the LBP routine to the UO takes 1.61 ± 0.22 ms,
while the UO versus L(R)HM Mask comparison is completed in 1.32 ± 0.29 ms. The
final decision based on the similarity degree assessment is reached in about 0.94 ±
0.07 ms. The extracted code is stored in the PCR memory, which sets up the movement
routine in about 315 ± 102 ms before starting the navigation.
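
The mean stage latencies listed above can be added up to obtain the overall delay between the finger movement and the start of the navigation, which is in line with the ~883 ms figure quoted in the Conclusions. The short script below does this sum and reports the share of each stage; the stage labels are descriptive names chosen here, and the standard deviations are ignored.

```python
# Mean latency of each stage in milliseconds, as reported in the text.
stages_ms = {
    "EEG transmission": 15.0,
    "buffer filling after the trigger": 550.0,
    "LBP routine on the UO": 1.61,
    "UO vs mask comparison": 1.32,
    "similarity-based decision": 0.94,
    "PCR movement routine setup": 315.0,
}

total_ms = sum(stages_ms.values())
shares = {name: value / total_ms for name, value in stages_ms.items()}

for name, value in stages_ms.items():
    print(f"{name:35s} {value:7.2f} ms  {100 * shares[name]:5.1f} %")
print(f"{'total':35s} {total_ms:7.2f} ms")
```

The three computational stages (LBP, comparison and decision) together take under 4 ms, so the budget is dominated by the post-trigger buffer filling and by the setup of the movement routine on the robot.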

Fig. 5. Confusion matrix of the real-time validation for all the evaluated subjects

3.2 PCR Routines Performance

On average, the real-time validation of the BCI allowed the formalization of about
6 ± 1 requests per subject. No errors in marker recognition during motion planning were
recorded during the laboratory tests.

Table 1 summarizes the PCR routine performance subject by subject (Sub. 1 to 4)
and on average (Avg.). Table 1 is divided into two main outcomes: navigation routine
results and object manipulation routine results. The first part shows the number of
total point-to-point movements carried out by the PCR to satisfy the requests. The number
of formalized requests (per subject) is shown in brackets. The “Odo. Error”
column reports the number of movements not successfully completed due to an error
in the odometry algorithm (typically not critical because the error is small). The
“Obstacle Avoidance” column refers to wrong obstacle avoidance management, which
leads to a totally wrong positioning of the PCR. Concerning the object manipulation
section, the “Success. Grasp” column summarizes the number of grasping actions
successfully completed with a positive image recognition match. The “No IR ID Recognition”
column refers to the number of objects that were correctly manipulated but with a
negative image recognition match. “Lost grasp” corresponds to manipulation procedures
that ended with an empty hand. Overall, out of a total of 192 PCR movements (over all the
subjects), 49 movements (25.5%) were not properly completed due to errors in the obstacle
avoidance routines and odometry shifts.

Table 1. PCR routine performance: navigation and object manipulation

Sub.   Navigation routine                        Object manipulation routine
       Movements     Odo. error   Obstacle       Success. grasp   No IR ID       Lost grasp
       (Requests)                 avoidance                       recognition
1      51 (7)        12           1              4                1              2
2      47 (6)        9            3              4                0              2
3      48 (6)        11           2              5                1              0
4      46 (5)        9            2              3                1              1
Avg.   192 (24)      41/192       8/192          16/24            3/24           5/24
                     (21.4%)      (4.2%)         (66.7%)          (12.5%)        (20.8%)

In the object manipulation context, out of 24 requests and the related manipulation and
scan procedures, 19 grasps were successfully completed (79.2%), 3 of which
(12.5% of the requests) ended with a negative image recognition match. In five cases
(20.8%) the object grasp was lost, resulting
in an empty hand.
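
The aggregate figures of Table 1 can be recomputed from the per-subject counts with a few lines of Python. The dictionary keys below are shorthand names introduced for this sketch; the counts are copied from the table.

```python
# Per-subject counts copied from Table 1.
TABLE_1 = {
    1: {"moves": 51, "requests": 7, "odo": 12, "obstacle": 1, "grasp_ok": 4, "no_ir": 1, "lost": 2},
    2: {"moves": 47, "requests": 6, "odo": 9, "obstacle": 3, "grasp_ok": 4, "no_ir": 0, "lost": 2},
    3: {"moves": 48, "requests": 6, "odo": 11, "obstacle": 2, "grasp_ok": 5, "no_ir": 1, "lost": 0},
    4: {"moves": 46, "requests": 5, "odo": 9, "obstacle": 2, "grasp_ok": 3, "no_ir": 1, "lost": 1},
}


def total(key: str) -> int:
    """Sum one column over the four subjects."""
    return sum(row[key] for row in TABLE_1.values())


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


moves, requests = total("moves"), total("requests")
failed_moves = total("odo") + total("obstacle")
completed = total("grasp_ok") + total("no_ir")  # object still in hand after the scan

summary = {
    "odo_error_pct": percent(total("odo"), moves),
    "obstacle_pct": percent(total("obstacle"), moves),
    "failed_moves_pct": percent(failed_moves, moves),
    "grasp_ok_pct": percent(total("grasp_ok"), requests),
    "no_ir_pct": percent(total("no_ir"), requests),
    "lost_pct": percent(total("lost"), requests),
    "completed_pct": percent(completed, requests),
}
```

Computing the combined failure rate from the raw counts, rather than by adding the two rounded percentages, avoids accumulating rounding error.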

4 Conclusions

In this paper, a smart architecture based on a brain driven HRI for bedridden patients’
care has been proposed. Although the architecture can cover several applications, in
this paper it has been exploited for a specific case study: implementing an automatic
pick-up and delivery service driven by brain signals. The proposed architecture exploits
the advantages of assistive robots and neural interfaces to realize desire-driven action
when physical interaction is impossible or critical.

Specifically, the user’s brain activity is acquired via an EEG headset and wirelessly
sent to a PCR. The latter embeds a BCI approach based on a symbolization routine, the
Local Binary Patterning. The nested binary choice approach introduced here allows the
BCI architecture to provide a theoretically infinite number of available requests,
overcoming the limit of other state-of-the-art BCIs that exploit a static number of choices.
Moreover, the coding procedure that uniquely formalizes the user’s request
overcomes the semantic misunderstandings typical of vocal commands, while
remaining a fully wireless approach. It also improves the portability of the architecture to
other actuation systems. In an assistive context, the continuous connection between the
PCR and the environment allows the user to be pervasively included in all the available
services. Experimental results from four healthy subjects demonstrated that the system
can decode EEG with an accuracy of 84.22% and actuate a planned motion via the PCR
in ~883 ms. Functional tests also showed that the robot successfully completed
79.2% of the requests formalized by the users.

References
1. Eurostat 2020 Report on: Disability statistics - elderly needs for help or assistance.
https://tinyurl.com/y5h2fu8r
2. Lee, S., Naguib, A.M.: Toward a sociable and dependable elderly care robot: design,
implementation and user study. J. Intell. Robot. Syst. 98, 5–17 (2020)
3. Pandey, A.K., Gelin, R.: A mass-produced sociable humanoid robot: Pepper: the first machine
of its kind. IEEE Robot. Autom. Mag. 25(3), 40–48 (2018)
4. Robert, L., et al.: A Review of Personality in Human–Robot Interactions (2020).
arXiv:2001.11777
5. Choi, B., Jo, S.: A low-cost EEG system-based hybrid brain-computer interface for humanoid
robot navigation and recognition. PLoS One 8(9), e74583 (2013)
6. Ma, J., et al.: A novel EOG/EEG hybrid human-machine interface adopting eye movements
and ERPs: application to robot control. IEEE Trans. Biomed. Eng. 62(3), 876–889 (2015)
7. Annese, V.F., De Venuto, D.: The truth machine of involuntary movement: FPGA based
cortico-muscular analysis for fall prevention. In: 2015 IEEE International Symposium on
Signal Processing and Information Technology (ISSPIT), Abu Dhabi, pp. 553–558 (2015).
https://doi.org/10.1109/ISSPIT.2015.7394398
8. De Venuto, D., Rabaey, J.: RFID transceiver for wireless powering brain implanted
microelectrodes and backscattered neural data collection. Microelectronics J. 45(12), 1585–1594
(2014). ISSN 0026-2692. https://doi.org/10.1016/j.mejo.2014.08.007
9. De Venuto, D., Annese, V.F., Mezzina, G., Defazio, G.: FPGA-Based embedded
cyber-physical platform to assess gait and postural stability in Parkinson’s disease. IEEE Trans.
Compon. Packag. Manuf. Technol. 8(7), 1167–1179 (2018).
https://doi.org/10.1109/TCPMT.2018.2810103
10. De Venuto, D., Ohletz, M.J., Ricco, B.: Automatic repositioning technique for digital cell
based window comparators and implementation within mixed-signal DfT schemes. In: Fourth
International Symposium on Quality Electronic Design Proceedings, San Jose, CA, USA,
pp. 431–437 (2003). https://doi.org/10.1109/ISQED.2003.1194771
11. De Venuto, D., Ohletz, M.J., Ricco, B.: Testing of analogue circuits via (standard) digital
gates. In: Proceedings International Symposium on Quality Electronic Design, San Jose, CA,
USA, pp. 112–119 (2002). https://doi.org/10.1109/isqed.2002.996709
12. De Venuto, D., Tio Castro, D., Ponomarev, Y., Stikvoort, E.: 0.8 µW 12-bit SAR ADC
sensors interface for RFID applications. Microelectronics J. 41(11), 746–751 (2010). ISSN
0026-2692. https://doi.org/10.1016/j.mejo.2010.06.019
13. Blum, S., et al.: A Riemannian modification of artifact subspace reconstruction for EEG
artifact handling. Frontiers Hum. Neurosci. 13, 141 (2019)
14. Mezzina, G., De Venuto, D.: Local binary patterning approach for movement related potentials
based brain computer interface. In: 2019 IEEE 8th International Workshop on Advances in
Sensors and Interfaces (IWASI), Otranto, Italy, pp. 239–244 (2019).
https://doi.org/10.1109/IWASI.2019.8791266
15. Khan, K.A., et al.: A hybrid Local Binary Pattern and wavelets based approach for EEG
classification for diagnosing epilepsy. Expert Syst. Appl. 140, 112895 (2020)
16. De Venuto, D., Annese, V.F., Mezzina, G.: An embedded system remotely driving mechanical
devices by P300 brain activity. In: Design, Automation & Test in Europe Conference &
Exhibition (DATE), Lausanne, pp. 1014–1019 (2017).
https://doi.org/10.23919/DATE.2017.7927139
17. De Venuto, D., Ohletz, M.J.: On-Chip Test for Mixed-Signal ASICs using Two-Mode
Comparators with Bias-Programmable Reference Voltages. J. Electron. Test. 17, 243–253 (2001).
https://doi.org/10.1023/A:1013377811693
18. De Venuto, D., Ohletz, M.J., Riccò, B.: Digital window comparator DFT scheme for
mixed-signal ICs. J. Electron. Test. 18, 121–128 (2002). https://doi.org/10.1023/A:1014937424827
Digital Techniques for Mechatronics, Energy and Critical Systems

Creation of a Digital Twin Model, Redesign of Plant Structure and New
Fuzzy Logic Controller for the Cooling System of a Railway Locomotive

Marica Poce¹(✉), Giovanni Casiello¹, Lorenzo Ferrari¹,
Lorenzo Flaccomio Nardi Dei², and Sergio Saponara¹

¹ DII, Università di Pisa, via G. Caruso 16, Pisa, Italy
marica.poce@gmail.com
² Trenitalia S.p.A., via S. Lavagnini, Firenze, Italy

Abstract. This paper presents a digital twin model, the redesign
of the plant structure and a new fuzzy logic controller for the cooling system
of the locomotive E402B. This work results from the master’s theses [1]
and [2]. The locomotive E402B cooling system is susceptible to malfunctions
that may cause the cooling tower engines to overheat during operation.
In collaboration with Trenitalia Engineering, we built a cooling
tower digital twin model and used it to propose structural improvements
for the cooling plant design and to redesign the control system. We use a
fuzzy logic control scheme because it is widely used in the temperature
control field and is easily applicable to the present case.

Keywords: Locomotive · Revamping · Model based · Control ·
Fuzzy · Digital twin · Redesign · Cooling system · Cooling tower

1 Introduction

The object of study is the locomotive E402B cooling tower (Fig. 1). Each locomotive
has two towers that cool the water radiator, the oil radiator and the
braking rheostat using air drawn in from outside by a centrifugal fan. The air
path is not optimal because the physical structure of the tower is quite tortuous.
The main problem of the tower is engine overheating.
To address the above issues, we build a cooling tower digital twin model
in Sect. 2. The model results are compared with data sheet values and real
experimental measurements (Sect. 3). We suggest four modifications to the plant
(Sect. 4) and a control system redesign to lower the engine temperature
(Sect. 5). Conclusions are drawn in Sect. 6.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 125–135, 2021.
https://doi.org/10.1007/978-3-030-66729-0_15
126 M. Poce et al.

Fig. 1. E402B cooling tower

2 Digital Twin Model


The model was built using Simulink in MATLAB R2019a.
The cooling tower model is shown in Fig. 2. Air enters through Inlet and passes
through the first and second radiators. These are micro-channel cross-flow heat
exchangers as studied in [3]. Simscape Fluids offers a Heat Exchanger (G-TL)
block that simulates the thermal dynamics of two fluids briefly kept in
contact through a thin conductive wall. Model parameters are set from the locomotive
maintenance manual data and using theoretical tools taken from [4–6]. In both
radiators the air flows from A1 to B1, while the thermal liquids flow from A2 to B2, pushed
by Pump. The heat injected into the liquids (input 1 and input 2) is modelled
with Tank (TL) and Controlled heat flow rate source. Tank (TL) implements a
tank whose constant pressure can be set by the user.
After the radiators, the air passes through a block called Flow resistance, which
represents the pressure loss before the fan that is not included in the radiators.
This loss is low and is modelled as a lumped element.
The same is done after the fan, which is modelled using its characteristic in nominal
conditions. The fan is driven by an Ideal Angular Velocity Source, which receives
the control input of the system. At this point the air is pushed into the rheostat and
then leaves the tower. The idea behind the rheostat model is to represent the two resistive
pack sections as the thermal inertias Thermal Mass R1 and R2. They are heated by
a power Q̇ (input 3) equal to the power that the rheostat has to dissipate. The thermal
masses exchange heat with the air through a Pipe (G) block, which exchanges heat between
fluid and wall by convection. The wall surface is equal to the rheostat fin
area. The main tower problem is engine overheating, so we use a thermal
model as in [8–10]. An energy balance is used to describe the heat exchange in the
electric motor. To a first approximation:
$$P\,\delta t = C\,\delta\theta + \frac{1}{R_{tot}}\,\theta\,\delta t \qquad (1)$$
Locomotive E402B Cooling System 127

Fig. 2. Cooling tower Simulink model



The energy dissipated at power P during a time δt is equal to the energy stored in the
engine (dependent on the engine’s thermal capacity C and on the increment δθ of the
over-temperature θ with respect to the environment) plus the energy exchanged by
conduction, convection and radiation. The total thermal resistance Rtot accounts for the
fact that heat is transmitted by conduction inside the engine and by convection and
radiation from the engine surface to the
environment. A thermal/electrical analogy is used and an equivalent electric circuit is built.
We use a thermal analysis report to size the circuit. The tower is currently
controlled by a set of rules that switches the fan speed between 40 and 60 Hz.
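
Dividing the energy balance (1) by δt gives the first-order equation C dθ/dt = P − θ/Rtot, whose steady-state over-temperature is P·Rtot and whose time constant is Rtot·C. The sketch below integrates this equation with a forward-Euler scheme and compares it with the closed-form solution. It is a lumped single-node illustration, not the sized equivalent circuit used in the model, and the numeric values of P, Rtot and C are hypothetical.

```python
import math
from typing import List


def simulate_overtemperature(p_w: float, r_tot: float, c_th: float,
                             dt: float, t_end: float, theta0: float = 0.0) -> List[float]:
    """Forward-Euler integration of C * dtheta/dt = P - theta / R_tot."""
    theta = theta0
    history = [theta]
    for _ in range(int(round(t_end / dt))):
        theta += dt * (p_w - theta / r_tot) / c_th
        history.append(theta)
    return history


def analytic_overtemperature(p_w: float, r_tot: float, c_th: float,
                             t: float, theta0: float = 0.0) -> float:
    """Closed-form solution of the same first-order model."""
    theta_inf = p_w * r_tot   # steady-state over-temperature
    tau = r_tot * c_th        # thermal time constant
    return theta_inf + (theta0 - theta_inf) * math.exp(-t / tau)


# Hypothetical values: 2 kW of losses, 0.05 K/W, 20 kJ/K (100 K rise, tau = 1000 s).
P, R_TOT, C_TH = 2000.0, 0.05, 20000.0
trace = simulate_overtemperature(P, R_TOT, C_TH, dt=1.0, t_end=10000.0)
```

In this simplified picture, halving the convective part of Rtot lowers the steady-state over-temperature in proportion to the reduction of the total resistance, which is the lever used by the ventilation improvement discussed in Sect. 4.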

3 Verifying the Digital Twin Model

In this section the digital twin model is validated against data sheet
values and real experimental measurements taken on board the locomotive E402B.
The inputs have been chosen using the nominal-condition manual data: environmental
temperature 40 °C, fan 60 Hz, input 1 120 kW, input 2 230 kW, input 3
750 kW. Target values are also given in the manual data.

Fig. 3. Radiators and rheostat temperatures.

Fig. 4. Pressure loss of water and oil vs nominal values.



Figure 3 shows the water radiator temperature, the oil radiator temperature and the
rheostat temperature. They are as expected. Indeed, the outlet liquid temperatures
reach the target values and the same goes for the rheostat temperature.
Figure 4 shows the pressure loss of water and oil vs nominal values.
The model reflects the description provided in the locomotive manual.
Figure 5 shows the air pressure loss in radiator and rheostat vs nominal values.
Figure 6 shows the air flow characteristics of the fan. The difference between
target and real values is due to the fact that the operating point given in the
datasheet is different from the flow/pressure characteristics measured by Trenitalia
and used in our model.
Finally, the engine temperatures when the fan is operating at 60 Hz are shown in
Fig. 7. The motor windings exceed the 180 °C threshold indicated by thermal class
(H).

Fig. 5. Air pressure loss in radiator and rheostat vs nominal values.

Fig. 6. Air flow characteristics of the fan.

Fig. 7. Engine temperatures (60 Hz)

4 Analysis of Plant Modification

Hereafter, the digital twin model is used to propose structural improvements
for the cooling plant design.
Air Conveyor Lifting. This modification had already been applied to the real system because
a lower engine current was found experimentally. The change was later abandoned
because of excessive rheostat overheating. The conveyor is a funnel-shaped component
that channels all the air from the radiators into the fan and separates the fan inlet from
the outlet. The modification consists of lifting the funnel by 2 cm. This means that the fan inlet
and outlet are no longer separated, so air recirculates to the inlet because the
pressure is higher at the fan outlet. The model is updated as shown in Fig. 8.

Fig. 8. This modification is implemented by adding a recirculation branch on the fan
and using a block that models the size of the opening and the pressure drop.

Engine Ventilation Improvement. The motor is placed in a small and poorly
ventilated compartment. The idea is to increase engine ventilation. This modification
can be modelled by reducing the convection resistance in the engine
thermal model. In our case this resistance is halved.
Pressure Loss Reduction. The ventilation shaft has right-angle bends, which cause
high pressure losses. Reducing these losses reduces the fan effort: a lower speed
is required to obtain the same volumetric flow rate. This change is
implemented by reducing the pressure losses in Flow resistance (G). In our case, the
pressure losses are halved.
Fan Position Changing. A fan behaves as a constant-volume machine: at constant
RPM, the volume of gas moved does not change with the density
of the gas itself. The fan is moved to the beginning of the circuit, before the radiators,
so that it works with cooler and denser air. In this way we can decrease the fan
speed while keeping the tower performance at the nominal level.
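
The benefit of moving the fan to the cold side can be illustrated with the ideal gas law: at a fixed volumetric flow rate, the mass flow rate scales with the air density, which is inversely proportional to the absolute temperature. The sketch below compares the two fan positions. The 40 °C inlet temperature is the nominal ambient value used in Sect. 3, while the 70 °C temperature after the radiators, the 5 m³/s flow rate and the use of standard atmospheric pressure are assumptions made only for this example.

```python
R_AIR = 287.05        # specific gas constant of dry air, J/(kg K)
P_ATM = 101325.0      # standard atmospheric pressure, Pa


def air_density(temp_c: float, pressure_pa: float = P_ATM) -> float:
    """Ideal-gas density of dry air in kg/m^3."""
    return pressure_pa / (R_AIR * (temp_c + 273.15))


def mass_flow(volume_flow_m3s: float, temp_c: float) -> float:
    """Mass flow rate in kg/s for a given volumetric flow rate."""
    return volume_flow_m3s * air_density(temp_c)


Q = 5.0                      # hypothetical volumetric flow rate, m^3/s
cool = mass_flow(Q, 40.0)    # fan before the radiators (ambient air at 40 degC)
warm = mass_flow(Q, 70.0)    # fan after the radiators (assumed 70 degC)
gain = cool / warm           # equals the ratio of the absolute temperatures
```

Under these assumed temperatures the cold-side fan moves roughly 10% more air mass at the same speed, so the same mass flow could be obtained at a correspondingly lower fan speed.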

4.1 Results

The following figures (Fig. 9 and Fig. 10) show the temperature trends of the main
tower components for all the cases described above, with input 1 at 120 kW, input 2 at
230 kW and input 3 at 750 kW.
With regard to the air conveyor modification, the improvement in engine temperature
is not significant compared with the deterioration in cooling performance. When
the ventilation of the engine is improved, the only change is a decrease in the motor
temperature. The pressure loss reduction is perhaps the most practicable
change: engine temperature improves without affecting tower performance.
Finally, moving the fan gives the best results. This change is not
feasible, but it gives an idea of how poorly the system is designed.

Fig. 9. Water and oil temperatures at radiators outlet for all plant modifications.

Fig. 10. Rheostat and motor windings temperatures for all plant modifications.

5 Fuzzy Control
The control currently in use gives the maximum fan speed when a condition
occurs. The tower is modelled as a MIMO system and its transfer functions
are not available. Furthermore, precise temperature regulation is not required:
the temperatures simply have to stay below limit thresholds. For this reason, traditional
control methods may be unnecessarily complicated to apply, as discussed in [11]. A fuzzy
control has been implemented because it is often used in temperature control,
as in [12].
Control inputs were redefined in fuzzy logic as in Fig. 11. The same is done
with the control output in Fig. 12. At this point inputs and output are related by
rules. Defuzzification is done using the centroid method. A real journey from
Milano to Genova has been simulated to evaluate the control. Traditional and
fuzzy controls are compared. The goal is to obtain a control that heats up the
engine as little as possible while keeping the tower performance unchanged.
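
The fuzzification, rule evaluation and centroid defuzzification steps can be sketched in plain Python. The example below is deliberately reduced to a single input (a liquid temperature) and a single output (the fan supply frequency in the 40 to 60 Hz range mentioned in Sect. 2). The membership functions, their breakpoints and the three rules are hypothetical; they do not reproduce the sets of Fig. 11 and Fig. 12 or the rule base of the actual controller.

```python
from typing import Callable, Dict


def ramp_down(x: float, a: float, b: float) -> float:
    """1 below a, 0 above b, linear in between."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def ramp_up(x: float, a: float, b: float) -> float:
    """0 below a, 1 above b, linear in between."""
    return 1.0 - ramp_down(x, a, b)


def triangle(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Hypothetical input sets on a liquid temperature in degC.
TEMP_SETS: Dict[str, Callable[[float], float]] = {
    "cool": lambda t: ramp_down(t, 60.0, 75.0),
    "warm": lambda t: triangle(t, 65.0, 80.0, 95.0),
    "hot": lambda t: ramp_up(t, 85.0, 100.0),
}
# Hypothetical output sets on the fan supply frequency in Hz.
SPEED_SETS: Dict[str, Callable[[float], float]] = {
    "low": lambda f: ramp_down(f, 40.0, 50.0),
    "medium": lambda f: triangle(f, 40.0, 50.0, 60.0),
    "high": lambda f: ramp_up(f, 50.0, 60.0),
}
RULES = {"cool": "low", "warm": "medium", "hot": "high"}


def fan_speed(temp_c: float, steps: int = 200) -> float:
    """Mamdani inference with centroid defuzzification on a sampled universe."""
    strength = {speed: TEMP_SETS[temp](temp_c) for temp, speed in RULES.items()}
    num = den = 0.0
    for i in range(steps + 1):
        f = 40.0 + 20.0 * i / steps
        # min for implication, max for aggregation over the rules
        mu = max(min(w, SPEED_SETS[s](f)) for s, w in strength.items())
        num += f * mu
        den += mu
    return num / den if den > 0.0 else 60.0  # fail safe: full speed
```

Unlike a rule set that switches between two fixed speeds, the defuzzified output varies smoothly with the input, so the fan can run at intermediate speeds when the temperatures allow it.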

Fig. 11. Control inputs redefined in fuzzy logic. The shape of the rheostat temperature
sets is due to the sensor type: it only indicates whether the temperature has exceeded
500 °C or 700 °C.

Fig. 12. Fan speed in fuzzy logic. It is the control output.

The results are shown in the figures below. Fig. 13 shows the output control
signal and the motor windings temperature. With the new fuzzy control, the
maximum engine temperature is about 20 °C lower.

Fig. 13. Output control signal (motor speed) and motor windings temperature using
traditional and fuzzy control.

Fig. 14. Water and oil temperatures at radiators outlet using traditional and fuzzy
control.

Figure 14 clearly shows that the water and oil temperatures at the radiator outlets do not
change significantly. The rheostat temperature is slightly higher, but this is not
alarming because the rheostat input power in the simulation is much greater than
in the real case.

6 Conclusion
In conclusion, the use of a model-based approach has brought the expected
results. We build a cooling tower digital twin model of the E402B cooling tower
in collaboration between University of Pisa and Trenitalia. Thanks to the built
model, the system has been studied and this has made it possible to try and
evaluate several plant modification proposals without physically realizing them.
These modifications have reduced the engine overheating problem but not all are
actually realizable. The most appreciated change was the pressure loss reduction.
It is not very invasive and it does not affect cooling performance.
The control system modification has also offered an interesting idea. The
control system in use is very basic, and making it more sophisticated can bring big
improvements. Fuzzy logic control is easy to implement, which is why it is widely
used for thermal control, and it fits our problem well. The results in terms of motor
overheating are very encouraging, while cooling performance is comparable.
Future research includes a more detailed study of the changes that Trenitalia
finds interesting, with further information on the system. The results and the
company encourage investment in improving the control system and reducing pressure
losses, so we are moving in this direction. This seems to be the best strategy
taking into account feasibility, cost and results. It is important to note that all
the changes can also be combined.

References
1. Poce, M., Saponara, S., Ferrari, L., Flaccomio Nardi Dei, L.: Creazione di un
modello digital twin, riprogettazione strutturale e nuovo controllo fuzzy logic di
un sistema di raffreddamento per locomotive E402B. UNIPI, Master thesis (2020)
2. Casiello, G., Saponara, S., Ferrari, L., Flaccomio Nardi Dei, L.: Progettazione
model-based di sistemi di controllo e di parti strutturali termomeccaniche per
locomotive elettriche in applicazioni ferroviarie. UNIPI, Master thesis (2020)
3. Tuckerman, D.B., Pease, R.F.W.: High-performance heat sinking for VLSI. IEEE
Electron Device Lett. 2, 126–129 (1981)
4. Thulukkanam, K.: Heat Exchanger Design Handbook. Dekker Mechanical Engi-
neering. CRC Press, Boca Raton (2000). https://books.google.it/books?id=
G52EfFF4uQYC
5. Stephan, P., Kabelac, S., Kind, M., Martin, H., Mewes, D., Schaber, K.: VDI Heat
Atlas. Springer (2010)
6. Shah, R., Sekulic, D.: Fundamentals of Heat Exchanger Design. Wiley, Hoboken
(2003). https://books.google.it/books?id=beSXNAZblWQC
7. Bayer, A.G.: Bayer silicones baysilone fluids M. https://www.dcproducts.com.au/
documents/BayerBaysiloneFluidsBrochure.pdf
8. Zubenko, D., Petrenko, A., Dulfan, S.: Investigation of the heating processes and
temperature field of the frequency-controlled asynchronous engine based on math-
ematical models. EUREKA Phy. Eng. 5, 64–72 (2019)
9. Henao, H., Capolino, G.-A., Fernandez-Cabanas, M., Filippetti, F., Bruzzese, C.,
Strangas, E., et al.: Trends in fault diagnosis for electrical machines: a review of
diagnostic techniques. IEEE Ind. Electron. Mag. 8(2), 31–42 (2014)

10. Pedra, J.: Estimation of typical squirrel-cage induction motor parameters for
dynamic performance simulation. IEE Proc. Gener. Trans. Distrib. 153(2), 137–
146 (2006)
11. Gad, A., Farooq, M.: Application of fuzzy logic in engineering problems. In: IECON
2001. The 27th Annual Conference of the IEEE Industrial Electronics Society 2001,
vol. 3, pp. 2044–2049 (2001)
12. Tobi, T., Hanafusa, T.: A practical application of fuzzy control for an air-
conditioning system. Int. J. Approximate Reasoning 5, 331–348 (1991)
HDL Code Generation from SIMULINK
Environment for Li-Ion Cells State of Charge
and Parameter Estimation

Mattia Stighezza, Valentina Bianchi(B) , and Ilaria De Munari

Dipartimento di Ingegneria e Architettura, Università di Parma, Parco Area delle Scienze 181/a,
43124 Parma, Italy
{mattia.stighezza,valentina.bianchi,ilaria.demunari}@unipr.it

Abstract. Monitoring the State of Charge of battery cells is necessary for all the most
common batteries to avoid damage and to extend lifetime. The State of Charge must be
estimated since it cannot be directly measured on the cell. Among all the techniques
developed, the Equivalent Circuit Model approach is one of the most interesting; it is
based on modeling the behavior of the cell with electrical components. However,
real-time parameter estimation is necessary because the parameters change during
the battery life. FPGAs are a first choice for these applications thanks to their flexibility
and hardware reconfigurability. Traditionally, an FPGA is configured with synthesized
Hardware Description Language (HDL) code, but this process can be time consuming
depending on the model complexity. In this paper, a Simulink model-based design
for Li-Ion cell parameter identification is presented. This approach, together
with the HDL Coder Simulink tool, ensures higher code portability and a short time to
market.

1 Introduction
The rapid growth in the last decade of electric vehicles in the automotive scenario
involves wide use of electric motors and batteries. Researchers are continuously trying
to improve the performance of the batteries in terms of energy stored and lifespan. This
can be done by changing chemical composition and improving the battery management
during charging and discharging cycles. Nowadays, the most common chemistries for
battery cells are based on Lithium-Ion and Lithium-Polymer [1]. Developing accurate
and sophisticated Battery Management Systems (BMS) could improve the battery
lifecycle through the monitoring of suitable battery cell indicators. The most
important indicators are the State of Charge (SoC) and the State of Health (SoH);
monitoring them can extend the lifespan and prevent early turn-off. An optimal BMS
should be able to monitor each cell of a battery pack to obtain an accurate estimate of
every single SoC and SoH.
Indicators such as SoC cannot be measured directly but need to be estimated through
voltage and current sensing. This must be done in real time [2] since they change during
battery lifetime. The SoC estimation can be performed through various methods [2, 3].
The most common technique, which guarantees higher accuracy and lower complexity,

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 136–143, 2021.
https://doi.org/10.1007/978-3-030-66729-0_16

is based on Equivalent Circuit Models (ECM) [4]. Here, the battery cell is modeled
as an electrical circuit composed of resistors and capacitors. A good fit of the cell
behavior with these components allows an accurate SoC estimation.
A real-time estimation with sufficient accuracy requires a lot of computational power
[5, 6]. Different implementations based on microcontroller systems have been presented
in the literature [7–9]. Recently, solutions exploiting the intrinsic parallelism of FPGA
architectures [10, 11] have been proposed. On an FPGA, multiple instances of the same
algorithm can be implemented, making it possible to accurately estimate the state of charge
of each cell composing a battery module (the number of cells depends on the application,
but is normally a few units).
Traditionally, an FPGA is programmed by writing Hardware Description Language
(HDL) code, testing it and finally configuring the device, but this process is typically
time consuming [12, 13]. Recently, a new approach has been successfully proposed
[14–16], based on Model-Based Design with software such as MATLAB/Simulink, and in
particular its dedicated tool HDL Coder, which allows automatic HDL code generation from
Simulink schemes. This enables rapid system development with a better time-to-market.
Moreover, MathWorks guarantees that the generated HDL code is compliant with industrial
standards.
The main goal of our work is to design a model-based version of the state-of-the-art
ECM algorithm, generating HDL code using the HDL Coder tool. The main advantage of
this approach is complete compliance with both Intel and Xilinx FPGAs, and hence
greater code portability and generality. Moreover, using native Simulink blocks and
libraries [17] makes the process independent of vendor-specific
tools (e.g. Intel DSP Builder), which are usually sold as add-ons to the synthesis tools.
The paper is organized as follows. In Sect. 2 related works are discussed while in
Sect. 3 the designed architecture is introduced. In Sect. 4 simulation results are shown,
then in Sect. 5 conclusions are drawn.

2 Related Works

In the literature, online cell parameter estimation is usually performed through the
application of an Extended Kalman Filter (EKF) [18] or by using a Moving Window
Least Square (MWLS) estimation method, as in [19]. In [20] a comparative study was
made, showing that both methods have good accuracy in different situations. The MWLS
algorithm is less accurate if there are too many non-linearities in the system, whilst in
that case the EKF is more stable but computationally demanding. Most of the algorithms are
implemented on conventional microcontrollers.
In [21], instead, an FPGA-synthesizable ECM model that estimates the SoC is implemented
in MATLAB/Simulink and exported using HDL Coder, but it uses Look-Up
Tables (LUT) to store the parameter values. Hence, an offline phase is required in order
to estimate the correct parameters beforehand.
In [22] a Mixed Algorithm (MA) is implemented, which combines an ECM model
and the Coulomb Counting algorithm to estimate the SoC. The MA is realized in Simulink,
updating the parameters through the MWLS algorithm. The model is then exported to HDL
code using the Intel DSP Builder tool.

In [23] the same methodology is presented, with the addition of an EKF implementation.
Using a more general code generation tool such as Simulink HDL Coder could help to
produce vendor-independent code, improving portability [17].

3 Architecture
The Simulink model of the MA is based on [22], while the algorithm implemented in this
work to execute the MWLS is described in detail in [19, 24].
In Fig. 1, a scheme of the actual battery cell is represented. The actual battery voltage
VOC can be estimated from the measured cell Terminal Voltage VT and the voltage
drop VRC . The main block estimates the SoC by integrating the charging/discharging
battery current with the Coulomb Counting algorithm [2]. This is done by using a Discrete
Time Integrator block. Then, the estimated SoC is converted into an estimated VOC
through a characteristic previously computed offline and stored in a LUT. The difference
between the estimated VOC and the voltage obtained by the ECM, VRC , is then compared
to the measured VT . The computed error is finally used to update the Coulomb Counting
algorithm in order to produce a better SoC estimate.
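
The loop described above can be sketched in Python as follows. This is only an illustrative reading of the text, not the Simulink model of [22]: the sign convention (discharge current positive), the forward-Euler update of the RC branch, the proportional correction `gain` and the contents of the VOC-SoC table are all assumptions introduced for the example.

```python
def interp_lut(x, table):
    """Linearly interpolate y(x) in a table of (x, y) pairs sorted by x."""
    if x <= table[0][0]:
        return table[0][1]
    for (x0, y0), (x1, y1) in zip(table, table[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return table[-1][1]


def mixed_soc_step(soc, v_c1, i_load, v_t_meas, dt, capacity_as, ecm, gain, ocv_table):
    """One update of a Coulomb counter corrected by the terminal-voltage error.

    ecm is (R0, R1, C1); capacity_as is the cell capacity in ampere-seconds.
    """
    r0, r1, c1 = ecm
    # Coulomb counting (discrete-time integrator); discharge current is positive.
    soc = soc - i_load * dt / capacity_as
    # Forward-Euler update of the voltage across the R1-C1 branch.
    v_c1 = v_c1 + dt * (i_load / c1 - v_c1 / (r1 * c1))
    # Terminal voltage predicted from the LUT and the total ECM voltage drop.
    v_t_model = interp_lut(soc, ocv_table) - (r0 * i_load + v_c1)
    # Hypothetical proportional correction driven by the voltage error.
    error = v_t_meas - v_t_model
    soc = min(1.0, max(0.0, soc + gain * error))
    return soc, v_c1
```

With `gain` set to zero the function reduces to plain Coulomb counting; a positive gain pulls a wrongly initialized SoC towards the value that is consistent with the measured terminal voltage.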

Fig. 1. Thevenin ECM [22]

The second subsystem deals with real-time ECM parameter identification by applying
the MWLS algorithm. It needs L samples of VT , VOC and IL , where L is a parameter
dependent on the sampling time. In the designed Simulink scheme this is accomplished
through Serial-In Parallel-Out Shift Registers implemented as a series of Delay
Simulink blocks. The time step between windows has been set to 30 s: over this interval
the SoC variation, and hence its effect on the cell parameters, is negligible (at a 1 C
rate, the SoC changes by about 0.8% in 30 s). The MWLS block is enabled by a properly
configured HDL Counter block. The output vectors form the Data matrix of the MWLS algorithm [19].
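
A software analogue of this windowing and fitting stage is sketched below. It is not the formulation of [19]: the regression form y[k] = a1·y[k−1] + b0·i[k] + b1·i[k−1], with y = VOC − VT, is the generic discrete-time first-order model and is assumed here for illustration, and the class and function names are hypothetical. A `deque` with a fixed maximum length plays the role of the shift registers.

```python
from collections import deque


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            raise ValueError("window is not informative enough")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


class MovingWindowLS:
    """Keep the last `length` samples and fit [a1, b0, b1] by least squares."""

    def __init__(self, length):
        self.v_drop = deque(maxlen=length)  # y = VOC - VT
        self.i_load = deque(maxlen=length)

    def push(self, v_oc, v_t, i_load):
        self.v_drop.append(v_oc - v_t)
        self.i_load.append(i_load)

    def ready(self):
        return len(self.v_drop) == self.v_drop.maxlen

    def estimate(self):
        y = list(self.v_drop)
        i = list(self.i_load)
        # Data matrix rows: [y[k-1], i[k], i[k-1]]; targets: y[k].
        rows = [[y[k - 1], i[k], i[k - 1]] for k in range(1, len(y))]
        targets = y[1:]
        # Normal equations (A^T A) theta = A^T y.
        ata = [[sum(r[p] * r[q] for r in rows) for q in range(3)] for p in range(3)]
        atb = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(3)]
        return solve_linear(ata, atb)
```

The normal-equation solve is where a matrix inversion appears in a hardware implementation; mapping the fitted coefficients back to the ECM parameters depends on the discretization used and is left out of this sketch.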
The overall MA model, comprising the MWLS Parameters Identification block and
the block that executes the SoC estimation, is shown in Fig. 2, while the MWLS algorithm
scheme developed in Simulink is reported in Fig. 3.
A Transpose block and Matrix Multiply blocks have been applied to the Data matrix
[19]. A matrix inversion is necessary, but the Simulink standard block for inverting

Fig. 2. System block diagram

Fig. 3. MWLS Algorithm presented in [19] modeled in Simulink

matrices is not supported by HDL Coder. Hence, it has been designed with other standard
Simulink blocks according to the Gauss-Jordan Elimination algorithm [25]. It is based
on row swaps and arithmetic operations, modeled inside the Matrix Inverse subsystem through
Multiply-Add HDL Coder blocks, which are specifically designed to improve hardware
performance by mapping to DSP slices on the FPGA [26]. During HDL code generation, HDL Coder
configures these blocks so that the synthesis tool can map them to the DSP units, if available on
the targeted hardware [27]. This means that the model is not vendor specific and is
compatible with any properly chosen hardware. Then, by modeling the formulas of [24] inside
the Parameters Extraction subsystem, the three ECM parameters are obtained. After a check
performed to avoid a bad initialization while the first output is not yet ready, the parameters
are used to update the ECM circuit (Fig. 1).
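
For reference, Gauss-Jordan inversion can be sketched in a few lines of Python. This is a generic floating-point version with partial pivoting, given only to show the row swaps and multiply-add row updates the text refers to; it is not the fixed-point Simulink subsystem, and the singularity threshold is an arbitrary choice.

```python
def gauss_jordan_inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [
        [float(v) for v in row] + [1.0 if r == c else 0.0 for c in range(n)]
        for r, row in enumerate(matrix)
    ]
    for col in range(n):
        # Row swap: bring the largest remaining pivot candidate to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the diagonal entry becomes 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Multiply-add: eliminate the column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    # The right half of [I | A^-1] is the inverse.
    return [row[n:] for row in aug]
```

Each elimination step is a multiply followed by a subtraction, which is the kind of operation that maps naturally onto DSP slices.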
All the operations are performed in fixed-point precision. This can lead to a simpler
hardware architecture, but it requires converting the whole system from the Simulink
default double precision to fixed-point precision. A signed 32-bit word length with
21 fractional bits was chosen. It can represent data in the range from −1024 to just under
1024 with a resolution of about 4.8 × 10^−7 (2^−21): these values are compliant with the
most common battery packs.
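
The range and resolution of this format can be checked with a short Python sketch. Round-to-nearest conversion and saturation on overflow are assumptions made for the example; the rounding and overflow modes of the Simulink model are not stated in the text.

```python
WORD_BITS = 32
FRAC_BITS = 21
SCALE = 1 << FRAC_BITS                   # 2**21 steps per unit
RAW_MIN = -(1 << (WORD_BITS - 1))        # most negative stored integer
RAW_MAX = (1 << (WORD_BITS - 1)) - 1     # most positive stored integer


def to_fixed(x):
    """Quantize a real value to the stored integer (round to nearest, saturate)."""
    raw = round(x * SCALE)
    return max(RAW_MIN, min(RAW_MAX, raw))


def to_float(raw):
    """Convert a stored integer back to its real value."""
    return raw / SCALE


def fixed_mul(a, b):
    """Multiply two stored integers and rescale (floor shift, saturate)."""
    return max(RAW_MIN, min(RAW_MAX, (a * b) >> FRAC_BITS))
```

Here `to_float(RAW_MIN)` is exactly −1024, `to_float(RAW_MAX)` is 1024 − 2^−21, and one least significant bit corresponds to `1 / SCALE`, roughly 4.77 × 10^−7.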

4 Results
For a first validation, the main block, which estimates the SoC through the MA, has been
verified assuming constant cell parameters. Through a collaboration with the University of
Modena, a Panasonic 18650 Li-Ion battery cell was made available. A first offline
characterization was performed using Hybrid Pulse Power Characterization [28] to obtain the
VOC -SoC characteristic and mean values of the cell parameters. Data were sampled by a Labjack
T7 DAQ at 2 Hz. The resulting parameters were R0 = 0.0613 Ω, R1 = 0.0351 Ω and C1 =
1521 F. These cell parameters were used in a Simulink battery model which outputs
the voltage response to a current input. These voltage and current data were then used in a
Simulink simulation, which successfully achieved a SoC estimation.
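
A battery model of this kind can be approximated in a few lines of Python. The sketch below is not the Simulink model used by the authors: it assumes a constant open-circuit voltage (the 3.7 V default in the usage note is a placeholder), a discharge-positive current convention and a zero-order hold on the current between samples.

```python
import math


def simulate_thevenin(currents, dt, r0, r1, c1, v_oc):
    """Terminal voltage of a first-order Thevenin ECM for a sampled current profile."""
    # Exact discretization of the R1-C1 branch for piecewise-constant current.
    alpha = math.exp(-dt / (r1 * c1))
    v_c1 = 0.0
    v_t = []
    for i_load in currents:
        # Output uses the branch voltage at the start of the sample interval.
        v_t.append(v_oc - r0 * i_load - v_c1)
        v_c1 = alpha * v_c1 + r1 * (1.0 - alpha) * i_load
    return v_t
```

For example, `simulate_thevenin([1.0] * 2000, 0.5, 0.0613, 0.0351, 1521.0, 3.7)` applies a 1 A step at the 2 Hz sampling rate; the voltage drops immediately by R0·I and then relaxes by a further R1·I with a time constant R1·C1 of about 53 s.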
Hence, this block has been converted to HDL code with the HDL Coder tool and
tested in the Xilinx Vivado Design Suite. A non-target-specific VHDL code generation has
been performed, highlighting that the code is not tied to a specific vendor. This makes it
possible to switch to specific hardware in a few steps, letting HDL Coder itself optimize the
code for the specified board. Synthesis for a Digilent Nexys 4 DDR Development
Board, which is based on a Xilinx Artix-7 FPGA, resulted in 22.76% of the Slice LUTs being
occupied, as shown in Table 1.

Table 1. Area occupation from synthesis elaboration

Block                   Slice LUTs utilization (%)   Slice Registers (%)    DSPs utilization (%)
SoC Estimation Block    14427/63400 (22.76%)         196/126800 (0.15%)     8/240 (3.33%)

Since assuming constant cell parameters is too strong an approximation for an accurate SoC
estimation, the MWLS Parameters Identification block has also been realized. The same
voltage and current data as for the main block were used for both the Simulink and the Vivado
behavioral simulations. The parameter identification was successfully obtained in
Simulink, and the MWLS algorithm model was then converted to VHDL code with
HDL Coder.
Once the code generation was completed, the generated VHDL code was
tested in the Xilinx Vivado Design Suite.
The behavioral simulation in Fig. 4 shows that the generated code performs a parameter
estimation with a time step of 30 s, as desired. Comparing the Vivado and Simulink
results, a maximum percentage error on the order of 10^−4 was obtained, as
visible in Fig. 5. Such an error is acceptable in this application because an error of
this magnitude has a negligible effect on the SoC estimation.

Fig. 4. Behavioral simulation of the HDL code performed in Vivado

Fig. 5. Percentage error between HDL and Simulink estimations

5 Conclusions
In this paper, a parameter identification algorithm to estimate the SoC of a battery cell
was successfully realized in the Simulink environment, supporting automatic HDL code
generation through a model-based design approach. ECM and Coulomb Counting have
been exploited in a Simulink model for a Mixed Algorithm capable of SoC estimation.
This model has been successfully verified through both Simulink and Vivado simulations.
In order to enhance its estimation performance, a cell parameter estimation model
implementing an MWLS algorithm has been realized in Simulink. Preliminary results show the
correct algorithm behavior and full model compatibility with the HDL Coder tool.
The whole model is based on state-of-the-art models and is portable to FPGAs
of different vendors, thanks to a general approach not tied to a vendor-specific tool.
Full compatibility with HDL Coder has been confirmed, and results based on software
simulation show an acceptable accuracy in online cell parameter prediction. It is
therefore reasonable to expect that the model can be tested successfully once it is
programmed on an FPGA.

Acknowledgements. The authors gratefully acknowledge the support of the University of Modena,
and in particular Luca Dallara, Michele Franceschetti, Marco Mostarda, and Prof. Giovanni
Franceschini.

References
1. Yong, J.Y.: A review on the state-of-the-art technologies of electric vehicle, its impacts and
prospects. Renew. Sustain. Energy Rev. 49, 365–385 (2015)
2. How, D.N.T., Hannan, M.A., Lipu, M.S.H., Ker, P.J.: State of charge estimation for lithium-ion
batteries using model-based and data-driven methods: a review. IEEE Access 7, 136116–
136136 (2019)
3. Zhang, R., et al.: State of the art of lithium-ion battery SOC estimation for electrical vehicles.
Energies 11(7), 1820 (2018)
4. He, H., Xiong, R., Fan, J.: Evaluation of lithium-ion battery equivalent circuit models for
state of charge estimation by an experimental approach. Energies 4(4), 582–598 (2011)
5. Osornio-Rios, R.A., Romero-Troncoso, R.D.J., Morales-Velazquez, L., De Santiago-Perez,
J., Rivera-Guillen, R.D.J., Rangel-Magdaleno, J.D.J.: A real-time FPGA based platform
for applications in mechatronics. In: Proceedings - 2008 International Conference on
Reconfigurable Computing and FPGAs, ReConFig 2008, pp. 289–294 (2008)
6. Al-Mahmood, A., Opoku, M.: A study of FPGA-based system-on-chip designs for real-time
industrial application. Int. J. Comput. Appl. 163(6), 9–19 (2017)
7. Trinandana, G.A., Pratama, A.W., Prasetyono, E.: Real time state of charge estimation for
lead acid battery using artificial neural network. In: 2020 International Seminar on Intelligent
Technology and Its Applications (ISITIA) (2020)
8. Mazzi, Y., Sassi, H.B., Errahimi, F., Es-Sbai, N.: State of charge estimation using extended
kalman filter. In: 2019 International Conference on Wireless Technologies, Embedded and
Intelligent Systems (WITS) (2019)
9. Lin, C.H., Hung, M.H., Wang, C.M., Ho, C.Y.: A microcontroller-based fast charger with state-
of charge estimation for LiCoO2 battery. In: 2013 1st International Future Energy Electronics
Conference (IFEEC) (2013)
10. Monmasson, E., Idkhajine, L., Naouar, M.W.: FPGA-based controllers. IEEE Ind. Electron.
Mag. 5(1), 14–26 (2011)
11. Bianchi, V., Savi, F., De Munari, I., Barater, D., Buticchi, G., Franceschini, G.: Minimization of
network induced jitter impact on FPGA-based control systems for power electronics through
forward error correction. Electronics 9(2), 281 (2020)
12. Monmasson, E., Cirstea, M.N.: FPGA design methodology for industrial control systems - a
review. IEEE Trans. Ind. Electron. 54(4), 1824–1842 (2007)
13. Van Beek, S., Sharma, S., Prakash, S.: Four best practices for prototyping MATLAB and
simulink algorithms on FPGAs. Verification Horiz. 8(2), 49–53 (2012)
14. Ghodhbani, R., Horrigue, L., Saidani, T., Atri, M.: Fast FPGA prototyping based real-time
image and video processing with high-level synthesis. Int. J. Adv. Comput. Sci. Appl. 2,
108–116 (2020)
15. Bassoli, M., Bianchi, V., De Munari, I.: A model-based design floating-point accumulator.
case of study: FPGA implementation of a support vector machine kernel function. Sensors
(Switzerland) (5), 1362 (2020)
16. Bianchi, V., De Munari, I.: A modular Vedic multiplier architecture for model-based design
and deployment on FPGA platforms. Microprocess. Microsyst. 76, 103106 (2020)

17. Kintali, K., Gu, Y., Cigan, E.: Model-based design using simulink, HDL coder, and
DSP builder for Intel FPGAs. https://it.mathworks.com/content/dam/mathworks/tag-team/
Objects/m/78706_92161v00_HDLCoderInterface_WhitePaper_final.pdf. Accessed 08 Jan
2021
18. Wang, L.: Research on improved EKF algorithm applied on estimate EV battery SOC (2010).
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5448581. Accessed 09 Feb 2020
19. Poloei, F., Akbari, A., Liu, Y.F.: A moving window least mean square approach to state of
charge estimation for lithium ion batteries. In: Proceedings - 2019 IEEE 1st Global Power,
Energy and Communication Conference, GPECOM 2019, pp. 398–402 (2019)
20. Baronti, F., et al.: Parameter identification of Li-Po batteries in electric vehicles: a comparative
study. In: IEEE International Symposium on Industrial Electronics (2013)
21. Debreceni, T., Szabó, P., Balázs, G.G., Varjasi, I.: FPGA-synthesizable electrical battery cell
model for high performance real-time algorithms. Period. Polytech. Electr. Eng. Comput. Sci.
60(3), 163–170 (2016)
22. Morello, R., Di Rienzo, R., Baronti, F., Roncella, R., Saletti, R.: System on chip battery state
estimator: E-bike case study. In: IECON Proceedings (Industrial Electronics Conference),
pp. 2129–2134 (2016)
23. Morello, R., et al.: Hardware-in-the-loop simulation of FPGA-based state estimators for elec-
tric vehicle batteries. In: IEEE International Symposium on Industrial Electronics, vol. 2016,
pp. 280–285 (2016)
24. Xia, B., et al.: Online parameter identification and state of charge estimation of lithium-
ion batteries based on forgetting factor recursive least squares and nonlinear Kalman filter.
Energies 11(1), 3 (2018)
25. HDL code generation for streaming matrix inverse system object - MATLAB &
Simulink. https://www.mathworks.com/help/hdlcoder/examples/hdl-code-generation-stream
ing-matrix-inverse-system-object.html. Accessed 19 Jun 2020
26. Perform a multiply-accumulate operation on the inputs - Simulink. https://www.mathworks.
com/help/hdlcoder/ref/multiplyaccumulate.html. Accessed 24 Jun 2020
27. Multiply-Add. https://it.mathworks.com/help/hdlcoder/ref/multiplyadd.html
28. Zhang, R., et al.: A study on the open circuit voltage and state of charge characterization of
high capacity lithium-ion battery under different temperature. Energies 11(9), 2408 (2018)
Performance Comparison of Imputation
Methods in Building Energy Data Sets

Hariom Dhungana(B) , Francesco Bellotti, Riccardo Berta, and Alessandro De Gloria

DITEN, University of Genova, 16145 Genova, Italy


{hariom.dhungana,franz,berta,adg}@elios.unige.it

Abstract. Statistical procedures for missing data imputation have vastly improved,
yet selecting the most suitable imputation technique for a particular application,
dataset or context is still unclear. This work frames the
missing-data problem in building energy measurement systems, reviews different
imputation methods and suggests the optimal imputation technique for missing
values in an energy metering data set. The main objective of this paper is to show
the performance of different imputation techniques with respect to accuracy and
computation time on energy meter data. Missing values in the energy metering data
set are imputed by seven imputation methods: last value carried forward
(LVCF), Mean, Median, Mode, multiple imputation by chained equations (MICE),
K-nearest neighbors (K-NN) and long short-term memory (LSTM). The performance
of each imputation method is compared with respect to accuracy and execution
time under a missing completely at random assumption. Based on the two evaluation
criteria, LVCF imputation is very fast and has high accuracy among the single
point imputation methods. LSTM performs best among the seven imputation methods
for the energy metering data set, but the tradeoff is its computation time compared
to LVCF.

1 Introduction
Internet of Things (IoT) sensing architectures that monitor the user and/or the environment
to derive information about their status are very popular nowadays [1]. The sensed
data should be complete and clean in order to be used for trustworthy judgement.
Nevertheless, missing data in IoT networks severely affects the quality of the information
and knowledge that rely on those incomplete datasets. Moreover, the impact of missing data
on quantitative research is also serious: it causes loss of information, leading to
biased parameter estimates, reduced statistical power, increased standard errors,
and weaker generalizability of findings. Loss of data during measurements in IoT
platforms is one of the main problems for data pre-processing in an IoT application
[2]. The common causes of missing data are unreliable sensor devices,
synchronization problems, unstable network communication, environmental factors and
other device malfunctions [3]. Real-world measurement data differ in numerous
features such as amplitude resolution, sampling rate, the number and quality of
sensors deployed, the data collection network, and the availability and extent of ground-truth
annotations. This heterogeneity poses a significant challenge for researchers intending

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 144–151, 2021.
https://doi.org/10.1007/978-3-030-66729-0_17

to use data imputation because of the required data characteristics, acceptable error,
available computing resources and processing time restrictions.
Missing data are a part of almost all research fields, including data science, mathematics
and statistics, economics and the financial sector [4]. Missing data patterns are
categorized into 3 groups [5]: (a) missing completely at random (MCAR), (b) missing
at random (MAR), and (c) missing not at random (MNAR). MCAR is a missing
data mechanism in which the missing value is independent of the variable itself and is
unavailable owing to random events, such as a sensor node breaking down because of
an accident. In the case of MAR, the missingness is independent of the variable itself, yet
the missing value can be predicted from other factors that influence that variable; an
example is sensor failure during a cleaning event, when the power supply to the sensors was
disconnected. In the case of MNAR, the missing values are dependent on the variable
itself and the event is non-random. There are several alternative ways to overcome
missing data problems. Basically, techniques for handling missing data are categorized into
four classes: (a) deletion of missing data, (b) imputation or estimation of missing data using
statistical methods and/or machine learning, (c) estimating the missing values on the
basis of modelling the known distribution (such as Expectation Maximization, Gaussian
Mixture Models), and (d) classifying data that contains missing data by means of machine
learning [6]. The ignore and delete method can be used when the amount of missing data is
small, but when it is significant compared to the amount of data, the use of
ignore and delete methods may cause bias in the mining result. Data imputation is the
process of filling the missing values in an incomplete data set with appropriate values.
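
To make the MCAR mechanism and the deletion strategy concrete, the following Python sketch removes a fixed fraction of values at positions chosen uniformly at random, independently of the values themselves. The function names, the use of `None` as the missing marker and the fixed seed are illustrative assumptions.

```python
import random


def inject_mcar(values, rate, seed=0):
    """Return a copy of `values` with a fraction `rate` set to None at random positions."""
    rng = random.Random(seed)
    n_missing = round(rate * len(values))
    # Positions are drawn without looking at the data, which is what makes it MCAR.
    missing_idx = sorted(rng.sample(range(len(values)), n_missing))
    corrupted = list(values)
    for idx in missing_idx:
        corrupted[idx] = None
    return corrupted, missing_idx


def delete_missing(values):
    """Class (a) above: drop the missing entries instead of estimating them."""
    return [v for v in values if v is not None]
```

Keeping the list of removed positions makes it possible to compare imputed values against the known originals afterwards.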
In short, there is no widely agreed, unique data imputation technique in IoT
measurement frameworks for the various data collection systems. The usability of a data
imputation technique depends upon the nature of the data sources, the data collection
infrastructure, the missing data patterns, computational complexity and prediction accuracy.
Tabachnick and Fidell proposed that the missing data mechanisms and the missing data patterns
have a higher influence on research results than the proportion of missing data does [7].
Furthermore, Horton showed that each imputation approach is more complicated when
there are many patterns of missing values, or when both categorical and continuous
random variables are involved [8]. We intend to foster selection criteria for data
imputation techniques for building energy management system datasets. The main reason
for presenting this work is to familiarize researchers with newer imputation techniques
and encourage them to apply those techniques to different datasets for optimal
performance in IoT applications, as presented in [9]. The remainder of this paper is structured
as follows: Sect. 2 describes the related works. Experimental datasets and evaluation
procedures are explained in Sect. 3. Section 4 shows the results and discusses
the performance comparison. Finally, Sect. 5 gives the conclusions and outlines
future work.

2 Related Works

The data collected from the IoT measurement framework are multivariate as well as time
series data. Moreover, in time series data the values of an object are tied to specific
times; if a value changes at any other time, the previously measured value will introduce
a significant bias in decision making. The LVCF imputation technique was implemented
in [10] to estimate the missing data values in an environmental sensing dataset collected
from industrial wireless sensor networks. The authors claimed that LVCF was an easy and
effective method for missing value imputation in large multi-dimensional sensing data and
that it outperformed mean imputation and high-frequency imputation.
Moreover, Bennett had already shown that if the missing data follow the MCAR pattern, then
LVCF is a reasonable imputation method [11].
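
A minimal Python sketch of LVCF, with mean imputation alongside for contrast, is given below. Missing entries are represented by `None`; filling a leading gap with the first observed value is a choice made for this example and is not specified in the cited works.

```python
def lvcf(values):
    """Last value carried forward: replace each None with the latest observed value."""
    # Leading gaps are filled with the first observed value (an assumption).
    last = next((v for v in values if v is not None), None)
    filled = []
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def mean_impute(values):
    """Replace each None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]
```

For a slowly varying signal such as an appliance power reading sampled every second, the carried-forward value is usually close to the true one, whereas the mean ignores the temporal ordering altogether.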
A comparative study of five data imputation methods for the estimation of missing
values in building sensor data was carried out in [12]. The five methods were linear
regression, weighted K-NN, support vector machines (SVM), mean imputation and
replacing missing entries with zero. The study also illustrated the importance of predicting
missing values and how this may affect the accuracy of building energy simulation. Peter
et al. compared six imputation methods: Mean, K-NN, fuzzy K-means, singular value
decomposition, Bayesian principal component analysis and MICE [13]. The experiment
was performed on four real datasets under a missing completely at random assumption,
based on four evaluation criteria: root mean squared error (RMSE), unsupervised
classification error, supervised classification error and execution time.
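
Two of these criteria, RMSE and execution time, can be computed as in the Python sketch below. Restricting the RMSE to the positions that were actually missing is an assumption of this example, as is the use of `time.perf_counter` for timing; the helper names are hypothetical.

```python
import math
import time


def rmse_on_missing(original, imputed, missing_idx):
    """Root mean squared error between true and imputed values at the missing positions."""
    if not missing_idx:
        raise ValueError("no missing positions to evaluate")
    sq_err = sum((original[i] - imputed[i]) ** 2 for i in missing_idx)
    return math.sqrt(sq_err / len(missing_idx))


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

A comparison then amounts to calling `timed` on each imputation function with the same corrupted series and passing the result to `rmse_on_missing` together with the known original values.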
In [3], a spatial and temporal (ST) correlation proximate missing data imputation
model was proposed to address the missing data problem in IoT. The ST-correlated missing
data model was more accurate than single imputation methods such as mean, median
and mode. Chuentawat and Kan compared the accuracy of multivariate and univariate time
series prediction on PM2.5 datasets by evaluating the RMSE of models based on SVM and a
genetic algorithm [14]. In all three data subsets, the univariate forecasting model had lower
RMSE and mean absolute percentage error than the multivariate forecasting model. With
the recent progress in the computational power of computers and more advanced machine
learning algorithms such as deep learning, new algorithms have been developed to impute time
series data. In [15], Sima and Akbar compared two forecasting techniques, ARIMA
and LSTM, on economic and financial time series datasets. The results showed that deep
learning-based algorithms such as LSTM outperform traditional algorithms such
as the ARIMA model in terms of error rate. Poulos and Valle showed that k-NN imputation can
outperform explicit modeling methods for supervised learning tasks [16]. The performance
of the classifiers and imputation strategies generally depends on the nature and proportion
of missing data, and perturbation can help increase predictive accuracy for imputed
models by regularizing the classifier.

3 Experimental Setup
3.1 Provided Datasets
Various building energy consumption datasets are freely available to support better
energy management in households, such as UK-DALE [17], REDD [18] and AMPds
[19]. Among them, GREEND is an energy metering dataset in which some of the monitored
households are located in Italy [2]. This dataset was chosen to maximize
the practical applicability of the research with respect to demographics and user
behavior. All data features are numeric, consist of integers, and do not have negative
Performance Comparison of Imputation Methods in Building Energy Data Sets 147

value; each measurement has the same unit, namely the active power consumed by an electrical
appliance in watts. The measurements of the eight households are stored in separate folders,
one per building. Each folder contains hundreds of CSV files, and each CSV file
holds a multivariate time series of active power measurements from nine
appliances sampled at 1 Hz. Each CSV file therefore contains 86,400 measurements (24 h at 1 Hz)
of 9 different variables, giving a total of 777,600 data points. The 9 variables
represent the different electrical appliances used in that house. We randomly picked
one file from each building folder and marked those eight files as the original datasets for
analysis. The values obtained from those eight files, one per building, are averaged
to tabulate the RMSE and execution time of each imputation method. The
validation datasets are constructed by pre-processing the randomly picked data files through
visual inspection, duplicate removal and outlier removal. An example of the missing pattern and
of the frequency of missing values in one file is plotted in Fig. 1.

Fig. 1. Example of Frequency of missing value in different electrical appliances on building0 and
missing patterns of that file

3.2 Procedures of Evaluation and Metrics


The performance comparison of the imputation methods follows
the process shown in Fig. 2. The datasets downloaded from the repository
are referred to as the original dataset O; they are partially corrupted by missing values
and heterogeneous formats. Preprocessing is applied to the original dataset to obtain the structured
validation dataset D. Measured data values are then randomly removed to create datasets
with missing values M. Each column represents the power consumption of a particular
electrical appliance; imputation is therefore carried out column by column, based only on
previous data, which are the only data available without foreseeing the future. The machine
learning based imputation techniques randomly split each dataset into 60% for training and
40% for testing. The different imputation techniques are applied to reconstruct an
imputed dataset, and the RMSE between the imputed CSV file and the validation CSV file D
is calculated. There are mainly two evaluation criteria for the performance metric: the first is
accuracy and the second is computing resources. Mean square error, RMSE, unsupervised
classifier error, supervised classifier error, execution time and symmetric mean absolute
percentage error are typical criteria for evaluating the performance of imputation
148 H. Dhungana et al.

techniques [12, 13]. In line with the figures of merit employed by most studies, this work
compares performance based on RMSE and imputation time. The RMSE represents
the sample standard deviation of the differences between imputed and measured values. The
imputation time is measured with the datetime module in every imputation script at run time.
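The evaluation loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' scripts: the validation column is a synthetic on/off load (hypothetical), the helper names `mask_mcar`, `impute_lvcf` and `rmse` are invented for the example, and only LVCF is shown as the imputation method.

```python
import math
import random
from datetime import datetime


def mask_mcar(values, fraction, rng):
    """Return a copy of `values` with `fraction` of the entries set to None (MCAR)."""
    n_missing = round(fraction * len(values))
    missing = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing else v for i, v in enumerate(values)]


def impute_lvcf(column):
    """Last value carried forward; leading gaps take the first observed value."""
    last = next(v for v in column if v is not None)
    filled = []
    for v in column:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def rmse(truth, estimate):
    """Root mean squared error over all samples of the column."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))


# Hypothetical validation column D: an appliance switching between 0 W and 1000 W.
validation = [1000 if (t // 100) % 2 else 0 for t in range(1000)]
corrupted = mask_mcar(validation, 0.20, random.Random(0))  # dataset M, 20% missing

start = datetime.now()                       # timing with the datetime module
imputed = impute_lvcf(corrupted)
elapsed_s = (datetime.now() - start).total_seconds()

print(f"RMSE = {rmse(validation, imputed):.2f} W, time = {elapsed_s:.6f} s")
```

The RMSE here is taken over the whole column, matching the comparison of the imputed file with the validation file; restricting it to the removed samples only would be an equally valid design choice.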

Fig. 2. Evaluation procedure of imputation techniques.

4 Discussion and Results


The handling of missing data is a widespread, generic statistical problem, and
no single imputation method performs best on all datasets. Ensuring the
completeness of data in terms of timestamps is necessary, because fragmented data
with missing values provide insufficient information, and many data analysis methods
strictly require complete series of observations. Unbiased and well-designed
comparison studies in computational sciences are necessary to verify that a
particular data imputation method works on the respective datasets, following established
standards and guidelines. The type of missing data in the dataset determines the
appropriate method for handling it before a formal statistical analysis
begins [11]. In this work, we carried out a neutral comparison of seven imputation methods
on real building energy datasets of uniform size, under an MCAR assumption.
The quality of statistical inferences degrades as the amount of missing
data grows. However, the literature gives no standard threshold for an acceptable
percentage of missing data in a dataset for effective statistical inference. In this work,
we evaluated proportions of randomly distributed missing values from 5% to
45% in steps of 5% for all data files. The experiments were carried out on eight data files,
one from each building, and the RMSE and execution time reported in Table 1 and Table 2
are the averages over those eight experiments.
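The sweep over missing proportions and the averaging over eight files can be sketched as follows. The eight "files" are synthetic on/off load profiles (an assumption made only so that the example runs), and the comparison is limited to LVCF and mean imputation; the numbers it prints are not the values of Table 1.

```python
import math
import random
from statistics import mean


def mask_mcar(values, fraction, rng):
    """Set `fraction` of the entries to None, chosen completely at random."""
    n_missing = round(fraction * len(values))
    missing = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing else v for i, v in enumerate(values)]


def impute_lvcf(column):
    """Last value carried forward; leading gaps take the first observed value."""
    last = next(v for v in column if v is not None)
    filled = []
    for v in column:
        last = v if v is not None else last
        filled.append(last)
    return filled


def impute_mean(column):
    """Replace every gap with the mean of the observed values."""
    observed_mean = mean(v for v in column if v is not None)
    return [observed_mean if v is None else v for v in column]


def rmse(truth, estimate):
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))


# Eight hypothetical files, one per building, with different switching periods.
files = [[1000 if (t // (50 + 10 * b)) % 2 else 0 for t in range(2000)] for b in range(8)]
fractions = [p / 100 for p in range(5, 50, 5)]      # 5%, 10%, ..., 45%

table = {}
for fraction in fractions:
    lvcf_scores, mean_scores = [], []
    for b, data in enumerate(files):
        corrupted = mask_mcar(data, fraction, random.Random(b))
        lvcf_scores.append(rmse(data, impute_lvcf(corrupted)))
        mean_scores.append(rmse(data, impute_mean(corrupted)))
    table[fraction] = (mean(lvcf_scores), mean(mean_scores))   # average over the files

for fraction, (lvcf, avg) in table.items():
    print(f"{fraction:.0%}  LVCF {lvcf:8.1f} W   Mean {avg:8.1f} W")
```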
Table 1 shows the RMSE of the different imputation methods for
missing proportions from 5% to 45%. The first four methods are
single imputation methods based solely on statistical calculation. Among those
four methods, LVCF gives very good results, with a much lower RMSE than
mean, mode and median imputation. Its performance remains
good at higher missing proportions. The MICE and K-NN imputation methods are better
Table 1. RMSE values of active power in watt from different imputation methods in various
missing proportions

Method   5%      10%     15%     20%     25%     30%     35%     40%     45%
LVCF     5.8     6.74    7.2     8.9     9.3     10.2    11.1    11.9    12.8
Mean     741.9   1063.7  1284.2  1485.9  1668.5  1814.7  1967.4  2085.9  2215.8
Median   748.25  1041.8  1276.5  1480.6  1657.6  1820.4  1960.3  2089.2  2227.7
Mode     645.7   952.4   1187.9  1403.3  1593.0  1717.6  1913.9  2087.6  2331.8
MICE     494.9   684.2   856.4   987.2   1135.0  1219.8  1306.9  1339.6  1427.2
K-NN     446.1   651.3   757.7   908.8   1038.7  1111.7  1209.2  1291.7  1381.2
LSTM     0.5     1.1     1.23    1.8     1.5     1.62    1.8     1.95    2.0

than mean, median and mode imputation but less accurate than LVCF. LSTM
imputation gives the lowest RMSE of all methods, but its computational effort
is much higher than that of LVCF. Machine learning based imputation fills missing values
more accurately than mean, median and mode imputation, and LSTM imputation outperforms
the six other imputation methods on the energy metering datasets. Another important
observation, visible for every method, is that the proportion of missing values directly
affects the accuracy of the imputation technique. Therefore, the missing percentage is also
a decision criterion for IoT service development. Moreover, error calculations are necessary
as evaluation criteria, and we agree with [11] that it is good practice to perform a
sensitivity analysis over different missing data patterns to assess the robustness of the
data imputation.

Table 2. Imputation time in seconds from different imputation methods in various missing
proportions

Method   5%      10%     15%     20%     25%     30%     35%     40%     45%
LVCF     4.5     7.4     10.2    13.17   16.0    18.6    21.5    24.2    27.1
Mean     4.5     7.3     10.2    13.07   15.6    18.5    21.2    23.7    26.7
Median   4.5     7.4     10.2    12.98   15.8    18.4    21.5    24.2    26.7
Mode     4.6     7.4     10.1    13.07   15.8    18.4    21.4    23.8    26.8
MICE     567.4   831.4   1080.1  1447.3  2005.8  2327.8  2589.2  2827.9  3391.2
K-NN     13.6    21.9    30.7    39.21   46.8    55.5    63.7    71.2    80.1
LSTM     32.8    51.8    70.8    91.5    110.9   132.3   150.3   166.7   189.9

Another metric of the performance comparison is execution time; the complete list of
execution times for the different methods is shown in Table 2. The execution times of the four
single statistical imputation methods are almost the same and increase proportionally
with the fraction of missing values. The MICE imputation technique is the most time consuming
of all the imputation methods. The K-NN method has the lowest execution time among
the machine learning based imputation methods. Overall, considering both accuracy and
time, LVCF is suitable for imputing missing values in energy meter datasets when computational
resources are limited. If accuracy is the primary measure for the IoT application,
the LSTM imputation technique is the best of the seven imputation methods. There is a
considerable trade-off between prediction accuracy and the time taken for imputation.
Tables 1 and 2 illustrate the selection criteria available to researchers and application
builders, depending on the dataset.

5 Conclusion

Missing value imputation is a well-recognized problem in IoT-collected datasets and
has been studied for a long time. Nevertheless, continuously evolving imputation
methods offer greater accuracy and faster imputation than
established imputation techniques. We have compared seven imputation methods,
namely LVCF, Mean, Median, Mode, MICE, K-NN and LSTM. The imputation
results are validated using two evaluation criteria, RMSE and execution time. Single
point imputation is occasionally useful, and LVCF in particular shows very good results on
energy measurement datasets in terms of both accuracy and execution time. Furthermore, we
found that machine learning and neural network based models such as LSTM give very
accurate imputed values. Multiple imputation by chained equations (MICE) is
more accurate than mean, median and mode imputation, but its execution time is
far higher than that of all the other imputation techniques in this work. We have considered
the variation of the missing proportion but not the cause of the missing data or the missing
pattern; a detailed study of data imputation under different missing mechanisms and patterns
therefore remains an open question for future work. Additional time and effort are needed to
evaluate other available building energy consumption datasets such as UK-DALE, REDD and AMPds
for further generalization. Such additional analyses would provide insight for better energy
management in spite of incomplete and corrupted datasets.

References
1. Monteriu, A., Prist, M.R., Frontoni, E., Longhi, S., Pietroni, F., Casaccia, S., Scalise, L.,
Cenci, A., Romeo, L., Berta, R., Pescosolido, L.: A smart sensing architecture for domestic
monitoring: methodological approach and experimental validation. Sensors 18(7), 2310
(2018)
2. Monacchi, A., Egarter, D., Elmenreich, W., D’Alessandro, S., Tonello, A.M.: GREEND: an
energy consumption dataset of households in Italy and Austria. IEEE, November 2014.
https://www.andreatonello.com/greend-energy-metering-data-set/
3. Mary, I.P.S., Arockiam, L.: Imputing the missing data in IoT based on the spatial and temporal
correlation. In: 2017 IEEE International Conference on Current Trends in Advanced
Computing (ICCTAC), pp. 1–4. IEEE, March 2017
4. Little, R.J., Rubin, D.B.: Statistical Analysis with Missing Data, vol. 793. Wiley, Hoboken
(2019)
5. García-Laencina, P.J., Sancho-Gómez, J.L., Figueiras-Vidal, A.R.: Pattern classification with
missing data: a review. Neural Comput. Appl. 19(2), 263–282 (2010)
6. González-Vidal, A., Rathore, P., Rao, A.S., Mendoza-Bernal, J., Palaniswami, M., Skarmeta-
Gómez, A.F.: Missing data imputation with Bayesian maximum entropy for internet of things
applications. IEEE Internet Things J. (2020)
7. Tabachnick, B.G., Fidell, L.S., Ullman, J.B.: Using Multivariate Statistics, vol. 5, pp. 481–498.
Pearson, Boston (2007)
8. Horton, N.J., Kleinman, K.P.: Much ado about nothing: a comparison of missing data methods
and software to fit incomplete data regression models. Am. Stat. 61(1), 79–90 (2007)
9. Schafer, J.L., Graham, J.W.: Missing data: our view of the state of the art. Psychol. Methods
7(2), 147 (2002)
10. Zhou, H., Yu, K., Lee, M., Han, C.: The application of last observation carried forward method
for missing data estimation in the context of industrial wireless sensor networks. In: 2018
IEEE Asia-Pacific Conference on Antennas and Propagation (APCAP), Auckland, pp. 1–2
(2018). https://doi.org/10.1109/APCAP.2018.8538147
11. Bennett, D.A.: How can I deal with missing data in my study? Aust. N. Z. J. Public Health
25(5), 464–469 (2001)
12. Chong, A., Lam, K.P., Xu, W., Karaguzel, O.T., Mo, Y.: Imputation of missing values in
building sensor data. Proc. SimBuild 6(1) (2016)
13. Schmitt, P., Mandel, J., Guedj, M.: A comparison of six methods for missing data imputation.
J. Biometrics Biostatistics 6(1), 1 (2015)
14. Chuentawat, R., Kan-ngan, Y.: The comparison of PM2.5 forecasting methods in the form of
multivariate and univariate time series based on support vector machine and genetic algorithm.
In: 2018 15th International Conference on Electrical Engineering/Electronics, Computer,
Telecommunications and Information Technology (ECTI-CON), pp. 572–575. IEEE, July
2018
15. Siami-Namini, S., Namin, A.S.: Forecasting economics and financial time series: ARIMA
vs. LSTM. arXiv preprint arXiv:1803.06386 (2018)
16. Poulos, J., Valle, R.: Missing data imputation for supervised learning. Appl. Artif. Intell.
32(2), 186–196 (2018)
17. Kelly, J., Knottenbelt, W.: The UK-DALE dataset, domestic appliance-level electricity
demand and whole-house demand from five UK homes. Sci. Data 2(1), 1–14 (2015)
18. Kolter, J.Z., Johnson, M.J.: REDD: a public data set for energy disaggregation research. In:
Workshop on Data Mining Applications in Sustainability (SIGKDD), San Diego, CA, vol.
25, pp. 59–62. Citeseer (2011).
19. Makonin, S., Popowich, F., Bartram, L., Gill, B., Bajić, I.V.: AMPds: a public dataset for
load disaggregation and eco-feedback research. In: 2013 IEEE Electrical Power & Energy
Conference, pp. 1–6. IEEE, August 2013
Design and Validation of a FPGA-Based
HIL Simulator for Minimum Losses
Control of a PMSM

Giuseppe Galioto, Antonino Sferlazza, and Giuseppe Costantino Giaconia(B)

Department of Engineering, University of Palermo,
Viale delle Scienze Ed. 9, 90128 Palermo, Italy
{giuseppe.galioto,antonino.sferlazza,costantino.giaconia}@unipa.it

Abstract. This work examines FPGA programmable logic platforms
applied to the minimum losses control of a Permanent Magnet Synchronous
Motor (PMSM). FPGAs represent a flexible solution for the
implementation of advanced digital control algorithms, given their
intrinsically parallel structure and their capability to be reprogrammed
directly in the field. In particular, the design and validation of an FPGA-based
Hardware-In-the-Loop (HIL) simulator is proposed, investigating
data format, quantization and discretization effects and other
issues arising during the experimental validation of a controller prototype,
in order to shorten the embedded software development cycle and
to test control systems.
The proposed simulator has been applied to the control of a PMSM. Specifically,
two different minimum losses control techniques have been implemented,
as well as a space vector modulation of a three-phase voltage
source inverter. The results given in this paper show the comparison of
these two algorithms and the effectiveness of the proposed HIL
simulator.

Keywords: FPGA · Hardware-in-the-loop · Simulations · Electrical drives · PMSM

1 Introduction
HIL simulations have become very popular in the last decades with the integration
of electronic systems in everyday technology. The strong competition
between companies’ products, in addition to the widespread use of electronic
systems in almost every high-tech application, has extended the use of HIL
simulations into very different areas [1].
HIL simulations are very popular in the automotive industry [2,3]. Modern cars in fact
embed electronic systems that nowadays form a crucial part of the whole car.
Almost every main electronic or electric component of a vehicle is tested by
using hardware-in-the-loop simulations. One of the first applications of HIL to
automotive systems was the design of engine control systems as well as
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 152–163, 2021.
https://doi.org/10.1007/978-3-030-66729-0_18
other automotive systems such as Anti-lock Brake Systems (ABS), suspension
systems, or the general control system.
A HIL simulation is characterized by the operation of real components in
connection with real-time simulated parts of a system. The simulated parts are
often the processes being controlled and/or sensors and actuators.
In research and in industry, digital electric drive controllers
are typically implemented on FPGAs or on Digital Signal Processor (DSP)
microcontrollers by using proprietary development software. FPGAs have advantages
over DSP microcontrollers due to their parallel processing capability and
flexible architectures [4]. Owing to the rapid developments in digital hardware
technology, FPGAs are among the fastest and most reliable computational
engines, and are often preferred, for the digital hardware implementation of complex
systems without sacrificing accuracy. Indeed, an FPGA can perform the calculations
required for real-time simulation with a relatively small sample period.
However, optimum exploitation of FPGA technology generally
requires the use of fixed-point numbers, whose precision and range
are defined by the designer, while most DSP-based processors are equipped
with a floating-point unit with a data format of 32 or 64 bits. It follows that an
FPGA implementation is not user-friendly, and managing the compromise between
precision and the use of FPGA resources is usually time consuming. Also,
wider data streams reduce the performance of the FPGA, which justifies
the importance of the arithmetic optimization of the models under study.
Therefore, minimizing the use of FPGA resources is very important to obtain
the best system performance [5–7].
This paper deals with the study of a HIL platform for minimum losses control
of a PMSM [8,9]. Note that the minimum losses control represents an important
field of study in electrical drives, also recently, with the development of new
models, estimators and control algorithms dealing with it [10–12].
In particular, in this work the motor with its converter and sensors is modeled
and implemented in Simulink, while the controller is implemented on a
Xilinx board. Moreover, the Hardware Description Language (HDL) Verifier toolbox has
been employed, since it allows Verilog and VHDL designs
for FPGA, ASIC and SoC to be tested and verified. This toolbox compares the RTL with the
test benches running in MATLAB or Simulink by leveraging a co-simulation
environment coupled with an HDL simulator. Some test benches can be used
with FPGA and SoC development boards to verify HDL implementations in
hardware. HDL Verifier thus provides tools to test and debug FPGA
implementations on Xilinx and Intel boards. Using the cosimWizard command,
it is possible to set all the parameters related to the HDL block to be simulated,
such as the clock and reset signals, the output formats and the simulation time
scale [13,14].
Moreover, another important purpose of this work is the validation
of FPGA platforms as embedded controllers in the loop, in order to assess the
effective advantages offered by these chips in electric drive control applications,
and, to evaluate the accuracy and time resolution the system can sustain to
fulfill the application requirements.

2 Mathematical Model of PMSM


Since the main focus of the control system is to minimize the losses, the mathe-
matical model developed should take into consideration both joule and leakage
flux losses. For this reason a more complex model, with respect to a standard
one, will be used. In particular the PMSM considered can be described by the
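following differential equations expressed in a rotating d-q reference frame [8]:
$$ v_d = R_s i_{od} + \frac{R_s + R_c}{R_c}\left(\frac{d\lambda_d}{dt} - \omega_r \lambda_q\right), \tag{1a} $$
$$ v_q = R_s i_{oq} + \frac{R_s + R_c}{R_c}\left(\frac{d\lambda_q}{dt} + \omega_r \lambda_d\right), \tag{1b} $$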
where vd and vq are the direct and quadrature components of the stator voltage
space-vector, iod and ioq are the direct and quadrature components of the stator
current space-vector, ωr is the rotor speed, Rs and Rc are the circuit resistances,
and λd and λq are the direct and quadrature components of the flux space-vector
defined as:
$$ \lambda_d = L_d i_{od} + \lambda_m, \tag{2a} $$
$$ \lambda_q = L_q i_{oq}, \tag{2b} $$
where λm is the flux produced by the permanent magnet. The mechanical equa-
tion is given by:
$$ \frac{d\omega_r}{dt} = \frac{1}{J}\left(C_e - C_r - f_v \omega_r\right), \tag{3} $$
where J is the inertia moment, fv is the viscous friction coefficient, Cr is the
load torque and Ce is the electromagnetic torque produced by the motor given
by:
$$ C_e = \frac{3}{2}\, p \left(\lambda_m i_{oq} - (L_d - L_q)\, i_{od}\, i_{oq}\right), \tag{4} $$
where p is the number of pole pairs and Ld and Lq are the stator inductances
along direct and quadrature components respectively. Equations (1)–(4) rep-
resent the mathematical model of the system and they are implemented on
Simulink in order to simulate the dynamic behavior of the system and to vali-
date the control algorithm.
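As an illustration of how Eqs. (1)–(4) can be turned into a simulation step, the following Python sketch solves Eqs. (1a) and (1b) for the flux derivatives and integrates the state with a forward-Euler rule. It is only a sketch under stated assumptions: the paper implements the model in Simulink, all parameter values below are hypothetical, the names `PmsmParams`, `torque` and `euler_step` are invented for the example, and the signs are transcribed from the equations as printed.

```python
from dataclasses import dataclass


@dataclass
class PmsmParams:
    # Hypothetical values chosen only for illustration, not taken from the paper.
    Rs: float = 1.0       # circuit resistance Rs [ohm]
    Rc: float = 300.0     # circuit resistance Rc [ohm]
    Ld: float = 0.010     # d-axis inductance [H]
    Lq: float = 0.015     # q-axis inductance [H]
    lam_m: float = 0.2    # permanent-magnet flux [Wb]
    p: int = 3            # pole pairs
    J: float = 0.01       # inertia moment [kg m^2]
    fv: float = 0.001     # viscous friction coefficient


def torque(par, i_od, i_oq):
    """Electromagnetic torque, transcribed from Eq. (4) as printed."""
    return 1.5 * par.p * (par.lam_m * i_oq - (par.Ld - par.Lq) * i_od * i_oq)


def euler_step(par, state, v_d, v_q, c_r, dt):
    """Advance (lambda_d, lambda_q, omega_r) by one forward-Euler step of size dt."""
    lam_d, lam_q, w_r = state
    i_od = (lam_d - par.lam_m) / par.Ld                  # Eq. (2a) solved for i_od
    i_oq = lam_q / par.Lq                                # Eq. (2b) solved for i_oq
    k = par.Rc / (par.Rs + par.Rc)
    dlam_d = k * (v_d - par.Rs * i_od) + w_r * lam_q     # Eq. (1a) solved for d(lambda_d)/dt
    dlam_q = k * (v_q - par.Rs * i_oq) - w_r * lam_d     # Eq. (1b) solved for d(lambda_q)/dt
    dw_r = (torque(par, i_od, i_oq) - c_r - par.fv * w_r) / par.J   # Eq. (3)
    return (lam_d + dt * dlam_d, lam_q + dt * dlam_q, w_r + dt * dw_r)


par = PmsmParams()
state = (par.lam_m, 0.0, 0.0)          # rest: no stator current, zero speed
for _ in range(100):                   # 100 steps of 10 us with a constant v_q
    state = euler_step(par, state, v_d=0.0, v_q=10.0, c_r=0.0, dt=1e-5)
print(f"omega_r after 1 ms = {state[2]:.5f} rad/s")
```

The same symbol ωr is used in the electrical and mechanical equations, as in the text; a fixed-step Euler rule is chosen here only for brevity.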

2.1 Space Vector Modulation of the Inverter


The Space Vector Modulation (SVM) of the inverter is based on the introduction
of the space vectors representative of the three-phase voltages system reproduced
by the inverter and defined as:
$$ \bar{V}_h = \frac{2}{3} V_{cc}\left(S_a + S_b\, e^{j\frac{2\pi}{3}} + S_c\, e^{j\frac{4\pi}{3}}\right), \tag{5} $$
where Vcc is the DC voltage supplying the inverter, Sa , Sb and Sc can each assume
the value 0 or 1, according to the logic state of the upper electronic
switches of the three branches of the inverter, and h = 0, ..., 7 identifies the
eight possible states of the inverter. The eight vectors defined in (5) identify the
inverter operating hex shown in Fig. 1(a).
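The eight vectors of Eq. (5) can be enumerated numerically as a quick check of the hexagon geometry. The sketch below is illustrative only: the DC-link voltage is a hypothetical value, and the vectors are keyed by the switch states (Sa, Sb, Sc) because the mapping between these states and the index h is not specified in the text.

```python
import cmath
import math
from itertools import product


def space_vector(v_cc, s_a, s_b, s_c):
    """Inverter space vector of Eq. (5) for one switch configuration."""
    a = cmath.exp(2j * math.pi / 3)          # rotation by 120 degrees
    return (2.0 / 3.0) * v_cc * (s_a + s_b * a + s_c * a * a)


V_CC = 300.0                                  # hypothetical DC-link voltage [V]
vectors = {s: space_vector(V_CC, *s) for s in product((0, 1), repeat=3)}

for switches, v in vectors.items():
    angle = math.degrees(cmath.phase(v)) if abs(v) > 1e-9 else 0.0
    print(switches, f"|V| = {abs(v):6.1f} V, angle = {angle:7.1f} deg")
```

The two configurations with all switches equal give the null vector, while the other six have magnitude (2/3)·Vcc and are spaced 60 degrees apart, which is what defines the operating hex.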

Fig. 1. Inverter operating hex (a), firing pulses of static devices (b).

Given a reference voltage Vr contained in the operating hex, it is “reconstructed”
by the inverter by means of a suitable sequence of space vectors, so
that the average value of the output voltage over the modulation period equals
that of the reference voltage in the same time interval. Therefore, in terms of
space vectors, it should be found that:
$$ \int_0^{T_{PWM}} \left(V_r(t) - \bar{V}(t)\right) dt = 0, \tag{6} $$
where V̄(·) is the space vector generated by the inverter. Thus, if the vector Vr(·) is in
the h-th sector of the inverter operating hex (delimited by the vectors Vh and
Vh+1), the vector Vh is generated for a time t_h and the vector Vh+1 for a time
t_{h+1}. In general, t_h + t_{h+1} < T_PWM. Therefore, in the residual
time of the modulation period the null vector is applied. The correct sequence of
firing pulses of the static devices within a modulation period T_PWM is obtained
as shown in Fig. 1(b).
An easy way of implementing the vector modulation described above is to
calculate the on and off instants of each device starting from the expression of
the duty cycle [15], i.e. the ratio T_ON/T_PWM, in which T_ON is the total time during
which a static device is conducting. The total computing time is reduced because
it is not necessary to use trigonometric and transcendental functions for determining
the sector of the inverter operating hex. Indeed, as shown in [15], it is possible to define:
$$ d_n = \frac{1}{2} + \frac{V_{rn} + U^*}{V_{cc}}, \quad n = a, b, c, \tag{7} $$
where:
$$ U^* = -\frac{1}{2}\left(\max\{V_{r1}, V_{r2}, V_{r3}\} + \min\{V_{r1}, V_{r2}, V_{r3}\}\right). $$
Once the three duty cycles are computed, the on and off times are computed
as follows:
$$ T_{ON\,n} = \frac{T_{PWM}}{2}\,(1 - d_n), \qquad T_{OFF\,n} = \frac{T_{PWM}}{2}\,(1 + d_n), \qquad n = a, b, c. \tag{8} $$
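The duty-cycle expression of Eq. (7), with the offset U* defined just above, needs only a maximum, a minimum and a few additions, which is why no trigonometric function is required. The Python sketch below illustrates this; the DC-link voltage, the reference amplitude and the function name `duty_cycles` are hypothetical and serve only to exercise the formula.

```python
import math


def duty_cycles(v_ref, v_cc):
    """Duty cycles d_n of Eq. (7) for the phase references v_ref = (Va, Vb, Vc)."""
    u_star = -0.5 * (max(v_ref) + min(v_ref))      # offset U* from the min/max rule
    return tuple(0.5 + (v + u_star) / v_cc for v in v_ref)


V_CC = 300.0        # hypothetical DC-link voltage [V]
AMPLITUDE = 150.0   # hypothetical phase reference amplitude [V]

for step in range(6):
    theta = step * math.pi / 3
    v_ref = tuple(AMPLITUDE * math.cos(theta - k * 2 * math.pi / 3) for k in range(3))
    print([round(d, 3) for d in duty_cycles(v_ref, V_CC)])
```

A useful property of this offset is that the largest and smallest duty cycles are always symmetric around 0.5, so the available voltage range is used evenly on both sides.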

2.2 Minimum Losses Field Oriented Control of the PMSM

In this situation it is possible to apply the field oriented control strategy to the
system (1)–(4) [15,16]. In particular, if the current iod is forced to be zero, and
the two control inputs vd and vq are chosen, by means of a state feedback, as:
$$ v_d = v_d^* - \frac{R_s + R_c}{R_c}\, \omega_r \lambda_q, \tag{9a} $$
$$ v_q = v_q^* + \frac{R_s + R_c}{R_c}\, \omega_r \lambda_d, \tag{9b} $$

then the whole system can be considered as two decoupled subsystems, and the
torque can be controlled by acting only on the current ioq . The block diagram
of the Field Oriented Control (FOC) algorithm is shown in Fig. 2.

Fig. 2. Block diagram of the proposed control algorithm.

However, the main focus of the proposed application is the investigation of
efficiency improvements through Losses Minimization Algorithms (LMA)
for any working condition. For this reason it is important to define both the electrical
and the mechanical losses and to obtain the efficiency of the system as a function of iod
and ioq, as can also be observed in the expressions given in [17] and [18].
Usually there are two methods for searching for the maximum efficiency point,
which corresponds to the minimum losses operating condition: an open-loop
model-based approach [8] and a closed-loop method based on a binary search [19].
In the former, model-based algorithm, the reference value of the current iod is
computed off-line so that the maximum efficiency point is reached, while the
current ioq is chosen so as to respect the constraint on the required torque.
This method requires a low computational effort because the minimization problem
is solved off-line. However, it is not robust, because the reference value
of iod depends on the model parameters (resistances, inductances, etc.), which are
not perfectly known and can also vary during operation.
The second algorithm searches for the minimum losses point by means of a
bisection procedure. In particular, fixing the search step amplitude (d = 1 mA)
and starting from an initial search interval [iodmin = −10 A, iodmax = 10 A],
this interval is divided in two and its midpoint is determined as:
$$ x = \frac{i_{od\,min} + i_{od\,max}}{2}. \tag{10} $$
Then the searching interval is updated adopting the following logic:

$$ \begin{cases} i_{od\,max} = x, & \text{if } W(x - d) < W(x + d), \\ i_{od\,min} = x, & \text{if } W(x - d) \ge W(x + d), \end{cases} \tag{11} $$

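where W(·) represents the total losses of the machine, given by:
$$ W(i_{od}, i_{oq}, \omega_r) = \frac{3}{2} R_s \left[\left(i_{od} - \frac{\omega_r L_q i_{oq}}{R_c}\right)^2 + \left(i_{oq} - \frac{\omega_r(\lambda_m + L_d i_{od})}{R_c}\right)^2\right] + \frac{3\,\omega_r^2}{2 R_c}\left[(L_q i_{oq})^2 + (\lambda_m + L_d i_{od})^2\right]. \tag{12} $$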
The algorithm is iterated until |iodmax − iodmin| < 2d, where d is
the predetermined search step amplitude. Finally, the value of x obtained at the last
iteration is taken as the reference value of iod, which corresponds to the output of
the LMA block of Fig. 2, named i∗d.
This second method is more robust against parameter uncertainties, as will
be shown in the results Section.
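The bisection procedure of Eqs. (10) and (11) can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements from the paper: the loop stops when the width of the search interval falls below 2d, and the loss function is a hypothetical convex stand-in for W(·) at a fixed operating point, used only so that the example can be run and checked.

```python
def minimum_loss_current(loss, i_min=-10.0, i_max=10.0, d=1e-3):
    """Bisection search of Eqs. (10)-(11) for the i_od that minimises loss(i_od)."""
    x = 0.5 * (i_min + i_max)
    while (i_max - i_min) >= 2 * d:        # assumed stopping rule: interval width < 2d
        x = 0.5 * (i_min + i_max)          # Eq. (10): midpoint of the current interval
        if loss(x - d) < loss(x + d):      # Eq. (11): losses rise to the right of x
            i_max = x
        else:                              # Eq. (11): losses still fall to the right of x
            i_min = x
    return x                               # value of x at the last iteration


def example_loss(i_od):
    """Hypothetical convex stand-in for W(i_od, i_oq, w_r) at fixed i_oq and w_r."""
    return 2.0 * (i_od + 1.3) ** 2 + 40.0


i_ref = minimum_loss_current(example_loss)
print(f"i_od reference = {i_ref:.4f} A")
```

Starting from a 20 A wide interval, the width halves at every iteration, so 14 loss comparisons are enough to reach the 1 mA step; this bounded iteration count is what makes the procedure attractive for a finite state machine implementation.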

3 Digital Implementation
The above described control algorithm was digitally implemented. In particular,
for the development and synthesis of the VHDL code, Xilinx ISE Design Suite
14.7 was used, while both ISim and ModelSim were adopted for behavioural and
physical simulation. The targeted system was chosen among the Kintex 7 family
products, since they are a good compromise between computational capabilities
and needed power, compared to the more powerful Xilinx products of the Virtex
or Zynq families.
The chosen representation for all variables and signals is the signed
[16.16] format, which proved a good balance between precision and computational
efficiency; hence for ModelSim simulation it was necessary to use the following
library:
library ieee; use ieee.fixed_pkg.all;
Fixed-point arithmetic is widely used in signal processing, especially when
an implementation on FPGAs or ASICs must be carried out. It is usually more
efficient than its floating-point counterpart, providing equivalent results in terms
of precision at a reduced cost in terms of resources. For this reason, VHDL 2008
has incorporated the fixed-point package into its standard. It provides the arithmetic
operators and the functions necessary to work with these number formats.
Fixed-point arithmetic, by its intrinsic nature, involves growth of the word length of the
results of sums, differences, multiplications and divisions and, therefore, requires
adequate truncation and/or rounding, potentially leading to losses
both in numerical terms (overflow) and in the reliability of the obtained results. If,
however, the system is appropriately sized to possess the numerical
dynamics necessary to accommodate the numerical values of the algorithm, the associated
problems can be overcome. Moreover, fixed_pkg provides a set of useful
predefined functions to manage operations with fixed-point numbers, such as
type conversions, resizing functions, common arithmetic operations, maximum
and minimum; in the development of the control algorithm they play a
crucial role in managing the implementation, as well as in sizing the operands and
the results of subsequent chains of operations.
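The effect of quantization, word-length growth and saturation can be explored outside the hardware flow with a small Python model of the arithmetic. This is only an illustration of the numerical behaviour, not the VHDL implementation: it assumes that the signed [16.16] format means a 32-bit word with 16 integer bits (sign included) and 16 fractional bits, and the helper names are invented for the example.

```python
FRAC_BITS = 16                        # fractional bits of the assumed [16.16] layout
WORD_BITS = 32                        # total width: 16 integer bits (with sign) + 16 fractional
SCALE = 1 << FRAC_BITS
RAW_MIN = -(1 << (WORD_BITS - 1))
RAW_MAX = (1 << (WORD_BITS - 1)) - 1


def saturate(raw):
    """Clamp a raw integer to the signed 32-bit range instead of letting it wrap."""
    return max(RAW_MIN, min(RAW_MAX, raw))


def to_fixed(value):
    """Quantise a real number to the nearest representable raw integer."""
    return saturate(round(value * SCALE))


def to_float(raw):
    """Convert a raw fixed-point integer back to a real number."""
    return raw / SCALE


def fixed_mul(a, b):
    """Multiply two raw values; the wider product is floored back to 16 fractional bits."""
    return saturate((a * b) >> FRAC_BITS)


kp, error = to_fixed(0.75), to_fixed(-3.2)
print(to_float(fixed_mul(kp, error)))   # close to -2.4, within the quantisation step
print(to_float(to_fixed(40000.0)))      # saturates just below 32768
```

The product of two 32-bit operands needs up to 64 bits before it is resized, which mirrors the expansion of results mentioned above; the resolution of the format is 2^-16, about 15 millionths of a unit.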
Looking at the main block diagram, the Proportional Integral (PI) controllers
are obtained by implementing the following discrete equation:
$$ y(k) = y(k-1) + K_p e(k) + K_i T_s e(k), \tag{13} $$
where k denotes the k-th sampling instant and Ts represents the sampling period.
The system behavior was analyzed in the [16.16] format and the regulator calibrations were
kept unchanged between the continuous and the discrete case, while the sampling time of
the integrator was set to 10 µs (100 kHz), considering that the converter’s
switching frequency was fixed at 10 kHz.
The equations that describe the vector modulation are intrinsically digital and
therefore required no further manipulation in order to be digitally implemented.
It was necessary to implement an up/down counter for determining the switching
instants of the inverter. This counter should give a triangular waveform with a
10 kHz frequency and count, in the up/down case, up to a value equal to T_PWM/2.
The modulation period can be expressed in arbitrary units and in this case,
for convenience, we have chosen to express it in µs. Since T_PWM = 100 µs, the
counter counts from 0 to 50 and back, repeating at a frequency of 10 kHz. Of course
the counter clock signal must have a higher frequency, and it was set to 1 MHz,
which is easily attainable with the adopted FPGA, starting from the system clock and
coding a suitable frequency divider.
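The behaviour of the up/down counter can be checked with a short software model before coding it in VHDL. The sketch below is a Python illustration, not the hardware description: the counter is indexed by 1 MHz clock ticks, and the compare level T_PWM/2·(1 − d) used to derive the gate signal is an assumption based on the first expression of Eq. (8).

```python
CLOCK_HZ = 1_000_000     # counter clock from the text (1 MHz)
HALF_PERIOD = 50         # T_PWM / 2 expressed in microseconds


def carrier(tick):
    """Up/down counter value at a given clock tick: 0, 1, ..., 50, 49, ..., 1, 0, ..."""
    phase = tick % (2 * HALF_PERIOD)
    return phase if phase <= HALF_PERIOD else 2 * HALF_PERIOD - phase


def gate(tick, duty):
    """Upper-switch command: on while the carrier is at or above the compare level."""
    compare = HALF_PERIOD * (1.0 - duty)     # assumed compare level for a centred pulse
    return carrier(tick) >= compare


one_period = [gate(t, 0.30) for t in range(2 * HALF_PERIOD)]
pwm_hz = CLOCK_HZ / (2 * HALF_PERIOD)
print(f"PWM frequency = {pwm_hz:.0f} Hz, on-ticks per period = {sum(one_period)}")
```

With a 1 MHz clock the duty cycle is resolved in steps of about 1% of the period, so the number of on-ticks can differ from the ideal value by one tick.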
To obtain the VHDL code of the model-based LMA it was necessary to
change the numerical format used. Only for this block, in fact, the
operations were carried out using a 64-bit [32.32] format, and the
output was then truncated in order to obtain a fixed-point [16.16] value of the
reference current iod. This was necessary because a low accuracy of the
algorithm was otherwise experimentally observed. Furthermore, a study has been carried
out in order to account for relatively strong deviations of the captured
speed and position signals due to possible errors introduced by the analog-to-digital
conversion, observing that a decrease of the effective ADC resolution
down to 9 bits can be easily tolerated by the control algorithm.
For the implementation of the binary-search algorithm, the fixed-point [16.16]
format has been used and, as shown in Fig. 3, a Finite State Machine (FSM)
implementing the algorithm, described in equations (10), (11) and (12), was
designed. In particular a clock cycle triggers the transition of the FSM from its
idle state to the Status1, which carries out the equations of the binary search,
while the other two states properly check whether the minimization value has
been reached.

Fig. 3. States diagram of the LMA algorithm.

4 Results and Conclusion


The results obtained are shown in Fig. 4, Fig. 5 and Tables 1, 2 and 3.
In particular, they take as a reference the efficiency of the classical FOC method
and depict the percentage improvements attainable with the two minimization
algorithms. Two tests have been carried out, the first with a variable
speed waveform and the second with a variable torque profile. The
efficiency improvements are shown in Fig. 4, which compares the model-based
and binary-search algorithms when nominal parameters are used. While both
methods obtain an improvement of about 1% at higher speed and load, in these
conditions the model-based algorithm gave slightly better performance, mainly
because a hypothetically good knowledge of the model was available. However,
160 G. Galioto et al.

if the parameters are detuned, the efficiency improvement of the binary-search
algorithm exceeds that of the model-based algorithm, since it is based on
measurements of the state variable obtained directly from the real model.
Thus the main purpose of this dynamic model was achieved: the parameter
uncertainties did not affect the efficiency improvement even at very low speed,
which is the most common condition in normal use.

Fig. 4. Comparison between model-based and binary-search algorithms as a function of the speed (a) and of the torque (b).

Fig. 5. Comparison between model-based and binary-search algorithms as a function of the speed when the nominal parameters are detuned by 10%.

Finally, Tables 1, 2 and 3 summarize the device utilization for three cases:
the classical FOC algorithm (Table 1), the model-based algorithm (Table 2) and
the binary-search algorithm (Table 3).
For the purpose of this work, several Xilinx FPGAs were targeted, starting from
small Spartan-3 and Spartan-6 devices, whose resources were not sufficient to
fulfill the goals. More powerful chips, such as Virtex and/or Zynq FPGAs, are
of course capable of hosting both LMAs, but the comparison was instead carried
out on a Kintex-7 (XC7K160T model), as already described.

Table 1. Device utilization summary for the classical FOC algorithm

Platform: Xilinx Kintex-7 FPGA - XC7K160T (28 nm technology)

| Logic utilization | Used | Available | Utilization |
|---|---|---|---|
| Number of slice registers | 3325 | 202800 | 1% |
| Number of slice LUTs | 23816 | 101400 | 23% |
| Number of fully used LUT-FF pairs | 1219 | 25922 | 4% |
| Number of bonded IOBs | 197 | 285 | 64% |
| Number of BUFG/BUFGCTRLs | 5 | 32 | 15% |
| Number of DSP48E1s | 118 | 600 | 19% |

Minimum period: 101.558 ns (maximum frequency: 9.847 MHz)
Minimum input arrival before clock: 24.581 ns
Maximum output required after clock: 2.532 ns

Table 2. Device utilization summary for the model-based algorithm

Platform: Xilinx Kintex-7 FPGA - XC7K160T (28 nm technology)

| Logic utilization | Used | Available | Utilization |
|---|---|---|---|
| Number of slice registers | 4049 | 202800 | 1% |
| Number of slice LUTs | 75140 | 101400 | 74% |
| Number of fully used LUT-FF pairs | 1545 | 77644 | 1% |
| Number of bonded IOBs | 197 | 285 | 69% |
| Number of BUFG/BUFGCTRLs | 5 | 32 | 15% |
| Number of DSP48E1s | 183 | 600 | 30% |

Minimum period: 225.410 ns (maximum frequency: 4.436 MHz)
Minimum input arrival before clock: 24.579 ns
Maximum output required after clock: 2.271 ns

Table 3. Device utilization summary for the binary-search algorithm

Platform: Xilinx Kintex-7 FPGA - XC7K160T (28 nm technology)

| Logic utilization | Used | Available | Utilization |
|---|---|---|---|
| Number of slice registers | 5927 | 202800 | 2% |
| Number of slice LUTs | 59430 | 101400 | 58% |
| Number of fully used LUT-FF pairs | 1705 | 63652 | 2% |
| Number of bonded IOBs | 197 | 400 | 49% |
| Number of BUFG/BUFGCTRLs | 7 | 32 | 21% |
| Number of DSP48E1s | 248 | 600 | 41% |

Minimum period: 101.558 ns (maximum frequency: 9.847 MHz)
Minimum input arrival before clock: 24.581 ns
Maximum output required after clock: 2.586 ns

Comparing the tables, it is evident that the binary-search algorithm requires,
as expected, a lower computational effort than the model-based one (58% against
74% of the slice LUTs, and a minimum period of 101.558 ns against 225.410 ns),
while remaining well within the capabilities of the adopted FPGA. Finally, from
the speed point of view, the maximum attainable frequency leaves room to largely
increase the main switching control frequency.
This is a remarkably interesting result because it opens the possibility of
realizing PMSM control with GaN-based inverters at a very high control
frequency (nominally 1 MHz), thus substantially decreasing the weight of
passive and active components and drastically improving the power density of
the control electronics.
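The maximum frequencies quoted in Tables 1, 2 and 3 follow directly from the reported minimum periods, as the short check below shows. The ratio to the nominal 1 MHz control frequency is only an illustrative figure of merit chosen here (clock cycles available per control period); it is not a quantity reported by the authors.

```python
# Minimum clock periods reported in Tables 1, 2 and 3 (ns).
MIN_PERIOD_NS = {
    "classical FOC": 101.558,
    "model-based": 225.410,
    "binary-search": 101.558,
}

NOMINAL_CONTROL_MHZ = 1.0  # nominal control frequency mentioned in the text


def max_frequency_mhz(min_period_ns):
    """Maximum clock frequency in MHz for a given minimum period in ns."""
    return 1e3 / min_period_ns


for name, period_ns in MIN_PERIOD_NS.items():
    f_mhz = max_frequency_mhz(period_ns)
    cycles = f_mhz / NOMINAL_CONTROL_MHZ
    print(f"{name}: {f_mhz:.3f} MHz, about {cycles:.1f} clock cycles per 1 us")
```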

References
1. Sarhadi, P., Yousefpour, S.: State of the art: hardware in the loop modeling and
simulation with its applications in design, development and implementation of
system and control software. Int. J. Dyn. Control 3(4), 470–479 (2015)
2. Fathy, H.K., Filipi, Z.S., Hagena, J., Stein, J.L.: Review of hardware-in-the-loop
simulation and its prospects in the automotive area. In: Modeling and Simulation
for Military Applications, vol. 6228, pp. 62280E-1–62280E-20. International Society
for Optics and Photonics (2006)
3. Bouscayrol, A.: Different types of hardware-in-the-loop simulation for electric
drives. In: International Symposium on Industrial Electronics, pp. 2146–2151.
IEEE (2008)
4. Paiz, C., Pohl, C., Porrmann, M.: Hardware-in-the-Loop Simulations for FPGA-
based digital control design, vol. 15, pp. 355–372. Springer, Heidelberg (2008)
5. Tavana, N.R., Dinavahi, V.: A general framework for FPGA-based real-time emu-
lation of electrical machines for HIL applications. IEEE Trans. Industr. Electron.
62(4), 2041–2053 (2014)
6. Ould-Bachir, T., Dufour, C., Bélanger, J., Mahseredjian, J., David, J.P.: Effective
floating-point calculation engines intended for the FPGA-based HIL simulation.
In: IEEE International Symposium on Industrial Electronics, pp. 1363–1368. IEEE
(2012)
7. Rogers, P., Kavasseri, R., Smith, S.C.: An FPGA-in-the-loop approach for HDL
motor controller verification. In: 2017 International Conference on ReConFigurable
Computing and FPGAs (ReConFig), pp. 1–6 (2017)
8. Morimoto, S., Tong, Y., Takeda, Y., Hirasa, T.: Loss minimization control of per-
manent magnet synchronous motor drives. IEEE Trans. Industr. Electron. 41(5),
511–517 (1994)
9. Lee, J., Nam, K., Choi, S., Kwon, S.: Loss minimizing control of PMSM with the
use of polynomial approximations. In: IEEE Industry Applications Society Annual
Meeting, pp. 1–9. IEEE (2008)
10. Accetta, A., Cirrincione, M., Pucci, M., Sferlazza, A.: State space-vector model
of linear induction motors including end-effects and iron losses part I: theoretical
analysis. IEEE Trans. Ind. Appl. 56(1), 235–244 (2019)
11. Alonge, F., Cirrincione, M., Pucci, M., Sferlazza, A.: A nonlinear observer for
rotor flux estimation of induction motor considering the estimated magnetization
characteristic. IEEE Trans. Ind. Appl. 53(6), 5952–5965 (2017)
Design and Validation of a FPGA-Based HIL Simulator 163

12. Accetta, A., Alonge, F., Cirrincione, M., D’Ippolito, F., Pucci, M., Rabbeni, R.,
Sferlazza, A.: Robust control for high performance induction motor drives based on
partial state-feedback linearization. IEEE Trans. Ind. Appl. 55(1), 490–503 (2018)
13. MathWorks: HDL coder. https://www.mathworks.com/products/hdl-coder.html
14. MathWorks: HDL verifier. https://www.mathworks.com/products/hdl-verifier.
html
15. Vas, P.: Sensorless Vector and Direct Torque Control. Oxford University Press,
Oxford (1998)
16. Leonhard, W.: Control of Electrical Drives. Springer, Heidelberg (2001)
17. Uddin, M.N., Zou, H., Azevedo, F.: Online loss-minimization-based adaptive flux
observer for direct torque and flux control of PMSM drive. IEEE Trans. Ind. Appl.
52(1), 425–431 (2015)
18. Sato, D., Itoh, J.I.: Total loss comparison of inverter circuit topologies with inte-
rior permanent magnet synchronous motor drive system. In: IEEE ECCE Asia
Downunder, pp. 537–543. IEEE (2013)
19. Cavallaro, C., Di Tommaso, A.O., Miceli, R., Raciti, A., Galluzzo, G.R., Trapanese,
M.: Efficiency enhancement of permanent-magnet synchronous motor drives by
online loss minimization approaches. IEEE Trans. Industr. Electron. 52(4), 1153–
1160 (2005)
x86 System Management Mode (SMM)
Evaluation for Mixed Critical Systems

Nikos Mouzakitis1(✉), Michele Paolino1(✉), Miltos D. Grammatikakis2, and Daniel Raho1

1 Virtual Open Systems, 17 Rue Lakanal, 38000 Grenoble, France
{nikos,m.paolino,s.raho}@virtualopensystems.com
2 Hellenic Mediterranean University, Estavromenos, 71004 Heraklion, Greece
mdgramma@cs.hmu.gr
http://www.virtualopensystems.com
http://www.hmu.gr

Abstract. As autonomous driving, Industry 4.0, smart cities, etc. become
very popular, safety relevant computing demands high performance processors
to manage an increasing number of sensors, actuators and control units. In
this context, safety critical environments (typically run by real time
operating systems) have to coexist with one or more feature rich
environments, e.g., Linux. Existing virtualization technologies are
considered not secure enough to isolate these two types of execution
environment. For this reason, this paper evaluates x86 System Management
Mode (SMM) as a technology for building mixed critical virtualization
solutions. The interrupt context switch and the minimal round trip time
overheads, considered as key performance indicators, have been measured.
The results obtained on an Intel platform, 1.39 µs and 12.73 µs
respectively, confirm a high potential for SMM. To the best of our
knowledge, this is the first work considering SMM as a possible solution
for mixed critical environments.

Keywords: SMM · x86 · Mixed criticality · Real time · Virtualization

1 Introduction
Mixed critical systems are increasingly important today with the emergence of
autonomous driving, industrial internet of things and smart cities applications.
In fact, there is a need to combine software with different levels of criticality
in a single platform, to satisfy both certification (e.g., ISO 26262 for
automotive, IEC 61511 for industry, etc.) and user experience requirements
(Linux, Android, etc.). An example of a mixed critical application is the cockpit
of a road vehicle, where safety related warning icons driven by a Real Time
Operating System (RTOS) coexist with infotainment (connectivity, radio, road sign
recognition, etc.) based on Linux. The performance requirements of mixed
criticality systems are increasing as well, driven by autonomous driving, Industry 4.0,
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 164–170, 2021.
https://doi.org/10.1007/978-3-030-66729-0_19
x86 System Management Mode Evaluation 165

etc. In this context, there is a need for virtualization solutions that enable a
safe and performant execution of different operating systems. Key requirements
for such solutions are: i) strong isolation in terms of memory, CPU and IO,
ii) low overhead and iii) certifiability. Existing technologies, i.e., certified
hypervisors, leverage a low footprint and CPU virtualization extensions to
address these requirements. However, there are important security issues with
virtualization [1], mainly because this technology was not designed with
security or functional safety in mind. For this reason, looking for a solution
that provides high computing power and robustness in terms of security and
functional safety, this paper proposes to use the System Management Mode (SMM)
of x86 processors for mixed critical applications. In fact, SMM provides a
strongly isolated execution environment that runs without intermediation (low
overhead) and benefits from a very thin Trusted Computing Base (certifiability).
The key idea is to use the isolation provided by SMM to protect the safety
critical execution environment, while the feature rich execution environment
runs transparently on the system. In this paper, the feasibility of this
approach is evaluated by measuring the overhead that would be introduced in
CPU context-switch operations between an operating system in SMM and Linux.
The rest of the paper is organized as follows: Sect. 2 details related work
and Sect. 3 introduces the background technologies used. Section 4 describes
the benchmark setup and results, while Sect. 5 presents conclusions.

2 Related Work

An important part of the SMM research in the literature is related to security,
proving that it is very important for x86 mixed criticality systems to
re-architect its use. For instance, Duflot et al. [2] demonstrated how SMI
handlers and ACPI tables can be used to hide functions that escalate privileges.
In addition, the LONGKIT framework [3] showed the potential threats that can be
hidden behind SMM isolation, providing high persistence and stealth capabilities
to attackers, while the SMM Based Rootkit (SMBR) [4] has been used to build a
keylogger in which keyboard interrupt requests were redirected to SMM and sent
out via UDP packets. Similarly, Wojtczuk et al. [5] showed the importance of
considering SMM as a potential attack vector, proving how SMM flows can be
exploited to threaten even Intel TXT (Trusted Execution Technology).
Other works focus on the impact of SMM on overall system performance.
For instance, the impact of SMM on application, system and hypervisor
performance has been studied in [6]. The results showed that an SMM workload can
cause performance degradation for latency sensitive applications on the Linux
side. On the other hand, the authors of TrustLogin [7] used SMM to protect
credentials from malware, and considered the overhead introduced by SMM to be
low for their use case.
Finally, SICE [8] proposed a study similar to the present one, but with
important differences. In fact, it used SMM to host a cloud service on x86 AMD
platforms, and thus did not address the functional safety requirements considered as key in this
166 N. Mouzakitis et al.

work. However, SICE provides interesting performance numbers, with a context
switch time of 6.8 µs. The results of this work are appreciably lower; however,
they are not directly comparable due to hardware and software differences.
To the best of our knowledge, this is the first work that aims to verify
the ability of SMM to execute safety related applications.

3 Background
This paper evaluates SMM as a technology for building mixed critical virtual-
ization solutions, by extending the EFI Development Kit II (EDK2) open source
project. This section briefly presents both SMM and EDK2.

3.1 System Management Mode (SMM)


System Management Mode (SMM) is an x86 operating environment designed to
provide system wide functionalities like power management, hardware control
and proprietary OEM code execution. These functions are executed transparently
to the operating system, which has no means to control or verify them.
In addition, SMM is non-preemptible and benefits from an isolated memory
address space that can be made inaccessible from any of the other standard
execution modes (i.e., protected, real-address, virtual-8086 and IA-32e) [9].
This isolated memory area, called System Management RAM (SMRAM), is used to
store code and SMM execution context information such as registers, descriptor
tables and interrupt handlers. The only way to enter SMM is through a specific
interrupt, named System Management Interrupt (SMI). SMIs have the highest
priority in x86 systems and are used to trigger the context switch between
any operating mode and SMM.
As a result, software running in SMM is transparent to and strongly isolated
from standard applications, which cannot access SMRAM. On the other hand,
SMM software can access standard RAM, which gives it the possibility to
monitor and verify what the standard execution modes are doing. In addition,
SMM is populated at system boot time, even before the operating system is
executed. This means that the SMM trusted computing base (the set of
hardware/software components critical for its security) is relatively small,
which helps in keeping it secure. All these characteristics make SMM a very
interesting technology for mixed critical virtualization. In fact, by
exploiting SMM and its SMRAM/SMI features, the safety critical environment has
a strongly isolated execution environment to protect an RTOS, while Linux runs
as usual in protected mode.

3.2 EFI Development Kit II (EDK2)


EFI Development Kit II (EDK2) is a reference implementation of UEFI (Unified
Extensible Firmware Interface), the specification that defines the software
interface between the platform firmware and the operating system.
EDK2 populates the SMM environment with all the functions and interrupt
handlers needed to guarantee the functionality of the system. In fact, EDK2
makes sure that each time an SMI is triggered, a context switch to SMM
(where the SMI handlers are stored) is done properly. Figure 1 shows the
execution flow of an SMI IPI (Inter-Processor Interrupt) caused by a kernel
driver writing the ICR (Interrupt Command Register).

Fig. 1. Flow chart of SMI Handling in EDK2

When an SMI IPI is triggered, the processor waits for the current instructions
to complete, saves its running context, and enters SMM, executing the code in
the SmiEntry.nasm file. Afterwards, EDK2 waits for all the CPU cores to enter
SMM, using the SmiRendezvous() function. At that point, debug functions are
activated (CpuSmmDebugEntry() function) before the execution of each handler
registered for that SMI in SmiManage(). Once the related handlers have been
executed, the flow goes back along the same path, and when all CPUs are ready
to switch back to the previous execution mode, they execute the rsm
instruction, which restores the context and resumes execution from the point
where it was interrupted.

4 SMM Performance Assessment for Mixed Criticality Systems
This work evaluates the overhead that would be introduced by hosting a safety
critical operating system in SMM along with Linux. To do this, the SMI IPI
context switch time and the round trip time overhead have been measured.
The platform used for the experiments is the Minnowboard Turbot Dual Ethernet,
equipped with an Intel E3845 CPU running at 1.917 GHz. This platform was
chosen because of its open firmware, which was a key requirement for the
measurements of this paper. In fact, a purpose-built version of EDK2 (version
stable202002) has been used together with Ubuntu 18.04 with kernel 5.7.0.
Moreover, the hardware time stamp counter of the Intel CPU has been used to
count the number of clock cycles needed [10]. In addition, to minimize the
processor synchronization time, the number of Linux CPU cores has been limited
to 1 using the boot argument maxcpus=1. No specific performance optimization
has been applied to EDK2, which was however compiled in release mode to remove
unnecessary prints on the screen. Lastly, Hyper-Threading and CPU frequency
scaling have been disabled. All tests have been repeated 200 times.
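Cycle counts read from the time stamp counter have to be converted to time and summarized over the repetitions. A minimal sketch of that post-processing is given below; it assumes that the counter ticks at the nominal 1.917 GHz clock and that the sample standard deviation is the statistic of interest. The cycle counts in the example are hypothetical placeholders, not measured data.

```python
import statistics

TSC_HZ = 1.917e9  # assumption: the TSC ticks at the nominal CPU frequency


def cycles_to_us(cycles):
    """Convert a TSC cycle count to microseconds."""
    return cycles / TSC_HZ * 1e6


def summarize(cycle_samples):
    """Return (mean, sample standard deviation) in microseconds."""
    times_us = [cycles_to_us(c) for c in cycle_samples]
    return statistics.mean(times_us), statistics.stdev(times_us)


# Hypothetical cycle counts for illustration only; the paper repeats each test 200 times.
example_cycles = [2600, 2700, 2650, 2720, 2655]
mean_us, std_us = summarize(example_cycles)
print(f"{mean_us:.2f} us +/- {std_us:.2f} us")
```

With frequency scaling disabled, as in the setup above, a fixed conversion factor of this kind is a reasonable simplification.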
4.1 Measurements on the Target Board

Benchmarks focus on the context switch time, i.e., the time required for a
context change from the kernel to SMM; the reverse context switch time, i.e.,
the time required to exit SMM back to the Linux kernel; and the Round Trip Time
(RTT), i.e., the time required to switch to SMM, execute the standard handlers
and exit back to the Linux kernel. In addition, we calculate the Minimal Round
Trip Time (Minimal RTT), considered as the theoretical minimum overhead
required to enter and exit the SMM environment.
In more detail, for the context switch time, the number of clock cycles spent
to go from the kernel driver to SmiEntry.nasm (Step 1 in Fig. 1) has been
measured. This test aims to measure the shortest context switch overhead,
minimizing the number of software components involved in the measurement. For
the reverse context switch time, the final context switch triggered by the rsm
assembly instruction (Step 7 in Fig. 1) has been measured. Finally, for the
Round Trip Time (RTT), the time spent in SMM, from the initial context switch
(Step 1) up to Step 7 in Fig. 1, has been measured.
The benchmarks give an average context switch time of 1.39 µs, with a standard
deviation of 0.11 µs. The reverse context switch time is 11.34 µs, with a
standard deviation of 0.26 µs. The Round Trip Time (RTT) measurements show an
average of 272 µs with a standard deviation of 18 µs. This time is directly
affected by the number of handlers executed by the system once an SMI arrives.
On the platform used, two SMI handlers (PchSmiDispatcher and
DigitalThermalSensor) were active during the tests, with an impact of 235 µs.
These handlers are distributed by Intel as a binary blob to be included during
the compilation of EDK2, and could not be removed. Finally, by summing the
context switch time and the reverse context switch time, we obtain a Minimal
RTT of 12.73 µs.

Table 1. SMM assessment experimental results overview

| | Measured overhead (µs) | Standard deviation (µs) |
|---|---|---|
| Context switch | 1.39 | 0.11 |
| Reverse context switch | 11.34 | 0.26 |
| RTT | 272 | 18 |
| Minimal RTT | 12.73 | – |

These results (summarized in Table 1) are encouraging for the authors' goal,
considering that for mixed criticality systems such as automotive ones the
worst case response time is on the order of milliseconds [11,12].
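The arithmetic behind these figures can be reproduced in a few lines. The 1 ms response-time budget used below is an assumption chosen only to make "on the order of milliseconds" concrete; it is not a value given in the paper, and the variable names are illustrative.

```python
# Measured averages reported in Table 1 (microseconds).
CONTEXT_SWITCH_US = 1.39
REVERSE_CONTEXT_SWITCH_US = 11.34
RTT_US = 272
HANDLERS_IMPACT_US = 235  # PchSmiDispatcher + DigitalThermalSensor

# Assumption for illustration: a 1 ms worst case response time budget.
BUDGET_US = 1000.0

minimal_rtt_us = CONTEXT_SWITCH_US + REVERSE_CONTEXT_SWITCH_US
rtt_without_handlers_us = RTT_US - HANDLERS_IMPACT_US
budget_share_pct = 100.0 * minimal_rtt_us / BUDGET_US

print(f"Minimal RTT: {minimal_rtt_us:.2f} us")
print(f"RTT without the two fixed handlers: {rtt_without_handlers_us} us")
print(f"Share of a 1 ms budget: {budget_share_pct:.2f}%")
```

Under this assumption the minimal entry and exit cost takes a little over 1% of the budget, while the two non-removable handlers account for most of the measured RTT.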
5 Conclusion and Future Work


This work evaluates SMM as a technology for building mixed critical
virtualization solutions. In fact, the isolation, low overhead and low
footprint (important for certifiability) that SMM can provide make it a
promising technology to protect safety critical workloads such as an RTOS.
The SMI IPI context switch time and the Minimal Round Trip Time overhead
have been measured, considering them as key indicators because they represent
the overhead to pay when two mixed critical operating systems (e.g., Linux
and an RTOS) coexist on the same platform. With results as low as 1.39 µs and
12.73 µs for the context switch and the Minimal Round Trip Time respectively,
SMM is seen as a technically promising way to implement strongly isolated
mixed critical solutions.
Future work includes the implementation of EDK2 extensions to support the
co-execution of a real time operating system in SMM together with Linux. In
addition, further benchmarks involving a higher number of cores and AMD
hardware platforms are of interest to evaluate the performance of SMM mixed
critical solutions.

References
1. Sierra-Arriaga, F., Branco, R., Lee, B.: Security issues and challenges for virtualization technologies. ACM Comput. Surv. (CSUR) 53(2), 1–37 (2020)
2. Duflot, L., Grumelard, O., Levillain, O., Morin, B.: ACPI and SMI handlers: some limits to trusted computing. J. Comput. Virol. 6(4), 353–374 (2010)
3. Rauchberger, J., Luh, R., Schrittwieser, S.: Longkit-a universal framework for BIOS/UEFI rootkits in system management mode. In: ICISSP, pp. 346–353 (2017)
4. Embleton, S., Sparks, S., Zou, C.C.: SMM rootkit: a new breed of OS independent malware. Secur. Commun. Netw. 6(12), 1590–1605 (2013)
5. Wojtczuk, R., Rutkowska, J.: Attacking intel trusted execution technology. Black Hat DC 2009, pp. 1–6 (2009)
6. Delgado, B., Karavanic, K.L.: Performance implications of system management mode. In: IEEE International Symposium on Workload Characterization (IISWC), pp. 163–173. IEEE (2013)
7. Zhang, F., Leach, K., Wang, H., Stavrou, A.: Trustlogin: securing password-login on commodity operating systems. In: Proceedings of the 10th ACM Symposium on Information, Computer and Communications Security, pp. 333–344 (2015)
8. Azab, A.M., Ning, P., Zhang, X.: SICE: a hardware-level strongly isolated computing environment for x86 multi-core platforms. In: Proceedings of the 18th ACM Conference on Computer and Communications Security, pp. 375–388 (2011)
9. Intel®: Intel® 64 and IA-32 architectures software developer's manual combined volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4 (2018)
10. Paoloni, G.: How to benchmark code execution times on intel IA-32 and IA-64 instruction set architectures. Intel Corporation 123 (2010)
11. de Carvalho, S.M.T., Campos, G.L.: Worst case response time approach evaluation for computing can messages response time in an automotive network. In: Brazilian Power Electronics Conference (COBEP), pp. 1–6. IEEE (2017)
12. Junior, E.A.S., de Araujo-Filho, P.F., Campelo, D.R.: Experimental evaluation of cryptography overhead in automotive safety-critical communication. In: IEEE 87th Vehicular Technology Conference (VTC Spring), pp. 1–5. IEEE (2018)
Photonic Circuits and IoT for
Communications
A Novel Pulse Compression Scheme in Coherent
OTDR Using Direct Digital Synthesis
and Nonlinear Frequency Modulation

Yonas Muanenda(✉), Stefano Faralli, Philippe Velha, Claudio Oton, and Fabrizio Di Pasquale

Scuola Superiore Sant’Anna, TeCIP Institute, Via Moruzzi 1, 56124 Pisa, Italy
y.muanenda@santannapisa.it

Abstract. We report on an experimental demonstration of a novel scheme for
enhancing the spatial resolution of coherent OTDR using chirped pulse
compression based on Direct Digital Synthesis (DDS) of the frequency modulation
of probing pulses. We show that DDS can be used to generate linear frequency
modulated (LFM) optical pulses with high fidelity and enables programmable
customization of the frequency modulation laws of pulse waveforms for advanced
pulse compression employing matched filtering with nonlinear frequency
modulated (NLFM) pulses. We demonstrate the effectiveness of the method in
significantly suppressing ambiguity in pulse compression by confirming the clear
separation of events spaced by 50 cm at ~1.13 km using 1.2-μs pulses. Thanks
to the use of rigorously defined, DDS-generated NLFM waveforms with desired
features, error-inducing sidelobes in the pulse autocorrelation function are
suppressed by more than ~20 dB, offering an improvement of ~16 dB compared to
conventional LFM pulses.

1 Introduction

Optical fiber sensors have been a subject of wide investigation due to their unique features
which make them suitable for integrity and safety monitoring in many sectors including
the civil engineering, transportation and energy industries [1]. Specifically, distributed
optical fiber sensors enable the concurrent measurement of physical parameters over an
extended region. They are principally based on the observation of the backscattering sig-
nal from optical fibers employing Raman, Brillouin and Rayleigh scattering phenomena
whose local characteristics including intensity, frequency and/or phase vary depending
on the change in environmental parameters [2]. Coherent Rayleigh scattering in
particular has attracted attention in recent years due to its ease and speed of
measurement, and has been employed in Distributed Acoustic Sensing (DAS), which
has interesting applications in, among others, intrusion and leakage detection
and railway infrastructure monitoring systems. While there are configurations
employing the frequency and correlation domains, a commonly employed technique
is known as coherent or phase-sensitive Optical Time Domain Reflectometry
(ϕ-OTDR) [1]. The method involves sending a pulse of coherent light along the
fiber and measuring the distributed echo of the light

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 173–181, 2021.
https://doi.org/10.1007/978-3-030-66729-0_20
174 Y. Muanenda et al.

as it traverses through each location, with the measured delay being mapped to a spe-
cific location with known speed of light in the fiber. The resulting coherent Rayleigh
backscattering from the fiber exhibits coherent speckles. When a disturbance such as an
intrusion, heating or vibration happens at a specific point, the optical path length and
refractive index of the fiber change, thereby altering the local phase of the speckles.
The change in phase manifests itself as a change in intensity or the actual phase of the
speckles, which can in turn be used to measure the location, frequency and intensity of
vibrations or quantify the change in temperature.
A number of investigations have been done to improve the performance of conven-
tional ϕ-OTDR in recent years targeting dynamic performance enhancement, improved
measurement precision [3], extending sensing distance, enhancing the measurand
dynamic range and enabling more spatially resolved sensing [1]. A technique used to
improve the spatial resolution in ϕ-OTDR consists in the use of pulse probes with lin-
ear frequency modulation (LFM) within the pulse duration, instead of a monochromatic
light, with subsequent matched filtering to obtain compressed responses [4–6]. Although
simple to implement, linear chirping suffers from ambiguity in resolving events near
resolution limits. More complex waveforms which are broadly known as Nonlinear
Frequency Modulated (NLFM) waveforms offering enhanced spatially resolved mea-
surements have been employed in radiofrequency electronic systems [7]. The optical
fiber sensing community can benefit from further research on flexible implementation
of frequency modulation functions and the use of optical NLFM pulses in fiber optic
systems. Until recently, one of the hurdles of flexible frequency modulation has been the
challenge in the synthesis of optical pulses with rigorously defined features which bring
specific benefits. Among others, a key parameter whose optimization reduces
ambiguity in pulse compression is the suppression of the sidelobes in the
Autocorrelation Function (ACF) of pulses. When the power of the sidelobes
relative to the main lobe is low, two closely spaced events or targets can be
resolved without error-inducing ambiguity in the presence of noise. Waveforms
with desirable ACFs exhibit frequency modulation features that are not
straightforward to implement with simple analogue components. Recently,
however, there has been markedly rapid development in the technology of
advanced digital-to-analogue converters, enabling direct digital synthesis of
complex waveforms in high-speed optical communication systems and
radiofrequency applications [8].
In this contribution, we show that frequency modulation in optical reflectometry can
be done in a flexible and programmable manner using Direct Digital Synthesis (DDS) of
desired pulse frequency modulation functions. We experimentally demonstrate the use
of DDS for the implementation of an adaptable ϕ-OTDR system for pulse compression
with linear chirping as well as a significant disambiguation of events near the limits of the
spatial resolution using rigorously defined nonlinear chirping, for the first time to the best
of our knowledge. When used to compress a 1.2-μs pulse, the scheme enables advanced
filtering with significant disambiguation in the backscattering response of two events
placed at ~1.13 km and separated by 50 cm. We confirm that this is due to achievable
sidelobe suppressions in the ACF of the NLFM, which is ~20 dB below the main lobe
with no averaging of the raw backscattering response, offering an improvement of more
than ~16 dB compared to that of the conventional LFM pulse.
A Novel Pulse Compression Scheme in Coherent OTDR 175

2 Theory and Operating Principle


In conventional OTDR techniques, since narrow pulses offer spatially resolved measure-
ments while containing less signal energy and vice versa, there is an inherent tradeoff
between pulse energy and spatial resolution. Matched filtering is a mechanism which
can be used to address this problem wherein a waveform of chosen characteristics is
sent along a noisy channel and its altered echo is filtered with the original one, resulting
in the maximization of the SNR [9]. Thanks to matched filtering, as shown in Fig. 1(a),
a wide pulse having width T and chosen frequency content can be used to probe a fiber
to obtain a response equivalent to that of a narrow pulse of width t. This effectively
offsets the energy-resolution tradeoff, and the resulting response is said to be compressed
in time by a factor of T/t. A commonly used matched filter for pulse compression is
the one based on an LFM pulse in which the signal frequency varies linearly within the
pulse duration. For an LFM pulse having a frequency per unit time ratio (chirp rate) of
k and width T, the waveform is given by:
 
$$ s(t) = \mathrm{rect}\!\left(\frac{t}{T}\right)\exp\!\left[\,j2\pi\left(f_c t + \frac{k}{2}\,t^{2}\right)\right], \tag{1} $$

where f_c is the frequency of the modulated optical carrier. When used in a generic
OTDR technique, the resulting output of the matched filter is given by a convolution of
the scaled amplitude response of the fiber with a sinc-like function given by [4]:

(t) = rect(t/2T ) × T sin[π kt(T − |t|)]/(π kTt). (2)

The zero-crossing points of the main lobe of this response determine the range
resolution that can be obtained for a given pulse width and modulation frequency. The
spatial resolution Δz of the response for an optical fiber is provided by the Rayleigh
criterion and, for B = kT, which is commonly known as the signal base, is given by [4,
9]:

$$ \Delta z = \frac{c}{2nB}, \tag{3} $$

where c is the speed of light in free space and n is the group refractive index of
the fiber. Hence, the spatial resolution is now dependent on the total bandwidth content,
as opposed to only the pulse width for conventional OTDR with a monochromatic pulse.
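As an illustrative check of Eqs. (1) and (3), the following minimal Python sketch samples an LFM pulse and evaluates the resolution formula. The group index of 1.468, the 10 GS/s sampling rate and the function names are assumptions chosen for illustration; the 1.2-μs width and the 700 MHz to 1.2 GHz sweep are the values quoted later in the experimental results.

```python
import numpy as np

C = 299_792_458.0   # speed of light in free space, m/s
N_GROUP = 1.468     # assumed group index of a standard single-mode fiber

def lfm_pulse(T, B, f_start, fs):
    """Sample the complex LFM pulse of Eq. (1) on t in [-T/2, T/2)."""
    n = int(round(T * fs))
    t = np.arange(n) / fs - T / 2      # support of rect(t/T), centred on t = 0
    k = B / T                          # chirp rate, Hz/s
    fc = f_start + B / 2               # centre of the RF sweep (not the optical carrier)
    return t, np.exp(1j * 2 * np.pi * (fc * t + 0.5 * k * t**2))

def spatial_resolution(B, n_group=N_GROUP):
    """Eq. (3): resolution set by the total swept bandwidth B."""
    return C / (2 * n_group * B)

FS = 10e9
t, s = lfm_pulse(T=1.2e-6, B=500e6, f_start=700e6, fs=FS)
# Instantaneous frequency from the sample-to-sample phase increment
f_inst = np.diff(np.unwrap(np.angle(s))) * FS / (2 * np.pi)
print(f"sweep: {f_inst[0] / 1e6:.1f} MHz -> {f_inst[-1] / 1e6:.1f} MHz")
print(f"resolution for B = 500 MHz: {spatial_resolution(500e6):.3f} m")
```

With these assumed values the resolution depends only on the swept bandwidth, so widening the pulse at fixed B adds energy without changing the result of `spatial_resolution`.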
When using NLFM instead, the frequency varies within the pulse duration according
to a certain nonlinear modulation law which in turn, thanks to the fact that frequency
and phase are an integration-differentiation pair, defines the phase modulation. These are
generally complex functions whose precise features are determined by the parameter
to be optimized, which in our case is sidelobe suppression in the ACF. We have
explored different NLFM waveforms with this feature,
and the chosen nonlinear frequency modulation scheme has the following waveform [7]:
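$$ s(t) = A\,\mathrm{rect}\!\left(\frac{t - T/2}{T}\right)\exp\!\left[\,j2\pi\left(\frac{T\sqrt{\Delta^{2}+4}}{2\Delta} - \left(\frac{T^{2}(\Delta^{2}+4)}{4\Delta^{2}} - \left(t - \frac{T}{2}\right)^{2}\right)^{1/2}\right)\right], \tag{4} $$

where Δ and T are the nonlinear frequency span and the pulse width, respectively.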
We first employ our DDS to confirm high fidelity and desired frequency features of
pulses and compression of responses for sub-meter resolution and SNR enhancement
with LFM pulses, and subsequently demonstrate the benefits of the technique to enable
advanced filtering with NLFM pulses. The normalized frequency and phase modulation
laws for the NLFM pulse used in our DDS-based ϕ-OTDR are shown in Fig. 1(b). It is
worth noting that, owing to the rigorous way in which the frequency modulation function
in Eq. (4) is defined, it would have been impossible to employ such a waveform in an
optical setup without the DDS technique, which allows for flexibility in rigorously and
programmatically controlling the pulse waveform characteristics.
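A short numerical sketch can make the link between the phase in Eq. (4) and the resulting frequency law concrete. It assumes the bracketed term of Eq. (4) reads R − [R² − (t − T/2)²]^(1/2) with R = T(Δ² + 4)^(1/2)/(2Δ), works in the normalized units of that expression, and uses hypothetical function names and parameter values.

```python
import numpy as np

def nlfm_phase(t, T, delta):
    """Phase (rad) of the exponent in Eq. (4), under the reading stated above."""
    R = T * np.sqrt(delta**2 + 4) / (2 * delta)
    return 2 * np.pi * (R - np.sqrt(R**2 - (t - T / 2) ** 2))

def nlfm_frequency(t, T, delta):
    """Normalized instantaneous frequency, (1/2pi) * dphi/dt, derived analytically."""
    R = T * np.sqrt(delta**2 + 4) / (2 * delta)
    x = t - T / 2
    return x / np.sqrt(R**2 - x**2)

T, delta = 1.0, 2.0                    # illustrative, normalized values
t = np.linspace(0.0, T, 10001)
f_analytic = nlfm_frequency(t, T, delta)
# Cross-check: differentiate the phase numerically and compare
f_numeric = np.gradient(nlfm_phase(t, T, delta), t) / (2 * np.pi)

print("frequency at pulse edges:", f_analytic[0], f_analytic[-1])
print("max |numeric - analytic| (interior):",
      np.abs(f_numeric[1:-1] - f_analytic[1:-1]).max())
```

Under this reading the frequency is zero at the pulse centre and reaches ±Δ/2 at the edges, sweeping fastest near the edges, which is the kind of sample-by-sample law that a programmable synthesizer can load directly.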

Fig. 1. (a) Mechanism of a generic pulse compression (b) phase and frequency modulation laws
for NLFM pulse used in pulse compression

3 Experimental Setup

The experimental setup used to demonstrate the effectiveness of the technique is shown
in Fig. 2. First, light from a coherent laser with a linewidth of ~200 kHz is allowed to
pass through an optical coupler whose 1% tap acts as a local oscillator while the 99%
is fed to an I-Q modulator used to generate the optical pulses. The modulator consists of a nested
Mach-Zehnder configuration having separate controls for the two arms of the inner
and one arm of the outer structure. The chirped pulse waveform is fed to the two RF
ports of the modulator which constitute the in-phase and quadrature components of
the modulating signal. The modulator is driven in Single-Sideband Suppressed Carrier
(SSB-SC) modulation thanks to the use of a 90-degree electrical hybrid which takes in
the analogue RF signal with custom, digitally synthesized features and yields two signals
which constitute the in-phase and quadrature components of the driving RF signal. The
electrical hybrid is fed by the DDS module with an inbuilt DAC, into which programmable
waveforms with the desired features, as defined in Eqs. (1) and (4), are loaded. To ensure SSB-SC,
the three bias controls of the I-Q modulator are carefully adjusted to make sure the
carrier and one sideband remain suppressed by up to 25 dB, and the frequency modulated
chirp is allowed to appear only in one sideband.

The SSB-SC optical pulse is subsequently amplified with an Erbium-Doped Fiber
Amplifier (EDFA) and filtered with an Optical Bandpass Filter (OBPF) to keep the Amplified
Spontaneous Emission (ASE) noise from reaching subsequent stages. The signal is
then fed to an Acousto-Optic Modulator (AOM), which increases the extinction ratio
of the probing pulses and further suppresses the ASE noise in the non-zero level of
the pulse. After amplification and filtering with a second EDFA and OBPF pair, the
resulting probe is sent into the Fiber Under Test (FUT) composed of two spools of
fiber separated by patch cords of varying length, through a three-port optical circulator.
The coherent Rayleigh backscattering from the fiber which appears in the return port
of the circulator is then mixed with the local oscillator and the beating is detected
with a 10-GHz photodetector. The resulting uncompressed, raw signal is acquired in
real time using an ADC with a 10 GS/s sampling rate. Since the aim is to check the
fidelity of the interrogating pulses and to verify that compressed responses with SNR
improvement can be obtained, no averaging was performed on the acquired raw traces. The acquired signal
is subsequently processed using matched filtering to obtain a compressed response. This
operation involves correlation of the uncompressed, raw traces with the interrogating
frequency modulated pulse waveform. For more accuracy, the latter is characterized by
mixing a tap of the optical pulse sent into the sensing fiber with the local oscillator.
The waveform of the pulse used in the DDS module can be directly employed in the
pulse compression but the characterized optical pulse is used instead since it gives more
accurate results which capture changes introduced to the pulse in the analogue RF and
optical components.
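The matched-filtering step described above amounts to cross-correlating the raw trace with the reference pulse. The sketch below illustrates this on a synthetic trace; the ideal baseband LFM reference, the two point reflectors, the noise level, the group index of 1.468 and all names are assumptions made for illustration, not the measured data or the characterized optical pulse used in the experiment.

```python
import numpy as np

FS, T, B = 10e9, 1.2e-6, 500e6          # sampling rate, pulse width, chirp bandwidth
C, N_GROUP = 299_792_458.0, 1.468       # N_GROUP is an assumed fiber group index

def matched_filter(trace, ref):
    """Cross-correlate trace with ref via FFT; output index m is the echo start sample."""
    nfft = 1 << (len(trace) + len(ref) - 2).bit_length()   # long enough to avoid wrap-around
    spec = np.fft.fft(trace, nfft) * np.conj(np.fft.fft(ref, nfft))
    return np.fft.ifft(spec)[: len(trace)]

n = int(round(T * FS))
t = np.arange(n) / FS - T / 2
pulse = np.exp(1j * np.pi * (B / T) * t**2)        # baseband LFM reference

# Two reflectors 0.5 m apart: round-trip delay difference expressed in samples
gap = int(round(2 * N_GROUP * 0.5 / C * FS))
d1, d2 = 10_000, 10_000 + gap

rng = np.random.default_rng(0)
trace = 0.05 * (rng.standard_normal(40_000) + 1j * rng.standard_normal(40_000))
trace[d1:d1 + n] += pulse                           # first echo
trace[d2:d2 + n] += 0.8 * pulse                     # second, slightly weaker echo

compressed = np.abs(matched_filter(trace, pulse))
print("echo separation in samples:", gap)
print("strongest compressed peak at sample:", int(np.argmax(compressed)))
```

In the raw synthetic trace the two echoes overlap for almost the whole pulse duration; after correlation each collapses to a narrow peak at its own start sample.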

Fig. 2. Schematic of the experimental setup of ϕ-OTDR with DDS-generated pulse waveforms

4 Experimental Results and Discussions


To verify that the method of digital synthesis of pulses yields pulse shapes and spectra
with the desired time and frequency features, we characterized a sample linearly chirped
pulse of width 1.2 μs and total frequency modulation of 500 MHz, from 700 MHz to
1.2 GHz. These values can be flexibly set by the digital synthesis tool and ultimate values
used in the optical setup are determined by the bandwidth of analogue RF components.
A sample acquisition of the chirped pulse is shown in Fig. 3(a), which exhibits the desired
chirp length and pattern, while the power spectrum given in Fig. 3(b) confirms the expected
chirp frequency range and bandwidth.


Fig. 3. (a) Sample LFM pulse in time domain (b) Pulse chirp spectrum

Subsequently, the linearly chirped pulse is sent into the sensing fiber and the
backscattering is observed on an oscilloscope. A raw, uncompressed trace of a
sample acquisition is shown in Fig. 4(a). Note that the patch cord between FUT-1 and
FUT-2 cannot be distinguished in the uncompressed trace. The raw trace is then compressed via
matched filtering with the response of the characterized optical LFM pulse, which itself
has been digitized after photodetection. The raw (unaveraged) compressed responses obtained
with the LFM pulse are shown in Fig. 4(b) for three cases: FUT-1 alone (green), FUT-1 and
patch cord (red), and FUT-1, FUT-2 and patch cord (blue).
As shown, the corresponding connector reflections at the end of FUT-1 are clearly
visible in all traces, and so are the positions of the patch cord in the two traces in which
it is included. There is also a higher SNR in compressed traces thanks to the expected
reduction in the noise due to matched filtering. The same observations were confirmed
when changing various patch cord length values, which were progressively reduced to
the meter and sub-meter scales. The compressed responses of the linear chirped probe for
two sample patch cord length values of 1 m and 50 cm, with the starting and end points
at 1128.5 m, are depicted respectively in Fig. 5(a) and (b). This method of determining
the resolution is more effective compared to ones in conventional schemes based solely
on the duration and/or total bandwidth content of the probing pulse, as the latter can lead
to errors due to changes in pulses during propagation along the fiber.
It is evident from the compressed responses that the length of the patch cord in each
case is clearly visible, as well as the reflections from both ends, confirming effectiveness
of matched filtering with DDS-generated LFM pulses. Note that resolving a 50-cm
span using conventional monochromatic pulses would have required 5-ns pulses instead
of the 1.2-μs pulse used here, corresponding to a compression ratio of more than 240. It can
also be observed that there are some sidelobe oscillations around the end
points, consistent with the ones present in RF applications using conventional LFM
pulses. These features inevitably result in ambiguities in identifying fixed-point events
and nonlinearity in quantitative measurements of spatially dispersed ones in distributed
sensing.
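The compression ratio quoted above can be checked with a few lines of arithmetic. The sketch assumes a fiber group index of 1.468 (not stated in the text) and the usual single-pulse OTDR relation between pulse width and two-point resolution, Δz = cτ/(2n); the names are illustrative.

```python
C = 299_792_458.0   # speed of light in free space, m/s
N_GROUP = 1.468     # assumed group index of the fiber

def pulse_width_for_resolution(dz, n_group=N_GROUP):
    """Width of an unmodulated pulse whose resolution c*tau/(2n) equals dz."""
    return 2 * n_group * dz / C

tau = pulse_width_for_resolution(0.5)    # equivalent width for a 50-cm span
ratio = 1.2e-6 / tau                     # probe width in the experiment / equivalent width
print(f"equivalent pulse width: {tau * 1e9:.2f} ns, compression ratio: {ratio:.0f}")
```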
To address the issue of ambiguity using the flexibility of our DDS-based scheme,
an NLFM pulse with the frequency modulation law defined in Eq. (4) is used to probe
the fiber and a sample compressed response is shown in Fig. 6, together with that of the
LFM pulse of the same width and frequency content. The reflections from the connectors

Fig. 4. Sample (a) raw backscattering traces for the whole sensing fiber with FUT-1, patch cord
and FUT-2; (b) compressed backscattering response for FUT-1 and FUT-2 in the presence and
absence of a sample patch cord

Fig. 5. Compressed LFM response for measurement of (a) 1-m and (b) 0.5-m patch cord spans

are clearly visible, as well as the expected gap corresponding to the 50-cm patch cord,
consistent with the features of an OTDR trace. There are no discernible sidelobes or
ambiguity associated with the compressed trace for the NLFM pulse, in clear contrast
to those of the LFM pulse, confirming the benefits of advanced matched filtering with
nonlinear chirping.

Fig. 6. Compressed NLFM and LFM responses at a 50-cm patch cord span

To determine the source of the significantly reduced ambiguity, the plots of the
ACFs of LFM and NLFM optical pulses, which were characterized using coherent
detection with the local oscillator used for the matched filtering, are depicted in Fig. 7.
The sidelobes in the LFM pulses are seen to be ~4 dB below the main lobe, similar
to linearly chirped schemes in presence of noise which suffer from degradation of the
SNR resulting in ACF sidelobe suppression values down to a few dB [10]. The sidelobe
suppression in the ACF of the NLFM is instead ~20 dB, confirming that the suppression
of the ambiguity is owed to the enhanced features of the rigorously defined, DDS-
generated NLFM. Note that no averaging has been performed in the acquisition of both
the raw traces used to obtain compressed responses in Fig. 6 and the characterized pulses
whose ACFs are shown in Fig. 7.
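The sidelobe figures above are peak sidelobe levels of the ACF relative to its main lobe. A minimal sketch of how such a level can be computed is given below; it uses an ideal, noise-free synthetic LFM pulse (an assumption), so its output is not expected to reproduce the measured values, and the helper names are hypothetical.

```python
import numpy as np

def autocorrelation(s):
    """Linear autocorrelation of s via FFT, with zero lag moved to the centre."""
    nfft = 1 << (2 * len(s) - 2).bit_length()
    acf = np.fft.ifft(np.abs(np.fft.fft(s, nfft)) ** 2)
    return np.fft.fftshift(acf)

def peak_sidelobe_level_db(acf):
    """Highest sidelobe relative to the main-lobe peak, in dB (negative means below)."""
    mag = np.abs(acf)
    p = int(np.argmax(mag))
    right = p
    while right + 1 < len(mag) and mag[right + 1] < mag[right]:
        right += 1                      # walk down to the first null on the right
    left = p
    while left - 1 >= 0 and mag[left - 1] < mag[left]:
        left -= 1                       # and to the first null on the left
    sidelobes = np.concatenate((mag[:left], mag[right + 1:]))
    return 20 * np.log10(sidelobes.max() / mag[p])

fs, T, B = 10e9, 1.2e-6, 500e6
t = np.arange(int(round(T * fs))) / fs - T / 2
lfm = np.exp(1j * np.pi * (B / T) * t**2)          # ideal baseband LFM pulse
psl = peak_sidelobe_level_db(autocorrelation(lfm))
print(f"ideal LFM peak sidelobe level: {psl:.1f} dB")
```

The same function could be applied to the ACF of any digitized pulse, including a characterized one, as long as the main lobe is the global maximum.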

Fig. 7. Relative power levels of sidelobes in autocorrelation functions for linear and nonlinear
frequency modulated pulses used in matched filtering

5 Conclusions
We have proposed and experimentally demonstrated the use of DDS-generated chirped
pulses for pulse compression in a ϕ-OTDR system which enables sub-meter spatial resolution
using advanced matched filtering. We have shown, for the first time to the best
of our knowledge, that DDS-generated pulses with desired features can be employed
in enhanced matched filtering with NLFM optical pulses having strict frequency mod-
ulation laws. Experimental results show pulse compression with LFM chirping yields
compressed responses which enable resolution of sub-meter events while exhibiting
ambiguity in resolving events which introduce errors in sensing near the limits of spatial
resolution.
The ambiguity in the compressed response is significantly suppressed using matched
filtering with NLFM pulses, thanks to the enhancement in the ACF, whose sidelobes are
suppressed by ~20 dB, an improvement of ~16 dB compared to those
of the LFM pulse, when a 1.2-μs pulse is compressed to resolve two reflection events
separated by 50 cm at ~1.13 km. The contribution confirms that DDS enables enhanced
spatially resolved measurements in optical reflectometry via pulse compression
with advanced matched filtering employing custom, rigorously defined frequency
modulation schemes which offer specific advantages not implementable with
conventional techniques.

References
1. Muanenda, Y.: Recent advances in distributed acoustic sensing based on phase-sensitive
optical time domain reflectometry. J. Sens. 3897873 (2018)
2. Muanenda, Y., Oton, C.J., Di Pasquale, F.: Application of Raman and Brillouin scattering
phenomena in distributed optical fiber sensing. Front. Phys. 7, 155 (2019)
3. Muanenda, Y., Faralli, S., Oton, C.J., Cheng, C., Yang, M., Di Pasquale, F.: Dynamic phase
extraction in high-SNR DAS based on UWFBGs without phase unwrapping using scalable
homodyne demodulation in direct detection. Opt. Express 27, 10644–10658 (2019)
4. Zou, W., et al.: Optical pulse compression reflectometry: proposal and proof-of-concept
experiment. Opt. Express 23, 512–522 (2015)
5. Lu, B., et al.: High spatial resolution phase-sensitive optical time domain reflectometer with
a frequency-swept pulse. Opt. Lett. 42, 391–394 (2017)
6. Zou, W., et al.: Optical pulse compression reflectometry with 10 cm spatial resolution based
on pulsed linear frequency modulation. In: Optical Fiber Communication Conference, paper
W3I.5. Optical Society of America, Los Angeles (2015)
7. Leśnik, C.: Nonlinear frequency modulated signal design. ACTA Phys. Pol. A 116, 351–354
(2009)
8. Electronic Design: How New DAC Technologies are Changing Signal Generation for Test
(2017). https://bit.ly/35dxalj
9. Richards, M.A.: Fundamentals of Radar Signal Processing. McGraw-Hill Education, New
York (2005)
10. Grodensky, D.: Laser Ranging using Incoherent Pulse Compression Techniques (Chap. 3).
Bar-Ilan University, Ramat-Gan (2014)
Design and Analysis of RF/High-Speed SERDES
in 28 nm CMOS Technology for Aerospace
Applications

Francesco Cosimi1(B) , Gabriele Ciarpi2(B) , and Sergio Saponara1(B)


1 Università di Pisa, Dip. Ingegneria dell‘Informazione, Via G. Caruso 16, 56122 Pisa, Italy
f.cosimi1@studenti.unipi.it, sergio.saponara@iet.unipi.it
2 INFN, Largo B. Pontecorvo, 56127 Pisa, Italy

gabriele.ciarpi@ing.unipi.it

Abstract. This paper proposes a transistor-level design of a high-speed 10-bit
Serializer-Deserializer (SerDes) circuit for aerospace applications in a 28 nm
CMOS technology. A data rate above 10 Gbit/s has been taken as the target for the
development, together with a −50 °C to 125 °C temperature range. The extreme
performance requirements made it necessary to adopt a full-custom design
and Current Mode Logic (CML) circuits. This solution brings advantages
in devices where high speeds are required, overcoming the limits of standard CMOS
logic. Moreover, an aerospace application requires an analysis of
radiation and its effects on integrated circuits. The relatively low
cumulative radiation doses in this environment allow them to be neglected, so that
attention is focused on the disturbances caused by high-energy particles hitting the
substrate (Single Event Effects). These, in fact, constitute one of the main causes
of failures in electronic devices for avionic systems.

1 Introduction
In safety-critical applications operating in harsh environments, such as aerospace and
automotive, there is an increasing need for robust high-speed links to manage the on-board
transfer of the high data rates generated by large-bandwidth sensors (e.g. radars, lidars,
detector arrays, multi-spectral cameras). The current trend in high-speed communication is to
transfer the data over optical media such as fibers. Indeed, compared to electrical media, fibers
have extremely low transmission loss, greater bandwidth, and higher tolerance to
electromagnetic (EM) interference and to radiation. The state-of-the-art aerospace link is
represented by the recent SpaceFibre standard, targeting a data rate of 6.25 Gbit/s [1].
However, data rates higher than 10 Gbit/s are also needed to sustain communication
with high-resolution instrumentation. In automotive, the new trend is represented by
the on-board integration of 10G-Ethernet fiber links [2]. Since the automotive
environment is full of electromagnetic interference generated by electric motors and power
converters [3], fiber links improve the communication reliability of safety-critical
systems.
A key element of such systems is the Serializer-Deserializer (SerDes) unit, which
interfaces the data provided by parallel electrical buses to serial links and vice
versa. The fundamental relevance of this device for high-speed links is also highlighted by
the recent birth of the Automotive SerDes Alliance (ASA) [4], which targets automotive
communication with a data rate of 13 Gbit/s.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 182–191, 2021.
https://doi.org/10.1007/978-3-030-66729-0_21

As shown in Fig. 1, the SerDes device, made of a serializer and a deserializer
block, is typically part of a more complex system that includes both electrical and
optical interfaces. The Serializer unit works on the transmission side, in a high-speed
transmission architecture where multiple lines coming from a digital front-end are merged
into a single multi-Gbit/s stream, which is passed to a driver and then transmitted through
a fiber, using the classical VCSEL (Vertical Cavity Surface Emitting Laser) or other
modulators integrated in the emerging Silicon Photonics technology [5]. On the receiver
side, the data, after being detected by a PhotoDetector (PD) and amplified and digitized by
a Trans-Impedance Amplifier (TIA) and a fast ADC, are parallelized by the
deserialization circuit, making them ready to be processed.
This work is focused on the design of a SerDes unit in 28 nm CMOS technology
robust to Single-Event Effects (SEE) and addressing a data rate above 10 Gbit/s.
The rest of the paper is organized as follows: Sect. 2 presents the proposed SerDes
circuit architecture. Section 3 deals with pre-layout transistor level performance of the
proposed circuit and particularly with robustness to SEE. Layout design and post-layout
performance are discussed in Sect. 4. Conclusions are drawn in Sect. 5.

Fig. 1. System level architecture of a multi-Gbit/s transmission link exploiting electro-optical
conversion and Serializer-Deserializer data stream conversion.

2 SerDes Architecture
The proposed SerDes is a single programmable circuit capable of both Serializer and
Deserializer behaviours, to be driven with a Time Division Multiplexing (TDM) logic.
Where bidirectional communication is needed, it can reduce costs in terms of power
consumption and area occupation.

Figure 2 shows the system architecture of the proposed device. It has a 10-bit parallel
output and a 10-bit parallel input, alongside a high-speed serial input and output. The SerDes
behaviour is governed by a clock signal (clk) and two control terminals: enable and clock_div.
The device has been realized using two main registers that act as a PISO
(Parallel Input Serial Output) register for serialization, and as a SIPO (Serial
Input Parallel Output) register when data have to be deserialized.

Fig. 2. SerDes device integrating in a single unit Serializer and Deserializer functions, that can
be selected according to a Time-division multiplexing control logic.

The device requires, together with the main clock (clk), a non-periodic digital signal
called enable, which switches the device between Serialization behaviour (HIGH level of the
signal) and Deserialization behaviour (LOW level of the signal).
The other signal, called clock_div, is periodic, with a frequency equal to 1/10 of the main
clock frequency, and indicates when the data are ready to be serialized or deserialized.
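The PISO and SIPO roles described above can be illustrated with a purely behavioural model. The sketch below is not the transistor-level design: the LSB-first bit order and the function names are assumptions, and clock_div is represented implicitly by the fact that one 10-bit word is handled every 10 clk cycles.

```python
from typing import Iterable, List

WIDTH = 10  # parallel word size of the SerDes

def serialize(words: Iterable[int]) -> List[int]:
    """PISO behaviour (enable HIGH): load one word, then shift out one bit per clk."""
    bits: List[int] = []
    for w in words:
        reg = [(w >> i) & 1 for i in range(WIDTH)]   # parallel load (LSB first is an assumption)
        for _ in range(WIDTH):
            bits.append(reg.pop(0))                  # serial output
            reg.append(0)                            # register shifts by one position
    return bits

def deserialize(bits: Iterable[int]) -> List[int]:
    """SIPO behaviour (enable LOW): shift in one bit per clk, emit a word every WIDTH bits."""
    words: List[int] = []
    reg: List[int] = []
    for b in bits:
        reg.append(b & 1)                            # serial input
        if len(reg) == WIDTH:                        # corresponds to one clock_div period
            words.append(sum(bit << i for i, bit in enumerate(reg)))
            reg = []
    return words

stream = serialize([0x2A5, 0x001, 0x3FF])
print(len(stream), deserialize(stream))
```

A round trip through both functions returns the original words, which is the same consistency check a transmitter-receiver testbench would perform on real hardware.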

2.1 Flip-Flops Design

Classic digital design uses standard cells, provided with the technology,
to speed up the design of complex devices, but they can be a limiting factor for circuit
performance. To exploit the full capability of the targeted 28 nm technology, a full-custom
design of the SerDes is performed. A direct comparison of different
solutions showed that the differential Current Mode Logic (CML) architecture was
necessary to obtain acceptable results in terms of data rate. Therefore, D-type CML
flip-flops and other synchronous circuits necessary for the application have been developed,
extending the original CML latches with additional functions.
The D-Latch shown in Fig. 3 is composed of two differential CML gates, which
are activated alternately by the clkp and clkn signals. The first gate (M1, M2 and M3),
activated by the high level of clkp, samples the input signal, while the second one
(M4, M5 and M6), turned on by the high level of clkn, holds the bit value for the
remaining half clock period. Connecting two D-Latches in a Master-Slave configuration
yields a D Flip-flop [6–9].

In addition, the first latch can be adapted to obtain different flip-flop features:

• Load flip-flop: if Strobe is enabled, data is sampled from the input. If it is disabled,
the circuit holds the stored bit (Fig. 4).
• Mux flip-flop: if Valid is enabled, data is sampled from Input#0; if it is disabled,
data comes from Input#1.
• AND flip-flop: Input#0 and Input#1 are AND-gated if Enable (M5 en_p) is HIGH.
Otherwise, when the circuit is disabled (en_p LOW), the output is at the LOW level.
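The behaviour of the latch, the master-slave flip-flop and the three variants listed above can be summarized with a small logic-level model. It is only a behavioural sketch: the rising-edge convention, the class and function names and the single-ended 0/1 signals are assumptions, and none of the differential CML circuitry is modelled.

```python
class DLatch:
    """Level-sensitive latch: transparent when its clock phase is high, else holds."""
    def __init__(self):
        self.q = 0

    def update(self, d, clk):
        if clk:
            self.q = d        # sampling gate active
        return self.q         # otherwise the hold gate keeps the stored bit


class DFlipFlop:
    """Two latches driven by opposite clock phases (master-slave)."""
    def __init__(self):
        self.master, self.slave = DLatch(), DLatch()

    def step(self, d, clk):
        m = self.master.update(d, not clk)   # master follows d while clk is low
        return self.slave.update(m, clk)     # slave copies the master while clk is high


def load_next(q, d, strobe):
    """Load flip-flop: sample d when strobe is 1, otherwise keep q."""
    return d if strobe else q

def mux_next(in0, in1, valid):
    """Mux flip-flop: take Input#0 when valid is 1, otherwise Input#1."""
    return in0 if valid else in1

def and_next(in0, in1, enable):
    """AND flip-flop: AND of the inputs when enabled, LOW otherwise."""
    return (in0 & in1) if enable else 0


ff = DFlipFlop()
# (d, clk) pairs: d = 1 while clk is low, then d drops to 0 after the rising edge
outputs = [ff.step(d, clk) for d, clk in [(1, 0), (0, 1), (0, 0), (0, 1)]]
print(outputs)
```

In this model the value present while the clock is low appears at the output only after the clock goes high, and later changes of the input during the high phase are ignored.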

Fig. 3. D-Latch circuit using CML-gates designed in this work.

Fig. 4. Load flip-flop: if Strobe (M3 selp) is enabled, data is sampled from the input. If it is
disabled, the circuit holds stored bit.

2.2 Pulse Generator Circuit


The programmable pulse generation circuits are used to generate the Data_valid or
Data_strobe pulses, depending on the value of enable. The two generated pulses
are synchronous with the data and are used to regulate the SIPO and PISO behaviours,
depending on whether a Serializer or a Deserializer is needed, enabling the input gates
of the desired CML flip-flops for a single clock period. As shown in Fig. 5, the pulse
generator is made of a pair of D-FFs and an AND gate connected together.
The proposed solution adds a second clocked AND-FF and the possibility of controlling
the type of pulses that are generated.
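One common way to obtain a pulse that lasts a single clock period on each rising edge of clock_div is a two-flip-flop edge detector followed by an AND gate. The sketch below models that idea at the logic level; it is an assumption about the general principle, not a description of the circuit in Fig. 5, and it does not model the routing of the pulse to Data_valid or Data_strobe.

```python
def one_clock_pulses(clock_div_samples):
    """Return a list with a 1 for exactly one clk cycle after each rising edge.

    clock_div_samples holds the value of clock_div seen at each clk edge.
    """
    q1 = q2 = 0
    pulses = []
    for cd in clock_div_samples:
        q2, q1 = q1, cd                  # two D flip-flops in series, clocked by clk
        pulses.append(q1 & (1 - q2))     # AND of the first FF with the inverted second FF
    return pulses

# clock_div at 1/10 of the clk frequency: 5 cycles high, 5 cycles low
clock_div = ([1] * 5 + [0] * 5) * 3
pulses = one_clock_pulses(clock_div)
print([i for i, p in enumerate(pulses) if p])
```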

Fig. 5. Pulse Generation circuit adopted in this work.

2.3 Clock Tree

A CML clock tree is unavoidable given the dimensions of the device and the
performance required. The aim of the tree is to equalize the clock delay at the
synchronous blocks, avoiding problems caused by clock skew. A peculiarity of the
clock tree is that its last stage has a controlled voltage supply (VREF), which
makes it possible to modulate the voltage swing of the bit stream. This counteracts the
performance loss connected to aging and device degradation, restoring the logic swing
of the bit stream. A smaller clock tree has also been realized to distribute clock_div, which
needs to be correctly synchronized, avoiding the presence of skew. On a rising edge of
clock_div a Data_valid (or Data_strobe) pulse reaches the FFs, regulating the behaviour of
the whole architecture, as shown in Fig. 6 and Fig. 7.

Fig. 6. Serializer time diagram (enable is HIGH).



Fig. 7. Deserializer time diagram (enable is LOW).

3 Schematic Design Results


A preliminary analysis of the schematic design, with a transistor-level pre-layout
implementation in 28 nm TSMC technology, has been carried out. SEEs due to radiation have been
considered, and a SER (Symbol Error Rate) analysis at different clock frequencies and operating
temperatures is reported, together with an eye-diagram analysis.

3.1 Symbol Error Rate and Eye Diagram Analysis


A Verilog-A testbench made it possible to verify the speed limits of a complete
transmitter-receiver system, with a Serializer and a Deserializer device properly connected. A
pseudorandom sequence of symbols is generated and, after serialization/deserialization, the
result is compared with the original data. Verilog-A circuits also provide the synchronization
and control signals clock_div and enable.
An extended temperature range, from −50 °C to 125 °C, has been considered, since
it is typical for aerospace and automotive applications. A SER analysis has been
carried out at −50 °C, 27 °C and 125 °C, and PVT (Process-Voltage-Temperature) corners have
been examined. The circuit has a power consumption of nearly 90 mW for a 0.9 V
supply at 25 Gbit/s.
Figure 8 shows that the device performance decreases markedly as the temperature
rises. The maximum clock frequency is recorded at −50 °C and decreases
by nearly 10 GHz as the temperature rises to 125 °C. The architecture has been tested
for a total of 1000 symbols (10 000 bits) for each process corner and temperature, which
corresponds to a BER (Bit Error Rate) upper bound of 10⁻⁴, enough to highlight the
different behaviour of the SerDes as a function of temperature.

Fig. 8. SER vs clock frequency at different operating temperatures.
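The relation between the number of simulated symbols and the error-rate floor can be written out explicitly. The helper names below are hypothetical; the sketch only reproduces the arithmetic of the text (1000 symbols of 10 bits each) and shows how a symbol error rate could be computed from two symbol lists.

```python
def ber_floor(n_symbols, bits_per_symbol=10):
    """Smallest non-zero BER observable when n_symbols symbols are simulated."""
    return 1.0 / (n_symbols * bits_per_symbol)

def symbol_error_rate(sent, received):
    """Fraction of symbols that differ between the sent and received sequences."""
    if len(sent) != len(received):
        raise ValueError("sequences must have the same length")
    errors = sum(1 for a, b in zip(sent, received) if a != b)
    return errors / len(sent)

print(ber_floor(1000))
print(symbol_error_rate([5, 9, 1023, 0], [5, 8, 1023, 0]))
```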
Figure 9 compares the eye diagrams of a 25 Gbit/s bit stream. These results are
obtained for a typical corner at 27 °C and a slow corner at 125 °C. Here the differential
voltage swing is 800 mV [−400 mV; 400 mV], and it can be modulated through the value
of VREF to counteract a progressive degradation of device performance.

Fig. 9. Eye-diagram 25 Gbit/s stream for a typical corner at 27 °C and a slow corner at 125 °C.

3.2 Single Event Effect Analysis


SEEs refer to the consequences that particles may cause when they strike the silicon
substrate of an electronic device. The hit generates a charge collection, which can be
simulated in a CAD tool as a current pulse injected into every pn-junction node of the circuit
in order to find the most sensitive ones [10, 11]. SEEs may also be caused by glitches due
to EM interference in automotive applications.
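A widely used way to represent the collected charge in circuit simulation is a double-exponential current pulse. The text does not state which pulse shape was used, so the model below, its time constants and the 100 fC charge are assumptions for illustration only.

```python
import numpy as np

def see_current(t, q_coll, tau_rise, tau_fall):
    """Double-exponential current pulse whose integral over t >= 0 equals q_coll."""
    t = np.asarray(t, dtype=float)
    i = q_coll / (tau_fall - tau_rise) * (np.exp(-t / tau_fall) - np.exp(-t / tau_rise))
    return np.where(t >= 0, i, 0.0)

t = np.arange(0.0, 2e-9, 1e-13)                       # 0 to 2 ns in 0.1 ps steps
i = see_current(t, q_coll=100e-15, tau_rise=10e-12, tau_fall=100e-12)
# Trapezoidal integration recovers the injected charge
charge = float(np.sum((i[1:] + i[:-1]) / 2 * np.diff(t)))
print(f"peak current: {i.max() * 1e3:.3f} mA, injected charge: {charge * 1e15:.1f} fC")
```

Sweeping `q_coll` upwards in such a model is one way to emulate progressively stronger hits on a given node.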

Fig. 10. Effects of progressively increasing current injections, corresponding to a growth of hitting
particles’ energy.

The graph in Fig. 10 shows the effects of progressively increasing current injections,
corresponding to higher energy of the hitting particles (or increasing EM interference). The
plot shows how radiation, at high levels, may interrupt the operation of the device,
inhibiting the bit stream. If the SEE affects the clock tree, the synchronization signal
(clock or clock_div) is unable to reach all the elements of the architecture.

4 Layout Design and Post-layout Results


In order to enhance the reliability of the system and its robustness against radiation,
some mitigation techniques (guard rings and interleaved fingers) have been adopted at
layout level, together with the implementation of a fully differential system [12,
13].

4.1 Layout Design


Flip-flops have a long and thin layout, which makes it possible to stack similar
synchronous devices. The clock-tree structure has instead been designed to ease the layout
of the interface between its last stage and the synchronous devices and to equalize the length
of the metal paths. The top-level layout measures 60 µm × 190 µm.


Fig. 11. Base Cell layout (First block is a Pulse Generator Circuit. The second block contains
a couple of Load-FFs and of Mux-FFs and they constitute two consecutive stages of SerDes
registers). The architecture is realized combining 5 Base Cells of this type.

Figure 11 shows a cell layout. The first block is a Pulse Generator Circuit, composed
of two D-FFs, two AND-FFs and two CML to pseudo-CMOS buffers (needed to correctly
drive the FFs’ control terminals). The second block contains a pair of Load-FFs
and a pair of Mux-FFs, which constitute two consecutive stages of the SerDes registers. The
whole SerDes architecture is realized by combining 5 cells of this type.

4.2 Post-layout Results

Post-layout results have highlighted an appreciable loss in terms of speed due to the
presence of parasitics. The architecture can normally operate at a 12.5 Gbit/s data
rate. It is capable of reaching more than 15 Gbit/s in the best environmental
conditions. For the fast corner it is better to slow down transmission to nearly 10 Gbit/s if a
temperature increase to 125 °C is registered.
The SerDes power consumption remains the same as estimated from the schematic results,
100 mW for a 0.9 V supply, which is in line with the 10 Gbit/s SerDes in [14], so the
presented circuit dissipates less than 10 mJ for each Gigabit transmitted.
In this case, BER simulations were not carried out because of their excessive demand for
computational resources.
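The energy figure follows directly from dividing power by data rate. A short check, using the 100 mW, 10 Gbit/s and 12.5 Gbit/s values from the text and a hypothetical helper name, is sketched below.

```python
def energy_per_bit(power_w, data_rate_bps):
    """Energy spent per transmitted bit, in joules."""
    return power_w / data_rate_bps

e_bit = energy_per_bit(100e-3, 10e9)        # J per bit at 10 Gbit/s
e_gbit = e_bit * 1e9                        # J per Gbit at 10 Gbit/s
e_bit_typ = energy_per_bit(100e-3, 12.5e9)  # J per bit at 12.5 Gbit/s
print(f"{e_bit * 1e12:.1f} pJ/bit, {e_gbit * 1e3:.1f} mJ/Gbit at 10 Gbit/s")
print(f"{e_bit_typ * 1e12:.1f} pJ/bit at 12.5 Gbit/s")
```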
Figure 12 shows the post-layout eye diagrams of a 12.5 Gbit/s bit stream. These
results are obtained for a typical corner at 27 °C and a slow corner at 125 °C. The
desired voltage swing is maintained, but the parasitics due to the interconnections produce a
significant loss in terms of speed.

Fig. 12. Post-layout eye-diagram 12.5 Gbit/s stream for a typical corner at 27 °C and a slow
corner at 125 °C.

5 Conclusions

This work has proposed a transistor-level design of a prototype high-speed 10-bit
SerDes circuit in a 28 nm TSMC 0.9 V CMOS technology. An analysis of the impact of
SEEs for increasing levels of injected current is performed, since the design targets harsh
operating environments such as aerospace applications. The post-layout results show
that a data rate above 10 Gbit/s is sustained even in the worst operating conditions,
considering an extended temperature range from −50 °C to 125 °C. In a typical corner case,
12.5 Gbit/s can be sustained. The extreme performance requirements made it necessary
to adopt a full-custom design and CML circuits. This solution brings
advantages in devices where high speeds are required, overcoming the limits of standard CMOS
logic. The power consumption of the proposed SerDes design is limited to
100 mW for a 0.9 V supply, so the circuit dissipates less than 10 mJ for each
Gigabit transmitted [15]. The achieved results compare well with the state of the art, where
fast links for aerospace applications are limited to a maximum of 6.25 Gbit/s in the
recently released SpaceFibre standard.

References
1. https://www.esa.int/Enabling_Support/Space_Engineering_Technology/Onboard_Data_Processing/SpaceFibre
2. Ciordia, Ó., Pérez, R., Pardo, C.: Optical communications for next generation automotive
networks. In: 2017 22nd Microoptics Conference (MOC), Tokyo, pp. 24–25 (2017)
3. Saponara, S., Ciarpi, G., Groza, V.Z.: Design and experimental measurement of EMI reduction
techniques for integrated switching DC/DC converters. Can. J. Electr. Comput. Eng. 40(2),
116–127 (2017)
4. https://auto-serdes.org/
5. Ciarpi, G., Magazzù, G., Palla, F., Saponara, S.: Design, implementation, and experimental
verification of 5 Gbps, 800 Mrad TID and SEU-tolerant optical modulators drivers. IEEE
Trans. Circuits Syst. I Regul. Pap. 67(3), 829–838 (2020)
6. Ozsema, H.G., Kostak, D.: Full swing 20 GHz frequency divider with 1 V supply voltage in
FD-SOI 28 nm technology. Microelectronic Systems Laboratory (LSM) Ecole Polytechnique
Federale de Lausanne (EPFL), Lausanne (2010)
7. Szilagyi, L., Belfiore, G.: Low power inductor-less CML latch and frequency divider for full-
rate 20 Gbps in 28-nm CMOS. Dresden University of Technology Chair for Circuit Design
and Network Theory, Dresden (2013)
8. Heydari, P., Mohanavelu, R.: Design of Ultrahigh-Speed Low-Voltage CMOS CML Buffers
and Latches. IEEE Trans. Very Large-Scale Integr. (VLSI) Syst. 12(10), 1081–1093 (2004)
9. Voinigescu, S.: High-Frequency Integrated Circuits. University of Toronto (2013)
10. DasGupta, S.: Trends in single event pulse widths and pulse shapes in deep submicron CMOS.
Master of Science in Electrical Engineering, Nashville (2007)
11. Frontini, L.: Design of CMOS logic gates tolerant of single-event effects for extreme radia-
tion environments. Università degli Studi di Milano, Dipartimento di Scienze Matematiche,
Fisiche e Naturali (2014)
12. Black, J.D., et al.: HBD layout isolation techniques for multiple node charge collection
mitigation. IEEE Trans. Nuclear Sci. 52(6), 2536–2541 (2005)
13. Agrawal, F.W.: Single event upset: an embedded tutorial. In: 21st International Conference
on VLSI Design, Department of Electrical and Computer Engineering, Auburn University,
Auburn, AL, 36849, USA (2008)
14. Nga, N.T.H., Lee, M.H., et al.: 10 Gb/s SerDes for bidirectional chip-to-memory optical
interconnection. In: 2007 Conference on Lasers and Electro-Optics-Pacific Rim, pp. 1–2
(2007)
15. Cosimi, F.: Analysis and Design of RF/High-speed SERDES in 28nm CMOS technology for
Aerospace Applications. Università di Pisa, Ingegneria dell’Informazione (2020)
Enabling Transiently-Powered Communication
via Backscattering Energy State Information

Alessandro Torrisi1, Kasım Sinan Yıldırım2, and Davide Brunelli1(B)


1 Department of Industrial Engineering, University of Trento, Via Sommarive 9,
38123 Povo, Italy
{alessandro.torrisi,davide.brunelli}@unitn.it
2 Department of Information Engineering and Computer Science, University of Trento,
Via Sommarive 9, 38123 Povo, Italy
kasimsinan.yildirim@unitn.it

Abstract. The growing interest in ultra-low-power wireless sensors powered
directly by energy harvesters has revealed one of the major drawbacks of such
battery-less devices: establishing communication between nodes without
wasting energy on transmissions to unavailable receivers. Backscatter communication
enables low-power communication by eliminating energy-hungry hardware components,
and it can signal whether an IoT device is ready to receive even with zero energy
on board. In this paper, we present the design of a backscatter radio mechanism
that is used as a feedback channel to transmit the energy state information almost
for free. Simulation results demonstrate the effectiveness of our approach, designed
according to the emerging paradigm of “transient computing”.

1 Introduction
Powering the Internet of Things using batteries brings about fundamental drawbacks
such as high maintenance cost for replacing the batteries and limited miniaturization of
the hardware [1]. Fortunately, with the growth of energy harvesting circuitries [2] and
ultra-low-power microcontrollers [3], zero-power wireless communication mechanisms
[4] and sensors [5], we are now able to build tiny sensing devices that can operate by
relying only on ambient energy, without the need for batteries [5–9]. These batteryless
devices enable a new application space such as body implants and wearables [10, 11],
and even deployments in extreme locations [12].
The architecture of a typical batteryless device consists of an energy harvester block
that stores the ambient energy from several sources (e.g., solar [2, 5], radio-frequency
(RF) [6, 13]) into an energy buffer, i.e., typically a capacitor. The stored energy in the
capacitor is used to power the ultra-low-power microcontroller as well as other system
components such as sensors and communication circuitry. Since the energy is harvested
in marginal amounts, and the availability of the ambient energy sources is sporadic and
stochastic, batteryless devices are powered transiently and in turn, operate intermittently
utilizing charge-compute-die cycles. As depicted in Fig. 1, if the energy stored in the
capacitor is above an operation threshold, a batteryless device can compute, sense, and
communicate. As the device consumes the energy stored in the capacitor and the energy
level drops below a threshold value, the device dies due to a power failure. This leads to
the loss of the volatile state of the device, i.e., the contents of the stack, program counter,
registers, and memory. The device starts operating again when the capacitor is charged
and the voltage level is above the operating threshold.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 192–201, 2021.
https://doi.org/10.1007/978-3-030-66729-0_22

Fig. 1. The operation of the transiently-powered device is composed of interleaved
charge-compute-die cycles that lead to an intermittent execution.
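
The charge-compute-die behavior can be illustrated with a minimal numerical sketch of a capacitor that is charged by a constant harvested power and discharged by the load whenever the device is on. All component values below (100 µF capacitor, 3.0 V turn-on and 1.8 V turn-off thresholds, 1 mW harvested, 5 mW load) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def simulate(harvest_w=1e-3, load_w=5e-3, c_farad=100e-6,
             v_on=3.0, v_off=1.8, dt=1e-3, steps=2000):
    """Return a list of (capacitor voltage, powered flag), one entry per time step."""
    v, powered, trace = 0.0, False, []
    for _ in range(steps):
        # Hysteresis: turn on above v_on, die below v_off (assumed thresholds).
        if not powered and v >= v_on:
            powered = True
        elif powered and v <= v_off:
            powered = False
        net_power = harvest_w - (load_w if powered else 0.0)
        # Energy balance on the capacitor, E = C * V^2 / 2.
        energy = max(0.0, 0.5 * c_farad * v * v + net_power * dt)
        v = math.sqrt(2.0 * energy / c_farad)
        trace.append((v, powered))
    return trace

trace = simulate()
boots = sum(1 for prev, cur in zip(trace, trace[1:]) if cur[1] and not prev[1])
on_fraction = sum(p for _, p in trace) / len(trace)
print(f"power-ups: {boots}, fraction of time powered: {on_fraction:.2f}")
```

With these assumed values the device spends most of the time charging and only a small fraction of the time computing, which is the intermittent execution pattern described above.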

1.1 The State-of-the-Art


The marginal energy budgets and intermittent operation of transiently-powered battery-
less devices bring about several research challenges within the context of computation
and communication.
Intermittent Computing. The intermittent operation of transiently-powered
devices prevents existing software designed for continuously-powered computers from
running correctly, due to frequent power failures and loss of the computational state.
In particular, power failures might hinder the forward progress of computation and lead
to memory inconsistencies. Researchers have proposed instrumenting existing programs
with checkpoints that save the device state (e.g., registers, contents of the volatile memory)
in non-volatile memory, typically implemented as FRAM [14, 15], so that upon a power
failure the device state can be restored from the latest checkpoint and the computation
can proceed with consistent memory content [16–19]. Another approach is to
rewrite existing programs using task-based programming models [20, 21], which offer
an efficient alternative to checkpoints but require a non-trivial code transformation.

Zero-Power Communication. Backscatter communication, implemented by traditional
RFID tags, enables almost zero-power wireless communication by eliminating
the energy-hungry hardware components of active radios, e.g. power-hungry mixers that
generate carrier waves. This makes it a perfect choice for batteryless devices considering
their marginal energy budgets. In traditional backscatter, tags transmit by modulating the
reflections of RF signals generated by a dedicated reader, which requires several orders of
magnitude less energy than transmission with active radios [22]. Most traditional
backscatter networks [23, 24] allow one-way (unidirectional) communication, i.e., only
between a batteryless device and a dedicated master device (e.g., an RFID reader). This
pushes the decoding of the weak received backscattered signal, which requires complex
digital signal processing techniques, to the RFID reader side, which simplifies the design
and reduces the energy requirements of the batteryless devices. Recent works have
demonstrated bidirectional communication among batteryless devices without the need for a
dedicated RFID reader [4, 25–27]. This is enabled by decoding the received signal using
only low-power analog operations such as envelope detection, which requires components
like diodes, capacitors, operational amplifiers and comparators.

Fig. 2. Two transiently-powered devices miss packets during communication due to unpredictable
power failures. A coordination mechanism is required to ensure packet delivery.

1.2 The Problem Statement and Contributions


Despite the aforementioned progress in intermittent computing and zero-power
communication for batteryless devices, intermittent communication remains
unexplored. In prior work, the batteryless devices are powered continuously by
dedicated energy sources (e.g., carrier wave generators or RFID readers) during
communication. Therefore, these studies overlooked the intermittent operation of batteryless
devices. However, when sporadic energy sources power these devices, the transmission
or reception of packets can be interrupted by arbitrary power failures (see Fig. 2).
For successful communication, the stored energy on both sides of the channel should be
sufficient to perform packet transmission and reception. Otherwise, the energy consumed
for data transmission is lost, and the data transmission fails.

Fig. 3. To ensure packet delivery, the transmitter device (depicted as TX) receives the energy
status of the potential receiver (depicted as RX) via the backscatter channel almost for free. If
the receiver’s energy status is high, the data transmission can be performed via an active radio;
otherwise, the data transmission is postponed until the receiver has sufficient energy.

This requires a notion of coordination between the transmitter and receiver, so that
the transmitter device knows the receiver device’s availability before transmitting
its data. Therefore, batteryless devices need to obtain state information from
their neighbors to understand whether they have sufficient energy and, in turn, whether
they will be able to receive the transmitted data. In this paper, we propose to use backscatter
communication as a feedback channel to transmit the energy state information almost for
free (see Fig. 3). Based on the backscatter radio design proposed in [27], we use a duty
cycling protocol for mismatching the antenna to indicate different energy levels using
an ultra-low-power, low-frequency oscillator. Knowing the receiver’s energy state, any
transiently-powered transmitter device can decide when to start data transmission by means
of an active radio. We believe that our work is the first attempt to introduce the fundamental
hardware support and the building block of future transiently-powered networking protocols.

2 Systems Design
In this scenario, the challenge is to encode the energy status information using the
backscatter radio channel and the least power-hungry components and circuits. The
proposed system builds upon the circuit design presented in [27]. The core of the
system can be divided into two main sections: the receiver (RX) and the transmitter (TX).
In addition, we propose to implement a low-frequency, low-power oscillator to modulate
the backscatter signal and share the energy status information.

2.1 Receiver RX
The backscatter receiver aims to demodulate the signal coming from a neighboring
backscatter node. For this purpose, an RF mixer circuit is usually designed to shift the
backscatter signal to baseband. Thanks to the accuracy of the demodulator, the RF mixer, and the
reference RF oscillator, it is possible to support complex modulation schemes. However,
even though a relatively high data rate can be reached, all these circuits are
power-hungry.

Fig. 4. Backscatter transceiver schematic derived from [27]. Note the input RF signal (Vin),
the output digital signal (Vout), the envelope detector output signal (Venv), and the oscillator
modulation (Vmod). The energy harvester, the low-frequency oscillator, and the MCU are linked
so that the low-frequency oscillator can encode the information about the energy status.

For simpler modulations and lower data rates, such as ON-OFF keying, the system
can be built upon a much simpler circuit. Indeed, the backscatter receiver presented in
[27] is specifically designed as a demodulator that exploits an envelope detector to perform
the frequency shift with a simple, low-power and cheap circuit. The main component is a biased
Schottky diode envelope detector, which is finely matched with the RF input and the
antenna (see Fig. 4).
The remaining circuitry aims to optimize the voltage swing of the low frequency
demodulated signal using a high pass filtering amplifier stage and a comparator for the
final digital output.
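
The receive chain described above (envelope detection followed by a comparator) can be sketched behaviorally as follows. This is an idealized illustration, not a model of the circuit in [27]: the diode is assumed ideal, the carrier is scaled down to 20 cycles per bit to keep the sample count small, and the decay factor and the 60 mV threshold are hypothetical. Only the 20 mV and 100 mV input amplitudes come from the simulations reported in this paper; the high-pass amplifier stage is omitted.

```python
import math

def ook_signal(bits, cycles_per_bit=20, samples_per_cycle=16,
               high=0.100, low=0.020):
    """Return samples of an ON-OFF keyed carrier (amplitudes in volts)."""
    samples = []
    for bit in bits:
        amplitude = high if bit else low
        for k in range(cycles_per_bit * samples_per_cycle):
            samples.append(amplitude * math.sin(2 * math.pi * k / samples_per_cycle))
    return samples

def envelope(samples, decay=0.98):
    """Idealized diode + RC: follow positive peaks, decay slowly between them."""
    level, out = 0.0, []
    for s in samples:
        level = max(s, level * decay)
        out.append(level)
    return out

def demodulate(samples, samples_per_bit, threshold=0.060):
    """Comparator: slice the envelope in the middle of each bit period."""
    env = envelope(samples)
    n_bits = len(samples) // samples_per_bit
    return [1 if env[i * samples_per_bit + samples_per_bit // 2] > threshold else 0
            for i in range(n_bits)]

bits = [1, 0, 1, 1, 0, 0, 1]
received = demodulate(ook_signal(bits), samples_per_bit=20 * 16)
print("recovered correctly:", received == bits)
```

The sketch only shows why a peak detector and a threshold are sufficient for ON-OFF keying; it does not reproduce the millivolt-level detector output of the real circuit, which is the reason an amplifier stage is needed there.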

2.2 Transmitter TX

The backscatter transmitter produces a modulated RF signal by reflecting the
incident RF power, as discussed, allowing for zero-power communication for the
end-nodes. The fundamental operation is a mismatch of the antenna, obtained with RF switches
and different matching impedances.
As a proof of concept of a simple backscatter transceiver implementation, a single
MOSFET can be utilized (see Fig. 4). The MOSFET is operated as a switch to achieve the
required ON-OFF keying modulation. When the switch is open, the antenna is matched,
and the envelope detector can perform the demodulation. When the switch is closed, the
antenna is mismatched, and the circuit reflects the incident signal.
The modulation is given by the low-power, low-frequency oscillator. Properly driving
the MOSFET could require a relatively high peak current. A tradeoff is needed between
the limited drive capability of the low-power oscillator output and fast switching of the
device. If needed, a driving buffer can be placed between the oscillator and the switch.
A second solution is to use an analog switch (e.g., the ADG904 presented in
[27]), which is easier to match with the RF circuit.

2.3 Low-Frequency Oscillator

In this paper, we propose to encode the energy status information of a generic end-node
in the modulation of the backscatter signal. To achieve this result in the harsh conditions
discussed above, we must comply with low-power requirements, especially in the charging
phase (see Fig. 1). We propose to drive the RF switch with a low-frequency, ultra-low-power
oscillator. The low-frequency oscillator can be tuned to a specific frequency that identifies
and differentiates multiple end-nodes. Moreover, we propose a duty cycling protocol for the
oscillator to encode the information regarding the energy status and to ensure that all the
nodes can get in touch with each other. The duty cycling period should be relatively long,
in the order of several milliseconds.
For instance, while the node is in a charging transient and/or the energy is too low
to perform specific tasks, such as receiving information from the neighboring nodes,
the oscillator is kept at a fixed duty cycle, e.g. 100% (i.e. the oscillator is always on).
On the other hand, while the node is in an active transient, the duty cycle can be easily
changed by the MCU according to the energy status and availability, for instance, to
warn the neighboring nodes that the energy is close to the lower threshold and the
communication could be interrupted.
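
A minimal sketch of how such a duty cycling scheme could drive the RF switch is given below. The three energy states and the duty cycles assigned to the two active states are hypothetical; the paper only fixes the 100% example for the charging phase, and the 250 kHz modulation frequency and 1 ms period are the values used in the simulations of this paper. The gate signal is modeled as a list of 0/1 samples with four samples per oscillator cycle.

```python
from enum import Enum

class EnergyState(Enum):
    CHARGING = "charging"  # energy too low to receive (oscillator free-running)
    LOW = "low"            # active, but close to the lower threshold
    READY = "ready"        # active, enough energy to receive a packet

# Hypothetical mapping from energy state to duty cycle of the oscillator burst.
DUTY_CYCLE = {
    EnergyState.CHARGING: 1.00,
    EnergyState.LOW: 0.10,
    EnergyState.READY: 0.50,
}

def switch_waveform(state, period_s=1e-3, f_mod_hz=250e3, samples_per_cycle=4):
    """Return one duty-cycling period of the switch gate signal as 0/1 samples."""
    n_cycles = round(period_s * f_mod_hz)            # oscillator cycles per period
    on_cycles = round(DUTY_CYCLE[state] * n_cycles)  # cycles in which it toggles
    half = samples_per_cycle // 2
    one_cycle = [1] * half + [0] * (samples_per_cycle - half)
    idle = [0] * (samples_per_cycle * (n_cycles - on_cycles))
    return one_cycle * on_cycles + idle

for state in EnergyState:
    wave = switch_waveform(state)
    print(f"{state.value:9s}: {sum(wave)} high samples out of {len(wave)}")
```

Keeping the switch open (gate low) during the idle part of the period leaves the antenna matched, so the node can still receive in that interval.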

Fig. 5. Simulation results at the receiver input (Input Voltage) and at the envelope detector
output (Envelope Detector Voltage) while a 250 kHz ON-OFF keying waveform is applied at the
circuit input (Vin in the schematic, see Fig. 4) and the MOSFET is always off.

This energy status information can be decoded on the receiver side when the end-node
has enough energy to perform the computing action (discharge transient, see Fig. 1).
Having decoded it, the node can decide whether to transmit its relevant information,
avoiding the packet loss problem mentioned above.
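
On the decoding side, the decision could be sketched as follows. The estimator assumes that one full duty-cycling period of the digital output has been sampled as 0/1 values and that the oscillator itself has a 50% high time; the duty-cycle levels and their meanings are hypothetical, since the paper does not define a specific code.

```python
# Hypothetical code book: burst duty cycle -> energy state of the neighbor.
LEVELS = {1.00: "charging", 0.50: "ready", 0.10: "low"}

def estimate_duty_cycle(vout, oscillator_high_fraction=0.5):
    """Estimate the burst duty cycle from one period of 0/1 comparator samples."""
    if not vout:
        raise ValueError("at least one sample is required")
    return min(1.0, sum(vout) / (len(vout) * oscillator_high_fraction))

def classify(duty):
    """Map a measured duty cycle to the nearest level of the code book."""
    nearest = min(LEVELS, key=lambda level: abs(level - duty))
    return LEVELS[nearest]

def should_transmit(vout):
    """Transmit with the active radio only if the neighbor reports it is ready."""
    return classify(estimate_duty_cycle(vout)) == "ready"

# One 1 ms period sampled at 4 samples per 250 kHz cycle, 50% burst duty cycle.
ready_period = [1, 1, 0, 0] * 125 + [0] * 500
print(should_transmit(ready_period))
```

Picking the nearest level rather than an exact match gives some tolerance to the settling time and to sampling errors at the burst edges.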

3 Results
To validate our proposal, we carried out some simulations, using LTspice, on the
electronic circuit depicted in Fig. 4. A first simulation, with the transmitter off and
a modulated RF signal applied at the input, was performed to show the behavior of the
envelope detector output. Figure 5 presents the results of this simulation. The input RF
signal is simulated with an amplitude varying between 20 mV and 100 mV, a carrier
frequency of 868 MHz, and a modulation frequency of 250 kHz, while the output of the
envelope detector is about 5 mV. The figure shows that the signal needs to be
amplified before the comparator and digitization stage.
We performed a second simulation to show the behavior of the duty cycling protocol
at the digital output of the receiver (Vout in the schematic, see Fig. 4). In Fig. 6 (a), a
10% duty cycle is applied, while in Fig. 6 (b), a duty cycle of 50% is used. The ON-OFF
keying modulation frequency, which is set to 250 kHz, is still visible. This frequency
should be recognized by the receiver to identify the specific modulation frequency of
the transmitting node. Finally, the ON-OFF keying modulation appears
at the digital output only after a settling time of roughly 25 µs.
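
The numbers above allow a quick estimate of how many modulation cycles reach the digital output in each burst. The sketch below assumes, as an interpretation of the reported settling time, that the first 25 µs of every burst are lost at the output; the function and constant names are illustrative.

```python
F_MOD_HZ = 250e3    # ON-OFF keying modulation frequency (from the text)
PERIOD_S = 1e-3     # duty-cycling period (Fig. 6)
SETTLING_S = 25e-6  # approximate settling time at the digital output

def burst_budget(duty):
    """Return (on time in s, cycles per burst, cycles left after settling)."""
    on_time = duty * PERIOD_S
    total_cycles = on_time * F_MOD_HZ
    usable_cycles = max(0.0, on_time - SETTLING_S) * F_MOD_HZ
    return on_time, total_cycles, usable_cycles

# Below this duty cycle the burst would end before the output has settled.
MIN_DUTY = SETTLING_S / PERIOD_S

for duty in (0.10, 0.50):
    on_time, total, usable = burst_budget(duty)
    print(f"duty {duty:.0%}: on for {on_time * 1e6:.0f} us, "
          f"{total:.2f} cycles, {usable:.2f} after settling")
print(f"minimum useful duty cycle: {MIN_DUTY:.1%}")
```

Under this assumption a 10% duty cycle still leaves well over ten visible cycles per period, while duty cycles of a few percent would be hard to detect with a 1 ms period.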

Fig. 6. Simulation results on the duty cycle protocol (a) 10% and (b) 50% of the period of 1 ms.
The input signal is a 250 kHz ON-OFF keying modulated. The digital output voltage Vout contains
information on the duty cycle and on the ON-OFF keying modulation frequency.

Figure 7 presents the results of the preliminary simulation of the transceiver behavior.
We fixed the input RF source to 100 mV and 868 MHz, and the ON-OFF keying modulation
of the MOSFET to a frequency of 250 kHz with a duty cycle of 50% (the duty cycle of
the ON-OFF keying modulation itself). In the figure, the input voltage of the circuit shows
the mismatching operation of the switch. Indeed, while the switch is in the off state,
the voltage is half of the RF source voltage, about 50 mV, while, when the switch is in
the on state, the mismatch appears and the voltage decreases to about 25 mV,
in accordance with impedance matching rules.
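
The 50 mV and 25 mV levels are consistent with a simple resistive voltage divider between the source impedance and the circuit input impedance. The sketch below assumes a purely resistive 50 Ω source and a mismatched input resistance of one third of that value; both figures are hypothetical and chosen only because they reproduce the quoted voltages, since the actual impedances are not stated in the text.

```python
def input_voltage(v_source, r_source, r_load):
    """Voltage across the load of a resistive divider (source in series with load)."""
    return v_source * r_load / (r_source + r_load)

def reflection_coefficient(r_source, r_load):
    """Voltage reflection coefficient seen from the source side."""
    return (r_load - r_source) / (r_load + r_source)

V_SOURCE = 0.100            # RF source amplitude from the text, in volts
R_SOURCE = 50.0             # assumed source (antenna) resistance, in ohms
R_MATCHED = R_SOURCE        # switch off: input matched to the source
R_MISMATCHED = R_SOURCE / 3 # switch on: hypothetical mismatched input resistance

for label, r_load in (("switch off", R_MATCHED), ("switch on", R_MISMATCHED)):
    v_in = input_voltage(V_SOURCE, R_SOURCE, r_load)
    gamma = reflection_coefficient(R_SOURCE, r_load)
    print(f"{label}: Vin = {v_in * 1e3:.1f} mV, "
          f"reflected power fraction = {gamma ** 2:.2f}")
```

With these assumptions the matched state reflects nothing, whereas the mismatched state reflects about a quarter of the incident power, which is the component seen by a neighboring receiver as backscatter.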

Fig. 7. Simulation results on the 250 kHz ON-OFF keying modulation produced by the MOSFET.
The RF source voltage is 100 mV.

4 Conclusions

We presented a backscatter communication circuit to improve the overall energy
efficiency in a network of wireless sensors powered only by ambient energy. The
backscatter circuit is used mainly as a feedback channel to transmit the energy state
information almost for free, while data communication is carried out using conventional
low-power radios. Based on the proposed backscatter radio design, we use different duty
cycle levels for mismatching the antenna to indicate different energy levels using an
ultra-low-power, low-frequency oscillator. Simulation results show that the transmitters
can be kept updated about the availability of the receiving nodes.

References
1. Palacín, M.R., de Guibert, A.: Why do batteries fail? Science 351, 1253292 (2016)
2. Brunelli, D., Dondi, D., Bertacchini, A., Larcher, L., Pavan, P., Benini, L.: Photovoltaic
scavenging systems: modeling and optimization. Microelectron. J. 40(9), 1337–1344 (2009)
3. Davies, J.H.: MSP430 Microcontroller Basics. Elsevier (2008)


4. Liu, V., Parks, A., Talla, V., Gollakota, S., Wetherall, D., Smith, J.R.: Ambient backscatter:
wireless communication out of thin air. SIGCOMM Comput. Commun. Rev. 43, 39–50 (2013)
5. Nardello, M., Desai, H., Brunelli, D., Lucia, B.: Camaroptera: a batteryless long-range remote
visual sensing system. In: Proceedings of the 7th International Workshop on Energy Harvest-
ing and Energy-Neutral Sensing Systems, New York, NY, USA, pp. 8–14. Association for
Computing Machinery (2019)
6. Smith, J.R.: Wirelessly Powered Sensor Networks and Computational RFID. Springer Science
& Business Media, New York (2013)
7. Sartori, D., Brunelli, D.: A smart sensor for precision agriculture powered by microbial fuel
cells. In: 2016 IEEE Sensors Applications Symposium (SAS), Catania, pp. 1–6 (2016). https://
doi.org/10.1109/SAS.2016.7479815
8. Hester, J., Sorber, J.: Flicker: rapid prototyping for the batteryless internet-of-things. In:
Proceedings of the 15th ACM Conference on Embedded Network Sensor Systems, New
York, NY, USA, pp. 1–13. Association for Computing Machinery (2017)
9. Rossi, M., Rizzon, L., Fait, M., Passerone, R., Brunelli, D.: Energy neutral wireless sensing
for server farms monitoring. IEEE J. Emerging Sel. Top. Circuits Syst. 4(3), 324–334 (2014).
https://doi.org/10.1109/JETCAS.2014.2337171
10. Hester, J., Sorber, J.: The Future of Sensing is Batteryless, Intermittent, and Awesome (2017).
http://dx.doi.org/10.1145/3131672.3131699
11. Brunelli, D., Farella, E., Rocchi, L., Dozza, M., Chiari, L., Benini, L.: Bio-feedback system
for rehabilitation based on a wireless body area network. In: Fourth Annual IEEE International
Conference on Pervasive Computing and Communications Workshops (PERCOMW 2006),
Pisa, vol. 5, p. 531 (2006). https://doi.org/10.1109/percomw.2006.27
12. Denby, B., Lucia, B.: Orbital edge computing: nanosatellite constellations as a new class of
computer system. In: Proceedings of the Twenty-Fifth International Conference on Archi-
tectural Support for Programming Languages and Operating Systems, New York, NY, USA,
pp. 939–954. Association for Computing Machinery (2020)
13. Torrisi, A., Brunelli, D.: Magnetic resonant coupling wireless power transfer for lightweight
batteryless UAVs. In: 2020 International Symposium on Power Electronics, Electrical Drives,
Automation and Motion (SPEEDAM), Sorrento, Italy, pp. 751–756 (2020). https://doi.org/
10.1109/SPEEDAM48782.2020.9161953
14. Texas Instruments: MSP430FR5994 LaunchPad Development Kit, http://www.ti.com/tool/
MSP-EXP430FR5994. Accessed 06 Aug 2020
15. Balsamo, D., Weddell, A.S., Merrett, G.V., Al-Hashimi, B.M., Brunelli, D., Benini, L.: Hiber-
nus: sustaining computation during intermittent supply for energy-harvesting systems. IEEE
Embedded Syst. Lett. 7, 15–18 (2015)
16. Ransford, B., Sorber, J., Fu, K.: Mementos: system support for long-running computation
on RFID-scale devices. In: Proceedings of the Sixteenth International Conference on Archi-
tectural Support for Programming Languages and Operating Systems, New York, NY, USA,
pp. 159–170. Association for Computing Machinery (2011)
17. Balsamo, D., et al.: Graceful performance modulation for power-neutral transient computing
systems. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 35(5), 738–749 (2016).
https://doi.org/10.1109/TCAD.2016.2527713
18. Kortbeek, V., Yildirim, K.S., Bakar, A., Sorber, J., Hester, J., Pawełczak, P.: Time-sensitive
intermittent computing meets legacy software. In: Proceedings of the Twenty-Fifth Inter-
national Conference on Architectural Support for Programming Languages and Operating
Systems, New York, NY, USA, pp. 85–99. Association for Computing Machinery (2020)
19. Rodriguez Arreola, A., Balsamo, D., Das, A.K., Weddell, A.S., Brunelli, D., Al-Hashimi,
B.M., Merrett, G.V.: Approaches to transient computing for energy harvesting systems: a
quantitative evaluation. In: Proceedings of the 3rd International Workshop on Energy
Harvesting and Energy Neutral Sensing Systems (ENSsys 2015), New York, NY, USA, pp. 3–8.
ACM. https://doi.org/10.1145/2820645.2820652
20. Colin, A., Lucia, B.: Chain: tasks and channels for reliable intermittent programs. In: Proceed-
ings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming,
Systems, Languages, and Applications, New York, NY, USA, pp. 514–530. Association for
Computing Machinery (2016)
21. Yıldırım, K.S., Majid, A.Y., Patoukas, D., Schaper, K., Pawelczak, P., Hester, J.: InK: reactive
kernel for tiny batteryless sensors. In: Proceedings of the 16th ACM Conference on Embedded
Networked Sensor Systems, New York, NY, USA, pp. 41–53. Association for Computing
Machinery (2018)
22. Zhang, P., Rostami, M., Hu, P., Ganesan, D.: Enabling practical backscatter communication
for on-body sensors. In: Proceedings of the 2016 ACM SIGCOMM Conference, New York,
NY, USA, pp. 370–383. Association for Computing Machinery (2016)
23. Talla, V., Hessar, M., Kellogg, B., Najafi, A., Smith, J.R., Gollakota, S.: LoRa backscatter:
enabling the vision of ubiquitous connectivity. In: Proceedings ACM Interactive Mobility
Wearable Ubiquitous Technology, vol. 1, pp. 1–24 (2017)
24. Alevizos, P.N., Tountas, K., Bletsas, A.: Multistatic Scatter Radio Sensor Networks for
Extended Coverage (2018). http://dx.doi.org/10.1109/twc.2018.2827034
25. Ryoo, J., Jian, J., Athalye, A., Das, S.R., Stanaćević, M.: Design and evaluation of “BTTN”:
a backscattering tag-to-tag network. IEEE Internet Things J. 5, 2844–2855 (2018)
26. Ryoo, J., Karimi, Y., Athalye, A., Stanaćević, M., Das, S.R., Djurić, P.: BARNET: towards
activity recognition using passive backscattering tag-to-tag network. In: Proceedings of the
16th Annual International Conference on Mobile Systems, Applications, and Services, New
York, NY, USA, pp. 414–427. Association for Computing Machinery (2018)
27. Majid, A.Y., Jansen, M., Delgado, G.O., Yildirim, K.S., Pawełczak, P.: Multi-hop backscat-
ter tag-to-tag networks. In: IEEE INFOCOM 2019 - IEEE Conference on Computer
Communications, pp. 721–729 (2019)
Analysis and Design of Integrated VCO in 28 nm
CMOS Technology for Aerospace Applications

Paolo Prosperi1(B), Gabriele Ciarpi1,2, and Sergio Saponara1


1 Department of Information Engineering, University of Pisa, Via G. Caruso 16,
56122 Pisa, Italy
p.prosperi@studenti.unipi.it, gabriele.ciarpi@ing.unipi.it,
sergio.saponara@iet.unipi.it
2 INFN, Largo B. Pontecorvo, 56127 Pisa, Italy

Abstract. This paper presents a comparison between various types of integrated
VCO (Voltage Controlled Oscillator) architectures, designed in 28 nm
CMOS technology, for aerospace applications. A frequency of 25 GHz and a
temperature range from −40 °C to +125 °C have been taken as targets, together
with a low supply voltage. In particular, ring oscillator (RO) based VCOs and
an LC tank VCO were designed and compared. Although RO-based VCOs are
attractive for their low area occupation and high tuning range,
the comparison has highlighted a very high PVT (Process Voltage Temperature)
sensitivity and poor phase noise performance at the target frequency for these
structures. Instead, the designed LC tank oscillator has shown a lower sensitivity
to PVT variations and better phase noise performance at 25 GHz, together with
a lower power dissipation. A varactor-based voltage tuning control allows the LC
tank VCO to recover the target frequency across PVT variations. The complete
layout of this last structure has been implemented. Post-layout simulations have
shown a typical oscillation frequency which can be varied from 24.35 to 25.65 GHz
with a phase noise of −95 dBc/Hz at 1 MHz offset from the 25 GHz carrier and a
power dissipation of 860 μW. A two-stage output buffer was also designed to
drive the chip pads and test the VCO. SEE (Single Event Effect) simulations
have been performed to test the circuit’s reliability in a radiation environment.

1 Introduction

High speed serial data communication links are required in most of today’s highly
demanding systems. Communication speeds from 5 up to 40 Gbit/s have been reported
recently, with speeds increasing each year [1]. These circuits, like every complex
electronic system, need a precise and reliable clock to work properly. For this reason,
high speed data communication links always need a phase-locked loop (PLL) that generates
the data transmission timing and performs clock recovery from the serial data stream, with
low power consumption and low jitter. The main circuit block of these synchronization
circuits is the Voltage Controlled Oscillator (VCO), which generates an output
signal at a certain frequency that can be controlled by a voltage input.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 202–212, 2021.
https://doi.org/10.1007/978-3-030-66729-0_23

Moreover, high speed communication links are also essential in environments such as
avionics, space, high energy physics and biomedical instruments. The aerospace
environment is a field of application where a lot of research is carried out today, with
the aim of increasing the reliability of the on-board electronic components of planes,
satellites and spaceships. Furthermore, for the next three-year period (2020–2023), the
European Commission, the ESA and the EDA share the objective of investing in
research to develop and design radiation-tolerant systems in “non-dependence”
from other countries, such as the USA [2]. In this perspective, the main goal of this work
is to design a VCO for this field of application, to compare the performance of
the widely used RO (Ring Oscillator) and LC circuits in radiation environments, and to
contribute new approaches for exploiting the characteristics that have made these
circuits the most widely implemented.
The state-of-the-art aerospace link is represented by the recent SpaceFibre [3] standard,
targeting a data rate of 6.25 Gbit/s [4]. However, since the demand for higher data rates
increases day by day, a 25 GHz target was chosen for the design of the VCO in this
work. This frequency could also be used for multi-lane applications of SpaceFibre, or
other standards. Moreover, the speed target of this VCO could be of interest not only
for space applications, but also for other market segments, such as automotive. As an
example, the ASA (Automotive SerDes Alliance) [5], which is attempting to standardize
communication links for this field of application, targets data rates greater than
10 Gbit/s.
The 28 nm technology process chosen for this work is a commercial-grade TSMC
process. This technology node introduces some challenges in analog circuit
design, such as a low Vdd/Vth ratio and a low intrinsic transistor gain, but it could offer
many advantages if used to design RF rad-tolerant SoCs (Systems on a Chip), with analog
circuitry operating at high frequency together with high-performance digital electronics.
Finally, compared with the previous 65 nm CMOS node [6], the 28 nm node is considered a
promising technology for radiation-hard applications thanks to its thin oxides, which limit
the TID (Total Ionizing Dose) effects caused by charge trapping in the oxide layers after
radiation exposure [7].
In summary, the goal of this work is to design a 25 GHz VCO, in a 28 nm
process, with the most suitable architecture for a radiation environment application.
The rest of the paper is organized as follows: Sect. 2 presents the designed and analyzed
RO-based structures. Section 3 describes the proposed LC tank VCO and briefly presents
the two-stage output buffer which is necessary to test the VCO on a test board. The layout
of the designed VCO is discussed in Sect. 4. Section 5 reports all the obtained
simulation results¹ [8]. Section 6 covers the SEE (Single Event Effect) simulations and
the rad-hard improvements that can be made to the proposed LC VCO structure. Conclusions
and state-of-the-art considerations are drawn in Sect. 7.

2 RO-Based VCOs
A RO-VCO consists of a cascade of an odd number of inverting amplifiers, also called
delay cells, in which the output of the last stage is connected to the input of the first
stage, forming a loop. Three-stage structures of single-ended and differential ring
oscillators are shown in Fig. 1. In this work several types of ring VCOs were designed and
compared: CMOS or NMOS-only, single-ended or differential [9]. Ring oscillators based on
conventional simple CMOS inverters and on current-starved inverters were designed too,
but their working frequency proved inadequate for our purposes. The single-ended delay
cell, Fig. 1a, is based on a pseudo-NMOS inverter, in which the load of the NMOS transistor
is composed of two PMOS transistors. One of these two PMOS transistors is biased
with a fixed gate voltage (GND), while the other PMOS load is biased with a variable
control voltage (Vc). By varying this control voltage, the active resistance of the load,
and hence the oscillation frequency, can be changed. The PMOS load with a
fixed bias is inserted to avoid oscillation failures at high values of Vc, when MP2
turns off. A pseudo-differential version of this delay cell was also implemented, Fig. 1b.
To obtain this pseudo-differential structure, it is not sufficient to replicate the
single-ended version and take the differential voltage between the output nodes of the two
branches. Indeed, without any self-balancing mechanism, each branch would act as an
independent single-ended ring oscillator without any precise phase correlation. To create
this balancing mechanism, a cross-coupled NMOS pair is inserted. Fig. 1c shows the last
differential delay cell structure, a CML (Current Mode Logic) cell based on a typical
resistively loaded fully differential amplifier [10]. Here the oscillation frequency can be
varied by means of two varicaps, acting on their DC bias through Vc.

¹ All results in this work are obtained using the Cadence Spectre simulation tool.

Fig. 1. Designed ring oscillators.
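
To make the link between delay-cell speed and oscillation frequency concrete, the sketch below uses the first-order textbook relation f_osc = 1/(2·N·t_d) for a ring of N identical inverting stages. The function name and the per-stage delays are illustrative assumptions (chosen only so that a three-stage ring lands in the tens-of-GHz range discussed in this work); they are not values extracted from the designed cells.

```python
def ring_osc_frequency(n_stages: int, stage_delay_s: float) -> float:
    """First-order oscillation frequency of an N-stage ring oscillator.

    Assumes identical stages and a single-ended ring, which needs an odd
    number of inverting stages (at least three) to oscillate.
    """
    if n_stages < 3 or n_stages % 2 == 0:
        raise ValueError("a single-ended ring needs an odd number of stages >= 3")
    if stage_delay_s <= 0:
        raise ValueError("stage delay must be positive")
    # One period corresponds to two traversals of the loop: T = 2 * N * t_d.
    return 1.0 / (2.0 * n_stages * stage_delay_s)


if __name__ == "__main__":
    # Hypothetical per-stage delays, not simulated values from this work.
    for delay_ps in (5.0, 6.0, 7.0):
        f_osc = ring_osc_frequency(3, delay_ps * 1e-12)
        print(f"t_d = {delay_ps:.1f} ps -> f_osc = {f_osc / 1e9:.2f} GHz")
```

In this simple model the control voltage acts only through t_d, since changing the load resistance changes the delay of each cell.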

3 LC Tank Proposed VCO


In order to overcome the effects of device parameter deviations on the oscillation
frequency and to obtain higher FOM values at the target speed, an LC-tank VCO
architecture has been designed. This architecture bases its oscillation frequency on the
filtering effect of an L-C tank, leaving to the active components only the role of setting
the feedback gain and compensating the loss of the inductor [11–13]. This holds only
from a theoretical point of view: the parasitic elements of the active components also
play a key role in determining the oscillation frequency, especially in ultra-scaled
technologies such as 28 nm, where the parasitic capacitances of active components
and interconnections could significantly degrade the ideal performance. The LC tank
VCO implemented in this work is based on a conventional NMOS cross-coupled
structure (Fig. 2a), with the improvements shown in Fig. 2b.

Fig. 2. Conventional a) and implemented b) cross coupled LC VCO.

The resistor R is inserted to lower the output common mode voltage, so as to
provide a correct DC bias for a cascaded buffer, and to reduce the stress on the
transistors due to the voltage swing. Indeed, a voltage exceeding Vdd across the MOSFET
junctions can damage the device or dramatically reduce its lifetime. This resistor is
connected to the center tap of a symmetrical inductor, chosen because its layout area is
lower than that of two separate inductors. The inductor value and its geometry factors
(number of turns, coil width…) are chosen to obtain the best Q factor of the entire LC
tank at the 25 GHz oscillation frequency. Indeed, it can be demonstrated [11] that the
higher the tank Q factor, the better the phase noise performance of the VCO. In order to
achieve the best frequency performance of this technology, the cross-coupled pair is
sized using minimum-length MOSFETs, while the width of the transistor pair is chosen
to guarantee the start-up condition. The design guideline for meeting the Barkhausen
oscillation criterion is gm > 1/RP, where gm is the transconductance of each NMOS in
the cross-coupled cell and RP is the equivalent parallel resistance that models the
inductor losses [12]. The varicap values were chosen to provide the TR (Tuning Range)
needed to recover frequency deviations over PVT corners. The tail current was fixed to
1 mA.
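
The sizing steps above can be illustrated with an ideal parallel-RLC model of the tank: f0 = 1/(2π√(LC)), RP = Q·2π·f0·L, and the start-up bound gm > 1/RP quoted in the text. The inductance, Q factor and safety margin used below are hypothetical placeholders, not the values of the designed inductor, and all parasitic capacitances are assumed to be lumped into C.

```python
import math


def resonance_frequency(l_henry: float, c_farad: float) -> float:
    """Resonance frequency of an ideal LC tank: f0 = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))


def tank_capacitance(f0_hz: float, l_henry: float) -> float:
    """Total tank capacitance needed to resonate at f0 with inductance L."""
    return 1.0 / ((2.0 * math.pi * f0_hz) ** 2 * l_henry)


def parallel_resistance(q_factor: float, f0_hz: float, l_henry: float) -> float:
    """Equivalent parallel loss resistance of the tank: RP = Q * 2*pi*f0 * L."""
    return q_factor * 2.0 * math.pi * f0_hz * l_henry


def min_startup_gm(rp_ohm: float, margin: float = 1.0) -> float:
    """Smallest transconductance satisfying gm > margin / RP."""
    return margin / rp_ohm


if __name__ == "__main__":
    f0 = 25e9          # target oscillation frequency from the text
    l_tank = 200e-12   # hypothetical inductance, not the designed value
    q_tank = 10.0      # hypothetical tank quality factor
    c_tank = tank_capacitance(f0, l_tank)
    rp = parallel_resistance(q_tank, f0, l_tank)
    # A margin of 3 is a common rule of thumb, assumed here for illustration.
    gm_min = min_startup_gm(rp, margin=3.0)
    print(f"C  = {c_tank * 1e15:.1f} fF")
    print(f"RP = {rp:.1f} ohm")
    print(f"gm > {gm_min * 1e3:.2f} mS (with margin)")
```

The model makes the trade-off visible: a higher tank Q raises RP, which relaxes the transconductance (and therefore the bias current) needed for start-up.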

3.1 Output Buffer Design


After a preliminary analysis and sizing of the VCO with an ideal capacitance as
output load, a decoupling resistive CML buffer was sized and connected to the VCO
outputs. This buffer is necessary to decouple the VCO from the rest of the circuit, so
that the oscillation frequency stays fixed, and to pick up the oscillation voltage. Then,
since the VCO has to be tested and the first-stage buffer is not able to drive the complex
load presented by the IC pads, wire bonding and measurement instrument, a second-stage
output buffer was designed. This second-stage buffer was designed as an inductive tuned
amplifier, in order to deliver a reasonable power to the measurement instrument and to
obtain a better matching with the impedance seen looking towards the pads. The entire
circuit, VCO and buffers, is shown in Fig. 3.

Fig. 3. Complete circuit schematic.

4 Layout Implementation

The entire circuit, LC-VCO and buffer, has been implemented in layout; the VCO
layout is shown in Fig. 4. All layout choices were made to reduce the parasitic
resistance and to guarantee good matching of the simple current mirrors and transistor
pairs. Indeed, a high parasitic resistance can lead to gain degradation, which could
cause a weak start-up condition for the VCO. The main contribution to area occupation
is the inductor, which is implemented as a three-turn differential inductor already
available in the technology libraries. The spacing between devices is the minimum
allowed by the technology rules, which helps to minimize device mismatch.

Fig. 4. VCO layout in 28 nm CMOS technology.

5 Results and Comparison


5.1 Schematic (Pre-layout) Simulations
In the schematic design, the target central frequency was set higher than 25 GHz in
order to leave a margin for the later layout implementation. Table 1 summarizes the
main performance figures of the designed voltage controlled oscillators. To compare
the designed VCOs more effectively, a widely used figure of merit (FOM) can be
introduced [12, 13]. This FOM, defined in Eq. (1), allows different VCOs to be compared
by taking into account several important performance parameters at the same time,
namely phase noise (L(Δf)), dissipated power (Pdc) and central frequency (f0). The
parameter Δf is the frequency offset from the carrier at which the phase noise, and
therefore the FOM, is evaluated.

Table 1. Main schematic-level performance of the designed VCOs.

| Parameter                | Ring a)       | Ring b)       | Ring c)       | LC-VCO        |
|--------------------------|---------------|---------------|---------------|---------------|
| VCO type                 | Single ended  | Pseudo-diff   | Fully-diff    | Fully-diff    |
| Vdd [V]                  | 0.9           | 0.9           | 1.2           | 0.9           |
| Temperature range        | [−40–100] °C  | [−40–100] °C  | [−40–100] °C  | [−40–125] °C  |
| Used transistors         | Core RF-Mos   | Core RF-Mos   | ULVT RF-Mos   | Core RF-Mos   |
| Frequency [GHz]          | 28            | 28            | 28            | 30            |
| Power diss. [mW]         | 9.85          | 20.6          | 12.78         | 0.9           |
| Kvco [GHz/V]             | 75            | 75            | 12            | 3.3           |
| PVT variations           | High          | High          | Mid-low       | Low           |
| Area [µm²]               | A1 > 4        | A2 ≈ 2×A1     | A3 > 350      | A4 ≥ 10000    |
| L(Δf = 1 MHz) [dBc/Hz]   | −63.6         | −65.97        | −65.6         | −90           |
| FOM(Δf = 1 MHz) [dBc/Hz] | −142.6        | −141.77       | −143          | −180          |

 
$$\mathrm{FOM}(\Delta f) = L(\Delta f) - 20\log_{10}\left(\frac{f_0}{\Delta f}\right) + 10\log_{10}\left(\frac{P_{dc}}{1\,\mathrm{mW}}\right) \tag{1}$$
Note that the temperature range for the ring VCOs had to be reduced from
−40 °C to 125 °C down to −40 °C to 100 °C, because the ElectroMigration (EM) current
density specifications at 125 °C are too stringent to be met by the RF_MOS devices used
in this work for these topologies. For the CML ring VCO, a higher supply voltage of
1.2 V and the use of Ultra-Low-Threshold-Voltage (ULVT) transistors are also needed
to overcome gain issues due to the low Vdd/Vth ratio. The LC tank architecture, instead,
meets all the declared constraints: Vdd is 0.9 V and the temperature range is −40 °C to
125 °C. Table 1 shows that the pseudo-NMOS ring VCO structures are very sensitive to
PVT (Process, Voltage, Temperature) variations and, because of this, high Kvco values
are required to recover the target oscillation frequency in all corners. The CML-based
structure is less sensitive to PVT variations, but it occupies a larger area, requires a
higher supply voltage and uses non-core transistors. Phase noise performance is very
similar for all ring VCOs, around −65 dBc/Hz at 1 MHz offset from the 28 GHz carrier.
The FOM values, around −142 dBc/Hz at 1 MHz offset from the 28 GHz carrier, also
confirm that the ring structures are comparable in terms of noise and power dissipation.
The LC tank VCO shows better noise, power and PVT performance. At the 30 GHz
schematic target frequency, its PN at 1 MHz offset is around −90 dBc/Hz and its power
dissipation is about 900 μW. This results in a much better (more negative) FOM of
−180 dBc/Hz at 1 MHz offset from the 30 GHz carrier. The drawback of the LC
architecture is its area occupation, which is much greater than that of the ring VCOs.
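
As a cross-check of Eq. (1), the short sketch below evaluates the FOM from the phase noise, carrier frequency, offset and power figures listed in Table 1. The function and variable names are illustrative choices; only the numerical inputs come from the table.

```python
import math


def vco_fom(phase_noise_dbc_hz: float, f0_hz: float,
            offset_hz: float, power_w: float) -> float:
    """FOM of Eq. (1): L(df) - 20*log10(f0/df) + 10*log10(Pdc / 1 mW)."""
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(f0_hz / offset_hz)
            + 10.0 * math.log10(power_w / 1e-3))


if __name__ == "__main__":
    # (name, L at 1 MHz [dBc/Hz], f0 [Hz], Pdc [W]) taken from Table 1.
    designs = [
        ("Ring a)", -63.6, 28e9, 9.85e-3),
        ("Ring b)", -65.97, 28e9, 20.6e-3),
        ("Ring c)", -65.6, 28e9, 12.78e-3),
        ("LC-VCO", -90.0, 30e9, 0.9e-3),
    ]
    for name, pn, f0, pdc in designs:
        print(f"{name:8s} FOM = {vco_fom(pn, f0, 1e6, pdc):.1f} dBc/Hz")
```

Because the phase noise term is negative, a more negative FOM indicates a better trade-off between noise, frequency and power.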

5.2 Post Layout Simulations

Post layout simulations show that the target frequency of 25 GHz is reached and can
be recovered in every PVT corner, from the slowest to the fastest, by varying the control
voltage Vc on the varicaps (Fig. 5). The typical post layout tuning range is 1.3 GHz and
the typical Kvco is around 2.3 GHz/V. The power dissipation of the VCO alone is about
860 μW and the PN at 1 MHz frequency offset is −95 dBc/Hz, with a typical FOM of
−184 dBc/Hz. The total power dissipation, buffers included, is about 8.5 mW. The area
of the VCO alone is about 160 μm × 110 μm, while the total area, including buffers,
mirror and interconnections, is about 410 μm × 220 μm. The typical single ended VCO
swing is about 550 mV, while the single ended swing on the measuring instrument is
around 185 mV.

Fig. 5. Post layout slowest (green), fastest (yellow) and typical (red) tuning range curves.

6 SEE Simulations

An aerospace application requires a radiation tolerance analysis. Because the
cumulative radiation doses in the space environment are relatively low, they can be
neglected, and attention is focused on the disturbances caused by high-energy particles
hitting the substrate (Single Event Effects). SEEs refer to the consequences that
particles may cause when they strike the silicon substrate of an electronic device
[14–16]. The hit generates a charge collection, which can be simulated in a CAD tool as
a double-exponential current pulse injected into every pn-junction node of the circuit,
in order to find the most sensitive ones. SEEs may also be caused by glitches due to
electromagnetic interference in automotive applications. Figure 6a shows a typical
response to a fairly high SEE injected charge of 375 fC. Both frequency and amplitude
vary after a SEE hit. Frequency variations are wider when the SEE hits a VCO node near
the varicaps, whereas amplitude variations can be critical when the SEE hits the common
gate node of the current mirrors. To raise circuit reliability in a radiation environment,
guard rings and deep N-wells are used for the transistors and a differential architecture
is implemented. To mitigate the amplitude variations due to SEEs at the common gate
nodes of the current mirrors, a technique based on increasing the RC constant of those
nodes [15] by raising their capacitance is presented (Fig. 7).

Fig. 6. SEE simulation results when each node of the circuit is stimulated by a charge
injection with a period of 3 ns, for the circuit a) without and b) with the mitigation technique.

Fig. 7. Proposed mitigation technique.

Since those nodes are directly connected to IC pads, through which external bias
currents are injected, these capacitances can be integrated under the pads, saving
some die area. Figure 6b shows the improved results.
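
The double-exponential injection used in this kind of SEE analysis can be sketched as I(t) = Q/(τf − τr)·(e^(−t/τf) − e^(−t/τr)), which integrates to the collected charge Q. The 375 fC charge is the value quoted above; the rise and fall time constants and the node capacitances are hypothetical placeholders, and the ΔV ≈ Q/C estimate is only a first-order view of why a larger node capacitance reduces the disturbance.

```python
import math


def see_current(t_s: float, charge_c: float,
                tau_rise_s: float, tau_fall_s: float) -> float:
    """Double-exponential SEE current pulse whose time integral equals charge_c."""
    if tau_fall_s <= tau_rise_s:
        raise ValueError("tau_fall must be larger than tau_rise")
    if t_s < 0.0:
        return 0.0
    scale = charge_c / (tau_fall_s - tau_rise_s)
    return scale * (math.exp(-t_s / tau_fall_s) - math.exp(-t_s / tau_rise_s))


def injected_charge(charge_c: float, tau_rise_s: float, tau_fall_s: float,
                    t_end_s: float, dt_s: float) -> float:
    """Trapezoidal integration of the pulse from 0 to t_end_s."""
    steps = int(round(t_end_s / dt_s))
    total = 0.0
    prev = see_current(0.0, charge_c, tau_rise_s, tau_fall_s)
    for k in range(1, steps + 1):
        cur = see_current(k * dt_s, charge_c, tau_rise_s, tau_fall_s)
        total += 0.5 * (prev + cur) * dt_s
        prev = cur
    return total


def voltage_step(charge_c: float, node_cap_f: float) -> float:
    """First-order voltage disturbance on a node: dV = Q / C."""
    return charge_c / node_cap_f


if __name__ == "__main__":
    q_see = 375e-15                    # injected charge quoted in the text
    tau_r, tau_f = 10e-12, 100e-12     # hypothetical time constants
    q_num = injected_charge(q_see, tau_r, tau_f, t_end_s=3e-9, dt_s=1e-12)
    print(f"integrated charge = {q_num * 1e15:.1f} fC")
    for c_node in (1e-12, 10e-12):     # hypothetical node capacitances
        dv = voltage_step(q_see, c_node)
        print(f"C = {c_node * 1e12:.0f} pF -> dV = {dv * 1e3:.1f} mV")
```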

7 State of the Art Comparison and Conclusions


This work proposes a comparison of various VCO structures in a 28 nm process, and
the complete design of a radiation tolerant VCO working at a central frequency of
25 GHz, with a low supply voltage of 0.9 V, and able to operate in a temperature range
from −40 °C to 125 °C. The comparison has highlighted that ring oscillator based VCOs,
in both their CMOS and CML versions, suffer from some problems in terms of reliability
and overall performance, resulting from the issues introduced by low voltage
ultra-scaled technologies in analog-RF design. The LC tank oscillator has instead shown
better performance, so an LC tank VCO has been fully implemented. The designed
LC-VCO covers the frequency range from 24.35 to 25.65 GHz, with a power dissipation
below 1 mW across all PVT corners and a typical phase noise of −95 dBc/Hz at 1 MHz
offset from the 25 GHz carrier. An analysis of the consequences of SEEs has been
performed, since the design targets harsh operating environments such as aerospace
applications. Table 2 reports a brief comparison between the designed system and some
previously published ultra-scaled, GHz-range and, where applicable, radiation tolerant
VCOs.

Table 2. State of the art comparison.

| Parameter             | This work     | [2]      | [4]           | [17]     |
|-----------------------|---------------|----------|---------------|----------|
| Technology [nm]       | 28            | 65       | 65            | 28       |
| VCO type              | NMOS LC       | CMOS LC  | NMOS LC       | NMOS LC  |
| Rad. tolerance        | yes           | yes      | yes           | no       |
| Vdd [V]               | 0.9           | 1.2      | 1.2           | 0.85     |
| Temperature range     | [−40–125] °C  | –        | [−55–125] °C  | –        |
| Frequency [GHz]       | 25            | 2.56     | 6.25          | 15       |
| Power diss. [mW]      | 0.86          | 1.8      | 1.8           | 6.8      |
| L(1 MHz) [dBc/Hz]     | −95           | −118     | −105          | −97      |
| FOM(1 MHz) [dBc/Hz]   | −184          | −188.7   | −178          | −172     |
The designed system will be integrated in a 28 nm TSMC process, on a 1 mm × 1 mm
chip, together with another project developed by the DII of the University of Pisa. Both
circuits will be tested at CERN by irradiating them with an intense flux of ionizing
particles, such as heavy ions, to verify the robustness of the circuits against SEEs.

References
1. EDA (European Defence Agency) European Commission, ESA (European Space Agency).
Critical Space Technologies for European Strategic Non-Dependence. Background document
25/02/19
2. Prinzie, J., et al.: A 2.56-GHz SEU radiation hard LC-tank VCO for high-speed
communication links in 65-nm CMOS technology. IEEE Trans. Nucl. Sci. 65(1), 407–412
(2018)
3. https://www.esa.int/Enabling_Support/Space_Engineering_Technology/Onboard_Data_Processing/SpaceFibre
4. Monda, D., et al.: Analysis and comparison of ring and LC-tank oscillators for 65 nm
integration of rad-hard VCO for spacefibre applications. In: International Conference on
Applications in Electronics Pervading Industry, Environment and Society, pp. 25–32.
Springer, Cham (2019)
5. https://auto-serdes.org/
6. Ciarpi, G., et al.: Radiation hardness by design techniques for 1 Grad TID rad-hard
systems in 65 nm standard CMOS technologies. In: International Conference on Applications
in Electronics Pervading Industry, Environment and Society, pp. 269–276. Springer, Cham
(2018)
7. Zhang, C., Jazaeri, F., et al.: Characterization of gigarad total ionizing dose and annealing
effects on 28-nm bulk MOSFETs. IEEE Trans. Nucl. Sci. 64(10), 2639–2647 (2017)
8. https://www.cadence.com/ko_KR/home/tools/custom-ic-analog-rf-design/circuit-simulation/spectre-simulation-platform.html
9. Moghavvemi, M., Attaran, A.: Recent advances in delay cell VCOs [application notes]. IEEE
Microwave Mag. 12(5), 110–118 (2011)
10. Heydari, P.: Design and analysis of low-voltage current-mode logic buffers. In: Fourth
International Symposium on Quality Electronic Design. IEEE (2003)
11. Razavi, B.: A study of phase noise in CMOS oscillators. IEEE J. Solid-State Circuits 31(3),
331–343 (1996)
12. Razavi, B.: RF Microelectronics, vol. 1. Prentice Hall, New Jersey (1998)
13. Voinigescu, S.: High-Frequency Integrated Circuits. Cambridge University Press, Cambridge
(2013)
14. Frontini, L.: Design of CMOS logic gates tolerant of single-event effects for extreme
radiation environments. Università degli Studi di Milano, Dipartimento di Scienze
Matematiche, Fisiche e Naturali (2014)
15. Space Product Assurance Techniques for Radiation Effects Mitigation in ASICs and
FPGAs Handbook. ECSS Secretariat ESA-ESTEC Requirements and Standards Division,
Noordwijk, The Netherlands, ECSS-Q-HB-60-02A, 1 September 2016
16. DasGupta, S.: Trends In Single Event Pulse Widths And Pulse Shapes In Deep Submicron
CMOS. Master of Science in Electrical Engineering, Nashville, Tennessee (2007)
17. Jorgensen, E.: Design of VCOs in global foundries 28 nm HPP CMOS. J. Microelectronic
Eng. Conf. 21(1), Article 14. https://scholarworks.rit.edu/ritamec/vol21/iss1/14
vrLab: A Virtual and Remote Low Cost
Electronics Lab Platform

Massimo Ruo Roch(B) and Maurizio Martina

Politecnico di Torino, Dipartimento di Elettronica e Telecomunicazioni, Turin, Italy


massimo.ruoroch@polito.it

Abstract. The SARS-CoV2 pandemic stressed the need to increase the adoption
of remote teaching. Technical courses, specifically electronic engineering
ones, suffered from the lack of real lab experiments carried out directly by
students. In this paper a new approach is presented, based on the use
of very low cost experimental boards, which act both as a measurement
instrument and as a programmable prototype circuit. A first board, targeted
at experiments for analog and digital electronics courses, has been designed
and is described in this paper.

Keywords: SARS-CoV2 · Lock down · Electronics lab · Virtual lab

1 Introduction
Starting from the spring of 2020, the SARS-CoV2 pandemic hit China, Italy, and
the rest of the world. Lockdown countermeasures were mandatory to mitigate
the spread of the virus among the population. This meant that teaching ‘in presence’
was stopped too, in schools at every level, from primary schools up to universities [1].
Current Internet capabilities make it possible to overcome the difficulties related to
lectures, as videoconferencing is a well established technique [2], even if some
challenges arise when the number of attendees goes beyond a few hundred and direct
interactivity is desired (we are speaking about real-time lessons, and not just playing
recorded videos) [3]. Nowadays, host virtualization and private or public clouds are
already used inside universities, making remote access to computing platforms
available across the campus [4], even at supranational level [5].
Moreover, web applications that support interactivity in simulated environments,
such as Moodle [6], are in widespread use and can be used to introduce
exercises and simulated laboratory experiments, in which students interact
with virtual objects, change their parameters, and simulate the behaviour
of the modified experiments.
Electronics engineering courses are natural candidates for these kinds of web-based
tools. As an example, in analog and digital electronics courses students
can design circuits, either via schematic capture or Hardware Description Languages
(VHDL, Verilog, SystemC). Later, they can perform analog, digital, or
mixed signal simulations, using web interfaces to the standard simulation
tools (Spice, Modelsim) [7,8].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 213–220, 2021.
https://doi.org/10.1007/978-3-030-66729-0_24
What is missing in this approach, however, is contact with real world
objects, which, in our opinion, is fundamental to acquiring specific engineering
skills. In fact, developing design capabilities requires the following
abilities:

a) Design space exploration
b) Simulation of the designed system
c) Verification of compliance to a previously defined high level reference model
through every refinement step
d) Hardware verification, with limited debugging capabilities
e) Characterization of the hardware system

Points a) through c) can be easily accomplished with the web based methodologies
described above. The last two items, however, cannot currently be covered in this
way at all.
First of all, it must be emphasized that simulation cannot fully substitute
measurements performed on the real circuit. In fact, the major drawback of
simulations performed within courses is the limited coverage of the tests, mainly
due to student inexperience and lack of time. So, hardware verification is a
must.
Moreover, the ability to diagnose faults in real circuits is a typical high value
engineering skill, which must be pursued. It requires skills both in the use of real
measurement instruments and in the personal development of fault search
methodologies specifically targeted to digital or analog circuits.
Last, real signals are usually very different from simulated ones, sometimes
in ways that surprise a student, and it is important to be able to visualize them
in a realistic manner, as sampled by a real measurement instrument, including
noise and other artifacts.
A possible solution to these requirements would be to give students access
to physical devices on which to perform lab work and, at the same time, to learn
how to use measurement instruments.
In this context, there are two different possibilities:
– Universities buy, and send to students, a ‘lab kit’ made up of a set of boards
suitable for lab needs: as an example, one MCU board, one FPGA board, a
digital oscilloscope, a power supply, etc. This approach is problematic from the
point of view of cost, as the number of kits must be greater than the number
of students following the courses that use the kits. In fact, in contrast
to what happens in ‘in presence’ labs, no sharing of lab equipment is possible,
as it is physically located at the home of the student. Moreover, there are
logistics problems too, due to the complexity involved in delivering kits
to students before the course starts and in collecting them after the course ends.
Things can be even worse if students are located in different cities or countries
(at Politecnico di Torino, one half of the students come from different
regions, and one tenth from different countries).
– Universities suggest the above kit to students (as happens for a course
textbook). Of course, this solution is viable only if the total kit cost is
low enough to be acceptable for student budgets. And this is the problem. In fact,
it is quite easy to find low cost experimental boards that can be used to implement
the experiment, typically MCU based (Arduino, Nucleo), with prices in
the range of 10–20 USD. But there are very few low cost boards suitable for
programmable logic development (around 100 USD each in any case) and, worse,
it is nearly impossible to get a set of low cost measurement instruments suitable
for reasonably sophisticated experiments (the minimum is around 200 USD).

The main reason why no suitable solution has been found is that the systems
making up a kit are not developed with teaching in mind. They are general
purpose boards, designed for small scale prototyping or for technology
evaluation. As such, they have much more hardware than
needed, i.e. their cost is not at a minimum. Equally, measurement instruments
are true, complete, sophisticated devices, with overabundant features, i.e., again,
they are too expensive.
The basic idea to fill the above gap is the development of a technology, a
platform, specifically targeted to teaching labs, aimed at minimizing unit cost
while fulfilling the requirements of electronics labs. It will be realized as a mix of
hardware/software components and could allow three different usage models:

– The students use it at home. No additional hardware must be required, except
for the experiment platform itself and a personal computer (laptop or desktop).
– The students use it in the campus labs. This can be a duplication of existing
lab equipment, but this choice allows the same course material to be used in
different situations, i.e. remote labs or ‘in presence’ ones, and avoids biases
between on-site and off-site students.
– The student is at home, but has Internet access to the experiment
deployed inside university labs. A critical point is to mimic, as far as possible,
the same user interface as in the preceding use cases, to maintain a uniform
user experience.

In the following sections, a possible solution is proposed. First of all, a generic
architecture is described, then a first case study implementation is presented.

2 System Architecture

The basic idea is to develop a hardware device with two sections. The first one,
called the desk area, is meant to resemble a typical lab desk, integrating the
functionalities of a digital storage oscilloscope (DSO), a multimeter and a
programmable analog signal generator. Moreover, in this section the teacher can load
special purpose devices, e.g. test beds for the lab experience carried out by students.
The second section (the student area) is instead the equivalent of an experimental
board, on which students can carry out the experiment itself.
The board must be easily connected to a computing device (PC, Raspberry,
etc.), where computationally expensive tasks can be performed. These tasks are,
as an example, signal analysis algorithms (FFT, noise filtering, digital protocol
analysis), data presentation to the user, and virtual instrument controls.
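
As an illustration of the kind of host-side signal analysis mentioned above, the sketch below estimates the dominant frequency of a block of samples, as a virtual DSO front end might do. It is a minimal sketch under stated assumptions: the function names, the 16 kHz sample rate and the synthetic test tone are hypothetical, and a plain O(N²) DFT from the Python standard library stands in for a real FFT so that the example stays self-contained. It does not describe the actual PC software of the platform.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Single-sided DFT magnitude spectrum (bins 0 .. N/2) of real samples."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = 0j
        for i, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * i / n)
        magnitudes.append(abs(acc) / n)
    return magnitudes


def dominant_frequency(samples, sample_rate_hz):
    """Frequency of the strongest non-DC bin, in Hz."""
    magnitudes = dft_magnitudes(samples)
    # Skip bin 0 so that a DC offset is not reported as the dominant tone.
    peak_bin = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
    return peak_bin * sample_rate_hz / len(samples)


if __name__ == "__main__":
    # Hypothetical acquisition: a 1 kHz tone sampled at 16 kHz, 64 samples.
    sample_rate = 16000.0
    samples = [math.sin(2.0 * math.pi * 1000.0 * i / sample_rate)
               for i in range(64)]
    print(f"dominant frequency: {dominant_frequency(samples, sample_rate):.1f} Hz")
```

Keeping this processing on the computing device is consistent with the cost argument of the architecture: the board only has to acquire and transfer samples.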
The integration of custom hardware and software running on the computing
device is the key to minimizing the cost of the overall system. In fact, recurring costs
(hardware) are minimized, at the expense of designing PC software, but the
latter can be developed through an Open Source model. This choice introduces a
further possibility, i.e. to involve computer science students in the development
of this software, too. This fulfills another teaching activity, well integrated inside an
IT degree, even if not strictly required by the initial specifications.
In this architecture, students can access the experimental board either
directly or through the Internet, where a Raspberry PI or a laboratory server
exposes the user interface.
The overall architecture is depicted in Fig. 1.

Fig. 1. Basic architecture of the system

3 Implementation
To assess the feasibility of the proposed approach, a first system has been designed,
specifically aimed at electronics engineering courses in a master’s degree. Target
courses are related to embedded systems design, low power digital electronics,
programmable logic, and biomedical electronic systems. Skills acquired
by students will be mainly in the fields of HDL design methodology, MCU
hardware/software integration, embedded systems firmware development, analog and
digital data acquisition and processing techniques, and hardware/software low power
design.
A high level block diagram of the board is shown in Fig. 2.
Three main blocks are visible: the USB interface (STLinkV3MODS, at the
extreme left), the student area (upper half), and the desk area (lower half).
The two sections are linked by a 32 bit general purpose bus, which is freely
usable by experiments.
Fig. 2. Block diagram of the designed board

3.1 USB Interface


This part consists of a commercial module, STLinkV3MODS, produced by
STMicroelectronics. First, it is used as a high speed (15 Mb/s) serial communication
channel toward the PC. The serial channel is connected to the desk MCU to exchange
commands and data, used mainly to control the virtual instruments and to
collect acquired measurements. A second function of this block is to control
the student MCU through a standard ARM SWD interface. In detail, this
link is used to allow MCU programming and real-time debugging of the firmware
designed and loaded by the student.

3.2 Desk Area


This area contains three main devices:
– Desk MCU. Based on a 32 bit ARM low power device (STM32L496), it runs
firmware that implements the low speed portion of the virtual instruments
(multimeter, analog signal generator, and DSO), parses commands
received from the USB interface, and interfaces to the teacher (desk) FPGA. The
DSO is implemented through the internal ADC of the MCU, while a two-channel
DAC is used to generate arbitrary analog signals. The ADC is also used
to implement the virtual multimeter, which measures the supply currents
drawn by student devices. The same MCU can also host teacher provided test
benches, to assess the validity of signals generated in the student area.
– Desk FPGA. An Intel Cyclone 10 LP low power FPGA, with up to 25k
LEs, implements in hardware the high speed portion of the virtual instruments.
As an example, it samples the digital input channels of the logic state analyzer,
generates digital sequences for the digital pattern generator, and stores them
in the high speed RAM storage. It also works as a bridge between the
MCU and the RAM storage. This way, the MCU can save samples acquired by the
multimeter or DSO, and can read samples back from the RAM storage to feed
its internal analog signal generator. The FPGA can host teacher provided
custom test benches, too.
– RAM storage. An 8 or 16 MByte RAM is provided as a generic data store.
Due to limitations on cost and package pins, a HyperBus device has been
chosen, managed by the FPGA. Note that the desk FPGA can also
be used to expose an arbitrary bus protocol to students, such as SRAM, DRAM,
etc., emulating a real memory device.

3.3 Student Area


The student area is implemented with one MCU and one FPGA, identical to
the ones used in the teacher block. Some switches, LEDs, and an I/O connector,
used to link the board to external breadboards or external custom circuits, are
also connected to the general purpose bus. In addition, there is
an Arduino compatible connector, used to extend the board functionalities with
low cost expansion boards (motor drivers, power I/O, displays, accelerometers,
Bluetooth transceivers, etc.). Last, an important feature of the board is that
students can program both the FPGA and the MCU using standard programming tools
(STM32CubeIDE for the MCU, and Quartus Prime for the FPGA), which are
available at no cost.

4 Conclusions and Future Work


In this work, a custom experimental board has been described, which will be
used to give lab access to students of electronics courses in the master’s degree
in Electronics Engineering at Politecnico di Torino. The designed architecture
fulfills the design requirements and achieves the following targets:
– Flexibility. Everything is fully programmable, both on the student and on
the teacher side. This means that new experiments and new virtual instruments
can be freely implemented just by changing the firmware of the MCUs and the FPGA
configuration.
– Scalability. Devices were chosen to allow ‘family migration’. As an example,
the same footprint can host FPGAs ranging from 6k up to 25k LEs. The same
applies to the MCUs, for which the same device is available with different
internal memory sizes, and to the HyperRAM. This means that a ‘university
edition’ of the board, used in campus laboratories, can be built maximizing
the available hardware resources (and cost), while a ‘student edition’, bought
directly by students, will be realized with minimum cost hardware.
– Low cost. As the board is specifically designed for teaching, its cost is in the
order of 70 USD. This is remarkable, as it substitutes an entire set of boards
and measurement instruments.
Board design is now complete, and prototypes are currently in production.
Figure 3 is a 3D rendering of the assembled board. A few components are missing
(USB-C connectors, LCD display, and DC/DC converter ICs), as their 3D
models were not available.
Fig. 3. 3D rendering of the assembled board

Overall size is 160 × 100 mm (single Eurocard format). The connectors on the
right give access to the digital signals of the student area. The same applies to the
large bottom connector, which carries the analog signals of the same section. In the
upper right corner, the light green block is the USB interface.
The board itself will be used by students in real courses, starting from November
2020, and feedback will be collected to possibly refine the system.
The flexibility given by the designed architecture makes further uses foreseeable,
implementing different virtual instruments with just firmware and
FPGA configuration changes. As stated above, this could be the target of IT
courses, too, which were not included while defining the system specifications. The
same applies to the refinement of the PC software used as the user interface, which
will be customized according to course needs. Last, at the end of 2019, an IEEE
standard specifically related to virtual laboratories was approved [9], and a possible
development is making the developed board compliant with this standard.

References
1. Tropea, M., De Rango, F.: COVID-19 in Italy: current state, impact and ICT-based
solutions. In: IET Smart Cities, vol. 2(2), pp. 74-81, July 2020. https://doi.org/10.
1049/iet-smc.2020.0052
2. Kiss, G.: Comparison of traditional and web-based education - case study “BigBlue-
Button” In: International Symposium on Information Technologies in Medicine and
Education. Hokodate, Hokkaido, vol. 2012, pp. 224–227 (2012). https://doi.org/10.
1109/ITiME.2012.6291286
3. Magalhães Vasconcelos, P.R., de Araújo Freitas, G.A., Marques, T.G.: Virtualization
technologies in web conferencing systems: a performance overview. In: 11th International
Conference for Internet Technology and Secured Transactions (ICITST).
Barcelona, vol. 2016, pp. 376–383 (2016). https://doi.org/10.1109/ICITST.2016.7856734
4. Ruo Roch, M., Graziano, M.: Teaching in the cloud - microelectronics ubiquitous lab
(MULAB). In: 9th European Workshop on Microelectronics Education (EWME),
Grenoble, pp. 131–135 (2012)
5. Roch, M.R., Demarchi, D., Klossek, M., Tzanova, S.: MECA, the microelec-
tronics cloud alliance. In: 2018 IEEE Global Engineering Education Conference
(EDUCON), Tenerife, pp. 1419–1423 (2018). https://doi.org/10.1109/EDUCON.
2018.8363396
6. http://www.moodle.org
7. Magdin, M., Cápay, M., Halmeš, M.: Implementation of LogicSim in LMS moodle.
In: 2012 IEEE 10th International Conference on Emerging eLearning Technologies
and Applications (ICETA), Stara Lesna, pp. 57-59 (2012). https://doi.org/10.1109/
ICETA.2012.6418281
8. Alajbeg, T., Sokele, M.: Implementation of electronic design automation software
tool in the learning process. In: 42nd International Convention on Information
and Communication Technology, Electronics and Microelectronics (MIPRO).
Opatija, Croatia, vol. 2019, pp. 532–536 (2019). https://doi.org/10.23919/MIPRO.2019.8757096
9. IEEE Standard for Networked Smart Learning Objects for Online Laboratories. In:
IEEE Std 1876-2019 , pp. 1-57, 30 May 2019. https://doi.org/10.1109/IEEESTD.
2019.8723446
Mechatronics, Energies and Industry 4.0
Mechatronic Design Optimization
of an Electrical Drilling Machine for Trenchless
Operations in Urban Environment

Valerio Vita1 , Luca Pugi1(B) , Lorenzo Berzi1 , Francesco Grasso2 , Raffaele Savi3 ,
Massimo Delogu1 , and Enrico Boni2
1 Department of Industrial Engineering, University of Florence, Via di Santa Marta 3,
50139 Florence, Italy
Luca.pugi@unifi.it
2 Department of Information Engineering, University of Florence, Via di Santa Marta 3,
50139 Florence, Italy
Francesco.grasso@unifi.it
3 E.G.T. SRL, Fontevivo, Parma, Italy

r.savi@egt.it

Abstract. Trenchless excavation will play a key role in the development of smart
cities, allowing a fast and sustainable improvement of underground infrastructures.
Directional drilling machines are a fundamental tool in this process, allowing the
installation of pipes, ducts and cables along a relatively free trajectory. In view
of the progressive decarbonization of urban communities, this kind of machine also
has to be electrified. In this work the authors introduce model-based design criteria
for an optimized mechatronic design of the main subsystems of the machine.

1 Introduction

Directional drilling machines and, more generally, trenchless drilling technologies
are the subject of continuous research. In this work the authors focus on model-based
design procedures, in order to optimize the design of such machines as much as
possible. This work is the natural continuation of a previous one [1], in which this
application was originally proposed. The main elements of the investigated
construction vehicle are shown in Fig. 1. The machine is essentially a tracked
drilling machine. The drilling unit is composed of a rotary machine that actuates the
drilling rods used to perform the desired perforation. The advance of the drilling
unit is obtained through a rack and pinion transmission. Additional actuation systems,
the grippers, hold the drilling rods during screwing and unscrewing operations.
Drilling rods are stored in an automated box/magazine. A drilling head/tool is placed
at the tip of the drilling rods. A lubrication flow of pressurized water is injected
through the drilling rod in order to obtain the following effects:

• Assure lubrication and cooling of cutting tools;
• Remove debris of excavated material allowing the advance of the drilling systems.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 223–228, 2021.
https://doi.org/10.1007/978-3-030-66729-0_25

Fig. 1. Proposed Drilling Machine, main components

• Also, lubricant flow can be oriented in order to control the trajectory of the drilling
head.

To give the machine a limited mobility on any kind of ground, the vehicle moves on
tracks, which are also electrified.
In this work the authors focus on the models used for a proper sizing of the motors
of the drilling unit.

2 Excavation Process: Modelling


The trajectory of a bore for a trenchless excavation can be modelled as a sequence of
straight and curved bores. The loads that have to be applied to the drilling rods to
perform the chosen excavation can be modelled as the sum of three different contributions:

• Cutting Forces: the loads associated with the excavation process.
• Distributed Friction: friction forces proportional to the length of the excavated
bore, since they arise from the interaction of the bore surface with the lubricating
water, the transported excavation debris and the surface of the drilling rod.
• Distributed Curvature Losses: the curvature of the bore introduces additional
friction losses.

2.1 Cutting Forces


For a rough estimation of the forces and torques due to excavation, the specific
energy of excavation (SEE) should be considered. According to (1), it is defined as
the ratio between the energy needed to drill the ground and the corresponding volume
of debris produced.

$$\mathrm{SEE} = \frac{E_{EX}}{A_{bore}\, V_{EX}} \tag{1}$$

SEE is a specific feature of the material. Depending on the chosen tool and drilling
parameters (advance and rotational speed), it is possible to define an excavation
efficiency $\eta_{ex}$ as the ratio between the minimum value of SEE defined according
to (1) and the real specific energy required by the tool, $\mathrm{SEE}_{real}$ (2).

$$\eta_{ex} = \frac{\mathrm{SEE}}{\mathrm{SEE}_{real}} \tag{2}$$

Assuming a constant mean efficiency $\eta_{ex}$ of 0.35, the power provided by the
tool, $W_{drill}$, is described by (3).

$$\frac{\mathrm{SEE}}{\eta_{ex}}\, A_{bore}\, v_{ex} = M_{ex}\, \omega_{rotary} + T_{ex}\, v_{ex} \tag{3}$$
In (3) the following symbols are adopted:

• $T_{ex}$ and $v_{ex}$ are the longitudinal thrust applied to the drill and the
corresponding advance speed;
• $M_{ex}$ and $\omega_{rotary}$ are the applied cutting torque and the rotational
speed of the rotary unit.

According to the literature [9], there is a linear dependency between applied torque
and thrust. This linear dependency saturates at a maximum advance rate, which roughly
corresponds to a maximum ratio $R_{opt}$ between the thrust advance power $W_T$ and the
rotational power $W_\omega$ due to the cutting torque, as defined by (4):

$$\frac{W_T}{W_\omega} = \frac{T_{ex}\, v_{ex}}{M_{ex}\, \omega_{rotary}} \le R_{opt} \approx 0.01 \text{ to } 0.1 \tag{4}$$
The limit $R_{opt}$ is associated with the maximum advance speed of the drill and
consequently with the maximum volume of excavated material. As concerns the advance
speed, two additional constraints should be considered:

• $\omega_{rotary}$ cannot be lower than an assigned limit, which also depends on the
diameter and type of the considered tools (about 20 rpm for the considered application).
• The advance speed $v_{ex}$ is limited by the capacity to manage the excavated
material. In particular, to assure the evacuation of the excavated material, the flow
of injected lubricant water has to be about four times higher than the flow of
excavated material; the maximum advance speed $v_{ex}$ is therefore also limited by the
available flow of injected lubricant water $Q_{lub}$ (5):

$$v_{ex} \le \frac{1}{4}\, \frac{Q_{lub}}{A_{bore}} \tag{5}$$
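
The relations of this subsection can be put together in a short numerical sketch. The reamer diameter (430 mm), the lubricant flow (200 l/min) and the efficiency (0.35) are the values quoted in the text; the SEE value of 20 kWh/m³ and the power ratio of 0.1 are assumptions chosen inside the quoted ranges, and all function names are illustrative only.

```python
import math

J_PER_KWH = 3.6e6  # 1 kWh = 3.6e6 J


def bore_area(diameter_m):
    """Cross-section A_bore of a circular bore, in m^2."""
    return math.pi * diameter_m ** 2 / 4.0


def drilling_power(see_j_per_m3, eta_ex, a_bore_m2, v_ex_m_s):
    """Left-hand side of Eq. (3): (SEE / eta_ex) * A_bore * v_ex, in W."""
    return see_j_per_m3 / eta_ex * a_bore_m2 * v_ex_m_s


def split_power(w_drill, ratio):
    """Split the total power into (W_omega, W_T) for a given W_T / W_omega (Eq. 4)."""
    w_omega = w_drill / (1.0 + ratio)
    return w_omega, w_drill - w_omega


def max_advance_speed(q_lub_m3_s, a_bore_m2):
    """Upper bound on v_ex set by the lubricant flow, Eq. (5), in m/s."""
    return 0.25 * q_lub_m3_s / a_bore_m2


# Values taken from the text: 430 mm reamer, 200 l/min, eta_ex = 0.35.
a_bore = bore_area(0.430)
v_max = max_advance_speed(200.0e-3 / 60.0, a_bore)

# Assumed for illustration only: SEE = 20 kWh/m^3 and W_T / W_omega = 0.1.
w_drill = drilling_power(20.0 * J_PER_KWH, 0.35, a_bore, v_max)
w_omega, w_thrust = split_power(w_drill, 0.1)

print(f"A_bore = {a_bore:.4f} m^2, v_ex <= {v_max * 60.0:.3f} m/min")
print(f"W_drill = {w_drill / 1e3:.1f} kW "
      f"(rotary {w_omega / 1e3:.1f} kW, thrust {w_thrust / 1e3:.1f} kW)")
```

With the quoted flow and diameter, the flow constraint alone limits the advance speed to roughly 0.34 m/min, independently of the motor size.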

2.2 Distributed Friction


For straight bores it is possible to calculate the differential increment of the
applied thrust $T_{ex}$ with respect to the bore length $L$ and inclination $\alpha$ (6):

$$\begin{aligned}
\frac{dT}{dL} &= \beta w \left(\cos\alpha + \mu\, \mathrm{sign}(v_x) \sin\alpha\right) \\
\beta &= 1 - \frac{\rho_{lub} A_{int} - \rho_{mud}\left(A_{bore} - A_{rod}\right)}{\rho_{rod}\left(A_{rod} - A_{int}\right)} \\
w &= \rho_{rod}\, \frac{\pi}{4}\left(D_{ext}^{2} - D_{int}^{2}\right) g
\end{aligned} \tag{6}$$

where $\beta$ and $w$ are respectively the floating rod coefficient and the specific
weight of the drilling rod, calculated as functions of the following parameters:
$\rho_{lub}$, $\rho_{mud}$, $\rho_{rod}$ are respectively the densities of the injected
water, of the excavated mud and of the drilling rod; $A_{bore}$, $A_{rod}$ are the areas
occupied by the drilled bore and by the drilling rod, $D_{ext}$ and $D_{int}$ being the
external and internal diameters of the rod.
The rotation of the drilling rod inside a bore also introduces distributed loads in
terms of dissipated torque, which is evaluated according to (7):

$$\frac{dM}{dL} = \mu\, r\, w \tag{7}$$

where $r$ is the radius of the drilling rod, so this contribution is relatively modest;
typical values of $\mu$ are around 0.5.
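
As a rough illustration of how the distributed terms grow with bore length, the sketch below evaluates the gradients of Eqs. (6) and (7) for a straight bore of constant inclination. It assumes that the coefficient in Eq. (7) is the friction coefficient μ, and it takes the floating coefficient β as a given number instead of recomputing it. The rod dimensions, β value and inclination are hypothetical values chosen only to exercise the formulas.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def specific_weight(rho_rod, d_ext, d_int):
    """w of Eq. (6): weight per unit length of a hollow rod, in N/m."""
    return rho_rod * math.pi / 4.0 * (d_ext ** 2 - d_int ** 2) * G


def thrust_gradient(beta, w, alpha, mu, v_sign=1.0):
    """dT/dL of Eq. (6), in N per metre of bore."""
    return beta * w * (math.cos(alpha) + mu * v_sign * math.sin(alpha))


def torque_gradient(mu, r, w):
    """dM/dL of Eq. (7), read here as mu * r * w, in N*m per metre of bore."""
    return mu * r * w


def straight_bore_loads(length, beta, w, alpha, mu, r):
    """Thrust and torque accumulated along a straight bore of constant inclination."""
    return (thrust_gradient(beta, w, alpha, mu) * length,
            torque_gradient(mu, r, w) * length)


# Hypothetical steel rod (90 mm outer, 60 mm inner diameter) and assumed beta = 0.8.
w = specific_weight(7850.0, 0.090, 0.060)
thrust, torque = straight_bore_loads(300.0, 0.8, w, math.radians(10.0), 0.5, 0.045)
print(f"w = {w:.1f} N/m, friction thrust = {thrust / 1e3:.1f} kN, "
      f"friction torque = {torque / 1e3:.2f} kN*m")
```

Both contributions are linear in the length L, which is why long bores mainly penalize the high-speed working points of the motors.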

2.3 Distributed Curvature Losses


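For an arc $\theta$ of radius $R$, the variation of $T$ is described by Eq. (8):

$$T_\theta = T_0\, e^{\pm\mu|\theta|} + \beta w R\, \theta \left(\frac{\sin\alpha_\theta - \sin\alpha_0}{\alpha_\theta - \alpha_0}\right) \tag{8}$$

The corresponding curvature losses in terms of torque are given by (9):

$$M_\theta = \mu r \left(T_0 + w R \sin\alpha_0\right)\left(\alpha_\theta - \alpha_0\right) \pm 2 \mu r w \left(\cos\alpha_\theta - \cos\alpha_0\right) \tag{9}$$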
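
Equations (8) and (9) can be evaluated directly once the entry thrust and the arc geometry are known. The following is a minimal sketch: the `sign` argument stands for the ± of the two equations and has to be chosen by the caller, since the text does not state the rule for it. The function names are illustrative.

```python
import math


def thrust_after_arc(t0, mu, theta, beta, w, radius, alpha_0, alpha_theta, sign=1.0):
    """Eq. (8): thrust at the end of an arc of angle theta and radius R."""
    d_alpha = alpha_theta - alpha_0
    if abs(d_alpha) < 1e-12:
        # Limit of (sin a_theta - sin a_0) / (a_theta - a_0) for a vanishing change.
        mean_term = math.cos(alpha_0)
    else:
        mean_term = (math.sin(alpha_theta) - math.sin(alpha_0)) / d_alpha
    return t0 * math.exp(sign * mu * abs(theta)) + beta * w * radius * theta * mean_term


def torque_over_arc(t0, mu, r, w, radius, alpha_0, alpha_theta, sign=1.0):
    """Eq. (9): torque lost along the same arc."""
    first = mu * r * (t0 + w * radius * math.sin(alpha_0)) * (alpha_theta - alpha_0)
    second = 2.0 * mu * r * w * (math.cos(alpha_theta) - math.cos(alpha_0))
    return first + sign * second
```

The exponential factor in Eq. (8) has the form of a capstan-type friction term, so the thrust needed after a curve grows with the entry thrust and not only with the weight of the rod.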

3 Sizing of Drilling Motors


Knowing the specific energy SEE of the drilled ground, the area $A_{bore}$ of the
desired bore and a desired advance speed $v_{ex}$, it is possible to roughly calculate
the power $W_\omega$ (10), also including the efficiency $\eta_{trasm}$ of the chosen
mechanical transmission system ($W_{\omega\eta}$).

$$\frac{\mathrm{SEE}}{\eta_{ex}}\, A_{bore}\, v_{ex} = 1.1\, M_{ex}\, \omega_{rotary} = W_\omega = W_{\omega\eta}\, \eta_{trasm} \tag{10}$$

Once the size of the rotary drilling unit has been chosen in terms of power, it is also
possible to calculate the power of the thrust unit that has to be installed to assure
the desired $v_{ex}$. In theory this power is about one tenth of the rotary one; however,
in order to compensate for friction losses along the drilled bore, the preliminary
specification of the machine doubles this power, resulting in $W_T = 0.2\, W_\omega$.
An additional thrust capability is also generally useful to increase the robustness of
the performance of the designed system. Fig. 2 shows the performance in terms of exerted
$T_{ex}$ and $M_{ex}$, including gearbox reduction ratio and efficiency.
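
The sizing chain of Eq. (10) and the 0.2 rule for the thrust unit can be written as a few lines of arithmetic. This is only a sketch: the transmission efficiency and the working point used in the example call are assumed values, and the dictionary keys are illustrative names.

```python
J_PER_KWH = 3.6e6  # 1 kWh = 3.6e6 J


def size_drives(see_kwh_m3, eta_ex, a_bore_m2, v_ex_m_s, eta_trasm, thrust_share=0.2):
    """Power figures of Eq. (10) plus the thrust power W_T = thrust_share * W_omega."""
    w_omega = see_kwh_m3 * J_PER_KWH / eta_ex * a_bore_m2 * v_ex_m_s
    return {
        "w_omega": w_omega,                  # rotary power at the drill, W
        "w_omega_eta": w_omega / eta_trasm,  # from W_omega = W_omega_eta * eta_trasm
        "cutting_power": w_omega / 1.1,      # M_ex * omega_rotary
        "w_thrust": thrust_share * w_omega,  # installed thrust power
    }


# Assumed working point: SEE = 5 kWh/m^3, 0.145 m^2 bore, 0.3 m/min, eta_trasm = 0.9.
sizes = size_drives(5.0, 0.35, 0.145, 0.3 / 60.0, 0.9)
for name, value in sizes.items():
    print(f"{name}: {value / 1e3:.1f} kW")
```

Keeping the four figures separate makes it easy to compare the result with the motor curves of Fig. 2, which include the gearbox ratio and efficiency.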

Fig. 2. Performances of Adopted Motors

As concerns the sizing of the linear actuation stage that provides $T_{ex}$, the maximum
thrust effort is limited to about 7–8 tons. This limitation arises from practical static
considerations: a total weight between 7 and 9 tons is considered for the design of the
machine, so, in terms of reaction forces, it would be unwise to apply to the ground a
thrust higher than the vehicle weight. By combining the simplified cutting model
described in Eqs. (1)–(9) with the limited performance of the motors described in
Fig. 2, it is possible to obtain the results of Fig. 3: the maximum advance speed
$v_{ex}$ is calculated with respect to the motor performance limits, considering a
constant tool efficiency and a reamer diameter of about 430 mm, for different values of
SEE ranging from 5 to 50 kWh/m³.

Fig. 3. Continuous Potentiality of the Machine (comb. of motor features with drilling models)

The same analysis is also repeated considering different lengths L (L = 0 and
L = 300 m). From the results in Fig. 3, it is interesting to notice that, as the length
L of the drilled bore increases, the increased friction losses penalize high rotation
and advance speeds. The results are obtained considering a constant tool excavation
efficiency: this hypothesis is useful to evaluate the potential machine performance.
For a real tool, the proposed efficiency holds only for a limited subset of rotation and
advance speeds, so for a specific real tool this calculation has to be performed
considering its excavation efficiency function $\eta_{ex}(\omega, v_{ex})$. The flow of
lubricating water also plays an important role in limiting machine performance: in this
work a maximum flow of 200 l/min was assumed, with a maximum pressure from 40 to 80 bar
depending on the working point of the pump. The sizing of the electro-hydraulic
lubricating system has also been performed on the basis of these considerations.

4 Conclusion and Future Developments


A mechatronic approach, fusing the drilling models with the mechanical and electric
features of the considered actuators, has been successfully used to properly size an
electrified drilling machine and to investigate its performance capabilities.
Future investigation efforts should focus on the design, diagnostics and life
estimation of the energy storage systems, applying simulation and diagnostic techniques
that the authors have recently developed as a part of the European research project
OBELICS (www.obelics.eu).

References
1. Pugi, L., Delogu, M., Grasso, F., Berzi, L., Del Pero, F., Savi, R., Boni, E.: Electrification of
directional drilling machines for sustainable trenchless excavations. In: Proceedings - 2019
IEEE International Conference on Environment and Electrical Engineering and 2019 IEEE
Industrial and Commercial Power Systems Europe, EEEIC/I and CPS Europe 2019 (2019).
https://doi.org/10.1109/eeeic.2019.8783793
2. Locorotondo, E., Pugi, L., Berzi, L., Pierini, M., Pretto, A.: Online state of health estimation
of lithium-ion batteries based on improved ampere-count method. In: Proceedings - 2018
IEEE International Conference on Environment and Electrical Engineering and 2018 IEEE
Industrial and Commercial Power Systems Europe, EEEIC/I and CPS Europe 2018, art. no.
8493825 (2018). https://doi.org/10.1109/eeeic.2018.8493825
3. Locorotondo, E., Pugi, L., Berzi, L., Pierini, M., Lutzemberger, G.: Online identification of
thevenin equivalent circuit model parameters and estimation state of charge of lithium-ion
batteries. In: Proceedings - 2018 IEEE International Conference on Environment and Electrical
Engineering and 2018 IEEE Industrial and Commercial Power Systems Europe, EEEIC/I and
CPS Europe 2018, art. no. 8493924 (2018). https://doi.org/10.1109/eeeic.2018.8493924
4. Locorotondo, E., Scavuzzo, S., Pugi, L., Ferraris, A., Berzi, L., Airale, A., Pierini, M., Carello,
M.: Electrochemical impedance spectroscopy of li-ion battery on-board the electric vehicles
based on Fast nonparametric identification method. In: Proceedings - 2019 IEEE International
Conference on Environment and Electrical Engineering and 2019 IEEE Industrial and Com-
mercial Power Systems Europe, EEEIC/I and CPS Europe 2019, art. no. 8783625 (2019).
https://doi.org/10.1109/eeeic.2019.8783625
Analysis and Design of a Non-linear MPC
Algorithm for Vehicle Trajectory Tracking
and Obstacle Avoidance

Francesco Cosimi(B), Pierpaolo Dini(B), Sandro Giannetti(B), Matteo Petrelli(B),
and Sergio Saponara(B)

Università di Pisa, Dipartimento Ingegneria dell'Informazione, Via G. Caruso 16,
56122 Pisa, Italy
{f.cosimi1,s.giannetti2,m.petrelli1}@studenti.unipi.it,
pierpaolo.dini@phd.unipi.it, sergio.saponara@iet.unipi.it

Abstract. This paper presents a non-linear Model Predictive Control (MPC) algorithm
developed using the GRAMPC library (GRadient-based Augmented-lagrangian
framework for embedded non-linear MPC). Trajectory tracking and obstacle
avoidance capabilities for vehicles equipped with Advanced Driver Assistance Systems
are becoming more and more important. These functions improve comfort and
enhance safety for drivers and passengers. In this work, the vehicle has been modelled
using six states (XY coordinates of the mass centre, yaw angle, velocity,
reference yaw angle and Y errors) and two controls (front steering angle and
acceleration). In real applications, the desired trajectory and the constraints are
provided by the navigation system and sensors, while here a sinusoidal testbench has
been chosen. The GRAMPC library makes it possible to control many options and to
manage real-time features and accuracy. Due to the complexity of the non-linear MPC
algorithm, classic Electronic Control Units (ECUs) based on low-cost microcontroller
units (MCUs) often do not have sufficient computing capabilities to meet the accuracy
specifications. This explains the use of high-end MCUs or, if necessary,
HW-accelerated systems (MCU + FPGA), in order to guarantee performance and safety.

1 Introduction

To increase safety and comfort for drivers and passengers, modern industry is
integrating Adaptive Cruise Control and Autonomous Driving features into vehicles,
alongside ADAS (Advanced Driver Assistance Systems), as shown in Fig. 1.
The desired trajectory is provided by an advanced navigation system (maps, satellite
services, etc.) combined with a road and obstacle recognition architecture (cameras,
ultrasonic sensors, radars, lidars, etc.). Once the reference is generated, a real-time
trajectory tracking algorithm is needed. This work presents the development of a custom
non-linear Model Predictive Control algorithm using the GRAMPC library, which provides
good tracking features and respects real-time execution requirements.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 229–234, 2021.
https://doi.org/10.1007/978-3-030-66729-0_26

Fig. 1. Data from sensors are processed by specific algorithms. Planned trajectory reaches the
Actuator Control system (MPC in this work) to manage vehicle’s actuators.

1.1 MPC vs PID Controllers


Model Predictive Control (MPC) is an advanced process control method used to
track a given setpoint r(t) while respecting a set of constraints.
The objective of MPC is to minimize a cost functional, reducing the mismatch between
the reference r(t) and the state x(t) by acting on the plant inputs u(t), as
represented in Fig. 2.

Fig. 2. MPC high level description.

This algorithm is characterized by a finite time horizon (Prediction Horizon), divided
into a finite number of slots, which makes it possible to anticipate future events.
There are many differences between the PID and MPC control mechanisms.
Firstly, MPC has easily understandable control laws and makes it possible to satisfy
many constraints in a dynamic system, whereas PID calibration relies on empirical
relationships and constraints on multiple variables are difficult to manage.
Secondly, the example of a taxi driver helps to understand the advantages that a
prediction algorithm can provide in terms of safety and comfort.
MPC acts in a way similar to a careful driver. The algorithm estimates in advance the
trajectory that should be maintained over a defined horizon and tries to follow it by
acting on the available controls (steering and accelerator). The result is a sequence
of control actions that gives the passengers a safe and comfortable experience. In
control systems like PID, on the other hand, control is based on the error at previous
instants. In this case, the taxi driver should be imagined driving his vehicle with a
tinted windscreen, looking only at the rear-view mirror. The only thing he can do is
correct his trajectory just after he recognises the error, without knowing what happens
in front of him. It is obvious that any passenger will feel safer and more comfortable
in the first case, where the driver acts properly, and will avoid the second one. The
real difficulty of MPC is choosing an appropriate plant model for the system: the model
must match reality as closely as possible for the results to be correct. Furthermore,
the complexity of MPC requires attention to the computational resources needed. This
leads to considering high-performance architectures, with a suitable MCU or possibly
HW-accelerated platforms (MCU + FPGA).

2 GRAMPC Library and MPC Development Environment


GRAMPC is a non-linear MPC software framework based on a gradient method from optimal
control. It includes a real-time solution strategy for controlling non-linear systems
with real-time demands and constraints on the state and control variables. GRAMPC is
implemented in C and also provides interfaces to Matlab/Simulink, C++ and dSpace. A
specific problem formulation can be implemented and provided to GRAMPC using the C
function template probfct.c. Each main function has a related MEX routine, which allows
running GRAMPC and modifying options and parameters in Matlab without recompiling.
Library objects are initialized by defining the structure variable grampc, which
contains default options, parameters and additional information inherent to an MPC
step; the algorithm is then started and updated with a new initial state at each MPC
iteration. After GRAMPC has finished its computation, it provides the predicted new
state, the predicted controls and the corresponding cost value for the current sampling
time as outputs [4].
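
The receding-horizon loop described above (optimise over the horizon, apply the first control, restart from the new state) can be illustrated with a toy example. The sketch below does not use GRAMPC or reproduce its API: it is a plain projected-gradient solver on a scalar integrator, and the plant, cost weights, bounds and function names are all assumptions made for illustration.

```python
def predict(x0, controls, dt):
    """Roll the toy plant x[k+1] = x[k] + dt * u[k] over the horizon."""
    states, x = [], x0
    for u in controls:
        x = x + dt * u
        states.append(x)
    return states


def cost_gradient(x0, controls, ref, dt, rho):
    """Gradient of J = sum((x_i - ref)^2) + rho * sum(u_j^2) w.r.t. each u_j."""
    states = predict(x0, controls, dt)
    grad = [0.0] * len(controls)
    tail = 0.0
    for j in range(len(controls) - 1, -1, -1):
        # u_j shifts every later predicted state by dt.
        tail += 2.0 * (states[j] - ref) * dt
        grad[j] = tail + 2.0 * rho * controls[j]
    return grad


def solve_horizon(x0, controls, ref, dt, rho, u_max, step=0.8, iterations=50):
    """Projected gradient descent: gradient step, then clip to |u| <= u_max."""
    for _ in range(iterations):
        grad = cost_gradient(x0, controls, ref, dt, rho)
        controls = [max(-u_max, min(u_max, u - step * g))
                    for u, g in zip(controls, grad)]
    return controls


def run_mpc(x0=0.0, ref=1.0, dt=0.1, horizon=10, rho=0.05, u_max=1.0, steps=100):
    """Receding-horizon loop: optimise, apply the first control, shift, repeat."""
    x, controls = x0, [0.0] * horizon
    states, applied = [x0], []
    for _ in range(steps):
        controls = solve_horizon(x, controls, ref, dt, rho, u_max)
        u = controls[0]
        x = x + dt * u                            # only the first control is applied
        controls = controls[1:] + [controls[-1]]  # warm start for the next step
        states.append(x)
        applied.append(u)
    return states, applied


states, applied = run_mpc()
print(f"final state: {states[-1]:.4f}, largest control: {max(applied):.2f}")
```

The number of gradient iterations per step and the horizon length are the knobs that trade accuracy against computation time, which is the same trade-off the text discusses for the real controller.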

3 Vehicle’s Model
In this paper the vehicle has been modelled using an augmented kinematic model, under
the following assumptions:

• The slip angles of the wheels are equal to zero
• The motion is planar
• Only the front wheel is steering

The position and attitude of the vehicle's mass centre are described through three
state variables: x position, y position and yaw angle Ψ. The (x, y) coordinates refer
to the position of the centre of gravity of the vehicle, while Ψ describes how the
vehicle is oriented with respect to the horizontal axis, as schematically represented
in Fig. 3. The vehicle speed is indicated by v, and the last two states represent the
tracking error. The angle β is the lateral (side) slip angle of the vehicle, between
the velocity and the vehicle longitudinal axis [5, 6]. The distances of the rear and
front wheel axles from the vehicle centre of gravity are called $l_r$ and $l_f$,
respectively. The vehicle inputs are the front steering angle $\delta_f$ and the
acceleration a.
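
The model equations themselves appear only in Fig. 3, so the sketch below uses the standard kinematic bicycle model referred to the centre of gravity, which matches the variables listed above (x, y, Ψ, v, β, $l_f$, $l_r$, $\delta_f$, a). This is an assumption about the form of the model, and the two tracking-error states of the augmented model are left out because their equations are not given in the text.

```python
import math


def slip_angle(delta_f, l_f, l_r):
    """Side slip angle beta at the centre of gravity for a front steering angle delta_f."""
    return math.atan(l_r / (l_f + l_r) * math.tan(delta_f))


def bicycle_derivatives(state, controls, l_f, l_r):
    """Time derivatives of (x, y, psi, v) for the inputs (delta_f, a)."""
    x, y, psi, v = state
    delta_f, a = controls
    beta = slip_angle(delta_f, l_f, l_r)
    return (v * math.cos(psi + beta),
            v * math.sin(psi + beta),
            v / l_r * math.sin(beta),
            a)


def euler_step(state, controls, dt, l_f, l_r):
    """One forward-Euler integration step of the kinematic model."""
    derivs = bicycle_derivatives(state, controls, l_f, l_r)
    return tuple(s + dt * d for s, d in zip(state, derivs))


# Hypothetical geometry (l_f = 1.2 m, l_r = 1.6 m) and a gentle left turn at 10 m/s.
state = (0.0, 0.0, 0.0, 10.0)
for _ in range(10):
    state = euler_step(state, (math.radians(5.0), 0.0), 0.1, 1.2, 1.6)
print("state after 1 s:", tuple(round(s, 3) for s in state))
```

A forward-Euler step is used only to keep the example short; a predictive controller would normally integrate the same derivatives with the scheme selected in its solver options.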

Fig. 3. System function and augmented kinematic model adopted for the vehicle.

3.1 Constraints Evaluation


Inequality constraints define a prohibited region for the states. In this work they are
used to define road limits, speed limits and the presence of obstacles (Fig. 4). Road
boundaries and obstacles that may interfere with the motion of the vehicle are detected
and processed by a dedicated algorithm, which provides only the final results to the MPC.

Fig. 4. The obstacle is represented by a red circle (physical obstacle) and by a wider
black circle that adds a tolerance associated with the real dimensions of the vehicle.
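
One possible way to express such constraints is to write each of them as a function h that must satisfy h ≤ 0, a convention common in optimal control. The sketch below is illustrative only: the circular obstacle with an added margin follows the description of Fig. 4, while the function names, the state layout (x, y, Ψ, v) and the numerical values are assumptions.

```python
def obstacle_constraint(x, y, obstacle, margin):
    """h <= 0 when (x, y) lies outside the circle of radius (obstacle radius + margin)."""
    xo, yo, radius = obstacle
    return (radius + margin) ** 2 - ((x - xo) ** 2 + (y - yo) ** 2)


def road_constraints(y, y_min, y_max):
    """Two inequalities keeping the lateral position inside the road limits."""
    return [y_min - y, y - y_max]


def speed_constraint(v, v_max):
    """h <= 0 when the speed limit is respected."""
    return v - v_max


def inequality_constraints(state, obstacles, margin, y_min, y_max, v_max):
    """Stack all inequality constraints for a state (x, y, psi, v)."""
    x, y, _psi, v = state
    h = road_constraints(y, y_min, y_max)
    h.append(speed_constraint(v, v_max))
    h.extend(obstacle_constraint(x, y, obs, margin) for obs in obstacles)
    return h


def is_feasible(h_values):
    """A state is admissible when every constraint value is non-positive."""
    return all(value <= 0.0 for value in h_values)


# Hypothetical scene: one obstacle of radius 1 m at (5, 0), 1 m margin, road |y| <= 2 m.
far = inequality_constraints((0.0, 0.0, 0.0, 10.0), [(5.0, 0.0, 1.0)], 1.0, -2.0, 2.0, 15.0)
near = inequality_constraints((4.0, 0.0, 0.0, 10.0), [(5.0, 0.0, 1.0)], 1.0, -2.0, 2.0, 15.0)
print(is_feasible(far), is_feasible(near))
```

The margin plays the role of the wider circle in Fig. 4: enlarging it makes the planned overtaking more conservative without changing the obstacle description itself.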

4 Vehicle Trajectory Tracking and Obstacle Avoidance Results


To better understand the capabilities of the algorithm and analyse the results in
different plausible conditions, two different test benches have been employed. Firstly,
a starting state X0 and a static reference XDES as destination have been provided, to
evaluate the importance of cost parameters and constraints. Then the algorithm has been
updated to the one presented in this paper, introducing the tracking of a specific
trajectory.
Finally, obstacles have been introduced. The physical limits of the problem and the
constraints are described in functions defined by the library. This makes it possible
to calibrate the options regarding obstacle overtaking and to analyse how the algorithm
may be influenced.

4.1 Testbench Results


This testbench is based on the introduction of an imposed trajectory to be followed by
the vehicle, see example in Fig. 5.

Fig. 5. MPC provides good results for trajectory tracking and obstacle overtaking. If
these features prove insufficient, they may be improved by acting on the library options.

Fig. 6. Control values and computation time are reported for a simulation of 10 s on an
Intel i7 processor.

In this case it may be interesting to perform an analysis in terms of computational
time and performance. The introduction of different options and the variation of the
characteristic MPC parameters (e.g. Prediction Horizon, control steps…) deeply
influence the timing features. The main aim of the developer is to combine all the
available choices to get the best trade-off between real-time behaviour and quality of
results (Fig. 6).

5 Conclusions and Future Works


This work analyses an MPC algorithm for the dynamics of a vehicle, implemented with
the GRAMPC library. The proposed simulations illustrate the behaviour of a vehicle in a
possible environment. The results achieved show that the implemented algorithm provides
a fair description of a path-following vehicle; this can be improved by adding or
modifying some simulation options, at the cost of an increased computational time.
Possible future implementations can provide system redundancy, with the parallel
execution of several algorithms and a comparison of their outputs, in order to ensure
proper operation and good handling of external disturbances or errors.
The complexity of the algorithm is the focus of future work: porting the entire
project to embedded systems while ensuring compliance with real-time constraints.
As an initial stage, the porting will target a Raspberry Pi 3 Model B. The idea is
then to move from a single-MCU implementation to an automotive-compliant platform that
includes MCU and FPGA. The platform chosen for the development is the
Zynq UltraScale+ MPSoC ZCU104.

References
1. Ang, K.H., Chong, G., Li, Y.: PID control system analysis, design, and technology. IEEE
Trans. Control Syst. Technol. 13(4), 559–576 (2005)
2. Grüne L., Pannek J.: Nonlinear model predictive control. In: Nonlinear Model Predictive
Control. Communications and Control Engineering. Springer (2017)
3. Lindberg, Y.: A Comparison Between MPC and PID Controllers for Education and Steam
Reformers, Department of Signals and Systems. Chalmers University of Technology Goteborg,
Sweden (2014)
4. https://sourceforge.net/projects/grampc/
5. van Essen, H.A., Nijmeijer, H.: Non-linear Model Predictive Control for Constrained Mobile
Robots, Department of Mechanical Engineering. Dynamics and Control Group Eindhoven
University of Technology, (2001)
6. Bonanno, F.: Autonomous Driving: Model Predictive Control for overtaking maneuvers.
Politecnico di Torino (2019)
7. Salem, F., Mosaad, M.I.: A comparison between MPC and optimal PID controllers: Case
studies (2015)
8. Marzaki, M.H.: Comparative study of Model Predictive Controller (MPC) and PID Controller
on regulation temperature for SSISD plant (2014)
Impact of Combined Roto-Linear Drives
on the Design of Packaging Systems: Some
Applications

Marco Ducci, Alessandro Peruzzi, and Luca Pugi(B)

Dipartimento di Ingegneria Industriale, Università degli Studi di Firenze, Florence, Italy


luca.pugi@unifi.it

Abstract. The term roto-linear actuator describes a solution able to control two
degrees of freedom at the same time, a linear motion along an axis and a rotation
around the same axis, thus realizing a controlled helical motion with variable pitch.
This kind of actuation system, widely used in robotics, is also spreading in
applications associated with relatively small production series, such as packaging
assembly machines. In this work some common industrial applications developed by the
authors are introduced, to show how the increased accessibility of this mature
technology can improve the design and performance of existing machines, even at very
small production scales.

1 Introduction
Roto-linear units are actuators able to control two degrees of freedom at the same
time: a linear motion along an axis and the rotation around the same axis. The
availability of this kind of actuator can be exploited to simplify machine design, also
contributing to performance and reliability improvements. For this reason, this kind of
actuation system has been widely proposed and used in robotics, especially for a more
rational and efficient design of manipulators such as the SCARA one [1].
The most frequently adopted solutions are based on the following technologies, as
visible in Fig. 1/a/b:

1. Ball-Screw-Spline transmission with position-controlled motors: a ball screw
transmission driven by two position-controlled motors combines the action of two
precise rotary actuators in a single roto-linear one [2].
2. Roto-Linear Direct Drive: two direct drive motors (typically PM ones or voice coil)
are integrated in a single unit [3].

Progress in electric drive technology, together with its continuous improvement in
terms of costs and availability of standard, modular solutions, is leading to a growing
diffusion of this kind of unit in a wide population of machines associated with very
small production scales. In particular, in this work the authors cite two simple linear
applications, both referring to real engineering activities performed by the UNIFI
mechatronics labs as a part of their cooperation with small and medium enterprises
operating in Tuscany:

1. A cartesian assembly cell for press-fit assembly of power electronic components.
2. A machine for the assembly of bottles and dispensers for cosmetic or pharmaceutical
products.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 235–240, 2021.
https://doi.org/10.1007/978-3-030-66729-0_27

Fig. 1. a/b: Example of roto-linear actuators, a) ball screw spline transmission controlled by
rotary motors, b) direct drive solution

As good engineering practice teaches, there is no universal solution, only specific
solutions for different applications. For this reason, two application examples are
presented, on which engineers of the University of Florence have recently worked for
local automation providers.

2 Benchmark Test Case N.1: Cartesian, Press Fit Assembly Cell


The investigated application is a cartesian assembly cell used for the press-fit of
power electronic components. The machine, visible in Fig. 2/a/b, has two linear drives
that control the longitudinal and transversal position, and an additional head that
properly orients the component and provides the vertical clamping force needed to press
the component into its housing.

Fig. 2. a/b: Example of roto-linear actuators, a) ball screw spline transmission controlled by
rotary motors, b) direct drive solution

Fig. 3. a/b: a) innovative machine design with ball screw spline, b) assembled prototype

In the older conventional version visible in Fig. 2/a/b, the machine suffered from
insufficient stiffness with respect to the desired clamping forces in the z direction,
and consequently from poor repeatability. Since stiffness with respect to high static
insertion forces is the main specification for the proposed application, the best
solution for head actuation is the new head with integrated roto-linear actuation,
performed with a ball-screw spline system and visible in Fig. 3/a/b. This head has been
successfully prototyped, producing a considerable improvement in machine productivity:
an increase between 20 and 30% in both axis speed and acceleration was experimentally
measured on the real machine. The equivalent stiffness of the machine with respect to
vertical clamping forces was also increased by about 15 times, with no increase in cost,
weight or overall dimensions. The increased stiffness of the machine was the key to the
recorded performance improvements.

3 Benchmark Test Case N.2: Application of Direct Drive
Solutions to the Assembly of Plastic Assemblies for Cosmetic
and Pharmaceutic Industry
As the example of Fig. 4 shows, the assembly of the common low-cost dispensers
adopted for cosmetic and pharmaceutical products is quite a complex manipulation
task. The assembly of a toothpaste dispenser involves the manipulation of 5 parts,
and the motions associated with the assembly of each component are relatively
complex, so the assembly of a dispenser requires the sequential control of 15
motions, which typically correspond to the same number of independently controlled
axes. These applications are also quite demanding in terms of feedback, considering
that the adopted polymeric materials (for example HDPE, LDPE, hPP, cPP, racoPP,
PST) are flexo-viscous materials with mechanical properties and tolerances that
depend on several design and production factors. As a consequence, not only are the
speeds and positions of the actuators monitored, but current (for electric units) or
pressure (for fluidic units) feedback typically has to be provided in order to
indirectly estimate assembly forces and torques. Finally, the low profit margin and
the mass production rates involved, which exceed 100 pieces per minute (that is,
close to two dispensers completely assembled every second), demand extreme rapidity
and reliability of the performed operations.

Fig. 4. Example of complex plastic assembly for cosmetic and pharmaceutical industry: a
toothpaste dispenser

For this kind of application, direct drive actuation is an ideal solution, because it
allows the actuator stiffness to be better calibrated with respect to the modest and
uncertain properties of the manipulated objects. The absence of complex or
non-reversible transmission systems also ensures higher indirect sensing and control
performance in terms of exerted forces and torques. The authors had the opportunity
to appreciate this feature in past experience concerning the design of active
suspensions [4, 5] and smart actuators [6], so they decided to improve the response
of the assembly unit that proved to be the most limiting in terms of machine
performance: the unit that assembles the body of the toothpaste dispenser.
Fig. 5/a/b shows the original pneumatic actuation unit and the new, completely
electrified system with the new direct drive roto-linear actuator, which is used to
screw the component.
The new system offered a considerable improvement in performance: the old unit was
able to manage no more than 40 pieces/minute, while the speed of the new one is
30–40% higher (around 50–60 pieces/minute). Thanks to the direct drive solution, it
is also possible to estimate applied forces and torques through actuator current
sensing.
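
The indirect estimation of forces and torques from actuator currents can be sketched as follows. This is only a minimal illustration under the assumption that, without a transmission, force and torque are roughly proportional to the respective motor currents; the class name, the constants and the threshold are hypothetical and are not taken from the machine described here.

```python
from dataclasses import dataclass


@dataclass
class DirectDriveAxis:
    """Hypothetical roto-linear direct-drive axis (constants are placeholders)."""
    force_constant_n_per_a: float    # assumed linear force constant [N/A]
    torque_constant_nm_per_a: float  # assumed rotary torque constant [N*m/A]

    def estimate_force(self, linear_current_a: float) -> float:
        # Assumption: friction and cogging are neglected, so F = k_f * I.
        return self.force_constant_n_per_a * linear_current_a

    def estimate_torque(self, rotary_current_a: float) -> float:
        # Assumption: T = k_t * I for the rotary stage.
        return self.torque_constant_nm_per_a * rotary_current_a


def screwing_complete(axis: DirectDriveAxis,
                      rotary_current_a: float,
                      torque_threshold_nm: float) -> bool:
    """Report whether the estimated torque has reached a chosen threshold."""
    return axis.estimate_torque(rotary_current_a) >= torque_threshold_nm
```

In practice such constants would have to be identified on the real actuator, and the estimate would be affected by friction and by the dynamics of the moving masses.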

Fig. 5. a/b: a) original pneumatic unit used for the assembly of the body dispenser, b) Completely
Electrified solution with rotolinear direct-drive actuation

Fig. 6. Example of performed measurements on z vertical axis



This accurate sensing also provides continuous monitoring of the managed operations,
as shown in Fig. 6. As far as energy consumption is concerned, the energy expended
for each assembled piece is around one twentieth (1/20) of the original value, thanks
to the superior efficiency of electric drives with respect to pneumatic ones, to mass
reduction (the machine weight is less than half) and, finally, to the substitution of
dissipative elements such as dampers with holding/parking brakes, which provide only
reaction torques without introducing further dissipation. Finally, in terms of
acoustic emissions, the electric machine is nearly noiseless, while operators of the
pneumatic one have to be protected.

4 Conclusions and Future Developments


In this brief application paper, the authors have tried to show how innovation in
multi-axial combined actuation stages, originally developed for high-end applications,
is leading to improvements in the design of common machines produced on a relatively
small production scale. Integrating these new technologies into the design process
involves redesigning the shape of the machine, which, as the proposed examples show,
must differ from the original solution to fully exploit the advantages offered by the
new actuation technologies. Beyond the brief examples shown in this work, the authors
are more generally exploring the possibilities of redesigning existing machines to
exploit new actuation technologies offered by the market.

Acknowledgments. The authors wish to thank all the small and medium firms working in
the automation sector in Tuscany that, through cooperation with the university, offer
continuous support for student and experience exchange, which brings good
opportunities for growth to both industry and engineering students.

References
1. Mousavi, A., Akbarzadeh, A., Shariatee, M., Alimardani, S.: Design and construction of a
linear-rotary joint for robotics applications. In: 2015 3rd RSI International Conference on
Robotics and Mechatronics (ICROM), Tehran, pp. 229–233 (2015). https://doi.org/10.1109/
icrom.2015.7367789
2. Hiwin Catalogue. Available online
3. Meessen, K.J., Paulides, J.H., Lomonova, E.A.: Electromagnetic Fields and Interactions
in 3D Cylindrical Structures: Modeling and Application. Eindhoven University of
Technology Library, September 2012. https://doi.org/10.6100/ir735355.
ISBN: 978-90-386-3210-0
4. Allotta, B., Pugi, L., Bartolini, F.A.: An active suspension system for railway pantographs: The
T2006 prototype. Proceedings of the Institution of Mechanical Engineers Part F: J. Rail Rapid
Transit 223(1), 15–29 (2009). https://doi.org/10.1243/09544097jrrt174
5. Allotta, B., Pugi, L., Colla, V., Bartolini, F., Cangioli, F.: Design and optimization of a semi-
active suspension system for railway applications. J. Modern Transport. 19(4), 223–232 (2011).
https://doi.org/10.3969/j.issn.2095-087x.2011.04.002
6. Pugi, L., Galardi, E., Pallini, G., Paolucci, L., Lucchesi, N.D.: Design and testing of a pulley
and cable actuator for large ball valves. Proceedings of the Institution of Mechanical Engineers
Part I: Journal of Systems and Control Engineering 230(7), 622–639 (2016). https://doi.org/
10.1177/0959651816642093
Preliminary Study of a Novel Lithium-Ion
Low-Cost Battery Maintenance System

Andrea Carloni(B), Federico Baronti(B), Roberto Di Rienzo(B),
Roberto Roncella(B), and Roberto Saletti(B)

Dipartimento di Ingegneria dell’Informazione (DII), University of Pisa,
via Girolamo Caruso 16, 56122 Pisa, Italy
{andrea.carloni,federico.baronti,roberto.dirienzo,roberto.roncella,
roberto.saletti}@unipi.it

Abstract. The expected increase in the diffusion of Light Electric Vehicles
will lead to many issues, such as the lack of widely distributed assistance
centers, which need to be addressed in advance. Nowadays, the battery packs of
Light Electric Vehicles are not easily accessible. Hence, this paper
provides a new battery packaging concept, which allows easier access to
the single cells, and proposes an innovative low-cost maintenance system
architecture. The new system can diagnose the battery independently of
the BMS, allowing the workshop to provide maintenance and assistance
services for the faulty device.

Keywords: Battery assistance · Low-cost maintenance system ·
Single-cell tester · Switch matrix

1 Introduction

The electrification of vehicles is now becoming reality. Li-ion batteries
are the enablers of this change, thanks to their higher energy and power density
compared to lead-acid and nickel-cadmium energy storage systems [1].
The market of small portable devices has been completely redefined, and
sustainable urban mobility has also grown considerably in the last few years [2].
According to the forecast study in [3], the number of Light Electric Vehicles
(LEVs), such as e-scooters, e-bikes and e-motorcycles, will increase considerably.
Thus, the increase in energy demand [4], the lack of charging infrastructure [5]
and the scarcity of assistance centers dedicated to e-mobility [6]
need to be addressed. Nowadays, LEV assistance and maintenance services
are requested directly on the producer's web page or through call centers, by
sending the faulty object to the manufacturer or its partner workshops [7].
Assistance center networks that are independent of the e-mobility device producers
and spread around cities and countries would be a strong means to support and
accelerate the mobility electrification process. Generally, battery characterization
instruments can simultaneously test all the cells of a pack, but the instrument
cost is high and grows with the number of battery cells. In order to reduce

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 241–245, 2021.
https://doi.org/10.1007/978-3-030-66729-0_28

the overall costs, a single Li-ion cell tester based on low-cost commercial lab-
oratory instruments and controlled by a personal computer is proposed in [8].
A Python open-source control-software and a Raspberry Pi replaced the use of
a personal computer reducing the instrument cost in [6]. However, the battery
pack usually is a closed system, and the difficult access to a single cell limits
these low-cost tester implementations. Furthermore, they can characterize only
one cell at a time, so the diagnosis process takes much longer than with
professional tools. Thus, the idea of a new connector is proposed to simplify
the access to each cell by the assistance operator. In addition, a switch matrix
board is adopted to select a specific cell in the battery pack and to improve the
battery test characterization time. This concept can further reduce the mainte-
nance service cost, as the low-cost single-cell tester is able to diagnose the State
of Health (SoH), the State of Charge (SoC) and the equalization status of each
cell in the battery pack. The rest of the paper is organized as follows. Section 2
deals with the issues related to the access to a commercial LEV battery pack and
suggests the adoption of a standard connector. Section 3 shows the preliminary
main architecture blocks of the low-cost maintenance system proposed in this
study. Section 4 describes in more detail the switch matrix architecture, for a
four-cell battery example. Finally, the conclusion is drawn in Sect. 5.

2 The Battery Pack Accessibility Issue

The BMS is the electronic system that manages the battery to avoid unsafe and
hazardous situations. A complex BMS may significantly contribute to the cost
of the battery, which can reach up to 45 % of the LEV cost [9]. For this reason,
the battery manufacturers tend to reduce as much as possible its complexity, by
removing or limiting some advantageous functionalities, such as the balancing
system and the internal battery state estimation. Furthermore, the BMS is a
closed system, so a producer-independent assistance center cannot access the
internal data to identify and fix the battery faults.
Thus, our idea is to equip the battery with a connector able to reach the cell
terminals. If the battery has N cells, the connector will have N + 1 pins. In this
way, the switch matrix can select every single cell, and the maintenance system
can acquire its voltage also determining the SoC. Moreover, the single-cell tester
may charge or discharge the cell modifying its state. This functionality simplifies
the maintenance process, because the access to the cells is independent of the
proprietary BMS. For example, the battery could present a deeply unbalanced
charge condition that may not be addressed by the BMS. Since the single cell
terminals are accessible, the system may charge the cells with the lowest SoC
and discharge those with the highest one, restoring the balanced condition of
the battery pack. Indeed, an unbalanced battery pack reduces the maximum
available energy, decreasing the LEV's driving range [10]. Besides, the system can
estimate the SoH of every cell by measuring their capacity and internal resistance
[11]. For instance, the internal resistance can be extracted by implementing the
classic Pulsed Current Test (PCT) [12] on each cell, to find or even predict
possible faults and to indicate preventive actions to avoid them.
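
The two per-cell operations described above, rebalancing and resistance extraction, can be sketched as follows. This is only an illustrative sketch: the function names, the mean-SoC target and the tolerance are assumptions, and the resistance is taken as the simple ratio between the voltage step and the current step of a pulse, which is one common way to read a pulsed current test and not necessarily the procedure of [12].

```python
def internal_resistance(v_before: float, v_during: float,
                        pulse_current_a: float) -> float:
    """Estimate the cell resistance in ohm from one current pulse: R = |dV| / |dI|."""
    if pulse_current_a == 0:
        raise ValueError("the pulse current must be non-zero")
    return abs(v_before - v_during) / abs(pulse_current_a)


def balancing_plan(soc_by_cell: dict, tolerance: float = 0.02) -> dict:
    """Decide, cell by cell, whether to charge, discharge or hold.

    Assumption: the target is the mean SoC of the pack and `tolerance`
    is an arbitrary dead band (both are illustrative choices).
    """
    target = sum(soc_by_cell.values()) / len(soc_by_cell)
    plan = {}
    for cell, soc in soc_by_cell.items():
        if soc < target - tolerance:
            plan[cell] = "charge"
        elif soc > target + tolerance:
            plan[cell] = "discharge"
        else:
            plan[cell] = "hold"
    return plan
```

With such a plan, the single-cell tester would act on one selected cell at a time through the connector, independently of the proprietary BMS.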

3 The New Concept of Low-Cost Maintenance System

Figure 1 a) shows the general architecture of the low-cost maintenance system
proposed for an N-cell battery. The battery standard connector and the switch
matrix represent the first block, and the single-cell tester and the battery load or
supply compose the second group. The standard connector simplifies the access
to each cell, while the switch matrix selects the item in the battery pack to
be diagnosed. The switch matrix in Fig. 1 a) has N differential inputs and two
differential outputs, as will be explained in the next section. The differential
inputs are the positive and negative terminals of the cells. The two differential
outputs are the battery and the selected cell CX terminals. The possibility to
directly access the battery terminals allows the user to charge and discharge
the battery, speeding up the diagnostic procedure, by bringing the battery in
the desired state. For instance, the equalization process usually starts when the
battery is fully charged. The system can discharge the entire pack until the
cell with the lowest capacity reaches the full discharge-state. At this point, the
single-cell tester can individually discharge the other elements, measuring the
actual capacity of every cell. Finally, a chip monitor integrated circuit is added
as system control block. Indeed, it measures the current and the voltage of every
cell in the pack, selects the cell to be characterized, activates the matrix to select
the proper cell and communicates with the single-cell tester.
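
The capacity measurement procedure described above can be sketched as follows. This is a hedged illustration that assumes the pack starts fully charged, that all series-connected cells carry the same current during the pack discharge, and that the current is sampled at a fixed period; the function names and the data layout are hypothetical.

```python
def integrate_current_ah(samples_a: list, dt_s: float) -> float:
    """Coulomb counting: integrate equally spaced current samples into A h."""
    return sum(samples_a) * dt_s / 3600.0


def cell_capacities_ah(pack_discharge_ah: float,
                       extra_discharge_ah: dict) -> dict:
    """Combine the two discharge phases into a per-cell capacity.

    pack_discharge_ah: charge removed from every cell while the whole pack
        was discharged, until the weakest cell was empty.
    extra_discharge_ah: charge removed afterwards from each cell by the
        single-cell tester (0 for the weakest cell).
    """
    return {cell: pack_discharge_ah + extra
            for cell, extra in extra_discharge_ah.items()}
```

The first phase is fast because it acts on the whole pack, so only the residual charge of each cell has to be removed one cell at a time.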

Fig. 1. General architectural blocks of the low-cost maintenance system in a). An
instance of switch matrix for a 4-cell battery in b)
4 The Switch Matrix


The switch matrix consists of switches organized on Y cascaded layers. A selector
sets the switch state for each layer. The total number of switches S and layers Y
are given by (1) and (2), respectively. The complexity of the matrix significantly
grows with the number of cells.

$$Y = \log_2(N) \tag{1}$$

$$S = 2^{Y+1} - 2 \tag{2}$$
Nevertheless, LEV battery packs are usually composed of no more than a few tens of
cells, because the battery voltage is always lower than 60 V [13]. Thus, the number
of switches and layers does not grow too much, and the proposed system can be
appealing in this kind of application. The choice of a matrix of mechanical switches
instead of a MOS matrix prevents a cell short-circuit due to a programming mistake,
since a mechanical switch will never contact the two cell terminals simultaneously.
The matrix architecture for a 4-cell battery is reported as an example in Fig. 1 b).
Here, the switch matrix is composed of six switches divided into two layers. The
selectors X0 and X1 control the switch states of the first and the second layer,
respectively. The states of the two selectors determine the cell terminals connected
to the output pins U+ and U−, according to the truth table shown in Table 1. For the
sake of clarity, the identifiers 0 and 1 show the switch position when the selector
associated with them assumes one of the values indicated in Fig. 1 b).

Table 1. Truth-table of the switch matrix for a 4-cell battery

X1   X0   U−    U+    CX
0    0    C1−   C1+   C1
0    1    C2−   C2+   C2
1    0    C3−   C3+   C3
1    1    C4−   C4+   C4
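
Equations (1) and (2) and the truth table of Table 1 can be checked with a short sketch. The function names are hypothetical, and the restriction to a number of cells that is a power of two is an assumption made here so that Y in (1) is an integer.

```python
def matrix_size(n_cells: int) -> tuple:
    """Return (layers Y, switches S) from Eqs. (1) and (2).

    Assumption: n_cells is a power of two, so that Y = log2(N) is an integer.
    """
    if n_cells < 2 or n_cells & (n_cells - 1):
        raise ValueError("n_cells must be a power of two greater than 1")
    layers = n_cells.bit_length() - 1   # exact log2 for powers of two
    switches = 2 ** (layers + 1) - 2    # Eq. (2)
    return layers, switches


def selected_cell(x1: int, x0: int) -> dict:
    """Map the selector states of the 4-cell matrix to the outputs of Table 1."""
    index = 2 * x1 + x0 + 1             # X1 is the most significant selector
    return {"CX": f"C{index}", "U-": f"C{index}-", "U+": f"C{index}+"}
```

For the 4-cell example, matrix_size(4) gives two layers and six switches, in agreement with the matrix of Fig. 1 b).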

5 Conclusion
This work proposes a novel battery assembly concept, which requires the application
of a standard connector on each pack at assembly time, and a new
architecture for the battery maintenance system. The system allows the use of a
low-cost single-cell tester to investigate an entire battery pack, speeding up the
diagnostic procedure. The scheme proposed is independent of the proprietary
BMS of the LEV device. The use of low-cost and open-source instrumentation
can promote the diffusion of assistance centers for LEV devices to be spread in
the cities. Finally, this preliminary work only presents the new concept of battery
maintenance. However, a switch matrix for a 16-cell battery has
been designed and will soon be implemented to demonstrate the functionality
and the advantages of this system. The design details and the experimental data
will be presented in future work.

References
1. Tarascon, J.-M., Armand, M.: Issues and challenges facing rechargeable lithium
batteries. Nature 414(6861), 359–367 (2001)
2. Brenna, M., Foiadelli, F., Longo, M., Zaninelli, D.: e-mobility forecast for the
transnational e-corridor planning. IEEE Trans. Intell. Transport. Syst. 17(3), 680–
689 (2016)
3. Carolina, S., Stefanie, B., Simon, M., Raphael, M.: Assessing the market of light
electric vehicles as a potential application for electric in-wheel drives. In: 2016 6th
International Electric Drives Production Conference (EDPC). IEEE, November
2016
4. Brdulak, A., Chaberek, G., Jagodziński, J.: Determination of electricity demand
by personal light electric vehicles (PLEVs): an example of e-motor scooters in the
context of large city management in Poland. Energies 13(1), 194 (2020)
5. Petrauskiene, K., Dvarioniene, J., Kaveckis, G., Kliaugaite, D., Chenadec, J., Hehn,
L., Pérez, B., Bordi, C., Scavino, G., Vignoli, A., Erman, M.: Situation analysis of
policies for electric mobility development: experience from five European regions.
Sustainability 12(7), 2935 (2020)
6. Carloni, A., Baronti, F., Di Rienzo, R., Roncella, R., Saletti, R.: Open and flexible
Li-ion battery tester based on Python language and Raspberry Pi. Electronics 7(12),
454 (2018)
7. Segway-Ninebot After-sale Assistance. https://uk-en.segway.com/after-sales-
service
8. Vergori, E., Mocera, F., Somà, A.: Battery modelling and simulation using a pro-
grammable testing equipment. Computers 7(2), 20 (2018)
9. Babin, A., Rizoug, N., Mesbahi, T., Boscher, D., Hamdoun, Z., Larouci, C.: Total
cost of ownership improvement of commercial electric vehicles using battery sizing
and intelligent charge method. IEEE Trans. Ind. Appl. 54(2), 1691–1700 (2018)
10. Zhong, L., Zhang, C., He, Y., Chen, Z.: A method for the estimation of the battery
pack state of charge based on in-pack cells uniformity analysis. Appl. Energy 113,
558–564 (2014)
11. Manoj, M., Stefan, J., Mahir, R., Frank, L., Michael, F.: Comparative analysis
of lithium-ion battery resistance estimation techniques for battery management
systems. Energies 11(6), 1490 (2018)
12. Zine, E.D., Ali, M., Mustapha, B., Moussaab, B., Karim, K., Mohamed, S.A.C.: A
proposed pulses current method to extract the batteries parameters. In: 2017 6th
International Conference on Systems and Control (ICSC). IEEE, May 2017
13. Tanel, J., Janis, Z.: Experimental verification of light electric vehicle charger mul-
tiport topology. In: 2015 9th International Conference on Compatibility and Power
Electronics (CPE). IEEE, June 2015
Low Cost and Flexible Battery
Framework for Micro-grid Applications

Roberto Di Rienzo(B), Federico Baronti, Daniele Bellucci, Andrea Carloni,
Roberto Roncella, Marco Zeni, and Roberto Saletti

Dipartimento di Ingegneria dell’Informazione, University of Pisa,
via Girolamo Caruso 16, 56122 Pisa, Italy
{roberto.dirienzo,federico.baronti,
roberto.roncella,roberto.saletti}@unipi.it

Abstract. A low-cost flexible battery framework for micro-grid applications
is presented in this paper. The battery structure is based on a
hierarchical and modular approach to easily adapt the battery design
to different voltage and capacity requirements. This feature makes the
battery configurable and usable in different kinds of micro-grids, reducing
design complexity and cost. The framework presented here is used to
design a battery that satisfies the needs of a micro-grid equipped with a
micro electric vehicle charging station. The battery implementation was
verified with a preliminary test campaign that confirms the successful
usability of the battery in this environment.

Keywords: Lithium-ion battery · Battery Management System ·
BMS · Micro-grid · Smart-grid · Flexible battery

1 Introduction
Nowadays, the number of Electric Vehicles (EVs) is rapidly growing. Mobility
problems in city centers are alleviated not only by electric cars but also by micro
EVs, such as e-bikes, e-scooters, hover-boards and micro-cars, which are used more
often than before [1]. Unfortunately, the growth in the number of EVs will have a
negative impact on the electric power system, especially if they are recharged
in an uncontrolled way [2]. Many researchers proposed algorithms to schedule
the charge of the EVs using several different approaches and optimization pro-
cedures [3,4]. Another approach is the smart-grid concept, in which the power
distribution network is equipped with smart nodes. The smart node manages the
energy exchange between the network and the EVs charging stations. Further-
more, the nodes can be equipped with energy storage systems (usually batteries)
and renewable sources such as PV and wind generators [5]. The battery assumes
a crucial role in this environment, because it stores the energy generated by
the renewable sources during the micro-grid idle-state and supplies the power
requested during the EVs charging phases. In this way, the micro-grid reduces
the power peaks and smooths the energy request to the network.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 246–251, 2021.
https://doi.org/10.1007/978-3-030-66729-0_29

Lead-acid batteries are still the most used storage systems in stationary
applications [6]. However, their longer lifetime and higher energy and power
densities make Lithium-ion batteries very attractive as well [7]. On the other hand,
Lithium-ion technology is more expensive than other technologies and requires a
Battery Management System (BMS). The BMS is an electronic control circuit that keeps
voltage, temperature, and current of the battery cells inside the Safe Operating
Area (SOA), avoiding hazardous conditions and possible damage to the battery
[8]. The BMS complexity and cost depend on the requirements posed on
the battery system.
The BMS cost can be reduced with a flexible and modular design approach.
It makes the BMS usable in a large number of micro-grids with different energy
and power requirements. For example, in [9] the BMS is divided into one module for
each cell of the battery. In this way, the BMS is extremely flexible,
but it is expensive because it requires many electronic components to obtain the
same functionality for each battery cell, such as the power management, the
control algorithms and the communication, which could be shared by multiple
cells. A good trade-off between cost and flexibility is reached by dividing the
battery into standard modules, which are connected to each other so as to
fulfill the battery energy and power settings. This idea is the basis of the work
presented in this paper, which describes a BMS based on standard modules.
Moreover, the paper shows the implementation choices of a battery based on
this BMS and designed to be used in the micro-grid presented in Sect. 3. Finally,
the preliminary battery test campaign and the main conclusions of the work are
reported in the last two sections.

2 Modular BMS Framework

The battery framework adopted in this work is shown in Fig. 1. It is based on
a modular and hierarchical approach and allows us to easily change the battery
nominal voltage by varying the number of series-connected standard modules. A
variable number of cells can be parallel-connected to change the battery capacity,
and thus the deliverable energy and power. A macro-cell is obtained, which can
be considered as a single cell with a capacity equal to the sum of the single-cell
capacities [10]. Moreover, the developed structure allows the parallel connection
of a variable number of batteries, as shown in [11], to obtain an overall
battery with a capacity and a maximum power equal to the sum of those of the single
batteries.
The hierarchical BMS architecture is divided into two levels: the Module Management
Unit (MMU) and the Pack Management Unit (PMU). The MMU is
the control board of the module, which can be composed of 6 to 16 series-connected
cells. This board is based on the TI bq76pl455 chip monitor and its
main aim is to measure the voltages and temperatures of the cells. Moreover,
it is provided with a passive charge balancing circuit, which is needed to maximize
the quantity of energy that the battery can deliver. The MMUs are connected
to each other via a daisy-chain bus that supports up to 16 modules.
Fig. 1. Block diagram of the battery architecture.

The first module of the series is connected to the PMU and provides it with the
quantities measured by the modules.
The PMU is the core of this architecture. It is based on the NXP LPC1769
micro-controller and communicates with the other micro-grid main units. More-
over, the PMU processes all the data coming from the MMUs and the current
sensor, and controls the battery protection circuit, which is used to
interrupt the power flow to the battery if one or more cells leave their SOA.
The BMS is also equipped with a current sensor and a fuse. The first one
is the VT-S-300-U3-I-CAN2-12/24 with measurement range of ±300 A. It com-
municates with the PMU via a CAN-bus. The fuse is series-connected to the
battery terminals and intervenes in case of accidental short-circuits.
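
The protection logic handled by the PMU can be sketched as a simple check of the measured quantities against the SOA limits. This is only an assumed illustration of the idea and not the firmware of the PMU; the class and function names are hypothetical and the limits must be taken from the datasheet of the chosen cell.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SafeOperatingArea:
    """SOA limits of a cell (all values are placeholders to be set by the user)."""
    v_min: float        # minimum cell voltage [V]
    v_max: float        # maximum cell voltage [V]
    t_min_c: float      # minimum cell temperature [deg C]
    t_max_c: float      # maximum cell temperature [deg C]
    i_max_abs_a: float  # maximum absolute battery current [A]


def within_soa(soa: SafeOperatingArea,
               cell_voltages: Sequence[float],
               cell_temps_c: Sequence[float],
               current_a: float) -> bool:
    """Return True only if every measured quantity is inside the SOA."""
    return (all(soa.v_min <= v <= soa.v_max for v in cell_voltages)
            and all(soa.t_min_c <= t <= soa.t_max_c for t in cell_temps_c)
            and abs(current_a) <= soa.i_max_abs_a)
```

In this sketch, the protection circuit would be opened as soon as within_soa returns False for the latest set of MMU and current sensor measurements.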

3 Case Study and Battery Design

A battery based on the presented BMS framework was designed for a micro-grid
used to charge micro-EVs such as e-bikes, e-scooters, and e-motorcycles. Generally,
these devices require a charging power of hundreds of watts, and the charging
station usually guarantees multiple and simultaneous charging processes. The
inverter chosen for this case study is the HYD 6000 ES. It automatically manages
the energy flows among the PV, the battery and the grid. It can manage up to 6 kW
from/to the grid and the PV and 3 kW from/to a battery with a voltage between
42 V and 58 V.
Before moving further to the battery design, the characteristics of the
lithium-ion cells chosen for the case study are needed to set the design con-
straints. The lithium iron phosphate (LFP) technology seems the best solution
for this application, because it shows a very good trade-off among density of
power and energy, safety, and cost [12]. The Winston LFP060AHA cell was
Fig. 2. Photograph of the case study battery.

selected. It has 3.2 V nominal voltage, a capacity of 60 A h and 3 C as the maximum
charging and discharging rate (where 1 C is the current rate that discharges
a completely charged cell in 1 h).
Starting from the above mentioned constraints, the designed battery was
assembled using one module of 16 series-connected cells and one PMU board. The
battery has a nominal voltage of 51.2 V, an energy capacity of about 3.1 kW h,
a maximum discharge power of about 9.2 kW. Figure 2 shows a photo of the
assembled battery.
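
The battery ratings quoted above follow directly from the cell data and from the number of series-connected cells. The short sketch below reproduces the arithmetic; the function name and the returned keys are hypothetical, and the maximum power is computed at nominal voltage, which is a simplifying assumption.

```python
def battery_ratings(n_series: int, n_parallel: int,
                    cell_voltage_v: float, cell_capacity_ah: float,
                    max_c_rate: float) -> dict:
    """Nominal ratings of a battery made of identical cells."""
    voltage_v = n_series * cell_voltage_v          # series cells add voltage
    capacity_ah = n_parallel * cell_capacity_ah    # parallel cells add capacity
    max_current_a = max_c_rate * capacity_ah       # 1 C equals capacity_ah amperes
    return {
        "voltage_v": voltage_v,
        "capacity_ah": capacity_ah,
        "energy_kwh": voltage_v * capacity_ah / 1000.0,
        "max_power_kw": voltage_v * max_current_a / 1000.0,  # at nominal voltage
    }


# Case study: 16 series-connected 3.2 V, 60 A h cells with a 3 C maximum rate.
ratings = battery_ratings(16, 1, 3.2, 60.0, 3.0)
```

For the case study this gives 51.2 V, about 3.07 kW h and about 9.22 kW, consistent with the rounded values reported in the text. The nominal voltage also falls inside the 42 V to 58 V window accepted by the inverter.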

4 Preliminary Test Campaign

A preliminary test campaign was carried out to verify the battery assembly
and to identify the main parameters, such as the maximum energy that the
battery can provide at different discharging current rates. In particular, the
battery was charged with the same current profile, using a power supply with a
constant current value of 20 A, and discharged with different current rates using
an electronic load. These preliminary tests have verified the BMS design. The
BMS was able to control and manage the battery avoiding unsafe situations.
Figure 3 reports the energy extracted from the battery for different discharge
power values and the average temperature increase of the cells in these experimental
tests. We can note that the energy extracted from the battery in all the
tests is higher than the nominal one indicated by the blue dashed line. Moreover,
the temperature of the cells increases by only around 20 °C when the battery
is discharged at 3.3 kW. This power level is higher than that of the chosen
inverter. We can conclude that the cell temperature will remain within the SOA
values when the battery is used in the micro-grid application.
Fig. 3. Energy extracted from the battery with different power loads and average cell
temperature increase.

5 Conclusion

This paper shows a low-cost flexible framework for the battery design in a micro-
grid application. Micro-grids equipped with an energy storage system seem the
best approach to solve the excessive power demand on the power network due
to uncontrolled EVs charging. The batteries have a key role in this environment
and the possibility to adapt the voltage and power of the battery using the
same framework can foster market interest in micro-grid approaches. A
battery architecture based on modular and hierarchical approach was designed.
This framework was used to assemble a battery for a micro-grid provided with
a micro-EVs charging station. A preliminary battery test campaign has been
presented and the results coming from charge/discharge cycles with different
current loads are discussed. In all the test conditions the battery has provided
an energy higher than the nominal one, showing a very good quality of the
used cells and the full functionality of the hierarchical BMS. Moreover, the cell
temperatures increase by only 20 °C with a power load higher than the maximum
value requested by the micro-grid application we are dealing with. In future
work, the battery will be assembled into the micro-grid presented as case study
and an extensive test campaign will be carried out on the entire system.

Funding. This research was partially funded by PAR FAS Toscana 2007–2013
(Bando FAR FAS 2014), under agreement n. 4421.02102014.072000022 Project
SUMA, and supported by CrossLab project, University of Pisa, funded by MIUR
“Department of Excellence” program.
References
1. Loustric, I., Matyas, M.: Exploring city propensity for the market success of micro-
electric vehicles. Eur. Transp. Res. Rev. 12(1), 42 (2020)
2. Dubey, A., Santoso, S.: Electric vehicle charging on residential distribution systems:
impacts and mitigations, pp. 1871–1893 (2015)
3. Masoum, A.S., Deilami, S., Moses, P.S., Masoum, M.A., Abu-Siada, A.: Smart load
management of plug-in electric vehicles in distribution and residential networks
with charging stations for peak shaving and loss minimisation considering voltage
regulation. IET Gener. Transm. Distrib. 5(8), 877–888 (2011)
4. Nour, M., Said, S.M., Ali, A., Farkas, C.: Smart charging of electric vehicles accord-
ing to electricity price. In: Proceedings of 2019 International Conference on Inno-
vative Trends in Computer Engineering, ITCE 2019, pp. 432–437. Institute of
Electrical and Electronics Engineers Inc., February 2019
5. Mwasilu, F., Justo, J.J., Kim, E.K., Do, T.D., Jung, J.W.: Electric vehicles and
smart grid interaction: a review on vehicle to grid and renewable energy sources
integration, pp. 501–516, June 2014
6. McKeon, B.B., Furukawa, J., Fenstermacher, S.: Advanced lead-acid batteries and
the development of grid-scale energy storage systems. Proc. IEEE 102(6), 951–963
(2014)
7. Zhao, G., Shi, L., Feng, B., Sun, Y., Su, Y.: Development status and compre-
hensive evaluation method of battery energy storage technology in power system.
In: Proceedings of 2019 IEEE 3rd Information Technology, Networking, Electronic
and Automation Control Conference, ITNEC 2019, pp. 2080–2083. Institute of
Electrical and Electronics Engineers Inc., March 2019
8. Wakihara, M.: Recent developments in lithium ion batteries, pp. 109–134, June
2001
9. Elsayed, A.T., Lashway, C.R., Mohammed, O.A.: Advanced battery management
and diagnostic system for smart grid infrastructure. IEEE Trans. Smart Grid 7(2),
897–905 (2016)
10. Baronti, F., Di Rienzo, R., Papazafiropulos, N., Roncella, R., Saletti, R.: Investiga-
tion of series-parallel connections of multi-module batteries for electrified vehicles.
In: 2014 IEEE International Electric Vehicle Conference, IEVC 2014, pp. 1–7.
IEEE, December 2014
11. Di Rienzo, R., Baronti, F., Roncella, R., Morello, R., Saletti, R.: Simulation plat-
form for analyzing battery parallelization. In: SMACD 2017 - 14th International
Conference on Synthesis, Modeling, Analysis and Simulation Methods and Appli-
cations to Circuit Design, pp. 1–4. IEEE, June 2017
12. Keshan, H., Thornburg, J., Ustun, T.S.: Comparison of lead-Acid and lithium ion
batteries for stationary storage in off-grid energy systems. In: IET Conference Pub-
lications, vol. 2016, no. CP688. Institution of Engineering and Technology (2016)
Survey of Positioning Technologies for In-Tunnel
Railway Maintenance

Luca Fronda1(B), Francesco Bellotti1, Riccardo Berta1, Alessandro De Gloria1,
and Paolo Cesario2
1 DITEN - University of Genoa, Via Opera Pia 11a, 16145 Genova, Italy
s4047957@studenti.unige.it, {francesco.bellotti,riccardo.berta,
alessandro.degloria}@unige.it
2 Si Consulting, Srl - Via Gavotti 5/6, 16128 Genova, Italy

paolo.cesario@siconsulting.biz

Abstract. Maintenance plays a fundamental role for the safety and efficiency of
the railway infrastructure. This document analyzes the state of the art of technolog-
ical solutions for indoor positioning, which has recently had significant develop-
ments, particularly with ultra-wide band (UWB), and can be taken into account to
manage the positioning of teams of workers inside a tunnel. Based on our analysis,
we argue that there is not sufficient information about the performance achievable
by state of the art technologies in a railway tunnel maintenance scenario. We thus
propose a set of research questions for an experimental measurement campaign
in the authentic context of use.

1 Introduction

The maintenance of train tracks plays a key role in railway safety, but is problematic
and expensive, for the safety of workers and the impact on the service. At present, a
system is usually deployed, namely, the “Automatic Track Warning System” (ATWS),
that detects the arrival of trains and notifies the workers with the use of acoustic and
visual alarms. An indoor positioning system may complement the ATWS to verify that
workers are actually in a safe location.
Positioning solutions in indoor environments have had significant improvements,
recently, as we will show in this paper presenting the state of the art. However, only sel-
dom does literature deal with applications in railway tunnels. A railway tunnel represents
a very particular environment for positioning, as it is a long and narrow environment,
with a vault, thick walls, humidity and the presence of particular obstacles and distur-
bances, such as iron powder, tracks, high voltage cables, trains, workers and machinery.
Therefore, we argue that experiments are needed in order to understand applicability in
railway tunnels of the latest indoor positioning advances.
The remainder of the document is structured as follows: Sect. 2 presents the fundamentals
of indoor localization. Section 3 shows the research questions we have
elaborated for an experimental campaign, while Sect. 4 draws the conclusions on the
work done.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 252–257, 2021.
https://doi.org/10.1007/978-3-030-66729-0_30

2 Indoor Environments Localization


A fundamental distinction between indoor localization technologies concerns the use
of anchors, which are sensors or transmitters positioned in the target environment. The
first approach (e.g., WiFi, UWB, BLE) uses several anchors together with tags that identify
the users who need to be located. Technologies in the second approach (e.g., inertial
sensors, slotted cables, terrestrial magnetic field and infrared) do not need such an
infrastructure, and are easily installable.
For each technology, the received data must be processed in order to extract the
information for localization. This is done by means of various techniques that we briefly
present in the next subsection.

2.1 Techniques for Processing Data from Localization Sensors

There are two main types of techniques for processing data from location sensors:
deterministic and fingerprint. Deterministic techniques typically exploit measurements such
as the angle of arrival (AOA), the time of arrival (TOA), and the time difference of
arrival (TDOA).
The most commonly used technique is trilateration, which uses the measured distances
between the tag and at least three fixed points (the three necessary anchors) to calculate
the position of the tag. Particularly, it uses the intersection of a set of circles, each
centered on an anchor, to define an area where the tag is located [1].
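
As an illustration of the circle-intersection idea, the following minimal sketch solves the two-dimensional case with exactly three anchors and noise-free ranges. It is not the algorithm of [1]; the function name, the anchor layout and the tag position are assumptions made for the example. Subtracting the first circle equation from the other two removes the quadratic terms and leaves a linear system in the tag coordinates.

```python
import math


def trilaterate(anchors, distances):
    """Estimate a 2D tag position from three anchors and three ranges.

    Subtracting the first circle equation from the other two gives a
    2x2 linear system in (x, y), solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        # Collinear anchors cannot resolve the position unambiguously.
        raise ValueError("anchors are collinear; position is ambiguous")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Hypothetical layout (metres): three anchors and a tag placed at (3, 2).
ANCHORS = [(0.0, 0.0), (10.0, 0.0), (0.0, 5.0)]
TAG = (3.0, 2.0)
RANGES = [math.dist(anchor, TAG) for anchor in ANCHORS]
ESTIMATE = trilaterate(ANCHORS, RANGES)
```

The collinearity check matters in a long and narrow space: anchors mounted along a single line leave the lateral coordinate undetermined, so in practice they would have to be staggered across the section. With noisy ranges and more than three anchors, a least-squares solution of the same linearized system would be the usual extension.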
The fingerprint technique, on the other hand, is based on a statistical model, mitigating
the effects of multipath, shadowing and other problems that are encountered with the
propagation of the electromagnetic signal in closed environments. Fingerprinting consists
of two phases: an offline one, in which the received signal strength indicator (RSSI) or
channel state information (CSI) is used to build a database (fingerprint map), and an online
one, in which the RSSI or the CSI of the tags is read and compared with the fingerprint
map by using machine learning algorithms such as k-nearest neighbor (kNN), support vector
machine (SVM), random forest (RF) or artificial neural network (ANN). In general,
the fingerprint accuracy depends on the density of the anchors, on the precision of the
constructed database and on the variability of the environment [3].
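
The two phases can be sketched with the simplest of the listed algorithms, kNN. The fingerprint map below is invented for the example (five reference points along a corridor, RSSI in dBm from three anchors), and the function name is hypothetical; a real map would be collected on site during the offline phase.

```python
import math

# Offline phase: (position in metres, RSSI in dBm from three anchors).
# All values are made up for illustration.
FINGERPRINT_MAP = [
    ((0.0, 0.0), (-40, -70, -75)),
    ((5.0, 0.0), (-55, -60, -72)),
    ((10.0, 0.0), (-68, -48, -66)),
    ((15.0, 0.0), (-75, -58, -52)),
    ((20.0, 0.0), (-80, -70, -41)),
]


def knn_locate(fingerprint_map, rssi_reading, k=3):
    """Online phase: average the positions of the k closest fingerprints.

    Closeness is the Euclidean distance between RSSI vectors.
    """
    ranked = sorted(
        fingerprint_map,
        key=lambda entry: math.dist(entry[1], rssi_reading),
    )
    nearest = ranked[:k]
    x = sum(position[0] for position, _ in nearest) / len(nearest)
    y = sum(position[1] for position, _ in nearest) / len(nearest)
    return x, y
```

For instance, `knn_locate(FINGERPRINT_MAP, (-56, -59, -71), k=1)` returns the reference point at (5.0, 0.0), while `k=2` averages it with its neighbor at 10 m. The sketch also shows why the accuracy is bounded by the spacing of the reference points and by how well the stored RSSI values still match the environment.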
Filters may also be used to improve result accuracy and robustness. We can distinguish
two types. The first (mainly employed with the fingerprint technique) is used to select
the most valid and influential measurements [4]. The second comprises state estimation
filters, such as the Kalman Filter (KF), Unscented Kalman Filter (UKF),
Extended Kalman Filter (EKF) and Particle Filter, which use the previous state (in our
case the position in two dimensions) to estimate the next one [2].
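
A minimal sketch of the second type is given below: a linear Kalman filter with a random-walk motion model, applied independently to the two position coordinates. This is a deliberate simplification and not one of the filters evaluated in [2]; the class name and the noise parameters `q` (process variance) and `r` (measurement variance) are assumptions chosen for the example.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model."""

    def __init__(self, x0, p0, q, r):
        self.x = x0  # state estimate (one coordinate, in metres)
        self.p = p0  # variance of the estimate
        self.q = q   # process noise variance (assumed)
        self.r = r   # measurement noise variance (assumed)

    def update(self, z):
        # Predict: the state is kept, its uncertainty grows by q.
        self.p += self.q
        # Correct: blend prediction and measurement with the Kalman gain.
        gain = self.p / (self.p + self.r)
        self.x += gain * (z - self.x)
        self.p *= 1.0 - gain
        return self.x


def filter_track(measurements, q=0.01, r=0.25):
    """Smooth a sequence of noisy (x, y) position fixes."""
    fx = ScalarKalman(measurements[0][0], 1.0, q, r)
    fy = ScalarKalman(measurements[0][1], 1.0, q, r)
    return [(fx.update(zx), fy.update(zy)) for zx, zy in measurements]
```

Feeding the filter with fixes of a stationary tag affected by an error of ±0.3 m yields estimates that settle within a few centimeters of the true position. A constant-velocity model, or the EKF/UKF variants, would be the natural next step for a moving worker.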

2.2 Indoor Positioning Technologies

The IEEE 802.11 standard, also known as WiFi, has been used for both the deterministic
and fingerprint approaches, providing accuracy on the order of decimeters. In [5] a person
is localized in an area of (52 × 40) m2 that includes some offices, a cafeteria and several
corridors, with 20 anchors; an algorithm is developed, “SpotFi”, which uses AoA,
obtaining a median error of 40 cm in general and of about
one meter in corridors, a space expected to be similar to tunnels. In [6], an algorithm
based on the fingerprint is used to estimate the position with 8 anchors, with a best case
error of 43 cm. We also find commercial applications like “Infsoft”.
Ultra wide band (UWB) can ensure good range and pass through some obstacles,
like thin walls. The typical technique is trilateration, but fingerprint is also used,
obtaining an accuracy in the order of decimeters. [7] presents localization in an area
of (5,256 × 3,088) cm2 with 4 anchors, achieving an accuracy of 14 cm. [8] uses 4
anchors in an area of (20 × 5) m2, achieving an accuracy of 10 cm in 24.21% of the
cases and of 30 cm in 71.05% of the cases. In [9], 6 anchors were used in an area
of (20 × 28) m2, with an average accuracy of 12 cm and an accuracy of about 40 cm in
more than 99% of the cases. Solutions based on commercial systems, like “KIO RTLS”
or “pozyx”, are also presented.
Bluetooth low energy (BLE) positioning applications typically employ a deterministic
approach, obtaining accuracy in the order of decimeters. BLE is used for low energy
commercial systems like “Bleindoor”. Compared to the technologies previously analyzed,
it has a weaker signal diffusion, and therefore requires the installation of a higher
number of anchors. In [10], 4 anchors are used in an area of (5 × 5) m2, obtaining an
average accuracy of 53 cm. In [11], 3 BLE beacons were used to estimate the position in a
room of 91.8 m2, obtaining a mean error of 0.86 m.
Other technologies like leaky cables, the Earth’s magnetic field, inertial sensors or
infrared sensors can be used for indoor positioning, but they provide lower accuracy,
and could be used as a complement. Leaky cables, which are already deployed in some
tunnels, can be used to estimate the position of a person between two cables, reaching
accuracy in the order of meters (e.g., [12]). The Earth’s magnetic field is used with
fingerprint, obtaining an accuracy of about 1–2 m [13]. Infrared systems use two kinds of
sensors: passive infrared (PIR) sensors, which detect presence and movement within a
certain area, and beacons, as for the previous technologies. Infrared seems to provide
decimeter-level accuracy, but recent results (e.g., [14] and [15]) should be better
investigated, especially in terms of sensor density. Inertial sensors (IMUs), such as
accelerometers, gyroscopes and magnetometers, are used to calculate position, velocity,
angular velocity and acceleration through dead reckoning [16]. They suffer from integration
drift over time, so they are typically used in combination with other technologies. Sensor
fusion might be used to improve performance. For instance, [17] uses WiFi and Bluetooth
anchors together with inertial sensors in cell phones; it was tested in a room of 600 m2
using 8 WiFi anchors and 8 BLE anchors and a Google Nexus 6 phone, obtaining an error of
at least 1 m in 80% of the cases. [18] uses the fusion of the Earth’s magnetic field and
inertial sensors, achieving a mean error of 1.48 m.
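
The integration drift that affects inertial dead reckoning can be illustrated with a short numerical sketch. It assumes a one-dimensional case, a stationary worker and an accelerometer with a constant bias of 0.05 m/s2; the function name, the bias value and the sampling period are assumptions for the example and are not taken from [16].

```python
def dead_reckon(accelerations, dt, v0=0.0, p0=0.0):
    """Integrate acceleration twice to obtain position (1D dead reckoning)."""
    velocity, position = v0, p0
    track = []
    for acceleration in accelerations:
        velocity += acceleration * dt  # first integration: velocity
        position += velocity * dt      # second integration: position
        track.append(position)
    return track


# Stationary worker, accelerometer bias of 0.05 m/s^2 (assumed),
# sampled at 100 Hz for 10 s.
BIAS = 0.05
DT = 0.01
DRIFT = dead_reckon([BIAS] * 1000, DT)
```

After 10 s the estimated position has drifted by about 2.5 m, in line with the closed form $\tfrac{1}{2} b t^2$ for a constant bias $b$. The quadratic growth is the reason why inertial estimates need periodic correction from an anchor-based technology.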

3 Proposed Test and Studies

In general, the accuracy of the system depends on the density of the anchors and the
complexity of the environment. However, the literature lacks information on the railway
tunnel environment. Thus, also in the light of the recent developments in indoor
positioning, we argue that an experimental campaign is needed in order to characterize the
performance of the most promising technologies (UWB, WiFi and BLE, in this order), in terms
of accuracy and confidence, varying the number and location of the anchors. In particular,
we highlight the following items to be investigated:

• Range. The range of technologies strongly depends on the environment: obstacles,
humidity and uneven surfaces can limit the range. The range is key to understanding how
many anchors are needed for a given surface.
• Anchor density. This parameter is particularly relevant when using the fingerprint
technique, and a lot is likely to depend on the post-processing algorithms. This factor
is expected to have a significant impact on precision.
• Probabilistic vs deterministic approach. Especially in the context of a railway tunnel,
with a number of possible sources of noise (electrical line, iron powder, presence
of people, maintenance machinery, and running trains), the literature lacks information
about the best approach to follow.
• Fusion of technologies and techniques. This approach, also including infrared sensors,
could be useful particularly to address robustness.

Tests should consider different scenarios of use and indicate the relevant achievable
precision and confidence levels, with different sensor configurations, technologies and
exploiting the most suited algorithmic approaches.
Given the engineering nature of the application, considerations about installation
time and costs will also have to be made for a proper experimental analysis. For instance,
tunnels could be fully infrastructured, or only partially, with a mobile installation, and
the longitudinal position of the workers (which is less safety-critical) could be more
roughly estimated through low-cost leaky cables.

4 Conclusions
The literature presents various technologies and solutions for localization in indoor envi-
ronments. Recent developments have led to accuracy levels around 20–30 cm with UWB,
using various types of deterministic and probabilistic approaches. WiFi and BLE pro-
vide similar performance, but seem to require a higher density of the anchors. Infrared
sensors look promising as well, even if they have a smaller coverage in literature. The
lack of research reports on the railway tunnel environment demands some specific exper-
iments, to understand whether positioning technologies could improve the maintenance
operations in railway tunnels. Particularly, we highlight four dimensions for an experi-
mental campaign in the authentic context of use: range, anchor density, probabilistic vs
deterministic approach, fusion. Other key aspects that should be addressed concern the
effects of the exposure of the human body to the magnetic field generated by measur-
ing instruments and the ethical and legal problems connected to the tracking of people.
Moreover, removal of a physical barrier between the two rails in a tunnel is obviously
expected to increase the risk.

Acknowledgments. This work was also supported by operative program Por FSE Regione Liguria
2014–2020 (Grant Agreement RLOF18ASSRIC).

References
1. Canalda, P., Chatonnay, P., Spies, F., Lassabe, F., Charlet, D.: Friis and iterative trilatera-
tion based wifi devices tracking. In: 14th Euromicro International Conference on Parallel,
Distributed, and Network-Based Processing (PDP 2006) (2006)
2. Pelka, M., Hellbrück, H.: Introduction, discussion and evaluation of recursive bayesian
filters for linear and nonlinear filtering problems in indoor localization. In: 2016 International
Conference on Indoor Positioning and Indoor Navigation (IPIN), Alcala de Henares,
pp. 1–8 (2016). https://doi.org/10.1109/IPIN.2016.7743663
3. Guo, X., Ansari, N., Hu, F., Shao, Y., Elikplim, N.R., Li, L.: A Survey on fusion-based indoor
positioning. In: IEEE Communications Surveys & Tutorials, vol. 22, no. 1, pp. 566–594,
Firstquarter (2020). https://doi.org/10.1109/COMST.2019.2951036
4. Meneses, F., Eisa, S., Peixoto, J., Moreira, A.: Removing useless aps and fingerprints from
wifi indoor positioning radio maps. In: International Conference on Indoor Positioning and
Indoor Navigation, Montbeliard-Belfort, pp. 1–7 (2013). https://doi.org/10.1109/IPIN.2013.
6817919
5. Bharadia, D., Katti, S., Kotaru, M., Joshi, K.: Spotfi: decimeter level localization using wifi.
In: 10th International Conference on Wireless Communications, Networking and Mobile
Computing (WiCOM 2014) (2015)
6. Hu, K., Yu, M., Xiao, T.T., Liao, X.Y.: Study of fingerprint location algorithm based
on wifi technology for indoor localization. In: 10th International Conference on Wireless
Communications, Networking and Mobile Computing (WiCOM 2014) (2014)
7. Cheng, L., Chang, H., Wang, K., Wu, Z.: Real time indoor positioning system for smart
grid based on uwb and artificial intelligence techniques. In: 2020 IEEE Conference on
Technologies for Sustainability (SusTech), pp. 1–7 (2020).
8. Puschita, E., Simedroni, R., Palade, T., Codau, C., Vos, S., Ratiu, V., Ratiu, O.: Performance
evaluation of the uwb-based cds indoor positioning solution. In: 2020 International Workshop
on Antenna Technology (iWAT), pp. 1–4 (2020)
9. Stahlke, M., Kram, S., Mutschler, C., Mahr, T.: Nlos detection using uwb channel impulse
responses and convolutional neural networks. In: 2020 International Conference on Local-
ization and GNSS (ICL-GNSS), pp. 1–6 (2020)
10. Essa, E., Abdullah, B.A., Wahba, A.: Improve performance of indoor positioning system
using ble. In: 2019 14th International Conference on Computer Engineering and Systems
(ICCES), pp. 234–237 (2019)
11. Phutcharoen, K., Chamchoy, M., Supanakoon, P.: Accuracy Study of Indoor Positioning with
Bluetooth Low Energy Beacons. In: 2020 Joint International Conference on Digital Arts,
Media and Technology with ECTI Northern Section Conference on Electrical, Electronics,
Computer and Telecommunications Engineering (ECTI DAMT & NCON), Pattaya, Thailand,
pp. 24–27 (2020) https://doi.org/10.1109/ECTIDAMTNCON48261.2020.9090691
12. Tsujita, W., Inomata, K., Shikai, M., Hirai, T.: Two dimensional location estimation with
leaky coaxial cables for wide area surveillance system. In: 2011 41st European Microwave
Conference, pp. 143–146 (2011)
13. Saxena, A., Zawodniok, M.: Indoor positioning system using geo-magnetic field. In: 2014
IEEE International Instrumentation and Measurement Technology Conference (I2MTC)
Proceedings, pp. 572–577 (2014)
14. Lai, K., Ku, B., Wen, C.: Using cooperative pir sensing for human indoor localization. In:
2018 27th Wireless and Optical Communication Conference (WOCC), pp. 1–5 (2018)
15. Arai, T., Yoshizawa, T., Aoki, T., Zempo, K., Okada, Y.: Evaluation of indoor positioning
system based on attachable infrared beacons in metal shelf environment. In: 2019 IEEE
International Conference on Consumer Electronics (ICCE), pp. 1–4 (2019)
16. Zhang, M., Yang, J., Zhao, J., Dai, Y.: A dead-reckoning based local positioning system for
intelligent vehicles. In: 2019 IEEE International Conference on Power, Intelligent Computing
and Systems (ICPICS), pp. 513–517 (2019)
17. Zou, H., Chen, Z., Jiang, H., Xie, L., Spanos, C.: Accurate indoor localization and tracking
using mobile phone inertial sensors, wifi and ibeacon. In: 2017 IEEE International Symposium
on Inertial Sensors and Systems (INERTIAL), pp. 1–4 (2017)
18. Fentaw, H. W., Kim, T.: Indoor localization using magnetic field anomalies and inertial mea-
surement units based on monte carlo localization. In: 2017 Ninth International Conference
on Ubiquitous and Future Networks (ICUFN), Milan, pp. 33–37 (2017)
IoT, AI and ICT Applications
Edgine, A Runtime System for IoT Edge
Applications

Riccardo Berta(B), Andrea Mazzara, Francesco Bellotti, Alessandro De Gloria,
and Luca Lazzaroni

DITEN, Università degli Studi di Genova, Via Opera Pia 11/a, 16145 Genoa, Italy
{riccardo.berta,francesco.bellotti,alessandro.degloria}@unige.it

Abstract. The diffusion of Internet of Things (IoT) technologies has paved the
way to new applications and services. In this context, developers need tools for
efficient design and implementation. This paper proposes Edgine (Edge engine),
a cross-platform open-source edge computing system. The system is the edge
computing extension of Measurify, a cloud Application Programming Interface
(API) dedicated to the collection and processing of measurements from the field.
Particularly, Edgine can be remotely programmed to perform various kinds of
processing on the field sensors’ data streams, thus allowing optimizing resource
utilization, and reducing latency, bandwidth, transmission energy and computa-
tional burden on the cloud side. This paper presents three simple application cases
that show effectiveness and versatility of the tool. To the best of our knowledge this
is the first end-to-end development system dedicated to IoT measurements, open
source, programmable on both the edge and cloud side, platform independent and
non-cloud-vendor locked.

Keywords: IoT · Development tools · Edge computing · Arduino · Platform-independence

1 Introduction
In the Internet of Things (IoT) scenario [1], edge devices are gaining ever more relevance
as fully integrated tools in a continuously operating computation continuum
from the field to the cloud. Computing in proximity to data sources aims at reducing
latency, energy consumption and bandwidth occupation [2]. The generic edge device
term involves a variety of devices, ranging from FPGAs and devices with few KBs of
memory to families of microcontrollers (e.g., [3]), smartphones and high performance
Machine Learning (ML)-enabled microcontrollers (e.g., the Coral Dev Board [4]). This
allows creating a network of micro data centers located near to the raw information
sources. In order to increase scalability, a hierarchical structure of the connected devices
is provided, so to optimize resource consumption by dynamically organizing the data
processing. In this context, there is a growing need to provide developers with tools
able to support efficient design and implementation of applications, allowing them to
focus on the application logic rather than on the implementation details (e.g., about
management of the connection and delivery of information packets, organization of the
data processing across various stages from the field to the cloud). This paper proposes
Edgine (Edge engine), a cross-platform open-source edge computing system designed
to support developers building IoT applications.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 261–266, 2021.
https://doi.org/10.1007/978-3-030-66729-0_31

2 Cross-Platform Edge Engine Implementation


Edgine is a cross-platform edge system able to download scripts from the cloud and run
them locally. In an edge-to-cloud continuum computing perspective, the system is remotely
configurable, in terms of settings and executable scripts (Fig. 1).

Fig. 1. Block diagram of the edge engine in the IoT ecosystem

On the cloud side, Edgine relies on Measurify: a RESTful Application Programming
Interface (API) cloud platform aimed at the management of IoT objects [5]. Particularly,
not only does Measurify collect all the information sent from the field by an Edgine
instance, but it also provides the interface to the developer for remotely programming a
deployed Edgine. Edgine’s runtime operation consists of two parts: an initialization and a
continuous loop. At start-up, Edgine connects to the API to download its description, in
particular the list of scripts to be executed and the parameter values for its configuration.
During the loop, Edgine executes each assigned script in sequence.
The edge device needs to be recognized, and it must be checked that it has sufficient
rights to interoperate with Measurify. To this end, the Edgine instance makes a Hypertext
Transfer Protocol (HTTP) POST to Measurify, specifying in its body the credentials of the
Edgine’s owner. If authentication succeeds, Edgine receives from the API a JavaScript Object
Notation (JSON) Web Token (JWT). This token (valid for a configurable number of
minutes) must be put in the header of all subsequent HTTP requests to the API.
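
The authentication flow described above can be sketched as follows. The sketch only builds the HTTP requests and does not contact any server. The base URL, the `/login` and device paths, the JSON field names and the use of the `Authorization` header are all assumptions made for illustration; the actual Measurify API may differ on each of these points.

```python
import json
import urllib.request


def build_login_request(base_url, username, password):
    """POST carrying the owner's credentials in the body (hypothetical schema)."""
    body = json.dumps({"username": username, "password": password})
    return urllib.request.Request(
        base_url + "/login",  # hypothetical path
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_authorized_request(base_url, path, token, payload=None):
    """Any later request: the JWT travels in a header (name assumed)."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        base_url + path,
        data=data,
        method="GET" if data is None else "POST",
    )
    request.add_header("Authorization", token)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request
```

In such a scheme the client would send the first request once at start-up, store the returned token, and repeat the login when the token expires.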
Each device associated with a Measurify installation has a JSON descriptive structure
which includes the features and the scripts fields. The former indicates the list of
measurement types (specified in Measurify to assure input integrity) that a device can
perform, while the latter is a descriptor of the required computing task (Fig. 2).
Particularly, its “code” field specifies the executable script, which is the chain of functions
(instructions) applicable by Edgine to its raw data before delivery to the cloud, during
the continuous loop. Each instruction is applied to its input data stream, which is the
output of its preceding instruction. The first stage of the chain is applied to the raw input
data. Instructions in Fig. 2 concern the computation of the available ROM (in GB), its
conversion to MB, and the final shipment to Measurify.
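
The chaining principle, in which each instruction consumes the output stream of the preceding one, can be sketched as a small interpreter. The encoding of a script as a list of `(name, argument)` pairs is hypothetical and does not reproduce the syntax of the “code” field; only a few of the instructions are covered, and `average` here reduces the whole stream rather than a specified number of samples.

```python
import operator
import statistics

ARITHMETIC = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def run_script(chain, raw_values, outbox):
    """Apply each instruction to the stream produced by the previous one."""
    stream = list(raw_values)
    for name, argument in chain:
        if name == "map":
            symbol, operand = argument
            stream = [ARITHMETIC[symbol](value, operand) for value in stream]
        elif name == "filter":
            low, high = argument
            stream = [value for value in stream if low <= value <= high]
        elif name == "max":
            stream = [max(stream)] if stream else []
        elif name == "average":
            stream = [statistics.fmean(stream)] if stream else []
        elif name == "send":
            # Stand-in for the HTTP upload: collect what would be shipped.
            outbox.extend(stream)
        else:
            raise ValueError(f"unknown instruction: {name}")
    return stream
```

A chain similar to the one described for Fig. 2 would read `[("map", ("*", 1024)), ("send", None)]`: applied to a raw value of 12.5 GB it places 12800.0 in the outbox, assuming a factor of 1024 between GB and MB.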

Fig. 2. A script JSON description

The current instruction set is reported in Table 1. Table 2 summarizes the HTTP
requests during the two phases of start-up (authentication and download of the scripts)
and infinite loop (measurements upload).

Table 1. Instructions currently available in Edgine

Instruction | Description
Send | Sends to the API all elements of the data stream
Map | A new data stream is created by performing a simple arithmetic operation between two operands
Max/min | A new data stream is created containing only the max/min value among the values in the input stream
Window/slidingWindow | A new data stream is created by applying a two-operand function on an accumulator, initialized to the value of the second argument, and on each input element, for a number of values indicated by the size of the window/slidingWindow
Filter | A new data stream is created using only the elements of its input stream that have a value within a specified range
Average/median/stdDeviation | A new data stream is created by taking the average/median/stdDeviation of a specified number of samples in its input stream
Available-rom/available-ram | A new data stream is created containing the value of the currently available ROM/RAM
Total-rom/total-ram | A new data stream is created containing the value of total ROM/RAM

Another major development concerned the communication interface, designed to
be as abstract as possible from the hardware, for portability. To that end, classes have
been created to allow developers to switch from Windows/Linux/Mac PC platforms to
Arduino through macros.

Table 2. HTTP requests by Edgine

Request | Subject | Description
POST | Login credentials | JWT is received from the cloud
GET | Device description | Scripts are retrieved from the cloud
POST | Measurements | Edge-processed data are shipped to the cloud

The difference between the two platform types concerns the Internet connection. On
Arduino, connection to a WiFi network (specified in the code) is automatically provided
by the system, which is also designed to perform a reconnection in case of signal loss. On
PC-type devices, instead, the network connection (and reconnection) is not automated,
since a PC user can exploit the user interface (UI) to freely choose an available network.
Furthermore, automating the connection would require different implementations for
different OSs, complicating Edgine’s structure. The actual shipment of the data to the cloud
is made by a thread, which is generated at each cycle, so that communication operations
do not block the field device. Moreover, independent of the target platform,
a queue of threads is implemented, which allows enqueuing the threads if the network
connection is currently not available, in order to guarantee correct data delivery also
in this case.
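
A possible shape for this non-blocking delivery is sketched below. It departs from the description in one respect, stated as an assumption: pending measurements, rather than threads, are kept in the queue, and each cycle spawns a short-lived worker that drains the queue while the network is reachable. The class name and the `send` and `is_online` callables are hypothetical.

```python
import threading
from collections import deque


class Uploader:
    """Ship measurements in background threads, keeping unsent ones queued."""

    def __init__(self, send, is_online):
        self._send = send            # callable performing the actual upload
        self._is_online = is_online  # callable returning True if connected
        self._pending = deque()
        self._lock = threading.Lock()

    def submit(self, measurement):
        """Called once per loop cycle; returns immediately with the worker."""
        with self._lock:
            self._pending.append(measurement)
        worker = threading.Thread(target=self._flush)
        worker.start()
        return worker

    def _flush(self):
        # The lock serializes workers, so queued items leave in FIFO order.
        with self._lock:
            while self._pending and self._is_online():
                self._send(self._pending.popleft())
```

When the network is down, the worker exits without sending and the measurement stays queued; the first worker started after the connection returns ships the backlog before the new value.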

3 Results and Discussion


We have tested the system in three very different applications. The first one, implemented
on Windows, consists in providing information on the main characteristics of the hosting
system. The example collects information about the amount of available RAM and
ROM memory. To this end, we configured a Measurify instance to be able to receive
measurements having the following single dimension, numeric features: total-ram, total-
rom, available-ram and available-rom. Nine scripts have been prepared, of which one is
shown in Fig. 2, while others make more sophisticated operations, such as computing
the average available RAM over one hour of use. In the loop, Edgine retrieves from the
system the values of the available/total RAM and ROM and puts them in the body of
a POST request together with the relevant feature name, and timestamp (Fig. 2). This
functionality is obviously platform-dependent.
The second test case concerns the BSc thesis work by two Electronic Engineering
students, who created a motion detection IoT application. They used an Arduino platform,
equipped with the ESP8266 Wi-Fi module and the PIR motion sensor HC-SR501.
During the loop, motion data are collected by the sensor, and sent to the board via an
Interrupt Service Routine (ISR). Data are then processed by scripts to extract summarized
information before delivery to the cloud. According to the students’ experience, the
benefits of using Edgine are manifold. First, they could focus on data retrieval rather
than cloud communication, thus improving the quality of the project, while reducing
the amount of required work. The students highlighted that the provided use examples
were very useful for them to easily understand how to exploit Edgine, offering a solid
basis for the development of their project. Finally, they highlighted that the chainable
operations, while limited in number, allowed them to achieve the set objectives in a logical
and accurate way.
The main issues concerned the need for hard-coding the WiFi network access credentials
(including the Service Set Identifier, SSID), and a certain difficulty in understanding the
runtime status of the interaction with the cloud. Particularly, since the HTTP requests made by
Edgine are hidden in the library, there is no immediate feedback to the user on cloud
data storage.
The third use case concerns the implementation of Edgine as a plug-in in a game
engine, such as Unity 3D in a desktop environment. We created a Dynamic-Link Library
(DLL), which exports the two main Edgine functions (setup and loop), allowing their
use within the Unity 3D Integrated Development Environment (IDE), which supports
C# scripts. As a proof of concept, we implemented a very simple game scenario, where
the number of collisions of a ball with the walls of a containing box is counted. When
the accumulator reaches a multiple of 5, the measurement is sent to Measurify.
Despite the extreme simplicity of the application, this example shows the versatility of a
tool such as Edgine, which is able to deal not only with IoT sensors, but also with virtual
sensors in a 3D virtual reality environment.

4 Conclusion and Future Work

This paper has presented Edgine, a runtime system for IoT applications on edge devices.
The strengths of this system are platform independence, ease of use and remote
configurability, which are key requirements for IoT application developers. Edgine is the edge
extension of the Measurify project, which is fully available open source
(https://measurify.org/). To the best of our knowledge this is the first end-to-end development system
dedicated to IoT measurements (from collection and edge processing, to cloud delivery,
storage and querying), programmable on both the edge and cloud side, and completely
platform independent and non-cloud-vendor locked.
Initial tests have shown that Edgine provides an innovative and easy to use solution,
suitable even for developers with little experience. The power of Edgine’s design
abstractions is demonstrated by its successful application in various contexts, including
the processing of desktop OS information and human-computer interaction in a virtual reality
environment.
To increase the level of security, a function retrieving credentials from an encrypted
configuration file will be implemented. Also, we intend to enlarge the instruction set
in order to support more challenging application scenarios. Finally, more extensive
testing is planned for an in-depth assessment of the benefits provided by the new
development system.

References
1. Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.: Internet of Things (IoT): a vision,
architectural elements, and future directions. Future Gener. Comput. Syst. 29 (2013)
2. Lin, L., Liao, X., Jin, H., Li, P.: Computation offloading toward edge computing. Proc. IEEE
107, 1584–1607 (2019)
3. Sakr, F., Bellotti, F., Berta, R., De Gloria, A.: Machine learning on mainstream microcontrollers.
Sensors 20, 2638 (2020)
4. Coral, Dev Board. https://coral.ai/products/dev-board/
5. Berta, R., Kobeissi, A., Bellotti, F., De Gloria, A.: Atmosphere, an open source
measurement-oriented data framework for IoT. IEEE Trans. Ind. Inform.
https://doi.org/10.1109/tii.2020.2994414
An Action-Selection Policy Generator
for Reinforcement Learning Hardware
Accelerators

Gian Carlo Cardarilli, Luca Di Nunzio, Rocco Fazzolari, Daniele Giardino,
Marco Matta, Marco Re, and Sergio Spanò(B)

Department of Electronic Engineering, University of Rome Tor Vergata, Via del Politecnico 1,
00133 Rome, Italy
{cardarilli,di.nunzio,fazzolari,giardino,matta,re,
spano}@ing.uniroma2.it

Abstract. We propose the first hardware architecture for an action-selection Policy
Generator feasible for Reinforcement Learning hardware accelerators. The
system is capable of producing outputs for random, greedy and ε-greedy action-
selection policies within the same circuit. It requires a very moderate number of
hardware resources, shows a limited power dissipation, and can be integrated in
the state of the art of Reinforcement Learning hardware accelerators due to its
high computational speed. Our architecture is meant to work with Q-Matrix based
Reinforcement Learning algorithms such as Q-Learning and SARSA.

1 Introduction
In the last few years, we have witnessed a significant increase in Machine Learning (ML)
based applications and research works [1–3]. In the field of ML, Reinforcement Learning
(RL) [4] is a training technique based on the human learning process. It relies on a trial-
and-error approach where the system learns to accomplish a certain task by observing
the effects of its operation in the outer context.
Applications of RL range from robotic control problems [5] and telecommunications [6]
to the Internet of Things (IoT) [7]. In recent years, a new research field based on swarm
and multi-agent RL applications and algorithms has grown in the literature [8, 9].
A typical RL framework is shown in Fig. 1.
The operational core is called Agent. To accomplish the task, it performs some
Actions in the Environment where it “lives”. An Observer oversees the operation giving
the Agent the State of the Environment and a Reward for the last Action performed.
This feedback process is used by the Agent to learn the optimal Action-Selection
Policy (ASP) to reach its goal.

2 Q-Matrix Based Reinforcement Learning Algorithms


Among the different RL algorithms, the most commonly employed ones are based
on a Quality Matrix, also called Q-Matrix. The leading ones are Q-Learning [10] and
State-Action-Reward-State-Action (SARSA) [11].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 267–272, 2021.
https://doi.org/10.1007/978-3-030-66729-0_32

Fig. 1. Reinforcement Learning framework.

The Q-Matrix is a table composed of N rows and Z columns, representing the
possible States of the Environment and the possible Actions of the Agent, respectively. The matrix
values measure the quality of performing a certain Action while the
Environment is in a certain State. A higher value corresponds to a better Action.
An example of Q-Matrix is shown in Fig. 2.

Fig. 2. Q-Matrix example with current state of the environment “State 2”.

Q-Learning and SARSA provide update formulas to fill all the elements of
the Q-Matrix. At the end of the training process, for the row associated with a given State,
the column that contains the highest value indicates the best Action to perform. When
the values of the table converge, the Agent has learnt an optimal ASP.
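
The update formulas themselves are not reported in this text. As a minimal sketch, the standard textbook forms of the two updates (see [4, 10, 11]) can be written as follows; the learning rate alpha, the discount factor gamma and all function names are assumptions introduced here for illustration only.

```python
def q_learning_update(q, s, a, reward, s_next, alpha=0.5, gamma=0.9):
    """Q-Learning (off-policy): bootstrap from the best Action in the next State."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def sarsa_update(q, s, a, reward, s_next, a_next, alpha=0.5, gamma=0.9):
    """SARSA (on-policy): bootstrap from the Action actually selected next."""
    target = reward + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


def best_action(q, s):
    """Column index holding the highest value in the row of State s."""
    row = q[s]
    return max(range(len(row)), key=row.__getitem__)


# Q-Matrix with N = 3 States (rows) and Z = 4 Actions (columns), initialised to zero.
q = [[0.0] * 4 for _ in range(3)]
q_learning_update(q, s=0, a=2, reward=1.0, s_next=1)
print(best_action(q, 0))
```

The only difference between the two functions is the bootstrap term: Q-Learning uses the maximum of the next row, while SARSA uses the entry of the Action chosen by the current policy.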

2.1 Action-Selection Policies


An RL Agent must choose the next Action to perform. The way this operation is carried
out depends on its ASP [4]. Several policies exist; the three most widely used are the following.
Random. The Agent chooses the next Action randomly, ignoring the content of
the Q-Matrix. This policy is usually used in the first stage of training.

Greedy. The Agent always selects the best Action that can be performed according to
the Q-Matrix. The next Action can be identified by searching for the maximum value in
the row related to the current Environment State. This policy is usually employed when
the RL algorithm has converged to an optimal ASP.

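ε-greedy. It is a hybrid solution between the Random and Greedy policies. According to a
given probability ε and a random probability P, the ASP is:

$$\text{ASP} = \begin{cases} \text{random} & \text{if } P < \varepsilon \\ \text{greedy} & \text{if } P \geq \varepsilon \end{cases} \qquad (1)$$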

This means that the Agent performs a random Action with probability ε and
chooses the best one the rest of the time. Such a policy is usually followed in the middle
of the training process, giving the Agent a balance between exploitation and exploration
of the Environment.
Note that for ε = 1 the ε-greedy policy reduces to a random policy, while for ε = 0
it reduces to a greedy one. This relation is exploited by the proposed Policy Generator
architecture.
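
A minimal software sketch of Eq. (1) is given below. It is not the hardware implementation discussed later; the function name and the use of Python's random module as the source of P are assumptions made for illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng=random):
    """Select an Action index from one Q-Matrix row according to Eq. (1)."""
    p = rng.random()  # random probability P, uniform in [0, 1)
    if p < epsilon:
        # Random branch: any Action, regardless of the Q-Matrix content.
        return rng.randrange(len(q_row))
    # Greedy branch: index of the maximum value in the row.
    return max(range(len(q_row)), key=q_row.__getitem__)


row = [0.1, 0.7, 0.3, 0.2]
print(epsilon_greedy(row, 0.0))  # epsilon = 0 is the greedy policy
```

Since P is drawn from [0, 1), calling the function with epsilon = 1 always takes the random branch and epsilon = 0 always takes the greedy one, which mirrors the two limit cases noted above.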

3 Hardware Architecture
In recent years, different hardware accelerators for Q-Learning and SARSA have been
proposed in the literature [12–14], with [14] being the current state of the art in terms of hardware
resources, power dissipation, computational performance, scalability and flexibility.
The hardware architecture of a typical RL Agent is shown in Fig. 3.

Fig. 3. Hardware architecture of a reinforcement learning agent.

It is composed of two main blocks: the RL accelerator, which updates the Q-Matrix,
and the Policy Generator, which outputs the next Action a_{t+1} to be performed.
The hardware architecture of the proposed Policy Generator, the first of its kind,
is shown in Fig. 4.

Fig. 4. Hardware architecture of the proposed Policy Generator.

The system requires as inputs the row of the Q-Matrix related to the current state,
Q(s_t, A), the value of ε for the ε-greedy policy, and a control signal rnd to switch to the
fully random policy. The greedy policy can be selected simply by setting the ε value to 0,
consistently with Eq. (1).
The output of the system is the next action a_{t+1} the Agent should perform.
The best action for the greedy policy is evaluated by a maximum finder tree (MAX)
[12]. A 32-bit Linear Feedback Shift Register (LFSR) is used both to generate the action
for the random policy and to generate the probability comparison value P for the ε-greedy policy.
The 8 Least Significant Bits (LSBs) of the LFSR represent the probability P, which is
compared to the ε input. The Most Significant Bits (MSBs) represent the random action,
and their width depends on the number of columns (actions) Z of the Q-Matrix.
A final multiplexer (MUX) selects the greedy or the random action as the output of the
Policy Generator. The rnd signal is mandatory because setting ε = 1 would cause the output of the
comparator to be low when the LFSR LSBs output a “1” value.
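
The data path described above can be mimicked by a short behavioural model. This is only a sketch under stated assumptions, not the authors' RTL: the LFSR feedback taps and the seed are not given in the text (a commonly listed maximal-length tap set is assumed), ε is assumed to be an 8-bit integer, Z is assumed to be a power of two, and the class and method names are hypothetical.

```python
class PolicyGenerator:
    """Behavioural model: LFSR, maximum finder, comparator and output MUX."""

    TAPS = 0x80200003  # assumed Galois taps (x^32 + x^22 + x^2 + x + 1)

    def __init__(self, num_actions, seed=0xACE1ACE1):
        if num_actions < 2 or num_actions & (num_actions - 1):
            raise ValueError("num_actions must be a power of two >= 2")
        if not 0 < seed < 2**32:
            raise ValueError("seed must be a non-zero 32-bit value")
        self.num_actions = num_actions
        self.action_bits = num_actions.bit_length() - 1
        self.state = seed

    def _step_lfsr(self):
        """Advance the 32-bit Galois LFSR by one clock cycle."""
        lsb = self.state & 1
        self.state >>= 1
        if lsb:
            self.state ^= self.TAPS
        return self.state

    def next_action(self, q_row, epsilon, rnd=False):
        """Return the next action; epsilon is an 8-bit integer (0..255)."""
        if len(q_row) != self.num_actions:
            raise ValueError("q_row must hold one value per action")
        state = self._step_lfsr()
        p = state & 0xFF                                   # 8 LSBs: probability P
        random_action = state >> (32 - self.action_bits)   # MSBs: random action
        greedy_action = max(range(len(q_row)), key=q_row.__getitem__)  # MAX
        take_random = rnd or p < epsilon                   # comparator OR rnd
        return random_action if take_random else greedy_action  # final MUX


pg = PolicyGenerator(num_actions=4)
print(pg.next_action([3, 9, 1, 4], epsilon=0))
```

In this model, epsilon = 0 makes the comparison P < ε always false, so the greedy action is always returned, as in Eq. (1). With the largest 8-bit ε (255) the comparison still fails whenever P is 255, which illustrates why a separate rnd input is useful for a fully random policy.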

4 Implementation Results

We tested our architecture on a Xilinx FPGA xczu7ev-ffvc1156, which is the same
device used in [14]. This choice allows us to check timing and resource compatibility and to
integrate the Policy Generator in the Reinforcement Learning accelerator.
The implementation results are shown in Table 1. We tested the two corner cases,
Z = 4 with 8 bits for the representation of the Q-Matrix values and Z = 16 with 32 bits,
as well as an intermediate case with Z = 8 and 16 bits.
We give information about the required Look-Up-Tables (LUT), Flip-Flops (FF),
the maximum clock frequency, the dynamic power dissipation, and the energy required
to generate an action.
As can be seen, since the only sequential part of the architecture is the LFSR,
only 32 FFs are required in every case. The number of LUTs increases with the complexity
of the circuit, but it is always a very small part of the total FPGA resources. The same
holds for the power and the energy, which have been evaluated in a worst-case scenario
where all the circuit nodes switch at every clock cycle.

Table 1. Hardware implementation results of the proposed Policy Generator.

Z    Bits   LUT           FF            Clock     Power   Energy
4    8      68 (0.03%)    32 (<0.01%)   1.8 GHz   4 mW    0.002 nJ
8    16     161 (0.07%)   32 (<0.01%)   1.8 GHz   9 mW    0.005 nJ
16   32     727 (0.32%)   32 (<0.01%)   1.8 GHz   41 mW   0.023 nJ

For all the considered cases, the maximum achievable clock frequency is 1.8 GHz.
For this reason, the Policy Generator can be included into any Reinforcement Learning
accelerator without affecting its timing requirements.
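
The energy column of Table 1 is consistent with dividing the dynamic power by the clock frequency, i.e. with one action being generated per clock cycle. This is an inference from the reported numbers rather than a statement of the text, and the names used in the short check below are illustrative.

```python
CLOCK_HZ = 1.8e9  # maximum clock frequency reported in Table 1

# Dynamic power in watts for each (Z, bits) configuration of Table 1.
POWER_W = {(4, 8): 4e-3, (8, 16): 9e-3, (16, 32): 41e-3}


def energy_per_action_nj(power_w, clock_hz=CLOCK_HZ):
    """Energy of one clock cycle in nanojoules (assumes one action per cycle)."""
    return power_w / clock_hz * 1e9


for (z, bits), power in POWER_W.items():
    print(f"Z={z:2d}, {bits:2d} bits: {energy_per_action_nj(power):.3f} nJ")
```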

5 Conclusions

We developed a hardware architecture for an action-selection Policy Generator. The
system is meant to be part of Reinforcement Learning hardware accelerators based on
the Q-Matrix, such as those for Q-Learning and SARSA. Our system is an integrated solution for the
generation of actions according to the most used policies, namely random, greedy, and
ε-greedy.
The FPGA implementation results show that the architecture requires a very moderate
number of hardware resources and shows limited power dissipation and a high
computational speed. For these reasons, it can be readily integrated into state-of-the-art
Reinforcement Learning hardware accelerators.
In a future update, we plan to extend the capabilities of the Policy Generator by
including more complex action-selection policies, such as the Boltzmann policy [15].

Acknowledgments. The authors would like to thank Xilinx Inc. for providing FPGA hardware
and software tools through the Xilinx University Program.

References
1. Giuliano, R., et al.: Indoor localization system based on bluetooth low energy for museum
applications. Electronics (Switzerland) 9(6), 1–20 (2020). art. no. 1055
2. Capizzi, G., et al.: Small lung nodules detection based on fuzzy-logic and probabilistic neural
network with bioinspired reinforcement learning. IEEE Trans. Fuzzy Syst. 28(6), 1178–1189
(2020). art. no. 8895990
3. Napoli, C., Bonanno, F., Capizzi, G.: An hybrid neuro-wavelet approach for long-term
prediction of solar wind. Proc. Int. Astron. Union 6(S274), 153–155 (2010)

4. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT press, Cambridge
(2018)
5. Lin, J.L., et al.: Gait balance and acceleration of a biped robot based on Q-learning. IEEE
Access 4, 2439–2449 (2016)
6. Matta, M., et al.: A reinforcement learning-based QAM/PSK symbol synchronizer. IEEE
Access 7, 124147–124157 (2019)
7. Zhu, J., et al.: A new deep-Q-learning-based transmission scheduling mechanism for the
cognitive Internet of Things. IEEE Internet Things J. 5(4), 2375–2385 (2017)
8. Samadi, E., Badri, A., Ebrahimpour, R.: Decentralized multi-agent based energy management
of microgrid using reinforcement learning. Int. J. Electr. Power Energy Syst. 122, 106211
(2020)
9. Matta, M., et al.: Q-RTS: a real-time swarm intelligence based on multi-agent Q-
learning. Electron. Lett. 55(10), 589–591 (2019)
10. Watkins, C.J.C.H., Dayan, P.: Q-learning. Mach. Learn. 8(3–4), 279–292 (1992)
11. Rummery, G.A., Niranjan, M.: On-line Q-learning using connectionist systems, vol. 37.
University of Cambridge, Department of Engineering, Cambridge (1994)
12. Da Silva, L.M., Torquato, M.F., Fernandes, M.A.: Parallel implementation of reinforcement
learning Q-learning technique for FPGA. IEEE Access 7, 2782–2798 (2018)
13. Rajat, R., et al.: Qtaccel: a generic fpga based design for q-table based reinforcement learning
accelerators. In: The 2020 ACM/SIGDA International Symposium on Field-Programmable
Gate Arrays (2020)
14. Spanò, S., et al.: An efficient hardware implementation of reinforcement learning: the
Q-learning algorithm. IEEE Access 7, 186340–186351 (2019)
15. Tijsma, A.D., Drugan, M.M., Wiering, M.A.: Comparing exploration strategies for Q-learning
in random stochastic mazes. In: 2016 IEEE Symposium Series on Computational Intelligence
(SSCI). IEEE (2016)
Porting Rulex Machine Learning Software
to the Raspberry Pi as an Edge
Computing Device

Ali Walid Daher1,2,3(B), Ali Rizik1, Marco Muselli3, Hussein Chible2,
and Daniele D. Caviglia1
1 COSMIC Lab, Department of Electrical, Electronic and Telecommunications
Engineering and Naval Architecture (DITEN), University of Genova, Genova, Italy
alidengineer@live.com, ali.rizik@edu.unige.it,
daniele.caviglia@unige.it
2 MECRL Laboratory, Ph.D. School for Sciences and Technology, Lebanese University,
Beirut, Lebanon
hchible@ul.edu.lb
3 Consiglio Nazionale Delle Ricerche, Institute of Electronics, Computer
and Telecommunication Engineering (IEIIT), Genova, Italy
marco.muselli@ieiit.cnr.it

Abstract. With the rise of Internet of Things (IoT) and Edge Computing, which
are technologies that rely on smart and low power computing nodes with adequate
processing power and storage capabilities, it is expected that Artificial Intelli-
gence and machine learning will play a role in the continuous spreading of their
application fields.
One of the most adopted hardware platforms for IoT and Machine Learn-
ing is the low-cost, multipurpose Raspberry Pi, which is small enough and still
capable of effectively handling machine learning tasks. Moreover, it is ideal for
development and educational purposes. On the other hand, among the plethora of
Machine Learning (ML) paradigms reported in the literature, we identified Rulex
[1] as a good candidate as an ML engine, suitable for advanced edge computing
applications.
In this paper, we report the deployment of the machine learning package Rulex
on the Raspberry Pi in multiple arrangements. The goal is to perform
training and testing of Machine Learning algorithms by running Rulex on
the Raspberry Pi as an Edge Computing Device.
Specifically, we describe the process of porting the Rulex external and internal
libraries to Windows 32 Bits, Ubuntu 64 Bits, and Raspbian 32 Bits. Moreover, we
present the Standalone and Client/Server configurations of Rulex on the Raspberry
Pi, along with the Remote Development configuration used to compile and debug
the Rulex source code remotely. We have made forecasts using training and
testing data sets on the Raspberry Pi as an IoT Device, which yielded promising
and accurate results.

Keywords: Internet of Things · Edge computing · Machine learning

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 273–279, 2021.
https://doi.org/10.1007/978-3-030-66729-0_33

1 Introduction
Internet of Things applications span a variety of fields, from biomedical wearable
sensors to weather and military applications, among numerous others.
Furthermore, such emerging applications can improve their effectiveness and make
intelligent decisions if equipped with Machine Learning techniques. These techniques rely
on a statistical analysis of training data to generate a model, which is then applied to new or
testing data for validation.
In past implementations, the training data are fed to the machine learning method
on a remote cloud server, where the model is generated and later applied to the new
data available on the IoT computing node [2]. In this framework, with the aim of avoiding
network traffic and limiting access to the Cloud, the Edge Computing paradigm [3] should be
applied. This implies that all of the Machine Learning training is performed on the IoT
device. Our work moves in this direction and aims to demonstrate its effectiveness in
a real-world application.
In this paper, we have adopted as the hardware platform the Raspberry Pi [4], an
affordable, low-power, credit-card-sized Personal Computer (PC) suited to embedded system
applications, which can also be used for general-purpose everyday computing. The
Machine Learning software selected for this investigation is Rulex [1], a package targeted
at non-domain experts. It consists of a collection of Machine Learning algorithms
(both supervised and unsupervised), such as the Logic Learning Machine (LLM) [1], Neural
Networks, Binary Trees, K-Nearest Neighbor (KNN), and others, and of a sophisticated but
easy-to-use Graphical User Interface (GUI). This interface enables the end user to
import and manipulate data in a user-friendly environment and to apply one of the supported
learning algorithms in order to make forecasts in both classification and regression
scenarios.
Our main activity consisted of porting Rulex to different Operating Systems,
namely Windows 32 Bits, Ubuntu 64 Bits running on the Raspberry Pi, and Raspbian
32 Bits, which is the official Operating System (OS) of the Raspberry Pi. All external
libraries were compiled in 32 Bits for both Windows and Linux before compiling the
source code for the relevant platforms. Moreover, the code was debugged on Linux
and was made redundant, in the sense that the same code operates on multiple Operating
Systems. Finally, forecasts were made on the Raspberry Pi in a Client/Server arrangement
[5, 6], yielding encouraging results in terms of accuracy.
The remainder of this paper is organized as follows. Section 2 presents the liter-
ature review, and Sect. 3 describes the actions taken to port Rulex on Raspberry Pi
in a Client/Server arrangement. Simulation results are presented in Sect. 4. Finally,
conclusions are drawn in Sect. 5.

2 Literature Review
2.1 Hardware Platform
The hardware platform selected for the project is the Raspberry Pi, a credit-card-sized
personal computer introduced in 2012 [7]. It can perform application-specific tasks as
well as act as a standalone Personal Computer, where it can support keyboard, HDMI

Display, SD Card, in addition to multiple Universal Serial Bus (USB) ports to connect any
other required Input/Output Device. The Raspberry Pi is built around a low-power
ARM processor and uses the SD Card as its mass storage.
Concerning communication over the internet, the Raspberry Pi has a Local
Area Network (LAN) port, in addition to a Wi-Fi module for wireless connectivity.
Furthermore, the Raspberry Pi has multiple digital Input/Output pins, which can be used
in embedded system applications such as motor control, serial communications, LCD
displays, and interfacing with a smart sensor module [4].

2.2 Rulex

Rulex [1] is a Machine Learning software package that is supported on Windows 64 Bits. It is
targeted mainly at non-domain experts while remaining powerful in terms of accuracy
and performance. The Rulex GUI provides a means of importing training data,
manipulating it mathematically, filtering out unwanted information, and applying functions and queries
to the input data. After Data Mining has been carried out, whether for
classification or regression, Rulex offers multiple Machine Learning algorithms
to apply to the data for deriving models capable of making forecasts.
The main proprietary algorithm for Rulex is Logic Learning Machine (LLM) which
can extract intelligible rules from data. LLM relies on Switching Neural Networks [8]
and monotone Boolean Reconstruction through the Shadow Clustering technique [9] to
extract understandable knowledge from data in a white-box learning setting.

3 Porting Techniques and Tools


The main task of this project has been the porting of the Rulex environment from 64-bit
to 32-bit HW platforms. A preliminary step needed for this purpose was the compilation
of all of its dependencies in a 32-bit environment. In the case of the Windows operating
system, the Visual Studio environment can be used for this task. Dependencies should be
set in the GUI along with the header files, and the output format should be specified. When
compiling in Linux, the same parameters need to be set as in Windows; however, CMake
[10, 11], a cross-platform tool for building executables and libraries, can be used.
External libraries or dependencies need to be ported to 32 Bits as the first step. Also,
the compilation of these libraries should be performed using a tool specific to each target
platform, namely Windows 32 Bits, Ubuntu 64 Bits, and Raspbian 32 Bits.
Figure 1 presents the general file structure diagram of libraries and dependencies,
applicable to both Linux and Windows. The header files contain the functions to be
called along with their sources. Cpp files depend on header files, which themselves may
depend on other header files. Cpp files cannot call a function if the respective header
file is not included. The Cpp files are compiled into output files, which are combined into an
overall output binary file. This last file contains a compiled version of the functions in
the source code, along with the function signatures that a developer calls. Consequently,
the binary files may depend on other binary files, which may form a similar tree structure
to generate a final target or executable file. So, every block of the code needs to know

where its dependencies are and has to produce an output that links it to the subsequent
blocks.

Fig. 1. Dependencies file structure

To do this in Windows, in the 32-Bit case, we used Visual Studio 2019
Enterprise edition to compile the external libraries and generate their output DLL and LIB
files, and then linked these to the Rulex source code. The same is done for the source
code itself, by linking external libraries and linking code blocks or projects in the correct
order, based on their internal and external dependencies.
As for the Linux case, the tool used is CMake, a cross-platform build tool that
performs the same task as the Visual Studio 2019 GUI. The same logic of headers, libraries,
and targets is adopted, except that in the implementation a makefile is written, which
describes the files and dependencies required for linking.
Rulex on the Raspberry Pi can function in one of two modes: Standalone mode and
Client/Server mode, where a Personal Computer is used as a client operating the Rulex
GUI and the Raspberry Pi acts as a server or engine, where the machine learning training
and forecast take place. An SSH connection [12] is used for remote access to
Rulex on both local and public networks; when the Raspberry Pi is
accessed over a public network, a public IP is used.
After successfully establishing a secure connection to the Rulex Engine running on
the Raspberry Pi, the code needs to be modified and debugged so that it can operate on both
Windows 32 Bits and Linux 32 Bit/64 Bit. Remote development is therefore used to
compile both the dependency binary files and the output DLL and SO libraries.
While compiling, we found that some C/C++ variables are not compatible with 32
Bit hardware, so the code needs to be made redundant in order to compile on both systems
without any manual modification of the Rulex source code. This is done by diverting
the flow of the code to a path or snippet specific to the running OS, with the aid of
predefined macros. A simple example is the use of macros such as _WIN32, which is defined
on Windows targets (including 64-bit ones), and _WIN64, which is defined only on Windows
64 Bits. In other cases, we may need to detect ARM by using the __arm__ macro.
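
The macros above act at compile time in the C/C++ engine. Since the GUI is written in Python, a run-time analogue of the same idea may be sketched with the standard library, as below. This is an illustration and not part of the Rulex code base: the function names and the returned variant labels are hypothetical.

```python
import platform
import struct


def select_backend(system, machine, pointer_bits):
    """Map a platform description to a (hypothetical) build variant name."""
    is_arm = machine.lower().startswith(("arm", "aarch64"))
    if system == "Windows":
        # Check the 64-bit case first, as one would test _WIN64 before _WIN32.
        return "win64" if pointer_bits == 64 else "win32"
    if system == "Linux":
        arch = "arm" if is_arm else "x86"
        return f"linux-{arch}{pointer_bits}"
    raise NotImplementedError(f"unsupported platform: {system}/{machine}")


def current_platform():
    """Describe the running interpreter: OS name, CPU type, pointer width in bits."""
    return platform.system(), platform.machine(), struct.calcsize("P") * 8


print(current_platform())
```

For instance, select_backend("Linux", "armv7l", 32) returns "linux-arm32". The pointer width is taken from the interpreter itself rather than from the CPU name, which keeps the 32-bit and 64-bit cases distinct when a 32-bit build runs on 64-bit hardware.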

Rulex GUI is accessed through the source code and is debugged simultaneously for
both Python and C/C++. The Engine and Algorithms are written in C/C++ whereas the
GUI is written in Python. Through SSH, C/C++ debugging is performed either using
the X11 forwarding feature of SSH or in a Client/Server setting where a PC is the Client
and the Raspberry Pi is the Server. When debugging through the GUI, VSCode is used,
which is a code editor with Python and Remote Development support. Additionally,
a Docker-based PostgreSQL container [13] server is employed to act as the common
storage point between both Client and server nodes.
Rulex in Client/Server mode can lead to improved performance in terms of speed
and accessibility, since the Python part of the source code runs locally while the Rulex Engine
runs remotely on the Raspberry Pi. Rulex stores its processes in a database, which is
SQLite by default. This only works when Rulex is running on Windows;
for remote access, it is better to have a third-party server. To configure Rulex to use the
database server, we must specify the domain name or public IP, the type of the database, the
port number, the username, and the password, as well as the information for the application
server, which is the Raspberry Pi itself.
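
The connection parameters listed above can be grouped as in the following sketch. It does not reproduce the actual Rulex configuration format, which is not described in the text: the class, the field names, the host address (taken from a range reserved for documentation), the credentials and the database name are all placeholders. The port 5432 is the PostgreSQL default.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class DatabaseConfig:
    """Connection parameters named in the text; field names are illustrative."""
    db_type: str
    host: str       # domain name or public IP
    port: int
    username: str
    password: str

    def uri(self, database):
        """Build a URI of the form type://user:password@host:port/database."""
        user = quote(self.username, safe="")
        secret = quote(self.password, safe="")  # escape reserved characters
        return f"{self.db_type}://{user}:{secret}@{self.host}:{self.port}/{database}"


cfg = DatabaseConfig("postgresql", "203.0.113.10", 5432, "rulex", "p@ss word")
print(cfg.uri("rulex_db"))
```

Percent-encoding the username and password avoids ambiguity when they contain characters such as "@" or spaces, which would otherwise be read as URI delimiters.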

4 Forecast Results
After porting Rulex to Windows 32 Bits as a Client and to the Raspberry Pi as an application
server, we tested it on a dataset related to Pedestrian and Vehicle Classification, built from
measurements taken by a 24 GHz short-range radar based on the Infineon BGT24MTR11
RF transceiver [14]. Namely, we used as our sensor a Distance2Go development kit
from Infineon itself [15].

Fig. 2. Rulex processing blocks: consisting of excel1 which imports data, dataman1 for viewing
and data mining, split1 for splitting dataset, lm1 which performs the machine learning algorithm,
app1 to apply model, and finally confmatrix1 and featrank1 which visually display results.

For our application, we defined four classes: a Human class and three vehicle classes,
Car, Truck, and Motorcycle. The dataset consists of 120 equally distributed records.
The data for the radar classification application were recorded by a second, standalone
system dedicated to feature extraction. This second system, used for
the collection of the data, consists of three main parts: a 24 GHz radar board, another

Raspberry Pi 3B+, and a PC running MATLAB. The second Raspberry Pi at this stage
was only used as an interface to provide the wireless communication between MATLAB
and the radar board. The radar is connected to the Raspberry Pi via a USB connection. A
MATLAB script initiates the data acquisition phase by sending a command over a Wi-Fi network
to the Raspberry Pi, which forwards it to the radar board through its USB port. The Raspberry
Pi then collects the radar data through the USB port and sends them back over Wi-Fi to MATLAB,
which extracts the features from the radar measurements.
The feature vector formed from the collected data consists of 10 features. The
description of the features is as follows:

1. R: The spread in the range-FFT spectrum caused by the target.
2. R1: Variance of the range-FFT spectrum.
3. R2: Standard deviation of the range-FFT spectrum.
4. R3: Average of the range-FFT spectrum.
5. V: The spread in the Doppler-FFT spectrum caused by the target movement.
6. V1: Variance of the Doppler-FFT spectrum.
7. V2: Standard deviation of the Doppler-FFT spectrum.
8. V3: Average of the Doppler-FFT spectrum.
9. RCS: Radar cross-section, which gives a measure for the reflectivity of the target.
10. Vest: The estimated speed of the target.

Fig. 3. Training forecast

Fig. 4. Testing forecast

After feature extraction on the second system, the target system, which consists of
Rulex running on a Raspberry Pi accessed remotely, is used for classification. This
classification system is more general than the system dedicated to feature extraction, since
the latter is confined to radar classification, while the former can be used for a wide range
of ML applications. The arrangement in Rulex running on the Raspberry Pi on the
target system is shown in Fig. 2, where data processing blocks are followed by an
LLM and, finally, a confusion matrix. It also includes a splitting task, which randomly
splits the data, using 65% of them for training and the remaining 35% for testing the model.
The results of these simulations are reported in confusion matrix format in
Figs. 3 and 4. In the training forecast in Fig. 3, Cars and Humans are classified with a
rate of 100%, while Motorcycles and Trucks reach 84.2% and 89.5%, respectively.
In the testing forecast in Fig. 4, which is the validation and gives the more critical results,
Cars are forecast with a rate of 73.3%, Humans with 100%, Trucks with 72.7%, and Motorcycles
with 90.9%.
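
As a small worked example, the sizes of the two subsets follow directly from the 120 records and the 65%/35% split, and a per-class rate is the share of records of a class that are forecast correctly. The sketch below assumes this definition of the rate; the function names are illustrative and the toy confusion matrix contains hypothetical counts, not the data of Figs. 3 and 4.

```python
def split_sizes(total, train_fraction):
    """Number of training and testing records for a given split fraction."""
    n_train = round(total * train_fraction)
    return n_train, total - n_train


def per_class_rates(confusion):
    """Percentage of correctly forecast records for each true class.

    confusion[true_label][predicted_label] holds record counts.
    """
    rates = {}
    for true_label, row in confusion.items():
        total = sum(row.values())
        correct = row.get(true_label, 0)
        rates[true_label] = 100.0 * correct / total if total else 0.0
    return rates


print(split_sizes(120, 0.65))  # 120 records, 65% used for training

# Hypothetical counts for illustration only, not the paper's results.
toy = {
    "Human": {"Human": 10, "Car": 0},
    "Car": {"Human": 1, "Car": 3},
}
print(per_class_rates(toy))
```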

5 Conclusion
Rulex, a machine learning software package that runs on Windows 64 Bits, has been ported to
work on the Raspberry Pi for Edge Computing applications. Rulex now operates in a
Client/Server setup, with the interface running on a PC as a Windows 32 Bits
application and the machine learning algorithms running on the Raspberry Pi ARM
32-Bit processor. A dataset related to pedestrian and vehicle classification using
a high-frequency radar has been simulated, and the results show promising forecasts, which
are applied through the Client/Server interface.

References
1. Muselli, M.: Extracting knowledge from biomedical data through Logic Learning Machines
and Rulex. EMBnet J. 18(B), 56–58 (2012)
2. Low, Y., et al.: Distributed GraphLab: a framework for machine learning in the cloud. arXiv
preprint arXiv:1204.6078 (2012)
3. Zhang, X., Wang, Y., Shi, W.: pCAMP: performance comparison of machine learning pack-
ages on the edges. In: USENIX Workshop on Hot Topics in Edge Computing (HotEdge 2018)
(2018)
4. Vujović, V., Maksimović, M.: Raspberry Pi as a wireless sensor node: performances and
constraints. In: 2014 37th International Convention on Information and Communication
Technology, Electronics and Microelectronics (MIPRO). IEEE (2014)
5. Cui, W., Kim, Y., Rosing, T.S.: Cross-platform machine learning characterization for task
allocation in IoT ecosystems. In: 2017 IEEE 7th Annual Computing and Communication
Workshop and Conference (CCWC). IEEE (2017)
6. Hajdarevic, K., Konjicija, S., Subasi, A.: A low energy APRS-IS client-server infrastructure
implementation using Raspberry Pi. In: 22nd Telecommunications Forum (TELFOR). IEEE
(2014)
7. Maksimović, M., et al.: Raspberry Pi as Internet of Things hardware: performances and
constraints. Des. Issues 3(8) (2014)
8. Muselli, M.: Switching neural networks: a new connectionist model for classification. In:
Neural Nets, pp. 23–30. Springer, Heidelberg (2005)
9. Muselli, M., Quarati, A.: Reconstructing positive boolean functions with shadow clustering.
In: 2005 Proceedings of the 2005 European Conference on Circuit Theory and Design, vol.
3. IEEE (2005)
10. Fober, D., Orlarey, Y., Letz, S.: Building faust with CMake (2018)
11. Clemencic, M., Mato, P.: A CMake-based build and configuration framework. In: Journal of
Physics: Conference Series, vol. 396. no. 5. IOP Publishing (2012)
12. Allen, D.R.: Eleven SSH tricks. Linux J. 2003(112), 5 (2003)
13. Bellavista, P., Zanni, A.: Feasibility of fog computing deployment based on docker container-
ization over Raspberry Pi. In: Proceedings of the 18th International Conference on Distributed
Computing and Networking (2017)
14. https://www.infineon.com/cms/en/product/sensor/radar-image-sensors/radar-sensors/radar-sensors-for-consumer-and-iot/bgt24mtr11/
15. https://www.infineon.com/cms/en/product/evaluation-boards/demo-distance2go/#!documents
High Voltage Isolated Bidirectional
Network Interface for SoC-FPGA
Based Devices
A Case Study: Application to Micro-pattern Gaseous
Detectors

Luis Guillermo García1,2,3(B), Maria Liz Crespo1,3, Sergio Carrato2,
Andres Cicuttin1,3, Werner Florian1,3, Romina Molina1,2,3, Bruno Valinoti1,2,3,
and Stefano Levorato3
1 MLAB, The Abdus Salam International Centre for Theoretical Physics,
Strada Costiera, 11, 34151 Trieste, TS, Italy
{lgarcia1,mcrespo,cicuttin,wflorian,rmolina,bvalinot}@ictp.it
2 Dipartimento di Ingegneria e Architettura, Università degli Studi di Trieste,
v. Valerio 6/1, 34127 Trieste, TS, Italy
carrato@units.it
3 Sezione di Trieste, Istituto Nazionale di Fisica Nucleare,
Via Valerio 2, 34127 Trieste, TS, Italy

Abstract. This paper describes a custom-made high voltage isolated
bidirectional network interface for communication among FPGA devices
located in different power domains. Preliminary performance tests and
measurements of noise tolerance and stability are presented. A case study
of an application to a network of multiple single-channel power
supply systems for Micro-Pattern Gaseous Detectors is described. To
match the specific system needs of dynamic voltage control,
the network interface provides reliable high voltage decoupling up to
2 kV with reasonable noise tolerance and a data transmission rate up to
100 Mbps. The flexibility of the interface allows the implementation of
different communication protocols.

Keywords: FPGA · System on chip · High voltage isolation · Networks

1 Introduction
High voltage power supplies are often needed in high energy particle detectors.
Micro-Pattern Gaseous Detectors (MPGDs), namely Gaseous Electron Multipli-
ers (GEMs), need different voltages up to several kV, according to the technology
used and the gas selected, for biasing the device electrodes. In case of large area
applications, segmentation of the electrodes is needed to reduce the accumu-
lated energy. This approach implies the increase of the number of independent
high voltage (HV) channels. A single-channel high voltage power supply system
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 280–285, 2021.
https://doi.org/10.1007/978-3-030-66729-0_34

(HVPSS) based on a DC/DC converter coupled to a high resolution ammeter,
controlled by an FPGA-based System-on-Chip (SoC), has been developed for the study
and monitoring of electrical discharges in MPGDs, and for intelligent dynamic
voltage control. Each channel can supply up to 2 kV and provides precise
current monitoring with 10 pA resolution up to a measuring frequency of
100 kHz, as well as electrical discharge detection with a timestamp precision of
2 ns thanks to a 500 MHz analog-to-digital converter (ADC). The system can also
adjust the HV supply depending on information from the neighboring channels [5].
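The 2 ns timestamp precision quoted above corresponds to the sampling period of a 500 MHz converter. A minimal sketch of that arithmetic follows; the function name is illustrative and does not come from the paper.

```python
def sample_period_ns(sample_rate_hz: float) -> float:
    """Return the sampling period in nanoseconds for a given sample rate."""
    return 1e9 / sample_rate_hz


# 500 MHz is the ADC rate stated in the text.
adc_rate_hz = 500e6
print(sample_period_ns(adc_rate_hz))  # 2.0
```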
To avoid electronic noise propagation in the detector, each channel is required
to be connected in the HV power domain with independent floating power
supplies. The voltage difference among channels may be up to 2 kV, making a direct
galvanic connection among them impossible. Wireless communication is not
recommended, to prevent electromagnetic noise induction in the detector;
consequently, a High Voltage Isolated Bidirectional Network Interface (HVIBNI) for
safe data transmission among multiple HVPSS control units was designed.

1.1 State of the Art

Traditional circuit isolators, while effective, may not be suitable for the needs of
the detector. Sustained HV DC decoupling is needed for prolonged times, and
high speed transmission is preferred. Optical decoupling has several benefits, such
as immunity to perturbing electromagnetic fields; however, achievable bitrates are
typically limited [2]. For example, among the fastest optocouplers
are the HCPL-0723 and the TLP2767, which can achieve 50 Mbps and up to 5000 Vrms
decoupling for one minute, but their sustained DC voltage decoupling is only 500 V [1].
Inductive coupling has the advantage of differential transfer characteristics and
high common mode noise rejection, but its main problem is the susceptibility
to fast transient variations produced by spontaneous high voltage discharges [3].
Reinforced galvanic isolation can provide high data rates with a high isolation
rating for DC voltage differences, as well as transient surge protection
up to 20 kV [4]. Among the commercial options available, the Texas Instruments
ISO78xx family of reinforced isolated buffers provides working high voltage DC
isolation up to 2828 VDC using a silicon dioxide (SiO2) barrier, 5700 Vrms for one
minute, and maximum surge isolation up to 8000 V [3].

2 System Description

The HVPSS is based on a Xilinx FPGA-SoC Zynq 7030 attached to an 8-bit
500 MHz high speed ADC (HSADC) and an ammeter board with pico-ampere
range (PAMP BOARD). A dedicated printed circuit board (PCB) was designed
to connect an HV decoupling network system to the FPGA through an FMC
connector. The HVPSS is situated in its own power domain with a floating
ground. The decoupled network bus is set in the low voltage (LV) power domain
referred to the global ground and interconnects each channel through shielded

Fig. 1. High voltage power supply system based on FPGA-SoC

Ethernet cables via two RJ45 jacks. An interconnection diagram and a
single-channel HVPSS setup are shown in Fig. 1.
The decoupling system consists of four LVDS isolated input/output (HVIO)
buffers and is designed for bidirectional serial communication (Fig. 2). In
particular, each HVIO buffer is based on an ISO7821LLDW that has two LVDS
buffers (receiver and transmitter), with a maximum signal transmission rate
of 100 Mbps [7]. A loop-back configuration was set up to provide bidirectional
signaling. Each power domain has an input and an output buffer controlled by
two complementary enable signals. The FPGA is connected to the HV power
domain with a local ground; this side is in charge of handling the input/output
direction of the buffer. The direction of the transmission is set using an inverter
gate connected to an ISO7840FDWW [6] buffer for decoupling. The low voltage
domain has no control over the direction and is always referred to the global
ground.

Fig. 2. Bidirectional HVIO buffer configuration

2.1 Direction Switching Characteristics


The time required for the enable signal to reach the output buffer to make it
change from high-impedance to digital output is called enable propagation delay.

The ISO7821LLDWW has different enable propagation delays depending on the
logical level present in the input buffer. This characteristic has a direct impact on
the timing between input and output configuration. For a logical high input in
the buffer, the output enabling time is 20 ns, whereas it is 2.5 µs for a logical low
input [7]. The timing analysis shown in Fig. 3 describes the change of direction between
the HV and LV input/outputs. The FPGA handles the HVIO buffer through an
internal bidirectional differential IO. The enable signal coming from the FPGA
takes 38 ns to move from HV to LV through the ISO7840 and the discrete
inverter gate. For an initial high input, the delay time from the enable is 83 ns
(Fig. 3a). The change of direction starts when the FPGA puts its IO buffer in
high impedance (input mode) and raises the HV Enable signal. The output in the
LV domain will be set to zero after 25 ns, before the enable signal reaches the LV
domain. This implies that there is a period of time when both enables are set
high, causing an undetermined value in the HV IO (Fig. 3a). Indeed, when the
enable signal reaches the LV domain, it takes 20 ns to disable the output buffer;
during this time the LV IO will have an undetermined logical value. After this
time, the HVIO buffer is ready to send the first logical value (D0) to the HV
domain, with a latency of 25 ns.

Fig. 3. Output to input transition diagram

For an initial low value, the overall transition time is 2.525 µs due
to the slow enable propagation delay of the high-impedance-to-low output on the
ISO7821LLDWW. This leads to a relatively long-lasting undetermined value
during the switching after the initial 20 ns. The timing diagram is shown in
Fig. 3b.
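The switching figures quoted in this section can be cross-checked with a small timing budget. The sketch below is not taken from the paper: the way the 83 ns and 2.525 µs totals are decomposed into individual delays is an assumption that happens to reproduce the quoted values, and the conversion into lost bit periods at 100 Mbps is added for illustration only.

```python
import math

# Delays quoted in the text, in nanoseconds.
ENABLE_PATH_NS = 38        # HV enable -> LV domain (ISO7840 + inverter gate)
DISABLE_LV_OUTPUT_NS = 20  # time to disable the LV output buffer
DATA_LATENCY_NS = 25       # latency of the first bit (D0) towards the HV domain
ENABLE_FROM_LOW_NS = 2500  # output enable time for a logical low input (2.5 us)


def high_input_switch_ns() -> int:
    """Assumed decomposition of the 83 ns figure (initial high input)."""
    return ENABLE_PATH_NS + DISABLE_LV_OUTPUT_NS + DATA_LATENCY_NS


def low_input_switch_ns() -> int:
    """Assumed decomposition of the 2.525 us figure (initial low input)."""
    return ENABLE_FROM_LOW_NS + DATA_LATENCY_NS


def bit_periods_lost(switch_ns: float, bitrate_mbps: float) -> int:
    """Number of whole bit periods covered by one direction switch."""
    bit_period_ns = 1000.0 / bitrate_mbps
    return math.ceil(switch_ns / bit_period_ns)


print(high_input_switch_ns())                       # 83
print(low_input_switch_ns())                        # 2525
print(bit_periods_lost(high_input_switch_ns(), 100))  # 9
print(bit_periods_lost(low_input_switch_ns(), 100))   # 253
```

Under these assumptions, a switch that starts from a low level costs roughly thirty times more bit periods than one that starts from a high level.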

2.2 Tests and Measurements

The electrical characterization of the HVIO buffer was made by performing
transmission measurements at different rates and plotting eye diagrams (Fig. 4). The
eye openings range from 640 mV at 10 Mbps (Fig. 4a) to 500 mV at 100 Mbps
(Fig. 4c). Relatively low jitter is also visible in all the plots, with a maximum
value of 1.8 ns. The input differential voltage range of the ISO7821LLDWW
goes from 100 mV to 600 mV, in compliance with the LVDS standard. Impedance
mismatching can lead to ripple noise due to multiple reflections; this is
noticeable in particular in Fig. 4b. In multiple-unit connections operating at high

frequencies, further impedance matching should be considered to prevent or
minimize disturbing reflections. This can be tuned by changing the resistor scheme
on the PCB. Shielded cables should be considered to prevent electromagnetically
induced noise in a multi-node network.
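To put the measured jitter in perspective, it can be expressed as a fraction of the bit period (unit interval). The sketch below is illustrative only: it applies the maximum jitter of 1.8 ns reported above to both ends of the tested rate range, which is a worst-case assumption, since the text does not state at which rate the maximum was observed.

```python
MAX_JITTER_NS = 1.8  # maximum jitter reported for all eye diagrams


def unit_interval_ns(bitrate_mbps: float) -> float:
    """Duration of one bit in nanoseconds."""
    return 1000.0 / bitrate_mbps


def jitter_fraction(jitter_ns: float, bitrate_mbps: float) -> float:
    """Jitter expressed as a fraction of the unit interval."""
    return jitter_ns / unit_interval_ns(bitrate_mbps)


for rate_mbps in (10, 100):
    ui = unit_interval_ns(rate_mbps)
    frac = jitter_fraction(MAX_JITTER_NS, rate_mbps)
    print(f"{rate_mbps} Mbps: UI = {ui:.0f} ns, jitter = {frac:.1%} of UI")
```

At the maximum rate of 100 Mbps the unit interval is 10 ns, so the worst-case jitter stays below one fifth of a bit period.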

Fig. 4. Eye diagram plots of the output signal at different frequencies without load

3 Conclusions and Work in Progress


The HVIBNI was designed to provide a reliable high-performance physical
interface to interconnect multiple devices in different HV power domains. The
versatility of the proposed interface allows different interconnection schemes and
network topologies. The HVIBNI is recommended for long data streams among
devices; the FPGA design should keep a high value at
the output to ensure a fast direction switch. It is possible to interconnect multiple
HVPSS units using a single line topology (Fig. 1).
A custom-made Time Division Multiplexing (TDM) network is currently
under design to allow the exchange of data and configuration packets between
all the units. A control PC should access the network through any of the units
to handle configuration, data readout and status of every HVPSS channel. The
HVIBNI can be used with different communication methods such as SPI or
PWM, and in industrial applications involving high voltage decoupling.

References
1. Broadcom: HCPL-7723/0723, av02-0643en edn, September 2017
2. Fandrich, C.L.: An on-chip transformer-based digital isolator system. Master’s the-
sis, University of Tennessee, Knoxville (2013)
3. Gingerich, K., Sterizik, C.: The ISO72x family of high-speed digital isolators. Appli-
cation report, Texas Instruments, January 2018
4. Ragonese, E., Spina, N., Parisi, A., Palmisano, G.: Reinforced galvanic isolation:
integrated approaches to go beyond 20-kV surge voltage (invited). In: Saponara, S.,
De Gloria, A. (eds.) Applications in Electronics Pervading Industry, Environment
and Society, pp. 277–283. Springer, Cham (2020)

5. Carrato, S., et al.: A scalable high voltage power supply system with system on
chip control for micro pattern gaseous detectors. NIM-A: Accel. Spectrom. Detect.
Assoc. Equip. 963, 163763 (2020). http://www.sciencedirect.com/science/article/
pii/S0168900220303016
6. Texas Instruments: ISO7840x High-Performance, 8000-VPK Reinforced Quad-
Channel Digital Isolator, sllsen2b edn, July 2015. Revision B
7. Texas Instruments: ISO782xLL High-Performance, 8000-VPK Reinforced Isolated
Dual-LVDS Buffer, sllset8a edn, August 2016. Revision A
A Comparison of Objective and Subjective Sleep
Quality Measurement in a Group of Elderly
Persons in a Home Environment

Maksym Gaiduk1,2(B) , Ralf Seepold1,3 , Natividad Martínez Madrid3,4 ,


Juan Antonio Ortega2 , Massimo Conti5 , Simone Orcioni5 , Thomas Penzel6,7 ,
Wilhelm Daniel Scherz1,2 , Juan José Perea3 , Ángel Serrano Alarcón4 ,
and Gerald Weiss8
1 HTWG Konstanz, Alfred-Wachtel-Str. 8, 78462 Konstanz, Germany
{maksym.gaiduk,ralf.seepold,wscherz}@htwg-konstanz.de
2 University of Seville, Avda. Reina Mercedes s/n, Seville, Spain
jortega@us.es
3 I.M. Sechenov First Moscow State Medical University, Moscow, Russian Federation
jperea@us.es
4 Reutlingen University, Alteburgstr. 150, 72762 Reutlingen, Germany
{natividad.martinez,
angel.serrano_alarcon}@reutlingen-university.de
5 Università Politecnica delle Marche, via brecce bianche 12, 60131 Ancona, Italy
{m.conti,s.orcioni}@univpm.it
6 Charité Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany
thomas.penzel@charite.de
7 Saratov State University, Saratov, Russian Federation
8 AWO Kreisverband Schwarzwald-Baar e.V., Klinikstraße 3,
78052 Villingen-Schwenningen, Germany
G.Weiss@awo-vs.de

Abstract. The main aim of the research presented in this manuscript is to compare the
results of objective and subjective measurement of sleep quality for older adults
(65+) in the home environment. A total of 73 nights were evaluated in this
study. A device placed under the mattress was used to obtain objective measurement
data, and a common question on perceived sleep quality was asked to collect the
subjective sleep quality level. The achieved results confirm the correlation between
objective and subjective measurement of sleep quality, with an average standard
deviation of 2 out of 10 possible quality points.

1 Introduction
Sleep is an essential part of our life. It is necessary for restoring our physical and
mental health [1]. However, sleep duration alone is not
enough for the complete recuperation of our body and brain – sleep quality is another
significant factor [2].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 286–291, 2021.
https://doi.org/10.1007/978-3-030-66729-0_35

Measurement of sleep quality can, in general, be separated into two main classes:

• Objective – with the help of devices that collect bio-vital data.
• Subjective – the information is reported using questionnaires [3].

Both groups of measurement approaches mentioned above comprise different methods.
For example, polysomnography or actigraphy can be used for objective measurement.
A sleep diary or a PSQI questionnaire may provide the data for the subjective
measurement of sleep quality. Other approaches also exist and are
described in several scientific publications [4–6].
The primary motivation of the research in this manuscript is to compare the objective and
subjective measurement of sleep quality, to understand whether persons are more
likely to under- or overestimate the quality of their sleep, and to check whether
subjective measurement can be avoided in the case of long-term use of technologies for
objective measurement.

2 State of the Art


Sleep quality in general, and among the elderly in particular, is the topic of a large
body of scientific research. In the following, only a small selection of relevant
approaches is presented.
General information about the measurement of sleep quality and the comparison of
different approaches is the central topic of [4]. It combines the description of different
methods of measurement with the challenges of sleep quality determination.
An investigation of age-related differences in self-reported sleep quality in relation
to health outcomes is presented in [1]. It concluded that there is a
correlation between better self-reported sleep and better health outcomes, especially for
mental health. Furthermore, a decrease in sleep quality across the lifetime was reported,
particularly regarding sleep efficiency.
In [6], subjective and objective sleep analysis, including the measurement
of sleep quality, are compared. A total of 17 persons (14 females and three
males) participated in this study. A correlation between objective and subjective
determination of sleep quality was reported, though it was small (r = 0.270, p < 0.01).
Another study on the comparison of subjective and objective measurement of sleep
quality, but with a focus on the age group 55+, is presented in [5]. It found
a significant difference between these two types of measurement, at least for the
age group mentioned above. The final recommendation was to use both subjective
and objective methods of measurement to reach a complete conclusion on sleep quality.
Another system for the recognition of sleep/wake states, which can be used to calculate
sleep quality, is presented in [7].
It is essential not only to measure sleep quality in the group of older adults but also to
understand what can affect it. This topic was discussed in [8], where the correlation
between subjectively measured sleep quality and physical activity was determined. This
result confirms that subjective measurement of sleep quality can be correlated with
lifestyle.

3 Methodology
Our study ran over two weeks with ten subjects; the group consisted of elderly
persons (65+). Each participant stayed in his or her private home to maintain a routine
environment. On the first day, the necessary devices were installed and explained.
Additionally, the questionnaires were explained. On the last day, the devices were picked
up, and the final survey was filled out with the participant. During the study, a technical
solution for sleep monitoring (EmFit [9]) was provided to each participant, to be used
over the test period. This device is based on the ballistocardiography principle; it is
contact-free, autonomous and automatic, and can be placed under a bed mattress. This
allowed sleep quality to be measured objectively during the two weeks. The device can
measure heart rate [10] and heart rate variability [11], and monitor breathing and sleep
(incl. sleep staging) [12]. Besides two short daily questionnaires (including a sleep diary
and a general question on perceived sleep quality), there was a more detailed sleep
questionnaire (PSQI) at the end of the two weeks.
For the subjective documentation of sleep quality, a graphical representation
of this question was used (see Fig. 1), on which the participant marked the perceived sleep
quality for the last night. For the evaluation, the scale was graduated into 10 sections, so
that we could obtain a value between 1 and 10.

Fig. 1. Graphical representation of the sleep quality question

As mentioned above, the EmFit device was used for the objective measurement of
sleep quality (sleep score). This device calculates the value of sleep quality according
to the following Formula 1 [6]:

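\[
\text{SleepScore} =
\bigl(\text{total\_duration\_of\_sleep}
+ (\text{duration\_of\_REM\_sleep} \cdot 0.5)
+ (\text{duration\_of\_DEEP\_sleep} \cdot 1.5)\bigr)
- 8.5 \cdot \left(0.5 \cdot \frac{\text{sleep\_class\_awake\_duration}}{3600}
+ \frac{\text{number\_of\_wakenings}}{15}\right)
\tag{1}
\]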

The maximal value of this objective measurement is 100 points. For the evaluation,
we divided this number by 10, keeping one position after the decimal point, to get
the same range of values as for the subjective measurement.
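The rescaling step and the per-night comparison used in the evaluation (subjective value minus objective value) can be sketched as follows. The night records below are hypothetical and serve only to illustrate the computation; they are not data from the study. The use of the sample standard deviation and sample variance is also an assumption, since the text does not state which estimator was applied.

```python
from statistics import mean, median, stdev, variance


def scale_objective(sleep_score_100: float) -> float:
    """Map the device score (0-100) to the 0-10 range with one decimal place."""
    return round(sleep_score_100 / 10, 1)


# Hypothetical nights of one participant:
# (subjective value on the 1-10 scale, device sleep score on the 0-100 scale)
nights = [(6, 78.0), (7, 82.0), (5, 74.0), (8, 71.0)]

# Negative differences mean the participant rated the night lower than the device.
differences = [round(subj - scale_objective(obj), 1) for subj, obj in nights]

print(differences)
print("median:", median(differences))
print("mean:", round(mean(differences), 3))
print("standard deviation:", round(stdev(differences), 3))
print("variance:", round(variance(differences), 3))
```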
Ethical officers of HTWG Konstanz and the University of Applied Sciences Kempten
approved the experiment design.

After the study was carried out, the objective and the subjective measurements were
collected and evaluated. A total of 140 nights were recorded. Due to missing
data from the technical device or questionnaires not filled out on some days,
only the records of 73 nights are available for the evaluation. The outcome is presented
in the section ‘Results’.

4 Results

All sleep quality data collected with the objective (sleep tracking device) and subjective
(questionnaire) approaches were used for the evaluation. As mentioned before, some night
records were excluded from the evaluation because parts were missing.
Table 1 presents the calculated statistical values of the differences between subjective
and objective measurement ('subjective measurement - objective measurement') for each
subject. Negative values of the median and mean indicate that a
subject underestimates the quality of sleep. It is recognizable that 8 of 10 subjects
(highlighted in light red in the table) feel that their sleep was of lower quality than
measured by the sleep tracking device. For 8 out of 10 subjects, the standard
deviation is low and stays in the range between 1.2 and 2.1; the average standard deviation
is equal to 2. According to the '68-95-99.7 rule' [13], 68% of all values will not differ
by more than ±2, and 95% of all values will not differ by more than ±4, which means that
there is a strong correlation between objective and subjective measurement.

Table 1. Statistical values of sleep quality subjective-objective measurement.

                     Subject 1   Subject 2   Subject 3   Subject 4   Subject 5   Subject 6   Subject 7   Subject 8   Subject 9  Subject 10
Median                    -1.8        -2.3        -5.3         2.5        -1.0        -1.6        -0.5        -1.8        -3.9         0.8
Mean                      -1.5        -1.5        -4.2         2.8        -1.3        -1.4        -1.3        -1.9        -3.2         0.8
Standard deviation         2.1         2.0         2.5         1.2         1.6         1.5         2.0         1.8         3.2         1.9
Variance                   4.6         3.9         6.2         1.4         2.6         2.1         4.0         3.1        10.2         3.5
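The summary statements about Table 1 can be reproduced directly from the tabulated values. The following sketch uses only the numbers printed in the table; the variable names are illustrative.

```python
from statistics import mean

# Values from Table 1, Subjects 1 to 10 (subjective minus objective measurement).
median_diff = [-1.8, -2.3, -5.3, 2.5, -1.0, -1.6, -0.5, -1.8, -3.9, 0.8]
mean_diff = [-1.5, -1.5, -4.2, 2.8, -1.3, -1.4, -1.3, -1.9, -3.2, 0.8]
std_dev = [2.1, 2.0, 2.5, 1.2, 1.6, 1.5, 2.0, 1.8, 3.2, 1.9]

# Subjects whose median and mean are both negative underestimate their sleep quality.
underestimating = sum(
    1 for md, mn in zip(median_diff, mean_diff) if md < 0 and mn < 0
)

# Subjects whose standard deviation lies in the range 1.2 to 2.1.
low_sd = sum(1 for s in std_dev if 1.2 <= s <= 2.1)

avg_sd = mean(std_dev)

print(underestimating)   # 8
print(low_sd)            # 8
print(round(avg_sd, 2))  # 1.98
# One and two standard deviations, rounded, give the +/-2 and +/-4 bounds in the text.
print(round(avg_sd), round(2 * avg_sd))  # 2 4
```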

The graphical representation of the calculated values in Fig. 2 shows that the
standard deviation is stable across subjects.
Although the calculated statistical values for most of the subjects are similar, there
are some outliers, which could be partially explained by incomplete night records.
If, out of the 14 night records for a subject, only seven or
fewer are complete (both subjective and objective measurements are available), these
records have a high variance. The average variance will then also be high, whereas with all 14
complete records it could be lower, because the outliers would be less significant with
a larger amount of analyzed data. This is one of the reasons why future research
will focus on obtaining a larger number of complete records to decrease the
influence of outliers.

Fig. 2. Median, standard deviation, variance, and mean values

5 Conclusion
The research presented in this manuscript has confirmed a strong correlation between
subjective and objective measurement of sleep quality. This knowledge is important for
assessing the possible substitution of subjective measurement by the information obtained
with devices collecting bio-vital data (objective measurement) alone. This substitution could
be necessary for the development of AAL systems with sleep quality measurement as
one of their parts.
Our group is planning to develop a prototype of an AAL system for the home
environment with several devices working in the area of health monitoring. The results
described in this study provide necessary information for this future
development. Furthermore, a new study with a higher number of participants of the
same age group is currently being planned.

Acknowledgments. This research was partially funded by the EU Interreg V-Program
“Alpenrhein-Bodensee-Hochrhein”: Project “IBH Living Lab Active and Assisted Living”, grants
ABH040, ABH041 and ABH066. Thomas Penzel has been partially funded by RF Government
grant Nº 075-15-2019-1885. We also thank the caregivers of the Workers’ Welfare Association
(AWO) Schwarzwald-Baar e.V. for their contribution in conducting the study.

References
1. Gadie, A., Shafto, M., Leng, Y. et al.: How are age-related differences in sleep quality associ-
ated with health outcomes? An epidemiological investigation in a UK cohort of 2406 adults.
BMJ Open (2017)

2. National Heart Lung and Blood Institute (NHLBI): Your Guide to Healthy Sleep, NIH
Publication No. 11–5271. National Heart, Lung, and Blood Institute, Bethesda (2011)
3. Unruh, M.L., Redline, S., An, M.-W., Buysse, D.J., Nieto, F.J., Yeh, J.-L., Newman, A.B.:
Subjective and objective sleep quality and aging in the sleep heart health study. J. Am. Geriatr.
Soc. 56, 1218–1227 (2008). https://doi.org/10.1111/j.1532-5415.2008.01755.x
4. Krystal, A.D., Edinger, J.D.: Measuring sleep quality. Sleep Med. 9(Supplement 1), 10–17
(2008). ISSN 1389-9457
5. Landry, G.J., Best, J.R., Liu-Ambrose, T.: Measuring sleep quality in older adults: a
comparison using subjective and objective methods. Front. Aging Neurosci. 7, 166 (2015)
6. Merilahti, J., Saarinen, A., Päkkä, J., Antila, K., Mattila, E., Korhonen, I.: Long-term sub-
jective and objective sleep analysis of total sleep time and sleep quality in real life set-
tings. In: Proceedings of the 29th Annual International Conference of the IEEE EMBS, Cité
Internationale, Lyon, France (2007)
7. Gaiduk, M., Seepold, R., Penzel, T., Ortega, J.A., Glos, M., Martínez Madrid, N.: Recognition
of sleep/wake states analyzing heart rate, breathing and movement signals. In: 2019 41st
Annual International Conference of the IEEE Engineering in Medicine and Biology Society
(EMBC 2019), pp. 5712–5715 (2019). https://doi.org/10.1109/EMBC.2019.8857596
8. Štefan, L., Vrgoč, G., Rupčić, T., Sporiš, G., Sekulić, D.: Sleep duration and sleep quality are
associated with physical activity in elderly people living in nursing homes. Int. J. Environ.
Res. Public Health 15(11), 2512 (2018)
9. Kortelainen, J.M., van Gils, M., Päkkä, J.: Multichannel bed pressure sensor for sleep
monitoring. Comput. Cardiol. 39, 313–316 (2012)
10. Brüser, C., Kortelainen, J.M., Winter, S., Tenhunen, M., Pääkkää, J., Leonhardt, S.: Improve-
ment of force-sensor-based heart rate estimation using multi-channel data fusion. IEEE J.
Biomed. Health Inform. 19(1), 227–235 (2014)
11. Tenhunen, M., Hyttinen, J., Lipponen, J.A., Virkkala, J., Kuusimäki, S., Tarvainen, M.P.,
Karjalainen, P.A., Himanen, S.L.: Heart rate variability evaluation of Emfit sleep mattress
breathing categories in NREM sleep. Clin. Neurophysiol. (2014). pii: S1388–2457(14)00468–
4
12. Kortelainen, J.M., Mendez, M.O., Bianchi, A.M., Matteucci, M., Cerutti, S.: Sleep staging
based on signals acquired through bed sensor. IEEE Trans. Inf. Technol. Biomed. 14(3),
776–785 (2010)
13. Pukelsheim, F.: The three sigma rule. Am. Stat. 48(2), 88–91 (1994). https://doi.org/10.2307/
2684253
A Preliminary Study on Aerosol Jet-Printed
Stretchable Dry Electrode for Electromyography

M. Borghetti, Tiziano Fapanni(B) , N. F. Lopomo, E. Sardini, and M. Serpelloni

Information Engineering Department, University of Brescia, Via Branze 38, 25123 Brescia, Italy
{michela.borghetti,t.fapanni,nicola.lopomo,emilio.sardini,
mauro.serpelloni}@unibs.it

Abstract. In the last decade, the measurement of physiological signals has
attracted great interest for both health-care and Industry 4.0 applications. In
this context, the electrodes used during signal acquisition play a key role. In
this work, we propose the development of dry electrodes for electromyography
(EMG), specifically exploiting the possibility given by aerosol jet printing to
realize electrical pads and interconnections on stretchable substrates. We investigated
the materials and geometry of the sensors, characterizing their electromechanical
properties at rest and under stretching. Finally, a set of prototype electrodes for
EMG was designed, produced and evaluated by comparing them with commercial
ones. The described approach proved to be feasible and promising for different
applications, considering both health monitoring and human-machine interfaces.

1 Introduction

The measurement of biopotential signals is a common method used to non-invasively
track physiological processes [1]. For instance, electromyographic (EMG) acquisition
provides information about muscular activity, addressing both health monitoring
and human-machine interfaces [2]. The characteristics of the electrodes used in
surface recording heavily affect the signal quality and the reliability of their applications.
Nowadays, silver/silver chloride (Ag/AgCl) wet electrodes are the gold standard [1].
The conductive gel reduces the contact impedance and acts as an electrolyte for current
flow, but it dries with use, degrading the performance, and may lead to skin
irritation or allergic reactions, preventing long-term application [3]. To overcome these
limits and reduce motion artifacts, dry flexible electrodes have been introduced [4]. In
stretchable electrodes, the interfacing and contact impedance improve due to their better
adherence and conformability during motion. Stretchable electrodes can be fabricated
through both additive and subtractive approaches; the former are a good
compromise for developing fast, reliable and cost-effective devices. Among them, aerosol jet
printing (AJP) is an emerging method that reproduces patterns with resolutions
as fine as 10 µm by generating and collimating an aerosol flux, which contains a functional
ink, onto a substrate [5]. AJP does not require masks to realize a specific pattern, unlike
screen printing [1]; it allows depositing a variety of materials [6], with ranges of
viscosity and density much wider than inkjet printing [7]; and it is suitable for printing
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021


S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 292–296, 2021.
https://doi.org/10.1007/978-3-030-66729-0_36

on non-planar surfaces [5]. This paper presents a preliminary study on dry electrodes
fabricated through AJP on stretchable substrates. The electrode is the result of a study
in which different ink/substrate combinations were tested to find the optimal solution in
terms of both ink/substrate adhesion and their electromechanical characteristics. Then,
the obtained solution is tested in an EMG application.

2 Materials and Methods

2.1 Sample Fabrication

The samples were fabricated using an aerosol jet printer (AJ 300-UP, Optomec
Inc.), according to specifically designed patterns. We used a stretchable silver-based ink
(PE873, DuPont™) deposited on thermoplastic polyurethane (TPU, TE-11C DuPont™
Intexar™) and polydimethylsiloxane (PDMS). TPU was selected for its high elasticity,
good tensile strength, excellent resistance to fatigue, and biocompatibility, while PDMS
was selected for its chemical inertness, thermal stability, high mechanical properties and
good adhesion capabilities. The main AJP process parameters were: sheath gas flow 55 SCCM,
exhaust flow 1300 SCCM, atomizer flow 1260 SCCM, process speed 2 mm/s, plate
temperature 60 °C and nozzle tip 200 µm. After the printing phase, all the samples
were cured at 130 °C in a dry oven for 15 min. Four types of samples with different
geometric configurations were investigated. Besides straight segments (type C),
we also evaluated samples with a “U” shape (type A) and a horseshoe shape (type B), which
proved to be optimal solutions for the design of circuit connections in terms of induced
mechanical stresses, reduced damage, and increased stretchability [8].

2.2 Characterization

We printed several type C samples using different ink/substrate combinations to select
the one that best fits our application. We performed different tests to evaluate several
characteristics, including ink/substrate adhesion and electromechanical behavior.
The adhesion of the ink on each substrate was assessed on a 0 to 5 scale using the cross-cut
tape test, which is described in the standard ASTM D3359 [9]. The electromechanical
characterization was performed in compliance with protocols commonly followed to
characterize polymeric materials [10].
Briefly, we employed the measurement system shown in Fig. 1 to apply two deformation
protocols to the samples. At first, a constant rate (0.5 mm/s) strain ramp was
applied until sample failure. Then, we applied different strain levels, maintaining the
reached position for 10 min after each step.

2.3 Results

The adhesion test rated the combination of PE873 ink and Intexar TE-11C as 5B,
the top of the scale, while both PDMS substrates demonstrated poor adhesion with this
ink. According to these preliminary results, we chose to further characterize and use only
the first combination of ink and substrate. The strain ramp test made it possible to identify the

Fig. 1. The experimental setup used during the tests for the electromechanical characterization

relationship between applied stress and the relative change of resistance of the sample.
We evaluated the resistance at rest (0% strain) to be 32.91 Ω, with a relative change in
resistance of 11.7% at failure under a strain of 12.25%. The overall sample behavior can
be seen in Fig. 2 (left). The static strain test depicted in Fig. 2 (right) shows a maximum
change of 0.13% at 5% strain in 10 min. It is worth noting that this strain-induced
variation in the interconnection resistance does not affect the overall signal quality,
because its value at zero strain is almost negligible compared to the resistance of the
skin-electrode interface (around 150–200 kΩ) and the typical input resistance of the
electronics frontend (usually higher than 1 GΩ).

Fig. 2. Sample characterization: strain ramp test (left), static strain test (right)
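
The numbers reported above can be put in perspective with a short Python sketch. It makes two simplifying assumptions that are not in the original text: the apparent gauge factor is taken as the ratio between the relative resistance change and the strain at failure (a linear average, not a reported result), and the signal path is modeled as a purely resistive series divider formed by the interconnection, the skin-electrode interface, and the amplifier input. Resistances are taken in ohms.

```python
# Values from the text (resistances assumed to be in ohms)
R_REST_OHM = 32.91             # interconnection resistance at 0% strain
REL_CHANGE_AT_FAILURE = 0.117  # 11.7% relative change at failure
STRAIN_AT_FAILURE = 0.1225     # 12.25% strain at failure
R_SKIN_OHM = 150e3             # lower end of the 150-200 kOhm range
R_INPUT_OHM = 1e9              # lower bound of the front-end input resistance


def apparent_gauge_factor(rel_change, strain):
    """ASSUMPTION: average sensitivity, treating the response as linear."""
    return rel_change / strain


def fraction_at_amplifier(r_interconnect, r_skin=R_SKIN_OHM, r_input=R_INPUT_OHM):
    """ASSUMPTION: resistive series divider; fraction of the source voltage
    that appears across the amplifier input."""
    return r_input / (r_input + r_skin + r_interconnect)


r_failure = R_REST_OHM * (1.0 + REL_CHANGE_AT_FAILURE)
gauge_factor = apparent_gauge_factor(REL_CHANGE_AT_FAILURE, STRAIN_AT_FAILURE)
loss = fraction_at_amplifier(R_REST_OHM) - fraction_at_amplifier(r_failure)

print(f"resistance at failure: {r_failure:.2f} ohm")
print(f"apparent gauge factor: {gauge_factor:.3f}")
print(f"change in signal fraction due to strain: {loss:.2e}")
```

Under these assumptions the interconnection gains less than 4 Ω over the whole strain range, which changes the fraction of signal reaching the amplifier by a few parts per billion.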

3 Application to EMG
3.1 Electrode Fabrication
For the development of the EMG electrode, the combination of the DuPont™ Intexar TE-11C
substrate and the DuPont™ PE873 stretchable silver conductive ink was selected. We
designed our electrodes to be similar in shape to the commercial ones used during the
validation step to reduce variability due to geometrical factors. Briefly, we designed two
parallel electrodes spaced by 15 mm with a 10 mm diameter active area. Each electrode
was provided with a 30 mm long horseshoe (type B) interconnection to a set of medical
snap buttons for connection to the test equipment. The active area was later coated with a
layer of silver chloride (AgCl) to ease signal acquisition.

3.2 Application Setup


Considering that the high variability of the surface EMG signal produced by voluntary
activations may hamper the comparison of performance between different electrodes,
the main characterization was performed in controlled conditions by electrostimulation.
A single subject, seated in a firm chair, was connected to an electrostimulator (Globus
Genesy 1500), which was configured, as a standard protocol, to deliver a 300 µs current
pulse followed by a ramp-up stimulus on the peroneal nerve so as to activate the tibialis
anterior muscle. The recording electrodes were placed over the tibialis anterior,
approximately 10 cm from the ankle, and fixed in place with adhesive tape. To acquire the
signals, we used a FREEEMG (BTS Bioengineering) wireless acquisition system. The custom
electrodes were employed both with electrolytic gel (EG) and in dry conditions.
The retrieved signals were then compared with those obtained using commercial
pre-gelled EMG electrodes (Kendall). For each test case, the same waveform was applied
and acquired many times to average repeated measurements and reduce the intrapersonal
variability of the stimulus response.
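
The averaging of repeated stimulus responses can be illustrated with a minimal Python sketch on synthetic data. The template waveform, the noise level, and the number of repetitions are hypothetical, since the text does not report them; the sketch only shows that a point-by-point mean of time-aligned responses attenuates uncorrelated variability.

```python
import random
import statistics


def average_epochs(epochs):
    """Point-by-point mean of time-aligned responses of equal length."""
    length = len(epochs[0])
    if any(len(epoch) != length for epoch in epochs):
        raise ValueError("all epochs must have the same number of samples")
    return [statistics.fmean(samples) for samples in zip(*epochs)]


def synthetic_epoch(template, noise_std, rng):
    """ASSUMPTION: response = fixed template + independent Gaussian noise."""
    return [value + rng.gauss(0.0, noise_std) for value in template]


rng = random.Random(0)                  # fixed seed for repeatability
template = [0.0, 1.0, 0.2, -0.5, 0.0]   # hypothetical stimulus response
epochs = [synthetic_epoch(template, 0.5, rng) for _ in range(400)]

averaged = average_epochs(epochs)
residual = max(abs(a - t) for a, t in zip(averaged, template))
print(f"largest deviation from the template after averaging: {residual:.3f}")
```

For independent noise, the standard deviation of the average decreases with the square root of the number of repetitions, so 400 repetitions reduce it by a factor of 20 in this example.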

3.3 Results and Discussion


In Fig. 3 we report both the validation setup and the averaged waveforms acquired
during the measurement sessions. In general, the amplitude of the first peak (initial
stimulus spike) recorded with the printed electrodes in dry conditions is 34%
of that recorded with the commercial electrodes, while in wet conditions it is 84%.
However, the amplitude of the following peaks for the printed electrodes was comparable
to that obtained with the commercial type. Even though our flexible dry electrodes present
these limitations, they offer important advantages, such as reduced bulk and
thickness, which can improve their overall wearability and conformity during motion.

Fig. 3. Validation setup (left) and average stimulus response for different electrodes (right)

Moreover, even though EG-based electrodes have lower contact impedance, the gel
dehydrates over time, unpredictably changing the signal and making them unreliable for
continuous monitoring.
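
For readers who prefer a logarithmic scale, the first-peak amplitude ratios reported in this section can be converted to decibels. The conversion below is an addition for illustration and is not part of the original results; it assumes the percentages are ratios of voltage amplitudes.

```python
import math


def amplitude_ratio_to_db(ratio):
    """Convert a voltage amplitude ratio to decibels (20 * log10)."""
    return 20.0 * math.log10(ratio)


# First-peak amplitude relative to the commercial electrodes (from the text)
dry_db = amplitude_ratio_to_db(0.34)
wet_db = amplitude_ratio_to_db(0.84)

print(f"dry printed electrodes: {dry_db:.1f} dB")   # about -9.4 dB
print(f"wet printed electrodes: {wet_db:.1f} dB")   # about -1.5 dB
```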

4 Conclusions
In this work, we reported the development of stretchable EMG electrodes fabricated
through AJP and their testing in an EMG application. Several samples were fabricated to
characterize the properties of different ink/substrate combinations, and the PE873 ink
on Intexar TE-11C was found to be the most suitable. The electrodes were then designed
and produced through AJP and validated through an electrostimulation-based protocol.
The validation process consisted of comparing the signals acquired using the printed
electrodes in dry and wet conditions with those retrieved with a commercial electrode.
As regards the amplitudes, the results are promising, with all the electrodes performing
in a similar way, even though there are some differences in the signal shape. This solution
can be easily integrated into wearable devices to address both health monitoring
and human-machine interface applications.

References
1. Chlaihawi, A.A., Narakathu, B.B., Emamian, S., Bazuin, B.J., Atashbar, M.Z.: Development
of printed and flexible dry ECG electrodes. Sens. Bio-Sens. Res. 20, 9–15 (2018)
2. Kim, N., Lim, T., Song, K., Yang, S., Lee, J.: Stretchable multichannel electromyography
sensor array covering large area for controlling home electronics with distinguishable signals
from multiple muscles. ACS Appl. Mater. Interfaces 8, 21070–21076 (2016)
3. Peng, H.-L., Liu, J.-Q., Tian, H.-C., Xu, B., Dong, Y.-Z., Yang, B., Chen, X., Yang, C.-S.:
Flexible dry electrode based on carbon nanotube/polymer hybrid micropillars for biopotential
recording. Sens. Actuators A 235, 48–56 (2015)
4. Jung, J., Shin, S., Kim, Y.T.: Dry electrode made from carbon nanotubes for continuous
recording of bio-signals. Microelectron. Eng. 203–204, 25–30 (2019)
5. Borghetti, M., Serpelloni, M., Sardini, E.: Printed strain gauge on 3D and low-melting point
plastic surface by aerosol jet printing and photonic curing. Sensors 19, 4220 (2019)
6. Smith, M., Choi, Y.S., Boughey, C., Kar-Narayan, S.: Controlling and assessing the quality
of aerosol jet printed features for large area and flexible electronics. Flex. Printed Electron.
2, 015004 (2017)
7. Khan, Y., Pavinatto, F.J., Lin, M.C., Liao, A., Swisher, S.L., Mann, K., Subramanian, V.,
Maharbiz, M.M., Arias, A.C.: Inkjet-printed flexible gold electrode arrays for bioelectronic
interfaces. Adv. Funct. Mater. 26, 1004–1013 (2016)
8. Jablonski, M., Bossuyt, F., Vanfleteren, J., Vervust, T., de Vries, H.: Reliability of a
stretchable interconnect utilizing terminated, in-plane meandered copper conductor.
Microelectron. Reliab. 53, 956–963 (2013)
9. ASTM D3359-17, Standard Test Methods for Rating Adhesion by Tape Test. ASTM
International, West Conshohocken, PA (2017). https://www.astm.org/
10. Borghetti, M., Serpelloni, M., Sardini, E., Pandini, S.: Mechanical behavior of strain sensors
based on PEDOT:PSS and silver nanoparticles inks deposited on polymer substrate by inkjet
printing. Sens. Actuators A 243, 71–80 (2016)
Author Index

A
Alarcón, Ángel Serrano, 286

B
Baronti, Federico, 241, 246
Bellotti, Francesco, 39, 144, 252, 261
Bellucci, Daniele, 246
Benito, Jesica, 3
Berta, Riccardo, 39, 144, 252, 261
Berzi, Lorenzo, 223
Bianchi, Valentina, 136
Boni, Enrico, 223
Borghetti, M., 292
Brunelli, Davide, 192
Buonanno, Luca, 55

C
Carbone, Antonio, 69
Cardarilli, Gian Carlo, 267
Carloni, Andrea, 241, 246
Carminati, Marco, 55
Carrato, Sergio, 280
Casiello, Giovanni, 125
Caviglia, Daniele D., 273
Cesario, Paolo, 252
Chible, Hussein, 273
Ciarpi, Gabriele, 182, 202
Cicuttin, Andres, 280
Conti, Massimo, 286
Cosimi, Francesco, 182, 229
Crespo, Maria Liz, 280

D
Daher, Ali Walid, 273
Dalmonte, Filippo, 23
de Gioia, Francesco, 79
De Gloria, Alessandro, 39, 144, 252, 261
De Marchi, Luca, 69
De Munari, Ilaria, 136
De Venuto, Daniela, 111
Dei, Lorenzo Flaccomio Nardi, 125
Delogu, Massimo, 223
Dhungana, Hariom, 144
Di Nunzio, Luca, 267
Di Pasquale, Fabrizio, 173
Di Rienzo, Roberto, 246
Di Vita, Davide, 55
Dini, Pierpaolo, 229
Donati, Massimiliano, 79
Doyle, Joseph, 39
Ducci, Marco, 235

E
Elhanashi, Abdussalam, 30

F
Fanucci, Luca, 79, 104
Fapanni, Tiziano, 292
Faralli, Stefano, 173
Fazzolari, Rocco, 267
Ferrari, Lorenzo, 125
Fiorini, Carlo, 55
Florian, Werner, 280
Fronda, Luca, 252

G
Gagliardi, Alessio, 12, 30
Gaiduk, Maksym, 49, 286
Galioto, Giuseppe, 152
García, Luis Guillermo, 280

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2021
S. Saponara and A. De Gloria (Eds.): ApplePies 2020, LNEE 738, pp. 297–298, 2021.
https://doi.org/10.1007/978-3-030-66729-0

Gastaldo, Paolo, 23
Giaconia, Giuseppe Costantino, 152
Giannetti, Sandro, 229
Gianoglio, Christian, 23
Giardino, Daniele, 267
Gonzalez, Valeria, 3
Grammatikakis, Miltos D., 90, 97, 164
Grasso, Francesco, 223

H
Hussain, Asad, 62

K
Koumarelis, Anastasios, 90, 97

L
Lazzaroni, Luca, 261
Levorato, Stefano, 280
Locatelli, Patrick, 62
Lopomo, N. F., 292

M
Madrid, Natividad Martínez, 286
Marini, Marco, 104
Marsi, Stefano, 3
Martina, Maurizio, 213
Matta, Marco, 267
Mazzara, Andrea, 261
Meoni, Gabriele, 104
Mezzina, Giovanni, 111
Molina, Romina, 3, 280
Montagnani, Giovanni L., 55
Mouzakitis, Angelos, 97
Mouzakitis, Nikos, 164
Muanenda, Yonas, 173
Mulfari, Davide, 104
Muselli, Marco, 273

N
Ntallaris, Efstratios, 90

O
Orcioni, Simone, 286
Ortega, Juan Antonio, 286
Oton, Claudio, 173

P
Paolino, Michele, 164
Pedrana, Andrea, 62
Penzel, Thomas, 286
Perea, Juan José, 286
Peruzzi, Alessandro, 235
Petrelli, Matteo, 229
Petrino, Ricardo, 3
Pezzoli, Matteo, 62
Picciolini, Luca, 12
Poce, Marica, 125
Prosperi, Paolo, 202
Pugi, Luca, 223, 235

R
Ragusa, Edoardo, 23
Raho, Daniel, 164
Ramponi, Giovanni, 3
Re, Marco, 267
Re, Valerio, 62
Rienzo, Roberto Di, 241
Rizik, Ali, 273
Romano, Francesca, 69
Roncella, Roberto, 241, 246
Ruo Roch, Massimo, 213

S
Sakr, Fouad, 39
Saletti, Roberto, 241, 246
Saponara, Sergio, 12, 30, 125, 182, 202, 229
Sardini, E., 292
Savi, Raffaele, 223
Scherz, Wilhelm Daniel, 286
Seepold, Ralf, 49, 286
Serpelloni, M., 292
Sferlazza, Antonino, 152
Shkilniuk, Yurii, 49
Spanò, Sergio, 267
Stighezza, Mattia, 136

T
Testoni, Nicola, 69
Torrisi, Alessandro, 192
Traversi, Gianluca, 62

V
Valinoti, Bruno, 280
Vanello, Nicola, 104
Velha, Philippe, 173
Villella, Marco, 12
Vita, Valerio, 223

W
Weiss, Gerald, 286

Y
Yıldırım, Kasım Sinan, 192

Z
Zauli, Matteo, 69
Zeni, Marco, 246
Zonzini, Federica, 69
