
Smart Innovation, Systems and Technologies 157

Jeng-Shyang Pan
Jianpo Li
Pei-Wei Tsai
Lakhmi C. Jain Editors

Advances in Intelligent Information


Hiding and Multimedia Signal
Processing
Proceedings of the 15th International
Conference on IIH-MSP in conjunction
with the 12th International Conference
on FITAT, July 18–20, Jilin, China,
Volume 2
Smart Innovation, Systems and Technologies

Volume 157

Series Editors
Robert J. Howlett, Bournemouth University and KES International,
Shoreham-by-sea, UK
Lakhmi C. Jain, Faculty of Engineering and Information Technology,
Centre for Artificial Intelligence, University of Technology Sydney,
Sydney, NSW, Australia
The Smart Innovation, Systems and Technologies book series encompasses the
topics of knowledge, intelligence, innovation and sustainability. The aim of the
series is to make available a platform for the publication of books on all aspects of
single and multi-disciplinary research on these themes in order to make the latest
results available in a readily-accessible form. Volumes on interdisciplinary research
combining two or more of these areas are particularly sought.
The series covers systems and paradigms that employ knowledge and intelligence
in a broad sense. Its scope is systems having embedded knowledge and intelligence,
which may be applied to the solution of world problems in industry, the environment
and the community. It also focusses on the knowledge-transfer methodologies and
innovation strategies employed to make this happen effectively. The combination of
intelligent systems tools and a broad range of applications introduces a need for a
synergy of disciplines from science, technology, business and the humanities. The
series will include conference proceedings, edited collections, monographs, hand-
books, reference books, and other relevant types of book in areas of science and
technology where smart systems and technologies can offer innovative solutions.
High quality content is an essential feature for all book proposals accepted for the
series. It is expected that editors of all accepted volumes will ensure that
contributions are subjected to an appropriate level of reviewing process and adhere
to KES quality principles.
** Indexing: The books of this series are submitted to ISI Proceedings,
EI-Compendex, SCOPUS, Google Scholar and Springerlink **

More information about this series at http://www.springer.com/series/8767


Jeng-Shyang Pan · Jianpo Li · Pei-Wei Tsai · Lakhmi C. Jain

Editors

Advances in Intelligent
Information Hiding
and Multimedia Signal
Processing
Proceedings of the 15th International
Conference on IIH-MSP in conjunction
with the 12th International Conference
on FITAT, July 18–20, Jilin, China, Volume 2

Editors

Jeng-Shyang Pan
College of Computer Science and Engineering
Shandong University of Science and Technology
Qingdao Shi, Shandong, China

Jianpo Li
Northeast Electric Power University
Chuanying Qu, Jilin, China

Pei-Wei Tsai
Swinburne University of Technology
Hawthorn, Melbourne, Australia

Lakhmi C. Jain
Centre for Artificial Intelligence, University of Technology Sydney
Sydney, NSW, Australia
Liverpool Hope University, Liverpool, UK
University of Canberra, Canberra, Australia
KES International, UK

ISSN 2190-3018 ISSN 2190-3026 (electronic)


Smart Innovation, Systems and Technologies
ISBN 978-981-13-9709-7 ISBN 978-981-13-9710-3 (eBook)
https://doi.org/10.1007/978-981-13-9710-3
© Springer Nature Singapore Pte Ltd. 2020
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Conference Organization

Conference Founders

Jeng-Shyang Pan, Fujian University of Technology


Lakhmi C. Jain, University of Technology Sydney, Australia, University of Canberra,
Australia, Liverpool Hope University, UK and KES International, UK
Keun Ho Ryu, Chungbuk National University
Oyun-Erdene Namsrai, National University of Mongolia

Honorary Chairs

Lakhmi C. Jain, University of Technology Sydney, Australia, University of Canberra,


Australia, Liverpool Hope University, UK and KES International, UK
Guowei Cai, Northeast Electric Power University
Chin-Chen Chang, Feng Chia University
Goutam Chakraborty, Iwate Prefectural University

Advisory Committees

Yôiti Suzuki, Tohoku University


Ioannis Pitas, Aristotle University of Thessaloniki
Yao Zhao, Beijing Jiaotong University
Kebin Jia, Beijing University of Technology
Li-Hua Li, Chaoyang University of Technology
Yanjun Peng, Shandong University of Science and Technology
Jong Yun Lee, Chungbuk National University


Vu Thi Hong Nhan, Vietnam National University


Uyanga Sambuu, National University of Mongolia
Yanja Dajsuren, TU/E

General Chairs

Jianguo Wang, Northeast Electric Power University


Jeng-Shyang Pan, Fujian University of Technology
Chin-Feng Lee, Chaoyang University of Technology
Kwang-Woo Nam, Kunsan National University
Oyun-Erdene Namsrai, National University of Mongolia
Ling Wang, Northeast Electric Power University

Program Chairs

Renjie Song, Northeast Electric Power University


Ching-Yu Yang, National Penghu University of Science and Technology
Ling Wang, Northeast Electric Power University
Ganbat Baasantseren, National University of Mongolia

Publication Chairs

Pei-Wei Tsai, Swinburne University of Technology


Ho Sun Shon, Chungbuk National University
Erdenetuya Namsrai, Mongolian University of Science and Technology
Yongjun Piao, Nankai University

Invited Session Chairs

Chih-Yu Hsu, Chaoyang University of Technology


KeunHo Ryu, Chungbuk National University
Oyun-Erdene Namsrai, National University of Mongolia
Erdenebileg Batbaatar, Chungbuk National University
Jianpo Li, Northeast Electric Power University
Xingsi Xue, Fujian University of Technology
Chien-Ming Chen, Harbin Institute of Technology
Shuo-Tsung Chen, National Yunlin University of Science and Technology

Electronic Media Chairs

Jieming Yang, Northeast Electric Power University


Aziz Nasridinov, Chungbuk National University
Ganbat Baasantseren, National University of Mongolia

Finance Chairs

Yang Sun, Northeast Electric Power University


Juncheng Wang, Northeast Electric Power University

Local Organization Chairs

Jianpo Li, Northeast Electric Power University


Tiehua Zhou, Northeast Electric Power University
Meijing Li, Shanghai Maritime University

Program Committees

Aziz Nasridinov, Chungbuk National University


Anwar F. A. Dafa-alla, Garden City College
Basabi Chakraborty, Iwate Prefectural University
Bayarpurev Mongolyn, National University of Mongolia
Bold Zagd, National University of Mongolia
Bu Hyun Hwang, Chungbuk National University
Bum Ju Lee, Korea Institute of Oriental Medicine
Byungchul Kim, Baekseok University
Dong Ryu Lee, University of Tokyo
Erwin Bonsma, Philips
Garmaa Dangaasuren, National University of Mongolia
Goce Naumoski, Bizzsphere
Gouchol Pok, Pai Chai University
Herman Hartmann, University of Groningen
Hoang Do Thanh Tung, Vietnam Institute of Information Technology of
Vietnamese Academy of Science and Technology
Incheon Park, The University of Aizu
Jeong Hee Chi, Konkuk University
Jeong Hee Hwang, Namseoul University

Jong-Yun Lee, Chungbuk National University


Jung Hoon Shin, Chungbuk National University
Kwang Su Jung, Chungbuk National University
Mohamed Ezzeldin A. Bashir, Medical Sciences and Technology University
Moon Sun Shin, Konkuk University
Mei-Jing Li, Shanghai Maritime University
Purev Jaimai, National University of Mongolia
Razvan Dinu, Philips
Seon-Phil Jeong, United International College
Supatra Sahaphong, Ramkhamhaeng University
Suvdaa Batsuuri, National University of Mongolia
Shin Eun Young, Chungbuk National University
Sanghyuk Lee, Xi’an Jiaotong-Liverpool University
Tom Arbuckle, University of Limerick
TieHua Zhou, Northeast Electric Power University
Tsendsuren Munkhdalai, Microsoft Research
WeiFeng Su, BNU-HKBU United International College
Yongjun Piao, Nankai University
Yoon Ae Ahn, Health and Medical Information Engineering, College of Life
Yang-Mi Kim, Chungbuk National University
Kyung-Ah Kim, Chungbuk National University
Khuyagbaatar Batsuren, University of Trento
Enkhtuul Bukhsuren, National University of Mongolia
Nan Ding, Dalian University of Technology
Ran Ma, Shanghai University
Gang Liu, Xidian University
Wanchang Jiang, Northeast Electric Power University
Jingdong Wang, Northeast Electric Power University
Xinxin Zhou, Northeast Electric Power University

Committee Secretaries

Hyun Woo Park, Chungbuk National University


Erdenebileg Batbaatar, Chungbuk National University
Tsatsral Amarbayasgalan, Chungbuk National University
Batnyam Battulga, National University of Mongolia
Erdenetuya Namsrai, Mongolian University of Science and Technology
Meilin Li, Northeast Electric Power University
Preface

Welcome to the 15th International Conference on Intelligent Information Hiding


and Multimedia Signal Processing (IIH-MSP 2019) and the 12th International
Conference on Frontiers of Information Technology, Applications and Tools
(FITAT 2019) to be held in Jilin, China, on July 18–20, 2019. IIH-MSP 2019 and
FITAT 2019 are technically co-sponsored by Northeast Electric Power University,
Chaoyang University of Technology, Chungbuk National University, National
University of Mongolia in Mongolia, Shandong University of Science and
Technology, Fujian Provincial Key Lab of Big Data Mining and Applications, and
National Demonstration Center for Experimental Electronic Information and
Electrical Technology Education (Fujian University of Technology). Both confer-
ences aim to bring together researchers, engineers, and policymakers to discuss the
related techniques, to exchange research ideas, and to make friends.
We received a total of 276 submissions. Finally, 95 papers were accepted after the
review process. The keynote speeches were kindly provided by Prof. James C.
N. Yang (Dong Hwa University) on “Relationship between Polynomial-based and
Code-based Secret Image Sharing and Their Pros and Cons,” Prof. Keun Ho Ryu
(Chungbuk National University) on “Spectrum on Interdisciplinary Related to
Databases and Bioinformatics Researches,” and Prof. Yuping Wang (Xidian
University) on “A New Framework for Large Scale Global Optimization.”
We would like to thank the authors for their tremendous contributions. We
would also like to express our sincere appreciation to the reviewers, the Program Committee
members, and the Local Committee members for making both conferences successful.
In particular, our special thanks go to Prof. Keun Ho Ryu for his efforts and
contribution in making IIH-MSP 2019 and FITAT 2019 possible. Finally,
we would like to express special thanks to Northeast Electric Power University,
Chaoyang University of Technology, Chungbuk National University, National

University of Mongolia in Mongolia, Shandong University of Science and
Technology, Fujian Provincial Key Lab of Big Data Mining and Applications, and
National Demonstration Center for Experimental Electronic Information and
Electrical Technology Education (Fujian University of Technology) for their gen-
erous support in making IIH-MSP 2019 and FITAT 2019 possible.

Acknowledgements The IIH-MSP 2019 and FITAT 2019 Organizing Committees wish to
express their appreciation to Prof. Keun Ho Ryu from Chungbuk National University for his
contribution to organizing the conference.

Qingdao Shi, China Jeng-Shyang Pan


Chuanying Qu, China Jianpo Li
Hawthorn, Australia Pei-Wei Tsai
Sydney, Australia Lakhmi C. Jain
July 2019
Contents

Part I Optimization and Its Application


1 A Framework for Ridesharing Recommendation Services . . . . . . . 3
Thi Hong Nhan Vu
2 Optimal Scheduling and Benefit Analysis of Solid Heat Storage
Devices in Cold Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Feng Sun, Xin Wen, Wei Fan, Gang Wang, Kai Gao, Jiajue Li
and Hao Liu
3 Optimization Algorithm of RSSI Transmission Model
for Distance Error Correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Yong Liu, Ningning Li, Dawei Wang, Ti Guan, Wenting Wang,
Jianpo Li and Na Li
4 A New Ontology Meta-Matching Technique with a Hybrid
Semantic Similarity Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Jiawei Lu, Xingsi Xue, Guoxiang Lin and Yikun Huang
5 Artificial Bee Colony Algorithm Combined with Uniform
Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Jie Zhang, Junhong Feng, Guoqiang Chen and Xiani Yang
6 An Orthogonal QUasi-Affine TRansformation Evolution
(O-QUATRE) Algorithm for Global Optimization . . . . . . . . . . . . . 57
Nengxian Liu, Jeng-Shyang Pan and Jason Yang Xue
7 A Decomposition-Based Evolutionary Algorithm with Adaptive
Weight Adjustment for Vehicle Crashworthiness Problem . . . . . . . 67
Cai Dai
8 Brainstorm Optimization in Thinned Linear Antenna Array
with Minimum Side Lobe Level . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Ninjerdene Bulgan, Junfeng Chen, Xingsi Xue, Xinnan Fan
and Xuewu Zhang


9 Implementation Method of SVR Algorithm
in Resource-Constrained Platform . . . . . . . . . . . . . . . . . . . . . . . . . 85
Bing Liu, Shoujuan Huang, Ruidong Wu and Ping Fu
10 A FPGA-Oriented Quantization Scheme for MobileNet-SSD . . . . . 95
Yuxuan Xie, Bing Liu, Lei Feng, Xipeng Li and Danyin Zou
11 A High-Efficient Infrared Mosaic Algorithm Based on GMS . . . . . 105
Xia Pei, Baolong Guo, Geng Wang and Zhe Huang
12 A Load Economic Dispatch Based on Ion Motion Optimization
Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Trong-The Nguyen, Mei-Jin Wang, Jeng-Shyang Pan, Thi-kien Dao
and Truong-Giang Ngo
13 Improving Correlation Function Method to Generate
Three-Dimensional Atmospheric Turbulence . . . . . . . . . . . . . . . . . 127
Lianlei Lin, Kun Yan and Jiapeng Li
14 Study on Product Name Disambiguation Method Based
on Fusion Feature Similarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Xiuli Ning, Xiaowei Lu, Yingcheng Xu and Ying Li
15 Delegated Preparation of Quantum Error Correction
Code for Blind Quantum Computation . . . . . . . . . . . . . . . . . . . . . . 147
Qiang Zhao and Qiong Li
16 Design of SpaceWire Interface Conversion to PCI Bus . . . . . . . . . . 155
Zhenyu Wang, Lei Feng and Jiaqing Qiao
17 A Chaotic Map with Amplitude Control . . . . . . . . . . . . . . . . . . . . . 163
Chuanfu Wang and Qun Ding
18 Analysis of Factors Associated to Smoking Cessation
Plan Among Adult Smokers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Jong Seol Lee and Keun Ho Ryu
19 An Efficient Semantic Document Similarity Calculation Method
Based on Double-Relations in Gene Ontology . . . . . . . . . . . . . . . . 179
Jingyu Hu, Meijing Li, Zijun Zhang and Kaitong Li
20 Analysis of the Dispersion of Impact Point of Smart Blockade
and Control Ammunition System Based on Monte
Carlo Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
Yang Li, Chun-lan Jiang, Ming Li and Shu-chun Xie
21 Analysis of the Trajectory Characteristics and Distribution
of Smart Blockade and Control Ammunition System . . . . . . . . . . . 195
Yang Li, Chun-lan Jiang, Liang Mao and Xin-yu Wang

22 Study on Lee-Tarver Model Parameters of CL-20
Explosive Ink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Rong-qiang Liu, Jian-xin Nie and Qing-jie Jiao
23 Optimal Design of Online Peer Assessment System . . . . . . . . . . . . . 217
Yeyu Lin and Yaming Lin

Part II Power Systems


24 A Method of Calculating the Safety Margin of the Power
Network Considering Cascading Trip Events . . . . . . . . . . . . . . . . . 227
Huiqiong Deng, Chaogang Li, Bolan Yang, Eyhab Alaini,
Khan Ikramullah and Renwu Yan
25 Research on Intelligent Hierarchical Control of Large Scale
Electric Storage Thermal Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
Tong Wang, Gang Wang, Kai Gao, Jiajue Li, Yibo Wang
and Hao Liu
26 Global Maximum Power Point Tracking Algorithm for Solar
Power System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
Ti Guan, Lin Lin, Dawei Wang, Xin Liu, Wenting Wang, Jianpo Li
and Pengwei Dong
27 A Design of Electricity Generating Station Power Prediction
Unit with Low Power Consumption Based on Support Vector
Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
Bing Liu, Qifan Tong, Lei Feng and Ping Fu
28 Design of Power Meter Calibration Line Control System . . . . . . . . 269
Liqiang Pei, Qingdan Huang, Rui Rao, Lian Zeng and Weijie Liao

Part III Pattern Recognition and Its Applications


29 Foreground Extraction Based on 20-Neighborhood Color
Motif Co-occurrence Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
Chun-Feng Guo, Guo Tai Chen, Lin Xu and Chao-Fan Xie
30 Deformation Analysis of Crude Oil Pipeline Caused by Pipe
Corrosion and Leakage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
Yuhong Zhang, Gui Gao, Hang Liu, Qianhe Meng and Yuli Li
31 Open Information Extraction for Mongolian Language . . . . . . . . . 299
Ganchimeg Lkhagvasuren and Javkhlan Rentsendorj
32 Colorful Fruit Image Segmentation Based on Texture Feature . . . . 305
Chunyan Yang

33 Real-Time Emotion Recognition Framework Based on
Convolution Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
Hanting Yang, Guangzhe Zhao, Lei Zhang, Na Zhu, Yanqing He
and Chunxiao Zhao
34 Facial Expression Recognition Based on Regularized Semi-
supervised Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
Taiting Liu, Wenyan Guo, Zhongbo Sun, Yufeng Lian, Shuaishi Liu
and Keping Wu
35 Face Recognition Based on Local Binary Pattern Auto-
correlogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
Zimei Li, Ping Yu, Hui Yan and Yixue Jiang
36 Saliency Detection Based on the Integration of Global Contrast
and Superpixels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
Yikun Huang, Lu Liu, Yan Li, Jie Chen and Jiawei Lu
37 Mosaic Removal Algorithm Based on Improved Generative
Adversarial Networks Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
He Wang, Zhiyi Cao, Shaozhang Niu and Hui Tong
38 Xception-Based General Forensic Method on Small-Size
Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
Lisha Yang, Pengpeng Yang, Rongrong Ni and Yao Zhao
39 Depth Information Estimation-Based DIBR 3D Image Hashing
Using SIFT Feature Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
Chen Cui and Shen Wang
40 Improved Parity-Based Error Estimation Scheme in Quantum
Key Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
Haokun Mao and Qiong Li
41 An Internal Threat Detection Model Based on Denoising
Autoencoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
Zhaoyang Zhang, Shen Wang and Guang Lu
42 The Para-Perspective Projection as an Approximation
of the Perspective Projection for Recovering 3D Motion in
Real Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
Tserennadmid Tumurbaatar and Nyamlkhagva Sengee
43 Classifying Songs to Relieve Stress Using Machine Learning
Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
Khongorzul Munkhbat and Keun Ho Ryu
44 A Hybrid Model for Anomaly-Based Intrusion Detection
System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
N. Ugtakhbayar, B. Usukhbayar and S. Baigaltugs

45 A Method for Precise Positioning and Rapid Correction
of Blue License Plate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
Jiawei Wu, Zhaochai Yu, Zuchang Zhang, Zuoyong Li, Weina Liu
and Jiale Yu
46 Preliminary Design and Application Prospect of Single Chinese
Character Calligraphy Image Scoring Algorithm . . . . . . . . . . . . . . 443
Shutang Liu, Zhen Wang, Chuansheng Wang, Junxian Zheng
and Fuquan Zhang
47 Adaptive Histogram Thresholding-Based Leukocyte Image
Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
Xiaogen Zhou, Chuansheng Wang, Zuoyong Li and Fuquan Zhang
48 Simulation Study on Influencing Factors of Flyer Driven
by Micro-sized PbN6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
Xiang He, Nan Yan, Weiming Wu and Liang Zhang
49 Identifying Key Learner on Online E-Learning Platform:
An Effective Resistance Distance Approach . . . . . . . . . . . . . . . . . . 471
Chunhua Lu, Fuquan Zhang and Yunpeng Li
50 A User Study on Head Size of Chinese Youth for Head-Mounted
EEG Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481
Xi Yu and Wen Qi
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
About the Editors

Jeng-Shyang Pan received his B.S. in Electronic Engineering from National


Taiwan University of Science and Technology in 1986, M.S. in Communication
Engineering from National Chiao Tung University, Taiwan, in 1988, and Ph.D. in
Electrical Engineering from University of Edinburgh, UK, in 1996. He is a Professor
at College of Computer Science and Engineering, Shandong University of Science
and Technology and Fujian University of Technology, and an Adjunct Professor at
Flinders University, Australia. He joined the editorial board of the International
Journal of Innovative Computing, Information and Control, LNCS Transactions on
Data Hiding and Multimedia Security, Journal of Information Assurance and
Security, Journal of Computers, International Journal of Digital Crime and
Forensics, and the Chinese Journal of Electronics. His research interests include soft
computing, information security, and big data mining. He has published more than
300 journal and 400 conference papers, 35 book chapters, and 22 books.

Jianpo Li is a Professor at School of Computer Science, Northeast Electric Power


University, China. He completed his Ph.D. in Communication and Information
System from Jilin University, Changchun, China, in 2008, and has more than
10 years’ teaching/research experience. He has published more than 25 papers in
international journals and conferences and has 12 patents.

Pei-Wei Tsai received his Ph.D. in Electronic Engineering in Taiwan in 2012. He is a


lecturer and the deputy course convenor for Master of Data Science at the Department
of Computer Science and Software Engineering at Swinburne University of
Technology in Australia. His research interests include swarm intelligence, opti-
mization, big data analysis, wireless sensor network, and machine learning.

Lakhmi C. Jain Ph.D., M.E., B.E. (Hons), Fellow (Engineers Australia), serves at
University of Technology Sydney, Australia, University of Canberra, Australia,
Liverpool Hope University, UK and KES International, UK. He founded KES
International to provide the professional community with the opportunities for pub-
lication, knowledge exchange, cooperation, and teaming. Involving around 5000


researchers drawn from universities and companies worldwide, KES facilitates


international cooperation and generates synergy in teaching and research. His inter-
ests focus on artificial intelligence paradigms and applications in complex systems,
security, e-education, e-healthcare, unmanned air vehicles, and intelligent agents.
Part I
Optimization and Its Application
Chapter 1
A Framework for Ridesharing
Recommendation Services

Thi Hong Nhan Vu

Abstract A variety of existing ride-on-demand systems support a rideshare function
alongside traditional taxi-style functions. However, many problems remain unsolved.
First, drivers have to offer their trips and passengers have to enter their requests
through a website or smartphone to find a possible match, so the rideshare function of
these systems is still limited. Existing systems also fail to provide convenient and
flexible ridesharing services, especially for regular users with frequent routes: many
drivers and passengers have the same travel demand every day but still have to send a
new ride request each time. Last but not least, when people visit a place they often
perform some specific activity there, for example, eating at a restaurant, and they
sometimes do not mind switching to another place where they can do the same activity,
provided that no additional travel cost or time is incurred. To construct proactive
real-time ridesharing services, all of these problems need to be solved. This paper
focuses on designing a framework for ridesharing and location-based services that
exploits knowledge discovered by spatiotemporal data mining techniques. Users can
send a ride request at any time. Depending on when the user needs the ride and on the
activity planned at the destination, the request is either executed immediately or
deferred so that an optimal rideshare can be constructed, possibly together with a
suggested location for the demanded activity, so that the ride fare is lowest.

Keywords Activity analysis · Ridesharing · Point of interest

1.1 Introduction

Over the past years, motorcycles and scooters have dominated in developing
countries like Vietnam. However, as such a country becomes wealthier, it is likely to move
toward car ownership, placing a great burden on already overcrowded roads [1]. In

T. H. N. Vu (B)
Faculty of Information Technology, University of Engineering and Technology,
Vietnam National University, Hanoi, Vietnam
e-mail: vthnhan@vnu.edu.vn


addition, the growing number of cars and their fuel consumption cause air pollution,
traffic jams, and an energy crisis. Ridesharing is believed to be the most effective
strategy to achieve green and efficient transportation [2, 3].
Most existing ride-on-demand and activity-based travel demand services directly
use raw GPS data like coordinates and timestamps without much understanding.
These systems usually force the riders to adapt to their recommended travel routes
instead of receiving an itinerary based on their needs. These systems do not provide
much support in giving useful information about geospatial locations while the users
are traveling either. Naturally, before going to an unknown region, users wish to
know which locations are the most interesting in this region and which travel
sequences are the most suitable to follow. Ridesharing recommendation services
enable a group of people with similar frequent trips or similar activity preferences
to share a car [4, 5].
So far, there are two popular types of ridesharing, namely, casual carpooling and
real-time ridesharing. The former is usually used by commuters who have common
routes, departing from public transit centers to work locations. However, a problem
with casual carpooling is that it requires users to register in advance and usually they
have some relationship, while in practice users usually have a spontaneous travel
demand. Real-time ridesharing is able to address this problem with the support of
mobile devices and automated ride-matching algorithms, which allow participants
to be organized only minutes before the trip begins or even while it is underway.
Some of the popular applications deployed lately include Uber [6]. However, most
of these applications operate like a traditional taxi service and treat common trips
as invalid, even though the value such trips can bring is considerable.
Profile matching is an approach to generate groups of participants. One of the most
recent studies uses a social distance to measure the relationship between participants
but only the distance between home and office is discussed [7]. Another work introduces
a time–space network flow technique to address the ridesharing problem using pre-
matching information such as smoking preference or gender of the participant
[8]. However, knowledge of frequent routes is not included in this work.
In this paper, we propose a framework for ridesharing and location-based recom-
mendation services with the exploitation of knowledge discovered by spatiotemporal
data mining techniques. Users can send a ride request anytime. Depending on the
time the user needs a ride as well as his activity at the destination, his request can
be executed immediately or deferred to construct an optimal rideshare, possibly together
with a suggested location for his demanded activity, so that the ride fare is lowest.

1.2 Framework for Ridesharing Service

To receive a ridesharing recommendation from the system, the user must first send a
request for a ridesharing service to the Premium Ridesharing Service. The pickup and
drop-off locations as well as the validity period of the request would be included. Users can

Fig. 1.1 A framework for ridesharing services

select their pickup and drop-off locations from a list of frequent addresses. A default
time limit can be used in case no validity period is specified. The system processes
the request by sending the specified addresses to the geocoding service and getting
back their coordinates. All information regarding the request is then sent to the
ridesharing engine. The request can be executed in an online or offline fashion
depending on the specified validity period. The ridesharing engine calls the appropriate
algorithm to construct a rideshare, which consists of users with common or similar
routes. The information about the rideshare is then sent to the scheduling/routing
engine (Fig. 1.1).

1.3 Basic Concepts

This section explains the basic concepts and trajectory preprocessing procedure with
semantic information.

Definition 1.1 (GPS Trajectory): A raw trajectory of a moving user is formally
represented by a pair ⟨oid, p_i^+⟩, in which oid is the moving-user identifier and p_i^+
is a sequence of geographical points of the user. Each point p_i is represented by
(x, y, t), in which x and y are the spatial coordinates and t is the timestamp at which
the position was captured.

Definition 1.2 (Point Of Interest): A POI is a geographical location where people can
perform an activity. Formally, a POI is defined as a tuple {p, lbl, topic, T} where p
is a spatial location, lbl is the name of the POI, topic is a category assigned to the POI
depending on the application, and T is the business hours represented by the time
interval [open, close].

Definition 1.3 (Stay point): A stay point is a geographic region in which the user stayed
for a certain interval of time. Given a raw trajectory, stay points are detected with the
use of two scale parameters, a temporal threshold θ_t and a spatial threshold θ_s. A stay
point is characterized by a group of consecutive points P = {p_i, p_{i+1}, ..., p_j} in which,
for each i ≤ k ≤ j, the distance dist(p_i, p_k) between the two points p_i and p_k is no greater
than the threshold θ_s, and the time difference between the first and last points is at least
the threshold θ_t (i.e., dist(p_i, p_k) ≤ θ_s and p_j.t − p_i.t ≥ θ_t). Formally, the stay point
is denoted by s = (x, y, beginT, endT), where

$$s.x = \frac{\sum_{k=i}^{j} p_k.x}{|j-i+1|}, \qquad s.y = \frac{\sum_{k=i}^{j} p_k.y}{|j-i+1|}$$

are the average coordinates of the points of the set P, and beginT = p_i.t and
endT = p_j.t are the entering and leaving times of the user.

1.4 Process of Ridesharing Recommendation

1.4.1 Detecting Stay Points

The first step is to detect the stay points from a user’s raw trajectory. Usually, each
stay point carries a particular semantic meaning, such as a restaurant, a rest area, or
some other tourist attraction. Annotating each stay point with a POI and annotating
each POI with a human activity can be done either manually or automatically. Given
a trajectory of a user, a temporal threshold θ_t, and a spatial threshold θ_s, all of the stay
points can be detected easily according to Definition 1.3.

1.4.2 Segmenting GPS Movements into Routes

Movement history is a set of locations that the user visited in geographical spaces
over an interval of time. Here, a user's movement history MovH is represented by
a sequence of stay points the user visited with corresponding entering time (beginT)
and leaving time (endT). Therefore, we have MovH = ⟨s1, s2, ..., sn⟩.
After detecting all the stay points, with the above definition of the user's movement
history we now move on to segmenting the user movement into a set of routine
routes. A routine route is defined as a regular behavior regarding the spatial and temporal
aspects of a user who performs the trip on a daily basis. This task is tackled by splitting
the series of locations into individual routes that the user took, using a predefined time
window tw.
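
As a concrete illustration, the segmentation step can be sketched as follows. This is a minimal Python sketch under stated assumptions: the StayPoint record and the default value of the time window tw are illustrative and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class StayPoint:
    x: float        # average longitude of the stay (Definition 1.3)
    y: float        # average latitude of the stay
    begin_t: float  # entering time (seconds)
    end_t: float    # leaving time (seconds)

def segment_into_routes(movement_history: List[StayPoint],
                        tw: float = 4 * 3600) -> List[List[StayPoint]]:
    """Split a movement history MovH = <s1, ..., sn> into routine routes:
    a new route starts whenever the gap between leaving one stay point and
    entering the next exceeds the time window tw."""
    routes, current = [], []
    for stay in movement_history:
        if current and stay.begin_t - current[-1].end_t > tw:
            routes.append(current)
            current = []
        current.append(stay)
    if current:
        routes.append(current)
    return routes
```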

1.4.3 Mapping Routes onto the Reference Plane

In this step, the stay points of the routine routes are then mapped into a reference
plane. The reference plane is composed of geographical regions. In this study, we use
the raster method to represent regions: the reference space is decomposed
into regular cells, and we therefore call the reference plane a spatial grid. As a result, each
stay of a user is represented by the cell that the user visited and remained in for a time
interval [beginT, endT].
Figure 1.2 illustrates stay points detected from a raw trajectory. Two stay points s1
and s2 are constructed from the two sets of points {p1, p2, p3} and {p8, p9}, respectively.
The spatial grid is represented by a matrix D[nx, ny]. Since each cell corresponds
to the element D[i, j], we label the cell Dij. A route can then be represented by a
sequence of cell labels. For instance, for the route shown in Fig. 1.2, the sequence
of stay points ⟨p0, s1, p5, p6, p7, s2, p10⟩ can be converted into the series of cell labels
D10, D20, D30, D31, D21, D11, D12, D13.
Users can move from a cell to its neighbors. From this idea, we can represent the grid
as a directed graph whose vertices are cells. The connection between two cells is
called an edge e. A route, consisting of a sequence of cell labels Dij, can thus be
represented by a sequence of edges e+.
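
To make the mapping concrete, the sketch below converts a sequence of positions into cell labels of the spatial grid D[nx, ny] and then into directed edges, as in the example of Fig. 1.2. The grid origin, the cell size, and the function names are assumptions introduced only for illustration.

```python
from typing import List, Tuple

Cell = Tuple[int, int]

def to_cell(x: float, y: float, x0: float, y0: float, cell_size: float) -> Cell:
    """Map a coordinate to the (i, j) indices of the spatial grid D[nx, ny]."""
    return int((x - x0) // cell_size), int((y - y0) // cell_size)

def route_to_edges(points: List[Tuple[float, float]],
                   x0: float, y0: float,
                   cell_size: float) -> List[Tuple[Cell, Cell]]:
    """Convert a route (sequence of positions) into a sequence of directed
    edges between consecutive, distinct grid cells, e.g. D10 -> D20 -> D30."""
    cells: List[Cell] = []
    for x, y in points:
        c = to_cell(x, y, x0, y0, cell_size)
        if not cells or cells[-1] != c:   # skip repeated positions inside one cell
            cells.append(c)
    return list(zip(cells[:-1], cells[1:]))
```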
The algorithm in Fig. 1.3 is able to reveal all of the stay points from a user’s
raw trajectory. Usually, each stay point carries a particular semantic meaning, such
as a restaurant, a rest area, or some other tourist attraction. Annotating each stay
point with a POI and annotating each POI with a human activity can be done either
manually or automatically.

Fig. 1.2 Example of stay point detection from a raw trajectory



Fig. 1.3 Algorithm for detecting stay points from a trajectory
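
The listing of Fig. 1.3 does not survive in this text. The following is a minimal sketch consistent with Definition 1.3 (not the authors' original listing); the planar distance helper and the parameter names theta_s and theta_t are assumptions of this sketch.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]   # (x, y, t)

def dist(p: Point, q: Point) -> float:
    """Planar distance; a haversine distance could be used for real GPS data."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def detect_stay_points(traj: List[Point], theta_s: float, theta_t: float):
    """Scan a raw trajectory and emit stay points (x, y, beginT, endT)
    following Definition 1.3."""
    stays, i, n = [], 0, len(traj)
    while i < n:
        j = i
        # grow the window while every point stays within theta_s of p_i
        while j + 1 < n and dist(traj[i], traj[j + 1]) <= theta_s:
            j += 1
        if traj[j][2] - traj[i][2] >= theta_t:        # stayed long enough
            xs = [p[0] for p in traj[i:j + 1]]
            ys = [p[1] for p in traj[i:j + 1]]
            stays.append((sum(xs) / len(xs), sum(ys) / len(ys),
                          traj[i][2], traj[j][2]))
            i = j + 1
        else:
            i += 1
    return stays
```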

1.4.4 Discovering Frequent Routes

With the routes obtained from the previous step the frequent routes can be discovered
using the algorithm introduced in [7].
The frequency f(e) of a directed edge e is defined as the number of routes passing
through this edge.
An edge is said to be qualified if its frequency is greater than a threshold α (i.e., f(e) > α).
The frequency of a route r is reflected by a route score Sr() defined as follows:

$$Sr(r) = \sum_{e \in r} \frac{h(f(e), \alpha)}{n} \qquad (1.1)$$

in which h() is a membership function defined as

$$h(x, \theta) = \begin{cases} 1 & \text{if } x > \theta \\ 0 & \text{otherwise} \end{cases}$$

A route is said to be qualified if its score is greater than a threshold γ.

Whether an edge is frequent is determined by the number of qualified routes passing
through the edge e. Formally, the edge score Se() is measured by

$$Se(e) = \sum_{r \in L(e)} \frac{h(Sr(r), \gamma)}{n} \qquad (1.2)$$

where L(e) is the list of routes traversing the edge e.
An edge is said to be frequent if its edge score is greater than a threshold β.
A route is said to be sharable if it is frequent. Whether a route r is frequent is determined
by the number of frequent edges passed by r. The frequency of a qualified route is now
determined by

$$Sr(r) = \sum_{e \in r} \frac{h(Se(e), \beta)}{n} \qquad (1.3)$$

Generally, the algorithm for discovering frequent routes works as follows. First,
all of the qualified edges are determined by calculating the edge frequency using
Eq. (1.1) and all of the edges whose frequency is greater than α are kept in a linked
list named qEList. From the qualified edges found, all of the qualified routes are then
discovered using Eq. (1.2) and stored in the list qRList. The third step determines
which qualified edges are not frequent; these are then removed from the list qEList.
The process repeats steps 2 and 3 until no more routes are removed from the list
qRList. The remaining elements in qRList are finally the result of the algorithm.
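
Under the scoring functions (1.1)–(1.3), the iterative pruning can be sketched as below. This is a hedged illustration: the data layout (routes given as lists of edge identifiers) and the normalisation used for the scores are assumptions, not the authors' implementation.

```python
def h(x, theta):
    """Membership function: 1 if x > theta, else 0 (see Eq. 1.1)."""
    return 1 if x > theta else 0

def discover_frequent_routes(routes, alpha, gamma, beta, max_iter=50):
    """routes: list of routes, each a list of hashable edge identifiers.
    alpha, gamma, beta: the thresholds of Eqs. (1.1)-(1.3)."""
    # Step 1: qualified edges, i.e. edges with raw frequency f(e) > alpha (qEList)
    freq = {}
    for r in routes:
        for e in r:
            freq[e] = freq.get(e, 0) + 1
    q_edges = {e for e, f in freq.items() if f > alpha}
    q_routes = list(routes)                       # qRList

    for _ in range(max_iter):
        # Route score: share of a route's edges that are currently qualified
        scores = [sum(1 for e in r if e in q_edges) / len(r) for r in q_routes]
        kept = [r for r, s in zip(q_routes, scores) if h(s, gamma)]

        # Edge score: share of kept routes that traverse the edge (Eq. 1.2)
        def se(e):
            return sum(1 for r in kept if e in r) / len(kept) if kept else 0.0
        q_edges = {e for e in q_edges if h(se(e), beta)}

        if len(kept) == len(q_routes):            # no route removed -> stop
            return kept
        q_routes = kept
    return q_routes
```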

1.4.5 Ridesharing Service Matching

An individual who performs a frequent route is assumed to be able to offer a ride to
other people. This person is called a driver. A frequent routine represents a regular
behavior regarding the spatial and temporal aspects of the driver. The term route is similar
to a bus route, in which there is an itinerary for a certain interval of time. Frequent
routes can be discovered from the user's movement history.
A ridesharing request is sent by the passenger and represented by the tuple
⟨DeptPoint, DeptT, ArrPoint, ArrT, POICat⟩, where DeptPoint and DeptT are the
pickup place and pickup time, respectively, ArrPoint and ArrT are the drop-off place
and arrival time, and POICat indicates the category of POI associated with the
destination that the passenger plans to visit.
A ridesharing request is mapped into the reference plane in the same way as a
route. The ridesharing service matching algorithm takes as input a reference plane
M, a set of POIs associated with business hours [openT, closeT], a set of frequent
routes FR, a set of time intervals T = {T1, …, Tn}, a ridesharing request RR, and a
time window tw.

In response to the user request, the algorithm finds possible new destinations
from the same category of POI and also proposes an adjustment to his or her
original schedule, which allows the user to still do the activity he/she desires while
being more flexible than the original schedule.
For each route tr in the set of frequent routes FR, the algorithm finds all the
possible pickup cells from the route tr within the time interval [RR.deptT – tw,
RR.arrT + tw]. Second, all of the cells containing the requested POI (RR.POICat)
that are traversed by the route tr during that interval [RR.deptT – tw, RR.arrT + tw]
would be determined. All the possible pairs of pickup point and destination would
be sent to the user.
With this spatiotemporal service matching strategy, the user has more
options when deciding how to perform his/her activity. This enables the user
to be more flexible instead of sticking to the original schedule.
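
A compact sketch of this matching step is given below. The record layouts (a request, routes annotated with the cells they traverse and the times they are there, and a per-cell POI index) are simplifications introduced for illustration and do not reproduce the paper's exact data structures.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]

@dataclass
class RideRequest:
    dept_cell: Cell     # cell of the pickup place DeptPoint
    dept_t: float       # DeptT
    arr_t: float        # ArrT
    poi_cat: str        # POICat: activity category at the destination

def match_request(request: RideRequest,
                  frequent_routes: List[Dict[Cell, Tuple[float, float]]],
                  poi_index: Dict[Cell, Set[str]],
                  tw: float) -> List[Tuple[Cell, Cell]]:
    """For every frequent route (given as cell -> (enter, leave) times),
    collect candidate (pickup cell, destination cell) pairs inside the
    extended window [DeptT - tw, ArrT + tw]."""
    lo, hi = request.dept_t - tw, request.arr_t + tw
    options = []
    for route in frequent_routes:
        in_window = {c for c, (enter, leave) in route.items()
                     if enter <= hi and leave >= lo}
        # simplification: a pickup cell must coincide with the rider's cell
        pickups = [c for c in in_window if c == request.dept_cell]
        dests = [c for c in in_window
                 if request.poi_cat in poi_index.get(c, set())]
        options.extend((p, d) for p in pickups for d in dests if p != d)
    return options
```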

1.5 Conclusion

In this work, travel demands are modeled based on the activities individuals intend
to perform. A driver is a person who has a relatively stable routine, owns a car,
and is willing to offer a ride to other people. Given a ridesharing request including
information such as departure place and time, arrival place and time, and intended
activity at the visited place, a driver as well as an optimal routing is recommended by
the system. To this end, the frequent routes of the person who can share his/her vehicle
are employed. Besides that, the matching method also considers the demanded activity
in connection with spatial and temporal constraints. Consequently, both driver and
rider derive advantage from ridesharing in terms of travel expense.
We are currently carrying out the performance analysis of the proposed method
on real datasets and implementing a system prototype.

References

1. Financial Times: https://www.ft.com/content/96608536-4204-11e7-9d56-25f963e998b2?mhq5j=e1 (2017). Accessed June 2017
2. Nechita, E., Crişan, G.C., Obreja, S.M., Damian, C.S.: Intelligent carpooling system: a case study
for Bacău metropolitan area. In: New Approaches in Intelligent Control, pp. 43–72. Springer
International Publishing Switzerland (2016)
3. Lim, J.H., Chan, J., Karunasekera, S., Leckie, C.: Personalized itinerary recommendation with
queuing time awareness. In: The International Conference of SIGIR, pp. 325–334 (2017)
4. Furletti, B., Cintia, P., Renso, C.: Inferring human activities from GPS tracks. In: UrbComp
2013 (2013)
5. Furuhataa, M., Dessouky, M., Brunetd, F.O.M., Koenig, S., Wang, X.: Ridesharing-the state-of-
the-art and future directions. Elsevier J. Transp. Res. Part B: Methodol. 28–46 (2013)
6. Kalanick, T., Camp, G.: Uber. https://www.uber.com/ (2015). Accessed 30 July 2015

7. He, W., Hwang, K., Li, D.: Intelligent carpool routing for urban ridesharing by mining gps
trajectories. IEEE Trans. Intell. Transp. Syst. 15(5), 2286–2296 (2014)
8. Yan, S., Chen, C.Y.: A model and a solution algorithm for the car pooling problem with pre-
matching information. Comput. Ind. Eng. Elsevier 61(3), 512–524 (2011)
Chapter 2
Optimal Scheduling and Benefit Analysis
of Solid Heat Storage Devices in Cold
Regions

Feng Sun, Xin Wen, Wei Fan, Gang Wang, Kai Gao, Jiajue Li and Hao Liu

Abstract On the basis of analyzing the characteristics of wind power in the winter
heating period and considering the critical state of wind power integration, this
paper rationalizes the energy load, decouples the energy coupling relationship under
the traditional heating mode, and optimally schedules solid-state heat storage (SHS)
devices.

Keywords Critical state of wind power integration · Heating mode · Solid-state


heat storage

F. Sun · W. Fan · G. Wang · J. Li


State Grid Liaoning Electric Power Company Limited, Electric Power Research Institute,
Shenyang 110006, Liaoning, China
e-mail: darkmars_2000@126.com
W. Fan
e-mail: fw_ldk@ln.sgcc.com.cn
G. Wang
e-mail: wangg_ldk@ln.sgcc.com.cn
J. Li
e-mail: en-sea@163.com
X. Wen · K. Gao
State Grid Liaoning Electric Power Supply Co., LTD., Shenyang 110006, Liaoning, China
e-mail: wenx@ln.sgcc.com.cn
K. Gao
e-mail: gk@ln.sgcc.com.cn
H. Liu (B)
Northeast Electric Power University, 132012 Jilin, Jilin Province, China
e-mail: 1282625960@qq.com


2.1 Introduction

According to the statistics of the National Energy Administration, the national
average wind curtailment rate reached 12% in 2017, and the curtailed wind power
accumulated over the whole year was 41.9 billion kWh [1]. The situation is still
grim.
The northeast region is rich in wind resources, but the deficiency of its regulation
capability and the lack of wind power consumption capacity are the main reasons for
wind power curtailment [2]. In particular, with the advent of the heating season, in
order to meet the heating demand, the units adopt the "heat-set" operation mode
(power output determined by the heat load), which further reduces the peak-shaving
capability of the system and leads to large-scale wind curtailment [3–6]. The literature
[7] describes solutions such as heat storage devices and electric heating devices.
Literature [8] analyzes the application prospects of combined electric–thermal systems
with large-capacity heat storage, and points out that an optimization method for the
combined electric–thermal system including large-capacity heat storage is the key to
studying such systems. The literature [9] pointed out that the solid-state heat storage
device uses low-valley electricity for heat storage, so as to suppress the peak-to-valley
difference of the system. The scheme has good economy, but lacks analysis
of specific implementation methods.
Based on the above work, this paper further studies the specific implementation
mode of the heat storage device to improve the ability to absorb clean energy, and
combines the peak-to-valley characteristics of the system load curve with the wind
power output characteristics to reasonably arrange the input and cutout of the heat
storage devices, thereby minimizing abandoned wind power by using it for heat storage.

2.2 Method for Realizing Heat Storage Device Participation in Wind Power Consumption

The solid heat storage device can be installed within the heating range of a thermal power
plant so that low-valley electricity and the power plant jointly supply heat, or it can be
directly connected to wind power generators so that curtailed wind power is stored as
heat to achieve clean-energy heating. A scheduling scheme using the heat storage device
only needs to ensure that the heat load demand is met within one scheduling period,
thereby providing greater flexibility for the units and effectively mitigating the
time–space mismatch between energy supply and load in the power system. The specific
implementation is as follows: at load lows, when the power grid would otherwise curtail
wind power, the heat storage device is put into operation, the wind power is stored in the
form of heat energy, and the wind power consumption space is increased. When the users
need heat, the solid-state heat storage device replaces the cogeneration unit and transfers
the stored heat energy to the heat users, relieving the operating pressure of the thermal
power units during peak hours.

Fig. 2.1 Heat storage technology implementation in power systems

The operation principle of using solid-state heat storage technology in the power
system is shown in Fig. 2.1. The solution has the following characteristics: when
the heat storage device is put into operation, it can be treated as a constant
power load, and compared with the heating operation mode of the combined heat and
power unit, the operating technical constraints are greatly reduced, so the scheme is
broadly applicable. It changes the traditional heating mode, decouples power generation
from heating, better matches the power demand with the clean energy output
characteristics, and effectively addresses the problem of clean energy consumption.

2.3 Thermal Storage Device Joint Optimization Scheduling Model

2.3.1 Determination of the Dispatch Target Value as the Key to Solving the Problem of Wind Power Curtailment

During periods of low load, the limited peak-shaving capacity of thermal power
units leaves the grid with insufficient space to accept wind power, and there is a serious
mismatch between the output characteristics of wind power and the load curve,


Fig. 2.2 Wind power curtailment throughout the year in a cold-region provincial grid

which causes the power system to curtail wind power. This is illustrated by the
analysis of the annual wind power curtailment in a provincial power grid in a
cold region (as shown in Fig. 2.2).
It can be seen from Fig. 2.3 that the curtailed wind power is the difference
between the equivalent wind power output and the network load:

$$P_j^{curtail} = \left( P_j^{wind} + P_{min}^{unit} \right) - P_j^{load} \qquad (2.1)$$

Among them, P_j^wind indicates the total wind power output at the j-th moment and
P_min^unit indicates the minimum output of the thermal power units, so P_j^wind + P_min^unit
is the equivalent wind power output. In this paper, the optimization is carried out over
the time interval T_c in which P_j^wind gives rise to curtailment during the load valley
period.
When the wind power is sufficient, it is limited by the total capacity of the heat
storage device, and the optimized scheduling plan will be regulated according to
the fixed target. Due to the uncertainty and volatility of wind power output, when
the wind power output is insufficient for a certain period of time to store heat in all
devices, the dispatch plan will be adjusted to the strategy of following the wind power
output. Therefore, in order to effectively suppress the peak-to-valley difference of
the load and maximize the consumption of wind power, the scheduling target at each
moment takes the minimum of the total capacity limit of the system's heat
storage devices and the equivalent wind power output, that is,


Fig. 2.3 Wind power consumption mechanism of the power system

$$P_j^{goal} = \min\left( \sum_{i}^{N} P_i^{heat} + P_{min}^{DG},\; P_j^{wind} + P_{min}^{unit} \right) \qquad (2.2)$$

Among them, P_i^heat represents the rated power of the i-th heat storage device, N
represents the total number of heat storage devices, P_min^DG is the minimum load of the
load curve during the low-valley period, P_j^wind indicates the total wind power output at
the j-th moment, and P_j^wind + P_min^unit is the equivalent wind power output.
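
A small numerical illustration of Eqs. (2.1) and (2.2); the figures used below are made-up example values in MW, not data from the case study.

```python
def curtailed_power(p_wind_j, p_min_unit, p_load_j):
    """Eq. (2.1): curtailed wind power at time j
    (a non-positive result means no curtailment)."""
    return (p_wind_j + p_min_unit) - p_load_j

def scheduling_target(p_heat_rated, p_min_dg, p_wind_j, p_min_unit):
    """Eq. (2.2): the target is the smaller of the heat-storage capacity limit
    (total rated power plus the minimum valley load) and the equivalent
    wind power output."""
    return min(sum(p_heat_rated) + p_min_dg, p_wind_j + p_min_unit)

# Illustrative example
p_heat_rated = [40, 60, 70, 80]                                          # MW
print(curtailed_power(p_wind_j=900, p_min_unit=4200, p_load_j=4900))     # 200 MW
print(scheduling_target(p_heat_rated, p_min_dg=4600,
                        p_wind_j=900, p_min_unit=4200))                  # 4850 MW
```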

2.3.2 Scheduling Model Objective Function

According to the content of the previous section, the plan is set according to the
scheduling target, and the optimal scheduling model of the heat storage device is
established. By controlling the input and cutout of the large-scale heat storage device,
the wind power is stored in the form of heat energy to the maximum extent. The
scheduling objective function is

$$\min z = \sum_{j}^{M} \left| P_j^{goal} - \left( \sum_{i}^{N} P_i^{heat} x_{i,j} + P_j^{load} \right) \right| \qquad (2.3)$$

Among them, x_{i,j} denotes the state of the i-th heat storage device at the j-th
scheduling time, N denotes the total number of heat storage devices, M denotes the
total number of time nodes in the low-valley scheduling period, and P_j^load denotes the
power load of the power system at the j-th time.

2.3.3 Restrictions

System thermal load constraints. In order to reduce waste of resources, the total
amount of heat stored using the wind power supply should not exceed the heat load
demand during the dispatch cycle.


$$\sum_{i}^{N} \sum_{j}^{M} P_i^{heat} x_{i,j} \cdot T_c \cdot \beta \le Q_l^{period} \qquad (2.4)$$

Among them, β represents the efficiency of the solid-state heat storage device and
Q_l^period represents the total heat load during the scheduling period.
Constraints on the capacity of the heat storage device. During a single schedul-
ing period, the operating capacity of the heat storage device participating in the
dispatch does not exceed the effective capacity of the heat storage device.


$$\sum_{j=1}^{M} P_i^{heat} x_{i,j} \cdot \Delta t \le C_i^{rated} - C_i^{reserve} \qquad (2.5)$$

Among them, C_i^rated is the rated capacity of the i-th heat storage device and Δt
is the minimum scheduling time step. Considering the short-term prediction error
of wind power, the heat storage device reserves the capacity C_i^reserve to cope
with the situation in which the actual wind power output exceeds the predicted value.
System operation security constraints. From the perspective of safety and reliability,
when the wind power fluctuates and cannot be fed into the power grid, the thermal
power units must be able to carry this part of the heat storage load. Therefore, it is
required that the total heat storage power connected at each time node does not
exceed the maximum peaking capacity of the thermal power units.


$$\sum_{i}^{N} P_i^{heat} x_{i,j} + P_j^{load} \le P_{max}^{peak} \qquad (2.6)$$

Among them, P_max^peak is the maximum adjustable peak power of the system.
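
Under this reconstruction, Eqs. (2.3)–(2.6) form a small mixed-integer program. The sketch below uses the open-source PuLP library and linearises the absolute deviation of Eq. (2.3) with auxiliary variables; the function and parameter names, and the absolute-value reading of the objective, are assumptions of this sketch rather than the authors' implementation.

```python
import pulp

def schedule_storage(p_goal, p_load, p_heat, beta, t_c, dt,
                     q_period, c_rated, c_reserve, p_peak_max):
    """p_goal, p_load: target and load over the M valley steps (MW);
    p_heat, c_rated, c_reserve: per-device rated power and capacities;
    returns the on/off schedule x[i][j]."""
    n_dev, m_steps = len(p_heat), len(p_goal)
    prob = pulp.LpProblem("storage_dispatch", pulp.LpMinimize)
    x = [[pulp.LpVariable(f"x_{i}_{j}", cat="Binary") for j in range(m_steps)]
         for i in range(n_dev)]
    dev = [pulp.LpVariable(f"dev_{j}", lowBound=0) for j in range(m_steps)]

    prob += pulp.lpSum(dev)                        # Eq. (2.3), |.| linearised
    for j in range(m_steps):
        supplied = pulp.lpSum(p_heat[i] * x[i][j] for i in range(n_dev)) + p_load[j]
        prob += dev[j] >= p_goal[j] - supplied
        prob += dev[j] >= supplied - p_goal[j]
        prob += supplied <= p_peak_max             # Eq. (2.6)

    # Eq. (2.4): stored heat must not exceed the heat-load demand of the period
    prob += pulp.lpSum(p_heat[i] * x[i][j] * t_c * beta
                       for i in range(n_dev) for j in range(m_steps)) <= q_period
    # Eq. (2.5): per-device capacity minus its reserve
    for i in range(n_dev):
        prob += pulp.lpSum(p_heat[i] * x[i][j] * dt
                           for j in range(m_steps)) <= c_rated[i] - c_reserve[i]

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [[int(x[i][j].value()) for j in range(m_steps)] for i in range(n_dev)]
```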

2.4 Heat Storage Device Scheduling Scheme Utility Income Indicator

2.4.1 Direct Revenue from Heat Storage Device Scheduling

Considering the stable heating load during the heating period in winter, the heat
storage device is used to store heat in the low-valley period, and the heat is released
at the rated heat-release power to meet the heating demand during the peak load time
and even the whole day. Therefore, the direct economic benefits of using solid-state
heat storage devices for heating are

$$F_{heat} = (S_{unit} - S_{wind}) \cdot L_{unit} - \sum_{i}^{N} \frac{1}{T_i^{heat}} \cdot \left( F_i^{build} + F_i^{deprecit} + F_i^{maintain} \right) \qquad (2.7)$$

S_unit and S_wind are the unit costs of power supplied by the thermal cogeneration
units and by the wind power units, respectively, L_unit is the total electric energy consumed
by the solid-state heat storage devices for heat storage during the low-valley period, and
T_i^heat, F_i^build, F_i^deprecit, and F_i^maintain, respectively, indicate the service life,
construction cost, total depreciation cost, and total maintenance cost of the i-th heat
storage device.
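
Equation (2.7) can be evaluated directly, as in the sketch below; all numbers are placeholders, not results from the paper.

```python
def direct_revenue(s_unit, s_wind, l_unit, life, f_build, f_deprec, f_maint):
    """Eq. (2.7): direct heating revenue.  life, f_build, f_deprec and f_maint
    are per-device lists of equal length (one entry per storage device)."""
    annualised = sum((fb + fd + fm) / t
                     for t, fb, fd, fm in zip(life, f_build, f_deprec, f_maint))
    return (s_unit - s_wind) * l_unit - annualised

# Placeholder example with two devices
print(direct_revenue(s_unit=0.38, s_wind=0.10, l_unit=2.0e6,
                     life=[20, 20], f_build=[3.0e6, 4.0e6],
                     f_deprec=[1.5e6, 2.0e6], f_maint=[0.5e6, 0.6e6]))
```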

2.4.2 Indirect Benefits of Heat Storage Device Scheduling

Reducing peak-shaving compensation in the auxiliary service market. Optimized
scheduling of the heat storage devices increases the wind power consumption space,
as shown in Fig. 2.4. It can solve the problem of clean heating and improve the peaking
ability of the system. In order to achieve the goal of consuming all of the clean energy,
the large amount of wind power connected to the power grid would inevitably put
peaking pressure on the thermal power units if solid-state heat storage devices were
not used. According to the Interim Measures for the Administration of Auxiliary
Services of Grid-connected Power Plants, the regional auxiliary service market
compensation prices are shown in Table 2.1.
After the application of the heat storage device, the compensation cost for the
peak shaving of the thermal power unit can be reduced indirectly.


$$F_{comp} = \sum_{i}^{P} \sum_{j}^{M} \left( \eta_j^{N} \cdot f^{N} - \eta_j^{L} \cdot f^{L} \right) \cdot P_i^{unit} \cdot \Delta t \qquad (2.8)$$

η_j^L and f^L, respectively, represent the proportion of deep peak shaving and the
compensation cost of the unit before the heat storage devices are dispatched at the j-th
time. η_j^N and f^N, respectively, represent the proportion of deep peak shaving

Fig. 2.4 Principle of heat storage device optimization scheduling for lifting the system's wind power consumption space

Table 2.1 Unit peaking compensation fee schedule in the regional auxiliary services market

Unit peaking depth   Electricity subsidy per kWh (yuan/kWh)   Remarks
60%                  Fine                                     *
50%                  *                                        System-defined peak shaving depth
40–50%               0.4                                      *
40% or less          1                                        *

of the unit after the optimal dispatch of the heat storage devices at the j-th time and
the corresponding compensation cost. P represents the total number of units operating
during the dispatch day, and P_i^unit is the rated active power of the i-th unit.
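
Equation (2.8) can likewise be evaluated per unit and per time step, as in the following sketch; the sign convention simply follows the reconstructed formula, and the argument names are illustrative.

```python
def compensation_change(eta_before, eta_after, f_before, f_after,
                        p_unit_rated, dt):
    """Eq. (2.8): change in peak-shaving compensation.  eta_before/eta_after
    are per-step deep-peak-shaving proportions before (L) and after (N) the
    storage devices are dispatched; f_before/f_after are the corresponding
    compensation rates; p_unit_rated lists the units' rated powers."""
    total = 0.0
    for p_unit in p_unit_rated:
        for eb, ea in zip(eta_before, eta_after):
            total += (ea * f_after - eb * f_before) * p_unit * dt
    return total
```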

2.5 Case Analysis

2.5.1 Case Conditions and Different Modes of Operation

Based on the total load data of the network in Liaoning Province from September 22
to 25, 2017, it can be seen that the variation of load peaks and valleys in the heating
season is obvious (Fig. 2.5).
By analyzing the load characteristics of the province, September 22 and 23 are selected as typical days, with the load trough running from 21:00 to 7:00. The capacity allocation of the heat storage devices of the three major thermal storage plants in Liaoning Province is shown in Table 2.2.
The heat storage device in FuXin, with a total heat storage capacity of 400 MW, has the advantages of small unit capacity and flexible dispatching and distribution. The total heat storage capacities of Dandong Jinshan and Diaobingshan are 300 MW and 260 MW, respectively, and these devices are characterized by large capacity and high stability.

Fig. 2.5 Typical daily network supply load in Liaoning Province



Table 2.2 Capacity of heat storage units in Liaoning Province
Power plant name | 40 MW | 60 MW | 70 MW | 80 MW
DiaoBingShan     | *     | *     | 2     | 2
DanDong JinShan  | *     | 2     | 2     | *
FuXin            | 10    | *     | *     | *

In order to simplify the analysis, the constraints of network load are not considered. The optimal scheduling model is solved for heat storage devices of different capacities. The calculations are carried out in the following three modes:
Mode 1: The heat storage device does not participate in scheduling. The units in the system supply the heat load under the operation mode of "determining power generation by heating", which leaves insufficient wind power consumption space and causes the grid to abandon wind power.
Mode 2: The heat storage device does not adopt an optimized scheduling scheme and performs scheduling control according to the following operational principles. According to the province's real-time wind power data, the operation of the heat storage device is scheduled for the wind curtailment period, and the grid and the wind turbines cooperate to charge the heat storage device. When the curtailed wind power is greater than the rated capacity of the heat storage device, the heat storage device is put into operation and the wind turbines are used for heat storage; otherwise, the system discards the wind and obtains electric energy from the power grid for heat storage. During the non-curtailment stage, the heat storage device only supplies heat to meet the heat load demand. Adopting this control strategy will increase the pressure on the thermal power units, and the wind power cannot be completely absorbed.
Mode 3: The heat storage device performs heating according to the optimized scheduling plan. The unit operation mode is not changed, and the dispatch of heat storage is rationalized to absorb wind power.

2.5.2 Analysis of Optimal Scheduling Results of Heat Storage Devices

In this paper, the provincial grid of Liaoning was analyzed during the typical daily trough period, the optimized scheduling model of the heat storage device was solved, and the operation scheme of the solid heat storage device was obtained under the three different operating modes. With the same wind power output, the output plan of the heat storage device under the different modes is shown in Fig. 2.6.

Fig. 2.6 Comparison of three operation modes of heat storage devices (heat storage device output (MW) versus load valley time, for Mode 1, Mode 2, and Mode 3)

Combined with the actual wind power data, the situation of the abandoned wind
power under the three scheduling schemes is shown in Fig. 2.7.
It can be seen from the wind curtailment results that operation modes 2 and 3 can effectively reduce the abandoned wind power, and the optimized solid-state heat storage device scheduling method reduces the abandoned wind power the most.
It should be noted that when operating mode 2 is adopted, the cogeneration units are required to provide part of the heat storage energy while the amount of abandoned wind power is reduced. This converts high-grade electrical energy into heat, which creates unnecessary energy waste. The results show that the heat storage device scheduling plan of the optimized scheduling model is better. According to the
relevant provisions of the notice on the trial of peak-to-valley electricity price policy
for residential electric heating users, Table 2.3 gives a comparison of the benefits of
the three schemes.

Fig. 2.7 Comparison of three operation modes of system wind power curtailment (abandoned wind power (MW) and the electricity drawn by the thermal power units during heat storage, versus load valley time, for Mode 1, Mode 2, and Mode 3)

Table 2.3 Benefit analysis of the three schemes
Scheduling method | Increase in total heating/(GW h) | Abandoned wind power/(GW h) | Economic benefit savings/(ten thousand yuan)
Mode 1            | *                                | 5028.9                      | *
Mode 2            | 3361.7                           | 1917.5                      | 40.04
Mode 3            | 4070.8                           | 958.1                       | 48.54

2.6 Conclusion

This paper proposes to use large-scale high-power solid-state heat storage to absorb wind power and reduce the system peak-to-valley difference. The example shows that optimizing the dispatch of the heat storage device can not only save the high electricity cost generated by peak heating but also improve the valley load level to reduce the deep peak shaving of the units, maximize the system wind power consumption space, and optimize the overall utility income.

Acknowledgements Project supported by State Grid Corporation Science and Technology (2018GJJY-01).

Chapter 3
Optimization Algorithm of RSSI
Transmission Model for Distance Error
Correction
Yong Liu, Ningning Li, Dawei Wang, Ti Guan, Wenting Wang, Jianpo Li
and Na Li

Abstract In wireless sensor networks localization process, RSSI-based ranging


methods mostly adopt the traditional logarithmic-distance path loss model. Its model parameters are usually taken as empirical values, ignoring changes in the surrounding environment during the node localization process, which increases the localization error and reduces the applicability of the algorithm. To solve
this problem, this paper proposes an optimization algorithm of RSSI transmission
model for distance error correction (RSSI-DEC) to optimize the path loss factor and
the reference path loss between anchor nodes in the signal transmission model. FA
algorithm and PSO algorithm are used to optimize the parameters of the model, and
the model parameters adapted to the monitoring environment are obtained to correct
the ranging error. The simulation results show that RSSI-DEC algorithm proposed
in this paper can effectively improve node localization accuracy and environmental
adaptability. The algorithm proposed in this paper has an average relative localization
error of 9.17%.

Keywords RSSI · Localization error · Parameters · Correction

Y. Liu · D. Wang · T. Guan


State Grid Shandong Electric Power Company, Jinan 250003, China
N. Li
Shandong Cyber Security and Informationization Technology Center, Jinan 250003, China
W. Wang
State Grid Shandong Electric Power Company, Electric Power Research Institute, Jinan 250003,
China
J. Li (B) · N. Li
School of Computer Science, Northeast Electric Power University, Jilin 132012, China
e-mail: jianpoli@163.com


3.1 Introduction

In WSNs, most sensor nodes are randomly deployed and their specific locations are unknown [1]. Node localization based on RSSI ranging has a certain localization error
because of the vulnerability of electromagnetic wave transmission to environmental
interference [2]. Therefore, how to improve node localization algorithm in order to
improve localization accuracy without additional hardware has become a research
hot spot of node localization technology [3]. A trigonometric extremum suppres-
sion localization algorithm is proposed. It has better stability, but cannot avoid the
existence of gross error [4]. A cooperative localization algorithm based on received
signal strength is proposed. It improved the localization accuracy of nodes to a cer-
tain extent, but it ignored the information between unknown nodes, resulting in a large amount of wasted redundant information [5]. A node deployment strategy of wireless sensor
network based on IRVFA algorithm is presented. The strategy can improve network
coverage rate and effective utilization rate of nodes at the same time, but it will also
lead to increased node localization costs [6]. A parameter tracking method based on
RSSI is proposed. It can improve positioning accuracy, but the algorithm is complex and cannot easily meet the requirement of quick localization [7]. The shuffled frog leaping algorithm is presented. It can reduce the localization error, but it is not suitable for large-scale networks [6].
In order to find more suitable parameters of the transmission model in the detection
area, this paper proposes an optimization algorithm of RSSI transmission model for
distance error correction. This algorithm combines the strong optimization characteristics of the FA algorithm with the fast approximation characteristics of the PSO algorithm: the FA algorithm is introduced into the PSO algorithm to help it obtain the global optimal solution. This paper also proposes a logarithmic decrement inertia weight to improve the precision of the search and accelerate convergence.

3.2 Basic Principle of Ranging Based on RSSI

The localization algorithm based on RSSI ranging includes a ranging phase and a localization phase. In the ranging stage, the commonly used signal transmission model is mainly the logarithmic-distance path loss model, which is described as

P_L(d_{ut}) = P_L(d_0) + 10k \lg\left(\frac{d_{ut}}{d_0}\right) + x_\sigma    (3.1)

where P_L(d_ut) (dBm) is the path loss when the distance between unknown node u and anchor node t is d_ut (m). P_L(d_0) (dBm) is the path loss at the reference distance d_0 (m), typically d_0 = 1 m. k is the path loss exponent, usually k = 2–6. x_σ is a Gaussian noise variable with zero mean and standard deviation σ [8].

Therefore, the distance between the unknown node and the anchor node is depicted as

d_{ut} = d_0 \times 10^{\frac{P_L(d_{ut}) - P_L(d_0) - x_\sigma}{10k}}    (3.2)
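As a concrete illustration of formulas (3.1) and (3.2), the following minimal Python sketch (not part of the original paper; the parameter values are illustrative assumptions) converts a path loss measurement into an estimated distance under the log-distance model.

```python
import math
import random

def path_loss(d, pl_d0=-40.0, k=3.0, sigma=2.0, d0=1.0):
    """Log-distance path loss model, formula (3.1); returns P_L(d) in dBm."""
    return pl_d0 + 10.0 * k * math.log10(d / d0) + random.gauss(0.0, sigma)

def distance_from_path_loss(pl, pl_d0=-40.0, k=3.0, d0=1.0):
    """Invert the model (formula (3.2), noise term ignored) to estimate distance in metres."""
    return d0 * 10.0 ** ((pl - pl_d0) / (10.0 * k))

measured = path_loss(25.0)                    # simulate a measurement taken at 25 m
print(distance_from_path_loss(measured))      # noisy estimate of the 25 m distance
```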

In the localization phase, node localization in three-dimensional space usually uses the four-sided ranging method [9]; that is, the unknown node needs to obtain the distances to at least four anchor nodes through formula (3.2). Its coordinates are then calculated according to formula (3.3).

(x - x_\varepsilon)^2 + (y - y_\varepsilon)^2 + (z - z_\varepsilon)^2 = d_{ut\varepsilon}^2, \quad \varepsilon = 1, 2, 3, 4    (3.3)

The distance between the unknown node U(x, y, z) and the anchor node T_ε(x_ε, y_ε, z_ε), ε = 1, 2, 3, 4, is d_utε. In the actual environment, however, the RSSI localization algorithm is easily affected by the surrounding environment, multipath effects, non-line-of-sight transmission, and so on, which produces localization errors, so only an estimated coordinate value U′(x, y, z) of the unknown node can be obtained.

3.3 Optimization Algorithm of RSSI Transmission Model for Distance Error Correction

In logarithmic-distance path loss model, the parameters affecting RSSI ranging accu-
racy are path loss factor and reference path loss. Their values are related to the
surroundings. Therefore, in this paper RSSI-DEC is proposed. First, FA algorithm
is introduced into PSO algorithm to help it obtain the global optimal solution. In
addition, in order to enable the algorithm to search a large area at high speed at the
beginning of iteration and gradually shrink to a better space at the end of iteration
and implement more detailed search, this paper introduces logarithmic decrement
inertia weight based on the method of linear decrement of inertia weight to improve
the accuracy of the search solution and speed up the convergence speed.

3.3.1 Determine Fitness Function

Using the information of all M anchor nodes that can communicate with each other, for any two such anchor nodes u and t, according to the principle of the minimum sum of squared errors, the fitness function is

f(x) = \min \frac{\sum_{n_{ut}=1}^{C_M^2} (d_{ut} - D_{ut})^2}{d_{ut}^2} = \min \frac{\sum_{n_{ut}=1}^{C_M^2} \left( d_0 \cdot 10^{\frac{PL(d_{ut}) - PL(d_0) - x_\sigma}{10k}} - D_{ut} \right)^2}{d_{ut}^2}    (3.4)

where d_ut is the measured distance between anchor nodes, D_ut is the actual distance between anchor nodes, and PL(d_ut) is the path loss measured in the node's current environment. The individual optimized by the intelligent algorithm consists of the path loss factor k and the reference path loss PL(d_0) between anchor nodes, recorded as x(k, PL(d_0)). Using formula (3.4) as the objective function of the FA and PSO algorithms, the optimal parameters of the signal transmission model are found.
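A minimal sketch of this fitness function is given below (illustrative only; the pairing of anchor nodes and the omission of the noise term are simplifying assumptions). The candidate solution is x = (k, PL(d_0)), and the fitness sums the squared relative ranging errors over all communicating anchor pairs.

```python
def fitness(x, anchor_pairs):
    """Fitness of a candidate x = (k, PL_d0), following formula (3.4).

    anchor_pairs: list of (pl_measured, true_distance) tuples, one per pair of
    anchor nodes that can communicate with each other.
    """
    k, pl_d0 = x
    total = 0.0
    for pl_ut, d_true in anchor_pairs:
        d_ut = 1.0 * 10.0 ** ((pl_ut - pl_d0) / (10.0 * k))  # estimated distance, d0 = 1 m
        total += (d_ut - d_true) ** 2 / d_ut ** 2            # squared relative error
    return total
```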

3.3.2 Logarithmic Decrement Type Inertia Weight Function

Inertial weight ω plays an important role in adjusting the search capability in PSO
algorithm and FA algorithm. In order to avoid the prematurity of algorithm and
balance the local search ability and global search ability of algorithm, this paper
optimizes the inertia weight function.

\omega_i = \omega_{max} - \lambda(\omega_{max} - \omega_{min}) \log_{iter_{max}}(iter)    (3.5)

where ω_i represents the inertia weight at the current iteration, ω_max and ω_min are the maximum and minimum values of the inertia weight, respectively, iter_max is the maximum number of iterations, iter is the current iteration number of the algorithm, and λ is the logarithmic adjustment factor; when 0 < λ < 1 it is a compression factor, and when λ > 1 it is an expansion factor.
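For illustration, the logarithmic decrement inertia weight of formula (3.5) can be written as the short function below (a sketch; the default parameter values are those used later in the experiments and are assumptions of this note).

```python
import math

def log_decrement_weight(iteration, iter_max, w_max=0.9, w_min=0.4, lam=0.2):
    """Logarithmic decrement inertia weight, formula (3.5).

    The weight shrinks from w_max toward w_min as log_{iter_max}(iteration) grows.
    """
    if iteration < 1:
        return w_max                      # before the first iteration, keep the largest weight
    return w_max - lam * (w_max - w_min) * math.log(iteration, iter_max)
```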
The logarithmic decrement type inertia weight function is introduced into the FA algorithm's position update formula and the PSO algorithm's speed update formula, respectively, in this paper.
(1) Localization update of FA algorithm
The relative fluorescence brightness of fireflies is

I = I_0 \times e^{-\gamma r_{ij}}    (3.6)

The attractivity of fireflies is

\beta = \beta_0 \times e^{-\gamma r_{ij}^2}    (3.7)

where I0 is the maximum fluorescence brightness of fireflies, β0 is the maximum


attraction, that is, the fluorescence brightness and attraction of fireflies themselves
(at r = 0), which are related to the objective function value, and the better the

objective function value, the higher the brightness of the fireflies themselves. γ is the absorption coefficient of light intensity; because fluorescence gradually weakens with increasing distance and absorption by the propagation medium, the absorption coefficient can be set as a constant to reflect this characteristic. r_ij is the Euclidean distance between fireflies i and j, which in this article is the Euclidean distance between x(k, PL(d_0))_i and x(k, PL(d_0))_j.
Introducing the logarithmic decrement type inertia weight function, the position update formula by which firefly i is attracted to move toward firefly j becomes

xi = ωi xi + β × (xj − xi ) + α × (rand − 1/2) (3.8)

where x_i and x_j are the spatial positions of fireflies i and j. α is the step factor, a constant on [0, 1]. rand is a random factor uniformly distributed on [0, 1].
(2) PSO algorithm speed update
The logarithmic decrement type inertia weight function is introduced to update the
particle velocity formula and position update formula

vi = ωi × vi + c1 × rand (0, 1) × (pBesti − xi )


+ c2 × rand (0, 1) × (gBesti − xi ) (3.9)

xi = xi + vi (3.10)

where x_i is the spatial position of the particle. v_i is the velocity at the current position of particle (or firefly) i. pbest_i and gbest_i are the current individual optimal solution and the global optimal solution. c_1 and c_2 are acceleration factors, and they play an important role in adjusting the cognitive and social parts of the iteration process.
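The following sketch (illustrative only, not the authors' code; names follow the text) shows how formulas (3.6)–(3.10) can be combined in one update step, with the logarithmic decrement inertia weight applied both to the firefly move and to the particle velocity.

```python
import math
import random

def firefly_move(x_i, x_j, w_i, beta0=0.2, gamma=1.0, alpha=0.5):
    """Firefly position update, formulas (3.7) and (3.8)."""
    r2 = sum((a - b) ** 2 for a, b in zip(x_i, x_j))       # squared Euclidean distance r_ij^2
    beta = beta0 * math.exp(-gamma * r2)                   # attractiveness, formula (3.7)
    return [w_i * xi + beta * (xj - xi) + alpha * (random.random() - 0.5)
            for xi, xj in zip(x_i, x_j)]

def pso_step(x_i, v_i, pbest_i, gbest, w_i, c1=2.0, c2=2.0):
    """Particle velocity and position update, formulas (3.9) and (3.10)."""
    new_v = [w_i * v + c1 * random.random() * (p - x) + c2 * random.random() * (g - x)
             for v, x, p, g in zip(v_i, x_i, pbest_i, gbest)]
    new_x = [x + v for x, v in zip(x_i, new_v)]
    return new_x, new_v
```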

3.3.3 RSSI-DEC-Based Localization Process

The basic flow of the algorithm is as follows:


(1) Initialize a certain number of (N) particles in three-dimensional space, each par-
ticle can be expressed as x(k, PL(d0 ))i , and evaluate the fitness of all initialized
particles.
(2) The current coordinate values of each particle are assigned to the parameters of
FA algorithm in turn, and the brightness of fireflies is re-initialized by calling FA
algorithm once and updating the attractiveness and localization of FA algorithm.
(3) Feedback the fitness value f (xFA ) of each FA algorithm operation result xFA to
PSO algorithm, and compare it with the individual optimal fitness value f (xp )
and the global optimal fitness value f (xg ), judge whether to update f (xp ), f (xg ),

If f(x_FA) < f(x_p) and f(x_FA) < f(x_g) are satisfied, then update the optimal solutions pbest_i and gbest_i and proceed to the next step; otherwise, return to (2).
(4) Update the speed and localization of PSO algorithm.
(5) Check the termination condition. The termination condition is set as a maximum number of iterations. If the iteration count reaches this maximum, the algorithm ends and returns the current global optimal particle position, which gives the best combination of the model parameters. If the termination condition is not met, return to (3).

3.3.4 WSN Localization Based on RSSI-DEC

According to the above RSSI-DEC algorithm, more precise distances between nodes
can be obtained. In order to realize node localization in three-dimensional environ-
ment, four-sided ranging method can be used to obtain coordinates of unknown
nodes.
Four-sided ranging method is extended from three-sided measurement method.
Assuming that the coordinates of four beacon nodes are, respectively, Aa (xa , ya , za ),
Ab (xb , yb , zb ), Ac (xc , yc , zc ), and Ad (xd , yd , zd ) and the coordinates of unknown node
U are (x, y, z), the distance measured from the unknown node to each beacon node
is da , db , dc , and dd . According to the three-dimensional spatial distance formula, a
set of nonlinear equations can be obtained as follows:


\begin{cases}
(x - x_a)^2 + (y - y_a)^2 + (z - z_a)^2 = d_a^2 \\
(x - x_b)^2 + (y - y_b)^2 + (z - z_b)^2 = d_b^2 \\
(x - x_c)^2 + (y - y_c)^2 + (z - z_c)^2 = d_c^2 \\
(x - x_d)^2 + (y - y_d)^2 + (z - z_d)^2 = d_d^2
\end{cases}    (3.11)

By subtracting the equations pairwise to obtain a linear system and solving it, the coordinates of the unknown node can be obtained.
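A minimal sketch of this four-sided (multilateration) solution is shown below; it linearizes system (3.11) by subtracting the first equation from the others and solves the resulting 3 × 3 linear system with NumPy (the use of NumPy is an assumption of this sketch, not a tool named in the paper).

```python
import numpy as np

def four_sided_localization(anchors, distances):
    """Solve system (3.11) for the unknown node position.

    anchors: 4x3 array of anchor coordinates; distances: 4 measured ranges.
    Subtracting the first equation from the rest cancels the quadratic terms in x, y, z.
    """
    anchors = np.asarray(anchors, dtype=float)
    d = np.asarray(distances, dtype=float)
    A = 2.0 * (anchors[1:] - anchors[0])
    b = (d[0] ** 2 - d[1:] ** 2
         + np.sum(anchors[1:] ** 2, axis=1) - np.sum(anchors[0] ** 2))
    return np.linalg.solve(A, b)

# Example: noise-free ranges reproduce the true position (10, 20, 5).
anchors = [(0, 0, 0), (100, 0, 0), (0, 100, 0), (0, 0, 100)]
true = np.array([10.0, 20.0, 5.0])
dist = [np.linalg.norm(true - np.array(a)) for a in anchors]
print(four_sided_localization(anchors, dist))   # approximately [10. 20. 5.]
```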

3.4 Experimental Simulation Analysis

In MATLAB 2014, we distribute 150 nodes (including 50 anchor nodes and 100
unknown nodes) in the area of 100 × 100 × 100 m3 . Node communication radius
rnode = 10 m, reference distance between nodes d0 = 1 m, ωmax = 0.9, ωmin = 0.4,
and λ = 0.2, firefly brightness I = 1, attractive force β = 0.2, and learning factor
c1 = c2 = 2.
Figure 3.1 shows the fitness curve of FAPSO algorithm with different inertia
weights. As can be seen from the figure, with the increase of iteration times, the
fitness function value gradually decreases, that is, gradually approaches the optimal
value. The logarithmic decrement inertia weight function proposed in this paper

Fig. 3.1 RSSI-DEC algorithm with different inertia weights: optimal value change curve (fitness function value (%) versus iteration, for ω = 1, ω = 0.8, and ω = ω_i)

has the smallest fitness function value and the smallest error. When the number of
iterations is about 127, the fitness function value tends to the minimum. It can be
seen that the attenuation inertia weight proposed in this paper plays an active role
in the operation of the algorithm. The algorithm has relatively weak global search
capability and relatively strong local search capability. At this time, the algorithm
has stronger search capability near the extreme value, which is helpful to find the
optimal solution.
Figure 3.2 shows the comparison of node localization errors after localiza-
tion using different model parameter optimization algorithms. The algorithms WPSO, WFA, and RSSI-DEC are used to optimize the model parameters, and localization is then carried out with the four-sided localization method. The X(k, PL(d_0)) parameters obtained by the three algorithms are X_WPSO(3.91, −45.32), X_WFA(3.62, −40.09), and X(3.17, −41.53), respectively, and the corresponding average localization errors are about 24.08%, 18.98%, and 9.17%. It can be seen that the average relative localization error of the RSSI-DEC algorithm is lower than that of the WPSO and WFA algorithms, which effectively validates the optimization effect of the RSSI-DEC model; the optimal X(k, PL(d_0)) parameter obtained is X(3.17, −41.53).
Figure 3.3 shows the comparison of average relative localization errors of nodes
after localization using different node localization algorithms. As can be seen from the

Fig. 3.2 Comparison of the average relative localization error of unknown nodes after localization using different model parameter optimization algorithms (average relative localization error (%) versus unknown nodes, for WPSO, WFA, and RSSI-DEC)

figure, the average relative errors of WRSSI algorithm, ARSSI algorithm, and RSSI-
DEC algorithm proposed in this paper are 31.46%, 15.02%, and 9.17%, respectively.
It can be seen that the average relative localization error of the RSSI-DEC algorithm is lower than that of the WRSSI and ARSSI algorithms, showing a good localization effect.

3.5 Conclusion

In order to obtain the most suitable parameters of the signal transmission model for
the wireless sensor network node localization algorithm based on RSSI ranging, an
RSSI-DEC optimization algorithm based on ranging error correction is proposed. The
optimal parameters are solved by intelligent algorithm, and a new transmission model
is constructed. The model is applied to node localization. The simulation results show
that the algorithm proposed in this paper overcomes the limitation of the fixed empirical model parameters of the traditional RSSI algorithm, improves the environmental adaptability of the algorithm, and achieves better ranging accuracy and stability than the compared algorithms over the same distances. The RSSI-DEC-based node localization algorithm proposed in

Fig. 3.3 Comparison of average relative localization error of unknown nodes after localization using different node localization algorithms (average relative localization error (%) versus unknown nodes, for WRSSI, ARSSI, and RSSI-DEC)

this paper has an average relative localization error of 9.17%. It is 22.29% lower than
the RSSI-based weighted centroid localization algorithm (WRSSI) and it is 5.85%
lower than the adaptive RSSI localization algorithm (ARSSI).

Acknowledgements This work was supported by “Research on Lightweight Active Immune Tech-
nology for Electric Power Supervisory Control System”, a science and technology project of State
Grid Co., Ltd. in 2019.

References

1. Yourong, C., Siyi, L., Junjie, C.: Node localization algorithm of wireless sensor networks with
mobile beacon node. Peer-to-Peer Netw. Appl. 10(3), 795–807 (2017)
2. Fariz, N., Jamil, N., Din, M.M.: An improved indoor location technique using Kalman filtering
on RSSI. J. Comput. Theor. Nanosci. 24(3), 1591–1598 (2018)
3. Teng, Z., Qu, Z., Zhang, L., Guo, S.: Research on vehicle navigation BD/DR/MM integrated
navigation positioning. J. Northeast Electr. Power Univ. 37(4), 98–101 (2017)
4. Rencheng, J., Zhiping, C., Hao, X.: An RSSI-based localization algorithm for outliers suppres-
sion in wireless sensor networks. Wirel. Netw. 21(8), 2561–2569 (2015)
5. Zhang, X., Xiong, W., Xu, B.: A cooperative localization algorithm based on RSSI model in
wireless sensor networks. J. Electr. Meas. Instrum. 30(7), 1008–1015 (2016)

6. Teng, Z., Xu, M., Zhang, L.: Nodes deployment in wireless sensor networks based on improved
reliability virtual force algorithm. J. Northeast Dianli Univ. 36(2), 86–89 (2016)
7. Jinze, D., Jean, F.D., Yide, W.: A RSSI-based parameter tracking strategy for constrained position
localization. EURASIP J. Adv. Signal Process. 2017(1), 77 (2017)
8. Yu, Z., Guo, G.: Improvement of localization technology based on RSSI in ZigBee networks.
Wirel. Pers. Commun. 95(3), 1–20 (2016)
9. Sun, Z., Zhou, C.: Adaptive clustering algorithm in WSN based on energy and distance. J.
Northeast Dianli Univ. 36(1), 82–86 (2016)
Chapter 4
A New Ontology Meta-Matching
Technique with a Hybrid Semantic
Similarity Measure

Jiawei Lu, Xingsi Xue, Guoxiang Lin and Yikun Huang

Abstract Ontology is the kernel technique of semantic web, which can be used to
describe the concepts and their relationships in a particular domain. However, dif-
ferent domain experts would construct the ontologies according to different require-
ments, and there exists a heterogeneity problem among the ontologies, which hinders
the interaction between ontology-based intelligent systems. Ontology matching tech-
nique can determine the links between heterogeneous concepts, which is an effective
method for solving this problem. Semantic similarity measure is a function to calcu-
late to what extent two concepts are similar to each other, which is the key component
of ontology matching technique. Generally, multiple semantic similarity measures
are used together to improve the accuracy of the concept recognition. How to com-
bine these semantic similarity measures, i.e., the ontology meta-matching problem,
is a challenge in the ontology matching domain. To address this challenge, this paper
proposes a new ontology meta-matching technique, which applies a novel combi-
nation framework to aggregate two broad categories of similarity measures. The
experiment uses the famous benchmark provided by the Ontology Alignment Eval-
uation Initiative (OAEI). Comparing results with the participants of OAEI shows the
effectiveness of the proposal.

Keywords Ontology meta-matching · Semantic similarity measure · OAEI

J. Lu · X. Xue (B) · G. Lin


College of Information Science and Engineering, Fujian University of Technology, Fuzhou, China
e-mail: jack8375@gmail.com
J. Lu · X. Xue
Intelligent Information Processing Research Center, Fujian University of Technology, Fuzhou,
China
X. Xue
Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of
Technology, Fuzhou, China
Fujian Key Lab for Automotive Electronics and Electric Drive, Fujian University of Technology,
Fuzhou, China
Y. Huang
Concord University College, Fujian Normal University, Fuzhou, China

4.1 Introduction

Since ontology can reach consensus on the meaning of concepts in a certain field
and provides rich domain knowledge and semantic vocabularies for the interaction
between intelligent systems, it is considered as a solution to the heterogeneity of
data in the semantic web. However, due to the decentralized nature of the semantic
web, the same concept may have different definitions in different ontologies, which
causes the so-called ontology heterogeneity problem. The ontology heterogeneity
problem seriously affects the sharing between domain knowledge and has become
the bottleneck of interaction and collaboration between semantic web application
systems. Ontology matching technique can determine the links between heteroge-
neous concepts, which is an effective method for solving this problem. Semantic
similarity measure is a key component of ontology matching technology, which
is a function to calculate the similarity between two concepts. There are currently
four types of semantic similarity measures, i.e., literal-based method, background-
knowledge-based method, context-based method, and instance-based method [1].
Each type of method is subdivided into a number of specific methods, for example,
with respect to the background-knowledge-based similarity measure [2], the specific
method could be the node-based methods, the edge-based methods, and the mixed
methods of two approaches. Usually, multiple semantic similarity measures are used
together to improve the accuracy of the concept recognition [3, 4], but how to com-
bine these semantic similarity measures, i.e., the ontology meta-matching problem,
is a challenge in the ontology matching domain [5]. To address this challenge and
improve the ontology alignment’s quality, in this paper, a new combination frame-
work is proposed to aggregate two broad categories of similarity measures, i.e., the
ones based on edit distance and background knowledge base.
The rest of this paper is organized as follows: Sect. 4.2 introduces the basic con-
cepts, Sect. 4.3 describes the composition of similarity measures in detail, Sect. 4.4
shows the experimental study, and finally Sect. 4.5 draws the conclusion and presents
the future work.

4.2 Basic Concepts

4.2.1 Ontology Matching

There are many definitions of ontology. Here, for the convenience of work, ontology
is defined as follows:

Definition 4.1 Ontology is a 3-tuple

O = (C, P, I ),

where

C is the set of concepts, e.g., some terminologies in a particular domain.


P is the set of attributes, e.g., the characteristics of a class or the relationships
between classes.
I is a set of instances, e.g., the real-world objects belonging to some class.

Definition 4.2 Ontology alignment is also defined as a 3-tuple

(e, e′, n),

where

e and e′ are the entities in the two ontologies, and n is the similarity value between e and e′, which is in [0, 1].

Definition 4.3 The ontology matching process can be defined as follows:

A_N = f(O, O′, A),

where
O and O′ are the two ontologies, and A is the set of entity similarity values n in Definition 4.2.

The ontology matching value is the average of the entity matching values; in this paper, it is the average of all attribute similarities, with the same weight adopted for different attributes. The similarity value interval of the ontology is [0, 1]. When the similarity value of two ontologies is 1, the two ontologies are equivalent; when the similarity value of two ontologies is 0, the two ontologies are completely unrelated.

4.2.2 Similarity Measure

This paper utilizes two broad categories of similarity measures, i.e., the edit-distance-based similarity measure and the background-knowledge-based similarity measure. For the latter, we use the similarity measure proposed by Wu and Palmer [6], which works with WordNet. With respect to the edit-distance-based similarity measure, we use the N-gram distance [7] similarity measure and the cosine distance similarity measure. Next, these measures are described one by one in detail.
Similarity measure technology based on background knowledge base
WordNet is an electronic language database that covers a collection of synonyms for
various vocabularies. It has hierarchical sub-parent relationships and is commonly
used to measure similar relationships between concepts. This paper uses the Wu and

Palmer similarity measure, which considers the depth of the closest common parent concept of the two concepts in WordNet. The deeper the parent concept in WordNet, the
stronger the conceptual semantic relationship between the two concepts. Compared
with the SimLC similarity measure [8], it considers the change in the strength of the
connection between concepts, and the measurement will be more accurate. Given
two concepts c1 and c2 , the similarity measure of Wu and Palmer between them is
Simwp (c1 , c2 ) equal to

Sim_{wp}(c_1, c_2) = \frac{2 \times depth(LCA_{c_1,c_2})}{depth(c_1) + depth(c_2)}    (4.1)

where LCA(c_1, c_2) represents the closest common parent concept of c_1 and c_2, and depth(LCA_{c_1,c_2}), depth(c_1), and depth(c_2), respectively, represent the depth of the closest common parent concept and the depths of c_1 and c_2 in the WordNet hierarchy.
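As an illustration, the sketch below implements formula (4.1) over a toy is-a hierarchy; the toy hierarchy and the depth convention (root at depth 0) are assumptions of this note, whereas the actual measure is computed over WordNet.

```python
def depth(concept, parent_of):
    """Depth of a concept: number of edges from it up to the hierarchy root."""
    d = 0
    while concept in parent_of:
        concept = parent_of[concept]
        d += 1
    return d

def wu_palmer(c1, c2, parent_of):
    """Wu and Palmer similarity, formula (4.1), over a toy is-a hierarchy."""
    ancestors = set()
    node = c1
    while True:                              # collect c1 and all of its ancestors
        ancestors.add(node)
        if node not in parent_of:
            break
        node = parent_of[node]
    node = c2
    while node not in ancestors:             # walk up from c2 to the closest common ancestor
        node = parent_of[node]
    lca = node
    return 2.0 * depth(lca, parent_of) / (depth(c1, parent_of) + depth(c2, parent_of))

# Toy hierarchy: entity -> animal -> {dog, cat}
parents = {"animal": "entity", "dog": "animal", "cat": "animal"}
print(wu_palmer("dog", "cat", parents))      # 2*1 / (2+2) = 0.5
```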
Similarity measure technique based on edit distance
There are many similarity measures based on edit distance, such as Levenshtein distance [9], N-gram distance [7], and cosine distance. According to the literature [7], N-gram distance has superior performance for strings on the ontology matching problem, especially when N = 3. Therefore, this paper uses N-gram distance as the similarity measure for strings. Given two strings s_1 and s_2, the N-gram distance is defined as follows:

N\text{-}gram(s_1, s_2) = \frac{2 \times comm(s_1, s_2)}{N_{s_1} + N_{s_2}}    (4.2)

where comm(s_1, s_2) represents the number of common substrings in the two strings, and N_{s_1} and N_{s_2} represent the number of substrings in the strings s_1 and s_2, respectively.
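A minimal sketch of formula (4.2) with N = 3 (trigrams) is given below; counting the substrings as a multiset of length-3 windows is an implementation assumption of this note.

```python
from collections import Counter

def ngram_similarity(s1, s2, n=3):
    """N-gram similarity, formula (4.2): 2 * comm(s1, s2) / (N_s1 + N_s2)."""
    g1 = Counter(s1[i:i + n] for i in range(len(s1) - n + 1))
    g2 = Counter(s2[i:i + n] for i in range(len(s2) - n + 1))
    common = sum((g1 & g2).values())          # number of shared n-gram occurrences
    total = sum(g1.values()) + sum(g2.values())
    return 2.0 * common / total if total else 0.0

print(ngram_similarity("author", "authors"))  # high similarity for near-identical labels
```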
As a famous edit distance measure, cosine distance is suitable for the similarity
measure of sentences. Given two sentences D1 and D2 , the cosine distance is defined
as follows:

Cos(D_1, D_2) = \frac{V_1 \cdot V_2}{\|V_1\| \times \|V_2\|}    (4.3)

where V1 is the vector of sentence D1 and V2 is the vector of sentence D2 . For


example, sentence D1 is “Lily likes eating apples” and sentence D2 is “Lily is eating
an orange.” Put the words in the two sentences into a union C, and get the set C =
{Lily, likes, eating, apples, is, an, orange}. The words appearing in the sentence are
1 in the corresponding vector, otherwise 0, and the vector dimension is the number
of words in the union C. Then, the V1 vector is (1, 1, 1, 1, 0, 0, 0) and the V2 vector
is (1, 0, 1, 0, 1, 1, 1).
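The sketch below reproduces this sentence example with a straightforward bag-of-words cosine computation (tokenization by whitespace and lowercasing are assumptions of this note).

```python
import math

def cosine_similarity(d1, d2):
    """Cosine similarity of formula (4.3) over binary word-occurrence vectors."""
    w1, w2 = set(d1.lower().split()), set(d2.lower().split())
    vocab = sorted(w1 | w2)
    v1 = [1 if w in w1 else 0 for w in vocab]
    v2 = [1 if w in w2 else 0 for w in vocab]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(v1)) * math.sqrt(sum(v2))   # binary vectors: sum of squares = sum
    return dot / norm if norm else 0.0

print(cosine_similarity("Lily likes eating apples", "Lily is eating an orange"))  # about 0.45
```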

4.2.3 Evaluation of Matching Results

The quality of ontology matching results is usually evaluated through recall and
precision. Recall (also known as completeness) is used to measure the proportion
of the correct matching results found to account for all correct results. A value of 1
for recall means that all correct matching results have been found. However, recall
does not provide the number of incorrect matching results in the found matching
results. Therefore, recall needs to be considered together with precision (also called
correctness), which is used to measure the proportion of the correct matching result
in the found matching results. A precision value of 1 means that all found matching
results are correct, but this does not mean that all correct matching results have
been found. Therefore, recall and precision must be weighed together, which can
be achieved by the f-measure (i.e., the weighted harmonic mean of the recall and
precision).
Given a reference matching R and a matching result A, the recall, precision, and
f-measure can be calculated by the following formula:

recall = \frac{|R \cap A|}{|R|}    (4.4)

precision = \frac{|R \cap A|}{|A|}    (4.5)

f\text{-}measure = \frac{2 \times recall \times precision}{recall + precision}    (4.6)
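A small sketch of these evaluation formulas, treating an alignment as a set of matched entity pairs, is shown below (the set representation is an assumption made for illustration).

```python
def evaluate(reference, found):
    """Recall, precision, and f-measure (formulas (4.4)-(4.6)) for alignments given as sets of pairs."""
    correct = len(reference & found)
    recall = correct / len(reference) if reference else 0.0
    precision = correct / len(found) if found else 0.0
    f = (2 * recall * precision / (recall + precision)) if (recall + precision) else 0.0
    return recall, precision, f

R = {("Person", "Human"), ("Paper", "Article"), ("writes", "authorOf")}
A = {("Person", "Human"), ("Paper", "Article"), ("Topic", "Subject")}
print(evaluate(R, A))   # approximately (0.667, 0.667, 0.667)
```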

4.3 The Composition of Similarity Measures

After parsing the benchmark test set in OAEI, three types of entities are obtained, i.e.,
data property, class, and object property. Each of the three types of entities contains
three attributes, i.e., ID, label, and comment. This paper will measure the similarity
of the three types of entities separately.
• According to the similarity measure matrix, when an entity in the ontology com-
pares with all entities of the same type in another ontology, we consider the ID and
label in the entity as a group, and measure the two entities by the N-gram method
and the Wu and Palmer method, respectively. If the maximum value of the ID and
label similarity values is larger than the threshold, the corresponding matching
pairs will, respectively, be added into the similar sets N, W of the N-gram and Wu
and Palmer. After that, N and W are combined to obtain U. In particular, when
combining N and W, there are four types of possible situations in the above three
sets as follows:

1. For the complementary set N − W of the set N, the complementary set W − N


of the set W, and the union U, there is only one entity matching pair in the
three;
2. For the union U, there are multiple entity matching pairs;
3. The union U is empty, and there are multiple matching pairs in the set N − W
and the set W − N ;
4. Set N − W , set W − N , and set U are empty.

• Different measures are taken for different situations:

1. For the first type of situation, take the entity matching pair and put it into the
same entity matching set S;
2. For the second type of situation, the related entities in the matching pairs of
multiple entities in the union U are taken out, and the similarity measure is
performed on the comment of the entity using the cosine distance, and finally
the matching pair with the largest value of the merit is put into the set S;
3. For the third type of situation, take out the matching pairs in the set N − W
and the set W − N , and use the cosine distance to measure the similarity of the
entity’s comment attribute, and finally take the matching pair with the largest
common similarity value into the set S;
4. For the fourth type of situation, use cosine to measure the similarity of the
comments of all entities, and finally take the matching pair with the largest
common similarity value into the set S;
5. For the second type of situation, the third type of situation, and the fourth type
of situation, there will often be no comment, then the N-gram distance will be
used to measure the similarity of the ID and the label, taking the average of
the two as the similarity value between the entities, when the similarity value
is greater than the threshold, the corresponding matching pair is put into the
set S.

• We find that when the entity matching order is disturbed (reverse order compari-
son, random comparison), the entity matching pairs in the set S will change. By
sequential comparison, reverse order comparison, and random comparison, we
extract the entities in the change matching pair, and use the N-gram distance to,
respectively, identify their IDs and labels. The similarity measure is performed,
and the similarity measure is performed on their comment using the cosine dis-
tance. Finally, the average of the three similarity values is taken, and the entity
pair with the largest average value is put into the set S.

Table 4.1 The brief description of benchmarks in OAEI 2016
Case number | Brief introduction
101–104     | The ontologies to be matched are identical or differ only slightly in their OWL constraints
201–210     | The conceptual structure of the ontologies to be matched is the same, but the language features are different
221–247     | The language features of the ontologies to be matched are the same, but the conceptual structure is different

4.4 Experimental Results and Analysis

In this test, the famous Ontology Alignment Evaluation Initiative (OAEI) 2016 test
case set was used. A brief description of the OAEI 2016 test case set is shown in
Table 4.1. Each test case in the OAEI test case set consists of two ontology to be
matched and one reference match for evaluating the matching results.

4.4.1 Experimental Configuration

In this experiment, each entity string is lowercased in the preprocessing stage. When the matching pair cannot be determined and the ID and label need to be measured, WordNet is first used to detect whether the vocabulary constituting the ID and label exists in it.
In terms of thresholds, the thresholds for each phase are determined by commis-
sioning as follows:
• When using N-gram distance and Wu and Palmer similarity measure to measure
ID and label, the threshold is taken as 0.9. (When the similarity value is greater
than 0.9, the concepts being measured may be considered similar or identical.)
• When using cosine distance to measure the similarity of comment, the threshold
is taken as 0.9. (When the similarity value is greater than 0.9, the concepts being
measured may be considered similar or identical.)
• When ID and label cannot determine the matching pair and there is no comment,
using N-gram distance to measure ID and label. Take the average of the similarity
values of the two as the final entity similarity. The threshold is 0.95. (Because
WordNet is not used here, raising the threshold is beneficial to improve the preci-
sion of the metric.)

1 Ontology Alignment Evaluation Initiative (OAEI), http://oaei.ontologymatching.org/2016,


accessed at 2019–02–22.

Table 4.2 Comparison of the measures in this paper with OAEI 2016 participants
Ontology matching system | P    | F    | R
edna                     | 0.58 | 0.65 | 0.79
AML                      | 1.00 | 0.56 | 0.48
LogMap                   | 0.92 | 0.76 | 0.75
LogMapLt                 | 0.50 | 0.59 | 0.79
PhenoMF                  | 0.00 | 0.00 | 0.00
PhenoMM                  | 0.00 | 0.00 | 0.00
PhenoMP                  | 0.00 | 0.00 | 0.00
XMap                     | 0.97 | 0.76 | 0.72
LogMapBio                | 0.51 | 0.54 | 0.60
Measure of this paper    | 0.96 | 0.89 | 0.87

4.4.2 Experimental Results and Analysis

Table 4.2 compares the results obtained by the method presented in this paper with
those of OAEI 2016 participants, where the values are the matching results for the
three types of test cases described in Table 4.1. According to the relevant OAEI reg-
ulations, test cases that are not automatically generated are removed for convenience
comparison: 102–104, 203–210, 230–231. The results obtained by the method in this
paper are the average of the results in five independent runs (the OAEI participants
are the average of the results in five independent runs), and in Table 4.2, the symbols
P, F, and R represent the values of precision, f-measure, and recall, respectively.
It can be seen from Table 4.2 that the precision obtained in this paper ranks third,
but the recall rate and f-measure are higher than other measures, so the measure of
this paper is effective.

4.5 Conclusion and Future Work

Ontology matching technology is critical to the realization of the knowledge sharing.


How to efficiently and accurately determine the semantic relationships between the
entities in ontologies is an urgent problem to be solved. To solve this problem, this
paper proposes a new meta-matching technique with a hybrid semantic similarity
measure. The comparison with the existing ontology matching system shows that
the proposed method is effective.

Acknowledgements This work is supported by the Program for New Century Excellent Talents
in Fujian Province University (No. GY-Z18155), the Program for Outstanding Young Scientific
Researcher in Fujian Province University (No. GY-Z160149), the 2018 Program for Outstanding
Young Scientific Researcher in Fujian, the Scientific Research Project on Education for Young and

Middle-aged Teachers in Fujian Province (No. JZ170367), and the Scientific Research Foundation
of Fujian University of Technology (No. GY-Z17162).

References

1. Xue, X., Wang, Y.: Using memetic algorithm for instance coreference resolution. IEEE Trans.
Knowl. Data Eng. 28(2), 580–591 (2016)
2. Xue, X., Pan, J.S.: A compact co-evolutionary algorithm for sensor ontology meta-matching.
Knowl. Inf. Syst. 56(2), 335–353 (2018)
3. Xue, X., Wang, Y.: Optimizing ontology alignments through a memetic algorithm using both
MatchFmeasure and unanimous improvement ratio. Artif. Intell. 223, 65–81 (2015)
4. Cai, Y., Zhang, Q., Lu, W., et al.: A hybrid approach for measuring semantic similarity based
on IC-weighted path distance in WordNet. J. Intell. Inf. Syst. 51(1), 23–47 (2018)
5. Xue, X., Wang, Y., Ren, A.: Optimizing ontology alignment through memetic algorithm based
on partial reference alignment. Expert Syst. Appl. 41(7), 3213–3222 (2014)
6. Wu, Z., Palmer, M.: Verbs semantics and lexical selection. In: Proceedings of the 32nd Annual
Meeting on Association for Computational Linguistics, pp. 133–138. Association for Compu-
tational Linguistics (1994)
7. Mascardi, V., Locoro, A., Rosso, P.: Automatic ontology matching via upper ontologies: a
systematic evaluation. IEEE Trans. Knowl. Data Eng. 22(5), 609–623 (2010)
8. Leacock, C., Chodorow, M.: Combining local context and WordNet similarity for word sense
identification. In: WordNet: An Electronic Lexical Database, vol. 49, no 2, pp. 265–283 (1998)
9. Richard Benjamins, V. (ed.): Knowledge Engineering and Knowledge Management: Ontologies
and the Semantic Web. Springer Verlag, Berlin (2003)
Chapter 5
Artificial Bee Colony Algorithm
Combined with Uniform Design

Jie Zhang, Junhong Feng, Guoqiang Chen and Xiani Yang

Abstract As artificial bee colony algorithm is sensitive to the initial solutions, and
is easy to fall into local optimum and premature convergence, this study presents a
novel artificial bee colony algorithm based on uniform design to acquire the better
initial solutions. It introduces an initialization method with uniform design to replace
random initialization, and selects the better ones of those initial bees generated by the
initialization method as the initial bee colony. This study also introduces a crossover
operator based on uniform design, which can search evenly the solutions in the
small vector space formed by two parents. This can increase searching efficiency and
accuracy. The best two of the offsprings generated by the crossover operator based on
uniform design are taken as new offsprings, and they are compared with their parents
to determine whether to update their parents or not. The crossover operator can ensure
that the proposed algorithm searches uniformly the solution space. Experimental
results performed on several frequently used test functions demonstrate that the
proposed algorithm has more outstanding performance and better global searching
ability than standard artificial bee colony algorithm.

Keywords Bee colony · Artificial bee colony · Uniform design · Uniform


crossover

J. Zhang · J. Feng (B) · X. Yang


School of Computer Science and Engineering, Guangxi Universities Key Lab of Complex System
Optimization and Big Data Processing, Yulin Normal University, Yulin 537000, Guangxi, China
e-mail: jgxyfjh@126.com
G. Chen
School of Computer and Information Engineering, Henan University, Kaifeng 475004, Henan,
China


5.1 Introduction

Artificial bee colony (ABC) algorithm [1–3] is a novel heuristic optimization algo-
rithm inspired by bees’ collecting honey. Standard ABC accomplishes the optimiza-
tion for a problem by simulating the process of bees’ looking for nectar sources,
which includes the stage of employed bees, that of onlookers, and that of scouters
as well. Because of few control parameters, high accuracy, and strong search perfor-
mance, ABC has been applied to continuous space optimization, data mining, neural
network training, etc. However, ABC has still some disadvantages such as premature
convergence and be easy to fall into local optimum. Many researchers have proposed
a variety of improvement methods to improve the performance of ABC; however, till
now, it is still a difficult problem how to improve the convergence of the algorithm
and avoid falling into the local optimum.
Uniform design was first proposed by Wang and Fang in 1978. It aims to distribute the design points uniformly within the test range, so as to obtain as much information as possible using as few test points as possible. Uniform design arranges the experiment by a set of elaborately designed tables, similar to orthogonal design. Each uniform design table is accompanied by a usage table, which indicates how to select the appropriate columns from the design table and the uniformity level of the testing program formed by the selected columns. Uniform design extends the methods for classical, deterministic univariate problems to the calculation of multivariate problems. Its main goal is to sample a small number of points from a given set of
points so that the sampled points can be evenly distributed in the whole solution
vector space.
In order to search the solution space uniformly, the study introduces uniform
design to generate the initial bee colony, so that the individuals in the bee colony can
scatter evenly over the feasible space of a problem. In order to increase the guidance
and influence of the optimal nectar source on each nectar source, the study introduces
the crossover operator based on uniform design, so that two parents participating in
crossover can acquire their offsprings uniformly. The crossover operator is performed
between each nectar source and the optimal nectar source, which is to search evenly
the small vector space formed by them. This can increase the influence of the optimal
nectar source and acquire good fine search.

5.2 Artificial Bee Colony

Artificial bee colony (ABC) belongs to one of swarm intelligent optimization algo-
rithms. Inspired by the process of bees’ collecting honey, it simulates the types of
bees, the roles of bees, and the process of collecting honey to address the practical
optimization problems. If the problem to optimize is regarded as the nectar source to
search, and then its feasible solution is equivalent to the location of a nectar source,
while its fitness is equivalent to the amount of nectar in the nectar source. The more

the amount of nectar is, the better the nectar source is. The maximization optimiza-
tion problem can be solved directly using ABC, while the minimization optimization
problem needs to be transformed to use ABC indirectly. According to different roles
of bees, they can be divided into three types such as employed bees, onlookers, and
scouters. The number of employed bees is generally assumed to be equal to the
number of onlookers, and be equal to the number of nectar sources. However, the
number of scouters is only 1, and it can work only when certain conditions have been
met. Therefore, the searching process of an optimization problem is correspondingly
divided into the stage of employed bees, that of onlookers, and that of scouters.
Given that the dimension of a problem is D and the number of nectar sources, employed bees, and onlookers is SN, the standard ABC algorithm regards the process of seeking the solution of the problem as that of searching for nectar sources in the D-dimensional vector space. Its detailed steps are as follows:
(1) Initialization of bee colony
Random initialization method is utilized to initialize SN nectar sources, and the
initialization formula is shown in formula (5.1):

xid = xdmin + r1 × (xdmax − xdmin ) (5.1)

where x id denotes the d-dimensional value of the i-th nectar source xi ∈


{x1 , x2 , . . . , xSN } , i ∈ {1, 2, . . . , SN }, d ∈ {1, 2, . . . , D}; xdmax , xdmin represent upper
bound and lower bound of the d-dimensional value, respectively; r 1 denotes the ran-
dom number distributed uniformly within the interval [0, 1]. If r 1 = 0, then xid = xdmin ,
while r 1 = 1, xid = xdmax . Obviously, this can ensure that the values after random
initialization lie in the scopes of the feasible solutions of the problem to optimize.
The initialization solutions of employed bees and onlookers are, respectively, set as
the initialized nectar source.
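For illustration, formula (5.1) can be coded as the short sketch below (the colony size, dimension, and bounds are illustrative assumptions).

```python
import random

def random_init(sn, dim, lower, upper):
    """Random initialization of SN nectar sources, formula (5.1)."""
    return [[lower[d] + random.random() * (upper[d] - lower[d]) for d in range(dim)]
            for _ in range(sn)]

colony = random_init(sn=20, dim=5, lower=[-5.0] * 5, upper=[5.0] * 5)
```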
(2) Stage of employed bees
At this stage, the nectar sources of employed bees are updated by the following
formula (5.2):

v_{id} = x_{id} + r_2 \times (x_{id} - x_{kd}), \quad i \ne k    (5.2)

where vid indicates a new nectar source, x id is the same as formula (5.1), x k represents
a nectar source different from x_i, x_kd indicates the d-dimensional value of x_k, k ≠ i and k ∈ {1, 2, . . . , SN}, and r_2 denotes a random number distributed uniformly within the interval [0, 1]. The above formula looks for a different neighboring nectar source and updates the bee's old nectar source in a differential manner. Formula
(5.2) cannot ensure that the updated nectar sources of employed bees lie in the scopes
of the feasible solutions of the problem to optimize. Therefore, bound scopes need
to be checked by means of setting the values less than lower bound or those larger
than upper bound into lower bound or upper bound, respectively. After the nectar
source of employed bees was obtained by the above formula, greedy algorithms are

utilized to compare the fitness of nectar source and that of employed bees’ nectar
source. The greedy selection strategy is employed to select the better nectar source.
(3) Stage of onlookers
At this stage, onlookers select nectar sources by means of roulette strategy. This
is to ensure that the nectar source with higher fitness is updated more likely. The
probability of each nectar source is calculated according to the following formula
(5.3):

P_i = \frac{F_i}{\sum_{i=1}^{SN} F_i}    (5.3)

where Fi denotes the fitness of the i-th nectar source, and its calculation formula is
shown in the following formula (5.4):

F_i = \begin{cases} \frac{1}{1 + fit_i}, & fit_i \ge 0 \\ 1 + |fit_i|, & fit_i < 0 \end{cases}    (5.4)

where fit i and |fit i | represent the objective function value and its absolute value,
respectively.
Similarly to the employed bee, after selecting a nectar source the onlooker updates it using formula (5.2), checks its bounds, compares the fitness of the new and old nectar sources in terms of the greedy algorithm, and selects the better nectar source by means of the greedy selection strategy.
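The sketch below illustrates formulas (5.3) and (5.4): mapping objective values to fitness and performing roulette-wheel selection of a nectar source (the objective values in the example are illustrative, assuming a minimization problem).

```python
import random

def fitness_value(obj):
    """Fitness F_i of formula (5.4) from the objective value fit_i."""
    return 1.0 / (1.0 + obj) if obj >= 0 else 1.0 + abs(obj)

def roulette_select(objectives):
    """Pick a nectar source index with probability P_i of formula (5.3)."""
    fits = [fitness_value(o) for o in objectives]
    total = sum(fits)
    r = random.random() * total
    acc = 0.0
    for i, f in enumerate(fits):
        acc += f
        if acc >= r:
            return i
    return len(fits) - 1

print(roulette_select([3.2, 0.5, 7.9, 1.1]))   # smaller objectives are picked more often
```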
(4) Stage of scouters
For each nectar source, the parameter trail can determine the number of the nectar
source that does not update. This is equivalent to the number of the optimal solution
of the problem to optimize that does not change. At initialization, the trail values of
all nectar sources are all equal to 0. At the stages of employed bees and onlookers,
if a nectar source is updated, namely, a better nectar source is found, and then trail
← 0, while if a nectar source is maintained as the previous nectar source, then trail
← trail + 1. In ABC, a predefined parameter limit is utilized to control scouters. If
trail is larger than or equal to limit, then the stage of scouters will start.
Until the termination condition is satisfied, ABC repeatedly goes through the abovementioned stages of employed bees, onlookers, and scouters in order. The best nectar source found so far is saved in each loop. The solution of the optimal nectar source is regarded as the optimal solution of the problem to optimize [4].

5.3 The Proposed Algorithm

5.3.1 Algorithm Thoughts

Uniform design [5–11] is a sampling method. It enables the sampled data points to scatter uniformly over the solution space of the problem to optimize, which both increases the diversity of the data points and improves the search efficiency. The solution
space is divided into multiple subspaces first, and then uniform design is applied in
each of the subspaces to obtain the initial population generation algorithm based on
uniform design [6, 8, 9]. According to the intersection of the upper and lower bounds
of two parents, uniform design is applied in two parents to obtain the crossover
operator with uniform design [6].
The ABC algorithm is sensitive to the initial solutions, and the initial population plays an important role in the subsequent iterations. A good initial solution may lead to the optimal solution quickly, while a poor one may cause the search to fall into a local optimum. The ABC algorithm uses a random initialization method, which does not ensure that the obtained initial solutions are scattered over the whole vector space of the problem: these solutions may concentrate in only a few regions, while no solutions are distributed in other regions at all. Therefore, this study presents an artificial bee colony algorithm based on uniform design. It uses the initial colony generation algorithm based on uniform design to generate an initial bee colony scattered evenly over the vector space. Between each nectar source and the optimal nectar source, the crossover based on uniform design is conducted to generate a better nectar source. If a better nectar source is generated, the current nectar source is replaced by it; otherwise, the current nectar source is kept.
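As a rough illustration of the initialization idea, the sketch below builds a U(Q0, D) table with the good-lattice-point construction and places one such design in each of S slices of the search range. The generator vector, the way the space is split into subspaces, and the cell-centre mapping are all assumptions made for this sketch; the papers cited above give the exact constructions.

```python
import numpy as np

def uniform_design_table(q, d, generators):
    """U(q, d) table via the good-lattice-point rule u[i, j] = ((i+1) * h_j) mod q,
    mapped to levels 1..q; `generators` should hold integers coprime to q
    (cycled here if shorter than d, which is fine for a sketch only)."""
    i = np.arange(1, q + 1).reshape(-1, 1)
    h = np.resize(np.asarray(generators), d).reshape(1, -1)
    return (i * h - 1) % q + 1

def uniform_init(lower, upper, S=4, Q0=17, generators=(1, 5, 7, 13)):
    """Place a Q0-point uniform design in each of S slices of the search range,
    giving S * Q0 >= SN evenly scattered initial nectar sources."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    D = len(lower)
    table = uniform_design_table(Q0, D, generators)
    population = []
    for s in range(S):
        lo = lower + s * (upper - lower) / S
        hi = lower + (s + 1) * (upper - lower) / S
        population.append(lo + (table - 0.5) / Q0 * (hi - lo))   # cell-centre mapping
    return np.vstack(population)
```

The defaults S = 4 and Q0 = 17 match the parameter values listed in Sect. 5.4.2.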

5.3.2 Details of the Proposed Algorithm

The detailed steps of the proposed algorithm are as follows:


Step 1 Initialize the parameters.
Step 2 Given the number of nectar sources SN, determine the number of subintervals
S and the size of bee colony in each subinterval Q0 , such that Q0 * S ≥ SN.
Generate the initial nectar source colony P1 using the initialization colony
generation algorithm based on uniform design, calculate the fitness of each
nectar source in P1 , and find out the optimal nectar source bestP1 .
Step 3 Go to the stage of employed bees. Update each nectar source in P1 using
formula (5.2) and acquire a new nectar source P2 .
Step 4 Go to the stage of onlookers. Calculate the probability of each nectar source
in P2 using formula (5.3) and select nectar sources from P2 by means of
roulette strategy. Update each nectar source in P2 using formula (5.2) and
acquire a new nectar source P3 .

Step 5 Go to the stage of scouters. For each nectar source in P3 , if trail ≥ limit,
then generate a new nectar source using formula (5.1) to replace the current
nectar source; otherwise, keep the current nectar source. The generated new
nectar source is marked as P4 .
Step 6 Calculate the fitness of each nectar source in P4 and find out the optimal
nectar source bestP2 . If bestP2 is superior to bestP1 , then bestP1 ← bestP2 .
Step 7 Perform the crossover operator based on uniform design (a sketch of this operator is given after these steps) on each nectar source in P4 and bestP1, and find out the best one, Oopt, from the generated Q1 offspring. If Oopt is superior to the current nectar source, then replace the current nectar source with Oopt to obtain a new nectar source colony P5. If Oopt is superior to bestP1, then bestP1 ← Oopt.
Step 8 If the termination condition is not satisfied, then P1 ← P5 and go to Step 3; otherwise, output the optimal nectar source bestP1 and terminate the algorithm.
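Step 7 can be sketched along the same lines as the initialization: place a small uniform design (Q1 points) inside the hyper-rectangle spanned by a nectar source and bestP1 and keep the best of the Q1 offspring. The sketch reuses `uniform_design_table` from Sect. 5.3.1 above and is only one plausible reading of the operator of [6], not a verbatim copy.

```python
def ud_crossover(x, best, func, Q1=5, generators=(1, 2, 3, 4)):
    """Uniform-design crossover sketch for Step 7: sample Q1 offspring inside the
    box spanned by parents `x` and `best`, return the best one (minimization)."""
    x, best = np.asarray(x, float), np.asarray(best, float)
    lo, hi = np.minimum(x, best), np.maximum(x, best)
    table = uniform_design_table(Q1, len(x), generators)
    offspring = lo + (table - 0.5) / Q1 * (hi - lo)
    return min(offspring, key=func)
```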

5.4 Numerical Results

Several commonly used test functions are utilized to evaluate the performance of the proposed algorithm UABC. Each test function is solved in 50, 100, and 200 dimensions to evaluate the robustness of UABC. UABC and ABC are each run 20 times to calculate the average value and standard deviation of the obtained optimal values.

5.4.1 Test Problems

The symbols and function names of the test functions are as follows: f_1 ↔ Sphere, f_2 ↔ Rosenbrock, f_3 ↔ Griewank, f_4 ↔ Rastrigin, f_5 ↔ Schwefel's Problem 2.22, f_6 ↔ Ackley, f_7 ↔ Sum of different powers, f_8 ↔ Step, f_9 ↔ Quartic, and f_10 ↔ axis-parallel hyper-ellipsoid. The expressions and search scopes of these test functions are shown in Table 5.1.
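Two of the Table 5.1 benchmarks in code, for reference. Both are the standard textbook definitions and can be used directly as the `func` argument of the sketches above.

```python
import numpy as np

def sphere(x):        # f1 in Table 5.1, search scope [-100, 100]
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2))

def rastrigin(x):     # f4 in Table 5.1, search scope [-5.12, 5.12]
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0))
```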

5.4.2 Parameter Values

• Parameters for ABC: the size of bee colony SN = 60; the number of employed
bees, onlookers, and nectar sources is all equal to SN, while the number of scouters
is 1; the predefined parameter at the stage of scouters limit = 10.
• Parameters for the uniform design: the number of subintervals S = 4; the number
of the sample points or the size of bee colony in each subinterval Q0 = 17; the
parameter in uniform cross Q1 = 5.

Table 5.1 Test functions

Function                                                                          Search scope
f1 = Σ_{i=1}^{n} x_i²                                                             [−100, 100]
f2 = Σ_{i=1}^{n−1} [100(x_{i+1} − x_i²)² + (x_i − 1)²]                            [−30, 30]
f3 = (1/4000) Σ_{i=1}^{n} x_i² − ∏_{i=1}^{n} cos(x_i/√i) + 1                      [−600, 600]
f4 = Σ_{i=1}^{n} [x_i² − 10 cos(2π x_i) + 10]                                     [−5.12, 5.12]
f5 = Σ_{i=1}^{n} |x_i| + ∏_{i=1}^{n} |x_i|                                        [−10, 10]
f6 = −20 exp(−0.2 √((1/N) Σ_{i=1}^{N} x_i²)) − exp((1/N) Σ_{i=1}^{N} cos(2π x_i)) + exp(1) + 20   [−30, 30]
f7 = Σ_{i=1}^{n} |x_i|^{i+1}                                                      [−1, 1]
f8 = Σ_{i=1}^{n} (⌊x_i + 0.5⌋)²                                                   [−100, 100]
f9 = Σ_{i=1}^{n} i · x_i⁴ + rand()                                                [−1.28, 1.28]
f10 = Σ_{i=1}^{n} (Σ_{j=1}^{i} x_j)²                                              [−100, 100]

• Terminal condition: the maximal number of iterations is t_max = 100. When the number of iterations t satisfies t > t_max, UABC terminates.

5.4.3 Results

With the dimensions of the test functions set to 50, 100, and 200, respectively, the results obtained by ABC and UABC are shown in Tables 5.2 and 5.3, respectively.
A comparison between Tables 5.2 and 5.3 shows that the average values obtained by UABC are much better than those obtained by ABC, and their difference amounts to several orders of magnitude. If, allowing for floating-point errors, values less than 10^(−6) are regarded as 0, then for the 50-dimensional test functions UABC obtains the theoretical optimal value 0 on all functions except f_2, f_9, and f_10, while ABC does not obtain the theoretical optimal value for any test function. For the 50-dimensional test functions f_1, f_2, f_3, f_4,

Table 5.2 Average value and standard deviation of the optimal values obtained by ABC
Average value Standard deviation
50 100 200 50 100 200
f1 3.77E+04 1.73E+05 5.01E+05 3.87E+03 1.10E+04 1.34E+04
f2 1.52E+08 8.74E+08 2.48E+09 2.30E+07 7.05E+07 7.53E+07
f3 341.74 1.54E+03 4.58E+03 45.55 112.34 103.80
f4 576.91 1.41E+03 3.14E+03 19.37 27.42 43.35
f5 1.37E+05 1.15E+30 3.55E+80 3.47E+05 2.37E+30 1.46E+81
f6 2.98 3.06 3.08 0.0203 5.83E−03 1.86E−03
f7 0.468 0.699 0.840 0.169 0.137 0.132
f8 4.09E+04 1.70E+05 5.02E+05 4.32E+03 1.11E+04 1.67E+04
f9 4.56 7.06 8.93 0.159 0.123 0.0638
f 10 1.42E+05 5.37E+05 2.19E+06 1.40E+04 8.63E+04 1.83E+05

Table 5.3 Average value and standard deviation of the optimal solutions obtained by UABC
Average value Standard deviation
50 100 200 50 100 200
f1 2.40E−11 4.91E−11 1.84E−09 5.43E−11 1.89E−11 7.39E−09
f2 36.86 79.50 162.39 9.14 23.69 46.63
f3 6.99E−08 2.61E−04 5.36E−04 3.13E−07 1.16E−03 8.88E−04
f4 6.55E−10 3.66E−09 1.28E−08 4.59E−10 1.32E−09 2.75E−09
f5 6.04E−06 3.18E−05 9.86E−05 2.48E−06 7.27E−06 9.74E−06
f6 1.44E−06 2.29E−06 2.95E−06 7.41E−07 4.84E−07 3.28E−07
f7 3.65E−12 5.92E−12 1.26E−11 3.71E−12 7.93E−12 1.50E−11
f8 0 0 0 0 0 0
f9 4.42E−04 4.08E−04 2.73E−04 3.52E−04 4.22E−04 2.46E−04
f 10 0.653 10.28 15.04 2.38 4.54 3.26

f_5, f_8, and f_10, the optimal values obtained by ABC are several orders of magnitude larger than the theoretical optimal values, while the maximal difference between the optimal values obtained by UABC and the theoretical optimal values is only one order of magnitude (for f_2). For the 100-dimensional and 200-dimensional test functions, the phenomena are similar to those of the 50-dimensional test functions.
From Tables 5.2 and 5.3, it can also be seen that, for both ABC and UABC, the optimal values of the 100-dimensional functions are worse than those of the 50-dimensional functions, and those of the 200-dimensional functions are worse than those of the 100-dimensional functions. This is reasonable, because the differences between the obtained optimal values and the theoretical optimal values are bound to increase as the dimension of the problem increases. However, this increase is much smaller for UABC than for ABC. On f_1, f_3, f_4, f_5, f_6, f_7, f_8, and f_9, the increase for UABC is very small; in particular, on f_8 there is no increase at all, and the result always equals the theoretical optimal value 0. By contrast, the increase for ABC spans orders of magnitude; in particular, on f_5 the optimal value in 50 dimensions is about 10^5, while in 100 and 200 dimensions it is about 10^30 and 10^80, respectively. This fully demonstrates that UABC is not sensitive to the dimension of the problem and is suitable for very high-dimensional problems.
From Table 5.3, we can clearly see that the standard deviations obtained by UABC are very small except on f_2 and f_10, which demonstrates that UABC has very high robustness. By comparing Tables 5.2 and 5.3, it can be clearly observed that the standard deviations obtained by UABC are much smaller than those obtained by ABC, which fully demonstrates that the robustness of UABC is much higher than that of ABC.

5.5 Conclusion and Future Work

This study presents an artificial bee colony algorithm based on uniform design. It
makes full use of the advantage of uniform design, and generates the initial bee colony
by means of uniform design, so that nectar sources can scatter evenly into the vector
spaces of the feasible solutions. The crossover operator based on uniform design is
conducted on each nectar source and the optimal nectar source. This is to perform a refined search as early as possible in the potentially optimal vector space, in order to jump out of local optima quickly and find the global optimal solution. The experimental
results performed on several common test functions demonstrate that the proposed
algorithm has a strong ability to seek the optimal solutions. The algorithm can obtain
the satisfactory optimal solutions for different dimension problems. This fully shows
that the proposed algorithm has a strong robustness and applicability.
Further enhancement and improvement of the algorithm are ongoing. One direction is to use a more efficient method to improve its convergence speed. Another is to extend its application scope to other problems, such as community detection, brain network analysis, and single-cell data analysis.

Acknowledgements This research was supported by National Natural Science Foundation of


China (No. 61841603), Guangxi Natural Science Foundation (No. 2018JJA170050), Improvement
Project of Basic Ability for Young and Middle-aged Teachers in Guangxi Colleges and Universi-
ties (No. 2017KY0541), and Open Foundation for Guangxi Colleges and Universities Key Lab of
Complex System Optimization and Big Data Processing (No. 2017CSOBDP0301).

References

1. Cao, Y., et al.: An improved global best guided artificial bee colony algorithm for continuous
optimization problems. Clust. Comput. 2018(2018), 1–9 (2018)
2. Cui, L., et al.: Modified Gbest-guided artificial bee colony algorithm with new probability
model. Soft. Comput. 22(7), 2217–2243 (2018)

3. Ning, J., et al.: A food source-updating information-guided artificial bee colony algorithm.
Neural Comput. Appl. 30(3), 775–787 (2018)
4. Bharti, K.K., Singh, P.K.: Chaotic gradient artificial bee colony for text clustering. Soft Comput.
20(3), 1113–1126 (2016)
5. Liu, X., Wang, Y., Liu, H.: A hybrid genetic algorithm based on variable grouping and uniform
design for global optimization. J. Comput. 28(3), 93–107 (2017)
6. Leung, Y.-W., Wang, Y.: Multiobjective programming using uniform design and genetic algo-
rithm. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 30(3), 293–304 (2000)
7. Zhang, J., Wang, Y., Feng, J.: Attribute index and uniform design based multiobjective associ-
ation rule mining with evolutionary algorithm. Sci. World J. 2013(2013), 1–16 (2013)
8. Dai, C., Wang, Y.: A new decomposition based evolutionary algorithm with uniform designs
for many-objective optimization. Appl. Soft Comput. 30(1), 238–248 (2015)
9. Zhu, X., Zhang, J., Feng, J.: Multi-objective particle swarm optimization based on PAM and
uniform design. Math. Probl. Eng. 2015(2), 1–17 (2015)
10. Jia, L., Wang, Y., Fan, L.: An improved uniform design-based genetic algorithm for multi-
objective bilevel convex programming. Int. J. Comput. Sci. Eng. 12(1), 38–46 (2016)
11. Dai, C., Wang, Y.: A new uniform evolutionary algorithm based on decomposition and CDAS
for many-objective optimization. Knowl. Based Syst. 85(1), 131–142 (2015)
Chapter 6
An Orthogonal QUasi-Affine
TRansformation Evolution (O-QUATRE)
Algorithm for Global Optimization

Nengxian Liu, Jeng-Shyang Pan and Jason Yang Xue

Abstract In this paper, a new Orthogonal QUasi-Affine TRansformation Evolution (O-QUATRE) algorithm is proposed for global optimization. The O-QUATRE algorithm is implemented as a combination of the QUATRE algorithm and the orthogonal array, which together secure an overall better performance on complex optimization problems. The proposed algorithm is verified on the CEC2013 test suite for real-parameter optimization. The experimental results indicate that the proposed O-QUATRE algorithm obtains a better mean and standard deviation of fitness error than the QUATRE algorithm, which means that the O-QUATRE algorithm is more robust and more stable.

Keywords QUATRE algorithm · Global optimization · Orthogonal array

6.1 Introduction

Global optimization problems exist in various areas, such as vehicle navigation,


design of wireless sensor networks [1, 2], etc. Many of them are NP-hard (nondeterministic polynomial-time hard) problems, which cannot be solved analytically. In the past few decades, many kinds of optimization techniques
have been proposed for tackling such tough and complex optimization problems.
Evolutionary Computation (EC) is an important technique among them, including

N. Liu · J.-S. Pan (B)


College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
e-mail: jspan@cc.kuas.edu.tw
J.-S. Pan
Fujian Provincial Key Lab of Big Data Mining and Applications, Fujian University of
Technology, Fuzhou, China
J. Y. Xue
Business School of Qingdao University, Qingdao, China


Genetic Algorithm (GA) [3], Particle Swarm Optimization (PSO) [4], Ant Colony Algo-
rithm (ACO) [5], Differential Evolution (DE) [6], Ebb-Tide-Fish (ETF) algorithm
[7], Monkey King Evolution [8], QUasi-Affine TRansformation Evolution (QUA-
TRE) algorithm [9], etc.
In 2016, Meng et al. proposed the QUATRE algorithm to overcome the positional bias of the DE algorithm. Related works on the QUATRE algorithm can be found in [7–11]. The QUATRE algorithm is a swarm-based intelligence algorithm, which has many advantages and has been used for hand gesture segmentation [10]. However, it has the same disadvantages as the DE and PSO algorithms. Many researchers have studied these evolutionary algorithms and proposed many variants to
enhance their performance. Zhang and Leung [12] advocated incorporating exper-
imental design methods into the GA, and they have proposed Orthogonal Genetic
Algorithm (OGA). Their experimental results demonstrated that OGA can be more
robust and statistically sound, and has a better performance than the traditional GA.
Tsai et al. [13] have adopted the Taguchi method (namely, Taguchi orthogonal arrays)
into the GA’s crossover operator, and have presented the Hybrid Taguchi–Genetic
Algorithm (HTGA). Other researchers have used Taguchi method to improve the
performance of PSO [14], PCSO [15], and DE [16]. The improved algorithms men-
tioned above all use orthogonal array to reduce the number of experiments, thereby
improving the performance and robustness of the algorithm. In this paper, we will
use orthogonal array to improve the performance of QUATRE algorithm.
The rest of the paper is composed as follows. The QUATRE algorithm and the
orthogonal array are briefly reviewed in Sect. 6.2. Our proposed method Orthogonal
QUasi-Affine TRansformation Evolution (O-QUATRE) algorithm is presented in
Sect. 6.3. The experimental analysis of O-QUATRE algorithm under CEC2013 test
suite for real-parameter optimization is given, and O-QUATRE algorithm is com-
pared with the QUATRE algorithm in Sect. 6.4. The conclusion is given in Sect. 6.5.

6.2 Related Works

6.2.1 QUasi-Affine TRansformation Evolutionary


(QUATRE) Algorithm

The QUATRE algorithm was proposed by Meng et al. for solving opti-
mization problems. The individuals in QUATRE algorithm evolve according
to Eq. 6.1, which is a quasi-affine transformation evolution equation. X = [X_{1,G}, X_{2,G}, ..., X_{i,G}, ..., X_{ps,G}]^T denotes the individual population matrix with ps different individuals, X_{i,G} = [x_{i1}, x_{i2}, ..., x_{ij}, ..., x_{iD}], i ∈ {1, 2, ..., ps}, denotes the location of the i-th individual of the G-th generation, which is the i-th row vector of the matrix X, and each individual X_{i,G} is a candidate solution of a specific D-dimensional optimization problem. B = [B_{1,G}, B_{2,G}, ..., B_{i,G}, ..., B_{ps,G}]^T denotes the donor matrix, and it has several different calculation schemes which can be found in [10].

In this paper, we use the calculation scheme “QUATRE/best/1” which is given in


Eq. 6.2. The operation ⊗ denotes component-wise multiplication of the elements in
each matrix.

X ← M ⊗ X + M̄ ⊗ B (6.1)

M is an evolution matrix, whose elements are either 0 or 1, and M̄ means a binary


inverted matrix of M. The binary invert operation means to invert the values of the
matrix. The corresponding values of zero elements in matrix M are ones in M̄, while
the corresponding values of one elements in matrix M are zeros in M̄.
Evolution matrix M is transformed from an initial matrix Mini . Mini is initialized
by a lower triangular matrix with the elements set to ones. The transformation from
Mini to M has two steps: the first step is to randomly permute every element in each
row vector of Mini and the second step is to randomly permute the row vectors with
all elements of each row vector unchanged. An example of the transformation is
shown in Eq. 6.3 with ps = D. When the ps is larger than the dimension number of
optimization problem, matrix Mini needs to be extended according to ps. An example
of ps = 2D + 2 is given in Eq. 6.4. In general, when ps%D = k, the first k rows of the D × D lower triangular matrix are also included in Mini, and M is adaptively changed in accordance with Mini [9].

B = Xgbest,G + F · (Xr1,G − Xr2,G ) (6.2)

where Xr1,G and Xr2,G both denote random matrices which are generated by ran-
domly permutating the sequence of row vectors in the population matrix X of the
Gth generation with all elements of each row vector unchanged. F is the muta-
tion scale factor, which ranges from 0 to 1, and its recommended value is 0.7. X_{gbest,G} = [X_{gbest,G}, X_{gbest,G}, ..., X_{gbest,G}]^T is the global best matrix, with each row vector equal to the G-th global best individual X_{gbest,G}.
Mini = ⎡ 1              ⎤
       ⎢ 1  1           ⎥
       ⎢ ⋮  ⋮  ⋱        ⎥   ∼   M        (6.3)
       ⎣ 1  1  ⋯  1     ⎦

where M on the right-hand side is obtained from Mini by the two-step random permutation described above.

Fig. 6.1 Illustration of quasi-affine transformation evolution for a 10-D example

For ps = 2D + 2, Mini stacks two copies of the D × D lower triangular matrix of ones followed by its first two rows, and M is again obtained from Mini by the two-step random permutation:

Mini = ⎡ L_D ⎤
       ⎢ L_D ⎥   ∼   M        (6.4)
       ⎣ L_2 ⎦

where L_D denotes the D × D lower triangular matrix of ones and L_2 its first two rows.

The illustration of the i-th row of the quasi-affine transformation evolution according to Eq. 6.1 is shown in Fig. 6.1.
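A compact sketch of one QUATRE generation as described above: build M by the two-step permutation of the piled lower-triangular Mini, form the donor matrix B with the "QUATRE/best/1" scheme of Eq. 6.2, and apply the quasi-affine transformation of Eq. 6.1. The final greedy survivor selection is a common choice and an assumption here; all names are illustrative.

```python
import numpy as np

def make_evolution_matrix(ps, D, rng):
    """Evolution matrix M: pile lower-triangular rows of ones up to ps rows, then
    permute the entries inside every row and finally permute the rows themselves."""
    M = np.array([[1.0] * (i % D + 1) + [0.0] * (D - i % D - 1) for i in range(ps)])
    for row in M:
        rng.shuffle(row)          # step 1: permute elements within each row
    rng.shuffle(M)                # step 2: permute the row vectors
    return M

def quatre_generation(X, func, F, rng):
    """One generation of QUATRE/best/1: X <- M (.) X + M_bar (.) B, Eqs. 6.1-6.2."""
    ps, D = X.shape
    gbest = X[np.argmin([func(x) for x in X])]
    B = gbest + F * (X[rng.permutation(ps)] - X[rng.permutation(ps)])   # Eq. 6.2
    M = make_evolution_matrix(ps, D, rng)
    U = M * X + (1.0 - M) * B                                           # Eq. 6.1
    keep_old = np.array([func(u) >= func(x) for u, x in zip(U, X)])     # greedy step (assumed)
    return np.where(keep_old[:, None], X, U)
```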

6.2.2 Orthogonal Array

The orthogonal array [13] is a fractional factorial matrix, which can be used in
many designed experiments to determine which combinations of factor levels can be
used for each experimental run and for analyzing the data, and it is a major tool of
experimental design method and Taguchi method. An orthogonal array can ensure a
balanced comparison of levels of any factor or interactions between factors. Each row
in it represents the level of the factors for one run of the experiment, and each column
in it indicates a specific factor that can be evaluated independently. What’s more, the
merit of orthogonal array is that it can reduce the number of experiments efficiently.
Although it reduces the number of experiments, it is still reliable due to the powerful
support of statistical theory. For example, a problem involving three factors with three levels per factor requires 3³ = 27 experiments to be tested, but with the orthogonal array L9(3⁴) [13], only nine representative experiments need to be conducted.
In this paper, we adopt a two-level orthogonal array to change the evolution matrix M of the QUATRE algorithm. The general notation for a two-level orthogonal array is Ln(2^(n−1)), where L, n, n − 1, and 2 denote the Latin square, the number of experimental runs, the number of columns in the orthogonal array, and the number of levels per factor, respectively. For example, assume that we have two sets of solutions for a 10-dimensional optimization problem and we want to find the best combination of their values. The L12(2^11) orthogonal array is given in Table 6.1. The number on the left of each row represents the experiment number and varies from 1 to 12. The elements "0" and "1" of each row indicate which factor's value should be used in one run of the experiment: the element "1" means that the value of the factor should be taken from the first set of solutions, and the element "0" means that it should be taken from the second set of solutions. The illustration of the eighth experiment for a 10-factor/dimension problem, according to the eighth row of the orthogonal array, is shown in Fig. 6.2; a small code sketch of this combination is given after the table.



Table 6.1 L12(2^11) orthogonal array
Experiment number    Considered factors: 1 2 3 4 5 6 7 8 9 10 11
1 0 0 0 1 0 0 1 0 1 1 1
2 0 0 1 0 0 1 0 1 1 1 0
3 0 0 1 0 1 1 1 0 0 0 1
4 0 1 0 0 1 0 1 1 1 0 0
5 0 1 0 1 1 1 0 0 0 1 0
6 0 1 1 1 0 0 0 1 0 0 1
7 1 0 0 0 1 0 0 1 0 1 1
8 1 0 0 0 1 0 0 0 1 0 0
9 1 0 1 1 1 0 0 0 1 0 0
10 1 1 0 0 0 1 0 0 1 0 1
11 1 1 1 0 0 0 1 0 0 1 0
12 1 1 1 1 1 1 1 1 1 1 1

Fig. 6.2 Illustration of eighth row experiment
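As promised above, here is a minimal sketch of how one row of the two-level orthogonal array mixes two candidate solutions, the idea illustrated in Fig. 6.2. The array passed in could be the L12(2^11) of Table 6.1 restricted to its first D columns; the function names are illustrative.

```python
import numpy as np

def oa_combine(row, first, second):
    """Build one trial solution from an OA row: take the d-th value from `first`
    where the row holds a 1 and from `second` where it holds a 0 (cf. Fig. 6.2)."""
    row = np.asarray(row)[:len(first)]
    return np.where(row == 1, first, second)

def best_oa_experiment(oa, first, second, func):
    """Run every representative experiment (row) of the orthogonal array and return
    the index and trial solution with the smallest objective value."""
    trials = [oa_combine(r, first, second) for r in oa]
    idx = int(np.argmin([func(t) for t in trials]))
    return idx, trials[idx]
```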



6.3 Orthogonal QUasi-Affine TRansformation Evolution


(O-QUATRE) Algorithm

In this section, we will present a new orthogonal quasi-affine transformation evolution


(O-QUATRE) algorithm, and here we adopt orthogonal array to change the evolution
matrix M of the QUATRE algorithm. In the previous sections, we have analyzed the
QUATRE algorithm and learned that the QUATRE algorithm generates the next-
generation population through quasi-affine transformation with the evolution matrix
M. Figure 6.1 shows the process of generating an individual of the next generation.
The evolution matrix M is obtained from a piled lower triangular matrix by two-step
random permutation, which makes the QUATRE algorithm have a powerful random
global exploration capability, but it cannot always pass the best individual solution to
the next generation. On the other hand, we also analyzed the orthogonal array, which
produces the best or nearly the best solution by doing the representative experiments
listed in the orthogonal array. Figure 6.2 shows the process of generating a solution in
one run of experiment. The two-level orthogonal array and the evolutionary matrix M
have similar structure, but they have different mechanisms to generate new solutions,
and they have their own advantages so that we can conveniently combine them
naturally. And we expect that the proposed algorithm not only can generate the best
solution to the next generation but also have good exploration capability. Therefore, in
our proposed O-QUATRE algorithm, we first sort the individuals in the population X according to their fitness values and then change some top row vectors in the evolution matrix M by doing orthogonal array experiments; if a row vector of M is selected for change, it is replaced by the row of the orthogonal array with the optimal fitness value. The number of rows to be changed is ps ∗ rc, which is determined by the parameter rc. The value of the parameter rc ranges from 0 to 1. In the case of rc = 0, the evolution matrix M does not need to be changed; in the case of rc = 1, all rows of the evolution matrix M are generated by orthogonal array experiments. The value of the parameter rc is used to balance the exploration and exploitation capabilities of the algorithm. In this paper, the value of rc is set to 0.1.
Figure 6.3 shows an example of changing the evolution matrix M using orthogonal array experiments. In Fig. 6.3, assume that the first row vector of the 10-dimensional evolution matrix M is selected for orthogonal array experiments. We choose the orthogonal array L12(2^11) in Table 6.1 for the experiment and assume that the optimal value is obtained in the third of the 12 representative experiments. Therefore, the first row of the evolution matrix M is replaced by the first 10 columns of the third row of the orthogonal array L12(2^11). Similarly, the second row vector of the evolution matrix is replaced by the eleventh row of the orthogonal array.

Fig. 6.3 Illustration of changing the evolution matrix M using orthogonal array
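The row-replacement step can be summarised in a few lines of code. The sketch assumes that the two solution sets mixed by the orthogonal array are the current individual X[i] and its donor row B[i] (the two matrices that M selects between in Eq. 6.1); it reuses `best_oa_experiment` from the previous sketch and is a reading of the text, not the authors' Algorithm 1.

```python
def orthogonalize_evolution_matrix(M, X, B, func, oa, rc=0.1):
    """Replace the top ps*rc rows of M (population X assumed sorted by fitness, best
    first) with the OA row that gives the best trial when mixing X[i] with B[i]."""
    ps, D = M.shape
    for i in range(int(ps * rc)):
        idx, _ = best_oa_experiment(oa, X[i], B[i], func)   # 12 representative trials
        M[i] = np.asarray(oa[idx])[:D]                       # winning OA row replaces row i
    return M
```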

The pseudocode of the algorithm O-QUATRE is given in Algorithm 1.

6.4 Experimental Analysis

In order to assess the performance of the proposed O-QUATRE algorithm, we make


the comparison with QUATRE algorithm over CEC2013 [17] test suite for real-
parameter optimization, which has 28 benchmark functions (f1 –f28 ). The first 5

Fig. 6.4 Simulation of functions f13, f24, and f28 with 10-D: fitness error versus NFE (×10^5) for O-QUATRE and QUATRE

functions f1–f5 are unimodal functions, the next 15 functions f6–f20 are multi-modal functions, and the remaining 8 functions f21–f28 are composition functions. All test functions' search ranges are [−100, 100]^D, and they are shifted to the same global best location O = (o1, o2, ..., oD).
In this paper, for all these benchmark functions, we compare the performance of the algorithms on the 10-dimensional problems. Each algorithm is run 150 times independently on each benchmark function, and the best, mean, and standard deviation of these runs are recorded for statistical analysis. The parameter settings of the O-QUATRE algorithm are ps = 100, F = 0.7, rc = 0.1, D = 10, Generations = 1000 (NFE = 209,890, where NFE denotes the number of function evaluations), and orthogonal array L12(2^11); the parameter settings of the QUATRE algorithm are ps = 100, F = 0.7, D = 10, Generations = 2100 (NFE = 210,000). The comparison results are shown in Table 6.2, and the simulation results of some benchmark functions are shown in Fig. 6.4.
From Table 6.2, we can see that the QUATRE algorithm has a better best value on functions f2–f4, f7, f13–f16, f20, f23, and f25, the O-QUATRE algorithm has a better best value on functions f8, f10, f12, f17–f19, f22, f24, and f26, and they have the same best value on the remaining eight functions. The QUATRE algorithm finds two more results with a better best value than the O-QUATRE algorithm, but the O-QUATRE algorithm has a better mean and standard deviation of fitness error than the QUATRE algorithm, which means that the O-QUATRE algorithm is more robust and has better stability.

6.5 Conclusion

In this paper, we present a new O-QUATRE algorithm for optimization problems.


The O-QUATRE algorithm employs orthogonal array experiments to change the
evolution matrix of the QUATRE algorithm. This change makes a good balance
between exploration and exploitation of O-QUATRE algorithm. The proposed algo-

Table 6.2 Comparison results of best, mean, and standard deviation of 150-run fitness error
between QUATRE algorithm and the O-QUATRE algorithm under 10-D CEC2013 test suite
10-D QUATRE Algorithm O-QUATRE Algorithm
No. Best Mean Std Best Mean Std
1 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
2 0.0000E+00 2.1373E−13 2.0590E−12 1.2476E−08 3.1751E−06 1.2088E−05
3 0.0000E+00 1.0646E−01 7.3970E−01 5.2296E−12 7.4604E−01 3.9460E+00
4 0.0000E+00 4.3959E−14 9.0093E−14 1.0687E−11 1.0451E−09 1.1094E−09
5 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
6 0.0000E+00 4.4586E+00 4.7865E+00 0.0000E+00 5.7996E+00 7.8827E+00
7 1.1369E−13 8.8622E−01 4.2265E+00 4.2608E−07 1.1732E+00 5.5359E+00
8 2.0191E+01 2.0454E+01 9.5454E−02 2.0143E+01 2.0420E+01 8.5264E−02
9 0.0000E+00 1.6659E+00 1.2816E+00 0.0000E+00 1.8768E+00 1.2334E+00
10 3.2016E−02 1.7697E−01 1.1713E−01 9.8573E−03 1.7678E−01 1.2074E−01
11 0.0000E+00 2.9849E+00 1.6863E+00 0.0000E+00 2.3879E+00 1.3042E+00
12 2.9849E+00 1.4597E+01 5.6877E+00 5.4788E−01 1.2752E+01 6.6829E+00
13 1.9899E+00 2.0407E+01 8.8161E+00 4.2551E+00 1.9388E+01 7.7625E+00
14 3.5399E+00 1.0737E+02 8.5581E+01 3.6648E+00 8.0443E+01 7.1238E+01
15 1.7137E+02 9.7413E+02 3.0636E+02 2.7315E+02 9.1454E+02 3.0936E+02
16 3.1547E−01 1.1312E+00 3.5646E−01 4.0250E−01 1.1659E+00 3.1999E−01
17 5.9338E−01 1.0659E+01 3.3035E+00 3.7821E−02 1.0272E+01 3.0306E+00
18 1.0477E+01 3.1443E+01 8.9107E+00 1.0370E+01 3.1674E+01 8.3332E+00
19 2.2333E−01 6.1946E−01 1.9613E−01 1.0191E−01 6.0428E−01 1.8916E−01
20 7.9051E−01 2.9757E+00 5.9590E−01 1.2980E+00 2.9364E+00 5.4814E−01
21 1.0000E+02 3.6082E+02 8.1914E+01 1.0000E+02 3.6683E+02 7.5749E+01
22 1.7591E+01 1.9007E+02 1.2395E+02 8.9048E+00 1.6717E+02 1.0956E+02
23 1.6517E+02 9.5665E+02 3.0949E+02 1.6626E+02 8.9420E+02 3.2540E+02
24 1.0905E+02 2.0543E+02 9.2427E+00 1.0704E+02 2.0479E+02 1.1772E+01
25 1.0617E+02 2.0336E+02 1.3893E+01 1.1073E+02 2.0178E+02 1.5877E+01
26 1.0398E+02 1.6999E+02 4.8561E+01 1.0298E+02 1.6630E+02 4.9120E+01
27 3.0000E+02 3.8820E+02 9.7914E+01 3.0000E+02 3.6475E+02 9.1244E+01
28 1.0000E+02 2.8780E+02 6.5449E+01 1.0000E+02 2.9200E+02 3.9323E+01
Win 11 10 12 9 16 14
Lose 9 16 14 11 10 12
Draw 8 2 2 8 2 2
The best results of the comparisons are emphasized in BOLDFACE fonts

rithm is evaluated under CEC2013 test suite for real-parameter optimization. The
experimental results indicate that the O-QUATRE algorithm has a better mean and
standard deviation of fitness error than the QUATRE algorithm, which means that
the O-QUATRE algorithm has the advantages of more robustness and better stability.

References

1. Pan, J.S., Kong, L.P., Sung, T.W., et al.: Hierarchical routing strategy for wireless sensor
network. J. Inf. Hiding Multimed. Signal Process. 9(1), 256–264 (2018)
2. Chang, F.C., Huang, H.C.: A survey on intelligent sensor network and its applications. J. Netw.
Int. 1(1), 1–15 (2016)
3. Holland, J.H.: Adaptation in Nature and Artificial Systems. The University of Michigan Press,
Ann Arbor (1975)
4. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of IEEE International
Conference on Neural Networks, vol. 4, pp. 1942–1948. IEEE (1995)
5. Dorigo, M., Maniezzo, V., Colorni, A.: Ant system: optimization by a colony of cooperating
agents. IEEE Trans. Syst. Man Cybern. Part B Cybern. 26(1), 29–41 (1996)
6. Storn, R., Price, K.: Differential evolution-a simple and efficient heuristic for global optimiza-
tion over continuous spaces. J. Global Optim. 11(4), 341–359 (1997)
7. Meng, Z., Pan, J.S., Alelaiwi, A.: A new meta-heuristic ebb-tide-fish inspired algorithm for
traffic navigation. Telecommun. Syst. 62(2), 1–13 (2016)
8. Meng, Z., Pan, J.S.: Monkey king evolution: a new memetic evolutionary algorithm and its
application in vehicle fuel consumption optimization. Knowl.-Based Syst. 97, 144–157 (2016)
9. Meng, Z., Pan, J.S., Xu, H.: QUasi-Affine TRansformation Evolutionary (QUATRE) algo-
rithm: a cooperative swarm based algorithm for global optimization. Knowl.-Based Syst. 109,
104–121 (2016)
10. Meng, Z., Pan, J.S.: QUasi-affine TRansformation Evolutionary (QUATRE) algorithm: the
framework analysis for global optimization and application in hand gesture segmentation. In:
2016 IEEE 13th International Conference on Signal Processing (ICSP), pp. 1832–1837 (2016)
11. Meng, Z., Pan, J.S.: QUasi-Affine TRansformation Evolution with External ARchive
(QUATRE-EAR): an enhanced structure for differential evolution. Knowl.-Based Syst. 155,
35–53 (2018)
12. Zhang, Q., Leung, Y.W.: An orthogonal genetic algorithm for multimedia multicast routing.
IEEE Trans. Evol. Comput. 3, 53–62 (1999)
13. Tsai, J.T., Liu, T.K., Chou, J.H.: Hybrid Taguchi-genetic algorithm for global numerical opti-
mization. IEEE Trans. Evol. Comput. 8(4), 365–377 (2004)
14. Liu, C.H., Chen, Y.L., Chen, J.Y.: Ameliorated particle swarm optimization by integrating
Taguchi methods. In: The 9th International Conference on Machine Learning and Cybernetics
(ICMLC), pp. 1823–1828 (2010)
15. Tsai, P.W., Pan, J.S., Chen, S.M., Liao, B.Y.: Enhanced parallel cat swarm optimization based
on Taguchi method. Expert Syst. Appl. 39, 6309–6319 (2012)
16. Ding, Q., Qiu, X.: Novel differential evolution algorithm with spatial evolution rules. HIGH.
Tech. Lett. 23(4), 426–433
17. Liang, J.J., et al.: Problem definitions and evaluation criteria for the CEC 2013 special session
on real-parameter optimization. Computational Intelligence Laboratory, Zhengzhou University,
Zhengzhou, China and Nanyang Technological University, Singapore, Technical report 201212
(2013)
Chapter 7
A Decomposition-Based Evolutionary
Algorithm with Adaptive Weight
Adjustment for Vehicle Crashworthiness
Problem

Cai Dai

Abstract In the automotive industry, the crashworthiness design of vehicles is of


special importance. In this work, a multi-objective model for the vehicle design
which minimizes three objectives, weight, acceleration characteristics, and toe-board
intrusion, is considered, and a novel evolutionary algorithm based on decomposition
and adaptive weight adjustment is designed to solve this problem. The experimental
results reveal that the proposed algorithm works better than MOEA/D, MOEA/D-AWA, and NSGAII on this problem.

Keywords Evolutionary algorithm · Vehicle crashworthiness problem · Adaptive


weight adjustment

7.1 Introduction

In the automotive industry, crashworthiness refers to the ability of a vehicle and its
components to protect its occupants during an impact or crash [1]. The crashwor-
thiness design of vehicles is of special importance, yet, highly demanding for high-
quality and low-cost industrial products. Liao et al. [2] presented a multi-objective
model for the vehicle design which minimizes three objectives: (1) weight (mass),
(2) acceleration characteristics (Ain), and (3) toe-board intrusion (intrusion).
Multi-objective optimization problems (MOPs) are complex. They usually include
two or more conflicting objectives. A minimized MOP can be described as follows
[3]:

min F(x) = (f_1(x), f_2(x), ..., f_m(x))
s.t.  g_i(x) ≤ 0,  i = 1, 2, ..., q                    (7.1)
      h_j(x) = 0,  j = 1, 2, ..., p

C. Dai (B)
School of Computer Science, Shaanxi Normal University, Xi’an 710119, China
e-mail: cdai0320@snnu.edu.cn


where x = (x_1, ..., x_n) ∈ X ⊂ R^n is an n-dimensional decision variable bounded in the decision space X; an MOP includes m objective functions f_i(x) (i = 1, ..., m), q inequality constraints g_i(x) (i = 1, 2, ..., q), and p equality constraints h_j(x) (j = 1, 2, ..., p). Moreover, the set of feasible solutions which meet all the constraints is denoted by Ω. In MOPs, the quality of an optimal solution is evaluated by the trade-offs between multiple conflicting objectives. For two solutions x, z ∈ Ω, if f_i(x) ≤ f_i(z) for each i and ‖F(x) − F(z)‖₂ ≠ 0, then x dominates z (denoted x ≺ z). If a solution vector x is not dominated by any other solution, x is called a Pareto optimal solution. The set of Pareto optimal solutions (PS) consists of all Pareto optimal solutions. The Pareto optimal front (PF) is the set of the objective vectors of all Pareto optimal solutions.
Multi-objective evolutionary algorithms (MOEAs) which make use of the pop-
ulation evolution to get a set of optimal solutions are a kind of effective methods
for solving MOPs. Many MOEAs have successfully been applied to solve MOPs,
such as multi-objective genetic algorithm [4], multi-objective particle swarm opti-
mization algorithm [5], multi-objective differential evolution algorithm [6], multi-
objective immune clone algorithm [7], group search optimizer [8], and evolutionary
algorithms based on decomposition [9].
Recently, Zhang et al. [9] introduced the decomposition approach into MOEAs and developed an outstanding MOEA, MOEA/D, which performs well on many problems. MOEA/D decomposes an MOP into a number of sub-problems and then uses the EA to optimize these sub-problems simultaneously. The two main advantages of MOEA/D are that it uses the neighborhood strategy to improve the search efficiency, and that it maintains the diversity of the obtained solutions well through the given weight vectors. In the last decade, MOEA/D has attracted much research interest, and many related articles [10–18] have been published.
In MOEA/D, weight vectors and aggregation functions play a very important role. However, the Pareto front of the vehicle crashworthiness MOP is unknown, so MOEA/D with fixed weight vectors may not solve this MOP well. In this paper, a selection strategy based on decomposition is used to maintain the diversity of the obtained solutions, and an adaptive weight adjustment [19] is used to handle the unknown Pareto front. Based on these, a novel evolutionary algorithm based on decomposition and adaptive weight adjustment is proposed to solve the vehicle crashworthiness MOP.
The rest of this paper is organized as follows: Sect. 7.2 introduces the main concept
of the multi-objective optimization problems of vehicle crashworthiness problem;
Sect. 7.3 presents the proposed algorithm MOEA/DA in detail, while the experiment
results of the proposed algorithm and the related analysis are given in Sect. 7.4;
finally, Sect. 7.5 provides the conclusion and proposes the future work.

7.2 Vehicle Crashworthiness Problem

The vehicle crashworthiness problem (VCP) model is formulated as follows:



min F(x) = (Mass, A_in, Intrusion)
s.t.  1 ≤ x_i ≤ 3,  i = 1, 2, ..., 5                    (7.2)
      x = (x_1, ..., x_5)

where

Mass = 1640.2823 + 2.3573285 x_1 + 2.3220035 x_2 + 4.5688768 x_3 + 7.7213633 x_4 + 4.4559504 x_5        (7.3)

A_in = 6.5856 + 1.15 x_1 − 1.0427 x_2 + 0.9738 x_3 + 0.8364 x_4 − 0.3695 x_1 x_4 + 0.0861 x_1 x_5 + 0.3628 x_2 x_4 − 0.1106 x_1² − 0.3437 x_3² + 0.1764 x_4²        (7.4)

Intrusion = −0.0551 + 0.0181 x_1 + 0.1024 x_2 + 0.0421 x_3 − 0.0073 x_1 x_2 + 0.024 x_2 x_3 + 0.0118 x_2 x_4 − 0.0204 x_3 x_4 − 0.008 x_3 x_5 − 0.0241 x_2² + 0.0109 x_4²        (7.5)
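The three response surfaces (7.3)–(7.5) translate directly into code. The only assumption below is that the coefficient printed as "00073" in the source stands for 0.0073; everything else follows the formulas verbatim.

```python
import numpy as np

def vcp_objectives(x):
    """Vehicle crashworthiness objectives of (7.3)-(7.5); x = (x1, ..., x5), 1 <= xi <= 3."""
    x1, x2, x3, x4, x5 = x
    mass = (1640.2823 + 2.3573285 * x1 + 2.3220035 * x2 + 4.5688768 * x3
            + 7.7213633 * x4 + 4.4559504 * x5)                                  # (7.3)
    ain = (6.5856 + 1.15 * x1 - 1.0427 * x2 + 0.9738 * x3 + 0.8364 * x4
           - 0.3695 * x1 * x4 + 0.0861 * x1 * x5 + 0.3628 * x2 * x4
           - 0.1106 * x1 ** 2 - 0.3437 * x3 ** 2 + 0.1764 * x4 ** 2)            # (7.4)
    intrusion = (-0.0551 + 0.0181 * x1 + 0.1024 * x2 + 0.0421 * x3
                 - 0.0073 * x1 * x2 + 0.024 * x2 * x3 + 0.0118 * x2 * x4        # 0.0073 assumed
                 - 0.0204 * x3 * x4 - 0.008 * x3 * x5
                 - 0.0241 * x2 ** 2 + 0.0109 * x4 ** 2)                         # (7.5)
    return np.array([mass, ain, intrusion])
```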

7.3 The Proposed Algorithm

In this paper, a decomposition-based multi-objective evolutionary algorithm with


adaptive weight vector adjustment (MOEA/DA) is proposed to address the VCP. The
proposed algorithm mainly consists of two parts: adaptive weight vector adjustment
strategy and a selection strategy, which will be introduced in this section.

7.3.1 Adaptive Weight Vector Adjustment

In this subsection, the adaptive weight vector adjustment [19] is used. The main idea of this adjustment is that, if the distance between two adjacent non-dominated solutions is large, some weight vectors are added between the corresponding weight vectors of these two non-dominated solutions, while weight vectors in crowded regions are deleted. This adjustment strategy uses the distances between the obtained non-dominated solutions to delete or add weight vectors, so as to handle problems with complex PFs and to maintain the relative stability of the weight vectors. The details of the adaptive weight vector adjustment are as follows.

For the current weight vectors W = (W_1, W_2, ..., W_H) and the current population POP = {x_1, x_2, ..., x_H}, where H is the number of solutions (or weight vectors) and x_i (i = 1∼H) is the current optimal solution of the sub-problem corresponding to the weight vector W_i, we find the non-dominated solutions of POP. For convenience, suppose that (x_1, x_2, ..., x_K) (K ≤ H) are the non-dominated solutions of POP and denote WW = (W_{1+K}, W_{2+K}, ..., W_H). The distance ND_i of the obtained non-dominated solutions of W_i (i = 1∼H) is calculated as ND_i = max{|f_j(x_{j1}) − f_j(x_i)|, |f_j(x_i) − f_j(x_{j2})|, j = 1∼m}, where j1 = arg min{s | W_{i,j} > W_{s,j}, s = 1∼K} and j2 = arg max{s | W_{i,j} < W_{s,j}, s = 1∼K}. The values of ND_i are mainly used to delete weight vectors. In addition, all |f_j(x_{j1}) − f_j(x_i)| and |f_j(x_i) − f_j(x_{j2})| are sorted in order to add weight vectors. For convenience, we use PD_{i,ui} = max{|f_j(x_s) − f_j(x_i)|, j = 1∼m, s = 1∼K} to denote the distance of the obtained non-dominated solutions of W_{ui} and W_i, where ui = arg max{s | max{|f_j(x_s) − f_j(x_i)|, j = 1∼m}, s = 1∼K}.
The deleting strategy is as follows. If K > N (where N is the size of the initial population), K − N weight vectors with the minimum ND_i are deleted from W. Then, if max{ND_i, i = 1∼N} / min{ND_i, i = 1∼N} > 2, the weight vector corresponding to the minimum ND_i is deleted from W. After some weight vectors are deleted from W, the adding strategy is applied: if the size of the current W is smaller than H − K + N, then H − K + N − |W| new weight vectors are generated as follows:
 
W_new = (0.25 W_ui + 0.75 W_i)/yy,   if ∃ W_k ∈ WW such that W_i · tt < W_k · tt
W_new = tt,                           otherwise        (7.6)

where yy = ‖0.25 W_ui + 0.75 W_i‖₂, tt = (0.5 W_ui + 0.5 W_i)/‖0.5 W_ui + 0.5 W_i‖₂, and the new vectors are generated for those pairs (W_ui, W_i) whose distances PD_{i,ui} between the obtained non-dominated solutions are the H − K + N − |W| largest, where |W| is the size of W.
condition ∃W k ∈ W W , W i ∗tt < W k ∗tt makes the optimal solution of the new
sub-problem generated by the weight vector W new to be non-dominated solution.
In other words, we don’t want the generated weight vectors to locate these spaces
which have no non-dominated solution. The role of the deleting strategy and the
adding strategy is to delete the sub-problems from the crowded regions and add the
sub-problems into the sparse regions.

7.3.2 Selection Strategy

If a dominated solution is kept in a sub-region, it is very likely to be farther from the solution in its neighboring sub-region than two non-dominated solutions in two neighboring sub-regions are from each other. In other words, this solution and its neighbor are relatively sparser, but they are very important for keeping the diversity and for being selected as parents to generate offspring. Thus, they should be assigned relatively higher fitness values. In order to achieve this purpose, the vicinity distance [20] is used to calculate the fitness value of a solution in the selection operators. In this way, a solution with a sparser neighborhood is more likely to be selected to generate new solutions. These new solutions are likely to become non-dominated solutions in the sub-regions they belong to, and such non-dominated solutions are closer to the true PF. Thus, this selection scheme can help to improve the convergence.

7.3.3 The Proposed Algorithm MOEA/DA

MOEA/DA uses the evolutionary framework of MOEA/D. The steps of the algorithm MOEA/DA are as follows:
Input:
N the number of weight vectors (the sub-problems);
T the number of weight vectors in the neighborhood of each weight vector,
0 < T < N ; and
λ1 , . . . , λN a set of N uniformly distributed weight vectors;
      
Output: Approximation to the PF: F(x_1), F(x_2), ..., F(x_N)
Step 1 Initialization:

Step 1.1 Generate an initial population x_1, x_2, ..., x_{N∗k} randomly or by a problem-specific method.
Step 1.2 Initialize z = (z_1, ..., z_m) by a problem-specific method.
Step 1.3 evol_pop = ∅.
Step 1.4 Compute the Euclidean distances between any two weight vectors and work out the T closest weight vectors to each weight vector. For each i = 1, ..., N, set B(i) = {i_1, ..., i_T}, where λ_{i_1}, ..., λ_{i_T} are the T closest weight vectors to λ_i.

Step 2 Update:
For i = 1, ..., N, do
Step 2.1 Reproduction: a better solution x_i is selected by the selection strategy. Randomly select two indexes r2 and r3 from B(i), then generate a new solution y from x_i, x_{r2}, and x_{r3} by using the crossover operator.
Step 2.2 Mutation: apply a mutation operator on y to produce y_j.
Step 2.3 Update of z: for s = 1, ..., m, if z_s < f_s(y_j), then set z_s = f_s(y_j).
Step 2.4 Update of neighboring solutions and sub-population: for each index k ∈ B(i), if g^TE(y_j | λ_k, z) < g^TE(x_k | λ_k, z), then x_k = y_j and F(x_k) = F(y_j). evol_pop = evol_pop ∪ {y_j}.
End for;

Update evol_pop according to the Pareto dominance and the vicinity distance.
Step 3 Adaptive weight adjustment:
Use the adaptive weight vector adjustment of Sect. 7.3.1 to modify the weight vectors W, re-determine B(i) = {i_1, ..., i_T} (i = 1, ..., H, where H is the size of W), and randomly select solutions from POP to allocate to the new sub-problems as their current solutions.
Step 4 Stopping criteria: if the stopping criterion is satisfied, then stop and output F(x_1), F(x_2), ..., F(x_N); otherwise, go to Step 2.
In this work, the aggregation function is a variant of the Tchebycheff approach, whose form is as follows:

min_{x∈Ω} g^TE(x | W_i, Z*) = max_{1≤j≤m} { |f_j(x) − z*_j| / W_{i,j} }        (7.7)
where Z* is the reference point of the MOP. The optimal solution x*_i of (7.7) must be a Pareto optimal solution of (7.1). If the optimal solution x*_i of (7.7) were not a Pareto optimal solution of (7.1), there would be a solution y which is better than x*_i, so that |f_j(y) − z*_j| ≤ |f_j(x*_i) − z*_j| for j = 1, ..., m, and hence max_{1≤j≤m}{|f_j(y) − z*_j|/W_{i,j}} ≤ max_{1≤j≤m}{|f_j(x*_i) − z*_j|/W_{i,j}}. Thus, x*_i would not be the optimal solution of (7.7), which is a contradiction.
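Formula (7.7) is essentially a one-liner in code; the small epsilon below guards against zero weight components, which is an implementation choice rather than part of the chapter.

```python
import numpy as np

def tchebycheff(fx, weight, z_star, eps=1e-12):
    """Variant Tchebycheff aggregation g^TE(x | W_i, Z*) of Eq. (7.7)."""
    fx, weight, z_star = (np.asarray(a, dtype=float) for a in (fx, weight, z_star))
    return float(np.max(np.abs(fx - z_star) / np.maximum(weight, eps)))
```

In Step 2.4, a new solution y_j replaces the neighboring solution x_k whenever tchebycheff(F(y_j), λ_k, z) is smaller than tchebycheff(F(x_k), λ_k, z).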

7.4 Experimental Results and Discussion

In this section, MOEA/D [9], MOEA/D-AWA [20], and NSGAII [4] are used to com-
pare with MOEA/DA to solve the MOP of vehicle crashworthiness problem. These
algorithms are implemented on a personal computer (Intel Xeon CPU 2.53 GHz,
3.98 G RAM). The individuals are all coded as the real vectors. Polynomial muta-
tion and simulated binary crossover (SBX [21]) are used in MOEA/DA. Distribution
index is 20 and crossover probability is 1 in the SBX operator. Distribution index
is 20 and mutation probability is 0.1 in mutation operator. The population size is
105. Each algorithm is run 20 times independently and stops after 500 generations.
In real-world cases, the Pareto optimal solutions are often not available. Therefore,
to compare the performance of these algorithms on the vehicle crashworthiness problem quantitatively, the HV metric [22] and the coverage metric [23] (C metric) are used.
Table 7.1 presents the mean and standard deviation of the C and HV metrics obtained by MOEA/DA, MOEA/D, MOEA/D-AWA, and NSGAII. In this experiment, the reference point is set to (1700, 12, 1.1). From the table, it can be seen that the convergence performance of MOEA/DA is better than that of MOEA/D, MOEA/D-AWA, and NSGAII; the mean values of the C metric obtained by these four algorithms are all smaller than 0.04, which indicates that the convergence performances of these four algorithms

Table 7.1 C and HV obtained by MOEA/DA, MOEA/D, MOEA/D-AWA, and NSGAII on vehicle
crashworthiness problem (A represents the algorithm MOEA/DA and B represents the algorithms
MOEA/D, MOEA/D-AWA, and NSGAII)
MOEA/DA MOEA/D MOEA/D-AWA NSGAII
C(A,B) Mean NA 0.0156 0.0214 0.0345
Std NA 0.0062 0.0071 0.0102
C(B,A) Mean NA 0.0084 0.0101 0.0135
Std NA 0.0025 0.0094 0.0100
HV Mean 103.5694 96.8083 99.0426 102.4931
Std 1.2827 5.8045 2.1546 1.2017

are almost the same. The mean value of HV obtained by MOEA/DA is much bigger than those obtained by MOEA/D, MOEA/D-AWA, and NSGAII on VCP, which indicates that the coverage and convergence of the solutions obtained by MOEA/DA toward the true PF are better than those of the solutions obtained by MOEA/D, MOEA/D-AWA, and NSGAII. Moreover, the mean value of the HV metric obtained by MOEA/DA is bigger than that obtained by MOEA/D-AWA on VCP, which indicates that MOEA/DA can effectively approach the true PF. In summary, the comparison of the simulation results of these four algorithms shows that MOEA/DA is able to obtain better spread, better distributed, and more convergent PFs.

7.5 Conclusions

In this paper, a decomposition-based evolutionary algorithm with adaptive weight adjustment is designed to solve the vehicle crashworthiness problem. The goal of the proposed algorithm is to adaptively modify the weight vectors so as to enhance the search efficiency and the diversity of decomposition-based MOEAs. In this work, an adaptive weight adjustment strategy is used to adaptively change the weight vectors, a selection strategy is used to help the solutions converge to the Pareto optimal solutions, and an external elite population is used to maintain the diversity of the obtained non-dominated solutions. Moreover, the proposed algorithm is compared with the three well-known algorithms MOEA/D, MOEA/D-AWA, and NSGAII. Simulation results show that the proposed algorithm can solve VCP well.

Acknowledgements This work was supported by National Natural Science Foundations of


China (no. 61502290, no. 61401263, no. 61672334), China Postdoctoral Science Foundation (no.
2015M582606), Fundamental Research Funds for the Central Universities (no. GK201603094,
no. GK201603002), and Natural Science Basic Research Plan in Shaanxi Province of China (no.
2016JQ6045, no. 2015JQ6228).

References

1. Du Bois, P., et al.: Vehicle crashworthiness and occupant protection. American Iron and Steel
Institute, Southfield, MI, USA, Report (2004)
2. Liao, X., Li, Q., Yang, X., Zhang, W., Li, W.: Multiobjective optimization for crash safety design
of vehicles using stepwise regression model. Struct. Multidiscipl. Optim. 35(6), 561–569 (2008)
3. Van Veldhuizen, D.A.: Multiobjective Evolutionary Algorithms: Classifications, Analyses, and
New Innovations. Air Force Institute of Technology Wright Patterson AFB, OH, USA (1999)
4. Deb, K., et al.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol.
Comput. 6(2), 182–197 (2002)
5. Tang, B., Zhu, Z., Shin, H., Tsourdos, A., Luo, J.: A framework for multi-objective optimisation
based on a new self-adaptive particle swarm optimisation algorithm. Inf. Sci. 420, 364–385
(2017)
6. Wang, X.P., Tang, L.X.: An adaptive multi-population differential evolution algorithm for
continuous multi-objective optimization. Inf. Sci. 348, 124–141 (2016)
7. Shang, R.H., Jiao, L.C., Liu, F., Ma, W.P.: A novel immune clonal algorithm for MO problems.
IEEE Trans. Evol. Comput. 16(1), 35–50 (2012)
8. Zhan, Z.H., Li, J.J., Cao, J.N., Zhang, J., Chung, H.H., Shi, Y.H.: Multiple populations for mul-
tiple objectives: a coevolutionary technique for solving multiobjective optimization problems.
IEEE Trans. Cybern. 43(2), 445–463 (2013)
9. Zhang, Q.F., Li, H.: MOEA/D: a multiobjective evolutionary algorithm based on decomposi-
tion. IEEE Trans. Evol. Comput. 11(6), 712–731 (2007)
10. Zhao, S.Z., Suganthan, P.N., Zhang, Q.F.: Decomposition-based multiobjective evolutionary
algorithm with an ensemble of neighborhood sizes. IEEE Trans. Evol. Comput. 16(3), 442–446
(2012)
11. Wang, L., Zhang, Q., Zhou, A.: Constrained subproblems in a decomposition-based multiob-
jective evolutionary algorithm. IEEE Trans. Evol. Comput. 20(3), 475–480 (2016)
12. Zhu, H., He, Z., Jia, Y.: A novel approach to multiple sequence alignment using multiobjec-
tive evolutionary algorithm based on decomposition. IEEE J. Biomed. Health Inform. 20(2),
717–727 (2016)
13. Jiang, S., Yang, S.: An improved multiobjective optimization evolutionary algorithm based on
decomposition for complex Pareto fronts. IEEE Trans. Cybern. 46(2), 421–437 (2016)
14. Zhou, A., Zhang, Q.: Are all the subproblems equally important? Resource allocation in
decomposition-based multiobjective evolutionary algorithms. IEEE Trans. Evol. Comput.
20(1), 52–64 (2016)
15. Zhang, H., Zhang, X., Gao, X., et al.: Self-organizing multiobjective optimization based on
decomposition with neighborhood ensemble. Neurocomputing 173, 1868–1884 (2016)
16. Li, H., Zhang, Q.F.: Multiobjective optimization problems with complicated Pareto sets,
MOEA/D and NSGA-II. IEEE Trans. Evol. Comput. 13(2), 284–302 (2009)
17. Al Mpubayed, N., Petrovski, A., McCall, J.: D2MOPSO: MOPSO based on decomposition
and dominance with archiving using crowding distance in objective and solution spaces. Evol.
Comput. 22(1), 47–78 (2014)
18. Zhang, H., et al.: Self-organizing multiobjective optimization based on decomposition with
neighborhood ensemble. Neurocomputing 173, 1868–1884 (2016)
19. Dai, C., Lei, X.: A decomposition-based multiobjective evolutionary algorithm with adaptive
weight adjustment. Complexity 2018 (2018)
20. Qi, Y., Ma, X., Liu, F., Jiao, L., Sun, J., Wu, J.: MOEA/D with adaptive weight adjustment.
Evol. Comput. 22(2), 231–264 (2014)
21. Deb, K.: Multiobjective Optimization Using Evolutionary Algorithms. Wiley, New York (2001)
22. Deb, K., Sinha, A., Kukkonen, S.: Multi-objective test problems, linkages, and evolutionary
methodologies. In: Proceedings of the 8th Annual Conference on Genetic and Evolutionary
Computation GECCO’06, Seattle, WA, pp. 1141–1148 (2006)
23. Zitzler, E., Thiele, L.: Multi-objective evolutionary algorithms: a comparative case study and
the strength Pareto approach. IEEE Trans. Evol. Comput. 3(4), 257–271 (1999)
Chapter 8
Brainstorm Optimization in Thinned
Linear Antenna Array with Minimum
Side Lobe Level

Ninjerdene Bulgan, Junfeng Chen, Xingsi Xue, Xinnan Fan


and Xuewu Zhang

Abstract An antenna array (or array antenna) is composed of multiple individual


antennas to produce a high directive gain or a specified pattern. Thinning involves reducing the total number of active elements without degrading system performance. In this paper, a variant of the Brainstorm Optimization (BSO) algorithm is
proposed for thinning the linear arrays. The proposed thinning algorithm is employed
to minimize the side lobe level and enhance the ratio directivity/side lobe level. The
results show good agreement between the desired and calculated radiation patterns
with reduction in resource usage in terms of power consumption.

Keywords Brainstorm optimization · Linear antenna array · Thinned array · Side


lobe level

8.1 Introduction

An antenna array consists of radiating elements (individual antennas) configured in a geometrical order, in which these multiple individual antennas work together as a single antenna to produce a high directive gain. Because of its technological importance, it is integrated into wireless communication equipment and electronic devices, especially in robots for radar purposes, tracking, remote sensing, ground radio, satellite communication, and other applications [1].

N. Bulgan · J. Chen (B) · X. Fan · X. Zhang


College of IOT Engineering, Hohai University, Changzhou 213022, China
e-mail: chen-1997@163.com
N. Bulgan
e-mail: ninjerdene@hhu.edu.cn
X. Xue
College of Information Science and Engineering, Fujian University of Technology, Fuzhou
350118, China
e-mail: jack8375@gmail.com


Thinning of an antenna array plays a vital role in antenna design, as it reduces the number of elements taking part in the formation of the radiation beam. From the design perspective, antenna array thinning should be cost-effective, which includes energy efficiency. It should also provide controllable array patterns. To be specific, its radiating (receiving) pattern must have properties such as a narrow main beam width with high gain, the lowest possible side lobe levels to comply with radio frequency utilization and other regulatory requirements, and nulls at certain angles in the side lobe region. All these properties of the thinned antenna array are conducive to countering jamming effects and improving signal receiving. The main beam can be steered (beam scanning) without pattern distortion as much as possible.
A successfully thinned antenna technically removes some percentage of the radiating elements in the array while keeping the system performance unchanged. The above properties make a thinned antenna array superior to a completely filled antenna array [2, 3]. Antenna array thinning is a combinatorial optimization problem. Many traditional optimization algorithms have produced unsatisfactory results because the number of possible element combinations for a large array is enormous and grows exponentially with the number of array elements, so finding the optimal solution exhaustively becomes impractical. Swarm intelligence [4, 5] and its derivatives, such as the Ant Colony Optimization (ACO), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO), have proved practically useful in solving the thinning problem and attaining an optimal combination of the antenna array elements.
The derivatives of swarm intelligence algorithms were employed by various stud-
ies in solving linear array antenna synthesis. Haupt proposed a method using genetic
algorithms to optimally thin both linear and planar arrays. The algorithm determined
which elements were turned off in a periodic array to give the lowest maximum
relative side lobe level with 200 elements in linear and planar arrays [6]. Ares-Pena
et al. applied a genetic-based algorithm on array pattern synthesis with respect to
two linear arrays and one involving linear and planar arrays [7]. Marcano and Duran
discussed two techniques for the synthesis of the complex radiation pattern by fus-
ing Schelkunoff’s method and GAs for linear arrays with arbitrary radiation patterns
and the synthesis of planar arrays with rectangular cells [8]. Chen et al. presented
a modified real GA for the optimization of element position by reducing the Peak
Side Lobe Level (PSLL) of the array with respect to sparse linear arrays [9]. Jain and
Mani proposed a GA with the general concept of an antenna array, array thinning,
and dynamic thinning and applied it on linear and planar arrays to reduce the total
number of active elements [10]. Ha et al. derived modified compact GA by improving
the probability vector parameters and adding suitable learning scheme between these
parameters to improve the optimized synthesis of different-sized linear and planar
thinned arrays [11]. ACO was proposed by Quevedo-Teruel et al. for pattern synthe-
sis of thinned linear and planar arrays design with minimum Side Lobe Level (SLL)
[12]. Li et al. proposed an improved PSO for electromagnetic applications to subdue
the drawbacks of the standard PSO with implementation on linear as well as planar
array [13]. Mandal et al. designed an evolutionary algorithm called the novel PSO
algorithm capable of solving general n-dimensional, linear and nonlinear optimization
problems with respect to the synthesis of linear array geometry with minimum
side lobe level [14]. Wang et al. proposed a modified binary PSO in the synthesis
of thinned linear and planar arrays with a lower SLL. The chaotic sequences were
embedded in the proposed algorithm to determine the inertia weight of the binary
PSO for the diversity of particles, resulting in improved performance [15]. Ma et al.
modeled a hybrid optimization method of particle swarm optimization and convex
optimization of which the peak side lobe level is considered as the objective function
to optimize the linear array synthesis [16].
In this paper, we present the method of optimization of uniformly spaced lin-
ear arrays based on a Brainstorm Optimization (BSO) algorithm. The remainder of
this paper is organized as follows. In Sect. 8.2, the thinned linear antenna array is
described. In Sect. 8.3, the brainstorm optimization is modified for thinned antenna
array. Simulation experiments and comparisons are provided in Sect. 8.4. Finally,
the conclusion is given in Sect. 8.5.

8.2 Thinned Linear Antenna Array

A nonuniform linear array antenna with N symmetrical elements is depicted in Fig. 8.1 as the starting point for the mathematical modeling of thinned linear array synthesis. Here, θ is the radiation beam angle, d is the distance between adjacent elements, and θ0 is the angle of the observation point.
According to a pattern multiplication rule derived from antenna theory, the array
factor for a linear array depicted in Fig. 8.1, consisting of N elements uniformly
spaced can be written as


F(\varphi, \theta) = \sum_{m=1}^{N} f_m(\varphi, \theta)\, A_m\, e^{\, j \frac{d_m}{\lambda} (\cos\theta \sin\varphi - \cos\theta_0 \sin\varphi_0)}   (8.1)

In this paper, the following antenna constraints are assumed for simplicity of calculation: amplitude-only excitation with no phase difference, a uniform array of elements, i.e., f_m(\varphi, \theta) = 1 and A_m = 1, and uniform spacing of λ/2 between neighboring elements.

Fig. 8.1 Geometry of the N-element linear array along the x-axis (elements 1, 2, …, N spaced d apart on the x-axis, with beam angle θ toward the target direction)

So, the array factor for this case is


F(\theta) = \sum_{m=0}^{N-1} e^{\, j \frac{d_m}{\lambda} (\sin\theta - \sin\theta_0)}   (8.2)

In thinned array synthesis, the elements are enabled or disabled in a certain sequence to obtain the desired pattern characteristics, i.e., f_m = 1 or f_m = 0, so the array factor can be written as


F(\theta) = \sum_{m=0}^{N-1} f_m \, e^{\, j \frac{d_m}{\lambda} (\sin\theta - \sin\theta_0)}   (8.3)

From this equation, we can see that F(θ) is a complex nonlinear continuous function. Typically, the array factor given by the above formula is expressed as an absolute value, normalized to its maximum, and plotted on a dB scale.
In our case, we search for the Minimum Side Lobe Level (MSLL) value in dB:

MSLL = \max_{\theta \in S} \{ F_{dB}(\theta) \} = \max_{\theta \in S} \left| \frac{F(\theta)}{\max(F(\theta))} \right|   (8.4)

where S denotes the side lobe region, \max F(\theta) is the peak of the main beam, and S = \{\theta \mid \theta_{\min} \le \theta \le \theta_0 - \varphi_0\} \cup \{\theta \mid \theta_0 + \varphi_0 \le \theta \le \theta_{\max}\}.
To suppress the SLL, the fitness function can be defined as

f = \min(\mathrm{MSLL})   (8.5)
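To make the objective concrete, the following is a minimal C sketch (not the authors' code) that evaluates the normalized array factor of Eq. (8.3) on an angular grid and returns the side lobe level in dB of Eq. (8.4). The conventional 2πd/λ phase factor, uniform spacing d_m = m·d, the grid resolution, and the half-beamwidth used to exclude the main-beam region are all assumptions made for illustration.

#include <complex.h>
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Maximum side lobe level (dB) of a thinned linear array, cf. Eqs. (8.3)-(8.4).
 * f[m]      : element on/off mask (1 = active, 0 = removed), m = 0..n-1
 * d_lambda  : element spacing in wavelengths (0.5 for half-wavelength spacing)
 * theta0    : steering angle in radians
 * beam_excl : half-width (radians) of the main-beam region excluded from the search */
double mssl_db(const int *f, int n, double d_lambda, double theta0, double beam_excl)
{
    const int steps = 1800;                 /* angular grid resolution (assumption) */
    double peak = 0.0, side = 0.0;

    for (int k = 0; k <= steps; ++k) {
        double theta = -M_PI / 2.0 + M_PI * k / steps;     /* scan -90 to +90 degrees */
        double complex F = 0.0;
        for (int m = 0; m < n; ++m)
            if (f[m])                        /* only active elements contribute */
                F += cexp(I * 2.0 * M_PI * d_lambda * m * (sin(theta) - sin(theta0)));
        double mag = cabs(F);
        if (mag > peak) peak = mag;                          /* main-beam peak   */
        if (fabs(theta - theta0) > beam_excl && mag > side)
            side = mag;                                      /* highest side lobe */
    }
    return 20.0 * log10(side / peak);        /* side lobe level relative to main beam */
}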

8.3 Modified Brainstorm Optimization in Thinned Linear Antenna Array Synthesis

Swarm intelligence is a family of population-based optimization algorithms, and Brainstorm Optimization (BSO) recently emerged as one of its latest members [17]. The algorithm is inspired by the human brainstorming process: people face challenging problems in all fields, and these problems present themselves in diverse ways; to solve some of them, people gather and brainstorm, generating and combining ideas until workable solutions emerge. BSO mimics this idea-generation process to search for solutions to optimization problems.
In the context of antennas, the BSO algorithm for thinning optimization of a linear array follows a stepwise process. The population is initialized to meet a certain sparse rate, and the algorithm then proceeds iteratively, aborting when the termination criterion is satisfied. In each iteration, the population is clustered, and after the constraint conversion of each individual the corresponding fitness value is computed. The best individual in each cluster is recorded as the cluster center, and the overall best individual is also recorded; if the replacement requirement is met, one cluster center is
randomly selected and replaced. New individuals are then generated and a competitive selection operation is performed. This process continues until the termination condition is met. Figure 8.2 shows the flowchart of the thinning operation.

Fig. 8.2 Flowchart of the BSO algorithm for thinning a linear array (population initialization, clustering, new individual generation, selection, binary operation, and solution evaluation, repeated until the stopping condition is satisfied)
For the above BSO steps to conform to the characteristics of the thinned linear array, some of the procedures need modification. The processes to be adjusted are as follows:
Population clustering: Group n individuals into m clusters by a clustering algorithm.
New individual generation operation: Select one or two cluster(s) randomly to gen-
erate new individual (solution).
Selection: The newly generated individual is compared with the existing individual
with the same individual index and the better one is kept and recorded as the new
individual.
Binary operation: The candidate solution is obtained using the equation shown below (a minimal code sketch of this step follows the list):

x_{ij}(t+1) = \begin{cases} 1 & \text{if } r_{ij} < x_{ij}(t) \\ 0 & \text{otherwise} \end{cases}   (8.6)

where r_{ij} is a uniform random number in the range [0, 1] and the normalization function is the sigmoid function

x_{ij}(t) = \mathrm{sig}(x_{ij}(t)) = \frac{1}{1 + e^{-x_{ij}(t)}}   (8.7)

Evaluate the n individuals (solutions).
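A minimal C sketch of the binary operation of Eqs. (8.6)-(8.7) is given below. The helper name and the use of rand() as the uniform random source are our assumptions; the function maps a real-valued BSO individual to the on/off element mask that the fitness function evaluates.

#include <math.h>
#include <stdlib.h>

/* Map a real-valued BSO individual x[0..n-1] to a binary element mask, Eqs. (8.6)-(8.7). */
void binarize_individual(const double *x, int *mask, int n)
{
    for (int j = 0; j < n; ++j) {
        double sig = 1.0 / (1.0 + exp(-x[j]));           /* Eq. (8.7): sigmoid normalization   */
        double r   = (double)rand() / (double)RAND_MAX;  /* uniform random number in [0, 1]    */
        mask[j] = (r < sig) ? 1 : 0;                     /* Eq. (8.6): threshold against r_ij  */
    }
}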

8.4 Experimental Simulation

The uniform linear array in the experiment is set to 100 array elements with an element spacing of half a wavelength (0.5 m), equal omnidirectional element amplitudes, a wavelength of 1 m, and an array aperture of 49.5 m. The antenna beam is pointed at 0°. To achieve a minimum side lobe level, the array is thinned to 50 active elements and the thinned directional pattern is optimized for the lowest side lobe level.
The thinned antenna array obtained with the modified BSO algorithm is shown in Fig. 8.3, and the element locations are illustrated in Fig. 8.4.
The changes of Ymin and Ymax during the individual transformation process are shown in Fig. 8.5; the suspected change range lies within (0, Ymin) ∪ (Ymax, 1) during the constraint conversion, and the process is run for 200 iterations. Figure 8.6 shows the best fitness curves of both BSO and GA over 200 iterations. The curves show that the GA surpasses the BSO for the first iterations (before about 40 iterations), but afterwards the modified BSO clearly outperforms the GA. In terms of reaching a local optimum, the BSO attains it within a shorter period. This implies that the BSO algorithm has good applicability and importance to the synthesis of the thinned linear array.

Fig. 8.3 The directional pattern of the optimized antenna array (array gain in dB versus angle in degrees)

Fig. 8.4 The element locations of the thinned antenna array (element identification versus array element position)

Fig. 8.5 Ymax/Ymin change (variation curves of Ymax and Ymin during the run)

8.5 Conclusions

The combinatorial nature of antenna array thinning is challenging, and this makes the design of a suitable algorithm for thinning a large-scale antenna array very complex and difficult. The application of our modified BSO algorithm to the pattern synthesis of a
linear antenna array is successful, and the simulation results of the proposed BSO algorithm establish its importance and good applicability to the synthesis of thinned linear arrays. The superiority of the proposed algorithm is verified by comparing it with the GA, which our modified algorithm outperforms (Fig. 8.6).

Fig. 8.6 The fitness evolution curves of BSO and GA over 200 iterations

Acknowledgements This work is supported by the National Key R&D Program of China (No.
2018YFC0407101), Fundamental Research Funds for the Central Universities (No. 2019B22314),
National Natural Science Foundation of China (No. 61403121), Program for New Century Excellent
Talents in Fujian Province University (No. GYZ18155), Program for Outstanding Young Scientific
Researcher in Fujian Province University (No. GY-Z160149), and Scientific Research Foundation
of Fujian University of Technology (No. GY-Z17162).

References

1. Bevelacqua, P.: Array Antennas. Antenna-theory.com. Accessed 23 Feb 2017


2. Schwartzman, L.: Element behavior in a thinned array. IEEE Trans. Antennas Propag. 15(4),
571–572 (1967)
3. Schrank, H., Hacker, P.S.: Antenna designer’s notebook-thinned arrays: some fundamental
considerations. IEEE Antennas Propag. Mag. 34(3), 43–44 (1992)
4. Miller, P.: The Smart Swarm: How Understanding Flocks, Schools, and Colonies Can Make
Us Better at Communicating, Decision Making, and Getting Things Done. Avery Publishing
Group, Inc. ISBN 978-1-58333-390-7 (2010)
5. Chen, J.F., Wu, T.J.: A computational intelligence optimization algorithm: cloud drops algo-
rithm. Integr. Comput.-Aided Eng. 21(2), 177–188 (2014)
6. Haupt, R.L.: Thinned arrays using genetic algorithms. IEEE Trans. Antennas Propag. 42(7),
993–999 (1994)
7. Ares-Pena, F.J., Rodriguez-Gonzalez, J.A., Villanueva-Lopez, E., Rengarajan, S.R.: Genetic
algorithms in the design and optimization of antenna array patterns. IEEE Trans. Antennas
Propag. 47(3), 506–510 (1999)
8. Marcano, D., Durán, F.: Synthesis of antenna arrays using genetic algorithms. IEEE Antennas
Propag. Mag. 42(3), 12–20 (2000)

9. Chen, K., He, Z., Han, C.: A modified real GA for the sparse linear array synthesis with multiple
constraints. IEEE Trans. Antennas Propag. 54(7), 2169–2173 (2006)
10. Jain, R., Mani, G.S.: Solving antenna array thinning problem using genetic algorithm. Appl.
Comput. Intell. Soft Comput. 24 (2012)
11. Ha, B.V., Mussetta, M., Pirinoli, P., Zich, R.E.: Modified compact genetic algorithm for thinned
array synthesis. IEEE Antennas Wirel. Propag. Lett. 15, 1105–1108 (2016)
12. Quevedo-Teruel, O., Rajo-Iglesias, E.: Ant colony optimization in thinned array synthesis with
minimum sidelobe level. IEEE Antennas Wirel. Propag. Lett. 5, 349–352 (2006)
13. Li, W.T., Shi, X.W., Hei, Y.Q.: An improved particle swarm optimization algorithm for pattern
synthesis of phased arrays. Prog. Electromagn. Res. 82, 319–332 (2008)
14. Mandal, D., Das, S., Bhattacharjee, S., Bhattacharjee, A., Ghoshal, S.: Linear antenna array
synthesis using novel particle swarm optimization. In: 2010 IEEE Symposium on Industrial
Electronics and Applications (ISIEA), pp. 311–316, Oct 2010
15. Wang, W.B., Feng, Q.Y., Liu, D.: Synthesis of thinned linear and planar antenna arrays using
binary PSO algorithm. Prog. Electromagn. Res. 127, 371–388 (2012)
16. Ma, S., Li, H., Cao, A., Tan, J., Zhou, J.: Pattern synthesis of the distributed array based on
the hybrid algorithm of particle swarm optimization and convex optimization. In: 2015 11th
International Conference on Natural Computation (ICNC), pp. 1230–1234, Aug 2015
17. Shi, Y.: Brain storm optimization algorithm. In: International conference in swarm intelligence,
pp. 303–309, June 2011. Springer, Berlin, Heidelberg
Chapter 9
Implementation Method of SVR
Algorithm in Resource-Constrained
Platform

Bing Liu, Shoujuan Huang, Ruidong Wu and Ping Fu

Abstract With the development of the Internet of Things and edge computing, machine learning algorithms need to be deployed on resource-constrained embedded platforms. Support Vector Regression (SVR) is one of the most popular algorithms for problems characterized by small samples, high dimensionality, and nonlinearity, owing to its good generalization ability and prediction performance. However, the SVR algorithm requires considerable resources when it is implemented. Therefore, this paper proposes a method to implement the SVR algorithm on a resource-constrained embedded platform. The method analyses the characteristics of the data in the SVR algorithm and the solution process of the algorithm. Then, according to the characteristics of the embedded platform, the implementation process of the algorithm is optimized. Experiments using UCI datasets show that the implemented SVR algorithm is correct and effective, and that the optimized SVR algorithm reduces both time and memory consumption, which is of great significance for implementing the SVR algorithm on resource-constrained embedded platforms.

Keywords SVR algorithm · Resource-constrained · Embedded platform · Implementation method

B. Liu · S. Huang · R. Wu · P. Fu (B)


Harbin Institute of Technology, Harbin, China
e-mail: fuping@hit.edu.cn
B. Liu
e-mail: liubing66@hit.edu.cn
S. Huang
e-mail: 1150110204@stu.hit.edu.cn
R. Wu
e-mail: 17B901027@stu.hit.edu.cn


9.1 Introduction

In early embedded intelligent systems, all calculations, such as A/D conversion, signal conditioning, and the dimensional transformation of sensor data, were concentrated in the MCU [1]. With the advent of smart sensors, however, these computing tasks related to smart sensors are transferred to the front end or back end of embedded intelligent systems. Such calculations assigned to smart sensors can also be referred to as edge computing for embedded systems. This transfer of calculations makes the system more uniform and more real time, and allows the MCU to take on new tasks. Therefore, implementing machine learning algorithms on resource-constrained embedded platforms is of great significance and economic value.
In recent years, machine learning has developed rapidly. Support Vector Regression (SVR) is widely used in pattern recognition, probability density function estimation, time series prediction, and regression estimation. In most application scenarios, data are collected by the embedded platform and sent to a PC, and the training process of the SVR algorithm is performed on the PC rather than on the embedded platform.
Since the SVR algorithm occupies a large amount of resources, especially memory, during the training process, it is difficult to implement on a resource-constrained embedded platform, and there is little related research in this field. Therefore, this paper proposes a method to implement the SVR algorithm on a resource-constrained embedded platform and to reduce its resource and time consumption. The data structure of the SVR algorithm and its solution flow are analyzed and then optimized considering the constrained resources of the embedded platform. UCI datasets are then used to verify the correctness of the implemented SVR algorithm and the effectiveness of the proposed time and memory optimization method.
The rest of the paper is structured as follows. The second and third sections introduce the principles of the SVR algorithm and the SMO algorithm, respectively; the fourth section proposes the implementation and optimization method of this paper; the fifth section carries out experimental verification and analysis; and the sixth section concludes the paper.

9.2 SVR Algorithm

Linear support vector machines were proposed by Cortes and Vapnik [2]. At the same time, Boser, Guyon, and Vapnik introduced the kernel trick and proposed nonlinear support vector machines [3]. Drucker et al. then extended the method to support vector regression [4].
For a training set T = {(x_1, y_1), …, (x_l, y_l)}, where x_i ∈ R^n, i = 1, …, l, is the feature vector, y_i ∈ R is the target value, and l is the number of training samples, SVR seeks a linear model f(x) = w^T x + b such that f(x) is as close as possible to y, where w and b are the model parameters to be determined. For nonlinear problems, the original feature vector is mapped to a high-dimensional feature space F using a map φ(x), and the training data are then linearly regressed in this feature space. The problem to be solved by support vector regression can be formalized as

\min_{w,b,\xi,\xi^*} \ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*)
\quad \text{s.t.} \quad
\begin{cases}
w^T \phi(x_i) + b - y_i \le \varepsilon + \xi_i, \\
y_i - w^T \phi(x_i) - b \le \varepsilon + \xi_i^*, \\
\xi_i, \xi_i^* \ge 0, \quad i = 1, \ldots, l.
\end{cases}   (9.1)

Among them C > 0 is the regularization constant, which is used to balance the
complexity and generalization ability of the model. ε > 0 is the upper error limit,
indicating that the sample with an absolute error less than ε is not punished. ξi , ξ∗i ≥ 0
are the relaxation factors used to process samples that exceed the error limit. Using
the Lagrange multiplier method, we can get the dual problem:

\min_{\alpha,\alpha^*} \ \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) + \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*)

\text{s.t.} \quad \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i, \alpha_i^* \le C, \quad i = 1, \ldots, l.   (9.2)
   
where K(x_i, x_j) \equiv \phi(x_i)^T \phi(x_j) is a kernel function introduced to avoid calculating the inner product in the high-dimensional feature space. The most widely used kernel function is the Gaussian kernel K(x_1, x_2) = \exp(-\gamma \|x_1 - x_2\|^2), where \gamma = \frac{1}{2\sigma^2}. The KKT conditions need to be met in the process of solving the above dual problem.
After solving the above problem (9.2), the final model can be obtained as

f(x) = \sum_{i=1}^{l} (-\alpha_i + \alpha_i^*) K(x_i, x) + b.   (9.3)
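For illustration, a minimal C sketch of the final model (9.3) with the Gaussian kernel is given below. The flattened row-major layout of the training samples and the function names are our assumptions, not the paper's (or LIBSVM's) implementation.

#include <math.h>

/* Gaussian kernel K(x1, x2) = exp(-gamma * ||x1 - x2||^2). */
static double gauss_kernel(const double *x1, const double *x2, int dim, double gamma)
{
    double s = 0.0;
    for (int k = 0; k < dim; ++k) {
        double d = x1[k] - x2[k];
        s += d * d;
    }
    return exp(-gamma * s);
}

/* SVR prediction f(x) = sum_i (-alpha_i + alpha_i*) K(x_i, x) + b, Eq. (9.3).
 * sv   : l training samples stored row-major (l * dim values)
 * coef : the l combined coefficients (-alpha_i + alpha_i*)          */
double svr_predict(const double *sv, const double *coef, int l, int dim,
                   double gamma, double b, const double *x)
{
    double f = b;
    for (int i = 0; i < l; ++i)
        f += coef[i] * gauss_kernel(&sv[i * dim], x, dim, gamma);
    return f;
}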

9.3 SMO Algorithm

The training process of SVR is essentially the process of solving the dual problem of
the primary convex quadratic programming problem. At first, solve the dual problem
and get the optimal solution (−α + α∗), and then calculate b in the optimal solu-
tion of the original problem. Such convex quadratic programming problems have
global optimal solutions, and many optimization algorithms can be used to solve

them. The Sequential Minimal Optimization (SMO [5]) algorithm is one of the most popular methods. The dual problem of convex quadratic programming to be solved by the SMO algorithm can be re-expressed as

1  ∗ T T
K −K α ∗ T  α∗
min α ,α + εe − y , εe + y
T T T
α,α∗ 2 −K K α α
 
α∗
s.t. yT = 0, 0 ≤ αi , αi∗ ≤ C, i = 1, . . . , l, (9.4)
α

Among them
⎡ ⎤T
 
K −K 2 
Q= , Kij = exp −γ xi − xj  , y = ⎣1, . . . , 1, −1, . . . , −1⎦ .
−K K      
l l

The SMO algorithm is a heuristic algorithm. The basic idea is that if the solutions of all variables satisfy the Karush–Kuhn–Tucker (KKT) optimality conditions [6] of the optimization problem, then the solution of the optimization problem has been obtained, because the KKT conditions are necessary and sufficient for this problem. Otherwise, the original quadratic programming problem is repeatedly decomposed into subproblems with only two variables, and the subproblems are solved analytically until all variables satisfy the KKT conditions. In each subproblem, one variable is the one that violates the KKT conditions most seriously, and the other is determined automatically by the equality constraint, so the two variables are updated simultaneously. Because the subproblems have analytical solutions, each subproblem is solved very quickly; although the number of subproblems is large, the method is generally efficient. The SMO algorithm mainly consists of two parts: an analytical method for solving the two-variable quadratic program and a heuristic method for selecting the subproblem. As shown in (9.4), the main difficulty in implementing the SVR algorithm on a resource-constrained embedded platform is that the matrix Q consumes a lot of memory and computation, which needs to be considered and resolved.

9.4 Method

Given formula (9.5) and the principle of the SMO algorithm, this paper first implemented the initial version of the SVR algorithm and then optimized it according to the characteristics of the resource-constrained embedded platform, after analyzing the data structures and the algorithm flow. The flowcharts of the initial and the optimized SVR algorithm are shown in Fig. 9.1.

Fig. 9.1 The flowchart of SVR (the left one is initial and the right one is optimized): the initial version initializes the Q, Y, G, and Alpha matrices and updates the Alpha Status, while the optimized version initializes the K, QC, Y, G, and Alpha matrices; both select the (sub-)working set, calculate the Alpha matrix, update the G matrix, and iterate until the stopping condition is satisfied, then calculate the offset b

Among them,

G = [(\varepsilon - y_1), \ldots, (\varepsilon - y_l), (\varepsilon + y_1), \ldots, (\varepsilon + y_l)]^T,   (9.5)

Alpha = [\alpha_1, \ldots, \alpha_l, \alpha_1^*, \ldots, \alpha_l^*]^T, \quad Alpha' = [-\alpha_1 + \alpha_1^*, \ldots, -\alpha_l + \alpha_l^*]^T,   (9.6)

Alpha\ Status = [s_1, \ldots, s_{2l}]^T, \quad s_i \in \{\text{upper}, \text{lower}, \text{free}\},   (9.7)

Q = \begin{bmatrix} K & -K \\ -K & K \end{bmatrix}, \quad Q' = \begin{bmatrix} K \\ -K \end{bmatrix}, \quad QC_{ij} = K_{ii} + K_{jj} - 2K_{ij}, \quad i, j = 1, \ldots, l.   (9.8)

9.4.1 Time Optimization

In the solution process of the SVR algorithm, the values in Q need to be used frequently, and the values in QC are calculated from the values in Q. Therefore, in order to avoid repeated operations and reduce the time spent on function calls, the matrices Q and QC are calculated and stored at the beginning.

At the same time, because the Gaussian kernel function is used in this paper, a large number of floating-point exponential operations are needed during computation, but most embedded platforms do not have a separate floating-point processing unit, so the exponential function is implemented in software and is time consuming. Therefore, this paper uses the approximation e^x \approx \left(1 + \frac{x}{N}\right)^N, where N ∈ R, to avoid calling the exponential function directly. Although this introduces some loss of accuracy, when N is large enough, such as N = 256, the loss can be ignored to some degree.
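A minimal C sketch of this substitution with N = 256 is shown below; it realizes (1 + x/256)^256 by eight repeated squarings instead of a library exp() call. It is an illustration under the stated choice of N, and the accuracy trade-off should be checked for the actual range of kernel arguments.

/* Approximate exp(x) as (1 + x/256)^256 using 8 squarings of (1 + x/256);
 * avoids the slow software floating-point exp() on MCUs without an FPU. */
static float exp_approx(float x)
{
    float y = 1.0f + x / 256.0f;
    y *= y;   /* ^2   */
    y *= y;   /* ^4   */
    y *= y;   /* ^8   */
    y *= y;   /* ^16  */
    y *= y;   /* ^32  */
    y *= y;   /* ^64  */
    y *= y;   /* ^128 */
    y *= y;   /* ^256 */
    return y;
}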

9.4.2 Memory Optimization

Note that the matrix Q consumes the most memory. For a dataset with l training samples and floating-point data on a 32-bit embedded platform, Q needs to occupy 4 * 4 * l * l bytes of RAM (a 2l × 2l matrix of 4-byte floats), but the memory of the embedded platform is very limited. In order to save memory, the implementation of the SVR algorithm utilizes the symmetry of Q to cut it down to Q', whose memory cost is only half that of Q. At the same time, the Alpha Status matrix in the original algorithm flow indicates the state of each sample and is updated with the value of Alpha after each subproblem is solved. This is not necessary: this paper determines the state of a sample by directly comparing the value of Alpha with 0 and C, thus saving the memory occupied by Alpha Status as well as the time needed to update it.
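One way to realize the halved storage suggested by the symmetry of the kernel matrix is to keep only its upper triangle in a packed one-dimensional array. The sketch below uses hypothetical names and a compile-time sample count purely for illustration; it is not the authors' code.

#define L_SAMPLES 40   /* number of training samples (assumption for illustration) */

/* Packed storage of a symmetric l x l kernel matrix: only the upper triangle
 * (l * (l + 1) / 2 entries) is kept, roughly halving the RAM cost. */
static float k_packed[(L_SAMPLES * (L_SAMPLES + 1)) / 2];

static float k_get(int i, int j, int l)
{
    if (i > j) { int t = i; i = j; j = t; }   /* use symmetry: K(i, j) = K(j, i)     */
    /* row i starts at offset i*l - i*(i-1)/2 in the packed upper-triangle layout */
    return k_packed[i * l - (i * (i - 1)) / 2 + (j - i)];
}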

9.5 Experimental Results

In this paper, the LIBSVM dataset [7] and UCI datasets [8–10] are used to verify the proposed optimization method. The experimental platform is a 32-bit ARM microcontroller, the STM32F103, with 512 KB of Flash and 64 KB of SRAM; the clock frequency used in the experiments is 72 MHz.
A training set of 40 samples was randomly selected, each containing 5 values: 4 input features and 1 output feature. The 4 input features were normalized to the interval (0, 1). Fivefold cross-validation was then performed on the 40 samples to test the prediction accuracy of the SVR. The RMSE and R2 of the prediction results of the primary algorithm and the optimized algorithm are shown in Table 9.1.
As shown in Table 9.1, the results of the primary and the optimized SVR algorithm are almost the same: their average root mean square errors (RMSE) differ by only about 1%, and the average R2 is above 0.91, which shows that both the primary and the optimized algorithm have good prediction accuracy.
Then the number of training samples and the number of input features are changed
to verify time and memory optimization. At first, the number of input features is fixed

Table 9.1 Prediction results of the primary algorithm and the optimized algorithm

           Before optimization        After optimization
           RMSE       R2              RMSE       R2
Fold 1     3.4129     0.9473          3.5262     0.9475
Fold 2     4.3135     0.9619          4.0453     0.9621
Fold 3     5.4762     0.9187          5.5684     0.9186
Fold 4     5.0085     0.9338          4.8987     0.9338
Fold 5     3.7809     0.8133          4.0094     0.8133
Average    4.3984     0.9150          4.4096     0.9151

as 4 and the number of training samples is 40, 50, 60, and 70, respectively. Then the number of training samples is fixed as 40 and the number of input features is 4, 7, 8, and 12, respectively. Using these data and the SVR algorithm before and after optimization, 16 experiments were performed; in each experiment, the time taken by the algorithm initialization process and by training, and the RAM and ROM occupied by the algorithm, were recorded. The results of the time and memory optimization are shown in Fig. 9.2.
The experimental results in Fig. 9.2 show that the time of the algorithm training
process is mainly related to the number of training samples, but has nothing to do
with the number of input features. The algorithm initialization process is related to
the number of training samples and the number of input features, which is consis-
tent with the previous analysis. Moreover, the optimization method proposed in this
paper reduces the time consumed by the SVR algorithm training process and the
initialization process by about 25%.
And the experimental results in Fig. 9.2 show that the RAM used in the algorithm
training process is mainly related to the number of training samples, and is indepen-
dent of the number of input features, which is consistent with the previous analysis.
At the same time, using the optimization method proposed in this paper, the ROM
occupied by the SVR algorithm is reduced by about 25%, and the RAM is reduced
by 22–24%.
The above experimental results prove that the proposed method for implementing
SVR algorithm in the resource-constrained platform is correct, and the proposed
method to improve the performance of the algorithm, including improving the run-
ning speed of the algorithm and reducing the memory consumption of the algorithm
is effective.

9.6 Conclusion

With the development of edge computing, machine learning and the Internet of
Things, it is of great significance to implement the shallow machine learning algo-
rithm in resource-constrained embedded platforms. SVR is a very extensive machine

Fig. 9.2 The results of time and memory optimization (time per iteration, initialization time, RAM, and ROM before and after optimization, plotted against the number of samples (40–70, 4 features) and against the dimension of the feature vectors (4–12, 40 samples))



learning algorithm, but it needs to take up a lot of resources in the training process,
so this paper analyses the characteristics of the process and the data of the algo-
rithm, and combines the characteristics of the embedded platform to optimize the
algorithm. The experimental results using the UCI datasets demonstrate that the time of each iteration of the SVR algorithm and the initialization time were both reduced by about 25% by calculating the frequently invoked data in advance, removing redundant algorithm steps, and introducing the substitute for the exponential function. The experimental results also demonstrate that the RAM cost was reduced by 22–24% and the ROM cost by about 25% by utilizing the symmetry of the data structure, removing unnecessary variables, and adjusting the flow of the algorithm.

References

1. Brereton, R.G., Lloyd, G.R.: Support vector machines for classification and regression. Analyst
135(2), 230–267 (2010)
2. Cortes, C., Vapnik, V.: Support-vector network. Mach. Learn. 20, 273–297 (1995)
3. Boser, B.E., Guyon, I.M., Vapnik, V.N.: A training algorithm for optimal margin classifiers. In:
Proceedings of the Fifth Annual Workshop on Computational Learning Theory—COLT ‘92,
p. 144
4. Drucker, H., Burges, C.J.C., Kaufman, L., Smola, A.J., Vapnik, V.N.: Support vector regression
machines. In: Advances in Neural Information Processing Systems 9, NIPS 1996, pp. 155–161.
MIT Press (1997)
5. Platt, J.C.: Fast training of support vector machines using sequential minimal optimization. In:
Advances in Kernel Methods. MIT Press (1999)
6. Kuhn, H.W., Tucker, A.W.: Nonlinear programming. In: Proceedings of 2nd Berkeley Sympo-
sium, pp. 481–492. Berkeley, University of California Press (1951)
7. Chang, C.C., Lin, C.J.: LIBSVM: A Library for Support Vector Machines, pp. 1–27. ACM
(2011)
8. Tüfekci, P.: Prediction of full load electrical power output of a base load operated combined
cycle power plant using machine learning methods. Int. J. Electr. Power Energy Syst. 60,
126–140 (2014). ISSN 0142-0615
9. Quinlan, R.: Combining instance-based and model-based learning. In: Proceedings on the Tenth
International Conference of Machine Learning, pp. 236–243. University of Massachusetts,
Amherst, Morgan Kaufmann (1993)
10. Waugh, S.: Extending and benchmarking cascade-correlation. Ph.D. thesis, Computer Science
Department, University of Tasmania (1995)
Chapter 10
A FPGA-Oriented Quantization Scheme
for MobileNet-SSD

Yuxuan Xie , Bing Liu , Lei Feng, Xipeng Li and Danyin Zou

Abstract The rising popularity of mobile devices that need high-performance object detection calls for methods to implement our algorithms efficiently on such devices. Deep learning is a good approach to achieving state-of-the-art results, but it needs a great deal of computation and resources, while mobile devices are often resource-limited because of their small size. FPGAs are famous for their parallelism, and many people now try to implement deep learning networks on FPGAs. After our investigation, we chose MobileNet-SSD to implement on FPGA, because this network is designed for mobile devices and its size and cost are relatively small. There are also challenges in implementing the network on FPGA, such as the large demand for resources and the need for low latency, which are important for mobile devices. In this paper, we present a quantization scheme for object detection networks based on FPGA and a process to simulate the FPGA on a PC, which helps us predict the performance of networks on FPGA. Besides, we propose an integer-only inference scheme for FPGA, which greatly reduces the cost of resources. The method of dynamic fixed point is adopted, and we make some improvements for object detection networks to quantize MobileNet-SSD, a suitable object detection network for embedded systems. Our improvements make its performance better than Ristretto.

Keywords Quantization · FPGA · MobileNet-SSD

Y. Xie · B. Liu (B) · L. Feng · X. Li · D. Zou


Harbin Institute of Technology, Harbin, China
e-mail: liubing66@hit.edu.cn
Y. Xie
e-mail: 1160100327@stu.hit.edu.cn
L. Feng
e-mail: hitfenglei@hit.edu.cn
X. Li
e-mail: leexp1997@126.com
D. Zou
e-mail: 17S101133@stu.hit.edu.cn


10.1 Introduction

Deep learning is gradually replacing traditional computer vision methods and plays an increasingly important role in object detection [1]. In order to obtain better performance, deep neural networks are becoming more complicated, and their requirements for computation, storage, and energy are extremely large and still increasing, as Table 10.1 shows. At the same time, applying this technology to FPGA is becoming more and more popular because of its parallelism and high performance [2]. But the resources of mobile devices are limited and precious, and FPGA is no exception. It can therefore be quite difficult to implement deep neural networks on FPGA and achieve good real-time performance, so approaches for reducing resource consumption and speeding up inference are very popular. Quantizing floating-point data to fixed point is a very effective way to achieve this.
The computational cost represents the number of calculations in one inference; its unit is GFLOPs, i.e., 10^9 floating-point operations. The DRAM access is the number of bytes read from and written to memory, and the throughput is the theoretical number of frames per second.
Approaches to quantization can roughly be divided into two categories. The first category focuses on designing novel networks that exploit computational efficiency to limit resource consumption, such as MobileNet [3] and SqueezeNet [4]. The others quantize the weights from floating point to other types to reduce the cost of resources; this methodology includes ternary weight networks (TWN [5]) and XNOR-Net [6]. Our scheme also focuses on quantizing floating-point data into fixed-point data with a smaller bit width.
It is known that floating-point arithmetic is more complicated than fixed-point arithmetic and requires more resources and time, while the accuracy loss caused by the reduced precision can be restricted to a small range. DianNao [7] quantizes data to 16-bit fixed point with an accuracy loss of less than 1% on classification networks, and Ristretto successfully quantizes CaffeNet and SqueezeNet to 8 bits in dynamic fixed-point format [8]. When we apply Ristretto to quantize object detection networks, however, the

Table 10.1 The resources consumption of inference

Model                 Computational cost (GFLOPs)   DRAM access (Mbyte)   Throughput (frame)
VGG16                 30                            585                   51
ResNet-50             7.7                           284                   10.5
MobileNet             1.1                           96                    32
SqueezeNet            1.7                           55                    4
VGG16+SSD             62                            324                   1.3
YOLO                  40                            1024                  2
YOLOv2 (416 * 416)    28                            468                   2.9
MobileNet-SSD         2.3                           99                    32

mAP declines greatly. We therefore make some improvements to dynamic fixed point to quantize MobileNet-SSD and obtain higher performance than Ristretto: we quantize the floating-point MobileNet-SSD to fixed point and limit the bit width of the data. Besides, we design an integer-only inference scheme on FPGA, which can truly reduce the cost of resources [9]. We also run our fixed-point design in HLS, an FPGA simulation tool, and obtain a report on the resource consumption. In order to work more efficiently, we further propose a quantization scheme based on FPGA and a method to simulate the FPGA on a PC, and we prove that the data on the two platforms are equal in every bit. Setting up deep neural networks on FPGA is difficult because of the completely different programming method [10], and the performance of a network cannot be known until it is set up on the FPGA, so simulating the FPGA on a PC is a very good way to solve this and really improves our working efficiency.

10.2 Quantization Arithmetic

In this section, we introduce our quantization arithmetic, dynamic fixed point, which was proposed as an improvement on plain fixed-point quantization. Early attempts quantized the float model to a single fixed-point format and could not obtain good results because of the large loss. Dynamic fixed point is a good way to handle the fact that different layers have significantly different dynamic ranges, so that every layer suffers the least precision loss. In this format, each number consists of three parts: a sign, an integer part, and a fractional part; the data format is shown in Fig. 10.1. We can represent this format in C++ using a struct, since a struct allows members of different types: a bool variable s represents the sign, two char variables bw and fl stand for the bit width and the length of the fractional part, respectively, and the real (un-quantized) value is represented by rd.

Fig. 10.1 The data format of dynamic fixed point (sign bit, integer part, fractional part)

Each number can be represented as

(-1)^s \times 2^{-fl} \times \sum_{i=0}^{bw-2} 2^i x_i   (10.1)

We can get the quantized data through (10.2), and the precision loss is less than 2^{-fl}:

\frac{\mathrm{round}(rd \times 2^{fl})}{2^{fl}}   (10.2)

We define round(x) as follows, where [x] denotes the largest integer not greater than x:

\mathrm{round}(x) = \begin{cases} [x] & \text{if } [x] \le x < [x] + 0.5 \\ [x] + 1 & \text{if } [x] + 0.5 \le x < [x] + 1 \end{cases}   (10.3)
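A minimal C sketch of this quantization step is given below. The struct mirrors the fields described above, the function names are ours, and round-half-up is realized with floor(x + 0.5) to match Eq. (10.3).

#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* Dynamic fixed-point value: bw total bits, fl fractional bits (cf. the struct in the text). */
typedef struct {
    bool   s;    /* sign flag                        */
    int8_t bw;   /* total bit width                  */
    int8_t fl;   /* length of the fractional part    */
    float  rd;   /* the real (un-quantized) value    */
} dfp_t;

/* Eqs. (10.2)-(10.3): scale by 2^fl, round half-up, rescale; precision loss < 2^-fl. */
static float dfp_quantize(float rd, int fl)
{
    float scale = (float)(1 << fl);
    return floorf(rd * scale + 0.5f) / scale;
}

/* Integer code used on the FPGA side, cf. Eq. (10.15): round(rd * 2^fl). */
static int16_t dfp_to_int16(float rd, int fl)
{
    return (int16_t)floorf(rd * (float)(1 << fl) + 0.5f);
}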

One remaining problem is how to determine the lengths of the fractional part and the integer part. Philipp Matthias Gysel uses (10.4) and obtains good performance on classification networks; in this equation, data represents a set of values, such as the inputs or weights of a layer:

IL = \lceil \log_2(\max\{data\} + 1) \rceil   (10.4)

Besides, we merge the Batch Normalization layers [11] into the neighboring convolution layers to make it convenient to deploy MobileNet-SSD. This is possible because the main function of Batch Normalization is to speed up the training process, and merging it has no bad effect on inference. We define μ as the mean of the input data, σ² as the variance of the input data, and ε as a small number that keeps the denominator from being zero. In addition, there are two trainable parameters γ and β; since what we quantize is a model that has already been trained to perform well, these two parameters can be treated as constants. We then calculate the intermediate variable α by (10.5):

\alpha = \frac{\gamma}{\sqrt{\sigma^2 + \varepsilon}}   (10.5)

Then we calculate the two new convolution-layer parameters Weight_new and bias_new through (10.6) and (10.7), where Weight and bias are the parameters before merging the Batch Normalization layers. The result is a MobileNet-SSD with no Batch Normalization layers:

Weight_{new} = Weight \times \alpha   (10.6)

bias_{new} = bias \times \alpha + (\beta - \mu \times \alpha)   (10.7)
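A minimal C sketch of this folding, applied per output channel as in Eqs. (10.5)-(10.7), is shown below; the per-channel weight layout and the function name are our assumptions.

#include <math.h>

/* Fold a Batch Normalization layer into the preceding convolution, Eqs. (10.5)-(10.7).
 * For each output channel c: alpha = gamma / sqrt(var + eps),
 * W_new = W * alpha, b_new = b * alpha + (beta - mean * alpha). */
void fold_batchnorm(float *weight, float *bias, int channels, int weights_per_channel,
                    const float *gamma, const float *beta,
                    const float *mean, const float *var, float eps)
{
    for (int c = 0; c < channels; ++c) {
        float alpha = gamma[c] / sqrtf(var[c] + eps);             /* Eq. (10.5) */
        for (int k = 0; k < weights_per_channel; ++k)
            weight[c * weights_per_channel + k] *= alpha;         /* Eq. (10.6) */
        bias[c] = bias[c] * alpha + (beta[c] - mean[c] * alpha);  /* Eq. (10.7) */
    }
}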



Fig. 10.2 The data path in our quantization scheme (16-bit input 1 and 16-bit weight → convolution → 32-bit result, added to a 32-bit bias → ReLU → input 2 of the next layer)

10.3 Quantization Scheme

In this section, we describe our quantization scheme and our improvements in detail. The method we use is dynamic fixed point. At first, we run several inference epochs to obtain the maximum of every layer's input and weight, respectively. Then we calculate the length of the integer part to make sure the data will not overflow. Ristretto uses (10.4) to obtain the length of the integer part, but this method does not perform well on object detection networks. We make a change based on (10.4), obtaining (10.8) and better performance:

IL = \lceil \log_2(\max\{data\} + 1) \rceil + 1   (10.8)
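A small C sketch of this choice of integer length, together with the fractional length that follows from a given bit width, is shown below; the assumption that one bit is reserved for the sign and the helper name are ours.

#include <math.h>

/* Eq. (10.8): integer length from the maximum absolute value seen in a layer
 * (one bit more than Eq. (10.4)); fractional length fl = bw - 1 - il,
 * assuming one bit is reserved for the sign. */
static void choose_format(float max_abs, int bw, int *il, int *fl)
{
    *il = (int)ceilf(log2f(max_abs + 1.0f)) + 1;
    *fl = bw - 1 - *il;
}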

After we obtain the format of every layer's input and weight, we replace the traditional convolution layer with our own convolution layer, so that the data are quantized into fixed point inside the layer. Although we still represent the data by float on the PC, they hold exactly the quantized values. To achieve equality between PC and FPGA, we quantize the input and weight before the convolution operation. The output of each layer does not need to be quantized separately, because it becomes the input of the next layer and is quantized there.
In general, the framework of our quantization scheme is shown in Fig. 10.2. Taking one layer as an example, the data path is as follows. At first, we quantize the input and weight into 16 bits based on the length of the integer part. Then we convolve the input and weight and use 32 bits to represent the results to make sure the data will not overflow; the format of the result depends on the input and weight, and the length of the result's fractional part is the sum of the input's and the weight's fractional parts. We also quantize the bias into a 32-bit integer with the same format as the result, because two fixed-point numbers must have the same fractional length so that their decimal points are aligned when they are added. The result is sent to ReLU, and finally the data are sent to the next layer.

10.4 Calculation Process Analysis

We propose an integer-only inference scheme on FPGA. The framework of the scheme on the PC and on the FPGA is the same, but the data types differ: the data on the FPGA are pure integers, while on the PC they are fixed-point values. The data path on the FPGA is as follows, again taking 16 bits as the example. The input1 can be represented as (10.9):

\mathrm{round}(input \times 2^{fl\_input1})   (10.9)

The weight and bias are handled in the same way, and the real data can be recovered through (10.10):

\frac{output}{2^{fl\_input1} \times 2^{fl\_weight}}   (10.10)

Then the results are sent to ReLU, and after that the data are sent to the next layer. We now prove that the data on the PC and on the FPGA are equal. Having built the model of the data path, we describe it mathematically and obtain the value of the data at every stage. In general, the data path on the PC is as follows, where data stands for the input and the weight.
 
data_{int16} = \frac{\mathrm{round}(data_{float32} \times 2^{fl\_data})}{2^{fl\_data}}   (10.11)

bias_{int32} = \frac{\mathrm{round}(bias_{float32} \times 2^{fl\_input + fl\_weight})}{2^{fl\_input + fl\_weight}}   (10.12)

output_{int32} = bias_{int32} + \mathrm{convolution}(input_{int16}, weight_{int16})   (10.13)

input2_{int16} = \frac{\mathrm{round}(\mathrm{ReLU}(output_{int32}) \times 2^{fl\_input2})}{2^{fl\_input2}}   (10.14)
The data path of our scheme on FPGA is as follows, where data again represents the input and the weight:

data'_{int16} = \mathrm{round}(data_{float32} \times 2^{fl'\_data})   (10.15)

bias'_{int32} = \mathrm{round}(bias_{float32} \times 2^{fl'\_input + fl'\_weight})   (10.16)

output'_{int32} = bias'_{int32} + \mathrm{convolution}(input'_{int16}, weight'_{int16})   (10.17)

input2'_{int16} = \mathrm{round}\left( \frac{\mathrm{ReLU}(output'_{int32}) \times 2^{fl\_input2}}{2^{fl'\_input + fl'\_weight}} \right)   (10.18)

Since the parameters for the lengths of the fractional parts are generated from the un-quantized data, these parameters on the FPGA and on the PC are the same, i.e., fl' = fl. The result can therefore be simplified to (10.19):

\frac{input2'_{int16}}{2^{fl\_input2}} = \frac{\mathrm{round}(\mathrm{ReLU}(output_{int32}) \times 2^{fl\_input2})}{2^{fl\_input2}}   (10.19)

and we obtain (10.20), which proves that the data on the FPGA and on the PC are completely equal:

\frac{input2'_{int16}}{2^{fl\_input2}} = input2_{int16}   (10.20)

10.5 Experiments

10.5.1 Evaluation of Quantization Scheme

In this part, we present the performance of our scheme and of Ristretto. The experiments are run on a PC with an i5-6200U CPU and no GPU. We evaluate the quantization scheme on VOC0712 with 1000 test images; the metric is mAP, the average of the maximum precisions at different recall values, and the number of classes in this dataset is 20. In addition, we merge the batch normalization layers into the convolutional layers. The results are shown in Table 10.2.
From the results, our scheme performs better than Ristretto and has nearly no loss when quantized to int16, and it also performs better than Ristretto when the data are quantized to int8. As a result, our quantization scheme can be applied to FPGA without loss, which contributes to the development of AI mobile devices.

10.5.2 Experiment on FPGA

We also conduct an experiment to verify our viewpoints by simulating in HLS, which enables implementing the accelerator in C++ and exporting the RTL as a Vivado IP core. The project of implementing the CNN network in HLS
Table 10.2 Comparison of two quantization schemes

Type      Ristretto (%)   Ours (%)
float32   68.06           68.06
int16     10.76           68.06
int8      0.10            26.8

Table 10.3 The resources consumption of floating-point inference

Name              Bram_18k   DSP    FF        LUT
Total             834        967    126,936   1,964,042
Available         1510       2020   554,800   277,400
Utilization (%)   55         47     22        70

Table 10.4 The resources consumption of integer-only inference

Name              Bram_18k   DSP    FF        LUT
Total             706        455    32,262    89,047
Available         1510       2020   554,800   277,400
Utilization (%)   46         22     5         32

was finished by our colleagues, who designed an FPGA-based CNN accelerator that greatly improves the performance of CNN networks on FPGA [12].
After simulating in HLS, we obtain a report on the cost of resources, and we compare the resource costs of our integer-only inference scheme and of the normal floating-point inference scheme in Tables 10.3 and 10.4 to show that the integer-only scheme really reduces the cost of resources significantly.
We find that our integer-only inference scheme costs much fewer FPGA resources than the floating-point inference scheme: the BRAM utilization declines by about 9 percentage points, the FF (flip-flop) usage drops to about a quarter of that of the floating-point scheme, and the DSP and LUT usage are reduced by about half.
We also analyzed the output data of the first convolution layer on the PC and on the FPGA and found that they are the same.

10.6 Conclusion and Future Work

In general, we propose an integer-only inference scheme for FPGA to implement MobileNet-SSD on resource-limited FPGAs, and we show through experiments that our scheme costs much fewer resources than floating-point inference. We also make some improvements to the dynamic fixed-point arithmetic for object detection networks and obtain higher performance than Ristretto: the loss of mAP becomes zero when the data are quantized to 16 bits, so the object detection network can run on FPGA without loss while greatly reducing the cost of resources. A method for simulating the FPGA on a PC is also introduced, with MobileNet-SSD as the example, so that we can more easily predict the performance of networks on FPGA.
In the future, we will apply the Kullback–Leibler divergence to 8-bit quantization and change our quantization arithmetic to achieve higher performance with 8 bits. We will also propose a new inference scheme that is even easier to implement on FPGA.

References

1. Lee, A.: Comparing Deep Neural Networks and Traditional Vision Algorithms in Mobile
Robotics. Swarthmore University (2015)
2. Chen, X., Peng, X., Li, J.-B., Peng, Yu.: Overview of deep kernel learning based techniques
and applications. J. Netw. Intell. 1(3), 83–98 (2016)
3. Howard, A.G., Zhu, M., Chen, B., et al.: Mobilenets: efficient convolutional neural networks
for mobile vision applications (2017). arXiv:1704.04861
4. Iandola, F.N., Han, S., Moskewicz, M.W., et al.: Squeezenet: Alexnet-level accuracy with 50x
fewer parameters and <0.5 mb model size (2016). arXiv:1602.07360
5. Yin, P., Zhang, S., Xin, J., et al.: Training ternary neural networks with exact proximal operator
(2016). arXiv:1612.06052
6. Rastegari, M., Ordonez, V., Redmon, J., et al.: Xnor-net: Imagenet classification using binary
convolutional neural networks. In: European Conference on Computer Vision, pp. 525–542.
Springer, Cham (2016)
7. Chen, Y., Du, Z., Sun, N., Wang, J., et.al.: Diannao: a small-footprint high-throughput acceler-
ator for ubiquitous machine-learning. In: ASPLOS, vol. 49, no. 4. ACM, pp. 269–284 (2014)
8. Kuang, F.-J., Zhang, S.-Y.: A novel network intrusion detection based on support vector machine
and tent chaos artificial bee colony algorithm. J. Netw. Intell. 2(2), 195–204 (2017)
9. Fan, C., Ding, Q.: ARM-embedded implementation of H.264 selective encryption based on
chaotic stream cipher. J. Netw. Intell. 3(1), 9–15 (2018)
10. Gysel, P.: Ristretto: hardware-oriented approximation of convolutional neural networks (2016).
arXiv:1605.06402
11. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing
internal covariate shift (2015). arXiv:1502.03167
12. Liu, B., Zou, D., Feng, L., Feng, S., Fu, P., Li, J.: An FPGA-based CNN accelerator integrating
depthwise separable convolution. Electronics 8, 281 (2019)
Chapter 11
A High-Efficient Infrared Mosaic
Algorithm Based on GMS

Xia Pei, Baolong Guo, Geng Wang and Zhe Huang

Abstract The most important part of infrared image mosaic technology is image registration. In order to meet the real-time requirements of the battlefield, the ORB algorithm is used in this paper for feature extraction. In order to obtain high-quality feature matching points, this paper proposes a new IGMS algorithm on the basis of the GMS algorithm; the experimental results show that the correct point matching rate is increased by 8%. RANSAC is then used to obtain the transformation matrix between the two images, and finally a fade-in and fade-out fusion algorithm is used to obtain a complete wide-field military reconnaissance map.

Keywords Feature matching · Image mosaic · IGMS

11.1 Introduction

As a hot branch of infrared thermal imaging technology, infrared image stitching is a way to overcome a narrow field of view, and it has developed continuously in recent years. This paper studies infrared image mosaic technology for military reconnaissance. Infrared image mosaic combines images acquired from infrared devices (possibly at different times, from different angles, and from different sensors) into a large, high-resolution image. This paper mainly introduces the key technologies of infrared image mosaic and improves on the GMS matching algorithm. The structure of the paper is as follows:
(1) In the image preprocessing stage, the process of infrared imaging will be affected
by external noise. In order to obtain a complete and accurate mosaic image, this
paper uses the median filtering method for image denoising.

X. Pei · B. Guo (B) · G. Wang · Z. Huang


School of Aerospace Science and Technology, Xidian University, Xi’an 710071, Shaanxi,
People’s Republic of China
e-mail: blguo@xidian.edu.cn
X. Pei
e-mail: xpei_email@163.com

(2) In the feature extraction stage, in order to meet the requirements of real-time
matching, this paper uses the ORB algorithm with fast calculation speed. The
algorithm not only has rotation invariance and scale invariance, but the main
thing is that the algorithm can resist noise interference.
(3) In the image registration stage, based on the unidirectional matching principle
in GMS, a bilateral search algorithm is proposed to achieve high-quality feature
matching. Experimental data shows that the accuracy of the matching algorithm
is increased by 8%.
(4) In the image fusion stage, for images with large overlapping regions, a fade-in and fade-out fusion algorithm is used to eliminate the stitching seam caused by uneven illumination during image registration.

11.2 Related Works

The development of image mosaic technology dates back decades. In 1996, Richard Szeliski proposed a new panoramic mosaic model with eight parameters based on the L-M (iterative nonlinear minimization) method [1]. The biggest advantage of this algorithm is that it converges quickly, and, most importantly, the accuracy of the calculation is very high.
of the calculation is very high. In 2000, Shmuel Peleg obtained inspiration based on
the movement of the camera, proposed an image mosaic algorithm that adaptively
selects the motion model [2]. In 2004, M. Brown and D. Glowe [3] proposed an
SIFT (Scale-Invariant Feature Transform) feature point detection algorithm based on
multi-resolution image mosaic. The feature points have scale invariance and rotation
invariance. In the later experimental results, SIFT achieved a good stitching effect.
In 2006, Bay [4] based on the principle of SIFT operator to reduce dimensionality,
proposed a new SURF feature, the calculation time is twice as fast as the SIFT cal-
culation time. In 2008, BRIEF grew out of research that uses binary tests to train a
set of classification trees [5]. He used a probabilistic model to implement the func-
tion of automatic sorting. In 2009, Rittavee Matungka proposed the APT (Adaptive
Polar Transform) algorithm [6], which can solve the problem of unevenness in the
transformation of polar coordinates. In 2010, Jungpli Shin [7] used energy spectrum
technology, which used this technology to eliminate the ghosts that appear after
image fusion, making the image gap transition smoother. In 2011, Ethan Rublee
[8] proposed an ORB algorithm suitable for real-time performance. The feature
extraction of the algorithm is extracted by FASF algorithm, and then the feature is
described by using the BRIEF algorithm. It can maintain its rotation invariance and
scale invariance. The biggest contribution is that the overall calculation speed time
is 100 times faster than the SIFT. Meanwhile, it is also 10 times faster than SURF.
In 2013, Yigisoy M. and Navab N. established a new mosaic method based on the
idea of structural communication. The core of the algorithm is to divide the overall
image into regions, so that many subregions are obtained. Then it tries to search for
valid information in each subarea, followed by mutual propagation in the subareas.
Therefore, with this method, a structural overall mapping can be formed between
the two images [9]. In 2017, Jiawang Bian proposed a simple encapsulation of a statistical feature matching method. The algorithm can quickly distinguish correct matches from false matches and greatly improves the stability of feature point matching. Experimental data show that the GMS algorithm offers strong real-time performance and high robustness [10].

11.3 Improved GMS Algorithm

11.3.1 ORB Feature

The ORB [11] algorithm was proposed by Ethan Rublee in 2011. It combines the FAST feature point detector with the BRIEF descriptor.
FAST [12] mainly considers the gray-level change around a pixel: if the difference between the gray values of the pixels in the neighborhood and the point under test is sufficiently large, the candidate point is taken as a feature point. Although the FAST algorithm is fast, it only compares gray values. To achieve scale and rotation invariance, the ORB algorithm constructs a Gaussian pyramid to obtain the scale feature and uses the intensity centroid method [13], in which the main orientation is determined by the offset between the corner point and the centroid of its gray-level patch.
BRIEF [14] describes the feature point with a binary code string, so that the speed is greatly improved. In order to make the BRIEF descriptor rotation invariant, the coordinate system that ORB establishes when computing the BRIEF descriptor is centered on the feature point, and the 2D [15] coordinate system takes the line connecting the feature point and the centroid of the point region as the X-axis [16]. Thus, regardless of how the image is rotated, the coordinate system of the ORB sampling point pairs is fixed [17]. At different rotation angles, the points taken out under the same sampling pattern are consistent [18]. This solves the problem of rotation consistency.
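As a brief illustration of this feature extraction step, the sketch below detects ORB keypoints and descriptors with OpenCV's Python bindings; the image names and the feature count are illustrative assumptions, and the paper's own implementation uses OpenCV 2.4.13 with Visual Studio (C++) rather than Python.

```python
import cv2

# Two denoised infrared frames to be stitched (placeholder file names).
img_a = cv2.imread("ir_a.png", cv2.IMREAD_GRAYSCALE)
img_b = cv2.imread("ir_b.png", cv2.IMREAD_GRAYSCALE)

# ORB = FAST keypoints (oriented via the intensity centroid) + rotated BRIEF descriptors.
# nfeatures is an assumed value; grid-based match filtering benefits from many keypoints.
orb = cv2.ORB_create(nfeatures=5000)

kp_a, des_a = orb.detectAndCompute(img_a, None)
kp_b, des_b = orb.detectAndCompute(img_b, None)

print(len(kp_a), "keypoints in image A,", len(kp_b), "keypoints in image B")
```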

11.3.2 GMS Algorithm

GMS (Grid-based Motion Statistics) [10] is a simple statistical method that encapsulates motion smoothness as the number of matches within a region. The algorithm can quickly and reliably distinguish true matches from false matches, so that a large number of feature matches can be converted into high-quality feature matches. The flow of the algorithm is as follows: first, perform detection and descriptor computation for any kind of feature point; here the ORB feature is used for extraction and description. Then the matching is performed
on the BRIEF descriptors, and finally GMS is used to eliminate the false matches.

11.3.3 Improved GMS Feature Matching

Although the GMS algorithm has good robustness and efficiency, its correct matching rate is still insufficient. Experiments show that when the overlapping area of the two pictures is very large, two or more points in Ia may match the same point in Ib, and such matches cannot be eliminated by GMS. We therefore use the idea of bilateral matching to obtain better matching points, and we call the improved algorithm IGMS. As shown in Fig. 11.1, we first select a point A in Ia, then find its matching point B in Ib, and then find the matching point A′ corresponding to B in Ia. If A and A′ are the same point, A and B are considered a true matching pair. The calculation formula is

A → B (A ∈ Ia),  B → A′ (B ∈ Ib);  if A′ = A, then A ⇔ B    (11.1)
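A minimal sketch of this bilateral (cross-check) idea on top of ORB descriptors is given below; it uses OpenCV's brute-force Hamming matcher, and the descriptor variables are assumed to come from the ORB step above. It illustrates only the consistency test of Eq. (11.1), not the full grid-based statistics of GMS.

```python
import cv2

def bilateral_matches(des_a, des_b):
    """Keep only matches that agree in both directions (A -> B and B -> A')."""
    bf = cv2.BFMatcher(cv2.NORM_HAMMING)
    a_to_b = bf.match(des_a, des_b)   # best match in Ib for each point of Ia
    b_to_a = bf.match(des_b, des_a)   # best match in Ia for each point of Ib
    # back[j] = index in Ia that point j of Ib maps to
    back = {m.queryIdx: m.trainIdx for m in b_to_a}
    # A (queryIdx) -> B (trainIdx) is kept only if B maps back to the same A.
    return [m for m in a_to_b if back.get(m.trainIdx) == m.queryIdx]

good = bilateral_matches(des_a, des_b)
print(len(good), "bidirectionally consistent matches")
```

OpenCV's BFMatcher also offers a built-in crossCheck option that performs the same test; the explicit form above is kept to mirror Eq. (11.1).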

11.4 Fade-in and Fade-out Infrared Image Fusion Algorithm

The fade-in and fade-out method determines each pixel weight in the overlap region according to the distance of the pixel from the boundary of the overlap region. Its calculation formula is as follows:

f(x, y) = f1(x, y),                         (x, y) ∈ R1 and (x, y) ∉ R2
          d1 f1(x, y) + d2 f2(x, y),        (x, y) ∈ R1 and (x, y) ∈ R2        (11.2)
          f2(x, y),                         (x, y) ∈ R2 and (x, y) ∉ R1
 
where d1 = (M − width)/M and d2 = width/M, which satisfy 0 < d1, d2 < 1 and d1 + d2 = 1. As shown in Fig. 11.2, width represents the width of the overlapping pixels of the two images, M represents the overlapping pixel area, and W1 and W2 are the widths of the two images.

Fig. 11.1 Bilateral search principle

Fig. 11.2 Weight value change graph (weights d1 and d2 across the overlap region of the two images of widths W1 and W2)
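A small sketch of this weighted blending, assuming two already-registered grayscale images of equal size whose overlap spans a known column range, is shown below; the column bounds are hypothetical parameters, not values from the paper.

```python
import numpy as np

def fade_blend(img1, img2, x_left, x_right):
    """Blend two registered images over the overlap columns [x_left, x_right)."""
    out = img1.astype(np.float32).copy()
    overlap_width = x_right - x_left
    for x in range(x_left, x_right):
        d2 = (x - x_left) / overlap_width   # weight of the right image grows across the overlap
        d1 = 1.0 - d2                        # weight of the left image fades out, d1 + d2 = 1
        out[:, x] = d1 * img1[:, x] + d2 * img2[:, x]
    out[:, x_right:] = img2[:, x_right:]     # right of the overlap: take the second image
    return out.astype(np.uint8)
```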

11.5 Experiment Results and Analysis

For the feature matching of infrared images, the following three experiments mainly compare the ORB+RANSAC, ORB+GMS, and ORB+IGMS matching algorithms in terms of matching accuracy, RMSE, and matching speed. The experiments were performed on a computer with an Intel Core i7 CPU at 3.60 GHz, 8.00 GB of memory, and the Windows 7 operating system. The algorithm is implemented in OpenCV 2.4.13 with Visual Studio 2017.

11.5.1 Results of Image Matching

The following are the matching results for the three sets of infrared images collected in a garden; the image size is 720 × 576.
There are three evaluation indicators for image registration: CMR, RMSE, and matching speed.
(1) CMR (Correct Matching Rate) is the ratio of the number of correct matching points to the total number of matching points. The larger the value of CMR, the more accurate the matching. The calculation expression for CMR is

CMR = N_C / N    (11.3)

Table 11.1 Comparison table of three groups of experiments


Image Algorithm False match True match CMR
A ORB+RANSAC 462 349 75.54
ORB+GMS 437 380 86.95
IGMS 360 345 95.83
B ORB+RANSAC 470 357 75.95
ORB+GMS 462 411 88.96
IGMS 343 328 95.62
C ORB+RANSAC 452 343 75.88
ORB+GMS 442 392 88.68
IGMS 306 295 96.40

Fig. 11.3 ORB+RANSAC matching

(2) RMSE (Root Mean Square Error) is a measure of the deviation between the observed and true values: the squared deviations between the observed and true values are averaged and the square root is taken. The larger its value, the greater the difference between the reference image and the registered image. The calculation expression for RMSE is:

E_rmse(f) = sqrt( (1/N′) Σ_{i=1}^{N′} ‖(x_i, y_i) − f(x′_i, y′_i)‖² )    (11.4)

where N′ represents the number of exact matching pairs, (x_i, y_i) represents the coordinates of the feature points in the image to be registered, and (x′_i, y′_i) represents the coordinates of the corresponding feature points in the reference image.
(3) Matching speed: the matching speed is measured by the running time of the algorithm; the longer the running time, the slower the matching.
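For completeness, a small sketch of how these two evaluation indicators could be computed is given below, assuming matched point coordinates and an estimated registration transform are already available; the function and variable names are illustrative only, not part of the paper's implementation.

```python
import numpy as np

def cmr(num_correct, num_total):
    """Correct matching rate of Eq. (11.3)."""
    return num_correct / num_total

def rmse(points_sensed, points_reference, transform):
    """RMSE of Eq. (11.4); transform maps reference coordinates into the sensed image."""
    mapped = np.array([transform(p) for p in points_reference])
    return np.sqrt(np.mean(np.sum((points_sensed - mapped) ** 2, axis=1)))
```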
As shown in Table 11.1, the registration ratios of the three groups of experiments are compared. From the data, the number of false matching points remaining after purification with the RANSAC algorithm is still relatively large, as shown in Fig. 11.3. The ORB+GMS registration rate is relatively high, as shown in Fig. 11.4, but IGMS has the highest registration accuracy after bilateral matching. Experiments show that the accuracy of IGMS is 8% higher than that of GMS (Fig. 11.5).

Fig. 11.4 ORB+GMS matching

Fig. 11.5 ORB+IGMS matching

Table 11.2 Match rate and root mean square error table
Image  Algorithm    RMSE   Matching speed/s
A      ORB+RANSAC   1.071  0.16931
       ORB+GMS      1.003  0.0704
       IGMS         0.679  0.0333
B      ORB+RANSAC   1.365  0.1559
       ORB+GMS      1.264  0.0503
       IGMS         0.726  0.0345
C      ORB+RANSAC   0.951  0.1687
       ORB+GMS      0.841  0.0661
       IGMS         0.545  0.0419

As shown in Table 11.2, from the perspective of registration accuracy, the smaller the RMSE value, the higher the registration accuracy, so the registration accuracy of IGMS is the best. In terms of matching speed, the matching time of IGMS is the shortest, so it is more suitable for real-time matching requirements. In summary, IGMS is more suitable for matching infrared images in terms of registration accuracy, registration rate, and real-time performance.

11.5.2 Results of Image Stitching

The complete mosaic results of the three sets of images are as follows. The experiment uses the IGMS registration method and the fade-in and fade-out fusion algorithm for image mosaic. In the stitching results, the image seams

Fig. 11.6 The results of stitching

are smoother, and the texture and edge details of the image remain very complete,
as shown in Fig. 11.6.

11.6 Conclusion

The IGMS matching algorithm in this paper is well suited to the stitching of infrared images. Experimental data show that IGMS eliminates mismatched pairs better than the RANSAC algorithm, and the accuracy of IGMS matching is increased by 8% compared to GMS. Finally, the fade-in and fade-out fusion algorithm is used to make the stitching gap of the image transition more smoothly and naturally, so the algorithm achieves very good results in the mosaic of infrared images.

References

1. Shin, J., Tang, Y.: Deghosting for image stitching with automatic content-awareness. Pattern
Recognit. 23(26), 26–27
2. Zheng, W., Zhengchao, C., Bing, Z., et al.: CBERS-1 digital images mosaic and mapping of
China. 11(6), 787–791
3. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis.
60(2), 91–110
4. Bay, H., Tuytelaars, T., Van Gool, L.: Surf: speeded up robust features. In: European Conference
on Computer Vision, May 2006
5. Calonder, M., Lepetit, V., Fua, P.: Keypoint signatures for fast learning and recognition. In:
European Conference on Computer Vision (2008)
6. Matungka, R., Zheng, Y.F.: Image registration using adaptive polar transform. IEEE Trans.
Image Process. 18(10), 2340–2354 (2009)
7. Schmid, C., Mohr, R.: Local grayvalue invariants for image retrieval. IEEE Trans. Pattern Anal.
Mach. Intell. 19(5), 530–534 (1997)
8. Rublee, E., Rabaud, V., Konolige, K., et al.: ORB: An efficient alternative to SIFT or SURF
(2011)
9. Yigitsoy, M., Navab, N.: Structure propagation for image registration. IEEE Trans. Med. Imag-
ing 32(9), 1657–1670 (2013)
10. Bian, J., Lin, W.-Y., Matsushita, Y., Yeung, S.-K., Nguyen, T.D., Cheng, M.-M.: GMS: grid-
based motion statistics for fast, ultra-robust feature correspondence. In: IEEE CVPR (2017)

11. Rublee, E., Rabaud, V., Konolige, K.: ORB: an efficient alternative to SIFT or SURF. IEEE
Int. Conf. Comput. Vis. 58(11), 2564–2571 (2011)
12. Rosten, E., Porter, R., Drummond, T.: Faster and better: a machine learning approach to corner
detection. IEEE Trans. Pattern Anal. Mach. Intell. 32, 105–119 (2010)
13. Rosin, P.L.: Measuring corner properties. Comput. Vis. Image Underst. 73(2), 291–307 (1999)
14. Calonder, C., Lepetit, V., Strecha, C.: BRIEF: binary robust independent elementary features.
In: European Conference on Computer Vision, pp. 778–792 (2010)
15. Chen, W.-K., Chen, H.-P., Tso, H.-K.: A friendly and verifiable image sharing method. J. Netw.
Intell. 1(1), 46–51 (2016)
16. Shen, W., Hao, S., Qian, J., Li, L.: Blind quality assessment of dehazed images by analyzing
information, contrast, and luminance. J. Netw. Intell. 2(1), 139–146 (2017)
17. Hong, S., Wang, A., Zhang, X., Gui, Z.: Low-dose CT image processing using artifact sup-
pressed total generalized variation. J. Netw. Intell. 3(1), 26–49 (2018)
18. Harold, C., Nelta, N.: Blind images quality assessment of distorted screen content images. J.
Netw. Intell. 3(2), 91–101 (2018)
Chapter 12
A Load Economic Dispatch Based on Ion
Motion Optimization Algorithm

Trong-The Nguyen, Mei-Jin Wang, Jeng-Shyang Pan, Thi-kien Dao


and Truong-Giang Ngo

Abstract This paper presents a new approach for dispatching the generating powers of thermal plants based on the ion motion optimization algorithm (IMA). Electrical power systems are constrained by power balance, transmission loss, and generating capacity. The scheduling of power generating units for stabilizing the different dynamic responses of the controlled power system is mathematically modeled as the objective function. The economic load dispatch (ELD) objective function is then optimized by applying IMA. In the experimental section, several cases with different numbers of thermal units are used to test the performance of the proposed approach. The preliminary results, compared with other methods in the literature, show that the proposed approach offers better performance.

Keywords Ion motion optimization · Electric power generating plant outputs · Economic load dispatch

12.1 Introduction

Recently, the share of renewable energy sources in electrical power systems has increased rapidly [1]. Fast load changes and fluctuations in the power grid require a stable balance. Economic load dispatch (ELD) [2] refers to a scheduling method that rationally allocates the productive output of each generating unit to meet the constraints of power system

T.-T. Nguyen · M.-J. Wang · J.-S. Pan (B)


Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of
Technology, Fuzhou 350118, Fujian, China
e-mail: jengshyangpan@gmail.com
T.-T. Nguyen
e-mail: vnthe@hpu.edu.vn
M.-J. Wang
e-mail: meijinwang0608@gmail.com
T.-T. Nguyen · T. Dao · T.-G. Ngo
Department of Information Technology, Haiphong Private University, Haiphong, Vietnam

operation [3]. A common single-objective formulation is to minimize the power generation cost and to maximize the power generation efficiency. Economic scheduling problems are nonlinear constrained optimization problems, and classical solvers depend heavily on initial values and gradient information.
Traditional methods such as the Lagrangian method, linear programming [4, 5], and interior penalty functions [6] are not well suited to solving nonlinear economic scheduling problems. In response to these shortcomings, metaheuristic algorithms have recently been applied to solve power system optimization problems successfully [7, 8, 9]. Algorithms such as particle swarm optimization (PSO) [10], the genetic algorithm (GA) [11], simulated annealing (SA) [12], and neural networks (ANN) [13] belong to this class. These methods have proven to be very effective in solving nonlinear and constrained problems [14]. The ion motion optimization algorithm (IMA) [15] is a recent metaheuristic inspired by the essential attraction and repulsion between anions and cations. IMA's structure is simple but effective, and it is easy to understand and to program.
This paper applies IMA to the optimization of the economic load scheduling problem of the power system.

12.2 Related Work

12.2.1 Economic Load Dispatch

Economic load dispatch (ELD) is an economic scheduling problem that minimizes the total power generation cost under the operating constraints of the power system. It is a critical mathematical optimization problem in power systems and is of great significance for the economic and reliable operation of the power system [2]. The main objective of economic loading of generators is to minimize the production cost while meeting the power demand (PD), subject to constraints on generator output limits, system losses, ramp rate limits, and prohibited operating zones.
The optimal dispatch is formulated as minimizing cost:

F = Σ_{i=1}^{m} f_i(P_i)    (12.1)

According to the analysis, the cost function of the ith generating unit, f_i(P_i), is a quadratic polynomial described as:

f_i(P_i) = a_i + b_i P_i + c_i P_i²  ($/h)    (12.2)

The following variables and notations are used: P_D is the power load demand; P_i is the active power delivered; P_i^min is the minimum real power output; P_i^max is the maximum
power output; P_i^0 is the previous real power output; a_i, b_i, and c_i are the fuel cost coefficients; e_i and f_i are the valve-point-effect coefficients of the power grid system; m is the number of committed units; P_Loss denotes the transmission loss; B_ij, B_0i, and B_00 are the B-matrix coefficients for transmission power loss; U_ri and D_ri are the up and down ramp rate limits; P_{i,k}^{pz,L} and P_{i,k}^{pz,U} are the lower and upper limits of the kth prohibited zone of the ith generating unit; I_max is the maximum number of iterations; and I is the current iteration.
According to the equality constraint, the total power generation equals the load demand P_D plus the total loss:

Σ_{i=1}^{m} P_i = P_D + P_Loss    (12.3)

Using the transmission loss coefficients, the total loss can be written as

P_Loss = Σ_{i=1}^{m} Σ_{j=1}^{n} P_i B_ij P_j + Σ_{i=1}^{m} B_0i P_i + B_00    (12.4)
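As a small illustration of Eqs. (12.1)–(12.4), the sketch below evaluates the quadratic fuel cost and the B-matrix transmission loss for a candidate dispatch; all array names are illustrative, and the coefficient values would come from tables such as Table 12.1.

```python
import numpy as np

def fuel_cost(P, a, b, c):
    """Total generation cost of Eqs. (12.1)-(12.2): F = sum(a_i + b_i*P_i + c_i*P_i^2)."""
    return np.sum(a + b * P + c * P ** 2)

def transmission_loss(P, B, B0, B00):
    """B-matrix loss of Eq. (12.4): P_loss = P^T B P + B0^T P + B00."""
    return P @ B @ P + B0 @ P + B00

def power_balance_residual(P, PD, B, B0, B00):
    """Residual of the equality constraint (12.3): sum(P) - (PD + P_loss)."""
    return np.sum(P) - (PD + transmission_loss(P, B, B0, B00))
```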

The output of each generating unit must also satisfy the inequality constraint:

Pimin ≤ Pi ≤ Pimax (12.5)

It is desired to control the generation power of each committed online generator smoothly and within the generator limits, but the ramp rate limit restricts how much the generation can change between two operating periods. The ramp rate limit of the ith generation unit is

Max(P_i^min, P_i^0 − D_ri) ≤ P_i ≤ Min(P_i^max, P_i^0 + U_ri)    (12.6)

• (a) if generation increases,

Pi − Pi0 ≤ Uri (12.7)

• (b) if generation decreases,

Pi0 − Pi ≤ Dri (12.8)

The input–output characteristic of a generator varies mainly when its steam turbine has several admission valves. The ripples produced in the heat rate curve are the primary cause of the valve-point effect, and they cannot be expressed by a polynomial function.

12.2.2 Ion Motion Optimization Algorithm

The ion motion optimization algorithm (IMA) [15] simulates the motion of ions organized into an anion (negative charge) set and a cation (positive charge) set. These two sets serve as the candidate solutions during the optimization process. They follow different evolutionary strategies in the liquid phase and the solid phase and circulate between the two phases in order to optimize the ions. Ions in IMA move toward the best ion of the opposite charge: anions move toward the best cation, while cations move toward the best anion. This movement guarantees the improvement of all ions throughout the iterations. The strength of the movement depends on the attraction/repulsion forces between the ions, and the amount of this force specifies the momentum of each ion. The following steps describe the process of the algorithm.
Initialization
An initial random population is randomly generated according to a uniform distri-
bution within the lower and upper boundaries with D dimensions.
Liquid phase
In the liquid phase, the anion group (A) and the cation group (C) updated according
to the following patterns, respectively.
 
A_{i,j} = A_{i,j} + AF_{i,j} × (Cbest_j − A_j)    (12.9)

C_{i,j} = C_{i,j} + CF_{i,j} × (Abest_j − C_j)    (12.10)

where Cbest and Abest are the best cation and the best anion, respectively. The subscript i = 1, 2, 3, …, NP/2 (NP/2 is the size of each ion population), and j = 1, 2, 3, …, D. For a minimization problem, the optimal anion and cation are the anion and cation with the lowest fitness values in the entire anion group and cation group, respectively.
The resultant attraction forces AF_{i,j} and CF_{i,j} are mathematically modeled as follows:

AF_{i,j} = 1 / (1 + e^{−0.1/AD_{i,j}})    (12.11)

CF_{i,j} = 1 / (1 + e^{−0.1/CD_{i,j}})    (12.12)

where AD_{i,j} and CD_{i,j} are the distances of the ith anion from the best cation and of the ith cation from the best anion in dimension j, respectively: AD_{i,j} = |A_{i,j} − Cbest_j| and CD_{i,j} = |C_{i,j} − Abest_j|.
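A compact sketch of this liquid-phase update, Eqs. (12.9)–(12.12), is given below; anions and cations are stored as NumPy arrays, the variable names are illustrative, and a small epsilon guard (not in the paper) avoids division by zero when an ion coincides with the best opposite ion.

```python
import numpy as np

def liquid_phase(anions, cations, abest, cbest):
    """One liquid-phase update: each ion moves toward the best ion of the opposite charge."""
    AD = np.abs(anions - cbest)            # distance of each anion to the best cation
    CD = np.abs(cations - abest)           # distance of each cation to the best anion
    AF = 1.0 / (1.0 + np.exp(-0.1 / np.maximum(AD, 1e-12)))   # attraction force, Eq. (12.11)
    CF = 1.0 / (1.0 + np.exp(-0.1 / np.maximum(CD, 1e-12)))   # attraction force, Eq. (12.12)
    anions = anions + AF * (cbest - anions)    # Eq. (12.9)
    cations = cations + CF * (abest - cations) # Eq. (12.10)
    return anions, cations
```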

Solid phase
As the iterations proceed, the ions gradually gather near the optimal ion under the attraction force. The solid phase is introduced to break this excessive concentration and to provide diversity, preventing the algorithm from falling into a local optimum when the ions over-concentrate. Like the physical process, the ion motion gradually slows down from the initial intense motion as the iterations proceed, and the liquid-state ions recrystallize into crystals. This recrystallization process, as simulated in IMA, is known as the solid phase.

A_j = A_j + ϕ1 × (Cbest − 1), if rand > 0.5;  A_j = A_j + ϕ1 × Cbest, otherwise    (12.13)

C_j = C_j + ϕ2 × (Abest − 1), if rand > 0.5;  C_j = C_j + ϕ2 × Abest, otherwise    (12.14)

Termination condition
After the solid-phase evolution strategy is completed, the algorithm checks whether its termination conditions are reached. The termination conditions include the preset accuracy, the number of iterations, and so on. If a condition is reached, the optimal ion is output directly; otherwise, the anions and cations return from the solid phase to the liquid phase and continue to be iterated. In this process, anions and cations circulate between the liquid and solid phases, and the optimal solution is gradually obtained with the iterations.

12.3 Scheduling Load Power Optimization Based on IMA

The search space of the ELD problem contains both feasible and infeasible points; the main task is to identify feasible points that produce near-optimum results within the boundary framework. Feasible points satisfy all constraints, while infeasible points violate at least one of them. As mentioned in the above section, the power system economic scheduling problem has multiple constraints, such as power balance constraints, operational constraints, ramp limits, and prohibited operating zones. These constraints make the feasible domain of the problem very complicated [3].
Therefore, the solution, or set of optimized points, must be feasible, i.e., the points must satisfy all constraints. It is thus essential to design a suitable objective function, which determines the success of the optimization; the objective function is characterized by the given execution conditions and constraints [3].
To handle the constraints, penalty functions are used to deal with infeasible points. The constrained problem is converted into an unconstrained one in the search space by modifying the objective function in Eq. (12.1). The modified function is as follows:

Min f = f(P_i), if P_i ∈ F;  f(P_i) + penalty(P_i), otherwise    (12.15)

Fig. 12.1 Flowchart of the proposed IMA for dispatching power generation (ELD): the dispatch space is modeled, ions are mapped onto the ELD model, the objectives are calculated, feasible points are searched, the local and global bests are updated, and the loop repeats until the maximum iteration count is reached and the global best is output

where F is the feasible region of the dispatch. To deal with the prohibited-zone constraints, a binary variable is added to the objective formula as follows:

V_j = 1, if P_j violates the prohibited zones;  V_j = 0, otherwise    (12.16)

The distance to the nearest point in the feasible region measures the effort needed to refine the solution:

Min f = Σ_{i=1}^{n} F_i(P_i) + q_1 (Σ_{i=1}^{n} P_i − P_L − P_D)² + q_2 Σ_{j=1}^{n} V_j    (12.17)
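A sketch of this penalized objective is given below; q1 = 1000 and q2 = 1 follow the values quoted later in the paper, while the prohibited-zone list is an illustrative data structure rather than the paper's own representation.

```python
import numpy as np

def violates_prohibited_zone(P_i, zones):
    """zones: list of (lower, upper) prohibited intervals for one unit, cf. Eq. (12.16)."""
    return any(lo < P_i < hi for lo, hi in zones)

def penalized_objective(P, a, b, c, B, B0, B00, PD, zones_per_unit, q1=1000.0, q2=1.0):
    """Penalized ELD objective of Eq. (12.17)."""
    cost = np.sum(a + b * P + c * P ** 2)          # quadratic fuel cost, Eq. (12.2)
    loss = P @ B @ P + B0 @ P + B00                # B-matrix transmission loss, Eq. (12.4)
    balance = np.sum(P) - loss - PD                # power-balance residual, Eq. (12.3)
    V = sum(violates_prohibited_zone(p, z) for p, z in zip(P, zones_per_unit))
    return cost + q1 * balance ** 2 + q2 * V
```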

Fig. 12.2 Comparison of the proposed IMA for dispatching load scheduling generators with the FA, GA, and PSO approaches under the same conditions (mean fitness value, ×10⁴, over 200 iterations for the six-unit case study)

The penalty factors and constants associated with the power balance are tuned empirically; in the simulation section, the value 1000 is set for q1 and the value 1 for q2.
The necessary steps of the IMA optimization for scheduling the power generation dispatch are:
Step 1. Initialize the IMA population associated with the modeled dispatch space.
Step 2. Update the anion group (A) and the cation group (C) according to Eqs. (12.9) and (12.10), respectively.
Step 3. Evaluate the ions with the fitness function of Eq. (12.17), identify the current nearest feasible solutions, and update their positions in the feasible archive.
Step 4. If the termination condition (e.g., the maximum number of iterations) is not met, go to Step 2; otherwise, terminate the process and output the result (Fig. 12.1).
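A condensed sketch of this loop is shown below; it keeps only the liquid-phase moves for brevity (the solid phase is omitted), and the population size, seed, and iteration count are illustrative choices, not the paper's settings.

```python
import numpy as np

def ima_eld(fitness, pmin, pmax, n_half=20, iters=200, seed=0):
    """Minimal IMA-style loop for ELD (liquid phase only; bounds of Eq. (12.5) enforced by clipping)."""
    rng = np.random.default_rng(seed)
    dim = len(pmin)
    anions = rng.uniform(pmin, pmax, size=(n_half, dim))
    cations = rng.uniform(pmin, pmax, size=(n_half, dim))
    for _ in range(iters):
        fa = np.apply_along_axis(fitness, 1, anions)
        fc = np.apply_along_axis(fitness, 1, cations)
        abest, cbest = anions[fa.argmin()], cations[fc.argmin()]
        AF = 1 / (1 + np.exp(-0.1 / np.maximum(np.abs(anions - cbest), 1e-12)))   # Eq. (12.11)
        CF = 1 / (1 + np.exp(-0.1 / np.maximum(np.abs(cations - abest), 1e-12)))  # Eq. (12.12)
        anions = np.clip(anions + AF * (cbest - anions), pmin, pmax)              # Eq. (12.9)
        cations = np.clip(cations + CF * (abest - cations), pmin, pmax)           # Eq. (12.10)
    pool = np.vstack([anions, cations])
    return pool[np.apply_along_axis(fitness, 1, pool).argmin()]
```

In practice the fitness argument would be the penalized objective of Eq. (12.17), e.g. a closure over the coefficients of Table 12.1.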

12.4 Experimental Results

To evaluate the performance of the proposed approach, we use case studies of six-unit and fifteen-unit systems to optimize the objective function in Eq. (12.17). The outcomes for ELD dispatch are compared with other approaches, i.e., the genetic algorithm (GA) [11], the firefly algorithm (FA) [16], and particle swarm optimization (PSO) [10]. The parameter settings for the approaches are: the population size N is set to 40, and the dimension of the solution space D is set to 6 and 15 for the six-unit and fifteen-unit systems, respectively. The maximum number of iterations is set to 200,

Table 12.1 Coefficient settings for the six-unit system
Unit  γ ($/MW²)  β ($/MW)  α ($)   Pmin (MW)  Pmax (MW)
1     0.0075     7.60      250.0   110.0      500.0
2     0.0093     10.20     210.0   51.0       200.0
3     0.0091     8.50      210.0   82.0       300.0
4     0.0092     11.50     205.0   51.0       150.0
5     0.0082     10.50     210.0   51.0       150.0
6     0.0075     12.20     125.0   61.0       140.0

and the number of runs is set to 15. The final results are averaged over the outcomes of all runs. The compared results for ELD are shown in Fig. 12.2.
A. Case study of six units
The features of the system with six thermal units are listed in Table 12.1. The power load demand is set to 1200 MW.
The transmission loss coefficients of Eq. (12.4) for the six-unit system, operating normally with a capacity base of 100 MVA, are given as follows:
Bij = 10⁻³ ×
[ 0.15  0.17  0.14  0.19  0.26  0.22
  0.17  0.60  0.13  0.16  0.15  0.20
  0.15  0.13  0.65  0.17  0.24  0.19
  0.19  0.16  0.17  0.71  0.30  0.25
  0.26  0.15  0.24  0.30  0.69  0.32
  0.22  0.20  0.19  0.25  0.32  0.85 ],
B0 = 10⁻³ × [−0.390  −0.129  0.714  0.059  0.216  −0.663],
B00 = 0.056, and PD = 1200 MW.


Table 12.2 shows the comparison of the proposed approach with the FA, GA, and PSO approaches. Each solution has six generator outputs, P1–P6. The table reports the run-averaged generated power outputs, the total generation cost, the total power loss, and the total computing time.
Figure 12.2 depicts the comparison of the proposed IMA with the FA, GA, and PSO approaches for dispatching the power generating outputs of the six-unit system under the same conditions.
B. Case study of 15 units
The coefficients for a system with 15 thermal units are listed in Table 12.3.
The power load demand is set to 1700 MW. There are 15 generator power outputs in each solution, listed as P1, P2, …, P15, and the dimension D of the search space equals 15.

Table 12.2 The best power outputs for six-generator systems


Outputs FA GA PSO IMA
P1 459.54 459.54 458.01 459.22
P2 166.62 166.62 178.51 171.57
P3 258.04 253.04 257.35 255.49
P4 117.43 117.43 120.15 119.83
P5 156.25 153.25 143.78 154.72
P6 85.89 85.89 76.76 73.77
Total power output (MW) 1239.76 1235.76 1234.55 1234.53
Total generation cost ($/h) 14891.00 14861.00 14860.00 14844.00
Power loss (MW) 37.76 35.76 34.56 34.54
Total CPU times (sec) 296 286 271 272

Table 12.3 Coefficient settings for the fifteen-unit system
Unit  γ ($/MW²)  β ($/MW)  α ($)    Pmin (MW)  Pmax (MW)
1     0.00230    10.51     671.12   150.0      445.0
2     0.00185    10.61     574.82   155.0      465.0
3     0.00125    9.51      374.98   29.0       135.0
4     0.00113    8.52      37.50    25.0       130.0
5     0.00205    10.51     461.02   149.0      475.0
6     0.00134    10.01     631.12   139.0      460.0
7     0.00136    10.76     548.98   130.0      455.0
8     0.00134    11.34     228.21   65.0       300.0
9     0.00281    12.24     173.12   25.0       165.0
10    0.00220    10.72     174.97   24.0       169.0
11    0.00259    11.39     188.12   23.0       85.0
12    0.00451    8.91      232.01   22.0       85.0
13    0.00137    12.13     224.12   22.0       85.0
14    0.00293    12.33     310.12   25.0       61.0
15    0.00355    11.43     326.12   19.0       56.0

Bi0 = 10⁻³ × [−0.1  −0.2  2.8  −0.1  0.1  −0.3  −0.2  −0.2  0.6  3.9  −1.7  0.0  −3.2  6.7  −6.4]; B00 = 0.0055, PD = 1700 MW.

Table 12.4 depicts the comparison of the proposed approach with the other pro-
cedures, e.g., FA, GA, and PSO methods in the same condition for the optimization

Table 12.4 The best power output for fifteen-generator systems


Outputs FA [16] GA [11] PSO [10] IMA
P1 455.21 455.01 455.01 455.01
P2 91.98 93.98 120.03 85.00
P3 90.06 85.06 84.85 84.83
P4 89.97 89.97 75.56 45.29
P5 156.00 150.00 162.94 152.00
P6 350.76 350.76 322.48 357.49
P7 226.36 226.36 165.70 242.22
P8 60.00 60.00 60.34 60.56
P9 52.37 52.37 91.84 29.60
P10 26.10 25.10 45.10 50.40
P11 25.96 25.96 42.70 30.60
P12 74.01 74.01 77.97 80.00
P13 61.99 66.99 45.38 66.27
P14 36.22 34.22 47.37 26.24
P15 52.05 51.05 55.00 55.00
Total power output (MW) 1846.81 1837.81 1828.27 1827.60
Total generation cost ($/h) 1241.09 1236.09 1235.61 1234.61
Power loss (MW) 147.84 137.84 129.27 127.60
Total CPU time (sec) 411 378 313 314

system with 15 generators. The statistical results, involving the generation cost, evaluation value, and average CPU time, are summarized in the table.
As observed from the tables, the quality of the proposed method in terms of cost, power loss, and time consumption is better than that of the other approaches; the proposed IMA outperforms the other methods. The observed results in terms of convergence speed and time consumption also show that the proposed optimization method outperforms the other methods.

12.5 Conclusion

In this paper, we presented a new approach based on the ion motion optimization algorithm (IMA) for dispatching the outputs of power generators. Economic load dispatch (ELD) is optimized with different responses of the controlled system in balancing, transmission loss, and generating capacity. Linear equality and inequality constraints were employed in modeling the objective function. In the experimental section, several cases with different numbers of thermal units were used to test the performance of the proposed approach.
The preliminary results are compared with other methods in the literature, such as FA, GA, and PSO, under the same conditions, which shows that the proposed approach provides better quality performance and shorter running time than the other methods.

References

1. Tsai, C.-F., Dao, T.-K., Pan, T.-S., Nguyen, T.-T., Chang, J.-F.: Parallel bat algorithm applied
to the economic load dispatch problem. J. Internet Technol. 17 (2016). https://doi.org/10.6138/
JIT.2016.17.4.20141014c
2. Al-Sumait, J.S., Sykulski, J.K., Al-Othman, A.K.: Solution of different types of economic load
dispatch problems using a pattern search method. Electr. Power Compon. Syst. 36, 250–265
(2008). https://doi.org/10.1080/15325000701603892
3. Dao, T., Pan, T., Nguyen, T., Chu, S.: Evolved bat algorithm for solving the economic load
dispatch problem. In: Advances in Intelligent Systems and Computing, pp. 109–119 (2015).
https://doi.org/10.1007/978-3-319-12286-1_12
4. Vajda, S., Dantzig, G.B.: Linear programming and extensions. Math. Gaz. (2007). https://doi.
org/10.2307/3612922
5. Nguyen, T.-T., Pan, J.-S., Chu, S.-C., Roddick, J.F., Dao, T.-K.: Optimization localization in
wireless sensor network based on multi-objective firefly algorithm. J. Netw. Intell. 1, 130–138
(2016)
6. Yeniay, Ö.: Penalty function methods for constrained optimization with genetic algorithms.
Math. Comput. Appl. (2005)
7. Soliman, S.A.-H., Mantawy, A.-A.H.: Modern Optimization Techniques with Applications in
Electric Power Systems (2012). https://doi.org/10.1007/978-1-4614-1752-1
8. Nguyen, T.-T., Pan, J.-S., Wu, T.-Y., Dao, T.-K., Nguyen, T.-D.: Node coverage optimization
strategy based on ions motion optimization. J. Netw. Intell. 4, 1–9 (2019)
9. Xue, X., Ren, A.: An evolutionary algorithm based ontology alignment extracting technology.
J. Netw. Intell. 2, 205–212 (2017)
10. Sun, J., Palade, V., Wu, X.J., Fang, W., Wang, Z.: Solving the power economic dispatch problem
with generator constraints by random drift particle swarm optimization. IEEE Trans. Ind.
Informatics. 10, 222–232 (2014). https://doi.org/10.1109/TII.2013.2267392
11. Chiang, C.L.: Improved genetic algorithm for power economic dispatch of units with valve-
point effects and multiple fuels. IEEE Trans. Power Syst. 20, 1690–1699 (2005). https://doi.
org/10.1109/TPWRS.2005.857924
12. Suppapitnarm, A., Seffen, K.A., Parks, G.T., Clarkson, P.J.: Simulated annealing algorithm for
multiobjective optimization. Eng. Optim. (2000). https://doi.org/10.1080/03052150008940911
13. Du, K.L.: Clustering: a neural network approach. Neural Netw. (2010). https://doi.org/10.1016/
j.neunet.2009.08.007
14. Nanda, S.J., Panda, G.: A survey on nature inspired metaheuristic algorithms for partitional
clustering (2014). https://doi.org/10.1016/j.swevo.2013.11.003
15. Javidya, B., Hatamloua, A., Mirjalili, S.: Ions motion algorithm for solving optimization prob-
lems. Appl. Soft Comput. J. 32, 72–79 (2015). http://dx.doi.org/10.1016/j.asoc.2015.03.035
16. Apostolopoulos, T., Vlachos, A.: Application of the firefly algorithm for solving the economic
emissions load dispatch problem (2011). https://doi.org/10.1155/2011/523806
Chapter 13
Improving Correlation Function Method
to Generate Three-Dimensional
Atmospheric Turbulence

Lianlei Lin, Kun Yan and Jiapeng Li

Abstract Atmospheric turbulence is a common form of wind field that disturbs aircraft, and a high-intensity turbulence field may negatively affect flight safety. With the development of simulation modeling and software engineering, the influence of atmospheric turbulence on aircraft has been widely studied through simulation experiments. While the methods for generating one-dimensional atmospheric turbulence are now mature, researchers are confronted with a growing need to generate the three-dimensional atmospheric turbulence fields required by new simulation experiments. In the current study, we generate a three-dimensional atmospheric turbulence field based on an improved correlation function method. The main innovation is that we use the double random switching algorithm to improve the Gaussian white noise sequence, bringing it closer to the ideal condition, when creating the one-dimensional atmospheric turbulence field. The two-dimensional and, finally, the three-dimensional atmospheric turbulence fields are then generated iteratively from the one-dimensional field. Experimental results confirm that the three-dimensional atmospheric turbulence generated by this method provides improved transverse and longitudinal correlations as well as reduced error when compared with the theoretical values.

Keywords Atmospheric turbulence · Three dimensional · Correlation function

13.1 Introduction

Atmospheric turbulence is the random motion of the atmosphere that usually accom-
panies the transfer and exchange of energy, momentum, and matter. Such turbulence

L. Lin (B) · J. Li
School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin
150001, China
e-mail: linlianlei@hit.edu.cn
K. Yan
Beijing Near Space Airship Technology Development Co., Ltd, Beijing 100070, China


may have significant adverse effects on flight safety. As simulation modeling and software engineering have developed, these techniques have been widely used to study the influence of atmospheric turbulence on aircraft. In such simulation experiments, virtual atmospheric turbulence fields can be constructed, which is of great significance to atmospheric turbulence field modeling.
Studies of atmospheric turbulence using a mathematical model started in 1942.
Dryden established a spectrum model of atmospheric turbulence based on massive
observational data. Later, von Karman created an atmospheric turbulence energy
model with higher accuracy and a more complex form [1]. Both models considered
the classic models for the atmospheric turbulence field. Development of modeling
techniques for atmospheric turbulence started with a one-dimensional method [2].
Of these techniques, the main method is the shaping filter method [3]. Not only
has a method for one-dimensional atmospheric turbulence modeling been developed
based on Dryden’s model [4, 5], modeling methods based on von Karman’s model
have also been reported [6, 7]. Currently, the methods for modeling one-dimensional
atmospheric turbulence are considered mature.
As technology developed, multidimensional modeling and simulation technology
for atmospheric turbulence emerged in the 1990s. A method for constructing two-
dimensional atmospheric turbulence out of the one-dimensional shaping filter method
was reported by Xiao [8]. Then a method for generating two-dimensional atmospheric
turbulence based on the spatial correlation function was developed by Lu et al. [9].
In 2001, a Monte Carlo method which generated three-dimensional atmospheric
turbulence values using a correlation function matrix was reported by Hong. This
method, however, was unwieldy because of its large memory footprint and lengthy
computation time [10]. In 2012, an improved method to solve the disadvantages of the
Monte Carlo method was developed by Gao et al., but the low efficiency remained
as an unsolved problem [11]. In 2008, Gao et al. generated a three-dimensional
atmospheric turbulence field using a time–frequency transform, yet its requirement
for pre-stored data makes it unsuitable for real-time simulation [12]. Based on the
study by Lu et al. [9], an algorithm for simulating three-dimensional atmospheric turbulence with good real-time performance and accuracy was developed using the correlation function method [13].
Gaussian white noise was used in existing models of atmospheric turbulence. The
quality of Gaussian white noise will directly affect the generation of atmospheric
turbulence in simulation experiments. Hunter and Kearney reported that the white
noise is improved by using a double random switching algorithm [14]. The von
Karman three-dimensional atmospheric turbulence field modeling established by
Gao et al. in 2012 also used this algorithm [11]. We develop an improved method for
generating three-dimensional atmospheric turbulence by referring to reported studies
[11] and our previous work [13]. An improved Gaussian white noise sequence is used
to generate the initial one-dimensional atmospheric turbulence, which improves the
correlation of the overall three-dimensional atmospheric turbulence field.

13.2 Generating a Three-Dimensional Atmospheric Turbulence Based on the Correlation Function Method

The detailed method for generating the three-dimensional atmospheric turbulence


based on the correlation function method is as follows. A one-dimensional model
is first established to create a random model for atmospheric turbulence [9, 13] as
indicated by Eq. (13.1):

w(x) = aw(x − h) + σw r (x) (13.1)

where r is the Gaussian white noise, and a and σ_w are undetermined parameters that can be obtained by the correlation function method. According to the random model and the definition of the correlation function, we have the following equations:

R_0 = E[w(x) w(x)] = E{[a w(x − h) + σ_w r(x)]²} = a² R_0 + σ_w²    (13.2)

R_1 = E[w(x) w(x − h)] = E{[a w(x − h) + σ_w r(x)] w(x − h)} = a R_0    (13.3)

Solving this equation group gives

a = R_1 / R_0,   σ_w = sqrt(R_0 (1 − a²))    (13.4)

Substitute the resulting Gaussian white noise sequence into the random model to
get the atmospheric turbulence value.
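A minimal sketch of this one-dimensional recursion, assuming the correlation values R0 and R1 are already known from the chosen turbulence model (e.g., Dryden or von Karman), is shown below; the initialization of the first sample is an assumption, since the paper only states that an initial value is set at the origin.

```python
import numpy as np

def generate_1d_turbulence(R0, R1, n_points, rng=None):
    """Recursive 1-D turbulence sequence of Eq. (13.1): w(x) = a*w(x-h) + sigma_w*r(x)."""
    rng = rng or np.random.default_rng()
    a = R1 / R0                          # Eq. (13.4)
    sigma_w = np.sqrt(R0 * (1.0 - a ** 2))
    r = rng.standard_normal(n_points)    # Gaussian white noise sequence
    w = np.zeros(n_points)
    w[0] = sigma_w * r[0]                # assumed initialization at the origin
    for k in range(1, n_points):
        w[k] = a * w[k - 1] + sigma_w * r[k]
    return w
```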
In two- and three-dimensional space, a random model can be set up as

w(x, y) = a1 w(x − h, y) + a2 w(x, y − h) + a3 w(x − h, y − h) + σw r (x, y)


(13.5)
w(x, y, z) = a1 w(x − h, y − h, z − h)
+ a2 w(x − h, y − h, z) + a3 w(x, y − h, z − h)
+ a4 w(x − h, y, z − h) + a5 w(x, y, z − h)
+ a6 w(x, y − h, z) + a7 w(x − h, y, z)
+ σw r (x, y, z) (13.6)

The correlation functions can be deduced as follows:

R00 = a1 R10 + a2 R01 + a3 R11 + σw²
R01 = a1 R11 + a2 R00 + a3 R10
R10 = a1 R00 + a2 R11 + a3 R01
R11 = a1 R01 + a2 R10 + a3 R00        (13.7)
R000 = E[w(x, y, z) w(x, y, z)]
R001 = E[w(x, y, z) w(x, y, z + h)]
R010 = E[w(x, y, z) w(x, y + h, z)]
R011 = E[w(x, y, z) w(x, y + h, z + h)]
R100 = E[w(x, y, z) w(x + h, y, z)]
R101 = E[w(x, y, z) w(x + h, y, z + h)]
R110 = E[w(x, y, z) w(x + h, y + h, z)]
R111 = E[w(x, y, z) w(x + h, y + h, z + h)]        (13.8)

Two- and three-dimensional matrices can be set up based on the one-dimensional


derivation:

A = [R11 R00 R10; R00 R11 R01; R01 R10 R00],  B = [R01; R10; R11],  X = [a1; a2; a3]    (13.9)

A = [R110 R111 R010 R100 R000 R011 R101;
     R101 R100 R001 R111 R011 R000 R110;
     R100 R101 R000 R110 R010 R001 R111;
     R011 R010 R111 R001 R101 R110 R000;
     R010 R011 R110 R000 R100 R111 R101;
     R001 R000 R101 R011 R111 R100 R010;
     R000 R001 R100 R010 R110 R101 R011],
B = [R001; R010; R011; R100; R101; R110; R111],  X = [a1; a2; a3; a4; a5; a6; a7]    (13.10)

The values of a_i (i = 1, 2, 3) and a_i (i = 1, …, 7) can be calculated from the equation AX = B, and σ_w can be calculated from Eqs. (13.11) and (13.12):

σ_w = sqrt(R00 − (a1 R10 + a2 R01 + a3 R11))    (13.11)

σ_w = sqrt(R000 − (a1 R111 + a2 R110 + a3 R011 + a4 R101 + a5 R001 + a6 R010 + a7 R100))    (13.12)

The detailed procedure is described as follows. First, the initial value of the atmo-
spheric turbulence at the origin is set, followed by the calculation of parameter values
and turbulence values for the one-dimensional model. Then the parameters and tur-
bulence value for a two-dimensional model are calculated using the one-dimensional
turbulence values as boundary conditions. Finally, the three-dimensional turbulence
value is deduced based on the two-dimensional turbulence values as boundary con-
ditions.
It should be noted that the Gaussian white noise r is used during the whole calcu-
lation, so the quality of atmospheric turbulence field largely depends on the quality
of Gaussian white noise.

13.3 The Improved Correlation Function Method

According to the above theory, two main factors affect the accuracy of the numerical
simulation of atmospheric turbulence: one is the calculation of the model parameters.
Errors can be avoided so long as the original model is not simplified in the theoretical
derivation. The other factor is the choice of values for the Gaussian white noise
sequence substituted into the random model. If the standard Gaussian white noise
sequence is generated, the resulting turbulence value should satisfy the characteristics
of the frequency domain and time domain of the atmospheric turbulence; however,
in real numerical simulation experiments, the generated Gaussian white noise is not
ideal.

13.3.1 The Improved Gaussian White Noise Sequence

For ideal Gaussian white noise, the probability distribution of the noise fits a normal distribution (also known as a Gaussian distribution). Meanwhile, in terms of power density, ideal white noise refers to a noise signal with a constant power spectral density, which means that the power of the signal is uniformly distributed over each frequency range.
Let the sequence be x(n). To make the mean of the sequence approximately 0 and the standard deviation approximately 1, and to obtain a better normal distribution characteristic, the following formula is applied:

y(n) = (x(n) − μ) / σ    (13.13)
where μ is the mean value of sequence, σ is the standard deviation of sequence, and
y(n) is the improved sequence.
The probability distribution of the noise sequence is already very close to the ideal
characteristics, and the spectral power density can be improved with a double random
switching algorithm while retaining the probability density [14]. The main idea is
to randomly switch the arrangement of two points in the sequence and repeat such
switching until the spectral power density is more evenly distributed. The uniformity of the sequence power spectrum is evaluated by the sum of squares of the autocorrelation function.
The detailed steps are described as follows:
Step 1: Sequence x i (n) is generated by interchanging the positions of two randomly
selected data points in sequence x i−1 (n);
Step 2: Calculate the autocorrelation function of sequence x_i(n) as

r_i(k) = (1/N) Σ_{n=0}^{N−k−1} x_i(n) x_i(n + k),   k = 0, 1, …, N − 1    (13.14)

Fig. 13.1 Power spectrum of the improved noise sequence (power/frequency in dB/rad/sample versus normalized frequency, comparing the numerical Gaussian noise with the improved noise)

Step 3: Calculate the sum of the squares of the autocorrelation function as

SS_i = Σ_{k=1}^{N−1} [r_i(k)]²,   i = 0, 1, 2, …    (13.15)

Step 4: Stop the program if SS_i < ε or if i reaches the predetermined maximum number of switches Nmax; ε is the preset threshold for stopping the calculation.
Step 5: If SS_i < SS_{i−1}, return to Step 1 and continue; otherwise, discard the current random exchange and return to Step 1, repeating the above process until the requirement in Step 4 is satisfied.
Theoretically, the algorithm only changes the order rather than the value of the
stochastic sequence, so the mean value and standard deviation, as well as the proba-
bility distribution of the sequence will not be affected.
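A compact sketch of this double random switching procedure is given below; the stopping threshold and maximum number of swaps are illustrative parameters, and the autocorrelation of Eq. (13.14) is computed with NumPy rather than an explicit loop.

```python
import numpy as np

def double_random_switch(x, eps=1e-3, max_switches=100000, rng=None):
    """Randomly swap pairs of samples, keeping a swap only if it reduces SS of Eq. (13.15)."""
    rng = rng or np.random.default_rng()
    x = (x - x.mean()) / x.std()                   # normalization of Eq. (13.13)
    N = len(x)

    def ss(seq):
        # autocorrelation of Eq. (13.14) for lags 1..N-1, then the sum of squares
        r = np.correlate(seq, seq, mode="full")[N:] / N
        return np.sum(r ** 2)

    best = ss(x)
    for _ in range(max_switches):
        if best < eps:
            break
        i, j = rng.integers(0, N, size=2)
        x[i], x[j] = x[j], x[i]                    # tentative swap of two samples
        new = ss(x)
        if new < best:
            best = new                             # keep the swap
        else:
            x[i], x[j] = x[j], x[i]                # revert the swap
    return x
```

Because only the ordering of samples changes, the mean, standard deviation, and probability distribution are preserved, exactly as stated above.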
In the experiment, the Gaussian white noise sequence contains 2000 points, and the switching procedure reduces the sum of the squares of the autocorrelation function by 60%. The power density spectra before and after the improvement are shown in Fig. 13.1.

According to the figure, the power spectrum of the improved Gaussian white noise
sequence is much more evenly distributed than the numerical Gaussian noise, with
fewer isolated points and a smaller amplitude, making it closer to the ideal spectrum.

13.3.2 Generating 3D Atmospheric Turbulence with the Improved Gaussian White Noise Sequence

The method for generating the three-dimensional atmospheric turbulence using a correlation function is based on one-dimensional atmospheric turbulence values. The overall three-dimensional atmospheric turbulence can therefore be improved as long as its basis, the one-dimensional atmospheric turbulence, is generated with the improved Gaussian white noise sequence. To distinguish this from the previous method [13], we refer to the new method as the three-dimensional atmospheric turbulence generation method based on an improved correlation function, and the specific steps are as follows:
Step 1: Based on the improved Gaussian white noise sequence, the one-dimensional
atmospheric turbulence values are generated on coordinate axes x, y, and z;
Step 2: Two-dimensional atmospheric turbulence values are generated on coordi-
nate planes xoy, xoz, and yoz, using the one-dimensional turbulence values
as boundary conditions.
Step 3: Calculate the atmospheric turbulence at any point in three-dimensional
space using the two-dimensional atmospheric turbulence values as bound-
ary conditions
Obviously, the one-dimensional atmospheric turbulence is an important basis for generating the three-dimensional atmospheric turbulence. Based on the experimental results in Sect. 13.3.1, the improved one-dimensional stochastic sequence can be used to generate atmospheric turbulence on the axes with better correlation, and so the two- and three-dimensional values that use this sequence for their boundary conditions come closer to the theoretical characteristics.

13.4 Experimental Results and Analysis

The parameters used to generate the three-dimensional atmospheric turbulence field are: turbulence intensity σ = 1.7585 m/s, turbulence scale L = 265, Lu = 2Lw = 300 m, and step size h = 70 m. Multiple groups of 400 × 400 × 400 Gaussian white noise sequences are used to generate the atmospheric turbulence values. The experimental program was written in MATLAB R2013a and run on a ThinkPad T430 computer.

Fig. 13.2 Three-dimensional atmospheric turbulence sectional profiles: a at a height of 10 × 70 m, b at a height of 20 × 70 m (turbulence velocity w in m/s over the x/h–y/h grid)

Fig. 13.3 Transverse and longitudinal correlations of the three-dimensional atmospheric turbulence (theoretical versus improved values of R as a function of ξ in m)

The initial 60 grids are used to verify the turbulence field in the 10th grid (at a
height of 700 m) and the 20th grid (at a height of 1400 m). The generated sectional
profile is shown in Fig. 13.2.
The correlation is calculated and compared with the theoretical value, and the
results are shown in Fig. 13.3.
From Fig. 13.2, we can see that the random variation of the generated atmospheric turbulence agrees with real atmospheric turbulence. Figure 13.3 shows that the trends of both the transverse and longitudinal correlations of the three-dimensional turbulent flow field generated with the proposed method are consistent with the theoretical values, within a limited error.

13.5 Conclusions

In this article, we propose a new method of generating atmospheric turbulence based


on an improved correlation function, which is similar to the regular method of the cor-
relation function. Both methods calculate the one- and two-dimensional atmospheric
turbulence with a recursive calculation, which is then used to calculate the three-
dimensional atmospheric turbulence. Because the calculation of the one-dimensional
turbulent flow field serves as the basis for the calculation of the overall turbulent
flow field, we use the improved Gaussian white noise obtained by the double random
switching algorithm to get a smaller mean value and better power spectrum. The
proposed method has been verified by the experimental results, which confirm that
the three-dimensional atmospheric turbulence generated by this method shows better
transverse and longitudinal correlations and has smaller theoretical errors compared
with the ones generated by the original Gaussian white noise sequence. Moreover,
this method is fast in calculation and consumes only a relatively small memory size,
making it more suitable for the requirements of a simulation experiment.

Acknowledgements This work is supported by the National Science Foundation of China under
Grant No. 61201305.

References

1. Real, T.R.: Digital simulation of atmospheric turbulence for Dryden and von Karman models.
J. Guid. Control Dyn. 16(1), 132–138 (1993)
2. Reeves, P.M.: A non-Gaussian turbulence simulation. Air Force Flight Dynamics Lab Technical
Report AFFDL-TR-69-67, Wright-Patterson Air Force Base, OH, Nov 1969
3. Fichtl, G.H., Perlmutter, M.: Nonstationary atmospheric boundary-layer turbulence simulation.
J. Aircr. 1(12), 639–647 (1975)
4. Zhao, Z.Y., et al.: Dryden digital simulation on atmospheric turbulence. Acta Aeronaut. Astro-
naut. Sin. 10, 7(5), 433–443
5. Ma, D.L., et al.: An improved method for digital simulation of atmospheric turbulence. J.
Beijing Univ. Aeronaut. Astronaut. 3, 57–63 (1990)
6. Djurovic, Z., Miskovic, L., Kovacevic, B.: Simulation of air turbulence signal and its applica-
tion. In: The 10th Mediterranean Electrotechnical Conference, vol. 1(2), pp. 847–850 (2000)
7. Zhang, F., et al.: Simulation of three-dimensional atmospheric turbulence based on Von Karman
model. Comput. Stimul. 24(1), 35–38 (2007)
8. Xiao, Y.L.: Digital generation method for two-dimensional turbulent flow field in flight simu-
lation. Acta Aeronaut. Astronaut. Sin. 11(4), B124–B130 (1990)
9. Lu, Y.P., et al.: Digital generation of two-dimensional field of turbulence based on spatial
correlation function. J. Nanjing Univ. Aeronaut. Astronaut. 31(2), 139–145 (1999)
10. Hong, G.X., et al.: Monte Carlo stimulation for 3D-field of atmospheric turbulence. Acta
Aeronaut. Astronaut. Sin. 22(6), 542–545 (2001)
11. Gao, J., et al.: Theory and method of numerical simulation for 3D atmospheric turbulence field
based on Von Karman model. J. Beijing Univ. Aeronaut. Astronaut. 38(6), 736–740 (2012)
12. Gao, Z.X., et al.: Generation and extension methods of 3D atmospheric turbulence field. J.
Traffic Transp. Eng. 8(4), 25–29 (2008)

13. Wu, Y., Jiang, S., Lin, L., Wang, C.: Simulation method for three-dimensional atmospheric
turbulence in virtual test. J. Comput. Inf. Syst. 7(4), 1021–1028 (2011). Proctor, F.H., Bowles,
R.L.: Three-dimensional simulation of the Denver 11 July 1988 Microburst-producing storm.
Meteorol. Atmos. Phys. 49, 108–127 (1992)
14. Hunter, I.W., Kearney, R.E.: Generation of random sequences with jointly specified probability
density and autocorrelation functions. Biol. Cybern. 47, 141–146 (1983)
15. Cai, K.B., et al.: A novel method for generating Gaussian stochastic sequences. J. Shanghai
Jiaotong Univ. 38(12), 2052–2055 (2004)
Chapter 14
Study on Product Name Disambiguation
Method Based on Fusion Feature
Similarity

Xiuli Ning, Xiaowei Lu, Yingcheng Xu and Ying Li

Abstract Analyzing and processing product quality safety supervision and spot check data is key to maintaining the healthy and sustainable development of products, and the data sources are extensive. In view of the ambiguity of product names in the data, a method based on fusion feature similarity is proposed, which disambiguates product names using features such as manufacturer-related information, product-related information, and topic-related information. Experimental results show that the proposed method is effective for product name disambiguation.

Keywords Product quality · Manufacturer name · Disambiguation method

14.1 Introduction

In recent years, product quality safety incidents have occurred continuously in China, causing severe harm to people's lives and property. The incidents have many causes. Every year, the relevant state authorities conduct supervision and spot checks on key products and disclose the results to the public in time; analyzing and processing the supervision and spot check data is of great significance for improving product quality. However, a large amount of supervision and spot check data contains identical references for different products, so it is necessary to disambiguate product names.
For example:
(1) 15 batches of notebooks are identified as unacceptable in spot check, because
the sizing degree, brightness, dirt, marks, insufficient gutter, page number, and
deviation are not up to standards.
(2) Ms. Wang from Chengdu complained that a laptop she had just bought could
not boot up, and the laptop was found to have quality problems in inspection.

X. Ning · X. Lu (B) · Y. Xu · Y. Li
Quality Management Branch, China National Institute of Standardization, Beijing 10001, China
e-mail: 475554762@qq.com

In the above examples, “notebooks” in example (1) refer to paper notebooks, while
“laptops” in example (2) refer to notebook computers. In order to better analyze and
process the supervision and spot check data, it is necessary to fundamentally solve
the product name ambiguity problems.
The essence of disambiguation is to calculate the similarity between the reference
and the product, and select the most similar products as correlative products [1].
In recent years, many scholars at home and abroad have studied disambiguation
methods. Bunescu and Pasca [2] proposed a method based on cosine similarity sorting for disambiguation. Bagga and Baldwin [3] and Mann and Yarowsky [4] represented the context of the reference and the context of the object, respectively, as bag-of-words (BOW) vectors and realized person name disambiguation using the vector space model. Huai et al. [5] proposed an object naming correlation method based on the probabilistic topic model; Ning et al. [6] proposed a hierarchical clustering method based on a heterogeneous knowledge base for Chinese object name disambiguation. Zhu et al. [7] proposed a method combining the disambiguation of reference clustering with the disambiguation of the same reference in the Baidu Baike word list.

14.2 Theory and Method

14.2.1 System Framework

14.2.1.1 Attribute Division

By analyzing related reports of product quality safety supervision and spot check, the
following attributes are defined and attribute values are listed, as shown in Table 14.1.
In the profile structure, each attribute value is obtained from the related reports of product quality safety supervision and spot check, or taken as null if it is unavailable from the reports.

Table 14.1 Profile structure in report
Product name: Curler
Trademark: Logo
Model: ND-BX01
Manufacture date: November 18, 2017
Manufacturer: Ningbo Meijiayi Co., Ltd.
Place of manufacture: Ningbo
Standard: GB 4706.15-2008 household and similar electrical appliances—safety—particular requirements for appliances for skin or hair care
Test items: Logo and instructions, protection against contact with live parts, input power and current, and heat emission

According to Table 14.1, the information expressed by some attributes
is correlative to some degree, such as product name, trademark, and model, all of
which represent information related to products. Therefore, attributes are classified
into the following three categories of features, based on the correlation of the information they express: manufacturer-related information, product-related information, and topic-related information. The manufacturer-related information includes the manu-
facturer name and manufacture place, the product-related information includes the
trademark, model, and manufacture date, and the topic-related information includes
the standard and inspection items.

14.2.1.2 System Framework

According to the attributes of the product name, the manufacturer-related information, product-related information, and topic-related information are selected for analysis. First, the data are preprocessed by word segmentation, text regeneration, etc.; then the attribute features are classified into the three categories, and the similarity of each category is calculated and combined; finally, the disambiguation result is obtained by comparing the combined similarity with a preset threshold. When the similarity is greater than the threshold, the two product references represent the same product; when the similarity is lower than the threshold, the two product references represent different products. The system structure diagram is shown in Fig. 14.1.

14.2.2 Feature Selection and Similarity Calculation

Fig. 14.1 System structure diagram: the text to be disambiguated is preprocessed; the manufacturer-related, product-related, and topic-related features are extracted; the similarity of each feature is calculated; the feature similarities are combined; and the disambiguation result is output

The most important step in product name disambiguation is to choose main features that distinguish different products to the greatest extent. The selected features are analyzed, different feature weights are assigned according to how strongly each feature distinguishes product names, the weighted feature similarities are combined to compute the similarity of the product names, and the ambiguity is eliminated. For any two texts T1 and T2 to be disambiguated, the computational complexity can be reduced by improving the similarity calculation method for the three categories of features:

T1 = (ω11 , ω12 , . . . , ω1n ) (14.1)

T2 = (ω21 , ω22 , . . . , ω2n ) (14.2)

14.2.2.1 Manufacturer-Related Information

According to the correlation between the manufacturer and the product name in
the report, corresponding product names to the same manufacturer refer to the same
product in most cases, so the manufacturer has a high product name distinction degree.
For the manufacturer-related information, the calculation method log(d/df) is used, and the similarity degree is as follows:

$$\mathrm{sim}_P(T_1, T_2) = \sum_{k=1}^{2} \log\left(d / \mathrm{df}_k\right) \qquad (14.3)$$

where d is the total number of reports, and df_k is the number of reports in which both the product name to be disambiguated and the manufacturer (or manufacture place) are mentioned.

14.2.2.2 Product-Related Information

Product-related information is an important feature of product identification, includ-


ing model, trademark, manufacture date, etc. When calculating the similarity of product-related information, the similarity is denoted simC(T1, T2); the similarity simC(con1i, con2i) = 1 if the ith related information of T1 is compatible with or identical to the ith related information of T2, and simC(con1i, con2i) = 0 if the two are not compatible or either one is missing. Three types of related information, namely trademark, model, and manufacture date, are considered herein, i.e., Pk = (conk1, conk2, conk3), and the similarity formula is as follows:

$$\mathrm{sim}_C(T_1, T_2) = \sum_{i=1}^{3} \mathrm{sim}_C(\mathrm{con}_{1i}, \mathrm{con}_{2i}) \qquad (14.4)$$
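As a minimal sketch of how the product-related similarity in (14.4) might be computed, the following Python fragment compares the trademark, model, and manufacture date of two report profiles; the field names and the example profiles are hypothetical and only illustrate the counting rule described above.

```python
def sim_c(profile1, profile2, fields=("trademark", "model", "manufacture_date")):
    """Product-related similarity of Eq. (14.4): count fields that are present
    in both profiles and identical (or otherwise judged compatible)."""
    score = 0
    for field in fields:
        v1, v2 = profile1.get(field), profile2.get(field)
        if v1 is not None and v2 is not None and v1 == v2:
            score += 1            # sim_C(con_1i, con_2i) = 1
        # missing or incompatible values contribute 0
    return score

# Hypothetical example profiles
p1 = {"trademark": "Logo", "model": "ND-BX01", "manufacture_date": "2017-11-18"}
p2 = {"trademark": "Logo", "model": "ND-BX01", "manufacture_date": None}
print(sim_c(p1, p2))              # 2: trademark and model match, date is missing
```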

14.2.2.3 Topic-Related Information

The topic-related information refers to the inspection information involved in the reports of the products to be disambiguated, which indicates the product information to a certain degree. Both the inspection items and the standard in the report of the products to be disambiguated indicate the inspection information of the products to a great degree, so they are combined into one text, from which the topic-related information is extracted. Given the short-text nature of the topic features [8], this study uses an improved similarity calculation method based on word co-occurrence clustering.
Generally, words in the same report express the meaning of the same topic, so words with the same topic often appear together, namely word co-occurrence. The main difference is that different words have different degrees of correlation. The word co-occurrence frequency under the same topic is relatively high, so such words are clustered to the same topic.

$$P(\omega_i, \omega_j) = \sum_{x \in X} P(x)\, p[\omega_i \mid x]\, p[\omega_j \mid x], \quad \forall \omega_i, \omega_j \in Y \qquad (14.5)$$

P(ωi, ωj) represents the co-occurrence frequency, x represents a topic, X represents the topic cluster, ωi and ωj represent different words, and Y represents the word cluster. Assuming that each report expresses one topic and N reports are included in total, the prior probability of a topic is P(x) = 1/N. When a word appears in a report, its posterior probability is p[ωi|x] = 1; if the words ωi and ωj occur simultaneously in m sentences, their joint probability is P(ωi, ωj) = m/N. Therefore, the word co-occurrence probability can be calculated by the following formula:

$$T(\omega_i, \omega_j) = \frac{\left|\mathrm{text}(\omega_i, \omega_j)\right|}{\left|\mathrm{text}\right|} \qquad (14.6)$$

where text(ωi, ωj) represents the report cluster whose text vectors include both ωi and ωj, text represents an individual text, and |·| denotes the number of elements.
A text set matrix Qm×n similar to a vector space model is constructed. Assuming that the text set Q includes n reports and m co-occurring word classes, the text set can be expressed as an m × n matrix in which each column vector represents a report and each row vector represents the distribution of one co-occurring word class across the texts; the corresponding entry of the matrix is 1 if the co-occurring words appear in the text, and 0 otherwise; namely:

$$Q_{m \times n} = \begin{bmatrix} q_{11} & q_{12} & \cdots & q_{1n} \\ q_{21} & q_{22} & \cdots & q_{2n} \\ \vdots & & \ddots & \vdots \\ q_{m1} & q_{m2} & \cdots & q_{mn} \end{bmatrix} \qquad (14.7)$$

The word co-occurrence clustering similarity can then be defined as the sum of the products of the occurrence values of all co-occurring word classes in the two texts. Namely, the more co-occurring words the two texts share, the greater the similarity between the text and the text to be disambiguated will be:

$$\mathrm{sim}_T(T_1, T_2) = \sum_{k=1,2,\ldots,m} q_{ki} \times q_{kj} \qquad (14.8)$$
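A rough sketch of the word co-occurrence computation in (14.6)–(14.8), under the simplifying assumption that each row of Q corresponds to one pair of words observed together in at least one report; the tokenized example reports are invented for illustration.

```python
from itertools import combinations

import numpy as np

# Invented tokenized "standard + inspection item" texts, one per report
reports = [
    ["sizing", "brightness", "gutter", "page"],
    ["brightness", "gutter", "deviation"],
    ["battery", "boot", "screen"],
]

# Word pairs that co-occur in at least one report (the co-occurring word classes)
pairs = sorted({tuple(sorted(p)) for text in reports for p in combinations(set(text), 2)})

# Binary matrix Q (m pairs x n reports), as in Eq. (14.7)
Q = np.array([[1 if set(pair) <= set(text) else 0 for text in reports] for pair in pairs])

def sim_t(i, j):
    """Eq. (14.8): sum over co-occurrence classes of q_ki * q_kj for reports i and j."""
    return int(np.dot(Q[:, i], Q[:, j]))

print(sim_t(0, 1), sim_t(0, 2))   # reports 0 and 1 share co-occurring words; 0 and 2 do not
```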

14.2.3 Combined Feature Similarity

Based on the three categories of features, i.e., manufacturer-related information, product-related information, and topic-related information, and by making full use of the high distinction degree of the manufacturer-related information, the importance of the product-related information, and the low-complexity word co-occurrence clustering of the topic-related information, the following formula for the similarity of the product names to be disambiguated is obtained; it guarantees the disambiguation effect while reducing the algorithm complexity.

product(T1 , T2 ) = αsimP (T1 , T2 ) + βsimC (T1 , T2 ) + γ simT (T1 , T2 ) (14.9)

Judge whether the two product names refer to the same product according to the
similarity of the two product names.

$$\mathrm{Con} = f(\mathrm{product}) = \begin{cases} 1, & \mathrm{product}(T_1, T_2) \ge \mathrm{threshold} \\ 0, & \mathrm{product}(T_1, T_2) < \mathrm{threshold} \end{cases} \qquad (14.10)$$

where threshold represents the co-reference relationship confidence: Con = 1 indicates that the two product names refer to the same product, and Con = 0 indicates that the two product names refer to different products.
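The fusion and thresholding steps of (14.9) and (14.10) can be sketched as follows; the weights and threshold use the values reported later in the experiments (α = 0.17, β = 0.54, γ = 0.29, threshold = 0.52), and the three component similarities are assumed to be already computed and normalized.

```python
def fused_similarity(sim_p, sim_c, sim_t, alpha=0.17, beta=0.54, gamma=0.29):
    """Eq. (14.9): weighted combination of the three feature similarities."""
    return alpha * sim_p + beta * sim_c + gamma * sim_t

def same_product(sim_p, sim_c, sim_t, threshold=0.52):
    """Eq. (14.10): Con = 1 if the two references denote the same product, else 0."""
    return 1 if fused_similarity(sim_p, sim_c, sim_t) >= threshold else 0

# Hypothetical, already-normalized component similarities
print(same_product(0.8, 0.6, 0.3))   # 1: combined similarity 0.547 is above the threshold
```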

14.3 Experiment and Result Analysis

14.3.1 Selection of Experimental Data Set

2,000 quality safety supervision and spot-check reports are collected from the websites of local market supervision and administration authorities as the data set for this experiment; the incomplete texts are then removed, and finally 200 reports are selected at random as the experimental data. First, the 8 product names to be disambiguated in the 200 reports are marked manually; the numbers of reports selected for them, {12, 9, 15, 11, 6, 14, 14, 10}, are shown in Fig. 14.2, which indicates the randomness of the data. Then, the Stanford NLP word segmentation tool is used for word segmentation of the standard and inspection items in the reports.

Fig. 14.2 Number of reports selected

14.3.2 Evaluation Indicator

The accuracy, recall rate, and F are used as evaluation indicators:

$$P = \frac{\sum_{A_i \in A} \max_{B_j \in B} \left|A_i \cap B_j\right|}{\sum_{A_i \in A} \left|A_i\right|} \qquad (14.9)$$

$$R = \frac{\sum_{B_i \in B} \max_{A_j \in A} \left|B_i \cap A_j\right|}{\sum_{B_i \in B} \left|B_i\right|} \qquad (14.10)$$

$$F = \frac{P \cdot R}{\alpha R + (1 - \alpha) P} \qquad (14.11)$$

A = {A1, A2, ...} and B = {B1, B2, ...} represent the data set to be evaluated and the manually labeled data set, respectively. According to Ref. [9], α = 0.5 is taken (α represents the balance factor), and F with α = 0.5 provides a comprehensive evaluation of accuracy and recall rate.
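A hedged sketch of this cluster-overlap evaluation, where A is the clustering to be evaluated and B the manually labeled clustering, both represented as lists of sets of report IDs; the toy clusterings at the end are purely illustrative.

```python
def precision(A, B):
    """P: each evaluated cluster is credited with its best overlap against the gold clusters."""
    return sum(max(len(a & b) for b in B) for a in A) / sum(len(a) for a in A)

def recall(A, B):
    """R: symmetric to P, with the roles of evaluated and gold clusters exchanged."""
    return precision(B, A)

def f_score(A, B, alpha=0.5):
    p, r = precision(A, B), recall(A, B)
    return p * r / (alpha * r + (1 - alpha) * p)

# Toy clusterings over six report IDs (illustrative only)
A = [{1, 2, 3}, {4, 5, 6}]          # system output
B = [{1, 2}, {3, 4, 5}, {6}]        # manual annotation
print(round(precision(A, B), 3), round(recall(A, B), 3), round(f_score(A, B), 3))
```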

14.3.3 Experimental Results and Analysis

With the accuracy, recall rate, and F of eight product names to be disambiguated as
experimental results, the feature weight, similarity threshold, and different similarity
feature combination are analyzed.
According to formulas (14.9) and (14.10), the feature weights satisfy α + β + γ = 1 with α, β, γ ∈ (0, 1), together with a similarity threshold. Different values of α, β, γ, and threshold are tested many times to evaluate F, so as to obtain the optimal feature weight combination. Namely, when α = 0.17, β = 0.54, γ = 0.29, and threshold = 0.52, the largest F for an individual product name is F = 91.36, and the average of the largest F values over the eight sets of data is F = 89.15.
Figures 14.3, 14.4, and 14.5 show the effects of α, β, and γ on the accuracy, recall rate, and F. As α, β, and γ increase, the accuracy increases while the recall rate decreases; the largest F occurs when α = 0.17, β = 0.54, and γ = 0.29. The contribution of the manufacturer-related, product-related, and topic-related information to the similarity of the references rises with the increase of the weights α, β, and γ, so references related to the same manufacturer-related, product-related, and topic-related information are mistakenly regarded as the same product, resulting in a constant decrease of the recall rate. However, the accuracy is highest when α approaches 1, because the manufacturer-related information (manufacturer name and manufacture place) can better distinguish different products.
Figure 14.6 shows the effects of the similarity threshold on accuracy, recall rate, and F. When the threshold is too small, many different reports are retrieved into the same category and it is impossible to accurately identify the references, resulting in a very high recall rate but low accuracy. The largest F occurs when threshold = 0.52. When the threshold is too high, only references with high similarity can be identified as the same reference, resulting in high accuracy but a decreasing recall rate.

Fig. 14.3 Recall, precision, and F with α

Fig. 14.4 Recall, precision, and F with β

Fig. 14.5 Recall, precision, and F with γ

Fig. 14.6 Recall, precision, and F with threshold

14.4 Conclusion

For the large number of product name ambiguity problems in product quality safety supervision and spot-check reports, analysis is made in terms of manufacturer-related information, product-related information, and topic-related information; topic features of product names are then selected with the method based on word co-occurrence clustering, and finally product names are disambiguated with different feature weight parameters and similarity thresholds. The simulation experiment shows that the method used in this paper is effective for product name disambiguation.

Acknowledgements This research is supported and funded by the National Science Foundation
of China under Grant No. 91646122 and the National Key Research and Development Plan under
Grant No.2016YFF0202604 and No.2017YFF0209604.

References

1. Zhao, J., Liu, K., Zhou, G., Cai, L.: Open information extraction. J. Chin. Inf. Process. 25(6),
98–110 (2011)
2. Bunescu, R.C., Pasca, M.: Using encyclopedic knowledge for named entity disambiguation. In: Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics, pp. 9–16. Trento, Italy (2006)
3. Bagga, A., Baldwin, B.: Entity-based cross-document coreferencing using the vector space
model. In: Proceedings of the 17th International Conference on Computational Linguistics, vol.
1, Association for Computational Linguistics, pp. 79–85. Montreal, Canada (1998)
4. Mann, G.S., Yarowsky, D.: Unsupervised personal name disambiguation. In: Proceedings of
the 7th Conference on Natural Language Learning at HLT-NAACL 2003, vol. 4, pp. 33–40.
Sapporo, Japan (2003)
5. Huai, B., Bao, T., Zhu, H., et al.: Topic modeling approach to named entity linking. J. Softw.
25(9), 2076–2087 (2014)
6. Ning, B., Zhang, F.: Named entity disambiguation based on heterogeneous knowledge base. J.
Xi’an Univ. Posts Telecommun. 19(4), 70–76 (2014)
7. Zhu, M., Jia, Z., Zuo, L., et al.: Research on entity linking of Chinese micro blog. Acta Sci. Nat.
Univ. Pekin. 50(1), 73–78 (2014)
8. Hinton, G., Deng, L., Yu, D., et al.: Deep neural networks for acoustic modeling in speech
recognition. IEEE Signal Process. Mag. 29(6), 82–97 (2012)
9. National Institute of Standards and Technology. Open KWS13 keyword search evaluation plan
(2013)
Chapter 15
Delegated Preparation of Quantum
Error Correction Code for Blind
Quantum Computation

Qiang Zhao and Qiong Li

Abstract The universal blind quantum computation protocol allows a client to delegate a quantum computation to a remote server while keeping the information private. Since qubit errors are inevitable in any physical implementation, quantum error correction codes are needed for fault-tolerant blind quantum computation. In this paper, a quantum error correction code preparation protocol is proposed based on remote blind qubit state preparation (RBSP). The code is encoded on the brickwork state for fault-tolerant blind quantum computation. The protocol only requires the client to emit weak coherent pulses, which frees the client from dependence on quantum memory and quantum computing.

Keywords Universal blind quantum computation · Quantum error correction ·


Remote blind qubit state preparation · Brickwork state

15.1 Introduction

Quantum computation has come into the focus of quantum information science because quantum algorithms can quickly solve some NP problems such as factoring large numbers [15]. The existing traditional protocols are threatened as a result of the huge progress in quantum computing. In order to resist quantum attacks, many signature and transfer protocols have been presented based on the assumed hardness of lattice problems [8, 17]. Although modern quantum computation is making strides toward scalable quantum computers, small and privately owned quantum computers remain very distant. If large quantum computers are offered as rental systems, users can be granted access to them to perform quantum computation. Broadbent, Fitzsimons, and Kashefi proposed universal blind quantum computation [4], which allows the client (named Alice) to execute a quantum computation on a quantum server (named Bob) without revealing any information about the computation except an upper bound on its size. This protocol has been experimentally realized in an optical system [2, 3].

Q. Zhao · Q. Li (B)
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
e-mail: qiongli@hit.edu.cn
In blind quantum computation, a quantum computation can be conceptually divided into a classical part and a quantum part in the framework of measurement-based quantum computation [11, 12]. Alice, as the classical control unit, prepares qubits and decides the measurement angles, while Bob, as the quantum unit, performs the measurements. The inputs are prepared into the desired single-photon states by Alice. However, quantum states are easily affected by the environment and imperfect devices [1, 5, 10, 13], which will inevitably produce errors. The errors may occur during qubit preparation, quantum transmission, and quantum measurement. Hence, a practical blind quantum computation system requires Alice to have the ability to prepare encoded logical qubits for quantum error correction.
Quantum error correction was independently presented by Shor and Steane [14, 16]. The Shor code is a combination of the 3-qubit phase-flip and bit-flip codes, and it is a nine-qubit code. Steane's code uses seven qubits to encode one qubit and can protect against the effects of an arbitrary error on a single qubit. Moreover, the Steane method has an advantage over the Shor procedure in syndrome measurement, for which only 14 ancilla bits and 14 CNOT gates are needed. Hence, Steane's code is used for quantum error correction in our paper.
For fault-tolerant blind quantum computation, the encoded logical qubits, which
are prepared based on the encoding circuit, are required to replace original qubits
in the brickwork state. In [4], Broadbent, Fitzsimons, and Kashefi proposed a fault-tolerant blind quantum computation protocol, which can convert the encoding circuit to a measurement-based quantum computation on the brickwork state. However, this encoding preparation requires Alice to have the ability to prepare single-photon states, and it consumes a large number of qubits to prepare an encoded logical qubit. Chien presented two fault-tolerant blind quantum computation protocols [5]. In the first protocol, Alice prepares the encoded logical qubits based on the quantum circuit and then sends them to Bob. In the second protocol, Bob prepares the initial encoded logical qubits, Alice randomly performs phase gates on these logical qubits, and then sends them back to Bob via quantum teleportation. Both protocols require Alice to have quantum memory and quantum computing ability. In the ideal blind quan-
the ability of quantum memory and quantum computing. In the ideal blind quan-
tum computation, Alice has to prepare perfect qubits for the blindness. However,
the preparation will inevitably be imperfect in any physical implementation. Hence,
a remote blind qubit state preparation (RBSP) protocol is presented by Dunjko et
al. [6] to prepare the approximate blind qubits. To improve the preparation effi-
ciency, a modified RBSP protocol with two decoy states is proposed by Zhao and Li
[18, 19]. Nevertheless, these prepared single qubits cannot be used for fault-tolerant
blind quantum computation.
In the paper, a quantum error correction code preparation protocol is proposed
based on RBSP, which is able to prepare the encoded logical qubits for fault-tolerant
blind quantum computation. In the protocol, Alice emits weak coherent pulses, and
delegates Bob to prepare quantum error correction code on the brickwork state, i.e.,
a universal family of graph states. According to Alice's instructions, Bob performs
the measurement-based quantum computation on the brickwork state to prepare the

encoded logical qubits. The protocol only requires Alice to have the ability to emit
weak coherent pulses.
The rest of this paper is organized as follows: in Sect. 15.2, technical preliminaries
are introduced. In Sect. 15.3, the delegated preparation protocol is presented, which
can prepare the encoded logical qubits on the brickwork state for fault-tolerant blind
quantum computation. In Sect. 15.4, conclusions are drawn.

15.2 Technical Preliminaries

15.2.1 Quantum Gate

The evolution of qubits is described by quantum gates in quantum computation. Quantum gates are unitary operations, which can be represented by matrices. The frequently used quantum gates include the Pauli gates (I, X, Y, Z), the Hadamard gate H, the phase gate S, the π/8 gate (T), the controlled-NOT (CNOT) gate, and so on. Their matrix forms are shown in the following equation:
       
$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad Y = iXZ = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$$

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad S = \begin{pmatrix} 1 & 0 \\ 0 & -i \end{pmatrix}, \quad T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}, \quad CNOT = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \qquad (15.1)$$
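The matrices in (15.1) can be written down directly in NumPy; the short sketch below is only a consistency check (not part of the protocol) that the gates are unitary, that Y = iXZ, and that CNOT flips the target when the control qubit is |1⟩. The phase gate is entered as diag(1, −i), exactly as printed above.

```python
import numpy as np

I = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
Y = 1j * X @ Z
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.array([[1, 0], [0, -1j]], dtype=complex)            # phase gate as printed in Eq. (15.1)
T = np.array([[1, 0], [0, np.exp(1j * np.pi / 4)]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# All single-qubit gates are unitary: U @ U^dagger = I
for U in (X, Y, Z, H, S, T):
    assert np.allclose(U @ U.conj().T, I)

# CNOT maps |10> to |11> (control qubit first, target qubit second)
ket10 = np.kron([0, 1], [1, 0]).astype(complex)
ket11 = np.kron([0, 1], [0, 1]).astype(complex)
assert np.allclose(CNOT @ ket10, ket11)
print(np.allclose(Y, np.array([[0, -1j], [1j, 0]])))       # True: Y = iXZ
```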

In quantum computation, an algorithm is completed by a sequence of quantum


gates, which is described by the quantum circuit model. In the model, each line (wire) represents a qubit; inputs are on the left and outputs are on the right, with time flowing from left to right. In two-qubit gates, the wire with a black dot represents a control qubit
and the other represents a target qubit. The diagrammatic notations of some quantum
gates are shown in Fig. 15.1.
Each line represents a qubit. For the CNOT, CZ, and CPhase gates, the upper qubit
is the control qubit and the lower qubit is the target qubit. The SWAP gate swaps two
input qubits.

Fig. 15.1 Diagram of quantum gates a Pauli-X gate. b Pauli-Z gate. c Hadamard gate. d Phase
gate S. e π/8 gate. f Controlled-NOT (CNOT) gate. g Controlled-Z (CZ) gate. h Controlled-Phase
(CPhase) gate. i SWAP gate

15.2.2 Quantum Error Correction

A popular quantum code is the [[n, k, d]] stabilizer code, which can encode k qubits
into n qubits [7, 9, 10]. The parameter d is the distance of the code. The stabilizer
code can also be described by the generator matrix G, which has 2n columns and n − k rows. The generator matrix is denoted as G = (X_G | Z_G). In this paper, we use a common stabilizer code, i.e., the 7-qubit Steane code [[7, 1, 3]]. The code can encode one qubit in seven qubits and correct any 1-qubit error. The encoded logical qubit basis is denoted as {|0_L⟩, |1_L⟩}. The generator matrix of the [[7, 1, 3]] code is shown as follows [9]:


$$G_{[[7,1,3]]} = \left(\begin{array}{ccccccc|ccccccc}
0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1
\end{array}\right) \qquad (15.2)$$

The quantum circuit that encodes qubits can be designed according to the generator matrix. The circuit shown in Fig. 15.2 is used to prepare an unknown logical qubit for quantum error correction [10]. It is easy to understand that the CNOT gates of the circuit are based on an alternative expression for X_G, which permutes the columns. An unknown logical qubit α|0⟩ + β|1⟩ and six ancilla qubits |0⟩^⊗6 can be encoded into α|0_L⟩ + β|1_L⟩, as shown in Fig. 15.2.

Fig. 15.2 An encoding circuit for the [[7, 1, 3]] code [10]

15.3 Delegated Preparation of Quantum Error Correction


Code

As is well known, the encoded logical qubit |0_L⟩ is the equally weighted superposition of all of the even-weight codewords of the Hamming code, and the logical qubit |1_L⟩ is the equally weighted superposition of all of the odd-weight codewords of the Hamming code.

$$\begin{aligned} |0_L\rangle = \frac{1}{2\sqrt{2}}\,(&|0000000\rangle + |0001111\rangle + |0110011\rangle + |0111100\rangle \\ &+ |1010101\rangle + |1011010\rangle + |1100110\rangle + |1101001\rangle) \\ |1_L\rangle = \frac{1}{2\sqrt{2}}\,(&|1111111\rangle + |1110000\rangle + |1001100\rangle + |1000011\rangle \\ &+ |0101010\rangle + |0100101\rangle + |0011001\rangle + |0010110\rangle) \end{aligned} \qquad (15.3)$$
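The sixteen kets in (15.3) are the codewords of the classical [7, 4] Hamming code split by parity. As a quick consistency check (illustrative only, not part of the preparation protocol), the sketch below spans the three X-type rows of the generator matrix in (15.2) over GF(2) and recovers the eight even-weight codewords of |0_L⟩ together with their complements for |1_L⟩.

```python
import numpy as np
from itertools import product

# X-part rows of the generator matrix in Eq. (15.2)
XG = np.array([[0, 0, 0, 1, 1, 1, 1],
               [0, 1, 1, 0, 0, 1, 1],
               [1, 0, 1, 0, 1, 0, 1]])

# |0_L> is the equal superposition of all GF(2) linear combinations of these rows
zero_L = sorted("".join(map(str, (np.array(c) @ XG) % 2)) for c in product([0, 1], repeat=3))
# |1_L> flips every bit (applies the all-ones logical X)
one_L = sorted("".join(str(int(b) ^ 1) for b in w) for w in zero_L)

print(zero_L)   # 8 even-weight codewords, matching Eq. (15.3)
print(one_L)    # 8 odd-weight codewords
```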

To prepare the unknown encoded logical qubits, a good scheme was presented
by Preskill for the Steane’s [[7, 1, 3]] code [10]. A qubit in an unknown state can
be encoded using the circuit, as shown in Fig. 15.2. The alternative expression of
the generator matrix G [[7,1,3]] is used to construct the encoding circuit. The encoded
logical qubits are determined by the generators of G. Since the rank of the matrix X G
is 3, the 3 bits of the Hamming string completely characterize the data represented
in Eq. (15.3). The remaining four bits are the parity bits that provide the needed
redundancy to protect against errors. Hence, we can use two CNOT gates to prepare
the state (|0000000⟩ + e^{iθ}|0000111⟩)/√2 for the unknown input state |+_θ⟩. To add |0_L⟩ to this state, the remaining CNOT gates of the circuit switch on the parity bits determined by G_[[7,1,3]].
According to the encoding circuit in Fig. 15.2, we can prepare the unknown encoded logical qubit |+_θ⟩_L from an unknown qubit |+_θ⟩. If Alice wants to delegate Bob to prepare the encoded logical qubits, Bob needs to convert the encoding circuit from Alice into a measurement-based quantum computation. In our paper, we use a universal family of graph states, i.e., the brickwork state, to prepare the encoded logical qubits.

Fig. 15.3 a The encoding circuit for Steane's [[7, 1, 3]] code. b The encoding circuit where CNOT gates only operate on adjacent qubits. Red solid boxes represent SWAP gates, which are replaced with three consecutive CNOT gates. c The encoding circuit where quantum gates are arranged to fit the bricks in the brickwork state. d The brickwork state that implements the encoding circuit
If Bob uses the brickwork state to perform the encoding computation, he needs
to preprocess the input qubits in Fig. 15.2. In order to entangle the ancilla qubits and
the desired qubits |+θ  using the CZ gates for Bob, the input ancilla qubits have to be
the |+ states. In addition, since the bricks are even–odd interleaved in the brickwork
state, CNOT gates can only act on two specific adjacent lines of qubits in each
layer. Thus, SWAP gates are required for implementing quantum gates which operate
on two nonadjacent qubits. In the following, the encoding circuit in Fig. 15.2 will
be converted to a measurement-based quantum computation on the brickwork state.
The specific processes are described as follows.
Step 1—the encoding circuit is used to preprocess the input ancilla qubits into |+⟩ using the Hadamard gates, as shown in Fig. 15.3a.
Step 2—SWAP gates are added to make sure that CNOT gates operate on adjacent
qubits as shown in Fig. 15.3b. Since the construction of SWAP gates on the brickwork
state is very complex, the SWAP gates can be replaced with the three consecutive
CNOT gates.
Step 3—the encoding circuit is divided into many layers so that all quantum gates are arranged to fit a brick in the brickwork state, as shown in Fig. 15.3c.
Step 4—these 1-qubit gates and CNOT gates can be implemented on the brickwork
state, as shown in Fig. 15.3d. In blind quantum computation, Fig. 15.3d shows that
the brickwork state needs to be divided into the bricks corresponding to the quantum
gates of the encoding circuit. The measurement bases from Alice are assigned to each
qubit of the brickwork state.

Based on the above analysis, the delegated preparation of the quantum error correction code on the brickwork state is designed as follows. In our protocol, 97 layers of bricks are required to prepare an encoded logical qubit. The seven input qubits in the encoding circuit are converted to seven rows of qubits in the brickwork state. Thus, this brickwork state consists of 2723 qubits. Bob needs to use 3298 CZ gates to create the brickwork state. The measurement bases from Alice are assigned to each qubit in the brickwork state, except the last column of qubits, which are the output qubits. Thus, 2716 measurements are required for the preparation computation on the brickwork state.
In our protocol, the interaction measurement stage is different from that of the basic universal blind quantum computation. Since the ancilla qubits of the encoding circuit carry no encoded information, their measurement bases do not need to be encrypted. We only make sure that the required qubit |+_θ⟩ prepared based on RBSP is ε-blind to Bob in the encoding computation. In the basic blind quantum computation, the measurement basis of an encoded qubit is encrypted as δ = φ + θ + πr, r ∈ {0, 1}. Thus, the polarization angle θ is independent of δ in our protocol. Hence, if the qubit prepared based on RBSP is ε-blind to Bob, the encoded logical qubit is also ε-blind.

Protocol: Delegated preparation of quantum error correction code on the brickwork state
(1) Alice’s preparation
(1.1) Alice sends N weak coherent pulses whose polarization angles σ are chosen at random in {kπ/4 : 0 ≤ k ≤ 7}
(1.2) Alice sends a sequence of the ancilla pulses with the polarization state |+ to Bob. The
ancilla qubits can be public
(2) Bob’s preparation
(2.1) According to the remote blind qubit state preparation protocol [6], Bob can prepare the
required qubit |+θ i , i = 1, 2, ..., S
(2.2) Bob entangles the required qubit |+θ i and a group ancilla qubits to create the brickwork
state using CZ gates
(3) The interaction measurement
For each column x = 1, ... m in the brickwork state
For each row y = 1, ..., n in the brickwork state

(3.1) Alice computes δx,y = φx,y + θx,y + πr x,y , r x,y ∈ {0, 1} based on the real measurement
angle φ and the previous measurement results. If the qubit used is an ancilla state, θx,y = 0
(3.2) Alice transmits δx,y to Bob via the classical channel. Bob measures in the basis
{|+δx,y , |−δx,y }
(3.3) Bob transmits the one-bit measurement result to Alice via the classical channel

15.4 Conclusions

In the paper, a delegated preparation protocol is presented to prepare quantum error


correction code on the brickwork state for fault-tolerant blind quantum computation.
The protocol only requires Alice to have the ability to emit weak coherent pulses, and no quantum memory or quantum computing is needed. In addition, the resource consumption of our protocol for preparing an encoded logical qubit is analyzed.

Acknowledgements This work is supported by the Space Science and Technology Advance
Research Joint Funds (Grant Number: 6141B06110105) and the National Natural Science Founda-
tion of China (Grant Number: 61771168).

References

1. Aharonov, D., Ben-Or, M.: Fault-tolerant quantum computation with constant error rate. SIAM
J. Comput. (2008)
2. Barz, S., Fitzsimons, J.F., Kashefi, E., Walther, P.: Experimental verification of quantum com-
putations. arXiv preprint arXiv:1309.0005 (2013)
3. Barz, S., Kashefi, E., Broadbent, A., Fitzsimons, J.F., Zeilinger, A., Walther, P.: Demonstration
of blind quantum computing. Science 335(6066), 303–308 (2012)
4. Broadbent, A., Fitzsimons, J., Kashefi, E.: Universal blind quantum computation. In: 50th
Annual IEEE Symposium on Foundations of Computer Science, 2009. FOCS’09, pp. 517–
526. IEEE
5. Chien, C.H., Van Meter, R., Kuo, S.Y.: Fault-tolerant operations for universal blind quantum
computation. ACM J. Emerg. Technol. Comput. Sys. 12, 9 (2015)
6. Dunjko, V., Kashefi, E., Leverrier, A.: Blind quantum computing with weak coherent pulses.
Phys. Rev. Lett. 108(20) (2012)
7. Gottesman, D.: Stabilizer codes and quantum error correction. arXiv preprint quant-ph/9705052
(1997)
8. Liu, M.M., Hu, Y.P.: Equational security of a lattice-based oblivious transfer protocol. J. Netw.
Intell. 2(3), 231–249 (2017)
9. Nielsen, M.A., Chuang, I.: Quantum computation and quantum information (2002)
10. Preskill, J.: Fault-tolerant quantum computation. In: Introduction to Quantum Computation
and Information, pp. 213–269. World Scientific (1998)
11. Raussendorf, R., Briegel, H.J.: A one-way quantum computer. Phys. Rev. Lett. 86(22) (2001)
12. Raussendorf, R., Browne, D.E., Briegel, H.J.: Measurement-based quantum computation on
cluster states. Phys. Rev. A 68(2) (2003)
13. Shor, P.W.: Fault-tolerant quantum computation. In: Proceedings of 37th Annual Symposium
on Foundations of Computer Science, 1996, pp. 56–65. IEEE
14. Shor, P.W.: Scheme for reducing decoherence in quantum computer memory. Phys. Rev. A
52(4), 2493 (1995)
15. Shor, P.W.: Polynomial-time algorithms for prime factorization and discrete logarithms on a
quantum computer. SIAM Rev. 41(2), 303–332 (1999)
16. Steane, A.M.: Error correcting codes in quantum theory. Phys. Rev. Lett. 77(5), 793 (1996)
17. Sun, Y., Zheng, W.: An identity-based ring signcryption scheme in ideal lattice. J. Netw. Intell.
3(3), 152–161 (2018)
18. Zhao, Q., Li, Q.: Blind Quantum Computation with Two Decoy States. Springer International
Publishing (2017)
19. Zhao, Q., Li, Q.: Finite-data-size study on practical universal blind quantum computation.
Quantum Inf. Process. 17(7), 171 (2018)
Chapter 16
Design of SpaceWire Interface
Conversion to PCI Bus

Zhenyu Wang, Lei Feng and Jiaqing Qiao

Abstract This paper introduces a firmware design for SpaceWire–PCI interface conversion. It makes good use of the PCI bandwidth and can noticeably increase the conversion efficiency. Based on an analysis of the packet format defined in the SpaceWire protocol, two processes are mainly introduced: packet format conversion and DMA data transfer. Testing and comparison with a standard communication card show that this design can significantly increase the maximum transfer rate.

Keywords SpaceWire · PCI · Interface conversion · DMA

16.1 Introduction

SpaceWire [1–3] is an onboard data-handling network for spacecraft that is designed


to connect together high data rate sensors, processing units, memory subsystems, and
the downlink telemetry subsystem. It provides high-speed (2–200 Mbits/s), bidirec-
tional, and full-duplex data links which connect together the SpaceWire-enabled
equipment.
As well as supporting high data rate applications, SpaceWire is being used in applications where much higher stability and reliability are required. With more and more data payloads carried by spacecraft, the requirement for bus bandwidth is increasing rapidly, which also promotes the development of SpaceWire technology.

Z. Wang · L. Feng · J. Qiao (B)


Harbin Institute of Technology, Automatic Test and Control Institute, Harbin, Heilongjiang, China
e-mail: qiaojiaqing@hit.edu.cn
Z. Wang
e-mail: 18S001028@stu.hit.edu.cn
L. Feng
e-mail: hitfenglei@hit.edu.cn


The SpaceWire standard became an ECSS standard and has been published since 2003. Since then it has been adopted for use on many spacecraft, with over 100 spacecraft in orbit or being designed using SpaceWire [4].
Throughout the specification, design, development, and testing of a SpaceWire system, it is important that the system is tested and verified at the various levels of the standard [5]. If a spacecraft uses SpaceWire as its data-handling network, the design of SpaceWire electronic checkout and ground support equipment will be necessary. On the other hand, the CompactPCI/PXI modular test system has the advantages of small size, low cost, easy manufacture, high integration, flexible software, etc., and is widely used in aerospace and other industrial test fields. Thus, there is a need for a SpaceWire-cPCI/PXI communication card.
In a CompactPCI-based automatic test system developed by us, a four-channel SpaceWire-cPCI communication card is required. It is mainly used as a SpaceWire receiving node and is responsible for transmitting SpaceWire data to the cPCI controller. In this system, the amount of test data is very large, so the system needs to make good use of the PCI bandwidth. However, the standard SpaceWire-cPCI card can work at a maximum data transfer rate of 160 Mbit/s with a single SpaceWire channel [6, 7], which cannot meet the requirement in the limit case (200 Mbit/s).
We redesigned a SpaceWire-cPCI communication card with an FPGA on a single hardware board, and optimized the firmware for data receiving and data transfer. After testing, the maximum data transfer rate was greatly improved when the card was used as a receiving node. This design, which maximizes utilization of bandwidth and storage resources, is very suitable for SpaceWire instruments which use PCI as the host interface. The remaining sections of this paper introduce the design and optimization of the FPGA firmware in detail.

16.2 Overall Design of FPGA Firmware

The conversion between the two interfaces is mainly implemented by FPGA firmware. It comprises four major blocks: a PCI interface, a SpaceWire interface, a format converter, and a DMA controller. Figure 16.1 shows the overall architecture of the FPGA firmware with the internal and external connections between the various parts.
The PCI interface we use is a 33 MHz, 32-bit target/master interface which supports burst transfer and is responsible for converting the backplane PCI bus to the local bus. It is in slave mode by default and can apply to the arbiter for arbitration to switch between master and slave mode according to the requirement. In slave mode, it receives data and commands from other master devices, then writes to or reads from the control/status registers of each block within the FPGA. When data is received from the SpaceWire links and needs to be sent up to the computer, the PCI interface applies to become a master device. If it succeeds, the DMA controller starts working and initiates a data transfer.
SpaceWire interface’s function is to implement SpaceWire basic protocol. It can
encode/decode for SpaceWire characters and convert them into the host data interface
16 Design of SpaceWire Interface Conversion to PCI Bus 157

FPGA Local Bus (Avalon Memory Map)


PCI Target
Controller

PCI Memory Map


PCI Interface
PCI Bus
DSout Signals
Orignal Converted
SpaceWire Code Format Packet DMA PCI Master
DSin Interface Converter Controller Controller

External
DDR2

Fig. 16.1 Overall design of FPGA firmware

coding, which comprises eight data bits and one control flag. Table 16.1 shows this
coding form [1].
Format converter’s function is to convert packet format of SpaceWire into another
format that suitable for 32-bit PCI transfer. DMA controller connects format con-
verter with PCI interface, which can provide high data throughput. By the way, we
designed Avalon slave/master interfaces for all blocks, so that the whole firmware is
interconnected by Avalon Memory Map which is an on-chip bus defined by Altera.
The increasing of transfer rate mainly depends on the format converter and the
DMA controller. The next chapter mainly introduces the structure and operation of
these two blocks.

16.3 Process of Interface Conversion

This process mainly consists of “Format Conversion” and “Data Transfer”.

16.3.1 Format Conversion

As shown in Table 16.1, a code with the control flag set to zero is a normal SpaceWire data character, and any code with the control flag set to one and the least significant bit of
the data set to zero represents an EOP (End of Packet), while set to one represents an EEP (Error End of Packet). Thus, a valid SpaceWire packet for a receiving node shall comprise multiple data characters and an end_of_packet marker (EOP or EEP). Figure 16.2 shows this format.

Table 16.1 Host data interface coding
Control flag | Data bits (MSB…LSB) | Meaning
0 | xxxxxxxx | 8-bit data
1 | xxxxxxx0 (use 00000000) | EOP
1 | xxxxxxx1 (use 00000001) | EEP

This format makes it easy for a computer to identify whether the code currently acquired is a valid data character or an end_of_packet marker, so that different packets can be distinguished.
However, as we all know, the common data types of a computer are “char”, “short”, “int”, and “long”. None of them is a 9-bit type. If we process the SpaceWire code with the short type (16 bits) directly, the board's storage and bandwidth resources will be almost half wasted. So, we converted the SpaceWire packet format.
Figure 16.3 shows the structure diagram of the format converter. It comprises three major blocks: conversion logic, a DATA_FIFO built on the external DDR2, and an MSG_FIFO built on the FPGA's internal storage resources.
When a SpaceWire code is sent to the format converter, the conversion logic identifies whether this code is a data character or an end_of_packet marker. In addition, there is a packet length counter which automatically increments every time a code is received. Once an end_of_packet marker is identified, the value of the length counter is stored into the MSG_FIFO and reset to zero. Therefore, the length of each SpaceWire packet is stored in the MSG_FIFO in chronological order.
After identification, the highest bit of the original code is discarded, leaving only the remaining 8 bits. Meanwhile, the conversion logic combines every four processed 8-bit data into one 32-bit word. When the end_of_packet marker is detected but the number of remaining data bytes is less than 4, the conversion logic pads zeros after the end_of_packet marker. Because the width of the DATA_FIFO is also 32 bits, all data operations can be processed with the int type. In this way, bandwidth and storage resources can be utilized maximally. Figure 16.4 shows the converted format stored at continuous addresses as int-type words, when there are three consecutive SpaceWire packets with lengths of 10, 12, and 9, respectively.

Fig. 16.2 Format of a SpaceWire packet: multiple codes of the form {1'b0, 8'bData} followed by an EOP or EEP marker; the packet length is counted in bytes

Fig. 16.3 Structure of the format converter: the identified SpaceWire code passes through the format conversion logic into the DATA_FIFO (built on the external DDR2 and read by the DMA controller), while the packet length is written to the MSG_FIFO (read by the PCI target controller)

Fig. 16.4 Converted packet format: three packets of lengths 10, 12, and 9 packed into consecutive 32-bit words, with the end_of_packet marker and zero padding completing the last word of each packet
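To make the conversion concrete, the following Python model (an illustration of the scheme described above, not the FPGA code) packs one SpaceWire packet of (control flag, byte) codes into 32-bit words and records the length for the MSG_FIFO. Two details are assumptions read off Fig. 16.4: the first byte of each group occupies the most significant byte of the word, and the stored length counts the end_of_packet marker.

```python
EOP, EEP = 0x00, 0x01   # end_of_packet markers from Table 16.1 (control flag = 1)

def convert_packet(codes):
    """Pack one SpaceWire packet, given as (control_flag, byte) pairs, into 32-bit words.

    Returns (length, words).  The length counts every code of the packet including
    the end_of_packet marker; zeros pad the last word after the marker.
    """
    length = len(codes)                              # value pushed to the MSG_FIFO
    stream = [byte for _flag, byte in codes]         # drop the control flag, keep 8 bits
    while len(stream) % 4:
        stream.append(0)
    words = [
        (stream[i] << 24) | (stream[i + 1] << 16) | (stream[i + 2] << 8) | stream[i + 3]
        for i in range(0, len(stream), 4)
    ]
    return length, words

# Nine data bytes followed by EOP, i.e. "Packet 1, Len 10" of Fig. 16.4
packet = [(0, d) for d in range(0x11, 0x1A)] + [(1, EOP)]
length, words = convert_packet(packet)
print(length, [f"0x{w:08X}" for w in words])         # 10 and three 32-bit words
```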

16.3.2 Data Transfer

Data transfer is the process of uploading data from the DATA_FIFO to the computer. It is mainly implemented by the DMA controller and computer software. We developed a series of driver functions for the Windows OS by using NI-VISA and LabWindows/CVI. This subsection mainly introduces some of these driver functions and how they cooperate during data transfer. Figure 16.5 shows the flowchart of this process.
First of all, the computer initiates a transfer process in response to the PCI interrupt.
We set two PCI interrupt sources: MSG_FIFO non-empty interrupt and DMA DONE
interrupt. When MSG_FIFO is non-empty, it also means that there is at least one
SpaceWire packet stored in DATA_FIFO.
When the computer responds to a PCI interrupt, it needs to determine which of the two sources triggered it. The transfer_flag is a static variable that indicates whether the DMA controller is working. If transfer_flag is 0 and the MSG_FIFO is non-empty, the computer enters the interrupt service for the first time. It immediately calls the readPacketMsg function to read the MSG_FIFO once to get the length of the currently received packet, and calls the dmaMalloc function to apply for continuous physical memory of the corresponding size.
Next, the computer calls the dmaConfig function to write the transfer length and write address into the DMA control registers. It should be noted that both the allocated physical memory length and the DMA transfer length are the original length from the MSG_FIFO rounded up to an integer multiple of 4. In addition, since
what the DMA controller reads is the DATA_FIFO, there is no need to configure a DMA read address.

Fig. 16.5 Flowchart of data transfer: on each interrupt, PCI interrupts are disabled; if the MSG_FIFO is non-empty and transfer_flag = 0, the MSG_FIFO is read once, physical memory is allocated, the DMA controller is configured and enabled, and transfer_flag is set to 1; if instead DMA DONE is detected, the data are processed and transfer_flag is reset to 0; PCI interrupts are then re-enabled and the interrupt returns
Next, the dmaEnable function is called to start the 32-bit burst transfer of the DMA controller. The transfer_flag variable is set to 1 before exiting the first interrupt service so that the computer cannot operate the DMA controller repeatedly while it is working.
Then the computer enters the idle state and waits for the DMA DONE interrupt. After it arrives, a SpaceWire packet has been stored in the previously allocated physical memory, and the computer enters the interrupt service for a second time. At this point, it is ready to write the SpaceWire data to a file or perform data processing. The transfer_flag variable is reset to 0 before exiting the second interrupt service, ready for the next DMA transfer. At this moment, one data transfer process is completed.
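The two interrupt passes can be summarized as a small state machine. The Python sketch below only mirrors the control flow of Fig. 16.5; the method names such as read_packet_msg, dma_malloc, dma_config, and dma_enable stand in for the NI-VISA-based driver functions mentioned above and are hypothetical.

```python
class TransferController:
    """Control-flow model of Fig. 16.5: the first interrupt sets up a DMA transfer,
    the DMA DONE interrupt consumes the received packet."""

    def __init__(self, card):
        self.card = card
        self.transfer_flag = 0      # 1 while a DMA transfer is in flight
        self.buffer = None

    def on_interrupt(self):
        self.card.disable_pci_interrupt()
        if self.transfer_flag == 0 and self.card.msg_fifo_non_empty():
            length = self.card.read_packet_msg()          # packet length from MSG_FIFO
            padded = (length + 3) // 4 * 4                # round up to whole 32-bit words
            self.buffer = self.card.dma_malloc(padded)    # contiguous physical memory
            self.card.dma_config(write_addr=self.buffer, byte_count=padded)
            self.card.dma_enable()                        # start 32-bit burst transfer
            self.transfer_flag = 1
        elif self.card.dma_done():
            self.process_data(self.buffer)                # packet is now in host memory
            self.transfer_flag = 0                        # ready for the next packet
        self.card.enable_pci_interrupt()

    def process_data(self, buffer):
        pass                                              # write to file or analyze
```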

16.4 Testing

The major work of testing is to obtain the speed of data transfer. We designed two identical SpaceWire-cPCI communication cards using the firmware above to set up the testing environment. To prevent conflicting occupation of the PCI bus, we placed the two cards in different CompactPCI chassis and tested the rate by making them communicate with each other. Card A serves as the sender while card B serves as the receiver. Figure 16.6 shows the structure of the test environment.
It is important to obtain accurate timing for speed testing. In this design, the relevant time is the time taken for data to be written to physical memory on the receiver side. In order to include the time spent calling driver functions, we decided to use the software high-precision timer under the Windows OS [8].
This paragraph mainly introduces the operations involved in software timing. Computer B calls the QueryPerformanceFrequency function to get the frequency of the internal timer during initialization. Then card A receives commands from computer A and sends packets of different lengths. After card B has received packets from the SpaceWire router and started a data transfer, computer B calls the QueryPerformanceCounter function twice, when entering the interrupt service for the first time and for the second time, respectively. In this way, we can realize high-precision software timing and calculate the time taken by the data transfer process.
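The rate computation itself is simple; the sketch below illustrates it in Python, with time.perf_counter standing in for the QueryPerformanceFrequency/QueryPerformanceCounter pair used in the Windows driver (the sleep merely stands in for one DMA transfer).

```python
import time

def transfer_rate_mbit(packet_length_bytes, start, end):
    """Effective rate in Mbit/s from the two timestamps taken on entering the
    interrupt service the first time (DMA start) and the second time (DMA done)."""
    return packet_length_bytes * 8 / (end - start) / 1e6

start = time.perf_counter()
time.sleep(0.001)                 # stands in for the DMA transfer of one packet
end = time.perf_counter()
print(f"{transfer_rate_mbit(30_000, start, end):.1f} Mbit/s")
```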
Table 16.2 shows the transfer rates under different packet lengths after testing. As the packet length increases, the effective SpaceWire data transfer rate of the PCI interface increases too. Its maximum bandwidth is significantly higher than the value of the standard SpaceWire-cPCI card (160 Mbit/s) [7], and it is still increasing at the end of Table 16.2. We conclude that this benefits from the converted packet format and the 32-bit burst transfer mode of the DMA controller.

Fig. 16.6 Test environment: two CompactPCI chassis, with card A (the sender, controlled by CPU A) in one chassis and card B (the receiver, controlled by CPU B) in the other, connected through a SpaceWire router by SpaceWire cables; test commands are issued to card A and test data are collected from card B
162 Z. Wang et al.

Table 16.2 Test data of transfer rate
Packet length (in Bytes) | Transfer rate (in Mbit/s)
2500 | 48.35
5000 | 93.56
7500 | 119.83
10,000 | 146.31
15,000 | 187.32
20,000 | 218.87
25,000 | 244.51
30,000 | 265.42

However, due to the poor real-time performance of the Windows OS and the low execution efficiency of NI-VISA, most of the transfer time is spent on responding to the interrupt and calling driver functions. Therefore, the transfer rate is still very low when the packet length is smaller than 2500 bytes. Future work could consider developing real-time software drivers or using real-time operating systems such as VxWorks.

References

1. European Cooperation for Space Standardization: Standard ECSS-E-ST-50-12C, SpaceWire,


Links, Nodes, Routers and Networks. Issue 1, European Cooperation for Space Data Standard-
ization, July 2008
2. Parkes, S., Armbruster, P., Suess, M.: SpaceWire onboard data-handling network. ESA Bull. 145,
34–45 (2011)
3. Parkes, S.: SpaceWire Users Guide. STAR-Dundee (2012). ISBN 978-0-95734080-0. https://
www.star-dundee.com/knowledge-base/spacewire-users-guide. Accessed 2 Apr 2019
4. SpaceWire Homepage. http://spacewire.esa.int/content/Missions/Missions.php. Accessed 2 Apr
2019
5. Scott, P., Parkes, S., Crawford, P., Ilstad, J.: Testing SpaceWire systems across the full range of
protocol levels with the SpaceWire Physical Layer Tester. In: International SpaceWire Confer-
ence, San Antonio, USA, 8–10 Nov 2011
6. STAR-Dundee: SpaceWire PXI datasheet. https://www.star-dundee.com/products/spacewire-
pxi. Accessed 2 Apr 2019
7. STAR-Dundee: STAR-System API and Driver datasheet. https://www.star-dundee.com/
products/spacewire-pxi. Accessed 2 Apr 2019
8. Qiao, L.Y., Chen, L.B., Peng, X.Y.: Spacewire-PCI communication card design based on IP
core. J. Electron. Meas. Instrum. 24(10), 918–923 (2010)
Chapter 17
A Chaotic Map with Amplitude Control

Chuanfu Wang and Qun Ding

Abstract A general approach based on a control factor for controlling the amplitude of the Logistic map is discussed in this paper. The approach is illustrated using the Logistic map as a typical example. It is proved that the amplitude of the Logistic map can be controlled completely. Since the approach is derived from the general quadratic map, it is suitable for all quadratic chaotic maps.

Keywords Amplitude control · Logistic map · Quadratic map

17.1 Introduction

Chaos is a well-known phenomenon in physics and is widely used in engineering


fields, such as chaotic cryptography, chaotic secure communication [1–9]. Lorenz
was the first one to discover chaotic attractors, and he proposed the Lorenz chaotic
system in 1963 [10]. Rossler chaotic system was proposed by Rossler in 1976 [11]. In
1983, Chua’s circuit was proposed [12]. Although it is a simple nonlinear electronic
circuit, it can show a complex chaotic behavior. In 1999, Chen discovered a new
chaotic attractor [13]. Chen chaotic system is similar to the Lorenz system, but not
topologically equivalent and more complex. Since then, a large number of chaotic
systems have been put forward, such as Lu system [14], Qi system [15] and so on. In
addition to continuous chaotic systems, some discrete chaotic maps are also being
discovered. Through the study of the insect population model, May found that a
simple biological model has very complex dynamic behavior, and he proposed the
classical Logistic map [16]. Henon proposed Henon map in the study of celestial
motion [17]. More and more chaotic systems are proposed through the discovery
of chaotic attractors. Subsequently, some general design methods for constructing
chaotic systems are proposed [18–20]. However, the discovery of chaotic systems is
mainly the finding of chaotic attractors.

C. Wang · Q. Ding (B)


Electronic Engineering College, Heilongjiang University, Harbin 150080, China
e-mail: qunding@aliyun.com


Most of the classical chaotic attractors in these classical chaotic systems are
generated by unstable equilibria or fixed points. However, there may be some hidden
attractors in these chaotic systems. These attractors may be chaotic attractors or not.
The basin of the hidden attractors does not contain neighborhoods of equilibria or
fixed points. The investigations of hidden attractors can be traced back to the second
part of Hilbert’s 16th problem for two-dimensional polynomial systems [21]. In
1961, the problem of hidden oscillations in two-dimensional phase-locked loop systems was revealed by Gubar [22]. With continued research on hidden attractors in automatic control systems, hidden oscillations have been found in automatic control systems with a unique stable fixed point and a nonlinearity [23]. The development of hidden oscillation research was greatly promoted when a new way of finding hidden attractors in Chua's circuit was proposed [24]. Despite this research progress, the existing investigations of hidden attractors almost always concern continuous-time dynamical systems, and few of them address discrete-time dynamical systems.
Now, the research on chaotic systems is mainly focused on the study of chaotic
attractors and other chaotic behaviors, but the amplitude of chaotic systems is rel-
atively less studied. However, the amplitude control of chaotic signals is also an
important area in the application of chaotic systems. In 2013, Li and Sprott first pro-
posed an approach to control the amplitude of chaotic signals. By introducing control
functions, the amplitude of the Lorenz chaotic system was well controlled. Since
then, amplitude control of chaotic systems has been further studied. Li and Sprott used the amplitude control method to find the coexistence of chaotic attractors. However, the existing research on amplitude control of chaotic signals focuses only on continuous chaotic systems. To the best knowledge of the authors, none of the existing amplitude control approaches for chaotic systems targets discrete chaotic maps. Therefore, a new approach is proposed in this paper to control the amplitude of the quadratic chaotic map. The approach is illustrated using the Logistic map as a typical example. Since it is derived from the general quadratic map, it is suitable for all one-dimensional quadratic chaotic maps.

17.2 Logistic Map with Amplitude Control

In 1976, May proposed the famous Logistic map. The iteration map is

x(n + 1) = f (μ, x(n)) = μx(n)(1 − x(n)), (17.1)

where x(n) is in the interval [0, 1], x(0) is the initial value of x(n), and μ is in the interval [3.567, 4]. Some behaviors of the Logistic map are shown in Fig. 17.1.
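A few lines of Python reproduce the behavior summarized in Fig. 17.1 by iterating Eq. (17.1); the particular μ and x(0) below are chosen only for illustration.

```python
def logistic_orbit(mu, x0, n):
    """Iterate x(n+1) = mu * x(n) * (1 - x(n)) and return the orbit."""
    orbit = [x0]
    for _ in range(n):
        orbit.append(mu * orbit[-1] * (1.0 - orbit[-1]))
    return orbit

# Chaotic regime: mu = 4 gives the widest amplitude range
orbit = logistic_orbit(mu=4.0, x0=0.31, n=1000)
print(min(orbit[100:]), max(orbit[100:]))   # values spread over almost the whole interval (0, 1)
```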
When x(0) = 0 and x(0) = 1, both amplitudes of the Logistic map are zero.
Therefore, the amplitude range of the Logistic map must satisfy x(n) ∈ (0, 1). When μ = 4, the amplitude range of the Logistic map reaches its maximum.

Fig. 17.1 a Bifurcation diagram, b output time series, c phase diagram

The Logistic map is a one-dimensional discrete chaotic map, and it satisfies the period-three theorem proposed by Li and Yorke [30]. The period-three theorem is very important for
one-dimensional chaotic maps, and it is an important theoretical tool to study one-
dimensional chaotic maps. From the relationship between period-three points and
the period-one points of the discrete dynamical systems, the period-one points are
also the period-three points of the dynamical systems. For comparison, period-two
points correspond to period-four points. Therefore, we first obtain the period-one
points of the Logistic map. Let x(n) = f (μ, x(n)), we can obtain

x(n) = μx(n)(1 − x(n)). (17.2)

It is easy to verify that x1 = 0 and x1 = 1 − 1/μ are the period-one points of the
map x(n + 1) = f(μ, x(n)); they are also called fixed points. The points x1 = 0 and
x1 = 1 − 1/μ must also be fixed points of the map x(n + 1) = f³(μ, x(n)), i.e.,
they are (trivial) period-three points of the map x(n + 1) = f(μ, x(n)). Let the
set {x31, x32, x33} be a period-three orbit of the map x(n + 1) = f(μ, x(n)); then
each point in the orbit satisfies f³(μ, x(n)) = x(n). In order to eliminate the
influence of the period-one points on the period-three points, it is necessary to
transform f³(μ, x(n)) so as to remove the period-one points [31]. After removing
the period-one points, we obtain H(μ, x(n)).

H(μ, x(n)) = [f³(μ, x(n)) − x(n)] / [x(n)(x(n) − 1 + 1/μ)]                    (17.3)

Simplifying H(μ, x(n)), we get a polynomial function in μ and x(n).

H(μ, x(n)) = (−μ − μ² − μ³) + (μ² + 2μ³ + 2μ⁴ + μ⁵)x + (−μ³ − 3μ⁴ − 3μ⁵ − 2μ⁶)x²
             + (μ⁴ + 3μ⁵ + 5μ⁶ + μ⁷)x³ + (−μ⁵ − 4μ⁶ − 3μ⁷)x⁴
             + (μ⁶ + 3μ⁷)x⁵ − μ⁷x⁶                                            (17.4)

Let H(μ, x(n)) = 0; the roots of this equation are the period-three points of the
Logistic map. H(μ, x(n)) is a polynomial in x(n) of degree six, so it has at most
six roots. Since the Logistic map is chaotic, it must have period-three points
according to the period-three theorem. Therefore, the cases in which the equation
H(μ, x(n)) = 0 has only two, four, or five roots can be ruled out; since the
Logistic map must have period-three points, it must have three distinct double
roots. It is difficult to obtain an analytic expression for the solution of
H(μ, x(n)) = 0, and it is also difficult to obtain accurate values by numerical
solution in Matlab. A small error in the period-three points ultimately changes the
entire chaotic map: if the roots of H(μ, x(n)) = 0 are truncated, the obtained
points are no longer the true period-three points, and controlling them is not a
true control of the Logistic map. To avoid the influence of calculation errors on
the period-three points, the period-three points are rearranged in this paper, and
the relationship between the control factor and the Logistic map coefficients is
derived from the period-three points. First, suppose the quadratic function is

f(x) = a1x² + a2x + a3                    (17.5)

Suppose it has period-three points x31, x32, x33, with x31 < x32 < x33.
Substituting these period-three points into (17.5), we obtain three equations.

f(x31) = a1 x31² + a2 x31 + a3 = x32
f(x32) = a1 x32² + a2 x32 + a3 = x33                    (17.6)
f(x33) = a1 x33² + a2 x33 + a3 = x31

This is a nonhomogeneous linear system in which the three unknowns correspond to
three equations. The solutions are obtained by solving Eq. (17.6).

a1 = [(x32 − x33)² − (x33 − x31)(x31 − x32)] / [(x31 − x32)(x32 − x33)(x31 − x33)]                    (17.7)

a2 = [(x32 − x33)(x32² − x33²) − (x33 − x31)(x31² − x32²)] / [(x31 − x32)(x32 − x33)(x33 − x31)]      (17.8)

a3 = x32 − a1 x31² − a2 x31                                                                           (17.9)
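As a sanity check on Eqs. (17.6)–(17.9), the short Python sketch below (not from
the original paper; μ = 3.8319 and the iteration counts are illustrative choices)
extracts a period-three orbit of the Logistic map numerically and fits the
quadratic through the three (point, image) pairs; for the Logistic map one expects
a1 ≈ −μ, a2 ≈ μ, a3 ≈ 0.

import numpy as np

mu = 3.8319                      # inside the period-three window of the Logistic map
x = 0.5
for _ in range(2000):            # let the orbit settle onto the attracting 3-cycle
    x = mu * x * (1.0 - x)

p = [x]
for _ in range(2):               # three consecutive points of the period-three orbit
    p.append(mu * p[-1] * (1.0 - p[-1]))
x31, x32, x33 = p                # orbit order (the paper additionally sorts the points)

# Solve Eq. (17.6): the quadratic mapping x31 -> x32, x32 -> x33, x33 -> x31.
A = np.array([[x31**2, x31, 1.0],
              [x32**2, x32, 1.0],
              [x33**2, x33, 1.0]])
b = np.array([x32, x33, x31])
a1, a2, a3 = np.linalg.solve(A, b)
print(a1, a2, a3)                # approximately -mu, mu, 0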

  
Suppose m is the control factor, and let x′31 = mx31, x′32 = mx32, x′33 = mx33.
Substituting them into Eqs. (17.7)–(17.9), the relationship between the new
parameters and the old parameters is obtained:

a′1 = a1/m,   a′2 = a2,   a′3 = a3/m.                    (17.10)

From the classical Logistic map, the new chaotic map with the amplitude control
factor is obtained.

x(n + 1) = μx(n)(1 − mx(n)) (17.11)
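A minimal numerical check of (17.11), not part of the original paper: with m = 2,
the orbit of (17.11) started at x(0) = 0.05 should be exactly half of the classical
orbit (m = 1) started at x(0) = 0.1, since G(x/m) = F(x)/m for G(x) = μx(1 − mx)
and F(x) = μx(1 − x).

import numpy as np

def controlled_orbit(mu, m, x0, n):
    # Iterate x(n+1) = mu * x(n) * (1 - m * x(n)) for n steps.
    xs = [x0]
    for _ in range(n):
        xs.append(mu * xs[-1] * (1.0 - m * xs[-1]))
    return np.array(xs)

mu = 4.0
classic = controlled_orbit(mu, m=1.0, x0=0.10, n=200)   # classical Logistic map
scaled  = controlled_orbit(mu, m=2.0, x0=0.05, n=200)   # amplitude-controlled map
print(np.allclose(scaled, classic / 2.0))               # True: same orbit, half the amplitude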

When m = 1, it is the classical Logistic map. When μ = 4 and m = 2, the behaviors
of the Logistic map with amplitude control factor m are shown in Fig. 17.2.
When m = 2, it shows the same bifurcation behavior and phase diagram as the
classical Logistic map, but its amplitude is half that of the classical Logistic map.
By introducing a control factor m to the Logistic map, we can see that the nonlinear
dynamic behavior of the Logistic map has not been changed, only the magnitude
has changed. Therefore, the proposed approach can control the amplitude of the
Logistic map without changing any nonlinear dynamic behavior. Comparing with
the output time series of Fig. 17.1b, when m = 2 and x(0) = 0.1, the amplitude of
the output time series is not only halved, but the output sequence also becomes
completely different. When m = 2 and x(0) = 0.05, the output time series after

Fig. 17.2 When μ = 4 and m = 2, the nonlinear dynamic behaviors of the Logistic map with
amplitude control factor m. a Bifurcation diagram, b when x(0) = 0.1, output time-series, c phase
diagram, d when x(0) = 0.05, output time-series

doubling the amplitude is the same as the classical Logistic map in Fig. 17.1b.
Therefore, when m = 2, the output time series with x(0) = 0.05 as the initial value
is in the same orbit as the output time series with the classical Logistic map at the
initial value of x(0) = 0.1. From the theory of topological conjugation, we can
know that two different initial values may be corresponding to the same orbits in two
different chaotic maps. The chaotic maps which are topologically conjugate to the
Logistic map include Tent map [32] and U-N (Ulam and von Neumann) map [33]. For
the Tent map, its transform function is g(x(n)) = sin2 ( π x(n)
2
). For the U-N map, its
transform function is h(x(n)) = 0.5 − 0.5x(n). Compared with g(x(n)), h(x(n)) is
simpler. Compared with the classical Logistic map, some topological conjugate maps
have the same ranges of , and some maps have different ranges of x(n). However,
the amplitude of these topological conjugate maps is fixed and cannot be changed.
If some parameters are changed in their corresponding transform function, it cannot
be guaranteed that these transformed maps also show chaotic behaviors. And it
is difficult to find suitable transform functions. Although the existing topological
conjugation methods cannot greatly control the amplitude of the Logistic map, they
still belong to the control methods of the internal change in the Logistic map. Its
block diagram is shown in Fig. 17.3a. In addition, another way is to add an extra
amplifier. This method directly scales the amplitude of the time series of the Logistic
map. Suppose the scaling factor is k, the block diagram of the scheme is shown in
Fig. 17.3b. Since the scaling factor k is directly applied to the output time series of
the Logistic map, it cannot control the orbit of the Logistic map.
From Fig. 17.1a, we can see that the Logistic map itself has some ability to
control amplitude: the amplitude of the Logistic map changes with the parameter μ.
However, it is difficult to guarantee that the Logistic map always behaves
chaotically for different values of μ. In addition, the maximum amplitude of the
Logistic map controlled by the parameter μ is 1. Therefore, the control ability of
the parameter μ over the Logistic map is limited. Its block diagram is shown in
Fig. 17.3c. The block diagram of the method proposed in this paper is shown in
Fig. 17.3d. After introducing the amplitude control factor m, the new Logistic map
cannot be decomposed into two small subsystems. The Logistic map with amplitude
control factor m is still a chaotic system with a chaotic attractor, which is
inseparable and topologically transitive. When m = 0.25, 0.5, 2, and 4, the
bifurcation and phase diagrams are shown in Fig. 17.4.

Fig. 17.3 The block diagram of Logistic map with amplitude control (a–d):
a f(μ, g(x(n))); b f(μ, x(n)) with an external amplifier k; c f(μ, x(n)); d f(μ, x(n), m)

Fig. 17.4 The bifurcation and phase diagram with different amplitude control factor m. a Bifurca-
tion diagram, b phase diagram

Comparing different values of the amplitude control factor m, the factor m is
inversely proportional to the amplitude of the Logistic map, which is consistent
with Eq. (17.10).

17.3 Conclusion

In this paper, we have presented a general approach that introduces an amplitude
control factor into the Logistic map to realize amplitude control of chaotic
signals, and we have discussed a new way to extend the key space of a pseudorandom
sequence generator based on the Logistic map. In contrast to chaotic maps
topologically conjugate with the Logistic map, the amplitude control factor in this
paper is simpler than the transform functions of the conjugate maps, and the
amplitude can be controlled completely. A pseudorandom sequence based on the
Logistic map suffers from a small key space, which is an important restriction on
its application; by introducing the control factor m, the key space is greatly
enlarged. Future work is to investigate the amplitude control approach in
high-dimensional chaotic maps.

References

1. Chen, G., Mao, Y., Chui, C.: A symmetric image encryption scheme based on 3D chaotic cat
maps. Chaos Solitons Fractals 21, 749–761 (2004)
2. Chen, C.-M., Linlin, X., Tsu-Yang, W., Li, C.-R.: On the security of a chaotic maps-based
three-party authenticated key agreement protocol. J. Netw. Intell. 1(2), 61–66 (2016)
170 C. Wang and Q. Ding

3. Chen, C.-M., Wang, K.-H., Wu, T.-Y., Wang, E.K.: On the security of a three-party authenticated
key agreement protocol based on chaotic maps. Data Sci. Pattern Recogn. 1(2), 1–10 (2017)
4. Fan, C., Ding, Q.: ARM-embedded implementation of H.264 selective encryption based on
chaotic stream cipher. J. Netw. Intell. 3(1), 9–15 (2018)
5. Wu, T.-Y., Fan, X., Wang, K.-H., Pan, J.-S., Chen, C.-M.: Security analysis and improvement
on an image encryption algorithm using Chebyshev generator. J. Internet Technol. 20(1), 13–23
(2019)
6. Wu, T.-Y., Fan, X., Wang, K.-H., Pan, J.-S., Chen, C.-M., Wu, J.M.-T.: Security analysis
and improvement of an image encryption scheme based on chaotic tent map. J. Inf. Hiding
Multimed. Signal Process. 9(4), 1050–1057 (2018)
7. Chen, C.-M., Linlin, X., Wang, K.-H., Liu, S., Wu, T.-Y.: Cryptanalysis and improvements on
three-party-authenticated key agreement protocols based on chaotic maps. J. Internet Technol.
19(3), 679–687 (2018)
8. Chen, C.-M., Fang, W., Liu, S., Tsu-Yang, W., Pan, J.-S., Wang, K.-H.: Improvement on a
chaotic map-based mutual anonymous authentication protocol. J. Inf. Sci. Eng. 34, 371–390
(2018)
9. Wu, T.-Y., Wang, K.-H., Chen, C.-M., Wu, J.M.-T., Pan, J.-S.: A simple image encryption
algorithm based on logistic map. Adv. Intell. Syst. Comput. 891, 241–247 (2018)
10. Lorenz, E.N.: Deterministic non-periodic flow. J. Atmos. Sci. 20, 130–141 (1963)
11. Rössler, O.E: An equation for continuous chaos. Phys. Lett. A 57, 397–398 (1976)
12. Chua, L.O., Lin, G.N.: Canonical realization of Chua’s circuit family. IEEE Trans. Circuits
Syst. 37, 885–902 (1990)
13. Chen, G., Ueta, T: Yet another chaotic attractor. Int. J. Bifurc. Chaos 9, 1465–1466 (1999)
14. Lü, J., Chen, G.: A new chaotic attractor coined. Int. J. Bifurc. Chaos 3, 659–661 (2000)
15. Qi, G., Chen, G., Du, S., Chen, Z., Yuan, Z: Analysis of a new chaotic system. Phys. A Stat.
Mech. Appl. 352, 295–308 (2005)
16. May, R.M: Simple mathematical models with very complicated dynamics. Nature 261, 459–467
(1976)
17. Hénon, M.: A two-dimensional mapping with a strange attractor. Commun. Math. Phys. 50,
69–77 (1976)
18. Chen, G., Lai, D.: Feedback control of Lyapunov exponents for discrete-time dynamical sys-
tems. Int. J. Bifurc. Chaos 06, 1341–1349 (1996)
19. Lin, Z., Yu, S., Lü, J., Cai, S., Chen, G.: Design and ARM-embedded implementation of a
chaotic map-based real-time secure video communication system. IEEE. Trans. Circ. Syst.
Video 25, 1203–1216 (2015)
20. Wang, C.F., Fan, C.L., Ding, Q.: Constructing discrete chaotic systems with positive Lyapunov
exponents. Int. J. Bifurcat. Chaos 28, 1850084 (2018)
21. Hilbert, D.: Mathematical problems. Bull. Amer. Math. Soc. 8, 437–479 (1902)
22. Gubar, N.A.: Investigation of a piecewise linear dynamical system with three parameters. J.
Appl. Math. Mech. 25, 1011–1023 (1961)
23. Markus, L., Yamabe, H.: Global stability criteria for differential systems. Osaka Math. J. 12,
305–317 (1960)
24. Leonov, G.A.: Algorithms for finding hidden oscillations in nonlinear systems. The Aizerman
and Kalman conjectures and Chua's circuits. J. Comput. Syst. Sci. Int. 50, 511–543 (2011)
Chapter 18
Analysis of Factors Associated
to Smoking Cessation Plan Among Adult
Smokers

Jong Seol Lee and Keun Ho Ryu

Abstract According to the World Health Organization (WHO), smoking has gener-
ated a lot of diseases, and tobacco has been the biggest threat to human beings. The
Republic of Korea government has implemented a policy to reduce damage from
smoking since 1986. But almost 1 out of 5 Koreans still smoked in 2017 (21.2%). In
this research, we collected datasets from the Korea Health and Nutrition Examina-
tion Survey (KNHANES) from 2013 to 2015 and used statistical methods to analyze
the smoking patterns of smokers among adults. We used the chi-square test for 28
independent variables based on the before and after preparation of the dependent
variables and evaluated the result based on the significance level getting from the
statistical analysis program SPSS. In our result, the gender distribution was found to
be 2,407 (84.4%) for males and 444 (15.6%) for females. The age range was 46.36 ±
15.13 and the range was from 31 to 61 years. There were more single smokers than
married smokers, and the results were significant in this study. Rather, it was reported
that anti-smoking policy at home was not relevant, and anti-smoking policy public
places were statistically significant (p = 0.007). The results of this study suggest
that many smokers should make a decision to quit smoking by providing a negative
aspect of smoking as a significant factor related to the preparation stage of smoking
cessation.

Keywords Smoking cessation · Adult smokers · KNHANES · Cross-analysis ·


Chi-square

J. S. Lee
Department of Smart Factory, Chungbuk National University, Cheongju, South Korea
e-mail: richard@dblab.chungbuk.ac.kr
K. H. Ryu (B)
Faculty of Information Technology, Ton Duc Thang University,
Ho Chi Minh City 700000, Vietnam
e-mail: phamvanhuy@tdtu.edu.vn; khryu@tdtu.edu.vn; khryu@chungbuk.ac.kr
Department of Computer Science, College of Electrical and Computer Engineering,
Chungbuk National University, Cheongju, South Korea


18.1 Introduction

Tobacco is a cause of various cancers including respiratory diseases [1]. According


to the World Health Organization (WHO), smoking has led to a lot of diseases, and
tobacco has been the biggest threat to human beings [2]. The Republic of Korea
government has implemented a policy to reduce damage from smoking since 1986.
The nationwide public health centers have begun smoking cessation programs and
have expanded smoking cessation policies, strengthening tobacco advertising regu-
lations, tobacco price hikes since 1999. But almost 1 of 5 Koreans still smoked in
2017 (21.2%) [3].
Smoking is a modifiable health risk factor, and smoking cessation is emphasized as
the most effective way to reduce the major causes of death in the United States
[4]. Smoking cessation can reduce the risk of lung cancer by 80–90%. The US
government and the private sector have already categorized tobacco as a
dependence-forming drug and have gradually lowered the smoking rate through
aggressive anti-smoking campaigns [5].
In Korea, the incidence of lung cancer and its mortality have increased sharply
since the 1980s. In 1998, the smoking rate of male adults in Korea was 64.1%, the
highest level among the OECD countries [5].
Recently, it has been widely recognized that tobacco is harmful to health and is a
direct cause of lung cancer, and that secondhand smoke affects not only smokers but
also non-smokers in the vicinity; accordingly, efforts to drive smoking out of our
society can be observed [6].
There is little research on smoking cessation for adults; thus, our research aims
to find relevant characteristics that help people who plan to quit smoking, based
on KNHANES (2013–2015) data.

18.2 Materials and Methods

18.2.1 Data Preprocessing

In our experiment, the data preprocessing process can be shown in Fig. 18.1.
In the first step, we collected raw data from KNHANES (Korean National Health
and Nutrition Examination Survey) from 2013 to 2015 after registering personal
information and signing a pledge of confidentiality, anyone can download the raw
datasets from the website [7]. The raw datasets include 22,948 instances and 862
features.
In the second step, our research subjects were restricted to adult smokers, so the
target dataset included 3,027 instances.
This dataset still contains many features, including numerous unrelated features
and a number of missing values. Thus, in the third step, we removed irrelevant
features such as the personal ID number, life type, and so on. In the last step, we

Fig. 18.1 Data preprocessing process

Fig. 18.2 Each year of current smokers from 2013 to 2015 (proportion of current smokers:
37.4% in 2013, 33.2% in 2014, 29.4% in 2015)

Table 18.1 Percentage of the contemplation and preparation group for each year

Year    Total    Contemplation    Preparation
2013    1,066    690 (64.1)       376 (35.9)
2014    946      584 (60.1)       362 (39.9)
2015    839      487 (56.8)       352 (43.2)

deleted instances with missing values, which arise, for example, when a respondent
declines to answer part of the personal survey.
In total, 2,851 adult smokers were included in this study. The ratio of smokers
from 2013 to 2015 is shown in Fig. 18.2: 37.4% in 2013, 33.2% in 2014, and 29.4%
in 2015. The yearly numbers and percentages of contemplation and preparation
smokers are shown in Table 18.1.

Fig. 18.3 KNHANES questionnaire

18.2.2 Measures

For the measures, we used one question from the KNHANES questionnaire to classify
the preparation for smoking cessation, as shown in Fig. 18.3 [7].
Other studies using KNHANES data divided respondents into three categories and
studied the stages of change [8], but we divided them into two categories:
preparation for smoking cessation and no thought of quitting. Answers 1 and 2 were
treated as preparation for smoking cessation, and answers 3 and 4 were regarded as
no intention of smoking cessation.

18.3 Experiment and Result

18.3.1 Framework of Experiment

Our experimental framework is shown in Fig. 18.4. We collected data from the
KNHANES for adult smokers aged 18 years or older from 2013 to 2015, removed missing
values and outliers through data preprocessing, and performed the chi-square test
through composite-sample cross-analysis.
Feature selection extracts a new set of attributes to provide the necessary
information and, in some cases, better information. Therefore, statistically
significant (p < 0.05) results were extracted and used to analyze the
characteristics related to people who were considering smoking cessation [9, 10].
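For illustration only (the authors used SPSS with the survey design taken into
account), the following Python sketch shows the kind of chi-square test of
independence applied to one of the 2 × 2 tables; the counts are taken from
Table 18.3 (anti-smoking policy at public institutions), and the resulting p-value
will be close to, but not exactly, the reported complex-sample value.

from scipy.stats import chi2_contingency

# Rows: pre-contemplation, preparation; columns: policy "yes", policy "no".
observed = [[628, 713],
            [432, 390]]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p:.4f}")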

Fig. 18.4 Experimental framework



18.3.2 Experimental Result

Our result is shown in Table 18.2. A total of 2,851 people were selected for
pre-contemplation and preparation. The gender distribution was found to be 2,407
(84.4%) for males and 444 (15.6%) for females. The age range was 46.36 ± 15.13 and

Table 18.2 The general characteristics of the pre-contemplation and preparation groups

Variable                       Value                        Pre-contemplation (%)   Preparation (%)   P-value
Gender                         Male                         1,490 (87.5)            917 (86.1)        0.260
                               Female                       271 (12.5)              173 (13.9)
Age                            19–24                        114 (8.9)               105 (13.0)        0.008
                               25–49                        913 (60.1)              559 (58.3)
                               50–74                        662 (28.8)              391 (27.2)
                               75–80                        72 (2.2)                35 (1.5)
                               Mean ± SD                    46.99 ± 14.98           45.35 ± 15.32
Education                      Middle school or lower       480 (20.0)              246 (17.7)        0.342
                               High school                  716 (44.5)              478 (47.0)
                               College graduate or higher   565 (35.5)              366 (35.4)
Marriage                       Married                      1,404 (74.4)            821 (67.5)        0.001
                               Single                       357 (25.6)              269 (32.5)
BMI, kg/m2                     ≤18.4                        68 (4.0)                41 (3.9)          0.788
                               18.5–24.0                    1,048 (57.9)            653 (59.4)
                               ≥25.0                        645 (38.0)              396 (36.7)
Physical activity at company   Intense                      11 (6.1)                11 (14.0)         0.047
                               Moderate                     120 (69.6)              78 (72.0)
                               Both                         41 (24.3)               17 (14.0)
Exercises (per week)           Walking                      980 (54.9)              562 (51.7)        0.055
                               Muscle                       38 (2.4)                28 (2.9)
                               Both                         402 (24.9)              319 (29.9)
                               None                         341 (17.8)              181 (15.5)
Stress                         Yes                          522 (31.2)              367 (35.4)        0.037
                               No                           1,239 (68.8)            723 (64.6)
EQ-5D                          0.0–0.999                    500 (25.1)              309 (25.8)        0.703
                               1                            1,257 (74.9)            781 (74.2)
Alcohol (per month)            Never or under a glass       406 (20.3)              232 (19.2)        0.489
                               Over a glass                 1,355 (79.7)            858 (80.8)

the range was from 31 to 61 years; the age difference was statistically significant
(p = 0.008). The proportion of single smokers was higher in the preparation group
than that of married smokers, and this result was significant in this study.
Physical activity at the company was statistically significant, with moderate
activity the most common level, and perceived stress was more frequent in the
preparation group. Physical activity at the company was examined to determine its
relevance to preparation for smoking cessation.
Smoking-related characteristics of the pre-contemplation and preparation groups are
shown in Table 18.3. A general smoking-related characteristic was the high exposure
to secondhand smoke in the two groups; it was not statistically significant and
appeared to be not relevant. Likewise, secondhand smoke at home was reported to be
not relevant, whereas the anti-smoking policy in public places was statistically
significant. The smoking initiation age was 20.26 ± 5.84, which means that smoking
typically started between 15 and 25 years of age. The number of cigarettes smoked
per day was 14.21 ± 7.97, so a typical smoker consumed roughly 14 cigarettes a day.
Smoking initiation age was examined to determine its relevance to preparation for
smoking cessation.

Table 18.3 Smoking-related characteristics of the pre-contemplation and preparation groups

Variable                                   Value       Pre-contemplation (%)   Preparation (%)   P-value
Anti-smoking policy at workplace           Yes         744 (55.7)              496 (60.4)        0.060
                                           No          597 (44.3)              326 (39.6)
Anti-smoking policy at house               Yes         202 (61.1)              128 (64.6)        0.474
                                           No          126 (38.9)              73 (35.4)
Anti-smoking policy at public institution  Yes         628 (47.0)              432 (53.5)        0.007
                                           No          713 (53.0)              390 (46.5)
Smoking started age                        6–13        7 (1.5)                 4 (2.1)           0.944
                                           14–16       82 (16.8)               59 (16.8)
                                           17–19       172 (38.8)              119 (37.5)
                                           20–69       225 (42.9)              169 (43.6)
                                           Mean ± SD   20.14 ± 5.82            20.43 ± 5.88
No. of cigarettes smoked (per day)         1–5         182 (9.6)               259 (23.4)        0.000
                                           6–10        476 (25.6)              335 (29.8)
                                           11–20       942 (55.6)              441 (42.0)
                                           21–60       161 (9.1)               55 (4.7)
                                           Mean ± SD   14.86 ± 7.63            14.79 ± 7.76

18.4 Conclusion

Smoking is one of the major causes of various diseases and deaths. For this reason,
the government of the Republic of Korea started smoking cessation programs and has
tried to lower the smoking rate, but many people still smoke.
Based on this research, we expect that presenting the negative aspects of smoking
will help people who want to quit take a concrete step toward smoking cessation.

Acknowledgements This research was supported by Basic Science Research Program through the
National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future
Planning (No. 2017R1A2B4010826), supported by the KIAT (Korea Institute for Advancement
of Technology) grant funded by the Korea Government (MOTIE: Ministry of Trade Industry and
Energy) (No. N0002429).

References

1. Choi, H.S., Sohn, H.S., Kim, Y.H., Lee, M.J.: Factors associated with failure in the continuity
of smoking cessation among 6 month’s smoking cessation successes in the smoking cessation
clinic of public health center. J. Korea Acad. Ind. Coop. Soc. 13(10), 4653–4659 (2012)
2. Kim, D.H., Suh, Y.S.: Smoking as a disease. Korean J. Fam. Med. 30(7), 494–502 (2009)
3. Kim, E.S.: Smoking high risk group woman, out-of-school youth research on development of
smoking cessation service strategy results report (2016)
4. National Prevention, Health Promotion and Public Health Council. In: 2010 Annual Status
Report. http://www.hhs.gov/news/reports/nationalprevention2010report.pdf. Accessed July
2010
5. Ministry of Health & Welfare. Yearbook of Health and Welfare Statistics (2001). http://www.
moha.go.kr
6. Kim, H.O.: The effect of smoking cessation program on smoking cessation and smoking behav-
ior change of adult smokers. Commun. Nurs. 13(1) (2002)
7. Korea Centers for Disease Control and Prevention. Korea National Health and Nutrition Exam-
ination Survey Data. Korea National Health and Nutrition Examination Survey, 1 Mar 2015
8. Leem, A.Y., Han, C.H., Ahn, C.M., Lee, S.H., Kim, J.Y., Chun, E.M.: Factors associated with
stage of change in smoker in relation to smoking cessation based on the Korean National Health
and Nutrition Examination Survey II–V. PLoS One 12(5), e0176294 (2017)
9. Dash, M., Liu, H.: Feature selection for classification. Intell. Data Anal. 1(1–4), 131–156 (1997)
10. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res.
3(Mar), 1157–1182 (2003)
Chapter 19
An Efficient Semantic Document
Similarity Calculation Method Based
on Double-Relations in Gene Ontology

Jingyu Hu, Meijing Li, Zijun Zhang and Kaitong Li

Abstract Semantic text mining has been a challenging research topic in recent
years. Many studies focus on measuring the similarity of two documents with
ontologies such as Medical Subject Headings (MeSH) and Gene Ontology (GO). However,
most of this research considers only a single relationship in an ontology. To
represent documents more comprehensively, a semantic document similarity
calculation method is proposed, based on the Average Maximum Match algorithm with
double-relations in GO. The experimental results show that the double-relations
based similarity calculation method is better than traditional semantic similarity
measurements.

Keywords Double-relations · Semantic text similarity measure · Document


clustering · Gene ontology

J. Hu · M. Li (B) · Z. Zhang · K. Li
College of Information Engineering, Shanghai Maritime University, Shanghai, China
e-mail: mjli@shmtu.edu.cn
J. Hu
e-mail: Jingyu-Hu@outlook.com
Z. Zhang
e-mail: Zijun.Zhang1105@outlook.com
K. Li
e-mail: Kaitong-Li@outlook.com


19.1 Introduction

Recent years have witnessed the rapidly growing number of biological documents.
Classifying enormous literature efficiently is of vital significance for management
and reference consulting. Hence, biological text mining becomes important for auto-
matic classification, which is faster than traditional manual methods. At present,
many researchers focus on the study about text similarity measurement, such as
cross-lingual similarity measure [1], contextual similarity measure [2], passage-
based similarity measure [3], and page-count-based similarity measure [4], and so
on. Except content-based similarity calculation methods, ontology-based text sim-
ilarity calculation methods are commonly used for semantic text mining. Current
semantic similarity measures can be roughly divided into path-based methods [5–8]
and IC-based methods including Lord [9–11]. And many researchers began to apply
these methods to biological text data analyses [12–14, 16]. With these algorithms,
the term similarity used in text clustering can be extended from one-to-one to
many-to-many comparisons. Their common feature is that the inter-document
calculation focuses on a single relationship [12–14]. Nevertheless, it is known
that different relations, such as 'is-a', 'part-of', and 'regulate', exist among
Gene Ontology (GO) [15] terms, so the role of the other relations in clustering is
neglected.
To consider more of the possible relationships between two documents, we propose a
new method to calculate document similarity based on double-relations in the
ontology. With these double-relations combined, a document's structure can be
described more specifically.

19.2 Proposed Method

The whole process on biological document similarity calculation and clustering is


shown in Fig. 19.1.

Fig. 19.1 The workflow of semantic biological document similarity calculation and clustering

19.2.1 Semantic Biology Feature Extraction and Similarity


Calculation

To represent a document with semantic information, GO terms were extracted from
documents as semantic biology features. It has been reported that the transitive
relation, which offers a theoretical basis for connecting paths, exists in both
'is-a' and 'part-of', while for other relations such as 'regulate' it remains to be
proved.
Double-relations Similarity between Two Features. In this paper, we used two
kinds of semantic similarity measurement methods: path-based similarity measures
and weighted-information-content-based similarity.
The path-based similarity algorithm used in this research is WP [5]. WP introduces
the nearest common ancestor for comparing the similarity between two terms. If
there are multiple reachable paths between two terms, the ancestor giving the
shortest path through the Lowest Common Ancestor (LCA) is chosen as the nearest
ancestor term c. The similarity goes to zero when there is no common ancestor
between two terms.
Double-relations Similarity between Two Documents. Generally, a document
corresponds to a set of ontology terms rather than a single term, so the similarity
must be redistributed over a multi-term comparison. A new double-relations text
similarity scheme is proposed, which is based on the Average Maximum Match (AMM).
Following the AMM proposal, the single-relation similarity between documents Cm and
Cn can be defined as
WSim(Cm, Cn) = [Σi Simt(Cmi, Cn) + Σj Simt(Cnj, Cm)] / (m + n),   i = 1, …, m,  j = 1, …, n          (19.1)

Simt(Cma, Cn) = MAX(F(Cma, Cnj)),   a ∈ [1, m],  j = 1, …, n                                          (19.2)

where Cmi refers to the ith term of document Cm with m terms and Cnj to the jth
term of document Cn with n terms. F(Ci, Cj) is the similarity between terms Ci and
Cj computed by one of the path-based or IC-based algorithms above. Afterwards, to
make this fundamental AMM module applicable to the double-relations condition,
Eq. (19.1) is rearranged so as to pick out the biggest similarity value between
term x and term y over the different relations R. The resulting algorithm is as
follows:

Algorithm multi-relations similarity with AMM


Input: document C with m terms
document C' with n terms
multiple relations R
Function: sim(C, C’, R), SimTmp(C, C’, R)
Output: similarity between two documents
Main Function sim(C, C’, R)
1. simx =SimTmp(C, C’, R)
2. simy =SimTmp(C’, C, R)
3. RETURN (simx+simy)/(m+n)
Sub Function SimTmp(C, C’, R)
1.  FOR each term x in C
2.      maxx ← 0
3.      FOR each term y in C’
4.          FOR each relation r in R
5.              maxx ← max(F(x, y, r), maxx)
6.          END FOR
7.      END FOR
8.      simx ← simx + maxx
9.  END FOR
10. RETURN simx
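The algorithm above can be written compactly in Python. This sketch is not from the
original paper: the term-level similarity function term_sim(x, y, r) (for example,
the WP measure restricted to relation r) is assumed to be supplied by the caller,
and wp_similarity in the usage comment is a hypothetical name.

def doc_similarity(doc_a, doc_b, relations, term_sim):
    # Double-relations AMM: average of the best matches taken in both directions.
    def one_side(src, dst):
        total = 0.0
        for x in src:
            best = 0.0
            for y in dst:
                for r in relations:
                    best = max(best, term_sim(x, y, r))  # best match over terms and relations
            total += best
        return total
    return (one_side(doc_a, doc_b) + one_side(doc_b, doc_a)) / (len(doc_a) + len(doc_b))

# Example (hypothetical term-level measure):
# sim = doc_similarity(["GO:0005515"], ["GO:0007052"], ["is-a", "part-of"], wp_similarity)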

19.2.2 Document Clustering and Annotation

The efficiency of the proposed approach can be demonstrated by clustering and
annotation. There are various clustering methods for a similarity matrix, including
spectral clustering [17], Markov clustering [18], DBSCAN [19], and so on.
Because of space restrictions, only spectral clustering is described here: the
Laplacian matrix L is calculated from the double-relations similarity matrix S, the
eigenvector matrix V is obtained from L, and the clustering result is produced by
feeding V and the number of clusters into the K-means [20] algorithm with five
iterations.
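A minimal sketch of that pipeline (an illustration under assumptions, not code from
the paper: it uses the unnormalized Laplacian, and n_init=5 in scikit-learn's
KMeans stands in for the five runs mentioned above):

import numpy as np
from sklearn.cluster import KMeans

def spectral_clusters(S, k):
    # S: symmetric document-similarity matrix; k: number of clusters.
    D = np.diag(S.sum(axis=1))          # degree matrix
    L = D - S                           # unnormalized graph Laplacian
    _, eigvecs = np.linalg.eigh(L)      # eigenvectors sorted by ascending eigenvalue
    V = eigvecs[:, :k]                  # embedding from the k smallest eigenvalues
    return KMeans(n_clusters=k, n_init=5, random_state=0).fit_predict(V)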
Two methods, term frequency (TF) and term frequency–inverse document fre-
quency (TF–IDF) are chosen to annotate each cluster. At first, term frequency is cal-
culated to form a rough outline of the cluster. Furthermore, to filter common words
interference and get a more specific description, term frequency–inverse document
frequency (TF–IDF) [21] is utilized. TF–IDF is a popular term-weighting numerical
statistic to measure the importance of words to documents in text mining, which
extracts words that frequently appear in one document while taking a relatively low
proportion to other corpora.

The definition of the TF–IDF of term w in the ith document of the document set D is
as follows:

TFIDF(w, D, i) = TF(w, Di) × IDF(w, D)                    (19.3)

TF(w, Di) = count(w, Di) / size(Di),   IDF(w, D) = log(size(D) / N)                    (19.4)

where N refers to the number of documents that contain w.
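A direct transcription of (19.3)–(19.4) into Python (illustrative only; documents
are represented as lists of terms, and the logarithm follows the formula as
printed):

import math

def tf(w, doc):
    # Term frequency: count of w in the document divided by the document size.
    return doc.count(w) / len(doc)

def idf(w, docs):
    # Inverse document frequency: log of (number of documents / documents containing w).
    n = sum(1 for d in docs if w in d)
    return math.log(len(docs) / n) if n else 0.0

def tfidf(w, docs, i):
    return tf(w, docs[i]) * idf(w, docs)

docs = [["protein", "binding", "nucleus"], ["mitotic", "spindle", "protein"]]
print(tfidf("nucleus", docs, 0))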

19.3 Experiments and Results

19.3.1 Dataset and Evaluation Methods

The experimental dataset contains 848 documents, equally divided into four
different classes. An 848 × 848 similarity matrix S is obtained with the selected
similarity measurement.
To assess the performance of document clustering, three evaluation measures,
precision, recall, and F-measure, are chosen to examine the difference between the
test results and the original cluster labels. Precision refers to the proportion of
documents in the same cluster that are truly similar. Recall is defined as the
probability that similar documents are clustered into the same cluster. The
F-measure considers precision and recall together. The formulas are as follows:

Precision = TP / (TP + FP)                    (19.5)

Recall = TP / (TP + FN)                       (19.6)

F = 2 × Precision × Recall / (Precision + Recall)                    (19.7)
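Equations (19.5)–(19.7) translate directly into code; a minimal helper (not from
the paper, with purely illustrative counts) is:

def precision_recall_f(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

print(precision_recall_f(tp=75, fp=25, fn=25))  # illustrative counts only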

19.3.2 Results and Analysis

Cluster Annotation: We used the TF–IDF annotation method to label the text
clusters. Compared with the part-of-relation-based similarity measurement, the
double-relations-based and is-a-relation-based similarity measurements describe the
text cluster more comprehensively (Table 19.1).
Comparison with other methods: In the experiment, we compared the proposed
double-relations similarity measurement with the other two similarity measurements
based on a single relation. As the experiment shows, the double-relations
similarity measurement ranks first.

Table 19.1 Cluster 1 annotation result with top five TF–IDF terms

Original                               Double-relations               Is-A only                                 Part-of only
Protein binding                        Mitotic spindle organization   Protein binding                           Nucleus
Mitotic spindle assembly               Centrosome                     Mitotic spindle assembly                  Protein binding
Mitotic spindle midzone                Protein binding                Mitotic sister chromatid segregation      Mitotic sister chromatid segregation
Mitotic spindle elongation             Mitotic spindle midzone        Condensed nuclear chromosome kinetochore  ESCRT III complex
Microtubule cytoskeleton organization  Nucleus                        Nucleus                                   Mitotic spindle pole body

Table 19.2 Clustering quality evaluation among re-weighting, Is-A, and Part-of only
Precision Recall F-measure
Similarity measure with double-relations 0.7489 0.7505 0.7497
Similarity measure with is-a 0.6712 0.6980 0.6843
Similarity measure with part-of 0.3585 0.8045 0.4960

Compared with the single-relation methods, the evaluation result of the
double-relations method is clearly better in precision and F-measure, while its
recall is slightly lower than that of the part-of-only measure. From this result,
it can be concluded that the clustering quality of medical documents can be
improved by taking the ontology's double-relations into consideration (Table 19.2).

19.4 Conclusion

In this paper, a text similarity calculation method based on double-relations in GO
and the AMM is proposed. As shown in the experiment, combining double-relations
similarity plays a significant positive role in document clustering, and the
multi-relation scheme based on AMM improves the clustering quality to a degree. In
future research, more relationships from different kinds of ontologies will be
considered.

Acknowledgements This study was supported by the National Natural Science Foundation of
China (61702324).

References

1. Danushka, B., Georgios, K., Sophia, A.: A cross-lingual similarity measure for detecting
biomedical term translations. PLoS One 10(6), 7–15 (2015)
2. Spasić, I., Ananiadou, S.: A flexible measure of contextual similarity for biomedical terms.
In: Pacific Symposium on Biocomputing Pacific Symposium on Biocomputing, pp. 197–208
(2005)
3. Rey-Long, L.: Passage-based bibliographic coupling: an inter-article similarity measure for
biomedical articles. PLoS One 10(10), 6–10 (2015)
4. Chen, C., Hsieh, S., Weng, Y.: Semantic similarity measure in biomedical domain leverage
Web Search Engine. In: 2010 Annual International Conference of the IEEE Engineering in
Medicine and Biology (2010)
5. Wu, Z., Palmer, M.: Verbs semantics and lexical selection. In: Proceedings of the 32nd Annual
Meeting of the Associations for Computational Linguistics (ACL’94), pp. 133–138 (1994)
6. Leacock, C., Chodorow, M.: Filling in a sparse training space for word sense identification. In:
Proceedings of the 32nd Annual Meeting of the Associations for Computational Linguistics
(ACL94), pp. 248–256 (1994)
7. Li, Y., Bandar, Z., McLean, D.: An approach for measuring semantic similarity between words
using multiple information sources. IEEE Trans. Knowl. Data Eng. Bioinform. 15(4), 871–882
(2003)
8. Choudhury, J., Kimtani, D.K., Chakrabarty, A.: Text clustering using a word net-based
knowledge-base and the Lesk algorithm. Int. J. Comput. Appl. 48(21), 20–24 (2012)
9. Lord, P., Stevens, R., Brass, A., Goble, C.: Investigating semantic similarity measures across
the gene ontology: the relationship between sequence and annotation. Bioinformatics 19(10),
1275–1283 (2003)
10. Resnik, O.: Semantic similarity in a taxonomy: an information-based measure and its applica-
tion to problems of ambiguity and natural language. J. Artif. Intell. Res. Bibliometr. 19(11),
95–130 (1999)
11. Lin, D.: Principle-based parsing without overgeneration. In: 31st Annual Meeting of the Associ-
ation for Computational Linguistics, pp. 112–120. Association for Computational Linguistics,
USA (1993)
12. Zhang, X., Jing, L., Hu, X., et al.: A comparative study of ontology based term similarity
measures on PubMed document clustering. In: International Conference on Database Systems,
pp. 115–126. Springer, Berlin, Heidelberg (2007)
13. Jing, Z., Yuxuan, S., Shengwen, P., Xuhui, L., Hiroshi, M., Shanfeng, Z.: MeSHSim: an
R/Bioconductor package for measuring semantic similarity over MeSH headings and MED-
LINE documents. J. Bioinform. Comput. (2015) (BioMed Central)
14. Logeswari, S., Kandhasamy, P.: Designing a semantic similarity measure for biomedical doc-
ument clustering. J. Med. Imaging Health Inform. 5(6), 1163–1170 (2015)
15. The Gene Ontology Resource Home. http://geneontology.org/. Accessed 27 Feb 2019
16. Wang, J.Z., Du, Z., Payattakool, R., Yu, P.S., Chen, C.F.: A new method to measure the semantic
similarity of go terms. Bioinformatics 23(10), 1274–1281 (2007)
17. Zare, H., Shooshtari, P., Gupta, A., Brinkman, R.: Data reduction for spectral clustering to
analyze high throughput flow cytometry data. BMC Bioinform. (2010)
18. Dongen, V.: A cluster algorithm for graphs. In: Information Systems, pp. 1–40. CWI (2000)
19. Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters
in large spatial databases with noise. In: KDD’96 Proceedings of the Second International
Conference on Knowledge Discovery and Data Mining, pp. 226–231 (1996)
20. MacKay, D.: An example inference task: clustering. In: Information Theory, Inference and
Learning Algorithms, pp. 284–292. Cambridge University Press (2003)
21. Robertson, S.: Understanding inverse document frequency: on theoretical arguments for IDF.
J. Doc. 60(5), 503–520 (2004)
Chapter 20
Analysis of the Dispersion of Impact
Point of Smart Blockade and Control
Ammunition System Based on Monte
Carlo Method

Yang Li, Chun-lan Jiang, Ming Li and Shu-chun Xie

Abstract In order to study the dispersion as well as analyze the influencing factors
of the impact point of the smart blockade and control ammunition system, a sim-
plified ballistic model of the parachute–payload system is established. Based on the
Monte Carlo method, the dispersion range of impact point is acquired, and the main
sensitive factors affecting the dispersion of impact point are compared and analyzed.
Simulation results show that the lateral dispensing velocity of the dispenser and the
factors of the parachute are the sensitive factors that affect the dispersion of the impact
point, in which the factors of the parachute are the most obvious. The research in
this paper provides reference and basis for the design of smart ammunition system
of the airborne dispenser.

Keywords Parachute–payload · Monte Carlo method · Impact point dispersion

20.1 Introduction

In future wars, it is crucial to effectively attack and block key targets or areas. With
the development and application of microcomputer technology, wireless communi-
cation technology, sensor technology, and network technology, various new types
of regional blockade ammunition are emerging. Therefore, the research on airborne
dispensers, rockets, and other platforms to adapt to the modern battlefield of the new
regional blockade ammunition system has become a hot spot [1].
The combat mission of the smart blockade and control ammunition system is to
blockade the key areas on the battlefield. The smart blockade and control ammuni-
tion studied in this paper is scattered by the platform of the airborne dispenser, and

Y. Li (B) · C. Jiang · M. Li · S. Xie


Key Laboratory of Explosion Science and Technology,
Beijing Institute of Technology, Beijing 100081, China
e-mail: 827853550@qq.com


the deceleration and attitude adjustment are realized by parachute. The dispersion of
the impact point has a direct impact on the network communication between ammu-
nitions, thus affecting the combat effectiveness of the whole system. Therefore, it is
necessary to strengthen the research on the dispersing technique, and the dispersion of
the impact point. In this paper, the dynamic model of the parachute–payload system
is established and the flight simulation experiment is carried out by the Monte Carlo
method. The range of distribution and dispersion of the impact point are obtained.
The main sensitive factors affecting the dispersion of the impact point are compared
and analyzed.

20.2 Method Description

20.2.1 Process and Principles of Dispersion

The airborne dispenser loaded with bullets is divided into three cabins, namely, the
front, the middle and the rear. Four ammunition contained in each cabin are divided
into upper and lower layers, 12 ammunition in total. The arrangement of the six
ammunitions in the lower layer is shown in Fig. 20.1. The process of dispersion is
as follows.
1. First, ammunition no. 1 and no. 2 in the rear cabin are thrown laterally at the
speed of v1 ;
2. After the time delay Δt 1 , no. 3 and no. 4 in the middle cabin are thrown laterally
with speed of v2 ;
3. After the time delay Δt2, no. 5 and no. 6 in the front cabin are thrown laterally
with speed v3.

(a) The arrangement of ammunitions (b) The initial dispersal position


Fig. 20.1 The arrangement of the six ammunitions in the lower layer and the initial dispersal position

20.2.2 Dynamic Model

Assume that the mass of the parachute–payload system remains unchanged and that the
mass of the parachute is negligible. The air drag of the system can be simplified
as a pulling force acting opposite to the direction of the velocity. All moments
and forces that have little influence on the motion are ignored (Fig. 20.2).
Based on the above assumptions, and Newton’s law and kinematics theorem,
the simplified dynamical model of the parachute–payload system is established as
follows [2]:

dvx/dt = Fx/m = (Fb + Fp)/(2m) · (vx − wx)/vr
dvy/dt = Fy/m = (Fb + Fp)/(2m) · vy/vr − g
dvz/dt = Fz/m = −(Fb + Fp)/(2m) · (vz − wz)/vr
dx/dt = vx = v cos ϕ cos θ
dy/dt = vy = v sin θ
dz/dt = vz = v sin ϕ cos θ

v = vr + w,   vr = √((vx − wx)² + vy² + (vz − wz)²)
Fb = −(1/2) ρ (CA)b vr²,   Fp = −(1/2) ρ (CA)p vr²
θ = arctan(vy / √(vx² + vz²)),   ϕ = arctan(vz / vx)                    (20.1)

where v, vx , v y , and vz denote the resultant velocity, horizontal velocity, vertical


velocity, and lateral velocity of the system, respectively. wx and wz stand for the

Y
y

C x

O X

Dispersal process Coordinate system

Fig. 20.2 The ground coordinate system


190 Y. Li et al.

velocity of horizontal and lateral crosswind. Fb and F p represent the aerodynamic


drags of payload and parachute. (C A)b and (C A) p denote resistance characteristics
of payload and parachute. θ , ϕ indicate trajectory inclination angle and trajectory
deflection angle. m, g and ρ indicate, respectively, the mass of the payload, gravity
acceleration, and air density.

20.2.3 Monte Carlo Method

The basic idea and principle of using the Monte Carlo method to simulate the impact
point dispersion is as follows [3].
There are m mutually independent random variables Xi (i = 1, 2, 3, …, m). According
to the distribution of each random variable Xi, n random numbers x1, x2, …, xn
obeying the normal distribution N(μi, σi²) are generated, where μi and σi are,
respectively, the mean and standard deviation of the normal distribution of the
random variable Xi. The random variable sampling for each random perturbation
factor is completed in this way, so that the flight trajectory and the dispersion
of the impact point under the random disturbance factors can be simulated and
calculated.
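An illustrative Python sketch of this procedure (not from the paper): the
disturbance values follow Table 20.1 but the "±" ranges are treated here as
3-sigma bounds, the payload mass is an assumed value, the parachute is assumed open
from the start (the opening delay Δt is not modeled), and the trajectory is a
simple Euler integration of the translational model of Eq. (20.1).

import numpy as np

rng = np.random.default_rng(0)
rho, g, mass = 1.225, 9.8, 10.0     # air density, gravity; the payload mass is an assumed value

def impact_point(vx0, vy0, vz0, CAp, CAb, wx=0.0, wz=0.0, H=100.0, dt=0.005):
    # Euler integration of the simplified parachute-payload model until ground impact.
    vx, vy, vz, x, y, z = vx0, vy0, vz0, 0.0, H, 0.0
    while y > 0.0:
        rx, ry, rz = vx - wx, vy, vz - wz
        vr = np.sqrt(rx * rx + ry * ry + rz * rz)
        k = -rho * (CAb + CAp) * vr / (4.0 * mass)   # (Fb + Fp) / (2 m vr), as in Eq. (20.1)
        vx += k * rx * dt
        vy += (k * ry - g) * dt
        vz += k * rz * dt
        x += vx * dt; y += vy * dt; z += vz * dt
    return x, z

# Monte Carlo sampling of the disturbance factors of Table 20.1.
pts = np.array([impact_point(vx0=rng.normal(200.0, 2.0 / 3),
                             vy0=rng.normal(0.0, 2.0 / 3),
                             vz0=rng.normal(16.0, 2.0 / 3),
                             CAp=rng.normal(0.6, 0.025 / 3),
                             CAb=rng.normal(0.018, 0.0006 / 3),
                             wx=rng.normal(0.0, 0.5 / 3),
                             wz=rng.normal(0.0, 0.5 / 3)) for _ in range(200)])
print(pts.mean(axis=0), pts.std(axis=0))             # centre and spread of the impact points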

20.3 Simulation

Based on the mathematical model of the parachute–payload system and the existing
literature [4–7], the random disturbance factors affecting the dispersion of the
impact point are the initial dispersing conditions, the dispersion of the
parachute–payload system, and the dispersion of the random wind. The initial
dispersing altitude is set to 100 m, and the horizontal and lateral dispersing
velocities are 200 m/s and 16 m/s. The standard coordinates of the impact point on
the x–z plane are (128.641 m, 10.2913 m). The values of the random disturbance
factors are shown in Table 20.1.

Table 20.1 The value of random disturbance factors

Factors      Values
vx0          200 ± 2 m/s
vy0          ±2 m/s
vz0          16 ± 2 m/s
(CA)p        0.6 ± 0.025 m2
(CA)b        0.018 ± 0.0006 m
Δt           0.15 ± 0.036 s
Crosswind    ±0.5 m/s

20.3.1 The Effect of Initial Dispersing Conditions

Taking a single initial dispersing condition as a disturbance factor, 500 samples are
selected (Fig. 20.3 and Table 20.2).
The disturbances of horizontal and vertical velocities mainly affect the dispersion
in the x-direction. The disturbance of the initial lateral dispersing velocity influences
dispersion in the z-direction, and the influence degree is more obvious.

20.3.2 The Effect of the Parachute–Payload System

The factors of the parachute–payload system include the resistance characteristics


of the parachute and payload body, and the parachute opening delay time (Fig. 20.4
and Table 20.3).
The influence of the parachute factors on the impact point dispersion is much
greater than that of the aerodynamic characteristic of the payload.

20.3.3 The Effect of Disturbance of Random Wind

Assume the random wind as a breeze, so that the crosswinds are wx = ±0.5 m/s, wz
= ±0.5 m/s (Fig. 20.5 and Table 20.4).

(a) Effect of horizontal velocity (b) Effect of vertical velocity (c) Effect of lateral velocity
Fig. 20.3 Dispersion of impact point caused by initial dispersing conditions

Table 20.2 Dispersion deviation of the impact point caused by initial dispersing conditions
Disturbance factors Deviation in x-direction (m) Deviation in z-direction (m)
Horizontal velocity −0.6 to 1.0 −0.17 to 0.19
Vertical velocity −0.6 to 0.8 −0.04 to 0.06
Lateral velocity −0.23 to 0.19 −3.56 to 4.15

(a) Effect of the parachute (b) Effect of the payload (c) Effect of opening delay time
Fig. 20.4 Dispersion of impact point caused by the parachute–payload system

Table 20.3 Dispersion deviation of the impact point caused by the parachute–payload system
Disturbance factors                   Deviation in x-direction (m)   Deviation in z-direction (m)
Parachute resistance characteristic   −11.76 to 14.05                −0.94 to 1.12
Payload resistance characteristic     −0.34 to 0.29                  −0.027 to 0.023
Parachute opening delay time          11.83 to 48.38                 0.95 to 3.87

(a) Effect of horizontal crosswind (b) Effect of lateral crosswind (c) Effect of comprehensive
random wind

Fig. 20.5 Dispersion of impact point caused by the disturbance of random wind

Table 20.4 Dispersion deviation due to random wind disturbance


Disturbance factors Deviation in x-direction (m) Deviation in z-direction (m)
Horizontal crosswind −0.32 to 0.31 −0.05 to 0.06
Lateral crosswind −0.07 to 0.06 −1.13 to 1.11
Comprehensive wind −0.42 to 0.43 −1.27 to 1.35

Fig. 20.6 Dispersion of impact point caused by comprehensive disturbance factors

The influence of the crosswind disturbance in the lateral direction on the impact
point dispersion is more obvious.

20.3.4 The Effect of Comprehensive Disturbance Factors

Taking the influence of all the above random perturbation factors into consideration,
10,000 random samples are selected for the simulation test (Fig. 20.6).
The dispersion range is an elliptical shape centered on the standard point. The
closer to the center position, the greater the spread probability, and the farther away
from the center position, the smaller the spread probability. The initial dispensing
velocity of the dispenser and the factors of the parachute are sensitive factors affecting
the dispersion of the impact point, in which the parachute factors are the most obvious.

20.4 Conclusion

With the established mathematical model and by virtue of Monte Carlo method,
the flight simulation is carried out. The dispersion regularity of the impact point is
obtained. The influence of random disturbance factors on the dispersion of the impact
point is analyzed, and the most obvious factor is obtained. The results show that the
initial dispensing velocity of the dispenser and the parachute factors are sensitive
factors, in which the factors of the parachute are the most obvious.

References

1. Sun, C., et al.: Development of smart munitions. Chin. J. Energ. Mater. 6 (2012)
2. Hang, Z.-P., et al.: The exterior ballistics of projectiles, 1st edn. Beijing Institute of Technology
Press, Beijing (2008)
3. Rubinstein, R.Y., Kroese, D.P.: Simulation and the Monte Carlo Method, vol. 10. Wiley (2016)
4. Mathews, J.H., Fink, K.D.: Numerical methods using MATLAB, vol. 3. Pearson Prentice Hall,
Upper Saddle River, NJ (2004)
5. Kong, W.-H., Jiang, C.-L., Wang, Z.-C.: Study for bomblets distribution on ground of aerial
cluster bomb. J. Aero Weapon. 4, 43–46 (2005)
6. Zeng B.-Q., Jiang, C.-L., Wang, Z.-C.: Research on the ballistic fall point spread of the parachute-
bomb system. J. Proj. Rockets Missile Guid. 30(1), 1–4 (2010)
7. Zhang, G., Feng, S.: Study on point dispersing of conductive fiber based on exterior ballistic
model. Trans. Beijing Inst. Technol. 36(12), 1216–1220 (2016)
Chapter 21
Analysis of the Trajectory Characteristics
and Distribution of Smart Blockade
and Control Ammunition System

Yang Li, Chun-lan Jiang, Liang Mao and Xin-yu Wang

Abstract In order to study the ballistic trajectory and distribution of the smart block-
ade and control ammunition system, a simplified ballistic model of the parachute—
payload system is established. Flight trajectory characteristics and distribution of the
smart blockade and control ammunition system are obtained and analyzed, and the
distribution and the area of the blockade zone of the 12 ammunition are simulated.
Simulation results show that the dispersing altitude has the greatest influence on the
falling time. The initial horizontal velocity of the dispenser and the resistance char-
acteristics together with the opening time delay of the parachute have an important
impact on the horizontal displacement and the lateral displacement, respectively.
The study of this paper provides an effective analysis method for the design of the
weapon system.

Keywords Parachute–payload · Trajectory characteristics · Impact points


distribution

21.1 Introduction

With the breakthrough of various key technologies, various new types of blockade
ammunition have been introduced. At present, countries are increasing their efforts
in research and development of new blockade munitions [1]. Under various weather
and geographical conditions, the airborne dispenser can distribute many kinds and
large quantities of blockade ammunition to multiple areas in a single distribution,
which has wide blockade area and high reliability.

Y. Li (B) · C. Jiang · L. Mao · X. Wang


Key Laboratory of Explosion Science and Technology, Beijing Institute of Technology, Beijing
100081, China
e-mail: 827853550@qq.com


As an important part of the airborne dispenser, the dispersal system has a deci-
sive influence on the operational effectiveness of the ammunition. Therefore, it is
necessary to strengthen the research on the dispersing technique, the trajectory of
ammunition and the distribution of impact points. Based on the work of predecessors
[2, 3], this paper uses the Newton mechanics method to establish the dynamic model
of the smart blockade and control ammunition system and elaborates trajectory sim-
ulation programs for simulation. Then the trajectory and velocity, as well as the
displacement of the parachute–payload system, are obtained, and the main factors
affecting the trajectory characteristics are analyzed. The impact points distribution
and blockade range of 12 ammunition are acquired.

21.2 Method Description

21.2.1 Dispersal Process

The smart blockade and control ammunitions carried by dispenser are divided into
the upper and lower layers. Each dispenser has 3 cabins, namely, the front, the middle
and the rear. Each cabin is equipped with 4 ammunition, 12 ammunition in total. The
dispersal process and the arrangement of the six ammunitions in the lower layer are shown
in Fig. 21.1.

21.2.2 Mathematical and Dynamic Modeling

In order to simplify the complexity of the calculation in the analysis, the following
basic assumptions are made.

Fig. 21.1 The dispersal process and the arrangement of the six ammunitions in the lower layer: (a) system composition and dispersal process; (b) ammunitions layout at the initial time

1. The parachute opens instantly and ignores the process of straightening the
parachute rope and parachute inflation, and the changes in the quality and the
attitude of the parachute are ignored;
2. The pull force of the parachute on the payload is always parallel to the motion
direction of the barycenter, and the point of action is on the barycenter of the
payload;
3. The lift force, Coriolis acceleration, Magnus force, and Magnus moment are
ignored, and all moments and forces that have less influence on the payload
motion are omitted;
4. Assume the gravity acceleration is constant (g = 9.8 m/s2 ), and the direction is
vertical downward.
5. The ground inertial coordinate system (O-XYZ) and the reference coordinate
system (C-xyz) are established, and the simplified dynamical model of the
parachute–payload system is derived [4, 5] (Fig. 21.2).
Using Newton’s law and kinematics theorem gives

$m\,\dfrac{d\mathbf{v}}{dt} = \sum \mathbf{F}_i, \qquad \sum \mathbf{F}_i = \mathbf{F}_b + \mathbf{F}_p + \mathbf{G}$    (21.1)

Fig. 21.2 The force of the parachute–payload system under the coordinate system (left: force situation; right: coordinate system)
$$
\begin{cases}
\dfrac{dv_x}{dt} = \dfrac{F_x}{m} = \dfrac{F_b + F_p}{2m}\cdot\dfrac{v_x - w_x}{v_r} \\[4pt]
\dfrac{dv_y}{dt} = \dfrac{F_y}{m} = \dfrac{F_b + F_p}{2m}\cdot\dfrac{v_y}{v_r} - g \\[4pt]
\dfrac{dv_z}{dt} = \dfrac{F_z}{m} = \dfrac{F_b + F_p}{2m}\cdot\dfrac{v_z - w_z}{v_r} \\[4pt]
\dfrac{dx}{dt} = v_x = v\cos\varphi\cos\theta \\[4pt]
\dfrac{dy}{dt} = v_y = v\sin\theta \\[4pt]
\dfrac{dz}{dt} = v_z = v\sin\varphi\cos\theta
\end{cases}
\qquad
\begin{cases}
\mathbf{v} = \mathbf{v}_r + \mathbf{w} \\[4pt]
v_r = \sqrt{(v_x - w_x)^2 + v_y^2 + (v_z - w_z)^2} \\[4pt]
F_b = -\tfrac{1}{2}\rho\,(CA)_b\,v_r^2 \\[4pt]
F_p = -\tfrac{1}{2}\rho\,(CA)_p\,v_r^2 \\[4pt]
\theta = \arctan\dfrac{v_y}{\sqrt{v_x^2 + v_z^2}} \\[4pt]
\varphi = \arctan\dfrac{v_z}{v_x}
\end{cases}
\quad (21.2)
$$

where, m, g, and ρ represent, respectively, the mass of the payload, gravity accelera-
tion, and air density. v, vx , v y , and vz denote the resultant velocity, horizontal velocity,
vertical velocity, and lateral velocity of the system, respectively. wx and wz indicate
the velocity of crosswind from the direction of x and z. (C A)b and (C A) p denote
resistance characteristics of payload and parachute. Fb and F p stand for the aerody-
namic drags of payload and parachute. θ, ϕ indicate trajectory inclination angle and
trajectory deflection angle.
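As a rough illustration of how the simplified model in Eq. (21.2) can be integrated numerically, the sketch below uses a fixed-step RK4 scheme in Python. All parameter values (mass, resistance characteristics, wind, initial conditions) are illustrative placeholders rather than the paper's settings, and the right-hand side follows the reconstructed form of Eq. (21.2).

```python
import numpy as np

# Placeholder parameters (illustrative, not the paper's values).
m, rho, g = 15.0, 1.225, 9.8          # mass (kg), air density (kg/m^3), gravity (m/s^2)
CA_b, CA_p = 0.018, 0.6               # resistance characteristics of payload / parachute (m^2)
wx, wz = 0.0, 0.0                     # crosswind components (m/s)

def deriv(state):
    """Right-hand side of the simplified parachute-payload model, Eq. (21.2)."""
    x, y, z, vx, vy, vz = state
    vr = np.sqrt((vx - wx)**2 + vy**2 + (vz - wz)**2)
    drag = -0.5 * rho * (CA_b + CA_p) * vr**2      # F_b + F_p (negative, i.e. opposing motion)
    ax = drag / (2 * m) * (vx - wx) / vr
    ay = drag / (2 * m) * vy / vr - g
    az = drag / (2 * m) * (vz - wz) / vr
    return np.array([vx, vy, vz, ax, ay, az])

def rk4_step(state, dt):
    k1 = deriv(state)
    k2 = deriv(state + 0.5 * dt * k1)
    k3 = deriv(state + 0.5 * dt * k2)
    k4 = deriv(state + dt * k3)
    return state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Illustrative dispersal condition: 100 m altitude, 200 m/s horizontal, 16 m/s lateral.
state = np.array([0.0, 100.0, 0.0, 200.0, 0.0, 16.0])
t, dt = 0.0, 0.001
while state[1] > 0:                   # integrate until ground impact (y = 0)
    state = rk4_step(state, dt)
    t += dt
print(f"falling time {t:.2f} s, impact point x = {state[0]:.1f} m, z = {state[2]:.1f} m")
```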

21.3 Simulation Analysis

21.3.1 Influence of Dispensing Altitude

Dispersal altitudes of H = 100, 150, and 200 m are selected for simulation and calculation [6, 7] (Fig. 21.3).
The landing time and horizontal displacement increase with the dispersal altitude, while the
lateral displacement, steady falling velocity, and final falling angle are almost
invariable under different altitudes.

21.3.2 Influence of Initial Horizontal Velocity

Initial horizontal velocities of dispenser are 200, 220, 240, 260, 280, 300 m/s
(Fig. 21.4).

Fig. 21.3 Ballistic parameters under different altitudes: (a) three-dimensional flight trajectory; (b) x–y projection; (c) velocity versus time (v–t); (d) trajectory inclination angle versus time (θ–t)

The larger the initial horizontal velocity of the dispenser, the larger the horizontal
displacement is and the smaller the lateral displacement.

21.3.3 Influence of Resistance Characteristic of Parachute

The simulation conditions are parachute resistance characteristics of (CA)p = 0.4, 0.5, and 0.6 m² (Fig. 21.5).
Falling time and angle increase with the increase of (CA)p , while landing veloc-
ity, horizontal displacement, and lateral displacement decrease with the increase of
(CA)p .

Fig. 21.4 Ballistic parameters under different initial horizontal velocities: (a) three-dimensional flight trajectory; (b) x–y projection; (c) velocity versus time (v–t); (d) trajectory inclination angle versus time (θ–t)

21.3.4 Influence of Opening Delay Time of Parachute

The parachute opening delay times are 0.1, 0.2, 0.3, 0.4, and 0.5 s, respectively (Fig. 21.6).
Opening delay time only affects the horizontal and lateral displacement of the
landing. The longer the opening time delay, the larger the horizontal displacement
and lateral displacement.

21.3.5 Calculation Results of the Distribution of Impact Points

The spacing between two adjacent ammunitions is set in the range of 20–50 m. vx0 = 200 m/s,
vy0 = 0 m/s, vz0 = ±16, ±30, ±16, ±30, ±16, ±30 m/s, H = 100 m, m = 15 kg,

Fig. 21.5 Ballistic parameters under different resistance characteristics of the parachute: (a) three-dimensional flight trajectory; (b) x–y projection; (c) velocity versus time (v–t); (d) trajectory inclination angle versus time (θ–t)

(CA)p = 0.6 m2 , (CA)b = 0.018 m2 . The dispersing time interval of different cabins
is 0.2 s.
The trajectories of 12 ammunition do not overlap with each other. The mini-
mum and maximum distance between the two adjacent impact points are 20.583 and
38.38 m. The 12 ammunition communicate through the network in the way as shown
in Fig. 21.7b. Assuming the detection radius of ammunition is 50 m, the blockade
area is about 12324 m2 .
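One hedged way to post-process such results is sketched below: it computes the spacing between adjacent impact points and a grid-sampled estimate of the area covered by the detection circles. The impact coordinates are hypothetical, and the circle-union estimate is only one possible definition of a covered area; the exact definition behind the 12324 m² figure depends on the network communication pattern of Fig. 21.7b.

```python
import numpy as np

# Hypothetical impact coordinates (m); the simulated values are not listed in the text.
impact_points = np.array([[i * 30.0, (-1) ** i * 15.0] for i in range(12)])
r_detect = 50.0                                   # assumed detection radius (m)

# Spacing between adjacent impact points (points sorted along the x axis).
pts = impact_points[np.argsort(impact_points[:, 0])]
adjacent = np.linalg.norm(np.diff(pts, axis=0), axis=1)
print("min / max adjacent spacing (m):", adjacent.min(), adjacent.max())

# Grid-sampled area covered by the union of the detection circles (1 m x 1 m cells).
xmin, zmin = impact_points.min(axis=0) - r_detect
xmax, zmax = impact_points.max(axis=0) + r_detect
X, Z = np.meshgrid(np.arange(xmin, xmax, 1.0), np.arange(zmin, zmax, 1.0))
covered = np.zeros(X.shape, dtype=bool)
for px, pz in impact_points:
    covered |= (X - px) ** 2 + (Z - pz) ** 2 <= r_detect ** 2
print("covered-area estimate (m^2):", int(covered.sum()))
```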

Fig. 21.6 Ballistic parameters under different opening delay times: (a) three-dimensional flight trajectory; (b) x–y projection; (c) velocity versus time (v–t); (d) trajectory inclination angle versus time (θ–t)

Fig. 21.7 The trajectory and impact point distribution of 12 ammunition in the airborne dispenser: (a) flight trajectory; (b) impact points distribution

21.4 Conclusion

In this paper, a model for calculating the trajectory of the smart blockade and control ammuni-
tion system is established, and the effects of different conditions on the trajectory
characteristics and the distribution of impact points of the ammunition system are
compared and analyzed. The study shows that the dispersing altitude has the greatest
influence on the falling time, and the initial horizontal velocity of the dispenser, the
resistance characteristics and opening time delay of the parachute have an important
impact on the horizontal displacement and the lateral displacement.

References

1. Yang, J., He, G., Zhang, Z.: Common terminal-sensitive submunition with function of blockade
and control. In: 2016 5th International Conference on Advanced Materials and Computer Science
(ICAMCS 2016). Atlantis Press (2016)
2. Sun, C., Lu, Y.: Analysis of submunition distribution of an unguided cluster munition. J. Proj.,
Rocket., Missile Guid. 30(1), 1–4 (2010)
3. Fang, Y., Jiang, J.: Stochastic exterior ballistic model of submunitions and its monte carlo
solution. Trans. Beijing Inst. Technol. 29(10), 850–853 (2009)
4. Dmitrievskii, A.A.: Exterior Ballistics. Moscow Izdatel Mashinostroenie (1979)
5. Hang, Z., et al.: The Exterior Ballistics of Projectiles, 1st edn. Beijing Institute of Technology
Press, Beijing (2008)
6. White, F.M., Wolf, D.F.: A theory of three-dimensional parachute dynamic stability. Aircraft
5(1), 86–92 (1968)
7. Klee, H., Allen, R.: Simulation of Dynamic Systems with MATLAB® and Simulink®. CRC Press
(2018)
Chapter 22
Study on Lee-Tarver Model Parameters
of CL-20 Explosive Ink

Rong-qiang Liu, Jian-xin Nie and Qing-jie Jiao

Abstract With the development of MEMS (Micro-electromechanical System),


higher requirements have been put forward for the micro-charge of booster explosive.
Direct write technology can directly write the explosive ink into the hole or groove.
Compared with traditional charge, the explosive ink has small size, low forming den-
sity, and obvious non-ideal detonation characteristics. It is impossible to calibrate the
parameters of JWL EOS (Equation of State) and reaction rate equation by cylinder
test and Laplace analysis. In order to determine the Lee-Tarver model parameters of
CL-20 explosive ink with a forming density of 1.45 g/cm3 (93% CL-20, 3% GAP,
2% NC), we write CL-20 explosive ink to groove with different sizes and measure the
detonation velocities. The detonation parameters and JWL EOS parameters of CL-20
explosive ink are calculated by Explo-5 software. Simulation models are established
with AUTODYN software according to the detonation velocity test. Combining with
finite element simulation and test results, Lee-Tarver model parameters of CL-20
explosive ink are fitted. According to the determined Lee-Tarver model parameters
of CL-20 explosive ink, a simulation model of critical size test is established. The
calculation results show that the critical size of CL-20 explosive ink in this study
ranges from 0.1 to 0.2 mm.

Keywords MEMS · CL-20 explosive ink · JWL EOS · Lee-Tarver model

R. Liu · J. Nie (B) · Q. Jiao


State Key Laboratory of Explosion Science and Technology, Beijing Institute of Technology,
Beijing 100081, China
e-mail: niejx@bit.edu.cn
R. Liu
e-mail: 1226702247@qq.com
Q. Jiao
e-mail: jqj@bit.edu.cn


22.1 Introduction

Modern warfare makes weapon systems develop toward miniaturization and intel-
lectualization. Since the 1990s, the technology of MEMS (Microelectromechanical
System) has developed rapidly [1]. How to realize the precise charge of micro-
explosive in explosive train and ensure that the explosive can initiate and boost reli-
ably has become a difficult problem, which restricts the development of the booster
sequence of MEMS. The direct write deposition of explosives is to write the explo-
sive ink directly on the base surface of the MEMS device through a digital controlled
direct write device. When the solvent in ink evaporates, the explosive solids will be
deposited in the predetermined position, which has the characteristics of safety, batch
deposition and accurate graphics. It has become a potential micro-charging method
for MEMS devices.
Explosive ink is a multicomponent mixing system consisting of explosive solid,
binder system (including binder and solvent) and other additives (other high-energy
explosive components or additives), usually in suspension or colloidal state. Since
2005, Fuchs [2] has developed EDF series of CL-20-based explosive ink, and suc-
cessfully loaded it into the MEMS fuze by direct write technology, and verified its
performance of detonation propagation in complex structures. In 2010, Ihnen [3] dis-
persed RDX in the binder system of cellulose acetate butyrate or polyvinyl acetate
to obtain the RDX-based explosive ink formulation. In 2013, Zhu [4] designed CL-
20/polyvinyl alcohol/ethylcellulose/water/isopropanol-based explosive ink. In 2014,
Stec III [5] reported the formulation of CL-20/polyvinyl alcohol/ethyl cellulose ink
which can be used in MEMS devices. In 2016, Wang [6] developed CL-20/GAP-
based explosive ink, which can be used for micro-scale charge, and the critical deto-
nation size is less than 0.4 × 0.4 mm. In 2018, Xu [7] developed CL-20-based explosive
ink with ethyl cellulose (EC) and polyazide as binders, ethyl acetate as a solvent,
and studied its critical detonation propagation characteristics.
The critical size of explosive refers to the minimum charge size of explosive
for stable detonation. The critical size of CL-20 is significantly lower than that of
RDX and HMX, which means it is suitable for preparing explosive ink. At present,
research on CL-20 explosive ink mainly focuses on formulation design and experi-
ment, seldom on simulation. In the finite element simulation calculation, JWL EOS
is generally used to describe the external function of explosive ink and its detonation
products. The JWL EOS parameters of explosive are usually calibrated by the cylin-
der test method proposed by Kury [8]. However, it is difficult for the explosive ink
to realize the structure of large size charge. So the parameters of JWL EOS cannot
be obtained by cylinder test. In the MEMS explosive train, because of the small size
and the diameter effect of the charge, Lee-Tarver model is needed to describe the
nonideal detonation behavior.
In order to determine the Lee-Tarver model parameters of CL-20 explosive ink
with a forming density of 1.45 g/cm3 (93% CL-20, 3% GAP, 2% NC), we write CL-
20 explosive ink to groove with different sizes and measure the detonation velocities.
Explo-5 software is used to calculate the detonation and JWL EOS parameters of

CL-20 explosive ink. Besides, simulation models are established with AUTODYN
software according to the detonation velocity test. Combining with finite element
simulation and test results, the Lee-Tarver model parameters of CL-20 explosive ink
are fitted. According to the determined Lee-Tarver model parameters, a simulation
model is established to calculate the critical size of CL-20 explosive ink.

22.2 Detonation Velocity Test of CL-20 Explosive Ink

Due to the effect of high temperature and high pressure in the detonation reaction of
explosive ink, a sudden change of electrical signal is produced at the electrode probe.
The instantaneous information sensed by the probe is transformed into a pulse signal
with obvious waveform by RLC network, which is input into the transient recorder
as a timing pulse signal. Then, after the input signal is amplified and impedance
transformed, the analog signal is converted into a digital signal in A/D and sent to
memory for storage. After further D/A conversion, the analog signal is transmitted to
the ordinary oscilloscope for display in the form of the analog voltage. The data stored
in the transient recorder can also be read into the RAM of the computer through the
special interface inserted in the expansion slot of the computer, and then transmitted
or printed. The main test equipment and schematic diagram are shown in Fig. 22.1.
The length and width of the CL-20 explosive ink are kept at 100 mm and 1 mm, respectively,
and the charge thickness is 0.2, 0.4, 0.8, 1 and 1.5 mm. The material
of the base plate is 2024Al, and the size of base plate is 180 × 40 × 12 mm. The
material of the cover plate is the same as the base plate, whose size is 180 × 40
× 10 mm. The electric detonator is used to detonate CL-20 explosive ink. The test
device is shown in Figs. 22.2, 22.3, and 22.4.
After signal processing, the average detonation velocity of CL-20 explosive ink
with different sizes is calculated as shown in Table 22.1.
As is shown in Fig. 22.5, the experimental data can be fitted with a correlation
coefficient of 0.997.

Fig. 22.1 Principle diagram of detonation velocity measurement



Fig. 22.2 Test model

Fig. 22.3 Forming CL-20 explosive ink

Fig. 22.4 Probe distribution

Table 22.1 Average detonation velocities with different charge sizes

Charge size (mm)   Average velocity (m/s)
1 × 0.2            6330
1 × 0.4            6537
1 × 0.8            6743
1 × 1              6767
1 × 1.5            6853

$D_j = 6871.52 - 852.67\,e^{-x/0.4354}$    (22.1)

Here, Dj is detonation velocity and x is the thickness of CL-20 explosive ink. So


we can get that the limit velocity of CL-20 explosive ink is 6871 m/s.
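The exponential fit of Eq. (22.1) can be reproduced from the Table 22.1 data with a standard least-squares routine, as sketched below; the initial guess p0 is arbitrary.

```python
import numpy as np
from scipy.optimize import curve_fit

# Measured average detonation velocities from Table 22.1.
thickness = np.array([0.2, 0.4, 0.8, 1.0, 1.5])       # mm
velocity = np.array([6330, 6537, 6743, 6767, 6853])   # m/s

def model(x, d_lim, b, c):
    """D_j(x) = d_lim - b * exp(-x / c), the functional form of Eq. (22.1)."""
    return d_lim - b * np.exp(-x / c)

popt, _ = curve_fit(model, thickness, velocity, p0=(6900, 900, 0.5))
d_lim, b, c = popt
residuals = velocity - model(thickness, *popt)
r2 = 1 - np.sum(residuals**2) / np.sum((velocity - velocity.mean())**2)
print(f"D_j = {d_lim:.2f} - {b:.2f} * exp(-x/{c:.4f}),  R^2 = {r2:.3f}")
```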

Fig. 22.5 The fitting curves for detonation velocity versus charge thickness

22.3 Numerical Simulation of CL-20 Explosive Ink

22.3.1 Physical Model and Numerical Model

In the numerical model, we reduce the length of CL-20 explosive ink to 40 mm in


order to improve the calculation efficiency. The detonator is replaced by a 0.5 cm
high cylindrical charge, which is only used to detonate the CL-20 explosive ink. The
other structure parameters were consistent with the experimental settings.
The finite element model mainly includes five parts: air, CL-20 explosive ink,
detonator, base plate, and cover plate. In order to prevent grid distortion, Euler ele-
ments are used for the air region, the CL-20 explosive ink and the initiating charge. Lagrange
elements are used for the base plate and cover plate, and a fluid–solid coupling algorithm
is used to describe the interaction between the fluid and solid. The air region has an
outflow boundary. The bolt connection between the base plate and the cover plate
is simplified, and the bonding constraints are added on both sides of the base plate
and the cover plate. A monitoring point is set every 0.5 cm along the groove length
direction in Euler domain to monitor the variation of pressure with time. The element
size is 0.005 cm (Figs. 22.6 and 22.7).

22.3.2 Material Model

Accurately describing the characteristics of the material is the basis for ensuring
reliable calculation results. The materials involved in this study are the constraining shell,
the high explosive, and air.
High Explosive. The detonator is replaced by a 0.5 cm high cylindrical charge,
which is only used to detonate the CL-20 explosive ink. The detonation process of

Fig. 22.6 CL-20 explosive ink in base plate

Fig. 22.7 Numerical model

Table 22.2 JWL parameters of CL-20 explosive ink

ρ (g/cm³)   D (m/s)   P (GPa)   A (GPa)   B (GPa)   R1     R2     ω      E (kJ/cm³)
1.45        7583.9    21.38     635.52    18.86     5.26   1.57   0.41   8.45

the explosive is neglected, and the expansion process of the product is described by
the JWL EOS, which is

$p(V, E) = A\left(1 - \dfrac{\omega}{R_1 V}\right)e^{-R_1 V} + B\left(1 - \dfrac{\omega}{R_2 V}\right)e^{-R_2 V} + \dfrac{\omega E}{V}$    (22.2)

Here, p is the pressure of detonation products; V is the relative volume v/v0 ; E is


the internal energy; and A, B, C, R1, R2, ω are empirical parameters determined by
detonation experiments.
Based on BKW equation, the detonation parameters and JWL EOS parameters of
CL-20 explosive ink with a density of 1.4 g/cm3 are calculated by Explo-5 software,
as is shown in Table 22.2.
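As a quick illustration of the tabulated parameters, the sketch below evaluates the JWL pressure of Eq. (22.2) at a few relative volumes, using the Table 22.2 values and taking E as the listed detonation energy. This is only an evaluation of the equation itself, not part of the AUTODYN setup.

```python
import numpy as np

# JWL parameters for CL-20 explosive ink from Table 22.2 (A, B in GPa, E in kJ/cm^3).
A, B, R1, R2, omega, E0 = 635.52, 18.86, 5.26, 1.57, 0.41, 8.45

def jwl_pressure(V, E=E0):
    """Pressure (GPa) from Eq. (22.2) at relative volume V = v/v0.
    E is the internal energy per unit volume; 1 kJ/cm^3 = 1 GPa, so the units are consistent."""
    return (A * (1 - omega / (R1 * V)) * np.exp(-R1 * V)
            + B * (1 - omega / (R2 * V)) * np.exp(-R2 * V)
            + omega * E / V)

for V in (1.0, 2.0, 4.0, 7.0):
    print(f"V = {V:3.1f}   p = {jwl_pressure(V):8.3f} GPa")
```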
The diameter of CL-20 explosive ink is between the critical dimension and the
limit dimension. The detonation behavior of the charge is different from the CJ
detonation. In addition to the JWL equation of state, which describes the states of the product
and the unreacted explosive, an ignition–combustion–fast reaction rate equation
(Lee-Tarver model) is introduced to describe the reaction mechanism in the reaction
zone [9], which is
$\dfrac{dF}{dt} = I(1-F)^b\left(\dfrac{\rho}{\rho_0} - 1 - a\right)^x + G_1(1-F)^c F^d p^y + G_2(1-F)^e F^g p^z$    (22.3)

Here, F is the fraction reacted; t is the time in μs; p is the pressure in Mbar; ρ
is current density in g/cm3 , ρ 0 is the initial density; I, x and b are the parameters to
control the ignition term; a is the critical compression to prevent the ignition. Only
when the compression is ρ/ρ 0 > 1 + a, can the charge be ignited; G1 , c, d, and
y control the early growth of the reaction after ignition; G2 , e, g and z determine
the rate of high-pressure reaction. According to the meanings of Lee-Tarver model
parameters and Li’s [10] work, G1 , G2 , and z are taken as variables, and the rest of
parameters are fixed, as is shown in Table 22.3.
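The reaction rate of Eq. (22.3) with the Table 22.3 parameters can be evaluated directly, as sketched below. Real hydrocode implementations add switching logic (e.g. cut-offs on F) around the three terms; this sketch only evaluates the bare expression, and the initial density is taken as the forming density quoted above.

```python
# Lee-Tarver ignition-and-growth rate of Eq. (22.3) with the Table 22.3 parameters.
# Pressure is in Mbar and time in microseconds, following the variable definitions above.
I, b, a, x = 7.43e11, 2 / 3, 0.0, 20.0
G1, c, d, y = 1500.0, 2 / 3, 1 / 3, 2.0
G2, e, g_exp, z = 400.0, 1 / 3, 1.0, 3.0

def reaction_rate(F, p, rho, rho0=1.45):
    """dF/dt (1/us) for fraction reacted F, pressure p (Mbar), current density rho (g/cm^3)."""
    ignition = I * (1 - F) ** b * max(rho / rho0 - 1 - a, 0.0) ** x
    growth = G1 * (1 - F) ** c * F ** d * p ** y
    fast = G2 * (1 - F) ** e * F ** g_exp * p ** z
    return ignition + growth + fast

# Example: a partially reacted element at 0.1 Mbar with roughly 10% compression.
print(reaction_rate(F=0.3, p=0.1, rho=1.6))
```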
Air. The air in Euler grids is described by ideal gas state equation, which is

p = (γ − 1)ρ E g (22.4)

where γ is the adiabatic exponent (for the ideal gas, there is γ = 1.4); ρ is density,
and the initial density of air is 0.001225 g/cm3 ; the initial pressure is 105 Pa; Eg is
gas specific thermodynamic energy.
2024Al. The material parameters of 2024 aluminum are from the AUTODYN
material library and summarized in Table 22.4. The Dynamics response behavior of
2024Al was described by Johnson–Cook strength model and Shock state equation.
The shock EOS is the Mie-Gruneisen form of EOS that uses the shock Hugoniot as
reference.

Table 22.3 Lee-Tarver model parameters of CL-20 explosive ink

Parameter      I        b    a   x    G1     c    d    y   G2    e    g   z
Fixed/variable f        f    f   f    v      f    f    f   v     f    f   v
Value          7.43e11  2/3  0   20   1500   2/3  1/3  2   400   1/3  1   3

Note: f means that the parameter is fixed; v means that the parameter is variable

Table 22.4 The model parameters of 2024Al

Shock EOS:       ρ = 2.785 g/cm³,  γ = 2,  C0 = 5328 m/s,  S1 = 1.338,  Tr = 300 K
Strength model:  G = 27.6 GPa,  A = 0.265 GPa,  B = 0.42 GPa,  n = 0.34,  C = 0.015,
                 m = 1,  Tm = 775 K,  ε̇0 = 1 s⁻¹

$p - p_H = \dfrac{\gamma}{v}\left(e - e_H\right)$    (22.5)
Here, p is the pressure, γ is the Gruneisen constant, v is the specific volume, and
e is the specific internal energy. Subscript H denotes the shock Hugoniot, which is
defined as the locus of all shocked states for each material. Here, Shock EOS need the
p-v Hugoniot. This Hugoniot is obtained from the U-u Hugoniot or the relationship
between shock and particle velocities.

U = C0 + su (22.6)

Here, C 0 and s are empirical parameters.


The Johnson–Cook strength mode is an empirical constitutive equation regarding
the deformation of metals with large strain, high strain rates, and high temperatures.
    
$\sigma = \left(A + B\varepsilon^n\right)\left(1 + C\ln\dfrac{\dot{\varepsilon}}{\dot{\varepsilon}_0}\right)\left[1 - \left(\dfrac{T - T_r}{T_m - T_r}\right)^m\right]$    (22.7)

Here, σ is the yield stress or flow stress, A is the static yield stress, B is the
hardening constant, ε is the strain, n is the hardening exponent, C is the strain rate
constant, ε̇ is the strain rate, ε̇0 is the reference strain rate, T is the temperature, T r
is the reference temperature, T m is the melting point, and m is the thermal softening
exponent.
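The Johnson–Cook flow stress of Eq. (22.7) with the Table 22.4 constants can be evaluated as below; the reference temperature is taken as the 300 K value listed in the table, and the test strain, strain rate and temperature inputs are arbitrary.

```python
import numpy as np

# Johnson-Cook parameters for 2024Al from Table 22.4 (stresses in GPa, temperatures in K).
A, B, n, C, m = 0.265, 0.42, 0.34, 0.015, 1.0
T_r, T_m, eps_dot0 = 300.0, 775.0, 1.0

def flow_stress(eps, eps_dot, T):
    """Flow stress (GPa) of Eq. (22.7) for plastic strain eps, strain rate eps_dot (1/s), temperature T (K)."""
    strain_term = A + B * eps ** n
    rate_term = 1 + C * np.log(eps_dot / eps_dot0)
    temp_term = 1 - ((T - T_r) / (T_m - T_r)) ** m
    return strain_term * rate_term * temp_term

print(flow_stress(eps=0.1, eps_dot=1e4, T=400.0))
```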

22.3.3 Numerical Simulation Results

Calculation Method of Detonation Velocity in Simulation. According to the gauge points set by the model, the pressure histories of these points are recorded. The pressure histories for the 1.5 mm charge thickness are shown in Fig. 22.8.
It can be seen from Fig. 22.8 that the stable detonation points are Gauge #5–10. The
pressure peak times are obtained, and then the detonation velocity can be calculated
according to their position spacing of 0.2 cm. The calculated detonation velocity of
CL-20 explosive ink in 1.5 mm deep channel is shown in Table 22.5.
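The velocity extraction from the gauge peak times can be reproduced as follows, using the Table 22.5 arrival times and the 0.2 cm gauge spacing.

```python
import numpy as np

# Peak arrival times (us) at gauges #5-#10 from Table 22.5; gauge spacing is 0.2 cm.
peak_times = np.array([1.5095, 1.822, 2.1341, 2.4461, 2.7581, 3.07])
spacing_cm = 0.2

intervals = np.diff(peak_times)                         # us between adjacent gauges
velocities = spacing_cm * 1e-2 / (intervals * 1e-6)     # m/s
print("per-interval D (m/s):", np.round(velocities).astype(int))
print("average D (m/s):", round(velocities.mean()))
```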
Detonation Velocity of CL-20 Explosive Ink at Different Channel Thickness.
The comparison of the computational and experimental detonation velocity is shown

Table 22.5 Detonation velocity of CL-20 explosive ink in 1.5 mm deep channel
Gauge #5 #6 #7 #8 #9 #10 Average
Peak time (μs) 1.5095 1.822 2.1341 2.4461 2.7581 3.07
Time interval (μs) 0.3125 0.3121 0.312 0.312 0.3119 0.3121
Dj (m/s) 6400 6408 6410 6410 6412 6408

Fig. 22.8 The pressure histories of gauge points in the simulation test

Table 22.6 Comparison of detonation velocity between simulation and test with different charge thickness

H (mm)   Ds (m/s)   Dt (m/s)   Deviation (%)
0.2      5709       6330       9.81
0.4      5921       6537       9.42
0.8      6173       6743       8.45
1        6220       6767       8.08
1.5      6408       6853       6.49

in Table 22.6 in which H is the charge thickness of CL-20 explosive ink, and Ds and
Dt are, respectively, the detonation velocities of the simulation and the test.
As Table 22.6 shows, the detonation velocity of the CL-20 explosive ink increases
with the charge thickness increasing. The deviation between the calculated detonation
velocity and the experimentation is within 10%. The experimental measurement
deviation of the detonation velocity will be greater in the smaller size. It proves that
Lee-Tarver model is suitable to describe the diameter effect of CL-20 explosive ink
in small size.
Critical Size. According to the determined Lee-Tarver model parameters, a numeri-
cal model with 0.1 mm thick CL-20 explosive ink is established to explore the critical
size. The pressure histories of gauge points are recorded, as is shown in Fig. 22.9.
Distance between each gauge point is 0.05 cm.
As can be seen from Fig. 22.9, the detonation pressure decreases with the increase
of the detonation depth and the detonation eventually extinguishes. When the shock
wave interacts on the CL-20 explosive ink, some of the explosives react because of
the high pressure. As a result, the pressure decreases slowly in Gauge #1–#5. Gauge
#6 begins, the low shock wave pressure cannot stimulate the explosive to react, which
decreases exponentially and eventually the detonation extinguishes.

Fig. 22.9 The pressure histories of gauge points in the simulation test

22.4 Conclusion

(1) The detonation velocity of CL-20 explosive ink is measured under different
charge sizes. The formula of detonation velocity with charge size was fitted:
$D_j = 6871.52 - 852.67\,e^{-x/0.4354}$. The limit detonation velocity was about
6871 m/s.
(2) Based on BKW equation, the detonation parameters and JWL EOS parameters
of CL-20 explosive ink with a density of 1.4 g/cm3 are calculated by Explo-5
software.
(3) Lee-Tarver model can describe the diameter effect of small-sized charge. Com-
bining with finite element simulation and test results, a set of Lee-Tarver model
parameters which can describe the detonation velocity–size relationship of CL-
20 explosive ink is obtained.
(4) According to the determined parameters of the Lee-Tarver model, the critical
thickness of CL-20 explosive ink is calculated under the existing charge width
and constraints ranging from 0.1 to 0.2 mm.

References

1. Wang, K.-M.: Study on Interface Energy Transfer Technology of Explosive Train. Beijing
Institute of Technology, Beijing (2002)
2. Fuchs, B.E., Wilson, A., Cook, P., et al.: Development, performance and use of direct write
explosive inks. In: The 14th International Detonation Symposium, Idaho (2010)
3. Ihnen, A., Lee, W.: Inkjet printing of nanocomposite high explosive materials for direct write
fuzing. In: The 54th Fuze Conference, Kansas (2010)
4. Zhu, Z.-Q., Chen, J., Qiao, Z.-Q., et al.: Preparation and characterization of direct write explo-
sive ink based on CL-20. Chin. J. Ener. Mater. 21(2), 235–238 (2013)

5. Stec III, D., Wilson, A., Fuchs, B.E., et al.: High explosive fills for MEMS devices. U.S. Patent
8 636 861, 28 Jan 2014
6. Wang, D., Zheng, B., Guo, C., et al.: Formulation and performance of functional sub-micro
CL-20-based energetic polymer composite ink for direct-write assembly. RSC Adv. 6(113),
112 325–112 331 (2016)
7. Xu, C.-H., An, C.-W., Wu, B.-d., Wang, J.-y.: Performances and direct writing of CL-20 based
explosive ink. Init. Pyrotechn. 1, 41–44 (2018)
8. Kury, J.W., Hornig, H.C., Lee, E.L., et al.: Metal acceleration by chemical explosives. In: 4th
Symposium (Int) on Detonation
9. Tarver, C.M., Urtiew, P.A., Chidester, S.K.: Shock compression and initiation of LX-10. Pro-
pellants, Explos., Pyrotech. 18, 117–127 (1993)
10. Li, Y., Yang, X., Wen, Y., et al.: Determination of Lee-Tarver model parameters of JO-11C
explosive. Propellants, Explos., Pyrotech. 43, 1–10 (2018)
11. Ihnen, A., Fuchs, B., Petrock, A., et al.: Inkjet printing of nanocomposite high explosive
materials. In: The 14th International Detonation Symposium, NJ (2010)
Chapter 23
Optimal Design of Online Peer
Assessment System

Yeyu Lin and Yaming Lin

Abstract A feasible way to do a formative evaluation is to use peer assessment.


In other words, students play the role of evaluators to evaluate the work submitted
by others. However, the reliability of students’ rating is not guaranteed. Therefore,
we propose a new strategy to design and develop an online peer assessment system
to support the effective development of blended learning activities for engineering
courses under the new situation. Empirical research shows that teachers and students
are more satisfied with the system.

Keywords Peer assessment · Optimal design · Algorithm strategy

23.1 Introduction

Peer assessment is an effective process of communication and interaction. It is one


of the achievements of sociological learning theory and has been practiced in daily
teaching activities for many years and achieved good results [1]. It is not only a way of
evaluation, but also an innovative teaching method. It has been applied to the blended
learning in universities, which has brought positive influence to students [2]. In peer
assessment, students are required to evaluate the works submitted by several peers
as part of their homework tasks. Each student’s final score is obtained by combining
the information provided by peers. Of course, peer assessment also has shortcomings; for
example, the reliability and accuracy of raters' scores are not guaranteed [3], because

Y. Lin (B)
Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang
University, Fuzhou 350121, People’s Republic of China
e-mail: 109109850@qq.com
Y. Lin · Y. Lin
School of Computer and Control Engineering, Minjiang University, Fuzhou 350108, People’s
Republic of China
e-mail: leafmissyou@126.com


they lack teaching and professional knowledge. Teachers need to strengthen
monitoring and management, which will undoubtedly increase their burden.
Up to now, many researchers have been continually exploring how to improve the way of
peer assessment and have worked out some effective strategies [4–8]. This study
draws on the research results of the above scholars, analyses the shortcomings of the
existing peer assessment system, and proposes a comprehensive solution to meet the
teaching characteristics of engineering courses in our university to further enhance
the reliability and effectiveness of online peer assessment and reduce the workload
of teachers.

23.2 Total Idea of System Design

The main scenarios of online peer assessment system are as follows:


1. After the teacher logs in to the system, he chooses the questions from the ques-
tion bank and arranges the assignments. The assignments include objective and
subjective questions.
2. The student logs in the system to complete his homework within a specified
period of time, and his answer information will be stored in the database.
3. The teacher starts assignments scoring process in the system. Objective questions
in the assignments are automatically corrected by the system. The answers of the
subjective item are graded by the designated peers according to the algorithm
strategy.
4. The grader must complete the grading task within a specified period of time.
5. The teacher enters the system to start the score registration process. The system
calculates the score of each subjective question answer according to the specific
strategy, counts the total score of each assignment, and calculates the contribution
value of each grader.
6. During the appeal period, students can log in to the system to check the score of
this assignment. Each subjective question has a credible threshold (Referred to
as CT). If students think that the gap between the scores and those given by their
peers is greater than CT, they can submit objections to the system.
7. The teacher logs in to the system and re-scores the objectionable subjective
answers collected by the system. If there is no objection record, jump directly to
step (9).
8. The system recalculates the score of the objection subjective question, updates
the corresponding total score of the assignment, and recalculates the contribution
value of the relevant peer raters.
9. The peer assessment process is completed.

23.2.1 Premise Assumptions

The results of the mutual assessment can reflect the students’ normal learning level.
However, a single score may have some random noise. This system is usually used for
unit tests and homework of courses, and is not suitable for final exams, which require
strict grading. If most peer scorers are casual and irresponsible in their attitudes toward
scoring, the scoring information is meaningless. Even the best algorithm strategy
calculation results cannot reflect the students’ true results. Therefore, the total Grade
Contribution Value (referred to as TGCV, followed by a definition formula) of each
student should account for a large proportion of the normal scores when formulating
the course assessment methods. Only in this way can students take this work seri-
ously and consolidate their knowledge in the process of peer assessment. In practical
application, TGCV accounts for 50% of the normal results.

23.2.2 Development Environment

This system adopts Browser/Server structure, uses Spring + SpringMVC + Mybatis


technology to achieve, and combines MySQL to build the overall architecture. The
MyBatis framework is a powerful data access tool and solution, which combines
many concepts and methods of operating relational data [9].

23.3 Discussion of Key Algorithms

Due to the limitation of an article, this chapter only discusses the key algorithms for
implementing peer assessment in the system.
Each subjective question in the question bank consists of six parts: (1) topic, (2)
reference answer, (3) scoring standard, (4) trusted threshold—CT, (5) non-trusted
threshold—UCT, and (6) topic total score (TTS).

23.3.1 Scoring Criteria and Total Score

In order to implement peer evaluation, teachers need to quantify the evaluation criteria
carefully.
So there are many specific scores in the scoring criteria. Therefore, when designing
the system, the total score attribute is added to the question in the question bank,
which corresponds to the score in the scoring criteria. When assigning homework,
teachers can add multiple questions. The final score of the student’s homework is
converted into a percentage system.

23.3.2 Trusted Threshold (CT) and Untrusted Threshold


(UCT)

Each subjective question is different in difficulty and accuracy due to different con-
tents. CT reflects the allowable scoring error range. Suppose that student X's score
on an answer Y is V (X,Y), the result for the answer based on the algorithm strategy
is V (F,Y), the trusted threshold of the question is CT Y, and the untrusted
threshold is UCT Y. If |V(F,Y) − V(X,Y)| <= CT Y, it means that the student's evaluation
of the answer is credible. UCT reflects the lower limit of an unreliable
evaluation; that is, if |V(F,Y) − V(X,Y)| >= UCT Y, it means that the student's evaluation
of the answer is not credible. Usually, giving a poor evaluation to a good answer is
very likely not credible. So the two thresholds satisfy the following
inequalities:

$0 < CT_Y < \tfrac{1}{2}\,TTS_Y < UCT_Y < TTS_Y$    (23.1)
The teacher can set the CT and UCT of a question according to its grading
characteristics. If the score of a question is not prone to deviation, CT and UCT can
be set smaller, and vice versa. For example, for a 10 point question, CT can be set to
2 points; UCT is set to 6 points.

23.3.3 Formula Definition Used by the Algorithm

There, we define the function F(V (X,Z)) as follows:

$$
F\left(V_{(X,Z)}\right) =
\begin{cases}
TTS_Z, & \left|V_{(X,Z)} - V_{(F,Z)}\right| \le CT_Z \\
TTS_Z - \left(\left|V_{(X,Z)} - V_{(F,Z)}\right| - CT_Z\right), & CT_Z < \left|V_{(X,Z)} - V_{(F,Z)}\right| < UCT_Z \\
0, & \left|V_{(X,Z)} - V_{(F,Z)}\right| \ge UCT_Z
\end{cases}
\quad (23.2)
$$

Then, we define the reliability degree (CD) of student X in the Yth assignment
scoring as follows:

$CD_{(X,Y)} = \dfrac{1}{N}\displaystyle\sum_{Z \in D_{(X,Y)}} \dfrac{F\left(V_{(X,Z)}\right)}{TTS_Z}$    (23.3)

Wherein, D(X,Y) represents the set of answers assigned to student X in the Yth scoring;
N represents the number of answers in the set; TTS Z stands for the total
score of the question corresponding to the answer Z. Formulas (23.2) and (23.3)
embody the following scoring ideas: If the difference between student X's score
and the final score of the answer is within CT, student X's score on the answer is
satisfactory; if the difference is beyond the UCT, student X's score on the answer is
irresponsible and receives zero tolerance; if the difference is between CT and UCT, student X's
score on the answer is not fully satisfactory, and the unsatisfactory degree is quantified by
(|V(X,Z) − V(F,Z)| − CT Z).
This defines the grade contribution degree (referred to as GCD) of student X after the
Yth assignment grading is completed:


$GCD_{(X,Y)} = \displaystyle\sum_{y=1}^{Y} CD_{(X,y)}$    (23.4)

Assuming that the course has a total of M assignments, the student’s score con-
tribution value TGCV is defined as

$TGCV_X = \dfrac{GCD_{(X,M)}}{M} \times 100$    (23.5)
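Taken together, Eqs. (23.2)–(23.5) translate directly into code. The sketch below assumes the piecewise form of F given in Eq. (23.2); the dictionary field names are illustrative rather than the system's actual schema.

```python
def grade_contribution(v_student, v_final, ct, uct, tts):
    """F(V_(X,Z)) of Eq. (23.2) for one graded answer."""
    gap = abs(v_student - v_final)
    if gap <= ct:
        return tts
    if gap >= uct:
        return 0.0
    return tts - (gap - ct)

def reliability_degree(graded_answers):
    """CD(X,Y) of Eq. (23.3): average normalised contribution over the assigned answers."""
    n = len(graded_answers)
    total = sum(grade_contribution(a["v_student"], a["v_final"],
                                   a["ct"], a["uct"], a["tts"]) / a["tts"]
                for a in graded_answers)
    return total / n

def total_grade_contribution(cd_per_assignment):
    """GCD and TGCV of Eqs. (23.4)-(23.5) over M assignments."""
    gcd = sum(cd_per_assignment)
    return gcd, gcd / len(cd_per_assignment) * 100

# Example: one student graded three answers (each out of 10, CT = 2, UCT = 6).
answers = [{"v_student": 8, "v_final": 9, "ct": 2, "uct": 6, "tts": 10},
           {"v_student": 3, "v_final": 9, "ct": 2, "uct": 6, "tts": 10},
           {"v_student": 6, "v_final": 5, "ct": 2, "uct": 6, "tts": 10}]
print(reliability_degree(answers))
```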

23.3.4 Discussion of Algorithm Implementation Details

23.3.4.1 The Assignment of Scoring Tasks

Scoring task allocation in the system is based on answer granularity, not job granularity.
In this way, a student's homework may be evaluated by more students, reducing the
possibility of cheating in grading.
It is better that one answer is randomly assigned to K graders. Assuming that
each evaluator has the same scoring workload, the average number of assignments
evaluated by each student is not more than K. Generally, the larger the K value is,
the better the algorithm is. This system chooses K = 5 according to the experience
of many peer evaluators.

23.3.4.2 Calculation of the Credibility of Students in the First


Assignment Score

We can see from formulas (23.2) and (23.3) that the CD calculation of student X
needs V (F,Z) . After the first student assignment is submitted, the system requires
both teacher assessment and peer assessment. For answer Z, if the teacher’s grade
is V T , then V(F,Z) = VT . Therefore, the first time students participating in peer

assessment will not reduce the workload of teachers’ scoring. Its main role is to
produce the GCD(X,1) . The higher the GCD(X,1) , the higher the credibility of student
X’s evaluation.

23.3.4.3 Calculation Algorithm Strategy for V (F,Y)

For the answer Y in the ith homework (i >= 2), the system assigns K students to
correct the answer through the task allocation, without loss of generality. Assuming that
these K students are numbered S1, S2, …, SK, the pseudocode for the calculation of
V (F,Y ) is as follows:
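A minimal Python sketch of that selection strategy is given below; the grader identifiers and data structures are illustrative.

```python
def final_score(scores, gcd_history):
    """Select V_(F,Y): the score given by the grader with the highest previous GCD.

    scores       -- dict mapping grader id to the score given for this answer
    gcd_history  -- dict mapping grader id to the GCD accumulated before this assignment
    """
    best_grader = max(scores, key=lambda s: gcd_history.get(s, 0.0))
    return scores[best_grader]

# Example with K = 5 graders S1..S5.
scores = {"S1": 7, "S2": 9, "S3": 8, "S4": 6, "S5": 8}
gcd_history = {"S1": 2.1, "S2": 1.4, "S3": 2.7, "S4": 0.9, "S5": 2.3}
print(final_score(scores, gcd_history))   # -> 8, the score from S3 (highest GCD)
```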

The program pseudocode indicates that the grade of the student with the highest
GCD from the previous time is selected as the final score. The reason for choosing
this algorithm strategy is that the students with high GCD in the past are more likely
to be close to the real scores. The algorithm combines the formulas (23.2) and (23.3)
to reflect the following ideas: If the student scores well, they will get more trust;
more trust will promote more high credibility.

23.3.4.4 Others

There is a problem in the calculation algorithm of V (F,Y ) : for an answer X, if students


who have the high CD selected by the algorithm happen to make random corrections,
or all scorers of the answer have low CD, then the score generated by the algorithm
cannot reflect the truth. Therefore, in step 6 of the system flow scenario, if a student
disagrees with his or her score, he or she can submit complaints to the system during
the appeal period. The teacher will re-score the student’s answer.

In order to prevent students from submitting objections casually, the system has
designed the following strategies:
For an answer X, after the peer correction the score is V1, and the owner of the
answer is student Y, who thinks the answer deserves V2. Only when (V2 − V1) > CT X
does the system allow student Y to submit objection information, including V1 and V2,
which is reserved for teacher evaluation. The teacher's evaluation score for answer
X is V3. If |V3 − V1| <= CT X, the student will be deducted a certain score for this
assignment. The idea embodied in this strategy is that if the peer score is not much
different from the real score, it is considered a valid score.

23.4 Summary and Outlook

The peer assessment system has been tested in several classes of Computer and Con-
trol Engineering College of Minjiang University, such as “User Interface Design and
Evaluation”, “Database Course Design”, “Software Development Tools and Envi-
ronment”, and so on. The results were satisfactory. Through this system, the author
collected 117 peer evaluation records from the 2016 software engineering majors and the 2016
Computer Science and Technology majors. In order to test the reliability of peer
assessment, teacher assessment and peer assessment were carried out simultaneously
for one assignment in the two classes. We also analyzed the assessment results
and obtained the scatter plots (see Figs. 23.1 and 23.2).

Fig. 23.1 Comparison of teacher assessment and peer assessment for 2016 software engineering majors (scatter plot of peer assessment versus teacher assessment)

Fig. 23.2 Comparison of teacher assessment and peer assessment for 2016 computer science and technology majors (scatter plot of peer assessment versus teacher assessment)

As can be seen from Figs. 23.1 and 23.2, the peer score and the teacher score
show an approximately linear relationship, indicating that the correlation between the two is high and
the consistency is good. The questionnaire survey after using this system shows that
teachers and students are more satisfied with the use of the peer assessment system.
The introduction of peer assessment in blended teaching can better mobilize the
enthusiasm of students to participate.
The online peer assessment system basically meets the practical needs of peer evaluation,
but there are also some problems in the existing algorithm strategy: if the peer score is
higher than the real level, the students who are graded will raise no objection. As a result,
everyone is happy to get high marks, but this affects the value of the evaluation. It is easy
to see from the previous two scatter plots that the peer score is generally higher than the
teacher score. Therefore, a better mechanism is needed so that students dare not casually
give high marks to their peers. How to measure the students' attitude
toward the task of correcting is worth further study.
This system provides an effective solution to carry out peer assessment activities.
In the next stage, we will consider extending the system to more courses, so as to
find more problems to be solved.

Acknowledgements This study was supported by the teaching reform Research project of Minjiang
University under Grants No. MJU2018B044 and supported by Fujian Provincial Key Laboratory
of Information Processing and Intelligent control.

References

1. Gielen, S., Peeters, E., Dochy, F., et al.: Improving the effectiveness of peer feedback for learning.
Learn. Instr. 20(4), 304–315 (2010)
2. Kollar, I., Fischer, F.: Peer assessment as collaborative learning: A cognitive perspective. Learn.
Instr. 20(4), 344–348 (2010)
3. Speyer, R., Pilz, W., Kruis, J.V.D., et al.: Reliability and validity of student peer assessment in
medical education: a systematic review. Med. Teacher 33(11), e572–e585 (2011)
4. Ueno, M., Okamoto, T., Nagaoka, K.: An item response theory for peer assessment. IEEE Trans.
Learn. Technol. 9(2), 157–170 (2008)
5. Shu, C.: Design and optimization of online peer assessment system. E-educ. Res. (1), 80–85
(2017)
6. Bai, H., Su, Y., Shen, S.: An empirical study on a blended leaning model integrated peer review.
E-educ. Res. 12, 79–85 (2017)
7. Sun, L., Zhong, S.: Probabilistic models of peer assessment in MOOC system. Res. Open Educ.
20(5), 83–88 (2014)
8. Xu, T.: Design of peer assessment in xMOOCs. Res. Open Educ. 21(2), 70–77 (2015)
9. Get the Java SSM framework development, https://www.imooc.com/course/programdetail/pid/
59/. Accessed on 23 Dec 2018
Part II
Power Systems
Chapter 24
A Method of Calculating the Safety
Margin of the Power Network
Considering Cascading Trip Events

Huiqiong Deng, Chaogang Li, Bolan Yang, Eyhab Alaini, Khan Ikramullah
and Renwu Yan

Abstract Aiming at the phenomenon of the cascading trip in power system, this
paper studies a calculation method of safety margin of power system considering
the cascading trip. First, this paper analyses the behavior of cascading trip according
to the action equation of relay protector, and proposes the concept of the critical
state in order to find out whether there are cascading trip events occurring in a
power system. Then, this paper proposes an index for measuring the safety margin
of power system in the case of considering a cascading trip, and proposes a model
and an algorithm for calculating the safety margin. Finally, the rationality of the
algorithm is demonstrated by the example of IEEE39 system.

Keywords Power system · Blackout · Cascading failure · Power flow transfer ·


Correlation function

H. Deng · C. Li (B) · B. Yang · E. Alaini · K. Ikramullah · R. Yan


School of Information Science and Engineering, Fujian University of Technology, Fuzhou
350108, China
e-mail: kit218@outlook.com
H. Deng
e-mail: denghuiiong66@126.com
B. Yang
e-mail: 282289192@qq.com
E. Alaini
e-mail: 1967150243@qq.com
K. Ikramullah
e-mail: engrixxs@gmail.com
R. Yan
e-mail: 16370379@qq.com


24.1 Introduction

Cascading tripping may cause cascading failures in complex power grids, and even blackouts. In
recent years this phenomenon has been confirmed by a number of power blackouts
in the world. Due to the great impact of blackouts, the phenomenon of cascading
failures, including Cascading Trips, has received extensive attention and research in
recent years [1–3]. Some researchers have carried out research on cascading failures
by the mechanism of cascading failures, the simulation of cascading failure, the risk
analysis of cascading failures, the transmission effect of power network structure on
cascading failures, and so on. They have obtained many beneficial research results,
which provide a lot of inspiration for the further study of the cascading failures of
power system.
In addition, paper [4] studies a risk prediction method for power network fault
evolution based on an intelligent state causality chain and forms a power network
fault risk assessment system based on the intelligent state causal chain. Paper [5]
studies the main chain fault modes of the system by using a chain fault mode mining
algorithm; combined with the system topology and power flow state, the chain fault
propagation law of the system is analyzed, and stochastic power flow and risk value
theory are introduced. In paper [6], a high-risk fault chain model is constructed, and
the risk level of cascading outages in a system with large-scale wind turbines is studied
and analyzed. These works approach the problem from the angles of intelligent states,
data mining, stochastic power flows and so on, providing some new perspectives. In a word,
the research on chain faults is very important to the security of the power system, but most
of the current research is still in the exploration stage, and there is a long way to go
before it can guide the actual power grid.
From the blackouts that have occurred in recent years, cascading trips usually appeared
in their early stages. Although various safety checks are generally carried out in the
actual operation of the power grid, they cannot completely eliminate blackouts. In view
of this, the defense against cascading trips and power blackouts should be studied from
a broader perspective.
This paper gives a safety margin model and a solving method based on optimization theory.
The node injection power of the grid is taken as the main parameter in the definition of the
cascading trip safety margin. Finally, the IEEE39 system is used for numerical example
analysis and verification.

24.2 The Concept and Model of Safety Margin

In this paper, we focus on the problem of first-level cascading trips caused by an
initial failure during normal operation of the power grid. In the process of a
cascading trip, the main equipment that drives the subsequent faults is the circuit
relay protection used for backup [7]. The research of this paper is mainly
aimed at the situation of current-type backup protection. A method of determining
the expression equation of the cascading trip based on the action behavior of the backup
relay protection is presented in the literature [8]. Suppose an initial fault occurs at a
certain time in a power network, and the initial fault branch is the branch L ij between
node i and node j. For any other branch of the power network, such as the branch L st
between node s and node t, whether a cascading trip occurs can be determined by Eq. (24.1):

ωst·dist = |ωst·lim | − |ωst | (24.1)

In Eq. (24.1), ωst·lim and ωst are related to the setting value and the measured
value of the backup protection arranged on the branch L st, and to the action equation
of the backup protection. Taking current-type backup protection as an example, ωst·lim
can be taken as the protection setting value I st·set, and ωst can be taken as the measured
value I st. ωst·dist measures the electrical distance between ωst·lim and ωst on the
branch L st: if ωst·dist < 0, a cascading trip occurs on branch L st; if ωst·dist > 0,
no cascading trip occurs on branch L st; if ωst·dist = 0, branch L st is on the boundary
of the cascading trip. The analysis of this paper assumes that all branches of the grid
can be analyzed uniformly by Eq. (24.1).
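A hedged sketch of how Eq. (24.1) can be applied to every backup-protected branch after an initial fault is given below. The branch names, currents and the uniform 7.5 kA setting are illustrative values, and the state labels follow the T1/T2/T0 classification introduced in the next paragraph.

```python
def trip_distance(i_set, i_measured):
    """omega_st.dist of Eq. (24.1) for current-type backup protection:
    positive -> no cascading trip, zero -> boundary, negative -> cascading trip."""
    return abs(i_set) - abs(i_measured)

def grid_state(branch_currents, branch_settings, initial_fault):
    """Classify the post-fault grid: 'T1' (no trip), 'T0' (boundary), 'T2' (cascading trip)."""
    dists = [trip_distance(branch_settings[b], branch_currents[b])
             for b in branch_currents if b != initial_fault]
    if any(d < 0 for d in dists):
        return "T2"
    if any(d == 0 for d in dists):
        return "T0"
    return "T1"

# Illustrative post-fault currents (kA) and a uniform 7.5 kA setting, as in the case study.
currents = {"L16-17": 6.1, "L17-27": 7.2, "L17-18": 9.0}
settings = {b: 7.5 for b in currents}
print(grid_state(currents, settings, initial_fault="L17-18"))
```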
For the power network, when at least one branch other than the initial fault branch trips,
a cascading trip occurs in the power grid. Cascading trips do not occur in the power grid
when no branch other than the initial fault branch trips. When every branch L st satisfies
ωst·dist ≥ 0 and at least one branch is at the boundary of the cascading trip, the grid is
at the boundary of the cascading trip. When the change of the node injection power of the
power grid before and after the initial failure is ignored, ωst·dist is mainly determined
by the node injection power of the power grid before the initial fault. According to the
node injection power of the power network, the operation state of the power network can be
divided into three states: the state in which cascading trips do not occur, T 1; the state
in which cascading trips occur, T 2; and the critical state, T 0.
In order to quantitatively analyze the safety margin, the node injection power vector
corresponding to the T 0 state is set to S⁰, and the node injection power vector of the
current state to be analyzed is set to S′. The distance between these two vectors can be
expressed by Eq. (24.2).

$D = \left\| S^0 - S' \right\|$    (24.2)

If the current running state of the power grid is in T 1, the shortest distance between
the current running state and the critical states in T 0 is set to min D.
Obviously, the min D > 0 shows that when the initial fault is given, the power grid has
a certain safety margin. According to the previous analysis, if the current operation
status of the power grid can be changed, regardless of the change, as long as min D >
0, the grid can have certain safety margin when the initial fault is given. Thus, it can
be seen that min D is one of the most important parameters that can be used as a safety
margin index. The following analysis mainly focuses on the case where the current state S′
is in the T 1 set. It is therefore inferred that, when the initial fault is given, finding
the safety margin of the operating state amounts to solving for min D. This can be classified
as an optimization problem. Its objective function can
be written in the form of Eq. (24.3).

F = min D (24.3)

By the previous analysis and Eqs. (24.2), (24.3), S′ is the known node injection power
vector of the current state to be analyzed. The amount to be found is S⁰, the point of T 0
closest to S′ in Eq. (24.3). In the amount S⁰, the active and reactive power of the balance
node and the reactive power of the PV nodes are determined by the power flow constraints.
The variables to be optimized in S⁰ are the active power of the PV nodes and the active and
reactive power of the PQ nodes, which are expressed by O. Z is used to represent the
variables in S⁰ apart from the variables O.
When the power network is in the T 0 set, the nodal injection power is S0 , and
the corresponding equality constraints are the power flow constraints which must be
satisfied. Its specific form is shown in the Eq. (24.4) [4].
$$
\begin{cases}
P^0_{Gi} = P^0_{Di} + U^0_i \displaystyle\sum_{j=1}^{N} U^0_j \left(G^0_{ij}\cos\theta^0_{ij} + B^0_{ij}\sin\theta^0_{ij}\right) \\[8pt]
Q^0_{Gi} = Q^0_{Di} + U^0_i \displaystyle\sum_{j=1}^{N} U^0_j \left(G^0_{ij}\sin\theta^0_{ij} - B^0_{ij}\cos\theta^0_{ij}\right) \\[8pt]
i = 1, 2, \ldots, N; \qquad \theta^0_{V\theta} = 0
\end{cases}
\quad (24.4)
$$

In Eq. (24.4), superscript “0” indicates that before the initial fault occurs, the
power grid is in the T 0 set, and the node injection power is S 0 . N represents the total
number of nodes in the grid. PGi and QGi, respectively, indicate the active power
and reactive power of the power supply on the node i in the system. PDi and QDi,
respectively, represent the active load and reactive load of node i. U i represents the
modulus value of the voltage vector on the node i. Gij and Bij, respectively, represent
the real and imaginary parts of the element Y ij in row i and column j of the node
admittance matrix. θ Vθ represents the voltage phase angle of the balance node. θ ij is
the voltage phase angle difference between node i and node j, and its specific form
as shown in Eq. (24.5).

$\theta^0_{ij} = \theta^0_i - \theta^0_j$    (24.5)
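A compact way to evaluate the balance equations of Eq. (24.4) is sketched below: it computes the net nodal injections P_Gi − P_Di and Q_Gi − Q_Di from a bus admittance matrix, voltage magnitudes and angles. The two-bus data are toy values, not the IEEE39 case.

```python
import numpy as np

def injections(Y, U, theta):
    """Net nodal injections of Eq. (24.4):
    P_i = U_i * sum_j U_j (G_ij cos th_ij + B_ij sin th_ij),
    Q_i = U_i * sum_j U_j (G_ij sin th_ij - B_ij cos th_ij)."""
    G, B = Y.real, Y.imag
    th = theta[:, None] - theta[None, :]          # th_ij = theta_i - theta_j
    P = U * (U[None, :] * (G * np.cos(th) + B * np.sin(th))).sum(axis=1)
    Q = U * (U[None, :] * (G * np.sin(th) - B * np.cos(th))).sum(axis=1)
    return P, Q

# Toy 2-bus example (per-unit admittances); illustrative only.
Y = np.array([[1 - 5j, -1 + 5j],
              [-1 + 5j, 1 - 5j]])
U = np.array([1.0, 0.98])
theta = np.array([0.0, -0.05])
print(injections(Y, U, theta))
```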

Equation (24.4) can be abbreviated as shown in the form of Eq. (24.6):




$h^0\left(x^0, y^0, z^0\right) = 0$    (24.6)

When the power grid is in the T 0 set with node injection power S⁰, the inequality constraints
are mainly the various requirements for normal grid operation before the initial fault.
For any node i (i = 1, 2, …, N), the corresponding inequality constraints can be expressed
in the form of Eqs. (24.7)–(24.10).

$P^0_{Gi\cdot\min} \le P^0_{Gi} \le P^0_{Gi\cdot\max}$    (24.7)

$Q^0_{Gi\cdot\min} \le Q^0_{Gi} \le Q^0_{Gi\cdot\max}$    (24.8)

$U^0_{i\cdot\min} \le U^0_i \le U^0_{i\cdot\max}$    (24.9)


$\left|P^0_{ij}\right| \le P^0_{ij\cdot\max}$    (24.10)

Except for the initial fault branch, there are L branches in the network. According to
Eqs. (24.7)–(24.10), 3 × N + L + 1 inequality constraints are formed. These constraints
are unified and written in the form of Eq. (24.11).


$g^0\left(x^0, y^0, z^0\right) \le 0$    (24.11)

When the power grid is in the T 0 set with node injection power S⁰, after the initial fault
occurs the power grid should first satisfy power flow constraints similar to those of
Eq. (24.4). They can be written in the abbreviated form of Eq. (24.12), similar to Eq. (24.6).


$h^b\left(x^b, y^b, z^b\right) = 0$    (24.12)

In order to further express the critical state of the cascading trip, the branch L st is
denoted as branch l. Let J l = ωst·dist, and form the matrix shown in Eq. (24.13).

J = diag(J1 , . . . , Jl , . . . , JL ) (24.13)

From the previous analysis, when every element of the matrix J is greater than or equal
to zero and the matrix J is singular, the power grid is in a critical state for the given
initial fault, which can be summarized as Eq. (24.14).

$|J| = 0; \qquad J_l \ge 0,\ l = 1, 2, \ldots, L$    (24.14)

The equality and inequality in Eq. (24.14) are abbreviated as Eqs. (24.15) and (24.16):


$f^b\left(x^b, y^b, z^b\right) = 0$    (24.15)


$g^b_n\left(x^b, y^b, z^b\right) \le 0, \quad n = 1, 2, \ldots, L$    (24.16)
232 H. Deng et al.

In Eq. (24.16), n represents the number of constraints. The model can be further abbreviated into the form of Eq. (24.17):
$$
\begin{cases}
\min D = \left\| S - S^{0} \right\| \\
\text{s.t.} \quad h^{0}\left(x^{0}, y, z^{0}\right) = 0 \\
\qquad\; h^{b}\left(x^{b}, y, z^{b}\right) = 0 \\
\qquad\; f^{b}\left(x^{b}, y, z^{b}\right) = 0 \\
\qquad\; g^{0}\left(x^{0}, y, z^{0}\right) \leq 0 \\
\qquad\; g^{b}\left(x^{b}, y, z^{b}\right) \leq 0
\end{cases}
\quad (24.17)
$$
Through the above analysis, Eqs. (24.2)–(24.17) together form the complete mathematical model of the safety margin. As in the preceding analysis, y^0 and y^b can be regarded as equal and are unified as y. In this way, the final model is shown in Eq. (24.18).
$$
\begin{cases}
\min D = \left\| S - S^{0} \right\| \\
\text{s.t.} \quad h^{0}\left(x^{0}, y, z^{0}\right) = 0 \\
\qquad\; h^{b}\left(x^{b}, y, z^{b}\right) = 0 \\
\qquad\; f^{b}\left(x^{b}, y, z^{b}\right) = 0 \\
\qquad\; g^{0}\left(x^{0}, y, z^{0}\right) \leq 0 \\
\qquad\; g^{b}\left(x^{b}, y, z^{b}\right) \leq 0
\end{cases}
\quad (24.18)
$$
The required safety margin D is obtained by solving Eq. (24.18).
24.3 A Way for Solving the Safety Margin Model

Considering the complex constraint conditions, this paper uses the particle swarm optimization (PSO) algorithm to solve the model. Each particle represents the variable y to be optimized.
In the process of solving, the equality constraints in Eq. (24.18) that correspond to Eqs. (24.6) and (24.12) can be satisfied by solving the power flow equations. If a particle does not meet these requirements, it is removed and a new particle is generated. The other constraints in Eq. (24.18) are processed in the form of penalty functions. The problem represented by Eq. (24.18) can thus be converted into the problem represented by Eq. (24.19), where α, β and γ are penalty coefficients.
$$\min D' = D + \sum_{k} \alpha_{k}\left[\min\left(0, -g_{k}^{0}\left(x^{0}, y, z^{0}\right)\right)\right]^{2} + \sum_{k} \beta_{k}\left[\min\left(0, -g_{k}^{b}\left(x^{b}, y, z^{b}\right)\right)\right]^{2} + \gamma\left[f^{b}\left(x^{b}, y, z^{b}\right)\right]^{2} \quad (24.19)$$
Thus, Eq. (24.19) becomes an unconstrained form. When the particle swarm algorithm is used, the basic form of Eqs. (24.20) and (24.21) is adopted in this paper:

$$v_{i}^{k+1} = w v_{i}^{k} + c_{1} r_{1}\left(P_{best \cdot i} - y_{i}^{k}\right) + c_{2} r_{2}\left(g_{best} - y_{i}^{k}\right) \quad (24.20)$$

$$y_{i}^{k+1} = y_{i}^{k} + v_{i}^{k+1} \quad (24.21)$$
In Eqs. (24.20)–(24.21), k represents the number of iterations; yik is the position of particle i at the kth iteration; vik is the velocity of particle i at the kth iteration, which is generally required to satisfy vmin ≤ vik ≤ vmax. Pbest·i is the optimal solution from the particle's own experience, and gbest is the optimal solution of the whole particle swarm. w is the inertia coefficient, generally decreased linearly from 0.9 to 0.1. c1 and c2 are the acceleration constants, and their values are generally taken as 2. r1 and r2 are random numbers on the [0, 1] interval.
In the solving process, this paper gives the fitness of particles based on Eq. (24.22). According to the analysis of Eq. (24.18), the safety margin corresponds to the solution with the maximum fitness in Eq. (24.22):

$$F = 1 / D' \quad (24.22)$$
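To illustrate how the penalty terms of Eq. (24.19) and the fitness of Eq. (24.22) work together in the particle evaluation, the following minimal Python sketch evaluates them for one particle. The constraint values g0, gb and fb are assumed to be supplied by an external power flow routine; all names are illustrative and this is not the authors' implementation.

```python
import numpy as np

def penalized_objective(D, g0_vals, gb_vals, fb_val, alpha, beta, gamma):
    """Eq. (24.19): add quadratic penalties for violated inequality constraints
    and for the residual of the critical-state equality constraint to D."""
    pen_g0 = np.sum(alpha * np.minimum(0.0, -np.asarray(g0_vals)) ** 2)
    pen_gb = np.sum(beta * np.minimum(0.0, -np.asarray(gb_vals)) ** 2)
    pen_fb = gamma * fb_val ** 2
    return D + pen_g0 + pen_gb + pen_fb

def fitness(D_prime):
    """Eq. (24.22): particles with a smaller penalized distance get a higher fitness."""
    return 1.0 / D_prime
```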

24.4 Experimental Simulation: Case Study

In this paper, the IEEE 39-node system is used as an example for the analysis; its wiring diagram is shown in Fig. 24.1. According to the idea of solving the safety margin, the solving process of the particle swarm algorithm is given below. We mainly calculate the D value in Eq. (24.19) and the F value in Eq. (24.22), which are regarded as the computational results. The reference capacity is 100 MVA. The initial fault branch is assumed to be L 17–18. At the same time, in the system of Fig. 24.1, it is assumed that the backup protection of each line is current-type protection; in other words, ωst·lim can be taken as the protection setting value Ist·set, which is assumed to be 7.5 kA.
For the active and reactive power of the generator outputs in Fig. 24.1, it is assumed that the lower limit of the active power output of each generator node is 0 and no upper limit is given, and the upper and lower bounds of the reactive power output of each generator are not considered. The voltage modulus of each node in Fig. 24.1 is allowed to range from 0.95 to 1.05 (p.u.). The lower limit of the active power transmitted on each branch is assumed to be 0, and the upper limit is 1000 MVA.
In this example, the voltage modulus of each PV node is assumed to be the same as in the current state, and the current situation of the power grid is taken as the typical state shown in Fig. 24.1; the corresponding node power data can be found in the literature [8]. After this treatment, the voltage modulus of the PV node is no longer used as an optimization variable. At this time, the optimization variables are the
Fig. 24.1 Diagram of the example system


Fig. 24.2 The results of D (left) and F (right) (horizontal axis: CN)

active power of the PV nodes and the active and reactive power of the PQ nodes, that is, the part of the vector S^0 expressed by O. Each element in the vector O corresponds, respectively, to the active power of a PV node or to the active and reactive power of a PQ node, arranged in sequence according to the node numbers of the system shown in Fig. 24.1. In the vector O, the two elements representing the optimized variables of the same PQ node are placed adjacently.
Next, each particle corresponds to a vector O, which can be generated by assignment. The specific assignment operation is as follows: based on the typical data of the system shown in Fig. 24.1, ΔP is added to each element of O corresponding to a PV node power, and ΔP and ΔQ are, respectively, added to the elements of O corresponding to a PQ node power. In the iterative solution
process, w in Eq. (24.20) is reduced from 0.9 to 0.1 in a linear fashion. c2 and c1
were taken as 2.
A vast amount of calculation indicates that cascading trips will occur in the power grid when the node injection power is increased, and will not occur when the node injection power is reduced. This indicates that the safety margin calculated in the example is credible and that the calculation method is effective.

24.5 Conclusion

Based on the running state of the power network, cascading trips were studied from the point of view of the safety margin. The main conclusions are as follows. The distance between the actual operating state of the power grid and the critical state at which cascading trips occur can be used as an index of the power network security margin. The cascading tripping safety margin index can also be represented by an optimization model and solved by an optimization method. The example indicates that it is feasible to solve the safety margin of cascading trips by an optimization method. This provides a reference for further research.

Acknowledgment This research was financially supported by Fujian Provincial Natural Science
Foundation of China under the grant 2015J01630, Doctoral Research Foundation of Fujian Univer-
sity of Technology under the grant GY-Z13104, and Scientific Research and Development Foun-
dation of Fujian University of Technology under the grant GY-Z17149.

References

1. Shi, L., Shi, Z., Yao, L., et al.: Research on the mechanism of cascading blackout accidents in
modern power system. Power Syst. Technol. 34(3), 48–54 (2010)
2. Xue, Y., Xie, Y., Wen, F., et al.: A review on the research of power system cascading failures.
Autom. Electr. Power Syst. 37(19), 1–9, 40 (2013)
3. Liu, Y., Hu, B., Liu, J., et al.: The theory and application of power system cascading failure
(a)—related theory and application. Power Syst. Prot. Control 41(9), 148–155 (2013)
4. Xiao, F., Leng, X., Ye, K., et al.: Research on fault diagnosis and prediction of chain trip based
on fault causal chain of finite state machine. Power Big Data 21(08), 48–57 (2018)
5. Liu, Y., Huang, S., Mei, S., et al.: Analysis on patterns of power system cascading failure based
on sequential pattern mining. Power Syst. Autom. 1–7 (2019). http://kns.cnki.net/kcms/detail/
32.1180.TP.20190124.1036.036.html
6. Xu, D., Wang, H.: High risk cascading outage assessment in power systems with large-scale
wind power based on stochastic power flow and value at risk. Power Grid Technol. 43(02),
400–409 (2019)
7. Huang, P., Zhang, Y., Zeng, H.: Improved particle swarm optimization algorithm for power
economic dispatch. J Huazhong Univ. Sci. Technol. (Natural Science Edition) 38(3), 121–124
(2010)
8. Cai, G.: Branch transient potential energy analysis method for power system transient stability.
Harbin Institute of Technology (1999)
Chapter 25
Research on Intelligent Hierarchical
Control of Large Scale Electric Storage
Thermal Unit

Tong Wang, Gang Wang, Kai Gao, Jiajue Li, Yibo Wang and Hao Liu

Abstract Through the control of the thermal storage unit, local control and remote control strategies for the thermal storage unit are realized and incorporated into the day-ahead power generation plan, so that an electric-thermal comprehensive scheduling model of a power system with large-scale thermal storage units is established.

Keywords Heat store unit · Control strategy · Grading and switching ·


Scheduling model

25.1 Introduction

Due to the random fluctuation of wind power generation, the grid operation of wind
power generation brings great challenges to the traditional power system. In order
to ensure the safe and reliable operation of the whole system, the phenomenon of

T. Wang · G. Wang · J. Li
State Grid Liaoning Electric Power Company Limited, Electric Power Research Institute,
Shenyang 110006, Liaoning, China
e-mail: 18159335@qq.com
G. Wang
e-mail: wangg_ldk@ln.sgcc.com.cn
J. Li
e-mail: en-sea@163.com
K. Gao
State Grid Liaoning Electric Power Supply Co., Ltd., Shenyang 110006, Liaoning, China
e-mail: gk@ln.sgcc.com.cn
Y. Wang · H. Liu (B)
Northeast Electric Power University, Jilin 132012, Jilin Province, China
e-mail: 1282625960@qq.com
Y. Wang
e-mail: 469682939@qq.com
wind abandonment often occurs. In order to effectively solve the wind power accommodation problem, the literature [1–4] studies the prediction of wind power output using forecasting methods and obtains some research results. Literature [5] proposed a dual-time-scale coordinated control method using a battery energy storage system to reduce wind power fluctuations. Literature [6] established and solved a day-ahead scheduling model based on day-ahead heating load prediction, wind power output prediction and the operation mechanism of the heat storage device. Literature [7] demonstrated that large-capacity heat storage can effectively solve the problems of renewable energy accommodation and peak regulation. In literature [8], heat storage was incorporated into the active power scheduling of power systems with wind power. However, the most critical problem of the various energy storage technologies is that their capacity cannot accommodate wind power on a large scale.
This paper proposes a strategy of accepting dispatching power generation instructions on the power plant side and rationally arranging the switching between power generation and heat storage. On the power grid dispatching side, the heat storage load is incorporated into the daily dispatching plan, and direct control by the power grid is realized through automatic generation control, thus forming a new method for accepting wind power on a large scale.

25.2 Stratified Control Strategy of Electric Thermal


Storage Unit

For the large-capacity thermal storage system built on the side of the power plant, it is
connected with the urban heat network and becomes another coupling point between
the heat network system and the power grid system, forming a new power-thermal
coupling system. The schematic diagram of the power-thermal coupling system is
shown in Fig. 25.1.
In this paper, the thermal storage unit body device, power plant heat storage system
and thermo-electric coupling system are taken as the research object, and the unit-
collection-cluster hierarchical control strategy of unit level, power plant level and
system level is constructed, as follows.

25.2.1 Unit Control

Unit control refers to the control method that considers the operation constraints of
the heat storage unit body device. Unit control is the basis of the layered control
strategy, which is only limited by the working state of the heat storage unit itself.
The specific operational constraints are modeled as follows:

Ht = ηHt−1 + St t = 1, 2, . . . , 24 (25.1)
Fig. 25.1 Thermoelectric coupling system network structure diagram

Hmin ≤ Ht ≤ Hmax t = 1, 2, . . . , 24 (25.2)

$$-h_{\max}^{out} \leq S_t \leq h_{\max}^{in}, \quad t = 1, 2, \ldots, 24 \quad (25.3)$$

$$\sum_{t=1}^{24} S_t = 0 \quad (25.4)$$

Among them, Ht is the thermal storage capacity of the heat storage device at the end of time t; η is the thermal storage tank efficiency; Hmax and Hmin are the upper and lower bounds of the thermal storage capacity of the thermal storage device; h^in_max and h^out_max are the upper limits of the input and output heat power. Equation (25.1) characterizes the heat balance state of the thermal storage device; (25.2) and (25.3) are the heat absorption and release constraints of the energy storage system; Eq. (25.4) indicates that the heat capacity of the thermal storage device remains unchanged over one cycle, i.e., it is balanced within one cycle.
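As a concrete illustration, the following minimal Python sketch checks the unit-level constraints (25.1)–(25.4) for a candidate 24-hour schedule. The storage-positive sign convention and the initial storage level H0 are assumptions made for the example; this is not the authors' implementation.

```python
import numpy as np

def unit_feasible(S, H0, eta, H_min, H_max, h_in_max, h_out_max, tol=1e-6):
    """Check constraints (25.1)-(25.4) for one heat storage unit.
    S: 24-hour schedule, positive = heat input (storage), negative = release."""
    S = np.asarray(S, dtype=float)
    H = np.empty(len(S))
    level = H0
    for t in range(len(S)):
        level = eta * level + S[t]                 # Eq. (25.1): heat balance
        H[t] = level
    ok_capacity = np.all((H_min <= H) & (H <= H_max))        # Eq. (25.2)
    ok_power = np.all((-h_out_max <= S) & (S <= h_in_max))   # Eq. (25.3)
    ok_cycle = abs(S.sum()) <= tol                           # Eq. (25.4)
    return bool(ok_capacity and ok_power and ok_cycle)
```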

25.2.2 Collection Control

In order to make full use of the capacity margin of the thermal storage unit group,
each thermal storage unit needs to be maintained at a certain energy level, so that it
Fig. 25.2 Zoning control diagram of heat storage unit

can obtain a reasonable balance between charging and discharging, thereby realizing
the unrestricted fast response under different switching instructions.
According to the operating characteristics of the thermal storage unit, the operating
state of the thermal storage unit can be divided into three intervals: a normal input
interval, a thermal storage switching limit interval and a minimum thermal storage
interval, as shown in Fig. 25.2.
It can be seen from Fig. 25.2 that, of the three operating intervals of the thermal storage unit, it is most reasonable to operate in the interval Sa−min < S < Sa−max, that is, where the thermal storage system has both a certain charging margin and a certain discharging margin. The energy storage unit has higher flexibility at this energy level, and it is easier to meet the daily dispatching demand.
Normal input interval. When the thermal storage unit is operated in the interval, the
effective control of the thermal storage unit can provide sufficient heat for the heat
network system, and at the same time provide a certain adjustable load for the power
system, that is, the thermal storage unit has the best adjustment capacity margin.
When the thermal storage unit is in the normal input interval, its control logic is:

$$
\begin{cases}
P_{ft} > P_{ct} = 0, & \dfrac{1}{2}S_{\max} < S < S_{a\text{-}\max} \\[2mm]
P_{ct} = P_{ce} > P_{ft} \geq 0, & S_{a\text{-}\min} < S < \dfrac{1}{2}S_{\max}
\end{cases}
\quad (25.5)
$$
Among them, Pft represents the heat release power of the thermal storage unit, Pct is the heat storage power of the thermal storage unit, and Pce is the rated heat storage power of the thermal storage unit. In this state, the thermal storage unit remains operating near 50% of its capacity.
Heat storage switching limit interval and minimum heat storage interval. When
the thermal storage unit is in these two intervals, its control logic is:

$$
\begin{cases}
P_{ft} = P_{f\text{-}\max} > P_{ct} = 0, & S_{a\text{-}\max} \leq S \leq S_{\max} \\
P_{ct} = P_{ce} > P_{ft} = 0, & 0 \leq S \leq S_{a\text{-}\min}
\end{cases}
\quad (25.6)
$$

Among them, Pf −max is the maximum heat release power of the thermal storage
unit. At this time, the thermal storage unit in the thermal storage switching restriction
interval does not have the ability to realize further thermal storage, and the thermal
storage unit can only perform the heat release control. Similarly, the heat dissipation
capacity of the heat storage unit in the minimum heat storage interval is limited, and
only the heat storage operation control can be performed.
The heat storage system should have sufficient heat storage capacity:


$$\sum_{i=1}^{n} P_{cti} + \sum_{j=1}^{m} P_{cfj} = P_{quota} \quad (n + m = N) \quad (25.7)$$

Among them, Pcti represents the heat storage power of the i-th unit and Pcfj represents the heat release power of the j-th unit; Pquota represents the system power quota; n and m represent the numbers of thermal storage units in the heat-storing and heat-releasing states in the power plant, and N represents the total number of thermal storage units configured in the power plant.
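The interval-based switching described in this subsection can be summarized by the following minimal Python sketch of Eqs. (25.5)–(25.6). The specific release power used in the normal input interval is an illustrative choice (Eq. (25.5) only requires Pft > 0), and all names are placeholders rather than the authors' code.

```python
def unit_command(S, S_max, S_a_min, S_a_max, P_ce, P_f_max):
    """Return a storage/release command according to the operating interval."""
    if S_a_max <= S <= S_max:          # heat storage switching limit interval
        return {"P_ct": 0.0, "P_ft": P_f_max}    # release only, Eq. (25.6)
    if 0.0 <= S <= S_a_min:            # minimum heat storage interval
        return {"P_ct": P_ce, "P_ft": 0.0}       # store only, Eq. (25.6)
    if S > 0.5 * S_max:                # upper half of the normal input interval
        return {"P_ct": 0.0, "P_ft": P_f_max}    # releasing favoured, Eq. (25.5)
    return {"P_ct": P_ce, "P_ft": 0.0}           # storing favoured, Eq. (25.5)
```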

25.2.3 Cluster Control

When receiving the grid dispatching instruction, the control strategy is as follows:
The overall thermal storage system power is equal to the dispatching command:


$$\sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} = P_{dispatch} \quad (25.8)$$

Among them, Pij represents the power of the j-th thermal storage unit in the i-th group, with heat release taken as positive and heat storage as negative; Pdispatch represents the value of the system dispatch command; n and m respectively represent the number of thermal storage unit groups installed in the power plant and the number of thermal storage units in each group.
Under the premise of satisfying the system scheduling instruction, in order to make full use of the adjustable capacity of the thermal storage units to cope with the uncertainty of the whole system, the overall control strategy is to maximize the number of heat storage units that satisfy the group control:


$$\mathrm{MAX} = \sum_{i=1}^{n} \sum_{j=1}^{m} N_{ij} \quad (25.9)$$

Among them, Nij is the indicator of the j-th heat storage unit in the i-th group, which is 1 when the condition is satisfied and 0 otherwise; n and m have the same meaning as in Eq. (25.8).
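A minimal Python sketch of the cluster-level check corresponding to Eqs. (25.8)–(25.9) is given below; the array layout and the tolerance are assumptions for illustration.

```python
import numpy as np

def cluster_check(P, P_dispatch, satisfied, tol=1e-3):
    """P: n x m array of unit powers (heat release positive, storage negative).
    satisfied: boolean n x m array marking units that meet their group control."""
    total_ok = abs(float(np.sum(P)) - P_dispatch) <= tol   # Eq. (25.8)
    n_satisfied = int(np.sum(satisfied))                   # objective of Eq. (25.9)
    return total_ok, n_satisfied
```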

25.3 Optimization Problem Modeling

25.3.1 The Objective Function

Set up the system operation benefit as the maximum objective function:


$$f(\theta) = R - C = \sum_{t=1}^{24}\left[\lambda_{c}(t) \cdot P_{c}(t) + \lambda_{w}(t) \cdot P_{w}(t) + \lambda_{h}(t) \cdot H_{L}(t)\right] - \sum_{t=1}^{24} F_{t} \quad (25.10)$$

$$F_{t} = a_{i}\left[P_{c}(t) + C_{V}\left(H_{c}(t) + S_{t}\right)\right]^{2} + b_{i}\left[P_{c}(t) + C_{V}\left(H_{c}(t) + S_{t}\right)\right] + c_{i} \quad (25.11)$$

Among them: R represents the total revenue, which includes the revenue from
selling electricity and heating revenue; C represents the total cost, including the cost
of power generation and heating; Pc and Pw respectively represent the output of
thermal power plants and wind farms; HL is the thermal load of the thermal power
plant; λc and λw respectively represent the on-grid price of thermal power plants
and wind farms; λh represents the heating price of the thermal power plant; Ft is
the operating cost of power generation and thermal storage units in thermal power
plants. ai , bi and ci are the operating cost coefficients of the thermal power plant; CV
is the operating parameter of the unit; St is the heat storage/exothermic power of the
heat storage device at time t, which is positive at heat storage and negative at heat
release.
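As an illustration of how the daily benefit of Eq. (25.10) can be evaluated, the following minimal Python sketch sums the revenue and cost over 24 hours. The hourly operating cost F is assumed to come from a cost model such as Eq. (25.11); the names are placeholders, not the authors' code.

```python
import numpy as np

def daily_benefit(lam_c, P_c, lam_w, P_w, lam_h, H_L, F):
    """Eq. (25.10): all arguments are length-24 arrays for one dispatch day."""
    revenue = np.sum(np.asarray(lam_c) * np.asarray(P_c)
                     + np.asarray(lam_w) * np.asarray(P_w)
                     + np.asarray(lam_h) * np.asarray(H_L))
    cost = np.sum(F)
    return float(revenue - cost)
```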

25.3.2 Constraint Condition

System constraint. Power balance constraint:


$$\sum_{i \in N} P_{el,i}(t) + P_{w}(t) - P_{ex}(t) = P_{D,el}(t) \quad (25.12)$$

Among them, Pel,i(t) represents the output of the thermal power units in the region; Pw(t) is the wind power connected to the grid at time t in the system; Pex(t) indicates the exchange power between the region and the external system at time t: when the value is positive, power is delivered outward, and when the value is negative, the external system supplies power to the region; PD,el(t) is the electrical load value at time t in the system.
System heating constraint:

Ph (t) + Shk (t) − Shk (t − 1) ≥ PDhk (t) (25.13)

Among them: k is the total number of heating zones; PDhk (t) is the total heat load
that the k-th district thermal power plant needs to bear at time t; Shk (t) is the heat
storage of the heat storage device in the k-th partition at time t.
The unit constraint. Upper and lower limit constraints of unit thermal output:

0 ≤ Ph ≤ Ph,max (25.14)

Among them, Ph,max is the maximum limit of the heat output of the unit i, which
mainly depends on the capacity of the heat exchanger.
Unit climbing rate constraint:

P(t) − P(t − 1) ≤ Pup
(25.15)
P(t − 1) − P(t) ≤ Pdown

Among them, Pup and Pdown are the upward and downward climbing speed con-
straints of unit i respectively.
Operating constraint of thermal storage device. Constraints on the storage/discharge capacity of the heat storage device:

$$
\begin{cases}
S_{h,k}^{t} - S_{h,k}^{t-1} \leq P_{h,k,c\max} \\
S_{h,k}^{t-1} - S_{h,k}^{t} \leq P_{h,k,f\max}
\end{cases}
\quad (25.16)
$$

Among them, Ph,k,c max and Ph,k,f max are the maximum storage and release power
of the thermal storage device, respectively.
Capacity constraints of thermal storage devices:

$$S_{h,k}^{t} \leq S_{h,k,\max} \quad (25.17)$$

Among them, Sh,k,max is the thermal storage capacity of the thermal storage device.
25.4 Case Analysis

The regional system is shown in Fig. 25.3. The thermal storage units in the regional
system are separately analyzed in terms of local and remote control modes, and the
benefits brought by the thermal storage units are analyzed.

25.4.1 Local Control

Only the self-interval limitation of the heat storage unit is considered, and the simu-
lation calculation is carried out with the goal of maximum wind power consumption.
The simulation results are shown in Fig. 25.4.
It is known from the historical operation of the power system in Liaoning Province that, in practice, the real load trough and the period in which wind power is most difficult to accommodate is [00:00–04:00]. In order to better respond to the needs of the power grid, the heat storage units are divided into groups and switched. The specific switching strategy of the heat storage units is shown by the dotted line in Fig. 25.4. The local control strategy is adopted to adjust the heat storage units so as to maximize the space for the system to absorb wind power during the peak period of wind power abandonment.

Fig. 25.3 Equivalent system diagram


Fig. 25.4 Operation curve of thermal power plant active output in local control

25.4.2 Remote Control

When receiving the dispatching command from the power grid, the directly controlled electric heat storage load can meet both the peak regulation needs of the power network and the users' heating demand. The adjustable range of the power generation limit of the power plant is [0, 600 MW]. When the output is less than 300 MW, the adjustment principle is shown in Fig. 25.5.

Fig. 25.5 Adjustment schematic of PG limit (typical daily load curve and generator minimum output power; initial PGmin = 300 MW, heat storage starts at t = 21 h)


In the figure, PGmin represents the minimum output value of thermal power unit
during the low valley load period. Due to the limit of the heat load, the wind power
consumption capacity is restricted, and the wind abandonment phenomenon occurs.
According to the operation strategy of the direct-controlled heat storage device
proposed in this paper, the switching of the heat storage device will be completed in
the low valley period when the output of the unit is limited. The switching process
is: 0 MW, 70 MW, 2 × 70 MW, 2 × 70 + 80 MW, and 2 × 70 + 2 × 80 MW. When the heat storage
system is fully put into operation, the output value of the thermal power unit will
be 0 MW, which means that 300 MW of capacity can be provided for the system to
receive wind power.
It can be seen from Fig. 25.6 that during the low load period the heat storage operation curve is positive, and this is also the peak period of grid wind abandonment. Therefore, on the one hand, the heat storage system operation increases the load value; on the other hand, the thermal power plant output decreases. This increases the wind power consumption space. Due to the operation of the heat storage device, the daily load curve of the system is corrected from y1 to y2, which reduces the peak-to-valley difference of the system to 1679.1 MW and makes the system run more smoothly.
In order to facilitate the dispatching organization to prepare the power generation
plan, firstly, the heat storage system operation curve is obtained according to the
remote control strategy, and then the heat storage control strategy is used to correct
daily load curve y2 and formulate a dispatch plan. Figure 25.7 shows the output curve
of the thermal power plant unit. It can be seen from the figure that when the unit is
operated in the remote control mode proposed in this paper, the maximum output
is 600 MW and the minimum output is 0 MW, which reduces the number of starts
and stops of the unit, and reserves more space for receiving wind power during the

Fig. 25.6 Operation curve of heat storage device in remote control


Fig. 25.7 Thermoelectric unit output power curve

trough; at the same time, economic operation and deep peak shaving of the grid can be achieved.

25.4.3 Power Efficiency Analysis

The use of local control and remote control during the low valley period can effectively raise the trough load and provide a larger capacity margin for the grid to consume more wind power.
Under the safe operation condition of the heat storage system, the additional wind power consumed by the power grid because of the heat storage system is:

$$E_{Gwind} = \sum_{k=1}^{365} \int_{t_{1}}^{t_{2}} f_{HS}(t)\,dt \quad (25.18)$$

Among them, t1 and t2 are the start and end times of the directly controlled heat storage during the low valley period; fHS(t) is the heat storage unit power at time t, which is a step function.
For the selected Liaoning regional power grid, the control strategy can provide an additional adjustable load capacity of 300 MW on the grid side, so the wind power consumption capacity is improved by 300 MW. The annual wind power consumption was calculated under the condition that the heat storage device is operated for 7 h every day and 5 months every year, and an additional wind power consumption of about 315 million kWh was obtained.
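A rough check of this figure, assuming the five-month heating season corresponds to about 150 days:

$$300\ \text{MW} \times 7\ \text{h/day} \times 150\ \text{days} = 315{,}000\ \text{MWh} = 3.15 \times 10^{8}\ \text{kWh}$$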
25.5 Conclusion

The effective utilization of the heat storage power source is realized by constructing
the local control and remote control strategy of the heat storage unit. At the same time,
the grid optimization scheduling model with large-scale electric thermal storage unit
is constructed with the goal of maximizing the operating efficiency of the system.
Finally, the rationality of the model was verified by using the actual data of Liaoning
Power Grid, and the power efficiency under the model was analyzed.

Acknowledgments Project supported by State Grid Corporation Science and Technology


(2018GJJY-01).

References

1. Peng, X., Xiong, L., Wen, J., et al.: A summary of methods for improving short-term and ultra-
short-term power forecast accuracy. Chin. Soc. Electr. Eng. 36(23), 6315–6326 (2016)
2. Lu, M.S., Chang, C.L., Lee, W.J., et al.: Combining the wind power generation system with
energy storage equipment. IEEE Trans. Ind. Appl. 45(6), 2109–2115 (2009)
3. Heming, Y., Xiangjun, L., Xiufan, M., et al.: Wind energy planning output control method
for energy storage system based on ultra-short-term wind power prediction power. Power Grid
Technol. 39(2), 432–439 (2015)
4. Zhao, S., Wang, Y., Xu, Y.: Fire storage combined related opportunity planning and scheduling
based on wind power prediction error randomness. Chin. Soc. Electr. Eng. 34(S1), 9–16 (2014)
5. Jiang, Q., Wang, H.: Two-time-scale coordination control for a battery energy storage system to
mitigate wind power fluctuations. IEEE Trans. Energy Convers. 28(1), 52–61 (2013)
6. Yu, J., Sun, H., Shen, X.: Joint optimal operation strategy for Wind-Thermal power units with
heat storage devices. Power Autom. Equip. 37(6), 139–145 (2017) (in Chinese)
7. Xu, F., Min, Y., Chen, L., et al.: Electrical-thermal combined system with large capacity heat
storage. Chin. J. Electr. Eng. 34(29), 5063–5072 (2014) (in Chinese)
8. Chen, T.: Research on wind power scheme for thermal power plant based on heat storage. Dalian
University of Technology (2014)
Chapter 26
Global Maximum Power Point Tracking
Algorithm for Solar Power System

Ti Guan, Lin Lin, Dawei Wang, Xin Liu, Wenting Wang, Jianpo Li and
Pengwei Dong

Abstract The P-U curve of the PV (photovoltaic) system has multi-peak charac-
teristics under non-uniform irradiance conditions (NUIC). The conventional MPPT
algorithm can only track the local maximum power points, therefore, PV system fails
to work at the global optimum, causing serious energy loss. How to track its global
maximum power point is of great significance for the PV system to maintain an
efficient output state. Artificial Fish Swarm Algorithm (AFSA) is a global maximum
power point tracking (GMPPT) algorithm with strong global search capability, but
the convergence speed and accuracy of the algorithm are limited. To solve the men-
tioned problems, a Hybrid Artificial Fish Swarm Algorithm (HAFSA) for GMPPT
is proposed in this paper by using formulation of the Particle Swarm Optimization
(PSO) to reformulate the AFSA and improving the principal parameters of the algo-
rithm. Simulation results show that when under NUIC, compared with the PSO and
AFSA algorithm, the proposed algorithm has well performance on the convergence
speed and convergence accuracy.

Keywords PV system · NUIC · PSO · AFSA · GMPPT

26.1 Introduction

Solar energy is an important sort of renewable energy and MPPT algorithm is one of
the key technologies in PV power generation system. Under uniform irradiance, there
is only one maximum power point on the P-U output curve where the PV module can
operate at maximum efficiency and produce maximum output power [1]. But when

T. Guan · L. Lin · D. Wang


State Grid Shandong Electric Power Company, Jinan 250003, China
X. Liu · W. Wang
State Grid Shandong Electric Power Company, Electric Power Research Institute, Jinan 250003,
China
J. Li (B) · P. Dong
School of Computer Science, Northeast Electric Power University, Jilin 132012, China
e-mail: jianpoli@163.com
part of the PV array receives lower solar irradiance due to occlusion by objects such as clouds, trees and buildings (a condition known as non-uniform irradiance conditions, NUIC), the output of the PV system will be affected [2].
In order to ensure the PV system operating at the maximum power point simul-
taneously, many MPPT algorithms have been proposed like Perturb and Observe
(P&O) [3] and Incremental Conductance (INC) [4]. Under uniform irradiance, P&O
and INC show good tracking efficiency and speed. However, under NUIC, conven-
tional MPPT techniques fail to track the global peak and instead converge onto one
of the local maximum power points, resulting in considerable underutilization of the
PV power [5]. Reference [6] points out that under NUIC, the conventional MPPT
algorithm may cause a decrease in output power of the PV array by about 70%. There-
fore, under NUIC, GMPPT technology is crucial for tracking the global maximum
power point (GMPP).
To solve the problems of tracking GMPP under NUIC, the intelligent algorithm
is introduced into the GMPPT technology, and the GMPPT is achieved by using the
global search capability of the intelligent algorithm like Particle Swarm Optimization
(PSO) [7], Back Propagation (BP) Neural Network [8], and Cat Swarm Optimization
(CSO) [9]. PSO has been proposed as a GMPPT algorithm based on the behavior
of birds flocking [10]. In this technique, particles collectively solve a problem by
sharing information to find the best solution. The technique is limited by the presence
of random variables in its implementation, and it requires several parameters to be
defined for each system. Another GMPPT algorithm based on simulated annealing
(SA) optimization [11] has been proposed recently. However, this method incurs
more PV voltage variations during searching process and needs higher convergence
time.
The intelligent AFSA is introduced into the GMPPT technology, and the paper proposes a Hybrid Artificial Fish Swarm Algorithm (HAFSA) that includes: (1) using the formulation of the PSO to reformulate the AFSA, (2) extending the AFSA with memory behavior and communication behavior, and (3) improving the principal parameters of the algorithm so that their variation is adapted to the parameter requirements in different search stages.

26.2 Modeling of PV Cell Under Uniform Irradiance

The equivalent circuit with series and parallel resistance of each PV cell is shown in
Fig. 26.1.
where Iph is the PV current; Id is the current of parallel diode; Ish is the shunt
current; I is the output current; U is the output voltage; Rs is series resistance; Rsh is
shunt resistance.
According to the equivalent circuit of Fig. 26.1, the relationship between the
output current and the voltage of the PV cell is described as:
Fig. 26.1 Equivalent circuit of a single PV cell

   
$$I = I_{ph} - I_{0}\left[\exp\left(\frac{q(U + R_{s} I)}{nKT}\right) - 1\right] - \frac{U + R_{s} I}{R_{sh}} \quad (26.1)$$

where I0 is the reverse saturation current of PV cell, q is charge of an electron (1.6 ×


10−19 C), K is Boltzmann constant (1.38 × 10−23 J/K), T is the temperature of the
PV cell (K), n is the ideality factor of PV cell (n = 1 ∼ 5).
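Equation (26.1) is implicit in I, since the current appears on both sides, so in simulation it is usually solved numerically for each voltage point. The following minimal Python sketch (not from the paper; the damped fixed-point method and any parameter values are assumptions) illustrates one way to do this for a single cell.

```python
import math

def pv_current(U, Iph, I0, Rs, Rsh, n, T, q=1.602e-19, K=1.381e-23, iters=200):
    """Solve Eq. (26.1) for the output current I at terminal voltage U
    by damped fixed-point iteration."""
    I = Iph                                  # reasonable starting guess
    for _ in range(iters):
        I_new = Iph - I0 * (math.exp(q * (U + Rs * I) / (n * K * T)) - 1.0) \
                - (U + Rs * I) / Rsh
        if abs(I_new - I) < 1e-9:
            break
        I = 0.5 * (I + I_new)                # damping improves convergence
    return I
```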

26.3 The Standard Artificial Fish Swarm Algorithm

The principle of the artificial fish swarm algorithm is to simulate the foraging, swarming and following behaviors of fish in nature, together with the mutual assistance within the fish swarm, so as to realize global optimization.
Define the maximum moving step of an artificial fish as Step, the perception distance of an artificial fish as Visual, the retry number as Try_Number and the crowding factor as η. The state of an individual artificial fish can be described by the vector X = (X1, X2, ..., Xn), and the distance between artificial fish i and artificial fish j is dij = ‖Xi − Xj‖.
(1) Prey
The artificial fish perceives food within its visual range. Let its current state be Xi, and randomly select a state Xj within its perception range:

$$X_{j} = X_{i} + Visual \times rand() \quad (26.2)$$

where rand() is a random number between 0 and 1. If Yj is better than Yi, the fish moves one step in this direction; otherwise a new state Xj is randomly chosen and tested again. If the move condition is satisfied:
$$X_{i}^{t+1} = X_{i}^{t} + \frac{X_{j} - X_{i}^{t}}{\left\|X_{j} - X_{i}^{t}\right\|} \times Step \times rand() \quad (26.3)$$

If it cannot satisfy the move condition after Try_Number times, then random
move:
$$X_{i}^{t+1} = X_{i}^{t} + Visual \times rand() \quad (26.4)$$

(2) Swarm
In order to avoid overcrowding, let the current state of the artificial fish be Xi. Search for the number of companions nf and their center Xc within the area dij < Visual. The fish can then move toward the companions' center:
$$X_{i}^{t+1} = X_{i}^{t} + \frac{X_{c} - X_{i}^{t}}{\left\|X_{c} - X_{i}^{t}\right\|} \times Step \times rand() \quad (26.5)$$

Otherwise, it carries out the prey behavior.

(3) Follow
Let the current state of the artificial fish be Xi. Search for the companion Xj with the best fitness Yj within the area dij < Visual. The fish can then move toward Xj:
$$X_{i}^{t+1} = X_{i}^{t} + \frac{X_{j} - X_{i}^{t}}{\left\|X_{j} - X_{i}^{t}\right\|} \times Step \times rand() \quad (26.6)$$

(4) Random
The random behavior allows the artificial fish to find food and companions in a larger area: a state is randomly selected, and the artificial fish moves toward it.
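A minimal Python sketch of the prey behavior of Eqs. (26.2)–(26.4) is given below, written for a maximization problem such as GMPPT; the fitness function and the way random numbers are drawn are assumptions for illustration.

```python
import numpy as np

def prey(X_i, Y_i, fitness, visual, step, try_number):
    """One prey step for an artificial fish at state X_i (numpy array) with fitness Y_i."""
    for _ in range(try_number):
        X_j = X_i + visual * np.random.rand(*X_i.shape)          # Eq. (26.2)
        if fitness(X_j) > Y_i:                                   # better food found
            d = X_j - X_i
            return X_i + d / np.linalg.norm(d) * step * np.random.rand()  # Eq. (26.3)
    return X_i + visual * np.random.rand(*X_i.shape)             # Eq. (26.4)
```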

26.4 A Hybrid Artificial Fish Swarm Algorithm and Its


Application to GMPPT

26.4.1 A Hybrid Artificial Fish Swarm Algorithm (HAFSA)

In order to improve the convergence speed and accuracy of the algorithm, the paper introduces several features of the PSO, such as the velocity inertia factor, the memory factor and the communication factor, into the AFSA. The HAFSA makes the artificial fish move with a velocity inertia characteristic, and the behavior patterns of the artificial fish are expanded with memory behavior and communication behavior. The HAFSA also reduces the blindness in the artificial fish searching process.
(1) The paper uses the formulation of the PSO to reformulate the AFSA.
The introduction of velocity inertia weight can reduce the blindness of the artificial
fish movement. Taking the update of swarm behavior as an example, if Yc /nf < η×Yi ,
the update Eqs. (26.7) and (26.8) are:
$$V_{t+1} = \omega V_{t} + rand() \times \frac{Step \times \left(X_{t}^{c} - X_{t}\right)}{norm\left(X_{t}^{c} - X_{t}\right)} \quad (26.7)$$

$$X_{t+1} = X_{t} + V_{t+1} \quad (26.8)$$

(2) The memory factor and the communication factor of the PSO are introduced into the AFSA so as to add memory behavior and communication behavior.
First, the algorithm introduces the memory behavior pattern, in which the best position found by the artificial fish itself is used as a reference when it moves. If Ypbest/nf < η × Yi, it shows that the location has much food and is not crowded. The update Eq. (26.9) is:
$$V_{t+1} = \omega V_{t} + rand() \times \frac{Step \times \left(X_{t}^{pbest} - X_{t}\right)}{norm\left(X_{t}^{pbest} - X_{t}\right)} \quad (26.9)$$

where Xt^pbest is the best location vector of the artificial fish itself at the t-th iteration.
Second, the communication behavior pattern uses the best position of the entire fish swarm as a reference when the artificial fish moves. If Ygbest/nf < η × Yi, it shows that the location has much food and is not crowded. The update Eq. (26.10) is:
$$V_{t+1} = \omega V_{t} + rand() \times \frac{Step \times \left(X_{t}^{gbest} - X_{t}\right)}{norm\left(X_{t}^{gbest} - X_{t}\right)} \quad (26.10)$$

where Xt^gbest is the best location vector of all artificial fish on the bulletin board at the t-th iteration.
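The common structure of the updates (26.7)–(26.10) can be captured in one small Python sketch; the target point is the swarm centre, the particle's own best or the global best depending on the behavior being executed. Names and usage are illustrative, not the authors' code.

```python
import numpy as np

def hafsa_move(X, V, target, omega, step):
    """Velocity-inertia move toward a target point, Eqs. (26.7)-(26.10)."""
    d = target - X
    V_new = omega * V + np.random.rand() * step * d / np.linalg.norm(d)
    X_new = X + V_new                     # Eq. (26.8)
    return X_new, V_new
```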

26.4.2 Optimization of Principal Parameters in HAFSA

The fish swarm should move quickly in the early stage so as to explore the search space effectively, and search slowly and accurately within the neighborhood of the optimal solution in the later stage. To meet this requirement, the paper replaces the linearly decreasing inertia weight ω with a new nonlinear decrement method, as shown in Eq. (26.11):
$$\omega(t) = \omega_{\min} + \left(\omega_{\max} - \omega_{\min}\right) \times e^{-\left[t/(t_{\max}/4)\right]^{k}} \quad (26.11)$$

where t is the number of algorithm iterations, tmax is the maximum number of iterations, ωmin and ωmax are the lower and upper limits of the inertia weight range respectively, and k is the order (k = 1, 2, 3, 4, …; the value of k is selected according to the specific application of the algorithm).
In order to further improve the performance of the algorithm, this paper proposes improved update schemes that meet the expected changes of Step and Visual, as shown in Eqs. (26.12) and (26.13):
$$Visual(t) = \frac{VIS_{\max}}{\left(VIS_{\min}/VIS_{\max}\right)^{1/(t_{\max}-1)}} \times \left[\left(VIS_{\min}/VIS_{\max}\right)^{1/(t_{\max}-1)}\right]^{t} \quad (26.12)$$

where VISmin and VISmax are the lower and upper limits of Visual respectively.
$$Step(t) = Visual(t) \times \frac{X_{E} - X}{\left\|X_{E} - X\right\|} \times \left(1 - \left|\frac{Y}{Y_{E}}\right|\right) \quad (26.13)$$

where X is the current state of the artificial fish, XE is the next state that the artificial fish X explores in the various behaviors, and Y and YE are the fitness values corresponding to the states X and XE respectively.
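A minimal Python sketch of the schedules of Eqs. (26.11) and (26.12) is shown below; the default range of ω follows Table 26.1, while the order k and the Visual bounds are illustrative assumptions.

```python
import numpy as np

def omega_schedule(t, t_max, w_min=0.4, w_max=0.9, k=2):
    """Nonlinear decreasing inertia weight, Eq. (26.11)."""
    return w_min + (w_max - w_min) * np.exp(-(t / (t_max / 4.0)) ** k)

def visual_schedule(t, t_max, vis_min, vis_max):
    """Geometric decrease of Visual from vis_max (t = 1) to vis_min (t = t_max),
    Eq. (26.12)."""
    r = (vis_min / vis_max) ** (1.0 / (t_max - 1))
    return vis_max / r * r ** t
```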

26.4.3 HAFSA Applied to Global Maximum Power Point


Tracking of PV System

Under NUIC, the P-U output curve of the PV system becomes multi-peaked. To perform the simulations, a PV array is built with three series-connected PV modules, and the system configuration is tested with three different shading conditions (pattern 1: G1 = 1000, 1000, 1000 W/m²; pattern 2: G2 = 1000, 600, 600 W/m²; pattern 3: G3 = 1000, 800, 400 W/m²).
The paper applies HAFSA to track the global maximum power point as follows:
(1) The principal parameters of the algorithm are set and shown in Table 26.1.
(2) Artificial fish individual: the output current of the PV system is used as the
optimal value component X.
(3) The fitness function: The paper selects two PV modules in series as an exam-
ple. Assuming that the PV module 1 is shadowed but the PV module 2 is not

Table 26.1 Parameters of the proposed algorithm

Parameter     Value
ω             [0.9, 0.4]
C1, C2        2
η             0.75
Visual        [Max D, Max D/100]
Step          [Max D/5, 0]
Try_Number    5
tmax          150
shadowed, PV module 2 will receive a stronger irradiance than PV module 1, so there should be Iph1 < Iph2:
$$
U = \begin{cases}
\dfrac{nKT}{q}\ln\left(\dfrac{I_{ph2} - I}{I_{0}} + 1\right) - \dfrac{n_{b} K T_{b}}{q}\ln\left(\dfrac{I - I_{ph1}}{I_{0b}} + 1\right) - I R_{s}, & I_{ph1} < I \leq I_{ph2} \\[3mm]
\dfrac{nKT}{q}\ln\left(\dfrac{I_{ph1} - I}{I_{0}} + 1\right) + \dfrac{nKT}{q}\ln\left(\dfrac{I_{ph2} - I}{I_{0}} + 1\right) - 2 I R_{s}, & 0 \leq I < I_{ph1}
\end{cases}
\quad (26.14)
$$

where nb is the diode influence factor, I0b is the saturation leakage current of the
bypass diode under standardized testing conditions.
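The reconstructed piecewise relationship of Eq. (26.14) can be evaluated directly, as in the following minimal Python sketch; the constants and parameter handling are illustrative assumptions rather than the authors' code.

```python
import math

def series_voltage(I, Iph1, Iph2, I0, I0b, Rs, n, nb, T, Tb,
                   q=1.602e-19, K=1.381e-23):
    """Output voltage of two series modules, module 1 shaded (Iph1 < Iph2)
    and protected by a bypass diode, following Eq. (26.14)."""
    if Iph1 < I <= Iph2:       # the bypass diode of the shaded module conducts
        return (n * K * T / q) * math.log((Iph2 - I) / I0 + 1.0) \
             - (nb * K * Tb / q) * math.log((I - Iph1) / I0b + 1.0) - I * Rs
    if 0.0 <= I < Iph1:        # both modules generate
        return (n * K * T / q) * math.log((Iph1 - I) / I0 + 1.0) \
             + (n * K * T / q) * math.log((Iph2 - I) / I0 + 1.0) - 2.0 * I * Rs
    return 0.0                 # outside the modelled current range
```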

26.5 Simulation Result of HAFSA GMPPT Algorithm

The simulation results are presented in this section, and several performance aspects of HAFSA are compared with those of the PSO and the AFSA under the same conditions.
In the experiment, a PV array with three modules connected in series is taken
as an example. Under standardized conditions, G = 1000 W/m2 , T = 25 ◦ C, the
parameters of a single PV module are shown in Table 26.2.
When the irradiance condition is G = 1000, 800, 400 W/m², the GMPP of the PV array is tracked by the proposed HAFSA GMPPT algorithm. The tracking process is shown in Fig. 26.2.
It can be seen from Fig. 26.2 that the HAFSA GMPPT algorithm can accurately track the GMPP with high efficiency. After the 21st iteration of the algorithm, the result tends to be smooth; the GMPP is Pmax = 894.3010 W when Impp = 6.7685 A.
When T = 25 ◦ C and G = 1000, 800, 400 W/m2 , PSO, AFSA, and HAFSA are
used for GMPPT. The tracking results are shown in Fig. 26.3.
As can be seen from the tracking process shown in Fig. 26.3, all three algorithms can track the GMPP, and the proposed algorithm can track the GMPP with fewer iterations. It can also be seen from Fig. 26.3 that the HAFSA shows better convergence speed and stability than the other two algorithms when tracking the GMPP.
Figure 26.4 shows the population distribution of the three algorithms after the 30th iteration. It can be seen that, after the 30th iteration, the populations of all three algorithms are distributed in the neighborhood of the optimal solution, and the distribution of

Table 26.2 PV module parameters under standard conditions

Parameter    Value
Pmax         305 W
Uoc          44.7 V
Isc          8.89 A
Umpp         36.2 V
Impp         8.23 A
Fig. 26.2 HAFSA tracking process under NUIC (power and current versus iteration)

Fig. 26.3 Trace results of the three algorithms under NUIC (power versus iteration)


Fig. 26.4 Results of the three algorithms after the 30th iteration (power versus current)

the HAFSA population is closer to the GMPP. Therefore, the proposed algorithm demonstrates better performance in terms of convergence speed and stability.

26.6 Conclusion

A novel GMPPT algorithm is proposed to increase the performance of the PV system under NUIC, where the P-U characteristic becomes multi-peaked. Several factors of the PSO algorithm are introduced into the AFSA, which reduces the blindness of the artificial fish movement. At the same time, the foraging behavior of the artificial fish swarm is extended with two behaviors, memory behavior and communication behavior, which further reduces the blindness of the artificial fish movement. Moreover, the paper constructs update equations for the principal parameters of the proposed algorithm, so that their variation is adapted to the parameter requirements in different search stages. Simulation results also show that the proposed algorithm performs better than the other GMPPT algorithms.

Acknowledgements This work was supported by “Research on Lightweight Active Immune Tech-
nology for Electric Power Supervisory Control System”, a science and technology project of State
Grid Co., Ltd in 2019.
References

1. Li, C., Cao, P., Li, J., Zhao, B.: Review on reactive voltage control methods for large-scale
distributed PV integrated grid. J. Northeast Electr. Power Univ. 37(2), 82–88 (2017)
2. Wang, H., Chen, Y., Li, G., Zhuang, G.: Solution of voltage beyond limits in distribution
network with large scale distributed photovoltaic generators. J. Northeast Electr. Power Univ.
37(6), 8–14 (2017)
3. Femia, N., Petrone, G., Spagnuolo, G.: Optimization of perturb and observe maximum power
point tracking method. IEEE Trans. Power Electron. 20(4), 963–973 (2005)
4. Sera, D., Mathe, L., Kerekes, T.: On the perturb-and-observe and incremental conductance
MPPT methods for PV systems. IEEE J. Photovolt. 3(3), 1070–1078 (2013)
5. Roman, E., Alonso, R., Ibanez, P.: Intelligent PV module for grid-connected PV systems. IEEE
Trans. Ind. Electron. 53(4), 1066–1073 (2006)
6. Manickam, C., Raman, G.R., Raman, G.P.: A hybrid algorithm for tracking of GMPP based on
P&O and PSO with reduced power oscillation in string inverters. IEEE Trans. Ind. Electron.
63(10), 6097–6106 (2016)
7. Oliveira, F.M.D., Silva, S.A.O.D., Durand, F.R.: Grid-tied photovoltaic system based on PSO
MPPT technique with active power line conditioning. IET Power Electron. 9(6), 1180–1191
(2016)
8. Yang, M., Huang, X., Su, X.: Study on ultra-short term prediction method of photovoltaic
power based on ANFIS. J. Northeast Electr. Power Univ. 38(4), 14–18 (2018)
9. Yin, L., Lv, L., Lei, G.: Three-step MPPT algorithm for photovoltaic arrays with local shadows.
J. Northeast Electr. Power Univ. 37(6), 15–20 (2017)
10. Miyatake, M., Veerachary, M., Toriumi, F.: Maximum power point tracking of multiple pho-
tovoltaic arrays: a PSO approach. IEEE Trans. Aerosp. Electron. Syst. 47(1), 367–380 (2011)
11. Lyden, S., Haque, M.E.: A simulated annealing global maximum power point tracking
approach for PV modules under partial shading conditions. IEEE Trans. Power Electron. 31(6),
4171–4181 (2016)
Chapter 27
A Design of Electricity Generating
Station Power Prediction Unit with Low
Power Consumption Based on Support
Vector Regression

Bing Liu, Qifan Tong, Lei Feng and Ping Fu

Abstract During the process of electricity generating station operation, its output
power will be affected by environmental factors, so there will be a large fluctuation.
If we can monitor the environmental data and the output power of the electricity
generating station in real time, we can make an accurate and effective estimation of
the operation status of the electricity generating station. To meet this demand, we
designed an electricity generating station power prediction unit based on support vec-
tor regression algorithm. The power consumption of the unit is very low, and by using
machine learning, the characteristics and rules of each index can be learned from
the environmental data collected by sensors. By processing and analyzing the newly
collected data, the real-time operation status of the electricity generating station can
be monitored.

Keywords Output power · Real-time monitor · Machine learning · Support vector


regression · Low power consumption

B. Liu · Q. Tong · L. Feng (B) · P. Fu


Harbin Institute of Technology, Harbin, China
e-mail: hitfenglei@hit.edu.cn
B. Liu
e-mail: liubing66@hit.edu.cn
Q. Tong
e-mail: 201500800806@mail.sdu.edu.cn
P. Fu
e-mail: fuping@hit.edu.cn

27.1 Introduction

27.1.1 Research Status

With the continuous development of machine learning and deep learning technology, the concepts and knowledge system of machine learning have been improved day by day. It can be predicted that, for a long time to come, machine learning will develop towards edge and terminal devices. The ability of machine learning algorithms to extract features makes them very suitable for the analysis and prediction of systems that are greatly affected by environmental factors. Therefore, many machine learning algorithms can be applied in practical production.
In practical applications, it is more common to upload data and computing tasks to
the cloud, and then the results are returned to the local after data collection and model
training are completed in the cloud. This way of machine learning can be called cloud
computing. The advantages of cloud computing lie in the large amount of data stored
in the server, high accuracy and strong computing power. Although cloud computing
is powerful, it also has disadvantages. Many computational scenarios need to be done
locally, such as driverless vehicles. If the collected data is uploaded to the cloud for
processing and calculation during the driving process, the time delay caused by
this process is likely to lead to safety accidents. Comparatively speaking, running
machine learning algorithms on terminal devices has the advantages of real-time and
low latency, and is more suitable for many practical scenarios.
Although machine learning algorithm has a good effect and performance in solv-
ing many specific problems, the operation of the algorithm itself needs to consume
a lot of computing resources. In this paper, we need to run the machine learning
algorithm on the terminal device, so we need to process and optimize the data before
the algorithm runs, in order to reduce the running time required by the algorithm and
implement the algorithm efficiently under low power consumption.

27.1.2 Research Objective

The output power of a power plant is affected by environmental factors to a great extent. In order to monitor the operation of the power plant in real time, we design a power prediction unit that extracts the characteristics of each environmental data index through the classical support vector regression algorithm in machine learning, so as to accurately estimate the operation status of the power plant.
27.1.3 Data Source

Through long-term monitoring and evaluation of a combined cycle electricity gen-


erating station, we collected the power output data of the plant from 2006 to 2011,
which was set to work at full load during the six years. The power output of the elec-
tricity generating station is mainly affected by temperature, pressure, air humidity,
and exhaust vacuum. Therefore, we recorded the data of these four indicators and
made specific regression analysis experiments for the power output characteristics
of the electricity generating station.

27.1.4 Mapping Platform

We use the FRDM-KW019032 as the embedded platform in the experimental part. This series of development boards adopts an ARM Cortex-M0+ core, has very low power consumption and supports ISM-band wireless communication, which makes it suitable for many practical application scenarios. Therefore, this type of development board is selected to verify and debug the algorithm.

27.1.5 General Overview

In Sect. 27.2 we introduce the workflow of the power prediction unit and the SVR
algorithm. In Sect. 27.3 we mainly discuss the methods of data preprocessing and
precision evaluation. In Sect. 27.4 we introduce the implementation of the algorithm
on the embedded platform. In Sect. 27.5 we will summarize the content of the article.

27.2 The Design of Power Predict Unit

In this Section we mainly introduce the operation mode of the electricity generating
station we monitored, the structure of the power prediction unit, and briefly intro-
duces support vector regression algorithm and some key parameters involved in the
operation of the algorithm.

27.2.1 Principle of the Electricity Generating Station

The combined cycle power plant consists of gas turbine (GT), steam turbine (ST)
and heat recovery steam generator. In the process of power generation in electricity
Table 27.1 Value ranges of the four environmental variables

Variable          Minimum    Maximum    Unit
Temperature       1.81       37.11      °C
Pressure          992.89     1033.30    millibar
Air humidity      25.56%     100.16%    –
Exhaust vacuum    25.36      81.56      cmHg

generating station, electricity is generated by gas and steam turbines combined in one cycle, and the energy is transferred from one turbine to another. While the exhaust vacuum is collected from the steam turbine and affects its performance, the three environmental variables affecting the performance of the gas turbine are temperature, relative air humidity and pressure. Therefore, the output power of the power plant is mainly related to these three environmental variables and the exhaust vacuum. According to the data we collected in the plant through sensors, we can obtain the numerical range of these four indicators. The ranges of values are shown in Table 27.1.
By acquiring the ranges of these parameters, we can normalize the data to a scale between 0 and 1 before the algorithm runs, so as to avoid inconsistent weighting caused by directly using the raw data, and to improve the efficiency of the algorithm on the embedded platform.
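A minimal Python sketch of this min-max normalization is shown below, using the ranges of Table 27.1; the exact bounds programmed into the device are assumptions for illustration.

```python
import numpy as np

# [min, max] per feature, from Table 27.1: temperature (°C), pressure (millibar),
# relative humidity (%), exhaust vacuum (cmHg)
BOUNDS = np.array([[1.81, 37.11],
                   [992.89, 1033.30],
                   [25.56, 100.16],
                   [25.36, 81.56]])

def normalize(sample):
    """Scale a 4-element sample to [0, 1] feature by feature."""
    sample = np.asarray(sample, dtype=float)
    lo, hi = BOUNDS[:, 0], BOUNDS[:, 1]
    return (sample - lo) / (hi - lo)
```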

27.2.2 Workflow of the Power Prediction Unit

The main purpose of the power prediction unit designed in this paper is to obtain real-time operation information of the power plant: it acquires environmental data and learns their characteristics, predicts the ideal output power value under the current environment, and compares it with the measured output power value, thereby monitoring the working state of the power plant.
Firstly, the sensors on the power prediction unit read the environmental data, and the KW01 reads these data through its ADC. After the support vector regression algorithm has run, the power prediction unit outputs an ideal value for the current state. This ideal value is then compared with the actual measured power so as to judge whether there are problems in the operation of the power plant. The structure and design flow of the prediction unit are shown in Fig. 27.1.

27.2.3 Introduction of Support Vector Regression Algorithm

Support Vector Regression (SVR) is a simple machine learning algorithm. Its task
is, for given data, to find a hyperplane that can fit as many data points as possible,
and apply the results of regression to target prediction and analysis. In the case of
Fig. 27.1 .

Fig. 27.2 .

linear inseparability, the algorithm maps the data to a high-dimensional space by using kernel functions to solve such problems. The algorithm is widely used in character recognition, behavior recognition and financial analysis. As shown in Fig. 27.2, when the input data is one-dimensional, the purpose of the algorithm is to fit a curve such that the distance from the points on the plane to the curve is minimized; when the input data has a higher dimension, the fitting target becomes a hyperplane.
27.2.4 Key Parameters

Epsilon. The insensitive loss function coefficient can be understood as the acceptable error. As shown by the dotted line in the figure above, for the sample points there exists a band that does not contribute any loss to the objective function. For a specific data set, an acceptable flexible boundary coefficient is selected manually.
Gamma. A parameter of the RBF kernel function. It determines the distribution of the data after mapping to the new feature space. When Gamma is larger, there will be fewer support vectors, the curve will be more complex, and the operation of the algorithm will need more iterations. An excessive Gamma results in poor generalization of the model.
C. The penalty coefficient, i.e., the tolerance of errors. The higher C is, the less error is tolerated and the easier it is to over-fit; the smaller C is, the more the model under-fits. Either a too large or a too small C degrades the generalization ability.

27.3 Parameter Selection and Model Optimization Method

This section mainly introduces the accuracy evaluation method and parameter selec-
tion of training algorithm. Appropriate parameters can speed up code running and
improve the efficiency of the algorithm on embedded platform. The part of parameter
optimization is completed on PC.

27.3.1 Model Accuracy Evaluation Method

For the evaluation index of model accuracy, we choose root mean square error
(RMSE) at the beginning of the experiment. The formula is as follows:
  
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}    (27.1)

This index describes the relationship between the predicted value and the actual value well. However, in our experiments we found that during data preprocessing only the four input dimensions are scaled, not the label values, so this index cannot fairly describe the accuracy of the same model on data sets of different scales. Therefore, this paper chooses a statistical index describing the goodness of fit, whose formula is as follows:
R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}}    (27.2)

When the value of this index approaches 1, the accuracy of the corresponding model is better.

27.3.2 Selection of Hyper Parameters

When training the model on a computer, we achieved good prediction results, and the predicted values agree well with the target values. However, when running on the KW01, the situation is different: it has fewer available resources, its main frequency is only 48 MHz (less than 1/50 of the PC), and its computing ability is limited. Therefore, it is necessary to optimize at the level of hyperparameters to
minimize the number of iterations, so as to shorten the runtime of the algorithm. For
the three hyper-parameters of support vector regression algorithm, Epsilon needs to
be selected manually by the scale of target data. For the other two parameters, C and
Gamma, the increase of C and Gamma can improve the accuracy of the model in the
training process, but at the same time, it will make the model more complex, resulting
in more iterations and longer runtime. In theory, when Gamma is large enough, the
model can fit all the known data points, but correspondingly, the model will fall into
over-fitting, which will make the generalization effect worse. At the same time, the
number of iterations required for the operation of the algorithm will become very
large, and the efficiency of the algorithm will become low. Therefore, considering
the practical application needs to run on a single-chip computer, we will transform
the optimization goal into how to reduce the number of iterations on the basis of
ensuring the accuracy of the model. In each training, we first determine the value of Epsilon according to the scale of the target data, and set an expected accuracy (with R² as the evaluation index) that we want the algorithm to achieve. Then an exhaustive search is used to find the optimal C and Gamma within a certain interval. To evaluate a candidate pair of C and Gamma, we divide the data into ten parts for ten-fold cross-validation: nine parts are taken as the training set each time, the remaining part is used as the test set, and the mean R² obtained with each parameter setting is recorded. Once R² reaches the given value, C and Gamma are not increased any further, so that we obtain several candidate sets of C and Gamma values, from which the optimal parameters are chosen according to the number of iterations. Table 27.2 shows the selected parameters for different values of the expected accuracy.

Table 27.2 Parameter selection for different values of the expected accuracy

R²       C     Gamma   Epsilon   Iterations
0.9385   100   0.2     0.1       4132
         20    0.3     0.1       3754
         10    0.4     0.1       3712
         5     0.6     0.1       3626
0.9274   50    0.15    0.1       3512
         20    0.2     0.1       3480
         10    0.3     0.1       3443
         5     0.4     0.1       3589
0.9206   25    0.1     0.1       3517
         15    0.15    0.1       3510
         10    0.2     0.1       3407
         5     0.3     0.1       3533
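The parameter search described above can be sketched as follows. This is not the on-chip implementation: it uses scikit-learn's SVR with ten-fold cross-validation on a PC, the candidate grids and the target R² are illustrative, and since the libsvm iteration count may not be exposed directly, the final choice among the returned candidates can simply prefer the smallest sufficient C and Gamma.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

def select_c_gamma(X, y, target_r2=0.92, epsilon=0.1,
                   c_grid=(5, 10, 15, 20, 25, 50, 100),
                   gamma_grid=(0.1, 0.15, 0.2, 0.3, 0.4, 0.6)):
    """Return (C, gamma, mean R^2) candidates whose ten-fold mean R^2
    reaches target_r2.  For each C, gamma is increased only until the
    target accuracy is met, mirroring the strategy of not growing the
    model further once the expected accuracy is reached."""
    candidates = []
    for c in c_grid:
        for g in gamma_grid:
            svr = SVR(kernel="rbf", C=c, gamma=g, epsilon=epsilon)
            r2 = cross_val_score(svr, X, y, cv=10, scoring="r2").mean()
            if r2 >= target_r2:
                candidates.append((c, g, r2))
                break  # smallest sufficient gamma for this C
    return candidates
```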

27.4 Performance Evaluation of the Algorithm and Experimental Results

In this section, we introduce the implementation of the support vector regression algorithm on the embedded platform. We adopt two methods to evaluate the performance of the algorithm.

27.4.1 On-Chip Part

Firstly, we store the data of the existing data sets on chip; after the MCU runs the support vector regression algorithm, the resource occupancy can be checked in the IDE. We took 300 sets of data for training and five sets for prediction, and then calculated R² for evaluation. The R² we achieved is 0.911, which indicates a good fitting result. The running time of the code can be viewed through IAR software simulation, and the map file exported from the project gives information about memory usage. The project code takes up 28.2 KB of Flash and 9.6 KB of SRAM. From this we can see that the design makes reasonable and effective use of on-chip resources while fulfilling the purpose of the algorithm and ensuring its accuracy.

Fig. 27.3 Workflow of the prediction unit with external sensors attached

27.4.2 Sensor-Connected Part

We connect sensors to MCU to obtain external data and store the latest 20 sets of data
for running support vector regression algorithm. When external sensors are attached,
the workflow of the prediction unit is shown in Fig. 27.3.

27.5 Conclusion

This paper mainly introduces the design of a power prediction unit with low power
consumption, and briefly introduces the support vector regression algorithm. In prac-
tical applications, we can run machine learning algorithms on low-power and low-
cost platforms, which can extract the characteristics of environmental data and realize
real-time monitoring of power plant operation status. At the same time, we designed
a complete set of parameter optimization methods and corresponding optimization
strategies. In the case of minimizing resource occupation, the runtime of the algo-
rithm can be shortened as much as possible by setting the hyper-parameters.

Chapter 28
Design of Power Meter Calibration Line
Control System

Liqiang Pei, Qingdan Huang, Rui Rao, Lian Zeng and Weijie Liao

Abstract Aiming at the problem that the manual calibration in the power meter cal-
ibration is inefficient and error-prone, and the existing automatic calibration equip-
ment is cumbersome, this paper proposes a pipelined automatic calibration solution
for the instrument. Combined with the instrument automatic calibration device and
the assembly line equipment, the assembly line calibration operation of the instru-
ment verification is realized, and multiple power meters can be calibrated at the same
time. This paper introduces the structure of the power meter automatic calibration
assembly line system and the design of hardware and software. The experimental
results show that the designed system can realize the fully automated calibration
operation of the instrument.

Keywords Electric instrument · Automatic calibration · Assembly line · Control system

28.1 Introduction

In order to ensure the measurement accuracy of the power meter, it is necessary to periodically check the power meter. At present, the method of manual calibration is usually adopted in China. However, the manual calibration has the problems of

L. Pei (B) · Q. Huang · R. Rao · L. Zeng · W. Liao


Electrical Power Test & Research Institute of Guangzhou Power Supply Bureau, Guangzhou,
China
e-mail: 68311454@qq.com
Q. Huang
e-mail: 18829588@qq.com
R. Rao
e-mail: 16450645@qq.com
L. Zeng
e-mail: 526868238@qq.com
W. Liao
e-mail: 453679523@qq.com

low calibration efficiency and cumbersome operation, and the calibration personnel easily become fatigued after prolonged operation and are prone to error [1].
Therefore, it is necessary to develop automated calibration technology for power
meters.
At present, some power meter automatic calibration devices have appeared at
home and abroad, using DSP or computer as the processor, and using machine
vision to obtain the instrument representation number [2]. These devices basically
realize the automation of meter reading acquisition and calibration data processing,
which improves the calibration efficiency and calibration accuracy [3]. However,
these instrument calibration devices still have the following deficiencies: firstly, the instruments must be classified and placed manually, and automatic handling has not yet been realized; secondly, manual connection and disconnection operations are still required; finally, the systems have poor versatility, can calibrate only a few types of instruments, can verify only one instrument at a time, and do not realize pipeline operation [4]. In order to solve the above problems, this paper designs a power meter calibration pipeline control system that realizes assembly-line calibration of power meters and automatically completes the transport meter, connection and disconnection, calibration and range adjustment operations of the instrument [5]. It can significantly shorten the meter calibration time and improve the calibration efficiency.

28.2 System Design

The overall structure of the power meter calibration pipeline control system is shown
in Fig. 28.1. It is mainly composed of four parts: system main control unit, instrument
calibration unit, instrument identification and grabbing unit and pipeline conveyor
belt.

Fig. 28.1 The structure of control system



In the power meter calibration pipeline control system, the system main control unit controls the operation of the whole system; the instrument calibration unit realizes the automatic calibration operation of the power meter [6]; the instrument identification and grabbing unit grabs the power meter to be checked from the instrument warehouse, places it on the assembly line and identifies the instrument model; and the assembly line conveyor is responsible for transporting the instrument between the instrument calibration unit and the instrument storage warehouse. The power
meter calibration pipeline is a distributed system. The main control unit, the instru-
ment identification and capture unit and the instrument calibration unit are connected
in the same control LAN. In actual use, different numbers of instrument calibration
units can be connected according to actual needs. The more the number of meter cal-
ibration units used, the more instruments that can be simultaneously calibrated, and
the higher the calibration efficiency. The instrument calibration unit is the core equip-
ment of the system, and is designed with independent control computer, automatic
transport meter device, automatic disconnecting device and automatic calibration
device. Through the mutual cooperation of these devices, the automatic calibration
operation of the instrument is completed [7].
The basic workflow of the power meter calibration pipeline system is shown in
Fig. 28.2. After the system is started, the main control unit sends a status query
command to each meter calibration unit. If there is an idle calibration unit, the main
control system will send a grab-instrument instruction to the meter identification and grabbing unit, which carries the instrument to be verified from the warehouse to the assembly line. After the operation is completed, the operation completion signal and the
instrument model information will be returned to the main control unit. After receiv-
ing the operation completion signal, the main control unit sends a start calibration
command and meter model information to the meter calibration unit. After receiving
the command, the instrument calibration unit will first read the instrument calibration
plan and instrument parameters from the main control unit database. Then verify the
instrument according to the calibration plan and instrument parameter information.
After the meter calibration is completed, the meter calibration unit will return the
calibration completion signal to the system main control unit. After receiving the cal-
ibration completion signal, the main control unit will start the pipeline to transport
the meter that has been verified.
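A minimal sketch of this scheduling logic is given below. The `bus` object with `request`/`send` methods and the message names are illustrative abstractions of the control LAN communication, not the system's actual protocol.

```python
def dispatch_once(bus, calibration_units):
    """One scheduling pass of the main control unit: poll every calibration
    unit and, for each idle one, have the grabbing unit fetch a meter and
    then dispatch the calibration job together with the meter model."""
    for unit in calibration_units:
        if bus.request(unit, "status_query") != "idle":
            continue
        reply = bus.request("grab_unit", "grab_instrument")  # warehouse -> line
        if reply.get("done"):
            bus.send(unit, "start_calibration", reply["meter_model"])

def on_calibration_finished(bus):
    """When a unit reports completion, start the conveyor so the verified
    meter is transported back to the storage warehouse."""
    bus.send("conveyor", "start")
```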

28.3 Design of the Instrument Calibration

The instrument calibration unit is responsible for completing the transport meter,
disconnection, range adjustment and automatic calibration operation of the power
meter in this system, which is the main component of the system. The schematic
diagram of the hardware structure of its control system is shown in Fig. 28.3. It
consists of a smart camera, a displacement control box, a standard source, a standard
source channel switchboard, a digital I/O board, a motor driver control board, and
a plurality of electric actuators. The meter calibration unit completes the transport

Fig. 28.2 The basic workflow of the power meter calibration pipeline system

meter, disconnection and range adjustment operation of the power meter through the
cooperation of multiple electric actuators [8].
If the control system of the meter calibration unit is divided by functions, it can
be divided into the calibration device, the upper and lower device control circuit, the
disconnection device control circuit and the range adjustment circuit. The instrument
calibration unit control system is connected to the control LAN through a network
interface with the computer as the core, and realizes the communication connection
with the main control unit.
A digital I/O board with a PCI interface installed on the main control computer
is used to control the action of the electric actuator [9]. A motor driver control board
was designed to control two servo motor drives, three stepper motor drives and a
steering gear.
The motor driver control board is connected to the computer via a USB interface. In order to automate the switching of the calibration standard source channels, a standard source channel switching board is designed to switch the output channel of the standard source. The standard source channel switching board is also connected to the host computer via a USB interface.

Fig. 28.3 The hardware structure of the instrument calibration unit control system

28.4 Design of System Software

The power meter calibration pipeline system software is designed to follow the
principles of reliability, modifiability, readability and testability [10], using multi-
threading, network communication and database technology. The system software
adopts the client/server structure, and the system software is divided into the main
control unit software and the instrument calibration unit software. The main con-
trol unit software is used as the server software, and the instrument calibration unit
software is used as the client software. A database is built in the main control unit
computer, and the instrument calibration plan, instrument parameters and instrument
calibration results are uniformly stored in it, which is conducive to unified manage-
ment of data [11].
The main control unit software is used to control the running status of the
entire instrument automatic calibration unit, and realize functions such as instrument
scheduling, status monitoring and data processing. The main control unit software

Fig. 28.4 The main control unit software module

Fig. 28.5 The instrument calibration unit software

module is composed as shown in Fig. 28.4. It consists of a human-computer interaction module, a system scheduling module, a network communication module, and a data processing module.
The instrument calibration unit software can realize instrument calibration, auto-
matic transport meter, automatic disconnection and range adjustment function. It is
a set of instrument calibration control software that integrates instrument calibra-
tion, motor control and data processing. As shown in Figs. 28.2, 28.3 and 28.4, the
instrument calibration software is divided into human-computer interaction module,
operation status monitoring module, data processing module, network communi-
cation module, instrument calibration module, transport meter module, instrument
disconnection control module, meter wiring control module and range adjustment
control module. The instrument calibration unit software operation process is shown
in Fig. 28.5.
The specific operation flow of the instrument calibration unit is as follows: when the instrument calibration unit receives the start calibration command issued by the main control unit, the instrument's calibration plan and parameter information are read from the database of the main control unit. After the data reading is completed, the transport meter control module is started to move the instrument onto the calibration station. After the transport is completed, the instrument wiring control module is executed and the connection lines are connected to the instrument terminals. After the wiring is completed, the instrument calibration operation is started. After the calibration is completed, the instrument disconnection control module is executed to remove the connection lines from the instrument terminals, and then the transport meter control module moves the instrument from the calibration station back to the assembly line. This completes the instrument calibration.

28.5 Experiments

In order to verify the functions of the designed power meter calibration pipeline control system, the main control unit, the instrument calibration unit and the pipeline conveyor belt are combined, and the experimental environment shown in Fig. 28.6 is built.
The key point of the verification is whether the system can realize the whole process of transporting, connecting and disconnecting, range adjustment and automatic calibration of the power meter under the control of the main control unit. The designed experimental scheme is to put the instrument to be inspected on the pipeline, use the system main control unit to start the whole system, record the time required for each operation, and then analyze the experimental results.
A total of five meters were used for testing throughout the process. In all five
experiments, all the operational procedures of the instrument calibration were suc-
cessfully completed. The experimental results are shown in Table 28.1. Excluding
the time spent on meter calibration, the average time of the transport meter, dis-
connection, range adjustment and instrumentation operation of each meter is about
260 s. The functional design basically meets the expected design goals of the system.

Fig. 28.6 The system experiment environment

Table 28.1 Results of the experiments

Num   Step                                 Frequency   Time (s)   Result
1     Photoelectric switch 1               5           20         Pass
2     Block the meter                      5           5          Pass
3     Transport meter (Up)                 5           55         Pass
4     Tighten the two terminals            5           48         Pass
5     Insert range adjustment pin          5           19         Pass
6     Pull out the range adjustment pin    5           13         Pass
7     Loosen the two binding posts         5           38         Pass
8     Transport meter (Down)               5           46         Pass
9     Start the pipeline                   5           8          Pass

28.6 Conclusion

Aiming at the problem that the existing power meter calibration equipment has poor
versatility and low calibration efficiency, this paper designs a power meter cali-
bration pipeline control system. The system combines computer technology, infor-
mation management technology and digital control technology. Under the synergy
of multiple motors, the power meter's automatic transport, automatic connection and disconnection, automatic calibration and other functions are realized. When
the system is equipped with multiple instrument calibration units, one system can
simultaneously verify multiple instruments, which can reduce the calibration time of
the power meter, reduce the labor intensity of the calibration personnel, and improve
the calibration efficiency of the instrument.

References

1. Li, Q., Fang, Y., He, Y.: Automatic reading system based on automatic alignment control for
pointer meter. In: Industrial Electronics Society, IECON 2014—40th Annual Conference of
the IEEE, pp. 3414–3418 (2014)
2. Yue, X.F., Min, Z., Zhou, X.D., et al.: The research on auto-recognition method for analogy
measuring instruments. In: International Conference on Computer, mechatronics, Control and
Electronic Engineering, pp. 207–210 (2010)
3. Zhang, J., Wang, Y., Lin, F.: Automatic reading recognition system for analog measuring
instruments base on digital image processing. J. Appl. Sci. (13), 2562–2567 (2013)
4. Chen, C., Wang, S.: A PC-based adaptative software for automatic calibration of power trans-
ducers. IEEE Trans. Instrum. Meas. (46), 1145–1149 (1997)
5. Pang, L.S.L., Chan, W.L.: Computer vision application in automatic meter calibration. In:
Fourteenth IAS Annual Meeting. Conference Record of the 2005, pp. 1731–1735 (2005)
6. Smith, J.A., Katzmann, F.L.: Computer-aided DMM calibration software with enhanced AC
precision. IEEE Trans. Instrum. Meas. 36, 888–893 (1987)
7. Wang, S.C., Chen, C.L.: Computer-aided transducer calibration system for a practical power
system. In: IEE Proc. Sci. Measure. Technol. (6), 459–462 (1995)

8. Advantech. PCI-1752/PCI-1752USO User Manual. Taiwan, p. 24 (2016)


9. Semenko, N.G., Utkin, A.I., Lezhnin, F.K.: Automatic calibration of dc power meters. Measure.
Tech. (29), 433–437 (1986)
10. Sablatnig, R., Kropatsch, W.G.: Automatic reading of analog display instruments. In: Confer-
ence A: Computer Vision & Image Processing
11. Edward, C.P.: Support vector machine based automatic electric meter reading system. In:
2013 IEEE International Conference on Computational Intelligence and Computing Research
(ICCIC). IEEE, pp. 1–5 (2013)
Part III
Pattern Recognition
and Its Applications
Chapter 29
Foreground Extraction Based
on 20-Neighborhood Color Motif
Co-occurrence Matrix

Chun-Feng Guo, Guo Tai Chen, Lin Xu and Chao-Fan Xie

Abstract On the basis of traditional gray level co-occurrence matrix (GLCM) and
8-neighborhood element matrix, a novel 20- or twenty-neighborhood color motif co-
occurrence matrix (TCMCM) is proposed and used to extract the foreground in color
videos. The processing of extracting the foreground is briefly described as follows.
First, the background is constructed by averaging the first M frames of the considered video. Following this, the TCMCM of each point is computed in the current
frame and background frame respectively. Next, based on the TCMCM, the entropy,
moment of inertia and energy in each of their color channel are introduced to represent
color texture features. Finally, Euclidean distance is used to measure the similarity of
color texture features between the foreground and background. Experimental results
show that the presented method can be effectively applied to foreground extraction
in color video, and can get better performance on the foreground extraction than the
traditional method based on GLCM.

Keywords Foreground extraction · Motif matrix · Gray level co-occurrence matrix · Color motif co-occurrence matrix

29.1 Introduction

With the development of the Internet and the wide application of visual sensors,
people have entered an era of information explosion. How to accurately and quickly
extract the interesting foreground or target from a large amount of visual infor-
mation will directly affect the follow-up tracking and positioning, and it also is a

C.-F. Guo (B) · G. T. Chen · C.-F. Xie


School of Electronic and Information Engineering, Fuqing Branch of Fujian Normal University,
Fuzhou, FuJian, China
e-mail: splin88@qq.com
G. T. Chen · L. Xu
Key Laboratory of Nondestructive Testing, Fuqing Branch of Fujian Normal University, Fuzhou,
FuJian, China


key preprocessing step for the subsequent prediction of target behavior and scene understanding. At present, the classical methods of foreground extraction include the optical flow method, the frame difference method and the background difference method [1]. However, the optical flow method requires multiple iterative operations, which makes its computation complex and time-consuming; moreover, it has poor anti-noise ability and is rarely applied in real scenarios [2]. The frame difference method easily produces holes and image dragging for rapidly moving foreground objects and has low accuracy [3]. The background difference method depends on the background updating model. In addition, the shadow generated by light is detected as foreground by most detection methods, because it has the same motion property as the target, which affects the accuracy of extraction.
As an important perception cue on the surface of objects, texture is widely used
in feature extraction. Therefore, this paper starts with texture features and looks
for a method of foreground extraction according to the texture similarity between
foreground and background. At present, the methods of texture feature extraction mainly fall into statistical methods and structural methods [4]. The gray level co-occurrence matrix (GLCM) is a classical statistical method [5], and the motif matrix is
GLCM and its derivative matrix (gray motif co-occurrence matrix [7, 8]) are
mainly based on gray level information for computing statistical features, and to the
best of our knowledge, few studies have been presented on color images [7–10]. In
fact, color features can provide abundant information of color, which is conducive
to the extraction and detection of image features. Therefore, this paper intends to
mix the color features of an image and GLCM, and presents the color motif co-
occurrence matrix. However, GLCM is mainly based on 8-neighborhood motif matrix
of each pixel, and it often occurs that the extraction of moving objects in an image
is incomplete or the extraction of small moving objects is not available. Therefore,
by expanding the 8-neighborhood motif matrix, 20- or twenty-neighborhood color
motif co-occurrence matrix (TCMCM) is proposed in this paper. A new algorithm
based on the proposed TCMCM is applied to extract the foreground from the color
video, which obtains more accurate information from neighborhood pixels, and dis-
tinguishes foreground target points and background points according to different
texture features of foreground and background, so as to extract the interesting fore-
ground.

29.2 Traditional Gray Level Co-occurrence Matrix

GLCM has been proposed by Haralick et al. [11]. It characterizes texture features
statistics according to the spatial correlation and gray level relationship between
paired pixels of an image, and it has been widely used in various fields in recent
years [12].

GLCM is used to present the occurrence probability of paired pixels. Let L be the
gray level of image, i and j denote the respective gray values of any paired pixels,
which are between 0 and L − 1. More notations are shown as follows. θ is the angle
between the line determined by paired pixels and horizontal plane, which reflects the
direction of the paired pixels. Usually, the value of θ is 0, 45, 90 or 135 with unit of
degree. λ denotes the distance between two pixels of any pair. Thus, the element of
a GLCM is expressed with the above notations as follows [13].

P_{\lambda}^{\theta}(i, j) \quad (i, j = 0, 1, \ldots, L-1)    (29.1)

When the direction and distance between the paired pixels are determined, the
corresponding GLCM is the expression of Eq. (29.2).
P_{\lambda}^{\theta} = \begin{bmatrix} p(0,0) & \cdots & p(0,j) & \cdots & p(0,L-1) \\ \vdots & & \vdots & & \vdots \\ p(i,0) & \cdots & p(i,j) & \cdots & p(i,L-1) \\ \vdots & & \vdots & & \vdots \\ p(L-1,0) & \cdots & p(L-1,j) & \cdots & p(L-1,L-1) \end{bmatrix}    (29.2)

Computing the GLCM is expensive, and methods based only on the GLCM describe the texture inaccurately and thus give poor extraction results.

29.3 Traditional Motif Matrix

Motif matrix is composed of motif values, and the value at one pixel is based on its 4
or 8 neighborhood pixels. A motif value represents the torque of the neighborhood pixels
to their corresponding central pixel [14].
Suppose that the non-boundary pixel point (x, y) is considered, and G(x, y) is the
gray value of each pixel of an image. When the torque of 4 neighborhood pixels is
used to measure the motif value of the considered pixel [14], m(x, y) can be gotten
by

m(x, y) = G(x-1, y) + G(x, y-1) + G(x+1, y) + G(x, y+1), \quad x = 1, \ldots, L_x-2,\ y = 1, \ldots, L_y-2    (29.3)

Here, L x and L y are the number of quantization levels along the x and y direction
respectively.
Similarly, the motif value can be calculated as Eq. (29.4) when 8 neighborhood
pixels are considered for a central pixel (x, y) [7].

Fig. 29.1 The case of 20-neighborhood compared to the 8-neighborhood case


m(x, y) = INT\{\sqrt{2}\,[G(x-1, y-1) + G(x-1, y+1) + G(x+1, y-1) + G(x+1, y+1)] + [G(x-1, y) + G(x, y+1) + G(x+1, y) + G(x, y-1)]\}, \quad x = 1, \ldots, L_x-2,\ y = 1, \ldots, L_y-2    (29.4)

where INT {·} is the integer function.


The motif values of all non-boundary pixel points are used to form motif matrix
M, which is shown as


M = \{m(x, y)\,|\,x = 1, \ldots, L_x-2,\ y = 1, \ldots, L_y-2\}    (29.5)

29.3.1 Motif Matrix with More Neighborhoods

In our work, we expand the 8-neighborhood matrix to a 20-neighborhood matrix. In the 8-neighborhood case, the 8 pixels around a pixel, as shown in Fig. 29.1a, are used to calculate the motif value of that pixel. To obtain more information around a
pixel, 20 neighborhood pixels around the pixel are considered to calculate the motif
value of the pixel. The case of 20 neighborhood pixels is shown in Fig. 29.1b. In
the figures, ⊗ is the considered pixel, • and × denote the neighborhood pixel points
around the considered pixel. × is the expanded pixel in comparison with the case of
8 neighborhood pixels.
The torque value of 20 neighborhood points to the current pixel is computed as
the motif value of the current pixel (x, y), and the expression is

m(x, y) = INT\{2\,[G(x, y-2) + G(x, y+2) + G(x-2, y) + G(x+2, y)] + \sqrt{5}\,[G(x-2, y-1) + G(x-2, y+1) + G(x-1, y-2) + G(x-1, y+2) + G(x+1, y-2) + G(x+1, y+2) + G(x+2, y-1) + G(x+2, y+1)]\} + INT\{\sqrt{2}\,[G(x-1, y-1) + G(x-1, y+1) + G(x+1, y-1) + G(x+1, y+1)] + [G(x-1, y) + G(x, y+1) + G(x+1, y) + G(x, y-1)]\}, \quad x = 1, \ldots, L_x-2,\ y = 1, \ldots, L_y-2    (29.6)
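A small sketch of this computation is given below, assuming G is an integer-valued gray image stored as a 2-D array and that (x, y) lies at least two pixels away from the image border.

```python
import numpy as np

def motif_20(G, x, y):
    """Torque-style motif value of pixel (x, y) over its 20 neighbors,
    following Eq. (29.6): neighbors are weighted by their distance to the
    central pixel (2, sqrt(5), sqrt(2) or 1) and the two groups are
    truncated to integers as in INT{.}."""
    dist2 = G[x, y - 2] + G[x, y + 2] + G[x - 2, y] + G[x + 2, y]
    dist_sqrt5 = (G[x - 2, y - 1] + G[x - 2, y + 1] + G[x - 1, y - 2] +
                  G[x - 1, y + 2] + G[x + 1, y - 2] + G[x + 1, y + 2] +
                  G[x + 2, y - 1] + G[x + 2, y + 1])
    diag = (G[x - 1, y - 1] + G[x - 1, y + 1] +
            G[x + 1, y - 1] + G[x + 1, y + 1])
    cross = G[x - 1, y] + G[x, y + 1] + G[x + 1, y] + G[x, y - 1]
    return int(2 * dist2 + np.sqrt(5) * dist_sqrt5) + int(np.sqrt(2) * diag + cross)
```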

29.4 The Proposed Method

29.4.1 Color Motif Co-occurrence Matrix

GLCM and its derivative matrices mainly present statistical texture features of an image based on gray level information. For small foreground targets or a small color difference between foreground and background, the aforementioned matrices easily lead to incomplete extraction of the target. For color videos, each
color channel has texture information [15]. To improve the extraction performance
in color videos, we construct TCMCM on the basis of GLCM, color feature and
20-neighborhood motif matrix. The element of the constructed TCMCM matrix is
expressed as

CP(i, j, r, t|λ, θ ) (i = 0, 1, . . . , L1 , j = 0, 1, . . . , L2 ) (29.7)

where L 1 denotes the maximum value of color co-occurrence matrix on each channel
of RGB, and L 2 is the maximum motif value of 20-neighborhood.
CP(i, j, r, t|λ, θ ) is the number of paired pixels when the r-th channel value is i
and the motif value is j under the conditions of direction θ and distance λ in the color
image at time t in video.
In order to reduce the computation, the values in this paper are compressed and
quantized into 16 levels before constructing the color motif co-occurrence matrix.

29.4.2 Texture Features

The color motif co-occurrence matrix cannot be directly used as a feature; its elements are used for further statistics. The entropy, energy, contrast, correlation, moment of inertia, inverse difference moment, angular second moment and other features (14 in total) are usually considered as texture statistics [16, 17]. In order to reduce the computation and to combine the features of foreground and background, our work selects the entropy, moment of inertia and energy as texture statistics, which are shown in Eqs. (29.8)-(29.10) respectively. These quantities have strong descriptive ability as features of foreground and background texture.
Entropy:

H(t, r, \lambda, \theta) = -\sum_{i}\sum_{j} CP(i, j, r, t|\lambda, \theta)\,\log CP(i, j, r, t|\lambda, \theta)    (29.8)

Energy:
E(t, r, \lambda, \theta) = \sum_{i}\sum_{j} \left[CP(i, j, r, t|\lambda, \theta)\right]^{2}    (29.9)

Moment of inertia:

I(t, r, \lambda, \theta) = \sum_{i}\sum_{j} (i-j)^{2}\, CP(i, j, r, t|\lambda, \theta)    (29.10)

Considering that color motif co-occurrence matrix represents the spatial depen-
dence between image pixels and the comprehensive information of color space, we
construct the color texture feature vector by using the nine parameters of the texture
features of R, G and B channels as

V (t, x, y) = (H1 , H2 , H3 , E1 , E2 , E3 , I1 , I2 , I3 ) (29.11)

Here, V (t, x, y) represents the color texture feature statistics corresponding to


the surrounding neighborhood on the image position (x, y) at time t. H 1 , H 2 and
H 3 are the entropy of each component on the color channel corresponding to RGB
respectively, E 1 , E 2 and E 3 are the energy of each component on the color channel
corresponding to RGB respectively, and I 1 , I 2 and I 3 are the inertia moments of each
component on the color channel corresponding to RGB respectively.

29.4.3 Similarity Measurement

In order to describe the similarity between current foreground region and the back-
ground region at time t in video, Euclidean distance as Eq. (29.12) is introduced to
measure the similarity of foreground and background texture feature.

d(t, x, y) = \sqrt{\left(V_{t}^{f} - V^{b}\right)\left(V_{t}^{f} - V^{b}\right)^{T}}    (29.12)

Here, V is obtained by Eq. (29.11), and the superscript f and b denote foreground
and background respectively.
The smaller Euclidean distance means the higher similarity between the current
pixel texture and the background texture.
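The texture statistics and the similarity measure can be sketched as follows. The per-channel co-occurrence matrices are assumed to be given as 2-D arrays, the energy and moment of inertia are implemented with their conventional (positive) definitions, and a small constant guards the logarithm in the entropy.

```python
import numpy as np

def texture_features(cp):
    """Entropy, energy and moment of inertia of one co-occurrence matrix,
    following Eqs. (29.8)-(29.10)."""
    cp = np.asarray(cp, dtype=float)
    entropy = -np.sum(cp * np.log(cp + 1e-12))
    energy = np.sum(cp ** 2)
    i_idx, j_idx = np.indices(cp.shape)
    inertia = np.sum((i_idx - j_idx) ** 2 * cp)
    return entropy, energy, inertia

def feature_vector(cp_r, cp_g, cp_b):
    """Nine-dimensional color texture vector of Eq. (29.11): the three
    statistics computed on the R, G and B channel matrices."""
    h, e, i = zip(*(texture_features(cp) for cp in (cp_r, cp_g, cp_b)))
    return np.array(list(h) + list(e) + list(i))

def similarity(v_fg, v_bg):
    """Euclidean distance of Eq. (29.12); a smaller value means the current
    pixel texture is closer to the background texture."""
    return float(np.linalg.norm(np.asarray(v_fg) - np.asarray(v_bg)))
```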

29.4.4 The Proposed Algorithm

By combining the color co-occurrence matrix and the 20-neighborhood motif matrix into the 20-neighborhood color motif co-occurrence matrix, the processing of extracting the foreground is shown in the following steps (a code sketch of the resulting loop follows the list):

(1) The first M (>100) frames of a video are input and the average of these frames is calculated to build up the background model. The value of M is decided by
video size and complexity, usually it is set to 100, 200, 300 or more.
(2) Calculate the TCMCM of each point in the background model, and, the texture
feature quantity in the neighborhood around each pixel in the background image
is measured as V b according to Eq. (29.11).
(3) Input the time-t frame of the video and calculate the TCMCM of each pixel in
the current image. According to Eq. (29.11), the texture feature quantity in the
f
neighborhood around each pixel is measured as Vt at this moment.
(4) Based on Eq. (29.12), the similarity of texture feature quantity of each pixel (x,
y) between the current frame and background frame is calculated.
(5) If the similarity of texture feature quantity of each pixel (x, y) is less than the
threshold T, the current pixel point (x, y) belongs to the background point;
otherwise, it belongs to the foreground point.
(6) Input the next frame and repeat the processing from step (3) until all frames of
the video have been processed.
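Putting the steps together, a skeleton of the extraction loop might look like the following. The function `tcmcm_features` is passed in as an assumption: it should return the 9-dimensional vector of Eq. (29.11) for every pixel of a color frame (for instance built from the helpers sketched above), and the threshold value is only illustrative.

```python
import numpy as np

def extract_foreground(frames, tcmcm_features, n_background=100, threshold=0.35):
    """Skeleton of steps (1)-(6): build the background by averaging the first
    frames, compute per-pixel TCMCM features, and threshold the Euclidean
    distance between current and background features."""
    frames = np.asarray(frames, dtype=float)
    background = frames[:n_background].mean(axis=0)            # step (1)
    v_bg = tcmcm_features(background)                          # step (2), shape (H, W, 9)
    masks = []
    for frame in frames[n_background:]:                        # steps (3)-(6)
        v_fg = tcmcm_features(frame)
        dist = np.linalg.norm(v_fg - v_bg, axis=-1)            # step (4)
        masks.append(dist >= threshold)                        # step (5): True = foreground
    return masks
```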

29.5 Experiment

The experimental environment in our work is MATLAB 2015b on a computer with


Windows 10.
To verify the effectiveness of the proposed algorithm, multiple groups of different
color videos are considered. Meanwhile, the experimental results under our algorithm
are compared with those of the traditional method based on GLCM. Subsequent de-noising and morphological processing are not included in the experiments, so as to better compare the difference between the two sets of results.
Four scenarios are considered in this paper. Figures 29.2, 29.3 and 29.4 are the
videos on crossroad, highway and parking lot respectively, and Fig. 29.5 shows the
case of a pedestrian in strong light. The proposed algorithm based on TCMCM and the
traditional algorithm are both used for these scenarios.
By observing the above experimental results, it can be seen that the proposed
algorithm based on TCMCM can extract the foreground of color video. In comparison
with the traditional method based on GLCM, the proposed algorithm is more accurate
for the foreground extraction in the color video. The reason is the comprehensive
consideration of color features and structural features in the TCMCM:
For targets with bright colors as shown in Figs. 29.2 and 29.4, our presented
method can extract the foreground without missing detection. When the color of the
target is similar to the background as shown in Fig. 29.2a, the foreground can be more
accurately extracted without any missed detection with our method in Fig. 29.2c than
the results in Fig. 29.2b by the traditional matrix.
Due to more neighborhood pixels considered, the values in TCMCM can have
more information from the pixels around the corresponding pixels, which is beneficial

(a) The 530th frame (b) The traditional method (c) The proposed method

(d) The 654th frame (e) The traditional method (f) The proposed method

Fig. 29.2 Results of detection for a crossroad video

(a) The 89th frame (b) The traditional method (c) The proposed method

(d) The 176th frame (e) The traditional method (f) The proposed method

Fig. 29.3 Results of detection for a highway video

(a) The 189th frame (b) The traditional method (c) The proposed method

Fig. 29.4 Results of detection for a parking video



(a) The 115th frame (b) The traditional method (c) The proposed method

Fig. 29.5 Results of detection for a pedestrian video in strong light

to detect small targets as in Fig. 29.2c, and incomplete extraction of individual moving
objects will not occur as shown in Figs. 29.2 and 29.4.
For videos including motion shadow generated with target motion as Figs. 29.3
and 29.5, this proposed method has higher accuracy of foreground extraction and
less noise than the traditional method.

29.6 Conclusions

A 20-neighborhood color motif co-occurrence matrix has been presented based on


the traditional GLCM. Based on the TCMCM, the entropy, energy and moment
of inertia of each color channel are calculated as the features of foreground and
background. And then the calculated results are used to distinguish the foreground
and background. The processing of extracting foreground or targets also has been
described in this paper. The experimental results have shown that the method based
on the proposed matrix has better performance of extracting the foreground in the
color videos of the considered scenarios in comparison with the traditional gray level
co-occurrence matrix.

Acknowledgements This work is supported by Educational Research Project for Young and
Middle-aged Teachers of Fujian No. JAT-170667 and Teaching Reform Project of Fuqing Branch
of Fujian Normal University No. XJ14010.

References

1. Lin, G., Wang, C.: Improved three frame difference method and background difference method
a combination of moving target detection algorithm. Equip. Manuf. Technol. 3, 172–173 (2018)
2. Fu, D.: Vehicle detection algorithm based on background modeling. University of Science and
Technology of China, Hefei (2015)
3. Guo, C.: Target tracking algorithm based on improved five-frame difference and mean shift. J.
Langfang Normal Univ. 18(1), 21–24 (2018)
4. Jian, C., Hu, J., Cui, G.: Texture feature extraction method of camouflage effect evaluation
model. Comm. Control Simul. 39(3), 102–105 (2017)

5. Gao, C., Hui, X.: GLCM-Based texture feature extraction. Comput. Syst. Appl. 19(6), 195–198
(2010)
6. Liu, X.: ROI digital watermarking based on texture characteristics. Hangzhou Dianzi Univer-
sity, Hangzhou (2011)
7. Wang, L., Ou, Z.: Image texture analysis by grey-primitive co-occurrence matrix. Comput.
Eng. 30(23), 19–21 (2004)
8. Hou, J., Chen, Y., He, S., et al.: New definition of image texture feature. Comput. Appl. Softw.
24(9), 157–158 (2007)
9. Song, L., Wang, X.: An image retrieval algorithm integrating color and texture features. Comp.
Eng. Appl. 47(34), 203–206 (2011)
10. Yu, S., Zeng, J., Xie, L.: Image retrieval algorithm based on multi-feature fusion. Comput. Eng.
38(24), 216–219 (2012)
11. Haralick, R.M., Shanmugam, K., Dinstein, I.: Textural features for image classification. IEEE
Trans. Syst. Man Cybern. 3(6), 610–621 (1973)
12. Ghulam, M., Mohammed, A., Hossain, M., et al.: Enhanced living by assessing voice pathology
using a co-occurrence matrix. Sensors 17(2), 267 (2017)
13. Wang, H., Li, H.: Classification recognition of impurities in seed cotton based on local binary
pattern and gray level co-occurrence matrix. Trans. Chin. Soc. Agric. Eng. 31(3), 236–240
(2015)
14. Wang, L., Ou, Z., Su, T., et al.: Content-based image retrieval in database using SVM and gray
primitive co-occurrence matrix. J. Dalian Univ. Technol. (4), 475–478 (2003)
15. Xu, F.: Classification of texture features based on color symbiosis matrix. J. Zhejiang Ind.
Trade Vocat. Coll. 16(4), 54–58 (2016)
16. Gui, W., Liu, J., Yang, C., et al.: Color co-occurrence matrix based froth image texture extraction
for mineral flotation. Miner. Eng. 60–67 (2013)
17. Jiao, P., Guo, Y., Liu, L., et al.: Implementation of gray level co-occurrence matrix texture
feature extraction using Matlab. Comput. Technol. Dev. 22(11), 169–171 (2012)
Chapter 30
Deformation Analysis of Crude Oil
Pipeline Caused by Pipe Corrosion
and Leakage

Yuhong Zhang, Gui Gao, Hang Liu, Qianhe Meng and Yuli Li

Abstract In this paper, models of pipeline corrosion and leakage were built with Ansys software. Computational Fluid Dynamics (CFD) simulation and unidirectional fluid-solid coupling simulation were carried out for the corrosion and leakage conditions of the pipeline. The results show that when the pipe is corroded by 2 mm, the deformation of the pipe increases to 5.2 × 10−9 m. When the pipe leaks, the deformation near leak holes of different shapes changes, and the deformation near the leak hole is the largest. This conclusion provides an effective means for studying pipeline corrosion and leak detection technology.

Keywords CFD simulation · Pipeline leakage · Pipeline monitoring

30.1 Introduction

At present, crude oil and natural gas are mainly transported through pipelines, and energy consumption grows with the rapid development of the national economy. The pipelines built in the Sinopec system include a number of refined oil pipelines such as the Southwest Oil Products Pipeline, the Pearl River Delta Pipeline, and the Lusong Pipeline. However, most of the existing crude oil pipelines were

Y. Zhang · G. Gao (B) · H. Liu · Y. Li


School of Electrical and Computer Engineering, Jilin Jianzhu University,
Changchun 130118, China
e-mail: gaogui1234@sina.cn
Y. Zhang
e-mail: zhangyuhong@jlju.edu.cn
H. Liu
e-mail: 793520684@qq.com
Q. Meng
Glasgow College, University of Electronic Science and Technology of China,
Chengdu 610054, China


built about 30 years ago. Therefore, the oil pipeline network has entered a risk period for accidents because of spiral weld defects and corrosion; there are many risk factors, and the safety production situation is severe [1]. Understanding the operational status of oil pipelines and finding problems during pipeline transportation is therefore very important.
Pipeline detection and monitoring techniques have been widely investigated; pipeline inspection can be divided into two aspects: pipeline corrosion detection and pipeline leakage detection [2]. The detection technologies that can be used for both corrosion and leakage monitoring include magnetic flux leakage detection, acoustic emission detection and optical fiber sensing detection [3]. The fiber-optic
sensing technology has obvious advantages in safety, measurement accuracy and
long-distance transmission, which could meet the potential requirements of pipeline
corrosion and leakage monitoring [4].
The deformation of the pipeline changes with corrosion and leakage. In order to analyze the deformation of the pipeline, we simulated the stress distribution of the pipeline under different states with Ansys software. The simulation results provide a theoretical reference for selecting a suitable stress sensor to detect the deformation of the pipeline.

30.2 Principle

The flow of fluids should follow a series of standard fluid mechanics conservation
equations such as mass conservation, conservation of momentum, conservation of
energy, and conservation of chemical components [6]. Neglecting local flow disturbance and heat transfer at the pipe joints, the fluid density can be assumed constant and the fluid incompressible. The governing
differential equations of the fluid in this simulation are shown in Eqs. (30.1), (30.2),
and (30.3) respectively:
Continuity equation:

∂ρ ∂(ρuj )
+ =0 (30.1)
∂t ∂xj

Momentum conservation equation:


     
\frac{\partial (\rho u_i)}{\partial t} + \frac{\partial (\rho u_i u_j)}{\partial x_j} = -\frac{\partial p}{\partial x_i} + \frac{\partial}{\partial x_j}\left[\mu\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right)\right] + (\rho - \rho_0)g_i    (30.2)

Energy conservation equation:


    
\frac{\partial (\rho T)}{\partial t} + \frac{\partial (\rho u_i T)}{\partial x_i} = \frac{\partial}{\partial x_j}\left(\frac{\mu_t}{\sigma_t}\frac{\partial T}{\partial x_j}\right) + \frac{C_{pv} - C_{pa}}{C_p}\frac{\mu_t}{\sigma_c}\frac{\partial \omega}{\partial x_j}\frac{\partial T}{\partial x_j}    (30.3)

Here u_j (m/s) is the velocity in the j direction; x_i (m) is the distance in the i direction; ρ (kg/m³) is the fluid density; T (K) is the temperature; C_p (J/(kg K)) is the constant-pressure specific heat of the fluid; C_pv (J/(kg K)) is the constant-pressure specific heat of the leaking substance; C_pa (J/(kg K)) is the constant-pressure specific heat of air.
Due to the fast speed of oil transportation in the crude oil pipeline and the large
diameter of the pipeline, the Reynolds number is large. Therefore, the fluid flowing
state in the pipeline is generally turbulent. The standard k-ε model is used as the turbulence model in the solution. The k-ε model is the two-equation model proposed by Spalding; the turbulent kinetic energy k and the dissipation rate ε are obtained by solving the turbulent kinetic energy equation and the dissipation rate equation, which are shown in Eqs. (30.4) and (30.5) respectively.
  
\frac{\partial (\rho k)}{\partial t} + \frac{\partial (\rho k u_i)}{\partial x_i} = \frac{\partial}{\partial x_j}\left[\left(\mu + \frac{\mu_t}{\sigma_k}\right)\frac{\partial k}{\partial x_j}\right] + G_k + G_b - \rho\varepsilon - Y_M + S_k    (30.4)

\frac{\partial (\rho \varepsilon)}{\partial t} + \frac{\partial (\rho \varepsilon u_i)}{\partial x_i} = \frac{\partial}{\partial x_j}\left[\left(\mu + \frac{\mu_t}{\sigma_\varepsilon}\right)\frac{\partial \varepsilon}{\partial x_j}\right] + C_{1\varepsilon}\frac{\varepsilon}{k}\left(G_k + C_{3\varepsilon} G_b\right) - C_{2\varepsilon}\rho\frac{\varepsilon^2}{k} + S_\varepsilon    (30.5)

ρ (kg/m3 ) is the medium density; t (s) is time; ui (m/s) is the speed in the i
direction; xi (m) is the displacement in the i direction; xj (m) is the displacement
in the j direction; μ is the molecular viscosity; μt is the turbulent viscosity; Gk (J)
is the turbulent flow energy generated by the average velocity gradient; Gb (J) is
the turbulent flow energy generated by buoyancy; YM is the turbulent fluctuation of
the compressible fluid and the effect of expansion on the overall dissipation rate;
C_1ε = 1.44, C_2ε = 1.92 and C_3ε = 0.99 are empirical constants; S_k and S_ε are user-defined source terms.

30.3 Ansys Simulation

30.3.1 Physical Model

We took an above-ground crude oil pipeline as the Ansys model. The pipeline model was built with the Design Modeler tool of Ansys software (see Fig. 30.1). The length of the pipeline is 2 m, the outer diameter is 220 cm, and the wall thickness is 7.5 mm. Pipeline models were established for leakage and corrosion conditions, respectively. The leaking pipe models were built by placing the leak at the middle of the pipe and varying the shape of the leak hole. The corrosion pipe models were built with different thicknesses (0.5, 1.0 and 1.5 mm) of pipeline corrosion.

Fig. 30.1 Normal pipeline model

30.3.2 Simulation Conditions

The pipeline material is 20# steel with an elastic modulus of 210 GPa, a Poisson's ratio of 0.3, a density of 7800 kg/m³ and a yield strength of 245 MPa. The inflow medium is liquid product oil with a density of 1200 kg/m³ and a dynamic viscosity of 1.3 × 10−3 Pa·s.
The k-ε model was adopted for the steady-state simulation. The SIMPLEC scheme was adopted for the pressure–velocity coupling in the iteration to improve the convergence speed, and a two-dimensional unsteady flow model was adopted to reduce the computation time. In the unidirectional fluid–solid coupling simulation,
calculation speed. In the unidirectional fluid–solid coupling simulation calculation,
zero displacement constraints were applied to the inlet and outlet of the pipeline.
Boundary conditions:
• Pipeline inlet flow rate is 1 m/s;
• The pressure at the outlet of the pipe is equal to 1000 Pa;
• The pressure at the leak hole of the pipe is equal to 0 Pa [9].

30.3.3 Failure Criteria

According to the third strength criterion, the pipe is considered to have failed when the equivalent stress of the corrosion defect zone exceeds the yield strength. Using this elastic failure criterion, the Von Mises stress is expressed as Eq. (30.6):

\sigma_s = \sqrt{\frac{1}{2}\left[(\sigma_1 - \sigma_2)^2 + (\sigma_2 - \sigma_3)^2 + (\sigma_3 - \sigma_1)^2\right]}    (30.6)

σs (MPa) is the equivalent (Von Mises) stress; σ1, σ2, σ3 (MPa) are the principal stresses in the three directions.
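As a small worked illustration of this criterion, the check can be written as follows, assuming the principal stresses are given in MPa and using the 245 MPa yield strength of 20# steel stated in Sect. 30.3.2.

```python
import math

YIELD_STRENGTH_MPA = 245.0  # yield strength of 20# steel (Sect. 30.3.2)

def von_mises(sigma1, sigma2, sigma3):
    """Equivalent (Von Mises) stress from the three principal stresses, Eq. (30.6)."""
    return math.sqrt(0.5 * ((sigma1 - sigma2) ** 2 +
                            (sigma2 - sigma3) ** 2 +
                            (sigma3 - sigma1) ** 2))

def is_failed(sigma1, sigma2, sigma3, yield_strength=YIELD_STRENGTH_MPA):
    """The defect zone is considered failed once the equivalent stress
    exceeds the yield strength."""
    return von_mises(sigma1, sigma2, sigma3) > yield_strength
```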

30.4 Simulation Results Analysis

30.4.1 Influence of Different Shape Leakage Holes

Figure 30.2 shows the deformation distribution of the pipeline with leak holes of different shapes and of the non-leaking pipeline. The deformation and pressure values are listed in Table 30.1. The comparison shows that the deformation of the pipeline under normal working conditions is uniform. Under the same internal pressure, when the pipeline leaks, the deformation of the pipeline and the pressure near the leak hole increase sharply, and the deformation near leak holes of different shapes is different.

Fig. 30.2 Crude oil pipeline deformation distribution cloud map, a normal, b round holes, c square
holes, d elliptical holes

Table 30.1 Effect of leak holes of different shapes on the pipeline

Leak hole type   Pipeline pressure (Pa)   Total deformation of pipeline (m)   Pipe state
No leakage       24,393                   8.921 × 10−9                        Valid
Round hole       3.16 × 10^9              0.0064                              Invalid
Square hole      9.212 × 10^10            0.03487                             Invalid
Oval hole        1.46 × 10^10             0.02779                             Invalid

30.4.2 Influence of Different Depth of Corrosion

Figure 30.3 shows the deformation distribution of the pipeline with different corrosion depths and without corrosion. The deformation and pressure values are listed in Table 30.2. The deformation of the pipeline is uniform when there is no corrosion, but the total deformation changes when the pipe wall becomes thin or corroded, and under the same internal pressure the deformation of the pipe becomes larger as the corrosion depth increases.

Fig. 30.3 Deformation distribution of crude oil pipelines with different degrees of corrosion, e no
corrosion, f inner wall corrosion 1 mm, g inner wall corrosion 2 mm, h inner wall corrosion 3 mm

Table 30.2 Effect of different degrees of corrosion on the pipeline

Pipe condition              Pipeline pressure (Pa)   Total deformation of pipeline (m)   Pipe state
No corrosion                24,393                   8.921 × 10−9                        Valid
Inner wall corrosion 1 mm   25,431                   1.284 × 10−8                        Valid
Inner wall corrosion 2 mm   27,810                   1.373 × 10−8                        Valid
Inner wall corrosion 3 mm   31,024                   1.510 × 10−8                        Valid

30.4.3 Sensor Selection

Based on the above simulation results, it can be seen that the pipeline deforms when it leaks or corrodes. The deformation of the pipeline in the corroded state is about 10−8 m (see Table 30.2). The resolution of a fiber Bragg grating strain sensor can reach 1 pm, with the advantages of high safety and high measurement accuracy. Therefore, fiber Bragg gratings and other high-accuracy sensors could be used to monitor the running status of the pipeline.

30.5 Conclusion

The simulation results show that the deformation of the pipeline is close to zero under normal operation, while the deformation near the leak hole increases sharply if the pipeline leaks. When the pipeline is corroded, the deformation of the pipeline increases with the corrosion depth. Therefore, the working state of the oil pipeline can be judged by detecting the change of the pipeline's deformation. The minimum deformation of the pipeline is about 10−8 m in the corroded state. This result provides a reference for selecting the sensor used to monitor the running state of the pipeline.

Acknowledgements This work was supported by National Natural Science Foundation of China
(NSFC) (Grant No: 61705077), Science Foundation of Jilin Province Education Department (No:
92001001).

References

1. Baoqun, W., Yanhong, L., Yibin, D., Xinyu, C.: Current situation and prospect of China’s crude
oil pipeline. Pet. Plan. Des. 8–11 (2012)
2. Yanhui, Z., Tao, Z., Yigui, Z., Qu, H., Penghu, Z.: Numerical simulation of erosion and corrosion
in T-tube of gathering pipeline. Contemp. Chem. Ind. 43(11), 2457–2459 (2014)
3. Guozhong, W., Dong, L., Yanbin, Q.: Numerical simulation of surface temperature field of
underground oil stealing pipeline and buried oil pipeline. J. Pet. Nat. Gas 10, 815–817 (2005)
4. Jingcui, L., Kub, B., Dongmei, D., Qing, H.: Simulation of micro-leakage flow field detection
in natural gas pipeline. Comput. Simul. 10, 361–366 (2017)
5. Fuxing, Z., Pengfei, Z., Yinghao, Q.: Stress analysis of pipeline deformation based on ANSYS.
Chem. Equip. Technol. 37(2), 47–49 (2016)
6. Hongjun, Z.: Ansys+ 14. 5 practical guide for thermo fluid solid coupling, pp. 147–156.
People’s post and Telecommunications Publishing, Beijing (2014)
7. Hongchi, H., He, Q., Jingcui, L., Zhibing, C.: Analysis of the influence of leakage hole shape
on leakage characteristics. Electr. Power Sci. Eng. 34(1), 73–78 (2018)
8. Jianming, F., Hongxiang, Z., Guoming, C., Xiaoyun, Z., Yuan, Z, Ting, R.: Effect of geometric
shape of cracks on leakage of small holes in gas pipelines. Nat. Gas Ind. 34(11), 128–133
(2014)

9. Sousa, C.A.D., Romero, O.J.: Influence of oil leakage in the pressure and flow rate behaviors
in pipeline (2017)
10. Hongyu, L.: Leakage detection technology for long distance natural gas pipeline. Chem. Manag.
10, 97–103 (2018)
11. Yingliang, W.: Leakage test and numerical simulation study of pipeline orifice. Zhejiang Univ.
45(2), 14–19 (2015)
12. Hongbing, H.: Analysis of the research status of natural gas pipeline leakage. Contemp. Chem.
Ind. 352–354 (2016)
Chapter 31
Open Information Extraction
for Mongolian Language

Ganchimeg Lkhagvasuren and Javkhlan Rentsendorj

Abstract In this paper, we describe MongoIE, an Open Information Extraction (Open IE) system for the Mongolian language. We present the characteristics of the language and, after analyzing the available preprocessing tools, we describe the features used for building the system. We have implemented two different approaches: (1) Rule-based and (2) Classification. Here, we describe them, analyze their errors and present their results. To the best of our knowledge, this is the first attempt at building Open IE systems for Mongolian. We conclude by suggesting possible future improvements and directions.

31.1 Introduction

For the past decade, Open IE has been developed using various methods and for many languages. These methods show different results across languages because every language has its own peculiarities [1–3].
Mongolian is a language spoken by 5.2 million people all over the world. Officially, in Mongolia it is written in Cyrillic, even though in some other places, for instance the Inner Mongolia Autonomous Region,¹ the traditional Mongolian script is used. Mongolian is classified into the Altaic language family, and it has also been believed that Mongolian is related to Turkish and Korean. Similar to these languages, the basic word order in Mongolian is subject-object-verb (SOV) [4], which means that the subject, object and verb of a sentence usually appear in that order. For instance, if English had an SOV structure, the sentence "John plays guitar" would be expressed as "John guitar plays".

1 One of the autonomous regions of China.

G. Lkhagvasuren (B) · J. Rentsendorj


National University of Mongolia, Ulaanbaatar, Mongolia
e-mail: ganchimeg@seas.num.edu.mn
J. Rentsendorj
e-mail: javkhlan@seas.num.edu.mn


Compared with English, the Mongolian language has different grammatical tagging due to its highly agglutinative nature. In Mongolian, postpositions are considered a very important factor for understanding the syntax of sentences; it is common that almost every object is attached to a postposition. Therefore, identifying appropriate tags for Mongolian is significant both in preprocessing and in the recognition of noun phrases. Preprocessing tools for Mongolian are scarce. For example, we could not find any freely available tokenizer or sentence splitter. We tried the English tokenizer and sentence splitter from the NLTK [5] library and achieved acceptable results. As for POS (part-of-speech) tagging, to the best of our knowledge, currently only the TreeTagger [6] is freely available for Mongolian. Based on our experience, it works poorly because it was trained on a small Mongolian corpus.
Correct recognition of the association between arguments and the relation plays an important role in Open IE [7–9]. For Mongolian, as we surveyed, what can be identified as noun and verb phrases has not yet been formally defined. In terms of noun phrases, [10] was published most recently (unfortunately written in Mongolian). The contribution of that work is three rules to recognize noun phrases as well as a dataset³ in which noun phrases in about 834 sentences were manually annotated. Thus it could be exploited to build a noun phrase chunker and Open IE methods.
Recently, some researchers (e.g., MiLab⁴ at the National University of Mongolia) have contributed to natural language processing (NLP) for Mongolian. Their solutions are still preliminary and not yet adequate for use as a preprocessing step for other tasks [11].
What we observed for Mongolian, which we think will also be a problem for other languages with limited resources, is that developing Open IE is quite challenging due to the following reasons:
1. Preprocessing tools such as tokenizers and part-of-speech taggers either have not emerged yet or their performance is not sufficient
2. Lack of available datasets
3. Complexity of language structure, grammar, etc.
In this paper we discuss Rule-based and Classification methods for the Mongolian language, implemented in MongoIE, an Open Information Extraction system. Under the circumstances for Mongolian mentioned above, we considered these two approaches the most applicable. Additionally, we compare their performances on a parallel dataset. In the rest of the paper, we evaluate their results and give a brief analysis of errors.
The paper is organised as follows. Section 31.2 presents our methods in Mon-
goIE system. Experiment and a brief analysis of errors is described in Sect. 31.3.
Section 31.4 draws the conclusions and outlines future work.

3 http://172.104.34.197/brat//np-chunk/test2.
4 http://milab.num.edu.mn/.

31.2 Methods

This section describes two approaches for Open IE for the Mongolian language,
namely Rule-based and Classification.

31.2.1 Rule-Based Approach

Rule-based methods have shown reasonable results in languages such as English [12] and Spanish [13]. An advantage of this approach is that it can be adapted easily to other languages, because it requires only a reliable POS tagger [14, 15].
The approach in MongoIE is based on syntactic constraints over POS tag
sequences targeted for the Mongolian language. The text is first annotated with sen-
tences, tokens and their POS Tags. As we mentioned above, TreeTagger is used to
retrieve POS tags. In the next step, syntactic constraints are applied over sequences
of POS tags, and a list of extracted triples with subject-object-predicate tuple is
returned. Then the following basic algorithm is applied.
1. Look for a verb phrase in every POS-tagged sentence.
2. If one is found, detect a noun phrase to the left, starting from the beginning of the verb phrase.
3. If a noun phrase is detected, search for another noun phrase to the left, starting from the beginning of that noun phrase.
Verb and noun phrases are matched with the following expressions, which are specified for the Mongolian language:
Verb Phrase: (W* V) | (V),
Noun Phrase: (CONJ) | (N),
where V stands for a single verb, and (W* V) matches a verb with dependent words, where W stands for a noun, an adjective, an adverb, or a pronoun. N and CONJ stand for a noun and a conjunction, optionally preceded by a number, adjectives, or an adverb. The * symbol denotes one or more matches.
If the verb phrase is preceded by two noun phrases, these three components are considered to form a relation and are extracted in triple form.
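The pattern matching above can be illustrated with a short sketch. This is a simplified illustration rather than MongoIE's actual implementation: the tag names ("V", "N", "CONJ", "NUM", "ADJ", "ADV") are placeholders, not TreeTagger's Mongolian tagset, and the verb phrase is reduced to the verb token itself instead of the full (W* V) pattern.

```python
# Minimal sketch of the rule-based extraction described above.  Input is a
# POS-tagged sentence as a list of (token, tag) pairs.  Tag names are
# illustrative placeholders, not TreeTagger's Mongolian tagset.

NOUN_MODIFIERS = {"NUM", "ADJ", "ADV"}      # tags allowed to precede a noun head

def find_noun_phrase(tagged, end):
    """Return (start, end) indices of a noun phrase whose head is just before `end`."""
    i = end - 1
    if i < 0 or tagged[i][1] not in ("N", "CONJ"):
        return None
    start = i
    while start > 0 and tagged[start - 1][1] in NOUN_MODIFIERS:
        start -= 1
    return (start, i + 1)

def extract_triple(tagged):
    """Return one (subject, object, predicate) triple or None."""
    verb = next((i for i, (_, tag) in enumerate(tagged) if tag == "V"), None)
    if verb is None:
        return None
    obj = find_noun_phrase(tagged, verb)        # noun phrase directly left of the verb
    if obj is None:
        return None
    subj = find_noun_phrase(tagged, obj[0])     # another noun phrase left of the object
    if subj is None:
        return None
    span = lambda s, e: " ".join(tok for tok, _ in tagged[s:e])
    return (span(*subj), span(*obj), span(verb, verb + 1))

# SOV word order: "John guitar plays" -> ('John', 'guitar', 'plays')
print(extract_triple([("John", "N"), ("guitar", "N"), ("plays", "V")]))
```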

31.2.2 Classification Approach

Since there is no dependency parser for Mongolian, we were not able to imple-
ment similar approaches like TextRunner [16], WOE (pos) and WOE (parse) [17].

Table 31.1 Performance evaluation and error statistics

Criteria         Performance                      Error statistics
                 Precision   Recall   F1-score   POS     Detection   Other
Rule-based       38.84       24.60    30.12      54.11   42.35       3.54
Classification   39.14       36.16    37.76      60.81   32.53       6.66

Therefore, we also exploit TreeTagger in this module. This approach consists of two modules:
1. Candidate Extractor: In order to extract candidates, we use a procedure similar to the previous (rule-based) approach. The difference is that no expression is used to identify verb and noun phrases, because we found that the rule-based method sometimes eliminates correct sentences. To avoid ignoring correct triples, we do not employ special syntactic constraints in this module. After a sentence is tagged by TreeTagger, a verb is searched for first; if one is found, two nouns are searched for to its left. The extracted triples are then fed to the classifier module.
2. Classifier: The candidate tuples extracted by the previous module are labelled as either trustworthy or not by a Naive Bayes classifier. To train the classifier, we annotated 100 sample triples manually and use 26 features in total. Examples of features include the presence of POS tags and tag sequences in noun and verb phrases, grammatical case, the number of tokens, the number of stopwords, whether or not the subject is a proper noun, etc. A minimal sketch of this module is given below.
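The sketch below shows how such a classifier module could be wired up with scikit-learn's Naive Bayes. The three features and the tiny training set are illustrative stand-ins for the 26 features and 100 manually annotated triples described above; real inputs would be Mongolian tokens.

```python
# Minimal sketch of the classifier module: candidate triples are turned into
# feature vectors and labelled trustworthy / not trustworthy by Naive Bayes.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def triple_features(subj, obj, verb, stopwords):
    """Three illustrative features of the kind listed in the paper."""
    return [
        len(subj) + len(obj) + len(verb),                       # number of tokens
        sum(t in stopwords for t in subj + obj),                # number of stopwords
        int(bool(subj) and subj[0][0].isupper()),               # subject looks like a proper noun
    ]

stopwords = {"and", "is"}   # placeholder; a real list would contain Mongolian stopwords

# Manually annotated candidate triples would be featurised like this:
X_train = np.array([
    triple_features(["John"], ["guitar"], ["plays"], stopwords),
    triple_features(["this", "is"], ["thing"], ["be"], stopwords),
])
y_train = np.array([1, 0])          # 1 = trustworthy, 0 = not trustworthy

clf = GaussianNB().fit(X_train, y_train)
print(clf.predict([triple_features(["Bat"], ["book"], ["read"], stopwords)]))
```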

31.3 Experiments and Evaluation

To evaluate the methods presented in the previous section, we randomly selected and labeled 100 sentences from the web as a testing dataset.⁵ The results of the two approaches and the error statistics are presented in Table 31.1. As shown in Table 31.1, the classification method had better recall and F1-score. Having thoroughly examined the failed sentences, we found that most errors occur due to incorrect POS tagging. The expressions used to identify verb and noun phrases also contribute to errors.

31.4 Conclusion

In this paper, we have presented two basic methods—Rule-based and Classification—for Open IE in the Mongolian language. To the best of our knowledge, this is the first attempt at building Open IE systems for Mongolian. We believe that the result is

5 Available at: https://bit.ly/2nClF3q.



promising and that the latter method shows better results. Having thoroughly examined the failed sentences, we found that most errors occur due to incorrect POS tagging. Thus we believe that the results can be improved considerably by using appropriate preprocessing tools, especially a better POS tagger. In the future, we plan to exploit Wikipedia in Open IE for the Mongolian language. Another way to improve the results is to have a larger dataset; in order to build one, translating a dataset from another language could be a promising direction.

Acknowledgements This work was supported by Ernst Mach-Stipendien (Eurasia-Pacific Uninet)


grant funded by The Austrian Agency for International Cooperation in Education and Research
(OeAD-GmbH), and Centre for International Cooperation and Mobility (ICM).

References

1. Banko, M., Etzioni, O.: The tradeoffs between open and traditional relation extraction. In: Proceedings of ACL-08: HLT (2008)
2. Horn, C., Zhila, A., Gelbukh, A., Kern, R., Lex, E.: Using factual density to measure informa-
tiveness of web documents. In: Proceedings of the 19th Nordic Conference on Computational
Linguistics (2013)
3. Mausam, Schmitz, M., Soderland, S., Bart, R., Etzioni, O.: Open language learning for infor-
mation extraction. In: Proceedings of the 2012 Joint Conference on Empirical Methods in
Natural Language Processing and Computational Natural Language Learning (2012)
4. Lin, T., Mausam, Etzioni, O.: Identifying functional relations in web text. In: Proceedings of
the 2010 Conference on Empirical Methods in Natural Language Processing (2010)
5. Bird, S., Loper, E., Klein, E.: Natural Language Processing with Python. O'Reilly Media Inc (2009)
6. Schmid, H.: Improvements in Part-of-Speech Tagging with an Application to German, pp. 13–25. Springer, Dordrecht (1999)
7. Sangha, N., Younggyun, N., Sejin, N., Key-Sun, C.: SRDF: Korean open information extraction
using singleton property. In: Proceedings of the 14th International Semantic Web Conference
(2015)
8. Sidorov, G., Velasquez, F., Stamatatos, E., Gelbukh, A., Chanona-Hernández, L.: Syntactic
dependency-based n-grams as classification features. In: Gonzalez-Mendoza, M., Batyrshin, I.
(eds.) Advances in Computational Intelligence. Proceedings of MICAI 2012 (2012)
9. Sidorov, G., Velasquez, F., Stamatatos, E., Gelbukh, A., Chanona-Hernández, L.: Syntactic
dependency-based n-grams: more evidence of usefulness in classification. In: Gelbukh, A. (ed.)
Computational Linguistics and Intelligent Text Processing. Proceedings of International Con-
ference on Intelligent Text Processing and Computational Linguistics, CICLing 2013 (2013)
10. Bayartsatsral, C., Altangerel, C.: Annotating noun phrases for Mongolian language and using
it in machine learning. In: Proceedings of the Mongolian Information Technology—2018,
Ulaanbaatar, Udam Soyol, pp. 12–15 (2018)
11. Davidov, D., Rappoport, A.: Unsupervised discovery of generic relationships using pattern
clusters and its evaluation by automatically generated sat analogy questions. In: Proceedings
of the ACL-08 (2008)
12. Fader, A., Soderland, S., Etzioni, O.: Identifying relations for open information extraction.
In: Proceedings of the Conference on Empirical Methods in Natural Language Processing,
EMNLP’11 (2011)
13. Zhila, A., Gelbukh, A.: Open information extraction for Spanish language based on syntactic constraints. In: Proceedings of the ACL 2014 Student Research Workshop, Baltimore, Maryland, USA, pp. 78–85 (2014)

14. Gamallo, P., Garcia, M., Fernández-Lanza, S.: Dependency-based open information extraction.
In: Proceedings of the Joint Workshop on Unsupervised and SemiSupervised Learning in NLP,
ROBUS-UNSUP ’12 (2012)
15. Van Durme, B., Schubert, L.: Open knowledge extraction using compositional language pro-
cessing. In: Proceedings of the STEP ’08 Proceedings of the 2008 Conference on Semantics
in Text Processing (2008)
16. Banko, M., Cafarella, M.J., Soderland, S., Broadhead, M., Etzioni, O.: Open information extraction from the web. In: Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (2007)
17. Wu, F., Weld, D.S.: Open information extraction using wikipedia. In: Proceedings of the 48th
Annual Meeting of the Association for Computational Linguistics, ACL ’10 (2010)
Chapter 32
Colorful Fruit Image Segmentation
Based on Texture Feature

Chunyan Yang

Abstract The recognition of colorful fruit is one of the important research topics for agricultural machinery vision systems. At present, the popular color-model-based image segmentation methods are generally suitable only for cases in which there is a large difference between the fruit and the background color. Where the difference between fruit and background color is not obvious, segmentation based on a color model cannot meet the actual needs. Therefore, this paper introduces the use of the gray-level co-occurrence matrix to analyze the texture features of fruit and background, find the texture feature parameters that distinguish the fruit from the background, and segment images in which the fruit and background have similar colors. The experimental results show that texture features can not only successfully separate red apples from the background but also work very well for segmenting green apple images with complex backgrounds.

Keywords Texture · Gray-level co-occurrence matrix · Segmentation

32.1 Introduction

Image segmentation is an important step in digital image processing. It refers to the partitioning of an image into different regions that are consistent or similar in some image feature, such as edge, texture, color, or brightness. In practice, it is often necessary to segment the parts of the image that are of interest. Feature extraction and target recognition depend on the quality of image segmentation, so the quality of image segmentation determines the final effect of image analysis. Effective and reasonable image segmentation can provide very useful information for subsequent image retrieval, object analysis, and so on, which makes it possible

C. Yang (B)
Baicheng Normal University, Baicheng 137000, Jilin, China
e-mail: bcsyycy@163.com


for a higher level of image understanding. At present, image segmentation is still an unsolved problem. How to improve the quality of image segmentation has attracted wide attention from scholars at home and abroad and has long been a research hotspot.
This paper takes the apple as the research object, analyzes the texture features of apples and leaves, and introduces two important texture features based on the gray-level co-occurrence matrix, ASM energy and contrast, as a method for segmenting green apple images with complex backgrounds. The experimental results show that the texture-feature-based recognition method is effective for segmenting fruits whose colors are similar to the background.

32.2 The Segmentation Method Based on Texture


Information

Generally, texture refers to the pattern of gray-level variation of image pixels as perceived by an observer. Texture exists widely in nature. The object identified in this paper is the apple, and the background consists mainly of leaves and branches; obviously, whether the apple is red or green, its texture is completely different from the texture of leaves and branches. In the experiment, texture features are therefore introduced to segment color apple images with complex backgrounds based on texture eigenvalues.

32.2.1 Gray-Level Co-occurrence Matrix

The gray-level co-occurrence matrix describes the probability that a pixel with gray level i has, at a fixed displacement, a neighboring pixel with gray level j; all of these estimated probabilities can be arranged in the form of a matrix, which is called the gray-level co-occurrence matrix. For an image whose texture changes slowly, the values on the diagonal of the gray-level co-occurrence matrix are larger, while for an image whose texture changes quickly, the values on the diagonal are smaller and the values on both sides of the diagonal are larger. Let f(x, y) be a gray image of size N × N, let d = (dx, dy) be a displacement vector (as shown in Fig. 32.1), and let L be the maximum number of gray levels of the image.
The gray-level co-occurrence matrix is then defined as the probability P(i, j | d, θ) of the joint occurrence of a pixel of f(x, y) with gray level i and the pixel displaced from it by d = (dx, dy) with gray level j. The mathematical expression is as follows:

P(i, j | d, θ) = #{(x, y) | f(x, y) = i, f(x + dx, y + dy) = j}   (32.1)



Fig. 32.1 Pixel pairs of gray-level co-occurrence matrix

where (x, y) are the pixel coordinates in the image, with range [0, N − 1], and i, j are the gray values, with range [0, L − 1]. Usually, the directions of the gray-level co-occurrence matrix are 0°, 45°, 90°, and 135°. If these four directions are not combined, a variety of features is obtained for each direction, so that there are too many texture features, which is not convenient to use. Therefore, the eigenvalues of the four directions can be averaged, and in this paper the averages over the four directions are taken as the final eigenvalues.
For different θ values, the elements of the matrix are defined as follows:

P(i, j, d, 0°) = #{((k, l), (m, n)) ∈ (Ly × Lx) × (Ly × Lx) | k − m = 0, |l − n| = d, I(k, l) = i, I(m, n) = j}   (32.2)

P(i, j, d, 45°) = #{((k, l), (m, n)) ∈ (Ly × Lx) × (Ly × Lx) | (k − m = d, l − n = −d) or (k − m = −d, l − n = d), I(k, l) = i, I(m, n) = j}   (32.3)

P(i, j, d, 90°) = #{((k, l), (m, n)) ∈ (Ly × Lx) × (Ly × Lx) | |k − m| = d, l − n = 0, I(k, l) = i, I(m, n) = j}   (32.4)

P(i, j, d, 135°) = #{((k, l), (m, n)) ∈ (Ly × Lx) × (Ly × Lx) | (k − m = d, l − n = d) or (k − m = −d, l − n = −d), I(k, l) = i, I(m, n) = j}   (32.5)

Many texture features can be defined based on gray-level co-occurrence matrix. In


this experiment, two main texture features are considered: ASM energy and contrast.
The formula is as follows.
The energy is

ASM = Σ_{i=1}^{k} Σ_{j=1}^{k} (G(i, j))²   (32.6)

The contrast is

CON = Σ_{n=0}^{k−1} n² ( Σ_{|i−j|=n} G(i, j) )   (32.7)

If the values in the gray-level co-occurrence matrix are concentrated in a certain block (for example, for an image with continuous gray values the values concentrate on the diagonal, while for structured images they concentrate at positions away from the diagonal), the ASM has a larger value; if the value distribution in G is more uniform (as in a noisy image), the ASM has a smaller value.
The ASM is the sum of the squares of the elements of the gray-level co-occurrence matrix, which is why it is also called energy; it reflects the uniformity of the gray distribution and the coarseness of the texture of the image. If all values of the co-occurrence matrix are roughly equal, the ASM value is small; conversely, if some of the values are large and others are small, the ASM value is large. When the elements of the co-occurrence matrix are distributed in a concentrated way, the ASM value is large. A large ASM value indicates a more uniform and regular texture pattern.
Contrast directly reflects the brightness difference between a pixel value and the pixel values in its neighborhood. If the elements away from the diagonal have large values, that is, the image luminance changes rapidly, CON will have a large value, which accords with the definition of contrast; it reflects the clarity of the image and the depth of the texture grooves. The deeper the texture grooves, the greater the contrast and the clearer the visual effect; conversely, if the contrast is small, the grooves are shallow and the effect is blurred. The larger the local gray-level differences, that is, the more pixel pairs with high contrast, the greater the value; the larger the values of the elements far from the diagonal in the gray-level co-occurrence matrix, the greater CON is.
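Both features can be computed directly from their definitions. The following numpy sketch builds a co-occurrence matrix for one displacement and evaluates Eqs. (32.6) and (32.7); the (dx, dy) offsets chosen for the four directions follow one common convention and are an assumption, not taken from the paper.

```python
# Minimal numpy sketch of the texture features above: build a normalized
# gray-level co-occurrence matrix G for one displacement and compute the
# ASM energy (Eq. 32.6) and contrast (Eq. 32.7), then average over directions.
import numpy as np

def glcm(img, dx, dy, levels):
    """Normalized co-occurrence matrix for displacement (dx, dy)."""
    g = np.zeros((levels, levels), dtype=np.float64)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                g[img[y, x], img[y2, x2]] += 1
    return g / max(g.sum(), 1)

def asm_energy(g):
    return np.sum(g ** 2)                       # Eq. (32.6)

def contrast(g):
    i, j = np.indices(g.shape)
    return np.sum(((i - j) ** 2) * g)           # Eq. (32.7)

# Example on a tiny 5x5 block quantized to 8 gray levels.
block = (np.random.default_rng(0).integers(0, 256, (5, 5)) // 32).astype(int)
offsets = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}   # (dx, dy) per direction (one convention)
asm = np.mean([asm_energy(glcm(block, *o, 8)) for o in offsets.values()])
con = np.mean([contrast(glcm(block, *o, 8)) for o in offsets.values()])
print(asm, con)
```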

32.3 The Experimental Object of Segmentation Based


on Texture Information

The subjects are the 200 color apple images used before; the textures of the fruit and leaves of red apples with complex backgrounds and of green apples with complex backgrounds were sampled, respectively. For the original images in Fig. 32.2a and b, the experimental method is to divide the image into small blocks of equal size N × N with N = 5; the gray-level co-occurrence matrices in the four directions (0°, 45°, 90°, 135°) and their two eigenvalues, ASM energy and contrast, are calculated for each block. The eigenvalues of the four directions are then averaged, and the averaged values ASM and CON are used as the discriminant texture features. After a large number of data tests, the average values of the features are shown in Table 32.1.
As can be seen from Table 32.1, the ASM energy and contrast of leaves are very
different from those of red apples and can be selected as features to achieve image

Fig. 32.2 Apple image with complex background: (a) red apple image, (b) green apple image

Table 32.1 Gray-level co-occurrence matrix parameters

Name   Leaves   Red apples   Green apples
ASM    0.553    0.672        0.754
CON    0.842    0.551        0.415

Fig. 32.3 Texture characteristics of apple and leaves: (a) red apple and leaves, (b) green apple and leaves

segmentation, and the texture feature map is made (shown in Fig. 32.3a). The difference between the ASM energy and contrast of leaves and those of green apples is also large, and the corresponding texture feature map is made (shown in Fig. 32.3b). When the segmentation effect based on color features is not satisfactory, texture features are used for segmentation. By analyzing Fig. 32.2a and b, it is found that the ASM energy and contrast can distinguish leaves from red apples and from green apples, so the ASM energy and contrast can be used to segment green apple images with complex backgrounds. Experiments on 200 such images show that no matter what color the apple is, the segmentation success rate is more than 95%. Because the difference of the ASM

Fig. 32.4 The segmented image of red apples

Fig. 32.5 The segmented image of green apples

energy and contrast between green apples and red apples is not obvious, these two characteristic quantities cannot be used to distinguish red apples from green apples, and new methods will continue to be explored in future research.
Finally, with the two texture parameters, ASM energy and contrast CON, we can realize the segmentation not only of red apples against complex backgrounds but also of green apples against complex backgrounds. The final segmented images are shown in Figs. 32.4 and 32.5.

32.4 Conclusion

Considering that the texture features of leaves and apples are completely different, two texture features based on the gray-level co-occurrence matrix, the ASM energy and contrast, are introduced in the experiment and selected as the eigenvalues to segment the red apple image and the green apple image, respectively. The experimental results show that the texture features can not only successfully separate the red apple from the background but are also very effective for segmenting green apple images with complex backgrounds. Therefore, this paper holds that target recognition in color apple images should combine color features and texture features, so that the two can complement each other and achieve the best recognition effect.

Chapter 33
Real-Time Emotion Recognition
Framework Based on Convolution
Neural Network
Hanting Yang, Guangzhe Zhao, Lei Zhang, Na Zhu, Yanqing He
and Chunxiao Zhao

Abstract Efficient emotional state analysis will enable machines to understand humans better and facilitate the development of applications that involve human–machine interaction. Recently, deep learning methods have become popular due to their generalization ability, but their complicated computation cannot meet real-time requirements. This paper proposes an emotion recognition framework based on a convolution neural network which contains a comparatively small number of parameters. In order to verify the proposed framework, we train a network on a large number of facial expression images and then use the pretrained model to predict image frames taken from a single camera. The experiment shows that compared to VGG13, our network reduces the parameters by 147 times.

Keywords CNN · Emotion recognition · Image processing

33.1 Introduction

Emotion is the cognitive experience that human beings produce during intense psychological activity. As a vital signaling system, facial expression can convey people's psychological state, which makes it one of the effective means to analyze emotions. Establishing an automatic expression recognition model is a popular research topic in the field of computer vision.
The first research on emotion recognition was published in 1978; it tracked the positions of key points in a continuous set of image frames to analyze the expression generated by the face [1]. However, due to poor face detection and face registration algorithms and limited computational power, progress in this field was slow until the first facial expression dataset, Cohn-Kanade, was published [2]. The mainstream methods detect the basic expressions or the action units defined by the Facial Action Coding System (FACS) as the

H. Yang · G. Zhao (B) · L. Zhang · N. Zhu · Y. He · C. Zhao


Beijing University of Civil Engineering and Architecture, Beijing 100000, China
e-mail: 18810903925@163.com


recognition target. Early research focused on handcrafted features such as geometric features and appearance features. Geometric features are good at characterizing primary expressions, while appearance features are good at finding subtle color and texture changes in the face. In order to supplement the depth information missing from 2D images and to solve the poor detection performance caused by large head pose changes and uneven illumination, researchers turned to new data modalities such as 3D data and thermal images. The BU-3DFE dataset is the first 3D facial expression dataset [3], and there are works that identify facial motion by comparing the distances between face landmark points in 3D space.
Recently, the field of expression recognition has introduced deep learning methods that integrate feature extraction and expression classification into a single process by constructing a deep neural network [4]. These methods usually require learning the weights of the neurons in each layer from a large-scale labeled dataset, which belongs to supervised learning. In addition to innovations in the original primary expression and action unit detection algorithms, identifying more complex expression information is also a research hotspot. For example, spontaneity detection is used to determine whether an expression is deliberate or spontaneous; fatigue state detection is of great significance for driver assistance systems [5]; and depression and pain detection can help doctors better analyze a patient's condition [6].

33.2 Background

Inspired by previous work, our emotion recognition system has four main processes: face detection, face registration, feature extraction, and expression recognition [4]. Depending on the definition of the expression space or the modality of the data, the method for each process differs (Fig. 33.1).
The purpose of face localization is to find faces in the image and mark them. There are two main approaches: detection approaches and segmentation approaches. Detection methods find faces in the raw data and return bounding boxes for them. The AdaBoost algorithm with Haar-like features proposed by Viola and Jones [7] is the most commonly used; it is computationally fast but not good at dealing with occlusion and head pose changes.
Support vector machines (SVM) applied over HOG features improve the accuracy but sacrifice calculation speed [8]. Convolutional neural network methods can deal with various data distributions and achieve high accuracy,
Fig. 33.1 Proposed emotion recognition framework



but require a large amount of training data and take a lot of time [9]. On the other hand, segmentation approaches assign a binary label to each pixel of the image.
Face registration can solve the problem of false detections and misses caused by head pose variation, and it improves the accuracy of the subsequent procedures. For both 2D and 3D data, the purpose of face registration is to rotate the face or bring it to a frontal view. In 2D face registration, the active shape model (ASM) [10] and its extension, the active appearance model (AAM) [11], find face landmarks by encoding standard facial geometry and grayscale information.
The fully connected neural network [12] constructs a learning model by specifying the number of network layers and the number of neurons. Different learning strategies are then determined according to the sample distribution, including the activation function, the loss function, and the choice of optimization method. However, the training process of a neural network is a complete black box; a helper function is needed to observe the learning curve and detect whether the network converges. In addition, fully connected neural networks are not good at processing image data.
Support vector machine (SVM) [13] is a traditional and widely used machine learning method. A disadvantage of SVM is that the choice of kernel function, rather than the parameters, determines whether the model overfits, and the kernel function is very sensitive. Random forest [14] is an ensemble method that essentially combines the outputs of several decision trees as the final result. Each individually trained decision tree is weakly discriminative, and integrating their outputs with weights can achieve high accuracy. The disadvantage of random forests is that increasing the number of training samples does not bring a proportional improvement in accuracy. At present, deep learning methods are the mainstream in the vision field, especially deep convolutional networks. The weight-sharing calculation makes it possible to extract features that are invariant to pose, illumination, and occlusion when facing new problems. Their shortcomings were explained in the previous paragraph.

33.3 Emotion Recognition

This section introduces the proposed expression recognition framework. We utilize a deep convolutional network to integrate feature extraction and emotion recognition into one pipeline.

33.3.1 Face Detection and Alignment

In the face detection part, we use the SVM method applied to HOG features [8], which constructs feature vectors by calculating histograms of gradients over local regions of the image and then feeds them into the classifier. If the result is positive, we return the

Fig. 33.2 Face detection based on HOG features

position of the detection area, i.e., the coordinates of the upper left corner of the bounding box (Xl, Yl) and the coordinates of the lower right corner (Xr, Yr).
In the face alignment part, we use the millisecond-level ensemble method proposed in [14] to train several regression trees using gradient boosting, and then regress the 68 landmark points, including the eye contours, the bridge of the nose, and the mouth contour, with the ensemble of regression trees (Fig. 33.2).
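A minimal sketch of this detection/alignment front end is shown below, assuming the dlib library is used: its HOG+SVM frontal face detector and its 68-point ensemble-of-regression-trees landmark model. The model file name refers to dlib's publicly distributed predictor, which must be downloaded separately; crop handling is simplified.

```python
# Minimal sketch: HOG+SVM face detection followed by 68-point landmark
# regression, producing the bounding box corners and a 48x48 face crop.
import cv2
import dlib

detector = dlib.get_frontal_face_detector()                        # HOG features + linear SVM
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

frame = cv2.imread("face.jpg")                                      # or a frame from cv2.VideoCapture(0)
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

for rect in detector(gray, 1):                                      # upsample once to catch small faces
    xl, yl, xr, yr = rect.left(), rect.top(), rect.right(), rect.bottom()
    shape = predictor(gray, rect)                                   # ensemble of regression trees
    landmarks = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    crop = gray[max(yl, 0):yr, max(xl, 0):xr]
    face = cv2.resize(crop, (48, 48))                               # input size used by the network
```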

33.3.2 Emotion Recognition Based on Dense Convolutional


Neural Network

This article will describe the dense network in the following four aspects: architecture,
convolutional layer, transition layer, and training strategy.
Architecture. The dense convolutional neural network contains a total of 37 convolutional layers, three pooling layers, and a softmax layer. The input is a 48 × 48 × 1 gray image, which first passes through a 3 × 3 convolution layer and then through three dense blocks, each containing 12 convolution layers. A transition layer, consisting of an average pooling layer, a bottleneck layer, and a compression layer, is connected at the end of each dense block. The purpose of the softmax layer is to map the outputs of multiple neurons into the interval (0, 1); it is calculated as

p(y(i) = j | x(i); θ) = exp(θ_j^T x(i)) / Σ_{l=1}^{k} exp(θ_l^T x(i))   (33.1)

where y(i) represents the label of a certain expression class, x(i) represents the input feature, and θ is the set of network weights. The output of the above function is the confidence of a specific expression class (Fig. 33.3).
Convolution Layer. Unlike the vertical expansion approach of ResNet [15], which uses identity mappings to extend the effective training depth, and unlike the lateral expansion approach of Inception [16], which uses convolution filters of different sizes to extract features at different scales, the dense network highly reuses feature maps and allows any layer in the network to simultaneously use its own feature map and all feature maps of the preceding layers, which makes the network more efficient and greatly reduces the number of parameters.
In addition, the convolutional layer described here includes not only the convolution over the filtering window but also the ReLU activation function and Batch Normalization [17]. The generalized calculation in the convolutional layer is shown in Eq. (33.2).


f1(xi) = max(0, xi)
f2(xi) = conv_{3×3}(f1(xi))
f3(xi) = (f2(xi) − E[f2(xi)]) / √(Var[f2(xi)])                    (33.2)
F_output = f3(x1, x2, x3, …, x_{l−1})

Transition layer. The transition layer sits between two dense blocks and has two purposes: reducing the number of network parameters and facilitating the calculation of the next dense block. The average pooling layer computes the average value in each subregion and feeds it to the next layer. The essence of the bottleneck layer is a 1 × 1 convolutional layer, whose main purpose is not to extract features but

Fig. 33.3 Proposed DenseNet with three dense blocks and a 7-way softmax layer

to use the hyperparameter d, the number of filters, for controllable dimensionality reduction of the accumulated feature maps. The compression layer is connected behind the bottleneck layer and proportionally reduces the number of feature maps through a hyperparameter θ between 0 and 1. A minimal sketch of a dense block and transition layer is given below.
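The sketch below follows the spirit of the description above (ReLU → 3 × 3 convolution → batch normalization, with all previous feature maps concatenated, followed by an average-pooling/bottleneck transition scaled by θ). The growth rate, initial filter count, and compression factor are illustrative assumptions, not the exact configuration of the 37-layer network trained in this paper.

```python
# Minimal tf.keras sketch of one dense block plus a transition layer.
import tensorflow as tf
from tensorflow.keras import layers

def conv_layer(x, growth_rate):
    y = layers.ReLU()(x)
    y = layers.Conv2D(growth_rate, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    return layers.Concatenate()([x, y])          # feature-map reuse across layers

def dense_block(x, num_layers=12, growth_rate=12):
    for _ in range(num_layers):
        x = conv_layer(x, growth_rate)
    return x

def transition_layer(x, theta=0.5):
    x = layers.AveragePooling2D(2)(x)
    channels = int(x.shape[-1] * theta)          # compression factor theta
    return layers.Conv2D(channels, 1, padding="same")(x)   # 1x1 bottleneck

inputs = tf.keras.Input((48, 48, 1))
x = layers.Conv2D(24, 3, padding="same")(inputs)
for _ in range(3):                               # three dense blocks, as in the paper
    x = transition_layer(dense_block(x), theta=0.5)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(7, activation="softmax")(x)   # 7 expression classes
model = tf.keras.Model(inputs, outputs)
```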
Training Strategy. The training strategy in this paper focuses on two aspects: whether the network can converge to an acceptable accuracy on the validation set, and how to avoid over-fitting. The former is mainly reflected in the choice of optimization algorithm and network architecture. For the optimization algorithm, this paper uses the Nesterov momentum method [18]. The momentum method is an improvement on stochastic gradient descent for the problem of oscillation around local minima in the optimization space. It adds the weighted update vector of the previous iteration to the current update vector, as shown in Eq. (33.3).

v_t = βv_{t−1} + α∇_θ L(θ)
θ = θ − v_t                                   (33.3)

However, blindly following the accelerated gradient update also brings instability. Nesterov momentum obtains approximate information about the upcoming gradient trend by evaluating the gradient at θ − βv_{t−1}. If the gradient is increasing, the update is sped up; if the gradient is decreasing, the update is slowed down, as shown in Eq. (33.4).

v_t = βv_{t−1} + α∇_θ L(θ − βv_{t−1})
θ = θ − v_t                                   (33.4)
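Worked out concretely, the update rule of Eq. (33.4) is only a few lines. The following numpy sketch applies the Nesterov step to a toy quadratic loss; the values of α and β are chosen only for illustration and are not the training hyperparameters of the paper.

```python
# Minimal numpy sketch of the Nesterov momentum update (Eq. 33.4):
# the gradient is evaluated at the look-ahead point theta - beta*v.
import numpy as np

def nesterov_step(theta, v, grad_L, alpha=0.1, beta=0.9):
    v_new = beta * v + alpha * grad_L(theta - beta * v)   # look-ahead gradient
    return theta - v_new, v_new

# Example: minimize L(theta) = 0.5 * ||theta||^2, whose gradient is theta.
theta, v = np.array([4.0, -2.0]), np.zeros(2)
for _ in range(50):
    theta, v = nesterov_step(theta, v, grad_L=lambda t: t)
print(theta)   # approaches the minimum at the origin
```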

33.4 Experiment Result

This section will present our experimental environment and experimental results.

33.4.1 Experiment Environment

Hardware Devices. All model training in this paper was done on a GTX 1060 graphics card. It has 1280 CUDA cores, 6 GB of GDDR5 memory, a core frequency of 1506 MHz, and 4.4 TFLOPS of single-precision floating-point performance. The test device uses a screen-integrated 2-megapixel camera, which is sufficient for facial expression recognition in images.
Dataset. FER2013 contains 35,887 gray images of 48 × 48 pixels. At the time of its first publication, the dataset labels were divided into seven categories, including 4953 cases of "anger", 547 cases of "disgust", 5121 cases of "fear", 8989 cases

Fig. 33.4 Learning curve for model training on FER2013

of "happy", 6077 cases of "sad", 4002 cases of "surprise", and 6198 cases of "neutral". This labeling was later verified to be inaccurate, so we trained not only on this dataset but also on the improved FER PLUS dataset [20] and on FERFIN, which is modified from FER PLUS.

33.4.2 Dense Network Training Results

Training on the FER2013 database. For this dataset, the hyperparameters are set as follows: L2 regularization with coefficient λ = 0.0001; a compression layer with compression factor θ = 0.5; the learning rate of Nesterov momentum ε set to 0.1; and the momentum parameter α set to 0.1. The accuracy of the network on the validation set reached 67.01%, as shown in Fig. 33.4.
Training on the FER PLUS database. Among the challenges pointed out by Goodfellow et al. [21] regarding classification problems solved with neural networks is the performance degradation caused by the low accuracy of human labelers. Therefore, we trained a second model on the FER PLUS dataset, which used crowdsourcing to improve the accuracy of the labels. In the original paper, four schemes were designed to handle the setting of the objective function; we only use the majority vote for preprocessing, as our main focus is on the framework itself. The accuracy of the network on the validation set reaches 81.78%, as shown in Fig. 33.5. The work of [20] uses VGG13 to achieve an average accuracy of 83.97% with the majority-vote loss function strategy; their network has about 8.7 million parameters, about 147 times as many as ours.

Fig. 33.5 Learning curve for model training on FER PLUS

33.5 Conclusion

In the field of expression recognition there are classic handcrafted-feature methods and emerging deep learning methods. The former involve more prior knowledge and have less generalization ability, while early deep learning methods can achieve top-level accuracy but require millions of parameters. In order to train the expression recognition network with a deep convolutional model, this paper proposes the dense convolutional network as the training network. Its multilevel connections and feature reuse reduce the number of network parameters; while enhancing the representation capability of the network, it keeps the number of trainable parameters needed to achieve the expected accuracy as small as possible.

References

1. Suwa, M., Sugie, N., Fujimora, K.: A preliminary note on pattern recognition of human emo-
tional expression. In: Proceedings of the 4th International Joint Conference on Pattern Recog-
nition 1978, IAPR, pp. 408–410, Kyoto, Japan (1978)
2. Tian, Y.I., Kanade, T., Cohn, J.F.: Recognizing action units for facial expression analysis. IEEE
Trans. Pattern Anal. Mach. Intell. 23(2), 97–115 (2002)
3. Yin, L., Chen, X., Sun, Y., Worm, T., Reale, M.: A high-resolution 3D dynamic facial expression
database. In: 8th IEEE International Conference on Automatic Face & Gesture Recognition
2008, pp. 1–6. Amsterdam, Netherlands (2008)
4. Corneanu, C.A., Simon, M.O., Cohn, J.F., et al.: Survey on RGB, 3D, thermal, and multimodal
approaches for facial expression recognition: history, trends, and affect-related applications.
IEEE Trans. Pattern Anal. Mach. Intell. 38(8), 1548–1568 (2016)
5. Ji, Q., Lan, P., Looney, C.G.: A probabilistic framework for modeling and real-time monitoring human fatigue. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 36(5), 862–875 (2006)
6. Ashraf, A.B., Lucey, S., Cohn, J.F.: The painful face—pain expression recognition using active
appearance models. Image Vis. Comput. 27(12), 1788–1796 (2009)
7. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In:
IEEE Conference Computer Vision Pattern Recognition 2001, vol. 1, pp. I–511 (2001)
8. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer
Society Conference on Computer Vision and Pattern Recognition 2005, CVPR, pp. 886–893,
San Diego, USA (2005)

9. Osadchy, M., Miller, M., Lecun, Y.: Synergistic face detection and pose estimation. J. Mach.
Learn. Res. 8(1), 1197–1215 (2006)
10. Cootes, T.F., Taylor, C.J., Cooper, D.H., et al.: Active shape models-their training and appli-
cation. Comput. Vis. Image Underst. 61(1), 38–59 (1995)
11. Cootes, T.F., Edwards, G.J., Taylor, C.J.: Active appearance models. IEEE Trans. Pattern Anal.
Mach. Intell. 23(6), 681–686 (2001)
12. Tian, Y.L., Kanade, T., Cohn, J.F.: Recognizing action units for facial expression analysis.
IEEE Trans. Pattern Anal. Mach. Intell. 23(2), 97–115 (2001)
13. Lemaire, P., Ardabilian, M., Chen, L., et al.: Fully automatic 3D facial expression recognition
using differential mean curvature maps and histograms of oriented gradients. In: 10th IEEE
International Conference and Workshops on Automatic Face and Gesture Recognition 2013,
(FG), pp. 1–7, Shanghai, China (2013)
14. Dapogny, A., Bailly, K., Dubuisson, S.: Dynamic facial expression recognition by joint static
and multi-time gap transition classification. In: 11th IEEE International Conference and Work-
shops on Automatic Face and Gesture Recognition 2015, (FG), pp. 1–6, Ljubljana, Slovenia
(2015)
15. He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: Proceedings
of Computer Vision—ECCV 2016, vol. 9908, pp. 770–778. Springer, Cham (2016)
16. Szegedy, C., Liu, W., Jia, Y., et al.: Going deeper with convolutions. In: IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), pp. 1–9 (2015)
17. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing
internal covariate shift. In: International Conference on Machine Learning. JMLR, pp. 448–456
(2015)
18. Su, W., Boyd, S., Candes, E.J.: A differential equation for modeling Nesterov’s accelerated
gradient method: theory and insights. Adv. Neural Inf. Process. Syst. 3(1), 2510–2518 (2015)
19. FER2013 Dataset. https://www.kaggle.com/c/challenges-in-representation-learning-facial-
expression-recognition-challenge. Accessed 25 Jan 2019
20. Barsoum, E., et al.: Training deep networks for facial expression recognition with crowd-
sourced label distribution. In: ACM International Conference on Multimodal Interaction ACM,
pp. 279–283 (2016)
Chapter 34
Facial Expression Recognition Based
on Regularized Semi-supervised Deep
Learning

Taiting Liu, Wenyan Guo, Zhongbo Sun, Yufeng Lian, Shuaishi Liu
and Keping Wu

Abstract In the field of facial expression recognition, deep learning has attracted more and more researchers' attention as a powerful tool, since it can effectively train and test on data using a neural network. This paper mainly uses a semi-supervised deep learning model for feature extraction and adds a regularized sparse representation model as the classifier. The combination of deep learning features and sparse representations fully exploits the advantages of deep learning in feature learning and the advantages of sparse representation in recognition. Experiments show that the features obtained by deep learning have certain subspace properties, which accords with the subspace assumption of sparse-representation-based face recognition. The method of this paper achieves good accuracy in facial expression recognition and has certain advantages on small-sample problems.

Keywords Semi-supervised learning · Regularization · Facial expression


recognition · Deep learning

34.1 Introduction

In recent years, facial expression recognition has been used as a biometric recognition technology and has become an important research topic in the fields of multimedia information processing, human–computer interaction, image processing, and pattern recognition. Labels play an important role in facial expression recognition but are not readily available. Semi-supervised learning methods can simultaneously utilize both labeled and unlabeled samples in the training set; the purpose of such learning is to construct a learning model from a small number of labeled samples and a large number of unlabeled samples. Early research on semi-supervised deep learning was done by
T. Liu · W. Guo · Z. Sun · Y. Lian · S. Liu (B) · K. Wu


Changchun University of Technology, Changchun 130012, China
e-mail: liu-shuaishi@126.com
T. Liu
e-mail: liutaiting@qq.com


Weston et al. [1]. They attempted to introduce the Laplacian regularization term from graph-based semi-supervised learning into the objective function of a neural network and performed semi-supervised training on multilayer neural networks. Lee [2] proposed a network trained in a supervised fashion with labeled and unlabeled data simultaneously; for unlabeled data, the class with the maximum predicted probability is simply picked as the label. With a denoising autoencoder and dropout, this simple method outperforms conventional methods for semi-supervised learning.
On the other hand, inspired by sparse coding [3] and subspace methods [4], Wright et al. [5] proposed a classification method based on sparse representation, using the original training face images as a dictionary and solving for the sparse coefficients of the test sample by norm minimization; the classification result is obtained by finding the minimum residual. On the basis of John Wright's work, a series of studies on classification methods based on sparse representation have made progress, including research on dictionary learning in sparse representation [6]. The work in [7] creatively introduces a compensation dictionary into sparse-representation-based face recognition and achieves a certain breakthrough on the face recognition problem with small samples. The literature [8, 9] points out that sparse representations also have a significant effect on facial expression recognition.
This paper uses a semi-supervised deep learning model for feature extraction and adds a regularized sparse representation model as the classifier. The combination of deep learning features and sparse representations fully exploits the advantages of deep learning in feature learning and the advantages of sparse representation in recognition.

34.2 Sparse Representation Classification Method

34.2.1 Sparse Representation

The sparse representation-based classification (SRC) [5] assumes that face images lie in a linear subspace and that a test sample can be collaboratively and linearly represented by the training samples (dictionary) of all classes; the class the sample belongs to can represent it more sparsely (fewer dictionary atoms give a better reconstruction). Once the sparsity constraint on the representation coefficients is imposed, the non-zero entries of the solved sparse representation coefficients should mainly correspond to the dictionary of the class to which the test sample belongs. Therefore, the test sample can be classified according to which class dictionary yields the smaller error, which is how SRC works. The algorithmic process of SRC is as follows:
(1) The test sample is represented as a linear combination of the dictionary A, and the sparse coefficients are obtained by the L1-norm minimization:

α̂ = arg min_α { ||y − Aα||₂² + λ||α||₁ }   (34.1)

where A = [A1, A2, …, Ak] is the matrix of training samples (the dictionary) and Ai contains the samples of class i. Let y denote the test sample, represented as a linear combination of the training samples, i.e., y = Aα, where α = [α1; …; αi; …; αk].
(2) Compute the residuals:

e_i(y) = ||y − A δ_i(α̂)||₂,  i = 1, …, k   (34.2)

where δ_i(α̂) is the coding coefficient vector associated with class i.
(3) Output the identity of y as

identity(y) = arg min_{i ∈ {1, …, k}} e_i(y)   (34.3)
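The three steps above can be put together in a few lines. The sketch below uses scikit-learn's Lasso as the l1 solver (its objective differs from Eq. (34.1) only by a constant scaling of λ); the toy dictionary, its shapes, and the value of λ are illustrative.

```python
# Minimal sketch of SRC: l1 coding (Eq. 34.1), class-wise residuals (Eq. 34.2),
# and the minimum-residual decision (Eq. 34.3).
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(A, labels, y, lam=0.01):
    # A: (n_features, n_atoms) dictionary, labels: class of each column, y: test sample.
    alpha = Lasso(alpha=lam, fit_intercept=False, max_iter=10000).fit(A, y).coef_
    residuals = {}
    for c in np.unique(labels):
        delta = np.where(labels == c, alpha, 0.0)      # delta_i(alpha_hat)
        residuals[c] = np.linalg.norm(y - A @ delta)   # Eq. (34.2)
    return min(residuals, key=residuals.get)           # Eq. (34.3)

# Toy example: two classes whose samples live in different coordinate subspaces.
rng = np.random.default_rng(0)
A = np.hstack([np.vstack([rng.random((5, 4)), np.zeros((5, 4))]),    # class 0 atoms
               np.vstack([np.zeros((5, 4)), rng.random((5, 4))])])   # class 1 atoms
labels = np.array([0] * 4 + [1] * 4)
y = A[:, :4] @ np.array([0.5, 0.2, 0.1, 0.3])                        # built from class 0
print(src_classify(A, labels, y))                                    # -> 0
```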

The classification method based on sparse representation can effectively utilize the subspace characteristics of face images, does not require a large number of samples for classifier learning, and is robust to noise.
The sparse-representation-based recognition method assumes that the training samples (dictionary) of each class are complete, i.e., that each class dictionary has sufficient expressive power. This assumption generally does not hold in small-sample problems with large disturbances such as illumination, pose, and occlusion. In face recognition problems with small samples and large interference, test images are often misclassified into classes with similar intra-class variations rather than into classes with the same appearance changes.

34.2.2 L1 Regularized Expression Classifier

To improve the robustness of facial expression recognition, the classifier adopts a sparse representation based on regularized coding. Figure 34.1 gives the overall process.
The main process of the classifier implementation is as follows:
(1) The original spatial data are embedded into the feature space, and different weights are assigned to each pixel of the facial expression image to be tested.

Fig. 34.1 Flowchart of sparse representation classification based on regularization



(2) A regularized model is embedded in the sparse representation to sparsely reconstruct the original weights through successive iterations, yielding the converged weight matrix of the sparse reconstruction.
(3) The test expression image is coded with the regularized L1-norm sparse representation and classified into the category whose training expression images give the minimum approximation residual, completing the facial expression classification. A minimal sketch of this reweighting loop is given below.
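The sketch below illustrates the iterative reweighting idea in steps (1)–(3): pixel weights are updated from the coding residual and the weighted l1 coding problem is re-solved until the weights converge. The exponential weight function, its parameters, and the convergence threshold are illustrative assumptions, not the exact scheme used in this paper.

```python
# Minimal sketch of the regularized (reweighted) coding loop.
import numpy as np
from sklearn.linear_model import Lasso

def regularized_coding(A, y, lam=0.01, iters=10, c=8.0):
    w = np.ones_like(y)                                        # initial pixel weights
    for _ in range(iters):
        Aw, yw = A * w[:, None], y * w                         # weighted dictionary / sample
        alpha = Lasso(alpha=lam, fit_intercept=False,
                      max_iter=10000).fit(Aw, yw).coef_        # weighted l1 coding
        r = y - A @ alpha                                      # per-pixel coding residual
        w_new = np.exp(-c * r ** 2 / ((r ** 2).mean() + 1e-12))  # outlier pixels get small weights
        if np.linalg.norm(w_new - w) < 1e-4:                   # weights have converged
            break
        w = w_new
    return alpha, w
```

Classification then proceeds as in SRC, except that the class-wise residual of Eq. (34.2) is evaluated on the weighted pixels.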

34.3 Semi-supervised Deep Learning Model Based


on Regularization

34.3.1 Overall Process

The algorithm uses both labeled and unlabeled data, so it is a semi-supervised learning
algorithm. In this paper, the operation structure diagram of facial features extracted
by the regularized semi-supervised deep learning algorithm is shown in Fig. 34.2.
The steps of the facial expression recognition method based on the regularized
semi-supervised deep learning framework are as follows:
(1) Training autoencoder with unlabeled training data to get W and b.
(2) Remove the last layer of autoencoder. Get the function f (x).
(3) Enter the labeled training data (x) into the trained autoencoder. Get the new data
(x  = f (x)). Using the new data (x  ) replace the raw data (x) for subsequent
training. We call the new data a replacement input.

Fig. 34.2 The expression recognition structure of the semi-supervised deep learning method
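The sketch below illustrates steps (1)–(3) with a small convolutional autoencoder in tf.keras. The layer sizes are illustrative and do not reproduce the network of Table 34.1; the fit/predict calls are shown as comments because the data loading is outside the scope of the sketch.

```python
# Minimal tf.keras sketch of autoencoder pretraining and the "replacement input".
import tensorflow as tf
from tensorflow.keras import layers

inp = tf.keras.Input((48, 48, 1))
h = layers.Conv2D(32, 3, activation="relu", padding="same")(inp)
h = layers.MaxPooling2D(2)(h)
code = layers.Conv2D(16, 3, activation="relu", padding="same")(h)   # encoder output f(x)
d = layers.UpSampling2D(2)(code)
out = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(d)  # decoder / last layer

autoencoder = tf.keras.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")
# Step (1): train on unlabeled images scaled to [0, 1].
# autoencoder.fit(x_unlabeled, x_unlabeled, epochs=20, batch_size=128)

# Step (2): drop the decoder, keeping only the encoder f(x).
encoder = tf.keras.Model(inp, code)

# Step (3): replacement input for the labeled data.
# x_replacement = encoder.predict(x_labeled)
```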

34.3.2 Feature Extraction Method Based on Deep Learning

The feature extraction process used in this paper is based on an autoencoder-based semi-supervised convolutional neural network. The parameters of each layer of the network are shown in Table 34.1.
The dropout probability used in training this network is 50%, and the activation function is ReLU.
This paper uses the Fer2013 facial expression dataset for training. The database contains a total of 35,887 face images, including 28,709 in the training set, 3,589 in the validation set, and 3,589 in the test set. The images in the database are grayscale images of size 48 × 48 pixels, and the samples are divided into seven categories: 0 = angry, 1 = disgust, 2 = fear, 3 = happy, 4 = sad, 5 = surprised, 6 = neutral; the distribution over categories is basically uniform. We use the FC2 layer output as the face feature and use the L1 regularized expression classifier to recognize the facial expression.

Table 34.1 Regularized semi-supervised deep learning network structure

Network layer                   Instruction           Parameter
Input                           Input layer           48 × 48
Autoencoder Replacement input   Resize                96 × 96
Cov1                            Convolutional layer   3 × 3 × 64
Cov2                            Convolutional layer   3 × 3 × 64
MAXPOOL1                        Pooling layer         2 × 2
Cov3                            Convolutional layer   3 × 3 × 128
Cov4                            Convolutional layer   3 × 3 × 128
MAXPOOL2                        Pooling layer         2 × 2
Cov5                            Convolutional layer   3 × 3 × 256
Cov6                            Convolutional layer   3 × 3 × 256
MAXPOOL3                        Pooling layer         2 × 2
Cov7                            Convolutional layer   3 × 3 × 512
Cov8                            Convolutional layer   3 × 3 × 512
MAXPOOL4                        Pooling layer         2 × 2
FC1                             Full connection       1024
FC2                             Full connection       512
Softmax                         Softmax layer         7

34.4 Experimental Result

We randomly selected 300 images of each expression type from the training set of the FER2013 database; the resulting 2100 facial expression images were used as the unlabeled training data for training the semi-supervised learning model in this experiment. The following subsections report the facial expression recognition results obtained with different classifiers.

34.4.1 Softmax Classification

Softmax classification is the most commonly used classifier in deep learning. The facial expression recognition results of the Softmax classifier are shown in Table 34.2. It can be seen that the recognition rate of happy is significantly higher than that of the other expressions, while fear is the most difficult expression to distinguish.

34.4.2 Sparse Representation Classification via Deep Learning Features

The classification method based on sparse representation can effectively utilize the
subspace characteristics of face images, does not require a large number of samples
for classifier learning, and is robust to noise. The facial expression recognition results
of sparse representation classifier are shown in Table 34.3.
After replacing the Softmax classifier with a sparse representation classifier, we
found that the recognition rate of happy is increased by 0.59%, the recognition rate
of fear is increased by 1.22%, and the error rate related to mistaking fear for sad
decreased by 0.66%. These recognition results demonstrated that the proposed algo-
rithm not only improved the recognition accuracy of easily distinguishable categories

Table 34.2 The facial expression recognition results of Softmax classification


Angry Disgust Fear Happy Sad Surprised Neutral
Angry 58.73 0.72 10.12 5.22 14.13 2.46 8.62
Disgust 12.41 73.86 3.15 2.18 2.14 3.87 2.39
Fear 14.84 0.64 52.43 4.19 10.93 6.64 10.33
Happy 2.33 0 2.06 89.84 1.94 1.37 2.46
Sad 8.25 0.54 5.98 3.23 55.16 1.87 24.97
Surprised 2.58 0.41 6.93 4.07 1.34 82.15 2.52
Neutral 5.2 0.36 4.03 4.12 10.32 1.71 74.26

Table 34.3 The facial expression recognition results of sparse representation classification via deep
learning features
Angry Disgust Fear Happy Sad Surprised Neutral
Angry 59.42 0.69 9.97 5.27 13.73 2.43 8.49
Disgust 11.98 74.36 3.21 2.12 2.14 3.85 2.34
Fear 14.42 0.61 53.65 4.14 10.27 6.63 10.28
Happy 2.13 0 2.06 90.43 1.79 1.32 2.27
Sad 8.28 0.53 5.78 3.16 55.96 1.75 24.54
Surprised 2.39 0.41 6.68 4.02 1.37 82.67 2.46
Neutral 5.13 0.37 3.87 3.98 10.18 1.69 74.78

but also alleviated the misclassification of categories that are difficult to distinguish. The recognition rates of angry, sad, and neutral have also increased. The proposed algorithm improves the recognition rates for most categories.

34.4.3 L1 Regularized Sparse Representation Classification via Deep Learning Features

The facial expression recognition results of L1 regularized sparse representation classification are shown in Table 34.4.
After changing the classifier to L1 regularized sparse representation classifier,
we found that the recognition rate of happy increased by 0.24% compared with the
simple use of sparse representation, the recognition rate of fear increased by 0.48%,
and the recognition rate of other expressions also increased.
At the same time, it can be seen that our proposed algorithm did not improve the
accuracy of certain classes at the expense of the accuracy of other classes, which
makes sense for the practical application of facial expression recognition.
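
The class-wise residual rule behind this classifier can be sketched as follows; the sketch uses scikit-learn's Lasso as the L1 solver, and the dictionary layout, regularization weight, and helper name are illustrative assumptions rather than the authors' implementation.

```python
# Sketch of L1-regularized sparse representation classification (SRC): code a test
# feature over a dictionary of training features with an L1 penalty, then assign the
# class whose atoms give the smallest reconstruction residual.
import numpy as np
from sklearn.linear_model import Lasso

def src_l1_predict(D, labels, y, alpha=0.01):
    """D: (d, n) dictionary whose columns are deep features of training samples,
    labels: (n,) class of each column, y: (d,) deep feature of the test sample."""
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    lasso.fit(D, y)                               # min ||y - D x||^2 + alpha * ||x||_1
    x = lasso.coef_
    best_class, best_residual = None, np.inf
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)       # keep only the coefficients of class c
        residual = np.linalg.norm(y - D @ x_c)    # class-wise reconstruction residual
        if residual < best_residual:
            best_class, best_residual = c, residual
    return best_class
```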

Table 34.4 The facial expression recognition results of L1 regularized sparse representation classification via deep learning features
Angry Disgust Fear Happy Sad Surprised Neutral
Angry 59.84 0.67 9.95 5.23 13.54 2.41 8.36
Disgust 11.72 74.71 3.19 2.12 2.11 3.85 2.3
Fear 14.17 0.6 54.13 4.14 10.13 6.59 10.24
Happy 2.07 0 2.03 90.67 1.72 1.26 2.25
Sad 8.19 0.54 5.73 3.08 56.52 1.75 24.19
Surprised 2.28 0.39 6.56 3.97 1.39 83.04 2.37
Neutral 5.06 0.33 3.78 3.94 9.86 1.71 75.32

34.4.4 Comparison of State-of-the-Art Methods

Table 34.5 shows the recognition accuracy of the different classifiers on the Fer2013 dataset. Changing the classifier increases the recognition rate by 0.69% with the sparse representation classifier and by a further 0.42% with the L1 sparse representation classifier. When the L1 algorithm is used to recognize facial expressions, the average recognition rate reaches 70.60%.
To verify the validity of the proposed method, other facial expression recognition methods are compared with it. Table 34.6 compares the recognition rate of the proposed facial expression recognition system with other algorithms on the FER2013 database.
As can be seen from Table 34.6, the proposed algorithm has an advantage in
the recognition rate of the Fer2013 dataset. The DNNRL [11] improves local feature
recognition through the Inception layer and updates the model more or less according
to the sample difficulty. The FC3072 algorithm proposed in [12] uses a fully connected layer with 3072 units, which requires a large amount of computation.
The algorithm proposed in this paper uses the sparse representation classifier for
classification, the features obtained by deep learning have linear subspace features,
and the use of classifiers based on sparse representation has outstanding advantages
for small sample problems. The algorithm in this paper has certain advantages.

34.5 Conclusion

This paper improves the semi-supervised deep learning algorithm and introduces a regularization term into the sparse representation classification, comparing the sparse representation classifier with and without regularization. The experimental results show that the introduction of regularization improves the recognition rate of facial expressions.

Table 34.5 Comparison of the results of the three classifiers

Method                    Facial recognition rate (%)
CNN + Softmax             69.49
CNN + SRC                 70.18
CNN + L1-norm-SRC         70.60

Table 34.6 Comparison of state-of-the-art methods on FER2013 database

Method                    Facial recognition rate (%)
Maxim Milakov [10]        68.82
Unsupervised [10]         69.26
DNNRL [11]                70.60
FC3072 [12]               70.58
Proposed approach         70.60

Future research work will further study and analyze the characteristics of deep
learning. By improving the network structure and loss function, the characteristics
of the network will be more satisfied with the linear subspace constraints, and the
recognition effect will be further improved.

Acknowledgements This paper is supported by Jilin Provincial Education Department “13th five-
year” Science, Technology Project (No. JJKH20170571KJ), National Natural Science Foundation
of China under Grant 61873304, The Science & Technology Plan Project Changchun City under
Grant No. 17SS012, and the Industrial Innovation Special Funds Project of Jilin Province under
Grant No. 2018C038-2 & 2019C010.

References

1. Weston, J., Ratle, F., Mobahi, H., et al.: Deep learning via semi-supervised embedding. Neural
Networks: Tricks of the Trade, pp. 639–655. Springer, Berlin, Heidelberg (2012)
2. Lee, D.H.: Pseudo-label: the simple and efficient semi-supervised learning method for deep
neural networks. In: Workshop on Challenges in Representation Learning, ICML, vol. 3, p. 2
(2013)
3. Huang, K., Aviyente, S.: Sparse representation for signal classification. Advances in Neural
Information Processing Systems, pp. 609–616 (2007)
4. Lee, K.C., Ho, J., Kriegman, D.J.: Acquiring linear subspaces for face recognition under vari-
able lighting. IEEE Trans. Pattern Anal. Mach. Intell. 5, 684–698 (2005)
5. Wright, J., Yang, A.Y., Ganesh, A., et al.: Robust face recognition via sparse representation.
IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210–227 (2009)
6. Yang, M., Zhang, L., Feng, X., et al.: Fisher discrimination dictionary learning for sparse repre-
sentation. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 543–550.
IEEE (2011)
7. Deng, W., Hu, J., Guo, J.: Extended SRC: undersampled face recognition via intraclass variant
dictionary. IEEE Trans. Pattern Anal. Mach. Intell. 34(9), 1864–1870 (2012)
8. Fan, Z., Ni, M., Zhu, Q., et al.: Weighted sparse representation for face recognition. Neuro-
computing 151, 304–309 (2015)
9. Guo, Y., Zhao, G., Pietikäinen, M.: Dynamic facial expression recognition with atlas construc-
tion and sparse representation. IEEE Trans. Image Process. 25(5), 1977–1992 (2016)
10. Goodfellow, I.J., Erhan, D., Carrier, P.L., et al.: Challenges in representation learning: a report on
three machine learning contests. In: International Conference on Neural Information Processing,
pp. 117–124. Springer, Berlin, Heidelberg (2013)
11. Guo, Y., Tao, D., Yu, J., et al.: Deep neural networks with relativity learning for facial expres-
sion recognition. In: 2016 IEEE International Conference on Multimedia & Expo Workshops
(ICMEW), pp. 1–6. IEEE (2016)
12. Kim, B.K., Roh, J., Dong, S.Y., et al.: Hierarchical committee of deep convolutional neural
networks for robust facial expression recognition. J Multimodal User Interfaces 10(2), 173–189
(2016)
Chapter 35
Face Recognition Based on Local Binary
Pattern Auto-correlogram

Zimei Li, Ping Yu, Hui Yan and Yixue Jiang

Abstract Face recognition mainly includes face feature extraction and recognition.
Color is an important visual feature. Color correlogram (CC) algorithm is com-
monly used in the color-based image retrieval as a feature descriptor, but most of
the existing methods based on CC have problems of high computational complexity
and low retrieval accuracy. Aiming at this problem, this paper proposes an image
retrieval algorithm based on color auto-correlogram. The new color feature vector
which describes the global and spatial distribution relation among different colors
is obtained in the CC feature matrix, thus reducing the computational complexity.
Inter-feature normalization is applied in color auto-correlogram (CAC) to enhance
the retrieval accuracy. The experimental result shows that this integrated method can
reduce the computational complexity and improves real-time response speed and
retrieval accuracy.

Keywords Face recognition · Local binary pattern · Auto-correlogram · Support


vector machine

35.1 Introduction

Face recognition has been widely used in different fields. Many face recognition
algorithms have gained encouraging performance. Face recognition mainly includes
two parts: face feature extraction and recognition. Feature extraction is the mapping
process of face data from the original input space to the new feature space, taking
the right way to extract face feature, such as size, location, and profile informa-

Z. Li (B) · P. Yu · H. Yan · Y. Jiang


School of Computer Technology and Engineering, Changchun Institute of Technology,
Changchun 130012, China
e-mail: 68220097@qq.com
Z. Li · P. Yu · H. Yan
Jilin Province S&T Innovation Center for Physical Simulation and Security of Water Resources
and Electric Power Engineering, Changchun 130012, China


Face recognition can be generally classified into the following categories [1]:
image-based methods [2], such as integral projection method, mosaic image method
[3], and symmetry analysis method; template-based methods, such as deformable
template method, active contour model method [4], etc.; statistical learning-based
methods, such as feature face method [5], visual learning method [6], neural network
method [7], etc. At present, the main face feature extraction methods fall into two categories: global features and local features. Global features can represent complete
structural information, such as facial contour, skin color, and the overall nature of
facial features. In order to extract features, a linear subspace of training set is con-
structed based on global features. The image to be recognized can be reproduced
by projecting to the linear subspace. Typical subspace-based methods include prin-
cipal component analysis, linear discriminant analysis, and independent component
analysis. Local features are robust to changes in light conditions, expressions, and
attitudes. In order to adapt to local changes, local feature method trains recognition
parameters based on the geometric relationship between facial organs and feature
parts. Local feature methods mainly include Gabor transform [8], local binary pattern
(LBP) [9], and histogram of oriented gradient (HOG). The method based on Gabor
transform can extract multi-direction and multi-scale information. At the same time,
the method has strong robustness in light condition and expression, but the efficiency
of Gabor transform is low. LBP can capture the fine details of the image and has strong
classification ability, but its adaptability to random noise is poor. More effective face recognition does not rely on a single method but combines various methods organically: it maximizes the information obtained from the image itself and from a large number of samples, fully exploits prior knowledge, and forms a distinctive face recognition algorithm system.
of face recognition, a face recognition algorithm based on LBP auto-correlogram
and SVM is proposed in this paper. After extracting the LBP auto-correlogram tex-
ture feature of the original face image, the LBP auto-correlogram feature is used as
the input of SVM classifier. The experiments on ORL and AR databases verify the
validity of the proposed algorithm.

35.2 Related Works

35.2.1 Color Correlogram [11]

A color correlogram (henceforth correlogram) expresses how the spatial correlation


of pairs of colors changes with distance. Informally, a correlogram for an image
is a table indexed by color pairs, where the d-th entry for row (i, j) specifies the
probability of finding a pixel of color j at a distance d from a pixel of color i in this
image. Here d is chosen from a set of distance values D [13]. An auto-correlogram
captures the spatial correlation between identical colors only. This information is a
subset of the correlogram and consists of rows of the form (i, j) only.
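
As an illustration of the auto-correlogram idea, the following sketch estimates, for each quantized gray level and each distance, the probability that a neighboring pixel has the same level; the quantization level, the distance set, and the use of only four axis-aligned neighbors per distance are simplifying assumptions, not the exact method of [11].

```python
# Illustrative sketch of a gray-level auto-correlogram.
import numpy as np

def auto_correlogram(img, n_levels=8, distances=(1, 3, 5, 7)):
    """img: 2-D array of gray values in [0, 255]; returns (n_levels, len(distances))."""
    q = (img.astype(int) * n_levels) // 256              # quantize to n_levels bins
    h, w = q.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    feat = np.zeros((n_levels, len(distances)))
    for di, d in enumerate(distances):
        match, total = np.zeros(n_levels), np.zeros(n_levels)
        for dy, dx in ((0, d), (d, 0), (0, -d), (-d, 0)):  # four neighbors at distance d
            ny, nx = ys + dy, xs + dx
            valid = (ny >= 0) & (ny < h) & (nx >= 0) & (nx < w)
            src, dst = q[valid], q[ny[valid], nx[valid]]
            for c in range(n_levels):
                sel = src == c
                total[c] += sel.sum()
                match[c] += (dst[sel] == c).sum()
        feat[:, di] = match / np.maximum(total, 1)         # P(same level at distance d)
    return feat
```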

Fig. 35.1 LBP algorithm principle diagram (a 3 × 3 neighborhood with center value 80 is thresholded against the center; the resulting binary pattern 01111100 gives the LBP code 124)

35.2.2 Local Binary Pattern

Ojala et al. [10] introduced the LBP texture operator in 1996, which originally works
with the 3 × 3 neighborhood. The pixel values of eight neighbors are decided by the
value of the center pixel, and then, the so-threshold binary values are weighted by
powers of two and summed to obtain the LBP code of the center pixel. Figure 35.1
shows an example of the LBP operator. In fact, let gc and g0, …, g7 denote, respec-
tively, the gray values of the center and its eight-neighbor pixels, and then the LBP
code for the center pixel with coordinate (x, y) is calculated by (35.1)


LBP(x, y) = \sum_{p=0}^{7} s(g_p - g_c) \cdot 2^{p} \quad (35.1)

where s(z) is the threshold function



s(z) = \begin{cases} 1, & z \ge 0 \\ 0, & z < 0 \end{cases} \quad (35.2)

In traditional real tasks, the statistic representation of LBP codes, LBP histogram
(LBPH), is usually used. That is, the LBP codes of all pixels for an input image are
collected into a histogram as a texture descriptor, i.e.,

\mathrm{LBPH}(i) = \sum_{x,y} \delta\{i, \mathrm{LBP}(x, y)\}, \quad i = 0, \ldots, 2^{8} - 1 \quad (35.3)

where δ(·) is the Kronecker delta function.


One extension of the LBP operator is to use neighborhoods of different sizes. The
extension is able to take any radius and neighbors around a center pixel, denoted by
LBP_{P,R}, by using a circular neighborhood and bilinear interpolation whenever the sampling point does not fall in the center of a pixel. For example, LBP_{16,2} refers to 16 neighbors in a neighborhood of radius 2. Figure 35.2 shows an example with different radii and neighbors. Another extension is the so-called uniform patterns, denoted by LBP_{P,R}^{u2}. An LBP binary code is called uniform if it contains at most
two bitwise transitions from 0 to 1 or vice versa when the binary string is considered
as a circular. For example, 00000000, 00011110, and 10000011 are uniform patterns.

Fig. 35.2 Adjacent pixel distribution for different values of P and R (P = 8, R = 1; P = 16, R = 2; P = 8, R = 2)

For the computation of LBPH, the uniform patterns are used such that each uniform
pattern has an individual bin and all nonuniform patterns are assigned to a separate
bin. So, with 8 neighbors, the numbers of bins for standard LBPH are 256 and 59
for uniform patterns LBPH, respectively; with 16 neighbors, the numbers of bins
are 65,536 and 243, respectively. Clearly, the uniform patterns are able to reduce the
length of histogram vectors [12].
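
A minimal NumPy sketch of the basic 3 × 3 LBP operator and its histogram (Eqs. 35.1–35.3) is shown below; border pixels are simply skipped, and in practice a library implementation such as skimage.feature.local_binary_pattern would normally be used.

```python
# Minimal sketch of the 3x3 LBP operator and its normalized histogram.
import numpy as np

def lbp_3x3(gray):
    """gray: 2-D uint8 image; returns the LBP code image (borders excluded)."""
    g = gray.astype(int)
    c = g[1:-1, 1:-1]                                   # center pixels g_c
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),      # eight neighbors g_p,
               (1, 1), (1, 0), (1, -1), (0, -1)]        # ordered clockwise from top-left
    code = np.zeros_like(c)
    for p, (dy, dx) in enumerate(offsets):
        neighbor = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code += ((neighbor - c) >= 0).astype(int) << p  # s(g_p - g_c) * 2^p
    return code

def lbp_histogram(code, n_bins=256):
    hist, _ = np.histogram(code, bins=n_bins, range=(0, n_bins))
    return hist / hist.sum()                            # normalized LBPH
```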

35.2.3 Face Recognition Algorithm

The support vector machine (SVM) minimizes the empirical risk and confidence
range by seeking the minimum structural risk and makes its classification more
generalizable. The basic idea of SVM is to map the data into a high-dimensional space and then build the optimal classification hyperplane in the new space. In this paper, the radial basis kernel function is selected:
 
K(x, y) = \exp\left(-r \|x - y\|^{2}\right) \quad (35.4)

70% of the data were selected as training data, and the expert interpretation results in the database were taken as classification labels. The training data and labels were input into the SVM classifier to obtain the classification model. The remaining 30% of the data were selected as testing data, with the expert interpretation results in the database taken as test labels. The testing data were then input into the SVM classification model to obtain the classification results, which were compared with the test labels to calculate the classification accuracy. Based on the above analysis, the face recognition process is designed as in Fig. 35.3.
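
The SVM step with the RBF kernel of Eq. (35.4) and the 70/30 split described above can be sketched with scikit-learn as follows; the feature matrix, label array, and gamma setting are placeholders, not the paper's actual data or parameters.

```python
# Sketch of the SVM classification step with an RBF kernel and a 70/30 split.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

features = np.random.rand(200, 128)            # placeholder fused CAC + LBP features
labels = np.random.randint(0, 10, size=200)    # placeholder identity labels

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, random_state=0)    # 70% training, 30% testing

clf = SVC(kernel="rbf", gamma="scale", C=1.0)  # K(x, y) = exp(-gamma * ||x - y||^2)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
print("classification accuracy:", accuracy_score(y_test, pred))
```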

Fig. 35.3 Face recognition process (both the training images and the images to be recognized are converted to auto-correlogram and LBP features, which are fed to the SVM)

35.3 Experimental Result

In this section, we conducted a series of experiments on the standard Georgia Tech face database (GT, 128 MB). The GT database contains images of 50 people; each person in the database is represented by 15 color JPEG images with a cluttered background taken at a resolution of 640 × 480 pixels. The average size of the
faces in these images is 150 × 150 pixels. The pictures show frontal and/or tilted
faces with different facial expressions, lighting conditions, and scale. Each image
is manually labeled to determine the position of the face in the image. Figure 35.4
shows seven pieces of the face image of the fourth person in GT database [14].

35.3.1 Experiment Steps

The main steps of color face recognition based on color auto-correlogram and LBP
are as follows:
(1) Sample selection. Select the training sample image from the face database.
(2) Color auto-correlogram. Obtain the color auto-correlogram image set using the
calculation method in Sect. 35.2.1.
(3) LBP feature. First, each image in the training set is segmented into blocks of the same size, and the feature vector of the training set images is obtained by using the calculation method in Sect. 35.2.2. Then, 2DPCA is used to reduce the dimension of

Fig. 35.4 Sample face images of one person



Table 35.1 Recognition accuracy (%) for different CAC/LBP proportions

Training set (%)   Proportion (CAC/LBP)
                   1/9     2/8     3/7     4/6     5/5     6/4     7/3     8/2     9/1
30                 0.816   0.781   0.838   0.844   0.826   0.711   0.681   0.622   0.583
40                 0.809   0.835   0.896   0.855   0.851   0.741   0.766   0.668   0.658
50                 0.833   0.812   0.875   0.871   0.859   0.817   0.761   0.694   0.662
60                 0.853   0.864   0.921   0.884   0.863   0.821   0.755   0.701   0.674
70                 0.866   0.871   0.922   0.8933  0.876   0.836   0.746   0.733   0.687

the feature vector. Finally, the results after dimensionality reduction are taken as
a basis of a set of vectors, and then the sample training set images are projected
on this set of vectors, respectively, to obtain the LBP features of the sample
training set images.
(4) Face recognition. First, the color auto-correlogram histogram and LBP features
are integrated to obtain the final features of the training set image. Second, the
remaining images in the face database are taken as the testing set, and the final
features of the testing set images are obtained through the same steps as the
training set. Finally, SVM classifier is used for face recognition.

35.3.2 Recognition Accuracy Test

In order to make the algorithm more robust, the recognition rate of the algorithm applied to color face recognition is compared by changing the proportion of the training set in the color face database. In the experiment, the proportion of the training set is set to 30%, 40%, 50%, 60%, and 70%, respectively. Color features are extracted from the color face image, the color face image is then converted to gray level to acquire grayscale texture features, and finally the color features and grayscale texture features are combined by a proportional distribution method for color face image recognition. Through the experiment, we find that the selection of
proportional allocation parameters has a certain impact on the recognition accuracy of color face recognition. By constantly adjusting the proportion allocated to color features and gray texture features, an optimal combination of the two can be found that yields higher recognition accuracy in color face image recognition, as shown in Table 35.1.
In the field of face recognition, the texture feature is a very important representation of the face, while color face recognition is inseparable from the representation of the color feature. According to the data in Table 35.1, the larger the proportion of color features is, the lower the recognition accuracy will be. The reason is that the
color feature cannot express the key face information of the color face image. It
only describes the distribution of color and the spatial correlation between colors

Table 35.2 Comparison of the combination algorithm and single algorithms

Methods     CAC     LBP     CAC+LBP
Accuracy    0.564   0.876   0.912

in the color face image. According to the data in Table 35.1, when the proportional
distribution of color auto-correlogram and LBP operator is 3:7, the accuracy of color
face recognition is the highest. We therefore set the proportion parameter to 3:7 for the following experiments. The data in Table 35.2 show that the combination algorithm is superior to the single algorithms.

35.4 Conclusion

In this paper, the application of color auto-correlogram combined with LBP method
is presented in color face recognition. Color auto-correlogram can well express the
color feature of face image, and LBP method can well describe the texture feature
of face image. Therefore, by combining the advantages of color auto-correlogram
and LBP method, the color feature and texture feature of color face image can be
extracted well and recognized by SVM classifier. Finally, experiments show that this
method is suitable for color face image recognition, and its accuracy is improved.

Acknowledgements This work is supported by the Science and Technology Department Research
Project of Jilin Province (No. 20190302115GX).

References

1. Ling, X., Yang, J., Ye, C.: Face detection and recognition system in color image series. Acta Electronica Sinica 31(4), 544–547 (2003)
2. Moghaddam, B., Pentland, A.: Probabilistic visual learning for object representation. IEEE
Trans. PAMI 19(7), 696–710 (1997)
3. Schneiderman, H., Kanade, T.: Object detection using the statistics of parts. Int. J. Comput.
Vis. 56(3), 151–177(2004)
4. Huang, C.L., Chen, C.W.: Human facial feature extraction for face interpretation and recogni-
tion. Pattern Recognit. 25(12), 1435–1444 (1992)
5. Turk, M., Pentland, A.: Eigenfaces for recognition. J. Cogn. Neurosci. 3(1), 71–86 (1991)
6. Sung, Kah-Kay, Poggio, Tomaso: Example-based learning for view-based human face detec-
tion. IEEE Trans. PAMI 20(1), 39–50 (1998)
7. Hinton, G.E., Salakhutdinov, R.: Reducing the dimensionality of data with neural networks.
Science 313(9), 504–507 (2006)
8. Yoo, C.H., Kim, S.W., Jung, J.Y., et al.: High-dimensional feature extraction using bit-plane
decomposition of local binary patterns for robust face recognition. J. Vis. Commun. Image
Represent. 45(C), 11–19(2017)
9. Zhao, Z., Jiao, L., Zhao, J., et al.: Discriminant deep belief network for high-resolution SAR
image classification. Pattern Recognit. 61, 686–701 (2017)

10. Ojala, T., Pietikainen, M., Harwood, D.: A Comparative study of texture measures with clas-
sification based on feature distributions. Pattern Recognit. 29, 51–59 (1996)
11. Shen, X., Wang, X., Du, J.: Image retrieval algorithm based on color autocorrelogram and
mutual information. Comput. Eng. 40(2), 259–262(2014)
12. Ojala, T., Pietikainen, M., Maenpaa, T.: Multiresolution gray-scale and rotation invariant texture
classification with local binary patterns. IEEE Trans. PAMI 24(7), 971–987 (2002)
13. Huang, J., Kumar, S.R., Mitra, M., et al.: Spatial color indexing and applications. In: 6th
International Conference on Computer Vision. IEEE Press, Bombay, India (1998)
14. http://www.anefian.com/research/face_reco.htm
Chapter 36
Saliency Detection Based
on the Integration of Global Contrast
and Superpixels

Yikun Huang, Lu Liu, Yan Li, Jie Chen and Jiawei Lu

Abstract In the field of computer vision, the detection of salient object is an impor-
tant step and one of the preconditions for salient object extraction. The outcome
resulting from some existing detection methods for salient object is considerably
different from the Ground Truth. In view of the shortcomings of existing methods,
this paper proposes a saliency detection method based on the integration of global
contrast and superpixels. The salience value of each pixel is measured according to
the global contrast of the pixels in the image. A histogram optimization technique is
used to highlight the low-contrast pixels of the salient region in the image and omit
the high-contrast pixels of the background. In order to improve the image quality of
the salient image, the superpixel image segmentation based on K-Means clustering
algorithm is proposed, and finally, we generate a more accurate saliency map through
the integration with superpixels. The experiment is performed on the public dataset
MSRA10K. The results show that the histogram optimization can help improve
the contrast of the salient pixels and generate a better saliency map by integrating
with superpixels. Compared with other classical algorithms, the proposed method
outperforms other methods.

Keywords Global contrast · Histogram · Superpixels · Saliency detection

Y. Huang (B) · L. Liu


Concord University College, Fujian Normal University, Fuzhou 350117, China
e-mail: 858881714@qq.com
Y. Li
Minnan University of Science and Technology, Quanzhou 362700, China
J. Chen · J. Lu
Intelligent Information Processing Research Center, College of Information Science and
Engineering, Fujian University of Technology, Fuzhou, China


36.1 Introduction

The human eyes can quickly and accurately find the target object in a complex scene
based on the degree of stimulation it exerted on the eyes. The saliency detection is
mainly used for the extraction of salient target in digital images, to simulate human
recognition of salient objects, and to identify the most attractive targets or features in
natural images. Saliency detection is one of the research hotspots of computer vision
in recent years. How to enable computers to quickly and accurately extract valuable
information from a large number of image sets has become one of the challenges in
the field of computer vision.
In recent years, saliency detection has been widely used in many fields such as
image segmentation, image compression, intelligent image retrieval, image match-
ing, and target recognition. There are more and more methods for detecting image
saliency. Some of them are based on biology, information theory, frequency domain,
and contrast. In addition, contrast-based detection methods consist of global contrast
and local-based contrast. However, the results generated by many saliency detection
algorithms are lacking in sufficient similarity with the Ground Truth. In this paper,
the initial saliency map is obtained by the saliency detection method based on global
contrast. Then the histogram optimization technique is adopted to improve the dis-
play effect of the saliency map. Finally, the superpixel image segmentation helps
integrate with the saliency map to generate the final one.
The paper is organized as follows. Section 36.2 describes related work. In
Sect. 36.3, we present a detailed description of our method, including global color
contrast, histogram optimization, and the integration with superpixels. Section 36.4
shows the experimental results. Section 36.5 gives the conclusion.

36.2 Related Work

In the 1990s, experts and scholars began to study saliency detection and applied
saliency detection to biology. In the early stage, the methods of saliency detection
were relatively simple and had some noticeable errors. In recent years, many experts
and scholars have committed to the study of saliency detection and proposed a variety
of methods, and some of them are widely used in face recognition [1], image seg-
mentation [2], video fusion [3] and other fields. Many experts have proposed some
evaluation indicators and verification methods for the results of saliency detection
[4, 5]. The saliency detection model is divided into two categories: “bottom-up” and
“top-down”. The former is driven by data and does not require any prior knowledge;
the latter is task-driven and needs to rely on prior knowledge.
At the current stage, many scholars widely use bottom-up models for research.
Wang et al. suggested that using global information for saliency detection is an
effective method, and adopted a new bottom-up model combined with multiscale
global cues for saliency detection. Yun et al. proposed the LC algorithm by using

global pixel differences [6]. The HC algorithm proposed by Cheng et al. [7] used color differences between global pixels to produce a saliency map. Niu et al.
[8] employed K-means method to cluster the images and proposed an improved
algorithm of clustering and fitting (CF) for saliency detection. This method also
achieved good results. Ishikura et al. [9] measured locally perceived color differences
by multiscale extrema for saliency detection. Singh [10] et al. used global contrast
versus local information to improve the color contrast of the overall image. This
method had certain limitations in the work of extracting saliency maps. Cuevas-
Olvera et al. [11] integrated image information with superpixels to extract saliency
maps. But this method did not use histograms for optimization and some considerable
noises in the final saliency map can be found from the experimental results. To the
best of our knowledge, many saliency detection methods do not well integrate the
information inherent in the image with the histogram information and superpixels.

36.3 Proposed Method

In this section, we introduce the methods and steps for our saliency detection. In
the first stage, the salience value of each pixel is measured by calculating the global
contrast of the pixels so that the salient object can be separated from the surround-
ing environment. For some images of complex texture, some errors may occur in
the salience value of pixel calculated by the global contrast. For instance, the pixel
contrast of the salient region is low, and the pixel contrast of the background is high.
Second, the histogram optimization is performed on the saliency map to dismiss
unreasonable distribution of contrast in the image. Third, the original image is seg-
mented by superpixels to form multiple pixel blocks with clear boundaries. Finally,
integrate the superpixel segmentation result with the histogram-optimized saliency
map to generate the final saliency map.

36.3.1 Global Color Contrast

Image contrast is one of the key factors affecting human visual perception. Color
images are usually composed of multiple color channels, and multichannel calcula-
tion takes much time for global color contrast. Since the gray-contrast feature can
extract the information of salient feature, this paper adopts the gray channel for global
contrast calculation in the calculation of saliency value. Liu et al. [12] also adopted
gray channels when extracting the salient features of infrared images, and achieved
good experimental results.
In this paper, to calculate the global color contrast of the pixel Ic in image I, it
is necessary to traverse all the pixels and to calculate the sum of the color distances

Fig. 36.1 Original images (top) and saliency maps processed by global color contrast (bottom)

between the Ic and all the other pixels. The global contrast of the Ic can be regarded
as the salience value of the pixel, recorded as S(Ic ); the formula is as follows:

S(I_c) = \sum_{\forall I_i \in I} \left\| I_c - I_i \right\| \quad (36.1)

Image I is a grayscale image, and the value of Ii is between 0 and 255. The
histogram is the statistical distribution of image pixels, which can directly show the
number of each gray level in the image. Since the distance between the pixels of the
same gray value and all the pixels in the image is the same, the histogram is used to
carry out prior statistics on the image, and the calculation results of the histogram is
stored in the array, which can help improve the efficiency of calculating the global
color contrast of the image. Reconstruct formula (36.1) with the following formula:


S(a_m) = \sum_{n=0}^{255} f_n \left\| a_m - a_n \right\| \quad (36.2)

Herein f_n is obtained from the histogram, representing the frequency of occurrence of the n-th gray level in image I. a_m is the color value of the pixel I_c, and a_n is the color value of the pixel I_i. The salience value of each pixel can be obtained by the
calculation of formula (36.2). Finally, the salience value of each pixel is converted
to the contrast value, and the processed image is shown in Fig. 36.1.
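
A sketch of the histogram-accelerated computation of Eq. (36.2) follows; the variable names and the final normalization to [0, 255] are assumptions made so that the map can be viewed as a contrast image.

```python
# Sketch of histogram-accelerated global contrast saliency (Eq. 36.2).
import numpy as np

def global_contrast_saliency(gray):
    """gray: 2-D uint8 image; returns a float saliency map of the same shape."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)          # f_n
    levels = np.arange(256, dtype=float)
    salience_per_level = np.abs(levels[:, None] - levels[None, :]) @ hist  # S(a_m) per gray level
    saliency = salience_per_level[gray]              # map each pixel to its level's salience
    span = saliency.max() - saliency.min()
    return 255.0 * (saliency - saliency.min()) / (span + 1e-12)
```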

36.3.2 Histogram Optimization

From the result shown in Fig. 36.1, it can be found that after processing the image by
the method of Sect. 36.3.1, a high-resolution saliency map can be obtained. But some
pixels in the salient regions have low contrast and some in the background have high
contrast. To solve this problem, we propose a histogram optimization method, which
can help improve the overall display effect of the saliency image by enhancing the
pixel contrast of the salient regions and lowering the pixel contrast of the background
regions.
The processing results of Fig. 36.1 are displayed by the histogram, as shown
in Fig. 36.2b. We found that a large number of pixels are distributed in the range
of 0–50, and some pixels are distributed between 50 and 250 to varying degrees.
From the perspective of the ideal saliency map, the color values of the pixels in
the saliency map should be concentrated around 0 or 255 after extracting the salient
object. Therefore, we need to optimize the histogram so that the salient regions in the
salient image can be concentrated as close as possible to the color value of 255, and
the background regions in the salient image are distributed as close as possible to the
color value of zero. We set two thresholds, minlevel and maxlevel, which are used to
indicate the minimum and maximum gray values in the saliency map, respectively.
Change the value to 0 when its gray value is less than minlevel, and change the value
to 255 when its gray value greater than maxlevel. The color value of the middle
region is assigned by the region contrast, and the calculation formula is as shown in
formula (36.3).

a_n = \begin{cases} 0, & a_n \le \text{minlevel} \\ 255, & a_n \ge \text{maxlevel} \\ \dfrac{a_n - \text{minlevel}}{\text{maxlevel} - \text{minlevel}}, & \text{minlevel} < a_n < \text{maxlevel} \end{cases} \quad (36.3)

We experimented with the MSRA1000’s public dataset and achieved good results
when setting minlevel = 85 and maxlevel = 170. The optimized histogram is shown in Fig. 36.2c.
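
The thresholding of Eq. (36.3) with minlevel = 85 and maxlevel = 170 can be sketched as below; scaling the middle band to 0–255 (rather than 0–1) is an assumption made so that the result remains an 8-bit gray image.

```python
# Sketch of the histogram optimization of Eq. (36.3).
import numpy as np

def optimize_histogram(saliency, minlevel=85, maxlevel=170):
    s = saliency.astype(float)
    out = np.empty_like(s)
    out[s <= minlevel] = 0                                # background pixels -> 0
    out[s >= maxlevel] = 255                              # salient pixels -> 255
    mid = (s > minlevel) & (s < maxlevel)
    out[mid] = 255.0 * (s[mid] - minlevel) / (maxlevel - minlevel)   # linear stretch
    return out.astype(np.uint8)
```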

36.3.3 The Integration with Superpixels

In 2018, Niu et al. [13] used a simple linear iterative clustering (SLIC) algorithm
based on color similarity and spatial distance to achieve superpixel segmentation in
the process of salient object segmentation and achieved good results. This method
converted the original image into the CIELAB color space and performed a five-
dimensional mensuration on the l, a, b color channels and the two-dimensional space
(x, y) of the pixels.
Set the number of superpixel blocks to be divided as k, and use the k-means clustering method to generate superpixels.

Fig. 36.2 a Saliency map processed by global color contrast; b histogram before optimization;
c optimized histogram; and d saliency map optimized by histogram

Set the cluster center C_k = [l_k, a_k, b_k, x_k, y_k]^T, and move the cluster center C_k to the lowest gradient position in its 3 × 3 neighborhood, to avoid the cluster center falling on an edge.
For an image of w × h pixels, after superpixel segmentation the number of pixels in each region is (w × h)/k, and the side length of each superpixel is S ≈ \sqrt{(w \times h)/k} (w and h indicate the number of pixels of the width and height of the image, respectively). Calculate the spatial distance and color distance between each pixel and the cluster center C_k within the adjacent 2S × 2S region, as shown in formulas (36.4)–(36.6).

d_c = \sqrt{(l_k - l_i)^2 + (a_k - a_i)^2 + (b_k - b_i)^2} \quad (36.4)

d_s = \sqrt{(x_k - x_i)^2 + (y_k - y_i)^2} \quad (36.5)

D = \sqrt{(d_c)^2 + \left(\frac{d_s}{S}\right)^2 m^2} \quad (36.6)

In formula (36.6), the threshold m is used to adjust the weight value of ds, and
the value range is [1, 40]. With formula (36.6), the pixel is allowed to update its own
region and the clustering center, and the above steps are iterated continuously until
the algorithm converges.
In this algorithm, k = 400 is set. After processing with the above method, the
original image is superpixel segmented to generate k superpixels, and there are obvi-
ous dividing lines at the edge of the salient object, which can clearly segment the
foreground and background objects, as shown in Fig. 36.3b.
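
For reference, the superpixel segmentation step can be reproduced with scikit-image's SLIC implementation as sketched below (the paper describes its own SLIC-style procedure); the image path and the compactness value standing in for the weight m are assumptions.

```python
# Sketch of the superpixel segmentation step with scikit-image's SLIC (>= 0.17).
from skimage import io, img_as_float
from skimage.segmentation import slic

image = img_as_float(io.imread("input.jpg"))      # hypothetical input image
segments = slic(image, n_segments=400, compactness=10, start_label=0)  # k = 400
print("number of superpixels:", segments.max() + 1)
```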

Fig. 36.3 a Saliency map optimized by histogram; b the original image after superpixel segmen-
tation; c our Saliency map; and d Ground Truth

In the process of histogram optimization in Sect. 36.3.2, the edges of the salient
map may be impaired, or the pixels that originally belonged to the foreground have
become background pixels, resulting in a notable inaccuracy of the saliency map.
Finally, we integrate the superpixel image with the histogram-optimized image and
map the region range of each superpixel to the histogram. Set the salience value after
integration to S̄, as shown in (36.7).

\bar{S} = \frac{G \times G'}{255} \quad (36.7)

G is the average gray value of the region in the saliency map optimized by the histogram, and its value range is [0, 255]. G' is the average gray value of the corresponding region in the superpixel map, and its value range is also [0, 255]. If the average gray value of a region in the optimized map is 0, then S̄ = 0; from formula (36.7), the value range of S̄ is [0, 255]. We set a threshold δ. If S̄ is smaller than δ, the
superpixel region is the background region. If S̄ is larger than δ, then the superpixel
region is a salient region. If it is a salient region, the gray value of the pixel in this
region in Fig. 36.3a is updated to 255; otherwise, it is updated to 0, and the final
calculation result is shown in Fig. 36.3c.
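
The integration step of Eq. (36.7) can be sketched as follows; the threshold δ and the exact gray-level inputs are assumptions, since this section does not state the value of δ.

```python
# Sketch of the integration with superpixels (Eq. 36.7): average both maps over
# each superpixel region, threshold S_bar, and write 0/255 back.
import numpy as np

def integrate_with_superpixels(opt_map, sp_gray, segments, delta=128):
    """opt_map: histogram-optimized saliency map (uint8); sp_gray: gray image of the
    superpixel result; segments: integer label map (e.g., from SLIC)."""
    final = np.zeros_like(opt_map, dtype=np.uint8)
    for label in np.unique(segments):
        region = segments == label
        G = opt_map[region].mean()          # mean gray value in the optimized map
        G_prime = sp_gray[region].mean()    # mean gray value in the superpixel image
        s_bar = G * G_prime / 255.0         # Eq. (36.7)
        final[region] = 255 if s_bar > delta else 0
    return final
```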

36.4 Experimental Results

In order to evaluate the effectiveness of the proposed algorithm, we compare the


proposed algorithm with some typical algorithms of saliency map extraction, includ-
ing GB, MZ, FT, CA, LC, and HC algorithms. The experiment was performed on a
public dataset MSRA1K, which contained 1000 images and Ground Truth for each
image. The database is widely used to salient target detection and segmentation, and
the image size is mainly concentrated in 300 × 400 pixels and 400 × 300 pixels.

Fig. 36.4 Results of different saliency detection methods

The experiments were performed on the Windows 10 operating system, with an Intel(R) Core(TM) i5-7400 processor and 8 GB of memory. The algorithm was implemented in the Python programming language.
At present, there are many evaluation metrics for saliency detection, and even
some scholars have proposed their own evaluation metrics. To better contrast with
the typical saliency detection algorithm, we use the precision-recall curves to evaluate
the saliency map.
When calculating the PR curve, the saliency map of adaptive threshold binariza-
tion is employed, and the ordinate and abscissa refer to the precision and recall rate,
respectively. The PR curve is calculated by comparing the saliency map with the
Ground Truth diagram, as shown in Fig. 36.4. From the experimental results, the
method we propose has higher accuracy than other algorithms.

36.5 Conclusions

In this study, we propose a saliency detection method based on the integration of


global contrast and superpixels. This method proceeds with global color contrast
calculation and histogram optimization. In order to approximate Ground Truth with
the final saliency map, the optimized saliency map is proposed to be integrated with
the superpixel-segmented image. The saliency maps are compared with several clas-
sical algorithms and displayed by PR curves. After comparison, we have determined
that the proposed method has higher accuracy. This method is only tested on the pub-
lic dataset MSRA1K, and the images in the dataset basically have only one salient
object. For the image with multiple salient objects or the image with more complex
background color, whether the method can be well performed will be further studied
in the future.

Acknowledgements This work is supported by the 2018 Program for Outstanding Young Scientific
Researcher in Fujian Province University, Education and Scientific Research Project for Middle-
aged and Young Teachers in Fujian Province (No: JZ170367).

References

1. Karczmarek, P., et al.: A study in facial features saliency in face recognition: an analytic hierarchy process approach. Soft. Comput. 21(24), 7503–7517 (2017)
2. Hui, B., et al.: Accurate image segmentation using Gaussian mixture model with saliency map. Pattern Anal. Appl. 2, 1–10 (2018)
3. Huang, Y.: Simulation of parallel fusion method for multi-feature in double channel video image. Comput. Simul. 35(4), 154–157 (2018)
4. Niu, Y., Chen, J., Guo, W.: Meta-metric for saliency detection evaluation metrics based on application preference. Multimed. Tools Appl. 4, 1–19 (2018)
5. Xue, X., Wang, Y.: Using memetic algorithm for instance coreference resolution. IEEE Trans. Knowl. Data Eng. 28(2), 580–591 (2016)
6. Yun, Z., Shah, M.: Visual attention detection in video sequences using spatiotemporal cues. In: ACM International Conference on Multimedia (2006)
7. Cheng, M.M., et al.: Global contrast based salient region detection. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 569–582 (2015)
8. Niu, Y., Lin, W., Ke, X.: CF-based optimisation for saliency detection. IET Comput. Vis. 12(4), 365–376 (2018)
9. Ishikura, K., et al.: Saliency detection based on multiscale extrema of local perceptual color differences. IEEE Trans. Image Process. 27(2), 703 (2018)
10. Singh, A., Yadav, S., Singh, N.: Contrast enhancement and brightness preservation using global-local image enhancement techniques. In: Fourth International Conference on Parallel (2017)
11. Cuevas-Olvera, M., et al.: Salient object detection in digital images based on superpixels and intrinsic features. IEEE (2018)
12. Liu, S., Jiang, N., Liu, Z.: Saliency detection of infrared image based on region covariance and global feature. J. Syst. Eng. Electron. 29(3), 483–490 (2018)
13. Niu, Y., Su, C., Guo, W.: Salient object segmentation based on superpixel and background connectivity prior. IEEE Access 6, 56170–56183 (2018)
Chapter 37
Mosaic Removal Algorithm Based
on Improved Generative Adversarial
Networks Model

He Wang, Zhiyi Cao, Shaozhang Niu and Hui Tong

Abstract Generative adversarial networks have yielded outstanding results in unsu-


pervised learning, but existing research has shown that the results are not stable in specific domains. In this paper, an improved generative adversarial networks model is proposed. First, the loss calculation method of the generative model is changed, which makes the removal target of the whole network controllable. Second, a deep convolution network is added to the existing network, which improves the accuracy of mosaic removal. Combined with the loss calculation method of the pixel networks, the network effectively overcomes the instability of generative adversarial networks under specific conditions. Finally, the experimental results show that this network's performance in overall mosaic face removal is superior to other existing algorithms.

Keywords Generative adversarial networks · Unsupervised learning · Mosaic


removal

37.1 Introduction

The removal of the overall mosaic of images is a challenging study. Early mosaic
removal algorithms include nearest neighbor interpolation algorithms, bilinear inter-
polation algorithms, and cubic spline interpolation algorithms. Such an algorithm

H. Wang · Z. Cao · S. Niu (B) · H. Tong


Beijing Key Lab of Intelligent Telecommunication Software and Multimedia, School of Computer
Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
e-mail: szniu@bupt.edu.cn
H. Wang
e-mail: fiphoenix@bupt.edu.cn
Z. Cao
e-mail: 68545849@qq.com
H. Tong
e-mail: tonghui@bupt.edu.cn

is simple and fast, but the image obtained by interpolation recovery is not ideal,
especially at the edge of the image, the distortion is more obvious, and edge blurring
and color diffusion occur. The late demosaicing algorithms mainly include the VCD
(variance of color difference) algorithm published by Chung et al. [1] in 2006, the
SA (successive approximation) algorithm published by Li [2] in 2005, and the DLMMSE (directional linear minimum mean-square-error estimation) algorithm published by Zhang et al. [3] in 2005. These demosaicing algorithms all utilize
the correlation between the inserted pixel and its neighboring pixels, but for the whole
or deep mosaic image, the above algorithm removal effect is not obvious. With the
development of deep learning, more and more fields have introduced this method,
such as deep kernel learning [4] and control approaches [5]. This paper attempts to use unsupervised generative adversarial networks to remove heavy mosaics from face photos. Building on the convolutional neural network (CNN), which can acquire deep image features, and on generative adversarial networks, which can generate realistic HD faces, this paper constructs a new Generative Adversarial Networks model based on deep learning.

37.2 Related Work

Generative Adversarial Networks (GANs) is a training method for unsupervised


learning proposed by Goodfellow et al. [6] in 2014. However, since their introduction, GANs have been difficult to train: the losses of the generator and the discriminator do not indicate the progress of training, and the generated samples lack diversity. Since then, many papers have attempted improvements, but the gains have been limited. One of the most famous improvements is the Deep Convolutional Generative Adversarial Networks (DCGAN) of Dundar et al. [7] in 2015, which relies on the experimental design of the discriminator and generator architectures. The authors found a better set of network architecture settings, but the training is still not stable enough in essence. In 2017, Arjovsky et al. [8] proposed the Wasserstein GAN (WGAN). Subsequent research found that WGAN still generates low-quality samples or fails to converge in some scenarios. Gulrajani et al. [9] proposed an improved training method for WGAN in 2017.
Another generation model corresponding to GANs is the conditional image gen-
eration proposed by Oord et al. [10] in 2016 using the PixelCNN decoder. They
studied a model based on the PixelCNN architecture that can generate new images
based on changes in conditions.
The mosaic restoration experiment in this paper mainly uses the WGAN network,
but because the performance of the single network is not stable enough, after sev-
eral experiments, the DCGAN network and the PixelCNN model are successfully
combined.

Fig. 37.1 Full convolution flowchart of the new GANs Generator network

37.3 Design of New GANs Based on WGAN

The facial repair generation was proposed by Li et al. [11] in 2017. We used the
DCGAN network in the discriminant loss calculation part and the WGAN network
optimizer in the model optimization part, which achieved good results. However, the new GANs network is very unstable and is prone to abnormal generation loss values and negative discriminant loss values. After the generation loss calculation of the PixelCNN model is incorporated, the model gradually becomes stable.

37.3.1 Design of Generator Network and Discriminator Network

Due to the significant modification of the calculation of the generation loss, the
generation model of the new GANs removes the Gaussian distribution from the
traditional GANs. The generated model directly inputs 16 overall mosaic-processed
photos of size 64 × 64, that is, 16 × 64 × 64 × 3 vectors, representing the number of
pictures, picture height, picture width, and image channel. The vector is enlarged to
16 × 128 × 128 × 3 before the start of the full convolution operation. The convolution
kernel in Fig. 37.1 has four parameters, namely the height of the convolution kernel,
the width of the convolution kernel, the number of image channels, and the number
of convolution kernels, where the value of convolution kernel 1 is 8 × 8 × 3 × 256.
In the convolution, the step size of each layer of the image is 1 × 1×1 × 1. Since the
step size is 1, the filling layer does not participate in the calculation. The generator
model structure is shown in Fig. 37.1.
The convolution formula is defined as follows (out_height, out_width indicates
the convolution output height and width):

out_height = (in_height + 2 × pad_height − kernel_height)/stride_height + 1   (37.1)

out_width = (in_width + 2 × pad_width − kernel_width)/stride_width + 1   (37.2)

Table 37.1 First convolution input parameters

Variable   Input             Kernels             Stride              Padding
Height     128 (in_height)   8 (kernel_height)   1 (stride_height)   0 (pad_height)
Width      128 (in_width)    8 (kernel_width)    1 (stride_width)    0 (pad_width)

According to the parameter values in Table 37.1, combined with the convolution
formula, the output height and output width can be calculated: (128 + 2 × 0 − 8)/1 + 1 = 121. The number of input images in the first layer is 16, and the number of
convolution kernels is 256, so the first convolution output is 16 × 121 × 121 ×
256. The output of the other layers can be separately determined according to the
convolution kernel and the convolution formula of Fig. 37.1.
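
The worked example above can be checked with a small helper reproducing Eqs. (37.1)–(37.2); the function name is illustrative.

```python
# Helper for Eqs. (37.1)-(37.2); the first convolution gives
# (128 + 2*0 - 8)/1 + 1 = 121 in both height and width.
def conv_output_size(in_size, kernel, stride=1, pad=0):
    return (in_size + 2 * pad - kernel) // stride + 1

out_height = conv_output_size(128, kernel=8, stride=1, pad=0)   # 121
out_width = conv_output_size(128, kernel=8, stride=1, pad=0)    # 121
print(out_height, out_width)   # first convolution output: 16 x 121 x 121 x 256
```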
The Discriminator network has the same structure as the Generator network, and
the input is the output of the Generator network, and the output is the discriminating
result for the generated image.

37.3.2 Generation Loss and Discriminant Loss Calculation

According to Ref. [8], it can be seen that the main improvement of WGAN compared
to the original GANs is that the generation loss and the discriminant loss calculation
do not take the logarithm. The generation loss of WGAN is calculated as shown
in formula (37.3), where X represents the output of the discriminant model. In this
paper, in order to perform mosaic restoration, the calculation using Eq. (37.3) is poor.
To this end, this paper studies the loss calculation of PixelCNN network, adjusts the
calculation method of generating loss, and puts the calculation focus on the distance
between the generated model output and the learning target feature, as shown in the
following formula (37.4). In this paper, there are three parameter inputs for the loss,
namely the discriminant model output (defined as d_out), the generated model output
(defined as g_out), and the learning target feature (defined as t_feature).

\mathrm{Lloss}(X) = \frac{1}{16 \times 16 \times 16 \times 3} \sum_{s=1}^{16} \sum_{i=1}^{16} \sum_{j=1}^{16} \sum_{k=1}^{3} \left\| -x^{s}_{i,j,k} \right\| \quad (37.3)

\mathrm{L1loss}(X, Y) = \frac{1}{16 \times 16 \times 16 \times 3} \sum_{s=1}^{16} \sum_{i=1}^{16} \sum_{j=1}^{16} \sum_{k=1}^{3} \left\| x^{s}_{i,j,k} - y^{s}_{i,j,k} \right\| \quad (37.4)

C(y, a) = -\frac{1}{n} \sum_{x} \left[ y \ln a + (1 - y) \ln(1 - a) \right] \quad (37.5)

Although Ref. [8] mentions avoiding logarithms, experiments have shown that
the use of cross entropy for mosaic restoration is very good, as shown in Eq. (37.5).

In this paper, the inputs x, y of the first part of the generation loss, L1loss, are g_out and t_feature defined above, respectively. The inputs a, y of the second part, the cross entropy, are d_out defined above and an all-ones tensor of the same shape as d_out (defined as d_out_one), respectively. Finally, our generation loss gene_loss is defined as follows:

gene_loss = L1loss(g_out, t_feature) × 100 + C(d_out_one, d_out) × 1 (37.6)

Although Ref. [8] mentions avoiding logarithms, experiments have shown that it
is good to use cross entropy for discriminating losses for mosaic restoration. There
is a distinct feature of the DCGAN network here. The discriminant loss in this paper
is equal to the generated image loss (defined as f_loss) minus the real image loss
(defined as t_loss), which parallels the improvement made to the generation loss. The real image loss is the average cross entropy of the discriminant model output on the real feature image (defined as t_out for input t_feature); its inputs a, y are, respectively, t_out and the 1-valued tensor of the same shape (represented by t_out_one). The generated image loss is the average cross entropy of the discriminant model output on the generated result (defined as d_out for input g_out); its inputs a, y are, respectively, d_out and the 1-valued tensor of the same shape (represented by d_out_one). The final discriminator network loss d_loss definition
formula is as follows:

d_loss = C(d_out_one, d_out) − C(t_out_one, t_out) + L1loss(g_out, t_feature) × 100   (37.7)
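
The loss terms of Eqs. (37.3)–(37.7) can be sketched in NumPy as follows; tensor shapes and variable names follow the definitions above, but this is an illustrative sketch rather than the authors' TensorFlow 0.10 implementation.

```python
# NumPy sketch of the generation loss and discriminant loss (Eqs. 37.4-37.7).
# g_out: generator output; d_out / t_out: discriminator outputs on generated / real
# images; t_feature: target image batch.
import numpy as np

def l1_loss(x, y):
    return np.mean(np.abs(x - y))                               # Eq. (37.4)

def cross_entropy(y, a, eps=1e-8):
    a = np.clip(a, eps, 1 - eps)
    return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))    # Eq. (37.5)

def gan_losses(g_out, d_out, t_out, t_feature):
    d_out_one = np.ones_like(d_out)
    t_out_one = np.ones_like(t_out)
    gene_loss = l1_loss(g_out, t_feature) * 100 + cross_entropy(d_out_one, d_out)   # Eq. (37.6)
    d_loss = (cross_entropy(d_out_one, d_out) - cross_entropy(t_out_one, t_out)
              + l1_loss(g_out, t_feature) * 100)                                    # Eq. (37.7)
    return gene_loss, d_loss
```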

37.3.3 Parameter Optimization and Truncation

In order to minimize the generation loss and the discriminant loss, optimizers are needed to update the weight parameters, and each of the two models has its own optimizer.
The Adam optimization algorithm proposed by Kingma et al. [12] in 2014 is used
to optimize the gradient direction of the generative model; this gradient is then used to minimize the next loss by updating the weight parameter values. According to
the characteristics of the DCGAN network, the fixed learning rate is 0.0002. The
RmsProp optimization algorithm proposed by Hinton et al. [13] in 2012 is used to
optimize the gradient direction of the discriminant model, and then the gradient is
used to minimize the next loss by updating the weight parameter values. According
to the characteristics of WGAN network, the fixed learning rate is 0.0002 in the
experiment, and other parameters are default.
Although the WGAN original text indicates that the generation model and the
discriminant model are optimized using the RmsProp optimization algorithm, for the
mosaic restoration, the experimental results show that the generation model is better
with the Adam optimization algorithm. Next, generate a model and a discriminant
model, and minimize the loss value by updating the weight parameters. At the same

Fig. 37.2 Overall flowchart of the new GANs network

time, the updated weight parameters are used for the next convolution operation.
Thus, through the backpropagation algorithm, each time the gradient is updated by
the learning rate and then combined with the model loss; the model loss of the next
cycle is minimized.
WGAN pointed out that in order to solve the GANs network crash problem,
each time the parameters of the discriminator are updated, their absolute values need to be truncated to no more than a fixed constant. The constant in this paper is defined as 0.008. Therefore, after the optimization of the model weight parameters in the previous step, a truncation step is added in this paper. The truncation algorithm is as follows: any parameter value greater than 0.008 is set to 0.008, and any value less than −0.008 is set to −0.008. This ensures the stability of the update to
a certain extent.
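Under the assumption that the two networks are available as PyTorch modules (the paper itself used TensorFlow 0.10), the optimizer choice and the truncation step could be sketched as follows; the function names are ours.

```python
import torch

CLIP = 0.008  # truncation constant used in this paper

def build_optimizers(generator: torch.nn.Module, discriminator: torch.nn.Module):
    # Adam (lr = 0.0002) for the generation model, RMSProp (lr = 0.0002) for the discriminant model
    g_opt = torch.optim.Adam(generator.parameters(), lr=0.0002)
    d_opt = torch.optim.RMSprop(discriminator.parameters(), lr=0.0002)
    return g_opt, d_opt

def truncate_weights(discriminator: torch.nn.Module, c: float = CLIP):
    # Values above c become c, values below -c become -c
    with torch.no_grad():
        for p in discriminator.parameters():
            p.clamp_(-c, c)
```

Calling truncate_weights after every discriminator update reproduces the truncation rule described above.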
We also refer to the semi-supervised learning with ladder networks proposed by Rasmus et al. [14] in 2015, to batch normalization [15], proposed in 2015 to accelerate deep network training by reducing internal covariate shift, and to the discriminative unsupervised feature learning with exemplar convolutional neural networks of Dosovitskiy et al. [16] in 2015.
In this paper, the fixed learning rate is specified as 0.0002, 200,000 CELEBA pictures are used for training features, and 16 pictures are randomly loaded in a single cycle. Each training cycle first computes the generation loss and the discriminant loss through forward propagation, and then minimizes the losses and updates the weight parameters through gradient descent. Every 200 training iterations, the current weight parameters are used as a restoration model applied to the test feature image, and the generation result of the test picture is saved. Finally, 43 sets of training feature pictures are obtained, and the definition of these feature pictures gradually becomes better as the training time increases.
The complete structure of the new GANs is shown in Fig. 37.2.

37.4 Experiment and Analysis

The experimental operating system in this paper is Windows 10, 64-bit, based on the TensorFlow framework, version 0.10. The programming language is Python 3.5; the core extension MoviePy is version 0.2.2.11, Numpy is version 1.11.1, Scipy is version 0.18.0, and Six is version 1.10.0. The data set for face generation uses the public face data set CELEBA, which has a total of 200,000 face photos of 178 × 218 size.

Fig. 37.3 Test image

Fig. 37.4 Restore results
The experimental learning rate of this group is 0.0002, 200,000 CELEBA images
are used for training features, and 16 images are randomly loaded in a single cycle.
The test picture is shown in Fig. 37.3 and has a size of 178 × 218. The result of the
mosaic restoration is shown in Fig. 37.4.
Before starting the experiment, the images are compressed to 64 × 64 and the mosaic is added; training then begins. The goal of the experiment is to reduce the difference between the mosaic photos and the real features, and finally to restore the mosaic photos. Each cycle first produces an output through the generation model, then passes it to the discriminant model for discrimination, and then uses the outputs of the generation model and the discriminant model to calculate the generation loss and the discriminant loss. Finally, the weight parameters are optimized by the backpropagation algorithm to start the
next cycle. The result is output directly every 200 cycles.

Fig. 37.5 Comparison of the results of 200, 800, and 15,000 cycles

Fig. 37.6 The effect of the new GANs generation loss

The output of the 200th cycle is shown in Fig. 37.5a, and the effect is very poor. The output image of the 800th cycle is gradually improved, as shown in Fig. 37.5b. The result of 15,000 cycles is shown in Fig. 37.5c, which is basically close to the real face. The stability of the entire experimental process can be seen from the generation loss curve shown in Fig. 37.6.
For the overall mosaic restoration, the best results come from the pixel-recursive-based super-resolution algorithm proposed in 2017 by Google Brain [17]. In the example shown in [17], the right side is the real character avatar on the 32 × 32 grid, the left side is the same avatar compressed to the 8 × 8 grid, and the middle photo is the result of Google Brain's guess based on the low-resolution swatches.
This paper also draws on the principled methods for training generative adversarial networks proposed by Arjovsky and Bottou [18] in 2016. The final experimental comparison results are shown in Table 37.2.

Table 37.2 Comparison of mosaic restoration algorithms


Method pSNR SSIM MS-SSIM Consistency % Fooled
ResNet L2 29.16 0.90 0.90 0.004 4.0 ± 0.2
Google 29.09 0.84 0.86 0.008 11.0 ± 0.1
Ours 29.17 0.88 0.88 0.029 14.0 ± 0.1

37.5 Conclusion

In this paper, the overall mosaic restoration using the generative adversarial networks
is studied, and the calculation method of GANs network generation loss is improved,
so that the target of the generation model can be controlled. At the same time, the
characteristics of deep convolution of DCGAN network and the calculation method
of discriminant loss are introduced. Experiments first produce realistic restoration results for the mosaic image. Second, the proposed improvement helps to counteract the instability of the WGAN network. Finally, the comparison results show that the proposed algorithm outperforms the existing algorithms.

Acknowledgements This work was supported by National Natural Science Foundation of China
(No. U1536121, 61370195).

References

1. Chung, K.H., Chan, Y.H.: Color demosaicing using variance of color differences. IEEE Trans.
Image Process. 15(10), 2944–2955 (2006)
2. Li, X.: Demosaicing by successive approximation. IEEE Trans. Image Process. 14(3), 370–379 (2005)
3. Zhang, L., Wu, X.: Color demosaicking via directional linear minimum mean square-error
estimation. IEEE Press (2005)
4. Chen, X., Peng, X., Li, J.-B., Peng, Yu.: Overview of deep kernel learning based techniques
and applications. J. Netw. Intell. 1(3), 83–98 (2016)
5. Xia, Y., Rong, H.: Fuzzy neural network based energy efficiencies control in the heating energy
supply system responding to the changes of user demands. J. Netw. Intell. 2(2), 186–194 (2017)
6. Goodfellow, I.J., Pougetabadie, J., Mirza, M., et al.: Generative adversarial nets. Adv. Neural.
Inf. Process. Syst. 3, 2672–2680 (2014)
7. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolu-
tional generative adversarial networks. Comput. Sci. (2015)
8. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein GAN (2017). arXiv:1701.07875
9. Gulrajani, I., Ahmed, F., Arjovsky, M., et al.: Improved training of Wasserstein GANs (2017).
arXiv:1704.00028
10. Oord, A., Kalchbrenner, N., Vinyals, O., et al.: Conditional image generation with PixelCNN
decoders (2016). arXiv:1606.05328
11. Li, Y., Liu, S., Yang, J., et al.: Generative face completion (2017). arXiv:1704.05838
12. Kingma, D.P., Ba, J.L.: Adam: a method for stochastic optimization (2014). arXiv:1412.6980

13. Tieleman, T., Hinton, G.: Lecture 6.5—RmsProp: divide the gradient by a running average of
its recent magnitude. In: COURSERA: Neural Networks for Machine Learning (2012)
14. Rasmus, A., Valpola, H., Honkala, M., Berglund, M., Raiko, T.: Semisupervised learning with
ladder network (2015). arXiv:1507.02672
15. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing
internal covariate shift (2015). arXiv:1502.03167
16. Dosovitskiy, A., Fischer, P., Springenberg, J.T., Riedmiller, M., Brox, T.: Discriminative unsu-
pervised feature learning with exemplar convolutional neural networks. IEEE Trans. Pattern
Anal. Mach. Intell. 99 (2015)
17. Dahl, R., Norouzi, M., Shlens, J.: Pixel recursive super resolution (2017).
arXiv:1702.00783
18. Arjovsky, M., Bottou, L.: Towards principled methods for training generative adversarial net-
works. NIPS 2016 Workshop on Adversarial Training
Chapter 38
Xception-Based General Forensic
Method on Small-Size Images

Lisha Yang, Pengpeng Yang, Rongrong Ni and Yao Zhao

Abstract Developing universal forensic methods that can simultaneously identify multiple image operations, so as to verify the authenticity and processing history of an image, has attracted more and more attention. Although numerous forensic tools and methods have emerged to detect the traces left by various image operations, the accuracy of current techniques still decreases significantly as the size of the investigated images is reduced. To overcome this issue, especially for small-size or highly compressed images, we propose a method using an Xception-based convolutional neural network. While CNN-based methods are able to learn features directly from data for classification tasks, they are not well suited for forensic problems in their original form. Hence, we add a magnified layer as the preprocessing layer. The input images are magnified by the nearest neighbor interpolation algorithm in the magnified layer, which preserves the properties of image operations better than other magnification tools, and are then input into the CNN model for classification. Finally, we replace the global average pooling with an adaptive average pooling function to adapt to any size of input pictures. We evaluate the proposed strategy on six typical image processing operations. Through a series of experiments, we show that this approach can significantly improve classification accuracy, to 97.71% when the images are of size 64 × 64. More importantly, it outperforms all the existing general-purpose manipulation forensic methods.

Keywords Small-size images · Operation detection · CNN · Xception

L. Yang · P. Yang · R. Ni (B) · Y. Zhao


Institute of Information Science, Beijing Jiaotong University, Beijing, China
e-mail: rrni@bjtu.edu.cn
Y. Zhao
e-mail: yzhao@bjtu.edu.cn
L. Yang · P. Yang · R. Ni · Y. Zhao
Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing
100044, China


38.1 Introduction

Since new editing operations are frequently developed and incorporated into editing software such as Photoshop and other popular image-editing apps, image manipulations such as median filtering and contrast enhancement are often applied without authorization. These operations alter the inherent statistics of original natural images without changing their content. The use of image operations changes the style and information of the image itself, seriously affecting people's judgment of the truth. In this context, image manipulation detection is proposed to verify the authenticity of an image and to determine its processing history by means of analysis, making it an indispensable part of multimedia forensics.
In the early stage, most forensic algorithms were designed to detect a single targeted manipulation; thus, only a binary classification was considered [1]. The inherent statistics of the original image change with the type of image operation, so most forensic methods are mainly realized by detecting changes of certain inherent statistical attributes of the original image. Forensic methods based on the above considerations have a significant drawback: they usually lead to misleading results if an irrelevant classifier is used. Hence, forensic algorithms need to detect various image manipulations while maintaining high accuracy.
To address these issues, Li et al. [2] found that the powerful steganalysis features called the Spatial Rich Model (SRM) [3] can be used to simultaneously identify multiple image operations, distinguishing 11 typical image processing operations. However, these traditional methods rely on difficult and time-consuming human analysis to design forensic detection features.
This issue was quickly addressed by using CNNs, which can learn features from images and perform classification automatically. However, forensic tasks are different from traditional computer vision tasks: classification tasks tend to extract features from the image content, while forensic tasks tend to extract the traces left by image operations, which have nothing to do with the image content. Therefore, traditional convolutional neural networks are not directly applicable to image forensics problems. In order to solve this problem, a preprocessing layer is usually added before the neural network. Bayar et al. restrained the content of the images with a constrained convolution layer and then classified the images with a Constrained CNN [4]. While CNNs provide a way toward automatically learning the traces of image processing operations, most of the existing methods are no longer effective for small-size or highly compressed images. Recently, Tang et al. proposed a Magnified CNN to detect six image operations, especially for small-size images [5]. But for some operations, Tang's method is not very satisfactory.
In this paper, we continue to aim at detecting operations on small-size images and are motivated to use a magnified layer as the preprocessing layer. Compared with the current state of the art of image forensic methods, this paper contains the following differences and new insights.

On the one hand, the nearest neighbor interpolation algorithm is adopted as the magnification method to enlarge the difference between pictures subjected to various operations; it enlarges the difference between different types of images and preserves the properties of image operations better than other magnification tools.
On the other hand, with the rapid development of deep learning, many classical
network structures have emerged [6, 7]. In order to improve the classification perfor-
mance of the network, we compared some typical frameworks such as Xception [8],
Densenet-121 [9], Resnet-50 [10], and Resnext-50 [11]. Based on extensive exper-
iments and analysis, Xception performed best in our comprehensive experimental
settings.
Xception is based on the depthwise separable convolution module. At the same time, the network also uses residual connections [10] to reduce information loss. For the last pooling layer, we replace the global average pooling with an adaptive average pooling function so that the method applies to any size of input pictures. The results show that our proposed network can achieve 97.71% accuracy on six different tampering operations when the images are of size 64 × 64.
This paper is organized as follows: In Sect. 38.2, we present an overview of the
proposed architecture while Sect. 38.3 shows the results and performance compari-
son. Finally, Sect. 38.4 concludes our work.

38.2 The Proposed Architecture

Most image operations are carried out in local areas. Thus, it is difficult to locate
the operation position directly in a large image. To solve this issue, researchers can
detect the large image block by block to locate the operation position in the actual
processing process. So the smaller the size of the detected block, the higher the final
positioning accuracy will be. Here, we proposed a general method to improve the
accuracy of the detection especially on small-size images.

38.2.1 The Framework of the Proposed CNN Model

CNN model can automatically extract features, iterate, and update parameters at
the same time. Therefore, it has been more and more popular in forensic methods.
Convolution neural networks usually contain convolution layer, pooling layer, and
classification layer. Convolution layer mainly completes feature extraction, which
includes capturing local dependencies between adjacent pixels and outputting the feature map. The pooling layer can reduce the dimensionality of the features by fusing the features extracted from the convolution layer to obtain global information.
Xception uses global average pooling to replace the traditional fully connected layers,
and the resulting vector is fed directly into the softmax of the classification layer. The

Fig. 38.1 The framework of the proposed CNN model

training process of a CNN is accomplished by an iterative algorithm which alternately propagates the data forward and backward. The weights are updated in each iteration by the backpropagation algorithm.
As the size of the image decreases, the information left by various operations decreases as well, so we assume that we can enlarge the difference between pictures by adding a preprocessing layer before the CNN without changing the nature of the image itself. In fact, new gray values may be introduced by some magnification methods, which can destroy the traces left by image operations and affect the detection accuracy. To avoid this issue, we choose nearest neighbor interpolation to enlarge the difference between pictures: this tool only duplicates the nearest neighbor's pixel value rather than introducing new pixel values. Earlier work carried out a series of experiments proving that the best magnification factor is two [12]; therefore, in this paper, the scale of the nearest neighbor interpolation is set to two. We input the magnified image into the CNN, whose main frame is the excellent classification model Xception. Xception consists of 36 convolution layers, which can be regarded as a linear stack of depthwise separable convolution layers with residual connections [10]. Figure 38.1 shows the framework of the proposed CNN model.
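As an illustration of this framework, the magnified layer can be written as a thin PyTorch wrapper around any backbone classifier; this sketch is ours and is not the authors' released code (the class and argument names are assumptions).

```python
import torch
import torch.nn as nn

class MagnifiedClassifier(nn.Module):
    """Nearest-neighbor x2 magnified layer followed by a CNN backbone (e.g., Xception)."""

    def __init__(self, backbone: nn.Module):
        super().__init__()
        # Nearest-neighbor interpolation only duplicates existing pixel values,
        # so no new gray values are introduced by the magnification step.
        self.magnify = nn.Upsample(scale_factor=2, mode="nearest")
        self.backbone = backbone

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.backbone(self.magnify(x))
```

A 64 × 64 grayscale block therefore enters the backbone as a 128 × 128 input.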

38.2.2 The Architecture of the Xception

Xception is a deep convolutional neural network structure inspired by Inception, in which the Inception module has been replaced by the depthwise separable convolution module (Fig. 38.2). The depthwise separable convolution module can be divided into two parts: depthwise

Fig. 38.2 The depthwise separable convolution module: the input is split into channels, each channel is filtered by a 3 × 3 convolution, and the outputs are combined by a 1 × 1 pointwise convolution
convolution and pointwise convolution. A spatial 3 × 3 convolution is performed over each channel of an input feature map, followed by a point-by-point 1 × 1 convolution that projects the channel outputs of the depthwise convolution onto a new channel space [8]. Depthwise separable convolution is different from traditional convolution: it separates cross-channel correlations and spatial correlations to reduce the connection between them. This strategy makes full use of computational power.
It may achieve high classification accuracy even when the size of input images is
small. In addition, Xception also uses residual connections [10]. He et al. found
that accuracy gradually saturated and then decreased rapidly as the depth of the
network increased. However, this degradation was not caused by over-fitting, and
adding more layers to the appropriate depth model led to higher training errors,
so they proposed residual connections to solve this degradation phenomenon. The
introduction of residual connections can improve the model accuracy by enhancing
the feedforward propagation signal and the backward gradient signal. In the CNN
network, 36 convolution layers were used to extract features. They were merged into
14 modules, all of which were connected by linear residual connections except the
first and last modules. To avoid over-fitting, Xception uses global average pooling to replace the traditional fully connected layer, and the resulting vector is fed directly into the softmax of the classification layer. We replace this global average pooling with an adaptive average pooling function: regardless of the size of the feature map output by the last convolution layer, it fixes the output size to 1 × 1, which adapts to any size of input pictures.
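A minimal PyTorch sketch of the two ingredients described above, a depthwise separable convolution block and the adaptive average pooling, is given below; it is an illustration only, not the full 36-layer Xception.

```python
import torch.nn as nn

class SeparableConv2d(nn.Module):
    """Depthwise 3x3 convolution per channel followed by a 1x1 pointwise convolution."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1,
                                   groups=in_ch, bias=False)   # spatial correlations, per channel
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)  # cross-channel mixing
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))

# Adaptive average pooling fixes the spatial output to 1 x 1 for any input size.
adaptive_pool = nn.AdaptiveAvgPool2d(1)
```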

38.3 Experiment

Our database consisted of 13,800 images. These images were mainly taken from three widely used image databases: the BOSSbase 1.01 [13], the UCID database [14], and the NRCS Photo Gallery database [15, 16]. The BOSSbase database contributes 10,000 images, the UCID database and the NRCS Photo Gallery database contribute 1338 images each, and [16] contains 1124 natural images.

Table 38.1 Editing parameters used to create our database

Editing operation        Parameter
Median filtering         Kernel size = 5 × 5
Mean filtering           Kernel size = 5 × 5
Gaussian filtering       Kernel size = 5 × 5, σ = 0.8
Resampling               Bilinear interpolation, scaling = 2
Contrast enhancement     Contrast limits [0.2, 0.8] (Matlab 2016b's imadjust function)
JPEG compression         QF = 70

Before any further processing, we converted the images to grayscale. We test our proposed method as a multiclass detector with the six types of image processing operations shown in Table 38.1.
Then, the image blocks were cropped from the center of a full-resolution image
with size 32 × 32 and 64 × 64, respectively. We randomly selected three out of five
images as the training set, one-fifth of images as the validation, and the rest as the
testing set. The image data were processed into grayscale images, and then amplified
by the magnification layer. Finally, the images were input to network. The proposed
CNN model was implemented by using Pytorch. All the experiments were done with
two GPU card of type GeForce GTX Titan X manufactured by Nvidia. The training
parameters of the stochastic gradient descent were set as follows: momentum = 0.9,
decay = 0.0005, the learning rate was initialized to 0.1 and multiplied by 0.1 for every
30 epochs. As the training time increases, the learning rate decreases gradually; the step length is reduced so that the parameters oscillate only slightly in a small range around the minimum and approach it continuously.
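For reference, the reported training parameters correspond to the following PyTorch setup; this is a sketch under the assumption that `model` is the classification network, and the function name is ours.

```python
import torch

def build_training_setup(model: torch.nn.Module):
    # SGD with momentum = 0.9, weight decay = 0.0005, initial learning rate 0.1
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                momentum=0.9, weight_decay=0.0005)
    # Multiply the learning rate by 0.1 every 30 epochs
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
    return optimizer, scheduler
```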
In each experiment, we trained each CNN for 76 epochs, where an epoch is the
total number of iterations needed to pass through all the data samples in the training
set. Additionally, while training our CNNs, the testing accuracy on a separate testing
dataset was recorded every epoch to produce the tables and figures in this section. The accuracies reported in the tables are the maximum accuracies achieved on the test dataset.

38.3.1 Multiple Operation Detection

In our experiments, we evaluate our proposed strategy for general image operation detection, considering the six types of image operations listed in Table 38.1. A total of 96,600 image blocks are used for training, validation, and testing. We use 64 × 64 and 32 × 32 images as input to study the classification accuracy of our proposed method. Besides, two state-of-the-art works, Constrained CNN [4] and Magnified CNN [5], are included for comparative studies (Tables 38.2, 38.3 and 38.4).

Table 38.2 Confusion matrix about the detection accuracy of our method with magnified layer;
the size of testing image size is 64 × 64 (%)
A/P CE GF5 JPEG70 MeaF5 MF5 ORG RES2
CE 89.57 0 0.04 0 0.07 10.07 0.25
GF5 0.04 99.64 0 0.22 0 0.07 0.04
JPEG70 0.07 0 99.86 0 0 0 0.07
MeaF5 0.04 0.11 0 99.67 0.14 0 0.04
MF5 0.07 0.18 0 0.54 99.17 0 0.04
ORG 3.70 0.07 0 0.04 0.11 96.05 0.04
RES2 0 0 0 0 0 0 100

Table 38.3 Confusion matrix about the detection accuracy of our method with magnified layer;
the size of testing image is 32 × 32 (%)
A/P CE GF5 JPEG70 MeaF5 MF5 ORG RES2
CE 86.56 0 0 0 0.43 12.17 0.83
GF5 0 98.91 0.04 0.69 0.22 0.11 0.04
JPEG70 0.07 0 99.89 0 0 0 0.04
MeaF5 0 0.36 0.04 98.84 0.69 0 0.07
MF5 0.25 0.65 0.04 1.49 97.17 0.36 0.04
ORG 10.29 0.11 0 0.07 1.05 88.44 0.04
RES2 0 0 0 0 0 0 100

Table 38.4 The detection average accuracy of our method, Magnified CNN, and Constrained CNN
Image size   Proposed network (magnified)   Proposed network (without magnified)   Magnified CNN   Constrained CNN
64 × 64      97.71                          95.42                                  95.91           91.69
32 × 32      95.69                          93.78                                  93.77           –

Here “MeaF5”, “CE”, “GF5”, “JPEG70”, “MF5”, and “RES2” denote mean filter-
ing, contrast enhancement, Gaussian filtering, JPEG compression, median filtering,
and up-sampling, respectively.

38.3.2 Comparing with Other CNN Network

In this section, to verify the feasibility of choosing Xception for feature extraction and classification, we compared our proposed method with Resnext-50, Resnet-50, and Densenet-121 for input images of size 64 × 64 and 32 × 32. While CNN-based methods are able to learn features directly from data for classification tasks, they are not well suited for forensic problems in their original form.

Table 38.5 The detection average accuracy of our method, Tang, resnext-50, resnet-50, and densenet-121 with magnified layer; the size of testing image is 64 × 64 (%) and 32 × 32 (%)

Network    Our method   Densenet121   Resnext50   Resnet50
64 × 64    97.71        97.01         96.88       96.61
32 × 32    95.69        93.81         93.79       93.11

To be fair, all networks include the magnified layer. The average detection accuracy is presented in Table 38.5, and our proposed strategy significantly outperforms these traditional networks in terms of effectiveness.

38.4 Conclusion

In this paper, we proposed a novel CNN-based approach to multi-operation image detection by combining the excellent Xception network with the magnified layer. The magnified layer enlarges the difference between pictures and preserves the original information of images after different operations. Unlike existing approaches that perform binary classification or rely on hand-designed features, our proposed CNN is able to learn image manipulation detection features directly from data and improves the detection accuracy for small-size pictures after different operations. The results of these experiments showed that our CNN could be trained to accurately detect multiple types of manipulations. To further assess the performance of our CNN, we compared it to some current state-of-the-art detectors using six different image manipulations and showed that our proposed CNN architecture can outperform these approaches. Additionally, to verify the feasibility of choosing Xception for feature extraction and classification, we compared it to some state-of-the-art traditional convolutional neural networks. These experiments also show that our network still has stable classification ability on smaller images.

Acknowledgements This work was supported in part by the National Key Research and Devel-
opment of China (2018YFC0807306), National NSF of China (61672090, 61532005), and Funda-
mental Research Funds for the Central Universities (2018JBZ001).

References

1. Stamm, M.C., Wu, M., Liu, K.J.R.: Information forensics: an overview of the first decade.
IEEE Access 1, 167–200 (2013)
2. Li, H., Luo, W., Qiu, X., Huang, J.: Identification of various image operations using residual-
based features. IEEE Trans. Circuits Syst. Video Technol. 1–1 (2016)
3. Fridrich, J., Kodovsky, J.: Rich models for steganalysis of digital images. IEEE Trans. Inf.
Forensics Secur. 7(3), 868–882 (2011)

4. Bayar, B., Stamm, M.C.: Constrained convolutional neural networks: a new approach towards
general purpose image manipulation detection. IEEE Trans. Inf. Forensics Secur. 1–1 (2018)
5. Tang, H., Ni, R., Zhao, Y., Li, X.: Detection of various image operations based on CNN. In: Asia-
Pacific Signal and Information Processing Association Summit and Conference, pp. 1479–1485
(2017)
6. Chen, X., Peng, X., Li, J., Peng, Y.: Overview of deep kernel learning based techniques and
applications. J. Netw. Intell. 1(3), 83–98 (2016)
7. Xia, Y., Hu, R.: Fuzzy neural network based energy efficiencies control in the heating energy
supply system responding to the changes of user demands. J. Netw. Intell. 2(2), 186–194 (2017)
8. Chollet, F.: Xception: Deep Learning with Depthwise Separable Convolutions, pp. 1800–1807
(2016)
9. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional
networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recogni-
tion, pp. 4700–4708 (2017)
10. He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition (2015)
11. Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep
neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, pp. 1492–1500 (2017)
12. Tang, H., Ni, R., Zhao, Y., Li, X.: Median filtering detection of small-size image based on
CNN. J. Vis. Commun. Image Represent. 51, 162–168 (2018)
13. Bas, P., Filler, T., Pevný, T.: “Break our steganographic system”: the ins and outs of organiz-
ing BOSS. In: International Workshop on Information Hiding, pp. 59–70. Springer, Berlin,
Heidelberg (2011)
14. Schaefer, G., Stich, M.: UCID: an uncompressed color image database. In: Storage and Retrieval
Methods and Applications for Multimedia, vol. 5307, pp. 472–481. International Society for
Optics and Photonics (2004)
15. http://photogallery.nrcs.usda.gov
16. Luo, W., Huang, J., Qiu, G.: JPEG error analysis and its applications to digital image forensics.
IEEE Trans. Inf. Forensics Secur. 5(3), 480–491 (2010)
Chapter 39
Depth Information Estimation-Based
DIBR 3D Image Hashing Using SIFT
Feature Points

Chen Cui and Shen Wang

Abstract Image hashing has been widely used for traditional 2D image authentication, content-based identification, and retrieval. Unlike the traditional 2D image system, in the DIBR process a virtual image pair is generated from the center image according to the corresponding depth image. In one of the communication models for the DIBR 3D image system, the content consumer side only receives the virtual images without performing the DIBR operation. In this way, only a variety of copies of virtual image pairs can be distributed. This paper designs a novel DIBR 3D image hashing scheme based on depth information estimation using local feature points: the matched feature points in a virtual image pair are detected and divided into different groups according to the estimated depth information to generate the hash vector. As the experiments show, the proposed DIBR 3D image hashing is robust against most of the content-preserving operations.

Keywords Depth image-based rendering (DIBR) · DIBR 3D image hashing ·


DIBR 3D image identification · Depth information estimation

C. Cui
School of Information Science and Technology, Heilongjiang University,
Harbin, Heilongjiang, China
e-mail: 2018012@hlju.edu.cn
S. Wang (B)
School of Computer Science and Technology, Harbin Institute of Technology, Harbin,
Heilongjiang, China
e-mail: shen.wang@hit.edu.cn


39.1 Introduction

Depth image-based rendering (DIBR) is a convenient and practical 3D representation technology [1]. It is very easy to transmit and store a DIBR 3D image (consisting of the center image and the depth image) because the depth image is a grayscale image with limited file size. Moreover, the 3D video effects can be easily presented with negligible additional information. With the rapid development of DIBR, a variety of problems concerning the protection of 3D products arise. The issues of illegal access and unauthorized distribution known from traditional 2D digital images will also restrict the development of DIBR 3D images. In the common DIBR 3D image communication model, the receiver side performs the DIBR operation to generate the virtual left and right images. During the transmission, both the center image and the virtual images are disturbed by channel noise, and some illegal redistributions may be performed on the content consumer side. Hence, there may exist a variety of distributed copies of the center image or the virtual image pair which are different from the original center image but have the same perceptual content. Thus, we need to propose a new hashing scheme protecting DIBR 3D images.
For traditional 2D images, the conventional cryptography has been utilized for
authentication [2, 3]. Moreover, robust image hashing has been extensively employed
for content-based identification. Generally, image hashing consists of two main
aspects: feature extraction and feature compression. The robustness and discrimi-
nation performance of image hashing is directly affected by feature extraction, many
approaches focus on finding robust feature to make the image hashing resistant to
standard degradation processing and malicious attacks, such as the transform domain
features-based hashing [4, 5]. In addition, some matrix analysis approaches have also
been employed to extract the perceptual features for hash generation, such as singu-
lar value decomposition (SVD) [6] and nonnegative matrix factorization (NMF) [7].
Geometric-invariant features have been exploited to design robust image hashing, and salient points are the most commonly used features to deal with geometric distortion attacks [8]. Lv and Wang designed a shape-contexts-based image hashing using local feature points to resist geometric distortion attacks such as rotation [9]. In [10], a
robust perceptual image hashing based on ring partition and invariant vector distance
was proposed. As the experimental results show, the method is robust to rotation
with good discriminative capability.
In the traditional 2D image hashing scheme, perceptually insignificant distortions
and most of the common digital operations would not lead to viewpoint changes.
That means the center of the original image is consistent with the center of its copies. In fact, virtual images are generated from the center image with the corresponding depth information in the DIBR system. Although the horizontal pixel movement makes the virtual images look different from the center image, the DIBR process can be seen as a partial translation along the horizontal plane, and this kind of operation can be considered a content-preserving manipulation. Hence, the
virtual image pair and their copies should be identified with the same content as the
corresponding original center image.
39 Depth Information Estimation-Based DIBR 3D Image Hashing … 373

Meanwhile, the other communication models of the DIBR 3D image system should also be considered. Here, the content consumer side directly receives the virtual images without performing the DIBR operation; in this way, only a variety of copies of virtual image pairs would be distributed.
In this work, we propose a novel hashing scheme for DIBR 3D images. The hash vector is generated from the virtual image pair of the corresponding center image instead of generating a hash from the center image directly. The SIFT algorithm [11] is utilized to detect and select the matched feature points of the virtual image pair. Dividing
these feature points into different groups, the image hash is calculated with their
feature descriptors. The proposed hashing has good robustness performance against
common content-preserving operations with high classification accuracy. The rest
of this paper is organized as follows. We first introduce the background of DIBR in
Sect. 39.2. Then the depth information estimation-based hashing method is given in
Sect. 39.3. Section 39.4 presents the experimental results.

39.2 Background

DIBR is a process generating the virtual images from the center image according to
its corresponding depth image [12]. As shown in Fig. 39.1, P represents a pixel in
the center image, Z is the depth value of P, f represents the focal length of the center
viewpoint, Cl and Cr are the left viewpoint and the right viewpoint, respectively. The
value of the baseline distance tx is consistent with the distance between the left and
right viewpoints. Formula 39.1 shows the geometric relationships of generating the
virtual image pair in the DIBR process.

xl = x c + tx f
2 Z
,

xr = x c − tx f
2 Z
, (39.1)

d = xl − xr = tx Zf

where x_l, x_c, and x_r represent the x-coordinates of the pixel in the left virtual image, center image, and right virtual image, respectively; d represents the disparity between the left and right virtual images, and the value of f is set to 1 without loss of generality.
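The geometric relationship of Eq. (39.1) can be restated as a one-line computation; the function and argument names below are illustrative only.

```python
def dibr_shift(x_c: float, depth_z: float, baseline_tx: float, focal_f: float = 1.0):
    """Left/right x-coordinates and disparity of a pixel according to Eq. (39.1)."""
    x_l = x_c + 0.5 * baseline_tx * focal_f / depth_z
    x_r = x_c - 0.5 * baseline_tx * focal_f / depth_z
    d = x_l - x_r  # equals baseline_tx * focal_f / depth_z
    return x_l, x_r, d
```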

39.3 Proposed Image Hashing Scheme

Fig. 39.1 The relationship of pixel in left image, center image, and right image

The proposed image hashing scheme consists of three steps. In the first step, a virtual image pair is generated from the original center image with a fixed baseline distance. In the second step, the matched feature points extracted from the virtual image pair are divided into different groups according to the estimated depth information; from formula (39.1), the depth value can be computed as

Z = t_x f / d   (39.2)
In the third step, the descriptors of matched feature points in different groups are
utilized to generate the final hash vector. The proposed image hashing scheme will
be illustrated in the following subsections.

39.3.1 Feature Points Grouping

As shown in Fig. 39.2, the matched feature point pairs are divided into L groups according to their estimated depth information. Let P represent the set of feature point pairs in different groups as

P = {p_1, p_2, …, p_L}   (39.3)

where p_i represents the i-th group of feature point pairs.

39.3.2 Image Hash Generation

Fig. 39.2 Feature points grouping with depth information

Suppose P_l = {p_k(x, y)}_{k=1}^N and P_r = {p_k(x, y)}_{k=1}^N represent the sets of matched feature points extracted from the left and right virtual images, and D_l = {d_pk(x, y)}_{k=1}^N and D_r = {d_pk(x, y)}_{k=1}^N represent their corresponding local descriptors. The steps of generating the image hash from a virtual image pair are as follows:
– Step 1: Suppose p_i and p_j are matched feature points in the left and right virtual images, respectively, and (x_pi, y_pi) and (x_pj, y_pj) are their coordinates. The disparity d can be computed as

d = sqrt((x_pi − x_pj)^2 + (y_pi − y_pj)^2)   (39.4)

After computing the disparity, these matched feature point pairs can be divided
into L groups as

b(k) = {p_i ∈ P_l, p_j ∈ P_r : d_min + (k − 1)·l ≤ d ≤ d_min + (k + L_2)·l}   (39.5)

where l = (d_max − d_min)/L_1 and L = L_1 − L_2; d_max and d_min represent the maximum and minimum disparity, respectively.
– Step 2: Pseudorandom weights {a_k}_{k=1}^L from the normal distribution N(u, σ^2) are generated with a secret key to ensure the security of the proposed image hashing. The vector length of each a_k is 128, consistent with the dimension of the feature descriptor.
– Step 3: The image hash vector H = {h_k}_{k=1}^L is generated by computing each component h_k as

h_k = Σ_{p_i, p_j ∈ b(k)} (⟨a_k, d_pi⟩ + ⟨a_k, d_pj⟩)   (39.6)

⟨a_k, d_pi⟩ = (1/128) Σ_{m=1}^{128} a_k(m) d_pi(m)   (39.7)

⟨a_k, d_pj⟩ = (1/128) Σ_{m=1}^{128} a_k(m) d_pj(m)   (39.8)
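The three steps above can be sketched with OpenCV's SIFT implementation and NumPy. This is an illustrative outline only, not the authors' code: the ratio-test matching, the parameter names (L1, L2, seed), and the helper name are our assumptions.

```python
import cv2
import numpy as np

def dibr_image_hash(img_left, img_right, L1=20, L2=4, seed=0):
    sift = cv2.SIFT_create()
    kp_l, des_l = sift.detectAndCompute(img_left, None)
    kp_r, des_r = sift.detectAndCompute(img_right, None)

    # Match descriptors between the two virtual images and keep good pairs (Lowe's ratio test).
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_l, des_r, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    # Disparity of each matched pair (Eq. 39.4).
    pairs = [(kp_l[m.queryIdx].pt, kp_r[m.trainIdx].pt, m.queryIdx, m.trainIdx) for m in good]
    disp = np.array([np.hypot(pl[0] - pr[0], pl[1] - pr[1]) for pl, pr, _, _ in pairs])
    d_min, d_max = disp.min(), disp.max()
    l = (d_max - d_min) / L1
    L = L1 - L2

    # Key-dependent pseudorandom weights, one 128-dim vector per group.
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(L, 128))

    # Group the pairs by disparity (Eq. 39.5) and project descriptors (Eqs. 39.6-39.8).
    h = np.zeros(L)
    for k in range(1, L + 1):
        lo, hi = d_min + (k - 1) * l, d_min + (k + L2) * l
        for (pl, pr, qi, ti), dk in zip(pairs, disp):
            if lo <= dk <= hi:
                h[k - 1] += (a[k - 1] @ des_l[qi] + a[k - 1] @ des_r[ti]) / 128.0
    return h
```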

39.3.3 Image Identification

Let I^c = {I^c_i, 1 ≤ i ≤ S}, I^l = {I^l_i, 1 ≤ i ≤ S}, and I^r = {I^r_i, 1 ≤ i ≤ S} be the sets of original center images and of the generated left and right virtual images, respectively. Then we generate the compact hash H(I^c_i) from the virtual image pair of the corresponding center image, where H(I^c_i) = (h_1, h_2, …, h_L) is the hash vector of length L for the center image I^c_i.
In order to measure the similarity between two hash vectors H(I_1) and H(I_2), the Euclidean distance is applied as the performance metric. Suppose I^l_Q and I^r_Q are the pair of query virtual images. After extracting the matched local feature points from the virtual images, the image hash H(I^c_Q) is calculated with the descriptors of the grouped feature points. Then, we calculate the distance between H(I^c_Q) and H(I^c_i) for each original image in the database, and the query virtual image pair is identified as the i-th original image as

i = argmin_i { D(H(I^c_Q), H(I^c_i)) }   (39.9)

where D(H(I^c_Q), H(I^c_i)) is the Euclidean distance between H(I^c_Q) and H(I^c_i).
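The identification rule of Eq. (39.9) amounts to a nearest-neighbor search over hash vectors; a small NumPy sketch (the names are ours) follows.

```python
import numpy as np

def identify(query_hash: np.ndarray, database_hashes: np.ndarray) -> int:
    """Index i of the original center image with the smallest Euclidean hash distance (Eq. 39.9)."""
    distances = np.linalg.norm(database_hashes - query_hash, axis=1)
    return int(np.argmin(distances))
```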

39.4 Experimental Results

In this section, the perceptual robustness of the proposed hashing against content-preserving manipulations is evaluated. Perceptual robustness is an important property for content-based image identification and retrieval: when an image is attacked by content-preserving operations, the image hash should not change much. To construct a database with 1278 images (639 pairs of virtual images) for evaluating the performance of the proposed DIBR 3D image hashing, 9 pairs of center and depth images with various resolutions from 447 × 370 to 1024 × 768 are selected from the Middlebury Stereo Datasets [13] and the Microsoft Research 3D Video Datasets [14]. To generate 71 distorted versions, the virtual image pairs are attacked by 8 classes of content-preserving operations, including additive noise, blurring, JPEG compression, and gamma correction. The content-preserving operations and their parameter settings are shown in Table 39.1.
In order to evaluate the classification performance of the proposed DIBR 3D image hashing, the hashes for all of the center images are generated with their corresponding virtual image pairs, and then the similarity between the attacked virtual image pairs and the original center images is measured by calculating the Euclidean distances of the hash vectors. According to these distances, we decide which original center image the attacked

Table 39.1 Content-preserving operations and the parameters setting


Manipulation Parameters setting Copies
Additive noise
Gaussian noise variance ∈ (0.0005−0.005) 10
Salt & Pepper noise variance ∈ (0.001−0.01) 10
Speckle noise variance ∈ (0.001−0.01) 10
Blurring
Gaussian blurring Filter size: 3 σ ∈ (0.5−5) 10
Circular blurring radius ∈ (0.2−2) 10
Motion blurring len = 1, 2, 3 θ = 0◦ , 45◦ , 90◦ 9
JPEG compression QF ∈ (10−100) 10
Gamma correction γ = 0.7, 1.3 2

Table 39.2 Identification accuracy performances under different attacks


Manipulation Identification accuracy (%)
Additive noise
Gaussian noise 100
Salt & Pepper noise 100
Speckle noise 100
Blurring
Gaussian blurring 90.00
Circular blurring 93.33
Motion blurring 98.77
JPEG compression 100
Gamma correction 100

virtual images belong to, then the identification accuracy is finally calculated as
shown in Table 39.2.
Ideally, we hope that the virtual image pairs attacked by different kinds of content-
preserving operations should still be correctly classified into the corresponding orig-
inal center image, and two distinct pairs of virtual images should have different hash values. As the experiments show, the proposed DIBR 3D image hashing is robust
against common signal distortion attacks such as JPEG compression, noise addition,
and gamma correction.

39.5 Conclusion

In this paper, a novel DIBR 3D image hashing scheme has been proposed. The image
hash is generated with virtual image pair of the corresponding center image instead
of generating a hash from the center image directly. First, we use the SIFT algorithm
to extract and select matched feature points of virtual image pair. Dividing these
feature points into different groups according to their depth information estimated,
the image hash is generated with the feature descriptors. As the experiments show, our DIBR 3D image hashing is robust against most signal distortion attacks, such as noise addition, JPEG compression, and so on. However, the proposed hashing still
has limitations when considering about geometric distortions, such as rotation. The
future works mainly focus on improving the robustness against geometric distortion
attacks and localizing the tampered contents in images.

Acknowledgements This work is supported by the National Natural Science Foundation of China
(Grant Number: 61702224).

References

1. Fehn, C.: Depth-image-based rendering (DIBR) compression and transmission for a new
approach on 3D-TV. In: Proceedings of the SPIE Stereoscopic Displays and Virtual Reality
Systems XI, pp. 93–104 (2004)
2. Chen, C.M., Xu, L.L, Wu, T.S., Li, C.R: On the security of a chaotic maps-based three-party
authenticated key agreement protocol. J. Netw. Intell. 1(2), 61–66 (2016)
3. Chen, C.M., Huang, Y.Y., Wang, Y.K., Wu, T.S.: Improvement of a mutual authentication
protocol with anonymity for roaming service in wireless communications. Data Sci. Pattern
Recognit. 2(1), 15–24 (2018)
4. Ahmed, F., Siyal, M.Y., Abbas, V.U.: A secure and robust hash-based scheme for image authen-
tication. Signal Process. 90(5), 1456–1470 (2010)
5. Monga, V., Evans, B.L.: Perceptual image hashing via feature points: performance evaluation
and tradeoffs. IEEE Trans. Image Process. 15(11), 3452–3465 (2006)
6. Kozat, S., Venkatesan, R., Mihcak, M.: Robust perceptual image hashing via matrix invariants.
In: 2004 International Conference on Image Processing, pp. 3443–3446. IEEE, Singapore,
Singapore (2004)
7. Monga, V., Mhcak, M.K.: Robust and secure image hashing via non-negative matrix factoriza-
tions. IEEE Trans. Inf. Forensics Secur. 2(3), 376–390 (2007)
8. Roy, S., Sun, Q.: Robust hash for detecting and localizing image tampering. In: 2007 IEEE
International Conference on Image Processing, pp. 117–120. IEEE, San Antonio, TX, USA
(2007)
9. Lv, X., Wang, Z.J.: Perceptual image hashing based on shape contexts and local feature points.
IEEE Trans. Inf. Forensics Secur. 7(3), 1081–1093 (2012)
10. Tang, Z.J., Zhang, X.Q., Li, X.X., Chao, S.C.: Robust image hashing with ring partition and
invariant vector distance. IEEE Trans. Inf. Forensics Secur. 11(1), 200–214 (2016)
11. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis.
60(2), 91–110 (2004)
12. Zhang, L., Tam, W.: Stereoscopic image generation based on depth images for 3d TV. IEEE
Trans. Broadcast. 51(2), 191–199 (2015)

13. Scharstein, D., Pal, C.: Learning conditional random fields for stereo. In: 2007 IEEE Conference
on Computer Vision and Pattern Recognition, pp. 1–8. IEEE, Minneapolis, MN, USA (2007)
14. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error
visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2007)
Chapter 40
Improved Parity-Based Error Estimation
Scheme in Quantum Key Distribution

Haokun Mao and Qiong Li

Abstract Quantum Key Distribution (QKD) is a promising technique to distribute unconditionally secure keys to two remote parties. In order to improve the final secure key rate of a QKD system, the Quantum Bit Error Rate (QBER) needs to be estimated as accurately as possible with minimum information leakage. In this paper, an improved parity-based error estimation scheme is proposed. The core part of the scheme is the proposed optimal block length calculation method. Simulation results show that the proposed scheme improves the accuracy of QBER estimation with less information leakage.

Keywords Quantum key distribution · Error estimation · Parity

40.1 Introduction

Quantum Key Distribution (QKD) aims to generate and distribute unconditionally secure keys for the two legitimate parties [1]. Unlike conventional cryptography [2, 3], the security of QKD is based on the laws of quantum physics, providing a theoretical guarantee that the secure keys are unknown to any third party with a high and quantifiable probability [4]. Generally, a practical QKD system consists of two main parts [5]: the quantum part and the classical post-processing part. In the former part, raw keys are obtained by transmitting and detecting quantum signals. However, the raw keys of the two parties are only weakly correlated and partially secure due to the imperfect control of devices, disturbance from the external environment, or even the presence of Eve. Hence, the classical post-processing part is applied to correct the errors and remove the information leakage about the raw key pairs [6].
In this paper, we focus on the error estimation stage of the post-processing for
the following reasons. First, the estimated QBER is an important security parameter

H. Mao · Q. Li (B)
Information Countermeasure Technique Institute, School of Computer Science
and Technology, Harbin Institute of Technology, Harbin 150080, China
e-mail: qiongli@hit.edu.cn


for a QKD system. Once the estimated QBER is beyond the given threshold, there
may exist an attacker, so-called Eve. Second, it can be predicted that few secure
keys will be obtained when the estimated QBER is too high. In such a case, unnec-
essary subsequent processing steps can be avoided. Third, error estimation affects
the performance of error correction which is often called reconciliation in QKD. For
instance, in Cascade reconciliation [7], the knowledge of QBER is helpful to set an
optimal block length that decreases the amount of information leakage. In LDPC rec-
onciliation [8, 9], an appropriate matrix and other optimum parameters can be chosen
with the help of estimated QBER, improving the efficiency and convergence speed of
reconciliation. Although blind reconciliation [10–12] and reconciliation-based error
estimation [13] have been proposed, the traditional error estimation before reconcil-
iation is still an essential stage. That is because the protocols above are more suitable
for stable QKD systems. However, the QBER of a practical QKD system might vary
significantly between two consecutive frames. In that case, the protocols without
prior error estimation are not effective.
The optimization target of error estimation is to improve the accuracy of QBER
estimation with minimum information leakage. In order to realize the target, some
improved methods have been proposed. An improved random sampling method was
proposed to improve the performance of QKD systems [14]. The connection between
sampling rate and erroneous judgment probability was analyzed first. Then the cal-
culating method of the optimal sampling rate was presented to maximize the final
secure key rate. The issue of how the sampling rate affected the final secure key
rate in a decoy state QKD was fully discussed. However, limited by the inherent
capacity of random sampling, the performance is not good enough. In order to fur-
ther improve error estimation performance, a Parity Comparison Method (PCM) was
proposed [15]. The parities of blocks were analyzed to estimate QBER instead of ran-
dom sampling. Simulation results showed that PCM outperformed random sampling
in most realistic scenarios. However, the calculating method of the optimal block
length, which is the key parameter of PCM, was insufficiently studied. In addition,
all blocks are sampled for error estimation, leaking too much information.
An improved parity-based error estimation scheme is proposed in this research.
The main contributions of our work are as follows. The optimal block length is
obtained through theoretical analysis. In addition, an effective error estimation
scheme is proposed. Simulation results show that the proposed scheme is able to
leak less information than random sampling with the same accuracy level.
The rest of the paper is organized as follows. The mathematical relationship among
parity error rate, QBER, and block length is described in Sect. 40.2. The theoretical
analysis of the optimal block length is presented in Sect. 40.3. The complete error
estimation scheme is detailed in Sect. 40.4. Finally, brief conclusions are provided
in Sect. 40.5.

40.2 Related Works

In this section, the calculation formula of the QBER in Discrete-Variable QKD (DV-
QKD) is derived. For a DV-QKD system, the quantum channel can be viewed as
a Binary Symmetric Channel (BSC) whose error probability is the QBER. Hence, the
probability of a specific number of errors can be calculated by using the binomial
distribution [15].
Let e_parity be the parity error rate, L be the block length, n be the number of errors in a block, E_odd be the set of odd n, and e be the QBER. It is obvious that an odd number of errors in a block will lead to a parity error. Then e_parity can be calculated using Eq. 40.1:

e_parity = Σ_{n ∈ E_odd} C_L^n e^n (1 − e)^(L−n)   (40.1)

Let x = 1 − e and y = e; then Eq. 40.4 can be obtained by combining Eqs. 40.2 and 40.3:

(x + y)^L = Σ_{n=0}^{L} C_L^n x^n y^(L−n)   (40.2)

(x − y)^L = Σ_{n=0}^{L} C_L^n x^n (−y)^(L−n)   (40.3)

e_parity = (1 − (1 − 2e)^L) / 2   (40.4)
The inverse function of Eq. 40.4 is presented in Eq. 40.5:

e = (1 − (1 − 2e_parity)^(1/L)) / 2   (40.5)
It is obvious that the QBER can be calculated using Eq. 40.5 with the statistical e_parity and the preset L. In particular, the QBER equals e_parity when L is 1, which indicates that random sampling is only a special case of PCM.
A rough performance analysis of PCM is given here. Let N be the data size and α be the sampling rate. The amounts of information leakage and of involved data are both Nα when random sampling is applied. However, in PCM, the amount of involved data is LNα when the information leakage is the same as that of random sampling. The increased amount of involved data benefits error estimation, while the error estimation accuracy within a block decreases with increasing L. Thus, there may exist an optimal L achieving the best overall performance of error estimation.
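The relationship between e_parity and the QBER, and the parity-based estimate itself, can be prototyped in a few lines. The sketch below is our own, with illustrative function names; it treats the sifted keys as NumPy 0/1 arrays and compares all consecutive blocks, whereas in the actual protocol only a sampled subset of blocks is disclosed.

```python
import numpy as np

def parity_error_rate(qber: float, L: int) -> float:
    # Eq. (40.4): probability that a length-L block contains an odd number of errors
    return (1.0 - (1.0 - 2.0 * qber) ** L) / 2.0

def qber_from_parity(e_parity: float, L: int) -> float:
    # Eq. (40.5): invert the observed parity error rate back to the QBER (valid for e_parity < 0.5)
    return (1.0 - (1.0 - 2.0 * e_parity) ** (1.0 / L)) / 2.0

def estimate_qber(key_a: np.ndarray, key_b: np.ndarray, L: int) -> float:
    # Compare the parities of consecutive length-L blocks of the two keys
    n_blocks = len(key_a) // L
    a = key_a[:n_blocks * L].reshape(n_blocks, L)
    b = key_b[:n_blocks * L].reshape(n_blocks, L)
    parity_mismatch = (a.sum(axis=1) % 2) != (b.sum(axis=1) % 2)
    return qber_from_parity(parity_mismatch.mean(), L)
```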

40.3 Optimal Block Length

In this section, the calculating method of the optimal block length is proposed. The
calculation formula of the optimal block length is given first through theoretical
analysis and then verified with simulations.

40.3.1 Theoretical Analysis

The accuracy of parity-based error estimation is mainly affected by the following


factors: the number of sampled blocks and the block length. Since only 1 bit is leaked
in a block, the number of sampled blocks is equal to the amount of information
leakage. Hence, if the amounts of information leakage for different block lengths are
the same, the effects of the former factor on the estimation accuracy are equal too.
Unlike the former factor, the latter one affects estimation accuracy in a different way.
The estimation accuracies of different block lengths with the same eparity fluctuations
differ. Let e = f (eparity ) and eparity = g(e), where f and g are inverse functions. Then
the effects of fluctuating e_parity on the estimation accuracy can be represented by f′(e). The smaller f′(e), in other words the larger g′(e_parity), means a more accurate error estimation. The function g′(e_parity) is obtained from Eq. 40.4 as:

g′(e_parity) = L(1 − 2e)^(L−1)   (40.6)

In order to find the maximal value of g′(e_parity), its derivative with respect to L is calculated in Eq. 40.7, where L is a positive integer and e ∈ [0%, 50%); note that the situation of a 50% error rate is not considered.

g′′(e_parity) = (1 − 2e)^(L−1) [1 + L ln(1 − 2e)]   (40.7)

Let g_1 = (1 − 2e)^(L−1) and g_2 = 1 + L ln(1 − 2e). Since g_1 is always greater than 0, the sign of g′′(e_parity) is determined only by that of g_2. As known to us, the QBER of a practical DV-QKD system is always lower than 15%. Then, g_2 reaches its maximal positive value when L = 1 and turns negative when L is large. Hence, g′(e_parity) first increases and then decreases as L grows. The maximal value is achieved when g′′(e_parity) = 0. Setting g′′(e_parity) = 0, the theoretical optimal block length L_theory can be calculated as follows:

L_theory = ⌊−1 / ln(1 − 2e)⌋   (40.8)
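Reading the bracketed rounding in Eq. (40.8) as a floor operation, which reproduces the L_theory column of Table 40.1, the optimal length can be computed as below; this small helper is ours.

```python
import math

def optimal_block_length(qber: float) -> int:
    # Eq. (40.8): L_theory = floor(-1 / ln(1 - 2e)), at least 1
    return max(1, math.floor(-1.0 / math.log(1.0 - 2.0 * qber)))
```

For example, QBER values of 2%, 5%, and 25% give block lengths of 24, 9, and 1, matching Table 40.1.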

40.3.2 Simulations and Analysis

In addition to the theoretical analysis, the relevant simulations have been carried out, and the results coincide with the theoretical deduction. The estimation efficiency f_est is defined in Eq. 40.9, in which D_parity_based and D_random_sampling are the variances of the QBER estimated by the parity-based and random sampling methods, respectively, and the mathematical expectation is the actual QBER. L_sl is the optimal block length obtained through simulation.

f_est = D_parity_based / D_random_sampling   (40.9)

The simulations are conducted with data sizes of 1000, 100, and 10 kb, respectively, and each simulation is repeated 1000 times. Simulation results are
depicted in Table 40.1. As can be seen from the table, Ltheory decreases with increas-
ing QBER and drops to 1 when QBER is 25%. If QBER further increases, random
sampling instead of parity-based will be a better error estimation method. In addi-
tion, the simulation results show that f_est decreases with increasing QBER. Hence, though the proposed estimation method is always effective in the QBER region of a DV-QKD system, it is more suitable for low-QBER situations. Nowadays, the QBER of DV-QKD systems is typically less than 3% [15]. Hence, the advantage of the proposed method is obvious in this situation. In addition, Eq. 40.8 is deduced without considering the effect of finite block length. As depicted in Table 40.1, L_sl is always a little smaller than the corresponding theoretical result. Moreover, the length gap becomes wider with decreasing QBER. Thus, an adjustment factor α, which can be fixed or adjusted with the varying QBER, is introduced to narrow the gap. The modified formula of the actual optimal length L_actual is given in Eq. 40.10.

L_actual = ⌊−1 / ln(1 − 2e)⌋ − α    (40.10)

Table 40.1 The optimal block lengths and the estimation efficiency obtained through theory and simulation for typical QBERs

QBER (%)   L_theory   Simulation 1          Simulation 2          Simulation 3
                      L_sl     f_est        L_sl     f_est        L_sl     f_est
2          24         20       8.8          18       9.4          16       8.9
3          16         15       5.9          14       6.0          14       5.3
5          9          8        3.3          7        3.7          7        3.3
8          5          4        2.2          4        2.2          4        2.1
25         1          1        ≈1.0         1        ≈1.0         1        ≈1.0

40.4 Proposed Parity-Based Error Estimation Scheme

An efficient and convenient error estimation scheme based on the obtained optimal
block length is proposed in this section. The application scenarios of error estima-
tion are divided into three categories: blind, semi-blind, and non-blind. The blind scenario indicates QKD systems whose QBER is completely unknown to the error estimation. This situation usually occurs in the system debugging process. QKD systems with highly fluctuating QBERs are typical examples of the semi-blind scenario. In this scenario, the gain in estimation accuracy obtained from the previous (already corrected) frame is low. Most commonly used QKD systems belong to the third category: the probability distribution of the QBER is stable and known to the error estimation. Hence, a rough error estimation is sufficient, leaking only a small amount of information. Since most blind QKD systems can be converted to non-blind ones after several rounds of reconciliation, we concentrate only on semi-blind and non-blind systems.

40.4.1 Description of the Proposed Scheme

The proposed error estimation scheme is described as follows.


Step 1: Preprocessing. Prior to the key error estimation process, the two remote parties Alice and Bob determine the following parameters.
Step 1.1: Initial block length. The initial block length is calculated from the pre-set parameter e_max. The selection of e_max differs between semi-blind and non-blind systems. As known to us, if the QBER of a DV-QKD system is higher than 10%, the final key rate is rather low. Hence, for semi-blind systems without additional information, the maximal available error rate e_max can be chosen as 10%. For non-blind systems, the choice of e_max is related to the probability distribution of the QBER. Assume Pr[e_max − ε ≤ e ≤ e_max] ≥ 1 − β, where ε and β are predefined thresholds; then e_max is the desired parameter.
Step 1.2: Sampled blocks. The sampled blocks involved in error estimation are
randomly chosen according to the amount of information leakage.
Step 2: Parity comparison. Alice and Bob exchange the parities of the sampled blocks through the authenticated classical channel. Using Eq. 40.10, Alice and Bob calculate the QBER by comparing the parities (a minimal sketch of this step is given after the list below).
Step 3: (Optional) Interactive estimation. If there exists a significant difference
between the estimated QBER and the QBER from the previous frame, an interactive
estimation is needed. Step 2 is then repeated and the initial block length is updated
by using the latest estimated QBER.
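A minimal Python sketch of the parity-comparison step (Step 2) is given below. It is an illustration only: it assumes the standard relation e_parity = (1 − (1 − 2e)^L)/2 between the block-parity mismatch rate and the QBER, which is consistent with the derivative in Eq. 40.6 but is not reproduced here, and it samples every block of the frame rather than a randomly chosen subset.

import numpy as np

def estimate_qber(alice_key, bob_key, block_len):
    # Parity-based QBER estimate: compare block parities and invert the assumed
    # relation e_parity = (1 - (1 - 2e)^L) / 2 (1 bit leaked per sampled block).
    n_blocks = len(alice_key) // block_len
    a = alice_key[:n_blocks * block_len].reshape(n_blocks, block_len)
    b = bob_key[:n_blocks * block_len].reshape(n_blocks, block_len)
    e_parity = np.mean(a.sum(axis=1) % 2 != b.sum(axis=1) % 2)
    e_parity = min(e_parity, 0.5 - 1e-12)  # keep the inversion well defined
    return (1.0 - (1.0 - 2.0 * e_parity) ** (1.0 / block_len)) / 2.0

# Toy example: a 100 kb frame with a true QBER of 3% and L = 16 (cf. Table 40.1).
rng = np.random.default_rng(0)
true_e, block_len, n = 0.03, 16, 100_000
alice = rng.integers(0, 2, n)
bob = np.where(rng.random(n) < true_e, 1 - alice, alice)
print(f"estimated QBER = {estimate_qber(alice, bob, block_len):.4f}")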

40.4.2 Simulations and Analysis

In order to evaluate the overall performance of the proposed scheme, simulations for typical semi-blind and non-blind systems are conducted. The random sampling results, with a leakage ratio of 5% and an estimation efficiency of 1, are used as a benchmark. In the simulations, the leakage ratio is defined as the ratio between the amount of leaked information and the frame length. The QBER ranges of the two simulations are both [2%, 8%], but the QBER probability distributions are different. In Simulation 1, the


Fig. 40.1 Estimation efficiency (upper panel) and leakage ratio (lower panel)

Table 40.2 Comparison of leakage ratio

No.   Scenario     e_max (%)   Using probability   Average leakage (%)
1     Semi-blind   8           No                  1.95
2     Non-blind    8           No                  1.60
3     Non-blind    6           Yes                 1.35

probability distribution is uniform, and e_max is then set to 8%. In Simulation 2, assuming that ε = 4%, β = 2%, and Pr[2% ≤ e ≤ 6%] ≥ 98%, then e_max = 6%. Other parameters, such as the amount of leaked information and the block length, are assumed to be 5% and 100 kb, respectively. Each simulation is repeated 10,000 times, and the results at different QBERs are depicted in Fig. 40.1. In addition, the average leakage ratio is calculated in Table 40.2 by using the simulation results.
As depicted in Fig. 40.1, both the estimation efficiency and the leakage ratio of the proposed protocol outperform those of the comparative method. In addition, it can be seen from Table 40.2 that taking advantage of the probability distribution contributes to the performance improvement of the estimation.

40.5 Conclusions

In this research, an improved parity-based error estimation scheme is proposed.


The optimal block length is obtained through theoretical analysis first. Then the
corresponding error estimation scheme for two types of QKD systems is presented.
The theoretical analysis shows that the proposed scheme is applicable to DV-QKD systems and consistently effective when the QBER is lower than 25%. In addition, the simulation results show that the proposed scheme can reach the same accuracy level as random sampling with much less information leakage. Thus, the proposed scheme is able to improve the final secure key rate of a QKD system with negligible extra cost.

Acknowledgements This work is supported by the Space Science and Technology Advance
Research Joint Funds (6141B06110105) and the National Natural Science Foundation of China
(Grant Number: 61771168).

References

1. Bennett, C.H., Brassard, G.: Quantum cryptography: public key distribution and coin tossing.
Theor. Comput. Sci. 560, 7–11 (2014)
2. Chen, C.M., Wang, K.H., Wu, T.Y., Wang, E.K.: On the security of a three-party authenticated
key agreement protocol based on chaotic maps. Data Sci. Pattern Recognit. 1(2), 1–10 (2017)
3. Pan, J.S., Lee, C.Y., Sghaier, A., Zeghid, M., Xie, J.: Novel systolization of subquadratic space
complexity multipliers based on Toeplitz matrix-vector product approach. IEEE Transactions
on Very Large Scale Integration (VLSI) Systems (2019)
4. Renner, R.: Security of quantum key distribution. Int. J. Quantum Inf. 6(1), 1–127 (2008)
5. Li, Q., Yan, B.Z., Mao, H.K., Xue, X.F., Han, Q., Guo, H.: High-speed and adaptive FPGA-
based privacy amplification in quantum key distribution. IEEE Access 7, 21482–21490 (2019)
6. Li, Q., Le, D., Wu, X., Niu, X., Guo, H.: Efficient bit sifting scheme of post-processing in
quantum key distribution. Quantum Inf. Process. 14(10), 3785–3811 (2015)
7. Yan, H., Ren, T., Peng, X., Lin, X., Jiang, W., Liu, T., Guo, H.: Information reconciliation
protocol in quantum key distribution system. In: Fourth International Conference on Natural
Computation, ICNC’08, vol. 3, pp. 637–641. IEEE (2008)

8. Li, Q., Le, D., Mao, H., Niu, X., Liu, T., Guo, H.: Study on error reconciliation in quantum
key distribution. Quantum Inf. Comput. 14(13–14), 1117–1135 (2014)
9. Mao, H., Li, Q., Han, Q., Guo, H.: High throughput and low cost LDPC reconciliation for
quantum key distribution. arXiv:1903.10107 (2019)
10. Martinez-Mateo, J., Elkouss, D., Martin, V.: Blind reconciliation. Quantum Inf. Comput. 12(9–
10), 791–812 (2012)
11. Kiktenko, E., Trushechkin, A., Lim, C., Kurochkin, Y., Fedorov, A.: Symmetric blind information reconciliation for quantum key distribution. Phys. Rev. Appl. 8(4), 044017 (2017)
12. Li, Q., Wen, X., Mao, H., Wen, X.: An improved multidimensional reconciliation algorithm
for continuous-variable quantum key distribution. Quantum Inf. Process. 18(1), 25 (2019)
13. Kiktenko, E., Malyshev, A., Bozhedarov, A., Pozhar, N., Anufriev, M., Fedorov, A.: Error
estimation at the information reconciliation stage of quantum key distribution. J. Russ. Laser
Res. 39(6), 558–567 (2018)
14. Lu, Z., Shi, J.H., Li, F.G.: Error rate estimation in quantum key distribution with finite resources.
Commun. Theor. Phys. 67(4), 360 (2017)
15. Mo, L., Patcharapong, T., Chun-Mei, Z., Zhen-Qiang, Y., Wei, C., Zheng-Fu, H.: Efficient error
estimation in quantum key distribution. Chin. Phys. B 24(1), 010302 (2015)
Chapter 41
An Internal Threat Detection Model
Based on Denoising Autoencoders

Zhaoyang Zhang, Shen Wang and Guang Lu

Abstract Internal user threat detection is an important research problem in the field
of system security. Recently, the analysis of abnormal behaviors of users is divided
into supervised learning method (SID) and unsupervised learning method (AD).
However, supervised learning method relies on domain knowledge and user back-
ground, which means it cannot detect previously unknown attacks and is not suitable
for multi-detection domain scenarios. Most existing AD methods use clustering algorithms directly. But for threat detection on internal users' behavior, which mostly involves high-dimensional cross-domain log files, as far as we know, there are few methods for multi-domain audit log data with effective feature extraction. An effective feature extraction method can not only reduce the testing cost greatly, but also detect the abnormal behavior of users more accurately. We propose a new unsupervised log
abnormal behavior detection method which is based on the denoising autoencoders
to encode the user log file, and adopts the integrated method to detect the abnormal
data after encoding. Compared with the traditional detection method, it can analyze
the abnormal information in the user behavior more effectively, thus playing a pre-
ventive role against internal threats. In addition, the method is completely data driven
and does not rely on relevant domain knowledge and user’s background attributes.
Experimental results verify the effectiveness of the integrated anomaly detection
method under the multi-domain detection scenario of user log files.

Keywords Internal threat · User cross-domain behavior analysis · Denoising


autoencoder · Gaussian mixture model · Machine learning

Z. Zhang (B) · S. Wang · G. Lu


Department of Computer Science and Technology, Harbin Institute of Technology, Harbin
150001, China
e-mail: zhaoyang_Zhang@hit.edu.cn


41.1 Introduction

Internal user threat detection is an important research problem in the field of system
security. In many recent security incidents, internal user attack has become one of
the main reasons [1]. Internal users usually refer to the internal personnel of an
organization. They are usually the users of information systems in the organization
such as government employees, enterprise employees, or the users of public services
such as the users of digital libraries, etc. [2, 3]. The user or user process in the
computer system in a variety of activities recorded (also known as user audit log)
is an important basis for the analysis of user behavior such as the user’s command
execution records, file search records, etc. Therefore, we’ll explore the anomaly
detection of cross-domain log files.
A lot of work has been done to propose user behavior analysis methods for internal
threat detection. The existing internal threat detection and prediction algorithms are
divided into two types: (i) anomaly detection based on unsupervised learning (AD)
and (ii) signature-based intrusion detection (SID) [4]. However, supervised learning-
based SID methods can only detect known attacks [5]. Most existing AD methods
use clustering algorithms directly, but for threat detection on internal users, as far as we know, there are few methods for multi-domain audit log data with effective feature extraction. An effective feature extraction method can not only reduce the testing cost greatly, but also detect the abnormal behavior of users more accurately. Therefore, we adopt a deep learning-based method to extract features of high-dimensional cross-domain log files progressively and then detect the abnormal behaviors of users.
In this paper, the one-hot encoding which describes the user's multi-domain behavior is fed into the denoising autoencoder to learn a low-dimensional vector representation. Finally, we analyze the abnormal behavior of users based on unsupervised learning techniques. Traditionally, AD methods are bound to generate many false alarms [6]. Some studies suggest using intent models and other models [7], but these methods involve human intervention and expert experience. In our model, robust covariance [8], OCSVM [9], isolation forest [10], and local outlier factor [11] are integrated with GMM to obtain the final results, which can effectively reduce the false alarm rate while ensuring a high recall rate. Our final experimental results show that with our method, the recall rate reaches 89%, and the false alarm rate is only 20%.

41.2 Algorithm Description

Our goal is to propose an internal threat detection and prediction algorithm. The
method in our model includes three main steps (as shown in Fig. 41.1).

Fig. 41.1 User exception detection process in our model

41.2.1 Data Preprocessing

In the data preprocessing based on the statistical method, the user's multi-domain behavior description is constructed. First, the normalized data characteristics of the audit logs of users in each domain are extracted respectively. After obtaining the single-domain behavior characteristics of users, we statistically combine all single-domain behavior descriptions of a user that fall into the same time window.

41.2.2 Construction of User Behavior Characteristics Based on Denoising Autoencoders

The description of users' multi-domain behavior is constructed based on the statistical method. The obtained multi-domain behavior characteristics of users have a high dimension, which is not conducive to the expression of users' behavior characteristics. In order to solve the problem of high dimensionality, we discuss feature extraction with an autoencoder over the one-hot encoding in this section.
The function of a denoising autoencoder is to learn features from noise-corrupted data that are almost identical to those learned from clean data. However, the features acquired by the denoising autoencoder from a noise-corrupted input are more robust, and the denoising objective prevents the autoencoder from trivially copying its input.
Figure 41.2 is the calculation diagram of the cost function of the denoising autoencoder. x̃ is the data after adding noise, and f and g are the encoder and decoder, respectively. We introduce a corruption process C(x̃|x), which represents the conditional probability of producing the corrupted sample x̃ given the data sample x.

Fig. 41.2 Calculation diagram of the cost function of the denoising autoencoders

The denoising autoencoder is trained to reconstruct the clean data point x from the corrupted version x̃. This can be achieved by minimizing the loss L = − log p_decoder(x|h = f(x̃)), where x̃ are the samples produced by the corruption process C(x̃|x). Generally, the distribution p_decoder is a factorial distribution whose parameters are given by a feedforward network. The denoising autoencoder learns the reconstruction distribution p_reconstruct(x|x̃) from training pairs (x, x̃) according to the following process:
1. Take a training sample x from the training set.
2. From C(x̃|x), pick a corrupted sample x̃.
3. Use (x, x̃) as a training pair to estimate the reconstruction distribution

p_reconstruct(x|x̃) = p_decoder(x|h)    (41.1)

Usually, the negative log-likelihood −log p_decoder(x|h) can be minimized with gradient-based methods, such as mini-batch gradient descent. As long as the encoder is deterministic, the denoising autoencoder is a feedforward network and can be trained in exactly the same way as other feedforward networks. Therefore, we can regard the DAE as performing stochastic gradient descent on the following expectation:

−E_{x∼p̂_data(x)} E_{x̃∼C(x̃|x)} log p_decoder(x|h = f(x̃))    (41.2)

where p̂_data(x) is the distribution of the training data.


The denoising autoencoder adds random noise to the input and then feeds the noisy input to the autoencoder. This process randomly sets some input bits (at most half of the bits) to 0, so that the denoising autoencoder needs to guess the zeroed bits from the uncontaminated bits. The ability to predict any subset of the data from the remaining part is a sufficient condition for capturing the joint distribution of the variables (the theoretical basis lies in Gibbs sampling [12]), which indicates that the denoising autoencoder is theoretically capable of capturing all the effective characteristics of the input. In our model, a denoising autoencoder is trained many times on training data without abnormal points, and the learned encoder is then applied to the test data to obtain the user behavior characteristics.
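A minimal PyTorch sketch of this step is given below. It is not the authors' exact implementation: the layer sizes follow Fig. 41.4 (1977 input dimensions encoded down to 20), the zero-masking corruption follows the description above, and the mean-squared reconstruction loss, optimizer, and training schedule are assumptions.

import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    # A minimal denoising autoencoder; layer sizes follow Fig. 41.4 (1977 -> 20).
    def __init__(self, in_dim=1977, code_dim=20):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, code_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 32), nn.ReLU(),
            nn.Linear(32, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, in_dim), nn.Sigmoid(),  # one-hot inputs lie in [0, 1]
        )

    def forward(self, x_noisy):
        h = self.encoder(x_noisy)
        return self.decoder(h)

def corrupt(x, p=0.5):
    # Corruption process C(x_tilde | x): randomly zero out a fraction p of the input bits.
    mask = (torch.rand_like(x) > p).float()
    return x * mask

def train(model, data, epochs=50, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()  # reconstruction loss against the clean input
    for _ in range(epochs):
        loss = loss_fn(model(corrupt(data)), data)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

# Toy usage on random one-hot-like data; the real input is the encoded user behavior matrix.
x = (torch.rand(512, 1977) > 0.95).float()
dae = train(DenoisingAutoencoder(), x, epochs=5)
features = dae.encoder(x)  # 20-dimensional user behavior features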

41.2.3 Abnormal Behavior Detection

Detection of abnormal behaviors based on Gaussian mixture model. In this


section, the commonly used Gaussian mixture model is adopted for anomaly detec-
tion. The Gaussian mixture model assumes that the data obeys the mixed Gaussian
distribution. According to the central limit theorem, it is reasonable to assume that the
subject distribution is Gaussian, and the mixed Gaussian model can approximate any

continuous probability distribution arbitrarily [13]. The Gaussian mixture model is composed of K Gaussian distributions; each Gaussian function is called a "component", and the weighted sum of these components constitutes the probability density function of the Gaussian mixture model. In our model, the cross-validation method is adopted to find the best Gaussian mixture model. A common approach is the Bayesian information criterion (BIC). The idea of the Bayesian information criterion is based on the Bayes factor, and its formula is

BIC = k ln(n) − 2 ln(L)    (41.3)

where k is the number of model parameters, L is the maximized likelihood, and n is the number of samples. When training the model, increasing the number of parameters improves the likelihood but is penalized by the first term, so the model with the lowest BIC is selected.
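As an illustration of this selection step, the sketch below uses scikit-learn's GaussianMixture and chooses the number of components by minimizing the BIC; the candidate range of components and the full covariance type are assumptions, not values stated in the paper.

import numpy as np
from sklearn.mixture import GaussianMixture

def fit_best_gmm(features, max_components=10, seed=0):
    # Select the number of Gaussian components by minimizing the BIC (Eq. 41.3).
    best_gmm, best_bic = None, np.inf
    for k in range(1, max_components + 1):
        gmm = GaussianMixture(n_components=k, covariance_type="full",
                              random_state=seed).fit(features)
        bic = gmm.bic(features)
        if bic < best_bic:
            best_gmm, best_bic = gmm, bic
    return best_gmm

# Toy usage on random 20-dimensional features (the DAE output in our setting).
X = np.random.RandomState(0).randn(500, 20)
gmm = fit_best_gmm(X)
labels = gmm.predict(X)        # cluster assignment per behavior vector
scores = gmm.score_samples(X)  # log-likelihood; low values suggest anomalies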
Finally, we use the user behavior pattern analysis method introduced above to design a new detection method for camouflage attacks across multiple detection domains. The detection of camouflage attacks mainly includes two aspects: "abnormal behavior pattern detection" and "normal behavior pattern interference detection". Generally, the frequency of attack behavior is much lower than that of normal behavior, so in the GMM model attack behaviors form sparse, small clusters. In the detection process, we set a threshold on the abnormal behavior patterns to distinguish the normal behavior patterns from the abnormal behavior patterns of users. In the GMM model, this threshold is the lower limit of the cluster size, and clusters below this threshold are abnormal behavior patterns. The user behavior contained in the exception pattern is considered to be aggressive behavior. In the aspect of behavioral pattern interference detection, we examine whether the influence of each behavioral feature vector on the Gaussian distribution of its pattern is beneficial. Similarly, because the frequency of attack behavior is far less than that of normal behavior, normal behavior is more consistent with the Gaussian distribution of the pattern, while attack behavior will weaken the conformity with the Gaussian distribution of the pattern.
Integrated abnormal data detection. In our model, robust covariance, OCSVM, isolation forest, and local outlier factor are integrated with GMM to get the final results. Figure 41.3 is the calculation diagram of the detection process. Because the GMM-based detection method examines whether the influence of each behavior feature vector on the Gaussian distribution of its pattern is favorable, it is conducive to the detection of camouflage attacks in internal threats (camouflage attacks often differ little from normal behaviors and are thus hidden among a large number of normal behaviors). However, the disadvantage of the GMM-based detection method is that it requires manual control of the detection threshold (a too high threshold will miss hidden camouflage attacks, and a too low threshold will lead to a high false alarm rate), which means manual intervention is required and it is difficult to guarantee the accuracy of detection. Robust covariance, OCSVM, isolation forest, and local outlier factor are based on different detection principles and thus each may miss different camouflage attacks.
Therefore, we attempt to take the union of the abnormal behaviors of users flagged by these four methods

Fig. 41.3 Calculation diagram of the detection process: the outputs of robust covariance, OCSVM, isolation forest, and local outlier factor are combined into the integrated results, which are then intersected with the GMM output to produce the test results

(expecting to get the maximum recall rate), and to take its intersection with the abnormal behaviors obtained by the GMM model (expecting to reduce the false alarm rate of the detection results) to get the final detection results. The GMM model in this method can be fixed at a lower threshold to ensure a higher recall rate (Fig. 41.3).
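A hedged scikit-learn sketch of this integration is shown below. The contamination level, the OCSVM ν parameter, and the use of a boolean mask for the GMM-based outliers are placeholders; the paper does not specify these settings.

import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.svm import OneClassSVM
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

def integrated_outliers(X, contamination=0.05):
    # Union of the outliers flagged by the four detectors (each returns -1 for outliers).
    detectors = [
        EllipticEnvelope(contamination=contamination),   # robust covariance
        OneClassSVM(nu=contamination, gamma="scale"),
        IsolationForest(contamination=contamination, random_state=0),
        LocalOutlierFactor(contamination=contamination),
    ]
    union = np.zeros(len(X), dtype=bool)
    for det in detectors:
        union |= (det.fit_predict(X) == -1)
    return union

def final_detection(X, gmm_outliers, contamination=0.05):
    # Intersect the union of the four detectors with the GMM-based outlier set.
    return integrated_outliers(X, contamination) & gmm_outliers

# Toy usage: gmm_outliers would be the boolean mask produced by the GMM-based step.
X = np.random.RandomState(1).randn(300, 20)
gmm_outliers = np.random.RandomState(2).rand(300) < 0.2
flags = final_detection(X, gmm_outliers)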

41.3 Experiments

This section describes the experiments in detail. First, data preprocessing
is used to obtain the characteristic expression of user behavior. The dataset for this
article is from the Insider Threat Test Dataset of Carnegie Mellon University. It should
be noted that this data set is synthesized [14]. Because malicious insiders are, first and
foremost, insiders, to collect real data, some organization must directly monitor and
record the behavior and actions of its own employees. Confidentiality and privacy
concerns create barriers to the collection and use of such data for research purposes.
Thus, it is sometimes preferable to proceed with synthetic data. This dataset had
been proved to be effective for abnormal behavior detection and was shared within a
large community of researchers engaged in the DARPA ADAMS program to develop
techniques for insider threat detection [14]. We extract all behavior data of 21 users over 90 days and quantize it with one-hot encoding.
Then feature extraction of user behavior is carried out. In our model, the one-hot encoded user behavior characteristics are re-encoded and reduced to a low

Fig. 41.4 Feature extraction process: the original data (1977 dimensions) is encoded through layers of 256, 128, 64, and 32 units into a 20-dimensional output

Table 41.1 Detection results based on GMM

                    Recall rate (%)   Accuracy (%)   F1-score
Outlier detection   66.7              85.7           0.75
CA detection        100               22.5           0.37

dimension through the four-layer denoising autoencoder network. Figure 41.4 shows the feature extraction process based on denoising autoencoders.
The detection method based on the Gaussian mixture model is then applied. Figure 41.5 shows the classification result of the Gaussian mixture model when the feature dimension is 20. We use the Gaussian mixture model and compare the proportion of each category. According to the above anomaly detection algorithm, the data with the smallest proportion are regarded as the first detected anomalies. Then the second outlier
detection is carried out based on the first detection result. Because the frequency of
attack behavior is far less than that of normal behavior, the normal behavior is more
consistent with the Gaussian distribution of the pattern, while the attack behavior
will weaken the conformity of the Gaussian distribution of the pattern. Therefore,
we examine whether the influence of each behavior eigenvector on the Gaussian
distribution of its mode is favorable. In our model, a threshold of abnormal behavior
pattern is set to distinguish normal behavior pattern from abnormal behavior pattern.
In GMM model, this threshold is the lower limit of the support degree of each behavior
eigenvector to the Gaussian distribution of its mode, and the behavior eigenvector
below this threshold is the abnormal behavior pattern. The user behavior contained
in the exception pattern is considered to be aggressive behavior.
The two detected abnormal behaviors were combined to obtain the preliminary
test results. Table 41.1 shows the test results based on the GMM method. It can be
seen that all abnormal data were detected with a recall rate of 100%, but the accuracy
rate was only 22.5%, indicating a high false alarm rate. In the next section, we will
effectively reduce the false alarm rate.
Since the GMM model cannot guarantee a low false alarm rate, we use four
other detection methods to further detect user behavior characteristics. Figure 41.6
shows the test results of robust covariance, OCSVM, isolation forest, and local outlier
factor. Among them, the yellow dots are detected abnormal points, and the ones with
numbers (1932–1940) are real abnormal points.
The test results of these four methods are integrated. Table 41.2 shows the test
results of robust covariance, OCSVM, isolation forest and local outlier factor. The

Fig. 41.5 Classification results of Gaussian mixture model

Table 41.2 Test results based on Robust covariance, OCSVM, isolation forest, and LOF

                    Recall rate (%)   Accuracy (%)   F1-score
Robust covariance   44.0              26.6           0.33
OCSVM               77.8              44.6           0.57
Isolation forest    55.6              33.3           0.42
LOF                 66.7              40.0           0.5

Table 41.3 Comparison of detection results of feature extraction with and without DA

             Recall rate (%)   Accuracy (%)   F1-score
With DA      88.9              75.0           0.81
Without DA   66.7              50.0           0.57

obtained integrated test results are intersected with the preliminary test results of
GMM.
Finally, we compared our method with the detection results obtained without
using the denoising autoencoder feature extraction, and found that compared with
the original method, the F1-score of our method was improved by 42% (Table 41.3).

Fig. 41.6 Robust covariance, OCSVM, isolation forest, and local outlier factor detection results

41.4 Conclusion

Internal user behavior analysis is an important research problem in the field of system
security. We propose a new unsupervised log abnormal behavior detection method.
This method is based on the denoising autoencoders to encode the user log file, and
adopts the integrated method to detect the abnormal data after encoding. Compared
with the traditional detection method, it can analyze the abnormal information in
the user behavior more effectively, thus playing a preventive role against internal
threats. In the experiment, we used the method in our model to analyze all the
behavior data of 21 users in the real scene over 90 days. The experimental results

verified the effectiveness of the analysis method in the multi-detection domain scene
to analyze the multiple patterns of user behavior. The detection method in our model
is superior to the traditional method for detecting abnormal user behavior based on
matrix decomposition.

References

1. Mayhew, M., Atighetchi, M., Adler, A., et al.: Use of machine learning in big data analytics
for insider threat detection. In: Military Communications Conference. IEEE (2015)
2. Gheyas, I.A., Abdallah, A.E.: Detection and prediction of insider threats to cyber security: a
systematic literature review and meta-analysis. Big Data Anal. 1(1), 6 (2016)
3. Evolving insider threat detection stream mining perspective. Int. J. Artif. Intell. Tools 22(05),
1360013 (2013)
4. Chen, C.-M., Huang, Y., Wang, E.K., Wu, T.-Y.: Improvement of a mutual authentication
protocol with anonymity for roaming service in wireless communications. Data Sci. Pattern
Recogn. 2(1), 15–24 (2018)
5. Chen, C.-M., Xu, L., Wu, T.-Y., Li, C.-R.: On the security of a chaotic maps-based three-party
authenticated key agreement protocol. J. Netw. Intell. 1(2), 61–66 (2016)
6. Chen, Y., Nyemba, S., Malin, B.: Detecting anomalous insiders in collaborative information
systems. IEEE Trans. Dependable Secure Comput. 9(3), 332–344 (2012)
7. Young, W.T., Goldberg, H.G., Memory, A., et al.: Use of domain knowledge to detect insider
threats in computer activities (2013)
8. Rousseeuw, P.J., Driessen, K.V.: A fast algorithm for the minimum covariance determinant
estimator. Technometrics 41(3), 212–223 (1999)
9. Liu, F.T., Kai, M.T., Zhou, Z.H.: Isolation forest. In: Eighth IEEE International Conference on
Data Mining (2009)
10. Manevitz, L.M., Yousef, M.: One-class SVMs for document classification. J. Mach. Learn.
Res. 2(1), 139–154 (2002)
11. Lee, J., Kang, B., Kang, S.H.: Integrating independent component analysis and local outlier
factor for plant-wide process monitoring. J. Process Control 21(7), 1011–1021 (2011)
12. Li, D., Chen, D., Goh, J., et al.: Anomaly detection with generative adversarial networks for
multivariate time series (2018)
13. Dilokthanakul, N., Mediano, P.A.M., Garnelo, M., et al.: Deep unsupervised clustering with
Gaussian mixture variational autoencoders (2016)
14. Glasser, J., Lindauer, B.: Bridging the gap: a pragmatic approach to generating insider threat
data. In: 2013 IEEE Security and Privacy Workshops (SPW). IEEE (2013)
Chapter 42
The Para-Perspective Projection
as an Approximation of the Perspective
Projection for Recovering 3D Motion
in Real Time

Tserennadmid Tumurbaatar and Nyamlkhagva Sengee

Abstract We present a new algorithm for determining 3D motion of a moving rigid


object in real-time image sequences relative to a single camera. The features used are two-dimensional (2D); they are obtained by projective transformations of the 3D features on the object surface under the perspective model. The perspective model is formulated as a nonlinear least squares problem to determine the 3D motion, characterized by rotation and translation, iteratively. In practice, it is numerically ill-conditioned and may converge slowly or even fail to converge if it starts from a poor initial guess. However, since the para-perspective projection model closely approximates the perspective projection for recovering the 3D motion and shape of the object in Euclidean space, we use the results provided by the para-perspective projection model as the initial value of a nonlinear optimization refinement under the perspective model equations.

Keywords Para-perspective model · Perspective model · 3D motion

42.1 Introduction

Recovering 3D motion has been a challenging task for machine perception and has occupied engineers and researchers working in the fields of human–computer interaction, augmented reality, 3D modeling, and visualization for the last couple of years.
In this paper, we propose a method for estimating the 3D motion of a moving object in image sequences taken with a single camera. For determining the 3D motion of a rigid object, its corresponding features observed at different times and obtained by per-

T. Tumurbaatar (B) · N. Sengee


Department of Information and Computer Sciences, National University of Mongolia,
Ulaanbaatar, Mongolia
e-mail: tserennadmid@seas.num.edu.mn
N. Sengee
e-mail: nyamlkhagva@seas.num.edu.mn


spective projection are used. In general, nonlinear minimization methods have been proposed for solving the 3D motion estimation problem under the perspective model. The approach of solving nonlinear equations requires some form of initial approximate solution and may fail if that solution is far away from the true one. Since the perspective projection can be approximated by the para-perspective projection by modeling both the scaling and the position effects [1], we initialize the proposed nonlinear equations using the para-perspective factorization method, which recovers the geometry of the scene and the motion of either the camera or the object from image sequences.
Tomasi and Kanade [1] first introduced a factorization method to recover 3D
shape of the object and the motion of the camera simultaneously under orthographic
projection and obtained accurate results. Aloimonos described the approximations of
perspective projection based on para-perspective and ortho-perspective projections
[2].
Methods for the 3D motion estimation problem have been developed by researchers based on various types of corresponding features on rigid objects from two (or more) images of sequences at different times. The main mathematical problem for determining the location and orientation of one camera was formulated as nonlinear equations, omitting depth information, by Huang and Tsai in [3–5]. The approaches for solving nonlinear equations were viable only if a good initial guess was available. Among others, Zhuang et al. [6], Longuet-Higgins [7], and Faugeras [8] have shown that the motion parameters of a rigid body can be estimated from point correspondences by solving linear equations. However, it has been found empirically that linear algorithms are usually more sensitive to measurement noise than nonlinear algorithms [3].
The contribution of this work has several aspects. First, this 3D motion estimation from a single camera is fast and does not require any additional hardware. Second, the applicability of the factorization method is usually limited to offline computation, since shape and motion are recovered after all the input images are given; although it is difficult to apply to the real-time case, we use the para-perspective factorization method in real time to initialize the nonlinear system of equations. Third, linear techniques are very sensitive to noise, and the para-perspective factorization method has linear properties. The best approach is therefore to first use a linear algorithm, assuming a sufficient number of feature correspondences, to find an initial guess, and then to use a nonlinear formulation to refine the solution iteratively for accurate motion parameters.
The paper is organized as follows. The problem statement and general motion
model of this proposed method are described in Sect. 42.2. The perspective approxi-
mation as para-perspective projection for obtaining initial guess value is summarized
in Sect. 42.3. The implementation of the proposed method is presented in Sect. 42.4.
The experiment of the proposed method is discussed in Sect. 42.5.

42.2 Problem Statement

Consider a rigid body viewed by a pinhole camera imaging system. We denote that a 3D point P on the surface of the object, in the object space coordinate system, is projected to a point p in the image space under perspective projection. We consider that there are
image sequences of the object that is moving relative to a static camera. The moving
system O is attached to the object, and a static system C is attached to the camera
as shown in Fig. 42.1.
Each image is taken with the object in some orientation, defined by the orthonormal unit vectors i_f, j_f, and k_f corresponding to the x-, y-, and z-axes of the camera. We represent the position of the object frame in each image by the vector t. We assume that N feature points are extracted in the first image and are tracked through each of the F images. The N feature points P_n = (X_n, Y_n, Z_n)^T on the object are projected into each of the F images with coordinates p_fn = (x_fn, y_fn), f = 1, ..., F, n = 1, ..., N. Our goal is to estimate the 3D motion of the moving object based on the tracked feature correspondences from the image sequences. Initially, we formulate the equations based on the rigidity constraint of the object.
A 3D point in the object space coordinate system is represented in the camera coordinate system by a rotation matrix R_f, whose rows are i_f = (i_xf, i_yf, i_zf), j_f = (j_xf, j_yf, j_zf), k_f = (k_xf, k_yf, k_zf), and by a translation t_f = (t_xf, t_yf, t_zf)^T:

P_n^c = R_f P_n + t_f,    (42.1)

where

      ⎡ i_f ⎤   ⎡ cos α cos β                       sin α cos β                       −sin β      ⎤
R_f = ⎢ j_f ⎥ = ⎢ cos α sin β sin γ − sin α cos γ   sin α sin β sin γ + cos α cos γ    cos β sin γ ⎥.    (42.2)
      ⎣ k_f ⎦   ⎣ cos α sin β cos γ + sin α sin γ   sin α sin β cos γ − cos α sin γ    cos β cos γ ⎦

Fig. 42.1 Coordinate system of the camera and object



The rotation matrix R f is specified as three independent rotations around x-, y-,
and z-axes by angles α, β, and γ in Eq. (42.2).
Assuming the camera intrinsic parameters are known and the focal length is unity, the relationship between the image space and the object space coordinates, using the property of similar triangles, can be written as

x_fn = (i_f · (P_n − t_f)) / (k_f · (P_n − t_f)),    y_fn = (j_f · (P_n − t_f)) / (k_f · (P_n − t_f))    (42.3)

x_fn = (i_f · P_n + t_xf) / (k_f · P_n + t_zf),    y_fn = (j_f · P_n + t_yf) / (k_f · P_n + t_zf),    (42.4)

where

t_xf = −i_f · t_f;    t_yf = −j_f · t_f;    t_zf = −k_f · t_f.    (42.5)

The 3D rotation and translation parameters can be obtained by formulating


Eq. (42.4) as a nonlinear least squares problem:

min Σ_{f=1}^{F} Σ_{n=1}^{N} [ (x_fn − (i_f · P_n + t_xf)/(k_f · P_n + t_zf))^2 + (y_fn − (j_f · P_n + t_yf)/(k_f · P_n + t_zf))^2 ].    (42.6)

Since we have the N point correspondences p_fn = (x_fn, y_fn) for each of the F images, a total of 2FN equations can be solved to determine six motion parameters (t_xf, t_yf, t_zf, α, β, γ) per frame and three shape parameters for each point (P_n = [P_n1, P_n2, P_n3]), for a total of 6F + 3N unknowns. As mentioned before, a good initial guess is essential for converging to the right solution of Eq. (42.6). In the next section, we will introduce the para-perspective projection model and discuss how it is used in the current approach. We will refine the results of the para-perspective projection through the perspective projection model Eq. (42.6) iteratively.
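To make the refinement concrete, the sketch below uses SciPy's least_squares to minimize the reprojection error of Eq. (42.6) for a single frame, holding the shape points fixed; the joint refinement over all frames and of the shape parameters, as well as the choice of solver, are assumptions made for the illustration rather than the authors' exact procedure.

import numpy as np
from scipy.optimize import least_squares

def rotation(alpha, beta, gamma):
    # Rotation matrix of Eq. (42.2), built from rotations about the x-, y-, and z-axes.
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    return np.array([
        [ca * cb,                sa * cb,                -sb],
        [ca * sb * sg - sa * cg, sa * sb * sg + ca * cg,  cb * sg],
        [ca * sb * cg + sa * sg, sa * sb * cg - ca * sg,  cb * cg],
    ])

def residuals(params, points, obs):
    # Reprojection residuals of Eq. (42.6) for one frame (unit focal length).
    alpha, beta, gamma, tx, ty, tz = params
    cam = points @ rotation(alpha, beta, gamma).T + np.array([tx, ty, tz])
    proj = cam[:, :2] / cam[:, 2:3]  # perspective projection, Eq. (42.4)
    return (proj - obs).ravel()

def refine_motion(init, points, obs):
    # Nonlinear refinement started from the para-perspective factorization result.
    return least_squares(residuals, init, args=(points, obs)).x

# Toy usage: synthesize observations for one frame, perturb the motion, refine.
rng = np.random.default_rng(0)
points = rng.uniform(-4.0, 4.0, (15, 3))
true = np.array([0.2, -0.1, 0.3, 0.5, -0.2, 20.0])
obs = residuals(true, points, np.zeros((15, 2))).reshape(15, 2)
refined = refine_motion(true + 0.05, points, obs)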

42.3 The Para-Perspective Projection as an Approximation of the Perspective Projection

Para-perspective projection has been used for the solution of various problems.
Para-perspective projection closely approximates perspective projection by mod-
eling image distortions as illustrated in Fig. 42.2.
First, the points P_n are projected onto the auxiliary plane G that is parallel to the image plane and passes through the mass center c of the object. The projection rays are parallel to the line connecting the mass center with the camera focal point. This step captures the foreshortening distortion and the position effect.

Fig. 42.2 Basic geometry for para-perspective projection

Then, the points of the plane G are projected onto the image plane using perspective projection. Since the plane G is parallel to the image plane, this amounts to scaling the point coordinates by the distance between the camera focal point and the auxiliary plane G. This step captures both the distance and position effects.
The para-perspective projection is the first-order approximation of the perspective projection, derived from Eq. (42.3). We suppose that the para-perspective projection of the point P_n onto the image plane, p_fn = (x_fn, y_fn), is given by

x_fn = m_f · P_n + t_xf;
y_fn = n_f · P_n + t_yf,    (42.7)

where

t_zf = −t_f · k_f    (42.8)

t_xf = −(t_f · i_f) / t_zf;    t_yf = −(t_f · j_f) / t_zf;    (42.9)

m_f = (i_f − t_xf k_f) / t_zf;    n_f = (j_f − t_yf k_f) / t_zf.    (42.10)

Since we have the tracked N feature points over F frames in the image streams, we can write all these measurements into a single matrix by combining the equations as follows:

⎡ x_11 ... x_1N ⎤   ⎡ m_1 ⎤                 ⎡ t_x1 ⎤
⎢  ...  ...  ... ⎥   ⎢ ... ⎥                 ⎢  ... ⎥
⎢ x_F1 ... x_FN ⎥   ⎢ m_F ⎥                 ⎢ t_xF ⎥
⎢ y_11 ... y_1N ⎥ = ⎢ n_1 ⎥ [P_1 ... P_N] + ⎢ t_y1 ⎥ [1 ... 1]    (42.11)
⎢  ...  ...  ... ⎥   ⎢ ... ⎥                 ⎢  ... ⎥
⎣ y_F1 ... y_FN ⎦   ⎣ n_F ⎦                 ⎣ t_yF ⎦

or in brief form

W = M S + T [1 . . . 1] (42.12)

where W ∈ R^(2F×N) is the measurement matrix, M ∈ R^(2F×3) is the motion matrix, S ∈ R^(3×N) is the shape matrix, and T ∈ R^(2F×1) is the translation vector.

t_xf = (1/N) Σ_{n=1}^{N} x_fn;    t_yf = (1/N) Σ_{n=1}^{N} y_fn.    (42.13)

After decomposition of the W matrix, we estimate the true motion and shape matrices by computing a metric transformation. We note that 2F + 1 equations can be obtained for six unknowns, so at least three frames are necessary to compute the metric transformation. Para-perspective factorization was introduced in detail by the authors in [1]. Finally, we obtain all unknown motion parameters (i_f^(0), j_f^(0), k_f^(0), t_xf^(0), t_yf^(0), t_zf^(0)) and shape parameters from Eq. (42.10), based on the ortho-normality of these vectors, as initial guess values. We will refine these values iteratively using the equations in (42.6) under perspective projection.
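The core of this step can be sketched as a rank-3 factorization of the registered measurement matrix via SVD, as below; the subsequent metric transformation that enforces the constraints on m_f and n_f in Eq. (42.10), which the authors take from [1], is only indicated in a comment and not implemented here.

import numpy as np

def factorize(W):
    # Rank-3 factorization of the registered measurement matrix (Eq. 42.12).
    # W has shape (2F, N): x-coordinates stacked over frames, then y-coordinates.
    T = W.mean(axis=1, keepdims=True)           # translations t_xf, t_yf (Eq. 42.13)
    W_reg = W - T                               # registered measurement matrix
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    M_hat = U[:, :3] * np.sqrt(s[:3])           # affine motion, shape (2F, 3)
    S_hat = np.sqrt(s[:3])[:, None] * Vt[:3]    # affine shape, shape (3, N)
    # A metric transformation A (found from the constraints on m_f, n_f in Eq. 42.10)
    # must still be applied: M = M_hat A, S = inv(A) S_hat. That step is omitted here.
    return M_hat, S_hat, T

# Toy usage with a random rank-3 measurement matrix plus noise (F = 5, N = 15).
rng = np.random.default_rng(0)
W = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 15)) + 0.01 * rng.standard_normal((10, 15))
M_hat, S_hat, T = factorize(W)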

42.4 The Implementation of the Proposed Method

In this section, we explain the implementation steps of our proposed method in detail. We recover the shape and motion parameters for every five frames under para-perspective projection because, to compute the L matrix in Sect. 42.3, we obtain 2F + 1 equations for six unknown parameters; thus, we need at least three frames. The first four frames are initially captured, and the fifth frame is captured subsequently at a different time.
First, a sub-image, which includes only the foreground moving object, is extracted from the current video sequence. Then, feature points are computed for the extracted sub-image using the SIFT feature extractor, to be matched to the subsequent frames. After these steps, the first frame is captured from the current video stream, and the corresponding features are extracted between the first frame and the extracted sub-image. The best matches are found with the brute-force matcher, and outliers among the matched points are eliminated by a Random Sample Consensus (RANSAC)-based robust method. Similarly, the second, third, fourth, and fifth frames are captured in order from the image sequence. All processing steps applied to the first frame are also applied to the other four frames. Since the para-perspective initialization step requires the same number of tracked correspondences in every frame, we compute the

corresponding points (F_p1, F_p2, F_p3, F_p4, F_p5) in each frame for all extracted feature points in the sub-image by the RANSAC-based outlier elimination method.
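An OpenCV sketch of this feature matching and outlier elimination step is given below. Lowe's ratio test and the homography model used by the RANSAC step are assumptions made for the illustration; the paper only states that a brute-force matcher and a RANSAC-based robust method are used.

import cv2
import numpy as np

def match_features(sub_image, frame, ratio=0.75, ransac_thresh=3.0):
    # SIFT features on the sub-image and the captured frame.
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(sub_image, None)
    kp2, des2 = sift.detectAndCompute(frame, None)

    # Brute-force matching followed by Lowe's ratio test (assumed here).
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(des1, des2, k=2)
    good = [p[0] for p in knn if len(p) == 2 and p[0].distance < ratio * p[1].distance]

    # RANSAC-based outlier elimination (a homography model is assumed here).
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    _, mask = cv2.findHomography(src, dst, cv2.RANSAC, ransac_thresh)
    inliers = mask.ravel().astype(bool)
    return src[inliers].reshape(-1, 2), dst[inliers].reshape(-1, 2)

# Usage: pts_sub, pts_frame = match_features(sub_gray, frame_gray) on grayscale images.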
Second, using the computed point trajectories over the five captured frames, we recover the shape and motion parameters through the para-perspective factorization method. Next, the iterative method defined in Eq. (42.6) is used to refine the six motion parameters (t_xf, t_yf, t_zf, α, β, γ), with the para-perspective factorization results used as the initial guess in the least squares solution.

42.5 The Experiments of the Proposed Method

In this section, we evaluate the performance of the 3D motion estimation. We performed the experiments on a computer with an Intel(R) Core(TM) i5 CPU at 3.0 GHz and 4096 MB of RAM, and a Microsoft LifeCam. We used the C++ programming language with the Visual Studio programming tool, the OpenCV library, and the OpenGL graphics library. The 320 × 240 video image sequences are taken from the single calibrated camera.
We examine the motion results refined under perspective projection by creating synthetic feature point sequences with known motion parameters. Each synthetic image was created by perspective projection, choosing the largest focal length that keeps the object in the field of view throughout the sequences. We created a cube whose edge length is 8 and randomly generated 15 points within the cube. Then, we rigidly move the cube, for example rotating it through a total of 40° about each axis. We ran the perspective refinement method, initialized by para-perspective factorization, on each synthetic image sequence to estimate the motion parameters, and computed the Root Mean Square (RMS) error of the estimated rotation parameter by comparing it with the measured rotation parameter for each frame. The synthetic dataset consists of
115 image frames. The estimated rotation parameters and the measured rotation
parameters around camera’s x-, y-, and z-axes in each frame are shown in Fig. 42.3.
The computed RMS errors of rotation parameters around x-, y-, and z-axes were
usually about 0.503°, 0.61°, and 0.08°, respectively.

42.6 Conclusion

In this paper, we obtained the 3D motion parameters from rigid transformation equa-
tions when features in the 3D space and their perspective projections on the camera
plane are known. The solution equations were formulated in nonlinear least squares
problem for the tracked feature correspondences over image sequences. These equa-
tions require the good initial approximation and 3D features, so to avoid difficulties,
the para-perspective projection is used to approximate the perspective projection and
to find out the 3D features in Euclidean space. Then, we solved the proposed equa-
tions using results of para-perspective approximation as initial values. The results

Fig. 42.3 The comparison results. a Comparison of the rotations around the x-axis. b Comparison
of the rotations around the y-axis. c Comparison of the rotations around the z-axis

of this method are accurate, and the produced errors between the estimated and the
measured motion parameters are negligibly small.

Acknowledgements The work in this paper was supported by the grant of National University of
Mongolia (No. P2017-2469) and MJEED, JICA (JR14B16).

References

1. Poelman, C.J., Kanade, T.: A paraperspective factorization method for shape and motion recov-
ery. IEEE Trans. Pattern Anal. Mach. Intell. 19(3), (1997)
2. Aloimonos, J.Y.: Perspective approximations. Image Vis. Comput. 8(3) (1990)
3. Huang, T.S., Netravali, A.N.: Motion and structure from feature correspondences: a review.
Proc. IEEE 82(2) (1994)
4. Huang, T.S., Tsai, R.Y.: Image sequence analysis: motion estimation. In: Image Sequence Anal-
ysis. Springer Verlag, New York (1981)
5. Huang, T.S.: Determining three dimensional motion and structure from two perspective views. In:
Young, T.Y., Fu, K.S. (eds.) Handbook of Pattern Recognition and Image Processing. Academic
Press, New York (1986)
6. Zhuang, X., Huang, T.S., Ahuja, N., Haralick, R.M.: A simplified linear optic flow motion
algorithm. Comput. Vis. Graph. Image Process. 42, 334–344 (1988)
7. Longuet-Higgins, H.C.: A computer program for reconstructing a scene from two projections.
Nature 293, 133–135 (1981)
8. Faugeras, O.: Three-Dimensional Computer Vision: A Geometric View-point, Cambridge. MIT
Press, MA (1993)
Chapter 43
Classifying Songs to Relieve Stress Using
Machine Learning Algorithms

Khongorzul Munkhbat and Keun Ho Ryu

Abstract Music has a great impact on stress relief for humans. We have become very stressed by society and the times. Stress that accumulates and is not relieved daily will have adverse effects on our physical and mental health, such as obesity, heart attacks, insomnia, and so on. Therefore, this study offers an ensemble approach combining machine learning algorithms such as K-NN, naïve Bayes, multilayer perceptron, and random forest to classify songs for stress relief based on musical genres.

Keywords Music genre classification · Relieving stress songs · Machine


learning · Ensemble approach · Classification algorithms

43.1 Introduction

Everybody listens to music and sounds every day in some way. They affect vital organs such as the human brain and heart, as well as stress, the psychological state, behavior, and even child education. Choosing the right music, for the right person and the right purpose, can positively affect health and relationships. On the other hand, selecting inappropriate music may increase the level of depression and stress and have other negative effects on the person.

K. Munkhbat
Database/Bioinformatics Laboratoty, School of Electrical and Computer Engineering, Chungbuk
National University, Cheongju, South Korea
e-mail: khongorzul@dblab.chungbuk.ac.kr
K. H. Ryu (B)
Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City 700000,
Vietnam
e-mail: khryu@tdtu.edu.vn; khryu@chungbuk.ac.kr
Department of Computer Science, College of Electrical and Computer Engineering, Chungbuk
National University, Cheongju 28644, South Korea


Therefore, the perception of music can be different for everyone, as well as for different stages of human life [1].
In recent years, music has been widely used in health care in collaboration with the health sector. In particular, research has been conducted mainly on mental health and heart disease. Machine learning techniques have also achieved high accuracy in the music domain [2]. Thus, we created an ensemble model and obtained the results by comparing machine learning algorithms such as K-Nearest Neighbors (K-NN), Naive Bayes (NB), Multilayer Perceptron (MLP), and Random Forest (RF) [3–6]. K-NN is applied in the context of genre classification in [7]. Fu et al. [8] applied the NB classifier for both music classification and retrieval and compared it with alternative methods. While Cemgil and Gürgen [9] obtained results with an MLP, a new RF method is introduced in [10] for music genre classification.

43.2 Research Motivation

43.2.1 Stress

Sometimes, we live and work without listening to ourselves or paying attention. We need to take care of our health, for example by taking a deep breath, focusing on ourselves, clearing our mind, and doing what we love. Otherwise, stress may accumulate day by day and lead to many mental illnesses such as depression and anxiety [11]. So we need to make positive changes, starting from the small things and the habits of our lives.
Common stress factors. Stress can affect your body, your thoughts, your feelings, and your mood. When people can identify the common factors that create it, it becomes possible to manage and relieve stress. If stress remains in the body, it can cause many health problems, such as hypertension, heart disease, diabetes, and obesity. Exposure to stress may lead to headaches, pain, chest pain, fatigue, intestinal problems, and insomnia. Symptoms such as anxiety, depression, restlessness, weakening concentration, and anger may also be revealed in your behavior.
How to get rid of stress? If stress is detected, it is helpful to relieve and manage it, in particular through exercise, deep breathing, meditation, yoga, tai chi, massage, meeting friends and family, sharing your thoughts, and doing what you love, such as reading, listening to music, etc.

43.2.2 Music

Music cannot completely relieve stress, but it helps to reduce and control stress levels, as shown in Myriam V. Thoma's research [12]. Listening to the right music that suits your mood can directly affect your mood, productivity,

and attitude. For example, fast rhythmic songs improve focus and concentration, while upbeat music makes one more optimistic and positive. Slow rhythmic music can relax the mind and the body's muscles and relieve stress. Recent studies mention that music with 60 beats per minute leads the brain to produce alpha waves of 8–14 cycles per second, which indicates a relaxed but conscious brain [13].
In the presence of music, the increase of serotonin and endorphins in the human brain creates positive effects such as relief of stress and irritation, relaxation, improved concentration, a stronger immune system, reduced blood pressure, a lifted mood, and soothing. We do not need much time listening to or playing music to relieve stress. The main thing is to make it a habit, in other words, part of the daily rhythm: for example, listen to your favorite songs in the morning, after hard work, in the car, or while walking. Trying too hard while listening to or playing music will only add stress, so one needs to simply listen, take deep breaths, and calm down. However, everyone's taste in music is different, so it is best to listen to your own favorite and personally matching music.

43.3 Method and Materials

43.3.1 Dataset Preparation

We have run our tests on the Free Music Archive (FMA) [14], an easily and openly accessible dataset relevant for evaluating several tasks in MIR. It provides MP3-encoded audio data in various sizes as well as metadata. We used the metadata file named tracks.csv, which has a total of 39 features such as ID, title, artist, genres, tags, and play counts for all 106,574 tracks. Some attributes and tuples are presented in Table 43.1.
Each track in this dataset is legally free to download, as the artists decided to release their works under permissive licenses. The purpose of our research is to predict and classify the songs which reduce and relieve stress, so the dataset genres [14] were used to divide the tracks into two labels, stressed and stressed out, as described in the study (see Fig. 43.1).

Table 43.1 Some attributes and tuples in tracks.csv

Track_id   Title                Genres_all    Dur    Artist name
148        Blackout 2           Avant-Garde   2:18   Contradiction
152        Hundred-Year Flood   Punk          3:13   Animal Writes
185        Helen                Lo-Fi         2:23   Ariel Pink's Haunted Graffiti
189        The Thought Of It    Folk          4:43   Ed Askew

Fig. 43.1 Number of genres per track

43.3.2 Data Preprocessing

Data preprocessing is done in two steps using two encodings: ordinal and one-hot. String values in the dataset need to be converted to numerical values in order to apply machine learning algorithms. Ordinal encoding, one of the common encoding methods, is used in the first step of data preprocessing; it means changing the original values to sequential numbers. After that, one-hot encoding, the most common way to encode categorical variables, is used. This method creates more than 200 new columns, which makes the training process slower.
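The sketch below illustrates the two encoding steps with pandas; the column names are hypothetical placeholders, since the paper does not list which of the 39 metadata features receive ordinal and which receive one-hot encoding.

import pandas as pd

def preprocess(tracks: pd.DataFrame, ordinal_cols, onehot_cols):
    # Two-step encoding sketch: ordinal codes first, then one-hot dummies.
    df = tracks.copy()
    for col in ordinal_cols:
        # Ordinal encoding: map each distinct string to a sequential integer code.
        df[col] = df[col].astype("category").cat.codes
    # One-hot encoding of the remaining categorical variables (adds many new columns).
    return pd.get_dummies(df, columns=onehot_cols)

# Hypothetical usage; the actual column subsets depend on the FMA metadata.
# tracks = pd.read_csv("tracks.csv")
# X = preprocess(tracks, ordinal_cols=["artist_name"], onehot_cols=["genres_all"])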

43.3.3 Proposed Methods

In this work, we use the NB, K-NN, MLP, and RF algorithms. K-NN is one of many
supervised learning algorithms used in data mining and machine learning. It can
be used for both classification and regression problems, although in industry it
is more broadly used for classification. NB is a classification
technique based on Bayes' theorem with an assumption of independence among
predictors. That model is easy to build, with no complicated iterative parameter
estimation, which makes it particularly useful for very large datasets. MLP, often
applied to supervised learning problems, is a feedforward artificial neural network
model composed of multiple layers of perceptrons. RF is an ensemble algorithm
that combines several classifiers of the same kind for classifying
objects: it creates a set of decision trees from randomly selected subsets
of the training set and then aggregates the votes from the different decision trees to decide the
final class of the test object. The ensemble approach is a technique that combines
several machine learning techniques into one predictive model in order to decrease
variance (bagging), decrease bias (boosting), or improve predictions (stacking) [15]. The
above algorithms are combined in a stacking ensemble for classifying
songs in this study, and Fig. 43.2 shows the ensemble architecture.
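A minimal sketch of such a stacking ensemble is given below, assuming scikit-learn; the meta-learner (logistic regression) and all hyperparameters are illustrative assumptions, not the configuration used in the experiments.

    from sklearn.ensemble import RandomForestClassifier, StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier

    base_learners = [
        ("nb", GaussianNB()),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("mlp", MLPClassifier(max_iter=500)),
        ("rf", RandomForestClassifier(n_estimators=100)),
    ]
    # The meta-learner is trained on the base predictions (stacking).
    ensemble = StackingClassifier(estimators=base_learners,
                                  final_estimator=LogisticRegression())
    # Usage: ensemble.fit(X_train, y_train); ensemble.predict(X_test)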

Fig. 43.2 Architecture of proposed method

43.4 Results

Table 43.2 compares the classification algorithms with the ensemble model.
We applied the machine learning algorithms naïve Bayes, K-NN, random
forest, and multilayer perceptron in this study. Among the classifiers, RF gives
the highest accuracy, 0.801, while MLP gives the lowest, 0.532.
From this result, we conclude that the ensemble approach is not suitable for
classifying songs from metadata; the best results were obtained with the RF classifier.
The AUC-ROC curves are shown in Fig. 43.3.

Table 43.2 Experimental result scores

Classifier       Accuracy   F1 score   Precision   Recall
MLP              0.532      0.576      0.805       0.549
Naïve Bayes      0.698      0.566      0.559       0.577
K-NN             0.66       0.597      0.715       0.567
Random forest    0.801      0.636      0.967       0.586
Ensemble model   0.783      0.575      0.805       0.548

Fig. 43.3 AUC-ROC of the experiment

43.5 Conclusion

We built a model that predicts stress-relieving songs from the FMA metadata set.
Experimental results were obtained using the MLP, NB, K-NN, and RF machine
learning algorithms. Based on these results, we recommend the RF algorithm for building
a model that predicts stress-reducing songs. Using only the metadata of a song is a
limited basis for this prediction; in future work we will therefore use the audio file
dataset, which provides detailed musical objects such as chords, trills, and mordents, to
build a model that predicts stress-relieving songs.

Acknowledgements This research was supported by Basic Science Research Program through the
National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future
Planning (No. 2017R1A2B4010826).

References

1. Trappe, H.J.: Music and medicine: the effects of music on the human being. Appl. Cardiopulm.
Pathophysiol. 16, 133–142 (2012)
2. Scaringella, N., Zoia, G., Mlynek, D.: Automatic genre classification of music content: a survey.
IEEE Signal Process. Mag. 23(2), 133–141 (2006)
3. Peterson, L.E.: K-nearest neighbor. Scholarpedia 4(2), 1883 (2009)
4. McCallum, A., Nigam, K.: A comparison of event models for naive Bayes text classification.
In: AAAI-98 Workshop on Learning for text Categorization, vol. 752, no. 1, pp. 41–48 (1998)
5. Gardner, M.W., Dorling, S.R.: Artificial neural networks (the multilayer perceptron)—a review
of applications in the atmospheric sciences. Atmos. Environ. 32(14–15) (1998)
6. Liaw, A., Wiener, M.: Classification and regression by randomForest. R News 2(3), 18–22
(2002)
7. Tzanetakis, G., Cook, P.: Musical genre classification of audio signals. IEEE Trans. Speech
Audio Process. 10(5), 293–302 (2002)
8. Fu, Z., Lu, G., Ting, K.M., Zhang, D.: Learning naive Bayes classifiers for music classification
and retrieval. In: 20th International Conference on Pattern Recognition, pp. 4589–4592. IEEE
(2010)
9. Cemgil, A.T., Gürgen, F.: Classification of musical instrument sounds using neural networks.
In: Proceedings of SIU’97. (1997)
10. Jin, X., Bie, R.: Random forest and PCA for self-organizing maps based automatic music genre
discrimination. In: DMIN, pp. 414–417 (2006)
11. Syed, S.A., Nemeroff, C.B.: Early life stress, mood, and anxiety disorders. Chronic Stress
(Thousand Oaks, Calif.) 1 (2017). https://doi.org/10.1177/2470547017694461
12. Thoma, M.V., La Marca, R., Brönnimann, R., Finkel, L., Ehlert, U., Nater, U.M.: The effect of
music on the human stress response. PLoS One. 8(8), e70156 (2013). https://doi.org/10.1371/
journal.pone.0070156
13. University of Nevada, Reno Homepage. https://www.unr.edu/counseling/virtual-relaxation-
room/releasing-stress-through-the-power-of-music
14. Defferrard, M., Benzi, K., Vandergheynst, P., Bresson, X.: FMA: a dataset for music analysis.
arXiv:1612.01840 (2016)
15. Dietterich, T.G.: Ensemble methods in machine learning. In: International Workshop on Mul-
tiple Classifier Systems, pp. 1–15. Springer, Berlin (2000)
16. Song, Y., Simon, D., Marcus, P.: Evaluation of musical features for emotion classification. In:
ISMIR, pp. 523–528 (2012)
17. McCraty, R., Barrios-Choplin, B., Atkinson, M., Tomasino, D.: The effects of different types
of music on mood, tension, and mental clarity. Altern. Ther. Health Med. 4, 75–84 (1998)
Chapter 44
A Hybrid Model for Anomaly-Based
Intrusion Detection System

N. Ugtakhbayar, B. Usukhbayar and S. Baigaltugs

Abstract Anomaly-based systems have become critical to the field of information
technology. In the last few years, the evolution of anomaly-based intrusion detection
systems (IDS), improving detection accuracy, and training data preprocessing have
become especially important to researchers in this field, and the problems of using
anomaly-based and hybrid IDSs have been widely discussed.
The anomaly-based approach is more effective than the signature-based approach against novel
attacks on computer networks, although in some cases a signature-based system
identifies attacks more quickly than an anomaly-based one. In this work, the authors apply
preprocessing to KDD 99 and select features using information gain, together with a dataset
they collected, named NUM15; irrelevant features and redundant data
are removed, which decreases processing time and improves the performance of the IDS. After
that, naive Bayes and Snort are used in parallel to classify traffic, compare their results, and train
the machine. This hybrid model combines anomaly and signature
detection to accomplish detection of network anomalies. The results show that
the proposed hybrid model can increase the accuracy and can detect novel intrusions.

Keywords IDS hybrid model · IDS · Anomaly detection · Snort

44.1 Introduction

Computer security has become vulnerable because of the massive expansion of computer
networks and the rapid emergence of hacking tools and intrusion incidents. As technology
rolls out, these attacks make network security more vulnerable, and intrusion detection
systems are therefore introduced to eliminate these threats.

N. Ugtakhbayar (B) · B. Usukhbayar


National University of Mongolia, Ulaanbaatar, Mongolia
e-mail: 44911.n@gmail.com
S. Baigaltugs
Mongolian University of Science and Technology, Ulaanbaatar, Mongolia


Intrusion detection systems are designed to shield the system from malicious attacks and
network vulnerabilities [1].
In the last few years, network and Internet technologies have been widely
applied in industry and other sectors. Consequently, network intrusions have
increased, with their types and forms constantly changing, so network intrusion
and information security are challenges of Internet use. Although many
information security technologies such as encryption, authentication, authorization,
intrusion detection, and deception can protect network systems, they are
still unable to detect novel attacks, and many undetected anomalies and intrusions,
known as zero-day attacks, remain. Intrusion detection systems have been applied to
detect network intrusions and anomalies. Signature-based network intrusion detection
systems (NIDS) capture and analyze network traffic to detect known attacks
by comparing it against attack signatures; NIDS capture the packets passing
through the network devices [2]. Intrusion detection mechanisms are divided into two
types, anomaly detection and misuse detection (signature-based systems) [3, 4], as well as
into host-based and network-based IDS [3]. Misuse detection is an approach in which each
suspected attack is compared to a set of known attack signatures [5]; it is used for
detecting known attacks [3]. It detects attacks that match its signature
database, but it cannot detect unknown attacks [3], which are mostly
zero-day attacks. Anomaly detection systems are
divided into two types: supervised and unsupervised [6]. In the supervised anomaly
detection method, a model of the normal behavior of the system or network is established
by training with a labeled dataset; these behavior models are then used to classify new
network connections and to distinguish malign or anomalous behavior from normal behavior.
Unsupervised anomaly detection approaches work without any labeled training data,
and most of them detect malign activities by clustering or outlier detection
techniques [6]. The role of anomaly detection is to identify data points,
substances, events, observations, or attacks that do not conform to the expected pattern
of a given collection; this technique is based on defining normal network behavior [3, 7].
Data preprocessing and classification are important tasks in machine learning,
and most of the proposed techniques aim at high overall classification accuracy.
Even though researchers have introduced many models for dealing with network intrusion
behavior, most of them struggle with dangerous and rare attacks and have several other
problems. The authors therefore decided to utilize data mining methods for solving the
problem of network anomalies and intrusions for
the following reasons:
• High-speed processing of networks' big data using selected features (near
real-time classification);
• Detection accuracy increases with dataset preprocessing;
• It is appropriate for discovering hidden and unseen information in novel network
attacks;
• Prevention of a single point of failure.
In this paper, the authors propose a novel hybrid model to detect network attacks
while addressing the reasons above. The main contribution of this work with respect to other

existing models follows from these points. In this model, the authors focus on
data mining as a data preprocessing technique and on continuous machine learning
to increase detection rate and accuracy, and both signature-based and anomaly-based
systems are used. In previous work by the authors [4], data on some novel
attacks after 2010 were collected with the Backtrack system in a testing network
environment. In this research, the KDD 99 dataset is used, and the collected [4]
dataset is named NUM15. At the outset, the KDD 99 dataset was preprocessed before
the experiment. KDD 99 is widely used in computer network anomaly detection; it
consists of nearly 5 million training connection records labeled as intrusions or not,
and a separate testing dataset consisting of known and unseen attacks [8].
The methodology section presents the training and testing model and the information gain
computed on the NUM15 and KDD 99 datasets. Finally, the accuracy and attack detection
of the proposed hybrid model are reported.

44.2 Related Work

Signature-based [9] and anomaly-based [10] network IDSs have been studied since
1980. There are many research papers on IDSs using various algorithms and data mining
techniques to improve accuracy and decrease false alarms. The IDS was first
suggested by Anderson [9], based on applying statistical methods
to analyze user behavior. In 1987, a prototype IDS was proposed [11], and the idea
of the IDS spread progressively. A couple of research papers [10, 12] focus on data
mining for network intrusion and anomaly detection: their idea is to apply data mining
programs for classification and frequent episodes to training data in order to compute misuse
and anomaly detection models that accurately capture normal and anomalous
behavior. Network IDSs are analyzed as packet-based or flow-based methods according
to the source of data. Packet-based IDSs mostly provide signature-based
systems with valuable information to detect attacks, while flow-based data supports anomaly-
based IDSs in detecting anomalies [13]. A network IDS uses packets or
flows in the network to detect anomalies or network attacks. The work in [14] presented three
layers of multiple classifiers for intrusion detection, which improved the overall
accuracy; the authors applied naive Bayes, fuzzy K-NN, and backpropagation NN for
generating the decision boundary. The experiment used the KDD 99 dataset with 30 features
to test the model. Neelam et al. [15] proposed a layered approach for improving the
efficiency of the attack detection rate, using domain knowledge and a sequential search
to reduce the feature set and a naive Bayes classifier for classifying four classes
of attack types. Various works used the DARPA and KDD 99 IDS evaluation datasets in
their experiments. Various hybrid network IDSs have been proposed for detecting novel
attacks. Gómez et al. [16] presented a hybrid IDS, an anomaly preprocessor extending
Snort, named H-Snort. Cepheli et al. [17] introduced a novel hybrid intrusion detection
system for DDoS attacks named H-IDS; it was benchmarked on the DARPA dataset and
a commercial bank dataset, and the true positive rate increased by 27.4%.

Patel et al. [3] designed a hybrid IDS consisting of six components that uses the Snort IDS
with the Aho–Corasick algorithm.
Hussein et al. [18] hierarchically combined signature-based Snort with anomaly-based naïve
Bayes methods. They used the KDD 99 dataset and the Weka program for
testing the proposed system, adopting Bayes Net, J48 graft,
and naïve Bayes in the anomaly-based system and comparing the results of these
anomaly-based classifiers. Their system achieved about a 92% detection rate with naïve
Bayes and required about 10 min to build the model.
Dhakar et al. [19] combined two classifiers, tree-augmented naïve Bayes
(TAN) and reduced error pruning (REP). The TAN classifier is used as the base classifier,
while the REP classifier is the meta-classifier that learns from the TAN classifier. Their
proposed hybrid model shows 99.96% accuracy on the KDD 99 dataset.
MIT's AI2 model [20] introduced both supervised and unsupervised methods and
combined them with a security analyst in the detection system. The features used
in that work include a big data behavioral analytics platform, an ensemble of
outlier detection methods, a mechanism to obtain feedback from security analysts,
and a supervised learning module. The proposed system was tested on a real-world
dataset consisting of 3.6 billion log lines.

44.3 Methodology

The KDD 99 dataset has four categories of attacks, viz., DoS, Probe, U2R, and R2L.
Each data instance contains 41 features, and the data are separated into training
and testing datasets. The benchmarking dataset consists of different components.
The researchers used KDD 99's 10% labeled dataset and the NUM15 dataset for
training. The NUM15 dataset has four categories of attacks and 300 thousand instances,
each containing 41 features. Information gain is a measure of the
expected reduction in entropy.

44.3.1 Naive Bayes Classifier

The naive Bayes classifier is based on a directed acyclic graph and is a broadly utilized
method for classification purposes [21]. The network consists of nodes and arcs
representing variables and the interrelationships among the variables. A Bayesian network
evaluates the relationships between these features; the constructed network
is called the profile of the system, and support is determined using this profile. The profile
describes the current state of the system through its variables, and if the probability
of an observed occurrence is less than the threshold, an alarm is raised.

The Naive Bayes classifier combines the probability model with a decision rule.
The corresponding classifier, a Bayes classifier, is the function that assigns a class
label y = C k for some k as follows:


\hat{y} = \arg\max_{k \in \{1, \ldots, K\}} p(C_k) \prod_{i=1}^{n} p(x_i \mid C_k)    (44.1)

In this work, the dataset has been classified into only two classes: normal = C_1
and attack = C_2.
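The decision rule of Eq. (44.1) for these two classes can be sketched as follows; the prior and likelihood values are placeholders, and log-probabilities are used only for numerical convenience.

    import numpy as np

    log_prior = np.log(np.array([0.8, 0.2]))           # p(C_k), assumed values
    # log p(x_i | C_k) for the feature values of one connection record:
    log_likelihood = np.log(np.array([[0.6, 0.3],       # feature 1
                                      [0.7, 0.1],       # feature 2
                                      [0.5, 0.4]]))     # feature 3

    # Summing logs corresponds to the product in Eq. (44.1).
    log_posterior = log_prior + log_likelihood.sum(axis=0)
    predicted_class = int(np.argmax(log_posterior))     # 0 = normal, 1 = attack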

44.3.2 Information Gain Ratio

Let S be the training set with its corresponding labels. Assume that there are m
classes, that the training set contains S_i samples of class i, and that S is the total number of
samples. The expected information needed to classify a given sample
is calculated using the following formula:
I(S_1, S_2, \ldots, S_m) = - \sum_{i=1}^{m} \frac{S_i}{S} \log_2\!\left(\frac{S_i}{S}\right)    (44.2)

A feature F with values {f_1, ..., f_v} can divide the training set into v subsets {S_1, ..., S_v},
where S_j is the subset in which each sample has the value f_j for feature F.
The information gain for F can then be calculated as

\mathrm{IGR} = \mathrm{Gain}(F) = I(S_1, \ldots, S_m) - E(F)    (44.3)
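A small sketch of the entropy and information gain computation behind Eqs. (44.2) and (44.3) is shown below, with toy label and feature arrays; it is illustrative only and not the authors' code.

    import numpy as np

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))               # Eq. (44.2)

    def information_gain(feature_values, labels):
        total = entropy(labels)
        expected = 0.0
        for v in np.unique(feature_values):           # subsets S_1 ... S_v
            mask = feature_values == v
            expected += mask.mean() * entropy(labels[mask])   # E(F)
        return total - expected                       # Eq. (44.3)

    labels = np.array(["normal", "attack", "attack", "normal", "attack"])
    protocol = np.array(["tcp", "udp", "tcp", "tcp", "udp"])
    print(information_gain(protocol, labels))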

44.4 System Architecture

44.4.1 Architecture of the Proposed System

The model can be divided into two parts. One of them is the offline training module,
named "Research Phase," as shown in Fig. 44.1; the second is the online classification
module, named "Testing Phase," as shown in Fig. 44.2.
In the research phase, data preprocessing and feature selection are performed.
It was found that one of the feature sets resulted in better
accuracy. The machine was then trained using naive Bayes on the training dataset. Using
this design (Fig. 44.1), feature selection, data cleaning (removal of duplicated records),
and conversion of the collected traffic to the arff format required by the Weka program
were performed. Before converting the data, all discrete data were converted to

Fig. 44.1 Proposed offline system design

Fig. 44.2 Proposed IDS’s hybrid models working flow



continuous values (data conversion), which are used for normalization. Subsequently, the dataset was
split into training and testing datasets. For the training dataset, all normal traffic was
chosen, while the testing dataset consists of KDD 99 attacks and the collected dataset.
The proposed design uses both anomaly- and signature-based methods in parallel,
as shown in Fig. 44.2. The Snort IDS [22] was implemented as the signature-based system,
and its results are compared with the machine classification results. The results of the
detectors are collected in a database, labeled signature (S) for Snort and anomaly (A) for
the anomaly-based system. The rationale of this design is that a signature-based IDS has
higher accuracy on known attacks than an anomaly-based system, so the trained machine
can be benchmarked against the signature-based IDS. Apart from reducing detection
delay, this also increases the detection accuracy over time. The system compares
the outputs of the anomaly-based and signature-based systems: if the results are the same, the
packet is not saved into the analysis table, whereas if the results are different, then
the packet is saved into the analysis table. The outputs of the signature- and anomaly-
based systems can then be examined by the network administrator. All differing results are
collected into the analysis table, followed by training and repetition of the first phase.
The model compares the results of both signature and anomaly systems
using the time stamp, session ID, and source and destination IP addresses, so that differing
results can be reassessed and trained into the machine.
For training and classification, a machine with an Intel second-generation i5 2.4 GHz
processor, 8 GB of DDR3 RAM, and a 1 TB SATA hard disk was used. The analysis
table is created in MySQL version 5.7 on Ubuntu 16.
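The comparison step described above can be sketched as follows; the record field names and the in-memory analysis table are assumptions made for illustration (in the actual system the table resides in MySQL).

    # Match detector outputs on (timestamp, session id, src, dst); disagreements
    # are kept in the analysis table for later review and retraining.
    def compare_detections(snort_results, anomaly_results, analysis_table):
        snort_index = {(r["ts"], r["session"], r["src"], r["dst"]): r["label"]
                       for r in snort_results}                 # signature (S) verdicts
        for r in anomaly_results:                              # anomaly (A) verdicts
            key = (r["ts"], r["session"], r["src"], r["dst"])
            s_label = snort_index.get(key)
            # Same verdict: nothing stored. Different verdict: keep for review.
            if s_label is not None and s_label != r["label"]:
                analysis_table.append({"key": key,
                                       "signature": s_label,
                                       "anomaly": r["label"]})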

44.4.2 Data Preprocessing

The KDD 99 dataset consists of several types of features, discrete and continuous,
with varying resolutions and ranges. The authors therefore mapped the symbolic
features to integers from 1 to N, where N is the number of symbols, and each symbol
and each value were then linearly scaled to the range [0, N]. The dataset includes five
classes: four attack types and normal. The dataset is used in many methods
such as semi-supervised algorithms [23, 24], IDS benchmarking [4, 24], and so on.
The KDD 99 dataset has nine discrete features, namely, protocol type, service, flag,
land, logged in, root shell, su attempted, host login, and guest login [4, 25]. Euclidean
distance was used to normalize the dataset; normalization is required
because the scales of the majority of numerical features in the dataset are not the same.
Subsequently, the authors selected an optimal feature set using infor-
mation gain ranking, and the next step is to train the machine with the training dataset.
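A sketch of this preprocessing is shown below; pandas is an assumption made for illustration (the authors used Weka), the file path is assumed, and only a few representative columns are handled.

    import pandas as pd

    def encode_symbolic(column):
        # Map each symbol to an integer 1 .. N.
        symbols = {s: i + 1 for i, s in enumerate(sorted(column.unique()))}
        return column.map(symbols)

    def min_max_scale(column):
        lo, hi = column.min(), column.max()
        return (column - lo) / (hi - lo) if hi > lo else column * 0.0

    kdd = pd.read_csv("kddcup.data_10_percent", header=None)   # assumed path
    # Standard KDD 99 column order is assumed: 0 = duration,
    # 1 = protocol_type, 2 = service.
    kdd[1] = encode_symbolic(kdd[1])
    kdd[2] = encode_symbolic(kdd[2])
    kdd[0] = min_max_scale(kdd[0])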

Fig. 44.3 Proposed models’ feature extraction results

44.4.3 Feature Extracting

The first step of the proposed model is to extract 41 features from the real network
traffic. For this, code was written for feature extraction, and the results are
shown in Fig. 44.3.

44.4.4 Calculated Metrics

– True positive (TP)—True positive rate measures the proportion of actual positives
which are correctly identified.
– True negative (TN)—An event when no attack has taken place and no detection is
made.
– False positive (FP)—An event signaling an IDS to produce an alarm when no
attack has taken place.
– False negative (FN)—An event when the IDS allows an actual intrusive action to pass as
nonintrusive behavior.
– Accuracy—Measures the correctness of the IDS detection rate; for the existing system it
was calculated using the following formula:

\mathrm{Accuracy} = \frac{TN + TP}{TN + TP + FN + FP}    (44.4)

44.5 Results and Discussion

This section evaluates the performance of the proposed model. The first step is infor-
mation gain ranking, which selected 19 features, as shown in Table 44.1.
The objective of the research phase is to create a trained machine for network
attack classification to be used in the next phase. This phase consists of three steps,
as shown in Fig. 44.4. The first step is network packet capture with Wireshark,
with the interface in promiscuous mode; the packets are stored in a pcap-formatted file. The next

Table 44.1 Selected features after information gaining

No   Feature names         Types
1    duration              Cont.
2    protocol_type         Disc.
3    service               Disc.
4    src_bytes             Cont.
5    land                  Disc.
6    wrong_fragment        Cont.
7    num_failed_logins     Cont.
8    logged_in             Disc.
9    root_shell            Cont.
10   num_file_creations    Cont.
11   num_outbound_cmds     Cont.
12   is_guest_login        Disc.
13   count                 Cont.
14   srv_count             Cont.
15   serror_rate           Cont.
16   srv_serror_rate       Cont.
17   diff_srv_rate         Cont.
18   dst_host_count        Cont.
19   dst_host_srv_count    Cont.

Fig. 44.4 Research phase flowchart

step is to preprocess the data and extract the features from the pcap file; these are the
selected features.
The accuracies of the feature selection and the classifiers are given in Table 44.2. The
results show that, when classifying the dataset with all 41 features, naive Bayes obtains an
average accuracy of 98%, versus 95.8% when using the selected features; these results
were obtained after training three times. Table 44.2 compares the accuracy with
41 and with 19 features; as shown there, all features give better results than the
selected features. The network IDS research was developed using Weka (Waikato Environment for

Table 44.2 Accuracy with the selected 19 features and with all features

Traffic class   Naive Bayes
                Selected features and preprocessing (%)   All features (%)
Normal          97.2                                       98.3
Attack          94.4                                       97.7
Overall         95.8                                       98

Table 44.3 Selected features' dependency ratio by attack class

Attack classes   Relevant features   Feature names                   Dependency ratios
Normal           29                  same srv rate                   0.89
                 30                  diff srv rate                   0.88
DoS              3                   service                         0.86
                 5                   source bytes                    0.91
                 8                   wrong fragment                  0.96
Probe            4                   flag                            0.81
                 28                  srv serror rate                 0.8
                 30                  diff srv rate                   0.793
                 36                  dst host same src port rate     0.791
U2R              3                   service                         0.73
                 14                  root shell                      0.91
                 24                  srv count                       0.76
                 36                  dst host same src port rate     0.69
R2L              3                   service                         0.96
                 6                   destination byte                0.845
                 11                  failed login                    0.68
                 39                  dst host srv serror rate        0.92

Knowledge Analysis) [26]. Weka is an open-source Java package
which contains machine learning algorithms for data mining.
Afterward, the authors calculated the most relevant features for each attack class
in the KDD 99 and NUM15 datasets using formula (44.5) [27], as shown in Table 44.3.
After that experiment, the bold features in Table 44.3 were added to our feature set.

\text{Dependency ratio} = \frac{HVF}{TIN} - \frac{OTH}{TON}    (44.5)
Finally, in the following experiment, we selected 25 features for use and
summarized the experimental results in two classes, as shown in Table 44.4. Overall,
the 25 features give the best results after training the machine three times (comparable
to all 41 features). Moreover, applying the proposed model further increases execution
speed and decreases training time by about 25%.

Table 44.4 Selected features after using formula (44.5)

Traffic class   Selected 25 features (%)
Normal          97.8
Attack          97.1
Overall         97.5

Fig. 44.5 Proposed model's attack detection time (bar chart of detection time in seconds per attack class—DoS, Probe, U2R, and R2L—comparing the signature-based system and the anomaly-based system)

In the following experiment, we calculated the detection time for both the signature-
and anomaly-based systems. The detection time of the anomaly-based system depends
on the number of features; in this experiment, the selected 25 features were adopted. The result is
shown in Fig. 44.5.
In the figure, the signature-based system has a slower detection time for probe-type attacks,
while the anomaly-based system is slower than the signature-based system for the DoS, U2R, and
R2L types. Consequently, the proposed hybrid model can classify the traffic more quickly
than other hierarchical models.
To better evaluate the performance of the proposed method, we compared the
proposed model with other researchers' results. Table 44.5 shows the detection ratio
comparison of the proposed model with state-of-the-art methods. The detection ratio
of our model is higher than two of the compared works and lower than one.
In summary, we adopted the naïve Bayes technique for the anomaly-based system so that
new types of attacks could be trained with the NUM15 dataset. We conclude
that running both signature- and anomaly-based systems together is more effective, based
on our results and empirical data. Because Snort can detect known intrusions, the machine can
be benchmarked against Snort. Besides, some features have no relevance for anomaly-
based intrusion detection, while others can increase the accuracy. In
this regard, we postulate that the combination of these solutions can save time to
Table 44.5 Comparison of detection ratio

Model                                       Overall detection ratio (%)
Hussein et al. hierarchical model [18]      92
Dhakar et al. combined model [19]           99.96
Aslahi-Shahri et al. model [28]             97.3
Our proposed hybrid model                   97.5

detect intrusions when they are used in parallel. The strengths of the proposed model
are its improved accuracy compared with some existing methods and models, its quick
training time, and the ease of retraining.

44.6 Conclusion

This study has proposed a hybrid approach that trains the machine effectively
using a processed dataset and a signature-based IDS. The proposed system is a new
kind of hybrid system that combines signature-based and anomaly-based
detection approaches. The signature-based detection system is the widely used real-time
network IDS Snort. Snort is applied first to detect known intrusions
in real time and has a low false positive rate. The proposed hybrid system combines
the advantages of these two detection methods: because Snort can detect
known intrusions, the machine can be benchmarked against Snort. Also, some features have
no relevance for anomaly-based intrusion detection, while others can
increase the accuracy. Our feature selection process is used to reduce the number
of features to 25, and the naive Bayes testing model then works on those features. Following
are the advantages of our hybrid model:
• The model is easy to install and maintain.
• Re-modeling of naïve Bayes is easy.
• The model increases the accuracy.
• The model was designed with a fault-tolerant architecture.
The experimental results show that the proposed hybrid system can increase the
accuracy and detect novel intrusions after multiple trainings using the corrected
analysis table. The evaluation results show that the accuracy of the proposed model
is 97.5%.
Moreover, it can increase execution speed and decrease processing time. The next
task is to study the performance and the computation speed of the classification and to
compare with other methods.

References

1. Reazul Kabir, Md., Onik, A.R., Samad, T.: A network intrusion detection framework based on
Bayesian network using wrapper approach. Int. J. Comput. Appl. 166(4), 13–17 (2017)
2. Ashoor, A.S., Gore, S.: Importance of intrusion detection system (IDS). Int. J. Sci. Eng. Res.
1–7 (2005)
3. Patel, K.K., Buddhadev, B.V.: An architecture of hybrid intrusion detection system. Int. J. Inf.
Netw. Secur. 2(2), 197–202 (2013)
4. Ugtakhbayar, N., Usukhbayar, B., Nyamjav, J.: Improving accuracy for anomaly based IDS
using signature based system. Int. J. Comput. Sci. Inf. Secur. 14(5), 358–361 (2016)
5. Pathan, A.K.: The state of the Art in Intrusion Prevention and Detection. CRC Press (2014)

6. Pajouh, H.H., Dastghaibyfard, G.H., Hashemi, S.: Two-tier network anomaly detection model:
a machine learning approach. J. Intell. Inf. Syst. 61–74 (2017)
7. Naga Surya Lakshmi, M., Radhika, Y.: A complete study on intrusion detection using data
mining techniques. IJCEA IX(VI) (2015)
8. Stampar, M., et al.: Artificial Intelligence in Network Intrusion Detection
9. Anderson, J.P.: Computer security threat monitoring and surveillance. In: Technical report,
James P. Anderson Co., Fort Washington, Pennsylvania (1980)
10. Yorozu, Y., Hirano, M., Oka, K., Tagawa, Y.: Electron spectroscopy studies on magneto-optical
media and plastic substrate interface. IEEE Trans. J. Mag. Jpn. 2, 740–741 (1987) [Digests 9th
Annual Conference on Magnetics Japan, p. 301, 1982]
11. Zenghui, L., Yingxu, L.: A data mining framework for building Intrusion detection models
based on IPv6. In: Proceedings of the 3rd International Conference and Workshops on Advances
in Information Security and Assurance. Seoul, Korea, Springer-Verlag (2009)
12. Young, M.: The Technical Writer’s Handbook. University Science, Mill Valley, CA (1989)
13. Androulidakis, G., Papavassiliou, S.: Improving network anomaly detection via selective flow-
based sampling. Commun. IET 399–409 (2008)
14. Te-Shun, C., Fan, J., Kia, M.: Ensemble of machine learning algorithms for intrusion detection,
pp. 3976–3980
15. Neelam, S., Saurabh, M.: Layered approach for intrusion detection using Naive Bayes classifier.
In: Proceedings of the International Conference on Advances in Computing, Communications
and Informatics, India (2012)
16. Gómez, J., Gil, C., Padilla, N., Baños, R., Jiménez, C.: Design of Snort-based hybrid intrusion
detection system. In: IWANN 2009, pp. 515–522 (2009)
17. Cepheli, Ö., Büyükçorak, S., Kurt, G.K.: Hybrid intrusion detection system for DDoS attacks.
J. Electr. Comput. Eng. 2016 (2016). Article ID 1075648
18. Hussein, S.M., Mohd Ali, F.H., Kasiran, Z.: Evaluation effectiveness of hybrid IDS using Snort
with Naïve Bayes to detect attacks. In: IEEE DICTAP 2nd International Conference, May 2012
19. Dhakar, M., Tiwari, A.: A novel data mining based hybrid intrusion detection framework. J.
Inf. Comput. Sci. 9(1), 37–48 (2014)
20. Veeramachaneni, K., Arnaldo, I., Cuesta-Infante, A., Korrapati, V., Bassias, C., Li, K.: AI2:
training a big data machine to defend. In: 2nd IEEE International Conference on Big Data
Security (2016)
21. Aburomman, A.A., Reaz, M.B.I.: Review of IDS development methods in machine learning.
Int. J. Electr. Comput. Eng. (IJECE) 6(5), 2432–2436 (2016)
22. Snort. http://www.snort.org
23. Pachghare, V.K., Khatavkar, V.K., Kulkarni, P.: Pattern based network security using semi-
supervised learning. Int. J. Inf. Netw. Secur. 1(3), 228–234 (2012)
24. Hlaing, T.: Feature selection and fuzzy decision tree for network intrusion detection. Int. J.
Inform. Commun. Technol. 1(2), 109–118 (2012)
25. Wang, Y., Yang, K., Jing, X., Jin, H.L.: Problems of KDD Cup 99 dataset existed and data
preprocessing. Appl. Mech. Mater. 667, 218–225 (2014)
26. Weka. http://weka.sourceforge.net
27. Olusola, A.A., Oladele, A.S., Abosede, D.O.: Analysis of KDD’99 intrusion detection dataset
for selection of relevance features. In: Proceedings of the WCECS 2010, USA (2010)
28. Aslahi-Shahri, B.M., Rahmani, R., Chizari, M., Maralani, A., Eslami, M., Golkar, M.J.,
Ebrahimi, A.: A hybrid method consisting of GA and SVM for intrusion detection system.
Neural Comput. Appl. 27(6), 1669–1676 (2016)
29. Maxion, R.A., Roberts, R.R.: Proper use of ROC curves in intrusion/anomaly detection. Tech-
nical report CS-TR-871 (2004)
Chapter 45
A Method for Precise Positioning
and Rapid Correction of Blue License
Plate

Jiawei Wu, Zhaochai Yu, Zuchang Zhang, Zuoyong Li, Weina Liu
and Jiale Yu

Abstract To alleviate the problems of slow speed and weak correction ability
of existing license plate correction methods under complex conditions, this paper
presents a faster license plate positioning method based on the color component com-
bination and color region fusion and develops a more accurate correction algorithm
of blue license plate using probabilistic Hough transform and perspective transform.
The proposed methods utilize the characteristics of white characters on the blue back-
ground of the Chinese license plate. Color component combination in HSV and RGB
color spaces and image thresholding are first performed to obtain the background
region of the blue license plate and its character region. Then, both regions are fused
to obtain complete and accurate license plate region. And finally, edge detection,
probabilistic Hough transform, and perspective transform are performed to achieve
rapid license plate correction. Experimental results show that average correction time
of blue license plate obtained by the proposed method is 0.023 s, and the average
correction rate is 95.0%.

Keywords License plate positioning · License plate correction · Color component


combination · Color region fusion

J. Wu · Z. Yu (B) · Z. Zhang · Z. Li (B) · J. Yu


College of Computer and Control Engineering, Minjiang University, Fuzhou 350121, China
e-mail: 269182663@qq.com
Z. Li
e-mail: fzulzytdq@126.com
J. Wu · Z. Yu · Z. Zhang · Z. Li · J. Yu
Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang
University, Fuzhou 350121, China
W. Liu
Department of Computer Engineering, Fujian Polytechnic of Information Technology, Fuzhou
350003, China


45.1 Introduction

The license plate recognition system is a key part of an intelligent traffic system and has a
wide range of application scenarios, such as car theft prevention, traffic flow control,
parking fee management, red-light electronic police, and highway toll stations. The
general steps of a license plate recognition system are rough license
plate positioning, license plate correction, accurate license plate positioning, and
license plate character recognition. These steps are closely related,
and the quality of the license plate correction has an important impact
on the subsequent steps: good correction results can greatly reduce the difficulty
of the subsequent processing and improve the accuracy of the license plate
character recognition. Therefore, license plate correction is an important step in
the license plate recognition system.
In practice, there are three kinds of situations that need to be corrected, namely,
horizontal slant, vertical slant, and mixed slant. Chinese researchers have put forward
many correction methods for these three situations, which can be mainly divided into two
approaches: (1) methods based on the traditional Hough transform [1] and (2) Radon
transform-based methods [2]. The traditional Hough transform method relies on the
license plate frame to determine the slant angle, so it cannot handle characters
sticking to the frame or a missing license plate frame. The Radon transform-based
method has the drawbacks of heavy computation and slow speed and cannot adapt
to some complex conditions. Some researchers have improved the above
two methods [3–6], but most of the improvements still cannot achieve real-time correction
under complex conditions.
To solve the above problems, we propose an accurate positioning algorithm for
roughly located license plates based on color model component combination and
color region fusion, and a rapid correction algorithm based on the probabilistic Hough
transform [7] and perspective transform [8], which can rapidly and accurately cor-
rect blue license plates under complex conditions. We use SSD
[9] to obtain the rough license plate images as the input of this research.

45.2 Methods

We first scale all license plate images to the same size and study them at a width
of 250 px. Considering that it is difficult to accurately segment the complete license
plate with the RGB color model alone, and that an incomplete license plate segmentation has a
great influence on the subsequent correction, we propose a more accurate method
of license plate positioning, which integrates more low-level information and is
more robust. We have also improved the license plate correction process, greatly
improving the correction speed and accuracy. The flowchart is shown in Fig. 45.1.

(Flowchart: input image → color space conversion → construct combined image → get the blue region; threshold segmentation → get the white region → fused color region; edge detection → license plate corner location → perspective transformation → output the blue license plate)

Fig. 45.1 Flowchart of the entire algorithm. First, construct the channel combination diagram and
use threshold segmentation to get blue region. Next, the threshold segmentation is used to obtain
the white character region. Then, fuse the two regions to obtain the complete accurate license plate
region, and finally correct the region

45.2.1 Accurate Location of License Plate

An appropriate component combination of different color models can enlarge the dif-
ference between foreground and background and reduce the complexity of the back-
ground, so as to facilitate image segmentation. The traditional algorithm [10] takes
advantage of the color features of the blue license plate, for instance that the gray value
of the blue component is larger than that of the red component in the RGB model. Based on this
observation, it can be described as

Ib = Max{0, B − R} (45.1)

where I_b is the result, B is the blue component of the RGB color model, and R is the red
component. Then, the Otsu algorithm [11] was used to binarize the
combined image to obtain the blue region of the license plate.
However, this algorithm fails in some cases. As shown in Fig. 45.2, it
is not robust to complex conditions such as characters adhering
to the border and dim illumination; it cannot obtain the complete license plate
region, and the extracted blue region is also incomplete.
In order to solve the problems of the traditional algorithm, we convert
the preprocessed license plate image into the HSV color model and obtain single-
channel images of the hue, saturation, and value components through channel separation.
After careful observation of the components of the RGB and HSV
color models, we found that the gray value obtained by subtracting the red component from the
value component is large in the blue region, as shown in Fig. 45.3. Therefore, it is

Fig. 45.2 Example of subtracting of a component of RGB color model. a Original license plate
images, b the results of the subtracting between blue component and red component of RGB color
model

Fig. 45.3 The results of three representative components combined images from left to right:
original image, value component in HSV color model, red component in RGB color model, gray
image of subtracting of value component and red component, thresholding result of gray image

Fig. 45.4 Example of the LCRS method. The left side is the original image, the middle is the blue
region binary image, and the right side is the result of LCRS method

easier to obtain the blue area of the license plate by constructing a composite image
with the value component of HSV color model and the red component of RGB color
model. It can be described as

Ib = Max{0, V − R} (45.2)

where V is the value component of HSV color model and R is the red component of
RGB color model.
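A minimal OpenCV sketch of Eq. (45.2) followed by Otsu thresholding is given below; the file name and variable names are illustrative, not the authors' code.

    import cv2
    import numpy as np

    plate = cv2.imread("plate.jpg")                      # rough plate crop (BGR)
    hsv = cv2.cvtColor(plate, cv2.COLOR_BGR2HSV)
    v = hsv[:, :, 2].astype(np.int16)                    # value component
    r = plate[:, :, 2].astype(np.int16)                  # red channel (BGR order)
    combined = np.clip(v - r, 0, 255).astype(np.uint8)   # I_b = max{0, V - R}
    _, blue_region = cv2.threshold(combined, 0, 255,
                                   cv2.THRESH_BINARY + cv2.THRESH_OTSU)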
However, a new problem arises, as shown in Fig. 45.4: for license plate images
taken from a car whose body is blue, it is necessary to remove the blue
background around the license plate. We propose a large connected region
screening (LCRS) method to obtain the binary map of the blue region. The LCRS
method is defined as follows:

Step1: Find the outer contours of all connected regions in the binary image.
Step2: Find the minimum outer rectangle corresponding to each outer contour.

Table 45.1 Threshold value segmentation comparison table

Hmin   Smin   Vmin   Hmax   Smax   Vmax   Interference   Character
100    100    100    255    255    255    ✓              ✗
0      0      0      180    125    255    ✓              ✓
0      0      120    180    125    255    ✓              ✓
0      0      150    180    125    255    ✓              ✓
0      0      150    180    125    255    ✗              ✓

Step3: Calculate the width and height of each enclosing rectangle.


Step4: Judge whether the width and height of each enclosing rectangle conform to
Eq. (45.3). If the formula is not satisfied, the region is a background region and is removed.


I_R = \begin{cases} I_R, & \text{if } w_{rect} < K \cdot w \text{ and } h_{rect} < K \cdot h \\ I_{BG}, & \text{otherwise} \end{cases}    (45.3)

where K is a ratio relative to the size of the binary image; in this paper, the value of K is 0.9.
w_rect is the width of the enclosing rectangle and h_rect is its height, while w and h are the
width and height of the binary image. The reason for this K value is that the background area
is caused by the color of the car body, which is usually a pure color, so the width and height
of the minimum external rectangle of the background area take a large proportion of the image.
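The LCRS screening can be sketched with OpenCV as follows; an axis-aligned bounding box is used here for simplicity in place of the minimum enclosing rectangle, and the OpenCV 4 findContours signature is assumed.

    import cv2

    def lcrs(binary, k=0.9):
        h, w = binary.shape[:2]
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        for cnt in contours:
            x, y, cw, ch = cv2.boundingRect(cnt)
            # Eq. (45.3): regions almost as wide or high as the image
            # are treated as background and removed.
            if cw >= k * w or ch >= k * h:
                cv2.drawContours(binary, [cnt], -1, 0, thickness=-1)
        return binary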
However, some license plate characters adhere to the license plate frame.
The blue region alone cannot handle characters sticking to the edge of
the license plate, and missing corners will occur. If the character region
is added, a more complete license plate area can be obtained. The rule for
obtaining the white region can be formulated as

255, if Hmin < h < Hmax & Smin < s < Smax & Vmin < v < Vmax
Iw =
0, otherwise
(45.4)

The thresholds were determined through a large number of experiments; the comparison
of threshold values is given in Table 45.1, where ✓ means the result contains the
corresponding region and ✗ means it does not. The first threshold set cannot segment
the character region and includes the interference region. The second to fourth sets can
segment the character region but still include the interference region, while the fifth set
obtains the character region without the interference region. The threshold values used by
the algorithm in this paper are the fifth set in Table 45.1.
After obtaining the white region binary image, the white background region is
removed by the LCRS method, and the obtained white region binary image is merged
with the previously obtained blue region binary image. The merge process can be
formulated as

Fig. 45.5 Example of two-region fusion. a Original image, b the result of subtracting the value
component from the red component, c the white region of thresholding result, d the results of
two-region fusion


I_f = \begin{cases} 255, & \text{if } I_b' + I_w > 255 \\ I_b' + I_w, & \text{otherwise} \end{cases}    (45.5)

where I_b' is the blue region binary image and I_w is the white region binary image.
The results of the two-region fusion are shown in Fig. 45.5.
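Continuing the earlier sketch, Eqs. (45.4) and (45.5) can be written with OpenCV as follows, reusing the plate image, blue_region mask, and lcrs helper from the previous sketches; the threshold tuple corresponds to the fifth set in Table 45.1.

    import cv2

    hsv = cv2.cvtColor(plate, cv2.COLOR_BGR2HSV)
    # Eq. (45.4): white characters fall inside the HSV threshold box.
    white_region = cv2.inRange(hsv, (0, 0, 150), (180, 125, 255))
    white_region = lcrs(white_region)            # remove white background regions
    # Eq. (45.5): saturating addition of the two binary masks.
    fused = cv2.add(blue_region, white_region)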

45.2.2 Locate the Four Corners of the License Plate


and Correction

A morphological closing operation is applied to the binary image after the two-region
fusion to remove small black holes in the license plate area. The Canny operator is then
used for edge detection on the closed image [12]. From the edge detection result, only the
largest contour is retained, which is the external contour of the license plate area. Then,
the probabilistic Hough transform [7] is applied to the external contour binary image to fit
line segments along the contour. The probabilistic Hough transform [7] is an improvement
on the traditional Hough transform that is much faster and can
detect the end points of line segments. After the line segments are detected, because
the number of endpoints is small, the four corner points of the
license plate can easily be found by iterating over the endpoints.
The perspective transformation [8] is defined as a projection of an image onto a new
viewing plane, also called a projective mapping. The four corner points
of the license plate are used as the four source points, and the positions of the four
corrected target points are then calculated: the distances between the upper left source
point and its two adjacent source points are taken as the width and height of the target
rectangle, and the upper left target point coincides with the upper left source point.
After the four pairs of corresponding points are obtained, the perspective transformation
matrix is calculated, and the original license plate image is transformed with it to obtain
the corrected license plate image. In this paper, OpenCV was used for the perspective trans-
formation, and the average time spent on 40 images with a width of 250 px was
0.002 s.
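A sketch of this correction step with OpenCV is given below; the corner ordering (top-left, top-right, bottom-right, bottom-left) and the output canvas size are assumptions of the sketch.

    import cv2
    import numpy as np

    def rectify(plate, corners):
        tl, tr, br, bl = corners                         # four source points
        width = int(np.linalg.norm(np.subtract(tr, tl)))
        height = int(np.linalg.norm(np.subtract(bl, tl)))
        # Target rectangle anchored at the upper-left source point.
        dst = np.float32([tl,
                          [tl[0] + width, tl[1]],
                          [tl[0] + width, tl[1] + height],
                          [tl[0], tl[1] + height]])
        matrix = cv2.getPerspectiveTransform(np.float32(corners), dst)
        # The corrected plate occupies the target rectangle inside the output image.
        return cv2.warpPerspective(plate, matrix,
                                   (plate.shape[1], plate.shape[0]))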

45.3 Experimental Results

In order to verify the effectiveness of the method proposed in this paper, we built a
test dataset of 40 real blue license plate images, obtained from the coarse license plate
positioning system applied to real vehicle images and from real-scene crops. The dataset
covers complex conditions such as night scenes, uneven illumination, large lateral tilt
angles, blurred characters, and characters adhering to the border, and it contains a variety
of horizontally slanted, vertically slanted, and mixed horizontally and vertically slanted images.
The experiments were run on an Intel Core i5 4210M 2.60 GHz processor with 8 GB of RAM
under the Windows 7 operating system. The Eclipse integrated development environment
was used, the code was written in Python, and the open-source library OpenCV was used to
assist programming. Only one thread was used in the tests.
In order to verify that the proposed algorithm has advantages over traditional
methods, a comparison experiment of three algorithms was designed using these 40 images
as the test dataset: the first algorithm is based on the traditional Hough
transform, the second is based on the Radon transform, and the third is the proposed method.
The correction results of the three algorithms on some images are shown
in Fig. 45.6. On rough license plate samples with large slant angles, the proposed algorithm
has higher correction accuracy and better robustness. The comparison results of the three
algorithms on the

Fig. 45.6 Comparison of three different methods of license plate correction results. a Original
license plate image, b the correction result of the traditional Hough transform-based method, c the
correction result of the traditional Radon transform-based method, d the correction result of the proposed method

Table 45.2 Results obtained by different methods

Number of tests   Methods                         Time (s)   Accuracy (%)
40                The traditional Hough-based     5.122      77.5
40                The Radon-based                 0.320      85.0
40                Proposed method                 0.023      95.0

complete test set are shown in Table 45.2. The test set consists of the coarse license
plate positioning images scaled to a width of 250 px, and the three algorithms use the same
test set. The experimental results show that the proposed algorithm outperforms the
first two algorithms in both time and accuracy. Compared with most other existing
license plate correction algorithms, the proposed algorithm still has advantages.
In order to verify that the algorithm generalizes well, 50 real license plate
pictures were collected and corrected with this method. 46 were corrected correctly
and 4 failed; the correction rate is 92.0%, and the average correction
time is 0.016 s. The failures were caused by white metal rods occluding the plate
and by green paint interfering with the license plate. The experimental results
show that the proposed algorithm has good generalization ability and can complete
blue license plate correction quickly in most cases.

45.4 Conclusions

A precise positioning and correction method for the blue license plate is proposed in
this paper. The proposed method first uses color component combination and color
region fusion to accurately localize the license plate, and then uses the probabilistic
Hough transform and perspective transformation to quickly correct the license plate.
Experimental results show that the proposed method has good real-time perfor-
mance and a high correction rate, which satisfies the requirements of real-time moni-
toring in real-world scenes. Meanwhile, the license plate correction method has the
advantages of good robustness and a large range of correction adaptability. In the
future, we will extend the proposed method to be suitable for other types of Chinese
license plates.

References

1. Rui, T., Shen, C., Zhang, J.: A fast algorithm for license plate orientation correction. Comput.
Eng. 30(13), 122–124 (2004)
2. Ge, H., Fang, J., Zhang, X.: Research on license plate location and tilt correction algorithm in
License plate recognition system. J. Hangzhou Dianzi Univ. 27(2) (2007)
3. Wang, S., Yin, J., Xu, J., Li, Z., Liang, J.: A fast algorithm for license plate recognition
correction. J. Chang. Univ. (Nat. Sci. Ed.) 30(04), 76–86 (2018)
4. Wang, N.: License plate location and slant correction algorithm. Ind. Control. Comput. 27(11),
25–26 (2014)
5. Ji, J., Cheng, Y., Wang, J., Luo, J., Chang, H.: Rapid correction of slant plate in license plate
recognition. Technol. Econ. Guid. 26(35), 68 (2018)
6. Lu, H., Wen, H.: License plate positioning and license plate correction method under different
degrees of inclination. Mod. Ind. Econ. Inf. 6(05), 69–71 (2016)
7. Stephens, R.S.: Probabilistic approach to the Hough transform. Image Vis. Comput. 9(1), 66–71
(1991)
8. Niu, Y.: Discussion about perspective transform. J. Comput.-Aided Des. Comput. Graph. 13(6),
549–551 (2001)
9. Liu, W., Anguelov, D., Erhan, D., et al.: SSD: single shot multibox detector. In: European
Conference on Computer Vision, pp. 21–37. Springer, Cham (2016)
10. Zheng, K., Zheng, C., Guo, S., Cheng, K.: Research on fast location algorithm of license plate
based on color difference. Comput. Appl. Softw. 34(05), 195–199 (2017)
11. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man
Cybern. 9(1), 62–66 (2007)
12. Canny, J.: A computational approach to edge detection. In: Readings in Computer Vision,
pp. 184–203. Morgan Kaufmann (1987)
Chapter 46
Preliminary Design and Application
Prospect of Single Chinese Character
Calligraphy Image Scoring Algorithm

Shutang Liu, Zhen Wang, Chuansheng Wang, Junxian Zheng


and Fuquan Zhang

Abstract This paper builds on deep learning-based image classification
and proposes a new font grading system to help calligraphy lovers practice callig-
raphy. The basic model of the proposed framework is ResNet, to which
dilated convolution, deformable convolution, and deformable pooling are added
to improve performance. Experimental results show that the
proposed algorithm can make reasonable judgments on handwriting.

Keywords Chinese character calligraphy · Calligraphy image scoring

46.1 Introduction

With the extensive development of MOOCs, Chinese calligraphy courses have
emerged on the major MOOC platforms in China [1]. The 2018 National Online
Open Course Evaluation of China places more emphasis on students' intensive and quan-
titative homework exercises [2]. The authors' research team found that the number
of students taking calligraphy courses online is larger than the average number of
students taking other courses, and that evaluating students' calligraphy homework
is a kind of mental work for which it is difficult for scorers to maintain long-term objective

S. Liu · Z. Wang · J. Zheng · F. Zhang (B)


Minjiang University, Fuzhou University Town, Fuzhou 350108, People’s Republic of China
e-mail: 8528750@qq.com
S. Liu
e-mail: 630670731@qq.com
Z. Wang
e-mail: 754033308@qq.com
J. Zheng
e-mail: 407069090@qq.com
C. Wang
Harbin University of Science and Technology, Harbin 150080, People’s Republic of China
e-mail: shiguh5538@qq.com

evaluation. Because of this, the research team looked for a program that could
automatically score calligraphy works, but found only a few simple
tools that collect expert scores and perform simple mathematical processing, and
no software that can be directly used to score calligraphy as art. In the field of education
research, timely feedback and evaluation are of great significance in the training of stu-
dents' calligraphy skills [3]. At present, intelligent recognition
of Chinese characters has made considerable progress, and various kinds
of Chinese character recognition software are widely available. The devel-
opment of deep learning technology makes a more scientific, intelligent evaluation
of Chinese calligraphy possible.

46.2 Method

This paper proposes a new algorithm for calligraphy grading based on deep learning,
which can be used to help students with calligraphy exercises. Calligraphy font recognition
based on a convolutional neural network can automatically extract features and avoid
the defects of manually designed features. The network structure proposed in this paper
is based on the ResNet [4] image classification network combined with the recently
proposed dilated convolution [5] kernel and deformable convolution [6] design.
The proposed Chinese calligraphy scoring algorithm is an improvement of
the classification algorithm. Generally speaking, the goal of a
classification task is to compute the probability of each target category and then
select the category with the highest probability as the classification result [7–11]. Our
scoring algorithm makes some modifications to this traditional classification scheme:
we still take the category with the highest probability as the classification result, and then
use the probability value of this category as the reference value for scoring.
As mentioned above, the proposed algorithm is based on the image classification task;
therefore, we adopt ResNet, a very effective image classification network.

46.2.1 Building Block

The advantage of the deep residual network is that it allows the network to be very
deep, owing to its building block. The structural diagram of the building
block is shown in Fig. 46.1.
ResNet addresses the degradation problem by introducing a deep residual learning
framework: instead of hoping that every few stacked layers directly fit a desired underly-
ing mapping, these layers are explicitly made to fit a residual mapping. In this paper, the

Fig. 46.1 Residual learning: a building block

formulation of F(x) + x can be realized by feedforward neural networks with "shortcut connections" (Fig. 46.1). Shortcut connections are those that skip one or more layers. This "shortcut connection" allows the network structure to become deeper and thus improves the convergence ability of the network.
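A minimal Python sketch of such a building block is given below; it follows common practice rather than the authors' exact layer configuration, and the channel width is an assumption.

import torch.nn as nn

class BuildingBlock(nn.Module):
    # Residual building block computing F(x) + x through a shortcut connection.
    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))   # first stacked layer
        out = self.bn2(self.conv2(out))            # residual mapping F(x)
        return self.relu(out + x)                  # shortcut connection: F(x) + x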

46.2.2 Dilated Convolution

Text images differ from regular images: a large part of the image is blank, so features extracted from most of its area are uninformative. Therefore, we choose dilated convolution to extract features. Figure 46.2 shows a schematic diagram, where the left side is a normal convolution and the right side is a dilated convolution. As can be seen from Fig. 46.2, dilated convolution does not sample every pixel; it takes features at spaced locations instead. This way of feature extraction prevents the network from extracting too many uninformative features. The purpose of this structure is to provide a larger receptive field without pooling (a pooling layer causes information loss) and with the same amount of computation. Therefore, dilated convolution is very suitable for calligraphy font grading. We replace the original feature extraction method of ResNet with this method, which has a substantial effect on the final result.
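For reference, switching an ordinary 3 × 3 convolution to a dilated one is a one-parameter change in most deep learning frameworks. The short Python sketch below (channel counts are illustrative assumptions) keeps the spatial size while enlarging the receptive field:

import torch
import torch.nn as nn

plain = nn.Conv2d(64, 64, kernel_size=3, padding=1)                 # receptive field 3 x 3
dilated = nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2)   # samples across pixels, receptive field 5 x 5

x = torch.randn(1, 64, 96, 96)
print(plain(x).shape, dilated(x).shape)   # both preserve the 96 x 96 spatial size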

Fig. 46.2 Two different kinds of convolution

46.2.3 Deformable Convolutional and Pooling

In this work, we adopt two new modules to enhance the CNN's transformation modeling capability, namely, deformable convolution and deformable RoI pooling. Both are based on the idea of augmenting the spatial sampling locations in the module with additional offsets and learning these offsets from the target task without extra supervision. These modules can easily replace their ordinary counterparts in an existing CNN and can be trained end-to-end through standard backpropagation, yielding a deformable convolutional network. A schematic diagram of deformable convolution is shown in Fig. 46.3: (a) the regular sampling grid (green points) of standard convolution; (b) deformed sampling locations (dark blue points) with augmented offsets (blue arrows) in deformable convolution; (c) and (d) are special cases of (b), showing that deformable convolution generalizes various transformations of scale, (anisotropic) aspect ratio, and rotation.
In fact, the offsets added in the deformable convolution unit are part of the network structure: they are computed by a parallel standard convolution unit and can be learned end-to-end through gradient backpropagation. After the offsets are learned, the size and position of the deformable convolution kernels can be adjusted dynamically according to the image content currently being recognized. Visually, the sampling locations of the convolution kernels change adaptively with the image content at different positions, so that the network can adapt to objects with different geometric deformations, shapes, sizes, and so on. Deformable convolution and deformable pooling are shown in Fig. 46.4.
Our task is very sensitive to the direction of strokes, and an excellent calligraphy font must handle these details well. Therefore, it is necessary to extract features by means of deformable convolution, which makes the network more sensitive to stroke orientation. The rest of the network follows the traditional ResNet, as ResNet itself is already a very good network.
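A minimal Python sketch of a deformable convolution layer is shown below; it uses the torchvision operator with a parallel plain convolution predicting the offsets, and the channel sizes are illustrative assumptions rather than the paper's configuration.

import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformBlock(nn.Module):
    # 3 x 3 deformable convolution whose offsets come from a parallel standard convolution.
    def __init__(self, channels=64):
        super().__init__()
        # two offsets (dx, dy) per kernel position -> 2 * 3 * 3 channels
        self.offset_conv = nn.Conv2d(channels, 2 * 3 * 3, kernel_size=3, padding=1)
        self.deform_conv = DeformConv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        offset = self.offset_conv(x)         # learned, content-dependent sampling offsets
        return self.deform_conv(x, offset)   # sampling grid deformed by the offsets

x = torch.randn(1, 64, 48, 48)
print(DeformBlock()(x).shape)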

Fig. 46.3 Diagram of deformable convolutional



Fig. 46.4 Diagram of deformable convolutional and pooling

46.2.4 The Network Architecture

The scoring algorithm of Chinese calligraphy proposed in this paper is based on the
improvement of the classification algorithm. The goal of the classification task is to
find out the probability of different types of targets and then select the category with
the highest probability as the classification result. As mentioned above, the algorithm
we proposed is based on the image classification task. We refer to ResNet [4], a
very effective image classification network. Different from traditional ResNet, our
proposed network adds dilated convolution, deformable convolution, and deformable
pooling to the basic ResNet.

46.3 Experiment

46.3.1 Dataset

Since there are few international character recognition benchmarks for Chinese calligraphy, public databases are difficult to obtain, so we use the printed fonts provided in Windows as the database source. This source contains many fonts, but we only carried out experiments on two of them, namely, regular script and Song script. We restrict the study to these two fonts because the main purpose of this paper is to provide a method for scholars who need to identify characters in calligraphy, rather than to completely solve every problem in this field or to complete a particular project. Our follow-up work will expand the range of fonts.
Our data does not start from single-character images but from text images containing multiple characters. Each image contains only Chinese characters and nothing else. We first cut the original image into images each containing a single Chinese character. Because the images are generated by computer software, the characters are laid out very regularly; therefore, once the sizes of these images are known, the cutting can be done accurately. After cutting, we obtain the dataset we

Fig. 46.5 Diagram of segmentation process

need. Figure 46.5 shows the schematic diagram of the dataset acquisition method.
As can be seen from Fig. 46.5, this is a very neat image, so it is very easy to get the
image data we need from this image.

46.3.2 Training

In order to verify the feasibility of the improved model for handwriting style recognition, we trained and tested it on the dataset of the two fonts as well as on the standard dataset. The final recognition accuracy reached above 0.99. This shows, first of all, that our network can complete the font recognition task very well; since the core of our task is essentially recognition, the proposed network is effective.
We used the TensorFlow deep learning framework to build and compare the models and conducted many experiments with various learning rates. The training set contained 2400 images and the test set 800. Mini-batch gradient descent was used to update the model parameters with a batch size of 50, and the training set was iterated 48 times, by which point the loss no longer fell. Too small a learning rate makes convergence too slow, while too large a learning rate causes the optimum to be skipped so that convergence cannot be achieved. Finally, the learning rate in this experiment was set to 0.001.
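For clarity, these settings can be summarized in code. The sketch below is only a PyTorch-style restatement of the reported hyperparameters (the experiments themselves were run in TensorFlow, and the plain resnet18 here is merely a stand-in for the modified network described above):

import torch
import torchvision

# reported settings: 2400/800 train-test split, batch size 50, learning rate 0.001, 48 passes
model = torchvision.models.resnet18(num_classes=2)          # stand-in for the modified ResNet
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
criterion = torch.nn.CrossEntropyLoss()

def train_one_epoch(loader):
    # loader is assumed to yield batches of 50 (image, label) pairs
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()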

46.4 Application Prospect and Future Work

The application of this software is not limited to MOOC calligraphy courses: it can be released separately and applied to calligraphy teaching for children and adolescents. If it is to be applied to judging calligraphy art contests, further research and development is needed. Although the Song typeface is a printed font, it plays an important role in evaluating writing norms because of its simple strokes. As a script from the stage preceding regular script, clerical (Li) calligraphy has not yet developed much stylistic differentiation and is comparatively easy to recognize automatically. Because regular and running-hand scripts have differentiated into many styles, their scoring indicator systems require larger samples for deep learning training, which is the main work for the future. A complete calligraphy work also requires a macro-level layout, so sample collection of entire calligraphy works is another major task for the future.

Acknowledgements The paper is supported by the foundation of Fujian Province Educational


Science “Thirteenth Five-Year Plan” 2018 Project—“Research on the college students’ anomie
of online courses learning and intervention of their online courses learning” (No. FJJKCGZ18-
850, Key funding project), Young and Middle-aged Teacher Educational and Scientific Research
Project of Fujian Province—“Research on the college students’ anomie of online courses learning
and intervention of their online courses learning”, and the Teaching Reform Research Project of
Minjiang University in 2018—“The Interventional Teaching Reform aimed at the online courses
learning anomie of college students” (No. MJU2018A005).

References

1. Mi, W.: The e-curriculum development: a new way for current primary and secondary school
calligraphy teaching. Curric., Teach. Mater. Method 38(07), 87–91 (2018)
2. Ministry of Education of the People’s Republic of China official website, http://www.moe.gov.
cn/srcsite/A08/s5664/s7209/s6872/201807/t20180725_343681.html. Last accessed 24 July
2018
3. Zhou, Y.: Thoughts on the construction of online open courses for art. Art Educ. 336(20),
136–137 (2018)
4. He, K.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision
and Pattern Recognition 2016, pp. 770–778 (2016)
5. Yu, F.: Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122 (2015)
6. Dai, J.: Deformable convolutional networks. In: Proceedings of the IEEE International Con-
ference on Computer Vision (2017)
7. Fanello, S.R.: Keep it simple and sparse: real-time action recognition. J. Mach. Learn. Res.
14(1), 2617–2640 (2017)
8. Lu, C.: Two-class weather classification. IEEE Trans. Pattern Anal. Mach. Intell. (99), 1 (2017)

9. Woitek, R.: A simple classification system (the Tree flow chart) for breast MRI can reduce the
number of unnecessary biopsies in MRI-only lesions. Eur. Radiol. 27(9), 3799–3809 (2017)
10. Cicero, M.: Training and validating a deep convolutional neural network for computer-aided
detection and classification of abnormalities on frontal chest radiographs. Investig. Radiol.
52(5), 281 (2017)
11. Yuan, Y.: Hyper spectral image classification via multitask joint sparse representation and
stepwise MRF optimization. IEEE Trans. Cybern. 46(12), 2966–2977 (2017)
Chapter 47
Adaptive Histogram Thresholding-Based
Leukocyte Image Segmentation

Xiaogen Zhou, Chuansheng Wang, Zuoyong Li and Fuquan Zhang

Abstract To improve the accuracy of leukocyte segmentation, this paper presents


a novel method based on adaptive histogram thresholding (AHT). The proposed
method first employs color component combination and AHT to extract the nucleus
of leukocyte and utilizes image color features to remove the complex backgrounds
such as red blood cells (RBCs) and substantial dyeing impurities. Then, Canny edge
detection is performed to extract the entire leukocyte. Finally, the cytoplasm of the
leukocyte is obtained by subtracting the nucleus with the entire leukocyte. Exper-
imental results on an image dataset containing 60 leukocyte images show that the
proposed method generates more accurate segmentation results than the counterparts.

Keywords Leukocyte (white blood cell) · Image thresholding · Image


segmentation · Image localization · Color component combination · Edge detection

47.1 Introduction

In the medical fields, the analysis and cytometry of white blood cells (WBCs) in
blood smear images is a powerful diagnostic tool for many types of diseases, such
as infections, anemia, malaria, syphilis, heavy metal poisoning, and leukemia. A

X. Zhou
College of Mathematics and Computer Science, Fuzhou University, Fuzhou, People’s Republic of
China
e-mail: xiaogenzhou@126.com
X. Zhou · C. Wang · Z. Li (B) · F. Zhang (B)
Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang
University, Fuzhou, People’s Republic of China
e-mail: fzulzytdq@126.com
F. Zhang
e-mail: 8528750@qq.com
C. Wang
School of Computer Science and Technology, Harbin University of Science and Technology,
Harbin, People’s Republic of China


Fig. 47.1 The WBC image segmentation process of the proposed method. a Original WBC image,
b the color component combination image, c the grayscale histogram of (b), where P1 , P2 , and
P3 are three peaks of the histogram, and T is a threshold for image binarization, d segmentation
result of the leukocyte’s nucleus, e the result of (a) after removing the RBCs and background, f the
maximum object contour in the leukocyte’s edge detection result, g the leukocyte’s segmentation
result, h segmentation result of the leukocyte’s cytoplasm

computer-aided automatic cell analysis system not only saves manpower and time
cost but also reduces the effects of human error. WBC segmentation is the basis
of automatic cell image analysis, and the precision of WBC segmentation directly
influences the reliability of the blood smear image analysis.
A typical human blood smear image which consists of WBCs, red blood cells
(RBCs or erythrocytes), platelets, and the background is conventionally prepared with
Wright-Giemsa stain to visualize and identify WBCs microscopically. The goal of
cell segmentation is to extract WBCs from a complex scene for subsequent analysis. However, staining is often uneven, illumination conditions vary, cells differ widely in size, color, and shape, and WBCs frequently adhere to RBCs. For these reasons, accurate and robust WBC segmentation remains a challenging task.
The primary objective of this paper is to present a method to segment entire leuko-
cyte, nucleus, and cytoplasm from the blood smear image with the standard staining
condition, as shown in Fig. 47.1; it is an example of leukocyte image segmentation.
There are various types of segmentation methods that have been proposed for
cell images over the past several decades. Specific thresholding is a widely used
technique based on the analysis of histogram in cells segmentation. Threshold-based
[1, 2] methods mainly include the region growing method, the watershed method [3,
4], and Otsu’s method [5]. Lim et al. [6] proposed a WBC segmentation method by
image thresholding and watershed techniques. In addition, learning-based methods
include supervised methods, including support vector machine (SVM) [7], deep

neural networks, and unsupervised methods such as k-means clustering [8] and fuzzy
c-means. Zhang et al. proposed a novel method for the nucleus and cytoplasm of
leukocyte segmentation based on color space decomposition and k-means clustering.
In this paper, we proposed a method to segment nucleus and cytoplasm of leuko-
cytes in blood smear images. We employ AHT and components combination in color
space (CCCS) to segment the nucleus of leukocyte and obtain the entire leukocyte
using the Canny edge detection. We also obtain the cytoplasm region by subtracting
the nucleus region from the entire leukocyte region.
The rest of the paper is structured as follows. Section 47.2 briefly introduces the
proposed method. The experimental results are shown and discussed in Sect. 47.3.
The conclusion is drawn in the final section.

47.2 The Proposed Method

To accurately and robustly segment leukocyte in blood smear images, we propose


a novel leukocyte segmentation method based on components combination in color
space (CCCS) and adaptive histogram thresholding (AHT). The proposed method
first employs AHT and CCCS to extract the nucleus of leukocyte. Then, Canny edge
detection is performed to extract the entire leukocyte. Finally, cytoplasm segmenta-
tion is achieved by subtracting the WBC nucleus region from the leukocyte region.

47.2.1 Nucleus Segmentation

We introduce a novel method to accurately segment the nucleus from the leukocyte
image, which contains two main steps. First, a novel color component combination
image (see Fig. 47.1b) is constructed by the saturation component in HSI color space,
the green component and the blue component in RGB color space, respectively.
Second, the nucleus segmentation result is obtained based on the AHT method. The
detailed process of nucleus segmentation is as follows:
(1) Components combination in color space: Construct a color component combi-
nation by the saturation component in hue, saturation, and intensity (HSI) color
space, the green and blue components as a new image I  , using the following
formulae:

I′(i, j) = S′ + k1 B − k2 G                                              (47.1)


k1 =  1, if B0 ≥ S0 (47.2)
S0
B0
, otherwise

In Eq. (47.1), S′ denotes the normalized saturation component in HSI color space, and G and B indicate the green and blue components in RGB color space, respectively. Symbols k1 and k2 are the weights of B and G, respectively, and k1 is adaptively set according to Eq. (47.2). In Eq. (47.2), ⌈·⌉ indicates rounding upward, and S0 and B0 are the thresholds of the saturation and blue components determined by our proposed adaptive histogram thresholding, respectively.
(2) Extraction of nucleus region: We first suppress image noise using the median
filter, then extract candidate nucleus regions by our proposed method AHT,
and finally remove small regions for obtaining final nucleus regions. The AHT
method includes the following steps.
Step 1: Construct a grayscale histogram, H, of the above color component com-
bination image.
Step 2: Find the peaks in H using the Matlab function "findpeaks", and denote their corresponding gray levels as g1, g2, . . . , gN, where N is the number of peaks.
Figure 47.1c shows all the three peaks of the image histogram.
Step 3: Calculate two gray levels, gM and gSM corresponding to the highest
peak and the second highest peak among the peaks, respectively, via the following
formulae:

gM = arg max{gi : 1 ≤ i ≤ N} H(gi)                                       (47.3)

gSM = arg max{gi : 1 ≤ i ≤ N, gi ≠ gM} H(gi)                             (47.4)

Step 4: Adaptively determine the threshold T as

T = arg min{ H(i) : min(gM, gSM) ≤ i ≤ max(gM, gSM) }                    (47.5)

where T is the gray level corresponding to the minimum value of H among gray
levels between the highest peak and the second highest peak.
Step 5: Obtain nucleus segmentation result using the following equation:

BT(i, j) = { 1, if I′(i, j) > T
           { 0, otherwise                                                (47.6)

and then remove spurious object regions with a small area.
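A minimal NumPy/SciPy sketch of this thresholding procedure is given below; it uses scipy.signal.find_peaks in place of MATLAB's findpeaks, and the helper names are ours rather than the original implementation's.

import numpy as np
from scipy.signal import find_peaks

def aht_threshold(img):
    # Steps 1-4: threshold at the histogram valley between the two highest peaks.
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    peaks, _ = find_peaks(hist)                      # gray levels g1..gN of the peaks
    if len(peaks) < 2:                               # fallback for degenerate histograms
        return int(img.mean())
    order = peaks[np.argsort(hist[peaks])[::-1]]
    g_m, g_sm = int(order[0]), int(order[1])         # highest and second highest peaks
    lo, hi = sorted((g_m, g_sm))
    return lo + int(np.argmin(hist[lo:hi + 1]))      # gray level of the valley (Eq. 47.5)

def nucleus_mask(img):
    # Step 5: binarize the color-component-combination image I' (Eq. 47.6).
    return (img > aht_threshold(img)).astype(np.uint8)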



47.2.2 Cytoplasm Segmentation

This section presents a novel method to segment cytoplasm. Specifically, the pro-
posed method first removes image background and RBCs by a preprocessing oper-
ation based on image color features, and then performs Canny [9] edge detection
to detect the contour of entire leukocyte, which is then utilized to obtain the binary
image of leukocyte. Finally, cytoplasm segmentation is achieved by subtracting the
nucleus region from the leukocyte region. The detailed steps of cytoplasm segmen-
tation are described as follows.
(1) Remove the background based on prior knowledge of image color via the fol-
lowing formula:


Ib(i, j, :) = { [255, 255, 255], if I(i, j, 2) ≥ t1
              { I(i, j, :),        otherwise                             (47.7)

t1 = ( I(i, j, 1) + I(i, j, 3) ) / 2                                     (47.8)
where I (i, j, :) and Ib (i, j, :) denote three color component values of the pixel (i, j)
in the original image and the background removal result, respectively.
(2) Remove red blood cells (RBCs) from the image Ib by the following image
thresholding:


Ibr(i, j, :) = { [255, 255, 255], if Ib(i, j, 1) ≥ t2
               { Ib(i, j, :),       otherwise                            (47.9)

t2 = ( Ib(i, j, 2) + Ib(i, j, 3) ) / 2                                   (47.10)
where Ibr (i, j, :) denotes the image after removing the red blood cells.
(3) Perform median filter to smooth Ibr and remove impurities.
(4) Perform Canny edge detection to obtain the leukocyte contour.
(5) Obtain the maximum connected region from the edge detection result. The
corresponding result is shown in Fig. 47.1f.
(6) Fill the leukocyte contour to obtain leukocyte region by Matlab function “imfill”,
and then further perform the morphological operation by the Matlab function
“imopen” to obtain the final leukocyte segmentation result, which is shown in
Fig. 47.1g.
(7) Cytoplasm segmentation is achieved by subtracting the WBC nucleus region
from the leukocyte region, and the corresponding result is shown in Fig. 47.1h.

47.3 Experimental Results

In this paper, to validate the effectiveness of the proposed method, we used one image
database which includes 60 WBC images of size 260 × 260, each containing a single WBC, under standard
staining condition, which was provided by The People’s Hospital Affiliated to Fujian
University of Traditional Chinese Medicine. There also is a color difference between
different images due to unstable illumination, different types of leukocytes, and so
on.
To demonstrate the superiority of the proposed method, we compared our pro-
posed method with other available existing WBC image segmentation methods, i.e.,
Zheng et al. [10] and Gu and Cui [11]. Segmentation results on several typical images
are first evaluated qualitatively. Then, segmentation results on the two image datasets
were quantitatively evaluated using four common image classification measures, i.e.,
misclassification error (ME) [12], false positive rate (FPR), false negative rate (FNR)
[13], and kappa index (KI) [14]. Their definitions are as follows:

ME = 1 − ( |Bm ∩ Ba| + |Fm ∩ Fa| ) / ( |Bm| + |Fm| )                     (47.11)

FPR = |Bm ∩ Fa| / |Bm|                                                   (47.12)

FNR = |Fm ∩ Ba| / |Fm|                                                   (47.13)

KI = 2 |Fm ∩ Fa| / ( |Fm| + |Fa| )                                       (47.14)

where Bm and Fm are the background and foreground of the manual ideal segmentation result (ground truth), respectively, Ba and Fa are the background and foreground of the automatic segmentation result obtained by a given algorithm, respectively, and |·| denotes the cardinality of a set; all four measures take values between 0 and 1. Lower values of ME, FPR, and FNR indicate better segmentation, while higher values of KI indicate better segmentation.
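These measures are straightforward to compute from binary masks. The short NumPy helper below is our own illustration, with fm and fa denoting the boolean foreground masks of the ground truth and the automatic result, respectively:

import numpy as np

def segmentation_measures(fm, fa):
    bm, ba = ~fm, ~fa
    me = 1 - ((bm & ba).sum() + (fm & fa).sum()) / (bm.sum() + fm.sum())   # Eq. (47.11)
    fpr = (bm & fa).sum() / bm.sum()                                        # Eq. (47.12)
    fnr = (fm & ba).sum() / fm.sum()                                        # Eq. (47.13)
    ki = 2 * (fm & fa).sum() / (fm.sum() + fa.sum())                        # Eq. (47.14)
    return me, fpr, fnr, ki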
To quantitatively compare the segmentation accuracy of the three methods (i.e.,
Zheng’s method [10], Gu’s method [11], and the proposed method), we have a dataset
composed of 60 blood smear images with standard staining condition. The segmen-
tation results were quantitatively evaluated by four measures of ME, FPR, FNR,
and KI. Tables 47.1 and 47.2 show the quantitative evaluation results of leukocyte
and nuclear segmentation results on the standard staining dataset, respectively (the

Table 47.1 Quantitative comparison of leukocyte segmentation results on the image dataset

                      ME      FPR     FNR     KI
Gu's method [11]      0.152   0.179   0.108   0.817
Zheng's method [10]   0.149   0.206   0.050   0.841
Proposed method       0.048   0.051   0.038   0.944

Table 47.2 Quantitative comparison of nucleus segmentation results on the image dataset

                      ME      FPR     FNR     KI
Gu's method [11]      0.048   0.024   0.132   0.886
Zheng's method [10]   0.151   0.161   0.146   0.740
Proposed method       0.048   0.052   0.037   0.943

best results are highlighted in bold). Figure 47.2 shows segmentation results on eight
WBC images under standard staining condition. As for the average segmentation per-
formance on the standard-stained images, Tables 47.1 and 47.2 demonstrate that the
proposed method has the lowest value of ME, FPR, and FNR, and has the highest KI
value, which indicates that our method performs best among the compared approaches.

47.4 Conclusions

WBC image segmentation is a crucial step of developing a computer-aided automatic


cell analysis system. Segmentation accuracies of existing WBC image segmentation
methods are still unsatisfactory. To improve leukocyte segmentation accuracy, we
proposed a novel method based on adaptive histogram thresholding. The proposed
method has three main contributions. The first contribution was that it presented a
scheme of using a color component combination to stand out the nucleus for the
nucleus segmentation. The second contribution was that it developed an adaptive
histogram thresholding to segment the nucleus. The third contribution was that it
developed a scheme of using image color prior and image thresholding to remove
image background and red blood cells (RBCs). Experimental results on a leukocyte
image dataset under standard staining condition demonstrate the superiority of the
proposed method over the counterparts.

Acknowledgements This work is partially supported by the National Natural Science Founda-
tion of China (61772254 and 61202318), Fuzhou Science and Technology Project (2016-S-116),
Program for New Century Excellent Talents in Fujian Province University (NCETFJ), Key Project
of College Youth Natural Science Foundation of Fujian Province (JZ160467), Young Scholars in
Minjiang University (Mjqn201601), and Fujian Provincial Leading Project (2017H0030).

Fig. 47.2 Visual segmentation results under standard staining condition with columns from left to
right: original images, ground truths, segmentation results obtained by Gu’s method [11], Zheng’s
method [10], and the proposed method, respectively

References

1. Huang, D.C., Hung, K.D., Chan, Y.K.: A computer assisted method for leukocyte nucleus
segmentation and recognition in blood smear images. J. Syst. Softw. 85(9) (2012)
2. Putzu, L., Di Ruberto, C.: White blood cells identification and counting from microscopic
blood images. In: Proceedings of the WASET International Conference on Bioinformatics,
Computational Biology and Biomedical Engineering 2013, vol. 7(1). Guangzhou, China
3. Arslan, S., Ozyurek, E., Gunduz-Demir, C.: A color and shape based algorithm for segmentation
of white blood cells in peripheral blood and bone marrow images. Cytom. Part A 85(6), 480–490
(2014)
4. Zhi, L., Jing, L., Xiaoyan, X., et al.: Segmentation of white blood cells through nucleus mark
watershed operations and mean shift clustering. Sensors 15(9), 22561–22586 (2015)
5. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man
Cybern. 9(1), 62–66 (1979)
6. Lim, H.N., Mashor, M.Y., Hassan, R.: White blood cell segmentation for acute leukemia bone
marrow images. In: Proceedings of the 2012 IEEE International Conference on Biomedical
Engineering (ICoBE) 2012. Penang, Malaysia, IEEE (2012)
7. Zheng, X., Wang, Y., Wang, G., Liu, J.: Fast and robust segmentation of white blood cell images
by self-supervised learning. Micron 107, 55–71 (2018)
8. Zhang, C., Xiao, X., Li, X., et al.: White blood cell segmentation by color-space-based k-means
clustering. Sensors 14(9), 16128–16147 (2014)
9. Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell.
8, 679–698 (1986)
10. Zheng, X., Wang, Y., Wang, G.: White blood cell segmentation using expectation-maximization
and automatic support vector machine learning. J. Data Acquis. Process. 28(5), 217–231 (2013)
11. Gu, G., Cui, D.: Flexible combination segmentation algorithm for leukocyte images. Chin. J.
Sci. Instrum. 29(9), 1977–1981 (2008)
12. Yasnoff, W.A., Mui, J.K., Bacus, J.W.: Error measures for scene segmentation. Pattern Recogn.
9(4), 217–223 (1977)
13. Fawcelt, T.: An introduction to ROC analysis. Pattern Recognit. Lett. 27(8), 861–874 (2006)
14. Fleiss, J.L., Cohen, J., Everitt, B.S.: Large sample standard errors of kappa and weighted kappa.
Psychol. Bull. 72(5), 323–327 (1969)
Chapter 48
Simulation Study on Influencing Factors
of Flyer Driven by Micro-sized PbN6

Xiang He, Nan Yan, Weiming Wu and Liang Zhang

Abstract In order to guide the structural design of the micro-explosive train, the
JWL equation of state parameters of the primer explosive PbN6 are fitted first, and then simulation models of the flyer driven by the micro-charge and of the flyer impacting the explosion-proof component are established using AUTODYN software. The effects of
charge height, flyer thickness, and shear plate aperture on flyer velocity and kinetic
energy are obtained by simulation calculation. When the charge diameter is fixed, the
flyer velocity increases first with the increase of charge height, and then gradually
tends to a fixed value. When the charge size is fixed, the maximum flyer kinetic
energy corresponds to an optimal flyer thickness. When the shear plate aperture is
smaller than the charge diameter, the flyer velocity will be improved. The relationship
between the thickness of nickel, copper, and silicon explosion-proof component
and shock wave attenuation is studied quantitatively, and the safe explosion-proof
thickness of initiating JO-9C acceptor charge is given.

Keywords JWL parameters · Flyer velocity · Explosion propagation ·


Explosion-proof · Shock wave attenuation

48.1 Introduction

Miniaturization of explosive train can reduce the volume of ammunition fuze, which
saves more space for the circuit design of weapon system and the main charge,
thus improving the power of weapon, and it is a research hotspot of explosive train
technology. It is possible to further miniaturize the explosive train by integrating

X. He (B) · N. Yan
Beijing Institute of Technology, Beijing 100081, China
e-mail: 716280128@qq.com
W. Wu
The 53rd Research Institute of CETC, Tianjin 300161, China
L. Zhang
School of Information and Science and Technology, Peking University, Beijing 100871, China

Fig. 48.1 Structural sketch of flyer-type explosive train: (a) safe position, (b) armed position. 1-Firing layer, 2-Primer explosive (PbN6), 3-Flyer, 4-Shear plate, 5-MEMS S&A chip, 6-Lead charge (JO-9C)

the technology of micro-electromechanical system (MEMS) and pyrotechnics. The


MEMS explosive train is mainly divided into dislocation type and in-line detonation
type.
In this paper, an in-line flyer-type detonation explosive train is studied, which
can improve safety and reduce volume by using flyer to transfer energy as shown
in Fig. 48.1. The ignition layer is mainly composed of LTNR and electrode plug.
When the safety chip of the MEMS is in a safe position (Fig. 48.1a), the flyer impacts
on the safety chip, and the shock wave decays rapidly in it, which cannot cause the
detonation of the lead charge. When the safety chip of the MEMS is in the armed
position (Fig. 48.1b), the flyer accelerates in the blast hole of the safe chip, impacting
and detonating the lead charge.
In order to quantitatively guide the structure design of the MEMS in-line explosive
train, the important influencing factors are simulated and analyzed. The simulation
is divided into two parts. The first part studies the explosion propagation ability of
the explosive train, including the influence of charge height, flyer thickness, and
shear plate aperture on flyer velocity and kinetic energy. The second part studies the
explosion-proof ability of the MEMS safety chip: the flyer, at the velocity obtained in the first part, impacts explosion-proof components of different thicknesses and materials to obtain the attenuation law of shock wave pressure with thickness, from which the safe explosion-proof thickness of the explosion-proof components can be obtained.

48.2 Simulation Study on Influencing Factors of Explosion


Propagation Ability of Micro-sized PbN6 -Driven Flyer

48.2.1 Simulation Model

A two-dimensional symmetrical simulation model, as shown in Fig. 48.2, is estab-


lished to calculate the velocity of the flyer driven by micro-sized PbN6 primer explo-
sive. The Euler algorithm is used for explosive and air. The Lagrange algorithm is
used for shearing plate and titanium flyer. The reflection boundary is set for con-
straint. The shearing plate is set to be rigid. The mesh size is 0.025 mm. The Gauss
point is located at the center of the titanium flyer.

48.2.2 Material Parameters

Determination of parameters of JWL equation of state for PbN6 primer. In the


simulation study of flyer driven by micro-charge, the determination of parameters
of JWL equation of state of explosive is an important problem in simulation. The
relationship between explosive density ρ and detonation velocity D is measured
experimentally. The value of γ of primer explosive at a specific density is obtained
using the “D-D” [1, 2] algorithm. The relationship between pressure P of gas product
and relative volume V is obtained. The unknown parameters of JWL equation of state
are obtained by fitting multiple sets of (P, V ) values. The fitting process is shown in
Fig. 48.3.

Fig. 48.2 Simulation model diagram of flyer driven by micro-size charge (the Gauss point is at the flyer center): 1-Primer explosive PbN6, 2-Constraint, 3-Titanium flyer, 4-Shear plate, 5-Air, 6-Initiation point



Fig. 48.3 Parameter fitting process of JWL equation of state for detonator

Fig. 48.4 Fitting curve between density and detonation velocity of PbN6 charge

Table 48.1 Fitting parameters of JWL equation of state for PbN6


ρ/g cm−3 D/m s−1 P/GPa A/GPa B/GPa C/GPa R1 R2 ω
3.834 5110 13.8 1524 11.44 0.100 5.75 1.53 0.21

The relationship between density ρ and detonation velocity D of PbN6 is shown


in Fig. 48.4 [3].
The general form of JWL equation of state is

P(V, E) = A (1 − ω / (R1 V)) e^(−R1 V) + B (1 − ω / (R2 V)) e^(−R2 V) + ωE / V        (48.1)

In the formula, P is the pressure of detonation products, V is the relative specific


volume, E is the specific unit volume thermodynamic energy, A, B, R1 , R2 , and ω are
the parameters to be fitted. Finally, the JWL equations of state parameters of PbN6
are fitted as shown in Table 48.1.
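The final fitting step, determining the unknown coefficients from (P, V) samples, can be carried out with an ordinary least-squares routine. The Python sketch below uses the standard JWL principal-isentrope form and synthetic (P, V) points generated from the coefficients in Table 48.1 purely for illustration; it is not the authors' fitting code, and the samples are not the measured data.

import numpy as np
from scipy.optimize import curve_fit

def jwl_isentrope(V, A, B, C, R1, R2, omega):
    # JWL principal isentrope P(V) = A e^(-R1 V) + B e^(-R2 V) + C V^-(1+omega)
    return A * np.exp(-R1 * V) + B * np.exp(-R2 * V) + C * V ** (-(1.0 + omega))

V_pts = np.linspace(0.6, 7.0, 60)                                        # relative volumes
P_pts = jwl_isentrope(V_pts, 1524.0, 11.44, 0.100, 5.75, 1.53, 0.21)     # GPa, illustrative only

p0 = [1000.0, 10.0, 0.1, 6.0, 1.5, 0.3]                                  # initial guess A, B, C, R1, R2, omega
params, _ = curve_fit(jwl_isentrope, V_pts, P_pts, p0=p0, maxfev=20000)
print(dict(zip(["A", "B", "C", "R1", "R2", "omega"], params)))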

Table 48.2 Material parameters of constraint, shear plate, and flyer

Properties                  Constraint, shear plate   Titanium flyer
ρ/g cm−3                    7.896                     4.528
Coefficient of Gruneisen    2.17                      1.09
c0/cm μs−1                  0.457                     0.522
s                           1.49                      7.67

Material Parameters of Constraints, Shear Plates, Titanium Flyers, and Air.


The material parameters of constraints, shear plates, titanium flyers, and air all come
from AUTODYN material library. The impact state equation and strength model are
the same for both the constraints and shear plates. Titanium was chosen as flyer
material, and impact equation of state was adopted. The equation of state of shock
under high pressure is as follows:

us = c0 + s · up                                                         (48.2)

In the formula, us and up are the shock wave velocity in the solid medium and the particle velocity on the wavefront, respectively, c0 is the elastic wave velocity in the medium, and s is an experimentally determined constant. The material parameters of constraints, shear plates, and flyers are shown in Table 48.2.
The air region is described by the equation of state of ideal gas:

P = (γ − 1)ρ E g (48.3)

In the formula, γ is the adiabatic index; for ideal gases, γ = 1.4. The initial density ρ0 of air is 1.225 × 10⁻³ g cm⁻³, and the specific internal energy of the gas is Eg = 2.068 × 10⁵.

48.2.3 Study on the Influencing Factors of Explosion


Propagation Ability of Flyer

The Relation between Flyer Speed and Displacement. After the shock wave passes
through the air gap, its shock wave pressure drops rapidly, which often fails to
detonate the lead charge. The shearing process of flyer sheet is shown in Fig. 48.5.
The flyer is first accelerated by the shock wave, maintains its speed over a certain distance after reaching a peak, and then decelerates slowly, so the flyer can transfer energy effectively. The velocity–time data of the Ti flyer driven by a
charge of ϕ0.9 mm × 1.8 mm PbN6 are obtained by simulation calculation. The

Fig. 48.5 Shear-forming process of 0.1-mm-thick titanium flyer sheet

Fig. 48.6 Typical velocity–displacement curve of flyer (stages A–D marked on the curve)

velocity–displacement relationship of Ti flyer is obtained by integrating the data, as


shown in Fig. 48.6.
The general process of flyer motion is divided into four stages. In stage A, the flyer
is sheared by shock wave, and the velocity increases sharply. In stage B, the shock
wave pressure decreases, but it still accelerates the flyer. The flyer speed increases
slowly, and the acceleration stroke is more than 1 mm, which indicates that the
flyer still accelerates after it leaves the shear plate. The flyer in stage C is driven by
detonation product gas, although it is subject to air resistance. The flyer has a steady
speed and a smooth travel of more than 1 mm. In stage D, the velocity of flyer begins
to decay, but the decay trend is not intense. It can be seen that the flyer can maintain
a high speed over a distance of several millimeters and thus has a high air-gap initiation ability.
Effect of Charge Height on Flyer Speed and Kinetic Energy. When the diame-
ter of the charge is constant, and after increasing the charge height to a certain value,

Fig. 48.7 The relationship between charge height and flyer speed and kinetic energy

the shock wave output pressure of the detonator tends to be fixed, and the speed of
the flyer is positively correlated with the output pressure of the primer [4]. We simulate a ϕ0.9 mm PbN6 charge and calculate the maximum speed and kinetic energy of the flyer as the charge height increases from 0.6 to 3 mm, as shown in Fig. 48.7.
As can be seen from Fig. 48.7, after charge height > 1.8 mm, the increasing trend
of velocity and energy of flyer is gentle, so the charge height should be less than
1.8 mm. When the kinetic energy of flyer is greater than the critical initiation energy
E C of explosive, the lead charge can be detonated. According to Ref. [5], the critical
initiation energy EC of JO-9C is 164.6 mJ. According to GJB1307A [6], the minimum output energy of the detonator should be at least 25% higher than the minimum input energy required by the detonation transfer train or terminal device. The minimum charge height that provides 1.25 EC is 0.85 mm, so the detonator charge height that meets the requirements of reliable detonation transfer and margin design should be more than
0.85 mm.
Effect of Flyer Thickness on Flyer Velocity and Kinetic Energy. The process of
flyer impact initiation is high-pressure short-pulse initiation. The initiation ability is
affected by shock wave pressure and action time. The duration of shock wave pulses
in explosives τ is related to the thickness of flyer plates. The formula for calculating
τ is as follows:


τ = 2δ / Df                                                              (48.4)

In the formula, Df is the velocity of shock wave in the flyer and δ is the thickness
of the flyer. When the size of the PbN6 charge is fixed, the velocity and kinetic energy of

Fig. 48.8 The relationship between flyer thickness and flyer velocity and kinetic energy

titanium flyer with thickness from 0.02 to 0.1 mm are calculated by simulation. The
results are shown in Fig. 48.8.
The simulation results show that the velocity of the flyer decreases linearly with
the increase of the thickness of the flyer. Except that the kinetic energy of the flyer
with a thickness of 0.02 mm does not meet the requirement of initiation energy, the
kinetic energy of the flyer with other thickness can meet the requirement of energy
margin. The kinetic energy of the flyer increases first and then decreases. There exists
an optimal thickness of the flyer with the largest kinetic energy of the flyer, which is
also the preferred thickness of the flyer in design.
Effect of Shear Plate Aperture on Flyer Speed. The shear plate and the initiating
explosive together make the flyer shear forming. The aperture of the shear plate is the
diameter of the flyer. Three series of shear plate aperture are simulated and designed,
that is, the aperture of the shear plate is larger than the diameter of the charge,
close to the diameter of the charge, and smaller than the diameter of the charge. The
relationship between the aperture of the shear plate and the velocity of the flyer plate
is studied. In the simulation, the thicknesses of the PbN6 charge and the flyer are unchanged, and the calculated results are shown in Fig. 48.9.
When the aperture of shear plate (0.2, 0.3, 0.6 mm) is smaller than the charge
diameter, the smaller the aperture of shear plate, the shorter the time for flyer velocity
to reach its maximum, and the final flyer velocity tends to be the same. When the
diameter of the shear plate (1, 0.9 mm) is close to the charge, the flyer can also
accelerate to the speed close to the small diameter, but then the speed decreases
rapidly. When the aperture of the shear plate (1.2, 1.5 mm) is larger than that of the
charge, the influence of lateral sparse wave intrusion on the shear forming process of
the flyer sheet is significant [7]. The maximum velocity of the flyer sheet is obviously

Fig. 48.9 The velocity–displacement curve of flyer under different shear plate apertures

smaller than that of the small diameter flyer sheet, and the velocity attenuation is
advanced, and the attenuation range is more obvious.
Therefore, in the design of the shear plate aperture, the flyer diameter should be smaller than the charge diameter, so as to improve the flyer's explosion transfer ability.

48.3 Conclusion

When the diameter of PbN6 charge and the size of titanium flyer are fixed, the height
of charge increases from 0.6 to 3 mm. When the charge height h = 0.6 mm, the
energy requirement of initiating JO-9C is met. When the charge height h = 0.85 mm, the energy margin requirement of initiating JO-9C is met. When the charge height
h > 1.8 mm, the increasing trend of flyer velocity and kinetic energy is gentle. The
simulation provides quantitative guidance for designing the minimum charge height
of primer explosive.
When the diameter of PbN6 charge and titanium flyer is constant and the thickness
of flyer is increased from 0.02 to 0.1 mm, the velocity of flyer decreases linearly,
and the kinetic energy of the flyer increases first and then decreases. When the flyer thickness is greater than 0.044 mm, the energy margin requirement for JO-9C initiation is satisfied, and when the flyer thickness is equal to 0.08 mm, the kinetic energy of the flyer is the largest.
When the size of the PbN6 charge and titanium flyer is fixed and the aperture of the shear plate varies from 0.2 to 1.5 mm, the flyer velocity eventually converges to the same value whenever the aperture of the shear plate is less than the charge diameter, and the smaller

the aperture is, the sooner the flyer velocity reaches its maximum. The larger the aperture is, the smaller the maximum velocity of the flyer, the earlier its velocity begins to attenuate, and the greater the attenuation range. Therefore, the aperture of the shear plate should be smaller than the diameter of the charge.

References

1. Wu, X., Tan, D.: Polytropic index calculation of condensed explosives. Explosives 2, 1–9 (1981)
2. Shen, F., Wang, H., Yuan, J.: A simple algorithm for determining the parameters of JWL equation
of state. Vib. Shock. 9, 107–110 (2014)
3. Lao, Y.: Pyrotechnics Pharmaceutics. North University of science and Technology Press, Beijing
(2011)
4. He, A.: Design Principle of Miniature Detonating Sequence Based on MEMS Fuze. Beijing
Institute of Technology, Beijing (2012)
5. Zhang, B., Zhang, Q., Huang, F.: Detonation Physics. Weapons Industry Press, Beijing (2001)
6. GJB1307A. 2004. General Design Code for Aerospace Pyrotechnics. National Defense Science
and Technology Industry Committee (2004)
7. Lim, S., Baldovi, P.: Observation of the velocity variation of an explosively-driven flat flyer
depending on the flyer width. Appl. Sci. 9, 97–109 (2019)
Chapter 49
Identifying Key Learner on Online
E-Learning Platform: An Effective
Resistance Distance Approach

Chunhua Lu, Fuquan Zhang and Yunpeng Li

Abstract The teacher is never the only teacher in a class, especially in an online e-learning environment. A key learner, who is supposed to be more active and eager to spread knowledge and motivation to classmates, has huge potential to improve the quality of teaching. However, identifying such a key learner is challenging and requires considerable human experience, especially since the contact channels between teachers and students are much more limited in an online e-learning environment. Inspired by resistance distance theory, in this paper, we apply
resistance distance and centrality into an interactive network of learners to identify
key learner who can effectively motivate the whole class with discussion in e-learning
platform. First, we define the terms of interactive network of learners with the node,
edge, and graph. Then the distance between nodes is replaced with effective resistance
distance to gain better understanding of propagation among the learners. Afterward,
Closeness Centrality is utilized to measure the centrality of each learner in interactive
network of learners. Experimental results show that the centrality we use can cover
and depict the learners’ discussion activities well, and the key learner identified by our
approach under apposite stimuli can effectively motivate the whole class’ learning
performance.

Keywords Key learner · Resistance distance · Centrality · E-learning system ·


Online education · Graph-based approach

C. Lu (B)
School of Electronic and Information Engineering, Anshun University, Guizhou 561000, China
e-mail: firethree123@163.com
F. Zhang
Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang
University, Fuzhou 350121, China
e-mail: 8528750@qq.com
Y. Li
Computer School, Beijing Information Science and Technology University, Beijing 100101, China
e-mail: Leeyunpengs@163.com


49.1 Introduction

Motivating students in a class is an important part of improving teaching quality in both traditional and e-learning environments. Compared with the traditional classroom, online e-learning platforms enable learners to receive knowledge more consistently and effectively [1]. In addition, the development of data mining techniques drives the analysis of the online activities generated by learners [2]. Like social networks, learners in an online e-learning platform jointly construct a learning network based on their interactive activities. Therefore, based on this kind of interactive network, we define a key learner as one who affects other classmates' learning performance through his/her behavior. In other words, we aim to find the learner who can evoke positive group influence in an online e-learning course [3].
Inspired by graph theory and social computing, in this paper, we introduce the term
“resistance distance” into our analytics methods to implement the identification of key
learners. In summary, the paper has the following contributions: (1) We propose the
interactive network of learners to depict the question–answer–endorsement activities
in the discussion process of an online course; (2) We utilize resistance distance to
describe the connections between learners and use centrality as the factor to measure
the capacity of the learner to disseminate knowledge; (3) We validate our methods through a four-group controlled experiment.

49.2 Related Work

49.2.1 The Application of Resistance Distance

In many application scenarios, resistance distance is used to mine the effective infor-
mation among the specific graph. Balaban and Klein [4] proposed an approach to
construct the co-authorship graph as well as to calculate the Erdős number (EN) for
collaborative papers among mathematicians, which is conducted through resistance
distances leading to rational Erdős numbers (REN). Wang and Hauskrecht [5] con-
ducted document retrieval experiments through effective query expansion with the
resistance distance and improved its performance. Meanwhile, resistance network
models were also used in recommender systems with collaborative filtering methods
[6–8]. Guo et al. [9] proposed a new data clustering method that used a similarity met-
ric derived from electrical resistance networks. Aporntewan et al. [10] constructed
a novel indexing algorithm based on electrical resistance between two vertices in
the graph. The experimental results show that it produced a unique index for every
simple connected graph with ≤10 vertices. It had a simple calculation method and
good performance.

49.2.2 User Influence Evaluation Method

Sociologist Ritzer [11] believed that influence is the ability to transform other individ-
uals’ thoughts, feelings, and attitudes through communication with others or groups.
The current research on the entities’ influence on social networks is mainly focusing
on two aspects: user static attributes and social network topology. In terms of user
influence based on user static attributes, the most intuitive indicator is the number of
user fans. But Cha et al. [12] found that the users who have high-impact of “fans”
don’t necessarily have high influence on “retweeting” or “ mentioning”. Pal et al.
[13] integrated the users’ quantity of tweeting, responses, retweeting, and fans in
Twitter. Then they calculated the users’ communication influence, the mentioned
influence, and the retweeting influence. Boyd et al. [14] selected users’ retweet-
ing, replying, and likes as their characteristics, and obtained their influence through
weighted calculation.
In describing the influence of users based on the social network topology, Freeman
[15] characterized the importance of nodes based on the shortest paths in the network topology, betweenness centrality, and closeness centrality. Weng et al.
[16] extended on the basis of the PageRank algorithm and proposed the TwitterRank
algorithm to calculate the influence of users on different topics according to the
network structure of the users’ attention relationship and the similarity of users’
interest. Ding et al. [17] comprehensively considered the microblog publishing time,
comment content, and network topology to study users’ influences.

49.3 Problem Formulation and Modeling

49.3.1 Data Collection

Properties of learner as well as their behavior construct the basement of data mining
in online education. In this paper, we designed an interactive network of learners to
extract graph-based interactive activities via their online learning behaviors. Specifi-
cally, we constructed a web-based discussion application in previously implemented
e-learning platform. In this application, all learners are supposed to ask and answer
questions during and after lessons. A question is usually proposed by one learner
and this question could be responded by different learners. We also provide a button
named “agree” for each answer to let other learners give their feedback to measure
the quality of an answer (i.e., an answer with more clicks of “agree” is supposed to
be a better answer to this question.). In other words, this application works similar to
a question-and-answer site like Quora (https://www.quora.com/) and Zhihu (https://
www.zhihu.com/).

49.3.2 Interactive Network of Learners

Based on our question discussion application, an interactive network could be defined


as below:
• Node: Each learner in the same class is considered as a node in the whole graph,
with their properties like profile information and total amount of “agree” he/she
has got. In this paper, we define all learners in a class as U = {u 1 , u 2 , . . . , u n }.
• Edge: When a question proposed by one learner is answered or responded by
another learner, a directed edge is established from the responder to the question
proponent. The weight of each edge is defined as the visibility of this answer.
Particularly, let W∈ Rn×n indicate the adjacent matrix of the question–answer
activities in U, while the value of each element in the W indicates the visibility of
this activity. That is, wi, j = N A(i, j) where N A(i, j) is the amount of “agree” of
all answers from learner i to learner j (0 < i, j < n).
• Graph: After taking all question–answer activities in one class into consideration,
a sparse directed graph could be constructed. We define G(U, W) and describe the
above information in one class.
Therefore, a question–answer–agreement-based interactive network can be ini-
tially constructed. However, previous researches have suggested that posts in this
kind of social network usually have a life span [18]. Thus, an attenuation function is
needed to simulate the decay of visibility of an answer. In this paper, we are inspired
by Newton’s law of cooling to describe the decay of the interaction between two
learners [19]. Let

NA′(i, j) = e−λt · (NA(i, j) + 1)                                        (49.1)

where λ is an exponential decay constant and t represents the time span the post has
been released. Afterward, the resistance matrix R can be defined via reciprocal of
elements in W, i.e.,
        ⎡ 0      r1,2   ···   r1,n ⎤
    R = ⎢ r2,1   0      ···   r2,n ⎥                                     (49.2)
        ⎢  ⋮      ⋮      ⋱     ⋮   ⎥
        ⎣ rn,1   rn,2   ···   0    ⎦

where ri,j = 1/wi,j, i.e., the reciprocal of the corresponding element of W. For instance, Fig. 49.1 gives two examples. In Fig. 49.1a, learner A responds to two questions from B and D, respectively, and then B and D respond to the same question proposed by C. Let r1, r2, r3, r4 indicate the resistance values of the propagations A → D, D → C, A → B, and B → C, respectively. In Fig. 49.1b, we remove the path including learner B, which keeps only the path from learner A to D and C. Most traditional graph-based social network analysis only considers the shortest path while ignoring other possible pathways in a connected subgraph. For

Fig. 49.1 Two examples of interactive network of learners

instance, using Freeman’s scheme, we can derive the spreading resistance between
learner A and C as min{r1 + r2 , r3 + r4 }, while the real situation is that learner C
may benefit from knowledge propagation through both A → D → C and A → B →
C, which makes the propagation easier than Fig. 49.1b. Therefore, here we introduce
resistance distance to depict this process of knowledge propagation.

49.3.3 Resistance Distance and Centrality

Assuming G as a fully connected graph, we replace all the edge weights W with resistances R.
Thus, we can utilize Ohm’s law to calculate the actual effective resistance between
any two nodes in the network. For example, the resistance between A and C in
Fig. 49.1a can be calculated by
 
rA,C = ((rA,D + rD,C) × (rA,B + rB,C)) / (rA,D + rD,C + rA,B + rB,C) = ((r1 + r2) × (r3 + r4)) / (r1 + r2 + r3 + r4)        (49.3)

and the resistance between A and C in Fig. 49.1b can be derived by calculating rA,C = rA,D + rD,C = r1 + r2. Previous studies have shown that using the resistance distance instead of the shortest path can better describe the propagation process in a microblog platform [20], from which we draw inspiration. However, for large-scale resistance matrices, it is still very difficult to restore an equivalent circuit explicitly. Therefore, we use the algebraic formula of the graph-based resistance distance. Let L(G) be the Laplacian matrix of G:

rij = L⁺ii + L⁺jj − 2 L⁺ij                                               (49.4)

By using the above method, the resistance matrix R can be transformed into the effective resistance matrix R⁺. Afterward, we suppose the key learner is located at the center of the graph. In this paper, Closeness Centrality is utilized to measure the
centrality of our graph, in which the centrality of node u i ∈ U can be derived by

C(ui) = 1 / ( Σ uj∈U\ui d(i, j) )                                        (49.5)

where d(i, j) denotes the effective resistance distance between ui and uj taken from R⁺. In practical use, it is uncertain whether G is strongly connected. Therefore, we use the sum of the reciprocals of the distances, instead of the reciprocal of the sum of distances, with the convention 1/∞ = 0. It is obvious that the time complexity is O(n³) and the space complexity is O(n²).
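A compact NumPy sketch of this computation is given below. Symmetrizing the directed weights into conductances is our simplifying assumption (effective resistance is defined on an undirected network), and the toy weight matrix at the end is purely illustrative.

import numpy as np

def key_learner(W):
    # W[i, j]: decayed "agree" weight of answers from learner i to learner j
    C = (W + W.T) / 2.0                            # conductances (reciprocals of resistances)
    L = np.diag(C.sum(axis=1)) - C                 # graph Laplacian L(G)
    Lp = np.linalg.pinv(L)                         # Moore-Penrose pseudo-inverse L+
    d = np.diag(Lp)
    R_eff = d[:, None] + d[None, :] - 2 * Lp       # Eq. (49.4): r_ij = L+_ii + L+_jj - 2 L+_ij
    with np.errstate(divide="ignore", invalid="ignore"):
        inv = 1.0 / R_eff
    np.fill_diagonal(inv, 0.0)
    inv[~np.isfinite(inv)] = 0.0                   # convention 1/inf = 0 for unreachable pairs
    centrality = inv.sum(axis=1)                   # sum of reciprocal resistance distances
    return int(np.argmax(centrality)), centrality

W = np.array([[0, 3, 0, 2],                        # toy 4-learner class, hypothetical weights
              [0, 0, 4, 0],
              [0, 0, 0, 0],
              [0, 0, 5, 0]], dtype=float)
print(key_learner(W))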

49.4 Experiment and Results

49.4.1 Participants

A total of 109 rural labor workers (84 males, age range 29–51) in Anshun City, China, sponsored by the Guizhou Provincial Department of Science and Technology, China, entered the experiment. All of them had selected an online course named "Designing and Implementation of Web Pages" on our e-learning platform. They are evenly divided into four groups (i.e., classes) according to the gender distribution (Table 49.1). The Pearson coefficient of age between any two groups shows no statistical significance. Participants were promised extra credits if they participated actively in our aforementioned discussion application.

49.4.2 Measurements of Performance

Two methods are established to measure the performance of key learner identification. First, we use the Spearman coefficient to measure the correlation between participants' centrality and the number of "agree" votes or the number of answers they write. The purpose is to validate the capacity of Closeness Centrality, that is, whether Closeness Centrality can cover and depict the learners' activity in both answering and answers' endorsement.

Table 49.1 Brief information of participants


Group G1 G2 G3 G4
Number of participants 27 27 27 28
Gender (M:F) 21:6 21:6 21:6 21:7
Average (Age) 33.73 35.69 36.67 35.72
Received junior high school education (%) 100 100 96.29 96.42
Received senior high school/secondary school education (%) 81.48 85.18 77.77 78.57
Receiving college/university education 0 0 0 0

Second, we give the same stimulus to one learner in each group to let him/her try to motivate the whole class' learning progress. Their mastery of knowledge is then examined by an additional quiz, and the statistics on this quiz form the other measurement of our method.
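As a sketch of the first measurement, the snippet below computes the Spearman correlations for one group; the per-learner counts are hypothetical, and scipy's spearmanr is one possible tool rather than necessarily the one the authors used.

```python
from scipy.stats import spearmanr

# Hypothetical per-learner statistics for one group (illustrative values only).
centrality  = [0.82, 0.41, 0.65, 0.30, 0.77]   # closeness scores from R+
num_agrees  = [34, 10, 22, 7, 29]              # "agree" votes received
num_answers = [12, 5, 9, 3, 11]                # answers written

rho_agree, p_agree = spearmanr(centrality, num_agrees)
rho_answer, p_answer = spearmanr(centrality, num_answers)
print(rho_agree, p_agree)
print(rho_answer, p_answer)
```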

49.4.3 Experiment Process

First, the discussion application is introduced to all groups before the online lecture. Then, at the mid-term of the lecture, four interactive networks of learners are constructed from the previous discussion activities. Afterward, the same stimulus is applied to one learner in each group, selected by a different scheme per group: (1) In group G1, the interactive network of learners with the effective resistance distance (i.e., R^{+}_{G1}) is utilized, and the learner with the highest centrality is selected; (2) In group G2, a similar method is used to select the key learner, the only difference being that the normal distance (i.e., R_{G2}) is used; (3) In group G3, to avoid effects caused by age, the eldest learner is selected as the key one; (4) In group G4, the control group, one learner is selected randomly as the key learner. We deliver the stimulus by sending a message to each of the four key learner candidates, thanking him/her for contributing to the discussion and confirming a scholarship to encourage him/her to motivate the whole class' discussion. In addition, the target key learner is appointed as the monitor of the class.
The remaining half of the semester is given to the four key learners and their classmates. To avoid potential cheating in the final exam, we organized a quiz before the final exam in the name of a pre-examination review.

49.4.4 Results

In terms of the correlation between participants' centrality and their behaviors, Table 49.2 gives the correlation between centrality and the number of "agree" votes or the number of answers they wrote. The Spearman correlations between centrality and each behavior are clearly higher than the correlation between the two behaviors themselves, which shows that Closeness Centrality can cover and depict the learners' activity in both answering and answers' endorsement.

Table 49.2 The correlation between centrality and the number of “agree” or the number of answers
Correlation G1 G2 G3 G4 Overall
Centrality and number of “agree” 0.789* 0.791* 0.770* 0.762* 0.778*
Centrality and number of answers 0.644* 0.639* 0.613* 0.623* 0.630*
Number of “agree” and answers 0.576* 0.568* 0.538* 0.501* 0.546*
* denotes statistical significance (p-value < 0.05)

Table 49.3 Statistics on quiz results

Type of quiz    Indicator                     G1 (N = 27)  G2 (N = 27)  G3 (N = 27)  G4 (N = 28)
Mid-term quiz   Average score (max: 100)      72.976       76.333       70.756       73.667
                Pass rate (score ≥ 60) (%)    81.48        88.89        77.78        82.14
Final quiz      Average score (max: 100)      80.964       78.852       71.374       74.637
                Pass rate (score ≥ 60) (%)    92.59        92.59        77.78        85.71

After the whole online lecture process from March 2018 to July 2018, we calculated the differences among the four classes' quiz results, as shown in Table 49.3; the mid-term quiz is also presented as a comparison. It is worth mentioning that, using the Pearson correlation between the four groups' quiz results and their final exam results, a statistically significant correlation can be found in all groups. This preliminary result demonstrates that our approach to identifying and motivating key learners has contributed to improving the quality of the online lecture across the classes.

49.5 Conclusion

In this paper, we utilized the resistance distance and centrality to construct interactive networks of learners, in order to identify key learners and further improve the performance of online lectures. As demonstrated by our experiments, the centrality with the effective resistance distance can cover and depict learners' discussion activities, as well as achieve a visible improvement in the results of the final quiz.

Acknowledgements This research is supported by Major Project of the Tripartite Joint Fund of
the Science and Technology Department of Guizhou Province under grant (LH[2015]7701).

References

1. Rovai, A., Ponton, M., Wighting, M., Baker, J.: A comparative analysis of student motivation
in traditional classroom and e-learning courses. Int. J. E-Learn. 6, 413–432 (2007)
2. Blagojević, M., Živadin, M.: A web-based intelligent report e-learning system using data mining
techniques. Comput. Electr. Eng. 39(2), 465–474 (2013)
3. Chu, T.H., Chen, Y.Y.: With good we become good: understanding e-learning adoption by
theory of planned behavior and group influences. Comput. Educ. s92–s93, 37–52 (2016)
4. Balaban, A.T., Klein, D.J.: Co-authorship, rational Erdős numbers, and resistance distances in
graphs. Scientometrics 55(1), 59–70 (2002)

5. Wang, S., Hauskrecht, M.: Effective query expansion with the resistance distance based
term similarity metric. In: Proceedings of the 33rd International ACM SIGIR Conference
on Research and Development in Information Retrieval, Geneva, Switzerland, pp. 715–716
(2010)
6. Schmidt, S.: Collaborative filtering using electrical resistance network models. In: The 7th
Industrial Conference on Advances in Data Mining: Theoretical Aspects and Applications,
Leipzig, Germany, pp. 269–282 (2007)
7. Fouss, F., Pirotte, A., Saerens, M.: The application of new concepts of dissimilarities between
nodes of a graph to collaborative filtering. In: Workshop on Statistical Approaches for Web
Mining (SAWM), Pisa, Italy (2004)
8. Kunegis, J., Schmidt, S., Albayrak, Ş., Bauckhage, C., Mehlitz, M.: Modeling collaborative
similarity with the signed resistance distance kernel. In: Conference on ECAI 2008: European
Conference on Artificial Intelligence, Patras, Greece, pp. 261–265 (2013)
9. Guo, G.Q., Xiao, W.J., Lu, B.: Similarity metric based on resistance distance and its applications
to data clustering. Appl. Mech. Mater. 556–562, 3654–3657 (2014)
10. Aporntewan, C., Chongstitvatana, P., Chaiyaratana, N.: Indexing simple graphs by means of
the resistance distance. IEEE Access 4(99), 5570–5578 (2017)
11. Ritzer, G.: The Blackwell encyclopedia of sociology. Math. Mon. 107(7), 615–630 (2007)
12. Badashian, A.S., Stroulia, E.: Measuring user influence in GitHub: the million follower fallacy.
In: IEEE/ACM International Workshop on Crowdsourcing in Software Engineering, Austin,
USA, pp. 15–21 (2016)
13. Pal, A., Counts, S.: Identifying topical authorities in microblogs. In: ACM International Con-
ference on Web Search and Data Mining, Hong Kong, China, pp. 45–54 (2011)
14. Boyd, D., Golder, S., Lotan, G.: Tweet, Tweet, Retweet: conversational aspects of retweeting
on Twitter. In: Hawaii International Conference on System Sciences, Hawaii, USA, pp. 1–10
(2010)
15. Freeman, L.C.: Centrality in social networks conceptual clarification. Soc. Netw. 1(3), 215–239
(1978)
16. Weng, J., Lim, E.P., Jiang, J., He, Q.: Twitterrank: finding topic-sensitive influential twitterers.
In: The Third ACM International Conference on Web Search and Data Mining, New York,
USA, pp. 261–270 (2010)
17. Ding, X., Liu, B., Yu, P.S.: A holistic lexicon-based approach to opinion mining. In: The 2008
International Conference on Web Search and Data Mining, Palo Alto, USA, pp. 231–240 (2008)
18. Kong, S., Feng, L., Sun, G., Luo, K.: Predicting lifespans of popular tweets in microblog. In:
International ACM SIGIR Conference on Research and Development in Information Retrieval,
Portland, USA, pp. 1129–1130 (2012)
19. Kwak, H., Lee, C., Park, H., Moon, S.: What is Twitter, a social network or a news media? In:
International Conference on World Wide Web, Raleigh, USA, pp. 591–600 (2010)
20. Bozzo, E., Franceschet, M.: Resistance distance, closeness, and betweenness. Soc. Netw. 35(3),
460–469 (2013)
Chapter 50
A User Study on Head Size of Chinese
Youth for Head-Mounted EEG Products

Xi Yu and Wen Qi

Abstract Head-mounted EEG products are wearable devices that detect the voltage fluctuations generated by the ionic currents of neurons in the brain, which are caused by changes in people's brain states. As EEG products collect the physiological signals of the brain directly from the head, the better an EEG headset fits a wearer's head, the more accurate the obtained EEG signals are. At present, most EEG headsets are designed for European and American users, and few EEG headsets are suitable for Chinese users. In addition, there is no specific study on measuring the head size of Chinese people for the purpose of designing an EEG headset. This study is aimed at collecting size information about the heads of Chinese users. The results serve as an important reference when designing EEG headsets.

Keywords EEG headset · Wearable products · Chinese youth · Head size

50.1 Introduction

An EEG headset is a device that records the electrophysiological activity of cranial nerve cells on the surface of the cerebral cortex or scalp [1]. The design of EEG headsets is relatively immature. The current design of EEG headsets, including their size and shape, is mainly based on the size information of European and American users. Therefore, available headsets are not suitable for Chinese users. This is quite problematic since only an EEG headset that fits a user's head perfectly can collect accurate EEG signals from the scalp. Moreover, there is no study on the head size of Chinese users particularly for EEG headset products. In this study, the size information of Chinese youths' heads is collected in order to provide reference data for designers of head-mounted EEG products in China.

X. Yu · W. Qi (B)
Donghua University, 200051 Shanghai, China
e-mail: design_wqi@sina.com


50.2 Related Work

Daniel Lacko found that available head-mounted EEG devices cannot fit every user's head [2]. The mismatch often leads to poor contact between the electrodes and the scalp. He proposed an adjustable EEG headset and verified it by experiments. After comparing his design with the existing Emotiv EPOC cap with 14 electrodes, the author found that the performance of the modified EEG headset was slightly improved and that it was easier to use.
Thierry Ellena et al. noted that EEG helmets that do not fit the user's head increase safety concerns [3], and that currently no helmet fits every user. Based on 3D anthropometry, they proposed a method to improve the fit of the helmet. Two parameters, SOD and GU, were used to evaluate the helmet fit index (HFI) in different cases, and the helmets with better data quality were found to be better accepted by their users. Their study also showed that men and Europeans felt more comfortable with the experiment helmet than women and Asians, respectively.
Hong Kong Polytechnic University and Delft University compared the 3D head scans of 50 European and Chinese participants. The ages of these participants ranged from 17 to 77 years, with an average of 39 years. The results indicated that the shape of Chinese people's heads is more rounded than that of Europeans and Americans; in addition, the forehead and the back of the head are flatter. These differences mean that wearable products such as helmets and masks designed for Europeans and Americans cannot fully fit Chinese users [4].
The national standard GB10000-88 of the People's Republic of China provides basic human body dimensions for Chinese adults (males 18–50 years old, females 18–55 years old) [5]. In that survey, seven kinds of head data were measured, and the data are divided into two groups according to gender (Fig. 50.1).

Fig. 50.1 The head size of Chinese adult. (Image comes from the National Standard of the People’s
Republic of China GB10000-88)

50.3 Experimental Design

The purpose of this study is to provide data on the head size of Chinese youth in order to help design head-mounted EEG products that are customized for Chinese users. There are two reasons for carrying out such a study. First, the existing head-size data is outdated and not suitable for reference. Second, the data samples from other studies, for example in Fig. 50.1, cover a wide range of age groups; there are no specific measurements of the head size of Chinese youth. In this study, six parameters are measured, as shown in Fig. 50.2:
1. Maximum Head Breadth: the linear distance between the left and right cranial points (EU).
2. Maximum Head Length: the linear distance from the eyebrow point (g) to the back point of the head (op).
3. Head Sagittal Arc: the arc length, in the median sagittal plane, from the eyebrow point (g) to the occipital bulge point (i). It should be noted that, considering the final design size of an EEG headset, the sagittal arc is further divided into a front part and a back part by the apex of the head, providing an additional measurement for EEG product design.
4. Head Transversal Arc: the arc length from the tragus point (t) on one side, through the head vertex (v), to the tragus point (t) on the other side.
5. Head Circumference: the perimeter measured from the eyebrow point (g) as the starting point, through the back point of the head (op), and back to the starting point.
6. Head Auricular Height: the vertical distance from the tragus point (t) to the apex (v).

Fig. 50.2 The definition of six sizes of the human head. (Image from National Standard of the
People’s Republic of China GB10000-88)

50.3.1 Experiment Equipment

Five different types of tools are used to measure the six parameters mentioned above: an Anthroscan Bodyscan color 3D body scanner, a Martin-style body shape measuring ruler, a soft ruler, and a nylon cap (Fig. 50.3). The 3D body scanner is used for scanning the head of each participant and extracting related data, such as the maximum head length and the maximum head breadth, from the three-dimensional model. The Anthroscan Bodyscan is produced by Human Solutions, Germany. The Martin ruler was used to measure the maximum head length, the maximum head breadth, and the head auricular height in this experiment. The soft ruler (Fig. 50.3) is used to manually measure the head circumference. A nylon cap (Fig. 50.3) is worn by each participant to avoid interference from the participant's hair. Different from traditional measurements, red markers are pasted on the nylon cap according to the electrode positions FP1, FP2, F3, F4, T7, T8, P7, and P8 in the international 10–20 system. An online questionnaire is presented to each participant to collect personal information, including name, gender, age, education, birth province, ethnicity, student number, and contact information; it also contains questions about their opinions on EEG products.

Fig. 50.3 The measurement tools (top left: Anthroscan Bodyscan; top right: Martin-style ruler; bottom left: soft ruler; bottom right: nylon cap)

50.3.2 Experiment Procedure

First, each participant filled in the name–number registration form and answered the online questionnaire. They were informed that the data would be used only for research purposes and would not be shared with others. Each participant then took off his/her shoes, put on the nylon cap, and entered the Anthroscan Bodyscan color 3D body scanner for scanning.
Following that, the authors used the antennae gauge of the Martin ruler to measure the linear distance between the left and right cranial points (the maximum head breadth) and the linear distance from the eyebrow point to the back point of the head (the maximum head length). Then, the cross gauge of the Martin ruler was used to measure the offset from the apex at the tragus point, which is the head auricular height. After the measurements with the Martin ruler, the soft ruler was used to measure three parameters: the head sagittal arc, the head transversal arc, and the head circumference.
The last step was to measure height and weight, and to check the correctness of each participant's information and whether any measurements were missing. After the experiment, the full-body 3D model data of each participant was processed by the experimenter with the Anthroscan Bodyscan software, and the head data was extracted from the 3D model using the software Rhino.

50.4 Results

This study measured 20 young Chinese undergraduate and postgraduate students in total, including 10 males and 10 females. They come from both northern and southern parts of China, such as Liaoning, Jiangsu, and Guangdong Provinces, so the geographical distribution of the samples is quite wide. The average age is 23.
The average height of the 20 samples is 170 cm, and the median is 168 cm. The average height of the male participants is 177 cm, and the median is 176 cm. The average height of the female participants is 162 cm, and the median is 162 cm. The average body weight of the whole sample is 61 kg, and the median is 62 kg. The average weight of the male students is 70 kg, and the median is 70 kg. The average weight of the female students is 52 kg, and the median is 52 kg.
The results of this experiment include the maximum head breadth, maximum head length, sagittal arc length, transversal (coronal) arc length, head auricular height, and head circumference.
It should be noted that, in addition to the sagittal arc length itself, the sagittal arc is divided into a front part and a back part by the apex of the head, and this measurement is provided for the design of EEG products. Figure 50.4 shows a summary of the experimental data statistics; the specific data are elaborated and analyzed in this chapter.
analyzed in this chapter.

Fig. 50.4 The summary of the experimental data

50.5 Conclusion

In this study, the authors measured and analyzed the head size of Chinese youth in order to provide reference data for designing head-mounted EEG headsets for Chinese users. It is found that the average head breadth, average head length, and average head circumference of the Chinese samples are smaller than those of European users, and that, for the same head length, the head breadth of a Chinese person is larger than that of a European person. The head circumference of a person is affected by personal attributes such as height, weight, and age, while the maximum head breadth and maximum head length are relatively less affected by personal attributes. In terms of product appearance, the number of electrodes is not the primary factor considered by Chinese youth when selecting an EEG headset.

Acknowledgements The author would like to thank the Program for Professor of Special Appoint-
ment (Eastern Scholar) at Shanghai Institutions of Higher Learning (No. TP2015029) for financial
support. The study is also supported by “the Fundamental Research Funds for the Central Univer-
sities”.

References

1. Zhang, H., Wang, H.: Study on classification and recognition of multi-lead EEG signals. Com-
put. Eng. Appl. 24, 228–230 (2008)
2. Lacko, D.: Ergonomic design of an EEG headset using 3D anthropometry. J. Appl. Ergon. 58,
128–136 (2017)
3. Ellena, T., Subic, A.: The helmet fit index-an intelligent tool for fit assessment and design
customization. J. Appl. Ergon. 55, 194–207 (2016)
4. Roger, B., Shu, C.: A comparison between Chinese and Caucasian head shapes. J. Appl. Ergon. 41, 832–839 (2010)
5. National Standard—Anthropometric Terminology (GB 3975–1983). China Standard Press,
Beijing (1984)
6. China’s National Development and Reform Commission, The outline of the 13th five-year plan
for national economic and social development of the People’s Republic of China, Xinhua News
Agency 6(1) (2016)
7. Chinese Academy of Sciences, Brain Science and Brain-Like Intelligence Technology. Shen-
zhen International Genomics Conference, Institute of Neuroscience (2015)

8. Xiao, H., Xia, D.: Research on head and face size of Chinese adults. J. Ergon. 4(4) (1998)
9. Roger, B.: Size China: a 3D anthropometry survey of the Chinese head. Dissertation, Delft
University of Technology (2011)
10. Yan, L., Roger, B.: The 3D Chinese head and face modeling. J. Comput. Aided Des. 44(1),
40–47 (2012)
11. Yu, X., Qi, W.: A user study of wearable EEG headset products for emotion analysis. In:
ACM International Conference Proceeding Series, December 21, 2018, ACAI 2018 Confer-
ence Proceeding—2018 International Conference on Algorithms, Computing and Artificial
Intelligence; ISBN-13: 9781450366250. https://doi.org/10.1145/3302425.3302445
Author Index

A Guo, Wenyan, 323


Alaini, Eyhab, 227
H
B He, Xiang, 461
Baigaltugs, S., 419 He, Yanqing, 313
Bulgan, Ninjerdene, 75 Huang, Qingdan, 269
Huang, Shoujuan, 85
C Huang, Yikun, 37, 341
Cao, Zhiyi, 351 Huang, Zhe, 105
Chen, Guoqiang, 47 Hu, Jingyu, 179
Chen, Guo Tai, 281
Chen, Jie, 341 I
Chen, Junfeng, 75 Ikramullah, Khan, 227
Cui, Chen, 371
J
D Jiang, Chun-lan, 187, 195
Dai, Cai, 67 Jiang, Yixue, 333
Dao, Thi-kien, 115 Jiao, Qing-jie, 205
Deng, Huiqiong, 227
Ding, Qun, 163 L
Dong, Pengwei, 249 Lee, Jong Seol, 171
Lian, Yufeng, 323
F Liao, Weijie, 269
Fan, Wei, 13 Li, Chaogang, 227
Fan, Xinnan, 75 Li, Jiajue, 13, 237
Feng, Junhong, 47 Li, Jianpo, 27, 249
Feng, Lei, 95, 155, 259 Li, Jiapeng, 127
Fu, Ping, 85, 259 Li, Kaitong, 179
Li, Meijing, 179
G Li, Ming, 187
Gao, Gui, 291 Li, Na, 27
Gao, Kai, 13, 237 Lin, Guoxiang, 37
Guan, Ti, 27, 249 Li, Ningning, 27
Guo, Baolong, 105 Lin, Lianlei, 127
Guo, Chun-Feng, 281 Lin, Yaming, 217


Lin, Yeyu, 217 Rentsendorj, Javkhlan , 299


Lin, Lin, 249 Ryu, Keun Ho, 171, 411
Li, Qiong, 147, 381
Liu, Bing, 85, 95, 259 S
Liu, Hang, 291 Sengee, Nyamlkhagva, 401
Liu, Hao, 13, 237 Sun, Feng, 13
Liu, Lu, 341 Sun, Zhongbo, 323
Liu, Nengxian, 57
Liu, Rong-qiang, 205 T
Liu, Shuaishi, 323 Tong, Hui, 351
Liu, Shutang, 443 Tong, Qifan, 259
Liu, Taiting, 323 Tumurbaatar, Tserennadmid, 401
Liu, Weina, 433
Liu, Xin, 249 U
Liu, Yong, 27 Ugtakhbayar, N., 419
Li, Xipeng, 95 Usukhbayar, B., 419
Li, Yan, 341
Li, Yang, 187, 195 V
Li, Ying, 137 Vu, Thi Hong Nhan, 3
Li, Yuli, 291
Li, Yunpeng, 471 W
Li, Zimei, 333 Wang, Chuanfu, 163
Li, Zuoyong, 433, 451 Wang, Chuansheng, 443, 451
Lkhagvasuren, Ganchimeg , 299 Wang, Dawei, 27, 249
Lu, Chunhua, 471 Wang, Gang, 13, 237
Lu, Guang, 391 Wang, Geng, 105
Lu, Jiawei, 37, 341 Wang, He, 351
Lu, Xiaowei, 137 Wang, Mei-Jin, 115
Wang, Shen, 371, 391
M Wang, Tong, 237
Mao, Haokun, 381 Wang, Wenting, 27, 249
Mao, Liang, 195 Wang, Xin-yu, 195
Meng, Qianhe, 291 Wang, Yibo, 237
Munkhbat, Khongorzul, 411 Wang, Zhen, 443
Wang, Zhenyu, 155
N Wen, Xin, 13
Ngo, Truong-Giang, 115 Wu, Jiawei, 433
Nguyen, Trong-The, 115 Wu, Keping, 323
Nie, Jian-xin, 205 Wu, Ruidong, 85
Ning, Xiuli, 137 Wu, Weiming, 461
Ni, Rongrong, 361
Niu, Shaozhang, 351 X
Xie, Chao-Fan, 281
P Xie, Shu-chun, 187
Pan, Jeng-Shyang, 57, 115 Xie, Yuxuan, 95
Pei, Liqiang, 269 Xue, Jason Yang, 57
Pei, Xia, 105 Xue, Xingsi, 37, 75
Xu, Lin, 281
Q Xu, Yingcheng, 137
Qiao, Jiaqing, 155
Qi, Wen, 481 Y
Yang, Bolan, 227
R Yang, Chunyan, 305
Rao, Rui, 269 Yang, Hanting, 313

Yang, Lisha, 361 Zhang, Lei, 313


Yang, Pengpeng, 361 Zhang, Liang, 461
Yang, Xiani, 47 Zhang, Xuewu, 75
Yan, Hui, 333 Zhang, Yuhong, 291
Yan, Kun, 127 Zhang, Zhaoyang, 391
Yan, Nan, 461 Zhang, Zuchang, 433
Yan, Renwu, 227 Zhao, Chunxiao, 313
Yu, Jiale, 433 Zhao, Guangzhe, 313
Yu, Ping, 333 Zhao, Qiang, 147
Yu, Xi, 481 Zhao, Yao, 361
Yu, Zhaochai, 433 Zheng, Junxian, 443
Zhang, Zijun, 179
Z Zhou, Xiaogen, 451
Zeng, Lian, 269 Zhu, Na, 313
Zhang, Fuquan, 443, 451, 471 Zou, Danyin, 95
Zhang, Jie, 47
